Proceedings of the 6th Ritsumeikan International Symposium
STOCHASTIC PROCESSES AND APPLICATIONS TO MATHEMATICAL FINANCE
Ritsumeikan University, Japan
6–10 March 2006
Editors
Jirô Akahori, Shigeyoshi Ogawa, Shinzo Watanabe
Ritsumeikan University, Japan
World Scientific NEW JERSEY . LONDON . SINGAPORE . BEIJING . SHANGHAI . HONG KONG . TAIPEI . CHENNAI
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN-13 978-981-270-413-9 ISBN-10 981-270-413-2
Printed in Singapore.
PREFACE

The 6th Ritsumeikan International Conference on Stochastic Processes and Applications to Mathematical Finance was held at the Biwako-Kusatsu Campus (BKC) of Ritsumeikan University, March 6–10, 2006. The conference was organized under the joint auspices of the Research Center for Finance and the Department of Mathematical Sciences of Ritsumeikan University, and financially supported by MEXT (Ministry of Education, Culture, Sports, Science and Technology) of Japan, the Research Organization of Social Sciences, Ritsumeikan University, and the Department of Mathematical Sciences, Ritsumeikan University.

The series of Ritsumeikan conferences aims to bring together those interested in the applications of the theory of stochastic processes and stochastic analysis to financial problems. The conference, the 6th in the series, was organized along the same lines: several eminent specialists as well as active young researchers were invited to give lectures (see the program below), and in all we had about a hundred participants. The present volume is the proceedings of this conference, based on those invited lectures.

We, the members of the editorial committee listed below, wish to express our deep gratitude to those who contributed their work to these proceedings and to those who kindly helped us in refereeing them. We express our cordial thanks to Professors Toshio Yamada, Keisuke Hara and Kenji Yasutomi of the Department of Mathematical Sciences, Ritsumeikan University, for their kind assistance in editing this volume. We also thank Mr. Satoshi Kanai for his work in editing the TeX files and Ms. Chelsea Chin of World Scientific Publishing Co. for her kind and generous assistance in publishing these proceedings.

December, 2006, Ritsumeikan University (BKC)
Jirô Akahori
Shigeyoshi Ogawa
Shinzo Watanabe
The 6th Ritsumeikan International Conference on
STOCHASTIC PROCESSES AND APPLICATIONS TO MATHEMATICAL FINANCE

Date: March 6–10, 2006
Place: Rohm Memorial Hall/Epoch21, in BKC, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu, Shiga, 525-8577, Japan

Program

March 6 (Monday): at Rohm Memorial Hall
10:00–10:10 Opening Speech, by Shigeyoshi Ogawa (Ritsumeikan University)
10:10–11:00 T. Lyons (Oxford University): Recombination and cubature on Wiener space
11:10–12:00 S. Ninomiya (Tokyo Institute of Technology): Kusuoka approximation and its application to finance
12:00–13:30 Lunch time
13:30–14:20 T. Fujita (Hitotsubashi University, Tokyo): Some results of local time, excursion in random walk and Brownian motion
14:30–15:20 K. Hara (Ritsumeikan University, Shiga): Smooth rough paths and the applications
15:20–15:50 Break
15:50–16:40 X-Y Zhou (Chinese University of Hong Kong): Behavioral portfolio selection in continuous time
17:30– Welcome party

March 7 (Tuesday): at Rohm Memorial Hall
10:00–10:50 M. Schweizer (ETH, Zurich): Aspects of large investor models
11:10–12:00 J. Imai (Tohoku University, Sendai): A numerical approach for real option values and equilibrium strategies in duopoly
12:00–13:30 Lunch time
13:30–14:20 H. Pham (Univ. Paris VII): An optimal consumption model with random trading times and liquidity risk and its coupled system of integrodifferential equations
14:30–15:20 K. Hori (Ritsumeikan University, Shiga): Promoting competition with open access under uncertainty
15:20–15:50 Break
15:50–16:40 K. Nishioka (Chuo University, Tokyo): Stochastic growth models of an isolated economy

March 8 (Wednesday): at Rohm Memorial Hall
10:00–10:50 H. Kunita (Nanzan University, Nagoya): Perpetual game options for jump diffusion processes
11:10–11:50 E. Gobet (Univ. Grenoble): A robust Monte Carlo approach for the simulation of generalized backward stochastic differential equations
12:00– Excursion

March 9 (Thursday): at Epoch21
10:00–10:50 P. Imkeller (Humboldt University, Berlin): Financial markets with asymmetric information: utility and entropy
11:00–12:00 M. Pontier (Univ. Toulouse III): Risky debt and optimal coupon policy
12:00–13:30 Lunch time
13:30–14:20 H. Nagai (Osaka University): Risk-sensitive quasi-variational inequalities for optimal investment with general transaction costs
14:30–15:20 W. Runggaldier (Univ. Padova): On filtering in a model for credit risk
15:20–15:50 Break
15:50–16:40 D. A. To (Univ. Natural Sciences, HCM city): A mixed-stable process and applications to option pricing
16:50– Short Communications
  1. Y. Miyahara (Nagoya City University)
  2. T. Tsuchiya (Ritsumeikan University, Shiga)
  3. K. Yasutomi (Ritsumeikan University, Shiga)
March 10 (Friday): at Epoch21
10:00–10:50 R. Cont (Ecole Polytechnique, France): Parameter selection in option pricing models: a statistical approach
11:10–12:00 T. V. Nguyen (Hanoi Institute of Mathematics): Multivariate Bessel processes and stochastic integrals
12:00–13:30 Lunch time
13:30–14:20 J.-A. Yan (Academia Sinica, China): A functional approach to interest rate modelling
14:30–15:20 M. Arisawa (Tohoku University, Sendai): A localization of the Lévy operators arising in mathematical finances
15:20–15:50 Break
15:50–16:40 A. N. Shiryaev (Steklov Mathem. Institute, Moscow): Some explicit stochastic integral representation for Brownian functionals
18:30– Reception at Kusatsu Estopia Hotel
From Access to Bypass: A Real Options Approach
    K. Hori and K. Mizuno

The Investment Game under Uncertainty: An Analysis of Equilibrium Values in the Presence of First or Second Mover Advantage
    J. Imai and T. Watanabe

Asian Strike Options of American Type and Game Type
    M. Ishihara and H. Kunita

Minimal Variance Martingale Measures for Geometric Lévy Processes
    M. Jeanblanc, S. Kloeppel, and Y. Miyahara

Cubature on Wiener Space Continued
    C. Litterer and T. Lyons

A Convolution Approach to Multivariate Bessel Processes
    T. V. Nguyen, S. Ogawa, and M. Yamazato

Spectral Representation of Multiply Self-decomposable Stochastic Processes and Applications
    N. V. Thu, T. A. Dung, D. T. Dam, and N. H. Thai

Stochastic Growth Models of an Isolated Economy
    K. Nishioka

Numerical Approximation by Quantization for Optimization Problems in Finance under Partial Observations
    H. Pham
Financial Markets with Asymmetric Information: Information Drift, Additional Utility and Entropy

Stefan Ankirchner and Peter Imkeller
Institut für Mathematik, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany
We review a general mathematical link between utility and information theory appearing in a simple financial market model with two kinds of small investors: less informed agents, and insiders whose extra information is stored in an enlargement of the less informed agents' filtration. The insider's expected logarithmic utility increment is described in terms of the information drift, i.e. the drift one has to eliminate in order to perceive the price dynamics as a martingale from his perspective. We describe the information drift in a very general setting by natural quantities expressing the conditional laws of the better informed view of the world. This in turn allows us to identify the additional utility by entropy related quantities known from information theory.

Key words: enlargement of filtration; logarithmic utility; utility maximization; heterogeneous information; insider model; Shannon information; information difference; entropy.

2000 AMS subject classifications: primary 60H30, 94A17; secondary 91B16, 60G44.

1. Introduction

A simple mathematical model of two small agents on a financial market, one of whom is better informed than the other, has attracted much attention in recent years. Their information is modelled by two different filtrations: the less informed agent has the $\sigma$-field $\mathcal{F}_t$, corresponding to the natural evolution of the market up to time $t$, at his disposal, while the better informed insider knows the bigger $\sigma$-field $\mathcal{G}_t \supset \mathcal{F}_t$. Here is a short selection from the many papers dealing with this model. Investigation techniques concentrate on martingale and stochastic control theory, and methods of enlargement of filtrations (see Yor, Jeulin, Jacod in [22]), starting with the conceptual paper by Duffie, Huang [12]. The model
is successively studied on stochastic bases with increasing complexity: e.g. Karatzas, Pikovsky [24] on Wiener space, Grorud, Pontier [15] allow Poissonian noise, Biagini and Øksendal [7] employ anticipative calculus techniques. In the same setting, Amendinger, Becherer and Schweizer [1] calculate the value of insider information from the perspective of specific utilities. Baudoin [6] introduces the concept of weak additional information, while Campi [8] considers hedging techniques for insiders in the incomplete market setting. Many of the quoted papers deal with the calculation of the better informed agent's additional utility.

In Amendinger et al. [2], in the setting of initial enlargements, the additional expected logarithmic utility is linked to information theoretic concepts. It is computed in terms of an energy-type integral of the information drift between the filtrations (see [18]), and subsequently identified with the Shannon entropy of the additional information. Also for initial enlargements, Gasbarra, Valkeila [14] extend this link to the Kullback-Leibler information of the insider's additional knowledge from the perspective of Bayesian modelling. In the environment of this utility-information paradigm the papers [16], [19], [17], [18], Corcuera et al. [9], and Ankirchner et al. [5] describe additional utility, treat arbitrage questions and their interpretation in information theoretic terms in increasingly complex models of the same base structure. Utility concepts different from the logarithmic one correspond on the information theoretic side to the generalized entropy concepts of $f$-divergences.

In this paper we review the main results about the interpretation of the better informed trader's additional utility in information theoretic terms, mainly developed in [4], concentrating on the logarithmic case.
This leads to very basic problems of stochastic calculus in a very general setting of enlargements of filtrations: to ensure the existence of regular conditional probabilities of $\sigma$-fields of the larger with respect to those of the smaller filtration, we eventually only assume that the base space is standard Borel. In Section 2, we calculate the logarithmic utility increment in terms of the information drift process. Section 3 is devoted to the calculation of the information drift process by the Radon-Nikodym densities of the stochastic kernel in an integral representation of the conditional probability process and the conditional probability process itself. For convenience, before proceeding to the more abstract setting of a general enlargement, the results are first given in the initial enlargement framework. In Section 4 we finally provide the identification of the utility increment in the general enlargement setting with the information difference of the two filtrations in terms of Shannon entropy concepts.
2. Additional Logarithmic Utility and Information Drift

Let us first fix notation for our simple financial market model. To simplify the exposition, we assume that the trading horizon is given by $T = 1$. Let $(\Omega, \mathcal{F}, P)$ be a probability space with a filtration $(\mathcal{F}_t)_{0 \le t \le 1}$. We consider a financial market with one non-risky asset of interest rate normalized to 0, and one risky asset with price $X_t$ at time $t \in [0, 1]$. We assume that $X$ is a continuous $(\mathcal{F}_t)$-semimartingale with values in $\mathbb{R}$ and write $\mathcal{A}$ for the set of all $X$-integrable and $(\mathcal{F}_t)$-predictable processes $\theta$ such that $\theta_0 = 0$. If $\theta \in \mathcal{A}$, then we denote by $(\theta \cdot X)$ the usual stochastic integral process. For all $x > 0$ we interpret $x + (\theta \cdot X)_t$, $0 \le t \le 1$, as the wealth process of a trader possessing an initial wealth $x$ and choosing the investment strategy $\theta$ on the basis of his knowledge horizon corresponding to the filtration $(\mathcal{F}_t)$.

Throughout this paper we will suppose the preferences of the agents to be described by the logarithmic utility function. Therefore it is natural to suppose that the traders' total wealth always has to be strictly positive, i.e. for all $t \in [0, 1]$

$$V_t(x) = x + (\theta \cdot X)_t > 0 \quad \text{a.s.} \tag{1}$$

Strategies $\theta$ satisfying Eq. (1) will be called $x$-superadmissible. The agents want to maximize their expected logarithmic utility from terminal wealth. So we are interested in the exact value of

$$u(x) = \sup\{E \log(V_1(x)) : \theta \in \mathcal{A},\ x\text{-superadmissible}\}.$$

Sometimes we will write $u_{\mathcal{F}}(x)$, in order to stress the underlying filtration. The expected logarithmic utility of the agent can be calculated easily, if one has a semimartingale decomposition of the form

$$X_t = M_t + \int_0^t \eta_s \, d\langle M, M\rangle_s, \tag{2}$$

where $\eta$ is a predictable process. Such a decomposition has to be expected in a market in which the agent trading on the knowledge flow $(\mathcal{F}_t)$ has no arbitrage opportunities. In fact, if $X$ satisfies the property (NFLVR), then it may be decomposed as in Eq. (2) (see [10]). It is shown in [3] that finiteness of $u(x)$ already implies the validity of such a decomposition. Hence a decomposition as in (2) may be given even in cases where arbitrage exists. We state Theorem 2.9 of [5], in which the basic relationship between optimal logarithmic utility and information related quantities becomes visible.
Proposition 2.1. Suppose $X$ can be decomposed into $X = M + \eta \cdot \langle M, M\rangle$. Then for any $x > 0$ the following equation holds:

$$u(x) = \log(x) + \frac{1}{2} E \int_0^1 \eta_s^2 \, d\langle M, M\rangle_s. \tag{3}$$
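As a quick numerical sanity check of formula (3), the following minimal sketch looks at the simplest special case. Everything specific in it is an assumption made for illustration and is not taken from the text: the price follows $dX_t / X_t = \alpha\,dt + dW_t$ with a constant $\alpha$, the trader holds a constant fraction $\pi$ of wealth in the risky asset, and the function names are hypothetical. In this case $\eta_s^2\,d\langle M, M\rangle_s = \alpha^2\,ds$, so (3) with $x = 1$ predicts the optimal value $\alpha^2/2$, attained at $\pi = \alpha$.

```python
import random


def mean_log_wealth(pi, alpha, w1_samples):
    """Monte Carlo estimate of E[log V_1(1)] for a constant fraction pi.

    With dV/V = pi * (alpha dt + dW) and V_0 = 1, the explicit solution
    gives log V_1 = pi * W_1 - pi**2 / 2 + pi * alpha.
    """
    total = 0.0
    for w1 in w1_samples:
        total += pi * w1 - 0.5 * pi * pi + pi * alpha
    return total / len(w1_samples)


def best_constant_fraction(alpha, n_paths=20_000, seed=0):
    """Grid search over constant fractions pi in [-1, 2] with step 0.1.

    The same simulated terminal values W_1 are reused for every pi, so
    the comparison between grid points is not blurred by fresh noise.
    """
    rng = random.Random(seed)
    w1_samples = [rng.gauss(0.0, 1.0) for _ in range(n_paths)]
    grid = [k / 10 for k in range(-10, 21)]
    scored = [(mean_log_wealth(pi, alpha, w1_samples), pi) for pi in grid]
    best_value, best_pi = max(scored)
    return best_pi, best_value


if __name__ == "__main__":
    alpha = 0.5  # hypothetical constant mean rate of return
    best_pi, best_value = best_constant_fraction(alpha)
    # Compare the estimated optimum with the prediction alpha**2 / 2.
    print(best_pi, best_value, 0.5 * alpha ** 2)
```

The grid search only covers constant fractions, so it illustrates the formula rather than proving it; for a non-constant $\alpha$ the optimal fraction would itself be a process.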
Let us give the core arguments proving this statement in a particular setting, and for initial wealth $x = 1$. Suppose that $X$ is given by the linear SDE

$$\frac{dX_t}{X_t} = \alpha_t \, dt + dW_t,$$

with a one-dimensional Wiener process $W$, and assume that the small trader's filtration $(\mathcal{F}_t)$ is the (augmented) natural filtration of $W$. Here $\alpha$ is a progressively measurable mean rate of return process which satisfies $\int_0^1 |\alpha_t| \, dt < \infty$, $P$-a.s. Let us denote investment strategies per unit by $\pi$, so that the wealth process $V(x)$ is given by the simple linear SDE

$$\frac{dV_t(x)}{V_t(x)} = \pi_t \cdot \frac{dX_t}{X_t}.$$

It is obviously solved by the formula

$$V_t(x) = \exp\Big[\int_0^t \pi_s \, dW_s - \frac{1}{2} \int_0^t \pi_s^2 \, ds + \int_0^t \pi_s \alpha_s \, ds\Big].$$

Due to the local martingale property of $\int_0^t \pi_s \, dW_s$, $t \in [0, 1]$, the expected logarithmic utility of the regular trader is deduced from the maximization problem

$$u_{\mathcal{F}}(1) = \max_{\pi} E\Big[\int_0^1 \pi_s \alpha_s \, ds - \frac{1}{2} \int_0^1 \pi_s^2 \, ds\Big]. \tag{4}$$

The maximization of

$$\pi \mapsto \int_0^1 \pi_s \alpha_s \, ds - \frac{1}{2} \int_0^1 \pi_s^2 \, ds$$

for given processes $\alpha$ is just a more complex version of the one-dimensional maximization problem for the function

$$\pi \mapsto \pi \alpha - \frac{1}{2} \pi^2$$
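with $\alpha \in \mathbb{R}$. Its solution is obtained by the critical value $\pi = \alpha$ and thus

$$u_{\mathcal{F}}(1) = \frac{1}{2} E\Big[\int_0^1 \alpha_s^2 \, ds\Big]. \tag{5}$$

This confirms the claim of Proposition 2.1. This proposition motivates the following definition.

Definition 2.1. A filtration $(\mathcal{G}_t)$ is called finite utility filtration for $X$, if $X$ is a $(\mathcal{G}_t)$-semimartingale with decomposition $dX = dM + \zeta \cdot d\langle M, M\rangle$, where $\zeta$ is $(\mathcal{G}_t)$-predictable and belongs to $L^2(M)$, i.e. $E \int_0^1 \zeta^2 \, d\langle M, M\rangle < \infty$. We write

$$\mathbb{F} = \{(\mathcal{H}_t) \supset (\mathcal{F}_t) \mid (\mathcal{H}_t) \text{ is a finite utility filtration for } X\}.$$

We now compare two traders who take their portfolio decisions not on the basis of the same filtration, but on the basis of different information flows represented by the filtrations $(\mathcal{G}_t)$ and $(\mathcal{H}_t)$ respectively. Suppose that both filtrations $(\mathcal{G}_t)$ and $(\mathcal{H}_t)$ are finite utility filtrations. We denote by

$$X = M + \zeta \cdot \langle M, M\rangle \tag{6}$$

the semimartingale decomposition with respect to $(\mathcal{G}_t)$ and by

$$X = N + \beta \cdot \langle N, N\rangle \tag{7}$$

the decomposition with respect to $(\mathcal{H}_t)$. Obviously, $\langle M, M\rangle = \langle X, X\rangle = \langle N, N\rangle$ and therefore the utility difference is equal to

$$u_{\mathcal{H}}(x) - u_{\mathcal{G}}(x) = \frac{1}{2} E \int_0^1 (\beta^2 - \zeta^2) \, d\langle M, M\rangle.$$

Furthermore, Eqs. (6) and (7) imply

$$M = N - (\zeta - \beta) \cdot \langle M, M\rangle \quad \text{a.s.} \tag{8}$$

If $\mathcal{G}_t \subset \mathcal{H}_t$ for all $t \ge 0$, Eq. (8) can be interpreted as the semimartingale decomposition of $M$ with respect to $(\mathcal{H}_t)$. In this case one can show that the utility difference depends only on the process $\mu = \zeta - \beta$. In fact,

$$\begin{aligned}
u_{\mathcal{H}}(x) - u_{\mathcal{G}}(x) &= \frac{1}{2} E \int_0^1 (\beta^2 - \zeta^2) \, d\langle M, M\rangle \\
&= \frac{1}{2} E\Big(\int_0^1 \mu^2 \, d\langle M, M\rangle\Big) - E\Big(\int_0^1 \mu \, \zeta \, d\langle M, M\rangle\Big) \\
&= \frac{1}{2} E\Big(\int_0^1 \mu^2 \, d\langle M, M\rangle\Big).
\end{aligned}$$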
The last equation is due to the fact that $N - M = \int \mu \, d\langle M, M\rangle$, where $N$ is a martingale with respect to $(\mathcal{H}_t)$, and $\zeta$ is adapted to this filtration. It is therefore natural to relate $\mu$ to a transfer of information.

Definition 2.2. Let $(\mathcal{G}_t)$ be a finite utility filtration and $X = M + \zeta \cdot \langle M, M\rangle$ the Doob-Meyer decomposition of $X$ with respect to $(\mathcal{G}_t)$. Suppose that $(\mathcal{H}_t)$ is a filtration such that $\mathcal{G}_t \subset \mathcal{H}_t$ for all $t \in [0, 1]$. The $(\mathcal{H}_t)$-predictable process $\mu$ satisfying

$$M - \int_0^{\cdot} \mu_t \, d\langle M, M\rangle_t \quad \text{is a } (\mathcal{H}_t)\text{-local martingale}$$

is called information drift (see [18]) of $(\mathcal{H}_t)$ with respect to $(\mathcal{G}_t)$.

The following proposition summarizes the findings just explained, and relates the information drift to the expected logarithmic utility increment.

Proposition 2.2. Let $(\mathcal{G}_t)$ and $(\mathcal{H}_t)$ be two finite utility filtrations such that $\mathcal{G}_t \subset \mathcal{H}_t$ for all $t \in [0, 1]$. If $\mu$ is the information drift of $(\mathcal{H}_t)$ w.r.t. $(\mathcal{G}_t)$, then we have

$$u_{\mathcal{H}}(x) - u_{\mathcal{G}}(x) = \frac{1}{2} E \int_0^1 \mu^2 \, d\langle M, M\rangle.$$

3. The Information Drift and the Law of Additional Information

In this section we aim at giving a description of the information drift between two filtrations in terms of the laws of the information increment between them. This is done in two steps. First, we shall consider the simplest possible enlargement of filtrations, the well known initial enlargement. In a second step, we shall generalize the results available in the initial enlargement framework. In fact, we consider general pairs of filtrations, and only require the state space to be standard Borel in order to have conditional probabilities available.

3.1 Initial enlargement, Jacod's condition

In this setting, the additional information in the larger filtration is at all times during the trading interval given by the knowledge of a random variable which, from the perspective of the smaller filtration, is known only at the end of the trading interval. To establish the concepts in fair simplicity, we again assume that the smaller underlying filtration $(\mathcal{F}_t)$ is the augmented filtration of a one-dimensional Wiener process $W$. Let $G$ be an $\mathcal{F}_1$-measurable random variable, and let

$$\mathcal{G}_t = \mathcal{F}_t \vee \sigma(G), \quad t \in [0, 1].$$
Suppose that $(\mathcal{G}_t)$ is small enough so that $W$ is still a semimartingale with respect to this filtration. More precisely, suppose that there is an information drift $\mu^G$ such that

$$\int_0^1 |\mu^G_s| \, ds < \infty \quad P\text{-a.s.},$$

and such that

$$W = \tilde{W} + \int_0^{\cdot} \mu^G_s \, ds \tag{9}$$

with a $(\mathcal{G}_t)$-Brownian motion $\tilde{W}$. To clarify the relationship between the additional information $G$ and the information drift $\mu^G$, we shall work under a condition concerning the laws of the additional information $G$ which has been used as a standing assumption in many papers dealing with grossissement de filtrations. See Yor [27], [26], [28], Jeulin [21]. The condition was essentially used in the seminal paper by Jacod [20], and in several equivalent forms in Föllmer and Imkeller [13]. To state and exploit it, let us first mention that all stochastic quantities appearing in the sequel, often depending on several parameters, can always be shown to possess measurable versions in all variables, and progressively measurable versions in the time parameter (see Jacod [20]). Denote by $P^G$ the law of $G$, and for $t \in [0, 1]$, $\omega \in \Omega$, by $P^G_t(\omega, dg)$ the regular conditional law of $G$ given $\mathcal{F}_t$ at $\omega \in \Omega$. Then the condition, which we will call Jacod's condition, states that

$$P^G_t(\omega, dg) \text{ is absolutely continuous with respect to } P^G(dg) \text{ for } P\text{-a.e. } \omega \in \Omega. \tag{10}$$

Also its reinforcement

$$P^G_t(\omega, dg) \text{ is equivalent to } P^G(dg) \text{ for } P\text{-a.e. } \omega \in \Omega, \tag{11}$$

will be of relevance. Denote the Radon-Nikodym density process of the conditional laws with respect to the law by

$$p_t(\omega, g) = \frac{dP^G_t(\omega, \cdot)}{dP^G}(g), \quad g \in \mathbb{R},\ \omega \in \Omega.$$

By the very definition, $t \mapsto P^G_t(\cdot, dg)$ is a local martingale with values in the space of probability measures on the Borel sets of $\mathbb{R}$. This property is inherited by $t \mapsto p_t(\cdot, g)$ for (almost) all $g \in \mathbb{R}$. Let the representations of these martingales with respect to the $(\mathcal{F}_t)$-Wiener process $W$ be given by

$$p_t(\cdot, g) = p_0(\cdot, g) + \int_0^t k^g_u \, dW_u, \quad t \in [0, 1]$$
with measurable kernels $k$. To calculate the information drift in terms of these kernels, take $s, t \in [0, 1]$, $s \le t$, and let $A \in \mathcal{F}_s$ and a Borel set $B$ on the real line determine the typical set $A \cap G^{-1}[B]$ in a generator of $\mathcal{G}_s$. Then we may write

$$\begin{aligned}
E([W_t - W_s] \, 1_A \, 1_B(G)) &= E\Big(1_A [W_t - W_s] \int_B P^G_t(\cdot, dg)\Big) \\
&= \int_B E\big(1_A [W_t - W_s] [p_t - p_s](\cdot, g)\big) \, P^G(dg) \\
&= \int_B E\Big(1_A \int_s^t k^g_u \, du\Big) \, P^G(dg) \\
&= \int_B E\Big(1_A \int_s^t \frac{k^g_u}{p_u(\cdot, g)} \, p_u(\cdot, g) \, du\Big) \, P^G(dg) \\
&= \int_B E\Big(1_A \int_s^t \frac{k^g_u}{p_u(\cdot, g)} \, du \; p_t(\cdot, g)\Big) \, P^G(dg) \\
&= E\Big(1_A \int_B \int_s^t \frac{k^g_u}{p_u(\cdot, g)} \, du \; P^G_t(\cdot, dg)\Big) \\
&= E\Big(1_A \, 1_B(G) \int_s^t \frac{k^g_u}{p_u(\cdot, g)}\Big|_{g=G} \, du\Big).
\end{aligned}$$

The bottom line of this chain of arguments shows that

$$\tilde{W} = W - \int_0^{\cdot} \frac{k^g_u}{p_u(\cdot, g)}\Big|_{g=G} \, du$$

is a $(\mathcal{G}_t)$-martingale, hence a $(\mathcal{G}_t)$-Brownian motion, provided that

$$\int_0^1 \Big| \frac{k^g_u}{p_u(\cdot, g)}\Big|_{g=G} \Big| \, du < \infty \quad P\text{-a.s.}$$

This completes the deduction of an explicit formula for the information drift of $G$ in terms of quantities related to the law of $G$, in which we use the common oblique bracket notation to denote the covariation of two martingales (for more details see Jacod [20]).

Theorem 3.1. Suppose that Jacod's condition (10) is satisfied, and furthermore that

$$\mu^G_t = \frac{k^g_t}{p_t(\cdot, g)}\Big|_{g=G} = \frac{\frac{d}{dt} \langle p(\cdot, g), W\rangle_t}{p_t(\cdot, g)}\Big|_{g=G}, \quad t \in [0, 1], \tag{12}$$

satisfies

$$\int_0^1 |\mu^G_u| \, du < \infty \quad P\text{-a.s.} \tag{13}$$
Then

$$W = \tilde{W} + \int_0^{\cdot} \mu^G_s \, ds$$

is a $\mathcal{G}$-semimartingale with a $\mathcal{G}$-Brownian motion $\tilde{W}$.

To see how restrictive condition (10) may be, let us illustrate it by looking at two possible additional information variables $G$.

Example 1: Let $\epsilon > 0$ and suppose that the stock price process is a regular diffusion given by a stochastic differential equation with bounded volatility $\sigma$ and drift $\alpha$, $\sigma_t = \sigma(X_t)$, $t \in [0, 1]$, where $\sigma$ is a smooth function without zeroes. Let $G = X_{1+\epsilon}$. Then in particular $X$ is a time homogeneous Markov process with transition probabilities $P_t(x, dy)$, $x \in \mathbb{R}_+$, $t \in [0, 1]$, which are equivalent to Lebesgue measure on $\mathbb{R}_+$. For $t \in [0, 1]$, the regular conditional law of $G$ given $\mathcal{F}_t$ is then given by $P_{1+\epsilon-t}(X_t, dy)$, which is equivalent to the law of $G$. Hence in this case, even the strong version of Jacod's hypothesis (11) is verified.

Example 2: Let

$$G = \sup_{t \in [0, 1]} W_t.$$

To abbreviate, denote for $t \in [0, 1]$

$$G_t = \sup_{0 \le s \le t} W_s, \qquad \tilde{G}_{1-t} = \sup_{t \le s \le 1} (W_s - W_t).$$

Finally, let $p_{1-t}$ denote the density function of $\tilde{G}_{1-t}$. Then we may write for every $t \in [0, 1]$

$$G = G_t \vee [W_t + \tilde{G}_{1-t}]. \tag{14}$$

Now $G_t$ is $\mathcal{F}_t$-measurable, independent of $\tilde{G}_{1-t}$, and therefore for Borel sets $A$ on the real line we have

$$P^G_t(\cdot, A) = \int_{-\infty}^{G_t - W_t} p_{1-t}(y) \, dy \cdot \delta_{G_t}(A) + \int_{A \cap [G_t - W_t, \infty[} p_{1-t}(y) \, dy. \tag{15}$$

Note now that the family of Dirac measures in the first term of (15) is supported on the random points $G_t$, and that the law of $G_t$ is absolutely continuous with respect to Lebesgue measure on $\mathbb{R}_+$. Hence there cannot be any common reference measure equivalent to $\delta_{G_t}$ $P$-a.s. Therefore in this example Jacod's condition is violated.
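To make the drift formula (12) concrete, here is a minimal simulation sketch in the spirit of Example 1. All specifics are assumptions made for illustration and are not taken from the text: the price is replaced by the Wiener process itself, the insider knows $G = W_{1+\epsilon}$, and the function name is hypothetical. In that case the conditional law of $G$ given $\mathcal{F}_t$ is Gaussian with mean $W_t$ and variance $1 + \epsilon - t$, so the ratio in (12) becomes $\mu^G_t = (G - W_t)/(1 + \epsilon - t)$, and the energy integral $\frac{1}{2} E \int_0^1 (\mu^G_t)^2 \, dt$ equals $\frac{1}{2} \log((1 + \epsilon)/\epsilon)$. The sketch estimates this integral and also checks that subtracting the drift removes the correlation between the terminal value and $G$.

```python
import math
import random


def simulate_insider(eps=0.5, n_steps=100, n_paths=4000, seed=1):
    """Monte Carlo check of the drift mu_t = (G - W_t) / (1 + eps - t).

    Returns (utility, cov_w, cov_w_tilde), estimates of
      0.5 * E int_0^1 mu_t^2 dt,
      E[W_1 * G]         (nonzero: W is not a martingale for the insider),
      E[W~_1 * G]        (should be close to zero after drift removal),
    where W~ = W - int mu ds on the time grid.
    """
    rng = random.Random(seed)
    h = 1.0 / n_steps
    sqrt_h = math.sqrt(h)
    utility = cov_w = cov_w_tilde = 0.0
    for _ in range(n_paths):
        increments = [sqrt_h * rng.gauss(0.0, 1.0) for _ in range(n_steps)]
        w1 = sum(increments)
        # G = W_{1+eps}: terminal value plus an independent increment.
        g = w1 + math.sqrt(eps) * rng.gauss(0.0, 1.0)
        w = 0.0
        drift_integral = 0.0
        energy = 0.0
        for i, dw in enumerate(increments):
            mu = (g - w) / (1.0 + eps - i * h)  # drift at the left endpoint
            drift_integral += mu * h
            energy += mu * mu * h
            w += dw
        utility += 0.5 * energy
        cov_w += w1 * g
        cov_w_tilde += (w1 - drift_integral) * g
    return utility / n_paths, cov_w / n_paths, cov_w_tilde / n_paths


if __name__ == "__main__":
    eps = 0.5
    utility, cov_w, cov_w_tilde = simulate_insider(eps)
    closed_form = 0.5 * math.log((1.0 + eps) / eps)
    print(utility, closed_form, cov_w, cov_w_tilde)
```

The left-endpoint Riemann sum slightly underestimates the energy integral because $1/(1 + \epsilon - t)$ is increasing in $t$; refining `n_steps` shrinks this bias. As $\epsilon \to 0$ the closed form diverges, which is the familiar infinite-utility effect of knowing the terminal value exactly.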
It can be seen that there is an extension of Jacod's framework into which Example 2 still fits. This is explained in [18], [19], and relies on a version of Malliavin's calculus for measure valued random elements. It yields a description of the information drift in terms of traces of logarithmic Malliavin gradients of conditional laws of $G$. We shall not give details here, since we will go a considerable step beyond this setting. In fact, in the following subsection we shall further generalize the framework beyond the Wiener space setting.

3.2 General enlargement

Assume again that the price process $X$ is a semimartingale of the form

$$X = M + \eta \cdot \langle M, M\rangle$$

with respect to a finite utility filtration $(\mathcal{F}_t)$. Moreover, let $(\mathcal{G}_t)$ be a filtration such that $\mathcal{F}_t \subset \mathcal{G}_t$, and let $\alpha$ be the information drift of $(\mathcal{G}_t)$ relative to $(\mathcal{F}_t)$. We shall explain how the description of $\alpha$ by basic quantities related to the conditional probabilities of the larger $\sigma$-algebras $\mathcal{G}_t$ with respect to the smaller ones $\mathcal{F}_t$, $t \ge 0$, generalizes from the setting of the previous subsection. Roughly, the relationship is as follows. Suppose for all $t \ge 0$ there is a regular conditional probability $P_t(\cdot, \cdot)$ of $\mathcal{F}$ given $\mathcal{F}_t$, which can be decomposed into a martingale component orthogonal to $M$, plus a component possessing a stochastic integral representation with respect to $M$ with a kernel function $k_t(\cdot, \cdot)$. Then, provided $\alpha$ is square integrable with respect to $d\langle M, M\rangle \otimes P$, the kernel function at $t$ will be a signed measure in its set variable. This measure is absolutely continuous with respect to the conditional probability itself, if restricted to $\mathcal{G}_t$, and $\alpha$ coincides with their Radon-Nikodym density. As a remarkable fact, this relationship also makes sense in the reverse direction.
Roughly, if absolute continuity of the stochastic integral kernel with respect to the conditional probabilities holds, and the Radon-Nikodym density is square integrable, the latter turns out to provide an information drift $\alpha$ in a Doob-Meyer decomposition of $X$ in the larger filtration.

To provide some details of this fundamental relationship, we need to work with conditional probabilities. We therefore assume that $(\Omega, \mathcal{F}, P)$ is standard Borel (see [23]). Unfortunately, since we have to apply standard techniques of stochastic analysis, the underlying filtrations have to be assumed completed as a rule. On the other hand, for handling conditional probabilities it is important to have countably generated conditioning $\sigma$-fields. For this reason we shall use small versions $(\mathcal{F}^0_t)$, $(\mathcal{G}^0_t)$ which are countably generated, and big versions $(\mathcal{F}_t)$, $(\mathcal{G}_t)$ that are obtained as the smallest right-continuous and completed filtrations containing the small
ones, and thus satisfy the usual conditions of stochastic calculus. We further suppose that $\mathcal{F}_0$ is trivial and that every $(\mathcal{F}_t)$-local martingale has a continuous modification, and of course $\mathcal{F}^0_t \subset \mathcal{G}^0_t$ for all $t \ge 0$. We assume that $M$ is a $(\mathcal{F}^0_t)$-local martingale. The regular conditional probabilities relative to the $\sigma$-algebras $\mathcal{F}^0_t$ are denoted by $P_t$. For any set $A \in \mathcal{F}$ the process $(t, \omega) \mapsto P_t(\omega, A)$ is an $(\mathcal{F}^0_t)$-martingale with a continuous modification adapted to $(\mathcal{F}_t)$ (see e.g. Theorem 4, Chapter VI in [11]). We may assume that the processes $P_t(\cdot, A)$ are modified in such a way that $P_t(\omega, \cdot)$ is a measure on $\mathcal{F}$ for $P_M$-almost all $(\omega, t)$, where $P_M$ is the measure on $\Omega \times [0, 1]$ defined by

$$P_M(\Gamma) = E \int_0^{\infty} 1_{\Gamma}(\omega, t) \, d\langle M, M\rangle_t, \quad \Gamma \in \mathcal{F} \otimes \mathcal{B}_+.$$

It is known that each of these martingales may be described in the unique representation (see e.g. [25], Chapter V)

$$P_t(\cdot, A) = P(A) + \int_0^t k_s(\cdot, A) \, dM_s + L^A_t, \tag{16}$$

where $k(\cdot, A)$ is $(\mathcal{F}_t)$-predictable and $L^A$ satisfies $\langle L^A, M\rangle = 0$. Note that trivially each $\sigma$-field in the left-continuous filtration $(\mathcal{G}^0_{t-})$ is also generated by a countable number of sets. We claim that the existence of an information drift of $(\mathcal{G}_t)$ relative to $(\mathcal{F}_t)$ for the process $M$ depends on the validity of the following condition, which is the generalization of Jacod's condition (10) to arbitrary stochastic bases on standard Borel spaces.

Condition 3.1. $k_t(\omega, \cdot)\big|_{\mathcal{G}^0_{t-}}$ is a signed measure and satisfies

$$k_t(\omega, \cdot)\big|_{\mathcal{G}^0_{t-}} \ll P_t(\omega, \cdot)\big|_{\mathcal{G}^0_{t-}}$$

for $P_M$-a.a. $(\omega, t)$.

If Condition 3.1 is satisfied, one can show (see [4]) that there exists an $(\mathcal{F}_t \otimes \mathcal{G}_t)$-predictable process $\gamma$ such that for $P_M$-a.a. $(\omega, t)$

$$\gamma_t(\omega, \omega') = \frac{dk_t(\omega, \cdot)}{dP_t(\omega, \cdot)}\Big|_{\mathcal{G}^0_{t-}}(\omega'). \tag{17}$$

It is also immediate from the definition that

(18)

On the basis of these simple facts it is possible to identify the information drift, provided Condition 3.1 is guaranteed.
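Theorem 3.1. Suppose Condition 3.1 is satisfied and $\gamma$ is as in (17). Then

$$\alpha_t(\omega) = \gamma_t(\omega, \omega)$$

is the information drift of $(\mathcal{G}_t)$ relative to $(\mathcal{F}_t)$.

Proof. We give the arguments in case $M$ is a martingale. For $0 \le s < t$ and $A \in \mathcal{G}^0_s$ we have to show

$$E[1_A (M_t - M_s)] = E\Big[1_A \int_s^t \gamma_u(\omega, \omega) \, d\langle M, M\rangle_u\Big].$$

Observe

$$\begin{aligned}
E[1_A (M_t - M_s)] &= E[P_t(\cdot, A)(M_t - M_s)] \\
&= E\Big[(M_t - M_s) \int_0^t k_u(\cdot, A) \, dM_u\Big] + E[(M_t - M_s) L^A_t] \\
&= E\Big[\int_s^t k_u(\cdot, A) \, d\langle M, M\rangle_u\Big] \\
&= E\Big[\int_s^t \int_A \gamma_u(\omega, \omega') \, dP_u(\omega, d\omega') \, d\langle M, M\rangle_u\Big] \\
&= E\Big[1_A(\omega) \int_s^t \gamma_u(\omega, \omega) \, d\langle M, M\rangle_u\Big],
\end{aligned}$$

where we used (18) in the last equation.

We now look at the problem from the reverse direction. As an immediate consequence of (18) and Proposition 2.2 note that $(\mathcal{G}_t)$ is a finite utility filtration if and only if

$$\int \int \int \gamma^2_t(\omega, \omega') \, P_t(\omega, d\omega') \, d\langle M, M\rangle_t \, dP(\omega) < \infty.$$

Starting with the assumption that $(\mathcal{G}_t)$ is a finite utility filtration, which thus amounts to $E \int_0^1 \alpha^2 \, d\langle M, M\rangle < \infty$, we derive the validity of Condition 3.1. In the sequel, $(\mathcal{G}_t)$ denotes a finite utility filtration and $\alpha$ its predictable information drift, i.e.

$$\tilde{M} = M - \int_0^{\cdot} \alpha_t \, d\langle M, M\rangle_t \tag{19}$$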
is a $(\mathcal G_t)$-local martingale. To prove absolute continuity, we first define approximate Radon-Nikodym densities. This will be done along a sequence of partitions of the state space which generate the respective σ-fields of the bigger filtration. So let $t^n_i = i/2^n$ for all $n \ge 0$ and $0 \le i \le 2^n$. We denote by $T$ the set of all $t^n_i$. It is possible to choose a family of finite partitions $(P_{i,n})$ such that
• for all $t \in T$ we have $\mathcal G^0_{t-} = \sigma(P_{i,n} : i, n \ge 0 \text{ s.t. } t^n_i = t)$,
• $P_{i,n} \subset P_{i+1,n}$,
• if $i < j$, $n < m$ and $i\,2^{-n} = j\,2^{-m}$, then $P_{i,n} \subset P_{j,m}$.
We define for all $n \ge 0$ the following approximate Radon-Nikodym densities
$$\gamma^n_t(\omega, \omega') = \sum_{i=0}^{2^n - 1} \sum_{A \in P_{i,n}} 1_{]t^n_i, t^n_{i+1}]}(t)\, 1_A(\omega')\, \frac{k_t(\omega, A)}{P_t(\omega, A)}.$$
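The idea of approximating a Radon-Nikodym density by ratios of cell masses along refining partitions can be illustrated in a much simpler static setting. The sketch below is not the construction of the paper: it drops the dependence on $(\omega, t)$, takes Lebesgue measure on $[0, 1)$ as the reference measure, and uses a hypothetical measure `k` with density $2x$. All function names are assumptions made for this illustration. It checks that the cell ratios form a martingale along the dyadic partitions.

```python
from fractions import Fraction

def cell_mass_P(a, b):
    # Reference measure P: Lebesgue measure on [0, 1)
    return b - a

def cell_mass_k(a, b):
    # Hypothetical measure k with density g(x) = 2x w.r.t. P, so k([a, b)) = b^2 - a^2
    return b * b - a * a

def approx_density(n):
    """Values of gamma^n = k(A) / P(A) on the 2^n dyadic cells A = [i/2^n, (i+1)/2^n)."""
    cells = [(Fraction(i, 2**n), Fraction(i + 1, 2**n)) for i in range(2**n)]
    return [cell_mass_k(a, b) / cell_mass_P(a, b) for a, b in cells]

def is_martingale_step(n):
    """Check E[gamma^(n+1) | level-n partition] = gamma^n, cell by cell."""
    coarse, fine = approx_density(n), approx_density(n + 1)
    # Under Lebesgue measure the two children of a dyadic cell each carry half its mass
    return all(coarse[i] == (fine[2 * i] + fine[2 * i + 1]) / 2 for i in range(2**n))
```

In this toy case the ratio on a cell $[a, b)$ equals $a + b$, which differs from the true density $2x$ by at most the cell width, so the approximations converge uniformly as the partitions are refined.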
Note that $\frac{k_t(\omega, A)}{P_t(\omega, A)}$ is $(\mathcal F_t)$-predictable and $1_{]t^n_i, t^n_{i+1}]}(t)\, 1_A(\omega')$ is $(\mathcal G_t)$-predictable. Hence the product of both functions, defined as a function on $\Omega^2 \times [0, 1]$, is predictable with respect to $(\mathcal F_t \otimes \mathcal G_t)$. By the very definition, for $P_M$-almost all $(\omega, t) \in \Omega \times [0, 1]$ the discrete process $(\gamma^m_t(\omega, \cdot))_{m \ge 1}$ is a martingale. To have a chance to see this martingale converge as $m \to \infty$, we will prove uniform integrability, which will follow from the boundedness of the sequence in $L^2(P_t(\omega, \cdot))$. This again is a consequence of the following key inequality (for more details see [4]).
Lemma 3.1. Let $0 \le s < t \le 1$ and $\mathcal P = \{A_1, \ldots, A_n\}$ be a finite partition of $\Omega$ into $\mathcal G^0_s$-measurable sets. Then
$$E\int_s^t \sum_{k=1}^n \left(\frac{k_u}{P_u}\right)^2(\cdot, A_k)\, 1_{A_k}\, d\langle M, M\rangle_u \le 4\, E\int_s^t \alpha_u^2\, d\langle M, M\rangle_u < \infty.$$
Proof. An application of Ito’s formula, in conjunction with (16) and (19),
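yields
$$\begin{aligned}
&\sum_{k=1}^n \left[1_{A_k} \log P_s(\cdot, A_k) - 1_{A_k} \log P_t(\cdot, A_k)\right] \\
&\quad= \sum_{k=1}^n \left[-\int_s^t \frac{1}{P_u(\cdot, A_k)}\, 1_{A_k}\, dP_u(\cdot, A_k) + \frac12 \int_s^t \frac{1}{P_u(\cdot, A_k)^2}\, 1_{A_k}\, d\langle P(\cdot, A_k), P(\cdot, A_k)\rangle_u\right] \\
&\quad= \sum_{k=1}^n \left[-\int_s^t \frac{k_u}{P_u}(\cdot, A_k)\, 1_{A_k}\, d\tilde M_u - \int_s^t \frac{k_u}{P_u}(\cdot, A_k)\, 1_{A_k}\, \alpha_u\, d\langle M, M\rangle_u - \int_s^t \frac{1}{P_u(\cdot, A_k)}\, 1_{A_k}\, dL^{A_k}_u \right. \\
&\qquad\qquad \left. + \frac12 \int_s^t \left(\frac{k_u}{P_u}\right)^2(\cdot, A_k)\, 1_{A_k}\, d\langle M, M\rangle_u + \frac12 \int_s^t \frac{1}{P_u(\cdot, A_k)^2}\, 1_{A_k}\, d\langle L^{A_k}, L^{A_k}\rangle_u\right]. \qquad (20)
\end{aligned}$$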
Note that $P_t(\cdot, A_k) \log P_t(\cdot, A_k)$ is a submartingale bounded from below for all $k$. Hence the expectation of the left hand side in the previous equation is at most 0. One readily sees that the stochastic integral process with respect to $\tilde M$ in this expression is a martingale and hence has vanishing expectation, while a similar statement holds for the stochastic integral with respect to the singular parts $L^{A_k}$. Consequently we may deduce from Eq. (20) and the Kunita-Watanabe inequality
$$\begin{aligned}
E\, \frac12 \int_s^t \sum_{k=1}^n \left(\frac{k_u}{P_u}\right)^2(\cdot, A_k)\, 1_{A_k}\, d\langle M, M\rangle_u
&\le E\int_s^t \sum_{k=1}^n \frac{k_u}{P_u}(\cdot, A_k)\, 1_{A_k}\, \alpha_u\, d\langle M, M\rangle_u \\
&\le \left(E\int_s^t \sum_{k=1}^n \left(\frac{k_u}{P_u}\right)^2(\cdot, A_k)\, 1_{A_k}\, d\langle M, M\rangle_u\right)^{1/2} \left(E\int_s^t \alpha_u^2\, d\langle M, M\rangle_u\right)^{1/2},
\end{aligned}$$
which implies
$$E\int_s^t \sum_{k=1}^n \left(\frac{k_u}{P_u}\right)^2(\cdot, A_k)\, 1_{A_k}\, d\langle M, M\rangle_u \le 4\, E\int_s^t \alpha_u^2\, d\langle M, M\rangle_u.$$
This completes the proof.
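The final step of the proof is a short algebraic manipulation, written out here as a sketch. The abbreviations $X$ and $Y$ are introduced only for this illustration and do not appear in the source. The division assumes $0 < X < \infty$; the case $X = 0$ is trivial.

```latex
X := E\int_s^t \sum_{k=1}^n \Big(\frac{k_u}{P_u}\Big)^2(\cdot, A_k)\, 1_{A_k}\, d\langle M, M\rangle_u,
\qquad
Y := E\int_s^t \alpha_u^2\, d\langle M, M\rangle_u .

\tfrac12\, X \le X^{1/2}\, Y^{1/2}
\;\Longrightarrow\; X^{1/2} \le 2\, Y^{1/2}
\;\Longrightarrow\; X \le 4\, Y .
```

The constant 4 in Lemma 3.1 is thus the square of the factor 2 obtained after dividing by $X^{1/2}$.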
Lemma 3.1 will now allow us to obtain a Radon-Nikodym density process provided the given information drift $\alpha$ satisfies $E\int_0^1 \alpha^2\, d\langle M, M\rangle < \infty$. Note that our main result implicitly contains the statement that the kernel $k_t$ is a signed measure on the σ-field $\mathcal G^0_t$, $P_M$-a.e.
Theorem 3.2. Suppose that the information drift $\alpha$ satisfies $E\int_0^1 \alpha^2\, d\langle M, M\rangle < \infty$. Then the kernel $k$ is absolutely continuous with respect to $P_t(\omega, \cdot)|_{\mathcal G^0_{t-}}$, for $P_M$-a.a. $(\omega, t) \in \Omega \times [0, 1]$. This means that Condition 3.1 is satisfied. Moreover, the density process $\gamma$ provides a description of the information drift of $(\mathcal G_t)$ relative to $(\mathcal F_t)$ by the formula $\alpha_t(\omega) = \gamma_t(\omega, \omega)$.
Proof. By definition and Lemma 3.1, $(\gamma^m_t(\omega, \cdot))_{m \ge 1}$ is an $L^2(P_t(\omega, \cdot))$-bounded martingale and hence, for a.a. fixed $(\omega, t)$, $(\gamma^m_t(\omega, \cdot))_{m \ge 1}$ possesses a limit $\gamma$. It can be chosen to be $(\mathcal F_t \otimes \mathcal G_t)$-predictable. Take for example
$$\gamma_t = \liminf_n\, (\gamma^n_t \vee 0) + \limsup_n\, (\gamma^n_t \wedge 0).$$
Now define a signed measure by
$$\tilde k_t(\omega, A) = \int 1_A(\omega')\, \gamma_t(\omega, \omega')\, dP_t(\omega, d\omega').$$
Observe that $\tilde k_t(\omega, \cdot)$ is absolutely continuous with respect to $P_t(\omega, \cdot)$ and that we have for all $A \in P_{j,m}$ with $j 2^{-m} \le t$
$$\tilde k_t(\omega, A) = k_t(\omega, A)$$
for $P_M$-a.a. $(\omega, t) \in \Omega \times [0, 1]$. By integrating, we obtain the equation
$$P_t(\omega, A) = P(A) + \int_0^t \tilde k_s(\omega, A)\, dM_s + L^A_t(\omega) \qquad (21)$$
for all $A \in \bigcup_{j 2^{-m} \le t} P_{j,m}$. Since the LHS and both expressions on the RHS are measures coinciding on a system which is stable for intersections, Eq. (21) holds for all $A \in \mathcal G^0_{t-}$. Hence, by choosing $k_t(\cdot, A) = \tilde k_t(\cdot, A)$ for all $A \in \mathcal G^0_{t-}$, the proof is complete.
We close this section by illustrating the method developed by means of an example.
Example 3.1. Let $W$ be the Wiener process, $P$ the Wiener measure, $(\mathcal F^0_t)$ the filtration generated by $W$, $a > 0$, $\tau(a) = 1 \wedge \inf\{t \ge 0 : W_t = a\}$, $\delta > 0$,
$\mathcal H^0_t = \sigma(\tau \wedge t + \delta)$ and $\mathcal G^0_t = \mathcal F^0_t \vee \mathcal H^0_t$. Again let $(\mathcal F_t)$ and $(\mathcal G_t)$ be the smallest respective extensions of $(\mathcal F^0_t)$ and $(\mathcal G^0_t)$ satisfying the usual conditions. An investor having access to the information represented by $(\mathcal G_t)$ knows at any time whether within the next $\delta$ time units the Wiener process will hit the level $a$, provided the level has not yet been hit. In this example, the information drift of $(\mathcal G_t)$ is already completely determined as the density process of $k_t(\omega, \cdot)$ relative to $P_t(\omega, \cdot)$ along the σ-algebras $\mathcal H^0_t$ (this follows from a slight modification of the proof of Theorem 3.1). Let $S_t = \sup_{0 \le r \le t} W_r$, $F(a, x, u) = P(\tau(a - x) \le u)$ and recall that
$$F(a, x, u) = \int_0^u \frac{a - x}{\sqrt{2\pi y^3}} \exp\left(-\frac{(a - x)^2}{2y}\right) dy$$
for all $x < a$ (see Ch.III, p.107 in [25]). Note that
for all r ≤ u ≤ 1 we have Pr (ω, {τ(a) ≤ u}) = 1{Sr ≥a} + 1{Sr
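The first-passage distribution $F(a, x, u)$ recalled in Example 3.1 can be checked numerically. The sketch below is an illustration only: the function names and the parameter `b`, standing for the distance $a - x > 0$, are assumptions made here. It integrates the standard first-passage density with a midpoint rule and compares the result with the closed form that follows from the reflection principle, $P(\tau_b \le u) = 2P(W_u \ge b)$. The cap at time 1 in the definition of $\tau(a)$ is ignored, so the comparison is meant for $u \le 1$.

```python
import math

def first_passage_density(b, y):
    """Density at time y > 0 of the first hitting time of level b > 0 by Brownian motion."""
    return b / math.sqrt(2.0 * math.pi * y**3) * math.exp(-b * b / (2.0 * y))

def first_passage_cdf_numeric(b, u, steps=20_000):
    """Midpoint-rule approximation of the integral of the density over (0, u]."""
    h = u / steps
    return h * sum(first_passage_density(b, (j + 0.5) * h) for j in range(steps))

def first_passage_cdf_closed(b, u):
    """Reflection principle: P(tau_b <= u) = 2 P(W_u >= b) = erfc(b / sqrt(2 u))."""
    return math.erfc(b / math.sqrt(2.0 * u))
```

Calling both functions with the same `b` and `u`, for instance `b = 1.0` and `u = 1.0`, should give values that agree to several decimal places.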
We formulate an operational definition for absence of model-free arbitrage in a financial market, in terms of a set of minimal requirements for the pricing rule prevailing in the market and without making reference to any ’objective’ probability measure. We show that any pricing rule verifying these properties can be represented as a conditional expectation operator with respect to a probability measure under which prices of traded assets follow martingales. Our result does not require any notion of “reference” probability measure and is consistent with the formulation of model calibration problems in option pricing. Key words: model-free arbitrage, martingales, fundamental theorem of asset pricing, pricing of contingent claims.
∗ We acknowledge financial support from the European Network on Advanced Mathematical Methods in Finance.
1. Introduction
1.1 Model-based vs model-free arbitrage
Stochastic models of financial markets represent the evolution of the prices of financial products as stochastic processes defined on some (filtered) probability space $(\Omega, (\mathcal F_t)_{t \ge 0}, P)$, where it is usually assumed [9, 10, 13, 12, 14] that an “objective” probability measure $P$, describing the random evolution of market prices, is given. Given a set of benchmark assets $(S_t)_{t \ge 0}$, described as semimartingales under $P$, the gain of a trading strategy $(\phi_t)_{t \ge 0}$ is defined via the stochastic integral $\int \phi\, dS$ with respect to the price processes. Then, one introduces the set of ($P$-)admissible trading strategies as strategies with limited liability, i.e. whose value is $P$-a.s. bounded from below [9, 10]:
$$\phi \text{ is admissible if } \exists c \in \mathbb R \text{ such that for all } t, \quad P\left(\int_0^t \phi\, dS \ge -c\right) = 1.$$
An arbitrage opportunity is then defined as an admissible strategy $\phi$ such that
$$P\left(\int_0^T \phi\, dS \ge 0\right) = 1 \quad \text{and} \quad P\left(\int_0^T \phi\, dS > 0\right) > 0, \qquad (1)$$
a definition which depends on $P$ through its null-sets. The Fundamental Theorem of Asset Pricing [12], which is the theoretical foundation underlying the use of martingale methods in derivative pricing, can be loosely summarized as follows: in a market where no such arbitrage opportunities exist, there exists a probability measure $Q$ equivalent to $P$ such that the (discounted) value $V_t(H)$ of any contingent claim with terminal payoff $H$ is represented by:
$$V_t(H) = E^Q[H \mid \mathcal F_t] \qquad (2)$$
In other words, if the market is arbitrage-free, prices can be represented as conditional expectations with respect to some “equivalent martingale measure” $Q$. However, as noted by Kabanov [13], the precise formulation of this fundamental result is quite technical. In the case of market models with an infinite set of market scenarios, absence of arbitrage has to be replaced by a stronger condition known as No Free Lunch with Vanishing Risk (NFLVR) [9, 10], which means requiring that, for any sequence of admissible strategies with terminal gains $f_n = \int_0^T \phi_n\, dS$ such that the negative parts $f_n^-$ tend to 0 uniformly and such that $P(f_n \to f^*) = 1$, one has $P(f^* = 0) = 1$. Under the NFLVR condition, one obtains [6, 9, 10, 13] the existence of a probability measure $Q$ equivalent to $P$ such that the (discounted) value $V_t(H)$ of a contingent claim with terminal payoff $H$ is represented by:
$$V_t(H) = E^Q[H \mid \mathcal F_t] \qquad (3)$$
Furthermore, in the case of unbounded price processes the martingale property should be replaced by the weaker local martingale or “σ-martingale” properties [9, 10]. In addition, when asset prices are not locally bounded (as in a model with unbounded price jumps), the only admissible investments are those in the risk-free asset, which makes the
above definitions somewhat trivial: the set of strategies needs to be suitably enlarged [3, 4]. All these additional technical assumptions are less obvious to justify in economic terms. But perhaps the most important aspect of this characterization of absence of arbitrage in terms of “equivalent martingale measures” is the way an arbitrage opportunity (or free lunch) is defined: the definition explicitly refers to an objective probability measure P. In financial terms, such a strategy is more appropriately termed a model-based arbitrage, where the term “model” refers to the choice of P. The absence of arbitrage is then justified by saying that, if such an arbitrage opportunity were to appear in the market, market participants (“arbitrageurs”) would exploit it and make it disappear. This argument implicitly assumes that market participants are able to detect whether a given trading strategy is an arbitrage. Such reasoning can be safely applied to model-free arbitrage opportunities: for instance, if discrepancies appear between an index and its components or if triangle arbitrage relations in foreign exchange markets are not respected, market participants will presumably trade on them. In fact this is the basis of many automated “program” trading strategies, which make such arbitrage opportunities short-lived. But the argument is less obvious when applied to a model-based arbitrage. A model-based arbitrage opportunity is risk-free if the model P on which it is based is equivalent to the (unknown) one underlying the market dynamics. Once “model risk” (i.e. the possibility that P is misspecified) is taken into account, a model-based arbitrage is not riskless anymore. However, model uncertainty cannot be ignored when dealing with the pricing of derivative instruments [7] and model-based arbitrage strategies can in fact be quite risky.
Hence, market participants will attempt to exploit a model-based arbitrage opportunity if they believe that there is some market consensus on the underlying model i.e. that market prices will not move in a way which is precluded in the model. However, in financial markets, and even more so in the context of derivative pricing, there is no consensus on the “underlying model” P [7]: the relevance of a definition of arbitrage which relies on the existence of a consensual or “objective” probability measure may thus be questioned. Market consensus is expressed, not in terms of probabilities, but in terms of prices of various underlying assets and their derivatives traded in the market. It thus seems more natural to formulate the absence of arbitrage in terms of properties of market prices, that is, as constraints linking the relative values of traded instruments. Well-known constraints of this type are cash-and-carry arbitrage relations between spot and forward prices, spot relations between an index and its components, triangle relations between exchange rates, put-call parity relations, arbitrage inequalities
linking values of call and put options of different strikes and maturities, in-out parity relations for barrier options. Characterization of arbitrage-free price systems in terms of equivalent martingale measures also contrasts with the way the martingale pricing approach is commonly used in derivatives markets. Derivative pricing models are usually specified in terms of a (parametric) family $(Q_\theta, \theta \in E)$ of “martingale measures” and the parameters $\theta$ of the pricing model are typically obtained by calibrating them to observed prices of various derivatives. The specification of an objective probability measure typically plays no role in this process. In fact, in most cases (Black-Scholes model, diffusion models, stochastic volatility models, ...) the probability measures $(Q_\theta, \theta \in E)$ are mutually singular, so the model selection problem cannot be formulated as a search among martingale measures equivalent to a given measure P [2]. So, any characterization of absence of arbitrage in terms of equivalent martingale measures would appear inconsistent with the practice of specifying and calibrating pricing rules in this way.
Our goal in the present work is to present a formulation of the martingale approach to derivative pricing which is
• consistent with the way arbitrage constraints are formulated by market participants, namely, in terms of market prices;
• consistent with the way derivative pricing models are specified and calibrated in practice, that is, without referring to any “objective” probability measure.
We will start by formulating a set of minimal requirements for a pricing rule which can be interpreted as absence of model-free arbitrage. These requirements are formulated in terms of properties of prices (i.e. market observables), which is closer to the way arbitrage constraints are viewed in financial markets, and without resorting to any reference probability measure.
We will then show that any pricing rule verifying these minimal assumptions can be represented by a conditional expectation operator with respect to a probability measure Q under which prices of traded assets are martingales (“martingale measure”). Our proof is based on simple probabilistic arguments. Our result can thus be viewed as a model-free version of the fundamental theorem of asset pricing.
1.2 Relation with previous literature
As noted above, classical formulations of the Fundamental Theorem of Asset Pricing are based on the absence of model-based arbitrage (which includes model-free arbitrage as a special case). It is therefore interesting that one obtains a similar result with weaker assumptions. Since our result
does not hinge on the existence of an objective probability measure, it is robust to model misspecification, an important issue in financial modeling. The relation of our framework to classical formulations of the Fundamental Theorem of Asset Pricing is further discussed in Section 4. A similar formulation of properties of pricing rules was proposed by Rogers [14]. In [14], a pricing rule was defined as a map on $L^\infty(\Omega, P)$ for some reference probability measure P. Unlike [14], our formulation avoids any reference to a consensual or “objective” probability measure, and the set of contingent claims, i.e. the domain of the pricing rule, is determined a posteriori, not imposed a priori. We believe this renders our approach more general and more amenable to financial interpretation. This point is further commented upon in Section 4.2.
1.3 Outline
The article is structured as follows. In Section 2 we discuss some reasonable and financially meaningful requirements for a pricing rule and formulate them in mathematical terms. In Section 3 we characterize any pricing rule verifying these requirements as conditional expectation with respect to a martingale measure. Section 4 discusses some implications of our result and its relation to previous literature on arbitrage theorems.
2. Definitions and Notations
Let $(\Omega, (\mathcal F_t)_{t \in [0,T]})$ be the set of market scenarios endowed with a filtration $(\mathcal F_t)_{t \in [0,T]}$ representing the flow of information with time (in particular, $\mathcal F_0$ is trivial). Let $L^0$ denote the space of $\mathbb R$-valued, $\mathcal F_T$-measurable random variables, representing payoffs of contingent claims, and let $L^\infty$ denote the subspace of bounded variables. Let $\mathcal Y$ be the set of the non-anticipative processes $Y : \Omega \times [0, T] \to \mathbb R \cup \{+\infty, -\infty\}$, i.e. such that for each $t$, $Y_t$ is $\mathbb R \cup \{+\infty, -\infty\}$-valued and $\mathcal F_t$-measurable. A pricing rule can be seen as an operator $\Pi : L^0 \to \mathcal Y$ which assigns a price process $\Pi_t(H)$ to each contingent claim $H \in L^0$.
Note that a pricing rule does not necessarily assign a finite price to all payoffs $H \in L^0$. Denote by Dom(Π) the domain of Π, that is, the set of payoffs with a finite price:
$$\mathrm{Dom}(\Pi) := \{G \in L^0 \mid \Pi(G) \text{ is finite valued}\}$$
We can now formulate the minimal requirements for a pricing rule via the following definition:
Definition 2.1. A pricing rule is a mapping
$$\Pi : L^0 \to \mathcal Y, \qquad H \mapsto (\Pi_t(H))_{t \in [0,T]} \qquad (4)$$
that satisfies the following properties:
A1 If $G, H \in \mathrm{Dom}(\Pi)$, then $K = \max(G, H) \in \mathrm{Dom}(\Pi)$.
A2 Positivity. For any $H \in L^0$, if $H \ge 0$, then $\Pi(H) \ge 0$.
A3 $\mathcal F_t$-linearity on Dom(Π). For any $H_1, H_2 \in \mathrm{Dom}(\Pi)$ and any bounded $\mathcal F_t$-measurable variable $\lambda$, $\lambda H_1 + H_2 \in \mathrm{Dom}(\Pi)$ and
$$\Pi_t(\lambda H_1 + H_2) = \lambda \Pi_t(H_1) + \Pi_t(H_2) \qquad (5)$$
A4 Time consistency. $\forall H \in L^0$, $\Pi_s(\Pi_t(H)) = \Pi_s(H)$ for $0 \le s \le t \le T$.
A5 Normalization. $\Pi(1) = 1$.
A6 Market consistency. If $H$ is tradable at price $(V_t)_{t \in [0,T]}$ in the market (whence in particular $H = V_T$), then $H \in \mathrm{Dom}(\Pi)$ and
$$\forall t \in [0, T],\ \forall \omega \in \Omega, \quad \Pi_t(H)(\omega) = V_t(\omega). \qquad (6)$$
A7 Continuity. If $(H_n)_{n \ge 1}$ is an increasing sequence in $L^0$, uniformly bounded from below, with $H_n \uparrow H$, then $\Pi_0(H_n) \uparrow \Pi_0(H)$.
Let us comment on the various elements in this definition. The requirement that Π(H) is non-anticipative simply means that the pricing rule only makes use of information available at $t$ in order to assign the price at time $t$ to a claim. Also, it is quite natural that $\Pi_t(H)$ is $\mathbb R \cup \{+\infty, -\infty\}$-valued. For example, some payoffs $H$ may carry a huge downside risk that no market participant is willing to assume at any price: this formally translates into $\Pi(H) = -\infty$.
A1 This property means that, if $H$ and $G$ are two payoffs priced in the market, then the option to exchange them, i.e. $\max(H, G)$, is also priced in the market. Together with [A5], it ensures that, if an asset $S$ is priced in the market, then the most common derivatives on $S$, namely calls and puts, also belong to the domain of Π.
A2 Positivity ensures that the pricing rule verifies model-free static arbitrage inequalities. For instance, it guarantees that the price of a call option is decreasing and the price of a put option is increasing with respect to its strike.
A3 $\mathcal F_t$-linearity on Dom(Π) expresses additivity of prices plus the fact that the value of a position, when computed at time $t$, scales linearly when we multiply the size of the position by a factor which is known at $t$ (i.e. $\mathcal F_t$-measurable). This property obviously implies linearity: Dom(Π) is thus a vector space. In financial terms, linearity together with (A2) guarantees that the price of call and put options is convex in the strike price.
A4 Time consistency rules out “cash and carry” arbitrage strategies for traded assets. It ensures for instance that forward contracts on traded assets are priced consistently with their underlyings.
A5 Normalization simply means that we are dealing with prices expressed in units of a given numeraire.¹ Since (A2) and (A3) imply that Π is monotone, a consequence of the normalization condition is that $L^\infty \subset \mathrm{Dom}(\Pi)$.
A6 Market consistency means that the pricing rule is compatible with observed market prices. It reflects the fact that pricing rules used by market operators are “calibrated” to prices of instruments (underlyings, derivatives) whose prices are observed in the market. Together with the linearity condition (A3), it implies put-call parity for calls and puts on traded assets.
A7 By the positivity property, if $(H_n)_{n \ge 1}$ is a monotone (increasing to $H$) sequence of payoffs, then $(\Pi_0(H - H_n))_{n \ge 1}$ is a decreasing and positive sequence, so it has a limit. So the continuity condition boils down to requiring continuity from above at zero for $\Pi_0(\cdot)$. This is a rather weak continuity requirement, which excludes unrealistic specifications of pricing rules which would allocate very different prices to very similar payoffs.
Remark 2.1. (Vector lattice property) Properties [A1], [A2], [A3] and [A5] imply that the set Dom(Π) of payoffs with a finite price forms a vector lattice that contains $L^\infty$ (see [1] for definitions).
3. Pricing Rules as Conditional Expectation Operators
Let us start by showing that, for any market-consistent “martingale” measure Q, the conditional expectation operator with respect to Q defines a pricing rule in the sense of Definition 2.1:
¹ One could rewrite the whole formalism with the apparently (but not really) more general condition $0 < \Pi(1) \le 1$.
Proposition 3.1. Let Q be a probability measure defined on $(\Omega, (\mathcal F_t)_{t \in [0,T]})$ such that the prices $V_t(H)$ of all traded assets are martingales with respect to Q. There exists a pricing rule Λ such that
1. Dom(Λ) is the vector space $L^1(Q)$ of Q-integrable payoffs;
2. For any $H \in \mathrm{Dom}(\Lambda)$,
$$\Lambda_t(H) = E^Q[H \mid \mathcal F_t] \quad Q\text{-a.s.} \qquad (7)$$
Proof. For a Q-integrable payoff $H$ one can define Λ(H) as (a version of the) Q-conditional expectation of $H$, as in (7). To define a pricing rule, we need to extend Λ to the entire space $L^0$, i.e. also to non-integrable payoffs. For a positive payoff $G$, $E^Q[G \mid \mathcal F_t]$ is always well-defined, with values in $\mathbb R \cup \{+\infty\}$. Let us fix a general payoff $H$ and call $(\alpha_t)_t$ a version of $(E^Q[\,|H| \mid \mathcal F_t])_t$. For all $t \le T$, $k \in \mathbb N$ consider the $\mathcal F_t$-measurable sets
$$A_{k,t} = \{k \le \alpha_t < k + 1\}.$$
Fix $t$ and for any $A_{k,t}$ select a version $f_{k,t}$ of $E^Q[H 1_{A_{k,t}} \mid \mathcal F_t]$ and define
$$\Lambda_t(H) = f_{k,t} \text{ on } A_{k,t}, \qquad \Lambda_t(H) = +\infty \text{ on } \Omega - \cup_k A_{k,t}.$$
Λ(H) thus defines an element of $\mathcal Y$. It is very easy to see that $\mathrm{Dom}(\Lambda) = L^1(Q)$, i.e. it is the space of Q-integrable payoffs. On this space $\Lambda_t(H)$ satisfies (7). The properties (A1), (A2), (A3), (A4), (A5), (A7) of a pricing rule are easily verified for Λ, and to obtain (A6) when $H$ is tradable, simply choose $\Lambda_t(H)$ to be the version of $E^Q[H \mid \mathcal F_t]$ that coincides with $V_t(H)$.
We now state our main result, which shows that any pricing rule can be represented as a conditional expectation with respect to a “martingale measure” Q:
Theorem 3.1. Given a pricing rule Π, there exists a probability measure Q defined on $(\Omega, \mathcal F_T)$ such that Π coincides with the conditional expectation with respect to Q. More precisely:
1. Dom(Π) is the vector space $L^1(Q)$ of Q-integrable payoffs;
2. For any $H \in \mathrm{Dom}(\Pi)$,
$$\Pi_t(H) = E^Q[H \mid \mathcal F_t] \quad Q\text{-a.s.} \qquad (8)$$
3. Prices of traded assets are Q-martingales.
Proof. Define Q on $\mathcal F_T$ by
$$\forall A \in \mathcal F_T, \quad Q(A) = \Pi_0(1_A).$$
It is not difficult to see that Q is a probability measure. In fact, Q is positive by positivity of Π, additive by linearity of Π and normalized. Furthermore, the continuity property (A7) of $\Pi_0$ implies the monotone convergence property for Q, which is therefore a probability. Define a simple payoff as an element $H \in L^\infty$ of the form
$$H = \sum_{i=1}^n c_i 1_{A_i}, \qquad A_i \in \mathcal F_T, \quad c_i \in \mathbb R.$$
Since Π is linear, for any simple payoff $H$ we have $\Pi_0(H) = E^Q[H]$. A general $H \in L^0$, $H \ge 0$ can be approximated from below by a monotone sequence $(H_n)_{n \ge 1}$ of simple payoffs: $H_n \uparrow H$. Using the monotone convergence theorem for Q-expectation and the continuity property (A7) for Π, we can pass to the limit in $E^Q[H_n] = \Pi_0(H_n)$ and we thus obtain $\Pi_0(H) = E^Q[H]$. If $H$ is in $L^1(Q)$, both its positive and negative parts $H^+$, $H^-$ have finite Q-expectation and by additivity of Q and Π we get $\Pi_0(H) = E^Q[H]$. If $H$ is not integrable, then either $\Pi_0(H^+) = E^Q[H^+]$ or $\Pi_0(H^-) = E^Q[H^-]$ is infinite. By property (A1), $\Pi_0(H)$ cannot be finite. In particular, we obtain $\mathrm{Dom}(\Pi) \subseteq L^1(Q)$ but not equality yet, since we need more properties to control $\Pi_t$ when $t > 0$. Let then $H \in L^1(Q)$ and fix $t \in [0, T]$. Applying $\mathcal F_t$-linearity and time consistency, for any $A \in \mathcal F_t$ we have that $\Pi_0(1_A H)$ is finite and coincides with $\Pi_0(1_A \Pi_t(H))$. Hence, for any $A \in \mathcal F_t$
$$E^Q[1_A H] = E^Q[1_A \Pi_t(H)] \qquad (9)$$
which characterizes $\Pi_t(H)$ as a version of the Q-conditional expectation of $H$ with respect to $\mathcal F_t$. This shows also that Dom(Π) coincides with $L^1(Q)$. Finally, property (A6) of Π entails that if $V$ is the market price of a traded asset $H$, then $V$ is a version of the Q-martingale with terminal value $H$:
$$V_t = \Pi_t(H) = E^Q[H \mid \mathcal F_t]$$
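The construction used in the proof, $Q(A) = \Pi_0(1_A)$, can be traced on a finite scenario set. The following sketch is only an illustration under strong simplifying assumptions: four scenarios, two dates, and a pricing rule specified through hypothetical time-0 prices of Arrow securities (`ARROW_PRICE`). None of these names or numbers come from the source. It recovers Q from the time-0 prices of indicators and lets one verify relation (9) and time consistency (A4) with exact arithmetic.

```python
from fractions import Fraction

OMEGA = ["uu", "ud", "du", "dd"]            # market scenarios
ATOMS_T1 = [["uu", "ud"], ["du", "dd"]]     # atoms of F_1 (first move is known)

# Hypothetical time-0 prices of the Arrow securities 1_{omega}; they sum to 1 (A5)
ARROW_PRICE = {"uu": Fraction(1, 6), "ud": Fraction(1, 3),
               "du": Fraction(1, 4), "dd": Fraction(1, 4)}

def pi0(H):
    """Time-0 price: a positive, linear, normalized functional of the payoff H."""
    return sum(ARROW_PRICE[w] * H[w] for w in OMEGA)

def pi1(H):
    """Time-1 price, constant on each atom of F_1 (relative Arrow prices in the atom)."""
    out = {}
    for atom in ATOMS_T1:
        mass = sum(ARROW_PRICE[w] for w in atom)
        value = sum(ARROW_PRICE[w] * H[w] for w in atom) / mass
        out.update({w: value for w in atom})
    return out

def indicator(A):
    """Payoff 1_A as a dict over scenarios."""
    return {w: Fraction(int(w in A)) for w in OMEGA}

# First step of the proof: Q(A) = Pi_0(1_A), evaluated on singletons
Q = {w: pi0(indicator([w])) for w in OMEGA}

def expectation_Q(H):
    """Expectation of the payoff H under the recovered measure Q."""
    return sum(Q[w] * H[w] for w in OMEGA)

# An arbitrary payoff used for the checks
H = {"uu": Fraction(7), "ud": Fraction(-1), "du": Fraction(2), "dd": Fraction(0)}
```

With these definitions one can check that `pi0(H) == expectation_Q(H)`, that `pi0(pi1(H)) == pi0(H)`, and that for each atom `A` of `ATOMS_T1` the Q-expectations of `1_A * H` and `1_A * pi1(H)` agree, which is relation (9) in this finite setting.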
Remark 3.1. (Continuity of Π) Inspecting the first part of the above proof shows that we could have recovered Q also from the restriction of Π to $L^\infty \subseteq \mathrm{Dom}(\Pi)$. In particular, it would have been enough to consider the (linear, positive) functional $\psi : L^\infty \to \mathbb R$ defined by $\psi(H) = \Pi_0(H)$. If we endow $L^\infty$ with the uniform norm, it is a Banach space (in fact, a Banach lattice). Hence, thanks to [1], $\psi$ is already norm-continuous and so it can be identified with a measure Q on $(\Omega, \mathcal F_T)$. But without any extra condition, Q is a finitely additive measure but not a probability measure in general. To get countable additivity, we need the continuity property (A7), which amounts to requiring order-continuity of $\psi$.
4. Discussion
We have characterized pricing rules defined on $L^0$ as conditional expectation operators with respect to a probability measure Q such that prices of traded assets are Q-martingales. Our characterization does not require any a priori restriction on the domain of the pricing rule or the existence of a reference probability measure. We now examine some of the consequences of this result and its relation with previous characterizations of absence of arbitrage.
4.1 Implications for the specification of derivative pricing models
In contrast with previous formulations of no-arbitrage theorems, our result does not include any reference to an “objective” probability measure P. In particular, we characterize internally consistent pricing models in terms of “martingale measures” without requiring that these martingale measures be equivalent to a reference probability measure P. This is consistent with the way derivative pricing models are specified and used in the market. In practice, one does not necessarily start by identifying or specifying an “objective” probability measure P and then subsequently look for a suitable martingale measure Q compatible with market prices, among those equivalent to P.
Instead, common practice is to specify a derivative pricing model in terms of a (parametric) family $(Q_\theta, \theta \in E)$ of “martingale measures”; the parameters $\theta$ of the pricing model are typically obtained by calibrating them to observed prices of various derivatives. The specification of an objective probability measure typically plays no role in this process. In fact, in most cases (Black-Scholes model, diffusion models, stochastic volatility models, ...) the probability measures $(Q_\theta, \theta \in E)$ are mutually singular: for example, if $Q_\sigma$ designates a Black-Scholes model with volatility parameter $\sigma$, then $\sigma_1 \ne \sigma_2$ entails that $Q_{\sigma_1}$ and $Q_{\sigma_2}$ are mutually singular measures. So, the model selection
problem cannot be formulated as a search among martingale measures equivalent to a given measure P [2]. Therefore, while any characterization of absence of arbitrage in terms of equivalent martingale measures would appear inconsistent with this (commonly used) way of specifying and calibrating pricing models, our result provides a justification for it: it simply reflects the fact that there is no consensus in the market on the “objective” probability and not even on its equivalence class.
4.2 The domain of the pricing rule
Another common feature of previous formulations of the absence of arbitrage is that the set of contingent claims is chosen in advance, either as $L^\infty(\Omega, P)$ or $L^p(\Omega, P)$, $p \ge 1$, for some reference measure P. In practice the set of payoffs is defined independently from any probability measure: it typically contains unbounded payoffs whose integrability with respect to a given probability measure is not determined a priori, so this approach does not seem very natural. In our approach, a pricing rule is defined on $L^0$ (the set of all possible payoffs) and the domain of the pricing rule is determined a posteriori. We find this approach financially meaningful. In fact, the simplest derivatives, call options, have unbounded payoffs and are priced on the market, so taking the set of payoffs to be $L^\infty(P)$ (as in [14]) seems restrictive. Of course, the pricing operator defined in this way can then be extended, but this may lead to further mathematical issues (which is the right extension to use? Is the resulting extension market-consistent?). In our setting, market consistency is guaranteed a priori and as a consequence of our result Dom(Π) turns out a posteriori to be the space $L^1(Q)$.
4.3 Introduction of a set of benchmark assets
Suppose that a pricing rule Π is given on the market. Consider now a set of $d$ benchmark price processes $S^1, \cdots, S^d$ (the so-called underlyings).
We will now show how to recover the local-martingale or σ-martingale properties for $S = (S^1, \cdots, S^d)$ (see e.g. [4, 13, 10]) within our framework. In the usual approach, to build gain processes of trading strategies as stochastic integrals one requires that $S$ is an $\mathbb R^d$-valued semimartingale with respect to the reference probability P. In our model-free context the natural counterpart is the assumption that $S$ is a Q-semimartingale. One can then introduce stochastic integrals with respect to $S$ and define a notion of replicating strategy: Definition 4.1. Given a pricing rule Π on the market, represented by a martingale measure Q and an $\mathbb R^d$-valued Q-semimartingale $S$, a payoff $H \in L^0$ is said to be S-replicable if there exist an $x \in \mathbb R$ and a predictable
process (strategy) $\varphi$ such that:
1. $\varphi$ is S-integrable under Q.
2. $Q\left(\Pi_t(H) = x + \int_0^t \varphi\, dS\right) = 1$.
Remark 4.1. In the above definition and in what follows probabilistic notions are induced by the pricing rule through its representing Q.
Delbaen and Schachermayer [10] linked the No Free Lunch with Vanishing Risk property under P with the existence of a probability measure equivalent to P under which $S$ is a σ-martingale, a notion introduced in [5]. We will now show how the σ-martingale property appears in our context. Let us recall the following result from Emery [11]:
Proposition 4.1. [11] Let $S$ be a d-dimensional semimartingale on $(\Omega, \mathcal F, (\mathcal F_t)_{t \in [0,T]}, Q)$ and denote by $L(S)(Q)$ the set of predictable and S-integrable processes under the probability measure Q. The following assertions are equivalent:
1. there exist a d-dimensional Q-martingale $N$ and a positive (scalar) process $\psi \in \cap_{1 \le i \le d} L(N^i)(Q)$ such that $S^i = \int \psi\, dN^i$;
2. there exists a countable predictable partition $(B_n)_n$ of $\Omega \times \mathbb R_+$ such that $\int 1_{B_n}\, dS^i$ is a Q-martingale for every $i$, $n$;
3. there exist (scalar) processes $\eta_i$ with paths that Q-a.s. never touch zero, such that $\eta_i \in L(S^i)(Q)$ and $\int \eta_i\, dS^i$ is a Q-local martingale.
Definition 4.2. We say that $S$ is a σ-martingale under Q if it satisfies any of the equivalent conditions of the above Proposition.
The above equivalences illustrate that the σ-martingale property is a generalization of the local martingale property.
Remark 4.2. Whenever the $(B_n)_n$ can be written as stochastic intervals $]T_n, T_{n+1}]$ where $T_n$ is a sequence of stopping times increasing to $+\infty$, then the previous definition coincides with that of local martingale.
If for some $i$ the process $S^i$ is not the market price of a traded asset (but, for instance, a non-traded risk factor such as an instantaneous forward rate or instantaneous volatility process) then $S^i$ is not necessarily a martingale.
However, the result by Emery allows us to recover the σ-martingale features of S under Q from a straightforward analysis of the market spanned by S. Roughly speaking, there must be a traded derivative H with underlying S, which is S-replicable via a hedging strategy that is always non zero:
Proposition 4.2. Suppose that for all i there exists an $S^i$-replicable derivative $H^i$ traded in the market with a strategy $(\varphi^i_t)_{t\in[0,T]}$ that Q-a.s. never touches zero. Then S is a σ-martingale under Q.

Proof. Since $H^i$ is traded with market price $V^i = \Pi(H^i)$, our Theorem 3.1 implies that the gain $\int \varphi^i\,dS^i$ is a Q-martingale. Then, given the assumption on the $\varphi^i$, S is a σ-martingale under Q by a direct application of item 3 of Proposition 4.1.

Remark 4.3. (The 'No Free Lunch with Vanishing Risk' property) If S is indeed a σ-martingale under Q, then the market spanned by S satisfies the NFLVR condition with respect to Q (and hence with respect to any P ∼ Q). In fact, consider the Q-admissible strategies ϕ, i.e. those whose gain processes are Q-almost surely bounded from below:
$\exists c > 0,\quad Q\big(\int \varphi\,dS \ge -c\big) = 1.$
If S is a σ-martingale under Q, such strategies give rise to gain processes which are Q-supermartingales (see e.g. [10]). Hence absence of arbitrage obviously holds, since $E_Q\big[\int_0^T \varphi\,dS\big] \le 0$. An application of Fatou's Lemma then shows that NFLVR also holds.

References
1. Aliprantis, C. D. and Border, K. C. (1999) Infinite-dimensional analysis: a hitchhiker's guide. Springer, Berlin.
2. Ben Hamida, S. and Cont, R. (2005) Recovering volatility from option prices by evolutionary optimization. Journal of Computational Finance, 8, pp. 43–76.
3. Biagini, S. (2005) Convex Duality in Financial Theory with General Semimartingales. PhD Thesis, Scuola Normale Superiore.
4. Biagini, S. and Frittelli, M. (2005) Utility Maximization in Incomplete Markets for Unbounded Processes. Finance and Stochastics, 9, pp. 493–517.
5. Chou, C. S. (1979) Caractérisation d'une classe de semimartingales. Séminaire de probabilités XIII, pp. 250–252, Berlin: Springer.
6. Cherny, A. and Shiryaev, A. (2002) Vector stochastic integrals and the fundamental theorem of asset pricing. Proceedings of the Steklov Institute of Mathematics, 237, pp. 6–49.
7. Cont, R. (2006) Model uncertainty and its impact on the pricing of derivative instruments. Mathematical Finance, 16, No. 3, pp. 519–547.
8. Dalang, R. C., Morton, A. and Willinger, W. (1990) Equivalent martingale measures and no-arbitrage in stochastic securities market models. Stochastics Stochastics Rep., 29, no. 2, pp. 185–201.
9. Delbaen, F. and Schachermayer, W. (1994) A General Version of the Fundamental Theorem of Asset Pricing. Math. Ann., 300, pp. 463–520.
10. Delbaen, F. and Schachermayer, W. (1998) The fundamental theorem of asset pricing for unbounded stochastic processes. Math. Ann., 312, pp. 215–250.
11. Emery, M. (1980) Compensation de processus à variation finie non localement intégrables. Sém. Prob. XIV, Lecture Notes in Mathematics Vol. 784, pp. 152–160, Berlin: Springer.
12. Harrison, J. M. and Pliska, S. R. (1981) Martingales and stochastic integrals in the theory of continuous trading. Stochastic Process. Appl., 11, pp. 215–260.
13. Kabanov, Y. (2001) Arbitrage theory, in: J. Cvitanić, E. Jouini and M. Musiela (eds.): Option pricing, interest rates and risk management, Cambridge University Press.
14. Rogers, L. C. G. (1998) The origins of risk-neutral pricing and the Black-Scholes formula. Risk Management and Analysis, 2, pp. 81–94.
15. Ross, S. (1978) A Simple Approach to the Valuation of Risky Streams. Journal of Business, 51, No. 3, pp. 453–475.
A Class of Financial Products and Models Where Super-replication Prices are Explicit

L. Carassus (1), E. Gobet (2), E. Temam (1)

(1) Laboratoire de Probabilités et Modèles Aléatoires - Université Paris 7 - Denis Diderot - Case 7012 - 2, place Jussieu 75251 Paris cedex 05 - France
(2) Laboratoire Jean Kuntzmann - ENSIMAG INP Grenoble BP 53 - 38041 Grenoble cedex 09 - France
We consider a multidimensional financial model with mild conditions on the underlying asset price process. Trading is only allowed at some fixed discrete times and the strategy is constrained to lie in a closed convex cone. We show how the minimal cost of a super-hedging strategy can be easily computed by a backward recursive scheme. As an application, when the underlying asset follows a stochastic differential equation including stochastic volatility or Poisson jumps, we compute those super-replication prices for a range of European and American style options, including Asian, Lookback or Barrier Options. We also perform some multidimensional computations.

Key words: Closed formula for Super-replication cost; convex cone constraints on portfolio; exotic European and American options.

AMS Classification: 90A09, 60G42, 26B25.
1. Introduction

We consider a financial market consisting of d risky assets with discounted price process denoted by S, and one risk-less bond; trading is allowed only at fixed discrete times. We assume that the trading strategies are also subject to portfolio constraints. Namely, given a closed convex cone K with vertex at 0, the vector of numbers of shares invested in the risky assets is constrained to lie in K. Such a formalization includes in particular incomplete markets and markets with short-selling constraints. It is well-known that in those contexts, it is not possible to define a unique
fair price, i.e. the initial cost of a strategy replicating a given contingent claim, as in the context of complete markets. A possible way of defining a price is to consider the minimal initial wealth needed to hedge the contingent claim without risk. This is called the super-replication cost; it was introduced in the binomial setup for transaction costs by Bensaid-Lesne-Pagès-Scheinkman [1], in an $L^2$-setup for transaction costs and short-sales constraints by Jouini-Kallal [14, 15] and in the diffusion setup for incomplete markets by El Karoui-Quenez [9]. In the context of convex constraints, this notion has been studied among others by Cvitanić-Karatzas [4], Karatzas-Kou [17], Broadie-Cvitanić-Soner [3] and in great generality by Föllmer-Kramkov [11]. In those papers a dual formulation is given. Namely, the super-replication cost of a European contingent claim H is essentially the supremum, over a given set of probability measures, of the expectation of H (or of a modification of H). Nevertheless, this dual formulation does not in general allow one to compute the super-replication price effectively. Here we combine primal and dual formulations in order to provide a closed formula for European and American style options under general assumptions on the underlying S (namely, a usual non-degeneracy condition), and also to give the hedging strategy. In the case of European vanilla options, finding the super-replication price reduces to computing a concave envelope of the payoff function. For more general options, it involves recursive computations, again using a kind of concave envelope. The coefficients of the affine function which appears in the concave envelope give the hedging strategy. Applying this algorithm turns out to be a simple way to derive the super-replication prices of all usual options. Patry [20] obtains a similar formula, in the Black-Scholes case, for a European vanilla option.
Our effective computation shows that, when the asset prices can fluctuate heavily, the super-replication prices are trivial in the sense that they correspond to basic strategies such as "Buy and Hold". In particular, for a European call option, the super-replication price is equal to the initial price of the underlying: this result has already been obtained in the context of transaction costs by Cvitanić-Shreve-Soner [7] and Cvitanić-Pham-Touzi [5], and for a continuous time stochastic volatility model by Cvitanić-Pham-Touzi [6]. Our approach emphasizes the fact that the super-replication price depends on the law of the underlying asset price process only through its null sets. These results are presumably not surprising, even if, to our knowledge, only specific one dimensional cases have been handled in the literature. Finally, this work provides a relatively complete answer to the problem of how to explicitly compute super-replication prices in a general framework of discrete time strategies. The paper is organized as follows. In Section 2, we describe the financial
model and give the notation of the paper. Then, we recall the notion of No Arbitrage and state a dual formulation for the super-replication problem: while this result is standard in the European case (see Kabanov-Rásonyi-Stricker [16] and Schachermayer [23]), it has not yet been stated in the American context (this is Theorem 2.2, whose proof is postponed to the Appendix). Section 3 is devoted to the closed formulae for the super-replication prices, and their proofs. In Section 4, we effectively compute the super-replication price for European and American style exotic options (including Asian, Lookback or Barrier options), when there is only one risky asset (see Table 1 for those explicit computations). We also handle some multidimensional examples. These results hold true if the law of the underlying asset admits a positive density w.r.t. the Lebesgue measure: this includes for example the Black-Scholes model, general stochastic differential equations, stochastic volatility models, or models governed by Brownian motion and Poisson processes. We will also see that increasing the number of hedging dates does not modify the super-replication prices.
2. The Financial Model and Super-replication Theorem

2.1 Notations and definitions

Let T > 0 be a finite time horizon and set $\mathcal T = \{0, 1, \dots, T\}$: the financial market model consists of one risk-less asset with price process normalized to one and d risky assets with price process $S = \{S_t = (S^1_t, \dots, S^d_t)^*,\ t = 0, \dots, T\}$ valued in $(0, \infty)^d$. Here the notation $*$ stands for transposition. The stochastic price process $(S_t)_{t\in\mathcal T}$ is defined on a complete probability space $(\Omega, \mathcal F, P)$ equipped with the filtration $\mathbb F = \{\mathcal F_t,\ t \in \mathcal T\}$, where the σ-field $\mathcal F_t$ is generated by the random variables $S_0, S_1, \dots, S_t$. We make the usual assumption that $\mathcal F_0$ is trivial and $\mathcal F_T = \mathcal F$. A trading portfolio is an $\mathbb R^d$-valued $\mathbb F$-adapted process $\phi = \{\phi_t = (\phi^1_t, \dots, \phi^d_t)^*,\ t = 0, \dots, T-1\}$, where $\phi^i_t$ represents the number of shares of the i-th risky asset held at time t. The $\mathbb R$-valued $\mathbb F$-adapted process $C = \{C_t,\ t \in \mathcal T\}$ represents the cumulative consumption process. We assume that $C_0 = 0$ and that C is non-decreasing. We also use the notation $\Delta S_t = S_t - S_{t-1}$ and $\Delta C_t = C_t - C_{t-1}$, for $t = 1, \dots, T$. Given an initial wealth $x \in \mathbb R$, a trading portfolio φ, and a cumulative consumption process C, the wealth process $X^{x,\phi,C}$ is governed by:

(1)  $X_0^{x,\phi,C} = x, \qquad X_t^{x,\phi,C} = X_{t-1}^{x,\phi,C} + \phi_{t-1}^{*}\,\Delta S_t - \Delta C_t \quad \text{for } t = 1, \dots, T.$
The induction equation (1) leads to

$X_t^{x,\phi,C} = x + \sum_{u=1}^{t} \phi_{u-1}^{*}\,\Delta S_u - C_t, \qquad t \in \mathcal T.$
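As a quick numerical sanity check of recursion (1) and of its closed form, the following Python sketch computes the wealth path both ways on a toy two-asset example. The function names and all numbers are illustrative assumptions, not taken from the paper; the holdings are chosen in the short-sales cone $[0,\infty)^2$.

```python
def wealth_recursive(x, phi, S, C):
    """Wealth path from recursion (1).

    phi[t] is the vector of share holdings chosen at time t (t = 0..T-1),
    S[t] the price vector at time t, C[t] the cumulative consumption.
    """
    X = [x]
    for t in range(1, len(S)):
        gain = sum(p * (s1 - s0) for p, s1, s0 in zip(phi[t - 1], S[t], S[t - 1]))
        X.append(X[-1] + gain - (C[t] - C[t - 1]))
    return X


def wealth_closed_form(x, phi, S, C, t):
    """Closed form: x + sum_{u=1}^t phi_{u-1}^* Delta S_u - C_t."""
    total_gain = 0
    for u in range(1, t + 1):
        total_gain += sum(p * (s1 - s0) for p, s1, s0 in zip(phi[u - 1], S[u], S[u - 1]))
    return x + total_gain - C[t]


# Toy data (hypothetical): T = 2, d = 2, integer prices to avoid rounding noise.
S = [(100, 50), (110, 45), (105, 55)]
phi = [(1, 2), (0, 3)]   # both vectors lie in the cone [0, inf)^2
C = [0, 5, 5]            # C_0 = 0 and C is non-decreasing
path = wealth_recursive(10, phi, S, C)
```

With these inputs both functions agree at every date, which is exactly what summing the increments in (1) predicts.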
The condition C = 0 means that the portfolio φ is self-financed. We now impose some constraints on the trading portfolios. Let K be a closed convex cone of $\mathbb R^d$ with vertex at 0. For any $x \in [0, \infty)$, we say that a trading strategy (x, φ, C) is admissible, and we write $(x, \phi, C) \in \mathcal A$, if for all $t = 0, \dots, T-1$, $\phi_t \in K$ a.s. Such constraints cover in particular the case of incomplete markets ($K = \{k \in \mathbb R^d : k_i = 0,\ i = 1, \dots, n\}$: it is impossible to trade in the first n risky assets) and short-sales constraints ($K = [0, \infty)^d$).

Let H be a European contingent claim, i.e., an $\mathcal F_T$-measurable random variable. Following Föllmer and Kramkov [11], we introduce the notion of minimal hedging strategy for H. First, a European H hedging strategy is a strategy $(x, \phi, C) \in \mathcal A$ such that $X_T^{x,\phi,C} \ge H$ a.s. We will denote by $\mathcal A^e_H$ the set of European H hedging strategies. Then, $(\hat x, \hat\phi, \hat C) \in \mathcal A^e_H$ is minimal if for all $(x, \phi, C) \in \mathcal A^e_H$,

$X_t^{x,\phi,C} \ge X_t^{\hat x,\hat\phi,\hat C}$ a.s. for all $t \in \mathcal T$.

Note that $\hat x$ is then the so-called super-replication cost $p_e(H)$ of H, i.e. the minimal initial capital needed for hedging H without risk:

$p_e(H) = \inf\{x \in \mathbb R : \exists\,(\phi, C) \text{ s.t. } (x, \phi, C) \in \mathcal A^e_H\}.$

It is straightforward that $\hat x \ge p_e(H)$. Conversely, let $x \in \mathbb R$ be such that there exists (Φ, C) with $(x, \Phi, C) \in \mathcal A^e_H$; then by minimality of $X^{\hat x,\hat\phi,\hat C}$, $x \ge \hat x$, and taking the infimum over such x, we get the reverse inequality.

We now define the same notion for an American contingent claim $(H_t)_{t\in\mathcal T}$. An American H hedging strategy is some $(x, \phi, C) \in \mathcal A$ such that for all $t \in \mathcal T$, $X_t^{x,\phi,C} \ge H_t$ a.s. We will denote by $\mathcal A^a_H$ the set of American H hedging strategies. Then $(\hat x, \hat\phi, \hat C) \in \mathcal A^a_H$ is minimal if for all $(x, \phi, C) \in \mathcal A^a_H$,

$X_t^{x,\phi,C} \ge X_t^{\hat x,\hat\phi,\hat C}$ a.s., for all $t \in \mathcal T$.

Again $\hat x$ is the super-replication cost $p_a(H)$ of H, i.e.

$p_a(H) = \inf\{x \in \mathbb R : \exists\,(\phi, C) \text{ s.t. } (x, \phi, C) \in \mathcal A^a_H\}.$
We now recall the usual notion of No-Arbitrage, whose characterization is needed for the super-replication Theorem 2.2.

Definition 2.1. We say that there is no arbitrage opportunity if, for all trading strategies Φ such that $(0, \Phi, 0) \in \mathcal A$, we have $X_T^{0,\Phi,0} \ge 0$ a.s. $\Longrightarrow$ $X_T^{0,\Phi,0} = 0$ a.s.
In Pham and Touzi [22], a characterization of this no-arbitrage condition is provided; to state it, we introduce the following two sets:

$\hat K = \{x \in \mathbb R^d : \phi^* x \le 0,\ \forall \phi \in K\},$

$\mathcal P = \{Q \sim P : dQ/dP \in L^\infty,\ \Delta S_t \in L^1(Q) \text{ and } E_Q[\Delta S_t \mid \mathcal F_{t-1}] \in \hat K,\ 1 \le t \le T,\ P\text{-a.s.}\}.$

We also need a non-degeneracy assumption. This assumption is essential to prove Theorem 2.1 below: if it fails to hold, the set of final dominated payoffs may not be closed, see Brannath [2].

Assumption 2.1. Let $t = 1, \dots, T$. Then for all $\mathcal F_{t-1}$-measurable random variables ϕ valued in K,

$\varphi^* \Delta S_t(\omega) = 0 \Longrightarrow \varphi(\omega) = 0$ for a.e. $\omega \in \Omega$.

Models studied in Section 4 fulfill the above assumption.

Theorem 2.1. (Pham-Touzi [22]). Under Assumption 2.1, the no arbitrage condition is equivalent to $\mathcal P \ne \emptyset$.

Without cone constraints, see also Dalang-Morton-Willinger [8]. Let $\mathcal S_{t,T}$ be the set of all stopping times τ w.r.t. the filtration $\mathbb F$ such that $t \le \tau \le T$.

2.2 Super-replication Theorem

Our starting point to derive closed formulae for super-replication prices is the dual formulation of the super-replication theorem. It states that the super-replication cost of a European (resp. American) contingent claim H (resp. $(H_t)_{t\in\mathcal T}$) is essentially the supremum over all probability measures Q in a given set $\mathcal P$ (resp. and all stopping times τ less than T) of $E_Q(H)$ (resp. $E_Q(H_\tau)$): this is given by Theorem 2.2. We give the proof of this unsurprising result since, to our knowledge, it has not been done before in the American case. Indeed, Föllmer and Kramkov [11] obtain, via an Optional Decomposition Theorem, the super-replication theorem for continuous time asset price processes under convex constraints (this is no longer the expectation of H but of a modification of H which takes into account the convex constraints). But to deal with this great generality, they have to assume first that the wealth process is non negative; second, the strategy φ has to be chosen so that the set $\{(\sum_{u=1}^{t} \phi_{u-1}^{*} \Delta S_u)_{t=1,\dots,T}\}$ is locally bounded from below: in a discrete setup, with say T = 1, this boundedness assumption forces one to choose $\phi_0 \ge 0$ or $S_1$ bounded, which is rather restrictive.
Kabanov-Rásonyi-Stricker [16] and Schachermayer [23] derive a general version of the super-replication theorem for European claims, while Schäl [24] has studied the American context in an $L^2$-setup without constraints on the strategy. Our proof is different since the result for American claims is obtained thanks to that for European ones.

Theorem 2.2. Suppose that Assumption 2.1 and the no arbitrage condition hold. Let H be a European contingent claim, and assume that

$\sup_{Q\in\mathcal P} E_Q[H] < \infty.$

Then, there exists a minimal hedging strategy $(\hat x, \hat\phi, \hat C) \in \mathcal A^e_H$ such that

$X_t^{\hat x,\hat\phi,\hat C} = \operatorname{ess\,sup}_{Q\in\mathcal P} E_Q[H \mid \mathcal F_t].$

In particular, $p_e(H) = \hat x = \sup_{Q\in\mathcal P} E_Q[H]$.

Let $(H_t)_{t\in\mathcal T}$ be an American contingent claim, and assume that

$\sup_{\tau\in\mathcal S_{0,T},\ Q\in\mathcal P} E_Q[H_\tau] < \infty.$

Then, there exists a minimal hedging strategy $(\hat x, \hat\phi, \hat C) \in \mathcal A^a_H$ such that

$X_t^{\hat x,\hat\phi,\hat C} = \operatorname{ess\,sup}_{\tau\in\mathcal S_{t,T},\ Q\in\mathcal P} E_Q[H_\tau \mid \mathcal F_t].$

In particular, $p_a(H) = \hat x = \sup_{\tau\in\mathcal S_{0,T},\ Q\in\mathcal P} E_Q[H_\tau]$.
Proof. See Appendix.
3. The Main Results

Our main objective now is to derive closed formulae for the super-replication prices in the mathematical framework defined above: while the essential suprema involved in Theorem 2.2 are difficult to evaluate directly because of the set $\mathcal P$, the prices given by the formulae of Theorems 3.1 and 3.2 are simple to compute. Let us introduce two notations:
• we will denote by $\mu_j(S_0, \dots, S_{j-1})$ the conditional law of $S_j$ knowing $\mathcal F_{j-1}$;
• the law of the vector $(S_0, \dots, S_j)$ will be denoted by $\mathbb P_j$.

First we treat the European case. For a measurable function h from $(\mathbb R^d)^{T+1}$ into $\mathbb R$, we define a sequence of operators, based on a kind of concave envelope, by

(2)  $\Gamma^e_T h(x_0, \dots, x_T) = h(x_0, \dots, x_T),$

(3)  $\Gamma^e_j h(x_0, \dots, x_j) = \operatorname{ess\,inf}_{(\alpha,\beta)\in\mathbb R\times K} \{f^{\Gamma^e_{j+1} h}_{\alpha,\beta}\}(x_0, \dots, x_j), \qquad 0 \le j \le T-1,$

where, for u from $(\mathbb R^d)^{j+2}$ into $\mathbb R$, one has

(4)  $f^u_{\alpha,\beta}(x_0, \dots, x_j) = \alpha + \beta^* x_j$ if $\mu_{j+1}(x_0, \dots, x_j)\{z : \alpha + \beta^* z < u(x_0, \dots, x_j, z)\} = 0$, and $f^u_{\alpha,\beta}(x_0, \dots, x_j) = +\infty$ otherwise.

The essential infimum in (3) is taken with respect to the measure $\mathbb P_j$. Note that the definition of the operator $\Gamma^e_j h$ is related to a dynamic programming principle: at each time j, $\Gamma^e_j h$ is, in a sense, the minimal value of any strategy which almost everywhere super-hedges $\Gamma^e_{j+1} h$. Then, the following theorem holds.

Theorem 3.1. Assume Assumption 2.1 and the no arbitrage condition. Let $H = h(S_0, \dots, S_T)$ be a European contingent claim, for some measurable function h from $(\mathbb R^d)^{T+1}$ into $\mathbb R$. Assume that

$\sup_{Q\in\mathcal P} E_Q[H] < \infty.$

Then, there exists a minimal hedging strategy $(\hat x, \hat\phi, \hat C) \in \mathcal A^e_H$ and its value at time $t \le T$ is

$X_t^{\hat x,\hat\phi,\hat C} = \Gamma^e_t h(S_0, \dots, S_t) \quad \mathbb P_t\text{-a.s.}$

In particular, $p_e(H) = \Gamma^e_0 h(S_0)$.

We now turn to the American case, by considering a family $(h_t)_{t\in\mathcal T}$ of measurable functions such that for $t \in \mathcal T$, $h_t$ maps $(\mathbb R^d)^{t+1}$ into $\mathbb R$. We define a new sequence of operators $\Gamma^a$, replacing the equations (2) and (3) by

(5)  $\Gamma^a_T h(x_0, \dots, x_T) = h_T(x_0, \dots, x_T),$

(6)  $\Gamma^a_j h(x_0, \dots, x_j) = \operatorname{ess\,inf}_{(\alpha,\beta)\in\mathbb R\times K} \{f^{\Gamma^a_{j+1} h}_{\alpha,\beta}\}(x_0, \dots, x_j) \vee h_j(x_0, \dots, x_j), \qquad 0 \le j \le T-1.$
Then we get

Theorem 3.2. Assume Assumption 2.1 and the no arbitrage condition. Let $H = (H_t)_{t\in\mathcal T}$ be an American contingent claim, such that

$\sup_{\tau\in\mathcal S_{0,T},\ Q\in\mathcal P} E_Q[H_\tau] < \infty.$

For $t \in \mathcal T$, we denote by $h_t$ a measurable function from $(\mathbb R^d)^{t+1}$ into $[0, \infty)$ such that $H_t = h_t(S_0, \dots, S_t)$ a.s. Then, there exists a minimal hedging strategy $(\hat x, \hat\phi, \hat C) \in \mathcal A^a_H$ and its value at time $t \le T$ is

(7)  $X_t^{\hat x,\hat\phi,\hat C} = \Gamma^a_t h(S_0, \dots, S_t)$ a.s.

In particular, $p_a(H) = \Gamma^a_0 h(S_0)$.

Theorem 2.2 proves the existence of an optimal strategy and thus, from Theorems 3.1 and 3.2, we can easily deduce that the essential infima involved in the definition of the operators $\Gamma^e$ and $\Gamma^a$ are attained. It turns out that the optimal portfolio is the optimal β from (3) and (6), which is easy to compute in the practical examples (see Section 4).

Proof of Theorems 3.1 and 3.2. We only give the proof for American contingent claims, since the European case is very similar. In the following we will denote

(8)  $I_u(x_0, \dots, x_j) = \{(\alpha, \beta) \in \mathbb R\times K \mid \mu_{j+1}(x_0, \dots, x_j)\{z \mid \alpha + \beta^* z < u(x_0, \dots, x_j, z)\} = 0\}.$

First, it is easy to check that the measurability of u implies that of the functions $f^u_{\alpha,\beta}$. Thus, recursively, by definition of the essential infimum and remembering that each $h_t$ is measurable, we can prove that each $\Gamma^a_t h$ is also measurable.

First step: $X_t^{\hat x,\hat\Phi,\hat C} \le \Gamma^a_t h(S_0, \dots, S_t)$ $\mathbb P_t$-a.s. Conditionally on $\mathcal F_{T-1}$, let $(\alpha, \beta) \in I_{h_T}(S_0, \dots, S_{T-1})$; then, by (8),

$h_T(S_0, \dots, S_{T-1}, z) \le \alpha + \beta^* z, \qquad \mu_T(S_0, \dots, S_{T-1})\text{-a.e.}$

Let $Q \in \mathcal P$; since Q is in particular equivalent to P on $\mathcal F_{T-1}$, one gets

$E_Q[h_T(S_0, \dots, S_T) \mid \mathcal F_{T-1}] \le E_Q[\alpha + \beta^* S_T \mid \mathcal F_{T-1}] \le \alpha + \beta^* S_{T-1} \qquad \mathbb P_{T-1}\text{-a.s.}$
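using that $E_Q[\Delta S_t \mid \mathcal F_{t-1}] \in \hat K$. By (4), it follows that

$E_Q[h_T(S_0, \dots, S_T) \mid \mathcal F_{T-1}] \le f^{h_T}_{\alpha,\beta}(S_0, \dots, S_{T-1}) \qquad \mathbb P_{T-1}\text{-a.s.}, \quad \forall\,(\alpha, \beta) \in \mathbb R\times K,$

and thus,

(9)  $E_Q[h_T(S_0, \dots, S_T) \mid \mathcal F_{T-1}] \le \operatorname{ess\,inf}_{(\alpha,\beta)\in\mathbb R\times K} \{f^{h_T}_{\alpha,\beta}\}(S_0, \dots, S_{T-1}) \le \Gamma^a_{T-1} h(S_0, \dots, S_{T-1}) \qquad \mathbb P_{T-1}\text{-a.s.}$

Let $\tau \in \mathcal S_{t,T}$. Writing $H_\tau = H_\tau 1_{\tau\le T-1} + H_T 1_{\tau>T-1}$, it follows from (6) and (9) that $\mathbb P_{T-1}$-a.s. one has

$E_Q[H_\tau \mid \mathcal F_{T-1}] \le 1_{\tau\le T-1}\,\Gamma^a_\tau h(S_0, \dots, S_\tau) + 1_{\tau>T-1}\,\Gamma^a_{T-1} h(S_0, \dots, S_{T-1}) \le \Gamma^a_{(T-1)\wedge\tau} h(S_0, \dots, S_{(T-1)\wedge\tau}).$

Recursively, repeating the same kind of arguments with $\Gamma^a_{T-1} h, \dots, \Gamma^a_{t+1} h$, we get

$E_Q[H_\tau \mid \mathcal F_t] \le \Gamma^a_{t\wedge\tau} h(S_0, \dots, S_{t\wedge\tau}) = \Gamma^a_t h(S_0, \dots, S_t) \qquad \mathbb P_t\text{-a.s.}$

Now, take the essential supremum over $Q \in \mathcal P$ and $\tau \in \mathcal S_{t,T}$, and recall that by Theorem 2.2, there exists $(\hat x, \hat\phi, \hat C) \in \mathcal A^a_H$ such that $X_t^{\hat x,\hat\phi,\hat C} = \operatorname{ess\,sup}_{\tau\in\mathcal S_{t,T},\ Q\in\mathcal P} E_Q[H_\tau \mid \mathcal F_t]$: the first inequality is proved.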
Second step: $X_t^{x,\Phi,C} \ge \Gamma^a_t h(S_0, \dots, S_t)$ $\mathbb P_t$-a.s., for any $(x, \Phi, C) \in \mathcal A^a_H$. Let $(x, \Phi, C) \in \mathcal A^a_H$. Put $\bar\alpha = x + \sum_{i=1}^{T-1} \Phi_{i-1}^{*} \Delta S_i - \Phi_{T-1}^{*} S_{T-1}$ and $\bar\beta = \Phi_{T-1}$: remark that conditionally on $\mathcal F_{T-1}$, $(\bar\alpha, \bar\beta)$ belongs to $I_{h_T}(S_0, \dots, S_{T-1})$. Thus, one has $\mathbb P_{T-1}$-a.s.

$X_{T-1}^{x,\Phi,C} \ge x + \cdots$

and by definition of an H hedging portfolio of an American contingent claim, we conclude

$X_{T-1}^{x,\Phi,C} \ge \Gamma^a_{T-1} h(S_0, \dots, S_{T-1}) \qquad \mathbb P_{T-1}\text{-a.s.}$

Repeating this process, one gets the result for the second step. In particular, this holds true for the minimal strategy and Theorem 3.2 is proved.
4. Application: Some Super-replication Prices

4.1 Specification of the models

The explicit prices given in the sequel are available if for each $j \in \{1, \dots, T\}$, the measure $\mu_j(S_0, \dots, S_{j-1})$ is equivalent to the Lebesgue measure on $(0, \infty)^d$. In that case, it is easy to check that there is no arbitrage opportunity. Note also that all the measures involved in the essential infima can be taken as the Lebesgue measure. Actually, the existence of a positive density for the corresponding law is very often satisfied: we list some examples, illustrating along the way that the results cover a wide class of financial models. Note that tree models do not satisfy this condition of existence of a density w.r.t. the Lebesgue measure (however, at the cost of very tedious computations, it is possible to get super-replication prices for binomial and trinomial models). In the following, W is a q-dimensional Brownian motion.

• The well-known stochastic differential equation of Black-Scholes in its multidimensional version:
$\frac{dS^i_t}{S^i_t} = \mu_i\,dt + \sum_{j=1}^{q} \sigma_{i,j}\,dW^j_t.$
If $\sigma\sigma^*$ is invertible, it is clear that this process satisfies the required condition.

• A non Markovian generalized version of the model above:
$\frac{dS^i_t}{S^i_t} = \mu_i(t, (S_s)_{0\le s\le t})\,dt + \sum_{j=1}^{q} \sigma_{i,j}(t, (S_s)_{0\le s\le t})\,dW^j_t,$
with the non degeneracy condition $[\sigma\sigma^*](\cdot,\cdot) \ge \sigma_0^2\,\mathrm{Id}$ for some $\sigma_0 > 0$. For the existence of the positive density, see Kusuoka-Stroock [18].

• A stochastic volatility model:
$\frac{dS^i_t}{S^i_t} = \mu_i\,dt + \sum_{j=1}^{q} \sigma_{i,j,t}\,dW^j_t,$
where $(\sigma_t\sigma^*_t)_{t\ge 0}$ is a matrix-valued continuous time process, which we assume to be positive definite and independent of the Brownian motion W. It is easy to check the existence of the positive density.
Figure 1. Computation of concave envelopes for the Call option.
• Merton's model with jumps [19]: this is a generalization of the Black-Scholes model including Poisson type jumps. It may be defined by
$S^i_t = S^i_0 \Big(\prod_{j=1}^{N^i_t} (f(Y^i_j) + 1)\Big) \exp\Big(\sum_{j=1}^{q} \sigma_{i,j} W^j_t + \big(\mu_i - \sum_{j=1}^{q} \sigma_{i,j}^2/2\big)t\Big),$
where $(f(Y^i_j))_{1\le i\le d,\ j\ge 1}$ are i.i.d. random variables, strictly greater than −1, and $(N^i_t;\ 1 \le i \le d)$ are Poisson processes with arrival rates $\lambda_i$. All processes and random variables defining this multidimensional model are independent. For this homogeneous Markov process, it is easy to prove the existence of a positive density w.r.t. the Lebesgue measure on $(0, \infty)^d$, assuming that $\sigma\sigma^*$ is invertible.

4.2 Computation of the prices in dimension one

Here, we restrict to one risky asset (d = 1), starting with the unconstrained case ($K = \mathbb R$). We sketch the proofs of some results of Table 1: the task essentially reduces to computing iterated concave envelopes (w.r.t. the Lebesgue measure on $(0, \infty)$), which is easy for the usual options.

4.2.1 Vanilla Options. We first consider the case of a European Call option whose payoff is $h(x_0, \dots, x_T) = (x_T - K)^+$. Applying formula (3), one first gets $\Gamma^e_{T-1} h(x_0, \dots, x_{T-1}) = x_{T-1}$ (see Figure 1); by a straightforward iteration, it follows that $\Gamma^e_j h(x_0, \dots, x_j) = x_j$, and thus $p_e(H) = \Gamma^e_0 h(S_0) = S_0$. Analogously, for the European Put $h(x_0, \dots, x_T) = (K - x_T)^+$, one gets $\Gamma^e_j h(x_0, \dots, x_j) = K$, and thus $p_e(H) = K$. These results have already been obtained by Patry (2001).
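The one-step computation for the Call can be mimicked numerically. The Python sketch below is only an illustration under stated assumptions: the conditional law is replaced by a finite grid on $[0, z_{max}]$ (the truncation level `z_max`, the grid size `n` and all function names are hypothetical, and the point 0 stands for the limit $z \to 0$), d = 1 and $K = \mathbb R$. On a finite set of points, the smallest value at x of an affine function dominating the payoff equals the upper concave hull at x, which is the largest chord value over pairs of grid points on either side of x.

```python
def affine_majorant_value(x, zs, us):
    """Upper concave hull of the points (zs[k], us[k]) evaluated at x.

    zs must be sorted increasingly and x must lie in [zs[0], zs[-1]].
    """
    best = float("-inf")
    for i, zi in enumerate(zs):
        if zi > x:
            break  # left endpoints must satisfy zi <= x
        for j in range(i, len(zs)):
            zj = zs[j]
            if zj < x:
                continue  # right endpoints must satisfy zj >= x
            if zj == zi:
                val = us[i]  # only possible when zi == x == zj
            else:
                w = (x - zi) / (zj - zi)
                val = (1.0 - w) * us[i] + w * us[j]
            best = max(best, val)
    return best


def one_step_price(x, payoff, z_max, n):
    """One backward step of the recursion on a truncated grid (assumption)."""
    zs = [z_max * k / (n - 1) for k in range(n)]
    us = [payoff(z) for z in zs]
    return affine_majorant_value(x, zs, us)


def call(z, strike=100.0):
    return max(z - strike, 0.0)


small_support = one_step_price(50.0, call, 200.0, 401)
large_support = one_step_price(50.0, call, 1.0e6, 1001)
```

With the support truncated at `z_max`, the computed value is $x(1 - K/z_{max})$, so it approaches x as the truncation level grows; this is consistent with $\Gamma^e_{T-1} h = x_{T-1}$ on the unbounded support $(0, \infty)$.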
Figure 2. Computation of concave envelopes for the Up and Out Call option.
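For the American style options, analogous computations provide the same prices as above.

4.2.2 Barrier Options. Let us consider, for example, the case of a European Up and Out Call with payoff $h(x_0, \dots, x_T) = (x_T - K)^+\,\mathbf 1_{\{x_i < U \text{ for all } i\}}$, where $K < U$. For given $x_0, \dots, x_{T-1}$ smaller than U, the concave envelope of the function $x_T \mapsto (x_T - K)^+ \mathbf 1_{x_T < U}$ is $x_T \mapsto x_T(1 - K/U)$ on $(0, U)$ (see Figure 2), so that $\Gamma^e_{T-1} h(x_0, \dots, x_{T-1}) = x_{T-1}(1 - K/U)\,\mathbf 1_{\{x_i < U,\ i \le T-1\}}$. For the associated American claim, for which $h_j(x_0, \dots, x_j) = (x_j - K)^+\,\mathbf 1_{\{x_i < U,\ i \le j\}}$, one also gets $\Gamma^a_{T-1} h(x_0, \dots, x_{T-1}) = x_{T-1}(1 - K/U)\,\mathbf 1_{\{x_i < U,\ i \le T-1\}}$. Iterating, one obtains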
$p_e(H) = p_a(H) = S_0(1 - K/U)$.

4.2.3 Extension. Assume that the contingent claim $H = h(S_0, \dots, S_T)$ can be traded at some extra dates. Then, the definition of the super-replication price should involve more rebalancing dates. But it is easy to prove, in our context of conditional laws equivalent to the Lebesgue measure, that the super-replication prices are unchanged. For example, consider a monthly monitored barrier option with expiration date equal to one year: if we are allowed to hedge each month, or each day, or even each hour, the super-replication price will be the same. Remark that in dimension 1, the only relevant cones are $\mathbb R_+$ and $\mathbb R_-$. In the first case, the prices given in Table 1 are unchanged. In the second one, for bounded payoffs, results still hold; otherwise, prices become infinite.

Table 1. Explicit super-replication prices of some options.

Name | Payoff | European price | American price
Call | $(S_T - K)^+$ | $S_0$ | $S_0$
Put | $(K - S_T)^+$ | $K$ | $K$
Asian Call (Fixed strike) | $(\sum_{i=1}^{T} a_i S_i - K)^+$, $0 \le a_i \le 1$ | $S_0 \sum_{i=1}^{T} a_i$ | $S_0\big(\sum_{i=2}^{T-1} \frac{1}{i} + \frac{2}{T}\big)$ ($a_i = 1/T$)
Asian Call (Floating strike) | $(\sum_{i=1}^{T} a_i S_i - S_T)^+$, $0 \le a_i$ | $S_0 \sum_{i=1}^{T-1} a_i$ | $S_0 \sum_{i=2}^{T} \frac{1}{i}$ ($a_i = 1/T$)
Asian Put (Fixed strike) | $(K - \sum_{i=1}^{T} a_i S_i)^+$, $0 \le a_i$ | $K$ | $K$
Asian Put (Floating strike) | $(S_T - \sum_{i=1}^{T} a_i S_i)^+$, $0 \le a_i \le 1$ | $S_0(1 - a_T)$ | $S_0(1 - a_T)$
Partial Lookback Call | $(S_T - \lambda \min(S_1, \dots, S_T))^+$, $\lambda \in [0, 1]$ | $S_0$ | $S_0$
Call on maximum | $(\max(S_1, \dots, S_T) - K)^+$ | $T S_0$ | $T S_0$
Barrier Up and Out Call | $\mathbf 1_{\forall i,\ S_i < U}\,(S_T - K)^+$ | $S_0(1 - K/U)$ | $S_0(1 - K/U)$
Barrier Up and Out Put | $\mathbf 1_{\forall i,\ S_i < U}\,(K - S_T)^+$ | $K$ | $K$
Barrier Up and In Call | $\mathbf 1_{\exists i,\ S_i > U}\,(S_T - K)^+$ | $S_0$ | $S_0$
Barrier Up and In Put | $\mathbf 1_{\exists i,\ S_i > U}\,(K - S_T)^+$ ($S_0 < U$) | $S_0 K/U$ | $S_0 K/U$
Barrier Down and Out Call | $\mathbf 1_{\forall i,\ S_i > L}\,(S_T - K)^+$ ($S_0 > L$) | $S_0$ | $S_0$
Barrier Down and Out Put | $\mathbf 1_{\forall i,\ S_i > L}\,(K - S_T)^+$ ($S_0 > L$, $K > L$) | $K - L$ | $K - L$
Barrier Down and In Call | $\mathbf 1_{\exists i,\ S_i < L}\,(S_T - K)^+$ ($S_0 > L$) | L1LK | L1LK
Barrier Down and In Put | $\mathbf 1_{\exists i,\ S_i < L}\,(K - S_T)^+$ ($S_0 > L$) | $K$ | $K$

4.3 Some prices in a multidimensional setting

We may consider an option written on d assets with payoff equal to

$H = \Big(\sum_{i=1}^{d} a_i S^i_T + a_0\Big)^+,$

for some real numbers $(a_i)_{0\le i\le d}$: this includes exchange options with payoff equal to $(S^1_T - K S^2_T)^+$, or Call options on an index with payoff equal to $(\sum_{i=1}^{d} p_i S^i_T - K)^+$ (with $p_i \ge 0$ and $\sum_{i=1}^{d} p_i = 1$). It is not hard to check that without cone constraints, the super-replication price is given by

$p_e(H) = (a_0)^+ + \sum_{i=1}^{d} (a_i)^+ S^i_0.$

The optimal strategy is again static and equals $\phi^*_t = ((a_1)^+, \dots, (a_d)^+)$. If there is a cone constraint defined by K, we can easily see that the price is unchanged if K contains every vector $e_i$ (the i-th element of the canonical basis of $\mathbb R^d$) for which $a_i > 0$. Otherwise, one has $p_e(H) = +\infty$.

5. Conclusion

Taking advantage mainly of the primal formulation, we give a recursive formula to compute the super-replication price and the optimal strategy for European and American contingent claims. For this, we have assumed that mild conditions on the underlying assets hold (Assumption 2.1), and that the trading is discrete and constrained to lie in a closed convex cone. When the conditional law of the asset process is equivalent to the Lebesgue measure, we perform explicit computations for the usual options. What clearly emerges is that the super-replication prices are very high. It is already known that in the context of imperfect continuous financial markets, the super-replication price of a European call is equal to $S_0$. Our results show that this feature of high prices remains true for a large class of financial products and models, when only discrete time strategies are allowed. It reinforces the necessity of turning to other concepts to price options, such as minimization of shortfall risk or prices based on utility functions (see Föllmer-Leukert [12], and Hodges-Neuberger [13] among others).
Appendix: Proof of Theorem 2.2

The proof of the European case is standard (see Pham [21], Kabanov-Rásonyi-Stricker [16] and Schachermayer [23]). For the American case, let $(H_t)_{t\in\mathcal T}$ be an American payoff such that

(10)  $\sup_{Q\in\mathcal P,\ \tau\in\mathcal S_{0,T}} E_Q[H_\tau] < \infty.$

By analogy with the usual dynamic programming equation, we introduce the process $Y_t$ defined by

$Y_T = H_T, \qquad Y_t = H_t \vee \operatorname{ess\,sup}_{Q\in\mathcal P} E_Q[Y_{t+1} \mid \mathcal F_t] \quad \text{for } t = 0, \dots, T-1.$

Set $A_t = \{H_t \ge \operatorname{ess\,sup}_{Q\in\mathcal P} E_Q[Y_{t+1} \mid \mathcal F_t]\}$ and

$\tau_T = T, \qquad \tau_t = t\,1_{A_t} + \tau_{t+1}\,1_{A_t^c}.$

Note that each $\tau_t$ belongs to $\mathcal S_{t,T}$: $\tau_0$ will play the role of an optimal stopping time. Actually, the proof of the American part of Theorem 2.2 follows from the following lemma.

Lemma 5.1. With the above notation and under condition (10), there exists a minimal strategy $(Y_0, \hat\phi, \hat C) \in \mathcal A^a_H$ such that

$X_t^{Y_0,\hat\phi,\hat C} = Y_t = \operatorname{ess\,sup}_{Q\in\mathcal P} E_Q[H_{\tau_t} \mid \mathcal F_t] = \operatorname{ess\,sup}_{Q\in\mathcal P,\ \tau\in\mathcal S_{t,T}} E_Q[H_\tau \mid \mathcal F_t].$

We now turn to its proof.

Step 1: $X_t^{x,\phi,C} \ge \operatorname{ess\,sup}_{Q\in\mathcal P,\ \tau\in\mathcal S_{t,T}} E_Q[H_\tau \mid \mathcal F_t]$ for any $(x, \phi, C) \in \mathcal A^a_H$. The result is clear using the admissibility of the American strategy and the supermartingale property of $X^{x,\phi,C}$ under any $Q \in \mathcal P$.

Step 2: $\operatorname{ess\,sup}_{Q\in\mathcal P} E_Q[H_{\tau_t} \mid \mathcal F_t] \ge Y_t$. We proceed by induction. Clearly, the property holds when t = T. Assume now that it holds for t + 1, and let us denote by $(p_e(H_{\tau_{t+1}}), \tilde\phi^{t+1}, \tilde C^{t+1})$ the minimal strategy for the European claim
$H_{\tau_{t+1}}$, using Theorem 2.2. Thus, one deduces from the definition of $Y_t$ that

$Y_t = 1_{A_t} H_t + 1_{A_t^c} \operatorname{ess\,sup}_{Q\in\mathcal P} E_Q[Y_{t+1} \mid \mathcal F_t].$

Step 3: there exists a strategy $(Y_0, \hat\phi, \hat C) \in \mathcal A^a_H$ such that $X_t^{Y_0,\hat\phi,\hat C} = Y_t$. Let $(x^t, \phi^t, C^t)$ be the minimal strategy associated to the European contingent claim $Y_t$: we can take $\phi^t_u = \Delta C^t_{u+1} = 0$ for $u \ge t$, so that $X_t^{x^t,\phi^t,C^t} = X_T^{x^t,\phi^t,C^t} \ge Y_t$. Set

$\hat\phi_t = \phi^{t+1}_t$ for $t = 0, \dots, T-1$,
$\Delta\hat C_t = Y_{t-1} - Y_t + (\phi^{t}_{t-1})^{*}\,\Delta S_t$ for $t = 1, \dots, T$.

We first prove that $\Delta\hat C_t$ is non negative. Theorem 2.2 for the European claim $Y_t$ yields an expression of $\Delta\hat C_t$ in terms of the wealth process $X^{x^t,\phi^t,C^t}$, which is non negative by definition of $Y_{t-1}$. This proves that $(Y_0, \hat\phi, \hat C) \in \mathcal A$.

We now show by induction that $X_t^{Y_0,\hat\phi,\hat C} = Y_t$: this will also complete the proof of $(Y_0, \hat\phi, \hat C) \in \mathcal A^a_H$. For t = 0, this is obvious. If the property holds true at time t, we deduce that

$X_{t+1}^{Y_0,\hat\phi,\hat C} = Y_t + \hat\phi_t^{*}\,\Delta S_{t+1} - \Delta\hat C_{t+1} = Y_{t+1}$

by definition of the consumption $\Delta\hat C_{t+1}$.

The combination of the three steps leads to the equality of Lemma 5.1; taking into account Step 1, we prove the minimality of the strategy $(Y_0, \hat\phi, \hat C)$.

Acknowledgments. This work was partly done while the authors were affiliated with Ecole Nationale des Ponts et Chaussées and Ecole Polytechnique.
References
1. Bensaid, B., Lesne, J., Pages, H. and Scheinkman, J. (1992). Derivative asset pricing with transaction costs. Math. Finance 2, 63–86.
2. Brannath, W. (1997). No-arbitrage and martingale measures in option pricing. PhD thesis. Universität Wien, Vienna, Austria.
3. Broadie, M., Cvitanić, J. and Soner, M. (1998). Optimal replication of contingent claims under portfolio constraints. Rev. of Financial Studies 11, 59–79.
4. Cvitanić, J. and Karatzas, I. (1993). Hedging contingent claims with constrained portfolios. Ann. Appl. Probab. 3, 652–681.
5. Cvitanić, J., Pham, H. and Touzi, N. (1999). A closed formula for the problem of super-replication under transaction costs. Finance Stoch. 3, 35–54.
6. Cvitanić, J., Pham, H. and Touzi, N. (1999). Super-replication in stochastic volatility models under portfolio constraints. J. Appl. Probab. 36, 523–545.
7. Cvitanić, J., Shreve, S. and Soner, H. (1995). There is no nontrivial hedging portfolio for option pricing with transaction costs. Ann. Appl. Probab. 5, 327–355.
8. Dalang, R. C., Morton, A. and Willinger, W. (1990). Equivalent martingale measures and no-arbitrage in stochastic securities market models. Stochastics Stochastics Rep. 29(2), 185–201.
9. El Karoui, N. and Quenez, M. (1995). Dynamic programming and pricing of contingent claims in an incomplete market. SIAM J. Control Optim. 33, 29–66.
10. Föllmer, H. and Kabanov, Y. (1998). Optional decomposition and Lagrange multipliers. Finance Stoch. 2, 69–81.
11. Föllmer, H. and Kramkov, D. (1997). Optional decompositions under constraints. Probab. Theory Related Fields 109, 1–25.
12. Föllmer, H. and Leukert, P. (1999). Quantile hedging. Finance Stoch. 3, 251–273.
13. Hodges, S. and Neuberger, A. (1989). Optimal replication of contingent claims under transaction costs. Review of Futures Markets 8, 222–239.
14. Jouini, E. and Kallal, H. (1995). Arbitrage in securities markets with shortsales constraints. Math. Finance 5, 197–232.
15. Jouini, E. and Kallal, H. (1995). Martingales and arbitrage in securities markets with transaction costs. J. Econ. Theory 66, 178–197.
16. Kabanov, Y., Rásonyi, M. and Stricker, C. (2003). On the closedness of sums of convex cones in L0 and the robust no-arbitrage property. Finance and Stochastics 7(3), 403–411.
17. Karatzas, I. and Kou, S. (1996). On the pricing of contingent claims under constraints. Ann. Appl. Probab. 6, 321–369.
18. Kusuoka, S. and Stroock, D. (1984). Applications of the Malliavin calculus I. In: K. Itô, ed., Stochastic Analysis, Proc. Taniguchi Internatl. Symp. Katata and Kyoto 1982, Kinokuniya, Tokyo, 271–306.
19. Merton, R. (1976). Option pricing when underlying stock returns are discontinuous. Journal of Financial Economics 3, 125–144.
20. Patry, C. (2001). Couverture approchée optimale des options européennes. PhD thesis. Université Paris Dauphine, Paris, France.
21. Pham, H. (2000). Dynamic Lp-hedging in discrete time under cone constraints. SIAM J. Control Optimization 38, 665–682.
22. Pham, H. and Touzi, N. (1999). The fundamental theorem of asset pricing with cone constraints. J. Math. Econom. 31, 265–279.
23. Schachermayer, W. (2004). The Fundamental Theorem of Asset Pricing under Proportional Transaction Costs in Finite Discrete Time. Mathematical Finance 14(1), 19–48.
24. Schäl, M. (1999). Martingale measures and hedging for discrete-time financial markets. Math. Oper. Res. 24, 509–528.
Risky Debt and Optimal Coupon Policy and Other Optimal Strategies
Diana DOROBANTU and Monique PONTIER
Laboratoire de Statistique et Probabilités, Université Paul Sabatier, Toulouse, F31 062 TOULOUSE cedex.
The model developed is a structural valuation model of risky debt which includes a dynamic of debt. The value of the firm follows either a Brownian dynamic or a mixed Brownian-Poisson dynamic. The default threshold is endogenous; thus the optimal default threshold changes as the firm's value changes. Another optimal control problem is set with respect to a couple (coupon or dividend policy, stopping time). Hence a local time is introduced in the firm value dynamic, being the optimal coupon or dividend policy. Once again an optimal threshold is given.

1. Introduction

This work concerns the financial policy of an enterprise. Its value $V$ is a diffusion driven by a Brownian motion $W$ or by a couple $(W, N)$, Brownian motion and Poisson process. The financial policy is $C$, a positive increasing process (coupon or dividend policy). Besides, we consider the failure time $\tau$ and we look for an optimal strategy $(C, \tau)$ that minimizes the debt price ($r > 0$ is the current discount rate):

$$(C, \tau) \mapsto E\Big[\int_0^\tau e^{-rs}\,dC_s + e^{-r\tau} h(V_\tau)\Big];$$

the function $h$ will be specified later. This is an optimal control problem, the strategy being the control $(C, \tau)$. Such problems have been widely studied by many authors with respect to
- either the optimal stopping problem,
- or the coupon policy,
- or both $(C, \tau)$.
We can quote some references among many. Gabillon's working paper [8] was our starting point, guiding us to this optimization problem. Merton's paper [14] is a seminal reference in economics, modeling prices as stochastic processes. Concerning the optimization with respect to the failure time $\tau$, it is often supposed to be a hitting time, $\tau_b = \inf\{s > 0,\ V_s \le b\}$, and the task is to optimize with respect to the level $b$. See Duffie et al. [4, 5], Kou and Wang [12], Mordecki [15] or Dao's thesis [1]. Concerning the optimization with respect to the increasing process $C$, the dividend policy, see Jeanblanc and Shiryaev [11], Pham [17] or Decamps and Villeneuve [3]. Mixing both points of view is not so easy to manage:

$$v(x) = \sup_{(M,C,\tau)} E_x\Big[\int_\tau^\infty e^{-rs}\,dC_s + e^{-r\tau} h(V_\tau)\Big]$$

where $dV_t = V_t(\mu\,dt + \sigma\,dW_t - dC_t + dM_t)$, $V_0 = x$, and $M$ is an investment process. For instance this model is introduced in [9], but here we suppose $dM_t = 0$. Nevertheless, our model cannot be viewed as a particular case of Guo and Pham's [9], since the function they optimize is $E_x[\int_\tau^\infty e^{-rs}\pi(V_s)\,dC_s]$ where $\lim_{x\to\infty} \pi(x)/x = +\infty$. This condition is not satisfied in our case, where $\pi(x) = 1$. Davis and Zervos [2] look for the optimization of $E_x[\int_0^\tau e^{-rs}(\lambda V_s^2 + dC_t + dM_t) + \mu e^{-r\tau}V_\tau^2]$; in such a case, $v$ is no longer convex, and the only way is to use viscosity solutions and variational inequalities. Vath et al. [20] are interested in an impulse control model, but their subject is outside our topic. These authors' point of view is related to the free boundary problem, and the optimization problem is solved using variational inequalities and viscosity solutions.

We start with a simpler problem. In a first part, we fix the coupon policy ($dC_s = e^{g's}\,ds$, $0 \le g' < r$). In such a model, the optimization problem can be turned into an optimal stopping problem; we solve it using standard results and we prove, by convexity arguments, that the default time is a hitting time. In a second part, which no longer concerns the valuation of a risky debt but the optimization of the firm value in case of tax deduction, we restrict the set of stopping times to the hitting times of a level $b$, $b$ being unknown for the moment. We first solve the optimal control problem with respect to the policy $C$, assuming $\tau$ to be the hitting time of $b$ by $V$, using the results of Jeanblanc and Shiryaev [11] or Decamps and Villeneuve [3]. Then we optimize the obtained value with respect to this level $b$. An optimal level $b^*$ is found and the optimal strategy is exhibited as the couple $(C, b^*)$, $C$ being a local time of the value $V$. In this way, we do not need any variational inequalities, and we use the following standard optimal stopping tools [6, 7, 19].
1.1 Optimal stopping tools

For the sake of completeness, we recall the main theorems we use to solve the optimization problem.

Theorem 1.1. (cf. Th. 3.4. [7]) Let $V$ be a strong Markov realization of a right semi-group $(P_t)$ and $f$ a measurable function such that $Y : t \mapsto e^{-rt} f(V_t)$ is of class D. Let $J$ be the Snell envelope of $Y$ (i.e. the smallest supermartingale greater than $Y$): $J_t = \operatorname*{ess\,sup}_{\tau\in\Delta,\ \tau\ge t} E[Y_\tau \mid \mathcal{F}_t]$. Then $J$ has the form $J_t = e^{-rt}\phi(V_t)$ where $\phi$ is the r-reduite of $f$.

Let $\Delta$ be the set of stopping times and recall that a Markov process $Y$ is of class D if the set $\{Y_T,\ T \in \Delta\}$ is uniformly integrable. In the constant coupon case, $dC_s = ds$ and, when $h(x) = x$, $f(x) = 1/r - x$ and $Y_t = e^{-rt} f(V_t)$.

Theorem 1.2. (cf. [19], chap. 3, th. 3, p. 127) If $f$ is upper semicontinuous, satisfying $\sup_t e^{-rt}|f(V_t)| \in L^1$, and $t \mapsto e^{-rt} f(V_t)$ is continuous at $t = 0$, then $\tau^* = \inf\{s \ge 0 : f(V_s) = \phi(V_s)\}$ is the smallest optimal stopping time.

Theorem 1.3. (cf. [7] Remark 3.5) Let $Y$ be a strong Markov process and $J$ its Snell envelope. A stopping time $\tau^*$ is optimal if and only if $Y_{\tau^*} = J_{\tau^*}$ and the process $(J_{\tau^*\wedge t},\ t \ge 0)$ is a martingale.

2. A simple model

We look for a stopping time such that the risks are minimum: we want to minimize the debt price on $\Delta$, $T \mapsto E[\int_0^T e^{-rs}\,dC_s + e^{-rT}V_T^a]$, $a \in\, ]0, 1]$.

2.1 Mixed Brownian-Poisson process diffusion

Let $W$ be a Brownian motion and $N$ a Poisson process with constant intensity $\theta$, on a filtered probability space $(\Omega, (\mathcal{F}_t, t \in \mathbb{R}^+), P)$. We suppose that the value $V$ is solution to the following stochastic differential equation, $g \in \mathbb{R}$, $\sigma > 0$:

(1) $\quad dV_t = V_t\,g\,dt + V_t\,\sigma\,dW_t - V_{t-}\,dN_t.$

The strong solution of this stochastic differential equation with jump is (we use Lépingle and Mémin [13]) $V_t = V_0 \exp[(g - 0.5\sigma^2)t + \sigma W_t]$ almost surely when $t < T_1$, the first jump time of $N$. After this time, $V_t = 0$ since the jump size is $-V_{t-}$, so $V_{T_1} = 0$. The exponential form constrains $V$ to stay at 0. More precisely, given $\tilde N_t = N_t - \theta t$, $V_t = V_0\,e^{(g-\theta)t}\,\mathcal{E}_t(\sigma W - \tilde N)$, $\mathcal{E}$ denoting the Doléans stochastic exponential.
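As an illustration of this solution, the following sketch samples one path of $V$ on a regular time grid: a geometric Brownian motion until the first jump time $T_1$ of $N$, then 0 forever. It is only a sketch; the function name `simulate_value`, the grid and the parameter values are assumptions made for the example and do not come from the paper.

```python
import math
import random


def simulate_value(V0, g, sigma, theta, horizon, n_steps, rng):
    """Sample V on a regular grid: geometric Brownian motion before the
    first Poisson jump time T1, then absorbed at 0."""
    # T1 is exponential with rate theta; theta == 0 means no jump at all.
    T1 = rng.expovariate(theta) if theta > 0 else math.inf
    dt = horizon / n_steps
    W, path = 0.0, []
    for k in range(n_steps + 1):
        t = k * dt
        if t < T1:
            path.append(V0 * math.exp((g - 0.5 * sigma**2) * t + sigma * W))
        else:
            path.append(0.0)  # the jump of size -V_{t-} sends V to 0
        W += rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
    return path


# Hypothetical parameters: pure diffusion (theta = 0) versus a jump case.
rng = random.Random(0)
no_jump_path = simulate_value(1.0, 0.05, 0.2, 0.0, 1.0, 50, rng)
jump_path = simulate_value(1.0, 0.05, 0.2, 2.0, 1.0, 50, rng)
```

With `theta = 0` the path stays strictly positive; with `theta > 0` it is positive up to the sampled $T_1$ and identically 0 afterwards.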
2.2 Optimal stopping problem

Here we fix the coupon policy as $dC_t = e^{g't}\,dt$, $0 \le g' < r$, and $h : x \mapsto x^a$, $0 < a \le 1$. We have to minimize the following map on $\Delta$ (the set of stopping times):

$$T \mapsto E_x\Big[\int_0^T e^{-(r-g')t}\,dt + e^{-rT}V_T^a\Big], \quad x = V_0.$$

Now we put $r' := r - g'$ and remark that $\int_0^T e^{-(r-g')t}\,dt = \frac{1}{r'}(1 - e^{-r'T})$. Following [8], we introduce $\nu_t = e^{-g't}V_t^a$, which satisfies the stochastic differential equation ([10] Theorem 32):

$$d\nu_t = \nu_t\big[ag - g' + \tfrac12 a(a-1)\sigma^2\big]dt + \nu_t\,a\sigma\,dW_t - \nu_{t-}\,dN_t, \quad \nu_0 = V_0 = x.$$

So we have to maximize the map on $\Delta$:

$$T \mapsto E_x[e^{-r'T} f(\nu_T)] = E_x[e^{-r'T}(1/r' - \nu_T)], \quad f(x) = 1/r' - x.$$

It is a standard optimal stopping problem with respect to the process $Y : t \mapsto e^{-r't} f(\nu_t)$.

Proposition 2.1. Let us assume that $r > g'$, $0 < a \le 1$ and

(2) $\quad a$ lies strictly between the two roots of the second-degree polynomial $a \mapsto -r' + ag - g' - \theta + \tfrac12 a(a-1)\sigma^2$.

Then the process $Y : t \mapsto e^{-r't}[\frac{1}{r'} - \nu_t]$ is a strong Markov process going to 0 almost surely and in the $L^1$ sense when $t$ goes to $\infty$. It satisfies the hypotheses of Theorems 1.1, 1.2, 1.3.

Proof: The key is that, for a convenient choice of the parameter $a$, $Z : t \mapsto e^{-r't}\nu_t$ is a positive supermartingale which converges almost surely to 0. Indeed, the non-martingale part of the semimartingale $Z$ is

$$\big[-r' + ag - g' - \theta + \tfrac12 a(a-1)\sigma^2\big]\nu_{t-}\,dt.$$

As soon as the $dt$ coefficient is negative, this process is a supermartingale. Since, besides, $Z$ is nonnegative, it converges almost surely when $t$ goes to infinity. So we need

$$-r' + ag - g' - \theta + \tfrac12 a(a-1)\sigma^2 < 0,$$

meaning that $a$ stays strictly between the roots of this second-degree expression.
On the other hand, $E_x(|Y_t|) \le \frac{1}{r'}e^{-r't} + x^a e^{-bt}$ (with $b = -ag + r + \theta - \tfrac12 a(a-1)\sigma^2 > 0$), so $Y_t$ converges almost surely and in $L^1$ to 0. We define the function $\phi$ by $\phi(x) = \sup_{T\in\Delta} E_x[Y_T]$. Remark that $\phi \ge 0$ since, for all $t$, $\phi(x) \ge E_x[Y_t]$ and $\lim_{t\to\infty} E_x[Y_t] = 0$. So $\phi(x) \ge 0$ (take the supremum over the deterministic times $T = t \in \mathbb{R}^+$).

In such a case, using $\lim_{t\to\infty} Y_t = 0$ almost surely and in $L^1$, S. Villeneuve [21] proved that $\phi(x) = \sup_T E_x[Y_T^+]$. So we are led to an optimal stopping problem for $Y_t^+$, which takes its values in $[0, 1/r']$. •

As a consequence of this proposition, we can apply Theorems 1.1, 1.2, 1.3:

Proposition 2.2. The debt is $1/r' - J_0$, where $J$ denotes the Snell envelope of the process $Y$, satisfying $J_t = e^{-r't}\phi(\nu_t)$; $\tau = \inf\{t \ge 0,\ Y_t = J_t\}$ is the smallest optimal stopping time; and $(e^{-r'(t\wedge\tau)}\phi(\nu_{t\wedge\tau}),\ t \ge 0)$ is an $(\mathcal{F}, P)$-martingale such that $\phi(\nu_\tau) = \frac{1}{r'} - \nu_\tau$.

2.3 Function φ

We now have to express this function $\phi$.

Proposition 2.3. Let us assume $r > g'$ and $a$ as in (2); then $\phi : x \mapsto \sup_{T\in\Delta} E_x[Y_T]$ is a bounded convex function.

Proof: We proved above that $\phi(x) = \sup_T E_x[Y_T^+]$ and noticed that $Y_t^+$ takes its values in $[0, 1/r']$. Thus the function $\phi$ is also bounded. On the other hand, the function $\phi$ is convex as a supremum of convex functions

$$x \mapsto E_1\Big[e^{-r'T}\Big(\frac{1}{r'} - x^a\nu_T\Big)\Big]$$

and $0 < a \le 1$. •

Corollary 2.1. Let $b = \inf\{x > 0 : \phi(x) > 1/r' - x\}$. Then $\{x : \phi(x) = 1/r' - x\} = [0, b]$ and $\phi(0) = 1/r' = f(0)$.

Proof: Remark that under $P_0$, $V_0 = 0$, so $\nu_0 = 0$. Thus $\phi(0) = 1/r' = f(0)$. So for all $x \in [0, b]$, $\phi(x) = f(x)$. Since $f$ and $\phi$ are convex, $\phi \ge f$ and $f(0) = \phi(0)$, actually $\{x : \phi(x) = 1/r' - x\} = [0, b]$. •

Proposition 2.4. Let us assume $\nu_0 > b$ and let $\tau = \inf\{s \ge 0 : f(\nu_s) = \phi(\nu_s)\}$. Then $\tau$ is the smallest optimal time. On the other hand, let $\tau_b = \inf\{s \ge 0 : \nu_s \le b\}$; then $\tau = \tau_b$.

Proof: The first point is a consequence of Theorem 1.2. Corollary 2.1 expresses that actually $\tau = \inf\{s \ge 0 : \nu_s \in [0, b]\}$, meaning $\tau = \tau_b$. •
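Proposition 2.5. Assuming the model (1) and $dC_t = e^{g't}\,dt$, $0 \le g' < r$, there exist

$$\gamma_1 = \frac{-(ag - g' - \tfrac12 a\sigma^2) - \sqrt{(ag - g' - \tfrac12 a\sigma^2)^2 + 2(r'+\theta)a^2\sigma^2}}{a^2\sigma^2} < 0, \qquad b^* = \frac{-\gamma_1}{(r'+\theta)(1-\gamma_1)} \le \frac{1}{r'+\theta}$$

such that

$$\phi(x) = (1/r' - x)\,1_{x<b^*} + \Big[\Big(\frac{1}{r'+\theta} - b^*\Big)\Big(\frac{x}{b^*}\Big)^{\gamma_1} + \frac{\theta}{r'(r'+\theta)}\Big]1_{x\ge b^*}.$$

Proof: We have to verify that, with such a function $\phi$, $J_t = e^{-r't}\phi(\nu_t)$ is the Snell envelope of $Y$. Since $f$ is an affine function and $\phi$ is convex with $\phi = f$ on $[0, b]$, we have $\phi \ge f$ as soon as the right derivative satisfies $\phi'(b^+) \ge f'(b) = -1$:

$$b \ge (=)\ \frac{-\gamma_1}{(r'+\theta)(1-\gamma_1)} \quad \text{yields} \quad \phi'(b^+) = \frac{\gamma_1}{b}\Big(\frac{1}{r'+\theta} - b\Big) \ge (=)\ -1.$$

On the other hand, $\phi(\nu_\tau) = f(\nu_\tau)$, so it is enough to prove that $t \mapsto e^{-r'(t\wedge\tau)}\phi(\nu_{t\wedge\tau})$ is a martingale. With a good choice of $b$, $\phi'(b^+) = f'(b)$, thus the function $\phi$ belongs to $C^2(\mathbb{R}^+ - \{b\})$. We apply the Itô formula to $J : t \mapsto e^{-r't}\phi(\nu_t)$: on $\{t < \tau\} = \{\nu_t > b\}$, $\phi(\nu_t) = (\frac{1}{r'+\theta} - b)(\frac{\nu_t}{b})^{\gamma_1} + \frac{\theta}{r'(r'+\theta)}$ and the non-martingale part of $e^{r't}\,dJ_t$ is

$$\tfrac12 a^2\sigma^2\gamma_1(\gamma_1 - 1)\Big(\frac{1}{r'+\theta} - b\Big)\Big(\frac{\nu_t}{b}\Big)^{\gamma_1} + \big(ag - g' + \tfrac12\sigma^2 a(a-1)\big)\gamma_1\Big(\frac{1}{r'+\theta} - b\Big)\Big(\frac{\nu_t}{b}\Big)^{\gamma_1} - (r'+\theta)\Big[\Big(\frac{1}{r'+\theta} - b\Big)\Big(\frac{\nu_t}{b}\Big)^{\gamma_1} + \frac{\theta}{r'(r'+\theta)}\Big] + \theta\phi(0) = 0.$$

Indeed, $\gamma_1$ is the nonpositive solution of the second-degree equation $\tfrac12 a^2\sigma^2\gamma(\gamma - 1) + (ag - g' + \tfrac12\sigma^2 a(a-1))\gamma - r' - \theta = 0$, so the coefficient of $(\frac{1}{r'+\theta} - b)(\frac{\nu_t}{b})^{\gamma_1}$ is null. Finally, $\phi(0) = 1/r'$ concludes the proof. •

Another proof is to deduce from $\phi(\nu_{\tau_b}) = f(\nu_{\tau_b})$ and the martingale property that $\phi(x) = E_x[f(\nu_{\tau_b})]$. This is computable using the Laplace transform.

Remark 2.1. In the case of a pure diffusion process, there is nothing more to prove: it is enough to set the intensity $\theta = 0$. Thus the optimal solution is the hitting time of $b^*$ (defined below), and so is the optimal value: there exist

$$\gamma = \frac{-(ag - g' - \tfrac12 a\sigma^2) - \sqrt{(ag - g' - \tfrac12 a\sigma^2)^2 + 2r'a^2\sigma^2}}{a^2\sigma^2} < 0, \qquad b^* = \frac{-\gamma}{r'(1-\gamma)} \le \frac{1}{r'}$$

such that the optimal value is

$$\phi(x) = (1/r' - x)\,1_{x<b^*} + \Big(\frac{1}{r'} - b^*\Big)\Big(\frac{x}{b^*}\Big)^{\gamma}\,1_{x\ge b^*}.$$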
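The closed form above is easy to evaluate numerically. The sketch below computes $\gamma_1$, $b^*$ and $\phi$ from the formulas of Proposition 2.5, with $r' = r - g'$ and `g_prime` denoting the growth rate of the coupon; taking `theta = 0` gives the pure diffusion case of Remark 2.1. The function name and the parameter values are hypothetical, chosen only so that the supermartingale condition of Proposition 2.1 holds.

```python
import math


def threshold_solution(a, g, g_prime, r, sigma, theta):
    """Return (gamma1, b_star, phi) for the closed form of Proposition 2.5."""
    rp = r - g_prime                          # r' := r - g'
    m = a * g - g_prime - 0.5 * a * sigma**2  # linear coefficient of the quadratic
    s2 = (a * sigma) ** 2
    gamma1 = (-m - math.sqrt(m * m + 2.0 * (rp + theta) * s2)) / s2
    b_star = -gamma1 / ((rp + theta) * (1.0 - gamma1))

    def phi(x):
        if x < b_star:
            return 1.0 / rp - x               # stopping region: phi = f
        return ((1.0 / (rp + theta) - b_star) * (x / b_star) ** gamma1
                + theta / (rp * (rp + theta)))

    return gamma1, b_star, phi


# Hypothetical parameter values, chosen only for illustration.
gamma1, b_star, phi = threshold_solution(a=1.0, g=0.02, g_prime=0.01,
                                         r=0.05, sigma=0.2, theta=0.1)
```

Two quick sanity checks are continuity at the threshold, `phi(b_star)` against `1/r' - b_star`, and the smooth fit, a right derivative of `phi` at `b_star` close to −1.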
Proposition 2.6. The default time is

$$\tau = \inf\{\theta_b, T_1\}$$

where $\theta_b$ is the first passage time of $V_0\,e^{-g't}\exp((ag - \tfrac12 a\sigma^2)t + a\sigma W_t)$ below the level $b$ and $T_1$ is the first jump time of $N$. Clearly, these two stopping times are independent since a Brownian motion and a Poisson process with a deterministic intensity are independent. So the default time law is:

$$Q\{\tau \le t\} = 1 - Q\{\theta_b > t\}\,Q\{T_1 > t\} = 1 - [\Phi(\ldots)\ldots]$$

where $\Phi$ is the distribution function of the standard Gaussian law, $x_0 = \ln\frac{V_0}{b}$, $\eta = ag - g' - \tfrac12 a\sigma^2$.

Proof: Remark that $\nu_{T_1} = 0 < \nu_b$, thus necessarily $\tau \le T_1$. Besides, the process $\nu$ goes under the level $b$ either if it jumps to 0 (time $T_1$) or if its continuous part hits the level $b$. The second point is a consequence of exercise 1.21 p. 309 [18]. •

On the other hand, let us consider the problem of an optimal default threshold: $\nu_b = \arg\min\{b \mapsto D_t(b)\}$ where the debt is

$$D_t(b) = \frac{1}{r'} - \phi(\nu_t) = \frac{1}{\theta + r'} - \Big(\frac{1}{\theta + r'} - b\Big)\Big(\frac{\nu_t}{b}\Big)^{\gamma_1} \ \text{ when } \nu_t \ge b; \qquad D_t(b) = \nu_t \ \text{ when } \nu_t \le b.$$

It is easy to see that, for all $t \ge 0$ and all $\nu_t$, $D_t$ is minimum when $\nu \mapsto -(\frac{1}{r'+\theta} - \nu)(\frac{1}{\nu})^{\gamma_1}$ is, so the optimal threshold is

$$\nu_b = \frac{-\gamma_1}{(r'+\theta)(1-\gamma_1)} < \frac{1}{\theta + r'}.$$

This result is consistent with Proposition 2.5.

3. Reflective diffusion

Now let us consider a new optimization problem: we suppose that paying dividends allows the enterprise to obtain a tax deduction, proportional to the coupon payment, and that, at the failure time, it has to pay a fraction $\alpha$ of the discounted value ($\beta, \alpha \in [0, 1[$, $C$ increasing process, $T \in \Delta$). Thus this optimization problem concerns the strategy couple $(C, T)$:

$$\max_{\{C,T\}} E_v\Big[\beta\int_0^T e^{-rs}\,dC_s - \alpha e^{-rT}V_T\Big],$$
the first term is a benefit for the firm. At the failure time $T$, the firm has to pay a fraction $\alpha$ of its discounted value. However, here we restrict the set of stopping times to the set of hitting times of $V$.

3.1 A first step is to optimize the first term (following [11] or [3]) with respect to the increasing process $C$: in [11] or [3] $\tau$ is the failure time; here it is the hitting time by the process $V$ of the a priori fixed level $b$. Let the firm's production capacity be $V_t = be^{X_t}$, $V_0 = v = be^{X_0}$, where

$$dX_t := (g - 0.5\sigma^2)\,dt + \sigma\,dW_t - dC_t, \quad X_0 = x, \quad X_t = \ln\frac{V_t}{b},$$

so $\inf\{t \ge 0 : X_t \le 0\} = \inf\{t \ge 0 : V_t \le b\}$.

Proposition 3.1. If the threshold $b$ is fixed, the optimal strategy is $C = L^{x_0}$, the local time of $X$ at the level $x_0$, the free boundary,

$$x_0 = \frac{\ln\alpha_2^2 - \ln\alpha_1^2}{\alpha_1 + \alpha_2},$$

where $-\alpha_2$, $\alpha_1$ are the two solutions of $\tfrac12\sigma^2x^2 + gx - r = 0$, $-\alpha_2 < 0 < \alpha_1$. Moreover,

$$F(v, b) := E_v\Big[\int_0^{\tau_b} e^{-rs}\,dL_s^{x_0}\Big] = \frac{(v/b)^{\alpha_1} - (v/b)^{-\alpha_2}}{\alpha_1\big(\frac{\alpha_2}{\alpha_1}\big)^{\frac{2\alpha_1}{\alpha_1+\alpha_2}} + \alpha_2\big(\frac{\alpha_2}{\alpha_1}\big)^{\frac{-2\alpha_2}{\alpha_1+\alpha_2}}}.$$
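The quantities of Proposition 3.1 can be evaluated directly. The sketch below solves the quadratic $\tfrac12\sigma^2x^2 + gx - r = 0$ as stated in the proposition, computes the free boundary $x_0$ and evaluates $F(v, b)$. The function names and the parameter values used afterwards are hypothetical; note that $x_0 > 0$ requires $\alpha_2 > \alpha_1$, which for this quadratic means $g > 0$.

```python
import math


def dividend_barrier(g, r, sigma):
    """Return (alpha1, alpha2, x0), where -alpha2 < 0 < alpha1 are the roots
    of 0.5*sigma^2*x^2 + g*x - r = 0 and x0 is the free boundary."""
    root = math.sqrt(g * g + 2.0 * sigma**2 * r)
    alpha1 = (-g + root) / sigma**2
    alpha2 = (g + root) / sigma**2
    x0 = (math.log(alpha2**2) - math.log(alpha1**2)) / (alpha1 + alpha2)
    return alpha1, alpha2, x0


def F(v, b, g, r, sigma):
    """Value of the local-time dividend policy when V starts at v and the
    firm stops when V first hits the level b (formula of Proposition 3.1)."""
    a1, a2, _ = dividend_barrier(g, r, sigma)
    ratio = a2 / a1
    denom = (a1 * ratio ** (2.0 * a1 / (a1 + a2))
             + a2 * ratio ** (-2.0 * a2 / (a1 + a2)))
    return ((v / b) ** a1 - (v / b) ** (-a2)) / denom


# Hypothetical parameters, for illustration only.
alpha1, alpha2, x0 = dividend_barrier(g=0.03, r=0.05, sigma=0.2)
value = F(v=1.2, b=1.0, g=0.03, r=0.05, sigma=0.2)
```

The formula gives `F(b, b) = 0` (the firm stops immediately), and the computed $x_0$ satisfies $\alpha_1^2 e^{\alpha_1 x_0} = \alpha_2^2 e^{-\alpha_2 x_0}$, which is another way of writing the expression for $x_0$.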
Proof: We use Proposition 2.1 in [3] (as a consequence of [11]):

$$\sup_Z E_x\Big[\int_0^{\tau_0} e^{-rt}\,dZ_t\Big] = V_0(x) = E_x\Big[\int_0^{\tau_0} e^{-rt}\,dL_t^{x_0}\Big],$$

$\tau_0 = \inf\{t \ge 0 : X_t \le 0\}$, with $V_0$ and $x_0$ being expressed in (2.5) and (2.6) of [3], respectively: (3) [...] (4) [...]
3.2 The second step is to optimize this last result with respect to the level $b$. Since $V_t = be^{X_t}$ and, for all $t \ge 0$, $X_t \le x_0$, the upper bound of $V$ is $be^{x_0}$ and $b \ge ve^{-x_0}$. So we have to optimize on $[ve^{-x_0}, v]$ the function

$$G : b \mapsto \beta F(v, b) - \alpha E_v[e^{-r\tau_b}V_{\tau_b}].$$

Remark that $b \mapsto F(v, b)$ is decreasing since $\alpha_1 > 0 > -\alpha_2$. Because of continuity, necessarily $V_{\tau_b} = b$. Besides, $b \mapsto \tau_b = \inf\{t : V_t \le b\}$ is decreasing, so $b \mapsto E_v[e^{-r\tau_b}]$ is increasing, and so is $b \mapsto bE_v[e^{-r\tau_b}]$. Since $\alpha$ and $\beta$ are nonnegative, the function $G$ is decreasing and the optimum is reached at $b = ve^{-x_0}$. The optimal stopping time is the hitting time $\tau_{ve^{-x_0}}$, and the optimal value is

$$G(v, ve^{-x_0}) = \beta F(v, ve^{-x_0}) - \alpha E_v[e^{-r\tau_{ve^{-x_0}}}V_{\tau_{ve^{-x_0}}}].$$

As a conclusion we get

Theorem 3.1. The optimal strategy is $(L^{x_0}, \tau_{ve^{-x_0}})$ where $L^{x_0}$ is the local time of the process $t \mapsto X_t = \ln\frac{V_t}{ve^{-x_0}}$ at the level $x_0 = \frac{\ln\alpha_2^2 - \ln\alpha_1^2}{\alpha_1 + \alpha_2}$, $-\alpha_2$ and $\alpha_1$ being the two solutions of $\tfrac12\sigma^2x^2 + gx - r = 0$, $-\alpha_2 < 0 < \alpha_1$, and $\tau_{ve^{-x_0}} = \inf\{t \ge 0 : V_t \le ve^{-x_0}\}$. Then the optimal value is $G(v, ve^{-x_0})$.

Finally, remark that this is not "the" optimal policy. Instead of fixing the level $b$, let us fix the coupon policy $dC_t = e^{g't}\,dt$, $g' < r$. We have a standard optimal stopping problem to solve, as in the first part:

$$v(x) = \sup_{T\in\Delta} E_x\Big[\beta\int_0^T e^{-r't}\,dt - \alpha e^{-rT}V_T\Big].$$

We look for the Snell envelope of

$$Y : t \mapsto \frac{\beta}{r'}(1 - e^{-r't}) - \alpha e^{-rt}V_t = \frac{\beta}{r'} - e^{-r't}\Big(\frac{\beta}{r'} + \alpha e^{-g't}V_t\Big).$$

This process is less than the constant $\frac{\beta}{r'}$, which is its limit when $t$ goes to infinity: the optimal stopping time is $\tau = \infty$ and the optimal value is $\frac{\beta}{r'}$, which is to be compared to $G(v, be^{-x_0})$ for any $v$. But for all $v$ ($x_0$ only depends on $(r, \sigma, g)$) there exists $g'$ close enough to $r$ so that

$$\frac{\beta}{r'} > G(v, be^{-x_0}).$$

Finally, we can choose $g'$ so that this optimal value is better than the previous one. The conclusion is that the best thing to do is to go on as long as the tax deduction is applied.
4. Conclusion

Remark that Theorem 3.1 is the firm's point of view. But the shareholders' point of view could be to optimize

$$H : b \mapsto F(v, b) + \alpha E_v[e^{-r\tau_b}V_{\tau_b}].$$

In such a case, we need to explicitly compute the map $b \mapsto E_v[e^{-r\tau_b}V_{\tau_b}]$. But this new function is no longer decreasing and actually the optimal threshold will be higher than the minimum $be^{x_0}$. We do not need any computation for this conclusion. Thus the shareholders' interest contradicts the firm's interest; their power leads to a too-early firm failure. For instance, it could lead to unemployment. Our suggestion could be, instead of always optimizing the shareholders' wealth, to build other financial models to avoid a total economic disaster, the consequence of shareholders' optimal strategies. On this topic, we can refer to Peyrelevade [16], page 87:

"Other avenues can be considered: using taxation to encourage the reinvestment of profits rather than their distribution, and making share buybacks more costly, or even prohibiting them; allowing much higher dividends for shares held for a longer time, so as to stabilize shareholdings; again through the taxation of capital gains, favoring long holdings and discouraging round trips..."

or to a recent paper in Le Monde [22], in which Cécile Ducourtieux concludes as follows:

"Financiers set dividend payments and investment against each other. According to them, by giving too large a share to shareholders, for lack of confidence in the future, of imagination or of investment opportunities, companies could compromise their long-term value. 'It is less risky to return money to shareholders than to invest it. But the current, rather low, levels of investment will have an impact on future levels of profit', says Thomas Aubrey, an executive of the British consulting firm Thomson Financial."

References
1. DAO B. (2005) "Approche structurelle du risque de crédit avec des processus mixtes diffusion-sauts", Thèse de l'université Paris-Dauphine.
2. DAVIS M.H.A. and ZERVOS M. (1994) "A problem of singular stochastic control with discretionary stopping", The Annals of Applied Probability, vol 4-1, 226-240.
3. DECAMPS J.P. and VILLENEUVE S. (2006), "Optimal dividends policy and growth option", preprint.
4. DUFFIE D. and LANDO D. (2001) "Term Structures of Credit Spreads with Incomplete Accounting Information", Econometrica, 69, 633-664.
5. DUFFIE D. and SINGLETON K.J. (1999), "Modeling Term Structures of Defaultable Bonds", Review of Financial Studies, Special 1999, 12(4), 687-720.
6. N. EL KAROUI (1981), Les Aspects Probabilistes du Contrôle Stochastique, Lecture Notes in Mathematics 876, p.73-238, Springer-Verlag, Berlin.
7. N. EL KAROUI, J.-P. LEPELTIER, A. MILLET (1992), "A Probabilistic Approach of the Reduite", Probab. Math. Statist. 13, no 1, p.97-121.
8. GABILLON J.C. (2003) "Le risque de taux de la dette risquée", working paper, ESCT.
9. GUO X. and PHAM H. (2005) "Optimal partially reversible investment with entry decision and general production function", Stochastic Processes and their Applications.
10. IKEDA and WATANABE (1981), Stochastic Differential Equations and Diffusion Processes, North-Holland, Amsterdam.
11. JEANBLANC M. and SHIRYAEV A.N. (1995) "Optimization of the flow of dividends", Russian Math. Survey, 50:2, 257-277.
12. KOU S.G., WANG H. (2004) "Option pricing under a double exponential jump diffusion model", Management Science, 1178-1192.
13. LÉPINGLE D. and MÉMIN A. (1978), "Sur l'uniforme intégrabilité des martingales exponentielles", Zeit. für Wahrsche. Theory, 42(3), 175-203.
14. MERTON R. C. (1974), "On the Pricing of Corporate Debt: The Risk Structure of Interest Rates", The Journal of Finance, 29, May, 449-70.
15. MORDECKI E. (1999) "Optimal stopping for a diffusion with jumps", Finance and Stochastics, vol 3, 227-236.
16. J. PEYRELEVADE (2005), Le capitalisme total, Seuil, Paris.
17. PHAM H. (1997), "Optimal stopping, free boundary and American option in a jump-diffusion model", Applied Math. Optimization, 35, 145-164.
18. REVUZ D. and YOR M. (1991) Continuous martingales and Brownian motion, Springer Verlag, Berlin.
19. SHIRYAEV A. (1978) Optimal Stopping Rules, Springer-Verlag, New-York.
20. VATH V. L., PHAM H., VILLENEUVE S. (2006), "A mixed singular/switching control problem for a dividend policy with reversible technology investment", preprint.
21. VILLENEUVE S. (2005), "On the threshold strategies and smooth-fit principle for optimal stopping problem", preprint.
22. DUCOURTIEUX C. "La généreuse politique de dividendes des sociétés du CAC 40", daily paper "Le Monde" dated Tuesday 16 May 2006, 14.
Affine Credit Risk Models under Incomplete Information
Rüdiger Frey (1), Cecilia Prosdocimi (2), Wolfgang J. Runggaldier (2)
(1) Department of Mathematics, University of Leipzig, 04081 Leipzig, Germany.
(2) Dipartimento di Matematica Pura ed Applicata, Università di Padova, Via Trieste 63, 35121 Padova, Italy.
We consider the problem of computing some basic quantities such as defaultable bond prices and survival probabilities in a credit risk model according to the intensity based approach. We let the default intensities depend on an external factor process that we assume is not observable. We use stochastic filtering to successively update its distribution on the basis of the observed default history. On one hand this allows us to capture aspects of default contagion (information-induced contagion). On the other hand it allows us to evaluate the above quantities also in our incomplete information context. We consider in particular affine credit risk models and show that in such models the nonlinear filter can be computed via a recursive procedure. This then leads to an explicit expression for the filter that depends on a finite number of sufficient statistics of the observed interarrival times for the defaults, provided one chooses an initial distribution for the factor process that is of the Gamma type.

Key words: Credit risk, affine models, incomplete information, pricing of credit derivatives, nonlinear filtering, finite dimensional filters.

1. Introduction

We consider the problem of evaluating some basic quantities such as defaultable bond prices and survival probabilities in a credit risky environment under incomplete information on the underlying model. We use the reduced-form or intensity-based approach to credit risk with default times
modeled as the jump times of a doubly stochastic Poisson process. In this model class default intensities are driven by a common factor process Xt ; this is a convenient way for generating dependence between default events of different firms. Typically it is assumed that Xt is not directly observable, and this will also be the main setting here. In that case the distribution of Xt can be updated on the basis of the observed default history and this leads to what may be termed as information-induced contagion (see e.g. [8]). In the next Section 2 we discuss our underlying model and describe three examples of basic quantities in a credit risky environment (prices of defaultable bonds with and without recovery and survival probabilities) that we first evaluate under the assumption of full knowledge also of the factor process with the main purpose of motivating the filtering problem that arises in the evaluation of these same quantities when only the defaults are observable, but not the values of Xt . In Section 3 we discuss the filtering problem and present its general solution. Although in explicit form, this general solution will in most cases be difficult to compute and so the interest arises to consider particular classes of models, for which the solution can actually be computed. One such class corresponds to the case when Xt is a finite-state Markov chain and this is discussed in [4] in a more general context. In this paper we shall concentrate on the so-called affine case and this is the subject of Section 4, where we show that the filter can be computed via a recursive algorithm. The actual computation of this recursive algorithm is discussed in the last Section 5, where we also show that for a suitable choice of the initial distribution of the factor process the filter can be computed as an explicit function of a finite number of sufficient statistics of the observed default interarrival times. 
In this paper we assume that the information available to an agent comes only from observing the default history. More general information structures can be envisaged, such as in the case when agents can also observe prices of defaultable bonds. This generalization is considered in [4], but not for the affine case.

2. The Model and Some Basic Examples

Consider a market with $m$ firms that may default and denote by $\tau_j$ the default time of firm $j \in \{1, \cdots, m\}$. The default state of the portfolio can be summarized by the default indicator process

(1) $\quad Y_t = (Y_{t,1}, \cdots, Y_{t,m})_{t\ge0} \quad$ with $\quad Y_{t,j} = 1_{\{\tau_j \le t\}}$.

Given a filtered probability space $(\Omega, \mathcal{F}, \mathcal{G}_t, P)$, all processes will be $\mathcal{G}_t$-adapted and $\tau_j$ is a $\mathcal{G}_t$-stopping time. Our intensity-based default model implies that there are no common defaults among the firms, so that
we may introduce the ordered default times $0 = T_0 < T_1 < \cdots < T_m$. One may then also consider what can be called the default-identity process $\xi_n$ that denotes the identity of the firm defaulting at $T_n$. The default observation history $\mathcal{H}_t \subset \mathcal{G}_t$ can then be given the following two equivalent representations

(2) $\quad \mathcal{H}_t = \sigma\{Y_s\,;\ s \le t\} = \sigma\{(T_n, \xi_n)\,;\ T_n \le t\}.$

The factor process $X_t \in \mathbb{R}$ may be any Markov process (a specific such process will be considered below, see (6)). We assume that default times are conditionally independent, doubly stochastic random times (see [7], Section 9.6); the default intensity of firm $j$ at time $t$ is given by $\lambda_j(X_t)$ for some function $\lambda_j : \mathbb{R} \to (0, \infty)$. Formally, this means that default times are independent given $\mathcal{F}_\infty^X = \sigma(X_t : t \ge 0)$ with conditional survival probabilities given by

$$P(\tau_j > t \mid \mathcal{F}_\infty^X) = \exp\Big(-\int_0^t \lambda_j(X_s)\,ds\Big).$$
2.1 The affine case We shall say that we are in the affine case if Xt satisfies a diffusion equation (3)
Furthermore, assuming for the sake of generality that the short rate is also driven by the factor process, i.e. $r_t = r(X_t)$,

(5) $\quad \lambda_j(X_t) = \lambda_j X_t,\ \ \lambda_j > 0; \qquad r(X_t) = r X_t,\ \ r > 0.$

In particular, for the process $X_t$ we shall consider a Cox-Ingersoll-Ross (CIR)-type model, i.e.

(6) $\quad dX_t = (a - bX_t)\,dt + \sigma\sqrt{X_t}\,dw_t$

with $a, b, \sigma > 0$ and $a \ge \frac{\sigma^2}{2}$ so that $X_t > 0$ a.s. which, by (5), will then also imply that $\lambda_j(X_t) > 0$, $r(X_t) > 0$ a.s. For this affine case in what follows
we shall be able to derive explicit expressions both in the case of full as well as of partial information. To this effect we recall here the following proposition, which in its general form can e.g. be found in [6], Section 6.2.2 by making the following identifications: $t = T - t$, $\lambda = \beta$, $\mu = \bar\lambda$, $\psi(T - t) = B(t, T)$, $-a\phi(T - t) = A(t, T)$. The particular case for $\beta = 0$ can also be found in [7], Section 9.5.2. The derivation is based on the Kolmogorov equation for functionals of Markov processes.

Proposition 1. Let $X_t$ satisfy (6) and define
$$F(t, x) := E_{t,x}\left\{ e^{-\beta X_T} \exp\left(-\int_t^T \bar\lambda X_s\,ds\right)\right\} \tag{7}$$
for a generic $\beta \ge 0$ and $\bar\lambda > 0$. In the present affine case this function $F(t, x)$ admits the representation
$$F(t, x) = \exp\left[A(t, T) - B(t, T)\,x\right] \tag{8}$$
where, for given T, the functions $A(\cdot, T)$, $B(\cdot, T)$ satisfy first order ordinary differential equations in $t \in [0, T]$ (9) and are given by
$$B(t,T) = \frac{\beta\left[\gamma + b + e^{\gamma(T-t)}(\gamma - b)\right] + 2\bar\lambda\left(e^{\gamma(T-t)} - 1\right)}{\beta\sigma^2\left(e^{\gamma(T-t)} - 1\right) + \gamma - b + e^{\gamma(T-t)}(\gamma + b)}, \qquad A(t,T) = \frac{2a}{\sigma^2}\log\left(\frac{2\gamma\, e^{(T-t)(\gamma+b)/2}}{\beta\sigma^2\left(e^{\gamma(T-t)} - 1\right) + \gamma - b + e^{\gamma(T-t)}(\gamma + b)}\right) \tag{10}$$
with $\gamma = \sqrt{b^2 + 2\sigma^2\bar\lambda}$.
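As a numerical sanity check of the coefficient formulae in (10), the following minimal sketch evaluates A and B as functions of the time to maturity tau = T - t. It also tests them against the Riccati system obtained by substituting the exponentially affine form (8) into the Kolmogorov equation for (6), namely dB/dt = bB + (sigma^2/2)B^2 - lambda_bar with B(T,T) = beta, and dA/dt = aB with A(T,T) = 0. That system is derived here for illustration and is an assumption about the content of the paper's display (9). The function names `cir_coefficients` and `riccati_residuals` and all parameter values are hypothetical.

```python
import math

def cir_coefficients(tau, a, b, sigma, lam_bar, beta=0.0):
    """Return (A, B) of (10) for time to maturity tau = T - t."""
    gamma = math.sqrt(b * b + 2.0 * sigma**2 * lam_bar)
    e = math.exp(gamma * tau)
    denom = beta * sigma**2 * (e - 1.0) + gamma - b + e * (gamma + b)
    B = (beta * (gamma + b + e * (gamma - b)) + 2.0 * lam_bar * (e - 1.0)) / denom
    A = (2.0 * a / sigma**2) * math.log(
        2.0 * gamma * math.exp(tau * (gamma + b) / 2.0) / denom
    )
    return A, B

def riccati_residuals(tau, a, b, sigma, lam_bar, beta=0.0, h=1e-5):
    """Central-difference residuals of the (assumed) Riccati system, written in tau.

    In terms of tau = T - t the system reads
        dA/dtau = -a * B,   dB/dtau = lam_bar - b * B - 0.5 * sigma^2 * B^2.
    """
    A_plus, B_plus = cir_coefficients(tau + h, a, b, sigma, lam_bar, beta)
    A_minus, B_minus = cir_coefficients(tau - h, a, b, sigma, lam_bar, beta)
    _, B = cir_coefficients(tau, a, b, sigma, lam_bar, beta)
    dA = (A_plus - A_minus) / (2.0 * h)
    dB = (B_plus - B_minus) / (2.0 * h)
    return dA + a * B, dB - (lam_bar - b * B - 0.5 * sigma**2 * B * B)

if __name__ == "__main__":
    # Illustrative parameters satisfying a >= sigma^2 / 2.
    print(cir_coefficients(0.0, 0.5, 0.8, 0.6, 0.3, beta=0.4))
    print(riccati_residuals(1.3, 0.5, 0.8, 0.6, 0.3, beta=0.4))
```

At tau = 0 the sketch returns A = 0 and B = beta, which is the terminal condition implied by (7) and (8).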
2.2 Examples Of the following three examples the first two concern pricing under full information and so the underlying probability measure P has to be seen as a pricing (martingale) measure. The third one concerns survival probabilities and there P represents then the historical/real world probability measure. The basic quantities in these three examples may be considered as building blocks for more important credit risky products.
2.2.1 Example 1. Defaultable zero-coupon bond on firm j with maturity T and zero recovery

Using standard results for pricing defaultable claims in models with doubly-stochastic default times (see e.g. [7], Section 9.4.3) the price at time $t \le T$ of a zero recovery bond on firm j can be expressed as
$$p_j(t, T) = E\left\{ e^{-\int_t^T r(X_s)\,ds}\,(1 - Y_{T,j}) \mid \mathcal{G}_t \right\} = (1 - Y_{t,j})\, E_{X_t}\left\{ e^{-\int_t^T R_j(X_s)\,ds} \right\} := \Pi_1^j(X_t, Y_t) \tag{11}$$
where
$$R_j(X_t) := r(X_t) + \lambda_j(X_t). \tag{12}$$
It is thus a function $\Pi_1^j(X_t, Y_t)$ of the current values of the factor and the default indicator processes, parametrized by t, T, which for simplicity we drop from the notation. From the previous Section 2.1 it is easily seen that in the affine case the function $\Pi_1^j(X_t, Y_t)$ takes the following exponentially affine form
$$\Pi_1^j(X_t, Y_t) = (1 - Y_{t,j}) \exp\left[\alpha_j(t, T) - \beta_j(t, T)\,X_t\right] \tag{13}$$
where, for $X_t$ satisfying the CIR model (6), the coefficients in (13) are given by the formulae in (10) with $\beta_j(t, T)$ given by the expression for $B(t, T)$ there and $\alpha_j(t, T)$ by that for $A(t, T)$. Furthermore, for the present case the coefficients in the right hand side of (10) have to be chosen as follows (we may consider t, T as fixed): a, b, σ come from (6), $\beta = 0$, $\bar\lambda = \lambda_j + r$; $\gamma = \sqrt{b^2 + 2\sigma^2\bar\lambda}$.

2.2.2 Example 2. Recovery payment

Denote by $Z^j_{\tau_j} 1_{\{\tau_j \le T\}}$ the recovery payment at the time $\tau_j$ of default of the j-th firm, where $Z^j_t$ is an $\mathcal{F}^X_t$-adapted process. It is well-known that the value in t of the recovery payment is given by
$$(1 - Y_{t,j})\, E\left\{ e^{-\int_t^{\tau_j} r(X_s)\,ds}\, Z^j_{\tau_j} 1_{\{\tau_j \le T\}} \mid \mathcal{G}_t \right\} = (1 - Y_{t,j})\, E_{X_t}\left\{ \int_t^T Z^j_s\, \lambda_j(X_s)\, e^{-\int_t^s R_j(X_u)\,du}\,ds \right\} := \Pi_2^j(X_t, Y_t); \tag{14}$$
see again [7] for a proof. From these building blocks the price of many credit derivatives is obtained in a straightforward manner. For instance, the price of a zero-coupon
bond with recovery is simply given by the sum of the price without recovery and the value of the recovery payment, and also spreads of credit default swaps are easily computed. In the affine case also $\Pi_2^j(X_t, Y_t)$ can be given a more explicit form that is partly of the exponentially affine type. We do not discuss this in detail here, referring the reader to [7], Section 9.5.3 or directly to the original paper [3].

2.2.3 Example 3. Survival probabilities

As already mentioned, in this example the underlying probability P is the historical/real world probability measure. We want in fact to compute the probability, given our information, that firm j does not default prior to a given time T. A similar argument as in the derivation of (11) immediately gives
$$P\left(\tau_j > T \mid \mathcal{G}_t\right) = (1 - Y_{t,j})\, E_{X_t}\left\{ \exp\left(-\int_t^T \lambda_j(X_s)\,ds\right) \right\} := \Pi_3^j(X_t, Y_t). \tag{15}$$
Notice that the expression of $\Pi_3^j(X_t, Y_t)$ is completely analogous to that of $\Pi_1^j(X_t, Y_t)$ in (11) so that in the affine case it can be given an expression of the exponentially affine form like $\Pi_1^j(X_t, Y_t)$ in (13).

3. Incomplete Information (The Filtering Problem)

In the examples of the previous section we have seen that, under full information also of the factor process $X_t$, the values of the basic quantities of interest can be expressed as an explicit function $\Pi(X_t, Y_t)$ of the current values of the factor and the default indicator processes. If agents do not have access to the full information represented by the filtration $\mathcal{G}_t$, but only to that corresponding to $\mathcal{H}_t$, i.e. the information represented by the default history, then, based on iterated conditional expectations, it appears natural to consider as corresponding values for the basic quantities the following
$$\Pi_t(Y_t) := E\left\{ \Pi(X_t, Y_t) \mid \mathcal{H}_t \right\} \tag{16}$$
where the expectation is under the measure P that is a pricing (martingale) measure in the case of the first two examples and the historical measure in the third. For the third example there is no problem with the definition (16), but in the case of the pricing examples one has to make sure that (16) leads to arbitrage-free prices. To this effect we can state the following simple lemma.

Lemma 2. Assume the short rate is $\mathcal{H}_t$-adapted, in particular deterministic. Then, taking as numeraire the money-market account $B_t := \exp\left(\int_0^t r_s\,ds\right)$, formula (16) leads to arbitrage-free prices in the sense that $B_t^{-1}\Pi_t(Y_t)$ is a $(P, \mathcal{H}_t)$-martingale.
Proof: Let $s \le t$; then
$$E\left\{ \frac{\Pi_t(Y_t)}{B_t} \mid \mathcal{H}_s \right\} = E\left\{ E\left\{ \frac{\Pi(X_t, Y_t)}{B_t} \mid \mathcal{H}_t \right\} \mid \mathcal{H}_s \right\} = E\left\{ \frac{\Pi(X_t, Y_t)}{B_t} \mid \mathcal{H}_s \right\} = E\left\{ E\left\{ \frac{\Pi(X_t, Y_t)}{B_t} \mid \mathcal{G}_s \right\} \mid \mathcal{H}_s \right\} = E\left\{ \frac{\Pi(X_s, Y_s)}{B_s} \mid \mathcal{H}_s \right\} = \frac{\Pi_s(Y_s)}{B_s}$$
where we have used the fact that P is a pricing (martingale) measure for the numeraire $B_t$ in the sense that $B_t^{-1}\Pi(X_t, Y_t)$ is a P-martingale in the full filtration $\mathcal{G}_t$.

3.1 The filtering problem

We have seen that the problem of computing values of risk-sensitive products under incomplete information about the factor process amounts to that of computing conditional expectations as in (16). Since $Y_t \in \mathcal{H}_t$, in what follows we shall for simplicity drop the dependence on $Y_t$ so that the right hand side of (16) becomes of the form $E\{ f(X_t) \mid \mathcal{H}_t \}$ where $f(\cdot)$ is a generic (bounded) function of the factor process. Denoting by $\pi_t(dx)$ the conditional distribution of $X_t$ given $\mathcal{H}_t$, (16) leads then to the problem of computing
$$\pi_t(f) := E\left\{ f(X_t) \mid \mathcal{H}_t \right\} = \int f(x)\,\pi_t(dx) \tag{17}$$
which is a nonlinear filtering problem of a diffusion process, given point process observations.

3.2 General solution of the filtering problem

Let us first introduce the global default intensity
$$\bar\lambda(X_t, Y_t) := \sum_{j=1}^m (1 - Y_{t,j})\,\lambda_j(X_t) \tag{18}$$
which is the sum of the default intensities of the still surviving firms. Note that, for $T_n \le t < T_{n+1}$, $\bar\lambda(X_t, Y_t)$ is the intensity of $T_{n+1}$. In deriving our filtering results we distinguish between the case between default times and the case at a default time.

3.2.1 Filter between defaults

Let $T_n$ be the generic n-th default time and let $t \in [T_n, T_{n+1})$. It follows from the general filtering equations (Kushner-Stratonovich equations) for point process observations by the so-called innovations method (see [5], [2]) that for $\pi_t(f)$ as defined in (17) one has
$$\pi_t(f) = \pi_{T_n}(f) + \int_{T_n}^t \left[ \pi_s(Lf) - \pi_s(\bar\lambda f) + \pi_s(\bar\lambda)\,\pi_s(f) \right] ds \tag{19}$$
where L is the generator that corresponds to the diffusion process $X_t$. Furthermore, noticing that for $t \in [T_n, T_{n+1})$ one has $Y_t = Y_{T_n}$, whenever $t \in [T_n, T_{n+1})$ we shall consider $\bar\lambda$ as a function of x alone, i.e.
$$\bar\lambda(x) = \bar\lambda(x, Y_{T_n}) \tag{20}$$
with $\bar\lambda(x, y)$ as in (18).

Proposition 3. Assume the conditions for the uniqueness of the solution of (19) as described e.g. in Appendix 2 of [2] hold. The solution to (19) is then, for $t \in [T_n, T_{n+1})$ and an integrable (bounded) f, given by
$$\pi_t(f) = \frac{\rho_t(f)}{\rho_t(1)} \tag{21}$$
where $\rho_t(f)$, the unnormalized conditional expectation of f, can be obtained as
$$\rho_t(f) = \int \psi_t(T_n, x)(f)\,\pi_{T_n}(dx) \tag{22}$$
with
$$\psi_t(T_n, x)(f) := E_{T_n, x}\left\{ f(t, X_t)\, e^{-\int_{T_n}^t \bar\lambda(X_s)\,ds} \right\} \tag{23}$$
and $\bar\lambda(X_t)$ according to (20).

Remark 4. This proposition shows that, between defaults, the filter evolves deterministically and its evolution is determined by the Markovian semigroup $\psi_t(T_n, x)(f)$ in (23).

Proof: Although the proof can be obtained from that of Proposition 3.4 in [2], we present here a direct derivation. By Ito's formula we have
$$f(t, X_t)\, e^{-\int_{T_n}^t \bar\lambda(X_s)\,ds} - f(T_n, X_{T_n}) = \int_{T_n}^t e^{-\int_{T_n}^s \bar\lambda(X_u)\,du} \left[ (Lf)(s, X_s) - \bar\lambda(X_s) f(s, X_s) \right] ds + \int_{T_n}^t e^{-\int_{T_n}^s \bar\lambda(X_u)\,du}\, \sigma\sqrt{X_s}\, \frac{\partial f}{\partial x}(s, X_s)\,dw_s.$$
Assuming the conditions for applying Fubini's theorem are satisfied, let us take on the left and right hand sides the expectation conditional on $X_{T_n} = x$
thus obtaining
$$E_{T_n, x}\left\{ f(t, X_t)\, e^{-\int_{T_n}^t \bar\lambda(X_s)\,ds} \right\} - f(T_n, x) = \int_{T_n}^t E_{T_n, x}\left\{ e^{-\int_{T_n}^s \bar\lambda(X_u)\,du} \left[ (Lf)(s, X_s) - \bar\lambda(X_s) f(s, X_s) \right] \right\} ds.$$
Using the definition of $\psi_t$ in (23) this last relation can be rewritten as
$$\psi_t(T_n, x)(f) - \psi_{T_n}(T_n, x)(f) = \int_{T_n}^t \left[ \psi_s(T_n, x)(Lf) - \psi_s(T_n, x)(\bar\lambda f) \right] ds.$$
Integrating with respect to $\pi_{T_n}(dx)$, taking into account (22) and applying once more Fubini, one arrives at
$$\rho_t(f) - \rho_{T_n}(f) = \int_{T_n}^t \left[ \rho_s(Lf) - \rho_s(\bar\lambda f) \right] ds.$$
From here one obtains then immediately
$$d\pi_t(f) = d\left( \frac{\rho_t(f)}{\rho_t(1)} \right) = \left[ \pi_t(Lf) - \pi_t(\bar\lambda f) + \pi_t(f)\,\pi_t(\bar\lambda) \right] dt$$
from which the result follows by the assumed uniqueness of the solution of (19).

3.2.2 Filter at a default

Consider now a generic default time $T_n$. Again from the general filtering equations of the innovations approach ([5], [2]) one has
$$\pi_{T_n}(f) = \frac{\pi_{T_n^-}(\lambda_{\xi_n} f)}{\pi_{T_n^-}(\lambda_{\xi_n})}, \tag{24}$$
where we implicitly use that $\pi_{T_n^-}(\lambda_{\xi_n}) > 0$ a.s. The expression $\pi_{T_n^-}(f)$ denotes here the left hand limit of $\pi_t(f)$ at $T_n$, which exists by (19).

Concluding this Section 3.2 one sees that the crucial point to obtain a solution of the filtering equations is the possibility of explicitly computing the semigroup $\psi_t$ in (23), and in the next Section 4 we shall address this issue in the case of an affine model.

4. Filtering in Affine Models

As mentioned at the end of the previous section, in this section we shall derive an explicit representation of the semigroup $\psi_t$ in (23) for affine models and this will then lead to an explicit solution of the filtering problem.
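Recall that for an affine model we have postulated an affine dynamics for $X_t$ that we take here as given by the CIR model (6). Furthermore, as in (5), we shall assume $\lambda_j(X_t) = \lambda_j X_t$, which by (18) and (20) then implies that also $\bar\lambda(X_t, Y_t) = \bar\lambda(Y_t) \cdot X_t$; for the short rate we assume that it is constant, i.e. $r(X_t) \equiv r$. Finally, we shall assume $f(t, x)$ to be exponentially affine, analogously to $F(t, x)$ in (8), namely of the form
$$f(t, x) = \exp\left[\alpha(t, T) - \beta(t, T)\,x\right] \tag{25}$$
and its specific form depends on the specific problem at hand (see the examples in Section 2.2). Notice next that with $f(t, x)$ of the form (25) the semigroup $\psi_t$ becomes
$$\psi_t(T_n, x)(f) = E_{T_n, x}\left\{ e^{\alpha(t,T) - \beta(t,T) X_t}\, e^{-\bar\lambda \int_{T_n}^t X_s\,ds} \right\} = e^{\alpha(t,T)}\, E_{T_n, x}\left\{ e^{-\beta(t,T) X_t}\, e^{-\bar\lambda \int_{T_n}^t X_s\,ds} \right\} \tag{26}$$
where we have simply written $\bar\lambda$ for $\bar\lambda(Y_t)$ since $Y_t = Y_{T_n}$. The crucial quantity becomes therefore the second factor in the rightmost expression in (26). We have now the following proposition, the proof of which follows immediately from Proposition 1.

Proposition 5. Under the assumptions of this section one has
$$E_{T_n, x}\left\{ e^{-\beta X_t}\, e^{-\bar\lambda \int_{T_n}^t X_s\,ds} \right\} = \exp\left[A(T_n, t; \beta) - B(T_n, t; \beta)\,x\right] \tag{27}$$
where, in accordance with (10) for the pair $(T_n, t)$ in place of $(t, T)$,
$$B(T_n,t;\beta) = \frac{\beta\left[\gamma + b + e^{\gamma(t-T_n)}(\gamma - b)\right] + 2\bar\lambda\left(e^{\gamma(t-T_n)} - 1\right)}{\beta\sigma^2\left(e^{\gamma(t-T_n)} - 1\right) + \gamma - b + e^{\gamma(t-T_n)}(\gamma + b)}, \qquad A(T_n,t;\beta) = \frac{2a}{\sigma^2}\log\left(\frac{2\gamma\, e^{(t-T_n)(\gamma+b)/2}}{\beta\sigma^2\left(e^{\gamma(t-T_n)} - 1\right) + \gamma - b + e^{\gamma(t-T_n)}(\gamma + b)}\right) \tag{28}$$
with $\gamma = \sqrt{b^2 + 2\sigma^2\bar\lambda}$.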
4.1 Filter between defaults Combining Propositions 3 and 5 and noticing that the constant 1 can be expressed as the function f (t, x) in (25) for α(t, T) = β(t, T) = 0, one immediately obtains
Proposition 6. The filter $\pi_t(f)$ is, for $t \in [T_n, T_{n+1})$, given by
$$\pi_t(f) = \eta(T_n; t, T)\, \frac{\int e^{-B(T_n, t; \beta(t,T))\,x}\,\pi_{T_n}(dx)}{\int e^{-B(T_n, t; 0)\,x}\,\pi_{T_n}(dx)} \tag{29}$$
where, by Propositions 3 and 5,
$$\eta(T_n; t, T) = \exp\left[\alpha(t, T) + A(T_n, t; \beta(t, T)) - A(T_n, t; 0)\right]. \tag{30}$$

Remark 7. It follows from (28) that $B(T_n, t; \beta)$ is, for $t \ge T_n$, nonnegative whenever $\beta \ge \bar\beta(t)$ where
$$\bar\beta(t) = -\min\left[ \frac{2\bar\lambda\left(e^{\gamma(t-T_n)} - 1\right)}{\gamma + b + e^{\gamma(t-T_n)}(\gamma - b)},\ \frac{\gamma - b + e^{\gamma(t-T_n)}(\gamma + b)}{\sigma^2\left(e^{\gamma(t-T_n)} - 1\right)} \right]. \tag{31}$$
Thus $\bar\beta(t)$ is strictly negative for $t > T_n$ and it is equal to zero for $t = T_n$. From the examples in Section 2.2 it follows that the value of $\beta(t, T)$ to be used for the parameter β in the numerator of the right hand side in (29) (as well as in (34) below) is strictly positive, while in the denominator of the same formulae it is equal to zero. Consequently the corresponding value of $B(T_n, t; \beta)$ is always nonnegative. In (34) below we shall also need the derivative of $B(T_n, t; \beta)$ with respect to β, evaluated at $\beta = \beta(T_{n+1}, T)$ and at $\beta = 0$. Since $T_{n+1} > T_n$ and thus $\bar\beta(T_{n+1}) < 0$, not only $\beta = \beta(T_{n+1}, T)$, but also $\beta = 0$ are interior points of the domain of positivity for $B(T_n, t; \beta)$ and in this domain it is easily seen from (28) that $B(T_n, t; \beta)$ is differentiable with respect to β. Recall now also that $X_t$ as given by (6) for $a \ge \sigma^2/2$ has its a-priori support in the positive half line and this implies that also all conditional distributions $\pi_t(dx)$ have their support in the positive half line. It follows that the two integrals in the right hand side of (29) are well defined and finite.

From the above Remark 7 we have that, for all values of interest for β, $B(T_n, t; \beta)$ is nonnegative. Given that $X_t > 0$ anyway, this leads us to define the moment generating function of the conditional distribution $\pi_t(dx)$ of $X_t$ by
$$\chi_t(\phi) := \pi_t\left(e^{-\phi X_t}\right), \qquad \phi > 0. \tag{32}$$
From Proposition 6 one then obtains immediately

Corollary 8. For $t \in [T_n, T_{n+1})$ and $f(t, x)$ as in (25) we have that the filter value $\pi_t(f)$ is given by
$$\pi_t(f) = \eta(T_n; t, T)\, \frac{\chi_{T_n}\left(B(T_n, t; \beta(t, T))\right)}{\chi_{T_n}\left(B(T_n, t; 0)\right)} \tag{33}$$
where $\eta(\cdot)$ is as in (30), $\chi_{T_n}(\cdot)$ as in (32) and $B(T_n, t; \beta)$ is given by (28).
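4.2 Filter at a default time

Based on the general formula (24) for the filter at a generic default time $T_n$, we can now show the following (since for the case between defaults we considered the interval $[T_n, T_{n+1})$, here we take $T_{n+1}$ to denote the generic default time).

Proposition 9. Assuming the conditions are fulfilled to differentiate under the integral sign, at the generic default time $T_{n+1}$ we have
$$\pi_{T_{n+1}}(f) = e^{\alpha(T_{n+1}, T)} \cdot \frac{\frac{\partial}{\partial\beta}\left\{ e^{A(T_n, T_{n+1}; \beta)}\, \chi_{T_n}\left(B(T_n, T_{n+1}; \beta)\right) \right\}\Big|_{\beta = \beta(T_{n+1}, T)}}{\frac{\partial}{\partial\beta}\left\{ e^{A(T_n, T_{n+1}; \beta)}\, \chi_{T_n}\left(B(T_n, T_{n+1}; \beta)\right) \right\}\Big|_{\beta = 0}} \tag{34}$$
where $A(T_n, T_{n+1}; \beta)$ and $B(T_n, T_{n+1}; \beta)$ are given by (28), $\beta(T_{n+1}, T)$ corresponds to the exponentially affine representation of $f(t, x)$ in (25) and $\chi_{T_n}(\cdot)$ is as defined in (32).

Proof: Starting from (24) for the default time $T_{n+1}$ and noticing that $\pi_{T_{n+1}^-}(f)$ is the limit, for $t \uparrow T_{n+1}$, of the filter $\pi_t(f)$ when $t \in [T_n, T_{n+1})$, using Proposition 3 with $f(t, x)$ as in (25) and with the semigroup $\psi_t(\cdot)$ as in (26) combined with Proposition 5, we obtain the following sequence of equalities, where we use differentiation under the integral sign in two instances
$$\pi_{T_{n+1}}(f) = \frac{\pi_{T_{n+1}^-}\left\{ \lambda_{\xi_{n+1}} \cdot X_{T_{n+1}}\, f(T_{n+1}, X_{T_{n+1}}) \right\}}{\pi_{T_{n+1}^-}\left\{ \lambda_{\xi_{n+1}} \cdot X_{T_{n+1}} \right\}} = e^{\alpha(T_{n+1}, T)}\, \frac{\int E_{T_n, x}\left\{ X_{T_{n+1}}\, e^{-\beta(T_{n+1}, T) X_{T_{n+1}}}\, e^{-\bar\lambda \int_{T_n}^{T_{n+1}} X_u\,du} \right\} \pi_{T_n}(dx)}{\int E_{T_n, x}\left\{ X_{T_{n+1}}\, e^{-\bar\lambda \int_{T_n}^{T_{n+1}} X_u\,du} \right\} \pi_{T_n}(dx)}$$
$$= e^{\alpha(T_{n+1}, T)}\, \frac{\int \frac{\partial}{\partial\beta}\, e^{A(T_n, T_{n+1}; \beta) - B(T_n, T_{n+1}; \beta)\,x}\Big|_{\beta = \beta(T_{n+1}, T)}\, \pi_{T_n}(dx)}{\int \frac{\partial}{\partial\beta}\, e^{A(T_n, T_{n+1}; \beta) - B(T_n, T_{n+1}; \beta)\,x}\Big|_{\beta = 0}\, \pi_{T_n}(dx)} = e^{\alpha(T_{n+1}, T)}\, \frac{\frac{\partial}{\partial\beta}\left\{ e^{A(T_n, T_{n+1}; \beta)}\, \chi_{T_n}\left(B(T_n, T_{n+1}; \beta)\right) \right\}\Big|_{\beta = \beta(T_{n+1}, T)}}{\frac{\partial}{\partial\beta}\left\{ e^{A(T_n, T_{n+1}; \beta)}\, \chi_{T_n}\left(B(T_n, T_{n+1}; \beta)\right) \right\}\Big|_{\beta = 0}}.$$
Here the factor $X_{T_{n+1}}$ has been written as $-\partial/\partial\beta$ applied to $e^{-\beta X_{T_{n+1}}}$; the resulting minus signs, as well as the normalizing constants of the filter, cancel in the ratio. Notice that, by the considerations in Remark 7, all the above quantities are well defined.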
Remark 10. Notice that in the filter update at $T_{n+1}$ only the information about the timing $T_{n+1}$ of the (n + 1)-st default is being used and not also that of the defaulting firm $\xi_{n+1}$; in fact, the factor $\lambda_{\xi_{n+1}}$, which contains this information, drops out due to the normalization. This happens however only because of our affine form of the dependence of the default intensity on the factor process and the fact that the latter is taken to be scalar. In any case, the information about $\xi_{n+1}$ is not lost as it becomes part of $Y_{T_{n+1}}$.

From Corollary 8 and Proposition 9 it now follows that the crucial quantity for computing the filter in the affine case is the moment generating function $\chi_t(\phi)$ in (32), which we need to compute only for the values of t corresponding to the default times and for any $\phi > 0$. Notice also that $\chi_t(\phi)$ is nothing but the filter $\pi_t(f)$ for $f(t, x)$ when this latter function has the exponentially affine form in (25) with $\alpha(t, T) = 0$ and $\beta(t, T) = \phi$. Combining these remarks with the results of Proposition 9 one obtains immediately

Corollary 11. Under the assumptions of Proposition 9 one has
$$\chi_{T_{n+1}}(\phi) = \frac{\frac{\partial}{\partial\beta}\left\{ e^{A(T_n, T_{n+1}; \beta)}\, \chi_{T_n}\left(B(T_n, T_{n+1}; \beta)\right) \right\}\Big|_{\beta = \phi}}{\frac{\partial}{\partial\beta}\left\{ e^{A(T_n, T_{n+1}; \beta)}\, \chi_{T_n}\left(B(T_n, T_{n+1}; \beta)\right) \right\}\Big|_{\beta = 0}}. \tag{35}$$
On the basis of the previous results we can now write down an algorithm to compute recursively the filter in the affine case, which we do in the next subsection.

4.3 Filter algorithm

i) Start at $T_0 = 0$ with a given $\chi_0(\phi)$.
ii) At the generic $T_{n+1}$ compute $\chi_{T_{n+1}}(\phi)$ from $\chi_{T_n}$ according to the recursion (35) (see Corollary 11).
iii) For $t \in [T_n, T_{n+1})$ and with $f(t, x) = \exp[\alpha(t, T) - \beta(t, T)x]$ the filter is then given by (see Corollary 8)
$$\pi_t(f) = \eta(T_n; t, T)\, \frac{\chi_{T_n}\left(B(T_n, t; \beta(t, T))\right)}{\chi_{T_n}\left(B(T_n, t; 0)\right)}$$
where $\eta(T_n; t, T)$ is given in (30) and $B(T_n, t; \beta)$ in (28).
iv) For $t = T_{n+1}$ the filter is (see Proposition 9 and Corollary 11)
$$\pi_{T_{n+1}}(f) = e^{\alpha(T_{n+1}, T)}\, \chi_{T_{n+1}}\left(\beta(T_{n+1}, T)\right).$$
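Steps ii) and iii) of the algorithm can be sketched numerically as follows. This is only an illustration under stated assumptions: the derivative with respect to β in the recursion (35) is replaced by a central finite difference (a substitution made here, not the analytic route of the paper), A and B are the coefficients of (10) evaluated over an interval of length `dt`, and the prefactor η is taken to be exp[α + A(β) - A(0)], which is what Propositions 3 and 5 imply. The names `shorthand`, `update_mgf` and `filter_between` are hypothetical.

```python
import math

def shorthand(dt, b, sigma, lam_bar):
    """Building blocks of A and B over an interval of length dt (cf. (10))."""
    gamma = math.sqrt(b * b + 2.0 * sigma**2 * lam_bar)
    e = math.exp(gamma * dt)
    R = gamma + b + e * (gamma - b)
    S = 2.0 * lam_bar * (e - 1.0)
    U = sigma**2 * (e - 1.0)
    V = gamma - b + e * (gamma + b)
    W = 2.0 * gamma * math.exp(dt * (gamma + b) / 2.0)
    return R, S, U, V, W

def update_mgf(chi_prev, dt, a, b, sigma, lam_bar, h=1e-5):
    """Step ii): map chi_{T_n} to chi_{T_{n+1}}, with dt = T_{n+1} - T_n."""
    R, S, U, V, W = shorthand(dt, b, sigma, lam_bar)
    k = 2.0 * a / sigma**2

    def G(beta):
        # exp(A(beta)) * chi_prev(B(beta)), where exp(A) = (W / (beta*U + V))**k
        return (W / (beta * U + V)) ** k * chi_prev((beta * R + S) / (beta * U + V))

    def dG(beta):
        # Finite-difference stand-in for the derivative with respect to beta.
        return (G(beta + h) - G(beta - h)) / (2.0 * h)

    d0 = dG(0.0)
    return lambda phi: dG(phi) / d0

def filter_between(chi_prev, alpha, beta, dt, a, b, sigma, lam_bar):
    """Step iii): filter value for f = exp(alpha - beta * x), dt = t - T_n."""
    R, S, U, V, W = shorthand(dt, b, sigma, lam_bar)
    k = 2.0 * a / sigma**2
    eta = math.exp(alpha) * (V / (beta * U + V)) ** k  # exp(alpha + A(beta) - A(0))
    return eta * chi_prev((beta * R + S) / (beta * U + V)) / chi_prev(S / V)

if __name__ == "__main__":
    a, b, sigma, lam_bar = 0.5, 0.8, 0.6, 0.3       # illustrative values only
    k = 2.0 * a / sigma**2
    chi0 = lambda phi: (1.0 + phi) ** (-k)          # Gamma(2a/sigma^2, 1) initial law
    chi1 = update_mgf(chi0, 0.7, a, b, sigma, lam_bar)
    print(chi1(0.5), filter_between(chi0, 0.0, 0.5, 0.3, a, b, sigma, lam_bar))
```

Nesting finite differences over many defaults loses accuracy quickly, so the sketch is meant for a handful of updates at most.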
Step ii) above is a recursive formula to compute the moment generating function χTn that then leads to an explicit filter solution according to Steps iii) and iv). Although recursive in nature, this Step ii) may become increasingly difficult to compute since the analytic expression for χTn (φ) might become more and more involved with every step. A natural question then arises to see whether, with a suitable choice of the initial χ0 (φ), the recursions remain at a level that allows for feasible computations. This will be the subject of the next Section 5, where we also show that, although a priori the filter is not finite-dimensional in the traditional sense, for a suitable choice of the initial distribution it can be parametrized by a finite number of sufficient statistics. Remark 12. We conclude this section by pointing out that our filter results may have a wider scope than only what we have described here by showing that the filter can be computed by computing the conditional moment generating function and that this can be done recursively according to (35). In fact, since this conditional moment generating function is related to the Laplace transform of the conditional (filter) distribution, one can, at least in theory, invert this conditional moment generating function thereby recovering the filter distribution itself. This would then allow to compute conditional expectations not only of exponentially affine functions of the factor process but also of any integrable function thereof. 5. Finite Dimensional Computation of the Filter In this section we exhibit a choice for χ0 (φ) that allows Step ii) in the filter algorithm of the previous subsection 4.3 to remain computable at every step. For this purpose recall that the recursions in Step ii) correspond to the recursive formula (35), where the coefficients A(T n , Tn+1 ; β), B(Tn, Tn+1 ; β) are given in (28). Introduce the shorthand notations
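$$R_n = \gamma + b + e^{\gamma(T_{n+1} - T_n)}(\gamma - b), \quad S_n = 2\bar\lambda\left(e^{\gamma(T_{n+1} - T_n)} - 1\right), \quad U_n = \sigma^2\left(e^{\gamma(T_{n+1} - T_n)} - 1\right), \quad V_n = \gamma - b + e^{\gamma(T_{n+1} - T_n)}(\gamma + b), \quad W_n = 2\gamma\, e^{(T_{n+1} - T_n)(\gamma + b)/2} \tag{36}$$
that, since $\gamma = \sqrt{b^2 + 2\sigma^2\bar\lambda}$, are all positive quantities. The coefficients $A(T_n, T_{n+1}; \beta)$ and $B(T_n, T_{n+1}; \beta)$ from (28) can then be given the following representation
$$A(T_n, T_{n+1}; \beta) = \frac{2a}{\sigma^2}\log\left(\frac{W_n}{\beta U_n + V_n}\right), \qquad B(T_n, T_{n+1}; \beta) = \frac{\beta R_n + S_n}{\beta U_n + V_n}. \tag{37}$$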
Choosing as distribution for the initial value $X_0$ of the factor process a specific Gamma-type distribution, we shall now prove the following theorem that gives an explicit computable representation for $\chi_t(\phi)$ at the various default times $t = T_n$.

Theorem 13. Let
$$\chi_0(\phi) = \left(\frac{1}{1 + \phi}\right)^{2a/\sigma^2} = (1 + \phi)^{-2a/\sigma^2}, \qquad \phi > 0 \tag{38}$$
which according to (32) corresponds to the moment generating function of a Gamma distribution for $X_0$ with parameters $\left(\frac{2a}{\sigma^2}, 1\right)$. Then
$$\chi_{T_n}(\phi) = c_n\, (\phi H_n + K_n)^{-\frac{2a}{\sigma^2} - n}\, p_n(\phi) \tag{39}$$
where $H_n$ and $K_n$ satisfy the recursions
$$H_n = R_n H_{n-1} + U_n K_{n-1},\ \ H_0 = 1; \qquad K_n = S_n H_{n-1} + V_n K_{n-1},\ \ K_0 = 1 \tag{40}$$
the coefficient $c_n$ is given by
$$c_n = \left[ K_n^{-\frac{2a}{\sigma^2} - n}\, p_n(0) \right]^{-1} \tag{41}$$
and $p_n(\phi)$ is a polynomial of degree n − 1 given by
$$p_n(\phi) = \begin{cases} 1 & \text{for } n = 0 \text{ and } n = 1 \\[4pt] \left(-\frac{2a}{\sigma^2} - n + 1\right) H_n (\phi U_n + V_n)\,\tilde p_n(\phi) + U_n (\phi H_n + K_n)\,\tilde p_n(\phi) + (\phi H_n + K_n)(\phi U_n + V_n)\,\frac{\partial}{\partial\phi}\tilde p_n(\phi) & \text{for } n \ge 2 \end{cases} \tag{42}$$
with
$$\tilde p_n(\phi) = (\phi U_n + V_n)^{n-2}\; p_{n-1}\left(\frac{\phi R_n + S_n}{\phi U_n + V_n}\right), \qquad n \ge 2. \tag{43}$$
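The finite-dimensional recursion of Theorem 13 can be sketched in code as follows, assuming NumPy is available. The pair (H_n, K_n) is propagated by (40) and the polynomial p_n by (42) and (43), stored as an ascending coefficient array. Two choices are assumptions made here for illustration: the n-th tuple (R_n, S_n, U_n, V_n) is computed from the n-th interarrival time together with the aggregate intensity coefficient in force on that interval, and the normalization c_n of (41) is applied implicitly by dividing through by the value at φ = 0. The names `shorthand` and `theorem13_mgf` are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial import polynomial as P

def shorthand(dt, b, sigma, lam_bar):
    """Quantities R, S, U, V, W of (36) for an interarrival time dt."""
    gamma = math.sqrt(b * b + 2.0 * sigma**2 * lam_bar)
    e = math.exp(gamma * dt)
    return (gamma + b + e * (gamma - b),
            2.0 * lam_bar * (e - 1.0),
            sigma**2 * (e - 1.0),
            gamma - b + e * (gamma + b),
            2.0 * gamma * math.exp(dt * (gamma + b) / 2.0))

def theorem13_mgf(dts, lam_bars, a, b, sigma):
    """Return chi_{T_n} as a function of phi, n = len(dts), for the initial law (38)."""
    k = 2.0 * a / sigma**2
    H, K = 1.0, 1.0
    p = np.array([1.0])                       # p_0 = p_1 = 1
    for n, (dt, lam) in enumerate(zip(dts, lam_bars), start=1):
        R, S, U, V, _ = shorthand(dt, b, sigma, lam)
        H, K = R * H + U * K, S * H + V * K   # recursion (40), old values on the right
        if n >= 2:
            lin_RS = np.array([S, R])         # phi*R + S
            lin_UV = np.array([V, U])         # phi*U + V
            lin_HK = np.array([K, H])         # phi*H + K
            # (43): p_tilde = (phi*U + V)^(n-2) * p_{n-1}((phi*R + S)/(phi*U + V))
            p_tilde = np.zeros(1)
            for i, c in enumerate(p):
                term = c * P.polymul(P.polypow(lin_RS, i), P.polypow(lin_UV, n - 2 - i))
                p_tilde = P.polyadd(p_tilde, term)
            # (42): three terms from differentiating the product in (47)
            t1 = (-k - n + 1.0) * H * P.polymul(lin_UV, p_tilde)
            t2 = U * P.polymul(lin_HK, p_tilde)
            t3 = P.polymul(P.polymul(lin_HK, lin_UV), P.polyder(p_tilde))
            p = P.polyadd(t1, P.polyadd(t2, t3))
    n_defaults = len(dts)

    def chi(phi):
        # (39) with c_n from (41), so that chi(0) = 1.
        scale = ((phi * H + K) / K) ** (-k - n_defaults)
        return scale * P.polyval(phi, p) / P.polyval(0.0, p)

    return chi

if __name__ == "__main__":
    chi2 = theorem13_mgf([0.7, 0.4], [0.3, 0.2], a=0.5, b=0.8, sigma=0.6)
    print(chi2(0.5))
```

Since H_n and K_n grow roughly like products of exponentials of the interarrival times, a long default history would call for rescaling the pair (H_n, K_n) at each step; the ratio form used in `chi` is unaffected by a common scale factor.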
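Proof: The statement is clearly true for n = 0. We show it first for n = 1 and then inductively for all n ≥ 2.

i) the case n = 1: by (35), (36), (37) and the recursions in (40) we have
$$\chi_{T_1}(\phi) = \frac{\frac{\partial}{\partial\beta}\left\{ \left(\frac{W_1}{\beta U_1 + V_1}\right)^{\frac{2a}{\sigma^2}} \chi_{T_0}\left(\frac{\beta R_1 + S_1}{\beta U_1 + V_1}\right) \right\}\Big|_{\beta = \phi}}{\frac{\partial}{\partial\beta}\left\{ \left(\frac{W_1}{\beta U_1 + V_1}\right)^{\frac{2a}{\sigma^2}} \chi_{T_0}\left(\frac{\beta R_1 + S_1}{\beta U_1 + V_1}\right) \right\}\Big|_{\beta = 0}} = \frac{(\phi H_1 + K_1)^{-\frac{2a}{\sigma^2} - 1}}{K_1^{-\frac{2a}{\sigma^2} - 1}} \tag{44}$$
which indeed corresponds to (39) with (41) and (42).

ii) the general case n ≥ 2: assume (39) holds for n − 1. Then, always by (35), (36), (37) and (40) we obtain
$$\chi_{T_n}(\phi) = \frac{\frac{\partial}{\partial\beta}\left\{ \left(\frac{W_n}{\beta U_n + V_n}\right)^{\frac{2a}{\sigma^2}} \chi_{T_{n-1}}\left(\frac{\beta R_n + S_n}{\beta U_n + V_n}\right) \right\}\Big|_{\beta = \phi}}{\frac{\partial}{\partial\beta}\left\{ \left(\frac{W_n}{\beta U_n + V_n}\right)^{\frac{2a}{\sigma^2}} \chi_{T_{n-1}}\left(\frac{\beta R_n + S_n}{\beta U_n + V_n}\right) \right\}\Big|_{\beta = 0}} = \frac{\frac{\partial}{\partial\beta}\left\{ \left(\frac{1}{\beta U_n + V_n}\right)^{\frac{2a}{\sigma^2}} \left(\frac{\beta H_n + K_n}{\beta U_n + V_n}\right)^{-\frac{2a}{\sigma^2} - n + 1} p_{n-1}\left(\frac{\beta R_n + S_n}{\beta U_n + V_n}\right) \right\}\Big|_{\beta = \phi}}{\frac{\partial}{\partial\beta}\left\{ \left(\frac{1}{\beta U_n + V_n}\right)^{\frac{2a}{\sigma^2}} \left(\frac{\beta H_n + K_n}{\beta U_n + V_n}\right)^{-\frac{2a}{\sigma^2} - n + 1} p_{n-1}\left(\frac{\beta R_n + S_n}{\beta U_n + V_n}\right) \right\}\Big|_{\beta = 0}}. \tag{45}$$
Taking into account that, by (43),
$$p_{n-1}\left(\frac{\beta R_n + S_n}{\beta U_n + V_n}\right) = (\beta U_n + V_n)^{2-n}\, \tilde p_n(\beta) \tag{46}$$
the numerator in the rightmost expression of (45) becomes
$$\frac{\partial}{\partial\beta}\left\{ (\beta U_n + V_n)(\beta H_n + K_n)^{-\frac{2a}{\sigma^2} - n + 1}\, \tilde p_n(\beta) \right\}\Big|_{\beta = \phi} = (\phi H_n + K_n)^{-\frac{2a}{\sigma^2} - n}\, p_n(\phi) \tag{47}$$
where we have used the definition of $p_n(\phi)$ in (42). Returning to (45) and recalling that the denominator in the rightmost expression of (45) is the same as the numerator except for putting β = 0, one finally obtains
$$\chi_{T_n}(\phi) = \frac{(\phi H_n + K_n)^{-\frac{2a}{\sigma^2} - n}\, p_n(\phi)}{K_n^{-\frac{2a}{\sigma^2} - n}\, p_n(0)} = c_n\, (\phi H_n + K_n)^{-\frac{2a}{\sigma^2} - n}\, p_n(\phi). \tag{48}$$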
Remark 14. It follows from Theorem 13 that, for a choice of the initial distribution corresponding to $\chi_0(\phi)$ in (38), the sequence $\chi_{T_n}(\phi)$ and therefore (see Steps iii) and iv) in Section 4.3) the entire filter is parameterized by a same finite number of sufficient statistics, namely the pairs $(H_n, K_n)$ and the polynomial functions $p_n(\phi)$, all of which can be computed recursively on the basis of the functions $R_n$, $S_n$, $U_n$, $V_n$ of the interarrival times of the defaults.

References

1. Bielecki, T. R., M. Jeanblanc, & M. Rutkowski (2004). Modeling and Valuation of Credit Risk. In: Stochastic Methods in Finance (M. Frittelli and W. Runggaldier, eds), Lecture Notes in Mathematics. Vol 1856, 27–126. Springer Verlag.
2. Ceci, C., & A. Gerardi (2006). A Model for High Frequency Data under Partial Information: a Filtering Approach. International Journal of Theoretical and Applied Finance, 1–22.
3. Duffie, D., & N. Gârleanu (2001). Risk and Valuation of Collateralized Debt Obligations. Financial Analysts Journal. 57, 41–59.
4. Frey, R., & W. J. Runggaldier (2006). Credit Risk and Incomplete Information: a Nonlinear Filtering Approach. Preprint.
5. Kliemann, W. H., G. Koch, & F. Marchetti (1990). On the Unnormalized Solution of the Filtering Problem with Counting Process Observations. IEEE Transactions on Information Theory. 36, 1415–1425.
6. Lamberton, D., & B. Lapeyre (1995). An Introduction to Stochastic Calculus Applied to Finance. Chapman and Hall.
7. McNeil, A. J., R. Frey, & P. Embrechts (2005). Quantitative Risk Management. Princeton University Press.
8. Schönbucher, P. (2004). Information-driven Default Contagion. Preprint, Department of Mathematics, ETH Zürich.
Smooth Rough Paths and the Applications

Keisuke Hara¹* and Terry Lyons²

¹ Department of Mathematical Sciences, Ritsumeikan University, Japan
² Mathematical Institute, University of Oxford, UK
Key words: p-variation, rough path
1. Introduction This article is based on a joint work submitted to a journal (K. Hara and T. Lyons [1]), which was mainly studied in the first author’s academic year 2004–2005 in Oxford. In this article, we will show the idea, the main results, and the sketch of the proofs. We will also show the related problems and some examples, which are not included in our paper mentioned above.
First of all, we explain the essence of our idea: smooth rough paths. A good starting point is to recall the well known definition of p-variation. Let $p \ge 1$ be a real number and $F : I \to \mathbb{R}^n$ be a continuous function on an interval I in $\mathbb{R}$. Then we define the p-variation (norm) of F by
$$\|F\|_{p\text{-var}, I} = \left( \sup_{\{t_j\}} \sum_j \left| F(t_{j+1}) - F(t_j) \right|^p \right)^{1/p}$$
where the sup runs over all finite dissections $\{t_j\}$ of the interval I. If the interval I is finite, this definition is of interest only for non-smooth paths like Brownian paths, because smooth paths satisfy a trivial estimate in terms of the derivative, of the form $|F(t_{j+1}) - F(t_j)| \le C\,|F'(t_j)|\,(t_{j+1} - t_j)$. However, what happens if I is the whole real line $\mathbb{R}$? Now the differences $|F(t_{j+1}) - F(t_j)|$ can easily sum up to infinity even if the path is smooth. Therefore, we can ask when smooth paths have finite p-variation. In other words, we can study how smooth paths oscillate globally through their p-variations.
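For a path known only at finitely many sample points, the supremum over dissections restricted to those points can be computed exactly by dynamic programming. The sketch below does this for a scalar path and returns the value with the 1/p root, following the standard convention for the p-variation norm. The function name `p_variation` is hypothetical, and the quadratic running time is accepted for simplicity.

```python
def p_variation(values, p):
    """p-variation norm of a scalar path sampled at the points in `values`.

    best[j] holds the largest sum of |increments|^p over all dissections
    (subsets of the sample points) that start at index 0 and end at index j.
    """
    n = len(values)
    if n < 2:
        return 0.0
    best = [0.0] * n
    for j in range(1, n):
        best[j] = max(best[i] + abs(values[j] - values[i]) ** p for i in range(j))
    return best[-1] ** (1.0 / p)

if __name__ == "__main__":
    # For p > 1 the finest dissection is not always optimal:
    # on the monotone path 0, 1, 2 the single jump 0 -> 2 gives 4 > 1 + 1.
    print(p_variation([0.0, 1.0, 2.0], 2))
    print(p_variation([0.0, 1.0, 0.0, 1.0], 1))
```

For p = 1 the result is the total variation of the sampled path, while for p > 1 the optimal dissection skips intermediate points on monotone stretches.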
* Partially supported by ACCESS Co. Ltd.
The concept of smooth rough paths is a generalization of this idea. More precisely, we ask not only about the p-variations of the paths themselves but also about the variations of the iterated integrals of the paths in the framework of rough path theory. By this procedure, we get more information on how they oscillate globally and especially on how they behave at infinity. For example, we can apply the general framework of rough path theory to study the differential equations driven by the path if we establish the rough path property of the smooth path. In the following section, we give the simple definitions that we need. The main results and the sketch of the proofs are given in Section 3. Sections 4 and 5 are devoted to the applications to Fourier analysis and the related problems.

2. Definitions

We prepare the basic concepts of rough path theory. Though rough path theory has a very general framework, most definitions here become much simpler than in general rough path theory because we assume that the paths are smooth. Let $F(t) : I \to \mathbb{R}^n$ be a continuous function defined on a closed interval I = [a, b] (where a or b may be infinite, +∞ or −∞). We are interested in the oscillation of the iterated integrals:
$$F^i_{u,v} = \int_{u < t_1 < \cdots < t_i < v} dF_{t_1} \otimes \cdots \otimes dF_{t_i}, \qquad i = 1, 2, \ldots$$
It is a delicate issue in general rough path theory how to define such iterated integrals, because the main area of application of rough path theory is heavily nonsmooth functions. However, in this article, there is no problem because we suppose that the functions are smooth. Therefore all the integrals are well-defined in the usual sense and satisfy an important algebraic relation, i.e., Chen's identity. (For details, see Lyons-Qian [3], pp.30. In this article, we do not use this relation although it essentially works in the background.) We denote each tensor norm in $\mathbb{R}^n \otimes \cdots \otimes \mathbb{R}^n$ simply by $|F^i_{u,v}|$. As a special case, we have $F^1_{u,v} = F(v) - F(u)$ and we measure the norm with the Euclidean norm. First we need the definition of a control function.

Definition 2.1. (control functions) Let $\Delta_I$ denote the simplex $\{(u, v) : u \le v,\ u, v \in I\}$. We call a bounded continuous function $\omega(u, v) : \Delta_I \to [0, \infty)$ a control function if $\omega(u, u) = 0$ for all $u \in I$ and if it is super-additive, i.e., $\omega(u, v) + \omega(v, w) \le \omega(u, w)$ for all $u \le v \le w$ ($u, v, w \in I$).
Next we define our main concept.

Definition 2.2. (r-rough paths) Let $r \ge 1$ be a constant and denote its integer part by [r]. We call F(t) a (smooth) r-rough path (or simply, a rough path) if there exists a control function $\omega(u, v) : I \times I \to [0, \infty)$ such that
$$|F^i_{u,v}| \le \omega(u, v)^{i/r} \tag{1}$$
for any i = 1, . . . , [r] and any $u \le v \in I$.

Note that the first level estimate, i.e., $|F(v) - F(u)| \le \omega(u, v)^{1/r}$, means that F has finite r-variation on I, because we can sum up the oscillation for any dissection of I by the super-additivity:
$$\sum_n |F(u_{n+1}) - F(u_n)|^r \le \sum_n \omega(u_n, u_{n+1}) \le \omega(a, b) < \infty.$$
In rough path theory, we are interested also in the estimates for the oscillation of the higher iterated integrals. With these estimates, we can define the integral for such a rough path and consider the differential equation driven by the rough path (see Lyons [2] and Lyons-Qian [3]). Here we repeat the remark that smooth paths on $\mathbb{R}$ are not necessarily rough paths, because we cannot always control the oscillations in the global sense. Therefore, the problem is when smooth paths are rough paths.

3. Main Results

In the last section, we introduced the concept of smooth rough paths. Roughly speaking, we call a function a rough path if we can control the oscillation of the path and of its iterated integrals. In this section, we give an answer to the question raised in the last section, that is, when smooth paths are rough paths. More precisely, we give an integrability condition which assures that smooth paths on $\mathbb{R}$ are rough paths. It is natural that we need the integrability of not only the paths themselves but also their derivatives, because we want to control their (global) oscillation. For example, we naturally expect that a heavily winding path (whose derivative should be very large) cannot be a rough path. First we must show the first level estimate:
$$|F(v) - F(u)|^r \le \omega(u, v) < C < \infty$$
for any $(u, v) \in \Delta_{\mathbb{R}}$ and some $r \ge 1$. (The positive constant C may be different at different places in this article.) We have the following theorem.

Theorem 3.1. Let $1 < p, q < \infty$. If $F \in L^p(\mathbb{R}^n)$ and $F' \in L^q(\mathbb{R}^n)$, then there exists a control function ω on $\Delta_{\mathbb{R}}$ such that
$$\left| \int_s^t dF(u) \right|^r \le \omega(s, t), \qquad \text{for } r = p\left(1 - \frac{1}{q}\right) + 1 \text{ and } s < t.$$
In particular, when [r] = 1, F is an [r]-rough path, i.e., a 1-rough path.

We show the proof here because this first level estimate is not difficult. It is enough to show the one dimensional case because the r-variation can be estimated by the sum of the variations of the coordinates. Suppose that a smooth function $G : \mathbb{R} \to \mathbb{R}$ satisfies the estimates $\|G\|_p < \infty$ and $\|G'\|_q < \infty$. Let us write the function G in terms of its positive part $G^+ := G \vee 0$ and its negative part $G^- := (-G) \vee 0$. Then, we have almost everywhere
$$(G^+)' = G' \ \text{ if } G > 0, \qquad (G^+)' = 0 \ \text{ if } G \le 0.$$
Since $G^+ \ge 0$ and $r \ge 1$, we have
$$\left| G^+(t) - G^+(s) \right|^r \le \left| G^+(t)^r - G^+(s)^r \right|.$$
Then we can estimate the right hand side in integral form as follows.
$$\left| G^+(t)^r - G^+(s)^r \right| = \left| \int_s^t d\left( G^+(u)^r \right) \right| = \left| \int_s^t r\, G^+(u)^{r-1} (G^+)'(u)\,du \right| \le r \int_s^t G^+(u)^{r-1} \left| (G^+)'(u) \right| du.$$
The negative part $G^-$ also has the same estimate. By Jensen's inequality,
$$|G(t) - G(s)|^r \le \left( |G^+(t) - G^+(s)| + |G^-(t) - G^-(s)| \right)^r \le 2^{r-1}\left( |G^+(t) - G^+(s)|^r + |G^-(t) - G^-(s)|^r \right),$$
so that
$$|G(t) - G(s)|^r \le r\, 2^{r-1} \int_s^t |G(u)|^{r-1}\, |G'(u)|\,du.$$
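The last inequality is easy to probe numerically. The sketch below compares both sides for an illustrative smooth, integrable function (a damped sine, chosen here only as an example), approximating the integral by a midpoint rule. The names `first_level_bound`, `G` and `dG` are hypothetical.

```python
import math

def first_level_bound(G, dG, s, t, r, n=20000):
    """Midpoint-rule value of r * 2^(r-1) * integral_s^t |G|^(r-1) |G'| du."""
    h = (t - s) / n
    total = 0.0
    for i in range(n):
        u = s + (i + 0.5) * h
        total += abs(G(u)) ** (r - 1.0) * abs(dG(u))
    return r * 2.0 ** (r - 1.0) * total * h

def G(u):
    # Illustrative smooth path with G and G' integrable to every power.
    return math.sin(3.0 * u) * math.exp(-u * u)

def dG(u):
    return (3.0 * math.cos(3.0 * u) - 2.0 * u * math.sin(3.0 * u)) * math.exp(-u * u)

if __name__ == "__main__":
    r = 2.5
    for s, t in [(-1.0, 2.0), (0.0, 0.5), (-3.0, 3.0)]:
        lhs = abs(G(t) - G(s)) ** r
        rhs = first_level_bound(G, dG, s, t, r)
        print(s, t, lhs, rhs, lhs <= rhs)
```

The right hand side is additive in the interval [s, t], which is what later allows it to be dominated by a control function.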
Now a simple application of Hölder's inequality gives that
$$\|G\|^r_{r\text{-var},[s,t]} = \sup \sum_{j=0}^{m-1} \left| G(u_{j+1}) - G(u_j) \right|^r \le r\, 2^{r-1} \int_s^t |G(u)|^{r-1}\, |G'(u)|\,du \le r\, 2^{r-1} \left( \int_s^t |G(u)|^{(r-1)\alpha}\,du \right)^{1/\alpha} \left( \int_s^t |G'(u)|^{\beta}\,du \right)^{1/\beta},$$
where $1/\alpha + 1/\beta = 1$. Therefore the path G has finite r-variation on $\mathbb{R}$ if $p = (r - 1)\alpha$ and $q = \beta$, because
$$\|G\|^r_{r\text{-var}} = \|G\|^r_{r\text{-var},\mathbb{R}} \le r\, 2^{r-1} \left( \int_{-\infty}^{\infty} |G|^p\,du \right)^{1/\alpha} \left( \int_{-\infty}^{\infty} |G'|^q\,du \right)^{1/\beta} = r\, 2^{r-1}\, \|G\|_p^{r-1}\, \|G'\|_q < \infty.$$
Note that $(r - 1)/p + 1/q = 1$. Setting
$$\omega(s, t) = r\, 2^{r-1} \left( \frac{r-1}{p}\, \|G\|^p_{p,[s,t]} + \frac{1}{q}\, \|G'\|^q_{q,[s,t]} \right),$$
ω is a control function for the r-variation of G by the elementary inequality:
$$x^{1/a}\, y^{1/b} \le \frac{x}{a} + \frac{y}{b}$$
for any $x, y \ge 0$ and $1/a + 1/b = 1$, $a > 1$. (We can also check that the estimate holds for $p = \infty$ or $q = \infty$, if we use the supremum norm $\|\cdot\|_\infty$. But the estimate is meaningless for $p = q = \infty$.) We have finished the proof.

Next we consider the second level estimate. We want to control the iterated integral $\iint_{s<u<v<t} dF_u\, dF_v$. However, the estimate does not generally hold, because the area can be infinite. We can see this situation in the following simple example. Consider the special case of $p = q > 2$ and the following function $H : \mathbb{R} \to \mathbb{R}^2$:
$$H(t) = (H_1(t), H_2(t)) = (R(t)\cos t,\ R(t)\sin t)$$
with a radius $R(t) : \mathbb{R} \to \mathbb{R}$. We can choose the radius R(t) such that R(t) and the derivative R'(t) are in $L^p$ for some $p > 2$, while R(t) is not in $L^2$. Then we can easily check that H satisfies our condition $\|H\|_p + \|H'\|_p < \infty$. However, the area explodes to infinity:
\[ \left| \int_{-\infty}^{\infty} (H_2\, dH_1 - H_1\, dH_2) \right| = \int_{-\infty}^{\infty} R(t)^2\, dt = \infty. \]
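To make the divergence concrete, here is a small numerical sketch. The radius `R(t) = (1 + t^2)^(-1/4)` is a hypothetical choice made only for illustration: it lies in $L^p$ for every p > 2 but not in $L^2$, and its derivative also lies in $L^p$ for p > 2. For this choice the truncated area over [−T, T] equals 2·asinh(T), which grows without bound.

```python
import math

def R(t):
    # Hypothetical radius: in L^p for every p > 2 but not in L^2.
    return (1.0 + t * t) ** (-0.25)

def dR(t):
    # Exact derivative of R.
    return -0.5 * t * (1.0 + t * t) ** (-1.25)

def area_integrand(t):
    # H2 * H1' - H1 * H2' for H = (R cos t, R sin t); equals -R(t)^2.
    h1, h2 = R(t) * math.cos(t), R(t) * math.sin(t)
    dh1 = dR(t) * math.cos(t) - R(t) * math.sin(t)
    dh2 = dR(t) * math.sin(t) + R(t) * math.cos(t)
    return h2 * dh1 - h1 * dh2

def truncated_area(T, n=100000):
    # Midpoint rule for |integral over [-T, T] of (H2 dH1 - H1 dH2)|.
    h = 2.0 * T / n
    return abs(h * sum(area_integrand(-T + (k + 0.5) * h) for k in range(n)))

areas = {T: truncated_area(T) for T in (10.0, 100.0, 1000.0)}
```

The values in `areas` increase with T and track 2·asinh(T), in line with the claim that the area cannot be controlled when R is not square integrable.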
So we cannot have a finite r-variational norm for any r. Therefore our only chance is the case $1/p + 1/q = 1$. In other words, we can expect 2-rough paths at most. We have the following theorem.

Theorem 3.2. Let $1/p + 1/q = 1$ and $p > 1$. If $F(u) \in L^p(\mathbb{R}^n)$ and $F' \in L^q(\mathbb{R}^n)$, then the estimate for the area
\[ \left| \iint_{s<u<v<t} dF(u) \otimes dF(v) \right| \le C\,\omega(s,t), \qquad (-\infty \le s < t \le \infty) \]
holds for a constant C and the same control function ω as in Theorem 3.1. Therefore, the function F is a 2-rough path.

The proof is much subtler than the previous one, so we only sketch it. It is enough to consider the two-dimensional case, because we can estimate the area in $\mathbb{R}^n$ by the sum of the areas projected onto each pair of coordinates. Suppose that a smooth path Z in $\mathbb{R}^2$ satisfies the condition $\|Z\|_p + \|Z'\|_q < \infty$. We want to estimate the off-diagonal part
\[ A_{st} = \frac{1}{2} \int_s^t (Z(v) - Z(s)) \times dZ(v), \]
which splits as
\[ A_{st} = \frac{1}{2} \int_s^t Z(v) \times dZ(v) - \frac{1}{2}\, Z(s) \times (Z(t) - Z(s)). \]
Now the first term of the right hand side is easy to handle as follows.
\[ \left| \int_s^t Z(v) \times dZ(v) \right| \le \int_s^t |Z(v)|\, |Z'(v)|\, dv \le \left( \int_s^t |Z(v)|^p\, dv \right)^{1/p} \left( \int_s^t |Z'(v)|^q\, dv \right)^{1/q} \le \omega(s,t). \]
The difficulty lies in the second part, though it seems easier. We divide the situation into two cases: either $|Z(s) \times (Z(t) - Z(s))| \le \theta\,\omega(s,t)$ or $|Z(s) \times (Z(t) - Z(s))| > \theta\,\omega(s,t)$, for some constant θ. The first case is no problem because it is already controlled by ω(s, t). For the second case, recall the first level control, i.e., for any $u \in [s,t]$ we have
\[ |Z(u) - Z(s)| \le C\,\omega(s,u)^{1/2} \le C\,\omega(s,t)^{1/2}. \]
Then the point Z(u) is relatively near to Z(s), and the path $\{Z(u) : s \le u \le t\}$ is well separated from the origin. This should ensure that the triangle area can be estimated by the control function. More precisely, we will show that if $|Z(u) - Z(s)| \le C\,\omega(s,t)^{1/2}$ and $|Z(s) \times (Z(t) - Z(s))| > \theta\,\omega(s,t)$, then
\[ |Z(s) \times dZ(v)| < C\, |Z(v) \times dZ(v)|. \]
In fact, we can show it by the following lemma.

Lemma 3.1. Suppose that α and β are vectors emanating from the origin O and that l is the line through α parallel to β. Let πγ be the projection of γ onto l through zero, that is to say,
\[ \pi\gamma = r\gamma, \qquad \beta \times (\alpha - r\gamma) = 0. \]
Now suppose that $\beta \times \alpha > \delta^2$, $\|\beta\| \le \delta/2$, and $\|\gamma - \alpha\| < \delta/2$. Then, for an infinitesimal increment dτ and $\gamma = \tau + d\tau$, we have
\[ |\gamma \times \tau| \ge C\, |\pi\gamma \times d\pi\tau|, \]
where C is a universal constant. The comparison also holds in the other direction.
We need only elementary geometry to get the lemma above (though it is a bit tricky). Once this lemma is shown, the following estimate
\[ |Z(s) \times (Z(t) - Z(s))| = \left| \int_s^t Z(s) \times dZ(v) \right| \le C \int_s^t |Z(v) \times Z'(v)|\, dv \]
is a direct consequence of integrating the infinitesimal estimate. Therefore we can estimate the original integral $A_{s,t}$ essentially only by the first one.
\[ g'(k) = c\, \hat{f}(k), \]
where c is a constant. If a function g and the derivative g' are in $L^p$ ($p \ge 1$), the path g has finite global p-variation by our first theorem. (Note that $r = p(1 - 1/p) + 1 = p$.) Therefore g(R) has a limit as $R \to \pm\infty$ (and of course the limit is zero). Since
\[ g(R) - g(R') = \int_{R'}^{R} g'(u)\, du \]
for any two constants $R' < R$, we have that
\[ \int_{R'}^{R} e^{ik \cdot 0} \hat{f}(k)\, dk = \int_{R'}^{R} \hat{f}(k)\, dk \]
has a limit as $R \to \infty$ and $R' \to -\infty$. So we can conclude that the inverse Fourier transform exists at x = 0. In the special case p = 2, g and g' are in $L^2$ if and only if f and f(x)/x are in $L^2$. Therefore in this case, we have a verifiable condition for the pointwise convergence of the inverse Fourier transform.

We want to extend this idea to nonlinear problems. For example, the following is a simple toy model.
In the case of p = q = 2, we have the second level rough path estimate by our second theorem. Therefore, we can prove the existence of the limit in the following nonlinear oscillating problem. (In the nonlinear case we need to control not only the variation of the path but also the area. See [2] and [3].) Let $\gamma : \mathbb{R} \to \mathbb{C}$ be a complex valued function that satisfies our condition $\gamma, \gamma' \in L^2$. Suppose that $\mathbb{C} \hookrightarrow \mathfrak{g}$, where $\mathfrak{g}$ is a Lie algebra of a Lie group G. Then, we can compute
\[ \Gamma_{R',R} = \overrightarrow{\exp} \int_{R'}^{R} d\gamma \]
(i.e., $(d/dR)\, \Gamma_{R',R} = \gamma_R$), and the limit exists as $R \to \infty$, $R' \to -\infty$. Therefore, if a complex valued function f on $\mathbb{R}$ satisfies $f, f(x)/x \in L^2$, we can conclude that the limit of $\overrightarrow{\exp} \int_{R'}^{R} \hat{f}$ exists, as in the linear case above.

This "nonlinear" version may seem artificial. However, it is a simple version of more serious problems, i.e., nonlinear Fourier transforms (see T. Tao and C. Thiele [4] for precise information on nonlinear Fourier theory). Actually we can get some rigorous conditions for the pointwise existence of nonlinear Fourier transforms according to our idea (Hara-Lyons [1]). But the goal, i.e., to get a reasonable and natural condition to define nonlinear Fourier transforms, still seems far away.

5. Related Problems

We leave more applications to our previous paper [1]. Instead we list related problems here. Though the following problems do not necessarily have a direct connection to nonlinear Fourier analysis or even smooth rough paths, they are more fundamental and seem interesting.

5.1 When is the rough path property preserved?

Let $X(t) : I \to \mathbb{R}^n$ be a rough path on an interval I and T be a transform of paths. Then, a general question is when the transformed path $TX(t) : I \to \mathbb{R}^n$ is again a rough path, or when T preserves the rough path property. For example, if T is a rotation on $\mathbb{R}^n$, the transformed path is trivially always a rough path. But, if we consider the rotation in the infinitesimal sense, the question becomes much subtler. More precisely, if $X(t) : \mathbb{R} \to \mathbb{C}\ (\simeq \mathbb{R}^2)$ is a smooth rough path and if T is the following transform of Fourier type:
\[ T : X(t) \mapsto TX(t) = \int^{t} e^{its}\, dX(s), \]
then is the path TX(t) a rough path? If not, how should we restrict the condition on the original rough path X(t)? We do not have a good answer.
5.2 Automatic rough paths

Recall the definition of a rough path. For example, if there exists a control ω such that
(i) $|X(t) - X(s)| < \omega(s,t)^{1/2}$ and
(ii) $\left| \iint_{s<u<v<t} dX_u\, dX_v \right| < \omega(s,t)$
for any s < t, we call the path X(t) a 2-rough path. Note that the ratio of 1/2 and 1 is proper for the length and the area. Therefore, it seems plausible that the condition (ii) is automatically satisfied if a path satisfies the condition (i). Of course, we cannot generally derive the condition (ii) from (i). For example, N. Victoir [5] constructed such an example in the infinite dimensional space. Then, how about paths in the Euclidean space? J. B. Garnett (UCLA) let us know the following simple example:
\[ f(z) = \sum_{n=0}^{\infty} 2^{-n/2} z^{2^n} \]
for $z \in \mathbb{C}$ such that $|z| = 1$. We can easily check that this f has finite (1/2)-Lipschitz norm and that the Dirichlet norm $\int_{|z| \le 1} |\nabla f|^2\, dz$ is infinite. Then, when can we automatically get the rough path property from only the first estimate (i)? (cf. the First Theorem of rough path theory; see Lyons [2].)

5.3 Probabilistic versions of the First Theorem

The First Theorem by Lyons is a cornerstone of rough path theory, which roughly says that we can derive the estimates of higher iterated integrals from the lower ones (Lyons [2]). If we can get some kind of probabilistic version of the theorem, it should be useful for applications to probability theory. Of course, there are many possible versions to be considered. We can show the following theorem as a first attempt, but a more powerful one is needed.

Theorem 5.1. Let $X^{(n)}_{s,t}$ be a multiplicative functional (for each path) whose expectation is controlled by a control function ω(s, t) as follows:
\[ E \left\| X^{i}_{s,t} \right\|^{p/i} \le \omega(s,t)^{\theta}, \qquad 1 \le i \le p, \tag{2} \]
for some $\theta > 1$ and p such that [p] = n. Then, there exists a multiplicative extension $X^{(m)}_{s,t}$ to $T^{(m)}$ for m > n that satisfies the following estimates,
\[ E \left\| X^{i}_{s,t} \right\|^{p/i} \le C(i,p)\, \omega(s,t)^{\theta}, \qquad i > p, \tag{3} \]
for constants C(i, p), and this extension is unique almost surely. (See Lyons-Qian [3] for the definition of "multiplicative".)
References

1. Keisuke Hara and Terry Lyons. Smooth rough paths and the applications to Fourier analysis, preprint.
2. Terry Lyons (1998). Differential equations driven by rough signals, Revista Matemática Iberoamericana, 14, No. 2.
3. Terry Lyons and Zhongmin Qian (2002). System Control and Rough Paths, Oxford Mathematical Monographs, Oxford Science Publications, Oxford University Press, Oxford.
4. Terence Tao and Christoph Thiele. Nonlinear Fourier Analysis, Proceedings of IAS Park City Mathematics Series, to appear.
5. Nicolas Victoir (2004). Lévy area for the free Brownian motion: existence and nonexistence, Journal of Functional Analysis, 208, 107–121.
From Access to Bypass: A Real Options Approach

Keiichi Hori¹ and Keizo Mizuno²

¹ Faculty of Economics, Ritsumeikan University
² School of Business Administration, Kwansei Gakuin University
1. Introduction

Public utility industries, such as telecommunications, natural gas and electricity, consist of a production facility and a network facility. For example, in the electricity industry, a plant for generating electricity is a production facility, whereas transmission and local distribution wires are network facilities. The network facility is also known as an essential facility because firms without access to it cannot provide final goods or services to their customers. These industries are also characterized by large sunk costs for investment and increasing uncertainty in business environments.

To make final goods markets more efficient and innovative, competition has been introduced in public utility industries in the OECD countries. One of the key policies for competition is an open access policy, or unbundling, which allows rivals access to the incumbent's network facility. However, the open access policy may weaken incentives to invest in the network facility. In recognition of the harmful effects on incentives, the open access policy has been reconsidered in some cases. For example, in 2003, the Federal Communications Commission (FCC) adopted new rules regarding the network unbundling obligations of incumbent local phone carriers, with the aim of providing incentives for the carriers to invest in broadband.

Motivated by the debate on competition policies in public utility industries, this paper investigates firms' incentives for investments in network public utility industries under competition and uncertainty. We employ a real options approach to examine the issues, since the industries are characterized by large sunk costs for investment and increasing demand (or cost) uncertainty. The real options approach features irreversibility of investment under uncertainty. It highlights the option value of delaying an investment decision.
In fact, a decision on the timing of irreversible investment under uncertainty is crucial for firms in network public utility industries, so this approach should be widely used in the study of such industries, especially for studies on the effect of a change in regulatory policies on the performance of the industries. A simple net present value (NPV) approach cannot provide an appropriate understanding of a firm's incentive to construct the network facility under uncertainty and irreversibility of investment, because such an approach ignores the option value of delaying an investment. This is the main reason why we employ the real options approach to examine the incentives for investment in network industries.

The main contribution of our analysis is the characterization of the access-to-bypass equilibrium by allowing an entrant the opportunity to access an incumbent's network facility. In addition to this, we discuss two specific issues concerning competition policies. The first issue is the effect of the choice of a competition scheme on firms' incentives to invest in network facilities. The second issue is how the incentives to invest in networks are influenced by uncertainty.

Several works examine the incentive to invest in network facilities. For example, Sidak and Spulber (1997), Gans and Williams (1999), and Gans (2001) consider an investment incentive for an infrastructure without uncertainty. The effect of uncertainty with the irreversibility of investment has been formally examined by Biglaiser and Riordan (2000). Unlike this paper, however, they neither examine the game between an incumbent and an entrant nor allow the entrant to construct a bypass. Hori and Mizuno (2006a) is a first attempt to investigate the investment game in network public utility industries by focusing on an entrant's decision to undertake an additional investment for bypass construction under uncertainty, and it analyzes the effect of access charges on firms' incentives for preemption and investment.
There are several studies on the effect of open access policy and the comparison between competition regimes in an open access environment. A study closely related to this paper on the issue of open access policy is Bourreau and Doğan (2005). They focus on an incumbent's incentive for unbundling and its incentive to set an access charge. This paper differs from Bourreau and Doğan's study in two ways. First, while the positions of firms in a market are exogenously given in their study, they are endogenously determined in our model. This means that we study a firm's preemptive incentive in our model. To the best of our knowledge, previous studies have not mentioned the preemption effect in an open access environment. Second, Bourreau and Doğan's study does not deal with uncertainty, whereas we examine a firm's investment incentive in a stochastic environment. Hori and Mizuno (2006b) is another closely related study, which discusses the two issues concerning competition policies. This paper analyzes them in the case of a lump-sum or fixed access charge, while Hori and Mizuno (2006b) does so in the case of a usage access charge.

The next section illustrates the framework of a real options model for an imperfectly competitive network industry. Section 3 provides three benchmarks that are useful to derive the characteristics of the equilibrium that we examine below. Section 4 characterizes the access-to-bypass equilibrium of the model. Section 5 examines the effect of open access policy and clarifies the respective conditions for service-based or facility-based competition to induce a more competitive market. Section 6 discusses the effect of uncertainty on the comparison of competition regimes. Concluding remarks are in Section 7.

2. The Model

2.1 Production in a network industry under imperfect competition

We analyze a continuous-time model of strategic investments. The network industry needs two types of facility to serve its consumers: a production facility and a network facility. Investments in those facilities may take place simultaneously or sequentially. At the beginning of this investment game, we assume that no firm has established its facilities. The investment cost for the production facility is $I^p > 0$, whereas that for the network facility (called "network" hereafter) is $I^n > 0$. We assume that the investments are irreversible; hence, both $I^p$ and $I^n$ are sunk costs.

If there already exists a network built by a rival firm in the market, an entrant without a network may utilize the existing network to serve consumers.¹ If the entrant can access the existing network, it pays a lump-sum access charge, v.² The charge v is given for each firm and may have been determined by a policymaker. We assume that once the entrant enters the market via access to the incumbent's network, it has to continue to serve the market by paying v, even when the demand is unexpectedly low. This assumption is justified by a universal service obligation to serve consumers, for example.
Note also that the entrant, having access to another network, may invest in its own network at some time in the future. This additional network is called a "bypass".

We assume that firms compete against each other, selling a homogeneous good in the market. The profit flows of the firms are uncertain because the firms face an aggregate exogenous industry shock. The profit flow of a firm is represented by $\pi = Y\Pi(N)$, which does not include the access charge. Here, Y is the aggregate exogenous shock, N is the number of active firms, and Π(N) is interpreted as the nonstochastic part of the firm's profit flow at the industry equilibrium.

¹ We restrict our attention to a one-way access environment.
² Production costs other than the access charge are assumed to be zero.
Y evolves exogenously and stochastically according to a geometric Brownian motion given by the following expression.
\[ dY_t = \alpha Y_t\, dt + \sigma Y_t\, dW \]
Here, $\alpha \in \left( \tfrac{1}{2}\sigma^2,\, r \right)$ is the drift parameter measuring the expected growth rate of Y, r is the risk-free interest rate, $\sigma > 0$ is a volatility parameter, and dW is the increment of a standard Wiener process with $dW \sim N(0, dt)$. Note that, since $\alpha > \tfrac{1}{2}\sigma^2$, the firm's expected profit flow grows stochastically.

2.2 A two-firm case

For analytical simplicity, we focus on a two-firm case in this paper. There exist two risk-neutral identical firms, i = 1, 2, which plan to enter a network industry. A firm that initially enters the market with both a production facility and a network is called a leader, whereas other firms, which may or may not have a network, are called followers. In this paper, we assume that no firm has established its facilities at the outset, thereby enabling any firm to become a leader in the market. That is, the positions of firms in the market are endogenously determined in our model.

A firm's profit flow in the monopoly equilibrium is represented by $Y\Pi(1)$. We assume that once the two firms enter the market, they compete in quantities of a homogeneous good produced in the market, i.e., Cournot competition. When the two firms are active in the market, we need to distinguish two duopolistic market structures: "access duopoly", in which a follower has access to the leader's network, and "bypass duopoly", in which a follower maintains its own network. Let $Y\hat{\Pi}(2)$ represent the profit flow of a firm in an "access duopoly", while $Y\Pi(2)$ represents the profit flow in a "bypass duopoly". The following relationship is assumed to hold for expository convenience:
Assumption. $\Pi(1) > \Pi(2) > \hat{\Pi}(2)$.

While it is natural to assume that Π(N) is a decreasing function of N, the assumption that $\Pi(2) > \hat{\Pi}(2)$ needs some explanation. The idea behind this assumption is the existence of a positive externality caused by the additional supply of network facility, i.e., a bypass.³ This positive externality is called a "network expansion effect". The network expansion effect creates a higher willingness-to-pay of consumers, which is in turn reflected in a higher equilibrium profit flow of a firm.

³ Examples of the positive externality are high-speed data transmission accomplished by the expansion of optical fiber cables in telecommunications, a reduction of blackouts by construction of another transmission line in a local electricity market, the ability to increase the provision of high calorie gas by construction of an additional gas pipeline, and a reduction of congestion in railroads.

The timing of the game is as follows. Within each instant [t, t + dt), each firm first decides whether to invest or not, given the realization of Y. Second, at the time of entering the market, a firm decides the quantity to be produced and the market clears. We focus on Markov strategies: the firms' investment and quantity decisions depend only on the current value of Y.

If a firm is a leader, it invests in both a production facility and a network at the same time. There is no reason for a leader to delay investing in a network, since a leader cannot distribute its products without the network. On the other hand, if a firm is a follower, it has the option of making sequential investments. That is, a follower can decide whether to invest in a bypass at the time of entering the market, or later on. Hence, a follower can adopt one of three alternative strategies: (i) "access strategy", in which it enters the market by accessing a leader's network (i.e., the follower only invests in a production facility) and never builds its own network, (ii) "bypass strategy", in which it enters the market by building its own network, and (iii) "access-to-bypass strategy", in which it enters the market by accessing a leader's network and later builds its own network.

3. Three Benchmarks

It is useful to derive equilibria of some restrictive games as benchmarks to characterize the equilibrium of the game described in the previous section, since that equilibrium depends on the follower's strategy, which complicates the analysis.

3.1 Simultaneous investment

As an initial benchmark, we consider the case of a duopoly in which each firm simultaneously invests in a production facility and a network.
In the simultaneous timing of investment, it is impossible for one firm to use the other firm's network facility, so that each firm must construct its own. That is, we do not allow the possibility of sequential timing of investments, either by a single firm or between the two firms. For this reason, we can focus on a single firm's decision problem to derive the equilibrium in the simultaneous investment case. This benchmark is also useful when considering the non-cooperative investment game analyzed below. The optimal investment rule for a single firm is formulated as follows.
\[ V(Y_t) = \max_{T} E_t \left\{ e^{-rT} \left( \int_T^{\infty} e^{-r\tau}\, Y_\tau\, \Pi(2)\, d\tau - (I^p + I^n) \right) \right\} \tag{1} \]
where $E_t$ denotes expectations conditional on the information available at time t, and T is the future time at which the investment is made. A single firm's value function in this case is given by⁴
\[ V^S(Y) = \begin{cases} \left( \dfrac{Y}{Y^S} \right)^{\beta} \left( \dfrac{Y^S \Pi(2)}{r - \alpha} - (I^p + I^n) \right) & \text{if } Y < Y^S, \\[2ex] \dfrac{Y \Pi(2)}{r - \alpha} - (I^p + I^n) & \text{if } Y \ge Y^S, \end{cases} \tag{2} \]
where
\[ \beta = \frac{1}{2} \left[ 1 - \frac{2\alpha}{\sigma^2} + \sqrt{ \left( 1 - \frac{2\alpha}{\sigma^2} \right)^2 + \frac{8r}{\sigma^2} } \right] \quad (> 1). \]
The trigger point is given by:
\[ Y^S = \frac{\beta}{\beta - 1}\, \frac{r - \alpha}{\Pi(2)}\, (I^p + I^n). \tag{3} \]
(4)
VFB
⎧ β B ⎪ Y Π(2) ⎪ ⎨ YYB − (Ip + In ) i f Y < YB r−α (Y) = ⎪ ⎪ YΠ(2) i f YB ≤ Y ⎩ − (Ip + In ) r−α
The trigger point is given by: 4 We follow the procedure of Dixit and Pindyck (1994) for the derivation of the value function, which is provided in the Appendix.
December 26, 2006
15:39
Proceedings Trim Size: 9in x 6in
FromAccesstobypass1
133
YB =
(5)
β r−α p (I + In ) . β − 1 Π (2)
Comparing (2) and (3) with (4) and (5), we find that a single firm's value in the case of simultaneous investment under a duopoly is the same as the follower's value in this case. Next, we consider the leader's value in this environment. When the follower enters the market by constructing its own bypass facility, the value function of the leader is as follows:
\[ V_L^B(Y) = \begin{cases} \dfrac{Y \Pi(1)}{r - \alpha} \left[ 1 - \left( \dfrac{Y}{Y^B} \right)^{\beta - 1} \right] + \left( \dfrac{Y}{Y^B} \right)^{\beta} \dfrac{Y^B \Pi(2)}{r - \alpha} - (I^p + I^n) & \text{if } Y < Y^B, \\[2ex] \dfrac{Y \Pi(2)}{r - \alpha} - (I^p + I^n) & \text{if } Y^B \le Y. \end{cases} \tag{6} \]
The first term of $V_L^B(Y)$, when $Y < Y^B$, represents the expected monopoly profit until $Y^B$ is reached, whereas the second term represents the expected value arising from the probability that $Y^B$ is reached. According to the argument of Fudenberg and Tirole (1985), the form of the non-cooperative equilibrium depends on the relative magnitude of the leader's value and the value when both firms invest simultaneously. In our framework, we need to compare the value of $V_L^B(Y)$ with $V^S(Y)$, which is the same as $V_F^B(Y)$. Then, the same argument as in Proposition 3 of Weeds (2002) applies to our analysis: if there exists some $Y \in (0, Y^B)$ such that $V_L^B(Y) > V^S(Y)$, two asymmetric leader-follower equilibria appear in the non-cooperative game. The two equilibria differ only in the identities of the two firms. In fact, as the following proposition shows, we have the two asymmetric equilibria, which we call the "bypass equilibrium", in this access-forbidden case.

Proposition 3.1. There exists a bypass equilibrium in which the leader's unique trigger point $Y_L^B$ is characterized by:
\[ \begin{aligned} V_L^B(Y) &< V_F^B(Y) && \text{if } Y < Y_L^B, \\ V_L^B(Y) &= V_F^B(Y) && \text{if } Y = Y_L^B, \\ V_L^B(Y) &> V_F^B(Y) && \text{if } Y \in (Y_L^B, Y^B), \\ V_L^B(Y) &= V_F^B(Y) && \text{if } Y \ge Y^B. \end{aligned} \]
Proof. See Appendix.
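The leader's trigger in Proposition 3.1 has no closed form in general, but it can be located numerically as the crossing point of the two value functions (4) and (6). The following is a minimal sketch using bisection. All parameter values and function names are illustrative assumptions, and the bracket endpoints rely on the gap being negative near zero and positive just below $Y^B$, which holds for these values.

```python
import math

def beta(r, alpha, sigma):
    # Exponent beta in the form given in the text.
    a = 1.0 - 2.0 * alpha / sigma ** 2
    return 0.5 * (a + math.sqrt(a * a + 8.0 * r / sigma ** 2))

def follower_value(y, y_b, b, r, alpha, profit2, cost):
    # Equation (4): V_F^B(Y).
    if y < y_b:
        return (y / y_b) ** b * (y_b * profit2 / (r - alpha) - cost)
    return y * profit2 / (r - alpha) - cost

def leader_value(y, y_b, b, r, alpha, profit1, profit2, cost):
    # Equation (6): V_L^B(Y).
    if y < y_b:
        monopoly = y * profit1 / (r - alpha) * (1.0 - (y / y_b) ** (b - 1.0))
        duopoly = (y / y_b) ** b * y_b * profit2 / (r - alpha)
        return monopoly + duopoly - cost
    return y * profit2 / (r - alpha) - cost

def leader_trigger(y_b, b, r, alpha, profit1, profit2, cost, tol=1e-12):
    # Bisection for the root of V_L^B - V_F^B inside (0, Y^B).
    def gap(y):
        return (leader_value(y, y_b, b, r, alpha, profit1, profit2, cost)
                - follower_value(y, y_b, b, r, alpha, profit2, cost))
    lo, hi = 1e-9 * y_b, 0.999 * y_b
    assert gap(lo) < 0.0 < gap(hi), "bracket does not contain a sign change"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative parameter values (assumptions, not taken from the paper).
r, alpha, sigma = 0.05, 0.02, 0.10
profit1, profit2, cost = 2.0, 1.0, 3.0  # Pi(1), Pi(2), I^p + I^n
b = beta(r, alpha, sigma)
y_b = b / (b - 1.0) * (r - alpha) / profit2 * cost
y_lb = leader_trigger(y_b, b, r, alpha, profit1, profit2, cost)
```

For these values the follower's trigger is 0.18 and the computed leader's trigger is one third of it, so the leader enters strictly earlier than a firm in the simultaneous investment benchmark.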
The bypass equilibrium is standard in the literature, as discussed in Dixit and Pindyck (1994, Chapter 9). Comparing the bypass equilibrium with the equilibrium in the simultaneous investment case, we can see the effect of competition on a firm's investment timing. That is, because $Y_L^B < Y^S = Y^B$, competition induces earlier investment in production and network facilities. This result is also reported by Katz and Shapiro (1987) in the framework of an R&D race without uncertainty.

3.3 Access equilibrium

In contrast to the second benchmark, we now consider the case in which the follower is obliged to access the leader's network facility in order to serve customers. We call this the "bypass-forbidden" case. When the follower must have access to the leader's network facility, its value function is as follows:
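\[ V_F^A(Y) = \begin{cases} \left( \dfrac{Y}{Y^A} \right)^{\beta} \left( \dfrac{Y^A \hat{\Pi}(2)}{r - \alpha} - \dfrac{v}{r} - I^p \right) & \text{if } Y < Y^A, \\[2ex] \dfrac{Y \hat{\Pi}(2)}{r - \alpha} - \dfrac{v}{r} - I^p & \text{if } Y^A \le Y. \end{cases} \tag{7} \]
The trigger point is as follows.
\[ Y^A = \frac{\beta}{\beta - 1}\, \frac{r - \alpha}{\hat{\Pi}(2)} \left( I^p + \frac{v}{r} \right). \tag{8} \]
The leader's value function in the bypass-forbidden case is as follows.
\[ V_L^A(Y) = \begin{cases} \dfrac{Y \Pi(1)}{r - \alpha} \left[ 1 - \left( \dfrac{Y}{Y^A} \right)^{\beta - 1} \right] + \left( \dfrac{Y}{Y^A} \right)^{\beta} \left( \dfrac{Y^A \hat{\Pi}(2)}{r - \alpha} + \dfrac{v}{r} \right) - (I^p + I^n) & \text{if } Y < Y^A, \\[2ex] \dfrac{Y \hat{\Pi}(2)}{r - \alpha} + \dfrac{v}{r} - (I^p + I^n) & \text{if } Y^A \le Y. \end{cases} \tag{9} \]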
To characterize the equilibrium in the bypass-forbidden case, we should note that the benchmark value function with simultaneous investment must be the same as that used in the access-forbidden case, since neither firm can expect to use the other firm's network facility. It is then natural to restrict our attention to the asymmetric leader-follower equilibrium in this case. We call this the "access equilibrium". However, when we focus on the access equilibria (i.e., the asymmetric leader-follower equilibria), there is a problem concerning the existence and uniqueness of the equilibrium. In fact, depending on the magnitude of the access charge relative to the investment costs, the equilibrium may fail to exist in a range of Y. For our purpose, it is enough to focus on the case where a unique access equilibrium exists. The following proposition gives a sufficient condition for the existence of the unique access equilibrium.

Proposition 3.2. There exists a unique access equilibrium in which the leader's unique trigger point $Y_L^A$ is characterized by:
\[ \begin{aligned} V_L^A(Y) &< V_F^A(Y) && \text{if } Y < Y_L^A, \\ V_L^A(Y) &= V_F^A(Y) && \text{if } Y = Y_L^A, \\ V_L^A(Y) &> V_F^A(Y) && \text{if } Y > Y_L^A, \end{aligned} \]
under the following condition.
\[ \frac{v}{r} \ge \frac{1}{2} I^n \tag{10} \]
Proof. See Appendix.

The condition (10) guarantees only that $V_L^A(Y) > V_F^A(Y)$ for $Y > Y_L^A$: the follower incurs such an access payment that, after entering the market, its value cannot exceed the leader's value. Without uncertainty, Gans (2001) and Gans and Williams (1999) derived the relationship between the access charge and the network-investment timing. However, because of the absence of uncertainty and the symmetry between the two firms, the follower accesses the leader's network immediately in their equilibria. In contrast, even in the symmetric structure of the game, the leader can enjoy the monopoly rent (i.e., $Y_L^A < Y^A$) in our asymmetric leader-follower equilibrium.

4. The Access-to-Bypass Equilibrium

4.1 Possible equilibria under open access policy

The follower takes one of three kinds of strategies: access strategy, bypass strategy and access-to-bypass strategy. Hence, the bypass equilibrium or the access equilibrium can occur in service-based competition under open access policy. In the analysis below, however, we mainly focus on "the access-to-bypass equilibrium", in which a leader first enters a market by building a new network, a follower then enters with access to the leader's network, and in the future the follower constructs a bypass. The access-to-bypass equilibrium has two features: it is a leader-follower equilibrium, and a follower undertakes a sequential investment. We restrict our attention to the access-to-bypass equilibrium because it can generally occur under a stochastically growing demand environment, and because it shows some characteristics peculiar to network industries.
4.2 A follower’s choice of strategy

In this subsection, we specify the condition under which a follower chooses the access-to-bypass strategy over the other two strategies. We first examine when the follower makes the transition from the access project to the bypass project. Suppose a follower has already entered the market by accessing a leader’s network. Then, after building a bypass at the cost of $I^n$, the follower earns $Y\Pi(2)$ instead of $Y\hat{\Pi}(2) - v$ in every period. Solving the follower’s bypass investment problem, we obtain the firm value after the investment: $Y_t \Delta\Pi(2)/(r-\alpha) + (v/r - I^n)$, where $\Delta\Pi(2) \equiv \Pi(2) - \hat{\Pi}(2)$. This firm value is the expected present value of the incremental profit flow generated by moving from access to bypass, plus the saving of the access charge payment, minus the investment cost of a bypass. We define $V^T(Y)$ as the option value of the investment in a bypass. From the standard procedure and $V^T(0) = 0$, we have $V^T(Y) = GY^{\beta}$.5 The parameter $G$ and the trigger point $Y^{B*}$ are the solutions that satisfy the following value-matching condition and smooth-pasting condition.

\[ V^T(Y^{B*}) = \Delta F(Y^{B*}) - I^n \tag{11} \]

\[ V^{T\prime}(Y^{B*}) = \Delta F'(Y^{B*}) \tag{12} \]

Here, $\Delta F(Y) \equiv Y\Delta\Pi(2)/(r-\alpha) + v/r$ is the value of the transition project from access to bypass (i.e., the difference in value between the bypass project and the access project). From the above two conditions, we derive the trigger point $Y^{B*}$ at which the follower builds a bypass.6

\[ Y^{B*} = \frac{\beta}{\beta-1}\,\frac{r-\alpha}{\Delta\Pi(2)}\left(I^n - \frac{v}{r}\right) \tag{13} \]
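The two boundary conditions (11) and (12) and the resulting trigger (13) can be checked numerically. The following is a minimal sketch, not the authors’ code: it assumes that $\beta$ is the positive root of the usual characteristic quadratic for a geometric Brownian motion with drift $\alpha$ and volatility $\sigma$ (the root formula is standard but is not stated in this section), and all parameter values and function names are hypothetical.

```python
import math


def beta_positive_root(r: float, alpha: float, sigma: float) -> float:
    """Positive root of 0.5*sigma^2*b*(b-1) + alpha*b - r = 0.

    Assumption: demand Y follows a geometric Brownian motion with drift
    alpha and volatility sigma, and r > alpha so that the root exceeds 1.
    """
    a = 0.5 - alpha / sigma**2
    return a + math.sqrt(a * a + 2.0 * r / sigma**2)


def bypass_trigger(beta: float, r: float, alpha: float,
                   delta_pi: float, I_n: float, v: float) -> float:
    """Trigger Y^{B*} of equation (13)."""
    return beta / (beta - 1.0) * (r - alpha) / delta_pi * (I_n - v / r)


# Hypothetical parameter values, chosen only for illustration.
r, alpha, sigma = 0.05, 0.02, 0.2
delta_pi, I_n, v = 0.5, 10.0, 0.2

beta = beta_positive_root(r, alpha, sigma)
Y_B = bypass_trigger(beta, r, alpha, delta_pi, I_n, v)
# Coefficient G of the option value V^T(Y) = G * Y**beta (footnote 6).
G = Y_B ** (-beta) * (I_n - v / r) / (beta - 1.0)

# Value matching (11): option value equals transition value minus cost.
option_value = G * Y_B ** beta
net_transition_value = Y_B * delta_pi / (r - alpha) + v / r - I_n
# Smooth pasting (12): the slopes agree at the trigger.
option_slope = beta * G * Y_B ** (beta - 1.0)
transition_slope = delta_pi / (r - alpha)

print(f"beta={beta:.4f}, Y_B*={Y_B:.4f}, G={G:.4f}")
```

With these illustrative inputs, `option_value` and `net_transition_value` agree, as do the two slopes, which is exactly what (11) and (12) require.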
Next, we consider the follower’s decision problem of whether it should enter the market by accessing the leader’s network. Note that, since there is an opportunity to build a bypass after access to the leader’s network, the effective value of the entry by access includes not only its own value but also the option value of the investment in a bypass. In fact, using the standard procedure again, we can derive the effective value of the entry by access, which has a form similar to $V^T(Y)$. Then, using the value-matching and the smooth-pasting conditions and the effective value of the entry by access, we derive the trigger point $Y^{A*}$.7

\[ Y^{A*} = \frac{\beta}{\beta-1}\,\frac{r-\alpha}{\hat{\Pi}(2)}\left(I^p + \frac{v}{r}\right) \tag{14} \]

5 The requirement that $V^T(0) = 0$ stems from the observation that if $Y$ goes to zero, it will stay at zero.
6 We also obtain $G = (Y^{B*})^{-\beta}\left(\frac{1}{\beta-1}I^n - \frac{1}{\beta-1}\frac{v}{r}\right)$.
Now we can characterize the follower’s choice of strategy. It is apparent that the follower does not adopt the access strategy, since $\Delta\Pi(2) > 0$ and the demand is stochastically growing.8 Hence, we have the following lemma:

Lemma 4.1. The follower adopts the access-to-bypass strategy (or bypass strategy, respectively) if and only if

\[ I^p + \frac{v}{r} \;\le\; (>)\; \frac{\hat{\Pi}(2)}{\Pi(2)}\,(I^p + I^n). \tag{15} \]
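Proof. Observe that, when $Y^{B*} < +\infty$ and $(0 <)\; Y^{A*} \le Y^{B*}$, the follower adopts the access-to-bypass strategy. Equation (15) is derived by rewriting the condition that $Y^{A*} \le Y^{B*}$. When $Y^{A*} > Y^{B*}$, only the bypass strategy remains for the follower. In fact, this is confirmed, because

\[ Y^{B*} \equiv \frac{\beta}{\beta-1}\,\frac{r-\alpha}{\Delta\Pi(2)}\left(I^n - \frac{v}{r}\right) < \frac{\beta}{\beta-1}\,\frac{r-\alpha}{\hat{\Pi}(2)}\left(I^p + \frac{v}{r}\right) \equiv Y^{A*} \;\Longleftrightarrow\; \hat{\Pi}(2)\,(I^p + I^n) < \Pi(2)\left(I^p + \frac{v}{r}\right). \tag{16} \]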
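The equivalence used in this proof, namely that condition (15) holds exactly when $Y^{A*} \le Y^{B*}$, can be illustrated numerically. The sketch below uses hypothetical function names and parameter values; it simply evaluates (13), (14) and (15) over a grid of access charges and compares the two tests.

```python
def access_trigger(beta, r, alpha, pi_hat, I_p, v):
    """Trigger Y^{A*} of equation (14)."""
    return beta / (beta - 1.0) * (r - alpha) / pi_hat * (I_p + v / r)


def bypass_trigger(beta, r, alpha, pi2, pi_hat, I_n, v):
    """Trigger Y^{B*} of equation (13), with Delta Pi(2) = pi2 - pi_hat."""
    return beta / (beta - 1.0) * (r - alpha) / (pi2 - pi_hat) * (I_n - v / r)


def lemma_condition(r, pi2, pi_hat, I_p, I_n, v):
    """Condition (15): I^p + v/r <= (pi_hat / pi2) * (I^p + I^n)."""
    return I_p + v / r <= pi_hat / pi2 * (I_p + I_n)


def follower_strategy(r, pi2, pi_hat, I_p, I_n, v):
    """Strategy selected by Lemma 4.1."""
    if lemma_condition(r, pi2, pi_hat, I_p, I_n, v):
        return "access-to-bypass"
    return "bypass"


# Hypothetical parameters; condition (15) switches at v/r = 4, i.e. v = 0.2.
beta, r, alpha = 1.6, 0.05, 0.02
pi2, pi_hat, I_p, I_n = 1.0, 0.6, 5.0, 10.0

results = []
for v in (0.05, 0.10, 0.15, 0.25, 0.30, 0.45):
    Y_A = access_trigger(beta, r, alpha, pi_hat, I_p, v)
    Y_B = bypass_trigger(beta, r, alpha, pi2, pi_hat, I_n, v)
    results.append((v, Y_A <= Y_B, lemma_condition(r, pi2, pi_hat, I_p, I_n, v)))
    print(v, follower_strategy(r, pi2, pi_hat, I_p, I_n, v))
```

For every access charge in the grid, the trigger comparison and condition (15) give the same answer, so a low access charge selects the access-to-bypass strategy and a high one selects the bypass strategy.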
From equation (15), we note that, if the follower adopts the access-to-bypass strategy, then $v/r \le I^n$, because $\hat{\Pi}(2)/\Pi(2) < 1$. That is, the present value of the access charge cannot exceed the investment cost for a network if the follower adopts the access-to-bypass strategy. Otherwise, too large an access charge induces a follower to enter by building a bypass rather than to enter through access to a leader’s network. When the follower chooses the access-to-bypass strategy, its value function is derived as follows.

\[
V_F^{AB}(Y) =
\begin{cases}
\left(\dfrac{Y}{Y^{A*}}\right)^{\beta}\left[\dfrac{Y^{A*}\hat{\Pi}(2)}{r-\alpha} - \dfrac{v}{r} - I^p\right] + \left(\dfrac{Y}{Y^{B*}}\right)^{\beta}\left[\dfrac{Y^{B*}\Delta\Pi(2)}{r-\alpha} + \dfrac{v}{r} - I^n\right] & \text{if } Y < Y^{A*} \\[2ex]
\dfrac{Y\hat{\Pi}(2)}{r-\alpha} - \dfrac{v}{r} - I^p + \left(\dfrac{Y}{Y^{B*}}\right)^{\beta}\left[\dfrac{Y^{B*}\Delta\Pi(2)}{r-\alpha} + \dfrac{v}{r} - I^n\right] & \text{if } Y^{A*} \le Y < Y^{B*} \\[2ex]
\dfrac{Y\Pi(2)}{r-\alpha} - (I^p + I^n) & \text{if } Y^{B*} \le Y
\end{cases}
\tag{17}
\]

7 The derivation of the effective value of the entry by access and the trigger point $Y^{A*}$ is omitted, since the procedure is almost the same as that stated above.
8 From the assumptions that $\Delta\Pi(2; v) > 0$ and that $Y$ evolves according to a geometric Brownian motion with expected growth rate $\alpha$, there necessarily exists a trigger point $Y^{B*}$ at which the follower starts a bypass project.
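As a consistency check on the piecewise form of (17), the following sketch evaluates the follower’s value function and confirms that the three branches join continuously at $Y^{A*}$ and $Y^{B*}$. It is an illustration under stated assumptions: $\beta$ is taken to be the positive root of the standard quadratic for geometric Brownian motion, and the parameter values and names (`P`, `follower_value`) are hypothetical.

```python
import math

# Hypothetical parameters (not taken from the paper).
P = dict(r=0.05, alpha=0.02, sigma=0.2, pi2=1.0, pi_hat=0.6,
         I_p=5.0, I_n=10.0, v=0.15)


def beta_of(p):
    """Positive root of 0.5*sigma^2*b*(b-1) + alpha*b - r = 0 (assumed GBM)."""
    a = 0.5 - p["alpha"] / p["sigma"] ** 2
    return a + math.sqrt(a * a + 2.0 * p["r"] / p["sigma"] ** 2)


def triggers(p):
    """Return (Y^{A*}, Y^{B*}) from equations (14) and (13)."""
    b, k, vr = beta_of(p), p["r"] - p["alpha"], p["v"] / p["r"]
    scale = b / (b - 1.0) * k
    Y_A = scale / p["pi_hat"] * (p["I_p"] + vr)
    Y_B = scale / (p["pi2"] - p["pi_hat"]) * (p["I_n"] - vr)
    return Y_A, Y_B


def follower_value(Y, p):
    """Follower's value V_F^{AB}(Y) as in equation (17)."""
    b, k, vr = beta_of(p), p["r"] - p["alpha"], p["v"] / p["r"]
    Y_A, Y_B = triggers(p)
    if Y >= Y_B:  # bypass already built
        return Y * p["pi2"] / k - (p["I_p"] + p["I_n"])
    # Option to move from access to bypass, discounted from the trigger Y^{B*}.
    bypass_option = (Y / Y_B) ** b * (
        Y_B * (p["pi2"] - p["pi_hat"]) / k + vr - p["I_n"])
    if Y >= Y_A:  # operating with access to the leader's network
        return Y * p["pi_hat"] / k - vr - p["I_p"] + bypass_option
    # Not yet entered: option to enter by access plus the bypass option.
    return (Y / Y_A) ** b * (Y_A * p["pi_hat"] / k - vr - p["I_p"]) + bypass_option


Y_A, Y_B = triggers(P)
for Y in (0.5 * Y_A, Y_A, 0.5 * (Y_A + Y_B), Y_B, 1.5 * Y_B):
    print(f"Y={Y:.4f}  V_F={follower_value(Y, P):.4f}")
```

The values just below and at each trigger agree to numerical precision, which is what value matching at $Y^{A*}$ and $Y^{B*}$ implies.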
The trigger points $Y^{A*}$ and $Y^{B*}$ are given by (14) and (13), respectively.

4.3 A leader’s firm value

Next, we derive a leader’s firm value. As mentioned before, we focus on the case in which a follower takes the access-to-bypass strategy. In that case, the value function of the leader can be derived as follows.9

\[
V_L^{AB}(Y) =
\begin{cases}
\dfrac{Y\Pi(1)}{r-\alpha}\left[1 - \left(\dfrac{Y}{Y^{A*}}\right)^{\beta-1}\right] + \left(\dfrac{Y}{Y^{A*}}\right)^{\beta}\dfrac{Y^{A*}\hat{\Pi}(2)}{r-\alpha}\left[1 - \left(\dfrac{Y^{A*}}{Y^{B*}}\right)^{\beta-1}\right] \\[2ex]
\quad + \left(\dfrac{Y}{Y^{A*}}\right)^{\beta}\left\{\dfrac{v}{r}\left[1 - \left(\dfrac{Y^{A*}}{Y^{B*}}\right)^{\beta}\right] + \left(\dfrac{Y^{A*}}{Y^{B*}}\right)^{\beta}\dfrac{Y^{B*}\Pi(2)}{r-\alpha}\right\} - (I^p + I^n) & \text{if } Y < Y^{A*} \\[2ex]
\dfrac{Y\hat{\Pi}(2)}{r-\alpha}\left[1 - \left(\dfrac{Y}{Y^{B*}}\right)^{\beta-1}\right] + \dfrac{v}{r}\left[1 - \left(\dfrac{Y}{Y^{B*}}\right)^{\beta}\right] + \left(\dfrac{Y}{Y^{B*}}\right)^{\beta}\dfrac{Y^{B*}\Pi(2)}{r-\alpha} - (I^p + I^n) & \text{if } Y^{A*} \le Y < Y^{B*} \\[2ex]
\dfrac{Y\Pi(2)}{r-\alpha} - (I^p + I^n) & \text{if } Y^{B*} \le Y
\end{cases}
\tag{18}
\]

9 Given $Y^{A*}$ and $Y^{B*}$, the leader’s firm value is defined as follows: $V_L^{AB} \equiv E_t\left[\int_t^{T^A} e^{-r\tau} Y_\tau \Pi(1)\, d\tau\right] + E_t\left[e^{-rT^A}\int_{T^A}^{T^B} e^{-r\tau}\left(Y_\tau \hat{\Pi}(2) + v\right) d\tau\right] + E_t\left[e^{-rT^B}\right]\left[\frac{Y^{B*}\Pi(2)}{r-\alpha}\right] - (I^p + I^n)$, where $T^A$ is the first time $Y_t$ reaches $Y^{A*}$ and $T^B$ is the first time it reaches $Y^{B*}$.

4.4 The equilibrium

Before describing the equilibrium, we derive each firm’s equilibrium strategy. A firm invests in the production facility at $Y^{A*}$ and in the network at $Y^{B*}$ if it is a follower (i.e., if a rival has already entered the market). If a rival has not entered the market, it invests in both facilities at $Y$ where $Y < Y^{A*}$ and $V_L^{AB}(Y) > V_F^{AB}(Y)$. Irrespective of the rival’s decision, it never invests at $Y$ where $V_L^{AB}(Y) < V_F^{AB}(Y)$. We can now derive the access-to-bypass equilibrium as a subgame perfect Nash equilibrium. Although we have not yet found the leader’s trigger point $Y_L^*$ at which the leader immediately invests, the equilibrium defines $Y_L^*$. As Fudenberg and Tirole (1985) discussed, $Y_L^*$ is the smallest value of $Y$
that satisfies $V_L^{AB}(Y) = V_F^{AB}(Y)$. If $V_L^{AB}(Y) < V_F^{AB}(Y)$, both firms want to be the follower, hence neither firm will invest. On the other hand, if $V_L^{AB}(Y) > V_F^{AB}(Y)$, both firms want not only to invest immediately, but also to invest at a smaller level of $Y$, motivated by preemption of the rival firm. Therefore, one of the two firms invests, i.e., becomes the leader, when $V_L^{AB}(Y) = V_F^{AB}(Y)$.

Proposition 4.1. There exist two access-to-bypass equilibria, differing only in the identities of the two firms, under the conditions that $(I^n/2) < v/r \le \left[\hat{\Pi}(2)/\Pi(2)\right](I^p + I^n) - I^p$. A unique leader’s trigger point $Y_L^* \in (0, Y^{B*})$ in the equilibria is characterized by

\[
\begin{aligned}
V_L^{AB}(Y) &< V_F^{AB}(Y) && \text{if } Y < Y_L^* \\
V_L^{AB}(Y) &= V_F^{AB}(Y) && \text{if } Y = Y_L^* \\
V_L^{AB}(Y) &> V_F^{AB}(Y) && \text{if } Y \in (Y_L^*, Y^{B*}) \\
V_L^{AB}(Y) &= V_F^{AB}(Y) && \text{if } Y \ge Y^{B*}.
\end{aligned}
\]
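The characterization in Proposition 4.1 can be explored numerically. The sketch below is an illustration only: it codes the follower and leader value functions in the forms (17) and (18) as read in this section, assumes that $\beta$ is the positive root of the standard quadratic for geometric Brownian motion, and locates the smallest $Y$ at which the two values coincide by a grid scan followed by bisection. All names and parameter values are hypothetical; the parameters are chosen so that $(I^n/2) < v/r \le [\hat{\Pi}(2)/\Pi(2)](I^p + I^n) - I^p$ holds.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    # Hypothetical values satisfying the conditions of Proposition 4.1.
    r: float = 0.05
    alpha: float = 0.02
    sigma: float = 0.2
    pi1: float = 2.0     # monopoly profit coefficient Pi(1)
    pi2: float = 1.0     # duopoly profit coefficient after bypass Pi(2)
    pi_hat: float = 0.8  # duopoly profit coefficient under access
    I_p: float = 1.0
    I_n: float = 10.0
    v: float = 0.3

    @property
    def beta(self) -> float:
        # Assumed: positive root of 0.5*sigma^2*b*(b-1) + alpha*b - r = 0.
        a = 0.5 - self.alpha / self.sigma ** 2
        return a + math.sqrt(a * a + 2.0 * self.r / self.sigma ** 2)

    @property
    def Y_A(self) -> float:  # equation (14)
        b = self.beta
        return b / (b - 1) * (self.r - self.alpha) / self.pi_hat * (self.I_p + self.v / self.r)

    @property
    def Y_B(self) -> float:  # equation (13)
        b = self.beta
        return (b / (b - 1) * (self.r - self.alpha) / (self.pi2 - self.pi_hat)
                * (self.I_n - self.v / self.r))


def follower_value(Y: float, p: Params) -> float:
    """Equation (17)."""
    b, k, vr = p.beta, p.r - p.alpha, p.v / p.r
    if Y >= p.Y_B:
        return Y * p.pi2 / k - (p.I_p + p.I_n)
    bypass_option = (Y / p.Y_B) ** b * (p.Y_B * (p.pi2 - p.pi_hat) / k + vr - p.I_n)
    if Y >= p.Y_A:
        return Y * p.pi_hat / k - vr - p.I_p + bypass_option
    return (Y / p.Y_A) ** b * (p.Y_A * p.pi_hat / k - vr - p.I_p) + bypass_option


def leader_value(Y: float, p: Params) -> float:
    """Equation (18)."""
    b, k, vr = p.beta, p.r - p.alpha, p.v / p.r
    YA, YB, cost = p.Y_A, p.Y_B, p.I_p + p.I_n
    if Y >= YB:
        return Y * p.pi2 / k - cost
    if Y >= YA:
        return (Y * p.pi_hat / k * (1 - (Y / YB) ** (b - 1))
                + vr * (1 - (Y / YB) ** b)
                + (Y / YB) ** b * YB * p.pi2 / k - cost)
    # Value of the access and bypass phases, measured at Y^{A*}.
    after_access = (YA * p.pi_hat / k * (1 - (YA / YB) ** (b - 1))
                    + vr * (1 - (YA / YB) ** b)
                    + (YA / YB) ** b * YB * p.pi2 / k)
    return Y * p.pi1 / k * (1 - (Y / YA) ** (b - 1)) + (Y / YA) ** b * after_access - cost


def leader_trigger(p: Params, n_grid: int = 2000, tol: float = 1e-12) -> float:
    """Smallest Y in (0, Y^{B*}) with V_L^{AB}(Y) = V_F^{AB}(Y)."""
    def gap(y: float) -> float:
        return leader_value(y, p) - follower_value(y, p)

    lo = 0.0
    for i in range(1, n_grid + 1):
        hi = p.Y_B * i / n_grid
        if gap(lo) < 0.0 <= gap(hi):  # first sign change from below
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if gap(mid) < 0.0:
                    lo = mid
                else:
                    hi = mid
            return hi
        lo = hi
    raise ValueError("no crossing found: check the conditions of Proposition 4.1")


p = Params()
Y_L = leader_trigger(p)
print(f"Y_L*={Y_L:.4f}, Y_A*={p.Y_A:.4f}, Y_B*={p.Y_B:.4f}")
```

For these illustrative parameters the crossing lies below $Y^{A*}$, the leader’s value is below the follower’s for smaller $Y$ and above it between the crossing and $Y^{B*}$, and the two values coincide from $Y^{B*}$ onward, in line with the four cases of the proposition.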
Proof. See Appendix.

The access-to-bypass equilibrium is as follows. The two firms do not enter the market when $Y \in (0, Y_L^*)$, where $Y_L^*$ is the trigger point at which the leader enters the market. When $Y \in (0, Y^{A*})$, the leader enjoys monopoly profits. When $Y \in [Y^{A*}, Y^{B*})$, the follower operates with access to the leader’s network. Finally, when $Y \in [Y^{B*}, +\infty)$, the follower serves its customers by building a bypass. The conditions in Proposition 4.1 suggest that there exist lower and upper bounds on $v$. If $v$ is too small, the condition that $(I^n/2) < v/r$ will be violated. This is because a $v$ that is too small gives the two firms the incentive to be the follower rather than the leader. Thus the access-to-bypass equilibrium cannot exist. On the other hand, $v$ cannot be too large compared with $I^n$, as Lemma 4.1 states.

We have two observations to make on Proposition 4.1. First, two asymmetric leader-follower equilibria appear in this game. The two equilibria differ only in the identities of the two firms: firm 1 becomes the leader in one equilibrium, and firm 2 becomes the leader in the other equilibrium. Hence, we use equilibria rather than equilibrium in Proposition 4.1. The equilibrium outcomes are the same in the two equilibria. Second, although the firms face uncertainty and irreversibility of investment, an incentive to preempt the rival firm still exists. While being the follower is motivated by the value of waiting before making the
investment and the opportunity of making sequential investments, being the leader is motivated by earning not only the monopoly rent but also the access profit in an open access environment. Since the motivation for being the leader is stronger than the motivation for being the follower (i.e., $V_L^{AB}(Y) > V_F^{AB}(Y)$) for $Y \in (Y_L^*, Y^{B*})$ in the access-to-bypass equilibrium, the two firms participate in the race to become the leader. This point can be confirmed by comparison with the game in which the roles of leader and follower are exogenously preassigned to the firms. In that case, a firm preassigned as the leader makes use of an option value to wait, which implies that its optimal trigger point becomes larger than $Y_L^*$.

5. The Effects of Open Access Policy

5.1 Equilibria in service-based and facility-based competitions

In this section, we compare “service-based competition” under open access policy with “facility-based competition”. In service-based competition, brought about by open access policy, entrants can enter the market by accessing an incumbent’s network facility if and when they desire. It is expected that service-based competition encourages potential entrants to enter a market, since they need not incur the large fixed costs of building their own network. In facility-based competition, on the other hand, the additional networks built by potential entrants may introduce a positive network externality or reduce congestion, both of which contribute to growth of the market. Hence, each of the two competition regimes has its own advantages from a welfare viewpoint. In this section, we try to clarify the conditions under which one competition regime would induce a more welfare-enhancing environment than the other. In service-based competition, we assume that only the access-to-bypass equilibrium occurs. In facility-based competition, a follower is not allowed to access a leader’s network. Therefore, the economic environment and the timing of the game in facility-based competition are the same as those of the bypass equilibrium shown in Section 3.2.

5.2 The effect on a follower’s entry timing

We can confirm that, in a service-based competition regime under open access policy, a follower enters the market earlier and builds a bypass later than in facility-based competition.

Proposition 5.1. In the access-to-bypass equilibria, (i) a follower enters the market earlier than a follower in the facility-based competition equilibria, and (ii) a follower builds a bypass later than a follower in the facility-based competition equilibria. That is, $Y^{A*} < Y^B < Y^{B*}$.

Proof. See Appendix.
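The ordering $Y^{A*} < Y^B < Y^{B*}$ in Proposition 5.1 can be spot-checked by random sampling. The sketch below draws hypothetical parameter sets that satisfy condition (15) with a small safety margin and evaluates the three closed-form triggers, taking $Y^B = \frac{\beta}{\beta-1}\frac{r-\alpha}{\Pi(2)}(I^p + I^n)$ as the follower’s trigger under facility-based competition (the form used in the Appendix). The sampling ranges and function names are assumptions made only for this illustration.

```python
import random


def three_triggers(beta, r, alpha, pi2, pi_hat, I_p, I_n, v_over_r):
    """Return (Y^{A*}, Y^B, Y^{B*}) from the closed-form trigger formulas."""
    scale = beta / (beta - 1.0) * (r - alpha)
    Y_A_star = scale / pi_hat * (I_p + v_over_r)           # equation (14)
    Y_B_fac = scale / pi2 * (I_p + I_n)                    # facility-based follower
    Y_B_star = scale / (pi2 - pi_hat) * (I_n - v_over_r)   # equation (13)
    return Y_A_star, Y_B_fac, Y_B_star


def ordering_violations(n_draws=5000, seed=0):
    """Sample parameters satisfying (15) strictly; collect ordering failures."""
    rng = random.Random(seed)
    violations, checked = [], 0
    for _ in range(n_draws):
        beta = rng.uniform(1.1, 3.0)
        pi2 = rng.uniform(0.5, 2.0)
        pi_hat = pi2 * rng.uniform(0.3, 0.95)   # access profit below bypass profit
        I_p = rng.uniform(0.5, 5.0)
        I_n = rng.uniform(1.0, 20.0)
        upper = pi_hat / pi2 * (I_p + I_n) - I_p  # bound on v/r implied by (15)
        if upper <= 0.01:
            continue  # condition (15) cannot hold with a comfortable margin
        v_over_r = rng.uniform(0.0, 0.99) * upper
        checked += 1
        Y_A_star, Y_B_fac, Y_B_star = three_triggers(
            beta, 0.05, 0.02, pi2, pi_hat, I_p, I_n, v_over_r)
        if not (Y_A_star < Y_B_fac < Y_B_star):
            violations.append((beta, pi2, pi_hat, I_p, I_n, v_over_r))
    return checked, violations


checked, violations = ordering_violations()
print(f"checked {checked} parameter draws, violations: {len(violations)}")
```

No violations are expected, since both inequalities reduce algebraically to the strict form of condition (15).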
The first inequality in Proposition 5.1 shows that a follower’s entry with open access must be earlier than that without open access, while the second inequality states that a follower with open access constructs a bypass later than one without open access. This implies that the positive externality generated by a bypass construction cannot be introduced earlier in service-based competition than in facility-based competition. The second result is the same as Lemma 6 in Bourreau and Doğan (2005). They stated that, when there is unbundling, a follower builds its bypass later than when there is no unbundling. While there are a few differences in the setting of the model between the two papers, the implication of the “replacement effect” is robust.

5.3 The effect on a leader’s entry timing

Next, we compare the entry timing of a leader between the two competition regimes. The following claim states the necessary and sufficient condition for a leader in the access-to-bypass equilibria to enter earlier (later) than one in the facility-based competition equilibria:

Proposition 5.2. A leader enters earlier (later) in the access-to-bypass equilibria than in the facility-based competition equilibria (i.e., $Y_L^* < (>)\; Y_L$) if and only if $Q^{AB}(Y_L) > (<)\; 0$, where $Q^{AB}(Y) \equiv V_L^{AB}(Y) - V_F^{AB}(Y)$ is the difference in firm value between the leader and the follower in the access-to-bypass equilibria.

Proof. $Y_L^*$ is the smallest value of $Y$ that satisfies $Q^{AB}(Y) = 0$. Therefore, $Q^{AB}(Y_L)$ is positive (negative) if and only if $Y_L^* < (>)\; Y_L$.

However, as seen in the proof of Proposition 4.1, substituting (17) and (18) into this necessary and sufficient condition gives too complicated an expression to obtain an intuitive interpretation. Hence, we instead seek a sufficient condition for one competition regime to make a leader’s entry earlier than in the other competition regime. We show a sufficient condition for facility-based competition to cause a leader to enter earlier than in service-based competition.

Proposition 5.3. A leader in the facility-based competition equilibria enters earlier than one in the access-to-bypass equilibria (i.e., $Y_L < Y_L^*$), if $\Pi(1)$ is sufficiently large.

Proof. See Appendix.
Proposition 5.3 states that, if $\Pi(1)$ is sufficiently large, the incentive to become a leader in facility-based competition is stronger than that in service-based competition.

Proposition 5.4. A leader in the access-to-bypass equilibria enters earlier than one in the facility-based competition equilibria (i.e., $Y_L^* < Y_L$) if

\[ \frac{I^p + 2v/r}{I^p + v/r} > \frac{\beta}{\beta-1}\,\frac{\Pi(1)}{\hat{\Pi}(2)}. \tag{19} \]
Proof. See Appendix.
Condition (19) has several implications. For example, if the difference between $\Pi(1)$ and $\hat{\Pi}(2)$ is sufficiently small, the proposition would hold. In addition, the proposition holds in the case where the level of access charge is high.

6. Uncertainty’s effect on the choice of competition regimes

The effect of uncertainty on a leader’s entry timing is ambiguous in our model. This ambiguity stems from a conflict between the preemption effect and the real options effect. Since a follower enters later when uncertainty is higher, the incentive to be a leader becomes larger. This is because a leader can enjoy a monopoly rent for a longer period. Hence, the preemption motive is a countervailing force to the real options effect. This point is also illustrated by Mason and Weeds (2003). They demonstrate that if the monopoly profit is large, an increase in uncertainty hastens a leader’s entry. The reasoning about the countervailing relationship between the two effects also applies to the choice of competition regimes. In fact, from condition (19), we have a sufficient condition with respect to uncertainty for service-based competition to cause a leader to enter earlier than in facility-based competition.

Proposition 6.1. When uncertainty is small, a leader in the access-to-bypass equilibria enters earlier than a leader in the facility-based competition equilibria does (i.e., $Y_L^* < Y_L$).

Proof. This is obvious from condition (19), since $\beta$ is a decreasing function of $\sigma$.
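The role of $\sigma$ in this argument can be made concrete with a short computation. The sketch below assumes that $\beta$ is the positive root of the standard characteristic quadratic for geometric Brownian motion (a formula not restated in this section), and uses hypothetical parameter values with $\Pi(1)$ only slightly above $\hat{\Pi}(2)$. It shows $\beta$ falling as $\sigma$ rises, so the right-hand side of (19) grows and the sufficient condition eventually fails.

```python
import math


def beta_positive_root(r, alpha, sigma):
    """Positive root of 0.5*sigma^2*b*(b-1) + alpha*b - r = 0 (assumed GBM)."""
    a = 0.5 - alpha / sigma ** 2
    return a + math.sqrt(a * a + 2.0 * r / sigma ** 2)


def condition_19(beta, pi1, pi_hat, I_p, v, r):
    """Sufficient condition (19) for Y_L* < Y_L."""
    lhs = (I_p + 2.0 * v / r) / (I_p + v / r)
    rhs = beta / (beta - 1.0) * pi1 / pi_hat
    return lhs > rhs


# Hypothetical parameters: monopoly profit only slightly above access profit.
r, alpha = 0.05, 0.02
pi1, pi_hat, I_p, v = 1.1, 1.0, 1.0, 0.3

sigmas = [0.01, 0.05, 0.1, 0.2, 0.4]
betas = [beta_positive_root(r, alpha, s) for s in sigmas]
holds = [condition_19(b, pi1, pi_hat, I_p, v, r) for b in betas]

for s, b, h in zip(sigmas, betas, holds):
    print(f"sigma={s:<5} beta={b:.4f}  beta/(beta-1)={b / (b - 1):.4f}  (19) holds: {h}")
```

In this illustration, condition (19) holds only at the lowest volatility in the list, which is the pattern Proposition 6.1 describes.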
The intuitive explanation of Proposition 6.1 is that the preemption effect in the access-to-bypass equilibria is stronger than that in the facility-based competition equilibria. When uncertainty is large, the real options effect enhances the preemption effect in the facility-based competition equilibria rather than that in the access-to-bypass equilibria. Therefore, when uncertainty is sufficiently large, we can state the following:

Corollary 6.1. When uncertainty is sufficiently large, a leader enters earlier in the facility-based competition equilibria than in the access-to-bypass equilibria (i.e., $Y_L^* > Y_L$).

Proof. See Appendix.

7. Concluding Remarks

The purpose of this paper was to characterize the access-to-bypass equilibrium. We also examined the effect of open access policy on both the incentives for building network facilities or infrastructures and the degree of competition in network industries. To address these questions, we employed a real options approach, since uncertainty and irreversibility of investment are influential factors when considering investment problems. In the model, we assumed that a bypass construction creates a positive externality through which firms can benefit from an increase in profit. Then, comparing competition under open access policy with facility-based competition, we confirmed that service-based competition causes a follower’s entry to be earlier than is the case in facility-based competition. The effect of open access policy on a leader’s entry depends on several factors in the environment. In particular, we showed that when the level of access charge is high, a leader (i.e., the first entrant) in an open access environment enters earlier than a leader in facility-based competition. However, we also showed a sufficient condition for open access policy to delay a leader’s entry: when the monopoly profit flow is large, a leader in an open access environment enters later than a leader in facility-based competition. These findings can explain a phenomenon that has recently occurred in network industries. For example, in US telecommunications, when the (regulated) access charge was low, the monopoly rent was large, or uncertainty was prevalent, the introduction of cable broadband networks was deterred.

Appendix

The derivation of a single firm’s value function in the case of simultaneous investment

In this appendix, we illustrate the derivation of a single firm’s value function $V_S(Y)$ in the case of simultaneous investment. Equation (1) can be
regarded as an optimal stopping problem. Given an initial position $Y_t = Y$, the firm chooses when to stop delaying the investment so as to maximize its firm value. By solving the problem, we can find the optimal time at which the firm invests in the two types of facilities. If the solution is $t$ (i.e., $T = t$), the firm invests immediately. This optimal stopping problem can be solved by a standard technique in the real options literature (see Dixit and Pindyck (1994) for details). First, we examine the value of the project after the firm has invested (i.e., the value in the stopping region). The value of the project at time $t$, $F_S(Y)$, satisfies the following Bellman equation.

\[ F_S(Y) = Y\Pi(2)\,dt + E\left[F_S(Y + dY)\,e^{-r\,dt}\right] \tag{20} \]

Expanding the right-hand side of (20) using Ito’s Lemma, we have

\[ F_S(Y) = Y\Pi(2)\,dt + \left[\alpha Y F_S'(Y) + \frac{1}{2}\sigma^2 Y^2 F_S''(Y)\right]dt + (1 - r\,dt)\,F_S(Y) + o(dt) \tag{21} \]

where $o(dt)$ represents terms that go to zero faster than $dt$. Dividing both sides of (21) by $dt$ and proceeding to the limit as $dt \to 0$, we obtain the differential equation as follows.

\[ \frac{1}{2}\sigma^2 Y^2 F_S''(Y) + \alpha Y F_S'(Y) - r F_S(Y) + Y\Pi(2) = 0 \tag{22} \]

The general solution of the differential equation, with the project value written as $V(Y)$, is:

\[ V(Y) = A_1 Y^{\beta_1} + A_2 Y^{\beta_2} + \frac{Y\Pi(2)}{r-\alpha} \tag{23} \]

To prevent the value from diverging as $Y$ goes to zero, $A_2$ must be zero, since $\beta_2 < 0$. In addition, to rule out a speculative bubble, $A_1$ must be zero. Therefore, $V(Y) = \frac{Y\Pi(2)}{r-\alpha}$. Once we know the value of the project $V(Y)$, we can obtain the value of the option to invest in the project. In fact, the option to invest $F(Y)$ must satisfy the following differential equation.

\[ \frac{1}{2}\sigma^2 Y^2 F''(Y) + \alpha Y F'(Y) - rF(Y) = 0. \tag{24} \]

The general solution of the differential equation is $F(Y) = B_1 Y^{\beta_1} + B_2 Y^{\beta_2}$. To ensure that $F(Y)$ goes to zero as $Y$ goes to zero, we have $B_2 = 0$. Therefore, $F(Y) = B_1 Y^{\beta_1}$. Finally, we use the value-matching and the smooth-pasting conditions, which mean that at the trigger point $Y^S$,

\[ F(Y^S) = V(Y^S) - (I^p + I^n) \tag{25} \]
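\[ F'(Y^S) = V'(Y^S) \tag{26} \]

Substituting $F(Y) = B_1 Y^{\beta_1}$ and $V(Y) = \frac{Y\Pi(2)}{r-\alpha}$ into (25) and (26) yields $B_1 = (Y^S)^{-\beta_1}\left[\frac{Y^S\Pi(2)}{r-\alpha} - (I^p + I^n)\right]$ and $Y^S = \frac{\beta_1}{\beta_1 - 1}\,\frac{r-\alpha}{\Pi(2)}\,(I^p + I^n)$.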
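The steps of this derivation can be verified with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: it takes $\beta_1$ and $\beta_2$ to be the two roots of the quadratic obtained by substituting $Y^{\beta}$ into (24), and uses hypothetical parameter values. It checks that the power functions solve (24), that the particular solution $Y\Pi(2)/(r-\alpha)$ solves (22), and that the stated $B_1$ and $Y^S$ satisfy (25) and (26).

```python
import math

# Hypothetical parameter values for illustration only.
r, alpha, sigma = 0.05, 0.02, 0.2
pi2, I_p, I_n = 1.0, 1.0, 10.0
I_total = I_p + I_n


def characteristic(b):
    """Left-hand side of (24) for F(Y) = Y**b, divided by Y**b."""
    return 0.5 * sigma ** 2 * b * (b - 1.0) + alpha * b - r


# Roots of the characteristic quadratic (assumed form; standard for GBM).
a = 0.5 - alpha / sigma ** 2
disc = math.sqrt(a * a + 2.0 * r / sigma ** 2)
beta1, beta2 = a + disc, a - disc

# Particular solution V(Y) = Y*pi2/(r - alpha): residual of ODE (22) at Y = 1.
# V'' = 0 and V' = pi2/(r - alpha), so only three terms remain.
Y = 1.0
residual_22 = alpha * Y * pi2 / (r - alpha) - r * Y * pi2 / (r - alpha) + Y * pi2

# Trigger and option coefficient stated after (26).
Y_S = beta1 / (beta1 - 1.0) * (r - alpha) / pi2 * I_total
B1 = Y_S ** (-beta1) * (Y_S * pi2 / (r - alpha) - I_total)

value_matching_gap = B1 * Y_S ** beta1 - (Y_S * pi2 / (r - alpha) - I_total)
smooth_pasting_gap = beta1 * B1 * Y_S ** (beta1 - 1.0) - pi2 / (r - alpha)

print(f"beta1={beta1:.4f}, beta2={beta2:.4f}, Y_S={Y_S:.4f}, B1={B1:.4f}")
```

Both roots make the characteristic expression vanish, with $\beta_1 > 1$ and $\beta_2 < 0$, which is why only the $\beta_1$ term survives the boundary condition at $Y = 0$.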
The proof of Proposition 3.1

Let us define $P^B(Y) \equiv V_L^B(Y) - V_F^B(Y)$. Substituting (4) and (6) into $P^B(Y)$ for $Y < Y^B$, we have: (i) $P^B(0) < 0$; (ii) $P^B(Y^B) = 0$; and (iii) $P^{B\prime}(Y^B) < 0$, which guarantees the existence of $Y_L^B$. In addition, we have $P^{B\prime\prime}(Y) < 0$ for $Y < Y^B$, which guarantees the uniqueness of $Y_L^B$. This trigger point has the properties mentioned in the proposition.

The proof of Proposition 3.2

For $Y < Y^A$, the procedure is the same as in the proof of Proposition 3.1. Let us consider the case where $Y \ge Y^A$. For the uniqueness of the equilibrium, it is sufficient to show that $P^A(Y) \equiv V_L^A(Y) - V_F^A(Y) > 0$ for $Y \ge Y^A$. Substituting (7) and (9) into $P^A(Y)$ for $Y \ge Y^A$, we can ensure that $P^A(Y) > 0$. Hence it is sufficient to show that $P^A(Y^A) > 0$ for the uniqueness. Using (8), we have

\[ P^A(Y^A) = 2\,\frac{v}{r} - I^n \tag{27} \]
Hence, there exists a unique access equilibrium if (10) holds.

Proof of Proposition 4.1

We first define $Q^{AB}(Y) \equiv V_L^{AB}(Y) - V_F^{AB}(Y)$ for each $Y$. To guarantee the existence of the equilibrium and the uniqueness of $Y_L^*$, it is sufficient to ensure that (i) $Q^{AB}(0) = -(I^p + I^n) < 0$, (ii) $Q^{AB}(Y^{B*}) = 0$, (iii) $Q^{AB\prime}(Y^{B*}) < 0$, (iv) $Q^{AB\prime\prime}(Y) < 0$ for $Y \in [Y^{A*}, Y^{B*})$, and (v) $V_L^{AB}(Y^{A*}) > V_F^{AB}(Y^{A*})$. These conditions are similar to those given by Nielsen (2002) and Weeds (2002). Since it is straightforward to show (i) and (ii), we have omitted their proof. Substituting (18) and (17) into $V_L^{AB}(Y)$ and $V_F^{AB}(Y)$ for $Y \in [Y^{A*}, Y^{B*})$, we have

\[ Q^{AB}(Y) = \left[1 - \left(\frac{Y}{Y^{B*}}\right)^{\beta_1}\right]\left(2\,\frac{v}{r} - I^n\right). \tag{28} \]
So, we have

\[ Q^{AB\prime}(Y) = -\beta_1\,\frac{1}{Y^{B*}}\left(\frac{Y}{Y^{B*}}\right)^{\beta_1 - 1}\left(2\,\frac{v}{r} - I^n\right), \tag{29} \]

\[ Q^{AB\prime\prime}(Y) = -\beta_1(\beta_1 - 1)\,\frac{1}{(Y^{B*})^2}\left(\frac{Y}{Y^{B*}}\right)^{\beta_1 - 2}\left(2\,\frac{v}{r} - I^n\right). \tag{30} \]

Evaluating (29) at $Y \to Y^{B*}$ and rearranging it, $Q^{AB\prime}(Y^{B*})$ can be expressed as follows.

\[ Q^{AB\prime}(Y^{B*}) = (\beta_1 - 1)\,\frac{\Delta\Pi(2)}{r-\alpha}\,\frac{2\frac{v}{r} - I^n}{\frac{v}{r} - I^n} \]

As long as the access-to-bypass equilibrium exists, we have $\frac{v}{r} < I^n$, because $I^p + \frac{v}{r} \le \frac{\hat{\Pi}(2)}{\Pi(2)}(I^p + I^n) < I^p + I^n$ from (15). Therefore, $Q^{AB\prime}(Y^{B*}) < 0$ if and only if $2\frac{v}{r} > I^n$. Hence, (iii) is proved. Similarly, under the condition that $2\frac{v}{r} > I^n$, $Q^{AB\prime\prime}(Y) < 0$ and $Q^{AB}(Y^{A*}) = \left[1 - \left(\frac{Y^{A*}}{Y^{B*}}\right)^{\beta_1}\right]\left(2\frac{v}{r} - I^n\right) > 0$. Hence, (iv) and (v) are proved. To sum up, under the conditions that $(I^n/2) < \frac{v}{r} \le \left[\hat{\Pi}(2)/\Pi(2)\right](I^p + I^n) - I^p$, the access-to-bypass equilibrium exists and $Y_L^*$ is unique.

Proof of Proposition 5.1

(i) Two types of equilibrium can emerge in service-based competition, depending on the level of access charge and the degree of positive network externality: one is the access-to-bypass equilibrium and the other is the bypass equilibrium. The bypass equilibrium emerges when a follower adopts the bypass strategy. In fact, the trigger point of a follower in the bypass equilibrium is exactly the same as in the facility-based competition equilibrium, $Y^B$. Hence we need to check the trigger points of a follower only in the access-to-bypass equilibrium. In the access-to-bypass equilibrium, the relationship $Y^{A*} < Y^{B*}$ holds. Then, as shown in the proof of Lemma 4.1, we can confirm that $Y^{A*} < Y^{B*}$ implies $Y^{A*} < Y^B$.

(ii) We show that $Y^B < Y^{B*}$ if and only if $Y^{A*} < Y^{B*}$. This is because

\[
\begin{aligned}
& Y^{A*} \equiv \frac{\beta_1}{\beta_1 - 1}\,\frac{r-\alpha}{\hat{\Pi}(2)}\left(I^p + \frac{v}{r}\right) < \frac{\beta_1}{\beta_1 - 1}\,\frac{r-\alpha}{\Delta\Pi(2)}\left(I^n - \frac{v}{r}\right) \equiv Y^{B*} \\
\Longleftrightarrow\;& \Pi(2)\,(I^p + I^n) - \hat{\Pi}(2)\,(I^p + I^n) < \Pi(2)\left(I^n - \frac{v}{r}\right) \\
\Longleftrightarrow\;& Y^B \equiv \frac{\beta_1}{\beta_1 - 1}\,\frac{r-\alpha}{\Pi(2)}\,(I^p + I^n) < \frac{\beta_1}{\beta_1 - 1}\,\frac{r-\alpha}{\Delta\Pi(2)}\left(I^n - \frac{v}{r}\right) \equiv Y^{B*}.
\end{aligned}
\]
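Therefore, we have $Y^B < Y^{B*}$.

Proof of Proposition 5.3

Substituting (17) and (18) for all $Y \in (0, Y^{A*})$ into $Q^{AB}(Y) \equiv V_L^{AB}(Y) - V_F^{AB}(Y)$ and rearranging it, we have

\[ Q^{AB}(Y) = \frac{Y\Pi(1)}{r-\alpha} - (I^p + I^n) + \left(\frac{Y}{Y^{A*}}\right)^{\beta_1}\left[I^p + 2\,\frac{v}{r} - \frac{Y^{A*}\Pi(1)}{r-\alpha}\right] + \left(\frac{Y}{Y^{B*}}\right)^{\beta_1}\left[I^n - 2\,\frac{v}{r}\right]. \tag{31} \]

To prove the proposition, we check the sign of $Q^{AB}(Y_L)$, where $Y_L$ is the trigger point of the leader in the facility-based competition equilibrium. Note that $Y_L$ is characterized by $V_L^B(Y_L) = V_F^B(Y_L)$, so that we have

\[ \frac{Y_L\Pi(1)}{r-\alpha}\left[1 - \left(\frac{Y_L}{Y^B}\right)^{\beta_1 - 1}\right] + \left(\frac{Y_L}{Y^B}\right)^{\beta_1}\frac{Y^B\Pi(2)}{r-\alpha} - (I^p + I^n) = \left(\frac{Y_L}{Y^B}\right)^{\beta_1}\left[\frac{Y^B\Pi(2)}{r-\alpha} - (I^p + I^n)\right] \]

or

\[ \frac{Y_L\Pi(1)}{r-\alpha} - (I^p + I^n) = \left(\frac{Y_L}{Y^B}\right)^{\beta_1}\left[\frac{Y^B\Pi(1)}{r-\alpha} - (I^p + I^n)\right]. \tag{32} \]

Substituting (32) into $Q^{AB}(Y_L)$ gives $Q^{AB}(Y_L) = (Y_L)^{\beta_1}\chi(x)$, where

\[
\begin{aligned}
\chi(x) \equiv{}& (Y^B)^{-\beta_1}\left[\frac{Y^B\Pi(1)}{r-\alpha} - (I^p + I^n)\right] + (Y^{A*})^{-\beta_1}\left[I^p + 2\,\frac{v}{r} - \frac{Y^{A*}\Pi(1)}{r-\alpha}\right] + (Y^{B*})^{-\beta_1}\left[I^n - 2\,\frac{v}{r}\right] \\
={}& (Y^B)^{-\beta_1}\left[\frac{\beta_1}{\beta_1 - 1}\,\frac{\Pi(1)}{\Pi(2)}\,(I^p + I^n) - (I^p + I^n)\right] + (Y^{A*})^{-\beta_1}\left[I^p + 2\,\frac{v}{r} - \frac{\beta_1}{\beta_1 - 1}\,\frac{\Pi(1)}{\hat{\Pi}(2)}\left(I^p + \frac{v}{r}\right)\right] \\
& + (Y^{B*})^{-\beta_1}\left[I^n - 2\,\frac{v}{r}\right]
\end{aligned}
\tag{33}
\]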
and $x \equiv \left(v, \Pi(1), \Pi(2), \hat{\Pi}(2), I^p, I^n\right)$. Hence $Q^{AB}(Y_L) < 0$ if $\chi(x) < 0$. Note that the term in the brackets of the third term of (33) is negative. Hence, $\chi(x) < 0$ if

\[
\begin{aligned}
& (Y^B)^{-\beta_1}\left[\frac{\beta_1}{\beta_1 - 1}\,\frac{\Pi(1)}{\Pi(2)}\,(I^p + I^n) - (I^p + I^n)\right] + (Y^{A*})^{-\beta_1}\left[I^p + 2\,\frac{v}{r} - \frac{\beta_1}{\beta_1 - 1}\,\frac{\Pi(1)}{\hat{\Pi}(2)}\left(I^p + \frac{v}{r}\right)\right] \\
&\quad = (Y^B)^{-\beta_1}\left[j\,\Pi(1) - k\right] + (Y^{A*})^{-\beta_1}\left[l - m\,\Pi(1)\right] < 0
\end{aligned}
\tag{34}
\]

where $j \equiv \frac{\beta_1}{\beta_1 - 1}\,\frac{I^p + I^n}{\Pi(2)}$, $k \equiv (I^p + I^n)$, $l \equiv I^p + 2\frac{v}{r}$ and $m \equiv \frac{\beta_1}{\beta_1 - 1}\,\frac{I^p + (v/r)}{\hat{\Pi}(2)}$. Let us show that (34) holds if $\Pi(1)$ is sufficiently large. Define $\Gamma[\Pi(1)] \equiv [m\Pi(1) - l]/[j\Pi(1) - k]$. Note that Lemma 4.1 ensures $j > m$ as long as the access-to-bypass equilibrium exists, and $l > k$ under the conditions of Proposition 4.1. Hence,

\[ \frac{d\Gamma[\Pi(1)]}{d\Pi(1)} = \frac{jl - km}{\left[j\Pi(1) - k\right]^2} > 0. \tag{35} \]

Equation (35) shows that $\Gamma[\Pi(1)]$ is an increasing function of $\Pi(1)$ and that $\Gamma[\Pi(1)]$ monotonically converges to $m/j$ as $\Pi(1)$ goes to infinity. Let us compare $\left(Y^{A*}/Y^B\right)^{\beta_1}$ with $\Gamma[\Pi(1)]$. Since $Y^{A*}/Y^B = m/j < 1$ and $\beta_1 > 1$, there exists a threshold of $\Pi(1)$ above which $(m/j)^{\beta_1} < \Gamma[\Pi(1)]$, or

\[ \left(\frac{Y^{A*}}{Y^B}\right)^{\beta_1} < \frac{m\Pi(1) - l}{j\Pi(1) - k}. \tag{36} \]

Since $j\Pi(1) - k > 0$, we have (34). Therefore, if $\Pi(1)$ is sufficiently large, a leader enters earlier in the facility-based competition equilibrium than in the access-to-bypass equilibrium.

Proof of Proposition 5.4

Note that the term in the brackets of the first term of (33) is positive, while that in the brackets of the third term is negative. Suppose

\[ I^p + 2\,\frac{v}{r} - \frac{\beta_1}{\beta_1 - 1}\,\frac{\Pi(1)}{\hat{\Pi}(2)}\left(I^p + \frac{v}{r}\right) > 0. \tag{37} \]
Then, because $Y^{A*} < Y^B < Y^{B*}$, we obtain

\[
\begin{aligned}
\chi(x) >{}& (Y^{B*})^{-\beta_1}\left\{\left[\frac{\beta_1}{\beta_1 - 1}\,\frac{\Pi(1)}{\Pi(2)}\,(I^p + I^n) - (I^p + I^n)\right] + \left[I^p + 2\,\frac{v}{r} - \frac{\beta_1}{\beta_1 - 1}\,\frac{\Pi(1)}{\hat{\Pi}(2)}\left(I^p + \frac{v}{r}\right)\right] + \left[I^n - 2\,\frac{v}{r}\right]\right\} \\
={}& (Y^{B*})^{-\beta_1}\,\frac{\beta_1}{\beta_1 - 1}\,\Pi(1)\left[\frac{I^p + I^n}{\Pi(2)} - \frac{I^p + (v/r)}{\hat{\Pi}(2)}\right].
\end{aligned}
\tag{38}
\]

The right-hand side of (38) is positive because of (15). This means that (37) is a sufficient condition for $Y_L^* < Y_L$.

Proof of Corollary 6.1

As in the proof of Proposition 5.3, we show that $\chi(x) < 0$ if $\sigma$ is sufficiently large. Remember that $Y^{A*}/Y^B = m/j = \left[\Pi(2)\,(I^p + (v/r))\right]/\left[\hat{\Pi}(2)\,(I^p + I^n)\right]$. On the other hand, $\Gamma[\Pi(1)] \equiv [m\Pi(1) - l]/[j\Pi(1) - k] = \left[B\,\Pi(1)\,(I^p + (v/r))/\hat{\Pi}(2) - (I^p + 2v/r)\right]/\left[B\,\Pi(1)\,(I^p + I^n)/\Pi(2) - (I^p + I^n)\right]$, where $B \equiv \beta/(\beta - 1)$. Then, as $\sigma \to +\infty$, $\beta \to 1$ and $B \to +\infty$. Hence,

\[ \lim_{\sigma \to +\infty}\left(\frac{Y^{A*}}{Y^B}\right)^{\beta} = \frac{\Pi(2)\,(I^p + (v/r))}{\hat{\Pi}(2)\,(I^p + I^n)} = \lim_{\sigma \to +\infty}\Gamma[\Pi(1)]. \]

This means that the first term and the second term of $\chi(x)$ cancel out when $\sigma$ is sufficiently large. Since the third term of $\chi(x)$ is always negative and approaches zero, we can state that $\chi(x) < 0$ if $\sigma$ is sufficiently large.

References

1. Biglaiser, G. and M. Riordan (2000), “Dynamic Price Regulation”, Rand Journal of Economics, 31, 744–767.
2. Bourreau, M. and P. Doğan (2005), “Unbundling the Local Loop”, European Economic Review, 49, 173–199.
3. Dixit, A. K. and R. S. Pindyck (1994), Investment under Uncertainty, Princeton, NJ: Princeton University Press.
4. Fudenberg, D. and J. Tirole (1985), “Preemption and Rent Equalisation in the Adoption of New Technology”, Review of Economic Studies, 52, 383–401.
5. Gans, J. S. (2001), “Regulating Private Infrastructure Investment: Optimal Pricing for Access to Essential Facilities”, Journal of Regulatory Economics, 20, 167–189.
6. Gans, J. S. and P. L. Williams (1999), “Access Regulation and the Timing of Infrastructure Investment”, Economic Record, 75, 127–137.
7. Hori, K. and K. Mizuno (2006a), “Access Pricing and Investment with Stochastically Growing Demand”, International Journal of Industrial Organization, 24, 795–808.
8. Hori, K. and K. Mizuno (2006b), “Competition schemes and investment in network infrastructure under uncertainty”, mimeo.
9. Katz, M. L. and C. Shapiro (1987), “R&D Rivalry with Licensing or Imitation”, American Economic Review, 77, 402–429.
10. Mason, R. and H. Weeds (2003), “Can Greater Uncertainty Hasten Investment?”, mimeo.
11. Nielsen, M. J. (2002), “Competition and Irreversible Investments”, International Journal of Industrial Organization, 20, 731–743.
12. Sidak, G. and D. Spulber (1997), Deregulatory Takings and the Regulatory Contract, Cambridge: Cambridge University Press.
13. Valletti, T. M. (2003), “The Theory of Access Pricing and its Linkage with Investment Incentives”, Telecommunications Policy, 27, 659–675.
14. Weeds, H. (2002), “Strategic Delay in a Real Options Model of R&D Competition”, Review of Economic Studies, 69, 729–747.
December 26, 2006 11:2
Proceedings Trim Size: 9in x 6in
ritsumei01n+
The Investment Game under Uncertainty: An Analysis of Equilibrium Values in the Presence of First or Second Mover Advantage∗

Junichi Imai1 and Takahiro Watanabe2

1 Graduate School of Economics and Management, Tohoku University
2 Faculty of Urban Liberal Arts, Tokyo Metropolitan University
In this paper we develop a valuation model for two competing firms that face irreversible investment decisions under demand uncertainty. We propose a numerical procedure to derive both project values and equilibrium strategies in the duopolistic environment. In numerical examples we consider two different economic conditions, which are labelled first mover advantage and second mover advantage, and examine the effects of these conditions on the equilibrium strategies as well as the project values. We show that these conditions cause significant changes in the equilibrium strategies of both firms.

JEL: G31, C61, C73
Key words: Real option, Investment game, Equilibrium strategy, Project valuation
∗ Both authors acknowledge research support from a Grant-in-Aid for Scientific Research of the Japan Society for the Promotion of Science, 17510128.

1. Introduction

The real option approach to valuing real assets or projects has been playing an important role in the fields of corporate finance and financial economics. It has been proposed as a useful concept for analyzing a strategic investment. The usual real option analysis is implicitly based on the assumption that the underlying risk is exogenous and that management cannot affect the underlying stochastic process. This assumption is appropriate if a firm is in a nearly perfectly competitive market or if it has monopolistic power over the market of the project. However, management in the real world
should consider rival firms’ behavior in an imperfectly competitive market, because one firm’s action affects other firms’ decisions and vice versa. Therefore, we should consider strategic interaction among competing firms. Several recent studies focus on integrating real option analysis with game theory to reflect this interaction.

In this paper we develop a valuation model that can derive the project values and the equilibrium strategies when two competing firms face irreversible investment decisions under uncertainty. Suppose each firm operates an existing project that generates a cash flow stream. Each firm has an option to reinvest in the project so that the marginal cash flow of the project is increased. The cash flow stream from the project depends on the level of demand, which follows a continuous-time stochastic process, and on the investment decisions of both firms. Each firm makes an investment decision to maximize its project value. We approximate the underlying continuous-time process with a discrete-time lattice process and propose a numerical procedure to derive the project values under equilibrium strategies. In numerical examples we derive the project values and analyze the strategic behavior of the competing firms in equilibrium under different economic situations.

Recent research that integrates the real option approach with game theory can be classified into two categories. The models in the first category assume that the underlying variable of the project follows a one-period or two-period process. Smit and Ankum (1993), Kulatilaka and Perotti (1998), and Smit and Trigeorgis (2001) are in the first category. These studies give us intuitive ideas about the effect of competition under uncertainty and clarify the strategic behavior of firms under competition. However, several strong assumptions are contained in the models. In particular, the underlying process is too simple and the number of decision opportunities is strictly restricted. As a result, these models are not suitable for a quantitative analysis. On the other hand, the models in the second category often assume that the underlying variable follows a continuous-time stochastic process. Smets (1991) and Dixit and Pindyck (1994) are examples of the earliest studies in this category. The models are constructed by extending a deterministic model proposed by Fudenberg and Tirole (1985) to a stochastic dynamic process. Grenadier (1996) applies this model to an analysis of land development projects, while Huisman (2001) refines it to analyze the strategic behavior of firms. Other studies in this category are Hoppe (2000), Lambrecht (2000), Grenadier (2002), Huisman and Kort (2003) and Huisman and Kort (2004). Although the models in the second category can derive valuable implications about strategic investments under uncertainty and competition,
several assumptions in these models seem too naive for practical use in evaluating real projects. First, the assumption that the project has an infinite horizon is not always realistic, and the models do not capture the effect of maturity on the project value. Secondly, most studies assume that the underlying variable follows a geometric Brownian motion.¹ If we assume a different underlying process or relax the infinite-horizon assumption, it is not easy to derive a closed-form solution of the problem. Accordingly, a numerical approach is valuable in this case. The third assumption concerns the continuity of the decision opportunities. Most studies implicitly assume that management can make investment decisions continuously. It is observed, however, that these decision opportunities do not always come continuously even if the underlying risk changes continuously. In many actual investment projects they come once a week or once a month, sometimes once a quarter. Therefore, it is necessary to construct a model that distinguishes the continuity of the underlying process from the discrete decision opportunities.

The purposes of this paper are (1) to derive a numerical procedure to evaluate an investment project under demand uncertainty and competition, and (2) to analyze the project values and the equilibrium strategies under two competitive environments, which are called First Mover Advantage (FMA) and Second Mover Advantage (SMA). For the first purpose we develop a discrete-time lattice model and integrate the real option approach with game theory. It is well known that a lattice process can approximate a continuous-time stochastic process if the parameter values are adequately adjusted. Furthermore, this model can be easily extended to deal with a finite number of decision opportunities under a wider class of continuous-time underlying processes.
¹ Kijima and Shibata (2002) consider the stochastic volatility process.

Our model is located between the two categories stated above; namely, our model can assume a continuous-time underlying process with a finite number of decision opportunities, which enables us to derive more realistic results and implications. Although our numerical approach cannot obtain analytical solutions, we can obtain the current project value as a function of both the current demand and the time to the project maturity, which has not been analyzed by past studies.

Imai and Watanabe (2006b) propose a similar numerical procedure. In their model, however, the two firms are assumed to be asymmetric and make their investment decisions sequentially: one firm makes a decision first at each period and the other firm does so after observing the rival’s decision. This corresponds to a game of perfect information in game theory. The game is simple because a multiple equilibria problem
never occurs and it can be solved by simple backward induction. Research on symmetric and simultaneous decisions is also necessary because it is a more fundamental setting from a theoretical viewpoint. It also enables us to compare our results with those of existing studies such as Smets (1991), Dixit and Pindyck (1994), Grenadier (1996) and Huisman (2001). Therefore, in our model, both firms make investment decisions simultaneously at each decision opportunity.

Since multiple equilibria could emerge in the simultaneous decision case, we introduce the following two criteria to select an equilibrium. In the first criterion, we distinguish the two competitive firms before starting the game, labelling them firm L and firm F. Then we assume that firm L has a competitive advantage over firm F in the sense that, if there exist multiple equilibria, the equilibrium that maximizes the project payoff of firm L is selected. We call it the firm L advantage criterion (LAC) for convenience. Although LAC looks rather ad hoc, Imai and Watanabe (2006a) point out that LAC is consistent with the theory of equilibrium selection in game theory developed by Harsanyi and Selten (1988) if we suppose that the cash flow of firm L is infinitesimally greater than that of firm F after both firms’ investment. In the second criterion, we assume that each equilibrium is selected with equal probability when there exist multiple equilibria. This criterion is consistent with the previous studies of Grenadier (1996) and Huisman (2001), which examine a preemption game under the assumption of FMA. The criterion is exogenously assumed in our paper, but Fudenberg and Tirole (1985) derive the same criterion endogenously with a more sophisticated model. In this paper we call it the 50% criterion.

The second purpose of this paper is motivated as follows.
Recent studies concerning the integration of the real option approach with game theory mainly focus on investments of two competitive firms under the condition of FMA. FMA is a condition in which the increment of the marginal cash flow of a firm that invests earlier is greater than that of the other firm that invests later. In this paper a firm that invests in the project earlier is labelled the first mover, while a firm that invests after the first mover is labelled the second mover. It is known that the condition of FMA causes a preemption game because each firm wants to invest earlier than the rival firm in order to prevent the rival firm’s investment and to enjoy the monopolistic revenue from the investment. In other words, both firms want to become the first mover of the game. FMA is an appropriate condition to describe preemptive competition under uncertainty, such as land developments and oil refinery projects. However, some competitive situations cannot be captured in the context of FMA. For example, consider an investment opportunity to enter a new
product market. The leader firm, which intends to launch a new product, often needs to invest in research and development. Marketing costs are also necessary to promote the new product and create a new market. On the other hand, the follower firm, which intends to enter the market after the leader, can learn from the leader’s experience. The follower can observe the market development and decide to invest after the market has matured. Furthermore, the follower has to pay less for market creation. To describe this situation SMA is a more appropriate condition, where the increment of the cash flow of the second mover is larger than that of the first mover. In this paper, we examine the equilibrium strategy of the competitive firms under the condition of SMA as well as that of FMA.

In numerical examples we analyze comparative statics under the conditions of both FMA and SMA with respect to the initial demand and examine the effect of these two conditions on the equilibrium strategies as well as the project values. We observe that a preemption game emerges under the condition of FMA, which leads to multiple equilibria. As a result, an asymmetric equilibrium where only firm L invests first is chosen under LAC. In that case the project value of firm L, which is the first mover, is larger than that of firm F. However, we can conclude that the project value of firm L is not always larger than the project values achieved by both firms in coordination. This is because the firms tend to invest earlier than the optimal timing of the firm that is exogenously assigned to be the first mover. The equilibrium strategies are always symmetric under the condition of SMA. Both firms can make the best use of the flexibility to defer the investment even in the presence of competition. As a result the project values of both firms are equivalent to the coordinated project value.

This paper is organized as follows. In section 2 we develop a valuation model.
Numerical analyses are done in section 3. Finally, concluding remarks are in section 4.

2. Model Description

2.1 A Basic Model

This section provides a valuation model of the project under demand uncertainty and competition. The model is based on Dixit and Pindyck (1994), Grenadier (1996) and Huisman (2001), which can be regarded as typical models for project valuation under uncertainty and competition. Two firms are introduced, denoted by firm L and firm F. Each firm operates a project and has an option to reinvest in the project, which increases the firm’s cash flow, by paying the investment cost I. In the model we assume that the future demand is uncertain and follows a continuous-time stochastic process. Let Y(t) denote the realized demand
at time t, which follows

\[
dY(t) = \mu(t, Y(t))\,dt + \sigma(t, Y(t))\,dz_P, \tag{1}
\]
where $\mu(t, Y(t))$ and $\sigma(t, Y(t))$ are the instantaneous drift and volatility, respectively, and $dz_P$ is the increment of a Wiener process under the probability measure $P$.

As explained in the introduction, our model is built on a lattice process that converges to the corresponding continuous-time process as the number of periods tends to infinity. In addition, the lattice model can deal with a finite number of decision opportunities under the continuous-time stochastic process, which enables us to distinguish decision opportunities from the times at which the demand changes. Imai and Watanabe (2006b) analyze the effect of discrete decision opportunities thoroughly. In this paper we do not focus on this aspect and simply assume that the decision opportunities come continuously, that is, at every period of the lattice.

We assume that the projects of both firms run over a finite horizon and end at a future time T, which is called the maturity of the project. Thus, each firm can choose the time of its investment under demand uncertainty until the maturity of the project. To construct the lattice model, we divide the time interval [0, T] into M periods of length $\Delta t = T/M$. The demand Y(t) changes at times $t = m\Delta t$ for $m = 0, \ldots, M$, and we simply denote the demand $Y(m\Delta t)$ at period m by Y(m). Note that M is assumed to be sufficiently large for the approximation.

Each firm can invest in the project at most once over the M + 1 decision opportunities and can never invest again after the investment. Thus, there are two states for each firm with regard to the firm’s decision. Let $x_i(m)$ denote the state of firm i (i = L, F) at each period $m = 0, \ldots, M$: $x_i(m) = 0$ represents the state where firm i has not invested in the project by period m, while $x_i(m) = 1$ represents the state where firm i has already invested. Note that $x_i(m)$ stands for the state of firm i after its investment decision at period m.
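The discretization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the demand is taken to follow a geometric Brownian motion (as in the numerical examples of Section 3.1), and the trinomial probabilities follow a Kamrad-Ritchken-style parameterization with stretch parameter `lam`, which is our choice; the paper does not say which trinomial scheme it uses. The function names are hypothetical.

```python
import math


def trinomial_parameters(r, sigma, T, M, lam=math.sqrt(3.0)):
    """Return (dt, u, p_up, p_mid, p_down) for a recombining trinomial lattice.

    Assumption: Kamrad-Ritchken-style scheme for a geometric Brownian motion
    with risk-neutral drift r and volatility sigma. The paper does not
    specify its scheme, so this is one standard possibility.
    """
    dt = T / M                                  # period length, dt = T / M
    u = math.exp(lam * sigma * math.sqrt(dt))   # up factor; down factor is 1 / u
    nu = r - 0.5 * sigma ** 2                   # drift of log-demand
    shift = nu * math.sqrt(dt) / (2.0 * lam * sigma)
    p_up = 1.0 / (2.0 * lam ** 2) + shift
    p_mid = 1.0 - 1.0 / lam ** 2
    p_down = 1.0 / (2.0 * lam ** 2) - shift
    return dt, u, p_up, p_mid, p_down


def demand_nodes(y0, u, m):
    """Demand levels Y(m) reachable at period m: y0 * u**k for k = -m, ..., m."""
    return [y0 * u ** k for k in range(-m, m + 1)]


# Parameters quoted in Section 3.1: r = 3%, sigma = 30%, T = 1, M = 1000.
dt, u, p_up, p_mid, p_down = trinomial_parameters(r=0.03, sigma=0.30, T=1.0, M=1000)
```

With this parameterization the three probabilities sum to one and the expected log-increment per period equals $(r - \sigma^2/2)\Delta t$, which is what lets the lattice approximate the continuous-time process as M grows.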
Figure 1. Marginal cash flow of the project. Each cell shows (cash flow of firm L, cash flow of firm F) per unit of demand and unit of period.

| | Firm F: Invest | Firm F: Not invest |
| Firm L: Invest | $(D_{11}, D_{11})$ | $(D_{10}, D_{01})$ |
| Firm L: Not invest | $(D_{01}, D_{10})$ | $(D_{00}, D_{00})$ |

When both firms have already invested in the project, the cash flow obtained by each firm at each period is given by $D_{11} Y(m)\Delta t$. It is given by $D_{00} Y(m)\Delta t$ when neither firm has invested yet. When only firm L has invested, the cash flow obtained by firm L is $D_{10} Y(m)\Delta t$ and the cash flow obtained by firm F is $D_{01} Y(m)\Delta t$. On the other hand, when only firm F has invested in the project, the cash flows obtained by firm L and firm F are $D_{01} Y(m)\Delta t$ and $D_{10} Y(m)\Delta t$, respectively.

There are four possible pairs of states in total, which are denoted by $(x_L(m), x_F(m))$. At the m-th decision opportunity for each $m = 0, \ldots, M$, when both firms have already invested (i.e., $(x_L(m-1), x_F(m-1)) = (1, 1)$), neither firm makes any decision. When one firm has already invested but the other has not (i.e., $(x_L(m-1), x_F(m-1)) = (0, 1)$ or $(1, 0)$), only the firm that has not invested makes a decision. Finally, when neither firm has invested (i.e., $(x_L(m-1), x_F(m-1)) = (0, 0)$), both firms make their investment decisions. In this case, we assume that both firms determine their strategies simultaneously.

The marginal cash flow obtained by each firm at each period depends on the following two variables. The first one is Y(m), the realized demand at period m. The second variable is a pair of the states of both firms about
the investment, $(x_L(m), x_F(m))$. Let $D_{x_i(m) x_j(m)}$ denote the cash flow per unit of demand of firm i, where j is the rival firm of firm i. The cash flow per unit of demand and unit of period is illustrated in Fig. 1.

In the usual real option analysis the no-arbitrage principle is often assumed for the valuation. It is difficult, however, to apply this principle to our model since the demand for the merchandise cannot be observed in the market. Alternatively, Cox and Ross (1976), Constantinides (1978), and McDonald and Siegel (1984) propose the equilibrium approach to option pricing. In this paper, we apply the equilibrium approach and assume that the demand risk is private (unsystematic) risk that is independent of the market risk. Since an investor requires no risk premium for unsystematic risk in equilibrium, we can assume that firms are risk neutral in the valuation model. Let r denote the risk-free rate.

Now, we develop the valuation model from a game-theoretic perspective. First, we define a strategy of each firm. Whether each firm has invested or not by each period, denoted by 1 or 0, respectively, is called an action of the firm. We adopt the Markov subgame perfect equilibrium as the solution concept. This means that each firm’s action depends only on the current demand and the pair of states, not on the history of actions and demands. Firm i’s strategy $s_i$ for i = L, F is defined as a list of actions at each period for any current demand and any state of both firms. $s_i(m, x_j(m-1), Y(m))$ is firm i’s strategy at period m when the realized demand is Y(m) and the previous state of the rival firm j is $x_j(m-1)$. $s_i(m, x_j(m-1), Y(m)) = 1$ indicates that firm i has invested by period m, while $s_i(m, x_j(m-1), Y(m)) = 0$ indicates that firm i has not invested yet.

Now we define the present value of the project on the condition that
the future demand is given at period m. We introduce additional notation to define it. Let $Y^m = (Y(m+1), \ldots, Y(M))$ be the sequence of random variables for future demands after period m. Suppose that $s_L$, $s_F$ and $Y^m$ are given for a period m. Then the state of each firm i for any period $l \ge m$ is specified; it is denoted by $\hat{x}_i(l, s_i, s_j, x_i(m-1), x_j(m-1), Y^m)$. In addition, whether firm i invests in the project exactly at period l for $l \ge m$ is given by the indicator function $1_i$, which is formally written as

\[
1_i(l, s_i, s_j, x_i(m-1), x_j(m-1), Y^m) =
\begin{cases}
1 & \text{if } \hat{x}_i(l-1, s_i, s_j, x_i(m-1), x_j(m-1), Y^m) = 0 \\
  & \text{and } \hat{x}_i(l, s_i, s_j, x_i(m-1), x_j(m-1), Y^m) = 1, \\
0 & \text{otherwise.}
\end{cases}
\]

Now, let $u_i(m, s_i, s_j, x_i(m-1), x_j(m-1), Y^m)$ be the present value of the project of firm i. We set the cash flow obtained by each firm i at maturity M to zero, i.e., $u_i(M, s_i, s_j, x_i(M-1), x_j(M-1), Y^M) \equiv 0$. The project value at period m ($0 \le m \le M-1$) is defined by

\[
u_i(m, s_i, s_j, x_i(m-1), x_j(m-1), Y^m)
= \sum_{l=m}^{M-1} e^{-r\Delta t (l-m)} \left\{ D_{\hat{x}_i \hat{x}_j} Y(l)\Delta t - I\, 1_i(l, s_i, s_j, x_i(m-1), x_j(m-1), Y^m) \right\},
\]

where $\hat{x}_i$ and $\hat{x}_j$ are precisely written as

\[
\hat{x}_i \equiv \hat{x}_i(l, s_i, s_j, x_i(m-1), x_j(m-1), Y^m), \qquad
\hat{x}_j \equiv \hat{x}_j(l, s_j, s_i, x_j(m-1), x_i(m-1), Y^m).
\]

Next, we define the expected value of the project at the m-th decision opportunity, which is the expected cash flow after period m for a given strategy profile. Firm i’s expected value at the m-th decision opportunity for each $m = 0, \ldots, M$, denoted by $U_i(m, s_i, s_j, x_i(m-1), x_j(m-1), y)$, is given by

\[
U_i(m, s_i, s_j, x_i(m-1), x_j(m-1), y) = E_Q\left[ u_i(m, s_i, s_j, x_i(m-1), x_j(m-1), Y^m) \mid Y(m) = y \right],
\]

where $E_Q[\cdot]$ represents an expectation under the risk-neutral probability measure $Q$.

Finally, we define the concept of equilibrium. A pair of strategies $(s_L^*, s_F^*)$ is said to be an equilibrium if and only if for any firm i = L, F, any $m = 0, \ldots, M$, any current demand y, and any states of both firms at the previous period m − 1, $x_i(m-1)$ and $x_j(m-1)$, $s_i^*$ satisfies

\[
U_i(m, s_i^*, s_j^*, x_i(m-1), x_j(m-1), y) \ge U_i(m, s_i, s_j^*, x_i(m-1), x_j(m-1), y) \tag{2}
\]

for any strategy $s_i$.
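As a concrete reading of the definition of $u_i$, the following Python sketch evaluates the discounted cash flow of one firm along a single demand path when the investment periods of both firms are fixed in advance. It is a simplified illustration, not the paper's procedure: strategies are replaced by given investment periods, and the function name, the flat demand path and the value I = 5 are hypothetical. The cash-flow coefficients are the FMA values quoted in Section 3.2.

```python
import math


def path_value(D, demand, invest_i, invest_j, I, r, dt, m=0):
    """Discounted cash flow u_i of firm i from period m along one demand path.

    D        -- dict mapping (x_i, x_j) to the cash flow per unit of demand
    demand   -- list Y(0), ..., Y(M); no cash flow is earned at maturity M
    invest_i -- period at which firm i invests, or None if it never invests
    invest_j -- same for the rival firm j
    """
    M = len(demand) - 1
    total = 0.0
    for l in range(m, M):
        # States after the decisions at period l (1 = has invested by period l).
        x_i = 1 if invest_i is not None and invest_i <= l else 0
        x_j = 1 if invest_j is not None and invest_j <= l else 0
        cash = D[(x_i, x_j)] * demand[l] * dt
        cost = I if invest_i == l else 0.0   # the indicator 1_i(l, ...)
        total += math.exp(-r * dt * (l - m)) * (cash - cost)
    return total


# Cash-flow coefficients of Section 3.2 (FMA): D10 = 8, D11 = 4, D00 = 3, D01 = 0.
D = {(1, 0): 8.0, (1, 1): 4.0, (0, 0): 3.0, (0, 1): 0.0}

# Hypothetical path: flat demand of 10 over three periods, no discounting.
value = path_value(D, demand=[10.0, 10.0, 10.0, 10.0],
                   invest_i=1, invest_j=2, I=5.0, r=0.0, dt=1.0)
```

In this toy path firm i earns 30 before investing, 80 − 5 in the period where it invests alone, and 40 once the rival has also invested, so `value` is 145. Taking the expectation of such path values under $Q$ gives $U_i$.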
2.2 Derivation of the Project Values

One of our main interests is to derive the project values of both firms in equilibrium. We defined an equilibrium in the previous subsection by Eq. (2). This definition, however, is not in a tractable form, so we cannot derive the equilibrium strategies from it directly. In this subsection we develop a dynamic programming procedure to obtain the project values of both firms under equilibrium strategies. The recursive procedure is constructed by induction.

Let $V_L^{(x_L(m-1), x_F(m-1))}(m, y)$ and $V_F^{(x_L(m-1), x_F(m-1))}(m, y)$ denote the project values of firm L and firm F, respectively. The value functions depend on period m, the current demand Y(m) = y, and the states of both firms $x_L(m-1)$ and $x_F(m-1)$ at period m − 1, assuming that both firms follow equilibrium strategies after period m. Let $s_L^*$ and $s_F^*$ denote the equilibrium strategies of both firms. Then, the project values are formally written as

\[
V_i^{(x_L(m-1), x_F(m-1))}(m, y) = U_i(m, s_i^*, s_j^*, x_i(m-1), x_j(m-1), y), \qquad i = L, F.
\]
We set the cash flows obtained by both firms at the maturity period M to zero, i.e., $V_L^{(x_L(M-1), x_F(M-1))}(M, y) = V_F^{(x_L(M-1), x_F(M-1))}(M, y) = 0$ for any demand y and any states $(x_L(M-1), x_F(M-1))$. $V_L^{(x_L(m-1), x_F(m-1))}(m, y)$ and $V_F^{(x_L(m-1), x_F(m-1))}(m, y)$ are obtained by the following recursive procedure for $m = 0, \ldots, M-1$. Let $v_i^{(x_L(m), x_F(m))}(m, y)$ denote the project value (not including the investment cost) after the decisions of both firms at period m, assuming that both firms follow equilibrium strategies after period m + 1.

First, we consider the values $V_L^{(1,1)}(m, y)$ and $V_F^{(1,1)}(m, y)$, which are easy to compute because both firms have already invested and there is no decision to make at period m. Thus, the project values can be derived by adding the discounted expected values of the next period to the current cash flow:

\[
V_i^{(1,1)}(m, y) = v_i^{(1,1)}(m, y) \tag{3}
\]

for any i = L, F, where

\[
v_i^{(1,1)}(m, y) = D_{11}\, y \Delta t + e^{-r\Delta t} E_Q\left[ V_i^{(1,1)}(m+1, Y(m+1)) \mid Y(m) = y \right].
\]

Next, consider the project value $V_F^{(1,0)}(m, y)$. In this case firm L has already invested in the project and firm F can make an investment decision
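solely. If firm F decides to invest in the project at period m, the project value can be written as $v_F^{(1,1)}(m, y) - I$. On the other hand, if firm F decides not to invest at period m, the project value is equal to $v_F^{(1,0)}(m, y)$, where

\[
v_F^{(1,0)}(m, y) = D_{01}\, y \Delta t + e^{-r\Delta t} E_Q\left[ V_F^{(1,0)}(m+1, Y(m+1)) \mid Y(m) = y \right].
\]

Firm F makes the optimal decision to maximize its project value at period m. Thus, the following equation is satisfied:

\[
V_F^{(1,0)}(m, y) = \max\left\{ v_F^{(1,1)}(m, y) - I,\; v_F^{(1,0)}(m, y) \right\}.
\]

Next, we consider the project value $V_L^{(1,0)}(m, y)$. Firm L, which has already invested in the project, has no decision to make at period m, but its project value depends on firm F’s decision. Thus, the following equations hold:

\[
V_L^{(1,0)}(m, y) =
\begin{cases}
v_L^{(1,1)}(m, y) & \text{if firm F invests at period } m, \\
v_L^{(1,0)}(m, y) & \text{otherwise,}
\end{cases}
\]

where

\[
v_L^{(1,0)}(m, y) = D_{10}\, y \Delta t + e^{-r\Delta t} E_Q\left[ V_L^{(1,0)}(m+1, Y(m+1)) \mid Y(m) = y \right].
\]

The project values in state (0, 1), where only firm F has invested, are obtained in the same way with the roles of the two firms exchanged, where

\[
v_F^{(0,1)}(m, y) = D_{10}\, y \Delta t + e^{-r\Delta t} E_Q\left[ V_F^{(0,1)}(m+1, Y(m+1)) \mid Y(m) = y \right].
\]

Figure 2. The payoff matrix in case of simultaneous decision.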
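The recursion for the states (1, 1) and (1, 0) can be sketched as a backward induction on a trinomial lattice. This is a minimal sketch under assumptions, not the authors' implementation: the demand is a geometric Brownian motion, the lattice uses a Kamrad-Ritchken-style parameterization (the paper does not specify its scheme), and the function name, the initial demand and the investment cost I = 100 are hypothetical. It returns $V^{(1,1)}(0, y_0)$ and firm F's value $V_F^{(1,0)}(0, y_0) = \max\{v_F^{(1,1)} - I,\ v_F^{(1,0)}\}$.

```python
import math


def follower_values(y0, D11, D01, I, r, sigma, T, M, lam=math.sqrt(3.0)):
    """Backward induction for V^(1,1)(0, y0) and V_F^(1,0)(0, y0).

    Assumptions (not taken from the paper): geometric Brownian motion demand
    and a Kamrad-Ritchken-style trinomial lattice with stretch parameter lam.
    """
    dt = T / M
    u = math.exp(lam * sigma * math.sqrt(dt))
    shift = (r - 0.5 * sigma ** 2) * math.sqrt(dt) / (2.0 * lam * sigma)
    p_up = 1.0 / (2.0 * lam ** 2) + shift
    p_mid = 1.0 - 1.0 / lam ** 2
    p_down = 1.0 / (2.0 * lam ** 2) - shift
    disc = math.exp(-r * dt)

    # Values at maturity M are zero in every state.
    V11 = [0.0] * (2 * M + 1)
    VF10 = [0.0] * (2 * M + 1)

    for m in range(M - 1, -1, -1):
        new11, new10 = [], []
        for j in range(2 * m + 1):
            y = y0 * u ** (j - m)   # node j at period m has demand y0 * u**(j - m)
            # Children of node j at period m + 1 sit at indices j, j + 1, j + 2.
            cont11 = disc * (p_down * V11[j] + p_mid * V11[j + 1] + p_up * V11[j + 2])
            cont10 = disc * (p_down * VF10[j] + p_mid * VF10[j + 1] + p_up * VF10[j + 2])
            v11 = D11 * y * dt + cont11   # both firms have invested
            v10 = D01 * y * dt + cont10   # firm F still waits
            new11.append(v11)
            new10.append(max(v11 - I, v10))   # firm F's optimal decision
        V11, VF10 = new11, new10

    return V11[0], VF10[0]


# D11 = 4 and D01 = 0 are the FMA values of Section 3.2; y0 and I are hypothetical.
v11_0, vF10_0 = follower_values(y0=50.0, D11=4.0, D01=0.0, I=100.0,
                                r=0.03, sigma=0.30, T=1.0, M=200)
```

The follower's value is never below the invest-now value $v^{(1,1)} - I$, and the gap between the two is the value of firm F's option to defer. The state (0, 1) can be handled by the same routine with the roles of the firms exchanged.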
Finally, we consider the project values in state (0, 0), where both firms have options to invest in the project. In this case the actions of both firms in the equilibrium strategies can be derived from a Nash equilibrium of the one-shot game described by the payoff matrix illustrated in Fig. 2. In the figure, $v_L^{(0,0)}(m, y)$ is defined by

\[
v_L^{(0,0)}(m, y) = D_{00}\, y \Delta t + e^{-r\Delta t} E_Q\left[ V_L^{(0,0)}(m+1, Y(m+1)) \mid Y(m) = y \right].
\]
There could be multiple Nash equilibria in the payoff matrix in Fig. 2; hence we need additional criteria to choose a unique equilibrium. As explained in the introduction, we introduce two criteria, called LAC and the 50% criterion.

Under LAC, we assume that firm L has a competitive advantage in the case of multiple equilibria. In the payoff matrix shown in Fig. 2, the equations $v_L^{(1,1)} = v_F^{(1,1)}$, $v_L^{(1,0)} = v_F^{(0,1)}$ and $v_L^{(0,1)} = v_F^{(1,0)}$ are satisfied at any period. Hence, we can determine the equilibrium strategies using these equations. Table 1 summarizes the equilibrium strategies that can occur under LAC. Using the above equations, all possibilities are classified into nine cases. The second column of Table 1, labelled "conditions of equilibria", gives the conditions that determine the equilibria. The resulting equilibria are shown in the third column, denoted by the pair of the states, $(s_L(m, x_L(m-1), Y(m)), s_F(m, x_F(m-1), Y(m)))$. For example, (1, 0) represents the state where only firm L has invested. Note that the third and fourth cases correspond to symmetric equilibria, while the fifth and sixth cases correspond to asymmetric equilibria. If there exist multiple equilibria, one equilibrium strategy is selected by LAC, which is reflected as an inequality in the fourth column, labelled "condition of equilibrium selection". The fifth column shows the equilibrium selected by LAC. The last two columns give the project values of firm L and firm F in the equilibrium, respectively.
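The selection step can be illustrated with a short Python sketch that enumerates the pure-strategy Nash equilibria of the one-shot game in state (0, 0) and then applies either LAC or the 50% criterion. The payoff layout is our reading of the text (an investing firm pays I, and firm F's values follow from the symmetry relations above); the function names and all numerical inputs are hypothetical.

```python
def stage_payoffs(v11, v10, v01, v00, I):
    """Payoffs (firm L, firm F) of the one-shot game in state (0, 0).

    v_xy stands for firm L's value v_L^(x,y); firm F's values follow from
    v_L^(1,1) = v_F^(1,1), v_L^(1,0) = v_F^(0,1) and v_L^(0,1) = v_F^(1,0).
    Keys are actions (a_L, a_F) with 1 = invest and 0 = do not invest.
    """
    return {
        (1, 1): (v11 - I, v11 - I),
        (1, 0): (v10 - I, v01),
        (0, 1): (v01, v10 - I),
        (0, 0): (v00, v00),
    }


def pure_nash_equilibria(payoff):
    """Return all action pairs from which neither firm gains by deviating alone."""
    equilibria = []
    for a_L in (1, 0):
        for a_F in (1, 0):
            p_L, p_F = payoff[(a_L, a_F)]
            best_for_L = p_L >= payoff[(1 - a_L, a_F)][0]
            best_for_F = p_F >= payoff[(a_L, 1 - a_F)][1]
            if best_for_L and best_for_F:
                equilibria.append((a_L, a_F))
    return equilibria


def select_lac(payoff):
    """LAC: among the equilibria, pick the one with the largest payoff for firm L."""
    best = max(pure_nash_equilibria(payoff), key=lambda a: payoff[a][0])
    return best, payoff[best]


def select_fifty(payoff):
    """50% criterion: every equilibrium is equally likely, so average the payoffs."""
    eqs = pure_nash_equilibria(payoff)
    V_L = sum(payoff[a][0] for a in eqs) / len(eqs)
    V_F = sum(payoff[a][1] for a in eqs) / len(eqs)
    return V_L, V_F


# Hypothetical values producing a chicken game (two asymmetric equilibria).
chicken = stage_payoffs(v11=60.0, v10=120.0, v01=30.0, v00=50.0, I=40.0)
# Hypothetical values producing a coordination game (two symmetric equilibria).
coordination = stage_payoffs(v11=100.0, v10=70.0, v01=50.0, v00=65.0, I=40.0)
```

In the chicken example the equilibria are (1, 0) and (0, 1); LAC selects (1, 0) with payoffs (80, 30), while the 50% criterion gives each firm 55. In the coordination example the equilibria are (1, 1) and (0, 0), and LAC selects (0, 0), which is also the payoff-dominant one.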
It is important to note that in the case of symmetric equilibria, which correspond to the third and fourth cases, LAC is equivalent to another criterion for equilibrium selection called payoff dominance. Under the payoff dominance criterion we choose an equilibrium if all players’ payoffs in that equilibrium are greater than those in the other equilibrium. Since Harsanyi and Selten (1988) applied this criterion in their theory of equilibrium selection, it has often been used in game theory. Fudenberg and Tirole (1985) and Huisman (2001) also use this criterion to select an equilibrium. Note that firm L’s project value in the equilibrium selected by LAC is, by definition, at least as great as its value in any other equilibrium. In addition, firm F’s project value also takes its maximum in that equilibrium because of the symmetry of the strategies between the two firms. Consequently, the equilibrium selected by LAC among symmetric equilibria is consistent with that chosen by the payoff dominance criterion.

In applying the 50% criterion, there are four possible cases, which are summarized in Eq. (8). By the symmetric definition of our model, $v_L^{(x_L, x_F)} = v_F^{(x_F, x_L)}$ and $V_L^{(x_L, x_F)} = V_F^{(x_F, x_L)}$ hold for any state $(x_L, x_F)$. Consequently, the project values $V_L^{(0,0)}$ and $V_F^{(0,0)}$ are calculated by

\[
V_L^{(0,0)}(m, y) = V_F^{(0,0)}(m, y) =
\begin{cases}
v_L^{(1,1)}(m, y) - I & \text{if } v_L^{(1,1)}(m, y) - I > v_L^{(0,1)}(m, y) \text{ and } v_L^{(1,0)}(m, y) - I > v_L^{(0,0)}(m, y), \\
\text{average over the two equilibria} & \text{if } v_L^{(1,1)}(m, y) - I > v_L^{(0,1)}(m, y) \text{ and } v_L^{(1,0)}(m, y) - I < v_L^{(0,0)}(m, y), \\
\text{average over the two equilibria} & \text{if } v_L^{(1,1)}(m, y) - I < v_L^{(0,1)}(m, y) \text{ and } v_L^{(1,0)}(m, y) - I > v_L^{(0,0)}(m, y), \\
v_L^{(0,0)}(m, y) & \text{if } v_L^{(1,1)}(m, y) - I < v_L^{(0,1)}(m, y) \text{ and } v_L^{(1,0)}(m, y) - I < v_L^{(0,0)}(m, y).
\end{cases} \tag{8}
\]
The first case indicates that both firms choose to invest at period m when the previous state is (0, 0), and in the fourth case neither firm invests in the project at period m. There exists a unique equilibrium in these cases. On the other hand, the second and third cases correspond to multiple equilibria where the states (1, 0) and (0, 1) satisfy the equilibrium condition. In these cases the project value is given by the average of the two project values in the equilibria under the 50% criterion.

Table 1. Calculation of $V_L^{(0,0)}$ and $V_F^{(0,0)}$ under the second criterion of equilibrium selection.

| Case | Equilibria | Condition of equilibrium selection | Selected equilibrium | $V_L^{(0,0)}$ | $V_F^{(0,0)}$ |
| 1 | (1, 1) | | (1, 1) | $v_L^{(1,1)} - I$ | $v_F^{(1,1)} - I$ |
| 2 | (1, 1) | | (1, 1) | $v_L^{(1,1)} - I$ | $v_F^{(1,1)} - I$ |
| 3 | (1, 1), (0, 0) | $v_L^{(1,1)} - I > v_L^{(0,0)}$ | (1, 1) | $v_L^{(1,1)} - I$ | $v_F^{(1,1)} - I$ |
| 4 | (1, 1), (0, 0) | $v_L^{(1,1)} - I < v_L^{(0,0)}$ | (0, 0) | $v_L^{(0,0)}$ | $v_F^{(0,0)}$ |
| 5 | (1, 0), (0, 1) | $v_L^{(1,0)} - I > v_L^{(0,1)}$ | (1, 0) | $v_L^{(1,0)} - I$ | $v_F^{(1,0)}$ |
| 6 | (1, 0), (0, 1) | $v_L^{(1,0)} - I < v_L^{(0,1)}$ | (0, 1) | $v_L^{(0,1)}$ | $v_F^{(0,1)} - I$ |
| 7 | (1, 0) | | (1, 0) | $v_L^{(1,0)} - I$ | $v_F^{(1,0)}$ |
| 8 | (0, 1) | | (0, 1) | $v_L^{(0,1)}$ | $v_F^{(0,1)} - I$ |
| 9 | (0, 0) | | (0, 0) | $v_L^{(0,0)}$ | $v_F^{(0,0)}$ |

[The second column of the table, "conditions of equilibria", is illegible in the source and is omitted here.]

3. Numerical experiments and economic implications

3.1 Selection of parameters

In this section we analyze the relation between competition and flexibility under uncertainty. In particular, we focus on the effects of FMA and SMA on the project values as well as the equilibrium strategies. In the numerical
examples, both LAC and the 50% criterion are applied in the case of multiple equilibria, which enables us to compare our model with the existing research. In addition, we examine the economic implications of each criterion by comparing the results.

We assume in our numerical examples that the underlying demand process follows a geometric Brownian motion with a constant drift r = 3% and a constant volatility σ = 30% in the risk-neutral world.² The demand process is approximated by a trinomial lattice model whose number of periods is M = 1000 when the project horizon T is equal to one. To describe the economic condition between the two firms we assume

\[
D_{10} > D_{11} > D_{00} > D_{01}. \tag{9}
\]

These inequalities imply that the investment can increase the marginal cash flow for both firms. In addition, we classify the conditions of FMA and SMA by comparing $D_{10} - D_{00}$ with $D_{11} - D_{01}$. Note that $D_{10} - D_{00}$ is the increment of the marginal cash flow of the first mover, while $D_{11} - D_{01}$ is that of the second mover. For brevity we define the condition of FMA as $D_{10} - D_{00} > D_{11} - D_{01}$, while SMA is defined as $D_{10} - D_{00} < D_{11} - D_{01}$. Note that the definition of FMA is the same as in Huisman (2001). The parameter values of $D_{ij}$, i, j = 0, 1, are given in each numerical example.

² Imai and Watanabe (2006b) analyze a similar model under a non-Gaussian stochastic process.
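The ordering (9) and the FMA/SMA definitions above can be checked mechanically. The short Python sketch below does so for the two parameter sets used in the numerical examples of Sections 3.2 and 3.3; the function names are hypothetical.

```python
def satisfies_ordering(D10, D11, D00, D01):
    """Condition (9): D10 > D11 > D00 > D01."""
    return D10 > D11 > D00 > D01


def classify_advantage(D10, D11, D00, D01):
    """Compare the first mover's increment D10 - D00 with the second mover's D11 - D01."""
    first_mover_gain = D10 - D00
    second_mover_gain = D11 - D01
    if first_mover_gain > second_mover_gain:
        return "FMA"
    if first_mover_gain < second_mover_gain:
        return "SMA"
    return "neither"


# Section 3.2: D10 = 8, D11 = 4, D00 = 3, D01 = 0, so 5 > 4 and FMA holds.
fma_case = classify_advantage(D10=8, D11=4, D00=3, D01=0)
# Section 3.3: D11 is raised to 6, so 5 < 6 and SMA holds.
sma_case = classify_advantage(D10=8, D11=6, D00=3, D01=0)
```

Both parameter sets also satisfy the ordering (9), so the two examples differ only in which mover gains more from investing.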
3.2 Project values under the condition of FMA

Figure 3 illustrates the project values of the firms with regard to the initial demand Y(0) under the condition of FMA. Parameter values are given by $D_{10} = 8$, $D_{00} = 3$, $D_{01} = 0$ and $D_{11} = 4$, which satisfy the condition of FMA. In this case Imai and Watanabe (2006b) show that a chicken game could emerge: there exist two asymmetric equilibria in which one firm’s strategy is different from the other’s.

Our main interest is to analyze the project values of firm L and firm F in equilibrium on the condition that neither firm invests before period zero, which are given by $V_L^{(0,0)}(0, y)$ and $V_F^{(0,0)}(0, y)$, respectively. To analyze the equilibrium strategies we also derive the project values under different conditions before time zero: $V_L^{(1,1)}(0, y) - I$, $V_L^{(0,1)}(0, y)$ and $V_L^{(1,0)}(0, y) - I$.

Consider the firms’ actions at period zero when the initial demand is Y(0) = y. The value $V_L^{(1,1)}(0, y) - I$ indicates the project value of firm L at period zero on the condition that both firms invest at period zero. Note that $V_L^{(1,1)}(0, y)$ does not include the investment cost. It is apparent that $V_L^{(1,1)}(0, y) - I$ is linear with respect to the initial demand since the project has no remaining option; it corresponds to the net present value of the project. $V_L^{(0,1)}(0, y)$ indicates firm L’s project value on the condition that firm F has already invested in the project. Since only firm L has an option to defer the investment, the shape of $V_L^{(0,1)}(0, y)$ with respect to the initial demand is similar to that of the standard real option value to defer an investment without competition. According to the figure, the value of $V_L^{(0,1)}(0, y)$ is relatively small but not zero when $y < Y_B$. This is because firm L defers the investment at period zero and the possibility of future investment is small but positive. The value of $V_L^{(0,1)}(0, y)$ meets that of $V_L^{(1,1)}(0, y) - I$ at $y = Y_B$, where the optimal strategy for firm L changes to investing at the beginning of the project. $V_L^{(1,0)}(0, y) - I$ indicates firm L’s project value on the condition that firm L invests at period zero. Although firm L has no decision to make, its marginal cash flow depends on firm F’s action, which changes firm L’s project value as well as firm F’s. It is of interest to note that the project value $V_L^{(1,0)}(0, y) - I$ peaks around y = 47. Increasing the initial demand raises the total cash flow for a given marginal cash flow, but it also increases the possibility of earlier investment by firm F. This leads to a decrease of the marginal cash flow of firm L, from $D_{10}$ to $D_{11}$, and $V_L^{(1,0)}(0, y) - I$ decreases when y becomes greater than 47. $V_L^{(1,0)}(0, y) - I$ eventually becomes equal to $V_L^{(1,1)}(0, y) - I$ and $V_L^{(0,1)}(0, y)$ at $y = Y_B$ as the initial demand increases.
Figure 3. The project values of the firms with respect to the initial demand under FMA. Curves: $V_L^{(1,1)}(0, y) - I$, $V_L^{(1,0)}(0, y) - I$, $V_L^{(0,1)}(0, y)$, $V_F^{(0,0)}(0, y)$, $V_L^{(0,0)}(0, y)$ and $V_{50\%}^{(0,0)}(0, y)$. Horizontal axis: initial demand y (0 to 80), with $Y_A$ and $Y_B$ marked; vertical axis: project value (−200 to 200).
Observations of the project values under the condition of FMA: According to Fig. 3, $V_L^{(0,0)}(0, y)$ and $V_F^{(0,0)}(0, y)$ are nearly equal, though not necessarily identical, when the initial demand is less than $Y_A$. In this interval, the project value takes a maximum around y = 26, which results from an effect similar to that on $V_L^{(1,0)}(0, y) - I$ around y = 47. Examining both firms’ project values in detail reveals that firm F’s project value is sometimes greater than firm L’s project value for $y < Y_A$ in spite of the advantage of firm L. This phenomenon is called the flexibility trap and is discussed by Imai and Watanabe (2004) and Imai and Watanabe (2006a). In the interval $y \in (Y_A, Y_B)$ the project value of firm L is larger than that of firm F under LAC. Note that $V_L^{(0,0)}(0, y)$ is equal to $V_L^{(1,0)}(0, y) - I$ while $V_F^{(0,0)}(0, y)$ is equal to $V_L^{(0,1)}(0, y)$ in the interval. This implies that only firm L invests in the project at period zero while firm F defers the investment and waits until the optimal investment timing. Both project values become identical when $y \ge Y_B$, which means that it is optimal to invest in the project at period zero.
[Figure 4: plot of the project values $V_L^{(1,1)}(0,y)-I$, $V_L^{(1,0)}(0,y)-I$, $V_L^{(0,1)}(0,y)$, $V_F^{(0,0)}(0,y)$, $V_L^{(0,0)}(0,y)$ and $V_{50\%}^{(0,0)}(0,y)$ against the initial demand $y$, with the thresholds $Y_C$ and $Y_D$ marked on the horizontal axis.]
Figure 4. The project value of the firms with respect to the initial demand under SMA.

Effect of the criteria on equilibrium selection: Finally, let $V_{50\%}^{(0,0)}(0,y)$ denote both firms' project values in the equilibrium at period zero when we apply the 50% criterion to the case of multiple equilibria. Both project values are identical under this criterion because both firms are completely symmetric by definition. The comparison among $V_L^{(0,0)}(0,y)$, $V_F^{(0,0)}(0,y)$ and $V_{50\%}^{(0,0)}(0,y)$ enables us to understand how the criterion of equilibrium selection affects the project values. We focus on the interval $y \in (Y_A, Y_B)$, since all the values are nearly identical otherwise. In this interval we observe that $V_{50\%}^{(0,0)}(0,y)$ is exactly the average of $V_L^{(0,0)}(0,y)$ and $V_F^{(0,0)}(0,y)$; namely,
$$V_{50\%}^{(0,0)}(0,y) = \frac{1}{2}\left\{ V_L^{(0,0)}(0,y) + V_F^{(0,0)}(0,y) \right\}.$$
Under the LAC one of the competitive firms, firm L, is predetermined to have a competitive advantage. Therefore, we can interpret the value $\frac{1}{2}\{ V_L^{(0,0)}(0,y) + V_F^{(0,0)}(0,y) \}$ as the project value of each firm when one of the two firms is selected as firm L with equal probability before the game. Under the 50% criterion, on the other hand, every time multiple equilibria emerge one of them is selected with equal probability. Note that the 50% criterion does not select an advantaged firm before the game. Consequently, the result indicates that the 50% criterion is equivalent to the criterion of selecting an advantaged firm as firm L with equal probability in advance.$^{3}$

$^{3}$ This observation does not hold when $y < Y_A$. In this interval the flexibility trap is observed, which breaks the equality.
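The averaging identity for the 50% criterion can be illustrated with a few lines of Python. This is only a sketch: the function name and the two numerical values are hypothetical and are not read from Fig. 3. It computes the expected project value of a firm that is assigned the role of firm L with a given probability before the game.

```python
def value_under_random_advantage(v_leader: float, v_follower: float,
                                 p_leader: float = 0.5) -> float:
    """Expected project value of a firm that becomes firm L with probability
    p_leader and firm F otherwise (hypothetical helper, for illustration)."""
    return p_leader * v_leader + (1.0 - p_leader) * v_follower


# Hypothetical project values of firm L and firm F for some y in (Y_A, Y_B).
v_l, v_f = 120.0, 80.0

# With equal probabilities this is the plain average of the two values.
v_50 = value_under_random_advantage(v_l, v_f)
print(v_50)  # 100.0
```

With `p_leader = 0.5` both firms obtain the same expected value, which matches the symmetry of the two firms under the 50% criterion.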
3.3 Project values under the condition of SMA

Figure 4 shows the project values of both firms under the condition of SMA. The parameter values of the marginal cash flows are set as $D_{10} = 8$, $D_{00} = 3$, $D_{01} = 0$ and $D_{11} = 6$. Note that the value of $D_{11}$ is changed so that the condition of SMA holds. Although multiple equilibria could emerge as in the previous example, the type of equilibria is different. Under the condition of SMA there could be symmetric equilibria, which consist of the states $(1,1)$ and $(0,0)$. If we apply LAC to the equilibrium selection problem, a unique symmetric equilibrium is always selected, which leads to identical project values for firm L and firm F. Furthermore, LAC is consistent with the payoff dominance criterion stated in the previous section.

Observations of the project values under the condition of SMA: The shapes of $V_L^{(1,1)}(0,y) - I$, $V_L^{(1,0)}(0,y) - I$ and $V_L^{(0,1)}(0,y)$ depicted in Fig. 4 are similar to those in Fig. 3, while the shapes of $V_L^{(0,0)}(0,y)$ and $V_F^{(0,0)}(0,y)$ are quite different. The most important difference in this figure is that $V_L^{(0,0)}(0,y)$ is equal to $V_F^{(0,0)}(0,y)$ regardless of the initial demand. By examining the equilibrium strategies of both firms, we find that when $y \le Y_D$ both firm L and firm F defer the investment at period zero. These values touch $V_L^{(1,1)}(0,y) - I$ at $y = Y_D$, where the optimal strategies of both firms change to investing at period zero. We can confirm from the figure that an asymmetric equilibrium never emerges under the condition of SMA even if we apply LAC. Comparing $V_L^{(0,0)}(0,y)$ (or $V_F^{(0,0)}(0,y)$) with $V_L^{(1,0)}(0,y) - I$, we can conclude that neither firm has an incentive to invest earlier to become the first mover under the condition of SMA.

Effect of the criteria on equilibrium selection: The comparison of the two criteria in the equilibrium selection is also examined under the condition of SMA. Fig. 4 shows that the project value at period zero under the 50% criterion, $V_{50\%}^{(0,0)}(0,y)$, is less than or equal to $V_L^{(0,0)}(0,y)$ (and $V_F^{(0,0)}(0,y)$). The relation between the two criteria is not clear under the condition of SMA. Since applying the 50% criterion is inconsistent with the payoff dominance criterion, the 50% criterion is not appropriate under the condition of SMA.

3.4 Discontinuity between FMA and SMA

Figure 5 depicts the values of both firms with respect to the initial demand under the conditions of FMA and SMA, in order to show the difference of the project values between the two conditions. The common parameter values are set as $D_{00} = 3$, $D_{01} = 0$ and $D_{11} = 6$; $D_{10} = 10$ is used to represent the condition of FMA and $D_{10} = 8$ that of SMA. Note that the values of $V_L^{(1,1)}(0,y) - I$ and $V_L^{(0,1)}(0,y)$ are common under both conditions.

[Figure 5: project values plotted against the initial demand $y$.]
Figure 5. The comparison of the project values under FMA and SMA.

As mentioned for Fig. 4, both project values under the condition of SMA are identical: both firms defer the investment at period zero until the initial demand reaches $y = Y_D$. Although we assume that firm L and firm F are competitors, the result implies that the equilibrium strategies of the competitive firms are equivalent to those of firms in cooperation. This is called coordination in game theory, which means that one firm cooperates with the other to accomplish their common interest. The firms in coordination choose identical strategies in this case. Consequently, under the condition of SMA the equilibrium strategies of the competitive firms are equivalent to those of the coordinated firms. In other words, both firms act as if they were one firm that obtains the marginal cash flow $D_{00}$ before the investment and $D_{11}$ after the investment. For that reason we refer to the value $V_L^{(0,0)}(0,y)$ $(= V_F^{(0,0)}(0,y))$ as the coordinated value. The coordinated value can be accomplished if a firm can coordinate with the other firm. It is important to point out that the coordination is not an assumption but a result of the equilibrium under SMA.

Under the condition of FMA, on the other hand, asymmetric equilibria emerge in the interval $y \in (Y_F, Y_I)$, and firm L's project value is larger than firm F's in this interval. In particular, firm L's project value is larger than the coordinated value in the interval $y \in (Y_G, Y_H)$. This implies that it is better for firm L to take advantage of the LAC and invest in the project alone at period zero as the first mover when the initial value is in $(Y_G, Y_H)$. In the intervals $y \in (Y_E, Y_G)$ and $y \in (Y_H, Y_J)$, however, the values $V_L^{(0,0)}(0,y)$ and $V_F^{(0,0)}(0,y)$ are smaller than the coordinated value. It must be noted that in these intervals the strategies that correspond to the coordinated value are not in equilibrium even though they are feasible. In other words, neither firm can accomplish the coordinated value under the condition of FMA in the interval $y \in (Y_E, Y_J)$. This can be explained by a fear of preemption: both firms know that they could accomplish the coordinated value by waiting to invest until the optimal stopping time if they could cooperate with each other. In the presence of preemption, however, they have no choice but to invest in the project earlier, which decreases the project values of both firms. It can also be interpreted as an example of the prisoner's dilemma: without coordination both players must make less attractive decisions.

Figure 6 also shows the difference of the equilibrium strategy between FMA and SMA. The figure illustrates the project values at period zero under LAC with respect to the marginal cash flow $D_{11}$.$^{4}$ The parameter values are common with those in the first two figures: $D_{10} = 8$, $D_{00} = 3$, $D_{01} = 0$. The initial demand is set as $Y(0) = 45$, and $D_{11}$ ranges over the interval $[3, 8]$, where inequality (9) is always satisfied. Note that $D_{11} \in [3, 5)$ satisfies the condition of FMA while $D_{11} \in (5, 8]$ satisfies that of SMA. $D_B = 5$ represents the boundary value between FMA and SMA.

[Figure 6: project values plotted against the marginal cash flow $D_{11}$.]
Figure 6. The comparison of the project values with respect to $D_{11}$ under FMA and SMA.

First we analyze the figure in the interval under FMA, i.e., $D_{11} \in [3, 5)$. The project value of firm L, $V_L^{(0,0)}(0,y)$, is equal to $V_L^{(1,0)}(0,y) - I$, and the project value of firm F, $V_F^{(0,0)}(0,y)$, is equal to $V_L^{(0,1)}(0,y)$ when $D_{11} < D_A$. This indicates that only firm L invests at period zero while firm F defers the investment under LAC. In the interval $D_A \le D_{11} < D_B$ the equilibrium strategy for both firms is to invest immediately at period zero, and both project values lie on the line of $V_L^{(1,1)}(0,y) - I$. We can observe that under the condition of FMA, firm L, which has a competitive advantage over firm F, could have opportunities to make the best use of the competitive advantage, to become the first mover and to maximize the project value.

The project values of the firms and the corresponding equilibrium strategies change discontinuously at the boundary between FMA and SMA, $D_{11} = D_B$. When $D_{11} > D_B$ both project values jump up, and the corresponding equilibrium strategies change from investing immediately to waiting to invest. We can see that both firms obtain the real option value, which is given by $V_L^{(0,0)}(0,y) - \{ V_L^{(1,1)}(0,y) - I \}$. This results from the absence of preemption; namely, both firms can coordinate the optimal timing of the investment without a fear of preemption. The real option value tends to decrease as $D_{11}$ increases and finally vanishes at the point $D_C$, where investing immediately becomes the equilibrium strategy. It is important to point out that both project values keep increasing as $D_{11}$ increases under the condition of SMA. Consequently, we can conclude again that under the condition of SMA, the existence of the real option to defer the investment always has a positive impact on the project values of both firms.

$^{4}$ We remove the value of $V_{50\%}^{(0,0)}(0, Y(0))$ from the figure. As discussed in this section, it is hard to compare the results of FMA and SMA under the 50% criterion because the condition of FMA could create asymmetric equilibria while that of SMA could create symmetric equilibria.

4. Concluding Remarks

In this paper we develop a valuation model for the case where two competitive firms exist under demand uncertainty. We propose a numerical procedure to derive both project values and equilibrium strategies in the duopolistic environment. We apply two types of criteria, which we call the firm L advantage criterion (LAC) and the 50% criterion, to choose the unique
equilibrium under the simultaneous decision case. Our numerical examples reveal the following. Under the condition of FMA, asymmetric equilibria could emerge under the LAC, where firm L usually becomes the first mover and enjoys the larger project value. However, the project values of both firms often become smaller than the coordinated project value because of the fear of preemption. On the other hand, the condition of SMA has a completely different influence on the equilibrium strategies of the competitive firms. Both firms' project values are equal and the corresponding equilibrium strategies are always symmetric even if LAC is assumed. Furthermore, it is shown that both firms can make the best use of the flexibility to defer the investment in the presence of competition. As a result the project values of both firms are equal to the coordinated project value. In other words, the value of flexibility under uncertainty does not conflict with the existence of competition under the condition of SMA, in contrast to the results under the condition of FMA.

References

Constantinides, G. M. (1978): “Market Risk Adjustment in Project Valuation,” Journal of Finance, 33(2), 603–616.
Cox, J. C., and S. A. Ross (1976): “The Valuation of Options for Alternative Stochastic Processes,” Journal of Financial Economics, 3(1-2), 145–166.
Dixit, A. K., and R. S. Pindyck (1994): Investment under Uncertainty. Princeton University Press.
Fudenberg, D., and J. Tirole (1985): “Pre-emption and Rent Equalisation in the Adoption of New Technology,” The Review of Economic Studies, 52, 383–401.
Grenadier, S. R. (1996): “The Strategic Exercise of Options: Development Cascades and Overbuilding in Real Estate Markets,” Journal of Finance, 51(5), 1653–1679.
Grenadier, S. R. (2002): “Option Exercise Games: An Application to the Equilibrium Investment Strategies of Firms,” The Review of Financial Studies, 15(3), 691–721.
Harsanyi, J. C., and R. Selten (1988): A General Theory of Equilibrium Selection in Games. MIT Press.
Hoppe, H. C. (2000): “Second-mover Advantages in the Strategic Adoption of New Technology under Uncertainty,” International Journal of Industrial Organization, 18, 315–338.
Huisman, K. J. M. (2001): Technology and Investment: A Game Theoretic Real Options Approach. Kluwer Academic Publishers.
Huisman, K. J. M., and P. M. Kort (2003): “Strategic Investment in Technological Innovations,” European Journal of Operational Research, 144, 209–223.
Huisman, K. J. M., and P. M. Kort (2004): “Strategic Technology Adoption Taking into Account Future Technological Improvements: A Real Options Approach,” European Journal of Operational Research, 159, 705–728.
Imai, J., and T. Watanabe (2004): “A Two-stage Investment Game in Real Option Analysis,” Discussion Paper.
Imai, J., and T. Watanabe (2006a): “Kyousou Jyoukyouka deno Real Option to Jyuunannsei no Wana,” Discussion Paper.
Imai, J., and T. Watanabe (2006b): “A Numerical Approach for Real Option Values and Equilibrium Strategies in a Duopoly,” Discussion Paper.
Kijima, M., and T. Shibata (2002): “Real Options in a Duopoly Market with General Volatility Structure,” Discussion Paper of Kyoto University.
Kulatilaka, N., and E. C. Perotti (1998): “Strategic Growth Options,” Management Science, 44(8), 1021–1031.
Lambrecht, B. (2000): “Strategic Sequential Investments and Sleeping Patents,” Project Flexibility, Agency, and Competition, pp. 297–323.
McDonald, R., and D. Siegel (1984): “Option Pricing When the Underlying Asset Earns A Below Equilibrium Rate of Return: A Note,” Journal of Finance, 39(1), 261–265.
Smets, F. (1991): “Exporting versus FDI: The Effect of Uncertainty, Irreversibilities and Strategic Interactions,” Working Paper, Yale University.
Smit, H. T. J., and L. A. Ankum (1993): “A Real Options and Game-Theoretic Approach to Corporate Investment Strategy Under Competition,” Financial Management, pp. 241–250.
Smit, H. T. J., and L. Trigeorgis (2001): “Flexibility and Commitment in Strategic Investment,” Real Options and Investment Under Uncertainty: Classical Readings and Recent Contributions, pp. 451–498.
December 26, 2006
11:9
Proceedings Trim Size: 9in x 6in
Asianput
Asian Strike Options of American Type and Game Type

Masaya Ishihara and Hiroshi Kunita
Nanzan University, Seto, 464-0062, Japan
We study Asian strike put options of European, American and game type. After changing the variable, we show that the study of these Asian strike options is reduced to the study of European, American and game type options for a one dimensional nonstationary Brownian motion (diffusion process) $y_t$ with a suitable pay-off function. We get the exercise region of the holder for the American type option and the exercise regions of the holder and the writer for the game type option. Then we study the early exercise premium for the American type option, and the early exercise premium of the holder and the early cancellation fee of the writer for the game type option. For the study of the early cancellation fee of the writer, we need an extension of Itô's formula, which involves the local time of the price process $y_t$. We establish the generalized Itô's formula in Appendix.

1. Introduction

The game option was introduced by Kifer [3]. Call and put options of the game type were studied by Kyprianou [7], Kunita-Seko [5] and others. In this paper we study Asian strike options of European type, American type and game type. Let $S_t$ be the price process determined by a Black-Scholes type stochastic differential equation (2.1). Let $G_t$ be the geometric average of $S_u$, $u \le t$. If the pay-off function is given by $f(S, G) = \max\{G - S, 0\}$, the option is called a (geometric) average strike put option or Asian strike put option. It is known that the problem of the Asian strike option is reduced to a problem of a European option for the one dimensional process $y_t = t \log(G_t/S_t)$ with the pay-off function $\xi(e^{y/t}) = \max\{e^{y/t} - 1, 0\}$. This fact will be reviewed in Section 2.

In the average strike put option of the American type, the holder of the
option can exercise it at any time before the expiration time $T$. One can reduce it to an American option for the one dimensional process $y_t$ with the same pay-off function $\xi(e^{y/t})$. The exercise region and the early exercise premium for this option were given in Wu-Kwok-Yu [10] without proofs in the case of the call option. We give the details of the proof in Section 3 (Theorem 3.2).

In the case of the game type option, the writer can also cancel the option with an additional cancellation fee. In the present paper, the cancellation fee will be given by $f_\delta(S, G) = \max\{G - S, 0\} + \delta S$, where $\delta$ is a positive constant. The fair price of the option is then given by
$$V(S, G, t) = \inf_{\sigma \in \mathcal{T}_{t,T}} \sup_{\tau \in \mathcal{T}_{t,T}} E_0\left[ e^{-r(\sigma \wedge \tau - t)} \left\{ f(S_\tau, G_\tau) 1_{\tau \le \sigma} + f_\delta(S_\sigma, G_\sigma) 1_{\sigma < \tau} \right\} \,\middle|\, S_t = S, G_t = G \right],$$
where $\mathcal{T}_{t,T}$ is the totality of stopping times with values in the interval $[t, T]$. The supremum and the infimum are taken in the sense of the essential supremum and the essential infimum. One can also reduce it to the problem of the game type option for the process $y_t$ with pay-off functions $\xi$ and $\xi_\delta := \xi + \delta$. In fact, the function $V(S, Se^{y/t}, t) S^{-1}$ depends on $(y, t)$ only a.s. and is characterized by
$$W(y, t) = \inf_{\sigma \in \mathcal{T}_{t,T}} \sup_{\tau \in \mathcal{T}_{t,T}} E^*\left[ e^{-q(\sigma \wedge \tau - t)} \left\{ \xi(e^{y_\tau/\tau}) 1_{\tau \le \sigma} + \xi_\delta(e^{y_\sigma/\sigma}) 1_{\sigma < \tau} \right\} \,\middle|\, y_t = y \right],$$
with a suitable risk neutral probability measure $P^*$.

A main purpose of the paper is to get the holder's exercise region and the writer's cancellation region of the option. It will be shown in Section 4 that the holder's exercise region is an upper half line $[y^*_\delta(t), \infty)$, where $y^*_\delta(t) > 0$, and the writer's cancellation region is either an empty set or a single point $\{0\}$ (Theorem 4.1). Another object is to get the holder's early exercise premium and the writer's early cancellation fee. We are particularly interested in the writer's cancellation fee. It will be represented by using the local time of the process $y_t$ at the level 0. It will be discussed in Section 5.

A technical reason why we need the local time is that the value function $W(y, t)$ may fail to be smooth at the boundary of the cancellation region. Indeed, the first derivative $W_y(y, t)$ loses continuity at the boundary point $y = 0$ of the writer's cancellation region. Hence we cannot apply the usual Itô's formula. We have to extend the formula to less regular functions. In Appendix, we revisit the generalized Itô's formula for the distribution valued process $T(t) \circ \varphi_{s,t}$, where $\varphi_{s,t}$ is the stochastic flow generated by an SDE (Kunita [4]). Then we establish an Itô's formula involving the local time of the one-dimensional stochastic flow.

Throughout the paper, we use the notation of the stochastic flow $\varphi_{s,t}(y)$ instead of the diffusion process $y_t$. A merit is that the value function $W(y, t)$
can be defined without using the conditional means. Then we may discuss the regularity of the function $W(y, t)$ with respect to $(y, t)$ more easily.

2. Average Strike Put Options

Let $P_0$ be the risk neutral probability measure, where the price process $S_t$ satisfies the SDE
$$dS_t = (r - q) S_t\, dt + \sigma S_t\, dZ^0_t. \tag{2.1}$$
Here $Z^0_t$ is a standard Brownian motion with respect to $P_0$, $r > 0$ is the interest rate and $q \ge 0$ is the dividend rate of the stock $S_t$. The solution is written as
$$S_t = S_0 \exp\left\{ \left(r - q - \frac{1}{2}\sigma^2\right) t + \sigma Z^0_t \right\}. \tag{2.2}$$
The geometric average of the price process is given by
$$G_t := \exp\left( \frac{1}{t} \int_0^t \log S_u\, du \right), \qquad 0 < t < T. \tag{2.3}$$
The pair $(S_t, G_t)$ satisfies the following integrability condition. For any $p > 1$, there exists a positive constant $C_p$ such that
$$E\left[ \left(1 + \sup_{t \le u \le T} |S_u| + \sup_{t \le u \le T} |G_u| \right)^p \,\middle|\, S_t = S, G_t = G \right] \le C_p (1 + |S| + |G|)^p, \qquad \forall S, G, t. \tag{2.4}$$
Further the differential of $G_t$ satisfies
$$dG_t = \frac{G_t}{t} \log\frac{S_t}{G_t}\, dt. \tag{2.5}$$
Thus the pair $(S_t, G_t)$ is a two dimensional (nonstationary) diffusion process with the generator
$$L(t) U = \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 U}{\partial S^2} + (r - q) S \frac{\partial U}{\partial S} + \frac{G}{t} \log\frac{S}{G}\, \frac{\partial U}{\partial G}. \tag{2.6}$$
We shall consider an option where the pay-off is a function of the stock $S$ and its geometric mean $G$. We denote it by $f(S, G)$. We assume that it is of linear growth, i.e., there exists a positive constant $K$ such that
$$|f(S, G)| \le K(1 + |S| + |G|), \qquad \forall S, G.$$
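The objects introduced so far can be made concrete with a small simulation. The following is a minimal sketch, not part of the paper: it samples one path of (2.2) on a uniform grid, approximates the geometric average (2.3) by the trapezoidal rule, and evaluates the average strike put pay-off $\max\{G - S, 0\}$ together with the variable $y_t = t\log(G_t/S_t)$ used later. The function names, the grid size, the seed and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulate_price_path(s0, r, q, sigma, horizon, n_steps, rng):
    """Sample S on a uniform grid using the exact solution (2.2).
    Returns (times, prices). All argument names are illustrative."""
    dt = horizon / n_steps
    times, prices = [0.0], [s0]
    log_s = math.log(s0)
    for k in range(1, n_steps + 1):
        z = rng.gauss(0.0, 1.0)  # standard normal increment
        log_s += (r - q - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        times.append(k * dt)
        prices.append(math.exp(log_s))
    return times, prices


def geometric_average(times, prices):
    """Approximate G_t = exp((1/t) * int_0^t log S_u du) of (2.3) at
    t = times[-1], using the trapezoidal rule for the integral."""
    integral = 0.0
    for k in range(1, len(times)):
        step = times[k] - times[k - 1]
        integral += 0.5 * (math.log(prices[k - 1]) + math.log(prices[k])) * step
    return math.exp(integral / times[-1])


rng = random.Random(0)  # fixed seed so that the run is reproducible
times, prices = simulate_price_path(100.0, 0.05, 0.02, 0.3, 1.0, 250, rng)
t, s = times[-1], prices[-1]
g = geometric_average(times, prices)
y = t * math.log(g / s)          # y_t = t log(G_t / S_t)
payoff = max(g - s, 0.0)         # average strike put pay-off f(S, G)
```

By construction `payoff` coincides with `s * max(math.exp(y / t) - 1.0, 0.0)`, which is the factorisation $f(S, G) = S\,\xi(G/S)$ exploited in the reduction below.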
The value of the option at time $t$ in the case where $S_t = S$ and $G_t = G$ is then given by the formula
$$V^E(S, G, t) := E_0\left[ e^{-r(T - t)} f(S_T, G_T) \,\middle|\, S_t = S, G_t = G \right]. \tag{2.7}$$
It is well defined a.s. for any $S, G$; it has a modification which is a smooth function, and it satisfies the partial differential equation
$$\left( L(t) + \frac{\partial}{\partial t} - r \right) V^E(S, G, t) = 0 \tag{2.8}$$
with the terminal condition $V^E(S, G, T) = f(S, G)$.

The above partial differential equation involves three variables $(S, G, t)$. However, we can reduce it to an equation involving two variables only, provided that the pay-off function is written as $f(S, G) = S\xi(G/S)$, where $\xi(y)$ is a one dimensional continuous function of linear growth. Indeed, consider the stochastic process $y_t = t \log(G_t/S_t)$. Since $y_t = \int_0^t \log S_u\, du - t \log S_t$, we have
$$dy_t = \log S_t\, dt - \log S_t\, dt - t\, d\log S_t = -t\left(r - q - \frac{1}{2}\sigma^2\right) dt - t\sigma\, dZ^0_t,$$
by Itô's formula. Since $e^{-(r-q)t} S_t$ is a positive martingale, we may define a new probability measure $P^*$ by $dP^* = e^{-(r-q)t} S_t / S_0\, dP_0$. Then by Girsanov's theorem $Z^*_t := Z^0_t - \sigma t$ is a $P^*$-Brownian motion and $y_t$ satisfies
$$dy_t = -t\left(r - q + \frac{1}{2}\sigma^2\right) dt - t\sigma\, dZ^*_t. \tag{2.9}$$
From now on we will use the notation of the stochastic flow. Let $\varphi_{s,t}(y)$ be the solution of the above stochastic differential equation starting from $y$ at time $s$. It defines a stochastic flow of diffeomorphisms; it satisfies $\varphi_{s,u}(y) = \varphi_{t,u} \circ \varphi_{s,t}(y)$ for any $s < t < u$ and $y$. In the present case, we have $\varphi_{s,t}(y) = y_t - y_s + y$. For fixed $s, y$, the flow is a one dimensional nonstationary diffusion process. Its generator is given by
$$L^*(t) U = \frac{\sigma^2 t^2}{2} \frac{\partial^2 U}{\partial y^2} - \left(r - q + \frac{\sigma^2}{2}\right) t\, \frac{\partial U}{\partial y}. \tag{2.10}$$
Now if the pay-off function is of the form $f(S, G) = S\xi(G/S)$, the value function $V^E(S, Se^{y/t}, t)$ is written as
$$V^E(S, Se^{y/t}, t) = E_0\left[ e^{-r(T-t)} S_T\, \xi(e^{y_T/T}) \,\middle|\, S_t = S, G_t = Se^{y/t} \right] = S\, E^*\left[ e^{-q(T-t)} \xi(e^{y_T/T}) \,\middle|\, y_t = y \right], \quad \text{a.s.} \tag{2.11}$$
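The second equality in (2.11) rests only on the density of $P^*$ with respect to $P_0$. A short sketch of the step follows, under the assumptions that $\mathcal{F}_t$ denotes the filtration generated by $Z^0$ (a symbol not used in the text) and that $X = \xi(e^{y_T/T})$; it uses the conditional form of the change of measure.

```latex
\begin{aligned}
E_0\!\left[ e^{-r(T-t)} S_T X \,\middle|\, \mathcal{F}_t \right]
  &= S_t\, e^{-q(T-t)}\,
     E_0\!\left[ \frac{e^{-(r-q)T} S_T / S_0}{e^{-(r-q)t} S_t / S_0}\, X
     \,\middle|\, \mathcal{F}_t \right] \\
  &= S_t\, e^{-q(T-t)}\, E^*\!\left[ X \,\middle|\, \mathcal{F}_t \right].
\end{aligned}
```

The first line uses $e^{-r(T-t)} S_T = S_t\, e^{-q(T-t)} \cdot e^{-(r-q)(T-t)} S_T / S_t$, and the second is the change of measure applied to the ratio of the densities at times $T$ and $t$. Since $y_t$ is a Markov process under $P^*$ by (2.9), the last conditional expectation depends on $\mathcal{F}_t$ only through $y_t$, which gives the right hand side of (2.11) on the event $\{S_t = S,\ y_t = y\}$.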
Therefore the function $V^E(S, Se^{y/t}, t)/S$ depends on $(y, t)$ only a.s. Now we can define the function $W^E(y, t)$ by
$$W^E(y, t) := E^*\left[ e^{-q(T-t)} \xi(e^{\varphi_{t,T}(y)/T}) \right]. \tag{2.12}$$
The function $V^E$ satisfies $V^E(S, G, t) = S\, W^E(t \log\frac{G}{S}, t)$ a.s. for any $S, G, t$. Further we have
$$\left( L^*(t) + \frac{\partial}{\partial t} - q \right) W^E(y, t) = 0, \qquad 0 < t < T, \tag{2.13}$$
with the terminal condition $W^E(y, T) = \xi(e^{y/T})$.

In the case of the average strike put option, the pay-off functions are given by
$$f(S, G) = \max\{G - S, 0\}, \qquad \xi(e^{y/T}) = \max\{e^{y/T} - 1, 0\}.$$
Then the value function is represented by
$$W^E(y, t) = E^*\left[ e^{-q(T-t)} \max\{e^{\varphi_{t,T}(y)/T} - 1, 0\} \right] = e^{-q(T-t)} \int_{-\infty}^{\infty} \max\{e^{Y/T} - 1, 0\}\, F(y, t, Y, T)\, dY. \tag{2.14}$$
Here $F$ is the transition probability density of the diffusion process $\varphi_{t,T}(y)$, i.e., $F(y, t, Y, T)\, dY = P^*(\varphi_{t,T}(y) \in dY)$. By (2.9), $\varphi_{t,T}(y)$ is normally distributed with mean $y - \mu \int_t^T u\, du$ and variance $\sigma^2 \int_t^T u^2\, du$, so that
$$F(y, t, Y, T) = \frac{1}{\sigma \sqrt{\int_t^T u^2\, du}}\; n\!\left( \frac{Y - y + \mu \int_t^T u\, du}{\sigma \sqrt{\int_t^T u^2\, du}} \right), \tag{2.15}$$
where $\mu = r - q + \frac{\sigma^2}{2}$ and $n$ is the density function of the standard normal distribution.
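Because $\varphi_{t,T}(y)$ is Gaussian, the integral in (2.14) can be evaluated explicitly. The sketch below is an illustration under stated assumptions rather than a formula taken from the paper: it uses the mean $m = y - \mu (T^2 - t^2)/2$ and the standard deviation $s = \sigma\sqrt{(T^3 - t^3)/3}$ implied by (2.9), the elementary identity $E[e^{aY} 1_{Y > 0}] = e^{am + a^2 s^2/2}\, \Phi((m + a s^2)/s)$ for $Y \sim N(m, s^2)$, and compares the result with a plain trapezoidal quadrature of (2.14). The function names and parameter values are hypothetical, and $t < T$ is assumed.

```python
import math


def std_normal_cdf(x):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def flow_moments(y, t, T, r, q, sigma):
    """Mean and standard deviation of phi_{t,T}(y) implied by (2.9)."""
    mu = r - q + 0.5 * sigma ** 2
    mean = y - mu * (T ** 2 - t ** 2) / 2.0          # y - mu * int_t^T u du
    std = sigma * math.sqrt((T ** 3 - t ** 3) / 3.0)  # sigma * sqrt(int_t^T u^2 du)
    return mean, std


def european_value_closed_form(y, t, T, r, q, sigma):
    """W^E(y, t) of (2.14) for the put pay-off, via Gaussian integration."""
    m, s = flow_moments(y, t, T, r, q, sigma)
    a = 1.0 / T
    # E[exp(aY); Y > 0] minus P(Y > 0) for Y ~ N(m, s^2)
    exp_part = math.exp(a * m + 0.5 * a * a * s * s) * std_normal_cdf((m + a * s * s) / s)
    prob_part = std_normal_cdf(m / s)
    return math.exp(-q * (T - t)) * (exp_part - prob_part)


def european_value_quadrature(y, t, T, r, q, sigma, n=20000, width=10.0):
    """Same quantity by the trapezoidal rule on [m - width*s, m + width*s]."""
    m, s = flow_moments(y, t, T, r, q, sigma)
    lo, hi = m - width * s, m + width * s
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        big_y = lo + k * h
        density = math.exp(-0.5 * ((big_y - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * max(math.exp(big_y / T) - 1.0, 0.0) * density
    return math.exp(-q * (T - t)) * total * h


params = dict(t=0.5, T=1.0, r=0.05, q=0.02, sigma=0.3)  # illustrative values
closed = european_value_closed_form(0.1, **params)
numeric = european_value_quadrature(0.1, **params)
```

The two numbers agree up to the quadrature error, which gives a simple consistency check on the moments of the flow and on the normalisation of the density.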
3. American Average Strike Put Options

We shall consider the American Asian option. In this case the holder of the option can exercise the option at any time before the terminal time $T$. The price of the option at time $t$ in the case where $S_t = S$ and $G_t = G$ is given by
$$V^A(S, G, t) := \sup_{\tau \in \mathcal{T}_{t,T}} E_0\left[ e^{-r(\tau - t)} f(S_\tau, G_\tau) \,\middle|\, S_t = S, G_t = G \right], \tag{3.1}$$
where $\mathcal{T}_{t,T}$ is the set of all stopping times with values in the interval $[t, T]$. It is finite for any $S, G, t$ by (2.4). We shall consider the American average strike put option. We can show, as in the case of the European option of the previous section, that the function $V^A(S, Se^{y/t}, t)/S$ does not depend on $S$ a.s. It coincides with the following function $W^A(y, t)$ a.s. for any $(y, t)$:
$$W^A(y, t) := \sup_{\tau \in \mathcal{T}_{t,T}} E^*\left[ e^{-q(\tau - t)} \xi(e^{\varphi_{t,\tau}(y)/\tau}) \right]. \tag{3.2}$$
It holds that $W^A(y, t) \ge \xi(e^{y/t})$ for any $t, y$.

Lemma 3.1. For any $y > 0$, the value function $W^A(y, t)$ is nonincreasing with respect to $t$.

Proof. The function $V^A(S, G, t)$ is nondecreasing with respect to $G$. We want to show that $V^A(S, G, t)$ is nonincreasing with respect to $t$. Set $H_t = t \log G_t = \int_0^t \log S_u\, du$. Then we have
$$V^A(S, G, t) = \sup_{\sigma \in \mathcal{T}_{t,T}} E_0\left[ e^{-r(\sigma - t)} f\!\left(S_\sigma, \exp\!\left(\frac{H_\sigma}{\sigma}\right)\right) \,\middle|\, S_t = S, H_t = t \log G \right].$$
Note that the conditional law of $(S_{u+h}, H_{u+h})$ with respect to $P(\cdot \mid S_{t+h} = S, H_{t+h} = (t + h) \log G)$ is equal to the conditional law of $(S_u, H_u)$ with respect to $P(\cdot \mid S_t = S, H_t = t \log G)$. Then we have for any $h > 0$,
$$V^A(S, G, t + h) = \sup_{\sigma \in \mathcal{T}_{t,T-h}} E_0\left[ e^{-r(\sigma - t)} f\!\left(S_\sigma, \exp\!\left(\frac{H_\sigma}{\sigma}\right)\right) \,\middle|\, S_t = S, H_t = t \log G \right] \le V^A(S, G, t).$$
This shows that $V^A$ is nonincreasing with respect to $t$. Now we transform the function $V^A$ to a function of $t$ and $y = t \log(G/S)$. Then if $y > 0$, $W^A(y, t) = V^A(S, Se^{y/t}, t) S^{-1}$ is nonincreasing with respect to $t$. Indeed, we have
$$\frac{\partial}{\partial t} W^A(y, t) = \frac{\partial V^A}{\partial t}(S, Se^{y/t}, t)\, S^{-1} + \frac{\partial V^A}{\partial G}(S, Se^{y/t}, t)\, \frac{-y}{t^2}\, e^{y/t} \le 0, \qquad \text{if } y > 0. \tag{3.3}$$
The proof is complete.

Define the continuation region and the exercise region by
$$C = \{(y, t);\ W^A(y, t) > \xi(e^{y/t})\}, \qquad E = \{(y, t);\ W^A(y, t) = \xi(e^{y/t})\},$$
respectively. Let $\tau^* = \tau^*_{y,t}$ be the hitting time of the process $\varphi_{t,u}(y)$, $u \in [t, T]$, to the set $E$, i.e.,
$$\tau^* = \tau^*_{y,t} := \inf\{u \in [t, T];\ (\varphi_{t,u}(y), u) \in E\} \wedge T.$$
Then we have
$$W^A(y, t) = E^*\left[ e^{-q(\tau^* - t)} \xi(e^{\varphi_{t,\tau^*}(y)/\tau^*}) \right]. \tag{3.4}$$
(If $E = \emptyset$ we understand that $\tau^* = T$. Hence in this case we have $W^A(y, t) = W^E(y, t)$, where the latter is given by (2.12).) Further we have
$$\left( L^*(t) + \frac{\partial}{\partial t} - q \right) W^A(y, t) = \begin{cases} 0, & (y, t) \in C, \\ \psi(y, t), & (y, t) \in E^o, \end{cases} \tag{3.5}$$
where
$$\psi(y, t) = q - r e^{y/t} - \frac{y}{t^2}\, e^{y/t} \tag{3.6}$$
and $E^o$ denotes the inner points of the set $E$. In the following we will often consider the function $\psi(y, t)$, $0 \le y < \infty$. The function $\psi$ is decreasing with respect to $y$ and $\psi(\infty, t) = -\infty$. If $r < q$, there exists $y_0(t) > 0$ such that $\psi(y, t) > 0$ for $y < y_0(t)$ and $\psi(y, t) < 0$ for $y > y_0(t)$. If $r \ge q$, then $\psi(y, t) < 0$ holds for any $y > 0$. We set $y_0(t) = 0$ in the latter case.

Theorem 3.2. Consider the American average strike put option. The exercise region $E$ is nonempty. For any $t$, its section $E_t := \{y;\ (y, t) \in E\}$ is equal to the upper half line $[y^*_a(t), \infty)$, where $y_0(t) \le y^*_a(t)$ and $y^*_a(t) > 0$. Further, the value function $W^A(y, t)$ is decomposed as
$$W^A(y, t) = e^{-q(T-t)} \int_{-\infty}^{\infty} \max\{e^{Y/T} - 1, 0\}\, F(y, t, Y, T)\, dY + \int_t^T e^{-q(u-t)} \int_{y^*_a(u)}^{\infty} \left( -q + r e^{Y/u} + \frac{Y}{u^2}\, e^{Y/u} \right) F(y, t, Y, u)\, dY\, du. \tag{3.7}$$
The first term of the right hand side coincides with $W^E(y, t)$, the value function of the average strike put option discussed in the previous section. The second term is called the early exercise premium of the holder. It is positive because the integrand is positive.

Proof. We first show that for any $0 < t < T$ the inequality $W^E(y, t) < \xi(e^{y/t})$ holds for sufficiently large $y$. A direct computation yields
$$e^{\varphi_{t,T}(y)/T} = e^{y/T} \exp\left\{ -\frac{1}{2}(r - q)\frac{T^2 - t^2}{T} \right\} \exp\left\{ -\frac{1}{T}\int_t^T u\sigma\, dZ^*_u - \frac{\sigma^2}{4}\frac{T^2 - t^2}{T} \right\}. \tag{3.8}$$
The last factor $\exp\left\{ -\frac{1}{T}\int_t^T u\sigma\, dZ^*_u - \frac{\sigma^2}{4}\frac{T^2 - t^2}{T} \right\}$ is positive and has mean at most 1. Therefore,
$$W^E(y, t) \le E^*\left[ e^{-q(T-t)} e^{\varphi_{t,T}(y)/T} \right] \le e^{y/T} e^{-q(T-t)} \exp\left\{ -\frac{1}{2}(r - q)\frac{T^2 - t^2}{T} \right\} \le e^{y/T} \exp\left\{ -\frac{1}{2} r (T - t) \right\}.$$
Consequently, if $t < T$, $W^E(y, t)/e^{y/t} < 1$ holds for sufficiently large $y$. This implies $W^E(y, t) < \xi(e^{y/t})$ for sufficiently large $y$. Now since $W^A(y, t) \ge \xi(e^{y/t})$ holds for any $y, t$, we get $W^A(y, t) > W^E(y, t)$ for any $t$ and sufficiently large $y$, proving that $E_t$ is nonempty for any $t$.

We want to show that $C$ is connected, which ensures that $C_t$ is an interval and $E_t$ is an upper half line $[y^*_a(t), \infty)$. We follow the argument of Øksendal [9]. Let $U \subset C$ be the connected component including $\{(y, t);\ \psi(y, t) > 0\}$. (If the above set is empty, take any connected component in $C$.) Then $U = C$ holds. Indeed, if $C$ is not connected there exists another connected component $V$ in $C$. Let $\tau$ be the leaving time from the set $V$. Then for $(y, t) \in V$ we have $(L^*(t) + \frac{\partial}{\partial t} - q)\xi(e^{y/t}) = \psi(y, t) \le 0$. Therefore
$$E^*\left[ e^{-q(T \wedge \tau - t)} \xi(e^{\varphi_{t,T\wedge\tau}(y)/(T \wedge \tau)}) \right] = \xi(e^{y/t}) + E^*\left[ \int_t^{T \wedge \tau} e^{-q(u-t)} \left( L^*(u) + \frac{\partial}{\partial u} - q \right) \xi(e^{\varphi_{t,u}(y)/u})\, du \right] \le \xi(e^{y/t}).$$
The left hand side of the above is equal to $W^A(y, t)$. Therefore we have $W^A(y, t) \le \xi(e^{y/t})$, which contradicts $(y, t) \in C$. Now the section $C_t$ is a one dimensional connected set (interval), which includes $(-\infty, y_0(t))$. Then $E_t$ is an upper half line $[y^*_a(t), \infty)$, where $y_0(t) \le y^*_a(t)$. Further, $E_t$ does not contain 0, so that $y^*_a(t) > 0$.

The function $W^A(y, t)$ is locally Lipschitz continuous with respect to $(y, t)$ (Kunita-Seko [5]). For any $t$, the Radon-Nikodym derivative $W^A_y(y, t)$ (with respect to $y$) is continuous with respect to $y$ except possibly at the critical point $y^*_a(t)$. At the critical point $y = y^*_a(t)$, we have $W^A_y(y-, t) = W^A_y(y+, t)$ by the contact property. Furthermore, for any $t$, $W^A_y(y, t)$ is continuously differentiable with respect to $y$ for $y \ne y^*_a(t)$. Then $W^A_y(y, t)$ is absolutely continuous with respect to $y$ for any $t$. We can apply Itô's formula (Theorem 6.4 in Appendix) to the value function $W^A(y, t)$. Then we have
$$e^{-q(T-t)} W^A(\varphi_{t,T}(y), T) - W^A(y, t) = -\int_t^T e^{-q(u-t)} W^A_y(\varphi_{t,u}(y), u)\, \sigma u\, dZ^*(u) + \int_t^T e^{-q(u-t)} \left( L^*(u) + \frac{\partial}{\partial u} - q \right) W^A(\varphi_{t,u}(y), u)\, du.$$
Since W A (ϕt,T (y), T) = ξ(eϕt,T (y)/T ) holds, we get E
T
A
W (y, t) − W (y, t) = t
E∗ [e−q(u−t) ψ(ϕt,u (y), u)1 y(u)>y∗a (u) ]du.
Therefore the value function is written by (3.7). Finally the value function VA (S, G, t) can be computed through the formula V A (S, G, t) = SW A (t log GS , t). 4. Average Strike Put Options of the Game Type We shall consider the game type Asian option. The pay-off function of the holder of the option is given by f (S, G) and that of the writer is given by fδ (S, G) = f (S, G) + δ(S, G) (> f (S, G)), where δ(S, G) denotes the penalty payed by the writer in the case where he cancel the option. The price of the option is then given by V(S, G, t) := inf sup JS,G,t(σ, τ), σ∈Tt,T τ∈T t,T
where JS,G,t (σ, τ) := E0 [e−r(σ∧τ−t) { f (Sτ , Gτ )1τ≤σ + fδ (Sσ , Gσ )1σ<τ }|St = S, Gt = G]. It is finite for any S, G, t by (2.6). Further it holds f (S, G) ≤ V(S, G, t) ≤ fδ (S, G) for any S, G, t. Suppose that the pay-off functions satisfy f (S, G) = Sξ(G/S) and fδ (S, G) = Sξδ (G/S). Then we have V(S, Se y/t , t) = inf sup E∗ [e−q(σ∧τ−t) S{ξ(eϕt,τ (y)/τ )1τ≤σ + ξδ (eϕt,σ (y)/σ )1σ<τ }]. σ∈Tt,T τ∈T t,T
Therefore V(S, Se y/t , t)/S is a function depending only on (y, t). It coincides with the following W(y, t) a.s. for any (y, t) W(y, t) := inf sup J∗y,t (σ, τ),
for any $y, t$. Now we define the writer's cancellation region, the holder's exercise region and the continuation region by
$$E^A = \{(y,t);\ W(y,t) = \xi_\delta(e^{y/t})\},\qquad E^B = \{(y,t);\ W(y,t) = \xi(e^{y/t})\},\qquad C^G = \{(y,t);\ \xi(e^{y/t}) < W(y,t) < \xi_\delta(e^{y/t})\}, \tag{4.3}$$
respectively. Let $\sigma^* = \sigma^*_{y,t}$ and $\tau^* = \tau^*_{y,t}$ be the hitting times of $\varphi_{t,u}(y)$, $u\in[t,T]$, to the sets $E^A$, $E^B$, respectively. Then it holds
$$W(y,t) = J^*_{y,t}(\sigma^*,\tau^*) \tag{4.4}$$
for any $y, t$. (If $E^A = \emptyset$ then $\sigma^* = T$ holds. In this case we have $W(y,t) = W^A(y,t)$ for any $(y,t)$.)

Now we shall consider the average strike put option of the game type. The pay-off functions are defined by
$$\xi(e^{y/t}) = \max\{e^{y/t}-1,\,0\},\qquad \xi_\delta(e^{y/t}) = \xi(e^{y/t}) + \delta,$$
where $\delta$ is a positive constant. The value function of the American average strike put option $W^A(y,t)$ is a strictly decreasing function of $t$. It converges to the pay-off function $\xi(e^{y/T})$ as $t\to T$. Therefore $W^A(0,t)$ decreases to 0 as $t$ increases to $T$. Hence if $W^A(0,0) > \delta$, then there exists a unique $0\le t^*_\delta < T$ such that $W^A(0,t) < \delta$ for $t > t^*_\delta$ and $W^A(0,t) > \delta$ for $t < t^*_\delta$. If $W^A(0,0) < \delta$, then $W^A(0,t) < \delta$ holds for all $t$. In this case we set $t^*_\delta = 0$.

Theorem 4.1. Consider the average strike put option of the game type. Assume $0\le q\le r$.
1) $E^B$ is a nonempty closed set and the section $E^B_t$ is equal to the upper half line $[y^*_\delta(t),\infty)$, where $0 < y^*_\delta(t)\le y^*_a(t)$.
2) The writer's cancellation region $E^A$ is equal to $E_0 := \{(0,t);\ t\le t^*_\delta\}$, i.e.,
$$E^A_t = \begin{cases}\{0\}, & \text{if } t\le t^*_\delta,\\ \emptyset, & \text{if } t > t^*_\delta.\end{cases} \tag{4.5}$$

The first assertion on the exercise region $E^B$ can be verified similarly to Theorem 3.2. In order to prove the second assertion, we prepare two lemmas. We consider the function (4.6)
$$\tilde W(y,t) := \sup_{\tau} J^*_{y,t}(\sigma_0,\tau),$$
where $\sigma_0$ is the hitting time to the set $E_0$. We want to prove $\tilde W(y,t) = W(y,t)$. Note first that $\tilde W(y,t) = \sup_\tau J^*_{y,t}(T,\tau) = W^A(y,t)$ holds if $t > t^*_\delta$, where $W^A(y,t)$ is the value function of the American average strike put option. We consider the case where $t\le t^*_\delta$.
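The inf-sup value that defines the game option at the start of this section can be illustrated numerically. The following is only a minimal sketch under simplifying assumptions that are not in the text: it replaces the average strike by a fixed strike `strike` (so no averaging state variable is needed), uses a Cox-Ross-Rubinstein binomial tree, and takes a constant penalty `delta`. The function name and parameters are hypothetical. At each node the holder maximises and the writer minimises, so the backward step is $\min\{f+\delta,\ \max\{f,\ \text{continuation}\}\}$.

```python
import math


def game_put_binomial(s0, strike, r, sigma, maturity, steps, delta):
    """Value of a fixed-strike game put on a CRR binomial tree.

    Holder payoff: f = max(strike - S, 0).
    Writer cancellation payoff: f + delta (delta > 0 is the penalty).
    """
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    disc = math.exp(-r * dt)
    p = (math.exp(r * dt) - down) / (up - down)  # risk-neutral up probability

    def payoff(ups, n):
        # Holder payoff at the node reached by `ups` up-moves in n steps.
        return max(strike - s0 * up**ups * down**(n - ups), 0.0)

    # At maturity the option pays the holder payoff.
    values = [payoff(j, steps) for j in range(steps + 1)]
    for n in range(steps - 1, -1, -1):
        new_values = []
        for j in range(n + 1):
            cont = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            f = payoff(j, n)
            # Dynkin game step: the value is squeezed between f and f + delta.
            new_values.append(min(f + delta, max(f, cont)))
        values = new_values
    return values[0]


if __name__ == "__main__":
    game = game_put_binomial(100.0, 100.0, 0.05, 0.2, 1.0, 200, delta=2.0)
    american = game_put_binomial(100.0, 100.0, 0.05, 0.2, 1.0, 200, delta=1e9)
    print(game, american)
```

With a very large penalty the writer never cancels and the recursion reduces to the American put, which mirrors the remark that $W = W^A$ when the cancellation region is empty. With a small penalty the value at the money is capped by $\delta$, in line with the bounds $f\le V\le f_\delta$.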
Lemma 4.2. The function $\tilde W(y,t)$ is nondecreasing with respect to $y$. Further, for any $y > 0$ it is nonincreasing with respect to $t$.

Proof. Since the pay-off functions $\xi$ and $\xi_\delta$ are nondecreasing functions of $y$, the function $J^*_{y,t}(\sigma_0,\tau)$ is also nondecreasing with respect to $y$. Then $\tilde W(y,t)$ is also nondecreasing with respect to $y$. In order to prove the decreasing property with respect to $t$, we consider
$$\tilde V(S,G,t) := \sup_{\tau} J_{S,G,t}(\sigma_0,\tau).$$
Note that the law of $(S_{u+h},H_{u+h})$ with respect to $P(\cdot\,|\,S_{t+h}=S,\ H_{t+h}=(t+h)\log G)$ is equal to the law of $(S_u,H_u)$ with respect to $P(\cdot\,|\,S_t=S,\ H_t=t\log G)$. Then we have for any $h > 0$,
$$\tilde V(S,G,t+h) = \sup_{\tau\in\mathcal T_{t,T-h}} E_0\bigl[e^{-r(\tau\wedge\sigma_0-t)}\{f(S_{\tau\wedge\sigma_0},\exp(\cdots))\cdots\}\bigr] \le \tilde V(S,G,t).$$
This shows that $\tilde V$ is nonincreasing with respect to $t$. Now we transform the function $\tilde V$ to a function of $t$ and $y = t\log(G/S)$. The function $\tilde V(S,Se^{y/t},t)S^{-1}$ coincides with $\tilde W(y,t)$. We can show that it is nonincreasing with respect to $t$ for any $y > 0$, by using an equation similar to (3.3).

Lemma 4.3. Assume $0\le q\le r$. For any $t$, it holds $\tilde W(y,t) < \xi_\delta(e^{y/t})$ if $y\ne 0$.

Proof. The function $\tilde W$ satisfies $\xi(e^{y/t})\le\tilde W(y,t)$ for any $y, t$. Set $\tilde E = \{(y,t);\ \tilde W(y,t) = \xi(e^{y/t})\}$. We can show that $\tilde E_t$ is an upper half line $[\tilde y^*(t),\infty)$ where $\tilde y^*(t) > 0$, similarly to the case of the American option. We want to show that $\tilde W(y,t) < \xi_\delta(e^{y/t})$ holds for any $y > 0$. Fixing $t$, consider the function $\tilde U(y) := \tilde W(y,t) - \xi(e^{y/t})$. It satisfies the boundary condition (terminal condition) $\tilde U(\tilde y^*) = \tilde U_y(\tilde y^*) = 0$, since at the boundary point $y = \tilde y^*$ the contact property $\tilde W_y(y,t) = \frac{\partial}{\partial y}(\xi(e^{y/t}))$ holds. Further, for $y < \tilde y^*(t)$ it satisfies the second order ordinary differential equation
$$(L^*(t)-q)\tilde U(y) = K(y),$$
(4.8) where
$$K(y) = -\frac{\partial\tilde W}{\partial t}(y,t) - \{q - re^{y/t}\}\mathbf 1_{\{0<y\}}.$$
The above $K$ is positive because $\frac{\partial\tilde W}{\partial t}(y,t)\le 0$ by Lemma 4.2 and $q - re^{y/t} < -(r-q) < 0$. The solution of the differential equation (4.8) is represented by
$$\tilde U(y) = C_1e^{\lambda_1(y-\tilde y^*)} + C_2e^{\lambda_2(y-\tilde y^*)} + \frac{1}{\lambda_1-\lambda_2}\int_{\tilde y^*}^{y}K(z)\bigl(e^{\lambda_1(y-z)} - e^{\lambda_2(y-z)}\bigr)dz, \tag{4.9}$$
where $\lambda_1 > 0\ge\lambda_2$ and $e^{\lambda_1(y-\tilde y^*)}$ and $e^{\lambda_2(y-\tilde y^*)}$ are fundamental solutions of the homogeneous equation $(L^*(t)-q)U = 0$. The boundary conditions imply $C_1 = C_2 = 0$. Therefore we get
$$\tilde U(y) = \frac{1}{\lambda_1-\lambda_2}\int_{\tilde y^*}^{y}K(z)\bigl(e^{\lambda_1(y-z)} - e^{\lambda_2(y-z)}\bigr)dz,\qquad -\infty < y < \tilde y^*.$$
Then $\tilde U(y) > 0$ holds for $0 < y < \tilde y^*$. Further we have
$$\tilde U_y(y) = \frac{1}{\lambda_1-\lambda_2}\int_{\tilde y^*}^{y}K(z)\bigl(\lambda_1e^{\lambda_1(y-z)} - \lambda_2e^{\lambda_2(y-z)}\bigr)dz,$$
which is negative for any $-\infty < y < \tilde y^*$. Therefore $\tilde U$ is strictly decreasing. Since $\tilde U(0) = \delta$, we have $\tilde U(y) < \delta$ for any $0 < y < \tilde y^*$, proving $\tilde W(y,t) < \xi_\delta(e^{y/t})$ for any $0 < y < \tilde y^*$. Next, if $y < 0$, we always have $\tilde W(y,t) < \delta = \xi_\delta(e^{y/t})$. The proof is complete.

Proof of Theorem 4.1. Assume that $t > t^*_\delta$. Then we have the inequalities $W(y,t)\le W^A(y,t) < \xi_\delta(e^{y/t})$ for any $t$. Therefore $E^A = \emptyset$ and we have $W(y,t) = W^A(y,t)$ for any $y$. Assume next that $t\le t^*_\delta$. The value function $W(y,t)$ of the game option is less than or equal to $\tilde W(y,t)$. Therefore the function $W(y,t)$ satisfies the inequality $W(y,t) < \xi_\delta(e^{y/t})$ for any $y\ne 0$. This proves that $E^A_t = \{0\}$ if $t\le t^*_\delta$. The proof is complete.
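The integral representation used in the proof of Lemma 4.3 can be checked numerically. The sketch below is an illustration only and rests on an assumption that is not stated in the text: the operator $L^*(t)-q$ is taken in normalised form with leading coefficient one, so that the equation reads $U'' - (\lambda_1+\lambda_2)U' + \lambda_1\lambda_2U = K$. The right-hand side `K`, the roots and the boundary point `y_star` are hypothetical test values. The code evaluates the oriented integral by the midpoint rule and measures the residual of the differential equation by central differences.

```python
import math


def particular_solution(K, lam1, lam2, y_star, y, n=2000):
    """U(y) = 1/(lam1-lam2) * int_{y_star}^{y} K(z)(e^{lam1(y-z)} - e^{lam2(y-z)}) dz.

    Composite midpoint rule; the integral is oriented, so for y < y_star the
    step h is negative.
    """
    h = (y - y_star) / n
    total = 0.0
    for i in range(n):
        z = y_star + (i + 0.5) * h
        total += K(z) * (math.exp(lam1 * (y - z)) - math.exp(lam2 * (y - z)))
    return total * h / (lam1 - lam2)


def ode_residual(K, lam1, lam2, y_star, y, eps=1e-3):
    """Residual of U'' - (lam1+lam2) U' + lam1*lam2 U = K at the point y."""
    def U(v):
        return particular_solution(K, lam1, lam2, y_star, v)

    d1 = (U(y + eps) - U(y - eps)) / (2.0 * eps)
    d2 = (U(y + eps) - 2.0 * U(y) + U(y - eps)) / eps**2
    return d2 - (lam1 + lam2) * d1 + lam1 * lam2 * U(y) - K(y)


if __name__ == "__main__":
    K = lambda z: 1.0 + z * z          # a positive right-hand side
    lam1, lam2, y_star = 1.5, -0.5, 1.0
    for y in (0.2, 0.4, 0.8):
        print(y, particular_solution(K, lam1, lam2, y_star, y),
              ode_residual(K, lam1, lam2, y_star, y))
```

For positive `K` and $\lambda_1 > 0\ge\lambda_2$ the computed values are positive and decrease as `y` increases towards `y_star`, which is the sign pattern the proof relies on.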
5. Early Exercise Premium and Early Cancellation Fee

Lemma 5.1. The value function of the average strike put option of game type satisfies
$$\frac{\partial W}{\partial y}(0-,t) = \frac{\partial W}{\partial y}(0+,t),\qquad\text{if } t\ge t^*_\delta, \tag{5.1}$$
$$\frac{\partial W}{\partial y}(0-,t) < \frac{\partial W}{\partial y}(0+,t),\qquad\text{if } t < t^*_\delta. \tag{5.2}$$

Proof. Let $\tau^*$ be the hitting time to the set $E^B$ and $\sigma_0$ be the hitting time to the set $E_0$ as before. Set $\hat W(y,t) := J^*_{y,t}(T,\tau^*)$ and consider the function $\hat U(y,t) := \hat W(y,t) - W(y,t)$. It is a nonnegative function. If $t\ge t^*_\delta$, $\hat U(y,t) = 0$ holds for any $y$. Then $W(y,t) = \hat W(y,t)$ is continuously differentiable at $y = 0$. Therefore we get (5.1).

We shall consider the case where $t < t^*_\delta$. Then the function $\hat U(y,t)$ is positive for $y < y^*_\delta(t)$. The function $\hat U$ is represented by
$$\hat U(y,t) = E^*\bigl[e^{-q(\sigma_0-t)}\{\hat W(\varphi_{t,\sigma_0}(y),\sigma_0) - \xi_\delta(e^{\varphi_{t,\sigma_0}(y)/\sigma_0})\}\mathbf 1_{\sigma_0<\tau^*}\bigr] = E^*\bigl[e^{-q(\sigma_0-t)}(\hat W(0,\sigma_0) - \delta)\mathbf 1_{\sigma_0<\tau^*}\bigr].$$
For any $y > 0$, it is decreasing as $t$ is increasing and for any $t$ it is decreasing with respect to $y > 0$. Further for any $t$, it is increasing with respect to $y < 0$. Now the function $\hat U(y) := \hat U(y,t)$, $0 < y < y^*_\delta$, is represented by
$$\hat U(y) = C_1e^{\lambda_1(y-y^*_\delta)} + C_2e^{\lambda_2(y-y^*_\delta)} + \frac{1}{\lambda_1-\lambda_2}\int_{y^*_\delta}^{y}L(z)\bigl(e^{\lambda_1(y-z)} - e^{\lambda_2(y-z)}\bigr)dz, \tag{5.3}$$
where
$$L(z) = -\frac{\partial\hat U}{\partial t}(z,t)\ge 0.$$
Further it satisfies the boundary (terminal) condition $\hat U(y^*_\delta) = 0$ and $\hat U_y(y^*_\delta)\le 0$. Therefore we have $C_1 + C_2 = 0$ and $C_1\lambda_1 + C_2\lambda_2\le 0$. This implies $C_1\le 0$ and $C_2 = -C_1$. Then we have
$$\hat U(y) = \bigl(e^{\lambda_1(y-y^*_\delta)} - e^{\lambda_2(y-y^*_\delta)}\bigr)C_1 + \frac{1}{\lambda_1-\lambda_2}\int_{y^*_\delta}^{y}L(z)\bigl(e^{\lambda_1(y-z)} - e^{\lambda_2(y-z)}\bigr)dz.$$
If $C_1 < 0$, the first term of the right hand side is positive. If $C_1 = 0$, the last term of the above is positive, since $\hat U(0) > 0$. In the latter case the inequality $L(z) > 0$ holds on some interval.
Now the derivative $\hat U_y$ is written by
$$\hat U_y(y) = \bigl(\lambda_1e^{\lambda_1(y-y^*_\delta)} - \lambda_2e^{\lambda_2(y-y^*_\delta)}\bigr)C_1 + \frac{1}{\lambda_1-\lambda_2}\int_{y^*_\delta}^{y}L(z)\bigl(\lambda_1e^{\lambda_1(y-z)} - \lambda_2e^{\lambda_2(y-z)}\bigr)dz.$$
Both terms of the right hand side are nonpositive. Further if $C_1 < 0$, the first term is negative and if $C_1 = 0$, the second term is negative. This implies $\hat U_y(0+,t) < 0$. On the other hand for $y < 0$, we have $\hat U_y(0-,t)\ge 0$, since $\hat U(y,t)$ is a nondecreasing function of $y < 0$. These two inequalities imply (5.2) because the function $\hat W(y,t)$ is continuously differentiable at $y = 0$.

Theorem 5.2. Assume $0\le q\le r$. If $t\ge t^*_\delta$, we have $W(y,t) = W^A(y,t)$, where $W^A(y,t)$ is the value function of the American average strike put option. If $t < t^*_\delta$, we have $W(y,t) < W^A(y,t)$. Further $W(y,t)$ is decomposed as
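$$W(y,t) = e^{-q(T-t)}\int_{-\infty}^{\infty}\max\{e^{Y/T}-1,\,0\}F(y,t,Y,T)\,dY + \int_t^T e^{-q(u-t)}\int_{y^*_\delta(u)}^{\infty}\Bigl(-q + re^{Y/u} + \frac{Y}{u^2}e^{Y/u}\Bigr)F(y,t,Y,u)\,dY\,du - \frac{\sigma^2}{2}\int_t^T e^{-q(u-t)}u^2\Bigl(\frac{\partial W}{\partial y}(0+,u) - \frac{\partial W}{\partial y}(0-,u)\Bigr)F(y,t,0,u)\,du. \tag{5.4}$$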
Remark. The first term of (5.4) coincides with the value of the European option. The second term is called the early exercise premium of the holder and the third term is called the early cancellation fee of the writer.

Proof. We apply Itô's formula to the function $W(y,t)$. It is locally Lipschitz continuous in $(y,t)$. The Radon-Nikodym derivative $W_y(y,t)$ (with respect to $y$) is a function of bounded variation with respect to $y$. It admits the Lebesgue decomposition: $dW_y(y,t) = W_{yy}(y,t)dy + \{W_y(y+,t) - W_y(y-,t)\}\delta_0(dy)$, where $\delta_0$ is the delta function concentrated at $y = 0$. Then setting $\varphi_u = \varphi_{t,u}$ we have by Theorem 6.4 in the Appendix
$$e^{-q(T-t)}W(\varphi_T(y),T) = W(y,t) - \int_t^T e^{-q(u-t)}W_y(\varphi_u(y),u)\,\sigma u\,dZ^*(u) + \int_t^T e^{-q(u-t)}\Bigl(L^*(u) + \frac{\partial}{\partial u} - q\Bigr)W(\varphi_u(y),u)\,du + \int_t^T e^{-q(u-t)}\sigma^2u^2\Bigl(\frac{\partial W}{\partial y}(0+,u) - \frac{\partial W}{\partial y}(0-,u)\Bigr)dL^0_u(y),$$
where $L^0_u(y)$ is the local time of the flow $\varphi_u(y)$ at the level 0. The expectation of the left-hand side is equal to $W^E(y,t)$. Then we get
$$W^E(y,t) = W(y,t) + E^*\Bigl[\int_t^T e^{-q(u-t)}\psi(e^{\varphi_u(y)/u})\mathbf 1_{\{\varphi_u(y)>y^*_\delta(u)\}}\,du\Bigr] + E^*\Bigl[\int_t^T e^{-q(u-t)}\sigma^2u^2\Bigl(\frac{\partial W}{\partial y}(0+,u) - \frac{\partial W}{\partial y}(0-,u)\Bigr)dL^0_u(y)\Bigr]. \tag{5.5}$$
The second term of the right hand side is computed by
$$\int_t^T\int_{y^*_\delta(u)}^{\infty}e^{-q(u-t)}\Bigl(q - re^{Y/u} - \frac{Y}{u^2}e^{Y/u}\Bigr)F(y,t,Y,u)\,dY\,du.$$
The last term is
$$\frac{\sigma^2}{2}\int_t^T e^{-q(u-t)}u^2\Bigl(\frac{\partial W}{\partial y}(0+,u) - \frac{\partial W}{\partial y}(0-,u)\Bigr)F(y,t,0,u)\,du.$$
Here we used the equality
$$E^*[L^0_v(y)] = \frac12\int_t^v F(y,t,0,u)\,du.$$
6. Appendix: A Generalized Itô's Formula and Local Time of a One Dimensional Stochastic Flow

Consider a stochastic differential equation on $\mathbb R^d$:
$$d\varphi_t = \sum_{j=0}^{m}v_j(t,\varphi_t)\,dB^j(t), \tag{6.1}$$
where $v_j(t,x) = (v^1_j(t,x),\dots,v^d_j(t,x))$, $j = 0,1,\dots,m$, are continuous in $(t,x)\in[0,T]\times\mathbb R^d$ and $C^\infty$ in $x\in\mathbb R^d$ such that their derivatives of any order are bounded. $B^0(t) = t$ and $B(t) = (B^1(t),\dots,B^m(t))$ is a standard Brownian motion. Let $\varphi_t\equiv\varphi_{s,t}(x)$, $t\in[s,\infty)$, be the solution of (6.1) starting from $x$ at time $s$. It is a diffusion process with the infinitesimal generator
$$L_tu(x) = \Bigl(\frac12\sum_{i,j}a^{ij}(t,x)\frac{\partial^2}{\partial x_i\partial x_j} + \sum_i v^i_0(t,x)\frac{\partial}{\partial x_i}\Bigr)u(x), \tag{6.2}$$
where $a^{ij}(t,x) = \sum_{k=1}^{m}v^i_k(t,x)v^j_k(t,x)$. It is known that a suitable modification of the family $\{\varphi_{s,t}(x),\ 0\le s < t < \infty,\ x\in\mathbb R^d\}$ defines a stochastic flow of diffeomorphisms of $\mathbb R^d$.
Let $\mathcal D$ be the space of $C^\infty$-functions on $\mathbb R^d$ with compact supports equipped with the Schwartz topology and let $\mathcal D'$ be its topological dual space. Elements of $\mathcal D'$ are called (Schwartz) distributions. Each $T\in\mathcal D'$ is a continuous linear map on $\mathcal D$. If $T$ is a signed measure which is absolutely continuous with respect to the Lebesgue measure $dx$, say $T = f(x)dx$, it holds $T(h) = \int_{\mathbb R^d}f(x)h(x)dx$ for any $h\in\mathcal D$. For any given Schwartz distribution $T$ and any $s < t$, we define the composition of $T$ with the $C^\infty$-diffeomorphism $\varphi_{s,t}$, denoted by $T\circ\varphi_{s,t}$, as a random distribution ($\mathcal D'$-valued random variable). If $T = f(x)dx$ and $f$ is continuous, then it is defined by $f(\varphi_{s,t}(x))dx$. It satisfies
$$\int f(\varphi_{s,t}(x))h(x)\,dx = \int f(y)h(\varphi^{-1}_{s,t}(y))J_{s,t}(y)\,dy$$
by the formula of the change of variables, where $J_{s,t}(y)$ is the Jacobian of the map $\varphi^{-1}_{s,t}$. In general, we can define $T\circ\varphi_{s,t}$ as a $\mathcal D'$-valued random variable through the relation
$$T\circ\varphi_{s,t}(h) := T\bigl(h\circ\varphi^{-1}_{s,t}\,J_{s,t}\bigr),\qquad\forall h\in\mathcal D. \tag{6.3}$$
We extend Itô's formula to distribution valued processes. In the following theorem, $S_{k,2}$ denotes a suitable subspace of $\mathcal D'$ equipped with a Hilbertian norm. For the definition of stochastic integrals for $S_{k,2}$-valued processes etc., see [4].

Theorem 6.1. (cf. Theorem 3.1 in [4]) Let $X(t)$ be a continuous $S_{k,2}$-valued process of bounded variation. Then
$$X(t)\circ\varphi_{s,t} = X(s) + \int_s^t X(du)\circ\varphi_{s,u} + \sum_i\int_s^t\frac{\partial X(u)}{\partial x_i}\circ\varphi_{s,u}\,d\varphi^i_{s,u} + \int_s^t L^0_uX(u)\circ\varphi_{s,u}\,du, \tag{6.4}$$
where
$$L^0_u = \frac12\sum_{i,j}a^{ij}(u,x)\frac{\partial^2}{\partial x_i\partial x_j}$$
and the derivative in the above formula is taken in the sense of distributions (weak derivative).

From now on we will restrict our attention to the one dimensional stochastic flow $\varphi_{s,t}(x)$. A stochastic process $L^y_{s,t}(x)$, $t\in[s,\infty)$, with parameters $x, y\in\mathbb R$ is called the local time of the flow $\varphi_{s,t}(x)$ at level $y\in\mathbb R$ if it is continuous in $(s,t,x,y)$ a.s. and satisfies
$$\int_s^t f(\varphi_{s,u}(x))\,du = 2\int_{\mathbb R}f(y)L^y_{s,t}(x)\,dy \tag{6.5}$$
for any $f$. Hence if such a local time exists, it satisfies
$$\int_{\mathbb R}L^y_{s,t}(x)h(x)\,dx = \frac12\Bigl(\int_s^t\delta_y\circ\varphi_{s,u}\,du\Bigr)(h), \tag{6.6}$$
where $\delta_y\in\mathcal D'$ is Dirac's delta function concentrated at the point $y$ and $\int_s^t\delta_y\circ\varphi_u\,du$ is a random distribution. Conversely, if a stochastic process $L^y_{s,t}(x)$ with parameter $(s,x,y)$, continuous in $(s,t,x,y)$ a.s., satisfies (6.6) for any $h\in\mathcal D$, then it satisfies (6.5). Hence it is a local time. Thus if the local time exists, the random distributions $\int_s^t\delta_y\circ\varphi_{s,u}\,du$ are absolutely continuous with respect to the Lebesgue measure and their density function $\bigl(\int_s^t\delta_y\circ\varphi_{s,u}\,du\bigr)(x)$ is continuous in $(s,t,x,y)$ and coincides with twice the local time, i.e., we have
$$L^y_{s,t}(x) = \frac12\Bigl(\int_s^t\delta_y\circ\varphi_{s,u}\,du\Bigr)(x) \tag{6.7}$$
for any $(s,t,x,y)$ a.s. The above formula was pointed out by Ogawa [8] with the name of B-shift in the case where the corresponding flow is the shift of a Brownian motion. The local time is a continuous increasing process of $t$ for any $s, x, y$. Further, since $\varphi_{s,t}$ has the flow property $\varphi_{s,u}(x) = \varphi_{t,u}(\varphi_{s,t}(x))$ for any $s < t < u$, the local time has the additive property
$$L^y_{s,u}(x) = L^y_{s,t}(x) + L^y_{t,u}(\varphi_{s,t}(x)),\qquad\forall s < t < u,\ \forall x, y. \tag{6.8}$$

Lemma 6.2. (Kunita [4]) Assume that $a(t,x) > 0$ for any $t, x$.
1) The transition probability $P_{s,t}(x,dy)$ of the diffusion $\varphi_{s,t}(x)$ has a smooth density $p_{s,t}(x,y)$ with respect to the Lebesgue measure.
2) The local time exists. Further, we have
$$E[L^y_{s,t}(x)] = \frac12\int_s^t p_{s,u}(x,y)\,du$$
for any $s < t$ and $x, y$.
3) It satisfies the following Tanaka's formula.
$$(\varphi_{s,t}(x)-y)^+ = (x-y)^+ + \int_s^t\mathbf 1_{[y,\infty)}(\varphi_{s,u}(x))\,d_u\varphi_{s,u}(x) + \int_s^t a(u,y)\,d_uL^y_{s,u}(x). \tag{6.9}$$
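As a consistency check of the normalisation in Lemma 6.2, consider the simplest case, which is chosen here for illustration and is not taken from the text: $\varphi_{0,t}(0) = B_t$ is a standard one dimensional Brownian motion, so that $a\equiv 1$ and $p_{0,u}(0,0) = (2\pi u)^{-1/2}$. Part 2) of the lemma and Tanaka's formula (6.9) with $x = y = 0$ then give the same expected local time:

```latex
\begin{align*}
E\bigl[L^0_{0,t}(0)\bigr]
  &= \frac12\int_0^t \frac{du}{\sqrt{2\pi u}}
   = \sqrt{\frac{t}{2\pi}}, \\
E\bigl[(B_t)^+\bigr]
  &= \frac12\,E|B_t|
   = \frac12\sqrt{\frac{2t}{\pi}}
   = \sqrt{\frac{t}{2\pi}} .
\end{align*}
```

The stochastic integral in (6.9) has zero expectation, so the second line is exactly $E[L^0_{0,t}(0)]$. In this normalisation the local time is one half of the local time $\Lambda$ appearing in the common form $|B_t| = \int_0^t\operatorname{sgn}(B_s)\,dB_s + \Lambda_t$, which also explains the factor 2 in (6.5).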
Corollary 6.3. Let $T(t)$, $t\in[0,\infty)$, be a family of distributions such that $T(t)(h)$ is continuous in $t$ for any $h\in\mathcal D$ and is written as $T(t)(h) = \int h(y)\mu(t,dy)$, where $\mu(t,dy)$ are bounded signed measures. Then the random distribution $\int_s^t T(u)\circ\varphi_{s,u}\,du$ is a signed measure and has a density function $\bigl(\int_s^t T(u)\circ\varphi_{s,u}\,du\bigr)(x)$ with respect to $dx$ a.s. and it satisfies
$$\int_s^t\int_{\mathbb R}d_uL^y_{s,u}(x)\,\mu(u,dy) = \frac12\Bigl(\int_s^t T(u)\circ\varphi_{s,u}\,du\Bigr)(x) \tag{6.10}$$
for any $(s,t,x)$ a.s.

Theorem 6.4. Let $f(t,x)$ be a locally Lipschitz continuous function. Let $f_t$ and $f_x$ be Radon-Nikodym derivatives of $f$ with respect to $t$ and $x$, respectively. Assume that for any $t$, $f_x(t,x)$ is a function of bounded variation with respect to $x$ and is written as $df_x(t,x) = f_{xx}(t,x)dx + g(t,x)\lambda(dx)$, where $\lambda$ is a singular bounded measure (Lebesgue decomposition) and $f_{xx}(t,x)$ and $g(t,x)$ are locally bounded functions. Then we have
$$f(t,\varphi_{s,t}(x)) = f(s,x) + \int_s^t f_t(u,\varphi_{s,u}(x))\,du + \int_s^t L_uf(u,\varphi_{s,u}(x))\,du + \int_s^t\int_{\mathbb R}a(u,y)g(u,y)\,d_uL^y_{s,u}(x)\,\lambda(dy) + \sum_{j=1}^{m}\int_s^t v_j(u,\varphi_{s,u}(x))f_x(u,\varphi_{s,u}(x))\,dB^j(u), \tag{6.11}$$
where
$$L_uf(u,x) = \frac12a(u,x)f_{xx}(u,x) + v_0(u,x)f_x(u,x). \tag{6.12}$$

Proof. By truncating the function $f$ if necessary, we may regard $X(t) = f(t,x)dx$ as an $S_{k,2}$-valued process of bounded variation. We set $\varphi_u = \varphi_{s,u}$ and $L^y_u(x) = L^y_{s,u}(x)$. We have clearly
$$\Bigl(\int_s^t L^0_uX(u)\circ\varphi_u\,du\Bigr)(x) = \frac12\int_s^t a(u,\varphi_u(x))f_{xx}(u,\varphi_u(x))\,du + \int_s^t\int_{\mathbb R}a(u,y)g(u,y)\,dL^y_u(x)\,\lambda(dy).$$
Summing up all these computations, we get the formula (6.11).

Remark. In the case where the vector fields $v_j$ and the function $f$ do not depend on $t$, the formula can be written more simply. Indeed, setting $X_t = \varphi_{0,t}(x)$, $L^y_t = L^y_{0,t}(x)$, we have
$$f(X_t) = f(X_0) + \int_0^t f_x(X_s)\,dX_s + \frac12\int_0^t a(X_s)f_{xx}(X_s)\,ds + \int_{\mathbb R}a(y)g(y)L^y_t\,\lambda(dy). \tag{6.13}$$
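Two special cases, added here as an illustration rather than taken from the text, show how (6.13) is applied. For $f(x) = (x-y)^+$ one has $f_x = \mathbf 1_{[y,\infty)}$, $f_{xx} = 0$, $\lambda = \delta_y$ and $g = 1$. For $f(x) = |x|$ one has $f_x = \operatorname{sgn}(x)$, $f_{xx} = 0$, $\lambda = \delta_0$ and $g = 2$. Substituting into (6.13) gives:

```latex
\begin{align*}
(X_t - y)^+ &= (X_0 - y)^+ + \int_0^t \mathbf 1_{[y,\infty)}(X_s)\,dX_s + a(y)\,L^y_t, \\
|X_t| &= |X_0| + \int_0^t \operatorname{sgn}(X_s)\,dX_s + 2\,a(0)\,L^0_t .
\end{align*}
```

The first line is the time-homogeneous form of Tanaka's formula (6.9). The second line carries the factor 2 because the local time of Section 6 is normalised by the occupation formula (6.5).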
A formula similar to (6.13) has been proved by Meyer for a continuous semimartingale $X_t$ in the case where $f$ is a convex Lipschitz continuous function not depending on $t$, where $\lambda$ is a positive measure. See [1]. In our case the function $f$ is not necessarily convex and it depends on $t$.

References
1. Karatzas, I. and Shreve, S. (1991), Brownian Motion and Stochastic Calculus, Springer.
2. Karatzas, I. and Shreve, S. (1998), Methods of Mathematical Finance, Springer.
3. Kifer, Y. (2000), Game options, Finance and Stochastics 4, 443–463.
4. Kunita, H. (1994), Stochastic flows acting on Schwartz distributions, J. of Theor. Probab. 7, 247–278.
5. Kunita, H. and Seko, S. (2005), Game options and their exercise regions, preprint.
6. Kwok, Y.K. (1998), Mathematical Models of Financial Derivatives, Springer.
7. Kyprianou, A.E. (2004), Some calculations for Israeli options, Finance and Stochastics 8, 73–86.
8. Ogawa, S. (1975), Le bruit blanc et calcul stochastique, Proc. Japan Acad. 51, 384–388.
9. Oksendal, B. (1998), Stochastic Differential Equations, Springer.
10. Wu, L.X., Kwok, Y.K. and Yu, H. (1999), Asian options with American early exercise feature, International Journal of Theoretical and Applied Finance 2(1), 101–111.
Minimal Variance Martingale Measures for Geometric Lévy Processes

M. Jeanblanc¹, S. Kloeppel² and Y. Miyahara³
¹ Université d'Evry, France; ² ETH, Zurich; ³ Presenter; Nagoya City University, Japan

This is a short report on a talk given at the “short communications” session. An article containing the full proofs and several extensions of the results in this report is now in preparation and will appear in the near future.

Key words: incomplete market, geometric Lévy process, minimal variance martingale measure, minimal $L^q$ martingale measure, minimal entropy martingale measure.
1. Problems and Results

We investigate the explicit forms of the minimal variance martingale measure and the minimal $L^q$ martingale measure for geometric Lévy processes. We study the relation of these martingale measures with the minimal entropy martingale measure. Let $X$ be a real-valued Lévy process on a probability space $(\Omega,\mathcal F,P)$. We denote by $\nu$ its Lévy measure and we recall the Itô-Lévy decomposition of $X$:
$$X_t = bt + \sigma W_t + \int_0^t\int_{|x|>1}x\,N(ds,dx) + \int_0^t\int_{|x|\le1}x\,\tilde N(ds,dx),$$
where $\tilde N(dt,dx) = N(dt,dx) - dt\,\nu(dx)$. Here $W$ is a Brownian motion and $N$ is a Poisson random measure. We study a financial market where the price of the risky asset is given in terms of the exponential of $X$ as $S_t = S_0e^{X_t}$ (such a process is called a “geometric Lévy process” or “exponential Lévy process”) and where the riskless asset has a constant interest rate $r$. For any probability measure $Q$ equivalent to $P$, we denote by $(L^Q_t,\ t\ge0)$ the strictly positive $P$-martingale such that $dQ|_{\mathcal F_t} = L^Q_t\,dP|_{\mathcal F_t}$. From the predictable representation theorem of positive martingales (see [3]) there exist two predictable processes $f$ and $g$ such that
$$dL^Q_t = L^Q_{t-}\Bigl(f_t\,dW_t + \int(e^{g(t,x)}-1)\,\tilde N(dt,dx)\Bigr).$$
In what follows, we shall denote by $L(f,g)$ such a density, and by $Q^{(f,g)}$ the corresponding equivalent martingale measure. The martingale property of $(\tilde S_t = S_te^{-rt},\ t\ge0)$ under $Q^{(f,g)}$ holds if and only if for any $t$ the equality
$$f_t\sigma + \int_{\mathbb R}\bigl(e^{g(t,x)}(e^x-1) - x\mathbf 1_{|x|\le1}\bigr)\nu(dx) = \beta \tag{1}$$
holds almost surely, where $\beta = -(b + \frac12\sigma^2 - r)$ (see [3]). We shall denote by $\mathcal C$ the set of pairs $(f,g)$ such that (1) holds and by $\mathcal C_0$ the set of nonrandom pairs such that (1) holds (in that case, $f$ is a real number and $g$ a real function of the real variable $x$). Our aim is to find the density $L$, i.e., a pair of predictable processes $(f^*,g^*)$ such that
$$E(L^2_T(f^*,g^*)) = \inf\{E(L^2_T(f,g)),\ (f,g)\in\mathcal C\}.$$
The solution of this problem is the “minimal variance martingale measure” (MVMM). We denote it by $P^{(MV)}$.

Remark 1.1. M. Schweizer [4] introduced the “variance optimal martingale measure” (VOMM). The VOMM is a signed measure in general. In the case where the VOMM is positive, the VOMM is identified with the MVMM by definition.

We prove that, under some conditions, there exists a nonrandom optimal pair $(f^*,g^*)$.

Theorem 1.1. Assume that the following equation for $(f,g(x),\mu)$
$$f = \mu\sigma,\qquad e^{g(x)}-1 = \mu(e^x-1),\qquad \mu\sigma^2 + \int\bigl\{(1+\mu(e^x-1))(e^x-1) - x\mathbf 1_{|x|\le1}\bigr\}\nu(dx) = \beta$$
has a solution $(f^*,g^*(x),\mu^*)$. Then it holds that
(i) The martingale measure $Q^{(f^*,g^*)}$ is the MVMM for the process $S_t$.
(ii) The Lévy measure of $(X_t,\ t\ge0)$ under $P^{(MV)}$ is
$$\nu^{(MV)}(dx) = e^{g^*(x)}\nu(dx) = \bigl(1+\mu^*(e^x-1)\bigr)\nu(dx).$$

When the conditions of the previous theorem are not satisfied, we do not know whether the MVMM exists or not. We define an “$\varepsilon$-optimal” MVMM, $Q_\varepsilon = Q^{(f_\varepsilon,g_\varepsilon)}$, as a martingale measure which satisfies the following condition:
$$E[L^2(f_\varepsilon,g_\varepsilon)]\le E[L^2(f,g)] + \varepsilon,\qquad\forall(f,g)\in\mathcal C.$$

Theorem 1.2. For any $\varepsilon > 0$, there exists an $\varepsilon$-optimal MVMM in the class of nonrandom controls.

Theorem 1.2 suggests that the following conjecture may hold true.

Conjecture: If the MVMM exists, then it is in the class $\mathcal C_0$.

We can generalize the above problem to the “minimal $L^q$ martingale measure” (ML$^q$MM), $q > 1$, and we denote it by $P^{(ML^q)}$. The problem is to find predictable processes $f^*$ and $g^*$ such that
$$E\{(L_T(f^*,g^*))^q\} = \inf\{E\{(L_T(f,g))^q\};\ (f,g)\in\mathcal C\}.$$
We obtain the following theorem.

Theorem 1.3. Assume that the following equation for $(f_q,g_q(x),\mu_q)$
$$q(q-1)f_q = \mu_q\sigma,\qquad q\bigl(e^{(q-1)g_q(x)}-1\bigr) = \mu_q(e^x-1),\qquad \frac{\mu_q\sigma^2}{q(q-1)} + \int\Bigl\{\Bigl(1+\frac{\mu_q(e^x-1)}{q}\Bigr)^{\frac{1}{q-1}}(e^x-1) - x\mathbf 1_{|x|\le1}\Bigr\}\nu(dx) = \beta$$
has a solution $(f^*_q,g^*_q(x),\mu^*_q)$. Then
(i) The martingale measure $Q^{(f^*_q,g^*_q)}$ is the minimal $L^q$ martingale measure for the process $S_t$.
(ii) The Lévy measure of $X_t$ under $P^{(ML^q)}$ is
$$\nu^{(ML^q)}(dx) = e^{g^*_q(x)}\nu(dx) = \Bigl(1+\frac{\mu^*_q}{q}(e^x-1)\Bigr)^{\frac{1}{q-1}}\nu(dx).$$
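The system of Theorem 1.1 becomes explicit when the Lévy measure is a finite sum of point masses, because the third equation is then linear in $\mu$. The following is a minimal sketch under that assumption, which is ours and not the authors': $\nu = \sum_k w_k\,\delta_{x_k}$ with jump sizes `jumps` and intensities `intensities`. The function names are hypothetical. A solution is accepted only if $1+\mu(e^x-1) > 0$ on the support of $\nu$, since otherwise $g(x)$ is undefined.

```python
import math


def solve_mvmm(b, sigma, r, jumps, intensities):
    """Solve f = mu*sigma, e^{g(x)} - 1 = mu*(e^x - 1) and the martingale
    condition of Theorem 1.1 for nu = sum_k intensities[k] * delta_{jumps[k]}."""
    beta = -(b + 0.5 * sigma**2 - r)
    # Coefficient of mu: sigma^2 + int (e^x - 1)^2 nu(dx).
    quad = sigma**2 + sum(w * (math.exp(x) - 1.0) ** 2
                          for x, w in zip(jumps, intensities))
    # Term independent of mu: int ((e^x - 1) - x 1_{|x|<=1}) nu(dx).
    const = sum(w * ((math.exp(x) - 1.0) - (x if abs(x) <= 1.0 else 0.0))
                for x, w in zip(jumps, intensities))
    if quad == 0.0:
        raise ValueError("degenerate model: sigma = 0 and no jumps")
    mu = (beta - const) / quad
    density = [1.0 + mu * (math.exp(x) - 1.0) for x in jumps]  # e^{g(x)}
    if any(d <= 0.0 for d in density):
        raise ValueError("no solution with e^{g(x)} > 0 on the support of nu")
    f = mu * sigma
    g = [math.log(d) for d in density]
    return f, g, mu


def martingale_residual(f, g, b, sigma, r, jumps, intensities):
    """Left-hand side minus right-hand side of the martingale condition (1)."""
    beta = -(b + 0.5 * sigma**2 - r)
    integral = sum(w * (math.exp(gk) * (math.exp(x) - 1.0)
                        - (x if abs(x) <= 1.0 else 0.0))
                   for x, w, gk in zip(jumps, intensities, g))
    return f * sigma + integral - beta


if __name__ == "__main__":
    f, g, mu = solve_mvmm(0.05, 0.2, 0.03, [-0.1, 0.1], [1.0, 1.0])
    print(f, g, mu, martingale_residual(f, g, 0.05, 0.2, 0.03, [-0.1, 0.1], [1.0, 1.0]))
```

Under the measure found this way the intensity of the jump $x_k$ becomes $w_k\,(1+\mu^*(e^{x_k}-1))$, which is statement (ii) of Theorem 1.1 for a discrete Lévy measure.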
Next we consider the limit $\lim_{q\downarrow1}P^{(ML^q)}$.

Theorem 1.4. Assume that the ML$^q$MM $P^{(ML^q)}$, $1 < q\le 2$, and the minimal entropy martingale measure $P^{(MEMM)}$ exist. Then under some additional assumptions it holds that
$$\lim_{q\downarrow1}\nu^{(ML^q)}(dx) = e^{\theta^*(e^x-1)}\nu(dx) = \nu^{(MEMM)}(dx).$$
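The form of the limit can be made plausible by a short heuristic computation. It rests on a scaling assumption that is ours and is not stated in the text, namely that $\theta_q := \mu^*_q/(q-1)$ converges to some $\theta^*$ as $q\downarrow1$. Under that assumption the density in Theorem 1.3 (ii) behaves as follows:

```latex
\begin{align*}
\Bigl(1+\frac{\mu^*_q}{q}(e^x-1)\Bigr)^{\frac{1}{q-1}}
  &= \exp\Bigl\{\frac{1}{q-1}\log\Bigl(1+(q-1)\,\frac{\theta_q}{q}\,(e^x-1)\Bigr)\Bigr\} \\
  &\longrightarrow \exp\bigl\{\theta^*(e^x-1)\bigr\}
  \qquad (q\downarrow1).
\end{align*}
```

The same scaling makes $f^*_q = \mu^*_q\sigma/(q(q-1))$ converge to $\theta^*\sigma$. This is only a pointwise sketch; the additional assumptions of the theorem are needed for a rigorous statement.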
This result means that “when $q\downarrow1$, the minimal $L^q$ martingale measure $P^{(ML^q)}$ converges to the minimal entropy martingale measure $P^{(MEMM)}$ in some sense.” (See [1] for the Lévy measure of $X_t$ under $P^{(MEMM)}$.)

2. Remarks

Problems similar to ours have been discussed by many authors. However, almost all of these papers work in the framework of signed martingale measures, or of measures that are only absolutely continuous. Our results, on the other hand, are obtained in the framework of equivalent martingale measures. Results similar to Theorem 1.4 have been obtained for continuous processes ([2], etc.). Our result is an extension of such results to the case of jump processes.

References
1. Fujiwara, T. and Miyahara, Y. (2003), The Minimal Entropy Martingale Measures for Geometric Lévy Processes, Finance and Stochastics 7, 509–531.
2. Grandits, P. and Rheinländer, T. (2002), On the minimal entropy martingale measure, The Annals of Probability 30(3), 1003–1038.
3. Kunita, H. (2004), Representation of Martingales with Jumps and Applications to Mathematical Finance, Stochastic Analysis and Related Topics in Kyoto, In honour of Kiyosi Itô, Advanced Studies in Pure Mathematics 41, Mathematical Society of Japan.
4. Schweizer, M. (1996), Approximation Pricing and the Variance-Optimal Martingale Measure, Annals of Probability 24, 206–236.
Cubature on Wiener Space Continued

Christian Litterer and Terry Lyons
Mathematical Institute, University of Oxford, 24-29 St Giles', Oxford, OX1 3LB, UK

Higher order particle methods can provide an accurate description of an evolving family of measures, but frequently the number of particles used in the description explodes as we iterate. In this paper we present a general method to simplify the support of the intermediate measures used in the iteration without increasing the error in the particle approximation by more than a constant factor. We describe two algorithms that can be used to simplify the support of a discrete measure and give an application to the cubature on Wiener space method developed by Lyons, Victoir [13].

Key words: particle systems, stochastic analysis, cubature, stochastic filtering

1. Introduction

Probabilistic understandings are producing novel numerical techniques for integrating partial differential equations of parabolic type. For example see the work of Kusuoka [9]. We consider a Stratonovich stochastic differential equation (1)
$$d\xi_{t,x} = V_0(\xi_{t,x})\,dt + \sum_{i=1}^{d}V_i(\xi_{t,x})\circ dB^i_t,\qquad \xi_{0,x} = x,$$
defined by a family of smooth vector fields $V_i$ and driven by Brownian motion. It is well known that computing a weak solution to (1) and setting $P_{T-t}f := E(f(\xi_{T-t,x}))$ solves a parabolic partial differential equation (PDE). More precisely, $u(t,x) = E(f(\xi_{T-t,x}))$ solves
$$\frac{\partial u}{\partial t}(t,x) = -Lu(t,x),\qquad u(T,x) = f(x), \tag{2}$$
where $L$ is the differential operator given by
$$L = V_0 + \frac12(V_1^2 + \dots + V_d^2).$$
Kusuoka's algorithm [9] and the Kusuoka-Lyons-Victoir (or KLV) method [13] are two higher order methods for weak approximations of SDEs. In both approximation schemes the number of calculations grows exponentially in the number of iterations; nonetheless the methods can be highly effective in practice. In [15] Ninomiya and Victoir present an application to pricing Asian options under a Heston stochastic volatility model. They evaluate an algorithm that might (non-trivially) be interpreted as a special case of the KLV method of degree five (in combination with Romberg extrapolation and quasi Monte-Carlo) for the SDE arising in this case and obtain a speed-up by a factor of $10^6$ compared to the classical Euler-Maruyama scheme. Similar results are obtained by Ninomiya [16] and Ninomiya, Kusuoka [10] for Kusuoka's algorithm. They combine the method with the tree based branching algorithm (TBBA), a partial sampling scheme developed by Crisan, Lyons [3], and apply it to price European options under a Heston model.

In this paper we give a general methodology for recombining particles in a particle system by reducing the support of the discrete measure. We describe two algorithms that may be used to compute such reduced measures and we announce results of an application to the KLV method. We can show that if we interpret the combined method as a particle system, then under the uniform Hörmander condition for the vector fields $V_i$ the number of particles grows polynomially in the number of iterations. Finally we outline an application of the recombination methods to the stochastic filtering problem.

2. Higher order methods for weak approximations of SDEs

In [9] Kusuoka develops a higher order method for weakly approximating the solution of stochastic differential equations (1).
For the application of the method the test function $f$ is required to be Lipschitz and the vector fields must satisfy the so called UFG condition, which is weaker than the uniform Hörmander condition (see section 6 for details). The author introduces the concept of an $m$-moment similar family of random variables. Let $A = \{\emptyset\}\cup\bigcup_{k=1}^{\infty}\{0,\dots,d\}^k$ and let $\alpha = (\alpha_1,\dots,\alpha_k)\in A$ be a multi-index. To take account of the special role of the vector field $V_0$ in the direction of the drift of the SDE we define a weight on a multi-index $\alpha$ by $\|\alpha\| = k + \mathrm{card}(j : \alpha_j = 0)$. Let $A(j) = \{\alpha\in A : \|\alpha\|\le j\}$.
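The weight and the index sets $A(j)$ are easy to enumerate, which helps when counting how many iterated integrals a scheme of a given degree has to match. The following short sketch does this; the function names are ours and purely illustrative. Each entry equal to 0 (the drift direction) counts twice, every other entry counts once.

```python
from itertools import product


def weight(alpha):
    """||alpha|| = k + card{j : alpha_j = 0} for alpha = (alpha_1, ..., alpha_k)."""
    return len(alpha) + sum(1 for a in alpha if a == 0)


def multi_indices(d, j):
    """All alpha in A(j) over the letters {0, ..., d}, including the empty index."""
    result = [()]
    for k in range(1, j + 1):  # a word of length k has weight at least k
        for alpha in product(range(d + 1), repeat=k):
            if weight(alpha) <= j:
                result.append(alpha)
    return result


if __name__ == "__main__":
    for alpha in multi_indices(1, 3):
        print(alpha, weight(alpha))
```

For one driving Brownian motion ($d = 1$) and $j = 3$ the enumeration consists of the empty index together with (0), (1), (1,1), (0,1), (1,0) and (1,1,1).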
Define iterated integrals by
$$B^{\circ\alpha}(t) = \int_{0<t_1<\dots<t_k<t}\circ dB^{i_1}_{t_1}\cdots\circ dB^{i_k}_{t_k},$$
for $\alpha = (i_1,\dots,i_k)\in A$.

Definition 2.1. A family of random variables $\{Z_\alpha;\ \alpha\in A(m)\}$ is $m$-moment similar ($m\ge1$), if $Z_{(0)} = 1$, $E(|Z_\alpha|^n) < \infty$ for any $n\ge1$, $\alpha\in A$, and if
for all $\alpha_1,\dots,\alpha_k\in A$ with $\|\alpha_1\|+\dots+\|\alpha_k\|\le m$.

Starting with a given $m$-moment similar family of random variables the author gives an explicit construction of a discrete Markov operator $U_t$. Using Lie algebra techniques, he develops a stochastic Taylor expansion such that the expansion of $U_tf$ agrees with the expansion of $P_tf$ up to level $m$. The remainder estimate involves higher order derivatives of the test function in the directions of the vector fields $V_i$ and their Lie brackets, so in particular if $f$ is only assumed to be Lipschitz the bound in the approximation of $P_tf$ by $U_tf$ will not be small. The results in Kusuoka, Stroock [7] and Kusuoka [8] show that $P_tf$ will be smooth, at least in the direction of the vector fields $V_i$ (see section 6 for a more precise statement). So considering an uneven partition of the global time interval $[0,T]$ with time steps getting smaller towards the end, the method can be applied iteratively over the subintervals. Since the operator $U_t$ is Markov, the error of the approximation over the global time interval is bounded by the sum of the errors over the subintervals. Hence, one can arbitrarily reduce the error in the approximation. It is shown in [9] that for an operator based on an $m$-moment similar family there exists a partition such that the error in the weak approximation is bounded by $Ck^{-(m-1)/2}\|\nabla f\|_\infty$, where $k$ is the number of time steps in the partition. The results in [9] are generalised and reformulated in terms of a Lie algebra language in Kusuoka [11].

Following Kusuoka [9], Lyons and Victoir [13] have developed the cubature on Wiener space approach, which we refer to as the KLV method. For a positive measure on $\mathbb R^d$, a cubature measure of degree $m$ is a discrete measure that integrates polynomials up to degree $m$ correctly. The expectation $E(f(\xi_{T,x}))$ might be regarded as an integral against Wiener measure. A cubature measure on Wiener space is a discrete measure
$$Q_T = \sum_{i=1}^{n}\lambda_i\delta_{\omega_{T,i}}$$
supported on the continuous paths of bounded variation over the time interval $[0,T]$ that integrates iterated integrals correctly, i.e. satisfies
$$E\int_{0<t_1<\dots<t_k<T}\circ dB^{i_1}_{t_1}\cdots\circ dB^{i_k}_{t_k} = \sum_{j=1}^{n}\lambda_j\int_{0<t_1<\dots<t_k<T}d\omega^{i_1}_j(t_1)\dots d\omega^{i_k}_j(t_k)$$
for all multi-indices $(i_1,\dots,i_k)\in A(m)$. The integer $m$ is the degree of the cubature formula. As Brownian motion $B_t$ is a self-similar process, i.e. for any constant $c > 0$ the process $c^{-1/2}B_{ct}$ equals $B_t$ in distribution, it is sufficient to construct only one cubature measure $Q_1$ over the time interval $[0,1]$. Any measure $Q_T$ may then be obtained from $Q_1$ by rescaling and reparametrising the paths in the support of $Q_1$.

In the construction of the cubature measure in [13] the (truncated) signature map, introduced by Chen [1], plays a crucial role. It might be viewed as an algebra homomorphism from the continuous paths of bounded variation with concatenation as multiplication into the (truncated) tensor algebra. A crucial observation in [13] is that the signature map can be used to articulate a formulation of the cubature problem in terms of Lie polynomials (elements of the (truncated) free Lie algebra) of degree up to $m$. The problem becomes to find a discrete measure supported on exponentials of Lie polynomials with centre of mass matching the expectation of the (truncated) signature of Brownian motion. For the cases of $m = 3$ and $m = 5$ the authors use Lie algebra techniques to give explicit constructions of such Lie polynomials starting from any given cubature measure for Gaussian space. The number of particles in the support of the measure obtained in the construction is twice the number of particles in the support of the Gaussian cubature measure it is based on. Cubature formulas with small support may be found for example in Victoir [20]. The error in the approximation of $P_Tf$ is bounded by
$$\sup_{x\in\mathbb R^n}\bigl|P_Tf - E_{Q_T}f(\xi_{T,x})\bigr|\le CT^{(m+1)/2}\sup_{(i_1,\dots,i_k)\in A(m+2)\setminus A(m)}\|V_{i_1}\dots V_{i_k}f\|_\infty. \tag{3}$$
Following Kusuoka’s method, the KLV method can again be applied iteratively over suitably chosen but uneven partitions of $[0, T]$, and using the estimates in [7], [8] one obtains the bound $C k^{-(m-1)/2} \|\nabla f\|_\infty$ for the rate of convergence of the approximation, where $m$ in this case is the degree of the cubature formula. The KLV method might be viewed as a discrete Markov kernel taking discrete measures on $\mathbb{R}^N$ to discrete measures on $\mathbb{R}^N$. More explicitly we have
$$\mathrm{KLV}(\delta_x, T) = \sum_{i=1}^{n} \lambda_i\, \delta_{\xi_{T,x}(\omega_{i,T})}$$
and
$$E_{Q_T} f(\xi_{T,x}) = E_{\mathrm{KLV}(\delta_x, T)} f.$$
Hence, to compute the expectation of $f(\xi_{T,x})$ with respect to the cubature measure one needs to compute the endpoints of the pathwise solution of the SDE along the cubature paths.

3. High order recombination for evolving stochastic systems

In the previous section we have discussed a number of higher order methods for solving parabolic partial differential equations by approximating weak solutions of SDEs. In general, when there is an evolving measure there is no reason to assume that the PDE one wishes to solve is linear, or that we want to solve a PDE at all. As an example of an evolving measure one can think of the evolving posterior distribution of the filtering problem, which is governed by the Zakai equation (a non-linear SPDE). The stochastic filtering problem has numerous practical applications (for example in signal processing or tracking) and numerical approximations of its solution have been the subject of active research over the past few decades. The evolution of a measure over time can be regarded as a path in the space of measures. So, even if the underlying measure space is finite dimensional we face a potentially infinite dimensional problem. Particle systems can in many cases provide accurate descriptions of the evolution of such measures. Unfortunately, the number of particles used in the description of these measures may explode when we apply an iterative method that preserves accuracy, and sometimes it even grows exponentially (compare section 2).
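The exponential growth of the particle tree can be illustrated with a small sketch. It assumes a one-dimensional driving Brownian motion and a degree 3 cubature formula consisting of the two straight-line paths that end at $\pm\sqrt{s}$, each with weight 1/2; this choice of paths, the vector fields `v0` and `v1`, the fixed-step RK4 solver and all function names are illustrative assumptions and are not taken from the paper.

```python
import math

def rk4(field, x0, duration, steps=200):
    """Integrate dx/dt = field(x) over [0, duration] with classical RK4."""
    h = duration / steps
    x = x0
    for _ in range(steps):
        k1 = field(x)
        k2 = field(x + 0.5 * h * k1)
        k3 = field(x + 0.5 * h * k2)
        k4 = field(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

def klv_step(particles, s, v0, v1):
    """Apply one KLV operation over a time step s.

    particles: list of (mass, position) pairs describing a discrete measure.
    Assumed cubature formula: two linear paths ending at +sqrt(s) and -sqrt(s),
    each with weight 1/2 (an illustrative degree 3 choice for d = 1).
    """
    out = []
    for mass, x in particles:
        for lam, z in ((0.5, 1.0), (0.5, -1.0)):
            slope = z / math.sqrt(s)  # derivative of the path t -> z * sqrt(s) * t / s
            # Along a bounded variation path the Stratonovich SDE becomes an ODE.
            end = rk4(lambda y: v0(y) + v1(y) * slope, x, s)
            out.append((mass * lam, end))
    return out

def iterate_klv(x, step_sizes, v0, v1):
    """Iterate the KLV operation; the particle count doubles at every step."""
    particles = [(1.0, x)]
    for s in step_sizes:
        particles = klv_step(particles, s, v0, v1)
    return particles
```

With `k` steps this returns `2**k` weighted particles (counted with multiplicity, without any recombination), which is the growth that a reduction step is meant to control.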
Sometimes a measure can be effectively described by its integrals against a specific family of well chosen test functions. When this is true and the support of the particle measure is large compared to the number of test functions, we can replace it by a simpler measure with smaller support that integrates the test functions correctly and thus retains the required properties.

Definition 3.1. A discrete probability measure $\tilde\mu$ is a reduced measure with respect to $\mu$ and $P_n$ if and only if it satisfies the following three conditions:
1. $\mathrm{supp}(\tilde\mu) \subseteq \mathrm{supp}(\mu)$
2. For all $p \in P_n$: $\int p(x)\, \tilde\mu(dx) = \int p(x)\, \mu(dx)$
3. $\mathrm{card}(\mathrm{supp}(\tilde\mu)) \le n + 1$.

The first condition is important and ensures that the reduced measure satisfies feasibility constraints imposed on the measure we target. The iterated application of the KLV operation may be viewed as a particle system that evolves as an n-ary tree, approximating the heat kernel measure of a PDE. The support of this measure and the number of ODEs required to define it grow exponentially in the number of iterations. The particles in the tree will not recombine. The relevant property of the measure would be to integrate $P_t f$ (the heat kernel applied to the test function) correctly. In the following sections we present two algorithms to compute a reduced measure for a discrete measure $\mu$ with finite support. Both algorithms have a polynomial runtime in the size of the support (the second can be applied to discrete measures with infinite support) and can be used to introduce high order recombination in examples such as the KLV method introduced above.

4. A measure support reduction algorithm

Let $\Omega$ be a probability space with a discrete probability measure
$$\mu = \sum_{i=1}^{\hat n} \lambda_i \delta_{z_i}, \qquad \lambda_i > 0,\ z_i \in \Omega.$$
We are also given a finite set of test functions $P_n = \{p_1, \dots, p_n\}$ and in the following we assume that $\mu$ has large support, by which we mean that $\hat n$ is at least of order $n^2$. Our aim is to calculate a reduced measure for $\mu$. We define an $\mathbb{R}^n$ valued random variable $P := (p_1, \dots, p_n)$ and consider the law of the random variable $P$ under the measure $\mu$, i.e.
$$\mu_P = \sum_{i=1}^{\hat n} \lambda_i \delta_{x_i}, \qquad x_i = (p_1(z_i), \dots, p_n(z_i))^T \in \mathbb{R}^n.$$
We may assume that the $x_i$ are all distinct, as we can otherwise trivially eliminate points $z_i$ from the support of the original measure $\mu$. The centre of mass (CoM) for the measure $\mu_P$ is given by
$$\mathrm{CoM}(\mu_P) = \sum_{i=1}^{\hat n} \lambda_i x_i. \tag{4}$$
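As a small illustration of the push-forward $\mu_P$ and its centre of mass, the following sketch maps each support point $z_i$ to the vector of test function values and averages with the weights $\lambda_i$. The function names and the example test functions are hypothetical choices made for illustration.

```python
def push_forward(weights, support, test_functions):
    """Return the weighted points x_i = (p_1(z_i), ..., p_n(z_i)) of mu_P."""
    points = [[p(z) for p in test_functions] for z in support]
    return list(zip(weights, points))

def centre_of_mass(weighted_points):
    """Return CoM(mu_P) = sum_i lambda_i x_i as a list of n coordinates."""
    dim = len(weighted_points[0][1])
    return [sum(w * x[k] for w, x in weighted_points) for k in range(dim)]

# Example: uniform measure on {0, 1, 2, 3} with test functions z and z^2.
mu_p = push_forward([0.25] * 4, [0, 1, 2, 3], [lambda z: z, lambda z: z * z])
com = centre_of_mass(mu_p)  # first and second moments of the measure
```

The k-th coordinate of the centre of mass is the integral of the k-th test function against $\mu$, so two measures integrate all test functions identically exactly when their push-forwards share the same centre of mass.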
If we can find a subset of at most $n+1$ particles $x_{i_k}$ and new weights $\tilde\lambda_k$ which produce a new probability measure $\tilde\mu_P$ such that $\mathrm{CoM}(\tilde\mu_P) = \mathrm{CoM}(\mu_P)$, the measure $\tilde\mu = \sum_k \tilde\lambda_k \delta_{z_{i_k}}$ is a reduced measure for $\mu$. By considering $x_i - \mathrm{CoM}(\mu_P)$ in place of the $x_i$ we may assume without loss of generality that $\mathrm{CoM}(\mu_P)$ is at the origin. A first algorithm, communicated to us by Victoir [19], sequentially eliminates particles from the support of the measure: given any $n + 2$ points $x_{k_1}, \dots, x_{k_{n+2}}$, the system given by
$$\sum_{i=1}^{n+2} u_i x_{k_i} = 0, \qquad \sum_{i=1}^{n+2} u_i = 0 \tag{5}$$
has a non-trivial solution, which may be determined using Gaussian elimination. This is easy to see as we have a linear system with $n + 2$ variables, but only $n + 1$ constraints. Thus we may either add
$$\min_{u_i < 0} \left( -\frac{\lambda_{k_i}}{u_i} \right) \sum_{j=1}^{n+2} u_j x_{k_j}$$
to (4) or subtract
$$\min_{u_i > 0} \frac{\lambda_{k_i}}{u_i} \sum_{j=1}^{n+2} u_j x_{k_j}$$
from (4) leaving all weights in the result non-negative and their overall sum unchanged. By construction the coefficient of some $x_j$ vanishes. We have now obtained a new measure with the required properties and at least one point fewer in the support. Applying the procedure iteratively until there are only $n + 1$ points left we obtain a reduced measure. The method requires order $\hat n$ iterations of the above procedure.

Remark 4.1. Let $\tilde n$ be the dimension of the lowest dimensional (affine) subspace of $\mathbb{R}^n$ containing the set $\{(p_1(y), \dots, p_n(y)) \mid y \in \mathrm{supp}(\mu)\}$. Since we have shifted the CoM to the origin we may assume that the particles $x_i$ are contained in a proper $\tilde n$ dimensional subspace. Hence, equation (5) has a non-trivial solution for any subset of $\tilde n + 2$ particles and we can continue to apply the elimination procedure described above until $\mathrm{card}(\mathrm{supp}(\tilde\mu)) \le \tilde n + 1$.

To improve the runtime of the algorithm we partition the support of the measure $\mu_P$ into $n + 2$ subsets of equal size and consider their respective centres of mass. Let these sets be denoted by $I_k$, $0 \le k \le n + 1$. Define
$$\tilde x_j = \sum_{i \in I_j} \frac{\lambda_i x_i}{\tilde\lambda_j}, \qquad \text{where } \tilde\lambda_j = \sum_{i \in I_j} \lambda_i.$$
We might interpret the points $\tilde x_j$ as the respective centres of mass of the individual subsets. Then
$$\mathrm{CoM}(\mu_P) = \sum_{j=0}^{n+1} \tilde\lambda_j \tilde x_j,$$
$\tilde\lambda_j > 0$ for all $0 \le j \le n + 1$ and
$$\sum_{j=0}^{n+1} \tilde\lambda_j = \sum_{i=1}^{\hat n} \lambda_i = 1.$$
Hence we may use the previously outlined procedure to eliminate some $\tilde x_k$, thus eliminating $\hat n/(n + 2)$ points in one step.
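The sequential elimination step can be sketched in a few lines. This is a minimal sketch under stated assumptions: it implements only the one-point-at-a-time procedure (not the grouping into $n + 2$ subsets), works in floating point with an ad hoc tolerance, uses the points directly without first shifting the centre of mass, and assumes the points are in general position. The helper names `null_vector` and `reduce_measure` are hypothetical.

```python
def null_vector(rows, ncols, tol=1e-12):
    """Return a non-trivial solution u of rows @ u = 0 by Gauss-Jordan elimination.

    rows is a list of equations (each a list of ncols coefficients); there must
    be fewer independent equations than unknowns.
    """
    a = [row[:] for row in rows]
    pivots = []  # (row index, column index) of each pivot
    r = 0
    for c in range(ncols):
        if r == len(a):
            break
        p = max(range(r, len(a)), key=lambda i: abs(a[i][c]))
        if abs(a[p][c]) < tol:
            continue  # no pivot in this column, so it stays a free variable
        a[r], a[p] = a[p], a[r]
        piv = a[r][c]
        a[r] = [v / piv for v in a[r]]
        for i in range(len(a)):
            if i != r and a[i][c] != 0.0:
                f = a[i][c]
                a[i] = [vi - f * vr for vi, vr in zip(a[i], a[r])]
        pivots.append((r, c))
        r += 1
    pivot_cols = {c for _, c in pivots}
    free = next(c for c in range(ncols) if c not in pivot_cols)
    u = [0.0] * ncols
    u[free] = 1.0  # set one free variable to 1, the others to 0
    for row, c in pivots:
        u[c] = -a[row][free]
    return u

def reduce_measure(weights, points):
    """Eliminate points one at a time until at most n + 1 remain.

    points: list of n-dimensional vectors x_i; weights: positive, summing to 1.
    Returns a dict {index: new weight} with the same total mass and centre of mass.
    """
    n = len(points[0])
    w = list(weights)
    idx = list(range(len(points)))
    while len(idx) > n + 1:
        block = idx[: n + 2]
        # System (5): sum_i u_i x_i = 0 (n equations) and sum_i u_i = 0.
        rows = [[points[i][k] for i in block] for k in range(n)]
        rows.append([1.0] * len(block))
        u = null_vector(rows, len(block))
        # Subtract the largest multiple of u that keeps all weights non-negative.
        t, drop = min((w[i] / ui, i) for i, ui in zip(block, u) if ui > 1e-12)
        for i, ui in zip(block, u):
            w[i] -= t * ui
        w[drop] = 0.0
        idx.remove(drop)
    return {i: w[i] for i in idx}
```

Each pass removes one point from the support while leaving the total mass and the centre of mass unchanged, so the integrals of the test functions are preserved.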
Theorem 4.1. Given $\mu$ and $P_n$ the algorithm described above computes a reduced measure in
$$O\!\left(n^2 \hat n + n \log(\hat n/n)\, C(n + 2, n + 1)\right)$$
steps, where $C(n + 2, n + 1)$ represents the number of steps required to solve a system of linear equations with $n + 2$ variables and $n + 1$ constraints.
Proof. It is clear from the description that the algorithm computes a reduced measure, so we only provide a brief, informal analysis for the runtime of the algorithm. The estimates can easily be made rigorous by considering floors and ceilings without changing the order of steps required. After $k$ iterations of the algorithm starting with $\hat n$ points the number of particles in the support of the measure is down to
$$\left(1 - \frac{1}{n+2}\right)^{k} \hat n.$$
We would like to know for which $k$ we have
$$\left(1 - \frac{1}{n+2}\right)^{k} \le \frac{n}{\hat n}.$$
Noting that $\left(1 - \frac{1}{n}\right)^{k} \le e^{-k/n}$ we see that roughly $k \approx n \log(\hat n/n)$ iterations are sufficient. Forming the $n$-dimensional linear combinations takes time linear in the size of the support of the measure and in $n$ at each iteration. Thus the number of steps for these operations is trivially bounded by
$$n \sum_{k=0}^{\infty} \hat n \left(1 - \frac{1}{n+2}\right)^{k} = n(n + 2)\hat n.$$
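The iteration count in the proof can be checked numerically. The sketch below counts how many grouped elimination rounds are needed until the idealised support size $(1 - 1/(n+2))^k \hat n$ drops to $n$, and compares it with the estimate $(n+2)\log(\hat n/n)$ that follows from $(1 - 1/(n+2))^k \le e^{-k/(n+2)}$ and is of the same order as $n \log(\hat n/n)$. The function name and the example sizes are illustrative assumptions.

```python
import math

def iterations_needed(n, n_hat):
    """Smallest k with (1 - 1/(n + 2))**k * n_hat <= n (ignoring rounding of sizes)."""
    k = 0
    size = float(n_hat)
    while size > n:
        size *= 1.0 - 1.0 / (n + 2)
        k += 1
    return k

n, n_hat = 10, 10**6
k_min = iterations_needed(n, n_hat)
k_bound = math.ceil((n + 2) * math.log(n_hat / n))  # sufficient number of rounds
```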
5. A second algorithm

Before we start the description of the actual algorithm we give a simple argument for checking if a non-degenerate simplex given by points $x_1, \dots, x_{k+1}$ contained in some space of possibly higher dimension contains the origin.

Lemma 5.1. Consider a non-degenerate simplex of $k + 1$ points and let $l$ be the line through a fixed interior point $x_M$ of the simplex and the origin. We can decide if the simplex contains the origin and, if not, compute the intersections of $l$ with the faces of the simplex in $O(C(k+2, k+1))$ steps.

Proof. As $x_M$ is in the interior of the simplex we may write
$$\sum_{i=1}^{k+1} \lambda_i x_i - x_M = 0, \qquad \sum_{i=1}^{k+1} \lambda_i = 1, \tag{6}$$
which has a unique solution and the $\lambda_i$ are strictly positive. If the $k + 1$ points are lying in a $k$ dimensional affine subspace which does not contain the origin, the line joining the origin and $x_M$ will only intersect at $x_M$ and there is nothing to do for us (no intersection with the faces). The system
$$\sum_{i=1}^{k+1} \mu_i x_i + \mu_0 x_M = 0, \qquad \sum_{i=1}^{k+1} \mu_i - \mu_0 = 0 \tag{7}$$
has a non-trivial solution if and only if the points $x_1, \dots, x_{k+1}$ are contained in a $k$-dimensional subspace. Hence we may assume that the simplex is contained in a $k$ dimensional subspace, that $x_M \neq 0$ (otherwise we are done) and that the system (7) has a non-trivial solution. As already seen in the previous algorithm we may add or subtract small quantities of (7) from (6) until the coefficient of some $x_i$ or $x_M$ vanishes. Note that as the $\lambda_i$ are positive, $x_M$ has a negative coefficient and $\sum_{i=1}^{k+1} \mu_i$ and $\mu_0$ have the same sign, there will be some variable vanishing. Also observe that the coefficients of the $x_i$ cannot all vanish simultaneously. We can distinguish two possible cases. First, $x_M$ is eliminated and we are left with a convex combination describing the origin, thus solving the problem. For the remaining case observe that any line which intersects the interior of a simplex intersects its boundary at precisely two points. By eliminating without loss of generality (if necessary relabeling the particles) $x_1$ or $x_{k+1}$ we have moved to the boundary and hence have determined both intersections of the line (it can be checked easily that they are different). For example, eliminating $x_{k+1}$ we obtain after renormalising a new probability measure with strictly positive weights $\tilde\lambda_i$ satisfying
$$\sum_{i=1}^{k} \tilde\lambda_i x_i = \tilde\lambda x_M, \tag{8}$$
and $\sum_{i=1}^{k} \tilde\lambda_i = 1$ for some $\tilde\lambda > 0$. Note that for both intersections the weights of $x_M$ are positive and so the origin cannot be contained in the simplex.

Let $X_w$ and $X_c$ be two probability measures with strictly positive weights and $\gamma \in (0, 1)$. Let $x_c$ and $x_w$ denote the respective centres of mass of the measures. As before, without loss of generality $\mathrm{CoM}(\mu_P)$ is at the origin. We give the description of the actual algorithm:
We start with $X_w = \delta_{x_1}$, $X_c = \sum_{i=2}^{\hat n} \frac{\lambda_i}{1 - \lambda_1} \delta_{x_i}$ and $\gamma = 1 - \lambda_1$.
While $X_c$ has non-empty support we perform the following iteration:
1. Remove a random particle $x_i$ from $X_c$ and add it to $X_w$ with weight $\frac{\lambda_i \gamma}{1 - \gamma}$. Renormalise weights in both measures and update $x_w$, $x_c$. Update $\gamma$ to be $\gamma(1 - \lambda_i)$.
2. Apply the elimination procedure described in section 4 (Remark 4.1) to $X_w$ until it fails and replace $X_w$ with its output.
3. Check using Lemma 5.1 if the convex hull of $\mathrm{supp}(X_w)$ contains the origin. If yes, $X_w$ is the desired measure and we may terminate. If the line joining $x_w$ and the origin does not intersect the faces of the simplex obtained from the points in $\mathrm{supp}(X_w)$ go to 1. Otherwise update $X_w$ using the weights given in (8) and replace $\gamma$ by
$$\frac{\tilde\lambda \gamma}{1 - \gamma + \tilde\lambda \gamma}.$$
Thus we eliminate a point from $X_w$ and the new centre of mass of $X_w$ is given by $\tilde\lambda x_w$.
Finally recover the reduced measure $\tilde\mu$ from $\tilde\mu_P := X_w$.

Proposition 5.1. Algorithm 2 computes a reduced measure.

Proof. Observe that after step 2, the measure $X_w$ will contain at most $n + 1$ particles. The proposition follows from two invariants that are maintained throughout the execution of the algorithm. First, a simple calculation shows that we have $\mathrm{CoM}(\mu_P) = \gamma x_c + (1 - \gamma) x_w$, i.e. $\mathrm{CoM}(\mu_P)$ is a convex combination of $x_c$ and $x_w$ after every step of the algorithm. The second invariant is that after step 2 the convex hull of the particles in $\mathrm{supp}(X_w)$ is a non-degenerate simplex. We argue by contradiction. Suppose $\mathrm{card}(\mathrm{supp}(X_w)) = j + 2$ and the simplex is degenerate. Then we have $j + 2$ particles contained in a $j$ dimensional affine subspace, contradicting the assumption that we have reduced the measure in step 2 (compare Remark 4.1).
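The "simple calculation" behind the update of $\gamma$ in step 3 can be written out as follows. This is a worked sketch under the standing assumption that the centre of mass of $\mu_P$ is at the origin and that $x_w \neq 0$; the symbol $\gamma'$ for the updated value is introduced here for illustration only.

```latex
% Invariant before step 3 (centre of mass at the origin):
%   \gamma x_c + (1-\gamma) x_w = 0, hence x_c = -\frac{1-\gamma}{\gamma}\, x_w .
% After step 3 the centre of mass of X_w is \tilde\lambda x_w and x_c is unchanged.
% We look for \gamma' such that the invariant still holds:
\begin{align*}
  \gamma' x_c + (1-\gamma')\,\tilde\lambda\, x_w &= 0 \\
  \Bigl( -\gamma'\,\frac{1-\gamma}{\gamma} + (1-\gamma')\,\tilde\lambda \Bigr) x_w &= 0 \\
  \gamma' \Bigl( \frac{1-\gamma}{\gamma} + \tilde\lambda \Bigr) &= \tilde\lambda \\
  \gamma' &= \frac{\tilde\lambda\,\gamma}{1-\gamma+\tilde\lambda\,\gamma}.
\end{align*}
% Since \gamma \in (0,1) and \tilde\lambda > 0, the new value \gamma' again lies in (0,1).
```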
Remark 5.1. If the measure $\mu$ has infinite support the algorithm described above will terminate with probability one if there is an integer $k \le n + 1$ such that a random sample of $k$ points taken from $\mathrm{supp}(\mu)$ with a positive probability forms a non-degenerate simplex containing the origin.

As a third alternative approach to the problem one might consider the following. If we can find a subset of points that with a reasonably high probability contains the CoM in its convex hull, we may use linear programming to check if a given set of points contains the CoM in its convex hull and reconstruct the weights. It turns out that for some distributions of the points a subset of size $2n$ is sufficient to consider. By applying radial projections we may without loss of generality assume that all points lie on the unit sphere. Observe that the origin is not contained in the convex hull of the points if and only if we can find a half sphere that contains all of the points. For $k$ points on the unit sphere in $\mathbb{R}^n$ denote the probability that there is a half sphere that contains all $k$ points by $P_{n,k}$. Suppose that the joint distribution is symmetric around the origin in the sense that it is invariant under reflecting one of the points at the origin keeping the remaining fixed. Moreover we assume that $n$ points treated as vectors in $\mathbb{R}^n$ are linearly independent with probability one. A simple example of such a distribution is given by $k$ i.i.d. variables uniformly distributed on the sphere. Under the conditions outlined above Wendel [21] has evaluated $P_{n,k}$ explicitly to be
$$P_{n,k} = 2^{-k+1} \sum_{j=0}^{n-1} \binom{k-1}{j}.$$
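Wendel's formula is easy to evaluate exactly. The short sketch below uses rational arithmetic; the function name is an illustrative choice.

```python
from fractions import Fraction
from math import comb

def wendel_probability(n, k):
    """P_{n,k}: probability that k points on the sphere in R^n lie in a half sphere."""
    return Fraction(sum(comb(k - 1, j) for j in range(n)), 2 ** (k - 1))

# Probability that the origin lies in the convex hull of 2n points is 1 - P_{n,2n}.
hull_probabilities = [1 - wendel_probability(n, 2 * n) for n in range(1, 6)]
```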
In particular this yields $P_{n,2n} = 1/2$.

6. Some results from cubature on Wiener space

In this section we introduce some notation and give a brief overview of the Kusuoka-Lyons-Victoir method (cubature on Wiener space), Lyons and Victoir [13], developing on results by Kusuoka. Let $C_b^\infty(\mathbb{R}^N, \mathbb{R}^N)$ denote the smooth $\mathbb{R}^N$ valued functions with derivatives of all order bounded. In the following we interpret $V_i \in C_b^\infty(\mathbb{R}^N, \mathbb{R}^N)$, $0 \le i \le d$, as vector fields on $\mathbb{R}^N$. Consider the probability space $(C_0^0([0, T], \mathbb{R}^d), \mathcal{F}, P)$, where $C_0^0([0, T], \mathbb{R}^d)$ is the space of $\mathbb{R}^d$ valued continuous functions starting at 0, $\mathcal{F}$ its usual Borel σ-field and $P$ the Wiener measure. Define the coordinate mapping process $B_t^i(\omega) = \omega^i(t)$ for $t \in [0, T]$, $\omega \in \Omega$. Under Wiener measure, $B = (B_t^1, \dots, B_t^d)$ is a Brownian motion starting at zero. Furthermore let $B_t^0 = t$. Let $\xi_{t,x}$, $t \in [0, T]$, $x \in \mathbb{R}^N$, be a version of the solution of the Stratonovich stochastic differential equation (SDE)
$$d\xi_{t,x} = \sum_{i=0}^{d} V_i(\xi_{t,x}) \circ dB_t^i, \qquad \xi_{0,x} = x \tag{9}$$
that coincides with the pathwise solution on continuous paths of bounded variation. We define the Itô functional $\Phi_{T,x} : C_0^0([0, T], \mathbb{R}^d) \to \mathbb{R}^N$ by $\Phi_{T,x}(\omega) = \xi_{T,x}(\omega)$. Following [8] we define for multi-indices $\alpha = (\alpha_1, \dots, \alpha_k)$, $\beta = (\beta_1, \dots, \beta_l) \in A$ a multiplication by $\alpha * \beta = (\alpha_1, \dots, \alpha_k, \beta_1, \dots, \beta_l)$. Let $A_1 = A \setminus \{\emptyset, 0\}$ and $A_1(j) = \{\alpha \in A_1 : \|\alpha\| \le j\}$. We inductively define a family of vector fields by taking
$$V_{[\emptyset]} = 0, \qquad V_{[i]} = V_i, \quad 0 \le i \le d,$$
$$V_{[\alpha * i]} = [V_{[\alpha]}, V_i], \quad 0 \le i \le d,\ \alpha \in A.$$
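To make the inductive definition concrete, here is a short worked example for the multi-indices $(1,2)$ and $(1,2,0)$; the particular indices are chosen for illustration only.

```latex
% Concatenation: (1) * (2) = (1,2) and (1,2) * (0) = (1,2,0).
\begin{align*}
  V_{[(1,2)]}   &= [V_{[1]}, V_2]       = [V_1, V_2], \\
  V_{[(1,2,0)]} &= [V_{[(1,2)]}, V_0]   = [[V_1, V_2], V_0].
\end{align*}
% Each additional index appends one further Lie bracket on the right.
```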
The following condition on the family of vector fields $V_i$ was introduced by Kusuoka in [8].

Definition 6.1. The family of vector fields $V_i$, $i = 1, \dots, d$, is said to satisfy the condition (UFG) if the Lie algebra generated by it is finitely generated as a left $C_b^\infty$ module, i.e. there exists a positive $k$ and $u_{\alpha,\beta} \in C_b^\infty$ satisfying for all $\alpha \in A_1$,
$$V_{[\alpha]} = \sum_{\beta \in A_1(k)} u_{\alpha,\beta} V_{[\beta]}. \tag{10}$$
Note that the uniform Hörmander condition (see e.g. [8]) implies the UFG condition. In the following we assume that the vector fields $V_i$ satisfy the UFG condition. Let $p$ be the minimal integer such that the UFG condition is satisfied and $A := A(p)$.

Definition 6.2. We define the (formal) degree of a vector field $V_{[\alpha]}$, $\alpha \in A$, denoted by $d_\alpha$, to be the minimal integer $k$ such that $V_{[\alpha]}$ may be written as
$$V_{[\alpha]} = \sum_{\beta \in A_1(k)} u_{\alpha,\beta} V_{[\beta]}$$
with $u_{\alpha,\beta} \in C_b^\infty$.
Recall the Kusuoka-Lyons-Victoir operation from section 2. For a measure $\mu = \sum_{i=1}^{l} \mu_i \delta_{x_i}$ on $\mathbb{R}^N$ the KLV operation is given by
$$\mathrm{KLV}(\mu, s) = \sum_{j=1}^{l} \sum_{i=1}^{n} \mu_j \lambda_i\, \delta_{\Phi_{s,x_j}(\omega_i)}.$$
Let $D$ be a partition of the interval $[0, T]$, $t_0 = 0 < t_1 < \dots < t_k = T$, and let $s_j = t_j - t_{j-1}$. Given a partition $D$ we recursively define a family of measures $\mathrm{KLV}^{(j)}_{D,x}$ for $0 \le j \le k$ by
$$\mathrm{KLV}^{(j)}_{D,x} \xrightarrow{\ \mathrm{KLV}(s_{j+1})\ } \mathrm{KLV}^{(j+1)}_{D,x}$$
and set $\mathrm{KLV}^{(0)}_{D,x} = \delta_x$. In [13] the authors obtain the following bound:

Theorem 6.1. Let $f$ be Lipschitz, suppose that the vector fields $V_i$ satisfy the UFG condition and $V_0$ has formal degree at most 2. Then
$$\sup_{x \in \mathbb{R}^n} \left| E f(\xi_{T,x}) - E_{\mathrm{KLV}^{(k)}_{D,x}} f \right| \le K_T \|\nabla f\|_\infty \left( s_k^{1/2} + \sum_{i=2}^{k-1} \frac{s_i^{(m+1)/2}}{(T - t_i)^{m/2}} \right),$$
with $K_T$ a constant independent of $f$.

It was pointed out in [6] that the estimates in Theorem 6.1 require $V_0$ to have formal degree at most 2. If $V_0$ has higher formal degree the estimates in Theorem 6.2 will change and all bounds in the paper will change accordingly. In the following we have assumed for simplicity that $V_0$ has formal degree at most 2. The main ingredients used when obtaining the bound in Theorem 6.1 are the estimate in (3) and the following regularity result due to Kusuoka, Stroock [7] and Kusuoka [8], which says that even if $f$ is not smooth, $P_s f$ is smooth in the directions of the vector fields.

Theorem 6.2. Let $f$ be Lipschitz and $\alpha_1, \dots, \alpha_k \in A_1$. Then
$$\left\| V_{[\alpha_1]} \cdots V_{[\alpha_k]} P_s f \right\|_\infty \le \frac{K_T\, s^{1/2}}{s^{(\|\alpha_1\| + \dots + \|\alpha_k\|)/2}} \|\nabla f\|_\infty \tag{11}$$
provided the vector fields satisfy the UFG condition.
7. Cubature with recombination

In the iterated KLV method (Section 6) the total error over the interval of approximation [0, T] is bounded by the sum of the individual errors over smaller time intervals. We can use the algorithms described in sections 4 and 5 to introduce an intermediate step after each small time computation, replacing the discrete measure with one having smaller support. If we can do this without significantly increasing the one step errors then we can obtain a new global error bound over [0, T] for this algorithm that is of the same order as the error bound in the iterated KLV method. We briefly outline our strategy. A generalisation due to Sussmann [18] of the Frobenius integrability theorem implies that the pathwise solutions of the SDE for the cubature paths lie in a smooth submanifold $S$ of $\mathbb{R}^N$. In the following we will refer to $S$ as the reachability manifold. To approximate a function on the reachability manifold in the $L^1$ norm of a discrete measure it is sufficient to find accurate uniform functional approximation schemes that apply on the support of the measure. Any reduced measure should have support strictly contained in the support of the original measure. Hence, we can restrict our attention exclusively to the support of the original measure. As the particles in the support of the measure may be widely distributed, considering polynomials alone as test functions may not be sufficient. Instead we will cover the support by suitably scaled patches and insist that our new measure integrates polynomials up to degree $m$ on each of these patches correctly. We will address the issues of finding the right scaling and metric to cover the support and obtain the required estimate on the remainder term. Finally we examine upper bounds on the number of test functions required in the reduction step.
The fundamental idea is to consider Taylor expansions along paths in the direction of the vector fields and their Lie brackets, as the regularity estimates of [7] apply only in these directions. Obviously the quality of these estimates depends on the number of Lie brackets involved (compare Theorem 6.2) and this will have to be reflected in our metric. A natural metric to consider was studied by Nagel, Stein, Wainger (or NSW) in [17].

Definition 7.1. Let $\Omega \subseteq \mathbb{R}^N$ be a connected open set and let $C(\delta)$ denote the set of absolutely continuous functions $\varphi : [0, 1] \to \Omega$ which can be expressed in the form
$$\varphi'(t) = \sum_{\alpha \in A} c_\alpha(t)\, V_{[\alpha]}(\varphi(t)), \tag{12}$$
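where $|c_\alpha(t)| < \delta^{d_\alpha}$ for almost every $t$. Define $\rho$, the NSW metric associated to the vector fields $V_{[\alpha]}$, to be
$$\rho(x, y) = \inf\{\delta > 0 \mid \text{there is } \varphi \in C(\delta) \text{ with } \varphi(0) = x,\ \varphi(1) = y\}. \tag{13}$$
Moreover let $\rho_t$ be defined as $\rho$ but with
$$|c_\alpha(t)| \le \delta^{d_\alpha} \left( \frac{\sqrt{t}}{\delta} \right)^{d_\alpha - 1}$$
as a bound on the coefficients in (12). Note that our definition of formal degree differs slightly from the one given in [17]. Let $g$ be a smooth function on the reachability manifold $S$ and $\psi$ an absolutely continuous path $\psi : [0, 1] \to S$, satisfying (12) with coefficients $c_\alpha$. We define the $r$-th degree pathwise Taylor approximation of $g$ along $\psi$ by
$$\mathrm{Tay}^r_S(g, \psi) = \sum_{i=0}^{r} \sum_{(\alpha_1, \dots, \alpha_i) \in A^i} V_{[\alpha_1]} \cdots V_{[\alpha_i]}\, g(\psi(0)) \int_{0 < t_1 < \dots < t_i < 1} dc_{\alpha_1} \cdots dc_{\alpha_i},$$
where $dc_\alpha = c_\alpha(u)\,du$. The error term $R_r(g, \psi) := g(\psi(1)) - \mathrm{Tay}^r_S(g, \psi)$ is given by a sum over $(\alpha_1, \dots, \alpha_{r+1}) \in A^{r+1}$ of iterated integrals of order $r + 1$.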
The following lemma may be found in Litterer, Lyons [12].

Lemma 7.1. The pathwise Taylor expansion $\mathrm{Tay}^r_S(g, \psi)$ defined above only depends on the endpoints of the path $\psi$ and is a polynomial on $\mathbb{R}^N$.

The following proposition provides uniform estimates on the remainder of the pathwise Taylor expansion of $P_t f$ on balls in the metric $\rho_t$. Let $x \in S$ and $B_{\delta,t}(x) = \{y \in \mathbb{R}^N : \rho_t(x, y) < \delta\}$.

Proposition 7.1. The remainder function $R_r(P_t f, x)(y)$ is uniformly bounded on $B_{\delta,t}(x)$ by
$$C\, \frac{\delta^{r+1}}{t^{r/2}}\, \|\nabla f\|_\infty,$$
where $C$ is a constant independent of $f$ and $\delta$.
With this proposition in mind we define a reduction operation $\mathrm{Red}(\mu, s, t)$, which takes a discrete measure $\mu$ and produces a (in general not unique) reduced measure $\tilde\mu$ with respect to the following set of test functions. Let $B_1, \dots, B_\ell$ be a covering of the support of $\mu$ by balls of radius $\sqrt{s}$ in the metric $\rho_t$. Then we can find disjoint sets $S_i \subseteq B_i \cap \mathrm{supp}(\mu)$ that cover $\mathrm{supp}(\mu)$. Note that this partition is by no means unique. The set of test functions is given by all functions of the form $p_i \chi_{S_j}$, where $p_i$ ranges over a basis for the polynomials of degree up to $m$ and $S_j$ ranges over all patches contained in the partition. The measure $\tilde\mu$ may be computed using one of the algorithms presented in sections 4 and 5. We define a recursive family of measures corresponding to the iterative application of the KLV and the reduction operation. The base case is given by
$$Q^{(1)}_{D,x} := \mathrm{KLV}(\delta_x, s_1), \qquad Q^{(2)}_{D,x} := \mathrm{KLV}(Q^{(1)}_{D,x}, s_2)$$
and the recursion by
$$Q^{(j)}_{D,x} \xrightarrow{\ \mathrm{Red}(s_j,\, T - t_j)\ } \tilde Q^{(j)}_{D,x} \xrightarrow{\ \mathrm{KLV}(s_{j+1})\ } Q^{(j+1)}_{D,x},$$
where $2 \le j \le k - 1$. Our main result is the following:

Theorem 7.1. With the above notation we have
$$\sup_{x} \left| P_T f(x) - E_{Q^{(k)}_{D,x}} f \right| \le C_T \|\nabla f\|_\infty \left( s_k^{1/2} + \sum_{i=1}^{k-1} \frac{s_i^{(m+1)/2}}{(T - t_i)^{m/2}} \right),$$
where $C_T$ is a constant independent of $f$.

In other words, compared to the KLV method introduced in section 6 the error bound for the approximation of $P_T f$ is only increased by a constant factor. The proof of the theorem is an easy consequence of the estimates obtained for the KLV method in [13] and Proposition 7.1 and may be found in [12]. The bounds we obtain for the KLV method with recombination match the bounds of the original KLV method up to a constant multiple. Hence, we recall a family of partitions from [13] given by
$$t_j = T \left( 1 - \left( 1 - \frac{j}{k} \right)^{\gamma} \right).$$
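The behaviour of the bracketed factor in Theorem 7.1 along this family of partitions can be explored numerically. The sketch below builds the partition $t_j = T(1 - (1 - j/k)^\gamma)$ and evaluates $s_k^{1/2} + \sum_{i=1}^{k-1} s_i^{(m+1)/2} / (T - t_i)^{m/2}$; the function names and the sample parameters ($T = 1$, $m = 3$, $\gamma = 3$) are illustrative assumptions, and the constant $C_T \|\nabla f\|_\infty$ is left out.

```python
def partition(T, k, gamma):
    """Partition points t_j = T * (1 - (1 - j/k)**gamma) for j = 0, ..., k."""
    return [T * (1.0 - (1.0 - j / k) ** gamma) for j in range(k + 1)]

def error_bound_factor(T, k, gamma, m):
    """Evaluate s_k**(1/2) + sum_{i=1}^{k-1} s_i**((m+1)/2) / (T - t_i)**(m/2)."""
    t = partition(T, k, gamma)
    s = [t[j] - t[j - 1] for j in range(1, k + 1)]  # s[j-1] holds s_j
    total = s[-1] ** 0.5
    for i in range(1, k):
        total += s[i - 1] ** ((m + 1) / 2) / (T - t[i]) ** (m / 2)
    return total

# The factor shrinks as the number of steps k grows.
factors = [error_bound_factor(1.0, k, gamma=3.0, m=3) for k in (10, 20, 40)]
```

The steps are long at the start and short near $T$, which compensates for the blow-up of $(T - t_i)^{-m/2}$ in the last terms of the sum.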
For $\gamma > m - 1$ the results in [13] show that the bound in Theorem 6.1 implies
$$\left| P_T f(x) - E_{Q^{(k)}_{D,x}} f \right| \le C k^{-(m-1)/2} \|\nabla f\|_\infty,$$
while for the case $0 < \gamma < m - 1$ one obtains
$$\left| P_T f(x) - E_{Q^{(k)}_{D,x}} f \right| \le C k^{-\gamma/2} \|\nabla f\|_\infty.$$
Under the uniform Hörmander condition we show in [12] that the number of particles in the support of the measure $Q^{(k)}_{D,x}$ used in the approximation grows polynomially in $k$, the number of iterations. We obtain this bound by covering the reachability manifold by balls in the metric $\rho$. The result is based on two elementary lemmas. The first shows that the solution of the SDE along any cubature path $\omega_{T,i}$ stays inside a ball of radius of order $\sqrt{T}$, if the ball is taken in the metric $\rho$. A second estimate relates balls in the metric $\rho$ to Euclidean balls, yielding polynomial bounds involving the constant $C := \sup_{\alpha \in A,\, x \in S} |V_{[\alpha]}(x)|$. Both estimates combined allow us to obtain a polynomial bound on the number of balls required for the covering.

8. Application of the recombination methods to the stochastic filtering problem

There are other important examples of measures evolving in time. We point out that the methods we have described above are really quite generic and have the potential to radically improve numerical methods in some of these other examples. We think particularly of the setting where a known Markov process evolves in time, and where partial observations are possible. Then an application of Bayes' theorem allows one, at least formally, to describe the conditional distribution of the process at the current time given the observations up to the current time. Although we have stated this problem in a rather abstract way, there are many settings and situations, and many concrete examples where the ability to make such computations in practice has great value. The simplest and the most computationally tractable examples of this problem, and of the payoff that computational tractability can provide, are the linear filters. In this case, the Markov process solves a linear stochastic differential equation, and the observations are linear in the underlying Markov process.
It is an easy exercise to show that if this process starts at a point, or with a Gaussian distribution, then its conditional distribution based on the observations will also be Gaussian. The computational benefits of this remark (due to Kalman and Bucy) are enormous because the
measure can be described using a finite dimensional data set (the mean and covariance of the Gaussian measure). These evolve according to a stochastic differential equation driven by the observational data. The computations can be quite tractable even in a hundred dimensional space. The nonlinear context is definitely harder, and a fundamental reason for this is that the evolving posterior measure does not usually live in any finite dimensional sub-manifold of the space of probability measures. The stochastic differential equation of Kalman and Bucy becomes the nonlinear stochastic partial differential equation for the density of the measure known as the Zakai equation. Classical SPDE theory can be used in low dimensions (two or three) but is not practical even in six dimensions (tracking a single particle in three-dimensional space usually involves three dimensions for position, and three for momentum). Substantial progress has been made in the nonlinear context over the last 10 years. This progress comes from the development of particle filters; the basic developments are well-documented in the survey articles [2, 4]. The essential idea is to model the evolving measure by a cloud of particles, which one usually regards as having equal or at least nearly equal mass. After an observation interval, one updates this cloud of particles. There are a number of similar methods for carrying out this updating. The choice usually depends upon how easy it is to simulate the underlying process and its conditioned counterpart. We will describe the most standard approach and outline the way that it can be combined with the ideas discussed above to produce methods that retain the high dimensional benefits of particle methods while at the same time achieving a high order of approximation and the benefits of the best finite element PDE methods. 
The basic particle method used in the continuous time context starts with a cloud of particles, each having a unit weight. Each particle then moves forward randomly according to the unconditional distribution, while at the same time a path integral is computed along that trajectory to compute the increased or decreased likelihood of that trajectory given the observational data (variance reduction can often be usefully applied to the strategy, but we ignore it for simplicity in this discussion). So, after our observations one has a new cloud of particles that is randomly repositioned, and re-weighted to reflect the observational information. The crucial step is then to resample from this empirical weighted measure a number of particles. The number of particles should be chosen so that the new empirical measure adequately describes the posterior measure under the assumption that the prior, one unit of time previously, was the initial cloud. It is very common for this re-sampling to be IID, where each particle is chosen from the re-weighted empirical measure (so-called genetic sampling), but it is clear that this is a very bad idea if the time steps
are too short, as one quickly converges to a Dawson-Watanabe measure-valued process, and so the particles would concentrate themselves on a low dimensional set [5]. There are other methods for resampling, and the TBBA method described in [3] is an order of magnitude more efficient as well as being simple to implement computationally. However it does not normally yield huge benefits because the errors introduced by the Monte Carlo simulation of the underlying process are of the same order of magnitude as those coming from the genetic resampling, provided the genetic resampling is not done too frequently. The methods introduced earlier in this paper can almost trivially be adapted to produce high order estimates in the filtering problem, increasing the benefits of TBBA and going further to cubature resampling. Again, one works recursively; but now one assumes one has a cloud of particles that are weighted so that they integrate a class of polynomial test functions in the same way that the prior distribution of our parameter does (in other words we assume that they are a cubature sequence for the prior). Now, each particle is moved forward using the stochastic differential equation, but it is not moved forward to one new particle; rather it is moved forward to a particle measure approximating the transition probability in the sense of cubature against the space of polynomial or other test functions, using the cubature on Wiener space methods we have already described. Again, one would compute the new weights on these particles using the observational data, but this time we would take care to do it in a way that preserves the high order accuracy provided by the cubature methodology. Of course, in this way one will have squared the number of particles, but then one can step in again with the reduction methodology and replace this large class of particles with new weights on a small subset.
The much greater accuracy provided by these clouds of particles, when compared with the crude Monte Carlo approach, should result in substantial computational gains. We also mention that the computation of the distribution of a Markov process conditional on partial observations is no less important in the discrete time and discrete space context. Particle methods are used extensively and go under the name of sequential Monte Carlo methods. It should be clear that the same sets of ideas could be used profitably in this context as well, provided one identifies the classes of functions that one would like to integrate accurately. In this discrete case it could happen that the test functions are simply the characteristic functions of certain sets. Economies would occur if one could identify a smaller class.
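For orientation, the standard particle method with IID ("genetic") resampling can be sketched in discrete time as follows. This is a minimal sketch of the baseline scheme only, not of TBBA or of cubature resampling; the linear Gaussian signal and observation model, the parameter values and the function name are hypothetical choices made for illustration.

```python
import math
import random

def bootstrap_filter_step(particles, observation, rng, a=0.9, sigma_x=0.5, sigma_y=0.5):
    """One step of a bootstrap particle filter with IID resampling.

    Assumed toy model (not from the paper):
        signal       X_{t+1} = a * X_t + sigma_x * noise
        observation  Y_{t+1} = X_{t+1} + sigma_y * noise
    """
    # 1. Move every particle forward under the unconditional dynamics.
    moved = [a * x + rng.gauss(0.0, sigma_x) for x in particles]
    # 2. Re-weight each particle by the likelihood of the new observation.
    weights = [math.exp(-0.5 * ((observation - x) / sigma_y) ** 2) for x in moved]
    total = sum(weights)
    weights = [w / total for w in weights]
    # 3. Resample IID from the re-weighted empirical measure.
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    return moved, weights, resampled

rng = random.Random(0)
prior = [rng.gauss(0.0, 1.0) for _ in range(500)]
moved, weights, posterior = bootstrap_filter_step(prior, observation=0.3, rng=rng)
```

In the scheme described in the text, step 1 would be replaced by a cubature (KLV) step and step 3 by a reduction of the weighted cloud to a smaller support that integrates the chosen test functions correctly.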
References
1. K.T. Chen: Integration of paths, geometric invariant and a generalized Campbell-Hausdorff formula, Ann. Math. II 65, 163-178.
2. D. Crisan: Numerical methods for solving the stochastic filtering problem, in: T.J. Lyons, T.S. Salisbury (eds.), American Mathematical Society, Providence, R.I. (2002), 1-20.
3. D. Crisan, T. Lyons: Minimal entropy approximations and optimal algorithms for the filtering problem, Monte Carlo Methods and Appl. Vol. 8, No. 4, 343-355, 2002.
4. D. Crisan, A. Doucet: A survey of convergence results on particle filtering methods for practitioners, IEEE T SIGNAL PROCES. 50 (2002), 736-746.
5. D.O. Crisan, T. Lyons: A particle approximation of the solution of the Kushner-Stratonovitch equation, Probability Theory and Related Fields 115 (1999), 549-578.
6. D. Crisan, S. Ghazali: On the convergence rates of a general class of weak approximations of SDEs, preprint.
7. S. Kusuoka, D. Stroock: Application of the Malliavin calculus III, J. Fac. Sci. Univ. Tokyo 1A 34, 391-442.
8. S. Kusuoka: Malliavin Calculus Revisited, J. Math. Sci. Univ. Tokyo 10 (2003), 261-277.
9. S. Kusuoka: Approximation of Expectation of Diffusion Process and Mathematical Finance, Advanced Studies in Pure Mathematics, Taniguchi Conference on Mathematics Nara '98, 147-165.
10. S. Kusuoka, S. Ninomiya: A new simulation method of diffusion processes applied to finance, Stochastic processes and applications to mathematical finance, 233-253, World Sci. Publ., River Edge, NJ, 2004.
11. S. Kusuoka: Approximation of expectation of diffusion processes based on Lie algebra and Malliavin calculus, Advances in math. economics Vol. 6, 69-83, Springer, Tokyo, 2004.
12. C. Litterer, T. Lyons: High order recombination and an application to Cubature on Wiener Space, preprint, 2006.
13. T. Lyons, N. Victoir: Cubature on Wiener space, Proc. R. Soc. Lond. A (2004) 460, 169-198.
14. T. Lyons: Differential equations driven by rough signals, Revista Mathematica Iber. (1998) Vol. 14, Nr. 2, 215-310.
15. S. Ninomiya, N. Victoir: Weak approximation of stochastic differential equations and application to derivative pricing, preprint, 2004.
16. S. Ninomiya: A partial sampling method applied to the Kusuoka approximation, Monte Carlo Methods and Appl. Vol. 9, No. 1, 27-38 (2003).
17. A. Nagel, E. M. Stein, S. Wainger: Balls and metric defined by vector fields I: Basic properties, Acta Math. 155 (1985), no. 1-2, 103–147.
18. H. J. Sussmann: Orbits of Families of Vector Fields and Integrability of Distributions, Trans. of the AMS, Volume 180, June 1973, 171-188.
19. N. Victoir: Personal communication.
20. N. Victoir: Asymmetric cubature formulae with few points in high dimension for symmetric measures, SIAM J. Numer. Anal., Vol. 42, No. 1, 209-227.
21. J.G. Wendel: A problem in geometric probability, Math. Scand. 11 (1962), 109-111.
A Remark on Impulse Control Problems with Risk-sensitive Criteria
Hideo Nagai∗
Division of Mathematical Science for Social Systems, Graduate School of Engineering Science, Osaka University, Toyonaka, 560-8531, Japan
A verification theorem for an impulse control problem related to optimal investment with transaction costs is studied. The problem leads to a risk-sensitive quasi-variational inequality (QVI) of ergodic type, whose solution is a pair (u, l) of a function u and a constant l. The constant determines the value of the impulse control problem, and an optimal strategy is constructed from the function u. In proving the verification theorem we also obtain uniqueness of the constant l of the solution of the QVI.
∗ Research supported in part by Grant-in-Aid for Scientific Research No. 16340025, JSPS.
1. Introduction
Optimal investment problems with transaction costs have been formulated as impulse control problems in the case where the cost of a transaction includes fixed components, and the optimal strategies are given through solving quasi-variational inequalities (QVIs) (cf. [1], [4], [7], [11], [5] etc.). To construct optimal strategies from the solutions of QVIs, strong regularity assumptions are usually imposed. In the present paper we are going to weaken the regularity assumptions. In [9] we discussed the problems with the risk-sensitive criterion in a very particular case of fixed transaction costs for the model with many assets, which involves the risk-sensitive QVIs introduced in a general setting by Bielecki and Pliska (cf. [4]); actually, VIs are studied in this special case. We are going to consider the problem in a setting with more general transaction costs and derive the risk-sensitive QVI of “ergodic type” in the present section. In this QVI the pair (u, l) of a function u and a constant l is considered to be a solution.
However, the underlying diffusion process is not ergodic and lives on the whole Euclidean space, which differs from the cases studied by M. Robin [12]. Moreover, a distinctive feature of the present QVIs is that they include nonlinear terms of quadratic growth in the first order derivatives, which makes them difficult to treat. The verification theorem in Section 2 complements Theorem 5.1 in [10]. Let us consider the following market model:
\[ dS^0(t) = r S^0(t)\,dt, \qquad S^0(0) = s^0, \tag{1.1} \]
\[ dS^i(t) = S^i(t)\Big\{\mu^i\,dt + \sum_{k=1}^m \sigma^i_k\,dW^k_t\Big\}, \qquad S^i(0) = s^i, \quad i = 1, \dots, m, \tag{1.2} \]
where $W_t = (W^1_t, \dots, W^m_t)^*$ is a standard Brownian motion process on a probability space $(\Omega, \mathcal{F}, P, \mathcal{F}_t)$ and $r$, $\mu^i$, $i = 1, \dots, m$, $\sigma^i_k$, $i, k = 1, \dots, m$, are all constants. Let $\pi(t) \equiv (\pi^0(t), \pi^1(t), \dots, \pi^m(t))^*$ be a shareholding process and $V(t) := \sum_{i=0}^m \pi^i(t) S^i(t)$ the wealth process. Define
\[ h^i(t) = \frac{\pi^i(t) S^i(t)}{V(t)}, \quad i = 0, 1, \dots, m, \tag{1.3} \]
and set the risky fraction vector $h(t) = (h^1(t), h^2(t), \dots, h^m(t))^*$. Then $V(t)$ satisfies
\[ \frac{dV(t)}{V(t)} = \{r + h(t)^*(\mu - r\mathbf{1})\}\,dt + h(t)^*\sigma\,dW_t, \qquad V_0 = v, \tag{1.4} \]
where $\mathbf{1} = (1, 1, \dots, 1)^*$. Suppose that the trading times are a sequence of stopping times without accumulation points. Then $\pi^i(t)$ must be piecewise constant for each $i$, and during the time periods with no trade we have
\[ dh_t = D(h_t)[E - \mathbf{1}h_t^*][\mu - r\mathbf{1} - \sigma\sigma^* h_t]\,dt + D(h_t)[E - \mathbf{1}h_t^*]\sigma\,dW_t, \qquad h_0 = \xi, \tag{1.5} \]
where $E$ is a unit matrix, $\xi^i = \pi^i(0)s^i / v$, $i = 1, 2, \dots, m$, and $D(h)$ is the diagonal matrix with components $(D(h))_{ij} = h^i \delta_{ij}$, $i, j = 1, 2, \dots, m$. If the investor possesses a positive amount of each security,
namely $0 < h^i(0) < 1$, $i = 0, 1, \dots, m$, then $0 < h^1(0) + h^2(0) + \dots + h^m(0) < 1$ and we can regard $h(t)$, governed by the stochastic differential equation (1.5) and starting at an interior point $\xi$ of the simplex
\[ H_m = \{(h^1, \dots, h^m);\ h^1 + h^2 + \dots + h^m < 1,\ 0 < h^i,\ i = 1, 2, \dots, m\}, \]
as a diffusion process on $H_m$. Its generator $L_0$ is written as
\[ L_0 u(h) = \tfrac12 \mathrm{tr}[D(h)(E - \mathbf{1}h^*)\sigma\sigma^*(E - h\mathbf{1}^*)D(h)\nabla^2 u] + (\mu - r\mathbf{1} - \sigma\sigma^* h)^*(E - h\mathbf{1}^*)D(h)\nabla u \]
\[ \phantom{L_0 u(h)} = \tfrac12 \mathrm{tr}(G^*\sigma\sigma^* G D^2 u) + (\mu - r\mathbf{1} - \sigma\sigma^* h)^* G \nabla u, \]
where
\[ G := (E - h\mathbf{1}^*)D(h) = \begin{pmatrix} h^1(1 - h^1) & -h^1 h^2 & \cdots & -h^1 h^m \\ -h^2 h^1 & h^2(1 - h^2) & \cdots & -h^2 h^m \\ \vdots & \vdots & \ddots & \vdots \\ -h^m h^1 & -h^m h^2 & \cdots & h^m(1 - h^m) \end{pmatrix}. \]
Note that
\[ \det G = h^1 h^2 \cdots h^m (1 - h^1 - h^2 - \dots - h^m) \tag{1.6} \]
and so the coefficients of $L_0$ degenerate on the boundary of $H_m$. Now let us construct a controlled process, regarding a sequence of pairs, each consisting of a trading time and the risky fraction just after the trade, as an impulse control. Set $\tau_0 = 0$, let $h^{(0)}_t = h_t(\xi)$ be the solution of (1.5) and
\[ \mathcal{G}^{(1)}_t := \cap_{\epsilon > 0}\, \sigma(h^{(0)}_s;\ s \le t + \epsilon). \]
Then, for a $\mathcal{G}^{(1)}_t$ stopping time $\tau_1$ and a $\mathcal{G}^{(1)}_{\tau_1}$ measurable $H_m$-valued random variable $\xi_1$, define
\[ h^{(1)}_t = h^{(1)}_t((\tau_1, \xi_1)) := \begin{cases} h^{(0)}_t, & t < \tau_1, \\ h_{t - \tau_1}(\xi_1), & \tau_1 \le t. \end{cases} \]
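The determinant identity (1.6) is easy to check numerically. The following is a minimal sketch, assuming the risky fractions are given as a plain vector; the function names `risky_fraction_matrix` and `det_formula` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def risky_fraction_matrix(h):
    """G = (E - h 1^*) D(h) for a risky-fraction vector h."""
    h = np.asarray(h, dtype=float)
    m = h.size
    return (np.eye(m) - np.outer(h, np.ones(m))) @ np.diag(h)

def det_formula(h):
    """Right-hand side of (1.6): h^1 ... h^m (1 - h^1 - ... - h^m)."""
    h = np.asarray(h, dtype=float)
    return float(np.prod(h) * (1.0 - np.sum(h)))

h = [0.2, 0.3, 0.1]                      # an interior point of H_3
G = risky_fraction_matrix(h)
print(np.linalg.det(G), det_formula(h))  # the two values should agree
```

On the face of the simplex where the fractions sum to one, for instance h = (0.5, 0.5), the matrix becomes singular, which is the degeneracy referred to in the text.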
Successively, for $i = 2, 3, \dots$, set
\[ \mathcal{G}^{(i)}_t := \cap_{\epsilon > 0}\, \sigma(h^{(i-1)}_s;\ s \le t + \epsilon), \]
and take a $\mathcal{G}^{(i)}_t$ stopping time $\tau_i$ and a $\mathcal{G}^{(i)}_{\tau_i}$ measurable $H_m$-valued random variable $\xi_i$. Then we can define a controlled process
\[ h^{(i)}_t = h^{(i)}_t((\tau_k, \xi_k)_{k = 1, 2, \dots, i}) := \begin{cases} h^{(i-1)}_t, & t < \tau_i, \\ h_{t - \tau_i}(\xi_i), & \tau_i \le t, \end{cases} \]
\[ \bar h_t = \bar h_t(\{(\tau_n, \xi_n)_n\}) := h^{(i)}_t, \qquad \tau_i \le t < \tau_{i+1}, \quad i = 0, 1, \dots. \tag{1.7} \]
Let $\mathcal{A}$ be the set of all strategies $\{(\tau_k, \xi_k)_k\}$ such that $P(\lim_{k \to \infty} \tau_k \le T) = 0$ for each $T$. The transaction cost at each trading time $\tau_k$, $k = 1, 2, \dots$, is assumed to be
\[ \{\delta_0 + (1 - \delta_0) D(\bar h_{\tau_k -}, \bar h_{\tau_k})\} V_{\tau_k -}, \]
where $0 < \delta_0 < 1$ is a constant and $D(h, h')$ is a function on $H_m \times H_m$ such that $0 \le D(h, h') \le 1 - \epsilon_1$, $0 < \epsilon_1 < 1$. So, the total wealth of the investor just after the $k$-th trading time becomes
(1.8)
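Under the new probability measure $\tilde P$ the criterion $J$ can be written as
\[ J = \gamma r + \limsup_{T \to \infty} \frac{1}{T} \log \tilde E[\exp\{\Phi_T(\bar h; \gamma)\}], \tag{1.14} \]
\[ \Phi_T(\bar h; \gamma) = \sum_{k=1}^{\infty} \gamma\{-c_0 + c(\bar h_{\tau_k -}, \bar h_{\tau_k})\} 1_{\tau_k \le T} + \gamma \int_0^T \tilde\eta(\bar h_s)\,ds. \]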
The quasi-variational inequality (QVI) corresponding to the problem of maximizing the criterion $J$ is
\[ \max\{\tilde L_0 u + \tfrac12 (\nabla u)^* G^* \sigma\sigma^* G \nabla u + \gamma\tilde\eta(h) - \hat l,\ Mu - u\} = 0. \tag{1.15} \]
QVI (1.15) is defined on the simplex $H_m$, and the coefficients of $\tilde L_0$ degenerate on the boundary of $H_m$ in view of (1.6), which makes it difficult to treat. Therefore, by introducing a coordinate transformation, we shall rewrite the QVI as one on the whole Euclidean space in which the diffusion coefficients have no degeneracy. Set $\varphi: \mathbb{R}^m \to H_m$ defined by
\[ h^i = \varphi^i(y^1, y^2, \dots, y^m) = \frac{e^{y^i}}{1 + \sum_{k=1}^m e^{y^k}}, \quad i = 1, \dots, m. \tag{1.16} \]
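A short numerical sketch of the coordinate transformation may help. It assumes that $\varphi$ is the logistic-type map $h^i = e^{y^i}/(1 + \sum_k e^{y^k})$, which is the form suggested by the expressions for $\bar\nu$ and $\bar\eta$ appearing later in this section; the function names `phi` and `phi_inverse` are hypothetical.

```python
import numpy as np

def phi(y):
    """Map y in R^m into the open simplex H_m (assumed logistic-type form)."""
    e = np.exp(np.asarray(y, dtype=float))
    return e / (1.0 + e.sum())

def phi_inverse(h):
    """Inverse map: y^i = log(h^i / (1 - h^1 - ... - h^m))."""
    h = np.asarray(h, dtype=float)
    return np.log(h / (1.0 - h.sum()))

y = np.array([-1.0, 0.5, 2.0])
h = phi(y)
print(h, h.sum())      # positive components with sum below 1
print(phi_inverse(h))  # recovers y up to rounding
```

Under this assumption the boundary of $H_m$ corresponds to $|y| \to \infty$, so a process living on all of $\mathbb{R}^m$ never reaches the set where $\det G$ vanishes.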
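Through this mapping (1.5)' is transformed to
\[ dy_t = \{\mu - r\mathbf{1} - d(\sigma\sigma^*) + \gamma\sigma\sigma^*\varphi(y_t)\}\,dt + \sigma\,d\tilde W_t, \qquad y_0 = \varphi^{-1}(\xi), \tag{1.5''} \]
where $d(\sigma\sigma^*)^i = \tfrac12 (\sigma\sigma^*)^{ii}$. Set $b = (\sigma\sigma^*)^{-1}(\mu - r\mathbf{1} - d(\sigma\sigma^*))$. Then (1.15) is transformed to
\[ \max\{\bar L \bar u + \tfrac12 (\nabla\bar u)^* \sigma\sigma^* \nabla\bar u + \gamma\bar\eta(y) - \hat l,\ \bar M\bar u - \bar u\} = 0, \tag{1.17} \]
where
\[ \bar L\bar u = \tfrac12 \mathrm{tr}(\sigma\sigma^* D^2\bar u) + (\nabla\bar\nu)^* \sigma\sigma^* \nabla\bar u, \qquad \bar\nu(y) = b^* y + \gamma \log\Big(1 + \sum_{k=1}^m e^{y^k}\Big), \]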
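\[ \bar\eta(y) = \sum_{i=1}^m (\mu - r\mathbf{1})^i \frac{\exp y^i}{1 + \sum_{k=1}^m \exp y^k} - \frac12 (1 - \gamma) \sum_{i,j=1}^m \frac{\exp y^i\, (\sigma\sigma^*)^{ij} \exp y^j}{(1 + \sum_{k=1}^m \exp y^k)^2} \]
and
\[ \bar M\bar u(y) = \sup_{y' \in \mathbb{R}^m} \{-\bar c_0 + \bar c(y, y') + \bar u(y')\}, \qquad \bar c_0 = \gamma c_0, \quad \bar c(y, y') = \gamma c(\varphi(y), \varphi(y')). \]
QVI (1.17) corresponds to the impulse control problem of maximizing the criterion
\[ J = \gamma r + \limsup_{T \to \infty} \frac{1}{T} \log \tilde E[\exp\{\Psi_T(\bar y; \gamma)\}], \tag{1.14'} \]
\[ \Psi_T(\bar y; \gamma) = \sum_{k=1}^{\infty} \{-\bar c_0 + \bar c(\bar y_{\tau_k -}, \bar y_{\tau_k})\} 1_{\tau_k \le T} + \gamma \int_0^T \bar\eta(\bar y_s)\,ds, \]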
where $\bar y_t$ is constructed from the solution $y_t$ of (1.5)'' in a similar way to (1.7). We shall study a verification theorem for this impulse control problem through QVI (1.17) in the next section in a slightly more general setting.
Finally we give some remarks on QVI (1.17). We can see that $\bar M u \in L^\infty$ for $u \in L^\infty$ because of (1.10) and that
\[ \|\bar M u - \bar M v\|_\infty \le \|u - v\|_\infty. \tag{1.18} \]
Moreover, noting that
\[ \sup_z |\bar c(y, z) - \bar c(y', z)| = \sup_z |c(\varphi(y), \varphi(z)) - c(\varphi(y'), \varphi(z))| \le K|\varphi(y) - \varphi(y')| \le K'|y - y'| \]
on account of our assumption (1.11), $\bar M$ turns out to be an operator from $W^{1,\infty}$ to itself. Furthermore, note that if $\bar u$ is a solution, then so is $\bar u + C$ for any constant $C$, and also that the pair of the function $\bar u$ and a constant $\hat l$ is considered to be a solution of (1.17). So it is a kind of QVI of “ergodic type”, although the underlying diffusion process is not always ergodic.
2. A Verification Theorem
Motivated by the discussions in the previous section, we shall reformulate an impulse control problem and its quasi-variational inequality of “ergodic type” in a more general setting.
We consider the stochastic differential equation
\[ dX_t = a\nabla\nu(X_t)\,dt + \sigma(X_t)\,dW_t, \qquad X_0 = x \in \mathbb{R}^m, \tag{2.1} \]
on a filtered probability space $(\Omega, \mathcal{F}, P; \mathcal{F}_t)$, where $a(x) = (a^{ij}(x)) = ((\sigma\sigma^*)^{ij}(x))$. We assume that
\[ \sigma_{ij}(x) \in W^{1,\infty}, \qquad \nabla\nu \in W^{1,\infty}, \qquad \sum_{i=1}^m D_i a^{ij}(x) = 0, \tag{2.2} \]
and that
\[ \xi^* a(x) \xi \ge \kappa|\xi|^2, \quad \forall \xi \in \mathbb{R}^m, \quad \exists \kappa > 0. \tag{2.3} \]
Then we can construct the controlled process $\bar X_t$ for a sequence of $\mathcal{F}^{(k)}_t$ stopping times $\tau_k$, $k = 1, 2, \dots$, and $\mathcal{F}^{(k)}_{\tau_k}$ measurable, $\mathbb{R}^m$-valued random variables $\xi_k$, $k = 1, 2, \dots$, as was done in the previous section for the solution $y_t$ of (1.5)''. Here $\mathcal{F}^{(k)}_t$ is defined in a similar way to $\mathcal{G}^{(k)}_t$ in the previous section. Take an impulse cost function $C(x, z)$ satisfying and
\[ -C_1 \le C(x, z) \le 0. \tag{2.4} \]
Then, for a given bounded Borel function $f(x)$ and a constant $C_0 > 0$ we define a criterion by
\[ J = \limsup_{T \to \infty} \frac{1}{T} \log E[\exp\{\Phi_T(\bar X)\}], \tag{2.5} \]
\[ \Phi_T(\bar X) = \sum_{k=1}^{\infty} \{-C_0 + C(\bar X_{\tau_k -}, \bar X_{\tau_k})\} 1_{\tau_k \le T} + \int_0^T f(\bar X_s)\,ds, \]
and consider the problem of maximizing the criterion $J$. The quasi-variational inequality of the problem is
\[ \max\{Lu + \tfrac12 (\nabla u)^* a(x) \nabla u + f(x) - l,\ Mu - u\} = 0, \tag{2.6} \]
where
\[ Lu = \frac12 \sum_{i,j} a^{ij}(x) D_{ij} u + (\nabla\nu)^* a(x) \nabla u \]
and
\[ Mu(x) = -C_0 + \sup_{z \in \mathbb{R}^m} \{C(x, z) + u(z)\}. \]
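To make the intervention operator concrete, here is a minimal grid-based sketch. It is only a discrete analogue, assuming the state space is replaced by finitely many points and the supremum by a maximum over that grid; the names `impulse_operator` and `continuation_region`, the grid and the sample cost function are hypothetical choices for illustration.

```python
import numpy as np

def impulse_operator(u, cost, c0):
    """Discrete analogue of Mu(x) = -C0 + sup_z {C(x, z) + u(z)}.
    u[j] approximates u(z_j); cost[i, j] approximates C(x_i, z_j)."""
    u = np.asarray(u, dtype=float)
    cost = np.asarray(cost, dtype=float)
    return -c0 + np.max(cost + u[None, :], axis=1)

def continuation_region(u, cost, c0):
    """Boolean mask of the grid points where u > Mu (no intervention)."""
    return np.asarray(u, dtype=float) > impulse_operator(u, cost, c0)

grid = np.linspace(-1.0, 1.0, 21)
cost = -0.1 * np.abs(grid[:, None] - grid[None, :])  # sample cost with values in [-0.2, 0]
u = np.cos(grid)
print(impulse_operator(u, cost, c0=0.5))
print(continuation_region(u, cost, c0=0.5))
```

Because the operator is a shifted maximum, it is non-expansive in the sup norm, which is the discrete counterpart of (1.18).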
As mentioned above, the pair of a function $u$ and a constant $l$ is considered to be a solution of the QVI. To study (2.6) we introduce the following bilinear form
\[ \cdots + \int (a(x)\nabla\nu)^* (\nabla\psi)\, \varphi\, \beta_\lambda^2\,dx, \]
defined on the weighted Sobolev space
\[ W^{1,2,\lambda} = \Big\{v \in L^2(\beta_\lambda^2\,dx);\ \int_{\mathbb{R}^m} |\nabla v|^2 \beta_\lambda^2\,dx < \infty\Big\}, \]
with the norm
\[ \|v\|_\lambda^2 = \int_{\mathbb{R}^m} (v^2 + |\nabla v|^2) \beta_\lambda^2\,dx, \qquad \beta_\lambda(x) = e^{-\lambda\sqrt{1 + |x|^2}}. \]
Consider the variational inequality
\[ a(\psi, v - \psi) - ((f - l)\psi, v - \psi) \ge 0, \quad \forall v \ge e^{M \log\psi}, \qquad \psi \ge e^{M \log\psi}, \tag{2.7} \]
and assume that (2.7) has a solution $(\psi, l)$ such that $\psi \in W^{1,2,\lambda} \cap W^{1,\infty}$ for suitable $\lambda > 0$. Then
\[ \psi(x) \ge e^{M \log\psi(x)} \ge e^{-C_0 + \sup_z\{C(x, z) + \log\psi(z)\}} \ge e^{-C_0 - C_1 + \log\psi(0)} \]
and we see that $u = \log\psi \in W^{1,\infty}$. Thus we have a solution $u \in W^{1,\infty}$ of (2.6). Moreover, note that under our assumptions the operator $M$ maps $W^{1,\infty}$ to itself. Then
\[ \mathcal{C} = \{x \in \mathbb{R}^m;\ u(x) > Mu(x)\} = \{x \in \mathbb{R}^m;\ \psi(x) > e^{M \log\psi(x)}\} \]
is an open set and (2.7) implies that
\[ L\psi + (f(x) - l)\psi = 0 \ \text{in } \mathcal{C}, \qquad L\psi + (f(x) - l)\psi \le 0 \]
in distribution sense and $\psi(x) \ge e^{M \log\psi(x)}$.
Consequently we see that
\[ Lu + \tfrac12 (\nabla u)^* a(x) \nabla u + (f(x) - l) = 0 \ \text{in } \mathcal{C}, \qquad Lu + \tfrac12 (\nabla u)^* a(x) \nabla u + (f(x) - l) \le 0 \tag{2.8} \]
in distribution sense and
\[ u(x) \ge Mu(x). \tag{2.9} \]
In what follows we assume that
\[ -C_0 + C(x, x') + C(x', z) \le C(x, z) - \beta, \quad \exists \beta > 0. \tag{2.10} \]
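Let us show how to construct an optimal strategy of the above impulse control problem from the solution $(u, l)$ of (2.6). For $x \in D := \{x \in \mathbb{R}^m;\ u(x) = Mu(x)\}$ take $\xi \in \mathbb{R}^m$ such that
\[ \sup_z \{C(x, z) + u(z)\} - \frac{\beta}{2} < C(x, \xi) + u(\xi). \]
Set $\hat\tau_1 = \inf\{t \ge 0;\ u(X_t) = Mu(X_t)\}$ and take $\hat\xi_1$ such that
\[ \sup_{x'} \{C(X_{\hat\tau_1 -}, x') + u(x')\} - \frac{\beta}{2} < C(X_{\hat\tau_1 -}, \hat\xi_1) + u(\hat\xi_1). \]
Successively we define
\[ \hat\tau_n = \inf\{t \ge \hat\tau_{n-1};\ u(\bar X^{(n-1)}_t) = Mu(\bar X^{(n-1)}_t)\} \]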
and take $\hat\xi_n$ such that
\[ \sup_{x'} \{C(\bar X^{(n-1)}_{\hat\tau_n -}, x') + u(x')\} - \frac{\beta}{2^n} < C(\bar X^{(n-1)}_{\hat\tau_n -}, \hat\xi_n) + u(\hat\xi_n), \]
$n = 2, 3, \dots$.
Theorem 2.1. Suppose that (2.6) has a solution $(u, l)$ such that $u \in W^{1,\infty}$ and assume (2.2)-(2.4) and (2.10). Then the strategy $\{(\hat\tau_n, \hat\xi_n)\}_n$ defined above is an optimal strategy:
\[ \sup_{\{\tau_n, \xi_n\}_n} J = \sup_{\{\tau_n, \xi_n\}_n} \limsup_{T \to \infty} \frac{1}{T} \log E[\exp\{\Phi_T(\bar X)\}] = l, \]
and as a consequence the constant $l$ of the solution to (2.6) is unique.
Proof of Theorem 2.1. By construction of $\bar X_t$, $\bar X_{\hat\tau_n} = \hat\xi_n$ and we have
\[ u(\bar X_{\hat\tau_n -}) - u(\bar X_{\hat\tau_n}) - \Big\{-C_0 + C(\bar X_{\hat\tau_n -}, \bar X_{\hat\tau_n}) + \frac{\beta}{2^n}\Big\} < 0. \]
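Let $\bar X_T \in \mathcal{C}$; then, for the largest $n$ satisfying $\hat\tau_n \le T$, $\bar X_t$, $\hat\tau_i \le t < \hat\tau_{i+1}$, $i = 0, 1, \dots, n - 1$, stay in $\mathcal{C}$. Moreover, $\psi \in W^{2,p}_{loc}(\mathcal{C})$ and so $u \in W^{2,p}_{loc}(\mathcal{C})$ and
\[ Lu + \tfrac12 (\nabla u)^* a(x) \nabla u + (f(x) - l) = 0 \quad \text{in } \mathcal{C}. \]
Therefore
\[ u(\bar X_T) - u(\bar X_0) > u(\bar X_T) - u(\bar X_{\hat\tau_n}) + \sum_{i=1}^n \{u(\bar X_{\hat\tau_i -}) - u(\bar X_{\hat\tau_{i-1}})\} - \sum_{i=1}^n \Big\{-C_0 + C(\bar X_{\hat\tau_i -}, \bar X_{\hat\tau_i}) + \frac{\beta}{2^i}\Big\} \]
\[ = \int_0^T Lu(\bar X_s)\,ds + \int_0^T (\nabla u)^* \sigma\,dW_s - \sum_{i=1}^n \Big\{-C_0 + C(\bar X_{\hat\tau_i -}, \bar X_{\hat\tau_i}) + \frac{\beta}{2^i}\Big\} \]
\[ = -\frac12 \int_0^T (\nabla u)^* \sigma\sigma^* \nabla u(\bar X_s)\,ds + \int_0^T (\nabla u)^* \sigma\,dW_s - \int_0^T (f(\bar X_s) - l)\,ds - \sum_{i=1}^n \Big\{-C_0 + C(\bar X_{\hat\tau_i -}, \bar X_{\hat\tau_i}) + \frac{\beta}{2^i}\Big\}. \]
Accordingly
\[ e^{\int_0^T (f(\bar X_s) - l)\,ds + \sum_{\hat\tau_i \le T} \{-C_0 + C(\bar X_{\hat\tau_i -}, \bar X_{\hat\tau_i})\}} \ge e^{-\beta - u(\bar X_T) + u(\bar X_0) - \frac12 \int_0^T (\nabla u)^* \sigma\sigma^* \nabla u(\bar X_s)\,ds + \int_0^T (\nabla u)^* \sigma\,dW_s}. \]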
Thus we see that
\[ \limsup_{T \to \infty} \frac{1}{T} \log E\Big[e^{\int_0^T f(\bar X_s)\,ds + \sum_{\hat\tau_i \le T} \{-C_0 + C(\bar X_{\hat\tau_i -}, \bar X_{\hat\tau_i})\}}\Big] \ge l. \]
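On the other hand, $u(y) \ge Mu(y)$ and
\[ u(\bar X_{\tau_i -}) \ge -C_0 + \sup_z \{C(\bar X_{\tau_i -}, z) + u(z)\} \ge -C_0 + C(\bar X_{\tau_i -}, \bar X_{\tau_i}) + u(\bar X_{\tau_i}) \]
for a general strategy $(\tau_i, \xi_i)_i$, $i = 1, 2, \dots$. Therefore
\[ u(\bar X_T) - u(\bar X_0) \le u(\bar X_T) - u(\bar X_{\tau_n}) + \sum_{i=1}^n \{u(\bar X_{\tau_i -}) - u(\bar X_{\tau_{i-1}})\} - \sum_{i=1}^n \{-C_0 + C(\bar X_{\tau_i -}, \bar X_{\tau_i})\} \]
and we have
\[ E\Big[e^{\int_0^T f(\bar X_s)\,ds + \sum_{\tau_i \le T} \{-C_0 + C(\bar X_{\tau_i -}, \bar X_{\tau_i})\}}\Big] \le E\Big[e^{\int_0^T f(\bar X_s)\,ds - (u(\bar X_T) - u(\bar X_0)) + u(\bar X_T) - u(\bar X_{\tau_n}) + \sum_{i=1}^n \{u(\bar X_{\tau_i -}) - u(\bar X_{\tau_{i-1}})\}}\Big]. \]
Let $\hat P$ be the probability measure defined by
\[ \frac{d\hat P}{dP}\Big|_{\mathcal{F}_T} = e^{-\int_0^T (\nabla\nu)^* \sigma(X_s)\,dW_s - \frac12 \int_0^T (\nabla\nu)^* a \nabla\nu(X_s)\,ds}. \]
Then under the probability measure $\hat P$, $\hat W_t = W_t + \int_0^t \sigma^* \nabla\nu(X_s)\,ds$ is a Brownian motion process and so
\[ dX_t = \sigma(X_t)\,d\hat W_t. \]
Since $X_t$ is a symmetric diffusion process with the generator $\hat L u = \frac12 \sum_{i,j=1}^m D_i(a^{ij}(x) D_j u)$ under $\hat P$, $u(X_t) - u(X_0)$ admits a decomposition
\[ u(X_t) - u(X_0) = \int_0^t (\nabla u)^* \sigma(X_s)\,d\hat W_s + N_t, \]
where $N_t$ is an additive functional of zero energy corresponding to $\hat L u$ ([6]). Although the decomposition usually admits exceptional sets, they can be removed in the present setting because the relevant diffusion process has a transition density. Since we have
\[ \hat L u + (\nabla\nu)^* a(x) \nabla u + \tfrac12 (\nabla u)^* a \nabla u + f(x) - l \le 0 \]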
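in distribution sense,
\[ -\Big(N_t + \int_0^t \{(\nabla\nu)^* a \nabla u(X_s) + \tfrac12 (\nabla u)^* a \nabla u(X_s) + f(X_s) - l\}\,ds\Big) \]
is a positive additive functional. Therefore
\[ u(X_t) - u(X_0) \le \int_0^t (\nabla u)^* \sigma(X_s)\,d\hat W_s - \int_0^t \{(\nabla\nu)^* a \nabla u(X_s) + \tfrac12 (\nabla u)^* a \nabla u(X_s) + f(X_s) - l\}\,ds. \]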
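By considering the controlled process under the probability measure $\hat P$ we see that
\[ E\Big[e^{\int_0^T f(\bar X_s)\,ds - (u(\bar X_T) - u(\bar X_0)) + u(\bar X_T) - u(\bar X_{\tau_n}) + \sum_{i=1}^n \{u(\bar X_{\tau_i -}) - u(\bar X_{\tau_{i-1}})\}}\Big] \]
\[ = \hat E\Big[e^{\int_0^T f(\bar X_s)\,ds - (u(\bar X_T) - u(\bar X_0)) + u(\bar X_T) - u(\bar X_{\tau_n}) + \sum_{i=1}^n \{u(\bar X_{\tau_i -}) - u(\bar X_{\tau_{i-1}})\}} \times e^{\int_0^T (\nabla\nu)^* \sigma(\bar X_s)\,d\hat W_s - \frac12 \int_0^T (\nabla\nu)^* a \nabla\nu(\bar X_s)\,ds}\Big] \]
\[ \le \hat E\Big[e^{lT - (u(\bar X_T) - u(\bar X_0)) + \int_0^T (\nabla u)^* \sigma(\bar X_s)\,d\hat W_s - \int_0^T \{(\nabla\nu)^* a \nabla u(\bar X_s) + \frac12 (\nabla u)^* a \nabla u(\bar X_s)\}\,ds} \times e^{\int_0^T (\nabla\nu)^* \sigma(\bar X_s)\,d\hat W_s - \frac12 \int_0^T (\nabla\nu)^* a \nabla\nu(\bar X_s)\,ds}\Big] \]
\[ = E\Big[e^{lT - u(\bar X_T) + u(\bar X_0) - \frac12 \int_0^T (\nabla u)^* a \nabla u(\bar X_s)\,ds + \int_0^T (\nabla u)^* \sigma(\bar X_s)\,dW_s}\Big]. \]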
Thus we see that
\[ \limsup_{T \to \infty} \frac{1}{T} \log E\Big[e^{\int_0^T f(\bar X_s)\,ds + \sum_{\tau_i \le T} \{-C_0 + C(\bar X_{\tau_i -}, \bar X_{\tau_i})\}}\Big] \le l \]
because of $u \in W^{1,\infty}$, and this concludes the proof of the theorem.
References
1. C. Atkinson and P. Wilmott, Portfolio management with transaction costs: an asymptotic analysis of Morton and Pliska model, Mathematical Finance 5 (1995) 357–367.
2. A. Bensoussan and J. L. Lions, Applications des Inéquations Variationnelles en Contrôle Stochastique, Dunod, Paris, 1978.
3. A. Bensoussan and J. L. Lions, Contrôle Impulsionnel et Inéquations Quasi-Variationnelles, Dunod, Paris, 1982.
4. T. R. Bielecki and S. R. Pliska, Risk sensitive asset management with transaction costs, Finance and Stochastics 4 (2000) 1–33.
5. A. Cadenillas, Consumption-investment problems with transaction costs: Survey and open problem, Math. Meth. Oper. Res. 51 (2000) 43–68.
6. M. Fukushima, Y. Oshima, and M. Takeda, Dirichlet Forms and Symmetric Markov Processes, de Gruyter, 1994.
7. A. J. Morton and S. Pliska, Optimal Portfolio Management with Fixed Transaction Costs, Math. Finance 5 (1995) 337–356.
8. H. Nagai, Risky Fraction Processes and Problems with Transaction Costs, Proceedings of the International Symposium on “Stochastic Processes and Applications to Mathematical Finance”, eds. J. Akahori, S. Ogawa, and S. Watanabe, World Scientific, (2004) 271–288.
9. H. Nagai, Stopping problems of certain multiplicative functionals and optimal investment with transaction costs, to appear in Appl. Math. Optim.
10. H. Nagai, Risk-sensitive quasi-variational inequalities for optimal investment with general transaction costs, Asymptotic Analysis 48 (2006) 243–265.
11. S. Pliska and M. J. P. Selby, On a free boundary problem that arises in portfolio management, Phil. Trans. Royal Soc. London 347 (1994) 555–561.
12. M. Robin, On some impulse control problems with long run average cost, SIAM J. Control Optim. 19 (1981) 333–358.
13. T. Tamura, Maximizing growth rate of a portfolio with fixed and proportional transaction costs, to appear in Appl. Math. Optim. (2006)
A Convolution Approach to Multivariate Bessel Processes
Thu Van Nguyen (1), S. Ogawa (2), and M. Yamazato (3)
(1) Department of Mathematics, International University, HCM City
(2) Department of Mathematical Sciences, Ritsumeikan University
(3) Department of Mathematics, Ryukyu University
In this paper we introduce and study Bessel processes $\{B^x_t\}$ which take values in the $d$-dimensional nonnegative cone $\mathbb{R}^d_+$ of $\mathbb{R}^d$ and are constructed via the multi-dimensional Kingman convolution. We prove that every $d$-variate Bessel process is a stationary independent increments-type process. Moreover, a stochastic integral with respect to $\{B^x_t\}$, with convergence in distribution, is defined.
AMS 2000 subject classification: Primary 60G48, 60G51, 60G57; Secondary 60J25, 60J60, 60J99
1. Introduction and Preliminaries
Let $\mathcal{P}$ denote the class of all p.m.'s on the positive half-line $\mathbb{R}_+$ endowed with the weak convergence, and let $\circ := *_{1,\beta}$ denote the Kingman convolution (Hankel transforms), which was introduced by Kingman [5] in connection with the addition of independent spherically symmetric random vectors in Euclidean $n$-space. Namely, for each continuous bounded function $f$ on $\mathbb{R}_+$ we write
\[ \int_0^\infty f(x)\, \mu *_{1,\beta} \nu(dx) = \frac{\Gamma(s + 1)}{\sqrt{\pi}\,\Gamma(s + \frac12)} \int_0^\infty \int_0^\infty \int_{-1}^1 f((x^2 + 2uxy + y^2)^{1/2})(1 - u^2)^{s - 1/2}\, \mu(dx)\,\nu(dy)\,du, \tag{1} \]
where $\mu, \nu \in \mathcal{P}$, $\beta = 2(s + 1) \ge 1$ (cf. Kingman [5] and Urbanik [15]). The Kingman convolution algebra $(\mathcal{P}, \circ)$ is the most important example of Urbanik convolution algebras (cf. Urbanik [15]). In the language of Urbanik convolution algebras, the characteristic measure, say $\sigma_s$, of the Kingman convolution has the Rayleigh density
(2)
with the characteristic exponent $\kappa = 2$ and the kernel $\Lambda_s$,
\[ \Lambda_s(x) = \Gamma(s + 1) J_s(x) / (x/2)^s. \tag{3} \]
The radial characteristic function (rad.ch.f.) of a p.m. $\mu \in \mathcal{P}$, denoted by $\hat\mu(u)$, is defined by
\[ \hat\mu(u) = \int_0^\infty \Lambda_s(ux)\,\mu(dx), \tag{4} \]
for every $u \in \mathbb{R}_+$. In particular, the rad.ch.f. of $\sigma_s$ is
\[ \hat\sigma_s(u) = \exp(-u^2), \quad u \in \mathbb{R}_+. \tag{5} \]
It should be noted that, since the rad.ch.f. is defined uniquely up to the dilation mapping $x \to ax$, $a > 0$, $x \in \mathbb{R}_+$, the representation (5) of the rad.ch.f. of $\sigma_s$ may differ from that in Urbanik [15]. It is known (cf. Kingman [5], Theorem 1) that the kernel $\Lambda_s$ itself is an ordinary ch.f. of a p.m., say $G_s$, defined on the interval $[-1, 1]$ as follows:
\[ dG_s(\lambda) = \frac{\Gamma(s + 1)}{\pi^{1/2}\,\Gamma(s + \frac12)} (1 - \lambda^2)^{s - \frac12}\,d\lambda \quad (s \in (-\tfrac12, \infty)), \qquad G_{-\frac12} = \tfrac12(\delta_1 + \delta_{-1}) \quad (s = -\tfrac12), \qquad G_\infty = \delta_0 \quad (s = \infty). \tag{6} \]
Thus if $\theta_s$ denotes a r.v. with distribution $G_s$, then for each $t \in \mathbb{R}_+$,
\[ \Lambda_s(t) = E\exp(it\theta_s) = \int_{-1}^1 \exp(itx)\,dG_s(x). \tag{7} \]
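The kernel (3), the representation (7) and the Kingman convolution (1) of two point masses can be checked numerically. The sketch below uses only the standard power series of the Bessel function $J_s$ and a midpoint rule, so it is restricted to $s > -1/2$ in the quadrature part and to moderate arguments in the series; the function names `kingman_kernel` and `integrate_against_G` and the sample values of $s$, $t$, $x$, $y$ are hypothetical choices.

```python
import math

def kingman_kernel(s, x, terms=60):
    """Lambda_s(x) = Gamma(s+1) J_s(x) / (x/2)^s, summed from the power series of J_s."""
    q = -(x / 2.0) ** 2
    term = total = 1.0
    for n in range(terms):
        term *= q / ((n + 1) * (n + s + 1))
        total += term
    return total

def integrate_against_G(s, g, n=4000):
    """Midpoint-rule value of the integral of g over [-1, 1] against dG_s, for s > -1/2."""
    c = math.gamma(s + 1) / (math.sqrt(math.pi) * math.gamma(s + 0.5))
    h = 2.0 / n
    nodes = (-1.0 + (i + 0.5) * h for i in range(n))
    return c * h * sum(g(u) * (1.0 - u * u) ** (s - 0.5) for u in nodes)

s, t = 1.5, 3.0
# Formula (7): the kernel is the characteristic function of G_s.
print(kingman_kernel(s, t), integrate_against_G(s, lambda u: math.cos(t * u)))

# Formula (1) with mu = delta_x, nu = delta_y: the rad.ch.f. of delta_x o delta_y
# at the point 1 should factorise as Lambda_s(x) * Lambda_s(y).
x, y = 1.0, 2.0
lhs = integrate_against_G(
    s, lambda u: kingman_kernel(s, math.sqrt(x * x + 2 * u * x * y + y * y)))
print(lhs, kingman_kernel(s, x) * kingman_kernel(s, y))
```

For $s = 1/2$ the kernel reduces to $\sin x / x$ and $G_s$ is the uniform distribution on $[-1, 1]$, which gives a convenient closed-form sanity check.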
Now we quote a definition of a Bessel process from Revuz-Yor [12].
Definition 1.1. A Bessel process is the square root of the unique strong solution of the SDE
\[ Z_t = x + 2\int_0^t \sqrt{Z_s}\,d\beta_s + \beta t, \tag{8} \]
for any $\beta \ge 0$ and $x \ge 0$. Here $(\beta_s)$ in the stochastic integral denotes a linear Brownian motion, while $\beta$ in the drift term is the dimension. It should be noted that Shiga and Watanabe [14] characterized the Bessel family as one-parameter semigroups of distributions on the path space $W = C(\mathbb{R}_+, \mathbb{R})$, which amounts to a convolution approach to Bessel processes. Our aim in this paper is to study Bessel processes via the Kingman convolution method. Therefore we will assume that the dimension $\beta \ge 1$. Moreover, we will consider the Bessel process started at 0 only and will denote it by $B(t)$, $t \ge 0$.
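As an illustration of Definition 1.1, the SDE (8) can be simulated with a simple Euler scheme. This is only a sketch: truncating the discretised path at zero is a choice made here to keep the square root defined, not part of the definition, and the function name `simulate_bessel` and its parameters are hypothetical.

```python
import numpy as np

def simulate_bessel(dim, x0, T, n_steps, n_paths, seed=0):
    """Euler scheme for Z_t = x0 + 2 int sqrt(Z_s) dW_s + dim * t; returns sqrt(Z_T).
    Negative values of the discretised Z are truncated at 0 (a choice of this sketch)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    z = np.full(n_paths, float(x0))
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        z = np.maximum(z + 2.0 * np.sqrt(z) * dw + dim * dt, 0.0)
    return np.sqrt(z)

b = simulate_bessel(dim=3.0, x0=1.0, T=1.0, n_steps=200, n_paths=20000)
print(np.mean(b ** 2))  # should be close to x0 + dim * T = 4
```

Taking expectations in (8) gives $E Z_t = x + \beta t$, which is the quantity the last line estimates.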
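2. The Cartesian Product of Kingman Convolution Algebras
This concept was introduced in Nguyen [10]. Namely, let $\mathcal{P}(\mathbb{R}^k_+)$ denote the class of all p.m.'s on $\mathbb{R}^k_+$ equipped with the weak convergence. Let $F_1, F_2 \in \mathcal{P}(\mathbb{R}^k_+)$ be of the product form
\[ F_i = \tau^1_i \times \dots \times \tau^k_i, \tag{9} \]
where $\tau^j_i \in \mathcal{P}$, $j = 1, 2, \dots, k$ and $i = 1, 2$. We put
\[ F_1 \bigcirc_k F_2 = (\tau^1_1 \circ \tau^1_2) \times \dots \times (\tau^k_1 \circ \tau^k_2). \tag{10} \]
Since convex combinations of p.m.'s of the form (9) are dense in $\mathcal{P}(\mathbb{R}^k_+)$, the relation (10) can be extended to arbitrary p.m.'s in $\mathcal{P}(\mathbb{R}^k_+)$. For every $F \in \mathcal{P}(\mathbb{R}^k_+)$ the $k$-dimensional radial ch.f. $\hat F$ is defined by
\[ \hat F(t) = \int_{\mathbb{R}^k_+} \prod_{j=1}^k \Lambda_s(t_j x_j)\,F(dx). \tag{11} \]
Let $\lambda, \lambda_1, \dots, \lambda_k$ be i.i.d. r.v.'s with the common distribution $G_s$. Let $X = (X_1, \dots, X_k)$ be an $\mathbb{R}^k_+$-valued random vector with distribution $F$. Further, suppose that the r.v.'s $X$ and $\Lambda$, where $\Lambda = (\lambda_1, \dots, \lambda_k)$, are independent. Set
\[ \Lambda X = \{\lambda_1 X_1, \dots, \lambda_k X_k\} \tag{12} \]
and
\[ G_s F \overset{d}{=} \Lambda X. \tag{13} \]
Then we have
\[ \hat F(y) = E(e^{i\langle y, \Lambda X\rangle}), \tag{14} \]
where $y = (y_1, \dots, y_k) \in \mathbb{R}^k_+$ and $\langle\cdot,\cdot\rangle$ denotes the inner product in $\mathbb{R}^k$. In fact, we have
\[ E(e^{i\langle(\lambda_1 y_1, \dots, \lambda_k y_k), X\rangle}) = \int_{\mathbb{R}^k_+} E(e^{i\sum_{j=1}^k y_j x_j \lambda_j})\,F(dx) = \int_{\mathbb{R}^k_+} \prod_{j=1}^k \Lambda_s(y_j x_j)\,F(dx) = \hat F(y). \]
Thus, $\hat F(y)$ is an ordinary symmetric $k$-dimensional ch.f., and hence it is uniformly continuous. The following theorem is a simple consequence of (1.3) and (2.2).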
Theorem 2.1. The pair $(\mathcal{P}(\mathbb{R}^k_+), \bigcirc_k)$ is a commutative topological semigroup with $\delta_0$ as the unit element. Moreover, the operation $\bigcirc_k$ is distributive w.r.t. convex combinations of p.m.'s in $\mathcal{P}(\mathbb{R}^k_+)$.
In the sequel, the pair $(\mathcal{P}(\mathbb{R}^k_+), \bigcirc_k)$ will be called a $k$-dimensional Kingman convolution algebra. For each vector $x \in \mathbb{R}^k_+$ the generalized translation operators (shortly, g.t.o.'s) $T^x$, $x \in \mathbb{R}^k_+$, acting on the Banach space $C_b(\mathbb{R}^k_+)$ of real bounded continuous functions $f$ are defined, for each $y \in \mathbb{R}^k_+$, by
\[ T^x f(y) = \int_{\mathbb{R}^k_+} f(u)\,\delta_x \bigcirc_k \delta_y(du). \tag{15} \]
In terms of these g.t.o.'s the $k$-dimensional rad.ch.f.'s of p.m.'s on $\mathbb{R}^k_+$ can be characterized as follows:
Theorem 2.2. A real bounded continuous function $f$ on $\mathbb{R}^k_+$ is a rad.ch.f. of a p.m. if and only if $f(0) = 1$ and $f$ is $\{T^x\}$-nonnegative definite in the sense that for any $x_1, \dots, x_k \in \mathbb{R}^k$ and $\lambda_1, \dots, \lambda_k \in \mathbb{C}$
\[ \sum_{i,j=1}^k \lambda_i \lambda_j T^{x_i} f(x_j) \ge 0. \tag{16} \]
(See [10] for the proof.)
Lemma 2.1. Every p.m. $F$ defined on $\mathbb{R}^k_+$ is uniquely determined by its rad.ch.f. $\hat F$ and the following formula holds:
\[ \widehat{F_1 \bigcirc_k F_2}(t) = \hat F_1(t) \hat F_2(t), \tag{17} \]
where $F_1, F_2 \in \mathcal{P}(\mathbb{R}^k_+)$ and $t \in \mathbb{R}^k_+$.
Proof. The formula (17) follows from formulas (1, 3, 12, 14). Now using the formulas (2, 3, 9) and a theorem of Weber ([6], p. 394) and integrating the function $\hat F(t_1 u_1, \dots, t_k u_k)$, $t_j, u_j \in \mathbb{R}_+$, $j = 1, \dots, k$, $k$ times w.r.t. $\sigma_s$, we get
\[ \int_{\mathbb{R}^k_+} \hat F(t_1 u_1, \dots, t_k u_k)\,\sigma_s(du_1) \dots \sigma_s(du_k) = \int_{\mathbb{R}_+} \dots \int_{\mathbb{R}_+} \int_{\mathbb{R}^k_+} \prod_{j=1}^k \Lambda_s(t_j x_j u_j)\,F(dx)\,\sigma_s(du_1) \dots \sigma_s(du_k) = \int_{\mathbb{R}^k_+} \prod_{j=1}^k \exp\{-t_j^2 x_j^2\}\,F(dx), \tag{18} \]
which, by the change of variables $y_j = x_j^2$, $j = 1, \dots, k$, and by the uniqueness of the $k$-dimensional Laplace transform, implies that $F$ is uniquely determined by the left-hand side of (18).
Definition 2.1. A distribution $F$ on $\mathbb{R}^k_+$ is said to be Rayleigh if $G_s F$ defined by (13) is a $k$-dimensional symmetric Gaussian p.m.
The following theorem is obvious and its proof is omitted.
Theorem 2.3. A distribution $F$ of a r.v. $X$ on $\mathbb{R}^k_+$ is Rayleigh if and only if for every $x \in \mathbb{R}^k_+$ the r.v. $\langle x, X\rangle$ is one-dimensional Rayleigh.
As in the case $k = 1$, the i.d. elements can be defined as follows:
Definition 2.2. A p.m. $\mu \in \mathcal{P}(\mathbb{R}^k_+)$ is called i.d. if for every natural $m$ there exists a p.m. $\mu_m$ such that
\[ \mu = \mu_m \bigcirc_k \dots \bigcirc_k \mu_m \quad (m \text{ terms}). \tag{19} \]
Moreover, a nonnegative stochastic process $\xi_t$, $t \in T$, is said to be i.d. if each of its finite dimensional distributions is i.d. Let $ID(\bigcirc_k)$ denote the class of all i.d. elements in $(\mathcal{P}(\mathbb{R}^k_+), \bigcirc_k)$. The following theorem is a slight generalization of Theorem 7 in Kingman [5].
Theorem 2.4. $\mu \in ID(\bigcirc_k)$ if and only if there exists a finite measure $M$ on $\mathbb{R}^k_+$ with the property that $M(\{0\}) = 0$ and for each $y = (t_1, \dots, t_k) \in \mathbb{R}^k_+$
\[ -\log\hat\mu(y) = \int_{\mathbb{R}^k_+} \Big(1 - \prod_{j=1}^k \Lambda_s(t_j x_j)\Big) \frac{1 + \|x\|^2}{\|x\|^2}\,M(dx), \tag{20} \]
where the integrand on the right-hand side of (20) is assumed to be
\[ \lim_{x \to 0} \Big(1 - \prod_{j=1}^k \Lambda_s(t_j x_j)\Big) \frac{1 + \|x\|^2}{\|x\|^2} = \sum_{j=1}^k t_j^2. \tag{21} \]
In particular, if $M = 0$ then $\mu$ becomes a Rayleigh measure with the rad.ch.f.
\[ -\log\hat\mu(y) = \sum_{j=1}^k \lambda_j t_j^2, \tag{22} \]
for any $y \in \mathbb{R}^k_+$ and $\lambda_j \ge 0$, $j = 1, \dots, k$.
Proof. The proof of the first part of the theorem is similar to that of Theorem 7 in Kingman [5]. To prove the remaining part we assume that $k = 2$. The proof for the case $k \ge 3$ is similar. For $t, x \in \mathbb{R}^2_+$ we put
\[ H = H(t_1, t_2, x_1, x_2) := \frac{1 - \Lambda_s(t_1 x_1)\Lambda_s(t_2 x_2)}{x_1^2 + x_2^2}. \tag{23} \]
By virtue of Kingman ([5], Formula (24)), by the series representation of $\Lambda_s(\cdot)$ (Kingman [5], Formula (4)) and by the fact that the measure $G_s$ is symmetric on the interval $[-1, 1]$, we have
$1 - \Lambda_s(t_1 x_1)\Lambda_s(t_2 x_2) =$
which proves (21). Now, letting $M$ in (20) tend to the zero measure and integrating both sides of (22) w.r.t. $\frac{1 + \|x\|^2}{\|x\|^2} M(dx)$, we conclude, by virtue of (25) and (26), that the formula (20) holds. Finally, since every projection of the limit p.m. is Rayleigh, it follows from Theorem 7 in Kingman [5] that the limit p.m. with rad.ch.f. of the form (22) must be a $k$-dimensional Rayleigh p.m.
It is evident, from (22), that $\mu$ is Rayleigh in $\mathbb{R}^k_+$ if and only if for each $y \in \mathbb{R}^k_+$ the image of $\mu$ under the projection $\Pi_y x = \langle x, y\rangle$ from $\mathbb{R}^k_+$ onto
$\mathbb{R}_+$ is Rayleigh on $\mathbb{R}_+$. Hence, and by the Cramér property of the Kingman convolution (cf. Urbanik [16]), we have the following theorem:
Theorem 2.5. Suppose that $\mu, \nu \in \mathcal{P}(\mathbb{R}^k_+)$ and $\mu \bigcirc_k \nu$ is Rayleigh. Then both of them are Rayleigh.
3. Multivariate Symmetric Random Walks
Given a p.m. $\mu \in \mathcal{P}$ and $n = 1, 2, \dots$, we put, for any $x \in \mathbb{R}_+$ and $E \in \mathcal{B}(\mathbb{R}_+)$, the Borel $\sigma$-field of $\mathbb{R}_+$,
\[ P_n(x, E) = \delta_x \circ \mu^{\circ n}(E), \tag{27} \]
where the power is taken in the sense of the convolution $\circ$. Using the rad.ch.f. one can show that $\{P_n(x, E)\}$ satisfies the Chapman-Kolmogorov equation, and therefore there exists a homogeneous Markov sequence, say $\{S^x_n\}$, $n = 0, 1, 2, \dots$, with $\{P_n(x, E)\}$ as its transition probability. More generally, we have
Lemma 3.1. Suppose that $\{\mu_j, j = 1, 2, \dots\}$ is a sequence of p.m.'s on $\mathbb{R}^k_+$. For any $0 \le n < m$, $x \in \mathbb{R}^k_+$, $E \in \mathcal{B}(\mathbb{R}^k_+)$, put
\[ P_{n,m}(x, E) = \delta_x \bigcirc_k \mu_n \bigcirc_k \mu_{n+1} \bigcirc_k \dots \bigcirc_k \mu_{m-1}(E). \tag{28} \]
Then $\{P_{n,m}(x, E)\}$ satisfies the Chapman-Kolmogorov equation, and therefore there exists a Markov sequence $\{X^x_n\}$, $n = 0, 1, 2, \dots$, with $P_{n,m}(x, E)$ as its transition probability.
Proof. It can be proved by using the rad.ch.f.
Since $\sigma_s$ is i.d. w.r.t. the Kingman convolution, the family of p.m.'s $q(t, x, E) := \sigma_s^{\circ t} \circ \delta_x(E)$, where $t, x \in \mathbb{R}_+$, $E$ is a Borel subset of $\mathbb{R}_+$ and the power is taken in the Kingman convolution sense, satisfies the Chapman-Kolmogorov equation and serves as the transition probability of a homogeneous Markov process $B^x_t$, $t, x \in \mathbb{R}_+$, such that, with probability 1, its realizations are continuous (cf. Nguyen [8] and Shiga-Watanabe [14]). Let $H_s$ be a $k$-dimensional Rayleigh measure with rad.ch.f. (22) and
\[ P(t, x, E) := H_s^{\bigcirc_k t} \bigcirc_k \delta_x(E), \tag{29} \]
where $t \ge 0$, $x \in \mathbb{R}^k_+$, $E$ is a Borel subset of $\mathbb{R}^k_+$ and the power is taken in the sense of the convolution $\bigcirc_k$. Then there exists a homogeneous Markov process, denoted by $\{B^x_t\}$, with values in $\mathbb{R}^k_+$ and transition probability (29).
Definition 3.1. Every Markov process $\{B^x_t\}$ with transition probability given by (28) is called a k-dimensional Bessel process.

From the above definition and by (28) we have:

Theorem 3.1. The rad.ch.f. of $\{B^x_t\}$, $t \ge 0$, is of the form
$$-\log E\Lambda(\langle y, B^x_t\rangle) = \langle y, x\rangle t + t\sum_{j=1}^{k}\lambda_j^2 y_j^2, \qquad (30)$$
where $y \in R^k_+$, $\lambda_j \ge 0$, $j = 1, \dots, k$, and $t \ge 0$.

Suppose that $X^j = \{X^j_1, X^j_2, \dots, X^j_k\}$, $j = 1, 2$, are $R^k_+$-valued independent r.v.'s with the corresponding distributions $F_j$, $j = 1, 2$. Put
$$X^1 \oplus X^2 = \{X^1_1 \oplus X^2_1, \dots, X^1_k \oplus X^2_k\}. \qquad (31)$$
Then we get a k-dimensional radial sum of r.v.'s. By induction one can define such an operation for a finite number of r.v.'s. It is evident that the radial sum is defined up to the distribution of the r.v.'s and that the operation is associative. It is a natural problem to consider the usual multiplication of an $R^k_+$-valued r.v. by a nonnegative scalar. It is easy to see that the multiplication is distributive w.r.t. the radial sums defined by (31), which helps us to introduce the following stochastic integral.

Definition 3.2. Let $\mathcal{C}$ be a σ-ring of subsets of a set X. A function
$$M : \mathcal{C} \to L^+ := L^+(\Omega, \mathcal{F}, P), \qquad (32)$$
where $L^+$ denotes the class of all nonnegative r.v.'s on $(\Omega, \mathcal{F}, P)$, is said to be a k-scattered random measure if
(i) $M(\emptyset) = 0$ (P.1);
(ii) for any $A, B \in \mathcal{C}$ with $A \cap B = \emptyset$, $M(A)$ and $M(B)$ are independent and
$$M(A \cup B) \stackrel{d}{=} M(A) \oplus M(B);$$
(iii) for any $A_1, A_2, \dots \in \mathcal{C}$, the r.v.'s $M(A_j)$, $j = 1, 2, \dots$, are independent and
$$M\Big(\bigcup_{j=1}^{\infty} A_j\Big) \stackrel{d}{=} \bigoplus_{j=1}^{\infty} M(A_j), \qquad (33)$$
where the series on the right-hand side of (33) is convergent in distribution.

It should be noted that the above definition of a k-scattered random measure is subject to equality in distribution which, however, can be modified in the same way as in Rajput and Rosinski ([11], Lemma 5.1 and Theorem 5.2) so that the new k-scattered random measure is defined almost surely. Specifically, we state without proof the above mentioned lemma used by Rajput and Rosinski.

Lemma 3.2. (O. Kallenberg) Let $\xi$ and $\eta'$ be random elements defined on the probability spaces $(\Omega, P)$ and $(\Omega', P')$, and taking values in the spaces S and T, respectively, where S is a separable metric space and T is a Polish space. Assume that $\xi \stackrel{d}{=} f(\eta')$ for some Borel measurable function $f : T \to S$. Then there exists a random element $\eta \stackrel{d}{=} \eta'$ on the ("randomized") probability space $(\Omega \times [0, 1], P \times Leb)$ such that $\xi = f(\eta)$ a.s. $P \times Leb$.

It is well known that if $\{W(t)\}$, $t \in R_+$, is a Wiener process, then there exists a Gaussian stochastic measure $N(A)$, $A \in \mathcal{B}_0$, where $\mathcal{B}_0$ is the σ-ring of bounded Borel subsets of $R_+$, with the property that, for every $t \ge 0$, $W(t) = N((0, t])$. The same is also true for Bessel processes. Namely, we get

Theorem 3.2. Suppose that $\{B^0_t\}$ is a Bessel process started at 0. Then there exists a unique k-scattered random measure $\{M(A)\}$, $A \in \mathcal{B}_0$, such that for each $t \ge 0$
$$M((0, t]) \stackrel{d}{=} B^0_t. \qquad (34)$$

Proof. It is the same as the proof for the case k = 1 in Nguyen ([10], Theorem 4.2).

Definition 3.3. Let M be a k-scattered random measure defined by the equation (33). Then for any $0 \le s < t$ the quantities $M((s, t])$ are called k-increments of the Bessel process $\{B^0_t\}$.

By the same reasoning as in Nguyen ([10], Theorem 4.3) we have

Theorem 3.3. Every k-dimensional Bessel process $B^0_t$, $t \ge 0$, is a stationary independent k-increments process.

Now we proceed to construct a new non-linear stochastic integration of a nonnegative function w.r.t. a Bessel process. For simplicity we assume that k = 1 and write the Bessel process started at 0 as $B(t)$, $t \ge 0$. Let M denote the ◦-scattered random measure associated with $B(\cdot)$ and let $L^2_+[0, T]$, $T > 0$, be
the Hilbert space of all measurable nonnegative functions f on [0, T] such that
$$\|f\|^2 := \int_0^T f(u)^2\,du < \infty. \qquad (35)$$
Given a partition $\Pi := \{t_0 = 0 < t_1 < \dots < t_N \le T\}$ of an interval [0, T], T > 0, we put
$$f_\Pi(t) = \sum_{i=0}^{N} f_{t_i}\,\chi_{(t_i, t_{i+1}]}(t). \qquad (36)$$
Then the integral $\int_0^T f_\Pi(t)\,d^\circ B(t)$ is defined as
$$\int_0^T f_\Pi(t)\,d^\circ B(t) \stackrel{d}{=} \bigoplus_{i=1}^{N} f_{t_i}\,B([t_i, t_{i+1})). \qquad (37)$$
The integral $\int_0^T f(t)\,d^\circ B(t)$ is defined as
$$\int_0^T f(t)\,d^\circ B(t) = \lim_{|\Pi| \to 0} \bigoplus_{i=1}^{N} f_i\,M(t_i, t_{i+1}), \qquad (38)$$
where $|\Pi| := \max\{t_{i+1} - t_i,\ i = 0, 1, \dots, N\}$, $\oplus$ denotes the radial sum, and the limit is taken in the distribution sense, provided it exists.

Theorem 3.4. For each function $f \in L^2_+[0, T]$ the integral (38) exists in the sense of convergence in distribution, and for any α > 0 the rad.ch.f. of $S := \int_0^T \alpha f(u)\,d^\circ B(u)$ is given by
$$-\log E\Lambda_s(vS) = v^2 \int_0^T f^2(u)\,du, \qquad v \ge 0. \qquad (39)$$

Proof. We have
$$-\log E\Lambda_s\Big(v \bigoplus_{i=1}^{N} f_i\,M(t_i, t_{i+1})\Big) = v^2 \sum_{i=1}^{N} (t_{i+1} - t_i) f_i^2 \to v^2 \int_0^T f^2(u)\,du, \qquad (40)$$
which implies the conclusion of the theorem.

By the above definition and by using the rad.ch.f. we get the following theorem:

Theorem 3.5. Let $f_1, f_2 \in L^2_+[0, T]$ and $c \ge 0$.
(i) We have
$$\int_0^T c\,d^\circ B(t) = cB(T); \qquad (41)$$
(ii) If $supp(f_1) \cap supp(f_2) = \emptyset$, then $\int_0^T f_1(t)\,d^\circ B(t)$ and $\int_0^T f_2(t)\,d^\circ B(t)$ are independent and
$$\int_0^T \{f_1(t) + f_2(t)\}\,d^\circ B(t) = \int_0^T f_1(t)\,d^\circ B(t) + \int_0^T f_2(t)\,d^\circ B(t); \qquad (42)$$
(iii) (non-linearity) In general
$$\int_0^T \{f_1(t) + f_2(t)\}\,d^\circ B(t) \ne \int_0^T f_1(t)\,d^\circ B(t) + \int_0^T f_2(t)\,d^\circ B(t); \qquad (43)$$
(iv) If $f_n \to f$ in $L^2_+[0, T]$, then
$$\int_0^T f_n(t)\,d^\circ B(t) \to \int_0^T f(t)\,d^\circ B(t) \qquad (44)$$
in distribution.

References

1. Bingham, N. H., Random walks on spheres, Z. Wahrscheinlichkeitstheorie Verw. Geb. 22 (1973), 169–172.
2. Cox, J. C., Ingersoll, J. E. Jr., and Ross, S. A., A theory of the term structure of interest rates, Econometrica 53(2), 1985.
3. Jeanblanc, M., Pitman, J., and Yor, M., Self-similar processes with independent increments associated with Lévy and Bessel processes, 100, No. 1-2 (2002), 223–231.
4. Kallenberg, O., Random measures, 3rd ed., New York: Academic Press 1983.
5. Kingman, J. F. C., Random walks with spherical symmetry, Acta Math. 109 (1963), 11–53.
6. Lebedev, N. N., Special functions and their applications, Prentice-Hall, Inc., Englewood Cliffs, N.J. 1965.
7. Levitan, B. M., Generalized translation operators and some of their applications, Israel Program for Scientific Translations, Jerusalem 1962.
8. Nguyen, V. T., Generalized independent increments processes, Nagoya Math. J. 133 (1994), 155–175.
9. Nguyen, V. T., Generalized translation operators and Markov processes, Demonstratio Mathematica 34, No. 2, 295–304.
10. Nguyen, V. T., A convolution approach to Bessel processes, submitted to Urbanik Volume, Prob. Math. Stat. 2006.
11. Rajput, B. S., Rosinski, J., Spectral representation of infinitely divisible processes, Probab. Th. Rel. Fields 82 (1989), 451–487.
12. Revuz, D. and Yor, M., Continuous martingales and Brownian motion, Springer-Verlag, Berlin Heidelberg 1991.
13. Sato, K., Lévy processes and infinitely divisible distributions, Cambridge University Press 1999.
14. Shiga, T., Watanabe, S., Bessel diffusions as a one-parameter family of diffusion processes, Z. Wahrscheinlichkeitstheorie Verw. Geb. 27 (1973), 34–46.
15. Urbanik, K., Generalized convolutions, Studia Math. 23 (1964), 217–245.
16. Urbanik, K., Cramér property of generalized convolutions, Bull. Polish Acad. Sci. Math. 37, No. 16 (1989), 213–218.
Spectral Representation of Multiply Selfdecomposable Stochastic Processes and Applications∗

Dedicated to Professor Dang Dinh Ang for his 80th birthday

Nguyen Van Thu¹, To Anh Dung², Duong Ton Dam², and Nguyen Huu Thai³

¹ Department of Mathematics, International University, HCM City
² Department of Mathematics, University of Natural Sciences, HCM City
³ Department of Mathematics and Statistics, University of Economics, HCM City

In the present paper we study multiply selfdecomposable probability measures (SDPM) and processes and prove their integral representations. The case of multiple s-selfdecomposability is treated similarly. Our results extend some known results due to Urbanik, K., Jurek, Z., Rosinski, J. and Rajput, B. S. As an application, following Cartea and Howison ([1]), we introduce the Damped-Lévy-mixed-stable process, which leads to a mathematical model for option pricing.

AMS 2000 subject classification: Primary 60E07, 60B12, 60G10; Secondary 60G51, 60H05, 60E07.

Key words: infinitely divisible processes, α-SDP, random measures, stochastic integration, α-s-SDPM

1. Introduction, Notation and Preliminaries

The main aim of this paper is to prove that each multiply selfdecomposable process (MSDP) on a Euclidean space admits a stochastic integral representation w.r.t. an MSD random measure (RM). Moreover, we will consider similar problems for multiply s-selfdecomposable processes (MsSDP).

∗ The paper was completed during the first and second authors' stay at the Department of Mathematics, Ritsumeikan University.
Throughout the paper we shall denote by X a fixed d-dimensional (d = 1, 2, ...) Euclidean space with the usual inner product $\langle\cdot,\cdot\rangle$ and norm $\|\cdot\|$. Let P(X) denote the class of all probability measures (PM) on the σ-field B(X) of Borel subsets of X, equipped with the weak convergence. Let us denote by ID(X) the class of all IDPM's on the space. Given a positive number r we define on X the following two families of mappings $T_r$ and $U_r$:
$$T_r x = rx, \qquad U_r x = \max(0, \|x\| - r)\,\frac{x}{\|x\|}, \quad U_r(0) = 0. \qquad (1)$$
Further, for a PM $\mu \in P(X)$ and a mapping T on X let $T\mu$ denote the image of µ under T. Recall (cf. Loève [12] and Sato [25]) that a PM $\mu \in P(X)$ is called SD if for each 0 < c < 1 there exists a PM $\mu_c$ such that
$$\mu = T_c\mu * \mu_c, \qquad (2)$$
where ∗ denotes the ordinary convolution of PM's. The concept of shrinking SDPM (shortly, s-SDPM) was introduced by Medgyessy [14] and studied by Jurek [3], [4], [6]. Namely, a PM µ is called s-SD if it is ID and for each 0 < c < 1 there exists a PM $\mu_c$ such that
$$\mu = U_c\mu^{c} * \mu_c, \qquad (3)$$
where the power is taken in the convolution sense. It is known [3], [4], [6], [29], [17] that if µ is SD (resp., s-SD) then µ and $\mu_c$ are both ID. The class of all SDPM's (resp., s-SDPM's) on X is denoted by L(X) (resp., U(X)). Let $L_n(X)$, n = 1, 2, ... (resp., $U_n(X)$, n = 1, 2, ...) denote the class of all n-times SDPM's (resp., n-times s-SDPM's), which were first introduced by Urbanik¹ [29] (resp., Jurek [4]) and then studied further by many other authors (cf., for example, [4], [17], [25]). They are defined recursively as follows: a p.m. µ belongs to $L_n(X)$, n = 2, 3, ..., if and only if $\mu \in L_1(X)$ and for each c ∈ (0, 1) the component $\mu_c$ in (2) belongs to $L_{n-1}(X)$.

¹ It should be noted that our notation $L_n(X)$, used here and in references [17], [18], differs from that of Urbanik and other authors [4], [25]. In particular, in our notation $L_1(X)$ denotes the set of all SDPM's on X, while in [4], [25] this class was denoted by $L_0(X)$.

It has been proved by Nguyen ([18], Proposition 1.1) that a p.m. µ belongs to $L_n(X)$, n = 1, 2, ..., if and only if, for every c ∈ (0, 1), there
exists a p.m. $\nu := \mu_{c,n} \in ID(X)$ such that the following equality holds:
$$\mu = \mathop{*}_{k=0}^{\infty} (T_{c^k}\nu)^{*r_{k,n}}, \qquad (4)$$
where the power is taken in the convolution sense and, for n = 1, 2, ...; k = 0, 1, 2, ..., we put
$$r_{k,n} = \binom{n+k-1}{k}. \qquad (5)$$
The formulas (4) and (5) lead to the following interpolation of the classes $L_n(X)$ (cf. Nguyen [18] and [19]): for each α > 0 we put
$$\binom{\alpha}{k} = \begin{cases} 1, & k = 0, \\ \alpha(\alpha-1)\cdots(\alpha-k+1)/k!, & k = 1, 2, \dots \end{cases} \qquad (6)$$
and introduce the class of α-times SDPM's, shortly α-SDPM's, as follows:

Definition 1.1. (cf. Nguyen [19]) A p.m. µ belongs to $L_\alpha(X)$, α > 0, i.e. it is an α-times SDPM, if and only if, for every c ∈ (0, 1), there exists a p.m. $\nu := \mu_{c,\alpha} \in ID(X)$ such that the following equality holds:
$$\mu = \mathop{*}_{k=0}^{\infty} (T_{c^k}\nu)^{*r_{k,\alpha}}, \qquad (7)$$
where the power is taken in the convolution sense and, for any α > 0 and k = 0, 1, 2, ..., we put
$$r_{k,\alpha} = \binom{\alpha+k-1}{k}. \qquad (8)$$
It should be noted (cf. [19]) that the infinite convolution (7) is weakly convergent if and only if
$$\int_X \log^\alpha(1 + \|x\|)\,\nu(dx) < \infty. \qquad (9)$$
In the sequel, we shall denote by $ID_{\log^\alpha}(X)$ the subclass of ID(X) of all distributions for which the condition (9) is satisfied. Further, if {X(t)} is an X-valued Lévy process with $\nu \stackrel{d}{=} X(1)$ and $\nu \in ID_{\log^\alpha}(X)$, then we say that it is of the class $ID_{\log^\alpha}(X)$.

Now, let us quote the following important integral representation for SDPM's due to Jurek and Vervaat [5]:

Theorem 1.1. (Jurek-Vervaat) A p.m. µ belongs to $L_1(X)$ if and only if there exists an X-valued Lévy process $\{X(\cdot)\}$ of the class $ID_{\log}(X)$ such that
$$\mu \stackrel{d}{=} \int_0^\infty \exp(-t)\,X(dt). \qquad (10)$$
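The exponents in (5), (6) and (8) are easy to evaluate numerically. The following is a minimal Python sketch, not part of the original paper; the function names `gen_binom` and `r` are illustrative assumptions. It computes the generalized binomial coefficient of (6) by its product formula and the exponent $r_{k,\alpha}$ of (8), which for integer α = n reduces to the ordinary binomial coefficient in (5).

```python
import math


def gen_binom(alpha: float, k: int) -> float:
    """Generalized binomial coefficient C(alpha, k) as in (6).

    Uses the product alpha (alpha - 1) ... (alpha - k + 1) / k!,
    with the empty product (k = 0) equal to 1.
    """
    result = 1.0
    for j in range(k):
        result *= (alpha - j) / (j + 1)
    return result


def r(k: int, alpha: float) -> float:
    """Exponent r_{k,alpha} = C(alpha + k - 1, k) from (8)."""
    return gen_binom(alpha + k - 1, k)


# For integer alpha = n the exponent agrees with (5).
for n in (1, 2, 3):
    print(n, [round(r(k, n)) for k in range(6)])

# A fractional order, e.g. alpha = 1/2.
print([r(k, 0.5) for k in range(4)])
```

For α = 1 all exponents equal 1, so (7) reduces to the series representation of an ordinary SDPM.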
The integrator $\{X(\cdot)\}$ is called the background driving Lévy process (shortly, BDLP) of µ (cf. [5], [6]) and the r.v. X(1) is called the BD r.v. Further, Nguyen, N. H. [16] obtained the following generalization of Theorem 1.1 to the case of α-SDPM's for each α > 0.

Theorem 1.2. (Nguyen, N. H. [16]) A p.m. µ belongs to $L_\alpha(X)$ if and only if there exists an X-valued Lévy process $\{X_\alpha(t)\}$ of the class $ID_{\log^\alpha}(X)$ such that
$$\mu \stackrel{d}{=} \int_0^\infty \exp(-t)\,t^{\alpha-1}\,X_\alpha(dt). \qquad (11)$$
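As an illustration of the representation (10) (the case α = 1 of (11)), consider the simplest possible BDLP. The sketch below is not from the paper: it assumes d = 1 and takes the BDLP to be a standard Brownian motion W, which is an assumption made only for this example. In that case $\int_0^\infty e^{-t}\,dW(t)$ is centered Gaussian with variance $\int_0^\infty e^{-2t}\,dt = 1/2$, and (2) holds with $\mu_c = N(0, (1 - c^2)/2)$. The function name `simulate_sd_integral`, the truncation horizon and the step size are likewise illustrative choices.

```python
import math
import random
import statistics


def simulate_sd_integral(n_paths=2000, horizon=8.0, dt=0.02, seed=0):
    """Approximate samples of int_0^horizon exp(-t) dW(t).

    The integral is replaced by a Riemann sum with the weight exp(-t)
    evaluated at the midpoint of each time step; the infinite upper
    limit is truncated at `horizon`, where exp(-t) is negligible.
    """
    rng = random.Random(seed)
    n_steps = int(round(horizon / dt))
    weights = [math.exp(-(i + 0.5) * dt) for i in range(n_steps)]
    sqrt_dt = math.sqrt(dt)
    samples = []
    for _ in range(n_paths):
        total = 0.0
        for w in weights:
            # Brownian increment over one step, distributed as N(0, dt).
            total += w * rng.gauss(0.0, sqrt_dt)
        samples.append(total)
    return samples


samples = simulate_sd_integral()
print("sample mean:    ", statistics.fmean(samples))      # theory: 0
print("sample variance:", statistics.pvariance(samples))  # theory: 1/2
```

The sample variance should be close to 1/2 up to Monte Carlo error; a non-Gaussian BDLP would require simulating its increments instead of `rng.gauss`.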
In the sequel we shall need the following representation of ch.f.'s of ID and MSDPM's on X:

Theorem 1.3. (cf. [20], [24]) A p.m. µ is ID if and only if its ch.f. $\hat\mu(y)$, $y \in X$, is of the unique form
$$-\log\hat\mu(y) = i\langle z, y\rangle + \langle \Sigma y, y\rangle - \int_X \big(e^{i\langle x, y\rangle} - 1 - i\langle \tau(x), y\rangle\big)\,M(dx), \qquad (12)$$
where $z \in X$ is fixed; Σ is a quadratic form on X and M is a Lévy measure on X, characterized by the property that $M(\{0\}) = 0$, M is finite outside of every neighborhood of the origin and
$$\int_{U_1} \frac{\|x\|^2}{1 + \|x\|^2}\,M(dx) < \infty;$$
the function τ(x) is defined by
$$\tau(x) = \begin{cases} x, & x \in U_1, \\ x/\|x\|, & \|x\| > 1, \end{cases}$$
$U_1$ being the closed unit ball in X.

In what follows, if µ is ID with the ch.f. given by (12) then we will identify it with the triple [z, Σ, M]. Thus, we have

Theorem 1.4. (cf. Nguyen [19], Theorem 2.4) A p.m. µ belongs to $L_\alpha(X)$, α > 0, if and only if µ = [z, Σ, M], where z, Σ are the same as in Theorem 1.3 and the Lévy measure M is given by
$$M(A) = \int_X v_\alpha(x)\Big(\int_0^\infty \chi_A(e^{-u}x)\,u^{\alpha-1}\,du\Big)\,m(dx), \qquad (13)$$
where m is a finite measure on X vanishing at the origin; A is a Borel subset of the real line separated from 0; the weight function $v_\alpha(x)$ is defined by
$$v_\alpha(x)^{-1} = \int_0^\infty \frac{e^{-2t}\|x\|^2}{1 + e^{-2t}\|x\|^2}\,t^{\alpha-1}\,dt. \qquad (14)$$
Theorem 1.5. (Nguyen [17], [19]) A p.m. µ is mixed-stable, i.e. $\mu \in L_\infty(X)$, if and only if µ = [z, Σ, M], where z, Σ are the same as in Theorem 1.3 and the Lévy measure M of µ is given by
$$M(A) = \int_{V_1}\int_0^\infty \chi_A(tx)\,\frac{dt}{t^{2\|x\|+1}}\,h(x)\,\nu(dx), \qquad (15)$$
where ν is a PM on the open unit ball $V_1 := \{x \in X : \|x\| < 1\}$ and h(x) is a nonnegative continuous weight function on $V_1$.

2. Mappings $\{T_c^{(\alpha)}\}$ and Classes $\{L_\alpha(X)\}$

In this section we introduce families of mappings $\{T_c^{(\alpha)}\}$, where 0 < c < 1 and α > 0, acting on the whole class ID(X), and show that they play the same role as the mappings $T_c$ in the definition of α-SDPM's. To begin with, let us consider the following particular cases:

2.1 α = n = 1, 2, ... Let $\mu \in L_n(X)$, n = 1, 2, .... By Proposition 1.1 [19], for every 0 < c < 1 Eq. (4) holds. Putting
$$T_c^{(n)}\mu = \mathop{*}_{k=1}^{\infty} (T_{c^k}\nu)^{*r_{k,n}} \qquad (16)$$
and taking into account (4) we have
$$\mu = T_c^{(n)}\mu * \mu_{c,n}. \qquad (17)$$
The converse is also true. Namely, by induction one can prove that if a PM µ satisfies Eq. (17) for each 0 < c < 1 and for a PM $\mu_{c,n}$, then it belongs to $L_n(X)$.

2.2 0 < α < 1. This case was treated in [19]. Namely, for such α the mapping $T_{c,\alpha}$ is defined in [19]. Then, by Theorem 2.1 [19], it follows that a PM µ belongs to $L_\alpha(X)$ if and only if for every 0 < c < 1 there exists a PM $\mu_{c,\alpha}$ such that
$$\mu = T_{c,\alpha}\mu * \mu_{c,\alpha}. \qquad (18)$$

2.3 The general case α > 0. It is easy to show that
$$1 = \sum_{k=1}^{\infty} (-1)^{k-1} r_{k,\alpha} = \sum_{k=1}^{\infty} |r_{k,\alpha}|. \qquad (19)$$
Consequently, the mapping $T_{c,\alpha} : ID(X) \to ID(X)$ given by
$$T_{c,\alpha}\mu = \mathop{*}_{k=1}^{\infty} T_{c^k}\mu^{*|r_{k,\alpha}|} \qquad (20)$$
for any 0 < c < 1 and α > 0 is well-defined. Furthermore, the following general theorem holds:

Theorem 2.1. A PM µ belongs to $L_\alpha(X)$, α > 0, if and only if for each 0 < c < 1 there exists a PM $\mu_{c,\alpha}$ such that Eq. (18) holds.

Proof. The "if" part is similar to the proof of Theorem 2.1 [19]. To prove the "only if" part one may assume that α = β + n, where 0 < β < 1, n = 1, 2, .... But it is clear by virtue of the cases 2.1 and 2.2 and by noticing that the mappings $T_{c,n}$ and $T_{c,\beta}$ commute with each other.

Theorem 2.2. (α-differentiability of α-SDPM's on X) For every α > 0 and every PM $\mu \in L_\alpha(X)$ there exists a weak limit, denoted by $D_\alpha\mu$, which belongs to $ID_{\log^\alpha}(X)$ and satisfies the equation
$$D_\alpha\mu = \lim_{t \to 0} \mu_{c,\alpha}^{t^{-\alpha}}, \qquad (21)$$
where t = −log c and $\mu_{c,\alpha}$ is as in (7) and (17).

Proof. See Nguyen [19], Theorem 2.4.

Definition 2.1. (cf. Nguyen [19]) The limit measure $D_\alpha\mu$ in Theorem 2.2 is called the α-derivative of µ.

The following theorem is obvious:

Theorem 2.3. For each α > 0 the operator $D_\alpha$ stands for an algebraic isomorphism between $L_\alpha(X)$ and $ID_{\log^\alpha}(X)$.

3. Mappings $\{U_c^{(\alpha)}\}$ and Classes $\{U_\alpha(X)\}$

Following verbatim the proof of cases 2.1, 2.2 and 2.3 we have the theorem:

Theorem 3.1. For any 0 < c < 1 and α > 0 and for every PM $\mu \in ID(X)$ we put
$$U_{c,\alpha}\mu = \mathop{*}_{k=1}^{\infty} T_{c^k}\mu^{|\binom{\alpha}{k}|c^k} = \mathop{*}_{k=1}^{\infty} U_{c^k}\mu^{|\binom{\alpha}{k}|}. \qquad (22)$$
Then we get a mapping $U_{c,\alpha}$ which stands for a well-defined continuous isomorphism of the convolution algebra ID(X). Moreover, restricted to ID(X), it stands for an analogue of the shrinking mapping $U_c$ in (1).

Definition 3.1. A PM $\mu \in ID(X)$ is said to be of the class $U_\alpha(X)$, α > 0, or equivalently, α-s-SD, if for each 0 < c < 1 the following formula holds:
$$\mu = U_{c,\alpha}\mu * \mu_{c,\alpha} \qquad (23)$$
for some PM $\mu_{c,\alpha} \in ID(X)$.
From the above definition we have:

Theorem 3.2. A PM µ = [z, Σ, M] belongs to $U_\alpha(X)$, α > 0, if and only if the Lévy measure M satisfies the following condition:
$$\sum_{k=0}^{\infty} \Big|\binom{\alpha}{k}\Big|\,c^k\,T_{c^k}M \ge 0 \qquad (24)$$
for each 0 < c < 1, or, equivalently,
$$\sum_{k=0}^{\infty} \Big|\binom{\alpha}{k}\Big|\,U_{c^k}M \ge 0. \qquad (25)$$

Definition 3.2. (cf. Jurek [8]) Given α > 0, let $G_\alpha$ denote a Gamma r.v. with distribution $\tau_\alpha$. Let $U^{<\alpha>}(X)$ denote the class of all distributions of $\int_{(0,1)} t\,dY_\rho(\tau_\alpha(t))$, where $Y_\rho(\cdot)$ is an X-valued Lévy process with $\mathcal{L}(Y_\rho(1)) = \rho$.

By virtue of formulas (23) and (24) and ((29) in Jurek [8]) we have the following theorem:

Theorem 3.3. The following equation holds:
$$U_\alpha(X) = U^{<\alpha>}(X), \qquad (26)$$
which shows that Definitions 3.1 and 3.2 are equivalent.

4. Stochastic Representation of MSDPM's and s-MSDPM's

Our main aim in this section is, following the method of Rajput and Rosinski [24], to give a representation of MSDC and s-MSDC processes via stochastic integrals w.r.t. the corresponding random measures (RM). Namely, since the general forms of the Lévy measures were obtained in Sections 2 and 3, the Kolmogorov extension theorem and the method of Rajput and Rosinski [24] allow us to obtain the required representation.

Definition 4.1. Let T be a parameter set, either the set Z of all integers or the set R of all real numbers. A stochastic process $X_t$, $t \in T$, is said to be ID, stable, mixed-stable, α-SD, α-s-SD if for any $t_1, t_2, \dots, t_n \in T$ and $\lambda_1, \lambda_2, \dots, \lambda_n$, n = 1, 2, ..., the r.v. $\sum_{j=1}^{n} \lambda_j X_{t_j}$ is ID, stable, mixed-stable, α-SD, α-s-SD, respectively.

Definition 4.2. Let $\Lambda = \{\Lambda(A) : A \in \mathcal{S}\}$ be a real stochastic process defined on a probability space $(\Omega, \mathcal{F}, P)$, where $\mathcal{S}$ stands for a σ-ring of subsets of an arbitrary non-empty set S satisfying the following condition: there exists an increasing sequence $S_n$, n = 1, 2, ..., of sets in $\mathcal{S}$ with $\bigcup_n S_n = S$.
We call Λ an independently scattered RM if, for every sequence $\{A_n\}$ of disjoint sets in $\mathcal{S}$, the random variables $\Lambda(A_n)$, n = 1, 2, ..., are independent and, if $\bigcup_n A_n$ belongs to $\mathcal{S}$, then we also have $\Lambda(\bigcup_n A_n) = \sum_n \Lambda(A_n)$ a.s., where the series is assumed to be convergent a.s. In addition, if for every $A \in \mathcal{S}$ the distribution of Λ(A) is ID, stable, mixed-stable, MSD, respectively, then we say that it is an ID, stable, mixed-stable, MSD RM.

Each r.v. Λ(A), $A \in \mathcal{S}$, has the ch.f.
$$-\log E\exp(it\Lambda(A)) = it\nu_0(A) + \frac{1}{2}t^2\nu_1(A) - \int_{-\infty}^{\infty} \big(e^{itx} - 1 - it\tau(x)\big)\,F_A(dx), \qquad (27)$$
where $t \in R$, $A \in \mathcal{S}$, $-\infty < \nu_0(A) < \infty$, $0 \le \nu_1(A) < \infty$ and $F_A$ is a Lévy measure on R. Moreover, $\nu_0$ is a signed measure and $\nu_1$ a measure. Furthermore, we have the following

Theorem 4.1. (cf. Rajput and Rosinski [24], Proposition 2.1) The ch.f. (27) can be written in the unique form
$$E\exp(it\Lambda(A)) = \exp\Big(\int_A K(t, s)\,\lambda(ds)\Big), \qquad (28)$$
where $t \in R$, $A \in \mathcal{S}$ and
$$K(t, s) = ita(s) - \frac{1}{2}t^2\sigma^2(s) + \int_R \big(e^{itx} - 1 - it\tau(x)\big)\,\rho(s, dx), \qquad (29)$$
with
$$a(s) = \frac{d\nu_0}{d\lambda}(s) \qquad (30)$$
and
$$\sigma^2(s) = \frac{d\nu_1}{d\lambda}(s), \qquad (31)$$
and ρ is given by Lemma 2.3 in Rajput and Rosinski [24]. Moreover, we have
$$|a(s)| + \int_R \min\{1, x^2\}\,\rho(s, dx) = 1 \quad \text{a.e. } \lambda. \qquad (32)$$

Definition 4.3. (cf. Urbanik and Woyczynski [27])
(a) If f is a simple function on S, $f = \sum_j x_j \chi_{A_j}$, $A_j \in \mathcal{S}$, then, for each $A \in \sigma(\mathcal{S})$, we put
$$\int_A f\,d\Lambda = \sum_j x_j\,\Lambda(A \cap A_j). \qquad (33)$$
(b) A measurable function $f : (S, \sigma(\mathcal{S})) \to (R, B(R))$ is said to be Λ-integrable if there exists a sequence $\{f_n\}$ of simple functions as defined in (a) such that
(i) $f_n \to f$ a.e. λ,
(ii) for every $A \in \sigma(\mathcal{S})$, the sequence $\{\int_A f_n\,d\Lambda\}$ converges in prob. as $n \to \infty$.
If f is Λ-integrable, then we put
$$\int_A f\,d\Lambda = P\text{-}\lim_{n \to \infty} \int_A f_n\,d\Lambda,$$
where $\{f_n\}$ satisfies (i) and (ii).

Now, combining Theorems 1.3, 1.4, 1.5 we get the following:

Theorem 4.2. Given α > 0, let Λ(A), $A \in \mathcal{S}$, be an α-s.d.r.m. Then the ch.f. of Λ(A) is of the unique form (20) where (34)
Proof. By virtue of (13) it follows that for any $A \in \mathcal{S}$ and $t \in R$, Λ(A) has the representation
$$-\log E\exp(it\Lambda(A)) = it\nu_0(A) + \frac{1}{2}t^2\nu_1(A) - \int_{-\infty}^{\infty} v_\alpha(x)\Big(\int_0^\infty k(e^{-u}x, t)\,u^{\alpha-1}\,du\Big)\,m(A, dx), \qquad (37)$$
which, by an argument similar to that of Proposition 2.1 in Rajput and Rosinski [24], implies that there exists a unique finite measure ν on $\sigma(\mathcal{S}) \times B(R)$ such that
$$\nu(A \times B) = m(A, B) \quad \text{for any } A \in \mathcal{S},\ B \in B(R).$$
Moreover, for every $A \in \sigma(\mathcal{S})$ we have ν(A, {0}) = 0.

Now, we are in a position to present the following theorem, whose proof is a simple combination of Theorem 6, the Kolmogorov extension theorem and Theorem 5.2 in Rajput and Rosinski [24].

Theorem 4.3. Given 0 < α ≤ ∞, let $\{X_t : t \in T\}$ be an α-SD stochastic process defined on a probability space $(\Omega', P')$. Then there exists an α-SDRM, say Λ, defined on the probability space (Ω, P) such that $\Omega = \Omega' \times I$, $P = P' \times Leb$, Leb being the Lebesgue measure on I, and
$$\{X_t : t \in T\} = \Big\{\int_S f_t(s)\,d\Lambda(s) : t \in T\Big\} \quad \text{a.s. } P,$$
where $\{f_t(s) : t \in T, s \in S\}$ are some measurable functions on S and I denotes the closed unit interval.

By an argument similar to that for MSDPM's we have the following:

Theorem 4.4. Given α > 0, let $\{X_t : t \in T\}$ be an α-s-SD stochastic process defined on a probability space $(\Omega', P')$. Then there exists an α-s-SDRM, say Λ, defined on the probability space (Ω, P) such that $\Omega = \Omega' \times I$, $P = P' \times Leb$, Leb being the Lebesgue measure on I, and
$$\{X_t : t \in T\} = \Big\{\int_S f_t(s)\,d\Lambda(s) : t \in T\Big\} \quad \text{a.s. } P,$$
where $\{f_t(s) : t \in T, s \in S\}$ are some measurable functions on S and I denotes the closed unit interval.
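To make the integral of a simple function in Definition 4.3 (a) concrete, here is a small Python sketch. It is only an illustration under stated assumptions, not the construction used in the paper: S = [0, 1) is split into equal grid cells, Λ is taken to be a Gaussian (hence ID) independently scattered random measure with Lebesgue control measure, and sets are represented by collections of cell indices. All names (`sample_gaussian_measure`, `measure`, `integrate_simple`) are hypothetical.

```python
import random


def sample_gaussian_measure(n_cells, seed=0):
    """One realization of Lambda on n_cells equal sub-intervals of [0, 1).

    The masses of distinct cells are independent N(0, 1 / n_cells),
    so Lambda is independently scattered and additive by construction.
    """
    rng = random.Random(seed)
    sd = (1.0 / n_cells) ** 0.5
    return [rng.gauss(0.0, sd) for _ in range(n_cells)]


def measure(cells, masses):
    """Lambda(A) for a set A given as a collection of cell indices."""
    return sum(masses[i] for i in sorted(cells))


def integrate_simple(f, region, masses):
    """Integral over `region` of the simple function f = sum_j x_j 1_{A_j}.

    f is a list of pairs (x_j, A_j); the value is
    sum_j x_j * Lambda(region ∩ A_j), as in (33).
    """
    return sum(x_j * measure(region & a_j, masses) for x_j, a_j in f)


masses = sample_gaussian_measure(n_cells=100)
f = [(2.0, frozenset(range(0, 30))), (-1.0, frozenset(range(30, 80)))]
A = frozenset(range(0, 50))
B = frozenset(range(50, 100))

whole = integrate_simple(f, A | B, masses)
parts = integrate_simple(f, A, masses) + integrate_simple(f, B, masses)
print(whole, parts)  # the two values agree up to rounding
```

The agreement of `whole` and `parts` reflects the additivity of the integral over disjoint sets, which is inherited from the additivity of Λ.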
5. An Application in Option Pricing

If X is a Lévy-stable random variable with index 0 < α < 1, then it does not have any integer moments, and, for the case 1 < α < 2, only the first integer moment exists. Therefore, to overcome this difficulty, following Cartea and Howison [1], we introduce the following Damped-Lévy-mixed-stable process, which will lead to a mathematical model for our purpose of option pricing.

Suppose that $X_j(t)$, j = 1, 2, are independent Lévy-stable processes with indexes $0 < \alpha_1 < \alpha_2 < 2$, respectively, such that the logarithm of the characteristic function of $X_j(1)$ is given by
$$\psi_j(u) = \int_{-\infty}^{+\infty} \big(e^{iux} - 1 - iu\tau_{\alpha_j}(x)\big)\,W_j(x)\,dx, \quad j = 1, 2, \qquad (38)$$
where the Lévy density $W_j(x)$ is specified separately for x < 0 and for x > 0, and the function $\tau_{\alpha_j}(x)$ separately for $\alpha_j > 1$, $\alpha_j = 1$ and $\alpha_j < 1$. Here $C_p, C_q > 0$ are scale constants, $p, q \ge 0$ and p + q = 1.

Following Cartea and Howison [1], the exponential cut-off $e^{-\lambda|x|}$ is introduced to obtain the Damped Lévy measures
$$W^\lambda_j(x) = \begin{cases} C_q |x|^{-1-\alpha_j} e^{-\lambda|x|}, & \text{for } x < 0, \\ C_p x^{-1-\alpha_j} e^{-\lambda|x|}, & \text{for } x > 0. \end{cases} \qquad (39)$$
Let $W^\lambda_j$, j = 1, 2, denote the Damped Lévy measures corresponding to the Lévy processes $X^\lambda_j(t)$, j = 1, 2, with
$$\phi^\lambda_j(u) = \int_{-\infty}^{+\infty} \big(e^{iux} - 1 - iu\tau_{\alpha_j}(x)\big)\,e^{-\lambda|x|}\,W_j(dx). \qquad (40)$$
Putting, for $t \ge 0$, $X(t) = X_1(t) + X_2(t)$ we get a Lévy process X(t) which is also a mixed-stable Lévy process with $\Phi(u) = \Phi_1(u) + \Phi_2(u)$, where $\Phi_j(u)$, j = 1, 2, are given by (40). Putting
$$W^\lambda_j(x) = \begin{cases} C_q |x|^{-1-\alpha_j} e^{-\lambda|x|}, & \text{for } x < 0, \\ C_p x^{-1-\alpha_j} e^{-\lambda|x|}, & \text{for } x > 0, \end{cases} \quad j = 1, 2, \qquad (41)$$
and taking into account (40) we infer that the logarithm of the ch.f., denoted by $\phi^\lambda(u)$, for a Damped-Lévy process $\{X^\lambda(t)\}$ is of the form (42)

The Damped-Lévy process $X^\lambda(t) := X^\lambda_1(t) + X^\lambda_2(t)$ has the following properties:
(i) $\{X^\lambda(t)\}$ is a Lévy process.
(ii) It is not a stable process.
(iii) $\lim_{\lambda \to 0} X^\lambda(t) = X(t)$ (in distribution and in probability).
(iv) The process $\{X^\lambda(t)\}$ has finite moments of all orders. Moreover, its exponential moments exist.

Suppose that we work within the framework of a market with the stock price process $X(t) = X_1(t) + X_2(t)$, where $X_j(t)$, j = 1, 2, are independent $\alpha_j$-stable Lévy processes under the measure Q. Our further aim is to deduce a kind of Black-Scholes formula under Lévy-mixed-stable shocks. In what follows we assume that $1 < \alpha_1 < \alpha_2 < 2$. Then, by Cartea and Howison ([1], Proposition 3, p. 12) we have (44)
(j = 1, 2). We assume that the logarithm of the stock price process, under the risk-neutral measure, is a Damped-mixed-stable Lévy process. Then, by Cartea and Howison ([1]),
$$S_{t+\Delta t} = S_t \exp\{(r - D_0)\Delta t - \phi(-i\sigma) + \sigma\phi\}, \qquad (45)$$
where r > 0 is the risk-free rate and σ > 0. Equation (45) can be rewritten as follows:
$$S_{t+\Delta t} = S_t \exp\{\pi\Delta t + \sigma\phi\}, \qquad (46)$$
where π, φ are parameters for the Damped-$(\alpha_1, \alpha_2)$-mixed-stable Lévy process $X^\lambda(t)$. Then, as $\Delta t \to 0$, the "Damped Black-Scholes" PDE can be given and solved as in the Damped-stable-Lévy case (cf. Cartea and Howison ([1]), p. 24).
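The effect of the exponential cut-off in (39) on the moments can be checked numerically. The sketch below is an illustration, not taken from the paper or from [1]: it computes $\int x^2\,W^\lambda(x)\,dx$ for one damped Lévy density of the form (39), which for 0 < α < 2 equals $(C_p + C_q)\,\Gamma(2 - \alpha)\,\lambda^{\alpha - 2}$, and compares this closed form with a direct quadrature. The parameter values, the default constants `c_p = c_q = 0.5` and the function names are assumptions made for the example.

```python
import math


def second_moment_closed_form(alpha, lam, c_p=0.5, c_q=0.5):
    """(C_p + C_q) * Gamma(2 - alpha) * lam**(alpha - 2), valid for 0 < alpha < 2."""
    return (c_p + c_q) * math.gamma(2.0 - alpha) * lam ** (alpha - 2.0)


def second_moment_numeric(alpha, lam, c_p=0.5, c_q=0.5, n=20000):
    """Trapezoidal quadrature of int x^2 W(x) dx after the substitution x = exp(s).

    On each half-line the integrand x^(1 - alpha) exp(-lam x) dx becomes
    x^(2 - alpha) exp(-lam x) ds, which decays rapidly in both directions.
    """
    s_min = -40.0 / (2.0 - alpha)   # here x^(2 - alpha) = exp(-40)
    s_max = math.log(60.0 / lam)    # here exp(-lam x) = exp(-60)
    h = (s_max - s_min) / n
    total = 0.0
    for i in range(n + 1):
        x = math.exp(s_min + i * h)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * x ** (2.0 - alpha) * math.exp(-lam * x)
    return (c_p + c_q) * total * h


for lam in (1.0, 0.1, 0.01):
    print(lam, second_moment_closed_form(1.5, lam), second_moment_numeric(1.5, lam))
```

For every λ > 0 the second moment is finite, in line with property (iv), while the factor $\lambda^{\alpha-2}$ grows without bound as λ → 0, which corresponds to the undamped stable case.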
References

1. Cartea, Á. and Howison, S., Distinguished limits of Lévy-stable processes, and applications to option pricing, Oxford Financial Research Centre, OFRC working papers Ser., No. 2002mf04.
2. Hardin, Jr., C. D., On the spectral representation of symmetric stable processes, J. Multivariate Anal. 12, 385–401, 1982.
3. Jurek, Z. J., Limit distributions for sums of shrunken random variables, Dissertationes Mathematicae 185, PWN Warszawa, 1981.
4. Jurek, Z. J., Limit distributions and one-parameter groups of linear operators on Banach spaces, J. Multivariate Anal. 13, 578–601, 1983.
5. Jurek, Z. J. and Vervaat, W., An integral representation for selfdecomposable Banach space valued random variables, Z. Wahrsch. verw. Gebiete 62, 247–262, 1983.
6. Jurek, Z. J., Selfdecomposability, perpetuity laws and stopping times, Probab. Math. Stat. 19, 413–419, 1999.
7. Iksanov, M. A., Jurek, Z. J., and Schreiber, M. B., A new factorization property of the selfdecomposable probability measures, Annals of Probability 32, No. 2, 1356–1369, 2004.
8. Jurek, Z. J., The random integral representation hypothesis revisited: new classes of s-selfdecomposable laws, Proc. Int. Conf. Hanoi, 13–17 August 2002, World Scientific, Hongkong 2004, 495–514.
9. Kallenberg, O., Random measures, 3rd ed., New York: Academic Press 1983.
10. Kuelbs, J., A representation theorem for symmetric stable processes and stable measures on H, Z. Wahrscheinlichkeitstheorie Verw. Geb. 26, 259–271, 1973.
11. Kumar, A. and Schreiber, B. M., Characterization of subclasses of class L probability distributions, Ann. Probab. 6, 279–293, 1978.
12. Loève, M., Probability theory, New York, 1950.
13. Maruyama, G., Infinitely divisible processes, Theor. Prob. Appl. 15, 3–23, 1970.
14. Medgyessy, P., On a new class of unimodal infinitely divisible distributions and related topics, Studia Sci. Math. Hungar. 2, 441–446, 1967.
15. Musielak, Orlicz spaces and modular spaces, Lecture Notes Math., vol. 1034, New York Berlin Heidelberg: Springer 1983.
16. Nguyen, N. H., Stochastic representation of α-times selfdecomposable distributions, a private communication.
17. Nguyen, V. T., Stable type and completely self-decomposable probability measures on Banach spaces, Bull. Ac. Pol., Sér. Sci. Math. 29, No. 11–12, 1981.
18. Nguyen, V. T., Fractional calculus in probability, Probab. Math. Statist. 3, No. 2, 171–189, 1984.
19. Nguyen, V. T., An alternative approach to multiply self-decomposable probability measures on Banach spaces, Probab. Th. Rel. Fields 72, 35–54 (1986).
20. Parthasarathy, K. R., Probability measures on metric spaces, New York: Academic Press 1967.
21. Prékopa, A., On stochastic set functions I, Acta Math. Acad. Sci. Hung. 7, 215–262, 1956.
22. Prékopa, A., On stochastic set functions II, III, Acta Math. Acad. Sci. Hung. 8, 337–400, 1957.
23. Rosiński, J., Random integrals of Banach space valued functions, Studia Math. 78,
15–38, 1985.
24. Rajput, B. S. and Rosiński, J., Spectral representation of infinitely divisible processes, Probab. Th. Rel. Fields 82, 451–487 (1989).
25. Sato, K. I., Urbanik's class $L_m$ of probability measures, Ann. Sci. Coll. Lib. Arts Kanazawa Univ. 15, 1–10, 1978.
26. Sato, K. I., Lévy processes and infinitely divisible distributions, Cambridge University Press 1999.
27. Urbanik, K. and Woyczynski, W. A., Random integrals and Orlicz spaces, Bull. Acad. Polon. Sci. 15, 161–169 (1967).
28. Urbanik, K., Random measures and harmonizable sequences, Studia Math. 31, 61–88, 1968.
29. Urbanik, K., Slowly varying sequences of random variables, Bull. Acad. Pol. Sci., Série des Sci. Math. Astr. et Phys. 20, 8 (1972), 679–682.
Stochastic Growth Models of an Isolated Economy

Kunio Nishioka

Key words: Neo-classical model, Economic growth with uncertainty, the stochastic Solow equation, a diffusion process on (0, ∞)

1. Introduction

The study of economic growth has continued more or less steadily for more than 200 years since A. Smith and T. Malthus. The main task of economic growth theory is to answer the following question: Why are some nations so rich and others so poor?

In present-day economic growth theory, the neo-classical growth model plays the fundamental part. This model was developed in the works of R. Solow (1956, 1957). After Solow's work, Lucas (1988), Romer (1986), Mankiw (1992), and others refined the Solow model by incorporating advances in technology or the human factor. Today the mainstream is endogenous growth theory, which builds these advances into economic growth itself.

We will start from the Solow model. An economy in the model is considered in the following setting.

Assumption 1.1.
(i) The economy is an isolated island on which many workers live. There is a social planner, who governs the whole economy.
(ii) There is one good. At time t, production Y(t) of the good depends on two factors, capital K(t) and labor L(t). The good can be either consumed or invested as capital.
(iii) The social planner saves a constant fraction s ∈ (0, 1) of production, to be added to the economy's capital stock, and distributes the remaining fraction uniformly across the workers of the economy.

In what follows, we use the following standard notation of economic theory:
Y(t) = output at time t,
K(t) = capital stock at time t,
I(t) = investment at time t,
C(t) = consumption at time t,
L(t) = the number of workers at time t. (1.1)

From Assumption 1.1 (plus a few additional assumptions), the following condition is derived:

Condition 1.2.
(i) The economy is a Keynesian system, that is, I(t) + C(t) = Y(t).
(ii) The technology for producing the good is given by the production function $F : R^2_+ \to R_+$, that is,
$$Y(t) = F(K(t), L(t)). \qquad (1.2)$$
(iii) Capital depreciates at a fixed rate λ ∈ [0, 1], that is, $K'(t) = I(t) - \lambda K(t)$.
(iv) The saving rate s ∈ (0, 1) is constant, that is, Y(t) = s Y(t) + C(t).
(v) The population of workers increases at a constant rate n:
$$L'(t) = n L(t). \qquad (1.3)$$

In addition, we assume that the production function F in (1.2) is neo-classical, i.e. the following condition is fulfilled.

Condition 1.3. The production function F is a strictly concave $C^2$ class function with F(0, L) = 0 = F(K, 0). Moreover, F satisfies:
(i) Inada condition:
$$\lim_{K \to 0} \partial_K F(K, L) = \infty, \quad \lim_{K \to \infty} \partial_K F(K, L) = 0, \quad \lim_{L \to 0} \partial_L F(K, L) = \infty, \quad \lim_{L \to \infty} \partial_L F(K, L) = 0.$$
(ii) CRS condition (constant returns to scale): F(aK, aL) = a F(K, L) for all a > 0.
Example 1.4. A typical example of the above production function is
$$F(K, L) = K^\alpha L^{1-\alpha}, \quad 0 < \alpha < 1,$$
which is called the Cobb-Douglas type.

We introduce the per-capita measurements, that is (1.4)

We also call this f a production function. By the definition (1.5) of f and Condition 1.3, we have:

Condition 1.5. A production function $f : R_+ \to R_+$ is a strictly concave $C^2$ class function with f(0) = 0. Moreover, f satisfies
$$\text{(Inada condition)} \quad \lim_{k \to 0} f'(k) = \infty, \quad \lim_{k \to \infty} f'(k) = 0. \qquad (1.6)$$

Combining the equations in Condition 1.2, we derive an ODE for the capital stock K(t):
$$K'(t) = Y(t) - C(t) - \lambda K(t) = s\,Y(t) - \lambda K(t) = s\,F\big(K(t), L(t)\big) - \lambda K(t). \qquad (1.7)$$
By a simple calculation,
$$k'(t) = \Big(\frac{K(t)}{L(t)}\Big)' = \frac{K'(t)}{L(t)} - \frac{K(t)}{L(t)} \cdot \frac{L'(t)}{L(t)}.$$
Now (1.7) and (1.3) give the dynamics of the capital stock in per-capita measurement:
$$\text{(Solow equation)} \quad k'(t) = s\,f(k(t)) - (\lambda + n)\,k(t), \qquad (1.8)$$
where s ∈ (0, 1) is the saving rate, λ ∈ [0, 1] is the capital depreciation rate, and n is the population growth rate.
Owing to Condition 1.5, there exists a unique solution $k^*$ to
(1.9)  $s\,f(k) - (\lambda + n)\,k = 0$,  $k > 0$,
and it is a stable fixed point of the Solow equation (1.8).

Proposition 1.6. There exists a unique point $k^* > 0$ which solves (1.9). We call $k^*$ the state of golden age, since
$\lim_{t\to\infty} k(t) = k^*$
for any $k(0) > 0$.
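Proposition 1.6 can be illustrated numerically. The sketch below assumes the Cobb-Douglas case f(k) = k^α of Example 1.4, for which (1.9) has the closed-form solution k* = (s/(λ+n))^{1/(1−α)}, and integrates the Solow equation (1.8) by an explicit Euler scheme. The function names and the parameter values are hypothetical choices made for illustration only.

```python
def solow_steady_state(s, lam, n, alpha):
    """Closed-form solution of s*k**alpha = (lam + n)*k for k > 0 (Cobb-Douglas f)."""
    return (s / (lam + n)) ** (1.0 / (1.0 - alpha))


def simulate_solow(k0, s, lam, n, alpha, t_end=400.0, dt=0.01):
    """Explicit Euler integration of k'(t) = s*k**alpha - (lam + n)*k, started at k0."""
    k = k0
    for _ in range(int(round(t_end / dt))):
        k += dt * (s * k ** alpha - (lam + n) * k)
    return k


if __name__ == "__main__":
    # Illustrative parameter values, not taken from the paper.
    params = dict(s=0.3, lam=0.05, n=0.02, alpha=0.3)
    k_star = solow_steady_state(**params)
    for k0 in (0.1, 50.0):
        # Both trajectories should end close to k_star.
        print(k0, simulate_solow(k0, **params), k_star)
```

Starting below or above k*, the trajectory approaches the same golden-age level, which is the stability statement of the proposition.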
2. Verification of Solow model

We shall compare the result in Proposition 1.6 with statistics of the real economy between 1980 and 1997. The growth rate of per-capita GDP is $y'(t)/y(t)$, which is easily derived from (1.5) and the Solow equation (1.8):
(2.1)  $\frac{y'(t)}{y(t)} = \frac{f'(k(t))}{f(k(t))}\,k'(t) = f'(k(t)) \left[ s - (\lambda + n)\,\frac{k(t)}{f(k(t))} \right]$.
From Condition 1.5, $k/f(k) \sim 1/f'(k) \sim 0$ if k is sufficiently small, and we know that the right hand side of (2.1) behaves as
$f'(k) \left[ s - (\lambda + n)\,\frac{k}{f(k)} \right] \begin{cases} \text{diverges} & \text{if } k \to 0, \\ > 0 \text{ and monotonically decreases in } k & \text{if } 0 < k < k^*, \\ = 0 & \text{if } k = k^*, \\ < 0 & \text{if } k > k^*. \end{cases}$
Figure 2.1. Per-capita GDP and its growth rate

From the above, the following observations are easily derived:
(i) If k(t) is small, $y'(t)/y(t)$ should be very large.
(ii) If k(t) is near the golden age $k^*$, $y'(t)/y(t)$ should be very small.
Therefore all dots in Fig. 2.1 should be distributed along the bold curve in the figure. But there exist many counterexamples in zone A of Fig. 2.1, and it is difficult to account for nations with a negative growth rate by the Solow model.
3. The stochastic Solow equation

I. In order to resolve the above discrepancy, many economists have made various attempts to improve the Solow model.
(i) Lucas (1988), Romer (1986), and others incorporated advances in technology into the Solow model. For instance, Lucas considered a production function
(3.1)  $Y(t) = F(K(t), A(t)\,L(t))$,
where
(3.2)  $A(t) \equiv \exp\{gt\}$,  g a non-negative constant,
is an advance in technology for each laborer.
(ii) Uzawa (1965), Mankiw (1992), and others introduced a human factor. For instance, Mankiw introduced a production function $Y(t) = F(K(t), L(t), H(t))$ with a human factor H(t), and derived the simultaneous equations
$k'(t) = s_k\,y(t) - (n + \lambda_k)\,k(t)$,
$h'(t) = s_h\,y(t) - (n + \lambda_h)\,h(t)$,
where $s_k$ is a constant saving rate to capital stock, $s_h$ is a constant saving rate to human capital stock, and $\lambda_k$, $\lambda_h$ are constant depreciation rates.
(iii) Some economists tried to randomize the Solow equation.

II. Especially in (iii), Merton (1975) replaced the population growth equation (1.3) by an SDE¹
(3.3)  $dL(t, w) = n\,L(t, w)\,dt + \sigma\,L(t, w)\,dB(t, w)$,
where n and σ are positive constants and {B(t, w)} is a one dimensional Brownian motion. By Itô's formula, Merton obtained the following SDE, which describes the per-capita capital stock {k(t, w)} as a diffusion process on (0, ∞):
(3.4)  (the stochastic Solow equation)  $dk(t, w) = \left[ s\,f(k(t, w)) - (\lambda + n - \sigma^2)\,k(t, w) \right] dt - \sigma\,k(t, w)\,dB(t, w)$.
We are interested in the precise behavior of {k(t, w)} and of its growth rate. The growth rate is defined as
(3.5)  $k'(t)/k(t) = (\log k(t))'$
when k(t) is a solution of (1.8). But $k'(t)$ makes no sense for an SDE, and we should instead consider the average growth rate in time²
(3.6)  $\rho(t, w) \equiv \frac{\log k(t, w) - \log k(0)}{t}$
in place of (3.5).

¹ Cho and Cooley (2001) replaced (3.2) by the diffusion process $A(t, w) \equiv \exp\{(g - a^2/2)\,t + a\,B(t, w)\}$.
² This converges to the Lyapunov index of {k(t, w)} as t → ∞.

Proposition 3.1. Let the production function f satisfy Condition 1.5, and define a constant θ by
(3.7)  $\theta \equiv \lambda + n - \sigma^2/2$.
Then the asymptotic behaviors of k(t, w) and its average growth rate ρ(t, w) are as follows:

             θ < 0        θ = 0         θ > 0
k(t, w)      → ∞ a.s.     recurrent†    recurrent
ρ(t, w)      → −θ a.s.    → 0 a.s.      → 0 a.s.

Here 'recurrent' means that {k(t, w)} is a recurrent diffusion on (0, ∞) with an invariant probability measure. In the 'recurrent†' case, {k(t, w)} is recurrent but its invariant measure is infinite, and it diverges in Cesàro's sense, that is
(3.8)  $\lim_{T\to\infty} \frac{1}{T} \int_0^T k(t, w)\,dt = \infty$ a.s.
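The dichotomy in Proposition 3.1 can be explored by simulation. The following is a minimal sketch, not the author's method: it assumes the Cobb-Douglas production function f(k) = k^α and applies an Euler-Maruyama scheme to x = log k, which by Itô's formula applied to (3.4) satisfies dx = [s e^{(α−1)x} − θ] dt − σ dB with θ as in (3.7). The function name, the step size, the horizon and the parameter values are all illustrative assumptions.

```python
import math
import random


def average_growth_rate(s, lam, n, sigma, alpha,
                        t_end=1000.0, dt=0.01, k0=1.0, seed=0):
    """Estimate rho(t_end) = (log k(t_end) - log k0) / t_end, cf. (3.6).

    Euler-Maruyama on x = log k for the stochastic Solow equation (3.4),
    assuming f(k) = k**alpha, so that f(k)/k = exp((alpha - 1) * x).
    """
    rng = random.Random(seed)
    theta = lam + n - 0.5 * sigma ** 2      # the constant in (3.7)
    x0 = math.log(k0)
    x = x0
    sqrt_dt = math.sqrt(dt)
    for _ in range(int(round(t_end / dt))):
        drift = s * math.exp((alpha - 1.0) * x) - theta
        x += drift * dt - sigma * sqrt_dt * rng.gauss(0.0, 1.0)
    return (x - x0) / t_end


if __name__ == "__main__":
    # theta > 0 (small noise): rho should be close to 0.
    print(average_growth_rate(0.3, 0.05, 0.02, 0.1, 0.3))
    # theta < 0 (large noise): rho should be close to -theta = 0.11.
    print(average_growth_rate(0.3, 0.05, 0.02, 0.6, 0.3))
```

For a finite horizon the estimate still carries a fluctuation of order σ/√t coming from the term σB(t)/t, so agreement with the limits in the proposition is only approximate.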
Remark 3.2. (i) Let θ ≥ 0. Then for the stochastic Solow equation (3.4), there is no state of golden age as in Proposition 1.6: since k(t, w) is recurrent, it reaches every point of (0, ∞) with probability one.
(ii) Let K(t, w) be the total capital stock. Then it holds that
$\lim_{t\to\infty} \frac{\log K(t, w) - \log K(0)}{t} = \lim_{t\to\infty} \rho(t, w) + n - \frac{\sigma^2}{2}$.

From Proposition 3.1 and the above remark, we have the main theorem.

Theorem 3.3. Suppose that the production function f satisfies Condition 1.5. On the $(\lambda, n - \sigma^2/2)$ plane, we define domains A through C as follows:

[Figure: domains A, B and C in the (λ, n − σ²/2) plane]

Then the asymptotic behaviors of the indicators,
(3.9)  the per-capita capital stock k(t, w), the population of laborers L(t, w), and the total capital stock K(t, w),
are as follows:

(λ, n − σ²/2)    ∈ A          ∈ B          ∈ C
k(t, w)          → ∞ a.s.     recurrent    recurrent
L(t, w)          → 0 a.s.     → 0 a.s.     → ∞ a.s.
K(t, w)          → 0 a.s.     → 0 a.s.     → ∞ a.s.
4. A quaere to Inada condition

I. In most neo-classical growth models, it is supposed that the production function satisfies the Inada condition in Condition 1.3. However, some economists have recently asserted that the Inada condition at zero,
(4.1)  $\lim_{k\to 0} f'(k) = \infty$,
is inappropriate. From the CRS condition and the mean value theorem,
$F(K+1, L) - F(K, L) = L \left[ F\!\left(\tfrac{K+1}{L}, 1\right) - F\!\left(\tfrac{K}{L}, 1\right) \right] = L \left[ f\!\left(\tfrac{K+1}{L}\right) - f\!\left(\tfrac{K}{L}\right) \right] = L\,f'(y)\,\tfrac{1}{L} = f'(y)$
for some y between K/L and (K+1)/L. When L is sufficiently large, y is small, and (1.6) gives $f'(y) \approx \infty$. This means that under the assumption (4.1), an additional unit of capital yields an arbitrarily large increase in production if the amount of labour is sufficiently large, which conflicts with real economic data. So Kamihigashi (2003) proposed to suppose the following (4.2) instead of (4.1):

Condition 4.1. The production function f satisfies
(4.2)  f is of class $C^2$, strictly concave, and $f(0) = 0$,  $0 < \exists \lim_{k\to 0} f'(k) \equiv f'(0) < \infty$,  $\lim_{k\to\infty} f'(k) = 0$.

Example 4.2. We present a production function satisfying Condition 4.1, which is also a modification of the Cobb-Douglas type: let c > 0 and 0 < α < 1 be constants and
$F(K, L) \equiv (K + cL)^{\alpha} L^{1-\alpha} - c^{\alpha} L$.
From this F, we have $f(k) \equiv (k + c)^{\alpha} - c^{\alpha}$.
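As a quick check that this f indeed satisfies Condition 4.1, under the stated assumptions c > 0 and 0 < α < 1, one can differentiate directly:

```latex
\begin{align*}
f'(k)  &= \alpha\,(k+c)^{\alpha-1} > 0, &
f''(k) &= \alpha(\alpha-1)\,(k+c)^{\alpha-2} < 0, \\
f'(0)  &= \alpha\,c^{\alpha-1} \in (0,\infty), &
\lim_{k\to\infty} f'(k) &= 0 .
\end{align*}
```

By contrast, the pure Cobb-Douglas case c = 0 gives f'(k) = α k^{α−1} → ∞ as k → 0, which is exactly the Inada condition (4.1).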
Proposition 4.3. Let the production function f satisfy Condition 4.1. Set the constant θ as in (3.7) and a constant γ as
$\gamma \equiv s\,f'(0) - \theta = s\,f'(0) - \left(\lambda + n - \frac{\sigma^2}{2}\right)$.
Then

θ        θ < 0        θ = 0         0 < θ < s f'(0)    θ = s f'(0)    θ > s f'(0)
k(t)     → ∞ a.s.     recurrent†    recurrent          recurrent‡     → 0 a.s.
ρ(t)     → −θ a.s.    → 0 a.s.      → 0 a.s.           → 0 a.s.       → γ a.s.

Here 'recurrent' means that {k(t, w)} is a recurrent diffusion on (0, ∞) with an invariant probability measure. In the 'recurrent†' and 'recurrent‡' cases, {k(t, w)} is recurrent with an infinite invariant measure, and its time average behaves in Cesàro's sense as
$\lim_{T\to\infty} \frac{1}{T} \int_0^T k(t)\,dt = \begin{cases} \infty \text{ a.s.} & \text{(recurrent†)}, \\ 0 \text{ a.s.} & \text{(recurrent‡)}. \end{cases}$

II. We shall investigate the asymptotic behaviors of the economic indexes (3.9).

Case 1: $s\,f'(0) \ge 1$.

Theorem 4.4. Suppose that the production function f satisfies Condition 4.1 and that $s\,f'(0) \ge 1$. On the $(\lambda, n - \sigma^2/2)$ plane, we define domains A through D as follows:
Then the asymptotic behaviors of the economic indexes (3.9) are as follows:

(λ, n − σ²/2)    ∈ A          ∈ B          ∈ C          ∈ D
k(t, w)          → ∞ a.s.     recurrent    recurrent    → 0 a.s.
L(t, w)          → 0 a.s.     → 0 a.s.     → ∞ a.s.     → ∞ a.s.
K(t, w)          → 0 a.s.     → 0 a.s.     → ∞ a.s.     → ∞ a.s.

Case 2: $0 < s\,f'(0) < 1$.

Theorem 4.5. Suppose that the production function f satisfies Condition 4.1 and that $0 < s\,f'(0) < 1$. On the $(\lambda, n - \sigma^2/2)$ plane, we define domains A through F as follows:
Appendix. One dimensional diffusion process with boundaries

I. We shall review the behavior of the diffusion process {k(t, w)} defined by the SDE (3.4). Fix an arbitrary point $k_0 \in (0, \infty)$, and define
(the scale function)  $S(k) \equiv \int_{k_0}^{k} \varphi(y)\,dy$,
(density of the speed measure)  $m(k) \equiv \frac{1}{\sigma^2 k^2} \cdot \frac{1}{\varphi(k)}$,
where
$\varphi(y) \equiv \exp\left\{ -2 \int_{k_0}^{y} \frac{s\,f(\xi) - (\lambda + n - \sigma^2)\,\xi}{\sigma^2 \xi^2}\,d\xi \right\}$,  $y > 0$.
Using the scale function S and the speed measure m(k) dk, Feller (1954) and Itô-McKean (1965) classified the boundaries of a one dimensional diffusion into five types: a regular boundary, an entrance, an exit, an infinite natural boundary, and a finite natural boundary.

II. For {k(t, w)} given by the SDE (3.4), the boundary points are 0 and ∞, and both are natural boundaries. In this case, the asymptotic behavior is already known, Nishioka (1976).

Case 1. Both are infinite natural: {k(t, w)} is recurrent on the interval (0, ∞), and the density function of an invariant measure is
(A.1)  $\mu(k) \equiv \frac{m(k)}{C} = \frac{1}{C} \cdot \frac{1}{\sigma^2 k^2 \varphi(k)}$.
Here the constant C is
(A.2)  $C \equiv \begin{cases} \int_0^\infty m(k)\,dk & \text{if the integral is finite}, \\ 1 & \text{otherwise}. \end{cases}$
In addition, the following Ergodic Theorem holds:
Maruyama-Tanaka (1957): If functions g, h are integrable with respect to μ(y) dy, then
(A.3)  $\lim_{T\to\infty} \frac{\int_0^T g(k(t, w))\,dt}{\int_0^T h(k(t, w))\,dt} = \frac{\int_0^\infty g(y)\,\mu(y)\,dy}{\int_0^\infty h(y)\,\mu(y)\,dy}$ a.s.,
where the denominator on the right hand side must not vanish.

Case 2. One is finite natural and the other is infinite natural:
(i) {k(t, w)} cannot reach the boundaries within a finite time, almost surely.
(ii) $\lim_{t\to\infty} k(t, w) = \begin{cases} 0 \text{ a.s.} & \text{if 0 is finite natural and } \infty \text{ is infinite natural}, \\ \infty \text{ a.s.} & \text{if 0 is infinite natural and } \infty \text{ is finite natural}. \end{cases}$

Case 3. Both are finite natural: the statement (i) in Case 2 is true, but
$P_x[\lim_{t\to\infty} k(t, w) = 0] = \frac{S(\infty) - S(x)}{S(\infty) - S(0)}$,  $P_x[\lim_{t\to\infty} k(t, w) = \infty] = \frac{S(x) - S(0)}{S(\infty) - S(0)}$.
A. Sketch of proofs

Proof of Proposition 3.1

Step 1. Owing to Appendix §A, I, the boundaries 0 and ∞ for {k(t, w)} are classified as follows:

       θ < 0                           θ ≥ 0
0      an infinite natural boundary    an infinite natural boundary
∞      a finite natural boundary       an infinite natural boundary

Now applying §A, II, we easily see the behavior of {k(t, w)}.

Step 2. We shall investigate ρ(t, w). By Itô's formula,
(A.1)  $\rho(T, w) = \frac{s}{T} \int_0^T \frac{f(k(t, w))}{k(t, w)}\,dt - \left(\lambda + n - \frac{\sigma^2}{2}\right) - \frac{\sigma}{T}\,B(T, w)$.
First we assume that θ > 0. In this case, the invariant measure is a probability measure and the law of the iterated logarithm implies
$\lim_{T\to\infty} \frac{1}{T}\,B(T, w) = 0$ a.s.
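Then from (A.1) and the Ergodic Theorem (A.3),
$\lim_{T\to\infty} \rho(T, w) = s \int_0^\infty \frac{f(k)}{k}\,\mu(k)\,dk - \left(\lambda + n - \frac{\sigma^2}{2}\right)$ a.s.
We calculate the first term on the right hand side. Put $\beta \equiv 2(\lambda + n)/\sigma^2 - 2$. By the Inada condition (1.6),
the first term
$= s \int_0^\infty \frac{f(k)}{k} \cdot \frac{C}{\sigma^2 k^{2+\beta}} \exp\left\{ \frac{2s}{\sigma^2} \int_{k_0}^{k} \frac{f(\xi)}{\xi^2}\,d\xi \right\} dk$
$= \frac{s\,C}{\sigma^2} \cdot \frac{\sigma^2}{2s} \int_0^\infty \frac{1}{k^{1+\beta}}\,\frac{d}{dk} \exp\left\{ \frac{2s}{\sigma^2} \int_{k_0}^{k} \frac{f(\xi)}{\xi^2}\,d\xi \right\} dk$
$= \frac{C}{2} \left[ \frac{1}{k^{1+\beta}} \exp\left\{ \frac{2s}{\sigma^2} \int_{k_0}^{k} \frac{f(\xi)}{\xi^2}\,d\xi \right\} \right]_{k=0}^{\infty} + \frac{\sigma^2 C}{2}\,(1 + \beta) \int_0^\infty \frac{1}{\sigma^2 k^{2+\beta}} \exp\left\{ \frac{2s}{\sigma^2} \int_{k_0}^{k} \frac{f(\xi)}{\xi^2}\,d\xi \right\} dk$
$= \frac{\sigma^2 C}{2} \cdot \frac{1}{C}\,(1 + \beta) = \lambda + n - \frac{\sigma^2}{2}$.
Now we have proved that $\lim_{T\to\infty} \rho(T) = 0$.

Step 3. We shall investigate the behavior of ρ(t, w) when θ = 0. In this case, the invariant measure μ(k) dk is not finite, that is
$\int_L^\infty \mu(k)\,dk \sim \int_L^\infty \frac{1}{\sigma^2 k}\,dk = \infty$ for large L.
Moreover the function f(k)/k may not be integrable.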
Fix an arbitrary ε > 0, and define a function h as
$h(k) \equiv \varepsilon k$,  $k \ge 0$.
Since f satisfies the Inada condition (1.6), we can find a unique point $k^\dagger > 0$ such that
$f(k) = h(k)$,  $k > 0$.
Further functions $\tilde f$ and $\tilde h$ are defined as
$\tilde f(k) \equiv \begin{cases} f(k) & 0 \le k < k^\dagger, \\ h(k) \equiv \varepsilon k & k^\dagger \le k, \end{cases}$
$\tilde h(k) \equiv \tilde f(k) - h(k) = \begin{cases} f(k) - \varepsilon k & 0 \le k < k^\dagger, \\ 0 & k^\dagger \le k. \end{cases}$
Here $\tilde h(k)/k$ is integrable with respect to μ(k) dk. We easily see that
$T \ge \int_0^T I_{(0,L)}(k(t, w))\,dt$,  $\int_0^\infty I_{(0,L)}(k)\,\mu(k)\,dk < \infty$,
for arbitrary L > 0. By the Ergodic Theorem (A.3),
$0 \le \limsup_{T\to\infty} \frac{1}{T} \int_0^T \frac{\tilde h(k(t, w))}{k(t, w)}\,dt \le \lim_{T\to\infty} \frac{\int_0^T \frac{\tilde h(k(t, w))}{k(t, w)}\,dt}{\int_0^T I_{(0,L)}(k(t, w))\,dt} = \frac{\int_0^\infty \frac{\tilde h(k)}{k}\,\mu(k)\,dk}{\int_0^\infty I_{(0,L)}(k)\,\mu(k)\,dk}$.
Note that $\int_0^\infty \mu(k)\,dk = \infty$, and let L → ∞. Then we have
$\lim_{T\to\infty} \frac{1}{T} \int_0^T \frac{\tilde h(k(t, w))}{k(t, w)}\,dt = 0$ a.s.
Remark that
$\frac{1}{T} \int_0^T \frac{h(k(t, w))}{k(t, w)}\,dt = \frac{1}{T} \int_0^T \varepsilon\,dt = \varepsilon$.
From this and the previous calculation,
$\lim_{T\to\infty} \frac{1}{T} \int_0^T \frac{\tilde f(k(t, w))}{k(t, w)}\,dt = \lim_{T\to\infty} \frac{1}{T} \int_0^T \frac{\tilde h(k(t, w))}{k(t, w)}\,dt + \lim_{T\to\infty} \frac{1}{T} \int_0^T \frac{h(k(t, w))}{k(t, w)}\,dt = \varepsilon$.
Since the definition of $\tilde f$ gives $0 \le f \le \tilde f$,
$0 \le \limsup_{T\to\infty} \frac{1}{T} \int_0^T \frac{f(k(t, w))}{k(t, w)}\,dt \le \lim_{T\to\infty} \frac{1}{T} \int_0^T \frac{\tilde f(k(t, w))}{k(t, w)}\,dt = \varepsilon$ a.s.
Here ε > 0 is arbitrary. Let ε ↓ 0 and we have
$\lim_{T\to\infty} \frac{1}{T} \int_0^T \frac{f(k(t))}{k(t)}\,dt = 0$ a.s.
Note that our assumption is θ = 0. Now we have
$\lim_{T\to\infty} \rho(T) = 0 - \theta = 0$ a.s.
in a way analogous to Step 2. We omit the rest of the proof. □

Proof of Proposition 4.3

Using Appendix §A, I, we can classify the boundaries of {k(t, w)}: put θ = λ + n − σ²/2; then

       θ < 0                           0 ≤ θ ≤ s f'(0)                 s f'(0) < θ
0      an infinite natural boundary    an infinite natural boundary    a finite natural boundary
∞      a finite natural boundary       an infinite natural boundary    an infinite natural boundary

We omit the rest of the proof. □
References
1. Chang, F. and Malliaris, A. G., Asymptotic growth under uncertainty: Existence and uniqueness, Rev. Economic Studies, 59 (1987), 169–174.
2. Cho, J-O. and Cooley, T. F., Business cycle uncertainty and economic welfare, 2001, Working Paper.
3. Feller, W., Diffusion processes in one dimension, Trans. A. M. S., 77 (1954), 1–31.
4. Itô, K. and McKean, Jr., H. P., Diffusion Processes and Their Sample Paths, Springer-Verlag, 1965.
5. Jensen, B. S. and Wang, C., Basic stochastic dynamic system of growth and trade, Rev. International Economics, 7 (1999), 378–402.
6. Jones, C. I., Introduction to Economic Growth, W. W. Norton & Company, 1998.
7. Kamihigashi, T., Almost sure convergence to zero in stochastic growth models, 2003, Working Paper.
8. Lucas, R. E., On the mechanics of economic development, J. Monetary Economics, 22 (1988), 3–42.
9. Mankiw, N., Romer, D., and Weil, D., A contribution to the empirics of economic growth, Quarterly J. Economics, 107 (1992), 407–438.
10. Maruyama, G. and Tanaka, H., Some properties of one-dimensional diffusion processes, Mem. Fac. Sci. Kyushu Univ., Ser. A, Math., 11 (1957), 117–141.
11. Merton, R., An asymptotic theory of growth under uncertainty, Rev. Economic Studies, 42 (1975), 375–393.
12. Nishioka, K., On the stability of two-dimensional linear stochastic systems, Kōdai Math. Sem. Rep., 27 (1976), 211–230.
13. Solow, R. M., A contribution to the theory of economic growth, Quart. J. Economics, 70 (1956, Feb.), 65–94.
14. Solow, R. M., Technical change and the aggregate production function, Rev. Economics and Statistics, 39 (1957, Aug.), 312–320.
15. Uzawa, H., Optimum technical change in an aggregative model of economic growth, International Economic Review, 6 (1965), 18–31.
December 26, 2006
14:56
Proceedings Trim Size: 9in x 6in
quantifpartial-rits
Numerical Approximation by Quantization for Optimization Problems in Finance under Partial Observations

Huyên PHAM
Laboratoire de Probabilités et Modèles Aléatoires, CNRS, UMR 7599, Université Paris 7
e-mail: [email protected]
We present some recent developments on optimal quantization methods for numerically feasible solutions to discrete-time optimization problems under partial information. The main problem in effective implementation is the growing dimension of the approximating filters. We overcome this difficulty by performing a quantization of the pair process filter-observation. Dynamic programming is then applied to solve the approximated optimization problem. Several numerical applications in finance are presented for the pricing of American options or for hedging problems in the context of partially observed stochastic volatility models.

MSC Classification (2000): 60G35, 65C20, 65N50, 60G40, 90C39.

Key words: Partial observation, quantization, dynamic programming, optimal stopping, control problem.
1. Introduction

The partial observation problem in an uncertain environment is formally the situation where we face a stochastic system whose evolution is governed by a hidden process (the signal) that we observe only through some noise. Such problems have considerable importance in applications. For example, in financial market models, we do not have in general complete knowledge of all the parameters involved in the pricing and hedging of options, or in portfolio/investment problems. Some usual partial observation situations encountered in the literature and in real markets are the following:
(i) The investor cannot observe the stock appreciation rate but only the stock price process in continuous-time models.
(ii) Trading in markets occurs in practice in discrete time, and we cannot fully recover the volatility value from the single observation of the stock prices in stochastic volatility models.
(iii) In corporate finance, firm values are not easily available, and we only observe the value of the equity issued by the firm, which is a noisy function of the real firm value.

The filtering problem consists in the estimation of the signal based on past observations, and the filter is the conditional law of the signal given the observations. It is a traditional problem in probability and statistics, and occurs for example in a financial context when one wants to estimate the volatility given the price observations. However, most of the numerical methods for computing or approximating the filter are performed only for a given fixed set of observations, and very few methods are concerned with the approximation of the filter process, whose randomness is due to the past observation process. Yet this is required in various applications, especially in dynamic optimization problems under partial observation.

Stochastic optimization is a traditional area in mathematics and has received renewed interest for its multiple and various applications in finance. For example, it arises in the pricing of American options and in portfolio/investment choice. Optimization problems under partial observations have been largely studied in the literature, mostly from a theoretical viewpoint, see e.g. the book [2]. In this line of research, it is shown how the original optimization problem can be transformed into a complete observation one by introducing the filter state variable.
Typically, in continuous-time models, this transformation is achieved by the method of change of probability reference and when the signal is on the drift of the observed diffusion process, or in other words, when the observation is an additive noise of the signal. In this context, the filter process is governed by the Zakai stochastic partial differential equation, leading generally to an infinite dimensional control problem. In some particular cases, as e.g. the Kalman-Bucy model where the pair signal-observation is governed by a linear gaussian model, the problem can be reduced to a finite-dimensional one, and may then be numerically dealt by the dynamic programming method. However, there are few works that investigate the numerical aspect of optimization problems under partial observations when the observation is a multiplicative noise of the signal. This arises typically in finance
in stochastic volatility discrete-time models, and is naturally of considerable importance in practice. The main problem in effective approximation comes from the growing dimension of the filter process, which depends on the whole observation path and which we need to approximate. A basic approach, suggested e.g. by [3] or [6], consists of discretizing at each time k the observation $Y_k$ by a discrete random variable $\hat Y_k$, and thus approximating the filter $\Pi_n(Y_0, \ldots, Y_n)$ at time n by $\Pi_n(\hat Y_0, \ldots, \hat Y_n)$, where we stress the dependence of the filter on the observation path $(Y_0, \ldots, Y_n)$. Hence, if $\hat Y_k$ takes for example M values at time k, we need to store $M^n$ values in order to compute the approximate filter at time n. This makes the effective implementation infeasible in practice for a long horizon n. We refer to [12] for a detailed overview of this approach.

We overcome this difficulty by adopting an approach recently proposed in [11] and [5], based on the Markov property of the pair filter-observation process with respect to the observation filtration. We then perform a quantization of the pair filter-observation process. Optimal quantization of random vectors consists basically of finding the best approximation in $L^p$-norm of a random vector by a discrete random vector taking at most N values. This was originally developed in the 1950s in the context of information theory, where the basic motivation was to transmit efficiently a continuous stationary signal by means of a finite number of codes (or quantizers). More recently, the quantization approach was applied to various fields, and notably to numerical probability, where it appears as an efficient spatial discretization method for solving multi-dimensional problems arising typically in finance. Here, we show how one can apply ideas from quantization to numerically solve optimization problems under partial observations.
We are especially concerned with the effective implementation of this approach. The paper is organized as follows. Section 2 formulates the partial observation discrete-time framework and recalls preliminaries on the filtering problem. In Section 3, we present the optimization problems that we are interested in, optimal stopping problems arising in American option pricing, and control problems arising in portfolio/investment choice. Section 4 gives some short background on optimal vector quantization, and we apply in Section 5 this quantization approach for the approximation of the filter process. Finally, we provide applications for the numerical approximation to our optimization problems of interest, and in particular in the financial context of partially observed stochastic volatility models,
we illustrate our results for the computation of Bermudan options and the solution to mean-variance hedging.

2. Partial Observation Discrete-time Framework

2.1 Signal-observation model

We consider a discrete time partially observable process (X, Y) where $X = (X_k)_{k\in\mathbb{N}}$ represents the state or signal process that may not be observable, while $Y = (Y_k)_{k\in\mathbb{N}}$ is the observation. We assume that the signal process $(X_k)_k$ is a finite-state Markov chain on the probability space (Ω, P), valued in $E = \{x^1, \ldots, x^m\}$, with initial law $\mu = (\mu^i)_i$ and probability transition matrix $P_k = (P_k^{ij})$:
$\mu^i = P[X_0 = x^i]$,  $i = 1, \ldots, m$,
$P_k^{ij} = P[X_k = x^j \mid X_{k-1} = x^i]$,  $i = 1, \ldots, m$, $j = 1, \ldots, m$.
The observation sequence $(Y_k)$ is valued in $\mathbb{R}^q$, such that the pair $(X_k, Y_k)$ is a Markov chain on (Ω, P), and we assume that the law of $Y_k$ conditional on $(X_{k-1}, Y_{k-1}, X_k)$, k ≥ 1, admits a bounded (known) density (sometimes called the local likelihood function):
$y' \longmapsto g_k(X_{k-1}, Y_{k-1}, X_k, y')$.
For simplicity, we assume that $Y_0$ is a known deterministic constant equal to $y_0$. A typical example is given by the following scheme:
$Y_k = G_k(X_{k-1}, Y_{k-1}, X_k, \eta_k)$,
for some measurable function $G_k$, where $(\eta_k)$ is a white noise. For example in finance, $(X_k)_k$ is the unobservable return and/or volatility of the stock price S, while $Y_k = \ln S_k$ is the logarithm of the observed price process:
$Y_k = Y_{k-1} + b(X_{k-1}) + \sigma(X_{k-1})\,\eta_k$,
and $g_k$ is explicit once the density of the white noise $\eta_k$ is specified.

2.2 Filter evolution

We denote by $(\mathcal{F}_k^Y)_k$ the filtration generated by the observation process $(Y_k)_k$, and by $(\Pi_k)_k$ the filter process defined by:
$\Pi_k^i = P[X_k = x^i \mid \mathcal{F}_k^Y]$,  $k \in \mathbb{N}$, $i = 1, \ldots, m$.
Hence $\Pi_k = (\Pi_k^i)_i$ is the conditional law of $X_k$ given $\mathcal{F}_k^Y$, and is a random vector taking values in the m-simplex of $\mathbb{R}^m$:
$K_m = \{ \pi = (\pi^i)_{1\le i\le m} \in \mathbb{R}^m : \pi^i \ge 0 \text{ and } |\pi|_1 = 1 \}$.
Here, we denote for any vector $\pi = (\pi^i)_i$ in $\mathbb{R}^m$, $|\pi|_1 = \sum_{i=1}^m |\pi^i|$. From the Markov property and the Bayes formula, the filter process satisfies the forward filtering equation:
(2.1)  $\Pi_0 = \mu$,
(2.2)  $\Pi_k = \bar H_k(\Pi_{k-1}, Y_{k-1}, Y_k) := \frac{H_k(Y_{k-1}, Y_k)^{\top} \Pi_{k-1}}{|H_k(Y_{k-1}, Y_k)^{\top} \Pi_{k-1}|_1}$,  $k \ge 1$,
where $H_k(Y_{k-1}, Y_k)$ is the prediction-updating m × m transition matrix:
$H_k^{ij}(Y_{k-1}, Y_k) = g_k(x^i, Y_{k-1}, x^j, Y_k)\,P_k^{ij}$,  $1 \le i, j \le m$,
and $\top$ denotes the transpose.

Remark 2.1. From (2.2), we see that the randomness of the filter at time k ≥ 1 depends on the whole observation path $Y_1, \ldots, Y_k$: $\Pi_k = \Pi_k(Y_1, \ldots, Y_k)$.

3. Dynamic Optimization Models

In this paper, we focus on two types of dynamic optimization problems: optimal stopping problems and control problems, which arise classically in finance in American option pricing and investment/portfolio choice. We are mainly concerned with these problems in the partial observation context described in the previous section. We consider the case of finite horizon problems over dates 0, . . . , n, n ≥ 1.

3.1 Optimal stopping

We denote by $\mathcal{T}_n^Y$ the set of stopping times adapted with respect to the observation filtration $\{\mathcal{F}_k^Y, k = 0, \ldots, n\}$ and valued in {0, . . . , n}. Given a measurable function h on $\{0, \ldots, n\} \times E \times \mathbb{R}^d$, we are interested in the following optimal stopping problem under partial observation:
$u_0 = \sup_{\tau \in \mathcal{T}_n^Y} E[h(\tau, X_\tau, Y_\tau)]$.
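Returning briefly to the forward filtering equation (2.1)-(2.2) of Section 2.2, one update step can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the names `filter_step` and `gaussian_likelihood`, the two-state volatility chain, and the zero-drift Gaussian observation model Y_k = Y_{k-1} + σ(X_{k-1}) η_k are all assumptions made for the example.

```python
import math


def filter_step(pi_prev, P, g, y_prev, y, states):
    """One step of (2.2): Pi_k = H^T Pi_{k-1} / |H^T Pi_{k-1}|_1,
    with H[i][j] = g(x_i, y_prev, x_j, y) * P[i][j]."""
    m = len(states)
    unnormalized = [
        sum(g(states[i], y_prev, states[j], y) * P[i][j] * pi_prev[i]
            for i in range(m))
        for j in range(m)
    ]
    total = sum(unnormalized)
    if total <= 0.0:
        raise ValueError("observation has zero likelihood under the model")
    return [u / total for u in unnormalized]


def gaussian_likelihood(x_prev, y_prev, x_new, y):
    """Hypothetical local likelihood: the state is the volatility itself and
    Y_k = Y_{k-1} + x_prev * eta_k with eta_k standard normal (drift b = 0)."""
    z = (y - y_prev) / x_prev
    return math.exp(-0.5 * z * z) / (x_prev * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    states = [0.1, 0.3]                 # low and high volatility (illustrative)
    P = [[0.9, 0.1], [0.2, 0.8]]        # transition matrix of the hidden chain
    pi = [0.5, 0.5]                     # initial law mu
    y_prev = 0.0
    for y in (0.02, -0.05, 0.45):       # a made-up observation path
        pi = filter_step(pi, P, gaussian_likelihood, y_prev, y, states)
        y_prev = y
        print(pi)
```

Each call uses only the previous filter and the last two observations, which is the recursive structure exploited later when the pair filter-observation is treated as a Markov chain.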
In financial applications, $u_0$ is the price of the American option with payoff h on the (logarithm of the) stock price Y and with unobservable volatility X. Before studying this problem, let us briefly recall how it is solved in the full information context:
$v_0 = \sup_{\tau \in \mathcal{T}_n} E[h(\tau, X_\tau, Y_\tau)]$,
where $\mathcal{T}_n$ is the set of stopping times adapted with respect to the full information filtration $\{\mathcal{F}_k, k = 0, \ldots, n\} = \{\sigma(X_j, Y_j, j \le k)\}$ and valued in {0, . . . , n}. We introduce the corresponding value at time k:
$V_k = \operatorname{ess\,sup}_{\tau \in \mathcal{T}_{k,n}} E[h(\tau, X_\tau, Y_\tau) \mid \mathcal{F}_k]$,
where $\mathcal{T}_{k,n}$ is the set of stopping times adapted with respect to $(\mathcal{F}_k)$ and valued in {k, . . . , n}. Then, the random values $V_k$ are calculated by backward induction via the dynamic programming principle:
$V_n = h(n, X_n, Y_n)$,
$V_k = \max\left( h(k, X_k, Y_k),\ E[V_{k+1} \mid \mathcal{F}_k] \right)$,  $k = 0, \ldots, n - 1$.
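The backward induction above can be sketched on a finite state space. In this hypothetical example the continuous pair (X_k, Y_k) is replaced by a time-homogeneous Markov chain on m states, as one would obtain after some discretization; the function name `snell_envelope`, the payoff signature and the transition matrix are assumptions for illustration, not objects defined in the paper.

```python
def snell_envelope(payoff, P, n):
    """Backward dynamic programming for optimal stopping on a finite chain.

    payoff(k, i) is the reward for stopping at time k in state i, and
    P[i][j] is the transition probability from state i to state j.
    Returns the list of values V_0(i) for each initial state i.
    """
    m = len(P)
    values = [payoff(n, i) for i in range(m)]          # V_n = h(n, .)
    for k in range(n - 1, -1, -1):
        # Conditional expectation of V_{k+1} given the state at time k.
        continuation = [sum(P[i][j] * values[j] for j in range(m))
                        for i in range(m)]
        # V_k = max(stop now, continue).
        values = [max(payoff(k, i), continuation[i]) for i in range(m)]
    return values


if __name__ == "__main__":
    P = [[0.5, 0.5], [0.5, 0.5]]
    rewards = [0.0, 1.0]
    print(snell_envelope(lambda k, i: rewards[i], P, n=1))
```

In state 0 it is optimal to continue (value 0.5), while in state 1 it is optimal to stop immediately (value 1).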
These successive computations involve the conditional expectations $E[V_{k+1} \mid \mathcal{F}_k]$, and so, by the Markov property of (X, Y), the conditional law of $(X_{k+1}, Y_{k+1})$ given $(X_k, Y_k)$, k = 0, . . . , n − 1. There are various numerical approaches to approximate these conditional distributions: Monte Carlo methods with Malliavin calculus [4], grid methods [8], or alternatively the quantization approach as in [1]. In the partial observation context, as will be made precise later, this will involve successive computations of the conditional distributions of $\Pi_{k+1}$ given $\Pi_k$. Here, the main problem in effective approximation comes from the growing dimension of the approximating filters, see Remark 2.1. We suggest an approach based on an appropriate quantization of the filter process. Some background on quantization is recalled in the next section.

3.2 Control problems

We consider a real-valued controlled process $(W_k)_k$ with dynamics of the form:
(3.1)  $W_{k+1} = F(W_k, \alpha_k, Y_k, Y_{k+1})$,  $k = 0, \ldots, n - 1$,
for some measurable function F, where the control process $\alpha = (\alpha_k)_k$ is valued in some compact subset A of $\mathbb{R}^l$ and is adapted with respect to the observation filtration $(\mathcal{F}_k^Y)$. We denote by $\mathcal{A}$ this set of control processes. Notice that $(W_k)$ is adapted with respect to $(\mathcal{F}_k^Y)$. Given a measurable function $\ell$ on $E \times \mathbb{R}^d \times \mathbb{R}$, we are interested in the following control problem under partial observation:
(3.2)  $J_{opt} = \inf_{\alpha \in \mathcal{A}} J(\alpha) := \inf_{\alpha \in \mathcal{A}} E[\ell(X_n, Y_n, W_n)]$.
A typical financial example corresponds to the case where Y is the logarithm price of a risky asset and X is its unobservable volatility. Consider
an investor who can trade at any time k a number of shares $\alpha_k$ in the stock based on the past price observations, and invest the rest in a riskless bond whose price is assumed for simplicity equal to one. Her wealth process $(W_k)_k$ is then governed by:
$W_{k+1} = W_k + \alpha_k (e^{Y_{k+1}} - e^{Y_k})$,  $k = 0, \ldots, n - 1$.
The function $\ell$ represents the criterion function associated to a portfolio choice or hedging problem. For example, the mean-variance hedging problem for an option payoff $h(Y_n)$ at maturity n corresponds to $\ell(X_n, Y_n, W_n) = (h(Y_n) - W_n)^2$.

4. Short Background on Optimal Vector Quantization

The basic idea of (quadratic) quantization is to replace an $\mathbb{R}^q$-valued random vector $Z \in L^2(P, \mathbb{R}^q)$, with probability law $P_Z$, by a random vector taking at most N values in order to minimize the induced $L^2$-error. For this, consider a grid $z = \{z^1, \ldots, z^N\}$ of N points in $\mathbb{R}^q$ (we shall often identify such a grid with an N-tuple in $\mathbb{R}^q$), and its Voronoi tessellations, that is Borel partitions $C_1(z), \ldots, C_N(z)$ of $\mathbb{R}^q$ satisfying:
$C_i(z) \subset \{ \xi \in \mathbb{R}^q : |\xi - z^i| = \min_{j=1,\ldots,N} |\xi - z^j| \}$,  $i = 1, \ldots, N$.
Here, |.| is the Euclidean norm on $\mathbb{R}^q$. Then, one defines the z-Voronoi quantization of Z as the closest neighbour projection of Z on the grid z:
$\hat Z^z = \mathrm{Proj}_z(Z) := \sum_{i=1}^N z^i\,1_{C_i(z)}(Z)$,
whose discrete probability law $P_{\hat Z}$ is characterized by:
$\hat p_i := P_{\hat Z}(z^i) = P_Z(C_i(z))$,  $i = 1, \ldots, N$.
In the sequel, we often drop the exponent z in $\hat Z^z$ when there is no ambiguity, and we say that $\hat Z$ is a quantizer of Z. The $L^2$-error induced by this projection, called the $L^2$-quantization error, is $\|Z - \hat Z\|_2$. As a function of the N-tuple (grid) $z = (z^1, \ldots, z^N) \in (\mathbb{R}^q)^N$, the square of the $L^2$-quantization error, called the distortion, is written as:
(4.1)  $D_N^Z(z) = \|Z - \hat Z\|_2^2 = \int \min_{i=1,\ldots,N} |\xi - z^i|^2\,P_Z(d\xi)$.
First, notice by definition of the closest neighbour projection that the $L^2$-quantization error is the minimum of the $L^2$-error $\|Z - Y\|_2$ among all random variables Y taking values in the grid z. Then, two questions arise naturally: for fixed N, is there an optimal grid $z^*$ which minimizes the $L^2$-quantization error (or equivalently the distortion), and how does this minimum behave when N goes to infinity? The latter question is answered by the so-called Zador theorem:

Theorem 4.1. (see [7]) Assume that $Z \in L^{2+\varepsilon}(P, \mathbb{R}^q)$ for some ε > 0. Then,
$\lim_{N} N^{2/q} \min_{z} \|Z - \hat Z^z\|_2^2 = J_q \left( \int |f|^{\frac{q}{q+2}}\,d\lambda_q \right)^{\frac{q+2}{q}}$,
where $P_Z(d\xi) = f(\xi)\,\lambda_q(d\xi) + \nu(d\xi)$ is the Lebesgue decomposition of $P_Z$ with respect to the Lebesgue measure $\lambda_q$ on $\mathbb{R}^q$, and $J_q$ is a constant depending on q, corresponding to the uniform distribution on $[0, 1]^q$.

Remark 4.1. In dimensions q = 1 and 2, $J_1 = \frac{1}{12}$ and $J_2 = \frac{5}{18\sqrt{3}}$. For q ≥ 3, $J_q \sim \frac{q}{2\pi e}$ as q goes to infinity.

The optimal N-quantization problem, which consists in determining a grid $z^*$ minimizing the $L^2$-quantization error, relies on the property that the distortion is continuously differentiable at any N-tuple having pairwise distinct components, with a gradient obtained by formal differentiation in (4.1):
(4.2)  $\nabla D_N^Z(z) = 2\,E[K_N(z, Z)]$,
where $K_N : (\mathbb{R}^q)^N \times \mathbb{R}^q \to (\mathbb{R}^q)^N$ is defined by $K_N(z, \xi) = ((z^i - \xi)\,1_{\xi \in C_i(z)})_{1 \le i \le N}$. A quantizer $\hat Z = \hat Z^z$ is said to be stationary if the associated N-tuple z satisfies $\nabla D_N^Z(z) = 0$. An optimal quantizer is a stationary quantizer. The integral representation (4.2) of $\nabla D_N^Z$ suggests, as soon as independent copies of Z can be simulated, implementing a stochastic gradient (descent) algorithm in order to obtain numerically a stationary quantizer. Denoting by $z^{(s)} = (z^{s,1}, \ldots, z^{s,N})$ the grid (or N-tuple in $\mathbb{R}^q$) at step s, the stochastic gradient descent procedure is recursively defined by:
$z^{(s+1)} = z^{(s)} - \delta_{s+1}\,K_N(z^{(s)}, \xi^{s+1})$,
where $(\xi^s)_s$ are independent copies of $Z$, and $(\delta_s)_s$ is a positive sequence of step parameters satisfying the usual conditions : $\sum_s \delta_s = \infty$ and $\sum_s \delta_s^2 < \infty$.
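The recursion above can be sketched in a few lines. The code below is an illustration under stated assumptions, not the reference implementation of [10]: the step sequence $\delta_s = a/(b+s)$ is one admissible choice among many, the grid is initialized with $N$ draws of $Z$, and the companion weight update $p \leftarrow (1-\delta_{s+1})\,p + \delta_{s+1} 1_{\{i = \text{winner}\}}$ is a commonly used form that is assumed here.

```python
import numpy as np

def clvq(sample_z, N, n_steps, rng, a=1.0, b=100.0):
    """Competitive learning vector quantization for a q-dimensional Z.

    sample_z(rng) returns one draw of Z as a 1-D array of length q.
    Steps delta_s = a / (b + s) satisfy sum delta_s = inf and sum delta_s^2 < inf.
    """
    grid = np.array([sample_z(rng) for _ in range(N)])   # assumed initialization: N draws of Z
    weights = np.full(N, 1.0 / N)
    for s in range(1, n_steps + 1):
        delta = a / (b + s)
        xi = sample_z(rng)
        i = ((grid - xi) ** 2).sum(axis=1).argmin()      # competitive phase: closest point wins
        grid[i] -= delta * (grid[i] - xi)                # learning phase: only the winner moves
        weights *= 1.0 - delta                           # assumed companion estimate of the weights
        weights[i] += delta
    return grid, weights

rng = np.random.default_rng(1)
grid, weights = clvq(lambda g: g.standard_normal(1), N=5, n_steps=20_000, rng=rng)
```

Only the winning point is moved at each step, which is exactly the effect of the indicator in $K_N$; the weight vector stays a probability vector throughout.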
In our context, this leads to the Kohonen algorithm, or competitive learning vector quantization (CLVQ) algorithm, which also provides as a byproduct an estimate of the weights $\hat p_i$ of the Voronoi tessellation associated with the stationary quantizer. We refer to [10] for a complete description and a discussion of the convergence of the algorithm. Optimal grids and their companion parameters, i.e. weights of the Voronoi tessellation and distortion, for the normal distribution can be downloaded from the webpages of Gilles Pagès or Jacques Printems.

5. Quantization of the Filter Process

In view of solving dynamic optimization problems under partial observation, we need an approximation of the filter process $(\Pi_k)_k$. Recall the dependence of the random filter on the observation : $\Pi_k = \Pi_k(Y_1, \ldots, Y_k)$. A usual approach, suggested e.g. in [3], consists of approximating $\Pi_k(Y_1, \ldots, Y_k)$ by $\Pi_k(\hat Y_1, \ldots, \hat Y_k)$ where $\hat Y_k$ is a quantizer of $Y_k$. The main problem in effective implementation is the growing dimension of this approximating filter : for instance, if each $\hat Y_k$ takes $M$ values, then at time $n$, the random filter $\Pi_n(\hat Y_1, \ldots, \hat Y_n)$ would take $M^n$ values in $K_m$, which is not realistically implementable for a long horizon $n$. In order to overcome this numerical difficulty, we present a quantization approach introduced in [11] and based on the Markov property of the pair filter-observation $(\Pi_k, Y_k)$ with respect to the observation filtration $(\mathcal{F}_k^Y)$. In other words, the conditional law of $X_{k+1}$ given $\mathcal{F}_k^Y$ is summarized by the sufficient statistic $(\Pi_k, Y_k)$, and we shall approximate the pair Markov chain $(\Pi_k, Y_k)$ by an approximation of their successive probability transitions.
One first proves that the probability transition $R_k$ (from time $k-1$ to $k$) of the Markov chain $(Z_k) = (\Pi_k, Y_k)$ in $K_m \times \mathbb{R}^d$ is given by :
\[
R_k \varphi(\pi, y) = \int \varphi\big( \bar H_k(\pi, y, y'), y' \big) \, Q_k(\pi, y, dy'),
\]
where $Q_k(\pi, y, dy')$ is the law of $Y_k$ conditional on $(\Pi_{k-1}, Y_{k-1}) = (\pi, y)$ with density :
\[
y' \longmapsto \sum_{i,j=1}^{m} g_k(x^i, y, x^j, y') \, P_k^{ij} \, \pi^i. \tag{5.1}
\]
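The transition $R_k$ lends itself to direct simulation: draw the next observation from the mixture density (5.1), then update the filter. The sketch below rests on several assumptions that are not taken from the text: the Gaussian kernel `g`, the three states and the matrix `P` are hypothetical, and the map $\bar H_k$ is assumed to be the normalized vector with components $\sum_i g_k(x^i, y, x^j, y') P_k^{ij} \pi^i$, which is consistent with (5.1) being its normalizing constant.

```python
import numpy as np

DELTA, R = 0.2, 0.05   # hypothetical time step and interest rate

def g(x_i, y, x_j, y_next):
    """Hypothetical Gaussian density of Y_k at y_next given (X_{k-1}, Y_{k-1}) = (x_i, y).

    It does not depend on x_j here, but the signature follows (5.1).
    """
    mean = y + (R - 0.5 * x_i**2) * DELTA
    std = x_i * np.sqrt(DELTA)
    return np.exp(-0.5 * ((y_next - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def sample_y(pi, y, states, rng):
    """Draw Y_k from the mixture (5.1): pick i with law pi, then sample from g(x_i, y, ., .)."""
    i = rng.choice(len(states), p=pi)
    return y + (R - 0.5 * states[i]**2) * DELTA + states[i] * np.sqrt(DELTA) * rng.standard_normal()

def filter_update(pi, y, y_next, states, P):
    """Assumed form of H_bar_k: normalize sum_i g(x_i, y, x_j, y') P[i, j] pi[i] over j."""
    m = len(states)
    G = np.array([[g(states[i], y, states[j], y_next) * P[i, j] for j in range(m)]
                  for i in range(m)])
    unnorm = pi @ G
    return unnorm / unnorm.sum()

def simulate_z(mu, y0, states, P, n, rng):
    """Simulate one path of Z_k = (Pi_k, Y_k), k = 0, ..., n, starting from (mu, y0)."""
    pis, ys = [np.asarray(mu, dtype=float)], [y0]
    for _ in range(n):
        y_next = sample_y(pis[-1], ys[-1], states, rng)
        pis.append(filter_update(pis[-1], ys[-1], y_next, states, P))
        ys.append(y_next)
    return np.array(pis), np.array(ys)

states = np.array([0.10, 0.15, 0.20])                               # hypothetical hidden states
P = np.array([[0.4, 0.6, 0.0], [0.4, 0.2, 0.4], [0.0, 0.6, 0.4]])   # hypothetical transition matrix
rng = np.random.default_rng(0)
pis, ys = simulate_z([0.0, 1.0, 0.0], np.log(110.0), states, P, n=5, rng=rng)
```

Each simulated filter value is a probability vector on the $m$ hidden states, so the pair $(\Pi_k, Y_k)$ lives in $K_m \times \mathbb{R}^d$ as required for the quantization step.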
This shows in particular that $Z_k$ may be simulated through the following simulation procedure of its probability transition $(R_k)$ : for $k = 0$, $Z_0$ is a known deterministic vector equal to $z_0 = (\mu, y_0)$, and for $k \geq 1$, starting from $(\Pi_{k-1}, Y_{k-1})$,
• we simulate $Y_k$ according to the law $Q_k(\Pi_{k-1}, Y_{k-1}, dy')$ given in (5.1),
• we compute $\Pi_k$ by the forward filtering equation $\Pi_k = \bar H_k(\Pi_{k-1}, Y_{k-1}, Y_k)$.

Once we are able to simulate independent copies of $(Z_0, \ldots, Z_n)$, we apply an optimal quantization to each $Z_k$ in $K_m \times \mathbb{R}^d$, for $k = 0, \ldots, n$, following the vector quantization method described in the previous section. For each $k = 0, \ldots, n$, we denote by $\hat Z_k$ the $z_k$-Voronoi quantizer of $Z_k$, valued in the grid $z_k = (z_k^1, \ldots, z_k^{N_k})$ consisting of $N_k$ points in $K_m \times \mathbb{R}^d$, associated with the Voronoi tessellation $C_i(z_k)$, $i = 1, \ldots, N_k$. As a byproduct, we approximate the probability transitions $(R_k)$ of the Markov chain $(Z_k)$ by the probability transition matrices $(\hat r_k)$ defined by :
\[
\hat r_k^{ij} = P\big[ \hat Z_k = z_k^j \,\big|\, \hat Z_{k-1} = z_{k-1}^i \big]
= \frac{P\big[ Z_k \in C_j(z_k), \, Z_{k-1} \in C_i(z_{k-1}) \big]}{P\big[ Z_{k-1} \in C_i(z_{k-1}) \big]}
=: \frac{\hat\beta_k^{ij}}{\hat p_{k-1}^i},
\]
for all $k \geq 1$, $i = 1, \ldots, N_{k-1}$, $j = 1, \ldots, N_k$. The process $(\hat Z_k)$ obtained by this method is called a marginal quantization of the process $(Z_k)$ : it is characterized for each $k$ by its grid $z_k$, and by the probability transition matrix $\hat r_k = (\hat r_k^{ij})$. Denoting by $(\xi^s)_s = (\xi_0^s, \ldots, \xi_n^s)_s$ independent copies of $(Z_0, \ldots, Z_n)$, the optimal grids $z_k$ that minimize the $L^2$-quantization error $\|Z_k - \hat Z_k\|_2$ for each $k$, and the companion parameters $\hat r_k^{ij}$, are practically implemented according to the Kohonen algorithm as follows :

Initialisation phase :
• Initialize the $n$ grids $z_k^{(0)} = (z_k^{0,1}, \ldots, z_k^{0,N_k}) \in (K_m \times \mathbb{R}^d)^{N_k}$ for $k = 0, \ldots, n$, with $\Gamma_0^{(0)} = z_0$ reduced to $N_0 = 1$ point for $k = 0$.
• Initialize the weights vectors : $p_k^{0,i} = 1/N_k$, $\beta_{k+1}^{0,ij} = 0$, $i = 1, \ldots, N_k$, $j = 1, \ldots, N_{k+1}$, and the distortion $D_k^{N,0} = 0$, for $k = 0, \ldots, n$.

Updating $s \to s+1$ : At step $s$, the $n$ grids $z_k^{(s)} = (z_k^{s,1}, \ldots, z_k^{s,N_k})$, the weights vectors $p_k^{s,i}$, $\beta_{k+1}^{s,ij}$, $i = 1, \ldots, N_k$, $j = 1, \ldots, N_{k+1}$, have been obtained and we use the sample $\xi^{s+1}$ of $(Z_0, \ldots, Z_n)$ to update them as follows : for all $k = 0, \ldots, n$,
• Competitive phase : select $i_k(s+1) \in \{1, \ldots, N_k\}$ such that $\xi_k^{s+1} \in C_{i_k(s+1)}(z_k^{(s)})$, i.e. $i_k(s+1) \in \operatorname{argmin}_{1 \leq i \leq N_k} |z_k^{s,i} - \xi_k^{s+1}|^2$.
• Learning phase :
Updating of the grid :
\[
z_k^{s+1,i} = z_k^{s,i} - \delta_{s+1} 1_{i = i_k(s+1)} \big( z_k^{s,i} - \xi_k^{s+1} \big), \qquad i = 1, \ldots, N_k.
\]
Updating of the weights vectors and of the probability transition
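for all $i = 1, \ldots, N_k$, $j = 1, \ldots, N_{k+1}$.

6. Numerical Approximation to Optimization Problems under Partial Observation

6.1 Quantization of optimal stopping

We turn back to the optimal stopping problem under partial observation considered in paragraph 3.1, and we define the corresponding values :
\[
U_k = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{k,n}^Y} E\big[ h(\tau, X_\tau, Y_\tau) \,\big|\, \mathcal{F}_k^Y \big], \qquad k = 0, \ldots, n, \tag{6.1}
\]
where $\mathcal{T}_{k,n}^Y$ is the set of $(\mathcal{F}_k^Y)$-stopping times valued in $\{k, \ldots, n\}$. By using the law of iterated conditional expectation and the definition of the filter, we notice that problem (6.1) may be reduced to a complete observation model with state variable the $(\mathcal{F}_k^Y)$-adapted process $(Z_k)$ :
\begin{align*}
U_k &= \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{k,n}^Y} E\Big[ \sum_{j=k}^{n} 1_{\tau = j} \, E\big[ h(j, X_j, Y_j) \,\big|\, \mathcal{F}_j^Y \big] \,\Big|\, \mathcal{F}_k^Y \Big] \\
&= \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{k,n}^Y} E\Big[ \sum_{j=k}^{n} 1_{\tau = j} \, \Pi_j h(j, \cdot, Y_j) \,\Big|\, \mathcal{F}_k^Y \Big] \\
&= \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{k,n}^Y} E\big[ \Pi_\tau h(\tau, \cdot, Y_\tau) \,\big|\, \mathcal{F}_k^Y \big]
= \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{k,n}^Y} E\big[ \tilde h(\tau, Z_\tau) \,\big|\, \mathcal{F}_k^Y \big],
\end{align*}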
with the notation :
\[
\tilde h(k, z) = \pi h(k, \cdot, y) = \sum_{i=1}^{m} h(k, x^i, y) \, \pi^i, \qquad \forall z = (\pi, y), \ \pi = (\pi^i)_i \in K_m, \ y \in \mathbb{R}^d.
\]
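The reduced payoff $\tilde h$ is simply the original payoff averaged over the hidden state with the filter weights. A minimal sketch follows; the function names, the discounted put payoff and the numerical values are illustrative assumptions.

```python
import numpy as np

def h_tilde(h, k, pi, y, states):
    """Filter-averaged payoff: sum_i h(k, x_i, y) * pi_i."""
    return float(sum(h(k, x, y) * p for x, p in zip(states, pi)))

R, DELTA, KAPPA = 0.05, 0.2, 100.0   # illustrative rate, time step and strike

def put_payoff(k, x, y):
    """Discounted put payoff on the log-price y; it does not depend on the hidden state x."""
    return np.exp(-R * k * DELTA) * max(KAPPA - np.exp(y), 0.0)

states = [0.10, 0.15, 0.20]
pi = [0.2, 0.3, 0.5]
value = h_tilde(put_payoff, 2, pi, np.log(90.0), states)
```

When the payoff does not depend on the hidden state, as for this put, the averaging leaves it unchanged because the filter weights sum to one; the filter then enters the problem only through the dynamics of $(Z_k)$.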
By the $(\mathcal{F}_k^Y)$-Markov property of $(Z_k)$ and the dynamic programming principle, we have $U_k = u_k(Z_k)$ where the functions $u_k$ are defined by backward induction :
\begin{align*}
u_n(z) &= \tilde h(n, z), \\
u_k(z) &= \max\big\{ \tilde h(k, z), \ E\big[ u_{k+1}(Z_{k+1}) \,\big|\, Z_k = z \big] \big\}.
\end{align*}
Following [1], we provide a quantization approximation of $U_k = u_k(Z_k)$ by $\hat U_k = \hat u_k(\hat Z_k)$, for $k = 0, \ldots, n$, where $(\hat Z_k)$ is a marginal quantization of $(Z_k)$ on grids $(z_k)$ with corresponding probability transition matrices $(\hat r_k)$, as described in the previous section, and the functions $\hat u_k$ are explicitly computed in recursive form by :
\begin{align*}
\hat u_n(z) &= \tilde h(n, z), \\
\hat u_k(z) &= \max\big\{ \tilde h(k, z), \ E\big[ \hat u_{k+1}(\hat Z_{k+1}) \,\big|\, \hat Z_k = z \big] \big\}.
\end{align*}
From an algorithmic viewpoint, this reads as :
\begin{align*}
\hat u_n(z_n^i) &= \tilde h(n, z_n^i), \qquad i = 1, \ldots, N_n, \\
\hat u_k(z_k^i) &= \max\Big\{ \tilde h(k, z_k^i), \ \sum_{j=1}^{N_{k+1}} \hat r_{k+1}^{ij} \, \hat u_{k+1}(z_{k+1}^j) \Big\}, \qquad i = 1, \ldots, N_k, \ k = 0, \ldots, n-1.
\end{align*}
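The algorithmic form above amounts to one matrix-vector product and one maximum per time step. The sketch below estimates the transition matrices $\hat r_k$ by counting simulated paths through Voronoi cells, then runs the backward recursion. It is an illustration only: the one-dimensional random walk standing in for $(Z_k)$, the grids made of the first 20 sample points (instead of optimized CLVQ grids) and the undiscounted put payoff are all assumptions chosen to keep the example short.

```python
import numpy as np

def project(grid, points):
    """Index of the closest grid point for each point (closest-neighbour projection)."""
    d2 = ((points[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def transition_matrices(paths, grids):
    """Empirical r_hat_k = beta_hat_k / p_hat_{k-1}, k = 1..n, returned as a list (index k-1).

    paths: array (n_paths, n + 1, dim); grids: list of n + 1 arrays of shape (N_k, dim).
    """
    idx = [project(grids[k], paths[:, k, :]) for k in range(len(grids))]
    mats = []
    for k in range(1, len(grids)):
        beta = np.zeros((len(grids[k - 1]), len(grids[k])))
        np.add.at(beta, (idx[k - 1], idx[k]), 1.0)       # joint cell counts
        p = beta.sum(axis=1, keepdims=True)              # marginal cell counts at time k-1
        mats.append(np.divide(beta, p, out=np.zeros_like(beta), where=p > 0))
    return mats

def backward_induction(h_tilde, grids, mats):
    """Quantized recursion: u_n = h_tilde(n, .), u_k = max(h_tilde(k, .), r_hat_{k+1} u_{k+1})."""
    n = len(grids) - 1
    u = np.array([h_tilde(n, z) for z in grids[n]])
    for k in range(n - 1, -1, -1):
        payoff = np.array([h_tilde(k, z) for z in grids[k]])
        u = np.maximum(payoff, mats[k] @ u)              # mats[k] holds r_hat_{k+1}
    return u

rng = np.random.default_rng(0)
n, n_paths, kappa = 5, 5_000, 100.0
steps = 0.1 * rng.standard_normal((n_paths, n))          # hypothetical random-walk increments
paths = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(steps, axis=1)], axis=1) + np.log(100.0)
paths = paths[:, :, None]                                 # shape (n_paths, n + 1, 1)
grids = [paths[0, 0][None, :]] + [paths[:20, k] for k in range(1, n + 1)]
mats = transition_matrices(paths, grids)
payoff = lambda k, z: max(kappa - np.exp(z[0]), 0.0)      # undiscounted put, for illustration
u0 = backward_induction(payoff, grids, mats)
```

Once the grids and the matrices $\hat r_k$ are stored, the recursion costs of the order of $\sum_k N_k N_{k+1}$ operations and involves no further simulation.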
An $L^1$-error estimate of $\|U_k - \hat U_k\|_1$ in terms of the quantization errors $\|Z_k - \hat Z_k\|_2$ is stated in [11]. By combining with Zador's theorem, we obtain a rate of convergence of order
\[
\frac{C(n)}{N^{\frac{1}{m-1+d}}},
\]
where $C(n)$ is a constant depending essentially on the boundedness and Lipschitz conditions on $g_k$ and $h$, and the horizon $n$.

Numerical illustration : Bermudan options in a partially observed stochastic volatility model

We consider an observable stock (logarithm) price $Y_k = \ln S_k$, with dynamics given by :
\[
Y_{k+1} = Y_k + \Big( r - \frac{1}{2} X_k^2 \Big) \delta + X_k \sqrt{\delta} \, \varepsilon_{k+1}, \tag{6.2}
\]
where $(\varepsilon_k)$ is a sequence of Gaussian white noise, and $(X_k)$ is the unobservable volatility process. $\delta = \frac{1}{n}$ is the time step from an Euler scheme over a period $[0, 1]$. We assume that $(X_k)$ is a Markov chain approximation à la Kushner [8] with spatial step $\Delta$ and with $m = 3$ states of a mean-reverting process :
\[
dX_t = \lambda (x_0 - X_t) \, dt + \eta \, dW_t. \tag{6.3}
\]

Table 1. Comparison of quantized filter value to its Monte Carlo estimation. Rows: Monte Carlo; quantization with $\bar N = 300$, $600$, $900$, $1200$, $1500$.
In this context of a partially observed stochastic volatility model, we consider a Bermudan put option with payoff $y \mapsto (\kappa - e^y)_+$, and with price :
\[
u_0 = \sup_{\tau \in \mathcal{T}_{0,n}^Y} E\big[ e^{-r \tau \delta} \big( \kappa - e^{Y_\tau} \big)_+ \big]. \tag{6.4}
\]
We perform numerical tests with :
- Price and put option parameters : $r = 0.05$, $S_0 = 110$, $\kappa = 100$,
- Volatility parameters : $\lambda = 1$, $\eta = 0.1$, $\Delta = 0.05$, $X_0 = 0.15$,
- Quantization : grids are of the same size $\bar N$, fixed for each time period.

We first compare in Table 1 the filter expectation at the final date, computed with a time step size $\delta = 1/5$, obtained by the optimal quantization method with increasing grid size $\bar N$ and by $10^6$ Monte Carlo iterations of the path observation $Y$. We observe that, besides the very low error level, the absolute error (plotted in Fig. 1) and the relative error decrease as the grid size grows. Secondly, in order to illustrate the effect of the time step, we compute the American option price under partial observation when the time step $\delta$ decreases to zero (i.e. $n$ increases) and compare it with the American option price with complete observation of $(X_k, Y_k)$. Indeed, in the limit for $\delta \to 0$ we fully observe the volatility, and so the partial observation price should converge to the complete observation price.
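To make the test model concrete, the sketch below builds a three-state chain for the volatility and simulates the log-price (6.2) with the parameters listed above. The chain construction is an assumption: it uses one standard explicit (upwind) scheme in the spirit of [8], with probabilities $(\eta^2/2 + \Delta\, b^{\pm}(x))\,\delta/\Delta^2$ of moving up or down, states centred at $x_0 = X_0$, and blocked moves at the boundary kept in place. The text does not specify these details, so they should be read as one possible choice.

```python
import numpy as np

def kushner_chain(x0, lam, eta, step, delta, m=3):
    """Assumed explicit upwind chain on m states centred at x0 with spacing `step`."""
    offsets = np.arange(m) - (m - 1) // 2
    states = x0 + step * offsets
    P = np.zeros((m, m))
    for i, x in enumerate(states):
        b = lam * (x0 - x)                                   # drift of (6.3)
        up = (0.5 * eta**2 + step * max(b, 0.0)) * delta / step**2
        down = (0.5 * eta**2 + step * max(-b, 0.0)) * delta / step**2
        if i + 1 < m:
            P[i, i + 1] = up
        if i > 0:
            P[i, i - 1] = down
        P[i, i] = 1.0 - P[i].sum()                           # blocked boundary moves stay put
    return states, P

def simulate_log_price(y0, states, P, i0, n, r, rng):
    """Simulate Y_0, ..., Y_n from (6.2) with delta = 1/n and the hidden chain X."""
    delta = 1.0 / n
    y = np.empty(n + 1)
    y[0] = y0
    i = i0
    for k in range(n):
        x = states[i]
        y[k + 1] = y[k] + (r - 0.5 * x**2) * delta + x * np.sqrt(delta) * rng.standard_normal()
        i = rng.choice(len(states), p=P[i])                  # move the hidden volatility
    return y

n = 5
states, P = kushner_chain(x0=0.15, lam=1.0, eta=0.1, step=0.05, delta=1.0 / n)
rng = np.random.default_rng(0)
y = simulate_log_price(np.log(110.0), states, P, i0=1, n=n, r=0.05, rng=rng)
```

With $\delta = 1/5$ these parameters give a middle row of $(0.4, 0.2, 0.4)$, so all transition probabilities are nonnegative; for larger time steps the explicit scheme would require a check of that condition.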
Figure 1. Filter error convergence as $\bar N$ grows.
Figure 2. Quadratic hedging of a European put: graph of $w_0 \mapsto \inf_{\alpha \in \mathcal{A}} E\big[ \big( (\kappa - e^{Y_n})_+ - W_n \big)^2 \big]$ in the partial and total observation cases (horizontal axis: initial capital; vertical axis: quadratic risk). Grid size for $W$ = 100 points, grid size for $(e^Y, \Pi)$ = 1500 points, grid size for $(e^Y, X)$ = 45 points.
Moreover, when we have more and more observations, the difference between the two prices should decrease and converge to zero. This is shown in Figure 7, where we performed option pricing over grids of size $\bar N_{\Pi,Y} = 1500$ in the case of partial observation. The total observation price is given by the same pricing algorithm carried out on $\bar N_{X,Y} = 45$ points for the product grid of $(X_k, Y_k)$. For fixed $n$, the rate of convergence for the
approximation of the value function under partial observation is of order $1 / \bar N_{\Pi,Y}^{1/(m-1+d)}$, where $\bar N_{\Pi,Y}$ is the number of points used at each time $k$ for the grid of $(\Pi_k, Y_k)$ valued in $K_m \times \mathbb{R}^d$. From the results of [1], we also know that the rate of convergence for the approximation of the value function under full observation is of order $1 / \bar N_Y^{1/d}$, where $\bar N_{X,Y} = m \times \bar N_Y$ is the number of points used at each time $k$ for the grid of $(X_k, Y_k)$ valued in $E \times \mathbb{R}^d$. This explains why, in order to have comparable results, and with $m = 3$ and $d = 1$, we have chosen $\bar N_Y \sim \bar N_{\Pi,Y}^{1/3}$.

In addition, it is possible to observe the effect of information enrichment as the time step decreases. In fact, if we consider multiples of $n$ as the time step parameter, we notice that the American option price increases for both total and partial observation models (see Tables 2 and 3).

Table 2. American option price for embedded filtrations: first example.

Table 3. American option price for embedded filtrations: second example.

n                                        5           10          20
Tot. Obs. ($\bar N_{X,Y} = 45$)          1.57506     1.72595     1.91208
Part. Obs. ($\bar N_{\Pi,Y} = 1500$)     0.988531    1.30616     1.59632
Variation                                0.58        0.42        0.31

6.2 Quantization of control problem

We turn back to the control problem under partial observation considered in paragraph 3.2. By using the law of iterated conditional expectations, we can rewrite the expected cost function as follows:
\begin{align*}
J(\alpha) &= E\big[ E\big[ \ell(X_n, Y_n, W_n) \,\big|\, \mathcal{F}_n^Y \big] \big] \\
&= E\Big[ \sum_{i=1}^{m} \ell(x^i, Y_n, W_n) \, \Pi_n^i \Big] \\
&= E\big[ \hat\ell(\Pi_n, Y_n, W_n) \big],
\end{align*}
where
\[
\hat\ell(\pi, y, w) := \sum_{i=1}^{m} \ell(x^i, y, w) \, \pi^i.
\]
The original control problem (3.2) can now be reformulated as a problem under full observation with state variables $(\Pi_k, Y_k, W_k)$, valued in $K_m \times \mathbb{R}^d \times \mathbb{R}$, and $(\mathcal{F}_k^Y)$-adapted :
\[
J_{opt} = \inf_{\alpha \in \mathcal{A}} E\big[ \hat\ell(\Pi_n, Y_n, W_n) \big].
\]
Recalling the dynamics (3.1) of $(W_k)$ and following the dynamic programming principle for discrete-time control problems, we define the sequence of functions on $K_m \times \mathbb{R}^d \times \mathbb{R}$ :
\begin{align*}
u_n(\pi, y, w) &= \hat\ell(\pi, y, w), \\
u_k(\pi, y, w) &= \inf_{a \in A} E\big[ u_{k+1}\big( \Pi_{k+1}, Y_{k+1}, F(w, a, y, Y_{k+1}) \big) \,\big|\, (\Pi_k, Y_k) = (\pi, y) \big],
\end{align*}
for $k = 0, \ldots, n-1$, so that $J_{opt} = u_0(\mu, y_0, w_0)$, where $w_0$ is the initial value of $W$ at time $k = 0$, and we recall that $(\Pi_0, Y_0) = (\mu, y_0)$. In order to compute this sequence of functions $u_k$, we deal separately with the approximation of the pair filter-observation process $(Z_k)_k = (\Pi_k, Y_k)_k$, which does not depend on the control, and the approximation of the controlled process $(W_k)_k$.
• We apply a marginal quantization of the process $(Z_k) = (\Pi_k, Y_k)$, and we denote by $(\hat Z_k) = (\hat\Pi_k, \hat Y_k)$ the corresponding quantizers on grids $(z_k)$, and by $(\hat r_k)$ the associated probability transition matrices, as described in section 5. The $i$-th point of the grid $z_k$ of size $N_k$ in $K_m \times \mathbb{R}^d$ is denoted $z_k^i = (\pi_k(i), y_k^i) \in K_m \times \mathbb{R}^d$, $i = 1, \ldots, N_k$.
• The approximation of $W_k$ is obtained by a classical uniform space discretization similar to the Markov chain method of Kushner. We fix a bounded uniform grid on the state space $\mathbb{R}$ for the controlled process $(W_k)$. Namely, we set $\Gamma = (2\nu)\mathbb{Z} \cap [-L, L]$, where $\nu$ is the spatial step and $L$ is the grid size. We denote by $\mathrm{Proj}_\Gamma$ the projection on the grid $\Gamma$ according to the closest neighbor rule. Recalling the dynamics (3.1) of the controlled process $(W_k)$, we approximate it as follows :
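given a control $\alpha \in \mathcal{A}$, we define the discretized controlled process $(\hat W_k)$ valued in $\Gamma$, by :
\[
\hat W_{k+1} = \mathrm{Proj}_\Gamma\big( F(\hat W_k, \alpha_k, \hat Y_k, \hat Y_{k+1}) \big).
\]
We then approximate the sequence of functions $u_k$ by the sequence of functions $\hat u_k$ defined on $z_k \times \Gamma$, $k = 0, \ldots, n$, by a dynamic programming type formula :
\begin{align*}
\hat u_n(\pi, y, w) &= \hat\ell(\pi, y, w), \\
\hat u_k(\pi, y, w) &= \inf_{a \in A} E\big[ \hat u_{k+1}\big( \hat\Pi_{k+1}, \hat Y_{k+1}, \mathrm{Proj}_\Gamma( F(w, a, y, \hat Y_{k+1}) ) \big) \,\big|\, (\hat\Pi_k, \hat Y_k) = (\pi, y) \big].
\end{align*}
From an algorithmic viewpoint, this is computed explicitly as follows :
\[
\hat u_n(z_n^i, w) = \hat\ell(z_n^i, w), \qquad z_n^i = (\pi_n(i), y_n^i) \in z_n, \ i = 1, \ldots, N_n, \ w \in \Gamma,
\]
\[
\hat u_k(z_k^i, w) = \inf_{a \in A} \sum_{j=1}^{N_{k+1}} \hat r_{k+1}^{ij} \, \hat u_{k+1}\big( z_{k+1}^j, \mathrm{Proj}_\Gamma( F(w, a, y_k^i, y_{k+1}^j) ) \big), \tag{6.5}
\]
\[
z_k^i = (\pi_k(i), y_k^i) \in z_k, \ i = 1, \ldots, N_k, \ w \in \Gamma, \ k = 0, \ldots, n-1. \tag{6.6}
\]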
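The recursion (6.5) can be sketched as follows. This is a simplified illustration rather than the scheme used for the numerical results: the infimum is taken by brute force over a finite set of candidate actions instead of a line search, and the one-step toy data (grids, transition row, terminal cost written as a quadratic hedging loss) are hypothetical.

```python
import numpy as np

def proj_gamma(gamma, w):
    """Index of the closest point of the uniform grid gamma (values outside are clipped)."""
    h = gamma[1] - gamma[0]
    idx = np.rint((np.asarray(w) - gamma[0]) / h).astype(int)
    return np.clip(idx, 0, len(gamma) - 1)

def control_dp(l_hat, F, grids, mats, gamma, actions):
    """Backward recursion (6.5), with the infimum taken over a finite action set.

    grids[k]: array (N_k, m + 1) whose last column is y; mats[k]: r_hat_{k+1}, shape (N_k, N_{k+1}).
    Returns the values at time 0, shape (N_0, len(gamma)), and the minimizing actions per step.
    """
    n = len(grids) - 1
    u = np.array([[l_hat(z, w) for w in gamma] for z in grids[n]])
    policy = []
    for k in range(n - 1, -1, -1):
        y, y_next = grids[k][:, -1], grids[k + 1][:, -1]
        cols = np.arange(len(y_next))[None, :]
        best = np.full((len(y), len(gamma)), np.inf)
        arg = np.zeros_like(best)
        for i in range(len(y)):
            for a in actions:
                # next wealth for every (w, j) pair, shape (len(gamma), N_{k+1})
                w_next = F(gamma[:, None], a, y[i], y_next[None, :])
                val = (mats[k][i][None, :] * u[cols, proj_gamma(gamma, w_next)]).sum(axis=1)
                better = val < best[i]
                best[i, better] = val[better]
                arg[i, better] = a
        u = best
        policy.append(arg)
    return u, policy[::-1]

r, delta, kappa = 0.05, 0.2, 110.0                        # illustrative parameters
F = lambda w, a, y, y_next: w * np.exp(r * delta) + a * (np.exp(y_next) - np.exp(y) * np.exp(r * delta))
l_hat = lambda z, w: (max(kappa - np.exp(z[-1]), 0.0) - w) ** 2
gamma = np.linspace(-10.0, 15.0, 400)                     # uniform wealth grid with 400 points
y0 = np.log(110.0)
grids = [np.array([[1.0, y0]]),                           # toy grids: first column is a placeholder filter
         np.array([[1.0, y0 - 0.1], [1.0, y0], [1.0, y0 + 0.1]])]
mats = [np.array([[0.3, 0.4, 0.3]])]                      # hypothetical transition row
u0, policy = control_dp(l_hat, F, grids, mats, gamma, actions=[-0.5, 0.0, 0.5])
```

Replacing the finite action set by a one-dimensional line search on a bounded interval changes only the inner loop over `a`; the projection step and the weighted sum over $j$ are unchanged.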
For $w_0 \in \Gamma$, the solution $J_{opt} = u_0(\mu, y_0, w_0)$ to our control problem is then approximated by $J_{quant} = \hat u_0(\mu, y_0, w_0)$. Moreover, this backward dynamic programming scheme allows us to compute at each time $k = 0, \ldots, n-1$, an approximate control $\hat\alpha_k(z, w)$, $z \in z_k$, $w \in \Gamma$, by taking the infimum in (6.5). An error estimate between $J_{opt}$ and $J_{quant}$ in terms of the quantization errors $\|Z_k - \hat Z_k\|_2$ for $Z_k = (\Pi_k, Y_k)$, the spatial step $\nu$, and the grid size $L$ for $(W_k)$ is stated in [5]. By combining with Zador's theorem, this provides a rate of convergence of order
\[
C(n) \Big( \nu + \frac{1}{L} + \frac{1}{N^{\frac{1}{m-1+d}}} \Big).
\]
Numerical illustration : Mean-variance hedging in a partially observed stochastic volatility model

In the setting of the stochastic volatility model described in paragraph 6.1, we consider the mean-variance hedging of a put option. The logarithm of the observed stock price is $Y = \ln S$, its unobservable volatility is $X$, and the wealth process $W$, controlled by the number of shares $\alpha$ invested in stock, is governed by :
\[
W_{k+1} = W_k e^{r\delta} + \alpha_k \big( e^{Y_{k+1}} - e^{Y_k} e^{r\delta} \big),
\]
where $r$ is the constant interest rate, and $\delta > 0$ is the interval between two trading dates. The dynamics of $(X, Y)$ is given by (6.2)-(6.3). Given a put
option of payoff $(\kappa - e^{Y_n})_+$ at maturity $n$, the investor's objective is defined by the control problem :
\[
\inf_{\alpha \in \mathcal{A}} E\Big[ \big( (\kappa - e^{Y_n})_+ - W_n \big)^2 \Big].
\]

Figure 3. Quadratic hedging of a European put: graph of $w_0 \mapsto \inf_{\alpha \in \mathcal{A}} E\big[ \big( (\kappa - e^{Y_n})_+ - W_n \big)^2 \big]$ for different quantization grid sizes ($N = 300, 600, 1500$) and a fixed uniform grid size ($N_W = 400$).

Table 4. Quadratic hedging of a European put: European put price (defined as the initial capital minimizing the risk) and optimal control strategy calculated for different quantization grid sizes ($N = 300, 600, 1500$) and a fixed uniform grid size ($N_W = 400$).

N        European put price    Optimal control strategy $\alpha_0$
300      3.04132               -0.2813
600      3.05965               -0.2813
1500     3.07098               -0.2813
We perform numerical tests with :
- Price and put option parameters : $r = 0.05$, $S_0 = 110$, $\kappa = 110$,
- Volatility parameters : $\lambda = 1$, $\eta = 0.1$, $\Delta = 0.05$, $X_0 = 0.15$,
- Quantization of $(Z_k) = (\Pi_k, Y_k)$ : grids are of the same size $N$, fixed for each time period with step $\delta = \frac{1}{n}$. Unless otherwise specified, we choose $n = 5$.
- Discretization of $(W_k)$ : we use an $N_W$-point grid defined by $\Gamma = (2\nu)\mathbb{Z} \cap [L_{inf}, L_{sup}]$ with $L_{inf} = -10$, $L_{sup} = 15$ and so $\nu = \frac{25}{2(N_W - 1)}$.
Figure 4. Quadratic hedging of a European put: graph of $w_0 \mapsto \inf_{\alpha \in \mathcal{A}} E\big[ \big( (\kappa - e^{Y_n})_+ - W_n \big)^2 \big]$ for different fixed uniform grid sizes ($N_W = 50, 100, 200, 400$) and a fixed quantization grid size ($N = 300$).
- Approximation of the optimal control : golden section search method (see [9]) on $A = [-1, 1]$.

In order to study the effects of the quantization grid size $N$ and of the uniform grid size $N_W$, we plot the graph of $w_0 \mapsto \inf_{\alpha \in \mathcal{A}} E\big[ \big( (\kappa - e^{Y_n})_+ - W_n \big)^2 \big]$ for different values of $N$ and $N_W$ (Figs. 3 and 4). As expected, the global shape of the graph is parabolic, due to the quadratic hedging criterion that we have used. The minimum is reached at $w_{min}$, which can be considered as the "quadratic hedging price" of our European put option. The corresponding hedging strategies are given in Table 4, and Fig. 5 displays the graph of $\alpha_0$ as a function of the initial wealth $w_0$. We can see that the strategy is nearly constant for $w_0 \in [2, 4]$, where the non-constant values may be due to numerical imprecision. This is consistent with the theoretical result, which shows that the optimal strategy for the mean-variance hedging problem does not depend on the initial wealth when the (discounted) stock price is a martingale, which is the case here. In Fig. 6 and in Table 5, we compare the European put option price under partial and complete observation when we increase the number of observations (i.e. the time step $\delta$ decreases to zero). Denoting by $N_{\Pi,Y}$ the number of grid points used in the partial observation case to make an optimal quantization of the pair $(\Pi, Y)$, by $N_{X,Y}$ the number of grid points used in the total observation case to make an optimal quantization of the
pair $(X, Y)$, and by $L$ the grid size in the discretization of the controlled variable $W$, we recall that the discretization error is of order
\[
N_{\Pi,Y}^{-\frac{1}{d+m-1}} + \nu + \frac{1}{L}
\]
for the partial observation case. For the total observation case we have:
\[
\frac{1}{N_{X,Y}} + \nu + \frac{1}{L},
\]
where $N_{X,Y} = m N_Y$ (see [11]). So, in order to obtain comparable results, given the uniform grid discretizing the variable $W$, we perform an optimal quantization of $(\Pi, Y)$ and $(X, Y)$ by using grid sizes $N_{\Pi,Y}$ and $N_{X,Y} = m N_Y$ such that:
\[
N_Y \sim N_{\Pi,Y}^{\frac{1}{d+m-1}},
\]
where $d = 1$ and $m = 3$. That is why we have chosen $N_{\Pi,Y} = 1500$ and $N_{X,Y} = 45$.

Figure 5. Quadratic hedging of a European put: graph of $w_0 \mapsto \alpha_0(w_0)$ for a quantization grid size of $N = 300$ and a fixed uniform grid size of $N_W = 400$.

Figure 6. Quadratic hedging of a European put: distance between total and partial observation European put prices (defined as the initial capital minimizing the risk) when we increase the number of observations (horizontal axis), so that the time step $\delta$ goes to 0. Grid size for $W$ = 30 points, grid size for $(e^Y, \Pi)$ = 1500 points, grid size for $(e^Y, X)$ = 45 points.

Table 5. Quadratic hedging of a European put: comparison between partial and total observation prices (defined as the initial capital minimizing the quadratic risk) and strategies when we increase the number of observations, so that the time step $\delta$ goes to 0. Grid size for $W$ = 30 points, grid size for $(e^Y, \Pi)$ = 1500 points, grid size for $(e^Y, X)$ = 45 points.

Time step $\delta$    Partial obs. price    Partial obs. strategy    Total obs. price    Total obs. strategy
1/5                   2.9933                -0.2813                  3.24459             -0.2734
1/10                  3.5255                -0.3013                  3.65515             -0.2422
1/20                  3.9501                -0.3215                  4.02799             -0.3614

Figure 7. Partial and total observation option prices as $\delta \to 0$ (total observation price computed with 45 points, partial observation price with 1500 points).
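The matching rule between the two grid sizes is a one-line computation. The short sketch below evaluates it for the values quoted in the text; the variable names are illustrative.

```python
d, m = 1, 3                 # observation dimension and number of hidden states
n_pi_y = 1500               # grid size for the pair (Pi, Y)

# Matching rule N_Y ~ N_{Pi,Y}^{1/(d+m-1)}: order of magnitude of the Y-grid size
n_y_matched = n_pi_y ** (1.0 / (d + m - 1))

# Choice reported in the text: N_{X,Y} = m * N_Y = 45, hence N_Y = 15
n_x_y = 45
n_y_used = n_x_y // m

rate_partial = n_pi_y ** (-1.0 / (d + m - 1))   # quantization term, partial observation
rate_total = 1.0 / n_y_used                     # comparable term, total observation
```

The matched value is about 11.4, so the choice $N_Y = 15$ is of the same order of magnitude, and the two quantization terms (roughly 0.087 and 0.067) are comparable.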
We notice that when the number of observations increases (i.e. $\delta \to 0$), the partial observation price converges to the complete observation price; this is due to the fact that, with observations performed in continuous time, we are able to calculate the volatility, which is given by the quadratic variation of the price process $(e^Y)$. Figure 2 shows that, in a total observation setting, the quadratic risk associated with a given initial wealth is smaller than the corresponding value obtained in the partial observation case. This is consistent with the fact that the filtration generated by the observed price is included in the full information filtration, and consequently the optimal cost function in the partial information case is larger than the one in the full information case.
References
1. Bally, V. and G. Pagès (2003): “A quantization algorithm for solving discrete time multi-dimensional optimal stopping problems”, Bernoulli, 9, 1003–1049.
2. Bensoussan, A. (1992): Stochastic control of partially observable systems, Cambridge University Press.
3. Bensoussan, A. and W. Runggaldier (1987): “An approximation method for stochastic control problems with partial observation of the state: a method for constructing ε-optimal controls”, Acta Appl. Math., 10, 145–170.
4. Bouchard, B., I. Ekeland, and N. Touzi (2004): “On the Malliavin approach to Monte Carlo approximation of conditional expectations”, Finance and Stochastics, 8, 45–71.
5. Corsi, M., H. Pham, and W. Runggaldier (2006): “Numerical approximation by quantization of control problem in finance under partial observations”, to appear in Mathematical modelling and numerical methods in finance, edited by A. Bensoussan and Q. Zhang, special volume of Handbook of numerical analysis.
6. Di Masi, G. B. and W. J. Runggaldier (1987): “An approach to discrete-time stochastic control problems under partial observation”, SIAM J. Control & Optimiz., 25, 38–48.
7. Graf, S. and H. Luschgy (2000): Foundations of quantization for random vectors, Lecture Notes in Mathematics no. 1730, Springer, Berlin, 230 pp.
8. Kushner, H. J. and P. Dupuis (1992): Numerical Methods for Stochastic Control Problems in Continuous Time, Springer, New York.
9. Luenberger, D. (1984): Linear and nonlinear programming, Addison-Wesley.
10. Pagès, G., H. Pham, and J. Printems (2004): “Optimal quantization methods and applications to numerical problems in finance”, Handbook of computational and numerical methods in finance, ed. S. Rachev, Birkhäuser.
11. Pham, H., W. Runggaldier, and A. Sellami (2005): “Approximation by quanti-