Stochastic Mechanics
Random Media
Signal Processing and Image Synthesis
Mathematical Economics and Finance
Stochastic Optimization
Stochastic Control
Stochastic Models in Life Sciences
Stochastic Modelling and Applied Probability (Formerly: Applications of Mathematics)
14
Edited by B. Rozovskiĭ G. Grimmett Advisory Board D. Dawson D. Geman I. Karatzas F. Kelly Y. Le Jan B. Øksendal G. Papanicolaou E. Pardoux
N. V. Krylov
Controlled Diffusion Processes Translated by A. B. Aries
Reprint of the 1980 Edition
Author Nicolai V. Krylov School of Mathematics 127 Vincent Hall University of Minnesota Minneapolis, MN 55455 USA
[email protected] Managing Editors B. Rozovskiĭ Division of Applied Mathematics Brown University 182 George St Providence, RI 02912 USA
[email protected]
G. Grimmett Centre for Mathematical Sciences University of Cambridge Wilberforce Road Cambridge CB3 0WB United Kingdom
[email protected]
ISBN 978-3-540-70913-8 e-ISBN 978-3-540-70914-5 DOI 10.1007/978-3-540-70914-5
Stochastic Modelling and Applied Probability ISSN 0172-4568 Library of Congress Control Number: 2008934473 Mathematics Subject Classification (2000): 93E20, 60J60, 60H10, 60H20 Soft cover reprint of the 1980 edition Translated from the Russian Edition published by Nauka, Moscow 1977 © 2009 Springer-Verlag Berlin Heidelberg © 1980 Springer-Verlag New York Inc. This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Cover design: WMXDesign GmbH, Heidelberg Printed on acid-free paper springer.com
Preface
Stochastic control theory is a relatively young branch of mathematics. The beginning of its intensive development falls in the late 1950s and early 1960s. During that period an extensive literature appeared on optimal stochastic control using the quadratic performance criterion (see references in Wonham [76]). At the same time, Girsanov [25] and Howard [26] made the first steps in constructing a general theory, based on Bellman's technique of dynamic programming, developed by him somewhat earlier [4]. Two types of engineering problems engendered two different parts of stochastic control theory. Problems of the first type are associated with multistep decision making in discrete time, and are treated in the theory of discrete stochastic dynamic programming. For more on this theory, we note, in addition to the work of Howard and Bellman mentioned above, the books by Derman [8], Mine and Osaki [55], and Dynkin and Yushkevich [12]. Another class of engineering problems which encouraged the development of the theory of stochastic control involves time-continuous control of a dynamic system in the presence of random noise. The case where the system is described by a differential equation and the noise is modeled as a time-continuous random process is the core of the optimal control theory of diffusion processes. This book deals with this latter theory. The mathematical theory of the evolution of a system usually begins with a differential equation of the form
ẋ_t = f(t, x_t)  (1)

with respect to the vector of parameters x of such a system. If the function f(t,x) can be measured or completely defined, no stochastic theory is needed. However, it is needed if f(t,x) varies randomly in time or if the errors of measuring this vector cannot be neglected. In this case f(t,x) is, as a rule, representable as b(t,x) + σ(t,x)ξ_t, where b is a vector, σ is a matrix, and ξ_t is a random vector process. Then

ẋ_t = b(t, x_t) + σ(t, x_t)ξ_t.

It is convenient to write the equation in the integral form

x_t = x_0 + ∫_0^t b(s, x_s) ds + ∫_0^t σ(s, x_s) dξ_s,  (2)
where x_0 is the vector of the initial state of the system. We explain why Eq. (2) is preferable to Eq. (1). Usually, one tries to choose the vector of parameters x_t of the system in such a way that knowledge of it at time t enables one to predict the probabilistic behavior of the system after time t with the same certainty (or uncertainty) as would knowledge of the entire prior trajectory x_s (s ≤ t). Such a choice of parameters is convenient because the vector x_t contains all the essential information about the system. It turns out that if x_t has this property, it can be proved under rather general conditions that the process ξ_t in (2) can be taken to be a Brownian motion process or, in other words, a Wiener process w_t. The derivative of ξ_t is then the so-called "white noise," but, strictly speaking, this derivative unfortunately cannot be defined and, in addition, Eq. (1) has no immediate meaning. However, Eq. (2) does make sense if the second integral in (2) is defined as an Ito stochastic integral. It is common to say that the process x_t satisfying Eq. (2) is a diffusion process. If, in addition, the coefficients b, σ of Eq. (2) depend also on some control parameters, we have a "controlled diffusion process." The main subject matter of the book having been outlined, we now indicate how some parts of optimal control theory are related to the contents of the book. Formally, the theory of deterministic control systems can be viewed as a special case of the theory of stochastic control. However, it has its own unique characteristics, different from those of stochastic control, and is not considered here. We mention only a few books in the enormous literature on the theory of deterministic control systems: Pontryagin, Boltyansky, Gamkrelidze, and Mishchenko [60] and Krassovsky and Subbotin [27]. A considerable number of works on controlled diffusion processes deal with control problems of linear systems of type (2) with a quadratic performance criterion.
Besides Wonham [76] mentioned above, we can also mention Astrom [2] and Bucy and Joseph [7], as well as the literature cited in those books. We note that the control of such systems necessitates the construction of the so-called Kalman-Bucy filters. For applications of filtering theory to control it is appropriate to mention Liptser and Shiryayev [51]. Since the theory of linear control systems with quadratic performance index is represented well in the literature, we shall not discuss it here.
Control techniques often involve rules for stopping the process. A general and rather sophisticated theory of optimal stopping rules for Markov chains and Markov processes, developed by many authors, is described by Shiryayev [69]. In our book, problems of optimal stopping also receive considerable attention. We consider such problems for controlled processes with the help of the method of randomized stopping. It must be admitted, however, that our theory is rather crude compared to the general theory presented in [69]: in the special case of controlled diffusion processes, imposing on the system only simply verifiable (and therefore crude) restrictions, we attempt to obtain strong assertions on the validity of the Bellman equation for the payoff function. Concluding the first part of the Preface, we emphasize that the main aim of the book is to prove the validity of the Bellman differential equations for payoff functions, as well as to develop (with the aid of such equations) rules for constructing control strategies which are close to optimal for controlled diffusion processes. A few remarks on the structure of the book may be helpful. The literature cited so far is not directly relevant to our discussion. References of more direct relevance to the subject of the book are given in the course of the presentation of the material, and also in the notes at the end of each chapter. We have discussed only the main features of the subject of our investigation. For more detail, we recommend Section 1 of Chapter 1, as well as the introductions to Chapters 1-6. The text of the book includes theorems, lemmas, and definitions, which are numbered according to a single system within each section. Thus, a reference to Theorem 3.1.5 means assertion 5 of Section 1 of Chapter 3. Within Chapter 3, Theorem 3.1.5 is referred to as Theorem 1.5, and within Section 1, simply as Theorem 5.
The formulas are numbered in a similar way. The initial constants appearing in the assumptions are, as a rule, denoted by K_i, a_i. The constants in the assertions and in the proofs are denoted by the letter N, with or without numerical subscripts. In the latter case it is assumed that in each new formula this constant is, generally speaking, unique to the formula and is to be distinguished from the previous constants. If we write N = N(K_i, a_i, ...), this means that N depends only on what is inside the parentheses. The discussion of the material in each section is carried out under the assumptions listed at the start of the section. Occasionally, in order to avoid the cumbersome formulation of lemmas and theorems, additional assumptions are given prior to the lemmas and theorems rather than in them. Reading the book requires familiarity with the fundamentals of stochastic integral theory. Some material on this theory is presented in Appendix 1. The Bellman equations which we shall investigate are related to nonlinear partial differential equations. We note in this connection that we do not
assume the reader to be familiar with the results related to differential equation theory. In conclusion, I wish to express my deep gratitude to A. N. Shiryayev and all participants of the seminar at the Department of Control Probability of the Interdepartmental Laboratory of Statistical Methods of Moscow State University for their assistance in the work on this book, and for their useful criticism of the manuscript. N. V. Krylov
Contents
Notation
1 Introduction to the Theory of Controlled Diffusion Processes
1. The Statement of Problems-Bellman's Principle-Bellman's Equation
2. Examples of the Bellman Equations-The Normed Bellman Equation
3. Application of Optimal Control Theory-Techniques for Obtaining Some Estimates
4. One-Dimensional Controlled Processes
5. Optimal Stopping of a One-Dimensional Controlled Process
Notes
2 Auxiliary Propositions
1. Notation and Definitions
2. Estimates of the Distribution of a Stochastic Integral in a Bounded Region
3. Estimates of the Distribution of a Stochastic Integral in the Whole Space
4. Limit Behavior of Some Functions
5. Solutions of Stochastic Integral Equations and Estimates of the Moments
6. Existence of a Solution of a Stochastic Equation with Measurable Coefficients
7. Some Properties of a Random Process Depending on a Parameter
8. The Dependence of Solutions of a Stochastic Equation on a Parameter
9. The Markov Property of Solutions of Stochastic Equations
10. Ito's Formula with Generalized Derivatives
Notes
3 General Properties of a Payoff Function
1. Basic Results
2. Some Preliminary Considerations
3. The Proof of Theorems 1.5-1.7
4. The Proof of Theorems 1.8-1.11 for the Optimal Stopping Problem
Notes
4 The Bellman Equation
1. Estimation of First Derivatives of Payoff Functions
2. Estimation from Below of Second Derivatives of a Payoff Function
3. Estimation from Above of Second Derivatives of a Payoff Function
4. Estimation of a Derivative of a Payoff Function with Respect to t
5. Passage to the Limit in the Bellman Equation
6. The Approximation of Degenerate Controlled Processes by Nondegenerate Ones
7. The Bellman Equation
Notes
5 The Construction of ε-Optimal Strategies
1. ε-Optimal Markov Strategies and the Bellman Equation
2. ε-Optimal Markov Strategies. The Bellman Equation in the Presence of Degeneracy
3. The Payoff Function and Solution of the Bellman Equation: The Uniqueness of the Solution of the Bellman Equation
Notes
6 Controlled Processes with Unbounded Coefficients: The Normed Bellman Equation
1. Generalizations of the Results Obtained in Section 3.1
2. General Methods for Estimating Derivatives of Payoff Functions
3. The Normed Bellman Equation
4. The Optimal Stopping of a Controlled Process on an Infinite Interval of Time
5. Control on an Infinite Interval of Time
Notes
Appendices
1. Some Properties of Stochastic Integrals
2. Some Properties of Submartingales
Bibliography
Notation
E_d denotes a Euclidean space of dimension d with a fixed orthonormal basis, x^i the ith coordinate of the point x ∈ E_d (i = 1, 2, ..., d), xy = (x,y) the scalar product of vectors x, y ∈ E_d, and x² = xx the square of the length of x, |x| = √(x²). σ = (σ^{ij}) denotes a matrix with elements σ^{ij}, σ* the transpose of the matrix σ, σy the vector equal to the product of the matrix σ and the vector y, xσy = (x, σy); tr a denotes the trace of the square matrix a, det a the determinant of the matrix a, and

‖σ‖² = tr σσ*,

where ‖σ‖ is said to be the matrix norm of σ. v_{x^i} = ∂v/∂x^i, grad_x v is the vector with the coordinates v_{x^i}, and v_{x^i x^j} = ∂²v/∂x^i ∂x^j. If σ is a matrix of dimension d × d₁ and b is a d-dimensional vector, then

Lv(x) = Σ_{i,j=1}^d a^{ij} v_{x^i x^j}(x) + Σ_{i=1}^d b^i v_{x^i}(x),

where the matrix (a^{ij}) = ½σσ*. δ_{ij} denotes the Kronecker delta; χ_Γ = χ_Γ(x) is the indicator of a set Γ, that is, the function equal to unity on Γ and equal to zero outside Γ. x_{[0,T]} is the graph of the function x_t given on [0,T]. τ ∧ t = min(τ, t), t⁺ = ½(|t| + t). ≡ means equal by definition.
(Ω, ℱ, P) is a probability space; Ω denotes a set whose points are denoted by ω with indices or without, ℱ is a σ-algebra of subsets of Ω, and P is a probability measure on ℱ. Mξ denotes the mathematical expectation of a random variable ξ. (A-a.s.) means almost surely on the set A. (A-a.e.) means almost everywhere on the set A.
1 Introduction to the Theory of Controlled Diffusion Processes
The objective of Chapter 1 is to make the reader familiar with the general concepts, methods, and problems of the theory of controlled random diffusion processes. In Sections 1 and 2 we formulate the basic problems and indicate methods of solution. We omit rigorous proofs, our arguments being of a purely heuristic nature. In our opinion, the simplicity of the ideas at the basic level justifies our confidence in our ability to solve seemingly complex control problems. Many of the assertions in Sections 1 and 2 are proved rigorously further on in the book under appropriate conditions. In Section 3 we first explain heuristically how the notions of Section 2 can be applied in order to obtain estimates of various kinds, later making rigorous calculations for a specific example. Beginning with Section 4 and throughout the sequel we try to adhere strictly to accepted norms of mathematical rigor. Section 4 deals with the special case of one-dimensional controlled processes, and Section 5 deals with the theory of optimal stopping rules for one-dimensional controlled processes. In these two sections we distinguish a class of problems for which the main conclusions made in Sections 1 and 2 hold true. In order to comprehend the material of Chapter 1, the reader ought to be familiar with some results related to the theory of stochastic integrals, the material, for instance, in Sections 1-6 of [23]. For the convenience of the reader, we summarize the fundamentals of stochastic integral theory in Appendix 1.
1. The Statement of Problems-Bellman's Principle-Bellman's Equation

We consider in a Euclidean space E_d (d is the dimension of the space) a random process x_t subject to the dynamic equation

x_t = x + ∫_0^t b(α_s, x_s) ds + ∫_0^t σ(α_s, x_s) dw_s,  (1)
where σ(α,y), b(α,y) are given functions of y ∈ E_d and of a control parameter α, x is the initial value of the process x_t, w_t is a d₁-dimensional Wiener process, and d₁ an integer. Naturally, b(α,y) is a d-dimensional vector: b(α,y) = (b¹(α,y), ..., b^d(α,y)); σ(α,y) is a matrix of dimension d × d₁: σ(α,y) = (σ^{ij}(α,y)). We denote by A the set of admissible controls, i.e., values of the parameter α. Choosing appropriately the random process α_t with values in A, we can obtain various solutions of Eq. (1). We can thereby "control" the process x_t considered. This gives rise to the question as to whether there exists a solution of Eq. (1) for the process {α_t} chosen, and, if this is the case, whether it is unique, that is, whether the process {x_t} can be defined uniquely after {α_t} has been chosen. We put off considering these questions and finding answers to them until later. From the practical viewpoint, it is reasonable to consider that the values of the control process α_s at the time s are to be chosen on the basis of observations of the controlled process {x_t} before time s. In other words, α_s has to be a function of the trajectory x_{[0,s]} = {(t, x_t): 0 ≤ t ≤ s}: α_s = α_s(x_{[0,s]}). Suppose that a cost functional is given for evaluating the control performance. Suppose also that on each trajectory of the process x_t from the time t to the time t + Δt the "cost" is f^{α_t}(x_t) Δt + o(Δt), where f^α(y) is a given function. Then the total loss for the control used is given by
p^α = ∫_0^∞ f^{α_t}(x_t) dt

for each individual trajectory of x_t. Corresponding to the strategy α = {α_s(x_{[0,s]})}, the "mean" loss for the process x_t with initial point x is given by

v^α(x) = M ∫_0^∞ f^{α_t}(x_t) dt.
This gives rise to the problem of finding a strategy α⁰ = {α_s⁰(x_{[0,s]})} such that (for fixed x)

v^{α⁰}(x) = v(x) = inf_α v^α(x).  (2)

In the case where there exists no strategy α⁰ (the lower bound in (2) is not attained), we may wish to construct for each ε > 0 a strategy α^ε = {α_s^ε(x_{[0,s]})} such that v^{α^ε}(x) ≤ v(x) + ε. The strategy α^ε is said to be ε-optimal for the (initial) point x; the strategy α⁰ is said to be optimal for the point x. The function v(x) is called a "performance function," determining which strategy will be of interest to us.
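The loss v^α(x) attached to a given strategy can be estimated by straightforward simulation of Eq. (1). The sketch below is an illustrative Euler-Maruyama discretization: the particular drift b, diffusion σ, running loss f^α, Markov strategy α(x), and the discount factor e^{-t} (borrowed from Section 2 so that the infinite-horizon integral converges) are all assumptions made for the example, not data from the text.

```python
import numpy as np

# Monte Carlo estimate of a (discounted) payoff v^alpha(x) for Eq. (1),
# using the Euler-Maruyama discretization of the controlled diffusion.
# All concrete choices (b, sigma, f, the strategy, T, dt) are illustrative.

rng = np.random.default_rng(0)

def b(a, x):       # drift b(alpha, x): pull toward the origin with gain alpha
    return -a * x

def sigma(a, x):   # diffusion sigma(alpha, x): constant unit noise
    return 1.0

def f(a, x):       # running loss f^alpha(x)
    return x**2 + 0.1 * a**2

def alpha(x):      # a Markov strategy: the control depends on x_t only
    return 1.0

def v_estimate(x0, T=10.0, dt=1e-2, n_paths=2000):
    """Estimate M int_0^T e^{-t} f^{alpha_t}(x_t) dt by Euler-Maruyama."""
    n_steps = int(T / dt)
    x = np.full(n_paths, x0, dtype=float)
    total = np.zeros(n_paths)
    for k in range(n_steps):
        a = alpha(x)
        total += np.exp(-k * dt) * f(a, x) * dt          # accumulate the loss
        dw = np.sqrt(dt) * rng.standard_normal(n_paths)  # Wiener increments
        x += b(a, x) * dt + sigma(a, x) * dw
    return total.mean()

print(v_estimate(0.0))
```

Varying the gain returned by alpha(x) shows how different Markov strategies change the estimated v^α(x), which is the comparison underlying the infimum in (2).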
Setting aside temporarily the questions about the convergence of the integrals which define p^α and v^α(x), we show how to solve the problem of finding v(x) and α^ε using Bellman's principle. Bellman's principle states that

v(x) = inf_α M [ ∫_0^t f^{α_s}(x_s) ds + v(x_t) ]  (3)

for each t ≥ 0. To make the relation (3) clear, we imagine that an interval of time t has elapsed from the initial instant of time. For the interval t the loss is given by

∫_0^t f^{α_s}(x_s) ds.  (4)
The trajectory of the process has reached a point, say y, at the instant of time t. What is to be done after the time t to minimize the total loss? Since the quantity (4) has already been lost, we should find out how to minimize the loss occurring after the instant of time t. We note that increments of the Wiener process after the time t, together with the point y, completely define the behavior of a trajectory of x_s for s ≥ t. The increments of w_s after the time t do not depend on the "past" prior to the time t, and they behave in the same way as the pertinent increments do after the initial instant of time. Furthermore, the coefficients of (1) do not depend explicitly on time. Hence obtaining as a trajectory after the time t, say, a function (y_s: s ≥ t) is equivalent to obtaining a function (y_{s+t}: s ≥ 0) as a trajectory after the initial instant of time (if we start from the point y). We note also that the loss f^α(x) does not depend on time explicitly. Therefore, we can solve the problem of minimizing the loss after the time t, assuming that the trajectory starts from a point y at the initial instant of time. It can readily be seen that the mean loss after the time t, under the condition x_t = y, cannot be smaller than v(y) and can be made arbitrarily close to v(y). Therefore, if we proceed after the time t in the optimal way, the mean loss during the entire control operation will be given by

M [ ∫_0^t f^{α_s}(x_s) ds + v(x_t) ].  (5)
In general, the quantity (5) is smaller than v^α(x). Even if there exists no optimal control, nevertheless we can get arbitrarily close to (5) by changing α_s for s ≥ t. Therefore the lower bounds of (5) and v^α(x) with respect to all strategies coincide, as stated in (3). Further, assume that v is a sufficiently smooth function. Applying Ito's formula to v(x_t), we have¹

Mv(x_t) = v(x) + M ∫_0^t L^{α_s} v(x_s) ds,

¹ Recall that x_0 = x.
where

L^α v(x) = Σ_{i,j=1}^d a^{ij}(α,x) v_{x^i x^j}(x) + Σ_{i=1}^d b^i(α,x) v_{x^i}(x),  a(α,x) = (a^{ij}(α,x)) = ½σ(α,x)σ*(α,x).
Therefore, it follows from Bellman's principle that
0 = inf_α { M [ ∫_0^t f^{α_s}(x_s) ds + v(x_t) ] − v(x) } = inf_α M ∫_0^t [ f^{α_s}(x_s) + L^{α_s} v(x_s) ] ds,

where we divide all the expressions by t and let t tend to zero, obtaining thereby the equation

inf_{α∈A} [ L^α v(x) + f^α(x) ] = 0.  (6)
Equation (6) is known as Bellman's differential equation for the optimal control problem under consideration. We started from v^α(x) and arrived at v(x) and, then, at Eq. (6). We can proceed in the backward direction as well, namely, we can show that if a function w satisfies Bellman's equation, the function w coincides with v. In doing so we can also see how to find optimal and ε-optimal strategies with the aid of Bellman's equation. For the function w

inf_{α∈A} [ L^α w + f^α ] = 0.  (7)

Therefore −L^α w ≤ f^α; using Ito's formula we obtain

w(x) ≤ M w(x_t) + M ∫_0^t f^{α_s}(x_s) ds.
We pass to the limit in the last inequality as t → ∞, assuming that, for the function w(x) and the process x_t, for any strategy α as t → ∞

M w(x_t) → 0,  M ∫_0^t f^{α_s}(x_s) ds → M ∫_0^∞ f^{α_s}(x_s) ds.

This yields²

w(x) ≤ v(x).

² We note that a sufficient condition for the inequality w ≤ v to be satisfied is that the left-hand side of (7) be nonnegative.

We show, in turn, that w(x) ≥ v(x). We assume that for each x the lower bound in (7) can be attained for some α = α⁰(x). We assume further that there
exists a solution of the equation

x_t⁰ = x + ∫_0^t σ(α⁰(x_s⁰), x_s⁰) dw_s + ∫_0^t b(α⁰(x_s⁰), x_s⁰) ds.

Since −L^{α⁰(x)} w(x) = f^{α⁰(x)}(x), by Ito's formula

w(x) = M w(x_t⁰) + M ∫_0^t f^{α⁰(x_s⁰)}(x_s⁰) ds,

from which it follows for the strategy α⁰ = {α⁰(x_s)} as t → ∞ that

w(x) = M ∫_0^∞ f^{α⁰(x_s⁰)}(x_s⁰) ds = v^{α⁰}(x) ≥ v(x).
Therefore, w(x) = v(x) and α⁰ is the optimal strategy (for any point x). In the case where the lower bound in (7) cannot be attained, in order to prove the inequality w(x) ≥ v(x) we take a function g(x) > 0 such that for arbitrary x and strategy α

M ∫_0^∞ g(x_t) dt ≤ 1.

For ε > 0 we find the function α^ε(x) from the condition

L^{α^ε(x)} w(x) + f^{α^ε(x)}(x) ≤ ε g(x),

and consider the strategy α^ε = {α^ε(x_s)}. Let x_t^ε be the process corresponding to the strategy α^ε and starting from a point x. By Ito's formula

w(x) ≥ M w(x_t^ε) + M ∫_0^t f^{α^ε(x_s^ε)}(x_s^ε) ds − ε M ∫_0^t g(x_s^ε) ds.

Therefore, w(x) ≥ v^{α^ε}(x) − ε ≥ v(x) − ε, i.e., we have again that v(x) = w(x), and the strategy α^ε is ε-optimal (for any point x). In fact, Bellman's equation provides a technique for finding the performance function v(x) as well as optimal and ε-optimal strategies. We note that the ε-optimal strategies constructed above determine the choice of a control at time t on the basis of the instantaneous value of x_t rather than the entire segment of the trajectory x_{[0,t]}. In other words, these strategies are characterized by the fact that the control at a point y is always the same, namely, it is equal to α^ε(y), regardless of the way and the instant of time at which the trajectory arrived at the point y. Intuitive reasoning suggests that we could have restricted ourselves to the aforementioned strategies from the very beginning. Indeed, the knowledge of how the trajectory has arrived at the point y cannot help us, by any means, to influence the "future" behavior of the trajectory x_t, because the increments of the process w_t which determine this behavior do not depend on the "past." Furthermore, the "cost" we have to pay after the trajectory has arrived at
the point y is not a function of the preceding segment of the trajectory. If it is therefore advantageous, for any reason, to use a control at least once after the trajectory has reached the point y, it will be advantageous, for the same reasons, to use this control each time the trajectory reaches y. A strategy of the form {α(x_t)} is said to be Markov, since its corresponding process is Markov: the behavior of the latter after the instant of time t depends only on the position at the time t and does not depend on the "prehistory." Therefore, one need seek optimal and ε-optimal strategies only among Markov strategies. However, it turns out that in order to prove the validity of our heuristic arguments justifying Bellman's equation, we need to consider all possible strategies. For example, in explaining Bellman's principle it was important that the controls applied after an instant of time t did not depend on the preceding ones. It becomes possible to apply various controls at a point y before time t as well as after time t. It turns out that it is sometimes convenient to broaden the notion of a strategy. Taking a strategy α = {α_s(x_{[0,s]})} and solving Eq. (1), we obtain a process x_s dependent on the trajectory w_s: x_s = x_s(w_{[0,s]}). Putting this solution into the expression for α_s, we write α_s as β_s(w_{[0,s]}). It is clearly desirable to include all processes β_s = β_s(w_{[0,s]}) with values in A in the set of the strategies considered, and to determine the resulting controlled process we solve in fact the equation

x_t = x + ∫_0^t b(β_s(w_{[0,s]}), x_s) ds + ∫_0^t σ(β_s(w_{[0,s]}), x_s) dw_s.
It can be seen that admitting strategies of the form β_s(w_{[0,s]}) is equivalent to being allowed to choose a control on the basis of observations of the process w_t, the observations of the process w_t providing us simultaneously with all the data about the process x_t. It can also be seen that including these strategies in the set of all admissible strategies leaves intact the preceding arguments concerning Bellman's equation as well as Markov strategies. In particular, the inclusion of the new strategies does not decrease the performance function, which can thus still be approximated with the aid of Markov strategies. We make now a few remarks on the structure of Bellman's equation which may simplify the notation in some cases. Equation (6) can be written in a more expanded form as
inf_{α∈A} [ Σ_{i,j=1}^d a^{ij}(α,x) v_{x^i x^j}(x) + Σ_{i=1}^d b^i(α,x) v_{x^i}(x) + f^α(x) ] = 0,  (8)
where x assumes values from the space in which the controlled process (1) takes its values, and b^i(α,x) is the velocity of the deterministic component of the motion of the ith coordinate of the process, when the process is at a point x and a control α is applied. The matrix a(α,x) = (a^{ij}(α,x)) = ½σ(α,x)σ*(α,x) is symmetric, i.e., a* = [½σσ*]* = ½σσ* = a, and nonnegative definite, i.e., (aλ,λ) = ½(σ*λ, σ*λ) = ½|σ*λ|² ≥ 0, and, in addition,
characterizes the diffusion component of the process. Furthermore, as can easily be seen,

a^{dd}(α,x) = ½ Σ_{k=1}^{d₁} (σ^{dk}(α,x))²;

therefore a^{dd}(α,x) = 0 for each x, α if and only if σ^{dk}(α,x) = 0 for all k = 1, ..., d₁, x, α, i.e., if the last coordinate of the process has no diffusion component. In this case a^{id}(α,x) = 0, a^{di}(α,x) = 0, and the first summand in (8) becomes

Σ_{i,j=1}^{d−1} a^{ij}(α,x) v_{x^i x^j}(x).
1. Exercise
Let σ₁(α,x), σ₂(α,x) be two square matrices of dimension d × d. Construct a new matrix σ₃(α,x) having d rows and 2d columns by letting the first d columns of σ₃(α,x) form the matrix σ₁(α,x) and the last d columns form the matrix σ₂(α,x). We denote by a_i(α,x) the matrices ½σ_i(α,x)σ_i*(α,x) corresponding to σ_i(α,x). Prove that

a₃(α,x) = a₁(α,x) + a₂(α,x).
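The claim of the exercise amounts to the block-matrix identity σ₃σ₃* = σ₁σ₁* + σ₂σ₂*, which is easy to confirm numerically. The matrices below are random examples chosen for the check; nothing about them comes from the text.

```python
import numpy as np

# Check: if sigma3 = [sigma1 | sigma2] (d rows, 2d columns), then
# a3 = (1/2) sigma3 sigma3* equals a1 + a2.  Random illustrative matrices.
rng = np.random.default_rng(1)
d = 3
s1 = rng.standard_normal((d, d))
s2 = rng.standard_normal((d, d))
s3 = np.hstack([s1, s2])                  # first d columns s1, last d columns s2

a = lambda s: 0.5 * s @ s.T               # a_i = (1/2) sigma_i sigma_i*
assert np.allclose(a(s3), a(s1) + a(s2))  # the assertion of the exercise
print("a3 = a1 + a2 confirmed on a random example")
```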
2. Examples of the Bellman Equations-The Normed Bellman Equation

The examples given in Section 2 are intended to show that, in spite of the rather specialized form of the controlled system (1.1) and of the performance functional, many stochastic control problems can be reduced to the problem examined in Section 1. Suppose that we need to maximize M p^α instead of minimizing it. Since

−sup_α M ∫_0^∞ f^{α_t}(x_t) dt = inf_α M ∫_0^∞ [ −f^{α_t}(x_t) ] dt,

the payoff function

v₁(x) ≡ sup_α M ∫_0^∞ f^{α_t}(x_t) dt

satisfies the following Bellman equation:

0 = inf_{α∈A} [ L^α(−v₁) + (−f^α) ] = −sup_{α∈A} [ L^α v₁ + f^α ],

which yields

sup_{α∈A} [ L^α v₁ + f^α ] = 0.  (1)
We note that in the minimization problem we derived Bellman's equation, (1.6), which contained inf; in the maximization problem the analogous equation (1) contains sup. In some cases, in order to ensure the existence of the functional p^α, i.e., the convergence of the corresponding integral, one needs to introduce "discounting." For example, we consider the problem of finding

v₂(x) = sup_α M ∫_0^∞ e^{−t} f^{α_t}(x_t) dt,  (2)

where x_t is a solution of Eq. (1.1). If f^α(x) is a bounded function, the integral in (2) exists. The multiplicative factor e^{−t} is said to be a discounting factor and can be interpreted as the probability that the trajectory of the process does not vanish before the instant of time t, so that in fact we obtain the payoff f^{α_t}(x_t) dt during the interval of time from t up to t + dt. We show how to reduce the last problem to the previous one. For x ∈ E_d, y ∈ (−∞,∞), we put f^α(x,y) = e^{−y} f^α(x) and consider in E_d × E₁ a controlled process whose first d coordinates move according to Eq. (1.1) and whose last coordinate is subject to the following "equation": y_t = y + t. Let

v₂(x,y) = sup_α M ∫_0^∞ f^{α_t}(x_t, y_t) dt.

It is seen that v₂(x,y) = e^{−y} v₂(x), and if our conclusions about the Bellman equation are valid, then

sup_{α∈A} [ Σ_{i,j=1}^d a^{ij}(α,x) v₂_{x^i x^j}(x,y) + Σ_{i=1}^d b^i(α,x) v₂_{x^i}(x,y) + v₂_y(x,y) + f^α(x,y) ] = 0.

Putting v₂(x,y) = e^{−y} v₂(x), f^α(x,y) = e^{−y} f^α(x) into the last equality and canceling out e^{−y}, we find

sup_{α∈A} [ Σ_{i,j=1}^d a^{ij}(α,x) v₂_{x^i x^j}(x) + Σ_{i=1}^d b^i(α,x) v₂_{x^i}(x) − v₂(x) + f^α(x) ] = 0.

By the same token, for

v₃(x) = sup_α M ∫_0^∞ f^{α_t}(x_t) exp[ −∫_0^t c^{α_s}(x_s) ds ] dt,  (3)

where c^α(x) is a given function of (α,x), we arrive at the equation

sup_{α∈A} [ L^α v₃ − c^α v₃ + f^α ] = 0,  (4)

with the help of the equation y_t = y + ∫_0^t c^{α_s}(x_s) ds for the last coordinate y_t.
We see that introducing the "discounting" factor exp[ −∫_0^t c^{α_s}(x_s) ds ] results in the appearance in Eq. (4) of the expression −c^α v₃, which was absent in (1).
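A discounted Bellman equation of this type can be solved numerically by a Markov-chain approximation: discretize the state, replace the diffusion by a drift-adjusted binomial walk, and iterate the resulting dynamic-programming fixed point. Everything concrete below — the control set A = {−1, 1}, b(α,x) = α, σ = 1, c^α ≡ 1, the payoff f^α(x) = −|x|, the grid, and its crude reflecting boundary — is an assumption made for this sketch, not taken from the text.

```python
import numpy as np

# Value iteration for sup_a [L^a v - v + f^a] = 0 in one dimension,
# via a drift-adjusted binomial walk on a grid (a Kushner-type scheme).

h = 0.1                        # space step; with sigma = 1 take dt = h^2
dt = h * h
xs = np.arange(-3.0, 3.0 + h / 2, h)
controls = [-1.0, 1.0]         # A = {-1, +1}, drift b(a, x) = a
f = lambda a, x: -np.abs(x)    # running payoff (maximization problem)

v = np.zeros_like(xs)
for _ in range(5000):
    up = np.roll(v, -1); up[-1] = v[-1]   # crude reflecting boundaries
    dn = np.roll(v, 1);  dn[0] = v[0]
    best = None
    for a in controls:
        p_up = 0.5 * (1.0 + a * h)        # jump probabilities encode the drift
        cand = f(a, xs) * dt + np.exp(-dt) * (p_up * up + (1.0 - p_up) * dn)
        best = cand if best is None else np.maximum(best, cand)
    if np.max(np.abs(best - v)) < 1e-10:  # contraction => fixed point
        break
    v = best
print(round(float(v[len(xs) // 2]), 3))   # approximate payoff at x = 0
```

Since f ≤ 0 and the optimal drift pushes toward the origin, one expects the computed payoff to be negative everywhere and largest near x = 0.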
1. Exercise
Let b₁(α,x) be a d₁-dimensional vector. Introducing the additional coordinate

y_t = ∫_0^t b₁(α_s, x_s) dw_s − ½ ∫_0^t |b₁(α_s, x_s)|² ds,

explain why the function

v(x) = sup_α M ∫_0^∞ f^{α_t}(x_t) exp{ ∫_0^t b₁(α_s, x_s) dw_s − ½ ∫_0^t |b₁(α_s, x_s)|² ds } dt

satisfies the equation

sup_{α∈A} [ L^α v + (σ(α,x) b₁(α,x), grad_x v) + f^α ] = 0.
2. Exercise
Using Ito's formula, show that if a function α(x) furnishes the upper bound in (1) (or in (3) or in (4)) for each x, the strategy {α(x_t)} is optimal in the corresponding problem.
An important class of optimal control problems is the class of problems of optimal stopping, in which one needs to choose, in addition to the strategy α, a random stopping time τ such that the mean of the functional

∫_0^τ f^{α_t}(x_t) dt + g(x_τ)  (5)

is maximized. Surely, having made the decision to stop at time τ, we stop observing the process after this time τ. Hence we have to make the decision whether or not to stop the process at time t on the basis of observation of the process only up to the time t. In other words, we shall treat Markov times as stopping times. On the set where τ = ∞, we assume, as usual, that g(x_τ) = 0, so that if stopping does not occur, we obtain

∫_0^∞ f^{α_t}(x_t) dt.
It turns out that the problem of optimal stopping can be reduced to the problem mentioned above via the technique of randomized stopping. We will illustrate this by an example of stopping a Wiener process (in (1.1) d₁ = d, σ is a unit matrix, b = 0). Furthermore, we assume that f^α(x) does not depend on α: f^α(x) = f(x). Defining a nonnegative process r_t = r_t(w_{[0,t]}), we prescribe a rule for randomized stopping of the process w_t. Let the trajectory w_t stop with probability r_t Δt + o(Δt) during the interval of time from t up to t + Δt, under
the condition that it has not stopped before the time indicated. Stopping w_t at the instant of time t, we obtain (compare with (5))

∫_0^t f(x + w_s) ds + g(x + w_t).

Then, as is easily seen, the probability that stopping does not occur on an individual trajectory before the time t is equal to exp(−∫_0^t r_s ds) (in particular, exp(−∫_0^∞ r_s ds) for t = ∞). Therefore, the probability that stopping does, in fact, occur in the interval (t, t + Δt) is given by

r_t exp( −∫_0^t r_s ds ) Δt + o(Δt).
Hence the expected payoff on an individual trajectory is given by

∫_0^∞ [ ∫_0^t f(x + w_s) ds + g(x + w_t) ] r_t exp[ −∫_0^t r_s ds ] dt + [ ∫_0^∞ f(x + w_s) ds ] exp[ −∫_0^∞ r_s ds ].
Integrating the last expression by parts, we have

∫_0^∞ [ f(x + w_t) + r_t g(x + w_t) ] exp[ −∫_0^t r_s ds ] dt.

Therefore, using the above technique of randomized stopping, we obtain in the mean with the aid of the process r_t

M ∫_0^∞ f^{r_t}(x + w_t) exp[ −∫_0^t c^{r_s} ds ] dt,

where f^r = f + rg, c^r = r. It can be readily seen that if it is really advantageous to stop the trajectory x + w_t at time t with nonzero intensity r_t, then one can as well stop the process at time t with probability one. Therefore, "randomized" stopping cannot give a better result than "nonrandomized" stopping. On the other hand, an instantaneous stopping rule can be approximated in a reasonable sense by means of randomized stopping if we increase the stopping intensity r_t after the time τ. Hence
v_4(x) = sup_r M ∫_0^∞ f^{r_t}(x + w_t) exp[ −∫_0^t c^{r_s} ds ] dt.
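The integration by parts used above can be checked on deterministic data: with F(t) = ∫_0^t f ds and R(t) = ∫_0^t r ds, the identity reads ∫_0^∞ [F + g] r e^{−R} dt + F(∞)e^{−R(∞)} = ∫_0^∞ [f + rg] e^{−R} dt. A sketch in which smooth functions of t stand in for f(x + w_t), g(x + w_t), r_t along one fixed trajectory (the particular functions are arbitrary samples):

```python
import math

# smooth stand-ins for f(x + w_t), g(x + w_t), r_t along one trajectory
f = lambda t: math.exp(-t)
g = lambda t: math.cos(t)
r = lambda t: 1.0 + 0.5 * math.sin(t)

T, n = 40.0, 100_000     # exp(-R(T)) ~ e^-40: the boundary term is negligible
dt = T / n
F = R = 0.0              # running integrals of f and r
lhs = rhs = 0.0
for i in range(n):
    t = (i + 0.5) * dt                   # midpoint rule
    F_mid = F + f(t) * dt / 2.0
    R_mid = R + r(t) * dt / 2.0
    w = math.exp(-R_mid) * dt
    lhs += (F_mid + g(t)) * r(t) * w     # payoff-at-stopping side
    rhs += (f(t) + r(t) * g(t)) * w      # integrated-by-parts side
    F += f(t) * dt
    R += r(t) * dt

assert abs(lhs - rhs) < 1e-4
```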
2. Examples of the Bellman Equations-The Normed Bellman Equation
Similarly, in the general case

v_5(x) = sup_{α,r} M ∫_0^∞ f^{α_t,r_t}(x_t) exp[ −∫_0^t c^{α_s,r_s} ds ] dt,   (6)

where f^{α,r} = f^α + rg, c^{α,r} = c^α + r. If we regard the pair (α,r) as one control parameter, we can easily notice the similarity between the function v_5 and the payoff functions considered earlier. Hence we can write for v_5 a Bellman equation similar to (4). We note, however, that v_5 does not satisfy a Bellman equation in many cases. The point is that the functions f^{α,r} and c^{α,r} in (6) are not bounded as functions of the control parameter r. With this in mind, let us go back to the derivation of the Bellman equation (1.6). Taking the last limit before (1.6), as t ↓ 0, we assumed that f^{α_t}(x_t) + L^{α_t}v(x_t) is close to f^{α_0}(x) + L^{α_0}v(x) in some sense uniformly with respect to the strategies α. In order for this to be the case, it is necessary at least that the process x_s not move far away from the initial point x in a short interval of time. We therefore assume that the coefficients σ(α,x), b(α,x) are bounded, and furthermore that the function f^α(x) is bounded as well. As we saw above, the scheme which involves "discounting" reduces to a scheme without "discounting" if we introduce an additional coordinate y_t and the equation
For the scheme with "discounting" we have therefore the requirement that c^α(x) be bounded. Therefore, if we wish to consider the controlled process (1.1) with the payoff function v and unbounded σ, b, c, f, we have to apply, in general, methods other than those given in Section 1. One of the methods to be applied is based on a random change of time and enables us to go from the unbounded σ, b, c, f to bounded σ̃, b̃, c̃, f̃. We take a positive function m(α,x) for which the expressions σ̃ = √m σ, b̃ = mb, c̃ = mc, f̃ = mf are bounded functions of (α,x). Let

ψ_t = ∫_0^t [m(α_s, x_s)]^{-1} ds,

and suppose that, because of some special features of the controlled process, we can consider only those strategies α for which ψ_t < ∞ for each t and ψ_∞ = ∞. Then the function τ_t, the inverse of the function ψ_t, will be defined on [0,∞), and τ_∞ = ∞.
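The effect of this time change is transparent for deterministic data: with ψ_t = ∫_0^t m(u)^{-1} du, τ = ψ^{-1}, f̃ = mf, and c̃ = mc, the substitution t = τ_s leaves the discounted integral unchanged. A numerical sketch (the particular m, c, f below are arbitrary positive samples, not from the text):

```python
import math

m = lambda t: 2.0 + math.sin(t)        # positive normalizing function (arbitrary sample)
c = lambda t: 0.5 + 0.2 * math.cos(t)  # "discount" rate (arbitrary sample)
f = lambda t: math.exp(-0.3 * t)       # integrand (arbitrary sample)

T, n = 60.0, 100_000                   # exp(-int c) ~ e^-18 at T = 60: tail negligible
dt = T / n

# left side: integral over the original time t
C = 0.0
lhs = 0.0
for i in range(n):
    t = (i + 0.5) * dt
    C_mid = C + c(t) * dt / 2.0
    lhs += math.exp(-C_mid) * f(t) * dt
    C += c(t) * dt

# right side: integral over the new time s, where tau' = m(tau), tau(0) = 0,
# with f~ = m f and c~ = m c evaluated along tau
rhs = 0.0
tau = 0.0
Cs = 0.0
ds = 3e-4
while tau < T:
    tau_mid = tau + m(tau) * ds / 2.0
    Cs_mid = Cs + c(tau_mid) * m(tau_mid) * ds / 2.0
    rhs += math.exp(-Cs_mid) * m(tau_mid) * f(tau_mid) * ds
    Cs += c(tau_mid) * m(tau_mid) * ds
    tau += m(tau_mid) * ds

assert abs(lhs - rhs) < 1e-3
```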
Replacing the variable t = τ_t and assuming that β_t = α_{τ_t}, z_t = x_{τ_t}, we have

∫_0^∞ exp{ −∫_0^t c^{α_s}(x_s) ds } f^{α_t}(x_t) dt = ∫_0^∞ exp{ −∫_0^s c̃^{β_u}(z_u) du } f̃^{β_s}(z_s) ds,
where the process ξ_t is given by the formula

By [23, Chapter 1, §3, Theorem 3], the process ξ_t is a Wiener process. Therefore, it is quite likely that
v_3(x) = sup_β M ∫_0^∞ f̃^{β_s}(z_s) exp[ −∫_0^s c̃^{β_u}(z_u) du ] ds
and v_3 satisfies the equation

sup_{α∈A} [ Σ_{i,j} ã^{ij}(α,x) v_3_{x_i x_j}(x) + Σ_i b̃^i(α,x) v_3_{x_i}(x) − c̃^α(x) v_3(x) + f̃^α(x) ] = 0,

where ã = ½ σ̃ σ̃* = m a. In other words,

sup_{α∈A} m(α,x) [ L^α v_3(x) + f^α(x) ] = 0,   (7)

which is known as the "normed" Bellman equation, and which differs from the Bellman equation (4) by the presence of the normalizing multiplier m(α,x). We have deduced Eq. (7) to take care of the unboundedness of the functions σ, b, c, f. If these functions σ, b, c, f are bounded, we can take as the normalizing multiplier m(α,x) a function identically equal to unity. Then √m σ, mb, mc, mf are bounded, and (7) is equivalent to (4). In order to see what the normed Bellman equation gives for v_5, we assume that the functions f^α(x) and g(x) in (6) are bounded. Let m(α,r,x) = (1 + r)^{-1}. As can readily be seen, the functions √(m(α,r,x)) σ(α,x), m(α,r,x) b(α,x), m(α,r,x) c^{α,r}(x), m(α,r,x) f^{α,r}(x) are bounded, and in conjunction with (6), (7) yields
sup_{α∈A, r≥0} (1 + r)^{-1} [ L^α v_5 + f^α + r(g − v_5) ] = 0.

Assuming ε = 1/(1 + r), we obtain

sup_{ε∈[0,1]} sup_{α∈A} [ ε(L^α v_5 + f^α) + (1 − ε)(g − v_5) ] = 0,   (8)

sup_{ε∈[0,1]} [ ε sup_{α∈A} (L^α v_5 + f^α) + (1 − ε)(g − v_5) ] = 0.
Noting that the expression in square brackets is a linear function of ε, we can easily prove that (8) is equivalent to the combination of the following conditions:
sup_{α∈A} (L^α v_5 + f^α) ≤ 0,

g − v_5 ≤ 0,

sup_{α∈A} (L^α v_5 + f^α) = 0 for g − v_5 < 0.
We have thus obtained three relations for finding the payoff function v_5 in the optimal stopping problem. We can easily write these relations as one equality if we compute the upper bound with respect to ε in (8):

g − v_5 + sup_α [ L^α v_5 + f^α + v_5 − g ]⁺ = 0.
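The reduction of the three relations to one equality is elementary: the bracket in (8) is linear in ε, so its supremum over ε ∈ [0,1] is max(A, B) = B + (A − B)⁺, where A = sup_α(L^α v_5 + f^α) and B = g − v_5. A small numerical check of this identity:

```python
def sup_over_eps(A, B, n=10_001):
    """Brute-force sup over eps in [0, 1] of eps*A + (1-eps)*B."""
    return max(k / (n - 1) * A + (1.0 - k / (n - 1)) * B for k in range(n))

# linear in eps, so the sup is max(A, B) = B + (A - B)^+
for A, B in [(-2.0, -1.0), (3.0, -1.0), (-0.5, 2.0), (0.0, 0.0)]:
    assert abs(sup_over_eps(A, B) - (B + max(A - B, 0.0))) < 1e-12
```

The sup vanishes exactly when A ≤ 0, B ≤ 0, and at least one of them is zero, which is the content of the three relations.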
The last equation is said to be Bellman's equation for the problem of optimal stopping of a controlled process. In the examples given above, the controlled process (1.1) was considered in the entire space E_d. In some cases, however, we regard as essential only the values of the process (1.1) before the instant of time τ_D of first departure of a trajectory of (1.1) from a domain D. For example, we may seek the largest or the smallest value of Mτ_D. The problem of finding the minimum for
is a more general problem, which can be reduced to the original problem in the following way. We change σ(α,x), b(α,x) so that they remain the same in the domain D and are equal to zero outside D. We write the new σ(α,x), b(α,x) thus obtained as σ̃(α,x), b̃(α,x). Furthermore, we assume f̃^α(x) = f^α(x) for x ∈ D, f̃^α(x) = g(x) for x ∉ D, c̃(x) = 0 for x ∈ D, c̃(x) = 1 for x ∉ D. Then the process, once having reached the boundary ∂D of the domain D, will remain permanently at the point x_{τ_D}. It is seen that
where x̃_t is the solution of (1.1) with the modified σ(α,x), b(α,x). The Bellman equation for v_6(x) yields

inf_{α∈A} [ Σ_{i,j=1}^d ã^{ij}(α,x) v_6_{x_i x_j}(x) + Σ_{i=1}^d b̃^i(α,x) v_6_{x_i}(x) − c̃(x) v_6(x) + f̃^α(x) ] = 0,

where (ã^{ij}(α,x)) = ½ σ̃(α,x) σ̃*(α,x). In particular, v_6 = g everywhere outside D. We see that the problem of control in the domain D with "discounting" c^α(x), as well as the problem of optimal stopping before first exit from D, can be investigated in the same way.
Finally, another class of optimal control problems, which is included in the scheme considered, is that involving time-varying stochastic equations. Rather frequently, time appears in the coefficients of Eq. (1.1) in an explicit manner. For example, let the controlled process x_t start at time r at a point x, and let this process satisfy the equation
We assume that we need to minimize
Let v_7(x,r) = inf_α v^α(x,r). We take the direct product E_d × E_1 and consider E_1 as a time axis. The process (x_t, t) is at the point (x,r) at the initial instant of control, and this process (x_t, t) will move to the point (x_{r+u}, r + u) with unit velocity along the time axis during a control interval u. Let y_u = (x_{r+u}, r + u), x_{r+u} = y_u^{(1)}, r + u = y_u^{(2)}, β_u = α_{r+u}; then

y_t^{(2)} = y^{(2)} + ∫_0^t 1 ds,

where y^{(1)} = x, y^{(2)} = r, which, in turn, yields

v_7(y^{(1)}, y^{(2)}) = inf_β M ∫_0^∞ f^{β_t}(y_t^{(1)}, y_t^{(2)}) dt

and
Therefore,

inf_{α∈A} [ Σ_{i,j=1}^d a^{ij}(α,x,r) v_7_{x_i x_j}(x,r) + Σ_{i=1}^d b^i(α,x,r) v_7_{x_i}(x,r) + (∂/∂r) v_7(x,r) + f^α(x,r) ] = 0,

or, in other words,
The problems with different performance functionals can be considered in a similar way for a controlled nonstationary process. Thus, for example,
the problem of finding

is equivalent to that of minimizing, for the process y_t,

where g̃(y) = g(y^{(1)}) and τ_T is the instant of time of first departure of y_t from the strip {y: y^{(2)} < T}. Therefore, for r < T
3. Exercise. Let the lower bound in the last equation be attained for each (x,r) at α = α(x,r). Define a nonstationary Markov strategy using the formula α^0 = {α(x_t, t)}. Show that α^0 is an optimal strategy.
4. Exercise. Let S_1 be the sphere of unit radius in E_d, A = {α = (α^{(1)}, α^{(2)}): α^{(1)} ∈ [0,1], α^{(2)} ∈ S_1}, σ(α,x) = α^{(1)} σ(x), where σ(x) is some square matrix of dimension d × d, b(α,x) = α^{(2)}(1 − α^{(1)}), f^α(x) = α^{(1)}(f(x) + 1) − 1, where f(x) is a fixed function. Let τ_D be the first exit time of trajectories of the solution of the equations

from the domain D.
Show that Bellman's equation is equivalent for u(x) to the following relations:

L_0 u + f ≤ 0,   |grad u| ≤ 1,   (|grad u| − 1)(L_0 u + f) = 0.
5. Exercise. Let A = [0,∞). We consider the one-dimensional process

x_t = x + ∫_0^t √(α_s) dw_s
and put
where f is a given negative bounded function. Write the Bellman equation for v and prove that there is no solution to this equation. Prove also that v = 0 and that the function v satisfies the normed Bellman equation.
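The point of this exercise can be previewed at the candidate payoff v ≡ 0 (the data below, f ≡ −1, a(α) = ½α², and the multiplier m(α) = 1/(1 + a(α)), are illustrative choices, not necessarily the book's): the unnormed supremum equals f < 0, so Bellman's equation fails, while the normed supremum tends to 0 as α grows.

```python
f_val = -1.0                        # f is negative and bounded
v_dd = 0.0                          # v'' for the candidate payoff v = 0
a = lambda al: 0.5 * al ** 2        # diffusion coefficient a(alpha), alpha >= 0

grid = [10.0 ** k for k in range(-3, 8)]   # sample of the unbounded control set
unnormed = max(a(al) * v_dd + f_val for al in grid)
normed = max((a(al) * v_dd + f_val) / (1.0 + a(al)) for al in grid)

assert unnormed == -1.0             # sup equals f < 0: no solution of the Bellman equation
assert -1e-7 < normed <= 0.0        # the normed sup tends to 0 as alpha grows
```

This is precisely the unboundedness-in-the-control phenomenon that the normed Bellman equation is designed to absorb.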
3. Application of Optimal Control Theory Techniques for Obtaining Some Estimates

In some problems in diffusion process theory one needs to estimate from above or from below expressions of the form
I = M[ ∫_0^τ exp{ −∫_0^t c_s ds } f(z_t) dt + exp{ −∫_0^τ c_t dt } g(z_τ) ].
We shall show in this section how such estimates can be obtained with the aid of optimal control techniques. The technique we apply in this section has been used for finding most of the estimates given in our book. Let a process z_t be represented as

where τ is the time of first departure of z_t from the domain D, f(z), g(z) are the given functions, and c_t is a random process. Assume that we have imbedded the process z_t into a family of controlled processes. In other words, we consider that the process z_t can be obtained as a solution of the equation
x_t = x + ∫_0^t σ(α_s, x_s) dw_s + ∫_0^t b(α_s, x_s) ds   (1)
for x = z and a strategy ᾱ chosen, say, ᾱ = {ᾱ_t(x_{[0,t]})}. Also, assume that we can find functions c^α(x), f^α(x) so that c^{ᾱ_s}(z_s) = c_s, f^{ᾱ_s}(z_s) = f(z_s). Then I = v^{ᾱ}(z) ≥ v(z), where

v^α(x) = M[ ∫_0^τ f^{α_t}(x_t) exp{ −∫_0^t c^{α_s}(x_s) ds } dt + g(x_τ) exp{ −∫_0^τ c^{α_s}(x_s) ds } ],

v(x) = inf_α v^α(x),
x_t is a solution of Eq. (1), and τ is the first exit time of solutions of (1) from D. As in Section 1.2, we can conjecture that the function v(x) will satisfy the corresponding Bellman equation:
inf_{α∈A} [ L^α v + f^α ] = 0 in D,   v = g on ∂D,   (2)
where
To estimate I from below it remains to find an explicit solution of the boundary value problem (2). If we write this solution as w(x), then w(z) ≤ v^{ᾱ}(z) = I. Note that we do not need to prove here the equality w = v. Hence (compare this with the arguments concerning Eq. (1.7)), in order to estimate I from below, it suffices to solve a less difficult problem,
The feasibility of an explicit solution of Eq. (2) depends, to a great extent, on the form of the solution; that is, it depends on whether the process z_t is adequately incorporated into the system of processes (1). Hence a controlled process is to be introduced so that v(x) depends on some known function of the d coordinates, for example, on |x| rather than on all the d coordinates. In this case (2) can be reduced to an ordinary differential equation. We make a few additional remarks on the inequality w(z) ≤ I. The constant w(z) can be found via some other arguments unrelated to (2) and (3). It is, however, convenient to prove that w(z) ≤ I using (2) (or (3)), that is, using the function w(x) rather than the value of w(x) at z. As we have seen in Section 1.1, for proving the inequality w(z) ≤ v^{ᾱ}(z) we need to apply Ito's formula to a solution of Bellman's equation for an appropriately modified problem in which the performance functional is given by

In the present case Ito's formula is to be applied to
where x_t is the solution of (1) for x = z, α_t = ᾱ_t. Note that the latter expression is equal to w(z_t) exp(−∫_0^t c_s ds). By Ito's formula,

in which, by virtue of (2) (or (3)),
In short,

whence
Since w(z_τ) ≤ g(z_τ), we obtain w(z) ≤ I, letting t → ∞. It should be noted that in the above considerations we used (2) (or (3)) in obtaining (4). In other words, w(z) ≤ I for any function w for which (4) is satisfied and w(z_τ) ≤ g(z_τ). Such functions w are said to be stochastic Lyapunov functions (see [45]). Therefore, a method based on Bellman's equation can also be used for finding stochastic Lyapunov functions. The above considerations imply as well that, having obtained an explicit expression for a function w such that (4) is satisfied, we do not need to use the Bellman equation, a performance function, or any other notions from optimal control theory in order to justify the inequality w(z) ≤ I. However, it is easier, as can be seen later, to find an explicit form for the function w if we express this function w as a performance function and apply the Bellman equation. Furthermore, the estimate w(z) ≤ v^{ᾱ}(z) is exact (unimprovable) in the class of processes (1) in the case w = v. We shall illustrate this with an example. Consider the process z_t = z + ∫_0^t σ_s dw_s, where σ_s is a matrix of dimension d × d and w_t is a d-dimensional Wiener process. Let ε < |z| < R. We estimate from above the probability of the event that this process reaches the closure of the sphere S_ε = {x: |x| < ε} before it leaves the sphere S_R. In other words, we need to estimate from above
where g = 1 on ∂S_ε = {x: |x| = ε}, g = 0 on ∂S_R, and τ is the time of first exit of z_t from D = D(ε) = S_R\(S_ε ∪ ∂S_ε). Assume that σ_s is nondegenerate and bounded. Moreover, for all λ ∈ E_d, s ≥ 0, and all ω let

μ|λ|² ≤ (σ_s σ_s* λ, λ) ≤ ν|λ|²,

where μ, ν are constants larger than zero. We take as A the set of all matrices α of dimension d × d such that for all λ ∈ E_d

μ|λ|² ≤ (αα*λ, λ) ≤ ν|λ|².   (6)

For all α ∈ A we put σ(α,x) = α, b(α,x) = 0, c^α(x) = 0, f^α(x) = 0. We consider the controlled process
It is seen that the process z_t belongs to the class of processes given above and that Mg(z_τ) ≤ v(z), where

v(x) = sup_α M g(x_τ^{α,x}).   (7)
Bellman's equation yields for v(x)
We note further that, because the problem is spherically symmetric, the function v depends only on |x|: v(x) = u(|x|), and a(α,x) does not depend on x: a(α,x) = a(α). We obtain from (8) the relation

for u(|x|). We fix x, assume r = |x|, and take an orthogonal matrix T such that x = |x| T e_1, where e_1 = (1,0,…,0). Then (9) becomes

sup_{α∈A} { e_1 ã(α) e_1 u''(r) + (1/r) [ tr ã(α) − e_1 ã(α) e_1 ] u'(r) } = 0,

where ã(α) = T* a(α) T = a(T*α). We note that tr a(α) = tr ã(α), e_1 ã(α) e_1 = ã^{11}(α), and that the matrices T*α run through all of A when α runs through A. Hence

sup_{α∈A} { a^{11}(α) u''(r) + (1/r) [ tr a(α) − a^{11}(α) ] u'(r) } = 0,   (10)

(10) being equivalent to (9) for r = |x|. Equation (10) is an ordinary differential equation of order 2. Let us solve Eq. (10) with respect to the second derivative. We have
Note that a(α) = αα* and that the inequality a^{11}(α) ≥ μ follows from (6) for λ_1 = 1, λ_2 = ⋯ = λ_d = 0. Hence the first factor in (11) cannot approach zero; therefore

u''(r) + (1/r) sup_{α∈A} [ Σ_{i=2}^d (a^{ii}(α)/a^{11}(α)) u'(r) ] = 0.   (12)
It is easily seen that u(r) decreases as r increases, that is, u'(r) ≤ 0, a fact which follows from a version of Bellman's principle. To show this, let ε < r < |x|,
D_1 = S_R\(S_r ∪ ∂S_r); then

u(|x|) = sup_α M v(x_{τ_{D_1}}) = u(r) sup_α P{ |x_{τ_{D_1}}| = r } ≤ u(r).
Therefore, (12) yields

u''(r) + (1/r) u'(r) inf_{α∈A} Σ_{i=2}^d a^{ii}(α)/a^{11}(α) = 0.

The lower bound given above can be easily computed (the a^{ii}(α) are changeable independently for different i in the interval [μ,ν]), and, finally, we obtain

u''(r) + (d − 1)(μ/ν)(1/r) u'(r) = 0,

whence

u(r) = (r^γ − R^γ)/(ε^γ − R^γ),   v(x) = u(|x|),   (13)

where γ = 1 − (d − 1)(μ/ν). The considerations we used in deriving (13) are of a heuristic nature. Hence it remains to prove (13). Keeping in mind the problem stated above, we prove first that
Note that u(r) and v(x) defined in (13) (we assume that γ ≠ 0) are infinitely differentiable functions of their arguments for r > 0, x ≠ 0, respectively. Furthermore, u'(r) < 0. Hence u actually satisfies (12), (11), (10), and (9). Therefore v satisfies (8). The above implies, because v(x) is smooth, that for any t

(compare with (4) and (5)). Letting t go to infinity and taking advantage of Fatou's lemma and the fact that v is nonnegative, we arrive at (14). We have thus solved our main problem. However, it is desirable to know to what extent the estimate (14) is exact and whether it can be strengthened. In other words, we wish to prove that v(x) in (13) is in fact a payoff function. Using Bellman's equation, we find the optimal control α(x). For x_1 = r, x_2 = ⋯ = x_d = 0, the upper bound in (8) can be attained for the same α as that in (9) and (10). As our investigation demonstrates, this upper bound in (10) can be attained by a diagonal matrix a(α) such that a^{ii}(α) = μ for i ≥ 2 and a^{11}(α) = ν. The eigenvectors of this matrix, except for the first vector, are orthogonal to the x_1 axis, with eigenvalues equal to μ; the first eigenvector is along the x_1 axis, with eigenvalue ν. Because of the spherical symmetry of the problem, the upper bound in (8) can be attained, for a different x, on a matrix a_x with one eigenvector being
parallel to x and corresponding to the eigenvalue ν, and the remaining eigenvalues being equal to μ. Therefore, the matrix a_x is characterized by the fact that a_x x = νx, a_x y = μy if (x,y) = 0. Since any vector
and
we have
Therefore,
Assuming σ(x) = √(a_x), we obtain the function σ(x) for which a(σ(x),x) (= a_x) yields the upper bound in (8). We can easily determine σ(x) by noting that the eigenvectors of σ(x) are the same as those of a_x, and that the eigenvalues are equal to √μ and √ν. We have

σ^{ij}(x) = √μ δ^{ij} + (√ν − √μ) x^i x^j / |x|².
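Both (13) and the matrix σ(x) can be checked numerically (the values d = 3, μ = 0.5, ν = 2, R = 4, ε = 0.25 below are arbitrary samples): u solves u'' + (d − 1)(μ/ν)u'/r = 0 with u(ε) = 1, u(R) = 0, and σ(x) = √μ I + (√ν − √μ)xx*/|x|² sends x to √ν x and any y ⊥ x to √μ y.

```python
import math

d, mu, nu, R, eps = 3, 0.5, 2.0, 4.0, 0.25      # arbitrary sample values
gamma = 1.0 - (d - 1) * (mu / nu)
assert abs(gamma) > 1e-12                        # the case gamma = 0 is excluded

def u(r):
    return (r ** gamma - R ** gamma) / (eps ** gamma - R ** gamma)

# boundary values of the exit problem: u = 1 on |x| = eps, u = 0 on |x| = R
assert abs(u(eps) - 1.0) < 1e-12 and abs(u(R)) < 1e-12

# ODE check by central differences: u'' + (d - 1)(mu/nu) u'/r = 0
h = 1e-5
for r in [0.5, 1.0, 2.0, 3.5]:
    u1 = (u(r + h) - u(r - h)) / (2.0 * h)
    u2 = (u(r + h) - 2.0 * u(r) + u(r - h)) / h ** 2
    assert abs(u2 + (d - 1) * (mu / nu) * u1 / r) < 1e-4

# sigma(x) = sqrt(mu) I + (sqrt(nu) - sqrt(mu)) x x*/|x|^2
x = [1.0, 2.0, -2.0]
nx2 = sum(c * c for c in x)
sig = [[math.sqrt(mu) * (i == j) + (math.sqrt(nu) - math.sqrt(mu)) * x[i] * x[j] / nx2
        for j in range(d)] for i in range(d)]
apply_mat = lambda M, v: [sum(M[i][j] * v[j] for j in range(d)) for i in range(d)]
for got, want in zip(apply_mat(sig, x), x):      # sigma x = sqrt(nu) x
    assert abs(got - math.sqrt(nu) * want) < 1e-12
y = [2.0, -1.0, 0.0]                             # (x, y) = 0
for got, want in zip(apply_mat(sig, y), y):      # sigma y = sqrt(mu) y
    assert abs(got - math.sqrt(mu) * want) < 1e-12
```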
The function σ(x) is smooth everywhere except for x = 0. Therefore, if we take the strategy α^0 = {σ(x_t)}, the equation
has a solution uniquely defined up to the time of first entry to the zero point. Applying Ito's formula to the above solution, we have

whence, as t → ∞,

v(x) = M v(x_τ) + lim_{t→∞} M v(x_t) χ_{τ > t} = v^{α^0}(x) + lim_{t→∞} M v(x_t) χ_{τ > t}.
In order to prove that v(x) from (13) is in fact a payoff function, that α^0 is an optimal strategy, and that the estimate (14) is therefore exact, it suffices to prove that the last summand is equal to zero. Since v is a bounded function, we need only show that τ is finite with probability 1. Let g_1(x) = −|x|² + R². By Ito's formula we have
which implies, by Fatou's lemma, that 2Mτ(ν + (d − 1)μ) ≤ g_1(x) and Mτ < ∞. We make a few more remarks on the payoff function v(x) and the optimal process (15). Let γ > 0, i.e., (d − 1)μ < ν, and let D(0) = S_R\{0}. From the equality v^{α^0}(x) = v(x) we obtain

from which it follows that the process (15) reaches the zero point with nonzero probability before it reaches ∂S_R. Furthermore, this probability tends to 1 when the initial point of the process tends to zero. We emphasize the fact that the process x_t is nondegenerate and has no drift.
4. One-Dimensional Controlled Processes

We shall prove in Sections 4 and 5 that a payoff function is twice continuously differentiable and satisfies Bellman's equation if the one-dimensional controlled process is degenerate for no strategy and if, in addition, some assumptions of a technical nature are satisfied. Furthermore, we justify in these sections the rule for finding ε-optimal strategies using Bellman's equation. We wish to explain the relationship between the theory presented in Section 1.4 and the theory of multidimensional controlled processes which will be discussed in the subsequent chapters. One-dimensional controlled processes constitute a particular case of multidimensional controlled processes. Hence the general theory can provide a considerable amount of information about a payoff function, Bellman's equation, and ε-optimal strategies. However, taking advantage of the specific nature of one-dimensional processes, we can prove stronger assertions in some cases. At the same time, we should point out that the results given below do not include everything that follows from the general theory.

Let A be a (nonempty) convex subset of some Euclidean space, and let σ(α,x), b(α,x), c^α(x), f^α(x) be real functions given for α ∈ A, x ∈ (−∞,∞). Assume that c^α(x) ≥ 0, and that σ(α,x), b(α,x), c^α(x), and f^α(x) are bounded and satisfy a Lipschitz condition with respect to (α,x), i.e., there exists a constant K such that for all α, β ∈ A, x, y ∈ E_1
where, as usual, a(α,x) = ½[σ(α,x)]².
Furthermore, we assume that the controlled processes are uniformly nondegenerate, that is, for some constant δ > 0, for all α ∈ A, x ∈ E_1,

a(α,x) ≥ δ.

Let a Wiener process (w_t, F_t) be given on some complete probability space (Ω, F, P), and let the σ-algebras F_t be complete with respect to the measure P.

1. Definition. By a strategy we shall mean a random process α_t(ω) with values in A which is progressively measurable with respect to the system of σ-algebras {F_t}. We denote by 𝔄 the set of all strategies.
To each strategy α ∈ 𝔄 and a point x we set into correspondence a solution x_t^{α,x} of the equation

By Ito's theorem the solution of Eq. (1) exists and is unique. We fix numbers r_1 < r_2 and a function g(x) given for x = r_1, x = r_2. We denote by τ^{α,x} the first exit time of x_t^{α,x} from (r_1,r_2), and we set

v(x) = sup_{α∈𝔄} v^α(x).
We shall frequently need to write mathematical expectations of expressions which repeatedly include the indices α and x, where α is a strategy and x is a point of the interval [r_1,r_2]. We agree to write these indices as subindices or superindices of the expectation sign. For example, we write M_x^α ∫_0^τ f^{α_t}(x_t) dt instead of M ∫_0^{τ^{α,x}} f^{α_t}(x_t^{α,x}) dt, etc. In addition, it is convenient to introduce the notation

φ_t^{α,x} = ∫_0^t c^{α_s}(x_s^{α,x}) ds.

The definition of v^α(x) thus becomes the following:
The definition given above of a strategy enables us to use information about the behavior of the process w_t in controlling the solution of Eq. (1). From the practical point of view, this situation is not natural. Hence we shall consider some other control techniques. Let C[0,∞) be the space of continuous real functions x_t, t ∈ [0,∞); let N_t be the smallest σ-algebra of subsets of C[0,∞) which contains all sets of the form {x_{[0,∞)}: x_s ≤ a}, s ≤ t, a ∈ (−∞,∞).
2. Definition. A function α_t(x_{[0,∞)}) = α_t(x_{[0,t]}) with values in A, given for t ∈ [0,∞), x_{[0,∞)} ∈ C[0,∞), is said to be a natural strategy admissible at the point x ∈ [r_1,r_2] if this function is progressively measurable with respect to {N_t} and if there exists at least one solution of the stochastic equation

which is F_t-measurable for each t. We denote by 𝔄_E(x) the set of all natural strategies admissible at the point x. To each strategy α ∈ 𝔄_E(x) we set into correspondence one (fixed) solution x_t^{α,x} of Eq. (2).
3. Definition. A natural strategy α_t(x_{[0,t]}) is said to be (stationary) Markov if α_t(x_{[0,t]}) = α(x_t) for some function α(x). We denote by 𝔄_M(x) the set of all Markov strategies admissible at the point x.

Note that to each natural strategy α_t(x_{[0,t]}) admissible at x, we can set into correspondence a strategy β ∈ 𝔄 such that x_t^{β,x} = x_t^{α,x}. In fact, let us take a solution x_t(ω) = x_t^{α,x}(ω) of Eq. (2), and let us assume that β_t(ω) = α_t(x_{[0,t]}(ω)). It is seen that {β_t} is a strategy and that the equation dx_t = σ(β_t,x_t) dw_t + b(β_t,x_t) dt with the given initial condition x_0 = x can be satisfied for x_t = x_t^{α,x}. By the uniqueness theorem this equation has no other solutions; therefore, x_t^{β,x} = x_t^{α,x}. It is clear that the inclusions 𝔄_M(x) ⊂ 𝔄_E(x) ⊂ 𝔄 have precise meaning. In order to show that 𝔄_M(x) ≠ ∅, we take a function α(x) with values in A such that |α(x) − α(y)| ≤ N|x − y| for all x, y, for some constant N. Since the composition of functions satisfying a Lipschitz condition also satisfies a Lipschitz condition, there exists a solution of the equation

therefore, {α(x_t)} ∈ 𝔄_M(x). In the same way as we introduced the function v(x) for computing the upper bound over 𝔄, we introduce here, on the basis of 𝔄_E(x), 𝔄_M(x), the functions v_{(E)}(x), v_{(M)}(x). It is seen that

v_{(M)}(x) ≤ v_{(E)}(x) ≤ v(x).
4. Definition. Let ε ≥ 0. A strategy α ∈ 𝔄 is said to be ε-optimal for the point x if v(x) ≤ v^α(x) + ε. A 0-optimal strategy is said to be optimal.
Our objective is to prove the following theorem.

5. Theorem. v_{(M)}(x) = v_{(E)}(x) = v(x) for x ∈ [r_1,r_2], v(r_1) = g(r_1), v(r_2) = g(r_2), v(x) and its derivatives up to and including the second order are continuous on [r_1,r_2],⁵ and v''(x) satisfies a Lipschitz condition on [r_1,r_2]. For all x ∈ [r_1,r_2],

sup_{α∈A} [ a(α,x) v''(x) + b(α,x) v'(x) − c^α(x) v(x) + f^α(x) ] = 0.   (3)
Furthermore, v is the unique solution of (3) in the class of functions which are twice continuously differentiable on [r_1,r_2] and equal to g at the end points of this interval.

In order to prove the theorem, we need four lemmas and additional notation. Let

F[u] = F[u](x) ≡ sup_{α∈A} [ L^α u(x) + f^α(x) ] = F(x, u, u', u'').
6. Lemma. Let a(x), b(x), c(x), f(x) be continuous functions on [r_1,r_2], and let a ≥ δ, c ≥ 0, |a| + |b| + |c| ≤ K on this interval. Then there exists a unique function u(x) which is twice continuously differentiable on [r_1,r_2], is equal to g at the end points of the interval [r_1,r_2], and is such that for all x ∈ [r_1,r_2]

a(x)u''(x) + b(x)u'(x) − c(x)u(x) + f(x) = 0.   (4)

Furthermore,

‖u''‖_B + ‖u'‖_B + ‖u‖_B ≤ N_1(‖f‖_B + 1),   (5)

and, for g(r_1) = g(r_2) = 0,

‖u‖_B ≤ N_2‖f‖_{L_1},   (6)

where N_1 depends only on r_1, r_2, δ, K, g(r_1), g(r_2), and N_2 depends only on r_1, r_2, δ, K.
PROOF. Assertions of this kind are well known from the theory of differential equations (see, for example, [46]). Hence we shall just sketch the proof. First, by considering instead of the function u the function u − ψ, in which ψ is linear on [r_1,r_2] and ψ(r_i) = g(r_i), we convince ourselves that it suffices to prove the lemma for g = 0. Further, we can define a function y(x) explicitly

⁵ By definition, v'(r_1) (v''(r_1)) is assumed to be equal to the limit of v'(x) (v''(x)) as x ↓ r_1. We define v'(r_2), v''(r_2) in a similar way.
such that the replacement of the unknown function with the aid of the formula u(x) = ū(y(x)) turns Eq. (4) into

Dividing both sides of the last equation by a_2, we arrive at the equation

satisfying the boundary conditions ū(0) = ū(1) = 0. Note that c_2(y) ≥ 0 in (7). Analogous properties of the solution of (4) can readily be derived from the properties of (7). Therefore, it suffices to prove our lemma for Eq. (7). Let g_0(x,y) = (x ∧ y)(1 − x ∨ y) and, for λ > 0, let

g_λ(x,y) = (1/(√λ sh √λ)) sh(√λ (x ∧ y)) sh(√λ (1 − x ∨ y)).
In addition, (7) is equivalent to E" - ilE = (c, - A)E - f, and therefore to the equation
Let A = IIc211B[0,1]; then IITAul - TAu211B[0,11 5
llul
- u 2 1 1 B [ 0 , 1 ] ~max S o l g A ( x , ~ ) = d ~&llul - uZ/lB[O,l].
As can easily be verified,
Consequently, T Ais a contraction operator; Eqs. (8),(7),and (4)have unique solutions satisfying zero boundary conditions. We deduce the estimates (5) and (6) for the solution of (7).It follows from (8)that IIEllBIO,l]
max max gA(x,~)llf2/191[0,1] + &llEllB[0,l], x
Y
from which we find IIEllBro,ll 5 Nil f2119110,1,.Further, from (7)we obtain an [ ~ , ~ ] we . obtain the estimate E1 using the represenestimate for I l ~ ' ~ l l ~Finally, tation E'(x) = J", iil'(y)dy,where x, is a point in [0,1] at which ii' = 0. We have thus proved the lemma. 7. Lemma. There exists a constant N depending only on r,, r,, 6, K , and such that Mz: 5 N for each a E a,x E [rl,rz]. In particular, va(x) and v(x) are Jinite functions.
PROOF. We can consider without loss of generality that r_1 = −r_2. Let

Regardless of the fact that w(x) is the difference between two nondifferentiable functions, we can easily verify that w(x) is twice continuously differentiable and that for each a ≥ δ, b ∈ [−K, K], x ∈ [r_1,r_2]

In addition, w ≥ 0 on [r_1,r_2], w(r_i) = 0. By Ito's formula, for each α ∈ 𝔄, x ∈ [r_1,r_2], t ≥ 0

from which we conclude, using properties of the function w, that w(x) ≥ M_x^α(τ ∧ t) and, as t → ∞, w(x) ≥ M_x^α τ.

8. Lemma. Let the function α(x) satisfy a Lipschitz condition. We define a Markov strategy α_t(x_{[0,t]}) by α_t(x_{[0,t]}) = α(x_t). If f(x) is continuous on [r_1,r_2], the function
is twice continuously differentiable on [r_1,r_2] and is the unique solution of the equation

L^{α(x)}(x) u(x) + f(x) = 0,   x ∈ [r_1,r_2],   (9)

in the class of twice continuously differentiable functions which are equal to g at the end points of [r_1,r_2]. In particular, L^{α(x)}(x) v^α(x) + f^{α(x)}(x) = 0. Moreover, if some function w(x) has two continuous derivatives on [r_1,r_2], w(r_i) = g(r_i), and

then we have w ≤ v^α + N‖h‖_{L_1}, where N is the constant from (6).
PROOF. By Lemma 6, Eq. (9) satisfying the boundary conditions u(r_i) = g(r_i) has a smooth solution. Writing this solution as u_1 and applying Ito's formula to the expression u_1(x_t^{α,x}) e^{−φ_t^{α,x}}, we easily find

from which the equality u_1(x) = u(x) follows as t → ∞ because τ^{α,x}, M_x^α τ are finite, f, u_1 are bounded, and c^α is nonnegative. We have thus proved the first assertion. In order to prove the second assertion, we set L^α w + f^α = −h_w, and note that the function h_w is continuous and L^α[w − v^α] = −h_w. Then, according to the first assertion, we have

It remains to estimate u_2(x). The function h_w⁺, as well as h_w, is continuous. Therefore, u_2(x) satisfies the equation L^α u_2 = −h_w⁺ and, in addition, u_2(r_i) = 0. By Lemma 6 we have u_2 ≤ N‖h_w⁺‖_{L_1}. Since h_w ≤ h, we have h_w⁺ ≤ h⁺, ‖h_w⁺‖_{L_1} ≤ ‖h‖_{L_1}, and u_2 ≤ N‖h‖_{L_1}, thus proving the lemma.
9. Lemma. Let u(x), u_1(x), u_2(x) be bounded Borel functions on [r_1,r_2], and let ε > 0. There exists a function α(x) with values in A such that

a(α(x),x) u_2(x) + b(α(x),x) u_1(x) − c^{α(x)}(x) u(x) + f^{α(x)}(x) + ε ≥ F(x, u(x), u_1(x), u_2(x))

for all x ∈ [r_1,r_2]. Furthermore, there exist a function α(x) satisfying a Lipschitz condition and a real-valued nonnegative function h(x) such that ‖h‖_{L_1} ≤ ε and, for all x ∈ [r_1,r_2],

a(α(x),x) u_2(x) + b(α(x),x) u_1(x) − c^{α(x)}(x) u(x) + f^{α(x)}(x) + h(x) = F(x, u(x), u_1(x), u_2(x)).
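The second assertion rests on smoothing a measurable selector by convolution with the Gaussian kernel of w_t. A one-dimensional sketch (the indicator below is a stand-in for a discontinuous A-valued function such as α(n ∧ i(x)); math.erf gives the convolution in closed form):

```python
import math

alpha = lambda x: 1.0 if x >= 0 else 0.0    # discontinuous function with values in A = [0, 1]

def mollified(x, t):
    """M alpha(x + w_t) for w_t ~ N(0, t), i.e. P(w_t >= -x), in closed form."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * t)))

# the smoothed function keeps its values in the convex set A = [0, 1] ...
for x in [-1.0, -0.1, 0.0, 0.3, 2.0]:
    assert 0.0 <= mollified(x, 0.5) <= 1.0

# ... and converges to alpha(x) as t -> 0 at every continuity point x != 0
for x in [-0.5, -0.05, 0.05, 0.5]:
    assert abs(mollified(x, 1e-6) - alpha(x)) < 1e-9
```

Smoothness for t > 0 and convergence a.e. as t ↓ 0 are exactly the two properties used in the proof below; convexity of A is what keeps the averaged values admissible.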
PROOF. We fix some countable set {α(i)} everywhere dense in A. Because a, b, c, f are continuous in the argument α,

We conclude from the above that for each x ∈ [r_1,r_2] there is an i such that
Next, we denote by i(x) the smallest value of i for which the last given inequality can be satisfied. It is seen that the (measurable) function α(x) ≡ α(i(x)) yields the function stated in the first assertion. In order to prove the second assertion of the lemma, we extend the function i(x) outside [r_1,r_2], assuming i(x) = 1 for x ∉ [r_1,r_2]. Let

α_{n,t}(x) = M α(n ∧ i(x + w_t)) = (2πt)^{-1/2} ∫_{−∞}^{∞} α(n ∧ i(y)) e^{−(y−x)²/2t} dy.

It is easily seen that α_{n,t}(x) is an infinitely differentiable function. Furthermore, α_{n,t}(x) ∈ A due to the convexity property of the set A. Note that, as is well known (see, for example, [10]), for each measurable bounded function γ(x), as t ↓ 0 the function Mγ(x + w_t) → γ(x) (a.e.). Hence α_{n,t}(x) → α(n ∧ i(x)) (a.e.) as t ↓ 0, and, clearly, α(n ∧ i(x)) → α(i(x)) as n → ∞. Defining
we obtain

lim_{n→∞} lim_{t↓0} h^{t,n}(x) ≤ ε,   h^{t,n}(x) ≥ 0.

Further, since u, u_1, u_2 are bounded, the class of functions h^{t,n} is equibounded. Therefore,

lim_{n→∞} lim_{t↓0} ‖h^{t,n}‖_{L_1} ≤ ε(r_2 − r_1),

and t, n can be chosen so that ‖h^{t,n}‖_{L_1} ≤ 2ε(r_2 − r_1). The lemma has thus been proved.

10. Proof of Theorem 5. We use the so-called method of successive approximations in a space of strategies. This method, known as the Bellman-Howard method, enables us to find ε-optimal strategies and approximate values of a payoff function without solving nonlinear differential equations. We take as α^0(x) any function with values in A satisfying a Lipschitz condition. We define the Markov strategy α_0 using the formula α_0(x_{[0,t]}) = α^0(x_t). Let v_0(x) = v^{α_0}(x). If α_0, α_1, …, α_n, v_0(x), v_1(x), …, v_n(x) have been constructed, we choose a function α^{n+1}(x) such that it satisfies a Lipschitz condition and
,
Lan+lv,+ fOL"+l
+ h,+,
= F[v,],
(11)
1 191[1 ,r21
where h,+ is a function with a small norm: llhn+ I 1/(n + l ) ( n+ 2). Assume that v,+,(x) = van+'(x),where the strategy a,+, E %,(x) can be found with the aid of the function a,, ,. We prove that the sequence {v,(x)) has a limit and that this limit satisfies Eq. (3). We shall also prove that the limit of v, coincides with v. First, we investigate the behavior of v,, v; as n -+ co.Applying Lemma 8, we obtain F[v,] 2 0 since Lanv, + f"" = 0. We conclude from (11) that
Therefore, by Lemma 8,

v_{n+1} ≥ v_n − N‖h_{n+1}‖_{ℒ₁} ≥ v_n − N/(n+1)(n+2),

i.e., the sequence of functions v_n + N/(n+1) is monotone increasing. Furthermore, by Lemma 7 the totality of functions v_n as well as v_n′ is bounded, hence lim_{n→∞} (v_n + N/(n+1)) exists. It is seen that v_n has a limit as well. For
1 Introduction to the Theory of Controlled Diffusion Processes
x ∈ [r₁,r₂] let v̄(x) = lim_{n→∞} v_n(x).
By Lemma 6, it follows from the equality L^{α_n}v_n + f^{α_n} = 0 that

‖v_n″‖ ≤ N,

where N does not depend on n. By the Lagrange theorem, |v_n(x) − v_n(y)| ≤ N|x − y|. Therefore, the functions v_n as well as v_n′ are equicontinuous and uniformly bounded. By the Arzelà theorem, some subsequence of the functions v_n converges to the limit uniformly in x. Since the functions v_n increase as n increases, the entire sequence v_n converges to the limit uniformly in x. It then follows that v_n converges to v̄ uniformly in x. In particular, v̄(x) is continuous on [r₁,r₂]. Further, using the Lagrange theorem, we derive from the uniform estimate on ‖v_n″‖ that |v_n′(x) − v_n′(y)| ≤ N|x − y|, where N does not depend on n. By the Arzelà theorem, the sequence {v_n′} is compact in the sense of uniform convergence on [r₁,r₂]. Assume that {v_{n′}′} is a uniformly convergent subsequence and that v̄¹ is the limit of this subsequence. Taking the limit in the equality

v_{n′}(x) = v_{n′}(r₁) + ∫_{r₁}^x v_{n′}′(y) dy,

we have v̄(x) = v̄(r₁) + ∫_{r₁}^x v̄¹(y) dy. Therefore, v̄¹ = v̄′, {v_n′} has a single limit point, and v_n′ → v̄′ uniformly on [r₁,r₂]. In addition, |v̄′(x) − v̄′(y)| ≤ N|x − y|.

Let us consider v̄″ and F[v̄]. We use Eq. (11). As was noted above, F[v_n] ≥ 0 and f^{α_{n+1}} + L^{α_{n+1}}v_{n+1} = 0. Therefore, (11) yields
Dividing the last inequality by a(α_{n+1}(x),x), we easily find

v_{n+1}″(x) + a^{−1}(α_{n+1}(x),x)[b(α_{n+1}(x),x)v_{n+1}′(x) − c^{α_{n+1}(x)}(x)v_{n+1}(x) + f^{α_{n+1}(x)}(x)] ≥ −δ^{−1}[h_{n+1} + K(|v_n′ − v_{n+1}′| + |v_n − v_{n+1}|)].   (12)

Also, we note that

F(x,y,p,r) = sup_{α∈A} a(α,x)[r + a^{−1}(α,x)(b(α,x)p − c^α(x)y + f^α(x))].

From this representation of F it follows that the equation F = 0 is equivalent to r + F₁(x,y,p) = 0, and that 0 ≤ r + F₁ ≤ δ^{−1}ε if 0 ≤ F ≤ ε. Therefore, (12) yields

δ^{−1}K(|v_n′ − v_{n+1}′| + |v_n − v_{n+1}|) + δ^{−1}h_{n+1} + v_{n+1}″ + F₁(x,v_{n+1},v_{n+1}′) ≥ 0.
Integrating over x and letting n → ∞, we obtain

v̄′(x) − v̄′(r₁) + lim_{n→∞} ∫_{r₁}^x F₁(s,v_n(s),v_n′(s)) ds = 0   (13)

by virtue of the proven properties of v_n, v_n′. Next we exploit a property of the function F₁(x,y,p). Since the magnitude of the difference between the upper bounds does not exceed the upper bound of the magnitude of this difference, we have

|F₁(x,y₁,p₁) − F₁(x,y₂,p₂)| ≤ (K/δ)(|p₁ − p₂| + |y₁ − y₂|).

In particular, |F₁(s,v_n,v_n′) − F₁(s,v̄,v̄′)| ≤ (K/δ)(|v_n′ − v̄′| + |v_n − v̄|), and (13) yields

v̄′(x) − v̄′(r₁) + ∫_{r₁}^x F₁(s,v̄(s),v̄′(s)) ds = 0.   (14)
Further, applying the elementary inequality just mentioned for upper bounds, we find that F₁(x,y,p) is Lipschitz continuous in its arguments, from which we derive that F₁(x,v̄(x),v̄′(x)) satisfies a Lipschitz condition. Differentiating (14), we have first that v̄″(x) + F₁(x,v̄(x),v̄′(x)) = 0 and F[v̄] = 0; second, it follows from the equality v̄″ = −F₁(x,v̄,v̄′) that v̄″ satisfies a Lipschitz condition.

In order to complete the proof of the theorem, we need only show that for each twice continuously differentiable solution u(x) of the equation F[u] = 0 satisfying the boundary conditions u(r₁) = g(r₁), u(r₂) = g(r₂), the equalities u = v_{(M)} = v hold. First of all, if ε > 0 and u is a function having the properties given above, by Lemma 9 there are α(x) and h(x) such that

L^{α(x)}u(x) + f^{α(x)}(x) + h(x) = 0

and ‖h‖_{ℒ₁} < ε. This implies, by Lemma 8, that u ≤ v^α + Nε and, therefore, u ≤ v_{(M)} + Nε ≤ v + Nε. On the other hand, the equality F[u] = 0 yields L^α u + f^α ≤ 0 for each α ∈ A, x ∈ [r₁,r₂]. We apply Ito's formula to the expression u(x_{t∧τ}^{α,x})e^{−φ_{t∧τ}}; letting t → ∞, this gives the inequality u ≥ v^α, due to the finiteness of τ^{α,x} and Mτ^{α,x} (Lemma 7), the boundedness of f, u, the equality u(x_τ^{α,x}) = g(x_τ^{α,x}), and the nonnegativeness of φ. Hence u ≥ v ≥ v_{(M)}, which together with the preceding inequality yields u = v_{(M)} = v. The theorem has thus been proved.
11. Remark. Since v̄ = v, we have v = lim_{n→∞} v^{α_n}. Furthermore, the proof of the theorem provides us with a technique for finding ε-optimal strategies and the approximate value of the payoff function v. Using this technique, one has to know how to solve equations of the form L^{α(x)}u(x) + f^{α(x)}(x) = 0 and how to find β(x) such that L^{β(x)}u(x) + f^{β(x)}(x) ≥ F[u](x) − ε. The equality v = lim_{n→∞} v^{α_n} enables us to estimate v″, v′, v as follows. By Lemmas 6 and 8,

‖v^{α_n}‖ + ‖(v^{α_n})′‖ + ‖(v^{α_n})″‖ ≤ N₁,

where N₁ depends only on the maximum magnitudes of a(α,x), b(α,x), c^α(x), f^α(x), g(r₁), g(r₂), r₁, r₂, δ. Therefore, by the Lagrange theorem, for all x, y ∈ [r₁,r₂]

|v^{α_n}(x) − v^{α_n}(y)| + |(v^{α_n})′(x) − (v^{α_n})′(y)| ≤ N₁|x − y|.

Letting n → ∞, we obtain a similar inequality for the function v. We divide both sides of the inequality thus obtained by |x − y|. Letting y go to x, we then have that for all x ∈ [r₁,r₂] the sum |v(x)| + |v′(x)| + |v″(x)| does not exceed N₁. Therefore,

‖v‖ + ‖v′‖ + ‖v″‖ ≤ N₁.
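The successive-approximation scheme of Remark 11 is what is now commonly called policy iteration. The following sketch (our own numerical illustration, not taken from the text) applies it to a finite-difference discretization of an equation of the form sup_α [a(α,x)u″ + b(α,x)u′ − c^α(x)u + f^α(x)] = 0 with boundary data; all coefficient choices below are hypothetical stand-ins for a, b, c, f.

```python
import numpy as np

# Policy iteration (Bellman-Howard) for a discretized 1-D Bellman equation.
# Coefficients are illustrative stand-ins, not from the text.
r1, r2, n = 0.0, 1.0, 101
x = np.linspace(r1, r2, n)
h = x[1] - x[0]
A = [-1.0, 1.0]                           # finite control set (hypothetical)

a_coef = lambda a, xx: 0.5                # diffusion term
b_coef = lambda a, xx: a                  # controlled drift
c_coef = lambda a, xx: 1.0                # discount
f_coef = lambda a, xx: np.cos(np.pi * xx) # running reward
g1, g2 = 0.0, 0.0                         # boundary values

def solve_policy(alpha):
    """Solve the linear equation L^alpha v + f^alpha = 0 by finite differences."""
    M = np.zeros((n, n)); rhs = np.zeros(n)
    M[0, 0] = M[-1, -1] = 1.0
    rhs[0], rhs[-1] = g1, g2
    for i in range(1, n - 1):
        a, b, c = a_coef(alpha[i], x[i]), b_coef(alpha[i], x[i]), c_coef(alpha[i], x[i])
        M[i, i - 1] = a / h**2 - b / (2 * h)
        M[i, i]     = -2 * a / h**2 - c
        M[i, i + 1] = a / h**2 + b / (2 * h)
        rhs[i] = -f_coef(alpha[i], x[i])
    return np.linalg.solve(M, rhs)

alpha = np.full(n, A[0])
for _ in range(50):
    v = solve_policy(alpha)               # evaluate current policy
    new = alpha.copy()                    # improve: maximize pointwise
    for i in range(1, n - 1):
        vpp = (v[i + 1] - 2 * v[i] + v[i - 1]) / h**2
        vp = (v[i + 1] - v[i - 1]) / (2 * h)
        vals = [a_coef(a, x[i]) * vpp + b_coef(a, x[i]) * vp
                - c_coef(a, x[i]) * v[i] + f_coef(a, x[i]) for a in A]
        new[i] = A[int(np.argmax(vals))]
    if np.array_equal(new, alpha):
        break
    alpha = new
```

As in the text, each step solves only a linear problem for the current strategy and then improves the strategy pointwise; no nonlinear equation is solved directly.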
12. Remark (The smooth pasting condition). At each point x ∈ (r₁,r₂),

v(x − 0) = v(x + 0),  v′(x − 0) = v′(x + 0),  v″(x − 0) = v″(x + 0),

which fact, together with the boundary conditions v(r₁) = g(r₁), v(r₂) = g(r₂), helps us to find x₀, c₁, c₂, d₁, d₂ if, for example, it has been proved that on some interval [r₁,x₀] the function v is representable as v₁(x,c₁,c₂), and on [x₀,r₂] as v₂(x,d₁,d₂), with v₁ and v₂ being known.

13. Exercise. Let A = [−1,1], x_t^{α,x} = x + w_t + ∫₀ᵗ α_s ds. Prove that the third derivative of the function v(x) = sup_{α∈𝔄} Mτ^{α,x} is discontinuous at the point (r₁ + r₂)/2.
14. Exercise. Using the inequality v_n″ + F₁(x,v_n,v_n′) ≥ 0, prove that v″ = lim_{n→∞} v_n″ (a.s.).
We shall make a few more remarks on Theorem 5 proved above. We have first proved the existence of a solution of the equation F[u] = 0. Second, we have proved that u is equivalent to v. We have thus obtained a theorem on the uniqueness of the solution of the equation F[u] = 0 satisfying the boundary conditions u(r_i) = g(r_i). One should keep in mind that in the theory of differential equations the existence and uniqueness theorems are proved for a wider class of equations than that of equations of the type (3) (see [3,33,43,46,47]). The result of Exercise 13 shows that a payoff function need not have three continuous derivatives even if a, b, c, f are analytic in (α,x). We note here that if, for example, the function F₁(x,y,p) has 10 continuous derivatives in (x,y,p), then v has 12 continuous derivatives. We can easily deduce this by induction from the fact that the equation F[v] = 0 is equivalent to the equation v″ + F₁(x,v,v′) = 0. The next theorem follows from Remark 11 and the uniform convergence of v^{α_n} to v.

15. Theorem. For each ε > 0 there exists a function α(x) satisfying a Lipschitz condition and such that the Markov strategy α_t(x_{[0,t]}) = α(x_t) is ε-optimal for all x.
If the payoff function v has been found, ε-optimal Markov strategies can easily be found with the aid of Lemmas 8 and 9. In fact, using Lemma 9, one can find a function α(x) satisfying a Lipschitz condition and such that

L^{α(x)}v(x) + f^{α(x)}(x) + h(x) = 0.

In this case ‖h‖_{ℒ₁} ≤ ε and, by Lemma 8, v(x) ≤ v^α(x) + Nε.
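The payoff v^α of a fixed Markov strategy can also be estimated by direct simulation of the controlled process. The sketch below (our own illustration, not from the text) uses the Euler scheme with hypothetical coefficients σ ≡ 1, f ≡ 1, g ≡ 0, c ≡ 0, in which case v^α(x) is simply Mτ^{α,x}, the expected exit time from (r₁,r₂), as in Exercise 13; for zero drift on (−1,1) the exact value is 1 − x².

```python
import numpy as np

# Monte Carlo sketch of v^alpha(x) for a Markov strategy alpha(x):
# Euler scheme for dx = b(alpha(x), x) dt + sigma(alpha(x), x) dw on (r1, r2),
# stopped at the first exit.  Coefficients are hypothetical stand-ins.
rng = np.random.default_rng(0)
r1, r2, dt = -1.0, 1.0, 1e-3

sigma = lambda a, xx: 1.0       # diffusion coefficient
b = lambda a, xx: a             # controlled drift
f = lambda a, xx: 1.0           # running reward (payoff becomes M tau)

def payoff(alpha, x0, n_paths=4000, t_max=50.0):
    """Estimate M[int_0^tau f dt] starting from x0 under policy alpha."""
    x = np.full(n_paths, x0)
    total = np.zeros(n_paths)
    alive = np.ones(n_paths, dtype=bool)
    t = 0.0
    while alive.any() and t < t_max:
        xa = x[alive]
        a = alpha(xa)
        total[alive] += f(a, xa) * dt
        x[alive] = xa + b(a, xa) * dt \
            + sigma(a, xa) * np.sqrt(dt) * rng.standard_normal(xa.size)
        alive &= (x > r1) & (x < r2)
        t += dt
    return total.mean()

# zero drift: the expected exit time from (-1, 1) equals 1 - x0**2
est = payoff(lambda xx: 0.0 * xx, 0.0)
```

The same routine with, say, α(x) = −sign(x) estimates the payoff of the bang-bang candidate strategy discussed after Theorem 16.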
The problem of finding optimal strategies is much more complicated. To show this, we set for ε ≥ 0

A^ε(x) = {α ∈ A : L^α v(x) + f^α(x) ≥ −ε}.

It is seen that for some x the set A⁰(x) can be empty. The next theorem is given without proof.

16. Theorem. (a) If a strategy α₀ is optimal for a point z ∈ (r₁,r₂), then the random vector α_t ∈ A⁰(x_t^{α,z}) almost surely on the set {τ^{α,z} > t} for almost all t. (b) A Markov strategy α_t(x_{[0,t]}) = α(x_t), admissible for a point z ∈ (r₁,r₂), is optimal for z if and only if α(x) ∈ A⁰(x) for almost all x ∈ (r₁,r₂).

It follows from this theorem that the requirement of optimality imposes a very strict limitation on a strategy. The reader who has solved Exercise 13 can easily understand that the sets A⁰(x) (x ≠ 0) are empty if A = (−1,1), x_t^{α,x} = x + w_t + ∫₀ᵗ α_s ds, v(x) = sup_α Mτ^{α,x}. Consequently, there is no optimal strategy in this case. The case where A = [−1,1] is more interesting (as in Exercise 13). Here A⁰(x) = {1} for x ∈ [r₁,(r₁ + r₂)/2); A⁰((r₁ + r₂)/2) = [−1,1]; A⁰(x) = {−1} for x ∈ ((r₁ + r₂)/2,r₂]. In this case the function α(x) determining an optimal strategy must satisfy (at least for almost all x) the following conditions: α(x) = +1 for x ∈ [r₁,(r₁ + r₂)/2); α((r₁ + r₂)/2) ∈ [−1,1]; α(x) = −1 for x ∈ ((r₁ + r₂)/2,r₂]. There arises the question of admissibility of the strategy α(x_t) with this function α(x), that is, the question of solvability of the equation

x_t = x + w_t + ∫₀ᵗ α(x_s) ds

with a discontinuous drift coefficient. That the last equation is solvable can easily be proved with the aid of an appropriate transformation y_t = f(x_t) which reduces the initial equation to y_t = f(x) + ∫₀ᵗ σ(y_s) dw_s, where σ(y) satisfies a Lipschitz condition. Hence, there exists an optimal strategy in Exercise 13. Equations with coefficients which do not satisfy a Lipschitz condition have not been studied adequately (see, however, [75,78] and Section 11.6).

In Section 1.1, we used Bellman's principle to deduce the Bellman equation. We now prove this principle.
17. Theorem (Bellman's Principle). For all x ∈ [r₁,r₂], α ∈ 𝔄, let a Markov time γ^{α,x} ≤ τ^{α,x} be given. Then, for each function u(x) twice continuously differentiable on the interval [r₁,r₂] and such that F[u] = 0, we have on [r₁,r₂] the equality

u(x) = sup_{α∈𝔄} M_x^α [∫₀^{γ} f^{α_t}(x_t)e^{−φ_t} dt + u(x_γ)e^{−φ_γ}],   (16)

which, in particular, holds for u = v.
PROOF. We denote by ū(x) the right side of (16). Taking in (15) the expression γ ∧ t instead of τ ∧ t, we easily find that u(x) ≥ ū(x). On the other hand, for ε > 0 take a smooth function α(x) and a function h(x) such that ‖h‖_{ℒ₁} ≤ ε, and define the Markov strategy α_t(x_{[0,t]}) = α(x_t). By Ito's formula and the equality L^{α(x)}u + f^{α(x)} + h = 0 we obtain

u(x) ≤ ū(x) + M_x^α ∫₀^τ h(x_t)e^{−φ_t} dt.

We have used the fact that h ≥ 0, since 0 = F[u] ≥ L^{α(x)}u(x) + f^{α(x)}(x) = −h(x). By Lemma 8, the last expectation as a function of x satisfies the equation L^{α(x)}u₁(x) + h(x) = 0. Therefore, by Lemma 6, the above expectation does not exceed N‖h‖_{ℒ₁} ≤ Nε. Finally, u(x) ≤ ū(x) + Nε for each ε > 0. Hence u ≤ ū, which, together with the converse inequality proved before, yields the equality u = ū, thus proving the theorem. □

18. Exercise. Prove that Eq. (16) will hold if we require only that f^α(x) be measurable in x, continuous in α, and bounded with respect to (α,x).
19. Exercise. For h ∈ ℒ₁[r₁,r₂] let

u(x) = sup_{α∈𝔄} M_x^α ∫₀^τ h(x_t) dt.

Prove that |u(x)| ≤ N‖h‖_{ℒ₁}, where N does not depend on h, x.
5. Optimal Stopping of a One-Dimensional Controlled Process
We consider the control scheme given in Section 1.4. To this end, we take the same set A and the functions σ(α,x), b(α,x), c^α(x), f^α(x) satisfying the conditions given in Section 1.4. For simplicity of notation, we assume that c^α(x) ≡ 0. In contrast to what we did in Section 1.4, we assume here that the function g(x) is given on the entire interval [r₁,r₂] and that g(x) is twice continuously
differentiable on this interval. As in Section 1.4, we denote by x_t^{α,x} a solution of Eq. (4.1), and assume that τ = τ^{α,x} is the time of first departure of x_t^{α,x} from [r₁,r₂]. For a Markov time γ we set

v^{α,γ}(x) = M_x^α [∫₀^{γ∧τ} f^{α_t}(x_t) dt + g(x_{γ∧τ})],

and introduce a payoff function in the optimal stopping problem defined by

w(x) = sup_{α∈𝔄, γ} v^{α,γ}(x).

In this section, we deal with the problem of finding a strategy α ∈ 𝔄 and a Markov time γ such that v^{α,γ}(x) ≥ w(x) − ε.

1. Definition. Let ε ≥ 0. A Markov (with respect to {ℱ_t}) time γ = γ^α is said to be ε-optimal for a point x if

sup_{α∈𝔄} v^{α,γ^α}(x) ≥ w(x) − ε.

A 0-optimal Markov time is said to be an optimal Markov time.

We shall investigate the optimal stopping problem using the method of randomized stopping. We denote by 𝔅_n the set of pairs (α,r), where α ∈ 𝔄 and r = r_t is a nonnegative progressively measurable (with respect to {ℱ_t}) process such that r_t(ω) ≤ n for all (t,ω). Let 𝔅 = ⋃_n 𝔅_n. For α ∈ A, r ≥ 0, let f^{α,r}(x) = f^α(x) + rg(x), and for (α,r) ∈ 𝔅 let

v^{α,r}(x) = M_x^α ∫₀^τ f^{α_t,r_t}(x_t) exp(−∫₀ᵗ r_s ds) dt,

v̄_n(x) = sup_{(α,r)∈𝔅_n} v^{α,r}(x),  v̄(x) = lim_{n→∞} v̄_n(x) = sup_{(α,r)∈𝔅} v^{α,r}(x).
The main properties of the functions v̄_n(x) and the relationship of these functions with w(x) will be proved in the following lemma, whose first assertion justifies, in addition, the application of the method of randomized stopping.

2. Lemma. (a) w(x) = v̄(x) on [r₁,r₂].
(b) |w(x) − v̄_n(x)| ≤ (1/n)N for all x ∈ [r₁,r₂], where N depends only on K and the function g.
(c) The v̄_n(x) are twice continuously differentiable on [r₁,r₂], v̄_n″(x) satisfies a Lipschitz condition, v̄_n(r_i) = g(r_i), and

F[v̄_n] + n(g − v̄_n)⁺ = 0   (1)

on [r₁,r₂].
(d) ‖v̄_n‖_{B[r₁,r₂]} + ‖v̄_n′‖_{B[r₁,r₂]} + ‖v̄_n″‖_{B[r₁,r₂]} ≤ N, where N does not depend on n.
PROOF. The function v̄_n(x) is representable as the payoff function given in Section 1.4 if we take B_n = A × [0,n] instead of the set A, and if we set σ(β,x) = σ(α,x), b(β,x) = b(α,x), c^β(x) = r, f^β(x) = f^α(x) + rg(x) for β = (α,r) ∈ B_n. Then β becomes the control parameter, and 𝔅_n replaces the set of strategies 𝔄. Hence, Theorem 4.5 immediately implies the assertion on the smoothness of v̄_n(x) and the fact that v̄_n(x) satisfies the corresponding Bellman equation. This equation is the following:

0 = sup_{α∈A, r∈[0,n]} [a(α,x)v̄_n″(x) + b(α,x)v̄_n′(x) − rv̄_n(x) + rg(x) + f^α(x)]
  = sup_{α∈A} [L^α v̄_n(x) + f^α(x)] + sup_{r∈[0,n]} r[g(x) − v̄_n(x)] = F[v̄_n] + n(g − v̄_n)⁺,

which proves (c). In order to prove (b), we write (1) as

sup_{α∈A} [a(α,x)v̄_n″(x) + b(α,x)v̄_n′(x) + f_n^α(x)] = 0,

where f_n^α = f^α + n(g − v̄_n)⁺. From this, using Theorem 4.17, we have for all Markov times γ = γ^{α,x}

v̄_n(x) = sup_{α∈𝔄} M_x^α [∫₀^{γ∧τ} f_n^{α_t}(x_t) dt + v̄_n(x_{γ∧τ})].   (2)

Here v̄_n ≥ g − (g − v̄_n)⁺ = g_n and f_n^α ≥ f^α; therefore,

v̄_n(x) ≥ sup_{α∈𝔄, γ} M_x^α [∫₀^{γ∧τ} f^{α_t}(x_t) dt + g_n(x_{γ∧τ})].   (3)
On the other hand, if we take in (2) γ = γ₀ = γ₀^{α,x} = inf{t : g(x_t^{α,x}) ≥ v̄_n(x_t^{α,x})}, then for 0 ≤ t ≤ γ₀ we have f_n(x_t^{α,x}) = f(x_t^{α,x}) and v̄_n(x_{γ₀∧τ}^{α,x}) = g_n(x_{γ₀∧τ}^{α,x}). Therefore,

v̄_n(x) = sup_{α∈𝔄} M_x^α [∫₀^{γ₀∧τ} f^{α_t}(x_t) dt + g_n(x_{γ₀∧τ})].

Comparing the last equality with (3), we obtain

v̄_n(x) = sup_{α∈𝔄, γ} M_x^α [∫₀^{γ∧τ} f^{α_t}(x_t) dt + g_n(x_{γ∧τ})],

which makes a crucial point in our proof: it turns out that if g is replaced by g_n = g − (g − v̄_n)⁺, then v̄_n(x) becomes the payoff function in the optimal stopping problem. Further, using the inequality connecting the magnitude of the difference between the upper bounds and the upper bound of the magnitude of the differences, we find

|v̄_n(x) − w(x)| ≤ sup_{[r₁,r₂]} |g_n − g| = sup_{[r₁,r₂]} (g − v̄_n)⁺.

Therefore,

|w − v̄_n| ≤ sup_{[r₁,r₂]} (g − v̄_n)⁺.   (4)
In order to estimate (g − v̄_n)⁺, we write (1) as an equation for v̄_n − g:

sup_{α∈A} [L^α(v̄_n − g)(x) − n(v̄_n − g)(x) + f̃_n^α(x)] = 0,

where f̃_n^α = f^α + n(g − v̄_n)⁺ + n(v̄_n − g) + L^α g, which yields, by Theorem 4.17,

v̄_n(x) − g(x) = sup_{α∈𝔄} M_x^α ∫₀^τ e^{−nt} f̃_n^{α_t}(x_t) dt.

Note that f̃_n^α ≥ f^α + L^α g ≥ −(f^α + L^α g)⁻; hence

v̄_n(x) − g(x) ≥ −sup_{α∈𝔄} M_x^α ∫₀^τ e^{−nt} dt · sup_{α,x} (f^α + L^α g)⁻ ≥ −(1/n)N,

i.e.,

g − v̄_n ≤ (1/n)N,  (g − v̄_n)⁺ ≤ (1/n)N,   (5)

which, together with (4), completes the proof of (b). Assertion (a) follows obviously from (b).

Next, we prove (d). We have from (2) that

v̄_n(x) = sup_{α∈𝔄} M_x^α [∫₀^τ f_n^{α_t}(x_t) dt + g(x_τ)].

It can easily be verified that the function f_n^α(x) satisfies a Lipschitz condition. According to Remark 4.11,

‖v̄_n‖ + ‖v̄_n′‖ + ‖v̄_n″‖ ≤ N,

where N depends only on the maximum magnitudes of a(α,x), b(α,x), f_n^α(x), g(r_i), r₁, r₂, δ. It remains only to note that |f_n^α| ≤ |f^α| + n(g − v̄_n)⁺, and that both terms on the right side can be estimated with the aid of constants not depending on n. We have thus proved the lemma.
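The penalized equation F[v̄_n] + n(g − v̄_n)⁺ = 0 of Lemma 2 is also a practical numerical device. The sketch below (our own drastically simplified scalar model with hypothetical f and g, one fixed "control", and F[u] = u″ + f) solves the penalized problem by an active-set sweep and lets one observe the O(1/n) decay of the constraint violation (g − v̄_n)⁺, in the spirit of assertion (b).

```python
import numpy as np

# Penalization sketch for an obstacle problem:
#   v'' + f + n * (g - v)^+ = 0  on (0,1),  v(0) = g(0), v(1) = g(1).
# Obstacle g and reward f are hypothetical stand-ins.
m = 201
x = np.linspace(0.0, 1.0, m)
h = x[1] - x[0]
g = 1.0 - 4.0 * (x - 0.5) ** 2        # obstacle (hypothetical)
f = np.full(m, -2.0)                  # running reward (hypothetical)

def solve_penalized(n_pen, iters=200):
    v = g.copy()
    for _ in range(iters):
        # linearize the penalty on the current active set {v < g}
        active = (v < g).astype(float)
        M = np.zeros((m, m)); rhs = np.zeros(m)
        M[0, 0] = M[-1, -1] = 1.0
        rhs[0], rhs[-1] = g[0], g[-1]
        for i in range(1, m - 1):
            M[i, i - 1] = M[i, i + 1] = 1.0 / h**2
            M[i, i] = -2.0 / h**2 - n_pen * active[i]
            rhs[i] = -f[i] - n_pen * active[i] * g[i]
        v_new = np.linalg.solve(M, rhs)
        if np.max(np.abs(v_new - v)) < 1e-10:
            return v_new
        v = v_new
    return v

v_coarse = solve_penalized(10.0)
v_fine = solve_penalized(1000.0)
# the violation (g - v)^+ shrinks roughly like 1/n, as in Lemma 2(b)
```

Increasing the penalty parameter n drives the solution up toward the obstacle, mirroring how v̄_n approaches w at rate N/n.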
We can easily derive the desired properties of the function w(x) from the lemma just proved. It follows from (d) and the fact that v̄_n converges to w, in the same way as in 4.10, that w, w′ are continuous on [r₁,r₂], v̄_n′ → w′ uniformly on [r₁,r₂], and |w′(x) − w′(y)| ≤ N|x − y|, which implies that w′(x) is absolutely continuous, w″(x) exists almost everywhere on [r₁,r₂], and |w″(x)| ≤ N on the set on which w″ exists. From now on we shall denote by w″(x) the function defined everywhere on [r₁,r₂], equal to the second derivative of w(x) at those points at which this derivative exists, and equal to zero at the remaining points. It is seen that w ≥ g on [r₁,r₂], w(r_i) = g(r_i). Furthermore, it follows from Lemma 2c that F[v̄_n] ≤ 0; therefore v̄_n″ + F₁(x,v̄_n,v̄_n′) ≤ 0. Integrating this inequality and passing to the limit, we find for x > y that

w′(x) − w′(y) + ∫_y^x F₁(s,w(s),w′(s)) ds ≤ 0.
We divide both sides of the last inequality by x − y and take the limit as y ↑ x. Then w″ + F₁(x,w,w′) ≤ 0 (a.s.), i.e., F[w] ≤ 0 (a.s.) on [r₁,r₂]. Further, let Γ = {x : w(x) = g(x)}. Γ is a closed nonempty (r_i ∈ Γ) subset of the interval [r₁,r₂]. Let [p₁,p₂] be a subinterval not intersecting Γ. Then w(x) > g(x) for x ∈ [p₁,p₂]. Since v̄_n → w uniformly on [r₁,r₂], the inequality v̄_n(x) > g(x) will be satisfied for x ∈ [p₁,p₂] beginning from some n. Therefore, (g − v̄_n)⁺ = 0, and by Lemma 2c F[v̄_n] = 0 on [p₁,p₂]. Hence v̄_n″ + F₁(x,v̄_n,v̄_n′) = 0 on [p₁,p₂] for sufficiently large n, which leads us, as in 4.10, to the assertion that w″(x) is continuous on [p₁,p₂] and w″ + F₁(x,w,w′) = 0 on [p₁,p₂].

The above implies, in turn, two facts. First, w″ = −F₁(x,w,w′) on any subinterval not intersecting Γ; therefore w″ outside Γ satisfies a Lipschitz condition. Second, F[w] = 0 outside Γ. Finally, noting that it follows from w − g ≥ 0 and w − g = 0 on Γ that w′ − g′ = 0 on Γ ∩ (r₁,r₂), we have the following theorem.
3. Theorem. (a) w together with its derivative is continuous on [r₁,r₂], w′ is absolutely continuous, and w″ is bounded on [r₁,r₂]. The function w″ satisfies a Lipschitz condition outside the set Γ = {x ∈ [r₁,r₂] : w(x) = g(x)}. (b) w ≥ g, w(r_i) = g(r_i), F[w] ≤ 0 (a.s.), and F[w] = 0 on [r₁,r₂]∖Γ. (c) w′ = g′ on the set Γ ∩ (r₁,r₂).

Next, we investigate ε-optimal strategies and optimal stopping times.

4. Theorem. (a) For α ∈ 𝔄, x ∈ [r₁,r₂] we denote by γ₀ = γ₀^{α,x} the time of first entry of the process x_t^{α,x} into the set Γ; then γ₀ is an optimal stopping time. (b) We define for ε > 0 a function α(x) satisfying a Lipschitz condition. In addition, we define a numerical function h(x) such that ‖h‖_{ℒ₁} ≤ ε and

L^{α(x)}w(x) + f^{α(x)}(x) + h(x) = 0   (6)

(see Lemma 4.9). Also, we define a Markov strategy α using the formula α_t(x_{[0,t]}) = α(x_t). Then α is an Nε-optimal strategy for any point x and, moreover,

w(x) ≤ M_x^α [∫₀^{γ₀∧τ} f^{α_t}(x_t) dt + g(x_{γ₀∧τ})] + Nε,   (7)

N not depending on ε and x.
PROOF. It is easily seen that only (7) is to be proved; the remaining assertions of the theorem follow from (7). It is obvious that (7) need be proved only for x ∉ Γ. Let (p₁,p₂) be a subinterval not intersecting Γ. The function w is twice continuously differentiable and w″ satisfies a Lipschitz condition on (p₁,p₂). Hence, the limits w″(p₁ + 0) and w″(p₂ − 0) exist. Furthermore, it follows from (6) that on (p₁,p₂)
We apply Lemma 4.8 to the function w and the strategy α, taking the interval [p₁,p₂] as the initial one. A simple analysis of the deduction of (4.6) then shows that the constant N can be taken to be the same for all p₁, p₂ ∈ [r₁,r₂], and also that (7) holds, thus proving the theorem.
5. Exercise. Let γ^{α,x} = γ_ε^{α,x} be the first exit time of the process x_t^{α,x} from {x : w(x) > g(x) + ε}; then γ^{α,x} is an ε-optimal stopping time.
6. Exercise. Let w(x) > g(x) on (p₁,p₂) and let w(p_i) ≤ g(p_i) + ε. Find a strategy α which is ε-optimal for a point x₀ ∈ (p₁,p₂) in the problem of maximizing v^{α,τ₁}(x₀), where τ₁ is the first exit time from (p₁,p₂) (see Theorem 4.15). Then τ₁ is a 2ε-optimal stopping time and α is a 2ε-optimal strategy in the primary problem for the point x₀.
We shall explain how the preceding results can be used for finding ε-optimal strategies and ε-optimal stopping times. First, we find n such that |v̄_n − w| < ε/4 (see Lemma 2b). The function v̄_n is a solution of the equation

sup_{α∈A, r∈[0,n]} [a(α,x)v̄_n″(x) + b(α,x)v̄_n′(x) − rv̄_n(x) + rg(x) + f^α(x)] = 0;

hence v̄_n can be found via the method of successive approximation in the space of strategies (see 4.10). Let v̄_n^m → v̄_n as m → ∞. We take m such that |v̄_n(x) − v̄_n^m(x)| ≤ ε/4 for x ∈ [r₁,r₂]. Let

G = {x : v̄_n^m(x) > g(x) + ε/2}.

It can be easily verified that w(x) ≤ g(x) + ε on [r₁,r₂]∖G, and if (p₁,p₂) ⊂ G, then w > g on (p₁,p₂). Therefore, the ε-optimal strategy for the points of [r₁,r₂]∖G consists in instantaneous stopping. For the points of any interval (p₁,p₂) ⊂ G the first exit time from G is an ε-optimal stopping time (Exercise 5); we can find ε-optimal strategies using Exercise 6. In some cases it is difficult to do this. However, sometimes it is possible to define a function u(x) explicitly such that u(x) seems to coincide with w. In such cases the following theorem can be useful.
7. Theorem (On Uniqueness). Let the function u(x), together with its first derivative, be defined and continuous on [r₁,r₂]. Assume that u′(x) is absolutely continuous on [r₁,r₂]. Finally, let u ≥ g, u(r_i) = g(r_i), F[u] ≤ 0 (a.s.), and F[u] = 0 (a.s.) on the set {x ∈ [r₁,r₂] : u(x) > g(x)}. Then u(x) = w(x).

PROOF. We first prove that u ≤ w. Let Γ₁ = {x : u(x) = g(x)}. Since g ≤ w, it suffices to establish the inequality u ≤ w on any subinterval (p₁,p₂) not intersecting Γ₁. Let sequences p₁ⁿ, p₂ⁿ be such that p_iⁿ ≠ p_i, p₁ⁿ ↓ p₁, p₂ⁿ ↑ p₂. We immediately note that, by the Lagrange theorem, u(p_iⁿ) → u(p_i) and w(p_iⁿ) → w(p_i) as n → ∞. Further, on (p₁,p₂) we have F[u] = 0 (a.s.), which yields u″ + F₁(x,u,u′) = 0, u″ = −F₁(x,u,u′) (a.s.). The expression F₁(x,u,u′) is continuous in x; therefore u″ coincides with a continuous function almost everywhere on (p₁,p₂). This readily implies that the function u″ itself is continuous on (p₁,p₂). Next, we apply Theorem 4.5 to the function u on the interval [p₁ⁿ,p₂ⁿ]. Denoting by γ_n = γ_n^{α,x} the first exit time of x_t^{α,x} from [p₁ⁿ,p₂ⁿ], and noting that u(x_{γ_n}^{α,x}) is equal to u(p₁ⁿ) or u(p₂ⁿ), we have

u(x) = sup_{α∈𝔄} M_x^α [∫₀^{γ_n} f^{α_t}(x_t) dt + u(x_{γ_n})],

which implies, as n → ∞, that u(x) ≤ w(x).

Further, we prove the converse inequality. We denote by u″(x) a Borel function equal almost everywhere to the derivative of u′(x). We can take a function u″(x) such that the inequality F[u] ≤ 0 is satisfied at all points of [r₁,r₂]. In fact, by assumption this inequality is satisfied almost everywhere, and we can redefine u″(x) at the points at which F[u](x) > 0, if we note that, in view of the obvious inequality F(x,y,p,r) ≤ δr + K(|p| + |y| + 1) for r ≤ 0, for any x, y, p one can choose r ≤ 0 so that F(x,y,p,r) ≤ 0. In this case, L^α u + f^α ≤ F[u] ≤ 0 everywhere on [r₁,r₂] for each α ∈ A.
Since u′(x) is absolutely continuous, ∫_{r₁}^{r₂} |u″(x)| dx < ∞. By Theorem 2.10.1, the fact that this condition is satisfied is sufficient for Ito's formula to be applicable to the expression u(x_t^{α,x}). Using Ito's formula and the inequalities u ≥ g, L^α u + f^α ≤ 0 for each α ∈ 𝔄 and each Markov time γ, we conclude that u(x) ≥ v^{α,γ}(x). Therefore, u(x) ≥ sup_{α,γ} v^{α,γ}(x) = w(x), thus proving the theorem. □

The arguments in the above proof prove as well the following theorem.
8. Theorem. Let a function u(x), together with its first derivative, be defined and continuous on [r₁,r₂]. Assume that u′(x) is absolutely continuous on [r₁,r₂]. If u ≥ g and F[u] ≤ 0 (a.s.), then u(x) ≥ w(x) on [r₁,r₂]. In other words, w is the smallest function among those satisfying the inequalities u ≥ g, F[u] ≤ 0 on [r₁,r₂].

9. Exercise. We take the function c^α(x) from Section 1.4 and redefine v^{α,γ}(x), letting

v^{α,γ}(x) = M_x^α [∫₀^{γ∧τ} f^{α_t}(x_t)e^{−φ_t} dt + g(x_{γ∧τ})e^{−φ_{γ∧τ}}].

We encourage the reader to develop the actual arguments required in this section.
Notes

Section 1. Girsanov [25] was apparently the first to justify the application of Bellman's equation to some control problems, relying to a great extent upon differential equations theory. Using the same theory, Fleming [13-16] made further steps in the development of optimal control theory; see also Fleming and Rishel [17]. Speaking of the relationship between differential equations theory and optimal control theory, it is appropriate to draw the reader's attention to [20, 21, 30, 31, 35, 66]. Control variables depending upon the entire "past" of processes with continuous time were first discussed by Fleming in [15].

Section 2. The normed Bellman equation was first introduced in [37]. The method of randomized stopping is developed in [29-31, 37]. The optimal stopping problem for a Markov (uncontrolled) process is discussed by Shiryayev in [69]. It is essential to note that the equations from optimal stopping theory are, in many cases, equivalent to the equations from the theory of differential (variational) inequalities; see Lions [49, 50], Lewy and Stampacchia [48], and Tobias [73, 74]. The comparison of Exercise 4 with Example 3.7 in Chapter 1 of [50] shows a relationship of another kind between the optimal control theory of diffusion processes and variational inequalities theory.

Section 3. Other examples illustrating the application of optimal control theory for obtaining estimates which play an essential role in this theory may be found in [39, 40].

Section 4. In a sense, our discussion consists of carrying the results and methods of Fleming [13] over to the one-dimensional case. In the multivariate (d ≥ 3) case, using the methods mentioned above, it is possible to consider only optimal control problems in which the control parameter does not enter the diffusion coefficients; see [13-17, 21]. The reason for this is that it is impossible (in view of a well-known example due to N. N. Uraltseva) to prove a suitable analog of Lemma 6 for d ≥ 3. At the same time it is possible to work out a theory rather similar to that described in this section for the plane (d = 2), allowing the control parameter to enter the coefficients of diffusion as well as drift; see [30, 31, 35, 66]. The method used in proving Theorem 5 differs, in fact, from the Bellman-Howard method to the extent that in the latter method α_{n+1} is determined by the condition F[v_n] = L^{α_{n+1}}v_n + f^{α_{n+1}}. P. Mosolov drew the author's attention to the fact that the Bellman-Howard method follows the Newton-Kantorovich method for solving nonlinear functional equations (see [3]). We note that the Bellman-Howard method applied to functional equations led to the quasilinearization method (see Bellman and Kalaba [5]). For one-dimensional control problems, see Mandl [52, 53], Prokhorov [64], Arkin, Kolemayev, and Shiryayev [1], and Safonov [65].

Section 5. The methods developed in this section have been borrowed from [29, 31]. Some hints as to how to find the set Γ can be found in Section 3.4 as well as in [56]. For the solution of equations arising from sequential analysis, see also Shiryayev [69].
2
Auxiliary Propositions
1. Notation and Definitions

In addition to the notation given on pages xi and xii we shall use the following. T is a nonnegative number, and the interval [0,T] is interpreted as an interval of time; the points of this interval are, as a rule, denoted by t, s. D denotes an open set in Euclidean space, D̄ the closure of D, and ∂D the boundary of D. Q denotes an open set in E_{d+1}; the points of Q are expressed as (t,x), where t ∈ E₁, x ∈ E_d. ∂′Q denotes the parabolic boundary of Q (see Section 4.5).

S_R = {x ∈ E_d : |x| < R},  C_{T,R} = (0,T) × S_R,  C_R = C_{∞,R},  H_T = (0,T) × E_d.
In the cases where the middle expression is equal to infinity, we continue to denote it by f llP,ras before. In general, we admit infinite values for various integrals (and mathematical expectations) of measurable functions. These values are considered to be defined if either the positive part or the
11
2 Auxiliary Propositions
negative part of the function has a finite integral. In this case the integral is assumed to be equal to m (- co) if the integral of the positive (negative) part of the function is infinite. For any (possibly, nonmeasurable) function f(x) on T we define an exterior norm in 2,(T), using the formula
+
where the lower bound is taken over the set of all Borel functions h(x) on T such that 1 I h on T. We shall use the fact that the exterior norm satisfies the triangle inequality: I]fl I f211p,r.Also, we shall use -+ 0 as n , co, there is a subsequence {n') for which the fact that if f,.(x) + 0 as n' -+ m (T-a.s.1. B(T) denotes the set of bounded Borel functions on I' with the norm
If
]I f n I P gr
+ f2l P,
]I
+ ]I
C(Γ) denotes the set of continuous (possibly unbounded) functions on Γ. "f is a smooth function" means that f is infinitely differentiable. We say that f has compact support in a region D if it vanishes outside some compact subset of D. C₀^∞(D) denotes the set of all smooth functions with compact support in the region D.

We introduce the notation f_{(l₁)...(l_n)}(t,x) for the derivatives of f(t,x) along the spatial directions l₁, …, l_n. The time derivative is always expressed as (∂/∂t)f(t,x). C²(D̄) denotes the set of functions u(x) twice continuously differentiable in D̄ (i.e., twice continuously differentiable in D and such that u(x) as well as all first and second derivatives of u(x) have extensions continuous in D̄). C^{1,2}(Q̄) denotes the set of functions u(t,x) twice continuously differentiable in x and once continuously differentiable in t in Q̄.

Let D be a bounded region in E_d, and let u(x) be a function in D. We write u ∈ W²(D) if there exists a sequence of functions uⁿ ∈ C²(D̄) such that, as n, m → ∞,

sup_D |uⁿ − u| → 0,  ‖uⁿ_{x_i} − uᵐ_{x_i}‖_{d,D} + ‖uⁿ_{x_ix_j} − uᵐ_{x_ix_j}‖_{d,D} → 0.   (1)

Under the first condition of (1) and due to the continuity of the uⁿ, the functions in W²(D) are continuous in D̄. The second condition in (1) implies that the sequences uⁿ_{x_i}, uⁿ_{x_ix_j} are fundamental in ℒ_d(D). Hence there exist (Borel) functions u_i, u_{ij} ∈ ℒ_d(D) to which uⁿ_{x_i}, uⁿ_{x_ix_j} converge in ℒ_d(D). These sequences converge weakly as well to the functions given above. In particular, taking φ ∈ C₀^∞(D) and integrating by parts, we obtain

∫_D φ uⁿ_{x_i} dx = −∫_D φ_{x_i} uⁿ dx.

Letting n → ∞, we obtain

∫_D φ u_i dx = −∫_D φ_{x_i} u dx.   (2)
1. Definition. Let D ⊂ E_d, let v and h be Borel functions locally summable in D, and let l₁, …, l_n ∈ E_d. The function h is said to be a generalized derivative (in the region D) of the function v of order n in the directions l₁, …, l_n, and this function h is denoted by v_{(l₁)...(l_n)}, if for each φ ∈ C₀^∞(D)

∫_D φ(x)h(x) dx = (−1)ⁿ ∫_D v(x)φ_{(l₁)...(l_n)}(x) dx.
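Definition 1 can be probed numerically. In the sketch below (our own toy example, not from the text) we take d = n = 1 and u(x) = |x|, and check the integration-by-parts identity against the candidate generalized derivative sign(x) for one smooth, compactly supported φ; the grid and the particular φ are our own choices.

```python
import numpy as np

# Numerical check of the identity of Definition 1 for u(x) = |x|:
# for smooth phi with compact support, int phi*sign dx = -int |x|*phi' dx.
x = np.linspace(-2.0, 2.0, 40001)
u = np.abs(x)
h = np.sign(x)                            # candidate generalized derivative

# phi in C_0^infty((-2,2)): smooth bump supported in (-0.2, 0.8)
s = (x - 0.3) / 0.5
phi = np.where(np.abs(s) < 1,
               np.exp(-1.0 / np.maximum(1e-300, 1.0 - s**2)), 0.0)
dphi = np.gradient(phi, x)                # phi' by central differences

lhs = np.trapz(phi * h, x)                # int phi * h dx
rhs = -np.trapz(u * dphi, x)              # -int u * phi' dx
# lhs and rhs agree up to quadrature error, as Definition 1 requires
```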
In the case where the l_i direction coincides with the direction of the r_ith coordinate vector, the above function is expressed as v_{x_{r₁}...x_{r_n}} instead of v_{(l₁)...(l_n)}. The properties of generalized derivatives are well known (see [57,71,72]). We shall list below, without proof, only those properties which we use frequently. Note first that a generalized derivative is defined uniquely almost everywhere. Equation (2) shows that u_i = u_{x_i} in the sense of Definition 1. Similarly, u_{ij} = u_{x_ix_j}. Therefore, the functions u ∈ W²(D) have generalized derivatives up to and including the second order. Furthermore, these derivatives belong to ℒ_d(D). We assume that the values of the first and second derivatives of each function u ∈ W²(D) are fixed at each point. By construction, for the sequence uⁿ entering (1), ‖uⁿ_{x_i} − u_{x_i}‖_{d,D} → 0 and ‖uⁿ_{x_ix_j} − u_{x_ix_j}‖_{d,D} → 0.

The set of functions W²(D) introduced resembles the well-known Sobolev space W²_d(D) (see [46,71,72]). If the boundary of the region D is sufficiently regular, for example, once continuously differentiable, Sobolev's imbedding theorem (see [46,47]) shows that, in fact, W²(D) = W²_d(D). In this case u ∈ W²(D) if and only if u is continuous in D̄, has generalized derivatives up to and including the second order, and, furthermore, these derivatives are summable in D to the power d.

It is seen that if the function u is once continuously differentiable in D, its ordinary first derivatives coincide with its first generalized derivatives (almost everywhere). It turns out (a corollary of Fubini's theorem) that, for example, a generalized derivative u_{x₁} exists in the region D if for almost
2 Auxiliary Propositions
all (x_2^0, …, x_d^0) the function u(x^1, x_2^0, …, x_d^0) is absolutely continuous in x^1 on {x^1 : (x^1, x_2^0, …, x_d^0) ∈ D} and its usual derivative with respect to x^1 is locally summable in D. The converse is also true; however, we may then need to replace the function u by a function equivalent to it with respect to Lebesgue measure. It is well known that if for almost all (x_{k+1}^0, …, x_d^0) the function u(x^1, …, x^k, x_{k+1}^0, …, x_d^0) has a generalized derivative on {(x^1, …, x^k) : (x^1, …, x^k, x_{k+1}^0, …, x_d^0) ∈ D} and, in addition, this derivative is locally summable in D, then u has a generalized derivative in D.

Using the notion of weak convergence, we can easily prove the following: if the functions φ, v^n (n = 0, 1, 2, …) are uniformly bounded in D, v^n → v^0 (D-a.e.), for some l_1, …, l_k the generalized derivatives v^n_{(l_1)⋯(l_k)} exist for n ≥ 1, and |v^n_{(l_1)⋯(l_k)}| ≤ φ (D-a.e.), then the generalized derivative v^0_{(l_1)⋯(l_k)} also exists, |v^0_{(l_1)⋯(l_k)}| ≤ φ (D-a.e.), and

v^n_{(l_1)⋯(l_k)} → v^0_{(l_1)⋯(l_k)}

weakly in ℒ_2 on any bounded subset of the region D.

In many cases, one needs to "mollify" functions to make them smooth. We shall do this in a standard manner. Let ζ(x), ζ_1(t), ζ(t,x) = ζ_1(t)ζ(x) be nonnegative, infinitely differentiable functions of the arguments x ∈ E_d, t ∈ E_1, equal to zero for |x| ≥ 1, |t| ≥ 1, and such that

∫_{E_d} ζ(x) dx = 1,  ∫_{E_1} ζ_1(t) dt = 1.
For ε ≠ 0 and functions u(x), u(t,x) locally summable in E_d, E_1 × E_d, let

u^{(ε)}(x) = ε^{−d} ζ(·/ε) * u(x)  (convolution with respect to x),
u^{(0,ε)}(t,x) = ε^{−d} ζ(·/ε) * u(t,x)  (convolution with respect to x),
u^{(ε)}(t,x) = ε^{−(d+1)} ζ(·/ε, ·/ε) * u(t,x)  (convolution with respect to (t,x)).

The functions u^{(ε)}(x), u^{(0,ε)}(t,x), u^{(ε)}(t,x) are said to be mean functions of the functions u(x), u(t,x). It is a well-known fact (see [10,71]) that u^{(ε)} → u as ε → 0:

a. at each Lebesgue point of the function u, therefore almost everywhere;
b. at each continuity point of the function u, and uniformly in each bounded region if u is continuous;
c. in the norm of ℒ_p(D) if u ∈ ℒ_p(D), where in computing the convolution defining u^{(ε)} the function u is assumed to be equal to zero outside D.

Furthermore, u^{(ε)} is infinitely differentiable. If a generalized derivative u_{(l)} exists in E_d, then [u_{(l)}]^{(ε)} = [u^{(ε)}]_{(l)}. Finally, for p ≥ 1,

‖u^{(ε)}‖_{p,E_d} ≤ ‖u‖_{p,E_d},  ‖u^{(ε)}‖_{B(E_d)} ≤ ‖u‖_{B(E_d)}.
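The mean functions above are ordinary mollifications and are easy to reproduce numerically. In the following sketch (illustrative; the kernel ζ, the sample function u, and the grid are assumptions, not from the text) convergence u^{(ε)} → u is checked at a continuity point in dimension d = 1:

```python
import numpy as np

# Mollification in d = 1: u_eps(x) = (1/eps) * integral zeta((x - y)/eps) u(y) dy,
# with zeta nonnegative, C-infinity, supported in |x| < 1, of integral one.
# Kernel, sample function u, and grid are illustrative choices.

grid = np.linspace(-3.0, 3.0, 6001)
dx = grid[1] - grid[0]

def zeta(x):
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

Z = np.trapz(zeta(grid), grid)  # normalizing constant: kernel integrates to 1

def mollify(u_vals, eps):
    kernel = zeta(grid / eps) / (eps * Z)
    # discrete convolution with respect to x; "same" keeps the grid alignment
    return np.convolve(u_vals, kernel, mode="same") * dx

# a locally summable u: a jump at 0 plus a linear part; continuous at x = 1
u = np.where(grid > 0.0, 1.0, 0.0) + 0.5 * grid

i = np.searchsorted(grid, 1.0)  # index of the continuity point x = 1
for eps in (0.5, 0.1, 0.02):
    u_eps = mollify(u, eps)
    assert abs(u_eps[i] - u[i]) < eps  # u_eps -> u at a continuity point
```

At the jump point x = 0 the mean functions instead converge to the average of the one-sided limits, which is consistent with property (a), since 0 is not a Lebesgue point of this u.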
Considering the functions u^{(ε)}, one proves that the generalized derivative u_{x^1} of a function u(x) continuous in D does not exceed a constant N in absolute value almost everywhere if and only if the function u(x) satisfies in D the Lipschitz condition with respect to x^1 with this constant, that is, if for any points x_1, x_2 ∈ D such that the interval with endpoints x_1, x_2 lies in D and x_1^i = x_2^i (i = 2, …, d), the inequality |u(x_1) − u(x_2)| ≤ N|x_1 − x_2| is satisfied. It turns out that if a bounded function v has a bounded generalized derivative, then v² also has a generalized derivative, and one can use the usual formulas to find this generalized derivative. In addition to the space W²(D) we need the spaces W̊²(D), W^{1,2}(Q), and W̊^{1,2}(Q), which are introduced for bounded regions D, Q in a way similar to the way W²(D) was, starting from the sets of functions C̊²(D), C^{1,2}(Q̄), and C̊^{1,2}(Q), respectively, and using the appropriate norms.
For proving existence of generalized derivatives of a payoff function another notion proves to be useful.
2. Definition. Let a function u(x) be given, and let it be locally summable in a region D. Let ν(Γ) be a function of a set Γ which is defined, σ-additive, and finite on the σ-algebra of Borel subsets of each bounded region D′ with D̄′ ⊂ D. We say that the set function ν on D is a generalized derivative of the function u in the directions l_1, …, l_k, and we write

ν(dx) = u_{(l_1)⋯(l_k)}(x)(dx),   (3)

if for each function φ ∈ C_0^∞(D)

∫_D φ(x) ν(dx) = (−1)^k ∫_D u(x) φ_{(l_1)⋯(l_k)}(x) dx.   (4)
The generalized derivative (∂/∂t)u(t,x)(dt dx) for a function u(t,x) locally summable in a region Q is defined in a similar way. The definitions given above immediately imply the following properties. It is easily seen that there exists at most one set function ν(dx) satisfying (4) for all φ ∈ C_0^∞(D). If the function u_{(l_1)⋯(l_k)}(x) exists as a generalized derivative of u in the directions l_1, …, l_k in the sense of Definition 1, then, setting ν(dx) = u_{(l_1)⋯(l_k)}(x) dx, we obtain in an obvious manner a set function
ν which is the generalized derivative of u in the directions l_1, …, l_k in the sense of Definition 2. Conversely, if the set function ν in Definition 2 is absolutely continuous with respect to Lebesgue measure, its Radon–Nikodym derivative will satisfy Definition 1 by virtue of (4). Therefore, this Radon–Nikodym derivative is the generalized derivative u_{(l_1)⋯(l_k)}(x). This fact justifies the notation in (3). In the case where the direction l_i coincides with the direction of the x^{r_i} coordinate vector, we shall write u_{x^{r_1}⋯x^{r_k}}(x)(dx).
Using the uniqueness property of a generalized derivative, we easily prove that if the derivatives u_{(l_1)⋯(l_k)}(x)(dx) for some k exist for all l_1, …, l_k, then

u_{(l_1)⋯(l_k)}(x)(dx) = |l_1| ⋯ |l_k| u_{(l̄_1)⋯(l̄_k)}(x)(dx)

for |l_1| ⋯ |l_k| ≠ 0, where l̄ denotes l/|l|. Further, if the derivatives u_{(l)(l)}(x)(dx) exist for all l, then all the derivatives u_{(l_1)(l_2)}(x)(dx) exist as well. In this case, if |l_1| · |l_2| ≠ 0, then

u_{(l_1)(l_2)}(x)(dx) = ¼[ |l_1 + l_2|² u_{(m)(m)}(x)(dx) − |l_1 − l_2|² u_{(n)(n)}(x)(dx) ],  m = (l_1 + l_2)/|l_1 + l_2|,  n = (l_1 − l_2)/|l_1 − l_2|.

In fact, using Definition 2, we easily prove that the right side of this formula satisfies Definition 2 for k = 2. Theorem V of [67, Chapter 1, §1] constitutes the main tool enabling us to prove the existence of u_{(l_1)⋯(l_k)}(x)(dx); in accord with this theorem from [67], a nonnegative generalized function is a measure. Regarding

(−1)^k ∫_D u(x) φ_{(l_1)⋯(l_k)}(x) dx − (−1)^k ∫_D φ(x) ν(dx)   (5)

as a generalized function, we have the following.
3. Lemma. Let u(x), ν(Γ) be the same as those in the first two sentences of Definition 2. For each nonnegative φ ∈ C_0^∞(D) let the expression (5) be nonnegative. Then there exists a generalized derivative u_{(l_1)⋯(l_k)}(x)(dx) in the sense of Definition 2. In this case, inside D,

(−1)^k u_{(l_1)⋯(l_k)}(x)(dx) ≥ (−1)^k ν(dx),

that is, for all bounded Borel Γ with Γ̄ ⊂ D,

(−1)^k u_{(l_1)⋯(l_k)}(Γ) ≥ (−1)^k ν(Γ).
To conclude the discussion in this section, we summarize more or less conventional agreements and notation.

(w_t, F_t) is a Wiener process (see Appendix 1). F_τ is the σ-algebra consisting of all those sets A for which A ∩ {τ ≤ t} ∈ F_t for all t. 𝔐(t) denotes the set of all Markov (with respect to {F_t}) times τ not exceeding t (see Appendix 1). C([0,T],E_d) denotes the Banach space of continuous functions on [0,T] with range in E_d; 𝒩_t is the smallest σ-algebra of subsets of C([0,T],E_d) which contains all sets of the form {x_· ∈ C([0,T],E_d) : x_s ∈ Γ}, where s ≤ t and Γ is a Borel subset of E_d. l.i.m. denotes the mean square limit; ess sup denotes the essential upper bound (with respect to the measure which is implied); inf ∅ = ∞; f(x_τ) ≡ f(x_τ)χ_{τ<∞}. When we speak about measurable functions (sets), we mean, as a rule, Borel functions (sets). The words "nonnegative," "nonpositive," "it does not increase," "it does not decrease" are used in the same sense as the words "positive," "negative," "it decreases," "it increases," respectively. Finally,

Δ = Σ_{i=1}^d ∂²/∂(x^i)²

denotes the Laplace operator. The operators L^α, F[u], F_t[u] used in Chapters 4–6 are defined in the introductory section of Chapter 4.
2. Estimates of the Distribution of a Stochastic Integral in a Bounded Region

Let A be a set of pairs (σ, b), where σ is a matrix of dimension d × d and b is a d-dimensional vector. We assume that a random process (σ_t, b_t) ∈ A for all (ω, t), and that the process

x_t = x_0 + ∫_0^t σ_s dw_s + ∫_0^t b_s ds

is defined. We shall see further that in stochastic control, estimates of the form

M ∫_0^{τ_D} |f(t, x_t)| dt ≤ N ‖f‖_{d+1,Q}   (1)

play an essential role; in (1) f is an arbitrary Borel function, τ_D is the first exit time of x_t from the region D, and Q = (0,∞) × D. A crucial fact here is that the constant N does not depend on a specific process (σ_t, b_t), but is determined instead by the set A. In this section, our objective is to derive a few versions of the estimate (1).

We assume that D is a bounded region in E_d, x_0 is a fixed point of D, d_1 ≥ d is an integer, (w_t, F_t) is a d_1-dimensional Wiener process, σ_t(ω) is a matrix of dimension d × d_1, b_t(ω) is a d-dimensional vector, and c_t(ω), r_t(ω) are nonnegative numbers. Assume in addition that σ_t, b_t, c_t, r_t are progressively measurable with respect to {F_t} and that they are bounded functions of (t, ω). Let a_t = ½σ_tσ_t^*. Next, let p be a fixed number, p ≥ d, and let

y_{s,t} = ∫_s^t r_u du,  φ_{s,t} = ∫_s^t c_u du,  ψ_t = c_t^{(p−d)/(p+1)} (r_t det a_t)^{1/(p+1)}.
One should keep in mind that for p = d the expression c_t^{(p−d)/(p+1)} is equal to unity even if c_t = 0; therefore ψ_t = (r_t det a_t)^{1/(d+1)} for p = d.

1. Definition. A nonnegative function F(c,a), defined on the set of all nonnegative numbers c and all nonnegative definite symmetric matrices a of dimension d × d, is said to be regular if for each ε > 0 there is a constant k(ε) such that for all c, a, and unit vectors l

F(c,a) ≤ ε tr a + k(ε)[c + (al,l)].
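The objects entering the estimate (1) are straightforward to simulate. The following sketch (illustrative; the constant coefficients σ and b, the ball D, the step size, and the path count are assumptions, not from the text) generates Euler–Maruyama paths of x_t and records the first exit time τ_D:

```python
import numpy as np

# Sketch: Euler-Maruyama paths of x_t = x_0 + int sigma dw + int b dt in d = 2,
# with the first exit time tau_D from the ball D = {|x| < 1}. The constant
# coefficients and step size below are illustrative choices, not from the text.

rng = np.random.default_rng(0)
d, dt, n_steps, n_paths = 2, 1e-3, 20000, 200
sigma = 0.7 * np.eye(d)    # sigma_t, here constant; a_t = sigma sigma^T / 2
b = np.array([0.1, 0.0])   # b_t, here constant

exit_times = np.empty(n_paths)
for k in range(n_paths):
    x = np.zeros(d)        # x_0 = 0, an interior point of D
    tau = n_steps * dt     # value used if the path never leaves D in time
    for i in range(n_steps):
        dw = rng.normal(scale=np.sqrt(dt), size=d)
        x = x + sigma @ dw + b * dt
        if np.linalg.norm(x) >= 1.0:  # exit from D
            tau = (i + 1) * dt
            break
    exit_times[k] = tau

# for nondegenerate sigma on a bounded region, M tau_D is finite
assert 0.0 < exit_times.mean() < n_steps * dt
```

With f ≡ 1 the left side of (1) is just M τ_D, so averages like the one computed here are exactly the kind of quantity that the estimates of this section control uniformly over the class A.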
2. Theorem. Assume that |b_t| ≤ F(c_t,a_t) for all (t,ω) for some regular function F(c,a). Then there exist constants N_1, N_2, depending only on d, the function F(c,a), and the diameter of the region D, such that for all s ≥ 0 and Borel f(t,x) and g(x), on the set {τ_D ≥ s}, almost surely

M{∫_s^{τ_D} e^{−φ_{s,t}} ψ_t |f(y_{s,t}, x_t)| dt | F_s} ≤ N_1 ‖f‖_{p+1,Q},   (2)

M{∫_s^{τ_D} e^{−φ_{s,t}} c_t^{(p−d)/p} (det a_t)^{1/p} g(x_t) dt | F_s} ≤ N_2 ‖g‖_{p,D}.   (3)
Before proving the theorem, we discuss its assertions and give examples of regular functions. Note that the left sides of the inequalities (2) and (3) make sense because of the measurability requirements.

It is seen that the function F(c,a) = c is regular. Next, in conjunction with Young's inequality,

xy ≤ x^p/p + y^q/q  if x, y ≥ 0, p^{−1} + q^{−1} = 1.

Hence for α ∈ (0,1), ε ∈ (0,1)

c^α (tr a)^{1−α} ≤ (1−α) ε tr a + α ε^{−(1−α)/α} c.

Therefore, c^α (tr a)^{1−α} is a regular function for α ∈ (0,1).
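The Young-type bound above admits a direct numerical check; in the following sketch the exponents, weights, and random samples are illustrative choices:

```python
import numpy as np

# Numerical check of the bound
#   c^alpha * (tr a)^(1-alpha) <= (1-alpha)*eps*tr(a) + alpha*eps^(-(1-alpha)/alpha)*c,
# a form of Young's (weighted arithmetic-geometric mean) inequality.
# alpha, eps, and the sampled values are illustrative.

rng = np.random.default_rng(1)
for alpha in (0.2, 0.5, 0.8):
    for eps in (0.1, 1.0, 10.0):
        c = rng.uniform(0.0, 100.0, size=1000)
        tr_a = rng.uniform(0.0, 100.0, size=1000)
        lhs = c ** alpha * tr_a ** (1.0 - alpha)
        rhs = (1.0 - alpha) * eps * tr_a + alpha * eps ** (-(1.0 - alpha) / alpha) * c
        assert np.all(lhs <= rhs + 1e-9)
```

The inequality follows from x^α y^{1−α} ≤ αx + (1−α)y applied with x = ε^{−(1−α)/α} c and y = ε tr a, which is why the check succeeds for every sampled pair.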
We show that the function (det a)^{1/d}, which does not depend on c, is regular. Let μ_1 ≤ μ_2 ≤ ⋯ ≤ μ_d be the eigenvalues of the matrix a. We know that μ_1 ≤ (al,l) if |l| = 1. Further, det a = μ_1μ_2⋯μ_d and tr a = μ_1 + μ_2 + ⋯ + μ_d. From this, in conjunction with Young's inequality, we have

(det a)^{1/d} ≤ μ_1^{1/d} (tr a)^{(d−1)/d} ≤ ε tr a + k(ε) μ_1 ≤ ε tr a + k(ε)(al,l).

Using the regular functions given above, we can construct many other regular functions, noting that a linear combination of regular functions with positive coefficients is a regular function. The function tr a is the limit of the regular functions c^α(tr a)^{1−α} as α ↓ 0. However, for d ≥ 2 the function tr a is not regular. To prove this, we suggest that the reader consider the following exercise.
3. Exercise. For p = d, c_t ≡ 0, s = 0, g ≡ 1 it follows from (3) that

M ∫_0^{τ_D} (det a_t)^{1/d} dt ≤ N_2 (meas D)^{1/d}.   (4)

In the statement of Theorem 2 take D = S_R and F(c,a) = K tr a with K > R^{−1}. It is required to prove that for d ≥ 2 there exists no constant N_2, depending only on d, K, R, for which (4) can be satisfied.
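The failure asserted in the exercise stems from degenerate matrices, for which the regularity inequality cannot hold. A small sketch (with illustrative values) shows why tr a is not regular for d = 2:

```python
import numpy as np

# Sketch: for d >= 2 the function tr(a) is NOT regular. Regularity would require
#   tr(a) <= eps * tr(a) + k(eps) * [c + (a l, l)]   for every unit vector l.
# Take c = 0 and the degenerate matrix a = diag(n, 0); with l = (0, 1) the right
# side is eps * n while the left side is n, so no constant k(eps) can work once
# eps < 1. The values below are illustrative.

eps, k_eps = 0.5, 1e6        # any candidate constant k(eps), however large
l = np.array([0.0, 1.0])     # unit vector in the degenerate direction
for n in (1.0, 1e3, 1e9):
    a = np.diag([n, 0.0])
    c = 0.0
    lhs = np.trace(a)
    rhs = eps * np.trace(a) + k_eps * (c + l @ a @ l)
    # the attempted bound fails for every n, since (a l, l) = 0:
    assert lhs > rhs
```

The same degeneracy is what makes F(c,a) = K tr a, K > R^{−1}, a legitimate drift bound in the exercise while destroying the uniform estimate (4).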
This exercise illustrates the fact that the requirement |b_t| ≤ F(c_t,a_t), where F is a regular function, is essential. In contrast to this requirement, we can weaken considerably the assumption that σ, b, c, r are bounded. For example, considering instead of the processes x_t, y_{s,t} the processes

x̃_t = x_0 + ∫_0^t χ_{u<τ_D} σ_u dw_u + ∫_0^t χ_{u<τ_D} b_u du,  ỹ_{s,t} = ∫_s^t χ_{u<τ_D} r_u du,

where τ_D is the time of first departure of x_t from D, and noting that x_t = x̃_t, y_{s,t} = ỹ_{s,t} for t ≤ τ_D, we immediately establish the assertion of Theorem 2 in the case where χ_{t<τ_D}σ_t, χ_{t<τ_D}b_t, χ_{t<τ_D}c_t, χ_{t<τ_D}r_t are bounded functions of (t,ω).

We think that the case s = 0, r_t ≡ 1, p = d is the most important particular case of Theorem 2. It is easily seen, in fact, that the proof of our theorem follows in general from this particular case; the formal derivation is, however, rather involved. It should be noted that, with our approach to the proof of the theorem, allowing s ≠ 0 and r_t ≢ 1 actually makes proving the estimates for s = 0, r_t ≡ 1 essentially easier.

In the future, it will be convenient to use the following weakened version of the assertions of Theorem 2.
4. Theorem. Let τ be a Markov time (with respect to {F_t}) not exceeding τ_D. Also, let there exist constants K, δ > 0 such that for all t < τ(ω), λ ∈ E_d,

|b_t(ω)| ≤ K,  Σ_{i,j=1}^d a_t^{ij}(ω) λ^i λ^j ≥ δ|λ|².

Then there exists a constant N, depending only on d, K, δ, and the diameter of the region D, such that for all s ≥ 0 and Borel f(t,x) and g(x), on the set {s ≤ τ}, almost surely

M{∫_s^τ |f(t, x_t)| dt | F_s} ≤ N ‖f‖_{d+1,Q},  M{∫_s^τ g(x_t) dt | F_s} ≤ N ‖g‖_{d,D}.

This theorem follows immediately from Theorem 2 for r_t ≡ 1, c_t = 0, p = d. In fact, we have

x̃_t = x_0 + ∫_0^t χ_{u<τ} σ_u dw_u + ∫_0^t χ_{u<τ} b_u du,

since e^{−φ_{s,t}} ψ_t = (det a_t)^{1/(d+1)} and det a_t, which is equal to the product of the eigenvalues of the matrix a_t, is not smaller than δ^d for t ≤ τ. Furthermore, |χ_{t<τ} b_t| ≤ K δ^{−1}(det χ_{t<τ} a_t)^{1/d}, the function F(c,a) = Kδ^{−1}(det a)^{1/d} is regular and, in addition, {s ≤ τ_D} ⊇ {s ≤ τ}.

Next, in order to prove Theorem 2, we need three lemmas.

5. Lemma. Let |b_t| ≤ F(c_t,a_t) for all (t,ω) for some regular function F(c,a). Then there exists a constant N, depending only on the function F(c,a) and the diameter of the region D, such that on the set {τ_D ≥ s} almost surely

M{∫_s^{τ_D} e^{−φ_{s,t}} F(c_t,a_t) dt | F_s} ≤ N.
PROOF. We can assume without loss of generality that x_0 = 0. We denote by R the diameter of the region D and set u(x) = β − ch α|x| for α > 0, β > ch(αR). We note that u(x) is twice continuously differentiable and u(x) ≥ 0 for x ∈ D. Applying Ito's formula to e^{−φ_{s,t}} u(x_t), we have for t ≥ s on the set {τ_D ≥ s} that

M{e^{−φ_{s,t∧τ_D}} u(x_{t∧τ_D}) | F_s} = u(x_s) − M{∫_s^{t∧τ_D} e^{−φ_{s,r}} [c_r u(x_r) − L_r u(x_r)] dr | F_s},

where L_r u = Σ_{i,j} a_r^{ij} u_{x^i x^j} + Σ_i b_r^i u_{x^i}.
Assume that for all x ∈ D, r ≥ 0,

c_r u(x) − L_r u(x) ≥ F(c_r, a_r).   (5)

Then

M{∫_s^{t∧τ_D} e^{−φ_{s,r}} F(c_r,a_r) dr | F_s} ≤ u(x_s) ≤ β,

which proves the assertion of the lemma as t → ∞, with the aid of Fatou's lemma. Therefore, it remains only to choose constants α, β such that (5) is satisfied; obviously, it suffices to consider x ≠ 0. For simplicity of notation, we shall not write the subscript r in c_r, a_r, σ_r, b_r. In addition, let λ = x/|x|, ρ = |x|. A simple computation shows that

cu(x) − Lu(x) = c(β − ch αρ) + α sh αρ (b,λ) + α² ch αρ (aλ,λ) + (α sh αρ / ρ)[tr a − (aλ,λ)].   (6)

We note that ch αρ ≥ 1, ch αρ ≥ sh αρ, α sh αρ ≥ α²ρ, and, for x ∈ D, the number ρ ≤ R. Hence, using (b,λ) ≥ −|b| ≥ −F(c,a), it follows from (6) that

cu(x) − Lu(x) − F(c,a) ≥ (1 + α sh αρ) { c(β − ch αR)/(1 + α sh αR) + (α²/(1+α))(aλ,λ) + (α²/(1+α²R))[tr a − (aλ,λ)] − F(c,a) }.

We recall that F(c,a) is a regular function. Also, we fix some ε < 1/R and choose α large enough that α²/(1+α²R) > ε and α²/(1+α) ≥ k(ε) + ε. Next, we take a number β so large that c(β − ch αR)/(1 + α sh αR) ≥ k(ε)c. Then

cu − Lu − F(c,a) ≥ (1 + α sh αρ){ k(ε)[c + (aλ,λ)] + ε tr a − F(c,a) } ≥ 0,

thus proving the lemma.
6. Corollary. Let G(c,a) be a regular function. Then there exists a constant N, depending only on F(c,a), G(c,a), and the diameter of the region D, such that on the set {τ_D ≥ s} almost surely

M{∫_s^{τ_D} e^{−φ_{s,t}} G(c_t,a_t) dt | F_s} ≤ N.

In fact, let F_1(c,a) = F(c,a) + G(c,a). Then |b_t| ≤ F_1(c_t,a_t) and G(c_t,a_t) ≤ F_1(c_t,a_t), and the assertion follows from Lemma 5 applied to F_1(c,a).

7. Lemma. Let R > 0, h(t,x) ≥ 0, h ∈ ℒ_{d+1}(C_R), h(t,x) = 0 for t ≤ 0, h(t,x) = 0 for |x| ≥ R. Then on (−∞,∞) × E_d there exists a bounded function z(t,x) ≥ 0, equal to zero for t ≤ 0, such that for all sufficiently small ε > 0 and all nonnegative definite symmetric matrices a = (a^{ij}), on the cylinder C_R,

∂z^{(ε)}/∂t + Σ_{i,j} a^{ij} z^{(ε)}_{x^i x^j} ≤ −N(d)(det a)^{1/(d+1)} h^{(ε)},

where N(d) > 0. Furthermore, if the vector b and the number c are such that |b| ≤ (R/2)c, then on the same set b^i z^{(ε)}_{x^i} ≥ −c z^{(ε)} if ε is sufficiently small. Finally, for all t ≥ 0, x ∈ E_d,

z(t,x) ≤ N(d) ‖h‖_{d+1,C_R}.

This lemma is proved in [42] by geometric arguments.
8. Lemma. Let |b_t| ≤ F(c_t,a_t) for all (t,ω) for a regular function F(c,a). Then there exists a constant N, depending only on d, F(c,a), and the diameter of D, such that for all s ≥ 0 and Borel f(t,x), on the set {τ_D ≥ s}, almost surely

M{∫_s^{τ_D} e^{−φ_{s,t}} ψ_t |f(y_{s,t}, x_t)| dt | F_s} ≤ N ‖f‖_{d+1,Q}.

In other words, the inequality (2) holds for p = d.
PROOF. We use the notation introduced above:

y_{s,t} = ∫_s^t r_u du,  φ_{s,t} = ∫_s^t c_u du,  ψ_t = (r_t det a_t)^{1/(d+1)}.

We denote by R the diameter of D and consider without loss of generality that x_0 = 0; in this case D ⊂ S_R. Also, we let τ_R be the first exit time of x_t from S_R. It is seen that τ_R ≥ τ_D. Suppose that we have proved the inequality

M{∫_s^{τ_R} e^{−φ_{s,t}} ψ_t f(y_{s,t}, x_t) dt | F_s} ≤ N ‖f‖_{d+1,C_R}   (7)

({τ_R ≥ s}-a.s.) for arbitrary s, f, where N = N(d,F,R). Furthermore, taking
in (7) the function f equal to zero for x ∉ D, we obtain

M{∫_s^{τ_D} e^{−φ_{s,t}} ψ_t |f(y_{s,t}, x_t)| dt | F_s} ≤ N ‖f‖_{d+1,C_R} = N ‖f‖_{d+1,Q}

({τ_R ≥ s}-a.s.) and, a fortiori, ({τ_D ≥ s}-a.s.). It suffices therefore to prove (7). The usual reasoning (using, for example, the results given in [54, Chapter 1, §2]) shows that it suffices to prove (7) only for bounded continuous nonnegative f(t,x). Noting in addition that, by Fatou's lemma, for such a function the left side of (7) computed for r_t does not exceed the lower limit of the corresponding expressions computed for r_t + δ as δ ↓ 0, we conclude that it is enough to consider the case where r_t(ω) > 0 for all (t,ω).

We fix T > 0 and set h(y,x) = f(T − y, x) for 0 < y ≤ T, x ∈ S_R, and h = 0 in all remaining cases. Using Lemma 7, we find the corresponding function z. Let γ = γ_{s,T} be the first exit time of the process (y_{s,t}, x_t), considered for t ≥ s, from the set [0,T) × S_R. We apply Ito's formula to the expression e^{−φ_{s,t}} z^{(ε)}(T − y_{s,t}, x_t) for ε > 0, t ≥ s.
Using the properties of z^{(ε)} given in Lemma 7 for small ε > 0, and noting that z^{(ε)} ≥ 0, we obtain an inequality in which we carry the term containing z^{(ε)} to the left side. Using also the estimate |z^{(ε)}| ≤ sup z ≤ N‖h‖_{d+1,C_R} ≤ N‖f‖_{d+1,C_R}, this yields

M{∫_s^γ e^{−φ_{s,u}} ψ_u h^{(ε)}(T − y_{s,u}, x_u) du | F_s} ≤ N(d)‖f‖_{d+1,C_R} [1 + M{∫_s^γ e^{−φ_{s,u}} |b_u| du | F_s}],

where y_{s,u} ∈ (0,T) for u ∈ (s,γ) by virtue of the condition r_t > 0 and, in addition, x_u ∈ S_R; hence the function h is continuous at the point (T − y_{s,u}, x_u) and h(T − y_{s,u}, x_u) = f(y_{s,u}, x_u). Letting ε tend to zero in the last inequality, we obtain, using Fatou's lemma,

M{∫_s^γ e^{−φ_{s,u}} ψ_u f(y_{s,u}, x_u) du | F_s} ≤ N(d)‖f‖_{d+1,C_R} [1 + M{∫_s^γ e^{−φ_{s,u}} |b_u| du | F_s}].

Further, on the set {τ_R ≥ s} it is seen that γ ≤ τ_R. Therefore, by Lemma 5,

M{∫_s^γ e^{−φ_{s,u}} |b_u| du | F_s} ≤ M{∫_s^{τ_R} e^{−φ_{s,u}} F(c_u,a_u) du | F_s} ≤ N.

Finally, on the set {τ_R ≥ s}, for all T > 0, t > s, we obtain

M{∫_s^{γ_{s,T}∧t} e^{−φ_{s,u}} ψ_u f(y_{s,u}, x_u) du | F_s} ≤ N ‖f‖_{d+1,C_R}.
It remains only to let first t → ∞, then T → ∞, and then to use Fatou's lemma as well as the fact that obviously γ_{s,T} → τ_R as T → ∞ on the set {τ_R ≥ s}. We have thus proved the lemma.

9. Proof of Theorem 2. We note first that it suffices to prove Theorem 2 only for p = d. In fact, for p > d, in accord with Hölder's inequality (with exponents p/d and p/(p−d)), for example,

M{∫_s^{τ_D} e^{−φ_{s,t}} c_t^{(p−d)/p} (det a_t)^{1/p} g(x_t) dt | F_s}
≤ (M{∫_s^{τ_D} e^{−φ_{s,t}} (det a_t)^{1/d} |g(x_t)|^{p/d} dt | F_s})^{d/p} (M{∫_s^{τ_D} e^{−φ_{s,t}} c_t dt | F_s})^{(p−d)/p}.

In this case ∫_s^{τ_D} e^{−φ_{s,t}} c_t dt = 1 − e^{−φ_{s,τ_D}} ≤ 1, and if we have proved the theorem for p = d, the first factor does not exceed [N(d,F,D)‖|g|^{p/d}‖_{d,D}]^{d/p} = N^{d/p}(d,F,D)‖g‖_{p,D} ≤ (N(d,F,D) + 1)‖g‖_{p,D}. The inequality (2) was proved for p = d in Lemma 8. Therefore, it suffices to prove that

M{∫_s^{τ_D} e^{−φ_{s,t}} (det a_t)^{1/d} g(x_t) dt | F_s} ≤ N ‖g‖_{d,D}
({τ_D ≥ s}-a.s.) for all g. We can consider without loss of generality that g is a nonnegative bounded function. In this case, since (det a)^{1/d} is a regular function, the number

v = sup_{s,ω} ess sup M{∫_s^{τ_D} (det a_t)^{1/d} g(x_t) e^{−φ_{s,t}} dt | F_s}

is finite by Corollary 6. If v = 0, we have nothing to prove; hence we assume that v > 0. Using Fubini's theorem or integrating by parts, we obtain for any numbers t_1 < t_2 and nonnegative functions h(t), r(t) that

∫_{t_1}^{t_2} h(t) dt = ∫_{t_1}^{t_2} exp{−∫_{t_1}^t r(u) du} h(t) dt + ∫_{t_1}^{t_2} exp{−∫_{t_1}^t r(u) du} r(t) (∫_t^{t_2} h(u) du) dt.

From this, for s ≥ 0, A ∈ F_s, r_t = (1/v) g(x_t)(det a_t)^{1/d}, h_t = (det a_t)^{1/d} g(x_t), we find, writing y_{s,t} = ∫_s^t r_u du,

M χ_A ∫_s^{τ_D} h_t e^{−φ_{s,t}} dt = M χ_A ∫_s^{τ_D} e^{−y_{s,t}} h_t e^{−φ_{s,t}} dt + M χ_A ∫_s^{τ_D} e^{−y_{s,t}} r_t e^{−φ_{s,t}} M{∫_t^{τ_D} h_u e^{−φ_{t,u}} du | F_t} dt,

where the last term does not exceed

v M χ_A ∫_s^{τ_D} e^{−y_{s,t}} r_t e^{−φ_{s,t}} dt = M χ_A ∫_s^{τ_D} e^{−y_{s,t}} h_t e^{−φ_{s,t}} dt.

Therefore,

M{∫_s^{τ_D} h_t e^{−φ_{s,t}} dt | F_s} ≤ 2 M{∫_s^{τ_D} h_t e^{−y_{s,t}} e^{−φ_{s,t}} dt | F_s}.

With this choice of r_t (for p = d) we have ψ_t f(y_{s,t}, x_t) = v^{−1/(d+1)} h_t e^{−y_{s,t}}, where f(t,x) = e^{−t} g^{d/(d+1)}(x). Consequently, by Lemma 8,

M{∫_s^{τ_D} h_t e^{−φ_{s,t}} dt | F_s} ≤ 2 v^{1/(d+1)} N ‖f‖_{d+1,Q} ≤ N v^{1/(d+1)} ‖g‖_{d,D}^{d/(d+1)},

where the constants N (which differ from one another) depend only on d, the function F(c,a), and the diameter of D. The last inequality is equivalent to the fact that ({τ_D ≥ s}-a.s.)

M{∫_s^{τ_D} (det a_t)^{1/d} g(x_t) e^{−φ_{s,t}} dt | F_s} ≤ N v^{1/(d+1)} ‖g‖_{d,D}^{d/(d+1)}.
From this, taking the upper bounds, we find v ≤ N v^{1/(d+1)} ‖g‖_{d,D}^{d/(d+1)}, whence v ≤ N ‖g‖_{d,D}, thus completing the proof of Theorem 2.

10. Remark. Let δ > 0. The function F(c,a) is said to be δ-regular if for some ε ∈ (0,δ) there is a constant k(ε) such that for all c, a, and unit vectors λ

F(c,a) ≤ ε tr a + k(ε)[c + (aλ,λ)].

In the sense of the above definition, a function which is δ-regular for every δ > 0 is a regular function. Repeating almost word for word the proofs of Lemmas 5 and 8 and the proof of Theorem 2, we convince ourselves that if the region D lies in a ball of radius R, |b_t| ≤ F(c_t,a_t) for all (t,ω), and F(c,a) is an R^{−1}-regular function, then there exist constants N_1, N_2, depending only on d, F(c,a), and R, such that the inequalities (2) and (3) are satisfied.

11. Exercise. Let d ≥ 2, D = S_R, ε > 0. Give an example of an (R^{−1} + ε)-regular function F(c,a) for which the assertions of Theorem 2 do not hold. (Hint: see Exercise 3.)
12. Exercise. Let z^{(ε)} be the function from Lemma 7. Prove that for sufficiently small ε the function z^{(ε)}(t,x) decreases in t and is convex downward with respect to x on the cylinder C_R.
3. Estimates of the Distribution of a Stochastic Integral in the Whole Space

In this section¹ we shall estimate expressions of the form M ∫_0^∞ |f(t,x_t)| dt using the ℒ_p-norm of f; that is, we extend the estimates from Section 2.2 to the case D = E_d. We use in this section the assumptions and notation introduced at the beginning of Section 2.2. Furthermore, let

y_t = y_{0,t},  φ_t = φ_{0,t}.

Throughout this section we shall have two numbers K_1, K_2 > 0 fixed, and we assume permanently that

|b_t| ≤ K_1 c_t,  tr a_t ≤ K_2 c_t

for all (t,ω). Note immediately that under this condition |b_t| does not exceed the regular function F(c_t,a_t) ≡ K_1 c_t. First we prove a version of Theorem 2.2.

1. Lemma. Let R > 0, let τ be a Markov time with respect to {F_t}, and let τ_R = inf{t ≥ τ : |x_t − x_τ| ≥ R}.² Then there exists a constant N = N(d,K_1,R) such that for any Borel f(t,x)

M{∫_τ^{τ_R} e^{−φ_t} ψ_t |f(y_t, x_t)| dt | F_τ} ≤ N χ_{τ<∞} e^{−φ_τ} ‖f‖_{p+1,E_{d+1}}  (a.s.).

PROOF. First, let τ be nonrandom and finite. For t ≥ 0 we set F'_t = F_{τ+t}, w'_t = w_{τ+t} − w_τ, and denote by σ'_t, b'_t, c'_t, r'_t, φ'_t, ψ'_t, x'_t, y'_t the objects constructed from σ_{τ+t}, b_{τ+t}, c_{τ+t}, r_{τ+t} in the same way as the unprimed objects were constructed from σ_t, b_t, c_t, r_t; τ' is the first exit time of the process x'_t from S_R. It is then seen that

M{∫_τ^{τ_R} e^{−φ_t} ψ_t |f(y_t, x_t)| dt | F_τ} = e^{−φ_τ} M{∫_0^{τ'} e^{−φ'_t} ψ'_t |f(y'_t + y_τ, x'_t + x_τ)| dt | F'_0}  (a.s.).

¹ Also, see Theorem 4.1.8.
² inf ∅ = ∞.
Furthermore, (w'_t, F'_t) is a Wiener process. In addition, by Theorem 2.2,

M ∫_0^{τ'} e^{−φ'_t} ψ'_t |f(y'_t + y, x'_t + x)| dt ≤ N ‖f‖_{p+1,E_{d+1}}

for any x ∈ E_d, y ≥ 0. In order to prove our lemma for constant τ, it remains to replace y, x by the F'_0-measurable variables y_τ, x_τ in the last inequality. To do as indicated, we let κ_n(t) = (k+1)/2^n for t ∈ (k/2^n, (k+1)/2^n], κ_n(x) = κ_n(x^1, …, x^d) = (κ_n(x^1), …, κ_n(x^d)). Note that κ_n(t) ↓ t for all t ∈ (−∞,∞) and κ_n(x) → x for all x ∈ E_d. From the very start, we can consider without loss of generality that f is a continuous nonnegative function. We denote by T_n^1, T_n^d the sets of values of the functions κ_n(t), κ_n(x), respectively. Using Fatou's lemma, we obtain for the function f mentioned

M{∫_0^{τ'} e^{−φ'_t} ψ'_t f(y'_t + y_τ, x'_t + x_τ) dt | F'_0} ≤ N (∫_{E_{d+1}} f^{p+1}(t,x) dx dt)^{1/(p+1)} = N ‖f‖_{p+1,E_{d+1}}.
Further, we prove the lemma in the general case. Taking A ∈ F_τ and setting τ_n = κ_n(τ), τ_n^R = inf{t ≥ τ_n : |x_t − x_{τ_n}| ≥ R}, we can easily see that τ_n ↓ τ and that, on {τ < ∞},

lim_{n→∞} τ_n^R ≥ τ_R,  lim_{n→∞} ∫_{τ_n}^{τ_n^R} e^{−φ_t} ψ_t |f(y_t,x_t)| dt ≥ ∫_τ^{τ_R} e^{−φ_t} ψ_t |f(y_t,x_t)| dt,

and that for s ∈ T_n^1 the set {τ_n = s} belongs to F_s. Therefore, applying what has been proved to each constant time s ∈ T_n^1 and summing over s, we obtain the assertion for the Markov time τ as well, thus completing the proof of our lemma.

2. Lemma. As in the preceding lemma, we introduce R, τ, τ_R. Also, we denote by μ = μ(λ) the positive root of the equation λ − μK_1 − μ²K_2 = 0 for λ > 0. Then

M{χ_{τ_R<∞} e^{−λ(φ_{τ_R} − φ_τ)} | F_τ} ≤ (ch μR)^{−1} χ_{τ<∞}  (a.s.).
PROOF. Let n(x) = ch μ|x|. Simple computations, using |b_t| ≤ K_1c_t, tr a_t ≤ K_2c_t together with the inequalities sh μ|x| ≤ ch μ|x| and sh μ|x| ≤ μ|x| ch μ|x|, show that

λc_t n(x) − L_t n(x) ≥ c_t ch μ|x| (λ − μK_1 − μ²K_2) = 0.

Further, applying Ito's formula to e^{−λφ_t} n(x_t + x), we conclude from the last inequality that this expression does not increase in the mean for t ≥ τ. Using the continuity property of n(x), we replace x with the variable (−x_τ) in the resulting inequality; then, noting that n(x_{τ_R} − x_τ) = ch μR on {τ_R < ∞} and n(0) = 1, we obtain for A ∈ F_τ

M χ_A χ_{τ_R<∞} e^{−λ(φ_{τ_R} − φ_τ)} ch μR ≤ M χ_A χ_{τ<∞}.
We have proved the lemma; next we prove the main theorem of Section 2.3.

3. Theorem. There exist constants N_i = N_i(d,K_1,K_2) (i = 1,2) such that for all Markov times τ and Borel functions f(t,x), g(x)

M{∫_τ^∞ e^{−φ_t} ψ_t |f(y_t,x_t)| dt | F_τ} ≤ N_1 χ_{τ<∞} e^{−φ_τ} ‖f‖_{p+1,E_{d+1}},
M{∫_τ^∞ e^{−φ_t} c_t^{1−(d/p)} (det a_t)^{1/p} g(x_t) dt | F_τ} ≤ N_2 χ_{τ<∞} e^{−φ_τ} ‖g‖_{p,E_d}.
PROOF. We regard f, g as nonnegative bounded functions and, in addition, we introduce Markov times recursively as follows:

τ⁰ = τ,  τ^{n+1} = inf{t ≥ τ^n : |x_t − x_{τ^n}| ≥ 1}.

Note that by Lemma 2 (with λ = 1, R = 1),

M{χ_{τ^{n+1}<∞} e^{−φ_{τ^{n+1}}} | F_τ} = M{M{χ_{τ^{n+1}<∞} e^{−φ_{τ^{n+1}}} | F_{τ^n}} | F_τ} ≤ (ch μ)^{−1} M{χ_{τ^n<∞} e^{−φ_{τ^n}} | F_τ} ≤ ⋯ ≤ (ch μ)^{−(n+1)} χ_{τ<∞} e^{−φ_τ},

where μ is the positive root of the equation 1 − μK_1 − μ²K_2 = 0. It is seen that the τ^n increase as n increases and the variables χ_{τ^n<∞} e^{−φ_{τ^n}} decrease as n increases. The estimate given shows that, as n → ∞,

M χ_{τ^n<∞} e^{−φ_{τ^n}} → 0,  χ_{τ^n<∞} e^{−φ_{τ^n}} → 0  (a.s.).
Due to the boundedness of the function c_t(ω), we immediately have that τ^n → ∞ (a.s.) as n → ∞. Therefore, using Lemma 1 (with R = 1) on each interval [τ^n, τ^{n+1}], we obtain (a.s.)

M{∫_τ^∞ e^{−φ_t} ψ_t |f(y_t,x_t)| dt | F_τ} ≤ Σ_{n=0}^∞ M{∫_{τ^n}^{τ^{n+1}} e^{−φ_t} ψ_t |f(y_t,x_t)| dt | F_τ} ≤ N ‖f‖_{p+1,E_{d+1}} Σ_{n=0}^∞ M{χ_{τ^n<∞} e^{−φ_{τ^n}} | F_τ} ≤ N (ch μ)/(ch μ − 1) χ_{τ<∞} e^{−φ_τ} ‖f‖_{p+1,E_{d+1}}.

Having proved the first assertion of the theorem, we proceed to prove the second. To this end, we use the same technique as in 2.9. The function g is bounded and

c_t^{1−(d/p)} (det a_t)^{1/p} ≤ c_t^{1−(d/p)} (tr a_t)^{d/p} ≤ K_2^{d/p} c_t.

Hence the number

v = sup_τ ess sup M{∫_τ^∞ e^{−φ_{τ,t}} c_t^{1−(d/p)} (det a_t)^{1/p} g(x_t) dt | F_τ}

is finite. We assume that v > 0 and set r_t = (1/v) c_t^{1−(d/p)} (det a_t)^{1/p} g(x_t), h_t = c_t^{1−(d/p)} (det a_t)^{1/p} g(x_t). Using Fubini's theorem, we obtain, as in 2.9,

M{∫_τ^∞ e^{−φ_{τ,t}} h_t dt | F_τ} ≤ 2 M{∫_τ^∞ h_t exp{−φ_{τ,t} − ∫_τ^t r_u du} dt | F_τ}  (a.s.).
Noting that the last expression equals zero on the set {τ = ∞}, we transform it into

2 v^{1/(p+1)} M{∫_τ^∞ e^{−φ_{τ,t}} ψ_t f(∫_τ^t r_u du, x_t) dt | F_τ},

where f(t,x) = e^{−t} g^{p/(p+1)}(x), since for this choice of r_t we have ψ_t f(∫_τ^t r_u du, x_t) = v^{−1/(p+1)} h_t exp{−∫_τ^t r_u du}. Therefore, according to the first assertion of the theorem,

M{∫_τ^∞ e^{−φ_{τ,t}} h_t dt | F_τ} ≤ N_1 v^{1/(p+1)} ‖f‖_{p+1,E_{d+1}} ≤ N_1 v^{1/(p+1)} ‖g‖_{p,E_d}^{p/(p+1)}.

Consequently,

v ≤ N_1 v^{1/(p+1)} ‖g‖_{p,E_d}^{p/(p+1)},  v ≤ N_1^{1+(1/p)} ‖g‖_{p,E_d},

which is equivalent to the second assertion, thus completing the proof of our theorem.

We give one essential particular case of the theorem proved above.

4. Theorem. Let K_3, K_4 < ∞, λ > 0, δ > 0, s ≥ 0, and let for all t ≥ s, ω ∈ Ω, ξ ∈ E_d

|b_t| ≤ K_3,  tr a_t ≤ K_4,  Σ_{i,j=1}^d a_t^{ij} ξ^i ξ^j ≥ δ|ξ|².   (1)

Then there exist constants N_i = N_i(d,p,λ,δ,K_3,K_4) (i = 1,2) such that for all Borel functions f(t,x), g(x)

M ∫_s^∞ e^{−λt} |f(t,x_t)| dt ≤ N_1 ‖f‖_{p+1,E_{d+1}},   (2)
M ∫_s^∞ e^{−λt} g(x_t) dt ≤ N_2 ‖g‖_{p,E_d}.   (3)

This theorem follows from the preceding theorem. In fact, for example, let r_t = 1, c_t = λ for t ≥ s, K_1 = K_3/λ, K_2 = K_4/λ. Then |b_t| ≤ K_1c_t, tr a_t ≤ K_2c_t for t ≥ s. For t < s, let us take c_t such that the above inequalities still hold. Noting that (det a_t)^{1/(p+1)} ≥ δ^{d/(p+1)}, we obtain

M ∫_s^∞ e^{−λt} |f(t,x_t)| dt ≤ e^{−λs} δ^{−d/(p+1)} λ^{−(p−d)/(p+1)} M ∫_s^∞ exp{−∫_s^t c_u du} c_t^{(p−d)/(p+1)} (r_t det a_t)^{1/(p+1)} |f(t,x_t)| dt ≤ N_1 ‖f‖_{p+1,E_{d+1}},

the last step by Theorem 3; the estimate (3) is derived similarly.
5. Exercise. We replace the third inequality in (1) by det a_t ≥ δ, preserving the first two inequalities. Using the self-scaling property of a Wiener process, and also the fact that in (3) g(x) can be replaced by g(cx), prove the corresponding version of (3) in which the constant has the form N(d,K_3), where N(d,K_3) is a finite function nondecreasing with respect to K_3.
4. Limit Behavior of Some Functions

Theorems 6 and 7 are the most crucial for the discussion in this section. We shall use them in Chapter 4 in deducing the Bellman equation. However, in the case of uniformly nondegenerate controlled processes we use only Corollary 8. In this regard, we note that the assertion of Corollary 8 follows obviously from intuitive considerations, since the lower bound with respect to α ∈ 𝔅(s,x) which appears in the assertion of Corollary 8 is a lower bound with respect to a set of uniformly nondegenerate diffusion processes with bounded coefficients (see the definition of 𝔅(s,x) prior to Theorem 5).

We fix the integer d. Also, let the number p ≥ d and the numbers K_1 > 0, K_2 > 0, K_3 > 0 be fixed. We denote by α an arbitrary collection of the form

α = (Ω, F, P, {F_t}, w_t, σ_t, b_t, c_t, r_t),   (1)

where (Ω,F,P) is a probability space, the integer d_1 ≥ d, (w_t,F_t) is a d_1-dimensional Wiener process on (Ω,F,P), σ_t = σ_t(ω) is a matrix of dimension d × d_1, b_t = b_t(ω) is a d-dimensional vector, c_t = c_t(ω), r_t = r_t(ω) are nonnegative numbers, and σ_t, b_t, c_t, r_t are progressively measurable with respect to {F_t} and are bounded functions of (t,ω) for t ≥ 0, ω ∈ Ω. In the case where the collection (1) is written as α, we write Ω = Ω^α, F = F^α, etc. Denote by 𝔄(K_1,K_2,K_3) the set of all collections α satisfying the conditions

|b_t^α| ≤ K_1 c_t^α,  tr a_t^α ≤ K_2 c_t^α,  r_t^α ≤ K_3 c_t^α

for all (t,ω), where a_t^α = ½σ_t^α(σ_t^α)*. For x ∈ E_d, α ∈ 𝔄(K_1,K_2,K_3), let

x_t^{α,x} = x + ∫_0^t σ_u^α dw_u^α + ∫_0^t b_u^α du,  y_t^{α,s} = s + ∫_0^t r_u^α du,  φ_t^α = ∫_0^t c_u^α du,  ψ_t^α = (c_t^α)^{(p−d)/(p+1)} (r_t^α det a_t^α)^{1/(p+1)}.

As usual, for p = d, ψ_t^α = (r_t^α det a_t^α)^{1/(d+1)}.
For a Borel function f(t,x), s ∈ (−∞,∞), x ∈ E_d, let

v(s,x) = v(f,s,x) = v(K_1,K_2,K_3,f,s,x) = sup_{α∈𝔄(K_1,K_2,K_3)} M^α ∫_0^∞ e^{−φ_t^α} ψ_t^α f(y_t^{α,s}, x_t^{α,x}) dt,

where M^α denotes integration over Ω^α with respect to the measure P^α. In addition to the elements mentioned, we shall use the elements given prior to Theorem 5.
where M a denotes the integration over SZa with respect to a measure Pa. In addition to the elements mentioned, we shall use the elements given prior to Theorem 5. 1. Theorem. Let f E $4,+ ,(Ed+,). Then v(s,x) is a continuous function of (s,x)on Ed+ and, furthermore,
,,
PROOF. Since IbFl I K,c;, tr E; I K2c;,the estimate of v follows from Theorem 3.3. In this case, we can take N(d,Kl,K2)= N,(d,K,,K,), where N , is the constant given in Theorem 3.3. Further, we note that for any families of numbers h", h",
Hence
Some-qF+:lf(y:sl,xtl.)
lv(sl,xl)- v(s2,x2)1I sup M' a
- f( y;~2,xf~2)l dt.
I f f (t,x)is a smooth function of (t,x),with compact support, then
I
If (y;,",~;."')- f (fl,",~;,"')
= N(ls1 - s2l
Morever, +; I
+ 1x1 - ~ 2 1 ) .
( C ; ) ( ~ - d ) l ( ~+ l ) ( ~ ~ , - ; t d - d a a()td~ ) l f i ~ +1) < K:/(P+~ ) K ~ I ( P + 1 2
)
~
Therefore
+
Consequently, we have Iv(s,,x,) - v(s2,x2)lI N(ls, - s,l Ix, - x,l) for f(t,x), with v being a continuous function. ,(Ed+,), we take a sequence of smooth Iff is an arbitrary function in 9,+ + 0.Using the functions f, with compact support so that f -fn~~,+,,,,+, property of the magnitude of the difference between the upper bounds,
11
4. Limit Behavior of Some Functions
which we used before, we obtain This implies that the continuous functions v(fn,s,x) converge to v(f,s,x) uniformly in E d + , . Therefore, v(f,s,x) is continuous, thus proving the theorem. The continuity property of v(s,x)implies the measurability of this function. For investigating the integrability property of v(s,x) we need the following lemma. 2. Lemma. Let R > 0, let ~2~be the time ofjirst entry of a process x:3Xinto a set SR, let ya be a random variable on Qa, ya 2 72" and let E be the positive root of the equation K,E' K,E - 1 = 0. Then, for all t,, s
+
M a x y u , ,e-
rp;=
<e&R~&l~l, -
PROOF. We fix α, x. For the sake of simplicity we do not write the superscripts α, x. In addition, we write y_t^{α,s} = s + y_t^{α,0} as s + y_t. The first assertion of the lemma is obvious for |x| ≤ R; therefore we assume that |x| > R. In accord with Ito's formula applied to e^{−φ_t − ε|x_t|}, using |b_t| ≤ K_1c_t, tr a_t ≤ K_2c_t, and K_2ε² + K_1ε − 1 = 0, we obtain

M χ_{τ_R ≤ t} e^{−φ_{τ_R}} e^{−εR} ≤ e^{−ε|x|}.

Using Fatou's lemma as t → ∞, together with the inequalities γ ≥ τ_R and φ_γ ≥ φ_{τ_R}, we arrive at the former inequality. In order to prove the latter inequality, we note that under the assumption r_t ≤ K_3c_t we have y_γ ≤ K_3φ_γ; hence on the set {t_1 ≤ y_γ + s} it holds that t_1 − s ≤ K_3φ_γ, from which it follows that

e^{−φ_γ/2} ≤ e^{−(t_1 − s)/(2K_3)}.

Furthermore, by the first assertion,

M χ_{γ<∞} e^{−φ_γ/2} ≤ (M χ_{γ<∞} e^{−φ_γ})^{1/2} ≤ e^{(εR − ε|x|)/2}.

Having multiplied the extreme terms in the last two inequalities, we establish the second assertion of the lemma, thus completing the proof of the lemma.
3. Theorem. There exists a finite function N(d,K_1), increasing with respect to K_1, such that for all f ∈ ℒ_{p+1}(E_{d+1})

‖v(K_1,K_2,K_3,f,·,·)‖_{p+1,E_{d+1}} ≤ N(d, K_1/√K_2) (K_3 K_2^d)^{1/(p+1)} ‖f‖_{p+1,E_{d+1}}.

PROOF. Suppose that we have proved the theorem under the condition K_2 = K_3 = 1. To derive the general case from this, we use arguments which implicitly replace an application of the self-scaling property of a Wiener process (see Exercise 3.5). If α ∈ 𝔄 = 𝔄(K_1,K_2,K_3), let

α' = (Ω^α, F^α, P^α, {F_t^α}, w_t^α, (1/√K_2)σ_t^α, (1/√K_2)b_t^α, c_t^α, (1/K_3)r_t^α).

It is seen that α' ∈ 𝔄' = 𝔄(K_1/√K_2, 1, 1). It is also seen that α' runs through the entire set 𝔄(K_1/√K_2,1,1) when α runs through the entire set 𝔄(K_1,K_2,K_3). Further, for f ∈ ℒ_{p+1}(E_{d+1}) let f'(t,x) = f(K_3t, √K_2 x). Since x_t^{α',0} = (1/√K_2) x_t^{α,0}, y_t^{α',0} = (1/K_3) y_t^{α,0} and ψ_t^α = (K_3K_2^d)^{1/(p+1)} ψ_t^{α'}, we have

v(K_1,K_2,K_3,f,s,x) = (K_3K_2^d)^{1/(p+1)} sup_{α'∈𝔄'} M^{α'} ∫_0^∞ e^{−φ_t^{α'}} ψ_t^{α'} f(s + K_3 y_t^{α',0}, x + √K_2 x_t^{α',0}) dt = (K_3K_2^d)^{1/(p+1)} v(K_1/√K_2, 1, 1, f', s/K_3, x/√K_2).

Therefore, if we have proved our theorem for K_2 = K_3 = 1, then, changing variables in the integrals,

‖v(K_1,K_2,K_3,f,·,·)‖^{p+1}_{p+1,E_{d+1}} = K_3K_2^d ∫ |v(K_1/√K_2,1,1,f',s/K_3,x/√K_2)|^{p+1} dx ds = K_3² K_2^{(3/2)d} ‖v(K_1/√K_2,1,1,f',·,·)‖^{p+1}_{p+1,E_{d+1}} ≤ K_3² K_2^{(3/2)d} N^{p+1}(d,K_1/√K_2) ‖f'‖^{p+1}_{p+1,E_{d+1}} = K_3K_2^d N^{p+1}(d,K_1/√K_2) ‖f‖^{p+1}_{p+1,E_{d+1}}.
Therefore,it suffices to prove this theorem only for K , in our proof in this case the expression
=K, =
1. We use
representable as the "sum" of terms each of which incorporates the change which occurs while the process (yr*",xy)moves across the region associated with the given term. We assume without loss of generality that f 2 0. Let R be such that the volume of S, is equal to unity. We denote by w(t,x) the indicator of a set C,,,. Let At,,xt,(t,x)= w ( t , - t,xl - x)f(t,x).It is seen that
\[I^{\alpha}(s,x) = \int_{-\infty}^{\infty} dt_1 \int_{E_d} dx_1\; M^{\alpha}\int_0^{\infty} e^{-\varphi_t^{\alpha}}\, f_{t_1,x_1}(y_t^{s}, x_t^{x})\,dt.\]
In order to estimate the last expectation for fixed \(t_1\), \(x_1\), we note that \(f_{t_1,x_1}(t,x)\) can be nonzero only for \(0 \le t_1 - t \le 1\), \(|x_1 - x| \le R\). Hence, if \(\gamma^{\alpha}\) is the time of first entry of the process \((t_1 - y_t^{s},\, x_1 - x_t^{x})\) into the set \(\bar C_{1,R}\), then

Furthermore, on the set \(\{\gamma^{\alpha} < \infty\}\)
\[0 \le t_1 - y_{\gamma^{\alpha}}^{s} \le 1 \qquad\text{and}\qquad R \ge |x_1 - x_{\gamma^{\alpha}}^{x}| = |x_{\gamma^{\alpha}}^{x} - x_1|.\]
The last inequality in the preceding lemma implies the inequality \(\gamma^{\alpha} \ge \tau_{1,R}^{\alpha,s,x}\). By Theorem 3.3 and the preceding lemma we obtain
\[M^{\alpha}\int_0^{\infty} e^{-\varphi_t^{\alpha}}\, f_{t_1,x_1}(y_t^{s}, x_t^{x})\,dt \le N_1\,\|f_{t_1,x_1}\|_{p+1,E_{d+1}}\, \exp\Bigl[\frac{\varepsilon}{2}R - \frac{\varepsilon}{2}|x - x_1| - \frac12(t_1 - s - 1)\Bigr],\]
where \(N_1 = N_1(d,K_1,1)\) is the constant given in Theorem 3.3. Also, we note that for \(t_1 < s\) the first expression in the above computations is equal to zero, since \(t_1 - y_t^{s} \le t_1 - s < 0\) and \(\gamma^{\alpha} = \infty\). Hence
\[M^{\alpha}\int_0^{\infty} e^{-\varphi_t^{\alpha}}\, f_{t_1,x_1}(y_t^{s}, x_t^{x})\,dt \le N_1\,\|f_{t_1,x_1}\|_{p+1,E_{d+1}}\, \pi(s - t_1,\, x - x_1),\]
where \(\pi(t,x) = \exp[(\varepsilon/2)R - (\varepsilon/2)|x| + \tfrac12(t+1)]\) for \(t \le 0\), \(\pi(t,x) = 0\) for \(t > 0\). Therefore, since \(v = \sup_{\alpha} I^{\alpha}\),
On the right side of the last expression there is a convolution (with respect to \((t_1,x_1)\)) of the two functions \(\|f_{t_1,x_1}\|_{p+1,E_{d+1}}\) and \(\pi(t_1,x_1)\). It is a well-known fact that the norm of a convolution in \(\mathscr{L}_p\) does not exceed the product of the norm of one function in \(\mathscr{L}_p\) and the norm of the other function in \(\mathscr{L}_1\). Using this fact, and noting that \(C_{1,R}\) has Lebesgue measure one, so that the \(\mathscr{L}_{p+1}\)-norm of the function \((t_1,x_1) \mapsto \|f_{t_1,x_1}\|_{p+1,E_{d+1}}\) equals \(\|f\|_{p+1,E_{d+1}}\), we conclude that
\[\|v\|_{p+1,E_{d+1}} \le N_1\,\|\pi\|_{1,E_{d+1}}\,\|f\|_{p+1,E_{d+1}}.\]
To complete the proof of the theorem it remains only to show that the last constant \(N(d,K_1)\) can be regarded as an increasing function of \(K_1\). Let
\[\bar N(d,K_1) = \sup \frac{\|v(K_1,1,1,f,\cdot,\cdot)\|_{p+1,E_{d+1}}}{\|f\|_{p+1,E_{d+1}}},\]
where the upper bound is taken over all \(f \in \mathscr{L}_{p+1}(E_{d+1})\) such that \(\|f\|_{p+1,E_{d+1}} > 0\). According to what has been proved above, \(\bar N(d,K_1) < \infty\). In addition, the sets \(\mathfrak{A}\) increase with respect to \(K_1\); hence \(v\) and \(\bar N(d,K_1)\) increase with respect to \(K_1\). Finally, it is seen that \(|v(f,s,x)| \le v(|f|,s,x)\) and
\[\|v(K_1,1,1,f,\cdot,\cdot)\|_{p+1,E_{d+1}} \le \bar N(d,K_1)\,\|f\|_{p+1,E_{d+1}}.\]
The theorem has been proved.

We extend the assertions of Theorems 1 and 3 to the case where the function \(f(t,x)\) does not depend on \(t\). However, we do not consider here the process \(y_t\), as we did in the previous sections. Let
\[v(x) = v(g,x) = \sup_{\alpha\in\mathfrak{A}(K_1,K_2,0)} M^{\alpha}\int_0^{\infty} e^{-\varphi_t^{\alpha}}\,(c_t^{\alpha})^{(p-d)/p}\,(\det a_t^{\alpha})^{1/p}\, g(x_t^{x,\alpha})\,dt.\]
4. Theorem. (a) Let \(g \in \mathscr{L}_p(E_d)\); then \(v(x)\) is a continuous function, and
\[|v(x)| \le N(d,K_1,K_2)\,\|g\|_{p,E_d}.\]
(b) There exists a finite function \(N(d,K_1)\), increasing with respect to \(K_1\), such that for all \(g \in \mathscr{L}_p(E_d)\)

This theorem can be proved in the same way as Theorems 1 and 3.
4. Limit Behavior of Some Functions
We proceed now to consider the main results of the present section. Let numbers \(K > 0\), \(\delta > 0\) be fixed, and let each point \((t,x) \in E_{d+1}\) (respectively, \(x \in E_d\)) be associated with some nonempty set \(\mathfrak{B}(t,x)\) (respectively, \(\mathfrak{B}(x)\)) consisting of sets \(\alpha\) of type (1). Let \(\mathfrak{B}\) be the union of all the sets \(\mathfrak{B}(t,x)\), \(\mathfrak{B}(x)\). We assume that the function \(c_t(\omega)\) is bounded on \(\mathfrak{B}\times[0,\infty)\times\bigcup_{\alpha}\Omega^{\alpha}\) and that for all \(\alpha \in \mathfrak{B}\), \(t \in [0,\infty)\), \(\omega \in \Omega^{\alpha}\), \(y \in E_d\)
\[\operatorname{tr}\sigma_t^{\alpha}[\sigma_t^{\alpha}]^* \le K, \qquad r_t^{\alpha} = 1, \qquad |[\sigma_t^{\alpha}]^* y| \ge \delta|y|. \tag{2}\]
It is useful to note that (2) can be rewritten as
\[(a_t^{\alpha}y, y) = \sum_{i,j=1}^{d} (a_t^{\alpha})^{ij}\, y^i y^j \ge \tfrac12\,\delta^2\,|y|^2,\]
since \((a_t^{\alpha}y, y) = \tfrac12\bigl(\sigma_t^{\alpha}[\sigma_t^{\alpha}]^* y,\, y\bigr) = \tfrac12\,\bigl|[\sigma_t^{\alpha}]^* y\bigr|^2.\)
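The identity just stated can be checked numerically. The sketch below uses an arbitrarily chosen \(2\times2\) matrix \(\sigma\) (not from the text), takes \(\delta\) to be the smallest singular value of \(\sigma^*\) so that \(|\sigma^* y| \ge \delta|y|\) holds by construction, and verifies both \((ay,y) = \tfrac12|\sigma^* y|^2\) and the lower bound \(\tfrac12\delta^2|y|^2\):

```python
import numpy as np

# Hypothetical matrix sigma (an illustration only); a = (1/2) sigma sigma^*.
sigma = np.array([[2.0, 0.5],
                  [0.3, 1.5]])
a = 0.5 * sigma @ sigma.T
# delta = smallest singular value of sigma^*, so |sigma^* y| >= delta |y|.
delta = np.linalg.svd(sigma.T, compute_uv=False).min()

rng = np.random.default_rng(0)
for _ in range(100):
    y = rng.normal(size=2)
    lhs = y @ a @ y                                   # (a y, y)
    # (a y, y) equals one half of |sigma^* y|^2 ...
    assert abs(lhs - 0.5 * np.linalg.norm(sigma.T @ y) ** 2) < 1e-10
    # ... and therefore dominates (1/2) delta^2 |y|^2.
    assert lhs >= 0.5 * delta ** 2 * (y @ y) - 1e-10
```

The same computation works in any dimension; only the ellipticity constant \(\delta\) changes.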
5. Theorem. (a) Let \(\lambda \ge \lambda_0 > 0\), let \(Q \subset E_{d+1}\) be an open set, let \(f \in \mathscr{L}_{p+1}(Q)\), and let
\[\tau^{\alpha} = \tau^{\alpha,s,x} = \inf\{t \ge 0 : (t+s,\, x_t^{s,x}) \notin Q\}, \qquad z_{\lambda}(s,x) = \sup_{\alpha\in\mathfrak{B}(s,x)} M^{\alpha}\int_0^{\tau^{\alpha}} e^{-\varphi_t^{\alpha}-\lambda t}\, f(t+s,\, x_t^{s,x})\,dt.\]
Then
\[\|z_{\lambda}\|_{p+1,Q} \le N(d,K,\delta,\lambda_0)\,\|f\|_{p+1,Q}.\]
(b) Let \(\lambda \ge \lambda_0 > 0\), let \(D \subset E_d\) be an open set, let \(g \in \mathscr{L}_p(D)\), and let
\[\tau^{\alpha} = \tau^{\alpha,x} = \inf\{t \ge 0 : x_t^{x,\alpha} \notin D\}, \qquad z_{\lambda}(x) = \sup_{\alpha\in\mathfrak{B}(x)} M^{\alpha}\int_0^{\tau^{\alpha}} e^{-\varphi_t^{\alpha}-\lambda t}\, g(x_t^{x,\alpha})\,dt.\]
Then
\[\|z_{\lambda}\|_{p,D} \le N(d,K,\delta,\lambda_0)\,\|g\|_{p,D}.\]
PROOF. Since all eigenvalues of the matrix \(a_t^{\alpha}\) are greater than \(\tfrac12\delta^2\), we have \(\det a_t^{\alpha} \ge 2^{-d}\delta^{2d}\). From this, setting \(\bar f = |f|\chi_Q\), \(\bar c_t^{\alpha} = c_t^{\alpha} + \lambda\), \(\bar\varphi_t^{\alpha} = \varphi_t^{\alpha} + \lambda t\), and noting that \(\bar c_t^{\alpha} \ge \lambda_0\), we find
\[|z_{\lambda}(s,x)| \le N(\delta)\,\lambda^{(d-p)/(p+1)} \sup_{\alpha\in\mathfrak{B}(s,x)} M^{\alpha}\int_0^{\infty} e^{-\bar\varphi_t^{\alpha}}\,(\bar c_t^{\alpha})^{(p-d)/(p+1)}\,(\bar c_t^{\alpha}\det a_t^{\alpha})^{1/(p+1)}\,\bar f(y_t^{s}, x_t^{x})\,dt.\]
It is seen that
\[\operatorname{tr} a_t^{\alpha} \le \frac{K}{2\lambda_0}\,\bar c_t^{\alpha} \qquad\text{and}\qquad 1 \le \frac{1}{\lambda_0}\,\bar c_t^{\alpha}.\]
Therefore,
which implies, by Theorem 3, that
\[\|z_{\lambda}\|_{p+1,Q} \le N(d,K,\delta,\lambda_0)\,\|f\|_{p+1,Q},\]
thus proving assertion (a) of our theorem. Proceeding in the same way, we can prove assertion (b) with the aid of Theorem 4. The theorem is proved.

6. Theorem. (a) Suppose that \(Q\) is a region in \(E_{d+1}\), \(f_1(t,x)\) is a bounded Borel function, \(f \in \mathscr{L}_{p+1}(Q)\), \(\lambda > 0\),
\[\tau^{\alpha} = \tau^{\alpha,s,x} = \inf\{t \ge 0 : (s+t,\, x_t^{s,x}) \notin Q\},\]
\[z_{\lambda}(s,x) = z_{\lambda}(f,s,x) = \sup_{\alpha\in\mathfrak{B}(s,x)} M^{\alpha}\Bigl[\int_0^{\tau^{\alpha}} e^{-\varphi_t^{\alpha}-\lambda t}\, f(s+t,\, x_t^{s,x})\,dt + e^{-\varphi_{\tau^{\alpha}}^{\alpha}-\lambda\tau^{\alpha}}\, f_1(s+\tau^{\alpha},\, x_{\tau^{\alpha}}^{s,x})\Bigr].\]
Then there exists a sequence \(\lambda_n \to \infty\) such that \(\lambda_n z_{\lambda_n}(s,x) \to f(s,x)\) (\(Q\)-a.e.).
(b) Suppose that \(D\) is a region in \(E_d\), \(g_1(x)\) is a bounded Borel function, \(g \in \mathscr{L}_p(D)\), \(\lambda > 0\), \(\tau^{\alpha} = \tau^{\alpha,x} = \inf\{t \ge 0 : x_t^{x,\alpha} \notin D\}\), and \(z_{\lambda}(x)\) is defined analogously. Then there exists a sequence \(\lambda_n \to \infty\) such that \(\lambda_n z_{\lambda_n}(x) \to g(x)\) (\(D\)-a.e.).

7. Theorem. (a) We introduce another element in Theorem 6a. Suppose that \(Q'\) is a bounded region, \(\bar Q' \subset Q\). Then \(\|\lambda z_{\lambda} - f\|_{p+1,Q'} \to 0\) as \(\lambda \to \infty\). If \(f_1 \equiv 0\), we can take \(Q' = Q\).
(b) Suppose that in Theorem 6b \(D'\) is a bounded region, \(\bar D' \subset D\); then \(\|\lambda z_{\lambda} - g\|_{p,D'} \to 0\) as \(\lambda \to \infty\). If \(g_1 \equiv 0\), we can take \(D' = D\).

PROOF OF THEOREMS 6 AND 7. It was noted in Section 2.1 that convergence with respect to an exterior norm implies the existence of a subsequence convergent almost everywhere. Using this fact, we easily see that only Theorem 7 needs to be proved.

PROOF OF THEOREM 7a. First, let \(f_1 \equiv 0\). We take a sequence of functions \(f^n \in C_0^{\infty}(Q)\) such that \(\|f^n - f\|_{p+1,Q} \to 0\). It is seen that
\[|\lambda z_{\lambda}(f,s,x) - f(s,x)| \le |\lambda z_{\lambda}(f,s,x) - \lambda z_{\lambda}(f^n,s,x)| + |\lambda z_{\lambda}(f^n,s,x) - f^n(s,x)| + |f^n(s,x) - f(s,x)|,\]
from which, noting that \(|z_{\lambda}(f,s,x) - z_{\lambda}(f^n,s,x)| \le z_{\lambda}(|f - f^n|,s,x)\), we obtain, in accordance with Theorem 5a,
\[\varlimsup_{\lambda\to\infty}\|\lambda z_{\lambda}(f,\cdot,\cdot) - f\|_{p+1,Q} \le N\|f - f^n\|_{p+1,Q} + \varlimsup_{\lambda\to\infty}\|\lambda z_{\lambda}(f^n,\cdot,\cdot) - f^n\|_{p+1,Q} + \|f^n - f\|_{p+1,Q}.\]
In the last inequality the left side does not depend on \(n\); the first and third terms on the right side can be made arbitrarily small by choosing an appropriate \(n\). In order to make sure that the left side of the last inequality is equal to zero, we need only show that for each \(n\) the middle term vanishes. In short, it suffices to prove assertion (a) for \(f_1 \equiv 0\) in the case \(f \in C_0^{\infty}(Q)\). By Itô's formula applied to \(f(s+t,\, x_t^{s,x})\,e^{-\varphi_t-\lambda t}\), for each \(\alpha \in \mathfrak{B}(s,x)\), \(t \ge 0\) we have
where
Since \(\sigma_t^{\alpha}\), \(b_t^{\alpha}\), \(c_t^{\alpha}\) are bounded, \(|L_t^{\alpha} f(t,x)|\) does not exceed the expression
Denoting the last expression by \(h(t,x)\), we note that \(h(t,x)\) is a bounded finite function; in particular, \(h \in \mathscr{L}_{p+1}(Q)\). Using the Lebesgue bounded convergence theorem, we pass to the limit in (3) as \(t \to \infty\). Thus we have
\[f(s,x) = \lambda M^{\alpha}\int_0^{\infty} e^{-\varphi_t-\lambda t}\, f(s+t,\, x_t^{s,x})\,dt - M^{\alpha}\int_0^{\infty} e^{-\varphi_t-\lambda t}\, L_t^{\alpha} f(s+t,\, x_t^{s,x})\,dt,\]
which immediately yields
In short, we have an estimate which, according to Theorem 5a, yields
\[\varlimsup_{\lambda\to\infty}\|\lambda z_{\lambda}(f,\cdot,\cdot) - f\|_{p+1,Q} \le N\|h\|_{p+1,Q}\,\lim_{\lambda\to\infty}\frac{1}{\lambda} = 0,\]
thus proving Theorem 7a for \(f_1 \equiv 0\). In the general case
\[|\lambda z_{\lambda}(f,s,x) - f(s,x)| \le \lambda \sup_{\alpha\in\mathfrak{B}(s,x)} M^{\alpha}\, e^{-\varphi_{\tau^{\alpha}}-\lambda\tau^{\alpha}}\,|f_1(s+\tau^{\alpha},\, x_{\tau^{\alpha}}^{s,x})| + \Bigl|\lambda \sup_{\alpha} M^{\alpha}\int_0^{\tau^{\alpha}} e^{-\varphi_t-\lambda t}\, f(s+t,\, x_t^{s,x})\,dt - f(s,x)\Bigr|,\]
where the exterior norm of the second term tends to zero; due to the boundedness of \(f_1\) the first term does not exceed the product of a constant and the expression
\[\pi_{\lambda}(s,x) = \lambda \sup_{\alpha\in\mathfrak{B}(s,x)} M^{\alpha}\, e^{-\lambda\tau^{\alpha,s,x}}.\]
Therefore, in order to complete the proof of Theorem 7a, it remains only to show that \(\|\pi_{\lambda}\|_{p+1,Q'} \to 0\) as \(\lambda \to \infty\) for any bounded region \(Q'\) lying together with its closure in \(Q\). To this end, it suffices in turn to prove that \(\pi_{\lambda}(s,x) \to 0\) uniformly on \(Q'\). In addition, each such region \(Q'\) can be covered by a finite number of cylinders of the type \(C_{r,R}(s,y) = \{(t,x) : |y - x| < R,\ |t - s| < r\}\) such that \(C_{2r,2R}(s,y) \subset Q\). It is seen that we need only prove that \(\pi_{\lambda}(t,x) \to 0\) uniformly on any cylinder of this type. We fix a cylinder \(C_{r,R}(s,y)\) such that \(C_{2r,2R}(s,y) \subset Q\). Let \(\tau_R(x) = \inf\{t \ge 0 : |x - x_t^{x}| \ge R\}\). Finally, we denote by \(\mu(\lambda)\) the positive root of the equation \(\lambda - \mu K - \mu^2 K = 0\). Also, we note that for \((t,x) \in C_{r,R}(s,y)\) we have \(\tau^{\alpha,t,x} \ge r \wedge \tau_R(x)\). Hence
\[\pi_{\lambda}(t,x) \le \lambda e^{-\lambda r} + \lambda \sup_{\alpha} M^{\alpha}\, e^{-\lambda\tau_R(x)}.\]
Furthermore, by Lemma 3.23 the inequality
\[\sup_{\alpha} M^{\alpha}\, e^{-\lambda\tau_R(x)} \le (\operatorname{ch}\mu(\lambda)R)^{-1}\]
holds true. Therefore, the function \(\pi_{\lambda}(t,x)\) does not exceed \(\lambda e^{-\lambda r} + \lambda(\operatorname{ch}\mu(\lambda)R)^{-1}\) on \(C_{r,R}(s,y)\). Simple computations show that the last expression tends to zero as \(\lambda \to \infty\). Therefore, \(\pi_{\lambda}(t,x)\) tends uniformly to zero on \(C_{r,R}(s,y)\). This completes the proof of Theorem 7a. Theorem 7b can be proved in a similar way; we suggest the reader carry this out as an exercise. We have thus proved Theorems 6 and 7. In Lemma 3.2, one should take \(c = 0\), \(c_1 = 1\).
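The mechanism behind Theorems 6 and 7 is easiest to see in the degenerate deterministic case (no diffusion, no exit): for a bounded continuous \(f\), \(\lambda\int_0^{\infty} e^{-\lambda t} f(t)\,dt \to f(0)\) as \(\lambda \to \infty\), because the density \(\lambda e^{-\lambda t}\) concentrates at \(t = 0\). A minimal numerical sketch; the test function \(f\) is an arbitrary choice, not from the text:

```python
import math

def resolvent(f, lam, n=100000, T=30.0):
    # Midpoint-rule approximation of lam * int_0^T e^{-lam t} f(t) dt;
    # the tail beyond T is negligible for the lam values used below.
    h = T / n
    return lam * sum(math.exp(-lam * (k + 0.5) * h) * f((k + 0.5) * h) * h
                     for k in range(n))

f = lambda t: math.cos(t) + 2.0          # bounded continuous, f(0) = 3
errors = [abs(resolvent(f, lam) - f(0.0)) for lam in (1.0, 10.0, 100.0)]
# The error shrinks as lam grows, mirroring lambda * z_lambda -> f.
assert errors[0] > errors[1] > errors[2]
```

For this \(f\) the integral is explicit, \(\lambda\int_0^{\infty}e^{-\lambda t}(\cos t + 2)\,dt = \lambda^2/(\lambda^2+1) + 2\), so the error decays like \(\lambda^{-2}\).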
8. Corollary. Let \(f \in \mathscr{L}_{p+1}(Q)\), \(f \ge 0\) (\(Q\)-a.e.), and for all \((s,x) \in Q\) let
\[\inf_{\alpha\in\mathfrak{B}(s,x)} M^{\alpha}\int_0^{\tau^{\alpha}} e^{-\varphi_t^{\alpha}}\, f(s+t,\, x_t^{s,x})\,dt = 0. \tag{4}\]
Then \(f = 0\) (\(Q\)-a.e.). In fact, by Theorem 2.4 the equality (4) still holds if we change \(f\) on a set of measure zero. It is then seen that for \(\lambda \ge 0\)
\[\inf_{\alpha\in\mathfrak{B}(s,x)} M^{\alpha}\int_0^{\tau^{\alpha}} e^{-\varphi_t^{\alpha}-\lambda t}\, f(s+t,\, x_t^{s,x})\,dt = 0.\]
Furthermore, for \(f_1 \equiv 0\)
\[z_{\lambda}(-f,s,x) = -\inf_{\alpha\in\mathfrak{B}(s,x)} M^{\alpha}\int_0^{\tau^{\alpha}} e^{-\varphi_t^{\alpha}-\lambda t}\, f(s+t,\, x_t^{s,x})\,dt.\]
Therefore \(z_{\lambda} = 0\) in \(Q\), and \(-f = \lim_{n\to\infty} \lambda_n z_{\lambda_n} = 0\) (\(Q\)-a.e.).
5. Solutions of Stochastic Integral Equations and Estimates of the Moments

In this section we list some generalizations, of the kind we need, of well-known results on the existence and uniqueness of solutions of stochastic equations. We also present estimates of the moments of these solutions. The moments are estimated under the condition that the coefficients grow at most linearly. The existence and uniqueness theorem is proved in the case where the coefficients satisfy a Lipschitz condition (condition (Σ)). We fix two constants \(T > 0\), \(K > 0\). Also, we adopt the following notation: \((w_t,\mathscr{F}_t)\) is a \(d_1\)-dimensional Wiener process; \(x\), \(y\) denote points of \(E_d\); \(\sigma_t\), \(\sigma_t(x)\), \(\tilde\sigma_t(x)\) are random matrices of dimension \(d \times d_1\); \(b_t(x)\), \(\tilde b_t(x)\), \(\xi_t\), \(\tilde\xi_t\) are random \(d\)-dimensional vectors; \(r_t\), \(h_t\) are nonnegative numbers. We assume all the processes to be given for \(t \in [0,T]\), \(x \in E_d\), and progressively measurable with respect to \(\{\mathscr{F}_t\}\). If for all \(t \in [0,T]\), \(\omega\), \(x\), \(y\)
\[\|\sigma_t(x) - \sigma_t(y)\| \le K|x - y|, \qquad |b_t(x) - b_t(y)| \le K^2|x - y|,\]
we say that the condition (Σ) is satisfied. If for all \(t \in [0,T]\), \(\omega\), \(x\)
\[\|\sigma_t(x)\|^2 \le 2r_t^2 + 2K^2|x|^2, \qquad |b_t(x)| \le h_t + K^2|x|,\]
we say that the condition (R) is satisfied. Note that we do not impose the condition (Σ) or the condition (R) on \(\tilde\sigma_t(x)\), \(\tilde b_t(x)\). Furthermore, it is useful to have in mind that if the condition (Σ) is satisfied, the condition (R) will be satisfied for \(r_t = \|\sigma_t(0)\|\), \(h_t = |b_t(0)|\)
(with the same constant \(K\)), since, for example, \(\|\sigma_t(x)\|^2 \le 2\|\sigma_t(0)\|^2 + 2\|\sigma_t(x) - \sigma_t(0)\|^2\). As usual, by a solution of the stochastic equation
\[x_t = \xi_t + \int_0^t \sigma_s(x_s)\,dw_s + \int_0^t b_s(x_s)\,ds \tag{1}\]
we mean a progressively measurable (with respect to \(\{\mathscr{F}_t\}\)) process \(x_t\) for which the right side of (1) is defined⁴ and, in addition, \(x_t(\omega)\) coincides with the right side of (1) on some set \(\Omega'\) of measure one, for all \(t \in [0,T]\), \(\omega \in \Omega'\).

1. Lemma. Let \(x_t\) be a solution of Eq. (1) for \(\xi_t \equiv 0\). Then for \(q \ge 1\)
We prove this lemma by applying Itô's formula to the twice continuously differentiable function \(|x|^{2q}\) and using the inequalities
2. Lemma. Let the condition (R) be satisfied and let \(x_t\) be a solution of Eq. (1) for \(\xi_t \equiv 0\). Then, for all \(q \ge 1\), \(\varepsilon > 0\), \(t \in [0,T]\)

where \(\lambda = 4qK^2 + \varepsilon\). If the condition (Σ) is satisfied, one can take in (2) \(h_s = |b_s(0)|\), \(r_s = \|\sigma_s(0)\|\).
PROOF. We fix \(q > 1\), \(\varepsilon > 0\), \(t_0 \in [0,T]\). Also, denote by \(\psi(t)\) the right side of (2). We prove (2) for \(t = t_0\). We can obviously assume that \(\psi(t_0) < \infty\). We make one more assumption, which we will drop at the end of the proof: assume that \(x_t(\omega)\) is a bounded function of \(\omega\), \(t\). Using the preceding lemma and the condition (R), we obtain

Next, we integrate the last inequality over \(t\). In addition, we take the expectation of both sides of this inequality. In this case the expectation of the stochastic integral disappears because of the boundedness of

⁴ Recall that the stochastic integral in (1) is defined and continuous in \(t\) for \(t \le T\) if \(\int_0^T \|\sigma_s(x_s)\|^2\,ds < \infty\) (a.s.).
\(x_t(\omega)\), the finiteness of \(\psi(t_0)\), and, in addition, Hölder's inequality:
\[M\int_0^{t_0} |x_t|^{4q-2}\,\|\sigma_t(x_t)\|^2\,dt \le N\,M\int_0^{t_0} \|\sigma_t(x_t)\|^2\,dt < \infty.\]
Furthermore, we use the following inequalities:
\[M|x_t|^{2q-2}\,r_t^2 \le \Bigl(1-\frac1q\Bigr)\varepsilon\,M|x_t|^{2q} + \frac1q\,\varepsilon^{1-q}\,M r_t^{2q}, \qquad M|x_t|^{2q-2}\,r_t^2 \le \bigl(M|x_t|^{2q}\bigr)^{1-1/q}\bigl(M r_t^{2q}\bigr)^{1/q}.\]
Also, let \(m(t) = M|x_t|^{2q}\). In accordance with what has been said above, for \(t \le t_0\)
\[m(t) \le \int_0^t \bigl[\lambda q\, m(s) + g(s)\, m^{1-1/q}(s)\bigr]\,ds, \tag{3}\]
where \(g\) is a nonnegative function expressed in terms of \(h_s\), \(r_s\), \(\varepsilon\), and \(q\).
Further, we apply a well-known method of transforming such inequalities. Let \(\delta > 0\). We introduce an operator \(F_{\delta}\) on nonnegative functions of one variable, on \([0,t_0]\), by defining
\[F_{\delta}u(t) = \delta + \int_0^t \bigl[\lambda q\, u(s) + g(s)\, u^{1-1/q}(s)\bigr]\,ds.\]
It is easily seen that \(F_{\delta}\) is a monotone operator; i.e., if \(0 \le u_1(t) \le u_2(t)\) for all \(t\), then \(0 \le F_{\delta}u_1(t) \le F_{\delta}u_2(t)\) for all \(t\). Furthermore, if all the nonnegative functions \(u_n\) are bounded and if they converge for each \(t\), then \(\lim_{n\to\infty} F_{\delta}u_n(t) = F_{\delta}\lim_{n\to\infty} u_n(t)\). Finally, for the function \(v(t) = Ne^{Qt}\), for all sufficiently large \(N\) and \(\delta \le 1\) we have \(F_{\delta}v(t) \le v(t)\) if \(t \in [0,t_0]\). It follows from (3) and the aforementioned properties, with \(N\) such that \(m(t) \le v(t)\), that \(m(t) \le F_{\delta}m(t) \le \cdots \le F_{\delta}^n m(t) \le v(t)\). Therefore the limit \(\lim_{n\to\infty} F_{\delta}^n m(t)\) exists. If we denote this limit by \(v_{\delta}(t)\), then \(m(t) \le v_{\delta}(t)\). Taking the limit in the equality \(F_{\delta}^{n+1}m(t) = F_{\delta}(F_{\delta}^n m)(t)\), we conclude that \(v_{\delta} = F_{\delta}v_{\delta}\). Therefore, for each \(\delta \in (0,1)\) the function \(m(t)\) does not exceed some nonnegative solution of the equation \(u = F_{\delta}u\).
We solve this last equation: it follows that \(v_{\delta}(t) \ge \delta\), \(v_{\delta}(0) = \delta\), and
\[v_{\delta}'(t) = \lambda q\, v_{\delta}(t) + g(t)\, v_{\delta}^{1-1/q}(t). \tag{4}\]
Equation (4), after we have multiplied it by \(v_{\delta}^{(1/q)-1}\) (which is possible due to the inequality \(v_{\delta} \ge \delta\)), becomes a linear equation with respect to \(v_{\delta}^{1/q}\). Having solved this equation, we find \(v_{\delta}^{1/q}(t) = \delta^{1/q} + \psi(t)\). Therefore, \(m(t) \le (\delta^{1/q} + \psi(t))^q\) for all \(t \in [0,t_0]\), \(\delta \in (0,1)\). Letting \(\delta \to 0\), we have proved the lemma for bounded \(x_t(\omega)\). In order to prove the lemma in the general case, we denote by \(\tau_R\) the first exit time of \(x_t\) from \(S_R\). Then \(x_{t\wedge\tau_R}(\omega)\) is a bounded function of \((\omega,t)\) and, as is easily seen,
Therefore the process \(x_{t\wedge\tau_R}\) satisfies the same equation as the process \(x_t\) does; however, \(\sigma_t(x)\), \(b_t(x)\) are to be replaced by \(\chi_{t\le\tau_R}\sigma_t(x)\), \(\chi_{t\le\tau_R}b_t(x)\), respectively. In accordance with what has been proved above, \(M|x_{t\wedge\tau_R}|^{2q} \le [\psi(t)]^q\). It remains only to let \(R \to \infty\), to use Fatou's lemma, and, in addition, to take advantage of the fact that, due to the continuity of \(x_t\), the time \(\tau_R \to \infty\) as \(R \to \infty\). We have thus proved our lemma.
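The "well-known method of transforming such inequalities" used above is, in its simplest form, Gronwall's lemma: if \(m(t) \le a + b\int_0^t m(s)\,ds\), then \(m(t) \le a e^{bt}\), and the bound is reached by iterating the monotone integral operator exactly as \(F_\delta\) is iterated in the proof. A discretized sketch under these assumptions (the particular function \(m\) and constants are arbitrary illustrations):

```python
import math

a, b, T, n = 1.0, 2.0, 1.0, 2000
h = T / n
ts = [k * h for k in range(n + 1)]

# A function satisfying m(t) <= a + b * int_0^t m(s) ds (with equality the
# solution would be a*e^{bt}; this m is strictly smaller).
m = [a + 0.5 * math.sin(3 * t) for t in ts]

def F(u):
    # Monotone operator (F u)(t) = a + b * int_0^t u(s) ds (trapezoidal rule).
    out, acc = [a], 0.0
    for k in range(1, n + 1):
        acc += 0.5 * (u[k - 1] + u[k]) * h
        out.append(a + b * acc)
    return out

u = m
for _ in range(60):        # m <= F m <= F^2 m <= ... -> fixed point of F
    u = F(u)
# The limit solves u = F u, i.e. u(t) = a e^{bt}, which dominates m.
assert all(u[k] <= a * math.exp(b * ts[k]) * 1.001 for k in range(n + 1))
assert all(m[k] <= u[k] + 1e-9 for k in range(n + 1))
```

The monotonicity of the operator is what makes the iterates squeeze \(m\) below the fixed point, just as \(F_\delta^n m \le v\) does in the proof.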
3. Corollary. Let \(\int_0^T \|\sigma_s\|^2\,ds < \infty\) with probability 1, and let \(\tau\) be a Markov time with respect to \(\{\mathscr{F}_t\}\). Then for all \(q > 1\)
\[M\Bigl|\int_0^{\tau\wedge T} \sigma_s\,dw_s\Bigr|^{2q} \le 2^q(2q-1)^q\, M\Bigl(\int_0^{\tau\wedge T} \|\sigma_s\|^2\,ds\Bigr)^{q} \le 2^q(2q-1)^q\, T^{q-1}\, M\int_0^{\tau\wedge T} \|\sigma_s\|^{2q}\,ds.\]
In fact, we have obtained the second inequality using Hölder's inequality. The first inequality follows from the lemma if we take \(\sigma_s(x) = \sigma_s\chi_{s\le\tau}\), \(b_s(x) = 0\), write the assertion of the lemma with arbitrary \(K\), \(\varepsilon\), and, finally, let \(K \downarrow 0\), \(\varepsilon \downarrow 0\).
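For \(q = 1\) the first bound of Corollary 3 reduces to (and is in fact an equality by) the Itô isometry, \(M\bigl|\int \sigma\,dw\bigr|^2 = M\int\|\sigma\|^2\,ds\). A seeded Monte Carlo sketch with the deterministic integrand \(\sigma_s = s\) (an arbitrary choice, not from the text):

```python
import numpy as np

rng = np.random.default_rng(42)
T, n, paths = 1.0, 500, 10000
h = T / n
t = np.arange(n) * h                     # left endpoints of the grid
sigma = t                                # integrand sigma_s = s

dw = rng.normal(0.0, np.sqrt(h), size=(paths, n))
stoch_int = (sigma * dw).sum(axis=1)     # Euler sums for int_0^T sigma_s dw_s

lhs = (stoch_int ** 2).mean()            # estimate of M |int sigma dw|^2
rhs = (sigma ** 2).sum() * h             # int_0^T sigma_s^2 ds  (about T^3/3)
assert abs(lhs - rhs) < 0.05             # isometry holds up to MC error
```

For \(q > 1\) the analogous experiment exhibits the inequality rather than an equality, with the constant \(2^q(2q-1)^q\) far from sharp for smooth integrands.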
4. Exercise. Using the proof of the lemma, show that the factor \(2^q\) in Corollary 3 can be replaced by unity.
5. Corollary. Let the condition (Σ) be satisfied, let \(x_t\) be a solution of Eq. (1), and let \(\tilde x_t\) be a solution of the equation
\[\tilde x_t = \tilde\xi_t + \int_0^t \tilde\sigma_s(\tilde x_s)\,dw_s + \int_0^t \tilde b_s(\tilde x_s)\,ds.\]
Then, for all \(q \ge 1\), \(t \in [0,T]\)
\[M|x_t - \tilde x_t|^{2q} \le N e^{\mu t}\, M\sup_{s\le t}|\xi_s - \tilde\xi_s|^{2q} + N e^{\mu t}\, M\int_0^t \bigl(|b_s(\tilde x_s) - \tilde b_s(\tilde x_s)|^{2q} + \|\sigma_s(\tilde x_s) - \tilde\sigma_s(\tilde x_s)\|^{2q}\bigr)\,ds,\]
where \(\mu = 4q^2K^2 + q\).

PROOF. Let \(y_t = (x_t - \tilde x_t) - (\xi_t - \tilde\xi_t)\). Then, as is easily seen,
and in this case the coefficients of the equation for \(y_t\) satisfy the condition (Σ). From this, according to the lemma applied to the process \(y_t\), we have
\[\bigl(M|y_t|^{2q}\bigr)^{1/q} \le \int_0^t e^{\lambda(t-s)}\Bigl\{2q\bigl[M|b_s(\tilde x_s + \xi_s - \tilde\xi_s) - \tilde b_s(\tilde x_s)|^{2q}\bigr]^{1/q} + 2(2q-1)\bigl[M\|\sigma_s(\tilde x_s + \xi_s - \tilde\xi_s) - \tilde\sigma_s(\tilde x_s)\|^{2q}\bigr]^{1/q}\Bigr\}\,ds.\]
We raise both sides of the last inequality to the \(q\)th power. We use Hölder's inequality as well as the facts that
\[|b_s(\tilde x_s + \xi_s - \tilde\xi_s) - \tilde b_s(\tilde x_s)| \le |b_s(\tilde x_s + \xi_s - \tilde\xi_s) - b_s(\tilde x_s)| + |b_s(\tilde x_s) - \tilde b_s(\tilde x_s)| \le K^2|\xi_s - \tilde\xi_s| + |b_s(\tilde x_s) - \tilde b_s(\tilde x_s)|,\]
\[(a + b)^q \le 2^{q-1}(a^q + b^q),\]
which yields
\[M|y_t|^{2q} \le N e^{\mu t}\, M\int_0^t \bigl[|\xi_s - \tilde\xi_s|^{2q} + |b_s(\tilde x_s) - \tilde b_s(\tilde x_s)|^{2q} + \|\sigma_s(\tilde x_s) - \tilde\sigma_s(\tilde x_s)\|^{2q}\bigr]\,ds.\]
It remains to note that \(|x_t - \tilde x_t| \le |y_t| + |\xi_t - \tilde\xi_t|\), so that
\[|x_t - \tilde x_t|^{2q} \le 2^{2q-1}|y_t|^{2q} + 2^{2q-1}|\xi_t - \tilde\xi_t|^{2q},\]
thus proving Corollary 5.
6. Corollary. Let the condition (R) be satisfied, and let \(x_t\) be a solution of (1). Then there exists a constant \(N = N(q,K)\) such that for all \(q \ge 1\), \(t \in [0,T]\)
\[M|x_t|^{2q} \le N\,M|\xi_t|^{2q} + N t^{q-1}\,M\int_0^t \bigl[|\xi_s|^{2q} + h_s^{2q} + r_s^{2q}\bigr]\,e^{N(t-s)}\,ds.\]
In fact, the process \(y_t \equiv x_t - \xi_t\) satisfies the equation
\[dy_t = \sigma_t(y_t + \xi_t)\,dw_t + b_t(y_t + \xi_t)\,dt, \qquad y_0 = 0,\]
the coefficients of this equation satisfying the condition (R), however with different \(h_t\), \(r_t\), \(K\). For example,
\[\|\sigma_t(x + \xi_t)\|^2 \le 2r_t^2 + 2K^2|x + \xi_t|^2 \le 2r_t^2 + 4K^2|\xi_t|^2 + 4K^2|x|^2.\]
Therefore, using the lemma we can estimate \(M|y_t|^{2q}\). Having done this, we need only use the fact that \(|x_t|^{2q} \le 2^{2q-1}|y_t|^{2q} + 2^{2q-1}|\xi_t|^{2q}\).

In our previous assertions we assumed that a solution of Eq. (1) existed, and we also wrote inequalities which may sometimes take the form \(\infty \le \infty\). Further, it is convenient to prove one of the versions of the classical Itô theorem on the existence of a solution of a stochastic equation. Since the proofs of these theorems are well known, we shall dwell here only on the most essential points.
7. Theorem. Let the condition (Σ) be satisfied and let
\[M\int_0^T |\xi_t|^2\,dt < \infty.\]
Then for \(t \le T\) Eq. (1) has a solution such that \(M\int_0^T |x_t|^2\,dt < \infty\). If \(x_t\), \(y_t\) are two solutions of (1), then \(P\{\sup_{t\le T}|x_t - y_t| > 0\} = 0\).

PROOF. Due to Corollary 5, \(M|x_t - y_t|^2 = 0\) for each \(t\). Furthermore, the process \(x_t - y_t\) can be represented as the sum of stochastic integrals and ordinary integrals. Hence the process \(x_t - y_t\) is continuous almost surely. The equality \(x_t = y_t\) (a.s.) for each \(t\) implies that \(x_t = y_t\) for all \(t\) (a.s.), thus proving the last assertion of the theorem. For proving the first assertion of the theorem we apply, as is usually done in similar cases, the method of successive approximations. We define the operator \(I\) by the formula
\[Ix_t = \int_0^t \sigma_s(x_s)\,dw_s + \int_0^t b_s(x_s)\,ds. \tag{5}\]
This operator is defined on those processes \(x_t\) for which the right side of (5) makes sense; it maps each such process into the process \(Ix_t\) whose values are given by formula (5). Denote by \(V\) the space of progressively measurable processes \(x_t\) with values in \(E_d\) such that
\[\|x_t\|^2 \equiv M\int_0^T |x_t|^2\,dt < \infty.\]
It can easily be shown that the operator \(I\) maps \(V\) into \(V\). In addition, it can easily be deduced from the condition (Σ) that
\[M|Ix_t - Iy_t|^2 \le a\,M\int_0^t |x_s - y_s|^2\,ds, \tag{6}\]
where \(a = 2K^2(1 + TK^2)\). Let \(x_t^{(0)} \equiv 0\), \(x_t^{(n+1)} = \xi_t + Ix_t^{(n)}\) (\(n = 0,1,2,\ldots\)). It follows from (6) that
Iterating the last inequality, we find
\[\|x_t^{(n+1)} - x_t^{(n)}\| \le \frac{(Ta)^{n/2}}{\sqrt{n!}}\,\|x_t^{(1)} - x_t^{(0)}\|. \tag{7}\]
Since the series of the numbers \((Ta)^{n/2}(n!)^{-1/2}\) converges, it follows from (7) that the series of the functions \(x_t^{(n+1)} - x_t^{(n)}\) converges in \(V\). In other words, the functions \(x_t^{(n)}\) converge in \(V\); furthermore, there exists a process \(\bar x_t \in V\) such that \(\|x_t^{(n)} - \bar x_t\| \to 0\) as \(n \to \infty\). Further, integrating (6), we obtain
\[\|Ix_t - Iy_t\|^2 \le aT\,\|x_t - y_t\|^2. \tag{8}\]
In particular, the operator \(I\) is continuous in \(V\). Passing to the limit in the equality \(\|x_t^{(n+1)} - (\xi_t + Ix_t^{(n)})\| = 0\), we conclude that \(\|\bar x_t - (\xi_t + I\bar x_t)\| = 0\), from which, and also from (8), it follows that \(\bar x_t = \xi_t + I\bar x_t\) for almost all \(t\), \(\omega\). However, both sides of this equality are continuous with respect to \(t\) for almost all \(\omega\). Hence they coincide for all \(t\) at once almost surely. Finally, taking \(x_t = \xi_t + I\bar x_t\), we have \(x_t = \xi_t + I(\xi + I\bar x)_t = \xi_t + Ix_t\) for all \(t\) almost surely. Therefore \(x_t\) is a solution of the original equation (1), thus completing the proof of the theorem.
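The successive-approximation scheme \(x^{(n+1)} = \xi + Ix^{(n)}\) can be imitated on a time grid with a fixed (seeded) Brownian path; the discrete analogue of the \(V\)-norm distance between consecutive iterates then collapses rapidly, in the spirit of (7). A sketch with arbitrarily chosen Lipschitz coefficients \(\sigma(x) = 0.3\sin x\), \(b(x) = 0.5\cos x\) (not from the text):

```python
import numpy as np

rng = np.random.default_rng(7)
T, n = 1.0, 1000
h = T / n
dw = rng.normal(0.0, np.sqrt(h), size=n)   # one fixed Brownian path
xi = 0.5                                   # constant initial datum xi_t = 0.5

sigma = lambda x: 0.3 * np.sin(x)          # Lipschitz coefficients
b = lambda x: 0.5 * np.cos(x)

def I(x):
    # Discrete analogue of (I x)_t = int_0^t sigma(x_s) dw_s + int_0^t b(x_s) ds.
    inc = sigma(x[:-1]) * dw + b(x[:-1]) * h
    return np.concatenate(([0.0], np.cumsum(inc)))

x = np.zeros(n + 1)                        # x^{(0)} = 0
dists = []
for _ in range(40):                        # x^{(n+1)} = xi + I x^{(n)}
    x_new = xi + I(x)
    dists.append(np.sqrt(np.mean((x_new - x) ** 2)))  # discrete V-norm
    x = x_new
assert dists[0] > 0.1 and dists[-1] < 1e-6
```

The factorial factor \((n!)^{-1/2}\) in (7) is what makes the convergence so fast here; a plain geometric contraction would not need \(Ta < 1\) either, but would decay far more slowly.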
8. Exercise. Noting that \(\sigma_s(x) = [\sigma_s(x) - \sigma_s(0)] + \sigma_s(0)\), prove that the assertions of the theorem still hold if \(M\int_0^T |\eta_t|^2\,dt < \infty\), where
We continue estimating the moments of solutions of a stochastic equation.
9. Theorem. Suppose the condition (Σ) is satisfied, \(x_t\) is a solution of Eq. (1), and \(\tilde x_t\) is a solution of the equation
\[\tilde x_t = \tilde\xi_t + \int_0^t \tilde\sigma_s(\tilde x_s)\,dw_s + \int_0^t \tilde b_s(\tilde x_s)\,ds.\]
Then, if the process \(\xi_t - \tilde\xi_t\) is separable, the process \(x_t - \tilde x_t\) is also separable, and for all \(q \ge 1\), \(t \in [0,T]\)
\[M\sup_{s\le t}|x_s - \tilde x_s|^{2q} \le N e^{Nt}\Bigl[M\sup_{s\le t}|\xi_s - \tilde\xi_s|^{2q} + M\int_0^t \bigl(|b_s(\tilde x_s) - \tilde b_s(\tilde x_s)|^{2q} + \|\sigma_s(\tilde x_s) - \tilde\sigma_s(\tilde x_s)\|^{2q}\bigr)\,ds\Bigr],\]
where \(N = N(q,K)\).
PROOF. It is seen that \(x_t - \tilde x_t\) is the sum of \(\xi_t - \tilde\xi_t\), stochastic integrals, and Lebesgue integrals. Both types of integrals are continuous with respect to \(t\). Hence the separability of \(\xi_t - \tilde\xi_t\) implies that \(x_t - \tilde x_t\) is separable, and in particular, the quantity \(\sup_{s\le t}|x_s - \tilde x_s|\) is measurable with respect to \(\omega\).
As was done in proving Corollary 5, the assertion of the theorem in the general case can easily be deduced from that in the case where \(\xi_t = \tilde\xi_t = \tilde x_t \equiv 0\), \(\tilde\sigma_s(x) \equiv 0\), \(\tilde b_s(x) \equiv 0\). It is required to prove in the latter case that
\[M\sup_{s\le t}|x_s|^{2q} \le N t^{q-1} e^{Nt}\,M\int_0^t \bigl[|b_s(0)|^{2q} + \|\sigma_s(0)\|^{2q}\bigr]\,ds. \tag{9}\]
Reasoning in the same way as in proving Lemma 2, we convince ourselves that it is possible to consider only the case of bounded functions \(x_t(\omega)\), assuming in addition that the right side of (9) is finite. First we prove that the process
\[\eta_t = e^{K^2 t}\varphi(|x_t|) + \int_0^t e^{K^2 s}\,|b_s(0)|\,ds\]
is a submartingale. We fix \(\varepsilon > 0\) and introduce an auxiliary function of the real variable \(r\) by the formula \(\varphi(r) = \sqrt{\varepsilon^2 + r^2}\). Note that \(\varphi(|x|)\) is a smooth function on \(E_d\). In conjunction with Itô's formula,
\[d\bigl[e^{K^2 t}\varphi(|x_t|)\bigr] = e^{K^2 t}\Bigl\{K^2\varphi(|x_t|) + \varphi'(|x_t|)\,\frac{b_t(x_t)\,x_t}{|x_t|} + \frac{\varphi'(|x_t|)}{2|x_t|}\Bigl[\|\sigma_t(x_t)\|^2 - \frac{|\sigma_t^*(x_t)\,x_t|^2}{|x_t|^2}\Bigr] + \frac{\varphi''(|x_t|)}{2}\,\frac{|\sigma_t^*(x_t)\,x_t|^2}{|x_t|^2}\Bigr\}\,dt + e^{K^2 t}\varphi'(|x_t|)\,\frac{x_t\,\sigma_t(x_t)}{|x_t|}\,dw_t.\]
Let us integrate the last expression over \(t\) from \(s_1\) up to \(s_2 \ge s_1\), and let us take the conditional expectation given \(\mathscr{F}_{s_1}\). In this case the expectation of the stochastic integral disappears (see the proof of Lemma 2). In addition, we use the facts that
\[b_t(x_t)\,x_t \ge -|b_t(x_t)|\,|x_t| \ge -K^2|x_t|^2 - |b_t(0)|\,|x_t|, \qquad 0 \le \varphi'(r) \le 1, \qquad r \le \varphi(r),\]
and, furthermore, that \(\varphi'' \ge 0\) and \(\|\sigma_t(x_t)\|^2\,|x_t|^2 \ge |\sigma_t^*(x_t)\,x_t|^2\). Therefore, letting \(\varepsilon\) go to zero and using the theorem on bounded convergence, we obtain \(M\{\eta_{s_2}\,|\,\mathscr{F}_{s_1}\} \ge \eta_{s_1}\); hence \(\eta_t\) is a submartingale. From well-known inequalities for submartingales (see Appendix 2) as well as Hölder's inequality we have
\[M\sup_{s\le t}|x_s|^{2q} \le M\sup_{s\le t}\eta_s^{2q} \le 4\,M\eta_t^{2q} \le 4\cdot 2^{2q-1}e^{2qK^2 t}\,M|x_t|^{2q} + 4\cdot 2^{2q-1}e^{2qK^2 t}\,t^{2q-1}\,M\int_0^t |b_s(0)|^{2q}\,ds.\]
It remains only to use Lemma 2 or Corollary 6 for estimating \(M|x_t|^{2q}\) and, furthermore, to note that \(t^a e^{bt} \le N(a,b)\,e^{2bt}\) for \(a > 0\), \(b > 0\), \(t > 0\). The theorem is proved.

10. Corollary. Let the condition (R) be satisfied, and let \(x_t\) be a solution of Eq. (1). Then there exists a constant \(N(q,K)\) such that for all \(q > 1\), \(t \in [0,T]\)
\[M\sup_{s\le t}|x_s|^{2q} \le N t^{q-1} e^{Nt}\,M\int_0^t \bigl[|\xi_s|^{2q} + h_s^{2q} + r_s^{2q}\bigr]\,ds.\]
If \(\xi_t\) is a separable process, then
\[M\sup_{s\le t}|x_s|^{2q} \le N\,M\sup_{s\le t}|\xi_s|^{2q} + N t^{q-1} e^{Nt}\,M\int_0^t \bigl[h_s^{2q} + r_s^{2q}\bigr]\,ds.\]
First, we note that the second inequality follows readily from the first one. In order to prove the first inequality, we introduce the process \(y_t = x_t - \xi_t\). It is seen that \(dy_t = \sigma_t(y_t + \xi_t)\,dw_t + b_t(y_t + \xi_t)\,dt\), \(y_0 = 0\). In estimating \(y_t\) it suffices, as was done in proving Lemma 2, to consider only the case where \(y_t(\omega)\) is a bounded function. Similarly to what we did in proving the theorem above, we use here the inequality \(b_t(y_t + \xi_t)\,y_t \ge -K^2|y_t|^2 - (K^2|\xi_t| + h_t)\,|y_t|\), thus obtaining that the process
\[\eta_t = e^{K^2 t}\varphi(|y_t|) + \int_0^t e^{K^2 s}\bigl(K^2|\xi_s| + h_s\bigr)\,ds\]
is a submartingale.
is a submartingale. From the above, using the inequalities for submartingales as well as Holder's inequality, we find M sup sst
1 yt12q
PM
SUP r:q
54 4 ~ ~ : ~
S l t
+
For estimating Mly,12q,it remains to apply Lemma 2, noting that ot(x t,), b,(x + t,)satisfy the condition ( R ) in which we replace r:, h,, K by r: 2K21tt12,ht + K21tt1,2 K , respectively.
11. Corollary. Let \(\int_0^T \|\sigma_s\|^2\,ds < \infty\) (a.s.). Then for all \(q \ge 1\)
\[M\sup_{t\le T}\Bigl|\int_0^t \sigma_s\,dw_s\Bigr|^{2q} \le 2^{q+2}(2q-1)^q\, T^{q-1}\, M\int_0^T \|\sigma_s\|^{2q}\,ds.\]
This corollary, as well as Corollary 10, can be proved by arguments similar to those used to prove the theorem. Taking \(\sigma_s(x) = \sigma_s\), \(b_s(x) = 0\), we obtain the process \(x_t = \int_0^t \sigma_s\,dw_s\). The proof of the theorem for \(K = 0\) shows that \(|x_t|\) is a submartingale. Hence \(M\sup_{t\le T}|x_t|^{2q} \le 4\,M|x_T|^{2q}\), which can be estimated with the aid of Corollary 3.
12. Corollary. Let there exist a constant \(K_1\) such that \(\|\sigma_t(x)\| + |b_t(x)| \le K_1(1 + |x|)\) for all \(t\), \(\omega\), \(x\). Let \(x_t\) be a solution of Eq. (1) for \(\xi_t \equiv x_0\), where \(x_0\) is a fixed point of \(E_d\). There exists a constant \(N(q,K_1)\) such that for all \(q \ge 0\), \(t \in [0,T]\)
\[M\sup_{s\le t}|x_s - x_0|^q \le N t^{q/2} e^{Nt}\,(1 + |x_0|)^q, \qquad M\sup_{s\le t}|x_s|^q \le N e^{Nt}\,(1 + |x_0|)^q.\]
In fact, for \(q \ge 2\) these estimates are particular cases of the estimates given in Corollary 10. To prove these inequalities for \(q \in [0,2]\) we need only take \(\eta_1 = \sup_{s\le t}|x_s - x_0|\,(1 + |x_0|)^{-1}\), \(\eta_2 = \sup_{s\le t}|x_s|\,(1 + |x_0|)^{-1}\) and, furthermore, use the fact that, in conjunction with Hölder's inequality, \(M|\eta_i|^q \le (M|\eta_i|^2)^{q/2}\).

13. Remark. The successive approximations \(x_t^n\) defined in proving Theorem 7 have the property that
\[\lim_{n\to\infty} M\sup_{t\le T}|x_t^n - x_t|^2 = 0,\]
where \(x_t\) is a solution of Eq. (1). Indeed,
\[x_t^{n+1} = \xi_t + Ix_t^n, \qquad x_t = \xi_t + Ix_t, \qquad x_t^{n+1} - x_t = Ix_t^n - Ix_t,\]
from which, using Corollary 11 and the Cauchy inequality, we obtain
\[M\sup_{t\le T}|x_t^{n+1} - x_t|^2 \le 2M\sup_{t\le T}\Bigl|\int_0^t [\sigma_s(x_s^n) - \sigma_s(x_s)]\,dw_s\Bigr|^2 + 2M\sup_{t\le T}\Bigl|\int_0^t [b_s(x_s^n) - b_s(x_s)]\,ds\Bigr|^2 \le N\,M\int_0^T |x_s^n - x_s|^2\,ds.\]
As was seen in the proof of Theorem 7, the last expression tends to zero as \(n \to \infty\).
6. Existence of a Solution of a Stochastic Equation with Measurable Coefficients

In this section, using the estimates obtained in Sections 2.2–2.5, we prove that in a wide class of cases there exist a probability space and a Wiener process on this space such that a stochastic equation having measurable coefficients as well as this Wiener process is solvable. In other words, according to conventional terminology, we construct here "weak" solutions of a stochastic equation. The main difference between "weak" solutions and usual ("strong") solutions consists in the fact that the latter can be constructed on any a priori given probability space on the basis of any given Wiener process. Let \(\sigma(t,x)\) be a matrix of dimension \(d \times d\), and let \(b(t,x)\) be a \(d\)-dimensional vector. We assume that \(\sigma(t,x)\), \(b(t,x)\) are given for \(t \ge 0\), \(x \in E_d\), and, in addition, are bounded and Borel measurable with respect to \((t,x)\). Also, let the matrix \(\sigma(t,x)\) be positive definite and, moreover, for some constant \(\delta > 0\) let
\[(\sigma(t,x)\lambda,\,\lambda) \ge \delta|\lambda|^2\]
for all \((t,x)\), \(\lambda \in E_d\).
1. Theorem. Let \(x \in E_d\). There exist a probability space, a Wiener process \((w_t,\mathscr{F}_t)\) on this space, and a continuous process \(x_t\), progressively measurable with respect to \(\{\mathscr{F}_t\}\), such that almost surely for all \(t \ge 0\)
\[x_t = x + \int_0^t \sigma(s,x_s)\,dw_s + \int_0^t b(s,x_s)\,ds.\]
For proving our theorem we need two assertions due to A. V. Skorokhod.
2. Lemma.⁵ Suppose that \(d_1\)-dimensional random processes \(\xi_t^n\) (\(t \ge 0\), \(n = 0,1,2,\ldots\)) are defined on some probability space. Assume that for each \(T \ge 0\), \(\varepsilon > 0\)
\[\lim_{c\to\infty} \sup_n \sup_{t\le T} P\{|\xi_t^n| > c\} = 0, \qquad \lim_{h\downarrow 0} \sup_n \sup_{\substack{t_1,t_2\le T\\ |t_1 - t_2|\le h}} P\{|\xi_{t_1}^n - \xi_{t_2}^n| > \varepsilon\} = 0.\]
Then one can choose a sequence of numbers \(n'\), a probability space, and random processes \(\tilde\xi_t^{n'}\), \(\tilde\xi_t\) defined on this probability space, such that all finite-dimensional distributions of \(\tilde\xi_t^{n'}\) coincide with the pertinent finite-dimensional distributions of \(\xi_t^{n'}\), and \(P\{|\tilde\xi_t^{n'} - \tilde\xi_t| > \varepsilon\} \to 0\) as \(n' \to \infty\) for all \(\varepsilon > 0\), \(t \ge 0\).

3. Lemma.⁶ Suppose the assumptions of Lemma 2 are satisfied. Also, suppose that \(d_1\)-dimensional Wiener processes \((w_t^n,\mathscr{F}_t^n)\) are defined on the aforementioned probability space. Assume that the functions \(\xi_t^n(\omega)\) are bounded on \([0,\infty)\times\Omega\) uniformly in \(n\) and that the stochastic integrals \(I_t^n = \int_0^t \xi_s^n\,dw_s^n\) are defined. Finally, let \(\xi_s^n \to \xi_s^0\), \(w_s^n \to w_s^0\) in probability as \(n \to \infty\) for each \(s \ge 0\). Then \(I_t^n \to I_t^0\) as \(n \to \infty\) in probability for each \(t \ge 0\).
4. Proof of Theorem 1. We smooth out \(\sigma\), \(b\) using convolution. Let \(\sigma^n(t,x) = \sigma^{(\varepsilon_n)}(t,x)\), \(b^n(t,x) = b^{(\varepsilon_n)}(t,x)\) (see Section 2.1),⁷ where \(\varepsilon_n \to 0\) as \(n \to \infty\), \(\varepsilon_n \ne 0\). It is clear that \(\sigma^n\), \(b^n\) are bounded, \(\sigma^n \to \sigma\), \(b^n \to b\) (a.e.) as \(n \to \infty\), and
\[(\sigma^n\lambda,\,\lambda) = (\sigma\lambda,\,\lambda)^{(\varepsilon_n)} \ge \delta|\lambda|^2\]
for all \(\lambda \in E_d\), \(n \ge 1\). Let \(\sigma^0 = \sigma\), \(b^0 = b\). We take some \(d\)-dimensional Wiener process \((w_t,\mathscr{F}_t)\). Furthermore, we consider for \(n = 1,2,\ldots\) solutions of the following stochastic equations:
\[dx_t^n = \sigma^n(t,x_t^n)\,dw_t + b^n(t,x_t^n)\,dt, \qquad t \ge 0, \quad x_0^n = x.\]
Note that the derivatives of \(\sigma^n\), \(b^n\) are bounded for each \(n\). Hence the functions \(\sigma^n\), \(b^n\) satisfy the Lipschitz condition, and the solutions of the aforegoing equations in fact exist. According to Corollary 5.12, for each \(T\)
\[\sup_n M\sup_{t\le T} |x_t^n|^4 < \infty. \tag{1}\]

⁵ See [70, Chapter 1, §6].
⁶ See [70, Chapter 2, §6].
⁷ In computing the convolution we assume that \(\sigma^{ij}(t,x) = \delta\,\delta^{ij}\), \(b^i(t,x) = 0\) for \(t < 0\).
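The smoothing \(\sigma^n = \sigma^{(\varepsilon_n)}\) used here is convolution with a smooth probability density supported in a ball of radius \(\varepsilon_n\); because the kernel is a probability density, mollification preserves both the uniform bound and the ellipticity lower bound. A one-dimensional sketch of this construction (the kernel, grid, and discontinuous coefficient are illustrative choices; the precise definition of \(\sigma^{(\varepsilon)}\) is in Section 2.1):

```python
import numpy as np

def mollify(f, eps, xs):
    # f^{(eps)}(x) = int f(x - eps*u) k(u) du, with k a smooth bump on [-1,1]
    # normalized to integrate to 1 (rectangle rule on a fine grid).
    u = np.linspace(-1.0, 1.0, 401)
    k = np.where(np.abs(u) < 1.0,
                 np.exp(-1.0 / (1.0 - u ** 2 + 1e-300)), 0.0)
    du = u[1] - u[0]
    k /= k.sum() * du                      # weights now sum to one
    return np.array([(k * f(x - eps * u)).sum() * du for x in xs])

# A bounded measurable (discontinuous) coefficient with sigma >= delta = 1.
sigma = lambda x: np.where(x < 0.0, 1.0, 2.0)
xs = np.linspace(-3.0, 3.0, 121)

for eps in (0.5, 0.1, 0.02):
    s_eps = mollify(sigma, eps, xs)
    # Mollification is an average of values of sigma, so the bounds survive.
    assert s_eps.min() >= 1.0 - 1e-9 and s_eps.max() <= 2.0 + 1e-9
# Away from the discontinuity the mollification agrees with sigma.
assert abs(mollify(sigma, 0.02, np.array([1.0]))[0] - 2.0) < 1e-9
```

The smoothed coefficient is \(C^\infty\) with bounded derivatives for each fixed \(\varepsilon\), which is exactly what makes the approximating equations solvable by Theorem 5.7.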
Using Chebyshev's inequality we then obtain
\[\lim_{c\to\infty}\sup_n\sup_{t\le T} P\{|x_t^n| > c\} = 0.\]
Further, for \(t_2 > t_1\)
\[x_{t_2}^n - x_{t_1}^n = \int_{t_1}^{t_2} \sigma^n(s,x_s^n)\,dw_s + \int_{t_1}^{t_2} b^n(s,x_s^n)\,ds,\]
from which, according to Corollary 5.3,⁸ for \(t_2 - t_1 \le 1\) we have
\[M|x_{t_2}^n - x_{t_1}^n|^4 \le N|t_2 - t_1|^2,\]
where the constants \(N\) depend only on the upper bounds of \(\|\sigma\|\), \(|b|\) and do not depend on \(n\). In conjunction with Chebyshev's inequality,
\[\lim_{h\downarrow 0}\sup_n \sup_{\substack{t_1,t_2\le T\\ |t_1 - t_2|\le h}} P\{|x_{t_1}^n - x_{t_2}^n| > \varepsilon\} = 0.\]
Using Lemma 2 we conclude that there exist a sequence of numbers \(n'\), a probability space, and random processes \((\tilde x_t^{n'}, \tilde w_t^{n'})\) on this probability space such that the finite-dimensional distributions of \((\tilde x_t^{n'}, \tilde w_t^{n'})\) coincide with the corresponding finite-dimensional distributions of the processes \((x_t^{n'}, w_t)\), and for all \(t \ge 0\) the limit, say \((\tilde x_t^0, \tilde w_t^0)\), of the sequence \((\tilde x_t^{n'}, \tilde w_t^{n'})\) exists in probability as \(n' \to \infty\). For brevity of notation we assume that the sequence \(\{n'\}\) coincides with \(\{1,2,3,\ldots\}\). The processes \((\tilde x_t^n, \tilde w_t^n)\) can be regarded as separable processes for all \(n \ge 0\). Since \(M|\tilde x_{t_2}^n - \tilde x_{t_1}^n|^4 = M|x_{t_2}^n - x_{t_1}^n|^4 \le N|t_2 - t_1|^2\) for \(n > 0\), \(|t_2 - t_1| \le 1\), by Fatou's lemma the relationship between the extreme terms of this inequality holds for \(n = 0\) as well. Then, by Kolmogorov's theorem, \(\tilde x_t^n\) is a continuous process for all \(n \ge 0\); the \(\tilde w_t^n\), being separable Wiener processes, are continuous as well.

⁸ In Corollary 5.3 one should take \(\tau = \infty\), \(t = t_2\), \(\sigma_s = \sigma^n(s,x_s^n)\chi_{s > t_1}\).
Further, we fix some \(T > 0\). The processes \((x_t^n; w_t)\) are measurable with respect to \(\mathscr{F}_T\) for \(t \le T\); the increments of \(w_t\) after the instant of time \(T\) do not depend on \(\mathscr{F}_T\). Therefore the processes \((x_t^n; w_t)\) (\(t \le T\)) do not depend on the increments of \(w_t\) after the instant of time \(T\). Due to the coincidence of finite-dimensional distributions, the processes \((\tilde x_t^n; \tilde w_t^n)\) (\(t \le T\)) do not depend on the increments of \(\tilde w_t^n\) after the time \(T\) for \(n \ge 1\). This property obviously holds true for the limiting process as well, i.e., it holds for \(n = 0\). This readily implies that for \(n \ge 0\) the processes \(\tilde w_t^n\) are Wiener processes with respect to the \(\sigma\)-algebras \(\mathscr{F}_t^{(n)}\), defined as the completions of \(\sigma(\tilde x_s^n, \tilde w_s^n : s \le t)\). Furthermore, for \(n \ge 0\) and each \(s \le t\) the variable \(\tilde x_s^n\) is \(\mathscr{F}_t^{(n)}\)-measurable. Since \(\tilde x_s^n\) is continuous with respect to \(s\), \(\tilde x_s^n\) is a progressively measurable process with respect to \(\{\mathscr{F}_t^{(n)}\}\). These arguments show that the stochastic integrals given below make sense. Let \(\kappa_m(a) = 2^{-m}[2^m a]\), where \([a]\) is the largest integer \(\le a\). Since the \(\sigma^n(t,\tilde x_t^n)\) for \(n \ge 1\) are bounded functions of \((\omega,t)\), continuous with respect to \(t\), and since \(\kappa_m(t) \to t\) as \(m \to \infty\), then
-
lim M
m-t m
Jb l l c n ( t , a
-
- o n ( d t ) , + m . ) ) ~dt~ 2 0
for n 2 1 for each T 2 0. Hence for each t 2 0
So
6
Writing similar relations for \(\int_0^t \sigma^n(s,x_s^n)\,dw_s\), \(\int_0^t b^n(s,\tilde x_s^n)\,ds\), \(\int_0^t b^n(s,x_s^n)\,ds\), and using the fact that the corresponding finite-dimensional distributions coincide, we can easily prove that for all \(n \ge 1\), \(t \ge 0\)
\[\tilde x_t^n = x + \int_0^t \sigma^n(s,\tilde x_s^n)\,d\tilde w_s^n + \int_0^t b^n(s,\tilde x_s^n)\,ds \tag{2}\]
for each \(t \ge 0\) almost surely. We have thus completed the first stage of proving Theorem 1. Whereas before we had the processes \(x_t^n\), about whose convergence we knew nothing, we now have the convergent processes \(\tilde x_t^n\). However, in contrast to \(x_t^n\), the processes \(\tilde x_t^n\) satisfy an equation containing a Wiener process which changes as \(n\) changes. We take the limit in (2) as \(n \to \infty\). For each \(n_0 \ge 1\) we have
\[\int_0^t \sigma^n(s,\tilde x_s^n)\,d\tilde w_s^n = \int_0^t \sigma^{n_0}(s,\tilde x_s^n)\,d\tilde w_s^n + \int_0^t [\sigma^n - \sigma^{n_0}](s,\tilde x_s^n)\,d\tilde w_s^n, \tag{3}\]
where \(\sigma^{n_0}(s,x)\) satisfies the Lipschitz condition with respect to \((s,x)\). Hence
\[\|\sigma^{n_0}(t_2,\tilde x_{t_2}^n) - \sigma^{n_0}(t_1,\tilde x_{t_1}^n)\| \le N\bigl(|t_2 - t_1| + |\tilde x_{t_2}^n - \tilde x_{t_1}^n|\bigr).\]
In addition, by virtue of (1)
\[\lim_{h\downarrow 0}\sup_n \sup_{\substack{t_1,t_2\le T\\ |t_2 - t_1|\le h}} P\{\|\sigma^{n_0}(t_2,\tilde x_{t_2}^n) - \sigma^{n_0}(t_1,\tilde x_{t_1}^n)\| > \varepsilon\} = 0.\]
From this it follows, according to Lemma 3, that the first term in (3) tends in probability to \(\int_0^t \sigma^{n_0}(s,\tilde x_s^0)\,d\tilde w_s^0\). Therefore, applying Chebyshev's inequality, we obtain
\[\varlimsup_{n\to\infty} P\Bigl\{\Bigl|\int_0^t [\sigma^n - \sigma^{n_0}](s,\tilde x_s^n)\,d\tilde w_s^n\Bigr| > \varepsilon\Bigr\} \le \frac{1}{\varepsilon^2}\,\varlimsup_{n\to\infty} M\int_0^t \|\sigma^n - \sigma^{n_0}\|^2(s,\tilde x_s^n)\,ds.\]
We estimate the last expression. It is seen that
\[M\int_0^t |f(s,\tilde x_s^n)|\,ds \le e^t M\int_0^t e^{-s}|f(s,\tilde x_s^n)|\,ds \le e^t M\int_0^{\infty} e^{-s}|f(s,\tilde x_s^n)|\,ds.\]
Therefore, by Theorem 3,⁹
\[M\int_0^{\infty} e^{-s}|f(s,\tilde x_s^n)|\,ds \le N\|f\|_{d+1,E_{d+1}}\]
for \(n \ge 1\), where \(N\) does not depend on \(n\). For \(n = 0\) the last inequality holds as well, which fact we can easily prove for continuous \(f\) by taking the limit as \(n \to \infty\) and using Fatou's lemma. Furthermore, we can prove it for all Borel \(f\) by applying the results obtained in [54, Chapter 1, §2]. Let \(\varpi(t,x)\) be a continuous function equal to zero for \(t^2 + |x|^2 \ge 1\) and such that \(\varpi(0,0) = 1\), \(0 \le \varpi(t,x) \le 1\). Then, for \(R > 0\)
$$\overline{\lim_{n\to\infty}}\, M \int_0^t \|\sigma_n - \sigma_{n_0}\|^2(s, \tilde{x}_s^n)\,ds \le N M \int_0^t \Big[1 - w\Big(\frac{s}{R}, \frac{\tilde{x}_s^0}{R}\Big)\Big]\,ds + \overline{\lim_{n\to\infty}}\, M \int_0^t \|\sigma_n - \sigma_0\|^2(s, \tilde{x}_s^n)\,ds + N \big\|\, \|\sigma_0 - \sigma_{n_0}\|^2 \big\|_{d+1,\, 2R,\, R}.$$

It should be noted that $|\sigma_n^*\lambda|^2 \ge \delta^2|\lambda|^2$, since $\delta|\lambda|^2 \le (a_n\lambda, \lambda) \le |\lambda|\,|\sigma_n^*\lambda|$.
Estimating $M \int_0^t \|b_n - b_{n_0}\|(r, \tilde{x}_r^n)\,dr$ in a similar fashion, we find that a similar bound holds for each $n_0 > 0$, $R > 0$. Finally, we note that the last expression tends to zero if we let first $n_0 \to \infty$ and next $R \to \infty$. Therefore,
in probability. We have a similar situation for the second integral in (2). Therefore it follows from (2) that
for each $t \ge 0$ almost surely. It remains only to note that each side of the last equality is continuous with respect to $t$; hence both sides coincide on a set of full probability. We have thus proved Theorem 1.
7. Some Properties of a Random Process Depending on a Parameter

In investigating the smoothness properties of a payoff function in optimal control problems it is convenient to use theorems on differentiability in the mean of random variables with respect to a parameter. It turns out frequently that the random variable in question, say $J(p)$, depends on a parameter $p$ in a complicated manner. For example, $J(p)$ can be given as a functional of the trajectories of some process $x_t^p$ which depends on $p$. In this section we prove assertions about differentiability in the mean of such functionals of the process. Three constants $T, K, m > 0$ will be fixed throughout the entire section.

1. Definition. Let a real random process $x_t(\omega)$ be defined for $t \in [0,T]$. We write $x_t \in \mathscr{L}$ if the process $x_t(\omega)$ is measurable with respect to $(\omega, t)$ and for all $q \ge 1$
$$M \int_0^T |x_t|^q\,dt < \infty.$$
We write $x_t \in \mathscr{L}B$ if $x_t$ is a separable process and for all $q \ge 1$
$$M \sup_{t \le T} |x_t|^q < \infty.$$
Convergence in the sets $\mathscr{L}$, $\mathscr{L}B$ can be defined in a natural way.
2. Definition. Let $x_t^0, x_t^1, \ldots, x_t^n, \ldots \in \mathscr{L}$ ($\mathscr{L}B$). We say that the $\mathscr{L}$-limit ($\mathscr{L}B$-limit) of the processes $x_t^n$ equals $x_t^0$, and we write $\mathscr{L}\text{-}\lim_{n\to\infty} x_t^n = x_t^0$ ($\mathscr{L}B\text{-}\lim_{n\to\infty} x_t^n = x_t^0$), if for all $q \ge 1$
$$\lim_{n\to\infty} M \int_0^T |x_t^n - x_t^0|^q\,dt = 0 \qquad \Big(\lim_{n\to\infty} M \sup_{t \le T} |x_t^n - x_t^0|^q = 0\Big).$$
Having introduced the notions of the $\mathscr{L}$-limit ($\mathscr{L}B$-limit), it is clear what is meant by $\mathscr{L}$-continuity ($\mathscr{L}B$-continuity) of a process $x_t^p$ with respect to the parameter $p$ at a point $p_0$.

3. Definition. Suppose that $p_0 \in E_d$, a unit vector $l \in E_d$, and $y_t \in \mathscr{L}$ ($\mathscr{L}B$) are given. Further, suppose that for each $p$ from some neighborhood of the point $p_0$ a process $x_t^p \in \mathscr{L}$ ($\mathscr{L}B$) is given. We say that $y_t$ is an $\mathscr{L}$-derivative ($\mathscr{L}B$-derivative) of $x_t^p$ at the point $p_0$ along the direction $l$, and we write
$$y_t = \mathscr{L}\text{-}\frac{\partial}{\partial l} x_t^p \Big|_{p=p_0}, \quad \text{i.e.} \quad \mathscr{L}\text{-}\lim_{r \to 0} \frac{1}{r}\big(x_t^{p_0+rl} - x_t^{p_0}\big) = y_t$$
(and similarly with the $\mathscr{L}B$-limit).
We say that the process $x_t^p$ is once $\mathscr{L}$-differentiable ($\mathscr{L}B$-differentiable) at the point $p_0$ if this process has $\mathscr{L}$-derivatives ($\mathscr{L}B$-derivatives) at the point $p_0$ along all directions $l$. The process $x_t^p$ is said to be $i$ times ($i \ge 2$) $\mathscr{L}$-differentiable ($\mathscr{L}B$-differentiable) at the point $p_0$ if this process is once $\mathscr{L}$-differentiable ($\mathscr{L}B$-differentiable) in some neighborhood¹⁰ of the point $p_0$ and, in addition, each (first) $\mathscr{L}$-derivative ($\mathscr{L}B$-derivative) of this process is $i-1$ times $\mathscr{L}$-differentiable ($\mathscr{L}B$-differentiable) at the point $p_0$. Definitions 1-3 have been given for numerical processes $x_t$ only. They can be extended to vector and matrix processes $x_t$ in the obvious way. Further, as is commonly done in conventional analysis, we write $y_t^p = \mathscr{L}\text{-}(\partial/\partial l)x_t^p$ if $y_t^{p_1} = \mathscr{L}\text{-}(\partial/\partial l)x_t^p|_{p=p_1}$ for all $p_1$ considered, $\mathscr{L}\text{-}(\partial^2/\partial l_2\,\partial l_1)x_t^p = \mathscr{L}\text{-}(\partial/\partial l_2)\big[\mathscr{L}\text{-}(\partial/\partial l_1)x_t^p\big]$, etc. We say that $x_t^p$ is $i$ times $\mathscr{L}$-continuously $\mathscr{L}$-differentiable if all $\mathscr{L}$-derivatives of $x_t^p$ up to order $i$ inclusive are $\mathscr{L}$-continuous. We shall not dwell in future on the explanation of such obvious facts. We shall apply Definitions 1-3 to random variables as well as to random processes, the former being regarded as time-independent processes. In order to become familiar with the given definitions, we note a few simple properties they imply. It is obvious that the notion of $\mathscr{L}$-continuity is equivalent to that of $\mathscr{L}B$-continuity for random variables. Furthermore, $|Mx^p - Mx^{p_0}| \le M|x^p - x^{p_0}|$. Hence the expectation of an $\mathscr{L}$-
¹⁰ That is, at each point of this neighborhood.
continuous random variable is continuous. Since
the derivative of $Mx^p$ along the direction $l$ at the point $p_0$ is equal to the expectation of the $\mathscr{L}$-derivative of $x^p$, if the latter exists. Therefore, the sign of the first derivative is interchangeable with the sign of the expectation. Combining the properties listed in an appropriate way, we deduce that $(\partial/\partial l)Mx^p$ exists and is continuous at the point $p_0$ if the variable $x^p$ is $\mathscr{L}$-continuously $\mathscr{L}$-differentiable at the point $p_0$ along the direction $l$. A similar situation is observed for derivatives of higher orders. Since for $\tau \le T$
$x_\tau^p$ is an $\mathscr{L}$-continuous variable if $\tau(\omega) \le T$ for all $\omega$, $x_t^p$ is an $\mathscr{L}B$-continuous process, and $x_\tau^p$ is a measurable function of $\omega$. A similar inequality shows that for the same $\tau$
if $x_t^p$ has an $\mathscr{L}B$-derivative along the direction $l$, and if $x_\tau^p$ and the right side of (1) are measurable functions of $\omega$. These arguments allow us to derive the properties of $\mathscr{L}$-continuity and $\mathscr{L}$-differentiability of the random variable $x_\tau^p$ from the properties of $\mathscr{L}B$-continuity and $\mathscr{L}B$-differentiability of the process $x_t^p$. Furthermore, (1) shows that the order of the substitution of $\tau$ for $t$ and the order of the computation of derivatives are interchangeable. Suppose that the process $x_t^p$ is continuous with respect to $t$ and is $\mathscr{L}B$-continuous with respect to $p$ at a point $p_0$. Also, suppose that $\tau(p)$ are random functions with values in $[0,T]$, continuous in probability at the point $p_0$. We assert that in this case $x_{\tau(p)}^p$, $x_{\tau(p)}^{p_0}$ are $\mathscr{L}$-continuous at the point $p_0$. In fact, the difference $|x_{\tau(p)}^p - x_{\tau(p_0)}^{p_0}|^q \to 0$ in probability as $p \to p_0$, and, in addition, this difference is bounded by the summable quantity $2^{q-1}\sup_t\big(|x_t^p|^q + |x_t^{p_0}|^q\big)$. Therefore, the expectation of the difference indicated tends to zero, i.e., the variable $x_{\tau(p)}^p$ is $\mathscr{L}$-continuous. The $\mathscr{L}$-continuity of the second variable follows from the $\mathscr{L}$-continuity of the first and from the corresponding inequalities. In conjunction with Hölder's inequality,
$$M \sup_{t \le T} \Big| \int_0^t x_s^p\,ds - \int_0^t x_s^{p_0}\,ds \Big|^q \le T^{q-1} M \int_0^T |x_s^p - x_s^{p_0}|^q\,ds.$$
Therefore, $\int_0^t x_s^p\,ds$ is an $\mathscr{L}B$-continuous process if the process $x_t^p$ is $\mathscr{L}$-continuous. We prove in a similar way that this integral has an $\mathscr{L}B$-derivative along the direction $l$, which coincides with the integral of the $\mathscr{L}$-derivative of $x_t^p$ along the direction $l$, if the latter derivative exists. In other words, the derivative can be brought under the integral sign. Combining the assertions given above in an appropriate way, we can obtain many necessary facts. They are, however, too simple to require formal proof. It is useful to keep in mind that if $\{\mathscr{F}_t\}$ is a family of $\sigma$-algebras in $\Omega$, and if the process $x_t^p$ is $k$ times $\mathscr{L}$-differentiable at the point $p$ and, in addition, progressively measurable with respect to $\{\mathscr{F}_t\}$, then all the derivatives of the process $x_t^p$ can be chosen to be progressively measurable with respect to $\{\mathscr{F}_t\}$. Keeping in mind that induction is possible in this situation, we prove the foregoing assertion only for $k = 1$. Let $y_t^p = \mathscr{L}\text{-}(\partial/\partial l)x_t^p$. Having fixed $p$, we find a sequence $r_n \to 0$ such that $(1/r_n)(x_t^{p+r_nl} - x_t^p) \to y_t^p$ almost everywhere with respect to $dP \times dt$. Further, we take $\bar{y}_t^p = \lim_{n\to\infty}(1/r_n)(x_t^{p+r_nl} - x_t^p)$ for those $(\omega, t)$ for which this limit exists and $\bar{y}_t^p = 0$ on the remaining set. It is seen that the process $\bar{y}_t^p$ is progressively measurable. Also, it is seen that
since $\bar{y}_t^p = y_t^p$ ($dP \times dt$-a.s.). We shall take this remark into account each time we calculate $\mathscr{L}$-derivatives of a stochastic integral. We have mentioned above that differentiation is interchangeable with (standard) integration. Applying Corollary 5.11, we immediately obtain that if $(w_t, \mathscr{F}_t)$ is a $d_1$-dimensional Wiener process and $\sigma_t^p$ is a matrix of dimension $d \times d_1$ which is progressively measurable with respect to $\{\mathscr{F}_t\}$ and is $\mathscr{L}$-continuous at a point $p_0$, then the integral $\int_0^t \sigma_s^p\,dw_s$ is $\mathscr{L}B$-continuous at the point $p_0$. If $\sigma_t^p$ is $\mathscr{L}$-differentiable along the direction $l$ at the point $p_0$, then for $p = p_0$
$$\mathscr{L}B\text{-}\frac{\partial}{\partial l} \int_0^t \sigma_s^p\,dw_s = \int_0^t \Big(\mathscr{L}\text{-}\frac{\partial}{\partial l}\sigma_s^p\Big)\,dw_s.$$
A similar assertion is valid in an obvious way for derivatives of higher orders.

4. Exercise
Prove that if the function $x_t^p$ is continuous (continuously differentiable) with respect to $p$ in the usual sense for all $(t, \omega)$ and, in addition, the function $M \int_0^T |x_t^p|^q\,dt$ ($M \int_0^T |(\partial/\partial l)x_t^p|^q\,dt$ for each $l$, $|l| = 1$) is bounded in some region for each $q \ge 1$, then the process $x_t^p$ is $\mathscr{L}$-continuous ($\mathscr{L}$-differentiable and $\mathscr{L}\text{-}(\partial/\partial l)x_t^p = (\partial/\partial l)x_t^p$) in this region.
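Exercise 4 identifies the $\mathscr{L}$-derivative with the classical derivative in $p$ under moment bounds. This can be checked numerically for a hypothetical family $x_t^p = \sin(pt) + Z$, with $Z$ a $p$-independent random shift per path (the family, grid, and sample sizes below are illustrative assumptions, not the text's):

```python
import numpy as np

rng = np.random.default_rng(1)
T, steps, paths = 1.0, 100, 500
t = np.linspace(0.0, T, steps)

# Hypothetical family: x^p_t = sin(p t) + Z; its usual derivative in p is
# t cos(p t), which Exercise 4 identifies with the L-derivative.
Z = rng.normal(size=(paths, 1))

def x(p):
    return np.sin(p * t)[None, :] + Z

def mean_gap(p0, r, q=2):
    diff_quot = (x(p0 + r) - x(p0)) / r       # difference quotient in p
    deriv = (t * np.cos(p0 * t))[None, :]     # classical derivative
    # estimate of M int_0^T |quotient - derivative|^q dt
    return float(np.mean(np.abs(diff_quot - deriv) ** q) * T)

gaps = [mean_gap(0.5, r) for r in (0.1, 0.01, 0.001)]
assert gaps[0] > gaps[1] > gaps[2]  # the mean-q gap shrinks as r -> 0
```

The random shift $Z$ cancels in the difference quotient, so the gap decays like $r^q$, consistent with convergence of the quotient to the derivative in the mean.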
Further, we turn to investigating the continuity and differentiability properties of a composite function. To do this, we need three lemmas.

5. Lemma. Suppose that for $n = 1, 2, \ldots$, $t \in [0,T]$, $x \in E_{d_1}$, the $d_1$-dimensional processes $x_t^n$ measurable with respect to $(\omega, t)$ are defined, and, in addition, the variables $h_t^n(x)$ measurable with respect to $(\omega, t, x)$ are given. Assume that $x_t^n \to 0$ as $n \to \infty$ with respect to the measure $dP \times dt$, and that the variable $h_t^n(x)$ is continuous in $x$ for all $n, \omega, t$. Furthermore, we assume that one of the following two conditions is satisfied:

a. for almost all $(\omega, t)$
$$\lim_{\delta \to 0}\, \overline{\lim_{n\to\infty}}\, w_t^n(\delta) = 0,$$
where $w_t^n(\delta) = \sup_{|x| \le \delta} |h_t^n(x)|$;

b. for each $\varepsilon > 0$
$$\lim_{\delta \to 0}\, \overline{\lim_{n\to\infty}} \int_0^T P\{w_t^n(\delta) > \varepsilon\}\,dt = 0.$$

Then $|h_t^n(x_t^n)| \le w_t^n(|x_t^n|) \to 0$ as $n \to \infty$ in measure $dP \times dt$.
PROOF. We note that since $h_t^n(x)$ is continuous in $x$, $w_t^n(\delta)$ is measurable with respect to $(\omega, t)$. Further, condition (b) follows from condition (a), since (a) implies that $w_t^n(\delta) \to 0$ as $n \to \infty$, $\delta \to 0$ almost everywhere, and (b) asserts the same, although with respect to the measure $dP \times dt$. Finally, for each $\varepsilon > 0$, $\delta > 0$,
$$\int_0^T P\{w_t^n(|x_t^n|) > \varepsilon\}\,dt \le \int_0^T P\{|x_t^n| > \delta\}\,dt + \int_0^T P\{w_t^n(\delta) > \varepsilon\}\,dt,$$
where the first summand tends to zero by assumption. Thus, letting $\delta \to 0$ and using (b), we have proved the lemma.
6. Lemma. Let $x_t^n$ be $d_1$-dimensional processes measurable with respect to $(\omega, t)$ ($n = 0, 1, 2, \ldots$, $t \in [0,T]$), such that $\mathscr{L}\text{-}\lim_{n\to\infty} x_t^n = x_t^0$. Let $f_t(x)$ be random variables defined for $t \in [0,T]$, $x \in E_{d_1}$, measurable with respect to $(\omega, t)$, continuous in $x$ for all $(\omega, t)$, and such that $|f_t(x)| \le K(1 + |x|)^m$ for all $\omega, t, x$. Then $\mathscr{L}\text{-}\lim_{n\to\infty} f_t(x_t^n) = f_t(x_t^0)$.

PROOF. First we note that under the condition $|f_t(x)| \le K(1 + |x|)^m$ the processes $f_t(x_t^n) \in \mathscr{L}$ for all $n \ge 0$. Next, we write $f_t(x_t^n) - f_t(x_t^0)$ as $h_t^n(y_t^n)$, where $h_t^n(x) = f_t(x + x_t^0) - f_t(x_t^0)$, $y_t^n = x_t^n - x_t^0$. Since $M \int_0^T |y_t^n|\,dt \to 0$, $y_t^n \to 0$ in measure $dP \times dt$, from which we have, using Lemma 5 applied to $y_t^n$ and $h_t^n(x)$, that $h_t^n(y_t^n) \to 0$ in measure $dP \times dt$. Since the function $|a|/(|a| + 1)$ is bounded and $g_t^n \equiv h_t^n(y_t^n)/\big(1 + |h_t^n(y_t^n)|\big) \to 0$ in measure $dP \times dt$, then for each $q \ge 1$
$$\lim_{n\to\infty} M \int_0^T |g_t^n|^{2q}\,dt = 0.$$
Moreover, in view of the estimate $|f_t(x)| \le K(1 + |x|)^m$ and the fact that
$$\sup_n M \int_0^T |x_t^n|^{2qm}\,dt < \infty, \tag{2}$$
we have
$$\sup_n M \int_0^T \big(1 + |h_t^n(y_t^n)|\big)^{2q}\,dt \le \sup_n M \int_0^T \big[1 + K(1 + |x_t^n|)^m + K(1 + |x_t^0|)^m\big]^{2q}\,dt < \infty. \tag{3}$$
Using the Cauchy inequality, we derive from (2) and (3) that
$$\lim_{n\to\infty} M \int_0^T |h_t^n(y_t^n)|^q\,dt \le \lim_{n\to\infty} \Big(M \int_0^T |g_t^n|^{2q}\,dt\Big)^{1/2} \Big(M \int_0^T \big(1 + |h_t^n(y_t^n)|\big)^{2q}\,dt\Big)^{1/2} = 0$$
for each $q \ge 1$. The lemma is proved. We note a simple corollary of Lemma 6.
7. Corollary. If for $n = 0, 1, 2, \ldots$ the one-dimensional processes $x_t^n, y_t^n$ are defined and $\mathscr{L}\text{-}\lim_{n\to\infty} x_t^n = x_t^0$, $\mathscr{L}\text{-}\lim_{n\to\infty} y_t^n = y_t^0$, then
$$\mathscr{L}\text{-}\lim_{n\to\infty} x_t^n y_t^n = x_t^0 y_t^0.$$
Indeed, the two-dimensional process $(x_t^n, y_t^n)$ has the $\mathscr{L}$-limit $(x_t^0, y_t^0)$. Furthermore, the function $f(x,y) = xy$ satisfies the growth condition $|f(x,y)| \le \big(1 + \sqrt{x^2 + y^2}\big)^2$. Hence $\mathscr{L}\text{-}\lim_{n\to\infty} f(x_t^n, y_t^n) = f(x_t^0, y_t^0)$.
8. Lemma. Suppose that the assumptions of Lemma 6 are satisfied. Also, suppose that for $n = 1, 2, \ldots$, $u \in [0,1]$ the $d_1$-dimensional random variables $x_t^n(u)$ are defined which are continuous in $u$, measurable with respect to $(\omega, t)$, and such that $|x_t^n(u) - x_t^0| \le |x_t^n - x_t^0|$. Then
$$\mathscr{L}\text{-}\lim_{n\to\infty} \int_0^1 f_t\big(x_t^n(u)\big)\,du = f_t(x_t^0). \tag{4}$$
PROOF. In accord with Hölder's inequality, for $q \ge 1$
$$M \int_0^T \Big| \int_0^1 f_t\big(x_t^n(u)\big)\,du - f_t(x_t^0) \Big|^q\,dt \le \int_0^1 M \int_0^T \big|f_t\big(x_t^n(u)\big) - f_t(x_t^0)\big|^q\,dt\,du.$$
It follows from the inequalities $|x_t^n(u) - x_t^0| \le |x_t^n - x_t^0|$, $|x_t^n(u)| \le |x_t^0| + |x_t^n - x_t^0|$ that $x_t^n(u) \in \mathscr{L}$ and $\mathscr{L}\text{-}\lim_{n\to\infty} x_t^n(u) = x_t^0$ for each $u \in [0,1]$. Therefore, by Lemma 6,
$$I_n(u) \equiv M \int_0^T \big|f_t\big(x_t^n(u)\big) - f_t(x_t^0)\big|^q\,dt \to 0.$$
Finally, by the inequalities above, the integrand in (4) belongs to $\mathscr{L}$, and the totality of variables $I_n(u)$ is bounded. By the Lebesgue theorem, as $n \to \infty$
thus proving the lemma. Further, we prove a theorem on continuity and differentiability of a composite function.

9. Theorem. Suppose that for $x \in E_{d_1}$ and $p$ in a neighborhood of a point $p_0 \in E_d$ the random processes $x_t^p = x_t^p(\omega)$, $f_t(x) = f_t(\omega, x)$ with values in $E_{d_1}$ and $E_1$, respectively, are given for $t \in [0,T]$ and measurable with respect to $(t, \omega)$.
(a) For all $t, \omega$ let the function $f_t(x)$ be continuous in $x$, let $|f_t(x)| \le K(1 + |x|)^m$, and let the process $x_t^p$ be $\mathscr{L}$-continuous at $p_0$. Then the process $f_t(x_t^p)$ is also $\mathscr{L}$-continuous at $p_0$.
(b) Suppose that for all $t, \omega$ the function $f_t(x)$ is $i$ times continuously differentiable in $x$. Furthermore, suppose that for all $t, \omega$ the absolute values of the function $f_t(x)$, as well as those of its derivatives up to order $i$ inclusive, do not exceed $K(1 + |x|)^m$. Then, if the process $x_t^p$ is $i$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at the point $p_0$, the process $f_t(x_t^p)$ is $i$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at the point $p_0$ as well. In addition, for the unit vector $l \in E_d$,
where
for those $i$, $p$ for which the existence of the left sides of (5) and (6) has been established.
PROOF. For proving (a) it suffices to take any sequence of points $p_n \to p_0$, to put $x_t^{(n)} = x_t^{p_n}$, and, finally, to make use of Lemma 6.
We shall prove (b) for $i = 1$. First we note that $f_t(x,y) \equiv f_{t(y)}(x)$ is a continuous function of $(x,y)$ and $|f_t(x,y)| = |f_{t(y)}(x)| \le K(1 + |x|)^m |y| \le K\big(1 + \sqrt{|x|^2 + |y|^2}\big)^{m+1}$. Further, we take the unit vector $l \in E_d$, a sequence of numbers $r_n \to 0$, and we put
$$x_t^{(n)}(u) = x_t^{p_0} + u\big(x_t^{p_0+r_nl} - x_t^{p_0}\big), \qquad y_t^{(n)} = r_n^{-1}\big(x_t^{p_0+r_nl} - x_t^{p_0}\big).$$
Using the Newton-Leibniz rule we have
$$r_n^{-1}\big[f_t\big(x_t^{p_0+r_nl}\big) - f_t\big(x_t^{p_0}\big)\big] = \int_0^1 f_t\big(x_t^{(n)}(u),\, y_t^{(n)}\big)\,du,$$
where $|x_t^{(n)}(u) - x_t^{p_0}|^2 + |y_t^{(n)} - y_t^{p_0}|^2 \le |x_t^{p_0+r_nl} - x_t^{p_0}|^2 + |y_t^{(n)} - y_t^{p_0}|^2$ and where, by Lemma 8 applied to $x_t^{(n)}(u)$ and $y_t^{(n)}(u) \equiv y_t^{(n)}$,
$$\mathscr{L}\text{-}\lim_{n\to\infty} \int_0^1 f_t\big(x_t^{(n)}(u),\, y_t^{(n)}\big)\,du = f_t\big(x_t^{p_0},\, y_t^{p_0}\big).$$
Therefore,
Finally, by (a), $f_t(x_t^{p_0}, y_t^{p_0})$ is $\mathscr{L}$-continuous with respect to $p_0$ if $x_t^{p_0}$ is $\mathscr{L}$-continuously $\mathscr{L}$-differentiable with respect to $p_0$. This proves the first assertion in (b) for $i = 1$. At the same time we have proved Eq. (5), which we find convenient to write as follows:
For proving (b) for all $i$ we apply the method of induction. Assume that the first assertion in (b) is proved for $i \le j$ and for any processes $f_t(x)$, $x_t^p$ satisfying condition (b). Let the pair $f_t(x)$, $x_t^p$ satisfy the conditions of (b) for $i = j + 1$. We take a derivative $\mathscr{L}\text{-}(\partial/\partial l)f_t(x_t^p)$ and prove that this derivative is $j$ times $\mathscr{L}$-differentiable at the point $p_0$. Let us write this derivative as $f_t(x_t^p, y_t^p)$. We note that the process $(x_t^p, y_t^p)$ is $j$ times $\mathscr{L}$-differentiable at the point $p_0$ by assumption, and the function $f_t(x,y)$ is $j$ times continuously differentiable with respect to the variables $(x,y)$. Also, we note that the absolute values of the derivatives of the above function up to order $j$ inclusive do not exceed $N\big(1 + \sqrt{|x|^2 + |y|^2}\big)^{m+1}$. Therefore, by the induction assumption, $f_t(x_t^p, y_t^p)$ is $j$ times $\mathscr{L}$-differentiable at the point $p_0$. Since $l$ is an arbitrary unit vector, $f_t(x_t^p)$ is, by definition, $j + 1$ times $\mathscr{L}$-differentiable at the point $p_0$. In a similar way we can prove $\mathscr{L}$-continuity of $\mathscr{L}$-derivatives of $f_t(x_t^p)$ at the point $p_0$ if the $\mathscr{L}$-derivatives of $x_t^p$ are $\mathscr{L}$-continuous at the point $p_0$. Finally,
in conjunction with (5),
which, after simple transformations, yields (6). The theorem is proved.

10. Remark. The theorem proved above can easily be used for proving the $\mathscr{L}$-continuity and $\mathscr{L}$-differentiability of various expressions which contain random processes. For example, arguing in the same way as in Corollary 7, we can prove that if $x_t^p$, $y_t^p$ are real $i$ times $\mathscr{L}$-differentiable processes, the product $x_t^p y_t^p$ is $i$ times $\mathscr{L}$-differentiable as well. If the real nonnegative process $x_t^p$ is $i$ times $\mathscr{L}$-differentiable, the process $e^{-x_t^p}$ is $i$ times $\mathscr{L}$-differentiable as well. In fact, notwithstanding that the function $e^{-x}$ grows more rapidly than any polynomial as $x \to -\infty$, we consider only a nonnegative process $x_t^p$, and therefore we can take any smooth function $f(x)$ equal to zero for $x \le -1$ and equal to $e^{-x}$ for $x > 0$. In this situation the hypotheses of the theorem concerning $f(x)$ will be satisfied, and $e^{-x_t^p} = f(x_t^p)$. Combining the foregoing arguments with the known properties of integrals of $\mathscr{L}$-continuous and $\mathscr{L}$-differentiable functions, we arrive at the following assertion.

11. Lemma. Let the processes $x_t^p$, $f_t^1(x)$, $f_t^2(x)$ satisfy the conditions of Theorem 9a (Theorem 9b), and, in addition, let $f_t^2(x) \ge 0$; then the process
$$f_t^1(x_t^p)\exp\Big\{-\int_0^t f_s^2(x_s^p)\,ds\Big\}$$
is $\mathscr{L}$-continuous at the point $p_0$ ($i$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at the point $p_0$).
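The truncation trick of Remark 10 — a smooth $f$ vanishing for $x \le -1$ and equal to $e^{-x}$ for $x \ge 0$ — can be realized explicitly from the standard $C^\infty$ step function; the particular gluing below is one possible choice, not the text's own construction:

```python
import math

def smoothstep(x: float) -> float:
    """C-infinity step: 0 for x <= 0, 1 for x >= 1 (standard exp(-1/u) bump)."""
    def g(u: float) -> float:
        return math.exp(-1.0 / u) if u > 0 else 0.0
    return g(x) / (g(x) + g(1.0 - x))

def f(x: float) -> float:
    # equals 0 for x <= -1, equals e^{-x} for x >= 0, smooth in between;
    # on the right it decays, so the polynomial growth bound of Theorem 9 holds
    return math.exp(-x) * smoothstep(x + 1.0)

assert f(-1.0) == 0.0 and f(-5.0) == 0.0
assert abs(f(0.0) - 1.0) < 1e-12 and abs(f(2.0) - math.exp(-2.0)) < 1e-12
```

The denominator $g(x) + g(1-x)$ never vanishes, so `smoothstep` is well defined everywhere, and $f$ inherits infinite differentiability from the two smooth factors.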
Fixing $\tau \in [0,T]$ and regarding $\int_0^\tau f_s^2(x_s^p)\,ds$ as a time-independent process, we conclude that the following lemma is valid.

12. Lemma. Let the processes $x_t^p$, $f_t^1(x)$, $f_t^2(x)$ satisfy the hypotheses of Theorem 9a (Theorem 9b), and, in addition, let $f_t^2(x) \ge 0$. Let the random variable $\tau(\omega) \in [0,T]$, and let the random processes $y_t^p$, $f_t^3(x)$ be such that the processes $y_t^p$, $f_t^3(x)$ also satisfy the hypotheses of Theorem 9a (Theorem 9b). Then the random variable
is $\mathscr{L}$-continuous at the point $p_0$ ($i$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at the point $p_0$).
13. Remark. Equation (5) shows that in computing an $\mathscr{L}$-derivative of a composite function the usual formulas familiar from analysis can be applied.
14. Exercise
Derive a formula for the derivative of a product, using (5). (Hint: take the function $f(x,y) = xy$.)
We have investigated the properties of the functions $f_t(x_t^n)$ in the case where $f_t(x)$ does not depend on $n$. We now prove a few assertions for the case where $f_t^n(x)$ depends on the parameter $n$ in an explicit manner.
15. Lemma. Let $\xi(\omega)$ be a $d_1$-dimensional random vector. Further, let $h(x) = h(\omega, x)$, $w(R,\varepsilon) = w(\omega, R, \varepsilon)$ be measurable variables which are defined for $x \in E_{d_1}$, $R \ge 0$, $\varepsilon \ge 0$, $\omega \in \Omega$. Assume that $w(R,\varepsilon)$ increases with respect to $R$ and to $\varepsilon$, that $|h(x) - h(y)| \le w(|x| \vee |y|, |x - y|)$ for all $\omega, x, y$, and that $|h(x)| \le K(1 + |x|)^m$ for all $\omega, x$. Then, for all $R \ge 0$, $\varepsilon \in (0,1)$
PROOF. We fix $R \ge 0$, $\varepsilon \in (0,1)$, and take a $d_1$-dimensional vector $\eta$ which does not depend on $\xi, w$ and is uniformly distributed in the ball $\{x \in E_{d_1} : |x| < \varepsilon\}$. It is seen that
$$M|h(\xi)| \le M|h(\xi)|\chi_{|\xi| > R-1} + M|h(\xi) - h(\xi + \eta)|\chi_{|\xi| \le R-1} + M|h(\xi + \eta)|\chi_{|\xi| \le R-1}.$$
The assertion of our lemma follows from the above expression and the assumptions of the lemma, since $|\eta| < \varepsilon < 1$, $|\xi + \eta| < R$ for $|\xi| \le R - 1$, and
$$M|h(\xi) - h(\xi + \eta)|\chi_{|\xi| \le R-1} \le M w(R, \varepsilon).$$
16. Lemma. Suppose that for $x \in E_{d_1}$, $t \in [0,T]$, $n = 1, 2, 3, \ldots$, $R > 0$, $\varepsilon > 0$, $d_1$-dimensional processes $x_t^n$ are defined which are measurable with respect to $(\omega, t)$. Furthermore, suppose that the variables $h_t^n(x)$ and $w_t^n(R,\varepsilon)$, the latter increasing with respect to $R$ and $\varepsilon$, are defined, these variables being measurable with respect to $(\omega, t, x)$ and $(\omega, t)$, respectively. Assume that $|h_t^n(x) - h_t^n(y)| \le w_t^n(|x| \vee |y|, |x - y|)$ for all $\omega, t, x, y$,
$$\lim_{R\to\infty}\, \overline{\lim_{n\to\infty}} \int_0^T P\{|x_t^n| > R\}\,dt = 0, \tag{7}$$
and for each $R > 0$, $\delta > 0$
$$\lim_{\varepsilon \downarrow 0}\, \overline{\lim_{n\to\infty}} \int_0^T P\{w_t^n(R,\varepsilon) > \delta\}\,dt = 0. \tag{8}$$
Finally, let $h_t^n(x) \to 0$ as $n \to \infty$ in measure $dP \times dt$ for each $x \in E_{d_1}$. Then $h_t^n(x_t^n) \to 0$ as $n \to \infty$ in measure $dP \times dt$.
We shall prove this lemma later. We derive from Lemma 16 (in the same way as we derived Lemma 6 from Lemma 5) the following theorem.

17. Theorem. Let the hypotheses of Lemma 16 be satisfied. Furthermore, let $|h_t^n(x)| \le K(1 + |x|)^m$ for all $n, \omega, t, x$, and for all $q \ge 1$ let
$$\sup_n M \int_0^T |x_t^n|^q\,dt < \infty. \tag{9}$$
Then
$$\mathscr{L}\text{-}\lim_{n\to\infty} h_t^n(x_t^n) = 0.$$
18. Remark. By Chebyshev's inequality, (7) follows from (9). Using Chebyshev's inequality, it can easily be proved that condition (8) is satisfied if $w_t^n(R,\varepsilon)$ is nonrandom and
$$\lim_{\varepsilon \downarrow 0}\, \overline{\lim_{n\to\infty}} \int_0^T w_t^n(R,\varepsilon)\,dt = 0.$$
For $w_t^n(R,\varepsilon)$ it is convenient to take $K\varepsilon$ if $|h_t^n(x) - h_t^n(y)| \le K|x - y|$.
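Remark 18's reduction rests on Chebyshev's inequality, $P\{|X| > R\} \le M|X|^q / R^q$. A quick empirical sanity check on a simulated standard normal sample (purely illustrative; the distribution and constants are assumptions, not the text's):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=200_000)  # stand-in for x_t^n at a fixed (t, n)

R, q = 2.0, 4
empirical = float(np.mean(np.abs(X) > R))            # estimated P{|X| > R}
chebyshev = float(np.mean(np.abs(X) ** q)) / R ** q  # M|X|^q / R^q

# The tail probability is dominated by the moment bound; integrating in t,
# this is how the moment condition (9) yields the tightness condition (7).
assert empirical <= chebyshev
```

For a standard normal and $q = 4$ the bound $3/R^4$ is loose but valid; the point is only that uniform moment bounds control tails uniformly in $n$.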
PROOF OF LEMMA 16. Since the convergence of $h_t^n(x_t^n)$ to zero in measure is equivalent to the same convergence of $(2/\pi)\arctan h_t^n(x_t^n)$, and furthermore, since the latter variable is bounded and satisfies the assumptions of the lemma, we can assume without loss of generality that $|h_t^n| \le 1$. It is clear that in this case $2 \wedge w_t^n$ can be taken instead of $w_t^n$, so that $w_t^n$ may be assumed to be bounded as well. According to Lemma 15 (we take in Lemma 15 $K = 1$, $m = 0$), for any $R > 0$, $\varepsilon \in (0,1)$
We make use of the fact that convergence in measure is equivalent to convergence in the mean for uniformly bounded sequences. Thus we
have that the sequence
$$\int_0^T M|h_t^n(y)|\,dt \to 0$$
as $n \to \infty$ for any $y \in E_{d_1}$. Furthermore, each term of this sequence does not exceed $T$. This implies that the last expression in (10) tends to zero as $n \to \infty$ for any $\varepsilon > 0$, $R > 0$. Letting $n \to \infty$ in (10), next $\varepsilon \downarrow 0$, $R \to \infty$, and, in addition, using (7), (8) and the fact mentioned above that convergence in the mean and convergence in measure are related, we complete the proof of Lemma 16.
8. The Dependence of Solutions of a Stochastic Equation on a Parameter

Let $E$ be a Euclidean space, let $D \subset E$ be a region ($D$ denotes the region of parameter variation), and let $T, K, m$ be fixed nonnegative constants; $(w_t, \mathscr{F}_t)$ is a $d_1$-dimensional Wiener process. Furthermore, for $t \in [0,T]$, $x \in E_d$, $p \in D$, $n = 0, 1, 2, \ldots$ we are given: random matrices $\sigma_t(x)$, $\sigma_t^n(x)$, $\sigma_t(p,x)$ of dimension $d \times d_1$, and random $d$-dimensional vectors $b_t(x)$, $b_t^n(x)$, $b_t(p,x)$, $\xi_t^n$, $\xi_t(p)$, all progressively measurable with respect to $\{\mathscr{F}_t\}$. Assume that for all $t, \omega, x, y$ the Lipschitz condition (1) is satisfied. Also, assume that $\sigma_t^n(x)$, $b_t^n(x)$ satisfy (1) for each $n \ge 0$ and, in addition, that $\sigma_t(p,x)$, $b_t(p,x)$ satisfy (1) for each $p \in D$. Suppose that all the processes in question belong to $\mathscr{L}$ for all values of $x, n, p$. Recall that the space $\mathscr{L}$ was introduced in Section 7. We shall frequently use other concepts and results from Section 7 in what follows. We define the processes $x_t$, $x_t^n$, $x_t^p$ as the solutions of the following equations:
$$x_t = x + \int_0^t \sigma_s(x_s)\,dw_s + \int_0^t b_s(x_s)\,ds;$$
$$x_t^n = \xi_t^n + \int_0^t \sigma_s^n(x_s^n)\,dw_s + \int_0^t b_s^n(x_s^n)\,ds;$$
$$x_t^p = \xi_t(p) + \int_0^t \sigma_s(p, x_s^p)\,dw_s + \int_0^t b_s(p, x_s^p)\,ds.$$
Note that by Theorem 5.7 the above equations have solutions. We also note that by Corollary 5.6 these solutions belong to $\mathscr{L}$. If $\xi_t^n, \xi_t(p) \in \mathscr{L}B$ for all $n, p$, then, according to Corollary 5.10, $x_t, x_t^n, x_t^p \in \mathscr{L}B$ as well, for all $n, p, x$.
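Equations of this type can be simulated with the Euler-Maruyama scheme. A minimal sketch with assumed Lipschitz coefficients $b(x) = -x$, $\sigma(x) = 0.1$ (illustrative choices, not the text's); with additive noise, the pathwise difference of two solutions driven by the same Wiener path solves $d\Delta = -\Delta\,dt$, which the simulation reproduces:

```python
import numpy as np

rng = np.random.default_rng(0)
T, steps = 1.0, 1000
dt = T / steps

# Assumed coefficients b(x) = -x, sigma(x) = 0.1: Lipschitz, as in (1).
def euler(x0, dW):
    x = x0
    for k in range(steps):
        x = x + (-x) * dt + 0.1 * dW[k]
    return x

dW = rng.normal(scale=np.sqrt(dt), size=steps)  # common Wiener increments
xA, xB = euler(0.0, dW), euler(1.0, dW)
# the noise cancels in the difference, so xB - xA should be close to e^{-T}
assert abs((xB - xA) - np.exp(-1.0)) < 1e-2
```

The contraction of the difference is exactly the kind of stability that the Lipschitz condition (1) guarantees and that the convergence theorems of this section exploit.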
1. Theorem. Let $\sigma_t^n(x) \to \sigma_t^0(x)$, $b_t^n(x) \to b_t^0(x)$ in $\mathscr{L}$ as $n \to \infty$ for each $x \in E_d$, and let $\xi_t^n \to \xi_t^0$ in $\mathscr{L}$ as $n \to \infty$. Then $x_t^n \to x_t^0$ in $\mathscr{L}$ as $n \to \infty$. If $\xi_t^n \to \xi_t^0$ in $\mathscr{L}B$ as well as $n \to \infty$, then $x_t^n \to x_t^0$ in $\mathscr{L}B$ as $n \to \infty$.
PROOF. Let $\bar\sigma_t^n(x) = \sigma_t^n(x) - \sigma_t^n(0)$. It is seen that $\bar\sigma_t^n(x)$ satisfies the Lipschitz condition (1) and the growth condition $\|\bar\sigma_t^n(x)\| \le K|x|$. Furthermore, $\bar\sigma_t^n(x) \to \bar\sigma_t^0(x)$ in $\mathscr{L}$ for all $x$. From this, using Theorem 7.17 and Remark 7.18, we conclude that $\bar\sigma_t^n(x_t^0) \to \bar\sigma_t^0(x_t^0)$ in $\mathscr{L}$. Adding the last relation to $\sigma_t^n(0) \to \sigma_t^0(0)$ in $\mathscr{L}$, we obtain $\sigma_t^n(x_t^0) \to \sigma_t^0(x_t^0)$ in $\mathscr{L}$. Similarly, $b_t^n(x_t^0) \to b_t^0(x_t^0)$. Applying Corollary 5.5 and Theorem 5.9 with $(\tilde x_t, \tilde\sigma_t, \tilde b_t) = (x_t^0, \sigma_t^0, b_t^0)$ and $(x_t, \sigma_t, b_t) = (x_t^n, \sigma_t^n, b_t^n)$, we immediately arrive at the assertions of the theorem.
2. Corollary. If the process $\xi_t(p)$ is $\mathscr{L}$-continuous ($\mathscr{L}B$-continuous) and, in addition, for each $x \in E_d$ the processes $\sigma_t(p,x)$, $b_t(p,x)$ are $\mathscr{L}$-continuous in $p$ at the point $p_0 \in D$, then the process $x_t^p$ is $\mathscr{L}$-continuous ($\mathscr{L}B$-continuous) at the point $p_0$.

3. Lemma. Suppose that for each $t \in [0,T]$, $p \in D$, $\omega$ the functions $\sigma_t(p,x)$, $b_t(p,x)$ are linear with respect to $x$. Let the process $\xi_t(p)$ and, for each $x \in E_d$, the processes $\sigma_t(p,x)$ and $b_t(p,x)$ be $i$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at the point $p_0 \in D$. Then the process $x_t^p$ is $i$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at $p_0$. If, in addition, $\xi_t(p)$ is $i$ times ($\mathscr{L}B$-continuously) $\mathscr{L}B$-differentiable at the point $p_0$, the process $x_t^p$ will be the same as the process $\xi_t(p)$.

PROOF. Due to the linearity of $\sigma_t(p,x)$, $b_t(p,x)$,
where $(x_t^p)^j$ is the $j$th coordinate of the vector $x_t^p$ in the basis $\{e_j\}$. This implies that the last assertion of the lemma is a corollary of the first assertion as well as of the results, proved in Section 7, related to the $\mathscr{L}B$-differentiability of integrals and the $\mathscr{L}$-differentiability of products of $\mathscr{L}$-differentiable processes. We prove the first assertion. To this end, we use induction with respect to $i$ and first assume that $i = 1$. We take a unit vector $l \in E$ and, in accord with what was said in Section 7, let the processes
be progressively measurable for $x \in E_d$. By Corollary 2, we conclude that the process $x_t^p$ is $\mathscr{L}$-continuous at the point $p_0$. It is not hard to see that for $p = p_0$ the process
exists, is progressively measurable (and is $\mathscr{L}$-continuous with respect to $p$ if the foregoing derivative processes are $\mathscr{L}$-continuous with respect to $p$). Furthermore, $\eta_t(p_0) \in \mathscr{L}$. According to Theorem 5.7, the solution of the equation
$$y_t^p = \eta_t(p) + \int_0^t \sigma_s(p, y_s^p)\,dw_s + \int_0^t b_s(p, y_s^p)\,ds \tag{2}$$
exists and is unique for $p = p_0$. Let us show that $y_t^p = \mathscr{L}\text{-}(\partial/\partial l)x_t^p$ for $p = p_0$. To this end, we take a sequence $r_n \to 0$ and set $y_t^p(n) = r_n^{-1}(x_t^{p+r_nl} - x_t^p)$. It can easily be seen that
where we are given the expression
In addition, since the $\mathscr{L}$-limit of a product (sum) equals the product (sum) of the $\mathscr{L}$-limits, we have that, in $\mathscr{L}$,
Similarly, in $\mathscr{L}$,
Thus, $\eta_t(p_0, n) \to \eta_t(p_0)$ in $\mathscr{L}$ as $n \to \infty$. Comparing (2) with (3), we conclude from Theorem 1 that $y_t^{p_0}(n) \to y_t^{p_0}$ in $\mathscr{L}$. Hence
for $p = p_0$, proving thereby that $x_t^p$ is $\mathscr{L}$-differentiable. It is clear that (4) is satisfied at any point $p$ at which there exist $\mathscr{L}$-derivatives of $\xi_t(p)$, $\sigma_t(p,x)$, $b_t(p,x)$. Further, if the foregoing derivatives are $\mathscr{L}$-continuous
at the point $p_0$, they are defined in some neighborhood of it in which (4) is satisfied. In this case, as we noted above, $\eta_t(p)$ is $\mathscr{L}$-continuous at the point $p_0$. Also, by Corollary 2, it follows from Eq. (2) that the process $y_t^p$ is $\mathscr{L}$-continuous at the point $p_0$. This fact implies that the process $x_t^p$ is $\mathscr{L}$-continuously $\mathscr{L}$-differentiable at $p_0$. Suppose that our lemma is proved for $i = i_0$ and that the assumptions of the lemma are satisfied for $i = i_0 + 1$. We shall complete the proof if we show that each first $\mathscr{L}$-derivative of $x_t^p$ is $i_0$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at the point $p_0$. Consider, for instance, $\mathscr{L}\text{-}(\partial/\partial l)x_t^p$. This process exists and satisfies Eq. (2) for $p$ close to $p_0$. Since the assumptions of Lemma 3 are satisfied for $i = i_0$ (even for $i = i_0 + 1$), by the induction assumption the process $x_t^p$ is $i_0$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at $p_0$. From this it follows that the process $\eta_t(p)$ is $i_0$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at $p_0$. Applying the induction assumption to (2), we convince ourselves that the process $y_t^p$ is $i_0$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at the point $p_0$. The lemma is proved.
4. Theorem. Suppose that the process $\xi_t(p)$ is $i$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at a point $p_0 \in D$, and that the functions $\sigma_s(p,x)$, $b_s(p,x)$ are, for each $s, \omega$, $i$ times continuously (with respect to $p, x$) differentiable with respect to $p, x$ for $p \in D$, $x \in E_d$. Furthermore, assume that all derivatives of the foregoing functions, up to order $i$ inclusive, do not exceed $K(1 + |x|)^m$ in norm for any $p \in D$, $s$, $\omega$, $x$. Then the process $x_t^p$ is $i$ times ($\mathscr{L}$-continuously) $\mathscr{L}$-differentiable at the point $p_0$. If, in addition, the process $\xi_t(p)$ is $i$ times ($\mathscr{L}B$-continuously) $\mathscr{L}B$-differentiable at the point $p_0$, the process $x_t^p$ will be the same as the process $\xi_t(p)$.

PROOF. Because the notion of the $\mathscr{L}$-derivative is local, it suffices to prove the theorem in any subregion $D'$ of the region $D$ which together with its closure lies in $D$. We construct an infinitely differentiable function $w(p)$ in such a way that $w(p) = 1$ for $p \in D'$ and $w(p) = 0$ for $p \notin D$. Let $\tilde\xi_t(p) = \xi_t(p)w(p)$, $\tilde\sigma_t(p,x) = \sigma_t(p,x)w(p)$, $\tilde b_t(p,x) = b_t(p,x)w(p)$. Then $\tilde\xi_t$, $\tilde\sigma_t$, $\tilde b_t$ satisfy the conditions of the theorem for $D = E$. Further, since the assertions of the theorem hold for $\tilde\xi$, $\tilde\sigma$, $\tilde b$ in $E$, they hold as well for $\xi$, $\sigma$, $b$ in the region $D'$. This reasoning shows that, in proving our theorem, we can assume that the assumptions of the theorem are satisfied for $D = E$. In this case we use induction over $i$. First, let $i = 1$. We take a unit vector $l \in E$ and a sequence of numbers $r_n \to 0$. Let
$$x_t^p(n,u) = x_t^p + u\big(x_t^{p+r_nl} - x_t^p\big), \qquad y_t^p(n) = r_n^{-1}\big(x_t^{p+r_nl} - x_t^p\big).$$
Using the Newton-Leibniz formula, we easily obtain
where
We look upon the pair $(p + u r_n l,\, x_t^p(n,u))$ as a process in $E \times E_d$ with time parameter $t$. It is seen that
$$\big|(p + u r_n l,\, x_t^p(n,u)) - (p, x_t^p)\big| \le \big|(p + r_n l,\, x_t^{p+r_nl}) - (p, x_t^p)\big|.$$
Furthermore, by Corollary 2 and the $\mathscr{L}$-continuity of $\mathscr{L}$-differentiable functions, $x_t^{p+r_nl} \to x_t^p$ in $\mathscr{L}$. In order to apply Lemma 7.8, we note that, for instance, $|b_{s(x^j)}(p,x)| \le K\big(1 + \sqrt{|p|^2 + |x|^2}\big)^m$ for all $\omega, s, p, x$. By this lemma, for $p = p_0$
in the sense of convergence in the space $\mathscr{L}$. Note that $\tilde\sigma_s$, $\tilde b_s$, $y_t^p$, $\eta_t(p)$, $\eta_t(p,n)$ are progressively measurable for those $p$, $n$ for which they exist. In fact, one can take the derivative $\mathscr{L}\text{-}(\partial/\partial l)\xi_t(p)$ to be progressively measurable. Also, for example, $\sigma_{s(x^j)}(p,x)$ is progressively measurable (being an ordinary derivative with respect to the parameter of a progressively measurable process) and continuous with respect to $p, x$. Hence the process $\sigma_{s(x^j)}(p + u r_n l,\, x_s^p(n,u))$ is progressively measurable and continuous with respect to $u$, which, in turn, implies the progressive measurability of the Riemann integral
$$\int_0^1 \sigma_{s(x^j)}\big(p + u r_n l,\, x_s^p(n,u)\big)\,du$$
and the progressive measurability of the process $\tilde\sigma_s^{(n)}(p,x)$.
Further, since $\sigma_s(p,x)$, $b_s(p,x)$ satisfy the Lipschitz condition (1) with respect to $x$, the derivatives $\sigma_{s(x^j)}(p,x)$, $b_{s(x^j)}(p,x)$ are bounded variables. This implies that the functions $\tilde\sigma_s(p,x)$ and $\tilde b_s(p,x)$, linear with respect to $x$, satisfy the Lipschitz condition (1). By Theorem 5.7, for $p = p_0$ there exists a solution of the equation
$$y_t^p = \eta_t(p) + \int_0^t \tilde\sigma_s(p, y_s^p)\,dw_s + \int_0^t \tilde b_s(p, y_s^p)\,ds. \tag{6}$$
By Theorem 1, comparing (5) with (6), we conclude that
$$\mathscr{L}\text{-}\lim_{n\to\infty} r_n^{-1}\big(x_t^{p+r_nl} - x_t^p\big) = \mathscr{L}\text{-}\lim_{n\to\infty} y_t^p(n) = y_t^p$$
for $p = p_0$. This shows that $y_t^p = \mathscr{L}\text{-}(\partial/\partial l)x_t^p$ for $p = p_0$, and therefore the process $x_t^p$ is $\mathscr{L}$-differentiable at the point $p_0$. It is also seen that $y_t^p = \mathscr{L}\text{-}(\partial/\partial l)x_t^p$ at each point $p$ at which $\mathscr{L}\text{-}(\partial/\partial l)\xi_t(p)$ exists. Next, let $\xi_t(p)$ be $\mathscr{L}$-continuously $\mathscr{L}$-differentiable at the point $p_0$. Then $\mathscr{L}\text{-}(\partial/\partial l)\xi_t(p)$ exists in some neighborhood of the point $p_0$, $y_t^p$ being the $\mathscr{L}$-derivative of $x_t^p$ along the direction $l$ in this neighborhood. In addition, the process $(p, x_t^p)$ is $\mathscr{L}$-continuous at the point $p_0$, and the functions $\sigma_{s(x^j)}(p,x)$, $b_{s(x^j)}(p,x)$ are continuous with respect to $(p,x)$ and do not exceed $K(1 + |x|)^m$ in norm. Therefore, by Theorem 7.9, the processes $\sigma_{s(x^j)}(p, x_s^p)$, $b_{s(x^j)}(p, x_s^p)$ are $\mathscr{L}$-continuous, and consequently the process $\eta_t(p)$ is $\mathscr{L}$-continuous at $p_0$. Similarly, the fact that the functions $\sigma_{s(x^j)}(p,x)$, $b_{s(x^j)}(p,x)$ are bounded and continuous with respect to $p, x$ implies that the processes $\tilde\sigma_s(p,x)$ and $\tilde b_s(p,x)$ are $\mathscr{L}$-continuous at $p_0$ for each $x$. To conclude our reasoning, we observe that, by Corollary 2, the process $y_t^p$, interpreted as the solution of Eq. (6), is $\mathscr{L}$-continuous at the point $p_0$. Thus, we have proved the first assertion of the theorem for $i = 1$. Further, suppose that the theorem has been proved for $i = i_0$, and, in addition, that the assumptions of the theorem are satisfied for $i = i_0 + 1$. Consider the derivative $\mathscr{L}\text{-}(\partial/\partial l)x_t^p$. As was shown above, we may assume that this process is $y_t^p$ and that it satisfies Eq. (6). By the induction assumption, $x_t^p$ is $i_0$ times $\mathscr{L}$-differentiable at $p_0$. Therefore the pair $(p, x_t^p)$ is $i_0$ times $\mathscr{L}$-differentiable as well. By Theorem 7.9, the processes $\sigma_{s(x^j)}(p, x_s^p)$, $b_{s(x^j)}(p, x_s^p)$ are $i_0$ times $\mathscr{L}$-differentiable at the point $p_0$. Hence, in Eq. (6) the processes $\eta_t(p)$, $\tilde\sigma_s(p,x)$, $\tilde b_s(p,x)$ are $i_0$ times $\mathscr{L}$-differentiable with respect to $p$. Since $\tilde\sigma_s(p,x)$, $\tilde b_s(p,x)$ are linear functions of $x$, according to the preceding lemma the process $y_t^p$ is $i_0$ times $\mathscr{L}$-differentiable at the point $p_0$.
We have thus proved that the derivative (a/dl)xp is i, times 9-differentiable at the point p,. Since 1 is an arbitrary unit vector from E, this implies, by definition, that xf is i, 1 times 9-differentiable at the point p,. In addition, if (,(p) is i, + 1 times 2-continuously 9-differentiable at the point p,, we can prove that xp is i, + 1 times 9-continuously 9differentiable at the point p, if we put the word "9-continuously" in the appropriate places in the above arguments. This completes the proof of the first assertion of Theorem 4.
2 Auxiliary Propositions
To prove the second assertion of the theorem, we need only to prove, in view of the equality
that the processes σ_t(p,x_t^p), b_t(p,x_t^p) are i times (ℒ-continuously) ℒ-differentiable at the point p₀. It is obvious that the process identically equal to (p,0) is i times ℒ-continuously ℒ-differentiable. It is also seen that, since the function σ_t(p,0) is i times continuously differentiable with respect to p and, in addition, the derivatives of this function are bounded, the process σ_t(p,0) is i times ℒ-continuously ℒ-differentiable, in accord with Theorem 7.9. Furthermore, the process (p,x_t^p) is i times (ℒ-continuously) ℒ-differentiable at the point p₀, the function σ_t(p,x) − σ_t(p,0) does not exceed K|x| in norm, and, in addition, the derivatives of this function satisfy the necessary growth restrictions. By Theorem 7.9, the process σ_t(p,x_t^p) − σ_t(p,0) is i times (ℒ-continuously) ℒ-differentiable at the point p₀; the same then holds for the process σ_t(p,x_t^p) = σ_t(p,0) + [σ_t(p,x_t^p) − σ_t(p,0)]. The process b_t(p,x_t^p) can be treated in a similar way. The theorem is proved.

5. Remark. For i ≥ 1 we have proved that for any unit vector l ∈ E the solution of Eq. (6) is the ℒ-derivative of x_t^p along the direction l:
We have seen that the last equation is linear with respect to y_t^p; moreover, we applied Lemma 3 to this equation for i ≥ 2. In Lemma 3 we derived Eq. (2), according to which the solution of the equation which follows is the ℒ-derivative of y_t^p along the direction l, that is, the second ℒ-derivative of x_t^p along the direction l. This equation is the following:
where, according to the rules of ℒ-differentiation of a composite function (see (2)),
8. The Dependence of Solutions of a Stochastic Equation on a Parameter
Note that the above equations, as well as the equations for the higher ℒ-derivatives of x_t^p, can be obtained, proceeding from the fact that x_t^p is ℒ-differentiable the desired number of times, if we differentiate the equality

x_t^p = ξ_t(p) + ∫₀ᵗ σ_s(p, x_s^p) dw_s + ∫₀ᵗ b_s(p, x_s^p) ds,
interchange the order of ℒ-differentiation and integration, and, in addition, make use of the formula for the ℒ-derivative of a composite function. The following assertion is a simple consequence of Theorem 4 and Corollary 2 in the case where D = E_d, ξ_t(p) = p, σ_t(p,x) = σ_t(x), b_t(p,x) = b_t(x).

6. Theorem. The process x_t^x is ℒB-continuous. If σ_s(x), b_s(x) are i times continuously differentiable with respect to x for all ω, s and if, in addition, each derivative of these functions up to order i inclusive does not exceed K(1 + |x|)^m in norm for any s, x, ω, then the process x_t^x is i times ℒB-continuously ℒB-differentiable.
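The derivative process discussed here solves the linearized ("variational") equation obtained by formally differentiating the coefficients of the original equation, as in Eq. (6). A minimal numerical sketch of this fact, not part of the original text: the linear coefficients b(x) = −0.5x, σ(x) = 0.2x below are illustrative choices, and both the solution and its derivative with respect to the initial point are integrated by the Euler scheme along a shared Wiener path, so the derivative can be compared with a finite difference.

```python
import numpy as np

# Euler scheme for dx = b(x) dt + s(x) dw with x_0 = p, together with the
# linearized equation dy = b'(x) y dt + s'(x) y dw, y_0 = 1, whose solution
# is the derivative of x_t^p with respect to the initial point p.
# Illustrative linear coefficients: b(x) = bp*x, s(x) = sp*x.

def euler_paths(p, dw, dt, bp=-0.5, sp=0.2):
    x, y = p, 1.0
    for inc in dw:
        # simultaneous update: y uses the same noise increment as x
        x, y = x + bp * x * dt + sp * x * inc, y + bp * y * dt + sp * y * inc
    return x, y

rng = np.random.default_rng(0)
n, T = 2000, 1.0
dt = T / n
dw = rng.normal(0.0, np.sqrt(dt), n)   # one fixed Wiener path

h = 1e-6
x_p, y = euler_paths(1.0, dw, dt)      # solution and variational derivative
x_ph, _ = euler_paths(1.0 + h, dw, dt) # perturbed initial point, same noise
fd = (x_ph - x_p) / h                  # finite-difference derivative
print(abs(fd - y))                     # the two derivative estimates agree
```

For linear coefficients the Euler map is linear in the initial point, so the finite difference and the variational solution coincide up to rounding; for nonlinear coefficients they would agree only in the limit h → 0.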
In concluding this section, we give two theorems on estimates of the moments of derivatives of a solution of a stochastic equation. Since, as we saw in Remark 5, it is possible to write equations for such derivatives, it is natural to apply Corollaries 5.6 and 5.10–5.12 for estimating the moments of these derivatives. The reader can easily prove the theorems which follow.
7. Theorem. Let there be a constant K₁ such that for all s, x, p, ω

Suppose that the process ξ_t(p) is ℒB-differentiable at a point p₀ ∈ D. Further, suppose that the ℒB-derivatives of the process ξ_t(p) have modifications which are progressively measurable and separable at the same time. Let the functions σ_s(p,x), b_s(p,x), for each s, ω, be continuously differentiable with respect to (p,x) for p ∈ D, x ∈ E_d. In addition, let the matrix norms of the derivatives of the function σ_s(p,x) and the norms of the derivatives of the function b_s(p,x) be smaller than K(1 + |x|)^m (m ≥ 1) along all directions for all p ∈ D, s, ω, x. Then for any unit vector l ∈ E, q ≥ 1, t ∈ [0,T]

where N = N(q,K,m,K₁).
8. Theorem. (a) Let the functions σ_s(x), b_s(x) be continuously differentiable with respect to x for each s, ω. Then for any unit vector l ∈ E_d, q ≥ 1, t ∈ [0,T], x ∈ E_d

where N = N(q,K).
(b) Let the functions σ_s(x), b_s(x) be twice continuously differentiable for each s, ω. Further, for each x, s, ω and unit vector l ∈ E_d let

‖σ_{s(l)(l)}(x)‖ + |b_{s(l)(l)}(x)| ≤ K(1 + |x|)^m.

Also, suppose that ‖σ_s(x)‖ + |b_s(x)| ≤ K₁(1 + |x|) for all x, s, ω, for some constant K₁. Then for any q ≥ 1, t ∈ [0,T], x ∈ E_d and any unit vector l ∈ E_d

where N = N(q,K,m,K₁).
9. The Markov Property of Solutions of Stochastic Equations

The Markov property of solutions of a stochastic equation with nonrandom coefficients is well known (see [9,11,24]). In this section we shall prove a similar property for random coefficients of the equation (Theorem 4) and, moreover, deduce some consequences of this property. We fix two constants T, K > 0. In this section we repeatedly assume the following about (w_t,ℱ_t), ξ_t, σ_t(x), b_t(x), with indices and tildes or without them: (w_t,ℱ_t) is a d₁-dimensional Wiener process, σ_t(x) is a random matrix of dimension d × d₁, and b_t(x), ξ_t are random d-dimensional vectors; σ_t(x), b_t(x), ξ_t are defined for t ∈ [0,T], x ∈ E_d, are progressively measurable with respect to {ℱ_t}, and
for all possible values of the indices and arguments. We can now state the objective of this section: it consists in deriving formulas for the conditional expectation, given ℱ₀, of functionals of solutions of the stochastic equation
Note that if the assumptions made above are satisfied, then, in accord with Theorem 5.7, the solution of Eq. (1) on the interval [0,T] exists and is unique.
1. Lemma. Suppose that for all integers i, j > 0, t₁, …, t_i ∈ [0,T], z₁, …, z_j ∈ E_d the vector

{w_{t_p}, ξ_{t_p}, σ_{t_p}(z_q), b_{t_p}(z_q): p = 1, …, i, q = 1, …, j}

does not depend on ℱ₀. Then the process x_t, which is a solution of Eq. (1), does not depend on ℱ₀ either.
PROOF. As we did in proving Theorem 5.7, we introduce here an operator I defined by the formula
In proving Theorem 5.7 we noted that the operator I is defined on a set of progressively measurable functions in L₂([0,T] × Ω) and that this operator maps this set into itself. Let a function y_t(ω) from the set indicated (for example, y_t ≡ 0) be such that the totality of random variables

{w_t, ξ_t, y_t, σ_t(x), b_t(x): t ∈ [0,T], x ∈ E_d}
(2)

does not depend on ℱ₀. We prove that in this case the totality of random variables

{w_t, ξ_t, Iy_t, σ_t(x), b_t(x): t ∈ [0,T], x ∈ E_d}  (3)

does not depend on ℱ₀ either. We denote by Σ the completion of the σ-algebra of subsets of Ω generated by the totality of random variables (2). By assumption, Σ does not depend on ℱ₀. It is seen that, for proving that (3) is independent of ℱ₀, it suffices to prove that the random variables Iy_t are Σ-measurable for t ∈ [0,T]. For real a let κ_n(a) = 2^{−n}[2^n a], where [a] is the greatest integer less than or equal to a. If y ∈ E_d, we set κ_n(y) = (κ_n(y¹), …, κ_n(y^d)) and, in addition, denote by Γ_n the set of values of the function κ_n(y), y ∈ E_d. Due to the continuity of σ_t(x) with respect to x we have
Therefore the variable σ_t(y_t) is Σ-measurable. The Σ-measurability of b_t(y_t) can be proved in a similar way. Further (see Appendix 1), for almost all s ∈ [0,1], for some sequence of integers n′, in probability
Since the function κ_{n′}(r + s) − s assumes only a finite number of values on the interval [0,t], the integrals in the limiting expression are integrals of step functions. The latter integrals can be written as finite sums consisting of products of values of σ_r(y_r) with increments of w_r and of products of values of b_r(y_r) with increments of r. These sums are Σ-measurable. Hence the limiting expressions are Σ-measurable, which implies the Σ-measurability of Iy_t. As we did in proving Theorem 5.7, we define here the sequence x_t^n by the recurrence formula
By induction, it follows from what has been proved above that the processes x_t^n do not depend on ℱ₀ for n ≥ 0, t ∈ [0,T]. According to Remark 5.13, for t ∈ [0,T]

l.i.m._{n→∞} x_t^n = x_t.
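The successive approximations x^{n+1} = Ix^n used above can be sketched numerically. On a discrete time grid the stochastic and ordinary integrals in the operator I become left-point Riemann sums, and the value of Iy at grid point k depends only on y at earlier grid points, so the iteration reaches its fixed point in finitely many steps. The coefficients σ(x), b(x) below are illustrative choices, not taken from the text.

```python
import numpy as np

# The operator (Iy)_t = xi + int_0^t s(y_r) dw_r + int_0^t b(y_r) dr,
# discretized with left-point sums on a uniform grid.

def s(x): return 0.3 * np.cos(x)
def b(x): return -0.4 * x

def I(y, xi, dw, dt):
    """Apply the integral operator on the grid."""
    incr = s(y[:-1]) * dw + b(y[:-1]) * dt        # per-step increments
    return np.concatenate(([xi], xi + np.cumsum(incr)))

rng = np.random.default_rng(1)
n, dt, xi = 200, 1.0 / 200, 0.7
dw = rng.normal(0.0, np.sqrt(dt), n)              # fixed Wiener increments

y = np.zeros(n + 1)                               # start the iteration from y = 0
for _ in range(n + 1):                            # grid point k is fixed after k+1 steps
    y = I(y, xi, dw, dt)
print(np.max(np.abs(I(y, xi, dw, dt) - y)))       # y is now a fixed point of I
```

In continuous time the iteration converges only in mean square (as in Remark 5.13); the exact fixed point in finitely many steps is an artifact of the discretization, but it illustrates why each x^n inherits independence of ℱ₀ from the data.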
Therefore the process x_t does not depend on ℱ₀. The lemma is proved. In the next lemma we consider (w̃_t, ℱ̃_t), ξ̃_t, σ̃_t(x), b̃_t(x) as well as (w_t, ℱ_t), ξ_t, σ_t(x), b_t(x). As we agreed above, we assume here that these elements satisfy the same conditions. Let x̃_t be a solution of the equation

x̃_t = ξ̃_t + ∫₀ᵗ σ̃_r(x̃_r) dw̃_r + ∫₀ᵗ b̃_r(x̃_r) dr.
2. Lemma. Suppose that for all integers i, j > 0 and t₁, …, t_i ∈ [0,T], z₁, …, z_j ∈ E_d the following vectors are identically distributed:

Then the finite-dimensional distributions of the process x_t coincide with those of the process x̃_t.
PROOF. We again make use of the operator I from the previous proof. Let

Ĩỹ_t = ξ̃_t + ∫₀ᵗ σ̃_s(ỹ_s) dw̃_s + ∫₀ᵗ b̃_s(ỹ_s) ds,
and let the processes y_t, ỹ_t be progressively measurable with respect to {ℱ_t}, {ℱ̃_t}, respectively:

M ∫₀ᵀ |y_t|² dt < ∞,  M ∫₀ᵀ |ỹ_t|² dt < ∞.
Further, for any i, j > 0, t₁, …, t_i ∈ [0,T], z₁, …, z_j ∈ E_d let the vectors
have identical distributions. Note that if two random vectors have identical distributions, then any (Borel) function of one vector has the same distribution as the same function of the other. From this it follows, in accord with Eq. (4), that for any i, j > 0, t₁, …, t_i ∈ [0,T], z₁, …, z_j ∈ E_d the vectors

{w_{t_p}, ξ_{t_p}, y_{t_p}, Iy_{t_p}, σ_{t_p}(z_q), b_{t_p}(z_q): p = 1, …, i, q = 1, …, j},
{w̃_{t_p}, ξ̃_{t_p}, ỹ_{t_p}, Ĩỹ_{t_p}, σ̃_{t_p}(z_q), b̃_{t_p}(z_q): p = 1, …, i, q = 1, …, j}  (7)

have the same distributions. It is useful to draw the reader's attention to the fact that, in order to prove the proposition just stated, we need to use vectors of type (6) with values of z different from those which appear in (7).
We choose s ∈ [0,1] so that Eq. (5) holds for t = t₁, …, t_i and, in addition, so that similar representations hold for Ĩỹ_t. Having done this, we see that the vectors

(8)

are representable as limits in probability of identical functions of vectors of type (7). Therefore the vectors (8) have identical distributions for any i, j > 0, t₁, …, t_i ∈ [0,T], z₁, …, z_j ∈ E_d. Next, we compare the vectors (6) and (8). Also, we find the sequences of processes x_t^n, x̃_t^n. Passing from vectors of type (6) to vectors of type (8), we prove by induction that the finite-dimensional distributions of x_t^n coincide with those of x̃_t^n. Therefore the finite-dimensional distributions of the mean-square limits of these processes, i.e., of x_t and x̃_t, coincide. The lemma is proved.

3. Corollary. If ξ_t, σ_t(x), b_t(x) are nonrandom and if, in addition, they are equal to ξ̃_t, σ̃_t(x), b̃_t(x), respectively, for all t ∈ [0,T], x ∈ E_d, then the processes x_t, x̃_t have identical finite-dimensional distributions. Furthermore, the process x_t does not depend on ℱ₀, and the process x̃_t does not depend on ℱ̃₀.
This corollary follows from Lemmas 1 and 2 and from the facts that all Wiener processes have identical finite-dimensional distributions and that, for example, w_t = w_t − w₀ does not depend on ℱ₀. The formula mentioned at the beginning of the section appears in the next theorem. In order not to complicate the formulation of the theorem, we list the conditions under which we shall prove it. Let Z be a separable metric space with metric ρ, and let (w_t^z, ℱ_t^z) = (w_t, ℱ_t), σ_t^z(x), b_t^z(x) be defined for z ∈ Z. We assume (in addition to the assumptions mentioned at the beginning of the section) that the functions σ_t^z(x,ω), b_t^z(x,ω) are continuous with respect to z for all t, ω, x and
for all x.

4. Theorem. Suppose that the assumptions made before the statement of the theorem are satisfied. Let the totality of variables {w_t, σ_t^z(x), b_t^z(x): t ∈ [0,T], x ∈ E_d} be independent of ℱ₀ for all z ∈ Z. Further, let ξ be an ℱ₀-measurable random variable with values in E_d and with a finite second moment, and let ζ be an ℱ₀-measurable
random function with values in Z. Finally, let y_t be a solution of the equation
We denote by x_t^{z,x} a solution of the equation
Let F(z, x_{[0,T]}) be a nonnegative measurable function on Z × C([0,T], E_d). Then

M{F(ζ, y_{[0,T]}) | ℱ₀} = Φ(ζ,ξ)  (a.s.),  (11)

where Φ(z,x) ≡ MF(z, x^{z,x}_{[0,T]}).
PROOF. First we note that, due to the conditions imposed, Eqs. (9) and (10) are solvable and their solutions are continuous with respect to t. Further, it suffices to prove Eq. (11) for functions of the form F(z, x_{t₁}, …, x_{t_n}), where t₁, …, t_n ∈ [0,T] and F(z, x₁, …, x_n) is a bounded continuous function of (z, x₁, …, x_n). In fact, in this case Eq. (11) extends in a standard manner to all nonnegative functions F(z, x_{[0,T]}) which are measurable with respect to the product of the σ-algebra of Borel sets in Z and the smallest σ-algebra containing the cylinder sets of the space C([0,T], E_d). It is a well-known fact that the latter σ-algebra coincides with the σ-algebra of Borel sets of the metric space C([0,T], E_d). In what follows, we consider functions F only of the type indicated. Let Λ = {z^{(i)}: i ≥ 1} be a countable everywhere dense subset of Z. For z ∈ Z we denote by κ̃_n(z) the first member of the sequence {z^{(i)}} for which ρ(z, z^{(i)}) ≤ 2^{−n}. It is easily seen that κ̃_n(z) is a measurable function of z and that ρ(z, κ̃_n(z)) ≤ 2^{−n} for all z ∈ Z. In addition, we define the function κ_n(x) as in the proof of Lemma 1. By Lemma 1, almost surely
where we take the limit as n → ∞. We have agreed to consider only bounded continuous functions F(z, x_{[0,T]}) (moreover, of the special type indicated). Hence the left side of (12) yields the left side of (11) if we show that for some subsequence {n′}

lim_{n′→∞} sup_t |x_t^{κ̃_{n′}(ζ),κ_{n′}(ξ)} − y_t| = 0.  (13)
In this case the right side of (12) yields the right side of (11) if we prove that Φ(z,x) is a continuous function of (z,x).
Since the variables κ̃_n(ζ), κ_n(ξ) are ℱ₀-measurable, we can bring the indicator of the set {κ̃_n(ζ) = z, κ_n(ξ) = x} under the sign of a stochastic integral. Multiplying (10) by the indicator of the above set, bringing this indicator under the integral signs, replacing the values z, x by the values κ̃_n(ζ), κ_n(ξ), which are equal to z, x on the set considered, and, finally, bringing the indicator out again, we find that on each set {κ̃_n(ζ) = z, κ_n(ξ) = x} the process x_t^{κ̃_n(ζ),κ_n(ξ)} satisfies the equation
The union of the sets {κ̃_n(ζ) = z, κ_n(ξ) = x} over z ∈ Λ, x ∈ Γ_n is all of Ω. Hence x_t^{κ̃_n(ζ),κ_n(ξ)} satisfies Eq. (14) on Ω. Comparing (9) with (14), we have, in accord with Theorem 5.9,

M sup_t |x_t^{κ̃_n(ζ),κ_n(ξ)} − y_t|² ≤ N M|ξ − κ_n(ξ)|² + N M ∫₀ᵀ [ |b_t^{κ̃_n(ζ)}(y_t) − b_t^ζ(y_t)|² + ‖σ_t^{κ̃_n(ζ)}(y_t) − σ_t^ζ(y_t)‖² ] dt.

Here |ξ − κ_n(ξ)| → 0 uniformly on Ω and b_t^{κ̃_n(ζ)}(y_t) → b_t^ζ(y_t) for each t, due to the continuity of b_t^z(x) with respect to z. Furthermore, |b_t^{κ̃_n(ζ)}(y_t)|² + |b_t^ζ(y_t)|² does not exceed 4 sup_z |b_t^z(0)|² + 4K²|y_t|².
The last expression is summable with respect to dP × dt. Treating σ_t^z(x) in a similar way, we conclude, using the Lebesgue dominated convergence theorem, that
This implies (13). To prove the continuity of Φ(z,x) with respect to (z,x), it suffices to prove that for any sequence (z_n,x_n) → (z,x) there is a subsequence (z_{n′},x_{n′}) for which Φ(z_{n′},x_{n′}) → Φ(z,x). From the form of Φ(z,x) we easily see that it is enough to have

lim_{n′→∞} sup_{t≤T} |x_t^{z_{n′},x_{n′}} − x_t^{z,x}| = 0.
The existence of such a subsequence {n′} for any sequence (z_n,x_n) converging to (z,x) follows from considerations very similar to the preceding ones concerning Eq. (13). The theorem is proved.

5. Remark. The function MF(z, x^{z,x}_{[0,T]}) is measurable with respect to (z,x). Indeed, the set of functions F(z, x_{[0,T]}) for which Φ(z,x) is measurable contains all continuous bounded functions F. For these functions F, Φ(z,x) is even continuous with respect to (z,x). From this we conclude in the usual way that the set mentioned contains all nonnegative Borel functions F(z, x_{[0,T]}).
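The identity of Theorem 4 can be checked by simulation in a case where Φ is available in closed form. A minimal sketch, not part of the original text: for dx_t = dw_t the solution is x_t^x = x + w_t, and taking F(path) = x_T² gives Φ(x) = x² + T exactly. The defining property of the conditional expectation, M[(F − Φ(ξ))g(ξ)] = 0 for bounded Borel g, is then tested by Monte Carlo; the distribution of ξ and the test function g are illustrative choices.

```python
import numpy as np

# Check M{F(y_[0,T]) | F_0} = Phi(xi) for dx = dw, F(path) = x_T**2,
# where Phi(x) = x**2 + T in closed form.

rng = np.random.default_rng(2)
N, T = 400_000, 1.0
xi = rng.uniform(-1.0, 1.0, N)        # F_0-measurable initial value
wT = rng.normal(0.0, np.sqrt(T), N)   # Wiener increment, independent of xi
F = (xi + wT) ** 2                    # functional of the solution path
Phi = xi ** 2 + T                     # Phi(x) = M F(x^x) computed exactly

resid = np.mean((F - Phi) * np.cos(xi))   # test function g(x) = cos(x)
print(abs(resid))                          # ~ 0 up to Monte Carlo error
```

Both the residual and the difference of the overall means M F − M Φ(ξ) vanish up to sampling error, in agreement with Eq. (11) and the tower property of conditional expectations.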
6. Exercise. Prove that the assumptions of Theorem 4 about the finiteness of
can be weakened, and that it suffices to require instead the uniform integrability of the quantities ‖σ_t^z(0)‖², |b_t^z(0)|² with respect to dP × dt for z running through each bounded subset of Z.
Further, we consider the problem of computing a conditional expectation given ℱ_s, where s ∈ [0,T]. We shall reduce this problem to that of computing a conditional expectation given ℱ₀ by means of a time shift. If the function F(x_{[0,T−s]}) is defined on C([0,T−s], E_d) and x_{[0,T]} ∈ C([0,T], E_d), we denote by F(θ_s x_{[0,T]}) the value of F on the function θ_s x given by the formula (θ_s x)_t = x_{s+t} for t ∈ [0,T−s]. Sometimes F(θ_s x_{[0,T]}) is written as θ_s F(x_{[0,T−s]}). Similar notation will be used for the functions F(x_{[0,t]}).
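On a uniform time grid the shift θ_s is nothing but a slice of the sampled path, which makes the notation concrete. A small sketch (grid size and the path functional are illustrative choices):

```python
import numpy as np

# The shift theta_s maps a path x on [0,T] to (theta_s x)_t = x_{s+t} on
# [0, T-s]. On a grid with step dt this is slicing from the index of s.

def theta(path, s, dt):
    k = int(round(s / dt))          # grid index corresponding to time s
    return path[k:]                 # shifted path, now defined on [0, T-s]

dt = 0.01
t = np.arange(0, 1.0 + dt / 2, dt)  # grid for [0, 1]
x = np.sin(3 * t)                   # a sample path (deterministic here)

s = 0.25
F = lambda p: p[0] + p[-1]          # functional: initial value plus final value
print(F(theta(x, s, dt)), x[25] + x[-1])   # F(theta_s x) evaluated two ways
```

Evaluating F on the shifted path is exactly the composition F(θ_s x), i.e., the quantity written above as θ_s F(x_{[0,T−s]}).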
7. Theorem. Let the assumptions of Theorem 4 be satisfied. Further, let s ∈ [0,T], and let ζ = ζ(ω), ξ = ξ(ω) be ℱ_s-measurable variables with values in Z and E_d, respectively. Finally, let σ^z_{s+t}(x) and b^z_{s+t}(x) be independent of ω for all t ≥ 0. Suppose the process y_t satisfies the equation
for t ∈ [s,T]. We define the process x_t^{z,s,x} for t ∈ [0,T−s] as a solution of the equation

Then for any nonnegative measurable function F(z, x_{[0,T−s]}) given on Z × C([0,T−s], E_d),

M{F(ζ, θ_s y_{[s,T]}) | ℱ_s} = Φ(ζ,ξ)  (a.s.),

where Φ(z,x) = MF(z, x^{z,s,x}_{[0,T−s]}).
PROOF. Let w̃_t = w_{s+t} − w_s, ℱ̃_t = ℱ_{s+t}, ỹ_t = y_{s+t}, σ̃_t^z(x) = σ^z_{s+t}(x), b̃_t^z(x) = b^z_{s+t}(x). It is seen that
in this case ξ, ζ are ℱ̃₀-measurable, and w̃_t is a Wiener process with respect to {ℱ̃_t}. By Theorem 4,

M{F(ζ, θ_s y_{[s,T]}) | ℱ_s} = M{F(ζ, ỹ_{[0,T−s]}) | ℱ̃₀} = Φ̃(ζ,ξ)  (a.s.),
where Φ̃(z,x) = MF(z, x̃^{z,s,x}_{[0,T−s]}) and x̃_t^{z,s,x} is a solution of the equation
It remains to note that, by Corollary 3, the processes x_t^{z,s,x} and x̃_t^{z,s,x} have identical finite-dimensional distributions. Therefore Φ̃(z,x) = Φ(z,x), which proves the theorem.
The technique involving a time shift can also be applied in the case where s is a Markov time. The following fact, which we suggest the reader prove using the above technique, leads to the so-called strong Markov property of solutions of stochastic equations.

8. Exercise. Let σ_t(x) ≡ σ(x), b_t(x) ≡ b(x) be independent of t and ω, let τ be a Markov time with respect to {ℱ_t}, and let x_t^x be a solution (given for each t) of the equation

Prove that in this situation, for any x ∈ E_d and any nonnegative measurable function F = F(x_{[0,∞)}) given on C([0,∞), E_d),

M_x{θ_τ F | ℱ_τ} = M_{x_τ^x} F  ({τ < ∞}-a.s.),

where the subscript x indicates that in computing the conditional expectation one takes x^x_{[0,∞)} for the argument of F, and the subscript x_τ^x indicates that first M_y F ≡ MF(x^y_{[0,∞)}) is to be found and then y is to be replaced by x_τ^x.
9. Remark. The assertions of Theorems 4 and 7 hold not only for nonnegative functions F. Nonnegativity of F was needed only to make the expressions we dealt with meaningful. For example, Theorem 7 holds for any measurable function F for which M|F(ζ, y_{[s,T]})| < ∞. In fact, by Theorem 7
where Φ_{(±)}(z,x) = MF_{±}(z, x^{z,s,x}_{[0,T−s]}). In this case the left side of (15) is finite with probability 1 both for the sign + and for the sign −. In particular, the functions Φ_{(+)}(z,x), Φ_{(−)}(z,x) are finite for those (z,x) which are values of (ζ(ω), ξ(ω)) on some subset Ω′ ⊂ Ω of full probability. Subtracting from (15) with the sign + the same equality with the sign −, we find
where Φ(z,x) = MF(z, x^{z,s,x}_{[0,T−s]}); in this case the function Φ(z,x) exists at any rate for those (z,x) which are needed for Eq. (16) to be satisfied. Theorem 7 enables us to deduce the well-known Kolmogorov equation for the case where σ_t(x) and b_t(x) do not depend on ω.
Denote by x_t^{s,x} a solution of the equation
10. Theorem. Let c_t(x), f_t(x), g(x) be nonrandom real-valued functions, c_t(x) ≥ 0. Let σ_t(x), b_t(x), c_t(x), f_t(x), g(x) be twice differentiable in x, where neither σ_t(x) nor b_t(x) depends on ω. Furthermore, let the foregoing functions and their first and second derivatives with respect to x be continuous with respect to (t,x) in the strip [0,T] × E_d. In addition, let the products of the functions σ_t(x), b_t(x), c_t(x), f_t(x), g(x) and of their first and second derivatives with the function (1 + |x|)^{−m} be bounded in this strip. Then the function v(t,x) has the following properties:
1. |v(t,x)| ≤ N(1 + |x|)^m for all x ∈ E_d, t ∈ [0,T], where N does not depend on t, x;
2. v(t,x) is once differentiable with respect to t and twice differentiable with respect to x, and, in addition, these derivatives are continuous in the strip [0,T] × E_d;
3. for all t ∈ [0,T], x ∈ E_d

Moreover, any function which has properties 1–3 coincides with v in the strip [0,T] × E_d.
PROOF. By assumption, ‖σ_t(0)‖ and |b_t(0)| are continuous. Therefore they are bounded on [0,T] and
~ )a random where N does not depend o n t, x. Furthermore, F ( S , X ; $ ~ - ,is is a measurable (even continuous) function on variable since F(s,xI,,
C([0,T−s], E_d). From this and the assumptions |f_t(x)| ≤ N(1 + |x|)^m, |g(x)| ≤ N(1 + |x|)^m and c_t(x) ≥ 0 we deduce the first property of the function v if we use estimates of moments of solutions of a stochastic equation (see Corollary 5.12). Equation (17) makes sense, in general, only for t ∈ [0,T−s]. It will be convenient to assume in what follows that the process x_t^{s,x} is defined for t ∈ [0,T] for all s ∈ (−∞,T), x ∈ E_d. As before, we define the process x_t^{s,x} as a solution of Eq. (17), in which, having redefined the functions σ_t(x), b_t(x) if necessary, we extend these functions from the interval [0,T] to (−∞,∞), setting σ_t(x) = σ_T(x), b_t(x) = b_T(x) for t ≥ T and σ_t(x) = σ₀(x), b_t(x) = b₀(x) for t ≤ 0. By Theorem 8.6, the process x_t^{s,x} is twice ℒB-differentiable with respect to x. By virtue of the results obtained in Section 7 (see Lemmas 7.11 and 7.12), this proves that the random variable F(s, x^{s,x}_{[0,T−s]}) is twice ℒ-differentiable with respect to x for each s ∈ [0,T], and also that the function v(s,x) has all second derivatives with respect to x for each s ∈ [0,T]. In order to prove that the function v(s,x) is continuous with respect to (s,x), we need only set in (17) p = (s,x), x = ξ_t(p), σ_{s+t}(y) = σ_t(p,y), b_{s+t}(y) = b_t(p,y), write c_{s+t}(y) = c_t(p,y), f_{s+t}(y) = f_t(p,y) in the expression for F, and, in addition, make use of Corollary 8.2 as well as the results from Section 7. Using similar notation, taking the first and second ℒB-derivatives of x_t^{s,x} with respect to x (see Remark 8.5) and the ℒ-derivatives of F(s, x^{s,x}_{[0,T−s]}), and applying Corollary 8.2 as well as the results from Section 7, we prove that the first and second derivatives of v(s,x) with respect to x are continuous with respect to (s,x). This implies continuity of Lv(s,x) + f_s(x) with respect to (s,x). Hence, if the first relation in (18) has been proved, we obtain continuity of (∂/∂t)v(t,x). It should be mentioned that the second relation in (18) is obvious.
Therefore it remains only to prove that the derivative (∂/∂t)v(t,x) exists and that the first equality in (18) is satisfied. Furthermore, it suffices to prove this not for (∂/∂t)v(t,x) itself but only for the right derivative of the function v(t,x) with respect to t for t ∈ [0,T). Indeed, as is well known from analysis, if f(t), g(t) are continuous on [0,T] and the right derivative of f equals g(t) on [0,T), then f′(t) = g(t) on [0,T]. We fix x and take t₂ > t₁, t₁, t₂ ∈ [0,T]. Further, let s = t₂ − t₁. By Theorem 7 (see Remark 9),

where Φ(y) = MF(t₂, x^{t₂,y}_{[0,T−t₂]}) = v(t₂,y). Furthermore, simple computations show that
From this and (19) we find

where φ_t^{t₁,x} = exp[−∫₀ᵗ c_{t₁+r}(x_r^{t₁,x}) dr]. Next, let w(y) be a smooth function with compact support equal to 1 for |y − x| ≤ 1. Also, let v₁(t₂,y) = v(t₂,y)w(y), v₂(t₂,y) = v(t₂,y) − v₁(t₂,y). We represent the second term in (20) as the sum of two expressions, starting from the equality v = v₁ + v₂. Using Ito's formula, we transform the expression which contains v₁. Note that the derivatives of v₁(t₂,y) are continuous and have compact support, and are therefore bounded. We have
(21)

where h_t^{t₁,x} denotes the remainder term. It is seen that v = v₁ at the point x. We replace the expression v₁(t₂,x) in (21) by the expression v(t₂,x) and carry the latter to the left side of (21). Further, we divide both sides of the equality by s = t₂ − t₁ and, in addition, let t₂ ↓ t₁. By the mean value theorem, due to the continuity of the expressions considered,
Moreover, |(1/s)h_t^{t₁,x}| does not exceed the summable quantity
for suitable values of the constants N, q. Finally, v₂(t₂,y) = 0 for |y − x| ≤ 1 and, by property 1, |v₂(t₂,y)| ≤ N(1 + |y|)^m. Hence |v₂(t₂,y)| ≤ N|y − x|^m(1 + |x|)^m and, by Corollary 5.12,
The arguments carried out above enable us to derive from (21) that the right derivative of the function v(t,x) with respect to t exists at each point t = t₁ ∈ [0,T) and equals −f_{t₁}(x) − Lv(t₁,x). As was explained above, this suffices to complete the demonstration of properties 1–3 for the function v. We now prove the last assertion of the theorem, concerning the uniqueness of the solution of (18). Let u(t,x) be a function having properties 1–3. In accord
with Ito's formula for any R > 0
where τ_R equals the minimum of T − s and the first exit time of x_t^{s,x} from S_R. It is seen that τ_R → T − s as R → ∞. Moreover, the expression in the curly brackets under the sign of the last mathematical expectation in (23) is continuous with respect to τ_R and, in addition, does not exceed a summable quantity of the type (22). Therefore, letting R → ∞ in (23) and using the Lebesgue dominated convergence theorem, we can interchange the limit and the expectation. Having done this and, further, having noted that u(T,x) = g(x), we immediately obtain u(s,x) = v(s,x), thus proving the theorem.

11. Remark. The last assertion of the theorem shows that v(s,x) depends neither on the initial probability space nor on the Wiener process. The function v(s,x) is defined uniquely by the functions σ_t(x), b_t(x), c_t(x), f_t(x), g(x), i.e., by the elements which enter (18). The function v(s,x) does not change if we replace the probability space, or take another Wiener process, perhaps even a d₂-dimensional process with d₂ ≠ d₁, or, finally, take another matrix σ_t(x) of dimension d × d₂, provided only that the matrix σ_t(x)σ_t*(x) does not change.
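The Kolmogorov equation of Theorem 10 can be checked in the simplest nondegenerate case. A sketch, not part of the original text: for dx_t = dw_t with c = f = 0 the function v(t,x) = M g(x + w_{T−t}) should satisfy ∂v/∂t + (1/2)∂²v/∂x² = 0 with v(T,x) = g(x). Taking g(x) = x² gives the closed form v(t,x) = x² + (T − t), which is compared with the PDE by finite differences.

```python
# Verify the backward Kolmogorov equation for one-dimensional Brownian motion.

T = 1.0
v = lambda t, x: x ** 2 + (T - t)      # v(t,x) = M (x + w_{T-t})**2, closed form

h, t0, x0 = 1e-4, 0.3, 0.7             # step and an interior test point
dv_dt = (v(t0 + h, x0) - v(t0 - h, x0)) / (2 * h)
d2v_dx2 = (v(t0, x0 + h) - 2 * v(t0, x0) + v(t0, x0 - h)) / h ** 2
print(dv_dt + 0.5 * d2v_dx2)           # ~ 0: v solves dv/dt + (1/2) v_xx = 0
```

The terminal condition v(T,x) = g(x) holds by construction, and the PDE residual vanishes up to rounding, in line with relation (18) specialized to σ = 1, b = c = f = 0.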
10. Ito's Formula with Generalized Derivatives

Ito's formula is an essential tool of the theory of the stochastic integral. The classical formulation of the theorem on Ito's formula requires that the function to which the formula is applied be differentiable a sufficient number of times. However, in optimal control theory it becomes necessary to apply Ito's formula to nonsmooth functions (see Section 1.5). In this section we prove that in some cases Ito's formula remains valid for functions whose generalized derivatives are ordinary functions. Moreover, we prove some relationships between functions having generalized derivatives and mathematical expectations. These relationships will be useful in our further discussion.
We fix two bounded regions D ⊂ E_d, Q ⊂ E_{d+1} in the spaces E_d and E_{d+1}, respectively. Let d₁ be an integer, d₁ ≥ d, let (w_t,ℱ_t) be a d₁-dimensional Wiener process, let σ_t = σ_t(ω) be a matrix of dimension d × d₁, let b_t = b_t(ω) be a d-dimensional vector, and, finally, let c_t = c_t(ω) be real-valued. Furthermore, let
Assume that σ_t, b_t, c_t are progressively measurable with respect to {ℱ_t} and, in addition, for all t ≥ 0
Under the assumptions made above, for each x₀ ∈ E_d the process
is well-defined.

1. Theorem. Let s, x₀ be fixed, x₀ ∈ E_d, s ∈ (−∞,∞). Also, let τ_Q be the first exit time of the process (s + t, x_t) from the region Q, let τ be some Markov time (with respect to {ℱ_t}) such that τ ≤ τ_Q, let τ_D be the first exit time of the process x_t from the region D, and, finally, let τ′ be a Markov time not exceeding τ_D. Suppose that there exist constants K, δ > 0 such that ‖σ_t(ω)‖ + |b_t(ω)| + c_t(ω) ≤ K and (a_t λ, λ) ≥ δ|λ|² for all λ ∈ E_d and all (ω,t) satisfying the inequality t < τ ∨ τ′. Then for any u ∈ W²(D), v ∈ W^{1,2}(Q), t ≥ 0

e^{−φ_t} u(x_t) − u(x₀) = ∫₀ᵗ e^{−φ_r} L_r u(x_r) dr + ∫₀ᵗ e^{−φ_r} grad_x u(x_r) σ_r dw_r,

e^{−φ_t} v(s + t, x_t) − v(s, x₀) = ∫₀ᵗ e^{−φ_r} (∂/∂r + L_r) v(s + r, x_r) dr + ∫₀ᵗ e^{−φ_r} grad_x v(s + r, x_r) σ_r dw_r  (1)

almost surely on the sets {τ′ ≥ t}, {τ ≥ t}, respectively. Furthermore, for any u ∈ W²(D), v ∈ W^{1,2}(Q)

u(x₀) = −M ∫₀^{τ′} e^{−φ_r} L_r u(x_r) dr + M e^{−φ_{τ′}} u(x_{τ′}),
PROOF. We prove both assertions of Theorem 1 in the same way, via approximation of u, v by smooth functions; hence we prove the first assertion only. Let a sequence vⁿ ∈ C^{1,2}(Q̄) be such that

‖v − vⁿ‖_{B(Q)} → 0,  ‖v − vⁿ‖_{W^{1,2}(Q)} → 0,  ‖ |grad_x(v − vⁿ)|² ‖_{d+1,Q} → 0.
Further, let

y_t = x₀ + ∫₀ᵗ σ_r χ_{r<τ} dw_r + ∫₀ᵗ b_r χ_{r<τ} dr.
We note that y_t = x_t for t ≤ τ < ∞, which is easily seen for t < τ and which follows from the continuity of y_t and x_t for t = τ < ∞. We prove that the right side of Eq. (1) makes sense. Obviously, for r < τ
where N depends only on d, K. From this, using Theorem 2.4,¹¹ we obtain
Similarly, the expectation of the corresponding term is estimated by

N ‖ |grad_x v|² ‖_{d+1,Q}.  (3)
Further, we apply Ito's formula to the expression vⁿ(s + t, y_t)e^{−φ_t}. Then we have, on the set {t ≤ τ}, almost surely

e^{−φ_τ} vⁿ(s + τ, x_τ) − e^{−φ_t} vⁿ(s + t, x_t) = ∫_t^τ e^{−φ_r} (∂/∂r + L_r) vⁿ(s + r, x_r) dr + ∫_t^τ e^{−φ_r} grad_x vⁿ(s + r, x_r) σ_r dw_r.  (4)
We pass to the limit in equality (4) as n → ∞. Using estimates similar to estimates (2) and (3), we easily prove that the right side of (4) tends to the right side of (1). The first assertion of Theorem 1 for the function u can be proved by an almost word-for-word repetition of the argument given. The slight difference is
¹¹ In Theorem 2.4 we need to take for D any region such that (−∞,∞) × D ⊃ Q.
that while for vⁿ the existence of the terms in (4) follows from the obvious boundedness of τ(ω), the similar formula used in proving the first assertion of the theorem for u is valid because τ′(ω) < ∞ (a.s.) and even Mτ′ < ∞ (in Theorem 2.4, take s = 0, g = 1). The theorem is proved. Henceforth, when we refer to this theorem, we shall call the assertions of the theorem Ito's formulas. The assumption that the process x_t is nondegenerate is the most restrictive assumption of Theorem 1. We note, however, that the classical formulation of Ito's formula imposes no nondegeneracy requirement on the process when only differentiable functions are considered. In the next theorem the nondegeneracy assumption will be dropped, and in Ito's formula an inequality will be proved instead of an equality. Consider the case where σ_t, b_t, and c_t depend on the parameter x ∈ E_d. We fix s ∈ E₁. Furthermore, for t ≥ s, x ∈ E_d let there be given σ_t(x), a random matrix of dimension d × d₁; b_t(x), a random d-dimensional vector; and c_t(x), f_t(x), random variables. Assume that σ_{s+t}(x), b_{s+t}(x), c_{s+t}(x), f_{s+t}(x) are progressively measurable with respect to {ℱ_t} for each x, and that c_t(x), f_t(x) are continuous with respect to x and bounded for ω ∈ Ω, (t,x) ∈ Q, where Q, as before, is a bounded region in E_{d+1}. Also, for all t ≥ s, x and y ∈ E_d let

‖σ_t(x) − σ_t(y)‖ + |b_t(x) − b_t(y)| ≤ K|x − y|,  ‖σ_t(x)‖ + |b_t(x)| ≤ K(1 + |x|),

where K is a constant. Under the above assumptions, for each x ∈ E_d the solution x_t^{s,x} of the equation

x_t = x + ∫₀ᵗ σ_{s+r}(x_r) dw_r + ∫₀ᵗ b_{s+r}(x_r) dr
exists and is unique (see Theorem 5.7). We denote by τ_Q^{s,x} the first exit time of (s + t, x_t^{s,x}) from the region Q.
2. Theorem. Let (s,x) ∈ Q and, in addition, let a function v ∈ C(Q̄) belong to W^{1,2}(Q′) for each region Q′ which together with its closure lies in Q. Assume that the derivatives of v can be chosen so that, for some set Γ ⊂ Q with meas(Q∖Γ) = 0, for all ω and (t,y) ∈ Γ the inequality
can be satisfied. Then for any Markov time τ (with respect to {ℱ_t}) not exceeding τ_Q^{s,x},

where φ_t = φ_t^{s,x}, x_t = x_t^{s,x}.

PROOF. In proving Theorem 2 we drop the superscripts s, x. First, we note that in proving this theorem we may assume that τ ≤ τ_{Q′}, where Q′ ⊂ Q̄′ ⊂ Q. Indeed, suppose our theorem has been proved for all such Markov times. We take an arbitrary time τ ≤ τ_Q. It is seen that τ_{Q′} ↑ τ_Q and τ ∧ τ_{Q′} ↑ τ when the regions Q′, while expanding, converge to Q. Substituting in (6) the variable τ ∧ τ_{Q′} for τ, taking the limit as Q′ ↑ Q, and, finally, noting that v is continuous in Q̄, that φ_t and x_t are continuous with respect to t, and that τ and f_{s+t}(x_t) for t ≤ τ are bounded, we obtain the assertion of the theorem in the general case. Thus, let τ ≤ τ_{Q′}. Further, we apply a rather well-known method of perturbation of the initial stochastic equation (see Exercise 1.1.1). We consider some d-dimensional Wiener process w̃_t independent of {ℱ_t}. Formally, this can be arranged by considering the direct product of two probability spaces: the initial space and a space on which a d-dimensional Wiener process is defined. We denote by x_t^n a solution of the equation

x_t^n = x + ∫₀ᵗ σ_{s+r}(x_r^n) dw_r + ε_n w̃_t + ∫₀ᵗ b_{s+r}(x_r^n) dr,
where ε_n ≠ 0, ε_n → 0 as n → ∞. It is convenient to rewrite the last equation in a different form. Let σ_t^n(x) be a matrix of dimension d × (d₁ + d) such that the first d₁ columns of σ_t^n(x) form the matrix σ_t(x), and the columns numbered d₁+1, …, d₁+d form the matrix ε_n I, where I is the unit matrix of dimension d × d. Furthermore, we take the (d₁ + d)-dimensional Wiener process ŵ_t = (w_t^1, …, w_t^{d₁}, w̃_t^1, …, w̃_t^d). Then

$$x_t^n = x + \int_0^t \sigma_{s+r}^n(x_r^n)\,d\hat w_r + \int_0^t b_{s+r}(x_r^n)\,dr. \tag{7}$$
By Theorem 8.1, sup_{r≤t} |x_r^n − x_r| → 0 as n → ∞ in probability for each t. Therefore, there exists a subsequence {n_i} such that sup_{r≤t} |x_r^{n_i} − x_r| → 0 (a.s.) as i → ∞ for each t. In order not to complicate the notation, we assume that {n_i} = {n}. Let τ_{Q′}^n be the first exit time of (s+t, x_t^n) from Q′. It is not hard to show that lim inf_{n→∞} τ_{Q′}^n ≥ τ_{Q′} (a.s.). Hence, if we set

$$\tau_i = \tau \wedge \inf_{n\ge i} \tau_{Q'}^n,$$

then τ_i ≤ τ_{Q′} and τ_i → τ as i → ∞ (a.s.). Further, we apply Theorem 1 to v, Q′, x_t^n, τ_i for n ≥ i. Note that τ_i ≤ τ_{Q′}^n for n ≥ i. Moreover, v ∈ W^{1,2}(Q′). Next, it is seen that
2 Auxiliary Propositions
where N does not depend on t, ω, n. Finally, all the assumptions of Theorem 1 are satisfied. Therefore, computing for the process x_t^n (see (7)) the operator L_t appearing in Theorem 1, we have, for n ≥ i,

$$v(s,x) = -M\int_0^{\tau_i} e^{-\varphi_r^n}\, g_{s+r}(x_r^n)\,dr + M\, e^{-\varphi_{\tau_i}^n}\, v(s+\tau_i,\, x_{\tau_i}^n). \tag{8}$$
By the hypothesis of the theorem and, furthermore, by Theorem 2.4, in integrating over Γ in the first expression on the right side of (8) we can assume that (s+r, x_r^n) ∈ Γ. From (8) we find

$$v(s,x) \ge \cdots - M \int_0^{\tau_i} e^{-\varphi_r^n}\, A\,v(s+r,\, x_r^n)\,dr.$$
Because τ_i does not exceed the diameter T of the region Q′, sup_{t≤T} |x_t^n − x_t| → 0 as n → ∞, f_{s+t}(y) and c_{s+t}(y) are continuous with respect to y, and τ_i ↑ τ as i → ∞, we conclude that in the last expression for v(s,x) the first two terms on the right side, first as n → ∞ and then as i → ∞, yield the right side of Eq. (6). Therefore, to prove the theorem it remains only to show that

$$\lim_{n\to\infty} M \int_0^T |A\,v(s+r,\, x_r^n)|\,dr = 0.$$
Making use of Theorem 2.2, we take s = 0, c_t = 1, F(c,a) = c, b_t = b_{s+t}(x_t^n), r_t = 1, p = d, σ_t = σ_{s+t}^n(x_t^n). Note that, as was observed before, |b_t| ≤
N · 1 = N c_t for t < τ_{Q′}^n, where N does not depend on n, and, moreover, the assumptions of Theorem 2.2 are satisfied. Therefore we obtain an estimate in which N does not depend on n. The last expression tends to zero as n → ∞ since v ∈ W^{1,2}(Q′), so that the norm appearing in that expression is finite. The theorem is proved.

3. Remark. It is seen from the proof that if for all (t,ω) the function f_t(x) is upper semicontinuous, $\overline{\lim}_{x_n\to x} f_t(x_n) \le f_t(x)$, the assertion of the theorem still holds.

4. Corollary. If σ_t(x), b_t(x), c_t(x) do not depend on ω and, in addition, L_t(x)v(t,x) + ∂v(t,x)/∂t is a bounded continuous function of (t,x) ∈ Q, then, in the notation of the theorem,
$$v(s,x) = M\, e^{-\varphi_\tau}\, v(s+\tau,\, x_\tau) - M\int_0^{\tau} e^{-\varphi_r}\Bigl[L_{s+r}(x_r)\,v(s+r,\, x_r) + \frac{\partial v}{\partial t}(s+r,\, x_r)\Bigr]dr.$$
5. Exercise (Compare [44, p. 39].) Let d ≥ 2, α ∈ (0,1), μ = [(d−1)/(1−α)] − 1, u(x) = |x|^α, σ(x) = √a(x), where a^{ij}(x) = δ^{ij} + μ(x^i x^j / |x|²). We take as D a sphere S_R, and we take as x_t some (possibly "weak") solution of the equation dx_t = σ(x_t) dw_t, x₀ = 0. Let σ_t = σ(x_t), b_t = 0, c_t = 0. Show that the second derivatives of u are summable over D to the power p = αd/(2−α). (Note that p → d as α → 1.) Also, show that Lu(x_t) = 0 (a.s.) and that Ito's formula is not applicable to u(x_t).
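The identity Lu(x_t) = 0 in the exercise can be checked by direct differentiation; the following computation (our sketch, with L taken as a^{ij}(x) ∂²/∂x^i∂x^j away from the origin) shows where the value of μ comes from:

```latex
% u(x) = |x|^\alpha,  a^{ij}(x) = \delta^{ij} + \mu\, x^i x^j / |x|^2 .
% For x \ne 0:
%   u_{x^i} = \alpha |x|^{\alpha-2} x^i,
%   u_{x^i x^j} = \alpha |x|^{\alpha-2}\,\delta^{ij}
%               + \alpha(\alpha-2)\,|x|^{\alpha-4} x^i x^j .
\begin{aligned}
a^{ij}(x)\,u_{x^i x^j}
  &= \Delta u \;+\; \mu\,\frac{x^i x^j}{|x|^2}\,u_{x^i x^j}\\
  &= \alpha|x|^{\alpha-2}(d+\alpha-2)\;+\;\mu\,\alpha(\alpha-1)\,|x|^{\alpha-2}\\
  &= \alpha|x|^{\alpha-2}\bigl[(d+\alpha-2)+\mu(\alpha-1)\bigr],
\end{aligned}
```

which vanishes exactly when μ = (d+α−2)/(1−α) = (d−1)/(1−α) − 1, the constant prescribed in the exercise.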
6. Remark. In the case where Q = (0,T) × S_R, in the notation introduced before Theorem 2 we have τ^{s,x} = 0 for s = 0. This suggests that it is useful to bear in mind that if Q = (0,T) × S_R, one can take in Theorem 2 instead of τ^{s,x} (in Theorem 1, instead of τ_Q) the minimum of T − s and the first exit time of the process x_t^{s,x} (respectively, of the process x_t) from S_R. For s = 0 this minimum is in general not equal to zero; thus we can derive meaningful assertions from Theorems 1 and 2. In order to prove the validity of this remark, it suffices to repeat word-for-word the proofs of Theorems 1 and 2.
Notes

Section 1. The notation and definitions given in this section are in common usage. Definition 2 as well as the concept of an exterior norm are somewhat special.

Sections 2, 3, 4. The results obtained in these sections generalize the corresponding results of [32, 34, 36, 40]. Estimates of stochastic integrals having a jump-like part can be found in Pragarauskas [62].

Sections 5, 7, 8, 9. These sections contain more or less well-known results related to the theory of Ito's stochastic integral equations; see Dynkin [11], Liptser and Shiryayev [51], and Gikhman and Skorokhod [24]. The introduction of the spaces ℒ, 𝔅 is our idea.

Section 6. The existence of a solution of a stochastic equation containing measurable coefficients not depending on time was first proved in [28] by the method due to Skorokhod [70]. In this section we use Skorokhod's method in the case when the coefficients may depend on time. For the problem of uniqueness of a solution of a stochastic equation as well as the problem of constructing the corresponding Markov process, see [24, 28, 38]; also see S. Anoulova and G. Pragarauskas: On weak Markov solutions of stochastic equations, Litovsky Math. Sb. 17(2) (1977), 5–26, and the references listed in this paper.

Section 10. The results obtained in this section are related to those in [2f, 34].
3 General Properties of a Payoff Function
In this chapter we study general properties of a payoff function: the properties that a payoff function possesses under minimal assumptions on the system. Our attention is focused on proving the continuity of a payoff function, on proving different versions of Bellman's principle, and, furthermore, on proving that strategies close to optimal ones can be found among natural strategies. In Chapter 5 we shall discuss the feasibility of a further reduction of the set of strategies to Markov strategies without decreasing the payoff. Considering the problem of optimal stopping of a controlled process, we describe in this chapter a subclass of stopping rules which yield the same payoff performance as the class of all possible stopping rules.
1. Basic Results

Let A be a separable metric space (the set of admissible controls), let E_d be a Euclidean space of dimension d, and let T be a nonnegative number. We consider a controlled process in the space E_d on an interval of time [0,T]. Taking an integer d₁, we assume that (w_t, F_t) is a d₁-dimensional Wiener process. Suppose that for all a ∈ A, t ≥ 0, x ∈ E_d we are given σ(a,t,x), a matrix of dimension d × d₁; b(a,t,x), a d-dimensional vector; and, in addition, real-valued functions c^a(t,x) ≥ 0, f^a(t,x), g(x). As in Chapter 1, σ characterizes here the diffusion component of the process, b characterizes the deterministic component of the process, f^a(t,x) Δt plays the
role of the payoff during the interval of time from t to t + Δt if the controlled process is near a point x at time t and a control a is used, and g(x) is the gain at time T. The function c^a(t,x) is the measure of "discounting." This function is introduced, first, for greater generality and, second, because we consider the problems of optimal stopping of a controlled process and investigate them, as we did in Sections 1.2 and 1.5, using the method of randomized stopping. We assume that the functions σ, b, c, f, g are continuous with respect to (a,x) and continuous with respect to x uniformly over a for each t. Also, we assume that the above functions are Borel with respect to (a,t,x). Furthermore, for some constants m, K ≥ 0, for all x, y ∈ E_d, t ≥ 0, a ∈ A, let
We shall also consider below a function g(t,x), the assumptions about which are formulated before the proof of Theorem 8. As was done in Section 1.4, we introduce here the concepts of a strategy, a natural strategy, and a Markov strategy.
1. Definition. By a strategy we mean a process α_t(ω), progressively measurable with respect to the system of σ-algebras {F_t} and having values in A. We denote by 𝔄 the set of all strategies. To each strategy α ∈ 𝔄, s ∈ [0,T], x ∈ E_d we set into correspondence the solution x_t^{α,s,x} of the equation

$$x_t = x + \int_0^t \sigma(\alpha_r,\, s+r,\, x_r)\,dw_r + \int_0^t b(\alpha_r,\, s+r,\, x_r)\,dr, \qquad t \ge 0.$$
Note that, due to the assumptions on σ and b, the solution of the last equation exists and is unique. It is always convenient to represent the process x_t^{α,s,x} as the set of the last d coordinates of a (d+1)-dimensional process z_t^{α,s,x} = (y_t^{α,s,x}, x_t^{α,s,x}) which is a solution of the following system of equations: y_t = s + ∫_0^t 1 dr; in this case s appears as one of the components of the given initial data. Furthermore, if we consider the process x_t^{α,s,x} on the interval of time [0, T−s], this implies that the process z_t^{α,s,x} is considered before its first exit from the strip [0,T) × E_d.
For s ≤ T let
For convenience we shall use the superscript α and the subscripts s, x on the expectation sign to indicate expectations of quantities which depend on s, x, and a strategy α. For example,

$$M_{s,x}^{\alpha} \int_0^{T-s} f^{\alpha_t}(s+t,\, x_t)\,e^{-\varphi_t}\,dt \equiv M \int_0^{T-s} f^{\alpha_t}(s+t,\, x_t^{\alpha,s,x})\,e^{-\varphi_t^{\alpha,s,x}}\,dt.$$
We treat the probabilities of events in a similar way. For example, P_{s,x}^{α}{|x_t| ≥ R} ≡ P{|x_t^{α,s,x}| ≥ R}. In the above notation

$$v(s,x) = \sup_{\alpha\in\mathfrak{A}} M_{s,x}^{\alpha}\Bigl[\int_0^{T-s} f^{\alpha_t}(s+t,\, x_t)\,e^{-\varphi_t}\,dt + g(x_{T-s})\,e^{-\varphi_{T-s}}\Bigr].$$

Let C([0,∞), E_d) be the space of all continuous functions x_t with values in E_d defined on [0,∞). Also, let 𝒩_t be the smallest σ-algebra of subsets of C([0,∞), E_d) which contains all sets of the form {x_{[0,∞)}: x_r ∈ Γ} for r ∈ [0,t] and Borel Γ ⊂ E_d.

2. Definition. A function α_t(x_{[0,∞)}) with values in A, given for t ∈ [0,∞), x_{[0,∞)} ∈ C([0,∞), E_d), is said to be a natural strategy admissible at a point (s,x) if this function is progressively measurable with respect to {𝒩_t} and if, in addition, there exists at least one solution of the stochastic equation (4) which is progressively measurable with respect to {F_t}. We denote by 𝔄_E(s,x) the set of all natural strategies admissible at the point (s,x). To each strategy α ∈ 𝔄_E(s,x) we set into correspondence one (fixed) solution x_t^{α,s,x} of Eq. (4).

3. Definition. The natural strategy α_t(x_{[0,∞)}) is said to be a (nonstationary) Markov strategy if α_t(x_{[0,∞)}) = α_t(x_t) for some Borel function α_t(x). We denote by 𝔄_M(s,x) the set of all Markov strategies admissible at the point (s,x).
¹ The finiteness of the last two expressions will be proved after the statement of Theorem 7 (see below).
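For a fixed (here constant) strategy, the payoff under the supremum above can be estimated by simulating the controlled equation with the Euler scheme and averaging the discounted gains. The helper below is our illustrative sketch (one-dimensional state; all names are ours, not the book's), not a construction used in the text:

```python
import numpy as np

def payoff_estimate(sigma, b, c, f, g, s, x, T, a,
                    n_steps=200, n_paths=2000, seed=0):
    """Monte Carlo sketch of v^alpha(s, x) for the constant strategy alpha_t = a.

    Discretizes x_t = x + int sigma dw + int b dr by the Euler scheme and
    accumulates the discounted payoff
        int_0^{T-s} f^a(s+t, x_t) e^{-phi_t} dt + g(x_{T-s}) e^{-phi_{T-s}},
    where phi_t = int_0^t c^a(s+r, x_r) dr.
    """
    rng = np.random.default_rng(seed)
    dt = (T - s) / n_steps
    xt = np.full(n_paths, float(x))
    phi = np.zeros(n_paths)      # accumulated discount phi_t
    total = np.zeros(n_paths)    # running integral of f e^{-phi}
    for k in range(n_steps):
        t = s + k * dt
        total += f(a, t, xt) * np.exp(-phi) * dt      # left-endpoint rule
        dw = rng.normal(0.0, np.sqrt(dt), n_paths)
        xt = xt + b(a, t, xt) * dt + sigma(a, t, xt) * dw
        phi += c(a, t, xt) * dt
    total += g(xt) * np.exp(-phi)                     # terminal gain
    return total.mean()
```

With degenerate coefficients the estimator reduces to a deterministic quadrature, which gives a quick sanity check of the discretization.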
As was done in Section 1.4, we establish here the natural embedding of 𝔄_E(s,x) in 𝔄. One essential feature of this embedding should be noted. To the natural strategy α ∈ 𝔄_E(s,x) we set into correspondence the strategy β ∈ 𝔄 given by β_t(ω) = α_t(x_{[0,t]}^{α,s,x}(ω)). In this case β_t(ω) depends on (s,x). Furthermore, if this same strategy α belongs to 𝔄_E(s′,x′) as well for (s′,x′) ≠ (s,x), the strategy β_t′(ω) = α_t(x_{[0,t]}^{α,s′,x′}(ω)) does not, generally speaking, coincide with β. Therefore, the operation of embedding 𝔄_E(s,x) in 𝔄 depends on (s,x). Further, we note that 𝔄_M(s,x) ⊂ 𝔄_E(s,x) and 𝔄_M(s,x) ≠ ∅ (𝔄_M(s,x) contains the strategies of the form α_t(x_{[0,∞)}) ≡ a, where a is a fixed element of A). For s ≤ T let v_{(E)}(s,x) and v_{(M)}(s,x) be defined by the analogous formulas, with the upper bound taken over 𝔄_E(s,x) and 𝔄_M(s,x), respectively. It is seen that v_{(M)} ≤ v_{(E)} ≤ v.

4. Definition. Let ε ≥ 0. A strategy α ∈ 𝔄 is said to be ε-optimal for a point (s,x) if v(s,x) ≤ v^α(s,x) + ε. A 0-optimal strategy is said to be optimal.
We formulate those results related to v, Bellman's principle, and ε-optimal strategies which we shall prove in the subsequent sections of Chapter 3.
5. Theorem. The function v(s,x) is continuous with respect to (s,x) on [0,T] × E_d, and v(T,x) = g(x). There exists a constant N = N(m,K,T) such that for all s ∈ [0,T], x ∈ E_d

$$|v(s,x)| \le N(1 + |x|)^m. \tag{5}$$
6. Theorem. Suppose that s ∈ [0,T], x ∈ E_d. Furthermore, for each α ∈ 𝔄 we are given a time τ^α ≤ T − s which is Markov with respect to {F_t} and, in addition, a nonnegative process r_t^α which is progressively measurable with respect to {F_t} and bounded with respect to (t,ω). Then

$$v(s,x) = \sup_{\alpha\in\mathfrak{A}} M_{s,x}^{\alpha}\Bigl\{\int_0^{\tau}\bigl[f^{\alpha_t}(s+t,\, x_t) + r_t\, v(s+t,\, x_t)\bigr]\exp\Bigl(-\varphi_t - \int_0^t r_u\,du\Bigr)dt + v(s+\tau,\, x_\tau)\exp\Bigl(-\varphi_\tau - \int_0^\tau r_u\,du\Bigr)\Bigr\}. \tag{6}$$
7. Theorem. v(s,x) = v_{(E)}(s,x). In the previous theorem one may take the upper bound over α ∈ 𝔄_E(s,x).

Let us discuss the assertions of Theorems 5, 6, and 7. In Theorem 5 the equality v(T,x) = g(x) is obvious. Inequality (5) follows from the fact that, by Corollary 2.5.12,

$$M_{s,x}^{\alpha}\, \sup_{t}\, |x_t|^m \le N(m,K,T)(1 + |x|)^m,$$
and also from the fact that

$$|v(s,x)| \le K(T - s + 1)\,\sup_{\alpha\in\mathfrak{A}} M_{s,x}^{\alpha}\, \sup_{t\le T-s}\, (1 + |x_t|)^m.$$
Thus, the main assertion of Theorem 5 is the one about the continuity of v(s,x). We shall see that the continuity of v(s,x) with respect to x follows from the continuity of σ, b, c, f, g with respect to x, while the continuity of v(s,x) with respect to s follows from the specific nature of the problem. In this connection, we recall that σ, b, c, f are only measurable with respect to t.

Theorem 6 for r ≡ 0 is the usual Bellman principle. In a certain sense the assertion of Theorem 6 is Bellman's principle in the general case as well. We explain this for r ≡ λ, using a method close in essence to the method of randomized stopping described in Section 1.2. We introduce a random variable ζ which is exponentially distributed with parameter λ and is independent of {F_t}. Bellman's principle implies the equality

$$v(s,x) = \sup_{\alpha\in\mathfrak{A}} M_{s,x}^{\alpha}\Bigl[\int_0^{\tau\wedge\zeta} f^{\alpha_t}(s+t,\, x_t)\,e^{-\varphi_t}\,dt + v(s+\tau\wedge\zeta,\; x_{\tau\wedge\zeta})\,e^{-\varphi_{\tau\wedge\zeta}}\Bigr],$$

which can easily be transformed into (6) using Fubini's theorem and writing out the mathematical expectation given above.
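The transformation just alluded to rests on two elementary consequences of Fubini's theorem; suppressing the discount φ_t for clarity and using the independence of ζ and {F_t} (so that P{ζ > t} = e^{−λt}), one computes, in outline:

```latex
\begin{aligned}
M \int_0^{\tau\wedge\zeta} f^{\alpha_t}(s+t,x_t)\,dt
  &= M \int_0^{\tau} f^{\alpha_t}(s+t,x_t)\,\chi_{\{t<\zeta\}}\,dt
   = M \int_0^{\tau} f^{\alpha_t}(s+t,x_t)\,e^{-\lambda t}\,dt,\\[2pt]
M\, v\bigl(s+\tau\wedge\zeta,\; x_{\tau\wedge\zeta}\bigr)
  &= M \int_0^{\tau} \lambda e^{-\lambda t}\, v(s+t,x_t)\,dt
   \;+\; M\, e^{-\lambda\tau}\, v(s+\tau,x_\tau).
\end{aligned}
```

Adding the two lines produces the integrand f + λv and the factor e^{−λt} that appear in (6) with r_t ≡ λ.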
Theorem 7 shows that one should seek ε-optimal strategies among natural strategies. We shall see from the proof of Theorem 7 (see Remark 3.4) that one can take ε-optimal natural strategies of a very specific type.

We can prove some general results for the optimal stopping problem as well. Let g(t,x) be a continuous function of (t,x) (x ∈ E_d, t ≥ 0) such that |g(t,x)| ≤ K(1+|x|)^m for all t, x. For s ∈ [0,T] we denote by 𝔐(T−s) the set of all Markov times (with respect to {F_t}) not exceeding T − s. For α ∈ 𝔄, τ ∈ 𝔐(T−s) let

$$w^{\alpha,\tau}(s,x) = M_{s,x}^{\alpha}\Bigl[\int_0^{\tau} f^{\alpha_t}(s+t,\, x_t)\,e^{-\varphi_t}\,dt + g(s+\tau,\, x_\tau)\,e^{-\varphi_\tau}\Bigr], \qquad w(s,x) = \sup_{\alpha\in\mathfrak{A}}\;\sup_{\tau\in\mathfrak{M}(T-s)} w^{\alpha,\tau}(s,x). \tag{7}$$

Using a similar formula, we introduce w_{(E)} (w_{(M)}), replacing the upper bound over α ∈ 𝔄 by the upper bound over 𝔄_E(s,x) (over 𝔄_M(s,x)).

8. Theorem. The function w(s,x) is continuous with respect to (s,x) on [0,T] × E_d, w(s,x) ≥ g(s,x), w(T,x) = g(T,x). There exists a constant N = N(m,K,T)
such that for all s ∈ [0,T], x ∈ E_d

$$|w(s,x)| \le N(1 + |x|)^m. \tag{8}$$
9. Theorem. Let s ∈ [0,T], x ∈ E_d, and, furthermore, let a Markov time τ^α ∈ 𝔐(T−s) be defined for each α ∈ 𝔄. Then

$$w(s,x) = \sup_{\alpha\in\mathfrak{A}}\;\sup_{\gamma\in\mathfrak{M}(T-s)} M_{s,x}^{\alpha}\Bigl\{\int_0^{\tau^{\alpha}\wedge\gamma} f^{\alpha_t}(s+t,\, x_t)\,e^{-\varphi_t}\,dt + g(s+\gamma,\, x_\gamma)\,e^{-\varphi_\gamma}\,\chi_{\gamma\le\tau^{\alpha}} + w(s+\tau^{\alpha},\, x_{\tau^{\alpha}})\,e^{-\varphi_{\tau^{\alpha}}}\,\chi_{\gamma>\tau^{\alpha}}\Bigr\}. \tag{9}$$

In this situation the upper bound over 𝔄 can be replaced by the upper bound over 𝔄_E(s,x).
Let ε > 0 and let

$$\tau_\varepsilon^{\alpha,s,x} = \inf\{t \ge 0: w(s+t,\, x_t^{\alpha,s,x}) \le g(s+t,\, x_t^{\alpha,s,x}) + \varepsilon\}.$$

10. Theorem. w(s,x) = w_{(E)}(s,x). Furthermore, for ε > 0, s ∈ [0,T], x ∈ E_d the inequality

$$w(s,x) \le \sup_{\alpha\in\mathfrak{A}} M_{s,x}^{\alpha}\Bigl[\int_0^{\tau_\varepsilon^{\alpha}} f^{\alpha_t}(s+t,\, x_t)\,e^{-\varphi_t}\,dt + g(s+\tau_\varepsilon^{\alpha},\, x_{\tau_\varepsilon^{\alpha}})\,e^{-\varphi_{\tau_\varepsilon^{\alpha}}}\Bigr] + \varepsilon$$

holds. If A consists of a single point, the last inequality becomes an equality for ε = 0.
11. Theorem. In the notation of Theorem 6,

$$w(s,x) \ge \sup_{\alpha\in\mathfrak{A}} M_{s,x}^{\alpha}\Bigl\{\int_0^{\tau^{\alpha}}\bigl[f^{\alpha_t}(s+t,\, x_t) + r_t\, w(s+t,\, x_t)\bigr]\exp\Bigl(-\varphi_t - \int_0^t r_u\,du\Bigr)dt + w(s+\tau^{\alpha},\, x_{\tau^{\alpha}})\exp\Bigl(-\varphi_{\tau^{\alpha}} - \int_0^{\tau^{\alpha}} r_u\,du\Bigr)\Bigr\}. \tag{10}$$

If τ^α ≤ τ_ε^{α,s,x} for some ε > 0 and all α ∈ 𝔄, we have equality in (10). In any case, the upper bound over α ∈ 𝔄 can be replaced by the upper bound over α ∈ 𝔄_E(s,x). Finally, if A consists of a single point and τ^α ≤ τ_0^{α,s,x}, then we have equality in (10).
As in Theorem 5, the first assertion is the strongest one in Theorem 8. In fact, the equality w(T,x) = g(T,x) is obvious, and the inequality w(s,x) ≥ g(s,x) follows immediately from the definition of w(s,x) and the fact that τ = 0 is a Markov time. Moreover, (8) can be proved with the aid of (7) in the same way as (5) was proved. Theorem 9 is Bellman's principle for the optimal stopping problem of a controlled process. Further, we note that τ_ε^{α,s,x} is the time of first exit of the
process (s+t, x_t^{α,s,x}) from the open (in the relative topology of [0,T] × E_d) set Q_ε = {(s,x): w(s,x) > g(s,x) + ε}. Since w(T,x) = g(T,x), τ_ε^{α,s,x} ≤ T − s. It is seen that τ_ε^{α,s,x} ∈ 𝔐(T−s). Theorem 10 shows that τ_ε^{α,s,x} is an ε-optimal stopping time of the controlled process. If we deal with the stopping of a diffusion process, Theorem 10 asserts the optimality of the time τ_0^{s,x}. In this regard, we note that Theorems 5–11 can be used in investigating solutions of stochastic equations in the case when there is no control. Theorem 11 includes one more formulation of Bellman's principle, which is more convenient than Theorem 9 for the derivation of differential equations. Theorem 11 is the central theorem in a precise sense. Note that Theorem 10 follows immediately from Theorem 11 if in Theorem 11 we take r_t^α ≡ 0, τ^α = τ_ε^{α,s,x} and take advantage of the fact that w ≤ g + ε at the time τ_ε^{α,s,x}. Also, we show how Theorem 9 can be deduced from Theorem 11. To this end, we write the right side of (9) as w₁(s,x). It follows from the inequality
$$w_1(s,x) \ge \sup_{\alpha\in\mathfrak{A}} M_{s,x}^{\alpha}\Bigl\{\int_0^{\tau^{\alpha}} f^{\alpha_t}(s+t,\, x_t)\,e^{-\varphi_t}\,dt + \cdots\Bigr\}$$

and the inequality

$$w_1(s,x) \ge \sup_{\alpha\in\mathfrak{A}} M_{s,x}^{\alpha}\Bigl\{\int_0^{\tau^{\alpha}\wedge\tau_\varepsilon} f^{\alpha_t}(s+t,\, x_t)\,e^{-\varphi_t}\,dt + \cdots\Bigr\}$$

that w₁(s,x) ≥ w(s,x): since τ^α ∧ τ_ε ≤ τ_ε, by Theorem 11 for r ≡ 0 the last upper bound is equal to w(s,x). On the other hand, g(s,x) ≤ w(s,x), the inequality being strict if (s,x) ∈ Q_ε. Therefore,

$$w_1(s,x) \le \sup_{\alpha\in\mathfrak{A}}\;\sup_{\gamma\in\mathfrak{M}(T-s)} M_{s,x}^{\alpha}\Bigl\{\int_0^{\tau^{\alpha}\wedge\gamma} f^{\alpha_t}(s+t,\, x_t)\,e^{-\varphi_t}\,dt + \cdots\Bigr\}.$$

It remains only to take r_t^α = 0 in Theorem 11 and to note that the last upper bound does not exceed w(s,x). Hence w₁(s,x) ≤ w(s,x). Similar reasoning is possible if one takes in (9) the set 𝔄_E(s,x) instead of 𝔄.
Approximating the initial σ, b, c, f, g by means of differentiable functions and passing to the limit is a crucial point in proving the results formulated above and many others. In this section we prove one theorem on the passage to the limit of the kind indicated. Suppose the functions h_n^a(t,x) (n = 0, 1, 2, …) are given. We write h_n^a(t,x) → h_0^a(t,x) in ℒ₁([0,T],B) if for each R > 0

$$\lim_{n\to\infty} \int_0^T \sup_{a\in A}\, \sup_{|x|\le R} |h_n^a(t,x) - h_0^a(t,x)|\,dt = 0.$$
12. Theorem. Let σ_n(a,t,x) be a matrix of dimension d × d₁, let b_n(a,t,x) be a d-dimensional vector, let c_n^a(t,x) be nonnegative, and, finally, let f_n^a(t,x), g_n(t,x) be real-valued, defined for n = 1, 2, …, a ∈ A, t ∈ [0,T], x ∈ E_d. Assume that σ_n, b_n, c_n, f_n are measurable with respect to (a,t,x) and that, in addition, they converge to σ, b, c, f in ℒ₁([0,T],B) as n → ∞. Furthermore, for each n let the functions σ_n, b_n, c_n, f_n, g_n satisfy inequalities (1)–(3) with the same constants K and m, let g_n(t,x) be measurable with respect to (t,x), and, finally, for each R > 0 let

$$\lim_{n\to\infty}\, \sup_{t\in[0,T]}\, \sup_{|x|\le R} |g_n(t,x) - g(t,x)| = 0. \tag{11}$$

For α ∈ 𝔄, s ∈ [0,T], x ∈ E_d we denote by x_t^{α,s,x}(n) a solution of the equation

$$x_t = x + \int_0^t \sigma_n(\alpha_r,\, s+r,\, x_r)\,dw_r + \int_0^t b_n(\alpha_r,\, s+r,\, x_r)\,dr.$$

Further, let φ_t^{α,s,x}(n) = ∫_0^t c_n^{α_r}(s+r, x_r^{α,s,x}(n)) dr. Then for any q ≥ 1, R > 0

$$M_{s,x}^{\alpha} \int_0^{T-s} \bigl|f_n^{\alpha_t}(s+t,\, x_t(n))\,e^{-\varphi_t(n)} - f^{\alpha_t}(s+t,\, x_t)\,e^{-\varphi_t}\bigr|^q\,dt \to 0,$$
$$M_{s,x}^{\alpha}\, \sup_{t\le T-s}\, \bigl|g_n(s+t,\, x_t(n))\,e^{-\varphi_t(n)} - g(s+t,\, x_t)\,e^{-\varphi_t}\bigr|^q \to 0 \tag{12}$$

as n → ∞ uniformly over α ∈ 𝔄, s ∈ [0,T], x ∈ S_R.
PROOF. We assume that each of the functions σ, b, c, f, g, σ_n, b_n, c_n, f_n, g_n is equal to zero for t ≥ T. Then in assertions (12) we can replace T − s by T. Further, as can easily be seen, the assertion that the left sides of (12) tend uniformly to zero is equivalent to the fact that relations (12) still hold if we permit the values α, s, x in these relations to depend on n in an arbitrary way, provided α = α_n ∈ 𝔄, s = s_n ∈ [0,T], x = x_n ∈ S_R. In what follows, we shall assume that in (12) the values α, s, x are replaced by such values α_n, s_n, x_n and also that R > 0 is fixed. Let
By Corollary 2.5.12, for any q ≥ 1

$$\sup_n\, M\, \sup_{t\le T}\, |x_t^{\alpha_n,s_n,x_n}(n)|^q < \infty. \tag{13}$$
Furthermore, let the auxiliary functions h_t^n and the moduli w_t^n be defined accordingly; obviously, w_t^n → 0 as n → ∞. In addition, |h_t^n(x) − h_t^n(y)| ≤ 2w_t^n(|x| ∨ |y|), which implies, in accord with Theorem 2.7.17, convergence in ℒ₁ as n → ∞.
Replacing f by σ, b in the above arguments and using Theorem 2.5.9, we find that γ_t^n → 0 in 𝔅 as n → ∞. Further, the function

$$w_t(r,\delta) = \sup_{a\in A}\; \sup_{|x-y|\le\delta,\; |x|,|y|\le r} |f^a(t,x) - f^a(t,y)|$$

tends to zero as δ ↓ 0 due to the uniform continuity (with respect to a) of f^a(t,x) with respect to x. In addition, the above function does not exceed 2K(1+r)^m. By Lebesgue's theorem

$$\lim_{\delta\to 0}\;\overline{\lim_{n\to\infty}}\;\int_0^T w_{s_n+t}(r,\delta)\,dt \le \lim_{\delta\to 0} \int_0^T w_t(r,\delta)\,dt = 0.$$
By Lemma 2.7.5, w_{s_n+t}(r, η_t^n) → 0 for any r > 0 as n → ∞ in measure dP × dt. This yields, by virtue of (13) and Chebyshev's inequality,

$$\overline{\lim_{n\to\infty}}\int_0^T P\{w_{s_n+t}(\xi_t^n,\, \eta_t^n) > \varepsilon\}\,dt \le \overline{\lim_{r\to\infty}}\;\overline{\lim_{n\to\infty}}\int_0^T P\{|\xi_t^n| > r\}\,dt + \overline{\lim_{r\to\infty}}\;\overline{\lim_{n\to\infty}}\int_0^T P\{w_{s_n+t}(r,\, \eta_t^n) > \varepsilon\}\,dt = 0.$$
In other words, w_{s_n+t}(ξ_t^n, η_t^n) → 0 in measure dP × dt. Since, obviously, |f^{α_t}(s_n+t, ξ_t^n) − f^{α_t}(s_n+t, y_t^n)| ≤ w_{s_n+t}(ξ_t^n, η_t^n), the first of the above expressions tends to zero as n → ∞ in measure dP × dt. Using (13), we can easily prove that this first expression tends to zero in ℒ₁ as well (see the deduction of Lemma 2.7.6 from Lemma 2.7.5). Comparing the above with (14), we obtain the required convergence as n → ∞ in ℒ. We would have proved the theorem if the functions c_n^a, c^a, g_n, g were equal to zero. If the functions c_n^a, c^a, g_n, g are not equal to zero, the reader will easily complete the proof by noting that

$$|f_1 e^{-\varphi_1} - f_2 e^{-\varphi_2}| \le |f_1 - f_2| + \bigl(|f_1| + |f_2|\bigr)\,|\varphi_1 - \varphi_2|$$

if φ₁, φ₂ ≥ 0, and, in addition, applying the previous results as well as Hölder's inequality. The theorem is proved.
13. Corollary. Suppose that the assumptions of Theorem 12 are satisfied. Also, suppose that we are given measurable functions g_n(x) satisfying the inequality |g_n(x)| ≤ K(1+|x|)^m and such that for each R > 0

$$\lim_{n\to\infty}\, \sup_{|x|\le R} |g_n(x) - g(x)| = 0. \tag{15}$$

Using the functions σ_n, b_n, c_n, f_n, g_n(t,x), g_n(x), we construct functions v_n^{α,τ}, v_n^α, w_n, v_n in the same way as we constructed above the functions v^{α,τ}, v^α, w, v on the basis of the functions σ, b, c, f, g(t,x), g(x). Then w_n(s,x) → w(s,x), v_n(s,x) → v(s,x) as n → ∞ uniformly over s ∈ [0,T], x ∈ S_R for each R > 0. Moreover, v_n^{α,τ∧(T−s)}(s,x) → v^{α,τ∧(T−s)}(s,x), v_n^α(s,x) → v^α(s,x) as n → ∞ uniformly over α ∈ 𝔄, τ ∈ 𝔐(T), s ∈ [0,T], x ∈ S_R for each R > 0. Indeed, for example,
Furthermore, it is seen that the difference in question is majorized by the corresponding integral term plus

$$M_{s,x}^{\alpha}\, \sup_{t\le T-s}\, \bigl|g_n(s+t,\, x_t(n))\,e^{-\varphi_t(n)} - g(s+t,\, x_t)\,e^{-\varphi_t}\bigr|.$$

The last expression tends to zero as n → ∞ by the theorem, uniformly over α ∈ 𝔄, s ∈ [0,T], x ∈ S_R. We have already approximated given functions with the aid of infinitely differentiable functions, using convolutions with smooth kernels. Let us see what this method of approximation gives in the case considered. We shall show that σ_n(a,t,x), etc., can be taken to be infinitely differentiable with respect to x.

14. Theorem. Let a sequence ε_n → 0 as n → ∞. Then the assertions of Theorem 12 and Corollary 13 hold for σ_n(a,t,x) = σ^{(0,ε_n)}(a,t,x), b_n(a,t,x) = b^{(0,ε_n)}(a,t,x), etc. (For the notation, see Section 2.1.)
PROOF. We assert that the functions σ_n, b_n, c_n, f_n satisfy inequalities (1)–(3) with the same constants K and m, and that, in addition, these functions converge to σ, b, c, f in ℒ₁([0,T],B). Furthermore, we need to verify (11), (15) and the fact that |g_n(x)|, |g_n(t,x)| ≤ N(1+|x|)^m, where N does not depend on n, t, x.
The desired estimates for σ_n, b_n, c_n, f_n, g_n follow from the fact that, for example,

$$\|\sigma_n(a,t,x)\| \le \sup_{|z|\le 1} \|\sigma(a,t,\, x - \varepsilon_n z)\| \le K(1 + \varepsilon_n + |x|).$$
Further, as in proving Theorem 12, we introduce here w_t(r,δ). We mentioned that w_t(R,ε) → 0 as ε → 0 for each t, R. From this, for each t we obtain sup_{a∈A} sup_{|x|≤R} |f_n^a(t,x) − f^a(t,x)| → 0. By Lebesgue's theorem,

$$\int_0^T \sup_{a\in A}\, \sup_{|x|\le R} |f_n^a(t,x) - f^a(t,x)|\,dt \to 0.$$
In a similar way we prove relations (11), (15) and the fact that the functions σ_n, b_n, c_n → σ, b, c in ℒ₁([0,T],B). The theorem is proved.

In some cases one can take the functions σ_n, b_n, c_n, f_n, g_n to be infinitely differentiable with respect to (t,x).

15. Theorem. Suppose that the set A consists of a finite number of points only and, in addition, that the sequence ε_n → 0 as n → ∞, |ε_n| ≤ 1. Then the assertions of Theorem 12 and Corollary 13 hold for σ_n(a,t,x) = σ^{(ε_n)}(a,t,x), b_n(a,t,x) = b^{(ε_n)}(a,t,x), etc. (in computing the convolution with respect to (t,x) we assume for t ≤ 0 that σ(a,t,x) = σ(a,0,x), etc.).
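The smoothing used in Theorems 14 and 15 is convolution with a smooth compactly supported kernel. A one-dimensional numerical sketch (the kernel and the helper name are ours, not the ζ of Section 2.1) illustrates the two properties used in the proofs: the mollification reproduces the function as ε → 0, and it never exceeds the bounds of the original function:

```python
import numpy as np

def mollify(h, eps, x, n_nodes=401):
    """Sketch of h^(eps)(x) = int h(x - eps*z) zeta(z) dz in one dimension,
    where zeta is a smooth bump supported in |z| <= 1 with total mass 1."""
    z = np.linspace(-1.0, 1.0, n_nodes)
    inside = np.abs(z) < 1.0
    # smooth bump exp(-1/(1 - z^2)) on (-1, 1), zero outside (guard the endpoints)
    zeta = np.where(inside, np.exp(-1.0 / np.where(inside, 1.0 - z**2, 1.0)), 0.0)
    w = zeta / zeta.sum()          # normalized quadrature weights, sum to 1
    return float(np.sum(h(x - eps * z) * w))
```

Because the kernel is symmetric and the weights are positive with unit sum, affine functions are reproduced exactly and sup-norm bounds are preserved, which is the mechanism behind the estimate ‖σ_n‖ ≤ K(1 + ε_n + |x|) above.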
PROOF. Estimates of the growth of the functions σ_n, b_n, c_n, f_n, g_n can be obtained in the same way as in the preceding proof. Furthermore, relations (11) and (15), as was mentioned in Section 2.1, are known. Therefore, it remains to show that σ_n, b_n, c_n, f_n → σ, b, c, f in ℒ₁([0,T],B). We shall prove only the convergence of f_n; for the convergence of σ_n, b_n, c_n, one repeats the corresponding arguments word-for-word. Note that the definition of convergence in ℒ₁([0,T],B) involves the upper bound with respect to A. Since A consists of a finite number of points, the upper bound mentioned does not exceed the sum (over a ∈ A) of the expressions standing under the sign of the upper bound. Therefore, we prove
that f_n → f in ℒ₁([0,T],B) if we show that for each a ∈ A and all R > 0

$$\lim_{n\to\infty} \int_0^T \sup_{|x|\le R} |f_n^a(t,x) - f^a(t,x)|\,dt = 0. \tag{16}$$
Let us take the function w_t(r,δ) from the proof of Theorem 12. Writing the convolution f_n^a(t,x) out in complete form and recalling that ζ(t,x) = ζ₁(t)ζ(x) (see Section 2.1), we easily prove that for |x|, |y| ≤ R, |x − y| ≤ ε_n

$$|f_n^a(t,x) - f_n^a(t,y)| \le w_t^n(R+1,\, \varepsilon_n), \qquad w_t^n(R+1,\varepsilon) = w_t(R+1,\varepsilon) * \varepsilon_n^{-1}\zeta_1(t\varepsilon_n^{-1}),$$

where the asterisk denotes convolution with respect to t. Let h_n^a = f_n^a − f^a. For |x|, |y| ≤ R, |x − y| ≤ ε_n

$$|h_n^a(t,x) - h_n^a(t,y)| \le w_t^n(R+1,\, \varepsilon_n) + w_t(R,\, \varepsilon_n).$$
Next, we apply Lemma 2.7.15 with R + 1 in place of R. For each ε > 0 it then follows that the limiting expression in (16) is smaller than the sum of two terms.
In the last expression the second term tends to zero as n → ∞, since the mean of any function from ℒ₁ converges in ℒ₁. For the same reason the first term tends as n → ∞ to

$$\int_0^T w_t(R+2,\, \varepsilon)\,dt + \int_0^T w_t(R+1,\, \varepsilon)\,dt. \tag{17}$$

Thus, (17) estimates the left side of (16) for any ε > 0. In proving Theorem 12, we saw that (17) tends to zero as ε → 0. We have thus proved Theorem 15.
2. Some Preliminary Considerations

We shall prove the assertions of Theorems 1.5–1.11 by approximating an arbitrary strategy with the aid of step strategies, i.e., strategies which are constant on each interval of a subdivision I of the interval of time [0,T]. We expect that the upper bound of the payoffs given by all step strategies constructed on the basis of subdivisions I = (0 = t₀, t₁, …, t_n = T) will tend to the corresponding payoff function as max_i(t_{i+1} − t_i) → 0. In this section, we prepare the proof of this fact in a special formulation (see Theorem 3.2), and, furthermore, we prove that v(s,x) and w(s,x) are continuous with respect to x. Using the definitions, assumptions, and notation of Section 1, we introduce here some new objects. Let us take β ∈ A, 0 ≤ s ≤ t, and a function
u(x). Also, let us define a strategy β_t ≡ β and assume, in addition, that

In order to become familiar with the operator G_{s,t}, we suggest the reader work out the following exercise.

1. Exercise. Let 0 ≤ s₀ ≤ s₁ ≤ … ≤ s_n = T. Show that G_{s₀,s₁} G_{s₁,s₂} ⋯ G_{s_{n−1},s_n} g(x) is the upper bound of v^α(s₀,x) over all strategies α ∈ 𝔄 for which α_t is constant on each semi-interval of time [s_i − s₀, s_{i+1} − s₀).
We shall repeatedly assume of the functions u(x) to be substituted into the operators G^β, G that for some constants K and m ≥ 0, for all x ∈ E_d,

$$|u(x)| \le K(1 + |x|)^m. \tag{1}$$

In this case |G_{s,t}^β u(x)| ≤ N(1+|x|)^m, where N does not depend on β, s, t, x. As was seen in the discussion of Theorems 1.5–1.7, such inequalities readily follow from estimates of the moments of solutions of stochastic equations.

2. Theorem. Let the continuous function u(x) satisfy inequality (1). Then the function G_{s,t}^β u(x) is continuous in x uniformly with respect to β ∈ A and s, t such that 0 ≤ s ≤ t ≤ T. The functions v^α(s,x), v^{α,τ∧(T−s)}(s,x) are continuous in x uniformly with respect to α ∈ 𝔄, s ∈ [0,T], τ ∈ 𝔐(T). In particular, the functions G_{s,t}u(x), v(s,x), w(s,x) are continuous in x uniformly with respect to s, t such that 0 ≤ s ≤ t ≤ T.
PROOF. The last assertion follows from the fact that, for example,

$$\sup_{0\le s\le t\le T} |G_{s,t}u(x_n) - G_{s,t}u(x_0)| \le \sup_{0\le s\le t\le T}\, \sup_{\beta\in A} |G_{s,t}^{\beta}u(x_n) - G_{s,t}^{\beta}u(x_0)|.$$

Furthermore, the right side of the last expression tends to zero as x_n → x₀, according to the first assertion of the theorem. Next, we take a point x₀ ∈ E_d and a sequence x_n → x₀, and we set h_n = x_n − x₀, σ_n(a,t,x) = σ(a,t,x+h_n). In a similar way, we introduce b_n, c_n, f_n, g_n, u_n. For instance, u_n(x) = u(x+h_n). Since c^a(t,x) is continuous in x uniformly with respect to a, for each t we have sup_{a∈A} sup_{|x|≤R} |c_n^a(t,x) − c^a(t,x)| → 0 as n → ∞. By Lebesgue's theorem,

$$\lim_{n\to\infty} \int_0^T \sup_{a\in A}\, \sup_{|x|\le R} |c_n^a(t,x) - c^a(t,x)|\,dt = 0.$$
It is not hard to verify that the remaining assumptions of Theorem 1.12 are satisfied. Therefore, Theorem 1.12 is applicable in our case. Furthermore,
we note that the process x_t^{α,s,x₀}(n) from Theorem 1.12 and the process x_t^{α,s,x_n} − h_n satisfy the same equation in an obvious way. Hence x_t^{α,s,x₀}(n) = x_t^{α,s,x_n} − h_n, c_n^{α_t}(s+t, x_t^{α,s,x₀}(n)) = c^{α_t}(s+t, x_t^{α,s,x_n}), etc. By Theorem 1.12 we obtain the corresponding convergence as n → ∞, uniformly with respect to α ∈ 𝔄, s ∈ [0,T]. Taking the function u(x) instead of g(s,x) in the last relation (this can be done because of the continuity of u(x) and by virtue of (1)), we find the analogous convergence as n → ∞, uniformly with respect to α ∈ 𝔄, s ∈ [0,T]. We derive the assertions of the theorem from the limiting relations proved above in an elementary way. This completes the proof of Theorem 2.

Further, we need the continuity of v^α, G^α with respect to α. Let a metric on the set A be given by a function ρ(a₁,a₂). We assume that ρ(a₁,a₂) ≤ 1 for all a₁, a₂ ∈ A. We can easily satisfy this inequality by replacing, when needed, the initial metric by an equivalent one, using the formula

$$\rho'(a_1,a_2) = \frac{2}{\pi}\arctan \rho(a_1,a_2).$$
3. Definition. For α¹, α² ∈ 𝔄 let

$$\tilde\rho(\alpha^1,\alpha^2) = M\int_0^T \rho(\alpha_t^1,\, \alpha_t^2)\,dt.$$

If α^n ∈ 𝔄 (n = 0, 1, …) and ρ̃(α^n, α⁰) → 0 as n → ∞, we write α^n → α⁰. Since ρ(a₁,a₂) ≤ 1, ρ̃(α¹,α²) is defined for each α¹, α² ∈ 𝔄.

4. Exercise. Using Theorem 2.8.1, prove that if ρ̃(α¹,α²) = 0, then v^{α¹}(s,x) = v^{α²}(s,x) for all (s,x).
By hypothesis, the set A is separable. We fix a countable subset {a(i)} dense everywhere in A.

5. Definition. Let I = {0 = t₀, t₁, …, t_n = T} be a subdivision of the interval of time [0,T], let α ∈ 𝔄, and let N be an integer. We write α ∈ 𝔄_{CT}(I,N) if α_t(ω) ∈ {a(1), …, a(N)} for all ω ∈ Ω, t ∈ [0,T], and α_t = α_{t_i} for t ∈ [t_i, t_{i+1}), i = 0, 1, …, n−1. Let 𝔄_{CT}(I) = ⋃_N 𝔄_{CT}(I,N) and 𝔄_{CT} = ⋃_I 𝔄_{CT}(I). Strategies of the class 𝔄_{CT} are said to be step strategies.

6. Lemma. Suppose that the diameter of the subdivision I_n of the interval [0,T] tends to zero as n → ∞. Then for each strategy α ∈ 𝔄 there is a sequence of strategies α^n ∈ 𝔄_{CT}(I_n) converging to the strategy α.
PROOF.The distance p" satisfies a triangle inequality. Hence it suffices to show that:
U,
%,,(I,) is dense in % ;, a. in the sense of the distance p" the set b. the set a,, is dense in a set of all the strategies each of which assumes only a finite number of values from {a(i)); c. the latter set is dense in %. Proof of (a). If a E acT, for some subdivision I = ( 0 = to,t i , . . . ,t, = T ) the equalities a, = ati can be satisfied for t E [ti,ti+,).Using the strategy a and the subdivision I,, we construct a strategy an so that a: will be right continuous, constant on each interval of the subdivision I,, and, in addition, coincide with a, at the left end points of the foregoing intervals. In this situation a: differs from a, only on those intervals of the subdivision I, each of which contains at least one point ti. It is seen that p(a:,a,) + 0 as n + co everywhere, except for, perhaps, the points ti. Hence p"(an,a)-, 0. Proof of (b). We take a strategy a, that assumes values in {a(l),. . . ,a(N)). In a Euclidean space EN let us choose N arbitrary points x,, . . . ,xN so that Ixi - xjl 2 1 for i # j. Let P,(o)= xi, if a,(o) = a(i), t E [O,T], P,(o)= 0 for t > T. It is easily seen that for s, t E [O,T]
Setting α_t = α_0, β_t = β_0 for negative t, we define completely the functions α_t, β_t. Let κ(n,t) = j2^{−n} for j2^{−n} ≤ t < (j + 1)2^{−n}, j = 0, ±1, ±2, .... It is a well-known fact (see, for instance, the proof of Lemma 4.4 in [51]) that there exists a number s and a sequence of integers n' → ∞ such that
+ 0 as By virtue of (2),we have for the functions a: = a,~n,,-,,+s:p"(an,a) n' -+ co. Furthermore, it is not hard to see that ~ ( nt ,- s) + s is a step function oft, ~ ( nt ,- s) + s 5 t. Hence a: is 9,-measurable and a" E % ., Proof of (c).Let us introduce on A the following functions:
It is seen that κ_n(a) is equal to that a(i) which is at a distance from a of not more than 1/n and which, in addition, has the smallest possible index. Since
3 General Properties of a Payoff Function
{a(i)} is everywhere dense in A, the functions i_n(a), κ_n(a) are defined on A and ρ(κ_n(a), a) ≤ 1/n for all a ∈ A. Further, let κ_{n,N}(a) = a(N ∧ i_n(a)). Obviously, κ_{n,N}(a) → a if we let first N → ∞ and then n → ∞. Hence ρ̃(κ_{n,N}(α), α) → 0 under the same conditions for each strategy α ∈ 𝔄. It remains only to note that the strategy κ_{n,N}(α_t) takes on values only in the set {a(1), ..., a(N)}. We have thus proved the lemma.
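The projection κ_{n,N} onto the countable dense set can be made concrete. In the sketch below, A is taken to be the interval [0,1] and the everywhere dense sequence a(i) is taken to be the dyadic midpoints; both choices are illustrative assumptions, not the text's A:

```python
# kappa_{n,N}(a) = a(min(N, i_n(a))), where i_n(a) is the smallest index i
# with |a(i) - a| <= 1/n.  The dense sequence below is an assumption made for
# illustration: midpoints of dyadic intervals in A = [0, 1].
def dense_seq(i):
    # i = 1, 2, 3, ... -> 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, ...
    level = 1
    while i > (1 << level) - 1:
        level += 1
    j = i - ((1 << (level - 1)) - 1)        # position within the level
    return (2 * j - 1) / (1 << level)

def kappa(n, N, a):
    # i_n(a): smallest index whose point lies within 1/n of a
    i = 1
    while abs(dense_seq(i) - a) > 1.0 / n:
        i += 1
    return dense_seq(min(N, i))

print(abs(kappa(100, 10**6, 0.3) - 0.3) <= 0.01)   # True: within 1/n of a
```

Letting first N → ∞ and then n → ∞ drives κ_{n,N}(a) → a, mirroring the order of limits in the proof.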
7. Lemma. Let s ∈ [0,T], let τ_1, τ_2 be random variables with values in [0, T − s], and, finally, let u(x) be a continuous function satisfying condition (1). Then the random variable

∫_0^{τ_1} f^{α_t}(s + t, x_t^{α,s,x}) e^{−φ_t^{α,s,x}} dt + u(x_{τ_2}^{α,s,x}) e^{−φ_{τ_2}^{α,s,x}}

is an ℒ-continuous function of (α,x) for α ∈ 𝔄, x ∈ E_d.
PROOF. Note first that if α^n → α and x^n → x, the ℒB-limit of x_t^{α^n,s,x^n} is equal to x_t^{α,s,x}, which follows, according to Theorem 2.8.1, from the continuity of σ(a,t,y) and b(a,t,y) with respect to a, the boundedness of σ(a,t,y) and b(a,t,y) for fixed y, and, finally, the convergence of α^n_t(ω) to α_t(ω) in measure dP × dt. Further, reasoning in the same way as in the proof of Lemma 2.7.6 and, moreover, using condition (1.3) as well as the continuity of c^a(t,x) and f^a(t,x) with respect to (a,x), we can prove that the processes c^{α_t}(s + t, x_t^{α,s,x}), f^{α_t}(s + t, x_t^{α,s,x}) are ℒ-continuous with respect to (α,x). Also, applying the results obtained in Section 2.7 on the ℒB-continuity of integrals and the ℒ-continuity of products of ℒ-continuous processes, we immediately arrive at the assertion of the lemma. Lemma 7 is proved.
8. Corollary. For s ∈ [0,T] and τ ∈ 𝔐(T − s) the functions v^α(s,x) and v^{α,τ}(s,x) are continuous with respect to (α,x) for α ∈ 𝔄, x ∈ E_d. For 0 ≤ s ≤ t ≤ T the function G^β_{s,t}u(x) is continuous with respect to (β,x) on A × E_d. Combining Corollary 8 with Lemma 6, we have
9. Corollary.

v(s,x) = lim_{n→∞} sup_{α∈𝔄_cT(I_n)} v^α(s,x)

for any sequence of subdivisions I_n whose diameter tends to zero.
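This corollary suggests a brute-force approximation of v(s,x): enumerate the finitely many step strategies built from a coarse subdivision and a finite action set, and keep the best Monte Carlo estimate. Everything concrete below is an illustrative assumption, not the text's model: one-dimensional dynamics dx = α dt + dw, terminal reward g(x) = −x², two subdivision intervals, and three actions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

def payoff(actions, x0=0.0, T=1.0, n_steps=50, n_paths=2000):
    """Monte Carlo estimate of M g(x_T) for a step strategy using actions[k]
    on the k-th part of [0, T]; dynamics dx = a dt + dw (an assumption)."""
    dt = T / n_steps
    x = np.full(n_paths, x0)
    for k in range(n_steps):
        a = actions[k * len(actions) // n_steps]    # piecewise-constant control
        x = x + a * dt + np.sqrt(dt) * rng.standard_normal(n_paths)
    return np.mean(-x**2)                           # g(x) = -x^2 (assumed)

A = (-1.0, 0.0, 1.0)                                # finite action set
best = max(itertools.product(A, repeat=2), key=payoff)
v_approx = payoff(best)
print(best, v_approx)   # a zero-net-drift pair: (0,0), (1,-1) or (-1,1)
```

For this toy reward the optimal step strategies are those with zero net drift, since M(−x_T²) = −((∫α dt)² + T); refining the grid and enlarging the action set tightens the approximation, as the corollary asserts.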
10. Exercise. Prove that (3) is ℒ-continuous with respect to s, and then deduce from this the continuity of v^α(s,x) with respect to s. This, together with Theorem 2, enables one to conclude that v^α(s,x) is continuous with respect to (s,x) and that, in addition, v(s,x) is a Borel function of (s,x).
We prove some other properties of the operators G^β_{s,t}. Taking a sequence ε_n → 0, ε_n ≠ 0, we consider the mean functions for the functions σ, b, c, f. Let σ_n(a,t,x) = σ^{(ε_n)}(a,t,x) (see the notation in Section 2.1), etc. In other words, we take σ_n, b_n, c_n, f_n from Theorem 1.15. We denote by x_t^{α,s,x}(n) a solution of the equation

dx_t = σ_n(α_t, s + t, x_t) dw_t + b_n(α_t, s + t, x_t) dt.
Furthermore, let

φ_t^{α,s,x}(n) = ∫_0^t c_n^{α_r}(s + r, x_r^{α,s,x}(n)) dr,

and, in addition, for a constant strategy β and for 0 ≤ s ≤ t ≤ T let

G^{β,n}_{s,t}u(x) = M [ u(x_{t−s}^{β,s,x}(n)) e^{−φ_{t−s}^{β,s,x}(n)} + ∫_0^{t−s} f_n^β(s + r, x_r^{β,s,x}(n)) e^{−φ_r^{β,s,x}(n)} dr ].
Regarding β as a single point of A and using Theorem 1.15 as well as the estimates of moments of solutions of stochastic equations, we have the following assertion.
11. Lemma. Let a continuous function u(x) satisfy condition (1), and let u_n(x) = u^{(ε_n)}(x). Then

|G^{β,n}_{s,t}u_n(x)| ≤ N(1 + |x|)^m

for all s ≤ t ≤ T, x ∈ E_d, n > 0, β ∈ A, where N does not depend on s, t, x, n, β. Furthermore,

G^{β,n}_{s,t}u_n(x) → G^β_{s,t}u(x)

as n → ∞, uniformly on each set of the form {(s,t,x): 0 ≤ s ≤ t ≤ T, |x| ≤ R}.
The functions σ_n, b_n, c_n, f_n, u_n are smooth with respect to (t,x). In addition, their derivatives grow not more rapidly than (1 + |x|)^m. For example,

|u_{n(l)}(x)| = ε_n^{−1} | ∫ u(x − ε_n z) ζ_{(l)}(z) dz | ≤ N ε_n^{−1} (1 + |x|)^m.
By Theorem 2.9.10, the function G^{β,n}_{s,t}u_n(x) is the unique solution of a certain equation. By Remark 2.9.11 the foregoing function is uniquely determined by the functions a_n = (1/2)σ_nσ*_n, b_n, c_n, f_n, u_n, for finding which it suffices, obviously, to give σ, b, c, f, u. This, by Lemma 11, implies
12. Corollary. The function G^β_{s,t}u(x) does not change if we change the probability space and, furthermore, take another d_1-dimensional Wiener process. The function G^β_{s,t}u(x) is determined uniquely by σ, b, c, f, u.
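Corollary 12 can be illustrated numerically: an estimate of G^β_{s,t}u(x) computed with two independent Wiener processes agrees up to Monte Carlo error. The scalar dynamics, the constant coefficients, and the rewards below are illustrative assumptions, and the functional being estimated is taken to be the discounted expectation built from the preceding formulas:

```python
import numpy as np

# Constant strategy in one dimension: dx = b dt + sigma dw, discount rate c,
# running reward f, terminal reward u.  All of this data is assumed.
b, sigma, c = 0.5, 1.0, 0.3
f = lambda y: np.cos(y)
u = lambda y: np.sin(y)

def G(x, t, seed, n_paths=100_000, n_steps=100):
    """Monte Carlo estimate of M[u(x_t) e^{-ct} + int_0^t f(x_r) e^{-cr} dr]."""
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    xs = np.full(n_paths, x)
    run = np.zeros(n_paths)
    for k in range(n_steps):
        run += f(xs) * np.exp(-c * k * dt) * dt
        xs = xs + b * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return float(np.mean(run + u(xs) * np.exp(-c * t)))

g1, g2 = G(0.0, 1.0, seed=1), G(0.0, 1.0, seed=2)
print(abs(g1 - g2) < 0.02)   # independent Wiener processes, same value
```

The two seeds play the role of two different probability spaces carrying two different Wiener processes; the estimates agree because the functional depends only on σ, b, c, f, u.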
Let us use the properties of the function G^{β,n}_{s,t}u_n(x) to a greater extent.

13. Corollary. Suppose that β ∈ A and that the function σ(β,t,x) (respectively, b(β,t,x)) for each t ∈ [0,T] is twice (respectively, once) continuously differentiable in x. Furthermore, suppose that the second (first) derivatives of the foregoing function with respect to x are bounded on each set of the form [0,T] × S_R. Let t ∈ [0,T], and let η(s,x) be an infinitely differentiable function on E_{d+1} which is equal to zero outside a certain cylinder [0,t] × S_R. Then the corresponding integral identity holds.
In fact, let

L^β(s,x) = ∂/∂s + Σ_{i,j=1}^d a^{ij}(β,s,x) ∂²/∂x_i∂x_j + Σ_{k=1}^d b^k(β,s,x) ∂/∂x_k − c^β(s,x).
By Theorem 2.9.10, the corresponding equality holds in the strip [0,t] × E_d. Multiplying the last equality by η, integrating by parts, and, in addition, introducing the adjoint operator L^{β,n}* in the usual way, we have
It remains only to let n → ∞ and to note that the integration is carried out over a bounded set and that, for example, a_n(β,s,x) → a(β,s,x) for almost all (s,x). (For the properties of mean functions, see Section 2.1.) The final property of the operator G^β_{s,t} which we give in this section follows immediately from Theorem 2.9.7 and Remark 2.9.9.
14. Lemma. Suppose that s ∈ [0,T], 0 ≤ t_1 ≤ t_2 ≤ T − s, and a strategy α ∈ 𝔄 is such that α_t = α_{t_1} for t ∈ [t_1,t_2). Let the continuous function u(x) satisfy condition (1). Then the corresponding equality holds almost surely.

Note that for proving the lemma we should take in Theorem 2.9.7 the objects Λ, t_1, t_2, x_{t_1}^{α,s,x}, α_{t_1}, a^{α_{t_1}}(s + t, x), b^{α_{t_1}}(s + t, x) in place of those appearing there.
3. The Proof of Theorems 1.5-1.7

In the preceding section we proved that certain mathematical expectations of the form M^α_{s,x}F are continuous with respect to (α,x) on 𝔄 × E_d. Furthermore, we learned how to approximate any strategy by means of step strategies. Also, we introduced the operators G^β_{s,t}, G_{s,t}, which are crucial for the discussion in this section. Having thus completed the technicalities, we proceed now to prove Theorems 1.5-1.7.

1. Lemma. Let s_0 < s_1 < ... < s_n = T. Then
PROOF. Let u_i(x) = G_{s_i,s_{i+1}} ⋯ G_{s_{n−1},s_n} g(x) (i = 0,1,...,n − 1), u_n(x) = g(x). Also, we fix ε > 0. By Theorem 2.2, the function u_{n−1}(x) = G_{s_{n−1},s_n} g(x) is continuous and, furthermore, satisfies the inequality |u_{n−1}(x)| ≤ N(1 + |x|)^m. This implies, in accord with Theorem 2.2, that the function u_{n−2}(x) = G_{s_{n−2},s_{n−1}} u_{n−1}(x) is continuous. Arguing in the same way, we convince ourselves that all the functions u_i(x) are continuous. Further,
By Corollary 2.8, the functions G^β_{s_i,s_{i+1}} u_{i+1}(x) are continuous with respect to β. Hence the last upper bound can be computed over any countable set everywhere dense in A. Noting in addition that G^β_{s_i,s_{i+1}} u_{i+1}(x) is continuous with respect to x according to Corollary 2.8, we conclude that there exists a (countably-valued) Borel function β_i(x) such that for all x
In the space of continuous functions x_{[0,∞)} with values in E_d we define the function α_t(x_{[0,∞)}) by the formula α_t(x_{[0,∞)}) = β_i(x_{s_i−s_0}) for t ∈ [s_i − s_0, s_{i+1} − s_0), i = 0, ..., n − 1, and α_t(x_{[0,∞)}) = β_0(0) for t ≥ T − s_0. It is seen that the function α_t is progressively measurable with respect to {𝒩_t} and also that the equation
is equivalent to a sequence of equations
etc. Each of the equations given is solvable. Therefore, α is a natural strategy admissible at each point (s,x). Finally, by Lemma 2.14, for i = 0, 1, ..., n − 1
Adding up all such inequalities and collecting like terms, we obtain the required assertion, thus proving our lemma. In the theorem which follows we prove the first assertion of Theorem 1.7.
2. Theorem. (a) v_E = v. (b) Let s_0 = s_0^i < s_1^i < ... < s_{n(i)}^i = T (i = 1,2,...) and max_j (s_{j+1}^i − s_j^i) → 0 as i → ∞. Then
PROOF. Assertion (a) follows from (b), Lemma 1, and the obvious inequality v_E ≤ v. Furthermore, it follows from Lemma 1 that the upper bound in (1) does not exceed v(s_0,x). Since the upper limit is smaller than the upper bound, to prove (b) we need only show that
Using Corollary 2.9, we construct step strategies α^i so that v^{α^i}(s_0,x) → v(s_0,x) as i → ∞ and α^i_t = α^i_{s_j^i − s_0} for t ∈ [s_j^i − s_0, s_{j+1}^i − s_0). Also, we introduce functions u_j^i according to the formulas u_{n(i)}^i(x) = g(x), u_j^i(x) = G_{s_j^i,s_{j+1}^i} u_{j+1}^i(x) (j = 0,1,...,n(i) − 1). By Lemma 2.14
Adding up such inequalities with respect to j from j = 0 to j = n(i) − 1 and, in addition, collecting like terms, we obtain v^{α^i}(s_0,x) ≤ u_0^i(x). Therefore, v(s_0,x) ≤ lim inf_{i→∞} u_0^i(x), which is completely equivalent to (2). The theorem is proved.
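The composition of the operators G_{s_j,s_{j+1}} appearing in this proof is the continuous-time counterpart of ordinary backward induction. On a controlled Markov chain (an illustrative discrete stand-in for the diffusion, with made-up transition probabilities and rewards) one sweep of the composition is a backward recursion, and splitting the sweep at an intermediate time reproduces the same value, in the spirit of Bellman's principle proved below:

```python
import numpy as np

# Controlled chain on states {0,...,4}: action a in {-1,+1} moves the state
# by a with probability 0.7 and by -a with probability 0.3 (clipped to the
# state space).  Rewards are illustrative assumptions.
S, A = 5, (-1, 1)
f = lambda s, a: -0.1 * abs(a)                 # running cost per step
g = np.array([0.0, 1.0, 4.0, 1.0, 0.0])       # terminal reward

def step_value(u, s, a):
    up = min(max(s + a, 0), S - 1)
    dn = min(max(s - a, 0), S - 1)
    return f(s, a) + 0.7 * u[up] + 0.3 * u[dn]

def G(u):
    """One sweep of backward induction: (Gu)(s) = sup_a [f(s,a) + M u(next)]."""
    return np.array([max(step_value(u, s, a) for a in A) for s in range(S)])

def sweep(u, n):
    for _ in range(n):
        u = G(u)
    return u

v0 = sweep(g, 4)                   # value of the 4-step problem
v_split = sweep(sweep(g, 2), 2)    # stop at the middle time, then continue
print(np.allclose(v0, v_split))    # True: Bellman's principle on the chain
```

The equality of the two computations is exactly the semigroup property of the backward-induction operators that the theorem exploits.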
3. Exercise. Prove that if the subdivisions {s_j^i} are nested, the functions under the limit in (1) converge monotonically to v(s_0,x).
4. Remark. The theorem proved above, together with the constructions made in Lemma 1, provides a technique for finding ε-optimal strategies in the class of step natural strategies.
5. Lemma. (a) Let s ∈ [0,T], x ∈ E_d, α ∈ 𝔄. Then the processes δ_t^{α,s,x} and

κ_t^{α,s,x} = ∫_0^t f^{α_u}(s + u, x_u^{α,s,x}) e^{−φ_u^{α,s,x}} du + v(s + t, x_t^{α,s,x}) e^{−φ_t^{α,s,x}},

defined for t ∈ [0, T − s], are supermartingales with respect to {ℱ_t}, the first process being nonnegative (a.s.). (b) G_{s,t}v(t,x) ≤ v(s,x) for x ∈ E_d, 0 ≤ s ≤ t ≤ T.
PROOF. It is seen that the processes δ_t^{α,s,x} and κ_t^{α,s,x} differ by a martingale. Hence δ_t^{α,s,x} is a supermartingale if κ_t^{α,s,x} is a supermartingale. The nonnegativity of δ_t^{α,s,x} follows from the fact that, by the definition of a supermartingale, δ_t^{α,s,x} ≥ M^α_{s,x}{δ_{T−s}^{α,s,x}|ℱ_t} and δ_{T−s}^{α,s,x} = 0. Further, by Theorem 2.2 the function v(s + t, x) is continuous with respect to x. In addition, |v(s + t, x)| ≤ N(1 + |x|)^m. Therefore, by Lemma 2.7 the corresponding convergence holds for each t ∈ [0, T − s] if α^n → α. By Lemma 2.6, we can choose step strategies α^n → α, which implies that the supermartingale property of κ_t^{α,s,x} need be proved only for step strategies. Since the intervals of constancy of α_t can be considered one by one, it suffices to prove that M^α_{s,x}{κ_{t_2}|ℱ_{t_1}} ≤ κ_{t_1}^{α,s,x} (a.s.) for t_2 ≥ t_1 if α_t = α_{t_1} for t ∈ [t_1,t_2). By Lemma 2.14, for such a strategy
from which it is seen that it remains now to prove assertion (b) of the lemma. We fix β ∈ A and s_0 = t, and, in addition, we construct a sequence of subdivisions t = s_0^i < s_1^i < ... < s_{n(i)}^i = T of the interval [t,T] so that
max_j (s_{j+1}^i − s_j^i) → 0. By Theorem 2 and Lemma 2.14, it is not hard to obtain the corresponding estimates, where the constants N do not depend on x. This implies that for each β ∈ A the magnitude of the sequence

G_{s_1^i,s_2^i} ⋯ G_{s_{n(i)−1}^i,s_{n(i)}^i} g(x_{t−s}^{β,s,x})

does not exceed N(1 + |x_{t−s}^{β,s,x}|)^m, the latter expression having a finite mathematical expectation. Therefore, recalling that
and also applying Lebesgue's theorem, we easily find

G^β_{s,t}v(t,x) = lim_{i→∞} G^β_{s,t} G_{s_1^i,s_2^i} ⋯ G_{s_{n(i)−1}^i,s_{n(i)}^i} g(x),

where the expression standing under the limit does not exceed v(s,x) in accord with Lemma 1 or Theorem 2. We have thus proved Lemma 5.

6. Theorem (Bellman's Principle). For s ≤ t ≤ T

v(s,x) = sup_{α∈𝔄} M^α_{s,x} [ ∫_0^{t−s} f^{α_u}(s + u, x_u) e^{−φ_u} du + v(t, x_{t−s}) e^{−φ_{t−s}} ];
in this case we can take the upper bound with respect to α ∈ 𝔄_E(s,x) as well.
PROOF. The properties of supermartingales imply that

v(s,x) = M^α_{s,x} κ_0 ≥ M^α_{s,x} κ_{t−s} ≥ M^α_{s,x} κ_{T−s} = v^α(s,x).
Taking upper bounds with respect to α ∈ 𝔄 or α ∈ 𝔄_E(s,x), we prove the required result. The following lemma proves Theorem 1.5.
7. Lemma. The function v(s,x) is continuous with respect to (s,x) for s ∈ [0,T], x ∈ E_d. (If A consists of a single point β_0, we have in (3) equalities instead of inequalities.)

PROOF. By Theorem 2.2, the function v(s,x) is continuous in x uniformly with respect to s ∈ [0,T]. Therefore, it suffices to prove that v(s,x) is continuous in s for each x. We fix x_0. We need to prove that if s_n, t_n ∈ [0,T] and t_n − s_n → 0, then v(s_n,x_0) − v(t_n,x_0) → 0. We assume without loss of generality that t_n ≥ s_n. Further, we use Theorem 6 for x = x_0, s = s_n, t = t_n, and, in addition, choose α^n ∈ 𝔄 such that the upper bound mentioned in the assertion of Theorem 6 is attained for α = α^n to within 1/n. We have
where the superscript n attached to x, φ stands for (α^n, s_n, x_0). By Corollary 2.5.12, for any q ≥ 1,

sup_n M sup_{t≤T} |x_t^n|^q < ∞,

from which it follows, due to (1.3), that the expression under the limit sign in the first term of (4) does not exceed N(t_n − s_n), and hence that this term is equal to zero. If we replace f by c in the above arguments and, furthermore, use Chebyshev's inequality, we see that φ^n_{t_n−s_n} → 0 in probability. By Corollary 2.5.12, x^n_{t_n−s_n} → x_0 in probability. Due to the uniform continuity of v(t,x) with respect to x, h(y) → 0 as y → 0. It follows, in turn, that h(x^n_{t_n−s_n} − x_0) → 0 in probability. In particular, v(t_n, x^n_{t_n−s_n}) − v(t_n, x_0) → 0 in probability. We can now easily prove that the expression standing under the sign of mathematical expectation in the second term of (4) tends to zero in probability. From (5) we conclude that the mathematical expectation in (5) tends to zero as well (compare with the deduction of Lemma 2.7.6 from Lemma 2.7.5). The lemma is proved.

8. Proof of Theorem 1.6. We shall drop the superscripts (α,s,x). Further, we take the supermartingale κ_t from Lemma 5 which, according to Lemma 7, is continuous in t. Therefore, by the lemma given in Appendix 2, the processes
p_t = ∫_0^t [f^{α_u}(s + u, x_u) + r_u v(s + u, x_u)] exp(−φ_u − ∫_0^u r_p dp) du + v(s + t, x_t) exp(−φ_t − ∫_0^t r_p dp)
are supermartingales for t ∈ [0, T − s]. Therefore,

0 = M^α_{s,x}[κ_0 − p_0] ≥ M^α_{s,x}[κ_τ − p_τ].
Applying the properties of supermartingales one more time, we obtain the analogous inequalities. It remains to take in the above inequalities upper bounds with respect to α ∈ 𝔄, which completes the proof of Theorem 1.6. If we take in the above inequalities upper bounds with respect to α ∈ 𝔄_E(s,x) and if, in addition, we use Theorem 2a, we arrive at the second assertion of Theorem 1.7, thus completing the proof of Theorem 1.7.
9. Exercise. In proving Lemma 7, we introduced the function h(y). With the aid of h we define a convex modulus of continuity of v(t,x) at a point x_0, where h̄(r) = sup_{|y|≤r} h(y). Prove that the corresponding estimate holds with a constant N = N(K,T,m).
10. Remark. From Theorem 2 and Corollary 2.12 it follows that the function v(s,x) is defined uniquely once the functions σ, b, c, f, g have been given. The function v(s,x) depends on neither the probability space nor the Wiener process involved.
4. The Proof of Theorems 1.8-1.11 for the Optimal Stopping Problem

In this section we shall use the method of randomized stopping (see Section 1.2). Recall that this method consists in introducing the multiplier exp(−∫_0^t r_u du) into the functional which characterizes the payoff, and in replacing the function f^α by f^α + rg. In accord with this remark we shall carry out the following construction. For n > 0 let B_n = A × [0,n]. Furthermore, for β = (a,r) ∈ B_n let

σ(β,t,x) = σ(a,t,x),  b(β,t,x) = b(a,t,x),  c^β(t,x) = c^a(t,x) + r,  f^β(t,x) = f^a(t,x) + r g(t,x).

It is clear that for each n, for β ∈ B_n, t ≥ 0, x and y ∈ E_d, the functions σ(β,t,x), b(β,t,x) satisfy conditions (1.1) and (1.2) with the same constant K and, in addition, the functions c^β(t,x), f^β(t,x), g(x) satisfy the growth condition (1.3) with the same constant m and a different constant K. Hence, as in Section 1, where we introduced the concepts of a strategy, a natural strategy, and a payoff function with respect to A, σ(a,t,x), b(a,t,x), c^a(t,x), f^a(t,x), g(x),
we can introduce here analogous quantities with respect to B_n, σ(β,t,x), b(β,t,x), c^β(t,x), f^β(t,x), g(x) = g(T,x). We denote by 𝔅_n the set of corresponding strategies and by 𝔅_{n,E}(s,x) the set of natural strategies. Let 𝔑_n be the set of nonnegative processes r_t which are progressively measurable with respect to {ℱ_t} and such that r_t(ω) ≤ n for all (t,ω); let 𝔅 = ∪_n 𝔅_n, 𝔅_E(s,x) = ∪_n 𝔅_{n,E}(s,x), 𝔑 = ∪_n 𝔑_n. Each strategy β ∈ 𝔅_n is, obviously, a pair of processes (α_t, r_t) with α = {α_t} ∈ 𝔄 and r = {r_t} ∈ 𝔑_n. Conversely, each pair of this kind yields a strategy in 𝔅_n. It is easily seen that if β = (α,r) ∈ 𝔅_n, then x_t^{β,s,x} is a solution of the equation
In other words, x_t^{β,s,x} = x_t^{α,s,x}. Let

w̄_n(s,x) = sup_{β∈𝔅_n} M^β_{s,x} [ ∫_0^{T−s} f^{β_t}(s + t, x_t) e^{−φ_t} dt + g(T, x_{T−s}) e^{−φ_{T−s}} ].
Here, as well as above, the indices attached to the sign of the mathematical expectation indicate that the expectation is taken of an expression in which these indices are used wherever possible. Theorems 1.5-1.7, as well as the results obtained in Sections 2 and 3, are applicable to the function w̄_n(s,x) just as to a payoff function in the control problem without stopping. In particular, w̄_n(s,x) is continuous with respect to (s,x) and w̄_n(T,x) = g(T,x) (Theorem 1.5).

1. Lemma. Let s ∈ [0,T], x ∈ E_d, β = (α,r) ∈ 𝔅. Then the process

∫_0^t f^{β_u}(s + u, x_u^{β,s,x}) e^{−φ_u^{β,s,x}} du + w̄_n(s + t, x_t^{β,s,x}) e^{−φ_t^{β,s,x}},

defined for t ∈ [0, T − s], is a continuous supermartingale.
PROOF. By Lemma 3.5a, for β ∈ 𝔅_n the process just described is a supermartingale. In particular, for β = (α,0) it is a supermartingale. It remains to apply the lemma from Appendix 2 to the last expression, thus completing the proof of our lemma.

2. Lemma. Let s ∈ [0,T], x ∈ E_d, γ^i ∈ 𝔐(T − s), β^i = (α^i, r^i) ∈ 𝔅 (i = 1,2,...). Further, let a Borel function u(t,x) satisfy the inequality |u(t,y)| ≤ N(1 + |y|)^m with the same constant N for all t ≥ 0, y ∈ E_d. In addition, let

lim_{i→∞} M ∫_0^{γ^i} r_t^i exp(−φ_t^{α^i,s,x} − ∫_0^t r_p^i dp) dt = 0.
Then

lim_{i→∞} [ M^{β^i}_{s,x} { ∫_0^{γ^i} f^{β^i_t}(s + t, x_t) e^{−φ_t} dt + u(s + γ^i, x_{γ^i}) e^{−φ_{γ^i}} } − M^{α^i}_{s,x} { ∫_0^{γ^i} f^{α^i_t}(s + t, x_t) e^{−φ_t} dt + u(s + γ^i, x_{γ^i}) e^{−φ_{γ^i}} } ] = 0.
PROOF. It can easily be seen that

|f^{α^i_t}(s + t, x_t^{α^i,s,x}) e^{−φ_t^{β^i,s,x}} − f^{α^i_t}(s + t, x_t^{α^i,s,x}) e^{−φ_t^{α^i,s,x}}| ≤ |f^{α^i_t}(s + t, x_t^{α^i,s,x})| e^{−φ_t^{α^i,s,x}} (1 − exp(−∫_0^t r_p^i dp)).
Integrating both sides of the last inequality over t ∈ [0,γ^i], introducing the notation

h^i = sup_{t∈[0,T−s]} (1 + |x_t^{α^i,s,x}|)^m,

and, finally, noting that |f^a(t,x)| ≤ K(1 + |x|)^m, we find that the difference of the integral terms does not exceed

K M [ h^i ∫_0^{γ^i} (1 − exp(−∫_0^t r_p^i dp)) dt ].

It is also seen that, since |u(t,x)| ≤ N(1 + |x|)^m,

I = |M^{β^i}_{s,x} u(s + γ^i, x_{γ^i}) e^{−φ_{γ^i}} − M^{α^i}_{s,x} u(s + γ^i, x_{γ^i}) e^{−φ_{γ^i}}| ≤ N M [ h^i ∫_0^{γ^i} r_t^i exp(−∫_0^t r_p^i dp) dt ].
Therefore, it suffices to show that the last expression tends to zero. Since c^a(t,x) ≤ K(1 + |x|)^m, we have φ_t^{α^i,s,x} ≤ KRT when h^i ≤ R. Hence

lim sup_{i→∞} I ≤ (1/R) sup_i M(h^i)^2 + R e^{KRT} lim_{i→∞} M χ_{h^i≤R} ∫_0^{γ^i} r_t^i exp(−φ_t^{α^i,s,x} − ∫_0^t r_p^i dp) dt.
By hypothesis, the last term is equal to zero. Furthermore, it follows from the estimates of moments of solutions of stochastic equations (see Corollary 2.5.12) that sup_i M(h^i)^2 < ∞. Therefore, letting R → ∞ in the inequality

lim sup_{i→∞} I ≤ (1/R) sup_i M(h^i)^2,

we obtain I → 0, which proves the lemma.
3. Lemma. (a) Let s ∈ [0,T], x ∈ E_d, and for each α ∈ 𝔄 let τ^α ∈ 𝔐(T − s) and r^α ∈ 𝔑 be defined. Then

w̄_n(s,x) = sup_{α∈𝔄} M^α_{s,x} { ∫_0^{τ^α} [f^{α_t} + n(g − w̄_n)^+ + r_t^α w̄_n](s + t, x_t) exp(−φ_t − ∫_0^t r_p^α dp) dt + w̄_n(s + τ^α, x_{τ^α}) exp(−φ_{τ^α} − ∫_0^{τ^α} r_p^α dp) }.   (1)

(b) Let g_n = g ∧ w̄_n. Then

w̄_n(s,x) = sup_{α∈𝔄} sup_{τ∈𝔐(T−s)} M^α_{s,x} { ∫_0^τ f^{α_t}(s + t, x_t) e^{−φ_t} dt + g_n(s + τ, x_τ) e^{−φ_τ} }.

Furthermore, we can replace in (a) and (b) the upper bound with respect to α ∈ 𝔄 by the upper bound with respect to α ∈ 𝔄_E(s,x).
PROOF. By Theorem 1.6, Eq. (1) holds for any τ^α, r^α if it holds for τ^α = T − s, r^α = 0. We deduce assertion (b) from (a) for r^α = 0 in the same way as we deduced the corresponding assertion from Eq. (1.5.2) in proving Lemma 1.5.2. Theorem 1.7 implies that it is possible to replace 𝔄 by 𝔄_E(s,x) in the preceding considerations. Thus, it remains only to prove (1) for τ^α = T − s, r^α = 0. Let β = (α,r) ∈ 𝔅_n. Furthermore, let

κ_t = ∫_0^t f^{β_u}(s + u, x_u) e^{−φ_u^{β,s,x}} du + w̄_n(s + t, x_t) e^{−φ_t^{β,s,x}}.
According to Lemma 3.5, the process κ_t is a supermartingale. According to the lemma given in Appendix 2, the associated process p_t is also a supermartingale and, in addition,

w̄_n(s,x) = M p_0 ≥ M p_{T−s}.    (2)
Using Fubini's theorem, we easily prove that

M p_{T−s} = M^α_{s,x} { g(T, x_{T−s}) e^{−ψ_{T−s}} + ∫_0^{T−s} [f^{α_t} + r_t(g − w̄_n)](s + t, x_t) e^{−ψ_t} dt },

where ψ_t = φ_t^{α,s,x} + ∫_0^t r_p dp.
Obviously, the upper bound of the last expression with respect to r ∈ 𝔑_n is equal to

M^α_{s,x} { ∫_0^{T−s} [f^{α_t} + n(g − w̄_n)^+](s + t, x_t) e^{−φ_t} dt + g(T, x_{T−s}) e^{−φ_{T−s}} }.
Taking this fact into consideration, recalling the definition of w̄_n, and, finally, computing the upper bounds in (2) with respect to α ∈ 𝔄, r ∈ 𝔑_n, we arrive at (1) for τ^α = T − s, r^α = 0. The lemma is proved.

4. Corollary. Since g_n ≤ g, we have w̄_n ≤ w.
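The mechanism behind the approximation of stopping by randomized stopping can be seen on a single path: with intensity r_t = n·χ_{t≥τ}, the weight n e^{−n(t−τ)} concentrates near τ, so the randomized functional converges to the stopped one as n → ∞. The deterministic path, the rewards, and the stopping time below are illustrative assumptions (and the discount c is dropped for clarity):

```python
import numpy as np

# One fixed path x_t and one stopping time tau; no discount (c = 0), all data
# below assumed for illustration.
T, n_grid = 1.0, 200_001
t = np.linspace(0.0, T, n_grid)
dt = t[1] - t[0]
x = np.sin(3 * t)                        # illustrative "path"
ff = lambda tt, xx: xx**2                # running reward f (assumed)
gg = lambda tt, xx: 1.0 + xx             # stopping reward g (assumed)
tau = 0.4

def randomized(n):
    """int_0^T (f + r g) e^{-int_0^t r du} dt + g(T, x_T) e^{-int_0^T r du}
    with intensity r_t = n * 1_{t >= tau}."""
    r = n * (t >= tau)
    R = np.cumsum(r) * dt                # int_0^t r du
    e = np.exp(-R)
    return np.sum((ff(t, x) + r * gg(t, x)) * e) * dt + gg(T, x[-1]) * e[-1]

def stopped():
    k = np.searchsorted(t, tau)
    return np.sum(ff(t[:k], x[:k])) * dt + gg(tau, x[k])

gaps = [abs(randomized(n) - stopped()) for n in (1, 10, 100, 1000)]
print(gaps)                              # the gap shrinks as n grows
```

Increasing the bound n on the intensity is exactly what enlarges 𝔅_n and drives w̄_n upward toward w in the results that follow.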
5. Lemma. (a) The function w(s,x) is continuous with respect to (s,x). (b) There exists a constant N such that |w̄_n(s,x)| ≤ N(1 + |x|)^m for all n, s, x. (c) w̄_n(s,x) ↑ w(s,x) uniformly on each set of the form {(s,x): s ∈ [0,T], |x| ≤ R}.

PROOF. Assertion (a) follows from (c) together with the continuity of w̄_n(s,x). Since 𝔅_n ⊂ 𝔅_{n+1}, the sequence w̄_n(s,x) increases. Moreover, in accord with Corollary 4, w̄_n ≤ w and, obviously, w̄_0(s,x) ≤ w̄_n(s,x); in this case the function w̄_0 does not differ essentially from the function v considered in Section 1. All this, together with the estimates of v, w given in Section 1, proves assertion (b). Let w̃(s,x) = lim_{n→∞} w̄_n(s,x). By Corollary 4, w̃(s,x) ≤ w(s,x). On the other hand, for α ∈ 𝔄, τ ∈ 𝔐(T − s), let r̄_t = n χ_{t≥τ}, β̄_t = (α_t, r̄_t). Then, using Fubini's theorem, it is not hard to obtain
Let us write the last relation in a different form. Let y^{α,s,x}(t) denote the expression in question for t ≤ T − s, and set y^{α,s,x}(t) = y^{α,s,x}(T − s) for t > T − s. Furthermore, we introduce a random variable ξ which has an exponential distribution with parameter equal to unity and which, in addition, does not depend on {y^{α,s,x}(t)}. What we have obtained can be written as follows:

w̄_n(s,x) ≥ M^α_{s,x} y(τ + (1/n)ξ).

Letting n tend to infinity, we note that the process y^{α,s,x}(t) is continuous with respect to t,
and, finally, we note that the last quantity is summable. Therefore, by Lebesgue's theorem,

w̃(s,x) ≥ M^α_{s,x} y(τ) = v^{α,τ}(s,x),    w̃(s,x) ≥ w(s,x).

We conclude that w̃(s,x) = w(s,x). From the last equality and the inequality w(s,x) ≥ g(s,x) it follows, in particular, that the decreasing sequence of nonnegative continuous functions g(s,x) − g_n(s,x) = g(s,x) − g(s,x) ∧ w̄_n(s,x) → g(s,x) − g(s,x) ∧ w(s,x) = 0. By Dini's theorem, g(s,x) − g_n(s,x) → 0 uniformly on each cylinder C_{T,R}. In view of Lemma 3 (and Corollary 1.13), in order to prove (c) it suffices to show that |g_n(s,x)| ≤ N(1 + |x|)^m with the same constant N for all n, s, x. The last inequality follows from assertion (b), thus proving the lemma.
6. Remark. The proof of (a) completes the proof of Theorem 1.8. Theorems 1.9 and 1.10, as was seen in Section 1, follow from Theorem 1.11. For proving Theorem 1.11 we need an analog of Lemma 1, which is a combination of Lemma 1 and Lemma 5 (and Theorem 1.12).

7. Corollary. Let s ∈ [0,T], x ∈ E_d, β = (α,r) ∈ 𝔅, and let

p_t^{β,s,x} = ∫_0^t f^{β_u}(s + u, x_u^{β,s,x}) e^{−φ_u^{β,s,x}} du + w(s + t, x_t^{β,s,x}) e^{−φ_t^{β,s,x}}.

Then the process p_t^{β,s,x}, defined for t ∈ [0, T − s], is a continuous supermartingale. The process p_{t∧τ}^{β,s,x} is also a supermartingale for each τ ∈ 𝔐(T − s). Subtracting from the process p_{t∧τ}^{β,s,x} for β = (α,0) a suitable martingale, we arrive at a supermartingale which for t = T − s is nonnegative. Further, from the definition of a supermartingale it immediately follows that a supermartingale which is positive (a.s.) at a certain moment of time is positive (a.s.) at each preceding moment of time. Summing up what has been said above, we have the following result.

8. Corollary. Let s ∈ [0,T], τ ∈ 𝔐(T − s), x ∈ E_d, α ∈ 𝔄. Then the process obtained above is a nonnegative supermartingale for t ∈ [0, T − s].
9. Proof of Theorem 1.11. Corollary 7 and the properties of supermartingales imply inequality (1.10). Next, let τ^α ≤ τ_ε^{α,s,x}, ε > 0. For β = (α,r) ∈ 𝔅 let τ^β = τ^α. According to Bellman's principle (see Theorem 1.7), for each n

w̄_n(s,x) = sup_{β∈𝔅_{n,E}(s,x)} M^β_{s,x} { ∫_0^{τ^β} [f^{α_t}(s + t, x_t) + r_t g(s + t, x_t)] exp(−φ_t − ∫_0^t r_p dp) dt + w̄_n(s + τ^β, x_{τ^β}) exp(−φ_{τ^β} − ∫_0^{τ^β} r_p dp) }.
Taking the limit in the last expression as n → ∞ and, furthermore, using the inequality g(s,x) ≤ w(s,x), (3), and the fact that w̄_n ↑ w, we have
Further, we take a sequence α^i ∈ 𝔄_E(s,x), r̄^i ∈ 𝔑, for which
From the inequality g(s + t, x_t^{α^i,s,x}) < w(s + t, x_t^{α^i,s,x}) − ε for t < τ_ε^{α^i,s,x}, and also from (3) and (4), we find

lim_{i→∞} M^{α^i}_{s,x} ∫_0^{τ^{α^i}} r̄_t^i exp(−φ_t − ∫_0^t r̄_p^i dp) dt = 0.    (5)
By Lemma 2, the last expression and (4) yield the corresponding limiting relation.

By Corollary 7, the process p_t^{β,s,x} is a continuous supermartingale. Therefore, according to the lemma given in Appendix 2, the process κ_t^{s,x} − p_t^{s,x} is a supermartingale for each β = (α,r) ∈ 𝔅. In particular, M^β_{s,x}κ_τ ≤ M^β_{s,x}p_τ, which together with (5) and (3) yields

w(s,x) ≥ lim_{i→∞} M^{α^i}_{s,x} { ∫_0^{τ^{α^i}} [f^{α^i_t}(s + t, x_t) + r̄_t^i w(s + t, x_t)] exp(−φ_t − ∫_0^t r̄_p^i dp) dt + w(s + τ^{α^i}, x_{τ^{α^i}}) exp(−φ_{τ^{α^i}} − ∫_0^{τ^{α^i}} r̄_p^i dp) }
≥ sup_{α∈𝔄_E(s,x)} M^α_{s,x} { ∫_0^{τ^α} [f^{α_t}(s + t, x_t) + r_t w(s + t, x_t)] exp(−φ_t − ∫_0^t r_p dp) dt + w(s + τ^α, x_{τ^α}) exp(−φ_{τ^α} − ∫_0^{τ^α} r_p dp) }.
It only remains to prove that in (1.10) we have equality if A consists of a single point and if, in addition, τ^α ≤ τ̄^{s,x}. In this case we do not write the superscript α, since we deal with only one strategy. Let τ ≤ τ̄^{s,x}. For ε > 0, as can be seen, τ ∧ τ_ε^{s,x} ≤ τ_ε^{s,x}, and, further, in accord with what has been proved,

w(s,x) = M_{s,x} { ∫_0^{τ∧τ_ε^{s,x}} [f(s + t, x_t) + r_t w(s + t, x_t)] exp(−φ_t − ∫_0^t r_p dp) dt + w(s + τ ∧ τ_ε^{s,x}, x_{τ∧τ_ε^{s,x}}) exp(−φ_{τ∧τ_ε^{s,x}} − ∫_0^{τ∧τ_ε^{s,x}} r_p dp) }.
Letting ε ↓ 0 and noting that τ ∧ τ_ε^{s,x} ↑ τ ∧ τ̄^{s,x} = τ ≤ T − s, that the function w(t,x) is continuous with respect to (t,x), and, finally, that the quantity sup_p |w(s + p, x_p)| has a finite mathematical expectation, we have proved what was required. This completes the proof of Theorem 1.11.
10. Remark. Applying Remark 3.10 to the functions w̄_n(s,x), we see that these functions are defined uniquely by the functions σ(β,t,y), b(β,t,y), c^β(t,y), f^β(t,y), and g(T,y). The latter functions are expressible in terms of σ(a,t,y), b(a,t,y), c^a(t,y), f^a(t,y), and g(t,y), which together with Lemma 5c proves that in order to compute the function w(s,x) it is sufficient to give the functions σ(a,t,y), b(a,t,y), c^a(t,y), f^a(t,y), and g(t,y).

11. Exercise. Examine the possibility of "pasting" to the strategy which "serves well" until a moment of time τ_1 a strategy which "serves well" during the interval between τ_1 and τ_2, and, furthermore, of extending this procedure; show that the final assertion of Theorem 1.11 holds in every case, not merely in the case where A consists of a single point.
12. Exercise. Prove that for u = w and for u = g

w(s,x) = sup_{α∈𝔄} sup_{τ∈𝔐(T−s)} M^α_{s,x} { ∫_0^τ f^{α_t}(s + t, x_t) e^{−φ_t} dt + u(s + τ, x_τ) e^{−φ_τ} }.
We conclude the discussion in this section by formulating two theorems. In the first theorem we estimate the rate of convergence of w̄_n to w. In the second theorem we give a connectivity property of a set Q. Both theorems will be proved in Section 5.3. However, we note here that the expressions appearing in the formulations of Theorems 13 and 14 below will be defined in the introduction to Chapter 4. Also, we note that the spaces W^{1,2}_d(H_T) will be introduced in Section 5.3 (see Definition 5.3.1).

13. Theorem. Let g ∈ W^{1,2}_d(H_T) ∩ C(H̄_T), F[g] ≥ −K(1 + |x|)^m (H_T-a.e.). Then in H_T

|w(s,x) − w̄_n(s,x)| ≤ (1/n) N(K,m,T)(1 + |x|)^m.
14. Theorem. Let g ∈ W^{1,2}_d(H_T) ∩ C(H̄_T), s ∈ [0,T]. Furthermore, let there exist a function h(t,x) which is continuous in (s,T) × E_d and which coincides with F[g](t,x) almost everywhere in (s,T) × E_d. Further, let

Q = {(t,x): t ∈ (s,T), x ∈ E_d, h(t,x) > 0},
Q_b = {(t,x): t ∈ (s,T), x ∈ E_d, w(t,x) > g(t,x)}.

Then Q ⊂ Q_b, and each connected component of the region Q_b contains at least one connected component of the region Q. In particular, if the set Q is connected, that is, if it consists of a single connected component, then the set Q_b is connected as well.
Notes

Section 1. The results valid for the general case appear here for the first time. Some of these results for particular cases can be found in Krylov [36] and Portenko and Skorokhod [61].

Sections 2, 3. Some of the results in these sections can be found, though without detailed proofs, in Portenko and Skorokhod [61]. The step strategies are considered in Fleming [14].

Section 4. The methods for investigating the optimal stopping problem used in this section have been borrowed from [29-31, 36]. Theorem 14 is a generalization of a result obtained in [56].
4

The Bellman Equation

In Chapter 3 we investigated general properties of controlled processes, such as continuity of a payoff function, the feasibility of passing to the limit from one process to another, the validity of Bellman's principle in various forms, etc. The assumptions we made there are rather weak. In this chapter we shall see that, by making additional assumptions on the smoothness of the initial objects, we can prove some smoothness of the payoff functions, as well as the fact that the payoff functions satisfy the Bellman equation. The assumptions, definitions, and notation given in Section 3.1 are used throughout this chapter. Section 5, dealing with a passage to the limit in the Bellman equation, is an exception, however, being self-contained in terms of assumptions and definitions. In addition to the main assumptions taken from Section 3.1, each section of this chapter contains assumptions which will be formulated or referred to at the beginning of that section, and which will be of use only in that section. We wish to give particular attention to one peculiarity of our assumptions. We make an assumption about a parameter m ≥ 0 which controls the rate of growth of functions as |x| → ∞. The simple case m = 0 is not excluded from our assumptions, and in this case the functions in question satisfy a boundedness assumption. For a first reading of Chapter 4 we therefore recommend the reader assume that m = 0. Furthermore, it will be easier to comprehend the material of this chapter under the assumption that c^a(t,x) ≡ 0. Let

a(a,t,x) = (1/2) σ(a,t,x) σ*(a,t,x),
F(u_0,u_{ij},u_i,u,t,x) = sup_{a∈A} [ u_0 + Σ_{i,j=1}^d a^{ij}(a,t,x) u_{ij} + Σ_{i=1}^d b^i(a,t,x) u_i − c^a(t,x) u + f^a(t,x) ],

F_1(u_{ij},t,x) = sup_{a∈A} Σ_{i,j=1}^d a^{ij}(a,t,x) u_{ij},

F[u](t,x) = F(∂u/∂t (t,x), u_{x_ix_j}(t,x), u_{x_i}(t,x), u(t,x), t, x).
Note some properties of the quantities introduced. Since for each (t,x) the functions a(a,t,x), b(a,t,x), c^a(t,x), f^a(t,x) are uniformly bounded with respect to a (see (3.1.2) and (3.1.3)), the functions F, F_1 are finite. Since a(a,t,x), b(a,t,x), c^a(t,x), f^a(t,x) are continuous with respect to a ∈ A and since, in addition, the set A is separable, in determining F, F_1 one can take the upper bound with respect to any countable set everywhere dense in A. This implies, for example, that the functions F(u_0,u_{ij},u_i,u,t,x), F_1(u_{ij},t,x), F[u](t,x) are measurable with respect to their arguments. Furthermore, if in a region Q for each a ∈ A a function u(t,x) satisfies

L^a u + f^a ≤ 0    (1)

(a.e. on Q), one can remove a set Γ^a of measure zero from Q for each a ∈ A so that the expression L^a u + f^a does not exceed zero on the remaining set. The union of the Γ^a over a from any countable subset A' of the set A has measure zero; in addition, outside this union L^a u + f^a ≤ 0 on Q for each a ∈ A'. If we take in the last inequality the upper bound with respect to a ∈ A', and if we take the set A' to be everywhere dense in A, it turns out that F[u](t,x) ≤ 0 on Q outside the union of the Γ^a. In particular, F[u] ≤ 0 (Q-a.e.). This reasoning shows that if (1) is satisfied for each a ∈ A, then F[u] ≤ 0 (Q-a.e.). It is seen that the converse holds true as well. Finally, we mention that the function F_1 can be computed directly from the function F according to the following simple formula:
F_1(u_{ij},t,x) = lim_{r→∞} (1/r) F(u_0, r u_{ij}, u_i, u, t, x).
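This limit formula is easy to check numerically. In the sketch below, the space dimension is one (so u_{ij} is a single number) and the finite action set and coefficient functions are illustrative assumptions; the second-order term dominates all the bounded terms as r grows:

```python
# One-dimensional illustration with a finite action set (an assumption):
# F(u0, u11, u1, u, t, x) = sup_a [u0 + a(a,t,x) u11 + b(a,t,x) u1
#                                  - c^a(t,x) u + f^a(t,x)]
A = (-1.0, 0.0, 2.0)
a_ = lambda a, t, x: 0.5 * (1.0 + a * a)     # = sigma^2 / 2, always >= 0
b_ = lambda a, t, x: a * x
c_ = lambda a, t, x: 1.0 + abs(a)
f_ = lambda a, t, x: a

def F(u0, u11, u1, u, t, x):
    return max(u0 + a_(a, t, x) * u11 + b_(a, t, x) * u1
               - c_(a, t, x) * u + f_(a, t, x) for a in A)

def F1(u11, t, x):
    return max(a_(a, t, x) * u11 for a in A)

u0, u11, u1, u, t, x = 0.3, 2.0, -1.0, 0.5, 0.0, 1.0
for r in (1.0, 10.0, 1000.0):
    print(r, F(u0, r * u11, u1, u, t, x) / r)  # tends to F1(u11,t,x) = 5.0
```

The same computation also illustrates why the sup over a countable dense action set suffices: F is an upper envelope of functions continuous in a.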
1. Estimation of First Derivatives of Payoff Functions

In addition to the assumptions made in Section 3.1, we assume here that for all t ∈ [0,T], a ∈ A, R > 0, x, y ∈ S_R,

|c^a(t,x) − c^a(t,y)| + |f^a(t,x) − f^a(t,y)| + |g(x) − g(y)| + |g(t,x) − g(t,y)| ≤ K(1 + R)^m |x − y|.    (1)
1. Theorem. The functions v(s,x), w(s,x) have, for each s ∈ [0,T], first-order generalized derivatives with respect to x. Furthermore, there exists a constant N = N(K,m) such that for each s ∈ [0,T], for almost all x,

|grad_x v(s,x)| + |grad_x w(s,x)| ≤ N(1 + |x|)^{2m} e^{N(T−s)}.
PROOF. First, we prove the theorem under the assumption that for each t ∈ [0,T], α ∈ A the functions σ, b, c, f, g are once continuously differentiable in x. Then it follows from our assumptions (see (1) and (3.1.1)) that for l ∈ E_d

‖σ_{(l)}(α,t,x)‖ + |b_{(l)}(α,t,x)| ≤ K,
|c^α_{(l)}(t,x)| + |f^α_{(l)}(t,x)| + |g_{(l)}(x)| + |g_{(l)}(t,x)| ≤ K(1 + |x|)^m.

Relying upon the results obtained in Sections 2.7 and 2.8, we find that for any strategy α ∈ 𝔄, s ∈ [0,T] and τ ∈ 𝔐(T − s) the functions v^α(s,x) and v^{α,τ}(s,x) are continuously differentiable in x.
In order to estimate v^{α,τ}_{(l)}(s,x), we put y_t^{α,s,x} = 𝔏-(d/dl) x_t^{α,s,x}. We have a representation (2) for this derivative, in which the magnitude of the first term does not exceed

K M^α_{s,x}(1 + |x_t|)^m |y_t| ≤ K [M^α_{s,x}(1 + |x_t|)^{2m}]^{1/2} [M^α_{s,x} |y_t|²]^{1/2},

which in turn can be estimated by means of Corollary 2.5.12 and Theorem 2.8.8 on the estimation of the moments of x_t^{α,s,x}, y_t^{α,s,x}.
4 The Bellman Equation
In order to estimate the second term in (2), we apply the Cauchy inequality. The square of the second term is estimated in terms of the product of the quantity

M^α_{s,x}[f^α(s + t, x_t)]² ≤ K² M^α_{s,x}(1 + |x_t|)^{2m} ≤ N(1 + |x|)^{2m} e^{Nt}

and the quantity M^α_{s,x}|y_t|². Therefore, the magnitude of the second term in (2) does not exceed N(1 + |x|)^{2m} e^{Nt}. Estimating in a similar way the corresponding expression for v^α(s,x), which resembles rather closely the left side of (2), we finally find

|v^{α,τ}_{(l)}(s,x)| + |v^α_{(l)}(s,x)| ≤ N(1 + |x|)^{2m} e^{N(T−s)}.
Next, for |x|, |y| < R, according to the Lagrange theorem,

|w(s,x) − w(s,y)| ≤ N(1 + R)^{2m} |x − y| e^{N(T−s)}.      (3)
As in Section 2.1, a function satisfying a Lipschitz condition has generalized derivatives and, in addition, the gradient of such a function does not exceed the Lipschitz constant. Hence (3) implies the existence of first-order generalized derivatives of w(s,x) and, in addition, the inequality

|grad_x w(s,x)| ≤ N(1 + R)^{2m} e^{N(T−s)}

for almost all x ∈ S_R. The last inequality implies precisely that

|grad_x w(s,x)| ≤ N(1 + |x|)^{2m} e^{N(T−s)}

with the same constant N. The function v(s,x) can be considered in a similar manner. We have thus proved the theorem for smooth functions σ, b, c, f, g. For proving this theorem in the general case we make use of Theorem 3.1.14 and Corollary 3.1.13. We approximate the functions σ, b, c, f, g(x), g(t,x) using smooth functions σ_n, b_n, c_n, f_n, g_n(x), g_n(t,x), which we obtain from the initial functions by means of convolution with a function ε_n^{−d} ζ(ε_n^{−1} x)
(see Theorem 3.1.14). Let ε_n = 1/n. For x, y ∈ S_R one checks, for example, that c_n satisfies (1) with the same constants. It is seen from the above that σ_n, b_n, c_n, f_n, g_n(x), g_n(t,x) satisfy our assumptions with the same constants K, m. Hence, if we denote by w_n(s,x) the payoff function constructed on the basis of σ_n, b_n, c_n, f_n, g_n(t,x), then for |x|, |y| < R (see (3))

|w_n(s,x) − w_n(s,y)| ≤ N(K,m)(1 + R)^{2m} e^{N(K,m)(T−s)} |x − y|.
Taking the limit in the last inequality as n → ∞, we arrive at (3) in the general case using Corollary 3.1.13 and Theorem 3.1.14. As we have seen above, inequality (3) implies the assertions of the theorem for the function w(s,x). Similar reasoning is suitable for v(s,x), thus proving the theorem.
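The approximation step rests on a property of mollification that is easy to see numerically: convolving a Lipschitz function with a probability kernel does not increase its Lipschitz constant, since the convolution is an average of translates. Below is a grid-based sketch; the bump kernel and the function g = |x| are illustrative choices, not the ones in the text.

```python
import numpy as np

# g is Lipschitz with constant K = 1; the mollification g * zeta_eps keeps
# the same Lipschitz constant (the argument used for sigma_n, b_n, etc.).
K = 1.0
x = np.linspace(-5.0, 5.0, 2001)
h = x[1] - x[0]
g = np.abs(x)                           # |g(x) - g(y)| <= K |x - y|

eps = 0.5                               # mollification radius (illustrative)
inside = np.minimum((x / eps) ** 2, 1 - 1e-12)
k = np.where(np.abs(x) < eps, np.exp(-1.0 / (1 - inside)), 0.0)
k /= k.sum()                            # normalize to a probability kernel
g_eps = np.convolve(g, k, mode="same")  # discrete mollification

# measure the discrete Lipschitz constant away from the grid boundary
i = np.abs(x) < 3.0
lip = np.max(np.abs(np.diff(g_eps[i]))) / h
assert lip <= K + 1e-8
```

The same averaging argument gives |b_{n x^i}| ≤ K from |b_{x^i}| ≤ K in the proof of Theorem 4 below.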
2. Exercise. Using Bellman's principle, prove that for some constant N = N(K,m), for t ≥ s,

|v(t,x) − v(s,x)| ≤ N(1 + |x|)^{2m+1} e^{N(T−s)} (t − s)^{1/2}.
3. Remark. If c^α(t,x) ≡ 0, then (2) contains no second term, and the estimates above simplify accordingly.

It is not in general permissible to assert that Bellman's equation F[v] = 0 holds for a function v(t,x) having only first derivatives with respect to x. In fact, the equation F[v] = 0 involves, in addition to first derivatives with respect to x, second derivatives as well as a derivative with respect to t. It turns out that although the foregoing derivatives enter into the inequality F[v] ≤ 0, we can make this inequality meaningful by integration by parts.
4. Theorem. Let R > 0, and let η(t,x) be a nonnegative function which is infinitely differentiable on E_{d+1} and equal to zero outside [0,T] × S_R. Then for u(t,x) = v(t,x) and for u(t,x) = w(t,x), for each β ∈ A,

∫_{H_T} [u L^{β*} η + f^β η] dt dx ≤ 0,

where we assume for the sake of simplicity that a^{ij} = a^{ij}(β,t,x), etc. Note that the assertion of the theorem makes sense since the functions w_{x^i}, v_{x^i} exist, and σ(β,t,x), b(β,t,x) satisfy a Lipschitz condition with respect to x and, furthermore, even have bounded first generalized derivatives. The
first-order generalized derivatives of the function a(β,t,x) = ½σ(β,t,x)σ*(β,t,x) are bounded in each cylinder C_{T,R}. We first prove the theorem for differentiable σ, b. In the lemma which follows, assumption (1) will be absent.
5. Lemma. Let β ∈ A, and let the function a(β,t,x) be, for each t ∈ [0,T], twice continuously differentiable in x. Also, let b(β,t,x) be once continuously differentiable in x for each t ∈ [0,T]. Furthermore, let the corresponding derivatives of these functions be bounded in each cylinder C_{T,R}. Then for u(t,x) = v(t,x) and for u(t,x) = w(t,x)

∫_{H_T} [u L^{β*} η + f^β η] dt dx ≤ 0,

where η is a function having the same properties as in Theorem 4.

PROOF. We introduce a constant strategy β_t ≡ β. For λ ≥ 0 let

w^β_λ(s,x) = M^β_{s,x} [∫_0^{T−s} (f^β(s + t, x_t) + λ u(s + t, x_t)) e^{−φ_t − λt} dt] + M^β_{s,x} u(T, x_{T−s}) e^{−φ_{T−s} − λ(T−s)}.

If in Corollary 3.2.13 we take c̄^β(t,x) = c^β(t,x) + λ instead of c^β(t,x) and the function f̄^β(t,x) = f^β(t,x) + λ u(t,x) instead of f^β(t,x), and if, in addition, we replace u(x) by u(T,x), we obtain the corresponding integral inequality (4) for w^β_λ.
According to Bellman's principle (Theorems 3.1.6 and 3.1.11), w^β_λ ≤ u; therefore, 0 ≤ λη(u − w^β_λ) and
for each λ ≥ 0. Further, let us take the limit in (4) as λ → ∞. We note that, due to the estimates above, the summability of the last quantity and, in addition, the continuity of u(s + t, x_t^{β,s,x}) in t, the function

h_{s,x}(t) = M^β_{s,x} e^{−φ_t} u(s + t, x_t)

is a continuous function of t. Therefore, if ζ is a random variable having an exponential distribution with parameter equal to unity, then M h_{s,x}[(T − s) ∧ (ζ/λ)] → h_{s,x}(0) as λ → ∞. This implies precisely (5) for s ∈ [0,T] as λ → ∞. Furthermore, it follows from the estimates of moments of solutions of stochastic equations that |h_{s,x}(t)| ≤ N(1 + |x|)^m, where N does not depend on
s, x, t. Hence the left sides in (5) are bounded in C_{T,R} uniformly with respect to λ. From the estimates mentioned above and the inequality |f^β(t,x)| ≤ K(1 + |x|)^m it follows that for s ∈ [0,T]

|w^β_λ(s,x)| ≤ N(1 + |x|)^m,

where N does not depend on s, x, λ. Therefore, the totality of functions w^β_λ is bounded on [0,T] × S_R and w^β_λ → u as λ → ∞. Since in (4) we can take the integral over the set C_{T,R}, we can replace the function w^β_λ by u, letting λ → ∞. The lemma is proved.
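The randomization step in the proof — that M h[(T − s) ∧ (ζ/λ)] → h(0) as λ → ∞ for continuous bounded h and an Exp(1) variable ζ — can be checked by direct numerical integration against the density e^{−z}. The particular h below is an arbitrary illustrative choice.

```python
import numpy as np

# E h(T ∧ zeta/lam) -> h(0) as lam -> infinity, because zeta/lam -> 0
# in probability.  The expectation is a Riemann sum of e^{-z} h(min(T, z/lam)).
def expect(h, T, lam, n=200001, zmax=40.0):
    z = np.linspace(0.0, zmax, n)
    dz = z[1] - z[0]
    return float(np.sum(np.exp(-z) * h(np.minimum(T, z / lam))) * dz)

h = lambda t: np.cos(3.0 * t) + t ** 2        # an illustrative continuous h
T = 1.0
errs = [abs(expect(h, T, lam) - h(0.0)) for lam in (1.0, 10.0, 100.0, 1000.0)]
assert errs[-1] < 1e-2 and errs[-1] < errs[0]  # the error shrinks as lam grows
```

This is exactly why multiplying the discount by e^{−λt} concentrates the payoff at the initial time in the limit λ → ∞.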
PROOF OF THEOREM 4. As in proving Theorem 1, we approximate σ, b, c, f, g(x) by smooth functions σ_n, b_n, c_n, f_n, g_n(x), obtained by taking the convolution of σ, b, c, f, g(x) with the function n^d ζ(nx) (see Theorem 3.1.14). We denote by v_n the payoff function constructed on the basis of σ_n, b_n, c_n, f_n, g_n(x). According to Lemma 5, the corresponding inequality (6) holds,
where the operator L_n^{β*} is constructed in the usual way on the basis of a_n(β,t,x) = ½σ_n(β,t,x)σ_n*(β,t,x), b_n(β,t,x), c_n^β(t,x). Since the function v_n has a generalized derivative with respect to x, we have, integrating by parts in (6), the inequality (7).
Let us take the limit in (7) as n → ∞. According to Theorem 3.1.14 and Corollary 3.1.13, the functions v_n(t,x) converge to v(t,x) uniformly on [0,T] × S_R. As was indicated in Section 2.1, c_n^β(t,x) → c^β(t,x) for all t, x due to the continuity of c^β(t,x) in x, and b_{n x^i} → b_{x^i} for almost all (t,x) because the generalized derivative b_{x^i} exists. Furthermore, |b_{x^i}| ≤ K; hence |b_{n x^i}| = |b_{x^i} * n^d ζ(nx)| ≤ K. This reasoning shows that the right side of (7) tends to the corresponding expression for v
as n → ∞. Further, by Theorem 1,

|grad_x v_n(t,x)| ≤ N(1 + |x|)^{2m}
for t ∈ [0,T], x ∈ E_d, where N = N(K,T,m) does not depend on n. Then we obtain the inequality (8),
where N depends only on K, T, m, R. We know that a_n → a and a_{n(l)} → a_{(l)} almost everywhere. It can easily be seen as well that the totality of the foregoing derivatives is bounded on [0,T] × S_R. Hence the first term on the right side of (8) tends to zero as n → ∞. The second term tends to zero as well, since v_n → v in 𝔏₂(C_{T,R}) and, in addition, because the norms of grad_x v_n in 𝔏₂(C_{T,R}) are bounded and therefore (see Section 2.1) v_{n x^j} → v_{x^j} weakly in 𝔏₂(C_{T,R}). Therefore, the limit of the left side of (7) equals the required expression
as n → ∞. In a similar way this theorem can be proved for w(s,x), thus completing the proof of the theorem.

In order to derive two corollaries from the theorem proved above, we need two simple facts. If in a region Q ⊂ [0,T] × E_d the bounded functions φ(t,x), ψ(t,x) have bounded generalized derivatives φ_{x^i x^j}, φ_{x^i}, ψ_{x^j}, then for any η ∈ C₀^∞(Q)

∫_Q (φη)_{x^i} ψ_{x^j} dx dt = −∫_Q (φη)_{x^i x^j} ψ dx dt,      (9)

∫_Q (φη)_{x^i} ψ dx dt = −∫_Q φη ψ_{x^i} dx dt.      (10)
Equalities (9) and (10) can be proved in the same way; namely, we take a function η₁ ∈ C₀^∞(Q) equal to unity everywhere where η ≠ 0. Next, we replace ψ_{x^j} by (ψη₁)_{x^j} in (9) and, furthermore, replace ψ by the mean functions ψ^{(ε)} in both equalities. Since the products ψ^{(ε)}η₁, ψ^{(ε)}η ∈ C₀^∞(Q), by the definition of a generalized derivative we can shift the derivatives from ψ^{(ε)}η₁, ψ^{(ε)}η onto φ. Also, we pass to the limit as ε → 0 using the theorem on bounded convergence. Finally, the presence of η₁ obviously has no effect on the values of the resulting expressions, which allows us to remove η₁ altogether.

6. Corollary. Let a region Q ⊂ H_T, β ∈ A, and let a(β,t,x) as a function of the variables (t,x) have second generalized derivatives with respect to x. Then for each nonnegative function η ∈ C₀^∞(Q), for u = v and for u = w,

∫_Q u L^{β*} η dx dt ≤ −∫_Q f^β η dx dt.
We have made use of the preceding remarks in order to remove the derivatives from the function u. Using the same remarks and shifting the derivatives onto u whenever possible in the assertion of the theorem, we obtain the corresponding inequality with the derivatives on u. Taking arbitrary η ≥ 0, we arrive at the following assertion.
7. Corollary. Let a region Q ⊂ H_T. Furthermore, let the function v (respectively, w), as a function of (t,x), have two generalized derivatives in x and one derivative in t in the region Q. Then for any β ∈ A, almost everywhere on Q,

L^β v + f^β ≤ 0.

In other words, almost everywhere on Q, F[v] ≤ 0.
Lemma 5 (or Corollary 6) has a rather unusual application in proving assertions of the type of Theorems 2.3.3 and 2.3.4. We recall (see [40]) that, for a fixed λ > 0, a nonnegative infinitely differentiable function u(x) given on E_d is said to be λ-convex if the matrix (λu(x)δ_{ij} − u_{x^i x^j}(x)) is nonnegative definite for all x. According to Corollary 1 of Lemma 1 in [40], for each λ-convex function, |grad u(x)| ≤ √λ u(x) at each point x.
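The gradient bound for λ-convex functions can be checked on a one-dimensional borderline example: u(x) = cosh(√λ x) satisfies λu − u″ = 0 exactly, and its derivative obeys |u′| ≤ √λ u. This is only an illustration of the inequality, not its proof.

```python
import numpy as np

# u(x) = cosh(sqrt(lam) x): the borderline lam-convex case, lam*u - u'' = 0.
lam = 2.0
s = np.sqrt(lam)
x = np.linspace(-3.0, 3.0, 1001)
u = np.cosh(s * x)
du = s * np.sinh(s * x)            # u'
d2u = lam * np.cosh(s * x)         # u''

assert np.all(np.abs(du) <= s * u + 1e-12)   # |grad u| <= sqrt(lam) * u
assert np.all(lam * u - d2u >= -1e-9)        # lam-convexity (identically zero here)
```

Since |sinh| ≤ cosh pointwise, the gradient bound holds with equality only in the limit |x| → ∞.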
8. Theorem. Suppose that on a probability space measurable processes x_t, a_t, b_t, c_t, φ_t are defined for t ∈ [0,∞), with x_t ∈ E_d. We assume that a_t is a nonnegative definite matrix of dimension d × d, b_t is a d-dimensional vector, and c_t, φ_t are nonnegative numbers. Assume that there exists a constant λ > 0 such that c_t ≥ λ tr a_t + √λ |b_t| for all t, ω (this condition can be replaced by (12)). Finally, assume that for any smooth bounded function u(t,x) which decreases in t, is λ-convex with respect to x, and has bounded derivatives (∂/∂t)u, u_{x^i}, u_{x^i x^j} on [0,∞) × E_d, the inequality (11) is satisfied. Then for any nonnegative Borel function f(t,x)

M ∫_0^∞ e^{−φ_t} (det a_t)^{1/(d+1)} f(t, x_t) dt ≤ N(d,λ) ‖f‖_{d+1, [0,∞)×E_d}.

PROOF. First we note that

c_t u ≥ λu tr a_t + √λ u |b_t| ≥ λu tr a_t + (grad u, b_t).
Hence

c_t u − (grad u, b_t) − ∑_{i,j} a_t^{ij} u_{x^i x^j} ≥ tr[a_t (λu I − u_{xx})] ≥ 0,

since the trace of the product of nonnegative definite matrices is nonnegative; furthermore, by assumption, (∂/∂t)u ≤ 0. The above implies, in particular, that the right side of (11) is always defined. Next, we have from (11) that

M u(0, x_0) ≥ −M ∫_0^∞ e^{−φ_t} L_t u(t, x_t) dt,      (12)

where

L_t u(t,x) = (∂/∂t)u + ∑_{i,j} a_t^{ij} u_{x^i x^j} + (grad u, b_t) − c_t u.
We take a smooth function f(t,x) with compact support and some n > 0. Let A_n be the set of all matrices σ of dimension d × d such that tr σσ* ≤ 2n. For α ∈ A_n let σ(α,t,x) = α, b(α,t,x) = 0, c^α = λ tr a(α), f^α = (det a(α))^{1/(d+1)} f(t,x). We take as T a number such that f(t,x) = 0 for t ≥ T − 2. Let g(x) ≡ 0. On the basis of the quantities introduced, using some d-dimensional Wiener process, we define the payoff functions v_n(t,x). It is seen that v_n(t,x) increases as n increases. Let v(t,x) = lim_{n→∞} v_n(t,x). By Theorem 2.3.3, the estimate (13) holds for all x, n.
Therefore, we shall prove the theorem for the function f chosen if we show that

sup_{t≥0} sup_{x∈E_d} v(t,x) ≥ M ∫_0^∞ e^{−φ_t} (det a_t)^{1/(d+1)} f(t, x_t) dt.      (14)

It should be mentioned that, in accord with the results obtained in [54, Sections 1, 2], it suffices to prove the assertion of the theorem only for smooth nonnegative f(t,x) with compact support. Thus, it remains only to prove (14). Using Lemma 5, we take as η(t,x) the mean-function kernel centered at a point (s,y), which, as a function of (t,x), satisfies the conditions of Lemma 5 if s ∈ [ε, T − ε]. Noting in addition that in our case the coefficients of L^β do not depend on (t,x), we can easily find, using Lemma 5, for s ∈ [ε, T − ε], n > 0, that
∑_{i,j=1}^d a^{ij} v^{(ε)}_{n x^i x^j}(s,y) − λ v^{(ε)}_n(s,y) tr a + (∂/∂s) v^{(ε)}_n(s,y) + (det a)^{1/(d+1)} f^{(ε)}(s,y) ≤ 0      (15)
for all y ∈ E_d and for nonnegative symmetric matrices a such that tr a ≤ n. As can easily be seen, the functions v(t,x), v_n(t,x) are equal to zero for t ∈ [T − 2, T]. It is convenient to suppose that the functions v(t,x), v_n(t,x) are defined not only for t ∈ [0,T] but for t ≥ T as well, and, furthermore, that v(t,x) = v_n(t,x) = 0 for t ≥ T. Then f^{(ε)}(s,y) = v_n^{(ε)}(s,y) = 0 for s ∈ [T − 1, ∞) (ε < 1). Therefore, (15) holds not only for s ∈ [ε, T − ε] but for s ≥ T − ε as well. By virtue of (13), according to the Lebesgue theorem, v_n^{(ε)} → v^{(ε)}. Further, from (15) we have
∑_{i,j=1}^d a^{ij} v^{(ε)}_{x^i x^j}(s,y) − λ v^{(ε)}(s,y) tr a + (∂/∂s) v^{(ε)}(s,y) + (det a)^{1/(d+1)} f^{(ε)}(s,y) ≤ 0      (16)
for s ≥ ε and y ∈ E_d for any nonnegative symmetric matrix a. If in (16) we take a^{ij} = n l^i l^j, divide both sides of (16) by n, and, finally, let n → ∞, we obtain

v^{(ε)}_{(l)(l)}(s,y) − λ v^{(ε)}(s,y) |l|² ≤ 0.

In short, the matrix (λ v^{(ε)} δ_{ij} − v^{(ε)}_{x^i x^j}) ≥ 0 and, furthermore, the function v^{(ε)} is λ-convex with respect to y. Taking a = 0, we obtain from (16) that v^{(ε)} decreases in s. From (16) and (12) for u(t,x) = v^{(ε)}(ε + t, x) we find the required bound. It remains to note that the left side of the last inequality obviously does not exceed the left side of (14). Also, we need to let ε ↓ 0 and to obtain the right side of (14) from the right side of the inequality given above, using Fatou's lemma as well as the fact that f^{(ε)}(t,x) → f(t,x) uniformly in (t,x). We have thus proved the theorem.
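The proof twice uses the elementary fact that the trace of the product of nonnegative definite symmetric matrices is nonnegative (tr AB = tr A^{1/2} B A^{1/2} ≥ 0). A quick numerical spot-check, with randomly generated matrices standing in for a_t and λuI − u_{xx}:

```python
import numpy as np

# For nonnegative definite symmetric A, B we always have tr(A B) >= 0,
# since A B is similar to the nonnegative definite matrix A^{1/2} B A^{1/2}.
rng = np.random.default_rng(0)
for _ in range(100):
    X = rng.standard_normal((4, 4))
    Y = rng.standard_normal((4, 4))
    A = X @ X.T          # nonnegative definite symmetric
    B = Y @ Y.T
    assert np.trace(A @ B) >= -1e-9
```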
2. Estimation from Below of Second Derivatives of a Payoff Function

In this section we shall estimate from below second derivatives of a payoff function. Also, using the estimates obtained, we shall write the inequality F[v] ≤ 0 in a local form (see Lemma 1.5 or Corollary 1.6). Assume that the conditions given in Section 3.1 are satisfied. We also assume that the functions σ(α,t,x), b(α,t,x), c^α(t,x), f^α(t,x), g(x), g(t,x) are, for each α ∈ A, t ∈ [0,T], twice continuously differentiable in x. For all α ∈ A, t ∈ [0,T], x ∈ E_d, l ∈ E_d, let

‖σ_{(l)}(α,t,x)‖ + |b_{(l)}(α,t,x)| ≤ K,
‖σ_{(l)(l)}(α,t,x)‖ + |b_{(l)(l)}(α,t,x)| ≤ K(1 + |x|)^m,
and for u^α(t,x) ≡ g(x), u^α(t,x) ≡ g(t,x) let

|u^α_{(l)}(t,x)| + |u^α_{(l)(l)}(t,x)| ≤ K(1 + |x|)^m.
We prove under the foregoing assumptions that the second-order generalized derivatives of the payoff functions v(s,x), w(s,x) with respect to x, and the first-order derivatives of these functions with respect to s, are countably additive functions of sets (see Definition 2.1.2). As was done in Section 1, we rely here upon estimates of derivatives of the functions v^α(s,x), v^{α,τ}(s,x), which, according to the results obtained in Sections 2.7 and 2.8, are twice continuously differentiable in x. If we take l ∈ E_d, α ∈ 𝔄, τ ∈ 𝔐(T − s) and write out the derivative v^{α,τ}_{(l)(l)}(s,x) using the rules for differentiating mathematical expectations, integrals, and composite functions, we obtain a rather cumbersome expression. In order to simplify it, we introduce the following notation:
ξ^{α,s,x}_t = ∫_t^τ e^{−(φ_r − φ_t)} f^α(s + r, x_r) dr + e^{−(φ_τ − φ_t)} g(s + τ, x_τ),

where x_r = x^{α,s,x}_r, y_r = y^{α,s,x}_r, z_r = z^{α,s,x}_r, φ_r = φ^{α,s,x}_r. As can easily be seen, for each ω and for almost all t ∈ [0,τ]
Differentiating the last expression, we find
Further, noting that ξ^{α,s,x}_τ = g(s + τ, x_τ), we conclude that (4) holds for t ∈ [0,τ], which constitutes an equation with respect to ξ^{α,s,x}_t for t ∈ [0,τ]. Since the transformations we have carried out are reversible, ξ^{α,s,x}_t is the unique solution of (4) on [0,τ]. Further, it is not hard to see that the process ξ^{α,s,x}_{t∧τ} is 𝔏B-differentiable in x. According to the well-known rule of operations with derivatives, it follows from (4) that for almost all ω, for t ∈ [0,τ],
the 𝔏B-derivative satisfies the corresponding equation (5), whose final term has the form

⋯ − c^α(s + r, x_r) 𝔏B-ξ^{α,s,x}_{r∧τ}) dr.

It is convenient to regard (5) as an equation with respect to the 𝔏B-derivative. Comparing (1) and (4) with (2) and (5), one easily sees that ξ^α_{(l),s,x}(t) satisfies Eq. (5) for t ∈ [0,τ]. Since the solution of Eq. (5) is unique for the same reasons the solution of Eq. (4) is unique, for almost all ω, for t ∈ [0,τ],

𝔏B-(∂/∂l) ξ^{α,s,x}_{t∧τ} = ξ^α_{(l),s,x}(t).      (6)
Differentiating (5) with respect to l and, furthermore, regarding the relation thus obtained as an equation for 𝔏B-(∂²/∂l²)ξ^{α,s,x}_{t∧τ}, we prove in a similar way that for almost all ω, for t ∈ [0,τ],

𝔏B-(∂²/∂l²) ξ^{α,s,x}_{t∧τ} = ξ^α_{(l)(l),s,x}(t).      (7)
1. Exercise. Prove (6) and (7) by direct differentiation of (1).
We can deduce from (6) and (7) the estimates of ξ^α_{(l),s,x} and ξ^α_{(l)(l),s,x}. First, from our assumptions, for t ≤ τ,

|ξ^{α,s,x}_t| ≤ K(T − s + 1) sup_{r ≤ T−s} (1 + |x^{α,s,x}_r|)^m.

Further,

|c^α_{(l)}(s + r, x_r) ξ^{α,s,x}_r| ≤ K²(T − s + 1) sup_{r ≤ T−s} (1 + |x^{α,s,x}_r|)^{2m}.
Hence from (6) and (2) we obtain (8), where N₁ = K(T − s + 1) + K²(T − s + 1)². Similarly, we obtain an estimate (9) whose right side contains in addition the term

N₁ sup_{t ≤ T−s} (1 + |x^{α,s,x}_t|)^{2m} · sup_{t ≤ T−s} |z^{α,s,x}_t|,

where N₂ = N₁ + 2N₁K(T − s). We also make use of the estimates of moments of derivatives of solutions of stochastic equations given in Theorem 2.8.8. In this case we obtain, for example, the corresponding bound
where N = N(K,m). Estimating the other expressions in the right sides of (8) and (9) in a similar way, and, in addition, noting that

v^{α,τ}(s,x) = M^α_{s,x} ξ^{α,s,x}_0,   v^{α,τ}_{(l)}(s,x) = M^α_{s,x} ξ^α_{(l),s,x}(0),   v^{α,τ}_{(l)(l)}(s,x) = M^α_{s,x} ξ^α_{(l)(l),s,x}(0),

we prove that the assertion which follows holds for v^{α,τ}(s,x).
2. Lemma. For each s ∈ [0,T], α ∈ 𝔄 and τ ∈ 𝔐(T − s), the functions v^α(s,x) and v^{α,τ}(s,x) have second continuous derivatives with respect to x. There exists a constant N = N(K,m) such that for all l ∈ E_d

|v^α_{(l)(l)}(s,x)| + |v^{α,τ}_{(l)(l)}(s,x)| ≤ N e^{N(T−s)} (1 + |x|)^{3m},
|v^α_{(l)}(s,x)| + |v^{α,τ}_{(l)}(s,x)| ≤ N e^{N(T−s)} (1 + |x|)^{2m},
|v^α(s,x)| + |v^{α,τ}(s,x)| ≤ N e^{N(T−s)} (1 + |x|)^m.
The proof of this lemma for v^α(s,x) is the same as that for v^{α,τ}(s,x); however, we need to take τ = T − s and, furthermore, replace g(t,x) by g(x).
3. Theorem. There exists a constant N = N(K,m) such that for each s ∈ [0,T] the functions

v(s,x) + N e^{N(T−s)} (1 + |x|²)^{(3m/2)+1},   w(s,x) + N e^{N(T−s)} (1 + |x|²)^{(3m/2)+1}

are convex downward with respect to x.
PROOF. Let l ≠ 0. Simple calculations show that

[(1 + |x|²)^{(3m/2)+1}]_{(l)(l)} ≥ 2^{−3m/2} (1 + |x|)^{3m} |l|².      (10)

We take N from Lemma 2. Also, let N₁ = 2^{3m/2} N. By virtue of (10) and Lemma 2, the second directional derivative of each of the functions v^α(s,x) + N₁ e^{N₁(T−s)}(1 + |x|²)^{(3m/2)+1} is nonnegative for any α ∈ 𝔄, s ∈ [0,T]. Therefore, these functions are convex downward. A fortiori, their upper bound with respect to α is convex downward:

v(s,x) + N₁ e^{N₁(T−s)} (1 + |x|²)^{(3m/2)+1}.
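The last step uses a general fact: a pointwise supremum of convex functions is convex. A small numerical sketch, with arbitrary illustrative quadratics playing the role of the family indexed by α:

```python
import numpy as np

# The pointwise maximum of convex functions is convex: check the midpoint
# inequality for a max of random convex quadratics a x^2 + b x + c, a >= 0.
rng = np.random.default_rng(1)
coefs = rng.standard_normal((5, 3))
coefs[:, 0] = np.abs(coefs[:, 0])        # force each quadratic to be convex

def v(x):
    x = np.asarray(x, dtype=float)
    return np.max(coefs[:, 0, None] * x**2 + coefs[:, 1, None] * x
                  + coefs[:, 2, None], axis=0)

xs = np.linspace(-2.0, 2.0, 401)
mid = v((xs[:-2] + xs[2:]) / 2)          # values at midpoints of the grid
assert np.all(mid <= (v(xs[:-2]) + v(xs[2:])) / 2 + 1e-12)
```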
We consider w(s,x) in a similar way, thus completing the proof of the theorem.

4. Corollary. Each of the functions v and w is representable as the difference between two functions convex downward with respect to x. The subtrahend can be taken equal to N e^{N(T−s)} (1 + |x|²)^{(3m/2)+1} for a constant N = N(K,m).

In fact, let us, for example, take N given in Theorem 3 and write

v = [v + N e^{N(T−s)}(1 + |x|²)^{(3m/2)+1}] − N e^{N(T−s)}(1 + |x|²)^{(3m/2)+1}.

In the lemma which follows we list general properties of differences of convex functions.
5. Lemma. Suppose that in a convex region Q ⊂ H_T we are given a function u(s,x) = u₁(s,x) − u₂(s,x), in which u₁ and u₂ are defined, measurable, locally bounded in Q and, furthermore, convex downward with respect to x in Q_s = {x: (s,x) ∈ Q} for each s. Then, for each l₁, l₂ ∈ E_d, in the region Q there exist derivatives u_{(l₁)(l₂)}(s,x)(ds dx) (see Definition 2.1.2), and inequality (11) holds inside Q. In addition, if the bounded function η(s,x) is measurable with respect to s, twice continuously differentiable with respect to x for each s and, finally, equal to zero outside some compact set lying in Q, then for any l₁, l₂ ∈ E_d

∫_Q η u_{(l₁)(l₂)}(ds dx) = ∫_Q u η_{(l₁)(l₂)} ds dx.
PROOF. It is easy to obtain from the equality u = u₁ − u₂ and the properties of derivatives of u₁, u₂ analogous properties of derivatives of u. We shall
see below that second derivatives along the l₁ direction of a function which is convex downward with respect to x are nonnegative. Hence inequality (11) readily follows from the equality u = u₁ − u₂. Therefore, it suffices to prove the lemma for functions u which are convex downward with respect to x (u = u₁, u₂ = 0). Further, it obviously suffices to prove that the assertions of the lemma hold in any bounded region Q′ with Q̄′ ⊂ Q. Note that by hypothesis the function u is bounded in any such region Q′. Then, in proving the lemma it is possible to assume that the region Q is bounded and that the function u is convex downward with respect to x and bounded in Q. It will be convenient to assume as well that the function u is extended in some way outside Q. Let us take a unit vector l ∈ E_d, s ∈ (0,T), and a nonnegative η ∈ C₀^∞(Q_s). Furthermore, for real r we introduce an operator Δ_r^l using the formula

Δ_r^l ζ(s,x) = ζ(s, x + rl) − 2ζ(s,x) + ζ(s, x − rl).
Integrating by parts, we easily prove the corresponding identity, from which we have, by the mean value theorem, Δ_r^l η(x) = r² η_{(l)(l)}(x + θrl), where |θ| < 1. In particular, for |r| ≤ 1 the collection of functions (1/r²) Δ_r^l η(x) is bounded, and as r → 0 it converges to η_{(l)(l)}(x). Therefore, (12) follows. The function η is equal to zero near the boundary of Q_s. Hence, for sufficiently small r, the function Δ_r^l η(x) is equal to zero near the boundary of Q_s, and the last integral in (12) can be extended to E_d. Further, if we write the integral of u Δ_r^l η as the sum of three integrals and make in these integrals changes of variables of the type y = x ± rl, we easily obtain that for sufficiently small r
∫_{Q_s} u(s,x) Δ_r^l η(x) dx = ∫_{E_d} η(x) Δ_r^l u(s,x) dx = ∫_{Q_s} η(x) Δ_r^l u(s,x) dx.
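The two properties of the operator Δ_r^l used here are easy to see numerically: (1/r²)Δ_r η → η″ for smooth η, and Δ_r u ≥ 0 for convex u even when u is not differentiable. A one-dimensional sketch with illustrative test functions:

```python
import numpy as np

# (Delta_r f)(x) = f(x + r) - 2 f(x) + f(x - r)
def second_diff(f, x, r):
    return f(x + r) - 2.0 * f(x) + f(x - r)

# smooth case: (1/r^2) Delta_r sin(x) -> -sin(x) as r -> 0
x0 = 0.7
for r in (1e-2, 1e-3):
    assert abs(second_diff(np.sin, x0, r) / r**2 - (-np.sin(x0))) < 10 * r

# convex (non-smooth) case: second differences of |x| are nonnegative
xs = np.linspace(-1.0, 1.0, 201)
assert np.all(second_diff(np.abs, xs, 0.3) >= -1e-12)
```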
Because u is convex, Δ_r^l u(s,x) ≥ 0 if the distance between x and ∂Q_s is larger than |r|. Furthermore, η ≥ 0. Therefore, for small r

∫_{Q_s} η(x) Δ_r^l u(s,x) dx ≥ 0.

By virtue of (12),

∫_{Q_s} u(s,x) η_{(l)(l)}(x) dx ≥ 0.

Similarly, for any nonnegative function η ∈ C₀^∞(Q) we have

∫_Q u η_{(l)(l)} ds dx ≥ 0.
By Lemma 2.1.3 the inequalities proved above imply the existence for each s of the measure u_{(l)(l)}(s,x)(dx) on Q_s, as well as the existence of the measure u_{(l)(l)}(s,x)(ds dx) on Q. By Fubini's theorem, for η ∈ C₀^∞(Q)

∫_Q η u_{(l)(l)}(ds dx) = ∫_Q u η_{(l)(l)} ds dx.

This equality, proved for η ∈ C₀^∞(Q), can be extended to all nonnegative Borel functions η by the usual arguments from measure theory. Further, by definition,

∫_{Q_s} u(s,x) η_{(l)(l)}(x) dx = ∫_{Q_s} η(x) u_{(l)(l)}(s,x)(dx)      (13)
for all η ∈ C₀^∞(Q_s). Approximating a function η ∈ C²(Q_s), which is equal to zero near ∂Q_s, uniformly in Q_s along with its second derivatives by functions from C₀^∞(Q_s), we see that Eq. (13) holds for functions η ∈ C²(Q_s) as well. Next, if the nonnegative function η is the one taken from the formulation of the lemma, then
∫_Q u η_{(l)(l)} ds dx = ∫_0^T ds [∫_{Q_s} u(s,x) η_{(l)(l)}(s,x) dx]
= ∫_0^T ds [∫_{Q_s} η(s,x) u_{(l)(l)}(s,x)(dx)] = ∫_Q η u_{(l)(l)}(ds dx).
It remains only to recall (see Section 2.1) the simple relations between u_{(l₁)(l₂)} and u_{(l₁+l₂)(l₁+l₂)}, u_{(l₁−l₂)(l₁−l₂)}, and also to represent a bounded function η which satisfies the conditions of the lemma as the difference between two nonnegative functions which also satisfy the conditions of the lemma. The lemma is proved.
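The relations alluded to are the polarization identity for second directional derivatives: for smooth u, u_{(l₁)(l₂)} = [u_{(l₁+l₂)(l₁+l₂)} − u_{(l₁−l₂)(l₁−l₂)}]/4, which recovers mixed derivatives from pure ones. A finite-difference sketch on an illustrative smooth function:

```python
import numpy as np

# second directional (finite-difference) derivative of u at p along l
def dir2(u, p, l, h=1e-4):
    l = np.asarray(l, dtype=float)
    return (u(p + h * l) - 2.0 * u(p) + u(p - h * l)) / h**2

u = lambda p: p[0] ** 2 * p[1] + np.sin(p[0] * p[1])   # illustrative test function
p = np.array([0.4, -0.8])
l1 = np.array([1.0, 0.0])
l2 = np.array([0.0, 1.0])

# polarization: u_{(l1)(l2)} = ( u_{(l1+l2)(l1+l2)} - u_{(l1-l2)(l1-l2)} ) / 4
mixed = (dir2(u, p, l1 + l2) - dir2(u, p, l1 - l2)) / 4.0
exact = 2 * p[0] + np.cos(p[0] * p[1]) - p[0] * p[1] * np.sin(p[0] * p[1])  # u_xy
assert abs(mixed - exact) < 1e-5
```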
6. Theorem. For any l₁, l₂ ∈ E_d, inside H_T there exist generalized derivatives v_{(l₁)(l₂)}(s,x)(ds dx), w_{(l₁)(l₂)}(s,x)(ds dx) (see Definition 2.1.2). There exists a constant N = N(K,m) such that for each unit l ∈ E_d, inside H_T, for u = v and for u = w,

u_{(l)(l)}(s,x)(ds dx) ≥ −N e^{N(T−s)} (1 + |x|)^{3m} ds dx.      (14)

PROOF. From Lemma 5 and Corollary 4 follow the existence of the derivatives as well as the fact that

u_{(l)(l)}(s,x)(ds dx) ≥ −N e^{N(T−s)} [(1 + |x|²)^{(3m/2)+1}]_{(l)(l)} ds dx.      (15)
Using the equality in (10), we find that

[(1 + |x|²)^{(3m/2)+1}]_{(l)(l)} ≤ (3m + 2)(3m + 1)(1 + |x|)^{3m},

from which and from (15) we obtain (14), thus completing the proof of the theorem. Further, we can write the integro-differential inequalities given in Section 1 in a local form.
7. Theorem. In H_T there exist generalized derivatives

v_{x^i x^j}(ds dx),   w_{x^i x^j}(ds dx),   (∂/∂s) v(ds dx),   (∂/∂s) w(ds dx).

Furthermore, for u = v and for u = w, inside H_T, for all β ∈ A,

−L^β u(s,x)(ds dx) − f^β(s,x) ds dx ≥ 0.
In other words, [−L^β u(s,x)(ds dx) − f^β(s,x) ds dx] is a (positive) measure.

PROOF. The existence of the derivatives v_{x^i} and w_{x^i} was proved in Section 1. We have proved the existence of v_{x^i x^j}(ds dx) and w_{x^i x^j}(ds dx) in the preceding theorem. By Lemma 1.5, for β ∈ A and for any nonnegative η ∈ C₀^∞(H_T)

∫_{H_T} [v L^{β*} η + f^β η] ds dx ≤ 0.
By Lemma 5, it follows from the last inequality that

−∫_{H_T} v (∂/∂s)η ds dx ≤ −∫_{H_T} η ν^β(ds dx),

where

ν^β(ds dx) = ∑_{i,j} a^{ij}(β,s,x) v_{x^i x^j}(ds dx) + ∑_i b^i(β,s,x) v_{x^i}(s,x) ds dx − c^β(s,x) v(s,x) ds dx + f^β(s,x) ds dx.

By Lemma 2.1.3, the above enables us to conclude that the derivative (∂/∂s) v(s,x)(ds dx) exists and that it does not exceed [−ν^β(ds dx)] inside H_T. Hence the theorem is proved for the function v. We can prove the theorem for the function w in a similar way, thus completing the proof of our theorem.
8. Remark. In proving Theorem 7 we used no assumptions about nondegeneracy of the controlled process. In particular, all the assertions made in this section hold in the case where σ(α,t,x) ≡ 0.
3. Estimation from Above of Second Derivatives of a Payoff Function

Inequalities of the form

L^α v(ds dx) + f^α ds dx ≤ 0      (1)

(see Theorem 2.7) enable us to estimate from above the second derivatives v_{(l)(l)}(ds dx). Such an estimation amounts to keeping the derivative v_{(l)(l)}(ds dx) on the left side of (1) while carrying all the remaining expressions over to the right side of (1). It is necessary that the derivative v_{(l)(l)}(ds dx) be "actually" present in some inequality of type (1), or that the derivative u_{(l)(l)}(s,x) "actually" belong to the operator F[u]. We assume that, in addition to the assumptions made in this chapter, the assumptions made in Section 2 concerning the derivatives of σ, b, c, f, g(x), g(t,x) are satisfied. For t ∈ [0,T], x ∈ E_d, α ∈ A, l ≠ 0 let

μ(l) = μ(t,x,l) = inf_{λ:(λ,l)=1} sup_{α∈A} n^α(t,x)(a(α,t,x)λ,λ),   Q(l) = {(t,x) ∈ H_T: μ(t,x,l) > 0}.

We note that, due to the continuity of a(α,t,x), n^α with respect to α and due to the separability of A, we can, in determining μ(t,x,l), compute the upper bound on the basis of a countable subset of A. Therefore, the upper bound mentioned is measurable with respect to (t,x). Furthermore, this upper bound is continuous with respect to λ; therefore μ(t,x,l) is measurable with respect to (t,x). In particular, Q(l) is a Borel set. Further, we introduced the function n^α(t,x) into the formula which gives μ(t,x,l) for the sake of convenience. Since for each (t,x) the functions n^α(t,x), [n^α(t,x)]^{−1} are bounded from above on the set A, μ(t,x,l) > 0 if and only if

inf_{λ:(λ,l)=1} sup_{α∈A} (a(α,t,x)λ,λ) > 0,

in other words, if

inf_{λ:(λ,l)=1} F₁(λ^i λ^j, t, x) > 0.

Therefore,

Q(l) = {(t,x) ∈ H_T: inf_{λ:(λ,l)=1} F₁(λ^i λ^j, t, x) > 0}.
Next, we shall explain in what sense Q(l) is a set on which the derivative u_{(l)(l)} actually belongs to the operators F[u], F₁[u]. Let a point (t₀,x₀) ∉ Q(l). Then, as can easily be seen, there is a vector λ₀ such that (λ₀,l) = 1 and F₁(λ₀^i λ₀^j, t₀, x₀) = 0. We can assume without loss of generality that the direction of λ₀ coincides with that of the first coordinate vector. Then a^{11}(α,t₀,x₀) = 0 for all α ∈ A. The nonnegative definiteness of the matrices a(α,t,x) implies that a^{1i}(α,t₀,x₀) = a^{i1}(α,t₀,x₀) = 0 for all α ∈ A, i = 1, …, d. Therefore, for computing F₁[u](t₀,x₀) we need to know the derivatives u_{x^i x^j} only for i, j ≥ 2. At the same time, it is impossible to express u_{(l)(l)} in terms of the derivatives u_{x^i x^j} mentioned, since l¹ ≠ 0. For example, the functions u and u + (x¹)² have identical derivatives with respect to x^i x^j (i, j ≥ 2); however, their derivatives with respect to (l)(l) are distinct, while the operators F₁ applied to u and u + (x¹)² coincide at the point (t₀,x₀). An arbitrary variation of u_{(l)(l)} thus has no effect on the value of F₁[u]. In the same sense, the derivatives u_{x¹x¹}, u_{x²x²} do not belong to the operator L given by

Lu = u_{x¹x¹} + 2u_{x¹x²} + u_{x²x²}.

In fact, setting l̄ = (1,1), we easily see that Lu = u_{(l̄)(l̄)}. It is impossible, however, to express either u_{x¹x¹} or u_{x²x²} in terms of u_{(l̄)(l̄)}. The reader will understand how an estimate of the type v_{(l)(l)}(ds dx) ≤ ψ ds dx depends on the equality μ(t,x,l) = 0 from the following exercise.
1. Exercise. Let d = d₁ = 2, T = 1. Prove that μ(l) = 0 for l ⊥ l̄, where l̄ = (1,1); that μ(l̄) > 0; that v_{(l)(l)} = 0 for such l; that the function v(s,x) is a smooth function of s; and that v_{(l̄)(l̄)}(Γ) > 0, where Γ = [0,1] × {x: x¹ = x²}. Note that ∫_Γ ds dx = 0.
Thus, if μ(l) = 0, we need not know the value of u_{(l)(l)} in order to compute F[u]. The derivative v_{(l)(l)}(ds dx) is not in general absolutely continuous with respect to Lebesgue measure; that is, the generalized derivative v_{(l)(l)}(s,x) does not exist. We shall see below (see Theorem 5) that if μ(t,x,l) > 0 in a region Q, the generalized derivative v_{(l)(l)}(s,x) does exist in Q.

2. Lemma. Let u = (u^{ij}) be a matrix of dimension d × d, and let ψ be a number. Assume that (uλ,λ) ≥ ψ|λ|² for all λ ∈ E_d. Then for all (t,x) ∈ H_T and for all unit l

μ(t,x,l)(ul,l) ≤ sup_{α∈A} n^α(t,x) tr a(α,t,x)u + ψ_−.
PROOF. We fix t, x. Furthermore, we denote by Γ the smallest closed convex set of matrices of dimension d × d which contains all the matrices n^α(t,x) a(α,t,x) (α ∈ A). We can obtain Γ, for example, as the closure of the set of convex combinations of these matrices, which implies that the set Γ is bounded and, in addition, that

max_{a∈Γ} (aλ,λ) = sup_{α∈A} n^α(t,x)(a(α,t,x)λ,λ),   max_{a∈Γ} tr au = sup_{α∈A} n^α(t,x) tr a(α,t,x)u.      (2)
Let us prove that

μ(t,x,l) = inf_{λ:(λ,l)=1} max_{a∈Γ} (aλ,λ) = max_{a∈Γ} inf_{λ:(λ,l)=1} (aλ,λ).      (3)

The first equality in (3) follows from the first equality in (2). In order to prove the second equality in (3), we apply the main theorem of game theory. Let R > 0. The function (aλ,λ) is given on Γ × {λ: (λ,l) = 1, |λ| ≤ R}; it is convex upward (linear) with respect to a and convex downward with respect to λ because the matrices in Γ are nonnegative definite. Furthermore, the sets Γ and {λ: (λ,l) = 1, |λ| ≤ R} are convex, bounded, and closed. Therefore, for each R > 0

μ_R(l) ≡ min_{λ:(λ,l)=1, |λ|≤R} max_{a∈Γ} (aλ,λ) = max_{a∈Γ} min_{λ:(λ,l)=1, |λ|≤R} (aλ,λ),

where the second equality implies that there exists a matrix a_R ∈ Γ such that (a_R λ, λ) ≥ μ_R(l) if (λ,l) = 1, |λ| ≤ R. Letting R → ∞ and, in addition, taking a convergent sequence of the matrices a_R, we find a matrix ā ∈ Γ such that (āλ,λ) ≥ lim_{R→∞} μ_R(l) if (λ,l) = 1. The definition of μ_R(l) shows that lim_{R→∞} μ_R(l) = μ(l). Therefore,

inf_{λ:(λ,l)=1} (āλ,λ) ≥ μ(l),   μ(l) ≤ inf_{λ:(λ,l)=1} (āλ,λ) ≤ sup_{a∈Γ} inf_{λ:(λ,l)=1} (aλ,λ).
On the other hand,

sup_{a∈Γ} inf_{λ:(λ,l)=1} (aλ,λ) ≤ sup_{a∈Γ} inf_{λ:(λ,l)=1, |λ|≤R} (aλ,λ) = μ_R(l) → μ(l),

which completes the proof of the second equality in (3). Further, let μ(a,l) = inf_{λ:(λ,l)=1} (aλ,λ). By definition, μ(a,l) ≤ (aλ,λ) if (λ,l) = 1. Using in the last inequality the expression λ(λ,l)^{−1} instead of λ, we find μ(a,l)(λ,l)² ≤ (aλ,λ) for (λ,l) ≠ 0. Certainly μ(a,l)(λ,l)² ≤ (aλ,λ) for (λ,l) = 0 as well, which implies that the matrix a − μ(a,l)(l^i l^j) ≥ 0 for each a ∈ Γ. The matrix ½(u + u*) − ψI, where I denotes the unit matrix of dimension d × d, is nonnegative definite as well, by assumption. The trace of the product of nonnegative definite symmetric matrices is nonnegative. Hence for each a ∈ Γ

0 ≤ tr[a − μ(a,l)(l^i l^j)][½(u + u*) − ψI] = tr[a − μ(a,l)(l^i l^j)]u − ψ[tr a − μ(a,l)] ≤ tr au − μ(a,l)(ul,l) + ψ_−.
The last inequality holds due to the fact that na(t,x)tr a(a,t,x) I 1, 1 2 tr a 2 p(a,l) for a E r. Finally, we have ,u(aJ)(ul,l)I tr au + I,-, where it remains only to take upper bounds with respect to a E T and, in addition, to make use of (3) and (2). The lemma is proved. We introduce some additional notations. If v is a o-additive function of sets, Ivl is the variation of v; furthermore, v- = i(lvl - v) is the negative part of the measure v. It is a well-known fact that if v is absolutely continuous with respect to the measure v, and, in addition, v(dx)/v,(dx) = f(x), the measures Ivl and v - are absolutely continuous as well with respect to v, and -
--
v I@$
I f(x)(,
v - (dx)
-v, (dx) - f-(x)'
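In the discrete case these relations are immediate; the following sketch (with made-up values) checks them for a signed measure on five points, where ν₁ is the reference measure and f the density of ν with respect to ν₁.

```python
import numpy as np

# Discrete illustration of |nu| and nu^- = (|nu| - nu)/2:
# nu1 is a nonnegative reference measure on 5 points and
# f is the density d(nu)/d(nu1); all values are invented.
nu1 = np.array([0.5, 1.0, 2.0, 0.25, 1.25])
f   = np.array([2.0, -1.0, 0.5, -4.0, 0.0])

nu = f * nu1                       # the signed measure nu(dx) = f(x) nu1(dx)
total_var = np.abs(nu)             # |nu|, with density |f|
neg_part = 0.5 * (total_var - nu)  # nu^-, with density f^- = max(-f, 0)

assert np.allclose(total_var, np.abs(f) * nu1)
assert np.allclose(neg_part, np.maximum(-f, 0.0) * nu1)
```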
The next theorem incorporates the objects whose existence was proved in Sections 1 and 2 (in particular, see Theorem 2.7).

3. Theorem. Let u₁ = v, u₂ = w. Let measurable functions ψ₁, ψ₂ be such that ψᵢ dt dx ≤ u_{i(λ)(λ)}(dt dx) inside H_T for any λ ≠ 0, i = 1, 2. (By Theorem 2.6, we can take as ψᵢ the right side of inequality (2.14).) Then for all unit l and i = 1, 2 inside H_T

μ(l)u_{i(l)(l)}(dt dx) ≤ (ψᵢ)⁻ dt dx + ((∂/∂t)uᵢ)⁻(dt dx) + |grad_x uᵢ| dt dx + (uᵢ)⁺ dt dx + dt dx,  (4)

(∂/∂t)uᵢ(dt dx) ≤ inf_β ‖L^β‖[(ψᵢ)⁻ + |grad_x uᵢ| + (uᵢ)⁺ + 1] dt dx,  (5)

where

‖L^β‖ = ‖L^β(t,x)‖ = tr a(β,t,x) + |b(β,t,x)| + c^β(t,x) + f^β₋(t,x).
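The positive- and negative-part conventions entering (4) and (5) are pointwise operations, ψ⁻ = max(−ψ, 0) and u⁺ = max(u, 0); the right side of a bound of the form (5) can therefore be assembled on sampled data. The sketch below (arbitrary sample arrays, invented coefficient values, not the book's functions) only checks the elementary facts ψ = ψ⁺ − ψ⁻ and ψ⁻ ≥ 0 used throughout.

```python
import numpy as np

# Sampled values of a function psi on a grid (made-up numbers).
psi = np.array([1.5, -0.25, 0.0, -3.0, 2.0])

psi_plus = np.maximum(psi, 0.0)    # psi^+
psi_minus = np.maximum(-psi, 0.0)  # psi^-

assert np.allclose(psi, psi_plus - psi_minus)   # psi = psi^+ - psi^-
assert np.all(psi_minus >= 0.0) and np.all(psi_plus >= 0.0)
assert np.allclose(np.abs(psi), psi_plus + psi_minus)  # |psi|
```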
3. Estimation from Above of Second Derivatives of a Payoff Function
PROOF. We introduce the measure ν using the formula

All the set functions involved are absolutely continuous with respect to ν. We denote by ā^{ij}(t,x), D^{ij}(t,x), μ̄(t,x) the respective Radon–Nikodym derivatives with respect to the measure ν. According to Theorem 2.7, [−L^β v(dt dx) − f^β dt dx] is a measure which is, obviously, absolutely continuous with respect to the measure ν and whose Radon–Nikodym derivative with respect to ν is nonnegative. From this, for each β ∈ A we have

almost everywhere in H_T with respect to the measure ν. The set of (t,x) on which (6) is satisfied has full measure and, generally speaking, depends on β. Taking advantage of the separability of A as well as the continuity of a, b, c, and f with respect to β, we can easily prove that the set of those (t,x) for which (6) is satisfied for all β at the same time has full measure ν as well. Further, by assumption, for each unit λ ∈ E_d inside H_T

Therefore,

almost everywhere in H_T with respect to the measure ν. Due to the continuity of the last expression with respect to λ, the last inequality holds for all unit λ at the same time on some set of full measure ν. Therefore, the conditions of Lemma 2 are satisfied on the set mentioned for u_{ij} = ā_{ij}. Also, we note that from (6), on the set of full measure ν, for all β ∈ A we obtain
Applying Lemma 2, we conclude that almost everywhere in H_T with respect to the measure ν

Multiplying the last inequality by ν(dt dx), we arrive at (4) for i = 1. Next, let us prove (5) for i = 1. Noting that the matrix (ā^{ij} − μ̄δ^{ij}) ≥ 0 and, in addition, that the trace of the product of symmetric positive matrices is positive, we find

We have from (6) that almost everywhere in H_T with respect to the measure ν

Computing the lower bound of the right side of the last inequality over a countable set which is dense in A, using the continuity of ‖L^β‖ with respect to β, and, finally, multiplying the inequality thus obtained by ν(dt dx), we complete the proof of (5) for i = 1. Inequalities (4) and (5) for i = 2 are proved in the same way. The theorem is proved.

4. Corollary. Let a region Q ⊂ H_T, and let the measure ((∂/∂t)v)⁻(dt dx) (respectively, the measure ((∂/∂t)w)⁻(dt dx)) in the region Q be absolutely continuous with respect to Lebesgue measure dt dx. Then the restriction of the set function v_{(l)(l)}(dt dx) (respectively, w_{(l)(l)}(dt dx)) to the set Q ∩ Q(l) is absolutely continuous with respect to Lebesgue measure. Furthermore, in Q there exists a generalized derivative (∂/∂t)v (respectively, (∂/∂t)w) in the sense of Definition 2.1.1.

In fact, we take as ψ₁ the right side of inequality (2.14). From (4) and Theorem 2.6 we have that inside Q

where

which, as can be seen, implies that inside Q
Therefore, if Γ ⊂ Q and ∫_Γ dt dx = 0, then

If, moreover, Γ ⊂ Q(l), then by virtue of the inequality μ(l) > 0 on Q(l)

In other words, v_{(l)(l)}(dt dx) is absolutely continuous with respect to Lebesgue measure on Q ∩ Q(l). Further, since ((∂/∂t)v)⁻(dt dx) is by assumption absolutely continuous with respect to dt dx, inside Q

which yields the lower estimate of (∂/∂t)v(dt dx) in terms of the Lebesgue measure dt dx. Next, taking the upper estimate from (5), we obtain, as above, that (∂/∂t)v(dt dx) is absolutely continuous with respect to Lebesgue measure on Q. Therefore, if η ∈ C₀^∞(Q), then
and therefore the density of the measure (∂/∂t)v(dt dx) with respect to Lebesgue measure is a generalized derivative of v with respect to t in the sense of Definition 2.1.1. The function w can be considered in the same way.

5. Theorem. Let a region Q ⊂ H_T. Let the measure ((∂/∂t)v)⁻(dt dx) in the region Q be absolutely continuous with respect to Lebesgue measure dt dx. Then in Q there exists a derivative (∂/∂t)v(t,x) and, furthermore:

a. if Q ⊂ Q(l) for some unit vector l, a second generalized derivative v_{(l)(l)}(t,x) exists, and also

where ψ₁ is a function satisfying the assumption of Theorem 3;

b. if for all (t,x) ∈ Q, λ ≠ 0

sup_{α∈A} n^α(t,x)(a(α,t,x)λ,λ) > 0,  (7)

then Q ⊂ Q(l) for every unit l, all generalized derivatives of the type v_{xⁱxʲ}(t,x) exist in Q, and, in addition, F[v] ≤ 0 (Q-a.e.).
PROOF. According to the preceding corollary, (∂/∂t)v(t,x) exists in Q and, furthermore, under condition (a) the set function v_{(l)(l)}(dt dx) in the region Q is absolutely continuous with respect to Lebesgue measure. The Radon–Nikodym derivative of this set function with respect to Lebesgue measure is a generalized derivative of the type v_{(l)(l)}(t,x). Estimates for the latter derivative follow immediately from (4) and the assumption that

ψ₁ dt dx ≤ v_{(l)(l)}(dt dx).

In order to prove (b), we note that the function

sup_{α∈A} n^α(t,x)(a(α,t,x)l,l)

is continuous with respect to l, which together with (7) implies that on Q

μ(t,x) ≡ inf_{|l|=1} sup_{α∈A} n^α(t,x)(a(α,t,x)l,l) > 0.

It is seen that for all λ ∈ E_d

sup_{α∈A} n^α(t,x)(a(α,t,x)λ,λ) ≥ μ(t,x)|λ|²,

which yields for |l| = 1

μ(t,x,l) ≥ inf_{λ:λl=1} μ(t,x)|λ|² = μ(t,x).

Therefore, μ(l) ≥ μ > 0 on Q. In other words, Q ⊂ ⋂_l Q(l). Using assertion (a), we finally conclude that in Q there exist all generalized derivatives of the type v_{(l)(l)}(t,x). As we have seen above, this implies the existence of all second mixed generalized derivatives in the region Q. Finally, the inequality F[v] ≤ 0 (Q-a.e.) follows from Corollary 1.7, thus proving the theorem.

6. Remark. Obviously, Theorem 5 continues to hold if in its formulation we replace v and ψ₁ by w and ψ₂, ψ₂ being a function satisfying the assumption of Theorem 3.
4. Estimation of a Derivative of a Payoff Function with Respect to t

We saw in Section 4.3 (see Theorem 3.5) that in order to prove the existence of second-order generalized derivatives of a payoff function with respect to the space variables and to estimate them, we had to know how to estimate the derivatives of a payoff function with respect to t. In this section we estimate the absolute values of (∂/∂t)v(t,x), (∂/∂t)w(t,x), making more assumptions than in Sections 2 and 3.

In addition to the main assumptions made in Chapter 4, we assume here that the functions σ(α,t,x), b(α,t,x), c^α(t,x), f^α(t,x) for each α ∈ A are once continuously differentiable with respect to (t,x) on H̄_T, the function g(x) is twice continuously differentiable with respect to x, and the function g(t,x) is once differentiable with respect to t and twice differentiable with respect to x, with the derivatives (∂/∂t)g(t,x), g_{xⁱ}(t,x), g_{xⁱxʲ}(t,x) being continuous in H̄_T. Furthermore, for all α ∈ A, t ∈ [0,T], l ∈ E_d let

where the constants K and m are the same as those in (3.1.1)–(3.1.3). The last inequality follows readily from the foregoing and from (3.1.2) and (3.1.3) if in the right side of (1) we replace K(1 + |x|)^m by N(K,d)(1 + |x|)^{2m}. It is seen also that (1 + |x|)^m ≤ (1 + |x|)^{2m}. Hence, if there exist constants K and m ≥ 0 for which all the assumptions except (1) are satisfied, there will exist (other) constants K and m for which all the assumptions including (1) are satisfied. Therefore, we could easily do without (1). We shall not omit (1), however, because the estimate (1) is convenient in the case, for example, where m = 0 and, in addition, a(α,t,x) and b(α,t,x) are bounded functions. Furthermore, the right side of (1), written in this special form, will be convenient for our computations.

It is clear that we can always extend the functions a, b, c, f, and g(t,x) for t > T so that our assumptions are satisfied for all t ∈ [0,∞). However, in this case we may need to replace the constant K by 2K for t > T. Let us assume that we have carried out the extension as described above.

Due to the results obtained in Section 2.8, the process x_t^{s,x,α} is ℒB-continuously ℒB-differentiable with respect to s for s ∈ (0,T), t ∈ [0,T]. Furthermore, if η_t^{s,x,α} = ℒB-(d/ds)x_t^{s,x,α}, then for each n ≥ 1 (see Theorem 2.8.7)
where N = N(K,m,n). It is seen that the process (s + t, x_t^{s,x,α}) is ℒB-continuously ℒB-differentiable as well. It follows from Section 2.7 that the function v^{α,τ}(s,x) is continuously differentiable with respect to s on (0,T) for each α ∈ 𝔄, x ∈ E_d, τ ∈ 𝔐(T). Our objective consists in finding a formula for (d/ds)v^{α,τ∧(T−s)}(s,x).

1. Lemma. Suppose that on the square (0,T) × (0,T) we are given a bounded function ψ(s,r) which is measurable with respect to (s,r). Also, suppose that for almost all r the function ψ(s,r) is absolutely continuous with respect to s and that the function

(∂/∂s)ψ(s,r) = lim_{t→s} [ψ(t,r) − ψ(s,r)]/(t − s) if the limit exists, and (∂/∂s)ψ(s,r) = 0 otherwise,

satisfies the inequality

Then the function ∫₀^{T−s} ψ(s,r) dr is absolutely continuous with respect to s and, in addition, the derivative of this function with respect to s coincides with

−ψ(s, T − s) + ∫₀^{T−s} (∂/∂s)ψ(s,r) dr

for almost all s.

The proof of the lemma follows from the fact that, for those r for which ψ(s,r) is absolutely continuous with respect to s, the function (∂/∂s)ψ(s,r) is a derivative of ψ(s,r), is measurable with respect to (s,r), and, finally, as we can easily verify using Fubini's theorem.
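The differentiation formula of the lemma can be checked numerically for a smooth test function; the function ψ(s,r) = sin(s)·exp(r) below is an invented example (not from the text), for which F(s) = ∫₀^{T−s} ψ(s,r) dr = sin(s)(e^{T−s} − 1) in closed form.

```python
import numpy as np

# Check of Lemma 1 for psi(s, r) = sin(s) * exp(r) on (0, T) x (0, T):
#   F'(s) = -psi(s, T - s) + int_0^{T-s} (d/ds)psi(s, r) dr.
T = 1.0

def F(s):
    # F(s) = int_0^{T-s} sin(s) exp(r) dr, evaluated in closed form
    return np.sin(s) * (np.exp(T - s) - 1.0)

s, h = 0.3, 1e-5
numeric = (F(s + h) - F(s - h)) / (2.0 * h)   # central difference for F'(s)
# right side of the lemma: -psi(s, T-s) + int_0^{T-s} cos(s) exp(r) dr
leibniz = -np.sin(s) * np.exp(T - s) + np.cos(s) * (np.exp(T - s) - 1.0)
assert abs(numeric - leibniz) < 1e-8
```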
We can better describe the method of using this lemma if, from the definition of v^{α,τ}(s,x), applying Ito's formula, we derive the following formula:

If we assume that the function g(t,y) is infinitely differentiable, with its derivatives increasing not too rapidly as |y| → ∞, the functions L^{α_r}g(s + r, y) are continuously differentiable with respect to (s,y). Further, since the process (s + r, x_r^{s,x,α}) is ℒB-continuously ℒB-differentiable with respect to s, due to the results obtained in Section 2.7 the random variable

is ℒ-continuously ℒ-differentiable with respect to s on (0,T) for each r ∈ [0,T]. In this situation the mathematical expectation

M^α_{s,x} χ_{τ>r}[f^{α_r}(s + r, x_r) + L^{α_r}g(s + r, x_r)]e^{−φ_r}  (3)

is continuously differentiable with respect to s on (0,T) for any α, x, r. If we compute the derivative of (3) with respect to s and, furthermore, make use of the familiar estimates of the moments of x_r^{s,x,α} as well as inequality (2), we can easily prove (compare with the proof of Theorem 1.1) that the derivative of (3) with respect to s is bounded for s ∈ (0,T), r ∈ (0,T) for each x. Therefore, in this case the function v^{α,τ∧(T−s)}(s,x) is absolutely continuous with respect to s according to Lemma 1. Also, ((0,T)-a.e.)
Let us transform the last expression. Using Ito's formula as well as the rules given in Section 2.7, which enable us to interchange the order of differentiation and integration, we conclude that

Thus, if g(t,y) is a sufficiently smooth function,

Immediate computation of the last derivative with respect to s and, next, the application of Fubini's theorem (or carrying out transformations identical to those given in Section 2) lead us to the following result.

2. Lemma. For each x ∈ E_d, α ∈ 𝔄, τ ∈ 𝔐(T) the function v^{α,τ∧(T−s)}(s,x) is absolutely continuous with respect to s and, in addition, almost everywhere on (0,T) its derivative coincides with θ^{α,τ∧(T−s)}(s,x), where

θ^{α,τ}(s,x) = ⋯ − M^α_{s,x} χ_{τ≥T−s}[f^{α_{T−s}}(T, x_{T−s}) + L^{α_{T−s}}g(T, x_{T−s})]e^{−φ_{T−s}},

ξ(t) = ∫₀^t e^{−φ_r} f^{α_r}(s + r, x_r) dr + e^{−φ_t} g(s + t, x_t).
The arguments adduced prior to the lemma included the assumption that g(t,y) is very smooth. However, the lemma holds in the general case as well. In order to be convinced of this, one needs first to approximate g(t,y) using convolutions with smooth kernels and, second, to pass to the limit in the formula

proved for smooth g(t,y). Passage to the limit is possible due to the familiar estimates of the moments of x_t^{s,x,α}, η_t^{s,x,α}. The foregoing estimates enable us to assert that

|θ^{α,τ}(s,x)| ≤ N e^{N(T−s)}(1 + |x|)^{2m},  (5)

where N = N(K,m) (compare, for example, the proof of Theorem 1.1).
3. Theorem. For each x ∈ E_d the functions v(s,x) and w(s,x) are absolutely continuous with respect to s on [0,T], have on this interval a generalized derivative with respect to s, and, finally, for some constant N = N(K,m)

|(∂/∂s)v(s,x)| + |(∂/∂s)w(s,x)| ≤ N e^{N(T−s)}(1 + |x|)^{2m}.

PROOF. From (4) and (5) for s₂ > s₁ we find

|v^{α,τ∧(T−s₂)}(s₂,x) − v^{α,τ∧(T−s₁)}(s₁,x)| ≤ N e^{N(T−s₁)}(1 + |x|)^{2m}(s₂ − s₁).

Since w(s,x) = sup_{α,τ} v^{α,τ∧(T−s)}(s,x) and the difference between the upper bounds is not greater than the upper bound of the differences, we have

|w(s₂,x) − w(s₁,x)| ≤ N e^{N(T−s₁)}(1 + |x|)^{2m}(s₂ − s₁),

which implies the absolute continuity of w(s,x) with respect to s. Dividing the last inequality by s₂ − s₁ and taking the limit as sᵢ → s, we estimate the ordinary derivative (∂/∂s)w(s,x). We complete the proof of the theorem for w by invoking the fact that a generalized derivative of a function of a real variable coincides almost everywhere with the ordinary derivative of this function. In order to prove the theorem for the function v(s,x), it suffices to take τ ≡ T and to replace g(s,x) by g(x) in all the arguments in this section. The theorem is proved.

4. Exercise

We change a and b in Exercise 3.1. Let a(t,x) be the unit matrix of dimension 2 × 2 for t ∈ [0,½]; aⁱⁱ(t,x) = 1 for t ∈ (½,1]; b(α,t,x) = 0 for t ∈ [0,½]; b(α,t,x) = (α,−α) for t ∈ (½,1]. Show that the derivative (∂/∂s)v(s,x) is unbounded (near the point (½,0)).
5. Passage to the Limit in the Bellman Equation

We know from Sections 1–4 the conditions imposed on a, b, c, f, and g under which payoff functions have generalized second derivatives with respect to x and a generalized first derivative with respect to t. We shall show further on (see Theorems 7.1 and 7.2) that it is easy to deduce the Bellman equations for payoff functions from the existence of the derivatives mentioned and from the assumption that all the processes x_t^{α,s,x} are nondegenerate. In order to remove the condition that all the processes x_t^{α,s,x} be nondegenerate, we need theorems on passage to the limit in the Bellman equation.

Throughout this section Q denotes a bounded subregion of H_T, ā(α,t,x) is a nonnegative symmetric matrix of dimension d × d, b̄(α,t,x) is a d-dimensional vector, and c̄^α(t,x), f̄^α(t,x), r̄^α(t,x) are numbers. We assume that ā, b̄, c̄, f̄, and r̄ are all defined for (α,t,x) ∈ A × Q, measurable with respect to (t,x), and continuous with respect to α. Furthermore, we assume that c̄^α ≥ 0, r̄^α ≥ 0; that ā, b̄, c̄, and r̄ are bounded on A × Q; and that sup_α |f̄^α(t,x)| ∈ ℒ_{d+1}(Q). Let

G(u₀,u_{ij},u_i,u,t,x) = sup_{α∈A} [r̄^α(t,x)u₀ + Σ_{i,j=1}^d ā^{ij}(α,t,x)u_{ij} + Σ_{i=1}^d b̄^i(α,t,x)u_i − c̄^α(t,x)u + f̄^α(t,x)].

We denote by ∂′Q the parabolic boundary of Q, that is, the set of those points (t₀,x₀) of the (usual) boundary of Q for each of which there exist a number δ > 0 and, in addition, a continuous function x_t defined on [t₀ − δ, t₀] such that x_{t₀} = x₀ and (t,x_t) ∈ Q for t ∈ [t₀ − δ, t₀). It is easily seen that if a process (s₀ + t, x_t) is continuous with respect to t and is inside Q at t = 0, this process can leave the region Q only across the parabolic boundary of Q.

Two basic theorems of the present section are the following.
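Before stating them, note that for a finite action set the operator G[u] is just a pointwise maximum of finitely many linear parabolic expressions applied to u. A minimal numerical sketch in one space dimension (all coefficients, the grid, and the two parameter values are invented):

```python
import numpy as np

# G[u](t, x) = max over alpha of
#   rbar*u_t + abar*u_xx + bbar*u_x - cbar*u + fbar,
# evaluated with finite differences on a grid.
t = np.linspace(0.0, 1.0, 101)
x = np.linspace(-1.0, 1.0, 101)
dt, dx = t[1] - t[0], x[1] - x[0]
tt, xx = np.meshgrid(t, x, indexing="ij")
u = np.sin(xx) * np.exp(-tt)                 # a test function u(t, x)

u_t = np.gradient(u, dt, axis=0)
u_x = np.gradient(u, dx, axis=1)
u_xx = np.gradient(u_x, dx, axis=1)

def L(alpha):
    # one linear operator of the family; coefficients depend on alpha
    abar, bbar, cbar, fbar, rbar = 0.5 + alpha, alpha, 0.1, alpha**2, 1.0
    return rbar * u_t + abar * u_xx + bbar * u_x - cbar * u + fbar

A = [0.0, 1.0]                                # finite action set
G = np.max(np.stack([L(a) for a in A]), axis=0)
assert G.shape == u.shape
assert np.all(G >= L(A[0]) - 1e-12) and np.all(G >= L(A[1]) - 1e-12)
```

The pointwise maximum dominates each member of the family, which is the property the convergence theorems below exploit.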
1. Theorem. Let the functions u_n ∈ W^{1,2}(Q) (n = 0,1,2,...); also,

sup_{n≥0} ‖u_n‖_{B(∂′Q)} < ∞,  lim_{n→∞} ‖u_n − u₀‖_{d+1,Q} = 0.

Then:

a. lim sup_{n→∞} G[u_n] ≥ G[u₀] (a.e. on Q);

b. if inf_n G[u_n], inf_n ((∂/∂t) + Δ)u_n ∈ ℒ_{d+1}(Q), then G[u₀] ≥ lim inf_{n→∞} G[u_n] (a.e. on Q).
This theorem implies that under the appropriate conditions

lim sup_{n→∞} G[u_n] ≥ G[lim_{n→∞} u_n] ≥ lim inf_{n→∞} G[u_n]  (a.e. on Q).
2. Theorem. For some constant δ > 0, for all (α,t,x) ∈ A × Q, λ ∈ E_d let

(ā(α,t,x)λ,λ) ≥ δ|λ|²,  r̄^α(t,x) ≥ δ.

Let u_n ∈ W^{1,2}(Q) (n = 0,1,2,...); also,

sup_{n≥0} ‖u_n‖_{B(∂′Q)} < ∞,  lim_{n→∞} ‖u_n − u₀‖_{d+1,Q} = 0.  (2)

Then for any function h ∈ ℒ_{d+1}(Q):

‖(G[u₀] + h)^±‖_{d+1,Q} ≤ N lim inf_{n→∞} ‖(G[u_n] + h)^±‖_{d+1,Q},

where N depends only on d, δ, and the maximal values of the functions b̄^i(α,t,x), ā^{ij}(α,t,x), r̄^α(t,x) with respect to i, j = 1,...,d and (α,t,x) ∈ A × Q. In particular, if G[u_n] → −h in the norm of ℒ_{d+1}(Q), then G[u₀] = −h (a.e.).

It is essential to note that the hypotheses of the theorem include no assumption on the convergence of the derivatives of the functions u_n to the derivatives of the function u₀. In this connection, let us bring the reader's attention to

3. Exercise

Let d = T = 1, Q = (0,1) × (−1,1),

Let φ_n(x) = sgn sin(2ⁿπx),

Prove that the totality of u_n, u_{n,t}, u_{n,x} is bounded, u_n → u₀ uniformly in Q, and, nevertheless,

0 = G[u₀], lim_{n→∞} G[u_n] = 1  (a.e. on Q).
Assertion (a) in Theorems 1 and 2 can sometimes be strengthened.

4. Exercise

For all α ∈ A let the functions ā(α,t,x) (respectively, b̄(α,t,x)) be twice (once) continuously differentiable with respect to x, let r̄^α(t,x) be once continuously differentiable with respect to t in Q, and, finally, let the respective derivatives of the foregoing functions be bounded in Q. Prove that if u_n ∈ W^{1,2}(Q) (n = 0,1,2,...), h ∈ ℒ_{d+1}(Q), lim_{n→∞} ‖u_n − u₀‖_{d+1,Q} = 0, then lim sup_{n→∞} G[u_n] ≥ G[u₀] (a.e. on Q), and if, in addition, sup_n G[u_n] ∈ ℒ_{d+1}(Q), then (with the constant N = 1)

‖(G[u₀] + h)⁺‖_{d+1,Q} ≤ lim inf_{n→∞} ‖(G[u_n] + h)⁺‖_{d+1,Q}.
Theorem 1 follows readily from Theorem 2. For example, let us prove assertion (b) of Theorem 1 assuming that Theorem 2 has been proved. First we note that if u ∈ W^{1,2}(Q), then G[u] ∈ ℒ_{d+1}(Q). This fact follows immediately from the measurability of G[u](t,x) (compare with Chapter 4, Introduction) and from the obvious inequality

where the constant N depends only on the upper bounds of the quantities ā^{ij}(α,t,x), b̄^i(α,t,x), c̄^α(t,x), f̄^α(t,x), r̄^α(t,x). Further, for ε > 0 we put ā_ε(α,t,x) = ā(α,t,x) + εI, where I denotes the unit matrix of dimension d × d, and r̄^α_ε(t,x) = r̄^α(t,x) + ε. We construct the operator G_ε on the basis of ā_ε, b̄, c̄, f̄, and r̄_ε in the same way as we have constructed the operator G on the basis of ā, b̄, c̄, f̄, and r̄. Obviously,

G_ε[u] = G[u] + ε((∂/∂t) + Δ)u.

Let h_{n₀} = −inf_{n≥n₀} G_ε[u_n] for n₀ > 0. Since −h_{n₀} ≤ G_ε[u_{n₀}] and −h_{n₀} ≥ inf_{n≥1} G_ε[u_n], the function h_{n₀} ∈ ℒ_{d+1}(Q) due to inequality (1) and the assumptions of Theorem 1b. In addition, (G_ε[u_n] + h_{n₀})⁻ = 0 for n ≥ n₀. Inequalities (2) enable us to use Theorem 2 and thus obtain (G_ε[u₀] + h_{n₀})⁻ = 0 (a.e. on Q), that is,

G_ε[u₀] ≥ −h_{n₀}  (a.e. on Q),

G[u₀] + ε((∂/∂t) + Δ)u₀ ≥ inf_{n≥n₀} G_ε[u_n] ≥ inf_{n≥n₀} G[u_n] + ε inf_{n≥0} ((∂/∂t) + Δ)u_n.

For n₀ → ∞, ε ↓ 0 the foregoing proves Theorem 1b.

Therefore, we need to prove only Theorem 2, which we shall do at the end of this section. However, we investigate now an auxiliary problem, assuming the conditions of Theorem 2 to be satisfied. It is convenient to assume that ā(α,t,x), b̄(α,t,x) are defined not only on Q but for all (t,x) in general. Redefining them, if necessary, outside Q, we can arrange that b̄(α,t,x) = 0, ā(α,t,x) = δI for (t,x) ∉ Q. Let σ̄(α,t,x) be the positive symmetric square root of the matrix 2ā(α,t,x). We fix in the set A a countable everywhere dense subset {α(i), i ≥ 1} and,
furthermore, we denote by 𝔄̃ the set of all measurable functions α̃(t,x) given on (−∞,∞) × E_d and assuming values in {α(i), i ≥ 1}. Since each eigenvalue of the matrix ā is not less than δ, each eigenvalue of σ̄ is not less than √(2δ). Therefore, (σ̄λ,λ) ≥ √(2δ)|λ|². By Theorem 2.6.1, for each α̃ ∈ 𝔄̃, s ∈ [0,T], x ∈ E_d there exist a probability space, a d-dimensional Wiener process (w_t, ℱ_t), and a continuous process x_t = x_t^{α̃,s,x} on this space such that

x_t = x + ∫₀^t σ̄(α̃(s + r, x_r), s + r, x_r) dw_r + ∫₀^t b̄(α̃(s + r, x_r), s + r, x_r) dr.

Let c̃(t,x) = c̄^{α̃(t,x)}(t,x),

R^α̃_λ h(s,x) = M ∫₀^τ h(s + t, x_t^{α̃,s,x}) exp[−λt − ∫₀^t c̃(s + r, x_r^{α̃,s,x}) dr] dt,

where τ is the time of first exit of the process (s + t, x_t^{α̃,s,x}) from the region Q. Certainly, in this notation we should somehow incorporate the dependence on the choice of a probability space, a Wiener process, or x_t^{α̃,s,x}. However, we shall not do this, but instead assume that for each α̃ ∈ 𝔄̃, x ∈ E_d, s ∈ [0,T] one of the needed probability spaces, one Wiener process, and one of the processes x_t^{α̃,s,x} are fixed. To shorten the notation we also put f̃(t,x) = f̄^{α̃(t,x)}(t,x) for α̃ ∈ 𝔄̃.
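For one fixed strategy, R^α̃_λ is an expected discounted occupation integral up to the exit time from Q, and it can be estimated by an Euler scheme. A rough one-dimensional Monte Carlo sketch follows; the constant coefficients, region, and all parameters are invented, and with h ≡ 1 and c̃ ≡ 0 the estimate cannot exceed ∫₀^{T−s} e^{−λt} dt, up to O(dt) discretization error.

```python
import numpy as np

# Monte Carlo sketch of R_lambda h(s,x) = M int_0^tau h(...) e^{-lambda t} dt
# for ONE fixed strategy, d = 1, Q = (0, T) x (-1, 1), h = 1, c = 0.
rng = np.random.default_rng(0)
lam, T, s, dt = 2.0, 1.0, 0.0, 1e-2
n_paths = 200
sigma, b = 0.7, 0.1                            # invented constant coefficients

vals = np.zeros(n_paths)
for i in range(n_paths):
    xp, tcur, acc, disc = 0.0, 0.0, 0.0, 1.0
    while tcur < T - s and abs(xp) < 1.0:      # until exit from Q
        acc += disc * 1.0 * dt                 # h = 1, c = 0
        disc *= np.exp(-lam * dt)
        xp += b * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        tcur += dt
    vals[i] = acc
R_estimate = vals.mean()

# deterministic bound int_0^{T-s} e^{-lam t} dt, up to O(dt) scheme error
bound = (1.0 - np.exp(-lam * (T - s))) / lam
assert 0.0 < R_estimate <= bound + dt
```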
5. Lemma. Let u ∈ W^{1,2}(Q), r̄^α(t,x) ≡ 1, G[u] = −h (a.e. on Q), λ ≥ 0. Then for all (t,x) ∈ Q

u(t,x) = sup_{α̃∈𝔄̃} [R^α̃_λ(λu + f̃ + h)(t,x) + π^α̃_λ u(t,x)].  (3)

PROOF. We denote by ū(t,x) the right side of (3) and note that it does not change, by Theorem 2.2.4, if we replace the function h by the equivalent function h̃(t,x) = −G[u](t,x). Using ā, b̄, and c̄, we construct the operator L̄^α in the same way as we constructed the operator L^α on the basis of a(α,t,x), b(α,t,x), and c^α(t,x) in the introduction to this chapter. It is seen that L̄^α u + f̄^α ≤ −h̃ everywhere in Q for α ∈ A, and for each α̃ ∈ 𝔄̃, (t,x) ∈ Q

Applying Ito's formula (Theorem 2.10.1), we obtain

u(t,x) = R^α̃_λ(λu − L̃u)(t,x) + π^α̃_λ u(t,x) ≥ R^α̃_λ(λu + f̃ + h̃)(t,x) + π^α̃_λ u(t,x),

which holds for any function α̃ ∈ 𝔄̃. Therefore, u ≥ ū. To prove the converse we note first that, due to the continuity of ā(α,t,x), b̄(α,t,x), c̄^α(t,x), and f̄^α(t,x) with respect to α and, in addition, due to the density of {α(i)} in
the set A,

G[u] = sup_i [L̄^{α(i)}u + f̄^{α(i)}] = lim_{n→∞} max_{i≤n} [L̄^{α(i)}u + f̄^{α(i)}].

Therefore, for each ε > 0 and each (t,x) ∈ Q there exists a number i such that

G[u](t,x) − ε ≤ L̄^{α(i)}u(t,x) + f̄^{α(i)}(t,x).

We denote by i_ε(t,x) the smallest integer which satisfies the last inequality. It can easily be proved that the set {(t,x) ∈ Q: i_ε(t,x) = i} is measurable for each i. Hence the function i_ε(t,x) is measurable and, by the same token, the function α̃_ε(t,x) = α(i_ε(t,x)) is measurable as well, for which

for (t,x) ∈ Q. The foregoing (compare with (4)) implies that

u(t,x) ≤ R^{α̃_ε}_λ(λu + f̃ + h̃ + ε)(t,x) + π^{α̃_ε}_λ u(t,x) ≤ ū(t,x) + ε(T − t).

Here ε is an arbitrary positive number; therefore, u ≤ ū. This completes the proof of the lemma.

For λ = 0 we derive from the lemma a probabilistic representation of a solution of the equation G[u] = −h:

u(t,x) = sup_{α̃∈𝔄̃} [R^α̃₀(f̃ + h)(t,x) + π^α̃₀ u(t,x)].  (5)
6. Exercise

Prove using (5) that if r̄^α(t,x) ≡ 1, u₁, u₂ ∈ W^{1,2}(Q),

G[u₁] ≥ G[u₂]  (a.e. on Q),

and u₁|_{∂′Q} ≤ u₂|_{∂′Q}, then u₁ ≤ u₂ everywhere in Q. In particular, if G[u₁] = G[u₂] (a.e. on Q) and u₁|_{∂′Q} = u₂|_{∂′Q}, then u₁ = u₂ in the region Q.
We note another simple consequence of Eq. (5), which is, however, rather tangential to the discussion in this section.

7. Theorem. Let the first assumption of Theorem 2 be satisfied,

Q = C_{T,R},  h = ess sup_{C_{T,R}} |G[u₁] − G[u₂]|,  u₁, u₂ ∈ W^{1,2}(C_{T,R}),  u₁(T,x) = u₂(T,x).

Then for each n > 0 there is a constant N = N(K,n) such that for all (s,x) ∈ C_{T,R}

|u₁(s,x) − u₂(s,x)| ≤ h(T − s) + R⁻ⁿN e^{N(T−s)}(1 + |x|)ⁿ sup_{t∈[0,T], |y|=R} |u₁(t,y) − u₂(t,y)|.
PROOF. Let hᵢ = −G[uᵢ]. Further, we write the representations (5) for u₁ and u₂ and subtract one from the other. Noting that the magnitude of the difference between the upper bounds does not exceed the upper bound of the magnitudes of the differences, we have

Since |h₁ − h₂| ≤ h (a.e.), R^α̃₀|h₁ − h₂|(s,x) ≤ R^α̃₀ h(s,x) ≤ h(T − s). Furthermore, since u₁(T,x) = u₂(T,x),

π^α̃₀|u₁ − u₂|(s,x) ≤ M|u₁ − u₂|(s + τ, x_τ) ≤ sup_{t∈[0,T]} sup_{|y|=R} |u₁(t,y) − u₂(t,y)| P{|x_τ^{α̃,s,x}| = R}.

It remains only to estimate the last probability. It is seen that it equals

According to Corollary 2.5.12 the last expression does not, in turn, exceed R⁻ⁿN e^{N(T−s)}(1 + |x|)ⁿ. This completes the proof of the theorem.

Using the probabilistic representation given in Lemma 5 for solutions of the equation G[u] = −h, we can give a probabilistic formula for the operator G.
8. Lemma. Let r̄^α(t,x) ≡ 1 and also let

F_{h,λ}u(t,x) = sup_{α̃∈𝔄̃} [R^α̃_λ(λu + f̃ + h)(t,x) + π^α̃_λ u(t,x)].

Then for all λ > 0

λ‖(F_{h,λ}u − u)^±‖_{d+1,Q} ≤ N‖(G[u] + h)^±‖_{d+1,Q},

where N depends only on d, δ, and the maximum of the moduli of ā^{ij}(α,t,x), b̄^i(α,t,x). Furthermore, for the same h, u

lim_{λ→∞} ‖λ(F_{h,λ}u − u) − G[u] − h‖_{d+1,Q} = 0.

PROOF. By Lemma 5, u = F_{h₁,λ}u, where h₁ = −G[u]. Estimating the difference between the upper bounds, we find

F_{h,λ}u − u = F_{h,λ}u − F_{h₁,λ}u ≤ sup_{α̃∈𝔄̃} R^α̃_λ(h − h₁) ≤ sup_{α̃∈𝔄̃} R^α̃_λ(h − h₁)⁺,

F_{h,λ}u − u ≥ inf_{α̃∈𝔄̃} R^α̃_λ(h − h₁) = −sup_{α̃∈𝔄̃} R^α̃_λ(h₁ − h) ≥ −sup_{α̃∈𝔄̃} R^α̃_λ(h − h₁)⁻,

which together with Theorems 2.4.5 and 2.4.7 proves the assertions of the
lemma. In fact, by Theorem 2.4.5

λ‖(F_{h,λ}u − u)^±‖_{d+1,Q} ≤ λ‖sup_{α̃∈𝔄̃} R^α̃_λ(h − h₁)^±‖_{d+1,Q} ≤ N‖(h − h₁)^±‖_{d+1,Q}.

By Theorem 2.4.7, the expressions

λ sup_{α̃∈𝔄̃} R^α̃_λ(h − h₁),  −λ sup_{α̃∈𝔄̃} R^α̃_λ(h₁ − h)

converge in the norm of ℒ_{d+1}(Q) to h − h₁. Since λ(F_{h,λ}u − u) is between the foregoing expressions, it converges to h − h₁ as well. The lemma is proved.
9. Proof of Theorem 2. First we consider the case r̄^α(t,x) ≡ 1, h = 0. Let a region Q′ ⊂ Q̄′ ⊂ Q. According to Lemma 8,

Further, it is seen that

|F⁰_λ u₀ − F⁰_λ u_n| ≤ λ sup_{α̃∈𝔄̃} R^α̃_λ|u₀ − u_n| + N₀ sup_{α̃∈𝔄̃} π^α̃_λ 1,

where N₀ = sup_n ‖u_n − u₀‖_{B(∂′Q)}. By Theorem 2.4.5,

Since the constant N does not depend on n, the right side of the last inequality tends to zero as n → ∞. Therefore,

λ‖(F⁰_λ u₀ − u₀)⁻‖_{d+1,Q′} ≤ lim_{λ→∞} lim_{n→∞} λ‖(F⁰_λ u_n − u_n)⁻‖_{d+1,Q′} + N₀ lim_{λ→∞} λ‖sup_{α̃∈𝔄̃} π^α̃_λ 1‖_{d+1,Q′},

where the first term does not exceed N lim inf_{n→∞} ‖(G[u_n])⁻‖_{d+1,Q} in accord with Lemma 8, and the second term is equal to zero in accord with Theorem 2.4.7. Finally,

‖(G[u₀])⁻‖_{d+1,Q′} ≤ N lim inf_{n→∞} ‖(G[u_n])⁻‖_{d+1,Q}  (6)

for each region Q′ ⊂ Q̄′ ⊂ Q, N depending only on d, δ, as well as the maximal magnitudes of ā^{ij}(α,t,x), b̄^i(α,t,x) with respect to α, t, x, i, j. Next, we choose an increasing sequence of regions Q′_i whose union is Q. Putting the region Q′_i instead of Q′ in the left side of (6) and, in addition, letting i → ∞, we complete the proof of assertion (b) of the theorem for h = 0. In this case assertion (a) of the theorem can be proved in a similar manner.
Using formal transformations, we can derive the general assertion from the particular case considered. Let h = 0, and let r̄^α(t,x) be an arbitrary function satisfying the conditions of the theorem. We construct the operator Ĝ[u] on the basis of the functions (r̄^α)⁻¹ā, (r̄^α)⁻¹b̄, (r̄^α)⁻¹c̄, (r̄^α)⁻¹f̄, and 1 in the same way as we constructed the operator G[u] on the basis of ā, b̄, c̄, f̄, and r̄. Let

N₁ = sup_{α∈A} sup_{(t,x)∈Q} r̄^α(t,x).

Note that for any set of numbers l^α:

1. if 0 ≤ sup_α l^α, then sup_{α∈A} l^α ≤ N₁ sup_{α∈A} (r̄^α(t,x))⁻¹ l^α ≤ N₁δ⁻¹ sup_{α∈A} l^α;

2. if sup_α l^α ≤ 0, then sup_{α∈A} l^α ≥ N₁ sup_{α∈A} (r̄^α(t,x))⁻¹ l^α ≥ N₁δ⁻¹ sup_{α∈A} l^α.

The foregoing implies that

These inequalities together with the assertions of the theorem, which hold for the operator Ĝ and h = 0, immediately prove the theorem for G and h = 0. In order to prove the theorem for an arbitrary h ∈ ℒ_{d+1}(Q), it suffices to note that G[u] + h can be written as G̃[u] in an obvious way, if we construct G̃[u] on the basis of the functions ā, b̄, c̄, f̄ + h, and r̄ in the same way as we construct G[u] on the basis of ā, b̄, c̄, f̄, and r̄. This completes the proof of the theorem.
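The two elementary inequalities on suprema used in the proof can be sanity-checked on random data; δ and the sampled values below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
delta = 0.5
for _ in range(100):
    l = rng.normal(size=8)                 # numbers l^alpha
    r = rng.uniform(delta, 3.0, size=8)    # rbar^alpha values in [delta, N1]
    N1 = r.max()
    s_plain = l.max()                      # sup_alpha l^alpha
    s_scaled = (l / r).max()               # sup_alpha (rbar^alpha)^{-1} l^alpha
    if s_plain >= 0:
        assert s_plain <= N1 * s_scaled + 1e-12
        assert N1 * s_scaled <= N1 / delta * s_plain + 1e-12
    else:
        assert s_plain >= N1 * s_scaled - 1e-12
        assert N1 * s_scaled >= N1 / delta * s_plain - 1e-12
```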
6. The Approximation of Degenerate Controlled Processes by Nondegenerate Ones

Let (w̃_t, ℱ_t) be a (d₁ + d)-dimensional Wiener process, let ε be a number, and, finally, let σ_ε(α,t,x) be a matrix of dimension d × (d₁ + d) in which the first d₁ columns coincide with the respective columns of the matrix σ(α,t,x) and the block of the last d columns equals εI, where I denotes the unit matrix of dimension d × d. Denote by 𝔄̃ the set of all processes α̃ = α̃_t(ω) which are progressively measurable with respect to {ℱ_t} and which take values in A. For α̃ ∈ 𝔄̃, s ∈ [0,T], x ∈ E_d we define the process x_t^{α̃,s,x}(ε) to be a solution of the equation

x_t = x + ∫₀^t σ_ε(α̃_r, s + r, x_r) dw̃_r + ∫₀^t b(α̃_r, s + r, x_r) dr.  (1)
Furthermore, let

φ_t^{α̃,s,x}(ε) = ∫₀^t c^{α̃_r}(s + r, x_r^{α̃,s,x}(ε)) dr,  v_ε(s,x) = sup_{α̃∈𝔄̃} v^{α̃}_ε(s,x).

For s ∈ [0,T] we denote by 𝔐(T − s) the set of all Markov times (with respect to {ℱ_t}) which do not exceed T − s,

w_ε(s,x) = sup_{α̃∈𝔄̃} sup_{τ∈𝔐(T−s)} v^{α̃,τ}_ε(s,x).

The processes x_t^{α̃,s,x}(ε) for ε ≠ 0 are nondegenerate in the following sense. Let a_ε(α,t,x) = ½σ_ε(α,t,x)σ*_ε(α,t,x). It is seen that

a_ε(α,t,x) = a(α,t,x) + ½ε²I.  (2)
Hence for any λ ∈ E_d

(a_ε(α,t,x)λ,λ) ≥ ½ε²|λ|².

The equality a_ε = a + ½ε²I immediately implies the following useful relation:

F_ε[u] ≡ sup_{α∈A} [L^α_ε u(t,x) + f^α(t,x)] = F[u] + ½ε²Δu.

The set of strategies 𝔄 ⊂ 𝔄̃. If, in addition, the first d₁ coordinates of the process w̃_t form a process w_t, then, due to the uniqueness of a solution of Eq. (1), we have x_t^{α,s,x} = x_t^{α,s,x}(0) for α ∈ 𝔄. Hence we can say that the nondegenerate controlled process x_t^{α̃,s,x}(ε) as ε → 0 approximates the (in general, degenerate) process x_t^{α̃,s,x}.
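Relation (2) can be verified directly for the block construction of σ_ε; the matrix σ below is an arbitrary example, not from the text.

```python
import numpy as np

# Check relation (2): with sigma_eps = [sigma | eps*I] of size d x (d1+d),
# a_eps = (1/2) sigma_eps sigma_eps^T = a + (1/2) eps^2 I.
d, d1, eps = 3, 2, 0.1
rng = np.random.default_rng(2)
sigma = rng.normal(size=(d, d1))          # arbitrary example diffusion matrix

sigma_eps = np.hstack([sigma, eps * np.eye(d)])   # first d1 columns: sigma
a = 0.5 * sigma @ sigma.T
a_eps = 0.5 * sigma_eps @ sigma_eps.T
assert np.allclose(a_eps, a + 0.5 * eps**2 * np.eye(d))

# nondegeneracy: (a_eps lam, lam) >= (eps^2 / 2) |lam|^2
lam = rng.normal(size=d)
assert lam @ a_eps @ lam >= 0.5 * eps**2 * (lam @ lam) - 1e-12
```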
1. Theorem. As ε → 0,

v_ε(t,x) → v(t,x),  w_ε(t,x) → w(t,x),

uniformly on each cylinder C_{T,R}.
PROOF. According to Corollary 3.1.13, as ε → 0,

v_ε(t,x) → v₀(t,x),  w_ε(t,x) → w₀(t,x)  (4)

uniformly on each cylinder C_{T,R}. It is seen that for ε = 0 the process x_t^{α̃,s,x}(ε) can be defined to be a solution of the equation

x_t = x + ∫₀^t σ(α̃_r, s + r, x_r) dw̃′_r + ∫₀^t b(α̃_r, s + r, x_r) dr,

where w̃′_t is the vector composed of the first d₁ components of the vector w̃_t. The last equation is equivalent to the equation for x_t^{α̃,s,x}, in which, however, the Wiener process is (possibly) a different one and, in addition, it is allowed to choose strategies measurable with respect to rather large σ-algebras. However, as we know from Remarks 3.3.10 and 3.4.10, a payoff function depends neither on the probability space nor on the fact that one d₁-dimensional Wiener process is replaced by another d₁-dimensional Wiener process (with respect to, possibly, very large σ-algebras). Therefore, v₀ = v, w₀ = w, which together with (4) proves the theorem.
In some cases, for example, in finding numerical values of payoff functions, it is crucial to know how great the difference |v_ε(s,x) − v(s,x)| is.

2. Theorem. For all s ∈ [0,T], α ∈ A, R > 0, x, y ∈ S_R let

Then there exists a constant N = N(K,m) such that for all (s,x) ∈ H_T, ε ∈ [−1,1]

|v_ε(s,x) − v(s,x)| + |w_ε(s,x) − w(s,x)| ≤ |ε|N(1 + |x|)^{2m} e^{N(T−s)}.

PROOF. We could prove the theorem by differentiating Eq. (1) with respect to the parameter ε. We prefer, however, a formal application of Theorem 1.1. We add the equation

ε_t = ε + ∫₀^t 0 dw̃_r + ∫₀^t 0 dr

to Eq. (1), replacing in (1) ε by ε_t, and, furthermore, we regard ε_t as the last component of the controlled process (x_t^{α̃,s,(x,ε)}, ε_t^{α̃,s,(x,ε)}). Note that for s ∈ [0,T], x, y ∈ E_d, ε₁, ε₂ ∈ E₁, α ∈ A,
In other words, the function σ_ε(α,s,x) satisfies a Lipschitz condition with respect to (x,ε) uniformly in α, s. Therefore, the controlled process (x_t^{α̃,s,(x,ε)}, ε_t^{α̃,s,(x,ε)}) fits the scheme considered in Chapter 4. Theorem 1.1 estimates the gradient of the functions v_ε(s,x), w_ε(s,x) with respect to the variables (x,ε). In particular, the generalized derivatives of v_ε, w_ε with respect to ε for ε² + |x|² ≤ R² do not exceed N(1 + R)^{2m} e^{N(T−s)}. As was mentioned in Section 2.1, the boundedness of a generalized derivative yields a Lipschitz constant. Hence for ε² + |x|² ≤ R²

|v_ε(s,x) − v₀(s,x)| + |w_ε(s,x) − w₀(s,x)| ≤ |ε|N(1 + R)^{2m} e^{N(T−s)},

where N = N(K,m). It remains only to take R² = |x|² + 1 for |ε| ≤ 1. The theorem is proved.
7. The Bellman Equation

The Bellman equation plays an essential role in finding a payoff function and ε-optimal strategies. It turns out that if the processes x_t^{α,s,x} are nondegenerate, we can obtain the Bellman equation under the assumption of the existence of generalized derivatives of a payoff function. First, we prove two results of the type mentioned, and second, we derive the Bellman equation imposing restrictions only on σ, b, c, f, g. These restrictions will be formulated after Theorem 2. Here, as everywhere else in this chapter, we assume that the assumptions made in Section 3.1 are satisfied.
1. Theorem. Let $Q \subset H_T$ be a bounded region, let $w \in W^{1,2}(Q)$, and, finally, for each region $Q'$ which together with its closure lies in $Q$ let there exist a number $\delta = \delta(Q') > 0$ such that for all $(t,x) \in Q'$, $\alpha \in A$, $\lambda \in E_d$

$$(a(\alpha,t,x)\lambda,\lambda) \ge \delta|\lambda|^2.$$

Then $F[w] \le 0$ (a.e. on $Q$), $F[w] = 0$ ($Q \cap \{(t,x)\colon w(t,x) > g(t,x)\}$-a.e.), and $w \ge g$ in the region $Q$. In short,

$$(F[w] + w - g)_+ + g - w = 0 \quad \text{(a.e. on } Q\text{)}.$$
PROOF. For $\beta \in A$ we introduce the constant strategy $\beta_t \equiv \beta$. Let the region $Q' \subset \bar Q' \subset Q$, let a point $(s,x) \in Q'$, and, finally, let $\tau'$ be the time of first exit of the process $(s+t, x_t^{\beta,s,x})$ from the region $Q'$. By Theorem 3.1.11, for each $\lambda \ge 0$

$$w(s,x) \ge M_x^\beta \left[ \int_0^{\tau'} \big(f^\beta + \lambda w\big)(s+t, x_t)\, e^{-\varphi_t - \lambda t}\,dt + w(s+\tau', x_{\tau'})\, e^{-\varphi_{\tau'} - \lambda\tau'} \right].$$

By Itô's formula (Theorem 2.10.1)

$$w(s,x) = M_x^\beta \left[ \int_0^{\tau'} \big(\lambda w - L^\beta w\big)(s+t, x_t)\, e^{-\varphi_t - \lambda t}\,dt + w(s+\tau', x_{\tau'})\, e^{-\varphi_{\tau'} - \lambda\tau'} \right].$$

Therefore, subtracting these two formulas, we obtain

$$0 \ge M_x^\beta \int_0^{\tau'} \big[L^\beta w(s+t, x_t) + f^\beta(s+t, x_t)\big]\, e^{-\varphi_t - \lambda t}\,dt.$$

Multiplying the last inequality by $\lambda$ and, in addition, letting $\lambda \to \infty$, we find according to Theorem 2.4.6 that $L^\beta w + f^\beta \le 0$ (a.e. on $Q'$). Then $F[w] \le 0$ (a.e. on $Q$). On the other hand, let $\varepsilon > 0$ and let the region $Q' \subset Q \cap \{(s,x)\colon w(s,x) > g(s,x) + \varepsilon\}$.
Then, according to the Bellman principle (Theorem 3.1.11),

$$w(s,x) = \sup_{\alpha \in \mathfrak A} M_x^\alpha \left[ \int_0^{\tau'} f^{\alpha_t}(s+t, x_t)\, e^{-\varphi_t}\,dt + w(s+\tau', x_{\tau'})\, e^{-\varphi_{\tau'}} \right],$$

where $\tau'$ is the time of first exit of the process $(s+t, x_t^{\alpha,s,x})$ from the region $Q'$. By Itô's formula,

$$w(s,x) = M_x^\alpha \left[ -\int_0^{\tau'} L^{\alpha_t} w(s+t, x_t)\, e^{-\varphi_t}\,dt + w(s+\tau', x_{\tau'})\, e^{-\varphi_{\tau'}} \right],$$

which implies that

$$0 = \sup_{\alpha \in \mathfrak A} M_x^\alpha \int_0^{\tau'} \big[L^{\alpha_t} w(s+t, x_t) + f^{\alpha_t}(s+t, x_t)\big]\, e^{-\varphi_t}\,dt \le \sup_{\alpha \in \mathfrak A} M_x^\alpha \int_0^{\tau'} F[w](s+t, x_t)\, e^{-\varphi_t}\,dt, \tag{1}$$

where $F[w] \le 0$ (a.e. on $Q$). Hence the right side of (1) is equal to zero. By virtue of Corollary 2.4.8, we have $F[w] = 0$ (a.e. on $Q'$). In view of the arbitrariness of $Q'$, this means that $F[w] = 0$ ($Q \cap \{(s,x)\colon w(s,x) > g(s,x) + \varepsilon\}$-a.e.) for each $\varepsilon > 0$. The union of all these regions over all $\varepsilon > 0$ is the region $Q \cap \{(s,x)\colon w(s,x) > g(s,x)\}$. Therefore, in the last region $F[w] = 0$ almost everywhere. Finally, the inequality $w \ge g$ is obvious (see, however, Theorem 3.1.8). We leave the proof of the last assertion as an exercise for the reader. The theorem is proved.
2. Theorem. Let $Q \subset H_T$ be a bounded region, let $v \in W^{1,2}(Q)$, and, finally, for each region $Q'$ which together with its closure lies in $Q$ let there exist a number $\delta = \delta(Q') > 0$ such that for all $(t,x) \in Q'$, $\alpha \in A$, $\lambda \in E_d$

$$(a(\alpha,t,x)\lambda,\lambda) \ge \delta|\lambda|^2.$$

Then $F[v] = 0$ (a.e. on $Q$).

The proof of this theorem follows exactly the proof of Theorem 1; we need only use Theorem 3.1.6 instead of Theorem 3.1.11 where necessary.

We now formulate the conditions which, in addition to the assumptions made in Section 3.1, we assume to be satisfied in the remaining part of this section.
Let us introduce a vector $y^\alpha(t,x)$ of dimension $d \cdot d_1 + d + 4$, whose coordinates are given by the following variables: $\sigma^{ij}(\alpha,t,x)$ ($i = 1, \dots, d$, $j = 1, \dots, d_1$), $b^i(\alpha,t,x)$ ($i = 1, \dots, d$), $c^\alpha(t,x)$, $f^\alpha(t,x)$, $g(x)$, $g(t,x)$. For all $\alpha \in A$, $l \in E_d$ let the derivatives $y^\alpha_{(l)}(t,x)$, $y^\alpha_{(l)(l)}(t,x)$, $(\partial/\partial t)y^\alpha(t,x)$ exist and be continuous with respect to $(t,x)$ on $\bar H_T$. Assume that the derivatives mentioned (they are vectors) do not exceed $K(1+|x|)^m$ in norm for all $\alpha \in A$, $l \in E_d$, $(t,x) \in \bar H_T$. Also, it is convenient to assume that for all $\alpha \in A$, $x \in E_d$

$$|L^\alpha(T,x)g(x)| + |L^\alpha g(T,x)| \le K(1+|x|)^{2m}.$$

We note that the relationship between this assumption and the preceding ones was discussed in Section 4. We shall prove, under the combined assumptions indicated, that the functions $v$ and $w$ satisfy the corresponding Bellman equations in the region

$$Q^* = \{(t,x) \in H_T\colon \sup_{\alpha \in A}\,(a(\alpha,t,x)\lambda,\lambda) > 0 \text{ for all } \lambda \ne 0\}.$$

First we show that the set $Q^*$ is in fact a region. Let

$$\mu = \mu(t,x) = \inf_{|\lambda|=1}\, \sup_{\alpha \in A}\, n^\alpha(t,x)\,(a(\alpha,t,x)\lambda,\lambda).$$
3. Lemma. The function $\mu(t,x)$ is continuous on $[0,T] \times E_d$, the equality $Q^* = \{(t,x) \in H_T\colon \mu(t,x) > 0\}$ is satisfied, the set $Q^*$ is open, and the function $\mu^{-1}(t,x)$ is locally bounded in $Q^*$.
PROOF. The third and fourth assertions follow from the first and second, together with well-known properties of continuous functions. Further, the derivatives (with respect to $(t,x)$) of the functions $\sigma(\alpha,t,x)$, $b(\alpha,t,x)$, $c^\alpha(t,x)$, and $f^\alpha(t,x)$ are bounded on any set of the form $A \times [0,T] \times \{x\colon |x| \le R\}$. Therefore, these functions are continuous with respect to $(t,x)$ uniformly with respect to $\alpha$. By similar reasoning, the function $(a(\alpha,t,x)\lambda,\lambda)$ is continuous with respect to $(t,x)$ uniformly with respect to $\alpha \in A$, $\lambda \in \partial S_1$. Then the function $n^\alpha(t,x)(a(\alpha,t,x)\lambda,\lambda)$ is continuous with respect to $(t,x)$ uniformly with respect to $\alpha \in A$, $\lambda \in \partial S_1$. Furthermore, we note that the modulus of the difference between the lower (upper) bounds does not exceed the upper bound of the moduli of the differences. Therefore, if $(t_n,x_n) \to (t_0,x_0)$, then $\mu(t_n,x_n) \to \mu(t_0,x_0)$ by the definition of uniform continuity.
In order to prove the second assertion, we use the fact that, due to the inequality $n^\alpha(t,x) \le 1$, for $|\lambda| = 1$ we have $\sup_{\alpha \in A}(a(\alpha,t,x)\lambda,\lambda) \ge \mu(t,x)$. Hence, if $(t,x) \in H_T$ and $\mu(t,x) > 0$, then $(t,x) \in Q^*$. If $\mu(t,x) = 0$, there is a sequence $\lambda_n \in \partial S_1$ for which

$$\sup_{\alpha \in A} n^\alpha(t,x)\,(a(\alpha,t,x)\lambda_n,\lambda_n) \to 0.$$

Therefore, $(a(\alpha,t,x)\lambda_n,\lambda_n) \to 0$ for all $\alpha \in A$. We may assume without loss of generality that the sequence $\{\lambda_n\}$ converges to a limit, which we denote by $\lambda_0$. Then $(a(\alpha,t,x)\lambda_0,\lambda_0) = 0$ for all $\alpha \in A$. Therefore, $(t,x) \notin Q^*$, which completes the proof of the second assertion, thus proving the lemma.

4. Theorem. In $H_T$ (in the region $Q^*$) the functions $v(t,x)$, $w(t,x)$ have all generalized first (respectively, second) derivatives with respect to $x$ and a generalized first derivative with respect to $t$. The foregoing derivatives are locally bounded in $H_T$ (respectively, in $Q^*$). There exists a constant $N = N(K,m)$ such that for $u = v$ and for $u = w$, for any $l \in E_d$,

$$\left|\frac{\partial}{\partial t} u\right| + |u_{(l)}| \le N(1+|x|)^m e^{N(T-t)} \quad \text{(a.e. on } H_T\text{)}, \tag{2}$$

$$-N(1+|x|)^{3m} e^{N(T-t)} \le u_{(l)(l)} \le \frac{1}{\mu}\, N(1+|x|)^{3m} e^{N(T-t)} \quad \text{(a.e. on } Q^*\text{)}. \tag{3}$$
This theorem is, in a way, a summary of the results obtained in Sections 1–4. The existence of $(\partial/\partial t)u$, $u_{(l)}$ follows immediately from Theorem 1.1 and Theorem 4.3, from which, in addition, we have estimates of the foregoing derivatives. The existence of $(\partial/\partial t)u$ implies that the measures $(\partial/\partial t)u\,(dt\,dx)$ and $((\partial/\partial t)u)_-(dt\,dx)$ are absolutely continuous with respect to Lebesgue measure, and, furthermore, their Radon–Nikodym derivatives are equal to $(\partial/\partial t)u$ and $((\partial/\partial t)u)_-$, respectively. By Theorem 3.5b and Remark 3.6, all generalized second derivatives with respect to $x$ of the functions $v(t,x)$, $w(t,x)$ exist in $Q^*$. Further, as was shown in the proof of Theorem 3.5, the function $\mu(l)$ appearing in assertion (a) of that theorem is greater than $\mu$. Therefore, by Theorem 3.5 and Remark 3.6,

$$u_{(l)(l)} \le \frac{1}{\mu}\,\psi \quad \text{(a.e. on } Q^*\text{)},$$

where $\psi = N e^{N(T-t)}\big(1 + |\mathrm{grad}_x u| + |u|\big)(1+|x|)^{2m}$ is the right side of inequality (2.14). To complete the proof of inequality (3), it remains only to use inequality (2) and to recall (see Section 3.1) that $|u| \le N(1+|x|)^m e^{N(T-t)}$. The theorem is proved.
5. Theorem. $F[v] = 0$ (a.e. on $Q^*$), $F[w] \le 0$ (a.e. on $Q^*$), $F[w] = 0$ ($Q^* \cap \{(s,x)\colon w(s,x) > g(s,x)\}$-a.e.), and $w(s,x) \ge g(s,x)$ in the region $Q^*$. The assertions concerning $w$ can be written in short as follows:

$$(F[w] + w - g)_+ + g - w = 0 \quad \text{(a.e. on } Q^*\text{)}.$$
PROOF. According to Corollary 1.7, $F[v] \le 0$, $F[w] \le 0$ (a.e. on $Q^*$). We prove that $F[w] = 0$ almost everywhere in any bounded region $Q'$ which together with its closure lies in $Q^* \cap \{(t,x)\colon w(t,x) > g(t,x)\}$; this obviously suffices for proving the assertions of the theorem concerning $w$.

Let us make use of the approximation of degenerate processes by nondegenerate ones, which was described in Section 6. We take the matrix $\sigma_\varepsilon(\alpha,s,x)$, the process $x_t^{\alpha,s,x}(\varepsilon)$, and the function $w_\varepsilon(s,x)$ from Section 6. As was indicated in Section 6, the matrix $a_\varepsilon(\alpha,t,x) \equiv \frac{1}{2}\sigma_\varepsilon(\alpha,t,x)\sigma_\varepsilon^*(\alpha,t,x)$ is equal to $a(\alpha,t,x) + \frac{1}{2}\varepsilon^2 I$ and it satisfies inequality (6.2):

$$(a_\varepsilon(\alpha,t,x)\lambda,\lambda) \ge \tfrac{1}{2}\varepsilon^2 |\lambda|^2. \tag{4}$$

Hence for $\varepsilon \ne 0$ the set $Q^*$ associated with the matrix $a_\varepsilon$ coincides with $H_T$, which implies, according to Theorem 4, the existence of generalized first and second derivatives of $w_\varepsilon$ with respect to $x$ and of a generalized first derivative with respect to $t$, all locally bounded in $H_T$. By Theorem 1, due to (4), for $\varepsilon \ne 0$ the function $w_\varepsilon$ satisfies the equation (see (6.3))

$$F[w_\varepsilon] + \tfrac{1}{2}\varepsilon^2 \Delta w_\varepsilon = 0$$

almost everywhere in the region $\{(t,x) \in H_T\colon w_\varepsilon(t,x) > g(t,x)\}$. For all sufficiently small $\varepsilon$ these regions contain $Q'$. In fact, since $\bar Q' \subset \{(s,x) \in H_T\colon w(s,x) > g(s,x)\}$, the continuous function $w(s,x) - g(s,x) > 0$ on $\bar Q'$. Since the set $\bar Q'$ is compact, there exists a number $\delta > 0$ such that $w(s,x) - g(s,x) \ge \delta$ for $(s,x) \in \bar Q'$. By Theorem 6.1, $w_\varepsilon(s,x) - g(s,x) \to w(s,x) - g(s,x)$ as $\varepsilon \to 0$ uniformly on $\bar Q'$. Therefore, for all sufficiently small $\varepsilon$, on $Q'$ (even on $\bar Q'$) the inequality $w_\varepsilon(s,x) - g(s,x) \ge \delta/2$ is satisfied. From the above we conclude that for all sufficiently small $\varepsilon$

$$F[w_\varepsilon] \ge -\tfrac{1}{2}\varepsilon^2 \Delta w_\varepsilon \quad \text{(a.e. on } Q'\text{)}. \tag{5}$$

Using Theorem 5.1b we shall pass to the limit in (5). Before doing this, however, we need to estimate $\Delta w_\varepsilon$ and $(\partial/\partial t)w_\varepsilon$. Note that for $|\varepsilon| \le 1$ the matrix $\sigma_\varepsilon(\alpha,t,x)$ satisfies the same conditions as the matrix $\sigma(\alpha,t,x)$, with, however, a different constant $K$. Indeed, the norms of their derivatives with respect to $t$ and $x$ obviously coincide. Furthermore,

$$\|\sigma_\varepsilon(\alpha,t,x)\|^2 = \|\sigma(\alpha,t,x)\|^2 + \varepsilon^2 \le (K^2 + 1)(1+|x|)^2.$$
Hence, applying Theorem 4 to the function $w_\varepsilon$ for $|\varepsilon| \le 1$, $\varepsilon \ne 0$, we find a constant $N$, depending only on $K$ and $m$, for which

$$-N(1+|x|)^{3m} e^{N(T-t)} \le w_{\varepsilon(l)(l)} \le \frac{1}{\mu_\varepsilon}\, N(1+|x|)^{3m} e^{N(T-t)} \quad \text{(a.e. on } H_T\text{)} \tag{6}$$

for all $l \in E_d$, where

$$\mu_\varepsilon(t,x) = \inf_{|\lambda|=1}\, \sup_{\alpha \in A}\, n_\varepsilon^\alpha(t,x)\,(a_\varepsilon(\alpha,t,x)\lambda,\lambda),$$
$$n_\varepsilon^\alpha(t,x) = \big(1 + \operatorname{tr} a_\varepsilon(\alpha,t,x) + |b(\alpha,t,x)| + c^\alpha(t,x) + |f^\alpha(t,x)|\big)^{-1}.$$

Further, it is seen that $(a_\varepsilon(\alpha,t,x)\lambda,\lambda) \ge (a(\alpha,t,x)\lambda,\lambda)$. Since

$$\operatorname{tr} a_\varepsilon(\alpha,t,x) = \operatorname{tr} a(\alpha,t,x) + \frac{\varepsilon^2}{2}\, d \le \operatorname{tr} a(\alpha,t,x) + \varepsilon^2 d,$$

we have $n_\varepsilon^\alpha(t,x) \ge \big(1/(1+\varepsilon^2 d)\big)\, n^\alpha(t,x)$. Hence $\mu_\varepsilon \ge \mu/(1+\varepsilon^2 d)$. From (6) we conclude that

$$|w_{\varepsilon(l)(l)}| \le \Big[1 + \frac{1}{\mu}(1+\varepsilon^2 d)\Big]\, N(1+|x|)^{3m} e^{N(T-t)} \quad \text{(a.e. on } Q^*\text{)}.$$
By virtue of Lemma 3 the last expression is bounded on $Q'$ by a certain constant. Thus, there exists a constant $N$ such that for any $\varepsilon \in [-1,1]$, $\varepsilon \ne 0$, the inequality $|\Delta w_\varepsilon| \le N$ is satisfied almost everywhere on $Q'$. Theorem 4 also implies the uniform boundedness of $|(\partial/\partial t)w_\varepsilon|$ for $\varepsilon \in [-1,1]$. Next, we take the sequence $w_{1/n}$. The above arguments and (5) yield

$$F[w_{1/n}] \ge \inf_{n \ge 1} F[w_{1/n}] \ge -N \quad \text{(a.e. on } Q'\text{)},$$
$$\varliminf_{n \to \infty} F[w_{1/n}] \ge -\frac{1}{2}\, \varlimsup_{n \to \infty} \frac{1}{n^2}\, \Delta w_{1/n} \ge 0 \quad \text{(a.e. on } Q'\text{)}.$$

The former inequality allows us to assert that the function $\inf_n F[w_{1/n}] \in \mathscr L_{d+1}(Q')$ (it is bounded on $Q'$). The latter inequality together with Theorem 5.1b yields

$$F[w] \ge \varliminf_{n \to \infty} F[w_{1/n}] \ge 0 \quad \text{(a.e. on } Q'\text{)}.$$

Recalling that $F[w] \le 0$ (a.e. on $Q^*$), we obtain $F[w] = 0$ (a.e. on $Q'$). We have proved the theorem for the function $w$.

It remains only to prove that $F[v] = 0$ (a.e. on $Q^*$). Let us consider the functions $v_\varepsilon(s,x)$ introduced in Section 6. By inequality (4) and Theorem 4, for $\varepsilon \ne 0$ the generalized derivatives $(\partial/\partial t)v_\varepsilon(t,x)$, $v_{\varepsilon x^i}(t,x)$, $v_{\varepsilon x^i x^j}(t,x)$ exist and are locally bounded in $H_T$. By Theorem 2, for $\varepsilon \ne 0$

$$F[v_\varepsilon] + \tfrac{1}{2}\varepsilon^2 \Delta v_\varepsilon = 0 \quad \text{(a.e. in } H_T\text{)}.$$
We fix a certain bounded region $Q' \subset \bar Q' \subset Q^*$. In the same way as before, we estimate on $Q'$ the derivatives $v_{\varepsilon(l)(l)}$, $(\partial/\partial t)v_\varepsilon$ using Theorem 4. Then, using Theorem 5.1b, we can conclude that

$$F[v] \ge \varliminf_{n \to \infty} F[v_{1/n}] = \varliminf_{n \to \infty} \Big(-\frac{1}{2n^2}\, \Delta v_{1/n}\Big) \ge 0 \quad \text{(a.e. on } Q'\text{)}.$$

On the other hand, since $F[v] \le 0$ (a.e. on $Q^*$), $F[v] = 0$ (a.e. on $Q'$). Due to the arbitrariness of $Q'$, $F[v] = 0$ (a.e. on $Q^*$). The theorem is proved.

6. Remark. Inequality (6), together with the estimate of $\mu_\varepsilon$ given in the preceding proof, shows that for all $\varepsilon \in [-1,1]$, $\varepsilon \ne 0$, $l \in E_d$,

$$\frac{\mu}{1+\mu}\, |w_{\varepsilon(l)(l)}| \le (1+\varepsilon^2 d)\, N(1+|x|)^{3m} e^{N(T-t)} \quad \text{(a.e. on } Q^*\text{)}, \tag{7}$$
where $N = N(K,m)$. From (6) it follows that inequality (7) holds, in fact, almost everywhere in $H_T$, since the function $\mu = 0$ outside $Q^*$. By Theorem 4, inequality (7) holds as well for $\varepsilon = 0$. Entirely similarly, for all $\varepsilon \in [-1,1]$, $l \in E_d$,

$$\frac{\mu}{1+\mu}\, |v_{\varepsilon(l)(l)}| \le (1+\varepsilon^2 d)\, N(1+|x|)^{3m} e^{N(T-t)} \quad \text{(a.e. on } Q^*\text{)},$$

where $N = N(K,m)$.

In the case when for all $(t,x) \in \bar H_T$ and $\lambda \ne 0$

$$\sup_{\alpha \in A}\,(a(\alpha,t,x)\lambda,\lambda) > 0, \tag{8}$$

the set $Q^*$ coincides with $H_T$, and, in addition, the continuous function $\mu(s,x) > 0$ at each point of $[0,T] \times E_d$. Hence, the function $\mu^{-1}$ is bounded on each cylinder $C_{T,R}$. Also, the above arguments show that the derivatives $w_{\varepsilon(l)(l)}$ and $v_{\varepsilon(l)(l)}$ are bounded (a.e.) in each cylinder $C_{T,R}$ by a constant not depending on $\varepsilon$. The same remark applies to the mixed derivatives $w_{\varepsilon(l_1)(l_2)}$ and $v_{\varepsilon(l_1)(l_2)}$, which, as we know, can readily be expressed in terms of $w_{\varepsilon(l_1+l_2)(l_1+l_2)}$, $w_{\varepsilon(l_1-l_2)(l_1-l_2)}$, $v_{\varepsilon(l_1+l_2)(l_1+l_2)}$, and $v_{\varepsilon(l_1-l_2)(l_1-l_2)}$.

The next theorem follows immediately from Theorems 4 and 5, from the results obtained in Section 3.1 on the continuity of $v$ and $w$ and on the estimates of $|v|$ and $|w|$, and, finally, from the remarks made above about the properties of $\mu$ when condition (8) is satisfied. Recall that the assumptions made in Section 3.1 and the assumptions about the smoothness of $\sigma$, $b$, $c$, $f$, $g(x)$, and $g(t,x)$ formulated before Lemma 3 are assumed to be satisfied.

7. Theorem. For all $(t,x) \in \bar H_T$ and $\lambda \ne 0$ let inequality (8) be satisfied (i.e., $\mu(t,x) > 0$). Then the functions $v(t,x)$ and $w(t,x)$ are continuous in $\bar H_T$ and have in $H_T$ all generalized first and second derivatives with respect to $x$ and a generalized first derivative with respect to $t$. These derivatives are bounded in each cylinder $C_{T,R}$. There exists a constant $N = N(K,m)$ such that for all $(t,x) \in \bar H_T$

$$|v(t,x)| \le N(1+|x|)^m e^{N(T-t)}, \qquad |w(t,x)| \le N(1+|x|)^m e^{N(T-t)}.$$
Finally,

a. $F[v] = 0$ (a.e. in $H_T$), $v(T,x) = g(x)$;
b. $F[w] \le 0$ (a.e. in $H_T$), $w(t,x) \ge g(t,x)$ for $(t,x) \in \bar H_T$, $F[w] = 0$ almost everywhere on the set $\{(t,x) \in H_T\colon w(t,x) > g(t,x)\}$, $w(T,x) = g(T,x)$.
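The assertion "$F[v] = 0$ a.e. with $v(T,x) = g(x)$" is exactly what a numerical scheme for the Bellman equation discretizes. The sketch below (Python; the one-dimensional model, the two-point control set, and all parameter values are assumptions for illustration, not the book's setting) solves $\partial v/\partial t + \sup_\alpha a(\alpha)\, v_{xx} = 0$, $v(T,x) = x^2$, by an explicit finite-difference scheme marching backward from $t = T$; for this convex terminal data the supremum always picks the larger diffusion coefficient and the exact solution is $v(t,x) = x^2 + 2 a_{\max}(T-t)$:

```python
import numpy as np

# Explicit backward-in-time scheme for v_t + sup_a [a * v_xx] = 0, v(T,x) = x^2,
# with the diffusion coefficient a(alpha) ranging over the finite set {0.25, 0.5}.
# Exact solution for this convex terminal data: v(t,x) = x^2 + 2*0.5*(T - t).
T, L, dx, dt = 1.0, 3.0, 0.1, 0.004      # dt*a_max/dx^2 = 0.2 <= 1/2: stable
xs = np.arange(-L, L + dx/2, dx)
a_vals = (0.25, 0.5)                     # a(alpha) over a finite control set
v = xs**2                                # terminal condition v(T, x) = g(x)
for _ in range(int(round(T / dt))):
    d2 = (v[2:] - 2.0*v[1:-1] + v[:-2]) / dx**2
    sup_term = np.maximum(a_vals[0]*d2, a_vals[1]*d2)  # sup over the controls
    v = v.copy()
    v[1:-1] += dt * sup_term             # one backward Euler step in time
center = len(xs) // 2                    # grid point x = 0
print(v[center])                         # exact value at (t,x) = (0,0) is 1.0
```

Boundary values are frozen at $g$, which introduces only a small error near $x = 0$ for this domain size.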
It follows from this theorem, in particular, that the derivatives $(\partial/\partial t)v$, $(\partial/\partial t)w$, $v_{x^i}$, $w_{x^i}$, $v_{x^ix^j}$, and $w_{x^ix^j}$ are summable over any cylinder $C_{T,R}$ to any power. Using embedding theorems (see [47, Chapter II, Lemma 3.3]) we deduce from the foregoing the following.

8. Corollary. Under the assumptions of Theorem 7, $\mathrm{grad}_x v(t,x)$ and $\mathrm{grad}_x w(t,x)$ are continuous in $\bar H_T$. Moreover, for any $R > 0$, $\lambda \in (0,1)$ there is a constant $N$ such that for $|x_1|, |x_2| \le R$ and $t, t_1, t_2 \in [0,T]$ the inequalities

$$|\mathrm{grad}_x u_i(t,x_1) - \mathrm{grad}_x u_i(t,x_2)| \le N|x_1 - x_2|^\lambda, \quad i = 1, 2,$$
$$|\mathrm{grad}_x u_i(t_1,x) - \mathrm{grad}_x u_i(t_2,x)| \le N|t_1 - t_2|^{\lambda/2}, \quad i = 1, 2,$$

are satisfied, where $u_1 = v$, $u_2 = w$.
Further, since the nonnegative function $w(t,x) - g(t,x)$ is continuously differentiable with respect to $x$, its derivatives with respect to $x$ vanish at the points of $H_T$ at which this function vanishes. This implies

9. Corollary. Under the assumptions of Theorem 7, the smooth pasting condition

$$\mathrm{grad}_x\, w(t,x) = \mathrm{grad}_x\, g(t,x)$$

is satisfied everywhere on the set $\{(t,x) \in \bar H_T\colon w(t,x) = g(t,x)\}$ and, in particular, on the boundary of this set.
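Smooth pasting can be observed numerically. The sketch below (Python; the obstacle, the discount, and the grid are assumed toy data, not an example from the book) solves a one-dimensional optimal-stopping problem $\max(a\,w_{xx} - c\,w,\; g - w) = 0$ backward in time by an explicit scheme with pointwise projection on the obstacle, then checks that the slope of $w$ on the continuation side of the free boundary matches $g' = -1$:

```python
import numpy as np

# Toy obstacle problem: dx = dw (a = 1/2), discount c = 1, obstacle
# g(x) = (1 - x)_+, horizon T = 1.  Backward explicit scheme with projection
# on the obstacle; smooth pasting predicts w_x = g' = -1 at the free boundary.
T, a, c = 1.0, 0.5, 1.0
dx, dt = 0.05, 0.002                     # dt <= dx^2/(2a): explicit scheme stable
xs = np.arange(-3.0, 3.0 + dx/2, dx)
g = np.maximum(1.0 - xs, 0.0)
w = g.copy()
for _ in range(int(round(T / dt))):
    d2 = (w[2:] - 2.0*w[1:-1] + w[:-2]) / dx**2
    w[1:-1] = np.maximum(g[1:-1], w[1:-1] + dt*(a*d2 - c*w[1:-1]))
# right edge of the stopping (contact) region {w = g}, boundary cells excluded
contact = np.where(w - g < 1e-9)[0]
contact = contact[(contact > 0) & (contact < len(xs) - 1)]
i_star = contact.max()
slope = (w[i_star + 1] - w[i_star]) / dx # one-sided slope, continuation side
print(xs[i_star], slope)                 # slope should be close to g' = -1
```

The one-sided difference at the free boundary reproduces $-1$ up to $O(\Delta x)$, in line with Corollary 9.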
10. Remark. The assertions of Theorems 5 and 7 remain valid if, in formulating the conditions imposed on $y^\alpha$ after Theorem 2, we do not require continuity of $y^\alpha_{(l)}$, $y^\alpha_{(l)(l)}$, and $(\partial/\partial t)y^\alpha$, provided that by these derivatives we mean generalized derivatives, and if, finally, we drop the condition

$$|L^\alpha(T,x)g(x)| + |L^\alpha g(T,x)| \le K(1+|x|)^{2m}.$$

We shall explain this. In fact, it suffices to show that Theorem 4 still holds if we replace $K$, $m$ by some other constants in the formulation of the theorem. Let us smooth the coordinates of $y^\alpha(t,x)$, setting $y^\alpha(t,x) = y^\alpha(0,x)$ for $t \le 0$. Furthermore, using the vector $y^\alpha(t,x,\varepsilon) \equiv [y^\alpha(t,x)]^{(\varepsilon)}$, let us construct the payoff functions $v(t,x,\varepsilon)$ and $w(t,x,\varepsilon)$. For $0 < \varepsilon < 1$ the vector $y^\alpha(t,x,\varepsilon)$ satisfies all the conditions formulated after Theorem 2, with constants $K'$ and $m'$ which do not depend on $\varepsilon$ since, for example, $y^\alpha_{(l)}(t,x,\varepsilon) = [y^\alpha_{(l)}(t,x)]^{(\varepsilon)}$. Hence Theorem 4 holds true for the functions $v(t,x,\varepsilon)$ and $w(t,x,\varepsilon)$, with $K$, $m$, $\mu$, and $Q^*$ replaced by $K'$, $m'$, $\mu(t,x,\varepsilon)$, and $Q^*(\varepsilon)$, respectively, the latter two constructed on the basis of $y^\alpha(t,x,\varepsilon)$.
Next, in each cylinder $C_{T,R}$ the vector $y^\alpha(t,x)$ satisfies a Lipschitz condition with respect to $(t,x)$ with a constant not depending on $\alpha$, since the estimates of the generalized derivatives $y^\alpha_{(l)}$ and $(\partial/\partial t)y^\alpha$ do not depend on $\alpha$. This readily implies that $y^\alpha(t,x,\varepsilon) \to y^\alpha(t,x)$ as $\varepsilon \to 0$ uniformly on $A \times C_{T,R}$ for each $R$. In particular, $\mu(t,x,\varepsilon) \to \mu(t,x)$. Furthermore, by Theorem 3.1.12 and Corollary 3.1.13, $v(t,x,\varepsilon) \to v(t,x)$ and $w(t,x,\varepsilon) \to w(t,x)$ as $\varepsilon \to 0$ uniformly on each cylinder $C_{T,R}$. The convergence of the payoff functions and the estimate, uniform with respect to $\varepsilon \in (0,1)$, given in Theorem 4 of their generalized first derivatives with respect to $(t,x)$ enable us, as was mentioned in Section 2.1, to prove the existence of, and to estimate, the generalized first derivatives of $v(t,x)$ and $w(t,x)$ with respect to $(t,x)$. We can estimate $v_{(l)(l)}(t,x)$ and $w_{(l)(l)}(t,x)$ in the same way, if we take advantage of the fact that $\mu(t,x,\varepsilon) \ge \frac{1}{2}\mu(t,x) > 0$ for all sufficiently small $\varepsilon$, due to the uniform convergence of $\mu(t,x,\varepsilon)$ to $\mu(t,x)$ in each bounded region $Q'$ which together with its closure lies in $Q^*$.
Notes

This chapter uses the methods and results of [34, 36, 37] and [58, 59]. For the control of jump processes, see Pragarauskas [63].

Section 1. If the set $A$ consists of a single point, i.e., if we consider a diffusion process, it is possible to regard the functions $v$, $-v$ as payoff functions. Therefore, in Theorem 4 we then have equality instead of inequality. Similar assertions can be found in Freidlin [18]. Theorem 8 is a generalization of a result obtained in [38, 62].

Sections 2, 3. The method applied in these sections, involving the derivatives in the sense of Definition 2.1.2, enables us to do without the theorems on interior smoothness of solutions of elliptic as well as parabolic equations, that is, the theorems used by Krylov, Nisio, and Pragarauskas (see the references listed above).

Section 4. In Theorem 2.9.10 the differentiability of $v(t,x)$ with respect to $t$ is derived from the existence of second derivatives of $\sigma$, $b$, $c$, $f$ with respect to $x$. Exercise 4 shows that, in the presence of control, in order to estimate $(\partial/\partial t)v(t,x)$ we need to require that the derivatives of $\sigma$, $b$, $c$, $f$ with respect to $t$ exist.

Section 5. The results obtained in this section for the time-homogeneous case can be found in [34]. It is well known that the limit of harmonic functions is harmonic, and the associated theory has much in common with the theory developed in this section.

Section 6. The relationship between the payoff functions associated with a controlled process and those of its nondegenerate approximations is investigated in Fleming [16], Krylov [37], and Tobias [74].

Section 7. The fact that a payoff function satisfies the Bellman equation implies, in particular, that the Bellman equation is solvable. It is interesting to note that differential equations theory suggests no (other) methods for proving the solvability of the Bellman equations in question. The smooth pasting condition (Corollary 9) was first introduced by Shiryayev (see [69]).
5 The Construction of ε-Optimal Strategies

The main objective of investigating a controlled process, approached from a practical point of view, is to construct optimal strategies or strategies close to optimal. In this chapter we show how one can find ε-optimal strategies in the optimal control problems which were discussed in Chapters 3 and 4. Recall that we proved in Chapter 3 that one can always find ε-optimal strategies in the class of natural strategies. In this chapter we focus our attention on constructing Markov (see Definition 3.1.3) ε-optimal strategies, which is of interest from a practical point of view due to the simplicity of Markov strategies. Adjoint Markov strategies (see Definition 3.1.7), which are also investigated in this chapter, are somewhat more complex, and thereby less applicable in engineering than Markov strategies. However, from a theoretical point of view adjoint Markov strategies are more convenient in some respects than Markov strategies. Considering adjoint Markov strategies, we prove in Section 3 that a solution of the Bellman equation is a payoff function. In the arguments of this chapter, the results obtained in Chapter 4 on payoff functions satisfying the Bellman equation play an essential role. Throughout Chapter 5 we use the assumptions, definitions, and notation given in Section 3.1.
1. ε-Optimal Markov Strategies and the Bellman Equation

We showed in Sections 1.1, 1.4, and 1.5 the way to construct ε-optimal strategies given the knowledge of the payoff function. In this section and
the sequel we carry out the construction of these strategies in the following cases:

a. for each $R > 0$ there exists a number $\delta_R > 0$ such that for all $\alpha \in A$, $(t,x) \in C_{T,R}$, $\lambda \in E_d$ the inequality $(a(\alpha,t,x)\lambda,\lambda) \ge \delta_R|\lambda|^2$ is satisfied;
b. for all $t \in [0,T]$, $x \in E_d$, $\lambda \ne 0$,
$$\sup_{\alpha \in A}\,(a(\alpha,t,x)\lambda,\lambda) > 0;$$
c. $\sigma(\alpha,t,x)$ does not depend on $x$.

The technique for finding ε-optimal Markov strategies in case (a) is given in this section. For (b) and (c) we shall prove the existence of ε-optimal Markov strategies and construct randomized ε-optimal Markov strategies in the subsequent sections. Case (c) incorporates the control of a completely deterministic process, when $\sigma(\alpha,t,x) \equiv 0$.

In this section, in addition to the assumptions made in Section 3.1, we impose the following conditions. Let $A$ be a convex set in a Euclidean space. Also, for each $(t,x) \in \bar H_T$ let the functions $\sigma(\alpha,t,x)$ and $b(\alpha,t,x)$ satisfy a Lipschitz condition with respect to $\alpha$, namely, for all $\alpha, \beta \in A$, $(t,x) \in \bar H_T$ let

$$\|\sigma(\alpha,t,x) - \sigma(\beta,t,x)\| + |b(\alpha,t,x) - b(\beta,t,x)| \le K|\alpha - \beta|.$$

Furthermore, we introduce a vector $y^\alpha(t,x)$ of dimension $d \cdot d_1 + d + 4$ whose coordinates are given by the following variables: $\sigma^{ij}(\alpha,t,x)$ ($i = 1, \dots, d$, $j = 1, \dots, d_1$), $b^i(\alpha,t,x)$ ($i = 1, \dots, d$), $c^\alpha(t,x)$, $f^\alpha(t,x)$, $g(x)$, $g(t,x)$. We assume that for each $\alpha \in A$, $l \in E_d$ the derivatives $y^\alpha_{(l)}(t,x)$, $y^\alpha_{(l)(l)}(t,x)$, $(\partial/\partial t)y^\alpha(t,x)$ exist and are continuous with respect to $(t,x)$ on $\bar H_T$. In addition, for all $\alpha \in A$, $l \in E_d$, $(t,x) \in \bar H_T$ let these derivatives not exceed $K(1+|x|)^m$ in norm. Finally, we assume for the sake of convenience that for all $\alpha \in A$, $x \in E_d$

$$|L^\alpha(T,x)g(x)| + |L^\alpha g(T,x)| \le K(1+|x|)^{2m}.$$

As was shown in Section 4.4, we can always get rid of the last assumption by choosing appropriate constants $K$ and $m$ in the other assumptions made above. We regard the foregoing assumptions as satisfied throughout this section.

1. Lemma. Let $Q \subset H_T$ be a bounded region and let a function $u \in W^{1,2}(Q)$. For a function $\alpha = \alpha(t,x)$ with values in $A$ let

$$h^\alpha(t,x) = F[u](t,x) - L^{\alpha(t,x)} u(t,x) - f^{\alpha(t,x)}(t,x).$$
We assert that for each $\varepsilon > 0$ one can find a function $\alpha(t,x)$, given on $(-\infty,\infty) \times E_d$, which is infinitely differentiable with respect to $(t,x)$, has values in $A$, and is such that $\|h^\alpha\|_{d+1,Q} \le \varepsilon$ and, for some constant $N$,

$$\|\sigma(\alpha(t,x),t,x) - \sigma(\alpha(t,y),t,y)\| + |b(\alpha(t,x),t,x) - b(\alpha(t,y),t,y)| \le N|x-y|$$

for all $x, y \in E_d$, $t \ge 0$.
PROOF. We fix $\varepsilon > 0$ and proceed in the same way as in the proof of Lemma 1.4.9. We choose a countable subset $\{\alpha(i)\colon i \ge 1\}$ which is everywhere dense in $A$. From the equality

$$F[u] = \sup_i\,\big[L^{\alpha(i)} u + f^{\alpha(i)}\big] = \lim_{n \to \infty} \max_{i \le n}\,\big[L^{\alpha(i)} u + f^{\alpha(i)}\big]$$

and the boundedness of $Q$ we readily derive the existence of a measurable function $\bar\alpha(t,x)$ which assumes only a finite number of values from $\{\alpha(i)\}$ and is such that $\|h^{\bar\alpha}\|_{d+1,Q} \le \varepsilon/2$. Assume that $\bar\alpha(t,x)$ is defined everywhere in $E_{d+1}$ and is equal to $\alpha(1)$ outside $Q$. We take the smoothing kernels $n^{d+1}\zeta(nt,nx)$ and let $\alpha_n(t,x) = n^{d+1}\zeta(nt,nx) * \bar\alpha(t,x)$. As has repeatedly been noted, the $\alpha_n(t,x)$ are infinitely differentiable and $\alpha_n \to \bar\alpha$ (a.e.). Moreover, $\alpha_n(t,x) \in A$ for all $(t,x)$ because $A$ is convex. Further, it follows from the continuity of $f^\alpha$ and of the coefficients of $L^\alpha$ with respect to $\alpha$ that $h^{\alpha_n} \to h^{\bar\alpha}$ (a.e. on $Q$). Due to the boundedness of $f^\alpha$ and the coefficients of $L^\alpha$ on $Q$, there exists a constant $N$ for which $|h^\alpha| \le N$ for all $\alpha \in A$ everywhere on $Q$. Hence the totality of the functions $h^{\alpha_n}$ is bounded by one function in $\mathscr L_{d+1}(Q)$. By the Lebesgue theorem, $\|h^{\alpha_n}\|_{d+1,Q} \to \|h^{\bar\alpha}\|_{d+1,Q}$. Therefore, there exists a number $n(\varepsilon)$ such that $\|h^{\alpha_{n(\varepsilon)}}\|_{d+1,Q} \le \varepsilon$.

Next, we let $\alpha(t,x) = \alpha_{n(\varepsilon)}(t,x)$ and prove that the function $\alpha(t,x)$ is the sought function. We embed the region $Q$ in a cylinder $C_{T,R}$. The function $\zeta(t,x)$ is equal to zero for $|x| > 1$; outside of $Q$, and therefore outside of $C_{T,R}$, $\bar\alpha = \alpha(1)$. It is easy to derive from the foregoing properties of the functions $\zeta$ and $\bar\alpha$ that $\alpha_n(t,x) = \alpha(1)$ for $|x| > R + 1$ for all $n$. In particular, $\alpha(t,x) = \alpha(1)$ for $|x| > R + 1$. For similar reasons, $\alpha(t,x) = \alpha(1)$ for $t < -1$ and for $t > T + 1$. Due to the continuous differentiability of $\alpha(t,x)$ we have

$$N_1 = \sup_{t,x}\, \sup_{|l| = 1}\, |\alpha_{(l)}(t,x)| < \infty.$$
Finally, let us show that for all $x, y \in E_d$, $t \ge 0$,

$$\|\sigma(\alpha(t,x),t,x) - \sigma(\alpha(t,y),t,y)\| + |b(\alpha(t,x),t,x) - b(\alpha(t,y),t,y)| \le K(1+N_1)|x-y|.$$

We have

$$\|\sigma(\alpha(t,x),t,x) - \sigma(\alpha(t,y),t,y)\| \le K|\alpha(t,x) - \alpha(t,y)| + K|x-y| \le K(1+N_1)|x-y|.$$

We can estimate the corresponding difference for the functions $b$ in a similar way. The lemma is proved.

Note that the function $\alpha(t,x)$ whose existence is asserted in the lemma depends on the combination $Q$, $u$, $\varepsilon$. In the case where $Q$ is a subregion of $H_T$, $u \in W^{1,2}(Q)$, and $\varepsilon > 0$, we denote the combination $Q$, $u$, $\varepsilon$ by $p$. It is convenient to write the function $\alpha(t,x)$ constructed on the basis of $p$ as $\alpha[p](t,x)$. For a fixed $s_0 \in [0,T]$ and the function $\alpha[p](t,x)$ we can define the Markov strategy $\alpha[p]$ using the formula

$$\alpha_t[p] = \alpha[p](s_0+t,\, x_t). \tag{1}$$

Since the functions $\sigma(\alpha[p](s_0+t, x), s_0+t, x)$, $b(\alpha[p](s_0+t, x), s_0+t, x)$ satisfy the Lipschitz condition with respect to $x$, the Markov strategy $\alpha[p]$ is admissible at the point $(s_0,x)$ for each $x \in E_d$.
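The mollification step in the proof of Lemma 1 — replacing a measurable control $\bar\alpha$ by the convolution $\alpha_n = n^{d+1}\zeta(n\,\cdot) * \bar\alpha$, which stays in $A$ by convexity — can be sketched in one space variable as follows (Python; the bang-bang feedback and the grid are assumptions for illustration):

```python
import numpy as np

# One-dimensional sketch of the smoothing in Lemma 1: a measurable feedback
# alpha_bar with values in the convex set A = [0,1] is convolved with a
# C^infinity bump kernel zeta_n(x) = n*zeta(n*x).  Convexity of A keeps the
# mollified feedback in A; smoothness costs a Lipschitz constant growing in n.
def mollify(alpha_bar, dx, n):
    kx = np.arange(-0.2, 0.2 + dx/2, dx)             # kernel window around 0
    inside = np.abs(n * kx) < 1.0
    k = np.where(inside, np.exp(-1.0/np.where(inside, 1.0 - (n*kx)**2, 1.0)), 0.0)
    k /= k.sum() * dx                                # normalize: integral = 1
    return np.convolve(alpha_bar, k, mode="same") * dx

dx = 0.001
xs = np.arange(-2.0, 2.0, dx)
alpha_bar = (xs > 0.0).astype(float)     # bang-bang feedback, values in {0,1}
alpha_n = mollify(alpha_bar, dx, n=10)   # smooth feedback; kernel support 1/10
print(alpha_n.min(), alpha_n.max())
```

The mollified feedback agrees with $\bar\alpha$ away from the switching point, and its values never leave $[0,1]$.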
2. Theorem. For each $R > 0$ let there exist a number $\delta_R > 0$ such that for all $\alpha \in A$, $(t,x) \in C_{T,R}$, $\lambda \in E_d$,

$$(a(\alpha,t,x)\lambda,\lambda) \ge \delta_R|\lambda|^2.$$

Then $v_{(M)} = v$ on $\bar H_T$. Furthermore, we fix $s_0 \in [0,T]$, put $p = (C_{T,R}, v, \varepsilon)$, and, finally, define the Markov strategy $\alpha[p]$ using Eq. (1). Then for each $x \in E_d$

$$\lim_{R \to \infty}\, \lim_{\varepsilon \downarrow 0}\, v^{\alpha[p]}(s_0,x) = v(s_0,x).$$

PROOF. First we note that since $v^{\alpha[p]}(s_0,x) \le v_{(M)}(s_0,x) \le v(s_0,x)$, the first assertion follows from the second. We have from Theorem 4.7.7 that $v \in W^{1,2}(C_{T,R})$ for all $R$; therefore, the strategy $\alpha[p]$ is defined. Furthermore, $F[v] = 0$ ($H_T$-a.e.). Therefore, by the definition of the function $\alpha[p](t,x)$,

$$\|h^p\|_{d+1,\, C_{T,R}} \le \varepsilon,$$

where

$$h^p = F[v] - L^{\alpha[p](t,x)} v - f^{\alpha[p](t,x)} = -\big(L^{\alpha[p](t,x)} v + f^{\alpha[p](t,x)}\big).$$
According to Itô's formula,

$$v(s_0,x) = M^{\alpha[p]}_{s_0,x}\left[ v(s_0+\tau_{T,R},\, x_{\tau_{T,R}})\, e^{-\varphi_{\tau_{T,R}}} + \int_0^{\tau_{T,R}} f^{\alpha[p]}(s_0+t, x_t)\, e^{-\varphi_t}\,dt \right] + M^{\alpha[p]}_{s_0,x} \int_0^{\tau_{T,R}} h^p(s_0+t, x_t)\, e^{-\varphi_t}\,dt, \tag{2}$$

where

$$\tau_{T,R} = \inf\{t \ge 0\colon (s_0+t, x_t) \notin [0,T) \times S_R\}.$$

By Theorem 2.2.4, the absolute value of the last mathematical expectation does not exceed $N\|h^p\|_{d+1,C_{T,R}} \le N\varepsilon$, where $N$ does not depend on $\varepsilon$; hence it tends to zero as $\varepsilon \downarrow 0$. Further, by virtue of the equality $v(T,x) = g(x)$, the first expression on the right side of (2) differs from $v^{\alpha[p]}(s_0,x)$ only by terms supported on the set $\{\tau_{T,R} < T - s_0\}$. For proving the theorem it therefore suffices to show that these last terms tend to zero as $R \to \infty$ uniformly with respect to $\varepsilon$. By virtue of the growth estimates, as $|x| \to \infty$, of the functions $f^\alpha(t,x)$, $g(x)$, $v(t,x)$, to do as indicated we need only prove that

$$\lim_{R \to \infty}\, \sup_{\varepsilon}\, M^{\alpha[p]}_{s_0,x}\, \chi_{\{\tau_{T,R} < T - s_0\}}\, \big(1 + |x_{\tau_{T,R}}|\big)^{2m} = 0. \tag{3}$$

Note that on the set $\{\tau_{T,R} < T - s_0\}$ the inequality $|x_{\tau_{T,R}}| \ge R$ is satisfied. Hence

$$M^{\alpha[p]}_{s_0,x}\, \chi_{\{\tau_{T,R} < T - s_0\}}\, \big(1 + |x_{\tau_{T,R}}|\big)^{2m} \le \frac{1}{R^{2m}}\, M^{\alpha[p]}_{s_0,x}\, \Big(1 + \sup_{t \le T - s_0} |x_t|\Big)^{4m}.$$

Furthermore, (3) follows from the estimates of moments of solutions of stochastic equations, thus completing the proof of the theorem.
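Theorem 2 says that the Markov feedback built from the payoff function is nearly optimal; on a toy problem this can be checked directly by simulation. Below (Python; the dynamics $dx = \alpha\,dt + dw$ with $A = [-1,1]$ and payoff $g(x) = x_T$ are an assumed example, not from the book), the Bellman maximizer is $\alpha \equiv 1$ because $v_x \equiv 1$, and an Euler–Monte Carlo run under this Markov feedback reproduces $v(s_0,x_0) = x_0 + (T - s_0)$:

```python
import numpy as np

# Euler simulation under the Markov feedback alpha(t,x) = argmax_{|a|<=1} a*v_x
# for the toy problem dx = alpha dt + dw, payoff x_T; here v(t,x) = x + (T - t),
# so the argmax feedback is identically 1.
rng = np.random.default_rng(1)
T, s0, x0 = 1.0, 0.0, 0.0
n_steps, n_paths = 200, 20_000
dt = (T - s0) / n_steps
feedback = lambda t, x: np.ones_like(x)  # argmax feedback for this toy v
x = np.full(n_paths, x0)
for i in range(n_steps):
    t = s0 + i*dt
    x = x + feedback(t, x)*dt + np.sqrt(dt)*rng.standard_normal(n_paths)
v_alpha = x.mean()                       # Monte Carlo estimate of v^{alpha[p]}
print(v_alpha)                           # payoff function value v(s0,x0) = 1.0
```

The Monte Carlo estimate of $v^{\alpha[p]}(s_0,x_0)$ matches $v(s_0,x_0)$ up to sampling error, as the theorem predicts for $\varepsilon \downarrow 0$.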
3. Theorem. Suppose the assumption made in the preceding theorem is satisfied. Then $w_{(M)} = w$ on $\bar H_T$. Moreover, we fix $s_0 \in [0,T]$, put $p = (C_{T,R}, w, \varepsilon)$, define the Markov strategy $\alpha[p]$ using Eq. (1), and, finally, denote by $\tau_0 = \tau^{\alpha[p],s_0,x}_{Q_0}$ the time of first exit of the process $(s_0+t,\, x_t^{\alpha[p],s_0,x})$ from $Q_0 = \{(t,y) \in H_T\colon w(t,y) > g(t,y)\}$. Then for each $x \in E_d$

$$\lim_{R \to \infty}\, \lim_{\varepsilon \downarrow 0}\, v^{\alpha[p],\tau_0}(s_0,x) = w(s_0,x).$$
PROOF. This proof follows closely the preceding one. By Theorem 4.7.7, $F[w] = 0$ ($Q_0$-a.e.). Therefore, by the definition of $\alpha[p](t,x)$,

$$\|h^p\|_{d+1,\, Q_0 \cap C_{T,R}} \le \varepsilon,$$

where $h^p = F[w] - L^{\alpha[p](t,x)} w - f^{\alpha[p](t,x)}$. According to Itô's formula,

$$w(s_0,x) = M^{\alpha[p]}_{s_0,x}\left[ w(s_0+\tau_{T,R},\, x_{\tau_{T,R}})\, e^{-\varphi_{\tau_{T,R}}} + \int_0^{\tau_{T,R}} f^{\alpha[p]}(s_0+t, x_t)\, e^{-\varphi_t}\,dt \right] + M^{\alpha[p]}_{s_0,x} \int_0^{\tau_{T,R}} h^p(s_0+t, x_t)\, e^{-\varphi_t}\,dt, \tag{4}$$

where

$$\tau_{T,R} = \inf\{t \ge 0\colon (s_0+t, x_t) \notin Q_0 \cap ([0,T) \times S_R)\}.$$

By Theorem 2.2.4, the last term does not exceed $N\|h^p\|_{d+1,\,Q_0 \cap C_{T,R}}$ in absolute value and tends to zero as $\varepsilon \downarrow 0$. The first term in (4) differs from $v^{\alpha[p],\tau_0}(s_0,x)$ by mathematical expectations which tend to zero as $R \to \infty$ uniformly with respect to $\varepsilon$; this can be proved in the same way as the corresponding fact from the preceding proof, since $\tau_{T,R} \le \tau_0 \le T - s_0$ and since, in addition, the inequality $\tau_{T,R} < T - s_0$ is satisfied on the set $\{\tau_{T,R} < \tau_0\}$. The analysis of Eq. (4) carried out above implies the assertions of the theorem, thus completing the proof.
¹ The superscripts $\alpha[p]$, $s_0$, $x$ are omitted here and below in the proof.

2. ε-Optimal Markov Strategies. The Bellman Equation in the Presence of Degeneracy

Theorems 1.2 and 1.3 provide the technique for finding ε-optimal Markov strategies if the strong nondegeneracy condition is satisfied: $(a(\alpha,t,x)\lambda,\lambda) \ge \delta_R|\lambda|^2$ for all $\alpha \in A$, $(t,x) \in C_{T,R}$, $\lambda \in E_d$, $R > 0$, where $\delta_R > 0$. If we reject this condition, we do not know how to construct ε-optimal "pure" Markov strategies. In some cases considered in this section it is possible, however, to construct ε-optimal "mixed" Markov strategies making no assumption about nondegeneracy. In the cases mentioned we prove the existence of (usual) ε-optimal Markov strategies.
In this section we assume that the assumptions made in the preceding section are satisfied. In particular, we assume that $A$ is a convex set in a certain Euclidean space. We denote by $(s_0,x_0)$ a fixed point of $\bar H_T$. In addition to the basic $d_1$-dimensional Wiener process $(w_t, \mathscr F_t)$ we shall need a $d$-dimensional Wiener process $(\tilde w_t, \tilde{\mathscr F}_t)$ as well as a $(d+d_1)$-dimensional Wiener process $(\hat w_t, \hat{\mathscr F}_t)$. We assume that the processes listed can be defined on the probability spaces $(\Omega,\mathscr F,P)$, $(\tilde\Omega,\tilde{\mathscr F},\tilde P)$, $(\hat\Omega,\hat{\mathscr F},\hat P)$, respectively (we permit these spaces to coincide). The last $d$ coordinates of the vector $\hat w_t$ form a $d$-dimensional Wiener process which we denote by $\hat w_t''$. We denote by $\hat w_t'$ the $d_1$-dimensional Wiener process composed of the first $d_1$ coordinates of the vector $\hat w_t$.

Recall that we agreed in Section 1 to denote by the letter $p$ the triple $Q$, $u$, $\varepsilon$, where $Q$ is a bounded subregion of $H_T$, $u \in W^{1,2}(Q)$, and $\varepsilon > 0$. As in Section 1, we denote here by $\alpha[p](t,x)$ a smooth function on $E_{d+1}$ with values in $A$ such that its first derivatives with respect to $x$ are bounded, the functions $\sigma(\alpha[p](t,x),t,x)$, $b(\alpha[p](t,x),t,x)$ satisfy the Lipschitz condition with respect to $x$ uniformly with respect to $t$, and

$$\big\|F[u] - \big[L^{\alpha[p]} u + f^{\alpha[p]}\big]\big\|_{d+1,\,Q} \le \varepsilon.$$
The existence of a function $\alpha[p]$ having the properties listed was proved in Lemma 1.1. If $p = (Q,u,\varepsilon)$ and $z_t$ is a (nonrandom) continuous function which is given on $[0, T-s_0]$ and assumes values in $E_d$, we can define the Markov strategy $\alpha[p,z]$ using the formula

$$\alpha_t[p,z] = \alpha[p](s_0+t,\, x_t + \varepsilon z_t). \tag{1}$$

We consider the equation

$$x_t = x_0 + \int_0^t \sigma_r(x_r)\,dw_r + \int_0^t b_r(x_r)\,dr, \tag{2}$$

where

$$\sigma_r(x) = \sigma(\alpha[p](s_0+r,\, x + \varepsilon z_r),\, s_0+r,\, x), \qquad b_r(x) = b(\alpha[p](s_0+r,\, x + \varepsilon z_r),\, s_0+r,\, x).$$

In the same way as in the proof of Lemma 1.1, one shows that the coefficients of Eq. (2) satisfy the Lipschitz condition with respect to $x$ with the constant $K(1+N_1)$, where $N_1$ has been taken from the proof of Lemma 1.1. Therefore, Eq. (2) is solvable, and, furthermore, the Markov strategy $\alpha[p,z]$ is admissible at the point $(s_0,x_0)$. Using the notation from Section 3.1, we write the solution of Eq. (2) as $x_t^{\alpha[p,z],s_0,x_0}$. Since $s_0$, $x_0$ are fixed, we write briefly $x_t^{\alpha[p,z]}$.
It is seen that va[P~'l(so,xo) 5 v(so,xo)everywhere on 6. We shall prove below (see Corollary 2) that ~ " [ P ~ " ~ ( sis~a, xrandom ~) variable. Hence for any sets p = (Q,u,E) &jva[p,*1 ( ~ 0 ~ x20 u(so,xo), ) (3) where M denotes the mathematical expectation associated with the measure
B.
We can interpret the mathematical expectation in (3) to be the payoff obtained by means of a mixed (weighted, randomized) Markov strategy. We explain this without defining concretely the mixed strategy. Assume that a probability measure is given on a set 'UM(so,xo).Also, we assume that first in correspondence with this measure we have a game involving the Markov strategy a and, next, the process is controlled by means of this strategy a. The average payoff of the control is equal to va(so,xo).The integral of va(so,xo) with respect to a probability measure on %,(so,xo) represents the total average payoff of the control of this type. In the case when the probability distribution on 'UM(so,xo)is given by a random element a[p,@], the total average payoff equals the left-hand side of (3). From a practical viewpoint, the technique of controlling a process by means of a random Markov strategy is no less legitimate than that of controlling a process by means of a (nonrandom, pure) Markov strategy. The fact that the left-hand side can be expressed in other terms is very important for the further discussion. On the probability space (fi,#,F) we consider the following equation:
where
$$\tilde\sigma_r(x) = \sigma\bigl(\alpha[p](s_0+r,\,x+\varepsilon\tilde w_r),\,s_0+r,\,x\bigr), \qquad \tilde b_r(x) = b\bigl(\alpha[p](s_0+r,\,x+\varepsilon\tilde w_r),\,s_0+r,\,x\bigr).$$
Eq. (4) has a unique solution. In fact, $\tilde\sigma_r(0)$ and $\tilde b_r(0)$ are bounded, since $\sigma(\alpha,t,0)$ and $b(\alpha,t,0)$ are bounded uniformly with respect to $(\alpha,t)$. Furthermore, the functions $\tilde\sigma_r(x)$, $\tilde b_r(x)$ satisfy the Lipschitz condition with respect to $x$ with the constant $K(1+N_1)$, where $N_1$ is taken from the proof of Lemma 1.1. Finally, the processes $\tilde\sigma_r(x)$ and $\tilde b_r(x)$ are progressively measurable with respect to $\{\tilde{\mathcal F}_t\}$. We introduce a convenient notation for a solution of Eq. (4). If $x_t$ is a solution of Eq. (4), we put
$$\beta_t[p] = \alpha[p]\bigl(s_0+t,\,x_t+\varepsilon\tilde w_t\bigr).$$
The process $\beta[p]$ is a strategy with respect to the system of $\sigma$-algebras $\{\tilde{\mathcal F}_t\}$ in the sense of Definition 3.1.1. It is seen that $x_t$ satisfies the equation
$$x_t = x_0 + \int_0^t \sigma\bigl(\beta_r[p],\,s_0+r,\,x_r\bigr)\,dw_r + \int_0^t b\bigl(\beta_r[p],\,s_0+r,\,x_r\bigr)\,dr.$$
2. The Bellman Equation in the Presence of Degeneracy
Using the standard notations, we can write $x_t = x_t^{\beta[p],s_0,x_0}$. This also enables us to use the usual abbreviations, to write the indices $\beta[p]$, $s_0$, $x_0$ only on the expectation sign, and to introduce $v^{\beta[p]}(s_0,x_0)$ when writing mathematical expectations of functionals of a solution of (4). In addition, since $s_0$, $x_0$ are fixed, we shall write $x_t^{\beta[p]}$ instead of $x_t^{\beta[p],s_0,x_0}$. The following lemma gives an obvious formula with which we can transform the left-hand side of (3).

1. Lemma. Let $F(z,\,x_{[0,T-s_0]})$ be a measurable function given on $C^2([0,T-s_0],E_d)$ and such that
$$\bigl|F\bigl(z,\,x_{[0,T-s_0]}\bigr)\bigr| \le N\Bigl(1+\sup_{t\le T-s_0}|x_t|\Bigr)^n \tag{5}$$
for some constants $N$, $n$ and for all $z,\,x_{[0,T-s_0]} \in C([0,T-s_0],E_d)$. Let $Q \subset H_T$ be a bounded region, let $u \in W^{1,2}(Q)$, and let $\varepsilon > 0$. Using Lemma 1.1, we construct the function $\alpha[p](t,x)$ on the basis of the set $p = (Q,u,\varepsilon)$. Using formula (1), we introduce the Markov strategies $\alpha[p,z]$ for $z \in C([0,T-s_0],E_d)$. Furthermore, we define the strategy $\beta[p]$ by the formula $\beta_t[p] = \alpha[p](s_0+t,\,x_t+\varepsilon\tilde w_t)$, where $x_t$ is a solution of Eq. (4). Then the function
$$\Phi(z) = M\,F\bigl(z,\,x^{\alpha[p,z]}_{[0,T-s_0]}\bigr)$$
is bounded and measurable with respect to $z$ for $z \in C([0,T-s_0],E_d)$, and
$$\tilde M\,F\bigl(\varepsilon\tilde w,\,x^{\beta[p]}_{[0,T-s_0]}\bigr) = \tilde M\,\Phi(\varepsilon\tilde w), \tag{6}$$
where $M$ ($\tilde M$) denotes the mathematical expectation associated with the measure $P$ ($\tilde P$).
PROOF. The boundedness of $\Phi(z)$ follows from (5) and the fact that, by the familiar estimates of moments of solutions of stochastic equations,
$$M\sup_{t\le T-s_0}\bigl(1+|x^{\alpha[p,z]}_t|\bigr)^n \le N < \infty$$
uniformly with respect to $z$.
In proving (6) and the measurability of $\Phi(z)$ we use the results obtained in Section 2.9. We note that it is possible to solve Eq. (4) in a different way. We denote by $\tilde{\mathcal F}^w_t$ the completion (with respect to the measure $\tilde P$) of the smallest $\sigma$-algebra containing $\tilde{\mathcal F}_t$ as well as all sets of the form $\{\tilde w_r \in \Gamma\}$, where $r \le T-s_0$ and $\Gamma$ denotes a Borel subset of $E_d$. We draw the reader's attention to the fact that $r$ runs through the entire interval $[0,T-s_0]$. It is easily seen that the processes $\tilde\sigma_t(x)$, $\tilde b_t(x)$ from Eq. (4) are progressively measurable with respect to the new $\sigma$-algebras, and also that $(w_t,\tilde{\mathcal F}^w_t)$ is a $d_1$-dimensional Wiener process. We note as well that the solution of Eq. (4) does not change on $[0,T-s_0]$ when we pass from $\{\tilde{\mathcal F}_t\}$ to $\{\tilde{\mathcal F}^w_t\}$. This obvious fact follows, for instance, from the uniqueness of the solution of Eq. (4) and the fact that the former solution $x_t$ is progressively measurable with respect to
$\{\tilde{\mathcal F}_t\}$ and, in addition, is progressively measurable with respect to $\{\tilde{\mathcal F}^w_t\}$ by virtue of the inclusion $\tilde{\mathcal F}_t \subset \tilde{\mathcal F}^w_t$ ($t \in [0,T-s_0]$). Further, we use Theorem 2.9.4 in the case where $Z = C([0,T-s_0],E_d)$, $\mathcal B = \mathcal B_{[0,T-s_0]}$, $\zeta = \varepsilon\tilde w$, and for $z \in Z$
$$\sigma^z_t(x) = \sigma\bigl(\alpha[p](s_0+t,\,x+z_t),\,s_0+t,\,x\bigr), \qquad b^z_t(x) = b\bigl(\alpha[p](s_0+t,\,x+z_t),\,s_0+t,\,x\bigr).$$
Then the solution of Eq. (4) is a solution of the equation
$$x_t = x_0 + \int_0^t \sigma^{\varepsilon\tilde w}_r(x_r)\,dw_r + \int_0^t b^{\varepsilon\tilde w}_r(x_r)\,dr.$$
By Theorem 2.9.4 (see also Remark 2.9.9),
$$\tilde M^{\beta[p]}_{s_0,x_0}\,F\bigl(\varepsilon\tilde w,\,x_{[0,T-s_0]}\bigr) = \tilde M\,\tilde\Phi(\varepsilon\tilde w),$$
where $\tilde\Phi(z) = \tilde M\,F\bigl(z,\,\tilde x^{[p,z]}_{[0,T-s_0]}\bigr)$ and $\tilde x^{[p,z]}_t$ is a solution of Eq. (2) considered on $(\tilde\Omega,\tilde{\mathcal F},\tilde P)$ with the Wiener process $w_t$. By Corollary 2.9.3, $\tilde\Phi(z) = \Phi(z)$. Hence
$$\tilde M^{\beta[p]}_{s_0,x_0}\,F\bigl(\varepsilon\tilde w,\,x_{[0,T-s_0]}\bigr) = \tilde M\,\tilde\Phi(\varepsilon\tilde w) = \tilde M\,\Phi(\varepsilon\tilde w).$$
We have proved Eq. (6). The measurability of $\tilde\Phi$, and therefore of $\Phi$, follows easily from Remark 2.9.5 if we write $F$ as $F^+ - F^-$. This proves the lemma.
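The construction underlying Eq. (4) admits a simple numerical illustration: an auxiliary path $\tilde w$ is drawn, and at each instant the control is read off as $\beta_t = \alpha(s_0+t,\,x_t+\varepsilon\tilde w_t)$ while the state is driven by the independent noise $w$. The sketch below is only a toy Euler scheme under assumed ingredients — the selector `alpha` (a clipped feedback rule), the unit diffusion, and the drift are hypothetical stand-ins, not the functions built in Lemma 1.1.

```python
import numpy as np

def simulate_randomized_markov(alpha, sigma, b, x0, s0, T, eps, n_steps, rng):
    """Euler scheme for Eq. (4): at each step the control is
    beta_t = alpha(s0 + t, x_t + eps * w_tilde_t), where w_tilde is an
    auxiliary Wiener path independent of the driving noise w."""
    dt = (T - s0) / n_steps
    x, w_tilde = x0, 0.0
    xs, controls = [x], []
    for k in range(n_steps):
        t = s0 + k * dt
        a = alpha(t, x + eps * w_tilde)          # randomized Markov control
        controls.append(a)
        dw = rng.normal(0.0, np.sqrt(dt))        # driving noise w
        x = x + b(a, t, x) * dt + sigma(a, t, x) * dw
        w_tilde += rng.normal(0.0, np.sqrt(dt))  # independent path w~
        xs.append(x)
    return np.array(xs), np.array(controls)

# hypothetical ingredients: A = [-1, 1], alpha clips a feedback rule into A
alpha = lambda t, y: float(np.clip(-y, -1.0, 1.0))
sigma = lambda a, t, x: 1.0
b = lambda a, t, x: a

rng = np.random.default_rng(0)
xs, controls = simulate_randomized_markov(alpha, sigma, b,
                                          x0=0.0, s0=0.0, T=1.0,
                                          eps=0.5, n_steps=200, rng=rng)
```

Because `alpha` clips into $[-1,1]$, every simulated control value lies in the action set, mirroring the requirement that $\alpha[p]$ take values in $A$.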
2. Corollary. For $\alpha = \alpha[p,z]$ or $\alpha = \beta[p]$ let $\tau = \tau^\alpha$ be the time of first exit of the process $(s_0+t,\,x^\alpha_t)$ from a region $Q_1 \subset (-1,T)\times E_d$. Then the function $v^{\alpha[p,z],\tau}(s_0,x_0)$ is measurable and bounded with respect to $z$ on $C([0,T-s_0],E_d)$, and
$$\tilde M\,v^{\alpha[p,\varepsilon\tilde w],\tau}(s_0,x_0) = v^{\beta[p],\tau}(s_0,x_0). \tag{7}$$
Furthermore, the function $v^{\alpha[p,z]}(s_0,x_0)$ is measurable and bounded with respect to $z$ on $C([0,T-s_0],E_d)$, and
$$\tilde M\,v^{\alpha[p,\varepsilon\tilde w]}(s_0,x_0) = v^{\beta[p]}(s_0,x_0).$$
Indeed, the second assertion is a particular case of the first ($Q_1 = (-1,T)\times E_d$, $g(t,x) = g(x)$). In order to prove the first assertion, we define the function $\tau(z)$ for $z \in C([0,T-s_0],E_d)$ as the time of first exit of the curve $(s_0+t,\,z_t)$ from $Q_1$. Since $Q_1 \subset (-1,T)\times E_d$, we have $\tau(z) \le T-s_0$. It is easy to prove that $\varliminf_{z_n\to z}\tau(z_n) \ge \tau(z)$. Therefore the function $\tau(z)$ is lower semicontinuous and, in particular, measurable with respect to $z$. Further, we consider the function
$$F\bigl(z,\,x_{[0,T-s_0]}\bigr) = \Psi\bigl(\tau(x),\,z,\,x\bigr),$$
where, for $\theta \in [0,T-s_0]$,
$$\Psi(\theta,z,x) = g\bigl(s_0+\theta,\,x_\theta\bigr)\exp\Bigl[-\int_0^{\theta} c^{\alpha[p](s_0+r,\,x_r+\varepsilon z_r)}(s_0+r,\,x_r)\,dr\Bigr] + \int_0^{\theta} f^{\alpha[p](s_0+t,\,x_t+\varepsilon z_t)}(s_0+t,\,x_t)\exp\Bigl[-\int_0^{t} c^{\alpha[p](s_0+r,\,x_r+\varepsilon z_r)}(s_0+r,\,x_r)\,dr\Bigr]dt.$$
The last function is obviously continuous on $[0,T-s_0]\times C^2([0,T-s_0],E_d)$. Hence $F$ is measurable, being the composition of measurable functions. The application of the lemma to the function $F(z,\,x_{[0,T-s_0]})$ leads immediately to the first assertion of the corollary.
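The lower semicontinuity of the exit time used above is easy to see concretely: a curve that merely grazes the boundary of $Q_1$ exits much earlier than uniformly nearby curves, so $\tau(\cdot)$ can jump up, but never down, in the limit. The discrete sketch below uses a hypothetical interval-shaped region and piecewise-linear paths as stand-ins.

```python
import numpy as np

def exit_time(path, ts, lo=-1.0, hi=1.0):
    """First time the sampled curve leaves the open interval (lo, hi);
    returns the final time if it never leaves."""
    outside = (path <= lo) | (path >= hi)
    idx = int(np.argmax(outside))
    if not outside[idx]:
        return ts[-1]
    return ts[idx]

ts = np.linspace(0.0, 1.0, 1001)
grazing = np.minimum(2 * ts, 2 * (1 - ts))   # touches the boundary 1 at t = 1/2
nearby = grazing - 0.1                       # uniformly close, stays inside

tau_grazing = exit_time(grazing, ts)         # exits at t = 1/2
tau_nearby = exit_time(nearby, ts)           # never exits: tau = 1
```

For the curves $z_n = $ `grazing` $- 1/n$ converging to `grazing`, the exit times stay equal to the final time while $\tau$ of the limit curve is $1/2$: $\varliminf \tau(z_n) \ge \tau(z)$ holds, but continuity fails.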
3. Remark. We have discussed above the technique of controlling a process by means of a randomized Markov strategy. This technique provides an average payoff equal to the left-hand side of (7). Eq. (7) suggests another possibility for obtaining the same payoff. Suppose that we have realized the $d$-dimensional Wiener process $\tilde w_t$ so that it is observable and independent of $w_t$. The pair $(w_t,\tilde w_t)$ forms a $(d_1+d)$-dimensional Wiener process. Furthermore, the pair $(x^{\beta[p]}_t,\,z_t)$, where $z_t = \tilde w_t$, satisfies the system
$$x_t = x_0 + \int_0^t \sigma\bigl(\alpha[p](s_0+r,\,x_r+\varepsilon z_r),\,s_0+r,\,x_r\bigr)\,dw_r + \int_0^t b\bigl(\alpha[p](s_0+r,\,x_r+\varepsilon z_r),\,s_0+r,\,x_r\bigr)\,dr, \qquad z_t = \tilde w_t.$$
We have thus obtained a $2d$-dimensional controlled process, for which the function $\tilde\alpha = \alpha[p](s,\,x+\varepsilon z)$ is a Markov strategy (we mentioned the observability of $\tilde w_t$ because trajectories of a controlled process are taken to be observable). If we take $(w_t,\tilde w_t)$ as the driving Wiener process, we can easily see that the payoff corresponding to $\tilde\alpha$ coincides with $v^{\beta[p]}(s_0,x_0)$.
Therefore, we can obtain the left-hand side of (7) by mixing Markov strategies as well as by applying a strategy which is Markov with respect to the complete controlled process. In the next section we call strategies of this kind adjoint Markov strategies.

4. Lemma. We take the functions $v_\varepsilon$ from Section 4.6. Further, for $\varepsilon \ne 0$ let $p = (C_{T,R},\,v_\varepsilon,\,\varepsilon_1)$. Using Lemma 1.1, we construct the function $\alpha[p](t,x)$ on the basis of the set $p$. Using Eq. (1), we introduce the Markov strategies $\alpha[p,z]$ for $z \in C([0,T-s_0],E_d)$. On a probability space $(\tilde\Omega,\tilde{\mathcal F},\tilde P)$ we define the strategy $\beta[p]$ by the formula $\beta_t[p] = \alpha[p](s_0+t,\,x_t+\varepsilon\tilde w_t)$, where $x_t$ is a solution
of Eq. (4). Finally, we assume that
$$\varliminf_{\varepsilon\to0}\,\varliminf_{R\to\infty}\,\varliminf_{\varepsilon_1\downarrow0}\, M^{\beta[p]}_{s_0,x_0}\int_0^{T-s_0} f^{\beta_t[p]}\bigl(s_0+t,\,x_t+\varepsilon\tilde w_t\bigr)\exp\Bigl[-\int_0^t c^{\beta_r[p]}\bigl(s_0+r,\,x_r+\varepsilon\tilde w_r\bigr)\,dr\Bigr]dt \;\ge\; v(s_0,x_0). \tag{8}$$
Then $\tilde v(s_0,x_0) = v(s_0,x_0)$. Moreover,
$$\lim_{\varepsilon\to0}\,\lim_{R\to\infty}\,\lim_{\varepsilon_1\downarrow0}\,\tilde M\,v^{\alpha[p,\varepsilon\tilde w]}(s_0,x_0) = v(s_0,x_0), \tag{9}$$
and also, for each $\delta > 0$,
$$\lim_{\varepsilon\to0}\,\lim_{R\to\infty}\,\lim_{\varepsilon_1\downarrow0}\,\tilde P\bigl\{v^{\alpha[p,\varepsilon\tilde w]}(s_0,x_0) < v(s_0,x_0) - \delta\bigr\} = 0. \tag{10}$$
PROOF. It follows from (10) that $\tilde v(s_0,x_0) = v(s_0,x_0)$. In turn, (10) follows from (9) since, according to Chebyshev's inequality, the probability given in (10) does not exceed
$$\frac1\delta\bigl(v(s_0,x_0) - \tilde M\,v^{\alpha[p,\varepsilon\tilde w]}(s_0,x_0)\bigr).$$
Therefore, we need to prove only (9). First we note that for $\varepsilon \ne 0$ the nondegeneracy of the processes $x^{\alpha,s,x}_t(\varepsilon)$ (see Section 4.6, Inequality (4.6.2)) ensures the existence of the generalized derivatives $v_{\varepsilon x^i}$, $v_{\varepsilon x^ix^j}$, $(\partial/\partial t)v_\varepsilon$, as well as the boundedness of these derivatives in each cylinder $C_{T,R}$. This implies, in particular, that the function $\alpha[p](t,x)$ is well defined. We take an arbitrary strategy $\beta = \beta_t$ which is progressively measurable with respect to $\{\tilde{\mathcal F}_t\}$. Further, we consider the expression
$$U(\varepsilon) = M^\beta_{s_0,x_0}\int_0^{T-s_0} f^{\beta_t}\bigl(s_0+t,\,x_t+\varepsilon\tilde w_t\bigr)\exp\Bigl[-\int_0^t c^{\beta_r}\bigl(s_0+r,\,x_r+\varepsilon\tilde w_r\bigr)\,dr\Bigr]dt,$$
where $x_t$ is a solution of the following equation (having coefficients not depending on $\varepsilon$):
$$x_t = x_0 + \int_0^t \sigma(\beta_r,\,s_0+r,\,x_r)\,dw_r + \int_0^t b(\beta_r,\,s_0+r,\,x_r)\,dr.$$
Differentiating $U(\varepsilon)$ with respect to $\varepsilon$, bringing the derivative under the signs of the mathematical expectation and the integral, using the fact that the derivatives of $f$, $c$, and $g$ grow with respect to $x$ no faster than a certain power, and, finally, applying the familiar estimates of moments of solutions of stochastic equations, we conclude that there exists a constant $N(x_0,K,T,m)$ for which $|U'(\varepsilon)| \le N(x_0,K,T,m)$ for $|\varepsilon| \le 1$. Hence $|U(0) - U(\varepsilon)| \le N(x_0,K,T,m)\,|\varepsilon|$ for $|\varepsilon| \le 1$. It is crucial that the constant $N$ does not depend on the strategy $\beta$. Due to this, we can replace the expression $x_t + \varepsilon\tilde w_t$ by $x_t$ everywhere in (8). However, by Corollary 2 the left-hand side of (8) thus modified coincides with the left-hand side of (9). Therefore, the left-hand side of (9) is not smaller than $v(s_0,x_0)$. Since $v^\alpha(s_0,x_0) \le v(s_0,x_0)$ for each strategy $\alpha \in \mathfrak A$, the left-hand side of (9) is, on the other hand, not greater than $v(s_0,x_0)$. We have thus proved the lemma.
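The Lipschitz bound $|U(0)-U(\varepsilon)| \le N|\varepsilon|$ can be checked numerically in a toy setting. In the sketch below the coefficients are taken independent of the control (an assumption made for the illustration only), so the trajectory $x_t$ is the same for every $\varepsilon$ and only the argument of $f$ is shifted by $\varepsilon\tilde w_t$; with a 1-Lipschitz $f$ (here $f = \sin$, a stand-in) the payoff difference is bounded pathwise by $\varepsilon\int_0^{T}|\tilde w_t|\,dt$.

```python
import numpy as np

def payoff(eps, xs, wts, dt):
    """Discretized U(eps) = sum of f(x_t + eps*w~_t) dt with f = sin (1-Lipschitz)."""
    return np.sum(np.sin(xs + eps * wts)) * dt

rng = np.random.default_rng(1)
n, T = 1000, 1.0
dt = T / n
dw = rng.normal(0.0, np.sqrt(dt), size=n)
dwt = rng.normal(0.0, np.sqrt(dt), size=n)    # independent auxiliary path w~
xs = np.cumsum(dw)                            # x_t: driftless unit-diffusion path
wts = np.cumsum(dwt)

eps = 0.3
gap = abs(payoff(eps, xs, wts, dt) - payoff(0.0, xs, wts, dt))
bound = eps * np.sum(np.abs(wts)) * dt        # pathwise Lipschitz bound
```

Since $|\sin(a+h)-\sin(a)| \le |h|$ for every term of the sum, the computed `gap` can never exceed `bound`, which is of order $\varepsilon$ — the discrete analogue of $|U(0)-U(\varepsilon)| \le N|\varepsilon|$.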
Now we can prove the main result of this section.

5. Theorem. Suppose that at least one of the following conditions is satisfied:

a. $\sigma(\alpha,t,x)$ and $b(\alpha,t,x)$ do not depend on $x$;
b. $a(\alpha,t,x)$ does not depend on $x$;
c. for all $t \in [0,T]$, $x \in E_d$, $\lambda \ne 0$,
$$\sup_{\alpha\in A}\,\bigl(a(\alpha,t,x)\lambda,\lambda\bigr) > 0.$$

Then, using the notations of the preceding lemma, inequality (8) as well as the assertions of this lemma hold true.
PROOF. Obviously, if condition (a) is satisfied, condition (b) is satisfied as well; we included condition (a) in the statement of the theorem for the sake of completeness, since under (a) the proof becomes very simple. In fact, if (a) is satisfied, then (see Eq. (4)) the process $x^{\beta[p]}_t + \varepsilon\tilde w_t$ is a solution of the equation
$$y_t = x_0 + \int_0^t \sigma\bigl(\beta_r[p],\,s_0+r\bigr)\,dw_r + \varepsilon\tilde w_t + \int_0^t b\bigl(\beta_r[p],\,s_0+r\bigr)\,dr.$$
If we introduce the matrix $\sigma_\varepsilon$ in the same way as in Section 4.6, we can easily turn the last equation into Eq. (4.6.1). Therefore, $x^{\beta[p]}_t + \varepsilon\tilde w_t = x^{\alpha[p,\varepsilon\tilde w],s_0,x_0}_t(\varepsilon)$ for all $t$ almost surely. Further, from formulas (4.6.3) and the definition of $\alpha[p] = \alpha[C_{T,R},v_\varepsilon,\varepsilon_1]$ we have
$$F_\varepsilon[v_\varepsilon] - \bigl[L^{\alpha[p]}_\varepsilon v_\varepsilon + f^{\alpha[p]}\bigr] = F[v_\varepsilon] - \bigl[L^{\alpha[p]} v_\varepsilon + f^{\alpha[p]}\bigr],$$
$$\bigl\|F_\varepsilon[v_\varepsilon] - \bigl[L^{\alpha[p]}_\varepsilon v_\varepsilon + f^{\alpha[p]}\bigr]\bigr\|_{d+1,\,C_{T,R}} \le \varepsilon_1. \tag{11}$$
From this, fixing $\varepsilon \ne 0$ and applying Theorem 1.2 to the controlled process $x^{\alpha,s,x}_t(\varepsilon)$, we find that the expression in (8) under the sign of the lower limit
with respect to $\varepsilon_1$ is equal to $v_\varepsilon(s_0,x_0)$. By Theorem 4.6.1, $v_\varepsilon \to v$; therefore, we have proved inequality (8) and, by the same token, the assertions of Lemma 4 under (a). If (a) is not satisfied, the equality $x^{\beta[p]}_t + \varepsilon\tilde w_t = x^{\alpha[p,\varepsilon\tilde w],s_0,x_0}_t(\varepsilon)$ does not hold in general. Thus, we cannot apply Theorem 1.2 for proving (8). In cases (b) and (c), which we consider simultaneously, we can prove (8) almost in the same way as Theorem 1.2. We assume everywhere below that $\varepsilon \ne 0$, $|\varepsilon| \le 1$. It can easily be seen that the process $y^\varepsilon_t = x^{\beta[p]}_t + \varepsilon\tilde w_t$ satisfies the equation
$$y^\varepsilon_t = x_0 + \int_0^t \sigma\bigl(\beta_r[p],\,s_0+r,\,y^\varepsilon_r - \varepsilon\tilde w_r\bigr)\,dw_r + \varepsilon\tilde w_t + \int_0^t b\bigl(\beta_r[p],\,s_0+r,\,y^\varepsilon_r - \varepsilon\tilde w_r\bigr)\,dr.$$
By Theorem 4.7.7, $F_\varepsilon[v_\varepsilon] = 0$ ($H_T$-a.e.). From this as well as from (11) it follows that
$$\|h^\beta\|_{d+1,\,C_{T,R}} \le \varepsilon_1,$$
where $h^\beta = L^{\beta_t[p]}_\varepsilon v_\varepsilon + f^{\beta_t[p]}$. Since the matrix $\sigma_\varepsilon\sigma^*_\varepsilon$ is uniformly nonsingular, we can apply Itô's formula to the expression
$$v_\varepsilon\bigl(s_0+t,\,y^\varepsilon_t\bigr)e^{-\varphi^\varepsilon_t}, \qquad \varphi^\varepsilon_t = \int_0^t c^{\beta_r[p]}(s_0+r,\,y^\varepsilon_r)\,dr.$$
Using Itô's formula, for each $R_1 \ge 0$ we obtain
$$v_\varepsilon(s_0,x_0) = M\Bigl[\int_0^{\tau^{\beta,\varepsilon}_{R_1}} f^{\beta_t[p]}(s_0+t,\,y^\varepsilon_t)\,e^{-\varphi^\varepsilon_t}\,dt + v_\varepsilon\bigl(s_0+\tau^{\beta,\varepsilon}_{R_1},\,y^\varepsilon_{\tau^{\beta,\varepsilon}_{R_1}}\bigr)e^{-\varphi^\varepsilon_{\tau^{\beta,\varepsilon}_{R_1}}}\Bigr] + I^\beta_1(R_1) + I^\beta_2(R_1) + I^\beta_3(R_1), \tag{12}$$
where $\tau^{\beta,\varepsilon}_{R_1}$ denotes the first exit time of the process $(s_0+t,\,y^\varepsilon_t)$ from $[0,T)\times S_{R_1}$, the variable $I^\beta_1(R_1)$ is the expectation of the integral of $-h^\beta(s_0+t,\,y^\varepsilon_t)\,e^{-\varphi^\varepsilon_t}$ over $[0,\tau^{\beta,\varepsilon}_{R_1}]$, and
$$I^\beta_2(R_1) = M\int_0^{\tau^{\beta,\varepsilon}_{R_1}} \bigl[b\bigl(\beta_t[p],\,s_0+t,\,y^\varepsilon_t\bigr) - b\bigl(\beta_t[p],\,s_0+t,\,y^\varepsilon_t - \varepsilon\tilde w_t\bigr)\bigr]\cdot\operatorname{grad}_x v_\varepsilon(s_0+t,\,y^\varepsilon_t)\,e^{-\varphi^\varepsilon_t}\,dt,$$
$$I^\beta_3(R_1) = M\int_0^{\tau^{\beta,\varepsilon}_{R_1}} \bigl[a^{ij}\bigl(\beta_t[p],\,s_0+t,\,y^\varepsilon_t\bigr) - a^{ij}\bigl(\beta_t[p],\,s_0+t,\,y^\varepsilon_t - \varepsilon\tilde w_t\bigr)\bigr]\,v_{\varepsilon x^ix^j}(s_0+t,\,y^\varepsilon_t)\,e^{-\varphi^\varepsilon_t}\,dt.$$
As in the proof of Theorem 1.2, we show that $\lim_{\varepsilon_1\downarrow0} I^\beta_1(R_1) = 0$ if $R > R_1$, and that
$$\lim_{R_1\to\infty}\ \sup_{|\varepsilon|\le1}\ \sup_{R>0}\ \sup_{\varepsilon_1>0}\ \Bigl|M\,v_\varepsilon\bigl(s_0+\tau^{\beta,\varepsilon}_{R_1},\,y^\varepsilon_{\tau^{\beta,\varepsilon}_{R_1}}\bigr)e^{-\varphi^\varepsilon_{\tau^{\beta,\varepsilon}_{R_1}}}\chi_{\tau^{\beta,\varepsilon}_{R_1}<T-s_0}\Bigr| = 0.$$
Next, we turn to the variables $I^\beta_2(R_1)$. By Theorem 4.1.1, it is easy to obtain for $|\varepsilon| \le 1$ that $|\operatorname{grad}_x v_\varepsilon(t,x)| \le N(K,T,m)(1+|x|)^{m}$ ($H_T$-a.e.). Suppose that the last inequality is satisfied on a set $\Gamma_\varepsilon$ such that $\operatorname{meas}(H_T\setminus\Gamma_\varepsilon) = 0$. We put the sum $\chi_{\Gamma_\varepsilon}(s_0+t,\,y^\varepsilon_t) + \chi_{H_T\setminus\Gamma_\varepsilon}(s_0+t,\,y^\varepsilon_t)$ before $dt$ in the formula for $I^\beta_2(R_1)$, and we split $I^\beta_2(R_1)$ into two terms accordingly. Applying Theorem 2.2.4 to the second term, we see that it equals zero. The first term, and therefore $I^\beta_2(R_1)$, does not exceed
$$N(K,T,m)\,M\int_0^{T-s_0} |\varepsilon\tilde w_t|\,\bigl(1+|y^\varepsilon_t|\bigr)^{m}\,dt.$$
Therefore,
$$\lim_{\varepsilon\to0}\ \sup_{R>0}\ \sup_{\varepsilon_1>0}\ |I^\beta_2(R_1)| = 0.$$
If (b) is satisfied, then $I^\beta_3(R_1) = 0$. If (c) is satisfied, then according to Remark 4.7.6 the derivatives $v_{\varepsilon x^ix^j}$ are bounded in $C_{T,R_1}$ (a.e.) by a constant not depending on $\varepsilon$. In addition, for each $\lambda \in E_d$
$$\bigl|\bigl(\bigl[a(\alpha,t,x) - a(\alpha,t,y)\bigr]\lambda,\lambda\bigr)\bigr| \le N(d,d_1)K^2(1+|x|+|y|)\,|x-y|\,|\lambda|^2.$$
Hence
$$\|a(\alpha,t,x) - a(\alpha,t,y)\| \le N(d,d_1)K^2(1+|x|+|y|)\,|x-y|,$$
where $N$ does not depend on $\varepsilon$, $R$, $R_1$ (although $N$ depends, for example, on $d$ and $d_1$). This indicates that
$$\lim_{\varepsilon\to0}\ \sup_{R>0}\ \sup_{\varepsilon_1>0}\ |I^\beta_3(R_1)| = 0$$
in both cases (b) and (c). Finally, from (12) and the established properties of the $I^\beta_i(R_1)$ we conclude that for each $R_1 \ge 0$
$$\lim_{\varepsilon\to0} v_\varepsilon(s_0,x_0) \le \varliminf_{\varepsilon\to0}\,\varliminf_{R\to\infty}\,\varliminf_{\varepsilon_1\downarrow0}\, M\int_0^{\tau^{\beta,\varepsilon}_{R_1}} f^{\beta_t[p]}(s_0+t,\,y^\varepsilon_t)\,e^{-\varphi^\varepsilon_t}\,dt + \eta(R_1), \tag{13}$$
where $\eta(R_1) \to 0$ as $R_1 \to \infty$ collects the boundary terms estimated above.
Letting $R_1 \to \infty$ in (13) and noting that, by Theorem 4.6.1, the left side of (13) equals $v(s_0,x_0)$, we arrive at inequality (8) (the notation has been changed slightly). The theorem is proved. We suggest that the reader prove, as an exercise, a similar theorem for the optimal stopping problem.
6. Exercise. Let one of conditions (a), (b), (c) of Theorem 5 be satisfied. Also, let $p = (C_{T,R},\,w_\varepsilon,\,\varepsilon_1)$, and let $\tau_\delta = \tau^{\alpha[p,z],s_0,x_0}_\delta$ be the time of first exit of the process $(s_0+t,\,x^{\alpha[p,z]}_t)$ from $Q_\delta = \{(t,x) \in H_T : w(t,x) > g(t,x) + \delta\}$. As an analog of Lemma 4, prove that
$$\lim_{\delta\downarrow0}\,\lim_{\varepsilon\to0}\,\lim_{R\to\infty}\,\lim_{\varepsilon_1\downarrow0}\,\tilde M\,v^{\alpha[p,\varepsilon\tilde w],\tau_\delta}(s_0,x_0) = w(s_0,x_0)$$
and that for each $\delta' > 0$
$$\lim_{\delta\downarrow0}\,\lim_{\varepsilon\to0}\,\lim_{R\to\infty}\,\lim_{\varepsilon_1\downarrow0}\,\tilde P\bigl\{v^{\alpha[p,\varepsilon\tilde w],\tau_\delta}(s_0,x_0) < w(s_0,x_0) - \delta'\bigr\} = 0.$$
Draw the conclusion that $\tilde w = w$ in $H_T$.
7. Exercise. Consider a 1-dimensional process: $d = d_1 = 1$, $T = 1$, $A = [-1,1]$, $\sigma(\alpha,s,x) = \psi(x+\alpha)$ (with $b = c = f = 0$ and $g(x) = x^2$), where
$$\psi(x) = \begin{cases} 1 & \text{for } x \ge 1, \\ x & \text{for } x \in [-1,1], \\ -1 & \text{for } x \le -1. \end{cases}$$
Show that $v(s,x) = x^2 + 1 - s$. Let $\alpha_n(x) = \psi(nx)$. Prove that for the point $(0,0)$ ε-optimal strategies can be found among Markov strategies of the form $\alpha_n(x_t + z_t)$ for an appropriate choice of $n$ and of a continuous function $z_t$. Is this assertion true if one takes $\alpha_n(x_t)$ instead of $\alpha_n(x_t + z_t)$?
3. The Payoff Function and Solution of the Bellman Equation: The Uniqueness of the Solution of the Bellman Equation

The problem of finding the payoff function is one of the main problems in optimal control theory. As we have seen in Sections 1 and 2, knowledge of a payoff function enables us, for example, to construct ε-optimal strategies. According to the results obtained in Section 4.7, it is natural to seek a
payoff function as a solution of the Bellman equation. Assume that we have found a solution of the Bellman equation. The question immediately arises whether the solution found is equal to the payoff function. If it is known a priori that the payoff function satisfies the Bellman equation, this question is equivalent to the question of uniqueness of the solution of the Bellman equation. In the general case, a positive answer to the latter question contains an assertion of uniqueness of the solution of the Bellman equation. In this section we show that a "smooth" solution of the Bellman equation which does not increase too rapidly as $|x| \to \infty$ is equal to a payoff function. We assume that the assumptions made in Section 3.1 are satisfied. Note that, as Exercise 4.3.1 shows, these assumptions ensure neither the existence of derivatives of a payoff function nor that the payoff function satisfies the Bellman equation. Furthermore, we assume that on the initial probability space we are given a $d$-dimensional Wiener process $\tilde w_t$ (with respect to the $\sigma$-algebras $\{\mathcal F_t\}$) not depending on $w_t$. This assumption is always easy to satisfy if we consider the direct product of the initial space with a space on which a $d$-dimensional Wiener process is defined. It is crucial that a payoff function does not depend on such an expansion of the probability space (see Remarks 3.3.10 and 3.4.10). The fact that a solution of the Bellman equation is a payoff function will be proved in two stages. First, we prove that the solution is not smaller than the payoff function; second, we prove the converse inequality. At the same time, we shall elucidate the general question as to how a function has to be related to the Bellman equation in order that we may assert that it is greater (smaller) than the payoff function. Our argument involves a function $u(t,x)$ given on $H_T$, about which we sometimes assume that there exist constants $N$ and $p \ge 0$ such that
$$|u(t,x)| \le N(1+|x|)^p \quad \text{in } H_T. \tag{1}$$
1. Definition. Let $Q$ be a subregion of $H_T$. We write $u \in W^{1,2}_{loc}(Q)$ if $u \in W^{1,2}(Q')$ for any bounded region $Q'$ which together with its closure $\bar Q'$ lies in $Q$.

The definitions which follow involve the operator $F$ defined in the introduction to Chapter 4.

2. Definition. A function $u$ is said to be excessive (with respect to the operator $F$) in a region $Q \subset H_T$ if $u \ge 0$ in $Q$, $u \in W^{1,2}_{loc}(Q)\cap C(Q)$, and $F[u] \le 0$ (a.e. on $Q$). A function excessive in $H_T$ is said to be an excessive function.

3. Definition. A function $u$ is said to be superharmonic (with respect to the operator $F$) in a region $Q \subset H_T$ if $u$ satisfies inequality (1) in $Q$, $u \in W^{1,2}_{loc}(Q)\cap C(Q)$, and $F[u] \le 0$ (a.e. on $Q$). A function superharmonic in $H_T$ is said to be a superharmonic function.
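A concrete one-dimensional example (a hypothetical illustration, not taken from the text) may help fix these definitions. Take $A = [0,1]$ with diffusion coefficient $a(\alpha) = \alpha$ and $b = c = f = 0$, so that the operator reduces to the form shown below; the function $u(t,x) = x^2 + T - t$ then satisfies the defining inequality with equality:

```latex
% Assumed data: A = [0,1], a(\alpha) = \alpha, b = c = f = 0, so that
%   F[u](t,x) = \sup_{\alpha\in[0,1]} \bigl[ u_t(t,x) + \tfrac{\alpha}{2}\,u_{xx}(t,x) \bigr].
% For u(t,x) = x^2 + T - t we have u_t = -1 and u_{xx} = 2, hence
\[
F[u](t,x) \;=\; \sup_{\alpha\in[0,1]} \bigl[\, -1 + \alpha \,\bigr] \;=\; 0 \;\le\; 0 .
\]
% Thus u is superharmonic on H_T (it satisfies (1) with p = 2) and,
% being nonnegative, it is excessive as well.
```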
The property of excessive and superharmonic functions which is of main concern to us is contained in the following lemma.

4. Lemma. Let $u$ be a superharmonic (or excessive) function in a region $Q \subset H_T$. Then, for any $(s,x) \in \bar Q$, $\alpha \in \mathfrak A$, $\tau \in \mathfrak M(T-s)$,
$$u(s,x) \ge M^\alpha_{s,x}\Bigl[\int_0^{\tau\wedge\tau_Q} f^{\alpha_t}(s+t,\,x_t)\,e^{-\varphi_t}\,dt + u\bigl(s+\tau\wedge\tau_Q,\,x_{\tau\wedge\tau_Q}\bigr)\,e^{-\varphi_{\tau\wedge\tau_Q}}\Bigr], \tag{2}$$
where $\varphi_t = \int_0^t c^{\alpha_r}(s+r,\,x_r)\,dr$ and $\tau_Q = \tau^{\alpha,s,x}_Q$ is the time of first exit of the process $(s+t,\,x^{\alpha,s,x}_t)$ from $Q$.
PROOF. If $(s,x)$ lies on the boundary of $Q$, then $\tau_Q = 0$, and the assertion is obvious. So let $(s,x) \in Q$. Also, let
$$\Gamma = \{(t,x) \in Q : F[u](t,x) \le 0\}.$$
Then $\operatorname{meas}(Q\setminus\Gamma) = 0$ and, in addition, $L^\alpha u(t,x) + f^\alpha(t,x) \le 0$ for all $(t,x) \in \Gamma$ and any $\alpha \in A$. Further, we make use of Theorem 2.10.2, taking into account that in that theorem the region $Q$ is assumed to be bounded. We find that inequality (2) is satisfied if we replace $\tau_Q$ by the time of first exit of the process $(s+t,\,x^{\alpha,s,x}_t)$ from $Q \cap C_{T,R}$; we write this time as $\tau_{R,Q}$. Replacing $\tau_Q$ by $\tau_{R,Q}$, we write inequality (2) and let $R \to \infty$. If $u$ is a superharmonic function, then $|u| \le N(1+|x|)^p$ and, furthermore, the expressions under the sign of mathematical expectation in inequality (2) thus modified can be estimated in terms of the summable quantity
$$\sup_{t\le T-s}\,\bigl(1+|x^{\alpha,s,x}_t|\bigr)^{m+p}.$$
Since, as can be seen, $\tau_{R,Q} \to \tau_Q$ and $x_{\tau\wedge\tau_{R,Q}} \to x_{\tau\wedge\tau_Q}$, we can complete the proof of the lemma by applying the Lebesgue dominated convergence theorem. If $u$ is an excessive function, then $u \ge 0$, and one should apply Fatou's lemma instead of the Lebesgue theorem. The lemma is proved.

5. Theorem. Let a function $u$ be given and continuous in $\bar H_T$, let $Q \subset H_T$ be a region, and let $u$ be a superharmonic (excessive) function in $Q$. Then:

a. if $u(s,x) \ge g(s,x)$ in $Q$ and $u \ge w$ in $H_T\setminus Q$, then $u \ge w$ in $\bar H_T$;
b. if $u \ge v$ in $H_T\setminus Q$ and $u(T,x) \ge g(x)$ in $E_d$, then $u \ge v$ in $\bar H_T$.

PROOF. (a) Due to the continuity of the functions $u$ and $w$ it suffices to prove the inequality $u(s,x) \ge w(s,x)$ for $(s,x) \in H_T$. Since this inequality is satisfied outside $Q$ by assumption, we may assume that $(s,x) \in Q$. In addition, take $\alpha \in \mathfrak A$, $\tau \in \mathfrak M(T-s)$. Applying Lemma 4, we consider the relation
$$u\bigl(s+\tau\wedge\tau_Q,\,x_{\tau\wedge\tau_Q}\bigr) = u(s+\tau,\,x_\tau)\chi_{\tau<\tau_Q} + u\bigl(s+\tau_Q,\,x_{\tau_Q}\bigr)\chi_{\tau_Q\le\tau}, \tag{3}$$
where the indices $\alpha$, $s$, $x$ are omitted for the sake of brevity. For $\tau \le \tau_Q$ we have $u(s+\tau,\,x_\tau) \ge g(s+\tau,\,x_\tau)$, since $(s+\tau,\,x_\tau) \in \bar Q$ and the inequality $u \ge g$, which is satisfied on $Q$ by assumption, is satisfied on $\bar Q$ as well due to the continuity of $u$ and $g$. Next, if $\tau_Q < T-s$, then
$$\bigl(s+\tau_Q,\,x_{\tau_Q}\bigr) \in H_T\setminus Q, \qquad u\bigl(s+\tau_Q,\,x_{\tau_Q}\bigr) \ge w\bigl(s+\tau_Q,\,x_{\tau_Q}\bigr).$$
If $\tau_Q = T-s$, then, due to the continuity of $u$ and $g$ and by virtue of the inequalities $u \ge g$ in $Q$, $w \ge g$ in $H_T$, and $u \ge w$ in $H_T\setminus Q$, we have $u \ge g$ in $H_T$, hence $u \ge g$ in $\bar H_T$, $u(T,x) \ge g(T,x)$, and
$$u\bigl(s+\tau_Q,\,x_{\tau_Q}\bigr) \ge g\bigl(T,\,x_{\tau_Q}\bigr) = w\bigl(T,\,x_{\tau_Q}\bigr) = w\bigl(s+\tau_Q,\,x_{\tau_Q}\bigr).$$
From Lemma 4 and the relation (3) analyzed above, we conclude that
$$u(s,x) \ge M\Bigl[\int_0^{\tau\wedge\tau_Q} f^{\alpha_t}(s+t,\,x_t)\,e^{-\varphi_t}\,dt + g(s+\tau,\,x_\tau)\,e^{-\varphi_\tau}\chi_{\tau<\tau_Q} + w\bigl(s+\tau_Q,\,x_{\tau_Q}\bigr)\,e^{-\varphi_{\tau_Q}}\chi_{\tau_Q\le\tau}\Bigr].$$
Computing in the last inequality the upper bound with respect to $\tau \in \mathfrak M(T-s)$ and $\alpha \in \mathfrak A$, we obtain, in accord with Theorem 3.1.9, that $u(s,x) \ge w(s,x)$. We have proved assertion (a). Applying Theorem 3.1.6 for $\tau = \tau_Q$, $r_t \equiv 0$, we can prove assertion (b) in a similar way. This completes the proof of the theorem.
6. Corollary. In the case where $w$ and $v$ are superharmonic functions (see Section 4.7), the function $w$ is the smallest superharmonic majorant of $g(s,x)$, and the function $v$ is the smallest superharmonic function which majorizes $g(x)$ for $s = T$.

7. Corollary. Let $u \in W^{1,2}_{loc}(H_T)\cap C(\bar H_T)$ and let (1) be satisfied. Then, if
$$\bigl(F[u] + u - g\bigr)_+ + g - u \le 0 \quad (H_T\text{-a.e.}),$$
we have $u \ge w$ in $\bar H_T$.

In fact, since the positive part of a number is nonnegative, the hypothesis implies $g \le u$ ($H_T$-a.e.). Further, since $g$ and $u$ are continuous functions, $g \le u$ everywhere in $H_T$. Next, if the inequality $F[u](s,x) > 0$ were satisfied at a point $(s,x) \in H_T$, we would have $F[u] + u - g > 0$ at this point, hence $(F[u] + u - g)_+ + g - u = F[u] > 0$, which is impossible. Therefore $F[u] \le 0$ ($H_T$-a.e.), and $u$ is a superharmonic majorant of $g(s,x)$, so that $u \ge w$ by Theorem 5.

Theorem 5 and Corollary 7 enable us to find upper estimates for a payoff function. In order to prove the theorem on lower estimates, we need three auxiliary results.
8. Lemma. Let $(s_0,x_0) \in H_T$, let $\alpha(s,x)$ be a Borel function on $H_T$ with values in $A$, and let $\delta > 0$. Further, let
$$\sigma_n(s,z) = \frac{n}{n+|z|^2}\bigl(\sigma(\alpha(s,\cdot),\,s,\,\cdot)*\zeta_n\bigr)(z), \qquad b_n(s,z) = \frac{n}{n+|z|^2}\bigl(b(\alpha(s,\cdot),\,s,\,\cdot)*\zeta_n\bigr)(z),$$
where $\zeta_n(z) = n^d\zeta(nz)$ and $\zeta$ is the averaging kernel from Section 2.1. Let us define a strategy $\alpha^{n,\delta}$ by the formula
$$\alpha^{n,\delta}_t(\omega) = \alpha\bigl(s_0+t,\,z^{n,\delta}_t(\omega)\bigr),$$
where $z^{n,\delta}_t(\omega)$ is a solution of the equation
$$z_t = x_0 + \int_0^t \sigma_n(s_0+r,\,z_r)\,dw_r + \delta\tilde w_t + \int_0^t b_n(s_0+r,\,z_r)\,dr. \tag{4}$$
Then for all $q \ge 1$
$$\sup_{\delta\in[0,1]}\,\sup_{n\ge1}\, M\sup_{t\le T-s_0}\bigl|z^{n,\delta}_t\bigr|^{q} < \infty, \tag{5}$$
$$\varlimsup_{\delta\downarrow0}\,\varlimsup_{n\to\infty}\, M\sup_{t\le T-s_0}\bigl|z^{n,\delta}_t - x^{\alpha^{n,\delta},s_0,x_0}_t\bigr|^{q} = 0. \tag{6}$$
PROOF. As can easily be seen, the functions $\sigma_n(s,z)$, $b_n(s,z)$ are differentiable with respect to $z$ and, furthermore, their derivatives do not exceed $Nn$, with $N$ not depending on $n$, $s$, $z$. Moreover, they satisfy the Lipschitz condition with respect to $z$ since, for example, using the simple inequality
$$\Bigl|\frac{n}{n+|z_1|^2} - \frac{n}{n+|z_2|^2}\Bigr| \le \frac{n\,(|z_1|+|z_2|)}{(n+|z_1|^2)(n+|z_2|^2)}\,|z_1-z_2|$$
and the linear growth of $b$, we find that $|b_n(s,z_1) - b_n(s,z_2)| \le Nn\,|z_1-z_2|$. This implies that the coefficients of Eq. (4) satisfy the Lipschitz condition and Eq. (4) is solvable. Due to the familiar estimates of moments of solutions
of stochastic equations, (5) follows from the inequality
$$|\sigma_n(t,z)| \le N(1+|z|)$$
and a similar inequality for $|b_n(t,z)|$. Further, the process $x^{\alpha^{n,\delta},s_0,x_0}_t$ is a solution of the following equation:
$$x_t = x_0 + \int_0^t \sigma\bigl(\alpha^{n,\delta}_r,\,s_0+r,\,x_r\bigr)\,dw_r + \int_0^t b\bigl(\alpha^{n,\delta}_r,\,s_0+r,\,x_r\bigr)\,dr.$$
Comparing the last equation with Eq. (4), we find, in accord with Theorem 2.5.9, that
$$M\sup_{t\le T-s_0}\bigl|z^{n,\delta}_t - x^{\alpha^{n,\delta},s_0,x_0}_t\bigr|^{2q} \le N\delta^{2q} + N\,M\int_0^{T-s_0}\bigl|\sigma_n(s_0+t,\,z^{n,\delta}_t) - \sigma\bigl(\alpha^{n,\delta}_t,\,s_0+t,\,z^{n,\delta}_t\bigr)\bigr|^{2q}\,dt + N\,M\int_0^{T-s_0}\bigl|b_n(s_0+t,\,z^{n,\delta}_t) - b\bigl(\alpha^{n,\delta}_t,\,s_0+t,\,z^{n,\delta}_t\bigr)\bigr|^{2q}\,dt,$$
where $N$ depends only on $q$, $K$, $T-s_0$. It remains to show that the last two terms tend to zero as $n \to \infty$. We fix $\delta > 0$ and consider the latter term only. Let $\tau_R$ be the time of first exit of the process $z^{n,\delta}_t$ from $S_R$. We have
$$M\int_0^{T-s_0}\chi_{t>\tau_R}\bigl|b_n(s_0+t,\,z^{n,\delta}_t) - b\bigl(\alpha^{n,\delta}_t,\,s_0+t,\,z^{n,\delta}_t\bigr)\bigr|^{2q}\,dt \to 0$$
as $R \to \infty$, uniformly with respect to $n$. Therefore, we can complete the proof of the lemma if we prove that for each $R > 0$ the quantity
$$M\int_0^{T-s_0}\chi_{t\le\tau_R}\bigl|b_n(s_0+t,\,z^{n,\delta}_t) - b\bigl(\alpha(s_0+t,\,z^{n,\delta}_t),\,s_0+t,\,z^{n,\delta}_t\bigr)\bigr|^{2q}\,dt$$
tends to zero
as $n \to \infty$. We note that the process $z^{n,\delta}_t$ is a solution of the equation
$$z_t = x_0 + \int_0^t \tilde\sigma_n(s_0+r,\,z_r)\,d\bar w_r + \int_0^t b_n(s_0+r,\,z_r)\,dr,$$
where the matrix $\tilde\sigma_n$ is obtained by writing, to the right of $\sigma_n$, the unit matrix of dimension $d\times d$ multiplied by $\delta$, and $\bar w_r = (w_r,\tilde w_r)$. It is not hard to see that for each $\lambda$
$$|\tilde\sigma^*_n\lambda|^2 = \bigl(\tilde\sigma_n\tilde\sigma^*_n\lambda,\lambda\bigr) = |\sigma^*_n\lambda|^2 + \delta^2|\lambda|^2 \ge \delta^2|\lambda|^2.$$
Therefore, by Theorem 2.2.4,
$$M\int_0^{T-s_0}\chi_{t\le\tau_R}\bigl|b_n(s_0+t,\,z^{n,\delta}_t) - b\bigl(\alpha(s_0+t,\,z^{n,\delta}_t),\,s_0+t,\,z^{n,\delta}_t\bigr)\bigr|^{2q}\,dt \le N\Bigl(\iint_{C_{T,R}}\bigl|b_n(t,z) - b(\alpha(t,z),t,z)\bigr|^{2q(d+1)}\,dz\,dt\Bigr)^{1/(d+1)}, \tag{7}$$
where $N$ does not depend on $n$. Finally, the functions
$$b_n(t,z) - b\bigl(\alpha(t,z),\,t,\,z\bigr)$$
are bounded in $C_{T,R}$ uniformly with respect to $n$. Therefore, according to the well-known properties of convolutions (see Section 2.1), these functions tend to zero as $n \to \infty$ for each $t$ for almost all $z$. Thus, by the Lebesgue theorem, the right side of (7) tends to zero as $n \to \infty$. The lemma is proved.

Repeating the arguments given in the proof of Theorem 3.1.12 after Eq. (3.1.14) and, in addition, using the current lemma, we arrive at the following result (see also Corollary 3.1.13).

9. Lemma. Let us carry out the construction described in the formulation of Lemma 8. We take arbitrary Markov times $\tau^{n,\delta} \in \mathfrak M(T-s_0)$. Then
$$\varlimsup_{\delta\downarrow0}\,\varlimsup_{n\to\infty}\,\Bigl|\,M\Bigl\{g\bigl(s_0+\tau^{n,\delta},\,z^{n,\delta}_{\tau^{n,\delta}}\bigr)\exp\Bigl[-\int_0^{\tau^{n,\delta}} c^{\alpha^{n,\delta}_r}\bigl(s_0+r,\,z^{n,\delta}_r\bigr)\,dr\Bigr] + \int_0^{\tau^{n,\delta}} f^{\alpha^{n,\delta}_t}\bigl(s_0+t,\,z^{n,\delta}_t\bigr)\exp\Bigl[-\int_0^{t} c^{\alpha^{n,\delta}_r}\bigl(s_0+r,\,z^{n,\delta}_r\bigr)\,dr\Bigr]dt\Bigr\} - v^{\alpha^{n,\delta},\tau^{n,\delta}}(s_0,x_0)\Bigr| = 0.$$
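The role of the extra term $\delta\tilde w_t$ in Eq. (4) of Lemma 8 is to make the diffusion uniformly nondegenerate: augmenting $\sigma_n$ with $\delta$ times the identity yields a matrix $\tilde\sigma_n$ with $\tilde\sigma_n\tilde\sigma^*_n = \sigma_n\sigma^*_n + \delta^2 I \ge \delta^2 I$. A small numerical check of this lower bound (the matrix `sigma` is an arbitrary, deliberately degenerate stand-in):

```python
import numpy as np

def augmented_diffusion(sigma, delta):
    """Append delta * I_d to the right of the d x d1 matrix sigma,
    as in Lemma 8: the pair (w, w~) then drives a d x (d1+d) diffusion."""
    d = sigma.shape[0]
    return np.hstack([sigma, delta * np.eye(d)])

# a rank-1 (degenerate) stand-in for sigma
sigma = np.array([[1.0, 0.0],
                  [2.0, 0.0]])
delta = 0.3
st = augmented_diffusion(sigma, delta)
a = st @ st.T                      # = sigma sigma^* + delta^2 I
eigs = np.linalg.eigvalsh(a)       # smallest eigenvalue is at least delta^2
```

Even though `sigma @ sigma.T` has a zero eigenvalue, every eigenvalue of the augmented matrix is bounded below by $\delta^2$, which is what allows Theorem 2.2.4 to be applied to $z^{n,\delta}$.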
10. Lemma. Let $Q \subset H_T$ be a bounded region. Let a function $u$ satisfy inequality (1) in $H_T$, $u \in W^{1,2}(Q)\cap C(\bar H_T)$, let $F[u] \ge 0$ (a.e. on $Q$), and let $\varepsilon > 0$. We take a Borel function $\alpha(s,x)$ given on $H_T$ which assumes values from $A$ and is such that
$$L^{\alpha(s,x)}u(s,x) + f^{\alpha(s,x)}(s,x) \ge -\varepsilon \quad \text{(a.e. on } Q\text{)}. \tag{8}$$
We fix $(s_0,x_0) \in Q$. Using Lemma 8, we define the strategies $\alpha^{n,\delta}$. Then
$$u(s_0,x_0) \le \varlimsup_{\delta\downarrow0}\,\varlimsup_{n\to\infty}\, M\Bigl[u\bigl(s_0+\tau^{n,\delta},\,z^{n,\delta}_{\tau^{n,\delta}}\bigr)e^{-\varphi_{\tau^{n,\delta}}} + \int_0^{\tau^{n,\delta}} f^{\alpha^{n,\delta}_t}\bigl(s_0+t,\,z^{n,\delta}_t\bigr)\,e^{-\varphi_t}\,dt\Bigr] + \varepsilon(T-s_0),$$
where $\varphi_t = \int_0^t c^{\alpha^{n,\delta}_r}(s_0+r,\,z^{n,\delta}_r)\,dr$ and $\tau^{n,\delta}$ is an arbitrary Markov time not exceeding the time of first exit of the process $(s_0+t,\,z^{n,\delta}_t)$ from the region $Q$.
PROOF. First we note that, in the same way as in the proofs of Lemmas 1.4.9 and 4.5.5, one can establish the existence of a function $\alpha(s,x)$ such that for all $(s,x) \in Q$ the left side of (8) is greater than $F[u](s,x) - \varepsilon$. Since $F[u] \ge 0$ (a.e. on $Q$), this function $\alpha$ satisfies inequality (8). Applying Itô's formula, we easily find
$$u(s_0,x_0) = M\Bigl[u\bigl(s_0+\tau^{n,\delta},\,z^{n,\delta}_{\tau^{n,\delta}}\bigr)e^{-\varphi_{\tau^{n,\delta}}} - \int_0^{\tau^{n,\delta}} \bigl(L^{\alpha^{n,\delta}_t}u\bigr)(s_0+t,\,z^{n,\delta}_t)\,e^{-\varphi_t}\,dt\Bigr] + I^{n,\delta},$$
where $I^{n,\delta}$ is the expectation of the integral over $[0,\tau^{n,\delta}]$ of the sum
$$\bigl[a^{ij}_n(s_0+t,\,z^{n,\delta}_t) - a^{ij}\bigl(\alpha^{n,\delta}_t,\,s_0+t,\,z^{n,\delta}_t\bigr)\bigr]\,u_{x^ix^j}(s_0+t,\,z^{n,\delta}_t)\,e^{-\varphi_t} + \sum_{i=1}^d \bigl[b^i_n(s_0+t,\,z^{n,\delta}_t) - b^i\bigl(\alpha^{n,\delta}_t,\,s_0+t,\,z^{n,\delta}_t\bigr)\bigr]\,u_{x^i}(s_0+t,\,z^{n,\delta}_t)\,e^{-\varphi_t},$$
where $a_n = \tfrac12\tilde\sigma_n\tilde\sigma^*_n$. By virtue of the inequality $\tau^{n,\delta} \le T - s_0$ we obtain from the above as well as from (8) that
$$u(s_0,x_0) \le M\Bigl[u\bigl(s_0+\tau^{n,\delta},\,z^{n,\delta}_{\tau^{n,\delta}}\bigr)e^{-\varphi_{\tau^{n,\delta}}} + \int_0^{\tau^{n,\delta}} f^{\alpha^{n,\delta}_t}\bigl(s_0+t,\,z^{n,\delta}_t\bigr)\,e^{-\varphi_t}\,dt\Bigr] + \varepsilon(T-s_0) + I^{n,\delta}.$$
We take the limit here as $n \to \infty$, $\delta \downarrow 0$. Taking $g(s,x) = u(s,x)$ in Lemma 9, we conclude that for proving the present lemma we need only show that
$$\varlimsup_{\delta\downarrow0}\,\varlimsup_{n\to\infty}\, I^{n,\delta} = 0.$$
Since the process $z^{n,\delta}_t$ is nondegenerate and, in addition,
$$\|\tilde\sigma_n(t,z)\| + |b_n(t,z)| \le N$$
for some constant $N$ and all $(t,z) \in Q$, $n \ge 1$, Theorem 2.2.2 yields (compare with the proof of Theorem 2.10.2)
$$M\int_0^{\tau^{n,\delta}}\Bigl|\bigl[a^{ij}_n - a^{ij}(\alpha(\cdot),\cdot)\bigr]u_{x^ix^j}\Bigr|(s_0+t,\,z^{n,\delta}_t)\,dt \le N\Bigl\|\bigl[a^{ij}_n - a^{ij}(\alpha(\cdot),\cdot)\bigr]u_{x^ix^j}\Bigr\|_{d+1,\,Q}, \tag{9}$$
where $N$ does not depend on $n$, $\delta$. In the proof of Lemma 8 (see the reasoning carried out after Eq. (7)) we showed that $b_n(t,z) \to b(\alpha(t,z),t,z)$ ($H_T$-a.e.). In a completely similar fashion, $\sigma_n(t,z) \to \sigma(\alpha(t,z),t,z)$, and therefore $a_n(t,z) \to a(\alpha(t,z),t,z)$ ($H_T$-a.e.). Since the totality of the functions $a_n(t,z)$ is bounded on $Q$ and, in addition, the derivatives $u_{x^ix^j} \in \mathscr L_{d+1}(Q)$, by the Lebesgue theorem the right side of (9) tends to zero as $n \to \infty$ for each $\delta$. In the same way we can estimate the term of $I^{n,\delta}$ containing $b^i_n - b^i$. We have proved the lemma.

The following theorem enables us to find lower estimates for payoff functions.
11. Theorem. Let $u \in W^{1,2}_{loc}(H_T)\cap C(\bar H_T)$ and let $u$ satisfy inequality (1) in $H_T$. Then:

a. if $(F[u] + u - w)_+ + w - u \ge 0$ ($H_T$-a.e.) and $u(T,x) \le w(T,x)$ for all $x \in E_d$, then $u \le w$ in $\bar H_T$;
b. if $(F[u] + u - v)_+ + v - u \ge 0$ ($H_T$-a.e.) and $u(T,x) \le v(T,x)$ for all $x \in E_d$, then $u \le v$ in $\bar H_T$.
PROOF. Assertion (b) follows from (a). In fact, temporarily take $g(s,x) \equiv v(s,x)$; by Theorem 3.1.6, $w = v$, and the inequality $u \le w$ from (a) implies that $u \le v$.

Proof of (a). For each $\varepsilon > 0$
$$L^\alpha(u-\varepsilon) + f^\alpha = L^\alpha u + f^\alpha + \varepsilon c^\alpha \ge L^\alpha u + f^\alpha.$$
Therefore, $F[u-\varepsilon] \ge F[u]$. Furthermore, we note that for any real $a$ the function $(a+t)_+ - t$ decreases with respect to $t$, which implies that
$$\bigl(F[u-\varepsilon] + (u-\varepsilon) - w\bigr)_+ + w - (u-\varepsilon) \ge \bigl(F[u] + u - w\bigr)_+ + w - u \ge 0 \quad (H_T\text{-a.e.}).$$
It is seen that $u(T,x) - \varepsilon < w(T,x)$. Therefore the function $u-\varepsilon$ satisfies the hypotheses of (a), this function being strictly smaller than $w$ for $s = T$. If assertion (a) has been proved for such functions, then $u-\varepsilon \le w$, from which, letting $\varepsilon \downarrow 0$, we conclude that $u \le w$. Therefore, we may assume that $u(T,x) < w(T,x)$ for all $x \in E_d$. We note also that due to the continuity of $u$ and $w$ it suffices to prove the inequality $u \le w$ in $H_T$. Let $Q' = \{(s,x) \in H_T : u(s,x) > w(s,x)\}$. We wish to prove that the region $Q'$ is empty. Assume the converse. Take $(s_0,x_0) \in Q'$; in addition, let $R > |x_0|$ and $Q = Q' \cap C_{T,R} \cap \{(s,x) \in H_T : s > s_0/2\}$. By virtue of the inequality $u(T,x) < w(T,x)$ we have $\bar Q \subset H_T$, hence $u \in W^{1,2}(Q)$. The expression $w - u$ is negative on the region $Q$. Therefore, it follows from the inequality $(F[u]+u-w)_+ + w - u \ge 0$ that $(F[u]+u-w)_+ > 0$. Then, almost everywhere on $Q$,
$$F[u] + u - w = \bigl(F[u] + u - w\bigr)_+ \ge u - w, \qquad \text{so that } F[u] \ge 0.$$
Now we can apply the preceding lemma, thus obtaining for fixed $\varepsilon > 0$
$$u(s_0,x_0) \le \varlimsup_{\delta\downarrow0}\,\varlimsup_{n\to\infty}\, M\Bigl[u\bigl(s_0+\tau^{n,\delta},\,z^{n,\delta}_{\tau^{n,\delta}}\bigr)e^{-\varphi_{\tau^{n,\delta}}} + \int_0^{\tau^{n,\delta}} f^{\alpha^{n,\delta}_t}\bigl(s_0+t,\,z^{n,\delta}_t\bigr)\,e^{-\varphi_t}\,dt\Bigr] + \varepsilon(T-s_0), \tag{10}$$
where $\tau^{n,\delta}$ is the time of first exit of $(s_0+t,\,z^{n,\delta}_t)$ from the region $Q$.
If we take in Lemma 9 $|u(s,x) - w(s,x)|$ instead of $g(s,x)$ and if, in addition, we put $c^\alpha(s,x) \equiv 0$, we see that in (10) the term containing $u$ at the exit position differs from the analogous term with $w$ by at most
$$\varlimsup_{\delta\downarrow0}\,\varlimsup_{n\to\infty}\, M\bigl|u\bigl(s_0+\tau^{n,\delta},\,z^{n,\delta}_{\tau^{n,\delta}}\bigr) - w\bigl(s_0+\tau^{n,\delta},\,z^{n,\delta}_{\tau^{n,\delta}}\bigr)\bigr|.$$
In order to estimate the last expression, we note that if $|z^{n,\delta}_{\tau^{n,\delta}}| < R$, the point $(s_0+\tau^{n,\delta},\,z^{n,\delta}_{\tau^{n,\delta}})$ lies on that part of the boundary of $Q$ where $u = w$. In short, if $|z^{n,\delta}_{\tau^{n,\delta}}| < R$, then
$$u\bigl(s_0+\tau^{n,\delta},\,z^{n,\delta}_{\tau^{n,\delta}}\bigr) = w\bigl(s_0+\tau^{n,\delta},\,z^{n,\delta}_{\tau^{n,\delta}}\bigr).$$
Therefore, the last expression does not exceed
$$N\,\sup_{\delta,n}\, M\bigl(1+|z^{n,\delta}_{\tau^{n,\delta}}|\bigr)^{m+p}\chi_{|z^{n,\delta}_{\tau^{n,\delta}}|\ge R} \le \frac{N}{1+R}\,\sup_{\delta,n}\, M\sup_{t\le T-s_0}\bigl(1+|z^{n,\delta}_t|\bigr)^{m+p+1} \le \frac{N}{1+R},$$
where, by virtue of (5), the constant $N$ does not depend on $R$. Finally, the first term on the right-hand side of (10) is smaller than $w(s_0,x_0)$, according to Theorem 3.1.11. Thus we obtain from (10)
$$u(s_0,x_0) \le w(s_0,x_0) + \frac{N}{1+R} + \varepsilon(T-s_0).$$
Here the numbers $R > |x_0|$ and $\varepsilon > 0$ are arbitrary, and $N$ does not depend on $R$. Letting $R \to \infty$, $\varepsilon \downarrow 0$, we conclude that $u(s_0,x_0) \le w(s_0,x_0)$. However, this is impossible for a point $(s_0,x_0) \in Q'$. Hence the set $Q'$ is empty. We have thus proved the theorem.

Using the inequality $w(s,x) \ge g(s,x)$ as well as the fact that for any real $a$ the function $(a-t)_+ + t$ increases with respect to $t$, we obtain:
12. Corollary. Let $u \in W^{1,2}_{loc}(H_T)\cap C(\bar H_T)$, let $u$ satisfy inequality (1) in $H_T$, let $u(T,x) \le g(T,x)$ for all $x \in E_d$, and let
$$\bigl(F[u](s,x) + u(s,x) - g(s,x)\bigr)_+ + g(s,x) - u(s,x) \ge 0 \quad (H_T\text{-a.e.}).$$
Then $u \le w$ in $\bar H_T$.

Noting that for $a \ge 0$ the function $(a-t)_+ + t \ge 0$ for all $t$, we also obtain:

13. Corollary. Let $u \in W^{1,2}_{loc}(H_T)\cap C(\bar H_T)$, let $u$ satisfy inequality (1) in $H_T$, $u(T,x) \le g(x)$ for all $x \in E_d$, and $F[u] \ge 0$ ($H_T$-a.e.). Then $u \le v$ in $\bar H_T$.
Combining Theorem 5 and Corollary 7 with the last two corollaries we can immediately assert that any solution of the Bellman equation is equal to a payoff function.
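This can be seen at work in the simplest possible situation: when $A$ consists of a single point, the Bellman equation becomes the linear backward equation $u_t + \tfrac12 u_{xx} = 0$, $u(T,x) = g(x)$, whose solution is the payoff $v(s,x) = M\,g(x + w_{T-s})$. For $g(x) = x^2$ this payoff is $x^2 + (T-s)$, and a backward finite-difference solve — a hypothetical numerical illustration, not part of the text — reproduces it:

```python
import numpy as np

def solve_backward_heat(g, xs, T, n_steps):
    """Explicit backward scheme for u_t + (1/2) u_xx = 0, u(T, x) = g(x).
    Boundary values are pinned to the known payoff x^2 + (T - t), which
    for g(x) = x^2 is also the exact interior solution."""
    dx = xs[1] - xs[0]
    dt = T / n_steps
    assert dt <= dx * dx               # stability of the explicit scheme
    u = g(xs).astype(float)
    for k in range(n_steps):
        t = T - (k + 1) * dt
        lap = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / (dx * dx)
        u[1:-1] += 0.5 * dt * lap      # step backward in time
        u[0] = xs[0] ** 2 + (T - t)    # exact values on the boundary
        u[-1] = xs[-1] ** 2 + (T - t)
    return u

xs = np.linspace(-2.0, 2.0, 81)
u0 = solve_backward_heat(lambda x: x ** 2, xs, T=1.0, n_steps=500)
payoff = xs ** 2 + 1.0                 # v(0, x) = x^2 + (T - 0)
err = np.max(np.abs(u0 - payoff))
```

The computed solution of the (degenerate-control-free) Bellman equation coincides with the payoff function on the whole grid, in agreement with Theorem 14(b).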
14. Theorem. Let $u \in W^{1,2}_{loc}(H_T)\cap C(\bar H_T)$ and let $u$ satisfy inequality (1) in $H_T$. Then:

a. if $(F[u](s,x) + u(s,x) - g(s,x))_+ + g(s,x) - u(s,x) = 0$ ($H_T$-a.e.) and $u(T,x) = g(T,x)$ on $E_d$, then $u = w$ in $\bar H_T$;
b. if $F[u] = 0$ ($H_T$-a.e.) and $u(T,x) = g(x)$ on $E_d$, then $u = v$ in $\bar H_T$.
Lemma 4, Lemma 10, and Corollary 13 enable us to prove Theorems 3.4.13 and 3.4.14.

15. Proof of Theorem 3.4.13. Since $nu_+ - nu \ge 0$, by the hypotheses of the theorem
$$\sup_{\alpha\in A}\,\bigl[L^\alpha g - ng + \tilde f^\alpha\bigr] \ge 0 \quad (H_T\text{-a.e.}),$$
where $\tilde f^\alpha = n\bar v_n + n(g-\bar v_n)_+ + f^\alpha + K(1+|x|)^m$. This implies, by Corollary 13, that
$$g(s,x) \le \sup_{\alpha\in\mathfrak B} M^\alpha_{s,x}\int_0^{T-s} \bigl[n\bar v_n + n(g-\bar v_n)_+ + f^{\alpha_t}\bigr](s+t,\,x_t)\,e^{-\varphi_t-nt}\,dt + \sup_{\alpha\in\mathfrak B} M^\alpha_{s,x}\,K\sup_{t\le T-s}\bigl(1+|x_t|\bigr)^m\int_0^{T-s} e^{-nt}\,dt.$$
According to Lemma 3.4.3, the first term of the last expression is equal to $\bar v_n(s,x)$. According to Corollary 2.5.12, the second term of the last expression does not exceed
$$\frac1n\,N(1+|x|)^m.$$
Therefore, $n(g - \bar v_n) \le N(1+|x|)^m$. Further, we compare the definition of $w$ with Lemma 3.4.3b. Using the fact that $|g - g_n| = (g - \bar v_n)_+ \le (1/n)N(1+|x|)^m$, we have
$$|w(s,x) - w_n(s,x)| \le \frac1n\,N\sup_{\alpha\in\mathfrak B} M^\alpha_{s,x}\sup_{t\le T-s}\bigl(1+|x_t|\bigr)^m.$$
By invoking Corollary 2.5.12 once again, we complete the proof of Theorem 3.4.13.
16. Proof of Theorem 3.4.14. First we show that $Q \subset Q_b$. Assume the opposite: let there exist a point $(s_0,x_0) \in Q\setminus Q_b$. It is seen that $w(s_0,x_0) = g(s_0,x_0)$. Without loss of generality we assume that $x_0 = 0$. Furthermore, let $\varepsilon_1 = \tfrac13 h(s_0,0)$. Since $\varepsilon_1 > 0$ and $h(t,x)$ is a continuous function, for all sufficiently small $R$, $\rho$ the function $h(t,x)$ is greater than $\varepsilon_1$ on the cylinder $\bar C_{\rho,R}$. We choose appropriate values of $R > 0$ and $\rho > 0$. Next, we diminish $\rho$ so that the inequality
$$P\Bigl\{\sup_{t\le\rho}|z_t| \ge R\Bigr\} \le \frac14 \tag{11}$$
is satisfied for all processes $z_t$ of the form
$$z_t = \int_0^t \sigma_r\,dw_r + \int_0^t b_r\,dr \tag{12}$$
for which $\|\sigma_r\| + |b_r| \le 2K(1+R)$ for all $r$, $\omega$. We can arrange for (11) to be satisfied by choosing $\rho > 0$ due to Corollary 2.5.12. We note immediately the following implication of (11):
$$M\tau \ge \tfrac34\,\rho, \tag{13}$$
where $\tau$ is the time of first exit of $(s_0+t,\,z_t)$ from $\bar C_{\rho,R}$. For deriving (13) from (11) we note that $\tau = \rho$ whenever $\sup_{t\le\rho}|z_t| < R$. Hence
$$M\tau \ge \rho\,P\Bigl\{\sup_{t\le\rho}|z_t| < R\Bigr\} \ge \tfrac34\,\rho.$$
Further, we denote by $\bar h(t,x)$ a continuous function which has compact support in $H_T$ and is equal to $h(t,x)$ on $\bar C_{\rho,R}$. By the hypothesis of the theorem, $\sup_{\alpha\in A}\bigl[L^\alpha g + \bar f^\alpha\bigr] \ge 0$ ($\bar C_{\rho,R}$-a.e.), where $\bar f^\alpha = f^\alpha - \bar h$. Next, we fix $\varepsilon > 0$ and apply Lemma 10, replacing $f^\alpha$ in that lemma by $\bar f^\alpha$. Then we can find strategies $\alpha^{n,\delta}$ such that
$$g(s_0,0) \le \varlimsup_{\delta\downarrow0}\,\varlimsup_{n\to\infty}\, M\Bigl[g\bigl(s_0+\tau^{n,\delta},\,z^{n,\delta}_{\tau^{n,\delta}}\bigr)e^{-\varphi_{\tau^{n,\delta}}} + \int_0^{\tau^{n,\delta}} \bigl(f^{\alpha^{n,\delta}_t} - \bar h\bigr)\bigl(s_0+t,\,z^{n,\delta}_t\bigr)\,e^{-\varphi_t}\,dt\Bigr] + \varepsilon(T-s_0),$$
where $\tau^{n,\delta}$ is the time of first exit of the process $(s_0+t,\,z^{n,\delta}_t)$ from $\bar C_{\rho,R}$. Recalling the definition of $w$ as well as the equality $g(s_0,0) = w(s_0,0)$, we
derive from the inequality given above

$$\varlimsup_{\delta\downarrow 0}\,\varlimsup_{n\to\infty}\, M^{\alpha^{n,\delta}} \int_0^{\tau^{n,\delta}} \tilde h(s_0+t, x_t)\,e^{-\varphi_t}\,dt \le \varepsilon(T-s_0).$$

Using Lemma 9 (in which we take $g = 0$, $f^\alpha = K$), we find a similar estimate. Note that $|z_t^{n,\delta}| \le R$ for $t \le \tau^{n,\delta}$. Hence $c^\alpha(s_0+t, z_t^{n,\delta}) \le K(1+R)^m$. Furthermore, $(s_0+t, z_t^{n,\delta}) \in C_{\rho,R}$ for $t \le \tau^{n,\delta}$. Therefore, in (14) we can replace $\tilde h$ by $h$. Finally, using the inequality $h > \varepsilon_1$ on $C_{\rho,R}$, we obtain from (14)

$$\varepsilon_1 e^{-K\rho(1+R)^m}\, \varlimsup_{\delta\downarrow 0}\,\varlimsup_{n\to\infty}\, M\tau^{n,\delta} \le \varepsilon(T-s_0). \tag{15}$$

The time $\tau^{n,\delta}$ is the time of first exit of the process $(s_0+t, z_t^{n,\delta})$ from $C_{\rho,R}$. In this case

$$z^{n,\delta}_{t\wedge\tau^{n,\delta}} = \int_0^{t\wedge\tau^{n,\delta}} \sigma^n_r\,dw_r + \int_0^{t\wedge\tau^{n,\delta}} b^n_r\,dr.$$

Also, the expression containing $b^n$ can be estimated in a similar way. Therefore, by virtue of (13), $M\tau^{n,\delta} \ge \frac12\rho$ and, moreover, from (15) we have (16).
In (16), $\rho$, $R$, $\varepsilon_1$ do not depend on $\varepsilon$, and $\varepsilon$ is arbitrarily small. Therefore, the left-hand side of (16) is equal to zero. However, this contradicts the inequalities $\rho > 0$ and $\varepsilon_1 > 0$, thus proving that $Q \subset Q_b$. We now take a connected component $Q_b'$ of the region $Q_b$ and prove that this component contains at least one connected component of the set $Q$. Assume the converse. Then it follows from the inclusion $Q \subset Q_b$ that $Q_b' \cap Q = \varnothing$. Therefore, $F[g] \le 0$ almost everywhere on $Q_b'$. Thus, by Lemma 4, on $Q_b'$

$$g(s,x) \ge \sup_{\alpha\in\mathfrak A}\,\sup_{\tau\in\mathfrak M(T-s)} M^\alpha_{s,x}\left\{\int_0^\tau f^{\alpha_t}(s+t, x_t)\,e^{-\varphi_t}\,dt + g(s+\tau,\,x_\tau)\,e^{-\varphi_\tau}\right\}, \tag{17}$$

where $\tau = \tau^{\alpha,s,x}$ is the time of first exit of $(s+t, x_t^{\alpha,s,x})$ from $Q_b'$. It is seen that for $(s,x) \in Q_b'$ we have $g(s+\tau, x_\tau) = w(s+\tau, x_\tau)$. Hence in (17) we may replace $g(s+\tau, x_\tau)$ by $w(s+\tau, x_\tau)$.
Using Theorem 3.1.9, we conclude from (17) that $g = w$ on $Q_b'$, which contradicts the inequality $g < w$ on $Q$, thus proving the theorem. Many constructions carried out in this and the preceding sections involve strategies of a special form.
17. Definition. Let $(s_0,x_0) \in H_T$. A strategy $\alpha_t(\omega)$ is said to be an $(s_0,x_0)$-adjoint Markov strategy if on $[0,T] \times E_d \times E_d$ we can define a matrix $\sigma(t,x,z)$ of dimension $d \times d_1$, a $d$-dimensional vector $b(t,x,z)$, a function $a(t,x,z)$ with values in $A$, and, finally, a number $\delta$ such that there exists a solution $(x_t; z_t)$ of the system of equations (18), progressively measurable with respect to $\{\mathscr F_t\}$, and, furthermore, for this solution $\alpha_t(\omega) = a(s_0+t, x_t(\omega), z_t(\omega))$ for all $t \in [0, T-s_0]$, $\omega$. We denote by $\mathfrak A_{aM}(s_0,x_0)$ the set of all $(s_0,x_0)$-adjoint Markov strategies. Note that if a strategy $\alpha$ from $\mathfrak A_{aM}(s_0,x_0)$ was constructed with the aid of Eq. (18), then $x_t = x_t^{\alpha,s_0,x_0}$ for $t \le T-s_0$. In fact, the first relation in (18) can be written as follows:

$$x_t = x_0 + \int_0^t \sigma(\alpha_r, s_0+r, x_r)\,dw_r + \int_0^t b(\alpha_r, s_0+r, x_r)\,dr.$$
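To make the expanded-system viewpoint concrete, here is a small Euler simulation sketch (not from the book; all concrete coefficients are hypothetical stand-ins). The control applied to $x_t$ at time $t$ is a Markov function $a(t,x,z)$ of the expanded state $(x_t; z_t)$, as in Definition 17:

```python
import math
import random

# Illustrative only: the auxiliary ("guiding") process z_t is driven by its
# own Wiener process, and the control is a Markov function a(t, x, z) of the
# EXPANDED state (x, z).  The concrete coefficients below are hypothetical
# stand-ins, not taken from the book.

def simulate_adjoint_markov(x0, z0, T=1.0, dt=1e-3, seed=0):
    rng = random.Random(seed)
    x, z, t = x0, z0, 0.0
    while t < T:
        dw1 = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment driving x
        dw2 = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment driving z
        a = 1.0 if z >= 0 else -1.0          # control a(t, x, z): Markov in (x, z)
        x += a * dw1                         # dx = sigma(a_t, t, x) dw; sigma = a here
        z += dw2                             # dz: the guiding process
        t += dt
    return x, z
```

The strategy $\alpha_t = a(s_0+t, x_t, z_t)$ produced by such a loop is not Markov in $x_t$ alone, but it is Markov with respect to the pair $(x_t; z_t)$.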
It is not hard to see that the strategies $\alpha^{n,\delta}$ which were constructed in Lemma 8 are adjoint Markov. If we take $(w_t; \mathscr F_t)$ in Section 2 for $(\tilde w_t; \tilde{\mathscr F}_t)$, and if, in addition, we assume in Eq. (18) that $a(t,x,z) = \alpha[\rho](t, x + (z - x_0))$, $\sigma = 0$, $b = 0$, and $\delta = 1$, it turns out that the strategy $\beta[\rho]$ which was introduced prior to Lemma 2.1 is adjoint Markov. It is clear that from a practical point of view we can interpret the employment of adjoint Markov strategies as supplementing the controlled system under consideration with "adjoint" coordinates $z_t$, as a result using Markov strategies with respect to the expanded controlled system $(x_t; z_t)$. We note also that in the deterministic case the process $z_t$ is called a guiding process (see Krassovsky and Subbotin [27]).

18. Exercise. Assume that $v \in W^{1,2}_{d+1,\mathrm{loc}}(H_T)$, $F[v] = 0$ ($H_T$-a.e.). We fix $\varepsilon > 0$ and take a Borel function $\alpha(s,x)$ on $H_T$ with values in $A$ such that

$$L^{\alpha(s,x)} v(s,x) + f^{\alpha(s,x)}(s,x) \ge -\varepsilon.$$
For $(s_0,x_0) \in H_T$ we define the $(s_0,x_0)$-adjoint Markov strategy $\alpha^{n,\delta}$ in the same way as we did in Lemma 8. Prove that

$$v(s_0,x_0) \le \varlimsup_{\delta\downarrow 0}\,\varlimsup_{n\to\infty}\, v^{\alpha^{n,\delta}}(s_0,x_0) + \varepsilon(T-s_0).$$
19. Exercise. (Compare with Exercise 2.7.) Consider a one-dimensional controlled process: $d = d_1 = 1$, $T = 1$, $A = \{-1\} \cup \{+1\}$, $\sigma(\alpha,s,x) = \sigma(x + \alpha)$, where $\sigma(x) = \operatorname{sgn} x$ for $|x| \ge 1$, $\sigma(x) = x$ for $|x| \le 1$. Let $b = c = f = 0$, $g(x) = x^2$. Show that $v(s,x) = x^2 + 1 - s$. Also show that one can find $\varepsilon$-optimal strategies for the point $(0,0)$ among strategies of the form $\alpha_t = \operatorname{sgn} z_t$, where $z_t$ is a solution of the corresponding equation with $\sigma_n(x) = \sigma(nx)\operatorname{sgn} x$, $n$ a sufficiently large number, and $\delta$ a sufficiently small positive number.
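A Monte Carlo sketch of the claim $v(s,x) = x^2 + 1 - s$ (not part of the book; it assumes the reading $\sigma(\alpha,s,x) = \sigma(x+\alpha)$ above). With the feedback $\alpha = \operatorname{sgn} x$ one always has $|x + \alpha| \ge 1$, so the diffusion coefficient is $\pm 1$ and $M g(x_{1-s}) = x^2 + (1-s)$ for $g(x) = x^2$:

```python
import math
import random

# Illustrative Euler/Monte Carlo scheme (hypothetical reading of the data):
# sigma(y) = sgn y for |y| >= 1, sigma(y) = y for |y| <= 1,
# controlled equation dx = sigma(x + alpha) dw, payoff g(x) = x^2.

def sigma(y):
    return math.copysign(1.0, y) if abs(y) >= 1.0 else y

def estimate_v(s, x0, n_paths=20000, dt=1e-2, seed=1):
    rng = random.Random(seed)
    total = 0.0
    steps = round((1.0 - s) / dt)
    for _ in range(n_paths):
        x = x0
        for _ in range(steps):
            a = 1.0 if x >= 0 else -1.0   # feedback control alpha = sgn x
            x += sigma(x + a) * rng.gauss(0.0, math.sqrt(dt))
        total += x * x                    # g(x) = x^2
    return total / n_paths
```

For $s = 0$, $x = 0$ the estimate should be close to $0^2 + 1 - 0 = 1$; since $|\sigma(x+\alpha)| \le 1$ for every admissible control, no strategy can do better.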
20. Exercise. Show that in the situations described in Exercises 2.7 and 19 there exists no optimal strategy if $\mathscr F_t$ is the completion of the $\sigma$-algebra generated by $w_s$ for $s \in [0,t]$. The question as to whether the equality $v_M(0,0) = v(0,0)$ is true in Exercise 19 still remains open.
Notes

Sections 1, 2. Markov strategies are frequently understood as Borel functions $\alpha(t,x)$ for which the appropriate stochastic equation has at least a weak solution. The existence of $\varepsilon$-optimal strategies in the class of such strategies is proved in Krylov [36], Nisio [53, 59], and Portenko and Skorokhod [61]. Fleming in [15] constructed Markov (in the sense of the definition adopted in our book) optimal strategies for one class of problems. Some problems of finding optimal strategies have been solved in Beneš [6]. Our discussion closely follows [35].

Section 3. If (each) solution of the Bellman equation is equal to a payoff function, the Bellman equation can have only one solution. Theorems on the uniqueness of solutions of nonlinear equations, unrelated to optimal control theory, can be found in Bakel'man [3], Ladyzhenskaja and Uraltseva [46], Ladyzhenskaja, Solonnikov, and Uraltseva [47], and [33, 43].
6

Controlled Processes with Unbounded Coefficients. The Normed Bellman Equation
In Chapters 3, 4, and 5 we studied controlled processes on a finite time interval under the assumption that the initial data $\sigma(\alpha,t,x)$, $b(\alpha,t,x)$, $c^\alpha(t,x)$, and $f^\alpha(t,x)$ are bounded functions of $\alpha$ for each $(t,x)$. The objective of Chapter 6 is to carry the results obtained in Chapters 3-5 over to controlled processes with coefficients unbounded with respect to $\alpha$, and also to consider controlled processes on an infinite time interval.
1. Generalizations of the Results Obtained in Section 3.1

Let $E_d$ be a Euclidean space of dimension $d$, let $T \in (0,\infty)$, let $d_1$ be an integer, and, finally, let $(w_t, \mathscr F_t)$ be a $d_1$-dimensional Wiener process. We denote by $A$ a separable metric space. We fix a representation of $A$ as the union of nonempty increasing sets $A_n$: $A = \bigcup_{n=1}^\infty A_n$, $A_{n+1} \supset A_n$ (possibly, $A_1 = A_2 = \cdots = A$). For $t \ge 0$, $x \in E_d$, and $\alpha \in A$ we assume that the functions $\sigma(\alpha,t,x)$, $b(\alpha,t,x)$, $c^\alpha(t,x) \ge 0$, $f^\alpha(t,x)$, $g(x)$, and $g(t,x)$, assumed given, have the same meaning as the functions given in Section 3.1. We assume that the functions $\sigma$, $b$, $c$, and $f$ are continuous with respect to $(\alpha,x)$. We also assume that for each $t$ and $n$ the foregoing functions are continuous with respect to $x$ uniformly with respect to $\alpha \in A_n$, and that they are Borel with respect to $(\alpha,t,x)$. Furthermore, for each $n$ let there exist constants $m_n \ge 0$ and $K_n \ge 0$ such that for all $x$ and $y \in E_d$, $t \ge 0$, and $\alpha \in A_n$
We assume that $g(x)$ and $g(t,x)$ are continuous and, in addition, that for some constants $m \ge 0$ and $K \ge 0$, for all $t$, $x$,
1. Definition. Let $n \ge 1$. We write $\alpha \in \mathfrak A_n$ if the process $\alpha = \alpha_t(\omega)$ $(t \ge 0)$ is progressively measurable with respect to $\{\mathscr F_t\}$ and assumes values from $A_n$ for $t \in [0,T]$. Let $\mathfrak A = \bigcup_n \mathfrak A_n$. The elements of the set $\mathfrak A$ are said to be strategies.
Fixing $n$ and considering only the strategies $\alpha$ from $\mathfrak A_n$, we have the scheme which we considered in Sections 3.1-3.4. Using here the usual notation given in Chapter 3, we put

$$v_n(t,x) = \sup_{\alpha\in\mathfrak A_n} v^\alpha(t,x), \qquad w_n(t,x) = \sup_{\alpha\in\mathfrak A_n}\,\sup_{\tau\in\mathfrak M(T-t)} v^{\alpha,\tau}(t,x).$$

For each $n$ we define in an obvious manner the set of natural strategies admissible at a point $(t,x)$. We denote this set by $\mathfrak A_{n,E}(t,x)$. Let $\mathfrak A_E(t,x) = \bigcup_n \mathfrak A_{n,E}(t,x)$. Further, let

$$v(t,x) = \sup_{\alpha\in\mathfrak A} v^\alpha(t,x), \qquad w(t,x) = \sup_{\alpha\in\mathfrak A}\,\sup_{\tau\in\mathfrak M(T-t)} v^{\alpha,\tau}(t,x).$$
It is seen that $v_n \to v$ and $w_n \to w$ as $n \to \infty$. In addition, $v_n \le v$ and $w_n \le w$ for each $n$. From this and from Theorems 3.1.5 and 3.1.8, which enable us to estimate $|v_n|$ and $|w_n|$, we find a bound for $(t,x) \in \bar H_T$, where $N = N(K_n, m_n, T)$. It is clear that the functions $v$ and $w$ can in general take on values equal to $+\infty$. In order to give a sufficient condition for the functions $v$ and $w$ to be finite, we make use of Definition 5.3.2 of excessive functions and Definition 5.3.3 of superharmonic functions. Note that the operator $F$ appearing in these definitions can be represented by the same formula as that given in the introduction to Chapter 4.
2. Lemma. Let a function $u$ be given and let this function be continuous on $\bar H_T$. Let $Q \subset H_T$ be a region. Assume that $u$ is a superharmonic (or excessive) function in the region $Q$. Then, if:
a. $u(t,x) \ge g(t,x)$ in $Q$ and $u \ge w$ in $H_T \setminus Q$, then $u \ge w$ in $\bar H_T$;
b. $u \ge v$ in $H_T \setminus Q$ and $u(T,x) \ge g(x)$ in $E_d$, then $u \ge v$ in $\bar H_T$.

This lemma follows immediately from the inequalities $w_n \le u$, $v_n \le u$ (Theorem 5.3.5) and the fact that $w_n \to w$ and $v_n \to v$ as $n \to \infty$. In some cases the condition of the lemma can easily be verified. We show how this can be done if, for example, the assumptions made in Section 3.1 are satisfied. Our arguments will not lead us to a new result; however, they will be useful from the viewpoint of methodology. Thus, we take the numbers $K$ and $m$ given in Section 3.1 and, furthermore, we put
$$u(t,x) = 2^m K e^{N_1(T-t)}(1+|x|^2)^{m/2},$$

choosing the constant $N_1$ later. We compute $L^\alpha u$, where $\sigma = \sigma(\alpha,t,x)$, etc. Since $\sigma$ and $b$ satisfy condition (3.1.2) (that is, condition (2) for $K_n = K$), the left-hand side of the resulting inequality is negative if $N_1$ is chosen sufficiently large. In addition, in $H_T$ we have $u(t,x) \ge g(t,x)$, and $u(T,x) \ge g(x)$ in $E_d$.
Therefore, the function $u(t,x)$ satisfies all the conditions of Lemma 2; in this case, moreover, it is possible to take $Q = H_T$. From Lemma 2 we obtain the result which is familiar to us from Chapter 3: if the conditions of Section 3.1 are satisfied, then $v \le u$ and $w \le u$. In the particular case considered we applied Lemma 2 for $Q = H_T$. Generally speaking, in applying Lemma 2 in specific cases it is natural to attempt to find a superharmonic (or excessive) function for which the conditions of Lemma 2 are satisfied for $Q = H_T$. In such situations $H_T \setminus Q$ is empty and, in addition, the formulation of the lemma contains no conditions to be imposed on the (in general unknown) values of $v$ and $w$ in $H_T$. However, it is not always possible to construct such a function $u$.
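The computation behind this barrier-function argument can be sketched as follows (an outline, not the book's display; the constant $N_2 = N_2(K,m,d)$ is indicative). With $V(x) = (1+|x|^2)^{m/2}$ and $u = 2^m K e^{N_1(T-t)} V$:

```latex
\begin{aligned}
|V_{x^i}| &\le m\,(1+|x|^2)^{(m-1)/2}, \qquad
|V_{x^i x^j}| \le m(m+2)\,(1+|x|^2)^{(m/2)-1},\\
L^\alpha u &= \frac{\partial u}{\partial t}
  + a^{ij}(\alpha,t,x)\,u_{x^i x^j} + b^i(\alpha,t,x)\,u_{x^i} - c^\alpha(t,x)\,u\\
&\le \bigl(-N_1 + N_2(K,m,d)\bigr)\,u \le 0
\quad\text{for } N_1 \ge N_2,
\end{aligned}
```

using $\|\sigma\|^2 \le K^2(1+|x|)^2$, $|b| \le K(1+|x|)$, and $c^\alpha \ge 0$; superharmonicity of $u$ then follows, and $u \ge g$ by the growth bound on $g$.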
3. Exercise. We consider a one-dimensional case: $d = d_1 = T = 1$, $A = [0,\infty)$, $A_n = [0,n]$, $\sigma = \alpha x$, $b = c = f = 0$, and $g(x)$ is an arbitrary continuous function satisfying the inequality $|x| \le g(x) \le 2|x| + 1$. Prove that there exist no superharmonic functions $u$ (in the sense of Definition 5.3.3) in $H_1$ for which $u(1,x) \ge g(x)$ for all $x$. (At the same time, the function $2|x| + 1$ is superharmonic in $Q = H_1 \setminus \{(t,x): x = 0\}$. For $t = 1$ this function is not smaller than $g(x)$, and for $x = 0$ it is equal to $1 \ge g(0) = v(t,0)$. Hence the function $2|x| + 1$ together with the region $Q$ satisfies the conditions of Lemma 2.)
In practice it is natural to consider only strategies from $\mathfrak A$, that is, strategies $\alpha_t(\omega)$ which assume values from some $A_n$ that is the same for all $(t,\omega)$. In this connection, we give the following

4. Definition. Let $(s,x) \in H_T$. A process $\alpha_t(\omega)$ which is progressively measurable with respect to $\{\mathscr F_t\}$ and has values in $A$ is said to be a strategy admissible at the point $(s,x)$ if:
a. for $t \in [0, T-s]$ there exists (at least one) solution of the equation

$$x_t = x + \int_0^t \sigma(\alpha_r, s+r, x_r)\,dw_r + \int_0^t b(\alpha_r, s+r, x_r)\,dr \tag{6}$$

for which:
b. $M \int_0^{T-s} |f^{\alpha_t}(s+t, x_t)| \exp\left(-\int_0^t c^{\alpha_r}(s+r, x_r)\,dr\right) dt < \infty$;
c. $M \sup_{t\le T-s} |x_t|^{m\vee m_1} < \infty$;
d. there is an integer-valued function $n(R) = n^\alpha(R)$ such that $\alpha_t \in A_{n(\sup_{r\le t}|x_r|)}$ for all $t \in [0, T-s]$, $\omega$.
We denote by $\mathfrak A(s,x)$ the set of all strategies admissible at the point $(s,x)$. If $\alpha \in \mathfrak A(s,x)$, we denote by $x_t^{\alpha,s,x}$ a certain (fixed forever) solution of Eq. (6) which satisfies conditions (b), (c), (d).¹ For the strategies $\alpha \in \mathfrak A(s,x)$ we shall use the abbreviated notations $\varphi_t^{\alpha,s,x}$, $M^\alpha_{s,x}$, etc., giving them the same meaning as we did before. We note that the expressions $v^\alpha(s,x)$ and $v^{\alpha,\tau}(s,x)$ for $\alpha \in \mathfrak A(s,x)$ and $\tau \in \mathfrak M(T-s)$ have been defined in accord with Definition 4b,c as well as condition (4), although these expressions are possibly equal to $+\infty$. Under the assumptions made in Section 3.1 the sets $\mathfrak A$ and $\mathfrak A(s,x)$ coincide if we put $A_1 = A_2 = \cdots = A$. In fact, comparing Definition 4 and Definition 3.1.1, we have that $\mathfrak A(s,x) \subset \mathfrak A$. On the other hand, for $\alpha \in \mathfrak A$ condition (d) is a consequence of the equality of the $A_n$. Conditions (b) and (c) follow from the estimates of moments of solutions of stochastic equations (see Section 2.5), which fact shows as well that in the general case $\mathfrak A(s,x) \supset \mathfrak A_n$ for each $n$, and hence $\mathfrak A(s,x) \supset \mathfrak A$. Therefore

$$v(s,x) \le \sup_{\alpha\in\mathfrak A(s,x)} v^\alpha(s,x),$$

¹ Indeed, it is not hard to show that there is only one such solution to within an equivalence.
$$w(s,x) \le \sup_{\alpha\in\mathfrak A(s,x)}\,\sup_{\tau\in\mathfrak M(T-s)} v^{\alpha,\tau}(s,x).$$
Before formulating the next theorem, we note that due to (5) and Definition 4b,c the expressions in (7) (see below) standing under the sign of the upper bound are well defined and, furthermore, greater than $-\infty$.
5. Theorem. (a) $v(s,x) = \sup_{\alpha\in\mathfrak A(s,x)} v^\alpha(s,x)$ for all $(s,x) \in \bar H_T$. (b) The function $v(s,x)$ is lower semicontinuous on $\bar H_T$, and $v(T,x) = g(x)$. (c) If $(s,x) \in H_T$ and if for each $\alpha \in \mathfrak A(s,x)$ a time $\tau = \tau^\alpha \in \mathfrak M(T-s)$ and a bounded (with respect to $(t,\omega)$) nonnegative progressively measurable (with respect to $\{\mathscr F_t\}$) process $r_t = r_t^\alpha$ are defined, then (7) holds. (d) In (a) and in (7) we can replace the set $\mathfrak A(s,x)$ by $\mathfrak A_E(s,x)$ as well as by $\mathfrak A$.
PROOF. Since the equality $v(T,x) = g(x)$ is obvious, (a) follows from (c) for $r_t^\alpha \equiv 0$, $\tau^\alpha = T - s$. From the continuity of $v_n(s,x)$ (see Theorem 3.1.5) we deduce

$$\varliminf_{(t,y)\to(s,x)} v(t,y) = \varliminf_{(t,y)\to(s,x)}\,\sup_n v_n(t,y) \ge \sup_n\,\varliminf_{(t,y)\to(s,x)} v_n(t,y) = \sup_n v_n(s,x) = v(s,x).$$
Therefore, $v(s,x)$ is lower semicontinuous. We have proved (b). Let

$$Y^{\alpha,s,x}(u,\tau) = \int_0^\tau \bigl[f^{\alpha_t}(s+t, x_t^{\alpha,s,x}) + r_t\,u(s+t, x_t^{\alpha,s,x})\bigr] e^{-\varphi_t - \int_0^t r_p\,dp}\,dt + u(s+\tau,\,x_\tau^{\alpha,s,x})\,e^{-\varphi_\tau - \int_0^\tau r_p\,dp}.$$

Also, we introduce $Y(u,\tau)$, omitting the superscripts $\alpha$, $s$, $x$ in the last formula. By Theorem 3.1.7, for each $n$

$$v_n(s,x) = \sup_{\alpha\in\mathfrak A_{n,E}(s,x)} M^\alpha_{s,x}\, Y(v_n,\tau) \le \sup_{\alpha\in\mathfrak A_E(s,x)} M^\alpha_{s,x}\, Y(v,\tau).$$

Therefore,

$$v(s,x) \le \sup_{\alpha\in\mathfrak A_E(s,x)} M^\alpha_{s,x}\, Y(v,\tau) \le \sup_{\alpha\in\mathfrak A} M^\alpha_{s,x}\, Y(v,\tau),$$

which immediately implies that in order to prove the theorem it suffices to prove the inequality

$$v(s,x) \ge M^\alpha_{s,x}\, Y(v,\tau) \tag{8}$$
for all $\alpha \in \mathfrak A(s,x)$, $\tau \in \mathfrak M(T-s)$. We fix $\alpha \in \mathfrak A(s,x)$. Further, for $R > 0$ we define the strategy $\beta$ by the formula $\beta_t = \alpha_{t\wedge\tau_R}$, where $\tau_R$ is the time of first exit of $x_t^{\alpha,s,x}$ from $S_R$. It is seen that $\beta \in \mathfrak A_{n(R)}$, where $n(R) = n^\alpha(R)$ has been taken from Definition 4. Furthermore, the processes $x_t^{\alpha,s,x}$ and $x_t^{\beta,s,x}$ satisfy the same equation. Hence the foregoing processes coincide for all $t \in [0, T-s]$; in particular, $x_t^{\beta,s,x} = x_t^{\alpha,s,x}$ for all $t \le \tau_R$ almost surely. Since $\beta \in \mathfrak A_{n(R)+j}$ for $j = 1, 2, \ldots$, we have by Theorem 3.1.6 that (9) holds.
Let $j \to \infty$, $R \to \infty$. We note that as $R \to \infty$ we have $\tau_R \to \infty$, so that $\tau \wedge \tau_R = \tau$ for each $\omega$ for all sufficiently large $R$. By Fatou's lemma

$$\varliminf_{R\to\infty}\,\varliminf_{j\to\infty}\, M^\alpha_{s,x}\, Y_{(+)}(v_{n(R)+j},\, \tau\wedge\tau_R) \ge M^\alpha_{s,x}\, Y_{(+)}(v,\tau).$$

Furthermore, by virtue of inequalities (5) and Definition 4b,c the totality of variables $Y_{(-)}^{\alpha,s,x}(v_{n(R)+j},\, \tau\wedge\tau_R)$ is bounded by a summable quantity. From this, using the Lebesgue dominated convergence theorem, we find

$$\lim_{R\to\infty}\,\lim_{j\to\infty}\, M^\alpha_{s,x}\, Y_{(-)}(v_{n(R)+j},\, \tau\wedge\tau_R) = M^\alpha_{s,x}\, Y_{(-)}(v,\tau).$$
Since $Y = Y_{(+)} - Y_{(-)}$, the above relations enable us to obtain (8) from (9), thus completing the proof of the theorem.

6. Exercise. In Exercise 3 we take $g(s,x) = [x^2/(|x|+1)](1-s)$. Prove that the assumptions of Lemma 2a can be satisfied for $u(s,x) = |x|$, $Q = H_1 \setminus \{(s,0)\}$. Prove also that $w(s,x) = |x|(1-s)$ and, furthermore, that if $\varepsilon \ge 0$ and $\tau_\varepsilon = \tau_\varepsilon^{\alpha,s,x}$ is the time of first exit of $(s+t, x_t^{\alpha,s,x})$ from the region
*","
+
then for (s,x)E Q, sup M:,,g(s KEU
+ r,, x,=) + 6 = sup M:,,w(s + z,,x,J = ~(1x1+ 1) < w(s,x). aePI
This exercise demonstrates that not all the assertions of Theorems 3.1.10 and 3.1.11 are true in general. In this connection, we do not prove theorems on $\varepsilon$-optimal stopping times in the general case. We note nevertheless that it is convenient to search for $\varepsilon$-optimal stopping times using Theorem 3.1.10 and approximating $w(s,x)$ with the aid of $w_n(s,x)$. In the next theorem, it is useful to keep in mind the remark made before Theorem 5.
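For a fixed strategy, approximating $w$ reduces to a classical optimal-stopping computation, which can be carried out by backward induction. The sketch below (illustrative, not from the book) computes the value of stopping a one-dimensional random walk approximating a Wiener process on $[0,1]$; for $g(x) = x^2$ the continuation value at each node exceeds $g$, so the value at the origin equals $0^2 + 1 = 1$:

```python
import math

# Snell envelope on a binomial tree: w(T, x) = g(x),
# w(t, x) = max(g(x), average of w(t + dt, x +- sqrt(dt))).
# Illustrative setup (not from the book): Brownian state on [0, T].

def snell_envelope_value(g, T=1.0, steps=200):
    dt = T / steps
    dx = math.sqrt(dt)
    # node j at level k corresponds to x = (2*j - k) * dx, j = 0..k
    w = [g((2 * j - steps) * dx) for j in range(steps + 1)]  # terminal payoff
    for k in range(steps - 1, -1, -1):
        w = [max(g((2 * j - k) * dx), 0.5 * (w[j] + w[j + 1]))
             for j in range(k + 1)]
    return w[0]  # value w(0, 0)
```

Refining the grid and enlarging the control set would give the approximations $w_n$ discussed above; the near-optimal stopping time is read off as the first node where $w = g$.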
7. Theorem. (a) For all $(s,x) \in \bar H_T$

$$w(s,x) = \sup_{\alpha\in\mathfrak A(s,x)}\,\sup_{\tau\in\mathfrak M(T-s)} v^{\alpha,\tau}(s,x).$$

(b) The function $w(s,x)$ is lower semicontinuous on $\bar H_T$, and $w(s,x) \ge g(s,x)$ on $\bar H_T$. (c) For any $\alpha \in \mathfrak A(s,x)$, $\tau \in \mathfrak M(T-s)$, and nonnegative bounded progressively measurable (with respect to $\{\mathscr F_t\}$) process $r_t$, the inequality

$$w(s,x) \ge M^\alpha_{s,x}\left\{\int_0^\tau \bigl[f^{\alpha_t}(s+t,x_t) + r_t\,w(s+t,x_t)\bigr] e^{-\varphi_t-\int_0^t r_p\,dp}\,dt + w(s+\tau,\,x_\tau)\,e^{-\varphi_\tau-\int_0^\tau r_p\,dp}\right\}$$

is satisfied. (d) We have assertion (3.1.9) of Theorem 3.1.9, in which we can replace the upper bound with respect to $\alpha \in \mathfrak A$ by the upper bound with respect to $\alpha \in \mathfrak A(s,x)$ as well as by the upper bound with respect to $\alpha \in \mathfrak A_E(s,x)$.
PROOF. We prove (b) exactly in the same way as we proved (b) in the preceding theorem. Using the notation from the proof of Theorem 5, we can write (c) as follows: $w(s,x) \ge M^\alpha_{s,x}\, Y(w,\tau)$. We can prove this in the same way as we proved the analogous inequality for $v$, making use, however, of Theorem 3.1.11 instead of Theorem 3.1.6, by which for all $n$, $\beta \in \mathfrak A_n$,

$$w_n(s,x) \ge M^\beta_{s,x}\, Y(w_n,\tau).$$
Proof of (d). For each $\alpha \in \mathfrak A(s,x)$ let a time $\tau = \tau^\alpha \in \mathfrak M(T-s)$ be defined. Putting $r_t \equiv 0$ in (c) and also noting that $w(s,x) \ge g(s,x)$, we obtain

$$w(s,x) \ge \sup_{\alpha\in\mathfrak A(s,x)}\,\sup_{\gamma\in\mathfrak M(T-s)} M^\alpha_{s,x}\, Y(w,\, \tau\wedge\gamma) \ge \sup_{\alpha\in\mathfrak A(s,x)}\,\sup_{\gamma\in\mathfrak M(T-s)} M^\alpha_{s,x}\left\{\int_0^{\tau\wedge\gamma} f^{\alpha_t}(s+t, x_t)\,e^{-\varphi_t}\,dt + g(s+\tau\wedge\gamma,\,x_{\tau\wedge\gamma})\,e^{-\varphi_{\tau\wedge\gamma}}\right\}.$$

Let us extend the above inequalities, replacing first $\mathfrak A(s,x)$ by $\mathfrak A \subset \mathfrak A(s,x)$, and next by $\mathfrak A_E(s,x) \subset \mathfrak A$. Finally, let us take into account that the last resulting expression will not be smaller than the following expression:

$$\sup_{\alpha\in\mathfrak A_{n,E}(s,x)}\,\sup_{\gamma\in\mathfrak M(T-s)} M^\alpha_{s,x}\left\{\int_0^{\tau\wedge\gamma} f^{\alpha_t}(s+t, x_t)\,e^{-\varphi_t}\,dt + g(s+\tau\wedge\gamma,\,x_{\tau\wedge\gamma})\,e^{-\varphi_{\tau\wedge\gamma}}\right\}$$
for all $n$. Here we have equality by Theorem 3.1.9. To complete the proof of (d) we need to let $n \to \infty$ in the sequence of inequalities obtained if we proceed as indicated above. Assertion (a) follows from (d) if we take $\tau^\alpha = T - s$. We have proved the theorem. In Theorems 5 and 7 we asserted that $v(T,x) = g(x)$ and $w(T,x) = g(T,x)$. In other words, we know the boundary values of $v$ and $w$ for $s = T$. Regretfully, it may turn out that these boundary values of the functions $v$ and $w$ are weakly related to the values of these functions for $s < T$ (see Exercise 11). Thus, the question arises as to when the boundary values are attained, i.e., when

$$\lim_{s\uparrow T} v(s,x) = g(x), \qquad \lim_{s\uparrow T} w(s,x) = g(T,x).$$
The following lemma shows that always

$$\varliminf_{s\uparrow T} v(s,x) \ge g(x), \qquad \varliminf_{s\uparrow T} w(s,x) \ge g(T,x).$$
8. Lemma. For all $R > 0$ the limits

$$\lim_{s\uparrow T}\,\sup_{|x|\le R}\,[v(s,x) - g(x)]^-, \qquad \lim_{s\uparrow T}\,\sup_{|x|\le R}\,[w(s,x) - g(T,x)]^-$$

are equal to zero.
PROOF. By Theorem 3.1.5,

$$\lim_{s\uparrow T}\,\sup_{|x|\le R}\,|v_n(s,x) - g(x)| = 0.$$

Hence the assertion of the lemma follows from the inequality $[v - g]^- \le [v_n - g]^- \le |v_n - g|$, which holds because $v \ge v_n$. We can consider the function $w$ in a similar way, thus completing the proof of the lemma.
9. Theorem. (a) For each $\varepsilon > 0$ and $R > 0$ let there be $\delta > 0$ and a superharmonic (or excessive) function $u(s,x)$ in the region $(T-\delta, T) \times E_d$ such that in this region $u(s,x) \ge g(s,x)$, and $u(T,x) \le g(T,x) + \varepsilon$ for $|x| \le R$. Then for all $R > 0$

$$\lim_{s\uparrow T}\,\sup_{|x|\le R}\,|w(s,x) - g(T,x)| = 0.$$

(b) For each $\varepsilon > 0$ and $R > 0$ let there be $\delta > 0$ and a superharmonic (or excessive) function $u(s,x)$ in the region $(T-\delta, T) \times E_d$ such that $g(x) \le u(T,x)$ for all $x$ and $u(T,x) \le g(x) + \varepsilon$ for $|x| \le R$. Then for all $R > 0$

$$\lim_{s\uparrow T}\,\sup_{|x|\le R}\,|v(s,x) - g(x)| = 0.$$
PROOF OF (a). We take $\varepsilon > 0$ and $R > 0$. Also, we find $\delta > 0$ and the corresponding function $u(s,x)$. Applying Lemma 2a, we consider instead of the strip $H_T$ the strip $(T-\delta, T) \times E_d$ and, furthermore, we take the latter strip for $Q$. We thus have $u(s,x) \ge w(s,x)$ in $[T-\delta, T] \times E_d$. Further, due to the continuity of $u(s,x)$ and $g(s,x)$ we have

$$\varlimsup_{s\uparrow T}\,\sup_{|x|\le R}\,[w(s,x) - g(s,x)]^+ \le \sup_{|x|\le R}\,[u(T,x) - g(T,x)]^+.$$

By hypothesis, the last expression does not exceed $\varepsilon$. Since $\varepsilon$ is arbitrary,

$$\lim_{s\uparrow T}\,\sup_{|x|\le R}\,[w(s,x) - g(s,x)]^+ = 0.$$

Comparing this result with Lemma 8 and noting that $|a| = a^+ + a^-$, we have proved assertion (a). Assertion (b) can be proved in a similar way, thus proving the theorem.

10. Remark. We have applied the version of Lemma 2 in which $Q = H_T$. We could have used Lemma 2 in its general form, but the statement of the appropriate theorem would then become too cumbersome. Using the scheme investigated in Section 3.1 as an example, we show how to apply Theorem 9. We shall see that the assumptions of Theorem 9 can always be satisfied if we take the controlled processes given in Section 3.1. We consider only the second assertion of Theorem 9. Assume that $\varepsilon > 0$ and $R > 0$. First we find an infinitely differentiable function $\bar g(x)$ such that $\bar g(x) \ge g(x)$ for all $x$ and $\bar g(x) \le g(x) + \varepsilon$ for $|x| \le R$. It is seen that since $|g(x)| \le 2^m K(1+|x|^2)^{m/2}$, one can take $\bar g(x) = 2^m K(1+|x|^2)^{m/2}$ for $|x| > 2R$. Let $\tilde u$ be constructed from $\bar g$ in the same way as $u$ was constructed from $g$ after Lemma 2.
The computations, similar to those carried out after Lemma 2, and the assumptions made in Section 3.1 about the growth order of $\sigma$ and $b$ easily yield an estimate of $L^\alpha \tilde u$ in $H_T$.
The last expression is smaller than zero for $s \in (T-\delta, T)$ if $\delta$ is sufficiently small. Hence the function $\tilde u$ satisfies the hypotheses of Theorem 9b.
11. Exercise. We consider a one-dimensional case: $d = d_1 = T = 1$, $A = [0,\infty)$, $A_n = [0,n]$, $\sigma(\alpha,s,x) = \alpha$, $b = c = f = 0$. Let $g(x)$ be a bounded continuous function. Prove that for $s < 1$

$$v(s,x) = \sup_y g(y), \qquad \lim_{s\uparrow 1} v(s,x) = \sup_y g(y).$$
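A Monte Carlo sketch of why unbounded $\alpha$ yields $v(s,x) = \sup_y g(y)$ (illustrative only; the Gaussian bump $g$, the constant $n = 3$, and the freezing window are choices of ours, not the book's). With $\sigma(\alpha,s,x) = \alpha$ one can diffuse rapidly until the state is near a maximizer of $g$ and then switch the control off, freezing the state there; taking $n$ large makes this succeed with probability close to one:

```python
import math
import random

def g(y):
    return math.exp(-(y - 2.0) ** 2)  # bounded continuous, sup_y g(y) = 1 at y = 2

def payoff(alpha_feedback, x0=0.0, n=3.0, dt=1e-3, n_paths=2000, seed=2):
    """Euler scheme for dx = alpha dw on [0, 1]; Monte Carlo estimate of
    M g(x_1) under the feedback control alpha_feedback(x, n)."""
    rng = random.Random(seed)
    total = 0.0
    steps = round(1.0 / dt)
    for _ in range(n_paths):
        x = x0
        for _ in range(steps):
            a = alpha_feedback(x, n)
            x += a * rng.gauss(0.0, math.sqrt(dt))
        total += g(x)
    return total / n_paths

# diffuse at rate n until close to the maximizer of g, then freeze (alpha = 0)
near_optimal = lambda x, n: 0.0 if abs(x - 2.0) <= 0.3 else n
constant = lambda x, n: n   # for comparison: a constant control overshoots the bump
```

The near-optimal feedback gives an estimate well above the constant-control one; letting $n \to \infty$ and shrinking the window drives the value toward $\sup_y g = 1$, independently of the starting point — which is exactly why the boundary value $g(x)$ is not attained as $s \uparrow 1$.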
12. Exercise. Prove Theorems 3.1.10 and 3.1.11 for $\varepsilon > 0$ in the general case if it is known that each connected component of the region $Q$ is bounded. (Hint: It is necessary to prove first that $Q$ is in fact a region, and second, that for each connected component of $Q$ there is a number $n_0$ after which it is contained in the sets $\{(s,x): w_n(s,x) > g(s,x) + (\varepsilon/2)\}$.)
13. Exercise. Prove Theorems 3.1.10 and 3.1.11 for $\varepsilon > 0$ in the general case if it is known that $|w(s,x)| + |f^\alpha(s,x)| \le N(1+|x|)^m$ for all $\alpha$, $s$, $x$ and that, in addition, for some $r > 0$, for all $s$, $x$, $\varepsilon > 0$

$$\sup_{\alpha\in\mathfrak A} M^\alpha_{s,x}\,\sup_{t\le\tau_\varepsilon}\,|x_t|^{m+r} < \infty.$$
2. General Methods for Estimating Derivatives of Payoff Functions

In Chapter 4 estimates of derivatives of payoff functions played an essential role. They were used, in particular, in deriving the Bellman equations. The proof of the estimates given in Chapter 4 was based on the estimates, given in Section 2.8, of moments of the derivatives with respect to the initial data of solutions of stochastic equations. In this section we give more precise estimates of derivatives of solutions of stochastic equations as well as more precise estimates of derivatives of payoff functions, omitting the proofs, which will appear in a forthcoming publication. We introduce a vector $y^\alpha(t,x)$ of dimension $d\cdot d_1 + d + 4$, whose coordinates are given by the variables $\sigma^{ij}(\alpha,t,x)$ $(i = 1,\ldots,d;\ j = 1,\ldots,d_1)$,
$b^i(\alpha,t,x)$ $(i = 1,\ldots,d)$, $c^\alpha(t,x)$, $f^\alpha(t,x)$, $g(x)$, $g(t,x)$. We assume throughout this section that the assumptions made in the previous section are satisfied. We also assume that for each $\alpha \in A$, $t \in [0,T]$ the vector $y^\alpha(t,x)$ is twice continuously differentiable in $x$ and, in addition, for all $n$, $\alpha \in A_n$, $l \in E_d$, $(t,x) \in \bar H_T$
Besides these assumptions (tacit in the assertions of this section), we shall need additional ones. For convenience of reference we enumerate them. We fix $\delta \in (0, \frac12]$. Also, let

We fix five nonnegative functions $u_{-1}(t,x)$, $u_1(t,x)$, $u_{-2}(t,x)$, $u_2(t,x)$, $u_0(t,x)$, which together with two derivatives with respect to $x$ and one derivative with respect to $t$ are given and, in addition, are continuous in $\bar H_T$. On $\bar H_T$ let

We note that such functions always exist; for example, $u_i \equiv 0$. However, we shall require more than inequality (2). A list of the conditions follows. Not all of them are required to be satisfied simultaneously; the assertions indicate specifically the various sublists of conditions.

1. Assumption (on first derivatives of $\sigma$ and $b$). For all $(t,x) \in H_T$, $\alpha \in A$, $l \in E_d$
where we have omitted for the sake of brevity the arguments $(t,x)$ in the function $u_1(t,x)$ and the operator $L^\alpha(t,x)$, and also the arguments $(\alpha,t,x)$ in the functions $\sigma(\alpha,t,x)$, $b(\alpha,t,x)$. We shall omit the arguments in the other assumptions in a similar way.

2. Assumption (on second derivatives of $\sigma$ and $b$). For all $(t,x) \in H_T$, $\alpha \in A$, $l \in E_d$
6 Controlled Processes with Unbounded Coefficients
3. Assumption (on first derivatives of $c$, $f$, $g$). (a) For all $(t,x) \in \bar H_T$, $\alpha \in A_n$, $l \in E_d$, $u = v_n, w_n$ $(n = 1, 2, \ldots)$
(b) $c^\alpha(t,x)$ does not depend on $x$. There exists a linear subspace $E^f$ of the space $E_d$ such that for each $l' \in E^f$, $l'' \perp E^f$, $\alpha \in A$, $(t,x) \in H_T$ equalities (9) and (10) hold. Furthermore, for each $l' \in E^f$, $l \in E_d$, $\alpha \in A$, $(t,x) \in \bar H_T$

Finally, for all $(t,x) \in H_T$, $\alpha \in A$, $l'' \perp E^f$
4. Assumption (on second derivatives of $c$, $f$, $g$). For all $(t,x) \in \bar H_T$, $\alpha \in A_n$, $l \in E_d$, $u = v_n, w_n$ $(n = 1,2,\ldots)$

$$[f^\alpha_{(l)(l)} - c^\alpha_{(l)(l)}\,u]^- \le (-L^\alpha u_1)^{(1/2)-(\delta/4)}\,(-L^\alpha u_2)^{(1/2)+(\delta/4)},$$
We shall formulate two more assumptions later. However, right now we discuss some techniques for verifying the assumptions listed in various places. First we note that if in a specific case it is possible to choose the functions $u_i$ so that each function satisfies only one of the inequalities (6), (8), or (11)-(14), then the sum of these functions can be used as $u_i$ in all the inequalities. The functions $u_{-1}$, $u_{-2}$ appear only on the right-hand sides of the inequalities (4), (5), and (7). The left-hand sides of the inequalities contain $u_1$ only, with the exception of inequality (7), in the left-hand side of which $u_2$ appears also. The greatest number of conditions is imposed on the function $u_1$, Assumption 1 being the most stringent. Whether inequality (3) is satisfied depends only on the appropriate choice of the function $u_1$. In the remaining inequalities, the right-hand sides contain functions which do not appear in the left-hand sides, so that these inequalities can be satisfied by choosing these functions appropriately. In this connection, it is useful to bear in mind that since one can multiply or divide these inequalities by $|l|^2$, (3) is satisfied for all $l \in E_d$ if and only if (3) is satisfied for all unit vectors $l$. Further, we can easily verify Assumption 1 if the first derivatives of $\sigma$ and $b$ with respect to $x$ are bounded by the same constant for all $\alpha$, $t$, $x$. In fact,
in this situation we take $u_1(t,x) = e^{-Nt}$. Then the left-hand side of (3) does not, obviously, exceed $N_1 e^{-Nt}|l|^2$, where the constant $N_1$ does not depend on $\alpha$, $t$, $x$. The right-hand side of (3) is equal to

Since $c^\alpha \ge 0$, one can, for example, take $\delta = \frac12$, $N = 4N_1$. In this case the function $u_1$ constructed satisfies (3). The constant $T$ does not appear in an explicit manner in the assumptions made, which fact enables us to employ our assumptions for control on an infinite time interval. Note that for $T = \infty$ it is sometimes inconvenient to take a function of the form $e^{-Nt}$ as the function $u_1$. In such cases one can keep in mind that Assumption 1 will be satisfied if, for example, $c^\alpha(t,x)$ is "sufficiently" large compared to the first derivatives of $\sigma$ and $b$ with respect to $x$; more precisely, if
for all $(t,x) \in \bar H_T$, $\alpha \in A$, and for unit $l \in E_d$. Indeed, (15) coincides with (3) for $u_1 \equiv 1$, $|l| = 1$. (Inequality (15) is discussed in more detail in the notes to this chapter.) We have given two cases in which Assumption 1 is satisfied. We note that the first case holds in the scheme investigated in Chapters 3-5. Indeed, if the constant $K_n$ in (1.1) does not depend on $n$ and, in addition, is equal to $K$, the assumption on the differentiability of $\sigma$ and $b$ enables us to rewrite (1.1) as $\|\sigma_{(l)}\| + |b_{(l)}| \le K|l|$. In this connection, we show that if the conditions under which Theorems 4.7.4, 4.7.5, and 4.7.7 on the smoothness of a payoff function and on the Bellman equation were proved are satisfied, then Assumptions 1-4 are satisfied as well. In other words, we wish to show that if $A_1 = A_2 = \cdots = A$, $K_1 = K_2 = \cdots = K$, $m_1 = m_2 = \cdots = m$, there always exist functions $u_i$ and a number $\delta$ which satisfy Assumptions 1-4. Here, as well as in a similar situation in Section 1, our objective is purely methodological. In order to have (3) satisfied, we take $\delta = \frac12$, $u_1 = e^{-N_1 t}$, and furthermore, we choose an appropriate $N_1$. Since
for verifying Assumption 2 we need only choose $u_{-2}$ from the condition

where $N_3$ is a constant. Repeating the arguments given after Lemma 1.2, we easily convince ourselves that for $u_{-2}$ we can take a function of the form $N_3 e^{-N_4 t}(1+|x|^2)^{4m}$.²

² In these arguments one needs to take $8m$ instead of $m$.
We see from Chapter 3 that $|v|, |w| \le N(1+|x|)^m$. Since this bound holds for $u = v, w$, the function $u_2$ for inequality (6) can be sought from the condition

where $N$ is an appropriate constant. It is seen that the function $u_2$ equal to $N_5 e^{-N_6 t}(1+|x|^2)^{4m}$ is applicable for a certain choice of the constants $N_5$, $N_6$. Similarly, to have (13) satisfied, it suffices to solve an inequality of the form

$$L^\alpha u_2 + N(1+|x|)^{8m/(2+\delta)} \le 0.$$

As we saw in Section 1, this can easily be done. Finally, choosing functions of the form $N_7 e^{-N_8 t}(1+|x|^2)^{4m}$ which satisfy inequalities (8) and (14), and next adding up all the expressions thus found for $u_2$, we obtain the function to be taken as $u_2$ in inequalities (6), (8), (13), and (14) at large. Since we had each time dealt with an expression of the form $N_9 e^{-N_{10}t}(1+|x|^2)^{4m}$, the final version of the function $u_2$ thus obtained satisfies the inequality $|u_2| \le N(1+|x|)^{8m}$, using which we can easily find the function $u_{-2}$ satisfying (7). Therefore, Assumptions 1-4 are satisfied.
5. Exercise. Summarizing the above arguments, show that in the case considered one can take

Also, we write down what Assumptions 1-4 become if we take constants for the functions $u_i$ and if, in addition, we consider only Assumption 3a. Here (3) becomes (15). Inequality (4) can obviously be satisfied if there exists a constant $N$ such that

In other words, it suffices to take a constant $N$ for which

$$\|\sigma_{(l)(l)}\| \le N c^\alpha. \tag{16}$$

Considering the remaining inequalities in a similar way, we can see that Assumptions 1-4 can be satisfied if inequalities (15) and (16) are satisfied and, furthermore, if there exists a constant $N$ such that
for all $(t,x) \in \bar H_T$, $\alpha \in A_n$, $l \in E_d$, $u = v_n, w_n$ $(n = 1,2,\ldots)$. In this case we can say that $c^\alpha$ is sufficiently large and that the derivatives of the functions $g(t,x)$, $g(x)$ are bounded. On the face of it, the last inequalities in (17), as well as inequalities (6) and (13), seem to be rather strange: these inequalities contain the functions $v_n$ and $w_n$, which are in general unknown. In this connection, we note that

Hence, if, for example, we could, using (1.5) and Lemma 1.2, estimate the functions $v_n$ and $w_n$, and could also prove that $|v_n| \le \bar u$ and $|w_n| \le \bar u$ for a function $\bar u$, then (6) and (13) will be satisfied if

The last remarks concerning (6), (13), and (17) are unnecessary if $c^\alpha(t,x)$ does not depend on $x$. In such a case $c^\alpha_{(l)} = 0$ and $c^\alpha_{(l)(l)} = 0$; the functions $v_n$ and $w_n$ do not enter (6), (13), and (17); and, finally, these functions need not be estimated. This fact shows how convenient it is to write (6) and (13) as we did above. In the case where $c^\alpha(t,x)$ does not depend on $x$, Assumption 3 is regarded as satisfied if we succeed in satisfying conditions (b). Let us discuss this case. Inequalities (11) and (12) are particular cases of inequalities (6) and (8), since $c_{(l)} = 0$ and also since in (11) and (12) we consider not all $l'' \in E_d$ but only those which are orthogonal to the subspace $E^f$. We can posit two extreme possibilities: $E^f$ contains only the zero vector, or $E^f$ coincides with the entire space $E_d$. If the first possibility is realized, equalities (9) and (10) hold for arbitrary $\sigma$ and $b$ and any vector $l'' \perp E^f$, and (11) and (12) must be satisfied for all $(t,x) \in \bar H_T$, $\alpha \in A$, $l'' \in E_d$. Also, in this case conditions (a) are satisfied. The advantage of considering conditions (b) along with conditions (a) is obvious when $E^f = E_d$. Here only the zero vector is orthogonal to $E^f$; furthermore, inequalities (11) and (12) are automatically satisfied ($u_{(l)} = 0$ for $l = 0$), as is Eq. (9). Therefore, if $c^\alpha$ does not depend on $x$, Assumption 3 is satisfied if, for example, (10) is satisfied for all $l', l \in E_d$; in this case we can take $E^f = E_d$. We note that, as can easily be seen, (10) holds for all $l', l \in E_d$, $\alpha \in A$, $(t,x) \in \bar H_T$ if and only if the functions $\sigma(\alpha,t,x)$ and $b(\alpha,t,x)$ are linear with respect to $x$ for all $\alpha \in A$, $t \in [0,T]$. It is seen that Assumption 2 is then satisfied as well. In order to consider the intermediate possibility $\{0\} \ne E^f \ne E_d$, let us imagine that the space $E^f$ is generated by the first coordinate vectors $e_1, e_2, \ldots, e_{d_0}$, where $1 \le d_0 < d$. Equalities (9) and (10) then have to be
6 Controlled Processes with Unbounded Coefficients
satisfied for the vectors l' with arbitrary first d_0 coordinates and with the coordinates numbered d_0 + 1, ..., d equal to zero. From this condition it readily follows that for all l'' ⊥ E_f, l ∈ E_d, i = 1, ..., d_0; j = 1, ..., d_1

σ^{ij}_{(l'')}(α,t,x) = 0,   σ^{ij}_{(l)(l)}(α,t,x) = 0,
b^i_{(l'')}(α,t,x) = 0,   b^i_{(l)(l)}(α,t,x) = 0.   (18)
The second relations in (18) imply that the first d_0 rows of the matrix σ and the first d_0 coordinates of the vector b depend on the coordinates of x in a linear way. The first relations in (18) show that these rows of the matrix σ and coordinates of the vector b do not depend on x^{d_0+1}, ..., x^d. Therefore, the system

dx_t = σ(α_t, s + t, x_t) dw_t + b(α_t, s + t, x_t) dt

splits into two parts: the "upper" part for the coordinates x^i_t (i = 1, ..., d_0) and the "lower" part for the coordinates x^i_t (i = d_0 + 1, ..., d), the "upper" system being linear with respect to x^1_t, ..., x^{d_0}_t and, in addition, containing no other unknowns. We shall complete this discussion of Assumptions 1-4 by suggesting that the reader do the following exercise.

6. Exercise. Show that Assumptions 1-4 can be satisfied if σ, b, c do not depend on x and, also, for all (t,x) ∈ H̄_T, α ∈ A, l ∈ E_d
[f^α_{(l)(l)}]^- ≤ u_1^{(1/2)-(δ/4)} u_2^{(1/2)+(δ/4)},
[g_{(l)(l)}(t,x)]^- ≤ u_1^{(1/2)-(δ/4)} u_2^{(1/2)+(δ/4)}(t,x),
[g_{(l)(l)}(x)]^- ≤ u_1^{(1/2)-(δ/4)} u_2^{(1/2)+(δ/4)}(t,x).
Note that the last inequalities are automatically satisfied if f^α, g(t,x), g(x) are convex downward with respect to x.
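The splitting into an "upper" and a "lower" system described above can be checked numerically on a toy example. The coefficients below are our own illustration, not the book's: the "upper" coordinate x¹ has drift and diffusion linear in x¹ and free of x², so the same Euler approximation solves it on its own.

```python
import numpy as np

# Toy illustration of the splitting described above (coefficients are ours,
# not the book's): the "upper" coordinate x1 has drift and diffusion linear
# in x1 and independent of x2, so it can be solved on its own.
def simulate_joint(x0, dt=1e-3, n=1000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    dw = rng.normal(0.0, np.sqrt(dt), size=(n, 2))
    path = [x.copy()]
    for k in range(n):
        b = np.array([0.5 * x[0], np.sin(x[0]) + x[1] ** 2])  # drift
        s = np.array([0.2 * x[0], 1.0])                       # diagonal sigma
        x = x + b * dt + s * dw[k]
        path.append(x.copy())
    return np.array(path), dw

def simulate_upper(x1, dw, dt=1e-3):
    # the "upper" linear equation alone, driven by the same noise dw[:, 0]
    path = [x1]
    for k in range(dw.shape[0]):
        x1 = x1 + 0.5 * x1 * dt + 0.2 * x1 * dw[k, 0]
        path.append(x1)
    return np.array(path)

joint, dw = simulate_joint([1.0, 0.0])
upper = simulate_upper(1.0, dw)
print(np.max(np.abs(joint[:, 0] - upper)))  # 0.0: the subsystem is autonomous
```

The "lower" coordinate may depend on x¹ in an arbitrary (here nonlinear) way without disturbing this property.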
7. Assumption (on derivatives with respect to t). (a) The functions f, g, σ, b, c are continuously (with respect to (t,x)) differentiable with respect to t for all (t,x) ∈ H̄_T, n, α ∈ A_n,

for all (t,x) ∈ H̄_T, α ∈ A
2. General Methods for Estimating Derivatives of Payoff Functions
(b) Furthermore, for all (t,x) ∈ H̄_T, α ∈ A_n, u = v_n, w_n (n = 1, 2, ...)
(c) Finally, the derivatives g_{x^i x^j}(t,x), g_{x^i x^j}(x) are continuous in H̄_T and, also, for all x ∈ E_d
Much of what has been said above can easily be repeated for Assumption 7. We note only that in verifying that Assumption 7 is satisfied, one need not attempt to find immediately a function ū which simultaneously satisfies inequalities (20) and (21) (as well as inequalities (4) and (5)). One can find one function for (20) and another function for (21), and then take as ū the sum of the functions thus obtained. The same holds for u_0 from (22)-(24). Moreover, one should keep in mind that L^α g(T,x) in (24) is the quantity at the point (T,x) obtained as a result of applying the operator L^α(t,x) to the function g(t,x), while L^α(T,x)g(x) is to be regarded as the value of the operator L^α(T,x) evaluated on the function g(x).
8. Assumption (summarizing). Assumptions 1, 2, 3b, 4, and 7 are satisfied and, also, for l' ∈ E_d, α ∈ A, (t,x) ∈ H̄_T,
or Assumptions 1, 2, 3a, 4, and 7 are satisfied.

We see from the results obtained in Chapter 4 that derivatives of the functions v and w can be estimated only on the sets on which these derivatives enter the operator F[u]. In this connection, here as well as in Section 4.7, let

Q* = {(t,x) ∈ H_T : sup_{α∈A} (a(α,t,x)λ,λ) > 0 for all λ ≠ 0},

μ = μ(t,x) = inf_{|λ|=1} sup_{α∈A} n_α(t,x)(a(α,t,x)λ,λ).
We also introduce Q_n* and μ_n (n = 1, 2, ...) using similar relations in which we put A_n instead of A.

9. Lemma. μ_n ≤ μ_{n+1}, μ = lim_{n→∞} μ_n, the function μ(t,x) is lower semicontinuous in H_T, the equalities

are satisfied, the set Q* is open, and, finally, the function μ^{-1}(t,x) is locally bounded on Q*.
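The monotone convergence μ_n ↑ μ asserted in the lemma can be illustrated numerically. The diffusion data below are a hypothetical example of ours (rank-one matrices a(α) = v(α)v(α)*, normalization n_α ≡ 1, control sets A_n = [−n, n]):

```python
import numpy as np

# Hypothetical data illustrating Lemma 9: with A_n = [-n, n] increasing and
# a(alpha) = v(alpha) v(alpha)^T for a unit vector v(alpha), the quantities
# mu_n = inf_{|lam|=1} sup_{alpha in A_n} (a(alpha) lam, lam) increase with n.
lams = np.stack([np.cos(np.linspace(0, np.pi, 361)),
                 np.sin(np.linspace(0, np.pi, 361))], axis=1)

def mu_n(n, n_alpha=2001):
    alphas = np.linspace(-n, n, n_alpha)
    V = np.stack([np.cos(alphas), np.sin(alphas)], axis=1)  # v(alpha)
    forms = (lams @ V.T) ** 2        # (a(alpha) lam, lam) = (lam . v(alpha))^2
    return forms.max(axis=1).min()   # inf over lam of sup over alpha

mus = [mu_n(n) for n in (1, 2, 3, 4)]
print(mus)  # nondecreasing (up to grid error); tends to 1 here
```

Each single a(α) is degenerate, yet the supremum over a large enough family is nondegenerate in every direction λ, which is exactly the situation the set Q* singles out.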
PROOF. The inequality μ_n ≤ μ_{n+1} is obvious. Further, due to the boundedness of n_α(t,x)a(α,t,x), which is uniform with respect to (α,t,x), the quadratic forms n_α(t,x)(a(α,t,x)λ,λ) and the functions

are continuous with respect to λ ∈ ∂S_1, uniformly with respect to α, t, x, n. In accord with Dini's theorem it follows from the obvious relation

that in this case the convergence is uniform with respect to λ ∈ ∂S_1 for each t, x. In particular, the lower bounds of the foregoing expressions converge, that is, μ_n(t,x) → μ(t,x) as n → ∞. Since the limit of an increasing sequence of continuous functions is lower semicontinuous and, furthermore, by Lemma 4.7.3 the functions μ_n(t,x) are continuous, μ(t,x) is lower semicontinuous. The equality Q* = ∪ Q_n* follows from the fact that μ_n(t,x) ↑ μ(t,x). The other equality can be proved by repeating the appropriate arguments from the proof of Lemma 4.7.3. Finally, the last assertions follow from the continuity of μ_n(t,x) and from the monotone convergence of μ_n(t,x) to μ(t,x). The lemma is proved.

We formulate now the main results obtained in estimating derivatives of the functions v and w. It should be borne in mind that we shall give some assertions of Theorems 10-12 in a short form; in other words, we assert, for example, in Theorem 11 that for each x ∈ E_d the function v(t,x) is left continuous with respect to t on [0,T], and that the function

increases (does not decrease) with respect to t on [0,T]. We fix x ∈ E_d. The function ψ(t,x), like any increasing function, can have at most a countable set of discontinuity points; all the discontinuities are of the first kind and, in addition, ψ(t−,x) ≤ ψ(t,x) ≤ ψ(t+,x) for each t. Since an integral with respect to r is a continuous function of t, we conclude that for each x ∈ E_d the function v(t,x), as a function of t, can have at most a countable set of discontinuity points, all of the first kind, and, in addition, that v(t,x) ≤ v(t+,x) for each t, that is, the graph of v(t,x) can have upward jumps only. The properties of v(t,x) listed above do not appear in Theorem 11. A similar situation occurs for the corresponding assertions of Theorems 10 and 12. In Theorem 10 the appropriate auxiliary function decreases, and hence v(t,x) can have downward jumps only. Finally, we note that from the inequality ψ(t−,x) ≤ ψ(t,x) follows the inequality v(t−,x) ≤ v(t,x). Furthermore, by Theorem 1.5, the function v(t,x)
is lower semicontinuous. Therefore v(t,x) ≤ v(t−,x). Comparing the last inequality with the previous one, we obtain v(t,x) = v(t−,x). Thus, the assertion about the left continuity of v(t,x) with respect to t in Theorem 11 is a consequence of Theorem 1.5 and of the fact that ψ(t,x) increases with respect to t.

10. Theorem. Suppose that a function v is bounded above in each cylinder
C_{R,T}, and also that Assumptions 1-4 are satisfied. Then:

a. for each l_1, l_2 ∈ E_d inside H_T there exist generalized derivatives v_{(l_1)(l_2)}(t,x)(dt dx) (see Definition 2.1.2), and also for each l ∈ E_d inside H_T

where
b. for each t ∈ [0,T] the function v(t,x) is continuous with respect to x, has a generalized derivative with respect to x (in the sense of Definition 2.1.1), and, in addition, for each R > 0, γ ∈ (0,1], t ∈ [0,T] almost everywhere in S_R
c. inside H_T there exists a derivative (∂/∂t)v(t,x)(dt dx),

(∂/∂t)v(dt dx) ≤ inf_{β∈A} |L^β 1| (|ū| + |grad_x v| + v^+ + 1) dt dx,

L^β v(dt dx) + f^β dt dx ≤ 0

for each β ∈ A; for each l ∈ E_d

μ v_{(l)(l)}(dt dx) ≤ (|ū| + |grad_x v| + v^+ + 1) dt dx + ((∂/∂t)v)^-(dt dx);
d. for each x ∈ E_d the function v(t,x) is right continuous with respect to t on [0,T] and, furthermore, for each R > 0 there is a constant N ≥ 0 such that for all x ∈ S_R the function v(t,x) − Nt decreases with respect to t on [0,T].

Finally, if we replace v by w in the above formulations, assertions (a)-(d) will still hold.
11. Theorem. Let Assumptions 1, 3b, and 7 be satisfied and in this case for all l' ∈ E_f, α ∈ A, (t,x) ∈ H̄_T let equalities (25) be satisfied. Or let Assumptions 1, 3a, and 7 be satisfied. Then:

a. the function v(t,x) is bounded in each cylinder C_{R,T}, for each x ∈ E_d the function v(t,x) is left continuous with respect to t on [0,T], and, finally,

(The meaning of |L^β 1| is explained in the statement of Theorem 4.3.3; L^β u(dt dx) is defined prior to Theorem 4.2.7.)
for each x ∈ E_d the function

v(t,x) + ∫_0^t ū_2(r,x) dr

increases with respect to t on [0,T], where ū_2 = (16δ^{-1}u_1u_2)^{1/2} + u_3;

b. inside H_T there exists a derivative (∂/∂t)v(dt dx) and

(∂/∂t)v(t,x)(dt dx) ≥ −ū_2(t,x) dt dx;
c. for each R ≥ 0

lim_{t↑T} sup_{|x|≤R} |v(t,x) − g(x)| = 0.

Assertions (a)-(c) will remain true if we replace in them v by w and g(x) by g(T,x).

The next theorem holds if only the assumptions given in Section 1 are satisfied. We could have formulated it in Section 1; however, it is more closely related to the discussion in this section.
12. Theorem. Let there exist a continuous function q(t,x) given on H̄_T, and also a number δ_1 > 0, such that for each t ∈ (0,T] the function u^t(s,x) ≡ v(s,x) + ∫_s^t q(r,x) dr, regarded as a function of the variables (s,x), is superharmonic (or excessive) in the region H_T ∩ ((t − δ_1, t) × E_d) (see Definitions 5.3.2, 5.3.3). Then assertions (a), (b), (c) of Theorem 11 hold with ū_2 replaced by q.
13. Theorem. Let the summarizing Assumption 8 be satisfied. Then the functions v_n, v, w_n, and w are uniformly bounded and equicontinuous in each cylinder C_{R,T}. Generalized first derivatives of these functions with respect to (t,x) are uniformly bounded in each cylinder C_{R,T}. Moreover, each of the foregoing functions u satisfies the inequalities

−ū_2 ≤ (∂/∂t)u ≤ inf_{β∈A} |L^β 1| (|ū| + |grad_x u| + u^+ + 1)   (H_T-a.e.)

for any R > 0, γ ∈ (0,1]. Finally, v_n ↑ v and w_n ↑ w as n → ∞ uniformly in C_{R,T} for each R > 0.
14. Theorem. Let the summarizing Assumption 8 be satisfied. Then in the region Q* the functions v(t,x) and w(t,x) have all generalized second derivatives with respect to x, and these derivatives are locally bounded in Q*. Moreover, for each l ∈ E_d
almost everywhere in Q*. Finally, for each n_0 the second derivatives with respect to x of the functions v_n and w_n for n ≥ n_0 are uniformly bounded with respect to x and n ≥ n_0 in each bounded region which together with its closure lies in Q*_{n_0}.
Theorems 10-12 are of an auxiliary nature in relation to Theorems 13 and 14. The former theorems can in turn be derived from the following estimates of derivatives of solutions of stochastic equations. For α ∈ 𝔄, |l| = 1, (s,x) ∈ H̄_T let

Since by the definition of the set 𝔄 it follows from the inclusion α ∈ 𝔄 that α ∈ 𝔄_n for a certain n, due to the results obtained in Section 2.8 the processes y_t^{α,s,x}, z_t^{α,s,x} exist and are continuous on [0,T]. Let

This process is defined for α ∈ 𝔄, (s,x) ∈ H̄_T on the time interval [0, T−s] if Assumption 7a is satisfied.
15. Theorem. Let Assumptions 1 and 2 be satisfied. Then for each (s,x) ∈ H̄_T, τ ∈ 𝔐(T−s), α ∈ 𝔄

Furthermore, for each t_1 ∈ [0, T−s], γ ∈ [0, δ/(2−δ)], on the set {τ ≥ t_1} almost surely

2δ^{-1} u_1(s + t_1, x_{t_1}^{α,s,x}) |y_{t_1}^{α,s,x}|^2 e^{−2(1−δ)φ_{t_1}^{α,s,x}} ≥ u_1(s + τ, x_τ^{α,s,x}) |y_τ^{α,s,x}|^2 e^{−2(1−δ+γ)φ_τ^{α,s,x}}.
16. Theorem. Let Assumptions 1 and 7a be satisfied. Then for each (s,x) ∈ H̄_T, τ ∈ 𝔐(T−s), α ∈ 𝔄

17. Exercise. Making use of Exercise 5 and Theorems 13 and 14, prove Theorem 4.7.4 and, also, that in that theorem
3. The Normed Bellman Equation

As is seen from Section 1.2 (see also Exercise 15 below), a payoff function need not satisfy the Bellman equation even when the initial functions σ, b, c, f, g are as smooth as desired. The objective of the present section is to derive a correct normed Bellman equation for a payoff function. We shall see that such an equation holds in a very broad class of cases.

In addition to the assumptions given in Section 6.1, let us assume the following. We denote by γ^α(t,x) the vector of dimension dd_1 + d + 4 having the coordinates σ^{ij}(α,t,x) (i = 1, ..., d; j = 1, ..., d_1), b^i(α,t,x) (i = 1, ..., d), c^α(t,x), f^α(t,x), g(x), g(t,x). We assume that for each α ∈ A, l ∈ E_d the derivatives γ^α_{(l)}(t,x), γ^α_{(l)(l)}(t,x) as well as the derivative (∂/∂t)γ^α(t,x) exist and are continuous on H̄_T. Assume that for all n, α ∈ A_n, l ∈ E_d, (t,x) ∈ H̄_T
Let

Q_n* = {(t,x) ∈ H_T : sup_{α∈A_n} (a(α,t,x)l,l) > 0 for all l ≠ 0},

Q* = {(t,x) ∈ H_T : sup_{α∈A} (a(α,t,x)l,l) > 0 for all l ≠ 0}.
By Lemma 2.9 the sets Q* and Q_n* are open, Q*_{n+1} ⊇ Q_n*, Q* = ∪_n Q_n*. To state our last assumption, we assume that for each n_0 and for each bounded
region Q' which together with its closure lies in Q*_{n_0} there exists a constant N such that for all n ≥ n_0, i, j = 1, ..., d

|v_{n x^j}| + |v_{n x^i x^j}| + |(∂/∂t)v_n| ≤ N   (a.e. on Q'),   (2)

|w_{n x^j}| + |w_{n x^i x^j}| + |(∂/∂t)w_n| ≤ N   (a.e. on Q'),   (3)

|v_n(t,x)| ≤ N,   |w_n(t,x)| ≤ N,   (t,x) ∈ Q'.   (4)
As was shown in Section 4.7, the functions v_n and w_n have in the region Q_n* generalized first and second derivatives with respect to x as well as a generalized first derivative with respect to t, which appear in the inequalities given above. We also know from Section 4.7 that these derivatives are locally bounded in Q_n*. We require that the local boundedness of the derivatives in Q_n* be uniform with respect to n ≥ n_0. Finally, we note that due to Theorems 2.13 and 2.14 the assumption about the validity of (2)-(4) is satisfied if the summarizing Assumption 8 in Section 2 is satisfied.

It follows from (2)-(4) that the functions v_n and w_n are equicontinuous and uniformly bounded in Q'. This together with the obvious relations v_n → v, w_n → w enables us to conclude that v and w are continuous in Q*. Furthermore, from (2), (3), and the convergence v_n → v, w_n → w it follows that the functions v and w have two generalized derivatives with respect to x and one derivative with respect to t, and, in addition, these derivatives are locally bounded in Q*.

If m_α(t,x) is a nonnegative function given for α ∈ A, t ∈ [0,T), x ∈ E_d, let

G_{m_α}(u_0, u_{ij}, u_i, u, t, x) = sup_{α∈A} m_α(t,x) [ u_0 + Σ_{i,j=1}^d a^{ij}(α,t,x) u_{ij} + Σ_{i=1}^d b^i(α,t,x) u_i − c^α(t,x) u + f^α(t,x) ].   (5)
1. Definition. A nonnegative function m_α(t,x) (α ∈ A, t ∈ [0,T), x ∈ E_d) is said to be a normalizing multiplier if for all u_0, u_{ij}, u_i, u, t ∈ [0,T), x ∈ E_d

The normalizing multiplier m_α(t,x) is called regular if there exists a function N(t,x) < ∞ such that for all α, t, x the inequality m_{α0}(t,x) ≤ N(t,x)m_α(t,x) is satisfied, where
2. Exercise. Prove that m_{α0}(t,x) is a normalizing multiplier and, furthermore, that for each nonnegative function N(t,x) the function N(t,x)m_{α0}(t,x) is a normalizing multiplier.
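The role of a normalizing multiplier can be seen in a small numerical sketch. The coefficients and the specific normalization 1/(1 + |a| + |b| + c + |f|) below are our illustrative assumptions (the display defining m_{α0} is not reproduced above); they show only the general phenomenon: with an unbounded control set, the raw supremum in the Bellman operator diverges, while the normed one stays finite.

```python
import numpy as np

# Illustrative only: with an unbounded drift b(alpha) = alpha (a = 1, c = 0,
# f = 0), the raw supremum in the Bellman operator blows up with the size of
# the control set, while the normed supremum stays finite.
def raw_sup(u0, u11, u1, bound, n=100001):
    al = np.linspace(-bound, bound, n)
    return np.max(u0 + u11 + al * u1)

def normed_sup(u0, u11, u1, bound, n=100001):
    al = np.linspace(-bound, bound, n)
    m = 1.0 / (2.0 + np.abs(al))        # 1/(1 + |a| + |b|) with a = 1
    return np.max(m * (u0 + u11 + al * u1))

for bound in (10.0, 100.0, 1000.0):
    print(raw_sup(0.0, 0.0, 1.0, bound), normed_sup(0.0, 0.0, 1.0, bound))
# the raw sup grows like the bound; the normed sup stays below |u1| = 1
```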
The following theorem is a theorem on the normed Bellman equation.
3. Theorem. Let m_α(t,x) be a normalizing multiplier. Then:

a. G_{m_α}[v] = 0 (a.e. on Q*).   (6)

b. G_{m_α}[w] ≤ 0 (a.e. on Q*), w(t,x) ≥ g(t,x) in the region Q*, and G_{m_α}[w] = 0 almost everywhere in the region Q^0 = {(t,x) ∈ Q*: w(t,x) > g(t,x)}. In short,

(G_{m_α}[w] + w − g)^+ + g − w = 0   (a.e. on Q*).   (7)
4. Definition. We call Eq. (6) the normed Bellman equation. We call Eq. (7) the normed Bellman equation for the optimal stopping problem.
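The way the single equation (7) packs together the three conditions of Theorem 3b can be checked pointwise by elementary arithmetic. The sketch below is ours; it samples numbers representing the two regimes (w = g with G ≤ 0, and w > g with G = 0) and verifies that the residual of (7) vanishes in both.

```python
import numpy as np

# A pointwise check (ours) that (G[w] + w - g)^+ + g - w = 0 encodes the
# three conditions in Theorem 3b: G[w] <= 0, w >= g, and G[w] = 0 where w > g.
def residual(G, w, g):
    return max(G + w - g, 0.0) + g - w

rng = np.random.default_rng(2)
ok = True
for _ in range(1000):
    g = rng.normal()
    if rng.random() < 0.5:
        w, G = g, -rng.random()          # stopped region: w = g, G <= 0
    else:
        w, G = g + rng.random(), 0.0     # continuation region: w > g, G = 0
    ok &= abs(residual(G, w, g)) < 1e-12
print(ok)  # True: the residual vanishes exactly in both regimes
```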
In order to prove Theorem 3 we need four lemmas.

5. Lemma. Assertions (a) and (b) in Theorem 3 hold true if m_α(t,x) = m_{α0}(t,x).
PROOF. First, we prove Theorem 3b. Taking advantage of the continuity of w(t,x) and g(t,x), the reader can easily prove that the first assertion in (b) is equivalent to the second assertion. Hence we shall prove the first assertion only. We note first that w_n ≥ g (see, for example, Theorem 4.7.5) and w ≥ w_n. Therefore w ≥ g. Further, we denote by G^n_{m_α}(u_0,u_{ij},u_i,u,t,x) the right side of (5) with A replaced by A_n. Due to the nonnegativeness and boundedness of m_{α0} (m_{α0} ≤ 1), it can be seen that if for some n ≥ 1, u_0, u_{ij}, u_i, u, t, x

then G^n_{m_{α0}}(u_0,u_{ij},u_i,u,t,x) ≤ 0. If we have equality in (8), then
Then, by Theorem 4.7.5, we have G^n_{m_{α0}}[w_n] ≤ 0 (Q_n*-a.e.) and G^n_{m_{α0}}[w_n] = 0 (Q_n^0-a.e.), where Q_n^0 = {(t,x) ∈ Q_n*: w_n(t,x) > g(t,x)}. Next, we take a bounded region Q' whose closure lies in Q*, and we choose a number n_0 so that the closure of Q' lies in Q*_{n_0}. The existence of such a number follows from the theorem on the extraction of a finite subcovering of the compactum Q̄' from its covering by the expanding regions Q_n*. For n ≥ n_0 we have Q' ⊂ Q_n*, G^{n_0}_{m_{α0}}[w_n] ≤ G^n_{m_{α0}}[w_n], w_{n_0} ≤ w_n, Q^0_{n_0} ⊂ Q^0_n. Moreover, G_{m_{α0}}[w_n] ≥ G^n_{m_{α0}}[w_n], which together with what has been proved above enables us to conclude that for n ≥ n_0

G^{n_0}_{m_{α0}}[w_n] ≤ 0   (Q'-a.e.),   G_{m_{α0}}[w_n] ≥ 0   (Q' ∩ Q^0_{n_0}-a.e.).   (9)
We take now the limit in (9) as n → ∞. We note that the functions m_{α0}(t,x), m_{α0}(t,x)a(α,t,x), m_{α0}(t,x)b(α,t,x), m_{α0}(t,x)c^α(t,x), m_{α0}(t,x)f^α(t,x) are bounded. Further, inequalities (2)-(4) are satisfied by assumption. Applying Theorem 4.5.1, we see that the passage to the limit in (9) is possible and that

Letting n_0 → ∞ and using the fact that G^{n_0}_{m_{α0}} ↑ G_{m_{α0}}, w_{n_0} ↑ w, Q^0 = ∪_n Q^0_n, we find

G_{m_{α0}}[w] ≤ 0   (Q'-a.e.),   G_{m_{α0}}[w] ≥ 0   (Q' ∩ Q^0-a.e.).
Comparing the last inequalities, we have G_{m_{α0}}[w] = 0 (Q' ∩ Q^0-a.e.). Taking advantage of the arbitrariness of Q', we complete the proof of Theorem 3b. Theorem 3a can be proved in a similar way. The lemma is thus proved.

The above arguments used many properties of controlled processes. It turns out that Theorem 3 can be deduced from the lemma just proved on the basis of the fact that if m_α(t,x) is a normalizing multiplier, any solution of the equation (inequality) G_{m_{α0}}(u_0,u_{ij},u_i,u,t,x) = 0 (≤ 0) is also a solution of the equation (inequality) G_{m_α}(u_0,u_{ij},u_i,u,t,x) = 0 (≤ 0). For proving this we need the following.

6. Lemma. Let d_2 be an integer and let two functions be given on A: a function l^α with values in E_{d_2} and a numerical function h^α. We assume that the equality |l^α|^2 + |h^α|^2 = 0 holds for no α ∈ A. For u ∈ E_{d_2} let

F(u) = sup_{α∈A} (l^α u + h^α),   G(u) = sup_{α∈A} n_{α0}(l^α u + h^α),

where n_{α0} = (|l^α|^2 + |h^α|^2)^{-1/2}. Then the set Γ = {u: F(u) ≤ 0} is convex and closed (possibly empty) and, furthermore, the set Γ_0 = {u: G(u) = 0} is the boundary of the former set.
PROOF. It is seen that the inequality F(u) ≤ 0 is equivalent to the inequality G(u) ≤ 0. Hence

Γ = {u: G(u) ≤ 0}.   (10)

Further, the upper bound of a set of linear functions is convex downward. Therefore, the function G(u) (as well as F(u)) is convex downward. (It is possible that F(u) assumes the value +∞ at some points or everywhere.) The inequality |n_{α0}l^α u + n_{α0}h^α| ≤ |u| + 1 implies the finiteness of G(u); finiteness and convexity in turn imply the continuity of G(u). This together with (10) enables us to assert that Γ is a closed convex set and that Γ_0 ⊃ ∂Γ. It remains only to prove that Γ_0 ⊂ ∂Γ. Assume the converse. Then there is a point u_0 ∈ Γ_0 such that u_0 ∉ ∂Γ. Note that, obviously, Γ_0 ⊂ Γ. Therefore u_0 ∈ Γ. Since u_0 does not lie on the boundary of the set Γ, u_0 is an interior point of Γ. We assume without loss of generality that u_0 = 0. Also, let the number r > 0 be so small that U = {u: |u| < r} ⊂ Γ. Then G(u) ≤ 0 in U and G(0) = G(u_0) = 0. We choose a sequence α_i such that n_{α_i 0}h^{α_i} → 0. This can be done since G(0) = 0. The vectors n_{α_i 0}l^{α_i} lie in the unit ball. Hence one can choose from the sequence of vectors n_{α_i 0}l^{α_i} a convergent subsequence. We denote its limit by e. Since |n_{α0}l^α|^2 + |n_{α0}h^α|^2 = 1 for all α ∈ A and since n_{α_i 0}h^{α_i} → 0, |e| = 1. Finally, it follows from the inequalities n_{α_i 0}l^{α_i}u + n_{α_i 0}h^{α_i} ≤ G(u) ≤ 0, which hold for all u ∈ U, that eu ≤ 0 for all u ∈ U. However, taking u = (r/2)e ∈ U, we obtain |e|^2 ≤ 0, which is impossible because of the equality |e|^2 = 1. The contradiction thus obtained proves the lemma.
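For a finite control set the content of Lemma 6 is easy to verify by direct computation. The data below are randomly generated, purely for illustration: the normalization n_{α0} is positive, so it cannot change the sign of the supremum, and F and G describe the same set Γ.

```python
import numpy as np

# Finite-A illustration of Lemma 6 (data hypothetical): F(u) = sup (l_a u + h_a),
# G(u) = sup n_a0 (l_a u + h_a) with n_a0 = (|l_a|^2 + h_a^2)^(-1/2).
# F(u) <= 0 iff G(u) <= 0, so both describe the same convex set Gamma.
rng = np.random.default_rng(1)
L = rng.normal(size=(8, 2))          # l_a, a = 1..8
H = rng.normal(size=8)               # h_a
n0 = 1.0 / np.sqrt((L ** 2).sum(axis=1) + H ** 2)

def F(u):
    return np.max(L @ u + H)

def G(u):
    return np.max(n0 * (L @ u + H))

pts = rng.normal(scale=3.0, size=(2000, 2))
agree = all((F(u) <= 0) == (G(u) <= 0) for u in pts)
print(agree)  # True: the sign of the supremum is unchanged by normalization
```

The point of passing from F to G is that G is always finite and continuous, so the boundary of Γ can be described by the single equation G(u) = 0.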
We fix (t,x) ∈ H̄_T and some function m_α(t,x) > 0, and we use this lemma in order to study the inequality G_{m_α}(u_0,u_{ij},u_i,u,t,x) ≤ 0. It is natural to regard the collection u = (u_0,u_{ij},u_i,u) (i,j = 1, ..., d) as a point of the Euclidean space E_{d_2}, where d_2 = 2 + d + d^2. Let
Then, as can easily be seen,

The function G_{m_α}(u,t,x) is therefore suitable to play the role of F(u). It is seen that if m_α > 0, the function n_{α0} from Lemma 6 is equal to [m_α(t,x)]^{-1}m_{α0}(t,x), which indicates that in the given case the function G(u) from Lemma 6 is equal to G_{m_{α0}}(u,t,x). Using Lemma 6, we immediately arrive at the following.

7. Lemma. Let (t,x) ∈ H̄_T and let m_α(t,x) > 0 for all α ∈ A. Then the set Γ_{m_α}(t,x) = {(u_0,u_{ij},u_i,u): G_{m_α}(u_0,u_{ij},u_i,u,t,x) ≤ 0}, being a set in E_{2+d+d²}, is convex and closed, and, in addition, the set

is the boundary of the former set.

8. Lemma. Let m_α(t,x) be a normalizing multiplier. Then all solutions of the equation (inequality) G_{m_{α0}}(u_0,u_{ij},u_i,u,t,x) = 0 (≤ 0) are also solutions of the equation (inequality) G_{m_α}(u_0,u_{ij},u_i,u,t,x) = 0 (≤ 0). If m_α(t,x) is a regular normalizing multiplier, the converse also holds, that is, the foregoing equations (inequalities) are equivalent.
PROOF. The assertions about the above inequalities easily follow from the fact that m_{α0}(t,x) > 0 and, further, from the fact that if m_α(t,x) is a regular multiplier, then m_α(t,x) > 0.
Let us prove the first assertion of the lemma about the equations. We fix some t, x but do not write t, x among the arguments of the functions. By Lemma 7, in any neighborhood of a point (u'_0,u'_{ij},u'_i,u') such that G_{m_{α0}}(u'_0,u'_{ij},u'_i,u') = 0 there are points (u_0,u_{ij},u_i,u) at which F > 0. It is seen that G_{m_α} ≥ 0 at the same points. By Definition 1 the function G_{m_α} is finite. Furthermore, since the function G_{m_α} is convex, it is also continuous. Comparing the last two assertions, we conclude that G_{m_α}(u'_0,u'_{ij},u'_i,u') ≥ 0. The converse inequality is obvious; therefore G_{m_α}(u'_0,u'_{ij},u'_i,u') = 0. We have thus proved the first assertion.

In order to prove the second assertion of the lemma, we note that if G_{m_α}(u'_0,u'_{ij},u'_i,u') = 0, the expression appearing in (5) under the sign of the upper bound is nonpositive. Therefore, having taken N = N(t,x) from Definition 1, we have

On the other hand, it follows from the equality G_{m_α} = 0 that G_{m_α} ≤ 0 and (since m_α > 0) G_{m_{α0}} ≤ 0. The lemma is proved.

The comparison of Lemma 5 with Lemma 8 immediately proves Theorem 3.

Lemma 8 shows that all the information on payoff functions which can be obtained from the normed Bellman equations with different m_α(t,x) is contained in the equation corresponding to the normalizing multiplier m_{α0}(t,x). Although m_{α0}(t,x) therefore plays a particularly essential role, it is frequently convenient in practice to consider other normalizing multipliers. The second assertion of Lemma 8 shows that the use of regular normalizing multipliers entails no loss of information on payoff functions.

9. Example. Let d = 1, A = (−∞,∞), a(α,t,x) = 1, b(α,t,x) = 2α, c^α = 0, f^α(t,x) = −α² f(t,x), where f(t,x) > 0. In this case
The equation G_{m_{α0}} = 0 seems, at least at first sight, inconvenient. Let m_α(t,x) ≡ 1. Since f > 0, the function 2αu_1 − α²f is bounded above in α and, furthermore, 1 is a normalizing multiplier. Obviously, this multiplier is even regular. The computation of G_1 reduces to finding the vertex of the parabola 2αu_1 − α²f. We have

Therefore, in this case the equation G_1 = 0 (equivalent to the equation G_{m_{α0}} = 0) becomes

u_0 + u_{11} + u_1² f^{-1}(t,x) = 0.
It would be much harder to transform the equation G_{m_{α0}} = 0 into the equation just given without using Lemma 8.
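The vertex computation in Example 9 can be verified numerically; the particular values of u_0, u_11, u_1, f below are our arbitrary test data.

```python
import numpy as np

# Check of Example 9: sup over alpha of (2 alpha u1 - alpha^2 f) is the vertex
# value u1^2 / f, so G_1 = u0 + u11 + u1^2 / f (a = 1, c = 0, m_alpha = 1).
def G1_grid(u0, u11, u1, f, bound=50.0, n=200001):
    al = np.linspace(-bound, bound, n)
    return np.max(u0 + u11 + 2.0 * al * u1 - al ** 2 * f)

u0, u11, u1, f = 0.3, -1.2, 0.7, 2.0
grid_val = G1_grid(u0, u11, u1, f)
closed_form = u0 + u11 + u1 ** 2 / f     # vertex at alpha = u1 / f
print(grid_val, closed_form)             # both approximately -0.655
```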
10. Exercise. Prove that the function m_α(t,x) ≡ 1 is a normalizing multiplier if and only if for any r ≥ 0, t ∈ [0,T), x ∈ E_d

sup_{α∈A} {r[tr a(α,t,x) + |b(α,t,x)| + c^α(t,x)] + f^α(t,x)} < ∞.
Turning again to Lemma 7, let us see how the function G_{m_α} behaves on the boundary of the set Γ. If the point (u'_0,u'_{ij},u'_i,u') ∈ Γ_0 and if, in addition, the function G_{m_α} is finite in some neighborhood of this point, then, as in the proof of Lemma 8, the function G_{m_α} is continuous at the point (u'_0,u'_{ij},u'_i,u'), in any neighborhood of this point there are points at which G_{m_α} > 0, and G_{m_α}(u'_0,u'_{ij},u'_i,u',t,x) = 0. It is also possible that G_{m_α}(u'_0,u'_{ij},u'_i,u',t,x) ≤ 0 and that in any neighborhood of the point (u'_0,u'_{ij},u'_i,u') there are points at which G_{m_α} = +∞.
11. Definition. We say that at the point (u'_0,u'_{ij},u'_i,u') the function G_{m_α}(u_0,u_{ij},u_i,u,t,x) (t, x are fixed) has a zero crossing if:

a. G_{m_α}(u'_0,u'_{ij},u'_i,u',t,x) = 0, the function G_{m_α} is continuous at the point (u'_0,u'_{ij},u'_i,u') and, furthermore, in any neighborhood of the point (u'_0,u'_{ij},u'_i,u') there are points at which G_{m_α} > 0; or
b. G_{m_α}(u'_0,u'_{ij},u'_i,u',t,x) ≤ 0, and in any neighborhood of the point (u'_0,u'_{ij},u'_i,u') there are points at which G_{m_α} = +∞.
Recall that

F(u_0,u_{ij},u_i,u,t,x) = sup_{α∈A} [ u_0 + Σ_{i,j=1}^d a^{ij}(α,t,x) u_{ij} + Σ_{i=1}^d b^i(α,t,x) u_i − c^α(t,x) u + f^α(t,x) ].
It is seen that F(u_0,u_{ij},u_i,u,t,x) = G_1(u_0,u_{ij},u_i,u,t,x). Combining what has been said prior to Definition 11 with the assertion of Lemma 7, we conclude that the function F(u_0,u_{ij},u_i,u,t,x) has a zero crossing at each point of the set Γ_0(t,x). Then, due to Lemma 5, the following theorem holds.

12. Theorem. (a) For almost all (t,x) ∈ Q* the function F(u_0,u_{ij},u_i,u,t,x), regarded as a function of (u_0,u_{ij},u_i,u), has a zero crossing at the point ((∂/∂t)v(t,x), v_{x^i x^j}(t,x), v_{x^i}(t,x), v(t,x)).
(b) F[w] ≤ 0 (Q*-a.e.), w(t,x) ≥ g(t,x) everywhere on Q*, and for almost all (t,x) ∈ Q^0 the function F(u_0,u_{ij},u_i,u,t,x), regarded as a function of (u_0,u_{ij},u_i,u), has a zero crossing at the point ((∂/∂t)w(t,x), w_{x^i x^j}(t,x), w_{x^i}(t,x), w(t,x)).
This theorem enables us to give the Bellman equation a meaning even when the equation cannot be satisfied. Indeed, we write

if F crosses zero at the point (u_0,u_{ij},u_i,u). Then assertion (a) in Theorem 12 implies that

From Lemmas 5 and 7 we also have the following.

13. Theorem. (a) For almost all (t,x) ∈ Q* the point

lies on the boundary Γ_0(t,x) of the set Γ_1(t,x).

(b) For almost all (t,x) ∈ Q^0 the point
lies on the boundary Γ_0(t,x) of the set Γ_1(t,x).

Let us indicate here an example of the application of Theorem 12.

14. Example. d = 1, A = [0,∞), f^α(t,x) = αg(t,x), L^α = (∂/∂t) + (∂²/∂x²) − α. The Bellman equation is here the following:

sup_{α∈[0,∞)} [ (∂/∂t)v + v_{xx} + α(g(t,x) − v) ] = 0.   (11)
This equation may fail to be satisfied. Theorem 12 provides the correct interpretation of Eq. (11). According to Theorem 12 the function

has a zero crossing at the point ((∂/∂t)v, v_{xx}, v_x, v). According to Definition 11 there are two ways for a function to cross zero. The first way implies that
and, furthermore, for any (u_0,u_{11},u_1,u) close to ((∂/∂t)v, v_{xx}, v_x, v) the function F(u_0,u_{11},u_1,u,t,x) is finite. It immediately follows from (12) that g(t,x) ≤ v and (∂/∂t)v + v_{xx} = 0. Since no sufficiently small variation of v in (12) may make the left side of (12) go to infinity, g(t,x) < v. Therefore, the first way for a function to cross zero implies that the relations

g(t,x) < v,   (∂/∂t)v + v_{xx} = 0   (13)

are satisfied. In the other case, i.e., on the set of (t,x) where (13) is not satisfied, the left side of (12) does not exceed zero and, furthermore, some arbitrarily small variations of (∂/∂t)v, v_{xx}, v make the left side of (12) go to infinity, which is possible only if g(t,x) = v. The inequality F[v] ≤ 0 yields (∂/∂t)v + v_{xx} ≤ 0. Therefore, either (13) is satisfied or

g(t,x) = v,   (∂/∂t)v + v_{xx} ≤ 0.   (14)
It is clear that (13) or (14) hold almost everywhere in H_T rather than for all (t,x) ∈ H_T. We also note that by Theorem 12 the function w(t,x) constructed on the basis of g(t,x), f^α(t,x) ≡ 0, L^α = (∂/∂t) + (∂²/∂x²) satisfies either (13) or (14) at almost every point (t,x).

We can obtain (13) and (14) in a somewhat different way if we take the normalizing multiplier m_α(t,x) = 1/(1 + α) and, in addition, let β = α/(1 + α). Then the equation G_{m_α}[v] = 0 can be written as

The expression under the sign of the upper bound is a linear function of β. Using the fact that a linear function on an interval attains its upper bound at an end point of the interval, we immediately obtain
which precisely means that either (13) or (14) is satisfied.

15. Exercise. Show that in Example 14 Eq. (11) cannot be satisfied almost everywhere in H_T if g(T,x) ≡ 0 and, also, if g(t,x) is a function which is bounded and continuous in H̄_T and satisfies the inequality g(t,x) > 0 for (t,x) ∈ H_T.
16. Exercise. Let f^α(t,x) ≥ 0 for all α, t, and x, and let m_α(t,x) be a normalizing multiplier. Prove that there exists a function N(t,x) such that m_α(t,x) ≤ N(t,x)m_{α0}(t,x) for all α, t, x.
17. Exercise. Prove all the theorems given in Section 6.3 in the case where, in the assumption associated with inequalities (2)-(4), inequalities (2) and (3) are replaced respectively by

(a.e. on Q').
4. The Optimal Stopping of a Controlled Process on an Infinite Interval of Time

In this section we investigate the limiting behavior of a payoff function as T → ∞ in the problem of optimal stopping of a controlled process. We assume that the basic inequalities (1.1)-(1.4) are satisfied in each strip H̄_T for all α ∈ A_n with constants K_n, m_n and for all α ∈ A with constants K, m, which depend in general on T. As T varies, so do the sets 𝔄_n and 𝔄 and the functions w_n(t,x) and w(t,x) (see Section 1) associated with the controlled process in H_T. Hence it is natural to attach the index T to 𝔄_n, 𝔄, w_n, w defined in Section 1. Thus, we consider the sets of strategies 𝔄_n^T, 𝔄^T as well as the functions w_n(T,t,x), w(T,t,x). In addition to the assumptions that K_n = K_n(T), m_n = m_n(T), K = K(T), m = m(T), we state and use some other assumptions, which will be given after Exercise 11 in this section.

We denote by 𝔐 the set of all Markov times with respect to {F_t} and put 𝔄 = ∩_T 𝔄^T. It is seen that 𝔄 is the set of all progressively measurable (with respect to {F_t}) functions α_t with values in A for each of which, for each T > 0, there is a number n such that α_t(ω) ∈ A_n for all t ≤ T, ω ∈ Ω. Obviously, 𝔐(T − s) ⊇ 𝔐(T' − s) for T ≥ T'. Therefore,
w(T,s,x) = sup_{α∈𝔄^T} sup_{τ∈𝔐(T−s)} v^{α,τ}(s,x) ≥ sup_{α∈𝔄^T} sup_{τ∈𝔐(T'−s)} v^{α,τ}(s,x).   (1)

If τ ∈ 𝔐(T' − s), the values of α_t for t > T' (even for t > T' − s) are of no consequence for the computation of v^{α,τ}(s,x). This implies that the right side of (1) is equal to

sup_{α∈𝔄^{T'}} sup_{τ∈𝔐(T'−s)} v^{α,τ}(s,x) = w(T',s,x).
Therefore, w(T,s,x) increases with respect to T and has the (possibly infinite) limit

lim_{T→∞} w(T,s,x) = w(s,x).

It is seen that for s ≤ S

w(s,x) = sup_{T≥S} w(T,s,x) = sup_{T≥S} sup_n w_n(T,s,x).
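A toy discrete-time analogue (ours, not the book's) shows the same monotonicity in the horizon: enlarging the horizon can only enlarge the optimal-stopping value, because every stopping rule admissible for a shorter horizon remains admissible.

```python
import numpy as np

# Toy discrete-time analogue (ours) of the monotonicity of w(T, s, x) in T:
# the optimal-stopping value of a reflecting random walk over horizon T can
# only grow with T, since stopping rules for a shorter horizon stay admissible.
def stop_value(T, g, p=0.5):
    v = np.array(g, dtype=float)
    for _ in range(T):
        cont = np.empty_like(v)
        cont[1:-1] = p * v[2:] + (1.0 - p) * v[:-2]   # expected next value
        cont[0], cont[-1] = v[1], v[-2]               # reflection at the ends
        v = np.maximum(g, cont)                       # stop or continue
    return v

g = np.array([0.0, 1.0, 0.0, 2.0, 0.0, 1.0, 0.0])
vals = [stop_value(T, g)[2] for T in range(6)]
print(vals)  # nondecreasing in the horizon T
```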
It is clear that the function w(s,x) is lower semicontinuous with respect to (s,x), being the upper bound of the continuous functions w_n(T,s,x). In particular, w(s,x) is a Borel function. It is seen that for arbitrary α ∈ 𝔄, τ ∈ 𝔐 the expression

M^α_{s,x} { ∫_0^τ e^{−φ_t} f^{α_t}(s + t, x_t) dt + e^{−φ_τ} g(s + τ, x_τ) }   (2)

might be infinite. We denote by |𝔄 × 𝔐|(s,x) the set of pairs (α,τ) ∈ 𝔄 × 𝔐 for which at least one of the following expressions is finite:

It is natural to put v^{α,τ}(s,x) = v^{α,τ}_+(s,x) − v^{α,τ}_−(s,x) for (α,τ) ∈ |𝔄 × 𝔐|(s,x). Such a definition of v^{α,τ} agrees with the former definition since for α ∈ 𝔄^T, τ ∈ 𝔐(T − s)

This also implies that 𝔄 × 𝔐(T − s) ⊂ |𝔄 × 𝔐|(s,x) for any s ≤ T, x ∈ E_d, and that the set |𝔄 × 𝔐|(s,x) is nonempty. It is seen that |𝔄 × 𝔐|(s,x) = 𝔄 × 𝔐 for all s and x if the functions f^α(s,x) and g(s,x) for all values of the arguments are greater than zero or, conversely, always smaller than zero. We investigate next how the function w(s,x) is related to the function
which it is natural to call a payoff function in the problem of optimal stopping of a controlled process on an infinite time interval.

1. Lemma. Let a function u(s,x) be nonnegative and continuous in [0,∞) × E_d, and, in addition, let this function belong to W^{1,2}_{loc}((0,∞) × E_d). Let h^α(s,x) be measurable with respect to s, continuous with respect to (α,x), and nonnegative. Finally, let us assume that L^α u + h^α ≤ 0 ((0,∞) × E_d-a.e.) for each α ∈ A. Then for each α ∈ 𝔄, (s,x) ∈ [0,∞) × E_d, τ ∈ 𝔐

M^α_{s,x} [ ∫_0^τ e^{−φ_t} h^{α_t}(s + t, x_t) dt + e^{−φ_τ} u(s + τ, x_τ) ] ≤ u(s,x).   (3)
PROOF. The assertion of the lemma resembles that of Lemma 5.3.4; therefore we prove it in a similar way. Let Γ = {(t,x): L^α u(t,x) + h^α(t,x) ≤ 0 for all α ∈ A'}, where A' is some countable everywhere dense subset of A. It is seen that meas(((0,∞) × E_d)∖Γ) = 0. Due to the continuity of h^α and of the coefficients of L^α with respect to α, for all (t,x) ∈ Γ, α ∈ A we have L^α u(t,x) + h^α(t,x) ≤ 0. Then, for any α ∈ 𝔄, ω, (t,x) ∈ Γ

L^{α_t}u(t,x) + h^{α_t}(t,x) ≤ 0.

This implies, according to Remark 2.10.6 and Theorem 2.10.2, that inequality (3) is satisfied for each R > 0, T > 0, (s,x) ∈ C_{R,T} if we replace τ by τ ∧ τ_{R,T}, where τ_{R,T} is the time of first exit of (s + t, x_t) from [0,T) × S_R. Applying next Fatou's lemma, letting R → ∞, T → ∞, and, finally, making use of the nonnegativeness of h^α and u, we complete the proof of (3). The lemma is proved.

2. Lemma. Let u_1(s,x) and u_2(s,x) be nonnegative continuous functions on [0,∞) × E_d. We assume that u_2 ∈ W^{1,2}_{loc}((0,∞) × E_d). We also assume that either of the following conditions is satisfied:
a. Lau2I 0 ( ( 0 , ~x) Ed-a.e.)for all a E A, u,(T,x) = 0, lim sup uz(T,x)
T-m xeEd
where an expression o f the form 010 is assumed to be equal to zero; b. Lau2+ u, I 0 ( ( 0 , ~x ) Ed-a.e.)for all a E A. Then for any s 2 0 , x E Ed
lim_{T→∞} sup_{α∈𝔄} sup_{τ∈𝔐} M^α_{s,x} χ_{τ > T−s} u₁(s+τ, x_τ) e^{-φ_τ} = 0    (4)

if (a) is satisfied, and, further, for any α ∈ 𝔄

lim_{t→∞} M^α_{s,x} u₁(s+t, x_t) e^{-φ_t} = 0    (5)

if (b) is satisfied.
PROOF. Let

ε(T) = sup_{T′>T} sup_x u₁(T′,x) / u₂(T′,x).

Under assumption (a), ε(T) → 0 as T → ∞. Using Lemma 1, we find

M^α_{s,x} χ_{τ>T−s} u₁(s+τ, x_τ) e^{-φ_τ} ≤ ε(T) M^α_{s,x} χ_{τ>T−s} u₂(s+τ, x_τ) e^{-φ_τ} ≤ ε(T) u₂(s,x).

We have thereby proved equality (4). By Lemma 1,

M^α_{s,x} ∫_0^∞ e^{-φ_t} u₁(s+t, x_t) dt ≤ u₂(s,x)

if (b) is satisfied. Since the integral with respect to t converges in the last inequality, the integrand tends to zero. Therefore, (5) is satisfied. This completes the proof of the lemma.

3. Remark. Lemma 2b will be satisfied if, for example, u₁ ≤ N₁ c^α for all α for some constant N₁. In this case we may take u₂ = N₁. Lemma 2a will be satisfied if u₁ ∈ W^{1,2}_{loc}((0,∞) × E_d) and L^α u₁ + ε u₁ ≤ 0 ((0,∞) × E_d-a.e.) for all α for some ε > 0, in which case the function u₁(s,x)e^{εs} can be taken for u₂.

It is convenient to use Lemma 2 in verifying the next theorem.

4. Theorem. Let s ≥ 0, x ∈ E_d and, also, for each α ∈ 𝔄 let

lim_{T→∞} M^α_{s,x} e^{-φ_{T−s}} g_±(T, x_{T−s}) = 0.

Then

w(s,x) = sup_{(α,τ) ∈ |𝔄 × 𝔐|(s,x)} v^{α,τ}(s,x).
PROOF. It is seen that

w(s,x) = lim_{T→∞} sup_{α∈𝔄^T} sup_{τ∈𝔐(T−s)} v^{α,τ}(s,x) ≤ sup_{T>0} sup_{(α,τ)∈|𝔄×𝔐|(s,x)} v^{α,τ}(s,x) = sup_{(α,τ)∈|𝔄×𝔐|(s,x)} v^{α,τ}(s,x).

Let us prove the converse. To this end, as can easily be seen, it suffices to show that for all (α,τ) ∈ |𝔄 × 𝔐|(s,x)

lim_{T→∞} v^{α, τ∧(T−s)}(s,x) = v^{α,τ}(s,x).    (6)

By definition, g(s+τ, x_τ) = 0 if τ = ∞. Then, by the theorem on monotone convergence,

lim_{T→∞} v^{α, τ∧(T−s)}_{(+)}(s,x) = v^{α,τ}_{(+)}(s,x).

Similarly,

lim_{T→∞} v^{α, τ∧(T−s)}_{(−)}(s,x) = v^{α,τ}_{(−)}(s,x).

Subtracting the last equalities, we find (6). The theorem is proved.
5. Exercise. Consider the process x_t = −e^{w_t − (1/2)t}, where w_t is a one-dimensional Wiener process. The process x_t is a solution of the stochastic equation

x_t = −1 + ∫_0^t x_r dw_r.

Prove that

sup_{τ∈𝔐} M x_τ = 0,   sup_{τ∈𝔐(T)} M x_τ = −1

for each T > 0. This exercise shows that the assertion of Theorem 4 is false if we impose no restrictions on the controlled process.
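The two suprema in the exercise can be probed numerically. The following sketch (simulation parameters are my own, not from the text) samples x_T = −e^{w_T − T/2} directly: the mean stays at −1 for every fixed horizon, while the typical path tends to 0, which is exactly the gap between stopping on a finite interval and stopping on an infinite one.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_x(T, n_paths):
    """Sample x_T = -exp(w_T - T/2) for a one-dimensional Wiener process w."""
    w_T = rng.normal(0.0, np.sqrt(T), size=n_paths)
    return -np.exp(w_T - T / 2.0)

# x_t is a martingale, so M x_T = -1 for every fixed T; hence the supremum
# over Markov times bounded by T equals -1.
xs = sample_x(T=1.0, n_paths=200_000)
print(xs.mean())           # near -1

# Yet x_T -> 0 almost surely: the median collapses to 0, and by letting
# tau -> infinity one realizes the supremum 0 over all Markov times.
xs_late = sample_x(T=25.0, n_paths=200_000)
print(np.median(xs_late))  # near 0
```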
Theorem 4 answers the question as to when a payoff function in the problem of optimal stopping of a controlled process on an infinite time interval can be obtained as the limit of payoff functions associated with finite time intervals. The presence of the set |𝔄 × 𝔐|(s,x) in the statement of this theorem is rather inconvenient. In this connection we consider the case where |𝔄 × 𝔐|(s,x) = 𝔄 × 𝔐.
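The monotone approximation w(T,s,x) ↑ w(s,x) described here is easy to observe in a discrete caricature. The sketch below (a hypothetical discrete-time stopping problem with a discount factor, chosen only for illustration) computes finite-horizon optimal-stopping values by backward induction and checks that they increase with the horizon and stabilize:

```python
import numpy as np

# A caricature of optimal stopping: symmetric random walk on {-M, ..., M},
# discount beta < 1 (playing the role of e^{-phi}), reward g(x) = max(x, 0).
M, beta = 20, 0.8
xs_grid = np.arange(-M, M + 1)
g = np.maximum(xs_grid, 0).astype(float)

def value(horizon):
    """Finite-horizon optimal-stopping value by backward induction."""
    w = g.copy()                          # at the horizon one must stop
    for _ in range(horizon):
        cont = np.empty_like(w)
        cont[1:-1] = 0.5 * (w[:-2] + w[2:])
        cont[0], cont[-1] = w[1], w[-2]   # reflect at the artificial walls
        w = np.maximum(g, beta * cont)    # stop now, or continue
    return w

w10, w50, w200 = value(10), value(50), value(200)
print(bool(np.all(w50 >= w10)))            # True: values increase with T
print(float(np.max(np.abs(w200 - w50))))   # small: the limit exists
```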
6. Theorem. Let there be nonnegative functions w̄₁, w̄₂ ∈ W^{1,2}_{loc}((0,∞) × E_d), continuous in [0,∞) × E_d, such that w̄₁ ≥ |g| in [0,∞) × E_d and, in addition, for all α ∈ A let

L^α w̄₁ ≤ 0,   L^α w̄₂ + |f^α| ≤ 0   ((0,∞) × E_d-a.e.).

Then |𝔄 × 𝔐|(s,x) = 𝔄 × 𝔐 for each s, x. Furthermore, v^{α,τ}_{(±)}(s,x) ≤ w̄₁(s,x) + w̄₂(s,x) for all α ∈ 𝔄, τ ∈ 𝔐, s ≥ 0, x ∈ E_d; w ≤ w̄₁ + w̄₂, and also the function w is finite.

The assertions of this theorem follow immediately from the inequalities

v^{α,τ}_{(±)}(s,x) ≤ M^α_{s,x} { ∫_0^τ e^{-φ_t} |f^{α_t}(s+t, x_t)| dt + e^{-φ_τ} w̄₁(s+τ, x_τ) } ≤ w̄₁(s,x) + w̄₂(s,x).
Here the second inequality has been obtained by Lemma 1.

7. Exercise. Let p, q > 0, and let w_t be a d-dimensional Wiener process. In this case σ denotes the unit matrix, b = 0, c = f = 0, g(s,x) = |x|^p/(1+s)^q, and

w(T,s,x) = sup_{τ∈𝔐(T−s)} M |x + w_τ|^p / (1+s+τ)^q.

Using Theorem 4, show that

w(s,x) = sup_{τ∈𝔐} M |x + w_τ|^p / (1+s+τ)^q.

Prove that w(s,x) = ∞ for p ≥ 2q and w(s,x) < ∞ for p < 2q for all s, x. (In the first case use the law of the iterated logarithm; in the second case use Theorem 6, taking w̄₂ = 0 and for w̄₁ the function given in the hint, choosing an appropriate N.)
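The dichotomy in the last part of the exercise can be seen already along the deterministic times τ = T, since M|w_T|^p = T^{p/2} M|Z|^p for a one-dimensional Wiener process, with Z standard normal. The sketch below (one-dimensional case, sample parameters of my choosing) evaluates M g(T, w_T) in closed form and exhibits growth for p > 2q and decay for p < 2q:

```python
import math

def abs_moment(p):
    """M|Z|^p for Z ~ N(0,1): 2^(p/2) * Gamma((p+1)/2) / sqrt(pi)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)

def mean_g(p, q, T):
    """M g(T, w_T) at x = 0, s = 0 for g(s, x) = |x|^p / (1+s)^q, d = 1."""
    return T ** (p / 2.0) * abs_moment(p) / (1.0 + T) ** q

# p = 4, q = 1 (p > 2q): the expectations blow up, so w = infinity.
print([mean_g(4, 1, T) for T in (10.0, 100.0, 1000.0)])

# p = 1, q = 1 (p < 2q): the expectations decay; w is finite.
print([mean_g(1, 1, T) for T in (10.0, 100.0, 1000.0)])
```

For the boundary case p = 2q this computation alone is inconclusive; as the text indicates, there the law of the iterated logarithm is what forces w = ∞.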
Theorem 4, together with the equality

w(s,x) = lim_{T→∞} lim_{n→∞} w_n(T,s,x),

as well as the results obtained in Chapter 5, enables us to find strategies and Markov times for which v^{α,τ}(s,x) differs arbitrarily slightly from a payoff function in the problem of optimal stopping of a controlled process on an infinite time interval. In some cases one can indicate an optimal stopping time.

8. Theorem. Let the set A consist of a single point. Assume that there exists a nonnegative function w̄ ∈ W^{1,2}_{loc}((0,∞) × E_d), continuous in [0,∞) × E_d, and such that

L w̄ ≤ 0 ((0,∞) × E_d-a.e.),   lim_{T→∞} sup_x |g(T,x)| / w̄(T,x) = 0.

Also, we assume that for each s ≥ 0, x ∈ E_d either of the following inequalities is satisfied:

M_{s,x} ∫_0^∞ e^{-φ_t} f_+(s+t, x_t) dt < ∞,   M_{s,x} ∫_0^∞ e^{-φ_t} f_−(s+t, x_t) dt < ∞.    (7)

We denote by τ₀ the time of first exit of the process (s+t, x_t^{s,x}) from Q₀ = {(s,x) ∈ [0,∞) × E_d : w(s,x) > g(s,x)}. Then, for each τ ∈ 𝔐 the variable v^τ(s,x) is defined, v^τ(s,x) < ∞ (v^τ(s,x) > −∞) if the first (second) inequality in (7) is satisfied, and

w(s,x) = sup_{τ∈𝔐} v^τ(s,x) = v^{τ₀}(s,x)

for all (s,x).
PROOF. First we note that, since the function w(s,x) − g(s,x) is lower semicontinuous, the set Q₀ is in fact a region. Further, by Lemma 2,

sup_{τ∈𝔐} M_{s,x} χ_{τ>T−s} w̄(s+τ, x_τ) e^{-φ_τ} → 0    (8)

as T → ∞. From this it follows, in particular, that for each τ ∈ 𝔐, for large T,

M_{s,x} e^{-φ_τ} χ_{τ>T−s} |g(s+τ, x_τ)| < ∞.

We omit here and henceforth the index α, which assumes only one value; we assume that the expression 0·∞ is equal to zero.
Furthermore, since |g(s,x)| ≤ K(T)(1+|x|)^{m(T)} for s ≤ T, due to the estimates of moments of solutions of stochastic equations, for any T we have

M_{s,x} e^{-φ_τ} χ_{τ≤T−s} |g(s+τ, x_τ)| < ∞.

Therefore, for any τ,

M_{s,x} e^{-φ_τ} |g(s+τ, x_τ)| < ∞.

It is also seen that, by (7), at least one of the expressions M_{s,x} ∫_0^∞ e^{-φ_t} f_±(s+t, x_t) dt is finite. The last two inequalities immediately imply our assertions concerning v^τ(s,x). Now we can write |𝔄 × 𝔐|(s,x) = 𝔄 × 𝔐. Further, by Theorem 4,

w(s,x) = sup_{τ∈𝔐} v^τ(s,x).

We prove that the time τ₀ is optimal. We denote by τ₀(T) the time of first exit of (s+t, x_t) from Q₀(T) = {(s,x) ∈ H_T : w(T,s,x) > g(s,x)}. By Theorem 3.1.10, for each T on H_T
w(T,s,x) = M_{s,x} { ∫_0^{τ₀(T)} e^{-φ_t} f(s+t, x_t) dt + e^{-φ_{τ₀(T)}} g(s + τ₀(T), x_{τ₀(T)}) }.

Let T → ∞. By virtue of the relation w(T,s,x) ↑ w(s,x) we have Q₀(T) ⊂ Q₀(T′) for T′ > T, Q₀ = ∪_T Q₀(T), τ₀(T) ↑ τ₀. The continuity of x_t^{s,x} with respect to t also implies that on the set {τ₀ < ∞}

lim_{T→∞} x_{τ₀(T)} = x_{τ₀}.

By the monotone convergence theorem,

lim_{T→∞} M_{s,x} ∫_0^{τ₀(T)} e^{-φ_t} f_±(s+t, x_t) dt = M_{s,x} ∫_0^{τ₀} e^{-φ_t} f_±(s+t, x_t) dt.

Applying here the estimates of solutions of stochastic equations as well as the dominated convergence theorem, we conclude that for each T′
As T′ → ∞ the last expression tends to zero by virtue of (8). By the dominated convergence theorem, it follows from (9) that

lim_{T′→∞} M_{s,x} e^{-φ_{τ₀}} χ_{τ₀ ≤ T′−s} g(s+τ₀, x_{τ₀}) = M_{s,x} e^{-φ_{τ₀}} g(s+τ₀, x_{τ₀}).

Thus, letting T′ in (10) go to infinity, we obtain w(s,x) = v^{τ₀}(s,x), which proves the theorem.

However, the condition associated with (7) cannot be removed.

9. Example. Let c = 0, f(s,x) = (d/ds)(s sin s), g(s,x) = 0. Then, as can easily be seen, w(s,x) = ∞, τ₀ = ∞, and, in addition, v^{τ₀}(s,x) is not defined.

10. Example. Let c = 0, f = 0, g(s,x) = −1/(1+s)². In this case we can take 1/(1+s) for w̄(s,x). Obviously, w(s,x) = 0, τ₀ = ∞, and, in addition, by Theorem 8, v^{τ₀}(s,x) = w(s,x) = 0.
11. Exercise. Let us take p < 2q in Exercise 7. Show that in this case one can take the function w̄₁ given in the hint to Exercise 7 for w̄ in Theorem 8. Using the fact that the process (1/√c) w_{ct} is a Wiener process for any constant c > 0 (the self-similarity of the Wiener process), and also applying the fact that the problem is spherically symmetric, deduce that Q₀ together with each point (s₀,x₀) contains the part of the paraboloid

{(s,x) : |x|²/(1+s) = |x₀|²/(1+s₀)}

which lies in [0,∞) × E_d. Using the almost obvious inequality

w(s,x) ≥ (1+s₀)^q (1+s)^{-q} w(s₀,x)   for s ≥ s₀,

prove that if (s₀,x₀) ∈ Q₀ (i.e., w(s₀,x₀) > g(s₀,x₀)), then (s,x₀) ∈ Q₀ for all s ≥ s₀. Combining this result with the preceding one, we arrive at the assertion that for some constant c₀ > 0

Q₀ ⊃ {(s,x) : s ≥ 0, |x| < c₀ √(1+s)}.

Prove that w(s,0) > 0, and hence Q₀ ≠ ∅.
We proceed now to derive the normed Bellman equation. In addition to the assumptions formulated at the start of this section, we shall impose the following conditions. Let y^α(s,x) be a vector of dimension d₁d + d + 3 having coordinates σ^{ij}(α,s,x) (i = 1, …, d, j = 1, …, d₁), b^i(α,s,x) (i = 1, …, d), c^α(s,x), f^α(s,x), g(s,x). Assume that the vector y^α(s,x) is once continuously differentiable with respect to s and twice continuously differentiable with respect to x on [0,∞) × E_d for each α ∈ A, and that for all n = 1, 2, …, α ∈ A_n, λ ∈ E_d, T > 0, (s,x) ∈ H_T

|y^α(s,x)| + |∂y^α(s,x)/∂s| + |y^α_{(λ)}(s,x)| + |y^α_{(λ)(λ)}(s,x)| ≤ K_n(T)(1 + |x|)^{m_n(T)} (1 + |λ|)².    (11)

Let

Q_n^*(T) = {(s,x) ∈ H_T : sup_{α∈A_n} (a(α,s,x)λ, λ) > 0 for all λ ≠ 0},

Q* = {(s,x) ∈ (0,∞) × E_d : sup_{α∈A} (a(α,s,x)λ, λ) > 0 for all λ ≠ 0}.

We assume that for each n₀, T₀ > 0 and for each bounded region Q′ which together with its closure lies in Q_{n₀}^*(T₀), there exists a constant N such that for all n ≥ n₀, T > T₀, i, j = 1, …, d, almost everywhere in Q′

|∂w_n(T,s,x)/∂s| + |∂²w_n(T,s,x)/∂x^i ∂x^j| ≤ N.    (12)

Thus, we assume that the assumptions made in Section 3 are satisfied in each strip H_T. In this connection we note that we can obtain the required estimates of derivatives in various cases if we apply the results obtained in Section 2. Finally, we introduce the concept of a normalizing multiplier in the same way as we did in Section 3, letting T = ∞ in Definition 3.1.
12. Theorem. The function w(s,x) in the region Q* is continuous, has a generalized first derivative with respect to s and two generalized derivatives with respect to x; all these derivatives are locally bounded in Q*. For any normalizing multiplier m_α(s,x) we have G_{m_α}[w] ≤ 0 (Q*-a.e.), w(s,x) ≥ g(s,x) in the region Q*, and G_{m_α}[w] = 0 (Q₀ = {(s,x) ∈ Q* : w(s,x) > g(s,x)}-a.e.).

The proof of this theorem follows the proof of Theorem 3.3b; hence we restrict ourselves to some hints only. One can easily derive the assertion of the theorem for an arbitrary normalizing multiplier from the assertion for m_α(s,x) = m̄_α(s,x) with the aid of Lemma 3.8. According to Theorem 3.3 this theorem holds if we replace w by w(T) = w(T,s,x) and, furthermore, if we replace Q* by Q*(T) = ∪_n Q_n^*(T). By Theorem 4.5.1, it is possible to take the limit in equalities of the type G_{m̄_α}[w(T)] = 0 and in inequalities of the type G_{m̄_α}[w(T)] ≤ 0 as T → ∞.
13. Remark. As in the previous section, assertions of the type of Theorems 3.12b and 3.13b hold here. The translation of the preceding arguments requires no changes.

We consider an important particular case of the problem of optimal stopping of a controlled process on an infinite time interval, in which the functions σ(α,s,x), b(α,s,x), c^α(s,x), f^α(s,x), g(s,x) are time homogeneous, i.e., they do not depend on s. In this case, for any α ∈ 𝔄 and for bounded τ ∈ 𝔐 the function v^{α,τ}(s,x) does not depend on s. In order to convince oneself that this is the case, it suffices to write (2) in an explicit form and to note that x_t^{α,s,x} does not depend on s. It follows from the equality

w(T,s,x) = sup_{α∈𝔄} sup_{τ∈𝔐(T−s)} v^{α,τ}(s,x)

that w(T,s,x) depends only on x and T−s. This, in turn, implies that w(s,x) = lim_{T→∞} w(T,s,x) does not depend on s. It is clear that if m_α(s,x) = m_α(x), then G_{m_α}(u₀,u_{ij},u_i,u,s,x) does not depend on s. Here we put

G_{m_α}(u_{ij},u_i,u,x) = G_{m_α}(0,u_{ij},u_i,u,s,x),   G_{m_α}[u](x) = G_{m_α}(u_{x^i x^j}(x), u_{x^i}(x), u(x), x).

It is natural that we now omit the argument s in those functions which do not depend on s in the case considered.

14. Theorem. The function w(x) in the region

D* = {x ∈ E_d : sup_{α∈A} (a(α,x)λ,λ) > 0 for all λ ≠ 0}

is continuous and, furthermore, has two generalized derivatives with respect to x, which are locally bounded in D*. In addition, if the nonnegative function m_α(x) is such that for all x, u_{ij}, u_i, u

sup_α m_α(x) < ∞,   G_{m_α}(u_{ij},u_i,u,x) < ∞,    (13)

then G_{m_α}[w] ≤ 0 (D*-a.e.), w(x) ≥ g(x) in D*, and G_{m_α}[w] = 0 (D₀ = {x ∈ D* : w(x) > g(x)}-a.e.).
This theorem follows from Theorem 12 since, obviously, Q* = (0,∞) × D*, Q₀ = (0,∞) × D₀, ∂w(x)/∂s = 0 and, by virtue of the inequality

G_{m_α}(u₀,u_{ij},u_i,u,x) ≤ |u₀| sup_α m_α(x) + G_{m_α}(u_{ij},u_i,u,x) < ∞,

the function m_α(s,x) = m_α(x) is a normalizing multiplier.
15. Remark. In the stationary case one can write K_n and m_n instead of K_n(T) and m_n(T) in (11), since the vector y does not depend on s. In verifying (12) using the results obtained in Section 2, it is natural to seek the functions u_i depending only on x. The same is applicable to the functions w̄₁ and w̄₂ from Theorem 6.

16. Exercise. Returning to Exercises 7 and 11, we assume that p < 2q. Show that we can change the function g(s,x) inside the region {(s,x) : s ≥ 0, |x| < (c₀/2)√(1+s)} in such a way (mainly, by smoothing it out) that the payoff function does not change and, furthermore, Assumption 2.8 can be satisfied for u₁ = u₂ = u₃ = w̄₁ (w̄₁ given in the hint to Exercise 7), u₋₁ = u₋₂ = 0 for each T. Derive from the above that w(s,x) has one generalized derivative with respect to s and two generalized derivatives with respect to x, these derivatives being locally bounded in (0,∞) × E_d, and that

∂w/∂s + ½ Δw + h = 0    (14)

((0,∞) × E_d-a.e.), where h = 0 in Q₀ and h = −(∂g/∂s) − ½ Δg outside Q₀. Regarding (14) as an equation with respect to w and noting that w ≤ N(1+s)^{-ε} for some ε > 0, prove that

w(0,x) = M ∫_0^∞ h(t, x + w_t) dt.

Putting here x = x₀ = (c₀, 0, …, 0), and using the fact that w(0,x₀) = g(0,x₀) and also that the distribution of w_t is known, write an equation for c₀. Prove that this equation has a unique solution with respect to c₀.
5. Control on an Infinite Interval of Time

Let the functions σ(α,t,x), b(α,t,x), c^α(t,x), f^α(t,x) be given for all α ∈ A, t ≥ 0, x ∈ E_d. Also, in each strip H_T let the functions given satisfy Assumptions (1.1)–(1.3), with constants K_T and m_T which depend, generally speaking, on T. As in the previous section, we introduce here the sets of strategies 𝔄_n^T, 𝔄^T = ∪_n 𝔄_n^T, 𝔄 = ∩_T 𝔄^T. Let

v_n(T,s,x) = sup_{α∈𝔄_n^T} M^α_{s,x} ∫_0^{T−s} e^{-φ_t} f^{α_t}(s+t, x_t) dt,   v(T,s,x) = sup_n v_n(T,s,x).
The objective of this section consists in investigating the limit behavior of v(T,s,x) as T → ∞. We denote by |𝔄|(s,x) the set of all α ∈ 𝔄 for which at least one of the following expressions is finite:

v^α_{(±)}(s,x) = M^α_{s,x} ∫_0^∞ e^{-φ_t} f^{α_t}_±(s+t, x_t) dt.

For α ∈ |𝔄|(s,x) let v^α(s,x) = v^α_{(+)}(s,x) − v^α_{(−)}(s,x). Throughout this section we assume that there exists ᾱ ∈ A and, also, that u(s,x) and g₁(s,x) are nonnegative continuous functions given on [0,∞) × E_d which belong to W^{1,2}_{loc}((0,∞) × E_d) and are such that

lim_{T→∞} sup_{x∈E_d} g₁(T,x) / u(T,x) = 0,

where a relation of the form 0/0 is assumed to be equal to zero, and, in addition, for all α ∈ A

L^α u ≤ 0 ((0,∞) × E_d-a.e.).

This assumption was discussed in Remark 4.3. Furthermore, after Theorem 4 we add some conditions to those listed above.

1. Theorem. For all s ≥ 0, x ∈ E_d we have

v(s,x) = lim_{T→∞} v(T,s,x),   v(s,x) ≥ −g₁(s,x),   v(s,x) = sup_{α∈|𝔄|(s,x)} v^α(s,x).    (1)
PROOF. By Lemma 4.2, for any s ≥ 0, x ∈ E_d,

lim_{T→∞} sup_{α∈𝔄} M^α_{s,x} g₁(T, x_{T−s}) e^{-φ_{T−s}} = 0.    (2)

Let

v̄(T,s,x) = sup_{α∈𝔄^T} M^α_{s,x} { ∫_0^{T−s} e^{-φ_t} f^{α_t}(s+t, x_t) dt − e^{-φ_{T−s}} g₁(T, x_{T−s}) }.    (3)

Due to (2),

lim_{T→∞} |v(T,s,x) − v̄(T,s,x)| = 0.    (4)

Further, let T′ > T, and let α ∈ 𝔄^{T′} be such that α_t = ᾱ for t ∈ [T−s, T′−s]. By Theorem 2.9.7, almost surely the conditional expectation, given F_{T−s}, of the part of the expression in (3) corresponding to the interval [T−s, T′−s] equals e^{-φ_{T−s}} times the analogous expression for the process started afresh at (T, x_{T−s}) under the strategy ᾱ. By Lemma 4.1, the last expression is greater than −g₁(T, x_{T−s}). Therefore, for strategies α of the given type

M^α_{s,x} { ∫_0^{T′−s} e^{-φ_t} f^{α_t}(s+t, x_t) dt − e^{-φ_{T′−s}} g₁(T′, x_{T′−s}) } ≥ M^α_{s,x} { ∫_0^{T−s} e^{-φ_t} f^{α_t}(s+t, x_t) dt − e^{-φ_{T−s}} g₁(T, x_{T−s}) }.

Computing here the upper bounds, we obtain, obviously, v̄(T,s,x) ≤ v̄(T′,s,x). Therefore, the function v̄(T,s,x) increases with respect to T. From this and (4) it follows that the limits

lim_{T→∞} v̄(T,s,x),   lim_{T→∞} v(T,s,x)

exist and are equal. Furthermore, these limits are greater than v̄(s,s,x) = −g₁(s,x).

Let us prove (1). On one hand, by the monotone convergence theorem, for α ∈ |𝔄|(s,x)

v^α(s,x) = v^α_{(+)}(s,x) − v^α_{(−)}(s,x) = lim_{T→∞} M^α_{s,x} ∫_0^{T−s} e^{-φ_t} f^{α_t}(s+t, x_t) dt ≤ lim_{T→∞} v(T,s,x) = v(s,x).

Thus, the right side of (1) does not exceed v(s,x). On the other hand, let s ≥ 0, x ∈ E_d, T > s, α ∈ 𝔄^T. Also, we define α′ ∈ 𝔄 using the formula α′_t = α_t for t < T−s, α′_t = ᾱ for t ≥ T−s. For T′ > T, as above, the expression

M^{α′}_{s,x} { ∫_0^{T′−s} e^{-φ_t} f^{α′_t}(s+t, x_t) dt − e^{-φ_{T′−s}} g₁(T′, x_{T′−s}) }

is bounded from below by the analogous expression with T in place of T′ and from above by a finite quantity involving M^α_{s,x} e^{-φ_{T−s}} g₁(T, x_{T−s}), where, by virtue of (2), for a sufficiently large T the right side is finite and, therefore, the primary expression is bounded uniformly with respect to T′. Hence, if T is sufficiently large, α′ ∈ |𝔄|(s,x). From (2) and (5) we have that, up to a quantity which is arbitrarily small for large T, v^{α′}(s,x) is not less than the expression under the upper bound in (3).
Therefore, for sufficiently large T

sup_{α∈|𝔄|(s,x)} v^α(s,x) ≥ v̄(T,s,x).

As T → ∞ we obtain that v(s,x) does not exceed the right side of (1), thus proving the theorem.

2. Remark. The above proof can be used for finding ε-optimal strategies on an infinite time interval if v(s,x) < ∞. Namely, we take first a T so large that the limiting expression in (2) becomes smaller than ε/3 and that |v(T,s,x) − v(s,x)| < ε/3. Next, we choose α ∈ 𝔄^T such that the expression under the upper bound in (3) computed for this α differs from v̄(T,s,x) by less than ε/3. Further, as we did in the proof given, we construct α′ on the basis of α. Then, by (6), v^{α′}(s,x) ≥ v(s,x) − ε.
3. Remark. We can write (1) as follows:

v(s,x) = lim_{T→∞} lim_{n→∞} v_n(T,s,x) = sup_{α∈|𝔄|(s,x)} v^α(s,x).

It turns out that we can interchange the limit with respect to n and the limit with respect to T. In fact, let us denote by v̄_n(T,s,x) the right side of (3), in which 𝔄^T is replaced by 𝔄_n^T. Here, as well as in our theorem, we prove that v̄_n(T,s,x) increases with respect to T if A_n ∋ ᾱ. Let this condition be satisfied for n ≥ n*. Then from (2) we have

v(s,x) = lim_{T→∞} lim_{n→∞} v̄_n(T,s,x) = sup_{T≥s} sup_{n≥n*} v̄_n(T,s,x) = sup_{n≥n*} sup_{T≥s} v̄_n(T,s,x) = sup_{n≥n*} lim_{T→∞} v̄_n(T,s,x) = sup_{n≥n*} lim_{T→∞} v_n(T,s,x) = lim_{n→∞} lim_{T→∞} v_n(T,s,x).
Remark 2 contains the inequality v(s,x) < ∞. A sufficient condition for v(s,x) to be finite can easily be found from Lemma 4.1.

4. Theorem. Let there exist a nonnegative function v̄ ∈ W^{1,2}_{loc}((0,∞) × E_d) which is continuous in [0,∞) × E_d and such that for all α ∈ A

L^α v̄ + f^α_+ ≤ 0 (a.e.).

Then −g₁(s,x) ≤ v(s,x) ≤ v̄(s,x) and |𝔄|(s,x) = 𝔄 for all s ≥ 0, x ∈ E_d.

We derive now the normed Bellman equation. We shall assume in what follows that the assumptions made in the preceding section after Exercise 4.11 are satisfied if we replace in those assumptions g(s,x) by 0 and, in addition, if we replace in (4.12) the function w_n(T,s,x) by v_n(T,s,x). Here, as well as in Section 4, the following holds true.

5. Theorem. The function v(s,x) in the region Q* is continuous, has a generalized first derivative with respect to s and two generalized derivatives with respect to x, and, in addition, all these derivatives are locally bounded in Q*. For each normalizing multiplier m_α(s,x),

G_{m_α}[v] = 0 (Q*-a.e.).
We consider now in more detail the case where the functions σ, b, c, f do not depend on s. It is seen that here the function v(s,x) does not depend on s: v(s,x) = v(x). Here, as well as in the preceding section, we introduce G_{m_α}(u_{ij},u_i,u,x), G_{m_α}[u](x) and a region D*. From Theorem 5 we immediately have the following.

6. Theorem. The function v(x) in the region D* is continuous and has two generalized derivatives with respect to x. These derivatives are locally bounded in D*. If the nonnegative function m_α(x) is such that for all x, u_{ij}, u_i, u

sup_α m_α(x) < ∞,   G_{m_α}(u_{ij},u_i,u,x) < ∞,    (7)

then G_{m_α}[v] = 0 (D*-a.e.).
The first condition in (7) (and also in (4.13)) is superfluous in some cases.

7. Theorem. For all n ≥ 1, x ∈ D*, let

inf_{α∈A_n} [ tr a(α,x) + |b(α,x)| + c^α(x) + |f^α(x)| ] > 0.    (8)

Then G_{m_α}[v] = 0 (D*-a.e.) for any nonnegative function m_α(x) such that the inequality G_{m_α}(u_{ij},u_i,u,x) < ∞ is satisfied for all x, u_{ij}, u_i, u.
PROOF. Let

v_n(x) = lim_{T→∞} v_n(T,s,x).

As follows from Theorem 1, for all n for which ᾱ ∈ A_n this limit exists. Let

G_n^{m}(u_{ij},u_i,u,x) = sup_{α∈A_n} m_α(x) [ Σ_{i,j=1}^d a^{ij}(α,x) u_{ij} + Σ_{i=1}^d b^i(α,x) u_i − c^α(x) u + f^α(x) ].

We note that by (8)

sup_{α∈A_n} m̄_α(x) < ∞.

Let n be such that ᾱ ∈ A_n and let temporarily A = A_n. Then, by Theorem 6 we have that G_n^{m̄}[v_n] = 0 (D_n^*-a.e.), where

D_n^* = {x ∈ E_d : sup_{α∈A_n} (a(α,x)λ,λ) > 0 for all λ ≠ 0}.

We take a bounded region D′ with D̄′ ⊂ D*. Also, we choose a large number n₀ such that D̄′ ⊂ D_{n₀}^*, ᾱ ∈ A_{n₀}. For n ≥ n₀ we hence have

G_n^{m̄}[v_n] = 0 (D′-a.e.).    (9)

Let here n → ∞. By hypothesis, the ordinary derivatives of the functions v_n(T,s,x), as well as the functions themselves, are bounded in (0,1) × D′ uniformly for n ≥ n₀, T ≥ 2. This implies that for n ≥ n₀ the function v_n(x) has two generalized derivatives which are uniformly bounded in D′. According to Remark 3 and Theorem 4.5.1, we obtain from (9) as n → ∞ that G^{m̄}[v] = 0 (D′-a.e.). Since D′ is an arbitrary subregion of D* and, furthermore, G_n^{m̄} → G^{m̄} as n → ∞,

G^{m̄}[v] = 0 (D*-a.e.).

Next, repeating almost word-for-word the proof of Lemma 3.8, it is easy to see that all solutions of the equation G^{m̄}(u_{ij},u_i,u,x) = 0 are solutions of the equation G_{m_α}(u_{ij},u_i,u,x) = 0 if the function G_{m_α} is finite. Therefore, in this case G_{m_α}[v] = 0 (D*-a.e.), thus proving the theorem.

Sometimes the application of Theorem 7 yields more extensive information about a payoff function than the application of Theorem 6 does.
8. Example. Let A = (0,1], A_n = [1/n, 1], and let the function f(x) ≥ 0. It is not hard to visualize a situation in which

G_{m_α}[u](x) = sup_{α∈A} m_α(x) [ α Δu(x) − α u(x) + α f(x) ].

If the function m_α(x) is bounded with respect to α for each x and if, in addition, m_α(x) > 0, the relation G_{m_α}[v](x) = 0 holds if and only if Δv(x) − v(x) + f(x) ≤ 0. In particular, this inequality is equivalent to the Bellman equation (m_α ≡ 1). At the same time, if we take m_α(x) = 1/α, it is easily seen that G_{m_α} is a finite function and also (by Theorem 7, if the appropriate conditions are satisfied)

0 = sup_{α∈A} (Δv − v + f) = Δv − v + f.
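The effect of the two normalizing multipliers in this example can be checked numerically: with X standing for the value of Δv − v + f at a fixed point, the multiplier m_α = 1 turns the normed expression into sup_α αX, which vanishes exactly when X ≤ 0, while m_α = 1/α yields X itself. A small sketch (grid and test values are arbitrary):

```python
import numpy as np

alphas = np.linspace(1e-4, 1.0, 10_000)    # a grid over A = (0, 1]

def G(m, X):
    """sup over alpha of m(alpha) * (alpha * X), the normed operator value."""
    return float(np.max(m(alphas) * alphas * X))

for X in (-2.0, -0.5, 0.0, 0.7, 3.0):
    bellman = G(lambda a: np.ones_like(a), X)   # m = 1: equals max(X, 0)
    normed = G(lambda a: 1.0 / a, X)            # m = 1/alpha: equals X
    print(X, round(bellman, 4), round(normed, 4))
```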
9. Exercise. Let A = (0,1], A_n = [1/n, 1]. Furthermore, let f(x) be a smooth function with compact support, and let

G_{m_α}[u](x) = sup_{α∈A} m_α(x) [ α Δu(x) − α u(x) + α f(x) ].

Prove that Δv − v + f ≤ 0 (a.e.), v ≥ 0, and Δv − v + f = 0 almost everywhere in the set {x : v(x) > 0}. Prove also that if f ≤ 0, then v = 0 and, in addition, that if we take the normalizing multiplier α^{-1}, we obtain the false relation Δv − v + f = 0. Explain why Theorem 7 is inapplicable in this case. (Hint: see the assumptions made at the start of this section.)
10. Remark. If condition (8) is satisfied, it is possible to prove the theorem on zero crossing or the theorem that (v_{x^i x^j}, v_{x^i}, v) belongs to a boundary of some set, which are similar to Theorems 3.12 and 3.13.
Notes

Section 1. The idea of Exercise 6 is similar to that of an example due to Dynkin.

Section 2. In [36] and [37] it is required that c be sufficiently large compared to the first and second derivatives of σ, b, c, f. It is thus assumed that (15) and (17) are satisfied. In [22], an example illustrates that if T = ∞ and, in addition, inequality (15) is violated, the second derivative of the payoff function can be unbounded. It is known about a diffusion process (A consists of a single point; see Freidlin [19]) that if "killing" is insignificant, the "payoff function" for T = ∞ need not possess smoothness. It is also known that as c, as well as the smoothness of the initial functions, increases, the smoothness of the "payoff function" increases (see Freidlin [19]). It is interesting to note that this increase of smoothness need not occur in controlled processes. For instance, let (w_t, F_t) be a one-dimensional Wiener process, A = [0,ν], T = ∞, with data for which, for λ > 0, it is not hard to show that

v(x) = (1/λ) cos x,   x ∈ [0, z],

where z is the solution on (0, π) of the corresponding equation. Hence v″(z−) = −(1/λ) cos z ≠ 0 = v″(z+) and, further, for each λ the second derivative of v is discontinuous. Concerning the increase of smoothness of a payoff function, see also the remarks made prior to Theorem 1.4.15.

Sections 3, 4, 5. In these sections the results obtained in [37] have been developed further. Theorem 4.8 can be obtained by the methods used by Shiryayev in [69]. There are other well-known methods for finding a stopping boundary, different from the method described in Exercise 4.16; see [44, 56, 68].
Appendix 1
Some Properties of Stochastic Integrals
We shall mention some facts from the theory of stochastic integrals, omitting the proofs. The latter can be found in Doob [9], Dynkin [11], Shiryayev [51], and Gikhman and Skorokhod [23, 24]. Let (Ω, F) be a measurable space, and let {F_t, t ≥ 0} be an increasing family of σ-algebras (a flow of σ-algebras) which satisfies the condition F_t ⊂ F for all t ≥ 0. A process ξ_t(ω) given for t ≥ 0, ω ∈ Ω with values in E_d is said to be progressively measurable (with respect to {F_t}) if for each s ≥ 0 the function ξ_t(ω), considered for t ∈ [0,s], ω ∈ Ω, is measurable with respect to the direct product of the σ-algebra of Borel subsets of the interval [0,s] and F_s. It is a known fact that a continuous process ξ_t which is measurable with respect to F_t for each t is progressively measurable. A nonnegative function τ given on Ω is said to be a Markov time (with respect to {F_t}) if for any s ≥ 0 the set {ω : τ(ω) > s} ∈ F_s. It is also a known fact that the times of first exit of continuous progressively measurable processes ξ_t from open sets are Markov times. Let a probability measure P be given on (Ω, F). A continuous process w_t = (w_t^1, …, w_t^{d₁}) defined for t ≥ 0, ω ∈ Ω is referred to as a d₁-dimensional Wiener process if w_0 = 0, the increments of w_t on nonoverlapping intervals are independent, and, finally, the distribution of w_t − w_s (t > s) is normal with parameters 0, (t−s)I, where I denotes the unit matrix of dimension d₁ × d₁. If, furthermore, for any t ≥ 0 the variable w_t is F_t-measurable, the increment w_{t+h} − w_t for h ≥ 0 does not depend on F_t, and the σ-algebras F_t are complete, we say that the pair (w_t, F_t) is a d₁-dimensional Wiener process, or that w_t is a d₁-dimensional Wiener process with respect to {F_t}. It should be mentioned here that any Wiener process w_t is a Wiener process with respect to the completion of its own σ-algebras F_t^w = σ{w_s : s ≤ t}.
Let (w_t, F_t) be a d₁-dimensional Wiener process, and let σ_t be a random matrix of dimension d × d₁ which is progressively measurable with respect to {F_t} and such that for each t ≥ 0

M ∫_0^t |σ_s|² ds < ∞.    (1)

Then the stochastic integral ∫_0^t σ_s dw_s is defined, and is a continuous progressively measurable process (with respect to {F_t}) satisfying the condition

M | ∫_0^t σ_s dw_s |² = M ∫_0^t |σ_s|² ds.    (2)

The stochastic integral ∫_0^t σ_s dw_s can be constructed in the following way. Let the process σ_t be a step process, i.e., let there exist numbers 0 = t₀ < t₁ < t₂ < ⋯ < t_n = ∞ such that σ_t = σ_{t_i} for t ∈ [t_i, t_{i+1}), i = 0, …, n−1. Then ∫_0^t σ_s dw_s can be defined as the corresponding sum: if t ∈ [t_i, t_{i+1}), then

∫_0^t σ_s dw_s = Σ_{j=0}^{i−1} σ_{t_j}(w_{t_{j+1}} − w_{t_j}) + σ_{t_i}(w_t − w_{t_i}).    (3)

In this case equality (2) can immediately be verified. In the general case one proves that there exists a sequence of progressively measurable step processes σ_t(n) such that for each t > 0

M ∫_0^t |σ_s(n) − σ_s|² ds → 0.

Hence, by the Cauchy criterion and equality (2), the sequence of stochastic integrals ∫_0^t σ_s(n) dw_s is Cauchy in L₂ (in the mean square sense) and, hence, convergent in L₂. We denote the limit by ∫_0^t σ_s dw_s, where the limit for each t is determined, of course, only to within an equivalence. One also proves that for each t, ω one can choose values of ∫_0^t σ_s dw_s such that the process thus obtained is continuous with respect to t. Hence by the integral ∫_0^t σ_s dw_s we usually mean a continuous process. It turns out that for any Markov time τ, for all t ≥ 0, almost surely

∫_0^{t∧τ} σ_s dw_s = ∫_0^t χ_{s≤τ} σ_s dw_s.

A stochastic integral is in general not the limit of integral sums similar to the Riemann–Stieltjes sums. It is, however, known that for each process σ_t there exists a sequence of integers i(n) which tends to infinity as n → ∞ and is such that for all T > 0

sup_{t≤T} | ∫_0^t σ_s dw_s − ∫_0^t σ_{κ_{i(n)}(s)} dw_s | → 0    (4)

almost surely, where κ_i(t) = 2^{-i}[t2^i], [a] denoting the largest integer ≤ a. We note that the second integral in (4) is an integral of a step function and, therefore, is of the form (3).
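Equality (2) and the sums (3) can be illustrated by simulation. The sketch below (grid sizes are arbitrary) approximates ∫₀¹ w_s dw_s by step-process sums on a fine partition and compares the second moment of the result with M ∫₀¹ w_s² ds, which equals 1/2:

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, T = 20_000, 500, 1.0
dt = T / n_steps

dw = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
w_left = np.cumsum(dw, axis=1) - dw          # w at the left endpoints, w_0 = 0
ito = np.sum(w_left * dw, axis=1)            # sums of the form (3)

lhs = float(np.mean(ito ** 2))               # M |int_0^1 w dw|^2
rhs = float(np.mean(np.sum(w_left ** 2, axis=1)) * dt)  # M int_0^1 w^2 ds
print(lhs, rhs)                              # both near 1/2
```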
A stochastic integral can be defined not only for functions σ_t satisfying condition (1). The characteristic feature of a stochastic integral under condition (1) is that

M { ∫_0^τ σ_s dw_s | F_t } = ∫_0^t σ_s dw_s    (5)

almost surely on the set {ω : τ(ω) ≥ t} for any bounded Markov time τ and for any t.

Let τ be a Markov time, let σ_t be a process with values in the set of matrices of dimension d × d₁, and let b_t be a d-dimensional process. We assume that χ_{s≤τ}σ_s and χ_{s≤τ}b_s are progressively measurable. Furthermore, we assume that the integrals ∫_0^{t∧τ} σ_s dw_s, ∫_0^{t∧τ} b_s ds are defined. If the process ξ_t satisfies the relation

ξ_t = x + ∫_0^{t∧τ} σ_s dw_s + ∫_0^{t∧τ} b_s ds,    (6)

it is convenient to write dξ_t = σ_t dw_t + b_t dt, t ≤ τ, ξ₀ = x. The formal expression σ_t dw_t + b_t dt is called the stochastic differential of ξ_t. The notation dξ_t = σ_t dw_t + b_t dt, t ≤ τ, ξ₀ = x is only a short-hand representation of Eq. (6), and is more convenient than (6) when we encounter the need to write u(ξ_t), where u is some function, using the stochastic integrals. It is sufficient to find the stochastic differential du(ξ_t). It turns out that for any twice continuously differentiable function u given on E_d, we have the following Itô formula:

du(ξ_t) = Σ_{i=1}^d u_{x^i}(ξ_t) dξ_t^i + ½ Σ_{i,j=1}^d u_{x^i x^j}(ξ_t) dξ_t^i dξ_t^j,    (7)

where the first term is understood as grad u(ξ_t) dξ_t. We can write the first term in (7) in the short form grad u(ξ_t)σ_t dw_t + grad u(ξ_t)b_t dt. In order to compute dξ_t^i dξ_t^j in the second term in (7), one is to apply the usual rules for removing parentheses as well as the following rules for the product of stochastic differentials: (dw_t^i)² = dt, dw_t^i dw_t^j = 0 for i ≠ j, dw_t^i dt = 0, (dt)² = 0. In short,

dξ_t^i dξ_t^j = Σ_{k=1}^{d₁} σ_t^{ik} σ_t^{jk} dt = (σ_t σ_t^*)^{ij} dt.

Putting together in (7) the terms with dt and the terms with dw_t, respectively, we can rewrite Itô's formula as follows:

du(ξ_t) = grad u(ξ_t) σ_t dw_t + L^{σ_t, b_t} u(ξ_t) dt.    (8)
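For u(x) = x², d = d₁ = 1, σ_t ≡ 1, b_t ≡ 0 (so ξ_t = w_t), formula (8) reduces to d(w_t²) = 2w_t dw_t + dt. The integrated identity can be checked path by path on a fine grid (step size is my own choice):

```python
import numpy as np

rng = np.random.default_rng(2)
n_steps, T = 100_000, 1.0
dt = T / n_steps

dw = rng.normal(0.0, np.sqrt(dt), size=n_steps)
w = np.concatenate(([0.0], np.cumsum(dw)))   # a Wiener path on [0, 1]

# Integrated Ito formula for u(x) = x^2:
#   w_T^2 = int_0^T 2 w_s dw_s + T,
# the "+ T" being the contribution of the second-order term (dw)^2 = dt.
stochastic_integral = float(np.sum(2.0 * w[:-1] * dw))
print(w[-1] ** 2, stochastic_integral + T)   # agree up to O(sqrt(dt))
```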
Equation (7) is an analog of Taylor's formula with two terms, and can as well be called a formula for the stochastic differential of a composite function; its integral version (compare with (8)),

u(ξ_t) = u(ξ₀) + ∫_0^t grad u(ξ_s) σ_s dw_s + ∫_0^t L^{σ_s, b_s} u(ξ_s) ds,

is known as the change of variables formula. Sometimes the need arises to find the stochastic differential of u(t, ξ_t). In such a case we can add to the process ξ_t another coordinate, setting ξ_t^{d+1} = t = ∫_0^t ds. We thus reduce the problem to the case where u depends explicitly only on a process which is (d+1)-dimensional. Then,

du(t, ξ_t) = grad_x u(t, ξ_t) σ_t dw_t + ( ∂/∂t + L^{σ_t, b_t} ) u(t, ξ_t) dt.    (9)

Finally, if c_t is a nonnegative progressively measurable process such that ∫_0^t c_s ds < ∞ and φ_t = ∫_0^t c_s ds, then

d[ e^{-φ_t} u(t, ξ_t) ] = u(t, ξ_t) de^{-φ_t} + e^{-φ_t} du(t, ξ_t) + (de^{-φ_t})(du(t, ξ_t)) = e^{-φ_t} [ grad_x u(t, ξ_t) σ_t dw_t + ( ∂/∂t + L^{σ_t, b_t} − c_t ) u(t, ξ_t) dt ].    (10)

Equations (9) and (10) hold for t ≤ τ with probability one for every function u(t,x) which, together with its two derivatives with respect to x and a first derivative with respect to t, is continuous with respect to (t,x) in the closure of some region in the space of the variables (t,x) containing, with probability one, the trajectories (t, ξ_t) before the time τ. It is often necessary to use a corollary obtained upon integrating (10), taking the mathematical expectation, and applying property (5):

M { e^{-φ_τ} u(τ, ξ_τ) − e^{-φ_t} u(t, ξ_t) | F_t } = M { ∫_t^τ e^{-φ_s} ( ∂/∂s + L^{σ_s, b_s} − c_s ) u(s, ξ_s) ds | F_t }

(a.s. on {τ ≥ t}), if τ is bounded, the mathematical expectations exist and, finally,

M ∫_0^t | σ_s^* grad_x u(s, ξ_s) |² e^{-2φ_s} ds < ∞.
Furthermore, let o,, b, depend on t, o as well as the point x E E d : o, = a,(x), b, = b,(x). For each x E Ed let the processes o,(x), b,(x) be defined for all t 2 0, o E D and, furthermore, let these processes be progressively measurable. We assume that there exist two constant K , and K , such that for all possible values of the arguments
Appendix 1 Some Properties of Stochastic Integrals
We fix x ∈ E_d. The assertion of Ito's theorem implies that the stochastic integral equation

x_t = x + ∫_0^t σ_s(x_s) dw_s + ∫_0^t b_s(x_s) ds

for the function x_t = x_t(ω) (t ≥ 0, ω ∈ Ω) has a unique (to within an equivalence) continuous (with respect to t), progressively measurable (with respect to {F_t}) solution.
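The successive-approximations argument behind Ito's theorem can be illustrated numerically: on a fixed discretized Brownian path, the Picard iterates x_t^{(m+1)} = x + ∫_0^t σ_s(x_s^{(m)}) dw_s + ∫_0^t b_s(x_s^{(m)}) ds converge uniformly when σ and b are Lipschitz in x. A minimal sketch (the coefficients σ, b and all constants below are hypothetical illustrations, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Lipschitz coefficients (illustrative only):
sigma = lambda x: 0.05 * np.cos(x)  # Lipschitz constant 0.05
b = lambda x: -0.2 * x              # Lipschitz constant 0.2

T, n = 1.0, 500
h = T / n
dw = rng.normal(0.0, np.sqrt(h), n)  # increments of one fixed Brownian path
x_init = 1.0

def picard_step(xs):
    """One successive approximation: Euler sums of
    x + int_0^t sigma(x_old) dw + int_0^t b(x_old) ds on the fixed path."""
    out = np.empty(n + 1)
    out[0] = x_init
    for k in range(n):
        out[k + 1] = out[k] + sigma(xs[k]) * dw[k] + b(xs[k]) * h
    return out

xs = np.full(n + 1, x_init)  # zeroth approximation: the constant path
gaps = []
for _ in range(12):
    nxt = picard_step(xs)
    gaps.append(float(np.max(np.abs(nxt - xs))))  # sup-distance between iterates
    xs = nxt

print(gaps[0], gaps[-1])  # the sup-distance shrinks to (numerical) zero
```

The sup-distance between successive iterates decays factorially fast, mirroring the classical contraction estimate in the existence proof.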
Appendix 2
Some Properties of Submartingales
Let (Ω,F,P) be a probability space, and let {F_t, t ≥ 0} be an increasing family of σ-algebras satisfying the condition F_t ⊂ F. The real-valued process ξ_t(ω) given for t ∈ [0,T] is said to be a submartingale on the time interval [0,T] (with respect to the family {F_t}) if the random variables ξ_t are F_t-measurable,

M|ξ_t| < ∞,   ξ_s ≤ M{ξ_t | F_s}  (a.s.)  (1)

for all t, s ∈ [0,T], s ≤ t. The process ξ_t is referred to as a martingale if the second inequality in (1) is an equality. The process ξ_t is referred to as a supermartingale if the process (−ξ_t) is a submartingale. The properties of submartingales, martingales, and supermartingales are well known (see, for instance, [9, 51, 54]). We give some properties without proving them.

1. If ξ_t is a submartingale and, in addition, φ(x) is an increasing convex downward function of the real variable x, then φ(ξ_t) is a submartingale.
2. If ξ_t is a martingale and, in addition, φ(x) is a convex downward function, then φ(ξ_t) is a submartingale.
3. If ξ_t is a separable submartingale and p > 1, then

M sup_{t≤T} (ξ_t^+)^p ≤ (p/(p − 1))^p sup_{t≤T} M(ξ_t^+)^p.
It is useful to bear in mind that (p/(p − 1))^p ≤ 4 for p ≥ 2.
4. Let ξ_t be a right continuous supermartingale and let M|ξ_t| < ∞ for t ≤ T. If τ is a Markov time, then ξ_{t∧τ} is a supermartingale. Further, if τ_1 and τ_2 are Markov times such that τ_1(ω) ≤ τ_2(ω) ≤ T for all ω, then Mξ_{τ_1} ≥ Mξ_{τ_2}. We do not think that the following fact is well known; hence we shall prove it here.
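Property 3 (Doob's inequality) can be verified exactly for the simplest discrete-time martingale, the symmetric ±1 random walk, by enumerating all paths. The sketch below (the walk length is an arbitrary illustrative choice) does this for p = 2, where (p/(p − 1))^p = 4:

```python
from itertools import product

# S_k = sum of the first k steps of a +-1 walk is a martingale; check
#   M sup_k |S_k|^p  <=  (p/(p-1))^p  M |S_n|^p   for p = 2
# by exhaustive enumeration of all 2^n equally likely paths.
n, p = 12, 2
lhs = rhs = 0.0
count = 0
for steps in product((-1, 1), repeat=n):
    s, running_max = 0, 0
    for d in steps:
        s += d
        running_max = max(running_max, abs(s))
    lhs += running_max ** p   # contributes to M sup_k |S_k|^2
    rhs += abs(s) ** p        # contributes to M |S_n|^2
    count += 1
lhs /= count
rhs /= count
bound = (p / (p - 1)) ** p * rhs  # Doob bound: 4 * M|S_n|^2

print(lhs, bound)  # lhs lies between M|S_n|^2 = n and the bound 4n
```

Here M|S_n|² = n exactly, so the bound is 4n, and the enumerated value of M sup_k |S_k|² indeed lies below it.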
Lemma. Let κ_t be a supermartingale with continuous trajectories, and let M sup_{t≤T} |κ_t| < ∞. Further, let Φ_t(ω) be a nonnegative continuous progressively measurable process which increases in t for all ω, or decreases in t for all ω, and is bounded on [0,T] × Ω. Then:

a. the process p_t = κ_tΦ_t − ∫_0^t κ_s dΦ_s is a supermartingale. Also, for any Markov time τ not exceeding T,

Mp_τ ≥ sup_{t,ω} Φ_t M(κ_T − κ_0) + Mκ_0Φ_0;

b. p_t − κ_tΦ_0 is a supermartingale if Φ_t increases in t; p_t − κ_tΦ_0 is a submartingale if Φ_t decreases in t. (Note that in the applications frequently κ_t = v_t + ∫_0^t f_s ds and Φ_t = exp(−∫_0^t r_s ds).)
Here, as follows from Fubini's theorem,

p_t = ∫_0^t (f_s + r_s v_s) exp(−∫_0^s r_u du) ds + v_t exp(−∫_0^t r_s ds).
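The Fubini computation can be checked numerically for deterministic smooth f, r, v (the concrete functions below are arbitrary illustrations, not taken from the text): one evaluates p_t = κ_tΦ_t − ∫_0^t κ_s dΦ_s directly as a Stieltjes sum, with κ_t = v_t + ∫_0^t f_s ds and Φ_t = exp(−∫_0^t r_s ds), and compares it with the right-hand side above.

```python
import math

# Arbitrary smooth choices (hypothetical, for illustration only):
f = lambda s: math.exp(-s)
r = lambda s: 0.3 + 0.2 * math.sin(s)
v = lambda s: math.sin(s)

T, n = 1.5, 20_000
h = T / n
t = [i * h for i in range(n + 1)]

# Cumulative integrals of f and r (trapezoid rule)
F, R = [0.0], [0.0]
for i in range(n):
    F.append(F[-1] + 0.5 * h * (f(t[i]) + f(t[i + 1])))
    R.append(R[-1] + 0.5 * h * (r(t[i]) + r(t[i + 1])))

kappa = [v(t[i]) + F[i] for i in range(n + 1)]          # kappa_t = v_t + int f
Phi = [math.exp(-R[i]) for i in range(n + 1)]           # Phi_t = exp(-int r)

# Left side: kappa_T Phi_T - int_0^T kappa_s dPhi_s (Stieltjes sum)
stieltjes = sum(0.5 * (kappa[i] + kappa[i + 1]) * (Phi[i + 1] - Phi[i])
                for i in range(n))
left = kappa[-1] * Phi[-1] - stieltjes

# Right side: int_0^T (f_s + r_s v_s) Phi_s ds + v_T Phi_T
g = [(f(t[i]) + r(t[i]) * v(t[i])) * Phi[i] for i in range(n + 1)]
right = sum(0.5 * h * (g[i] + g[i + 1]) for i in range(n)) + v(T) * Phi[-1]

print(abs(left - right))  # agreement up to discretization error
```

Both evaluations agree to within the trapezoid-rule error, confirming the identity for this deterministic example.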
PROOF. Writing κ_t as κ_0 + (κ_t − κ_0), it is easily seen that it suffices to consider the case κ_0 = 0. We assume that κ_0 = 0 and that κ_t, Φ_t are defined for t ≥ T according to the formulas κ_t = κ_T, Φ_t = Φ_T. First we prove that

p_t = lim_{r↓0} (1/r) ∫_0^t (κ_{s+r} − κ_s) Φ_s ds,  (2)

M sup_t |p_t| ≤ 4 sup_{t,ω} Φ_t M sup_{t≤T} |κ_t| < ∞.  (3)

We note that

(1/r) ∫_0^t (κ_{s+r} − κ_s) Φ_s ds = (1/r) ∫_t^{t+r} κ_s Φ_{s−r} ds − (1/r) ∫_0^r κ_s Φ_s ds + (1/r) ∫_r^t κ_s (Φ_{s−r} − Φ_s) ds;

the first two terms on the right do not exceed sup_{t,ω} Φ_t sup_{t≤T} |κ_t| in absolute value, and, since Φ_t is monotone and bounded, the third does not exceed 2 sup_{t,ω} Φ_t sup_{t≤T} |κ_t|, so that

|(1/r) ∫_0^t (κ_{s+r} − κ_s) Φ_s ds| ≤ 4 sup_{t,ω} Φ_t sup_{t≤T} |κ_t| < ∞.  (4)

Furthermore, due to the equality κ_0 = 0 as well as the continuity of κ_t and Φ_t,

lim_{r↓0} [(1/r) ∫_t^{t+r} κ_s Φ_{s−r} ds − (1/r) ∫_0^r κ_s Φ_s ds] = κ_tΦ_t.  (5)
Next, the absolute value of the difference between the limit expressions in (2) and (5) is equal to

|(1/r) ∫_r^t κ_s (Φ_{s−r} − Φ_s) ds|,

which converges to |∫_0^t κ_s dΦ_s| as r ↓ 0; here one estimates the increments of κ_s in terms of w(r), where w(r) denotes the modulus of continuity of the function κ_t on [0,T]. This estimate, the fact that w(r) tends to zero as r → 0, the inequality w(r) ≤ 2 sup_{t≤T} |κ_t|, and relations (4) and (5) together yield (2) and (3).

Let t_1 < t. Using (2), let us find M{p_t | F_{t_1}}. Inequality (3) enables us here to take the limit after taking the mathematical expectation. The last inequality in (4) enables us to interchange the mathematical expectation and integration operations. In addition, we make use of the fact that Φ_s ≥ 0 and also, for s ≥ t_1,

M{(κ_{s+r} − κ_s) Φ_s | F_{t_1}} = M{Φ_s M{κ_{s+r} − κ_s | F_s} | F_{t_1}} ≤ 0.

Then we have M{p_t | F_{t_1}} ≤ p_{t_1}.
Therefore, p_t is a supermartingale. Further,

Mp_τ ≥ Mp_T = lim_{r↓0} (1/r) ∫_0^T M[Φ_s M{κ_{s+r} − κ_s | F_s}] ds.

Here the conditional expectation is negative. Hence

Mp_τ ≥ sup_{t,ω} Φ_t M lim_{r↓0} (1/r) ∫_0^T (κ_{s+r} − κ_s) ds = sup_{t,ω} Φ_t Mκ_T.
Assertion (a) is thus proved. We can prove (b) in a similar way, using the formula

p_t − κ_tΦ_0 = lim_{r↓0} (1/r) ∫_0^t (κ_{s+r} − κ_s)(Φ_s − Φ_0) ds.
Bibliography
[1] V. I. Arkin, V. A. Kolemayev, and A. N. Shiryayev, On finding optimal controls, Trudy MIAN 71 (1964), 21-25 (in Russian).
[2] K. J. Åström, Introduction to Stochastic Control Theory (Academic Press, New York, 1970).
[3] I. Ya. Bakel'man, Geometric Methods for Solving Elliptic Equations (Nauka, Moscow, 1965) (in Russian).
[4] R. S. Bellman, Dynamic Programming (Princeton University Press, Princeton, N.J., 1957).
[5] R. S. Bellman and R. E. Kalaba, Quasilinearization and Nonlinear Boundary-Value Problems (Elsevier, New York, 1965).
[6] V. E. Beneš, Full bang to reduce predicted miss is optimal, SIAM J. Control and Optimization 14 (1976), 62-84.
[7] R. S. Bucy and P. D. Joseph, Filtering for Stochastic Processes with Applications to Guidance (Wiley, New York, 1968).
[8] C. Derman, Finite State Markovian Decision Processes (Academic Press, New York, 1970).
[9] J. L. Doob, Stochastic Processes (Wiley, New York, 1953).
[10] N. Dunford and J. T. Schwartz, Linear Operators: General Theory (Wiley-Interscience, New York, 1958).
[11] E. B. Dynkin, Markov Processes (Academic Press, New York, 1965) (English translation).
[12] E. B. Dynkin and A. A. Yushkevich, Controlled Markov Processes and Applications (Nauka, Moscow, 1975) (in Russian). [English translation: Springer-Verlag, New York, 1979.]
[13] W. H. Fleming, Some Markovian optimization problems, J. Math. and Mech. 12(1) (1963), 131-140.
[14] W. H. Fleming, The Cauchy problem for degenerate parabolic equations, J. Math. and Mech. 13 (1964), 987-1008.
[15] W. H. Fleming, Duality and a priori estimates in Markovian optimization problems, J. Math. Anal. Appl. 16 (1966), 254-279.
[16] W. H. Fleming, Stochastic control for small noise intensities, SIAM J. Control 9(3) (1971), 437-515.
[17] W. H. Fleming and R. W. Rishel, Deterministic and Stochastic Optimal Control (Springer-Verlag, Berlin, New York, 1975).
[18] M. I. Freidlin, A note on the generalized solution of the Dirichlet problem, Theory Prob. Appl. 10(1) (1965), 161-164 (English translation).
[19] M. I. Freidlin, On the smoothness of solutions of degenerate elliptic equations, Math. USSR Izvestija 2(6) (1968), 1337-1357 (English translation).
[20] E. B. Frid, On the semiregularity of boundary points for nonlinear equations, Math. USSR Sbornik 23(4) (1974), 483-507 (English translation).
[21] A. Friedman, Stochastic Differential Equations, I, II (Academic Press, New York, 1975).
[22] I. L. Genis and N. V. Krylov, An example of a one-dimensional controlled process, Theory Prob. Appl. 21(1) (1976), 148-152 (English translation).
[23] I. I. Gikhman and A. V. Skorokhod, Stochastic Differential Equations (Naukova Dumka, Kiev, 1968) (in Russian). [English translation: Springer-Verlag, Berlin, Heidelberg, New York, 1972.]
[24] I. I. Gikhman and A. V. Skorokhod, The Theory of Random Processes, Vol. 3 (Nauka, Moscow, 1975) (in Russian). [English translation: Springer-Verlag, Berlin, Heidelberg, New York, 1979.]
[25] I. V. Girsanov, Minimax problems in the theory of diffusion processes, Soviet Mathematics, Doklady Akademii Nauk SSSR 136(4) (1961), 118-121 (English translation).
[26] R. A. Howard, Dynamic Programming and Markov Processes (Wiley, New York, 1960).
[27] N. N. Krassovsky and A. I. Subbotin, Closed-Loop Differential Games (Nauka, Moscow, 1974) (in Russian). [English translation: Springer-Verlag, Berlin, Heidelberg, New York, forthcoming.]
[28] N. V. Krylov, On Ito's stochastic integral equations, Theory Prob. Appl. 14(2) (1969), 330-336 (English translation). [See also: Letter to the Editor: Correction to "On Ito's stochastic integral equations", Theory Prob. Appl. 17(1) (1972), 20-33 (English translation).]
[29] N. V. Krylov, On a problem with two free boundaries for an elliptic equation and optimal stopping of a Markov process, Soviet Mathematics, Doklady Akademii Nauk SSSR 194(6) (1970), 1370-1372 (English translation).
[30] N. V. Krylov, Bounded inhomogeneous nonlinear elliptic and parabolic equations in the plane, Math. USSR Sbornik 11(1) (1970), 89-99 (English translation).
[31] N. V. Krylov, Control of Markov processes and W-spaces, Math. USSR Izvestija 5(1) (1971), 233-265 (English translation).
[32] N. V. Krylov, An inequality in the theory of stochastic integrals, Theory Prob. Appl. 16(3) (1971), 438-448 (English translation).
[33] N. V. Krylov, On the theory of nonlinear degenerating elliptic equations, Soviet Mathematics, Doklady Akademii Nauk SSSR 201(6) (1971), 1820-1823 (English translation).
[34] N. V. Krylov, On uniqueness of the solution of Bellman's equation, Math. USSR Izvestija 5(6) (1971), 1387-1398 (English translation).
[35] N. V. Krylov, Lectures on the Theory of Elliptic Differential Equations (Izd. Moskovsk. Gos. Univers., Moscow, 1972) (in Russian).
[36] N. V. Krylov, Control of a solution of a stochastic integral equation, Theory Prob. Appl. 17(1) (1972), 114-131 (English translation).
[37] N. V. Krylov, On control of the solution of a stochastic integral equation with degeneration, Math. USSR Izvestija 6(1) (1972), 249-262 (English translation).
[38] N. V. Krylov, On the selection of a Markov process from a system of processes and the construction of quasi-diffusion processes, Math. USSR Izvestija 7(3) (1973), 691-708 (English translation).
[39] N. V. Krylov, Some estimates in the theory of stochastic integrals, Theory Prob. Appl. 18(1) (1973), 54-63 (English translation).
[40] N. V. Krylov, Some estimates of the probability density of a stochastic integral, Math. USSR Izvestija 8(1) (1974), 233-254 (English translation).
[41] N. V. Krylov, On Bellman's equation, Trudy Shkoly-Seminara po Teorii Sluchainykh Protsessov (Druskininkai, November 25-30, 1974), 1 (Vilnius, 1975) (in Russian).
[42] N. V. Krylov, Sequences of convex functions and estimates of the maximum of the solution of a parabolic equation, Siberian Math. J. 17(2) (1976), 226-236 (English translation).
[43] N. V. Krylov, The maximum principle for parabolic equations, Uspekhi Matem. Nauk 31(4) (1976), 267-268 (in Russian).
[44] R. Kudzhma, Optimal stopping of semistable Markov processes, Litovsk. Matem. Sbornik 13(3) (1973), 113-117 (in Russian).
[45] H. J. Kushner, Stochastic Stability and Control (Academic Press, New York, 1967).
[46] O. A. Ladyzhenskaja and N. N. Uraltseva, Linear and Quasi-Linear Equations of Elliptic Type (Academic Press, New York, 1968) (English translation).
[47] O. A. Ladyzhenskaja, V. A. Solonnikov, and N. N. Uraltseva, Linear and Quasi-Linear Equations of Parabolic Type (Transl. Math. Monographs, Vol. 23, Amer. Math. Soc., Providence, R.I., 1968).
[48] H. Lewy and G. Stampacchia, On existence and smoothness of solutions of some noncoercive variational inequalities, Arch. Rational Mech. Anal. 41(4) (1971), 242-253.
[49] J. L. Lions, On inequalities in partial derivatives, Uspekhi Matem. Nauk 26(2) (1971), 205-263 (in Russian).
[50] J. L. Lions, Optimal Control of Systems Governed by Partial Differential Equations (Springer-Verlag, Berlin, Heidelberg, New York, 1971).
[51] R. S. Liptser and A. N. Shiryayev, Statistics of Random Processes, 1, 2 (Springer-Verlag, Berlin, Heidelberg, New York, 1977-1978) (English translation).
[52] P. Mandl, On optimal control of a nonstopped diffusion process, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 4(1) (1965), 1-9.
[53] P. Mandl, On the control of a Wiener process for a limited number of switchings, Theory Prob. Appl. 12(1) (1967), 68-76 (English translation).
[54] P. A. Meyer, Probability and Potentials (Blaisdell, Waltham, Mass., 1966).
[55] H. Mine and S. Osaki, Markovian Decision Processes (Elsevier, New York, 1970).
[56] T. P. Miroshnichenko, Optimal stopping of the integral of a Wiener process, Theory Prob. Appl. 20(2) (1975), 387-391 (English translation).
[57] S. M. Nikolskii, Approximation of Functions of Several Variables and Imbedding Theorems (Springer-Verlag, Berlin, Heidelberg, New York, 1975) (English translation).
[58] M. Nisio, Remarks on stochastic optimal controls, Jap. J. Math. 1(1) (1975), 159-183.
[59] M. Nisio, Some remarks on stochastic optimal controls, Proc. Third USSR-Japan Sympos. Probab. Theory, pp. 446-460 (Lecture Notes in Mathematics No. 550, Springer-Verlag, Berlin, Heidelberg, New York, 1976).
[60] L. S. Pontryagin, V. G. Boltyansky, R. V. Gamkrelidze, and E. F. Mishchenko, Mathematical Theory of Optimal Processes (Nauka, Moscow, 1969) (in Russian).
[61] N. I. Portenko and A. V. Skorokhod, Existence of ε-Optimal Markov Strategies for Controlled Diffusion Processes. Problems of Statistics and Control of Random Processes (Izd. Instituta Matematiki Akademii Nauk Ukrainskoi SSR, Kiev, 1973) (in Russian).
[62] G. Pragarauskas, Some estimates of stochastic integrals, Litovsk. Matem. Sbornik 15(3) (1975), 211-217 (in Russian).
[63] G. Pragarauskas, On the control theory of discontinuous random processes, Trudy Shkoly-Seminara po Teorii Sluchainykh Protsessov (Druskininkai, November 25-30, 1974), pp. 252-281, 1 (Vilnius, 1975) (in Russian).
[64] Yu. V. Prokhorov, Control of a Wiener process when the number of switchings is bounded, Trudy MIAN 71 (1964), 82-87 (in Russian).
[65] M. V. Safonov, Control of a Wiener process when the number of switches is bounded, Theory Prob. Appl. 21(3) (1976), 593-599 (English translation).
[66] M. V. Safonov, On the Dirichlet problem for Bellman's equation in a plane domain, Matem. Sbornik 102(2) (1977), 260-279 (in Russian).
[67] L. Schwartz, Théorie des Distributions, 1 (Hermann, Paris, 1950).
[68] L. A. Shepp, Explicit solutions to some problems of optimal stopping, Ann. Math. Statist. 40(3) (1969), 993-1010.
[69] A. N. Shiryayev, Optimal Stopping Rules (Springer-Verlag, Berlin, Heidelberg, New York, 1978) (English translation).
[70] A. V. Skorokhod, Studies in the Theory of Random Processes (Scripta Technica, Washington, 1965) (English translation).
[71] V. I. Smirnov, A Course of Higher Mathematics (Pergamon, Oxford and New York, 1964) (English translation).
[72] S. L. Sobolev, Applications of Functional Analysis in Mathematical Physics (Amer. Math. Soc., Providence, R.I., 1963) (English translation).
[73] T. Tobias, The optimal stopping of diffusion processes and parabolic variational inequalities, Differential Equations 9(4) (1973), 534-538 (English translation).
[74] T. Tobias, On optimal stopping of diffusion processes with singular matrices of diffusion, Izvestija Akademii Nauk Estonskoi SSR 23(3) (1974), 199-202 (in Russian).
[75] A. Yu. Veretennikov and N. V. Krylov, On explicit formulas for solutions of stochastic equations, Math. USSR Sbornik 29(2) (1976), 239-256 (English translation).
[76] W. M. Wonham, Random Differential Equations in Control Theory: Probabilistic Methods in Applied Mathematics, 2 (Academic Press, New York, 1970).
[77] A. K. Zvonkin, On sequentially controlled Markov processes, Math. USSR Sbornik 15(4) (1971), 607-617 (English translation).
[78] A. K. Zvonkin and N. V. Krylov, On strong solutions of stochastic differential equations, Trudy Shkoly-Seminara po Teorii Sluchainykh Protsessov (Druskininkai, November 25-30, 1974), pp. 9-88, 2 (Vilnius, 1975) (in Russian).
Index
Bellman-Howard method 29, 43
Bellman's equation 4, 5, 6, 11, 13, 37, 203
, normed 12, 268
Bellman's principle 3, 6, 34, 133, 134, 135, 150
Contraction operator 26
Derivative, ℬ- 92-94
, 2ℬ- 92-94, 109
, Radon-Nikodym 50
Discounting 8, 9, 11
Function
, composite 97, 99
, convex downward 176-177, 183
, convex upward 183
, excessive 229-230, 246
, finite 72-73
, locally summable 49
, payoff 21, 33, 37, 129, 228-229, 276
, performance 2
, regular 52-53
, R-regular 60
, set 49
, smooth, with compact support 46, 68, 172
, stochastic Lyapunov 18
, superharmonic 229-231, 246
, with a zero crossing 272-273
Markov time 117
, ε-optimal 36
, 0-optimal 36
Method of perturbation of a stochastic equation 125
Normalizing multiplier 267, 270-271
, regular 267, 270-271
Optimal stopping problem 36
Optimal stopping time 39
Parabolic boundary 193
Process, ℬ-continuous (2ℬ-continuous) 103, 109, 144
, ℬ-differentiable (2ℬ-differentiable) 92-94, 99, 103, 104-105, 107-108, 190
, one-dimensional controlled 22
, real random 91
, separable 83
Randomized stopping 9-10, 36, 152
Smooth pasting condition 32
Solutions of a stochastic equation
, strong 87
, weak 87
Strategy 23, 130, 246
, admissible 248
, ε-optimal 24, 40-41, 132-133, 149
, Markov 6, 24, 29, 33, 131, 216, 220
, adjoint 223, 242
, ε-optimal 214, 218
, mixed 220
, (s₀, x₀)-adjoint 242
, natural 24, 131, 133, 148, 246
, optimal 34
, step 148, 149
, 0-optimal 24
Submartingale 84-85
Supermartingale 149, 157, 159, 299-300
Theorem on continuity and differentiability of a composite function 97
Theorem on uniqueness 41