Numerical Techniques
Thus, each f of the form (14.8) is a distribution belonging to H^{-1}. Now we return to the operator A*. For ψ ∈ H^1, using integration by parts, it follows from (14.7) that

-(A*ψ, ψ) = Σ_{i=1}^n (b̂_i ψ, D_i ψ)_{L^2} + (1/2) Σ_{i,j=1}^n (a_{ij} D_j ψ, D_i ψ)_{L^2},
(14.9)

where

b̂_i = b_i + (1/2) Σ_{j=1}^n D_j a_{ij}.
(14.10)
Let L_∞(R^n) = B(R^n) denote the class of essentially bounded measurable functions on R^n furnished with the usual norm topology ||g||_∞ = ess sup{|g(x)|, x ∈ R^n}.
Suppose there exist three numbers β, γ, δ > 0 satisfying the following conditions:

(A1): b̂_i ∈ B(R^n), i = 1, 2, ..., n, with β = sup{||b̂_i||_∞, i = 1, 2, ..., n} < ∞;

(A2): γ|ξ|^2 ≤ (1/2) Σ_{i,j=1}^n a_{ij} ξ_i ξ_j ≤ δ|ξ|^2, ξ ∈ R^n.
(14.11)

Then it follows from the Schwarz inequality applied to the first term of (14.9) that

|Σ_{i=1}^n (b̂_i ψ, D_i ψ)_{L^2}| ≤ β ||ψ||_{L^2} ||Dψ||_{L^2}.
(14.12)
Similarly, by virtue of the ellipticity assumption (A2), we have

(1/2) Σ_{i,j=1}^n (a_{ij} D_j ψ, D_i ψ)_{L^2} ≥ γ ||Dψ||^2_{L^2}.
(14.13)
Using the Cauchy inequality, ab ≤ (ε/2)a^2 + (1/2ε)b^2, ∀ε > 0, it follows from (14.9), (14.12) and (14.13) that

-(A*ψ, ψ) ≥ (γ - β(ε/2)) ||Dψ||^2_{L^2} - (β/2ε) ||ψ||^2_{L^2}.
(14.14)
Linear and Nonlinear Filtering
Since ε > 0 but otherwise arbitrary, it follows from (14.14) that there exist α > 0 and λ > 0 such that

-(A*ψ, ψ) + λ ||ψ||^2_{L^2} ≥ α ||ψ||^2_{H^1}, ∀ψ ∈ H^1.
(14.15)
This inequality also holds for the operator A. It is clear from the above expression that

lim_{||ψ||_{H^1} → ∞} { (-(Bψ, ψ) + λ ||ψ||^2_{L^2}) / ||ψ||_{H^1} } = +∞,
(14.16)

for B = A or A*. Operators satisfying such properties are called coercive. Following a similar procedure and using the upper bound in (A2), one can easily justify that there exists a constant C > 0, dependent only on β and δ, such that

|(A*φ, ψ)| ≤ β ||ψ||_{L^2} ||Dφ||_{L^2} + δ ||Dφ||_{L^2} ||Dψ||_{L^2} ≤ C ||φ||_{H^1} ||ψ||_{H^1}.
(14.17)

Inequality (14.17) also holds for the operator A. This inequality implies that both the operators A and A* are linear and bounded from H^1 to H^{-1}. Thus we have proved the following result.

Lemma 1 Under the assumptions (A1) and (A2) given by (14.11), both the differential operators A* and A satisfy Garding's inequality (14.15), and hence these operators are coercive with respect to the triple H^1 ↪ L^2 ↪ H^{-1}.
Further, both A and A* are bounded linear operators from H^1 to H^{-1}, and A* is the generator of a C_0 semigroup {S(t), t ≥ 0} in H. In view of the above result, equation (14.5) can be treated as a stochastic differential equation on the Hilbert space H = L^2(R^n). Defining the operator F by F(p) = (R_0^{-1}h)p, we can rewrite equation (14.5) as

dp = A*p dt + (F(p), dy), p(0) = p_0.
(14.18)
Clearly F is a multiplication operator and, for unbounded h, it is an unbounded operator in H with domain a proper subspace of H. If, on the other hand, h is bounded, F is a bounded operator from H to L(R^m, H). Under this assumption, using the Banach fixed point theorem, one can easily prove the existence of a unique solution of the integral equation

p(t) = S(t)p_0 + ∫_0^t (S(t - s)F(p(s)), dy(s)).
(14.19)
Further, one can prove that for any finite time interval I = [0, T], the solution p ∈ L^2(I, H^1) ∩ L^2(I, H) ∩ C(I, H) with probability one. We state this result formally as a theorem. For details see [2], [3].

Theorem 2 Suppose the assumptions of Lemma 1 hold and that R_0^{-1}h is uniformly bounded. Then equation (14.18) has a unique solution p and, for any finite time interval I, p ∈ L^2(I, H^1) ∩ L^2(I, H) ∩ C(I, H), P-a.s.

The uniform boundedness assumption on R_0^{-1}h is rather strong. In fact, for the existence of weak solutions this is not necessary. This can be appreciated as we deal with the Galerkin approximation.

14.3 Galerkin Approximation using Special Basis Functions

For convenience, from now on we use the notation V = H^1, H = L^2(R^n) and V* = H^{-1}. Note that the embeddings V ↪ H ↪ V* are continuous but not necessarily compact. We assume, however, that there exists a sequence {v_i, i ∈ N} ⊂ V which is orthogonal in V, orthonormal in H, and complete in all these spaces. We use this sequence to construct a Galerkin approximation for the solution p of equation (14.18). Since p_0 ∈ H, we can approximate it by the sequence {p_0^k} given by

p_0^k = Σ_{i=1}^k (p_0, v_i) v_i = Σ_{i=1}^k Z_{0i} v_i,
(14.20)
which converges to p_0 in H. Define

p^k(t) = Σ_{i=1}^k Z_i^k(t) v_i,
(14.21)
where the Fourier coefficients {Z_i^k, 1 ≤ i ≤ k} are determined as follows. Substitute (14.21) into equation (14.18) and scalar multiply by v_j, 1 ≤ j ≤ k, giving

dZ_j^k(t) = Σ_{i=1}^k a_{ji} Z_i^k(t) dt + Σ_{i=1}^k Z_i^k(t)(H_{ji}, dy), 1 ≤ j ≤ k,
(14.22)

where the elements of the matrices are given by

a_{ji} = (A*v_i, v_j) and H_{ji} = (R_0^{-1}h v_i, v_j), 1 ≤ i, j ≤ k.
(14.23)
Note that {H_{ji}}, taking values in R^m, are functions of time in case σ_0 is y-dependent. Clearly equation (14.22) can be written as an ordinary k-dimensional stochastic differential equation as follows:

dZ^k(t) = A_k Z^k(t) dt + B_k(t, Z^k(t)) dy, Z^k(0) = Z_0^k ≡ {(p_0, v_1), ..., (p_0, v_k)},
(14.24)
where A_k ∈ M(k × k) and B_k(t, z) ∈ M(k × m) for t ≥ 0 and z ∈ R^k. Since A_k is bounded and B_k is linear in z and hence Lipschitz, it follows from the results of chapter 2 that this equation has a unique solution {Z^k(t), t ≥ 0}, possessing finite second moments, with Z^k ∈ C(I, R^k) P-a.s. Clearly p^k, given by p^k(t) = Σ_{i=1}^k Z_i^k(t) v_i, is also in C(I, H) P-a.s., and it is an approximate solution of equation (14.18). For numerical computation, there are two major difficulties with the sequence {v_i}. First, there is no systematic method available for constructing a sequence {v_i} which is orthogonal in V and orthonormal in H. Secondly, since this sequence is orthonormal in H, there is no guarantee that the approximating sequence {p^k} will preserve positivity. Since any numerical algorithm must be terminated after a finite number of steps, say N, it is possible for p^N to assume negative values over certain regions of the space R^n, which is absurd for probability densities. In order to avoid this negativity, N may have to be taken very large, requiring extensive computation and hence large CPU time. Recently Ahmed and Radaideh [12] proposed a sequence of Gaussian functions {w_i} as the basis functions for numerical solution of the Zakai equation.
These are given by {w_i(x) = W(x, m_i, B_i), x ∈ R^n, i ∈ N}, where

W(x, m_i, B_i) = (1/((2π)^{n/2} √(det B_i))) exp{-(1/2)(B_i^{-1}(x - m_i), x - m_i)},
(14.25)
with m_i ≠ m_j ∈ R^n for i ≠ j, and the matrices B_i ∈ M^+(n × n) are positive definite. For all examples we have considered B_i = γ_i I, γ_i > 0, with complete success. It is evident that for m_i ≠ m_j the sequence {w_i} is linearly independent, but neither orthogonal nor normal, though by the Gram-Schmidt procedure one can always orthonormalize the given sequence. However, for numerical purposes this is very inefficient and not at all necessary. Rather, it is more convenient to use the sequence {w_i} as they are. The following properties of the sequence {w_i} are advantageous over the orthonormal sequence {v_i}:

(a) w_i(x) > 0, x ∈ R^n, i ∈ N.
(b) w_i ∈ D(A) = D(A*), w_i ∈ C^∞.
(c) the sequence {w_i} is complete in the class L^2(R^n) = H.

In (14.21) we replace the sequence {v_i} by the sequence {w_i} and replace the equation (14.24) by

V_k dZ^k(t) = A_k Z^k(t) dt + B_k(t, Z^k(t)) dy, V_k Z^k(0) = z_0^k,
(14.26)
where the elements of the matrices are given by

V_k = {v_{ji} = (w_j, w_i), 1 ≤ i, j ≤ k}; A_k = {a_{ji}^k = (w_j, A*w_i), 1 ≤ i, j ≤ k};

B_k(t, z) = {B_{j,l}^k(t, z) = Σ_{i=1}^k ((R_0^{-1}h)_l w_i, w_j) z_i, 1 ≤ j ≤ k, 1 ≤ l ≤ m}.
(14.27)

The initial state z_0^k is given by the corresponding expression in (14.24) with v_i replaced by w_i. In this case all the Fourier coefficients {Z_{0i}^k} are nonnegative. Note that the matrices V_k are invertible.

Comment 1 Equation (14.26) is equivalent to

d(p^k(t), w_j) = (p^k(t), A w_j) dt + ((p^k(t) R_0^{-1}h, w_j)_H, dy), p^k(0) = p_0^k,
(14.28)
for 1 ≤ j ≤ k. Integrating this we have the weak form

(p^k(t), w_j) = (p_0^k, w_j) + ∫_0^t (p^k(s), A w_j) ds + ∫_0^t ((p^k(s) R_0^{-1}h, w_j)_H, dy(s)),
(14.29)
for every w_j, 1 ≤ j ≤ k. Looking at the integrand of the martingale term in (14.29) more closely, we see that

(p^k(t) R_0^{-1}h, w_j)_H = Σ_{i=1}^k Z_i^k(t) ∫_{R^n} w_i(x)(R_0^{-1}h)(x) w_j(x) dx,
(14.30)

which is well defined even for unbounded h, in particular for h with polynomial growth, |h(x)|_{R^m} ≤ K(1 + |x|^q_{R^n}), 0 ≤ q < ∞.
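The integrals in (14.30) are weighted by Gaussians, which is what makes them finite for polynomially growing h. A related convenience of the Gaussian basis (14.25) is that the Gram matrix V_k = {(w_j, w_i)} of (14.27) needs no quadrature at all: the L^2 inner product of two Gaussian densities is itself a Gaussian evaluation, ∫ W(x, m_i, B_i) W(x, m_j, B_j) dx = W(m_i; m_j, B_i + B_j). A minimal sketch of this (function names are illustrative, not from the text):

```python
import numpy as np

def gauss(x, m, B):
    """Gaussian density W(x, m, B) as in (14.25)."""
    d = np.atleast_1d(x) - np.atleast_1d(m)
    n = d.size
    B = np.atleast_2d(B)
    norm = np.sqrt((2 * np.pi) ** n * np.linalg.det(B))
    return float(np.exp(-0.5 * d @ np.linalg.inv(B) @ d) / norm)

def gram_matrix(means, covs):
    """V_k with entries (w_j, w_i)_{L^2} = W(m_i; m_j, B_i + B_j)."""
    k = len(means)
    V = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            V[j, i] = gauss(means[i], means[j], covs[i] + covs[j])
    return V
```

Since the w_i are not orthogonal, V_k is full, but it is symmetric and positive definite for distinct centers, which is exactly the invertibility of the mass matrix used in (14.26).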
Thus, for weak solutions, the uniform boundedness assumption is unnecessary.

14.4 Spatial Discretization and Computational Algorithm

For numerical computation using Runge-Kutta techniques, the Stratonovich formulation is the correct framework for stochastic systems. Thus a correction term, known as the Wong-Zakai correction [74] (see also chapter 2), must be added to the Zakai equation (14.5) or (14.18), giving

dp_t = A*p_t dt - (1/2)(R_0^{-1}h, h) p_t dt + p_t (R_0^{-1}h, dy) = Ã*p_t dt + p_t (R_0^{-1}h, dy), p(0) = p_0,
(14.31)

where Ã* = A* - (1/2)(R_0^{-1}h, h) I. In the Galerkin scheme discussed in the preceding section, the operator A* must be replaced by Ã*. In this case the elements of the matrix A_k of equation (14.26) are replaced by

Ã_k = {ã_{ji} = (w_j, A*w_i) - (1/2)((R_0^{-1}h, h) w_j, w_i)_H, 1 ≤ i, j ≤ k}.
(14.32)
Using the solution of equation (14.26), with this correction, the approximate conditional mean and covariance matrix of the original process {x(t), t ≥ 0} can be calculated as follows:

x̂(t) ≈ Σ_{i=1}^N Z_i^N(t) m_i / Σ_{i=1}^N Z_i^N(t),
(14.33)

K̂(t) ≈ Σ_{i=1}^N Z_i^N(t) (B_i + (x̂(t) - m_i)(x̂(t) - m_i)') / Σ_{i=1}^N Z_i^N(t).

The first expression is demonstrated as follows:

x̂(t) = E{x(t) | F_t^y} = ∫_{R^n} ξ p_t(ξ) dξ / ∫_{R^n} p_t(ξ) dξ ≈ Σ_{i=1}^N Z_i^N(t) ∫_{R^n} ξ w_i(ξ) dξ / Σ_{i=1}^N Z_i^N(t) = Σ_{i=1}^N Z_i^N(t) m_i / Σ_{i=1}^N Z_i^N(t).
(14.34)
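The mixture formulas (14.33) translate directly into code. In the sketch below, Z holds the current weights and means, covs hold the basis parameters {m_i, B_i}; these names are assumptions for illustration:

```python
import numpy as np

def conditional_moments(Z, means, covs):
    """Approximate conditional mean and covariance via (14.33)."""
    Z = np.asarray(Z, dtype=float)
    M = np.asarray(means, dtype=float)
    w = Z / Z.sum()                      # normalized weights
    xhat = w @ M                         # sum_i w_i m_i
    K = np.zeros((M.shape[1], M.shape[1]))
    for wi, mi, Bi in zip(w, M, covs):
        d = (xhat - mi)[:, None]
        K += wi * (Bi + d @ d.T)
    return xhat, K
```

Note the normalization by Σ Z_i: the weights come from an unnormalized density, so the ratio in (14.33) is essential.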
Similarly one can verify the second expression. In order to compute all the matrix elements given by (14.27) and (14.32) as accurately as possible, and at the same time minimize the computational burden, it is necessary to select a suitable bounded domain in R^n. This can be based on the approximate support of the initial density p_0. Define the vector

m_0 = ∫_{R^n} ξ p_0(ξ) dξ,

and define the cube

C_{2a} = C_{2a}(m_0) = {x ∈ R^n : m_{0i} - a ≤ x_i ≤ m_{0i} + a, 1 ≤ i ≤ n}

around the mean m_0. Since p_0 is a probability density, for any ε > 0 there exists a compact set K ⊂ R^n such that ∫_K p_0(x) dx > 1 - ε. We choose a large enough so that K ⊂ C_{2a}. This domain can be enlarged as desired by adjusting a so as to absorb any integer multiple of K. Once the domain is fixed, each edge is partitioned into d equal intervals, creating d^n = N cells covering the entire domain. They are then indexed in some order, giving C_{2a} = ∪_{r=1}^N D_r. The center of cell D_r is denoted by m_r. Choose the matrix B_r = γ_r I, γ_r > 0. Then use the sequence {w_r = W(x, m_r, B_r)} for the Galerkin approximation. For further details see [12]. One way of judging the performance of the approximate filter is to evaluate the residual error defined by the innovation process
V̂(t) = ∫_0^t σ_0^{-1}(s)(dy(s) - ĥ(s) ds),

where ĥ(t) = E{h(x(t)) | F_t^y}. We know that V̂ is a Brownian motion with mean zero and incremental covariance R. Note that ĥ is computed using the expression

ĥ(t) ≈ ∫_{R^n} h(ξ) p^k(t, ξ) dξ / ∫_{R^n} p^k(t, ξ) dξ,

where p^k is the approximate solution of equation (14.31) obtained by solving equation (14.26) as discussed in the preceding section. If the observed behavior of the measurement residual V̂(t + Δt) - V̂(t) is inconsistent with its theoretical properties, then it must be concluded that the number of terms in the Galerkin approximation is inadequate, or that the chosen domain does not adequately support the probability mass. So the domain decomposition by cells must be further
refined by increasing the number of cells, or the domain must be enlarged, or both, until the error is significantly reduced.

14.5 Basic Computational Steps

In this section we present the basic computational steps.

Step 1 Generate the random processes W and V.
Step 2 Given x_0, solve the system equations (14.1) and (14.2) using the Runge-Kutta method to obtain the values of the observation process y at discrete points of time {y(t_i), i = 1, 2, ..., L}.
Step 3 Solve for Z^k(0) from equation (14.26b).
Step 4 Solve the finite dimensional stochastic differential equation (14.26), using Z^k(0) as the initial state, to obtain {Z^k(t_i), i = 1, 2, ..., L}.
Step 5 Check the residual ΔV̂(t) ≈ σ_0^{-1}(t)(Δy(t) - ĥ(t)Δt). If this is not approximately a random element with mean zero and covariance ΔtR, increase the number of cells N and the size of the cube C_{2a}, if required, and go to Step 3; otherwise stop.

14.6 An Alternative Approach

Before presenting some numerical examples based on the above scheme, we prefer to present here another interesting approach, which may be useful for real-time computing and filtering. This is based on an implicit scheme directly applied to the system equation (14.31) in the Hilbert space H. Let J = [0, T] be any finite time interval partitioned into, say, L equal subintervals of length Δ. Let t ∈ J be a point at any one of the interior nodes, and suppose p(t) is already known and that the new information contained in the increment of y, given by Δy(t) = y(t + Δ) - y(t), is available. The problem is to find p(t + Δ) from the given information. We obtain the solution by approximating the equation (14.31) using the following implicit scheme:

p(t + Δ) ≈ p(t) + (Ã*p(t + Δ)) Δ + p(t)(R_0^{-1}h, Δy(t)),
(14.35)
which can be written as

(I - ΔÃ*) p(t + Δ) ≈ p(t)(1 + (R_0^{-1}h, Δy(t))).
(14.36)
If the operator (I - ΔÃ*) is invertible, the solution is given by

p(t + Δ) ≈ (I - ΔÃ*)^{-1} p(t)(1 + (R_0^{-1}h, Δy(t))).
(14.37)
From (14.37) we can then construct the following algorithm:

p_s = (I - ΔÃ*)^{-1}(1 + (R_0^{-1}h, Δy_s)) p_{s-1},
(14.38)
for s = 1, 2, ..., L, with p_s = p(sΔ) and p_0 the given initial state. For invertibility of the operator (I - ΔÃ*), we note that for every φ ∈ V we have

((I - ΔÃ*)φ, φ) = |φ|^2_H - Δ(Ã*φ, φ) ≥ |φ|^2_H + Δ(α ||φ||^2_V - λ|φ|^2_H + |√((1/2)(R_0^{-1}h, h)) φ|^2_H) ≥ (1 - λΔ)|φ|^2_H + (αΔ)||φ||^2_V,
(14.39)

where the second line follows from the coercivity of -A* (see inequality (14.15)) and the positivity of the matrix R_0. Hence, for Δ sufficiently small, it follows from (14.39) that the operator (I - ΔÃ*) is coercive; that is, there exists a constant c > 0, dependent only on {Δ, α, λ}, so that

((I - ΔÃ*)φ, φ)_{V*,V} ≥ c ||φ||^2_V, ∀φ ∈ V.
(14.40)
From this inequality one can prove that the operator has a bounded inverse from V* to V satisfying

||(I - ΔÃ*)^{-1} ψ||_V ≤ (1/c) ||ψ||_{V*}, ∀ψ ∈ V*.
(14.41)
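On a grid, the recursion (14.38) is one linear solve per time step. The sketch below is for one space dimension with the drift term of A* omitted for brevity, so A*p = (σ^2/2) p_xx and Ã* carries the Wong-Zakai correction; h_vals (the observation function on the grid) and the zero boundary rows are assumptions of this sketch, not prescriptions from the text:

```python
import numpy as np

def implicit_step(p, dx, dt, sigma, h_vals, sigma0, dy):
    """One step p_{s-1} -> p_s of (14.38) for a 1-D Zakai equation.

    Atilde* = (sigma^2/2) d^2/dx^2 - (1/2)(h/sigma0)^2 I,
    discretized by central differences (drift omitted in this sketch).
    """
    n = p.size
    c = sigma ** 2 / (2 * dx ** 2)
    Atil = np.zeros((n, n))
    for i in range(1, n - 1):
        Atil[i, i - 1] = c
        Atil[i, i] = -2 * c
        Atil[i, i + 1] = c
    Atil -= 0.5 * np.diag((h_vals / sigma0) ** 2)
    B = np.eye(n) - dt * Atil                      # I - Delta * Atilde*
    rhs = p * (1.0 + (h_vals / sigma0 ** 2) * dy)  # right side of (14.38)
    return np.linalg.solve(B, rhs)
```

The matrix B is tridiagonal apart from the diagonal correction, so in production one would use a banded solver rather than a dense one; the dense solve is kept here only for clarity.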
For convenience denote B = (I - ΔÃ*) and let Λ : V → V* denote the duality map, or canonical isomorphism of V onto V*, so that for any φ ∈ V, ||φ||_V = ||Λφ||_{V*}, and similarly, for any g ∈ V*, ||g||_{V*} = ||Λ^{-1}g||_V. Since in our case V = H^1, we have Λ = (I - Δ_x), where Δ_x denotes the Laplacian in R^n. Now define the operator G by

Gφ = φ - θ Λ^{-1}(Bφ - f), for f ∈ V* and θ > 0 to be chosen suitably.
Clearly, if this operator has a fixed point, say ψ ∈ V, then ψ = Gψ means that Bψ = f; in other words, ψ is a solution of the equation Bφ = f. Since this is true for any f ∈ V*, we conclude that B is an onto map from V to V*. On the other hand, it is one to one since ||Bφ||_{V*} ≥ c ||φ||_V, ∀φ ∈ V. All this means that B is invertible and has a bounded inverse. Further, recall that for bounded h, the operator B is bounded from V to V*. Let C_b denote this bound. We show that for a suitable choice of θ the operator G has a fixed point in V. By simple computation one can verify that
||Gφ_1 - Gφ_2||^2_V = ||φ_1 - φ_2||^2_V - 2θ(Λ^{-1}(Bφ_1 - Bφ_2), φ_1 - φ_2)_V + θ^2 ||Λ^{-1}(Bφ_1 - Bφ_2)||^2_V = ||φ_1 - φ_2||^2_V - 2θ(Bφ_1 - Bφ_2, φ_1 - φ_2)_{V*,V} + θ^2 ||Bφ_1 - Bφ_2||^2_{V*} ≤ (1 - 2θc + θ^2 C_b^2) ||φ_1 - φ_2||^2_V.

Hence, for 0 < θ < 2c/C_b^2, the map G is a contraction on V and has a unique fixed point, which yields the recursion
p_s = (I - ΔÃ*)^{-1}(1 + (R_0^{-1}h, Δy_s)) p_{s-1}, p_0 = p_0, s = 1, 2, ..., L.

Note that this also proves the existence of a unique solution of the Zakai equation (14.31), and that the solution is pathwise continuous with values in V.

14.7 Examples and Simulation Results

Consider the three dimensional system

dx_1 = g(x)β_1(x) dt + σ dW_1,
dx_2 = g(x)β_2(x) dt + σ dW_2,
dx_3 = g(x)β_3(x) dt + σ dW_3,
(14.43)
and the observation dynamics given by one of the following sets:

dy_1 = x_1 dt + σ_0 dV_1, dy_2 = x_2 dt + σ_0 dV_2, dy_3 = x_3 dt + σ_0 dV_3;
(14.44a)

dy_1 = x_1 dt + σ_0 dV_1, dy_2 = x_2 dt + σ_0 dV_2;
(14.44b)

dy_1 = x_1 dt + σ_0 dV_1.
(14.44c)
The functions g and {β_i, i = 1, 2, 3} are given by the following expressions:

g = tanh(2x_1^2 - x_2^2 - x_3^2 + x_1 x_2 + x_1 x_3 + x_2 x_3),
β_1 = 4x_1 + x_2 + x_3,
β_2 = x_1 - 2x_2 + x_3,
β_3 = x_1 + x_2 - 2x_3.
Since the drift vector b = [gβ_1, gβ_2, gβ_3]' is the gradient of the scalar valued function

F = log cosh[2x_1^2 - x_2^2 - x_3^2 + x_1 x_2 + x_1 x_3 + x_2 x_3],

it satisfies the following Benes condition [25], [12], [79]:

||∇F||^2 + ΔF + (1/σ_0^2)|h(x)|^2 = (Qx, x) + (q, x) + q_0,
(14.45)
where Q is a positive definite matrix, q ∈ R^3 and q_0 ∈ R. An exact solution of the corresponding Zakai equation is given by

p(t, x) = c(t) exp{F(x) - (1/2)(Σ^{-1}(t)(x - m(t)), x - m(t))}.
(14.46)
This equation determines an unnormalized conditional density in terms of the parameters {Σ, m}, which are the solutions of the following equations:

Σ̇(t) = σ^2 I - (1/σ_0^2) Σ^2(t), dm(t) = (1/σ_0^2) Σ(t)(dy(t) - m(t) dt).
(14.47)
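The gradient structure b = ∇F claimed above is easy to confirm numerically. The following check differentiates F = log cosh(S) by central differences and compares the result with g·β, using the expressions for g and β_i as reconstructed here:

```python
import numpy as np

def S(x):
    return (2 * x[0] ** 2 - x[1] ** 2 - x[2] ** 2
            + x[0] * x[1] + x[0] * x[2] + x[1] * x[2])

def F(x):
    return np.log(np.cosh(S(x)))

def drift(x):
    g = np.tanh(S(x))
    beta = np.array([4 * x[0] + x[1] + x[2],
                     x[0] - 2 * x[1] + x[2],
                     x[0] + x[1] - 2 * x[2]])
    return g * beta

x = np.array([0.3, -0.2, 0.1])
eps = 1e-6
num_grad = np.array([(F(x + eps * e) - F(x - eps * e)) / (2 * eps)
                     for e in np.eye(3)])
assert np.allclose(num_grad, drift(x), atol=1e-6)
```

Since ∇ log cosh(S) = tanh(S)∇S, the check succeeds exactly when β = ∇S, which is the case for the quadratic form S used above.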
We solve each of the three problems (14.43)-(14.44a), (14.43)-(14.44b) and (14.43)-(14.44c) using the steps given in sections 14.4 and 14.5. The exact results are obtained by solving (14.47) and (14.46), and they are compared with the solutions computed using the numerical scheme described in sections 14.4 and 14.5. The estimated states are presented as functions of time in figures 1, 2, 3. Theoretical results are plotted with unbroken lines and numerical ones with broken lines. It is extremely encouraging to see how close the numerical results are to the analytical results, which are based on exact solutions. The data used for the numerical simulation are σ = 1.0, σ_0 = 0.015, Σ_0 = 0.1 I, m_1(0) = 0.2, m_2(0) = 0.1, m_3(0) = -0.1.

Comment 2 The reader may find it challenging to experiment with the alternative approach proposed in section 14.6.

Comment 3 For online computation, the alternative approach may prove to be very useful. If the operator A is time varying, this requires a powerful computer to evaluate the inverse of elliptic operators at each time step.

Comment 4 Again the reader is encouraged to evaluate the performances of EKF-3 and EKF-4, given by equations (12.50)-(12.51) of chapter 12 and equations (13.55)-(13.56) of chapter 13 respectively. This can be done by comparing their solutions with the best estimates given by the solutions of the associated Zakai equations (13.37B) of chapter 13.
Fig.1
Fig.2
Fig.3
Courtesy of Kluwer Acad. Pub., Dynamics and Control, 7, 1997, 293-308
CHAPTER 15
PARTIALLY OBSERVED CONTROL
15.1 Introduction

In many problems arising in the physical or social sciences, one is required not only to give a best estimate of the unobservable from the observable, but also to exercise controls to change the course of evolution of the unobservable in order to achieve certain objectives. This is partially observed control. For example, the population (or immigration) control for a country may be exercised on the basis of limited survey data in order to maintain a certain growth rate considered to be good for the country socially and economically. The same philosophy applies to government regulations applied to fishing industries, where the objective is to prevent overfishing by prescribing an appropriate net size, determined on the basis of the government's estimate of the available fish stock and its growth characteristics. In fact this concept is universal and applies to any situation where a decision is to be made for "good" on the basis of partial information.

15.2 Linear Systems with Integral Observation

The system is governed by the following controlled stochastic differential equation,

dx(t) = A(t)x(t) dt
+ f(t, u(t)) dt + σ(t) dW(t), x(0) = x_0,
(15.1)
with the measurement dynamics given by

dy(t) = H(t)x(t) dt + σ_0(t) dV(t), t ≥ 0, y(0) = 0.
(15.2)
Here all the parameters satisfy the same set of standard assumptions that we used in chapter 3. The function f is a map from [0, ∞) × U → R^n, measurable in the first variable and continuous in the second. This is the mechanism through which control is exercised on the plant. The set U is a
compact, possibly convex, subset of R^d, prescribing the set of admissible values the control may take. Here we assume that the initial state x_0 is Gaussian, that the random processes {W, V} are Brownian motions in R^p and R^m respectively, and that they are all mutually uncorrelated. The performance of the system over the time period I = [0, T] is measured by the average cost given either by

J(u) = E{ ∫_I ℓ(t, x(t), u(t)) dt },
(15.3a)

or

J(u) = E{ ∫_I ℓ̂(t, x̂(t), u(t)) dt },
(15.3b)
where x̂(t) is the best estimate of x(t) given the history {y(s), 0 ≤ s ≤ t}, or simply the information sigma algebra F_t^y. The functions ℓ and ℓ̂ are measurable in the first variable and continuous in the rest of their variables. Additional assumptions will be introduced as necessary. The basic problem is to find a control, F_t^y measurable, that imparts a minimum to the cost functional. Since f is F_t^y adapted, it follows from the concluding remarks of chapter 4 that the best estimate x̂ is given by the solution of the SDE

dx̂ = Ax̂ dt + f dt + KH'R_0^{-1}(dy - Hx̂ dt), x̂(0) = x̄_0.
(15.4)
The matrix K is given by the solution of the same differential Riccati equation (3.35) of chapter 3, reproduced here:

K̇(t) = A(t)K(t) + K(t)A'(t) + Q(t) - K(t)H'(t)R_0^{-1}(t)H(t)K(t), K(0) = K_0.
(15.5)
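Since (15.5) does not involve the control or the observations, the gain can be precomputed offline. A forward-Euler sketch (step size and the argument names are illustrative assumptions):

```python
import numpy as np

def riccati_path(A, H, Q, R0inv, K0, dt, steps):
    """Integrate Kdot = A K + K A' + Q - K H' R0^{-1} H K, K(0) = K0."""
    K = K0.copy()
    out = [K0.copy()]
    for _ in range(steps):
        Kdot = A @ K + K @ A.T + Q - K @ H.T @ R0inv @ H @ K
        K = K + dt * Kdot
        out.append(K.copy())
    return out
```

For time-invariant scalar data with A = 0 and H = Q = R_0 = 1, the error variance satisfies K̇ = 1 - K^2 and approaches the stationary value K = 1, a convenient sanity check for the integrator.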
Thus, it is important to note that by introducing F_t^y adapted controls the error covariance cannot be modified; in other words, control cannot change the uncertainty in the state estimate. However, the estimate x̂(t) itself can be influenced by the choice of the control forces given by f. At the same time, control actions will also bring about a similar influence on the state itself. This suggests that one may use either of the cost functionals (15.3a) or (15.3b). First we consider the cost functional (15.3b) along with the system (15.4). This is a very realistic problem. Since the estimate is available and F_t^y adapted, this is a fully observed problem. Recall that the process V̄ given by

dV̄ = σ_0^{-1}(dy - Hx̂ dt),
(15.6)
is an innovation process, and it is a Brownian motion having the same incremental covariance, R, as that of the Brownian motion V perturbing the measurement dynamics. Hence equation (15.4) can be written as

dz = Az dt + f dt + G dV̄, z(0) = x̄_0, where G = KH'R_0^{-1}σ_0,
(15.7)
In other words, we have converted our partially observed control problem into a fully observed control problem given by

dz = Az dt + f dt + G(t) dV̄, z(0) = x̄_0,
J(u) = E{ ∫_I ℓ̂(s, z(s), u(s)) ds } → inf,
(15.8)
where the infimum is taken over the class of admissible controls U_ad comprised of F_t^y measurable processes taking values from the set U. Later we show that the infimum is actually attained in a subclass Û of Markovian functionals of the process z. That is, u ∈ Û if it has the representation u(t) = F(t, z(t)) for a suitable map F : [0, T] × R^n → U. We refer to this problem as (P1). It can be solved by using Bellman's principle of optimality. For this purpose we embed this control problem into a family of problems as follows. Let t ∈ (0, T) and z(t) = x, and consider the problem

dξ(s) = A(s)ξ(s) ds + f(s, u(s)) ds + G(s) dV̄, ξ(t) = x, s > t,
C(u, t, x) = E{ ∫_t^T ℓ̂(s, ξ(s), u(s)) ds | F_t^y },
(15.9)
where ξ(s) = ξ^u_{t,x}(s), s ∈ (t, T], is the unique solution, corresponding to the control u ∈ U_ad, of the SDE (15.9a), starting from the state x at time t. Corresponding to the control policy u, the functional C(u, t, x) denotes the average running cost over the interval [t, T] starting from the state x. According to Bellman's principle of optimality, if z has reached the state x at time t, the choice of future control actions should be such that it minimizes the cost for the remaining time period. Denote

φ(t, x) = inf{C(u, t, x), u ∈ U_ad}.
(15.10)
For the moment, suppose the infimum is attained at u° ∈ U_ad, so that φ(t, x) = C(u°, t, x). Clearly then, for t = 0 and x = x̄_0, we have

J(u°) = φ(0, x̄_0),
(15.11)
provided this function is continuous in all its arguments on I × R^n. The function φ is called the value function. We assume that φ ∈ C^{1,2}(I × R^n) and show that it satisfies a semilinear partial differential equation called the Bellman equation. We use the Ekeland variational principle. Let u° ∈ U_ad denote the optimal control and let ξ°_{t,x} denote the unique solution of (15.9a) corresponding to u°. Here {t, x} is fixed but arbitrary. Let v be a U valued random variable measurable with respect to the sigma algebra F_t^y. Define a new control

u(s) = v for s ∈ [t, t + Δt); u(s) = u°(s) for s ∈ [t + Δt, T].
(15.12)
Clearly, by virtue of the optimality principle,

φ(t, x) ≤ E{ ∫_t^{t+Δt} ℓ̂(s, ξ^v_{t,x}(s), v) ds + φ(t + Δt, ξ^v_{t,x}(t + Δt)) | F_t^y },
(15.13)
where ξ^v_{t,x}(t + Δt) denotes the state attained by the solution of equation (15.9a) at time t + Δt due to the control action v. This is given by the approximation

ξ^v_{t,x}(t + Δt) ≈ x + (A(t)x + f(t, v))Δt + G(t)(V̄(t + Δt) - V̄(t))
≡ x + (A(t)x + f(t, v))Δt + G(t)ΔV̄. Then, using the Lagrange formula, one can easily verify that

φ(t + Δt, ξ^v_{t,x}(t + Δt)) = φ(t + Δt, x + (A(t)x + f(t, v))Δt + G(t)ΔV̄) ≈ φ(t + Δt, x) + (φ_x(t + Δt, x), (A(t)x + f(t, v))Δt) + (G'(t)φ_x, ΔV̄) + (1/2)⟨φ_xx G(t)ΔV̄, G(t)ΔV̄⟩ + o(Δt).
(15.14)
Substituting this in (15.13), and using the fact that ΔV̄ has zero mean and covariance RΔt, we obtain

φ(t, x) ≤ φ(t + Δt, x) + E{ ∫_t^{t+Δt} ℓ̂(s, ξ^v_{t,x}(s), v) ds | F_t^y } + (φ_x(t + Δt, x), A(t)x + f)Δt + (1/2)Tr(φ_xx(t + Δt, x)(GRG'))Δt + o(Δt).
(15.15)
For convenience, let Dφ and D^2φ denote the gradient vector and the matrix of second partials of φ, respectively, and let M(F_t^y, U) denote the class of F_t^y measurable U valued random variables. Dividing both sides of the inequality (15.15) by Δt and letting Δt → 0, we have, for all (t, x) ∈ I × R^n,

0 ≤ (L^v φ)(t, x) + ℓ̂(t, x, v), (t, x) ∈ I × R^n, ∀v ∈ M(F_t^y, U),
(15.16)
where

(L^v φ)(t, x) = (∂/∂t)φ + (Dφ, A(t)x + f(t, v)) + (1/2)Tr((D^2φ)GRG') = (∂/∂t)φ + A^v φ,

where A^v is the infinitesimal generator of the Markov process z. This follows from the right continuity of F_t^y, the measurability of f in the first argument and continuity in the second, the almost sure continuity of the process ξ, and the assumption that φ ∈ C^{1,2}(I × R^n). Now suppose there exists a control u° ∈ U_ad at which the inequality (15.16) turns into an equality. That is,

0 = (L^{u°} φ)(t, x) + ℓ̂(t, x, u°(t)), (t, x) ∈ I × R^n.
(15.17)
We show that any such control must be optimal. Consider the solutions z° and ξ°_{t,x} of equations (15.8) and (15.9), respectively, corresponding to the same control u°. Taking the Ito derivative of φ(s, ξ°_{t,x}(s)) along this trajectory and integrating over the interval [t, T], we have

-φ(t, x) = ∫_t^T (L° φ)(s, ξ°_{t,x}(s)) ds + ∫_t^T (G'Dφ(s, ξ°_{t,x}(s)), dV̄(s)).
(15.18)
Since z°(t) = x, it follows from uniqueness of the solution that ξ°_{t,x}(s) = z°(s), s ≥ t, and thus equation (15.18) is identical to the following expression:

-φ(t, z°(t)) = ∫_t^T (L° φ)(s, z°(s)) ds + ∫_t^T (G'Dφ(s, z°(s)), dV̄(s)).
(15.19)
Taking the expectation of either side and letting t → 0, it follows from (15.19) that

-φ(0, x̄_0) = E{ ∫_0^T (L° φ)(s, z°(s)) ds }.
(15.20)
Using (15.17) in (15.20) and recalling equation (15.11), we have

J(u°) = φ(0, x̄_0) = E{ ∫_0^T ℓ̂(t, z°(t), u°(t)) dt }.
(15.21)
Therefore, any admissible control satisfying the equality (15.17) is an optimal control. Thus we have proved the following result.

Theorem 1 Suppose there exists a control u° ∈ U_ad and a function φ ∈ C^{1,2}(I × R^n), satisfying the terminal condition φ(T, x) = 0, x ∈ R^n, so that

0 = (L^{u°} φ)(t, x) + ℓ̂(t, x, u°(t)), (t, x) ∈ [0, T) × R^n,
0 ≤ (L^v φ)(t, x) + ℓ̂(t, x, v), (t, x) ∈ [0, T) × R^n, ∀v ∈ U.
(15.22)
Then u° is optimal.

This result is equivalent to the well known Bellman equation. Indeed, by virtue of the expressions (15.22), the question of existence of a function φ, as stated in the above result, is equivalent to that of existence of a solution of the Hamilton-Jacobi-Bellman (HJB) equation given by

(∂/∂t)φ + inf_{u∈U} {(Dφ, Ax + f(t, u)) + ℓ̂(t, x, u)} + (1/2)Tr((D^2φ)GRG') = 0, φ(T, x) = 0.
(15.23)

Define the Hamiltonian

M(t, x, u, q) = (q, Ax + f(t, u)) + ℓ̂(t, x, u), (t, x, u, q) ∈ I × R^n × U × R^n.
(15.24)
Recall that f is measurable in the first variable and continuous in the second, and ℓ̂ is also measurable in the first variable and continuous in the rest of its arguments. Since U is compact, this implies that for almost all t ∈ I, and for each (x, q) ∈ R^n × R^n, the Hamiltonian, considered as a function of u, is continuous and hence attains its minimum on U. Thus the infimum in the Bellman equation can be replaced by the minimum. Further, if u → M(t, x, u, q) is strictly convex, or equivalently M_uu is a positive definite matrix, that is, M_uu > 0, for almost all t and for all (x, u, q), then there exists a unique point u* = η(t, x, q) ∈ U, usually dependent on the variables indicated, such that

inf_{u∈U} M(t, x, u, q) = M(t, x, η(t, x, q), q).
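When ℓ̂ is quadratic in u and U is a box (coordinate-wise bounds), the selection map η has a closed form: the unconstrained stationary point clipped to the box. A sketch under those assumptions (exact for diagonal N; for a general N the clipping is only a projection heuristic; all names are illustrative):

```python
import numpy as np

def eta(q, B, N, lo, hi):
    """Minimize (q, B u) + (1/2)(N u, u) over the box [lo, hi]^d.

    Unconstrained minimizer is -N^{-1} B' q; clip to the box
    (exact componentwise when N is diagonal).
    """
    u_star = -np.linalg.solve(N, B.T @ q)
    return np.clip(u_star, lo, hi)
```

For a compact non-box U one would instead minimize over a grid or use a projected gradient step; the point is only that η(t, x, q) is an explicit function fed back into the Bellman equation below.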
Using this function η one can rewrite the Bellman equation as a semilinear PDE:

(∂/∂t)φ + (1/2)Tr((D^2φ)GRG') + M(t, x, η(t, x, Dφ), Dφ) = 0, φ(T, x) = 0.
(15.25)
Linear and Nonlinear Filtering
209
of existence of a (classical) solution 0, that is, one that belongs to the class C 1 ' 2 and satisfies equation (15.25) every where, is very crucial. In fact it is very rare that such a solution exists. Very stringent assumptions on the parameters {A,/,cr,H,a 0 l l} are required for the existence of such smooth solutions. For details on the regularity conditions required of the parameters mentioned above see the excellent review article due to Wonham [75]. In fact Wonham imposed all the necessary assumptions on these parameters so that the existence results for semilinear parabolic equations appearing in the PDE literature are applicable [see Ladyzenskaja et al, 60]. In any case, if equation (15.25) has a classical solution 0°, then the optimal control is given by u°(t) =V(t,z0(t),D(j>(t,z0(t))
= fj{t,z°{t)),
(15.26)
where z° is the solution of the estimator equation (15.7) written as dz = Azdt + f(t, rj(t, z))dt + GdV, z(0) = x 0 .
(15.27)
Therefore fj provides the feedback control. Thus theorem 1 can be restated as follows. T h e o r e m 2 Suppose U is compact and convex, and the Hamiltonian M is continuous on U and strictly convex. Then, if the equation (15.25) has a solution (j) e C 1 ' 2 , the optimal control u° is given by u°(t) = r}(t,z(t)), where z is the solution of equation (15.27). With the development of generalized solutions including weak solutions, these assumptions are no longer necessary. In fact now we have the concept of viscosity solution, most appropriate for the Hamilton-Jacobi-Bellman equation see [1],[38] and the references therein. We will have some more comments on the question of solutions of HJB equation later. Now we consider the genuine partially observed control problem and refer to this as (P2). This is the same problem as above with the exception that now the objective functional is given by the expression (15.3a) instead of (15.3b). Our objective here is to show that this partially observed problem can be converted to a fully observed one as treated above. Subtracting equation (15.4) from equation (15.1) and using equation (15.2), one can easily verify that the difference (x — x) = e satisfies the differential equation d(x - x) = (A - TH)(x - x)dt + adW - Ta0dV, x(0) — x(0) =
XQ
—
XQ,
(15.28)
Partially Observed Control
210
where T = KH' RQ1. Recall that XQ is Gaussian with mean x 0 and covariance P0 = K0. Prom equation (15.28) it follows that the conditional probability law of x(t) given x(t) is Gaussian. Precisely, the probability law of x(t), given that x(t) = C, is Gaussian with mean £ and covariance K(t) which is also the covariance of the error e(t). Let 9{t, x, C) =
,„ * exp - {{^(K^it^x U (27r)^VdetX(t) 2A denote the corresponding density. Hence for fixed u yK
W
E{Z(t,x(t),u)\x(t)=C>}
= I
e(t,x,u)g(t,x,C)dx
- C), x - C)} VA
= itt,(;,u).
(15.29)
In view of this, the objective functional (15.3a) can be written as

J(u) = E\left\{\int_0^T \ell(t, x(t), u(t))\,dt\right\} = E\left\{\int_0^T E\{\ell(t, x(t), u(t))\mid \hat x(t)\}\,dt\right\} = E\left\{\int_0^T \tilde\ell(t, \hat x(t), u(t))\,dt\right\}.    (15.30)
This shows that the original partially observed control problem associated with (15.1), (15.2) and (15.3a) has been converted to a fully observed problem described by

d\hat x = A\hat x\,dt + f\,dt + G\,d\nu, \quad \hat x(0) = \bar x_0, \qquad G = KH'R_0^{-1}\sigma_0,    (15.31)
with the objective functional given by

\tilde J(u) = E\int_0^T \tilde\ell(t, \hat x(t), u(t))\,dt.    (15.32)

Thus the solution of this control problem is again given by the solution of the HJB equation (15.25), where in the definition of the Hamiltonian M given in (15.24), \ell is replaced by \tilde\ell. Under a number of very strong assumptions on the parameters, Wonham proved that the optimal control actually lies in a subset \mathcal U \subset U_{ad}. This is the celebrated separation theorem originally proved by Wonham. For further details see [75].

LQGR Problem. As an immediate application of the above result, we consider the so called linear quadratic Gaussian regulator problem. The system is given by

dx(t) = A(t)x(t)\,dt + B(t)u(t)\,dt + \sigma(t)\,dW(t), \quad x(0) = x_0,    (15.33)
with the same measurement dynamics as described by equation (15.2). The cost functional is given by

J(u) = (1/2)\,E\int_0^T \{(N_0 x, x) + (Nu, u)\}\,dt,    (15.34)

where N_0 \in M^+(n \times n) and N \in M^+(d \times d) are positive definite and U = R^d. We call this problem (P3). Note that

E(N_0 x, x) = \mathrm{Tr}(N_0 K) + (N_0 \hat x, \hat x).
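This identity can be checked numerically: for a Gaussian vector x with mean \bar x and covariance K, E(N_0 x, x) = \mathrm{Tr}(N_0 K) + (N_0\bar x, \bar x). The mean, covariance and weight matrix below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: mean, covariance and a positive definite weight N0.
xbar = np.array([1.0, -2.0])
K = np.array([[2.0, 0.5], [0.5, 1.0]])
N0 = np.array([[3.0, 1.0], [1.0, 2.0]])

# Monte Carlo estimate of E (N0 x, x) for x ~ N(xbar, K).
x = rng.multivariate_normal(xbar, K, size=200_000)
mc = np.mean(np.einsum("ij,jk,ik->i", x, N0, x))

# Closed form: Tr(N0 K) + (N0 xbar, xbar).
exact = np.trace(N0 @ K) + xbar @ N0 @ xbar

print(mc, exact)  # the two agree to Monte Carlo accuracy
```

The `einsum` contraction evaluates the quadratic form x'N_0x row by row over the sample.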
Thus the cost functional (15.34) takes the form

J(u) = (1/2)\int_0^T \mathrm{Tr}(N_0 K)\,ds + (1/2)\,E\int_0^T \{(N_0\hat x, \hat x) + (Nu, u)\}\,ds,    (15.35)

and hence

\tilde\ell(t, z, u) = (1/2)\,\mathrm{Tr}(N_0 K) + (1/2)\{(N_0 z, z) + (Nu, u)\}.
The Hamiltonian is given by

M(t, z, u, q) = (q, Az + Bu) + (1/2)\,\mathrm{Tr}(N_0 K) + (1/2)\{(N_0 z, z) + (Nu, u)\}.    (15.36)

Since U = R^d, minimizing the Hamiltonian one obtains u = -N^{-1}B'q. Substituting this in the HJB equation (15.25) we have

(\partial/\partial t)\phi + (1/2)\,\mathrm{Tr}((D^2\phi)GRG') + M(t, x, -N^{-1}B'D\phi, D\phi) = 0, \quad \phi(T, x) = 0.    (15.37)

Since the cost integrand is quadratic, for suitable \{P, p, r\}, the value function \phi has the following form:

\phi(t, z) = (1/2)(P(t)z, z) + (p(t), z) + r(t).    (15.38)
Substituting this in equation (15.37) and equating quadratic, linear and constant terms to zero individually, one obtains for \{P, p, r\} the following set of equations:

\dot P + PA + A'P + N_0 - PBN^{-1}B'P = 0, \quad P(T) = 0,
\dot p + (A' - PBN^{-1}B')p = 0, \quad p(T) = 0,
\dot r + (1/2)\,\mathrm{Tr}(PGRG' + N_0 K) - (1/2)(BN^{-1}B'p, p) = 0, \quad r(T) = 0.    (15.39)
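The first (Riccati) equation of (15.39) can be integrated numerically backward from P(T) = 0. A minimal sketch with explicit Euler stepping; all matrices below are illustrative stand-ins, not taken from the text.

```python
import numpy as np

# Illustrative time-invariant data.
A  = np.array([[0.0, 1.0], [-1.0, -0.5]])
B  = np.array([[0.0], [1.0]])
N0 = np.eye(2)          # state weight
N  = np.array([[1.0]])  # control weight
T, steps = 5.0, 5000
dt = T / steps

# Sweep backward in time for  Pdot + PA + A'P + N0 - P B N^{-1} B' P = 0, P(T) = 0:
# P(t - dt) = P(t) - dt * Pdot(t), with Pdot = -(PA + A'P + N0 - P B N^{-1} B' P).
P = np.zeros((2, 2))
for _ in range(steps):
    Pdot = -(P @ A + A.T @ P + N0 - P @ B @ np.linalg.inv(N) @ B.T @ P)
    P = P - dt * Pdot

# Feedback gain of Theorem 3: u = -N^{-1} B' P x_hat.
gain = -np.linalg.inv(N) @ B.T @ P
print(P, gain)
```

At t = 0 the sweep yields a symmetric positive definite P, from which the LQG gain follows directly.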
The terminal conditions follow from equation (15.37). Being homogeneous with zero terminal condition, the second equation implies that p(t) = 0 for all t. Finally it follows from the third equation of (15.39) and equation (15.38) that the optimal cost is given by

J(u^0) = \phi(0, \bar x_0) = (1/2)(P(0)\bar x_0, \bar x_0) + (1/2)\int_0^T \mathrm{Tr}\{P(s)GRG'(s) + N_0(s)K(s)\}\,ds.    (15.40)
Thus we have the following result.

Theorem 3 The solution for the linear quadratic Gaussian regulator problem (15.33), (15.2) and (15.34) is given by the following set of equations:

\dot P + PA + A'P + N_0 - PBN^{-1}B'P = 0, \quad P(T) = 0,
u^0 = -N^{-1}B'P\hat x,
J(u^0) = (1/2)(P(0)\bar x_0, \bar x_0) + (1/2)\int_0^T \mathrm{Tr}\{P(s)GRG'(s) + N_0(s)K(s)\}\,ds,    (15.41)
where K is the solution of equation (15.5).

15.3 Linear Systems with Dynamic Observation

The system is governed by the following set of equations

dx = (Ax + Dy)\,dt + f(t, u(t))\,dt + \sigma\,dW, \quad x(0) = x_0,
dy = (Hx + Cy)\,dt + \sigma_0\,dV, \quad y(0) = 0,    (15.42)

with the objective functional given by (15.3a). Let us call this problem (P4). Since the control is \mathcal F_t^y measurable, f is also \mathcal F_t^y adapted. Hence this is similar to the model B of chapter 4. According to theorem 2 of Chapter 4, the optimal filter is given by

d\hat x = A\hat x\,dt + (f + Dy(t))\,dt + \Gamma(dy - Cy\,dt - H\hat x\,dt), \quad \hat x(0) = \bar x_0,    (15.43)

where \Gamma = KH'R_0^{-1}, with K being the solution of equation (15.5). Again, introducing the innovation process \nu given by

d\nu(t) = \sigma_0^{-1}(dy(t) - (Cy(t) + H\hat x(t))\,dt),    (15.44)
we can write the estimator equation as

d\hat x = A\hat x\,dt + (f + Dy(t))\,dt + \Gamma\sigma_0\,d\nu = A\hat x\,dt + (f + Dy)\,dt + G(t)\,d\nu, \quad \hat x(0) = \bar x_0,    (15.45)
where G is defined by (15.7b). Again one can easily verify that \nu is an \mathcal F_t^y Brownian motion with incremental covariance given by R. This problem is very similar to the partially observed problem (P2) treated in the previous section. Define the Hamiltonian M by

M(t, x, y, u, q) = (q, Ax + f(t, u) + D(t)y) + \ell(t, x, u),    (15.46)

for (t, x, y, u, q) \in I \times R^n \times R^m \times U \times R^n, and the mapping \eta: I \times R^n \times R^m \times R^n \to U by

\inf_{u \in U} M(t, x, y, u, q) = M(t, x, y, \eta(t, x, y, q), q).

Thus we have proved the following result.

Theorem 4 The necessary condition of optimality for the problem (P4) is that the HJB equation given by

(\partial/\partial t)\phi + (1/2)\,\mathrm{Tr}((D^2\phi)GRG') + M(t, x, y(t), \eta(t, x, y(t), D\phi), D\phi) = 0, \quad \phi(T, x) = 0,    (15.47)

has a classical solution.

Note that the coefficients of equation (15.47) are dependent on the observed data. In fact one may also allow the cost integrand to depend on the observation, reflecting the measurement cost. The reader is encouraged to construct a feedback regulator as in theorem 3.

15.4 Fully Observed Nonlinear Systems

In this section we consider an optimal control problem for fully observed nonlinear systems. In general the system is governed by the following SDE:

dx = b(x, u)\,dt + \sigma(x, u)\,dW, \quad t \ge 0, \quad x(0) = x_0.    (15.48)
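A sample path of a controlled diffusion of the form (15.48), under a state feedback u(t) = v(t, x(t)), can be generated by the Euler-Maruyama scheme. The scalar drift, diffusion and feedback law below are illustrative stand-ins, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative scalar coefficients and a saturated linear feedback into U = [-1, 1].
b     = lambda x, u: u * x - 0.1 * x**3        # drift b(x, u)
sigma = lambda x, u: 0.3 * np.sqrt(1 + x**2)   # diffusion sigma(x, u)
v     = lambda t, x: np.clip(-x, -1.0, 1.0)    # feedback law v(t, x) with values in U

T, steps = 1.0, 1000
dt = T / steps
x = np.empty(steps + 1)
x[0] = 0.5
for k in range(steps):
    u = v(k * dt, x[k])
    dW = rng.normal(scale=np.sqrt(dt))         # Brownian increment
    x[k + 1] = x[k] + b(x[k], u) * dt + sigma(x[k], u) * dW

print(x[-1])
```

The same loop, repeated over many independent paths, gives Monte Carlo estimates of cost functionals such as (15.49).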
Let U_{ad} denote the class of admissible controls. This consists of the class of measurable functions defined on I, taking values from a compact convex subset U \subset R^d, adapted to at most the current information about x(t). That is, u(t) has the representation u(t) = v(t, x(t)) for some Borel measurable function v: I \times R^n \to U. We assume that for any such admissible control, the system (15.48) has a unique solution \{x(t), t \in I\} having continuous sample paths and bounded second moments (see chapter 2). The objective functional given by

J(u) = E\int_0^T \ell(t, x, u)\,dt    (15.49)

is to be minimized over the class U_{ad}. Let \phi denote the value function given by

\phi(t, x) = \inf\left\{E\left\{\int_t^T \ell(s, x(s), u(s))\,ds \mid x(t) = x\right\}, \; u \in U_{ad}\right\}.

Again following Bellman's principle of optimality as in section 15.2, one can verify (formally) that \phi satisfies the HJB equation

(\partial/\partial t)\phi + \inf_{u \in U}\left\{(D\phi, b(x, u)) + \ell(t, x, u) + (1/2)\,\mathrm{Tr}(D^2\phi\, a(x, u))\right\} = 0, \quad \phi(T, x) = 0,    (15.50)

where a denotes the diffusion matrix given by a(x, u) = \sigma(x, u)Q\sigma'(x, u), with Q being the incremental covariance of the Wiener process W. This is a quasilinear partial differential equation, usually a very difficult problem in the field of PDE. The question of existence, uniqueness and regularity properties of solutions of such equations is far outside the scope of this book. For further discussion see the concluding remarks (comments 4, 5) at the end of this chapter. Instead we shall treat the problem as an optimal control problem of the forward or the backward Kolmogorov equation driven by vector valued controls. Let U_{ad} denote the class of admissible controls. For u \in U_{ad} the backward Kolmogorov operator is given by

A(u)\phi = (1/2)\,\mathrm{Tr}(D^2\phi\, a(x, u)) + (D\phi, b(x, u)).
(15.51)
The forward operator, as defined by equation (14.7) of chapter 14, or equivalently its adjoint, is given by

A^*(u)\phi = (1/2)\sum_{i,j=1}^n D_iD_j(a_{ij}\phi) - \sum_{i=1}^n D_i(b_i\phi),
(15.52)
where \tilde b_i = -b_i + \sum_j D_j(a_{ij}). Note that the last expression is written in the divergence form. Let u \in U_{ad} be a fixed control and P_t^u the probability measure induced by the solution process of equation (15.48) at time t \ge 0, corresponding to the initial probability measure P_0 induced by the initial state x_0. Assuming that P_0 has a density p_0, one can demonstrate, under some suitable assumptions on the coefficients, that P_t^u has a density p_t^u and that it is given by the solution of the forward Kolmogorov equation (see chapter 2),

\frac{\partial p}{\partial t} = A^*(u)p, \quad p(0) = p_0.
(15.53)
Again under certain suitable assumptions, given that p_0 \in H = L^2(R^n), we also have p_t^u \in H. Hence the cost functional given by (15.49) can be written as

J(u) = \int_0^T \ell(t, p_t^u, u)\,dt = \int_0^T \left\{\int_{R^n} \ell(t, \xi, u)\,p_t^u(\xi)\,d\xi\right\}dt = \int_0^T (\ell(t, \cdot, u), p_t^u(\cdot))_H\,dt,    (15.54)
where we have used the standard notation for scalar products in H,
The problem is to find a control that minimizes (15.54) subject to the dynamics (15.53). Thus we have the following result.

Lemma 5 The stochastic control problem (15.48)-(15.49) in R^n is equivalent to the deterministic control problem (15.53)-(15.54) in the infinite dimensional Hilbert space H.

According to this result it suffices to develop necessary conditions of optimality for the problem (15.53)-(15.54). We have seen in chapter 14 that, under the assumptions of Lemma 1 (see (A1) and (A2) of equation 14.11, chapter 14), for each fixed u \in U, the operator A^*(u) is the generator of a C_0-semigroup in H. Here we assume that, for each u \in U_{ad}, A^*(u(t)) generates a strongly continuous evolution operator in H. Thus corresponding to each p_0 \in H, equation (15.53) has a unique solution p^u \in L^2(I, V) \cap C(I, H). In fact this is a special case of the general theory of differential equations involving coercive operators. We present here for easy reference the following abstract result.
Lemma 6 Consider the evolution equation

\dot z = A(t)z + f, \quad z(0) = z_0,    (15.55)

and suppose that there exist constants \{c > 0, \lambda > 0, \alpha > 0\} such that for all t \ge 0 the operator A(t) satisfies the following estimates:

-\langle A(t)v, v\rangle_{V^*,V} + \lambda|v|_H^2 \ge \alpha\|v\|_V^2 \quad \forall v \in V,
|\langle A(t)v, w\rangle_{V^*,V}| \le c\,\|v\|_V\,\|w\|_V \quad \forall v, w \in V.
Then for each z_0 \in H and f \in L^2(I, V^*), equation (15.55) has a unique solution z \in L^2(I, V) \cap C(I, H). Further, the map f \to z is a continuous linear map from L^2(I, V^*) to L^2(I, V). For reference, see [2], [3].

In order to develop necessary conditions of optimality we need some regularity assumptions on the coefficients of the operator A^*(u). We deliberately avoid rigorous justification of all the steps, since this would carry us far from the major objective of this book. Further, we may assume that an optimal control exists. For details on this question the interested reader is referred to the literature [3],[4],[5],[6],[20],[65],[66]. We also assume that the parameters b, \sigma are once continuously differentiable on R^n \times U with the derivatives being bounded. Let u^0, u \in U_{ad} and suppose that u^0 is the optimal control. Since U is closed and convex, u^\varepsilon = u^0 + \varepsilon(u - u^0) \in U_{ad} for 0 \le \varepsilon \le 1, and hence

J(u^0 + \varepsilon(u - u^0)) \ge J(u^0).    (15.57)

Let p^\varepsilon, p^0 denote the solutions of equation (15.53) corresponding to the controls u^\varepsilon and u^0 respectively. Subtracting equation (15.53) corresponding to the control u^0 from that corresponding to control u^\varepsilon, dividing by \varepsilon and letting \varepsilon \to 0, one can verify that

q^\varepsilon = (1/\varepsilon)(p^\varepsilon - p^0) \to q \quad \text{in } L^2(I, V) \cap C(I, H),

where q satisfies the differential equation

\dot q = A^*(u^0)q + dA^*(u^0, u - u^0)p^0, \quad q(0) = 0.    (15.58)

Here dA^*(u^0, u - u^0) denotes the Gateaux differential of the operator valued function u \to A^*(u) at u^0 in the direction u - u^0. Note that, under the differentiability assumptions on the coefficients \{a, b\}, A^* \in C^1(U, \mathcal L(V, V^*)). Define the function f by f(t) = dA^*(u^0(t), u(t) - u^0(t))p^0(t). Hence one can verify
that, for p^0 \in L^2(I, V) \cap C(I, H), we have f \in L^2(I, V^*). Thus it follows from Lemma 6 that equation (15.58) has a unique solution q \in L^2(I, V) \cap C(I, H). From (15.57) and (15.54) it follows that

0 \le dJ(u^0, u - u^0) = \lim_{\varepsilon\downarrow 0}\left\{(1/\varepsilon)\left[J(u^0 + \varepsilon(u - u^0)) - J(u^0)\right]\right\} = \int_0^T \left\{\ell(t, q_t, u^0) + (\ell_u(t, p_t^0, u^0), u - u^0)\right\}dt,    (15.59)

where

\ell(t, q_t, u^0(t)) = \int_{R^n} \ell(t, x, u^0(t))\,q_t(x)\,dx = \langle \ell(t, \cdot, u^0(t)), q_t\rangle_{V^*,V},
\ell_u(t, p_t^0, u^0(t)) = \int_{R^n} \ell_u(t, x, u^0(t))\,p_t^0(x)\,dx = \langle \ell_u(t, \cdot, u^0(t)), p^0(t)\rangle_{V^*,V}.    (15.60)

Our notation above clearly suggests that we have used the assumption that, for each u \in U_{ad}, \ell(\cdot,\cdot,u), \ell_u(\cdot,\cdot,u) \in L^2(I, H). Hence L, given by L(q) = \int_0^T \ell(t, q_t, u^0)\,dt, is a continuous linear functional of q on L^2(I, V). On the other hand it follows from (15.58) and Lemma 6 that dA^*(u^0, u - u^0)p^0 \to q is a continuous linear map from L^2(I, V^*) to L^2(I, V). Hence there exists a \phi \in L^2(I, V) such that

L(q) = \int_0^T \ell(t, q_t, u^0)\,dt = \int_0^T \langle \phi(t), dA^*(u^0(t), u(t) - u^0(t))p^0(t)\rangle_{V,V^*}\,dt.    (15.61)

The function \phi is the adjoint variable, uniquely determined by the solution of the equation

-\frac{d}{dt}\phi = A(u^0)\phi + \ell(t, \cdot, u^0(t)), \quad t \in [0, T), \quad \phi(T) = 0.    (15.62)

Substituting (15.61) into (15.59) we conclude that, for all u \in U_{ad},

\int_0^T \left\{\langle \phi(t), dA^*(u^0(t), u(t) - u^0(t))p^0\rangle_{V,V^*} + (\ell_u(t, p^0, u^0), u - u^0)\right\}dt \ge 0.    (15.63)
Introduce the Hamiltonian

H(t, u, \eta, \zeta) = \langle \eta, A^*(u)\zeta\rangle_{V,V^*} + \langle \ell(t, \cdot, u), \zeta\rangle_{V^*,V} = \langle \eta, A^*(u)\zeta\rangle_{V,V^*} + \ell(t, \zeta, u), \quad (t, u, \eta, \zeta) \in I \times U \times V \times V.    (15.64)
Thus we have proved the following necessary conditions of optimality.

Theorem 7 Suppose the elements of \{a, b\} are C^1(R^n \times U) and further they satisfy the assumptions (A1) and (A2) of Lemma 1 (chapter 14) uniformly on R^n \times U. Then, in order that \{u^0, p^0\} \in U_{ad} \times L^2(I, V) be an optimal control-trajectory pair, it is necessary that there exists a \phi \in L^2(I, V) \cap C(I, H) such that

\int_0^T (H_u(t, u^0(t), \phi(t), p^0(t)), u(t) - u^0(t))\,dt \ge 0 \quad \forall u \in U_{ad},    (15.65)

where \phi is the unique solution of the adjoint equation

-(d/dt)\phi = A(u^0)\phi + \ell(t, \cdot, u^0), \quad \phi(T) = 0,    (15.66)

and p^0 is the solution of the forward Kolmogorov equation

(d/dt)p = A^*(u^0)p, \quad p(0) = p_0.    (15.67)

Comment 1 (Terminal Cost) If a terminal cost E\{F(x(T))\} is added to (15.49), then (15.54) must be modified by adding the term

\int_{R^n} F(x)\,p_T^u(x)\,dx.
This modifies the adjoint equation (15.66) by the nonzero terminal condition \phi(T)(\cdot) = F(\cdot). The integral is well defined for F \in V^*. For necessary conditions of optimality for jump Markov processes see [15].

15.5 Computational Methods

On the basis of the results of Theorem 7, we can develop two algorithms: one for open loop controls and one for feedback controls.

Algorithm A (Open Loop Control).

Step 1 Choose any u_0 \in U_{ad} for n = 0, and suppose at the nth stage u_n \in U_{ad} has been obtained.

Step 2 Use u_n in place of u^0 to solve equations (15.66) and (15.67), giving \phi_n and p_n.

Step 3 Use these to compute H_u(t, u_n(t), \phi_n(t), p_n(t)) = H_n(t) of inequality (15.65).
Step 4 Define u_{n+1} = u_n - \varepsilon H_n for \varepsilon > 0 suitably small, so that u_{n+1}(t) \in U for almost all t \in I.

Step 5 Compute J(u_{n+1}) = J(u_n) - \varepsilon\|H_n\|^2 + o(\varepsilon), for \varepsilon suitably small.

Step 6 Stop if a stopping criterion has been met; otherwise go to step 2 with the new control u_{n+1}.

Comment 2 In case, at any stage n and t \in I, u_n(t) \in \partial U, the boundary of U, and H_n(t) is also outwardly directed, one must set u_{n+1}(t) = u_n(t); otherwise follow step 4.

Algorithm B (Feedback Control) For feedback controls we define

G = G(t, x, u(t, x), \phi(t, x), p(t, x)) = p(t, x)A(u)\phi(t, x) + \ell(t, x, u)\,p(t, x).
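The iteration of Algorithm A can be sketched on a toy finite-dimensional surrogate in which the gradient H_n is available in closed form; the quadratic objective, target profile and admissible interval below are illustrative only, not part of the text.

```python
import numpy as np

# Toy surrogate for Algorithm A: minimize J(u) = (1/2) * sum((u(t) - target(t))^2) * dt
# over controls constrained to U = [-1, 1], on a uniform time grid.
steps, dt = 200, 0.01
t = np.linspace(0.0, 2.0 * np.pi, steps)
target = np.sin(t)                  # profile the iterates should approach

def grad_J(u):
    # Stand-in for H_n(t) = H_u(t, u_n(t), phi_n(t), p_n(t)) of Steps 2-3.
    return (u - target) * dt

u = np.zeros(steps)                 # Step 1: initial control u_0
eps = 0.5                           # step size of Step 4
for _ in range(5000):
    Hn = grad_J(u)                  # Steps 2-3 (closed form in this surrogate)
    u = np.clip(u - eps * Hn, -1.0, 1.0)   # Step 4, projected back onto U
    if np.linalg.norm(Hn) < 1e-10:         # Step 6: stopping criterion
        break

print(np.max(np.abs(u - target)))
```

In the actual algorithm each iteration would require solving the adjoint equation (15.66) and the forward equation (15.67) to evaluate H_n; here that evaluation is replaced by a closed-form gradient so the descent loop itself can be run.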
Using this expression we can rewrite (15.63) explicitly as a space-time integration in place of the abstract form (15.65), giving

\int_{I \times B_r} \langle G_u(t, x, u^0(t, x), p^0(t, x), \phi(t, x)), \; u(t, x) - u^0(t, x)\rangle_{R^d}\,dx\,dt \ge 0,    (15.68)

where B_r = \{x \in R^n : |x| < r\}, r < \infty. We state the necessary modifications of the preceding algorithm. For U_{ad} take all Borel measurable functions from I \times R^n to U. Replace H by G and H_u by G_u. For step 3, define G_n(t, x) = G_u(t, x, u_n(t, x), \phi_n(t, x), p_n(t, x)). For step 4, define u_{n+1}(t, x) = u_n(t, x) - \varepsilon G_n(t, x), keeping in mind comment 2. Before concluding this section we would like to mention that the control problem comprised of (15.53)-(15.54), involving the FKE (forward Kolmogorov equation), is equivalent to the following control problem involving the BKE (backward Kolmogorov equation):

-(\partial/\partial t)\phi^u = A(u)\phi^u + \ell(t, x, u), \quad \phi^u(T, x) = 0,
J(u) = \int_{R^n} \phi^u(0, x)\,p_0(x)\,dx \to \inf.
(15.69)
This follows from the Ito differential formula. The same set of necessary conditions, exactly as stated in Theorem 7, can be derived from the control problem (15.69). We invite the reader to verify this.

15.6 Partially Observed Nonlinear Systems

In this section we consider control of partially observed nonlinear systems.
Unfortunately the separation theory due to Wonham, as discussed in section 2, does not hold for nonlinear systems. Here we force a separation and consider the Zakai equation as the state equation, which is to be controlled optimally so as to minimize a certain cost functional. We consider the system

dx = b(x, u)\,dt + \sigma(x, u)\,dW, \quad x(0) = x_0,
dy = h(x)\,dt + \sigma_0(t)\,dV, \quad y(0) = 0.    (15.70)

Let \mathcal F_t^y = \sigma\{y(s), s \le t\} denote the least sigma algebra (completed) with respect to which y is measurable. Let U be a compact convex subset of R^d and U_{ad} = M(\mathcal F^y, U) the class of U-valued controls \{u(t), t \ge 0\} which are \mathcal F_t^y measurable. This is taken as the class of admissible controls. Let us consider an arbitrary but fixed control from this class, and consider the system (15.70) driven by this control. Then it follows from chapter 13 (see Theorem 2, equation 13.37B) that the unnormalized density of x(t) = x^u(t), relative to the sigma algebra \mathcal F_t^y, is given by the solution of the controlled Zakai equation

dp_t^u = A^*(u)p_t^u\,dt + p_t^u(R_0^{-1}h, dy), \quad p_0^u = p_0.    (15.71)
Since the conditional probability density of the state x^u(t), given the observation \{y(s), s \le t\}, is related to the Zakai density via \hat p_t^u = c(t)p_t^u, with c(t) = (1/\int p_t^u\,dx) being the normalizing factor, it is reasonable to consider the unnormalized cost functional

J(u) = E\int_0^T \ell(t, p_t^u, u(t))\,dt = E\int_0^T \left(\int_{R^n} \ell(t, x, u(t))\,p_t^u(x)\,dx\right)dt    (15.72)

in place of the true cost functional

J_T(u) = E\int_0^T \ell(t, \hat p_t^u, u(t))\,dt = E\int_0^T \left(\int_{R^n} \ell(t, x, u(t))\,\hat p_t^u(x)\,dx\right)dt.    (15.73)
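A one-dimensional finite-difference sketch of propagating an equation of the form (15.71) for a frozen control value: an explicit Euler step for A^* followed by the multiplicative observation update. All coefficients, and the synthetic observation increment, are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(2)

# Grid and illustrative coefficients: drift b(x), squared diffusion, observation h(x).
L, nx = 6.0, 241
x = np.linspace(-L, L, nx)
dx = x[1] - x[0]
b_x  = -x          # drift under a frozen control value
sig2 = 0.5         # sigma(x)^2, constant here
h_x  = x           # observation function h(x)
r0   = 1.0         # R0, scalar observation noise covariance

dt, steps = 1e-3, 500
p = np.exp(-0.5 * (x - 1.0) ** 2)     # unnormalized initial density
p /= p.sum() * dx

for _ in range(steps):
    # A* p = (1/2) sig2 p'' - (b p)'  by central differences (p pinned to 0 at the ends).
    flux = b_x * p
    Astar = np.zeros_like(p)
    Astar[1:-1] = (0.5 * sig2 * (p[2:] - 2.0 * p[1:-1] + p[:-2]) / dx**2
                   - (flux[2:] - flux[:-2]) / (2.0 * dx))
    # Synthetic observation increment dy ~ E[h] dt + noise (stand-in for real data).
    mean_h = h_x @ (p * dx) / (p.sum() * dx)
    dy = mean_h * dt + np.sqrt(r0 * dt) * rng.normal()
    p = p + dt * Astar + p * (h_x / r0) * dy   # Euler step plus multiplicative update
    p = np.maximum(p, 0.0)                     # crude positivity fix for the sketch

mass = p.sum() * dx
print(mass)
```

Normalizing p at any time gives the conditional density \hat p_t^u used in the true cost (15.73); the unnormalized mass itself carries likelihood information.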
Thus the original control problem of minimizing J_T over U_{ad}, subject to the dynamics (15.70), is approximately equivalent to the fully observed control problem of the Zakai equation (15.71) coupled with the cost functional given by (15.72). A necessary condition of optimality is meaningful only if existence of optimal controls is assured. We assume throughout that an optimal control exists. Readers interested in existence theory may refer to the literature [2],[3],[4],[37] for finite dimensional problems, and [6],[7] for infinite dimensional systems. Let L_2^y(I, V), L_2^y(I, H), L_2^y(I, V^*) denote the Hilbert spaces of \mathcal F_t^y-adapted random processes satisfying

E\int_I \|z\|_B^2\,dt < \infty, \quad \text{where } B = \{V, H, V^*\},
(15.74)
and C^y(I, H) \subset L_2^y(I, H) the Banach space of \mathcal F_t^y adapted H-valued processes having continuous sample paths with probability one. Here we are only concerned with the necessary conditions of optimality. Before embarking on this topic we note that, under the given assumptions, for each control u \in U_{ad}, the Zakai equation (15.71) has a unique solution p \in L_2^y(I, V) \cap C^y(I, H). This can be proved on the basis of the a priori estimate

E|p_t|_H^2 + 2\alpha E\int_0^t \|p_s\|_V^2\,ds \le |p_0|_H^2 + (2\lambda + \|h\|_\infty^2)\,E\int_0^t |p_s|_H^2\,ds \le |p_0|_H^2\exp\{(2\lambda + \|h\|_\infty^2)T\}, \quad \forall t \in I,    (15.75)

where \|h\|_\infty = \sup\{|h(x)|_{R^m}, x \in R^n\}. The first inequality follows from Ito's formula applied to the function f(p_t) = (1/2)|p_t|_H^2, and the second follows from the Gronwall inequality.

In section 15.4 we treated the forward Kolmogorov equation (15.53) as the state equation coupled with the cost functional (15.54), whereas here we consider the Zakai equation (15.71), a stochastic PDE, as the state equation and (15.72) as the cost functional. The former is a deterministic infinite dimensional problem obtained from a finite dimensional fully observed stochastic control problem, and the latter is a stochastic infinite dimensional problem obtained from a finite dimensional partially observed stochastic control problem. We follow the same procedure as in section 15.4. Let u^0 \in U_{ad} be the optimal control and u \in U_{ad} any other control. Then by convexity, u^\varepsilon = u^0 + \varepsilon(u - u^0) \in U_{ad}. Let p^\varepsilon and p^0 be the solutions of equation (15.71) corresponding to the controls u^\varepsilon and u^0 respectively. Then following the same steps as in section 15.4, one can verify that q given by
q = \lim_{\varepsilon\downarrow 0}\{(1/\varepsilon)(p^\varepsilon - p^0)\} \quad \text{in } L_2^y(I, V)

satisfies the SPDE

dq_t = A^*(u^0)q_t\,dt + dA^*(u^0, u - u^0)p^0\,dt + q_t(R_0^{-1}h, dy), \quad q_0 = 0,    (15.76)
and that the Gateaux differential of J satisfies the following inequality

dJ(u^0, u - u^0) = E\left\{\int_0^T \left[\ell(t, q_t, u^0) + (\ell_u(t, p_t^0, u^0), u - u^0)\right]dt\right\} \ge 0 \quad \forall u \in U_{ad}.    (15.77)
Again, for the SPDE

d\nu = A^*(u^0)\nu\,dt + g^0(t)\,dt + \nu(R_0^{-1}h, dy), \quad \nu(0) = 0,    (15.78)
one can show that the map g^0 \to \nu is a continuous linear map from L_2^y(I, V^*) to L_2^y(I, V). Hence, taking dA^*(u^0, u - u^0)p_t^0 for g^0(t) and q for \nu, it follows from the continuity argument that there exists a \phi \in L_2^y(I, V) such that

E\int_0^T \ell(t, q_t, u^0)\,dt = E\int_0^T \langle \phi(t), dA^*(u^0, u - u^0)p_t^0\rangle_{V,V^*}\,dt,    (15.79)

where \phi satisfies the backward stochastic PDE

-d\phi_t = A(u^0)\phi_t\,dt + \ell(t, \cdot, u^0)\,dt + \phi_t(R_0^{-1}h, dy), \quad \phi_T = 0.    (15.80)
Thus we have proved the following result.

Theorem 8 Suppose the assumptions of Theorem 7 hold, and that h is continuous and bounded from R^n to R^m. Then, in order that (u^0, p^0) \in U_{ad} \times L_2^y(I, V) be an optimal pair, it is necessary that there exists a \phi \in L_2^y(I, V) such that

E\left\{\int_0^T \langle H_u(t, u^0(t), \phi(t), p_t^0), u(t) - u^0(t)\rangle_{R^d}\,dt\right\} \ge 0 \quad \forall u \in U_{ad},    (15.81)
where \phi is the solution of equation (15.80), p^0 is the solution of equation (15.71) corresponding to the control u^0, and H is given by the expression (15.64). Hence the necessary conditions of optimality are given by (15.81), (15.80) and (15.71). This is a stochastic minimum principle which can be used to solve the LQGR problem, yielding the same set of equations as in theorem 2. For a more extensive study of partially observed control problems see the recent book due to Bensoussan [20], where many variants of the problem have been discussed. The necessary conditions given by theorem 8 are also useful for system identification, considered in the next chapter. However, their usefulness for computing optimal controls online (in real time) seems to be limited because of the adjoint equation, which is a backward stochastic differential equation. For engineering applications, design of suboptimal controls using neural networks may be much easier.

Comment 3 (Terminal Cost) If a terminal cost E\{\langle F, p_T^u\rangle\} is added to the running cost functional (15.72), then the adjoint equation (15.80) must be
modified by adding the nonzero terminal condition \phi_T = F(\cdot).

15.7 Some Examples and Discussion

Example 1 This example illustrates the fully observed problem of section 15.4:

dx = ux\,dt + \sqrt{1 + x^2}\,dw, \quad x(0) = x_0, \quad U = [-1, +1],
J(u) = (1/2)\,E\int_0^T \{\lambda_1(x - m)^2 + \lambda_2 u^2\}\,dt.    (15.82)
/ (xDcj) p + \2pu°)(u
- u°)dxdt > 0 \/u € Uad,
JR
where Uad denotes the class of all Borel measurable functions on I x R with values in U. ^From this one can easily verify that the optimal feedback control has the bang-bang structure given by the implicit relation u° = -sign (xDcj) 4- A 2 u°). If there is no control constraint, the set U = R and in this case u° = -(l/A2)x£>>, and the associated adjoint equation is given by -(d/dt)(/) = (1/2)(1 + x2)D24> - (l/2)x2(Dcf))2 + (l/2)Ai(x - m) 2 ,
teI,xeR
0(T, x) = 0, x e R. Example 2 Here we consider a partially observed problem. Population control by immigration can be modeled as dN = (ciN 4- c2N2 + uN)dt + /3dw, N(o) = N0 dy = h(N)dt 4- rdv, j/(0) = 0,
(15.83)
where u denotes (per capita) immigration rate and w and v are standard Wiener processes. The observable y is, for example, a vector of production levels of m distinct sectors of the economy which are influenced by immigra tion. The problem is to minimize the difference between a desired population
Partially Observed Control
224
level and the one actually attained by a specified time period [0,T], The cost functional is given by J{u) = (1/2)E{
[ \!U2dt + X2(N{T) Jo
Nd)2}
The admissible controls are .T7^-adapted processes taking values from the set U = [0, K] 0 < K < 1. The first term may represent cost of administration. Applying the necessary conditions of optimality given by Theorem 8, one can verify that the optimal control is of the form u° = (/s/2)(l - sign{xD(j))), where <j) must satisfy the equation -dcj) = Ul/2)02D2(t)
+ (cix + c2x2 + u°x)D(f> + (l/2)Ai(u°) 2 left (l/r2)(j)h(x)dy,
+ 0(T,x) = (l/2)A 2 (x-iV < i ) 2 , and p° is the solution of equation
dp = A*(u°)pdt +
(l/r2)phdy.
Comment 4 (HJB equation) Fully observed problems can be solved using the HJB equation (15.50). Including also the terminal cost, one can rewrite this equation in the abstract form

(\partial/\partial t)\phi + \mathcal A(\phi) = 0, \quad t \in [0, T), \qquad \phi(T, x) = \phi_0(x), \quad x \in R^n,    (15.84)
where \phi_0 denotes the terminal penalty. In general the HJB operator \mathcal A is strongly nonlinear and equation (15.84) does not have a classical solution, that is, \phi does not belong to the class C^{1,2}(I \times R^n). In recent years the concept of viscosity solutions, which are merely continuous functions on I \times R^n, has broadened the scope of the HJB equation as a powerful tool for solving fully observed control problems. However, the literature on this topic is mostly devoted to the question of existence of solutions (optimal feedback controls); see [38] and [1] and the references therein. The assumptions imposed on the basic coefficients are rather strong. The computational burden for multidimensional
problems is heavy, though with the development of supercomputers this may become easier in the near future. In the meantime, control theory must be further developed both for fully observed and partially observed problems, relaxing as far as possible many of the stringent assumptions used in this book and the current literature. Fully observed problems are partially satisfactory, though for application one must develop feasible computational techniques for constructing viscosity solutions and state feedback controls. These are interesting doctoral thesis projects. We conclude this comment with an example from finance.

Example 3 (Finance) There are two investment tools, one risk free with small mean growth rate, described by the equation dp_s = a_s p_s\,dt, and the other risky, such as a mutual fund, with large mean growth rate a_r > a_s, described by the equation dp_r = p_r(a_r\,dt + \sigma_r\,dw), where \sigma_r denotes its volatility and w is the standard Brownian motion. This is the price dynamics of the two assets. An individual wants to invest his wealth \xi using these instruments so as to maximize his return. This is a question of optimal portfolio selection. For details see [89]. Letting u \in [0, 1] denote the fraction invested in the risky asset, and c \in [0, \infty) the consumption rate, the growth dynamics of his wealth is given by

d\xi = (1 - u)a_s\xi\,dt + u\xi(a_r\,dt + \sigma_r\,dw) - c\,dt = (a_s\xi + (a_r - a_s)u\xi - c)\,dt + u\xi\sigma_r\,dw, \quad \xi(0) = x.    (15.85)

The return functional is given by the discounted revenue
J(u, c, x) = E_x\left\{\int_0^{\tau_x} e^{-\beta t}U(c(t))\,dt + e^{-\beta\tau_x}P\right\},    (15.86)

where x is the initial wealth, U is the utility function, \beta is the discount rate, P is the bankruptcy value of the assets and \tau_x is the stopping time to bankruptcy. Defining the value function V(x) = \sup\{J(u, c, x), u \in [0, 1], c \ge 0\}, one can verify that the stationary HJB equation, associated with the HJB evolution equation (15.84), is given by

\beta V(x) = \sup\left\{(1/2)\sigma_r^2 x^2 u^2 V_{xx} + (a_s x + (a_r - a_s)xu - c)V_x + U(c)\right\},
V(0) = P. This is a highly nonlinear problem. Ignoring the constraints, one can verify that the optimal policy is given by

u = G_1(x, V_x, V_{xx}) = -\left[\frac{a_r - a_s}{\sigma_r^2}\right]\frac{V_x}{xV_{xx}}, \qquad c = G_2(x, V_x) = (U')^{-1}(V_x),    (15.87)
where U' denotes the gradient of the utility function U. In the presence of constraints the optimal policy is given by

u = \max\{G_1 \wedge 1, 0\}, \qquad c = \max\{G_2, 0\}.    (15.88)
Substituting this in the equation one obtains a highly nonlinear problem.

Comment 5 (HJB equation for Partially Observed Problems) Though partially observed control problems are the natural settings for most of the problems arising in science, engineering and economics, the subject is far from satisfactory. Once the partially observed control problem described by the equations (15.70) and (15.73) is transformed into the control problem described by the equations (15.71) and (15.72), one may again formally write an HJB equation. We rewrite (15.71) and (15.72) as

dp = A^*(u)p\,dt + F(p)\,dz, \quad p_0 = p_0, \qquad J(u) = E\int_0^T (\ell^u(s), p_s)_H\,ds.    (15.89)

In this case, formally, the HJB equation is given by

(\partial/\partial t)V + \mathcal G(V) = 0, \quad t \in [0, T), \quad \nu \in D(A^*) \subset H, \qquad V(T, \nu) = 0,    (15.90)

where the HJB operator \mathcal G is given by

\mathcal G(V) = \inf\left\{\langle A^*(u)\nu, DV\rangle + (1/2)\,\mathrm{Tr}((D^2 V)F(\nu)F^*(\nu)) + (\ell^u, \nu)_H, \; u \in U\right\}.
One of the advantages of the HJB formulation is that it eliminates the necessity of the backward stochastic equation (15.80). The question of existence of viscosity solutions of partial differential equations of the form (15.90) on I \times H is far more difficult and is rarely considered in the literature. Interested readers may see [85],[86] and the references therein. These results often give only existence without uniqueness. In case uniqueness holds, one can construct the optimal feedback control law. The control law so obtained is a function of time and the density. Since a physical control law must be a functional of the naturally observed process y, the practical usefulness of a feedback control law depending on the conditional density is seriously questionable. For applications, one can rigorously formulate partially observed control problems using neural networks placed in the feedback loop, thereby reducing the complexity of designing optimal feedback control laws to the problem of optimum selection of parameters or weights. These are current problems of research interest.
CHAPTER 16
SYSTEM IDENTIFICATION
16.1 Introduction

The problem of system identification is fundamental in all physical and engineering sciences. Very often the governing equations are known but the parameters that enter the coefficients are either not known or known only approximately. For example, in vibration problems the form of the equations for beams, plates and shells is known, but parameters such as the modulus of rigidity, coefficients of elasticity etc. may not be precisely known. In reaction-diffusion problems the reaction rates and the diffusion coefficients may be unknown. Similarly in thermal problems, the heat conduction, convection and radiation parameters may not be known or known only approximately. In quantum mechanics, the potential function in the Schrodinger equation and even the masses of the interacting particles may not be precisely known. In fact very often mathematical models are used to describe the evolution of physical, chemical, biological, or even social systems based on fundamental laws of physics and scientific intuition about the infinitesimal (microscopic) relationships between the various interacting entities that produce the macroscopic behavior. For example, in the modeling of economic and social systems one introduces a large number of parameters representing the infinitesimal interactions between the various entities that contribute to the temporal evolution of the process. Sometimes even the mathematical models are not able to capture all the intricacies of the natural system. In the process many natural parameters are introduced which may be determined either theoretically, if possible, or by experiments and observation. We present in this chapter some general methodologies for identification of such system parameters.

16.2 Fully Observed Linear and Nonlinear Systems

Consider the system

dx = b(t, x, \alpha)\,dt + \sigma(t, x)\,dW, \quad x(0) = x_0,
(16.1)
where \alpha \in R^N is an unknown vector of parameters that determines the dynamics of the system (16.1). Throughout the remainder of this section we assume that W is a standard Brownian motion and that the functions b and \sigma satisfy the standard assumptions (uniformly with respect to \alpha), so that for each fixed \alpha \in R^N the system (16.1) has a unique strong or weak solution which is continuous with probability one, having finite second moments whenever x_0 has a finite second moment. The basic identification problem is to determine the true parameter \alpha^* from the observation \{x(t), t \in I = [0, T]\}. A very classical and powerful technique for the solution of this problem is the maximum likelihood estimate. For the likelihood function, we use the Radon-Nikodym derivative. Let C denote the Banach space of continuous functions on I with values in R^n, B(C) the Borel sigma algebra on C containing all cylinder sets, and \{B_t(C), t \ge 0\} \subset B(C) an increasing family of (completed) subsigma algebras of the sigma algebra B(C). Let \mu^\alpha denote the measure on C induced by the process x^\alpha, the unique solution of system (16.1) corresponding to the parameter \alpha. Let \mu^0 denote the measure induced by the unique solution corresponding to the system given by

dx = \sigma(t, x)\,dW, \quad x(0) = x_0.
(16.2)
Then we know that µ^α ≺ µ⁰, and the corresponding RND is given by

ρ^α(x) = exp{ ∫₀ᵀ < R⁻¹(s, x(s)) b^α, dx(s) >_{Rⁿ} − (1/2) ∫₀ᵀ < R⁻¹(s, x(s)) b^α, b^α >_{Rⁿ} ds },  (16.3)
where R = σσ' and b^α = b^α(s, x(s)) = b(s, x(s), α). That is, dµ^α = ρ^α dµ⁰. A sufficient condition for µ^α to be a probability measure is that σ⁻¹b^α is bounded. Let Γ ∈ B(C) denote any event. Then the probability of this event is given by
µ^α(Γ) = ∫_Γ ρ^α(x) dµ⁰(x).
(16.4)
Suppose that, as a result of an experiment, we have observed the trajectory {ξ(t), t ∈ I} ∈ C. Let Γ ∈ B(C) be a set such that ξ ∈ Γ. Then clearly
it follows from (16.4) that the probability of such a realization is maximal if α coincides with the true parameter α*. Since µ⁰ is independent of α, it follows from the above argument that, given the observation x = ξ, one should find the α ∈ R^N that maximizes the functional α → ρ^α(ξ). This is the justification for using maximum likelihood estimates. It follows from this and the expression (16.3) that we can choose J(α) = ρ^α(ξ) for the likelihood function, or J(α) = log ρ^α(ξ). Hence the best parameter is obtained by maximizing this functional. Thus, for J(α) one may choose

J(α) = ∫₀ᵀ < R⁻¹(s, x(s)) b^α, dx(s) >_{Rⁿ} − (1/2) ∫₀ᵀ < R⁻¹(s, x(s)) b^α, b^α >_{Rⁿ} ds.  (16.5)

In many practical problems the drift b^α may be given by

b^α(t, x) = Σ_{i=1}^N α_i b_i(t, x),  (16.6)
where {b_i = b_i(t, x), i = 1, ..., N} are n-vector valued functions satisfying the standard Lipschitz and growth conditions. Clearly, it follows from this assumption that b^α can be written as

b^α(t, x) = B(t, x)α,  (16.7)

where B : I × Rⁿ → M(n × N). In this case the functional (16.5) can be rewritten as

J(α) = (α, η_T)_{R^N} − (1/2)(Γ_T α, α)_{R^N},  (16.8)
where

η_T = ∫₀ᵀ B'(s, x(s)) R⁻¹(s, x(s)) dx(s) ∈ R^N,

Γ_T = ∫₀ᵀ B'(s, x(s)) R⁻¹(s, x(s)) B(s, x(s)) ds ∈ M(N × N).  (16.9)
Given the history {x(t), t ∈ I}, the variables η_T and Γ_T are known, being determined by equation (16.9). In order that the functional (16.8) have a maximum,
it is required that the matrix Γ_T be positive definite for some finite time T. In general we may consider the functional

J_t(α) = (α, η_t) − (1/2)(Γ_t α, α), t > 0.
(16.10)
Thus if the parameter space is the whole of R^N, and if for a history of length t we have Γ_t > 0, the problem has a solution, given simply by maximizing J_t(α) over R^N. It follows that the maximum likelihood estimate of the parameter α at time t, provided Γ_t > 0, is given by

α°_t = Γ_t⁻¹ η_t.  (16.11)
If t° is the first time for which Γ_{t°} > 0, then Γ_t is positive for all t > t°. Thus the history F_{t°} provides the minimal statistic for estimation of the parameter. Suppose that the true parameter is α*. We show that the estimate (16.11) is consistent in the sense that

α°_t → α* as t → ∞.
(16.12)
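For the scalar model dx = αx dt + dW (a hypothetical instance, not from the text, with B(x) = x and R = 1), the estimate (16.11) reduces to α°_t = ∫₀ᵗ x dx / ∫₀ᵗ x² ds, which is easily exercised on a simulated path:

```python
import math, random

random.seed(0)

def mle_drift(alpha_true=-1.0, T=200.0, dt=1e-3, x0=1.0):
    """Euler-Maruyama path of dx = alpha*x dt + dW, then the maximum
    likelihood estimate (16.11): alpha_hat = eta_t / Gamma_t with
    eta_t = int x dx and Gamma_t = int x^2 ds (here B(x) = x, R = 1)."""
    n = int(T / dt)
    x = x0
    eta = 0.0    # eta_t   = int_0^t x(s) dx(s)
    gam = 0.0    # Gamma_t = int_0^t x(s)^2 ds
    for _ in range(n):
        dw = random.gauss(0.0, math.sqrt(dt))
        dx = alpha_true * x * dt + dw
        eta += x * dx
        gam += x * x * dt
        x += dx
    return eta / gam

est = mle_drift()
print(est)  # settles near the true drift coefficient -1.0
```

With the horizon T = 200 the standard deviation of the estimate is roughly 1/√Γ_T ≈ 0.1, illustrating the consistency statement (16.12); the numbers are illustrative only.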
Note that W̃ defined by

W̃(t) = W(t) − ∫₀ᵗ σ⁻¹(s, x(s)) B(s, x(s)) α* ds, t ≥ 0,  (16.13)

is a Brownian motion on the measure space (C, B(C), µ*), where µ* is the measure corresponding to the true parameter α*. Hence x is a weak solution of the equation dx = Bα* dt + σdW̃. Substituting this in the expression for η_t given by (16.9) for T = t, we have

η_t = Γ_t α* + ∫₀ᵗ (σ⁻¹B)' dW̃.

This combined with (16.11) gives

α°_t − α* = Γ_t⁻¹ ∫₀ᵗ (σ⁻¹B)' dW̃.  (16.14)

Note that the quadratic (matrix) variation process of the martingale

∫₀ᵗ (σ⁻¹B)' dW̃
is given by Γ_t. We can show from this that, for any ε > 0,

lim_{t→∞} P{ |α°_t − α*|_{R^N} > ε } = 0.  (16.15)

This is very similar to the case of scalar Brownian motion w:

lim_{t→∞} P{ |w(t)/t| > ε } = 0 for any ε > 0.
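The scalar analogy is easy to check directly: since w(t)/t is Gaussian with standard deviation 1/√t, the fraction of sample paths with |w(t)/t| > ε vanishes as t grows. A quick Monte Carlo sketch (illustrative numbers only):

```python
import math, random

random.seed(3)

def fraction_exceeding(t, eps, trials=2000):
    """Estimate P{|w(t)/t| > eps} by sampling w(t) ~ N(0, sqrt(t))."""
    hits = sum(1 for _ in range(trials)
               if abs(random.gauss(0.0, math.sqrt(t)) / t) > eps)
    return hits / trials

# Exceedance probability for eps = 0.1 decays as the horizon t increases.
fracs = [fraction_exceeding(t, 0.1) for t in (10.0, 100.0, 1000.0)]
print(fracs)
```

The three fractions decrease monotonically toward zero, mirroring the limit above.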
Using equation (16.11) and the Ito formula, we obtain a stochastic differential equation for α°_t given by

dα°_t = −Γ_t⁻¹(B'R⁻¹B) α°_t dt + Γ_t⁻¹(B'R⁻¹) dx, t > t°,  (16.16)
driven by the observed process x. Under suitable assumptions, such as B'R⁻¹B strictly positive, the asymptotic limit of α°_t is the true parameter. We summarize this result in the following theorem.

Theorem 1 Suppose b^α and σ satisfy the standard assumptions guaranteeing existence and uniqueness of continuous solutions, and that b^α is linear in α, satisfying (16.6). The dispersion matrix σ, taking values from M(n × n), is nonsingular and W is a standard Brownian motion. Then the maximum likelihood estimate of α ∈ R^N at any time t > t°, denoted α°_t, is given by equation (16.11) and satisfies the SDE (16.16). Further, this estimate converges asymptotically to the true parameter α* with probability one.

Comment 1 Equation (16.16) provides a recursive estimate which is evidently adapted to the data field F_t^x.

Example 1 For illustration, we consider the (scalar) population model given by

dx = (α₁x + α₂x²)dt + γdw,  (16.17)

where α₁ is the intrinsic growth rate, α₂ is the competition or cooperation factor, and w is a standard Brownian motion representing uncertainties. If α₂ > 0 the population is cooperative, and if α₂ < 0 it is competitive. The problem is to estimate these rates from historical population data {x(s), s ≥ 0}. In this case B = (x, x²), α = (α₁, α₂)', σ = γ. Using this in (16.9) we obtain, for T = t,
η_t = (1/γ²) ∫₀ᵗ [ x(s) ; x²(s) ] dx(s),   Γ_t = (1/γ²) ∫₀ᵗ [ x²(s)  x³(s) ; x³(s)  x⁴(s) ] ds.
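As a numerical illustration (all rates, the noise level and the horizon below are hypothetical, not taken from the text), the batch estimate (16.11) can be formed by accumulating η_t and Γ_t along an Euler-Maruyama path of (16.17) and solving the 2×2 system Γ_t α = η_t in closed form:

```python
import math, random

random.seed(1)

# Hypothetical true rates: growth a1 = 1, competition a2 = -1, noise 0.1,
# so the path fluctuates around the logistic equilibrium x = 1.
a1_true, a2_true, gamma = 1.0, -1.0, 0.1
T, dt = 200.0, 1e-3

x = 1.0
e1 = e2 = 0.0            # components of eta_t  = int (x, x^2)' dx
g11 = g12 = g22 = 0.0    # entries of Gamma_t   = int [[x^2,x^3],[x^3,x^4]] ds
for _ in range(int(T / dt)):
    dx = (a1_true * x + a2_true * x * x) * dt + gamma * random.gauss(0.0, math.sqrt(dt))
    e1 += x * dx
    e2 += x * x * dx
    g11 += x ** 2 * dt
    g12 += x ** 3 * dt
    g22 += x ** 4 * dt
    x += dx

# alpha_t = Gamma_t^{-1} eta_t by Cramer's rule; the 1/gamma^2 factors cancel.
det = g11 * g22 - g12 * g12
a1_hat = (g22 * e1 - g12 * e2) / det
a2_hat = (g11 * e2 - g12 * e1) / det
print(a1_hat, a2_hat)
```

Over a long horizon the pair settles near the true (1, −1), exactly the estimator (16.11) for B = (x, x²).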
The reader can easily verify that Γ_t is positive. Then α°_t is given by (16.11). If the growth rate α₁ is known precisely but the competition (or cooperation) factor α₂ is not, then we have

α°_t = Γ_t⁻¹ η_t,

where

η_t = (1/γ²) ∫₀ᵗ x²(s) dx(s),   Γ_t = (1/γ²) ∫₀ᵗ x⁴(s) ds.

This example demonstrates clearly that the interaction coefficients α₁ and α₂ can be determined from population data using the technique summarized in Theorem 1. Clearly the result of Theorem 1 also applies to the linear problem

dx = Axdt + σdW = B(x)αdt + σdW,  (16.18)

where α ∈ R^N, N = n², denotes the elements of the matrix A which is to be identified, and B(x) ∈ M(n × N) is linear in x.

Identification of Dispersion Parameters

The preceding results, however, do not apply if σ, or more precisely R = σσ', is also to be identified. The problem arises from the fact that now equation (16.4) is given by

µ^α(Γ) = ∫_Γ ρ^α(x) dν^α(x),
(16.19)
where ρ^α has the same expression as (16.3), now with R dependent on α, and ν^α is the measure induced by equation (16.2) with σ dependent on α. This presents a very difficult problem, which we consider in section 16.4. In any case, if the set of admissible parameters 𝒫 is a compact subset of R^N and σ^α(t, x) is globally Lipschitz in x ∈ Rⁿ, having at most linear growth uniformly with respect to α ∈ 𝒫, and is further measurable in t ∈ I and continuous in α ∈ 𝒫, then we can prove that, for each bounded set Γ ∈ B(C), the functional given by f(α) = µ^α(Γ) is continuous on 𝒫 and hence attains both its maximum and minimum on 𝒫. However, we do not have a simple expression for this as
in the drift identification problem. It must be computed iteratively using a suitable algorithm, as given in section 16.5. A simple practical approach is described as follows. Consider the system (16.18) and suppose W is a standard Brownian motion. For any given set of model parameters {A, σ} we write the equations for the mean and the covariance:

(d/dt)m = Am, m(0) = E x₀,
(d/dt)P = AP + PA' + Q, P(0) = P₀, t ∈ I,  (16.20)

where Q = σσ'. Suppose the observed process over the period I is given by {ξ(t), t ∈ I}, a sample path realization of the process x. Then define the deterministic error covariance by the expression

P̂(t) = (ξ(t) − m(t))(ξ(t) − m(t))'.

This is clearly deterministic, given the observed sample path ξ. Let {P, m} denote the solution of equation (16.20) corresponding to the system parameters {A, Q}. Then one may try to find {A, Q} by minimizing the cost functional

J(T, A, Q) = ∫₀ᵀ Tr((P(t) − P̂(t)) S(t) (P(t) − P̂(t))) dt,  (16.21)

subject to the differential constraints (16.20). The weighting matrix S(t) is any suitable symmetric positive definite matrix valued function, possibly increasing with t ≥ 0 in the sense that (S(t₂)ζ, ζ) ≥ (S(t₁)ζ, ζ) for all ζ (≠ 0) ∈ Rⁿ and t₂ > t₁. It should be chosen so that very little weight is given to small values of t and very large weight to large values of t; this is intended to eliminate the impact of initial uncertainties and transients. Thus the identification problem has been converted into an approximate control and optimization problem. This simple-minded approach has been used successfully even for partially observed linear systems, discussed in the following section. What is crucial here is the availability of sufficiently long historical data. When suitable historical data is not available, one may use Monte Carlo simulation to generate an empirical covariance

P̂(t) = (1/M) Σ_{i=1}^M (ξᵢ(t) − m(t))(ξᵢ(t) − m(t))',

for use in equation (16.21). For numerical results see [14].
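A scalar sketch of this moment-matching idea (all numbers hypothetical): the "observed" covariance P̂ is built from M simulated sample paths, the model covariance P is propagated by (16.20), and the pair {a, Q} is recovered by a crude grid search over the cost (16.21) with weight S(t) = t:

```python
import math, random

random.seed(2)

# Hypothetical scalar system: dx = a*x dt + sqrt(Q) dw, true a = -1, Q = 1,
# x(0) = 0, so m(t) = 0 and the covariance equation of (16.20) is P' = 2aP + Q.
a_true, Q_true = -1.0, 1.0
T, dt, M = 2.0, 0.01, 1000
n = int(T / dt)

# Empirical covariance P_hat(t) from M Monte Carlo sample paths.
paths = []
for _ in range(M):
    x, traj = 0.0, []
    for _ in range(n):
        x += a_true * x * dt + math.sqrt(Q_true) * random.gauss(0.0, math.sqrt(dt))
        traj.append(x)
    paths.append(traj)
P_hat = [sum(p[k] ** 2 for p in paths) / M for k in range(n)]

def cost(a, Q):
    """J(T,a,Q) = int_0^T (P(t) - P_hat(t))^2 S(t) dt with S(t) = t,
    P propagated by Euler steps of P' = 2aP + Q, P(0) = 0."""
    P, J = 0.0, 0.0
    for k in range(n):
        P += (2.0 * a * P + Q) * dt
        J += (P - P_hat[k]) ** 2 * (k * dt) * dt
    return J

# Crude grid search over candidate model parameters {a, Q}.
grid = [(-i / 10.0, j / 10.0) for i in range(1, 21) for j in range(1, 21)]
best = min(grid, key=lambda p: cost(*p))
print(best)
```

The minimizer lands close to the true pair; what the weighted cost pins down most sharply is the stationary variance Q/(2|a|), which is why long records (or many Monte Carlo paths) matter.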
16.3 Partially Observed Linear Systems

Suppose the system and measurement dynamics are given by the following system of equations:

dx = Axdt + σdW, x(0) = x₀,
(16.22)
dy = Hxdt + σ₀dV, y(0) = 0,  (16.23)

and the associated Kalman filter is given by

dx̂ = Ax̂dt + K(π)H'R₀⁻¹dz, x̂(0) = x̂₀ = E x₀,
where the state estimation error covariance K(π) satisfies the differential Riccati equation

(d/dt)K(π) = AK(π) + K(π)A' + Q − K(π)H'R₀⁻¹HK(π),
K(π)(0) = K₀ = P₀, Q ≡ σσ', R₀ ≡ σ₀σ₀'.  (16.24)
Here the process {z(t), t ∈ I} is the innovation process, a Brownian motion with covariance R₀. We know that the covariance of the process x corresponding to the parameter π, denoted P(π), is given by the solution of the equation

(d/dt)P(π) = AP(π) + P(π)A' + Q, P(π)(0) = P₀ = K₀.  (16.25)
Let x̄(π)(t) = x̄(t) denote the mean of the process x, given by the solution of the equation

(d/dt)x̄(t) = Ax̄(t), x̄(0) = x̄₀,  (16.26)
corresponding to the parameter π = {A, σ, H, σ₀}. The two covariances K(π) and P(π) are related as follows. For any ξ ∈ Rⁿ we have

(K(π)(t)ξ, ξ) = E{(x(t) − x̂(t), ξ)²}
 = E{(x(t) − x̄(t), ξ)² + (x̂(t) − x̄(t), ξ)² − 2(x(t) − x̄(t), ξ)(x̂(t) − x̄(t), ξ)}
 = (P(π)(t)ξ, ξ) − E{(x̂(t) − x̄(t), ξ)²}.  (16.27)

Here we have used standard properties of conditional expectations relative to the sigma algebra F_t^y. Define e(π)(t) = x̂(t) − x̄(t). Subtracting (16.26) from the filter equation, it is easy to verify that e(π) satisfies the following SDE:

de(π) = (A − K(π)H'R₀⁻¹H)e(π)dt + K(π)H'R₀⁻¹(dy − Hx̄dt), e(π)(0) = 0.  (16.28)

Thus (16.27) can be rewritten as

K(π)(t) = P(π)(t) − E{e(π)(t)e'(π)(t)} ≡ P(π)(t) − K_e(π)(t).  (16.29)

This identity is correct if the choice of the parameter π coincides with the true system parameter π*, while {y(t), t ∈ I} is the actual measurement (field) data corresponding to the true parameter π*. In other words, if the observed data corresponds to the true parameter π* and not to the trial parameter π, and this data is used as the input to the model system (16.28) giving e(π), the identity (16.29) may not hold. Thus it is natural to choose π so as to minimize any potential mismatch. This can be done by introducing the cost functional
J(T, π) = ∫₀ᵀ Tr{(K(π) + K_e(π) − P(π)) S(t) (K(π) + K_e(π) − P(π))} dt,  (16.30)

where S(t) is a suitable weighting matrix, symmetric and positive definite, possibly increasing with t. It is clear from equations (16.24), (16.25), (16.26) and (16.28) that their solutions are uniquely determined by the parameters {A, Q, H, R₀}, rather than {A, σ, H, σ₀}, the data {x̄₀, P₀ = K₀} and the observation {y(t), t ∈ [0,T]}. So we may redefine π as π = {A, Q, H, R₀} and
accordingly the admissible set 𝒫. Given the data {y(t), t ∈ I} and π ∈ 𝒫, equation (16.28) is deterministic with solution e(π, y), and hence we can use

K_y(π)(t) = e(π, y)(t) e'(π, y)(t)  (16.31)

in place of K_e in the objective functional (16.30). Hence, if we can find a π° ∈ 𝒫 that minimizes the functional J, we have an approximate solution of the parameter identification problem. Thus we have the following result.

Theorem 2 Consider the system (16.22) and suppose it is required to identify the parameters π = {A, Q, H, R₀} ∈ 𝒫 from the data {y(t), t ∈ [0,T]}, x̄₀ and P₀. Let the observed data y be used as the input to the system (16.28). Then the identification problem is equivalent to the control problem: find π° ∈ 𝒫 that minimizes the cost functional (16.30), with K_e replaced by K_y, subject to the dynamic constraints (16.24)-(16.26) and (16.28).

This technique was originally developed and reported in [14], with convincing numerical simulation results; necessary conditions of optimality of Pontryagin type were also given there, although the method of simulated annealing was used for the optimization. For details the reader is referred to [14]. A difficulty faced in applying this technique is that K_e is not available in practice, so one must replace it in (16.30) with K_y given by (16.31). The latter is obtained from the solution of equation (16.28) driven by the measured data {y(t), t ∈ I}, which is just one sample path observed from experiment. This problem was overcome in [14] by taking a long history of the process {y(t), t ∈ [0,T]}; an empirical covariance was also used for the numerical computation.

Example 2 The technique described above is illustrated by the following example. Consider the system and measurement model

dx₁ = (a₁₁x₁ + a₁₂x₂)dt + σ₁₁dw₁ + σ₁₂dw₂,

together with a corresponding equation for x₂ and a scalar observation y. Comparing the actual parameter values with the estimated ones in the table below, it is evident that the method works very well for time invariant systems.

Parameter Identification

Parameter   Starting value   Estimated value   Actual value
a11         -1.0             -1.994406         -2.0
a12          1.0              1.995625          2.0
a21          2.0              0.504037          0.5
a22         -3.0             -2.064103         -2.0
q11          0.1              1.010031          1.01
q12          0.5              0.2000456        0.2
q22          0.1              1.010027          1.01
h11         -0.7             -0.000276          0.0
h12          2.0              1.009440          1.0
r11          0.1              0.992090          1.0
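A quick numerical sanity check on equations (16.24), (16.25) and the identity (16.29), for a scalar model with hypothetical coefficients: since K_e(π) ≥ 0, the filter covariance K can never exceed the process covariance P:

```python
# Scalar sketch (illustrative numbers only) of the Riccati/covariance pair:
# dx = a*x dt + dw, dy = h*x dt + s0*dv, so that
#   K' = 2aK + Q - K^2 h^2 / R0   (Riccati, (16.24))
#   P' = 2aP + Q                  (covariance, (16.25))
a, Q, h, R0, P0 = -1.0, 1.0, 1.0, 0.5, 1.0
dt, T = 1e-3, 5.0

K, P = P0, P0
for _ in range(int(T / dt)):
    K += (2 * a * K + Q - K * K * h * h / R0) * dt
    P += (2 * a * P + Q) * dt
    # Identity (16.29): K = P - K_e with K_e >= 0, hence K <= P throughout.
    assert K <= P + 1e-9

# Stationary values: P -> Q/(2|a|) = 0.5, while K solves 2aK + Q = K^2 h^2/R0.
print(round(P, 3), round(K, 3))
```

Both curves settle at their stationary values, with K strictly below P, which is exactly the gap K_e(π) that the cost (16.30) tries to account for.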
16.4 Partially Observed Nonlinear Systems

Now we consider partially observed nonlinear systems. Suppose the system and measurement dynamics are given by

dx = b(t, x, α)dt + σ(t, x, α)dW, x(0) = x₀,
dy = h(t, x, α)dt + σ₀(t, y)dV, y(0) = 0.  (16.32)

Here the unknown parameter (vector) appears in all the coefficients except in the measurement noise. From chapter 13 we know that the associated Zakai equation (see chapter 13, equation 13.37B) is given by

dp^α(t) = A*(t, α)p^α(t)dt + p^α(t) < R₀⁻¹h^α, dy >, p^α(0) = p₀.  (16.33)

To emphasize the dependence of the coefficients on the unknown parameter α, we use it whenever required as a superscript, as in b^α, σ^α, h^α. Again, from equation (13.9) of chapter 13, we have the Radon-Nikodym derivative given by

q^α = q^α(t) = exp{ ∫₀ᵗ < R₀⁻¹h^α, dy > − (1/2) ∫₀ᵗ < R₀⁻¹h^α, h^α > ds }, t ≥ 0.  (16.34)

For the fully observed problem we argued that the RND given by q_T, or log q_T, can be chosen as the likelihood function. Thus, in the partially observed
situation the natural candidate for the likelihood function is

L_T(α) = E⁰{q^α(T) | F_T^y} = < p^α(T), 1 >.  (16.35)

Hence the identification problem can be stated as follows: given the history {y(t), t ∈ I = [0,T]}, find α that maximizes the functional (16.35) subject to the dynamic constraint (16.33). For easy reference we call this problem (IP). We shall reformulate it as a parameter optimization problem for a deterministic partial differential equation by use of an exponential transformation. In the literature this pathwise formulation is also known as the robust formulation. Define

z(t) = ∫₀ᵗ R₀⁻¹(s, y(s)) dy(s),  (16.36)

and

p̃^α(t) = p^α(t) exp{−h^α · z(t)},  (16.37)
where from now on we use the notation (ξ, ζ)_{R^m} = ξ · ζ. Then by use of the Ito formula one can verify that

dp̃^α(t) = dp^α(t) exp{−h^α · z(t)} + p^α(t) d(exp{−h^α · z(t)}) + < dp^α(t), d(exp{−h^α · z}) >
 = (exp{−h^α · z}) A*(t, α)(p̃^α(t) exp{h^α · z}) dt − (1/2) p̃^α(t)(R₀⁻¹h^α, h^α) dt,  (16.38)

where A*(t, α) denotes the formal adjoint of the generator A(t, α) of the Markov process x given by the first equation of (16.32). Note that the last term in the first identity is the quadratic covariation term. Clearly, given the process y(t), t ∈ I, or equivalently the process z(t), t ∈ I, equation (16.38) is a standard (deterministic) partial differential equation. We rewrite it as

(d/dt)p̃^α(t) = 𝒜*(t, α)p̃^α(t), p̃^α(0) = p₀,  (16.39)

where 𝒜* denotes the differential operator given by

𝒜*(t, α)φ = exp{−h^α · z} A*(t, α)(exp{h^α · z} φ) − (1/2)(R₀⁻¹h^α, h^α)φ.  (16.40)
Note that the coefficients of this differential operator are parameterized by the process y through the process z. Equation (16.39) is a partial differential equation on the domain I × Rⁿ. The objective functional (16.35) takes the form

J_T(α) = L_T(α) = < p^α(T), 1 > = < p̃^α(T) exp{h^α · z(T)}, 1 >.  (16.41)

Thus our identification problem (IP) is equivalent to the optimization problem: find α that maximizes the functional J_T(α) given by (16.41) subject to the dynamic constraint (16.39). The problem as stated may have no solution without further assumptions. Let 𝒫 ⊂ R^N denote the set of admissible parameters and suppose the coefficients satisfy the following assumptions.

Assumptions

(A1) b(t, x, α) is bounded and measurable in t and continuous in {x, α} on I × Rⁿ × 𝒫, satisfying a uniform Lipschitz and linear growth condition in x on Rⁿ.

(A2) All the columns of the diffusion matrix σ(t, x, α) satisfy conditions similar to those of b, and are Hölder continuous in t on I. Further, there exists a constant γ > 0 such that a ≡ σσ' ≥ γI for all (t, x, α) ∈ I × Rⁿ × 𝒫, and the first derivatives in x are continuous and bounded on I × Rⁿ × 𝒫.

(A3) For every (t, x) ∈ I × Rⁿ, the mappings α → b(t, x, α), α → σ(t, x, α) and α → h(x, α) are once Gateaux differentiable on 𝒫.

(A4) For every α ∈ 𝒫, h is C¹ in x with bounded first derivatives. Further, the maps α → h(x, α) and α → (∂/∂xᵢ)h(x, α) are continuous and bounded on 𝒫 for each x ∈ Rⁿ.

(A5) The matrix valued function σ₀(t, y) is measurable in t on I and uniformly Lipschitz on R^m, possessing at most linear growth. Further, it is invertible.

Under these assumptions we can prove that the operator 𝒜*(t, α) satisfies Garding's inequality. Let H denote the Hilbert space L²(Rⁿ), V = H¹ the Sobolev space introduced in chapter 14 (see chapter 14, Lemma 1), and V* = (H¹)* its dual. Then we have the following result.

Lemma 3 Under the assumptions (A1)-(A5), the operator 𝒜*(t, α), and hence 𝒜 as well, is a bounded linear operator from V to V*, and −𝒜* is coercive in the sense that there exist constants {c > 0, β > 0, λ > 0} such that

|(𝒜*(t, α)u, v)_{V*,V}| ≤ c ‖u‖_V ‖v‖_V,   (−𝒜*(t, α)u, u) + λ|u|²_H ≥ β‖u‖²_V,

for all u, v ∈ V and all (t, α) ∈ I × 𝒫.
Proof. Let C₀^∞(Rⁿ) denote the class of infinitely differentiable functions having compact supports, and let u, v ∈ C₀^∞. For convenience of notation we write D_iφ = (∂/∂x_i)φ for the partial derivative of φ with respect to x_i, and Dφ for the gradient vector. Then by integration by parts one can easily verify that

a^α(t, u, v) = (𝒜*(t, α)u, v)
 = −(1/2) ∫_{Rⁿ} Σ_{i,j=1}^n a_{ij}(D_iu)(D_jv) dx + ∫_{Rⁿ} Σ_{j=1}^n β_j(D_ju)v dx
  + ∫_{Rⁿ} Σ_{i=1}^n γ_i u(D_iv) dx + ∫_{Rⁿ} δuv dx,  (16.43)

where

β_j(t, x, α) = (1/2) Σ_{i=1}^n a_{ij} D_i(h^α · z), 1 ≤ j ≤ n,

γ_i(t, x, α) = −(1/2) Σ_{j=1}^n a_{ij} D_j(h^α · z) − (1/2) Σ_{j=1}^n D_j(a_{ij}) + b_i, 1 ≤ i ≤ n,

δ(t, x, α) = (1/2) Σ_{i,j=1}^n a_{ij} D_j(h^α · z) D_i(h^α · z) + (1/2) Σ_{i,j=1}^n D_j(a_{ij}) D_i(h^α · z) − Σ_{i=1}^n b_i D_i(h^α · z) − (1/2)(R₀⁻¹h^α, h^α).

Under the assumptions (A1)-(A3) and (A5) it is easy to verify that these coefficients are bounded measurable functions of their arguments. Thus there exist nonnegative constants {C₁, C₂, C₃, C₄} such that, by the Schwarz inequality,

|(𝒜*(t, α)u, v)| ≤ C₁|Du|_{L²}|Dv|_{L²} + C₂|Du|_{L²}|v|_{L²} + C₃|Dv|_{L²}|u|_{L²} + C₄|u|_{L²}|v|_{L²} ≤ c‖u‖_V‖v‖_V,  (16.44)

where the constant c is determined by C = max{C₁, C₂, C₃, C₄}. Since C₀^∞(Rⁿ) is dense in the Sobolev space H¹(Rⁿ) = V, inequality (16.44) holds for all u, v ∈ V. Thus the operator 𝒜*(t, α) is a bounded linear operator from V to V*. For coercivity we use (16.43) with v replaced by u and the Cauchy inequality

ab ≤ (ε/2)a² + (1/2ε)b², a, b ∈ R, ε > 0,  (16.45)
to arrive at the following inequality:

(−𝒜*(t, α)u, u) ≥ (γ − (ε/2)(C₂ + C₃)) ‖u‖²_V − ((ε/2)(C₂ + C₃) + (1/2ε)(C₂ + C₃) + C₄) |u|²_H, ∀u ∈ V.  (16.46)

Choosing ε = γ/(C₂ + C₃) in (16.46) and defining β = γ/2 and λ = (γ/2) + (1/2γ)(C₂ + C₃)² + C₄, we obtain

(−𝒜*(t, α)u, u) + λ|u|²_H ≥ β‖u‖²_V, ∀u ∈ V, t ∈ I, α ∈ 𝒫.  (16.47)

This proves coercivity. QED
This proves coercivity. QED In view of this result and Lemma 6 of chapter 15 we have the following theorem. T h e o r e m 4 For any a e V and any finite interval I = [0,T] and every initial density p0 e H, equation (16.39) has a unique weak solution pa e L2(I, V) D C(I, H). Further, for every p0 e H satisfying \p0{x)\ < M0 exp - {tf|x| 2 },x G Rn,MQ > 0,# > 0,
(16.48)
there exist constants M\ = Mi(Mo,$) > 0, and >, 0 < 6 < $ possibly depending on {Mo,#,T} such that the following estimate holds. \pa{t,x)\
< Afi exp-{<5|x| 2 },Vxe Rn,0
(16.49)
The proof of the estimate (16.49) follows from the fact that, under the given assumptions, the fundamental solution of the parabolic equation (16.39) satisfies the following estimate r(«,x,T,OI < (ki/(t-r)^2)exp-{k2(\x-C\2)},
(16.50)
for all a e V,x,£ € Rn,0 < r < t < T, where k\ and k2 are positive con stants depending on 7 of assumption (A2) and the bounds of the parameters {aij,/?i,7i,5} as defined in equation (16.43). In fact, the estimate (16.49) can be computed using the expression for the solution pa(t,x)=
[
r(t,x,0,Z)po(£)dZ,
242
System
Identification
and the estimate (16.48) and (16.50). For details on the estimate of the fun damental solution see [43]. In view of theorem 4, we rewrite our objective functional (16.41) as follows: JT(a) (a)
(16.51)
= (qa(T),r, (gQ(T),7 (r))H a(T)) ?QH
where Va(t) = Tfa(t,x) = exp{ha(x) ■ z(t) - (6/2)\x\2}, Va{t) Va(t,x) = , [ u ^ „„, ,21 D nn qa(t) = qa(t,x)=pa{t,x)exp{{6/2)\x\22},x& Rqa{t) = qa{t,x)=pa(t,x)exp{(S/2)\x\ },x 6 Rn .
(16.52) 16"52)
Next we introduce the operator B* as follows:

B*(t, α)φ = exp{−h^α · z(t) + (δ/2)|x|²} A*(t, α)(exp{h^α · z(t) − (δ/2)|x|²} φ) − (1/2)(R₀⁻¹h^α, h^α)φ
 ≡ exp{(δ/2)|x|²} 𝒜*(t, α)(exp{−(δ/2)|x|²} φ).  (16.53)

In view of the expressions (16.52) and (16.53), the evolution equation

(d/dt)q^α = B*(t, α)q^α, q^α(0) = q₀ = p₀ exp{(δ/2)|x|²},  (16.54)
is equivalent to the original equation (16.39). In other words, for an initial condition p₀ satisfying (16.48), if p̃^α ∈ L²(I,V) ∩ C(I,H) is the unique solution of (16.39), then q^α given by (16.52) is the unique solution of equation (16.54), belonging to the same space L²(I,V) ∩ C(I,H). Thus the identification problem (IP) stated earlier is equivalent to the following optimization problem:

(d/dt)q^α = B*(t, α)q^α, q^α(0) = q₀,
J_T(α) = (q^α(T), η^α(T))_H → max.  (16.55)

Now we are prepared to demonstrate that, under one additional assumption, the identification problem (16.39)-(16.41), denoted (IP), or its equivalent (16.55), has a solution.

Theorem 5 (Existence) Consider the identification problem (IP), or equivalently (16.55). Suppose the assumptions (A1)-(A5) hold, p₀ satisfies the estimate (16.48), α → h^α has at most linear growth, and the parameter set 𝒫 ⊂ R^N is compact. Then the mapping α → J_T(α) is continuous on 𝒫 and hence attains its maximum on 𝒫.
Proof. Since a continuous function on a compact set attains both its maximum and minimum, it suffices to verify that α → J_T(α) is continuous on 𝒫. Let αⁿ, α⁰ ∈ 𝒫, let pⁿ, p⁰ be the corresponding solutions of equation (16.39), and qⁿ, q⁰ those of the equivalent system (16.54). By virtue of Theorem 4 these solutions exist and are unique. Suppose αⁿ → α⁰. We prove that qⁿ converges to q⁰ in the topology of L²(I,V) ∩ C(I,H); it suffices to show that pⁿ converges to p⁰ with respect to the same topology. Defining

zⁿ(t) = pⁿ(t) − p⁰(t), t ∈ I,

one can easily check that zⁿ(t), t ∈ I, satisfies the differential equation

(d/dt)zⁿ(t) = 𝒜*(t, αⁿ)zⁿ(t) + (𝒜*(t, αⁿ) − 𝒜*(t, α⁰))p⁰, t ∈ I, zⁿ(0) = 0.  (16.56)

Scalar multiplying either side of (16.56) by zⁿ(t), we obtain

(1/2)(d/dt)|zⁿ(t)|²_H − (𝒜*(t, αⁿ)zⁿ(t), zⁿ(t)) = ((𝒜*(t, αⁿ) − 𝒜*(t, α⁰))p⁰, zⁿ(t)).  (16.57)

Integrating this over the interval [0, t] and using the coercivity property and the Cauchy inequality (16.45), one arrives at the inequality

|zⁿ(t)|²_H + (2β − ε) ∫₀ᵗ ‖zⁿ(s)‖²_V ds ≤ λ ∫₀ᵗ |zⁿ(s)|²_H ds + (1/ε) ∫₀ᵗ ‖(𝒜*(s, αⁿ) − 𝒜*(s, α⁰))p⁰‖²_{V*} ds, ∀t ∈ I, ε > 0.  (16.58)

Using ε = β and applying the Gronwall inequality, it follows that

|zⁿ(t)|²_H + β ∫₀ᵀ ‖zⁿ(s)‖²_V ds ≤ (e^{λT}/β) ∫₀ᵀ ‖(𝒜*(t, αⁿ) − 𝒜*(t, α⁰))p⁰‖²_{V*} dt,  (16.59)

for all t ∈ I. From assumptions (A1) and (A2) one can prove that α → 𝒜*(t, α) is strongly continuous (in the strong operator topology) from L²(I,V) to L²(I,V*). Hence it follows from (16.59), upon letting n → ∞, that zⁿ → 0 in L²(I,V) ∩ C(I,H), and consequently qⁿ → q⁰ in L²(I,V) ∩ C(I,H) as αⁿ → α⁰. This proves sequential continuity, and hence continuity, of the map α → q^α. Under the continuity assumption on h^α(x) in α ∈ 𝒫 and uniform
linear growth in x ∈ Rⁿ, the map α → η^α(T) is continuous from 𝒫 to H. Thus the cost functional J_T(α) given by (16.51) is continuous on 𝒫, and the existence of a maximum follows from the compactness of 𝒫. This completes the proof. QED

The next question is how to find the best parameter α⁰ that maximizes the likelihood functional J_T(α). We present here a set of necessary conditions that the optimal parameter must satisfy. For this we need the Gateaux differentiability of the solution map α → q^α. Suppose the admissible set of parameters 𝒫 is a compact and convex subset of R^N. Let α^ε = α⁰ + ε(α − α⁰), ε ∈ [0,1]. Clearly, by convexity of 𝒫, α^ε ∈ 𝒫 whenever α⁰, α ∈ 𝒫.

Lemma 6 Suppose the assumptions (A1)-(A5) hold and that 𝒫 is a compact convex subset of R^N. Then the map α → q^α is Gateaux differentiable on 𝒫, and its Gateaux differential at α⁰ in the direction α − α⁰, denoted r, is given by the weak solution of the following differential equation:
+ G(t,a°,a
- a o )g o ,r(0) = 0,
(16.60)
where G denotes the strong Gateaux differential of the operator valued function a —► B*(t,a) at a° in the direction a - a° given by lim || {l/e){B*(t,a*)
- B*(t,a°))4> - G(t,a°;a-
a°)0 || v .—> 0,V0 G V. (16.61)
Proof. The proof is quite similar to that of Theorem 5, so we give only an outline. Let q^ε and q⁰ denote the unique solutions of system (16.54) corresponding to the parameters α^ε and α⁰ respectively. Define

r^ε = (1/ε)(q^ε − q⁰).  (16.62)

Then, following the same procedure as in Theorem 5, one can verify that this family is contained in a bounded subset of the space

W = {φ ∈ L²(I,V) : φ̇ ∈ L²(I,V*)}.  (16.63)

Furnished with the norm topology

‖φ‖_W ≡ (‖φ‖²_{L²(I,V)} + ‖φ̇‖²_{L²(I,V*)})^{1/2},

W is a reflexive Banach space. Hence the family {r^ε} has a subsequence which converges weakly to an element r ∈ W as ε ↓ 0 (along this subsequence).
It is known [3] that W is continuously embedded in C(I,H), and hence r ∈ C(I,H) ∩ L²(I,V). From this one can verify that r is a weak solution of (16.60). In fact, by use of arguments related to V*-valued distributions, one arrives at the conclusion that equation (16.60) holds almost everywhere on I. This completes the outline of the proof. QED

For a slightly different but detailed proof see [2] and [27]. Finally, with the help of the above results, we derive the necessary conditions of optimality. Let η̇(α⁰, α − α⁰) denote the Gateaux differential of η with respect to α at α⁰ in the direction α − α⁰.

Theorem 7 (Necessary conditions of optimality) Suppose the assumptions of Lemma 6 hold. Then, in order that α⁰ ∈ 𝒫 be the maximum likelihood estimate of the unknown parameter α ∈ 𝒫, it is necessary that there exist a ψ⁰ ∈ L²(I,V) ∩ C(I,H) such that the following relations hold:

(d/dt)q⁰ = B*(t, α⁰)q⁰, q⁰(0) = q₀,  (16.64)

−(d/dt)ψ⁰ = B(t, α⁰)ψ⁰, ψ⁰(T) = η^{α⁰}(T),  (16.65)

∫₀ᵀ (G(t, α⁰; α − α⁰)q⁰(t), ψ⁰(t))_{V*,V} dt + (q⁰(T), η̇(α⁰, α − α⁰)(T))_H ≤ 0, ∀α ∈ 𝒫.  (16.66)

Proof. Let α⁰ ∈ 𝒫 denote the optimal parameter and α ∈ 𝒫 any arbitrary element. Then the Gateaux differential of J_T at α⁰ in the direction α − α⁰, denoted dJ_T(α⁰, α − α⁰), is given by

dJ_T(α⁰, α − α⁰) = (r_{α⁰}(T), η_{α⁰}(T))_H + (q_{α⁰}(T), η̇(α⁰, α − α⁰)(T))_H,  (16.67)

where q_{α⁰} = q⁰ and r_{α⁰} = r⁰ are the solutions of equations (16.64) and (16.60) respectively. Since α⁰ is optimal, it is clear that

J_T(α⁰ + ε(α − α⁰)) − J_T(α⁰) ≤ 0, ∀ε ∈ (0,1], ∀α ∈ 𝒫.  (16.68)

Dividing this by ε and letting ε → 0, it follows from (16.67) and (16.68) that

(r_{α⁰}(T), η_{α⁰}(T))_H + (q_{α⁰}(T), η̇(α⁰, α − α⁰)(T))_H ≤ 0, ∀α ∈ 𝒫.  (16.69)

Consider the general nonhomogeneous problem associated with (16.60), given by

(d/dt)y = B*(t, α⁰)y + g, y(0) = 0.  (16.70)
One can show that for every g ∈ L²(I,V*) this equation has a unique solution y ∈ W, and that the map g → y(T) is continuous and linear from L²(I,V*) to H. Define g⁰_α = G(t, α⁰; α − α⁰)q⁰, and note that g⁰_α ∈ L²(I,V*). Thus the map

g⁰_α → (r_{α⁰}(T), η_{α⁰}(T))_H

is a continuous linear functional on L²(I,V*), and hence there exists a unique ψ⁰ ∈ L²(I,V) such that

(r_{α⁰}(T), η_{α⁰}(T))_H = ∫₀ᵀ (g⁰_α(t), ψ⁰(t))_{V*,V} dt.  (16.71)

One can verify that ψ⁰ is given by the solution of the adjoint equation (16.65). Substituting (16.71) into (16.69) yields (16.66). This completes the proof. QED
16.5 A Computational Algorithm

Assuming that the Gateaux differentials are linear in the direction α − α⁰ (an assumption that normally holds), inequality (16.66) can be expressed coordinatewise as

Σ_{i=1}^N ( ∫₀ᵀ (G_i(t)q⁰(t), ψ⁰(t))_{V*,V} dt + (q⁰(T), η̇_i(T))_H )(α_i − α⁰_i) ≡ (∇J_T(α⁰), α − α⁰)_{R^N} ≤ 0, ∀α ∈ 𝒫,  (16.72)

where G_i and η̇_i denote the differentials in the i-th coordinate direction. This suggests the following gradient algorithm.

Step 1. At the nth stage, n ≥ 0, αⁿ ∈ 𝒫 is given; solve equation (16.64) for qⁿ.

Step 2. Corresponding to the same αⁿ, solve equation (16.65) for ψⁿ.

Step 3. Using {αⁿ, qⁿ, ψⁿ} and (16.72), compute the gradient ∇J_T(αⁿ).

Step 4. Define αⁿ⁺¹ = αⁿ + ε∇J_T(αⁿ) for ε > 0 sufficiently small so that αⁿ⁺¹ ∈ 𝒫.

Step 5. Compute J_T(αⁿ⁺¹) = J_T(αⁿ) + ε‖∇J_T(αⁿ)‖²_{R^N} + o(ε).

Step 6. If J_T(αⁿ⁺¹) < J_T(αⁿ), reduce ε and repeat Step 4. Otherwise return to Step 1 with the new α = αⁿ⁺¹, and continue until a stopping criterion is satisfied.

Comment 2 The reader will notice that the general assumptions stated in this section are rather strong. They are not at all necessary; however, under these assumptions rigorous proofs are easily obtained.

Comment 3 For further details see [2], [29]. For the general theory of identification and numerical results, the reader is referred to [2].
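The six steps can be sketched as a generic gradient-ascent loop with step-size reduction. Everything below is schematic: the PDE solves of Steps 1-2 are compressed into a hypothetical callable `grad_J`, the membership check αⁿ⁺¹ ∈ 𝒫 of Step 4 is omitted, and a toy concave functional stands in for J_T so the control flow can be exercised end to end:

```python
def ascend(J, grad_J, a0, eps=0.5, tol=1e-8, max_iter=1000):
    """Steps 1-6 in skeleton form: evaluate the gradient (Steps 1-3),
    take a step (Step 4), and halve eps until ascent is achieved (Steps 5-6)."""
    a = list(a0)
    for _ in range(max_iter):
        g = grad_J(a)                       # Steps 1-3: q^n, psi^n -> gradient
        if sum(gi * gi for gi in g) < tol:  # stopping criterion
            break
        while True:                         # Steps 4-6: step-size reduction
            a_new = [ai + eps * gi for ai, gi in zip(a, g)]
            if J(a_new) > J(a):             # ascent achieved: accept the step
                a = a_new
                break
            eps *= 0.5                      # otherwise reduce eps and retry
            if eps < 1e-12:
                return a
    return a

# Toy concave stand-in J(a) = -|a - a*|^2 with maximizer a* = (1, -2).
a_star = (1.0, -2.0)
J = lambda a: -sum((ai - si) ** 2 for ai, si in zip(a, a_star))
grad_J = lambda a: [-2.0 * (ai - si) for ai, si in zip(a, a_star)]
est = ascend(J, grad_J, [0.0, 0.0])
print(est)  # converges to [1.0, -2.0]
```

In the actual problem each call to `grad_J` costs one forward solve of (16.64) and one backward solve of (16.65), which is what makes the adjoint formulation attractive: the cost per iteration is independent of the number of parameters N.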
17. References

[1] N.U. Ahmed, Elements of Finite Dimensional Systems and Control Theory, Pitman Monographs and Surveys in Pure and Applied Mathematics, 37, Longman Scientific and Technical, U.K.; co-publisher: John Wiley, New York, 1988.
[2] N.U. Ahmed, Optimization and Identification of Systems Governed by Evolution Equations on Banach Space, Pitman Research Notes in Mathematics Series, 184, Longman Scientific and Technical, U.K.; co-publisher: John Wiley, New York, 1988.
[3] N.U. Ahmed and K.L. Teo, Optimal Control of Distributed Parameter Systems, Elsevier North Holland, New York, Oxford, 1981.
[4] N.U. Ahmed and K.L. Teo, An Existence Theorem on Optimal Control of Partially Observable Diffusions, SIAM Journal on Control, 12, 3, pp. 351-355, 1974.
[5] N.U. Ahmed, Optimal Control of Stochastic Systems, in Probabilistic Analysis and Related Topics (ed. A.T. Bharucha-Reid), Academic Press Series, 2, pp. 1-68, 1979.
[6] N.U. Ahmed, Optimal Relaxed Controls for Infinite Dimensional Stochastic Systems of Zakai Type, SIAM Journal on Control and Optimization, 34, 5, pp. 1592-1615, 1996.
[7] N.U. Ahmed and J. Zabczyk, Partially Observed Optimal Controls for Nonlinear Infinite Dimensional Stochastic Systems, Dynamic Systems and Applications, 5, pp. 521-538, 1996.
[8] N.U. Ahmed, M. Fuhrman, and J. Zabczyk, On Nonlinear Filtering in Infinite Dimensions, Journal of Functional Analysis, 143, 1, pp. 180-204, 1997.
[9] N.U. Ahmed and T.E. Dabbous, Nonlinear Filtering of Systems Governed by Ito Differential Equations with Jump Parameters, J. Math. Analysis Appl., 115, 1, pp. 76-92, 1986.
[10] N.U. Ahmed, T.E. Dabbous and H.W. Wong, Gradient Method for Computing Optimal Controls for Stochastic Differential Equations, Stochastic Analysis and Applications, 5(2), pp. 121-150, 1987.
[11] N.U. Ahmed and P. Li, Quadratic Regulator Theory and Linear Filtering Under System Constraints, IMA Journal of Mathematical Control and Information, 8(8), pp. 93-107, 1991.
[12] N.U. Ahmed and S.M. Radaideh, A Powerful Numerical Technique Solving Zakai Equation for Nonlinear Filtering, Dynamics and Control, 7, pp. 293-308, 1997.
[13] N.U. Ahmed and S.M. Radaideh, Modified Extended Kalman Filtering, IEEE Transactions on Automatic Control, 39, 6, pp. 1322-1326, 1994.
[14] N.U. Ahmed and S.M. Radaideh, Identification of Linear Stochastic Systems Based on Partial Information, Journal of Applied Mathematics and Stochastic Analysis, 8, 3, pp. 349-360, 1995.
[15] N.U. Ahmed and H.W. Wong, A Minimum Principle for Systems Governed by Ito Differential Equation with Markov Jump Parameters, in Differential Games and Control Theory II (eds. E.O. Roxin, P.T. Liu and R.L. Sternberg), Lecture Notes in Pure and Applied Mathematics, 30, Marcel Dekker, Inc., New York, Basel, 1976.
[16] N.U. Ahmed, Existence and Uniqueness of Measure Valued Solutions for Zakai Equation, Publicationes Mathematicae, Debrecen, 49, 3-4, pp. 251-264, 1996.
[17] L. Arnold, Stochastic Differential Equations: Theory and Applications, John Wiley and Sons, New York, 1974.
[18] B.D.O. Anderson and J.B. Moore, Optimal Filtering, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1979.
[19] J.S. Baras, G.L. Blankenship, and S.K. Mitter, Nonlinear Filtering of Diffusion Processes, Proc. IFAC Congr., Kyoto, Japan, Aug. 1981.
[20] A. Bensoussan, Stochastic Control of Partially Observable Systems, Cambridge University Press, 1992.
[21] P. Billingsley, Probability and Measure (2nd Edition), Wiley-Interscience, New York, Chichester, Brisbane, Toronto, Singapore, 1985.
[22] A. Bagchi, Continuous Time Systems Identification with Unknown Noise Covariance, Automatica, 11, pp. 533-536, 1975.
[23] R.S. Bucy, Nonlinear Filtering Theory, IEEE Trans. Aut. Contr., 10, pp. 198-212, 1965.
[24] S.K. Berberian, Measure and Integration, The MacMillan Company, New York; Collier-MacMillan Limited, London, 1965.
[25] V.E. Benes, Exact Finite Dimensional Filter for Certain Diffusion with Nonlinear Drifts, Stochastics, 6, pp. 65-92, 1981.
[26] J.M.C. Clark, The Design of Robust Approximation to the Stochastic Differential Equation of Nonlinear Filtering, in Communication Systems and Random Process Theory, pp. 721-735, 1981.
[27] T.E. Dabbous and N.U. Ahmed, Nonlinear Filtering of Diffusion Processes with Discontinuous Observations, Stoch. Analys. Appl., 2, 1, pp. 87-106, 1984.
[28] T.E. Dabbous, N.U. Ahmed, J.C. McMillan and D.F. Liang, Filtering of Discontinuous Processes Arising in Marine Integrated Navigation Systems, IEEE Transactions on Aerospace and Electronic Systems, 24, 1, pp. 85-102, 1988.
[29] T.E. Dabbous and N.U. Ahmed, Parameter Identification for Partially Observed Diffusion, J. of Optimization Theory and Applications (JOTA), 75, 1, pp. 33-50, 1992.
[30] M.H. Davis and S.I. Marcus, An Introduction to Nonlinear Filtering, in Stochastic Systems: The Mathematics of Filtering and Identification and Applications (NATO Advanced Study Institute Series), Reidel, Dordrecht, pp. 53-75, 1981.
[31] M.H. Davis, New Approach to Filtering for Nonlinear Systems, IEEE Proc., 128, 5, pp. 166-172, 1981.
[32] G.B. Di Masi and W.J. Runggaldier, An Approximation to Optimal Nonlinear Filtering with Discontinuous Observations, in Stochastic Systems: The Mathematics of Filtering and Identification and Applications (NATO Advanced Study Institute Series), Reidel, Dordrecht, pp. 583-590, 1980.
[33] R.J. Elliott, Stochastic Calculus and Applications, Springer-Verlag, Heidelberg, Berlin, New York, 1982.
[34] R.J. Elliott and R. Glowinski, Approximations to Solutions of Zakai Filtering Equation, Stochastic Analysis and Applications, 7(2), pp. 145-168, 1989.
[35] W.H. Fleming, Measure Valued Processes in Control of Partially Observable Stochastic Systems, Appl. Math. Optim., 6, pp. 271-285, 1980.
[36] W.H. Fleming and S.K. Mitter, Optimal Control and Pathwise Nonlinear Filtering of Nondegenerate Diffusions, presented at the 20th IEEE Conf. Decision Contr., San Diego, CA, 1981.
[37] W.H. Fleming and E. Pardoux, Optimal Control for Partially Observed Diffusions, SIAM J. Optim., 20, 2, pp. 261-285, 1982.
[38] W.H. Fleming and H.M. Soner, Controlled Markov Processes and Viscosity Solutions, Springer-Verlag, New York, Berlin, Heidelberg, 1993.
[39] W.H. Fleming and Q. Zhang, Nonlinear Filtering with Small Observation Noise: Piecewise Monotone Observations, in Stochastic Analysis, Academic Press, Inc., pp. 153-168, 1991.
[40] P. Florchinger and F. Le Gland, Time Discretization of the Zakai Equation for Diffusion Processes Observed in Correlated Noise, Stochastics and Stochastics Reports, 35, pp. 233-256, 1991.
[41] A. Friedman, Stochastic Differential Equations and Applications, 1, Academic Press, New York, San Francisco, London, 1975.
[42] A. Friedman, Stochastic Differential Equations and Applications, 2, Academic Press, New York, San Francisco, London, 1976.
[43] A. Friedman, Partial Differential Equations of Parabolic Type, Prentice Hall, Inc., 1964.
[44] A. Germani (ed.), Stochastic Modelling and Filtering, Proc. IFIP-WG 7/1 Working Conference, Rome, Italy, Dec. 10-14, 1984, Lect. Notes in Control and Information Sciences, 91.
[45] J.C. Geromel, Optimal Linear Filtering Under Parameter Uncertainty, manuscript, LAC-DT, School of Electrical Engineering, UNICAMP, CP610, 13081-970 Campinas, SP, Brazil, 1997.
[46] I.I. Gihman and A.V. Skorokhod, Stochastic Differential Equations, Springer-Verlag, New York, Heidelberg, Berlin, 1972.
[47] I.I. Gihman and A.V. Skorokhod, The Theory of Stochastic Processes III, Springer-Verlag, Heidelberg, Berlin, New York, 1979.
[48] F. Le Gland, Systematic Numerical Experiments in Nonlinear Filtering with Automatic Fortran Code Generation, Proceedings of the 25th CDC, Greece, pp. 638-642, 1986.
[49] P.R. Halmos, Measure Theory, D. Van Nostrand Company, Inc., Princeton, New Jersey, Toronto, New York, London, 1964.
[50] M. Hazewinkel and J.C. Willems (eds.), Stochastic Systems: The Mathematics of Filtering and Identification and Applications, Proc. NATO Advanced Study Institute, Les Arcs, Savoie, France (June 22-July 5, 1980), D. Reidel Publishing Company, Dordrecht, Boston, London.
[51] A.H. Jazwinski, Stochastic Processes and Filtering Theory, Academic Press, New York, London, 1970.
[51] G. Kallianpur, Stochastic Filtering Theory, Springer-Verlag, Heidelberg, Berlin, New York, 1980.
[52] R.E. Kalman and R.S. Bucy, New Results in Linear Filtering and Prediction Theory, Trans. ASME, Ser. D: J. Basic Eng., 83, pp. 95-108, 1961.
[53] H. Kunita, The Stability and Approximation Problems in Nonlinear Filtering Theory, in Stochastic Analysis, Academic Press, Inc., pp. 311-329, 1991.
[54] H.J. Kushner, On the Dynamical Equations of Conditional Probability Density Functions with Application to Optimal Stochastic Control Theory, J. Math. Analys. Appl., 8, pp. 332-344, 1964.
[55] H.J. Kushner, On the Differential Equations Satisfied by Conditional Probability Densities of Markov Processes, SIAM J. Contr., 2, pp. 106-119, 1964.
[56] H.J. Kushner, Dynamical Equations for Optimal Nonlinear Filtering, J. Diff. Equations, 3, pp. 179-190, 1967.
[57] R. Katzur, B.Z. Bobrovsky and Z. Schuss, Asymptotic Analysis of the Optimal Filtering Problem for One Dimensional Diffusions Measured in a Low Noise Channel, Part I, SIAM J. Appl. Math., 44, 3, pp. 594-604, 1984.
[58] R. Katzur, B.Z. Bobrovsky and Z. Schuss, Asymptotic Analysis of the Optimal Filtering Problem for One Dimensional Diffusions Measured in a Low Noise Channel, Part II, SIAM J. Appl. Math., 44, 6, pp. 1176-1191, 1984.
[59] P.E. Kloeden and E. Platen, Numerical Solution of Stochastic Differential Equations, Springer-Verlag, Heidelberg, Berlin, New York, 1992.
[60] O.A. Ladyzenskaja, V.A. Solonnikov and N.N. Uralceva, Linear and Quasilinear Equations of Parabolic Type, English transl., Translations of Mathematical Monographs, 23, AMS, 1968.
[61] D.F. Liang, Exact and Approximate State Estimation Techniques for Nonlinear Dynamical Systems, in Advances in Contr. and Dynamic Systems, Academic Press, New York, London, 19, pp. 1-71, 1983.
[62] R.S. Liptser and A.N. Shiryayev, Statistics of Random Processes I and II, Springer-Verlag, Heidelberg, Berlin, New York, 1978.
[63] H.J. Larson and B.O. Shubert, Probabilistic Models in Engineering Sciences, I, II, John Wiley and Sons, Inc., 1979.
[64] G.N. Milstein, Approximate Integration of Stochastic Differential Equations, Theory Prob. Appl., 19, 1974.
[65] E. Pardoux, Nonlinear Filtering, Prediction and Smoothing, in Stochastic Systems: The Mathematics of Filtering and Identification and Applications (NATO Advanced Study Institute Series), Reidel, Dordrecht, pp. 529-577, 1980.
[66] E. Pardoux, Stochastic Partial Differential Equations and Filtering of Diffusion Processes, Stochastics, 3, 2, pp. 127-167, 1979.
[67] K.R. Parthasarathy, Probability Measures on Metric Spaces, Academic Press, New York, London, 1967.
[68] O. Perez, W. Colmenares, E. Granado and F. Tadeo, Robust Multimodel Control of a Neutralization Process, Proc. ICECS'97, pp. 1335-1340, 1997.
[69] J. Picard, An Estimate of the Error in Time Discretization of Nonlinear Filtering Problems, in Theory and Applications of Nonlinear Control Systems, North-Holland, pp. 401-412, 1986.
[70] A.V. Skorokhod, Studies in the Theory of Random Processes (Eng. trans.), Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1965.
[71] D.W. Stroock and S.R.S. Varadhan, Multidimensional Diffusion Processes, Springer-Verlag, Berlin, Heidelberg, New York, 1979.
[72] R. Temam, Infinite Dimensional Systems in Mechanics and Physics, Springer-Verlag, New York, Berlin, Heidelberg, 1980.
[73] K.L. Teo, N.U. Ahmed and M.E. Fisher, Optimal Feedback Control for Linear Stochastic Systems Driven by Counting Processes, Journal of Engineering Optimization, 15, 1, pp. 1-16, 1990.
[74] E. Wong and M. Zakai, On the Convergence of Ordinary Integrals to Stochastic Integrals, Ann. Math. Statist., 36, pp. 1560-1564, 1965.
[75] W.M. Wonham, Random Differential Equations in Control Theory, in Probabilistic Methods in Applied Mathematics, 2 (ed. A.T. Bharucha-Reid), Academic Press, New York, London, 1970.
[76] O. Zeitouni and B.Z. Bobrovsky, On the Reference Probability Approach to the Equations of Nonlinear Filtering, Stochastics, 19, pp. 133-149, 1986.
[77] M. Zakai, On the Optimal Filtering of Diffusion Processes, Z. Wahrsch. Verw. Geb., 11, pp. 230-243, 1969.
[78] G. Da Prato and J. Zabczyk, Stochastic Equations in Infinite Dimensions, Encyclopedia of Mathematics and its Applications, 44, Cambridge University Press, 1992.
[79] P.S. Maybeck, Stochastic Models, Estimation and Control, Vol. 1, Academic Press, New York, San Francisco, London, 1979.
[80] J. Picard, Asymptotic Study of Estimation Problems with Small Observation Noise, in Stochastic Modelling and Filtering, Proc. IFIP-WG 7/1, Rome, Italy, 1984, Springer Lect. Notes in Control and Inf. Sc., Vol. 91, Springer-Verlag, Berlin, Heidelberg, New York, 1987.
[81] J. Golec and G. Ladde, On an Approximation Method for a Class of Stochastic Singularly Perturbed Systems, Dynamic Systems and Applications, Vol. 2, pp. 11-20, 1993.
[82] J. Golec, Sample Path Approximation for a Class of Stochastic Systems, Dynamic Systems and Applications, Vol. 5, pp. 569-581, 1996.
[83] S.K. Biswas and M.B. Subrahmanyam, Worst-Case Estimation of Unknown Sinusoids Contained in Corrupted Measurement Data, American Control Conference, Philadelphia, June 24-26, 1998.
[84] K. Ito, On Stochastic Differential Equations, Memoirs of the American Mathematical Society, No. 4, American Mathematical Society, Providence, Rhode Island, 1951.
[85] M.G. Crandall and P.L. Lions, Viscosity Solutions of Hamilton-Jacobi Equations, Transactions A.M.S., 277, pp. 1-42, 1984.
[86] N.U. Ahmed and X. Ding, Controlled McKean-Vlasov Equations and Viscosity Solutions, Communications in Applied Analysis (to appear, 1998).
[87] T.E. Dabbous, N.U. Ahmed and S.S. Lim, Linear Filtering for a Class of Jump Processes Arising in Navigation Systems, IMA Journal of Mathematical Control and Information, Vol. 7, pp. 269-292, 1990.
[88] J.K. Tugnait, Continuous-Time System Identification on Compact Parameter Sets, IEEE Trans. on Information Theory, IT-31, pp. 652-659, 1985.
[89] S.P. Sethi, Optimal Consumption-Investment Decisions Allowing for Bankruptcy: A Survey, ICOTA-98, Workshop on Optimization: Techniques and Applications, Curtin University of Technology, Western Australia, June 29-30, 1998.
18. Index

Absolute Continuity 5, 7, 51, 66, 125, 169, 171, 190
Adapted 10, 14, 21, 28, 57, 74, 156, 171, 182, 214, 222
Almost sure continuity 169
Borel sets 2, 12, 113
Borel-Cantelli Lemma 8
Brownian motion 10, 25, 33, 48, 66, 92, 118, 151, 167, 213, 236
Backward Kolmogorov equation 39, 41, 46, 214
Cameron-Martin-Girsanov formula 47, 51, 169
Canonical sample space 169
Cauchy inequality 190, 240, 242
Correlated noise 85, 95, 105, 153
Chapman-Kolmogorov equation 40, 44
Conditional expectation 5, 57, 93, 103, 152, 156
Conditional probability 7, 168, 179, 209, 220
Continuity of stochastic processes 9, 18, 44
Continuous martingales 12, 117
Covariance operator 89, 93, 99, 103, 110, 142
Differential Riccati equation 62, 72, 89, 129, 158, 181, 204
Diffusion process 40
Doob's martingale inequality 12
Dynamic programming equation 207, 208, 214
Equivalence of random variables 5; of stochastic processes 9
Error covariance 79, 82
Existence of solutions: SDE 30; filtering problems 125; Zakai equation 199
Feller process 44
Feller semigroup 44
Forward Kolmogorov equation 43, 45
Filtering problem 55, 56, 121, 167
Filter theory 55, 155, 167; linear 55, 69, 77, 85, 95, 105, 113, 121, 141; nonlinear 155, 167, 187
Filtration (filtered probability space) 10
Gaussian random variable 3, 4; random process 11, 210
Gateaux derivative 62, 80, 88, 98
Gain constraints 121
Games theory 131
Girsanov theorem 46
Gronwall inequality (Lemma) 36
HJB equation 208, 214, 225
Identification 227; linear systems 227, 234; nonlinear systems 227, 237
Information sigma algebra 57, 204
Integro-differential equations 64, 145
Ito differential 34
Ito integral 13, 14
Ito-Stratonovich SDE 32, 38
Infinitesimal (differential) generator 42, 43
Innovation process 92, 102, 111
Jump process 9, 13, 113 (see Poisson process)
Kalman filtering 55, 69
Kolmogorov's equations: backward 41; forward 43
Kolmogorov's continuity condition 9
Lebesgue dominated convergence 8, 44
Linear quadratic Gaussian regulator 210
Lp martingale 12
Markov property 10
Markov process 9, 10
Markov semigroup 44
Martingales 12; sub martingale 12; super martingale 12
Discrete Kalman filtering 77
Measurable functions 2, 9
Modes of convergence 4, 5
Measures 2, 4, 45, 48, 51, 113, 169, 176, 187
Measurement process 56, 69, 78
Moments 4, 182
Multiple Wiener integrals 15
Nonanticipative process 14
Nonlinear filtering 155, 167
Observation (measurement) process 56, 69
Optimum filter 55; linear 58, 72, 83, 90, 100, 110, 129, 141; nonlinear 155, 167
Optimal control 203
Optimum gain 80, 83
Partially observed 203; control 203; identification 227
Probability space 1
Poisson process 9, 13, 21, 113
Prediction 68, 186
Probability measure 2
Quadratic variation process 19
Radon-Nikodym derivative (RND) 6, 169
Random variables 2
Riccati differential equations 63, 71, 74, 88
Right continuity 169
Robust filtering 131
Sample path 11, 170
Sample space 1, 169
Semigroups (see Markov, Feller) 44
Stochastic process 9
Stochastic differential equations (SDE) 25, 28, 36
Stochastic integrals 14, 21, 22
Strong solutions 29
Signal process 66
System and measurement dynamics 55, 56, 69, 78, 85, 95, 105, 113, 121, 141, 155
Time optimal control 52
Trajectory 113
Transition kernel 39
Terminal cost 225
Uncorrelated noise 90
Uniqueness of solutions 30
Uniform integrability 53
Value function 206
Viscosity solution 225
Weak solution 51
White noise 11, 150
Wiener process (martingale) 9, 10, 14