The Vector-Valued Maximin
This is volume 193 in MATHEMATICS IN SCIENCE AND ENGINEERING
Edited by William F. Ames, Georgia Institute of Technology
A list of recent titles in this series appears at the end of this volume.
V. I. Zhukovskiy
Institute of Textile and Light Industry, Moscow, Russia
M. E. Salukvadze
Institute of Control Systems, Georgian Academy of Sciences, Tbilisi, Republic of Georgia
ACADEMIC PRESS, INC.
Harcourt Brace & Company, Publishers
Boston San Diego New York London Sydney Tokyo Toronto
This book is printed on acid-free paper.
Copyright © 1994 by Academic Press, Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
ACADEMIC PRESS, INC.
1250 Sixth Avenue, San Diego, CA 92101-4311
United Kingdom Edition published by ACADEMIC PRESS LIMITED
24-28 Oval Road, London NW1 7DX
Library of Congress Cataloging-in-Publication Data
Zhukovskii, Vladislav Iosifovich.
The vector-valued maximin / V.I. Zhukovskiy, M.E. Salukvadze.
p. cm. - (Mathematics in science and engineering ; v. 193)
Includes bibliographical references and index.
ISBN 0-12-779950-8 (acid-free)
1. Differential games. 2. Mathematical optimization. I. Salukvadze, M. E. (Mindiia Evgen’evich) II. Title. III. Series.
QA272.Z48 1993 519.3-dc20 93-9390 CIP
Printed in the United States of America
93 94 95 96 EB 9 8 7 6 5 4 3 2 1
Contents
Preface
Notation
Abstract
Chapter 1 Quasimotions and Their Properties
1. Reference Information
1.1. Control System
1.2. Motion Properties
1.3. Kononenko’s Counter-Example
2. Piecewise-Continuous Stepwise Quasimotion
2.1. Subbotin’s Counter-Example
2.2. Stepwise Quasimotion
2.3. Properties of Quasimotions
2.4. Completeness of a Quasimotion Bunch
3. The Alternative and the Saddle Point
3.1. Useful Information
3.2. Proof of the Alternative
3.3. Minimax, Maximin, and Saddle Point
3.4. Saddle Point Properties
4. Corollaries of the Alternative
4.1. Additional Proposition
4.2. Corollaries
4.3. A Property of the Linear-Quadratic Problem
Chapter 2 Slater Optimality
1. Slater-Maximal Strategy
1.1. Formalization of Multicriterial Problem
1.2. Definition of Slater-Maximal Strategy
1.3. Properties of Slater-Maximal Strategy
1.4. Stability
1.5. Properties of Inheritance and Rejection
2. Sufficient Conditions
2.1. Conditions
2.2. Corollaries of Theorem 2.1
2.3. A Universal Slater-Maximal Strategy
3. Structure in the Case of Slater Optimality
3.1. Description of Structure
3.2. Structure
3.3. Comparison With Another Definition of Problem (1.1) Solution
3.4. Example. Multicriterial Dynamic Problem Without Slater-Maximal Strategy
Chapter 3 Pareto Optimality
1. Pareto-Optimal Strategy
1.1. Definition and Geometric Interpretation
1.2. Properties of the Pareto-Maximal Strategy
1.3. Stability
2. Relations Between the Sets $\mathcal{V}^P$ and $\mathcal{V}^S$
2.1. Preliminary Remarks
2.2. Quasiconcave Multicriterial Problems
2.3. Remark
3. Structure in the Case of Pareto Optimality
3.1. Description
3.2. Structure
3.3. Existence
3.4. Specific Features of Pareto-Maximal Strategies
4. Sufficient Conditions
4.1. Auxiliary Propositions
4.2. Sufficient Conditions
4.3. Corollaries
5. A Linear Quadratic Multicriterial Problem
5.1. Problem Statement
5.2. An Auxiliary Optimal Control Problem
5.3. Formal Procedure for Obtaining $\mathcal{V}^P$
5.4. Theoretical Basis of Algorithm
5.5. Exact Solution of System (5.22), (5.23)
6. Comparison to P1-Optimality
6.1. Definition
6.2. Properties of P1-Maximal Strategy
6.3. Uniqueness of Values of the Goal Functional Vector
6.4. Relationship Between $\mathcal{V}^P$ and $\mathcal{V}^{P1}$
Chapter 4 Geoffrion Optimality
1. Geoffrion-Maximal Strategy
1.1. Definition
1.2. Properties
1.3. Structure
1.4. External and Dynamic Stability
2. Necessary and Sufficient Conditions
2.1. Auxiliary Propositions
2.2. Necessary Conditions
2.3. Sufficient Conditions
3. A-optimality
3.1. Compactness of the Set Xs
3.2. Formalization of the Relation $>_A$ and its Properties
3.3. A-optimal Strategies of Problem (1.1) in Chapter 2
Chapter 5 Vector-Valued Saddle Points
1. Definition
1.1. Why Saddle Points in Differential Games?
1.2. Formalization of Vector-Valued Saddle Points and Some of Their Properties
1.3. Geometric Interpretation
2. Properties of Saddle Points
2.1. Existence of A-saddle Point
2.2. Dynamic Stability
2.3. Compactness
3. Invariance of Vector-Valued Saddle Points
3.1. Affine Transformations
3.2. Addition of Criteria
3.3. Use of Increasing Functions
4. Sufficient Conditions
4.1. Examples of Functions Increasing over $>$ and over $\ge$ on $\mathbb{R}^N$
4.2. Slater Saddle Points
4.3. Pareto Saddle Points
4.4. Geoffrion Saddle Points
4.5. A Linear Quadratic Game
4.6. A-saddle Points
4.7. Specific Features of Vector-Valued Saddle Points
Chapter 6 Vector-Valued Guarantees
1. Vector-Valued Maximin and Minimax
1.1. Formalization of Vector-Valued Maximin
1.2. Relationship to Scalar Maximin
1.3. Geometric Interpretation
1.4. Why is the Maximin Better than the Vector-Valued Saddle Point?
2. Existence of Vector-Valued Maximins
2.1. Existence of Slater Maximin
2.2. Topological Properties of Slater Maximin
2.3. Stability of the Set of Slater Maximins
2.4. Relation Between Maximins and Vector-Valued Saddle Points
2.5. Vector-Valued Maximin as Solution of a Differential Game
3. Pareto ε-maximin
3.1. Formalization of ε-saddle Points in Static Games
3.2. Properties of ε-saddle Points of the Game (3.1)
3.3. Existence of a Pareto ε-maximin in the Differential Game (1.1)
Chapter 7 The Competition Problem
1. Mathematical Model of Competition
1.1. A Model of a Special System
1.2. A Model of Competition
1.3. Optimal Decision-Making in Competition Problems
1.4. A Geometric Interpretation of the ZS-Solution
1.5. Procedure for the Construction of a ZS-Solution
2. A Game With Separable Payoff Function
2.1. Problem Statement
2.2. Vector-Valued Saddle Points
2.3. Vector-Valued Maximin and Minimax
2.4. Existence of ZS-Solutions With N = 2
3. The ZS-Solution in the Competition Problem
3.1. A Game With Mirror Payoff Function (Saddle Points)
3.2. Vector-Valued Maximin and Minimax of the Game (3.1)
3.3. Obtaining ZS-Solutions in the Competition Problem
4. Model of Competing Research Activities
4.1. Single-Firm Model
4.2. Game-Theoretic Model of Competition
4.3. Pareto-Optimal Controls
4.4. Decision Making in the Competition Model
Chapter 8 A Pursuit Game With Noise
1. Statement of Problem
1.1. Cooperative Pursuit Game with Negligible Noise
1.2. Recognition of Noise (Uncertainties) in Problem (1.3)
2. Pursuit of Two Target Points, One of Which Has Uncertain Location
2.1. Problem Statement and Essential Results
2.2. Pareto Minimaxes
2.3. Pareto Saddle Points
3. Pursuit of Two Target Points
3.1. The Problem
3.2. Auxiliary Information
3.3. Solution of the Pursuit Game
4. A Three-Criterial Pursuit Problem
4.1. Problem Statement
4.2. An Auxiliary Proposition
4.3. Pareto Saddle Points of the Game (4.5)
Appendix 1 Concepts from Topology
A1.1. The Topological Space
A1.2. Metric Spaces
A1.3. Convergence (in the Topological Sense)
A1.4. The Space {2"}
Appendix 2 Upper Semicontinuous Multivalent Mappings
Appendix 3 Auxiliary Propositions from the Theory of Multicriterial Problems
Appendix 4 Vector-Valued Maximins in Static Problems
A4.1. Slater Maximin
A4.2. Other Notions of Vector-Valued Maximins
References
Author Index
Subject Index
Preface
As technology develops, control systems become increasingly complicated and must satisfy diverse requirements. In particular, today’s systems must be ecologically harmless and respond to changing social conditions. These are just a few of the reasons that have turned every decision-making process into a multicriterial problem. In real life, control systems are subject to disturbances, noise, or other uncertainties that do not necessarily yield to statistical analysis. The plethora of criteria and uncertainties is the primary reason why differential games with vector-valued criteria have become an essential research tool. On the other hand, differential games of power are solved by an approach that was originally devised by Isaacs [35] and is erroneous from our point of view. In this approach an initial multicriterial problem is replaced by a single-criterion problem: the criteria are combined into a linear convolution with positive coefficients, and a saddle point of this scalar criterion is taken as the solution. But is this saddle point relevant to the initial multicriterial problem? This question is generally not discussed in works on differential games. As a consequence, the solution of a differential game may overlook its multicriterial nature, and what is solved is a single-criterion problem with a convolution criterion. Further, the claim is made, without any proof, that the solution of the single-criterion problem is the solution of the initial multicriterial problem. For instance, the payoff function widely used in differential game theory, the sum of terminal and integral parts, is in many problems a convolution of two criteria, one terminal (for example, the requirement that the process be close to a desired fixed position at the end of the motion) and the other integral (for example, the requirement that the motion consume as little energy as possible). The situation in the theory of zero-sum games is, in our view, summarized by Norbert Wiener, who stated that real freedom is the freedom to do nothing.
What sense, then, does the saddle point have for an initial differential game with vector-valued payoff? This solution turns out to be a Geoffrion saddle
point whose definition combines that of the saddle point from game theory and the concept of proper efficiency from the theory of multicriterial problems. This vector-valued saddle point has proved to be a poor solution of a game with a vector-valued payoff, for there can exist two Geoffrion saddle points such that the values of all the components at one of them are strictly larger than at the other (thus, equivalence and internal stability of the vector-valued saddle point are absent). In this case the maximizing player naturally tries to arrive at the saddle point where the components of the payoff function assume larger values and, conversely, the minimizing player prefers the point at which the values of all the components are smaller. The vector-valued saddle points are not interchangeable, inasmuch as strategies of the players taken from distinct saddle points need not form a vector-valued saddle point. Because the game is zero-sum, deals between the players about the choice of a concrete saddle point are ruled out. Consequently, among all the vector-valued saddle points there is none that would satisfy both players simultaneously. It follows that the solution of differential games with vector-valued payoff function needs another definition. In this book such a solution is proposed, which we call the vector-valued maximin. Our main object is to describe its properties and prove its existence under the constraints common to positional games. Mathematically, the reasoning stays within the theory of positional differential zero-sum games with scalar payoff function, a theory developed by the Urals scientific school headed by N. N. Krasovskiy. Chapter 1 describes a modified motion (referred to as a quasimotion) of a dynamic system, motivated by a counter-example put forward by A. I. Subbotin.
Unlike the conventional approach [43], stepwise quasimotions permit a finite number of discontinuities of the first kind (they need not be continuous, as stepwise motions are). The subsequent three chapters discuss four kinds of optimality in multicriterial positional dynamic games: Slater, Pareto, Geoffrion and A-optimality analogs. For every fixed strategy the criteria are multivalent, since a fixed strategy gives rise to nonunique quasimotions. This multivalence is taken into account when optimality is defined, and is presumed in discussing the properties of optimal strategies. Chapters 2-4 lay the groundwork for Chapter 5, on the positive and negative properties of vector-valued saddle points; there the need for a new definition of the solution of differential games with vector-valued payoff function is demonstrated. Such a solution, a vector-valued maximin, is described in Chapter 6, where
the existence of a Slater maximin strategy and of a Pareto ε-maximin is established. The internal stability of the set of vector-valued maximins, their relation to vector-valued saddle points, and certain other properties are determined. Finally, a vector-valued saddle point with equal corresponding vector-valued maximin and minimax is proposed as a “good” solution (ZS-solution) of a differential game with vector-valued payoff function. The final two chapters examine the problem of competition (Chapter 7) and the game of approaching points for which only a possible domain is known (Chapter 8). In the competition problem, it is the structure of an optimal solution (initial position of vector-valued maximins, minimaxes and the values of payoffs in the saddle points) which is of interest. In certain cases the existence of a ZS-solution is ascertained and efficient methods of constructing it are developed. The analysis offers only certain lines of research in the theory of differential games with vector-valued payoff. The goal is to invite the reader to review the established theory of zero-sum differential games. We are deeply convinced that games of degree, in particular those where the payoff function is the sum of integral and terminal parts, ought to be solved in a different way, by finding the vector-valued maximin (or minimax) rather than the saddle point of this sum. Though we will most probably be under fire for attacking the fundamentals of differential game theory, we recall the words of the American humorist Essar: “Don’t hate your enemies, they may become your partners because the adversary seeking your mistakes is better than your friend who tries to conceal them”. Finally, the authors wish to thank Professor George Leitmann for his contribution and support of this book.
Vladislav Zhukovskiy, Moscow, July 1991
Mindia Salukvadze, Tbilisi, July 1991
Notation
Many common and special symbols used in the book are omitted or are referenced where they are used the first time. The latter are either explained when introduced or referenced where needed.
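$\Rightarrow$ implication "from ... it follows that ...";
$\Leftrightarrow$ equivalence;
$\mid$ such that;
$A \subset B$ set $A$ belongs to set $B$ (including the case of $A = B$);
$A \not\subset B$ negation of $A \subset B$;
$A + B = \{c \mid c = a + b,\ a \in A,\ b \in B\}$;
$A - B = \{c \mid c = a - b,\ a \in A,\ b \in B\}$;
$A \setminus B = \{c \mid c \in A \text{ and } c \notin B\}$;
$\mathbb{R}^1$ a set of real numbers;
$\mathbb{R}^m$ Euclidean $m$-dimensional space with norm $\|\cdot\|$;
$0_m$ $m$-dimensional vector with zero components;
$0_{m \times m}$ $(m \times m)$-dimensional matrix with zero elements;
$\mathbf{N} = \{1, 2, \ldots, N\}$;
$F = (F_1, \ldots, F_N)$;
$F^{(1)} \geqq F^{(2)} \Leftrightarrow F_i^{(1)} \ge F_i^{(2)},\ i \in \mathbf{N}$;
$F^{(1)} > F^{(2)} \Leftrightarrow F_i^{(1)} > F_i^{(2)},\ i \in \mathbf{N}$;
$F^{(1)} \not> F^{(2)}$ negation of $F^{(1)} > F^{(2)}$;
$F^{(1)} \ge F^{(2)} \Leftrightarrow F^{(1)} \geqq F^{(2)}$ and $F^{(1)} \ne F^{(2)}$;
$F^{(1)} \not\ge F^{(2)}$ negation of $F^{(1)} \ge F^{(2)}$;
$\mathbb{R}^N_{>} = \{F \mid F_i > 0,\ i \in \mathbf{N}\}$;
$\mathbb{R}^N_{<} = \{F \mid F_i < 0,\ i \in \mathbf{N}\}$;
$F(X) = \bigcup_{x \in X} F(x)$;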
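$\mathcal{U} = \{U \div u(t, x) \mid u(t, x) \in H \in \operatorname{comp} \mathbb{R}^h\}$ set of first player's strategies;
$\mathcal{V} = \{V \div v(t, x) \mid v(t, x) \in Q \in \operatorname{comp} \mathbb{R}^q\}$ set of second player's strategies;
$\mathcal{X}[t_0, x_0, V]$ bunch (bundle) of quasimotions $x[\cdot, t_0, x_0, V] = \{x[t, t_0, x_0, V],\ t_0 \le t \le \theta\}$;
$X[\theta, V] = X[\theta, t_0, x_0, V] = \bigcup_{x[\cdot]} x[\theta, t_0, x_0, V] = \mathcal{X}[t_0, x_0, V] \cap \{t = \theta\}$;
$X[H] = \bigcup_{U \in \mathcal{U}} X[\theta, t_0, x_0, U] = X[\theta, t_0, x_0, U \div H]$;
$X[U \div H, V \div Q] = \bigcup_{U \in \mathcal{U}} \bigcup_{V \in \mathcal{V}} x[\theta, t_0, x_0, U, V]$ reachability domain;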
$\Gamma(V) = \langle X[\theta, V], F(x) \rangle$; $\Gamma(U) = \langle X[\theta, U], F(x) \rangle$;
With $K = S$ Slater, $K = P$ Pareto, $K = G$ Geoffrion, $K = A$ A-optimality:
$X_{(K)}[V]$ set of $K$-minimal solutions $x_K[V]$ of the problem $\Gamma(V)$;
$X^{(K)}[U]$ set of $K$-maximal solutions $x^K[U]$ of the problem $\Gamma(U)$;
$\min^K_{x[\cdot]} F(x[\theta, t_0, x_0, V]) = F(x_K[V])$;
$\max^K_{x[\cdot]} F(x[\theta, t_0, x_0, U]) = F(x^K[U])$;
$\operatorname{Fr}_{(K)} F(X[\theta, V]) = F(X_{(K)}[V]) = \bigcup_{x \in X_{(K)}[V]} F(x)$;
$\operatorname{Fr}^{(K)} F(X[\theta, U]) = F(X^{(K)}[U]) = \bigcup_{x \in X^{(K)}[U]} F(x)$;
$\mathcal{V}^{(K)}$ set of $K$-maximin strategies $V^{(K)}$;
$\mathcal{U}^{(K)}$ set of $K$-minimax strategies $U^{(K)}$;
$(U^K, V^K)$ saddle point for $K$;
$F(X_{(K)}[V^{(K)}])$ maximin for $K$;
$F(X^{(K)}[U^{(K)}])$ minimax for $K$.
The set of saddle points is $\mathcal{S}$ Slater, $\mathcal{P}$ Pareto, $\mathcal{G}$ Geoffrion, $\mathcal{A}$ A.
$\mathcal{D} > 0$ ($\mathcal{D} < 0$) quadratic form $z' \mathcal{D} z$ is positive-(negative-) definite.
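As an informal illustration of the vector orderings in this list, the following Python sketch implements the three comparisons and uses them to filter a finite set of criterion vectors. The function names and the sample points are hypothetical and serve only this illustration; the two filters follow the usual reading of Slater (weak) and Pareto maximality on a finite set, which is assumed here rather than taken from the text.

```python
def weakly_geq(f1, f2):
    """Componentwise F1 >= F2 in every criterion (equality allowed)."""
    return all(a >= b for a, b in zip(f1, f2))


def strictly_greater(f1, f2):
    """F1 > F2: strictly greater in every criterion."""
    return all(a > b for a, b in zip(f1, f2))


def geq_not_equal(f1, f2):
    """F1 >= F2 in every criterion and F1 != F2 as vectors."""
    return weakly_geq(f1, f2) and tuple(f1) != tuple(f2)


def slater_maximal(points):
    """Points that no other point beats strictly in every criterion."""
    return [p for p in points
            if not any(strictly_greater(q, p) for q in points)]


def pareto_maximal(points):
    """Points that no other point dominates (>= everywhere, != somewhere)."""
    return [p for p in points
            if not any(geq_not_equal(q, p) for q in points)]


# Hypothetical criterion vectors with N = 2.
sample = [(1, 3), (2, 3), (3, 1), (0, 0)]
slater_set = slater_maximal(sample)
pareto_set = pareto_maximal(sample)
```

On this sample the point (1, 3) is kept by the Slater filter but removed by the Pareto filter, because (2, 3) improves one criterion without strictly improving both.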
Abstract
The operation of each of two conflicting systems in differential antagonistic (two-person zero-sum) games is usually estimated by a set of criteria. For example, in a pursuit problem a pursuer tries not only to approach an escaping object (first criterion), but also to travel with minimum energy expenditure (second criterion). Though there may be several criteria, investigators consider only one (equal to the sum of all criteria with positive “weight” coefficients) and treat a saddle point of such a scalar criterion as the solution of the differential game. No rigorous proof of such a reduction of the initial multicriterion problem is given, however. The accepted solution (saddle point) possesses at least two negative properties:
(1) Absence of equivalence: there may exist two saddle points (with different weight coefficients), at one of which the values of all the criteria are greater than at the other. Therefore, the “maximizing” player strives to use the first saddle point and the other player, the second. The players cannot arrange a joint choice, because the game is antagonistic.
(2) Absence of interchangeability: if each player takes “his” strategy from the saddle point that is “good” for him, the resulting pair of strategies may not form a saddle point.
Thus, the generally accepted approach in the theory of differential games does not lead to a guaranteed result for each player separately. In the present monograph a new approach that overcomes properties (1) and (2) is given, based on the concept of a vector-valued guaranteed result. The first part of the book is dedicated to the static problem (one-step antagonistic game with vector-valued payoff function) and the second part to positional differential games with vector-valued payoff function. The investigations lie at the junction between the theories of multicriteria problems and differential games, both now under intensive development around the world.
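Properties (1) and (2) can already be observed in a very small static game. The Python sketch below uses a hypothetical 2 x 2 game with two criteria; the payoff table, the two weight vectors and all names are invented for this illustration and do not come from the book. Player 1 picks x and maximizes the weighted sum, player 2 picks y and minimizes it.

```python
from itertools import product

# Hypothetical vector payoffs F(x, y) = (F1, F2) of a 2 x 2 static game.
F = {(0, 0): (2, 2), (0, 1): (3, 0), (1, 0): (0, 3), (1, 1): (1, 1)}
X = (0, 1)  # strategies of the maximizing player
Y = (0, 1)  # strategies of the minimizing player


def scalarize(w, x, y):
    """Weighted sum of the criteria with positive weights w."""
    return sum(wi * fi for wi, fi in zip(w, F[(x, y)]))


def saddle_points(w):
    """Pure saddle points of the scalar game obtained with weights w."""
    found = []
    for x, y in product(X, Y):
        value = scalarize(w, x, y)
        no_better_x = all(scalarize(w, x2, y) <= value for x2 in X)
        no_better_y = all(value <= scalarize(w, x, y2) for y2 in Y)
        if no_better_x and no_better_y:
            found.append((x, y))
    return found


first = saddle_points((9, 1))   # weights favouring the first criterion
second = saddle_points((1, 9))  # weights favouring the second criterion
mixed = (first[0][0], second[0][1])  # x from one saddle point, y from the other
```

With these numbers the first weight vector gives the saddle point (0, 0) with payoff (2, 2) and the second gives (1, 1) with payoff (1, 1), so every criterion is larger at one saddle point than at the other. The mixed pair (0, 1) is a saddle point for neither weight vector.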
Chapter 1
Quasimotions and Their Properties
A new kind of dynamic system is introduced that differs from existing systems in that the associated stepwise motions can have discontinuities of the first kind at a finite number of partition points.
1. Reference Information
Results from the theory of positional differential games are reported (without proof) as a basis for the theoretical development that follows.

1.1. Control System

This subsection is concerned with a conflict-controlled dynamic system $\Sigma$ described by a system of ordinary differential equations

$$\dot{x} = f(t, x, u, v). \tag{1.1}$$

Here $x \in \mathbb{R}^n$ is the state vector; the time $t$ varies in a fixed range from $t_0 \ge 0$ to $\theta > t_0$. The control action of the first player is $u \in \mathbb{R}^h$, that of the second is $v \in \mathbb{R}^q$; the position is $(t, x) \in [t_0, \theta] \times \mathbb{R}^n$. Unless otherwise stated, it is assumed that the following holds:

Condition 1.1.1. The components of the vector-valued function $f(\cdot)$ are continuous over the totality of arguments; $u \in H \in \operatorname{comp} \mathbb{R}^h$, $v \in Q \in \operatorname{comp} \mathbb{R}^q$. For every bounded domain $G$ of the space of positions $(t, x)$ there is a constant $R(G) > 0$ such that

$$\|f(t^{(1)}, x^{(1)}, u, v) - f(t^{(2)}, x^{(2)}, u, v)\| \le R(G)\left(\|x^{(1)} - x^{(2)}\| + |t^{(1)} - t^{(2)}|\right)$$

for every $(t^{(j)}, x^{(j)}) \in G$ ($j = 1, 2$) uniformly with respect to $u \in H$, $v \in Q$. There is $\gamma = \text{const} > 0$ such that for every $t \in [t_0, \theta]$, $u \in H$, $v \in Q$ the following inequality is satisfied:

$$\|f(t, x, u, v)\| \le \gamma(1 + \|x\|). \tag{1.2}$$

Hereafter $\mathbb{R}^m$ is the Euclidean space of $m$-dimensional vectors $y = (y_1, \ldots, y_m)$ with norm $\|y\| = \left(\sum_{j=1}^{m} y_j^2\right)^{1/2}$, and $\operatorname{comp} \mathbb{R}^m$ is the set of compacta of $\mathbb{R}^m$.
If Condition 1.1.1 holds, the solutions $x(\cdot) = \{x(t),\ t_0 \le t \le \theta\}$ of the system (1.1) exist, are unique and extendible to the interval $[t_0, \theta]$ for

$$x(t_0) = x_0 \tag{1.3}$$

and for any Borel-measurable functions $u(t)$ and $v(t)$ such that

$$u(t) \in H, \quad v(t) \in Q \tag{1.4}$$

at every $t \in [t_0, \theta)$. At a fixed initial position $(t_0, x_0)$, every pair of measurable controls is associated with "its own" solution $x(\cdot)$ of (1.1).

The choice of control actions $u$ and $v$ is made by the first and second players, respectively, on the basis of feedback data. The strategies $U$ of the first player are identified with the mappings $u(t, x)$, which are defined at every possible position $(t, x)$ and satisfy only the constraint $u(t, x) \in H$. In particular, the entire compactum $u(t, x) = H$ can be used as a strategy. The relation between the strategy $U$ and the mapping $u(t, x)$ is denoted $U \div u(t, x)$, and the set of the first player's strategies by the symbol $\mathcal{U}$. Analogously, the second player's strategies $V$ are identified with the mappings $v(t, x)$ defined at every possible position $(t, x) \in [t_0, \theta) \times \mathbb{R}^n$ and satisfying the inclusion $v(t, x) \in Q$. This correspondence is denoted $V \div v(t, x)$, and the set of the second player's strategies by $\mathcal{V}$, or

$$\mathcal{V} = \{V \div v(t, x) \mid v(t, x) \in Q\}.$$
These strategies and the corresponding motions of the system (1.1) have been described in formal terms [43, p. 32; 83, p. 53]. To define a motion [43, p. 32-33] of the system (1.1) generated by a fixed strategy of the second player, fix the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ and the strategy $V \div v(t, x)$, and cover the closed interval $[t_0, \theta]$ by a system $\Delta$ of half-open intervals $\tau_j \le t < \tau_{j+1}$ ($j = 0, 1, \ldots, m(\Delta) - 1$), $\tau_0 = t_0$, $\tau_{m(\Delta)} = \theta$. Let $u(t) \in H$, $t_0 \le t < \theta$, be some Borel-measurable realization of the control action $u$ that evolves with time as a consequence of some reasoning made by the first player. The stepwise motion

$$x(\cdot, t_0, x_0, u(\cdot), V, \Delta) = \{x(t, t_0, x_0, u(\cdot), V, \Delta),\ t_0 \le t \le \theta\}$$

is the term applied to an absolutely continuous solution of the integral equation

$$x(t) = x(\tau_j) + \int_{\tau_j}^{t} f\bigl(\tau, x(\tau), u(\tau), v(\tau_j, x(\tau_j))\bigr)\, d\tau, \quad x(\tau_0) = x_0, \quad \tau_j \le t < \tau_{j+1} \quad (j = 0, 1, \ldots, m(\Delta) - 1), \tag{1.5}$$

where $x(t)$ abbreviates $x(t, t_0, x_0, u(\cdot), V, \Delta)$.
The existence, uniqueness and extendability to $[t_0, \theta]$ of such a stepwise motion $x(\cdot, t_0, x_0, u(\cdot), V, \Delta)$ follow from well-known theorems on differential equations [17]. Every specific partition $\Delta$ of the closed interval $[t_0, \theta]$ and realization of the control action $u(t)$, $t_0 \le t < \theta$, is associated with its own unique stepwise motion. Furthermore, the inequality (1.2) is sufficient for the stepwise motion to be extendable to the left, up to any time $t_* \in [0, t_0]$.

The motion

$$x[\cdot] = \{x[t, t_0, x_0, V],\ t_0 \le t \le \theta\}$$

of the system (1.1) from the initial position $(t_0, x_0)$ generated by the strategy $V \in \mathcal{V}$ is given [43, p. 33] by a function $x[t]$, $t_0 \le t \le \theta$, for which a sequence of stepwise motions

$$x(t, t_0, x^{(k)}, u^{(k)}(\cdot), V, \Delta^{(k)}), \quad t_0 \le t \le \theta, \tag{1.6}$$

may be found that converges uniformly to $x[t]$ over the closed interval $[t_0, \theta]$ under the constraint

$$\lim_{k \to \infty} \sup_j \left[\tau_{j+1}^{(k)} - \tau_j^{(k)}\right] = 0, \quad \lim_{k \to \infty} \|x^{(k)} - x_0\| = 0. \tag{1.7}$$
The fixed strategy $V \in \mathcal{V}$ and the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ generate, generally speaking, not a single motion but a set of motions, referred to as a bunch. Different motions of the bunch (denoted $\mathcal{X}[t_0, x_0, V]$) are obtained by distinct sequences of partitions $\Delta^{(k)}$, initial values of the state vector $x^{(k)}$, and realizations of the first player's control actions $u^{(k)}(t)$, $t_0 \le t < \theta$.

The construction of the motions $x[\cdot, t_0, x_0, U, V]$ generated from the initial position $(t_0, x_0)$ by a pair of fixed strategies $V \in \mathcal{V}$ and $U \in \mathcal{U}$, $U \div u(t, x)$ (the pair $(U, V) \in \mathcal{U} \times \mathcal{V}$ is referred to as a situation) differs only in that, when the stepwise motions $x(t, U, V, \Delta^{(k)})$, $t_0 \le t \le \theta$, in (1.5), (1.6) are constructed, at $t \in [\tau_j^{(k)}, \tau_{j+1}^{(k)})$ one uses not an arbitrary measurable function $u^{(k)}(t) \in H$ but $u^{(k)}(t) = u(\tau_j^{(k)}, x(\tau_j^{(k)}, U, V, \Delta^{(k)}))$. Specifically,

$$x(t, U, V, \Delta^{(k)}) = x(\tau_j^{(k)}, U, V, \Delta^{(k)}) + \int_{\tau_j^{(k)}}^{t} f\Bigl(\tau, x(\tau, U, V, \Delta^{(k)}), u\bigl(\tau_j^{(k)}, x(\tau_j^{(k)}, U, V, \Delta^{(k)})\bigr), v\bigl(\tau_j^{(k)}, x(\tau_j^{(k)}, U, V, \Delta^{(k)})\bigr)\Bigr)\, d\tau,$$

$$t \in [\tau_j^{(k)}, \tau_{j+1}^{(k)}), \quad x(t_0, U, V, \Delta^{(k)}) = x^{(k)} \quad (j = 0, 1, \ldots, m(\Delta^{(k)}) - 1).$$

Here, distinct motions from $\mathcal{X}[t_0, x_0, U, V]$ are obtained by using distinct sequences $\{\Delta^{(k)}\}$ and $\{x^{(k)}\}$.
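The construction of a stepwise motion can be sketched numerically. The Python fragment below is only an illustration under stated assumptions: the state is scalar, the partition is uniform, and the integral over each half-open interval is approximated by explicit Euler substeps, so the result approximates rather than equals the absolutely continuous solution. The names `stepwise_motion`, `v_feedback`, `u_realization` and the example dynamics are hypothetical.

```python
def stepwise_motion(f, v_feedback, u_realization, t0, x0, theta, m, substeps=50):
    """Approximate a stepwise motion on a uniform partition of [t0, theta].

    The second player's control is frozen at v(tau_j, x(tau_j)) on each
    half-open interval [tau_j, tau_{j+1}); the first player's control is a
    given realization u(t). Returns the list of (tau_j, x(tau_j)) pairs.
    """
    h = (theta - t0) / m
    x = x0
    nodes = [(t0, x)]
    for j in range(m):
        tau_j = t0 + j * h
        v_frozen = v_feedback(tau_j, x)  # sampled once per partition interval
        dt = h / substeps
        for i in range(substeps):
            s = tau_j + i * dt
            x = x + dt * f(s, x, u_realization(s), v_frozen)  # Euler step
        nodes.append((tau_j + h, x))
    return nodes


# Hypothetical example: dx/dt = u + v with u(t) = 0 and a switching feedback v.
def dynamics(t, x, u, v):
    return u + v


def switching_v(t, x):
    return -1.0 if x > 0 else 1.0


def zero_u(t):
    return 0.0


nodes = stepwise_motion(dynamics, switching_v, zero_u, 0.0, 0.0, 1.0, 4)
```

Because the feedback is sampled only at the partition points, the state in this example overshoots by up to one partition step before the control switches; refining the partition shrinks that overshoot, which is the limit transition used in the definition of a motion.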
1.2. Motion Properties

In [43, p. 36-38; 83, p. 54] the following properties are established.

Property 1.1. Motion bunches $\mathcal{X}[t_0, x_0, U]$, $\mathcal{X}[t_0, x_0, V]$ and $\mathcal{X}[t_0, x_0, U, V]$ form non-empty compacta in the space of continuous functions $C_n[t_0, \theta]$ with norm $\|x(\cdot)\|_{C_n[t_0, \theta]} = \max_{t_0 \le t \le \theta} \|x(t)\|$.

Property 1.2. $\mathcal{X}[t_0, x_0, U, V] \subset \mathcal{X}[t_0, x_0, U] \cap \mathcal{X}[t_0, x_0, V]$.

The principal result of positional differential game theory is the theorem of an alternative proved in [43, p. 68-69; 83, p. 8]. In one version, a set $M$ closed in $\mathbb{R}^n$ is given. Denote

$$X[\theta, t_0, x_0, U] = \mathcal{X}[t_0, x_0, U] \cap \{t = \theta\}, \quad X[\theta, t_0, x_0, V] = \mathcal{X}[t_0, x_0, V] \cap \{t = \theta\}.$$
Theorem 1.1. Assume that Condition 1.1 is satisfied and that for every $s \in \mathbb{R}^n$ and $(t, x) \in [0, \theta) \times \mathbb{R}^n$ we have

$$\max_{v \in Q} \min_{u \in H} s' f(t, x, u, v) = \min_{u \in H} \max_{v \in Q} s' f(t, x, u, v) \tag{1.8}$$

(saddle point condition for a small game). Then, whatever the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$,
(1) either there is a first player's strategy $U \in \mathcal{U}$ such that $X[\theta, t_0, x_0, U] \subset M$,
(2) or a number $\varepsilon > 0$ and a second player's strategy $V \in \mathcal{V}$ can be found such that

$$X[\theta, t_0, x_0, V] \cap M_\varepsilon = \emptyset,$$

where $M_\varepsilon = \{x \in \mathbb{R}^n \mid \|x - x^*\| \le \varepsilon,\ x^* \in M\}$.
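The saddle point condition (1.8) for the small game can be probed numerically on finite grids. The Python sketch below is an illustration only: it treats a scalar state, replaces the compacta H and Q by hypothetical finite grids, and the two sample right-hand sides are invented for the purpose. It returns the difference between the min-max and the max-min of $s f(t, x, u, v)$, which is zero exactly when the condition holds on the grid.

```python
def small_game_gap(f, s, t, x, H, Q):
    """Return (min over u of max over v) minus (max over v of min over u)
    of s * f(t, x, u, v), evaluated on the finite grids H and Q."""
    def h(u, v):
        return s * f(t, x, u, v)

    maximin = max(min(h(u, v) for u in H) for v in Q)
    minimax = min(max(h(u, v) for v in Q) for u in H)
    return minimax - maximin


# Hypothetical control grids standing in for the compacta H and Q.
H = [-1.0, -0.5, 0.0, 0.5, 1.0]
Q = [-1.0, -0.5, 0.0, 0.5, 1.0]


def separable(t, x, u, v):
    # The controls enter additively, so min and max can be interchanged.
    return x + u + v


def coupled(t, x, u, v):
    # The controls are coupled, and the interchange fails.
    return (u - v) ** 2


gap_separable = small_game_gap(separable, 2.0, 0.0, 0.5, H, Q)
gap_coupled = small_game_gap(coupled, 1.0, 0.0, 0.5, H, Q)
```

For the additive right-hand side the gap is zero, while for the coupled one the gap equals 1 on this grid, so (1.8) fails for it. A grid check of this kind cannot prove the condition for all $s$ and all positions; it can only expose violations.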
In the first case we say that the problem of encountering the set $M$ at time $\theta$ can be solved, in the second case that the problem of evading the set $M$ at time $\theta$ can be solved. The theorem of an alternative helps solve differential games of quality [35, p. 23], where “only two outcomes” are of interest (whether the evading point can be captured or whether the pursuer can be evaded). Quality games are opposed to games of power [35, p. 23]. Here the performance of the conflict-controlled system $\Sigma$ is evaluated using the value of a criterion-functional defined over the motions of this system. Let us limit ourselves to a criterion of terminal form, $F(x[\theta])$. We wish to consider a differential positional game

$$\langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, F(x[\theta]) \rangle, \tag{1.9}$$
where, together with Condition 1.1, the scalar function $F(x)$ is assumed to be continuous over $\mathbb{R}^n$; $\{1, 2\}$ are the ordinal numbers of the players. The solution of the game (1.9) with initial position $(t_0, x_0)$ is, from the point of view of the first player, the minimax strategy $U^0 \in \mathcal{U}$, i.e.,

$$\min_{U \in \mathcal{U}} \max_{x[\cdot]} F(x[\theta, t_0, x_0, U]) = \max_{x[\cdot]} F(x[\theta, t_0, x_0, U^0]) = F_*. \tag{1.10}$$

The solution of the same game from the point of view of the second player is the maximin strategy $V^0 \in \mathcal{V}$, i.e.,

$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} F(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} F(x[\theta, t_0, x_0, V^0]) = F^*. \tag{1.11}$$
It is well known [83, p. 241] that

$$F_* \ge F^*$$

and, if (1.8) is not satisfied, the inequality can be strict. If $F_* = F^*$, the pair of strategies $(U^0, V^0)$ is referred to as the saddle point of the game (1.9). The following propositions [116, p. 25-27] are true.

(1) The situation $(U^0, V^0)$ is the saddle point of (1.9) with initial position $(t_0, x_0)$ iff

$$F(x[\theta, t_0, x_0, V^0]) \ge F(x[\theta, t_0, x_0, U^0, V^0]) \ge F(x[\theta, t_0, x_0, U^0]) \tag{1.12}$$

for any motions $x[\cdot, t_0, x_0, V^0]$, $x[\cdot, t_0, x_0, U^0, V^0]$, $x[\cdot, t_0, x_0, U^0]$ of the system (1.1).
(2) The value $F(x[\theta, t_0, x_0, U^0, V^0])$ is unique; in other words, $F(x[\theta, t_0, x_0, U^0, V^0]) = \text{const}$ for every motion $x[\cdot, t_0, x_0, U^0, V^0]$.
(3) Any two saddle points $(U^{(1)}, V^{(1)})$ and $(U^{(2)}, V^{(2)})$ are interchangeable (that is, $(U^{(1)}, V^{(2)})$ and $(U^{(2)}, V^{(1)})$ are also saddle points) and equivalent, that is, $F(x[\theta, t_0, x_0, U^{(1)}, V^{(1)}]) = F(x[\theta, t_0, x_0, U^{(2)}, V^{(2)}])$.

The existence of a saddle point is established by using the alternative. The following theorem may be proved.

Theorem 1.2 [43, p. 76-77; 83, p. 106]. If Condition 1.1 and (1.8) are satisfied, then, whatever the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, there exists a saddle point $(U^0, V^0)$ in (1.9).
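A finite analog may help fix the roles of the two players. In the Python sketch below, which is a static illustration and not the differential game itself, the first player picks a row and minimizes, the second picks a column and maximizes, and the tables are hypothetical. It computes the analogs of $F_*$ and $F^*$ and lists the pure saddle points.

```python
def minimax_value(table):
    """Analog of F_*: the row player minimizes the worst-case payoff."""
    return min(max(row) for row in table)


def maximin_value(table):
    """Analog of F^*: the column player maximizes the guaranteed payoff."""
    columns = range(len(table[0]))
    return max(min(row[j] for row in table) for j in columns)


def saddle_points(table):
    """Pairs (i, j) whose entry is the largest in row i and smallest in column j."""
    columns = range(len(table[0]))
    return [(i, j)
            for i, row in enumerate(table)
            for j in columns
            if row[j] == max(row) and row[j] == min(r[j] for r in table)]


# Hypothetical payoff tables.
no_saddle = [[3, 1],
             [0, 2]]
with_saddles = [[1, 1, 0],
                [1, 1, 0],
                [2, 3, 5]]
```

In the first table the two values differ (2 against 1) and no pure saddle point exists. In the second they coincide, and the four saddle points (0, 0), (0, 1), (1, 0), (1, 1) are interchangeable and give the same payoff, in line with propositions (2) and (3) for the scalar game.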
1.3. Kononenko's Counter-Example

Motion in the sense discussed above has a distinct disadvantage: a segment of a motion need not itself be a motion.

Example 1.1 [38]. Assume that the control system from 1.1 has the form

$$\dot{x} = v, \quad 0 \le t \le 1, \quad x[0] = x_0 = 0, \quad |v| \le 1. \tag{1.13}$$

Consider the strategy

$$V \div v(t, x) = \begin{cases} +1 & \text{with } x > 0,\ t \in [0, 1), \\ -1 & \text{with } x < 0,\ t \in [0, 1), \\ 0 & \text{with } x = 0,\ t \in [0, \tfrac{1}{2}) \cup (\tfrac{1}{2}, 1), \\ +1 & \text{with } x = 0,\ t = \tfrac{1}{2}. \end{cases}$$
One of the motions of the system generated by the strategy $V$ from the position $(t_0, x_0) = (0, 0)$ is $x[t] = 0$, $0 \le t \le 1$, shown as a heavy line in Fig. 1.1.1. However, one part of this motion, $x^{(3)}[t] = 0$, $\tfrac{1}{2} \le t \le 1$, is not a motion in the closed interval $[\tfrac{1}{2}, 1]$: the line segments $x^{(1)}[t] = t - \tfrac{1}{2}$ and $x^{(2)}[t] = -(t - \tfrac{1}{2})$ are motions there, but the segment $x^{(3)}[t] = 0$, $t \in [\tfrac{1}{2}, 1]$, is not. As a result, this part of the motion $x[t] = 0$ is not a motion generated from the “current” initial position $(\tfrac{1}{2}, 0)$. This follows from the fact that the definition (1.7) of a motion does not include a limit transition over the initial time: only converging sequences of initial positions of the type $(t_0, x^{(k)})$ are considered, the initial time $t_0$ being fixed.

In control problems and differential games this fact “undermines” the optimal strategy. Specifically, let a strategy be chosen as the solution of a game, the optimal value of the criterion being reached at the “end” of a motion “one part of which has been lost” during the development of the game over time. Then the time comes (in our example, $t = \tfrac{1}{2}$) when it is necessary to use another optimal strategy, since the previous strategy does not lead to the desired result (the motion that yields the optimum result is out of the question at that time).

Fig. 1.1.1. (Axes $t$ and $x$, $0 \le t \le 1$; regions with control values $+1$, $0$ and $-1$.)

In order to avoid “the loss of part of the motion,” Kononenko proposed adding a limit transition over the initial time to the definition of a motion of the system (1.1). Namely, a stepwise motion must start not at the position $(t_0, x^{(k)})$ but at the initial position $(t^{(k)}, x^{(k)})$, and be extended (if $t^{(k)} > t_0$) to the left up to $t_0$. By inequality (1.2), such an extension is possible. To construct the motion $x[\cdot, t_0, x_0, V]$ we must analyze the limit transition of the stepwise motions $x(t, t^{(k)}, x^{(k)}, u^{(k)}(\cdot), V, \Delta^{(k)})$, $t_0 \le t \le \theta$, extended to the left (if $t^{(k)} > t_0$) or truncated (if $t^{(k)} < t_0$), supplementing (1.7) with the requirement

$$\lim_{k \to \infty} t^{(k)} = t_0. \tag{1.14}$$
For example, the motion $x^{(3)}[t] = 0$ (Fig. 1.1.1) is the limit of the sequence of stepwise motions emanating from $(t^{(k)}, x^{(k)}) = (\tfrac{1}{2} + \tfrac{1}{k}, 0)$, $k = 1, 2, \ldots$ If (1.7) and (1.14) hold, the following assertion is true.

Proposition 1.1 [38]. If $x[t] = x[t, t_0, x_0, V]$, $t_0 \le t \le \theta$, is a motion of the system (1.1) from the initial position $(t_0, x_0)$ generated by the strategy $V$, then for every $t^* \in [t_0, \theta]$ the motion segment $x[t, t_0, x_0, V]$, $t^* \le t \le \theta$, is a motion of the system generated by $V$ from the initial position $(t^*, x[t^*, t_0, x_0, V])$.

For such motions, Properties 1.1 and 1.2, the theorem of an alternative, and the saddle point theorem remain valid [38]. Both limit transitions (1.7) and (1.14) will be used to determine the motions.
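Kononenko's example can be reproduced with exact arithmetic. The Python sketch below assumes the reading of Example 1.1 in which the dynamics are $\dot{x} = v$, $|v| \le 1$, and the strategy equals $+1$ for $x > 0$, $-1$ for $x < 0$, $0$ for $x = 0$ with $t \ne 1/2$, and $+1$ at $x = 0$, $t = 1/2$. The function names, the uniform partitions and the sample starting positions are hypothetical. Since the control is constant on each partition interval, every step is integrated exactly.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def v_strategy(t, x):
    """Assumed feedback of Example 1.1 (second player's control)."""
    if x > 0:
        return 1
    if x < 0:
        return -1
    return 1 if t == HALF else 0


def stepwise_endpoint(t_start, x_start, theta, m):
    """State at time theta of the stepwise motion of dx/dt = v on a uniform
    partition with m intervals, the control being frozen at each node."""
    h = (theta - t_start) / m
    t, x = t_start, x_start
    for _ in range(m):
        x = x + v_strategy(t, x) * h  # exact, since v is constant on the interval
        t = t + h
    return x


ONE = Fraction(1)
ZERO = Fraction(0)
from_origin_odd = stepwise_endpoint(ZERO, ZERO, ONE, 3)    # 1/2 is not a node
from_origin_even = stepwise_endpoint(ZERO, ZERO, ONE, 4)   # 1/2 is a node
from_half = stepwise_endpoint(HALF, ZERO, ONE, 5)          # start exactly at t = 1/2
from_shifted = stepwise_endpoint(HALF + Fraction(1, 10), ZERO, ONE, 5)
```

Starting at $(0, 0)$, partitions that miss the node $t = 1/2$ keep the state at zero, so $x[t] = 0$ belongs to the bunch. Starting exactly at $(1/2, 0)$ every stepwise motion leaves zero, whereas a start at the shifted time $1/2 + 1/10$ stays at zero; this is the effect that the additional limit transition over the initial time is meant to capture.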
2. Piecewise-Continuous Stepwise Quasimotion

Stepwise quasimotions differ from stepwise motions in that they are piecewise-continuous and, at the partition points, may possess steps (discontinuities) of the first kind.

2.1. Subbotin’s Counter-Example
Another disadvantage of motion as defined previously is that, as the game develops over time, new motions may appear that are not to be found in the initial bunch of motions. Subbotin noted this circumstance; his arguments follow.
Example 2.1.
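Let the control system (1.1) have the form
\[ \dot{x} = v, \quad 0 \le t \le 2, \quad x[t_0] = x[0] = 0, \quad |v| \le 1. \]
We examine the strategy
\[ V \div v(t, x) = \begin{cases} 0 & \text{with } x > 0,\ t \in [0, 1), \\ 0 & \text{with } x < 0,\ t \in [0, 1), \\ 1 & \text{with } x = 0,\ t \in [0, 1), \\ 0 & \text{with } x = 0,\ t \in [1, 2), \\ +1 & \text{with } x > 0,\ t \in [1, 2), \\ -1 & \text{with } x < 0,\ t \in [1, 2). \end{cases} \]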
The bunch of motions from the initial position $(t_0, x_0) = (0, 0)$ generated by the strategy $V$ consists of the two motions shown in Fig. 1.2.1 as a heavy line. The same strategy $V$ from the position $(1, 0)$ generates three motions $x^{(1)}[t]$, $x^{(2)}[t]$, and $x^{(3)}[t] \equiv 0$, $1 \le t \le 2$. Consequently, when the current position $(t, x[t])$ moves with $x^{(1)}[t]$ or $x^{(2)}[t]$, at time $t = 1$ another motion $x^{(3)}[t]$, $1 \le t \le \theta$, which is not found in $\mathcal{X}[t_0, x_0, V]$, is added to the initial bunch $\mathcal{X}[t_0, x_0, V]$. Consequently, as the game develops, not only are segments of the initial motions lost (as in the example provided by Kononenko), but new motions also appear (as in this example) that are not found in the initial bunch.
Fig. 1.2.1.
New motions (such as $x^{(3)}[t] \equiv 0$, $1 \le t \le 2$) are undesirable for players who choose their own strategy. Indeed, such a new motion leads to new criterial values (which dictate the choice made by the players). For this reason, the strategies that are optimal in the initial position (at the beginning of the game) cease to be optimal as the game evolves if the current position “lands” at a point from which a new motion may emanate. To avoid this result, the notion of stepwise motions must be changed: stepwise motions with discontinuities (steps) at partition points must be admitted. In this example the motion $x[t] \equiv 0$, $0 \le t \le 2$, is the limit of the sequence of stepwise quasimotions $x^{(k)}(t)$, $0 \le t \le 2$ ($k = 1, 2, \ldots$), shown in Fig. 1.2.2 as a broken line. These have discontinuities of the first kind at the point $t = 1$ and are continuous at all other points of the segment $[0, 2]$. Convergence of $x^{(k)}(t)$ to $x[t] \equiv 0$ must be considered in the space $M_n[t_0, \theta]$ of bounded n-dimensional functions with norm $\|x(\cdot)\|_{M_n[t_0, \theta]} = \sup_{t_0 \le t \le \theta} \|x(t)\|$. With this definition of a motion, new motions are out of the question, and the players can make decisions (choose strategies) after having eliminated the possibility of new motions as the game evolves. This theoretical approach, based on the use of piecewise-continuous stepwise motions, is discussed in this section.
Fig. 1.2.2.
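The counter-example above can be replayed numerically. The following is a minimal sketch, assuming the system $\dot{x} = v$ and the six-case strategy as read from the text; the uniform partition with step $1/n$, the function names, and the perturbation size are illustrative choices, not part of the original. It compares the terminal states of stepwise motions started near $(0, 0)$ with those started near $(1, 0)$.

```python
def strategy(j, n, x):
    """Value v(t, x) of the strategy V at the partition point t = j / n."""
    if j < n:                        # t in [0, 1)
        return 1.0 if x == 0 else 0.0
    if x > 0:                        # t in [1, 2)
        return 1.0
    if x < 0:
        return -1.0
    return 0.0


def terminal_state(j0, x_start, n):
    """Integrate x' = v from t = j0 / n to t = 2 with the control frozen on each step."""
    x = x_start
    for j in range(j0, 2 * n):
        x += strategy(j, n, x) / n   # exact for x' = v with v constant on the step
    return x


def terminal_states(j0, n, perturbations):
    """Terminal values x(2) for initial states x^(k) taken close to 0."""
    return sorted(terminal_state(j0, p, n) for p in perturbations)


n = 1000
from_t0 = terminal_states(0, n, [-1e-3, 0.0, 1e-3])   # start at t = 0
from_t1 = terminal_states(n, n, [-1e-3, 0.0, 1e-3])   # start at t = 1
```

Under these assumptions every run started at $t = 0$ ends near $+1$ or $-1$, whereas a run started exactly at $(1, 0)$ stays at $0$, which is the extra motion $x^{(3)}$ described in the text.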
2.2. Stepwise Quasimotion
In defining stepwise quasimotions (piecewise-continuous motions) generated by the strategies $U \in \mathcal{U}$, $V \in \mathcal{V}$ and situations $(U, V) \in \mathcal{U} \times \mathcal{V}$, we implicitly assume that Condition 1.1 is satisfied by the system (1.1). Given the initial position $(t_0, x_0)$, the number $\alpha \in [0, 1]$, and the strategy $V \div v(t, x)$, $V \in \mathcal{V}$, we cover the closed interval $[t_0, \theta]$ by a system $\Delta$ of semi-intervals $\tau_j \le t < \tau_{j+1}$ ($j = 0, 1, \ldots, m(\Delta) - 1$), $\tau_0 = t_0$, $\tau_{m(\Delta)} = \theta$. Further, $u(t)$ is a Borel-measurable function, $u(t) \in H$ at every $t \in [t_0, \theta)$. By a stepwise quasimotion of the system (1.1) generated at the initial position $(t_0, x_0) \in [0, \theta) \times R^n$ by
(a) the strategy $V \div v(t, x) \in Q$,
(b) the decomposition $\Delta$: $t_0 = \tau_0 < \tau_1 < \ldots < \tau_{m(\Delta)} = \theta$, and
(c) the number $\alpha \in [0, 1]$
we will understand any function
\[ x(\cdot, V, \Delta, \alpha) = \{ x(t, \tau_0, x_0, \hat{x}_0, u(\cdot), V, \Delta, \alpha),\ \tau_0 \le t \le \theta \} \]
which, at $\tau_j \le t < \tau_{j+1}$, satisfies the quasistepwise equation
\[ x(t, V, \Delta, \alpha) = \hat{x}_j + \int_{\tau_j}^{t} f(\tau, x(\tau, V, \Delta, \alpha), u(\tau), v(\tau_j, \hat{x}_j))\, d\tau \tag{2.2} \]
under the condition
\[ \sum_{j=0}^{m(\Delta)-1} \| \hat{x}_j - x^0(\tau_j, V, \Delta, \alpha) \| \le \alpha \tag{2.3} \]
where $x^0(\tau_j, V, \Delta, \alpha)$ is the value (at $t = \tau_j$) of the right-extended solution $x(t, V, \Delta, \alpha)$ of equation (2.2) over the interval $\tau_{j-1} \le t < \tau_j$ ($j = 1, 2, \ldots, m(\Delta) - 1$); in other words
\[ x^0(\tau_j, V, \Delta, \alpha) = \hat{x}_{j-1} + \int_{\tau_{j-1}}^{\tau_j} f(\tau, x(\tau, V, \Delta, \alpha), u(\tau), v(\tau_{j-1}, \hat{x}_{j-1}))\, d\tau \tag{2.4} \]
where $\hat{x}(\tau_0, V, \Delta, \alpha) = \hat{x}_0$, $x^0(\tau_0, V, \Delta, \alpha) = x_0$.
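The definition can be exercised on a scalar toy case. The sketch below assumes the system $\dot{x} = v$ and the strategy of Subbotin's counter-example in subsection 2.1 (as read from the text), so the integral in (2.2) reduces to a linear step; the function names, the uniform partition, and the use of exact rational arithmetic are illustrative assumptions. It builds a stepwise quasimotion from $(0, 0)$ whose only step, of size $1/n$ at $t = 1$, keeps the left-hand side of (2.3) equal to $1/n$.

```python
from fractions import Fraction


def strategy(t, x):
    """Six-case strategy v(t, x) of the counter-example (as read from the text)."""
    if t < 1:
        return 1 if x == 0 else 0
    return (x > 0) - (x < 0)         # +1, -1 or 0 on [1, 2)


def stepwise_quasimotion(taus, x0, jumps):
    """Integrate x' = v(tau_j, xhat_j) on each semi-interval [tau_j, tau_{j+1}).

    jumps[j] = xhat_j - x^0(tau_j) is the step taken at the partition point tau_j.
    Returns the restart values xhat_j, the terminal state, and the total step size,
    i.e. the left-hand side of condition (2.3).
    """
    x_left = x0                      # x^0(tau_0) = x0
    restarts, total = [], Fraction(0)
    for j in range(len(taus) - 1):
        x_hat = x_left + jumps[j]
        total += abs(jumps[j])
        restarts.append(x_hat)
        v = strategy(taus[j], x_hat)
        x_left = x_hat + v * (taus[j + 1] - taus[j])   # value x^0(tau_{j+1})
    return restarts, x_left, total


def approximating_quasimotion(n):
    """Uniform partition of [0, 2] with step 1/n and a single step of -1/n at t = 1."""
    taus = [Fraction(j, n) for j in range(2 * n + 1)]
    jumps = [Fraction(0)] * (2 * n)
    jumps[n] = Fraction(-1, n)       # cancels the drift 1/n picked up on the first interval
    return stepwise_quasimotion(taus, Fraction(0), jumps)
```

With $n \to \infty$ both the total step size and the sup-norm distance to $x[t] \equiv 0$ equal $1/n$ and tend to zero, which is how a function outside the bunch of stepwise-motion limits can still be approximated by stepwise quasimotions.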
Unlike stepwise motions [83, p. 53], stepwise quasimotions may have finite steps $\|\hat{x}_j - x^0(\tau_j, V, \Delta, \alpha)\| \ne 0$ at partition points $\tau_j$. The sum of such steps must not exceed a specified number $\alpha$ (Fig. 1.2.3). This fact is recognized in the notation for stepwise quasimotions $x(t, V, \Delta, \alpha)$ by introducing the value $\alpha$ among the arguments (in contrast to the stepwise motion $x(t, V, \Delta)$). With $\alpha = 0$, a stepwise quasimotion turns into an ordinary stepwise motion [83, p. 53]. The set of stepwise quasimotions of the system (1.1) defined in this way, generated at an initial position $(t_0, x_0)$ by the fixed strategy $V$, all possible partitions $\Delta$, all numbers $\alpha \in [0, 1]$, and the Borel-measurable functions $u(t) \in H$, will be denoted $\mathcal{X}(t_0, x_0, V)$. If in (2.2) and (2.4) with $\tau_j \le t < \tau_{j+1}$ ($j = 0, 1, \ldots, m(\Delta) - 1$) it is assumed that $u(\tau) = u(\tau_j, \hat{x}_j)$, the quasistepwise equations (2.2) and (2.4) determine a stepwise quasimotion $x(\cdot, t_0, x_0, U, V, \Delta, \alpha)$ of the system (1.1) generated by the situation $(U, V) \in \mathcal{U} \times \mathcal{V}$ at the initial position $(t_0, x_0)$ (where $U \div u(t, x)$ is the first player's specified strategy). The set of stepwise quasimotions of the system (1.1) generated from the initial position $(t_0, x_0)$ by the situation $(U, V)$, all possible partitions $\Delta$, and any numbers $\alpha \in [0, 1]$ will be denoted $\mathcal{X}(t_0, x_0, U, V)$.
Fig. 1.2.3.
If one player possesses information not only on the current position $(t, x)$ but also on the realization of the other player's control action at that time, a counter-strategy will, in general, have to be used. The counter-strategies $U_v$ of the first player are identified [43, p. 355] with the functions $u(t, x, v)$, determined for all positions $(t, x)$ and vectors $v \in Q$, that satisfy the condition $u(t, x, v) \in H$ and are Borel-measurable over $v \in Q$ for all fixed $(t, x) \in [0, \theta) \times R^n$. The set of counter-strategies of the first player is denoted $\mathcal{U}_v$. Analogously, the second player's counter-strategies $V_u \div v(t, x, u)$ are associated with functions $v(t, x, u)$, determined for every position $(t, x)$ and vector $u \in H$, that satisfy the condition $v(t, x, u) \in Q$ and are Borel-measurable over $u \in H$. Denote the set of the second player's counter-strategies $V_u$ thus:
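\[ \mathcal{V}_u = \{ V_u \div v(t, x, u) \mid v(t, x, u) \in Q,\ \forall (t, x, u) \in [0, \theta) \times R^n \times H \}. \]
By a stepwise quasimotion of the system (1.1) generated from the initial position $(t_0, x_0) \in [0, \theta) \times R^n$ by
(a) the counter-strategy $U_v \div u(t, x, v) \in H$,
(b) the partition $\Delta$: $t_0 = \tau_0 < \tau_1 < \ldots < \tau_{m(\Delta)} = \theta$, and
(c) the number $\alpha \in [0, 1]$
we will understand every function
\[ x(\cdot, U_v, \Delta, \alpha) = \{ x(t, \tau_0, x_0, \hat{x}_0, v(\cdot), U_v, \Delta, \alpha),\ \tau_0 \le t \le \theta \} \]
satisfying, at $\tau_j \le t < \tau_{j+1}$ ($j = 0, 1, \ldots, m(\Delta) - 1$), the quasistepwise equation
\[ x(t, U_v, \Delta, \alpha) = \hat{x}_j + \int_{\tau_j}^{t} f(\tau, x(\tau, U_v, \Delta, \alpha), u(\tau_j, \hat{x}_j, v(\tau)), v(\tau))\, d\tau \tag{2.5} \]
under the condition
\[ \sum_{j=0}^{m(\Delta)-1} \| \hat{x}_j - x^0(\tau_j, U_v, \Delta, \alpha) \| \le \alpha \]
where $x^0(\tau_j, U_v, \Delta, \alpha)$ is the value at $t = \tau_j$ of the solution of (2.5) right extended on the interval $\tau_{j-1} \le t < \tau_j$ ($j = 1, 2, \ldots, m(\Delta) - 1$), or
\[ x^0(\tau_j, U_v, \Delta, \alpha) = \hat{x}_{j-1} + \int_{\tau_{j-1}}^{\tau_j} f(\tau, x(\tau, U_v, \Delta, \alpha), u(\tau_{j-1}, \hat{x}_{j-1}, v(\tau)), v(\tau))\, d\tau, \tag{2.6} \]
where $x^0(\tau_0, U_v, \Delta, \alpha) = x_0$, $\hat{x}(\tau_0, U_v, \Delta, \alpha) = \hat{x}_0$.
The set of stepwise quasimotions of the system (1.1) generated from the initial position $(t_0, x_0)$ by a fixed counter-strategy $U_v$, all possible partitions $\Delta$, any numbers $\alpha \in [0, 1]$, and any Borel-measurable function $v(t) \in Q$ will be denoted $\mathcal{X}(t_0, x_0, U_v)$. Assuming in (2.5) and (2.6) that at $\tau_j \le t < \tau_{j+1}$ ($j = 0, 1, \ldots, m(\Delta) - 1$), $v(\tau) = v(\tau_j, \hat{x}_j)$, the quasistepwise equations (2.5) and (2.6) determine the stepwise quasimotion of the system (1.1) generated by the situation $(U_v, V)$ from the initial position $(t_0, x_0)$ (where $V \div v(t, x)$ is the second player's specified strategy). The set of stepwise quasimotions of the system (1.1) generated at the initial position $(t_0, x_0)$ by the situation $(U_v, V)$, all possible partitions $\Delta$, and any number $\alpha \in [0, 1]$ will be denoted $\mathcal{X}(t_0, x_0, U_v, V)$. $M_n[t_0, \theta]$ will denote the set of bounded n-vector-valued functions $z(t)$ with norm
\[ \|z(\cdot)\|_{M_n[t_0, \theta]} = \sup_{t_0 \le t \le \theta} \|z(t)\|. \]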
Proposition 2.1. The set of all the stepwise quasimotions $\mathcal{X}(t_0, x_0, V)$ of the system (1.1) generated at the initial position $(t_0, x_0)$ by the fixed strategy $V$, all possible partitions $\Delta$, any numbers $\alpha \in [0, 1]$, and any Borel-measurable function $u(t) \in H$ is bounded by the norm of the space $M_n[t_0, \theta]$, that is, there exists $K = \mathrm{const}$ such that for every $x(\cdot, V, \Delta, \alpha)$, the condition $\|x(\cdot, V, \Delta, \alpha)\|_{M_n[t_0, \theta]} \le K$ is satisfied.
Proof. We first establish that, for $\tau_j \le t < \tau_{j+1}$ ($j = 0, 1, \ldots, m(\Delta) - 1$),
\[ \|x(t, V, \Delta, \alpha)\| \le (1 + \|x_0\|) e^{\gamma(\tau_{j+1} - t_0)} + \sum_{l=0}^{j} \|\hat{x}_l - x^0(\tau_l, V, \Delta, \alpha)\|\, e^{\gamma(\tau_{j+1} - t_0)} - 1 \tag{2.7} \]
where, for $x^0(\tau_{j+1}, V, \Delta, \alpha)$, we have
\[ \|x^0(\tau_{j+1}, V, \Delta, \alpha)\| \le (1 + \|x_0\|) e^{\gamma(\tau_{j+1} - t_0)} + \sum_{l=0}^{j} \|\hat{x}_l - x^0(\tau_l, V, \Delta, \alpha)\|\, e^{\gamma(\tau_{j+1} - t_0)} - 1. \tag{2.8} \]
The proof relies on the following inequality, which is a consequence of the triangle axiom:
\[ \|\hat{x}_j\| \le \|\hat{x}_j - x^0(\tau_j, V, \Delta, \alpha)\| + \|x^0(\tau_j, V, \Delta, \alpha)\| \tag{2.9} \]
and on the relation [42, p. 39] satisfied by the solution $x(\cdot)$ of the system (1.1) under (1.2):
\[ \|x(t)\| \le (1 + \|x_0\|) e^{\gamma(t - t_0)} - 1. \tag{2.10} \]
Inequalities (2.7) and (2.8) will be proved by mathematical induction. Let us first verify the validity of the inequalities for $j = 0$. At $t \in [\tau_0, \tau_1)$ it follows from (2.10) that
\[ \|x(t, V, \Delta, \alpha)\| \le (1 + \|\hat{x}_0\|) e^{\gamma(t - \tau_0)} - 1. \]
In view of (2.9) for $x^0(\tau_0, V, \Delta, \alpha)$, we have
\[ \|x(t, V, \Delta, \alpha)\| \le (1 + \|\hat{x}_0 - x^0(\tau_0, V, \Delta, \alpha)\| + \|x_0\|) e^{\gamma(t - \tau_0)} - 1 \le (1 + \|x_0\|) e^{\gamma(\tau_1 - t_0)} + \|\hat{x}_0 - x^0(\tau_0, V, \Delta, \alpha)\|\, e^{\gamma(\tau_1 - t_0)} - 1, \]
or inequality (2.7) with $j = 0$ is valid. Similarly, inequality (2.8) with $j = 0$ holds for $\|x^0(\tau_1, V, \Delta, \alpha)\|$. Let us assume that (2.7) and (2.8) are satisfied for $j = k$, or, for $\tau_k \le t < \tau_{k+1}$,
\[ \|x(t, V, \Delta, \alpha)\| \le (1 + \|x_0\|) e^{\gamma(\tau_{k+1} - t_0)} + \sum_{l=0}^{k} \|\hat{x}_l - x^0(\tau_l, V, \Delta, \alpha)\|\, e^{\gamma(\tau_{k+1} - t_0)} - 1 \]
and
\[ \|x^0(\tau_{k+1}, V, \Delta, \alpha)\| \le (1 + \|x_0\|) e^{\gamma(\tau_{k+1} - t_0)} + \sum_{l=0}^{k} \|\hat{x}_l - x^0(\tau_l, V, \Delta, \alpha)\|\, e^{\gamma(\tau_{k+1} - t_0)} - 1. \tag{2.11} \]
We now prove the validity of inequalities (2.7) for $j = k + 1$. At $t \in [\tau_{k+1}, \tau_{k+2})$ we have from (2.10)
\[ \|x(t, V, \Delta, \alpha)\| \le (1 + \|\hat{x}_{k+1}\|) e^{\gamma(t - \tau_{k+1})} - 1 \tag{2.12} \]
and, by (2.9) (with $j = k + 1$),
\[ \|\hat{x}_{k+1}\| \le \|\hat{x}_{k+1} - x^0(\tau_{k+1}, V, \Delta, \alpha)\| + \|x^0(\tau_{k+1}, V, \Delta, \alpha)\|. \tag{2.13} \]
Substituting (2.11) and (2.13) in (2.12), we obtain
\[ \|x(t, V, \Delta, \alpha)\| \le (1 + \|x_0\|) e^{\gamma(\tau_{k+2} - t_0)} + \sum_{l=0}^{k+1} \|\hat{x}_l - x^0(\tau_l, V, \Delta, \alpha)\|\, e^{\gamma(\tau_{k+2} - t_0)} - 1. \]
The result holds true for $\|x^0(\tau_{k+2}, V, \Delta, \alpha)\|$ as well. Thus, inequalities (2.7) and (2.8) are proved. From the chain of relations
\[ (1 + \|x_0\|) e^{\gamma(\tau_1 - t_0)} + \|\hat{x}_0 - x^0(\tau_0, V, \Delta, \alpha)\|\, e^{\gamma(\tau_1 - t_0)} - 1 \le (1 + \|x_0\|) e^{\gamma(\tau_2 - t_0)} + \sum_{l=0}^{1} \|\hat{x}_l - x^0(\tau_l, V, \Delta, \alpha)\|\, e^{\gamma(\tau_2 - t_0)} - 1 \le \ldots \]
and the inequality $\alpha \le 1$, we have from (2.8), (2.7), and (2.3) that, at every $t \in [t_0, \theta]$,
\[ \|x(t, V, \Delta, \alpha)\| \le (\alpha + 1 + \|x_0\|) e^{\gamma(\theta - t_0)} - 1 \le (2 + \|x_0\|) e^{\gamma(\theta - t_0)} - 1 = K. \]
This shows that the set of stepwise quasimotions $\mathcal{X}(t_0, x_0, V)$ is bounded. In concluding this subsection, note that the set $\mathcal{X}(t_0, x_0, V)$ is a subset of the space $M_n[t_0, \theta]$ of all bounded n-vector-valued functions $z(\cdot): [t_0, \theta] \to R^n$. In the same way the set of all stepwise quasimotions $\mathcal{X}(t_0, x_0, U, V)$ may be shown to be bounded. Because $v(t) \in Q$ is a Borel-measurable function and $u(t, x, v)$ is a Borel-measurable function over $v \in Q$, and since the superposition of two Borel functions is also a Borel function, the two sets of stepwise quasimotions $\mathcal{X}(t_0, x_0, U_v)$ and $\mathcal{X}(t_0, x_0, U_v, V)$ are bounded as well.
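The bound of Proposition 2.1 can be probed numerically. The sketch below is only an illustration under stated assumptions: it uses a hypothetical scalar test system $\dot{x} = v\,(1 + |x|)$ with $|v| \le 1$, for which the growth bound $|f| \le \gamma(1 + |x|)$ holds with $\gamma = 1$ (this is assumed to be the content of (1.2)), integrates each semi-interval by explicit Euler substeps, and draws partitions, frozen controls, and steps at random. All function names and parameters are hypothetical.

```python
import math
import random


def sup_norm_of_quasimotion(x0, t0, theta, m, alpha, rng, substeps=50):
    """Sup-norm of one randomly generated stepwise quasimotion of x' = v (1 + |x|)."""
    inner = sorted(rng.uniform(t0, theta) for _ in range(m - 1))
    taus = [t0] + inner + [theta]
    raw = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    scale = 0.999 * alpha / sum(abs(r) for r in raw)   # total step size stays below alpha
    x_left, sup = x0, abs(x0)
    for j in range(m):
        x = x_left + raw[j] * scale                     # restart value xhat_j
        v = rng.uniform(-1.0, 1.0)                      # control value frozen on the step
        h = (taus[j + 1] - taus[j]) / substeps
        sup = max(sup, abs(x))
        for _ in range(substeps):
            x += h * v * (1.0 + abs(x))
            sup = max(sup, abs(x))
        x_left = x
    return sup


def bound_K(x0, t0, theta, gamma=1.0):
    """Constant K = (2 + |x0|) exp(gamma (theta - t0)) - 1 from the end of the proof."""
    return (2.0 + abs(x0)) * math.exp(gamma * (theta - t0)) - 1.0
```

For instance, sampling a few hundred quasimotions with `rng = random.Random(0)`, `x0 = 0.5`, and `[t0, theta] = [0, 2]` and comparing each sup-norm with `bound_K(0.5, 0.0, 2.0)` gives a quick sanity check of the estimate; it does not replace the induction argument.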
2.3. Properties of Quasimotions
Let us proceed to the notion of a quasimotion, that is, of a limit (in $M_n[t_0, \theta]$) of the sequences of stepwise quasimotions discussed in subsection 2.2.
Definition 2.2. A quasimotion $x[\cdot] = \{x[t, t_0, x_0, V],\ t_0 \le t \le \theta\}$ of the system (1.1) generated from the initial position $(t_0, x_0)$ by the strategy $V \in \mathcal{V}$ is any function $x[\cdot] = \{x[t],\ t_0 \le t \le \theta\}$ continuous over the closed interval $[t_0, \theta]$, for which there exists a sequence of stepwise quasimotions (extended left to $t_0$ if $\tau_0^{(r)} > t_0$)
\[ x(\cdot, V, \Delta^{(r)}, \alpha^{(m)}) = \{ x(t, \tau_0^{(r)}, x_0^{(r)}, \hat{x}_0^{(r)}, u^{(r)}(\cdot), V, \Delta^{(r)}, \alpha^{(m)}),\ t_0 \le t \le \theta \} \]
that converges ($r, m \to \infty$) to it (in the metric of the space $M_n[t_0, \theta]$) when
\[ \operatorname{diam} \Delta^{(r)} \to 0 \ \text{and}\ |\tau_0^{(r)} - t_0| + \|x_0^{(r)} - x_0\| \to 0 \ \text{as } r \to \infty, \qquad \alpha^{(m)} \to 0 \ \text{as } m \to \infty. \tag{2.14} \]
Here $\operatorname{diam} \Delta^{(r)} = \max_j [\tau_{j+1}^{(r)} - \tau_j^{(r)}]$ and $0 \le \alpha^{(m)} \le 1$ ($r, m = 1, 2, \ldots$).
Consequently, for this sequence $\{x(\cdot, V, \Delta^{(r)}, \alpha^{(m)})\}$, it is true that
\[ \sup_{t_0 \le t \le \theta} \|x[t] - x(t, V, \Delta^{(r)}, \alpha^{(m)})\| \to 0 \]
provided that (2.14) holds. A bunch of quasimotions of the system (1.1) will be denoted $\mathcal{X}[t_0, x_0, V]$. Distinct quasimotions of a bunch are obtained if different sequences $\Delta^{(r)}$, $\alpha^{(m)}$ and $u(t) \in H$ are used. This can be helpful in plotting stepwise quasimotions (converging to the quasimotions $x[t, t_0, x_0, V]$). In a similar way, quasimotions of the system (1.1) may be determined that are generated from the initial position $(t_0, x_0)$ by the counter-strategy $U_v$ and the situations $(U, V)$ or $(U_v, V)$. Bunches of such quasimotions of the system (1.1) will be denoted $\mathcal{X}[t_0, x_0, U_v]$, $\mathcal{X}[t_0, x_0, U, V]$, or $\mathcal{X}[t_0, x_0, U_v, V]$. The properties of bunches of the system (1.1) are described below.
Proposition 2.2. The bunch $\mathcal{X}[t_0, x_0, V]$ is a non-empty, closed subset of the space $C_n[t_0, \theta] \subset M_n[t_0, \theta]$, bounded (by the norm).
Proof. The set $\mathcal{X}[t_0, x_0, V]$ contains the motions determined in [83, p. 53] (they are obtained by setting $\alpha = 0$); consequently $\mathcal{X}[t_0, x_0, V] \ne \emptyset$. It follows from Proposition 2.1 that the set $\mathcal{X}(t_0, x_0, V)$ of all stepwise quasimotions $x(\cdot, V, \Delta, \alpha)$ is bounded by the norm of the space $M_n[t_0, \theta]$. Then its closure $\bar{\mathcal{X}}(t_0, x_0, V)$ is also bounded in this space. For this reason the bunch of quasimotions $\mathcal{X}[t_0, x_0, V]$ is bounded (by the norm of the space $M_n[t_0, \theta]$) as a subset of $\bar{\mathcal{X}}(t_0, x_0, V)$. But the metric of $C_n[t_0, \theta]$ is induced by the metric of $M_n[t_0, \theta]$ and $\mathcal{X}[t_0, x_0, V] \subset C_n[t_0, \theta]$. As a consequence, the bunch of quasimotions $\mathcal{X}[t_0, x_0, V]$ is a bounded (by the norm) subset of $C_n[t_0, \theta]$.
Let us establish that the bunch $\mathcal{X}[t_0, x_0, V]$ is closed in $C_n[t_0, \theta]$; in other words, we wish to prove that $\mathcal{X}[t_0, x_0, V] = \bar{\mathcal{X}}[t_0, x_0, V]$, where $\bar{\mathcal{X}}[t_0, x_0, V]$ is the closure (in $C_n[t_0, \theta]$) of the set $\mathcal{X}[t_0, x_0, V]$. Given $y[\cdot] \in \bar{\mathcal{X}}[t_0, x_0, V]$, let us show that $y[\cdot] \in \mathcal{X}[t_0, x_0, V]$. Because $y[\cdot]$ is the limit point of some sequence of quasimotions $\{x^{(l)}[\cdot]\} \subset \mathcal{X}[t_0, x_0, V] \subset C_n[t_0, \theta]$, the function $y[\cdot] = \{y[t],\ t_0 \le t \le \theta\}$, as the limit of the sequence of the continuous functions $x^{(l)}[t]$, $t_0 \le t \le \theta$ ($l = 1, 2, \ldots$), is continuous over $[t_0, \theta]$. Following the outline of the proof in [83, p. 54], we choose a sequence of positive numbers $\varepsilon_l$, $l = 1, 2, \ldots$, that converges to zero. For every $l = 1, 2, \ldots$, we draw a sphere in the space $M_n[t_0, \theta]$:
\[ B(y[\cdot], \varepsilon_l) = \{ z(\cdot) \in M_n[t_0, \theta] \mid \|z(\cdot) - y[\cdot]\|_{M_n[t_0, \theta]} < \varepsilon_l \}. \tag{2.15} \]
Since $y[\cdot]$ belongs to the closure $\bar{\mathcal{X}}[t_0, x_0, V]$ of $\mathcal{X}[t_0, x_0, V]$, there exist quasimotions $x^{(l)}[\cdot] \in \mathcal{X}[t_0, x_0, V] \cap B(y[\cdot], \varepsilon_l)$. By the definition of a quasimotion, for every $l = 1, 2, \ldots$, a sequence of stepwise quasimotions $\{x(\cdot, V, \Delta_l^{(r)}, \alpha_l^{(m)})\}$ exists and converges to $x^{(l)}[\cdot]$ in the space $M_n[t_0, \theta]$ (assuming the limits (2.14) hold). For every $l$, we choose, for sufficiently large $r$ and $m$, a single “representative” $x^{(l)}(\cdot, V, \Delta_l^{(r)}, \alpha_l^{(m)})$ of every sequence $\{x(\cdot, V, \Delta_l^{(r)}, \alpha_l^{(m)})\}$ so that
(1) $x^{(l)}(\cdot, V, \Delta_l^{(r)}, \alpha_l^{(m)}) \in B(y[\cdot], \varepsilon_l)$,
(2) $\operatorname{diam} \Delta_l^{(r)} < \varepsilon_l$, $\alpha_l^{(m)} < \varepsilon_l$.
The sequence $\{x^{(l)}(\cdot, V, \Delta_l^{(r)}, \alpha_l^{(m)})\}$ consisting of “representatives” of stepwise quasimotions satisfies (2.14) as $l \to \infty$ and converges (in the metric of $M_n[t_0, \theta]$) to the continuous function $y[t]$, $t_0 \le t \le \theta$. Consequently, $y[\cdot]$ is a quasimotion, or $y[\cdot] \in \mathcal{X}[t_0, x_0, V]$. This proves Proposition 2.2. Similar propositions hold for the bunches $\mathcal{X}[t_0, x_0, U_v]$, $\mathcal{X}[t_0, x_0, U, V]$, and $\mathcal{X}[t_0, x_0, U_v, V]$.
Proposition 2.3. For every position $(t_0, x_0)$ and strategies $U \in \mathcal{U}$, $V \in \mathcal{V}$, and $U_v \in \mathcal{U}_v$, the sections of quasimotion bunches by the hyperplane $t = \theta$: $X[\theta, t_0, x_0, U]$, $X[\theta, t_0, x_0, U, V]$, $X[\theta, t_0, x_0, U_v, V]$, $X[\theta, t_0, x_0, U_v]$, and $X[\theta, t_0, x_0, V]$ are non-empty compacts in $R^n$.
The next proposition [43, p. 38] is also true.
Proposition 2.4. Given an arbitrary initial position $(t_0, x_0)$ and situation $(U, V)$, every quasimotion $x[\cdot, t_0, x_0, U, V]$ of the bunch $\mathcal{X}[t_0, x_0, U, V]$ is contained in the bunches $\mathcal{X}[t_0, x_0, U]$ and $\mathcal{X}[t_0, x_0, V]$, or
\[ \mathcal{X}[t_0, x_0, U, V] \subseteq \mathcal{X}[t_0, x_0, U] \cap \mathcal{X}[t_0, x_0, V] \]
and, consequently,
\[ X[\theta, t_0, x_0, U, V] \subseteq X[\theta, t_0, x_0, U] \cap X[\theta, t_0, x_0, V]. \]
Similarly, every quasimotion $x[\cdot, t_0, x_0, U_v, V]$ of the bunch $\mathcal{X}[t_0, x_0, U_v, V]$ is contained in the bunches $\mathcal{X}[t_0, x_0, U_v]$ and $\mathcal{X}[t_0, x_0, V]$, or
\[ \mathcal{X}[t_0, x_0, U_v, V] \subseteq \mathcal{X}[t_0, x_0, U_v] \cap \mathcal{X}[t_0, x_0, V]. \]
Henceforth, we will make use of the set
\[ X = X[\theta, t_0, x_0, U \div H, V \div Q] = X[H \times Q] = \bigcup_{U \in \mathcal{U},\ V \in \mathcal{V}} X[\theta, t_0, x_0, U, V]. \]
It is the reachability domain of the system (1.1) from the position $(t_0, x_0)$.
2.4. Completeness of a Quasimotion Bunch
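The main result of this section is as follows.
Theorem 2.1. For every position $(t^*, x[t^*])$, where $t^* \in [t_0, \theta)$ and $x[\cdot]$ is some quasimotion from the bunch $\mathcal{X}[t_0, x_0, V]$, the following inclusion is valid:
\[ X[\theta, t^*, x[t^*], V] \subseteq X[\theta, t_0, x_0, V]. \tag{2.16} \]
Proof. Let $x[\cdot] = \{x[t],\ t_0 \le t \le \theta\}$ be some quasimotion of the system (1.1) generated by the strategy $V$ from the initial position $(t_0, x_0)$, and let $t^* \in [t_0, \theta]$. Consider an arbitrary quasimotion $x^*[\cdot] = \{x^*[t],\ t^* \le t \le \theta\}$ generated by the same strategy $V$, but now from the initial position $(t^*, x[t^*])$, where $x[t^*]$ is the value of $x[t]$ at $t = t^*$ (Fig. 1.2.4). The inclusion (2.16) is proved if the function
\[ \tilde{x}[t] = \begin{cases} x[t] & \text{at } t_0 \le t \le t^*, \\ x^*[t] & \text{at } t^* < t \le \theta, \end{cases} \tag{2.17} \]
can be shown to be a quasimotion of the system (1.1) generated by the strategy $V$ from the initial position $(t_0, x_0)$.
Fig. 1.2.4.
To prove this assertion, let us specify a numerical sequence $\varepsilon_l \in (0, 1)$, $l = 1, 2, \ldots$, such that
\[ \lim_{l \to \infty} \varepsilon_l = 0. \tag{2.18} \]
For every $l = 1, 2, \ldots$, we construct two spheres: $B(x[\cdot], \varepsilon_l)$ (see (2.15), where $y[\cdot] = x[\cdot]$) and
\[ B(x^*[\cdot], \varepsilon_l) = \{ z(\cdot) \in M_n[t^*, \theta] \mid \|z(\cdot) - x^*[\cdot]\|_{M_n[t^*, \theta]} < \varepsilon_l \}. \]
By the definition of a quasimotion, there exist two sequences of stepwise quasimotions
\[ \{x(\cdot, V, \Delta^{(r)}, \alpha^{(m)})\} \subset M_n[t_0, \theta] \quad \text{and} \quad \{x^*(\cdot, V, \Delta_*^{(h)}, \alpha_*^{(q)})\} \subset M_n[t^*, \theta] \]
converging (in the metric of the above spaces) to $x[\cdot]$ and $x^*[\cdot]$, respectively, under the following conditions:
\[ \operatorname{diam} \Delta^{(r)} \to 0,\ \alpha^{(m)} \to 0 \ \text{as } r, m \to \infty; \qquad \operatorname{diam} \Delta_*^{(h)} \to 0,\ \alpha_*^{(q)} \to 0 \ \text{as } h, q \to \infty. \tag{2.19} \]
Here
\[ \Delta^{(r)}: t_0 = \tau_0^{(r)} < \tau_1^{(r)} < \ldots < \tau_{m(r)}^{(r)} = \theta, \qquad \Delta_*^{(h)}: t^* = \tau_{*0}^{(h)} < \tau_{*1}^{(h)} < \ldots < \tau_{*m(h)}^{(h)} = \theta. \tag{2.20} \]
Then for every $l = 1, 2, \ldots$, we may select from the sequences $\{x(\cdot, V, \Delta^{(r)}, \alpha^{(m)})\}$ and $\{x^*(\cdot, V, \Delta_*^{(h)}, \alpha_*^{(q)})\}$ stepwise quasimotions $x^{(l)}(\cdot, V, \Delta^{(l)}, \alpha^{(l)})$ and $x^{*(l)}(\cdot, V, \Delta_*^{(l)}, \alpha_*^{(l)})$ such that
\[ \alpha^{(l)} + \alpha_*^{(l)} + 2\varepsilon_l < 1, \tag{2.21} \]
\[ x^{(l)}(\cdot, V, \Delta^{(l)}, \alpha^{(l)}) \in B(x[\cdot], \varepsilon_l), \qquad x^{*(l)}(\cdot, V, \Delta_*^{(l)}, \alpha_*^{(l)}) \in B(x^*[\cdot], \varepsilon_l), \tag{2.22} \]
\[ \operatorname{diam} \Delta^{(l)} < \varepsilon_l, \qquad \operatorname{diam} \Delta_*^{(l)} < \varepsilon_l. \tag{2.23} \]
Indeed, inequalities (2.21) and (2.23) hold by virtue of (2.19), and the inclusion (2.22) holds because the sequences $\{x(\cdot, V, \Delta^{(r)}, \alpha^{(m)})\}$ and $\{x^*(\cdot, V, \Delta_*^{(h)}, \alpha_*^{(q)})\}$ converge to $x[\cdot]$ and $x^*[\cdot]$, respectively. Note that, by (2.19), as $l \to \infty$,
\[ \alpha^{(l)} \to 0 \ \text{and}\ \alpha_*^{(l)} \to 0, \qquad \operatorname{diam} \Delta^{(l)} \to 0 \ \text{and}\ \operatorname{diam} \Delta_*^{(l)} \to 0. \tag{2.24} \]
Let us introduce the functions
\[ \tilde{x}^{(l)}(t, V, \tilde{\Delta}^{(l)}, \tilde{\alpha}^{(l)}) = \begin{cases} x^{(l)}(t, V, \Delta^{(l)}, \alpha^{(l)}) & \text{at } t_0 \le t < t^*, \\ x^{*(l)}(t, V, \Delta_*^{(l)}, \alpha_*^{(l)}) & \text{at } t^* \le t \le \theta. \end{cases} \tag{2.25} \]
From (2.17), (2.22), and (2.25),
\[ \tilde{x}^{(l)}(\cdot, V, \tilde{\Delta}^{(l)}, \tilde{\alpha}^{(l)}) \in B(\tilde{x}[\cdot], \varepsilon_l). \tag{2.26} \]
Let us show that $\tilde{x}^{(l)}(\cdot, V, \tilde{\Delta}^{(l)}, \tilde{\alpha}^{(l)})$ is a stepwise quasimotion with
\[ \operatorname{diam} \tilde{\Delta}^{(l)} \le \max \{ \operatorname{diam} \Delta^{(l)}, \operatorname{diam} \Delta_*^{(l)} \} \tag{2.27} \]
where the number $\tilde{\alpha}^{(l)}$ is chosen so that
\[ \tilde{\alpha}^{(l)} \le \alpha^{(l)} + \alpha_*^{(l)} + 2\varepsilon_l. \tag{2.28} \]
Indeed, from (2.25) and (2.20) we have the partition $\tilde{\Delta}^{(l)}$ formed by the points of $\Delta^{(l)}$ lying in $[t_0, t^*)$ followed by the points of $\Delta_*^{(l)}$ (2.29). Since the addition of one more point $t^*$ will not increase the diameter of the partition, (2.27) follows from (2.29). The number $\tilde{\alpha}^{(l)}$ is obtained for $\tilde{x}^{(l)}(\cdot, V, \tilde{\Delta}^{(l)}, \tilde{\alpha}^{(l)})$ by (2.3) and, for this reason, by (2.29) and (2.25), the left-hand side of (2.3) for $\tilde{x}^{(l)}(\cdot, V, \tilde{\Delta}^{(l)}, \tilde{\alpha}^{(l)})$ does not exceed
\[ \alpha^{(l)} + \alpha_*^{(l)} + \|x^{(l)}(t^*, V, \Delta^{(l)}, \alpha^{(l)}) - x^{*(l)}(t^*, V, \Delta_*^{(l)}, \alpha_*^{(l)})\|. \]
By (2.26), the latter term in this sum cannot exceed $2\varepsilon_l$. Hence inequality (2.28) follows. By (2.28) and (2.21), $\tilde{\alpha}^{(l)} < 1$. Finally, let us prove that $\tilde{x}[\cdot]$ from (2.17) is a quasimotion. As $l \to \infty$, by (2.18), $\varepsilon_l \to 0$. Consequently, from (2.27) and (2.24), $\operatorname{diam} \tilde{\Delta}^{(l)} \to 0$ as $l \to \infty$, and (2.26) shows that, as $l \to \infty$, the stepwise quasimotions $\tilde{x}^{(l)}(\cdot, V, \tilde{\Delta}^{(l)}, \tilde{\alpha}^{(l)})$ converge (in the metric of $M_n[t_0, \theta]$) to the continuous function $\tilde{x}[\cdot]$. The function $\tilde{x}[\cdot]$ is then (by Definition 2.1) a quasimotion of the system (1.1) generated by the strategy $V$ from the initial position $(t_0, x_0)$. The theorem is proved. The same kind of proof may be used for the next assertion.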
Proposition 2.5. For any position $(t^*, x[t^*])$, where $t^* \in [t_0, \theta]$ and $x[\cdot]$ is an arbitrary quasimotion of the bunch $\mathcal{X}[t_0, x_0, U, V]$, the following inclusion is valid:
\[ X[\theta, t^*, x[t^*], U, V] \subseteq X[\theta, t_0, x_0, U, V]. \]
For this reason, as the initial position moves along any quasimotion (generated by a given strategy from a specified initial position), no new segments of quasimotions may occur that could not be found in the original bunch of quasimotions (generated from the same initial position).
Example 2.2.
Consider the control system of Example 2.1,
\[ \dot{x} = v, \quad 0 \le t \le 2, \quad x[t_0] = x[0] = 0, \quad |v| \le 1, \]
but using a different strategy $V$ (Fig. 1.2.5):
\[ V \div v(t, x) = \begin{cases} 1 & \text{with } x > 0,\ t \in [0, 2], \\ 1 & \text{with } x = 0,\ t = 1, \\ 0 & \text{with } x = 0,\ t \in [0, 1), \\ 0 & \text{with } x = 0,\ t \in (1, 2), \\ -1 & \text{with } x < 0,\ t \in [0, 2]. \end{cases} \]
According to [43, p. 331, stepwise motions originating at the position
u=o
- o=+l
1
c
1
I-I Fig. 1.2.5.
-1 o=o
"
- ^
2
-I
24
1. Quasirnotiom and Their Properties
(to = 0, x(’’)), where
x(“)+ xo = 0, generate the motions shown as a broken line in Fig. 1.2.6. When they originate at the position (t* = 1, x[t*] = 0), part of the previous motion x(’)[t] = 0, 0 < t < 2 vanishes (Fig. 1.2.7) and a new one appears x(’)[t] = 1 - t, 1 < t < 2 (shown as a broken line in Fig. 1.2.7). This “appearance” or “disappearance” of motions can lead to a loss of “optimal” motions or the appearance of new motions that may spoil the initial optimality of the strategy chosen as a solution of the problem. Let us apply the approach we have discussed for plotting a (quasi)motion. First, the “new” motion x(’)[ .] is a segment of the quasimotion
0 with t~ [0, 13, xCtl = {d2)[t] with te[1, 23 since x[ .] is the limit of the sequence of stepwise quasimotions 0 at ~ E [ O , l), x(t) = xdk(t) at t ~ [ 1 ,21
where xdk(t)= 1 - t - 6 k , 0 < d k < 1 and limk-,a 6, = 0. Second, the “segment” of the motion x(’)[ . ] is the limit of a sequence of
stepwise quasimotions $\{x(t, \tau^{(k)}, 0, 0, V, \Delta^{(k)}, 0),\ t \in [1, 2]\}$ with $\tau^{(k)} \to 1 + 0$ as $k \to \infty$.
Fig. 1.2.6.
Fig. 1.2.7.
3. The Alternative and the Saddle Point
3.1. Useful Information
The line of reasoning in studies of solutions to differential positional games is suggested by the theorem of the alternative. In proving the theorem, concepts given in [43] will be used under the assumption that Condition 1.1 is satisfied. Consider a set $W$ in the space of the positions $(t, x)$. The set $W$ is said to be u-stable if, for any choice of position $(t_*, x_*) \in W$, value $t^* \in (t_*, \theta]$, and vector $v^* \in Q$, among all the solutions $x(t)$ of the differential equation in contingencies
\[ \dot{x}(t) \in \mathcal{F}_u(t, x(t), v^*), \qquad x(t_*) = x_*, \]
where
\[ \mathcal{F}_u(t, x, v^*) = \mathrm{co}\,[\, f \mid f = f(t, x, u, v^*),\ u \in H \,] \tag{3.1} \]
($\mathrm{co}[\cdot]$ is the convex closed hull of the set $[\cdot]$), at least one solution $x(t)$, $t_* \le t \le \theta$, may be found satisfying the inclusion
\[ (t^*, x(t^*)) \in W. \]
If, for every position $(t, x) \in [0, \theta) \times R^n$ and every vector $z \in R^n$, it is true that
\[ \min_{u \in H} \max_{v \in Q} z' f(t, x, u, v) = \max_{v \in Q} \min_{u \in H} z' f(t, x, u, v) \tag{3.2} \]
then, following [43, p. 56], we may assume that the saddle point condition is satisfied for a small game. In (3.2) and the following discussion, the prime signifies transposition. Let us apply the notion of an extremal strategy to the set $W$ used in the definition of a u-stable set. Let $(t_*, x_*)$ be a given position. We introduce a strategy $U^e \div u^e(t, x)$ extremal for $W$ in the following manner. If the hyperplane $t = t_*$ in the position space does not intersect $W$, any vector $u \in H$ can be chosen as $u^e(t_*, x_*)$. If such an intersection does exist, a position $(t_*, w_*) \in W$ must be selected that is the nearest in the Euclidean metric to $(t_*, x_*)$. As $u^e(t_*, x_*)$ the vector $u^e$ must be chosen that is defined by the equality
\[ \max_{v \in Q} (x_* - w_*)' f(t_*, x_*, u^e, v) = \min_{u \in H} \max_{v \in Q} (x_* - w_*)' f(t_*, x_*, u, v). \]
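Both the saddle point condition (3.2) and the extremal choice of the control can be checked on finite grids. The following is a rough sketch under stated assumptions: the control sets $H$ and $Q$ are replaced by a finite grid on $[-1, 1]$, and the two right-hand sides `separable` and `coupled` are hypothetical examples, not systems from the text.

```python
def dot(a, b):
    """Scalar product z'f of two vectors given as tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))


def small_game_values(z, f, t, x, H, Q):
    """Return (min_u max_v z'f, max_v min_u z'f) over finite control grids H and Q."""
    minimax = min(max(dot(z, f(t, x, u, v)) for v in Q) for u in H)
    maximin = max(min(dot(z, f(t, x, u, v)) for u in H) for v in Q)
    return minimax, maximin


def extremal_control(x_star, w_star, f, t, H, Q):
    """Grid analogue of the extremal choice: u minimizing max_v (x* - w*)' f(t, x*, u, v)."""
    s = [xi - wi for xi, wi in zip(x_star, w_star)]
    return min(H, key=lambda u: max(dot(s, f(t, x_star, u, v)) for v in Q))


GRID = [i / 10 for i in range(-10, 11)]   # finite stand-in for H and Q


def separable(t, x, u, v):
    """Hypothetical right-hand side f = u + v: the controls enter additively."""
    return (u + v,)


def coupled(t, x, u, v):
    """Hypothetical right-hand side f = (u - v)^2: the controls are coupled."""
    return ((u - v) ** 2,)
```

On this grid the additive system gives equal minimax and maximin values, while the coupled one gives a strictly larger minimax, so (3.2) fails for it; this is the kind of system for which counter-strategies become relevant.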
Now consider two functions $x^{(1)}[t]$ and $x^{(2)}(t)$. The former is the solution of the equation
\[ \dot{x}[t] = f(t, x[t], u^*, v(t)), \qquad x^{(1)}[t_*] = x_*^{(1)}. \]
The function $x^{(2)}(t)$ is the solution of the equation in contingencies
\[ \dot{x}(t) \in \mathcal{F}_u(t, x(t), v^*), \qquad x^{(2)}(t_*) = x_*^{(2)}, \]
where the set $\mathcal{F}_u(\cdot)$ is defined in (3.1). The constant vectors $u^*$ and $v^*$ are selected so that
\[ \max_{v \in Q} [x_*^{(1)} - x_*^{(2)}]' f(t_*, x_*^{(1)}, u^*, v) = \min_{u \in H} \max_{v \in Q} [x_*^{(1)} - x_*^{(2)}]' f(t_*, x_*^{(1)}, u, v), \]
\[ \min_{u \in H} [x_*^{(1)} - x_*^{(2)}]' f(t_*, x_*^{(2)}, u, v^*) = \max_{v \in Q} \min_{u \in H} [x_*^{(1)} - x_*^{(2)}]' f(t_*, x_*^{(2)}, u, v). \]
Denote $\rho(t) = \|x^{(1)}[t] - x^{(2)}(t)\|$.
Then [43, p. 59], at every $t \in [t_*, \theta]$, the estimate (3.3) holds. The estimate (3.3) is uniform over all positions $(t_*, x_*^{(1)})$ and $(t_*, x_*^{(2)})$ in every pre-assigned bounded domain $G$ of the position space. Note that the estimate (3.3) may be employed over every semi-interval $[\tau_j, \tau_{j+1})$ of the partition $\Delta$ (see subsection 2.2). This fact will be used in the following lemma, which is needed for proving the alternative.
3.2. Proof of the Alternative
Lemma 3.1. Let the closed set $W$ be u-stable, let the strategy $U^e \div u^e(t, x)$ be extremal for $W$, and let the initial position $(t_*, x_*) \in W$. Then for every quasimotion $x[t, t_*, x_*, U^e]$, $t_* \le t \le \theta$,
\[ (t, x[t, t_*, x_*, U^e]) \in W \quad \forall t \in [t_*, \theta]. \tag{3.4} \]
Proof. The lemma will be proved by contradiction; assume that some quasimotion $x^*[t] = x^*[t, t_*, x_*, U^e]$, $t_* \le t \le \theta$, of the bunch $\mathcal{X}[t_*, x_*, U^e]$ does not satisfy this property and that $t^* \in [t_*, \theta)$ is the upper bound of times $t$ at which $(t, x^*[t, t_*, x_*, U^e]) \in W$. Since $W$ is closed and $x^*[\cdot] = \{x^*[t],\ t_* \le t \le \theta\}$ is continuous, the inclusion $(t^*, x^*[t^*]) \in W$ is valid, and, since $t^* < \theta$, a time $\tau^* \in (t^*, \theta]$ may be found such that
\[ (t, x^*[t]) \notin W \ \text{at every } t \in (t^*, \tau^*]. \tag{3.5} \]
Since $W$ is u-stable, every section $W(\tau) = W \cap \{t = \tau\}$, where $\tau \in [t_*, \tau^*]$, is not empty, while the distance from the position $(t, x^*[t])$ to the sets $W(t)$ at $t \in [t_*, \tau^*]$ is less than some number $\xi$. The stepwise quasimotions
\[ x(t, U^e, \Delta^{(r)}, \alpha^{(m)}) = x(t, \tau_0^{(r)}, x_0^{(r)}, \hat{x}_0^{(r)}, U^e, v^{(r)}(\cdot), \Delta^{(r)}, \alpha^{(m)}), \quad t_* \le t \le \theta \quad (r, m = 1, 2, \ldots), \]
yield in the limit (as $r, m \to \infty$) the chosen quasimotion $x^*[t]$, $t_* \le t \le \theta$. It will be demonstrated in the discussion which follows that at $t \in [t_*, \tau^*]$,
\[ \varepsilon_{rm}^2(t) \le \psi_{rm}(t) = [\varepsilon_{rm}^2(t_*) + \alpha^{(m)} + (1 + (t - t_*)) \varphi_r]\, e^{\beta(t - t_*)}, \tag{3.6} \]
where $\varepsilon_{rm}(t)$ is the distance from the point $x(t, U^e, \Delta^{(r)}, \alpha^{(m)})$ to the section $W(t)$; $\varphi_r = \sup \varphi(t - \tau_j^{(r)})$ ($j = 0, 1, \ldots$), $t \in [\tau_j^{(r)}, \tau_{j+1}^{(r)})$; $\tau_j^{(r)}$ is a point in the partition $\Delta^{(r)}$; $\varphi(t - \tau_j^{(r)})$ and $\beta$ are the values from the estimate (3.3); the domain $G$ is chosen so that it contains all positions $(t, x(t, U^e, \Delta^{(r)}, \alpha^{(m)}))$ together with their $\xi$-neighborhood. The validity of Lemma 3.1 is an immediate result of (3.6). Indeed, the inclusion $(t_*, x_*) \in W$ leads to $\psi_{rm}(t_*) \to 0$ as $r, m \to \infty$ and, moreover, $\varphi_r \to 0$, $\alpha^{(m)} \to 0$ as $m, r \to \infty$. Then the inequality (3.6) implies that $\varepsilon_{rm}^2(t) \to 0$ as $r \to \infty$ and $m \to \infty$, where $t \in [t_*, \tau^*]$. Consequently, for the stepwise quasimotions $x(t, U^e, \Delta^{(r)}, \alpha^{(m)})$ all positions $(t, x(t, U^e, \Delta^{(r)}, \alpha^{(m)}))$, $t \in [t_*, \tau^*]$, converge to the set $W$ while
\[ \lim_{r, m \to \infty} \sup_{t_* \le t \le \theta} \|x(t, U^e, \Delta^{(r)}, \alpha^{(m)}) - x^*[t]\| = 0. \]
Hence $(t, x^*[t]) \in W$ at every $t \in [t_*, \tau^*]$, which contradicts (3.5). We must still demonstrate that the estimate (3.6) holds for every $r, m = 1, 2, \ldots$ Assume that it does not hold for at least one pair $(r, m)$. Denote by $t^0$ the lower bound of the numbers $t \in [t_*, \tau^*]$ for which $\varepsilon_{rm}^2(t) > \psi_{rm}(t)$ is valid. The function $\varepsilon_{rm}^2(t)$ is right continuous [43, p. 64]. As a consequence, at time $t^0$, $\varepsilon_{rm}^2(t^0) = \psi_{rm}(t^0)$. Let $[\tau_j^{(r)}, \tau_{j+1}^{(r)})$ be a semi-interval containing the point $t^0$. By the definition of the number $t^0$ and from the inequality $\varepsilon_{rm}^2(t_*) < \psi_{rm}(t_*)$, we have
\[ \varepsilon_{rm}^2(\tau_j^{(r)}) \le \psi_{rm}(\tau_j^{(r)}), \qquad \varepsilon_{rm}^2(\bar{t}) > \psi_{rm}(\bar{t}), \tag{3.7} \]
where $\bar{t}$ is some point of the semi-interval $[\tau_j^{(r)}, \tau_{j+1}^{(r)})$ to the right of $t^0$. Relations (3.7) and (3.6) lead to the following chain of relations:
\[ \begin{aligned} \varepsilon_{rm}^2(\bar{t}) &> \psi_{rm}(\bar{t}) - [\psi_{rm}(\tau_j^{(r)}) - \varepsilon_{rm}^2(\tau_j^{(r)})] \exp \beta(\bar{t} - \tau_j^{(r)}) \\ &= [\varepsilon_{rm}^2(t_*) + \alpha^{(m)} + (1 + (\bar{t} - t_*)) \varphi_r] \exp \beta(\bar{t} - t_*) \\ &\quad - [\varepsilon_{rm}^2(t_*) + \alpha^{(m)} + (1 + (\tau_j^{(r)} - t_*)) \varphi_r] \exp \beta(\tau_j^{(r)} - t_*) \exp \beta(\bar{t} - \tau_j^{(r)}) + \varepsilon_{rm}^2(\tau_j^{(r)}) \exp \beta(\bar{t} - \tau_j^{(r)}) \\ &= (\bar{t} - \tau_j^{(r)}) \varphi_r \exp \beta(\bar{t} - t_*) + \varepsilon_{rm}^2(\tau_j^{(r)}) \exp \beta(\bar{t} - \tau_j^{(r)}) \\ &\ge (\bar{t} - \tau_j^{(r)}) \varphi_r + \varepsilon_{rm}^2(\tau_j^{(r)}) [1 + \beta(\bar{t} - \tau_j^{(r)})] \\ &\ge \varepsilon_{rm}^2(\tau_j^{(r)}) [1 + \beta(\bar{t} - \tau_j^{(r)})] + (\bar{t} - \tau_j^{(r)}) \varphi(\bar{t} - \tau_j^{(r)}). \end{aligned} \]
In the latter inequality $\varphi(\bar{t} - \tau_j^{(r)}) \le \varphi_r$.
In light of the reasoning in [43, p. 65], this inequality contradicts (3.3). This establishes inequality (3.6) and, consequently, Lemma 3.1.
Using the outline of the proof proposed by Krasovskiy [43, p. 65-68], we now prove that the theorem of the alternative holds for quasimotions.
Theorem 3.1. Let Condition 1.1 be valid and let (3.2) be true for any position $(t, x) \in [0, \theta) \times R^n$ and any vector $z \in R^n$. If some closed set $M \subset R^n$ is given, then, whatever the initial position $(t_0, x_0) \in [0, \theta) \times R^n$, either there exists a strategy $U^e \in \mathcal{U}$ which guarantees, for all quasimotions $x[\cdot, t_0, x_0, U^e]$, hitting the set $M$ at time $\theta$, in other words,
\[ x[\theta, t_0, x_0, U^e] \in M \quad \forall x[\cdot, t_0, x_0, U^e], \]
or there exist a number $\varepsilon > 0$ and a strategy $V^e \in \mathcal{V}$ such that, for all the quasimotions $x[\cdot, t_0, x_0, V^e]$, the deviation at time $\theta$ from the $\varepsilon$-neighborhood $M^\varepsilon$ of the set $M$ is assured,
\[ X[\theta, t_0, x_0, V^e] \cap M^\varepsilon = \emptyset. \]
Remark 3.1. Changing the roles of the strategies $U$ and $V$, we obtain the following new alternative: Let some closed set $M \subset R^n$ be given. Then, whatever the initial position $(t_0, x_0) \in [0, \theta) \times R^n$, either there exists a strategy $V^e \in \mathcal{V}$ such that $x[\theta, t_0, x_0, V^e] \in M$ for every quasimotion $x[\cdot, t_0, x_0, V^e]$, or there exist a strategy $U^e \in \mathcal{U}$ and a number $\varepsilon > 0$ such that
\[ X[\theta, t_0, x_0, U^e] \cap M^\varepsilon = \emptyset. \]
3.3. The Minimax, Maximin and Saddle Point
A zero-sum positional differential game in quasimotions with a scalar payoff function is given by the system
\[ \Gamma = \langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, F(x[\theta]) \rangle. \tag{3.8} \]
Here 1 and 2 are the players' numbers. The first minimizes the scalar payoff $F(x[\theta])$ by choosing the strategy $U \in \mathcal{U}$, while the latter maximizes it by choosing the strategy $V \in \mathcal{V}$; the sets of strategies $\mathcal{U}$ and $\mathcal{V}$ are defined in section 1.1. The system $\Sigma$ is described by the differential equation (1.1) as follows:
\[ \dot{x} = f(t, x, u, v), \qquad x[t_0] = x_0, \tag{3.9} \]
while Condition 1.1 is assumed to be valid. The game proceeds in the following manner. The minimizing player chooses and employs his strategy $U \in \mathcal{U}$ to minimize $F(x[\theta])$; the other player chooses $V \in \mathcal{V}$ with the opposite aim. The pair of strategies (situation) $(U, V)$ generates from the initial position $(t_0, x_0) \in [0, \theta) \times R^n$ the bunch $\mathcal{X}[t_0, x_0, U, V]$ of quasimotions $x[\cdot, t_0, x_0, U, V]$. Then the first player loses $F(x[\theta, t_0, x_0, U, V])$, and the latter wins the same amount.
Condition 3.1. Let the scalar function $F(x)$ be continuous over $R^n$.
The difference between games (3.8) and (1.9) is that (3.8) uses the quasimotions rather than the motions of the system $\Sigma$. As in the case of (1.9), the solution of (3.8) for the first player is constituted by the minimax strategy $U^0 \in \mathcal{U}$:
\[ F_* = \min_{U \in \mathcal{U}} \max_{x[\cdot]} F(x[\theta, t_0, x_0, U]) = \max_{x[\cdot]} F(x[\theta, t_0, x_0, U^0]). \tag{3.10} \]
The number $F_*$ is referred to as the minimax of the game (3.8). This minimax strategy possesses two properties.
Property 3.1. The first equality in (3.10) is equivalent to
\[ \max_{x[\cdot]} F(x[\theta, t_0, x_0, U]) \ge \max_{x[\cdot]} F(x[\theta, t_0, x_0, U^0]), \quad \forall U \in \mathcal{U}, \tag{3.11} \]
which follows immediately from the first equality in (3.10).
Recall that the first player minimizes $F(x[\theta])$, so if his strategy $U \in \mathcal{U}$ is fixed, the “worst” loss for him will be
\[ \max_{x[\cdot]} F(x[\theta, t_0, x_0, U]) = \max_{x[\cdot] \in \mathcal{X}[t_0, x_0, U]} F(x[\theta]) = F(x^*[\theta, t_0, x_0, U]). \]
This loss (the first player's strategy $U$ is fixed) occurs on the quasimotion $x^*[\cdot, t_0, x_0, U]$ of the bunch $\mathcal{X}[t_0, x_0, U]$ over which the maximum in the problem
\[ F(x) \to \max \quad \text{with } x \in X[\theta, t_0, x_0, U] \]
is obtained, where the set $X[\theta, t_0, x_0, U]$ is defined in Proposition 2.3. Then inequality (3.11) signifies that, by using the minimax strategy $U^0$, the first player suffers the least of all possible “bad” (maximal) losses. The latter equality in (3.10) leads immediately to
Property 3.2. $F_* \ge F(x[\theta, t_0, x_0, U^0])$ for any quasimotion $x[\cdot]$ of the bunch $\mathcal{X}[t_0, x_0, U^0]$.
As a result, by using the strategy $U^0 \in \mathcal{U}$, the first player keeps his losses from exceeding $F_*$. Combining Properties 3.1 and 3.2 shows that $F_*$ of (3.10) is the "best" (lowest possible) of all the guaranteed losses for the first player, and that it is secured by the minimax strategy $U^0$. In this way, the principle of the guaranteed result [30] is realized by the first player. Analogously, for the second, maximizing, player the solution of (3.9) is determined by the maximin strategy $V^0 \in \mathcal{V}$:
\[ F^* = \max_{V \in \mathcal{V}} \min_{x[\cdot]} F(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} F(x[\theta, t_0, x_0, V^0]). \tag{3.12} \]
The number $F^*$ is referred to as the maximin of the game (3.8).

Property 3.3. The first equality in (3.12) is equivalent to the inequality
\[ \min_{x[\cdot]} F(x[\theta, t_0, x_0, V^0]) \ge \min_{x[\cdot]} F(x[\theta, t_0, x_0, V]), \qquad \forall\, V \in \mathcal{V}. \]
From the latter equality in (3.12) we have the following property.

Property 3.4. $F^* \le F(x[\theta, t_0, x_0, V^0])$ for any quasimotion $x[\cdot, t_0, x_0, V^0]$ of the bunch $\mathcal{X}[t_0, x_0, V^0]$.
In substantive terms, combining Properties 3.3 and 3.4 realizes the principle of the guaranteed result for the second player, who maximizes $F(x[\theta])$. Namely, of all the "worst" (smallest) gains for the second player, the highest is the gain assured by the maximin strategy $V^0$. The relationship between the minimax and maximin of the game (3.8) is determined by the following assertion.
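Proposition 3.1. $F_* \ge F^*$, where the minimax $F_*$ and maximin $F^*$ are defined in (3.10) and (3.12), respectively.

The proof follows from the following chain of relations:
\[
\begin{aligned}
F_* &\overset{(3.10)}{=} \min_{U \in \mathcal{U}} \max_{x[\cdot]} F(x[\theta, t_0, x_0, U]) = \max_{x[\cdot]} F(x[\theta, t_0, x_0, U^0]) = \max_{x \in X[\theta, t_0, x_0, U^0]} F(x) \\
&\overset{(1)}{\ge} \max_{x \in X[\theta, t_0, x_0, U^0, V^0]} F(x) = \max_{x[\cdot]} F(x[\theta, t_0, x_0, U^0, V^0]) \\
&\overset{(2)}{\ge} \min_{x[\cdot]} F(x[\theta, t_0, x_0, U^0, V^0]) = \min_{x \in X[\theta, t_0, x_0, U^0, V^0]} F(x) \\
&\overset{(3)}{\ge} \min_{x \in X[\theta, t_0, x_0, V^0]} F(x) = \min_{x[\cdot]} F(x[\theta, t_0, x_0, V^0]) \overset{(3.12)}{=} \max_{V \in \mathcal{V}} \min_{x[\cdot]} F(x[\theta, t_0, x_0, V]) = F^*.
\end{aligned}
\]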
Inequalities (1) and (3) are valid because of the inclusion (Proposition 2.4)
\[ X[\theta, t_0, x_0, U^0, V^0] \subseteq X[\theta, t_0, x_0, U^0] \cap X[\theta, t_0, x_0, V^0], \]
whereas the transition (2) is obvious.

Definition 3.2 [43, p. 45]. The situation $(U^0, V^0) \in \mathcal{U} \times \mathcal{V}$ is referred to as the saddle point of the differential game (3.8) if
\[
\begin{aligned}
F^* &= \max_{V \in \mathcal{V}} \min_{x[\cdot]} F(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} F(x[\theta, t_0, x_0, V^0]) \\
&= \max_{x[\cdot]} F(x[\theta, t_0, x_0, U^0]) = \min_{U \in \mathcal{U}} \max_{x[\cdot]} F(x[\theta, t_0, x_0, U]) = F_*.
\end{aligned}
\tag{3.13}
\]
The number $F^0 = F^* = F_*$ is referred to [43, p. 45] as the value of the game (3.8).
Theorem 3.2. Let Conditions 1.1 and 3.1 and equality (3.2) be satisfied. Then, for every choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, there exists a saddle point $(U^0, V^0)$ in the game (3.8).

The proof of a similar theorem (Theorem 1.2) for motions is given in [43, pp. 73-77] and relies only on the theorem of the alternative. The alternative is also valid for quasimotions (Theorem 3.1), and for this reason the proofs in [43, pp. 73-77] carry over to quasimotions.
3.4. Saddle Point Properties of $\Gamma$
The saddle point has certain useful properties.

Property 3.5. The situation $(U^0, V^0)$ is the saddle point of the game (3.8) iff the following inequalities are satisfied:
\[ F(x[\theta, t_0, x_0, U^0]) \le F(x[\theta, t_0, x_0, U^0, V^0]) \le F(x[\theta, t_0, x_0, V^0]) \tag{3.14} \]
for all quasimotions $x[t, t_0, x_0, U^0]$, $x[t, t_0, x_0, U^0, V^0]$, $x[t, t_0, x_0, V^0]$, $t_0 \le t \le \theta$.

Proof.
Necessity. The implication (3.13) $\Rightarrow$ (3.14) results from the equalities
\[ F_* = \max_{x[\cdot]} F(x[\theta, t_0, x_0, U^0]) = \min_{x[\cdot]} F(x[\theta, t_0, x_0, V^0]) = F^*, \]
which are equivalent to the inequalities
\[ F(x[\theta, t_0, x_0, U^0]) \le F_* = F^* \le F(x[\theta, t_0, x_0, V^0]) \tag{3.15} \]
for any quasimotions $x[t, t_0, x_0, U^0]$, $x[t, t_0, x_0, V^0]$, $t_0 \le t \le \theta$. Then for the quasimotions $x[\cdot, t_0, x_0, U^0, V^0]$, by the inclusion (Proposition 2.4)
\[ \mathcal{X}[t_0, x_0, U^0, V^0] \subseteq \mathcal{X}[t_0, x_0, U^0] \cap \mathcal{X}[t_0, x_0, V^0], \]
we have from (3.15)
\[ F(x[\theta, t_0, x_0, U^0, V^0]) \le F_* = F^* \le F(x[\theta, t_0, x_0, U^0, V^0]). \]
Hence
\[ F(x[\theta, t_0, x_0, U^0, V^0]) = F_* = F^*. \tag{3.16} \]
Combining (3.15) and (3.16) leads to (3.14). At the same time, (3.16) proves that for any quasimotion $x[t, t_0, x_0, U^0, V^0]$, $t_0 \le t \le \theta$, the value of the game is unique and equal to $F(x[\theta, t_0, x_0, U^0, V^0]) = F_* = F^*$.
Sufficiency. From the left-hand side of (3.14), we have
\[ F(x[\theta, t_0, x_0, U^0, V^0]) \ge \max_{x[\cdot]} F(x[\theta, t_0, x_0, U^0]) \ge \min_{U \in \mathcal{U}} \max_{x[\cdot]} F(x[\theta, t_0, x_0, U]). \tag{3.17} \]
Analogously, from the right-hand side of (3.14) it follows that
\[ F(x[\theta, t_0, x_0, U^0, V^0]) \le \min_{x[\cdot]} F(x[\theta, t_0, x_0, V^0]) \le \max_{V \in \mathcal{V}} \min_{x[\cdot]} F(x[\theta, t_0, x_0, V]). \tag{3.18} \]
But, by Proposition 3.1, $F_* \ge F^*$. Combining this inequality with (3.17) and (3.18) transforms all the inequalities in (3.17) and (3.18) into equalities, which demonstrates that (3.13) is valid.

In proving the necessity we have established the following.

Property 3.6. If $(U^0, V^0)$ is the saddle point of the game (3.8), then for any quasimotion $x[\cdot, t_0, x_0, U^0, V^0]$ of the bunch $\mathcal{X}[t_0, x_0, U^0, V^0]$ the game (3.8) has the unique value
\[ F(x[\theta, t_0, x_0, U^0, V^0]) = F_* = F^*. \]

The saddle points $(U^{(1)}, V^{(1)})$ and $(U^{(2)}, V^{(2)})$ of the game (3.8) are referred to as interchangeable if $(U^{(1)}, V^{(2)})$ and $(U^{(2)}, V^{(1)})$ are also saddle points of this game, and are said to be equivalent if
\[ F(x[\theta, t_0, x_0, U^{(1)}, V^{(1)}]) = F(x[\theta, t_0, x_0, U^{(2)}, V^{(2)}]) \]
for any quasimotions $x[\cdot, t_0, x_0, U^{(1)}, V^{(1)}]$, $x[\cdot, t_0, x_0, U^{(2)}, V^{(2)}]$.
Property 3.7. All the saddle points of the differential game (3.8) are equivalent and interchangeable.
Proof. If $(U^{(1)}, V^{(1)})$ and $(U^{(2)}, V^{(2)})$ are two saddle points of the game (3.8), then by Proposition 2.3 the following inclusions are valid:
\[ X[\theta, t_0, x_0, U^{(r)}, V^{(m)}] \subset X[\theta, t_0, x_0, U^{(r)}] \cap X[\theta, t_0, x_0, V^{(m)}] \qquad (r, m = 1, 2) \]
for any quasimotions $x[\cdot, t_0, x_0, U^{(r)}, V^{(m)}]$. Hence, from Property 3.5,
\[
\begin{aligned}
F(x[\theta, t_0, x_0, U^{(1)}, V^{(1)}]) &\le F(x[\theta, t_0, x_0, U^{(2)}, V^{(1)}]) \le F(x[\theta, t_0, x_0, U^{(2)}, V^{(2)}]), \\
F(x[\theta, t_0, x_0, U^{(2)}, V^{(2)}]) &\le F(x[\theta, t_0, x_0, U^{(1)}, V^{(2)}]) \le F(x[\theta, t_0, x_0, U^{(1)}, V^{(1)}])
\end{aligned}
\]
for all quasimotions $x[\cdot, t_0, x_0, U^{(r)}, V^{(m)}]$ $(r, m = 1, 2)$. Combining these two relations transforms the inequalities into equalities. Hence, in particular, the saddle points $(U^{(1)}, V^{(1)})$ and $(U^{(2)}, V^{(2)})$ are equivalent.

Now let us demonstrate the interchangeability of the saddle points $(U^{(1)}, V^{(1)})$ and $(U^{(2)}, V^{(2)})$. Equivalence means that
\[ F(x[\theta, t_0, x_0, U^{(1)}, V^{(1)}]) = F(x[\theta, t_0, x_0, U^{(1)}, V^{(2)}]) = F(x[\theta, t_0, x_0, U^{(2)}, V^{(2)}]), \tag{3.19} \]
and from (3.14)
\[
\begin{aligned}
F(x[\theta, t_0, x_0, U^{(1)}]) &\le F(x[\theta, t_0, x_0, U^{(1)}, V^{(1)}]) \le F(x[\theta, t_0, x_0, V^{(1)}]), \\
F(x[\theta, t_0, x_0, U^{(2)}]) &\le F(x[\theta, t_0, x_0, U^{(2)}, V^{(2)}]) \le F(x[\theta, t_0, x_0, V^{(2)}]).
\end{aligned}
\tag{3.20}
\]
Combining (3.19) and (3.20) leads to the inequalities
\[ F(x[\theta, t_0, x_0, U^{(1)}]) \le F(x[\theta, t_0, x_0, U^{(1)}, V^{(2)}]) \le F(x[\theta, t_0, x_0, V^{(2)}]), \tag{3.21} \]
which, like inequalities (3.19) and (3.20), are valid for any quasimotions $x[\cdot, t_0, x_0, U^{(1)}]$, $x[\cdot, t_0, x_0, U^{(1)}, V^{(2)}]$, $x[\cdot, t_0, x_0, V^{(2)}]$. The fact that (3.21) is valid signifies, by virtue of Property 3.5, that $(U^{(1)}, V^{(2)})$ is a saddle point of the differential game (3.8).

Remark 3.2. The saddle point $(U^0, V^0)$ is chosen as the solution of the positional differential game (3.8) for the following reasons:
(1) The guaranteed results of this game (the minimax $F_*$ of (3.10) and the maximin $F^*$ of (3.12)) are equal to the value of the payoff at the saddle point (equalities (3.13)). In other words, at the saddle point the guaranteed results of the two players coincide;
(2) The value of the game
\[ F(x[\theta, t_0, x_0, U^0, V^0]) = F^0 \]
is the same at every saddle point (equivalence property);
(3) All the saddle points are interchangeable; in other words, either player can use "his" strategy from any saddle point. This strategy, combined with that of the other player (also taken from any saddle point), again forms a saddle point of the game (3.8).
It is because these properties do not hold for vector (Slater or Pareto) saddle points that we have undertaken the special study of vector-valued maximins and minimaxes presented here.

Remark 3.3. Proposition 3.1 implies that if there is no saddle point in the game (3.8), then $F_* > F^*$. An example of such a game is provided in [83, pp. 242-244]. This situation occurs if the saddle point condition for the small game is not met. However, a saddle point can be obtained if the classes of strategies are extended to include mixed strategies, or if a pair consisting of a strategy of one player and a counter-strategy of the other is used. The existence of a saddle point in the differential game (3.8) may be proved as in [43, pp. 289, 364] if mixed strategies, or strategies and counter-strategies, are substituted for the pure strategies from $\mathcal{U}$ and $\mathcal{V}$ in the class of quasimotions, with the natural (see Section 2) substitution of quasimotions for the corresponding motions.
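The equivalence and interchangeability properties, as well as the gap $F_* > F^*$ when no saddle point exists, can be observed in a finite matrix analogue. The following sketch makes the same simplifying assumption as before: rows are strategies of the minimizing player, columns are strategies of the maximizing player, and both example matrices are hypothetical. It illustrates the statements and is not the construction used in the book.

```python
from itertools import product
from typing import List, Tuple

Matrix = List[List[float]]


def saddle_points(payoff: Matrix) -> List[Tuple[int, int]]:
    """Entries that are the maximum of their row and the minimum of their column."""
    columns = list(zip(*payoff))
    return [
        (i, j)
        for i, row in enumerate(payoff)
        for j, value in enumerate(row)
        if value == max(row) and value == min(columns[j])
    ]


def guaranteed_values(payoff: Matrix) -> Tuple[float, float]:
    """Return (minimax, maximin) of the matrix game."""
    upper = min(max(row) for row in payoff)
    lower = max(min(col) for col in zip(*payoff))
    return upper, lower


def equivalent_and_interchangeable(payoff: Matrix) -> bool:
    """Check the finite analogue of Property 3.7 on all pairs of saddle points."""
    points = saddle_points(payoff)
    point_set = set(points)
    for (i1, j1), (i2, j2) in product(points, repeat=2):
        if payoff[i1][j1] != payoff[i2][j2]:   # equivalence fails
            return False
        if (i1, j2) not in point_set:          # interchangeability fails
            return False
    return True


if __name__ == "__main__":
    several = [[1, 1, 0], [1, 1, 0], [2, 3, 1]]   # four saddle points
    print(saddle_points(several), equivalent_and_interchangeable(several))
    no_saddle = [[1, -1], [-1, 1]]                # matching pennies
    print(saddle_points(no_saddle), guaranteed_values(no_saddle))
```

In the second matrix the minimax strictly exceeds the maximin, which is the finite counterpart of the situation described in Remark 3.3.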
Remark 3.4. Theorem 3.2 (existence of a saddle point) is formulated assuming that the players' strategies are constrained only by $U \div u(t, x) \in H$ and $V \div v(t, x) \in Q$. If the strategies are additionally subject to mutual constraints, for example, if they must meet the requirement $g(x[\theta, t_0, x_0, U, V]) \ge 0$, where $g$ is a scalar function defined over $\mathbb{R}^n$, a saddle point may not necessarily exist. Moreover, the inequality $F_* \ge F^*$ (from Proposition 3.1) is not necessarily satisfied.

a) The dynamic system $\Sigma$ is described by a system (3.23) of two "separable" equations with $x \in \mathbb{R}^n$, $y \in \mathbb{R}^n$, $t \in [t_0, \theta]$. Let the set of the first player's strategies be
\[ \mathcal{U} = \{ U \div u(t, x) \mid u(t, x) \in H \in \operatorname{comp} \mathbb{R}^h \} \]
and that of the second one
\[ \mathcal{V} = \{ V \div v(t, y) \mid v(t, y) \in Q \in \operatorname{comp} \mathbb{R}^q \}. \]
The components $f^{(i)}(\cdot)$, $i = 1, 2$, are assumed to satisfy the constraints on $f(\cdot)$ from Condition 1.1. Moreover, let the scalar functions $\varphi(x)$ and $\psi(y)$ be continuous over $\mathbb{R}^n$ and $\mathbb{R}^n$, respectively. In contrast with the game (3.8), what is new here are the special classes of the players' strategies (for example, $u(t, x)$ does not depend on the state vector $y$) and the inequality-type constraint
\[ g(x[\theta], y[\theta]) \ge 0. \tag{3.24} \]
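Assume that the scalar function $g(x, y)$ is continuous over $\mathbb{R}^n \times \mathbb{R}^n$ and denote
\[
\begin{aligned}
\mathcal{U}(V) &= \{ U \in \mathcal{U} \mid g(x[\theta, t_0, x_0, U], y[\theta, t_0, y_0, V]) \ge 0 \text{ for at least one pair of quasimotions } (x[\cdot, t_0, x_0, U], y[\cdot, t_0, y_0, V]) \} \ne \varnothing, \\
\mathcal{V}(U) &= \{ V \in \mathcal{V} \mid g(x[\theta, t_0, x_0, U], y[\theta, t_0, y_0, V]) \ge 0 \text{ for at least one pair of quasimotions } (x[\cdot, t_0, x_0, U], y[\cdot, t_0, y_0, V]) \} \ne \varnothing.
\end{aligned}
\tag{3.25}
\]
From (3.25), we have
\[ X[V] = \{ x[\theta, t_0, x_0, U] \mid U \in \mathcal{U}(V),\ g(x[\theta, t_0, x_0, U], y[\theta, t_0, y_0, V]) \ge 0 \text{ for at least one quasimotion } y[\cdot, t_0, y_0, V] \} \ne \varnothing \]
and
\[ Y[U] = \{ y[\theta, t_0, y_0, V] \mid V \in \mathcal{V}(U),\ g(x[\theta, t_0, x_0, U], y[\theta, t_0, y_0, V]) \ge 0 \text{ for at least one quasimotion } x[\cdot, t_0, x_0, U] \} \ne \varnothing. \]
It is obvious that
\[
\begin{aligned}
X[V] &\subset X[U \div H] = X[\theta, t_0, x_0, U \div H] = \bigcup_{U \in \mathcal{U}} x[\theta, t_0, x_0, U], \\
Y[U] &\subset Y[V \div Q] = Y[\theta, t_0, y_0, V \div Q] = \bigcup_{V \in \mathcal{V}} y[\theta, t_0, y_0, V],
\end{aligned}
\tag{3.26}
\]
where $X[U \div H]$ ($Y[V \div Q]$) is the reachability domain of the first (second) subsystem of (3.23), and $x[\cdot, t_0, x_0, U]$ and $y[\cdot, t_0, y_0, V]$ are the associated quasimotions.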
Proposition 3.2. In the differential game (3.22) the following inequality is satisfied:
\[ \sup_{V \in \mathcal{V}}\ \inf_{x[\theta] \in X[V],\ y[\cdot]} F(x[\theta], y[\theta]) \ \ge\ \inf_{U \in \mathcal{U}}\ \sup_{y[\theta] \in Y[U],\ x[\cdot]} F(x[\theta], y[\theta]). \tag{3.27} \]

Proof. The proof follows from the following chain of inequalities:

Note that in the case $F(x, y) = x - y$ and $g(x, y) = 1 - x - y$, the inequality in (3.27) will be strict.
4. Corollaries of the Alternative

4.1. Additional Propositions

The reasoning in this chapter relies on corollaries of the alternative. They are formulated for the game (3.8), assuming it is played by a single player. The system (1.1) becomes
\[ \dot{x} = f(t, x, v). \tag{4.1} \]
Here, as before, the state vector $x \in \mathbb{R}^n$, the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, and $\theta = \text{const} > 0$ is fixed. The strategy $V$ is associated with the functions $v(t, x) \in Q$. As before, we denote the set of such strategies by $\mathcal{V}$. Condition 1.1 is assumed to hold; in other words, the function $f(t, x, v)$ is continuous and "locally" Lipschitz with respect to $x$, inequality (1.2) is valid, and the set $Q$ is closed and bounded. Let us consider the reachability domain
\[ X[Q] = X[\theta, t_0, x_0, V \div Q]. \tag{4.2} \]
The set $X[Q]$ is closed and bounded in $\mathbb{R}^n$; it consists of the right ends (at $t = \theta$) of the quasimotions $x[t, t_0, x_0, V \div Q]$, $t_0 \le t \le \theta$, of the system (4.1) generated by the strategy $V \div Q$ from the position $(t_0, x_0)$. Now let $M$ be a certain closed set in $\mathbb{R}^n$ such that its intersection with $X[Q] = X[\theta, t_0, x_0, V \div Q]$ is not empty:
\[ M_* = X[Q] \cap M \ne \varnothing. \tag{4.3} \]
Then the set $M_*$ from (4.3) is also closed. This leads to the next assertion.

Proposition 4.1. Assume that Condition 1.1 is satisfied and that the initial position $(t_0, x_0)$ is such that relation (4.3) holds for the closed set $M \subset \mathbb{R}^n$. Then a strategy $V^* \in \mathcal{V}$ exists for which
\[ X[\theta, t_0, x_0, V^*] \subset M_*. \tag{4.4} \]
The inclusion (4.4) means that whatever closed set $M$ intersects the reachability domain (4.2) of the system (4.1), a strategy $V^* \in \mathcal{V}$ may be found that leads every quasimotion $x[\cdot, t_0, x_0, V^*]$ of the system (4.1), generated by $V^*$ from the position $(t_0, x_0)$, to the set $M_*$ of (4.3) at the time when the process ends; i.e., the inclusion
\[ x[\theta, t_0, x_0, V^*] \in M_* \]
is valid for every quasimotion $x[\cdot, t_0, x_0, V^*]$.

Proof. The proof of the proposition follows from Theorem 3.1 (the theorem of the alternative). To apply it, the system (4.1) must be rewritten as
\[ \dot{x} = f(t, x, v) + 0_n u, \tag{4.5} \]
where $u \in [0, 1]$ and $0_n$ is an $n$-vector with zero components. Consequently, in the system (4.5) the control $u$ will be treated as the action of a fictitious player. Its strategies $U$ are associated with the scalar functions $u(t, x) \in [0, 1]$ (for all possible positions $(t, x) \in [0, \theta) \times \mathbb{R}^n$); the set of these strategies is denoted by $\mathcal{U}$. Condition 1.1 and the fact that the set $M$ is closed guarantee that all the requirements of Theorem 3.1 (theorem of the alternative) are met in the case of the system (4.5). Let us verify, for example, the validity of the saddle point condition in the small game. Here it is necessary, by (3.2), to demonstrate that, for any $z \in \mathbb{R}^n$ and $(t, x) \in [0, \theta) \times \mathbb{R}^n$, a pair $(u^0, v^0) \in [0, 1] \times Q$ exists such that
\[ z'[f(t, x, v) + 0_n u^0] \le z'[f(t, x, v^0) + 0_n u^0] \le z'[f(t, x, v^0) + 0_n u] \tag{4.6} \]
for every $u \in [0, 1]$ and $v \in Q$. Relations (4.6) are valid since any number in the closed interval $[0, 1]$ can serve as $u^0$, and the vector $v^0$ is defined by the equality $\max_{v \in Q} z' f(t, x, v) = z' f(t, x, v^0)$. The maximum on the left-hand side is attained by virtue of the continuity of $f(t, x, v)$ with respect to $v$ and the compactness of $Q$. We are then justified in applying Remark 3.1 to the system (4.5) and to the closed set $M$ from (4.3). By that remark, for a given initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, either a strategy $V^* \in \mathcal{V}$ may be found that guarantees intersection with $M$ at time $\theta$ for all quasimotions $x[\cdot, t_0, x_0, V^*]$ of the system (4.5) (in other words, $x[\theta, t_0, x_0, V^*] \in M$ for any quasimotion $x[\cdot, t_0, x_0, V^*]$ of the system (4.5) generated by the strategy $V^*$ from the position $(t_0, x_0)$), or evasion from an $\varepsilon$-neighborhood of $M$ at time $\theta$ is possible by means of some strategy $U^* \in \mathcal{U}$. However, the evasion problem has no solution, since the strategy $U$ does not influence the system (4.5): it "acts" only through the term $0_n u$. Indeed, for every strategy $U \in \mathcal{U}$, it is true that $\mathcal{X}[t_0, x_0, U] = \mathcal{X}[t_0, x_0, V \div Q]$ by the definition of quasimotion bunches of the system (4.5). But then the reachability domain $X[\theta, t_0, x_0, U]$ of (4.5) from the position $(t_0, x_0)$ with a fixed strategy $U \in \mathcal{U}$ coincides with the reachability domain $X[\theta, t_0, x_0, V \div Q]$ of (4.5). In other words, $X[\theta, t_0, x_0, U] = X[\theta, t_0, x_0, V \div Q]$. Consequently, the problem of evading $M$ of (4.3) using some strategy $U \in \mathcal{U}$ cannot be solved:
\[ X[\theta, t_0, x_0, U] \cap M = X[\theta, t_0, x_0, V \div Q] \cap M = M_* \ne \varnothing \]
for any $U \in \mathcal{U}$. By Theorem 3.1 (theorem of the alternative) and Remark 3.1, for the system (4.5) and, consequently, for (4.1), only the problem of intersection with the set $M$ may be solved. As a result, there exists $V^* \in \mathcal{V}$ for which the inclusion (4.4) is valid, which proves Proposition 4.1.
4.2. Corollaries

In the further discussion we will often use the following assertion.

Corollary 4.1. If Condition 1.1 holds, then for every point $x^* \in X[\theta, t_0, x_0, V \div Q]$ there exists a strategy $V^* \in \mathcal{V}$ under which the equality $x[\theta, t_0, x_0, V^*] = x^*$ is valid for all quasimotions $x[\cdot, t_0, x_0, V^*]$ of the system (4.1) generated by the strategy $V^*$ from the position $(t_0, x_0)$.

Because the point $x^*$ is a closed set in $X[\theta, t_0, x_0, V \div Q]$, Corollary 4.1 follows immediately from Proposition 4.1.

Let the vector-valued function $\mathbb{F}(x) = (F_1(x), \dots, F_N(x))$ defined over the set $X[\theta, t_0, x_0, V \div Q]$ be given. Consider the set
\[ \mathbb{F}(X[\theta, t_0, x_0, V \div Q]) = \{ \mathbb{F} \in \mathbb{R}^N \mid \mathbb{F} = \mathbb{F}(x),\ x \in X[\theta, t_0, x_0, V \div Q] \}. \tag{4.7} \]

Corollary 4.2. Assume that Condition 1.1 is valid and that the functions $F_i(x)$ $(i = 1, \dots, N)$ are continuous. Then for any point $\mathbb{F}^* \in \mathbb{F}(X[\theta, t_0, x_0, V \div Q])$, there exists a strategy $V^* \in \mathcal{V}$ such that for every quasimotion $x[\cdot, t_0, x_0, V^*]$ of the system (4.1) (generated by the strategy $V^*$ from the position $(t_0, x_0)$),
\[ \mathbb{F}^* = \mathbb{F}(x[\theta, t_0, x_0, V^*]). \tag{4.8} \]
Indeed, since the functions $F_i(x)$ are continuous, the set
\[ M = \{ x \in X[\theta, t_0, x_0, V \div Q] \mid \mathbb{F}(x) = \mathbb{F}^* \} \tag{4.9} \]
is closed and bounded in $X[\theta, t_0, x_0, V \div Q]$. By Proposition 4.1, a strategy $V^* \in \mathcal{V}$ exists such that $x[\theta, t_0, x_0, V^*] \in M$ for any quasimotion $x[\cdot, t_0, x_0, V^*]$. Because the set $M$ is defined by (4.9), equality (4.8) is valid for all quasimotions $x[\cdot, t_0, x_0, V^*]$.

Corollary 4.3. Let Condition 1.1 be satisfied, the functions $F_i(x)$, $i = 1, \dots, N$, be continuous, and a closed set $M \subset \mathbb{R}^N$ be specified such that
\[ M \cap \mathbb{F}(X[\theta, t_0, x_0, V \div Q]) \ne \varnothing. \tag{4.10} \]
Then a strategy $V^* \in \mathcal{V}$ may be found such that
\[ \mathbb{F}(x[\theta, t_0, x_0, V^*]) \in M \tag{4.11} \]
for any quasimotion $x[\cdot, t_0, x_0, V^*]$ of the system (4.1).
Because the reachability domain $X[\theta, t_0, x_0, V \div Q]$ is a compactum in $\mathbb{R}^n$ (Proposition 2.3), by virtue of the continuity of $F_i(x)$ the set $\mathbb{F}(X[\theta, t_0, x_0, V \div Q])$ is also a compactum in $\mathbb{R}^N$. But then the intersection of the closed set $M$ and $\mathbb{F}(X[\theta, t_0, x_0, V \div Q])$ will be a compactum in $\mathbb{R}^N$. Let us introduce the set
\[ X_* = \{ x \in X[\theta, t_0, x_0, V \div Q] \mid M \cap \mathbb{F}(x) \ne \varnothing \}. \tag{4.12} \]
By virtue of the continuity of $F_i(x)$, $i = 1, \dots, N$, over $X[\theta, t_0, x_0, V \div Q]$ and of (4.10), the set $X_*$ will be closed and bounded. Proposition 4.1 then shows that a strategy $V^*$ exists such that
\[ x[\theta, t_0, x_0, V^*] \in X_* \]
for all quasimotions $x[\cdot, t_0, x_0, V^*]$. Consequently, by the definition (4.12), we have
\[ \mathbb{F}(x[\theta, t_0, x_0, V^*]) \in M \cap \mathbb{F}(X[\theta, t_0, x_0, V \div Q]) \]
for any quasimotion $x[\cdot, t_0, x_0, V^*]$; in other words, the inclusion (4.11) is true.

4.3. A Property of the Linear-Quadratic Problem

Let us establish one auxiliary property of the linear-quadratic problem which we will use to devise counter-examples. The functional
\[ J(V) = x'(\theta) C x(\theta) + \int_{t_0}^{\theta} \{ v'[t] \mathcal{D}(t) v[t] + v'[t] \mathcal{L}(t) x(t) + x'(t) G(t) x(t) \}\, dt \tag{4.13} \]
is defined over the solutions $x(\cdot) = \{ x(t),\ t_0 \le t \le \theta \}$ of the system
\[ \dot{x} = A(t) x + B(t) v, \qquad x(t_0) = x_0, \tag{4.14} \]
with initial conditions from the set
\[ t_0 \in [0, \theta), \qquad x_0 \in \mathbb{R}^n. \tag{4.15} \]
The set of strategies $\mathcal{V}_L$ consists of the strategies defined by functions of the type $V \div v(t, x) = Q(t) x$, where the elements of the $(n \times n)$-dimensional matrices $Q(t)$ are continuous over $[0, \theta]$. Consequently, the strategies differ only in the matrices $Q(t)$ but are all linear with respect to $x$. The state vector $x \in \mathbb{R}^n$, and, to simplify the discussion, we assume the control action $v \in \mathbb{R}^n$ also; the elements of the matrices $A(t)$, $B(t)$, $\mathcal{D}(t)$, $\mathcal{L}(t)$, and $G(t)$ are assumed to be continuous and the elements of $C$ to be constants; the matrices $\mathcal{D}(t)$, $G(t)$, and $C$ are symmetric; $v[t] = v(t, x(t, t_0, x_0, V))$, where $x(\cdot, t_0, x_0, V)$ is a solution of the system (4.14) with $v = Q(t) x$, $x(t_0) = x_0$ ($V \div Q(t) x$, i.e., $V \in \mathcal{V}_L$), $\theta = \text{const} > 0$; such solutions exist and are extendible to $[t_0, \theta]$. In the discussion that follows, the condition that the quadratic form $v' \mathcal{D}(t) v$ is positive-definite will be denoted $\mathcal{D}(t) > 0$. If $\mathcal{D}(t) > 0$, there exists $\delta = \text{const} > 0$ such that $v' \mathcal{D}(t) v \ge \delta v' v = \delta \|v\|^2$.

Lemma 4.1. If $\|x_0\| \ne 0$ and $V \div Q(t) x$ is some strategy from $\mathcal{V}_L$, then, at every $t \in [t_0, \theta]$,
\[ \|x(t, t_0, x_0, V)\| \ne 0. \]

Proof. We prove the lemma by contradiction. Assume that a time $t^* \in (t_0, \theta]$ exists such that $\|x(t^*, t_0, x_0, V)\| = 0$. Then two solutions of the system
\[ \dot{x} = [A(t) + B(t) Q(t)] x \]
pass through the point $x = 0$ at time $t^*$: one trivial, $x(t) \equiv 0$, and the other $x(\cdot, t_0, x_0, V)$, which contradicts the uniqueness of the solution [91, p. 221].

Proposition 4.2. If $\mathcal{D}(t) > 0$, then whatever the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$, and the strategy $V \in \mathcal{V}_L$, a strategy $V^* \in \mathcal{V}_L$ exists such that
\[ J(V^*) > J(V). \tag{4.17} \]

Proof. Let $V \div Q(t) x$ be some strategy from the set $\mathcal{V}_L$ and $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$, be some initial position. We introduce the function
\[ W(t, x, v, \varphi) = \frac{\partial \varphi}{\partial t} + \left[ \frac{\partial \varphi}{\partial x} \right]' [A(t) x + B(t) v] + v' \mathcal{D}(t) v + v' \mathcal{L}(t) x + x' G(t) x. \tag{4.18} \]
Let us find a function of the type $\varphi(t, x) = x' \Theta(t) x$ such that
\[ W(t, x, Q(t) x, \varphi(t, x)) = 0 \quad \text{at every } (t, x) \in [0, \theta] \times \mathbb{R}^n, \tag{4.19} \]
\[ \varphi(\theta, x) = x' C x \quad \text{with every } x \in \mathbb{R}^n. \tag{4.20} \]
The identity (4.19) is valid if the coefficients of the quadratic form $\varphi(t, x) = x' \Theta(t) x$ satisfy the ordinary linear nonhomogeneous matrix differential equation
\[ \dot{\Theta} + \Theta [A(t) + B(t) Q(t)] + [A'(t) + Q'(t) B'(t)] \Theta + Q'(t) \mathcal{D}(t) Q(t) + Q'(t) \mathcal{L}(t) + G(t) = 0_{n \times n}, \tag{4.21} \]
where $0_{n \times n}$ is a zero $(n \times n)$-dimensional matrix. The identity (4.20) is satisfied if
\[ \Theta(\theta) = C. \tag{4.22} \]
Because the matrices $A(t)$, $B(t)$, $Q(t)$, $\mathcal{D}(t)$, $\mathcal{L}(t)$ and $G(t)$ are all continuous, the system (4.21) with boundary condition (4.22) has [91, p. 221] a unique continuously differentiable solution $\Theta_*(t)$ extendable to the interval $[0, \theta]$ (as a solution of a linear nonhomogeneous system of differential equations with continuous coefficients). Then, with $\varphi_*(t, x) = x' \Theta_*(t) x$, the identities (4.19) and (4.20) are satisfied. Now let $x(t, t_0, x_0, V \div Q(t) x) = x(t)$, $t_0 \le t \le \theta$, be a solution of (4.14) with $v = Q(t) x$ and $x(t_0) = x_0$. Setting $x = x(t)$ in (4.19), where $\varphi = \varphi_*(t, x) = x' \Theta_*(t) x$, we obtain, using (4.18),
\[ W(t, x(t), v(t, x(t)), \varphi_*(t, x(t))) = 0 \tag{4.23} \]
at every $t \in [t_0, \theta]$; here $v(t, x(t)) = Q(t) x(t)$. Integrating both sides of (4.23) over the interval from $t_0$ to $\theta$ and making use of (4.20), (4.14), and (4.13), we have
\[ 0 = \varphi_*(\theta, x(\theta)) - \varphi_*(t_0, x_0) + \int_{t_0}^{\theta} \{ v'[t] \mathcal{D}(t) v[t] + v'[t] \mathcal{L}(t) x(t) + x'(t) G(t) x(t) \}\, dt = J(V) - \varphi_*(t_0, x_0), \]
where $v[t] = Q(t) x(t)$. Hence
\[ J(V) = \varphi_*(t_0, x_0). \tag{4.24} \]
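Equality (4.24) can be checked numerically in the simplest scalar case. The sketch below is an illustration under stated assumptions, not part of the book's argument: all matrices are replaced by hypothetical constant scalars `a, b, c, d, l, g`, the strategy is $v = qx$ with a constant gain `q`, and the functional is taken in the quadratic form used in (4.18). The scalar version of (4.21) is integrated backward from $\Theta(\theta) = c$ with a classical Runge-Kutta scheme, and $\Theta_*(t_0) x_0^2$ is compared with the cost computed from the explicit trajectory $x(t) = x_0 e^{(a + bq)(t - t_0)}$.

```python
import math


def closed_form_cost(a, b, c, d, l, g, q, t0, theta, x0):
    """J(V) for x' = (a + b q) x, v = q x, with scalar constant coefficients."""
    rate = a + b * q                       # closed-loop growth rate
    weight = d * q * q + l * q + g         # integrand equals weight * x(t)^2
    span = theta - t0
    growth = math.exp(2.0 * rate * span)   # x(theta)^2 / x0^2
    integral = span if abs(rate) < 1e-12 else (growth - 1.0) / (2.0 * rate)
    return x0 * x0 * (c * growth + weight * integral)


def theta_star_at_t0(a, b, c, d, l, g, q, t0, theta, steps=2000):
    """Integrate the scalar analogue of (4.21) backward from Theta(theta) = c."""
    rate = a + b * q
    weight = d * q * q + l * q + g

    def rhs(value):
        # Scalar (4.21): dTheta/dt = -(2 * rate * Theta + weight).
        return -(2.0 * rate * value + weight)

    step = (t0 - theta) / steps            # negative step: backward in time
    value = c
    for _ in range(steps):
        k1 = rhs(value)
        k2 = rhs(value + 0.5 * step * k1)
        k3 = rhs(value + 0.5 * step * k2)
        k4 = rhs(value + step * k3)
        value += step * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return value


if __name__ == "__main__":
    params = dict(a=0.3, b=1.0, c=2.0, d=1.0, l=0.5, g=-1.0, q=-0.7, t0=0.0, theta=1.0)
    x0 = 1.5
    print(closed_form_cost(x0=x0, **params))
    print(theta_star_at_t0(**params) * x0 * x0)   # should agree closely, as in (4.24)
```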
For the strategy $V^* \div \beta x$, where the constant $\beta > 0$ will be defined below, substituting $v = \beta x$ in $W(t, x, v, \varphi_*(t, x))$ gives us
\[
\begin{aligned}
W(t, x, v = \beta x, \varphi_*(t, x)) &= x' \{ \dot{\Theta}_*(t) + \Theta_*(t) [A(t) + \beta B(t)] + [A'(t) + \beta B'(t)] \Theta_*(t) + \beta^2 \mathcal{D}(t) + \beta \mathcal{L}(t) + G(t) \} x \\
&= x' \{ \beta^2 \mathcal{D}(t) + \beta [\Theta_*(t) B(t) + B'(t) \Theta_*(t) + \mathcal{L}(t)] + K(t) \} x.
\end{aligned}
\tag{4.25}
\]
Here $K(t)$ is the matrix of those coefficients of the quadratic form (with respect to $x$) $W(t, x, \beta x, \varphi_*(t, x))$ that do not contain the constant $\beta$. Because $\mathcal{D}(t) > 0$, there exists a constant $\delta > 0$ such that $v' \mathcal{D}(t) v \ge \delta v' v$ for every $v \in \mathbb{R}^n$. By this inequality, (4.25) becomes
\[ W(t, x, v = \beta x, \varphi_*(t, x)) \ge x' \{ \delta \beta^2 E_n + \beta [\Theta_*(t) B(t) + B'(t) \Theta_*(t) + \mathcal{L}(t)] + K(t) \} x. \tag{4.26} \]
Note that $E_n$ is the $(n \times n)$-dimensional identity matrix. The matrix in (4.26) has the form
\[
\begin{pmatrix}
\beta^2 \delta + \beta w_{11}(t) + q_{11}(t) & \beta w_{12}(t) + q_{12}(t) & \cdots & \beta w_{1n}(t) + q_{1n}(t) \\
\beta w_{21}(t) + q_{21}(t) & \beta^2 \delta + \beta w_{22}(t) + q_{22}(t) & \cdots & \beta w_{2n}(t) + q_{2n}(t) \\
\vdots & \vdots & \ddots & \vdots \\
\beta w_{n1}(t) + q_{n1}(t) & \beta w_{n2}(t) + q_{n2}(t) & \cdots & \beta^2 \delta + \beta w_{nn}(t) + q_{nn}(t)
\end{pmatrix}
\tag{4.27}
\]
and the coefficients $w_{kj}(t)$, $q_{kj}(t)$, $k, j = 1, \dots, n$, are continuous for $t \in [0, \theta]$ and, for this reason, uniformly bounded. Then we may choose $\beta_*[V] > 0$ so large that all the principal minors of the matrix (4.27) with $\beta \ge \beta_*[V]$ are positive at every $t \in [0, \theta]$. By Sylvester's criterion (positive-definiteness of quadratic forms),
\[ W(t, x, v = \beta x, \varphi_*(t, x)) > 0 \tag{4.28} \]
at every $t \in [0, \theta]$, $x \in \mathbb{R}^n$, $\|x\| \ne 0$, $\beta \ge \beta_*[V]$. Let us denote by $x^*(t)$, $t_0 \le t \le \theta$, the solution of the system (4.14) with $x(t_0) = x_0$, $\|x_0\| \ne 0$, and $v = \beta^* x$, $\beta^* = \text{const} \ge \beta_*[V]$. By Lemma 4.1 with $\|x_0\| \ne 0$,
\[ \|x^*(t)\| \ne 0 \tag{4.29} \]
at every $t \in [t_0, \theta]$. Formulas (4.28) and (4.29) lead to the inequality
\[ W(t, x^*(t), v = \beta^* x^*(t), \varphi_*(t, x^*(t))) > 0, \qquad \forall\, t \in [t_0, \theta]. \]
Integrating both sides of this inequality over the interval from $t_0$ to $\theta$, we obtain
\[ 0 < \int_{t_0}^{\theta} W(t, x^*(t), \beta^* x^*(t), \varphi_*(t, x^*(t)))\, dt = J(V^*) - \varphi_*(t_0, x_0). \tag{4.30} \]
In these transformations, (4.20), (4.14), and (4.13) are taken into account, together with $V^* \div \beta^* x$, $\varphi_*(t, x) = x' \Theta_*(t) x$, $v^*[t] = \beta^* x^*(t)$ and $\Theta_*(\theta) = C$. From (4.30), we obtain
\[ J(V^*) > \varphi_*(t_0, x_0). \]
Combining this inequality with (4.24) leads to (4.17). This proves Proposition 4.2.

The reasoning we have used to prove Proposition 4.2 leads to the following assertion.

Proposition 4.3. If $\mathcal{D}(t) > 0$, then for every choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$, and every strategy $V \in \mathcal{V}_L$ there exists $\beta_*(V) = \text{const} > 0$ such that, for the strategies $V^* \div \beta^* x$ with arbitrary constants $\beta^* \ge \beta_*(V)$,
\[ J(V^*) > J(V). \]

We may establish the following assertion in a similar way.

Proposition 4.4. If $\mathcal{D}(t) < 0$, then for every choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$, and every strategy $V \in \mathcal{V}_L$ there exists $\beta_*(V) = \text{const} > 0$ such that, for the strategies $V^* \div \beta^* x$ with arbitrary constants $\beta^* \ge \beta_*(V)$,
\[ J(V^*) < J(V). \]
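Propositions 4.3 and 4.4 can be observed on a scalar example. The sketch below is a hedged illustration: it assumes constant scalar coefficients (all default values are hypothetical), a constant-gain strategy $v = qx$, and the quadratic functional used in (4.18), for which the cost has a closed form. The helper `find_improving_gain` simply doubles $\beta$ until the cost moves in the direction predicted by the sign of `d`; it finds one improving gain and does not compute the threshold $\beta_*(V)$ itself.

```python
import math


def cost(q, a=0.0, b=1.0, c=0.0, d=1.0, l=0.0, g=0.0, span=1.0, x0=1.0):
    """Closed-form J for x' = (a + b q) x, v = q x, on an interval of length span."""
    rate = a + b * q
    weight = d * q * q + l * q + g
    growth = math.exp(2.0 * rate * span)
    integral = span if abs(rate) < 1e-12 else (growth - 1.0) / (2.0 * rate)
    return x0 * x0 * (c * growth + weight * integral)


def find_improving_gain(q, d, start=1.0, max_doublings=8, **coeffs):
    """Double beta until J(beta) > J(q) when d > 0, or J(beta) < J(q) when d < 0.

    Returns the first such beta, or None if none is found within the budget.
    """
    target = cost(q, d=d, **coeffs)
    direction = 1.0 if d > 0 else -1.0
    beta = start
    for _ in range(max_doublings):
        if direction * (cost(beta, d=d, **coeffs) - target) > 0:
            return beta
        beta *= 2.0
    return None


if __name__ == "__main__":
    print(find_improving_gain(1.0, d=1.0))           # d > 0: the cost can be raised
    print(find_improving_gain(1.0, d=-1.0))          # d < 0: the cost can be lowered
    print(find_improving_gain(1.0, d=1.0, b=-1.0))   # also with a stabilizing input
```

Since the integrand weight grows like $d\beta^2$, the cost is unbounded above for $d > 0$ and unbounded below for $d < 0$ in this scalar setting, which is why no best linear strategy exists.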
Let us apply Propositions 4.3 and 4.4 to design a linear-quadratic differential game without a saddle point.

Example 4.2. Consider the linear-quadratic differential game (4.31), specified as follows. The control system $\Sigma$ is described by the system of linear differential equations
\[ \dot{x} = A(t) x + B_1(t) u + B_2(t) v, \qquad x(t_0) = x_0, \tag{4.32} \]
where the state vector $x \in \mathbb{R}^n$ and the control actions of the players $u \in \mathbb{R}^n$ and $v \in \mathbb{R}^n$. The elements of the $(n \times n)$-dimensional matrices $A(t)$, $B_1(t)$, and $B_2(t)$ are continuous at $t \in [0, \theta]$, the time $\theta > 0$ when the game ends is fixed, and $t \in [t_0, \theta]$. The set of strategies of the first (minimizing) player is $\mathcal{U}_L = \{ U \div M(t) x \mid$ the elements of the $(n \times n)$-dimensional matrix $M(t)$ are continuous at $t \in [0, \theta] \}$. For the second, maximizing, player the set of strategies is $\mathcal{V}_L = \{ V \div Q(t) x \mid$ the elements of the $(n \times n)$-dimensional matrix $Q(t)$ are continuous at $t \in [0, \theta] \}$. Consequently, the strategies of the players are confined to the class of functions linear in $x$ and differ only in the associated matrices $M(t)$ and $Q(t)$. The quasimotions $x(t, t_0, x_0, U, V)$, $t_0 \le t \le \theta$, in the game (4.31) are identified with the solutions of (4.32) for the strategies $U \div M(t) x$ and $V \div Q(t) x$ chosen by the players. Assume that the payoff in (4.31) is determined by the functional
\[ J(U, V) = x'(\theta) C x(\theta) + \int_{t_0}^{\theta} \{ u'[t] \mathcal{D}_1(t) u[t] + v'[t] \mathcal{D}_2(t) v[t] + x'(t) (\mathcal{L}_1(t) u[t] + \mathcal{L}_2(t) v[t]) + x'(t) G(t) x(t) \}\, dt. \tag{4.33} \]
Here $x(t) = x(t, t_0, x_0, U, V)$ is the solution of the system (4.32); the strategies chosen by the players are $U \div M(t) x$ and $V \div Q(t) x$; $u[t] = M(t) x(t, t_0, x_0, U, V)$ and $v[t] = Q(t) x(t, t_0, x_0, U, V)$; the matrices $\mathcal{D}_i(t)$, $\mathcal{L}_i(t)$, and $G(t)$ are continuous at $t \in [0, \theta]$ and $C$ is constant; moreover, without loss of generality we may assume that $C$ and $\mathcal{D}_i(t)$ are symmetric. The saddle point $(U^0, V^0) \in \mathcal{U}_L \times \mathcal{V}_L$ is a pair of strategies such that
\[ J(U^0, V) \le J(U^0, V^0) \le J(U, V^0) \]
for any $V \in \mathcal{V}_L$ and $U \in \mathcal{U}_L$. Then there is no saddle point in the game (4.31) if, for every pair of strategies (situation) $(U, V)$, there exists either a strategy $V^* \in \mathcal{V}_L$ such that
\[ J(U, V^*) > J(U, V) \tag{4.34} \]
or a strategy $U^* \in \mathcal{U}_L$ such that
\[ J(U^*, V) < J(U, V), \tag{4.35} \]
or both inequalities are satisfied simultaneously.
Proposition 4.5. If $\mathcal{D}_1(t) < 0$ and (or) $\mathcal{D}_2(t) > 0$, then, for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$, there is no saddle point in the differential game (4.31).

Proof. Let $U \div M(t) x$ and $V \div Q(t) x$ be an arbitrary situation from $\mathcal{U}_L \times \mathcal{V}_L$ and let, for example, $\mathcal{D}_2(t) > 0$. Consider the optimal control problem
\[ J(U, V) \to \max_{V \in \mathcal{V}_L} \]
with the constraints $V \in \mathcal{V}_L$ and
\[ \dot{x} = [A(t) + B_1(t) M(t)] x + B_2(t) v, \qquad x(t_0) = x_0. \]
Proposition 4.2 (with $\mathcal{D}_2(t) > 0$ and $\|x_0\| \ne 0$) shows that a strategy $V^* \in \mathcal{V}_L$ exists such that
\[ J(U, V^*) > J(U, V); \]
in other words, inequality (4.34) is valid. Since the situation $(U, V)$ is arbitrarily chosen, the absence of saddle points in the differential game (4.31) is proved. We may prove analogously that there are no saddle points if $\mathcal{D}_1(t) < 0$, though Proposition 4.4 and inequality (4.35) must now be used.
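The mechanism behind Proposition 4.5 can be made concrete in a scalar game. The following sketch rests on assumptions that are not in the book: all coefficients are hypothetical constants collected in a `ScalarGame` record, both players use constant gains ($u = mx$, $v = qx$), and the payoff (4.33) is evaluated in closed form. With `d2 > 0` the maximizing player can improve on every sampled pair $(m, q)$, so none of them can be a saddle point.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalarGame:
    """Hypothetical constant scalar data for (4.32)-(4.33)."""
    a: float = 0.0
    b1: float = 1.0
    b2: float = 1.0
    c: float = 0.0
    d1: float = 1.0
    d2: float = 1.0   # d2 > 0: the case treated in the proof of Proposition 4.5
    l1: float = 0.0
    l2: float = 0.0
    g: float = 0.0
    span: float = 1.0
    x0: float = 1.0


def payoff(m: float, q: float, game: ScalarGame) -> float:
    """J(U, V) for u = m x, v = q x, with x' = (a + b1 m + b2 q) x."""
    rate = game.a + game.b1 * m + game.b2 * q
    weight = game.d1 * m * m + game.d2 * q * q + game.l1 * m + game.l2 * q + game.g
    growth = math.exp(2.0 * rate * game.span)
    integral = game.span if abs(rate) < 1e-12 else (growth - 1.0) / (2.0 * rate)
    return game.x0 ** 2 * (game.c * growth + weight * integral)


def maximizer_improvement(m: float, q: float, game: ScalarGame, max_doublings: int = 5):
    """Search gains beta = 2, 4, ... times max(1, |q|) for one with J(m, beta) > J(m, q)."""
    base = payoff(m, q, game)
    beta = max(1.0, abs(q))
    for _ in range(max_doublings):
        beta *= 2.0
        if payoff(m, beta, game) > base:
            return beta
    return None


if __name__ == "__main__":
    game = ScalarGame()
    for m in (-2.0, 0.0, 2.0):
        for q in (-2.0, 0.0, 2.0):
            print(m, q, maximizer_improvement(m, q, game))
```

The search is deliberately short (`max_doublings=5`) to keep the exponentials within floating-point range; it is a demonstration on a grid, not a proof.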
Chapter 2

Slater Optimality

In this chapter we introduce a concept of solution for a multicriterial dynamic positional system that is similar to the Slater optimum (weak efficiency, weak Pareto optimum) but takes into account the multivalence of the goal functionals. Sufficient conditions will be established that reduce the solution of a multicriterial problem to the problem of finding an optimal positional control for a specially designed goal functional. The structure of this solution will be determined.

1. Slater-Maximal Strategy

1.1. Formalization of Multicriterial Problem

By a multicriterial (vector-valued optimization) problem we will understand the system
\[ \langle \Sigma,\ \mathcal{V},\ \mathbb{F}(x[\theta]) \rangle. \tag{1.1} \]
The state vector $x[t]$ is assumed to vary with the system $\Sigma$ given by the equation
\[ \dot{x} = f(t, x, v), \tag{1.2} \]
where $x \in \mathbb{R}^n$; the initial value $x_0$ of the state vector and the times when the process starts, $t_0 \ge 0$, and ends, $\theta > t_0$, are fixed; and the control action $v \in \mathbb{R}^q$.

As in Chapter 1, the strategies $V$ are associated with the functions $v(t, x) \in Q \subset \mathbb{R}^q$ (for every possible position $(t, x) \in [t_0, \theta) \times \mathbb{R}^n$). The set of strategies $V \div v(t, x)$ will be denoted by $\mathcal{V}$. Suppose that a condition analogous to Condition 1.1 of Chapter 1 is satisfied.

Condition 1.1. The function $f(t, x, v)$ is continuous and "locally" Lipschitz with respect to $x$, $\|f(t, x, v)\| \le \gamma (1 + \|x\|)$ with $\gamma = \text{const} > 0$, and the set $Q$ is closed and bounded.
The quasimotions x[., t O , x o ,Vl = { x [ t , t O , x 0 , V ] , to d t d O} of the system (1.2) generated by the strategy VE f from position (to,X ~ ) E[0, 0) x R” are defined (see section 2 of Chapter 1) as limits (in the space M , [ t o , 01) of stepwise quasimotions designed in an appropriate way. In order to apply the results of Sections 2.2 and 2.3 of Chapter 1, we assume u = 0,. The performance of E is estimated by an N-vector-valued goal functional J = (F,(XCOl), . . ., F N ( X C O 1 ) ) =
(1.3)
The components F i ( x ) are defined for every x E X[O, to, x o , V + Q ] where the integer N 2 2; denote N = { 1,. . . ,N}. The coordinates of the vector [F(x[O]) will be referred to as the goal functional (or criterion) and [F(x[B])itself, as a vector of goal functionals (vector-valued criterion). In meaningful levels it is required to choose a strategy V E which, ~ starting from a fixed initial position (to,xo), causes all the components of the vector F(x[O]) to assume the largest possible values simultaneously. Condition 1.2.
The functions F i ( x ) , i E N, are assumed to be continuous with
respect t o x .
In this multicriterial problem (1.1), every fixed strategy $V$ is, generally speaking, associated with a set of goal functional values
$$F_i(X[\theta, t_0, x_0, V]) = \{F_i(x) \mid x \in X[\theta, t_0, x_0, V]\}.$$
This fact must not be neglected in determining the solution (strategy $V \in \mathcal{V}$) of (1.1), and should be thoroughly analyzed; we assume for the sake of simplicity that $N = 2$. Now let two arbitrary strategies $V^{(1)}$ and $V^{(2)}$ be chosen. In the space of values of the vector-valued goal functional $\mathbb{F}(x[\theta]) = (F_1(x[\theta]), F_2(x[\theta]))$ every strategy $V^{(j)}$ ($j = 1, 2$) is associated with some “cloud,” i.e., the set of values $\mathbb{F}(X[\theta, t_0, x_0, V^{(j)}])$. These sets are bounded and closed in $R^2$. How might we compare such sets, and which set is “better”? The difficulties of solving a multicriterial problem (in the case of univalent goal functionals) are aggravated by the features associated with their multivalence. In this and succeeding chapters four possible interpretations of a solution of (1.1) are proposed and the structure of these solutions is presented. The concept of a strategy used here (in the case of the system (1.2)) leads to the same values of the vector-valued goal functional $\mathbb{F}(x[\theta])$ that are the solution of the static multicriterial problem
$$\langle X[\theta, t_0, x_0, V \div Q],\ \mathbb{F}(x) \rangle. \tag{1.4}$$
The problem (1.4) must be treated independently of the system $\Sigma$. There is a compact set $X$ (which is assumed to fall within the reachability domain $X[\theta, t_0, x_0, V \div Q]$). $N$ goal functions $F_i(x)$, $i \in \mathbf{N}$, are defined over $X$; by Condition 1.2, they are assumed to be univalent and continuous. It is required (informally) to choose a point $x \in X$ that provides the greatest possible values of all the goal functions $F_i(x)$. In addition to (1.1), problem (1.4) is analyzed and the relation between the solutions of (1.1) and (1.4) is established. This relation makes it possible to determine the structure of the solution of (1.1) for the dynamic system $\Sigma$.
1.2. Definition of Slater-Maximal Strategy

Suppose Conditions 1.1 and 1.2 are satisfied.
Definition 1.1. The strategy $V^S \in \mathcal{V}$ will be Slater-maximal in the multicriterial problem (1.1) with initial position $(t_0, x_0) \in [0, \theta) \times R^n$ if there is no $V \in \mathcal{V}$ such that the following system of strict inequalities is true:
$$F_i(x[\theta, t_0, x_0, V]) > F_i(x[\theta, t_0, x_0, V^S]), \quad i \in \mathbf{N}, \tag{1.5}$$
for any quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^S]$.
Remark 1.1. The definition of a Slater-maximal strategy is equivalent to the following. The strategy $V^S \in \mathcal{V}$ is Slater-maximal in (1.1) with initial position $(t_0, x_0) \in [0, \theta) \times R^n$ if for every strategy $V \in \mathcal{V}$ and every quasimotion $x^*[\cdot, t_0, x_0, V] \in \mathcal{X}[t_0, x_0, V]$, a subscript of its “own” $i(V) = i_0 \in \mathbf{N}$ and a quasimotion $x^*[\cdot, t_0, x_0, V^S] \in \mathcal{X}[t_0, x_0, V^S]$ may be found such that
$$F_{i_0}(x^*[\theta, t_0, x_0, V]) \le F_{i_0}(x^*[\theta, t_0, x_0, V^S]).$$
The set of Slater-maximal strategies $V^S$ will be denoted $\mathcal{V}^S$.
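The definition can be tried out on a finite toy model. The sketch below is only an illustration under stated assumptions: each strategy is replaced by a finite “cloud” of terminal criterion vectors (the names `clouds`, `strictly_dominates`, and `is_slater_maximal` are hypothetical), and a strategy is disqualified as soon as some terminal value of some strategy, possibly itself, is strictly larger in every criterion than one of its own terminal values.

```python
from typing import Dict, List, Sequence

Point = Sequence[float]

def strictly_dominates(a: Point, b: Point) -> bool:
    """True when a_i > b_i for every criterion i (system (1.5) holds)."""
    return all(x > y for x, y in zip(a, b))

def is_slater_maximal(name: str, clouds: Dict[str, List[Point]]) -> bool:
    """Assumed finite reading of Definition 1.1: no terminal value of any
    strategy strictly dominates a terminal value of the candidate."""
    own = clouds[name]
    return not any(
        strictly_dominates(a, b)
        for cloud in clouds.values()
        for a in cloud
        for b in own
    )

# Hypothetical clouds in the criterial space (F1, F2).
clouds = {
    "V1": [(1.0, 3.0), (2.0, 3.0)],  # a horizontal segment, like AB
    "V2": [(3.0, 1.0)],
    "V3": [(0.5, 0.5)],              # strictly dominated by (1.0, 3.0)
}
slater_set = [v for v in clouds if is_slater_maximal(v, clouds)]
```

In this toy data `V1` stays Slater-maximal although its cloud contains two points, because neither point is strictly better than the other in both criteria.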
Remark 1.2. Definition 1.1 implies also that the strategy $V^* \in \mathcal{V}$ is not Slater-maximal in (1.1) with initial position $(t_0, x_0)$ if there exist a strategy $\tilde V \in \mathcal{V}$ and at least two quasimotions $x^*[\cdot, t_0, x_0, V^*]$ and $x^*[\cdot, t_0, x_0, \tilde V]$ of the bunches $\mathcal{X}[t_0, x_0, V^*]$ and $\mathcal{X}[t_0, x_0, \tilde V]$, respectively, such that the system of inequalities
$$F_i(x^*[\theta, t_0, x_0, \tilde V]) > F_i(x^*[\theta, t_0, x_0, V^*]), \quad i \in \mathbf{N},$$
is simultaneous. If $N = 2$, the following geometrical interpretation of the Slater-maximal strategy $V^S$ may prove useful (Fig. 2.1.1). Let $V^S$ be a Slater-maximal strategy for (1.1) with $N = 2$. Add to every point $\mathbb{F} \in \mathbb{F}(X[\theta, t_0, x_0, V^S])$ the first coordinate quadrant and find the Slater-minimal points of the set $G$ shaded in Fig. 2.1.1. These points are represented as a heavy line. Then, by Definition 1.1, if $V^S$ is Slater-maximal and $V$ is any other strategy, the points of the set $\mathbb{F}(X[\theta, t_0, x_0, V])$ cannot penetrate strictly into the shaded set $G$. The points of $\mathbb{F}(X[\theta, t_0, x_0, V])$ can reach only its boundary (denoted by the heavy line). This correspondence exists between any two strategies of $\mathcal{V}$ one of which is Slater-maximal.
Fig. 2.1.1.
By analogy to Definition 1.1, the strategy $V_S \in \mathcal{V}$ will be said to be Slater-minimal in problem (1.1) with initial position $(t_0, x_0) \in [0, \theta) \times R^n$ if, for every $V \in \mathcal{V}$, the following system of inequalities is not simultaneous:
$$F_i(x[\theta, t_0, x_0, V]) < F_i(x[\theta, t_0, x_0, V_S]), \quad i \in \mathbf{N}.$$
1.3. Properties of Slater-Maximal Strategy

Analogs of Slater-maximal strategies for the “static” case have been described in [70, pp. 49, 75, 158].
Property 1.1. With $N = 1$ (in the case of one goal functional $\mathbb{F} = F_1(x[\theta])$ of problem (1.1)), the Slater-maximal strategy is equivalent to the maximal (optimal) strategy in the optimal control problem
$$\max_{V \in \mathcal{V}} \max_{x[\cdot]} F_1(x[\theta, t_0, x_0, V]) = \max_{x[\cdot]} F_1(x[\theta, t_0, x_0, V^S]) \tag{1.6}$$
with constraints $V \in \mathcal{V}$ and
$$\dot x = f(t, x, v), \quad x[t_0] = x_0.$$
Here
$$\max_{x[\cdot]} F_1(x[\theta, t_0, x_0, V^S]) = \min_{x[\cdot]} F_1(x[\theta, t_0, x_0, V^S]). \tag{1.7}$$
In other words, in this optimal control problem the value of the criterion $F_1(x[\theta])$ over the optimal (positional) strategy $V^S$ is unique.
Indeed, by Remark 1.1, for the case $\mathbf{N} = \{1\}$, we have
$$F_1(x[\theta, t_0, x_0, V]) \le F_1(x[\theta, t_0, x_0, V^S])$$
for all strategies $V \in \mathcal{V}$ and quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^S]$. Hence
$$\max_{x[\cdot]} F_1(x[\theta, t_0, x_0, V]) \le F_1(x[\theta, t_0, x_0, V^S])$$
for any strategies $V \in \mathcal{V}$ and quasimotions $x[\cdot, t_0, x_0, V^S] \in \mathcal{X}[t_0, x_0, V^S]$. Consequently
$$\max_{V \in \mathcal{V}} \max_{x[\cdot]} F_1(x[\theta, t_0, x_0, V]) \le F_1(x[\theta, t_0, x_0, V^S]).$$
But the strategy $V^S \in \mathcal{V}$, and the latter inequality thus becomes an equality, which implies that (1.6) and (1.7) are valid. However, if $N = 2$, the value of the criteria $\mathbb{F}(x[\theta, t_0, x_0, V^S])$ over the Slater-maximal strategy $V^S$ need not be unique.
assumes the form shown in Fig. 2.1.2. The set $\mathcal{V}^S$ of Slater-maximal strategies is composed of several strategies:
$$V^S \div (v_1^S, v_2^S) = (1, [\alpha, \beta]), \qquad V^S \div (v_1^S, v_2^S) = ([\alpha, \beta], 1),$$
where the constants $\alpha$ and $\beta$ are such that $-1 \le \alpha \le \beta \le 1$.
The sets of values of the goal functional vector $\mathbb{F}(x[\theta])$ associated with the Slater-maximal strategies $V^S$ “fill” the segments AB and BC (Fig. 2.1.2). Here, for example, the strategy $V^* \div (1, [-1, 1])$ is Slater-maximal though it is represented, not by a single point in the criterial space $\mathbb{F} = (F_1, F_2)$, but by the entire segment AB, namely $\mathbb{F}(X[\theta, t_0, x_0, V^*]) = AB$. Consequently, the Slater-maximal strategy $V^*$ generates a bunch of quasimotions $x[\cdot, t_0, x_0, V^*]$ such that the values of the goal functional vector $\mathbb{F}(x[\theta])$ over this bunch, if combined, form a set (the segment AB).
Fig. 2.1.2.
Property 1.1 shows that the Slater-maximal strategy is a fairly complete notion. It includes as a particular case the notion of a maximal positional strategy in an optimal control problem.
Property 1.2. If there exists a subset $\mathbf{K} \subset \mathbf{N}$ such that with every $V \in \mathcal{V}$ the inequalities
$$F_i(x[\theta, t_0, x_0, V]) > F_i(x[\theta, t_0, x_0, V^S]), \quad i \in \mathbf{K}, \tag{1.8}$$
are nonsimultaneous for any quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^S]$, the strategy $V^S$ is Slater-maximal for (1.1) with initial position $(t_0, x_0)$.
Property 1.2 follows from the fact that if (for any $V \in \mathcal{V}$) inequalities (1.8) are not simultaneous, then neither is the “complete” system of inequalities (1.5). Note that the converse is not true. Property 1.2 shows that the Slater-maximality of $V^S$ does not deteriorate if new performance criteria whose maximization is desirable are added to the system $\Sigma$. Property 1.2 leads directly to the following assertion.
Property 1.3. Each of the $N$ maximal strategies $V^{(i)} \in \mathcal{V}$,
$$\max_{V \in \mathcal{V}} \max_{x[\cdot]} F_i(x[\theta, t_0, x_0, V]) = F_i(x[\theta, t_0, x_0, V^{(i)}]), \quad i \in \mathbf{N}, \tag{1.9}$$
is Slater-maximal in (1.1) with initial position $(t_0, x_0)$.
Indeed, equality (1.9) is equivalent to the requirement that, for every $V \in \mathcal{V}$,
$$F_i(x[\theta, t_0, x_0, V]) \le F_i(x[\theta, t_0, x_0, V^{(i)}])$$
for any quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^{(i)}]$. This implies that the strict inequality
$$F_i(x[\theta, t_0, x_0, V]) > F_i(x[\theta, t_0, x_0, V^{(i)}])$$
cannot be satisfied for any $V \in \mathcal{V}$ and quasimotions $x[\cdot, t_0, x_0, V]$, $x[\cdot, t_0, x_0, V^{(i)}]$.
Now Property 1.2 shows that Property 1.3 is valid.
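Property 1.3 can be illustrated on a finite toy model. The sketch below is a hedged illustration, not the construction of the text: strategies are replaced by hypothetical finite clouds of terminal criterion vectors, a strategy counts as maximal for criterion i when every point of its cloud attains the global maximum of that criterion (a finite analog of (1.9)), and the Slater test is the assumed finite reading of Definition 1.1.

```python
from typing import Dict, List, Sequence

Point = Sequence[float]

def strictly_dominates(a: Point, b: Point) -> bool:
    """True when a_i > b_i for every criterion i."""
    return all(x > y for x, y in zip(a, b))

def is_slater_maximal(name: str, clouds: Dict[str, List[Point]]) -> bool:
    """No terminal value of any strategy strictly dominates one of ours."""
    return not any(
        strictly_dominates(a, b)
        for cloud in clouds.values()
        for a in cloud
        for b in clouds[name]
    )

def maximal_for_criterion(i: int, clouds: Dict[str, List[Point]]) -> List[str]:
    """Strategies whose whole cloud attains the global maximum of F_i."""
    best = max(p[i] for cloud in clouds.values() for p in cloud)
    return [v for v, cloud in clouds.items() if all(p[i] == best for p in cloud)]

# Hypothetical data: "A" maximizes F1, "B" maximizes F2, "C" does neither.
clouds = {
    "A": [(4.0, 0.0), (4.0, 1.0)],
    "B": [(1.0, 5.0)],
    "C": [(2.0, 2.0), (4.0, 2.0)],
}
single_criterion_winners = {
    i: maximal_for_criterion(i, clouds) for i in range(2)
}
all_winners_slater = all(
    is_slater_maximal(v, clouds)
    for winners in single_criterion_winners.values()
    for v in winners
)
```

Strategy `C` reaches the value 4 of the first criterion only along one of its terminal points, so it is not counted as maximal for that criterion in this sketch.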
Corollary 1.1. If Conditions 1.1 and 1.2 in (1.1) are satisfied, there exists a Slater-maximal strategy for any choice of initial position $(t_0, x_0) \in [0, \theta) \times R^n$.
To prove the existence of a Slater-maximal strategy $V^S$ in (1.1), it is sufficient to establish (by Property 1.3) the existence of a maximal strategy $V^{(i)} \in \mathcal{V}$ in the optimal control problem (1.6), where $F_1(x[\theta])$ is replaced by $F_i(x[\theta])$. For this purpose, let us consider a zero-sum positional differential game
$$\Gamma^* = \langle \{1, 2\},\ \Sigma^*,\ \{\mathcal{U}, \mathcal{V}\},\ F_i(x[\theta]) \rangle.$$
Here the control system $\Sigma^*$ is described by Equation (4.5) of Chapter 1, $\mathcal{V}$ is the set of strategies of the maximizing second player, the first player with opposing interests utilizes strategies from the set $\mathcal{U} = \{U \div u(t, x) \mid u(t, x) \in [0, 1]\}$, and the game payoff is defined by the functional $F_i(x[\theta])$. Using relations (4.6) of Chapter 1, the condition of the saddle point in a small game may be shown to be satisfied. Consequently (Theorem 3.2, Chapter 1), in the game $\Gamma^*$ there exists a saddle point $(U^0, V^{(i)}) \in \mathcal{U} \times \mathcal{V}$ for every choice of initial position $(t_0, x_0) \in [0, \theta) \times R^n$ and, by Property 3.5 of Chapter 1,
$$F_i(x[\theta, t_0, x_0, U^0]) \le F_i(x[\theta, t_0, x_0, U^0, V^{(i)}]) \le F_i(x[\theta, t_0, x_0, V^{(i)}]) \tag{1.10}$$
for any quasimotions $x[\cdot, t_0, x_0, U^0]$, $x[\cdot, t_0, x_0, U^0, V^{(i)}]$, and $x[\cdot, t_0, x_0, V^{(i)}]$. The specifics of the system (4.5) of Chapter 1 (because of the term $0_n u$, the strategy $U$ does not influence the system (4.5, Chapter 1)) are such that the bunch of quasimotions $x[\cdot, t_0, x_0, U^0]$ consists of the quasimotions $x[\cdot, t_0, x_0, V]$ with distinct $V \in \mathcal{V}$. As a consequence we find from (1.10) that
$$F_i(x[\theta, t_0, x_0, V]) \le F_i(x[\theta, t_0, x_0, V^{(i)}])$$
for any $V \in \mathcal{V}$ and $x[\cdot, t_0, x_0, V]$, $x[\cdot, t_0, x_0, V^{(i)}]$. The latter inequality implies (see proof of Property 1.1) that $V^{(i)}$ is the maximal strategy in the optimal control problem (1.6) where $F_1(x[\theta])$ has been replaced by $F_i(x[\theta])$.
1.4. Stability
Certain properties of Slater-optimal strategies $V^S \in \mathcal{V}^S$ concern various types of stability. Game theory demonstrates that all the optimality principles discussed thus far reflect, directly or indirectly, the idea of the stability of a situation satisfying these principles. This idea has, however, been realized in various ways on the basis of different principles [90, p. 94].
Property 1.4. The set $\mathcal{V}^S$ of Slater-maximal strategies $V^S$ is externally stable in that, for every strategy $V \in \mathcal{V}$ and quasimotion $x^*[t, t_0, x_0, V]$, $t_0 \le t \le \theta$, generated by it, there exists a Slater-maximal strategy $V^S \in \mathcal{V}^S$ such that
$$F_i(x^*[\theta, t_0, x_0, V]) \le F_i(x[\theta, t_0, x_0, V^S]), \quad i \in \mathbf{N},$$
for every quasimotion $x[t, t_0, x_0, V^S]$, $t_0 \le t \le \theta$, of the system (1.2) generated by $V^S$ from position $(t_0, x_0)$.
Proof. Let $V$ be some strategy from the set $\mathcal{V}$ and $x^*[t, t_0, x_0, V]$, $t_0 \le t \le \theta$, a quasimotion generated by it. We introduce the set
$$X^*[t_0, x_0] = \{x \in X[\theta, t_0, x_0, V \div Q] \mid F_i(x) \ge F_i(x^*[\theta, t_0, x_0, V]),\ i \in \mathbf{N}\}.$$
Because the point $x^*[\theta, t_0, x_0, V] \in X^*[t_0, x_0]$, we have $X^*[t_0, x_0] \ne \emptyset$; in addition, it follows from the compactness of the reachability domain $X[\theta, t_0, x_0, V \div Q]$ and the form (nonstrict inequalities) of the bounds that $X^*[t_0, x_0]$ is a nonempty compactum in $R^n$. Note that for any point $x \in X^*[t_0, x_0]$,
$$F_i(x) \ge F_i(x^*[\theta, t_0, x_0, V]), \quad i \in \mathbf{N}. \tag{1.11}$$
Now consider $N$ nonnegative numbers $\beta_1, \ldots, \beta_N$, $\sum \beta_i > 0$, and introduce a continuous function $\sum_{i=1}^{N} \beta_i F_i(x)$. We wish to find a point $x^s \in X^*[t_0, x_0]$ such that
$$\sum_{i=1}^{N} \beta_i F_i(x^s) = \max_{x \in X^*[t_0, x_0]} \sum_{i=1}^{N} \beta_i F_i(x). \tag{1.12}$$
Let us show that the point $x^s$ is a Slater-maximal solution of the multicriterial static problem
$$\langle X^*[t_0, x_0],\ \{F_i(x)\}_{i \in \mathbf{N}} \rangle, \tag{1.13}$$
or, for any $x \in X^*[t_0, x_0]$, the system of inequalities $F_i(x) > F_i(x^s)$, $i \in \mathbf{N}$, is nonsimultaneous. Hereafter, the lower-case letters s, p, d, and a will denote solutions of static problems and the upper-case letters S, P, D, and A, the optimal strategies of dynamic problems.
Indeed, if the point $x^s$ were not a Slater-maximal solution of (1.13), there would exist $\bar x \in X^*[t_0, x_0]$ such that the following system of inequalities is simultaneous:
$$F_i(\bar x) > F_i(x^s), \quad i \in \mathbf{N}. \tag{1.14}$$
Multiplying the $i$th inequality in (1.14) by the above $\beta_i$ and adding, we have the strict inequality
$$\sum_{i=1}^{N} \beta_i F_i(\bar x) > \sum_{i=1}^{N} \beta_i F_i(x^s),$$
which is incompatible with (1.12). Consequently, $x^s$ is a Slater-maximal solution of (1.13). By the definition of $X^*[t_0, x_0]$, for all $x \in X[\theta, t_0, x_0, V \div Q] \setminus X^*[t_0, x_0]$,
$$F_j(x) < F_j(x^*[\theta, t_0, x_0, V])$$
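for at least one $j \in \mathbf{N}$; since $F_j(x^*[\theta, t_0, x_0, V]) \le F_j(x^s)$ by (1.11), such a point $x$ cannot satisfy $F_j(x) > F_j(x^s)$. Therefore, it follows from the nonsimultaneity of the inequalities $F_i(x) > F_i(x^s)$, $i \in \mathbf{N}$, for all $x \in X^*[t_0, x_0]$ that this system is also nonsimultaneous for any $x \in X[\theta, t_0, x_0, V \div Q]$. The latter signifies that the point $x^s$ found above is a Slater-maximal solution of the “full static” multicriterial problem
$$\langle X[\theta, t_0, x_0, V \div Q],\ \{F_i(x)\}_{i \in \mathbf{N}} \rangle.$$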
Corollary 4.1 of Chapter 1 leads to a strategy $V^S \in \mathcal{V}$ such that $x^s = x[\theta, t_0, x_0, V^S]$ for all quasimotions of the system (1.2) generated by the strategy $V^S$ from position $(t_0, x_0)$. This strategy $V^S$ is Slater-maximal in (1.1) by virtue of Proposition 3.1 and
$$F_i(x^s) = F_i(x[\theta, t_0, x_0, V^S]), \quad i \in \mathbf{N}, \tag{1.15}$$
for all quasimotions $x[t, t_0, x_0, V^S]$, $t_0 \le t \le \theta$. By (1.11), where $x = x^s \in X^*[t_0, x_0]$, and (1.15) we have
$$F_i(x[\theta, t_0, x_0, V^S]) = F_i(x^s) \ge F_i(x^*[\theta, t_0, x_0, V]), \quad i \in \mathbf{N},$$
which proves Property 1.4.
The external stability of a set of Slater-maximal strategies makes it possible to choose the optimal solution of (1.1) from this set. If there were no such stability, this set would not suffice for choosing “good” solutions.
Property 1.5. The set $\mathcal{V}^S$ of Slater-maximal strategies is internally stable in the sense that for any two Slater-maximal strategies $V^{(1)}$ and $V^{(2)}$ and quasimotions $x[t, t_0, x_0, V^{(1)}]$ and $x[t, t_0, x_0, V^{(2)}]$, $t_0 \le t \le \theta$, generated by them, the following system of inequalities is nonsimultaneous:
$$F_i(x[\theta, t_0, x_0, V^{(1)}]) > F_i(x[\theta, t_0, x_0, V^{(2)}]), \quad i \in \mathbf{N}.$$
Nonsimultaneity follows from the inclusions $V^{(2)} \in \mathcal{V}^S$ and $V^{(1)} \in \mathcal{V}$ and Definition 1.1 of the Slater-maximal strategy $V^S = V^{(2)}$ in the multicriterial problem (1.1).
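The construction used in the proof of Property 1.4 can be sketched on a finite static model. This is an illustration under assumptions that are not in the text: the reachability domain is replaced by a finite list `points`, the criteria are ordinary Python functions, and the names `slater_point_above` and `strictly_better_exists` are hypothetical. The function restricts the set to the points that are at least as good as a reference point in every criterion (the analog of $X^*[t_0, x_0]$) and maximizes a weighted sum with nonnegative weights, as in (1.12).

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Criterion = Callable[[Point], float]

def slater_point_above(points: List[Point], criteria: Sequence[Criterion],
                       x_star: Point, beta: Sequence[float]) -> Point:
    """Maximize sum(beta_i * F_i) over the points not worse than x_star."""
    ref = [F(x_star) for F in criteria]
    upper = [x for x in points
             if all(F(x) >= r for F, r in zip(criteria, ref))]
    return max(upper, key=lambda x: sum(b * F(x) for b, F in zip(beta, criteria)))

def strictly_better_exists(points: List[Point], criteria: Sequence[Criterion],
                           x: Point) -> bool:
    """True if some point improves every criterion strictly over x."""
    return any(all(F(y) > F(x) for F in criteria) for y in points)

# Hypothetical data: two criteria that read off the coordinates.
criteria = (lambda x: x[0], lambda x: x[1])
points = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 0.0), (1.0, 1.0)]
x_star = (1.0, 1.0)
x_s = slater_point_above(points, criteria, x_star, beta=(1.0, 2.0))
```

In this sketch `x_s` is not worse than `x_star` in any criterion, and no point of the full list is strictly better than `x_s` in both criteria, which mirrors the two conclusions drawn in the proof.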
The internal stability of the set $\mathcal{V}^S$ of Slater-maximal strategies implies that no strategy of that set possesses advantages over any other strategy (in the sense of Definition 1.1).
Property 1.6. Any Slater-maximal strategy of the multicriterial problem (1.1) with initial position $(t_0, x_0) \in [0, \theta) \times R^n$ is dynamically stable, that is, the strategy $V^S$ remains Slater-maximal in problem (1.1) with current initial position $(t, x[t, t_0, x_0, V^S])$ at every $t \in [t_0, \theta]$, where $x[t, t_0, x_0, V^S]$, $t_0 \le t \le \theta$, is any quasimotion of (1.2) generated by $V^S$ from initial position $(t_0, x_0)$.
In other words, the strategy $V^S$, which is Slater-maximal in the initial multicriterial problem (1.1), is also Slater-maximal for “new current” multicriterial problems, such as (1.1), but with a varying initial position $(t, x[t, t_0, x_0, V^S])$ that slides along any quasimotion $x[\cdot, t_0, x_0, V^S]$ of the bunch $\mathcal{X}[t_0, x_0, V^S]$.
Proof. The proof of Property 1.6 will be given by contradiction. Suppose the property does not hold. Then a quasimotion $x^*[\cdot, t_0, x_0, V^S]$ and time $t^* \in (t_0, \theta)$ exist such that $V^S$ is no longer a Slater-maximal strategy in (1.1) with initial position $(t^*, x^*[t^*, t_0, x_0, V^S]) = (t^*, x^*[t^*])$. Then by Definition 1.1, a strategy $\bar V$ and quasimotion $\bar x[t, t^*, x^*[t^*], \bar V]$, $t^* \le t \le \theta$, generated by this strategy from the initial position $(t^*, x^*[t^*])$ may be found for which the following system of inequalities is simultaneous:
$$F_i(\bar x[\theta, t^*, x^*[t^*], \bar V]) > F_i(x[\theta, t^*, x^*[t^*], V^S]), \quad i \in \mathbf{N}, \tag{1.16}$$
for some quasimotion $x[t, t^*, x^*[t^*], V^S]$, $t^* \le t \le \theta$. But, by Theorem 2.1 of Chapter 1,
$$x[\theta, t^*, x^*[t^*], V^S] \in X[\theta, t_0, x_0, V^S], \qquad \bar x[\theta, t^*, x^*[t^*], \bar V] \in X[\theta, t_0, x_0, \hat V],$$
where
$$\hat V = \begin{cases} V^S & \text{with } (t, x) \in [t_0, t^*) \times R^n, \\ \bar V & \text{with } (t, x) \in [t^*, \theta) \times R^n, \end{cases}$$
and so $\hat V \in \mathcal{V}$. As a result, for the quasimotions
$$x^S[t, t_0, x_0, V^S] = \begin{cases} x^*[t, t_0, x_0, V^S] & \text{with } t_0 \le t < t^*, \\ x[t, t^*, x^*[t^*], V^S] & \text{with } t^* \le t \le \theta, \end{cases}$$
$$\hat x[t, t_0, x_0, \hat V] = \begin{cases} x^*[t, t_0, x_0, V^S] & \text{with } t_0 \le t < t^*, \\ \bar x[t, t^*, x^*[t^*], \bar V] & \text{with } t^* \le t \le \theta, \end{cases}$$
by (1.16) the following inequalities are valid:
$$F_i(\hat x[\theta, t_0, x_0, \hat V]) > F_i(x^S[\theta, t_0, x_0, V^S]), \quad i \in \mathbf{N},$$
which contradicts the Slater-maximality of the strategy $V^S$ in problem (1.1) with initial position $(t_0, x_0)$.
To conclude the discussion we require two further remarks.
Remark 1.3. The dynamic stability of Slater-maximal strategies was established by means of the quasimotions of the system (1.2). If we consider only motions (subsection 1.1, Chapter 1), Slater-maximal strategies do not, generally speaking, possess the property of dynamic stability. In fact, as the position $(t, x[t, t_0, x_0, V^S])$ slides along the motion $x[t, t_0, x_0, V^S]$, $t_0 \le t \le \theta$ (cf. subsection 1.1, Chapter 1), at a certain time $t^* \in (t_0, \theta)$ new motions can start (as in Example 2.1 in Chapter 1) that are not the continuation of the motion of the system (1.1) from the initial position $(t_0, x_0)$. The values of all the goal functionals along such motions may be larger than $\mathbb{F}(x[\theta, t_0, x_0, V^S])$. In that case the Slater-maximal strategy $V^S$ for problem (1.1) with initial position $(t_0, x_0)$ ceases to be Slater-maximal for problem (1.1) considered from the initial position $(t^*, x[t^*, t_0, x_0, V^S])$.
Remark 1.4. The properties of external, internal, and dynamic stability also hold for Slater-minimal strategies $V_S \in \mathcal{V}$ of the multicriterial problem (1.1).
(1) The set $\mathcal{V}_S$ of Slater-minimal strategies $V_S$ of problem (1.1) with initial position $(t_0, x_0)$ is externally stable, that is, for every strategy $V \in \mathcal{V}$ and quasimotion $x^*[\cdot, t_0, x_0, V]$, there exists a Slater-minimal strategy $V_S$ of “its own” for every $V$ and $x^*[\cdot, t_0, x_0, V]$ such that
$$F_i(x[\theta, t_0, x_0, V_S]) \le F_i(x^*[\theta, t_0, x_0, V]), \quad i \in \mathbf{N},$$
for every quasimotion $x[\cdot, t_0, x_0, V_S]$ generated by $V_S$ from position $(t_0, x_0)$.
(2) The set $\mathcal{V}_S$ of Slater-minimal strategies $V_S$ of problem (1.1) with initial position $(t_0, x_0)$ is internally stable, that is, for every two strategies $V^{(1)}$ and $V^{(2)}$ of $\mathcal{V}_S$, the following system of inequalities is inconsistent:
$$F_i(x[\theta, t_0, x_0, V^{(1)}]) < F_i(x[\theta, t_0, x_0, V^{(2)}]), \quad i \in \mathbf{N},$$
for all quasimotions $x[\cdot, t_0, x_0, V^{(j)}]$, $j = 1, 2$.
(3) The Slater-minimal strategy $V_S$ of problem (1.1) with initial position $(t_0, x_0)$ is dynamically stable, that is, the strategy $V_S$ remains Slater-minimal in problem (1.1) for the “current” position $(t, x[t, t_0, x_0, V_S])$ for every time $t \in [t_0, \theta]$ and quasimotion $x[\cdot, t_0, x_0, V_S] \in \mathcal{X}[t_0, x_0, V_S]$.
These three propositions follow immediately from Properties 1.4-1.6, respectively, because the Slater-maximal strategy of problem (1.1) with initial position $(t_0, x_0)$ becomes Slater-minimal for the following multicriterial problem:
$$\langle \Sigma,\ \mathcal{V},\ -\mathbb{F}(x[\theta]) \rangle$$
with the same position, the converse also being true. This problem differs from (1.1) only by the sign of the goal functionals.
1.5. Properties of Inheritance and Rejection
The construction and application of mathematical models of real-world problems often leads to an adjustment in the model itself. For multicriterial problems this generally implies changing the set $Q$ of values of the control action $v$ or using state constraints. Thus it is necessary to evaluate the influence of the points “rejected from” or “added to” $Q$, which depends on the properties of inheritance and rejection formulated for the static multicriterial problem in [S,pp. 12-14]. Before proceeding to a description of the properties, let us consider an auxiliary proposition.
Lemma 1.1. If the compacta $Q_1 \subset Q_2 \subset R^q$, then
$$\mathcal{V}_1 \subset \mathcal{V}_2, \tag{1.17}$$
and from (1.17) it follows that
$$X[\theta, t_0, x_0, \mathcal{V}_1] \subset X[\theta, t_0, x_0, \mathcal{V}_2]. \tag{1.18}$$
Here the sets of strategies are
$$\mathcal{V}_j = \{V \div v(t, x) \mid v(t, x) \in Q_j\}, \quad j = 1, 2, \tag{1.19}$$
and
$$X[\theta, t_0, x_0, \mathcal{V}_j] = \{X[\theta, t_0, x_0, V] \mid V \in \mathcal{V}_j\} = X[\theta, t_0, x_0, V \div Q_j], \quad j = 1, 2.$$
The lemma follows immediately from the definitions of strategies and quasimotions (Definition 2.1, Chapter 1).
Property 1.7. Let (a) the strategy $V^S$ be Slater-maximal in the problem
$$\Gamma = \langle \Sigma,\ \mathcal{V},\ \mathbb{F}(x[\theta]) \rangle,$$
(b) $V^S \in \tilde{\mathcal{V}}$, (c) $\tilde{\mathcal{V}} \subset \mathcal{V}$. Then $V^S$ will be Slater-maximal in the problem $\tilde\Gamma = \langle \Sigma,\ \tilde{\mathcal{V}},\ \mathbb{F}(x[\theta]) \rangle$ with the same initial position.
The multicriterial problem $\tilde\Gamma$ differs from $\Gamma$ only by the fact that the set $\tilde Q$ of control action values $v$ of problem $\tilde\Gamma$ is such that $\tilde Q \subset Q$, and Properties 1.1 and 1.2 are valid also for problem $\tilde\Gamma$. This property is referred to as inheritance. Informally it may be formulated as follows. The Slater-maximal strategy $V^S$ chosen over some set $\mathcal{V}$ remains so over every one of its subsets $\tilde{\mathcal{V}} \subset \mathcal{V}$ if $V^S$ is contained in the “truncated” set of strategies $\tilde{\mathcal{V}}$.
Proof. The proof follows immediately from Definition 1.1. Indeed, since $\tilde{\mathcal{V}} \subset \mathcal{V}$, by (1.18),
$$X[\theta, t_0, x_0, V \div \tilde Q] = X[\theta, t_0, x_0, \tilde{\mathcal{V}}] \subset X[\theta, t_0, x_0, V \div Q] = X[\theta, t_0, x_0, \mathcal{V}].$$
The nonsimultaneity, for any $V \in \mathcal{V}$, of the system of inequalities (1.5) leads to the nonsimultaneity of the system
$$F_i(x[\theta, t_0, x_0, V]) > F_i(x[\theta, t_0, x_0, V^S]), \quad i \in \mathbf{N}, \tag{1.20}$$
for all $V \in \tilde{\mathcal{V}} \subset \mathcal{V}$ and all quasimotions $x[\cdot, t_0, x_0, V]$, $x[\cdot, t_0, x_0, V^S]$. Here, $V^S \in \tilde{\mathcal{V}}$. The inconsistency of (1.20) implies the Slater-maximality of the strategy $V^S$ in problem $\tilde\Gamma$ with initial position $(t_0, x_0)$. This proves Property 1.7.
The following property of rejection is defined as the invariability of the set $\mathcal{V}^S$ of Slater-maximal strategies of problem (1.1) if the strategies that are not Slater-maximal in this problem are removed from $\mathcal{V}$.
Property 1.8. If the strategy $\bar V \in \mathcal{V}$ is not Slater-maximal in problem (1.1) with initial position $(t_0, x_0)$, the set $\mathcal{V}^S$ of Slater-maximal strategies of the problem remains the same for the multicriterial problem
$$\langle \Sigma,\ \mathcal{V} \setminus \{\bar V\},\ \mathbb{F}(x[\theta]) \rangle$$
with the same initial position, or
$$\forall\, \tilde{\mathcal{V}} \subseteq \mathcal{V}:\quad \mathcal{V}^S \subseteq \tilde{\mathcal{V}} \ \Rightarrow\ \tilde{\mathcal{V}}^S \subseteq \mathcal{V}^S.$$
Here $\tilde{\mathcal{V}}^S$ is the set of Slater-maximal strategies of the problem $\langle \Sigma, \tilde{\mathcal{V}}, \mathbb{F}(x[\theta]) \rangle$ with initial position $(t_0, x_0)$.
The validity of the rejection property follows immediately from Definition 1.1, since in this case those inequalities in (1.5) are rejected which are certainly associated with non-Slater-optimal strategies $V \in \mathcal{V}$.
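Inheritance and rejection can be checked on a finite toy model. The sketch below rests on assumptions that are not part of the text: strategies are represented by hypothetical finite clouds of terminal criterion vectors, and Slater-maximality is tested by the assumed finite reading of Definition 1.1 (no terminal value of any admissible strategy is strictly better in every criterion than a terminal value of the candidate).

```python
from typing import Dict, List, Sequence, Set

Point = Sequence[float]

def strictly_dominates(a: Point, b: Point) -> bool:
    """True when a_i > b_i for every criterion i."""
    return all(x > y for x, y in zip(a, b))

def slater_set(clouds: Dict[str, List[Point]]) -> Set[str]:
    """Strategies none of whose terminal values is strictly dominated."""
    return {
        v for v, own in clouds.items()
        if not any(strictly_dominates(a, b)
                   for cloud in clouds.values() for a in cloud for b in own)
    }

def restrict(clouds: Dict[str, List[Point]], keep: Set[str]) -> Dict[str, List[Point]]:
    """Problem with the truncated strategy set."""
    return {v: c for v, c in clouds.items() if v in keep}

# Hypothetical data.
clouds = {
    "V1": [(1.0, 3.0), (2.0, 3.0)],
    "V2": [(3.0, 1.0)],
    "V3": [(0.5, 0.5)],
    "V4": [(2.5, 0.5)],
}
full = slater_set(clouds)
# Rejection: drop one strategy that is not Slater-maximal.
after_rejection = slater_set(restrict(clouds, set(clouds) - {"V3"}))
# Inheritance: V1 stays Slater-maximal on a subset that contains it.
on_subset = slater_set(restrict(clouds, {"V1", "V4"}))
```

Note that inheritance only goes one way in this sketch: on the truncated set `{"V1", "V4"}` the strategy `V4` also becomes Slater-maximal, although it is not in the full problem.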
2. Sufficient Conditions
2.1. Conditions
Theorem 2.1. Suppose Conditions 1.1 and 1.2 are satisfied. If a set of positive numbers $\beta_1, \ldots, \beta_N$ exists such that
$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} \min_{i \in \mathbf{N}} \beta_i F_i(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} \min_{i \in \mathbf{N}} \beta_i F_i(x[\theta, t_0, x_0, V^S]), \tag{2.1}$$
the strategy $V^S$ is Slater-maximal in problem (1.1) with initial position $(t_0, x_0)$.
Before proving the theorem we will need two lemmas.
Lemma 2.1. If the functions $F_i(x)$ are continuous ($i \in \mathbf{N}$, $x \in R^n$), for every fixed set of positive numbers $\beta_1, \ldots, \beta_N$ the function $\min_{i \in \mathbf{N}} \beta_i F_i(x)$ is continuous.
This lemma is proved in [19, p. 69].
Lemma 2.2. Let the set $X \subset R^n$ be closed and bounded, the functions $F_i(x)$, $i \in \mathbf{N}$, continuous for $x \in R^n$, and $\beta_1, \ldots, \beta_N$ a set of positive numbers. Then
$$\min_{x \in X} \min_{i \in \mathbf{N}} \beta_i F_i(x) = \min_{i \in \mathbf{N}} \min_{x \in X} \beta_i F_i(x).$$
Proof of Theorem 2.1. Let the strategy $V^S$ be the solution of problem (2.1) for some set of positive numbers. From (2.1), interchanging the minimum operations (Lemma 2.2), we obtain
$$\min_{i \in \mathbf{N}} \min_{x[\cdot]} \beta_i F_i(x[\theta, t_0, x_0, V^S]) \ge \min_{i \in \mathbf{N}} \min_{x[\cdot]} \beta_i F_i(x[\theta, t_0, x_0, V]) \tag{2.3}$$
for any strategies $V \in \mathcal{V}$. For every strategy $V$ a subscript $i_0(V) = i_0 \in \mathbf{N}$ may be found such that
$$\min_{i \in \mathbf{N}} \min_{x[\cdot]} \beta_i F_i(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} \beta_{i_0} F_{i_0}(x[\theta, t_0, x_0, V]). \tag{2.4}$$
On the other hand, with the above subscript we have
$$\min_{x[\cdot]} \beta_{i_0} F_{i_0}(x[\theta, t_0, x_0, V^S]) \ge \min_{i \in \mathbf{N}} \min_{x[\cdot]} \beta_i F_i(x[\theta, t_0, x_0, V^S]).$$
Combining (2.3), (2.4), and the latter relation we have the inequality
$$\beta_{i_0} \min_{x[\cdot]} F_{i_0}(x[\theta, t_0, x_0, V^S]) \ge \beta_{i_0} \min_{x[\cdot]} F_{i_0}(x[\theta, t_0, x_0, V]).$$
Consequently, for every strategy $V$ a subscript $i_0(V) = i_0 \in \mathbf{N}$ may be found such that
$$\min_{x[\cdot]} F_{i_0}(x[\theta, t_0, x_0, V^S]) \ge \min_{x[\cdot]} F_{i_0}(x[\theta, t_0, x_0, V]). \tag{2.5}$$
The function $F_{i_0}(x)$ is continuous over $X[\theta, t_0, x_0, V^S]$ and $X[\theta, t_0, x_0, V]$. Therefore, quasimotions $\bar x[\cdot, t_0, x_0, V] \in \mathcal{X}[t_0, x_0, V]$ and $x^S[\cdot, t_0, x_0, V^S] \in \mathcal{X}[t_0, x_0, V^S]$ may be found such that
$$\min_{x[\cdot]} F_{i_0}(x[\theta, t_0, x_0, V]) = F_{i_0}(\bar x[\theta, t_0, x_0, V]),$$
$$\min_{x[\cdot]} F_{i_0}(x[\theta, t_0, x_0, V^S]) = F_{i_0}(x^S[\theta, t_0, x_0, V^S]).$$
Combining these equalities with (2.5), we have
$$F_{i_0}(\bar x[\theta, t_0, x_0, V]) \le F_{i_0}(x^S[\theta, t_0, x_0, V^S]).$$
Because the strategy $V$ is arbitrary, the latter inequality signifies, by Definition 1.1, that $V^S$ is a Slater-maximal strategy of problem (1.1) with initial position $(t_0, x_0)$.
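The chain of inequalities in the proof can be traced on a finite toy model. The sketch below is only an illustration under assumptions: each strategy is replaced by a hypothetical finite cloud of terminal criterion vectors, `guaranteed` is the finite analog of the inner double minimum in (2.1), and `witness_index` returns a subscript $i_0$ as in (2.4). The final check reproduces inequality (2.5) and nothing more.

```python
from typing import Dict, List, Sequence

Point = Sequence[float]

def worst_case(cloud: List[Point], i: int) -> float:
    """Smallest value of criterion i over the cloud (min over quasimotions)."""
    return min(p[i] for p in cloud)

def guaranteed(cloud: List[Point], beta: Sequence[float]) -> float:
    """Finite analog of min over x[.] of min over i of beta_i * F_i."""
    return min(b * worst_case(cloud, i) for i, b in enumerate(beta))

def maximin_strategy(clouds: Dict[str, List[Point]], beta: Sequence[float]) -> str:
    """Strategy that maximizes the guaranteed scalarized value, as in (2.1)."""
    return max(clouds, key=lambda v: guaranteed(clouds[v], beta))

def witness_index(cloud: List[Point], beta: Sequence[float]) -> int:
    """A subscript i0 at which the double minimum in (2.4) is attained."""
    return min(range(len(beta)), key=lambda i: beta[i] * worst_case(cloud, i))

def satisfies_2_5(clouds: Dict[str, List[Point]], beta: Sequence[float]) -> bool:
    """Check inequality (2.5) against every competing strategy."""
    best = clouds[maximin_strategy(clouds, beta)]
    return all(
        worst_case(best, witness_index(cloud, beta))
        >= worst_case(cloud, witness_index(cloud, beta))
        for cloud in clouds.values()
    )

# Hypothetical data.
clouds = {
    "A": [(1.0, 1.0), (1.0, 10.0)],
    "B": [(0.9, 0.9), (2.0, 2.0)],
    "C": [(3.0, 0.5)],
}
choice_equal_weights = maximin_strategy(clouds, (1.0, 1.0))
choice_skewed_weights = maximin_strategy(clouds, (1.0, 4.0))
```

Different weight vectors may single out different strategies, which is why the choice of $\beta_1, \ldots, \beta_N$ matters in what follows.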
2.2. Corollaries of Theorem 2.1
Proposition 2.1. If Conditions 1.1 and 1.2 are satisfied, there exists a Slater-maximal strategy of the multicriterial problem (1.1) with any initial position $(t_0, x_0) \in [0, \theta) \times R^n$.
Proof. To prove the assertion it is sufficient, by Theorem 2.1, to show that there exists a strategy $V^S$ satisfying equality (2.1). Consider an auxiliary zero-sum positional differential game
$$\Gamma^S = \langle \{1, 2\},\ \Sigma^*,\ \{\mathcal{U}, \mathcal{V}\},\ \min_{i \in \mathbf{N}} \beta_i F_i(x[\theta]) \rangle,$$
where $\beta_1, \ldots, \beta_N$ is some set of positive numbers; the control system $\Sigma^*$ is defined in (1.10), the sets $\mathcal{U}$ and $\mathcal{V}$ are the same. Consequently, for every choice of initial position $(t_0, x_0)$ there exists, by Theorem 3.2 of Chapter 1, a maximin strategy $V^S \in \mathcal{V}$ in the game $\Gamma^S$ such that
$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} \min_{i \in \mathbf{N}} \beta_i F_i(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} \min_{i \in \mathbf{N}} \beta_i F_i(x[\theta, t_0, x_0, V^S]).$$
The resultant equality is identical to (2.1). Proposition 2.1 is proved.
Remark 2.1. Theorem 2.1 offers two advantages. First, it permits the set of Slater-maximal strategies of the multicriterial problem (1.1) to be found for a given initial position $(t_0, x_0) \in [0, \theta) \times R^n$. Specifically, every set of positive numbers $\beta_1, \ldots, \beta_N$ is associated with a Slater-optimal strategy $V^S \in \mathcal{V}$ of its own satisfying equality (2.1). It remains to run through all positive $\beta_1, \ldots, \beta_N$ such that $\sum \beta_i = 1$ (by (2.1), proportional collections $\beta_1, \ldots, \beta_N$ are associated with the same strategy $V^S$, hence the constraint $\sum \beta_i = 1$). In this way the set of Slater-maximal strategies is found. Second, the problem of finding Slater-maximal strategies reduces to the solution of the following optimal control problem in positional strategies:
$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} \min_{i \in \mathbf{N}} \beta_i F_i(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} \min_{i \in \mathbf{N}} \beta_i F_i(x[\theta, t_0, x_0, V^S]) \tag{2.6}$$
under the constraints
$$\dot x = f(t, x, v), \quad x[t_0] = x_0, \quad \mathcal{V} = \{V \div v(t, x) \mid v(t, x) \in Q\}.$$
To solve this problem Subbotin and Chentsov developed mathematical tools for modifying the method of dynamic programming [83, Chapter III]. The problem is discussed on pages 274-275 (where it is necessary to assume that $\sigma = \min_{i \in \mathbf{N}} \beta_i F_i$). By means of these results necessary and sufficient conditions for the solution of problem (2.6) can be obtained as constraints on the potential of the associated optimal control problem (2.6) (see the proposition in [83, p. 275]).
Remark 2.2. For a Slater-minimal strategy Theorem 2.1 must be reformulated. Suppose Conditions 1.1 and 1.2 are satisfied. The strategy $V_S \in \mathcal{V}$ is Slater-minimal in problem (1.1) with initial position $(t_0, x_0) \in [0, \theta) \times R^n$ if a set of positive numbers $\beta_1, \ldots, \beta_N$ exists such that
$$\min_{V \in \mathcal{V}} \max_{x[\cdot]} \max_{i \in \mathbf{N}} \beta_i F_i(x[\theta, t_0, x_0, V]) = \max_{x[\cdot]} \max_{i \in \mathbf{N}} \beta_i F_i(x[\theta, t_0, x_0, V_S]).$$
Theorem 2.1 enables us to establish the existence of a universal Slater-maximal strategy in problem (1.1).
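The weight sweep described in Remark 2.1 can be sketched on a finite toy model with two criteria. This is a hedged illustration, not the method of [83]: strategies are replaced by hypothetical finite clouds of terminal criterion vectors, the normalized weights are taken on a uniform grid of the open segment $\beta_1 + \beta_2 = 1$, and the names `sweep_weights` and `maximin_strategy` are assumptions made for the example.

```python
from typing import Dict, List, Sequence, Set

Point = Sequence[float]

def guaranteed(cloud: List[Point], beta: Sequence[float]) -> float:
    """Worst scalarized value min_i beta_i * F_i over the whole cloud."""
    return min(min(b * f for b, f in zip(beta, p)) for p in cloud)

def maximin_strategy(clouds: Dict[str, List[Point]], beta: Sequence[float]) -> str:
    """Finite analog of the strategy that solves (2.6) for given weights."""
    return max(clouds, key=lambda v: guaranteed(clouds[v], beta))

def sweep_weights(clouds: Dict[str, List[Point]], steps: int) -> Set[str]:
    """Collect the maximin strategies for beta = (k/steps, 1 - k/steps)."""
    found = set()
    for k in range(1, steps):
        beta = (k / steps, 1.0 - k / steps)
        found.add(maximin_strategy(clouds, beta))
    return found

# Hypothetical data.
clouds = {
    "A": [(1.0, 1.0), (1.0, 10.0)],
    "B": [(0.9, 0.9), (2.0, 2.0)],
    "C": [(3.0, 0.5)],
}
found = sweep_weights(clouds, steps=10)
# Proportional weight collections select the same strategy.
same_choice = (maximin_strategy(clouds, (0.2, 0.8))
               == maximin_strategy(clouds, (1.0, 4.0)))
```

A finer grid can only add strategies to `found`; in this sketch the strategy `B` is never selected because its guaranteed value is below that of `A` for every positive weight vector.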
2.3. A Universal Slater-Maximal Strategy
Definition 2.1. The strategy $V^S \in \mathcal{V}$ is universal Slater-maximal in problem (1.1) if with any choice of initial position $(t_0, x_0) \in [0, \theta) \times R^n$, the following system of inequalities is, for all $V \in \mathcal{V}$, nonsimultaneous:
$$F_i(x[\theta, t_0, x_0, V]) > F_i(x[\theta, t_0, x_0, V^S]), \quad i \in \mathbf{N},$$
for any quasimotions $x[\cdot, t_0, x_0, V]$, $x[\cdot, t_0, x_0, V^S]$.
In contrast, let us assume that the strategies $V$ are identified now with the functions $v(t, x, \varepsilon) \in Q \in \operatorname{comp} R^q$, where $\varepsilon$ is some accuracy parameter. The quasimotions of the system
$$\dot x = f(t, x, v), \quad x[t_0] = x_0$$
have been defined in [42, p. 110] in terms of the triple limit of corresponding stepwise quasimotions. The definition is as follows. Assume that Condition 1.1 is satisfied. The condition of continuity may be eased by requiring the continuity of $f(t, x, v)$ with respect to $x$, $v$ for each $t \in [0, \theta]$ and Borel measurability over $t$ with fixed $(x, v) \in R^n \times Q$.
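Let some strategy $V \div v(t, x, \varepsilon)$ be fixed together with an initial position $(t_0, x_0)$.
For the class of strategies depending also on the accuracy parameter $\varepsilon$, every function
$$x^{\varepsilon}(\cdot, V, \Delta, \alpha) = \{x(t, \tau_0, x_0, \bar x_0, V, \Delta, \alpha, \varepsilon) = x^{\varepsilon}(t, V, \Delta, \alpha),\ t_0 \le t \le \theta\}$$
will be referred to as the stepwise quasimotion of the system (2.2) generated from the initial position $(t_0, x_0)$ by the strategy $V \div v(t, x, \varepsilon)$, the partition $\Delta$: $t_0 = \tau_0 < \tau_1 < \cdots < \tau_{m(\Delta)} = \theta$, and the number $\alpha$. With $\tau_j \le t < \tau_{j+1}$ ($j = 0, 1, \ldots, m(\Delta) - 1$) this function satisfies the equation
$$x^{\varepsilon}(t, V, \Delta, \alpha) = \bar x_j + \int_{\tau_j}^{t} f(\tau, x^{\varepsilon}(\tau, V, \Delta, \alpha), v(\tau_j, \bar x_j, \varepsilon))\, d\tau \tag{2.7}$$
if
$$\|\bar x_j - x^{\varepsilon}_{-}(\tau_j, V, \Delta, \alpha)\| \le \alpha,$$
where $x^{\varepsilon}_{-}(\tau_j, V, \Delta, \alpha)$ is the value (with $t = \tau_j$) of the solution $x^{\varepsilon}(t, V, \Delta, \alpha)$ of (2.7) extended to the right over the interval $\tau_{j-1} \le t \le \tau_j$, that is,
$$x^{\varepsilon}_{-}(\tau_j, V, \Delta, \alpha) = \bar x_{j-1} + \int_{\tau_{j-1}}^{\tau_j} f(\tau, x^{\varepsilon}(\tau, V, \Delta, \alpha), v(\tau_{j-1}, \bar x_{j-1}, \varepsilon))\, d\tau,$$
where $x^{\varepsilon}(\tau_0, V, \Delta, \alpha) = \bar x_0$ and $x^{\varepsilon}_{-}(\tau_0, V, \Delta, \alpha) = x_0$.
The quasimotions $x[\cdot]$ of the system (1.1) generated from the initial position $(t_0, x_0)$ by the strategy $V \div v(t, x, \varepsilon)$ will be the functions $x[t]$, $t_0 \le t \le \theta$, continuous over the closed interval $[t_0, \theta]$, such that there exists a sequence of stepwise quasimotions (left extended to $t_0$ if $\tau_0^{(r)} > t_0$)
$$x^{\varepsilon_q}(\cdot, V, \Delta^{(r)}, \alpha^{(m)}) = \{x(t, \tau_0^{(r)}, x_0, \bar x_0^{(r)}, V, \Delta^{(r)}, \alpha^{(m)}, \varepsilon_q),\ t_0 \le t \le \theta\}$$
such that
$$\lim_{\varepsilon_q \to 0}\ \lim_{\operatorname{diam} \Delta^{(r)} \to 0}\ \lim_{\alpha^{(m)} \to 0}\ \sup_{t_0 \le t \le \theta} \|x[t] - x^{\varepsilon_q}(t, V, \Delta^{(r)}, \alpha^{(m)})\| = 0$$
with
$$\lim_{r \to \infty} \left(|\tau_0^{(r)} - t_0| + \|\bar x_0^{(r)} - x_0\|\right) = 0, \quad \lim_{m \to \infty} \alpha^{(m)} = 0, \quad \lim_{q \to \infty} \varepsilon_q = 0.$$
Here, for the partition $\Delta^{(r)}$ with nodes $\tau_i^{(r)}$,
$$\operatorname{diam} \Delta^{(r)} = \max_i\, [\tau_{i+1}^{(r)} - \tau_i^{(r)}].$$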
If Condition 1.1 is satisfied, as in subsections 2.2 and 2.3 of Chapter 1, the set of such partitions may be shown to be nonempty and each quasimotion $x[\cdot, t_0, x_0, V]$ to be an absolutely continuous function of time $t$ (because it satisfies the Lipschitz condition with respect to $t$). Moreover, with the above quasimotions Propositions 2.2-2.4 and Theorem 2.1 of Chapter 1 hold.
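The idea behind a stepwise motion, a feedback value sampled at each partition node and held until the next node, can be sketched numerically. The code below is only a rough illustration under simplifying assumptions that are not in the text: the accuracy parameter and the admissible jumps at the nodes are ignored, the integral in (2.7) is approximated by explicit Euler substeps, and the names `stepwise_motion`, `f`, and `v` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]

def stepwise_motion(
    f: Callable[[float, Sequence[float], float], Sequence[float]],
    v: Callable[[float, State], float],
    x0: Sequence[float],
    nodes: Sequence[float],
    substeps: int = 100,
) -> List[Tuple[float, State]]:
    """Integrate x' = f(t, x, v) with the control frozen on each interval
    [tau_j, tau_{j+1}) at the value v(tau_j, x(tau_j))."""
    x = list(x0)
    trajectory = [(nodes[0], tuple(x))]
    for tau_j, tau_next in zip(nodes[:-1], nodes[1:]):
        control = v(tau_j, tuple(x))      # sampled once per partition interval
        h = (tau_next - tau_j) / substeps
        t = tau_j
        for _ in range(substeps):         # explicit Euler inside the interval
            dx = f(t, x, control)
            x = [xi + h * di for xi, di in zip(x, dx)]
            t += h
        trajectory.append((tau_next, tuple(x)))
    return trajectory

# Hypothetical scalar example: x' = v with a discontinuous feedback
# that pushes the state toward zero.
f = lambda t, x, control: [control]
v = lambda t, x: -1.0 if x[0] > 0 else 1.0
nodes = [0.0, 0.5, 1.0, 1.5, 2.0]
trajectory = stepwise_motion(f, v, x0=[0.75], nodes=nodes)
final_state = trajectory[-1][1][0]
```

With this coarse partition the state overshoots zero and then oscillates with an amplitude controlled by the partition diameter, which is the kind of behaviour that the limit over $\operatorname{diam} \Delta \to 0$ is meant to resolve.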
Theorem 2.2. If Conditions 1.1 and 1.2 are satisfied and the functions $F_i(x)$, $i \in \mathbf{N}$, satisfy a Lipschitz condition with respect to $x$, a universal Slater-maximal strategy exists in problem (1.1).
Before proving the theorem we will need the following assertion.
Lemma 2.3. If the functions $F_i(x)$, $i \in \mathbf{N}$, satisfy a Lipschitz condition over $X[\theta, t_0, x_0, V \div Q]$, that is, there exist $\lambda_i = \text{const}$ such that for any $x^{(j)} \in X[\theta, t_0, x_0, V \div Q]$, $j = 1, 2$, the following inequality holds:
$$|F_i(x^{(1)}) - F_i(x^{(2)})| \le \lambda_i \|x^{(1)} - x^{(2)}\|, \quad i \in \mathbf{N}, \tag{2.8}$$
then, for every set of positive numbers $\beta_1, \ldots, \beta_N$, so does the function $\varphi(x) = \min_{i \in \mathbf{N}} \beta_i F_i(x)$.
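Lemma 2.3 can be checked numerically on a toy example. The sketch below is an illustration under assumptions: the two scalar criteria $2|x|$ and $\sin 3x$ (with Lipschitz constants 2 and 3) are hypothetical, the points are sampled on the interval $[-2, 2]$, and the candidate Lipschitz constant of $\min_i \beta_i F_i$ is taken to be $\max_i \beta_i \lambda_i$.

```python
import math
import random
from typing import Sequence

# Hypothetical criteria with known Lipschitz constants lambda_1 = 2, lambda_2 = 3.
CRITERIA = (lambda x: 2.0 * abs(x), lambda x: math.sin(3.0 * x))
LIPSCHITZ = (2.0, 3.0)

def phi(x: float, beta: Sequence[float]) -> float:
    """Scalarized criterion min_i beta_i * F_i(x)."""
    return min(b * F(x) for b, F in zip(beta, CRITERIA))

def lipschitz_bound(beta: Sequence[float]) -> float:
    """Candidate constant max_i beta_i * lambda_i."""
    return max(b * lam for b, lam in zip(beta, LIPSCHITZ))

def worst_ratio(beta: Sequence[float], samples: int = 2000, seed: int = 0) -> float:
    """Largest observed |phi(x) - phi(y)| / |x - y| over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x, y = rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)
        if x != y:
            worst = max(worst, abs(phi(x, beta) - phi(y, beta)) / abs(x - y))
    return worst

beta = (1.0, 0.5)
bound = lipschitz_bound(beta)
ratio = worst_ratio(beta)
```

The sampled difference quotients stay below `bound`; a random check of this kind supports the statement on the sample but does not replace the proof.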
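To complete the discussion, let us prove Lemma 2.3. First let us prove the inequality
$$\left| \max_{i \in \mathbf{N}} \beta_i F_i(x^{(1)}) - \max_{i \in \mathbf{N}} \beta_i F_i(x^{(2)}) \right| \le \max_{i \in \mathbf{N}} \beta_i |F_i(x^{(1)}) - F_i(x^{(2)})|. \tag{2.9}$$
Indeed, from the chain of relations
$$\beta_i F_i(x^{(1)}) = \beta_i F_i(x^{(2)}) + \beta_i [F_i(x^{(1)}) - F_i(x^{(2)})] \le \beta_i F_i(x^{(2)}) + \beta_i |F_i(x^{(1)}) - F_i(x^{(2)})|$$
and, because the maximum of a sum of a finite number of summands does not exceed the sum of the maxima, we have
$$\max_{i \in \mathbf{N}} \beta_i F_i(x^{(1)}) \le \max_{i \in \mathbf{N}} \beta_i F_i(x^{(2)}) + \max_{i \in \mathbf{N}} \beta_i |F_i(x^{(1)}) - F_i(x^{(2)})|.$$
Similarly,
$$\max_{i \in \mathbf{N}} \beta_i F_i(x^{(2)}) \le \max_{i \in \mathbf{N}} \beta_i F_i(x^{(1)}) + \max_{i \in \mathbf{N}} \beta_i |F_i(x^{(1)}) - F_i(x^{(2)})|.$$
From these two inequalities (2.9) follows. Because $\min_{i \in \mathbf{N}} \beta_i F_i(x) = -\max_{i \in \mathbf{N}} \beta_i [-F_i(x)]$ and since inequalities (2.8) and (2.9) are true, we have
$$\left| \min_{i \in \mathbf{N}} \beta_i F_i(x^{(1)}) - \min_{i \in \mathbf{N}} \beta_i F_i(x^{(2)}) \right| = \left| -\max_{i \in \mathbf{N}} \beta_i [-F_i(x^{(1)})] + \max_{i \in \mathbf{N}} \beta_i [-F_i(x^{(2)})] \right| \le \max_{i \in \mathbf{N}} \beta_i |F_i(x^{(1)}) - F_i(x^{(2)})| \le \lambda \|x^{(1)} - x^{(2)}\|,$$
where $\lambda = \max_{i \in \mathbf{N}} \beta_i \lambda_i$. Consequently, for every $x^{(j)} \in X[\theta, t_0, x_0, V \div Q]$, $j = 1, 2$,
$$|\varphi(x^{(1)}) - \varphi(x^{(2)})| \le \lambda \|x^{(1)} - x^{(2)}\|,$$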
which implies that $\min_{i \in \mathbf{N}} \beta_i F_i(x)$ is Lipschitz.
Let us prove Theorem 2.2. Let $\beta_1, \ldots, \beta_N$ be some set of positive numbers. Consider a zero-sum differential game
$$\langle \{1, 2\},\ \Sigma \div (4.5) \text{ of Chapter 1},\ \{\mathcal{U}, \mathcal{V}\},\ \min_{i \in \mathbf{N}} \beta_i F_i(x[\theta]) \rangle. \tag{2.10}$$
Now
$$\mathcal{V} = \{V \div v(t, x, \varepsilon) \mid v(t, x, \varepsilon) \in Q \in \operatorname{comp} R^q\}, \qquad \mathcal{U} = \{U \div u(t, x, \varepsilon) \mid u(t, x, \varepsilon) \in [0, 1]\},$$
with $\varepsilon$ being the accuracy parameter. The quasimotions of the system (4.5) (Chapter 1) with initial position $(t_0, x_0)$ are defined by means of a triple limit of stepwise motions, which are in essence solutions of the integral equation (2.7). Since, by the premise of Theorem 2.2, the functions $F_i(x)$, $i \in \mathbf{N}$, satisfy a Lipschitz condition with respect to $x$, the function $\min_{i \in \mathbf{N}} \beta_i F_i(x)$ is Lipschitz too (Lemma 2.3). Hence, by Conditions 1.1 and 1.2, the canonical case holds for the positional differential game (2.10) [42, p. 67]. The validity of the saddle point condition in a small game for the system (4.5) in Chapter 1 was verified in order to prove Proposition 4.1 in Chapter 1. As a consequence, in the game (2.10) a maximin strategy $V^S$ exists [42, p. 231] whatever the initial position $(t_0, x_0)$. This signifies that
$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} \min_{i \in \mathbf{N}} \beta_i F_i(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} \min_{i \in \mathbf{N}} \beta_i F_i(x[\theta, t_0, x_0, V^S]), \tag{2.11}$$
where $V^S$ is invariable for any choice of $(t_0, x_0)$. Repeating the proof of the sufficiency of Theorem 2.1 leads us to conclude that $V^S$ from (2.11) will be Slater-maximal in problem (1.1) whatever the choice of the initial position.
Remark 2.3. In contrast to the above, Theorem 2.2 enables us to establish a Slater-maximal strategy whose form does not depend on the choice of the initial position of problem (1.1). In other words, such solutions do not change when the initial position does. This is not, generally speaking, true of the Slater-maximal strategies whose existence is established in Corollary 1.1 or Proposition 2.1. There the Slater-maximal strategy must be found as one of the strategies forming the saddle point of specially composed positional differential games. These saddle points are, in turn, constructed as extremal for W-stable bridges, which depend on the choice of the initial position in problem (1.1).
3. Structure in the Case of Slater Optimality
3.1. Description of Structure
Let us find the structure of the multicriterial problem (1.1) with fixed initial position $(t_0, x_0)$. Let $\mathcal{V}^S$ be the entire set of Slater-maximal strategies $V^S$ of problem (1.1) with the above $(t_0, x_0)$. Denote
$$X^S = \{x^S = x[\theta, t_0, x_0, V^S] \mid V^S \in \mathcal{V}^S\} = \bigcup_{V^S \in \mathcal{V}^S} X[\theta, t_0, x_0, V^S]. \tag{3.1}$$
Consequently, $X^S$ is the set of all right ends (at $t = \theta$) of the quasimotions $x[\cdot, t_0, x_0, V^S]$ generated from the initial position $(t_0, x_0)$ as $V^S$ runs over all the Slater-maximal strategies in $\mathcal{V}^S$. This set satisfies $X^S \subset X[\theta, t_0, x_0, V \div Q]$, where $X[\theta, t_0, x_0, V \div Q]$ is the domain of reachability at time $t = \theta$ of the system (1.2) with $x[t_0] = x_0$ (Fig. 2.3.1).
Fig. 2.3.1.
Now consider a static multicriterial problem
$$(X[\theta, t_0, x_0, V \div Q],\ \{F_i(x)\}_{i \in N}). \tag{3.2}$$
Denote by $X^s$ the set of all Slater-maximal (weakly efficient) solutions $x^s$ of the static multicriterial problem (3.2). Recall that optimal solutions of static multicriterial problems are denoted by lower-case letters ($s$, $p$, $g$, and $a$). The structure of the multicriterial dynamic problem (1.1) is such that
$$X^S = X^s.$$
Consequently, the Slater-maximal strategies of (1.1) “lead”, by time $t = \theta$, the quasimotions of the system (1.2) only to Slater-maximal solutions of the static problem (3.2). What is more, by “trying” all Slater-maximal strategies $V^S$ from the set $\mathcal{V}^S$ we obtain the entire set of Slater-maximal solutions of the static problem (3.2) derived from the original “dynamic” problem (1.1). This result is formulated in the following proposition.
3.2. Structure
Proposition 3.1. If Conditions 1.1 and 1.2 are satisfied, then for every initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ the sets $X^S$ and $X^s$ are identical.
Here $X^S$ is the set (3.1), that is, the set of the right ends (at $t = \theta$) of the quasimotions $x[\cdot, t_0, x_0, V^S]$ as $V^S$ “scans” the entire set of Slater-maximal strategies of problem (1.1). The set $X^s$ consists of all the Slater-maximal solutions of (3.2).
Proof. Let $(t_0, x_0)$ be some initial position from the set $[0, \theta) \times \mathbb{R}^n$. For this $(t_0, x_0)$ we form the set $X^S$ of (3.1) and the set $X^s$ of Slater-maximal solutions of the static multicriterial problem (3.2). First, any point $x^s \in X^s$ must be shown to be contained in $X^S$. By Corollary 4.1 (Chapter 1), there exists a strategy $V^* \in \mathcal{V}$ such that
$$x^s = x[\theta, t_0, x_0, V^*] \tag{3.3}$$
for all quasimotions $x[\cdot, t_0, x_0, V^*]$ of (1.2) generated from $(t_0, x_0)$ by $V^*$. To prove that $V^*$ is Slater-maximal in (1.1), we must show that the system of strict inequalities
$$F_i(x[\theta, t_0, x_0, V]) > F_i(x[\theta, t_0, x_0, V^*]), \quad i \in N, \tag{3.4}$$
is inconsistent for every strategy $V \in \mathcal{V}$ and all quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^*]$ of the system (1.2) generated by $V$ and $V^*$ from the position $(t_0, x_0)$. Let us assume that the opposite is true, i.e., that the strategy $V^*$ is not Slater-maximal. Then there exists at least one strategy $\hat{V} \in \mathcal{V}$ for which the system of inequalities
$$F_i(x[\theta, t_0, x_0, \hat{V}]) > F_i(x[\theta, t_0, x_0, V^*]), \quad i \in N, \tag{3.5}$$
is simultaneous for certain quasimotions $x[\cdot, t_0, x_0, \hat{V}]$ and $x[\cdot, t_0, x_0, V^*]$ of the system (1.2) generated by the strategies $\hat{V}$ and $V^*$ from the position $(t_0, x_0)$. Then $\hat{x} = x[\theta, t_0, x_0, \hat{V}]$ is a point from the set $X[\theta, t_0, x_0, \hat{V}] = \mathcal{X}[t_0, x_0, \hat{V}] \cap \{t = \theta\}$, that is, $\hat{x}$ belongs to the cross-section of the quasimotion bunch $\mathcal{X}[t_0, x_0, \hat{V}]$ by the hyperplane $t = \theta$. As a result, $\hat{x} \in X[\theta, t_0, x_0, V \div Q]$ since
$$X[\theta, t_0, x_0, \hat{V}] \subset X[\theta, t_0, x_0, V \div Q].$$
By (3.5) and (3.3), we have
$$F_i(\hat{x}) > F_i(x[\theta, t_0, x_0, V^*]) = F_i(x^s), \quad i \in N. \tag{3.6}$$
A point $\hat{x} \in X[\theta, t_0, x_0, V \div Q]$ is thus found such that the system of inequalities
$$F_i(\hat{x}) > F_i(x^s), \quad i \in N,$$
is simultaneous, which contradicts the Slater maximality of the solution $x^s$ in the static
multicriterial problem (3.2). Consequently, our assumption is false and the strategy $V^*$ from (3.3) is Slater-maximal in problem (1.1) with initial position $(t_0, x_0)$. But then $x^s \in X^S$ by the structure of the set $X^S$. Consequently, the following inclusion is valid:
$$X^s \subset X^S. \tag{3.7}$$
If, in addition,
$$X^S \subset X^s, \tag{3.8}$$
then, together with (3.7), Proposition 3.1 is valid. The proof of (3.8) will also proceed by contradiction. Assume that a Slater-maximal strategy $V^S \in \mathcal{V}^S$ exists such that at least one point $x^* \in X[\theta, t_0, x_0, V^S]$ is not in the set $X^s$, i.e., is not Slater-maximal in (3.2). Because $X[\theta, t_0, x_0, V^S] \subset X[\theta, t_0, x_0, V \div Q]$, it is true that $x^* \in X[\theta, t_0, x_0, V \div Q]$. But the point $x^*$ is not a Slater-maximal solution of (3.2), and so there is a point $x^s \in X[\theta, t_0, x_0, V \div Q]$ such that the system of strict inequalities
$$F_i(x^s) > F_i(x^*), \quad i \in N, \tag{3.9}$$
is simultaneous. By Corollary 4.1 of Chapter 1 there exists a strategy $V^s \in \mathcal{V}$ (written with a lower-case superscript to distinguish it from the Slater-maximal strategy $V^S$) such that
$$x^s = x[\theta, t_0, x_0, V^s] \tag{3.10}$$
for all quasimotions $x[\cdot, t_0, x_0, V^s]$ generated from the position $(t_0, x_0)$ by the strategy $V^s$. From (3.10) and (3.9) we obtain
$$F_i(x^s) = F_i(x[\theta, t_0, x_0, V^s]) > F_i(x^*), \quad i \in N. \tag{3.11}$$
Finally, $x^* \in X[\theta, t_0, x_0, V^S]$, and so a quasimotion $x^*[\cdot, t_0, x_0, V^S]$ exists such that
$$x^* = x^*[\theta, t_0, x_0, V^S].$$
Consequently, by (3.11),
$$F_i(x[\theta, t_0, x_0, V^s]) > F_i(x^*[\theta, t_0, x_0, V^S]), \quad i \in N, \tag{3.12}$$
for all quasimotions $x[\cdot, t_0, x_0, V^s]$. Thus, if at least one point $x^* \in X[\theta, t_0, x_0, V^S]$ exists that does not belong to the set $X^s$, a strategy $V^s$ can be constructed such that the system of strict inequalities (3.12) holds. But the system (3.12) is in conflict with the Slater maximality of $V^S$ in the multicriterial problem (1.1). This contradiction proves the inclusion (3.8) and, with it, Proposition 3.1.
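For a static problem with finitely many alternatives, the set of Slater-maximal (weakly efficient) points that plays the role of $X^s$ in (3.2) can be computed by direct comparison. The sketch below assumes a hypothetical finite list of criterion vectors in place of $F(X[\theta, t_0, x_0, V \div Q])$; the function names are illustrative only.

```python
def strictly_dominates(a, b):
    """True if a_i > b_i for every criterion i (all strict inequalities hold)."""
    return all(ai > bi for ai, bi in zip(a, b))


def slater_maximal(points):
    """Points for which no point of the set is strictly better in every criterion."""
    return [p for p in points
            if not any(strictly_dominates(q, p) for q in points)]


# Hypothetical criterion vectors (F_1, F_2) of six alternatives.
points = [(0, 0), (1, 3), (2, 3), (3, 1), (3, 0), (2, 2)]
slater = slater_maximal(points)
print(slater)
```

Only the point (0, 0) is discarded here, because every other point lacks a competitor that is strictly larger in both criteria at once.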
3.3. Comparison with Another Definition of Problem (1.1) Solution
The book [116, p. 171] suggested the following notion of a solution of the multicriterial problem (1.1) with initial position $(t_0, x_0)$.
Definition 3.1. The strategy $V^* \in \mathcal{V}$ is referred to as S-optimal in problem (1.1) if for any $V \in \mathcal{V}$ the following system of inequalities is nonsimultaneous:
$$\min_{x[\cdot]} F_i(x[\theta, t_0, x_0, V]) > \min_{x[\cdot]} F_i(x[\theta, t_0, x_0, V^*]), \quad i \in N.$$
If the class of the strategies in use is additionally constrained, the above definition is open to criticism, as seen in Example 3.1.
Suppose the entire set of strategies consists of two elements
$$V^{(1)} \div u^{(1)}(t, x) = (u_1, u_2) = (\delta, \delta), \quad \delta = \text{const} > 0,$$
$$V^{(2)} \div u^{(2)}(t, x) = \{(u_1, u_2) \mid u_1^2 + u_2^2 = r^2,\ u_j \ge 0,\ j = 1, 2\}.$$
The system (1.2) has the form
$$\dot{x}_1 = u_1, \quad \dot{x}_2 = u_2.$$
The initial position is fixed,
$$\{(t_0, x_0) \mid t_0 = 0,\ x_0 = (x_1[0], x_2[0]) = (0, 0)\},$$
as is the time $\theta = 1$ of the end of the process. The goal functionals from (3.2) are given as follows:
$$F_i(x[1]) = x_i[1], \quad i = 1, 2.$$
Here $x = (x_1, x_2) \in \mathbb{R}^2$, $u = (u_1, u_2) \in \mathbb{R}^2$.
Then the set $F(X[\theta, t_0, x_0, V^{(1)}]) = (F_1, F_2) = (\delta, \delta)$, and $F(X[\theta, t_0, x_0, V^{(2)}])$ is represented by the hatched part of a circle of radius $r$ in the first quadrant of Fig. 2.3.2. Assume that $0 < \delta < r$ (Fig. 2.3.2). Because
$$\min_{x[\cdot]} F_i(x[\theta, t_0, x_0, V^{(2)}]) = 0, \quad i = 1, 2,$$
then, for arbitrarily small positive $\delta = \text{const} > 0$,
$$\min_{x[\cdot]} F_i(x[\theta, t_0, x_0, V^{(1)}]) = \delta > \min_{x[\cdot]} F_i(x[\theta, t_0, x_0, V^{(2)}]), \quad i = 1, 2.$$
Fig. 2.3.2.
As a result (Definition 3.1), the strategy $V^{(1)}$ is S-optimal in this multicriterial problem. Then the set
$$X^S = \{x[\theta, t_0, x_0, V^{(1)}]\} = (\delta, \delta).$$
It is easy to see that in the corresponding static problem (3.2), the set of Slater-optimal solutions is
$$X^s = \{x[\theta, t_0, x_0, V^{(2)}]\} = \{(x_1, x_2) \mid x_1^2 + x_2^2 = r^2,\ x_i \ge 0,\ i = 1, 2\}.$$
Consequently, the sets $X^S$ and $X^s$ do not coincide. Moreover, for large values of $r$ the point $F(X[\theta, t_0, x_0, V^{(1)}])$ is located far from the set $F(X[\theta, t_0, x_0, V^{(2)}])$, i.e.,
$$\rho(F(X[\theta, t_0, x_0, V^{(1)}]), 0_2) = \sqrt{2}\,\delta,$$
where $0_2 = (0, 0) \in \mathbb{R}^2$, and inadequately characterizes the set $F(X[\theta, t_0, x_0, V^{(2)}])$. Since the number $\delta > 0$ may be as small as desired, it is easy to understand that for large values of $r$ the strategy $V^{(2)}$, and not $V^{(1)}$, should be the solution of the multicriterial problem. It is the strategy $V^{(2)}$, and not $V^{(1)}$, that will be the Slater-maximal solution if the choice of solution is dictated by Definition 1.1.
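The comparison in this example can be reproduced numerically. The sketch below is an illustration under stated assumptions: the end points of $V^{(2)}$ are sampled only on the quarter arc (which is enough to obtain the guaranteed values, since the minima are attained at the ends of the arc), the values of `delta` and `r` are hypothetical, and the function names are not taken from the book.

```python
import math


def guaranteed_values(outcomes):
    """Componentwise minimum of the criteria over all end points of a strategy."""
    return tuple(min(p[i] for p in outcomes) for i in range(2))


def is_s_optimal(name, outcome_sets):
    """Definition 3.1 for a finite strategy set: no other strategy has a
    strictly larger guaranteed value in every criterion."""
    own = guaranteed_values(outcome_sets[name])
    return not any(
        all(g > o for g, o in zip(guaranteed_values(pts), own))
        for other, pts in outcome_sets.items() if other != name
    )


delta, r, n = 0.01, 10.0, 91  # hypothetical values with 0 < delta < r
arc = [(r * math.cos(k * math.pi / (2 * (n - 1))),
        r * math.sin(k * math.pi / (2 * (n - 1)))) for k in range(n)]
outcome_sets = {"V1": [(delta, delta)], "V2": arc}

s_opt = {name: is_s_optimal(name, outcome_sets) for name in outcome_sets}
dist_to_origin = math.hypot(delta, delta)  # distance from (delta, delta) to 0_2
dist_to_arc = r - dist_to_origin           # distance from (delta, delta) to the arc
print(s_opt, dist_to_origin, dist_to_arc)
```

With these values the S-optimal strategy is `V1`, although its only end point lies within about 0.014 of the origin and almost 10 units away from the arc of Slater-optimal points.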
Remark 3.1. Let us refine Fig. 2.1.1, where Slater-maximal strategies are illustrated. Because of
(1) the internal stability of the set of the Slater-maximal strategies (Property 1.5) and
(2) the possible multivalence of the values of the goal functional vector $F(x[\theta])$ obtained over a Slater-maximal strategy (Example 1.1),
in the case $N = 2$ the set $F(X[\theta, t_0, x_0, V^S])$ can be either a segment parallel to one of the axes $F_1$, $F_2$ or a point (Fig. 2.3.3). Consequently, the corrected plot has the form shown in Fig. 2.3.3. Proposition 3.1 reduces, in turn, the problem of finding the values of the vector of goal functionals $F(x[\theta])$ over the set $\mathcal{V}^S$ of Slater-maximal strategies to that of determining the Slater-maximal points of the set $F(X[\theta, t_0, x_0, V \div Q])$ (Fig. 2.3.4). The set $F(X[\theta, t_0, x_0, V \div Q])$ is the image of the reachability domain $X[\theta, t_0, x_0, V \div Q]$ of the system (1.2) in the criterion space. Then, by Proposition 3.1, the set
$$F(X[\theta, t_0, x_0, \mathcal{V}^S]) = \{F(x) \mid x = x[\theta, t_0, x_0, V^S],\ V^S \in \mathcal{V}^S\}$$
coincides with the set of Slater-maximal points of $F(X[\theta, t_0, x_0, V \div Q])$. In Fig. 2.3.4 such points of $F(X[\theta, t_0, x_0, V \div Q])$ are shown as a heavy line.
Fig. 2.3.3.
Fig. 2.3.4.
3.4. Example. Multicriterial Dynamic Problem without Slater-Maximal Strategy
In a multicriterial linear-quadratic problem the position $(t, x)$ changes according to the equation
$$\dot{x} = A(t)x + B(t)u, \quad x(t_0) = x_0, \tag{3.13}$$
where $x, u \in \mathbb{R}^n$ and the elements of the $(n \times n)$-dimensional matrices $A(t)$ and $B(t)$ are continuous over $[0, \theta]$; $\theta = \text{const} > 0$. The set of strategies is
$$\mathcal{V}_L = \{V \div v(t, x) \mid v(t, x) = Q(t)x\},$$
where the elements of the $(n \times n)$-dimensional matrices $Q(t)$ are continuous over $[0, \theta]$, so the strategies differ only by the matrices $Q(t)$ and are restricted to linear functions of $x$. The performance of the system (3.13) is evaluated by the vector-valued functional
$$J(V) = (J_1(V), \ldots, J_N(V)), \tag{3.14}$$
where
$$J_i(V) = x'(\theta) C_i x(\theta) + \int_{t_0}^{\theta} \{u'[t] D_i(t) u[t] + u'[t] L_i(t) x(t) + x'(t) G_i(t) x(t)\}\, dt, \quad i \in N.$$
The elements of the $(n \times n)$-dimensional matrices $D_i(t)$, $L_i(t)$, and $G_i(t)$ are continuous and those of $C_i$ are constant; the matrices $D_i(t)$, $G_i(t)$, and $C_i$ are symmetric. Every fixed strategy $V \in \mathcal{V}_L$, $V \div Q(t)x$, generates, after substituting $u = Q(t)x$ in (3.13), a unique solution $x(t) = x(t, t_0, x_0, V)$, $t_0 \le t \le \theta$, of the system (3.13), by virtue of the linearity of the differential equations
$$\dot{x} = [A(t) + B(t)Q(t)]x, \quad x(t_0) = x_0.$$
In $J_i(V)$ we have set $u[t] = v(t, x(t)) = Q(t)x(t)$. The notion of a Slater-optimal strategy in this case is transformed as follows.
Definition 3.2. The strategy $V^S \in \mathcal{V}_L$ is Slater-maximal in the problem
$$(\Sigma \div (3.13),\ \mathcal{V}_L,\ J \div (3.14)) \tag{3.15}$$
with initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ if, for every strategy $V \in \mathcal{V}_L$, the following system of inequalities is nonsimultaneous:
$$J_i(V) > J_i(V^S), \quad i \in N, \tag{3.16}$$
where $V \div Q(t)x$, $V^S \div Q^S(t)x$; on the left-hand side of (3.16), $x(t) = x(t, t_0, x_0, V)$, $u[t] = Q(t)x(t)$, and on the right-hand side, $x(t) = x(t, t_0, x_0, V^S)$, $u[t] = Q^S(t)x(t, t_0, x_0, V^S)$.
Proposition 3.2. If $D_i(t) > 0$, $i \in N$, then in the multicriterial problem (3.15) there does not exist, for any initial position $(t_0, x_0)$ from the set $[0, \theta) \times \mathbb{R}^n$ with $\|x_0\| \ne 0$, any Slater-maximal strategy.
Proof. By Definition 3.2, in problem (3.15) with $(t_0, x_0)$ from the set $[0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$, no Slater-maximal strategy exists if, for any strategy $V \in \mathcal{V}_L$, a strategy $V^* \in \mathcal{V}_L$ of its own exists such that the following system of inequalities is simultaneous:
$$J_i(V^*) > J_i(V), \quad i \in N.$$
Let some strategy $V \in \mathcal{V}_L$ be fixed together with the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$. By Proposition 4.3 (Chapter 1), for every $j \in N$ there exists a constant $\beta_j(V) > 0$ such that for the strategy $V^{(j)} \div \beta x$, $\beta = \text{const} \ge \beta_j(V)$, the strict inequality $J_j(V^{(j)}) > J_j(V)$ is valid. Let $\beta^* = \max_{j \in N} \beta_j(V)$. Consequently, for the strategy $V^* \div \beta^* x$ the inequalities $J_i(V^*) > J_i(V)$, $i \in N$, are valid, which shows that there is no Slater-optimal strategy in problem (3.15).
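The mechanism behind this proof can be seen in a scalar case. The sketch below is only an illustration under assumptions that are not taken from the book: the system is $\dot{x} = ax + bu$ with the linear strategy $u = \beta x$, the cross term of the functional is set to zero, and the numerical coefficients of the two criteria are hypothetical. With a positive weight on $u^2$, each criterion grows without bound as the gain $\beta$ increases, so any fixed strategy is eventually beaten in every criterion at once.

```python
import math


def criterion(beta, c, d, g, a=0.0, b=1.0, x0=1.0, t0=0.0, theta=1.0):
    """J = c*x(theta)^2 + integral of (d*u^2 + g*x^2) dt
    for x' = a*x + b*u with the linear strategy u = beta*x."""
    k = a + b * beta                      # closed-loop growth rate
    T = theta - t0
    x_end_sq = x0 ** 2 * math.exp(2 * k * T)
    # Integral of x(t)^2 over [t0, theta] in closed form.
    if k == 0:
        int_x_sq = x0 ** 2 * T
    else:
        int_x_sq = x0 ** 2 * math.expm1(2 * k * T) / (2 * k)
    return c * x_end_sq + (d * beta ** 2 + g) * int_x_sq


# Two hypothetical criteria (c_i, d_i, g_i), both with d_i > 0.
criteria = [(1.0, 1.0, 0.0), (-1.0, 0.5, 1.0)]


def J(beta):
    """Vector of criterion values for the strategy u = beta*x."""
    return [criterion(beta, c, d, g) for c, d, g in criteria]


values = {beta: J(beta) for beta in (0.0, 5.0, 10.0)}
for beta, vals in values.items():
    print(beta, vals)
```

In this sketch both criteria increase from $\beta = 0$ to $\beta = 5$ and again to $\beta = 10$, even for the criterion whose terminal weight is negative.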
Chapter 3
Pareto Optimality
This chapter introduces the notion of a solution of problem (1.1) in Chapter 2 that results from the Pareto optimality concept when the goal functionals are assumed to be multivalent. The structure of such a solution is described and the connection to the Slater-maximal solution considered in the preceding chapter is established.
1. Pareto-Optimal Strategy
1.1. Definition and Geometric Interpretation
Consider the multicriterial problem (1.1) of Chapter 2 where, as before,
$$\Sigma \div \dot{x} = f(t, x, u), \quad \mathcal{V} = \{V \div v(t, x) \mid v(t, x) \in Q \in \operatorname{comp} \mathbb{R}^q\}, \quad J(V) = (F_1(x[\theta]), \ldots, F_N(x[\theta])).$$
Assume that Conditions 1.1 and 1.2 (Chapter 2) are satisfied, that is, the vector-function $f(t, x, u)$ is continuous and locally Lipschitz with respect to $x$, $\|f(t, x, u)\| \le \gamma(1 + \|x\|)$, $Q$ is compact, and the scalar functions $F_i(x)$ are continuous. Moreover,
$$x \in \mathbb{R}^n, \quad u \in \mathbb{R}^q, \quad t_0 \in [0, \theta), \quad \theta = \text{const} > 0,$$
and the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ is fixed.
Definition 1.1. The strategy $V^P \in \mathcal{V}$ will be referred to as Pareto-maximal in the multicriterial problem (1.1) in Chapter 2 with initial position $(t_0, x_0)$ if, for every strategy $V \in \mathcal{V}$ and all quasimotions $x[\cdot, t_0, x_0, V]$, $x[\cdot, t_0, x_0, V^P]$, the following system of inequalities is nonsimultaneous:
$$F_i(x[\theta, t_0, x_0, V]) \ge F_i(x[\theta, t_0, x_0, V^P]), \quad i \in N, \tag{1.1}$$
at least one of which is a strict inequality for every specific pair $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^P])$. In other words, the strategy $V^P \in \mathcal{V}$ is referred to as Pareto-maximal in problem (1.1) of Chapter 2 with initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ if, for any other strategy $V \in \mathcal{V}$ and any pair of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^P])$, either
$$F_i(x[\theta, t_0, x_0, V]) = F_i(x[\theta, t_0, x_0, V^P]), \quad i \in N,$$
or a subscript $i_0(V) = i_0 \in N$ may be found such that
$$F_{i_0}(x[\theta, t_0, x_0, V]) < F_{i_0}(x[\theta, t_0, x_0, V^P]).$$
The set of Pareto-maximal strategies will be denoted $\mathcal{V}^P$. In the case $N = 2$ we have the following geometric interpretation of a Pareto-maximal strategy (cf. Fig. 3.1.1).
Figure 3.1.1
Let $V^P$ be a Pareto-maximal strategy for problem (1.1) (Chapter 2) with $N = 2$ and some initial position $(t_0, x_0)$, and let $V$ be any other strategy. Construct the set $G$ as in Fig. 2.1.1. Then, by Definition 1.1, the points of the set $F(X[\theta, t_0, x_0, V])$ will be outside the shaded set $G$. Only the points shown in Fig. 3.1.1 by the double line may be common to the two sets. Unlike the geometric interpretation of a Slater-maximal strategy $V^S$, where the points $F(X[\theta, t_0, x_0, V])$ may reach the rays
$$\Bigl\{F_1 = \min_{x[\cdot]} F_1(x[\theta, t_0, x_0, V^S]),\ F_2 > \min_{x[\cdot]} F_2(x[\theta, t_0, x_0, V^S])\Bigr\},$$
$$\Bigl\{F_1 > \min_{x[\cdot]} F_1(x[\theta, t_0, x_0, V^S]),\ F_2 = \min_{x[\cdot]} F_2(x[\theta, t_0, x_0, V^S])\Bigr\},$$
for a Pareto-optimal strategy this is impossible. The points $F(X[\theta, t_0, x_0, V])$ remain outside the shaded set $G$ or coincide with the Pareto-minimal boundary of the set $F(X[\theta, t_0, x_0, V^P])$. This relationship holds for any two strategies one of which is Pareto-optimal. Note that the Pareto-optimal (or efficient) strategy is an extension of the definition of a Pareto-optimal solution of a multicriterial static problem [64]. The notion of a Pareto-minimal strategy $V_P$ may be introduced in a similar way.
Definition 1.2. The strategy $V_P \in \mathcal{V}$ will be said to be Pareto-minimal in problem (1.1) (Chapter 2) with initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ if, for every strategy $V \in \mathcal{V}$ and all quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V_P]$, the following inequalities are nonsimultaneous:
$$F_i(x[\theta, t_0, x_0, V]) \le F_i(x[\theta, t_0, x_0, V_P]), \quad i \in N,$$
at least one of which (for every specific pair $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V_P])$) is a strict inequality. The Pareto-maximal strategies of (1.1) of Chapter 2 are Pareto-minimal for the problem $(\Sigma, \mathcal{V}, -J)$; the converse is also true.
1.2. Properties of the Pareto-Maximal Strategy
The properties of Pareto-maximal strategies are similar to those of the Slater-maximal strategies (Chapter 2, Propositions 1.1, 1.3, 1.4, and 1.5).
Property 1.1. With $N = 1$ in problem (1.1) (Chapter 2) the Pareto-maximal strategy is identical to the maximal (optimal) strategy in the optimal control problem (1.6) (Chapter 2), and equality (1.7) (Chapter 2) is valid.
The validity of Property 1.1 follows immediately from the equivalent definition of a Pareto-maximal strategy.
Property 1.2. If every strategy $V \in \mathcal{V}$ is associated with a unique value of every functional $F_i(x[\theta, t_0, x_0, V])$, $i \in N$, the Pareto-maximal strategy $V^P$ is Pareto-optimal in problem (1.1, Chapter 2) with initial position $(t_0, x_0)$ in the usual sense, since for every $V \in \mathcal{V}$ the following system of inequalities is nonsimultaneous:
$$F_i(x[\theta, t_0, x_0, V]) \ge F_i(x[\theta, t_0, x_0, V^P]), \quad i \in N,$$
at least one of which is a strict inequality.
Consequently, as in the case of Slater optimality, the notion of a Pareto-maximal strategy is fairly general, and its special cases include the notions of a solution of an optimal control problem and of a Pareto optimum. The next property immediately follows from Definition 1.1 of Chapter 2 and Definition 1.1 of this chapter.
Property 1.3. Every Pareto-maximal strategy is Slater-maximal, or $\mathcal{V}^P \subseteq \mathcal{V}^S$. The converse is, generally speaking, invalid. For instance, in Example 1.1 (Chapter 2) only the strategy $V^P \div (1, 1)$ is Pareto-maximal. It corresponds to the point $B = F(x[\theta, t_0, x_0, V^P]) = (F_1(x[\theta, t_0, x_0, V^P]), F_2(x[\theta, t_0, x_0, V^P]))$ in Fig. 2.1, while the set of Slater-maximal strategies is larger, i.e.,
$$\mathcal{V}^S = \{V^S_{\alpha\beta} \div v^S(t, x) \mid v^S(t, x) = (1, [\alpha, \beta]),\ 0 \le \alpha \le \beta \le 1\} \cup \{V^S_{ab} \div v^S(t, x) \mid v^S(t, x) = ([a, b], 1),\ 0 \le a \le b \le 1\}.$$
In particular, for $a = b = 1$, we have $V^S_{11} = V^P$; consequently, $\mathcal{V}^P \subset \mathcal{V}^S$.
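Definition 1.1 and Property 1.3 have a direct finite analogue when each strategy is described by the list of criterion vectors reached at the ends of its quasimotions. The following sketch is an illustration under assumptions: the four strategies and their outcome lists are hypothetical, and a strategy is also compared with its own end points, which is one reading of "for every strategy $V \in \mathcal{V}$" in Definition 1.1.

```python
def pareto_dominates(a, b):
    """a_i >= b_i for all i, with at least one strict inequality."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def slater_dominates(a, b):
    """a_i > b_i for all i."""
    return all(x > y for x, y in zip(a, b))


def maximal_strategies(outcomes, dominates):
    """Strategies none of whose end points is dominated by an end point
    of any strategy in the collection."""
    return [name for name, own in outcomes.items()
            if not any(dominates(a, b)
                       for pts in outcomes.values() for a in pts for b in own)]


# Hypothetical criterion values at the ends of each strategy's quasimotions.
outcomes = {
    "P": [(1.0, 1.0)],
    "S1": [(1.0, 0.2), (1.0, 0.6)],
    "S2": [(0.3, 1.0), (0.8, 1.0)],
    "D": [(0.5, 0.5)],
}
pareto = maximal_strategies(outcomes, pareto_dominates)
slater = maximal_strategies(outcomes, slater_dominates)
print(pareto, slater)
```

Here only `P` is Pareto-maximal, while `P`, `S1`, and `S2` are all Slater-maximal, so the Pareto-maximal set is a proper subset of the Slater-maximal one.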
1.3. Stability
The following property results immediately from Definition 1.1 since $\mathcal{V}^P \subset \mathcal{V}$.
Property 1.4. The set $\mathcal{V}^P$ of Pareto-maximal strategies is internally stable, or, for every $V^{(j)} \in \mathcal{V}^P$ and every quasimotion $x[\cdot, t_0, x_0, V^{(j)}]$ ($j = 1, 2$) the following system of inequalities is nonsimultaneous:
$$F_i(x[\theta, t_0, x_0, V^{(1)}]) \ge F_i(x[\theta, t_0, x_0, V^{(2)}]), \quad i \in N,$$
at least one of which is a strict inequality.
Property 1.4 states that if, in formalizing the optimal solutions of (1.1) in Chapter 2, the discussion is confined to the notion of Pareto optimality, there is no Pareto-maximal strategy that is more advantageous than another Pareto-maximal strategy.
Property 1.5. The set of Pareto-maximal strategies of problem (1.1, Chapter 2) with initial position $(t_0, x_0)$ is externally stable; in other words, for every strategy $V \in \mathcal{V}$ and every quasimotion $x^*[\cdot, t_0, x_0, V]$ generated by $V$ there exists a Pareto-maximal strategy $V^P$ of its own such that
$$F_i(x^*[\theta, t_0, x_0, V]) \le F_i(x[\theta, t_0, x_0, V^P]), \quad i \in N, \tag{1.2}$$
for all quasimotions $x[\cdot, t_0, x_0, V^P] \in \mathcal{X}[t_0, x_0, V^P]$.
Note that distinct $V \in \mathcal{V}$ and distinct quasimotions $x[\cdot, t_0, x_0, V]$ are, generally speaking, associated with distinct $V^P$ satisfying inequalities (1.2). The proof of Property 1.5 follows that of Proposition 1.4 (Chapter 2) with the following differences:
(1) In the linear chain (1.12, Chapter 2) only positive numbers $\beta_1, \ldots, \beta_N$ should be used; in this case [70, p. 71] the point $x^{\beta}$ from (1.12) of Chapter 2 is a Pareto-optimal solution of problem (1.13) of the same chapter, and for any $x \in X^*[t_0, x_0]$ the system of inequalities $F_i(x) \ge F_i(x^{\beta})$, at least one of which is a strict inequality, is nonsimultaneous;
(2) In constructing (1.15) of Chapter 2, Proposition 3.1 of this chapter rather than Proposition 3.1 of Chapter 2 must be used.
Property 1.6. The Pareto-maximal strategy $V^P$ of the multicriterial problem (1.1) of Chapter 2 with initial position $(t_0, x_0)$ is dynamically stable in that $V^P$ remains Pareto-maximal for (1.1) of Chapter 2 with “current” initial position $(t, x[t, t_0, x_0, V^P])$ at any $t \in [t_0, \theta]$ and for any quasimotion $x[\cdot, t_0, x_0, V^P]$.
The proof is similar to that of Property 1.6 of Chapter 2, where the inequalities (1.16) become nonstrict (with at least one of them remaining strict) and appropriate changes are made in the subsequent inequalities.
Properties 1.1, 1.4, 1.5, and 1.6 are similar to Properties 1.1, 1.4, 1.5, and 1.6 of Chapter 2, but there are now no analogs of Properties 1.2 or 1.3 of that chapter. Specifically, not every maximal strategy $V^{(i)}$ of (1.9) of Chapter 2 is Pareto-maximal, and the addition of new goal functionals may “spoil” the Pareto maximality of $V^P$. This is illustrated in the following example.
Example 1.1. Let the system $\Sigma$ be described by the equations
$$\dot{x}_i = v_i, \quad x_i[0] = 0, \quad i = 1, 2.$$
The goal functionals are $F_i(x[\theta]) = x_i[1]$, $i = 1, 2$; here $\theta = 1$, $x = (x_1, x_2)$, $v = (v_1, v_2)$, and $x_0 = (x_1[0], x_2[0]) = 0_2$. Let the set $Q$ be defined by the conditions
$$Q = \{(v_1, v_2) \mid v_1 = r \text{ with } 0 \le v_2 \le r, \text{ or } v_2 = r \text{ with } 0 \le v_1 \le r\}$$
with $r = \text{const} > 0$. Then the set $F(X[\theta, t_0, x_0, V \div Q])$ has the form of Fig. 3.1.2.
Figure 3.1.2.
It is easy to verify that every strategy $V = (V_1, V_2) \div (v_1 = \text{const}, v_2 = \text{const}) \in Q$ is Slater-maximal, i.e.,
$$(V_1, V_2) \div (v_1 = r,\ v_2 = \text{const} \in [0, r]) \in \mathcal{V}^S,$$
$$(V_1, V_2) \div (v_1 = \text{const} \in [0, r],\ v_2 = r) \in \mathcal{V}^S,$$
and the strategies
$$V^{(1)} = (V_1^{(1)}, V_2^{(1)}) \div (v_1 = r,\ v_2 = \text{const} \in [0, r])$$
are maximal strategies for the functional $F_1(x[\theta])$, i.e.,
$$\max_{V \in \mathcal{V}} \max_{x[\cdot]} F_1(x[\theta, t_0, x_0, V]) = F_1(x[\theta, t_0, x_0, V^{(1)}]). \tag{1.3}$$
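The statements about the constant strategies of this example can be checked on a sampled version of the set $Q$. The sketch below assumes that a constant strategy $v \in Q$ yields the single end point $x[1] = v$, which is what the equations $\dot{x}_i = v_i$, $x_i[0] = 0$, $\theta = 1$ give; the value of `r` and the grid size are hypothetical, and only constant strategies are compared with each other.

```python
def pareto_dominates(a, b):
    """a_i >= b_i for all i, with at least one strict inequality."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def slater_dominates(a, b):
    """a_i > b_i for all i."""
    return all(x > y for x, y in zip(a, b))


r, n = 2.0, 21                               # hypothetical r and grid size
grid = [r * k / (n - 1) for k in range(n)]   # constants in [0, r]
# End points x[1] = v for constant v in Q: the two sides of the corner.
end_points = sorted({(r, c) for c in grid} | {(c, r) for c in grid})

slater = [p for p in end_points
          if not any(slater_dominates(q, p) for q in end_points)]
pareto = [p for p in end_points
          if not any(pareto_dominates(q, p) for q in end_points)]
print(len(end_points), len(slater), pareto)
```

Every sampled constant strategy turns out to be Slater-maximal, while the corner point $(r, r)$ is the only Pareto-maximal one.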
However, the strategy $V^{(1)} \div (r, \tfrac{1}{2} r)$, which, by (1.3), is maximal, is not Pareto-maximal, since the strategy $\bar{V}^{(1)} \div (r, r)$ satisfies the conditions
$$F_1(x[\theta, t_0, x_0, \bar{V}^{(1)}]) = r = F_1(x[\theta, t_0, x_0, V^{(1)}]),$$
$$F_2(x[\theta, t_0, x_0, \bar{V}^{(1)}]) = r > F_2(x[\theta, t_0, x_0, V^{(1)}]) = \tfrac{1}{2} r.$$
Consequently, there is no analog of Property 1.3 (Chapter 2) in the case of Pareto maximality. Moreover, it is obvious that, although in the case of a single goal functional $F_1(x[\theta])$ the strategy $V^{(1)}$ is Pareto-maximal (Property 1.1), with two goal functionals $F_i(x[\theta])$, $i = 1, 2$, the strategy $V^{(1)}$ is not Pareto-maximal. Consequently, there is no analog of Chapter 2's Property 1.2 in the case of Pareto optimality either. What is important is that in this example $\bar{V}^{(1)}$ is Pareto-maximal and satisfies the condition
$$\max_{V \in \mathcal{V}} \max_{x[\cdot]} F_1(x[\theta, t_0, x_0, V]) = F_1(x[\theta, t_0, x_0, \bar{V}^{(1)}]) = r.$$
Property 1.7 below will show that this condition is also true in the general case. Before formulating this property some additional propositions will be needed.
Lemma 1.1.
$$\max_{V \in \mathcal{V}} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]) = \sup_{V \in \mathcal{V}^P} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]). \tag{1.4}$$
Here $\mathcal{V}^P$ is the set of Pareto-maximal strategies of problem (1.1) of Chapter 2 with initial position $(t_0, x_0)$.
Proof. Because the set of Pareto-maximal strategies $\mathcal{V}^P \subset \mathcal{V}$ ($\mathcal{V}^P \ne \emptyset$ by Corollary 3.1),
$$\sup_{V \in \mathcal{V}^P} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]) \le \max_{V \in \mathcal{V}} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]). \tag{1.5}$$
Let us now prove the converse inequality. The set $F(X[\theta, t_0, x_0, V \div Q])$ is compact in $\mathbb{R}^N$. Then a sequence of points
$$F^{(k)} = (F_1^{(k)}, \ldots, F_j^{(k)}, \ldots, F_N^{(k)}) \in F(X[\theta, t_0, x_0, V \div Q])$$
exists such that
$$\lim_{k \to \infty} F_j^{(k)} = \max_{V \in \mathcal{V}} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]) = F_j^*.$$
By Corollary 4.2 (Chapter 1), for every point $F^{(k)}$ there exists a strategy $V^{(k)}$ of its own such that
$$F^{(k)} = F(X[\theta, t_0, x_0, V^{(k)}]), \quad k = 0, 1, \ldots,$$
for any quasimotion $x[\cdot, t_0, x_0, V^{(k)}]$ of the system (1.2) of Chapter 2 generated by the strategy $V^{(k)}$ from the position $(t_0, x_0)$. Consequently,
$$F_j^{(k)} = F_j(x[\theta, t_0, x_0, V^{(k)}]) = \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V^{(k)}]).$$
By virtue of external stability, for every strategy $V^{(k)}$ a Pareto-maximal strategy $V^P_{(k)}$ of its own can be found such that
$$F_j(x[\theta, t_0, x_0, V^P_{(k)}]) = \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V^P_{(k)}]) \ge \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V^{(k)}]) = F_j^{(k)}$$
for all quasimotions $x[\cdot, t_0, x_0, V^P_{(k)}]$. Therefore
$$\lim_{k \to \infty} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V^P_{(k)}]) \ge \lim_{k \to \infty} F_j^{(k)} = F_j^*.$$
Then
$$\sup_{V \in \mathcal{V}^P} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]) \ge F_j^* = \max_{V \in \mathcal{V}} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]).$$
Combining the latter relation with (1.5) leads to (1.4).
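A finite analogue of (1.4) is easy to test: for a finite set of criterion vectors, the largest value of each criterion over the whole set is already attained on the Pareto-maximal points. The sketch below uses randomly generated hypothetical vectors with three criteria; it illustrates the statement and is no substitute for the proof above.

```python
import random


def pareto_dominates(a, b):
    """a_i >= b_i for all i, with at least one strict inequality."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_maximal(points):
    """Points that no point of the set Pareto-dominates."""
    return [p for p in points
            if not any(pareto_dominates(q, p) for q in points)]


rng = random.Random(1)
points = [(rng.randint(0, 9), rng.randint(0, 9), rng.randint(0, 9))
          for _ in range(200)]
efficient = pareto_maximal(points)

# Maximum of each criterion over all points and over the efficient points only.
max_all = [max(p[j] for p in points) for j in range(3)]
max_eff = [max(p[j] for p in efficient) for j in range(3)]
print(max_all == max_eff)
```

The two lists of maxima coincide, which is the finite counterpart of replacing $\mathcal{V}$ by $\mathcal{V}^P$ on the right-hand side of (1.4).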
Lemma 1.2. A Pareto-maximal strategy $V^P$ exists such that
$$\max_{V \in \mathcal{V}} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]) = \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V^P]). \tag{1.6}$$
Equality (1.6) will be proved by contradiction. Assume that the maximal strategies $V^{(j)} \in \mathcal{V}$,
$$\max_{V \in \mathcal{V}} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]) = \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V^{(j)}]) = F_j(x[\theta, t_0, x_0, V^{(j)}]), \tag{1.7}$$
do not include Pareto-maximal strategies. Then, because of external stability, for every strategy $V^{(j)}$ there exists a Pareto-maximal strategy $V^P_j$ of its own such that
$$\max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V^{(j)}]) \le \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V^P_j]). \tag{1.8}$$
But, by (1.6) and (1.7) from Chapter 2,
$$\max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V^{(j)}]) = \max_{V \in \mathcal{V}} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]),$$
and so (1.8) can only be an equality. Consequently, (1.8) and (1.7) imply (1.6). Combining (1.6) and (1.9) from Chapter 2 with (1.4) and (1.6) gives us the next proposition.
Property 1.7. There exist a Pareto-maximal strategy $V^P$ and a Slater-maximal strategy $V^S$ such that
$$\max_{V \in \mathcal{V}} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]) = \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V^P]) = \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V^S]),$$
and
$$\max_{V \in \mathcal{V}} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]) = \max_{V \in \mathcal{V}^P} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]) = \max_{V \in \mathcal{V}^S} \max_{x[\cdot]} F_j(x[\theta, t_0, x_0, V]) = \max_{x \in X[\theta, t_0, x_0, V \div Q]} F_j(x).$$
Consequently, the values of every goal function $F_j(x)$ that are maximal over the reachability domain of the system (1.2, Chapter 2) are reached by Pareto-maximal and Slater-maximal strategies. Note that analogs of the above properties for static multicriterial problems have been described in [70]. Finally, let us formulate the properties of inheritance and rejection in the case of a Pareto maximum; the following assertions are proved similarly to Properties 1.7 and 1.8 from Chapter 2.
Property 1.8 (Inheritance). Let the strategy $V^P$ be Pareto-maximal in problem (1.1) in Chapter 2 with initial position $(t_0, x_0)$. Then for every subset of strategies $\mathcal{T} \subset \mathcal{V}$ such that $V^P \in \mathcal{T}$, the strategy $V^P$ will be Pareto-maximal in the problem
$$(\Sigma,\ \mathcal{T},\ F(x[\theta]))$$
for the same initial position.
Property 1.9 (Rejection). If the strategy $V \in \mathcal{V}$ is not Pareto-maximal in problem (1.1) in Chapter 2 with initial position $(t_0, x_0)$, then, in the multicriterial problem
$$(\Sigma,\ \{\mathcal{V} \setminus V\},\ F(x[\theta]))$$
with initial position $(t_0, x_0)$ we have the same set $\mathcal{V}^P$ of Pareto-maximal strategies as in problem (1.1) of Chapter 2, i.e.,
$$\forall\, \mathcal{T} \subseteq \mathcal{V}:\ \mathcal{V}^P \subset \mathcal{T} \Rightarrow \mathcal{T}^P \subset \mathcal{V}^P.$$
2. Relations between the Sets $\mathcal{V}^P$ and $\mathcal{V}^S$
2.1. Preliminary Remarks
Let some initial position $(t_0, x_0)$ be fixed. By virtue of Property 1.1, with $N = 1$ (for the optimal control problem $(\Sigma, \mathcal{V}, F_1(x[\theta]))$) the Pareto-maximal strategy is identical to the optimal strategy. At the same time, the Slater-maximal strategy is also optimal with $N = 1$. For this reason, for $N = 1$ we have $\mathcal{V}^P = \mathcal{V}^S$. By Definition 1.1 (Chapter 2) and Definition 1.1 of this chapter (Property 1.3),
$$\mathcal{V}^P \subset \mathcal{V}^S, \tag{2.1}$$
or, with fixed initial position $(t_0, x_0)$, the set of Pareto-maximal strategies in (1.1, Chapter 2) is a subset of the set $\mathcal{V}^S$ of Slater-maximal strategies. But for $N = 2$ these sets do not necessarily coincide. In particular, the Slater-maximal strategy $V^{(1)}$ in Example 1.1 is not Pareto-maximal.
2.2. Quasiconcave Multicriterial Problems
There is a special class of multicriterial problems in which not only is every Pareto-maximal strategy Slater-maximal, but the converse is also true, i.e., $\mathcal{V}^P = \mathcal{V}^S$. These are known as quasiconcave multicriterial problems. Before proceeding, we will need certain definitions.
Definition 2.1. The scalar function $F(x)$ of a vector-valued argument is referred to as strictly quasiconcave over the convex set $X$ if for any $x^{(1)} \ne x^{(2)}$ from $X$ and any $\beta \in (0, 1)$,
$$F(\beta x^{(1)} + (1 - \beta)x^{(2)}) > \min\{F(x^{(1)}), F(x^{(2)})\}. \tag{2.2}$$
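Inequality (2.2) can be probed numerically by sampling pairs of points and weights. The following sketch is illustrative only: the two test functions, the sample points, and the helper name `find_violation` are hypothetical. It searches for a pair and a weight at which (2.2) fails, so a result of `None` means only that no counterexample was found among the samples.

```python
import random


def find_violation(F, pairs, betas):
    """Return (x1, x2, beta) violating inequality (2.2), or None if none is found."""
    for x1, x2 in pairs:
        if x1 == x2:
            continue
        for beta in betas:
            mid = tuple(beta * a + (1 - beta) * b for a, b in zip(x1, x2))
            if not F(mid) > min(F(x1), F(x2)):
                return x1, x2, beta
    return None


rng = random.Random(0)
pairs = [((rng.uniform(-1, 1), rng.uniform(-1, 1)),
          (rng.uniform(-1, 1), rng.uniform(-1, 1))) for _ in range(500)]
pairs.append(((1.0, 0.0), (-1.0, 0.0)))   # a pair symmetric about the origin
betas = [0.1, 0.25, 0.5, 0.75, 0.9]


def concave(x):
    """Strictly concave, hence strictly quasiconcave."""
    return -(x[0] ** 2 + x[1] ** 2)


def convex(x):
    """Strictly convex; inequality (2.2) fails for the symmetric pair."""
    return x[0] ** 2 + x[1] ** 2


concave_violation = find_violation(concave, pairs, betas)
convex_violation = find_violation(convex, pairs, betas)
print(concave_violation, convex_violation)
```

The strictly concave function passes every sampled test, whereas for the convex function a violating triple is returned.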
By this definition, strict quasiconcavity is considered only for functions defined over convex sets. In particular, a strictly concave function $F(x)$, i.e., one satisfying
$$F(\beta x^{(1)} + (1 - \beta)x^{(2)}) > \beta F(x^{(1)}) + (1 - \beta)F(x^{(2)})$$
with $\beta \in (0, 1)$, $x^{(1)} \ne x^{(2)}$, $x^{(j)} \in X$, $j = 1, 2$, is strictly quasiconcave. The features of quasiconcave functions in multicriterial static problems seem to have been first studied by Podinovskiy [70, p. 99]. As an example of a control system whose reachability domain $X[\theta, t_0, x_0, V \div Q]$ is convex, closed, and bounded, consider the system
$$\dot{x} = Ax + Bv, \quad x[t_0] = x_0, \tag{2.3}$$
where $x \in \mathbb{R}^n$, $v \in Q \subset \mathbb{R}^q$, and $t_0 \le t \le \theta$; the elements of the matrices $A$ and $B$ are constant; the vector $x_0 \in \mathbb{R}^n$ is fixed; and $\theta = \text{const} > 0$. If $Q$ is a convex compactum, the reachability domain at time $\theta$ is convex, closed, and bounded in $\mathbb{R}^n$.
Definition 2.2. The multicriterial problem is referred to as quasiconcave for a specified initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ if the set $X[\theta, t_0, x_0, V \div Q]$ is convex and if every function $F_i(x)$, $i \in N$, is strictly quasiconcave over $X[\theta, t_0, x_0, V \div Q]$.
Proposition 2.1. If Conditions 1.1 and 1.2 (Chapter 2) hold and if problem (1.1) of Chapter 2 is quasiconcave for the specified initial position $(t_0, x_0)$, then for this initial position every Slater-maximal strategy of problem (1.1) of Chapter 2 is Pareto-maximal, the converse also being true, or
$$\mathcal{V}^S = \mathcal{V}^P. \tag{2.4}$$
Proof. The inclusion
$$\mathcal{V}^P \subseteq \mathcal{V}^S \tag{2.5}$$
results from Property 1.3. Let us show that the converse is true, or
$$\mathcal{V}^S \subseteq \mathcal{V}^P. \tag{2.6}$$
Then (2.4) follows from (2.5) and (2.6). The inclusion (2.6) is proved by contradiction. Assume that there exists a Slater-maximal strategy $V^S$ that is not Pareto-maximal in the multicriterial problem with initial position $(t_0, x_0)$. Because $V^S$ is not Pareto-maximal, there exist a strategy $V^* \in \mathcal{V}$ and a pair of quasimotions $(x^*[\cdot, t_0, x_0, V^*], x^*[\cdot, t_0, x_0, V^S])$ such that the following system of inequalities is simultaneous:
$$F_i(x^*[\theta, t_0, x_0, V^*]) \ge F_i(x^*[\theta, t_0, x_0, V^S]), \quad i \in N, \tag{2.7}$$
at least one of which is strict. The set $X[\theta, t_0, x_0, V \div Q]$ is convex (by the definition of the quasiconcavity of problem (1.1) in Chapter 2) and so contains, in addition to the points $x^*[\theta, t_0, x_0, V^*]$ and $x^*[\theta, t_0, x_0, V^S]$, the point
$$\hat{x} = \tfrac{1}{2}\bigl(x^*[\theta, t_0, x_0, V^*] + x^*[\theta, t_0, x_0, V^S]\bigr). \tag{2.8}$$
The two end points are distinct, because at least one inequality in (2.7) is strict. By virtue of the strict quasiconcavity of the functions $F_i(x)$, we have for the above pair of quasimotions $(x^*[\cdot, t_0, x_0, V^*], x^*[\cdot, t_0, x_0, V^S])$ that
$$F_i(\hat{x}) = F_i\bigl(\tfrac{1}{2} x^*[\theta, t_0, x_0, V^*] + \tfrac{1}{2} x^*[\theta, t_0, x_0, V^S]\bigr) > \min\{F_i(x^*[\theta, t_0, x_0, V^*]),\ F_i(x^*[\theta, t_0, x_0, V^S])\} = F_i(x^*[\theta, t_0, x_0, V^S]), \quad i \in N. \tag{2.9}$$
The latter equality utilizes (2.7). Then, by Corollary 4.2 (Chapter 1), there exists a strategy $\hat{V} \in \mathcal{V}$ such that, for every quasimotion $x[\cdot, t_0, x_0, \hat{V}]$, it is true that $F_i(\hat{x}) = F_i(x[\theta, t_0, x_0, \hat{V}])$, $i \in N$. Therefore, by (2.9),
$$F_i(x[\theta, t_0, x_0, \hat{V}]) > F_i(x^*[\theta, t_0, x_0, V^S]), \quad i \in N,$$
for every quasimotion $x[\cdot, t_0, x_0, \hat{V}]$, which contradicts the Slater maximality of the strategy $V^S$ in the multicriterial problem (1.1, Chapter 2) with initial position $(t_0, x_0)$. This proves Proposition 2.1.
2.3. Remark
Proposition 2.1 distinguishes a class of multicriterial problems for which every Slater-maximal strategy is Pareto-maximal; the converse is also true. This class includes mathematical models whose dynamics are described by the linear system (2.3), where the set $Q$ is convex and compact and all goal functionals $F_i(x)$, $i \in N$, are strictly concave with respect to $x$. The mathematical tools of the preceding section are applicable to such problems. Thus, for a quasiconcave multicriterial problem Theorem 2.1 from Chapter 2 may be applied in the following way.
Proposition 2.2. If Conditions 1.1 and 1.2 of Chapter 2 hold, problem (1.1) of Chapter 2 is quasiconcave for the specified initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, and, for some set of positive numbers $\beta_1, \ldots, \beta_N$,
$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} \min_{i \in N} \beta_i F_i(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} \min_{i \in N} \beta_i F_i(x[\theta, t_0, x_0, V^P]),$$
then the strategy $V^P$ is Pareto-maximal in problem (1.1) of Chapter 2 with initial position $(t_0, x_0)$.
Proposition 2.2 follows immediately from Theorem 2.1 (Chapter 2) and Proposition 2.1. To search for a Pareto-maximal strategy in practice in the quasiconcave case, the modifications of the dynamic programming method for optimal control mentioned in Remark 2.1 of Chapter 2 can be applied with the criterion $\min_{i \in N} \beta_i F_i(x[\theta])$ and the constraints (1.2, Chapter 2), assuming $V \in \mathcal{V}$.
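The role of the maximin criterion $\min_{i \in N} \beta_i F_i$ can be seen on a finite set of criterion vectors. The sketch below is an illustration with hypothetical data and function names: every maximizer of $\min_i \beta_i F_i$ with positive weights is Slater-maximal, but without a quasiconcavity assumption a maximizer need not be Pareto-maximal.

```python
def slater_dominates(a, b):
    """a_i > b_i for all i."""
    return all(x > y for x, y in zip(a, b))


def pareto_dominates(a, b):
    """a_i >= b_i for all i, with at least one strict inequality."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def maximin_points(points, betas):
    """All points maximizing min_i beta_i * F_i over a finite set."""
    def value(p):
        return min(b * f for b, f in zip(betas, p))
    best = max(value(p) for p in points)
    return [p for p in points if value(p) == best]


# Hypothetical criterion vectors (F_1, F_2) and positive weights.
points = [(1, 4), (2, 3), (3, 1), (2, 2)]
betas = (1.0, 1.0)
chosen = maximin_points(points, betas)

all_slater = all(not any(slater_dominates(q, p) for q in points) for p in chosen)
not_pareto = [p for p in chosen if any(pareto_dominates(q, p) for q in points)]
print(chosen, all_slater, not_pareto)
```

This finite set is not the image of a convex reachability domain under strictly quasiconcave functions, so the hypothesis of Proposition 2.2 is not in force here; that is why the maximizer (2, 2) is Slater-maximal but is dominated in the Pareto sense by (2, 3).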
3. Structure in the Case of Pareto Optimality
3.1. Description
Let us now proceed to determine the structure of the solutions $V^P$ of the multicriterial problem (1.1) in Chapter 2. It is analogous to the structure of these solutions in the case of Slater optimality. Specifically, let $\mathcal{V}^P$ be the entire set of Pareto-optimal strategies $V^P$ of problem (1.1) in Chapter 2 with given initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$. Let
$$\mathcal{X}^P = \{x \mid x = x[\theta, t_0, x_0, V^P],\ V^P \in \mathcal{V}^P\}.$$
Put differently, $\mathcal{X}^P$ is the set of all points $x$ of the reachability domain $X[\theta, t_0, x_0, V \div Q]$ that are reached at time $t = \theta$ by the quasimotions $x[\cdot, t_0, x_0, V^P]$ generated from the position $(t_0, x_0)$ by the strategies $V^P$ as $V^P$ “scans” the entire set $\mathcal{V}^P$ of Pareto-optimal strategies. Simultaneously, let us consider the associated static multicriterial problem
$$\langle X[\theta, t_0, x_0, V \div Q],\ F(x) \rangle. \tag{3.1}$$
The symbol $X^P$ will denote the entire set of Pareto-optimal (maximal) points $x^P$ in problem (3.1). The structure of the solutions of problem (1.1, Chapter 2) in the case of Pareto optimality is such that, for a fixed initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, it is true that $\mathcal{X}^P = X^P$. In other words, by the time $\theta$ at which the process ends, the Pareto-maximal strategies of problem (1.1, Chapter 2) will have led the quasimotions of system (1.2) of Chapter 2 (with $x[t_0] = x_0$) to the Pareto-optimal (maximal) points of the static problem (3.1), and only to these points. This result is formulated in Proposition 3.1.
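The Pareto-optimal set of a static problem of the form (3.1) can be illustrated on a finite sample of points. The following is only a minimal sketch: the sample set `sample_X` and the two criteria are hypothetical toy data, not taken from the text, and a finite sample merely approximates a reachability domain.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(fa: Sequence[float], fb: Sequence[float]) -> bool:
    """True if fa >= fb componentwise with at least one strict inequality."""
    return all(a >= b for a, b in zip(fa, fb)) and any(a > b for a, b in zip(fa, fb))


def pareto_points(points: List[Point],
                  criteria: List[Callable[[Point], float]]) -> List[Point]:
    """Return the points whose payoff vectors are not dominated by any other point."""
    values = [tuple(F(x) for F in criteria) for x in points]
    return [x for x, fx in zip(points, values)
            if not any(dominates(fy, fx) for fy in values)]


# Hypothetical toy problem: X = {0, 0.25, ..., 2}, two concave criteria to maximize.
sample_X: List[Point] = [(0.25 * k,) for k in range(9)]
criteria = [lambda x: -(x[0] - 0.5) ** 2, lambda x: -(x[0] - 1.5) ** 2]

pareto_set = pareto_points(sample_X, criteria)
print(pareto_set)
```

In this toy problem the non-dominated points are exactly the sample points lying between the two individual maximizers, 0.5 and 1.5.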
3.2. Structure
Proposition 3.1. If Conditions 1.1 and 1.2 of Chapter 2 hold and if an initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ is fixed, the sets $\mathcal{X}^P$ and $X^P$ coincide.
Here
$$\mathcal{X}^P = \{x \in X[\theta, t_0, x_0, V \div Q] \mid x = x[\theta, t_0, x_0, V^P],\ V^P \in \mathcal{V}^P\}$$
and the set $X^P$ consists of all Pareto-optimal points of problem (3.1). The proof follows that of Proposition 3.1 of Chapter 2 and is presented here for completeness.
Proof. First, we prove the inclusion
$$X^P \subseteq \mathcal{X}^P. \tag{3.2}$$
Let $x^P$ be an arbitrary point of the set $X^P$. Since, by (3.1), $x^P \in X[\theta, t_0, x_0, V \div Q]$, it follows from Corollary 4.1 (Chapter 1) that there exists a strategy $V^* \in \mathcal{V}$ such that
$$x^P = x[\theta, t_0, x_0, V^*] \tag{3.3}$$
for any quasimotion $x[\cdot, t_0, x_0, V^*]$ of the system (1.2) in Chapter 2 generated from the position $(t_0, x_0)$ by the strategy $V^*$. To prove that the strategy $V^*$ is Pareto-optimal, we assume the contrary, i.e., that there exist a strategy $\hat V$ and a pair of quasimotions $(x^*[\cdot, t_0, x_0, \hat V],\ x^*[\cdot, t_0, x_0, V^*])$ such that the inequalities
$$F_i(x^*[\theta, t_0, x_0, \hat V]) \ge F_i(x^*[\theta, t_0, x_0, V^*]), \quad i \in \mathbf{N}, \tag{3.4}$$
hold simultaneously, at least one of them being strict.
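It follows from (3.3) that
$$F_i(x^P) = F_i(x[\theta, t_0, x_0, V^*]), \quad i \in \mathbf{N}, \tag{3.5}$$
for every quasimotion $x[\cdot, t_0, x_0, V^*] \in \mathcal{X}[t_0, x_0, V^*]$. The point $\hat x = x^*[\theta, t_0, x_0, \hat V]$ satisfies $\hat x \in X[\theta, t_0, x_0, \hat V] \subset X[\theta, t_0, x_0, V \div Q]$. Therefore, by (3.4) and (3.5), for the point $\hat x$ we have
$$F_i(\hat x) \ge F_i(x^P), \quad i \in \mathbf{N}, \tag{3.6}$$
at least one of which is a strict inequality. The relations (3.6) contradict the Pareto optimality (maximality) of the point $x^P$ in problem (3.1). Hence $V^*$ is Pareto-optimal and $x^P \in \mathcal{X}^P$, which proves (3.2). Now let us prove the inclusion
$$\mathcal{X}^P \subseteq X^P. \tag{3.7}$$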
Again, let us assume the contrary, i.e., that there is at least one point $x^P \in \mathcal{X}^P$ that is not Pareto-optimal in problem (3.1). But then there is a point $x^* \in X[\theta, t_0, x_0, V \div Q]$ such that the inequalities
$$F_i(x^*) \ge F_i(x^P), \quad i \in \mathbf{N}, \tag{3.8}$$
hold simultaneously, at least one of them being strict. By Corollary 4.1 (Chapter 1), there is a strategy $V^*$ for which $x^* = x[\theta, t_0, x_0, V^*]$ for every quasimotion $x[\cdot, t_0, x_0, V^*]$. Therefore
$$F_i(x^*) = F_i(x[\theta, t_0, x_0, V^*]), \quad i \in \mathbf{N}. \tag{3.9}$$
On the other hand, by the construction of the set $\mathcal{X}^P$, the point $x^P$ belongs to the reachability domain of the system (1.2) in Chapter 2 with some Pareto-maximal strategy $V^P$ of problem (1.1, Chapter 2). Then there is a quasimotion $x^P[\cdot, t_0, x_0, V^P]$ such that
$$x^P = x^P[\theta, t_0, x_0, V^P] \tag{3.10}$$
and
$$F_i(x^P) = F_i(x^P[\theta, t_0, x_0, V^P]), \quad i \in \mathbf{N}. \tag{3.11}$$
Combining relations (3.8)-(3.10) leads to the following system of inequalities:
$$F_i(x[\theta, t_0, x_0, V^*]) \ge F_i(x^P[\theta, t_0, x_0, V^P]), \quad i \in \mathbf{N}, \tag{3.12}$$
where, for any quasimotion $x[\cdot, t_0, x_0, V^*]$, at least one of the inequalities is strict. The relations (3.12) are incompatible with the Pareto maximality of the strategy $V^P$ in problem (1.1, Chapter 2) with initial position $(t_0, x_0)$. Finally, Proposition 3.1 follows from (3.2) and (3.7).
3.3. Existence
Corollary 3.1. If Conditions 1.1 and 1.2 of Chapter 2 hold, a Pareto-maximal strategy in the multicriterial problem exists for any initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$.
Indeed, let $(t_0, x_0)$ be an arbitrary position of the set $[0, \theta) \times \mathbb{R}^n$. If Condition 1.1 (Chapter 2) is met, the set $X[\theta, t_0, x_0, V \div Q]$ is closed and bounded in $\mathbb{R}^n$. Then a Pareto-optimal point $x^P$ of problem (3.1) can be found [85, p. 18] from
$$\max_{x \in X[\theta, t_0, x_0, V \div Q]} \sum_{i=1}^{N} \beta_i F_i(x) = \sum_{i=1}^{N} \beta_i F_i(x^P),$$
where $\beta_1, \ldots, \beta_N$ is an arbitrarily chosen set of positive numbers. Such a point $x^P$ exists by virtue of the continuity of the functions $F_i(x)$ and the compactness of the set $X[\theta, t_0, x_0, V \div Q]$. But then, by Corollary 4.1 (Chapter 1), there is a strategy $V^P$ such that
$$x^P = x[\theta, t_0, x_0, V^P] \quad \forall\, x[\cdot, t_0, x_0, V^P] \in \mathcal{X}[t_0, x_0, V^P].$$
By Proposition 3.1, this strategy $V^P$ is Pareto-maximal for the multicriterial problem (1.1, Chapter 2) with the specified initial position $(t_0, x_0)$.
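The scalarization step used in this argument can be sketched numerically: maximize a positively weighted sum of the criteria over a compact set and confirm that the maximizer is not dominated. The grid, the two criteria, and the weights below are hypothetical toy data chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Criterion = Callable[[Point], float]


def weighted_sum_argmax(points: List[Point], criteria: List[Criterion],
                        betas: Sequence[float]) -> Point:
    """Point of the finite set that maximizes sum_i beta_i * F_i(x)."""
    return max(points, key=lambda x: sum(b * F(x) for b, F in zip(betas, criteria)))


def is_pareto_optimal(x: Point, points: List[Point], criteria: List[Criterion]) -> bool:
    """True if no point of the set weakly improves all criteria and strictly improves one."""
    fx = [F(x) for F in criteria]
    for y in points:
        fy = [F(y) for F in criteria]
        if all(a >= b for a, b in zip(fy, fx)) and any(a > b for a, b in zip(fy, fx)):
            return False
    return True


# Hypothetical toy data: a grid on [0, 2] and two concave criteria.
grid: List[Point] = [(0.25 * k,) for k in range(9)]
criteria: List[Criterion] = [lambda x: -(x[0] - 0.5) ** 2, lambda x: -(x[0] - 1.5) ** 2]
betas = (1.0, 3.0)  # arbitrary positive weights

x_p = weighted_sum_argmax(grid, criteria, betas)
print(x_p, is_pareto_optimal(x_p, grid, criteria))
```

Changing the positive weights moves the maximizer along the Pareto set, which is why the choice of $\beta_1, \ldots, \beta_N$ is left arbitrary in the existence argument.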
3.4. Specific Features of Pareto-Maximal Strategies
In concluding this chapter let us consider the following non-conflicting positional differential game:
$$\Gamma_0 = \langle \mathbf{N},\ \Sigma,\ \{\mathcal{V}_i\}_{i \in \mathbf{N}},\ \{F_i(x[\theta])\}_{i \in \mathbf{N}} \rangle, \tag{3.13}$$
where the system $\Sigma$ is described by the equation
$$\dot x = f(t, x, v_1, \ldots, v_N) = f(t, x, v), \tag{3.14}$$
where $v_i \in Q_i \in \operatorname{comp} \mathbb{R}^{q_i}$ is the control action of the $i$-th player; $\mathcal{V}_i = \{V_i \div v_i(t, x) \mid v_i(t, x) \in Q_i\}$ is the set of his strategies, $v = (v_1, v_2, \ldots, v_N)$; the remaining variables in the game (3.13) are the same as in the multicriterial problem; Conditions 1.1 and 1.2 of Chapter 2 are assumed to hold. The Pareto-maximal strategy provides “the largest possible combination” of values for every goal functional $F_i(x[\theta])$. Moreover, by Proposition 3.1 this combination consists of “the largest” values the functions may have simultaneously over the reachability set $X[\theta, t_0, x_0, V \div Q]$, $Q = Q_1 \times \cdots \times Q_N$, of the
system (1.2) in Chapter 2. For this reason the Pareto-maximal strategy may seem, “from the game-theoretic point of view,” to be a “good” solution of the differential game (3.13). But this is not so. In the differential game (3.13) the maximin (guaranteed) gain
$$F_i^0 = \max_{V_i \in \mathcal{V}_i} \min_{x[\cdot]} F_i(x[\theta, t_0, x_0, V_i]) = \min_{x[\cdot]} F_i(x[\theta, t_0, x_0, V_i^0]) \tag{3.15}$$
may be larger for some players than the gain $F_i^P$ obtained with a Pareto-maximal set of strategies $V^P$ in an associated multicriterial problem, e.g., (1.1, Chapter 2), where
$$F_i^P = F_i(x[\theta, t_0, x_0, V^P]).$$
The $i$-th player can be assured of the gain $F_i^0$ stated in (3.15), whatever the strategies of the other $\mathbf{N} \setminus i$ players, by using his maximin strategy $V_i^0$, since
$$F_i(x[\theta, t_0, x_0, V_i^0]) \ge F_i^0$$
for any quasimotion of the system (3.14) generated by the strategy $V_i^0$ from the position $(t_0, x_0)$, whereas a Pareto-maximal set of strategies presumes that all the players agree on the joint choice of their strategies. Indeed, why should the $i$-th player, in the case $F_i^0 > F_i^P$, adjust to the others when he can achieve a gain of at least $F_i^0$ by using his maximin strategy $V_i^0$ from (3.15)? As a result, Pareto optimality should be used sparingly in positional differential games; in particular, we should in every case limit ourselves to those Pareto-maximal sets of strategies $V^P$ for which $F_i^P \ge F_i^0$ for every $i \in \mathbf{N}$. That the inequality $F_i^0 > F_i^P$ can indeed occur is confirmed by the following example.
Example 3.1.
In a two-person “tug-of-war” game the motion equations are
$$\dot x = v_1 + v_2, \quad x[0] = 0_2 = (0, 0) \quad (x = (x_1, x_2)). \tag{3.16}$$
The first player's control action is constrained by the inequality
$$\|v_1\| \le 2 \tag{3.17}$$
and the second player's, by
$$\|v_2\| \le 1. \tag{3.18}$$
With the points $A_1 = (0, 5)$ and $A_2 = (5, 0)$ fixed, the $i$-th player tries to pull the point $x$ so that, by the time the game ends ($\theta = 1$), the point $x[\theta]$ is as close as possible to $A_i$ ($i = 1, 2$). Then the $i$-th player's payoff function can be represented as
$$F_i(x[\theta]) = -\|x[1] - A_i\|, \quad i = 1, 2 \tag{3.19}$$
(the players “maximize” their payoff functions). The $i$-th player's strategy is identified with the functions $v_i(t, x)$, which satisfy inequalities (3.17) ($i = 1$) or (3.18) ($i = 2$). The set of the $i$-th player's strategies will be denoted $\mathcal{V}_i$. The reachability domain $X$ of the system (3.16) from the position $(t_0, x_0) = (0, 0_2)$ is a circle of radius 3 with center at the origin (cf. Fig. 3.3.1). The Pareto-optimal points in the associated static multicriterial problem
$$\langle X,\ \{F_i(x)\}_{i = 1, 2} \rangle \tag{3.20}$$
fill a quarter of the circumference, the arc $B_1 B_2$. In particular, the point $B_2$ is Pareto-optimal in problem (3.20) and is reached with the set of strategies $V^P \div ((2, 0), (1, 0))$. By Proposition 3.1 this set is Pareto-optimal, and the payoffs of the players for this set are
$$F_1(x[\theta, t_0, x_0, V^P]) = -\sqrt{34}, \quad F_2(x[\theta, t_0, x_0, V^P]) = -2. \tag{3.21}$$
Figure 3.3.1.
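The values in (3.21) can be checked directly, since constant controls in (3.16) move the point along a straight line. The short sketch below assumes only the data of Example 3.1; the helper names are illustrative.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]

A1: Vec = (0.0, 5.0)  # target of the first player
A2: Vec = (5.0, 0.0)  # target of the second player


def terminal_point(v1: Vec, v2: Vec, theta: float = 1.0) -> Vec:
    """x[theta] for dx/dt = v1 + v2, x[0] = (0, 0), with constant controls."""
    return (theta * (v1[0] + v2[0]), theta * (v1[1] + v2[1]))


def payoff(x: Vec, target: Vec) -> float:
    """F_i = -||x - A_i||, as in (3.19)."""
    return -math.hypot(x[0] - target[0], x[1] - target[1])


x_end = terminal_point((2.0, 0.0), (1.0, 0.0))
F1, F2 = payoff(x_end, A1), payoff(x_end, A2)
print(x_end, F1, F2)
```

Both controls have maximal admissible norm and the same direction, so the terminal point lies on the boundary of the reachability domain.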
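Let us now find the maximin gain of the first player by the reasoning of [43, pp. 106-111]. The payoff function $F_1(x[\theta])$ turns into
$$F_1(x[\theta]) = -\left[(x_1[1])^2 + (x_2[1] - 5)^2\right]^{1/2}.$$
We consider the antagonistic differential game
$$\langle \{1, 2\},\ \Sigma \div (3.16),\ \{\mathcal{V}_1, \mathcal{V}_2\},\ F_1(x[\theta]) \rangle.$$
For a function $\varphi(t, x)$, let
$$W(t, x, v_1, v_2) = \frac{\partial \varphi}{\partial t} + \left[\frac{\partial \varphi}{\partial x}\right]'(v_1 + v_2). \tag{3.22}$$
Then the maximin of $W(t, x, v_1, v_2)$ in (3.22) under the bounds (3.17) and (3.18) is (cf. (26.6) in [43, p. 109])
$$\max_{v_1} \min_{v_2} W(t, x, v_1, v_2) = \frac{\partial \varphi}{\partial t} + \left\|\frac{\partial \varphi}{\partial x}\right\|. \tag{3.23}$$
The values of the vectors $v_1^0$ and $v_2^0$ that lead to this maximin are given by the equalities
$$v_1^0 = 2\,\frac{\partial \varphi / \partial x}{\|\partial \varphi / \partial x\|}, \quad v_2^0 = -\frac{\partial \varphi / \partial x}{\|\partial \varphi / \partial x\|} \tag{3.24}$$
when $\|\partial \varphi / \partial x\| \ne 0$, while $v_i^0$ may assume any values consistent with inequalities (3.17) and (3.18) when $\|\partial \varphi / \partial x\| = 0$. For the equation
$$\frac{\partial \varphi}{\partial t} + \left\|\frac{\partial \varphi}{\partial x}\right\| = 0$$
with boundary condition
$$\varphi(1, x) = -\left[x_1^2 + (x_2 - 5)^2\right]^{1/2}$$
the following solution is found:
$$\varphi(t, x) = -\left[x_1^2 + (x_2 - 5)^2\right]^{1/2} + (1 - t)$$
in the domain
$$-\left[x_1^2 + (x_2 - 5)^2\right]^{1/2} + (1 - t) < 0$$
and $\varphi(t, x) = 0$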
in the domain
$$-\left[x_1^2 + (x_2 - 5)^2\right]^{1/2} + (1 - t) \ge 0. \tag{3.25}$$
But for any $(x_1, x_2)$ in the reachability domain $X$, the function $\varphi < 0$, so Condition (3.25) cannot be fulfilled. Consequently, the potential of this antagonistic differential game is given by the equality
$$\varphi(t, x) = -\left[x_1^2 + (x_2 - 5)^2\right]^{1/2} + (1 - t).$$
Then the maximin value of the payoff is (by (25.6) in [43, p. 107])
$$F_1^0 = \max_{V_1} \min_{x[\cdot]} F_1(x[\theta, t_0, x_0, V_1]) = \varphi(t_0, x_0) = -4. \tag{3.26}$$
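The value $-4$ in (3.26) can be made plausible by a crude simulation. In the sketch below the first player always pulls toward $A_1$ with the maximal admissible speed 2, and the second player is tried with a small, hypothetical family of responses: pulling directly away from $A_1$, or pulling in one of 16 fixed directions. This samples only a few counter-strategies with an Euler discretization, so it is an illustration rather than a proof.

```python
import math
from typing import Callable, List, Tuple

Vec = Tuple[float, float]
A1: Vec = (0.0, 5.0)


def unit_towards(x: Vec, target: Vec) -> Vec:
    """Unit vector from x to target (zero vector if the points coincide)."""
    dx, dy = target[0] - x[0], target[1] - x[1]
    r = math.hypot(dx, dy)
    return (dx / r, dy / r) if r > 0 else (0.0, 0.0)


def simulate_F1(v2_policy: Callable[[Vec], Vec], steps: int = 1000,
                theta: float = 1.0) -> float:
    """Euler simulation of dx/dt = v1 + v2 with player 1 pulling toward A1 at speed 2."""
    dt = theta / steps
    x: Vec = (0.0, 0.0)
    for _ in range(steps):
        u = unit_towards(x, A1)
        v1 = (2.0 * u[0], 2.0 * u[1])
        v2 = v2_policy(x)
        x = (x[0] + dt * (v1[0] + v2[0]), x[1] + dt * (v1[1] + v2[1]))
    return -math.hypot(x[0] - A1[0], x[1] - A1[1])


def pull_away(x: Vec) -> Vec:
    """Player 2 pulls directly away from A1 at unit speed."""
    u = unit_towards(x, A1)
    return (-u[0], -u[1])


# Hypothetical sample of player-2 responses: pull away, or a fixed direction.
policies: List[Callable[[Vec], Vec]] = [pull_away] + [
    (lambda x, a=2.0 * math.pi * k / 16: (math.cos(a), math.sin(a))) for k in range(16)
]
payoffs_F1 = [simulate_F1(p) for p in policies]
worst_case_F1 = min(payoffs_F1)
print(worst_case_F1)
```

Against every sampled response the first player ends no farther than distance 4 from $A_1$, with the worst case attained when the second player pulls directly away.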
Comparing (3.26) and (3.21) yields
$$F_1^P = F_1(x[\theta, t_0, x_0, V^P]) = -\sqrt{34} < -4 = \max_{V_1} \min_{x[\cdot]} F_1(x[\theta, t_0, x_0, V_1]) = F_1^0.$$
This inequality asserts that the maximin gain of the first player is larger than his payoff for a Pareto-maximal set of strategies. Therefore, in the game (3.16)-(3.19) it is not “worth the first player's while” to agree to the Pareto-maximal set of strategies $V^P \div ((2, 0), (1, 0))$, since his maximin strategy (which presumes maximal counteraction of the other player) yields a larger payoff.
4. Sufficient Conditions
4.1. Auxiliary Propositions
The sufficient conditions that are satisfied by a Pareto-optimal strategy $V^P$ of problem (1.1) in Chapter 2 for any initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ will be used in the next section to derive an explicit Pareto-optimal strategy in the linear quadratic case. A continuous scalar function $\psi(F_1, \ldots, F_N)$ defined over $\mathbb{R}^N$ will be referred to as strictly increasing with respect to the set of arguments if, for any two points
$$F^{(j)} = (F_1^{(j)}, \ldots, F_N^{(j)}), \quad j = 1, 2,$$
from the system of inequalities
$$F_i^{(1)} \le F_i^{(2)}, \quad i \in \mathbf{N},$$
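at least one of which is a strict inequality, it follows that
$$\psi(F_1^{(1)}, \ldots, F_N^{(1)}) < \psi(F_1^{(2)}, \ldots, F_N^{(2)}). \tag{4.1}$$
For instance, if the continuous function $\psi(F_1, \ldots, F_N)$ is strictly increasing with respect to every variable, or if it follows from $F_i^{(1)} < F_i^{(2)}$ that
$$\psi(F_1, \ldots, F_{i-1}, F_i^{(1)}, F_{i+1}, \ldots, F_N) < \psi(F_1, \ldots, F_{i-1}, F_i^{(2)}, F_{i+1}, \ldots, F_N),$$
then $\psi(F_1, \ldots, F_N)$ is strictly increasing over the set of arguments.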
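Property (4.1) can be tested on concrete payoff vectors. The sketch below compares two hypothetical scalarizations, a positively weighted sum and a weighted minimum, on one pair of vectors that differ in a single coordinate. The weighted sum increases, while the weighted minimum does not, so the minimum is not strictly increasing in the sense of (4.1).

```python
from typing import Sequence


def weighted_sum(F: Sequence[float], betas: Sequence[float]) -> float:
    """psi(F) = sum_i beta_i * F_i."""
    return sum(b * f for b, f in zip(betas, F))


def weighted_min(F: Sequence[float], betas: Sequence[float]) -> float:
    """psi(F) = min_i beta_i * F_i."""
    return min(b * f for b, f in zip(betas, F))


# Illustrative data: F_low <= F_high componentwise, strictly in the second coordinate.
betas = (1.0, 1.0)
F_low = (1.0, 2.0)
F_high = (1.0, 3.0)

sum_increases = weighted_sum(F_high, betas) > weighted_sum(F_low, betas)
min_increases = weighted_min(F_high, betas) > weighted_min(F_low, betas)
print(sum_increases, min_increases)
```

A single pair of points can refute the property but cannot establish it; for the weighted sum with positive weights the strict increase follows directly from the definition.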
Proposition 4.1. If Conditions 1.1 and 1.2 of Chapter 2 hold and if the function $\psi(F_1, \ldots, F_N)$ is strictly increasing over the set of arguments, the strategy $V^P \in \mathcal{V}$ such that, with the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$,
$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} \psi(F_1(x[\theta, t_0, x_0, V]), \ldots, F_N(x[\theta, t_0, x_0, V])) = \min_{x[\cdot]} \psi(F_1(x[\theta, t_0, x_0, V^P]), \ldots, F_N(x[\theta, t_0, x_0, V^P])) \tag{4.2}$$
will be Pareto-maximal in the multicriterial problem (1.1, Chapter 2) with initial position $(t_0, x_0)$.
Proof. Let us show that all the points
$$x^P = x[\theta, t_0, x_0, V^P] \tag{4.3}$$
(the strategies $V^P$ are found from (4.2)) are Pareto-optimal in the static problem $\langle X[\theta, t_0, x_0, V \div Q], \{F_i(x)\}_{i \in \mathbf{N}} \rangle$ (3.1). Then, by Proposition 3.1, the strategies $V^P$ are Pareto-optimal in problem (1.1, Chapter 2) with initial position $(t_0, x_0)$. Assume the opposite, i.e., that some point $x^P$ found from (4.2) and (4.3) is not Pareto-optimal in problem (3.1). Then there exists an $n$-vector $x^* \in X[\theta, t_0, x_0, V \div Q]$ such that the inequalities
$$F_i(x^*) \ge F_i(x^P), \quad i \in \mathbf{N}, \tag{4.4}$$
hold simultaneously, at least one of them being strict. Because the function $\psi(F_1, \ldots, F_N)$ is strictly increasing over the set of arguments, by (4.4),
$$\psi(F_1(x^*), \ldots, F_N(x^*)) > \psi(F_1(x^P), \ldots, F_N(x^P)). \tag{4.5}$$
Consequently, a point $x^* \in X[\theta, t_0, x_0, V \div Q]$ may be found such that the strict inequality (4.5) is true.
Now let us consider an auxiliary zero-sum differential game
$$\langle \{1, 2\},\ \Sigma^*,\ \{\mathcal{U}, \mathcal{V}\},\ J(U, V) = \psi(F_1(x[\theta]), \ldots, F_N(x[\theta])) \rangle, \tag{4.6}$$
where the control system $\Sigma^*$ is identified with
$$\dot x = f(t, x, v) + 0_n u,$$
the null vector $0_n \in \mathbb{R}^n$ and the scalar control $u \in [0, 1]$. As in the proof of Proposition 4.2 (Chapter 1), it is easily seen that all the constraints of Theorem 3.2 (Chapter 1) concerning the existence of a saddle point of the zero-sum differential game (4.6) are satisfied. That the saddle point condition in a small game is satisfied was proved in Proposition 4.2 (Chapter 1); moreover, the function $\psi(F_1(x), \ldots, F_N(x))$ is continuous. The remaining constraints are corollaries of Condition 1.1 of Chapter 2. Then in the game (4.6) there exist [83, p. 96] a maximin strategy $V^P$, for which (4.2) is fulfilled, and a minimax strategy $U^0 \in \mathcal{U}$. By the definition of a saddle point $(U^0, V^P)$, we have
$$\begin{aligned}
\min_{x[\cdot]} \psi(F_1(x[\theta, t_0, x_0, V^P]), \ldots, F_N(x[\theta, t_0, x_0, V^P]))
&= \max_{x[\cdot]} \psi(F_1(x[\theta, t_0, x_0, U^0]), \ldots, F_N(x[\theta, t_0, x_0, U^0])) \\
&= \psi(F_1(x[\theta, t_0, x_0, U^0, V^P]), \ldots, F_N(x[\theta, t_0, x_0, U^0, V^P])) \\
&= \psi(F_1(x[\theta, t_0, x_0, V^P]), \ldots, F_N(x[\theta, t_0, x_0, V^P])) \\
&= \max_{x[\cdot]} \psi(F_1(x[\theta, t_0, x_0, V \div Q]), \ldots, F_N(x[\theta, t_0, x_0, V \div Q])),
\end{aligned} \tag{4.7}$$
where the latter equality is a corollary of the fact that the strategy $U$ has no effect on the system $\Sigma^*$ of (4.6) (it acts through the term $0_n u$). It follows from (4.7) that, for any point $x \in X[\theta, t_0, x_0, V \div Q]$,
$$\psi(F_1(x), \ldots, F_N(x)) \le \psi(F_1(x[\theta, t_0, x_0, V^P]), \ldots, F_N(x[\theta, t_0, x_0, V^P])) = \psi(F_1(x^P), \ldots, F_N(x^P)).$$
This inequality is incompatible with (4.5), so any point $x^P$ obtained from (4.3) is Pareto-optimal in the multicriterial problem (3.1). But then, by Proposition 3.1, the strategy $V^P$ found from (4.2) is also Pareto-optimal in problem (1.1) of Chapter 2 with initial position $(t_0, x_0)$. This proves Proposition 4.1. The function $\psi = \sum_{i=1}^{N} \beta_i F_i$ (with $\beta_i$ positive numbers) is easily seen to be
strictly increasing over the set of arguments. The next remark follows from Proposition 4.1.
Remark 4.1. If Conditions 1.1 and 1.2 of Chapter 2 hold, with $\beta_1, \ldots, \beta_N$ a set of positive numbers and $(t_0, x_0)$ an initial position, the strategy $V^P$ for which
$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} \sum_{i=1}^{N} \beta_i F_i(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} \sum_{i=1}^{N} \beta_i F_i(x[\theta, t_0, x_0, V^P])$$
is Pareto-optimal in problem (1.1, Chapter 2) with initial position $(t_0, x_0)$. The function $\psi = \sum \beta_i F_i$ will be used in the next section to find a Pareto-optimal strategy in the linear quadratic multicriterial dynamic system. It follows from the proof of Proposition 4.1 that the problem (4.2) of finding Pareto-optimal strategies can be viewed as equivalent to that of finding the saddle point of a positional differential zero-sum game (4.6) in which there is no “minimizing” player. Therefore, Proposition 4.1 makes it possible to use the mathematical tools of sufficient conditions [83, Chapter 3] that are satisfied by the piecewise-smooth value function of a zero-sum positional differential game in order to obtain Pareto-optimal strategies of problem (1.1, Chapter 2). Such relations “can be viewed as an extension of the first-order partial equation from the theory of dynamic programming” [ibid., p. 7], specifically as an extension, “to the case of non-smooth functions, of the well-known Hamilton-Jacobi dynamic programming equations” [ibid., p. 5]. The tools given in this monograph on pp. 274-275 are useful in solving problem (4.2). The findings of Leitmann, Fleming, Friedman, Lions, and others can also be applied. The form in which we shall be applying the sufficient conditions seems illustrative. We will require certain additional propositions.
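Proposition 4.2. If the function $\varphi(\cdot): [0, \theta] \times \mathbb{R}^n \to \mathbb{R}^1$ is continuous and $V$ is a certain strategy from the set $\mathcal{V}$, then for the condition
$$\sup_{x[\cdot]} \varlimsup_{t \to t_* + 0} [\varphi(t, x[t]) - \varphi(t_*, x_*)](t - t_*)^{-1} \le 0 \tag{4.8}$$
to hold at every position $(t_*, x_*) \in [0, \theta) \times \mathbb{R}^n$, it is necessary and sufficient that, at every such position,
$$\max_{x[\cdot]} \max_{t_* \le t \le \theta} \varphi(t, x[t]) \le \varphi(t_*, x_*). \tag{4.9}$$
In (4.8) and (4.9) the quasimotions $x[\cdot] \in \mathcal{X}[t_*, x_*, V]$ are generated from the position $(t_*, x_*)$ by the strategy $V \in \mathcal{V}$.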
Proof. The sufficiency ((4.9) $\Rightarrow$ (4.8)) of the proposition is obvious. The proof of necessity ((4.8) $\Rightarrow$ (4.9)) follows the proof of Lemma 3.2.1 in [83, p. 115]. Assume that the converse is true, i.e., that there exist a position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, a quasimotion $x^*[\cdot] \in \mathcal{X}[t_0, x_0, V]$, and a number $\alpha > 0$ such that
is not empty since $\varphi(t_0, x^*[t_0]) = \varphi(t_0, x_0)$. We let
$$\tau^* = \max\{\tau \mid [t_0, \tau] \subset T\}. \tag{4.11}$$
The continuity of the superposition $\varphi(t, x^*[t])$ follows from the continuity of the functions $x^*[\cdot]: [t_0, \theta] \to \mathbb{R}^n$ and $\varphi(t, x)$. Then $\tau^* < \theta$ and, by (4.10) and (4.11),
and there also exists a number $\delta \in (0, \theta - \tau^*)$ such that
(4.13)
at every $t \in (\tau^*, \tau^* + \delta)$. In light of (4.12), inequality (4.13) turns into
Inequality (4.14) is, however, incompatible with (4.8). Indeed, it follows from Condition (4.8) [83, p. 115] that for an arbitrary position $(t_*, x_*) \in [0, \theta) \times \mathbb{R}^n$, an arbitrary quasimotion $\tilde x[\cdot] \in \mathcal{X}[t_*, x_*, V]$, and any number $\gamma > 0$ there is a number $\delta(\tilde x[\cdot]) > 0$ such that, for the quasimotion $\tilde x[\cdot] = x[\cdot, t_*, x_*, V]$,
$$\varphi(t, \tilde x[t]) < \varphi(t_*, x_*) + \gamma(t - t_*) \tag{4.15}$$
holds at $t \in [t_*, t_* + \delta]$. Assuming in (4.15) that $\gamma = \alpha/(\theta - t_0)$, $(t_*, x_*) = (\tau^*, x^*[\tau^*])$, and $\tilde x[t] = x^*[t]$ leads to a contradiction to (4.14).
Following the proof of Proposition 4.2, we may prove the following assertion.
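Proposition 4.3. If the function $\varphi(\cdot): [0, \theta] \times \mathbb{R}^n \to \mathbb{R}^1$ is continuous and $V$ is a strategy from the set $\mathcal{V}$, then for the condition
$$\inf_{x[\cdot]} \varliminf_{t \to t_* + 0} [\varphi(t, x[t]) - \varphi(t_*, x_*)](t - t_*)^{-1} \ge 0 \tag{4.16}$$
to hold it is necessary and sufficient that, for every $(t_*, x_*) \in [0, \theta) \times \mathbb{R}^n$,
$$\min_{x[\cdot]} \min_{t_* \le t \le \theta} \varphi(t, x[t]) \ge \varphi(t_*, x_*). \tag{4.17}$$
In (4.16) and (4.17), $x[\cdot] = x[\cdot, t_*, x_*, V]$ is any quasimotion of the bunch $\mathcal{X}[t_*, x_*, V]$. Finally, we have the following assertion from Propositions 4.2 and 4.3.
Corollary 4.1. If the function $\varphi(t, x)$ is continuous and $V$ is a strategy from the set $\mathcal{V}$, then at every position $(t_*, x_*) \in [0, \theta) \times \mathbb{R}^n$ the condition
$$\sup_{x[\cdot]} \varlimsup_{t \to t_* + 0} [\varphi(t, x[t]) - \varphi(t_*, x_*)](t - t_*)^{-1} = \inf_{x[\cdot]} \varliminf_{t \to t_* + 0} [\varphi(t, x[t]) - \varphi(t_*, x_*)](t - t_*)^{-1} = 0$$
holds if and only if, for any quasimotion $x[\cdot] = x[\cdot, t_*, x_*, V]$ of the system (1.2) in Chapter 2 generated from any position $(t_*, x_*) \in [0, \theta) \times \mathbb{R}^n$, $\varphi(t, x[t]) = \varphi(t_*, x_*)$ at every $t \in [t_*, \theta]$.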
4.2. Sufficient Conditions
Propositions 4.1-4.3 lead to a sufficient condition satisfied by the Pareto-maximal strategy $V^P$ in the multicriterial problem for any initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$.
Proposition 4.4. Suppose Conditions 1.1 and 1.2 of Chapter 2 hold and there exist a continuous function $\varphi(\cdot): [0, \theta] \times \mathbb{R}^n \to \mathbb{R}^1$, a function $\psi(F_1, \ldots, F_N)$ that is strictly increasing over the set of arguments, and a strategy $V^P \in \mathcal{V}$ such that
(1) for every $x \in \mathbb{R}^n$,
$$\varphi(\theta, x) = \psi(F_1(x), \ldots, F_N(x)); \tag{4.18}$$
(2) for any position $(t_*, x_*) \in [0, \theta) \times \mathbb{R}^n$ and every quasimotion $x[\cdot] \in \mathcal{X}[t_*, x_*, V^P]$,
$$\varlimsup_{t \to t_* + 0} [\varphi(t, x[t]) - \varphi(t_*, x_*)](t - t_*)^{-1} = \varliminf_{t \to t_* + 0} [\varphi(t, x[t]) - \varphi(t_*, x_*)](t - t_*)^{-1} = 0; \tag{4.19}$$
(3) at every position $(t_*, x_*) \in [0, \theta) \times \mathbb{R}^n$,
$$\sup_{x[\cdot]} \varlimsup_{t \to t_* + 0} [\varphi(t, x[t]) - \varphi(t_*, x_*)](t - t_*)^{-1} \le 0 \tag{4.20}$$
for the quasimotions $x[\cdot] \in \mathcal{X}[t_*, x_*, V]$ of the system (1.2) in Chapter 2 generated by any strategy $V \in \mathcal{V}$ from the position $(t_*, x_*)$.
Then the strategy $V^P$ is Pareto-maximal for the multicriterial problem (1.1, Chapter 2) for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$.
Proof. If $(t_0, x_0)$ is an arbitrary position in the set $[0, \theta) \times \mathbb{R}^n$, then, by Condition (4.19) and Corollary 4.1, $\varphi(t, x[t, t_0, x_0, V^P]) = \varphi(t_0, x_0)$ at every $t \in [t_0, \theta]$ for every quasimotion $x[\cdot, t_0, x_0, V^P]$ of the system (1.2) of Chapter 2 generated by the strategy $V^P$ from the position $(t_0, x_0)$. Hence (at $t = \theta$), from (4.18) we have
$$\varphi(t_0, x_0) = \varphi(\theta, x[\theta, t_0, x_0, V^P]) = \psi(F_1(x[\theta, t_0, x_0, V^P]), \ldots, F_N(x[\theta, t_0, x_0, V^P]))$$
for every quasimotion $x[\cdot, t_0, x_0, V^P] \in \mathcal{X}[t_0, x_0, V^P]$ as well. Therefore,
$$\min_{x[\cdot]} \psi(F_1(x[\theta, t_0, x_0, V^P]), \ldots, F_N(x[\theta, t_0, x_0, V^P])) = \varphi(t_0, x_0). \tag{4.21}$$
Letting $V$ be an arbitrary strategy from the set $\mathcal{V}$, by (4.20) and Proposition 4.2, for any quasimotion $x[\cdot, t_0, x_0, V]$ of the system (1.2) of Chapter 2 generated by this strategy $V$ from the position $(t_0, x_0)$,
$$\varphi(t, x[t, t_0, x_0, V]) \le \varphi(t_0, x_0)$$
at every $t \in [t_0, \theta]$. In particular, at $t = \theta$,
$$\varphi(\theta, x[\theta, t_0, x_0, V]) \le \varphi(t_0, x_0). \tag{4.22}$$
Equality (4.18) leads to
$$\varphi(\theta, x[\theta, t_0, x_0, V]) = \psi(F_1(x[\theta, t_0, x_0, V]), \ldots, F_N(x[\theta, t_0, x_0, V])).$$
Therefore, from (4.22),
$$\psi(F_1(x[\theta, t_0, x_0, V]), \ldots, F_N(x[\theta, t_0, x_0, V])) \le \varphi(t_0, x_0)$$
for the quasimotions $x[\cdot, t_0, x_0, V] \in \mathcal{X}[t_0, x_0, V]$. But then, obviously, it is also true that
$$\min_{x[\cdot]} \psi(F_1(x[\theta, t_0, x_0, V]), \ldots, F_N(x[\theta, t_0, x_0, V])) \le \varphi(t_0, x_0). \tag{4.23}$$
Combining (4.21) and (4.23) leads to
$$\min_{x[\cdot]} \psi(F_1(x[\theta, t_0, x_0, V]), \ldots, F_N(x[\theta, t_0, x_0, V])) \le \min_{x[\cdot]} \psi(F_1(x[\theta, t_0, x_0, V^P]), \ldots, F_N(x[\theta, t_0, x_0, V^P])), \tag{4.24}$$
which holds for every strategy $V \in \mathcal{V}$. But in this case inequality (4.24) is equivalent to (4.2), which, by Proposition 4.1, asserts the Pareto maximality of the strategy $V^P$ in problem (1.1) in Chapter 2 with initial position $(t_0, x_0)$. Because this initial position $(t_0, x_0)$ is taken arbitrarily from the set $[0, \theta) \times \mathbb{R}^n$, the strategy $V^P$ is Pareto-maximal for any initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$.
4.3. Corollaries
Corollary 4.2. If the function $\psi(F_1, \ldots, F_N)$ of Proposition 4.4 is chosen in the form $\psi = \sum \beta_i F_i$, where $\beta_i$ are some positive numbers, the following assertion holds.
Proposition 4.5. If Conditions 1.1 and 1.2 of Chapter 2 hold and if there exist a continuous function $\varphi(\cdot): [0, \theta] \times \mathbb{R}^n \to \mathbb{R}^1$, a set of positive numbers $\beta_1, \ldots, \beta_N$, and a strategy $V^P \in \mathcal{V}$ such that
(1) for every $x \in \mathbb{R}^n$,
$$\varphi(\theta, x) = \sum_{i=1}^{N} \beta_i F_i(x);$$
(2) constraints 2 and 3 of Proposition 4.4 hold,
then the strategy $V^P$ is Pareto-maximal for problem (1.1) of Chapter 2 for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$.
This result follows from Proposition 4.4 and from the fact that the function $\sum_{i=1}^{N} \beta_i F_i$ is strictly increasing over the set of arguments.
Finally, if the function $\varphi(t, x)$ is continuously differentiable and if by the quasimotions of the system (1.2) from Chapter 2 generated by the strategy $V \div v(t, x)$ we mean the solutions with $v = v(t, x)$, Conditions (4.19) and (4.20) turn into
(4.25)
which must hold at every position $(t, x) \in [0, \theta) \times \mathbb{R}^n$. Recall that $\partial \varphi / \partial x$ is a column vector whose components are the partial derivatives of the function $\varphi(t, x)$ with respect to the coordinates of the vector $x$.
Corollary 4.3. The functions $\psi(F_1, \ldots, F_N)$ of Propositions 4.1 and 4.4 may be represented [70, pp. 71-72], in addition to $\sum \beta_i F_i$, $\beta_i = \text{const} > 0$ (Corollary 4.2), as the functions
(4.26)
where
$$F_i^* = \max_{x \in X[\theta, t_0, x_0, V \div Q]} F_i(x) + \gamma_i, \quad \gamma_i = \text{const} > 0; \tag{4.27}$$
in the case of the differential game (3.13),
(4.28)
where
$$F_i^0 = \max_{V_i \in \mathcal{V}_i} \min_{x[\cdot]} F_i(x[\theta, t_0, x_0, V_i]) \tag{4.29}$$
and there exists at least one point $(f_1, \ldots, f_N) \in F(X[\theta, t_0, x_0, V \div Q])$ such that
$$f_i > F_i^0, \quad i \in \mathbf{N}. \tag{4.30}$$
These functions $\psi_r(F_1, \ldots, F_N)$ are strictly increasing over the set of arguments ($\psi_2$ under conditions (4.30)). Therefore, if Conditions 1.1 and 1.2 of Chapter 2 are satisfied, the strategies $V^P$ found from the equalities
$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} \psi_r(F_1(x[\theta, t_0, x_0, V]), \ldots, F_N(x[\theta, t_0, x_0, V])) = \min_{x[\cdot]} \psi_r(F_1(x[\theta, t_0, x_0, V^P]), \ldots, F_N(x[\theta, t_0, x_0, V^P])) \tag{4.31}$$
are Pareto-maximal in problem (1.1) of Chapter 2 with initial position $(t_0, x_0)$ (Proposition 4.1). The existence of $V^P$ was demonstrated in the proof of Proposition 4.1.
Functions (4.26) and (4.28) help us apply (4.31) to identify, within the set of Pareto-maximal strategies $V^P$, those strategies which have obvious geometric or “game-theoretic” properties. The strategy $V^{(1)}$ found from (4.31) with $r = 1$ leads all quasimotions $x[\cdot, t_0, x_0, V^{(1)}]$ to those points $x[\theta, t_0, x_0, V^{(1)}]$ of the reachability domain $X[\theta, t_0, x_0, V \div Q]$ where the values of the goal functions $F_i(x)$ are closest (in terms of the Euclidean metric and up to $\gamma = (\gamma_1, \ldots, \gamma_N)$) to the “Utopian point” $F^* = (F_1^*, \ldots, F_N^*)$. This point $F^*$ of the maximal values of the goal functions $F_i(x)$ (in the reachability domain $X[\theta, t_0, x_0, V \div Q]$) is obviously the most desirable point, though it can be obtained by a strategy $V \in \mathcal{V}$ only in special cases. Our desire is to use a strategy $V^{(1)}$ that would guarantee values of the goal functions that are maximally (in terms of the Euclidean metric and up to $\gamma = (\gamma_1, \ldots, \gamma_N)$) close to $F^*$. The strategy $V^{(1)}$ is usually called [85, section 2] the mean-square strategy. The strategy $V^{(2)}$ found from (4.31) with $r = 2$ has been termed [85, section 3] the Nash arbitration solution. It is the solution that makes it possible for a judge to make maximal allowance for the interests of the parties (players) in the differential positional game (3.13), especially if they are “symmetric”. Sets of axioms that lead to the form (4.28) have been given in [89]. The functions can also assume other forms, e.g.,

where $m$, $\beta_i$, and $m_i$ are positive constants and the values of $F_i^*$, $F_i^0$, $i \in \mathbf{N}$, are given in (4.27) and (4.29), (4.30).
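The two selection rules described above can be sketched on a finite set of Pareto-optimal payoff vectors. The explicit forms used below are assumptions made for illustration only, since the formulas (4.26) and (4.28) are not reproduced here: `psi_1` is taken as the negative squared Euclidean distance to the point $F^*$ of (4.27), and `psi_2` as the product of the gains over the guaranteed values $F_i^0$ of (4.29). The payoff vectors, the shifts $\gamma_i$, and the guaranteed values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Payoff = Tuple[float, ...]


def psi_1(F: Sequence[float], F_star: Sequence[float]) -> float:
    """Assumed mean-square form: minus the squared distance to the Utopian point."""
    return -sum((f - fs) ** 2 for f, fs in zip(F, F_star))


def psi_2(F: Sequence[float], F_guar: Sequence[float]) -> float:
    """Assumed Nash arbitration form: product of gains over the guaranteed values."""
    return math.prod(f - g for f, g in zip(F, F_guar))


# Hypothetical Pareto-optimal payoff vectors of a two-criteria problem.
payoffs: List[Payoff] = [(0.0, 4.0), (1.0, 3.5), (2.5, 3.0), (3.0, 2.0), (4.0, 0.0)]
gamma = (0.5, 0.5)   # hypothetical positive shifts, cf. (4.27)
F_star = tuple(max(p[i] for p in payoffs) + gamma[i] for i in range(2))
F_guar = (0.5, 1.0)  # hypothetical guaranteed (maximin) values, cf. (4.29)

mean_square_choice = max(payoffs, key=lambda F: psi_1(F, F_star))
# The product form is only used where every player gains over F_guar, cf. (4.30).
admissible = [F for F in payoffs if all(f > g for f, g in zip(F, F_guar))]
nash_choice = max(admissible, key=lambda F: psi_2(F, F_guar))
print(mean_square_choice, nash_choice)
```

Restricting the product rule to vectors that exceed the guaranteed values mirrors condition (4.30), under which the text states that $\psi_2$ is strictly increasing.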
5. A Linear Quadratic Multicriterial Problem
5.1. Problem Statement
Let us assume that the variations of the position $(t, x)$ of the system $\Sigma$ from (1.1) of Chapter 2 are described by the linear equation
$$\dot x = A(t)x + B(t)v, \tag{5.1}$$
and the initial conditions $x(t_0) = x_0$ are constrained only by
$$t_0 \in [0, \theta), \quad x_0 \in \mathbb{R}^n; \tag{5.2}$$
the performance of $\Sigma$ is estimated by a vector-valued functional
$$J(V) = (J_1(V), \ldots, J_N(V)), \tag{5.3}$$
where
$$J_i(V) = x'(\theta) C_i x(\theta) + \int_{t_0}^{\theta} \{x'(t) G_i(t) x(t) + v'[t] \mathcal{D}_i v[t]\}\, dt, \quad i \in \mathbf{N}. \tag{5.4}$$
As before, in (5.1) and (5.4) the state vector $x \in \mathbb{R}^n$ and the time $t \in [t_0, \theta]$, where $\theta > t_0$ is the fixed time at which the functioning of $\Sigma$ (5.1) ends; the control action $v \in \mathbb{R}^q$; the elements of the matrices $A(t)$, $B(t)$, and $G_i(t)$ of appropriate dimensions are continuous; the matrices $C_i$ and $\mathcal{D}_i$ are assumed to be constant; $C_i$, $\mathcal{D}_i$, and $G_i(t)$ are symmetric; and the prime denotes transposition. Now, in contrast to the foregoing discussion, let us assume that the feasible values of the control vector $v$ are not restricted by $v \in Q$, a compactum in $\mathbb{R}^q$, and that the feasible strategies are identified with any functions $V \div v(t, x)$ which, for every fixed $x \in \mathbb{R}^n$, are bounded and Borel-measurable over $t$ and satisfy, at every $t \in [0, \theta)$, a Lipschitz condition with respect to $x$:
$$\|v(t, x^{(1)}) - v(t, x^{(2)})\| \le l \|x^{(1)} - x^{(2)}\|, \quad l = \text{const} > 0.$$
The set of such strategies will be denoted $\mathcal{V}^B$. The quasimotion $x(\cdot, t_0, x_0, V)$ of the system (5.1) generated by the strategy $V \div v(t, x)$, $V \in \mathcal{V}^B$, from the initial position $(t_0, x_0)$ will be represented simply as a solution of this system with $v = v(t, x)$ and $x(t_0) = x_0$. In effect, we are now using a continuous control procedure [42, section 5]. Every fixed strategy $V \in \mathcal{V}^B$ generates a unique solution $x(t, t_0, x_0, V)$, $t_0 \le t < \theta$, extensible to $[t_0, \theta]$, that is an absolutely continuous function of $t$ and satisfies (5.1), where $v = v(t, x)$ and $x(t_0) = x_0$, at almost all $t \in [t_0, \theta]$ [42, p. 47]. By the almost
feasible realization $v[\cdot]$ of the strategy $V$ we understand in this case the function $v[t] = v(t, x(t, t_0, x_0, V))$ which, by the above constraints, is Borel-measurable and bounded. The feasible realizations $v[t]$, $t_0 \le t \le \theta$, are not all assumed to be uniformly bounded, though each $v[t]$, $t_0 \le t \le \theta$, is bounded by “its own” constant. Consequently, every strategy $V \div v(t, x)$ from $\mathcal{V}^B$ with specified initial position $(t_0, x_0)$ generates a unique solution $x(\cdot, t_0, x_0, V) = \{x(t, t_0, x_0, V),\ t_0 \le t \le \theta\}$. Substituting this result in the functional (5.4), where $v[t] = v(t, x(t))$, $x(t) = x(t, t_0, x_0, V)$ and $x(\theta) = x(\theta, t_0, x_0, V)$, we find a unique value $J_i(V)$ for every functional from (5.3), $i \in \mathbf{N}$. Since the quasimotions are unique, the definition of Pareto maximality turns into the following definition.
Definition 5.1. The strategy $V^P \in \mathcal{V}^B$ is Pareto-optimal in the problem
$$\langle \Sigma \div (5.1),\ \mathcal{V}^B,\ J(V) \div (5.3) \rangle \tag{5.5}$$
if, for any strategy $V \in \mathcal{V}^B$ and every initial position $(t_0, x_0)$ from the domain (5.2), the system of inequalities
$$J_i(V) \ge J_i(V^P), \quad i \in \mathbf{N},$$
at least one of which is strict, is inconsistent.
In this section an explicit form will be found for the Pareto-optimal strategy of one class of multicriterial problems (5.5). To obtain this result, let us consider an auxiliary optimal control problem whose solution is a Pareto-optimal strategy for (5.5).
5.2. An Auxiliary Optimal Control Problem
Let us fix a certain set of positive numbers $\beta_1, \ldots, \beta_N$ and consider the matrices
$$C_\beta = \sum_{i=1}^{N} \beta_i C_i, \quad G_\beta(t) = \sum_{i=1}^{N} \beta_i G_i(t), \quad \mathcal{D}_\beta = \sum_{i=1}^{N} \beta_i \mathcal{D}_i \tag{5.6}$$
and the scalar variable
$$y(t) = \sum_{i=1}^{N} \beta_i y_i(t), \tag{5.7}$$
3. Pareto Optimnlity
where yi(t)=
it {x’(z)G,(t)x(t)+ u’[z]~~u[T]}dz.
J to
In light of (5.6)-(5.8) the variable y(t) is obviously a solution of the scalar differential equation j = x’Gp(t)x
+0’9p~,
Y(to)= 0
(5.9)
and the functional (cf. (5.3) and (5.4)) is given as N
I ( V )=
1 p i ~ i (=~Xye)c,x(o) ) + yp).
i= 1
(5.10)
An auxiliary optimal control problem may now be formulated. Find a strategy V oE V Bsuch that for any initial position (to,xo) from the domain (5.21, max I ( V ) = I( V o ) V € P
(5.11)
under the constraints i= A(t)x
+ B(t)u,
x(to)= xo,
+
j = x’GB(t)x u ‘ ~ ~ u , y(to)= 0.
(5.12) (5.13)
By the definition of the set V Bof strategies r! for any V + u(t, x ) from V B the system (5.12) with u = u(t,x) and initial condition x(tO)= x o (from the domain (5.2)) has a unique solution x(t) = x(t, to, xo, V ) , to d t d 0, which is extendible to the closed interval [to, 01. Let us find A t ) as the absolutely continuous function by the formula y(t) =
lo
+
{ x ’ ( z ) G p ( z ) ~ ( ~v’[T]~~u[z]} ) dz.
This function is defined at any t E [to,01 and, at nearly any t E [to,01,satisfies the system (5.13) with initial condition y(to)= 0. Therefore, for any strategies V E V and ~ any x(to) = xo from (5.2), the system of differential equations (5.12), (5.13) has a unique solution (x(t),y(t))extendible to the closed interval [to, 01-
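The augmented system (5.12), (5.13) and the identity in (5.10) can be illustrated by a simple numerical integration. The sketch below uses a scalar plant ($n = q = 1$), two criteria, a linear feedback $v = kx$ (which satisfies the Lipschitz condition), and an explicit Euler scheme; all numerical values are hypothetical.

```python
from typing import List, Sequence, Tuple


def simulate(betas: Sequence[float], steps: int = 2000) -> Tuple[List[float], float]:
    """Integrate x, the partial integrals y_i and the weighted integral y by Euler steps.

    Returns the criteria values (J_1, J_2) and the scalarized functional I.
    """
    a, b, k = -0.5, 1.0, -0.3              # hypothetical plant dx/dt = a*x + b*v, feedback v = k*x
    c = (-1.0, -0.5)                       # terminal weights C_i
    g = (-1.0, -2.0)                       # state weights G_i
    d = (-1.0, -0.5)                       # control weights D_i
    t0, theta, x0 = 0.0, 1.0, 1.0
    dt = (theta - t0) / steps

    c_beta = sum(bi * ci for bi, ci in zip(betas, c))
    g_beta = sum(bi * gi for bi, gi in zip(betas, g))
    d_beta = sum(bi * di for bi, di in zip(betas, d))

    x, y = x0, 0.0
    y_i = [0.0, 0.0]
    for _ in range(steps):
        v = k * x
        for i in range(2):
            y_i[i] += dt * (g[i] * x * x + d[i] * v * v)   # integrand of (5.8)
        y += dt * (g_beta * x * x + d_beta * v * v)        # right-hand side of (5.13)
        x += dt * (a * x + b * v)                          # right-hand side of (5.12)

    J = [c[i] * x * x + y_i[i] for i in range(2)]
    I = c_beta * x * x + y
    return J, I


betas = (0.7, 0.3)
J, I = simulate(betas)
print(J, I, betas[0] * J[0] + betas[1] * J[1])
```

Because the same discretization is applied to every integral, the scalarized value and the weighted sum of the criteria agree up to rounding error, as (5.10) requires.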
Proposition 5.1. If, for some set of positive numbers $\beta_1, \ldots, \beta_N$, the strategy $V^P \in \mathcal{V}^B$ is a solution of (5.11) under constraints (5.12) and (5.13), then $V^P$ is Pareto-optimal for the multicriterial problem (5.5) (see Definition 5.1).
Proof. We proceed by contradiction. Let the strategy V'E YBbe a solution of (5.1 1) under constraints (5.12) and (5.13) without being Pareto-optimal for problem (5.5). The latter signifies that there is a strategy V* E YBand initial position (to,x o ) from the domain (5.2) such that the system of inequalities
Ji(V*) 3 Ji(Vp),
iE
N,
(5.14)
is simultaneous and such that at least one is a strict inequality. Multiplying every i-th inequality in (5.14) by one of the positive numbers Pi from the condition and summing leads to the inequality
In the notation of (5.10) and (5.6)-(5.8) this relation has the form I ( V * ) > I( vq.
The latter inequality is incompatible with (5.1 1). This proves Proposition 5.1. Consequently, by means of Proposition 5.1 the search for a Pareto-optimal strategy Vp of problem (5.5) is reducible to that of a search for the optimal control Vp of problem (5.1 1)-(5.13). Therefore, the discussion will concentrate on problem (5.1 1) under conditions (5.12) and (5.13). It will be solved by combining a dynamic programming procedure with the Lyapunov function method proposed by Krasovskiy for optimal control problems. Below we will assume that the following Condition holds.
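Condition 5.1. There are positive numbers $\beta_1, \ldots, \beta_N$ such that the quadratic forms $u'\mathcal{D}_\beta u$ and $x'G_\beta(t)x$ are negative-definite and such that $x'C_\beta x$ is constant-negative (negative-semidefinite).

This condition is satisfied if, for at least one $i \in N$, the matrices $\mathcal{D}_i$, $G_i(t)$, and $C_i$ in (5.4) are negative-definite at $t \in [t_0, \vartheta]$, since the sign-definiteness of a quadratic form is not violated if it is supplemented with any quadratic form with sufficiently small (in magnitude) coefficients (assuming that $C_\beta = C_i + \delta \sum_{j \in N \setminus \{i\}} C_j$, $G_\beta = G_i + \delta \sum_{j \in N \setminus \{i\}} G_j$, and $\mathcal{D}_\beta = \mathcal{D}_i + \delta \sum_{j \in N \setminus \{i\}} \mathcal{D}_j$, where $\delta > 0$ is sufficiently small).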
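The reduction in Proposition 5.1 (a maximizer of a positively weighted sum of the criteria is Pareto-optimal) can be illustrated on a finite static example. The sketch below is only an illustration under stated assumptions: the payoff vectors, the weights `beta`, and the helper names `dominates` and `weighted_sum_maximizer` are hypothetical and do not come from the text.

```python
def dominates(a, b):
    """True if vector a is >= b in every criterion and > b in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def weighted_sum_maximizer(payoffs, beta):
    """Return the payoff vector maximizing sum_i beta_i * J_i over a finite set."""
    return max(payoffs, key=lambda J: sum(b * j for b, j in zip(beta, J)))


# Hypothetical criterion vectors (J_1, J_2), one per candidate strategy.
payoffs = [(-4.0, -1.0), (-2.0, -2.0), (-1.0, -4.0), (-3.0, -3.0), (-2.5, -2.5)]
beta = (1.0, 0.6)  # any strictly positive weights (assumed values)

best = weighted_sum_maximizer(payoffs, beta)
# No other vector dominates the weighted-sum maximizer, i.e. it is Pareto-optimal.
is_pareto = not any(dominates(J, best) for J in payoffs)
print(best, is_pareto)
```

The check mirrors the proof by contradiction: a dominating vector would give a strictly larger weighted sum, which is impossible for the maximizer.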
5.3. Formal Procedure for Obtaining $V^P$

The formal procedure (shown to be justified in Proposition 5.2) of a search for a strategy $V^P \in \mathcal{V}_B$ is as follows. Suppose the numbers $\beta_1, \ldots, \beta_N$ satisfy Condition 5.1. Consider the Bellman-Krasovskiy function

$$\varphi(t, x, y) = x'\Theta(t)x + y, \qquad (5.15)$$

where the $(n \times n)$-dimensional matrix $\Theta(t)$ to be found is assumed to be symmetric for the time being. We set

$$W(t, x, u) = \frac{\partial \varphi}{\partial t} + \left[\frac{\partial \varphi}{\partial x}\right]'[A(t)x + B(t)u] + x'G_\beta(t)x + u'\mathcal{D}_\beta u = 2x'\Theta(t)[A(t)x + B(t)u] + x'\frac{d\Theta(t)}{dt}x + x'G_\beta(t)x + u'\mathcal{D}_\beta u. \qquad (5.16)$$
Recall that $\partial\varphi/\partial x$ denotes a column vector whose components are the partial derivatives of the function $\varphi(t, x, y)$ with respect to the coordinates of the vector $x$. In particular, for (5.15)

$$\frac{\partial \varphi}{\partial x} = 2\Theta(t)x.$$

This relation was used in the derivation of (5.16). Though the function $\varphi(t, x, y)$ depends explicitly on $y$, $W(t, x, u)$ of (5.16) does not. Now let us solve the problem of finding, for every fixed pair $(t, x) \in [0, \vartheta) \times R^n$, a function $u^P(t, x)$ such that

$$\max_{u} W(t, x, u) = W(t, x, u^P(t, x)). \qquad (5.17)$$

Relation (5.17) holds if

$$\left.\frac{\partial W}{\partial u}\right|_{u = u^P(t, x)} = 2B'(t)\Theta(t)x + 2\mathcal{D}_\beta u^P = 0 \qquad (5.18)$$

and if the matrix

$$\frac{\partial^2 W}{\partial u^2} = 2\mathcal{D}_\beta$$

is negative-definite. The latter is true by Condition 5.1. From (5.18) we find $u^P(t, x)$. Hence

$$V^P \div u^P(t, x) = -\mathcal{D}_\beta^{-1}B'(t)\Theta(t)x. \qquad (5.19)$$
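The maximizing property (5.17) of the control (5.19) can also be seen by completing the square. The short derivation below is a sketch that assumes, as in Condition 5.1, that $\mathcal{D}_\beta$ is symmetric and negative-definite; it uses only (5.16) and (5.19).

```latex
% From (5.19): B'(t)\Theta(t)x = -\mathcal{D}_\beta u^P, hence
% 2x'\Theta(t)B(t)u = -2 (u^P)' \mathcal{D}_\beta u.
\begin{aligned}
W(t,x,u) - W(t,x,u^P)
  &= 2x'\Theta(t)B(t)\,(u - u^P) + u'\mathcal{D}_\beta u - (u^P)'\mathcal{D}_\beta u^P \\
  &= -2(u^P)'\mathcal{D}_\beta (u - u^P) + u'\mathcal{D}_\beta u - (u^P)'\mathcal{D}_\beta u^P \\
  &= (u - u^P)'\,\mathcal{D}_\beta\,(u - u^P) \;\le\; 0 .
\end{aligned}
```

Equality holds only at $u = u^P(t, x)$, so the maximizer in (5.17) is unique under this assumption.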
Now we find the matrix $\Theta(t)$ with

$$W(t, x, u^P(t, x)) = 0 \qquad (5.20)$$

and

$$\varphi(\vartheta, x, y) = x'C_\beta x + y \qquad (5.21)$$

for any $x \in R^n$ and $y \in R^1$. Substituting (5.19) in (5.20) and collecting like terms, we have from (5.20)

$$x'\left[\frac{d\Theta}{dt} + \Theta A + A'\Theta - \Theta B \mathcal{D}_\beta^{-1} B'\Theta + G_\beta\right]x = 0.$$

For this identity to hold with any $x \in R^n$ it is sufficient that the matrix $\Theta(t)$ be a solution of the Riccati differential matrix equation

$$\frac{d\Theta}{dt} + \Theta A(t) + A'(t)\Theta - \Theta B(t)\mathcal{D}_\beta^{-1}B'(t)\Theta + G_\beta(t) = 0_{n \times n}. \qquad (5.22)$$

The boundary condition for $\Theta(t)$ is found, by (5.21) and (5.15), in the form

$$\varphi(\vartheta, x, y) = x'\Theta(\vartheta)x + y = x'C_\beta x + y.$$

For the latter condition to hold, it is sufficient that

$$\Theta(\vartheta) = C_\beta. \qquad (5.23)$$

Consequently, if the matrix $\Theta(t)$ is a solution of equation (5.22) with boundary condition (5.23), then, for any $x \in R^n$, $y \in R^1$, and $t \in [0, \vartheta]$, (5.17), (5.20), and (5.21) hold. Finally, by substituting the resultant $\Theta(t)$ in (5.19) we find the explicit form of the Pareto-optimal strategy.

Remark 5.1. If Condition 5.1 is satisfied, the system (5.22) with boundary condition (5.23) has a solution $\Theta(t)$ that is a symmetric matrix, continuous in $t$ and extendible to the closed interval $[0, \vartheta]$. This assertion follows from [49] since the same Riccati matrix equation (5.22), (5.23) is also obtained for the optimal control in the linear quadratic case.
5.4. Theoretical Basis of Algorithm
The search procedure for a Pareto-optimal strategy in problem (5.5) must now be theoretically justified.
Proposition 5.2. If Condition 5.1 holds and a solution $\Theta(t)$ of the system (5.22) with boundary condition (5.23) extendible to the closed interval $[0, \vartheta]$ is found, the Pareto-optimal strategy in problem (5.5) has the form (5.19).

Proof. Let $\Theta(t)$ be a solution of (5.22), (5.23). It exists by Remark 5.1. Let us choose an arbitrary position $(t_0, x_0)$ from the domain (5.2) and, for the system (5.12), (5.13), give the initial conditions as

$$x(t_0) = x_0, \quad y(t_0) = 0.$$

We substitute the matrix $\Theta(t)$ in (5.19) and assume that in the system (5.12), (5.13), $u = u^P(t, x) = -\mathcal{D}_\beta^{-1}B'(t)\Theta(t)x$. With this $u = u^P(t, x)$ the system has the solution

$$(x^P(t), y^P(t)) = (x(t, t_0, x_0, V^P), y(t, t_0, x_0, V^P)) \qquad (5.24)$$

extendible to the closed interval $[t_0, \vartheta]$ (in (5.24) the strategy is $V^P \div u^P(t, x)$ from (5.19)). Indeed, with $u = u^P(t, x)$ the system (5.12) has the continuous solution $x^P(t) = x(t, t_0, x_0, V^P)$ (as a system of ordinary differential equations with continuous coefficients [91, p. 22]). Then there exists a scalar variable $y^P(t) = y(t, t_0, x_0, V^P)$ that satisfies the system (5.13) and is defined at every $t \in [t_0, \vartheta]$. Let us now consider the function $\varphi(t, x, y)$ of (5.15), where $\Theta(t)$ is the given solution of the system (5.22), (5.23). Then at every $t \in [t_0, \vartheta]$,

$$\frac{d\varphi(t, x^P(t), y^P(t))}{dt} = W(t, x^P(t), u^P(t, x^P(t))) = 0 \qquad (5.25)$$

since if $\Theta(t)$ is a solution of (5.22), (5.23), then $W(t, x, u^P(t, x)) = 0$ for any $x \in R^n$ and at every $t \in [0, \vartheta]$, in particular, at $x = x^P(t)$. Relation (5.25) asserts that the function $\varphi(t, x^P(t), y^P(t))$ remains constant at every $t \in [t_0, \vartheta]$. Hence

$$\varphi(t_0, x_0, 0) = \varphi(\vartheta, x^P(\vartheta), y^P(\vartheta)). \qquad (5.26)$$

By virtue of identity (5.21) and (5.10), (5.23), and (5.15),

$$\varphi(\vartheta, x^P(\vartheta), y^P(\vartheta)) = [x^P(\vartheta)]'C_\beta x^P(\vartheta) + y^P(\vartheta) = [x^P(\vartheta)]'C_\beta x^P(\vartheta) + \int_{t_0}^{\vartheta} \{ [x^P(t)]'G_\beta(t)x^P(t) + [u^P[t]]'\mathcal{D}_\beta u^P[t] \} \, dt = I(V^P). \qquad (5.27)$$

From (5.26) and (5.27) it follows that

$$\varphi(t_0, x_0, 0) = I(V^P). \qquad (5.28)$$
Now let $V^* \div u^*(t, x)$ be some strategy from $\mathcal{V}_B$. By the constraints imposed on the set of strategies, the system (5.12), (5.13) with the above initial conditions and with $u = u^*(t, x)$ has the unique solution $(x^*(t), y^*(t)) = (x(t, t_0, x_0, V^*), y(t, t_0, x_0, V^*))$, $t_0 \le t \le \vartheta$, extendible to the closed interval $[t_0, \vartheta]$. By (5.17) and (5.20), with any $x \in R^n$ at any $t \in [0, \vartheta]$ the following inequality holds:

$$W(t, x, u^*(t, x)) \le 0.$$

For this reason

$$\frac{d\varphi(t, x^*(t), y^*(t))}{dt} = W(t, x^*(t), u^*(t, x^*(t))) \le 0 \qquad (5.29)$$

at almost every $t \in [t_0, \vartheta]$. By (5.29), along the solution $(x^*(t), y^*(t))$ of the system (5.12), (5.13), the function $\varphi(t, x^*(t), y^*(t))$ cannot increase over time. Consequently, in particular,

$$\varphi(t_0, x_0, 0) \ge \varphi(\vartheta, x^*(\vartheta), y^*(\vartheta)). \qquad (5.30)$$

By (5.21), (5.23) and (5.10),

$$\varphi(\vartheta, x^*(\vartheta), y^*(\vartheta)) = [x^*(\vartheta)]'C_\beta x^*(\vartheta) + \int_{t_0}^{\vartheta} \{ [x^*(t)]'G_\beta(t)x^*(t) + [u^*(t, x^*(t))]'\mathcal{D}_\beta u^*(t, x^*(t)) \} \, dt = I(V^*). \qquad (5.31)$$

From (5.30) and (5.31) it follows that

$$\varphi(t_0, x_0, 0) \ge I(V^*). \qquad (5.32)$$

Finally, combining relations (5.28) and (5.32) yields $I(V^*) \le I(V^P)$ for any strategy $V^* \in \mathcal{V}_B$. This is equivalent to (5.11). Consequently, the strategy $V^P \in \mathcal{V}_B$ of (5.19) guarantees that equality (5.11) holds under the constraints (5.12) and (5.13). Then, following Proposition 5.1, the strategy (5.19) is Pareto-optimal for the multicriterial problem (5.5). This proves Proposition 5.2.

Algorithm. Proposition 5.2 leads to a practical method of finding a Pareto-optimal strategy $V^P \in \mathcal{V}_B$ in the multicriterial linear quadratic problem (5.5).
(1) Find positive numbers $\beta_1, \ldots, \beta_N$ for which Condition 5.1 is fulfilled.
(2) Find a solution $\Theta(t)$, $0 \le t \le \vartheta$, of the system (5.22), (5.23). This solution $\Theta(t)$ will obviously be a symmetric matrix.
(3) Finally, represent the Pareto-optimal strategy in the form (5.19).
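The steps above can be sketched numerically in the scalar case $n = 1$, where (5.22) becomes $\dot\Theta + 2a\Theta - (b^2/d)\Theta^2 + g = 0$ with $\Theta(\vartheta) = c$. This is a minimal sketch, not the authors' method: the coefficients `a, b, d, g, c, t_end` are illustrative assumptions (with `d`, `g`, `c` negative, as Condition 5.1 requires), and the classical Runge-Kutta scheme is simply one convenient way of integrating backward from the terminal condition.

```python
def riccati_rhs(theta, a, b, d, g):
    """Scalar (5.22) solved for dTheta/dt."""
    return -(2.0 * a * theta - (b * b / d) * theta * theta + g)


def solve_riccati_backward(a, b, d, g, c, t_end, steps=2000):
    """Classical RK4 from t = t_end (where Theta = c) down to t = 0.

    Returns (t, Theta(t)) pairs ordered by increasing t.
    """
    h = -t_end / steps  # negative step: integrate backward in time
    t, theta = t_end, c
    path = [(t, theta)]
    for _ in range(steps):
        k1 = riccati_rhs(theta, a, b, d, g)
        k2 = riccati_rhs(theta + 0.5 * h * k1, a, b, d, g)
        k3 = riccati_rhs(theta + 0.5 * h * k2, a, b, d, g)
        k4 = riccati_rhs(theta + h * k3, a, b, d, g)
        theta += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
        path.append((t, theta))
    path.reverse()
    return path


def feedback(theta, b, d, x):
    """Scalar (5.19): u^P(t, x) = -(1/d) * b * Theta(t) * x."""
    return -b * theta * x / d


def w_function(theta, a, b, d, g, x, u):
    """Scalar (5.16), with dTheta/dt taken from the Riccati equation."""
    theta_dot = riccati_rhs(theta, a, b, d, g)
    return (theta_dot * x * x + 2.0 * theta * x * (a * x + b * u)
            + g * x * x + d * u * u)


# Illustrative coefficients (assumptions, not taken from the text).
a, b, d, g, c, t_end = 0.3, 1.0, -1.0, -1.0, -2.0, 1.0
path = solve_riccati_backward(a, b, d, g, c, t_end)
theta0 = path[0][1]
print("Theta at the initial time:", round(theta0, 4))
print("feedback gain at the initial time:", round(feedback(theta0, b, d, 1.0), 4))
```

With $\Theta(t)$ in hand, `w_function` lets one check (5.17) and (5.20) pointwise: the value at the feedback control is zero (up to rounding) and no other control gives a positive value.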
5.5. Exact Solution of System (5.22), (5.23)
The most difficult step in our analytic derivation of a Pareto-optimal strategy is the solution of the system (5.22), (5.23). One particular case of an exact solution is identified by the next condition.

Condition 5.2. Suppose there exist positive numbers $\beta_1, \ldots, \beta_N$ such that the quadratic forms $x'C_\beta x$ and $u'\mathcal{D}_\beta u$ are negative-definite while

$$G_\beta = \sum_{i=1}^{N} \beta_i G_i = 0_{n \times n}.$$
Consider the matrix differential equation

$$\dot{Z} = A(t)Z, \quad Z(t_0) = Z_0, \qquad (5.33)$$

where $Z_0$ is a nonsingular (nondegenerate) fixed real matrix. Then the system (5.33) has a unique solution $Z(t)$ extendible to $[t_0, \vartheta]$ as a solution of a system of linear homogeneous equations with continuous coefficients (the elements of $A(t)$ are continuous). Such a matrix $Z(t)$ is fundamental in that its columns form $n$ linearly independent solutions of the system $\dot{z} = A(t)z$, $z \in R^n$. Therefore, at every $t \in [t_0, \vartheta]$ there exists an inverse matrix $Z^{-1}(t)$. Let us introduce an auxiliary matrix

$$Y(\vartheta, t) = Z(\vartheta)Z^{-1}(t) \qquad (5.34)$$

at every $t \in [t_0, \vartheta]$. The properties of the matrix $Y(\vartheta, t)$ follow from its definition, specifically:
(1) $Y(t, t) = Z(t)Z^{-1}(t) = E_n$, an identity $(n \times n)$-dimensional matrix.
(2) The matrix $Y(\vartheta, t)$ is nonsingular since, at every $t \in [t_0, \vartheta]$, there exists an inverse matrix $Y^{-1}(\vartheta, t)$.
(3) $Y^{-1}(\vartheta, t) = Y(t, \vartheta)$.

From (5.34),

$$\frac{\partial Y(\vartheta, t)}{\partial t} = Z(\vartheta)\frac{dZ^{-1}(t)}{dt} = -Z(\vartheta)Z^{-1}(t)\dot{Z}(t)Z^{-1}(t) = -Y(\vartheta, t)A(t)Z(t)Z^{-1}(t) = -Y(\vartheta, t)A(t).$$

Furthermore, we are using the fact that

$$\frac{\partial Y(\vartheta, t)}{\partial t} = -Y(\vartheta, t)A(t), \quad \frac{\partial Y'(\vartheta, t)}{\partial t} = -A'(t)Y'(\vartheta, t).$$
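Proposition 5.3. If Condition 5.2 holds, the system (5.22) with boundary condition (5.23) has the solution

$$\Theta(t) = Y'(\vartheta, t)\left\{ C_\beta^{-1} + \int_{t}^{\vartheta} Y(\vartheta, \tau)B(\tau)\mathcal{D}_\beta^{-1}B'(\tau)Y'(\vartheta, \tau) \, d\tau \right\}^{-1} Y(\vartheta, t). \qquad (5.35)$$

Proof. Assuming in (5.35) that $t = \vartheta$, we have $\Theta(\vartheta) = Y'(\vartheta, \vartheta)C_\beta Y(\vartheta, \vartheta) = C_\beta$, so the boundary condition (5.23) holds. Operations with the matrices give (for brevity, the arguments are omitted in obvious cases and the bracketed expression taken from (5.35) is denoted by dots)

$$\frac{d\Theta}{dt} = \frac{\partial Y'}{\partial t}\{\ldots\}^{-1}Y + Y'\frac{d\{\ldots\}^{-1}}{dt}Y + Y'\{\ldots\}^{-1}\frac{\partial Y}{\partial t} = -A'Y'\{\ldots\}^{-1}Y - Y'\{\ldots\}^{-1}\left[\frac{d}{dt}\{\ldots\}\right]\{\ldots\}^{-1}Y - Y'\{\ldots\}^{-1}YA.$$

Since $\frac{d}{dt}\{\ldots\} = -YB\mathcal{D}_\beta^{-1}B'Y'$, the right-hand side equals $-A'\Theta + \Theta B\mathcal{D}_\beta^{-1}B'\Theta - \Theta A$, which is (5.22) with $G_\beta = 0_{n \times n}$. Consequently, the matrix $\Theta(t)$ defined by equality (5.35) is a solution of the system (5.22) with boundary condition (5.23).

Remark 5.2. As shown above, if for at least one $i \in N$ the matrices $\mathcal{D}_i$, $G_i(t)$, and $C_i$ are negative-definite, then (if the solution $\Theta(t)$ of (5.22), (5.23) is extendible to $[0, \vartheta]$) there exists a Pareto-optimal strategy in problem (5.5). If, however, all $\mathcal{D}_i$, $i \in N$, are positive-definite, in problem (5.5), where the set $\mathcal{V}_B$ is replaced by $\mathcal{V}_L$, there exists no Pareto-optimal strategy (with $\|x_0\| \ne 0$). Indeed, because any Pareto-optimal strategy of problem (5.5) is also Slater-optimal, from Proposition 3.2 of Chapter 2 there follows the next assertion.

Proposition 5.4. If $\mathcal{D}_i > 0$, $i \in N$, in the multicriterial problem (5.5),

$$\langle \Sigma \div (5.1), \mathcal{V}_L, J \div (5.3) \rangle,$$

then for any initial position $(t_0, x_0) \in [0, \vartheta) \times R^n$, $\|x_0\| \ne 0$, there exists no Pareto-optimal strategy.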
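Returning to Proposition 5.3, the closed-form solution can be checked numerically in the scalar case. The sketch below assumes $n = 1$, constant coefficients and $G_\beta = 0$, so that $Y(\vartheta, t) = e^{a(\vartheta - t)}$ and the solution takes the form $\Theta(t) = Y^2 / \big(1/c + (b^2/d)\int_t^{\vartheta} Y^2(\vartheta, \tau)\,d\tau\big)$. The function names and the numerical values are illustrative assumptions, and the residual of (5.22) is estimated by a central finite difference.

```python
import math


def theta_closed_form(t, a, b, d, c, t_end):
    """Scalar closed-form solution of (5.22), (5.23) with G_beta = 0."""
    y = math.exp(a * (t_end - t))  # Y(t_end, t) for constant scalar a
    if a != 0.0:
        integral = (math.exp(2.0 * a * (t_end - t)) - 1.0) / (2.0 * a)
    else:
        integral = t_end - t
    return y * y / (1.0 / c + (b * b / d) * integral)


def riccati_residual(t, a, b, d, c, t_end, h=1e-5):
    """Left-hand side of scalar (5.22) with g = 0 (derivative by central difference)."""
    th = theta_closed_form(t, a, b, d, c, t_end)
    th_dot = (theta_closed_form(t + h, a, b, d, c, t_end)
              - theta_closed_form(t - h, a, b, d, c, t_end)) / (2.0 * h)
    return th_dot + 2.0 * a * th - (b * b / d) * th * th


# Illustrative coefficients (assumptions): c and d negative as in Condition 5.2.
a, b, d, c, t_end = 0.3, 1.0, -1.0, -2.0, 1.0
for t in (0.0, 0.25, 0.5, 0.75):
    print(t, theta_closed_form(t, a, b, d, c, t_end),
          riccati_residual(t, a, b, d, c, t_end))
```

Because $c$ and $d$ are negative, the denominator never vanishes, which is the scalar counterpart of the invertibility of the bracketed matrix.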
6. Comparison to P1-Optimality

6.1. Definition

One of the present authors (V.I.Z.) has introduced [102, p. 45] the notion of P1-optimality for the multicriterial problem (1.1, Chapter 2), specifically for

$$\langle \Sigma, \mathcal{V}, \mathbb{F}(x[\vartheta]) \rangle$$

with Conditions 1.1 and 1.2 of Chapter 2 satisfied.

Definition 6.1. The strategy $V^{P1} \in \mathcal{V}$ is said to be P1-maximal in problem (1.1, Chapter 2) if there exists no strategy $V \in \mathcal{V}$ for which the system of inequalities

$$\min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V]) \ge \min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V^{P1}]), \quad i \in N, \qquad (6.1)$$

holds with at least one of the inequalities being strict. The set of P1-maximal strategies will be denoted $\mathcal{V}^{P1}$.
Remark 6.1. The following definition is equivalent to Definition 6.1: The strategy $V^{P1} \in \mathcal{V}$ is P1-maximal in problem (1.1, Chapter 2) if, for any other strategy $V \in \mathcal{V}$, either

$$\min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V]) = \min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V^{P1}]), \quad i \in N,$$

or there exists a subscript $i_0(V) = i_0 \in N$ such that

$$\min_{x[\cdot]} F_{i_0}(x[\vartheta, t_0, x_0, V]) < \min_{x[\cdot]} F_{i_0}(x[\vartheta, t_0, x_0, V^{P1}]).$$

In the case $N = 2$ the P1-maximal strategy may be geometrically interpreted as in Fig. 3.6.1. Let $V^{P1}$ be the P1-maximal strategy for problem (1.1, Chapter 2) with $N = 2$ and $V$ any other strategy. Let us plot the points $\mathbb{F} = (F_1^{\min}[V], F_2^{\min}[V])$, where

$$F_i^{\min}[V] = \min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V]), \quad i = 1, 2.$$

Then, by Definition 6.1, point A with coordinates $(F_1^{\min}[V], F_2^{\min}[V])$ remains outside the shaded angular region G with vertex $(F_1^{\min}[V^{P1}], F_2^{\min}[V^{P1}])$ (Fig. 3.6.1). In the extreme case, the point $(F_1^{\min}[V], F_2^{\min}[V])$ may coincide only with the point $(F_1^{\min}[V^{P1}], F_2^{\min}[V^{P1}])$. This is true of any two strategies $V$ and $V^{P1}$, one of which is P1-maximal.
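Definition 6.1 compares strategies through their guaranteed (componentwise minimal) criterion values. For a finite collection of strategies, each producing a finite set of terminal criterion vectors, this comparison can be sketched as follows. The strategy names, the outcome sets, and the helper functions are hypothetical; a finite set stands in for the bunch of quasimotions only for illustration.

```python
def guaranteed_vector(outcomes):
    """Componentwise minimum over all criterion vectors a strategy can produce."""
    return tuple(min(column) for column in zip(*outcomes))


def dominates(a, b):
    """True if a >= b in every criterion and a > b in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def p1_maximal(strategies):
    """Names of strategies whose guaranteed vector is dominated by no other one."""
    g = {name: guaranteed_vector(outs) for name, outs in strategies.items()}
    return sorted(name for name in g
                  if not any(dominates(g[other], g[name]) for other in g))


# Hypothetical outcome sets {F(x[theta])}, one set per strategy.
strategies = {
    "singleton_a": [(0.5, 0.5)],
    "singleton_b": [(0.8, 0.2)],
    "multivalued": [(0.3, 0.7), (0.7, 0.3)],  # guaranteed vector (0.3, 0.3)
    "interior": [(0.4, 0.4)],
}
print(p1_maximal(strategies))
```

In this toy data the multivalued strategy is excluded because its guaranteed point (0.3, 0.3) is dominated by a singleton strategy, even though each of its individual outcomes is undominated.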
6.2. Properties of P1-Maximal Strategy

P1-maximality is a cross between the principle of guaranteed result devised by Germier [30] and Pareto optimality [64]. By this principle, optimality is defined for the "worst," or minimal, values of the criteria (the components $F_i(x)$ of the criterion vector $\mathbb{F}(x[\vartheta])$ range over the set $X[\vartheta, t_0, x_0, V]$ of all right ends of the quasimotions $x[\cdot, t_0, x_0, V]$). The sets $\mathbb{F}(X[\vartheta, t_0, x_0, V])$ are not comparable (for distinct strategies $V \in \mathcal{V}$) but every such set is associated with the point

$$\Big( \min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V]), \ i \in N \Big) \qquad (6.2)$$

of minimal values of the criteria over $X[\vartheta, t_0, x_0, V]$. Moreover, the strategies $V \in \mathcal{V}$ are compared by these minimal points (6.2), which are, generally speaking, beyond the reach of the quasimotions $x[\cdot, t_0, x_0, V]$, so that the quasimotions themselves are, in effect, disregarded. This apparent shortcoming of Definition 6.1 is overcome by the following property.

Figure 3.6.1. [Figure not reproduced: axes $F_1$, $F_2$; marked values $F_i^{\min}[V]$ and $F_i^{\min}[V^{P1}]$.]
Property 6.1. With $N = 1$ (in the optimal control problem $\langle \Sigma, \mathcal{V}, F_1(x[\vartheta]) \rangle$) the P1-optimal strategy $V^{P1}$ coincides with the maximin strategy, i.e.,

$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} F_1(x[\vartheta, t_0, x_0, V]) = \min_{x[\cdot]} F_1(x[\vartheta, t_0, x_0, V^{P1}]).$$

This equality follows directly from Remark 6.1.

Property 6.2. If in problem (1.1, Chapter 2) only "singleton" strategies are used (that is, strategies $V$ such that every bunch of quasimotions $\mathcal{X}[t_0, x_0, V]$ is associated with a unique value of the goal functional vector $\mathbb{F}(x[\vartheta, t_0, x_0, V])$), so that

$$F_i(x[\vartheta, t_0, x_0, V]) = \min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V]) = \max_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V]), \quad i \in N,$$

then the P1-optimal strategy $V^{P1}$ is Pareto-maximal. Specifically, for any $V \in \mathcal{V}$ the system of inequalities

$$F_i(x[\vartheta, t_0, x_0, V]) \ge F_i(x[\vartheta, t_0, x_0, V^{P1}]), \quad \forall i \in N,$$

at least one of which is a strict inequality, is inconsistent.

Properties 6.1 and 6.2 show that the notion of P1-optimality is fairly general in that it includes as special cases the notions of Pareto maximum and maximin.

Property 6.3. The set of P1-maximal strategies is internally stable, that is, for any $V^{(1)}$ and $V^{(2)}$ from $\mathcal{V}^{P1}$, the system of inequalities

$$\min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V^{(1)}]) \ge \min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V^{(2)}]), \quad i \in N,$$

at least one of which is a strict inequality, is inconsistent.

This property follows directly from Definition 6.1 since $V^{(1)} \in \mathcal{V}$.

Property 6.4. The set of P1-optimal strategies is externally stable, that is, for every strategy $V \in \mathcal{V}$ there is a strategy $V^{P1} \in \mathcal{V}^{P1}$ of its own such that

$$\min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V]) \le \min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V^{P1}]), \quad i \in N.$$

Property 6.5. The P1-maximal strategy $V^{P1}$ of problem (1.1, Chapter 2) with initial position $(t_0, x_0)$ is dynamically stable, i.e., $V^{P1}$ remains P1-maximal for problem (1.1, Chapter 2) with fixed current initial position $(t, x[t, t_0, x_0, V^{P1}])$ at any $t \in [t_0, \vartheta]$ for every quasimotion $x[\cdot, t_0, x_0, V^{P1}]$.
Properties 6.2 and 6.5 may be proved in the same way as Properties 1.5 and 1.6 of Chapter 3 with $F_i(x[\vartheta, t_0, x_0, V])$ replaced by $\min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V])$.

To consider the structure of the P1-optimal solutions of (1.1) from Chapter 2, let us look at two sets. The first set, $X^{P1}$, is the set of all points $x$ from the reachability domain $X[\vartheta, t_0, x_0, V \div Q]$ that may be "reached" at time $t = \vartheta$ only through quasimotions $x[\cdot, t_0, x_0, V^{P1}]$ as the strategy $V^{P1}$ "scans" the entire set $\mathcal{V}^{P1}$ of P1-optimal strategies, or

$$X^{P1} = \bigcup_{V \in \mathcal{V}^{P1}} X[\vartheta, t_0, x_0, V].$$

The other set $X_P$ was discussed in section 10.1 and consists of all points $x_P \in X[\vartheta, t_0, x_0, V \div Q]$ that are Pareto-maximal (effective) in the multicriterial static problem

$$\langle X[\vartheta, t_0, x_0, V \div Q], \mathbb{F}(x) \rangle,$$

i.e., here the inequalities $F_i(x) \ge F_i(x_P)$, $i \in N$, $x \in X[\vartheta, t_0, x_0, V \div Q]$, are false unless $F_i(x) = F_i(x_P)$, $i \in N$.

Proposition 6.1. If Conditions 1.1 and 1.2 from Chapter 2 are satisfied, the sets $X_P$ and $X^{P1}$ coincide. This Proposition was proved in [116, pp. 199-201].
Corollary 6.1. If Conditions 1.1 and 1.2 of Chapter 2 are satisfied, in problem (1.1, Chapter 2) there will exist a P1-maximal strategy for any choice of initial position $(t_0, x_0) \in [0, \vartheta) \times R^n$.

Indeed, if Conditions 1.1 and 1.2 in Chapter 2 are satisfied, the set $X[\vartheta, t_0, x_0, V \div Q]$ is closed and bounded (compact) in $R^n$. Let us choose some Pareto-maximal point $x_P$ of the static multicriterial problem

$$\langle X[\vartheta, t_0, x_0, V \div Q], \mathbb{F}(x) \rangle.$$

For this it is sufficient, for instance, to find $x_P$ such that

$$\sum_{i=1}^{N} \beta_i F_i(x_P) = \max_{x \in X[\vartheta, t_0, x_0, V \div Q]} \sum_{i=1}^{N} \beta_i F_i(x)$$

for some fixed set of constant positive numbers $\beta_1, \ldots, \beta_N$. In the latter equality the maximum is attained since $X[\vartheta, t_0, x_0, V \div Q]$ is compact and the scalar functions $F_i(x)$, $i \in N$, are continuous (over this set). By Corollary 4.1 (Chapter 1), there exists a strategy $V^{P1} \in \mathcal{V}$ such that

$$x_P = x[\vartheta, t_0, x_0, V^{P1}]$$

for every quasimotion $x[\cdot, t_0, x_0, V^{P1}]$. This strategy $V^{P1}$ is P1-maximal for problem (1.1, Chapter 2) (Proposition 6.1). Consequently, the structure and properties of P1-maximal strategies are the same as those of Pareto-optimal strategies discussed earlier in this chapter, the only difference being that of uniqueness.
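6.3. Uniqueness of Values of the Goal Functional Vector

Theorem 6.1. Every P1-maximal strategy $V^{P1}$ of the multicriterial problem (1.1, Chapter 2) with fixed initial position $(t_0, x_0) \in [0, \vartheta) \times R^n$ is associated with a unique value of the goal functional $F_i(x[\vartheta, t_0, x_0, V^{P1}])$, $i \in N$, that is,

$$\max_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V^{P1}]) = \min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V^{P1}]), \quad i \in N.$$

Proof. We prove the theorem by contradiction. Assume that there is an ordinal number $i_0 \in N$ and a P1-optimal strategy $V^{P1}$ such that

$$\max_{x[\cdot]} F_{i_0}(x[\vartheta, t_0, x_0, V^{P1}]) > \min_{x[\cdot]} F_{i_0}(x[\vartheta, t_0, x_0, V^{P1}]). \qquad (6.4)$$

The function $F_{i_0}(x)$ is continuous (Condition 1.2, Chapter 2) and the set $X[\vartheta, t_0, x_0, V^{P1}]$ is closed and bounded in $R^n$ (Proposition 2.3, Chapter 1). Then there exist quasimotions $x^*[\cdot, t_0, x_0, V^{P1}]$ and $x_*[\cdot, t_0, x_0, V^{P1}]$ from the bundle $\mathcal{X}[t_0, x_0, V^{P1}]$ such that

$$F_{i_0}(x^*[\vartheta, t_0, x_0, V^{P1}]) = \max_{x[\cdot]} F_{i_0}(x[\vartheta, t_0, x_0, V^{P1}])$$

and

$$F_{i_0}(x_*[\vartheta, t_0, x_0, V^{P1}]) = \min_{x[\cdot]} F_{i_0}(x[\vartheta, t_0, x_0, V^{P1}]). \qquad (6.5)$$

By Corollary 4.1 in Chapter 1 there is a strategy $V^* \in \mathcal{V}$ such that

$$x^*[\vartheta, t_0, x_0, V^{P1}] = x[\vartheta, t_0, x_0, V^*]$$

for any quasimotion $x[\cdot, t_0, x_0, V^*]$. Therefore, at every $j \in N$,

$$F_j(x^*[\vartheta, t_0, x_0, V^{P1}]) = F_j(x[\vartheta, t_0, x_0, V^*]) = \min_{x[\cdot]} F_j(x[\vartheta, t_0, x_0, V^*]). \qquad (6.6)$$

Hence, in light of (6.4) and (6.5),

$$\min_{x[\cdot]} F_{i_0}(x[\vartheta, t_0, x_0, V^*]) > \min_{x[\cdot]} F_{i_0}(x[\vartheta, t_0, x_0, V^{P1}]). \qquad (6.7)$$

The inequality (6.7) and the P1-optimality of the strategy $V^{P1}$ entail (see Definition 6.1 and Remark 6.1) the existence of a subscript $j \in N$ such that

$$\min_{x[\cdot]} F_j(x[\vartheta, t_0, x_0, V^{P1}]) > \min_{x[\cdot]} F_j(x[\vartheta, t_0, x_0, V^*]). \qquad (6.8)$$

Bearing in mind (6.8) and (6.6), we obtain

$$\min_{x[\cdot]} F_j(x[\vartheta, t_0, x_0, V^{P1}]) > \min_{x[\cdot]} F_j(x[\vartheta, t_0, x_0, V^*]) = F_j(x^*[\vartheta, t_0, x_0, V^{P1}]).$$

The resultant inequality

$$\min_{x[\cdot]} F_j(x[\vartheta, t_0, x_0, V^{P1}]) > F_j(x^*[\vartheta, t_0, x_0, V^{P1}])$$

is inconsistent since the quasimotion $x^*[\cdot, t_0, x_0, V^{P1}] \in \mathcal{X}[t_0, x_0, V^{P1}]$.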
6.4. Relationship between $\mathcal{V}^P$ and $\mathcal{V}^{P1}$

Recall that $\mathcal{V}^P$ is the set of Pareto-optimal strategies in the multicriterial problem (1.1, Chapter 2) (Definition 1.1, Chapter 3) and $\mathcal{V}^{P1}$ the set of P1-optimal strategies in the same problem (Definition 6.1). By Propositions 3.1 and 6.1, the structures of Pareto- and P1-maximal strategies are such that

$$X^P = X_P \quad \text{and} \quad X^{P1} = X_P.$$

Therefore the sets

$$X^P = X^{P1} = X_P.$$

Here

$$X^P = \bigcup_{V \in \mathcal{V}^P} X[\vartheta, t_0, x_0, V] \quad \text{and} \quad X^{P1} = \bigcup_{V \in \mathcal{V}^{P1}} X[\vartheta, t_0, x_0, V],$$

that is, the sets of right ends (at $t = \vartheta$) of quasimotions generated by Pareto- and P1-maximal strategies coincide. Consequently, any P1-maximal strategy is simultaneously Pareto-maximal. Now we have the next assertion.

Proposition 6.2.

$$\mathcal{V}^{P1} \subset \mathcal{V}^P. \qquad (6.9)$$
These sets do not, however, necessarily coincide since the P1-maximal strategies $V^{P1}$ are, by Theorem 6.1, only "singletons." Specifically, these strategies $V^{P1} \in \mathcal{V}^{P1}$ generate only bunches $\mathcal{X}[t_0, x_0, V^{P1}]$ of quasimotions $x[\cdot, t_0, x_0, V^{P1}]$ such that for every $V^{P1} \in \mathcal{V}^{P1}$ the set $\mathbb{F}(X[\vartheta, t_0, x_0, V^{P1}])$ in the criterial space $R^N$ is a point (rather than a set), while the values of $\mathbb{F}(x[\vartheta, t_0, x_0, V^P])$ may, for some Pareto-maximal strategies $V^P$, also form a set (not a point) in $R^N$.

Example 6.2. Let the system (1.2) from Chapter 2 have the form

$$\dot{x} = u, \quad x[0] = 0_2, \quad 0 \le t \le 1,$$

where $x = (x_1, x_2)$, $u = (u_1, u_2)$ and the set

$$Q = \{ u = (u_1, u_2) \mid u_1 + u_2 \le 1, \ u_i \ge 0, \ i = 1, 2 \}.$$

Consequently, the initial position $(t_0, x_0) = (0, 0_2)$. The goal functional vector is

$$\mathbb{F}(x[\vartheta]) = (F_1(x[\vartheta]), F_2(x[\vartheta])) = (x_1[1], x_2[1]).$$

The set $X[\vartheta, t_0, x_0, V \div Q]$ is shaded in Fig. 3.6.2 and the set $\mathbb{F}(X[\vartheta, t_0, x_0, V \div Q])$ is of the same form but lies in the criterial space $\{F_1, F_2\}$. Then the P1-maximal strategies are

$$V^{P1} \div u^{P1}(t, x) = \{ (u_1, u_2) \mid u_1 = 1 - u_2, \ u_2 = \mathrm{const} \ge 0, \ u_1 \ge 0 \}.$$

These strategies generate quasimotions $x[\cdot, t_0, x_0, V^{P1}]$ that lead to "pointwise" values $\mathbb{F}(X[\vartheta, t_0, x_0, V^{P1}])$ represented by the heavy line in Fig. 3.6.2. Moreover, by Proposition 6.1 the strategies that lead to such pointwise values (and only those strategies) make up the entire set $\mathcal{V}^{P1}$. Now let us consider a "multivalent" strategy $V^P \in \mathcal{V}$ such that the associated set $\mathbb{F}(X[\vartheta, t_0, x_0, V^P])$ is a bounded segment of AB, represented by the heavy double line in Fig. 3.6.2. This strategy is not P1-maximal, for the point

$$\Big( \min_{x[\cdot]} F_1(x[\vartheta, t_0, x_0, V^P]), \ \min_{x[\cdot]} F_2(x[\vartheta, t_0, x_0, V^P]) \Big)$$

remains (strictly) within the shaded triangle and there is a "singleton" strategy $V^{P1} \in \mathcal{V}$ such that (1) $\mathbb{F}(X[\vartheta, t_0, x_0, V^{P1}])$ is a point K that remains on AB and (2) the following strict inequality holds:

$$F_i(x[\vartheta, t_0, x_0, V^{P1}]) > \min_{x[\cdot]} F_i(x[\vartheta, t_0, x_0, V^P]), \quad i = 1, 2,$$

for every quasimotion $x[\cdot, t_0, x_0, V^{P1}]$. At the same time this strategy $V^P$ is Pareto-maximal by Proposition 3.1. Consequently, $V^P \in \mathcal{V}^P$ and $V^P \notin \mathcal{V}^{P1}$; thus the inclusion (6.9) does not generally become an equality.

Figure 3.6.2.

If, however, in formalizing the strategies $V$ in problem (1.1, Chapter 2) only "singleton" strategies are used, that is, strategies such that for any quasimotion $x[\cdot, t_0, x_0, V]$ the set $\mathbb{F}(X[\vartheta, t_0, x_0, V])$ is a point in the criterial space $R^N$, (6.9) does become an equality, or, in this case,

$$\mathcal{V}^{P1} = \mathcal{V}^P.$$
P1-maximal strategies and other types of similar solutions of problem (1.1, Chapter 2) have been thoroughly analyzed in [102, 116].

In concluding Chapters 2 and 3 note that in analyzing multicriterial dynamic problems only programmed (time-dependent only) strategies were used. Our justification is that these chapters are largely auxiliary and that the findings are employed below in devising Slater- and Pareto-maximin strategies. Moreover, numerous arguments in favor of feedback have been provided in the Introduction. Because of certain features of Slater- and Pareto-maximal strategies in (1.1, Chapter 2), these solutions may appear to be less advantageous. Those features must be recognized if these optimal strategies are used as solutions of (1.1, Chapter 2). These features are:

(a) The set of both Slater- and Pareto-optimal strategies includes, as a rule, an infinite set of $V \in \mathcal{V}$. Distinct strategies of this type lead to distinct values of the goal functionals. Because of internal stability, for any two, for instance, Pareto-maximal strategies $V^{(1)}$ and $V^{(2)}$ it follows from the inequality $F_i(x[\vartheta, t_0, x_0, V^{(1)}]) > F_i(x[\vartheta, t_0, x_0, V^{(2)}])$ for some $i \in N$ that, for some $j \in N$, $F_j(x[\vartheta, t_0, x_0, V^{(1)}]) < F_j(x[\vartheta, t_0, x_0, V^{(2)}])$. Which Pareto-maximal strategy, $V^{(1)}$ or $V^{(2)}$, is preferable? Definition 1.1 from Chapter 3 does not provide an answer. Therefore, either additional reasoning is needed (not included in the definition of Pareto maximality) or it is up to the decision-maker to decide in favor of one strategy versus the others.
(b) In some problems several best solutions, rather than a single one, must be chosen and ranked by preference, as in the case of competitions or sporting events. Slater and Pareto optimalities are effective only if just a single "winner" is to be chosen.
(c) By Definition 1.1 from Chapter 3, one strategy is "better" than the other if at least one of the inequalities (1.1, Chapter 3) is strict. In the general case, however, one vote may not be enough. In some cases the result of comparing two strategies may depend on the number of strict inequalities in (1.1, Chapter 3). In this context it would be of interest to extend Gorokhovik's findings [31] to the case of the positional multicriterial problem (1.1, Chapter 2).
(d) In cooperative differential games (Example 3.1) the gains of some players in the optimal situation may turn out to be less than their maximin gains $F_i^0$ of (3.15), and these can be secured by every player who uses "his own" maximin strategy (in a game "against" everybody else). In this case such a player would not agree to a Pareto-maximal set of strategies. Therefore, in cooperative differential games the set of Pareto-maximal situations is constrained by the condition of individual rationality whereby the gains must be at least equal to the maximin gains.

One should not, however, conclude that Pareto- and Slater-optimal strategies have nothing but disadvantages. The reverse is true, and these notions both have a major role to play in the theory of multicriterial optimality and in applications. In substance, the arguments in favor of such solutions reduce to the following:

(a) The values of one goal functional cannot be improved without some deterioration in the values of some of the others. In this sense Pareto-optimal strategies are the best strategies by all the criteria simultaneously (and in this sense are optimal).
(b) The sets of Slater- and Pareto-optimal strategies are much "narrower" than the set $\mathcal{V}$ of all strategies. Therefore, the design of such sets is one of the first stages in most procedures, in particular interactive procedures, and in multicriterial optimization methods for dynamic systems.
(c) For such a narrow set there may be various facts and properties that simplify the solution and cannot by any means hold for the entire set of strategies.
(d) Sets of (Slater- and Pareto-) optimal strategies are internally, externally, and dynamically stable.
(e) The notions of Slater and Pareto optimality are (or must be) used in defining numerous solutions of positional differential games such as Z-equilibrium [99] and its extension [37], strong equilibrium [39], and various cooperative solutions [66]. These notions will be useful in subsequent chapters where vector-valued saddle points and the vector-valued maximin of a positional differential game are formalized.

This reasoning shows that Slater- and Pareto-optimal strategies are fundamental notions in the theory and applications of decision-making in dynamic problems supplied with numerous criteria.
Chapter 4
Geoffrion Optimality
In this chapter the notion of a solution for a multicriterial positional problem that is analogous to the Geoffrion optimum is introduced. The original problem is reduced and the Slater optimum employed to devise a straightforward way of finding the Geoffrion-optimal strategies.
1. Geoffrion-Maximal Strategy
1.1. Definition

As in the preceding chapters, let us again consider the multicriterial problem (1.1) in Chapter 2:

$$\langle \Sigma, \mathcal{V}, \mathbb{F}(x[\vartheta]) \rangle,$$

assuming Conditions 1.1 and 1.2 of Chapter 2 hold and that the initial position $(t_0, x_0) \in [0, \vartheta) \times R^n$ is fixed.
Definition 1.1. A strategy $V^G$ is said to be Geoffrion-maximal in (1.1, Chapter 2) if
(a) $V^G$ is Pareto-maximal for this problem;
(b) there exists a positive number $M$ such that for any ordinal numbers $i \in N$, strategies $V \in \mathcal{V}$, and quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^G]$ for which

$$F_i(x[\vartheta, t_0, x_0, V]) > F_i(x[\vartheta, t_0, x_0, V^G]) \qquad (1.1)$$

and for some $j \in N$ such that

$$F_j(x[\vartheta, t_0, x_0, V]) < F_j(x[\vartheta, t_0, x_0, V^G]), \qquad (1.2)$$

the following relation holds:

$$F_i(x[\vartheta, t_0, x_0, V]) - F_i(x[\vartheta, t_0, x_0, V^G]) \le M [F_j(x[\vartheta, t_0, x_0, V^G]) - F_j(x[\vartheta, t_0, x_0, V])]. \qquad (1.3)$$

The set of Geoffrion-maximal strategies will be denoted $\mathcal{V}^G$. Note that, by the definition of a Pareto-maximal strategy (Definition 1.1, Chapter 3), if, for some strategy $V \in \mathcal{V}$, quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^G]$, and ordinal number $i \in N$, inequality (1.1) holds, there is certainly an ordinal number $j \in N$ for which inequality (1.2) holds with the same strategy $V$ and the same quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^G]$. Therefore, this Definition essentially requires that a number $M$ exist for which (1.3) holds under the above conditions.

Following [70], a Pareto-maximal strategy $V^P$ that is not Geoffrion-maximal will be said to be non-properly effective. By Definition 1.1 the strategy $V^P \in \mathcal{V}$ is non-properly effective if, for any number $M > 0$, which may be arbitrarily large, there exist an ordinal number $i \in N$ and a strategy $V \in \mathcal{V}$ such that for pairs of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^P])$ satisfying (1.1) with $V^G = V^P$ and all $j \in N$ for which (1.2) holds (also with $V^G = V^P$), the following inequality holds:

$$F_i(x[\vartheta, t_0, x_0, V]) - F_i(x[\vartheta, t_0, x_0, V^P]) > M [F_j(x[\vartheta, t_0, x_0, V^P]) - F_j(x[\vartheta, t_0, x_0, V])].$$
When $N = 2$, the Geoffrion-optimal strategy $V^G$ may be geometrically interpreted as in Fig. 4.1.1. Let $V^G$ be the Geoffrion-maximal strategy in problem (1.1, Chapter 2) with $N = 2$ and suppose $V \in \mathcal{V}$ is any other strategy. Construct an obtuse angle such that $\tan\alpha = 1/M$. Then construct at every point $\mathbb{F} \in \mathbb{F}(X[\vartheta, t_0, x_0, V^G])$ an angle $G$, i.e., consider the set

$$G_\alpha = \mathbb{F}(X[\vartheta, t_0, x_0, V^G]) + G.$$

Then, by Definition 1.1, for any strategy $V$ the set $\mathbb{F}(X[\vartheta, t_0, x_0, V])$ remains outside the shaded area $G_\alpha$, having in common with it only the points shown as a double line in Fig. 4.1.1. Unlike the geometric interpretation of Fig. 3.1.1 for a Pareto-optimal strategy, the points of the set $\mathbb{F}(X[\vartheta, t_0, x_0, V])$ may occur neither in $G$ (with boundary outside the double line of Fig. 3.1.1) nor in any other set formed by the angles $\alpha$ (cf. Fig. 4.1.1). Consequently, Geoffrion-maximal strategies limit the possible positions of the set $\mathbb{F}(X[\vartheta, t_0, x_0, V])$ (in addition to the limits imposed by the Pareto-maximal strategy $V^P$).

Figure 4.1.1.

This correspondence is true of any two strategies one of which is Geoffrion-maximal. Note that the notion of Geoffrion optimality (equivalently, proper effectiveness) for static problems has previously been defined in [45].

The notion of a Geoffrion-minimal strategy is introduced in a similar way. Specifically, $V_G \in \mathcal{V}$ is Geoffrion-minimal in (1.1, Chapter 2) with initial position $(t_0, x_0) \in [0, \vartheta) \times R^n$ if
(a) $V_G$ is Pareto-minimal in this multicriterial problem;
(b) there exists a positive number $M$ such that for any ordinal numbers $i \in N$, strategies $V \in \mathcal{V}$, and quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V_G]$ for which

$$F_i(x[\vartheta, t_0, x_0, V]) < F_i(x[\vartheta, t_0, x_0, V_G])$$

and some $j \in N$ such that

$$F_j(x[\vartheta, t_0, x_0, V]) > F_j(x[\vartheta, t_0, x_0, V_G])$$

the following relation holds:

$$F_i(x[\vartheta, t_0, x_0, V_G]) - F_i(x[\vartheta, t_0, x_0, V]) \le M [F_j(x[\vartheta, t_0, x_0, V]) - F_j(x[\vartheta, t_0, x_0, V_G])].$$
The order relation that defines the Geoffrion maximum (or minimum) will, in some cases, be denoted $\ge_G$. Thus for a Geoffrion-maximal strategy $V^G$, the relation

$$\mathbb{F}(X[\vartheta, t_0, x_0, V]) \not\ge_G \mathbb{F}(X[\vartheta, t_0, x_0, V^G]), \quad \forall V \in \mathcal{V},$$

signifies that $V^G$ is Pareto-optimal in (1.1, Chapter 2) and that (1.1)-(1.3) are satisfied; in the case of a Geoffrion-minimal strategy, the relation

$$\mathbb{F}(X[\vartheta, t_0, x_0, V]) \not\le_G \mathbb{F}(X[\vartheta, t_0, x_0, V_G]), \quad \forall V \in \mathcal{V},$$

signifies that $V_G$ is Pareto-minimal in (1.1, Chapter 2) and that the inequalities in (b) hold.
1.2. Properties

Property 1.1. With $N = 1$ in (1.1, Chapter 2) the Geoffrion-maximal strategy $V^G$ coincides with the maximal (optimal) strategy in the optimal control problem (1.6, Chapter 2), since equality (1.7, Chapter 2) is then true.

Property 1.1 follows from the fact that for $N = 1$, the Geoffrion-maximal strategy is Pareto-maximal and the converse is also true (requirement (b) in Definition 1.1 is not "effective"). It then remains to make use of Property 1.1 of Chapter 3.

Property 1.2. Every Geoffrion-maximal strategy is Pareto-optimal, or

$$\mathcal{V}^G \subset \mathcal{V}^P. \qquad (1.4)$$

Combining the inclusion (1.4) with Property 1.3 of Chapter 3 we have the chain of inclusions

$$\mathcal{V}^G \subset \mathcal{V}^P \subset \mathcal{V}^S$$

for the sets of Geoffrion-, $\mathcal{V}^G$, Pareto-, $\mathcal{V}^P$, and Slater-optimal, $\mathcal{V}^S$, strategies.
Figure 4.1.2.
The inclusion converse to (1.4) does not hold even in a static multicriterial problem; that is, the inclusion (1.4) may be strict.
Example 1.1. Assume that in (1.1, Chapter 2) the system $\Sigma$ is described by the equations
$$\dot{x}_i = u_i, \quad x_i[0] = 0, \quad i = 1, 2, \quad \theta = 1;$$
the set
$$Q = \{(u_1, u_2) \mid u_2 \leq -u_1^2 + a, \; u_i \geq 0, \; i = 1, 2, \; a = \mathrm{const} > 0\}$$
and the goal functionals
$$F(x[1]) = (F_1(x[1]), F_2(x[1])) = (x_1[1], x_2[1]).$$
By Proposition 3.1 in Chapter 3, the set of values of the goal functional vector $F(x[\theta])$ associated with the Pareto-maximal strategies $V^P$, that is, the set
$$F[\mathcal{V}^P] = \bigcup_{V \in \mathcal{V}^P} F(x[\theta, t_0, x_0, V]),$$
is the north-east boundary of the area shaded in Fig. 4.1.3, shown as a double line formed by a segment of the parabola $F_2 = -F_1^2 + a$. In particular, the strategy $V^{(1)} \div (0, a)$ is Pareto-maximal. It is associated with point $A$ in Fig. 4.1.3,
$$A = F(x[\theta, t_0, x_0, V^{(1)}]) = (F_1^{(1)}, F_2^{(1)}).$$
Another Pareto-optimal strategy is $V^{(2)} \div (\alpha, -\alpha^2 + a)$, where the positive number $\alpha < \sqrt{a}$; it is associated with point $B$ in Fig. 4.1.3,
$$B = F(x[\theta, t_0, x_0, V^{(2)}]) = (F_1^{(2)}, F_2^{(2)}).$$
Figure 4.1.3.
The differences of the coordinates are
$$\Delta F_1 = F_1^{(2)} - F_1^{(1)} = F_1^{(2)} > 0, \qquad \Delta F_2 = F_2^{(2)} - F_2^{(1)} = -(\Delta F_1)^2 < 0.$$
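The relation between the two differences can be checked numerically. The following is a minimal sketch, assuming $a = 1$ and a few sample values of $\alpha$; the function names are hypothetical and serve only as illustration. It shows that the ratio of the gain in the first criterion to the loss in the second grows without bound as point $B$ approaches point $A$ along the parabola.

```python
import math


def frontier_point(alpha: float, a: float) -> tuple[float, float]:
    """Point (F1, F2) on the parabola F2 = -F1**2 + a with F1 = alpha."""
    if not 0.0 <= alpha <= math.sqrt(a):
        raise ValueError("alpha must lie in [0, sqrt(a)]")
    return (alpha, -alpha**2 + a)


def tradeoff_ratio(alpha: float, a: float) -> float:
    """Gain in F1 divided by loss in F2 when moving from A = (0, a) to B."""
    f1_a, f2_a = frontier_point(0.0, a)
    f1_b, f2_b = frontier_point(alpha, a)
    gain = f1_b - f1_a   # Delta F1 = alpha > 0
    loss = f2_a - f2_b   # -Delta F2 = alpha**2 > 0
    return gain / loss


# Illustrative values only: a = 1 and a decreasing sequence of alpha.
for alpha in (0.5, 0.1, 0.01, 0.001):
    print(alpha, tradeoff_ratio(alpha, 1.0))
```

Since the ratio equals $1/\alpha$ up to rounding, no fixed constant $M$ can bound it for all points $B$ near $A$.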
Consequently, by "moving" from point $A$ to another point $B$ fairly close to it, both associated with the Pareto-optimal strategies $V^{(1)}$ and $V^{(2)}$, a gain of the first infinitesimal order in the first criterion is obtained at the cost of a loss of the second infinitesimal order in the second criterion. In this case inequalities (1.3) do not hold and the Pareto-maximal strategy $V^{(1)}$ is not Geoffrion-maximal. Consequently, in this example $\mathcal{V}^G \subset \mathcal{V}^P$ and $\mathcal{V}^G \neq \mathcal{V}^P$, or the set of Geoffrion-maximal strategies $\mathcal{V}^G$ is a subset of the set of Pareto-maximal strategies $\mathcal{V}^P$ and does not necessarily coincide with the latter.
As shown in the example, the Pareto-maximal strategy $V^{(1)}$ is abnormal: both criteria $F_1(x[\theta])$ and $F_2(x[\theta])$ are assumed to be equally important, and it is natural to require that the gain produced by one criterion be comparable with the loss produced by the other, or that the gain and the loss be of the same infinitesimal order when $V^{(1)}$ is replaced by a second Pareto-maximal strategy $V^{(2)}$. This property is of major importance in the definition of a Geoffrion optimum, making it possible to single out the Pareto-maximal strategies that are not abnormal in the sense of this example.
Property 1.3. The set $\mathcal{V}^G$ of Geoffrion-maximal strategies of problem (1.1, Chapter 2) is internally stable in that, for any strategies $V^{(j)} \in \mathcal{V}^G$ ($j = 1, 2$) and quasimotions $x[\cdot, t_0, x_0, V^{(1)}]$ and $x[\cdot, t_0, x_0, V^{(2)}]$ generated by them, the system of inequalities
$$F_i(x[\theta, t_0, x_0, V^{(1)}]) \geq F_i(x[\theta, t_0, x_0, V^{(2)}]), \quad i \in N,$$
at least one of which is a strict inequality, is nonsimultaneous.
Property 1.3 follows from the inclusion (1.4) and the internal stability of the set of Pareto-maximal strategies (Property 1.4, Chapter 3).
Property 1.4 (inheritance). If the strategy $V^G$ is Geoffrion-maximal in problem (1.1, Chapter 2) with initial position $(t_0, x_0)$, then, for any subset of strategies $\hat{\mathcal{V}} \subset \mathcal{V}$ such that $V^G \in \hat{\mathcal{V}}$, the strategy $V^G$ is Geoffrion-maximal in the problem
$$\langle \Sigma, \hat{\mathcal{V}}, F(x[\theta]) \rangle$$
with the same initial position.
Property 1.5 (rejection). If the strategy $V$ is not Geoffrion-maximal in (1.1, Chapter 2) with initial position $(t_0, x_0)$, then, in the problem
$$\langle \Sigma, \mathcal{V} \setminus \{V\}, F(x[\theta]) \rangle$$
with initial position $(t_0, x_0)$, the same set $\mathcal{V}^G$ is Geoffrion-maximal as in (1.1, Chapter 2).
These two properties follow directly from Definition 1.1.
1.3. Structure
Let us consider two subsets in $\mathbb{R}^n$. The first subset,
$$X^G = \{x \in X[\theta, t_0, x_0, V \div Q] \mid x = x[\theta, t_0, x_0, V^G], \; V^G \in \mathcal{V}^G\}, \tag{1.5}$$
is the set of all right (at $t = \theta$) ends of quasimotions $x[\cdot, t_0, x_0, V^G]$ obtained in "testing" all Geoffrion-maximal strategies $V^G$ from the set $\mathcal{V}^G$. The second subset, $X^g$, consists of all properly effective (Geoffrion-maximal) solutions $x^g$ of the multicriterial static problem
$$\langle X[\theta, t_0, x_0, V \div Q], F(x) \rangle, \tag{1.6}$$
where $X[\theta, t_0, x_0, V \div Q]$ is the reachability set of the system (1.2) in Chapter 2 from position $(t_0, x_0)$ and the criterion vector $F(x) = (F_1(x), \ldots, F_N(x))$ is formed by the functionals $F(x[\theta])$ defined over $X[\theta, t_0, x_0, V \div Q]$. Recall the following definition.
Definition 1.2. A solution $x^g \in X[\theta, t_0, x_0, V \div Q]$ is referred to as properly effective (Geoffrion-maximal) in problem (1.6) if
(a) $x^g$ is effective (Pareto-maximal) in this problem;
(b) there exists a positive number $M$ such that, for any $i \in N$ and any $x \in X[\theta, t_0, x_0, V \div Q]$ for which the strict inequality
$$F_i(x) > F_i(x^g)$$
holds, there is some $j \in N$ such that
$$F_j(x) < F_j(x^g)$$
and
$$F_i(x) - F_i(x^g) \leq M[F_j(x^g) - F_j(x)].$$
Proposition 1.1. If Conditions 1.1 and 1.2 from Chapter 2 are satisfied, then, for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, the sets $X^G$ and $X^g$ coincide.
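The definition of a properly effective solution given above can be tried out on a finite set of alternatives. The sketch below is only an illustration under stated assumptions: the reachability set is replaced by a hypothetical finite list of criterion vectors sampled from the parabola of Example 1.1 with $a = 1$, and the function names are invented for this example.

```python
from typing import Sequence

Point = Sequence[float]


def is_pareto_maximal(points: Sequence[Point], k: int) -> bool:
    """Condition (a): no alternative is at least as good in every criterion
    and strictly better in at least one."""
    g = points[k]
    for x in points:
        no_worse = all(xi >= gi for xi, gi in zip(x, g))
        better = any(xi > gi for xi, gi in zip(x, g))
        if no_worse and better:
            return False
    return True


def satisfies_bound(points: Sequence[Point], k: int, M: float) -> bool:
    """Condition (b) for a given M: every gain in some criterion i is paid for
    by a loss in some criterion j, with gain <= M * loss."""
    g = points[k]
    n = len(g)
    for x in points:
        for i in range(n):
            if x[i] > g[i]:
                if not any(x[j] < g[j] and x[i] - g[i] <= M * (g[j] - x[j])
                           for j in range(n)):
                    return False
    return True


def is_properly_effective(points: Sequence[Point], k: int, M: float) -> bool:
    return is_pareto_maximal(points, k) and satisfies_bound(points, k, M)


# Hypothetical sample of the frontier F2 = 1 - F1**2 (Example 1.1 with a = 1).
frontier = [(f1, 1.0 - f1**2) for f1 in (0.0, 0.01, 0.1, 0.5, 1.0)]
print(is_properly_effective(frontier, 0, 10.0))   # point A with M = 10
print(is_properly_effective(frontier, 0, 200.0))  # point A with M = 200
```

On a finite sample a suitable $M$ can always be found for a Pareto-maximal point; the sample only shows how the required $M$ for point $A$ grows as closer neighbours are added, in line with Example 1.1.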
Proof. Let $(t_0, x_0)$ be some initial position from the set $[0, \theta) \times \mathbb{R}^n$. For $(t_0, x_0)$, we find the sets $X^G$ of (1.5) and $X^g$ of properly effective solutions of (1.6). Note that, by Proposition 3.1 in Chapter 3, the set $X^P$ of Pareto-optimal solutions of (1.6) coincides with the set
$$X^P = \{x \in X[\theta, t_0, x_0, V \div Q] \mid x = x[\theta, t_0, x_0, V], \; V \in \mathcal{V}^P\}.$$
Moreover, by Definitions 1.1 and 1.2,
$$X^G \subset X^P, \qquad X^g \subset X^P. \tag{1.7}$$
Let us show, first, that any point $x^g \in X^g$ is contained in $X^G$. Corollary 4.1 (Chapter 1) implies that there exists a strategy $V^*$ such that
$$x^g = x[\theta, t_0, x_0, V^*] \tag{1.8}$$
for every quasimotion $x[\cdot, t_0, x_0, V^*]$ of the system (1.2) in Chapter 2 generated from position $(t_0, x_0)$ by the strategy $V^*$. Let us prove that this strategy is Geoffrion-maximal in problem (1.1, Chapter 2) with initial position $(t_0, x_0)$. Assume the contrary, i.e., that $V^*$ is not Geoffrion-maximal. By Proposition 3.1 in Chapter 3, it is Pareto-maximal. Then $V^*$ fails to be Geoffrion-maximal only if, for any arbitrarily large number $M > 0$, there exist $i \in N$, $V \in \mathcal{V}$, and quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^*]$ satisfying (1.1) with $V^G = V^*$ such that, for every $j \in N$ for which (1.2) holds with $V^G = V^*$,
$$F_i(x[\theta, t_0, x_0, V]) - F_i(x[\theta, t_0, x_0, V^*]) > M[F_j(x[\theta, t_0, x_0, V^*]) - F_j(x[\theta, t_0, x_0, V])].$$
Since (1.8) and the inclusion $x^* = x[\theta, t_0, x_0, V] \in X[\theta, t_0, x_0, V \div Q]$ hold, these relations can be represented in the following form: for any arbitrarily large $M = \mathrm{const} > 0$, there are an ordinal number $i \in N$ and an alternative $x^* \in X[\theta, t_0, x_0, V \div Q]$ such that
$$F_i(x^*) > F_i(x^g)$$
and, for every $j \in N$ for which
$$F_j(x^*) < F_j(x^g),$$
it is true that
$$F_i(x^*) - F_i(x^g) > M[F_j(x^g) - F_j(x^*)].$$
These inequalities contradict the proper effectiveness of the solution $x^g$ of (1.6). Consequently, the assumption is false and the strategy $V^*$ of (1.8) is Geoffrion-maximal in problem (1.1, Chapter 2) with initial position $(t_0, x_0)$. Consequently,
$$X^g \subseteq X^G. \tag{1.9}$$
To prove Proposition 1.1, it is now sufficient to show that
$$X^G \subseteq X^g; \tag{1.10}$$
then, together with (1.9), $X^g = X^G$. The inclusion (1.10) may also be proved by contradiction. Assume that there is a Geoffrion-maximal strategy $V^G \in \mathcal{V}$ such that, for at least one quasimotion $x[\cdot, t_0, x_0, V^G] \in \mathcal{X}[t_0, x_0, V^G]$, the point $x^* = x[\theta, t_0, x_0, V^G] \notin X^g$. Since, by Proposition 3.1 in Chapter 3, the point $x^* \in X^P$, the set of Pareto-optimal (effective) solutions of (1.6), the relation $x^* \notin X^g$ signifies that, for any arbitrarily large $M > 0$, there exist $i \in N$ and $\tilde{x} \in X[\theta, t_0, x_0, V \div Q]$ satisfying
$$F_i(\tilde{x}) > F_i(x^*) \tag{1.11}$$
and such that, for any ordinal number $j \in N$ for which
$$F_j(\tilde{x}) < F_j(x^*), \tag{1.12}$$
it is true that
$$F_i(\tilde{x}) - F_i(x^*) > M[F_j(x^*) - F_j(\tilde{x})]. \tag{1.13}$$
Now Corollary 4.1 (Chapter 1) leads to a strategy $\tilde{V} \in \mathcal{V}$ for which
$$\tilde{x} = x[\theta, t_0, x_0, \tilde{V}]$$
for every quasimotion $x[\cdot, t_0, x_0, \tilde{V}]$ of the system (1.2) in Chapter 2 generated by the strategy $\tilde{V}$ from position $(t_0, x_0)$. Then relations (1.11)-(1.13) may be represented in a different way. That is, for any arbitrarily large number $M > 0$, there exist an ordinal number $i \in N$ and quasimotions $x[\cdot, t_0, x_0, V^G]$ and $x[\cdot, t_0, x_0, \tilde{V}]$ satisfying
$$F_i(x[\theta, t_0, x_0, \tilde{V}]) > F_i(x[\theta, t_0, x_0, V^G])$$
and such that, for every $j \in N$ for which
$$F_j(x[\theta, t_0, x_0, \tilde{V}]) < F_j(x[\theta, t_0, x_0, V^G]),$$
it is true that
$$F_i(x[\theta, t_0, x_0, \tilde{V}]) - F_i(x[\theta, t_0, x_0, V^G]) > M[F_j(x[\theta, t_0, x_0, V^G]) - F_j(x[\theta, t_0, x_0, \tilde{V}])],$$
which contradicts the Geoffrion maximality of the strategy $V^G$ in (1.1, Chapter 2) with initial position $(t_0, x_0)$. The contradiction proves the inclusion (1.10), hence Proposition 1.1 as well.
Corollary 1.1. If Conditions 1.1 and 1.2 in problem (1.1, Chapter 2) are satisfied, then, for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, there exists a Geoffrion-maximal strategy $V^G$, or $\mathcal{V}^G \neq \emptyset$.
Proof.
To prove the corollary, let us consider the associated static multicriterial problem (1.6). We fix some set of positive numbers $\beta_1, \ldots, \beta_N$ and find a point $x^g$ such that
$$\max_{x \in X[\theta, t_0, x_0, V \div Q]} \sum_{i=1}^{N} \beta_i F_i(x) = \sum_{i=1}^{N} \beta_i F_i(x^g). \tag{1.14}$$
Since $F_i(x)$ is continuous, $i \in N$ (Condition 1.2, Chapter 2), and the reachability domain $X[\theta, t_0, x_0, V \div Q]$ is compact (Proposition 2.3, Chapter 1), such a point $x^g \in X[\theta, t_0, x_0, V \div Q]$ exists. To make the presentation complete, let us prove that $x^g$ of (1.14) is a properly effective (Geoffrion-maximal) solution of the static multicriterial problem (1.6). First, $x^g$ is effective (Pareto-maximal). Indeed, if this is not so, there is an alternative $\tilde{x} \in X[\theta, t_0, x_0, V \div Q]$ such that the system of inequalities
$$F_i(\tilde{x}) \geq F_i(x^g), \quad i \in N, \tag{1.15}$$
at least one of which is a strict inequality, is simultaneous. Multiplying both sides of (1.15) by the positive number $\beta_i$ and summing over $i \in N$, we have
$$\sum_{i \in N} \beta_i F_i(\tilde{x}) > \sum_{i \in N} \beta_i F_i(x^g),$$
which is incompatible with (1.14). Let us now prove that the point $x^g$ of (1.14) is a properly effective solution of problem (1.6) with $M = (N - 1) \max_{i,j \in N} \beta_j (\beta_i)^{-1}$. Assuming the contrary, there must then exist $i \in N$ and $x^* \in X[\theta, t_0, x_0, V \div Q]$ such that
$$F_i(x^*) - F_i(x^g) > (N - 1) \max_{i,j \in N} \left[\frac{\beta_j}{\beta_i}\right] [F_j(x^g) - F_j(x^*)]$$
for any $j \neq i$; but then
$$F_i(x^*) - F_i(x^g) > (N - 1) \frac{\beta_j}{\beta_i} [F_j(x^g) - F_j(x^*)]. \tag{1.16}$$
Multiplying both sides of (1.16) by $\beta_i (N - 1)^{-1}$ and summing over all $j \neq i$, we have
$$\beta_i [F_i(x^*) - F_i(x^g)] > \sum_{j \neq i} \beta_j [F_j(x^g) - F_j(x^*)], \quad \text{or} \quad \sum_{j \in N} \beta_j F_j(x^*) > \sum_{j \in N} \beta_j F_j(x^g),$$
which is incompatible with equality (1.14). Consequently, the point $x^g$ is a properly effective solution of (1.6).
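The bound $M = (N - 1) \max_{i,j} \beta_j / \beta_i$ obtained in this proof can be tested on a small finite problem. The sketch below is an illustration only: the reachability set is replaced by a hypothetical finite grid of integer criterion vectors, and all function names are assumptions introduced for the example.

```python
from typing import Sequence

Point = Sequence[float]


def weighted_sum_maximizer(points: Sequence[Point], beta: Sequence[float]) -> int:
    """Index of a point maximizing sum_i beta_i * F_i, as in (1.14)."""
    return max(range(len(points)),
               key=lambda k: sum(b * f for b, f in zip(beta, points[k])))


def geoffrion_bound(beta: Sequence[float]) -> float:
    """M = (N - 1) * max over i, j of beta_j / beta_i."""
    n = len(beta)
    return (n - 1) * max(bj / bi for bi in beta for bj in beta)


def satisfies_bound(points: Sequence[Point], k: int, M: float) -> bool:
    """Condition (b) of proper effectiveness for the point with index k."""
    g = points[k]
    n = len(g)
    for x in points:
        for i in range(n):
            if x[i] > g[i]:
                if not any(x[j] < g[j] and x[i] - g[i] <= M * (g[j] - x[j])
                           for j in range(n)):
                    return False
    return True


# Hypothetical finite stand-in for the reachability set (N = 3 criteria).
points = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)
          if x + y + z <= 5]

for beta in [(1, 2, 3), (1, 1, 1), (5, 1, 2)]:
    k = weighted_sum_maximizer(points, beta)
    print(beta, points[k], satisfies_bound(points, k, geoffrion_bound(beta)))
```

Integer data are used so that all comparisons are exact; with floating-point criteria a small tolerance would be needed in the comparison.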
Now let us devise a strategy $V^G$ such that
$$x^g = x[\theta, t_0, x_0, V^G]$$
for every quasimotion $x[\cdot, t_0, x_0, V^G]$ of the system (1.2, Chapter 2) generated by $V^G$ from position $(t_0, x_0)$. Such a strategy $V^G$ exists by Corollary 4.1 in Chapter 1. Finally, by Proposition 1.1, such a strategy $V^G$ is Geoffrion-maximal in (1.1, Chapter 2) with initial position $(t_0, x_0)$.
Corollary 1.2. Proposition 1.1 suggests an illustrative, or geometric, way of obtaining the set $\mathcal{V}^G$ of Geoffrion-maximal strategies of (1.1, Chapter 2) with initial position $(t_0, x_0)$. For this purpose it is necessary, first, to find the entire set $X^g$ of properly effective solutions $x^g$ of the multicriterial static problem (1.6) and, second, to obtain all possible $V^G \in \mathcal{V}$ such that
$$x[\theta, t_0, x_0, V^G] \in X^g$$
for every quasimotion $x[\cdot, t_0, x_0, V^G]$. The set $\mathcal{V}^G$ of such strategies $V^G$ is the set of Geoffrion-maximal strategies of (1.1, Chapter 2) with initial position $(t_0, x_0)$.
1.4. External and Dynamic Stability
Property 1.6. The set $\mathcal{V}^G$ of Geoffrion-maximal strategies of (1.1, Chapter 2) with initial position $(t_0, x_0)$ is externally stable in that, for every strategy $V \in \mathcal{V} \setminus \mathcal{V}^P$ and quasimotion $x^*[\cdot, t_0, x_0, V]$ generated by it, there exists a Geoffrion-maximal strategy $V^G$ such that
$$F_i(x^*[\theta, t_0, x_0, V]) \leq F_i(x[\theta, t_0, x_0, V^G]), \quad i \in N, \tag{1.17}$$
for every quasimotion $x[\cdot, t_0, x_0, V^G]$ of the system (1.2, Chapter 2) generated by $V^G$ from position $(t_0, x_0)$. For every quasimotion $x^*[\cdot, t_0, x_0, V]$ at least one of the inequalities in (1.17) is strict.
This definition of external stability is different from the definition used until now. Specifically, the set of strategies $\mathcal{V} \setminus \mathcal{V}^P$ for which $\mathcal{V}^G$ is externally stable does not include the set $\mathcal{V}^P$ of Pareto-optimal strategies of problem (1.1, Chapter 2). The need for such a change in the definition of external stability is clear from Example 1.1 where, for the Geoffrion non-maximal point $A$ (associated with the strategy $V^{(1)} \div (0, a)$), there exists no strategy $V^G \in \mathcal{V}^G$ such that
$$F_i(x[\theta, t_0, x_0, V^{(1)}]) \leq F_i(x[\theta, t_0, x_0, V^G]), \quad i = 1, 2.$$
Proof. Let $V$ be some strategy from the set $\mathcal{V} \setminus \mathcal{V}^P$ and $x^*[\cdot, t_0, x_0, V]$ some quasimotion generated by this strategy. Because the point $x^*[\theta, t_0, x_0, V] \in X[\theta, t_0, x_0, V \div Q]$, the reachability domain of the system (1.2, Chapter 2) from position $(t_0, x_0)$, and $X[\theta, t_0, x_0, V \div Q]$ is a compactum in $\mathbb{R}^n$ (Proposition 2.3, Chapter 1), it follows that
(1) the set
$$X^*[t_0, x_0] = \{x \in X[\theta, t_0, x_0, V \div Q] \mid F_i(x) \geq F_i(x^*[\theta, t_0, x_0, V]), \; i \in N\}$$
is a nonempty compactum in $\mathbb{R}^n$;
(2) for any point $x \in X^*[t_0, x_0]$,
$$F_i(x) \geq F_i(x^*[\theta, t_0, x_0, V]), \quad i \in N. \tag{1.18}$$
Let us consider $N$ arbitrary positive numbers $\beta_1, \ldots, \beta_N$ and the continuous function $\sum_{i \in N} \beta_i F_i(x)$, and find a point $x^g$ such that
$$\max_{x \in X^*[t_0, x_0]} \sum_{i \in N} \beta_i F_i(x) = \sum_{i \in N} \beta_i F_i(x^g). \tag{1.19}$$
As in proving Corollary 1.1, the point $x^g \in X^*[t_0, x_0]$ determined from (1.19) is found to be properly effective (Geoffrion-maximal) in the multicriterial static problem
$$\langle X^*[t_0, x_0], F(x) \rangle.$$
For every $x \in X[\theta, t_0, x_0, V \div Q] \setminus X^*[t_0, x_0]$, we have, by (1.18),
$$F_j(x) < F_j(x^*[\theta, t_0, x_0, V]) \tag{1.20}$$
for at least one $j \in N$. Therefore, since the system of inequalities
$$F_i(x) \geq F_i(x^g), \quad i \in N,$$
at least one of which is a strict inequality, is nonsimultaneous (for every $x \in X^*[t_0, x_0]$), the inequalities
$$F_i(x^g) \geq F_i(x^*[\theta, t_0, x_0, V]), \quad i \in N,$$
(which follow from (1.18) with $x = x^g \in X^*[t_0, x_0]$) and (1.20) imply that $x^g$ is Pareto-maximal (effective) even for the complete static problem (1.6).
Let us show that the alternative $x^g$ is properly effective in the multicriterial problem (1.6). Let us assume the contrary, i.e., that $x^g$ is not properly effective in (1.6). For the earlier constants $\beta_i > 0$, $i \in N$, of (1.19) we find the point $\tilde{x}^g$ such that
$$\max_{x \in X[\theta, t_0, x_0, V \div Q]} \sum_{i \in N} \beta_i F_i(x) = \sum_{i \in N} \beta_i F_i(\tilde{x}^g).$$
This alternative $\tilde{x}^g$ is properly effective (see the proof of Corollary 1.1) in problem (1.6). Besides, since it is assumed that $x^g$ of (1.19) is not properly effective in (1.6), $F(x^g) \neq F(\tilde{x}^g)$. Now take an infinitely increasing sequence of positive numbers $\{M_r\}$, $\lim_{r \to \infty} M_r = +\infty$. Since the solution $x^g$ is not properly effective and $F(x^g) \neq F(\tilde{x}^g)$, every $r = 1, 2, \ldots$ may be associated with at least one ordinal number $i(r)$ such that
$$F_{i(r)}(x^g) > F_{i(r)}(\tilde{x}^g).$$
Note that if $F_i(x^g) \leq F_i(\tilde{x}^g)$ for every $i \in N$, at least one of these being a strict inequality, the solution $x^g$ would not be Pareto-maximal (effective) in problem (1.6). The set of ordinal numbers $N = \{1, 2, \ldots, N\}$ is finite; therefore, in the sequence $\{i(r)\}$, $r = 1, 2, \ldots$, at least one ordinal number $i$ occurs infinitely often. Let us assume that this ordinal number is associated with a sequence of subscripts $r_i$. Then, for any $j \in N$ satisfying
$$F_j(x^g) < F_j(\tilde{x}^g),$$
we find that
$$F_i(x^g) - F_i(\tilde{x}^g) > M_{r_i} [F_j(\tilde{x}^g) - F_j(x^g)] \tag{1.21}$$
for an infinitely increasing sequence $M_{r_i} \to +\infty$. But the differences $F_j(\tilde{x}^g) - F_j(x^g) > 0$ are a collection of specific numbers. Therefore, we have from (1.21) that the number $F_i(x^g)$ is larger than any specified number, which contradicts the boundedness of the continuous function $F_i(x)$ over the compactum $X[\theta, t_0, x_0, V \div Q]$. This proves that the solution $x^g$ of problem (1.6) is properly effective. By Corollary 4.1 from Chapter 1 there exists a strategy $V^G \in \mathcal{V}$ such that
$$x^g = x[\theta, t_0, x_0, V^G] \tag{1.22}$$
for all quasimotions $x[\cdot, t_0, x_0, V^G]$ of the system (1.2, Chapter 2) generated by $V^G$ from position $(t_0, x_0)$. Since $x^g \in X^g$, the set of properly effective solutions of (1.6), by Proposition 1.1 $V^G$ is Geoffrion-maximal in (1.1, Chapter 2) with initial position $(t_0, x_0)$. The proof of Property 1.6 is completed by the fact that, by (1.18) and the inclusion $x^g \in X^*[t_0, x_0]$,
$$F_i(x^g) \geq F_i(x^*[\theta, t_0, x_0, V]), \quad i \in N,$$
and, by (1.22), it is also true that $V \notin \mathcal{V}^P$ and $\mathcal{V}^G \subset \mathcal{V}^P$; consequently, inequalities (1.17) are also true for every quasimotion $x[\cdot, t_0, x_0, V^G]$ of the system (1.2, Chapter 2) generated by the Geoffrion-maximal strategy $V^G$ from position $(t_0, x_0)$.
Property 1.7. Any Geoffrion-maximal strategy $V^G$ of (1.1, Chapter 2) with initial position $(t_0, x_0)$ is dynamically stable, or remains Geoffrion-maximal in (1.1, Chapter 2) for the current initial position $(t, x[t, t_0, x_0, V^G])$ at every $t \in [t_0, \theta]$ and for any quasimotion $x[\cdot, t_0, x_0, V^G]$ generated by $V^G$ from $(t_0, x_0)$.
Unlike the procedure of proving Property 1.6 of Chapter 2, here we will use the property of inheritance for multicriterial static problems. Let $x[\cdot, t_0, x_0, V^G]$ be an arbitrary quasimotion from the bunch $\mathcal{X}[t_0, x_0, V^G]$. By Proposition 1.1, the point $x[\theta, t_0, x_0, V^G] \in X^g$, the set of properly effective solutions of (1.6). Since, over time $t$,
$$X[\theta, t, x[t, t_0, x_0, V^G], V \div Q] \subset X[\theta, t_0, x_0, V \div Q]$$
and
$$x[\theta, t_0, x_0, V^G] \in X[\theta, t, x[t, t_0, x_0, V^G], V \div Q]$$
(Theorem 2.1, Chapter 1), because of inheritance [5, p. 12] the solution $x[\theta, t_0, x_0, V^G]$ remains properly effective in the multicriterial static problem
$$\langle X[\theta, t, x[t, t_0, x_0, V^G], V \div Q], F(x) \rangle;$$
but then, by Proposition 1.1, $V^G$ is Geoffrion-maximal in (1.1, Chapter 2) with initial position $(t, x[t, t_0, x_0, V^G])$.
2. Necessary and Sufficient Conditions
2.1. Auxiliary Propositions
We will need certain lemmas from [62, 63]. For this purpose we consider a criterial space $\mathbb{R}^N$ with elements $F = (F_1, \ldots, F_N)$ and use the notation
$$F^{(1)} \geqq F^{(2)} \iff F_i^{(1)} \geq F_i^{(2)}, \quad i \in N;$$
$$F^{(1)} \geq F^{(2)} \iff F_i^{(1)} \geq F_i^{(2)}, \quad i \in N, \quad \text{and} \quad F^{(1)} \neq F^{(2)};$$
$$F^{(1)} \nleq F^{(2)} \iff F^{(1)} = F^{(2)} \quad \text{or} \quad F_l^{(1)} > F_l^{(2)} \text{ for at least one } l \in N.$$
Note that $F^{(1)} \nleq F^{(2)}$ is false if $F^{(2)} \geq F^{(1)}$.
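Lemma 2.1 [63]. $F^{(1)} \nleq F^{(2)}$ iff
$$\sum_{i \in N} \beta_i F_i^{(1)} \geq \sum_{i \in N} \beta_i F_i^{(2)} \tag{2.1}$$
for some vector $\beta = (\beta_1, \ldots, \beta_N)$ from the set
$$\mathcal{B} = \Big\{\beta = (\beta_1, \ldots, \beta_N) \;\Big|\; \beta_i > 0, \; i \in N, \; \sum_{i \in N} \beta_i = 1\Big\}. \tag{2.2}$$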
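Necessity. Let $F^{(1)} \nleq F^{(2)}$. If $F^{(1)} = F^{(2)}$ (componentwise), then
$$\sum_{i \in N} \beta_i F_i^{(1)} = \sum_{i \in N} \beta_i F_i^{(2)}$$
at every $\beta \in \mathcal{B}$. Assume now that $F_j^{(1)} > F_j^{(2)}$ for some $j \in N$ and let
$$\gamma = \max\{l, r\},$$
where
$$l = \sum_{i \neq j} |F_i^{(1)}|, \qquad r = \sum_{i \neq j} |F_i^{(2)}|. \tag{2.3}$$
If $\gamma = 0$, then $\sum_{i \in N} \beta_i F_i^{(1)} = \beta_j F_j^{(1)} > \beta_j F_j^{(2)} = \sum_{i \in N} \beta_i F_i^{(2)}$ for any $\beta \in \mathcal{B}$. If, however, $\gamma > 0$, then consider the number
$$\chi = \frac{F_j^{(1)} - F_j^{(2)}}{2\gamma} > 0.$$
Hence
$$F_j^{(1)} - F_j^{(2)} = 2\gamma\chi \geq \chi(l + r)$$
or, in light of (2.3),
$$F_j^{(1)} - \chi l \geq F_j^{(2)} + \chi r.$$
Therefore
$$\sum_{i \in N} \beta_i F_i^{(1)} \geq \sum_{i \in N} \beta_i F_i^{(2)}$$
if
$$\beta_j = [1 + \chi(N - 1)]^{-1} \quad \text{and} \quad \beta_i = \chi[1 + \chi(N - 1)]^{-1} \text{ if } i \neq j. \tag{2.4}$$
Sufficiency. If $F^{(1)} \nleq F^{(2)}$ is false, then $F^{(2)} \geq F^{(1)}$, that is,
$$F_i^{(2)} \geq F_i^{(1)}, \quad i \in N, \quad F^{(1)} \neq F^{(2)}.$$
Multiplication of the $i$-th inequality by $\beta_i = \mathrm{const} > 0$ and summation over $i = 1, 2, \ldots, N$ yield a strict inequality that contradicts (2.1).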
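The necessity part of Lemma 2.1 is constructive, and the weights can be computed directly. The sketch below rests on a labeled assumption: the quantities $l$ and $r$ are taken to be the sums of absolute values of the components of $F^{(1)}$ and $F^{(2)}$ other than the $j$-th one, which is one choice that makes the argument go through. The function names are hypothetical.

```python
from typing import Optional, Sequence


def lemma_weights(f1: Sequence[float], f2: Sequence[float]) -> Optional[list]:
    """Return positive weights beta summing to 1 with
    sum(beta * f1) >= sum(beta * f2), or None if f2 dominates f1."""
    n = len(f1)
    if list(f1) == list(f2):
        return [1.0 / n] * n                      # any beta works
    better = [i for i in range(n) if f1[i] > f2[i]]
    if not better:
        return None                               # f2 >= f1 and f2 != f1
    j = better[0]
    # Assumption of this sketch: l and r are sums of absolute values.
    l = sum(abs(f1[i]) for i in range(n) if i != j)
    r = sum(abs(f2[i]) for i in range(n) if i != j)
    gamma = max(l, r)
    if gamma == 0:
        return [1.0 / n] * n                      # only component j is nonzero
    chi = (f1[j] - f2[j]) / (2.0 * gamma)
    denom = 1.0 + chi * (n - 1)
    return [1.0 / denom if i == j else chi / denom for i in range(n)]


def weighted(beta: Sequence[float], f: Sequence[float]) -> float:
    return sum(b * v for b, v in zip(beta, f))


f1, f2 = (1.0, 0.0, 0.0), (0.0, 5.0, 5.0)
beta = lemma_weights(f1, f2)
print(beta, weighted(beta, f1), weighted(beta, f2))
```

The returned weights depend on the two vectors being compared, which is the point made in Remark 2.1.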
Remark 2.1. The $\beta_i$ used in (2.1) depend on $F_i^{(1)}$ and $F_i^{(2)}$ since, by (2.4), $\beta_i = \varphi_i(\chi)$, and $\chi$, by (2.3) and (2.2), depends on $F_i^{(1)}$ and $F_i^{(2)}$, $i \in N$.
Remark 2.2. For the multicriterial problem (1.6),
$$\langle X[\theta, t_0, x_0, V \div Q], F(x) \rangle, \tag{2.5}$$
the solution $x^P \in X[\theta, t_0, x_0, V \div Q]$ is, in the above notation, effective (Pareto-optimal) if
$$F(x^P) \nleq F(x), \quad \forall x \in X[\theta, t_0, x_0, V \div Q].$$
Then, by Lemma 2.1, the solution $x^P$ is effective in problem (2.5) if there is a vector function for it, $\beta(x) = (\beta_1(x), \ldots, \beta_N(x))$, such that
(1) $\beta(x) \in \mathcal{B}$ for all $x \in X[\theta, t_0, x_0, V \div Q]$;
(2) $\sum_{i \in N} \beta_i(x) F_i(x^P) \geq \sum_{i \in N} \beta_i(x) F_i(x)$ for every $x \in X[\theta, t_0, x_0, V \div Q]$.
If additional constraints are imposed on the form of $\beta(x)$, sufficient conditions of effectiveness are obtained. One such possible condition is formulated in the next lemma.
Lemma 2.2. If there exists a finite collection of constant vectors $\beta^{(k)} \in \mathcal{B}$ ($k = 1, \ldots, T$) such that, for every $x \in X[\theta, t_0, x_0, V \div Q]$, there exists an ordinal number $k \in \{1, \ldots, T\}$ such that
$$\sum_{i \in N} \beta_i^{(k)} F_i(x^P) \geq \sum_{i \in N} \beta_i^{(k)} F_i(x),$$
the alternative $x^P$ is effective in problem (2.5).
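The sufficient condition of Lemma 2.2 can be illustrated on a finite set of alternatives. In the sketch below the criterion vectors and the weight vectors are hypothetical and chosen so that the candidate point is Pareto-maximal but lies in a non-convex part of the frontier: no single weight vector certifies it, whereas a collection of two vectors does.

```python
from typing import Sequence

Point = Sequence[float]


def weighted(beta: Sequence[float], f: Point) -> float:
    return sum(b * v for b, v in zip(beta, f))


def effective_by_finite_weights(points: Sequence[Point], k: int,
                                betas: Sequence[Sequence[float]]) -> bool:
    """Sufficient test: for every alternative x there is a weight vector in
    betas whose weighted sum at points[k] is at least that at x."""
    g = points[k]
    return all(any(weighted(beta, g) >= weighted(beta, x) for beta in betas)
               for x in points)


# Hypothetical criterion vectors; (1, 1) is Pareto-maximal but unsupported.
pts = [(0, 3), (1, 1), (3, 0)]
print(effective_by_finite_weights(pts, 1, [(0.5, 0.5)]))
print(effective_by_finite_weights(pts, 1, [(0.9, 0.1), (0.1, 0.9)]))
```

The test is one-sided: a negative answer for a particular collection says nothing about effectiveness, since another collection may succeed.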
In the general case the requirements of Lemma 2.2 are sufficient to make $x^P$ effective, though for properly effective solutions they are necessary and sufficient. Finally, Definition 1.1 leads to an equivalent notion of a Geoffrion-maximal strategy.
Lemma 2.3. The Pareto-maximal strategy $V^G \in \mathcal{V}$ is Geoffrion-maximal in (1.1, Chapter 2) with $(t_0, x_0)$ if there exists a number $M > 0$ such that, for every ordinal number $i \in N$, every strategy $V \in \mathcal{V}$, and quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^G]$, the system of inequalities
$$F_i(x[\theta, t_0, x_0, V]) > F_i(x[\theta, t_0, x_0, V^G]),$$
$$F_i(x[\theta, t_0, x_0, V]) - F_i(x[\theta, t_0, x_0, V^G]) - M[F_j(x[\theta, t_0, x_0, V^G]) - F_j(x[\theta, t_0, x_0, V])] > 0, \quad j \in N, \; j \neq i, \tag{2.6}$$
is nonsimultaneous.
2.2. Necessary Conditions
An analog of these conditions for static multicriterial problems has been determined in [62]. Necessary conditions are formulated in the next theorem.
Theorem 2.1. If the strategy $V^G$ is Geoffrion-maximal in (1.1, Chapter 2) with $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, there exists a finite collection of vectors $\beta^{(1)}, \ldots, \beta^{(K)} \in \mathcal{B}$ of (2.2) ($K \leq N$) such that, for every $V \in \mathcal{V}$ and every pair of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^G])$, there exists an ordinal number $j \in \{1, \ldots, K\}$ for which
$$\sum_{i \in N} \beta_i^{(j)} F_i(x[\theta, t_0, x_0, V^G]) \geq \sum_{i \in N} \beta_i^{(j)} F_i(x[\theta, t_0, x_0, V]). \tag{2.7}$$
Proof. Let the strategy $V^G$ be a Geoffrion-maximal strategy in (1.1, Chapter 2) with $(t_0, x_0)$. Then, by Lemma 2.3, there exists a number $M > 0$ such that, for every ordinal number $i \in N$, every strategy $V \in \mathcal{V}$, and all quasimotions $x[\cdot, t_0, x_0, V^G]$, $x[\cdot, t_0, x_0, V]$, the system of inequalities (2.6) is nonsimultaneous. For an arbitrary strategy $V \in \mathcal{V}$ and every fixed ordinal number $i \in N$, the system (2.6) is nonsimultaneous if either
$$F_i(x[\theta, t_0, x_0, V]) - F_i(x[\theta, t_0, x_0, V^G]) \leq 0$$
or
$$F_i(x[\theta, t_0, x_0, V]) - F_i(x[\theta, t_0, x_0, V^G]) - M[F_j(x[\theta, t_0, x_0, V^G]) - F_j(x[\theta, t_0, x_0, V])] \leq 0 \tag{2.8}$$
for some $j \in N \setminus \{i\}$ and distinct pairs of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^G])$. Note that, for fixed $i \in N$ and $V \in \mathcal{V}$, the number of inequalities in (2.8) associated with distinct (and every possible) pairs of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^G])$ does not exceed
$$1 + C_{N-1}^{1} + C_{N-1}^{2} + \cdots + C_{N-1}^{N-2} = 2^{N-1} - 1. \tag{2.9}$$
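The count in (2.9) is the number of subsets of an $(N-1)$-element set other than the full set, so the closed form can be confirmed by direct summation. The short check below is only a sketch; the function name is hypothetical.

```python
from math import comb


def count_bound(n: int) -> int:
    """Left-hand side of (2.9): 1 + C(n-1, 1) + ... + C(n-1, n-2)."""
    return sum(comb(n - 1, k) for k in range(0, n - 1))


for n in range(2, 8):
    print(n, count_bound(n), 2 ** (n - 1) - 1)
```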
It should be emphasized that the number of inequalities of the type of (2.8) is finite even though there may be an infinite number of pairs of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^G])$. Summation over $i$ of all inequalities of this type leads, for a fixed strategy $V$ and a fixed pair $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^G])$, to the inequality
$$\sum_{i \in N} L_i [F_i(x[\theta, t_0, x_0, V]) - F_i(x[\theta, t_0, x_0, V^G])] \leq 0.$$
For
$$\beta_i = L_i \Big(\sum_{k \in N} L_k\Big)^{-1}, \quad i \in N, \tag{2.10}$$
this inequality reduces to
$$\sum_{i \in N} \beta_i F_i(x[\theta, t_0, x_0, V]) \leq \sum_{i \in N} \beta_i F_i(x[\theta, t_0, x_0, V^G]). \tag{2.11}$$
It is obvious that $\beta \in \mathcal{B}$. Consequently, for every $V \in \mathcal{V}$ and every pair $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^G])$, there exists a vector $\beta^{(j)} \in \mathcal{B}$ for which inequality (2.7) holds. Since the number of inequalities (2.8) associated with every $i \in N$ is finite, the number of such vectors that have the desired property is also finite. In other words, there exists a finite collection of vectors $\beta^{(1)}, \beta^{(2)}, \ldots, \beta^{(K)} \in \mathcal{B}$ such that, for every strategy $V \in \mathcal{V}$ and every pair of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^G])$, there exists an ordinal number $j \in \{1, \ldots, K\}$ for which inequality (2.7) holds (with $\beta^{(j)} = (\beta_1^{(j)}, \ldots, \beta_N^{(j)})$).
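To obtain a collection of no more than $N$ vectors that have the desired property, let $\gamma = \min\{\beta_i^{(j)} \mid i \in N, \; j = 1, \ldots, K\}$. We consider the vectors $\tilde{\beta}^{(j)}$ with components
$$\tilde{\beta}_i^{(j)} = \begin{cases} \gamma & \text{with } i = 1, \ldots, N \text{ and } i \neq j, \\ 1 - (N - 1)\gamma & \text{with } i = j, \end{cases} \qquad j \in N. \tag{2.12}$$
It is obvious that $\tilde{\beta}^{(j)} = (\tilde{\beta}_1^{(j)}, \ldots, \tilde{\beta}_N^{(j)}) \in \mathcal{B}$ for any $j \in N$. Let us establish the inclusion
$$\beta^{(1)}, \ldots, \beta^{(K)} \in \mathrm{co}\{\tilde{\beta}^{(1)}, \ldots, \tilde{\beta}^{(N)}\}, \tag{2.13}$$
where $\mathrm{co}\{\cdot\}$ denotes the convex closed hull of the set $\{\cdot\}$. Let $\beta^{(l)}$ be an arbitrary vector from the "old" collection. If $\gamma = 1/N$, then, obviously, $\beta_i^{(l)} = 1/N$ ($i = 1, \ldots, N$). In this case, for $\lambda_1 = \lambda_2 = \cdots = \lambda_N = 1/N$, we have
$$\sum_{i \in N} \lambda_i \tilde{\beta}_j^{(i)} = \beta_j^{(l)}, \quad j \in N, \tag{2.14}$$
or $\beta^{(l)} \in \mathrm{co}\{\tilde{\beta}^{(1)}, \ldots, \tilde{\beta}^{(N)}\}$. If $\gamma < 1/N$, we consider
$$\lambda_i^{(l)} = \frac{\beta_i^{(l)} - \gamma}{1 - N\gamma} \geq 0, \quad i \in N,$$
where $\sum_{i \in N} \lambda_i^{(l)} = 1$. Equalities (2.14) also hold for these $\lambda_i^{(l)}$. This proves the inclusion (2.13). Let us prove that the "new" collection of vectors $\tilde{\beta}^{(j)}$, $j \in N$, from (2.12) has the necessary properties. We assume the contrary, i.e., that there exist a strategy $V \in \mathcal{V}$ and a pair of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^G])$ such that
$$\sum_{i \in N} \tilde{\beta}_i^{(j)} F_i(x[\theta, t_0, x_0, V]) > \sum_{i \in N} \tilde{\beta}_i^{(j)} F_i(x[\theta, t_0, x_0, V^G]), \quad j \in N. \tag{2.15}$$
For an arbitrary vector $\beta^{(l)}$ of the old collection, (2.14) is true with some $\lambda_i \geq 0$, $i \in N$, $\sum_{i \in N} \lambda_i = 1$. Multiplying both sides of inequality (2.15) by $\lambda_j$ and summing, we have
$$\sum_{i,j \in N} \lambda_j \tilde{\beta}_i^{(j)} F_i(x[\theta, t_0, x_0, V]) > \sum_{i,j \in N} \lambda_j \tilde{\beta}_i^{(j)} F_i(x[\theta, t_0, x_0, V^G]).$$
Together with (2.14) this leads to the inequality
$$\sum_{i \in N} \beta_i^{(l)} F_i(x[\theta, t_0, x_0, V]) > \sum_{i \in N} \beta_i^{(l)} F_i(x[\theta, t_0, x_0, V^G]),$$
which implies that the old collection $\beta^{(1)}, \ldots, \beta^{(K)}$ does not have the desired property (2.11) either. This contradiction proves that there exists a collection consisting of a maximum of $N$ vectors $\beta^{(1)}, \ldots, \beta^{(K)}$ satisfying Theorem 2.1.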
2.3. Sufficient Conditions
Sufficient conditions for the existence of Geoffrion-maximal strategies are formulated in the next theorem.
Theorem 2.2. If, for some $\beta = (\beta_1, \ldots, \beta_N) \in \mathcal{B}$ of (2.2),
$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} \sum_{i \in N} \beta_i F_i(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} \sum_{i \in N} \beta_i F_i(x[\theta, t_0, x_0, V^G]), \tag{2.16}$$
then $V^G$ is Geoffrion-maximal in (1.1, Chapter 2) with $(t_0, x_0)$.
Proof. By Remark 4.1 of Chapter 3, the strategy $V^G$ found from (2.16) is Pareto-maximal in (1.1, Chapter 2) with $(t_0, x_0)$. To prove that this strategy $V^G$ is Geoffrion-maximal, consider the number
(2.17)
and assume the contrary. Then, for the number $M$ of (2.17), there exist a subscript $i \in N$, a strategy $\tilde{V} \in \mathcal{V}$, and quasimotions $(x^*[\cdot, t_0, x_0, \tilde{V}], x^*[\cdot, t_0, x_0, V^G])$ such that, by Lemma 2.3, the following system of inequalities is simultaneous:
$$F_i(x^*[\theta, t_0, x_0, \tilde{V}]) - F_i(x^*[\theta, t_0, x_0, V^G]) > 0, \tag{2.18}$$
$$F_i(x^*[\theta, t_0, x_0, \tilde{V}]) - F_i(x^*[\theta, t_0, x_0, V^G]) - M[F_j(x^*[\theta, t_0, x_0, V^G]) - F_j(x^*[\theta, t_0, x_0, \tilde{V}])] > 0, \quad j \in N, \; j \neq i. \tag{2.19}$$
The set of subscripts is denoted
$$N_0 = \{j \in N \mid F_j(x^*[\theta, t_0, x_0, V^G]) - F_j(x^*[\theta, t_0, x_0, \tilde{V}]) > 0\}. \tag{2.20}$$
By the Pareto optimality of $V^G$ and (2.18), $N_0 \neq \emptyset$. Then, by summing over $j \in N \setminus \{i\}$, we have from (2.19)
(N
-
1)CFi(x*CO, to,
~
0
31)- Fi(x*CO, to, ~ 0 ~, " I ) I 7
Adding (2.18) and (2.21), we have, in light of (2.22), w"(x*re,
to, xo,
31)- Fi(x*Ce, to, x o , vG1)l
Substituting M from (2.17), we find by simple algebra that
For the auxiliary zero-sum positional differential game (2.24), the conflict-controlled system $\Sigma^*$ is described by the equation
$$\dot{x} = 0_n u + f(t, x, v), \quad x[t_0] = x_0, \tag{2.25}$$
where the vectors $x$, $v$, and $f$ are as in the system (1.2) in Chapter 2, $0_n$ is the null-vector from $\mathbb{R}^n$, Condition 1.1 of Chapter 2 holds, the set of strategies $\mathcal{V}$ is determined above, and
$$\mathcal{U} = \{U \div u(t, x) \mid u(t, x) \in [0, 1]\}.$$
By Condition 1.2 in Chapter 2 the function $\sum_{i \in N} \beta_i F_i(x)$ is continuous over $\mathbb{R}^n$, and the constants $\beta_i > 0$, $i \in N$, are as in (2.16). For the game (2.24) with initial position $(t_0, x_0)$, the requirements of Theorem 3.1 (Chapter 1) specifying the existence of a saddle point $(U^0, V^G) \in \mathcal{U} \times \mathcal{V}$ are fulfilled. By the definition of a saddle point in a differential game, (2.16), and the fact that the strategy $U^0$ has no impact on the system (2.25) (it "acts" through the term $0_n u$), we have
$$\sum_{i \in N} \beta_i F_i(x[\theta, t_0, x_0, V^G]) \geq \sum_{i \in N} \beta_i F_i(x[\theta, t_0, x_0, V])$$
for every quasimotion $x[\cdot, t_0, x_0, V^G]$, any strategies $V \in \mathcal{V}$, and quasimotions $x[\cdot, t_0, x_0, V]$ generated by them. In particular, from the latter inequality we have
$$\sum_{i \in N} \beta_i F_i(x^*[\theta, t_0, x_0, V^G]) \geq \sum_{i \in N} \beta_i F_i(x^*[\theta, t_0, x_0, \tilde{V}]). \tag{2.26}$$
For ordinal numbers $j \in N \setminus N_0$, it is true, by (2.20) and (2.22), that
$$F_j(x^*[\theta, t_0, x_0, V^G]) \leq F_j(x^*[\theta, t_0, x_0, \tilde{V}]).$$
Hence, from (2.26) it follows that
$$\beta_i F_i(x^*[\theta, t_0, x_0, V^G]) + \sum_{j \in N_0} \beta_j F_j(x^*[\theta, t_0, x_0, V^G]) \geq \beta_i F_i(x^*[\theta, t_0, x_0, \tilde{V}]) + \sum_{j \in N_0} \beta_j F_j(x^*[\theta, t_0, x_0, \tilde{V}]),$$
which is incompatible with (2.23). This contradiction proves the Geoffrion maximality of $V^G$ from (2.16).
Remark 2.3. Every vector $\beta \in \mathcal{B}$ is associated, in general, with "its own" Geoffrion-maximal strategy. The necessary conditions (Theorem 2.1) show that, in obtaining the entire set $\mathcal{V}^G$ of Geoffrion-maximal strategies, a finite number of at most $N$ vectors $\beta \in \mathcal{B}$ suffices.
3. A-Optimality
3.1. Compactness of the Set $X^S$
For problem (1.1, Chapter 2), Conditions 1.1 and 1.2 from Chapter 2 are assumed to hold and the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ is fixed. In Section 3 of Chapter 2 the structure of Slater-optimal strategies was disclosed by means of the set $X^S$ of all right (at $t = \theta$) ends of quasimotions $x[\cdot, t_0, x_0, V^S]$ generated from the initial position $(t_0, x_0)$ by the Slater-optimal strategies $V^S$ when the $V^S$ scan the entire set $\mathcal{V}^S$ of Slater-optimal strategies of (1.1, Chapter 2) with $(t_0, x_0)$, that is,
$$X^S = \{x[\theta, t_0, x_0, V^S] \mid V^S \in \mathcal{V}^S\}. \tag{3.1}$$
Lemma 3.1. The set $X^S$ is closed and bounded (it is a compactum in $\mathbb{R}^n$), while
$$F(X^S) = \bigcup_{V \in \mathcal{V}^S} F(x[\theta, t_0, x_0, V])$$
is a compactum in $\mathbb{R}^N$.
Proof. By Proposition 3.1 from Chapter 2, the set $X^S$ coincides with the set $X^s$ of weakly efficient (Slater-maximal) solutions $x^s$ of the problem
$$\langle X[\theta, t_0, x_0, V \div Q], F(x) \rangle, \tag{3.2}$$
where $X[\theta, t_0, x_0, V \div Q]$ is the reachability domain of the system (1.2, Chapter 2) from position $(t_0, x_0)$. Therefore, to prove Lemma 3.1, it is sufficient to prove the compactness of the subset $X^s \subset X[\theta, t_0, x_0, V \div Q]$. By Proposition 2.3 from Chapter 1, the set $X[\theta, t_0, x_0, V \div Q]$ is closed and bounded in $\mathbb{R}^n$. Consequently, the compactness of $X^s$ follows from the fact that $X^s$ is closed and from the compactness of the set $X[\theta, t_0, x_0, V \div Q]$ containing $X^s$. That $X^s$ is closed was established in [70, p. 142]; to make the presentation complete, we now repeat the proof. We assume the contrary, i.e., that $X^s$ is not closed; in this case, there is a sequence $\{x^{(k)}\} \subset X^s$ of weakly efficient solutions $x^{(k)}$ of problem (3.2) that converges to a point $x^0 \in X[\theta, t_0, x_0, V \div Q]$, and the solution $x^0$ is not weakly efficient in problem (3.2). This means that there is an alternative (solution) $x^* \in X[\theta, t_0, x_0, V \div Q]$ such that
$$F_i(x^*) > F_i(x^0), \quad i \in N.$$
The functions $F_i(x)$, $i \in N$, are continuous at the point $x^0 \in X[\theta, t_0, x_0, V \div Q]$; thus in $X[\theta, t_0, x_0, V \div Q]$ there exists a sufficiently small neighborhood $D(x^0)$ of the point $x^0$ such that, for any $y \in D(x^0)$, the following inequalities are simultaneous:
$$F_i(x^*) > F_i(y), \quad i \in N.$$
But $x^{(k)} \to x^0$ as $k \to \infty$; therefore, for sufficiently large $k$, the point $x^{(k)} \in D(x^0)$, and thus
$$F_i(x^*) > F_i(x^{(k)}), \quad i \in N.$$
The latter inequality is incompatible with the weak efficiency of $x^{(k)}$ in problem (3.2). The continuity of $F_i(x)$ (Condition 1.2, Chapter 2) and the compactness of $X^s$ imply the compactness of the set
$$F(X^s) = \bigcup_{x \in X^s} F(x)$$
and, consequently (since $X^s = X^S$), the compactness of the set $F(X^S)$ of (3.1) as well.
Remark 3.1. The compactness of $X^S$ is a "convenient" property: as a rule, the set $X^S$ is sufficiently large and its "shrinkage" (or the choice of a specific strategy $V^S \in \mathcal{V}^S$) usually requires the use (and maximization) of additional criteria defined for the set $X^S$. The compactness of $X^S$ ensures that there exists a point $x^*$ of the set $X^S$ at which a continuous additional criterion such as $F_k(x[\theta])$, $k > N$, attains its maximum. A strategy $V^S \in \mathcal{V}^S$ that leads to such a maximum may be found such that
$$x^* = x[\theta, t_0, x_0, V^S] \tag{3.3}$$
for any quasimotions $x[\cdot, t_0, x_0, V^S]$. The existence of $V^S$ satisfying (3.3) is established in Corollary 4.1, Chapter 1.
3.2. Formalization of the Relation $>_A$ and its Properties
Let $A$ be a constant $(N \times N)$-dimensional matrix whose elements are $a_{ij}$, $i, j \in N$. We consider the polyhedral cone $K = \{F \in \mathbb{R}^N \mid AF > 0_N\}$, where $AF$ is the vector $AF = (\sum_{j \in N} a_{1j} F_j, \ldots, \sum_{j \in N} a_{Nj} F_j)$. Then $AF > 0_N$ means that
$$\sum_{j \in N} a_{ij} F_j > 0, \quad i \in N.$$
On $\mathbb{R}^N$ we introduce an order relation,
$$F^{(1)} >_A F^{(2)} \iff F^{(1)} - F^{(2)} \in K.$$
Here the vector inequality $F^{(1)} >_A F^{(2)}$ is equivalent to the $N$ scalar inequalities
$$\sum_{j \in N} a_{ij} [F_j^{(1)} - F_j^{(2)}] > 0, \quad i \in N,$$
or
$$\sum_{j \in N} a_{ij} F_j^{(1)} > \sum_{j \in N} a_{ij} F_j^{(2)}, \quad i \in N.$$
The relation $F^{(1)} \not>_A F^{(2)}$ means that $F^{(1)} >_A F^{(2)}$ is false. Let us consider the multicriterial static problem (2.5),
$$\langle X[\theta, t_0, x_0, V \div Q], F(x) \rangle. \tag{3.4}$$
If Conditions 1.1 and 1.2 from Chapter 2 hold, the set $X[\theta, t_0, x_0, V \div Q]$ is a compactum in $\mathbb{R}^n$ and the components $F_i(x)$, $i \in N$, of the vector function $F(x)$ are continuous.
Definition 3.1. The vector $x^A \in X[\theta, t_0, x_0, V \div Q]$ will be called the A-maximal solution of problem (3.4) if
$$F(x) \not>_A F(x^A), \quad \forall x \in X[\theta, t_0, x_0, V \div Q]. \tag{3.5}$$
The solution $x_A \in X[\theta, t_0, x_0, V \div Q]$ is A-minimal for problem (3.4) if
$$F(x) \not<_A F(x_A), \quad \forall x \in X[\theta, t_0, x_0, V \div Q].$$
Remark 3.2. The A-maximal solution of problem (3.4) is A-minimal for $\langle X[\theta, t_0, x_0, V \div Q], -F(x) \rangle$, the converse also being true. The set of A-maximal solutions of (3.4) will be denoted $X^A$. Relations (3.5) imply, in "scalar" form, that, for any $x \in X[\theta, t_0, x_0, V \div Q]$, the following system of inequalities is nonsimultaneous:
$$\sum_{j \in N} a_{ij} F_j(x) > \sum_{j \in N} a_{ij} F_j(x^A), \quad i \in N,$$
that is, the alternative $x^A$ is a weakly efficient solution of the multicriterial problem
$$\langle X[\theta, t_0, x_0, V \div Q], AF(x) \rangle, \tag{3.6}$$
where the vector criterion is
$$AF(x) = \Big(\sum_{j \in N} a_{1j} F_j(x), \ldots, \sum_{j \in N} a_{Nj} F_j(x)\Big).$$
Consequently, the A-optimality of the solution $x^A$ in problem (3.2) is equivalent to the weak efficiency (Slater optimality) of this solution of (3.6). The next propositions follow immediately from the continuity of the scalar functions $F_i(x)$, $i \in N$, the compactness of $X[\theta, t_0, x_0, V \div Q]$, and (16.3).
Proposition 3.2. The set $X^A$ of A-maximal solutions of problem (3.2) is a nonempty compact subset of $X[\theta, t_0, x_0, V \div Q]$.

Proof. We introduce a scalar function $q(x) = \sum_{i,j \in \mathbf{N}} a_{ij}F_j(x)$. Since $F_j(x)$ is continuous over the compactum $X[\theta, t_0, x_0, V \div Q]$, so is $q(x)$. Therefore, there exists a solution $x^A$ such that
$$\max_{x \in X[\theta, t_0, x_0, V \div Q]} q(x) = q(x^A). \tag{3.7}$$
Let us show that the point $x^A$ is an A-maximal solution of problem (3.2). We assume the contrary, i.e., that $x^A$ of (3.7) is not A-maximal in problem (3.2). Then, by Definition 3.1, there is a solution $\bar{x} \in X[\theta, t_0, x_0, V \div Q]$ for which the strict inequalities
$$\sum_{j \in \mathbf{N}} a_{ij}F_j(\bar{x}) > \sum_{j \in \mathbf{N}} a_{ij}F_j(x^A), \quad i \in \mathbf{N},$$
are simultaneous. Termwise summation of the inequalities of this system yields the inequality
$$\sum_{i,j \in \mathbf{N}} a_{ij}F_j(\bar{x}) > \sum_{i,j \in \mathbf{N}} a_{ij}F_j(x^A),$$
which is incompatible with (3.7). Consequently, $x^A$ is an A-maximal solution of problem (3.2). The compactness of the set $X^A$ of such solutions follows from the weak efficiency of every $x^A$ for problem (3.6) and from Lemma 3.1.

The next proposition follows from Proposition 3.2 and the continuity of $F_i(x)$, $i \in \mathbf{N}$, for $x \in X[\theta, t_0, x_0, V \div Q]$.

Proposition 3.3. The set
$$F(X^A) = \bigcup_{x \in X^A} F(x)$$
is a nonempty compact subset of $F(X[\theta, t_0, x_0, V \div Q]) \subset \mathbb{R}^N$.
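The existence argument in the proof of Proposition 3.2 is constructive: any maximizer of $q(x) = \sum_{i,j} a_{ij}F_j(x)$ is A-maximal. A minimal sketch on a finite list of criterion vectors follows; the matrix, the sample vectors, and the helper names are hypothetical, and the finite list only approximates the compact set used in the proof.

```python
def q(A, F):
    """Scalar function q = sum over i and j of a_ij * F_j."""
    return sum(a_ij * f_j for row in A for a_ij, f_j in zip(row, F))


def a_greater(A, G, F):
    """True when G >_A F: every row of A applied to (G - F) is positive."""
    return all(
        sum(a_ij * (g - f) for a_ij, g, f in zip(row, G, F)) > 0 for row in A
    )


def a_maximal_values(A, values):
    """Values that no other listed value exceeds in the sense of >_A."""
    return [F for F in values if not any(a_greater(A, G, F) for G in values)]


# Hypothetical positive matrix and criterion vectors F(x) for five alternatives.
A = [[2.0, 1.0], [1.0, 3.0]]
values = [(0.0, 4.0), (1.0, 3.0), (3.0, 1.0), (4.0, 0.0), (1.0, 1.0)]

best = max(values, key=lambda F: q(A, F))   # a maximizer of q, as in (3.7)
maximal = a_maximal_values(A, values)
print(best, best in maximal)
print(maximal)
```

The maximizer of $q$ is only one element of the A-maximal set; the other elements are found by the direct test of Definition 3.1.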
Similar propositions also hold for A-minimal solutions of problem (3.2). Furthermore,

(1) if $A = E_N$, the identity $(N \times N)$-dimensional matrix, the A-maximal solution of (3.2) coincides with the weakly efficient solution,
(2) the notion of an A-maximal solution is a special case of a solution of (3.2) that is optimal over a polyhedral cone [92].
3.3. A-optimal Strategies of Problem (1.1) in Chapter 2

In this section Conditions 1.1, 1.2 (Chapter 2) are assumed to hold, along with the following condition.
Condition 3.1. The elements $a_{ij}$ of the constant matrix $A$ are positive, i.e., $a_{ij} > 0$, $i, j \in \mathbf{N}$.

Definition 3.2. The strategy $V^A \in \mathcal{V}$ is referred to as A-maximal in the multicriterial dynamic problem (1.1, Chapter 2) for the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ if there is no strategy $V \in \mathcal{V}$ such that
$$F(x[\theta, t_0, x_0, V]) >_A F(x[\theta, t_0, x_0, V^A]) \tag{3.8}$$
for all quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^A]$. The set of A-maximal strategies is denoted $\mathcal{V}^A$.

Remark 3.3. Either of the following two definitions is equivalent to Definition 3.2, the initial position $(t_0, x_0)$ being assumed fixed.

1. The strategy $V^A$ is A-maximal in (1.1, Chapter 2) if, for all $V \in \mathcal{V}$, the following system of inequalities is nonsimultaneous:
$$\sum_{j \in \mathbf{N}} a_{ij}F_j(x[\theta, t_0, x_0, V]) > \sum_{j \in \mathbf{N}} a_{ij}F_j(x[\theta, t_0, x_0, V^A]), \quad i \in \mathbf{N}, \tag{3.9}$$
for all quasimotions $x[\cdot, t_0, x_0, V]$ and $x[\cdot, t_0, x_0, V^A]$.

2. The strategy $V^A$ is A-maximal in (1.1, Chapter 2) if, for every 3-tuple $(V, x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^A])$, or any strategy $V \in \mathcal{V}$ and some associated pair of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^A])$, there exists an ordinal number $i_* \in \mathbf{N}$ such that
$$\sum_{j \in \mathbf{N}} a_{i_*j}F_j(x[\theta, t_0, x_0, V]) \le \sum_{j \in \mathbf{N}} a_{i_*j}F_j(x[\theta, t_0, x_0, V^A]). \tag{3.10}$$
The relationship between A-maximal strategies and Geoffrion-maximal strategies in (1.1, Chapter 2) is established by the following theorem. Conditions 1.1 and 1.2 (Chapter 2) and Condition 3.1 are assumed to hold.
Theorem 3.1. Every A-maximal strategy of (1.1, Chapter 2) with $(t_0, x_0)$ is Geoffrion-maximal in problem (5.1).

Consequently, if the elements of an $(N \times N)$-dimensional matrix $A$ are positive, the A-maximal strategy $V^A$ is Geoffrion-maximal. This means that $V^A$ (Definition 1.1)

(1) is Pareto-maximal, i.e., for any strategy $V \in \mathcal{V}$ and any pair of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^A])$, the system of inequalities
$$F_i(x[\theta, t_0, x_0, V]) \ge F_i(x[\theta, t_0, x_0, V^A]), \quad i \in \mathbf{N},$$
at least one of which is a strict inequality, is nonsimultaneous,

(2) there exists a positive number $M$ such that, for any ordinal number $i \in \mathbf{N}$, strategy $V \in \mathcal{V}$, and pair of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^A])$ for which
$$F_i(x[\theta, t_0, x_0, V]) > F_i(x[\theta, t_0, x_0, V^A]) \tag{3.11}$$
and some $j \in \mathbf{N}$ for which
$$F_j(x[\theta, t_0, x_0, V]) < F_j(x[\theta, t_0, x_0, V^A]), \tag{3.12}$$
it is true that
$$F_i(x[\theta, t_0, x_0, V]) - F_i(x[\theta, t_0, x_0, V^A]) \le M\big[F_j(x[\theta, t_0, x_0, V^A]) - F_j(x[\theta, t_0, x_0, V])\big]. \tag{3.13}$$

Moreover, the positive constant $M$ may be specified by the formula
$$M = \max_{i,j \in \mathbf{N}} \Big( a_{ij}^{-1} \sum_{l \in \mathbf{N}} a_{il} - 1 \Big). \tag{3.14}$$
With this definition, the proof of Theorem 3.1 will proceed in two stages. On the first stage, $V^A$ will be shown to be Pareto-optimal and, on the second, (3.13) will be shown to hold where $M$ is given by (3.14).

Proof. First stage. Assuming that $V^A$ is A-maximal in (1.1, Chapter 2) with $(t_0, x_0)$, let us show that $V^A$ is Pareto-maximal. If it is not, there exist a strategy $\bar{V} \in \mathcal{V}$ and a pair of quasimotions $(\bar{x}[\cdot, t_0, x_0, \bar{V}], \bar{x}[\cdot, t_0, x_0, V^A])$ such that the following system of inequalities is simultaneous:
$$F_j(\bar{x}[\theta, t_0, x_0, \bar{V}]) \ge F_j(\bar{x}[\theta, t_0, x_0, V^A]), \quad j \in \mathbf{N},$$
with at least one of the inequalities strict. Multiplying the $j$-th inequality by a positive number $a_{ij}$, an element of $A$, and summing over $j \in \mathbf{N}$ we have, with $i$ varying from 1 to $N$, a system of strict inequalities,
$$\sum_{j \in \mathbf{N}} a_{ij}F_j(\bar{x}[\theta, t_0, x_0, \bar{V}]) > \sum_{j \in \mathbf{N}} a_{ij}F_j(\bar{x}[\theta, t_0, x_0, V^A]), \quad i \in \mathbf{N}.$$
These inequalities are incompatible with the fact that the system (3.9) is nonsimultaneous for any $V \in \mathcal{V}$ and $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^A])$. This contradiction proves the Pareto-optimality of $V^A$.
Second stage, or proof of Condition (2) for $M$ of (3.14). By the definition of a Pareto-optimal strategy of (1.1, Chapter 2) and the inclusion $V^A \in \mathcal{V}^P$ determined on the first stage, for every $V \in \mathcal{V}$ and pair of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^A])$ for which a subscript $i \in \mathbf{N}$ exists such that (3.11) is true, there is a subscript $j \in \mathbf{N}$ for which (3.12) is true. Now fix a strategy $V \in \mathcal{V}$ and pair $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^A])$. Let us assume that $i_0 \in \mathbf{N}$ is chosen so that
$$F_{i_0}(x[\theta, t_0, x_0, V]) - F_{i_0}(x[\theta, t_0, x_0, V^A]) = \max_{i \in \mathbf{N}_1} \big[F_i(x[\theta, t_0, x_0, V]) - F_i(x[\theta, t_0, x_0, V^A])\big], \tag{3.15}$$
where
$$\mathbf{N}_1 = \{i \in \mathbf{N} \mid F_i(x[\theta, t_0, x_0, V]) > F_i(x[\theta, t_0, x_0, V^A])\}, \tag{3.16}$$
or $F_{i_0}(x[\theta, t_0, x_0, V]) - F_{i_0}(x[\theta, t_0, x_0, V^A])$ is the largest positive difference of the form $F_i(x[\theta, t_0, x_0, V]) - F_i(x[\theta, t_0, x_0, V^A])$ for fixed $V \in \mathcal{V}$ and $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^A])$. Let us also find, for the same $V$ and the same $x[\cdot, t_0, x_0, V]$, $x[\cdot, t_0, x_0, V^A]$, an ordinal number $j_0 \in \mathbf{N}$ such that
$$F_{j_0}(x[\theta, t_0, x_0, V]) - F_{j_0}(x[\theta, t_0, x_0, V^A]) = \min_{j \in \mathbf{N}_2} \big[F_j(x[\theta, t_0, x_0, V]) - F_j(x[\theta, t_0, x_0, V^A])\big] \tag{3.17}$$
with constraints
$$\mathbf{N}_2 = \{j \in \mathbf{N} \mid F_j(x[\theta, t_0, x_0, V]) < F_j(x[\theta, t_0, x_0, V^A])\}, \tag{3.18}$$
or $F_{j_0}(x[\theta, t_0, x_0, V]) - F_{j_0}(x[\theta, t_0, x_0, V^A])$ is the negative difference with the largest absolute value $|F_j(x[\theta, t_0, x_0, V]) - F_j(x[\theta, t_0, x_0, V^A])|$.
Now to verify Condition (2) it is sufficient to show that
$$F_{i_0}(x[\theta, t_0, x_0, V]) - F_{i_0}(x[\theta, t_0, x_0, V^A]) \le M\big[F_{j_0}(x[\theta, t_0, x_0, V^A]) - F_{j_0}(x[\theta, t_0, x_0, V])\big], \tag{3.19}$$
where $M$ is independent of the choice of $V$ and the pair of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^A])$. Let us assume the contrary, i.e.,
$$F_{i_0}(x[\theta, t_0, x_0, V]) - F_{i_0}(x[\theta, t_0, x_0, V^A]) > M\big[F_{j_0}(x[\theta, t_0, x_0, V^A]) - F_{j_0}(x[\theta, t_0, x_0, V])\big]. \tag{3.20}$$
Since $V^A$ is A-optimal, for $V$ and $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^A])$ there exists, by Remark 3.3 (2), an ordinal number $i_*(V, x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^A]) = i_* \in \mathbf{N}$ such that (3.10), i.e.,
$$\sum_{j \in \mathbf{N} \setminus i_0} a_{i_*j}\big[F_j(x[\theta, t_0, x_0, V]) - F_j(x[\theta, t_0, x_0, V^A])\big] + a_{i_*i_0}\big[F_{i_0}(x[\theta, t_0, x_0, V]) - F_{i_0}(x[\theta, t_0, x_0, V^A])\big] \le 0 \tag{3.21}$$
holds. In (3.21) we substitute (3.20) for the difference $F_{i_0}(x[\theta, t_0, x_0, V]) - F_{i_0}(x[\theta, t_0, x_0, V^A])$ and for the differences $F_j(x[\theta, t_0, x_0, V]) - F_j(x[\theta, t_0, x_0, V^A])$ the negative number $F_{j_0}(x[\theta, t_0, x_0, V]) - F_{j_0}(x[\theta, t_0, x_0, V^A])$ from (3.17) with $j \ne i_0$. Since (3.21) and (3.20) are true and $a_{ij}$ is positive,
$$\sum_{j \in \mathbf{N} \setminus i_0} a_{i_*j}\big[F_{j_0}(x[\theta, t_0, x_0, V]) - F_{j_0}(x[\theta, t_0, x_0, V^A])\big] + a_{i_*i_0}M\big[F_{j_0}(x[\theta, t_0, x_0, V^A]) - F_{j_0}(x[\theta, t_0, x_0, V])\big] < 0.$$
We divide by the negative number $F_{j_0}(x[\theta, t_0, x_0, V]) - F_{j_0}(x[\theta, t_0, x_0, V^A])$. Then
$$\sum_{j \in \mathbf{N} \setminus i_0} a_{i_*j} > a_{i_*i_0}M,$$
or, in light of (3.14),
$$M \ge a_{i_*i_0}^{-1} \sum_{j \in \mathbf{N} \setminus i_0} a_{i_*j} > M.$$
The resultant contradiction, i.e., that $M < M$, proves the theorem.
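Theorem 3.1 can be checked numerically on a finite list of criterion vectors. The sketch below assumes that (3.14) has the form used in Remark 3.5 and Property 1.1 (Chapter 5), namely $M = \max_i (\sum_j a_{ij} / \min_j a_{ij} - 1)$; the matrix, the sample vectors, and all function names are hypothetical.

```python
def geoffrion_constant(A):
    """M = max over rows of (row sum / smallest row entry - 1), A positive."""
    if any(a <= 0 for row in A for a in row):
        raise ValueError("all entries of A must be positive (Condition 3.1)")
    return max(sum(row) / min(row) - 1.0 for row in A)


def a_greater(A, G, F):
    """True when G >_A F."""
    return all(
        sum(a_ij * (g - f) for a_ij, g, f in zip(row, G, F)) > 0 for row in A
    )


def satisfies_bound(F_star, values, M):
    """Inequality (3.13) at F_star: every gain in some criterion i is paid for
    by a loss in some criterion j with gain <= M * loss."""
    n = len(F_star)
    for F in values:
        for i in range(n):
            gain = F[i] - F_star[i]
            if gain <= 0:
                continue
            if not any(
                F[j] < F_star[j] and gain <= M * (F_star[j] - F[j])
                for j in range(n)
            ):
                return False
    return True


A = [[2.0, 1.0], [1.0, 3.0]]
values = [(0.0, 4.0), (1.0, 3.0), (3.0, 1.0), (4.0, 0.0), (1.0, 1.0), (-1.0, 4.1)]

M = geoffrion_constant(A)
a_maximal = [F for F in values if not any(a_greater(A, G, F) for G in values)]
print(M, a_maximal)
print([satisfies_bound(F, values, M) for F in a_maximal])
print(satisfies_bound((-1.0, 4.1), values, M))
```

The last vector is Pareto-maximal in the list but not A-maximal, and it violates the bound with this $M$: relative to it, a gain of 1 in the first criterion costs only 0.1 in the second.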
Remark 3.4. The constant M > 0 of (3.14) cannot be any smaller, as is shown in the following example.
where $x = (x_1, x_2)$, $u = (u_1, u_2)$. With a $(2 \times 2)$-dimensional matrix $A$ all of whose elements are equal to one, the reachability domain $X[\theta, t_0, x_0, V \div Q]$ in a two-dimensional state plane $\{x_1, x_2\}$ takes the form shown in Fig. 4.3.1. Since $AF(x[\theta]) = (x_1[1] + x_2[1], x_1[1] + x_2[1])$, in this problem the set $\mathcal{V}^A$ of A-maximal strategies $V^A$ is defined in such a way that the set
$$X^A = \bigcup_{V \in \mathcal{V}^A} X[\theta, t_0, x_0, V]$$
takes the form shown in Fig. 4.3.1 as a heavy line. With two A-maximal strategies $V^{(1)} \div (0, 1)$ and $V^{(2)} \div (1, 0)$,
$$F_1(x[\theta, t_0, x_0, V^{(2)}]) - F_1(x[\theta, t_0, x_0, V^{(1)}]) = x_1[\theta, t_0, x_0, V^{(2)}] - x_1[\theta, t_0, x_0, V^{(1)}] = 1$$
$$= x_2[\theta, t_0, x_0, V^{(1)}] - x_2[\theta, t_0, x_0, V^{(2)}] = F_2(x[\theta, t_0, x_0, V^{(1)}]) - F_2(x[\theta, t_0, x_0, V^{(2)}]).$$
Consequently, for the pair of numbers
$$F_1(x[\theta, t_0, x_0, V^{(1)}]) - F_1(x[\theta, t_0, x_0, V^{(2)}]) = -1 < 0$$
and
$$F_2(x[\theta, t_0, x_0, V^{(1)}]) - F_2(x[\theta, t_0, x_0, V^{(2)}]) = +1 > 0,$$
$M = 1$ and (3.13) does not hold if $M < 1$.

Figure 4.3.1.

Remark 3.5. To obtain a rough estimate of the number $M$, we consider, for every fixed $i \in \mathbf{N}$, two numbers
$$\alpha_i = \min_{j \in \mathbf{N}} a_{ij} \quad \text{and} \quad \beta_i = \sum_{j \in \mathbf{N}} a_{ij} \ge N\alpha_i.$$
Then
$$M = \max_{i \in \mathbf{N}} \Big(\frac{\beta_i}{\alpha_i} - 1\Big) \ge N - 1.$$
Consequently, $M \ge N - 1$.

Proposition 3.4.
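Equality ($M = N - 1$) holds if and only if
$$a_{i1} = a_{i2} = \cdots = a_{iN}, \quad i \in \mathbf{N}. \tag{3.22}$$

Proof. Necessity is proved by contradiction. Let $M = N - 1$ and assume (3.22) does not hold for some $i \in \mathbf{N}$. Then
$$\beta_i = \sum_{j \in \mathbf{N}} a_{ij} > N \min_{j \in \mathbf{N}} a_{ij} = N\alpha_i$$
and
$$M = \max_{i \in \mathbf{N}} \frac{\beta_i}{\alpha_i} - 1 > N - 1,$$
which contradicts the fact that $M = N - 1$.

Sufficiency. If (3.22) holds,
$$\alpha_i = a_{i1}$$
and
$$\beta_i = N a_{i1} = N\alpha_i.$$
Therefore,
$$M = \max_{i \in \mathbf{N}} \frac{\beta_i}{\alpha_i} - 1 = N - 1.$$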
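The estimate $M \ge N - 1$ of Remark 3.5 and the equality case (3.22) are easy to check numerically. The two matrices below are hypothetical, and the expression used for $M$ is the one from Remark 3.5, $\max_i (\beta_i / \alpha_i - 1)$.

```python
def geoffrion_constant(A):
    """M = max_i (beta_i / alpha_i - 1), beta_i = row sum, alpha_i = row minimum."""
    return max(sum(row) / min(row) - 1.0 for row in A)


def rows_are_constant(A):
    """Condition (3.22): all entries within each row of A coincide."""
    return all(min(row) == max(row) for row in A)


# Hypothetical 3 x 3 matrices: constant rows, and one perturbed entry.
equal_rows = [[2.0, 2.0, 2.0], [5.0, 5.0, 5.0], [1.0, 1.0, 1.0]]
perturbed = [[2.0, 2.0, 2.0], [5.0, 5.0, 6.0], [1.0, 1.0, 1.0]]

for A in (equal_rows, perturbed):
    N = len(A)
    M = geoffrion_constant(A)
    print(rows_are_constant(A), M, M >= N - 1)
```

Rows may differ from one another; only the entries within each row have to coincide for the bound to be attained.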
Remarks on Theorem 3.1. (1) The geometric interpretation of the A-maximal $V^A$ differs from that of the Geoffrion-maximal $V^G$ by the fact that the angle $\alpha$ in Fig. 4.1 is fixed for $V^A$ and "governed" by the matrix $A$. For $V^G$, $\alpha$ is governed by the existence of the number $M > 0$ of Definition 1.1 and is always associated with a specific $V^G$ and corresponding pairs of quasimotions $(x[\cdot, t_0, x_0, V], x[\cdot, t_0, x_0, V^G])$.

(2) One disadvantage of optimal solutions to multicriterial problems is the fact that, as noted at the end of section 6 in Chapter 3, the set of such optimal solutions (strategies) is, as a rule, infinite. It would be natural to try to shrink this set, for instance by using A-maximal strategies as solutions of (1.1, Chapter 2). Indeed, by Theorem 3.1, every A-maximal strategy is Geoffrion-maximal, i.e., the set
$$\mathcal{V}^A \subset \mathcal{V}^G,$$
the set of Geoffrion-maximal strategies. Combining this inclusion with Properties 1.3 of Chapter 3 and 1.2 leads to the following proposition.

Proposition 3.5. For (1.1, Chapter 2) with $(t_0, x_0)$ the following chain of inclusions holds:
$$\mathcal{V}^A \subset \mathcal{V}^G \subset \mathcal{V}^P \subset \mathcal{V}^S, \tag{3.23}$$
where $\mathcal{V}^A$ is the set of A-maximal strategies (Definition 3.2); $\mathcal{V}^G$ is the set of Geoffrion-maximal strategies (Definition 1.1); $\mathcal{V}^P$ is the set of Pareto-maximal strategies (Definition 1.1, Chapter 3); $\mathcal{V}^S$ is the set of Slater-maximal strategies (Definition 1.1, Chapter 2). Consequently, $\mathcal{V}^A$ is the "smallest" set of the solutions discussed in Chapters 2-4.

(3) The set $\mathcal{V}^A$ of A-maximal strategies is found by considering an arbitrary $(N \times N)$-dimensional matrix $A$ with positive constant elements $a_{ij}$ and replacing (1.1, Chapter 2) by a "new" multicriterial problem with the criterion vector
$$G_A(x[\theta]) = AF(x[\theta]) \tag{3.24}$$
that differs from (1.1, Chapter 2) only in the criterion vector $AF(x[\theta])$, i.e., every "new" scalar criterion in (3.24) is a linear combination, with positive coefficients, of the "old" criteria of (1.1, Chapter 2). We have
$$G_{A,i}(x[\theta]) = \sum_{j \in \mathbf{N}} a_{ij}F_j(x[\theta]), \quad i \in \mathbf{N}.$$
The set $\widehat{\mathcal{V}}^S$ of Slater-maximal strategies of (3.24) coincides, by Definition 3.2, with the set $\mathcal{V}^A$, since any Slater-maximal strategy of problem (3.24) is A-maximal for (1.1, Chapter 2). Therefore to devise A-maximal strategies, all the results of Chapter 3 must be applied to the "transformed" problem (3.24) rather than to (1.1, Chapter 2).

(4) The set
$$X^P = \bigcup_{V \in \mathcal{V}^P} X[\theta, t_0, x_0, V] \tag{3.25}$$
of right ends of quasimotions $x[\cdot, t_0, x_0, V^P]$ generated by all possible Pareto-maximal strategies $V^P$ of (1.1, Chapter 2) suffers from the disadvantage that it is not necessarily closed or, therefore, compact. Thus, in Fig. 4.3.2 the efficient points of the shaded set in the criterion space $\{F_1, F_2\}$ are the segments $AB$ and $CD$ with point $C$ excised. Since the criteria $F_i(x)$ are continuous, the associated set $\widehat{X}^P$ of efficient solutions of the problem
$$\langle X[\theta, t_0, x_0, V \div Q], \{F_1(x), F_2(x)\} \rangle$$
is not closed either. Also, by the above structure of the solutions (Proposition 3.1, Chapter 3), the set $\widehat{X}^P$ coincides with $X^P$ from (3.25). Therefore, the set $X^P$ is not necessarily closed or, consequently, compact.
Figure 4.3.2.
The non-compactness of $X^P$ makes it difficult to undertake decision-making procedures in (1.1, Chapter 2) by means of additional requirements (criteria) defined for that set. Even if these criteria are continuous, the associated extrema will not necessarily be attained at points of $X^P$. Neither can Corollary 4.1 (Chapter 1) be applied to obtain extrema with $V \in \mathcal{V}^P$. This disadvantage can be eliminated if the "tentative optimal" set, e.g., $X^P$, is a compactum. Lemma 3.1 establishes the compactness of the set $X^S$ from (3.1, Chapter 2). But from (3.23) we have the inclusions
$$X^A \subset X^G \subset X^P \subset X^S,$$
where the set
$$X^K = \bigcup_{V \in \mathcal{V}^K} X[\theta, t_0, x_0, V] \quad (K = A, G, P, S)$$
is formed by the right ends of the quasimotions $x[\cdot, t_0, x_0, V^K]$ as the strategy $V^K$ "scans" the entire set $\mathcal{V}^K$. Therefore, the set $X^S$ is the "largest" of all the sets; moreover, $X^G$ and $X^P$ are not necessarily closed. In our view, the important thing is that the set $X^A$ is, for any constant $(N \times N)$-dimensional matrix $A$ with positive coefficients, a compact subset of the reachability domain $X[\theta, t_0, x_0, V \div Q]$. Specifically, the following proposition holds.
Proposition 3.6. If Conditions 1.1 and 1.2 of Chapter 2 are fulfilled, for any choice of the $(N \times N)$-dimensional matrix $A$ with positive coefficients, the set $X^A$ is a nonempty compactum in $\mathbb{R}^n$.

The set $X^A$ is formed by the right (with $t = \theta$) ends of the quasimotions $x[\cdot, t_0, x_0, V^A]$ of the system (1.2, Chapter 2) as $V^A$ scans the entire set of A-maximal strategies $\mathcal{V}^A$. The proof follows from the fact that $\mathcal{V}^A$ of problem (1.1, Chapter 2), for $(t_0, x_0)$, coincides with the set $\widehat{\mathcal{V}}^S$ of Slater-maximal strategies of problem (3.24) from the same initial position (by Definition 3.2). By Lemma 3.1, the set
$$\widehat{X}^S = \bigcup_{V \in \widehat{\mathcal{V}}^S} X[\theta, t_0, x_0, V]$$
is a compact subset of the reachability domain $X[\theta, t_0, x_0, V \div Q]$. Simultaneously, $\widehat{X}^S = X^A$, since $\widehat{\mathcal{V}}^S = \mathcal{V}^A$ by Definition 3.1. Consequently, $X^A$ is a compact subset of $X[\theta, t_0, x_0, V \div Q]$.

Therefore, in making optimal decisions in (1.1, Chapter 2) A-optimality seems to offer more advantages since:

(1) the set of A-maximal strategies is the "smallest" of all the $\mathcal{V}^K$ ($K = A, G, P, S$):
$$\mathcal{V}^A \subset \mathcal{V}^G \subset \mathcal{V}^P \subset \mathcal{V}^S;$$

(2) the set $X^A$ is a compactum in $\mathbb{R}^n$, which guarantees that there exist "updated" optimal strategies that make use of additional criteria to determine the "best" values of the goal functional vector $F(x[\theta])$;

(3) the fact that $\mathcal{V}^A$ coincides with $\widehat{\mathcal{V}}^S$, the set of Slater-maximal strategies of problem (3.24), makes it possible to use the results of Chapter 2 to formulate the properties of A-maximal strategies, in particular, to establish existence, internal and dynamic stability, inheritance and rejection, sufficient conditions, and the existence of generally applicable A-maximal strategies under the assumption that Conditions 1.1 and 1.2 of Chapter 2 are satisfied. In the case of A-optimality the solutions are structured in the same way as in the case of Slater (Pareto, Geoffrion) optimality. These properties are proved by the results described in Chapter 2, albeit with obvious modifications.
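The inclusions between the A-maximal, Pareto-maximal, and Slater-maximal sets, and the reduction of A-maximality to Slater maximality for the transformed criterion $AF$ of (3.24), can be illustrated on a finite list of criterion vectors. This is a sketch under stated assumptions: the matrix, the sample vectors, and the helper names are hypothetical, and the Geoffrion set is omitted because on a finite list it coincides with the Pareto set.

```python
def transform(A, F):
    """Criterion vector of the transformed problem (3.24): G_A = A F."""
    return tuple(sum(a_ij * f_j for a_ij, f_j in zip(row, F)) for row in A)


def slater_greater(G, F):
    """Strict increase in every criterion."""
    return all(g > f for g, f in zip(G, F))


def pareto_greater(G, F):
    """No criterion decreases and at least one increases."""
    return all(g >= f for g, f in zip(G, F)) and any(g > f for g, f in zip(G, F))


def maximal(values, greater, image=lambda F: F):
    """Values whose image is not exceeded, in the given sense, by another image."""
    return [
        F for F in values
        if not any(greater(image(G), image(F)) for G in values)
    ]


A = [[2.0, 1.0], [1.0, 3.0]]
values = [(0.0, 4.0), (1.0, 3.0), (3.0, 1.0), (4.0, 0.0),
          (0.5, 0.5), (-1.0, 4.1), (0.0, 3.0)]

X_S = maximal(values, slater_greater)
X_P = maximal(values, pareto_greater)
# A-maximal values are the Slater-maximal values of the transformed problem.
X_A = maximal(values, slater_greater, image=lambda F: transform(A, F))

print(X_A)
print(X_P)
print(X_S)
```

In this sample both inclusions are strict: the vector $(0, 3)$ is weakly efficient but not efficient, and $(-1, 4.1)$ is efficient but not A-maximal.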
Chapter 5
Vector-Valued Saddle Points
Slater, Pareto, Geoffrion, and A-saddle points will now be determined for differential positional games with a vector-valued payoff. Their properties will be ascertained and those features which distinguish them from the saddle point of differential games with a scalar payoff identified.
1. Definition

1.1. Why Saddle Points in Differential Games?

Most papers, even the early pioneering efforts [35], were concerned with games such as
$$\dot{x} = f(t, x, u, v), \quad x(t_0) = x_0, \quad t_0 \le t \le \theta,$$
$$x \in \mathbb{R}^n, \quad u \in H \subset \mathbb{R}^h, \quad v \in Q \subset \mathbb{R}^q,$$
$$J(u, v) = F(x(\theta)) + \int_{t_0}^{\theta} g(t, x(t), u[t], v[t])\,dt. \tag{1.1}$$
Setting aside, for the time being, the strict mathematical formalization of the game and the appropriate range of the players' strategies, let us consider the game at an informal level. There are two facts to note:

First, the solution is usually represented as the saddle point $(u^0, v^0)$,
$$J(u^0, v) \le J(u^0, v^0) \le J(u, v^0) \tag{1.2}$$
for arbitrary strategies $u(\cdot)$ and $v(\cdot)$.

Second, the payoff function $J(u, v)$ is a sum of two terms, one terminal, $F(x(\theta))$, and the other integral,
$$\int_{t_0}^{\theta} g(t, x(t), u[t], v[t])\,dt.$$
This sum is obtained by designing a mathematical model of actual problems in which each term is a criterion. In effect, the initial real-world problems, especially in the area of system mechanics, are essentially multicriterial. Having overlooked this fact (or using subjective reasoning), numerous writers on antagonistic differential games convolve these criteria into a single criterion (usually a linear convolution with positive coefficients). Then a "new" differential game is solved by finding a saddle point for it. The question is then what might be the "game-theoretic" meaning of the point for the initial multicriterial problem. It turns out that the point is a Geoffrion and, consequently, a Pareto and a Slater saddle point. This result is illustrated by the following straightforward model problem [42].

Example 1.1. A vehicle (load) of mass $m$ is to be moved along the axis $x$ from point $A$ to point $B$ (Fig. 5.1.1), $x[t]$ being the coordinate of the vehicle at time $t$. The movement starts from position $x[t_0] = x_0$ at time $t_0 = 0$ with initial velocity $\dot{x}[t_0] = V_0$. The vehicle is to be moved at time $t = \theta$ up to point $B$ and then halted, or it is required that $V_\theta = 0$. Suppose the vehicle is moved by a propulsive force $u$ and wind force $v$. In the form of Newton's second law the motion equation is
$$m\ddot{x} = u - v, \quad x[t_0] = x_0, \quad \dot{x}[t_0] = V_0.$$
By the above constraints imposed on the motion $x[t]$ at time $\theta$, it is required that $u$ be chosen so as to minimize both of the following criteria:
$$J_1(u, v) = (x_B - x[\theta])^2, \quad J_2(u, v) = (\dot{x}[\theta])^2.$$

Figure 5.1.1.
The wind is assumed to constitute the greatest hindrance to this effect of $u$. If the vehicle engine generates a propulsion $u[t]$ at time $t$, the energy is
$$J_3(u, v) = \int_{t_0}^{\theta} u^2[t]\,dt.$$
Note that $u^2[t]$ may describe the power needed to create a propulsion $u[t]$. The vehicle may be assumed to have a wind generator. Then, if the vehicle is subjected to the action of the wind, $v[t]$, the generator produces energy $\int_{t_0}^{\theta} v^2[t]\,dt$. A fourth criterion is now introduced:
$$J_4(u, v) = -\int_{t_0}^{\theta} v^2[t]\,dt.$$
The motion $x[t]$, $t_0 \le t \le \theta$, may be made to "trace out," on average, the desired vehicle motion $y(t)$. Then a fifth criterion is introduced:
$$J_5(u, v) = \int_{t_0}^{\theta} [x[t] - y(t)]^2\,dt.$$
Consequently, in this example it is required to find a strategy $u(\cdot)$ that leads to the smallest possible values of the five criteria $J_i$ ($i = 1, 2, \ldots, 5$) with the wind $v$ countering this. This is a multicriterial problem which should be viewed as a differential game with vector-valued payoff function
$$J(u, v) = (J_1(u, v), \ldots, J_5(u, v)). \tag{1.3}$$
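For given open-loop controls the five criteria of Example 1.1 can be evaluated numerically. The sketch below is only an illustration: it integrates $m\ddot{x} = u - v$ with the explicit Euler method on $[0, \theta]$, and the function name, the default parameter values, and the sample controls are all assumptions made here.

```python
def criteria(u, v, y, *, m=1.0, x0=0.0, vel0=0.0, x_B=1.0, theta=1.0, steps=1000):
    """Explicit Euler integration of m * x'' = u - v on [0, theta].

    u, v and y are functions of time; returns the tuple (J1, ..., J5).
    """
    dt = theta / steps
    x, xdot = x0, vel0
    J3 = J4 = J5 = 0.0
    for k in range(steps):
        t = k * dt
        u_t, v_t = u(t), v(t)
        J3 += u_t ** 2 * dt           # energy spent on propulsion
        J4 -= v_t ** 2 * dt           # minus the energy harvested from the wind
        J5 += (x - y(t)) ** 2 * dt    # deviation from the desired motion y(t)
        x, xdot = x + xdot * dt, xdot + (u_t - v_t) / m * dt
    J1 = (x_B - x) ** 2               # squared miss distance at time theta
    J2 = xdot ** 2                    # squared residual velocity at time theta
    return J1, J2, J3, J4, J5


# Hypothetical inputs: constant thrust, constant head wind, linear reference motion.
J = criteria(u=lambda t: 2.0, v=lambda t: 1.0, y=lambda t: t)
print(J)
```

Positional strategies $u(t, x)$ and $v(t, x)$ would require passing the current state to the control functions; the open-loop form is used here only to keep the sketch short.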
In the theory of differential games this problem is usually solved in the following way [42, pp. 7-8]. A positive value $\beta_i$ of a criterial unit $J_i(u, v)$, $i = 1, \ldots, 5$, is assumed to be known. The problem is represented as a single-criterion problem with criterion
$$J(u, v) = \sum_{i=1}^{5} \beta_i J_i(u, v). \tag{1.4}$$
Then the saddle point $(u^0, v^0)$ of (1.2) is found,
$$J(u^0, v) \le J(u^0, v^0) \le J(u, v^0)$$
for all feasible strategies $u(\cdot)$ and $v(\cdot)$. For the initial multicriterial problem with vector criterion $J(u, v)$ from (1.3), this saddle point $(u^0, v^0)$ has certain properties:

(1) for any feasible strategies $v(\cdot)$, the following system of inequalities is nonsimultaneous:
$$J_i(u^0, v) \ge J_i(u^0, v^0), \quad i = 1, \ldots, 5, \tag{1.5}$$
at least one of which is a strict inequality;

(2) for any feasible strategies $u(\cdot)$, the following system of inequalities is nonsimultaneous:
$$J_i(u^0, v^0) \ge J_i(u, v^0), \quad i = 1, \ldots, 5, \tag{1.6}$$
at least one of which is a strict inequality.

We prove this assertion by contradiction. A feasible strategy $v = \bar{v}$ is assumed to exist such that the system (1.5) is simultaneous for $v = \bar{v}$, where at least one of the inequalities is strict. Multiplying both sides of the $i$-th such inequality by a positive number $\beta_i$ and summing over $i$ yields
$$\sum_{i=1}^{5} \beta_i J_i(u^0, \bar{v}) > \sum_{i=1}^{5} \beta_i J_i(u^0, v^0),$$
which is incompatible with the saddle-point condition (1.2) for the criterion (1.4). The nonsimultaneity of the system of inequalities (1.6), at least one of which is a strict inequality, may be proved in a similar way. The situation $(u^0, v^0)$ in which both (1.5) and (1.6) are incompatible, at least one in each system being a strict inequality, is referred to as the Pareto saddle point. The situation $(u^0, v^0)$ thus defined will be shown below to also be a Geoffrion and, consequently, Pareto and Slater saddle point. Consequently, vector-valued (Geoffrion, Pareto, and Slater) saddle points are obtained in solving zero-sum differential games when the payoff function is vector-valued. These results will be discussed in the present chapter.
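The argument can be replayed on a small finite game in which $u$ selects a row, $v$ selects a column, and each cell holds a vector of criteria. The sketch below is illustrative only: the payoff table, the weights $\beta_i$, and the function names are hypothetical, and pure-strategy saddle points of the scalarized matrix are assumed to exist.

```python
def scalarize(J, beta):
    """Linear convolution sum_i beta_i * J_i for every pair (u, v)."""
    return [[sum(b * c for b, c in zip(beta, cell)) for cell in row] for row in J]


def scalar_saddle_points(S):
    """Pairs (u0, v0) with S[u0][v] <= S[u0][v0] <= S[u][v0] for all u, v."""
    points = []
    for u0, row in enumerate(S):
        for v0, value in enumerate(row):
            column = [S[u][v0] for u in range(len(S))]
            if value == max(row) and value == min(column):
                points.append((u0, v0))
    return points


def pareto_greater(G, F):
    return all(g >= f for g, f in zip(G, F)) and any(g > f for g, f in zip(G, F))


def is_pareto_saddle(J, u0, v0):
    center = J[u0][v0]
    # (1.5): the maximizer v cannot weakly raise every criterion, one strictly.
    if any(pareto_greater(J[u0][v], center) for v in range(len(J[u0]))):
        return False
    # (1.6): the minimizer u cannot weakly lower every criterion, one strictly.
    if any(pareto_greater(center, J[u][v0]) for u in range(len(J))):
        return False
    return True


# Hypothetical 2 x 2 game with two criteria; rows are u, columns are v.
J = [[(1.0, 2.0), (3.0, 3.0)],
     [(4.0, 4.0), (2.0, 5.0)]]
beta = (1.0, 1.0)

saddles = scalar_saddle_points(scalarize(J, beta))
print(saddles, [is_pareto_saddle(J, u0, v0) for u0, v0 in saddles])
print(is_pareto_saddle(J, 0, 0))
```

The converse does not hold in general: a Pareto saddle point need not be a saddle point of the convolution for a given choice of weights.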
1.2. Formalization of Vector-Valued Saddle Points and Some of Their Properties

In an antagonistic differential positional game with vector-valued payoff function
$$\Gamma = \langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, F(x[\theta]) \rangle \tag{1.7}$$
1 and 2 are the players; the conflict-controlled system $\Sigma$ is described by equations (1.1) from Chapter 1,
$$\dot{x} = f(t, x, u, v); \tag{1.8}$$
the vectors $x \in \mathbb{R}^n$, $u \in H \in \operatorname{comp} \mathbb{R}^h$, and $v \in Q \in \operatorname{comp} \mathbb{R}^q$, time $t \in [t_0, \theta]$, where the constants $\theta > t_0 \ge 0$; unless otherwise specified, the initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ is assumed to be fixed, with $x[t_0] = x_0$. Condition 1.1 of section 1.1 in Chapter 1 is assumed to hold, or the vector-valued function $f(\cdot)$ is continuous and locally Lipschitz with respect to $x$, and (1.2, Chapter 1) is true. The set of strategies of player 1 (the minimizing player) is
$$\mathcal{U} = \{U \div u(t, x) \mid u(t, x) \in H, \ \forall (t, x) \in [t_0, \theta) \times \mathbb{R}^n\},$$
and of player 2 (the maximizing player),
$$\mathcal{V} = \{V \div v(t, x) \mid v(t, x) \in Q, \ \forall (t, x) \in [t_0, \theta) \times \mathbb{R}^n\}.$$
The quasimotions $x[\cdot, t_0, x_0, U]$, $x[\cdot, t_0, x_0, U, V]$, $x[\cdot, t_0, x_0, V]$ of the system (1.8) generated from the initial position $(t_0, x_0)$ by strategy $U$, situation $(U, V)$, and strategy $V$ are determined in section 2.3 of Chapter 1. Finally, for the vector-valued payoff function $F(x[\theta]) = (F_1(x[\theta]), \ldots, F_N(x[\theta]))$ Condition 1.2 of Chapter 2 is assumed to hold, i.e.,

Condition 1.1. The scalar functions $F_i(x)$, $i \in \mathbf{N}$, are continuous with respect to $x$.
In the game (1.7) player 1 chooses his strategy $U \in \mathcal{U}$ so as to obtain, by the time $t = \theta$ the game ends, the least possible values of all the components $F_i(x[\theta])$, $i \in \mathbf{N}$, while player 2 chooses $V \in \mathcal{V}$ so as to obtain the largest possible values of the components. In the game (1.7) both players choose their strategies $U$ and $V$, respectively, to achieve their goals. The situation $(U, V)$ generates from the initial position $(t_0, x_0)$ the quasimotion $x[\cdot, t_0, x_0, U, V]$ of the system (1.8). As a result, by the time $\theta$ the game ends we have a point $x[\theta, t_0, x_0, U, V]$ that dictates the value of the vector-valued payoff function $F(x[\theta, t_0, x_0, U, V])$ in the game (1.7). In the approach of [90] the vector $F(x[\theta])$ specifies the amounts to be paid to player 2 that have been taken from player 1. By direct application of the theory of multicriterial positional dynamic systems, the fundamentals of which were presented in Chapters 2-4, the optimal solutions of (1.7) can be formalized in different ways. We will discuss only four of these approaches under the assumption that Condition
Definition 1.1. The situation $(U^S, V^S) \in \mathcal{U} \times \mathcal{V}$ will be referred to as the Slater saddle point of the game (1.7) if
$$F(x[\theta, t_0, x_0, U^S]) \not> F(x[\theta, t_0, x_0, U^S, V^S]) \not> F(x[\theta, t_0, x_0, V^S]) \tag{1.9}$$
for all quasimotions $x[\cdot, t_0, x_0, U^S]$, $x[\cdot, t_0, x_0, U^S, V^S]$, and $x[\cdot, t_0, x_0, V^S]$.

In "coordinate" form (1.9) asserts that

(1) for any pairs of quasimotions $(x[\cdot, t_0, x_0, U^S, V^S], x[\cdot, t_0, x_0, U^S]) \in \mathcal{X}[t_0, x_0, U^S, V^S] \times \mathcal{X}[t_0, x_0, U^S]$ the following system of inequalities is nonsimultaneous:
$$F_i(x[\theta, t_0, x_0, U^S, V^S]) < F_i(x[\theta, t_0, x_0, U^S]), \quad i \in \mathbf{N}; \tag{1.10}$$

(2) for any pairs of quasimotions $(x[\cdot, t_0, x_0, V^S], x[\cdot, t_0, x_0, U^S, V^S]) \in \mathcal{X}[t_0, x_0, V^S] \times \mathcal{X}[t_0, x_0, U^S, V^S]$ the following system of inequalities is nonsimultaneous:
$$F_i(x[\theta, t_0, x_0, V^S]) < F_i(x[\theta, t_0, x_0, U^S, V^S]), \quad i \in \mathbf{N}.$$

The set of Slater saddle points is denoted $\mathcal{S}$.
Remark 1.1. Since, by Proposition 1.4 of Chapter 1, the bunches of quasimotions
$$\mathcal{X}[t_0, x_0, U, V^S] \subset \mathcal{X}[t_0, x_0, V^S], \quad \forall U \in \mathcal{U},$$
and
$$\mathcal{X}[t_0, x_0, U^S, V] \subset \mathcal{X}[t_0, x_0, U^S], \quad \forall V \in \mathcal{V}, \tag{1.11}$$
Condition (1) of Definition 1.1 may be reformulated as follows:

(1) for every strategy $V \in \mathcal{V}$ and any pairs of quasimotions $(x[\cdot, t_0, x_0, U^S, V], x[\cdot, t_0, x_0, U^S, V^S])$ the following system of inequalities is nonsimultaneous:
$$F_i(x[\theta, t_0, x_0, U^S, V]) > F_i(x[\theta, t_0, x_0, U^S, V^S]), \quad i \in \mathbf{N}. \tag{1.12}$$
Definition 1.1 (Chapter 3) leads to the next definition.

Definition 1.2. The situation $(U^P, V^P) \in \mathcal{U} \times \mathcal{V}$ is referred to as the Pareto saddle point of the game (1.7) if
$$F(x[\theta, t_0, x_0, U^P]) \not\geq F(x[\theta, t_0, x_0, U^P, V^P]) \not\geq F(x[\theta, t_0, x_0, V^P]) \tag{1.13}$$
for any 3-tuple of quasimotions $(x[\cdot, t_0, x_0, U^P], x[\cdot, t_0, x_0, U^P, V^P], x[\cdot, t_0, x_0, V^P])$.

In "scalar form" (1.13) may be represented, in light of Remark 1.1, in the following form:

(1) for every strategy $V \in \mathcal{V}$ and any pairs of quasimotions $(x[\cdot, t_0, x_0, U^P, V], x[\cdot, t_0, x_0, U^P, V^P])$ the following system of inequalities is nonsimultaneous:
$$F_i(x[\theta, t_0, x_0, U^P, V]) \ge F_i(x[\theta, t_0, x_0, U^P, V^P]), \quad i \in \mathbf{N}, \tag{1.14}$$
at least one of which is a strict inequality;

(2) for every strategy $U \in \mathcal{U}$ and any pairs of quasimotions $(x[\cdot, t_0, x_0, U^P, V^P], x[\cdot, t_0, x_0, U, V^P])$ the following system of inequalities is nonsimultaneous:
$$F_i(x[\theta, t_0, x_0, U^P, V^P]) \ge F_i(x[\theta, t_0, x_0, U, V^P]), \quad i \in \mathbf{N},$$
at least one of which is a strict inequality.

The set of Pareto saddle points of the game (1.7) will be denoted $\mathcal{P}$. From Definition 1.1 in Chapter 4 we have the following definition.

Definition 1.3. The situation $(U^G, V^G) \in \mathcal{U} \times \mathcal{V}$ is the Geoffrion saddle point of the game (1.7) if

(1) $(U^G, V^G)$ is a Pareto saddle point (Definition 1.2);

(2) there exists a positive number $M$ such that for any ordinal numbers $i \in \mathbf{N}$, strategies $V \in \mathcal{V}$, and pairs of quasimotions $(x[\cdot, t_0, x_0, U^G, V], x[\cdot, t_0, x_0, U^G, V^G])$ for which
$$F_i(x[\theta, t_0, x_0, U^G, V]) > F_i(x[\theta, t_0, x_0, U^G, V^G])$$
and some $j \in \mathbf{N}$ such that
$$F_j(x[\theta, t_0, x_0, U^G, V]) < F_j(x[\theta, t_0, x_0, U^G, V^G])$$
the following inequality holds:
$$F_i(x[\theta, t_0, x_0, U^G, V]) - F_i(x[\theta, t_0, x_0, U^G, V^G]) \le M\big[F_j(x[\theta, t_0, x_0, U^G, V^G]) - F_j(x[\theta, t_0, x_0, U^G, V])\big];$$

(3) for any $i \in \mathbf{N}$ and $U \in \mathcal{U}$ and pairs of quasimotions $(x[\cdot, t_0, x_0, U, V^G], x[\cdot, t_0, x_0, U^G, V^G])$ for which
$$F_i(x[\theta, t_0, x_0, U, V^G]) < F_i(x[\theta, t_0, x_0, U^G, V^G])$$
and some $j \in \mathbf{N}$ such that
$$F_j(x[\theta, t_0, x_0, U, V^G]) > F_j(x[\theta, t_0, x_0, U^G, V^G])$$
the following inequality holds:
$$F_i(x[\theta, t_0, x_0, U^G, V^G]) - F_i(x[\theta, t_0, x_0, U, V^G]) \le M\big[F_j(x[\theta, t_0, x_0, U, V^G]) - F_j(x[\theta, t_0, x_0, U^G, V^G])\big].$$
The set of Geoffrion saddle points is denoted $\mathcal{G}$. The discussion will be concerned only with constant $(N \times N)$-dimensional matrices $A$ with elements $a_{ij} > 0$, $i, j \in \mathbf{N}$. From Definition 3.1 in Chapter 4 we have the following definition.

Definition 1.4. The situation $(U^A, V^A) \in \mathcal{U} \times \mathcal{V}$ will be referred to as the A-saddle point of the game (1.7) if
$$F(x[\theta, t_0, x_0, U^A]) \not>_A F(x[\theta, t_0, x_0, U^A, V^A]) \not>_A F(x[\theta, t_0, x_0, V^A]) \tag{1.15}$$
for any quasimotions $x[\cdot, t_0, x_0, U^A]$, $x[\cdot, t_0, x_0, U^A, V^A]$, and $x[\cdot, t_0, x_0, V^A]$. In scalar form (1.15) has the form:

(1) for every strategy $V \in \mathcal{V}$ and any pairs of quasimotions $(x[\cdot, t_0, x_0, U^A, V], x[\cdot, t_0, x_0, U^A, V^A])$ the following system of inequalities is nonsimultaneous:
$$\sum_{j \in \mathbf{N}} a_{ij}F_j(x[\theta, t_0, x_0, U^A, V]) > \sum_{j \in \mathbf{N}} a_{ij}F_j(x[\theta, t_0, x_0, U^A, V^A]), \quad i \in \mathbf{N};$$

(2) for every strategy $U \in \mathcal{U}$ and any pairs of quasimotions $(x[\cdot, t_0, x_0, U, V^A], x[\cdot, t_0, x_0, U^A, V^A])$ the following inequalities are nonsimultaneous:
$$\sum_{j \in \mathbf{N}} a_{ij}F_j(x[\theta, t_0, x_0, U^A, V^A]) > \sum_{j \in \mathbf{N}} a_{ij}F_j(x[\theta, t_0, x_0, U, V^A]), \quad i \in \mathbf{N}.$$

The set of A-saddle points will be denoted $\mathcal{A}$. If in the course of the discussion Slater, Pareto, Geoffrion, and A-saddle points are required simultaneously, the term "vector-valued saddle points of the game (1.7)" will be used. Definitions 1.1-1.4 and Theorem 3.1 (Chapter 4) lead to the assertion

Property 1.1. The following chain of inclusions is true:
$$\mathcal{A} \subset \mathcal{G} \subset \mathcal{P} \subset \mathcal{S}, \tag{1.16}$$
that is, any A-saddle point is a Geoffrion saddle point with constant
$$M = \max_{i,j \in \mathbf{N}} \Big( a_{ij}^{-1} \sum_{l \in \mathbf{N}} a_{il} - 1 \Big);$$
any Geoffrion saddle point is a Pareto saddle point, and every Pareto saddle point is a Slater saddle point.
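The chain (1.16) can be observed in a finite game where player 1 (the minimizer) picks a row and player 2 (the maximizer) picks a column. The sketch below enumerates the Slater, Pareto, and A-saddle points directly from their scalar forms; the payoff table, the matrix of ones, and the function names are hypothetical, and Geoffrion saddle points are omitted since on a finite table they coincide with the Pareto ones.

```python
def slater_greater(G, F):
    return all(g > f for g, f in zip(G, F))


def pareto_greater(G, F):
    return all(g >= f for g, f in zip(G, F)) and any(g > f for g, f in zip(G, F))


def transform(A, F):
    return tuple(sum(a_ij * f_j for a_ij, f_j in zip(row, F)) for row in A)


def saddle_points(J, greater):
    """Pairs (u0, v0) such that no column v makes J[u0][v] greater than
    J[u0][v0] and no row u makes J[u0][v0] greater than J[u][v0]."""
    points = []
    for u0 in range(len(J)):
        for v0 in range(len(J[0])):
            center = J[u0][v0]
            if any(greater(J[u0][v], center) for v in range(len(J[0]))):
                continue
            if any(greater(center, J[u][v0]) for u in range(len(J))):
                continue
            points.append((u0, v0))
    return points


# Hypothetical 2 x 3 game with two criteria and a matrix A of ones.
J = [[(1.0, 2.0), (3.0, 3.0), (3.0, 2.0)],
     [(4.0, 4.0), (2.0, 5.0), (3.0, 0.0)]]
A = [[1.0, 1.0], [1.0, 1.0]]


def a_greater(G, F):
    return slater_greater(transform(A, G), transform(A, F))


S = saddle_points(J, slater_greater)
P = saddle_points(J, pareto_greater)
A_set = saddle_points(J, a_greater)
print(A_set, P, S)
```

In this table both inclusions are strict, which is the finite counterpart of the remark that the inclusions in (1.16) may be strict.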
Remark 2.2. It is obvious that "combined" saddle points of the game (1.7) may also be defined; in particular, a saddle point $(U^S, V^P)$ which is Slater "on the left" and Pareto "on the right" may be defined by requiring that
$$F(x[\theta, t_0, x_0, U^S]) \not> F(x[\theta, t_0, x_0, U^S, V^P]) \not\geq F(x[\theta, t_0, x_0, V^P])$$
for distinct 3-tuples of quasimotions $(x[\cdot, t_0, x_0, U^S], x[\cdot, t_0, x_0, U^S, V^P], x[\cdot, t_0, x_0, V^P])$. Such points will be excluded from our discussion; previous studies appear to have overlooked these types of saddle points.

Remark 2.2. The inclusions in (1.16) may be strict.
Example 1.1. Assume that in (1.7) the control system is described by the "separable" scalar equations
$$\dot{x} = f(t, x, u), \quad x[t_0] = x_0, \quad x \in \mathbb{R}^1,$$
$$\dot{y} = \varphi(t, y, v), \quad y[t_0] = y_0, \quad y \in \mathbb{R}^1, \tag{1.17}$$
where both scalar functions $f(\cdot)$ and $\varphi(\cdot)$ satisfy Condition 1.1 of Chapter 1. The sets of players' strategies are as follows:
$$\mathcal{U}^* = \{U \div u(t, x) \mid u(t, x) \in H \in \operatorname{comp} \mathbb{R}^h\}, \quad \mathcal{V}^* = \{V \div v(t, y) \mid v(t, y) \in Q \in \operatorname{comp} \mathbb{R}^q\};$$
moreover, the strategies $U$ and $V$ are assumed to be singletons, i.e., $x[\theta, t_0, x_0, U]$ and $y[\theta, t_0, y_0, V]$ are points, not sets, for all $U \in \mathcal{U}^*$ and $V \in \mathcal{V}^*$. The existence of sets of such strategies follows from Corollary 4.1 (Chapter 1). Finally, a 2-component payoff function $F(x[\theta], y[\theta])$ is specified by the formula
$$F(x[\theta], y[\theta]) = (F_1(x[\theta]), F_2(y[\theta])) = (x[\theta], y[\theta]).$$
Any situation $(\hat{U}, \hat{V}) \in \mathcal{U}^* \times \mathcal{V}^*$ may be shown to be a Slater saddle point. Indeed, for any $V \in \mathcal{V}^*$, the system of inequalities
$$x[\theta, t_0, x_0, \hat{U}] > x[\theta, t_0, x_0, \hat{U}], \quad y[\theta, t_0, y_0, V] > y[\theta, t_0, y_0, \hat{V}], \tag{1.18}$$
is inconsistent for all 3-tuples of quasimotions $(x[\cdot, t_0, x_0, \hat{U}], y[\cdot, t_0, y_0, V], y[\cdot, t_0, y_0, \hat{V}])$. This is true since the inequality
$$x[\theta, t_0, x_0, \hat{U}] > x[\theta, t_0, x_0, \hat{U}]$$
cannot hold for the same quasimotions $x[\cdot, t_0, x_0, \hat{U}]$. Note that, by the choice of the classes of strategies, the quasimotions $x[\cdot, t_0, x_0, U]$ and $y[\cdot, t_0, y_0, V]$ of every subsystem in (1.17) are obtained independently. In a similar way, for every strategy $U \in \mathcal{U}^*$ and any 3-tuples of quasimotions $(x[\cdot, t_0, x_0, U], x[\cdot, t_0, x_0, \hat{U}], y[\cdot, t_0, y_0, \hat{V}])$ the following system of inequalities is nonsimultaneous:
$$x[\theta, t_0, x_0, U] < x[\theta, t_0, x_0, \hat{U}], \quad y[\theta, t_0, y_0, \hat{V}] < y[\theta, t_0, y_0, \hat{V}], \tag{1.19}$$
since the following inequality does not hold (with identical $y[\cdot, t_0, y_0, \hat{V}]$):
$$y[\theta, t_0, y_0, \hat{V}] < y[\theta, t_0, y_0, \hat{V}].$$
If the systems (1.18) and (1.19) are nonsimultaneous for any $V \in \mathcal{V}^*$ and $U \in \mathcal{U}^*$, respectively, then, by Definition 1.1, every situation $(\hat{U}, \hat{V}) \in \mathcal{U}^* \times \mathcal{V}^*$ is a Slater saddle point. Indeed, in our example,
$$\mathcal{S} = \mathcal{U}^* \times \mathcal{V}^*. \tag{1.20}$$
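The separable structure of the example can be mimicked on a finite grid in which player 1 (the minimizer) directly chooses the terminal value $x[\theta]$ and player 2 (the maximizer) directly chooses $y[\theta]$. This is only a sketch: the grids of reachable values and the function names are hypothetical, and singleton strategies are identified with the terminal points they produce.

```python
def slater_greater(G, F):
    return all(g > f for g, f in zip(G, F))


def pareto_greater(G, F):
    return all(g >= f for g, f in zip(G, F)) and any(g > f for g, f in zip(G, F))


def payoff(x, y):
    """Two-component payoff F = (x[theta], y[theta])."""
    return (x, y)


def is_saddle(x_hat, y_hat, xs, ys, greater):
    center = payoff(x_hat, y_hat)
    # Player 2 deviates in y only; the first component stays equal to x_hat.
    if any(greater(payoff(x_hat, y), center) for y in ys):
        return False
    # Player 1 deviates in x only; the second component stays equal to y_hat.
    if any(greater(center, payoff(x, y_hat)) for x in xs):
        return False
    return True


xs = [0.0, 1.0, 2.0]   # hypothetical reachable terminal values of x
ys = [0.0, 1.0, 2.0]   # hypothetical reachable terminal values of y
pairs = [(x, y) for x in xs for y in ys]

slater = [p for p in pairs if is_saddle(*p, xs, ys, slater_greater)]
pareto = [p for p in pairs if is_saddle(*p, xs, ys, pareto_greater)]
print(len(slater), pareto)
```

Every pair passes the Slater test because a unilateral deviation leaves one component unchanged, whereas the Pareto test singles out the pair with the smallest $x$ and the largest $y$.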
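Let us now consider the "larger" sets of strategies

$$\mathcal{U} = \{U \div u(t, x) \mid u(t, x) \subseteq H\}, \qquad \mathcal{V} = \{V \div v(t, y) \mid v(t, y) \subseteq Q\}.$$

The following inclusions obviously hold:

$$\mathcal{U}^* \subset \mathcal{U}, \qquad \mathcal{V}^* \subset \mathcal{V}. \tag{1.21}$$

The reachability domains $X[\theta, t_0, x_0, U \div H]$ and $Y[\theta, t_0, y_0, V \div Q]$ of the two subsystems are, by Proposition 2.3 (Chapter 1), compacta in $\mathbb{R}^1$. Then there exist points $x^*$ and $y^*$ such that

$$x^* = \min_{x \in X[\theta, t_0, x_0, U \div H]} x, \tag{1.22}$$

$$y^* = \max_{y \in Y[\theta, t_0, y_0, V \div Q]} y. \tag{1.23}$$

Let us now derive strategies $U^* \in \mathcal{U}$ and $V^* \in \mathcal{V}$ such that

$$x^* = x[\theta, t_0, x_0, U^*], \qquad y^* = y[\theta, t_0, y_0, V^*] \tag{1.24}$$

for every quasimotion $x[\cdot, t_0, x_0, U^*]$ and $y[\cdot, t_0, y_0, V^*]$. By Corollary 4.1 in Chapter 1, such strategies $U^*$ and $V^*$ do exist. Because they are singletons, $U^* \in \mathcal{U}^*$ and $V^* \in \mathcal{V}^*$.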
Then for any $x \in X[\theta, t_0, x_0, U \div H]$ the following system of inequalities is nonsimultaneous:

$$x^* \ge x, \qquad y^* \ge y^* \tag{1.25}$$
at least one of which is a strict inequality. Indeed, by (1.22) the first inequality cannot be strict and the second one is an equality. By Corollary 4.1 in
Chapter 1, for every point $x \in X[\theta, t_0, x_0, U \div H]$ there is a strategy $U \in \mathcal{U}^*$ such that

$$x = x[\theta, t_0, x_0, U] \tag{1.26}$$

for all quasimotions $x[\cdot, t_0, x_0, U] \in \mathfrak{X}[t_0, x_0, U]$. Then the nonsimultaneity of (1.25), where at least one of the inequalities is strict, means that, by (1.24) and (1.26), for any strategy $U \in \mathcal{U}^*$ and all 3-tuples of quasimotions $(x[\cdot, t_0, x_0, U],\ x[\cdot, t_0, x_0, U^*],\ y[\cdot, t_0, y_0, V^*])$ the following inequalities are nonsimultaneous:

$$x[\theta, t_0, x_0, U^*] \ge x[\theta, t_0, x_0, U], \qquad y[\theta, t_0, y_0, V^*] \ge y[\theta, t_0, y_0, V^*],$$

at least one of which is a strict inequality. In a similar way, for any strategy $V \in \mathcal{V}^*$ and all 3-tuples of quasimotions $(x[\cdot, t_0, x_0, U^*],\ y[\cdot, t_0, y_0, V^*],\ y[\cdot, t_0, y_0, V])$ the following inequalities are nonsimultaneous:

$$x[\theta, t_0, x_0, U^*] \le x[\theta, t_0, x_0, U^*], \qquad y[\theta, t_0, y_0, V^*] \le y[\theta, t_0, y_0, V],$$
at least one of which is a strict inequality. This implies that, by Definition 1.2, the situation $(U^*, V^*) \in \mathcal{U}^* \times \mathcal{V}^*$ is a Pareto saddle point. Note that the set $\mathcal{P}$ of Pareto saddle points consists only of situations $(U^*, V^*) \in \mathcal{U}^* \times \mathcal{V}^*$ for which (1.22)-(1.24) hold. Indeed, for any other strategy $U \in \mathcal{U}^*$ such that

$$x[\theta, t_0, x_0, U] > x^*,$$

the strategy $U^* \in \mathcal{U}^*$ yields

$$x[\theta, t_0, x_0, U] > x^* = x[\theta, t_0, x_0, U^*].$$

Then the strategy $U$, in combination with any $V \in \mathcal{V}^*$, is not a Pareto saddle point, since there exists $U^* \in \mathcal{U}^*$ such that the following system of inequalities is simultaneous:

$$x[\theta, t_0, x_0, U] > x[\theta, t_0, x_0, U^*], \qquad y[\theta, t_0, y_0, V] \ge y[\theta, t_0, y_0, V],$$

the second of which turns into an equality. Consequently, the set $\mathcal{P}$ of Pareto saddle points consists only of situations $(U^*, V^*) \in \mathcal{U}^* \times \mathcal{V}^*$ for which (1.22)-(1.24) hold. Hence, by (1.20), here we have $\mathcal{P} \subset \mathcal{U}^* \times \mathcal{V}^* = \mathcal{S}$ and $\mathcal{P} \ne \mathcal{S}$.
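The gap between the two solution concepts in this example can be checked numerically on a finite stand-in for the reachability sets. The Python sketch below is only an illustration under stated assumptions: the grids `X` and `Y` replace the compact reachability domains, every singleton strategy is identified with the endpoint it produces, and all helper names (`payoff`, `is_saddle`, and so on) are hypothetical rather than taken from the text.

```python
from itertools import product

# Assumed finite samples of the reachability sets of the two subsystems.
X = [0.0, 0.5, 1.0]   # endpoints x[theta] available to the minimizing player
Y = [0.0, 0.5, 1.0]   # endpoints y[theta] available to the maximizing player

def payoff(x, y):
    """Two-component payoff F = (x[theta], y[theta]) of the example."""
    return (x, y)

def strictly_less(a, b):
    """a < b in every component (Slater-type comparison)."""
    return all(ai < bi for ai, bi in zip(a, b))

def less_not_equal(a, b):
    """a <= b in every component and a != b (Pareto-type comparison)."""
    return all(ai <= bi for ai, bi in zip(a, b)) and a != b

def is_saddle(x_hat, y_hat, improves):
    """True if no deviation of the minimizer lowers, and no deviation of the
    maximizer raises, the payoff in the sense of the comparison `improves`."""
    base = payoff(x_hat, y_hat)
    if any(improves(payoff(x, y_hat), base) for x in X):
        return False
    if any(improves(base, payoff(x_hat, y)) for y in Y):
        return False
    return True

slater = [p for p in product(X, Y) if is_saddle(*p, strictly_less)]
pareto = [p for p in product(X, Y) if is_saddle(*p, less_not_equal)]
print(len(slater), pareto)
```

On this grid every pair passes the Slater test, while only the pair (smallest x, largest y) passes the Pareto test, which mirrors the conclusion that the set of Pareto saddle points is a proper subset of the set of Slater saddle points.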
Property 1.2. With $N = \{1\}$, any Slater saddle point of (1.7) is the saddle point of an antagonistic positional differential game with scalar payoff function

$$\langle \{1, 2\},\ \Sigma,\ \{\mathcal{U}, \mathcal{V}\},\ F_1(x[\theta]) \rangle. \tag{1.27}$$
Indeed, let the situation $(U^S, V^S) \in \mathcal{U} \times \mathcal{V}$ be the Slater saddle point of (1.27) with $(t_0, x_0)$. Consequently, by Definition 1.1,

(1) for any quasimotions $(x[\cdot, t_0, x_0, U^S],\ x[\cdot, t_0, x_0, U^S, V^S])$ the inequality

$$F_1(x[\theta, t_0, x_0, U^S]) > F_1(x[\theta, t_0, x_0, U^S, V^S])$$

cannot hold, which is equivalent to the inequality

$$F_1(x[\theta, t_0, x_0, U^S]) \le F_1(x[\theta, t_0, x_0, U^S, V^S]) \tag{1.28}$$

for all $x[\cdot, t_0, x_0, U^S, V^S]$ and $x[\cdot, t_0, x_0, U^S]$;

(2) for any quasimotions $x[\cdot, t_0, x_0, U^S, V^S]$ and $x[\cdot, t_0, x_0, V^S]$, the inequality

$$F_1(x[\theta, t_0, x_0, U^S, V^S]) > F_1(x[\theta, t_0, x_0, V^S])$$

cannot hold, which is equivalent to the inequality

$$F_1(x[\theta, t_0, x_0, U^S, V^S]) \le F_1(x[\theta, t_0, x_0, V^S]). \tag{1.29}$$

Inequalities (1.28) and (1.29) assert, by Property 3.5 in Chapter 1, that the situation $(U^S, V^S)$ is the saddle point of the game (1.27).
By virtue of the inclusions (1.16), this is also a property of Pareto, Geoffrion, and A-saddle points. Property 1.2 implies that Definitions 1.1-1.4 are sufficiently exhaustive and encompass, as a special case, the notion of the saddle point of an antagonistic differential game with scalar payoff function.

1.3. Geometric Interpretation

Before proceeding to a geometric interpretation of vector-valued saddle points, we have to define a property that characterizes the structure of the set

$$\mathbb{F}(X[\theta, t_0, x_0, U^S, V^S]) = \bigcup_{x \in X[\theta, t_0, x_0, U^S, V^S]} \mathbb{F}(x)$$

for the fixed Slater saddle point $(U^S, V^S)$, where

$$X[\theta, t_0, x_0, U^S, V^S] = \bigcup_{x[\cdot]} x[\theta, t_0, x_0, U^S, V^S].$$
Property 1.3. For any vectors

$$F^{(j)} \in \mathbb{F}(X[\theta, t_0, x_0, U^S, V^S]), \quad j = 1, 2, \tag{1.30}$$

it is true that

$$F^{(1)} \not> F^{(2)}. \tag{1.31}$$
Indeed, from the inclusion (1.30) it follows that there exist two quasimotions $x^{(j)}[\cdot, t_0, x_0, U^S, V^S]$, $j = 1, 2$, such that

$$F^{(j)} = \mathbb{F}(x^{(j)}[\theta, t_0, x_0, U^S, V^S]), \quad j = 1, 2. \tag{1.32}$$

By the inclusion (Proposition 2.4, Chapter 1)

$$\mathfrak{X}[t_0, x_0, U^S, V^S] \subseteq \mathfrak{X}[t_0, x_0, U^S],$$

it follows from (1.9) that

$$\mathbb{F}(x^{(1)}[\theta, t_0, x_0, U^S, V^S]) \not> \mathbb{F}(x^{(2)}[\theta, t_0, x_0, U^S, V^S])$$

or, in light of (1.32), $F^{(1)} \not> F^{(2)}$. Consequently, for a fixed Slater saddle point $(U^S, V^S)$ the set $\mathbb{F}(X[\theta, t_0, x_0, U^S, V^S])$ is internally stable (in the sense of the relation $>$). Internal stability, but in the sense of $\ge$, is established in a similar way for the set $\mathbb{F}(X[\theta, t_0, x_0, U^P, V^P])$ and, consequently, for $\mathbb{F}(X[\theta, t_0, x_0, U^G, V^G])$ with fixed Pareto (Geoffrion) saddle point; it may also be established, in the sense of $>_A$, for the set $\mathbb{F}(X[\theta, t_0, x_0, U^A, V^A])$ for every A-saddle point.

As for the geometric interpretation of the Slater saddle point in (1.7), with $N = \{1, 2\}$ and a two-component payoff function we have

$$\mathbb{F}(x[\theta]) = (F_1(x[\theta]), F_2(x[\theta])).$$

Let us assume that

$$\mathbb{R}^2_< = \{(F_1, F_2) \mid F_i < 0,\ i = 1, 2\}, \qquad \mathbb{R}^2_> = \{(F_1, F_2) \mid F_i > 0,\ i = 1, 2\}$$

are angles with vertex at $(0, 0)$ in the criterial space $\{F_1, F_2\}$ with "perforated" sides. We consider a specific Slater saddle point $(U^S, V^S) \in \mathcal{S}$ and add $\mathbb{R}^2_>$ to every point of the set $\mathbb{F}(X[\theta, t_0, x_0, U^S, V^S])$. We end up with the shaded area in Fig. 5.1.2. Since, by Property 1.3, the set $\mathbb{F}(X[\theta, t_0, x_0, U^S, V^S])$ is internally stable in the sense of the relation $>$, one possible position of this set in the plane
Figure 5.1.2.
$\{F_1, F_2\}$ is shown as a heavy line in Fig. 5.1.2. Then the condition

$$\mathbb{F}(x[\theta, t_0, x_0, U^S]) \not> \mathbb{F}(x[\theta, t_0, x_0, U^S, V^S])$$

implies that, for every pair of quasimotions $(x[\cdot, t_0, x_0, U^S],\ x[\cdot, t_0, x_0, U^S, V^S])$, the points of the set

$$\mathbb{F}(X[\theta, t_0, x_0, U^S]) = \bigcup_{x \in X[\theta, t_0, x_0, U^S]} \mathbb{F}(x)$$

never penetrate the area $\mathbb{F}(X[\theta, t_0, x_0, U^S, V^S]) + \mathbb{R}^2_>$; that is, the shift $\mathbb{F}(X[\theta, t_0, x_0, U^S, V^S]) + \mathbb{R}^2_>$ of the positive angle $\mathbb{R}^2_>$ does not contain points from the set $\mathbb{F}(X[\theta, t_0, x_0, U^S])$. The boundary of $\mathbb{F}(X[\theta, t_0, x_0, U^S, V^S]) + \mathbb{R}^2_>$, denoted by the double line, can be reached by points from the set $\mathbb{F}(X[\theta, t_0, x_0, U^S])$, for instance, at $K$. Note that from Proposition 2.4 in Chapter 1 it follows that

$$\mathbb{F}(X[\theta, t_0, x_0, U^S, V^S]) \subseteq \mathbb{F}(X[\theta, t_0, x_0, U^S]),$$
i.e., the points of the set $\mathbb{F}(X[\theta, t_0, x_0, U^S, V^S])$ shown in Fig. 5.1.2 by the heavy line remain inside the set $\mathbb{F}(X[\theta, t_0, x_0, U^S])$. Similarly, the condition

$$\mathbb{F}(x[\theta, t_0, x_0, U^S, V^S]) \not> \mathbb{F}(x[\theta, t_0, x_0, V^S])$$

for every pair of quasimotions $x[\cdot, t_0, x_0, U^S, V^S]$, $x[\cdot, t_0, x_0, V^S]$ is equivalent to requiring that

$$\mathbb{F}(X[\theta, t_0, x_0, V^S]) \cap \big(\mathbb{F}(X[\theta, t_0, x_0, U^S, V^S]) + \mathbb{R}^2_<\big) = \emptyset$$

(cf. Fig. 5.1.3). Unlike the Slater saddle point, for the Pareto saddle point $(U^P, V^P) \in \mathcal{P}$, for instance, the points of the set $\mathbb{F}(X[\theta, t_0, x_0, U^P])$ not only never penetrate $\mathbb{F}(X[\theta, t_0, x_0, U^P, V^P]) + \mathbb{R}^2_>$ but cannot even lie on the boundary of that set other than at the points of $\mathbb{F}(X[\theta, t_0, x_0, U^P, V^P])$, or points of type $K$ (cf. Fig. 5.1.2). Finally, for Geoffrion saddle points $(U^G, V^G) \in \mathcal{G}$ the sets $\mathbb{F}(X[\theta, t_0, x_0, U^G])$
Figure 5.1.3
and $\mathbb{F}(X[\theta, t_0, x_0, V^G])$ do not have points in common, other than $\mathbb{F}(X[\theta, t_0, x_0, U^G, V^G])$, with some sets additional to $\mathbb{F}(X[\theta, t_0, x_0, U^G, V^G]) + \mathbb{R}^2_>$ and $\mathbb{F}(X[\theta, t_0, x_0, U^G, V^G]) + \mathbb{R}^2_<$, respectively, either (Fig. 4.1.1). In Fig. 5.1.3 this "additional" set stays within the double dash-and-dot line. A-saddle points may be interpreted geometrically in the same way, though now the angle $\alpha$ (Fig. 5.1.3) is fixed: it is defined by the matrix $A$ and is the same for all A-saddle points. In the case of a Geoffrion saddle point this angle varies for every saddle point $(U^G, V^G) \in \mathcal{G}$ and is governed by the existence of the constant $M$ from Definition 1.3.
Remark 2.3. A geometric interpretation clarifies the "guaranteed" nature of a vector-valued saddle point. What, specifically, is guaranteed by, e.g., the Slater saddle point $(U^S, V^S)$? The maximizing second player uses his own strategy $V^S$ from the saddle point $(U^S, V^S)$. Whatever strategy $U \in \mathcal{U}$ is chosen by the first player, the set of values of the vector-valued payoff function $\mathbb{F}(X[\theta, t_0, x_0, U, V^S])$ in the actual situation $(U, V^S)$ "lands" (Fig. 5.1.2) in the region $\mathbb{F}(X[\theta, t_0, x_0, V^S])$ (since $X[\theta, t_0, x_0, U, V^S] \subseteq X[\theta, t_0, x_0, V^S]$ for all $U \in \mathcal{U}$). Consequently, none of the points of the set $\mathbb{F}(X[\theta, t_0, x_0, U, V^S])$ lie inside $\mathbb{F}(X[\theta, t_0, x_0, U^S, V^S]) + \mathbb{R}^2_<$. Speaking in game-theoretic terms, this implies that by using $V^S$ the maximizing player ensures that the components of the payoff function $\mathbb{F}(x[\theta])$ cannot all simultaneously fall below the corresponding components of $\mathbb{F}(x[\theta, t_0, x_0, U^S, V^S])$ for every quasimotion $x[\cdot, t_0, x_0, U^S, V^S]$, or

$$\mathbb{F}(x[\theta, t_0, x_0, U, V^S]) \not< \mathbb{F}(x[\theta, t_0, x_0, U^S, V^S])$$

for any strategy $U \in \mathcal{U}$ and quasimotions $x[\cdot, t_0, x_0, U, V^S]$, $x[\cdot, t_0, x_0, U^S, V^S]$. In other words, any one of the vectors $\mathbb{F}(x[\theta, t_0, x_0, U^S, V^S])$ is the guaranteed amount the maximizing player assures for himself by using $V^S$, whatever the behavior of the other player. The use of $U^S$ by the minimizing player has a similar guaranteeing sense. In this case, for any choice of $V \in \mathcal{V}$,

$$\mathbb{F}(x[\theta, t_0, x_0, U^S, V]) \not> \mathbb{F}(x[\theta, t_0, x_0, U^S, V^S])$$

for all quasimotions $x[\cdot, t_0, x_0, U^S, V]$, $x[\cdot, t_0, x_0, U^S, V^S]$. Consequently, for the minimizing player $U^S$ ensures values of all components $\mathbb{F}(x[\theta, t_0, x_0, U^S, V])$, $\forall V \in \mathcal{V}$, that do not simultaneously exceed

$$\mathbb{F}(x[\theta, t_0, x_0, U^S, V^S]), \quad \forall x[\cdot, t_0, x_0, U^S, V^S].$$
2. Properties of Saddle Points

2.1. Existence of A-Saddle Point

The existence of the four types of saddle points referred to in Definitions 1.1-1.4 is established by the following theorem.
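Theorem 2.1. If Condition 1.1 and (3.2) from Chapter 1 hold and the functions $F_i(x)$, $i \in N$, are continuous, then, for any initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ and constant $(N \times N)$-dimensional matrix $A$ with positive elements, there exists in the differential game (1.7) an A-saddle point $(U^A, V^A)$, or $\mathcal{A} \ne \emptyset$, and, by the inclusions (1.16), none of the sets $\mathcal{G}$, $\mathcal{P}$, and $\mathcal{S}$ is empty.

Proof. Consider an auxiliary antagonistic differential positional game with scalar payoff function

$$\Big\langle \{1, 2\},\ \Sigma,\ \{\mathcal{U}, \mathcal{V}\},\ \sum_{i, j \in N} \beta_i a_{ij} F_j(x[\theta]) \Big\rangle, \tag{2.1}$$

where the collection of nonnegative numbers $\beta_i$ satisfies the condition $\sum_{i \in N} \beta_i = 1$. Since $F_j(x)$, $j \in N$, is continuous, so is $\sum_{i, j \in N} \beta_i a_{ij} F_j(x)$. Moreover, for the system $\Sigma$ of (1.8), Condition 1.1 and (3.2) from Chapter 1 are fulfilled in the game (2.1). Theorem 3.2 (Chapter 1) leads to the conclusion that in (2.1), for any choice of initial position $(t_0, x_0)$, there is a saddle point $(U^A, V^A) \in \mathcal{U} \times \mathcal{V}$, or, by Property 3.5 (Chapter 1), the inequalities

$$\sum_{i, j \in N} \beta_i a_{ij} F_j(x[\theta, t_0, x_0, U^A]) \le \sum_{i, j \in N} \beta_i a_{ij} F_j(x[\theta, t_0, x_0, U^A, V^A]) \le \sum_{i, j \in N} \beta_i a_{ij} F_j(x[\theta, t_0, x_0, V^A]) \tag{2.2}$$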
are true for any quasimotions $x[\cdot, t_0, x_0, U^A]$, $x[\cdot, t_0, x_0, V^A]$, and $x[\cdot, t_0, x_0, U^A, V^A]$ of the system (1.8) generated from position $(t_0, x_0)$ by the strategies $U^A$ and $V^A$ and the situation $(U^A, V^A)$, respectively. Let us prove that the saddle point $(U^A, V^A)$ of (2.2) in the game (2.1) is the A-saddle point in the game (1.7) with the same initial position $(t_0, x_0)$. We assume the contrary. This means that either there are two quasimotions $x^*[\cdot, t_0, x_0, U^A]$ and $x^*[\cdot, t_0, x_0, U^A, V^A]$ such that

$$\sum_{j \in N} a_{ij} F_j(x^*[\theta, t_0, x_0, U^A]) > \sum_{j \in N} a_{ij} F_j(x^*[\theta, t_0, x_0, U^A, V^A]), \quad i \in N, \tag{2.3}$$
or that there is a pair of quasimotions $(\tilde x[\cdot, t_0, x_0, U^A, V^A],\ \tilde x[\cdot, t_0, x_0, V^A])$ for which

$$\sum_{j \in N} a_{ij} F_j(\tilde x[\theta, t_0, x_0, U^A, V^A]) > \sum_{j \in N} a_{ij} F_j(\tilde x[\theta, t_0, x_0, V^A]), \quad i \in N, \tag{2.4}$$

or that (2.3) and (2.4) are simultaneously true. If (2.3) is true, then, multiplying every $i$-th inequality by $\beta_i$ from the payoff function of (2.1) and summing over $i \in N$, we have the inequality

$$\sum_{i, j \in N} \beta_i a_{ij} F_j(x^*[\theta, t_0, x_0, U^A]) > \sum_{i, j \in N} \beta_i a_{ij} F_j(x^*[\theta, t_0, x_0, U^A, V^A]),$$

which is incompatible with the left inequality of (2.2) with $x[\cdot] = x^*[\cdot]$. Incompatibility of (2.4) may be established in a similar way. This incompatibility proves the theorem, whence $\mathcal{A} \ne \emptyset$. But then (1.16) leads to $\mathcal{G} \ne \emptyset$, $\mathcal{P} \ne \emptyset$, and $\mathcal{S} \ne \emptyset$.

The requirement that the sets $H$ and $Q$ be bounded and closed will be shown to be essential for Theorem 2.1.

Example 2.1. This example will demonstrate that if the sets $H$ and $Q$ are unbounded, Slater saddle points do not necessarily exist in a differential game. Let us consider a linear-quadratic differential antagonistic game with vector-valued payoff function

$$\Gamma^L = \langle \{1, 2\},\ \Sigma^L,\ \{\mathcal{U}^L, \mathcal{V}^L\},\ I(U, V) \rangle.$$

The control system $\Sigma^L$ is described by the linear differential equations

$$\dot x = A(t)x + B_1(t)u + B_2(t)v, \qquad x(t_0) = x_0,$$

where $x \in \mathbb{R}^n$; $u, v \in \mathbb{R}^n$; the elements of the $(n \times n)$-dimensional matrices $A(t)$ and $B_i(t)$, $i = 1, 2$, are continuous over $[0, \theta]$; and the constants satisfy $\theta > t_0 \ge 0$. The set of strategies of the first (minimizing) player is $\mathcal{U}^L = \{U \div H(t)x \mid$ the elements of the matrices $H(t)$ are continuous over $[0, \theta]\}$, and $\mathcal{V}^L = \{V \div Q(t)x \mid$ the elements of the matrices $Q(t)$ are continuous over $[0, \theta]\}$. By the quasimotions $x(\cdot, t_0, x_0, U, V)$ of the system $\Sigma^L$ generated by the situation $(U, V) \div (H(t)x, Q(t)x)$ we will understand the common solutions $x(t) = x(t, t_0, x_0, U, V)$, $t_0 \le t \le \theta$, of this system for $u = H(t)x$, $v = Q(t)x$ and
initial condition $x(t_0) = x_0$. Henceforth, the payoff function will contain the notation $u[t] = H(t)x(t)$ and $v[t] = Q(t)x(t)$. Let the vector-valued payoff function be

$$I(U, V) = (J_1(U, V), \ldots, J_N(U, V)),$$

where

$$J_i(U, V) = x'(\theta) C_i x(\theta) + \int_{t_0}^{\theta} \{x'(t) G_i x(t) + u'[t] D_i^{(1)} u[t] + v'[t] D_i^{(2)} v[t]\}\, dt.$$

All the $(n \times n)$-dimensional matrices used here are symmetric and constant; the prime denotes transposition. Since the solution $x(t) = x(t, t_0, x_0, U, V)$, $t_0 \le t \le \theta$, generated by every specific situation $(U, V) \in \mathcal{U}^L \times \mathcal{V}^L$ is unique, the Slater saddle point $(U^S, V^S) \in \mathcal{U}^L \times \mathcal{V}^L$ of the differential game $\Gamma^L$ is defined by the conditions

$$I(U^S, V) \not> I(U^S, V^S) \not> I(U, V^S)$$

for every $V \in \mathcal{V}^L$ and $U \in \mathcal{U}^L$. Hence in $\Gamma^L$ there will not exist any Slater saddle point if, for every situation $(U, V) \in \mathcal{U}^L \times \mathcal{V}^L$, there exists either its own strategy $\tilde U \in \mathcal{U}^L$ such that

(1) $J_i(\tilde U, V) < J_i(U, V)$, $i \in N$,

or a strategy $\tilde V \in \mathcal{V}^L$ for which

(2) $J_i(U, V) < J_i(U, \tilde V)$, $i \in N$,

or both systems of inequalities hold simultaneously.

This property and Propositions 4.3 and 4.4 in Chapter 1 lead to two sufficient conditions, under either of which there exist no Slater saddle points in $\Gamma^L$. Here $D > 0$ ($D < 0$) means that the quadratic form $z'Dz$ is positive- (negative-) definite.

Condition I. If $D_i^{(2)} > 0$, $i \in N$, there is no Slater saddle point in the differential game $\Gamma^L$ for any $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$.

Condition II. If $D_i^{(1)} < 0$, $i \in N$, there is no Slater saddle point in $\Gamma^L$ for any $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$.

Let us establish Condition I; Condition II may be proved in a similar way. Let $(U, V) \div (H(t)x, Q(t)x)$ be an arbitrary situation of the set $\mathcal{U}^L \times \mathcal{V}^L$, $x(t)$ a
solution of $\Sigma^L$ with $u = H(t)x$ and $v = Q(t)x$, and $J_i(U, V)$ the value of the $i$-th component of the payoff function $I(U, V)$. We consider the functional

$$J_i(U, \tilde V) = \tilde x'(\theta) C_i \tilde x(\theta) + \int_{t_0}^{\theta} \{\beta^2 \tilde x'(t) D_i^{(2)} \tilde x(t) + \tilde x'(t)[G_i + H'(t) D_i^{(1)} H(t)]\tilde x(t)\}\, dt,$$

which is defined for the solutions $\tilde x(t)$, $t_0 \le t \le \theta$, of the system

$$\dot{\tilde x} = [A(t) + B_1(t)H(t) + \beta B_2(t)]\tilde x, \qquad \tilde x(t_0) = x_0,$$

obtained for the strategy $\tilde V \div \beta x$, $\beta = \text{const}$. Since $D_i^{(2)} > 0$, $i \in N$, by Proposition 4.3 (Chapter 1) for the strategy $V \div Q(t)x$ there exists a constant $\beta_i(V) > 0$ such that for $\beta \ge \beta_i(V)$ and $\tilde V \div \beta x$ the following strict inequality holds:

$$J_i(U, \tilde V) > J_i(U, V).$$

Therefore, for the constant strategy $\tilde V \div \beta^* x$ with $\beta^* \ge \max_{i \in N} \beta_i(V)$ all $N$ strict inequalities of (2) hold, which means that there is no Slater saddle point in $\Gamma^L$. Then it follows from the inclusions (1.16) that for $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$, and $D_i^{(1)} < 0$ or $D_i^{(2)} > 0$, $i \in N$, there exist no Pareto or Geoffrion or A-saddle points in $\Gamma^L$ (these should be defined in the same way as the Slater saddle point is defined for $\Gamma^L$).

It should be reemphasized that for the differential game of this example the strategies $U \div u(t, x)$ and $V \div v(t, x)$ are defined in such a way that $\|u(t, x)\|$ and $\|v(t, x)\|$ may assume arbitrarily large values. This follows, first, from the linearity of the functions $u(t, x) = H(t)x$ and $v(t, x) = Q(t)x$ with respect to $x$, and, second, from the fact that the elements of the continuous matrices $H(t)$ and $Q(t)$ may assume any values, including arbitrarily large ones. It is because $\|u(t, x)\|$ and $\|v(t, x)\|$ are unbounded that there are no vector-valued saddle points in $\Gamma^L$.
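The mechanism behind Condition I can be seen in a scalar special case. The sketch below is a hedged illustration, not the general argument: it assumes $n = 1$, $N = 2$, $t_0 = 0$, constant gains $u = hx$ and $v = \beta x$, and arbitrarily chosen coefficients (all parameter names and values are hypothetical). With $x(t) = x_0 e^{kt}$ and $k = a + b_1 h + b_2 \beta$, each payoff component has a closed form, and both components keep growing as the maximizer's gain $\beta$ increases.

```python
import math

def payoff_components(beta, h=-1.0, a=0.0, b1=1.0, b2=-1.0,
                      x0=1.0, theta=1.0,
                      c=(1.0, 0.5), g=(1.0, 2.0),
                      d1=(1.0, 1.0), d2=(1.0, 0.5)):
    """J_i(U, V~) for the scalar system x' = (a + b1*h + b2*beta) * x on
    [0, theta], with u = h*x fixed and the deviating strategy v = beta*x.
    The weights d2 are positive, as required by Condition I."""
    k = a + b1 * h + b2 * beta                 # closed-loop growth rate
    x_end_sq = (x0 * math.exp(k * theta)) ** 2
    # Integral of x(t)^2 over [0, theta] for x(t) = x0 * exp(k * t).
    if k == 0.0:
        int_x_sq = x0 ** 2 * theta
    else:
        int_x_sq = x0 ** 2 * math.expm1(2.0 * k * theta) / (2.0 * k)
    return tuple(
        ci * x_end_sq + (gi + d1i * h ** 2 + d2i * beta ** 2) * int_x_sq
        for ci, gi, d1i, d2i in zip(c, g, d1, d2)
    )

for beta in (0.0, 10.0, 100.0, 1000.0):
    print(beta, payoff_components(beta))
```

In this parametrization the closed loop is stable, so the state decays, yet the term $\beta^2 d_2 \int x^2\,dt$ still grows roughly linearly in $\beta$. Whatever gain the maximizer currently uses, a larger one raises every component, so no situation can be a Slater saddle point.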
Example 2.2. In this example the fact that $H$ or $Q$ is not closed will be shown to make it possible for a differential game to lack Slater saddle points, even though $H$ and $Q$ are both bounded sets.
Consider the game

$$\Gamma^* = \langle \{1, 2\},\ \Sigma^*,\ \{\mathcal{U}, \mathcal{V}\},\ \mathbb{F}(x[\theta], y[\theta]) \rangle,$$

where the control system $\Sigma^*$ is described by the two scalar differential equations

$$\dot x = u, \quad \dot y = v, \quad x[0] = y[0] = 0, \quad t \in [t_0, \theta], \quad \theta = 1, \quad t_0 = 0,$$

the sets of strategies are

$$\mathcal{U} = \{U \div u(t, x) \mid u(t, x) \subseteq [0, 1]\}, \qquad \mathcal{V} = \{V \div v(t, x) \mid v(t, x) \subseteq [0, 1)\}.$$

Consequently, the set $Q$ is not closed, since it is the closed interval $[0, 1]$ with the point 1 excised; the two-component payoff function is

$$\mathbb{F}(x[\theta], y[\theta]) = (F_1(x[\theta], y[\theta]),\ F_2(x[\theta], y[\theta])) = (x[\theta] + y[\theta],\ y[\theta] - x[\theta]).$$

If it were true that $H = Q = [0, 1]$, the Slater saddle point would be the situation $(U^S, V^S) \div (u^*, 1)$, where $u^*$ is any number from the closed interval $[0, 1]$. Indeed, then $x[\theta] = u^*$, $y[\theta] = 1$ and, for any quasimotion $x[\cdot, t_0, x_0, U]$ and every $U \in \mathcal{U}$, the following system of inequalities would be nonsimultaneous:

$$1 + x[\theta, t_0, x_0, U] < 1 + u^*, \qquad 1 - x[\theta, t_0, x_0, U] < 1 - u^*,$$

and, for any quasimotions $y[\cdot, t_0, y_0, V]$ and every $V \in \mathcal{V}^* = \{V \div v(t, x, y) \mid v(t, x, y) \subseteq [0, 1]\}$, the following inequalities would be nonsimultaneous:

$$1 + u^* < y[\theta, t_0, y_0, V] + u^*, \qquad 1 - u^* < y[\theta, t_0, y_0, V] - u^*,$$

since $y[\theta, t_0, y_0, V]$ cannot exceed 1. In reality, however, the set $Q = [0, 1)$ (the point $v = 1$ having been excised), so for every $V \in \mathcal{V}$ there exists a strategy $\tilde V \div \tilde v \in [0, 1)$, $\tilde v = \text{const}$, such that

$$y[\theta, t_0, y_0, V] < y[\theta, t_0, y_0, \tilde V]$$

for every quasimotion $y[\cdot, t_0, y_0, V]$. Hence, whatever the situation $(U, V) \in \mathcal{U} \times \mathcal{V}$, there exists a strategy $\tilde V \div \tilde v$ for which the following system of inequalities is simultaneous:

$$y[\theta, t_0, y_0, V] + x[\theta, t_0, x_0, U] < y[\theta, t_0, y_0, \tilde V] + x[\theta, t_0, x_0, U],$$

$$y[\theta, t_0, y_0, V] - x[\theta, t_0, x_0, U] < y[\theta, t_0, y_0, \tilde V] - x[\theta, t_0, x_0, U]$$
for any pairs of quasimotions $(x[\cdot, t_0, x_0, U],\ y[\cdot, t_0, y_0, V])$. The latter system of inequalities implies that there is no Slater saddle point in the game $\Gamma^*$ of Example 2.2. From the chain of inclusions (1.16) it also follows that in this game there are no Pareto or Geoffrion or A-saddle points. The important thing is that the absence of vector-valued saddle points for $\Gamma^*$ is due to the fact that $Q$ is not closed.
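The role of the excised endpoint can be illustrated with constant controls. The sketch below is a simplified check under stated assumptions: both players use constant controls $u$ and $v$, so that $x[\theta] = u$ and $y[\theta] = v$ for $\theta = 1$, $t_0 = 0$; the function names and the midpoint rule for the improving reply are hypothetical choices, not part of the text.

```python
def payoff(x_end, y_end):
    """F = (x[theta] + y[theta], y[theta] - x[theta]) from Example 2.2."""
    return (x_end + y_end, y_end - x_end)

def better_reply(v):
    """A constant control strictly between v and the excluded endpoint 1."""
    return (v + 1.0) / 2.0

def strictly_less(a, b):
    """a < b in every component."""
    return all(ai < bi for ai, bi in zip(a, b))

def has_improving_deviation(u, v):
    """True if the maximizer can raise both payoff components at once."""
    v_tilde = better_reply(v)
    assert 0.0 <= v_tilde < 1.0      # still an admissible value in [0, 1)
    return strictly_less(payoff(u, v), payoff(u, v_tilde))

grid_u = [0.0, 0.25, 0.5, 1.0]
grid_v = [0.0, 0.5, 0.9, 0.999]
print(all(has_improving_deviation(u, v) for u in grid_u for v in grid_v))
```

Because the midpoint between any admissible $v$ and 1 is again admissible, the maximizer always has a strictly better reply. With the closed interval $[0, 1]$ the value $v = 1$ would have no such reply, which is what restores the saddle point.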
2.2. Dynamic Stability
Before proceeding to a discussion of the dynamic stability of vector-valued saddle points let us consider the properties of inheritance and rejection.
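Property 2.1 (inheritance). Let us assume that

(1) the situation $(U^K, V^K) \in \mathcal{U} \times \mathcal{V}$ is a Slater (or Pareto or Geoffrion or A-) saddle point in the game (1.7) with initial position $(t_0, x_0)$;

(2) $(U^K, V^K) \in \tilde{\mathcal{U}} \times \tilde{\mathcal{V}}$; (2.5)

(3) $\tilde{\mathcal{U}} \subset \mathcal{U}$, $\tilde{\mathcal{V}} \subset \mathcal{V}$. (2.6)

Then $(U^K, V^K)$ is the Slater (or Pareto or Geoffrion or A-) saddle point in the game

$$\tilde\Gamma = \langle \{1, 2\},\ \Sigma,\ \{\tilde{\mathcal{U}}, \tilde{\mathcal{V}}\},\ \mathbb{F}(x[\theta]) \rangle$$

with the same initial position $(t_0, x_0)$.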
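The game $\tilde\Gamma$ differs from (1.7) only in terms of the sets of feasible strategies of the players, which are subsets of the "initial" sets $\mathcal{U}$ and $\mathcal{V}$, respectively.

Proof. It follows from the inclusions (2.6) that

$$X[\theta, t_0, x_0, \tilde{\mathcal{U}} \times \tilde{\mathcal{V}}] = \bigcup_{(U, V) \in \tilde{\mathcal{U}} \times \tilde{\mathcal{V}}} X[\theta, t_0, x_0, U, V] \subseteq X[\theta, t_0, x_0, \mathcal{U} \times \mathcal{V}] = \bigcup_{(U, V) \in \mathcal{U} \times \mathcal{V}} X[\theta, t_0, x_0, U, V] \tag{2.7}$$

and, from (2.5),

$$X[\theta, t_0, x_0, U^K, V^K] \subseteq X[\theta, t_0, x_0, \tilde{\mathcal{U}} \times \tilde{\mathcal{V}}].$$

Therefore, the relations of Definitions 1.1-1.4 that hold for points $x \in X[\theta, t_0, x_0, \mathcal{U} \times \mathcal{V}]$ also hold for the "narrower" set $X[\theta, t_0, x_0, \tilde{\mathcal{U}} \times \tilde{\mathcal{V}}]$ of (2.7), which includes the "optimal" quasimotions.

Property 2.2 (rejection). If the situation $(\bar U, \bar V) \in \mathcal{U} \times \mathcal{V}$ is not a Slater (or Pareto or Geoffrion or A-) saddle point of the game (1.7) with initial position $(t_0, x_0)$, then for the game

$$\langle \{1, 2\},\ \Sigma,\ \{\mathcal{U} \setminus \bar U,\ \mathcal{V} \setminus \bar V\},\ \mathbb{F}(x[\theta]) \rangle \tag{2.8}$$

with initial position $(t_0, x_0)$ the same set of associated saddle points exists as in (1.7).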
In other words, the sets $\mathcal{A}$, $\mathcal{G}$, $\mathcal{P}$, and $\mathcal{S}$ for (1.7) and (2.8) coincide if the situation $(\bar U, \bar V)$ does not belong to $\mathcal{A}$, or to $\mathcal{G}$, or to $\mathcal{P}$, or to $\mathcal{S}$, respectively. The rejection property follows immediately from the definitions of the saddle points, since what is rejected in these definitions are those ends of the quasimotions $x[\theta, t_0, x_0, \bar U, \bar V]$ that have certainly been generated away from the saddle point.

Property 2.3 (dynamic stability). Any Slater (or Pareto or Geoffrion or A-) saddle point of the game (1.7) with initial position $(t_0, x_0)$ is dynamically stable; that is, if the situation $(U^K, V^K)$ is a Slater (or Pareto or Geoffrion or A-) saddle point in (1.7) with $(t_0, x_0)$, then $(U^K, V^K)$ remains a saddle point of the same type in (1.7) with the "current" initial position $(t, x[t, t_0, x_0, U^K, V^K])$ at any $t \in [t_0, \theta]$ for any quasimotion $x[\cdot, t_0, x_0, U^K, V^K]$ of the system (1.8) generated by the situation $(U^K, V^K)$ from position $(t_0, x_0)$.
Proof. The proof follows that of Property 1.5 in Chapter 2, though we will employ an analog of the inheritance property implicitly established in proving Property 2.1. Specifically, let $(U^K, V^K)$ be a Slater (or Pareto or Geoffrion or A-) saddle point in the game (1.7) with initial position $(t_0, x_0)$, and let $x[\cdot, t_0, x_0, U^K, V^K]$ be some quasimotion of the system (1.8) generated from the initial position $(t_0, x_0)$ by the situation $(U^K, V^K)$. As time $t$ increases from $t_0$ to $\theta$, the reachability domain of the system (1.8) from the current position $(t, x[t, t_0, x_0, U^K, V^K])$ narrows:

$$X[\theta, t, x[t, t_0, x_0, U^K, V^K], \mathcal{U} \times \mathcal{V}] \subseteq X[\theta, t_0, x_0, \mathcal{U} \times \mathcal{V}]$$

in the notation of (2.7). Besides, by Proposition 2.5 (Chapter 1),

$$X[\theta, t, x[t, t_0, x_0, U^K, V^K], U^K] \subseteq X[\theta, t_0, x_0, U^K]$$

and

$$X[\theta, t, x[t, t_0, x_0, U^K, V^K], V^K] \subseteq X[\theta, t_0, x_0, V^K].$$

Then, if the relations (inequalities) of Definitions 1.1-1.4 hold for the points $x[\theta]$ of the sets $X[\theta, t_0, x_0, \mathcal{U} \times \mathcal{V}]$ and if

$$X[\theta, t, x[t, t_0, x_0, U^K, V^K], U^K, V^K] \subseteq X[\theta, t_0, x_0, U^K, V^K],$$

these relations (inequalities) will also hold for the "narrower" sets $X[\theta, t, x[t, t_0, x_0, U^K, V^K], \mathcal{U} \times \mathcal{V}]$ and $X[\theta, t, x[t, t_0, x_0, U^K, V^K], U^K, V^K]$. Some inequalities are simply "rejected" for points $x[\theta]$ that are not included in $X[\theta, t, x[t, t_0, x_0, U^K, V^K], U^K, V^K]$, while for the "optimal" points $x^*[\theta]$ associated with the saddle point $(U^K, V^K)$, it is true that, on the one hand,

$$x^*[\theta] \in X[\theta, t, x[t, t_0, x_0, U^K, V^K], U^K, V^K]$$

and, on the other,

$$X[\theta, t, x[t, t_0, x_0, U^K, V^K], U^K, V^K] \subseteq X[\theta, t_0, x_0, U^K, V^K].$$

Figure 5.2.1.
2.3. Compactness

In the state space the set

$$\mathcal{X}^S = \bigcup_{(U^S, V^S) \in \mathcal{S}} X[\theta, t_0, x_0, U^S, V^S] \tag{2.9}$$

is the set of right (at $t = \theta$) ends of quasimotions $x[t, t_0, x_0, U^S, V^S]$, $t_0 \le t \le \theta$, generated from the initial position $(t_0, x_0)$ when all the Slater saddle points $(U^S, V^S)$ of $\mathcal{S}$ are scanned. Moreover, $\mathcal{X}^S \subseteq X[\theta, t_0, x_0, U \div H, V \div Q]$, where the latter set is the reachability domain of the system (1.8) at time $t = \theta$. For every point $x^S \in \mathcal{X}^S$ we construct a situation $(\hat U^S, \hat V^S) \in \mathcal{U} \times \mathcal{V}$ such that

$$x^S = x[\theta, t_0, x_0, \hat U^S, \hat V^S] \tag{2.10}$$

for any quasimotion $x[\cdot, t_0, x_0, \hat U^S, \hat V^S]$ of the system (1.8) generated by the situation $(\hat U^S, \hat V^S)$ from the initial position $(t_0, x_0)$. The existence of such $(\hat U^S, \hat V^S)$ follows from Corollary 4.1 (Chapter 1); they will henceforth be referred to as "singletons," and the set of such situations is denoted $\hat{\mathcal{S}}$.
Lemma 2.1. If the situation $(U^S, V^S) \in \mathcal{U} \times \mathcal{V}$ is the Slater saddle point of the game (1.7) with initial position $(t_0, x_0)$, there exist singleton situations $(\hat U^S, \hat V^S)$ (defined by (2.10), in which $x^S$ "scans" the entire set $X[\theta, t_0, x_0, U^S, V^S]$) such that $(\hat U^S, \hat V^S)$ are the Slater saddle points of (1.7) for the same initial position.

Proof. If $(U^S, V^S)$ is a Slater saddle point, then, by Definition 1.1, for all quasimotions $x[\cdot, t_0, x_0, U^S]$, $x[\cdot, t_0, x_0, U^S, V^S]$ and $x[\cdot, t_0, x_0, V^S]$, relations (1.9) hold and, by (2.5), it is true, in particular, that

$$\mathbb{F}(x[\theta, t_0, x_0, U^S]) \not> \mathbb{F}(x[\theta, t_0, x_0, \hat U^S, \hat V^S]) \not> \mathbb{F}(x[\theta, t_0, x_0, V^S]). \tag{2.11}$$

The singleton strategies derived from Corollary 4.1 (Chapter 1) are so structured that their choice makes the following inclusions hold:

$$\mathfrak{X}[t_0, x_0, \hat U^S] \subseteq \mathfrak{X}[t_0, x_0, U^S], \qquad \mathfrak{X}[t_0, x_0, \hat V^S] \subseteq \mathfrak{X}[t_0, x_0, V^S].$$

But then it follows from (2.11) that

$$\mathbb{F}(x[\theta, t_0, x_0, \hat U^S]) \not> \mathbb{F}(x[\theta, t_0, x_0, \hat U^S, \hat V^S]) \not> \mathbb{F}(x[\theta, t_0, x_0, \hat V^S]),$$

which means that the situation $(\hat U^S, \hat V^S)$ is the Slater saddle point of the game (1.7).
The compactness of the set $\mathcal{X}^S$ of (2.9) will be established for games such as (1.7) but with "separating dynamics." Specifically, the conflict-controlled system $\Sigma^0$ is assumed to be described by a system of two "separable" differential equations

$$\dot x = f(t, x, u), \quad x[t_0] = x_0, \qquad \dot y = \varphi(t, y, v), \quad y[t_0] = y_0, \tag{2.12}$$

where $x \in \mathbb{R}^n$, $y \in \mathbb{R}^k$, and the functions $f(\cdot)$ and $\varphi(\cdot)$ satisfy Condition 1.1 from Chapter 1, or are continuous, locally Lipschitz with respect to their own state variables, and

$$\|f\| \le \gamma(1 + \|x\|), \qquad \|\varphi\| \le \gamma(1 + \|y\|), \qquad \gamma = \text{const} > 0.$$
The initial position $(t_0, x_0)$ is fixed and the vector-valued payoff function

$$\mathbb{F}(x[\theta], y[\theta]) = (F_1(x[\theta], y[\theta]), \ldots, F_N(x[\theta], y[\theta])) = \{F_i(x[\theta], y[\theta]),\ i \in N\}$$

is defined in such a way that the functions $F_i(x, y)$, $i \in N$, are continuous. As before, the sets of players' strategies are

$$\mathcal{U} = \{U \div u(t, x, y) \mid u(t, x, y) \subseteq H \in \operatorname{comp} \mathbb{R}^h\}, \qquad \mathcal{V} = \{V \div v(t, x, y) \mid v(t, x, y) \subseteq Q \in \operatorname{comp} \mathbb{R}^q\}.$$

We will also be using subsets of strategies,

$$\mathcal{U}^0 = \{U \div u(t, x) \mid u(t, x) \subseteq H\}, \qquad \mathcal{V}^0 = \{V \div v(t, y) \mid v(t, y) \subseteq Q\}. \tag{2.13}$$

Note that $\mathcal{U}^0 \subset \mathcal{U}$ and $\mathcal{V}^0 \subset \mathcal{V}$. The quasimotions of the system (2.12) generated by $U$ or $V$ or the situation $(U, V)$ are determined in section 2.3 of Chapter 1. In effect, what we have is a differential positional game with vector-valued payoff function

$$\langle \{1, 2\},\ \Sigma^0,\ \{\mathcal{U}, \mathcal{V}\},\ \mathbb{F}(x[\theta], y[\theta]) \rangle \tag{2.14}$$

and fixed initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$.

Let $\mathcal{S}$ be the set of Slater saddle points $(U^S, V^S)$ of the game (2.14). As in the case of (2.9), we consider the set

$$\mathcal{X}^S = \bigcup_{(U, V) \in \mathcal{S}} (x[\theta, t_0, x_0, U],\ y[\theta, t_0, y_0, V]) \tag{2.15}$$
which is formed by the right ends of the quasimotions $(x[\cdot, t_0, x_0, U^S],\ y[\cdot, t_0, y_0, V^S])$ as the situations $(U^S, V^S)$ scan the entire set of Slater saddle points of the game (2.14). In addition, for either subsystem of (2.12) we obtain the reachability domain

$$X = X[\theta, t_0, x_0, U \div H] = \bigcup_{U \in \mathcal{U}^0} x[\theta, t_0, x_0, U], \qquad Y = Y[\theta, t_0, y_0, V \div Q] = \bigcup_{V \in \mathcal{V}^0} y[\theta, t_0, y_0, V]$$

and examine the static zero-sum game with vector-valued payoff function

$$\langle \{1, 2\},\ \{X, Y\},\ \mathbb{F}(x, y) \rangle. \tag{2.16}$$

$\mathcal{X}^\sigma$ will denote the set of Slater saddle points $(x^\sigma, y^\sigma)$ of (2.16), or, for all $(x, y) \in X \times Y$ and all $(x^\sigma, y^\sigma) \in \mathcal{X}^\sigma$,

$$\mathbb{F}(x^\sigma, y) \not> \mathbb{F}(x^\sigma, y^\sigma) \not> \mathbb{F}(x, y^\sigma). \tag{2.17}$$
The structure of the saddle points $(U^S, V^S)$ of (2.14) is defined in such a way that the sets $\mathcal{X}^S$ and $\mathcal{X}^\sigma$ coincide. Specifically,

Proposition 2.1. For the game (2.14) with initial position $(t_0, x_0, y_0)$,

$$\mathcal{X}^S = \mathcal{X}^\sigma. \tag{2.18}$$
Proof. Let us establish the inclusion

$$\mathcal{X}^S \subseteq \mathcal{X}^\sigma. \tag{2.19}$$

We assume the contrary, i.e., that there exists a Slater saddle point $(U^S, V^S)$ of (2.14) that generates the quasimotion $(\bar x[\cdot, t_0, x_0, U^S],\ \bar y[\cdot, t_0, y_0, V^S])$ such that

$$(\bar x[\theta, t_0, x_0, U^S],\ \bar y[\theta, t_0, y_0, V^S]) \notin \mathcal{X}^\sigma. \tag{2.20}$$

Using Lemma 2.1, we design a singleton situation $(\hat U^S, \hat V^S)$ that, first, is the Slater saddle point of (2.14) and, second, realizes the equalities

$$x^S = \bar x[\theta, t_0, x_0, U^S] = x[\theta, t_0, x_0, \hat U^S], \qquad y^S = \bar y[\theta, t_0, y_0, V^S] = y[\theta, t_0, y_0, \hat V^S] \tag{2.21}$$

for all quasimotions $x[\cdot, t_0, x_0, \hat U^S]$ and $y[\cdot, t_0, y_0, \hat V^S]$.
2. Properties of Saddle Points
It follows from (2.20) that there exists a point x* E X (or y* such that either
E
197
Y or both)
w*, Y S ) < w, YS)
or
w, Y S ) < ws,Y*)
or both strict inequalities are true simultaneously. Specifically, let us assume the first inequality is true. By Corollary 4.1 (Chapter 1) and the fact that x* EX = x[e, to, xo, U
+ HI,
(2.22)
there exists a strategy U*E%’ of (2.13) that generates quasimotions x[ ., to,xo, U * ] of the first subsystem in (2.12) such that x* = X[O, to, xo, U*]
(2.23)
for any quasimotions x[ * , to, xo, U*]. But then, in light of (2.21)-(2.23), the first inequality of (2.22) may be rewritten thus:
w e , to,x0, ~ * 1Y ,C ~to,, yo, psi) < w e 7to, x0, fiS1,Y C ~to,, yo, psi). (2.24) But this contradicts the fact that (6’, ps)is the Slater saddle point of (2.14), which establishes the inclusion (2.19). The converse inclusion may also be proved by contradiction. Let us assume that Slater saddle point (x’, y”) of (2.16) exists and that (x”, y”) 4 2s.
(2.25)
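For the situation $(x^\sigma, y^\sigma)$ we obtain a pair of strategies $(U^\sigma, V^\sigma) \in \mathcal{U}^0 \times \mathcal{V}^0$ of (2.13) such that for all quasimotions $x[\cdot, t_0, x_0, U^\sigma]$ and $y[\cdot, t_0, y_0, V^\sigma]$ of the system (2.12),

$$x^\sigma = x[\theta, t_0, x_0, U^\sigma], \qquad y^\sigma = y[\theta, t_0, y_0, V^\sigma]. \tag{2.26}$$

The existence of $(U^\sigma, V^\sigma)$ follows from Corollary 4.1 (Chapter 1). By virtue of (2.25) and Lemma 2.1, the situation $(U^\sigma, V^\sigma)$ is not the Slater saddle point of (2.14) with $(t_0, x_0, y_0)$. This implies that there is a strategy $\tilde U \in \mathcal{U}$ (or $\tilde V \in \mathcal{V}$ or both) such that, by (2.26), either

$$\mathbb{F}(\tilde x[\theta, t_0, x_0, \tilde U],\ y[\theta, t_0, y_0, V^\sigma]) < \mathbb{F}(x[\theta, t_0, x_0, U^\sigma],\ y[\theta, t_0, y_0, V^\sigma])$$

or

$$\mathbb{F}(x[\theta, t_0, x_0, U^\sigma],\ \tilde y[\theta, t_0, y_0, \tilde V]) > \mathbb{F}(x[\theta, t_0, x_0, U^\sigma],\ y[\theta, t_0, y_0, V^\sigma]) \tag{2.27}$$

or both inequalities are true simultaneously for at least one quasimotion $\tilde x[\cdot, t_0, x_0, \tilde U]$ (respectively, $\tilde y[\cdot, t_0, y_0, \tilde V]$) and all quasimotions $x[\cdot, t_0, x_0, U^\sigma]$, $y[\cdot, t_0, y_0, V^\sigma]$. Specifically, let us assume that, for instance, (2.27) holds. But

$$\tilde y = \tilde y[\theta, t_0, y_0, \tilde V] \in Y = Y[\theta, t_0, y_0, V \div Q],$$

where $Y$ is the reachability domain of the second subsystem of (2.12). Then, by (2.26), inequality (2.27) can be represented as

$$\mathbb{F}(x^\sigma, \tilde y) > \mathbb{F}(x^\sigma, y^\sigma)$$

for $\tilde y \in Y$, which is incompatible with the definition (2.17) of the Slater saddle point of (2.16). This contradiction proves the inclusion $\mathcal{X}^\sigma \subseteq \mathcal{X}^S$. Equality (2.18) follows from (2.19) and the fact that

$$\mathcal{X}^\sigma \subseteq \mathcal{X}^S.$$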
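Since the proposition reduces the description of saddle endpoints to the static game (2.16), the defining condition (2.17) can be tested directly once the reachability domains are replaced by finite samples. The following Python sketch is an illustration only: the grids `X` and `Y` and the payoff `F` are made-up stand-ins, and `slater_saddle_points` is a hypothetical helper, not a routine from the text.

```python
from itertools import product

def strictly_less(a, b):
    """a < b in every component."""
    return all(ai < bi for ai, bi in zip(a, b))

def slater_saddle_points(X, Y, F):
    """All (xs, ys) in X x Y such that no x in X gives F(x, ys) < F(xs, ys)
    and no y in Y gives F(xs, ys) < F(xs, y), in the sense of (2.17)."""
    result = []
    for xs, ys in product(X, Y):
        base = F(xs, ys)
        min_can_improve = any(strictly_less(F(x, ys), base) for x in X)
        max_can_improve = any(strictly_less(base, F(xs, y)) for y in Y)
        if not (min_can_improve or max_can_improve):
            result.append((xs, ys))
    return result

def F(x, y):
    """Assumed two-component payoff that couples the two endpoints."""
    return (x * y, x + y)

X = [0, 1, 2]   # sampled endpoints of the first subsystem
Y = [0, 1, 2]   # sampled endpoints of the second subsystem
print(slater_saddle_points(X, Y, F))
```

For this payoff only the pairs with x = 0 survive: any positive x lets either the minimizer or the maximizer improve both components at once. A brute-force scan of this kind costs on the order of |X|·|Y|·(|X| + |Y|) payoff evaluations, so it is meant for small illustrative grids.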
Proposition 2.2. If the sets $X$ and $Y$ are compacta and the components $F_i(x, y)$, $i \in N$, of the vector-valued function $\mathbb{F}(x, y)$ are continuous over $X \times Y$, the set of Slater saddle points $(x^\sigma, y^\sigma)$ (defined by (2.17)) of the game

$$\Gamma_\sigma = \langle X,\ Y,\ \mathbb{F}(x, y) \rangle \tag{2.28}$$

forms a compact subset in $X \times Y$, and the set of values of the vector-valued payoff function $\mathbb{F}(x, y)$ on the set of these saddle points is also a compactum in

$$\mathbb{F}(X, Y) = \bigcup_{(x, y) \in X \times Y} \mathbb{F}(x, y).$$
Proof. The proof proceeds by contradiction. Let us assume that a sequence $(x^{(k)}, y^{(k)})$ $(k = 0, 1, \ldots)$ of Slater saddle points of (2.28) has been found that converges (since $X \times Y$ is compact) to the point $(x^*, y^*)$, which is not the Slater saddle point of this game. This means that there is $\tilde x \in X$ (or $\tilde y \in Y$ or both) such that either

$$\mathbb{F}(\tilde x, y^*) < \mathbb{F}(x^*, y^*) \quad \text{or} \quad \mathbb{F}(x^*, y^*) < \mathbb{F}(x^*, \tilde y) \tag{2.29}$$

or both inequalities hold simultaneously. We assume that the first of these inequalities holds. Since $\mathbb{F}(x, y)$ is continuous, there are neighborhoods $T(\tilde x, y^*)$ and $T(x^*, y^*)$ of the points $(\tilde x, y^*)$ and $(x^*, y^*)$ such that for any

$$(\hat x, \hat y) \in T(\tilde x, y^*), \qquad (\bar x, \bar y) \in T(x^*, y^*) \tag{2.30}$$

the following vector-valued inequality holds:

$$\mathbb{F}(\hat x, \hat y) < \mathbb{F}(\bar x, \bar y).$$

Let us now find an index $\hat k$ so large that

$$(\tilde x, y^{(\hat k)}) \in T(\tilde x, y^*), \qquad (x^{(\hat k)}, y^{(\hat k)}) \in T(x^*, y^*),$$

which is feasible since $x^{(k)} \to x^*$ and $y^{(k)} \to y^*$ as $k \to \infty$. Then, by (2.30),

$$\mathbb{F}(\tilde x, y^{(\hat k)}) < \mathbb{F}(x^{(\hat k)}, y^{(\hat k)}),$$

which is incompatible with (2.17) for the Slater saddle point $(x^{(\hat k)}, y^{(\hat k)})$. This contradiction proves that the set of Slater saddle points $(x^\sigma, y^\sigma)$ of the game (2.28) is closed and hence, being a subset of the compactum $X \times Y$, compact. From the continuity of $\mathbb{F}(x, y)$ and the compactness of the set of Slater saddle points it follows that the set of values of the vector-valued payoff function obtained on the set of Slater saddle points of the game (2.28) is a compactum.

Proposition 2.3. The set $\mathcal{X}^S$ of (2.15) is closed and bounded (compact) in $X \times Y$, and the set of values of the vector-valued payoff function

$$\mathbb{F}(\mathcal{X}^S) = \bigcup_{(x, y) \in \mathcal{X}^S} \mathbb{F}(x, y)$$

is a compactum in

$$\mathbb{F}(X, Y) = \bigcup_{(x, y) \in X \times Y} \mathbb{F}(x, y).$$
Recall that the set $\mathcal{X}^S$ is formed by the right (at $t = \theta$) ends of the quasimotions $x[\cdot, t_0, x_0, U^S]$, $y[\cdot, t_0, y_0, V^S]$ as the situations $(U^S, V^S)$ scan the entire set of Slater saddle points of the game (2.14) with initial position $(t_0, x_0, y_0)$.

Indeed, from Condition 1.1 of Chapter 2, it follows that the reachability domains

$$X = X[\theta, t_0, x_0, U \div H], \qquad Y = Y[\theta, t_0, y_0, V \div Q]$$

of the subsystems of (2.12) are compact. Then in the game (2.16) the set $\mathcal{X}^\sigma$ of Slater saddle points (2.17) is (since $\mathbb{F}(x, y)$ is continuous) a compact subset of $X \times Y$ (which follows from Proposition 2.2). But the sets $\mathcal{X}^S$ and $\mathcal{X}^\sigma$ coincide by Proposition 2.1. Therefore, the set $\mathcal{X}^S$ is also a compactum.

Remark 2.2.
By analogy to (2.15), for the game we introduce three sets
2'
=
u u u
(U,V)EP
rZG =
(XCO,
( U , v)Eg
8"=
fie,
( x c ~to, , x0, UI,
to, yo, V I ) ,
to, x0, ui, yre, to, yo,
(xre, to, x0, ~
VI,
1Y C, ~ to, , yo, VI)
(2.31)
(U.VW
which are formed by the right ends of quasimotions x [ . , to, y o , V"],y [ . , to, yo, V"] as the situation (V", V") scans the entire set X ( X = 9, 3, d; K = P, G, A). From the inclusions (2.16) it follows that
8"c 9 c 8' The sets 8" and example in [ S S ] .
2'
c
8s.
(2.32)
are not necessarily closed, which follows from an
Example 2.3. The reachability domain $X[\theta, t_0, x_0, V \div Q]$ of problem (1.1, Chapter 2) with initial position $(t_0, x_0)$ is assumed to be a circular cone in $\mathbb{R}^3$ whose base is a unit circle in the plane $x_1 O x_2$ and whose vertex is the point $L = (0, 1, 1)$ (Fig. 5.2.2). If the vector-valued payoff function is $F(x[\theta]) = x[\theta] = (x_1[\theta], x_2[\theta], x_3[\theta])$, all points of the arc $DCB$ (with the point $B$ punched out) are effective. The point $B$ is not effective, since $L = (0, 1, 1) \geq (0, 1, 0) = B$. Here the set of effective points is not closed, the point $B$ having been punched out. If the vector-valued payoff function is two-component, $F(x[\theta]) = (x_1[\theta], x_2[\theta])$, the set of properly effective solutions is the arc $DCB$ of a circumference without the endpoints $D$ and $B$, so this set is not closed either. But the set $\mathcal{X}^A$ is closed (compact), which follows from the next assertion.

Proposition 2.4. For any constant matrix $A$ with positive elements the set $\mathcal{X}^A$ of (2.31) in the game (2.14) is a compact subset of $X \times Y$, while the set

$$F(\mathcal{X}^A) = \bigcup_{(x, y) \in \mathcal{X}^A} F(x, y)$$

is a compactum in the criterial space.

Figure 5.2.2.

The proof follows immediately from Proposition 2.3 since the A-saddle point of (2.14) is, by Definition 1.4, a Slater saddle point of the game

$$\langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, A F(x[\theta], y[\theta]) \rangle, \tag{2.33}$$

which differs from (2.14) only in the form of the vector-valued payoff function: in (2.14) it is $F(x[\theta], y[\theta])$ and in (2.33), $A F(x[\theta], y[\theta])$. Therefore, the set $\mathcal{X}^A$ of (2.31) is a compactum and, since $F(x, y)$ is continuous, the set $F(\mathcal{X}^A)$ is also a compactum in the criterial space.
Remark 2.2. It follows from Example 2.1 that the set $\mathcal{X}^G$ generated by all Geoffrion saddle points is not necessarily closed or, consequently, compact. But it follows from (2.32) that in $\mathcal{X}^G$ there exist compact subsets $\mathcal{X}^A$ formed by A-saddle points (for any constant matrix $A$ with positive coefficients).
3. Invariance of Vector-Valued Saddle Points

3.1. Affine Transformations

Assume that Condition 1.1 (Chapter 1) is satisfied for the conflict-controlled system $\Sigma$ described by equation (1.8) and that Condition 1.1 of this chapter is satisfied for the vector-valued payoff function. Let us consider the game (1.7) and see how the sets of vector-valued saddle points introduced by Definitions 1.1-1.4 change under transformations of the vector-valued payoff function.
Definition 3.1. A zero-sum differential game with vector-valued payoff function

$$\tilde\Gamma = \langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, \tilde F(x[\theta]) \rangle$$

is affinely equivalent to the differential game (1.7)

$$\Gamma = \langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, F(x[\theta]) \rangle$$

with the same initial position $(t_0, x_0)$ if

$$\tilde F_i(x[\theta]) = k_i F_i(x[\theta]) + a_i, \quad i \in N, \tag{3.1}$$

for the same quasimotions $x[\cdot]$ of the system (1.8), where $k_i = \text{const} > 0$, $a_i = \text{const}$, $i \in N$. In the notation of the general theory of games [89] this affine equivalence of differential games will be denoted as $\tilde\Gamma \sim \Gamma$.

Let us introduce the vectors $k = (k_1, \ldots, k_N)$, $a = (a_1, \ldots, a_N)$, and the operation of coordinate-wise multiplication $k \circ F = (k_1 F_1, \ldots, k_N F_N)$. Then (3.1) can be represented as $\tilde F(x[\theta]) = k \circ F(x[\theta]) + a$.
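Lemma 3.1. Affine equivalence of antagonistic differential games with vector-valued payoff function is an equivalence relation, that is:

(1) $\Gamma \sim \Gamma$ (reflexivity);
(2) if $\tilde\Gamma \sim \Gamma$, then $\Gamma \sim \tilde\Gamma$ (symmetry);
(3) if $\Gamma^{(1)} \sim \Gamma^{(2)}$ and $\Gamma^{(2)} \sim \Gamma^{(3)}$, then $\Gamma^{(1)} \sim \Gamma^{(3)}$ (transitivity).

To prove (1), it is sufficient to set $k_i = 1$ and $a_i = 0$, $i \in N$, in (3.1). In proving (2) note that it follows from (3.1) that

$$F_i(x[\theta]) = k_i^{-1} \tilde F_i(x[\theta]) - a_i k_i^{-1}, \quad i \in N,$$

while $k_i^{-1} = \text{const} > 0$ and $-a_i k_i^{-1} = \text{const}$ for every $i \in N$. To prove (3), we represent the relation $\Gamma^{(3)} \sim \Gamma^{(2)}$ in the form

$$F_i^{(3)}(x[\theta]) = k_i^{(2)} F_i^{(2)}(x[\theta]) + a_i^{(2)}, \quad i \in N,$$

where $k_i^{(2)} = \text{const} > 0$, $a_i^{(2)} = \text{const}$. Then, in light of (3.1) (where in the game $\Gamma = \Gamma^{(1)}$ we assume that $\tilde F = F^{(2)}$, $k_i = k_i^{(1)}$, and $a_i = a_i^{(1)}$),

$$F_i^{(3)}(x[\theta]) = k_i^{(2)} k_i^{(1)} F_i^{(1)}(x[\theta]) + k_i^{(2)} a_i^{(1)} + a_i^{(2)}, \quad i \in N,$$

where $k_i^{(2)} k_i^{(1)} = \text{const} > 0$ and $k_i^{(2)} a_i^{(1)} + a_i^{(2)} = \text{const}$. The latter equalities imply, by Definition 3.1, the relation $\Gamma^{(3)} \sim \Gamma^{(1)}$.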
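The algebra behind Lemma 3.1 can be checked mechanically for a single payoff component. The following is only an illustrative sketch: each affine change $F \mapsto kF + a$ is stored as a pair `(k, a)`, and the helper names `compose` and `inverse` as well as the numerical values are assumptions made for the example, not notation from the text.

```python
from fractions import Fraction

def compose(outer, inner):
    """Compose two affine maps F -> k*F + a, each stored as a pair (k, a)."""
    k2, a2 = outer
    k1, a1 = inner
    # outer(inner(F)) = k2*(k1*F + a1) + a2
    return (k2 * k1, k2 * a1 + a2)

def inverse(affine):
    """Inverse of F -> k*F + a; it exists because k > 0."""
    k, a = affine
    return (1 / k, -a / k)

identity = (Fraction(1), Fraction(0))   # reflexivity: k = 1, a = 0
m12 = (Fraction(2), Fraction(-3))       # hypothetical (k^(1), a^(1))
m23 = (Fraction(1, 4), Fraction(5))     # hypothetical (k^(2), a^(2))

m13 = compose(m23, m12)                 # transitivity: (k2*k1, k2*a1 + a2)
back = compose(inverse(m12), m12)       # symmetry: the inverse undoes the map
```

Exact rational arithmetic is used so that the comparison with the identity map is not disturbed by rounding.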
Proposition 3.1. Affinely equivalent differential games have the same vector-valued saddle points; that is, the sets of Slater, Geoffrion, Pareto, and A-saddle points coincide in affinely equivalent games.

The invariance of the sets of Slater, Pareto, Geoffrion, and A-saddle points for the differential games $\Gamma$ and $\tilde\Gamma$ follows from the fact that the nonsimultaneity of a system of inequalities "does not deteriorate" if every $i$-th inequality of this system is multiplied by a positive number $k_i$ and an arbitrary number $a_i$ is added to both sides. The set of Geoffrion saddle points is also invariant, but in the game $\tilde\Gamma$ the positive constant

$$\tilde M = M \max_{i, j \in N} \frac{k_j}{k_i} \tag{3.2}$$

has to be used. Indeed, under the substitution $\tilde F_i = k_i F_i + a_i$, $i \in N$, every difference of values of the $i$-th criterion in the relations of Definition 1.3 is multiplied by $k_i$, so that, assuming (3.2) is true, these relations keep their form for the game $\tilde\Gamma$ with $M$ replaced by $\tilde M$.
Remark 3.1. The differential game (1.7) with initial position $(t_0, x_0)$ can always be transformed into a game where all the components of the vector-valued payoff function are positive and the sets of vector-valued saddle points of the two differential games coincide. Indeed, $X[\theta, t_0, x_0, U \div H, V \div Q]$, the reachability domain of the system (1.8) from the initial position $(t_0, x_0)$, is a compactum in $\mathbb{R}^n$ by Proposition 2.3 in Chapter 1. Then on this compactum each continuous function $|F_i(x)|$, $i \in N$, assumes its maximal value,

$$L_i = \max_{x \in X[\theta, t_0, x_0, U \div H, V \div Q]} |F_i(x)|.$$

We introduce the number

$$L = \max_{i \in N} L_i$$

and subject the differential game (1.7) to the affine transformation

$$\tilde F_i(x[\theta]) = F_i(x[\theta]) + L + 1.$$

In keeping with the choice of the constant $L$, the function $\tilde F_i(x) > 0$, $\forall x \in X[\theta, t_0, x_0, U \div H, V \div Q]$, and, by Proposition 3.1, in the affinely equivalent differential games (1.7) and

$$\langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, \tilde F(x[\theta]) = (\tilde F_1(x[\theta]), \ldots, \tilde F_N(x[\theta])) \rangle$$

the sets of vector-valued saddle points coincide. It is also obvious that under any permutation of the components of the vector-valued payoff function $F(x[\theta])$, the sets of vector-valued saddle points coincide.
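The shift of Remark 3.1 can be illustrated on a finite sample of payoff values. This is a minimal sketch under stated assumptions: the reachability domain is replaced by three hypothetical end points whose payoff vectors are listed directly, and `shift_to_positive` is an illustrative helper name.

```python
def strictly_greater(a, b):
    """Vector inequality a > b: strict in every component."""
    return all(x > y for x, y in zip(a, b))

def shift_to_positive(values):
    """Apply F_i -> F_i + L + 1 with L = max_i max_x |F_i(x)| over the sample."""
    L = max(abs(c) for v in values for c in v)
    return [tuple(c + L + 1 for c in v) for v in values], L

# Hypothetical payoff vectors F(x) at three sampled end points x.
sample = [(-2.0, 0.5), (1.0, -3.0), (0.0, 0.0)]
shifted, L = shift_to_positive(sample)

# The strict vector ordering between any two sample points is unchanged.
order_preserved = all(
    strictly_greater(a, b) == strictly_greater(sa, sb)
    for a, sa in zip(sample, shifted)
    for b, sb in zip(sample, shifted)
)
```

Because every component is shifted by the same constant, all comparisons between payoff vectors are preserved, which is the mechanism behind the coincidence of the saddle point sets.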
3.2. Addition of Criteria

The following property holds for Slater and A-saddle points.

Proposition 3.2. If there exists a subset of natural numbers $K \subset N$ such that the situation $(U^S, V^S)$ is a Slater saddle point of the game

$$\langle \{1, 2\}, \Sigma \div (1.8), \{\mathcal{U}, \mathcal{V}\}, \{F_j(x[\theta])\}_{j \in K} \rangle \tag{3.3}$$

with initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, this situation is also a Slater saddle point in the game (1.7)

$$\langle \{1, 2\}, \Sigma \div (1.8), \{\mathcal{U}, \mathcal{V}\}, \{F_j(x[\theta])\}_{j \in N} \rangle$$

with the same initial position.

Proof. By Definition 1.1 the situation $(U^S, V^S)$ is a Slater saddle point in the game (3.3) only if, for any quasimotions $x[\cdot, t_0, x_0, U^S]$, $x[\cdot, t_0, x_0, U^S, V^S]$, $x[\cdot, t_0, x_0, V^S]$, each of the following two systems of inequalities is nonsimultaneous:

$$F_j(x[\theta, t_0, x_0, U^S]) > F_j(x[\theta, t_0, x_0, U^S, V^S]), \quad j \in K,$$

or

$$F_k(x[\theta, t_0, x_0, U^S, V^S]) > F_k(x[\theta, t_0, x_0, V^S]), \quad k \in K.$$

But nonsimultaneity of the inequalities for some of the criteria ($j \in K$) entails that of the "complete" system of inequalities (1.9) ($j \in N$). Note that the converse is not true (cf. Example 1.1, Chapter 3).

Corollary 3.1. The saddle point $(U^{(i)}, V^{(i)})$ of a zero-sum differential game with scalar payoff function

$$\Gamma^{(i)} = \langle \{1, 2\}, \Sigma \div (1.8), \{\mathcal{U}, \mathcal{V}\}, F_i(x[\theta]) \rangle$$

and initial position $(t_0, x_0)$ is a Slater saddle point of the game (1.7) with the same initial position.
Indeed, if $(U^{(i)}, V^{(i)})$ is a saddle point of the game $\Gamma^{(i)}$, then, by Property 3.5 from Chapter 1, the following inequalities hold:

$$F_i(x[\theta, t_0, x_0, U^{(i)}]) \leq F_i(x[\theta, t_0, x_0, U^{(i)}, V^{(i)}]) \leq F_i(x[\theta, t_0, x_0, V^{(i)}])$$

for any quasimotions $x[\cdot, t_0, x_0, U^{(i)}]$, $x[\cdot, t_0, x_0, U^{(i)}, V^{(i)}]$, $x[\cdot, t_0, x_0, V^{(i)}]$. Hence, for the same quasimotions neither of the following inequalities can hold:

$$F_i(x[\theta, t_0, x_0, U^{(i)}]) > F_i(x[\theta, t_0, x_0, U^{(i)}, V^{(i)}]),$$
$$F_i(x[\theta, t_0, x_0, U^{(i)}, V^{(i)}]) > F_i(x[\theta, t_0, x_0, V^{(i)}]).$$

Then it follows from Proposition 3.2 that the situation $(U^{(i)}, V^{(i)})$ is a Slater saddle point of the game (1.7).

Consequently, the saddle points of every game $\Gamma^{(i)}$, $i \in N$, are Slater saddle points of the game (1.7).
Remark 3.2. Corollary 3.1 is helpful in proving the existence of Slater saddle points of the game (1.7). Specifically, if Conditions 1.1 (Chapter 1) and 1.1 and equality (3.2) from Chapter 1 hold, then, for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, there exist Slater saddle points in the differential game (1.7). Indeed, when the above conditions hold, in $\Gamma^{(i)}$ for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, by Theorem 3.2 (Chapter 1) there exists a saddle point $(U^{(i)}, V^{(i)})$ which, by Corollary 3.1, is a Slater saddle point of the game (1.7).

Remark 3.3. Let the situation $(U^S, V^S)$ be a Slater saddle point of the game (1.7) and let $\psi(x)$ be an arbitrary scalar function. Then, when the latter is made an additional component of the payoff function $F(x[\theta])$, the situation $(U^S, V^S)$ remains a Slater saddle point of the game

$$\langle \{1, 2\}, \Sigma \div (1.8), \{\mathcal{U}, \mathcal{V}\}, \{F(x[\theta]), \psi(x[\theta])\} \rangle.$$

This Remark follows from Proposition 3.2. Let us also consider the case where the addition of a criterion "does not spoil" the set of vector-valued saddle points.
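Proposition 3.3. Assume that the game

$$\tilde\Gamma = \langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, \tilde F(x[\theta]) \rangle$$

is affinely equivalent to (1.7) with the same initial position $(t_0, x_0)$ and that $\psi(x)$ is an arbitrary scalar function defined on $\mathbb{R}^n$. Then the sets of Slater (Pareto) saddle points for the differential games

$$\langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, \{\tilde F(x[\theta]), \psi(x[\theta])\} \rangle$$

and

$$\langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, \{F(x[\theta]), \psi(x[\theta])\} \rangle$$

with the same initial position $(t_0, x_0)$ coincide.

The proof follows from the fact that

$$(F(x^{(1)}), \psi(x^{(1)})) > (F(x^{(2)}), \psi(x^{(2)})) \Rightarrow (\tilde F(x^{(1)}), \psi(x^{(1)})) > (\tilde F(x^{(2)}), \psi(x^{(2)})),$$

the converse also being true. This interdependence also holds for $\geq$.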
3.3. Use of Increasing Functions

The scalar function $\psi(y)$ of the scalar argument $y$ is strictly increasing on $\mathbb{R}^1$ if $y_1 < y_2 \Rightarrow \psi(y_1) < \psi(y_2)$, the converse also being true.
Proposition 3.4. Let the scalar function $\psi(y)$ be strictly increasing on $\mathbb{R}^1$. Then, for a fixed initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, the differential games (1.7) and

$$\langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, \{F_1(x[\theta]), \ldots, F_{i-1}(x[\theta]), \psi(F_i(x[\theta])), F_{i+1}(x[\theta]), \ldots, F_N(x[\theta])\} \rangle \tag{3.4}$$

have the same set of Slater (Pareto) saddle points.
Proof. If the situation $(U^S, V^S)$ is a Slater saddle point, then, by Definition 1.1, for every pair of quasimotions $(x^*[\cdot, t_0, x_0, U^S], x^*[\cdot, t_0, x_0, U^S, V^S])$ there is a natural number $i_0(x^*) = i_0 \in N$ such that

$$F_{i_0}(x^*[\theta, t_0, x_0, U^S]) \leq F_{i_0}(x^*[\theta, t_0, x_0, U^S, V^S]). \tag{3.5}$$

For brevity of notation, the nonsimultaneous right-hand inequalities of (1.9) are omitted, though similar relations also hold for any pair of quasimotions $(x^*[\cdot, t_0, x_0, U^S, V^S], x^*[\cdot, t_0, x_0, V^S])$. If $i_0 \neq i$, then $(U^S, V^S)$ is a Slater saddle point of both games (3.4) and (1.7). If, however, $i_0 = i$, then (3.5) is equivalent to

$$\psi(F_i(x^*[\theta, t_0, x_0, U^S])) \leq \psi(F_i(x^*[\theta, t_0, x_0, U^S, V^S])),$$

which signifies, too, that $(U^S, V^S)$ is a Slater saddle point of (3.4) and (1.7).

If the situation $(U^P, V^P)$ is a Pareto saddle point of the game (1.7), then, by Definition 1.2, for every pair of quasimotions $(x^*[\cdot, t_0, x_0, U^P], x^*[\cdot, t_0, x_0, U^P, V^P])$ either

$$F(x^*[\theta, t_0, x_0, U^P, V^P]) = F(x^*[\theta, t_0, x_0, U^P])$$

or there is a natural number $i_0(x^*) = i_0 \in N$ such that

$$F_{i_0}(x^*[\theta, t_0, x_0, U^P]) < F_{i_0}(x^*[\theta, t_0, x_0, U^P, V^P]).$$

Here again the nonsimultaneous inequalities on the right-hand side of (1.13) are omitted. Since $\psi(y)$ is strictly increasing, the above equality is equivalent to

$$F_j(x^*[\theta, t_0, x_0, U^P]) = F_j(x^*[\theta, t_0, x_0, U^P, V^P]), \quad j \in N, \ j \neq i,$$
$$\psi(F_i(x^*[\theta, t_0, x_0, U^P])) = \psi(F_i(x^*[\theta, t_0, x_0, U^P, V^P])).$$

The inequality with $i_0 \neq i$ signifies that $(U^P, V^P)$ is a Pareto saddle point of the games (1.7) and (3.4), and for $i_0 = i$ this inequality is equivalent to

$$\psi(F_i(x^*[\theta, t_0, x_0, U^P])) < \psi(F_i(x^*[\theta, t_0, x_0, U^P, V^P])).$$

In this case $(U^P, V^P)$ will also be a Pareto saddle point of (1.7) and (3.4). This proves the proposition.

By virtue of this property, if $a$ is a constant $n$-vector, the component of the vector-valued payoff function $F_i(x[\theta]) = \|x[\theta] - a\|$ may be replaced by $\psi(F_i(x[\theta])) = \|x[\theta] - a\|^2$, because the scalar function $\psi(F_i) = F_i^2$ is strictly increasing in $F_i$ (for $F_i \geq 0$). But whereas the function $\|x[\theta] - a\|$ is not differentiable with respect to $x$ at the point $x[\theta] = a$, the function $\|x[\theta] - a\|^2$ is differentiable for every $x[\theta] \in \mathbb{R}^n$. Any component $F_i(x[\theta])$ of the vector-valued payoff function may be replaced, for instance, by $k_i F_i(x[\theta]) + a_i$, where $k_i = \text{const} > 0$, $a_i = \text{const}$, or by $\alpha^{F_i(x[\theta])}$ ($\alpha = \text{const} > 1$), by $b^{-F_i(x[\theta])}$ ($b = \text{const} \in (0, 1)$), or by $(F_i(x[\theta]))^k$, where $k$ is an odd positive integer. With these transformations the sets of Slater and Pareto saddle points do not change. Now, let us use scalar functions of the vector argument $\psi(F)$ defined on $\mathbb{R}^N$.
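Before turning to functions of the vector argument, the invariance stated in Proposition 3.4 can be illustrated on a finite static analogue of the game. This is only a sketch under stated assumptions: the differential game is replaced by a hypothetical table `F[i][j]` of payoff vectors, the row player minimizes and the column player maximizes (as in the saddle point inequalities used in this section), and only pure strategies are considered.

```python
import math

def gt(a, b):
    """a > b: strictly larger in every component (Slater-type comparison)."""
    return all(x > y for x, y in zip(a, b))

def ge(a, b):
    """a >= b and a != b (Pareto-type comparison)."""
    return all(x >= y for x, y in zip(a, b)) and a != b

def saddle_points(F, dominates):
    """Pure vector-valued saddle points of the finite table F[i][j]."""
    rows, cols = range(len(F)), range(len(F[0]))
    return {
        (i, j)
        for i in rows for j in cols
        # the maximizing column player cannot improve on F[i][j] ...
        if not any(dominates(F[i][q], F[i][j]) for q in cols)
        # ... and the minimizing row player cannot improve either
        and not any(dominates(F[i][j], F[p][j]) for p in rows)
    }

def transform(F, index, psi):
    """Replace component number `index` of every payoff vector by psi(component)."""
    return [[tuple(psi(c) if n == index else c for n, c in enumerate(v))
             for v in row] for row in F]

# Hypothetical two-criterion payoff table.
F = [[(1, 4), (2, 2), (0, 3)],
     [(3, 1), (2, 5), (1, 1)],
     [(0, 0), (4, 2), (2, 2)]]

G_exp = transform(F, 0, math.exp)             # strictly increasing on R
G_cube = transform(F, 1, lambda c: c ** 3)    # odd power, strictly increasing
```

In this sketch the sets returned for `F`, `G_exp`, and `G_cube` coincide for both comparisons, since a strictly increasing map preserves every componentwise equality and strict inequality.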
Definition 3.2. The scalar function $\psi(F)$ of the vector argument is referred to [70, p. 22] as increasing (nondecreasing) over $>$ on $\mathbb{R}^N$ if

$$F^{(1)} > F^{(2)} \Rightarrow \psi(F^{(1)}) > \psi(F^{(2)}) \quad (F^{(1)} > F^{(2)} \Rightarrow \psi(F^{(1)}) \geq \psi(F^{(2)}))$$

and increasing (nondecreasing) over $\geq$ if

$$F^{(1)} \geq F^{(2)} \Rightarrow \psi(F^{(1)}) > \psi(F^{(2)}) \quad (F^{(1)} \geq F^{(2)} \Rightarrow \psi(F^{(1)}) \geq \psi(F^{(2)})).$$

It has been found [70, pp. 63-64] that a function $\psi(F) = \psi(F_1, \ldots, F_N)$ continuous on $\mathbb{R}^N$ has the following properties:

(1) if it is nondecreasing in every variable $F_i$, $i \in N$, for any fixed values of the remaining variables $F_j$, $j \in N$, $j \neq i$, it is nondecreasing over $>$ and $\geq$;
(2) if it is increasing in every variable, the others being fixed, $\psi(F)$ is increasing over $\geq$;
(3) if it is nondecreasing in every variable and increasing in at least one of them, $\psi(F)$ is increasing over $>$;
(4) if it is nondecreasing over $\geq$, it does not decrease in any variable;
(5) if it is increasing over $>$, it does not decrease in any variable.
Proposition 3.5. (1) If the scalar function $\psi(F)$ is nondecreasing over $\geq$ on $\mathbb{R}^N$, the differential games (1.7) and

$$\langle \{1, 2\}, \Sigma, \{\mathcal{U}, \mathcal{V}\}, \{F(x[\theta]), \psi(F(x[\theta]))\} \rangle \tag{3.6}$$

have the same set of Pareto saddle points. (2) If the function $\psi(F)$ is increasing over $>$ on $\mathbb{R}^N$, (1.7) and (3.6) have the same set of Slater saddle points.

In both cases the initial positions $(t_0, x_0)$ of the two games are assumed to coincide. The proof will be presented for (2); (1) is proved in a similar way. If $(U^S, V^S)$ is an arbitrary Slater saddle point of (1.7) with initial position $(t_0, x_0)$, then, by Definition 1.1, for any quasimotions $x[\cdot, t_0, x_0, U^S]$, $x[\cdot, t_0, x_0, U^S, V^S]$, $x[\cdot, t_0, x_0, V^S]$ it is true that

$$F(x[\theta, t_0, x_0, U^S]) \not> F(x[\theta, t_0, x_0, U^S, V^S]) \not> F(x[\theta, t_0, x_0, V^S]).$$

Hence, from Remark 3.3, $(U^S, V^S)$ is a Slater saddle point of the game (3.6). Conversely, assuming that the situation $(U^S, V^S)$ is a Slater saddle point of (3.6) with initial position $(t_0, x_0)$, let us show that it is also a Slater saddle point of (1.7) with the same initial position. We assume the contrary, that $(U^S, V^S)$ is not a Slater saddle point of the game (1.7). Then there is either a pair of quasimotions $(x^*[\cdot, t_0, x_0, U^S], x^*[\cdot, t_0, x_0, U^S, V^S])$ such that

$$F(x^*[\theta, t_0, x_0, U^S]) > F(x^*[\theta, t_0, x_0, U^S, V^S]) \tag{3.7}$$

or a pair $(\hat x[\cdot, t_0, x_0, U^S, V^S], \hat x[\cdot, t_0, x_0, V^S])$ for which

$$F(\hat x[\theta, t_0, x_0, U^S, V^S]) > F(\hat x[\theta, t_0, x_0, V^S]),$$

or both vector inequalities hold simultaneously. But, for example, from (3.7), since $\psi(F)$ is increasing over $>$ on $\mathbb{R}^N$, it follows that

$$\psi(F(x^*[\theta, t_0, x_0, U^S])) > \psi(F(x^*[\theta, t_0, x_0, U^S, V^S])),$$

which, combined with (3.7), signifies that the situation $(U^S, V^S)$ is not a Slater saddle point of (3.6) with initial position $(t_0, x_0)$. This contradiction proves Proposition 3.5.
Remark 3.4. The above proposition makes it possible to reject a component of the vector-valued payoff function of (1.7) that is increasing over one component and nondecreasing over the remaining components of this payoff function (for instance, if the component is increasing over each of the remaining ones). The sets of Slater and Pareto saddle points do not change. Examples of functions $\psi(F)$ that are increasing and nondecreasing over $>$ and $\geq$ are provided at the beginning of the next section.
4. Sufficient Conditions

4.1. Examples of Functions Increasing over $>$ and over $\geq$ on $\mathbb{R}^N$

Let us first consider the case of functions $\psi(F)$ that are increasing over $>$ and $\geq$ on $\mathbb{R}^N$:

$\psi_1(F)$, with $\beta_i = \text{const} > 0$, $i \in N$;

$\psi_2(F)$, with positive constants $k$ and $\beta_i$, $i \in N$;

$\psi_3(F)$, with the same $k$, $\beta_i$, and

$$F_i^* = \max_{x \in X[\theta, t_0, x_0, U \div H, V \div Q]} F_i(x);$$

$$\psi_4(F) = -\prod_{i \in N} (F_i^* + \gamma_i - F_i)^{k_i}$$

with positive constants $k_i$ and $\gamma_i$, $i \in N$;

$\psi_5(F)$, with $F_i > 0$, $\beta_i = \text{const} > 0$, $i \in N$.
Each of these functions $\psi_j(F) = \psi_j(F_1, \ldots, F_N)$, $j = 1, \ldots, 5$, is increasing with respect to each variable with the others being fixed.

The following functions are increasing over $>$ on $\mathbb{R}^N$ but not necessarily increasing with respect to every variable:

$$\psi_6(F) = \sum_{i \in N} \beta_i F_i$$

with $\beta_i = \text{const} \geq 0$, $\sum_{i \in N} \beta_i > 0$;

$\psi_7(F)$ and $\psi_8(F)$, where $\xi_i$ are functions increasing on $\mathbb{R}^1$; for instance $\xi_i(F_i) = F_i - F_i^0$, $\xi_i = \beta_i F_i$, where

$$F_i^0 = \min_{x \in X[\theta, t_0, x_0, U \div H, V \div Q]} F_i(x)$$

and the constant $\beta_i > 0$; note that the functions $\psi_7(F)$ and $\psi_8(F)$ are increasing over $>$ on $\mathbb{R}^N$, as are the functions $\psi(F) = F_i$, but are not increasing over $\geq$ on $\mathbb{R}^N$;

$$\psi_9(F) = \eta\Big(\sum_{i \in N} \xi_i(F_i)\Big),$$

where none of the scalar functions $\xi_i(y)$ is decreasing on $\mathbb{R}^1$, the family $(\xi_i(y), i \in N)$ separates points of the straight line (that is, for any $y_1 \neq y_2$ of $\mathbb{R}^1$ there is a function $\xi_k(y)$, $k \in N$, such that $\xi_k(y_1) \neq \xi_k(y_2)$), and the scalar function $\eta(y)$ is increasing over the number line.

Let us proceed now to cases of functions that are not decreasing over $\geq$ on $\mathbb{R}^N$:

$$\psi_{10}(F) = \sum_{i \in N} \beta_i F_i,$$

where the vector $\beta = (\beta_1, \ldots, \beta_N) \in B = \{\beta \in \mathbb{R}^N \mid \beta_i \geq 0, \ i \in N, \ \sum_{i \in N} \beta_i > 0\}$ is constant; in particular $\psi_{10}(F) = F_i$;

$$\psi_{11}(F) = \min_{i \in N} \beta_i (F_i - F_i^0),$$

where $\beta \in B$ and $F_i^0 = \min_{x \in X[\theta, t_0, x_0, U \div H, V \div Q]} F_i(x)$.
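The difference between "increasing over $>$" and "increasing over $\geq$" can be seen on two simple functions of the kinds listed above. The sketch below is illustrative only: the weights `beta`, the reference point `F0`, and the test vectors are hypothetical, and the two functions are a weighted sum with positive weights and a weighted minimum of the form $\min_i \beta_i(F_i - F_i^0)$.

```python
def gt(a, b):
    """a > b: strict inequality in every component."""
    return all(x > y for x, y in zip(a, b))

def ge(a, b):
    """a >= b and a != b."""
    return all(x >= y for x, y in zip(a, b)) and a != b

def weighted_sum(F, beta=(1.0, 2.0)):
    """Linear form with positive weights: increasing over > and over >=."""
    return sum(b * f for b, f in zip(beta, F))

def weighted_min(F, beta=(1.0, 2.0), F0=(0.0, 0.0)):
    """min_i beta_i*(F_i - F0_i): increasing over >, only nondecreasing over >=."""
    return min(b * (f - f0) for b, f, f0 in zip(beta, F, F0))

a, b = (1.0, 3.0), (1.0, 2.0)   # a >= b and a != b, but not a > b
c, d = (2.0, 3.0), (1.0, 2.0)   # c > d

sum_strict_on_ge = weighted_sum(a) > weighted_sum(b)    # 7.0 versus 5.0
min_flat_on_ge = weighted_min(a) == weighted_min(b)     # both equal 1.0
min_strict_on_gt = weighted_min(c) > weighted_min(d)    # 2.0 versus 1.0
```

The pair `a`, `b` shows why a saddle point of a game scalarized by a minimum-type function is, in general, only guaranteed to be a Slater saddle point, whereas a linear form with positive weights also separates vectors that differ in a single component.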
In the propositions that follow, Conditions 1.1 (Chapter 1) and 1.1 of this chapter are assumed to be satisfied and the initial position $(t_0, x_0)$ is fixed for the differential game (1.7).
4.2. Slater Saddle Points

Proposition 4.1. If the scalar function $\psi(F)$ is increasing over $>$ on $\mathbb{R}^N$, the saddle point $(U^S, V^S)$ of the differential zero-sum game with scalar payoff function

$$\langle \{1, 2\}, \Sigma \div (1.8), \{\mathcal{U}, \mathcal{V}\}, \psi(F(x[\theta])) \rangle \tag{4.1}$$

with initial position $(t_0, x_0)$ is a Slater saddle point of the game (1.7).

Proof. Because $(U^S, V^S)$ is a saddle point of the game (4.1), by Property 3.5 from Chapter 1

$$\psi(F(x[\theta, t_0, x_0, U^S])) \leq \psi(F(x[\theta, t_0, x_0, U^S, V^S])) \leq \psi(F(x[\theta, t_0, x_0, V^S])) \tag{4.2}$$

for any quasimotions $x[\cdot, t_0, x_0, U^S]$, $x[\cdot, t_0, x_0, U^S, V^S]$, and $x[\cdot, t_0, x_0, V^S]$ of the system (1.8) generated by $U^S$, $(U^S, V^S)$, and $V^S$, respectively, from the initial position $(t_0, x_0)$. Let us now assume the contrary, that the situation $(U^S, V^S)$ is not a Slater saddle point of (1.7). This means that either there are quasimotions $x^*[\cdot, t_0, x_0, U^S]$ and $x^*[\cdot, t_0, x_0, U^S, V^S]$ such that the following vector inequality holds:

$$F(x^*[\theta, t_0, x_0, U^S]) > F(x^*[\theta, t_0, x_0, U^S, V^S]) \tag{4.3}$$

or there exists a pair of quasimotions $(\hat x[\cdot, t_0, x_0, U^S, V^S], \hat x[\cdot, t_0, x_0, V^S])$ for which

$$F(\hat x[\theta, t_0, x_0, U^S, V^S]) > F(\hat x[\theta, t_0, x_0, V^S]),$$

or both vector inequalities hold at the same time. For instance, let us assume that (4.3) is true. Then, because $\psi(F)$ is increasing over $>$ on $\mathbb{R}^N$, we have the relation

$$\psi(F(x^*[\theta, t_0, x_0, U^S])) > \psi(F(x^*[\theta, t_0, x_0, U^S, V^S])),$$

which contradicts the left-hand side of (4.2) with $x[\cdot] = x^*[\cdot]$. This proves the proposition.

Proposition 4.1 leads to various sufficient conditions under which a Slater saddle point of (1.7) exists; for this purpose one of the functions $\psi_j(F)$, $j = 1, \ldots, 9$, mentioned at the beginning of this section is used as the function $\psi(F)$.

These conditions are not necessary, which follows from the next example.
Example 4.1. We assume that in (1.7) the change of the conflict-controlled system $\Sigma$ is described by a system of two scalar differential equations,

$$\dot x_1 = u, \quad \dot x_2 = v, \quad x_i[0] = x_{i0} = 0, \quad i = 1, 2, \quad \theta = 1 \geq t \geq 0. \tag{4.4}$$

Here $x = (x_1, x_2)$, $x_0 = (x_{10}, x_{20})$, the sets $H = Q = [0, 1]$, and the vector-valued payoff function is $F(x[\theta]) = (F_1(x[\theta]), F_2(x[\theta])) = (x_1[\theta], x_2[\theta])$. The sets of strategies $\mathcal{U}$ and $\mathcal{V}$ are additionally restricted to singleton strategies, such that $X[\theta, t_0, x_0, U, V]$ (for any situation $(U, V) \in \mathcal{U} \times \mathcal{V}$) is a point in $\mathbb{R}^2$. The existence of such situations $(U, V)$ for every point $x$ from the reachability domain $X[\theta, t_0, x_0, U \div H, V \div Q]$ is obtained from Corollary 4.1 from Chapter 1.

In this differential game any situation $(U^S, V^S) \in \mathcal{U} \times \mathcal{V}$ is a Slater saddle point. Indeed, because the strategies are singleton, for any quasimotions $x[\cdot, t_0, x_0, U, V^S]$ the following system of inequalities is nonsimultaneous:

$$x_1[\theta, t_0, x_0, U, V^S] < x_1[\theta, t_0, x_0, U^S, V^S],$$
$$x_2[\theta, t_0, x_0, U, V^S] < x_2[\theta, t_0, x_0, U^S, V^S]$$

(because the latter inequality turns into an equality); so is the system

$$x_1[\theta, t_0, x_0, U^S, V] > x_1[\theta, t_0, x_0, U^S, V^S],$$
$$x_2[\theta, t_0, x_0, U^S, V] > x_2[\theta, t_0, x_0, U^S, V^S]$$

(here the first inequality turns into an equality). Consequently, all the situations $(U, V) \in \mathcal{U} \times \mathcal{V}$ are Slater saddle points.

Now let us consider the zero-sum differential game $\Gamma_\beta$ associated with (4.1) with scalar payoff function

$$\beta x_1[\theta] + (1 - \beta) x_2[\theta]$$

with $\beta \in [0, 1]$, whose dynamics are also described by the system (4.4) with the same sets of strategies $\mathcal{U}$ and $\mathcal{V}$. In this game the saddle points have the form

$(u^*, 1)$ with $\beta = 0$,
$(0, 1)$ with $0 < \beta < 1$,
$(0, v^*)$ with $\beta = 1$,

where $u^*$ and $v^*$ are arbitrary numbers from the closed interval $[0, 1]$. By Proposition 4.1 these points are Slater saddle points of the initial differential game of Example 4.1. In Fig. 5.4.1 the entire set of Slater saddle points is shaded and the heavy line shows those which satisfy the sufficient conditions of Proposition 4.1. Consequently, these conditions are not necessary. For instance, $(U, V) \div (1, 0)$ is a Slater saddle point though there exists no $\beta \in [0, 1]$ for which this situation is a saddle point of $\Gamma_\beta$.

Now let us apply the method of dynamic programming to Proposition 4.1.
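The claims of Example 4.1 can be checked numerically on a grid before moving on. The sketch below is an illustration under stated assumptions: constant controls $u, v \in [0, 1]$ stand in for the singleton strategies, so the end point is $x[\theta] = (u, v)$; the interval $[0, 1]$ is sampled at five points; the scalar payoff is taken as $\beta x_1[\theta] + (1 - \beta) x_2[\theta]$; the first player minimizes and the second maximizes.

```python
def gt(a, b):
    """a > b: strict inequality in every component."""
    return all(x > y for x, y in zip(a, b))

grid = [k / 4 for k in range(5)]        # sample points of [0, 1]
betas = [k / 10 for k in range(11)]     # sample values of beta in [0, 1]

def payoff(u, v):
    # With constant controls on [0, 1]: x1[theta] = u, x2[theta] = v.
    return (u, v)

def is_slater_saddle(u, v):
    second_cannot_gain = not any(gt(payoff(u, q), payoff(u, v)) for q in grid)
    first_cannot_gain = not any(gt(payoff(u, v), payoff(p, v)) for p in grid)
    return second_cannot_gain and first_cannot_gain

def is_scalar_saddle(u, v, beta):
    def psi(F):
        return beta * F[0] + (1 - beta) * F[1]
    value = psi(payoff(u, v))
    return (all(psi(payoff(u, q)) <= value for q in grid)
            and all(value <= psi(payoff(p, v)) for p in grid))

slater = {(u, v) for u in grid for v in grid if is_slater_saddle(u, v)}
scalar = {(u, v) for u in grid for v in grid
          if any(is_scalar_saddle(u, v, beta) for beta in betas)}
```

On this grid every situation is a Slater saddle point, while the scalarized games only produce the points with $u = 0$ or $v = 1$; the point $(1, 0)$ belongs to the first set but not to the second.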
Theorem 4.1. Let us assume that Conditions 1.1 (Chapter 1) and 1.1 (this chapter) are satisfied and that there exist a situation $(U^S, V^S) \in \mathcal{U} \times \mathcal{V}$, a continuous function $\varphi(\cdot): [0, \theta] \times \mathbb{R}^n \to \mathbb{R}^1$, and a function $\psi(F_1, \ldots, F_N)$ increasing over $>$ on $\mathbb{R}^N$ such that

(1) for every $x \in \mathbb{R}^n$,

$$\varphi(\theta, x) = \psi(F_1(x), \ldots, F_N(x)); \tag{4.5}$$

(2) for any position $(t_*, x_*) \in [0, \theta) \times \mathbb{R}^n$ and every quasimotion $x[\cdot] = x[\cdot, t_*, x_*, U^S, V^S]$ the following equalities hold:

$$\varlimsup_{t \to t_* + 0} [\varphi(t, x[t]) - \varphi(t_*, x_*)](t - t_*)^{-1} = \varliminf_{t \to t_* + 0} [\varphi(t, x[t]) - \varphi(t_*, x_*)](t - t_*)^{-1} = 0; \tag{4.6}$$

(3) at every position $(t_*, x_*) \in [0, \theta) \times \mathbb{R}^n$ the inequalities (4.7) hold.
Then the situation $(U^S, V^S)$ is a Slater saddle point of the game (1.7) for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$.

Proof. If $(t_0, x_0)$ is an arbitrary position from the set $[0, \theta) \times \mathbb{R}^n$, then, by (4.6) and Corollary 4.1 of Chapter 3, $\varphi(t, x[t, t_0, x_0, U^S, V^S]) = \varphi(t_0, x_0)$ at every $t \in [t_0, \theta]$ for every quasimotion $x[\cdot, t_0, x_0, U^S, V^S]$ of the system (1.8) generated by the situation $(U^S, V^S)$ from the position $(t_0, x_0)$.

Figure 5.4.1.

Hence for $t = \theta$, from (4.5) we have

$$\psi(F_1(x[\theta, t_0, x_0, U^S, V^S]), \ldots, F_N(x[\theta, t_0, x_0, U^S, V^S])) = \varphi(t_0, x_0) \tag{4.8}$$

also for all quasimotions $x[\cdot, t_0, x_0, U^S, V^S]$. From the first inequality in (4.7) and Proposition 4.2 from Chapter 3, for any quasimotion $x[\cdot, t_0, x_0, U^S]$ we have, in particular at $t = \theta$,

$$\varphi(\theta, x[\theta, t_0, x_0, U^S]) \leq \varphi(t_0, x_0).$$

Hence, from (4.5),

$$\psi(F_1(x[\theta, t_0, x_0, U^S]), \ldots, F_N(x[\theta, t_0, x_0, U^S])) \leq \varphi(t_0, x_0) \tag{4.9}$$

for any quasimotions $x[\cdot, t_0, x_0, U^S]$. Similarly, from the second inequality in (4.7) and Proposition 4.3 from Chapter 3,

$$\psi(F_1(x[\theta, t_0, x_0, V^S]), \ldots, F_N(x[\theta, t_0, x_0, V^S])) \geq \varphi(t_0, x_0)$$

for every quasimotion $x[\cdot, t_0, x_0, V^S]$. Combining this inequality with (4.8) and (4.9) leads to the inequality

$$\psi(F_1(x[\theta, t_0, x_0, U^S]), \ldots, F_N(x[\theta, t_0, x_0, U^S]))$$
$$\leq \psi(F_1(x[\theta, t_0, x_0, U^S, V^S]), \ldots, F_N(x[\theta, t_0, x_0, U^S, V^S]))$$
$$\leq \psi(F_1(x[\theta, t_0, x_0, V^S]), \ldots, F_N(x[\theta, t_0, x_0, V^S])),$$

which holds for any quasimotions $x[\cdot, t_0, x_0, U^S]$, $x[\cdot, t_0, x_0, U^S, V^S]$, and $x[\cdot, t_0, x_0, V^S]$. Then, by Property 3.5 in Chapter 1, the situation $(U^S, V^S)$ is a saddle point of the differential game (4.1) with initial position $(t_0, x_0)$. This in turn, by Proposition 4.1, implies that $(U^S, V^S)$ is a Slater saddle point of the differential game (1.7) with initial position $(t_0, x_0)$. This proves the theorem.

This theorem makes it possible to apply the mathematical tools of [43, 83] for determining saddle points of differential games, a modification of the dynamic programming method, to the determination of Slater saddle points.
4.3. Pareto Saddle Points

Following Proposition 4.1 we can prove the next assertion.

Proposition 4.2. If the scalar function $\psi(F)$ is increasing over $\geq$ on $\mathbb{R}^N$, the saddle point $(U^P, V^P)$ of (4.1) with initial position $(t_0, x_0)$ is a Pareto saddle point of the differential game (1.7) with the same initial position.

In this case the function $\psi(F)$ may be any one of $\psi_j(F)$ ($j = 1, 2, \ldots, 6, 9$). Using dynamic programming tools as in Theorem 2.1 (Chapter 1) we have the next assertion.

Proposition 4.3. If there exist a situation $(U^P, V^P) \in \mathcal{U} \times \mathcal{V}$, a continuous function $\varphi(\cdot): [0, \theta] \times \mathbb{R}^n \to \mathbb{R}^1$, and a function $\psi(F)$ increasing over $\geq$ on $\mathbb{R}^N$ such that (4.5)-(4.7) hold under the substitution $(U^S, V^S) \to (U^P, V^P)$, then, for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, the situation $(U^P, V^P)$ is a Pareto saddle point of the differential game (1.7).

Another sufficient condition for the existence of a Pareto saddle point is given by the next assertion.

Proposition 4.4. If the function $\psi(F)$ is not decreasing over $\geq$ on $\mathbb{R}^N$, the strict saddle point $(U^P, V^P)$ of the game (4.1) with initial position $(t_0, x_0)$ is a Pareto saddle point of (1.7) with the same initial position.
Proof. The saddle point $(U^P, V^P)$ of (4.1) is strict if

$$\psi(F(x[\theta, t_0, x_0, U^P, V])) < \psi(F(x[\theta, t_0, x_0, U^P, V^P])) < \psi(F(x[\theta, t_0, x_0, U, V^P])) \tag{4.10}$$

for any strategies $V \in \mathcal{V}$, $V \neq V^P$, and $U \in \mathcal{U}$, $U \neq U^P$, and any quasimotions $x[\cdot, t_0, x_0, U^P, V]$, $x[\cdot, t_0, x_0, U^P, V^P]$, and $x[\cdot, t_0, x_0, U, V^P]$. Let us assume the contrary, that the strict saddle point of (4.1) is not a Pareto saddle point of (1.7). Then there exist either a strategy $\hat V \in \mathcal{V}$ ($\hat V \neq V^P$) and quasimotions $x^*[\cdot, t_0, x_0, U^P, \hat V]$, $x^*[\cdot, t_0, x_0, U^P, V^P]$ such that

$$F(x^*[\theta, t_0, x_0, U^P, \hat V]) \geq F(x^*[\theta, t_0, x_0, U^P, V^P])$$

or there are a strategy $\hat U \in \mathcal{U}$ ($\hat U \neq U^P$) and quasimotions $\hat x[\cdot, t_0, x_0, U^P, V^P]$, $\hat x[\cdot, t_0, x_0, \hat U, V^P]$ for which

$$F(\hat x[\theta, t_0, x_0, U^P, V^P]) \geq F(\hat x[\theta, t_0, x_0, \hat U, V^P]),$$

or both inequalities hold simultaneously. Because the function $\psi(F)$ is not decreasing over $\geq$ on $\mathbb{R}^N$, it follows from, for instance, the first vector inequality with $\hat V \neq V^P$ that

$$\psi(F(x^*[\theta, t_0, x_0, U^P, \hat V])) \geq \psi(F(x^*[\theta, t_0, x_0, U^P, V^P])).$$

We have a contradiction with the left-hand side of (4.10), where $V = \hat V$ and $x[\cdot] = x^*[\cdot]$. That the second vector inequality is impossible may be established in a similar way. This contradiction proves Proposition 4.4.

Remark 4.1. In Proposition 4.4 the function $\psi(F)$ may be taken as $\psi_{10}(F)$ or $\psi_{11}(F)$ from subsection 4.1. In particular, for the function $\psi_{10}(F) = F_j$ we have the following: if among the components of the vector-valued payoff function $F(x[\theta])$ there exists $F_j(x[\theta])$ such that in the differential game $\Gamma_j = \langle \{1, 2\}, \Sigma \div (1.8), \{\mathcal{U}, \mathcal{V}\}, F_j(x[\theta]) \rangle$ with initial position $(t_0, x_0)$ there is a strict saddle point $(U^{(j)}, V^{(j)})$, this will be a Pareto saddle point of the game (1.7) with initial position $(t_0, x_0)$.

Note that the function $\psi(F) = F_j$ is increasing over $>$ on $\mathbb{R}^N$ and so, by Proposition 4.1, the saddle point of $\Gamma_j$ is a Slater saddle point of (1.7), though only strict saddle points are guaranteed to be Pareto saddle points.
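4.4. Geoffrion Saddle Points

Proposition 4.5. If $p_1, \ldots, p_N$ is some collection of positive numbers, the saddle point $(U^G, V^G)$ of a differential game with scalar payoff function

$$\sum_{i \in N} p_i F_i(x[\theta]) \tag{4.11}$$

and initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$ is a Geoffrion saddle point of (1.7) with the same initial position.

Proof. Because the function $\psi_{10}(F) = \sum_{i \in N} p_i F_i$ with positive $p_i$, $i \in N$, is not decreasing over $\geq$ on $\mathbb{R}^N$, by Proposition 4.4 the saddle point $(U^G, V^G)$ of (4.11) is a Pareto saddle point of (1.7) with initial position $(t_0, x_0)$. Let us prove that the situation $(U^G, V^G)$ is a Geoffrion saddle point. Let us assume the contrary. Then for the number

$$M = N \max_{i, j \in N} \frac{p_j}{p_i} \tag{4.12}$$

there exist either an ordinal number $i \in N$, a strategy $\hat V \in \mathcal{V}$, and a pair of quasimotions $(x^*[\cdot, t_0, x_0, U^G, \hat V], x^*[\cdot, t_0, x_0, U^G, V^G])$ such that the following system of inequalities is simultaneous:

$$F_i(x^*[\theta, t_0, x_0, U^G, \hat V]) - F_i(x^*[\theta, t_0, x_0, U^G, V^G]) > 0,$$
$$F_i(x^*[\theta, t_0, x_0, U^G, \hat V]) - F_i(x^*[\theta, t_0, x_0, U^G, V^G]) - M[F_j(x^*[\theta, t_0, x_0, U^G, V^G]) - F_j(x^*[\theta, t_0, x_0, U^G, \hat V])] > 0, \quad j \in N, \ j \neq i, \tag{4.13}$$

or there exist a natural number $i \in N$, a strategy $\hat U \in \mathcal{U}$, and quasimotions $(\hat x[\cdot, t_0, x_0, \hat U, V^G], \hat x[\cdot, t_0, x_0, U^G, V^G])$ for which the analogous system of inequalities is simultaneous, or both these systems hold at the same time. Let us assume, for instance, that the system (4.13) is simultaneous. We identify a set of subscripts

$$N_0 = \{ j \in N \mid F_j(x^*[\theta, t_0, x_0, U^G, \hat V]) - F_j(x^*[\theta, t_0, x_0, U^G, V^G]) < 0 \}. \tag{4.14}$$

By Definition 1.2 (Pareto saddle points) and the inclusion $(U^G, V^G) \in \mathcal{P}$, the set $N_0 \neq \emptyset$. Then, summing the second group of inequalities in (4.13) over $j \in N_0$ (the set $N_0$ contains fewer than $N$ subscripts, since $i \notin N_0$), we have

$$N[F_i(x^*[\theta, t_0, x_0, U^G, \hat V]) - F_i(x^*[\theta, t_0, x_0, U^G, V^G])] > M \sum_{j \in N_0} [F_j(x^*[\theta, t_0, x_0, U^G, V^G]) - F_j(x^*[\theta, t_0, x_0, U^G, \hat V])].$$

On the other hand, since by the premise of Proposition 4.5 $(U^G, V^G)$ is a saddle point of the game (4.11) with initial position $(t_0, x_0)$, we have, by Property 3.5 of Chapter 1:

$$\sum_{j \in N} p_j F_j(x[\theta, t_0, x_0, U^G, V]) \leq \sum_{j \in N} p_j F_j(x[\theta, t_0, x_0, U^G, V^G]) \leq \sum_{j \in N} p_j F_j(x[\theta, t_0, x_0, U, V^G])$$

for any strategies $V \in \mathcal{V}$ and $U \in \mathcal{U}$ and all quasimotions $x[\cdot, t_0, x_0, U^G, V]$, $x[\cdot, t_0, x_0, U^G, V^G]$, $x[\cdot, t_0, x_0, U, V^G]$; from the left-hand side of this inequality, in particular,

$$\sum_{j \in N} p_j [F_j(x^*[\theta, t_0, x_0, U^G, \hat V]) - F_j(x^*[\theta, t_0, x_0, U^G, V^G])] \leq 0$$

or, since the terms with $j \notin N_0$, $j \neq i$, are nonnegative,

$$p_i [F_i(x^*[\theta, t_0, x_0, U^G, \hat V]) - F_i(x^*[\theta, t_0, x_0, U^G, V^G])] \leq \sum_{j \in N_0} p_j [F_j(x^*[\theta, t_0, x_0, U^G, V^G]) - F_j(x^*[\theta, t_0, x_0, U^G, \hat V])].$$

Then, making the inequality stronger by (4.12), we obtain

$$N[F_i(x^*[\theta, t_0, x_0, U^G, \hat V]) - F_i(x^*[\theta, t_0, x_0, U^G, V^G])] \leq M \sum_{j \in N_0} [F_j(x^*[\theta, t_0, x_0, U^G, V^G]) - F_j(x^*[\theta, t_0, x_0, U^G, \hat V])],$$

which contradicts the inequality obtained from (4.13). The case of the second system is handled in the same way.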
Proposition 4.5 and Theorem 4.1 make it possible to formulate a sufficient condition for the existence of a Geoffrion saddle point using the dynamic programming method. If Conditions 1.1from Chapter 1 and 1.1 in this chapter are satisfied and there exist a situation (U", VG)€4!lx "Ir, a continuous function cp( .): [0,8] x R" -+ R', and positive constants pl,. . .,PN such that
Proposition 4.6.
(1) for every x E R",
(2) Conditions (4.6) and (4.7) are satisfied under the substitution
( U S ,V S ) (U", V"),
then, for any choice of initial position (to, xO)e[O, 0)x [w , the situation (UG, V") will be the Geofrion saddle point of the diferential game (1.7). The Geoffrion saddle point that results from this theorem for a linear quadratic differential game is, by the inclusions (1.16), the Pareto and Slater saddle point. 4.5.
A Linear Quadratic Game
Consider a variant of the linear-quadratic differential game of Example 2.1,

$$\Gamma_L = \langle \{1, 2\},\ \Sigma_L,\ \{\mathcal{U}_L, \mathcal{V}_L\},\ \mathbb{J}(U, V) \rangle, \tag{4.16}$$

in which the system $\Sigma_L$ is described by the linear system of differential equations

$$\dot{x} = A(t)x + B_1(t)u + B_2(t)v, \qquad x(t_0) = x_0, \tag{4.17}$$

where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^h$, and $v \in \mathbb{R}^q$; the elements of the matrices $A(t)$ and $B_j(t)$, $j = 1, 2$, of appropriate dimensions are continuous on $[0, \theta]$; the time $\theta$ when the game ends is fixed; and $t_0 \in [0, \theta)$. The sets of players' strategies are

$$\mathcal{U}_L = \{\, U \div u(t, x) \mid u(t, x) = H(t)x \,\}, \qquad \mathcal{V}_L = \{\, V \div v(t, x) \mid v(t, x) = Q(t)x \,\}.$$

The elements of the $(h \times n)$-dimensional matrix $H(t)$ and the $(q \times n)$-dimensional matrix $Q(t)$ are continuous on $[0, \theta]$; consequently, the strategies of the first (second) player differ only by the matrices $H(t)$ ($Q(t)$), and their form
(linear in $x$) remains unchanged. The form of the vector-valued payoff function of (4.16) is as in Example 2.1, i.e., $\mathbb{J}(U, V) = (J_1(U, V), \ldots, J_N(U, V))$, where

$$J_i(U, V) = x'(\theta) C_i x(\theta) + \int_{t_0}^{\theta} \bigl\{ x'(t) G_i x(t) + u'[t] \mathcal{D}_i^{(1)} u[t] + v'[t] \mathcal{D}_i^{(2)} v[t] \bigr\}\, dt, \qquad i \in N. \tag{4.18}$$
Here the matrices $C_i$, $G_i$, $\mathcal{D}_i^{(1)}$, and $\mathcal{D}_i^{(2)}$ of appropriate dimensions are constant and symmetric; $x(t) = x(t, t_0, x_0, U, V)$, $t_0 \le t \le \theta$, is a quasimotion of the system $\Sigma_L$ with the chosen strategies $U \div H(t)x$ and $V \div Q(t)x$, defined as the solution of the system (4.17) with $u = H(t)x$, $v = Q(t)x$, and initial condition $x(t_0) = x_0$; $u[t] = H(t)x(t)$ and $v[t] = Q(t)x(t)$.

Example 2.1 showed that with $\mathcal{D}_i^{(1)} > 0$, $i \in N$ (or $\mathcal{D}_i^{(2)} < 0$, $i \in N$), in the game (4.16) for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$, there is no Slater saddle point or, consequently, Pareto, Geoffrion, or $A$-saddle point (which are properly defined for the game (4.16)). Recall that $\mathcal{D} > 0$ ($\mathcal{D} < 0$) means that the quadratic form $z'\mathcal{D}z$ is positive- (negative-) definite. Let us determine the constraints on the parameters of (4.16) for which the Geoffrion (and, consequently, Pareto and Slater) saddle point does exist and show its explicit form.

Because any situation $(U, V) \in \mathcal{U}_L \times \mathcal{V}_L$ generates a unique solution $x(t) = x(t, t_0, x_0, U, V)$, $t_0 \le t \le \theta$, of the system (4.17) with $x(t_0) = x_0$, the Geoffrion saddle point of (4.16) is defined as follows. The situation $(U^G, V^G) \in \mathcal{U}_L \times \mathcal{V}_L$ is referred to as the Geoffrion saddle point of (4.16) with initial position $(t_0, x_0)$ if

(1) $(U^G, V^G)$ is the Pareto saddle point of this game, that is,
$$\mathbb{J}(U^G, V) \ngeq \mathbb{J}(U^G, V^G) \ngeq \mathbb{J}(U, V^G)$$

for any $V \in \mathcal{V}_L$ and $U \in \mathcal{U}_L$;

(2) there is a constant $M > 0$ such that for any index $i \in N$ and strategy $V \in \mathcal{V}_L$ for which $J_i(U^G, V) > J_i(U^G, V^G)$ and some $j \in N$ such that $J_j(U^G, V) < J_j(U^G, V^G)$, the following inequality holds:

$$J_i(U^G, V) - J_i(U^G, V^G) \le M\,[J_j(U^G, V^G) - J_j(U^G, V)],$$

and for any index $i \in N$ and strategy $U \in \mathcal{U}_L$ for which $J_i(U, V^G) < J_i(U^G, V^G)$ and some $j \in N$ such that $J_j(U, V^G) > J_j(U^G, V^G)$, it is true that

$$J_i(U^G, V^G) - J_i(U, V^G) \le M\,[J_j(U, V^G) - J_j(U^G, V^G)].$$

Because the proof of Proposition 4.5 is also applicable in the case of singleton quasimotions, that is, solutions $x(t, t_0, x_0, U, V)$, $t_0 \le t \le \theta$, it is true that the saddle point of the differential game (4.19)
with initial position $(t_0, x_0)$ and any collection of positive numbers $\beta_1, \ldots, \beta_N$ is a Geoffrion saddle point of (4.16) with the same initial position. This suggests a technique for obtaining the Geoffrion saddle point of (4.16): find the saddle point of (4.19) for some collection of positive numbers $\beta_1, \ldots, \beta_N$. This saddle point can be arrived at using Proposition 4.6.

Proposition 4.7. If there exists a collection of positive numbers $\beta_1, \ldots, \beta_N$ such that the quadratic forms (4.20) are negative-definite, the Geoffrion saddle point of (4.16) has, for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, the form (4.21), where the symmetric $(n \times n)$-dimensional matrix $\Theta(t)$ is a solution of the Riccati matrix differential equation (4.22) with the boundary condition

$$\Theta(\theta) = \sum_{i \in N} \beta_i C_i. \tag{4.23}$$
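The recipe behind Proposition 4.7 can be tried out numerically in the scalar case $n = h = q = 1$. The following is only a minimal sketch under stated assumptions: the weighted integrand is taken to be $g x^2 + d_1 u^2 + d_2 v^2$ with $d_1 = \sum_i \beta_i \mathcal{D}_i^{(1)}$, $d_2 = \sum_i \beta_i \mathcal{D}_i^{(2)}$, the function $\varphi$ is sought as $\Theta(t)x^2$, and the definiteness requirement is read as $d_1 > 0$ (minimizing player) and $d_2 < 0$ (maximizing player). The scalar Riccati equation used below is the standard one obtained by eliminating $u$ and $v$ from the stationarity conditions; it is an assumption of this sketch, and all function names are hypothetical.

```python
# Scalar sketch (assumed forms, see the lead-in):
#   dynamics  dx/dt = a*x + b1*u + b2*v
#   weighted cost  c*x(theta)^2 + integral of (g*x^2 + d1*u^2 + d2*v^2)

def weighted(beta, values):
    """Return sum_i beta_i * values_i, e.g. d1 from the per-criterion D_i^(1)."""
    return sum(b * v for b, v in zip(beta, values))

def is_admissible(d1, d2):
    """Assumed reading of the definiteness requirement: the u-form must be
    positive (minimizing player) and the v-form negative (maximizing player)."""
    return d1 > 0 and d2 < 0

def solve_riccati(a, b1, b2, g, d1, d2, c, t0, theta, steps=1000):
    """Integrate dTheta/dt = -2*a*Theta - g + k*Theta**2 backward from
    Theta(theta) = c with classical RK4, where k = b1**2/d1 + b2**2/d2.
    Returns Theta(t0)."""
    k = b1 * b1 / d1 + b2 * b2 / d2

    def rhs(th):
        return -2.0 * a * th - g + k * th * th

    h = (t0 - theta) / steps  # negative step: from theta down to t0
    th = c
    for _ in range(steps):
        k1 = rhs(th)
        k2 = rhs(th + 0.5 * h * k1)
        k3 = rhs(th + 0.5 * h * k2)
        k4 = rhs(th + h * k3)
        th += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return th

def saddle_gains(theta_t, b1, b2, d1, d2):
    """Feedback gains (H, Q) for u = H*x, v = Q*x at the instant where the
    Riccati solution equals theta_t (stationarity of W in u and in v)."""
    return -b1 * theta_t / d1, -b2 * theta_t / d2

# Two criteria with equal weights; the numbers are illustrative only.
beta = (0.5, 0.5)
d1 = weighted(beta, (3.0, -1.0))   # weights on u^2 in J_1, J_2
d2 = weighted(beta, (1.0, -3.0))   # weights on v^2 in J_1, J_2
c = weighted(beta, (1.0, 1.0))     # terminal weights
theta0 = solve_riccati(a=0.0, b1=1.0, b2=0.5, g=0.0,
                       d1=d1, d2=d2, c=c, t0=0.0, theta=1.0)
H0, Q0 = saddle_gains(theta0, 1.0, 0.5, d1, d2)
```

With $a = g = 0$ the assumed Riccati equation has the closed-form solution $\Theta(t) = 1/(1/c + k(\theta - t))$, which gives a simple check on the integrator.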
Proof. Proposition 4.6 cannot be used directly in the proof of Proposition 4.7, since the strategies $U \div u(t, x)$ and $V \div v(t, x)$, more specifically, the functions $u(t, x)$ and $v(t, x)$, may assume unbounded values in the game (4.16) (since $H$ and $Q$ are not required to be compact) and the components of the vector-valued function $\mathbb{J}(U, V)$ do not have the terminal form $F_i(x[\theta])$. Following the technique of Proposition 4.6, however, let us consider the function (where the arguments $t$ and $x$ are omitted for brevity)

$$W(t, x, u, v) = \frac{\partial \varphi}{\partial t} + \left[\frac{\partial \varphi}{\partial x}\right]' (Ax + B_1 u + B_2 v) + x' G_\beta x + u' \mathcal{D}_\beta^{(1)} u + v' \mathcal{D}_\beta^{(2)} v, \tag{4.24}$$

where

$$G_\beta = \sum_{i \in N} \beta_i G_i, \qquad \mathcal{D}_\beta^{(1)} = \sum_{i \in N} \beta_i \mathcal{D}_i^{(1)}, \qquad \mathcal{D}_\beta^{(2)} = \sum_{i \in N} \beta_i \mathcal{D}_i^{(2)},$$

and $\partial \varphi / \partial x$ is a column vector whose components are the partial derivatives of the function $\varphi(t, x)$ with respect to the coordinates of the vector $x$. By (4.20) and the premise of Proposition 4.7, the conditions sufficient for

$$\max_{v} W(t, x, u^G, v) = W(t, x, u^G, v^G), \qquad \min_{u} W(t, x, u, v^G) = W(t, x, u^G, v^G) \tag{4.25}$$
reduce to the following requirements:
Hence
(4.26)
Then the function $\varphi(t, x)$ is sought as a quadratic form

$$\varphi(t, x) = x' \Theta(t) x, \qquad \Theta'(t) = \Theta(t). \tag{4.27}$$

Substituting (4.27), (4.26), $u = u^G$, and $v = v^G$ in (4.24) and equating to zero the coefficients of the resultant quadratic form in the components of the vector $x$ gives us (4.22). The boundary condition (4.23) follows from the identity

$$\varphi(\theta, x) = x' \Theta(\theta) x = x' \Bigl( \sum_{i \in N} \beta_i C_i \Bigr) x. \tag{4.28}$$

Substituting the solution of the system (4.22), (4.23) in (4.27) and (4.26) yields (4.21). Then, substituting (4.21) and (4.27) in (4.24), we have

$$W(t, x, u^G(t, x), v^G(t, x)) = 0, \qquad \forall (t, x) \in [0, \theta] \times \mathbb{R}^n. \tag{4.29}$$
All the reasoning has been formal thus far, since Proposition 4.6 cannot be used. Therefore, let us check directly that the situation $(U^G, V^G)$ from (4.21) is the saddle point of (4.19) for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$. Let $(t_0, x_0)$ be an arbitrary position in $[0, \theta) \times \mathbb{R}^n$ and let $x^G(t)$, $t_0 \le t \le \theta$, be the solution of the system (4.17) with $u = u^G(t, x)$ and $v = v^G(t, x)$ from (4.21) and $x(t_0) = x_0$. Substituting $x = x^G(t)$ in (4.29), we have

$$W(t, x^G(t), u^G(t, x^G(t)), v^G(t, x^G(t))) = 0, \qquad \forall t \in [t_0, \theta].$$

Integrating both sides from $t_0$ to $\theta$ and using the form of $W$ in (4.24) and of $J_i$ in (4.18), we have

$$\sum_{i \in N} \beta_i J_i(U^G, V^G) = \varphi(t_0, x_0). \tag{4.30}$$

By the first equality in (4.25) and (4.29),

$$W(t, x, u^G(t, x), v) \le 0, \qquad \forall (t, x, v) \in [0, \theta] \times \mathbb{R}^n \times \mathbb{R}^q.$$

Let $V \div v(t, x)$ be an arbitrary strategy from the set $\mathcal{V}_L$ and let $\tilde{x}(t)$, $t_0 \le t \le \theta$, denote the solution of (4.17) with $u = u^G(t, x)$, $v = v(t, x)$, and $x(t_0) = x_0$. Substituting $\tilde{x}(t)$ in the preceding inequality, we have

$$W(t, \tilde{x}(t), u^G(t, \tilde{x}(t)), v(t, \tilde{x}(t))) \le 0, \qquad \forall t \in [t_0, \theta].$$

Again, integrating both sides from $t_0$ to $\theta$ we have, in light of (4.24) and (4.18), the inequality

$$\sum_{i \in N} \beta_i J_i(U^G, V) \le \varphi(t_0, x_0).$$

Combining it with (4.30) yields

$$\sum_{i \in N} \beta_i J_i(U^G, V) \le \sum_{i \in N} \beta_i J_i(U^G, V^G), \qquad \forall V \in \mathcal{V}_L.$$
The second part of the inequality determining the saddle point $(U^G, V^G)$ of (4.19) is obtained in a similar way using the second equality in (4.25). This proves the proposition.

Consequently, the Geoffrion saddle point of the game (4.16) is obtained by the following algorithm:

(1) Find constants $\beta_i > 0$, $i \in N$, for which the quadratic forms (4.20) are negative-definite;
(2) Find a solution $\Theta(t)$ of the system of differential equations (4.22) with boundary condition (4.23);
(3) Write out the explicit form of the Geoffrion saddle point by (4.21).

Remark 4.2.
Let us consider a particular case of (4.16) where the payoff
function has two components, $\mathbb{J}(U, V) = (J_1(U, V), J_2(U, V))$. We assume that $\mathcal{D}_1^{(1)} > 0$, $\mathcal{D}_1^{(2)} > 0$, and $\mathcal{D}_2^{(1)} < 0$, $\mathcal{D}_2^{(2)} < 0$. In this case neither of the differential games

$$\langle \{1, 2\},\ \Sigma \div (4.17),\ \{\mathcal{U}_L, \mathcal{V}_L\},\ J_i(U, V) \rangle, \qquad i = 1, 2,$$

has a saddle point for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$, $\|x_0\| \ne 0$ (Proposition 4.5, Chapter 1). Let us show, however, that for certain relationships between the characteristic numbers of the matrices $\mathcal{D}_i^{(1)}$ and $\mathcal{D}_i^{(2)}$, $i = 1, 2$, (4.20) may be made to hold, that is, the quadratic forms

$$-u'\bigl[\beta \mathcal{D}_1^{(1)} + (1 - \beta)\mathcal{D}_2^{(1)}\bigr]u, \qquad v'\bigl[\beta \mathcal{D}_1^{(2)} + (1 - \beta)\mathcal{D}_2^{(2)}\bigr]v \tag{4.31}$$

are negative-definite for certain values $\beta \in (0, 1)$. Following [15, p. 95], for sign-definite quadratic forms and every $u \in \mathbb{R}^h$ and $v \in \mathbb{R}^q$,

$$\begin{aligned}
\lambda_1 u'u \le u'\mathcal{D}_1^{(1)}u \le \Lambda_1 u'u, &\qquad -\Lambda_2 u'u \le u'\mathcal{D}_2^{(1)}u \le -\lambda_2 u'u, \\
\delta_1 v'v \le v'\mathcal{D}_1^{(2)}v \le \Delta_1 v'v, &\qquad -\Delta_2 v'v \le v'\mathcal{D}_2^{(2)}v \le -\delta_2 v'v,
\end{aligned} \tag{4.32}$$

where, e.g., for the matrix $\mathcal{D}_1^{(1)}$ the positive numbers $\lambda_1$ and $\Lambda_1$ denote the least and the largest root, respectively, of the characteristic equation $\det[\mathcal{D}_1^{(1)} - \lambda E_h] = 0$, $E_h$ being the $(h \times h)$-dimensional identity matrix. In a number of cases a number $\beta \in (0, 1)$ may be found for which the quadratic forms (4.31) are negative-definite; with this $\beta$ the forms of (4.20) for $N = 2$ are negative-definite, as Proposition 4.7 requires. Let us assume that
and
PI -.
where
=-
4. Sufficient Conditions
227
It is obvious that
-A2 < p < - , 62 21
A1
Then, in light of (4.32),
or, with the above $\beta_1$ and $\beta_2$, the quadratic forms (4.31) are negative-definite. Here Proposition 4.7 would be useful in obtaining vector-valued saddle points.

4.6. A-saddle Points
In obtaining $A$-saddle points of the game (1.7) with initial position $(t_0, x_0)$ we should not overlook the fact that, by Definition 1.4, every $A$-saddle point is a Slater saddle point of the game

$$\langle \{1, 2\},\ \Sigma \div (1.8),\ \{\mathcal{U}, \mathcal{V}\},\ A\mathbb{F}(x[\theta]) \rangle \tag{4.33}$$

with the same initial position. Specifically, the situation $(U^A, V^A) \in \mathcal{U} \times \mathcal{V}$ is the $A$-saddle point of (1.7) if and only if

$$A\mathbb{F}(x[\theta, t_0, x_0, U^A]) \not> A\mathbb{F}(x[\theta, t_0, x_0, U^A, V^A]) \not> A\mathbb{F}(x[\theta, t_0, x_0, V^A])$$

for any quasimotions $x[\cdot, t_0, x_0, U^A]$, $x[\cdot, t_0, x_0, U^A, V^A]$, $x[\cdot, t_0, x_0, V^A]$. Therefore, the sufficient conditions for the Slater saddle point (subsection 4.2), formulated now for the game (4.33), are those for the $A$-saddle point of (1.7). Note also that the scalar function $\psi(A\mathbb{F})$ is increasing over $>$ on $\mathbb{R}^N$ if

$$A\mathbb{F}^{(1)} > A\mathbb{F}^{(2)} \ \Rightarrow\ \psi(A\mathbb{F}^{(1)}) > \psi(A\mathbb{F}^{(2)}),$$
and in the functions $\psi_j$ ($j = 1, \ldots, 9$) of subsection 4.1, $F_i$ must be replaced by $\sum_{j \in N} a_{ij} F_j$, where $a_{ij}$ are the elements of the constant matrix $A$. Finally, if these elements $a_{ij} > 0$, $i, j \in N$, then, by Property 1.1, any $A$-saddle point of (1.7) is the Geoffrion saddle point of that game. Then, for instance, Proposition 4.1 for the $A$-saddle point takes the form of the next assertion.

Proposition 4.8. If the scalar function $\psi(\mathbb{F})$ is increasing over $>$ on $\mathbb{R}^N$, the saddle point $(U^A, V^A)$ of the differential game with initial position $(t_0, x_0)$ is the $A$-saddle point for (1.7) with the same initial position. If, however, the elements of the $(N \times N)$-dimensional matrix $A$ are positive, the $A$-saddle point $(U^A, V^A)$ is the Geoffrion saddle point.

In the case of an $A$-saddle point Theorem 4.1 takes the following form: If there is a situation $(U^A, V^A) \in \mathcal{U} \times \mathcal{V}$, a continuous function $\varphi(\cdot)\colon [0, \theta] \times \mathbb{R}^n \to \mathbb{R}^1$, and a function $\psi(F_1, \ldots, F_N)$ increasing over $>$ on $\mathbb{R}^N$ such that

(1) for every $x \in \mathbb{R}^n$,

(2) (4.6) and (4.7) are satisfied, where $(U^S, V^S)$ is replaced by $(U^A, V^A)$,

then the situation $(U^A, V^A)$ is the $A$-saddle point of (1.7) for any choice of initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$.
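The reduction used in this subsection, namely that an $A$-saddle point is a Slater saddle point of the game with payoff $A\mathbb{F}$, can be illustrated on a finite static game. The sketch below is an assumption-laden toy, not the differential game itself: each player has two pure choices, `F[u][v]` is a hypothetical payoff vector, the first index belongs to the minimizing player, and all names are invented for illustration.

```python
def strictly_greater(p, q):
    """True if every component of p exceeds the matching component of q."""
    return all(a > b for a, b in zip(p, q))

def apply_matrix(A, f):
    """Return the vector A f for a matrix A given as a list of rows."""
    return tuple(sum(a * x for a, x in zip(row, f)) for row in A)

def is_slater_saddle(F, i, j):
    """(i, j) is a Slater saddle if the maximizer (columns) cannot raise all
    components and the minimizer (rows) cannot lower all components."""
    base = F[i][j]
    max_gains = any(strictly_greater(F[i][v], base) for v in range(len(F[i])))
    min_gains = any(strictly_greater(base, F[u][j]) for u in range(len(F)))
    return not max_gains and not min_gains

def is_a_saddle(F, A, i, j):
    """A-saddle test: Slater saddle test applied to the transformed payoff A F."""
    AF = [[apply_matrix(A, f) for f in row] for row in F]
    return is_slater_saddle(AF, i, j)

# Hypothetical 2 x 2 game with two criteria.
F = [[(0, 0), (1, -1)],
     [(-1, 1), (0, 0)]]

A_ones = [[1, 1], [1, 1]]
A_skew = [[2, 1], [3, 1]]
```

In this toy game the pair `(0, 0)` is a Slater saddle of `F` and an `A_ones`-saddle, but not an `A_skew`-saddle, which shows that the choice of the positive matrix $A$ changes the set of saddle points.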
4.7. Specific Features of Vector-Valued Saddle Points
This subsection concentrates on the specific features of vector-valued saddle points that stem from the absence of equivalence and interchangeability, properties that were established for a differential game with scalar payoff function and formulated as Property 3.7 (Chapter 1). For that game, equivalence means that the values of the payoff function coincide at different saddle points. For a differential game with vector-valued payoff function this is not so.
Example 4.2. Assume that in the differential game (1.7) the system $\Sigma$ is described by the equations

$$\dot{x} = u, \quad \dot{y} = v, \quad x[0] = y[0] = x_0 = y_0 = 0_2 \in \mathbb{R}^2, \quad 0 = t_0 \le t \le \theta = 1, \tag{4.34}$$

where $x = (x_1, x_2)$ and $y = (y_1, y_2)$ are two-dimensional vectors; the sets (compacta)

$$H = \{\, u = (u_1, u_2) \mid |u_j| \le 3,\ j = 1, 2 \,\}, \qquad Q = \{\, v = (v_1, v_2) \mid v_1^2 + v_2^2 \le 9 \,\}$$

are shown in Fig. 5.4.2. The set of strategies of the "minimizing" player is

$$\mathcal{U} = \{\, U \div u(t, x, y) \mid u(t, x, y) \in H \,\}$$

and that of the "maximizing" player is

$$\mathcal{V} = \{\, V \div v(t, x, y) \mid v(t, x, y) \in Q \,\}.$$

The quasimotions $x[\cdot, t_0, x_0, U]$, $y[\cdot, t_0, y_0, V]$ of the system (4.34) generated by the situation $(U, V) \in \mathcal{U} \times \mathcal{V}$ from position $(t_0, x_0, y_0)$ are defined as in subsection 2.3 in Chapter 1. Finally, the two-component payoff function is

$$\mathbb{F}(x[\theta], y[\theta]) = (F_1(x[\theta], y[\theta]), F_2(x[\theta], y[\theta])) = x[\theta] + y[\theta] = (x_1[\theta] + y_1[\theta],\ x_2[\theta] + y_2[\theta]). \tag{4.35}$$

Figure 5.4.2.
This differential game will be denoted henceforth as $\Gamma^{(1)}$. In addition to $\Gamma^{(1)}$ we consider two auxiliary two-criteria problems:

$$\Gamma_x = \langle \Sigma_x, \mathcal{U}^0, x[\theta] \rangle, \qquad \Gamma_y = \langle \Sigma_y, \mathcal{V}^0, y[\theta] \rangle.$$

Here the system $\Sigma_x$ ($\Sigma_y$) is described by the equation

$$\dot{x} = u, \quad x[0] = x_0 = 0, \quad 0 \le t \le 1 \qquad (\dot{y} = v, \quad y[0] = y_0 = 0, \quad 0 \le t \le 1).$$

The set of strategies is

$$\mathcal{U}^0 = \{\, U \div u(t, x) \mid u(t, x) \in H \,\} \qquad (\mathcal{V}^0 = \{\, V \div v(t, y) \mid v(t, y) \in Q \,\}).$$

In $\Gamma_x$ it is required to choose a strategy $U \in \mathcal{U}^0$ such that by time $t = \theta = 1$ the least possible values of both components $(x_1[\theta], x_2[\theta]) = x[\theta]$ are obtained; in $\Gamma_y$ the appropriate strategy $V \in \mathcal{V}^0$ is to lead to the largest possible values $(y_1[\theta], y_2[\theta]) = y[\theta]$. The reachability domains $X[\theta, t_0, x_0, U \div H]$ and $Y[\theta, t_0, y_0, V \div Q]$ of the systems $\Sigma_x$ and $\Sigma_y$, respectively, are shown in Fig. 5.4.3. Because the criteria coincide with the values $x[\theta, t_0, x_0, U]$ and $y[\theta, t_0, y_0, V]$ of the state coordinates at time $t = \theta$, the sets of values of the criteria in the criteria space also have that form.

The set $X_S$ of Slater-minimal values $x^S = x[\theta]$ in $\Gamma_x$ and the set of associated solutions $x^S$ of the static problem $\langle X[\theta, t_0, x_0, U \div H], x \rangle$ are denoted in Fig. 5.4.3a by a heavy line. These two sets also coincide (the left and lower sides of the square) because $\mathbb{F}(x) = x$. Let $\mathcal{U}^S$ be the subset of strategies $U^S \in \mathcal{U}^0$ such that

$$x[\theta, t_0, x_0, U^S] \in X_S$$

for every quasimotion $x[\cdot, t_0, x_0, U^S]$. The set $\mathcal{U}^S \ne \emptyset$, because it includes singleton strategies $U^S$ such that $x[\theta, t_0, x_0, U^S]$ is a point of $X_S$. Then for every point $x^S \in X_S$ there exists, by Corollary 4.1 in Chapter 1, a strategy $U^S \in \mathcal{U}^0$ for which $x^S = x[\theta, t_0, x_0, U^S]$ for every quasimotion $x[\cdot, t_0, x_0, U^S]$. All these strategies are contained in the set $\mathcal{U}^S$. In a similar way, for $\Gamma_y$, we obtain a set $Y_S$ of Slater-maximal values of
$y^S = y[\theta]$ and the set of associated solutions $y^S$ in the static two-criteria problem $\langle Y[\theta, t_0, y_0, V \div Q], y \rangle$. These sets also coincide (a quarter of a circle of radius 3 in the first quadrant) and are shown in Fig. 5.4.3b, also as a heavy line.

Figure 5.4.3.

We then obtain the subset $\mathcal{V}^S \subset \mathcal{V}^0$ of all strategies $V^S \in \mathcal{V}^0$ such that

$$y[\theta, t_0, y_0, V^S] \in Y_S, \qquad \forall y[\cdot, t_0, y_0, V^S].$$

Here again $\mathcal{V}^S \ne \emptyset$ and contains singleton strategies $V^S$:

$$\forall y^S \in Y_S\ \ \exists V^S \in \mathcal{V}^0\colon \quad y^S = y[\theta, t_0, y_0, V^S], \qquad \forall y[\cdot, t_0, y_0, V^S].$$

By the construction of the sets $X_S$ and $Y_S$,

$$x \not< x^S, \quad \forall x \in X[\theta, t_0, x_0, U \div H],\ x^S \in X_S; \qquad y^S \not< y, \quad \forall y \in Y[\theta, t_0, y_0, V \div Q],\ y^S \in Y_S.$$

Therefore, any pair $(x^S, y^S) \in X_S \times Y_S$ is a Slater saddle point of the static game

$$\langle X[\theta, t_0, x_0, U \div H],\ Y[\theta, t_0, y_0, V \div Q],\ x + y \rangle.$$

Then the situations $(U^S, V^S) \in \mathcal{U}^S \times \mathcal{V}^S$, and only these situations, are the Slater saddle points of the differential game $\Gamma^{(1)}$. By virtue of (4.35) and the construction of the sets $X_S$ and $Y_S$, the set of values of the vector-valued payoff function $\mathbb{F}(x[\theta], y[\theta]) = x[\theta] + y[\theta]$ on the set of all Slater saddle points $\mathcal{U}^S \times \mathcal{V}^S$ is $X_S + Y_S$, because

$$X[\theta, t_0, x_0, \mathcal{U}^S] = \bigcup_{U \in \mathcal{U}^S} X[\theta, t_0, x_0, U], \qquad Y[\theta, t_0, y_0, \mathcal{V}^S] = \bigcup_{V \in \mathcal{V}^S} Y[\theta, t_0, y_0, V].$$

Let us find the algebraic sum of the sets $X_S + Y_S$; for this purpose to every point $x^S \in X_S$ we add $Y_S$. In Fig. 5.4.4 the set $X_S + Y_S$ is shaded. Now consider two Slater saddle points

$$(U^{(1)}, V^{(1)}) \div (u^{(1)}(t, x), v^{(1)}(t, y)) = ((+3, -3), (0, +3)),$$
$$(U^{(2)}, V^{(2)}) \div (u^{(2)}(t, x), v^{(2)}(t, y)) = ((-3, -3), (+3, 0)).$$

Figure 5.4.4.

Then

$$\mathbb{F}(x[\theta, t_0, x_0, U^{(1)}], y[\theta, t_0, y_0, V^{(1)}]) = (3 + 0,\ -3 + 3) = (3, 0).$$

This value of $\mathbb{F}$ is associated with point $A$ in Fig. 5.4.4. In a similar way

$$\mathbb{F}(x[\theta, t_0, x_0, U^{(2)}], y[\theta, t_0, y_0, V^{(2)}]) = (0, -3);$$

the associated point $B$ is shown in Fig. 5.4.4.
Comparing these two values of $\mathbb{F}$ at the Slater saddle points $(U^{(1)}, V^{(1)})$ and $(U^{(2)}, V^{(2)})$, we have

$$\mathbb{F}(x[\theta, t_0, x_0, U^{(1)}], y[\theta, t_0, y_0, V^{(1)}]) > \mathbb{F}(x[\theta, t_0, x_0, U^{(2)}], y[\theta, t_0, y_0, V^{(2)}])$$

or

$$F_i(x[\theta, t_0, x_0, U^{(1)}], y[\theta, t_0, y_0, V^{(1)}]) > F_i(x[\theta, t_0, x_0, U^{(2)}], y[\theta, t_0, y_0, V^{(2)}]), \qquad i = 1, 2.$$

By these inequalities, the values of all the components of the payoff function $\mathbb{F}(x[\theta], y[\theta])$ at one Slater saddle point, $(U^{(1)}, V^{(1)})$, are strictly larger than those at the other, $(U^{(2)}, V^{(2)})$. In terms of the theory of antagonistic games this implies nonequivalence, since if the saddle points were equivalent such values would coincide. From the standpoint of the theory of multicriteria problems this property signifies an internal instability of the set of Slater saddle points. Because of this result, the Slater saddle point is less attractive as a solution of (1.7). Indeed, the second (maximizing) player would rather have the former of the two situations $(U^{(1)}, V^{(1)})$ and $(U^{(2)}, V^{(2)})$, because in this situation the values of both criteria $F_i(x[\theta], y[\theta])$, $i = 1, 2$, are larger. The first (minimizing) player would rather have $(U^{(2)}, V^{(2)})$, since the values of both criteria in this situation are smaller. The players cannot agree on a choice of a "mutually acceptable" situation, since the game is antagonistic. Internal instability (non-equivalence) is the main reason for introducing the vector-valued guarantees (vector-valued maximin and minimax) in the next chapter.
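The two saddle points of Example 4.2 can be spot-checked numerically. The following is a rough sketch rather than a proof: the reachable square and disk at $\theta = 1$ are replaced by a coarse grid (step 0.5, an arbitrary choice), and the function names are invented for this illustration.

```python
def strictly_less(p, q):
    """True if every component of p is below the matching component of q."""
    return all(a < b for a, b in zip(p, q))

# Grid samples of the reachable sets at theta = 1:
# the square |x_j| <= 3 and the disk y_1^2 + y_2^2 <= 9.
GRID = [k * 0.5 for k in range(-6, 7)]
X = [(a, b) for a in GRID for b in GRID]
Y = [(a, b) for a in GRID for b in GRID if a * a + b * b <= 9.0]

def payoff(x, y):
    """Two-component payoff x + y, as in (4.35)."""
    return (x[0] + y[0], x[1] + y[1])

def is_slater_saddle(xs, ys):
    """Sampled test: no y in Y raises both components of the payoff,
    and no x in X lowers both components."""
    base = payoff(xs, ys)
    max_gains = any(strictly_less(base, payoff(xs, y)) for y in Y)
    min_gains = any(strictly_less(payoff(x, ys), base) for x in X)
    return not max_gains and not min_gains

s1 = ((3.0, -3.0), (0.0, 3.0))    # right ends produced by (U1, V1)
s2 = ((-3.0, -3.0), (3.0, 0.0))   # right ends produced by (U2, V2)
```

On this grid both pairs pass the saddle test while their payoffs are ordered componentwise, which is the non-equivalence described above; an interior pair such as `((0.0, 0.0), (0.0, 3.0))` fails the test.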
Example 4.3. This example shows that the vector-valued saddle points of a differential game are not interchangeable, unlike those of a zero-sum differential game with scalar payoff function (Property 3.7, Chapter 1). In the differential game

$$\dot{x} = u, \quad \dot{y} = v, \quad x[0] = y[0] = x_0 = y_0 = 0, \quad 0 = t_0 \le t \le \theta = 1;$$
$$\mathcal{U} = \{\, U \div u(t, x, y) \mid u(t, x, y) \in [0, 1] \,\}, \qquad \mathcal{V} = \{\, V \div v(t, x, y) \mid v(t, x, y) \in [-1, 0] \,\},$$

the vector-valued payoff function is

$$\mathbb{F}(x[\theta], y[\theta]) = (F_1(x[\theta], y[\theta]), F_2(x[\theta], y[\theta])) = (x[1] + y[1],\ x[1] \cdot y[1]).$$
Let us show that the situations $(U^{(1)}, V^{(1)}) \div (1, 0)$ and $(U^{(2)}, V^{(2)}) \div (0, -1)$ are Slater saddle points. Indeed,

$$x[\theta, t_0, x_0, U^{(1)}] = 1, \quad y[\theta, t_0, y_0, V^{(1)}] = 0, \qquad x[\theta, t_0, x_0, U^{(2)}] = 0, \quad y[\theta, t_0, y_0, V^{(2)}] = -1.$$

The reachability domains are

$$X[\theta, t_0, x_0, U \div H] = [0, 1], \qquad Y[\theta, t_0, y_0, V \div Q] = [-1, 0].$$

Then for any $x \in X[\theta, t_0, x_0, U \div H] = [0, 1]$ and $y \in Y[\theta, t_0, y_0, V \div Q] = [-1, 0]$, for the situation $(U^{(1)}, V^{(1)})$,

$$1 + y \not> 1 + 0 \geqslant x + 0, \qquad 1 \cdot y \not> 1 \cdot 0 \not> 0 \cdot x. \tag{4.36}$$

Here $x[\theta, t_0, x_0, U^{(1)}] = 1$ and $y[\theta, t_0, y_0, V^{(1)}] = 0$; the sign $\not>$ means that the strict inequality ($>$) cannot hold for any $x \in [0, 1]$ or $y \in [-1, 0]$, and the sign $\geqslant$ means that for some $x \in [0, 1]$ and $y \in [-1, 0]$ the strict inequality ($>$) holds. In effect, by virtue of (4.36) the situation $(U^{(1)}, V^{(1)})$ is a Slater saddle point. For every $x \in X[\theta, t_0, x_0, U \div H]$ and $y \in Y[\theta, t_0, y_0, V \div Q]$, for the situation $(U^{(2)}, V^{(2)})$ and the associated right ends of the quasimotions $x[\theta, t_0, x_0, U^{(2)}] = 0$ and $y[\theta, t_0, y_0, V^{(2)}] = -1$,

$$0 + y \geqslant 0 - 1 \not> x - 1, \qquad 0 \cdot y \not> 0 \cdot (-1) \geqslant x \cdot (-1).$$
These relations show that the situation $(U^{(2)}, V^{(2)})$ is also a Slater saddle point. If interchangeability were a property of the game, the situations $(U^{(1)}, V^{(2)})$ and $(U^{(2)}, V^{(1)})$ would also be Slater saddle points. This is not so. Let us show that, for instance, $(U^{(1)}, V^{(2)})$ is not a Slater saddle point, that is, that there exists either a strategy $\hat{V} \in \mathcal{V}$ such that

$$\begin{aligned}
x[\theta, t_0, x_0, U^{(1)}] + y[\theta, t_0, y_0, \hat{V}] &> x[\theta, t_0, x_0, U^{(1)}] + y[\theta, t_0, y_0, V^{(2)}], \\
x[\theta, t_0, x_0, U^{(1)}] \cdot y[\theta, t_0, y_0, \hat{V}] &> x[\theta, t_0, x_0, U^{(1)}] \cdot y[\theta, t_0, y_0, V^{(2)}]
\end{aligned} \tag{4.37}$$

for at least one 3-tuple of quasimotions $(x[\cdot, t_0, x_0, U^{(1)}],\ y[\cdot, t_0, y_0, \hat{V}],\ y[\cdot, t_0, y_0, V^{(2)}])$, or there exists a strategy $\hat{U} \in \mathcal{U}$ such that

$$\begin{aligned}
x[\theta, t_0, x_0, U^{(1)}] + y[\theta, t_0, y_0, V^{(2)}] &> x[\theta, t_0, x_0, \hat{U}] + y[\theta, t_0, y_0, V^{(2)}], \\
x[\theta, t_0, x_0, U^{(1)}] \cdot y[\theta, t_0, y_0, V^{(2)}] &> x[\theta, t_0, x_0, \hat{U}] \cdot y[\theta, t_0, y_0, V^{(2)}]
\end{aligned}$$

for at least one 3-tuple of quasimotions $(x[\cdot, t_0, x_0, U^{(1)}],\ x[\cdot, t_0, x_0, \hat{U}],\ y[\cdot, t_0, y_0, V^{(2)}])$, or there exists a situation $(\hat{U}, \hat{V})$ in which both systems of inequalities hold at the same time. Let us see that with $\hat{V} \div (-\tfrac{1}{2})$, inequality (4.37) holds. Indeed, here

$$x[\theta, t_0, x_0, U^{(1)}] = 1, \qquad y[\theta, t_0, y_0, V^{(2)}] = -1, \qquad y[\theta, t_0, y_0, \hat{V}] = -\tfrac{1}{2},$$

and so

$$x[\theta, t_0, x_0, U^{(1)}] + y[\theta, t_0, y_0, \hat{V}] = 1 - \tfrac{1}{2} = \tfrac{1}{2} > 0 = 1 - 1 = x[\theta, t_0, x_0, U^{(1)}] + y[\theta, t_0, y_0, V^{(2)}],$$
$$x[\theta, t_0, x_0, U^{(1)}] \cdot y[\theta, t_0, y_0, \hat{V}] = 1 \cdot (-0.5) = -0.5 > -1 = 1 \cdot (-1) = x[\theta, t_0, x_0, U^{(1)}] \cdot y[\theta, t_0, y_0, V^{(2)}].$$
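The computations of Example 4.3 are easy to replay. The sketch below samples the reachable intervals $[0, 1]$ and $[-1, 0]$ on a grid of step 0.1 (an arbitrary choice) and tests the Slater saddle condition for the payoff $(x + y,\ x \cdot y)$; the helper names are hypothetical.

```python
def payoff(x, y):
    """Payoff of Example 4.3 at the right ends x = x[1], y = y[1]."""
    return (x + y, x * y)

def strictly_greater(p, q):
    """True if every component of p exceeds the matching component of q."""
    return all(a > b for a, b in zip(p, q))

XS = [k / 10 for k in range(0, 11)]    # samples of [0, 1]
YS = [-k / 10 for k in range(0, 11)]   # samples of [-1, 0]

def is_slater_saddle(x0, y0):
    """Sampled test: the maximizer (y) cannot raise both components,
    and the minimizer (x) cannot lower both components."""
    base = payoff(x0, y0)
    max_gains = any(strictly_greater(payoff(x0, y), base) for y in YS)
    min_gains = any(strictly_greater(base, payoff(x, y0)) for x in XS)
    return not max_gains and not min_gains

# Right ends generated by (U1, V1), (U2, V2) and the mixed pair (U1, V2).
first, second, mixed = (1.0, 0.0), (0.0, -1.0), (1.0, -1.0)
```

The pairs `first` and `second` pass the test, while `mixed` fails it: the deviation $y = -0.5$ raises both components from $(0, -1)$ to $(0.5, -0.5)$, exactly as in the computation above.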
Non-interchangeability is another negative aspect of the vector-valued saddle point as a solution of a differential game. Indeed, because there is no internal stability, the maximizing player prefers the saddle point where all the components of the vector-valued payoff are larger and the minimizing player the saddle point where they are smaller. When every player chooses a strategy of his own from among the "desirable" saddle points, these strategies do not necessarily lead to a vector-valued saddle point, because of non-interchangeability. In this case the decision-making players generally ignore the vector-valued saddle points. Therefore, it is highly doubtful that solutions of a differential game will be confined to vector-valued saddle points.

In effect, there are two reasons why a new notion should be sought for a solution of a differential game: the "internal instability" (non-equivalence) of the set of vector-valued saddle points and their non-interchangeability. There is still another reason, which we shall discuss in the next chapter. Each player who employs his strategy from a saddle point assures for himself a value of the vector-valued payoff function no matter how the other player behaves. This is the positive sense of a vector-valued saddle point (cf. Remark 1.3). But because of internal instability this is not the strongest possible guarantee. The need to find the strongest possible guarantee is the third reason why the vector-valued maximin and minimax must be found.
Chapter 6

Vector-Valued Guarantees

We shall now define, for differential positional games with vector-valued payoff function, the vector-valued maximin and minimax, and establish their existence and their properties.

1. Vector-Valued Maximin and Minimax

1.1. Formalization of Vector-Valued Maximin
As in the preceding chapter, we consider a zero-sum positional differential game with vector-valued payoff function

$$\langle \{1, 2\},\ \Sigma,\ \{\mathcal{U}, \mathcal{V}\},\ \mathbb{F}(x[\theta]) \rangle \tag{1.1}$$

and fixed initial position $(t_0, x_0) \in [0, \theta) \times \mathbb{R}^n$. Conditions 1.1 (Chapter 1) and 1.1 (Chapter 5) are assumed to hold. To define the Slater-maximin strategy of (1.1), $X_{(S)}[V]$ will denote the set of Slater-minimal alternatives (solutions) $x_S[V] = x_S[\theta, t_0, x_0, V]$ of the multicriteria static problem

$$\Gamma(V) = \langle X[\theta, t_0, x_0, V],\ \mathbb{F}(x) \rangle, \tag{1.2}$$

that is, for a fixed strategy $V \in \mathcal{V}$ and every $x_S[V] = x_S[\theta, t_0, x_0, V] \in X_{(S)}[V]$, it is true that

$$\mathbb{F}(x_S[V]) \not> \mathbb{F}(x[\theta, t_0, x_0, V]), \qquad \forall x[\cdot, t_0, x_0, V] \in \mathcal{X}[t_0, x_0, V].$$

Recall that in the problem (1.2) the compactum $X[\theta, t_0, x_0, V]$ is formed by all the right (at $t = \theta$) ends of the quasimotions $x[\cdot, t_0, x_0, V]$ of the system (1.8) from Chapter 5 generated by a fixed strategy $V \in \mathcal{V}$ from position $(t_0, x_0)$. The set $X^{(S)}[U]$ of Slater-maximal solutions $x^S[U] = x[\theta, t_0, x_0, U]$ of the static multicriteria problem

$$\Gamma(U) = \langle X[\theta, t_0, x_0, U],\ \mathbb{F}(x) \rangle \tag{1.3}$$

will also be used, that is, for every $x^S[U] \in X^{(S)}[U]$ and $x[\theta, t_0, x_0, U]$,

$$\mathbb{F}(x^S[U]) \not< \mathbb{F}(x[\theta, t_0, x_0, U]).$$

Unlike (1.2), in problem (1.3) the strategy $U \in \mathcal{U}$ of the first (minimizing) player is fixed and the set

$$X[\theta, t_0, x_0, U] = \bigcup_{x[\cdot] \in \mathcal{X}[t_0, x_0, U]} x[\theta, t_0, x_0, U],$$

where the quasimotions $x[\cdot, t_0, x_0, U]$ of the system (1.8) in Chapter 5 are generated by the strategy $U$ from position $(t_0, x_0)$. We set

$$\mathbb{F}(X[H, Q]) = \bigcup_{x \in X[H, Q]} \mathbb{F}(x),$$

where

$$X[H, Q] = \bigcup_{(U, V) \in \mathcal{U} \times \mathcal{V}} X[\theta, t_0, x_0, U, V]$$

is the reachability domain at time $t = \theta$ of the system (1.8) from Chapter 5 from position $(t_0, x_0)$; this domain is formed by all the right ends of quasimotions $x[\cdot, t_0, x_0, U, V]$ as the situation $(U, V)$ scans the entire set $\mathcal{U} \times \mathcal{V}$;
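$$\operatorname{Fr}_{(S)} \mathbb{F}(X[\theta, t_0, x_0, V]) = \mathbb{F}(X_{(S)}[V]) = \bigcup_{x \in X_{(S)}[V]} \mathbb{F}(x) = \min^{S}_{x[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, V]) = \mathbb{F}(x_S[V]),$$

$$\operatorname{Fr}^{(S)} \mathbb{F}(X[\theta, t_0, x_0, U]) = \mathbb{F}(X^{(S)}[U]) = \bigcup_{x \in X^{(S)}[U]} \mathbb{F}(x) = \max^{S}_{x[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, U]) = \mathbb{F}(x^S[U]).$$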
Consequently, $\operatorname{Fr}_{(S)} \mathbb{F}(X[\theta, t_0, x_0, V])$ is the set of Slater-minimal values of $\mathbb{F}(x)$ (Slater minima) in problem (1.2) and $\operatorname{Fr}^{(S)} \mathbb{F}(X[\theta, t_0, x_0, U])$ the set of Slater-maximal values (Slater maxima) of $\mathbb{F}(x)$ in (1.3). In the above notation

$$\operatorname{Fr}_{(S)} \mathbb{F}(X[\theta, t_0, x_0, V]) = \bigcup \min^{S}_{x[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, V]), \qquad \operatorname{Fr}^{(S)} \mathbb{F}(X[\theta, t_0, x_0, U]) = \bigcup \max^{S}_{x[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, U]).$$

The sets $X_{(K)}[V]$, $X^{(K)}[U]$, $\operatorname{Fr}_{(K)} \mathbb{F}(X[\theta, t_0, x_0, V])$, and $\operatorname{Fr}^{(K)} \mathbb{F}(X[\theta, t_0, x_0, U])$ and the operations $\max^K$, $\min^K$ ($K = P, G, A$) are introduced in a similar way, with $\not>$ replaced in the appropriate relations by $\ngeq$ (for $K = P$), by $\ngeq_G$ (for $K = G$; the relationship $\geq_G$ is defined at the end of subsection 1.1 in Chapter 4), and by $\not>_A$ (for $K = A$). In these cases it is also true that

$$\operatorname{Fr}_{(K)} \mathbb{F}(X[\theta, t_0, x_0, V]) = \bigcup \min^{K}_{x[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, V]), \qquad \operatorname{Fr}^{(K)} \mathbb{F}(X[\theta, t_0, x_0, U]) = \bigcup \max^{K}_{x[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, U]).$$
The following definition is of key importance here.

Definition 1.1. The strategy $V^{(S)} \in \mathcal{V}$ is referred to as the Slater-maximin in the game (1.1) if there exists a point $\hat{x}_S[V^{(S)}] \in X_{(S)}[V^{(S)}]$ such that

$$\mathbb{F}(\hat{x}_S[V^{(S)}]) \not< \mathbb{F}(x_S[V]) \tag{1.5}$$

for every $V \in \mathcal{V}$ and $x_S[V] \in X_{(S)}[V]$. The vector $\mathbb{F}(\hat{x}_S[V^{(S)}])$ will be referred to as the Slater-maximin for (1.1).

Analogously, the strategy $U^{(S)} \in \mathcal{U}$ is referred to as the Slater-minimax in (1.1) if there exists $\hat{x}^S[U^{(S)}] \in X^{(S)}[U^{(S)}]$ such that

$$\mathbb{F}(\hat{x}^S[U^{(S)}]) \not> \mathbb{F}(x^S[U])$$

for any $U \in \mathcal{U}$ and every $x^S[U] \in X^{(S)}[U]$. The vector $\mathbb{F}(\hat{x}^S[U^{(S)}])$ will be referred to as the Slater-minimax of (1.1).

Now $\mathcal{V}^{(S)}$ will denote the set of Slater-maximin strategies $V^{(S)}$ of (1.1) and $\hat{X}_{(S)}[V^{(S)}]$ the set of values $\hat{x}_S[V^{(S)}]$ associated with the strategy $V^{(S)}$ (cf. Definition 1.1). In a similar way, $\mathcal{U}^{(S)}$ is the set of Slater-minimax strategies $U^{(S)}$ of (1.1) and $\hat{X}^{(S)}[U^{(S)}]$ the set of values $\hat{x}^S[U^{(S)}]$ associated with the minimax strategy $U^{(S)}$.
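Definition 1.1 has a direct finite analogue that may help in reading it. In the sketch below each strategy $V$ is represented only by a finite list of payoff vectors it can produce; this finite model, the labels `V1`, `V2`, `V3`, and the function names are assumptions made for illustration and do not come from the differential game itself.

```python
def strictly_less(p, q):
    """True if every component of p is below the matching component of q."""
    return all(a < b for a, b in zip(p, q))

def slater_minimal(points):
    """Points for which no other point is strictly smaller in every component."""
    return [p for p in points if not any(strictly_less(q, p) for q in points)]

def slater_maximin(outcomes):
    """outcomes maps a strategy label V to the payoff vectors it can yield.
    Returns {V: [vectors]} for the strategies that own a Slater-minimal vector
    which is not strictly below any Slater-minimal vector of any strategy."""
    minimal = {v: slater_minimal(pts) for v, pts in outcomes.items()}
    result = {}
    for v, candidates in minimal.items():
        good = [p for p in candidates
                if not any(strictly_less(p, q)
                           for pts in minimal.values() for q in pts)]
        if good:
            result[v] = good
    return result

# Hypothetical outcome sets for three strategies of the maximizing player.
outcomes = {
    "V1": [(0, 2), (1, 3)],
    "V2": [(2, 0), (3, 1)],
    "V3": [(-1, -1), (5, 5)],
}
maximin = slater_maximin(outcomes)
```

Here `V1` and `V2` are both Slater-maximin with the incomparable guarantees `(0, 2)` and `(2, 0)`, whereas `V3` is excluded because its only Slater-minimal outcome `(-1, -1)` is strictly below both.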
In the notation of (1.4), by Definition 1.1,
VS)= F(~,[V(~'])= minS [F(x[e, xc.1
=
to, xo,
~(~7)
maxS u minS F(x[e, to, xo, V]) V€V-
XC.1
= maxS F(X,,,[V]) = maxS V€Y'
VEV-
maxS F(x),
X€X,,,[V]
= m in Su maxS F(x[e, to, xo, U ] ) U€Q
=
XI.]
minS [F(X(s)[V]) = minS minS F(x). U€4
U€@
xeX'S"U1
(1.6)
These relations are analogs of (3.12) and (3.10) from Chapter 1, which introduce the maximin $V^0$ and minimax $U^0$ strategies, respectively, of the differential game (3.8) in Chapter 1 with scalar payoff function. Finally, if in the relations that define the Slater-maximin and Slater-minimax strategies the operation $\not>$ is replaced by $\ngeq$ and $S$ by $P$, the resultant strategies $V^{(P)}$ and $U^{(P)}$ are the Pareto maximin and minimax, respectively, in the game (1.1). In a similar way, replacing in Definition 1.1 $\not>$ by $\ngeq_G$ and $S$ by $G$ (and $\not>$ by $\not>_A$ and $S$ by $A$) yields the notion of a Geoffrion-maximin ($A$-maximin) strategy.

To differentiate between the maximin $V^{(K)}$ and minimax $U^{(K)}$ strategies, on the one hand, and the strategies $(U^K, V^K)$ that generate the associated saddle point, here and below in the case of vector-valued maximins and minimaxes the superscript will be parenthesized, e.g., $V^{(S)}$ will be the Slater-maximin strategy and $V^S$ the second component of the Slater saddle point.

We associate every strategy $V \in \mathcal{V}$ with the set $X_{(P)}[V]$ of Pareto-minimal solutions $x_P[V]$ of problem (1.2), i.e.,

$$X_{(P)}[V] = \{\, x_P[V] \in X[\theta, t_0, x_0, V] \mid \mathbb{F}(x_P[V]) \ngeq \mathbb{F}(x),\ \forall x \in X[\theta, t_0, x_0, V] \,\}.$$

Definition 1.2. The strategy $V^{(P)} \in \mathcal{V}$ is referred to as the Pareto-maximin in the game (1.1) if

The vector $\mathbb{F}(\hat{x}_P[V^{(P)}])$ is referred to as the Pareto-maximin of (1.1); the set of Pareto-maximin strategies will be denoted $\mathcal{V}^{(P)}$. As in (1.6),

$$\mathbb{F}^{(P)} = \mathbb{F}(\hat{x}_P[V^{(P)}]) = \min^{P}_{x[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, V^{(P)}]) = \max^{P}_{V \in \mathcal{V}} \bigcup \min^{P}_{x[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, V]).$$
For every strategy $V \in \mathcal{V}$, using the relation $\geq_G$ introduced at the end of subsection 1.1 in Chapter 4, we obtain the set

$$X_{(G)}[V] = \{\, x_G[V] \in X[\theta, t_0, x_0, V] \mid \mathbb{F}(x_G[V]) \ngeq_G \mathbb{F}(x),\ \forall x \in X[\theta, t_0, x_0, V] \,\}.$$

Definition 1.3. The strategy $V^{(G)} \in \mathcal{V}$ will be referred to as the Geoffrion-maximin of (1.1) if

$$\exists\, \hat{x}_G[V^{(G)}] \in X_{(G)}[V^{(G)}] \ \mid\ \mathbb{F}(\hat{x}_G[V^{(G)}]) \nleq_G \mathbb{F}(x_G[V]), \qquad \forall V \in \mathcal{V},\ x_G[V] \in X_{(G)}[V].$$

The vector $\mathbb{F}(\hat{x}_G[V^{(G)}])$ will be referred to as the Geoffrion-maximin of (1.1); the set of Geoffrion-maximin strategies will be denoted $\mathcal{V}^{(G)}$, and

$$\mathbb{F}^{(G)} = \max^{G}_{V \in \mathcal{V}} \bigcup \min^{G}_{x[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, V]).$$
As for the concept of an $A$-maximin strategy $V^{(A)}$ of (1.1), let a constant $(N \times N)$-dimensional matrix $A$ be specified with elements $a_{ij} > 0$. Every strategy $V \in \mathcal{V}$ is associated with the set $X_{(A)}[V]$ of all $A$-minimal solutions $x_A[V]$ of $\Gamma(V)$ from (1.2), where $\mathbb{F}$ is replaced by $A\mathbb{F}$; that is, for the above strategy $V$, any $x_A[V] \in X_{(A)}[V]$, and all $x[\cdot, t_0, x_0, V]$, it is true that

$$A\mathbb{F}(x_A[V]) \not> A\mathbb{F}(x[\theta, t_0, x_0, V]).$$

Definition 1.4. The strategy $V^{(A)}$ will be referred to as the $A$-maximin in the differential game (1.1) if there exists a point $\hat{x}_A[V^{(A)}] \in X_{(A)}[V^{(A)}]$ such that

$$A\mathbb{F}(\hat{x}_A[V^{(A)}]) \not< A\mathbb{F}(x_A[V])$$

for any $V \in \mathcal{V}$ and every $x_A[V] \in X_{(A)}[V]$.

The vector $\mathbb{F}(\hat{x}_A[V^{(A)}])$ will be referred to as the $A$-maximin of the game (1.1). As in (1.6),

$$\mathbb{F}^{(A)} = \mathbb{F}(\hat{x}_A[V^{(A)}]) = \min^{A}_{x[\cdot]} \mathbb{F}(x[\theta, t_0, x_0, V^{(A)}]).$$
The notions of the Pareto-, Geoffrion-, and $A$-minimax strategies of (1.1) are defined similarly. In discussing the maximins of the four types simultaneously, the term vector-valued maximin will be employed and the associated strategy $V^{(K)}$ ($K = S, P, G, A$) will be referred to as the vector-valued maximin strategy.

Remark 1.1. The different types of vector-valued maximins do not necessarily intersect (cf. Fig. 6.1.1). Let the strategy $V^{(1)}$ be associated in the criteria space $\{F_1, F_2\}$ with the set $\mathbb{F}(X[\theta, t_0, x_0, V^{(1)}])$ shown as the rectangle $AB'C'D$ drawn with a heavy line in Fig. 6.1.1. Then the set of Slater minima in problem (1.2) with $V = V^{(1)}$ is $\mathbb{F}(X_{(S)}[V^{(1)}]) = AB' \cup B'C'$, and the set of Pareto minima $\mathbb{F}(X_{(P)}[V^{(1)}]) = B'$ coincides with the point $B'$.

Figure 6.1.1.

The strategy $V^{(2)}$ is associated with the rectangle $B''C''DC$ formed by the broken line; the associated set of Slater minima is

$$\mathbb{F}(X_{(S)}[V^{(2)}]) = B''C'' \cup B''C;$$

there is a unique Pareto minimum $\mathbb{F}(X_{(P)}[V^{(2)}]) = B''$.

Let the strategy $V^{(3)}$ be associated with the square $GKLM$ formed by the dash-and-dot line, the associated set of Slater minima being

$$\mathbb{F}(X_{(S)}[V^{(3)}]) = KG \cup GM$$

and the Pareto minimum being $\mathbb{F}(X_{(P)}[V^{(3)}]) = G$.

Then by Definition 1.1 the set of Slater maximins is $AB \cup BC$ and, by Definition 1.2, the set of Pareto maximins consists of the three points $B''$, $G$, and $B'$. Consequently, the set of Slater maximins does not have any points in common with the set of Pareto maximins. Moreover, the Pareto-maximin strategy $V^{(3)}$ is not Slater-maximin. Consequently, for the vector-valued maximin, inclusions such as (1.16) in Chapter 5 are not generally true.
1.2. Relationship to Scalar Maximin

In the differential game (1.1) with $N = \{1\}$, that is, in the zero-sum differential game (3.8) from Chapter 1 with scalar payoff function $F_1(x[\theta])$,

$$\langle \{1, 2\},\ \Sigma,\ \{\mathcal{U}, \mathcal{V}\},\ F_1(x[\theta]) \rangle, \tag{1.7}$$

the maximin strategy $V^0 \in \mathcal{V}$ is governed by (3.12) from Chapter 1:

$$\max_{V \in \mathcal{V}} \min_{x[\cdot]} F_1(x[\theta, t_0, x_0, V]) = \min_{x[\cdot]} F_1(x[\theta, t_0, x_0, V^0]) \tag{1.8}$$

or, by Property 3.3 in Chapter 1,

$$\min_{x[\cdot]} F_1(x[\theta, t_0, x_0, V^0]) \ge \min_{x[\cdot]} F_1(x[\theta, t_0, x_0, V]), \qquad \forall V \in \mathcal{V}. \tag{1.9}$$

Property 1.1. The vector-valued maximin strategy of (1.1) with $N = \{1\}$ coincides with the maximin strategy $V^0$ of the game (1.7).
Indeed, for $N = \{1\}$ the set

$$X_0[V] = \{x_0[V] \in X[\vartheta, t_0, x_0, V] \mid F_1(x) \geq F_1(x_0[V]),\ \forall x \in X[\vartheta, t_0, x_0, V]\}$$

may be equivalently represented as

$$X_0[V] = \{x_0[V] \in X[\vartheta, t_0, x_0, V] \mid F_1(x) \not< F_1(x_0[V]),\ \forall x \in X[\vartheta, t_0, x_0, V]\},$$

where $\not<$ signifies that the strict inequality $F_1(x) < F_1(x_0[V])$ cannot hold if $x \in X[\vartheta, t_0, x_0, V]$. But then $X_0[V] = X_{(K)}[V]$ ($K = S, P, A$), because the relations $F_1(x) \not< F_1(x_0[V])$ and $a_{11}F_1(x) \not< a_{11}F_1(x_0[V])$ are equivalent for the positive constant $a_{11}$; in the case of the Pareto minimum one of the inequalities in (1.1) from Chapter 3 must be strict. In the case $N = \{1\}$ this strict inequality is associated with the criterion $F_1(x[\vartheta])$. Consequently, at the first stage of determining the vector-valued maximins (in obtaining the sets $X_{(K)}[V]$, $K = S, P, A$) the sets $X_{(K)}[V]$ coincide with the set of “internal” minima $X_0[V]$ in (1.8). Using the set $X_0[V]$, inequality (1.9) associated with the “external” maximum in (1.8) can be represented in an equivalent form,

$$F_1(x[V^0]) \not< F_1(x[V])$$

for any $V \in \mathcal{V}$, $x[V] \in X_0[V]$ and all $x[V^0] \in X_0[V^0]$. But then for $a_{11} = \mathrm{const} > 0$ too, it is true that $a_{11}F_1(x[V^0]) \not< a_{11}F_1(x[V])$.
The preceding reasoning has to be applied to the Pareto maximum. At the second stage of determining vector-valued maximins, the strategies $V^{(K)}$, $K = S, P, A$, are introduced, the difference being that in Definitions 1.1, 1.2, and 1.4 only certain quasimotions $\hat{x}_K[V^{(K)}] = \hat{x}[\vartheta, t_0, x_0, V^{(K)}]$ are “identified.” In the case of the scalar maximin (1.8) any quasimotion $x[\cdot, t_0, x_0, V^0]$ such that $x[\vartheta, t_0, x_0, V^0] \in X_0[V^0]$ will do. Consequently, the maximin strategy $V^0$ from (1.8) is a Slater-, Pareto-, and A-maximin strategy for the game (1.1) with $N = \{1\}$. Because the Geoffrion-maximin strategy is, by Definition 1.3, Pareto-maximin, the above fact is also true for it. The converse proposition is that any vector-valued maximin strategy of (1.1) with $N = \{1\}$ is maximin (cf. (1.8)). Let us consider the Slater maximin. (The other three cases of Pareto-, Geoffrion-, and A-maximin may be
discussed in a similar way.) If in (1.1) the set $N = \{1\}$, then $X_{(S)}[V]$ is formed by the points $x_S[V] = x[\vartheta, t_0, x_0, V]$ such that

$$F_1(x_S[V]) \not> F_1(x[\vartheta, t_0, x_0, V]),\ \forall x[\cdot, t_0, x_0, V] \;\Rightarrow\; F_1(x_S[V]) = \min_{x[\cdot]} F_1(x[\vartheta, t_0, x_0, V]) = F_1(x_0[V]).$$

The nonsimultaneity of inequalities (1.5) implies in this case that

$$F_1(\hat{x}_S[V^{(S)}]) \geq F_1(x_S[V]),\ \forall V \in \mathcal{V},\ x_S[V] \in X_{(S)}[V] \;\Rightarrow\; \max_{V \in \mathcal{V}} \min_{x[\cdot]} F_1(x[\vartheta, t_0, x_0, V]) = \min_{x[\cdot]} F_1(x[\vartheta, t_0, x_0, V^{(S)}]).$$

Then, by (3.12) from Chapter 1, $V^{(S)}$ is the maximin strategy of the game (1.7).
Remark 1.2. Property 1.1 signifies that the notion of a vector-valued maximin is “fairly complete” in that it includes as a special case a definition of the maximin strategy for the differential positional game (1.7) with scalar payoff function. If in (1.1) the set $\mathcal{U} = \emptyset$ or $\mathcal{V} = \emptyset$, the notions of a Slater-, Pareto-, Geoffrion-, and A-maximal (or -minimal) strategy of the multicriterial dynamic problem (Chapters 2-4) follow from Definitions 1.1-1.4, respectively. This fact also characterizes the “completeness” of the notion of a vector-valued maximin.
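Property 1.1 can be checked on a finite toy model. The sketch below is only an illustration under stated assumptions: each strategy is represented by a finite list of payoff tuples standing in for the values of $F$ on $X[\vartheta, t_0, x_0, V]$, and the names `outcomes`, `slater_maximin_strategies`, and `scalar_maximin_strategies` are hypothetical, not taken from the text. With a single criterion, the two-stage Slater construction and the ordinary max-min select the same strategies.

```python
def strictly_less(a, b):
    """True if every component of a is strictly below the matching one of b."""
    return all(x < y for x, y in zip(a, b))


def slater_minima(points):
    """Points that no other point undercuts strictly in every component."""
    return [p for p in points if not any(strictly_less(q, p) for q in points)]


def slater_maximin_strategies(outcomes):
    """Strategies owning a Slater minimum that is Slater-maximal in the pool."""
    minima = {v: slater_minima(pts) for v, pts in outcomes.items()}
    pool = [p for pts in minima.values() for p in pts]
    return {
        v
        for v, pts in minima.items()
        if any(not any(strictly_less(p, q) for q in pool) for p in pts)
    }


def scalar_maximin_strategies(outcomes):
    """Ordinary max-min over the single criterion (component 0)."""
    worst = {v: min(p[0] for p in pts) for v, pts in outcomes.items()}
    best = max(worst.values())
    return {v for v, w in worst.items() if w == best}


# Hypothetical data: one criterion, three strategies.
toy = {"V1": [(3,), (1,)], "V2": [(2,), (4,)], "V3": [(2,), (2,)]}
same = slater_maximin_strategies(toy) == scalar_maximin_strategies(toy)
```

Here `same` compares the two selections; the finite lists are a simplification of the compact sets used in the text.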
1.3. Geometric Interpretation

The Slater maximin of (1.1) will be geometrically interpreted for the case $N = \{1, 2\}$, or $F = (F_1, F_2)$. Every strategy $V$ is associated in the criterion space $\{F_1, F_2\}$ with a set $F(X[\vartheta, t_0, x_0, V])$ of values $F(x) = (F_1(x), F_2(x))$ as the state vector $x$ scans the entire set of right ends $x[\vartheta, t_0, x_0, V]$ of the
quasimotions $x[\cdot, t_0, x_0, V]$ of the system (1.8) in Chapter 5 generated from position $(t_0, x_0)$ by the strategy $V$. Figure 6.1.2 shows four such sets $F(X[\vartheta, t_0, x_0, V_k])$, $k = 1, \ldots, 4$, associated with each of the four strategies $V_k \in \mathcal{V}$. In each of the sets $F(X[\vartheta, t_0, x_0, V_k])$ the Slater minima $F(X_{(S)}[V_k])$ are represented by heavy lines. Recall that the set $X_{(S)}[V_k]$ is formed by the Slater-minimal solutions $x_S[V_k]$ of problem (1.2), where $F = (F_1, F_2)$. The set of Slater maximins in Fig. 6.1.2 is formed by the segments $AB$ and $BC$. The points of these segments are Slater-maximal for the sets $F(X_{(S)}[V_k])$ shown by heavy lines. This implies that for any point of these segments the interior of the angular region $G$ with vertex at that point and sides parallel to the axes $F_1$ and $F_2$ does not contain points from the sets $F(X_{(S)}[V_k])$, $k = 1, \ldots, 4$. It is also important that for the sets $F(X[\vartheta, t_0, x_0, V_k])$ situated as in Fig. 6.1.2, two of the strategies $V_k$ are Slater-maximin. One of them is $V_2$: the points $\hat{x}_S[V_2]$ (Definition 1.1) associated with $V_2$ yield values of the payoff function $F(\hat{x}_S[V_2])$ that form the segment $BC$.
Figure 6.1.2.
In a similar way, the second Slater-maximin strategy $V_k$ is associated with points $\hat{x}_S[V_k] \in X_{(S)}[V_k]$ such that the Slater maximin $F(\hat{x}_S[V_k])$ fills the segment $AB$.

As for a geometric interpretation of the Pareto maximin, by Definition 1.2 every strategy $V \in \mathcal{V}$ is associated with a set of Pareto-minimal values $F(X_{(P)}[V]) = \{(F_1(x), F_2(x)) \mid x \in X_{(P)}[V]\}$ obtained (Fig. 6.1.3) on the set $X_{(P)}[V]$ of Pareto-minimal solutions of the two-criterial problem. The nonsimultaneity of the inequalities $F(\hat{x}_P[V^{(P)}]) \not\leq F(x_P[V])$, $\forall V \in \mathcal{V}$, and the fact that $x_P[V] \in X_{(P)}[V]$ (Definition 1.2) imply that none of the Pareto-minimal values $F(X_{(P)}[V])$ for any strategy $V \in \mathcal{V}$ may lie inside or on the boundaries of $G$ other than at its vertex $F(\hat{x}_P[V^{(P)}])$. Consequently, the value $F(\hat{x}_P[V^{(P)}])$ cannot be “strictly improved” with respect to any component without worsening another component when using any strategy $V \in \mathcal{V}$ other than the Pareto-maximin $V^{(P)}$. This comparison is made for the Pareto-minimal values of $F(x)$ that can only occur in problems (1.2) with distinct $V \in \mathcal{V}$. From Fig. 6.1.3 and Definition 1.2 the Pareto maximin $F(\hat{x}_P[V^{(P)}])$ is also seen to be a guaranteed result for the
Figure 6.1.3.
“maximizing” player in that it is Pareto-maximal for the set of all Pareto minima $F(X_{(P)}[V])$ of problem (1.2) for any $V \in \mathcal{V}$. Unlike the geometric interpretation of the Slater maximin, in this case:

(1) the Pareto maximin $F(\hat{x}_P[V^{(P)}])$ is compared to the Pareto minima $F(X_{(P)}[V])$, $\forall V \in \mathcal{V}$; the Slater maximin $F(\hat{x}_S[V^{(S)}])$ is compared to “wider” sets $F(X_{(S)}[V])$ that are Slater minima of problems (1.2) with distinct $V \in \mathcal{V}$;

(2) the points of the sets $F(X_{(P)}[V])$ cannot “reach” the interior of the angular region $G$ (as in the case of the Slater maximin), nor even the sides of that region with the vertex punched out.
The geometric interpretation of the Geoffrion maximin $F(\hat{x}_G[V^{(G)}])$ differs from that of the Pareto maximin in two respects:

(1) the values $F(\hat{x}_G[V^{(G)}])$ are compared to the values $F(x_G[V])$ of the payoff function obtained for all Geoffrion-minimal solutions $x_G[V]$ of the two-criterial problem (1.2) for distinct $V \in \mathcal{V}$;

(2) the points of the set $F(X_{(G)}[V])$ for any $V \in \mathcal{V}$ cannot reach the interior or the boundaries, other than at the vertex $F(\hat{x}_G[V^{(G)}])$, of at least one obtuse angle $G_1$ that contains a right angle with the same vertex $F(\hat{x}_G[V^{(G)}])$ (Fig. 6.1.4).
Note that if the set $F(X_{(G)}[V^{(G)}])$ is a parabola, as in Fig. 6.1.4, its vertex $A$ is the Pareto, but not the Geoffrion, maximin. Indeed, there is not a single obtuse angle $G_1$ with vertex at $A$ that contains the right angle $G$ strictly inside itself but does not contain the points of that parabola nearest $A$. But all points of the parabola other than $A$ are Geoffrion maximins for the sets $F(X_{(G)}[V_i])$, $i = 1, 2$, and $F(X_{(G)}[V^{(G)}])$ in Fig. 6.1.4.

Geometric interpretations of the A-maximin differ from those of the Geoffrion maximin in that

(1) the A-maximin $F(\hat{x}_A[V^{(A)}])$ is compared to the sets $F(X_{(A)}[V])$ associated with A-minima in problem (1.2) for any $V \in \mathcal{V}$;

(2) the size of the obtuse angle $G_1$ is fixed and defined by the matrix $A$; in the case of the Geoffrion maximin one such angle $G_1$ is sufficient.
To conclude this subsection, note that a similar geometric interpretation is applicable to vector-valued minimaxes.
Figure 6.1.4.
1.4. Why is the Maximin Better than the Vector-Valued Saddle Point?
This question will be answered for the case of a Slater maximin, as the reasoning is similar for other vector-valued maximins and minimaxes. To start with, note that the application of the Slater-maximin strategy $V^{(S)}$ assures a certain guaranteed result, i.e., a vector guarantee for the second (maximizing) player. Indeed, by Definition 1.1,

$$F(\hat{x}_S[V^{(S)}]) \not> F(x[\vartheta, t_0, x_0, V^{(S)}]), \quad \forall x[\cdot, t_0, x_0, V^{(S)}].$$

Therefore, whatever strategy $U \in \mathcal{U}$ is used by the first player, i.e., with

$$x[\vartheta, t_0, x_0, U, V^{(S)}] \in X[\vartheta, t_0, x_0, V^{(S)}]$$

by Proposition 2.4, Chapter 1, the following relation will hold:

$$F(\hat{x}_S[V^{(S)}]) \not> F(x[\vartheta, t_0, x_0, U, V^{(S)}]), \quad \forall x[\cdot, t_0, x_0, U, V^{(S)}]. \tag{1.10}$$
The latter relation implies that by using $V^{(S)}$ the second player, given any behavior $U \in \mathcal{U}$ of the first player, obtains a payoff vector $F(x[\vartheta, t_0, x_0, U, V^{(S)}])$ at least equal to $F(\hat{x}_S[V^{(S)}])$ (with respect to all components simultaneously); the vector $F(\hat{x}_S[V^{(S)}])$ is the guarantee assured for the maximizing player if he uses $V^{(S)}$ (Fig. 6.1.5). Put differently, none of the points $F(x[\vartheta, t_0, x_0, U, V^{(S)}])$ for any $U \in \mathcal{U}$ and any quasimotion $x[\cdot]$ can be found within the right angular region shaded in Fig. 6.1.5. Consequently, as in the case when the component $V^S$ of the Slater saddle point (Remark 1.3, Chapter 5) is applied, by using a maximin strategy $V^{(S)}$ the maximizing player assures for himself values of all components of the payoff function $F(x[\vartheta])$ simultaneously at least equal to the maximin $F(\hat{x}_S[V^{(S)}])$. This, however, does not give the Slater maximin any advantage over the saddle point. The chief advantage is that this guarantee $F(\hat{x}_S[V^{(S)}])$ is the largest of all possible for the maximizing player. Indeed, for each of his strategies $V \in \mathcal{V}$ the vector $F(x_S[V])$ with $x_S[V] \in X_{(S)}[V]$ is a guarantee in the above sense, that is,

$$F(x_S[V]) \not> F(x[\vartheta, t_0, x_0, V])$$

for any quasimotions $x[\cdot, t_0, x_0, V]$ and, consequently,

$$F(x_S[V]) \not> F(x[\vartheta, t_0, x_0, U, V])$$

for every strategy $U \in \mathcal{U}$ of the first player and associated quasimotions $x[\cdot, t_0, x_0, U, V]$. Note that this relation is obtained in the same way as (1.9). By Definition 1.1,

$$F(\hat{x}_S[V^{(S)}]) \not< F(x_S[V]), \quad \forall V \in \mathcal{V},\ x_S[V] \in X_{(S)}[V].$$

Figure 6.1.5.
Thus, the vector-valued guarantee $F(\hat{x}_S[V^{(S)}])$ is the largest (Slater-maximal) of all the vector-valued guarantees $F(x_S[V])$ that the second player can obtain by choosing distinct strategies $V \in \mathcal{V}$ of his own. The desire to assure for oneself the largest Slater vector-valued guarantee explains the purpose (in game-theoretic terms) of using the Slater-maximin strategy $V^{(S)}$ on the part of the second player.

The Slater maximin will be seen in this chapter to have other positive properties that distinguish it from the Slater saddle point. First is the internal stability of the set of Slater maximins (cf. Proposition 2.3), which the associated saddle points cannot provide. Second, by Proposition 2.7, every Slater maximin satisfies

$$F(\hat{x}_S[V^{(S)}]) \not< F(x[\vartheta, t_0, x_0, U^S, V^S])$$

for any Slater saddle point $(U^S, V^S)$; that is, the value of the payoff function (and so the vector-valued guarantee of Remark 1.3 in Chapter 5) at any saddle point $(U^S, V^S)$ cannot exceed (in the sense of Slater) any maximin. Consequently, the vector-valued guarantee that the maximizing player can assure for himself by using his own strategy $V^S$ from the saddle point $(U^S, V^S)$ cannot exceed (in the sense of Slater) the vector-valued guarantee $F(\hat{x}_S[V^{(S)}])$ that is assured by the Slater-maximin strategy $V^{(S)}$.
2. Existence of Vector-Valued Maximins

Numerous auxiliary propositions will now be proved, leading to a proof of the existence of a Slater-maximin strategy under the conventional constraints imposed in positional differential games. Properties of vector-valued maximins, in particular their relationship with saddle points, are also determined.
2.1. Existence of Slater Maximin
Let us assume that for the differential game (1.1) Conditions 1.1 (Chapter 1) and 1.1 (Chapter 5) and equality (3.2) from Chapter 1 hold, the initial position $(t_0, x_0)$ is fixed, and $X = X[\vartheta, t_0, x_0, U \div H, V \div Q]$ is the reachability domain of the system $\Sigma$ from the initial position $(t_0, x_0)$. By Proposition 2.3 (Chapter 1), the set $X$ is a compactum in $\mathbb{R}^n$. Let us consider the space $\{2^X\}_m$ of all compacta of the set $X$ with the distance $\mathrm{dist}(\cdot)$ determined in (A1.1). The properties of this space are described in section 1.4 of Appendix 1. Because the reachability domain is closed and bounded, it is true (see Appendix 1) that for any infinite sequence $\{X^{(k)}\} \subset \{2^X\}_m$ there exist a compactum $X^0 \in \{2^X\}_m$ and a (metrically) converging subsequence $\{X^{(k_e)}\}$ such that

$$\lim_{e \to \infty} \mathrm{dist}(X^{(k_e)}, X^0) = 0.$$

In other words, the set $\{2^X\}_m$ is compact. Let us introduce the set $\mathcal{H}$ whose elements are defined as follows:

(1) the elements of $\mathcal{H}$ are compacta

$$X[V] = X[\vartheta, t_0, x_0, V] = \mathcal{X}[t_0, x_0, V] \cap \{t = \vartheta\}, \quad \forall V \in \mathcal{V},$$

or sections of bundles of quasimotions $\mathcal{X}[t_0, x_0, V]$ by the hyperplane $t = \vartheta$. Consequently, the set $\mathcal{H}$ forms a collection of sets $X[V]$ associated with all possible strategies $V \in \mathcal{V}$;

(2) for no $X[V]$ does there exist a strategy $\tilde{V} \in \mathcal{V}$ such that $X[\tilde{V}] \subset X[V]$ and $X[\tilde{V}] \neq X[V]$. Consequently, if for any two strategies $V^{(1)}$ and $V^{(2)}$ from $\mathcal{V}$ the inclusion $X[V^{(1)}] \subset X[V^{(2)}]$ holds, $X[V^{(1)}]$ and $X[V^{(2)}]$ will be distinct elements of the set $\mathcal{H}$.
Lemma 2.1. The set $\mathcal{H}$ is a compactum.

Proof. To prove the lemma it is sufficient to show that in any infinite sequence $\{X[V^{(k)}],\ k = 1, 2, \ldots\}$, $X[V^{(k)}] \in \mathcal{H}$, a converging subsequence $\{X[V^{(k_e)}],\ e = 1, 2, \ldots\}$ may be found such that

$$\lim_{e \to \infty} \mathrm{dist}(X[V^{(k_e)}], X[V^0]) = 0$$

for some strategy $V^0 \in \mathcal{V}$. The proof will proceed in two stages.
First stage. First let us show that in the sequence of compacta $\{X[V^{(k)}]\} \subset \mathcal{H}$ a subsequence $\{X[V^{(k_e)}]\}$ can be found that converges in $\{2^X\}_m$ and is such that there exists a compactum $X^0 \in \{2^X\}_m$ with the following two properties:

$$\lim_{e \to \infty} \mathrm{dist}(X[V^{(k_e)}], X^0) = 0 \tag{2.1}$$

and there exists a strategy $V^0 \in \mathcal{V}$ such that

$$X[V^0] \subset X^0. \tag{2.2}$$

To prove (2.1) note that the space $\{2^X\}_m$ is compact and the compacta $X[V^{(k)}] \subset X$, thus $X[V^{(k)}] \in \{2^X\}_m$. Then for any infinite sequence of compacta $\{X[V^{(k)}]\}$ there exist (see A1.3) a subsequence $\{X[V^{(k_e)}]\}$ and a compactum $X^0 \in \{2^X\}_m$ such that (2.1) is true.

Let us establish the existence of a strategy $V^0 \in \mathcal{V}$ for which (2.2) is true. We assume the contrary, i.e., there exists no $V^0 \in \mathcal{V}$ such that the inclusion (2.2) holds. Then for the closed set (compactum) $X^0$ there exist, by virtue of the alternative (Theorem 3.1 and Remark 3.1 from Chapter 1), a number $\varepsilon > 0$ and a strategy $U^* \in \mathcal{U}$ such that for every quasimotion $x[\cdot, t_0, x_0, U^*]$ evasion is possible at time $\vartheta$ away from the $\varepsilon$-neighborhood of the compactum $X^0$, or from the set $M_\varepsilon$.

For the above number $\varepsilon$, (2.1) makes it possible to specify a sequential number $L(\varepsilon)$ such that for $e \geq L(\varepsilon)$ it is true, by virtue of the property (A1.2) of the distance $\mathrm{dist}(\cdot)$, that

$$X[V^{(k_e)}] \subset M_\varepsilon.$$

But then for the closed set $X[V^{(k_e)}]$ there exist both a strategy $V^{(k_e)} \in \mathcal{V}$ by means of which, at time $\vartheta$, all the quasimotions $x[\cdot, t_0, x_0, V^{(k_e)}]$ reach the set $X[V^{(k_e)}]$, and the strategy $U^* \in \mathcal{U}$ which ensures evasion of $x[\cdot, t_0, x_0, U^*]$ away from a certain $\varepsilon_1$-neighborhood of $X[V^{(k_e)}]$ at the same time ($\varepsilon_1 \in (0, \varepsilon)$). This is in conflict with the alternative (Remark 3.1, Chapter 1). Consequently, there exists a strategy $V^0 \in \mathcal{V}$ such that (2.2) is true.

Second stage. Let us prove the equality

$$X[V^0] = X^0 \tag{2.3}$$

by contradiction. Let $X[V^0] \neq X^0$. Then, by virtue of (2.2), $X[V^0] \subset X^0$. The
set $X[V^0]$ is a compactum, the reachability domain of the system (1.8) from Chapter 5. Therefore, $X[V^0] \in \{2^X\}_m$ and, because $\{2^X\}_m$ is a compactum, there exists a sequence of compacta $\{X^{(k)}\} \subset \{2^X\}_m$ such that

(1) $\lim_{k \to \infty} \mathrm{dist}(X^{(k)}, X[V^0]) = 0$;

(2) in the sequence $\{X^{(k)}\}$ a subsequence $\{X^{(k_r)},\ r = 1, 2, \ldots\}$ may be found every element of which satisfies $X^{(k_r)} \subset X[V^{(k_r)}]$ and $X^{(k_r)} \neq X[V^{(k_r)}]$, where the subsequence $\{X[V^{(k_r)}]\}$ is introduced at the first stage.

Hence, from (A1.2), for every number $\varepsilon > 0$ a sequential number $R(\varepsilon)$ may be specified such that for $r > R(\varepsilon)$,

$$X[V^0] \subset G^{\varepsilon}(X^{(k_r)}), \tag{2.4}$$

where $G^{\varepsilon}(X^{(k_r)})$ is the $\varepsilon$-neighborhood of the compactum $X^{(k_r)}$. Note that from the way $\mathcal{H}$ is obtained there exists no strategy $V \in \mathcal{V}$ such that $X[V] \subset X^{(k_r)}$. Then by the alternative (Remark 3.1, Chapter 1) there exist a strategy $\tilde{U} \in \mathcal{U}$ and a number $\tilde{\varepsilon} > 0$ such that $X[\vartheta, t_0, x_0, \tilde{U}] \cap G^{\tilde{\varepsilon}}(X^{(k_r)}) = \emptyset$. Consequently, on the one hand, for the closed set $X[V^0]$ the approach problem at time $t = \vartheta$ can be solved (using $V^0$); on the other hand, by (2.4) (with $\varepsilon = \varepsilon_1 \in (0, \tilde{\varepsilon})$) the problem of evasion from $G^{\varepsilon_1}(X[V^0])$ may be solved (using the strategy $\tilde{U}$). What we have is a conflict with the alternative (Remark 3.1, Chapter 1), which proves (2.3) and the lemma.

Every compactum $X[V] \in \mathcal{H}$ is associated with the set $X_{(S)}[V]$ of Slater-minimal solutions $x_S[V]$ of the static multicriterial problem

$$\langle X[V], F(x) \rangle. \tag{2.5}$$

Let us introduce the multivalent mapping $\Psi$ of the set $\mathcal{H}$ into the reachability domain $X$:

$$\Psi: \mathcal{H} \to X: \quad \Psi(X[V]) = X_{(S)}[V] = \{x_S[V] \in X[V] \subset X : F(x_S[V]) \not> F(x),\ \forall x \in X[V]\}. \tag{2.6}$$

Consequently, every compactum $X[V]$, $V \in \mathcal{V}$, is mapped by means of the multivalent mapping $\Psi$ from (2.6) into the set $X_{(S)}[V]$ of Slater-minimal solutions $x_S[V]$ of problem (2.5).
Lemma 2.2. The multivalent mapping $\Psi$ defined in (2.6) is upper semicontinuous on $\mathcal{H}$.

Proof. For the proof it is sufficient, by Appendix 2, to show that the complete prototype

$$\Psi^{-1}(M) = \{X[V] \in \mathcal{H} \mid X_{(S)}[V] \cap M \neq \emptyset\}$$

of every closed set $M \subset X$ is a closed subset in $\mathcal{H}$. Indeed, with $M$ closed in $X$ and $X^0 = X[V^0]$ a limit point of the set $\Psi^{-1}(M)$, we consider any sequence $\{X[V^{(k)}],\ k = 1, 2, \ldots\}$, $X[V^{(k)}] \in \Psi^{-1}(M)$, of points of the set $\Psi^{-1}(M)$ that converges to $X^0$ and the associated sequence $\{x_S[V^{(k)}]\}$, $x_S[V^{(k)}] \in M$, $k = 1, 2, \ldots$. As a closed subset of the compactum (by Lemma 2.1), the set $M$ is also a compactum. Therefore, there exists a subsequence $\{x_S[V^{(k_e)}]\}$ converging to some point $x^0 \in M$ ($e \to \infty$). By the way it is constructed and the continuity of $F(x)$ on $\mathbb{R}^n$,

$$F(x_S[V^{(k_e)}]) \in F(X_{(S)}[V^{(k_e)}]), \quad \lim_{e \to \infty} F(x_S[V^{(k_e)}]) = F(x^0), \quad \lim_{e \to \infty} \mathrm{dist}(F(X[V^{(k_e)}]), F(X[V^0])) = 0. \tag{2.7}$$

From (2.7) and the lemma of Appendix 3 it then follows that $F(x^0)$ is a Slater-minimal point of $F(X[V^0])$, which is equivalent to

$$x^0 \in X_{(S)}[V^0],$$

that is, the complete prototype $\Psi^{-1}(M)$ of the closed set is closed in $\mathcal{H}$.
Lemma 2.3. For every fixed $V \in \mathcal{V}$, the set $X_{(S)}[V]$ of Slater-minimal solutions $x_S[V]$ of (2.5) is a nonempty compactum in $X$.

The lemma formulates a well-known result of the theory of multicriterial problems [70, p. 143]. To make our discussion complete, this lemma will be proved.

Proof. Assume the contrary, that is, that $X_{(S)}[V]$ is not closed: there does exist a sequence $\{x^{(k)}\} \subset X_{(S)}[V]$ of Slater-minimal solutions of (2.5) such that $x^{(k)} \to x^0$ and $x^0 \notin X_{(S)}[V]$. Note that the existence of the point $x^0$, the limit of the sequence $\{x^{(k)}\}$, follows from the compactness of $X[V]$. Because $x^0 \notin X_{(S)}[V]$, there exists a solution $\bar{x} \in X[V]$ for which

$$F(\bar{x}) < F(x^0).$$

Because the components of the $N$-vector $F(x)$ are continuous, for sufficiently large $k$ it is true that

$$F(\bar{x}) < F(x^{(k)}),$$

which contradicts the Slater minimality of the solution $x^{(k)}$ in (2.5). That $X_{(S)}[V]$ is nonempty follows from the fact that a Slater-minimal solution of (2.5) is, for instance, the point $x^0$ defined by the equality

$$F_1(x^0) = \min_{x \in X[V]} F_1(x).$$

The existence of $x^0$ is ensured here by the continuity of the scalar function $F_1(x)$ on the compactum $X[V]$. Note that $F_1(x)$ is the first component of the vector-valued payoff function $F(x)$.
Lemma 2.4. The set

$$Y_S[\mathcal{V}] = \bigcup_{V \in \mathcal{V}} X_{(S)}[V]$$

is a nonempty compactum in $X$.

Proof. The lemma follows from the compactness of $X_{(S)}[V]$ for every $V \in \mathcal{V}$ (Lemma 2.3) and the upper semicontinuity on $\mathcal{H}$ of the multivalent mapping $\Psi: X[V] \to X_{(S)}[V]$ (Lemma 2.2). Indeed, because the mapping $\Psi: \mathcal{H} \to X$ is upper semicontinuous on $\mathcal{H}$ and the image $\Psi(X[V]) = X_{(S)}[V]$ is a compactum in $X$, the image of the compactum $\mathcal{H}$

$$\Psi(\mathcal{H}) = \bigcup_{X[V] \in \mathcal{H}} X_{(S)}[V] = Y_S[\mathcal{V}] \tag{2.8}$$

is a compactum in $X$ (theorem in Appendix 2). This proves the lemma.
Lemma 2.5. The strategy $V^{(S)}$ is Slater-maximin in the game (1.1) if there exists a Slater-maximal solution $\hat{x}_S$ in the multicriterial static problem

$$\langle Y_S[\mathcal{V}], F(x) \rangle, \tag{2.9}$$

where the set of solutions $Y_S[\mathcal{V}]$ is defined in (2.8).

Proof. The proof may be obtained from the following chain of equivalent propositions: the strategy $V^{(S)}$ is Slater-maximin in (1.1) $\Leftrightarrow$ there exists a point $\hat{x}_S[V^{(S)}] \in X_{(S)}[V^{(S)}]$ such that

$$F(\hat{x}_S[V^{(S)}]) \not< F(x_S[V])$$

for all $V \in \mathcal{V}$ and $x_S[V] \in X_{(S)}[V]$ $\Leftrightarrow$ the point $\hat{x}_S[V^{(S)}] = \hat{x}_S$ is Slater-maximal in (2.9). This proves the lemma.
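The chain of equivalences in Lemma 2.5 suggests a two-stage computation. The following sketch is a finite illustration only: each strategy is given a finite list of payoff vectors in place of the compact set $F(X[V])$, and the names `outcomes` and `slater_maximins` are hypothetical. Stage one collects the Slater-minimal vectors of every strategy (a finite stand-in for $F(Y_S[\mathcal{V}])$); stage two keeps the Slater-maximal points of that union.

```python
def strictly_less(a, b):
    """True if every component of a is strictly below the matching one of b."""
    return all(x < y for x, y in zip(a, b))


def slater_minimal(points):
    """Points for which no other point is strictly smaller in all components."""
    return [p for p in points if not any(strictly_less(q, p) for q in points)]


def slater_maximal(points):
    """Points for which no other point is strictly larger in all components."""
    return [p for p in points if not any(strictly_less(p, q) for q in points)]


def slater_maximins(outcomes):
    """Two-stage construction: union of Slater minima, then its Slater maxima."""
    union = []
    for payoff_vectors in outcomes.values():
        union.extend(slater_minimal(payoff_vectors))
    return sorted(set(slater_maximal(union)))


# Hypothetical two-criteria data for three strategies.
outcomes = {
    "V1": [(1, 3), (3, 1), (3, 3)],
    "V2": [(2, 2), (4, 4)],
    "V3": [(0, 0), (5, 5)],
}
result = slater_maximins(outcomes)
```

In this data the point (3, 3) survives stage one because Slater minimality only excludes points that are beaten strictly in every component.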
The key finding of this chapter is the following proposition, which establishes the existence of a Slater-maximin strategy in (1.1) under constraints common for positional differential games.

Theorem 2.1. If for the differential game (1.1) Conditions 1.1 (Chapter 1) and 1.1 (Chapter 5) and the saddle point condition (3.2) from Chapter 1 for a small game are satisfied, then for any choice of initial position $(t_0, x_0)$ there exists a Slater-maximin strategy in the game (1.1).

Proof. Let $(t_0, x_0)$ be an arbitrary initial position from the set $[0, \vartheta) \times \mathbb{R}^n$. By Lemma 2.4, the set $Y_S[\mathcal{V}]$ is a compactum which is a union (over $V \in \mathcal{V}$) of the sets of Slater-minimal solutions $X_{(S)}[V]$ of problems (2.5). For a compact set of solutions $Y_S[\mathcal{V}]$ and criteria $F_i(x)$, $i \in N$, continuous on $\mathbb{R}^n$, there always exists in the multicriterial problem (2.9) [70, p. 142] a Slater-maximal solution $\hat{x}_S$. Then, by Lemma 2.5, there is a strategy $V^{(S)} \in \mathcal{V}$ (with $\hat{x}_S = \hat{x}_S[V^{(S)}]$) which is Slater-maximin in (1.1) with initial position $(t_0, x_0)$.
Remark 2.1. In a similar way the following proposition may be proved: If Conditions 1.1 (Chapter 1) and 1.1 (Chapter 5), and (3.2) from Chapter 1 are satisfied in the differential game (1.1), then for any choice of initial position $(t_0, x_0) \in [0, \vartheta) \times \mathbb{R}^n$ there exists a Slater-minimax strategy $U^{(S)} \in \mathcal{U}$.

Remark 2.2. Unless otherwise stipulated, the requirements of Theorem 2.1 are assumed henceforth to be satisfied. In particular, Condition 1.1 (Chapter 1) ensures the existence of quasimotions of the system (1.8) in Chapter 5, Condition 1.1 in Chapter 5 requires the continuity of the components of $F(x)$, while the saddle point condition of a small game ensures (Theorem 3.2, Chapter 1) the existence of a saddle point in the zero-sum positional differential game (3.8) in Chapter 1 with scalar payoff function.

2.2. Topological Properties of Slater Maximin
Let $\mathcal{F}^{(S)}$ denote the set of Slater maximins of the game (1.1) with initial position $(t_0, x_0)$, while

$$\hat{X}_{(S)}[\mathcal{V}^{(S)}] = \bigcup_{V^{(S)} \in \mathcal{V}^{(S)}} \hat{X}_{(S)}[V^{(S)}], \tag{2.10}$$

where $\hat{X}_{(S)}[V^{(S)}]$ is the set of all values of $\hat{x}_S[V^{(S)}]$ associated (by Definition 1.1) with the Slater-maximin strategy $V^{(S)}$ (whose set was denoted in subsection 1.1 as $\mathcal{V}^{(S)}$). Then in the notation of subsection 1.1,

$$\mathcal{F}^{(S)} = F(\hat{X}_{(S)}[\mathcal{V}^{(S)}]) = \bigcup_{V^{(S)} \in \mathcal{V}^{(S)}} F(\hat{X}_{(S)}[V^{(S)}]) = \max^S \bigcup_{V \in \mathcal{V}} \min^S_{x[\cdot]} F(x[\vartheta, t_0, x_0, V]). \tag{2.11}$$

The topological properties of the sets $\mathcal{F}^{(S)}$ and $\hat{X}_{(S)}[\mathcal{V}^{(S)}]$ are formulated in the next propositions.

Proposition 2.1. The set $\hat{X}_{(S)}[\mathcal{V}^{(S)}]$ is a compact subset of the reachability domain $X$.

Indeed, since in the static multicriterial problem (2.9) the set $Y_S[\mathcal{V}]$ is a compactum in $X$ while the vector-valued criterion $F(x)$ is continuous on $X$, the set of Slater-maximal solutions of this problem is a nonempty compactum [70, p. 142]. By Lemma 2.5, this compactum coincides with the set defined in (2.10). Consequently, the set $\hat{X}_{(S)}[\mathcal{V}^{(S)}]$ is a nonempty compactum in $\mathbb{R}^n$.

Proposition 2.2. The set $\mathcal{F}^{(S)}$ of Slater maximins of (1.1) is a nonempty compact subset of the space $\mathbb{R}^N$.

The proposition follows from the compactness of $\hat{X}_{(S)}[\mathcal{V}^{(S)}]$ (Proposition 2.1) and the continuity of $F(x)$, since under a continuous mapping $F: \mathbb{R}^n \to \mathbb{R}^N$ a compactum is mapped into a compactum.
Remark 2.3. Similarly, the set of Slater minimaxes of (1.1) is proved to be a nonempty compactum in $\mathbb{R}^N$.

Remark 2.4. The results so far in this section lead to numerous propositions on the A-maximin of (1.1). Indeed, Definitions 1.1 and 1.4 differ only in that, in formalizing the A-maximin strategy $V^{(A)}$, the vector-valued payoff function $F(x[\vartheta])$ of the differential game in Definition 1.1 is replaced by $AF(x[\vartheta])$, where $A$ is a specified $(N \times N)$-dimensional matrix with positive elements. Therefore, by Theorem 2.1 and Proposition 2.2, the following propositions hold for the notions of an A-maximin strategy and an A-maximin thus introduced when the requirements of Theorem 2.1 hold:

(1) in the game (1.1) for any choice of $(t_0, x_0) \in [0, \vartheta) \times \mathbb{R}^n$ there exist an A-maximin and an A-minimax strategy;

(2) the sets of A-maximins and A-minimaxes are nonempty compacta in $\mathbb{R}^N$.

The existence of a Pareto maximin is discussed in section 3.
2.3. Stability of the Set of Slater Maximins

To investigate the stability properties of the Slater maximin for the differential game (1.1), the constraints of Theorem 2.1 are assumed to hold. Let us consider various properties of the set (2.11) of Slater maximins.

Proposition 2.3. The set $\mathcal{F}^{(S)}$ of Slater maximins of (1.1) is internally stable in that, for any two Slater maximins $F^{(j)} \in \mathcal{F}^{(S)}$ ($j = 1, 2$),

$$F^{(1)} \not< F^{(2)}.$$

Proof. Indeed, following Definition 1.1 of the Slater maximin $F^{(j)}$ there exist a strategy $V^{(j)} \in \mathcal{V}$ and a right end of the quasimotion $\hat{x}_S[V^{(j)}] \in X_{(S)}[V^{(j)}]$ such that

$$F^{(j)} = F(\hat{x}_S[V^{(j)}]) \not< F(x_S[V])$$

for every $V \in \mathcal{V}$ and $x_S[V] \in X_{(S)}[V]$. Then $F^{(j)} = F(\hat{x}_S[V^{(j)}])$, $j = 1, 2$. Assuming that $V = V^{(2)}$ and $x_S[V] = \hat{x}_S[V^{(2)}]$, we have

$$F^{(1)} = F(\hat{x}_S[V^{(1)}]) \not< F(\hat{x}_S[V^{(2)}]) = F^{(2)},$$

which proves Proposition 2.3.

By using $Y_S[\mathcal{V}]$ of (2.8), a union (over all $V \in \mathcal{V}$) of the sets $X_{(S)}[V]$ of Slater-minimal solutions of the static problem (2.5), the auxiliary set of all values of the vector-valued payoff function is obtained:

$$F(Y_S[\mathcal{V}]) = \bigcup_{V \in \mathcal{V}} F(X_{(S)}[V]) = \bigcup_{V \in \mathcal{V}} \min^S_{x[\cdot]} F(x[\vartheta, t_0, x_0, V]). \tag{2.12}$$
Proposition 2.4. The set $\mathcal{F}^{(S)}$ of Slater maximins is externally stable with respect to the set $F(Y_S[\mathcal{V}])$ of (2.12) in that, for every $N$-vector $F \in \bigcup_{V \in \mathcal{V}} \min^S_{x[\cdot]} F(x[\vartheta, t_0, x_0, V])$ there is a Slater maximin $F^{(S)} \in \mathcal{F}^{(S)}$ of its own such that

$$F \leqq F^{(S)}.$$

Proof. Indeed, by Lemma 2.4 the set $Y_S[\mathcal{V}]$ of (2.8) is a compactum in $\mathbb{R}^n$ and $\mathcal{F}^{(S)}$ is formed by the Slater-maximal points of the set $F(Y_S[\mathcal{V}])$ (Lemma 2.5 and (2.12)). Furthermore, from the compactness of $Y_S[\mathcal{V}]$ and the continuity of $F(x)$ it follows that $F(Y_S[\mathcal{V}])$ is compact. External stability in this case has been established in [70, p. 158; 35].

Remark 2.5. The set of Slater minimaxes is also externally stable with respect to the set

$$\bigcup_{U \in \mathcal{U}} \max^S_{x[\cdot]} F(x[\vartheta, t_0, x_0, U]).$$

This property implies that, for any vector $F \in \bigcup_{U \in \mathcal{U}} \max^S_{x[\cdot]} F(x[\vartheta, t_0, x_0, U])$, there is a Slater minimax $F_{(S)}$ such that $F \geqq F_{(S)}$.

Proposition 2.5. If

(1) the strategy $V^{(S)}$ is Slater-maximin in the game (1.1) with initial position $(t_0, x_0)$;

(2) $V^{(S)} \in \tilde{\mathcal{V}}$;

(3) $\tilde{\mathcal{V}} \subset \mathcal{V}$,

then $V^{(S)}$ remains Slater-maximin in the game

$$\langle \{1, 2\}, \tilde{\Sigma}, \{\mathcal{U}, \tilde{\mathcal{V}}\}, F(x[\vartheta]) \rangle \tag{2.13}$$

for the same initial position.
The game (2.13) differs from (1.1) only by the fact that the set of feasible values of control actions by the second player is such that $\tilde{Q} \subset Q$. The property formulated in Proposition 2.5 is naturally referred to as inheritance.
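Inheritance can be illustrated on a finite toy model. In the sketch below (an illustration under assumptions, not the construction of the text) every strategy carries a finite list of payoff vectors, restricting the strategy set leaves the lists of the retained strategies unchanged, and the names `maximin_strategies` and `restrict` are hypothetical.

```python
def strictly_less(a, b):
    """True if every component of a is strictly below the matching one of b."""
    return all(x < y for x, y in zip(a, b))


def slater_minimal(points):
    """Points for which no other point is strictly smaller in all components."""
    return [p for p in points if not any(strictly_less(q, p) for q in points)]


def maximin_strategies(outcomes):
    """Strategies having a Slater minimum that is Slater-maximal in the pool."""
    minima = {v: slater_minimal(pts) for v, pts in outcomes.items()}
    pool = [p for pts in minima.values() for p in pts]
    return {
        v
        for v, pts in minima.items()
        if any(not any(strictly_less(p, q) for q in pool) for p in pts)
    }


def restrict(outcomes, kept):
    """Smaller game: only the strategies in `kept`, with unchanged outcomes."""
    return {v: outcomes[v] for v in kept}


# Hypothetical two-criteria data.
full = {
    "V1": [(1, 4), (2, 2), (3, 3)],
    "V2": [(0, 5), (4, 1), (5, 5)],
    "V3": [(1, 1), (2, 3)],
}
smaller = restrict(full, ["V1", "V3"])
inherited = "V1" in maximin_strategies(full) and "V1" in maximin_strategies(smaller)
```

Removing strategies only shrinks the pool against which a candidate is compared, which is why a retained maximin strategy stays maximin in this toy setting.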
Proof. From the definition of quasimotions (subsection 2.3, Chapter 1) it follows that, for $\tilde{Q} \subset Q$ and any strategy $V \in \tilde{\mathcal{V}} \subset \mathcal{V}$,

$$\tilde{X}[V] = \tilde{X}[\vartheta, t_0, x_0, V] = X[\vartheta, t_0, x_0, V],$$

where $\tilde{X}[V] = \bigcup_{x[\cdot]} x[\vartheta, t_0, x_0, V]$ is a cross-section of the bundle $\tilde{\mathcal{X}}[t_0, x_0, V]$ of quasimotions $x[\cdot, t_0, x_0, V]$ of the system (1.8) in Chapter 5, where $u(t) \in H$. Therefore, for all the strategies $V \in \tilde{\mathcal{V}} \subset \mathcal{V}$ the sets of Slater-minimal solutions of the problems $\langle \tilde{X}[V], F(x) \rangle$ and (2.5), respectively, coincide, that is,

$$\tilde{X}_{(S)}[V] = X_{(S)}[V]. \tag{2.14}$$

If $V^{(S)} \in \mathcal{V}$ is the Slater-maximin strategy of (1.1), by Definition 1.1 there exists a point $\hat{x}_S[V^{(S)}] \in X_{(S)}[V^{(S)}]$ such that

$$F(\hat{x}_S[V^{(S)}]) \not< F(x_S[V]) \tag{2.15}$$

for every $V \in \mathcal{V}$ and $x_S[V] \in X_{(S)}[V]$. Then, by (2.14), relations (2.15) also hold for $V \in \tilde{\mathcal{V}} \subset \mathcal{V}$. Therefore, $V^{(S)}$ is Slater-maximin in the game (2.13) for the same initial position.

Remark 2.5. The sets of Pareto-, Geoffrion-, and A-maximins, $\mathcal{F}^{(P)}$, $\mathcal{F}^{(G)}$, and $\mathcal{F}^{(A)}$, respectively, are also internally stable. Specifically, for any $F^{(j)} \in \mathcal{F}^{(K)}$ ($j = 1, 2$; $K = P, G, A$),

$$F^{(1)} \not\geq F^{(2)} \quad \text{(Pareto maximins)},$$

$$F^{(1)} \not\geq_G F^{(2)} \quad \text{(Geoffrion maximins)},$$

$$AF^{(1)} \not> AF^{(2)} \quad \text{(A-maximins)}.$$

This may be established as in the proof of Proposition 2.3 using Definitions 1.2-1.4 of vector-valued maximin strategies as appropriate.

To gain a deeper understanding of external stability for the set of A-maximins, we associate every strategy $V \in \mathcal{V}$ with the set $X_{(A)}[V]$ of A-minima in the multicriterial problem (2.5),
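$$X_{(A)}[V] = \{x_A[V] \in X[V] \mid AF(x_A[V]) \not> AF(x[V]),\ \forall x[V] \in X[V]\}.$$

This set is a compactum (Lemma 2.3). The union of these sets

$$Y_A[\mathcal{V}] = \bigcup_{X[V] \in \mathcal{H}} X_{(A)}[V] = \bigcup_{V \in \mathcal{V}} X_{(A)}[V]$$

is also closed and bounded in $\mathbb{R}^n$. Finally, consider the set $\mathcal{F}^{(A)}$ of A-maximins and the set

$$F(Y_A[\mathcal{V}]) = \bigcup_{V \in \mathcal{V}} \min^A_{x[\cdot]} F(x[\vartheta, t_0, x_0, V]).$$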
Proposition 2.6. The set $\mathcal{F}^{(A)}$ of A-maximins is externally stable with respect to the set $F(Y_A[\mathcal{V}])$, that is, for any vector

$$F \in \bigcup_{V \in \mathcal{V}} \min^A_{x[\cdot]} F(x[\vartheta, t_0, x_0, V]),$$

there is an A-maximin $F^{(A)} \in \mathcal{F}^{(A)}$ such that $AF \leqq AF^{(A)}$.

This proposition is a corollary of Proposition 2.4.
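Both stability properties of this subsection can be observed on a finite sample. The sketch below is an assumption-laden illustration: `sample` is a hypothetical finite stand-in for the set $F(Y_S[\mathcal{V}])$ of (2.12), external stability is read as the componentwise inequality $F \leqq F^{(S)}$, and the function names are not from the text.

```python
import random


def strictly_less(a, b):
    """True if every component of a is strictly below the matching one of b."""
    return all(x < y for x, y in zip(a, b))


def weakly_less(a, b):
    """True if every component of a is at most the matching one of b."""
    return all(x <= y for x, y in zip(a, b))


def slater_maximal(points):
    """Points for which no other point is strictly larger in all components."""
    return [p for p in points if not any(strictly_less(p, q) for q in points)]


def internally_stable(front):
    """No member of the front is strictly below another member."""
    return not any(strictly_less(p, q) for p in front for q in front)


def externally_stable(points, front):
    """Every point is componentwise bounded above by some member of the front."""
    return all(any(weakly_less(p, m) for m in front) for p in points)


# Hypothetical sample of payoff vectors (seeded, so the run is reproducible).
rng = random.Random(0)
sample = [(rng.randint(0, 9), rng.randint(0, 9)) for _ in range(40)]
front = slater_maximal(sample)
checks = (internally_stable(front), externally_stable(sample, front))
```

For a finite set both checks succeed because any point that is not Slater-maximal is strictly dominated, and repeating that step must end at a Slater-maximal point; the compact case in the text relies on [70] instead.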
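2.4. Relation between Maximins and Vector-Valued Saddle Points

The structure of vector-valued saddle points, maximins, and minimaxes is established by the next proposition.

Proposition 2.7. For any saddle points $(U^K, V^K) \in \mathcal{U} \times \mathcal{V}$, minimaxes

$$\min^K \bigcup_{U \in \mathcal{U}} \max^K_{x[\cdot]} F(x[\vartheta, t_0, x_0, U])$$

and maximins

$$\max^K \bigcup_{V \in \mathcal{V}} \min^K_{x[\cdot]} F(x[\vartheta, t_0, x_0, V]),$$

the following relations hold:

with $K = S$,

$$\min^S \bigcup_{U \in \mathcal{U}} \max^S_{x[\cdot]} F(x[\vartheta, t_0, x_0, U]) \not> F(x[\vartheta, t_0, x_0, U^S, V^S]) \not> \max^S \bigcup_{V \in \mathcal{V}} \min^S_{x[\cdot]} F(x[\vartheta, t_0, x_0, V]); \tag{2.16}$$

with $K = P$,

$$\min^P \bigcup_{U \in \mathcal{U}} \max^P_{x[\cdot]} F(x[\vartheta, t_0, x_0, U]) \not\geq F(x[\vartheta, t_0, x_0, U^P, V^P]) \not\geq \max^P \bigcup_{V \in \mathcal{V}} \min^P_{x[\cdot]} F(x[\vartheta, t_0, x_0, V]);$$

with $K = G$,

$$\min^G \bigcup_{U \in \mathcal{U}} \max^G_{x[\cdot]} F(x[\vartheta, t_0, x_0, U]) \not\geq_G F(x[\vartheta, t_0, x_0, U^G, V^G]) \not\geq_G \max^G \bigcup_{V \in \mathcal{V}} \min^G_{x[\cdot]} F(x[\vartheta, t_0, x_0, V]);$$

with $K = A$,

$$A \min^A \bigcup_{U \in \mathcal{U}} \max^A_{x[\cdot]} F(x[\vartheta, t_0, x_0, U]) \not> A F(x[\vartheta, t_0, x_0, U^A, V^A]) \not> A \max^A \bigcup_{V \in \mathcal{V}} \min^A_{x[\cdot]} F(x[\vartheta, t_0, x_0, V])$$

for any quasimotions $x[\cdot, t_0, x_0, U^K, V^K]$ ($K = S, P, G, A$).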
Proof. Let us prove the right-hand side of (2.16); the remaining relations may be established in a similar way. We assume the contrary, that is, that there exist a Slater saddle point $(U^S, V^S)$ of the game (1.1), a quasimotion $x^S[\cdot] = x^S[\cdot, t_0, x_0, U^S, V^S]$, and a maximin strategy $V^{(S)}$ and associated point $\hat{x}_S[V^{(S)}]$ such that

$$F(x^S[\vartheta]) > F(\hat{x}_S[V^{(S)}]).$$

By Definition 1.1 in Chapter 5 of a Slater saddle point,

$$F(x^S[\vartheta]) \not> F(x[\vartheta, t_0, x_0, V^S])$$

for all quasimotions $x[\cdot, t_0, x_0, V^S]$, thus $x^S[\vartheta] = x^S[\vartheta, t_0, x_0, U^S, V^S] \in X_{(S)}[V^S]$, i.e., the set of Slater-minimal solutions of the problem $\langle X[V^S], F(x) \rangle$. Consequently, for a Slater-maximin strategy $V^{(S)}$ and associated point $\hat{x}_S[V^{(S)}]$ there is a strategy $V^S$ and point $x^S[\vartheta] \in X_{(S)}[V^S]$ such that

$$F(\hat{x}_S[V^{(S)}]) < F(x^S[\vartheta]),$$

which contradicts Definition 1.1 of a Slater-maximin strategy $V^{(S)}$ of (1.1). This proves the proposition.
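The right-hand relation of (2.16) can be tried out on a finite static analogue. The sketch below rests on assumptions that are not in the text: the first (minimizing) player picks a row, the second (maximizing) player picks a column, every pair gives one payoff vector, and a pair is treated as a Slater saddle point when its payoff is Slater-minimal in its column and Slater-maximal in its row. All names (`payoff`, `slater_saddles`, `slater_maximins`) are hypothetical.

```python
def strictly_less(a, b):
    """True if every component of a is strictly below the matching one of b."""
    return all(x < y for x, y in zip(a, b))


def slater_minimal(points):
    return [p for p in points if not any(strictly_less(q, p) for q in points)]


def slater_maximal(points):
    return [p for p in points if not any(strictly_less(p, q) for q in points)]


def slater_saddles(payoff, rows, cols):
    """Pairs whose payoff neither player can strictly improve unilaterally."""
    found = []
    for u in rows:
        for v in cols:
            f = payoff[(u, v)]
            column = [payoff[(k, v)] for k in rows]  # replies of the minimizer
            row = [payoff[(u, l)] for l in cols]     # replies of the maximizer
            if f in slater_minimal(column) and f in slater_maximal(row):
                found.append((u, v))
    return found


def slater_maximins(payoff, rows, cols):
    """Slater-maximal points of the union of column-wise Slater minima."""
    union = []
    for v in cols:
        union.extend(slater_minimal([payoff[(u, v)] for u in rows]))
    return sorted(set(slater_maximal(union)))


# Hypothetical 2 x 2 game with two criteria.
payoff = {(0, 0): (1, 1), (0, 1): (3, 3), (1, 0): (2, 4), (1, 1): (4, 2)}
rows, cols = [0, 1], [0, 1]
saddles = slater_saddles(payoff, rows, cols)
maximins = slater_maximins(payoff, rows, cols)
# No maximin is strictly below a saddle-point payoff.
holds = all(not strictly_less(m, payoff[s]) for s in saddles for m in maximins)
```

In this analogue a saddle payoff is itself a column-wise Slater minimum, so it belongs to the union from which the maximins are selected, mirroring the argument of the proof.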
A geometric interpretation of Proposition 2.7 in the case of a Slater optimum is shown in Fig. 6.2.1 for $N = \{1, 2\}$. Let $F(\mathcal{S})$ be the set of values of the vector-valued payoff function $F(x[\vartheta])$ reached on all Slater saddle points $(U^S, V^S) \in \mathcal{S}$ and any quasimotions $x[\cdot, t_0, x_0, U^S, V^S]$, that is,

$$F(\mathcal{S}) = \bigcup_{(U^S, V^S) \in \mathcal{S}} F(X[\vartheta, t_0, x_0, U^S, V^S]) = \bigcup_{(U^S, V^S) \in \mathcal{S}} \ \bigcup_{x \in X[\vartheta, t_0, x_0, U^S, V^S]} F(x).$$
Figure 6.2.1.
Also let $\mathrm{Fr}^{(S)} F(\mathcal{S})$ ($\mathrm{Fr}_{(S)} F(\mathcal{S})$) be the set of Slater-maximal (minimal) points of the set $F(\mathcal{S})$; $\mathbb{R}^2_+$ (respectively, $\mathbb{R}^2_-$) is the first (respectively, third) quadrant of the criterial plane $\{F_1, F_2\}$:

$$\mathbb{R}^2_+ = \{F = (F_1, F_2) \mid F_i \geq 0,\ i = 1, 2\}, \quad \mathbb{R}^2_- = \{F = (F_1, F_2) \mid F_i \leq 0,\ i = 1, 2\}.$$

It follows from Proposition 2.7 that the Slater maximins

$$\max^S \bigcup_{V \in \mathcal{V}} \min^S_{x[\cdot]} F(x[\vartheta, t_0, x_0, V])$$

cannot lie inside the domain $\mathrm{Fr}^{(S)} F(\mathcal{S}) + \mathbb{R}^2_-$ indicated in Fig. 6.2.1 by the horizontal lines, and the Slater minimaxes

$$\min^S \bigcup_{U \in \mathcal{U}} \max^S_{x[\cdot]} F(x[\vartheta, t_0, x_0, U])$$

cannot lie inside the domain $\mathrm{Fr}_{(S)} F(\mathcal{S}) + \mathbb{R}^2_+$ indicated in Fig. 6.2.1 by the vertical lines.
A similar geometric interpretation (with obvious amendments) is true of Pareto and Geoffrion minimaxes, maximins, and saddle points and in the case of an A-optimum.
Proposition 2.8. If the situation $(U^S, V^S) \in \mathcal{U} \times \mathcal{V}$ is a Slater saddle point of the differential game (1.1), and a Slater-minimax ($U^{(S)}$) and a Slater-maximin ($V^{(S)}$) strategy and associated points $\hat{x}_S[U^{(S)}]$ and $\hat{x}_S[V^{(S)}]$ exist such that for at least one quasimotion $\hat{x}[\cdot, t_0, x_0, U^S, V^S] = \hat{x}[U^S, V^S]$,

$$F(\hat{x}[U^S, V^S]) = F(\hat{x}_S[U^{(S)}]) = F(\hat{x}_S[V^{(S)}]), \tag{2.17}$$

then the strategy $U^S$ is Slater-minimax and $V^S$ is Slater-maximin, while the associated (Definition 1.1) points $\hat{x}[U^S, V^S]$ coincide.
Proof. We will prove the assertion for $V^S$ (the case of $U^S$ is similar). From Definition 1.1 in Chapter 5, it follows that
\[ \hat{x}[U^S, V^S] \in X_{(S)}[V^S], \tag{2.18} \]
where $X_{(S)}[V^S]$ is the set of Slater-minimal solutions $x_S[V^S]$ of the problem $(X[V^S], \mathbb{F}(x))$. Indeed, by virtue of the right-hand relation in (1.9) of Chapter 5 with $x[U^S, V^S] = \hat{x}[U^S, V^S]$,
\[ \mathbb{F}(\hat{x}[\vartheta, t_0, x_0, U^S, V^S]) \not> \mathbb{F}(x[\vartheta, t_0, x_0, V^S]) \]
for all quasimotions $x[\cdot, t_0, x_0, V^S]$, whence follows the inclusion (2.18). From (2.17) and Definition 1.1 of a Slater-maximin strategy $V^{(S)}$, we have
\[ \mathbb{F}(\hat{x}[U^S, V^S]) = \mathbb{F}(\hat{x}_S[V^{(S)}]) \not< \mathbb{F}(x_S[V]) \]
for every $V \in \mathcal{V}$ and $x_S[V] \in X_{(S)}[V]$. Hence, from (2.18) it follows that the second component $V^S$ of the Slater saddle point $(U^S, V^S)$ is a Slater-maximin strategy of the game (1.1). This proves the proposition.

The proposition may be proved for the Pareto-, Geoffrion-, and A-optima in a similar way. Specifically, the following proposition holds.

If the situation $(U^K, V^K)$ is a vector ($K = P, G, A$) saddle point of (1.1) and there exist a vector-valued maximin strategy $V^{(K)}$ and a vector-valued minimax strategy $U^{(K)}$ (with associated points $\hat{x}_K[V^{(K)}]$ and $\hat{x}_K[U^{(K)}]$) such that for some quasimotion
\[ \hat{x}^{(K)}[\cdot] \in \mathcal{X}[t_0, x_0, U^K, V^K] \]
the following equality holds:
\[ \mathbb{F}(\hat{x}^{(K)}[\vartheta]) = \mathbb{F}(\hat{x}_K[U^{(K)}]) = \mathbb{F}(\hat{x}_K[V^{(K)}]), \]
then the first component $U^K$ of the vector saddle point is a vector-valued minimax strategy and $V^K$ a vector-valued maximin strategy of (1.1), while their associated point is $\hat{x}^{(K)}[\vartheta]$ ($K = P, G, A$), by Definition 1.1.
2.5. Vector-Valued Maximin as Solution of a Differential Game
Examples 4.2 and 4.3 in Chapter 5 show that the use of the vector saddle point as a solution of a differential zero-sum game with vector-valued payoff function is highly problematic for the following reasons.

(1) The set of vector saddle points is not internally stable. Specifically, two vector saddle points may exist such that at one of them the values of all components of the payoff function are strictly larger than those of the associated components at the other (cf. Example 4.2, Chapter 5). In this case the maximizing player would use the saddle point at which the values of all the components of the payoff function are larger and the other player would use the saddle point at which these values are smaller. Therefore, it is "beneficial" for the players to use different saddle points, each of which "yields" a different payoff. (This represents the nonequivalence of vector-valued saddle points.) Because the game is zero-sum, a common saddle point cannot be negotiated. Then, which saddle point will be the solution of the game? A discussion based on saddle points will not be able to answer this question.

(2) If each player uses the component of the saddle point which is "good for him", i.e., "his own" component, the resultant situation (pair of strategies) is not necessarily a saddle point (noninterchangeability of vector-valued saddle points; cf. Example 4.3, Chapter 5). In this case, although each player is interested in his own saddle point, in this kind of game such a solution (vector-valued saddle point) is not obtained.

(3) Even though using his own strategy from the vector saddle point ensures a certain guaranteed result for the player (the vector-valued guarantee of Remark 1.3, Chapter 5), this guarantee is not necessarily "the best" (e.g., for the maximizing player it is not the Slater-largest). In looking for the "best" vector-valued guarantee the entire set of vector saddle points has to be obtained. This can be very difficult, since finding even one saddle point of a zero-sum differential positional game with a scalar payoff function may be highly time-consuming, if feasible at all.

These three features make the saddle point less advantageous as a solution of a differential game. The new solution, the vector-valued maximin, has numerous advantages over the vector-valued saddle point. In particular:

(a) Unlike the set of Slater saddle points, the set $\mathcal{F}^{(S)}$ of Slater maximins is internally stable (Proposition 2.3), in that for any two maximins $\mathbb{F}^{(1)}$ and $\mathbb{F}^{(2)}$ in the set, $\mathbb{F}^{(1)} \not< \mathbb{F}^{(2)}$. From the viewpoint of the minimizing player neither Slater maximin is more advantageous than the other, so he is free to use either.

(b) None of the vector-valued guarantees $\mathbb{F}(x[\vartheta, t_0, x_0, U^S, V^S])$ made possible by any Slater saddle point $(U^S, V^S)$ may exceed (in the Slater sense) any maximin (Proposition 2.7), that is,
\[ \mathbb{F}(x[\vartheta, t_0, x_0, U^S, V^S]) \not> \mathbb{F}, \quad \forall\, \mathbb{F} \in \mathcal{F}^{(S)}. \tag{2.19} \]
Consequently, the Slater-maximin strategy $V^{(S)}$ yields a Slater-maximal vector-valued guarantee $\mathbb{F}(\hat{x}_S[V^{(S)}])$ among the guarantees $\mathbb{F}(x[\vartheta, t_0, x_0, U^S, V^S])$ provided, by Remark 1.3 in Chapter 5, when the strategies $V^S$ from the saddle points $(U^S, V^S)$ are used. The relation (2.19) may be viewed as asserting the "nonimprovability" of vector-valued guarantees provided by the Slater-maximin strategies $V^{(S)} \in \mathcal{V}^{(S)}$ over those provided by Slater saddle points.

Note that there is no question of interchangeability for Slater-maximin strategies $V^{(S)} \in \mathcal{V}^{(S)}$, since any two distinct strategies yield, generally speaking, distinct Slater maximins; here we remain within the framework of the solution concept. Similar arguments hold for the Slater minimax. Vector-valued maximin (and minimax) strategies used as solutions of the differential game thus do not have the disadvantages (1)-(3) of vector-valued saddle points.

Now let us proceed to the game-theoretic essence of vector-valued maximins (and minimaxes). From general game theory [89], in zero-sum games with scalar payoff function the maximin strategy is the solution from the viewpoint of the maximizing player. In anticipation of the strongest possible resistance offered by the opponent, this strategy provides the largest of all "poor" (minimal) gains for the player, a maximin gain being guaranteed for any behavior of the other player. This is also true of zero-sum games with vector-valued payoff function if, for instance, a Slater-maximin strategy is used. On the one hand, this provides the (Slater-)maximal of all the "worst" (Slater-minimal) vector-valued guarantees. On the other hand, its application ensures, given any behavior of the opponent, a value of the payoff function at least (Slater-)equal to the associated maximin.

By general game theory [89], an antagonistic game with scalar payoff function and without a saddle point is solved from the viewpoint of each player. For the maximizing player this implies the use of the maximin strategy and, for the minimizing player, of the minimax strategy. This is also true of games with vector-valued payoff function. Since in such games the vector-valued saddle point does not work, it would be natural to solve the game from the viewpoint of each player. For the maximizing player this implies the application of the vector-valued maximin strategy and, for the minimizing player, of the associated minimax strategy. Consequently, in view of both the disadvantages of vector-valued saddle points and general game theory, the vector-valued maximin and minimax are the most acceptable solutions of antagonistic games with vector-valued payoff function. Note that Theorem 2.1 establishes the existence of Slater-maximin (and -minimax) strategies in a differential positional game with constraints, as is usual for such games.

Finally, the "best" solution of antagonistic games with scalar payoff function is, according to general game theory, the saddle point. It is obtained when the maximin and minimax of the game coincide, and the saddle point itself is formed by the minimax and maximin strategies.
Following this approach, we propose that the vector-valued saddle point at which the value of the payoff function coincides with the associated vector-valued maximin and minimax represents the "best" solution of an antagonistic game with vector-valued payoff function. This is referred to as the Z-solution; the reader is free to offer other solutions and denote them by the remaining 25 letters of the Roman alphabet.

Definition 2.1. The situation $(U^{ZK}, V^{ZK})$ will be referred to as the ZK-solution of the differential game (2.1) with an initial position $(t_0, x_0)$ if

(1) the pair $(U^{ZK}, V^{ZK})$ is a vector-valued saddle point;

(2) there exists a quasimotion $\hat{x}[\cdot, t_0, x_0, U^{ZK}, V^{ZK}]$ such that
\[ \mathbb{F}(\hat{x}[\vartheta, t_0, x_0, U^{ZK}, V^{ZK}]) = \max^K_{V \in \mathcal{V}} \min^K_{x[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, V^{ZK}]) = \min^K_{U \in \mathcal{U}} \max^K_{x[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, U^{ZK}]). \tag{2.20} \]
Consequently, four kinds of Z-solutions are introduced. Note that, by Proposition 2.8, the strategy $U^{ZK}$ of the Z-solution is the associated vector-valued minimax strategy and $V^{ZK}$ the maximin strategy. Thus, with $K = S$ and the pair $(U^{ZS}, V^{ZS})$ a Slater saddle point at which the value of the vector-valued payoff function coincides with the Slater maximin and minimax, the situation $(U^{ZS}, V^{ZS})$ is referred to as the ZS-solution. The next chapter establishes the existence of the ZS-solution and supplies a straightforward procedure for obtaining a ZS-solution in the dynamic variety of the competition model, while the last chapter establishes the existence of the ZS-solution and a method of constructing the ZP-solution in a differential pursuit game.

What, then, are the advantages of Z-solutions? In answering this question we will limit ourselves to ZS-solutions. With appropriate changes the answers are similar for the other ZK-solutions.

First, the set of ZS-solutions is internally stable, that is, for any two ZS-solutions $(U^{(1)}, V^{(1)})$ and $(U^{(2)}, V^{(2)})$,
\[ \mathbb{F}(\hat{x}[\vartheta, t_0, x_0, U^{(1)}, V^{(1)}]) \not< \mathbb{F}(\hat{x}[\vartheta, t_0, x_0, U^{(2)}, V^{(2)}]). \]

Second, the value of the payoff function $\mathbb{F}(\hat{x}[\vartheta, t_0, x_0, U^{ZS}, V^{ZS}])$ on the ZS-solution $(U^{ZS}, V^{ZS})$ is simultaneously (Slater-)maximal and minimal for the set of values of this function reached on all Slater saddle points, that is,
\[ \mathbb{F}(\hat{x}[\vartheta, t_0, x_0, U^{ZS}, V^{ZS}]) = \max^S_{(U, V) \in \mathcal{S}} \max^S_{x[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, U, V]) = \min^S_{(U, V) \in \mathcal{S}} \min^S_{x[\cdot]} \mathbb{F}(x[\vartheta, t_0, x_0, U, V]). \tag{2.21} \]
Put differently, the value of $\mathbb{F}(\hat{x}[\vartheta, t_0, x_0, U^{ZS}, V^{ZS}])$ cannot be increased or decreased (in the Slater sense) through the use of any Slater saddle point. Note that (2.21) follows immediately from (2.16) and (2.20). This property will be referred to as the "non-improvability" of Z-situations through the use of Slater saddle points.
Third, if the maximizing player uses the strategy $V^{ZS}$ of the ZS-solution $(U^{ZS}, V^{ZS})$, the (Slater-)maximal vector-valued guarantee is secured for him that, by (2.20), coincides with $\mathbb{F}(\hat{x}[\vartheta, t_0, x_0, U^{ZS}, V^{ZS}])$. The minimizing player who uses $U^{ZS}$ secures for himself the least possible vector-valued guarantee which, also by (2.20), coincides with $\mathbb{F}(\hat{x}[\vartheta, t_0, x_0, U^{ZS}, V^{ZS}])$. Because the situation $(U^{ZS}, V^{ZS})$ is a Slater saddle point, in this situation the best possible vector-valued guarantees are secured for both players simultaneously and these guarantees are the same. Consequently, in the situation $(U^{ZS}, V^{ZS})$ the "best" vector-valued guarantees are obtained for the players and neither can expect any better because the other player opposes him. The ZS-solution thus has all the advantages of vector-valued maximins and minimaxes, and it seems to be the best solution for the differential positional antagonistic game (1.1).
3. Pareto $\varepsilon$-maximin

Numerical values of the solutions, such as saddle points, maximins, and minimaxes of games with vector-valued payoff functions are usually found with errors. This section will discuss some challenging features of vector $\varepsilon$-saddle points in static zero-sum games with a vector-valued payoff function. At the end of the section the existence of the Pareto $\varepsilon$-maximin in a differential game is established.
3.1. Formalization of $\varepsilon$-Saddle Points in Static Games

In an antagonistic (not differential) game with a vector-valued payoff function
\[ \langle X, Y, \mathbb{F}(x, y) \rangle, \tag{3.1} \]
it will be assumed that the sets $X \subset \mathbb{R}^h$ and $Y \subset \mathbb{R}^q$ are compacta, while the components of the vector-valued payoff function
\[ \mathbb{F}(x, y) = (F_1(x, y), \ldots, F_N(x, y)) \]
are continuous on $X \times Y$.
Definition 3.1. The situation $(x^s, y^s) \in X \times Y$ is referred to as the Slater saddle point of the game (3.1) if
\[ \mathbb{F}(x, y^s) \not< \mathbb{F}(x^s, y^s) \not< \mathbb{F}(x^s, y), \quad \forall x \in X,\ y \in Y. \]
The set of Slater saddle points of (3.1) will be denoted $\mathcal{S}^s$. Recall that in static games (3.1) the associated solutions are denoted by lower-case indices s, p, g, and a and in differential games, by upper-case indices S, P, G, and A.

The situation $(x^p, y^p) \in X \times Y$ is referred to as the Pareto saddle point of (3.1) if
\[ \mathbb{F}(x, y^p) \not\leq \mathbb{F}(x^p, y^p) \not\leq \mathbb{F}(x^p, y), \quad \forall x \in X,\ y \in Y. \]
The set of Pareto saddle points will be denoted $\mathcal{P}^s$.

The situation $(x^g, y^g) \in X \times Y$ is referred to as the Geoffrion saddle point if

(1) $(x^g, y^g)$ is the Pareto saddle point in this game;

(2) there exists a positive number $M > 0$ such that for every $\bar{y} \in Y$ and $i \in N = \{1, \ldots, N\}$ such that
\[ F_i(x^g, \bar{y}) > F_i(x^g, y^g), \]
there exists $j \in N$ such that
\[ F_j(x^g, \bar{y}) < F_j(x^g, y^g) \]
and with these $i$, $j$, and $\bar{y}$,
\[ F_i(x^g, \bar{y}) - F_i(x^g, y^g) \leq M [F_j(x^g, y^g) - F_j(x^g, \bar{y})]; \]
and for every $\bar{x} \in X$, $i \in N$, such that
\[ F_i(\bar{x}, y^g) < F_i(x^g, y^g), \]
there exists $j \in N$ such that
\[ F_j(\bar{x}, y^g) > F_j(x^g, y^g), \]
and for these $i$, $j$, and $\bar{x}$,
\[ F_i(x^g, y^g) - F_i(\bar{x}, y^g) \leq M [F_j(\bar{x}, y^g) - F_j(x^g, y^g)]. \]
The set of Geoffrion saddle points will be denoted $\mathcal{G}^s$.

Finally, the situation $(x^a, y^a) \in X \times Y$ is referred to as the A-saddle point of (3.1) if, for a fixed constant $(N \times N)$-matrix $A$ with positive elements,
\[ A\mathbb{F}(x^a, y) \not> A\mathbb{F}(x^a, y^a) \not> A\mathbb{F}(x, y^a), \quad \forall x \in X,\ y \in Y. \]
The set of A-saddle points will be denoted $\mathcal{A}^s$. From the above definitions and Theorem 3.1 of Chapter 4 there follows the chain of inclusions
\[ \mathcal{A}^s \subset \mathcal{G}^s \subset \mathcal{P}^s \subset \mathcal{S}^s. \tag{3.2} \]
Corollary 3.1. Let us formulate a definition of the vector-valued saddle points of (3.1) that would be equivalent to the above. For this purpose we introduce the sets
\[ X_s(y) = \{x^s(y) \in X \mid \mathbb{F}(x, y) \not< \mathbb{F}(x^s(y), y),\ \forall x \in X\}, \]
\[ Y_s(x) = \{y^s(x) \in Y \mid \mathbb{F}(x, y) \not> \mathbb{F}(x, y^s(x)),\ \forall y \in Y\}. \]
In a similar way,
\[ X_p(y) = \{x^p(y) \in X \mid \mathbb{F}(x, y) \not\leq \mathbb{F}(x^p(y), y),\ \forall x \in X\}, \]
\[ Y_p(x) = \{y^p(x) \in Y \mid \mathbb{F}(x, y) \not\geq \mathbb{F}(x, y^p(x)),\ \forall y \in Y\}. \]
Similar sets may also be introduced in the case of Geoffrion optimality, $X_g(y)$, $Y_g(x)$, and A-optimality, $X_a(y)$, $Y_a(x)$. The following proposition holds: The situation $(x^k, y^k) \in X \times Y$ is a k-saddle point if
\[ x^k \in X_k(y^k), \quad y^k \in Y_k(x^k), \quad k = a, g, p, s. \]
Thus the situation $(x^p, y^p)$ is a Pareto saddle point of (3.1) if $x^p \in X_p(y^p)$ and $y^p \in Y_p(x^p)$.
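For a finite game the saddle-point definitions can be checked by direct enumeration. The following Python sketch is illustrative only: the function names, the three-point grid, and the payoff $(x y,\ x + y)$ are assumptions made for the example. It assumes that $x$ minimizes and $y$ maximizes, and it reads the Pareto relation as "componentwise $\leq$ with at least one strict inequality".

```python
from itertools import product

def strictly_less(a, b):
    """Slater relation a < b: every component strictly smaller."""
    return all(p < q for p, q in zip(a, b))

def pareto_less(a, b):
    """Pareto relation: a <= b componentwise and a != b."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))

def saddle_points(F, X, Y, less):
    """All (xs, ys) such that no x makes F(x, ys) 'less' than F(xs, ys)
    and no y makes F(xs, y) 'greater' than F(xs, ys)."""
    found = []
    for xs, ys in product(X, Y):
        f0 = F(xs, ys)
        x_ok = not any(less(F(x, ys), f0) for x in X)
        y_ok = not any(less(f0, F(xs, y)) for y in Y)
        if x_ok and y_ok:
            found.append((xs, ys))
    return found

# Hypothetical finite game: payoff (x*y, x + y) on a three-point grid.
def payoff(x, y):
    return (x * y, x + y)

grid = [0.0, 0.5, 1.0]
slater = saddle_points(payoff, grid, grid, strictly_less)
pareto = saddle_points(payoff, grid, grid, pareto_less)
```

On this grid every Pareto saddle point found is also a Slater saddle point, while the converse fails, which is the kind of strict inclusion one would expect between the two sets.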
One of the reasons why Slater, Pareto, and Geoffrion $\varepsilon$-saddle points are introduced is that in (3.1) the associated saddle points do not necessarily exist. In the example below Slater and so, by (3.2), Pareto, Geoffrion, and A-saddle points do not exist.

Example 3.1. Assume that in the game (3.1)
\[ F_1(x, y) = x + y, \quad F_2(x, y) = x - y, \quad x, y \in \mathbb{R}^1, \]
and that the sets $X = Y = (0, 1]$ represent an interval with the point 0 punched out. This game will be denoted $\Gamma^*$. If it were true that $X = Y = [0, 1]$, the situation $(x^s, y^s) = (0, y^s)$, where $y^s$ is any number from $[0, 1]$, would be the Slater saddle point of $\Gamma^*$. This fact was established in Example 2.2, Chapter 5, whose static variety is $\Gamma^*$. Now assume that the point $x = 0$ is punched out, or $X = (0, 1]$. Then for every $x^* \in (0, 1]$ there exists a point $\bar{x} \in (0, 1]$ such that the system of inequalities
\[ F_1(\bar{x}, y^s) = \bar{x} + y^s < x^* + y^s = F_1(x^*, y^s), \]
\[ F_2(\bar{x}, y^s) = \bar{x} - y^s < x^* - y^s = F_2(x^*, y^s) \]
is simultaneous for every $y^s \in (0, 1]$. This is the exact reason why there is no Slater saddle point in $\Gamma^*$. It follows then from the inclusions (3.2) that there are no Pareto, Geoffrion, or A-saddle points in this game.

This fact is the first argument in favor of vector $\varepsilon$-saddle points. Another is that in obtaining numerical values of vector-valued saddle points measurement errors occur, which makes it very difficult to obtain exact values. The situation is saved by the fact, established in subsection 3.2, that any situation from a "fairly small" neighborhood of the vector-valued saddle point of (3.1) is an $\varepsilon$-saddle point in that it satisfies the relations (in the above definitions of vector-valued saddle points) "up to $\varepsilon$". Note that in $\Gamma^*$ of Example 3.1, the Slater $\varepsilon$-saddle point does exist.

Proceeding now to formal definitions, let us assume that a constant vector $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_N)$ is specified with nonnegative components ($\varepsilon_i \geq 0$, $i \in N$) such that $\sum_{i \in N} \varepsilon_i > 0$. The situation $(x^s_\varepsilon, y^s_\varepsilon) \in X \times Y$ is referred to as the Slater $\varepsilon$-saddle point of (3.1) if

(1) $\mathbb{F}(x^s_\varepsilon, y^s_\varepsilon) \not< \mathbb{F}(x^s_\varepsilon, y) - \varepsilon$, $\forall y \in Y$;

(2) $\mathbb{F}(x^s_\varepsilon, y^s_\varepsilon) \not> \mathbb{F}(x, y^s_\varepsilon) + \varepsilon$, $\forall x \in X$.

The situation $(x^p_\varepsilon, y^p_\varepsilon) \in X \times Y$ is referred to as the Pareto $\varepsilon$-saddle point of (3.1) if

(1) $\mathbb{F}(x^p_\varepsilon, y^p_\varepsilon) \not\leq \mathbb{F}(x^p_\varepsilon, y) - \varepsilon$, $\forall y \in Y$;

(2) $\mathbb{F}(x^p_\varepsilon, y^p_\varepsilon) \not\geq \mathbb{F}(x, y^p_\varepsilon) + \varepsilon$, $\forall x \in X$.
The situation $(x^g_\varepsilon, y^g_\varepsilon) \in X \times Y$ will be referred to as the Geoffrion $\varepsilon$-saddle point of (3.1) if

(1) $(x^g_\varepsilon, y^g_\varepsilon)$ is the Pareto $\varepsilon$-saddle point of (3.1);

(2) there exists a positive number $M > 0$ such that for every $\bar{y} \in Y$, $i \in N$, such that
\[ F_i(x^g_\varepsilon, \bar{y}) > F_i(x^g_\varepsilon, y^g_\varepsilon) + \varepsilon_i \]
and subscripts $j \in N$ such that [...] it is true that [...]; and for $\bar{x} \in X$ and $i \in N$ such that [...] and subscripts $j \in N$ such that [...] it is true that [...].
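Conditions (1) and (2) of the Slater $\varepsilon$-saddle point can be tested numerically on finite samples of the strategy sets. The sketch below does this for the game $\Gamma^*$ of Example 3.1, sampling $(0, 1]$ on a grid that excludes 0. The helper names, the grid, and the chosen $\varepsilon$ are assumptions for illustration, and a finite sample can only suggest, not prove, the property on the whole interval.

```python
def is_slater_eps_saddle(F, X, Y, x_e, y_e, eps):
    """Check conditions (1)-(2) of the Slater eps-saddle point on finite samples.
    Convention assumed here: x minimizes, y maximizes."""
    f0 = F(x_e, y_e)
    # (1) no y with F(x_e, y) - eps strictly greater than f0 in every component
    cond1 = not any(
        all(f - e > g for f, e, g in zip(F(x_e, y), eps, f0)) for y in Y
    )
    # (2) no x with F(x, y_e) + eps strictly smaller than f0 in every component
    cond2 = not any(
        all(f + e < g for f, e, g in zip(F(x, y_e), eps, f0)) for x in X
    )
    return cond1 and cond2

def gamma_star(x, y):
    """Payoff of Example 3.1: F1 = x + y, F2 = x - y."""
    return (x + y, x - y)

n = 1000
sample = [k / n for k in range(1, n + 1)]  # grid on (0, 1], the point 0 excluded
eps = (0.05, 0.0)                          # nonnegative components, positive sum

near_zero = is_slater_eps_saddle(gamma_star, sample, sample, 0.04, 0.5, eps)
far_from_zero = is_slater_eps_saddle(gamma_star, sample, sample, 0.5, 0.5, eps)
```

With these assumed values the point with $x_\varepsilon = 0.04$ passes both conditions on the sample, whereas the point with $x_\varepsilon = 0.5$ fails condition (2), since the minimizing player can lower both components by more than $\varepsilon$.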
The above definitions are "fairly" complete. First, for $N = \{1\}$, they turn into those of an $\varepsilon$-saddle point $(x_\varepsilon, y_\varepsilon)$ of a zero-sum game with scalar payoff function [89, p. 94]
\[ \langle X, Y, F_1(x, y) \rangle, \]
specifically,
\[ F_1(x, y_\varepsilon) + \varepsilon_1 \geq F_1(x_\varepsilon, y_\varepsilon) \geq F_1(x_\varepsilon, y) - \varepsilon_1 \]
for every $x \in X$ and $y \in Y$. Second, if in (3.1) some strategy $x^* \in X$ is fixed, then, for instance, the notion of an $\varepsilon$-effective solution [70, pp. 59-60] of the multicriterial problem $\langle Y, \mathbb{F}(x^*, y) \rangle$ follows from the definition of the Pareto $\varepsilon$-saddle point, that is, $\mathbb{F}(x^*, y^p_\varepsilon) \not\leq \mathbb{F}(x^*, y) - \varepsilon$ for any $y \in Y$. Consequently, the notion of $\varepsilon$-vector-valued saddle points includes, as a special case, the generally accepted notions of $\varepsilon$-solutions from game theory and the theory of multicriterial problems.

In the geometric interpretation of $\varepsilon$-saddle points the discussion will concentrate on the case $N = \{1, 2\}$. If the situation $(x^*, y^*)$ is the Slater $\varepsilon$-saddle point of (3.1) with $N = \{1, 2\}$, no points of the set $\mathbb{F}(X, y^*) = \{\mathbb{F}(x, y^*) \mid x \in X\}$ will be found inside the angular region $G_1$ bounded in Fig. 6.3.1 by broken lines. Likewise, the intersection of the set $\mathbb{F}(x^*, Y) = \{\mathbb{F}(x^*, y) \mid y \in Y\}$ and the inside of the angular region $G_2$ (denoted in Fig. 6.3.1 by broken lines) is empty. For the Slater $\varepsilon$-saddle point $(x^*, y^*)$, the sets $\mathbb{F}(X, y^*)$ and $\mathbb{F}(x^*, Y)$ may have common points with the sides of the angular regions $G_1$ and $G_2$, respectively. Unlike the Slater $\varepsilon$-saddle point, Pareto $\varepsilon$-saddle points need a similar positioning of the sets $\mathbb{F}(X, y^*)$ and $\mathbb{F}(x^*, Y)$ relative to $G_1$ and $G_2$, but in this case these sets cannot have any points in common with the sides of $G_1$ and $G_2$ either, except, possibly, the points $\mathbb{F}(x^*, y^*) - \varepsilon$ and $\mathbb{F}(x^*, y^*) + \varepsilon$. Finally, in the case of the Geoffrion $\varepsilon$-saddle point $(x^*, y^*)$, the sets $\mathbb{F}(X, y^*)$ and $\mathbb{F}(x^*, Y)$ not only remain outside $G_1$ and $G_2$, but also outside some obtuse angular regions that encompass $G_1$ and $G_2$ with vertices at the points $\mathbb{F}(x^*, y^*) - \varepsilon$ and $\mathbb{F}(x^*, y^*) + \varepsilon$, respectively (Fig. 6.3.2). Only these vertices can be common to both.

Figure 6.3.1.
3.2. Properties of $\varepsilon$-Saddle Points of the Game (3.1)

In this subsection the sets $X$ and $Y$ are assumed to be compact in the game (3.1) and the scalar functions $F_i(x, y)$, $i \in N$, to be continuous on $X \times Y$. The following propositions are established in [108].

Figure 6.3.2.

Proposition 3.1. If the situation $(x^s, y^s)$ is the Slater saddle point of (3.1), then for any choice of positive numbers $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_N)$ a number $\delta(\varepsilon) > 0$ can be found such that any situation $(x^*, y^*)$ from the $\delta(\varepsilon)$-neighborhood of the point $(x^s, y^s)$,
\[ B(\delta(\varepsilon), (x^s, y^s)) = \{(x, y) \in X \times Y \mid \|x - x^s\| + \|y - y^s\| < \delta(\varepsilon)\}, \]
is the Geoffrion $\varepsilon$-saddle point of (3.1).

Proposition 3.2. If $(x^p, y^p)$ is the Pareto saddle point of (3.1), then for any choice of nonnegative $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_N)$, $\sum_{i \in N} \varepsilon_i > 0$, a number $\delta(\varepsilon) > 0$ can be found such that any situation $(x^*, y^*)$ from the $\delta(\varepsilon)$-neighborhood $B(\delta(\varepsilon), (x^p, y^p))$ of the point $(x^p, y^p)$ is the Geoffrion $\varepsilon$-saddle point of (3.1).
Remark 3.1. Because any Pareto saddle point of (3.1) is, by (3.2), simultaneously the Slater saddle point, from Proposition 3.1 there follows an analog of Proposition 3.2 for positive $\varepsilon_i$, $i \in N$. Proposition 3.2 is "broader" than this analog, as it also covers the case of nonnegative $\varepsilon_i$, $i \in N$, such that $\sum_{j \in N} \varepsilon_j > 0$.

Remark 3.2. Because the Geoffrion saddle point of (3.1) is, by (3.2), also the Pareto saddle point of that game, we have from Proposition 3.2:

Proposition 3.3. If $(x^g, y^g)$ is the Geoffrion saddle point of (3.1), then for any collection of nonnegative numbers $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_N)$, $\sum_{i \in N} \varepsilon_i > 0$, a number $\delta(\varepsilon) > 0$ may be found such that any situation $(x^*, y^*)$ from the $\delta$-neighborhood $B(\delta, (x^g, y^g))$, $\delta < \delta(\varepsilon)$, of the point $(x^g, y^g)$ is the Geoffrion $\varepsilon$-saddle point of (3.1).

In concluding this subsection, we note that these findings can be extended to differential games. Moreover, the requirement that $\varepsilon_i > 0$, $i \in N$, is important in Proposition 3.1, as demonstrated by the following example.

Example 3.2. In (3.1), where the sets of strategies are $X = Y = [0, 1]$ and the vector-valued payoff function is $\mathbb{F}(x, y) = (F_1(x, y), F_2(x, y)) = (x \cdot y,\ x + y)$, the situation $(0, 0) \in [0, 1]^2$ is the Slater saddle point. Let now $\varepsilon = (0, \varepsilon_2)$, where $0 < \varepsilon_2 < 0.5$. If Proposition 3.1 were true, then for every $y \in Y$ and any $(x_\varepsilon, y_\varepsilon) \in [0, \delta) \times [0, \delta) \subset \mathbb{R}^2$ (with $\delta > 0$ fairly small) the following system of inequalities would have to be nonsimultaneous:
\[ x_\varepsilon y > x_\varepsilon y_\varepsilon, \quad x_\varepsilon + y - \varepsilon_2 > x_\varepsilon + y_\varepsilon. \tag{3.5} \]
For $x_\varepsilon > 0$ the nonsimultaneity of (3.5) is equivalent, for every $y \in [0, 1]$, to the claim that at least one of the two inequalities
\[ y \leq y_\varepsilon \quad \text{or} \quad y \leq y_\varepsilon + \varepsilon_2 \]
is true. But for $y = 1$ and any $0 < y_\varepsilon < 0.5$ and $0 < \varepsilon_2 < 0.5$, neither of these two inequalities holds. Thus, the inequalities (3.5) are simultaneous. Consequently, in this example for any $\varepsilon = (0, \varepsilon_2)$, $0 < \varepsilon_2 < 0.5$, there exists no $\delta$-neighborhood of Proposition 3.1, which confirms the importance of requiring $\varepsilon_i > 0$, $i \in N$, in this proposition.
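The counterexample can be probed numerically. The sketch below, whose names and grid size are assumptions for illustration, searches a grid on $Y = [0, 1]$ for a $y$ satisfying both strict inequalities of the system (3.5), written for an arbitrary first component $\varepsilon_1$ of $\varepsilon$ so that the cases $\varepsilon_1 = 0$ and $\varepsilon_1 > 0$ can be compared.

```python
def violating_y(x_e, y_e, eps, n=1000):
    """Return a grid point y in [0, 1] satisfying both strict inequalities
         x_e*y - eps1 > x_e*y_e   and   x_e + y - eps2 > x_e + y_e,
    or None if no grid point does (payoff F = (x*y, x + y))."""
    eps1, eps2 = eps
    for k in range(n + 1):
        y = k / n
        if x_e * y - eps1 > x_e * y_e and x_e + y - eps2 > x_e + y_e:
            return y
    return None

# eps = (0, eps2): the system is solvable even very close to (0, 0)
found_zero_eps1 = violating_y(1e-6, 1e-6, (0.0, 0.3))
# eps1 > 0: no solution on the grid once x_e is small relative to eps1
found_pos_eps1 = violating_y(1e-6, 1e-6, (0.01, 0.3))
```

With $\varepsilon_1 = 0$ a violating $y$ is found for a point arbitrarily close to $(0, 0)$ with $x_\varepsilon > 0$, while a positive $\varepsilon_1$ rules it out as soon as $x_\varepsilon \leq \varepsilon_1$, because then $x_\varepsilon y - \varepsilon_1 \leq 0$ for every $y \in [0, 1]$.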
3.3. Existence of a Pareto $\varepsilon$-maximin in the Differential Game (1.1)

In this subsection (1.1) is a differential zero-sum game with vector-valued payoff function. Conditions 1.1 (Chapter 1) and 1.1 (Chapter 5) are assumed to hold and the initial position $(t_0, x_0)$ is fixed. The procedure used to prove the existence of the Slater maximin in Theorem 2.1 is inapplicable in the case of a Pareto maximin, since the sets $X_{(P)}[V]$ (Pareto-minimal solutions of (2.5)) are not necessarily closed or compact. Existence of a Pareto $\varepsilon$-maximin will be established using another approach.

Specifically, every strategy $V \in \mathcal{V}$ is associated with the set $X_{(P)}[V]$ of Pareto-minimal solutions $x_P[V]$ of (2.5):
\[ X_{(P)}[V] = \{x_P[V] \in X[V] \mid \mathbb{F}(x_P[V]) \not\geq \mathbb{F}(x),\ \forall x \in X[V]\}. \]
Recall that $x_P[V] = x_P[\vartheta, t_0, x_0, V]$ are the right ends of the quasimotions $x_P[\cdot, t_0, x_0, V]$ of the system (1.8) in Chapter 5 generated from the initial position $(t_0, x_0)$ by the strategy $V \in \mathcal{V}$. The set $X_{(P)}[V]$ is nonempty [70, p. 158] but is not necessarily closed (Fig. 4.3.2). Thus, for the set shaded in Fig. 4.3.2, where $\mathbb{F}(x) = (F_1(x), F_2(x)) = (x_1, x_2) = x$, the curves $AB$ and $CD$ with point $C$ punched out are Pareto-maximal points. Then the limit $C$ of the sequence of Pareto-maximal points
\[ \mathbb{F}^{(1)}, \mathbb{F}^{(2)}, \mathbb{F}^{(3)}, \ldots \]
(converging to $C$) is not Pareto-maximal. Thus the set of Pareto maxima is not closed here. Let us assume that a constant $N$-vector $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_N)$ is specified with positive components $\varepsilon_i = \mathrm{const} > 0$, $i \in N$.

Definition 3.2. The vector $\mathbb{F}^{(P)}_\varepsilon \in \mathbb{R}^N$ is referred to as the Pareto $\varepsilon$-maximin of the differential game (1.1) with an initial position $(t_0, x_0)$ if there exist a strategy $V^{(P)}_\varepsilon \in \mathcal{V}$ and a point $\hat{x}_P[V^{(P)}_\varepsilon] \in X_{(P)}[V^{(P)}_\varepsilon]$ such that $\mathbb{F}^{(P)}_\varepsilon = \mathbb{F}(\hat{x}_P[V^{(P)}_\varepsilon])$ and
\[ \mathbb{F}(x_P[V]) \not\geq \mathbb{F}(\hat{x}_P[V^{(P)}_\varepsilon]) + \varepsilon \tag{3.6} \]
for every $V \in \mathcal{V}$ and $x_P[V] \in X_{(P)}[V]$. The strategy $V^{(P)}_\varepsilon$ that satisfies (3.6) will be referred to as the Pareto $\varepsilon$-maximin strategy in (1.1). In scalar form, (3.6) asserts that for any strategies $V \in \mathcal{V}$ and associated Pareto-minimal solutions $x_P[V] \in X_{(P)}[V]$ in problem (2.5), the following system of inequalities is nonsimultaneous:
\[ F_i(x_P[V]) \geq F_i(\hat{x}_P[V^{(P)}_\varepsilon]) + \varepsilon_i, \quad i \in N, \]
at least one of which is a strict inequality.
Theorem 3.1. If Conditions 1.1 (Chapter 1) and 1.1 (Chapter 5) hold, then in (1.1), for any choice of initial position $(t_0, x_0) \in [0, \vartheta) \times \mathbb{R}^n$ and $N$-vector $\varepsilon$ ($\varepsilon_i > 0$, $i \in N$), there exist a Pareto $\varepsilon$-maximin and a Pareto $\varepsilon$-maximin strategy.

Proof. Consider a certain constant vector $\varepsilon$ with positive components and fix some initial position $(t_0, x_0) \in [0, \vartheta) \times \mathbb{R}^n$. We then take an arbitrary strategy $V_1 \in \mathcal{V}$ and use it to construct the set $X_{(P)}[V_1]$ of Pareto-minimal solutions $x_P[V_1]$ of (2.5), where $V = V_1$. Now fix an arbitrary point $\hat{x}_P[V_1] \in X_{(P)}[V_1]$. Construct the $N$-vector $\mathbb{F}(\hat{x}_P[V_1])$ and add the vector $\varepsilon$ to it. Let us now find whether there exist a strategy $V_2 \in \mathcal{V}$ and at least one associated point $\hat{x}_P[V_2] \in X_{(P)}[V_2]$ satisfying the inequality
\[ \mathbb{F}(\hat{x}_P[V_2]) \geq \mathbb{F}(\hat{x}_P[V_1]) + \varepsilon. \tag{3.7} \]
If there is no such strategy or associated point, $\mathbb{F}(\hat{x}_P[V_1])$ is (by Definition 3.2) the Pareto $\varepsilon$-maximin, and $V_1$ the $\varepsilon$-maximin strategy. If there exist a strategy $V_2$ and a point $\hat{x}_P[V_2]$ that satisfy (3.7), the process is continued. Specifically, the vector $\mathbb{F}(\hat{x}_P[V_2])$ is supplemented with $\varepsilon$. We now check for the existence of a strategy $V_3$ and a point $\hat{x}_P[V_3] \in X_{(P)}[V_3]$ for which
\[ \mathbb{F}(\hat{x}_P[V_3]) \geq \mathbb{F}(\hat{x}_P[V_2]) + \varepsilon, \tag{3.8} \]
and so on. The process cannot be continued infinitely by virtue of the compactness of the reachability domain $X$ and the continuity of $\mathbb{F}(x)$ on this compactum, as a corollary of which the components of the vector $\mathbb{F}(x)$ are bounded on $X$. Indeed, from (3.7), (3.8), ..., we have in $k$ steps
\[ \mathbb{F}(\hat{x}_P[V_k]) \geq \mathbb{F}(\hat{x}_P[V_1]) + (k - 1)\varepsilon \]
and, as a consequence of the fact that $\varepsilon > 0_N$, $F_i(\hat{x}_P[V_k]) \to +\infty$ as $k \to \infty$, which contradicts the fact that $F_i(x)$, $i \in N$, is bounded on $X$. Consequently, within a finite number of steps the following result is obtained: a strategy $V_k \in \mathcal{V}$ and an associated point $\hat{x}_P[V_k] \in X_{(P)}[V_k]$ are found such that for any $V \in \mathcal{V}$ and $x_P[V] \in X_{(P)}[V]$ the following system is nonsimultaneous:
\[ \mathbb{F}(x_P[V]) \geq \mathbb{F}(\hat{x}_P[V_k]) + \varepsilon. \]
By Definition 3.2 this implies that $V_k$ is a Pareto $\varepsilon$-maximin strategy and $\mathbb{F}(\hat{x}_P[V_k])$ the Pareto $\varepsilon$-maximin of the game (1.1). This proves the theorem.

Remark 3.3. The requirements of the theorem are less restrictive than those of Theorem 2.1 in that the saddle point condition (3.2) in Chapter 1 for a small game is not imposed. Consequently, the Pareto $\varepsilon$-maximin exists under more general assumptions than the Slater maximin. The reasoning used in the proof of Theorem 3.1 also proves the following proposition:
If Conditions 1.1 (Chapter 1) and 1.1 (Chapter 5) hold, then for any choice of the constant vector $\varepsilon$ with positive components and an initial position $(t_0, x_0) \in [0, \vartheta) \times \mathbb{R}^n$ in the differential game, there exist a Slater $\varepsilon$-maximin strategy $V^{(S)}_\varepsilon \in \mathcal{V}$ with associated point $\hat{x}_S[V^{(S)}_\varepsilon] \in X_{(S)}[V^{(S)}_\varepsilon]$ and a Slater $\varepsilon$-maximin $\mathbb{F}^{(S)}_\varepsilon$, that is,
\[ \mathbb{F}^{(S)}_\varepsilon = \mathbb{F}(\hat{x}_S[V^{(S)}_\varepsilon]) \not< \mathbb{F}(x_S[V]) - \varepsilon \]
for every $V \in \mathcal{V}$ and every $x_S[V] \in X_{(S)}[V]$. A similar proposition is obviously true also for the A-$\varepsilon$-maximin strategy and the A-$\varepsilon$-maximin. To see this, it is sufficient to replace $\mathbb{F}$ in the preceding proposition by $A\mathbb{F}$.
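The finite-step argument in the proof of Theorem 3.1 can be mimicked on a finite model. The Python sketch below is only an illustration under strong assumptions: each strategy is given with a finite list of payoff vectors standing in for $\mathbb{F}(X[V])$, and the names `outcomes`, `pareto_minimal`, and `pareto_eps_maximin` are hypothetical. It repeats the improvement step (3.7), (3.8), ... until no strategy offers a Pareto-minimal outcome exceeding the current one by $\varepsilon$.

```python
def dominates(a, b):
    """Pareto relation a >= b: componentwise >= with at least one strict inequality."""
    return all(p >= q for p, q in zip(a, b)) and any(p > q for p, q in zip(a, b))

def pareto_minimal(points):
    """Points p for which no other point q is Pareto-smaller than p."""
    return [p for p in points if not any(dominates(p, q) for q in points)]

def pareto_eps_maximin(outcomes, eps):
    """outcomes: dict strategy -> finite list of payoff vectors (assumed data).
    Returns (strategy, payoff) from which no eps-improvement step exists."""
    minimal = {v: pareto_minimal(pts) for v, pts in outcomes.items()}
    v_cur = next(iter(minimal))        # arbitrary starting strategy V_1
    f_cur = minimal[v_cur][0]          # arbitrary Pareto-minimal point for V_1
    while True:
        target = tuple(f + e for f, e in zip(f_cur, eps))
        step = next(
            ((v, p) for v, pts in minimal.items() for p in pts if dominates(p, target)),
            None,
        )
        if step is None:               # no strategy improves by eps: stop
            return v_cur, f_cur
        v_cur, f_cur = step            # analogue of passing from V_k to V_{k+1}

# Hypothetical data: three strategies with a few outcomes each.
outcomes = {
    "V1": [(0, 0), (1, 1)],
    "V2": [(2, 2), (3, 0)],
    "V3": [(2.5, 2.5)],
}
result = pareto_eps_maximin(outcomes, eps=(1, 1))
```

Each step raises every component by at least $\varepsilon_i > 0$, so on a finite (or bounded) outcome set the loop must stop, which is the same boundedness argument as in the proof.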
Chapter 7
The Competition Problem
A decision choice procedure is proposed in a mathematical model of competition between two similar economies. Games are analyzed with separable and mirror (anticommutative) vector-valued payoff functions. At the end of the chapter an optimal solution is obtained in a model of the competitive exploration of a scientific problem.

1. Mathematical Model of Competition
1.1. A Model of a Specific System
Two similar economic units (factories, corporations, concerns, or states) have conflicting interests. The dynamics of one economy are assumed to be described by the vector differential equation
\[ \dot{x} = f(t, x, u), \quad x[t_0] = x_0, \tag{1.1} \]
where $x \in \mathbb{R}^n$ is the state vector; $t \in [t_0, \vartheta]$ is time; $\vartheta > t_0 \geq 0$ are constants; $u \in \mathbb{R}^h$ is the control vector; and $H$ is the collection of control actions in the system ($u \in H$). Numerous kinds of right-hand sides of equations (1.1) used in economic applications of the theory of differential games have been reported [12]. In a model of the exploration of a scientific problem by two similar firms simultaneously, $f = u$ (section 4). The right-hand side $f(t, x, u)$ of the system (1.1) will be assumed to satisfy Condition 1.1 of Chapter 1, that is, $f(t, x, u)$ is continuous in the set of arguments and locally Lipschitz with respect to $x$, $H \in \mathrm{comp}\, \mathbb{R}^h$, and $\exists\, \gamma = \mathrm{const} > 0$: $\|f\| \leq \gamma(1 + \|x\|)$ uniformly with respect to $t$, $x$, and $u$.
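Disregarding the competition with the other economy,

(1) the set of strategies is
\[ \mathcal{U} = \{U \div u(t, x) \mid u(t, x) \in H,\ \forall (t, x) \in [t_0, \vartheta] \times \mathbb{R}^n\}; \tag{1.2} \]

(2) the performance of this system is described by the set of criteria
\[ \mathbb{F}(x[\vartheta]) = (F_1(x[\vartheta]), \ldots, F_N(x[\vartheta])). \]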
For instance, in comparing specific constructions built around different control systems the choice has to be made between the following criteria [25, pp. 25-27]: (1) adaptability to variations of the process gain; (2) insensitivity to variations of the gain; (3) static accuracy of the system; (4) noise-immunity of the system; (5) ease of system restructuring. In addition, the collections of criteria reflect: (1) ease of design of the systems; (2) adaptability of the system to streamlined manufacture; (3) complexity of system start-up; (4) system performance; (5) system safety.

The scalar functions $F_i(x)$, $i \in N$, will be assumed hereafter to be continuous on $\mathbb{R}^n$ (Condition 1.1, Chapter 5). Disregarding the interaction of this system with the other system and with the environment, the economic management process may be described in the following way (Fig. 7.1.1). Let us assume that a certain strategy $U \div u(t, x)$ is chosen and, depending upon the accuracy of approaching the optimal values of the criteria $F_i(x)$, $i \in N$, a partition $\Delta$: $t_0 = \tau_0 < \tau_1 < \ldots < \tau_{m(\Delta)} = \vartheta$, the initial value of the vector $x_0$, and the number $\alpha$ bounding the sum of the steps of stepwise quasimotions at the partition points are established. At time $\tau_0$ the controller generates a vector $\hat{x}_0$, where $\|x_0 - \hat{x}_0\|$ is chosen so that the sum of the steps of the stepwise quasimotion is less than $\alpha$, and a value of the strategy $U$, i.e., the vector $u(\tau_0, \hat{x}_0)$, which is fed to the system $\Sigma$. This system generates the first interval $x(t)$, $\tau_0 \leq t \leq \tau_1$, of the stepwise quasimotion by means of the stepwise equation
\[ x(t) = \hat{x}_0 + \int_{\tau_0}^{t} f(\tau, x(\tau), u(\tau_0, \hat{x}_0))\, d\tau. \]

Figure 7.1.1. (Block diagram: the system $\Sigma$ and the regulator implementing $U \div u(t, x)$.)

The measuring unit determines at time $t = \tau_1$ the value of the state vector $x(\tau_1) = x_0(\tau_1)$, which is passed to the regulator; the regulator forms, according to $\alpha$, the vector $\hat{x}_1$ and a value $u(\tau_1, \hat{x}_1)$ and transmits them to $\Sigma$. By means of the stepwise equation
\[ x(t) = \hat{x}_1 + \int_{\tau_1}^{t} f(\tau, x(\tau), u(\tau_1, \hat{x}_1))\, d\tau, \]
a second interval $x(t)$, $\tau_1 < t \leq \tau_2$, of the stepwise quasimotion is generated and the measuring unit determines $x(\tau_2) = x_0(\tau_2)$, which is fed to the controller. Following the strategy $U$ and the number $\alpha$, the controller determines the vectors $\hat{x}_2$, $u(\tau_2, \hat{x}_2)$, and so on. As a result we have a stepwise quasimotion $x(t)$, $\tau_0 \leq t \leq \vartheta$, whose right end $x(\vartheta)$ specifies the values of the criteria $F_i(x(\vartheta))$, $i \in N$. Note again that the 3-tuple $(\Delta, \hat{x}_0, \alpha)$ is governed, by subsection 2.3 of Chapter 1, by the admissible approach of $F_i(x(\vartheta))$, $i \in N$, to the values of $F_i(x[\vartheta, t_0, x_0, U])$, where $x[\cdot, t_0, x_0, U]$ are quasimotions of the system (1.1) generated by the strategy $U$ from the initial position $(t_0, x_0)$.
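The stepwise construction can be sketched in code. The Python fragment below is a simplified illustration, not the procedure of Chapter 1: it assumes a scalar state, takes $\hat{x}_k = x(\tau_k)$ at every partition point (no jumps, so the bound $\alpha$ plays no role), and integrates each interval with plain Euler substeps. The function name `stepwise_motion` and the test system $f = u$, $u(t, x) = -x$ are assumptions chosen for the example.

```python
def stepwise_motion(f, u, partition, x0, substeps=100):
    """Integrate x' = f(t, x, u_k) with the control frozen at each partition point.

    f         -- right-hand side f(t, x, u)
    u         -- positional strategy u(t, x)
    partition -- increasing list of times tau_0 < tau_1 < ... < tau_m
    x0        -- initial state (scalar in this sketch)
    Returns the list of (tau_k, x(tau_k)) pairs."""
    x = x0
    path = [(partition[0], x)]
    for t_k, t_next in zip(partition, partition[1:]):
        u_k = u(t_k, x)                  # control value fixed on [tau_k, tau_{k+1}]
        h = (t_next - t_k) / substeps
        t = t_k
        for _ in range(substeps):        # explicit Euler steps inside the interval
            x = x + h * f(t, x, u_k)
            t += h
        path.append((t_next, x))
    return path

# Assumed test system: f = u (as in the model of section 4) with feedback u(t, x) = -x.
rhs = lambda t, x, ctrl: ctrl
feedback = lambda t, x: -x
partition = [0.0, 0.25, 0.5, 0.75, 1.0]
path = stepwise_motion(rhs, feedback, partition, 1.0)
right_end = path[-1][1]
```

Because the control is frozen on each interval, the state here shrinks by the factor $1 - 0.25$ per interval, which differs from the continuous feedback solution $e^{-t}$; refining the partition brings the two closer.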
1.2. A Model of Competition
If there were no competition, the goal of the first economy would be to choose a strategy $U \in \mathcal{U}$ so as to obtain the largest possible values of all the criteria $F_i(x[\vartheta])$, $i \in N$, simultaneously. In this case the results of Chapters 2-4 could be used. Now let us proceed to a mathematical model of competition between two economies that are similar in that:

(1) the dynamics of the second economy are specified by the same differential equation

$$\dot{y} = f(t, y, v), \quad y[t_0] = y_0; \qquad (1.3)$$

(2) the initial values of the state vectors are identical, $x_0 = y_0$;

(3) the control action $v \in Q$, that is, the sets of values of $u$ and $v$ coincide, whence the set of strategies of the second economy is $\mathcal{V} = \{V \div v(t, y) \mid v(t, y) \in Q\}$;

(4) the performance of the second economy is evaluated in terms of the same set of criteria $F(y[\vartheta]) = (F_1(y[\vartheta]), \dots, F_N(y[\vartheta]))$.

Consequently, the competition between these two systems can be modeled as a differential zero-sum positional game with vector-valued payoff function

$$\langle \{1, 2\}, \Sigma^c, \{\mathcal{U}^c, \mathcal{V}^c\}, F^c(x[\vartheta], y[\vartheta]) \rangle. \qquad (1.4)$$
Here the system $\Sigma^c$ is described by the ordinary differential equations

$$\dot{x} = f(t, x, u), \quad \dot{y} = f(t, y, v), \qquad (1.5)$$

where $x, y \in \mathbb{R}^n$; the control action of the first (second) player $u \in Q \in \operatorname{comp} \mathbb{R}^q$, $v \in Q$; time $t \in [t_0, \vartheta]$; the constants $\vartheta > t_0 \ge 0$; the right-hand sides satisfy Condition 1.1 of Chapter 1; and the initial position $(t_0, x_0, y_0) = (t_0, x_0, y_0 = x_0) \in [0, \vartheta) \times \mathbb{R}^n \times \mathbb{R}^n$.

The set of strategies of the first (second) player
The vector-valued payoff function

$$F^c(x[\vartheta], y[\vartheta]) = F(y[\vartheta]) - F(x[\vartheta]) \qquad (1.7)$$

or, in coordinate form,

$$F_i^c(x[\vartheta], y[\vartheta]) = F_i(y[\vartheta]) - F_i(x[\vartheta]), \quad i \in N.$$

The differential positional zero-sum game (1.4) will be referred to as the competition problem. In the game (1.4) the second player chooses $V \in \mathcal{V}^c$ so as to obtain the largest possible values of the components of the payoff function

$$F_i^c(x[\vartheta], y[\vartheta]) = F_i(y[\vartheta]) - F_i(x[\vartheta]), \quad i \in N,$$

and thus increase the gap between the values of the similar scalar criteria $F_i(y[\vartheta])$ and $F_i(x[\vartheta])$. Using the strategy $U \in \mathcal{U}^c$, the first player tries to obtain the lowest possible values of the components $F_i^c(x[\vartheta], y[\vartheta])$, $i \in N$. Because

$$\min [F_i(y) - F_i(x)] = -\max [F_i(x) - F_i(y)],$$

both players try to increase the gap between the values of the similar scalar criteria $F_i(x[\vartheta])$ and $F_i(y[\vartheta])$, $i \in N$.
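The sign symmetry behind the identity $\min [F_i(y) - F_i(x)] = -\max [F_i(x) - F_i(y)]$ can be seen on a small numerical instance. This is only an illustrative sketch: the criteria `F` and the end states below are hypothetical and are not taken from the text.

```python
def competition_payoff(F, x, y):
    """Componentwise payoff F^c(x, y) = F(y) - F(x)."""
    return [fy - fx for fx, fy in zip(F(x), F(y))]


# Hypothetical pair of criteria evaluated at a terminal state s = (s1, s2).
F = lambda s: [s[0] + s[1], s[0] * s[1]]

x_end, y_end = (1.0, 2.0), (3.0, 0.5)
payoff_xy = competition_payoff(F, x_end, y_end)   # F(y) - F(x)
payoff_yx = competition_payoff(F, y_end, x_end)   # F(x) - F(y)
```

Swapping the roles of the two economies flips the sign of every component, so lowering $F_i(y) - F_i(x)$ is the same as raising $F_i(x) - F_i(y)$.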
1.3. Optimal Decision-Making in Competition Problems
By the solution of the competition model (1.4) we will understand the Slater saddle point $(U^{ZS}, V^{ZS}) \in \mathcal{U}^c \times \mathcal{V}^c$ at which the value of the payoff function is equal to both the Slater maximin and the Slater minimax.

Definition 2.2. The ZS-solution of (1.4) is the situation $(U^{ZS}, V^{ZS}) \in \mathcal{U}^c \times \mathcal{V}^c$, which

(1) is the Slater saddle point of this game:

$$F^c(x[\vartheta, t_0, x_0, U^{ZS}], y[\vartheta, t_0, x_0, V]) \not> F^c(x[\vartheta, t_0, x_0, U^{ZS}], y[\vartheta, t_0, x_0, V^{ZS}]) \not> F^c(x[\vartheta, t_0, x_0, U], y[\vartheta, t_0, x_0, V^{ZS}]) \qquad (1.8)$$

for any strategies $V \in \mathcal{V}^c$, $U \in \mathcal{U}^c$ and all quasimotions $x[\cdot, t_0, x_0, U]$, $y[\cdot, t_0, x_0, V]$;
(2) there are quasimotions $(\hat{x}[\cdot, t_0, x_0, U^{ZS}], \hat{y}[\cdot, t_0, x_0, V^{ZS}])$ for which there exist a Slater maximin and a Slater minimax such that

$$F^c(\hat{x}[\vartheta, t_0, x_0, U^{ZS}], \hat{y}[\vartheta, t_0, x_0, V^{ZS}]) = \min^{S}_{U \in \mathcal{U}^c} \max^{S}_{x[\cdot], y[\cdot]} F^c(x[\vartheta, t_0, x_0, U], y[\vartheta, t_0, x_0, V \div Q]) = \max^{S}_{V \in \mathcal{V}^c} \min^{S}_{x[\cdot], y[\cdot]} F^c(x[\vartheta, t_0, x_0, U \div Q], y[\vartheta, t_0, x_0, V]). \qquad (1.9)$$

Note that it follows from (1.9) and Proposition 2.8 from Chapter 6 that the strategy $U^{ZS}$ is Slater-minimax and $V^{ZS}$ is Slater-maximin, while the associated points $(\hat{x}_s[U^{ZS}], \hat{y}_s[V \div Q])$ and $(\hat{x}_s[U \div Q], \hat{y}_s[V^{ZS}])$ coincide with $(\hat{x}[\vartheta, t_0, x_0, U^{ZS}], \hat{y}[\vartheta, t_0, x_0, V^{ZS}])$.

1.4. A Geometric Interpretation of the ZS-Solution
A geometric interpretation of the ZS-solution will be provided for the case N = {1,2}.
First, let us give an interpretation of the equalities

$$F^c(\hat{x}[\vartheta, t_0, x_0, U^{ZS}], \hat{y}[\vartheta, t_0, x_0, V^{ZS}]) = \min^{S}_{x[\cdot], y[\cdot]} F^c(x[\vartheta, t_0, x_0, U \div Q], y[\vartheta, t_0, x_0, V^{ZS}]). \qquad (1.10)$$

Because $V^{ZS}$ is a Slater-maximin strategy and

$$(\hat{x}[\vartheta, t_0, x_0, U^{ZS}], \hat{y}[\vartheta, t_0, x_0, V^{ZS}]) = (\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$$

are the associated points of Definition 1.1 in Chapter 6, it follows from the first equality in (1.10) that

$$F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}]) \not> F^c(x[\vartheta, t_0, x_0, U], y[\vartheta, t_0, x_0, V^{ZS}])$$

for any strategies $U \in \mathcal{U}^c$ and quasimotions $x[\cdot, t_0, x_0, U]$, $y[\cdot, t_0, x_0, V^{ZS}]$ of the system (1.5) with $x_0 = y_0$. The latter relation signifies that none of the points of the set

$$F^c(X[U \div Q], Y[V^{ZS}]) = \bigcup_{x \in X[U \div Q],\ y \in Y[V^{ZS}]} F^c(x, y)$$

will land strictly inside the angular region $G_1$ with vertex $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ (Fig. 7.1.2).
Figure 7.1.2.
Consequently, if the maximizing player has chosen and has been using the Slater-maximin strategy $V^{ZS}$, it will not be possible for all the possible values of the vector-valued payoff function $F^c(x[\vartheta], y[\vartheta])$ to be less than $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$; in other words, by choosing $V^{ZS}$, the maximizing player assures for himself components of the vector-valued payoff function that are at least equal to $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$. On the other hand, by (1.10), the values of the vector-valued payoff function $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ are such that

$$F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}]) = \max^{S}_{V \in \mathcal{V}^c} \min^{S}_{x[\cdot], y[\cdot]} F^c(x[\vartheta, t_0, x_0, U \div Q], y[\vartheta, t_0, x_0, V]) \qquad (1.11)$$

for any $V \in \mathcal{V}^c$. Let us assume that the reachability domain of the first subsystem in (1.5)
while
Then (1.11) implies that the values $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ of the vector-valued payoff function become equal to the Slater-minimal values of the set $F^c(X, Y[V])$ for every fixed strategy $V \in \mathcal{V}^c$, or coincide with the points of the set

$$\mathrm{Fr}_{(S)} F^c(X, Y[V]) = \{ F^c_* \in F^c(X, Y[V]) \mid F^c_* \not> F^c,\ \forall F^c \in F^c(X, Y[V]) \}.$$

Therefore, by (1.11), for no $V \in \mathcal{V}^c$ will the points of $\mathrm{Fr}_{(S)} F^c(X, Y[V])$ land inside the right angle $G_2$ with vertex at the point $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ but with sides directed opposite to those of $G_1$ (Fig. 7.1.3). Figure 7.1.3 shows the possible positions of three sets $\mathrm{Fr}_{(S)} F^c(X, Y[V^{(k)}])$, $k = 1, 2, 3$, relative to the point $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$. Consequently, in addition to being the value at the Slater saddle point, the Slater maximin $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ defines the northeast boundary of the sets $\mathrm{Fr}_{(S)} F^c(X, Y[V])$: no values of $F^c$ from the sets $\mathrm{Fr}_{(S)} F^c(X, Y[V])$, for any $V \in \mathcal{V}^c$, can exceed $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ componentwise. Here $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ is the Slater maximum of all Slater minima $\mathrm{Fr}_{(S)} F^c(X, Y[V])$, $\forall V \in \mathcal{V}^c$. Consequently, if for every strategy $V \in \mathcal{V}^c$ a set of Slater minima $\mathrm{Fr}_{(S)} F^c(X, Y[V])$ is obtained in the multicriterial static problem

$$\langle X \times Y[V], F^c(x, y) \rangle, \qquad (1.12)$$

Figure 7.1.3.
the best (Slater-maximal) of them for the maximizing player is $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$, which is obtained for the Slater-maximin strategy $V^{ZS}$. The associated point $(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}]) \in X \times Y[V^{ZS}]$ is the Slater-maximal solution of (1.12) with $V = V^{ZS}$. A similar geometric interpretation of the Slater-minimax strategy $U^{ZS}$ is shown in Fig. 7.1.4. Here the reachability domain of the second subsystem from (1.5) is
moreover,
is the set of Slater maxima of the two-criterial problem
for every fixed strategy $U \in \mathcal{U}^c$ of the first player.
Figure 7.1.4.
Consequently, using the strategy $U^{ZS}$ the first player restricts to the angle $G$ the maximal values of $F^c(X[U^{ZS}], Y)$ by $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$. Simultaneously, the same value of the payoff function is the least of all Slater maxima $F^c \in \mathrm{Fr}^{(S)} F^c(X[U], Y)$ of problem (1.13) for any $U \in \mathcal{U}^c$. Consequently, the best, or Slater-minimal, value of $F^c(x[\vartheta], y[\vartheta])$ for the minimizing first player is $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ (the angular region $G$ is the lower bound). Because, by (1.9), these values $F^c(\hat{x}[U^{ZS}], \hat{y}[V^{ZS}])$ are the same for the first and second players, the situation $(U^{ZS}, V^{ZS})$ of Definition 1.1 is the most acceptable (for both players) solution of a differential zero-sum game with vector-valued payoff function. We have, therefore, chosen the ZS-solution as the optimal solution of the competition problem (1.4) for the following reasons (regarding (2)-(4), see the end of subsection 2.5 in Chapter 6).

(1) The ZS-solution $(U^{ZS}, V^{ZS})$ of the game (1.4) with initial position $(t_0, x_0)$ is dynamically stable, that is, $(U^{ZS}, V^{ZS})$ remains the ZS-solution of this game but with a current initial position $(t, x[t, t_0, x_0, U^{ZS}], y[t, t_0, x_0, V^{ZS}])$ at any $t \in [t_0, \vartheta]$ and for any quasimotions $x[\cdot, t_0, x_0, U^{ZS}]$ and $y[\cdot, t_0, x_0, V^{ZS}]$.

(2) The ZS-solution is internally stable.

(3) The ZS-solution is nonimprovable by using Slater saddle points.

(4) The ZS-solution ensures the maximal possible vector-valued guarantees for both players simultaneously.
1.5. Procedure for the Construction of a ZS-Solution
The most important finding of this chapter is the following procedure for the construction of the ZS-solution:

(1) Find at least one Slater-maximal strategy $U^S \in \mathcal{U}$ ($U^S \div u^S(t, x)$) of the multicriterial dynamic problem

$$\langle \Sigma, \mathcal{U}, F(x[\vartheta]) \rangle, \qquad (1.14)$$

where the set of strategies $\mathcal{U}$ is defined in (1.2), $H = Q$, and the system $\Sigma$ in (1.1); simultaneously, identify at least one quasimotion $\hat{x}[\cdot, t_0, x_0, U^S]$ of the system (1.1) generated from the initial position $(t_0, x_0)$ by the strategy $U^S$.

(2) The ZS-solution of (1.4) is the situation $(U^S, V^S)$, where $V^S \div u^S(t, y)$ and

$$F^c(\hat{x}[U^S], \hat{y}[V^S]) = F^c(\hat{x}[\vartheta, t_0, x_0, U^S], \hat{y}[\vartheta, t_0, x_0, V^S]),$$

where $\hat{y}[t, t_0, x_0, V^S] = \hat{x}[t, t_0, x_0, U^S]$, $t_0 \le t \le \vartheta$.

This result follows from an analysis of two kinds of differential zero-sum game with vector-valued

(1) separable payoff function

$$F(x[\vartheta], y[\vartheta]) = F^{(1)}(x[\vartheta]) + F^{(2)}(y[\vartheta]),$$

the game dynamics being described by the system

$$\dot{x} = f^{(1)}(t, x, u), \quad x[t_0] = x_0, \qquad \dot{y} = f^{(2)}(t, y, v), \quad y[t_0] = y_0;$$

(2) mirror (anticommutative) payoff function

$$F(x[\vartheta], y[\vartheta]) = -F(y[\vartheta], x[\vartheta])$$

with dynamics (1.5), where $x_0 = y_0$.

The game (1.4) is easily seen to be a game with both separable and mirror payoff function. Games with a separable payoff function are discussed in section 2 and those with a mirror function, in section 3. In the last section 4 of this chapter, a procedure for constructing the ZS-solution of subsection 3.3 is obtained in a model of competitive exploration of a scientific problem.

2. A Game with Separable Payoff Function
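2.1. Problem Statement

In the differential game

$$\langle \{1, 2\}, \Sigma^d, \{\mathcal{U}^d, \mathcal{V}^d\}, F^{(1)}(x[\vartheta]) + F^{(2)}(y[\vartheta]) \rangle \qquad (2.1)$$

the system $\Sigma^d$ is described by two separable systems of ordinary differential equations

$$\dot{x} = f^{(1)}(t, x, u), \qquad (2.2)$$

$$\dot{y} = f^{(2)}(t, y, v), \qquad (2.3)$$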
where $(x, y) \in \mathbb{R}^n \times \mathbb{R}^m$ is the state vector; $t \in [t_0, \vartheta]$ is time; the initial position $(t_0, x_0, y_0)$ and the time $\vartheta > t_0$ when the game ends are fixed; as before, the control action of the first player is $u \in H \in \operatorname{comp} \mathbb{R}^h$ and of the second player, $v \in Q \in \operatorname{comp} \mathbb{R}^q$; every vector function $f^{(1)}(t, x, u)$ and $f^{(2)}(t, y, v)$ is assumed to satisfy Condition 1.1 (Chapter 1), and the components of the $N$-vector functions $F^{(1)}(x)$ and $F^{(2)}(y)$ are continuous on $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. The sets of strategies $\mathcal{U}^d$ for the first ($\mathcal{V}^d$ for the second) player are

$$\mathcal{U}^d = \{U \div u(t, x, y) \mid u(t, x, y) \in H\}, \quad \mathcal{V}^d = \{V \div v(t, x, y) \mid v(t, x, y) \in Q\}.$$

In effect, what we have is a special case of the game (1.1, Chapter 6), where the system of differential equations has been reduced to two equations, and the components

$$F_i(x[\vartheta], y[\vartheta]) = F_i^{(1)}(x[\vartheta]) + F_i^{(2)}(y[\vartheta]), \quad i \in N,$$

of the payoff function are formed by the sum of two functions $F_i^{(1)}(x[\vartheta])$ and $F_i^{(2)}(y[\vartheta])$, each depending only on $x[\vartheta]$ or $y[\vartheta]$. This game will be referred to as a zero-sum positional differential game with separable vector-valued payoff function. The next natural step is to obtain an insight into the properties of the set of solutions to the game (2.1), such as vector-valued saddle points, maximins, and minimaxes. The game (2.1) will be associated with two multicriterial dynamic problems,

$$\langle \Sigma^{(1)}, \mathcal{U}, F^{(1)}(x[\vartheta]) \rangle \qquad (2.4)$$

and

$$\langle \Sigma^{(2)}, \mathcal{V}, F^{(2)}(y[\vartheta]) \rangle. \qquad (2.5)$$

The control system $\Sigma^{(1)}$ is described by equation (2.2) and $\Sigma^{(2)}$, by equation (2.3); the initial positions $(t_0, x_0)$ for (2.2) and $(t_0, y_0)$ for (2.3) are fixed. The sets of strategies are

$$\mathcal{U} = \{U \div u(t, x) \mid u(t, x) \in H\}, \quad \mathcal{V} = \{V \div v(t, y) \mid v(t, y) \in Q\}.$$

For problem (2.4) we denote by $\mathcal{U}^S$ the set of Slater-minimal strategies $U^S$; by $\mathcal{U}^P$, the set of Pareto-minimal strategies $U^P$; by $\mathcal{U}^G$, the set of Geoffrion-minimal strategies $U^G$; and by $\mathcal{U}^A$, the set of A-minimal strategies $U^A$, where $A$ is a constant fixed $(N \times N)$-dimensional matrix with positive elements.
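Similarly, for the problem (2.5), $\mathcal{V}^S$ denotes the set of Slater-maximal strategies $V^S$; $\mathcal{V}^P$, the set of Pareto-maximal strategies $V^P$; $\mathcal{V}^G$, the set of Geoffrion-maximal strategies $V^G$; and $\mathcal{V}^A$, the set of A-maximal strategies $V^A$. By virtue of (3.23) and Proposition 3.6, both from Chapter 4,

$$\mathcal{U}^S \supset \mathcal{U}^P \supset \mathcal{U}^G \supset \mathcal{U}^A, \quad \mathcal{V}^S \supset \mathcal{V}^P \supset \mathcal{V}^G \supset \mathcal{V}^A.$$

2.2. Vector-Valued Saddle Points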
The differential game (2.1) is associated with a static zero-sum game with vector-valued payoff function

$$\langle X, Y, F(x, y) = F^{(1)}(x) + F^{(2)}(y) \rangle, \qquad (2.6)$$

where the reachability set of the system (2.2) from position $(t_0, x_0)$ is

$$X = X[\vartheta, t_0, x_0, U \div H] = \bigcup_{U \in \mathcal{U}} X[\vartheta, t_0, x_0, U],$$

and that of the system (2.3) from position $(t_0, y_0)$,

$$Y = Y[\vartheta, t_0, y_0, V \div Q] = \bigcup_{V \in \mathcal{V}} Y[\vartheta, t_0, y_0, V].$$

The strategy of the first (minimizing) player is $x \in X$ and that of the second (maximizing) player, $y \in Y$. We let $X_S$ be the set of Slater-minimal strategies $x_S$ of the static multicriterial problem

$$\langle X, F^{(1)}(x) \rangle \qquad (2.7)$$

or

$$X_S = \{x_S \in X \mid F^{(1)}(x_S) \not> F^{(1)}(x),\ \forall x \in X\};$$

$Y^S$ is the set of Slater-maximal strategies $y^S$ in the multicriterial problem

$$\langle Y, F^{(2)}(y) \rangle \qquad (2.8)$$

or

$$Y^S = \{y^S \in Y \mid F^{(2)}(y^S) \not< F^{(2)}(y),\ \forall y \in Y\}.$$

By Definition 3.1 in Chapter 6, the situation $(x_S, y^S) \in X \times Y$ is the Slater saddle point of the game $\langle \{1, 2\}, \{X, Y\}, F(x, y) \rangle$ if

$$F(x, y^S) \not< F(x_S, y^S) \not< F(x_S, y), \quad \forall x \in X,\ y \in Y.$$
Lemma 2.1. The situation $(x_S, y^S)$ is the Slater saddle point of the game (2.6) iff $x_S \in X_S$ and $y^S \in Y^S$. Consequently, the entire set of Slater saddle points of (2.6) is $X_S \times Y^S$ and the set of all values of the vector-valued payoff function is equal to the algebraic sum of the sets $F^{(1)}(X_S) + F^{(2)}(Y^S)$, where

$$F^{(1)}(X_S) = \bigcup_{x \in X_S} F^{(1)}(x), \quad F^{(2)}(Y^S) = \bigcup_{y \in Y^S} F^{(2)}(y).$$

Proof. Necessity. Because $(x_S, y^S)$ is the Slater saddle point of (2.6),

$$F^{(1)}(x) + F^{(2)}(y^S) \not< F^{(1)}(x_S) + F^{(2)}(y^S) \not< F^{(1)}(x_S) + F^{(2)}(y), \quad \forall x \in X,\ y \in Y.$$

From the left-hand side of the latter relation it follows that

$$F^{(1)}(x) \not< F^{(1)}(x_S), \quad \forall x \in X,$$

since the sign $\not<$ remains true in these relations if on both its sides the same constant $N$-vector $F^{(2)}(y^S)$ is added or subtracted. In turn, the latter relation signifies that $x_S \in X_S$, the set of Slater-minimal solutions of (2.7). The inclusion $y^S \in Y^S$ is proved in the same way.

Sufficiency. Let $x_S \in X_S$ and $y^S \in Y^S$. These inclusions are equivalent to the relations

$$F^{(1)}(x) \not< F^{(1)}(x_S), \quad \forall x \in X,$$

and

$$F^{(2)}(y^S) \not< F^{(2)}(y), \quad \forall y \in Y.$$

The first (second) relation remains true if a constant vector $F^{(2)}(y^S)$ (respectively, $F^{(1)}(x_S)$) is added to both sides:

$$F^{(1)}(x) + F^{(2)}(y^S) \not< F^{(1)}(x_S) + F^{(2)}(y^S), \quad \forall x \in X,$$

$$F^{(1)}(x_S) + F^{(2)}(y^S) \not< F^{(1)}(x_S) + F^{(2)}(y), \quad \forall y \in Y.$$

Hence, by Definition 3.1 (Chapter 6), the situation $(x_S, y^S)$ is the Slater saddle point of (2.6).

Remark 2.1. The following proposition is established in a similar way: For the situation $(x_k, y^k) \in X \times Y$ to be the Pareto (with $k = p$), Geoffrion ($k = g$), or A ($k = a$) saddle point of Definition 3.1 from Chapter 6, it is necessary and sufficient that $x_k \in X_k$ and $y^k \in Y^k$. Furthermore, the entire set of associated saddle points is $X_k \times Y^k$, while the set of values of the vector-valued payoff function on $X_k \times Y^k$ is $F^{(1)}(X_k) + F^{(2)}(Y^k)$. Here $X_k$ is the set of Pareto ($k = p$), Geoffrion ($k = g$), and A-minimal ($k = a$) solutions of (2.7) and $Y^k$, the set of associated maximal solutions of (2.8).
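Lemma 2.1 can be checked by brute force when $X$ and $Y$ are replaced by small finite sets. The sketch below is only an illustration under that finiteness assumption; the sample points and the helper names (`slater_minimal`, `slater_maximal`, `slater_saddle_points`) are hypothetical, and $F^{(1)}$, $F^{(2)}$ are taken to be identity maps.

```python
from itertools import product


def strictly_less(a, b):
    """True if a < b in every component."""
    return all(ai < bi for ai, bi in zip(a, b))


def slater_minimal(points):
    """Points p for which no q is strictly smaller in all components."""
    return [p for p in points if not any(strictly_less(q, p) for q in points)]


def slater_maximal(points):
    """Points p for which no q is strictly larger in all components."""
    return [p for p in points if not any(strictly_less(p, q) for q in points)]


def slater_saddle_points(X, Y):
    """Brute-force test of F(x, yS) not< F(xS, yS) not< F(xS, y), F(x, y) = x + y."""
    F = lambda x, y: tuple(a + b for a, b in zip(x, y))
    saddles = []
    for xs, ys in product(X, Y):
        value = F(xs, ys)
        no_better_x = not any(strictly_less(F(x, ys), value) for x in X)
        no_better_y = not any(strictly_less(value, F(xs, y)) for y in Y)
        if no_better_x and no_better_y:
            saddles.append((xs, ys))
    return saddles


# Hypothetical finite samples standing in for the reachability sets.
X = [(0, 3), (1, 1), (3, 0), (2, 2), (3, 3)]
Y = [(0, 0), (1, 2), (2, 1), (0, 2)]

X_S = slater_minimal(X)
Y_S = slater_maximal(Y)
saddles = slater_saddle_points(X, Y)
```

On this sample the brute-force saddle set coincides with the product of `X_S` and `Y_S`, as the lemma states.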
Proposition 2.1. If $U^S \in \mathcal{U}^S$ and $V^S \in \mathcal{V}^S$, the situation $(U^S, V^S)$ is the Slater saddle point of the differential game (2.1). Note that the sets $\mathcal{U}^S$ and $\mathcal{V}^S$ are defined in the concluding part of subsection 2.1.

Proof. By virtue of the structure of solutions to the multicriterial dynamic problem (Proposition 3.1, Chapter 2), if the strategy $V^S$ is Slater-maximal in problem (2.5), then $y[\vartheta, t_0, y_0, V^S] \in Y^S$ for any quasimotion $y[\cdot, t_0, y_0, V^S]$ of (2.3) generated from the initial position $(t_0, y_0)$ by the strategy $V^S$. In a similar way,

$$x[\vartheta, t_0, x_0, U^S] \in X_S, \quad \forall U^S \in \mathcal{U}^S,\ x[\cdot, t_0, x_0, U^S] \in \mathcal{X}[t_0, x_0, U^S].$$

Because $X$ is the reachability set of the system (2.2) from the position $(t_0, x_0)$,

$$x[\vartheta, t_0, x_0, U] \in X$$

for any strategies $U \in \mathcal{U}^d$ and quasimotions $x[\cdot, t_0, x_0, U]$ which they generate. In a similar way,

$$y[\vartheta, t_0, y_0, V] \in Y$$

for every $V \in \mathcal{V}^d$ and quasimotions $y[\cdot, t_0, y_0, V]$. Then, by Lemma 2.1,

$$F^{(1)}(x[\vartheta, t_0, x_0, U]) + F^{(2)}(y[\vartheta, t_0, y_0, V^S]) \not< F^{(1)}(x[\vartheta, t_0, x_0, U^S]) + F^{(2)}(y[\vartheta, t_0, y_0, V^S]) \not< F^{(1)}(x[\vartheta, t_0, x_0, U^S]) + F^{(2)}(y[\vartheta, t_0, y_0, V])$$

for any strategies $U \in \mathcal{U}^d$ and $V \in \mathcal{V}^d$ and the quasimotions they generate, which implies, by Definition 1.1 from Chapter 5, that $(U^S, V^S)$ is the Slater saddle point of the differential game (2.1). Q.E.D.

Remark 2.2. Because $\mathcal{U}^S \ne \emptyset$ and $\mathcal{V}^S \ne \emptyset$ (Corollary 1.1, Chapter 2), in the differential game (2.1) the set of Slater saddle points is not empty.
Remark 2.3. By Lemma 2.1 and the proof of Proposition 2.1, in addition to singleton strategies $U^S$ and $V^S$ (the points $x[\vartheta, t_0, x_0, U^S] \in X_S$ and $y[\vartheta, t_0, y_0, V^S] \in Y^S$), the Slater saddle point may be generated by the strategies $U^S \in \mathcal{U}$ and $V^S \in \mathcal{V}$, where the sets $X[\vartheta, t_0, x_0, U^S] \subset X_S$ and $Y[\vartheta, t_0, y_0, V^S] \subset Y^S$, as well as by any other strategies from the sets $\mathcal{U}^d$ and $\mathcal{V}^d$. The only constraint is that these strategies must satisfy the inclusions

$$X[\vartheta, t_0, x_0, U^S] \subseteq X_S, \quad Y[\vartheta, t_0, y_0, V^S] \subseteq Y^S. \qquad (2.9)$$

Consequently, the entire set of Slater saddle points of (2.1) is obtained by finding the entire sets $\mathcal{U}^S \subset \mathcal{U}^d$ ($\mathcal{V}^S \subset \mathcal{V}^d$) of strategies $U^S \in \mathcal{U}^d$ and $V^S \in \mathcal{V}^d$ that satisfy the inclusions (2.9). Then the entire set of Slater saddle points is $\mathcal{U}^S \times \mathcal{V}^S$, and the set of associated values of the vector-valued payoff function is the algebraic sum $F^{(1)}(X_S) + F^{(2)}(Y^S)$.

Let $\mathcal{S}$ be the set of Slater saddle points of (2.1). Unlike the general case of (2.1) (see Example 4.3, Chapter 5), the Slater saddle points of (2.1) are interchangeable. Specifically,

Proposition 2.2. If $(U^{(1)}, V^{(1)})$ and $(U^{(2)}, V^{(2)})$ are arbitrary Slater saddle points of (2.1), the situations $(U^{(1)}, V^{(2)})$ and $(U^{(2)}, V^{(1)})$ are also Slater saddle points of this game.
(2.10)
for $\forall U \in \mathcal{U}^d$ and $V \in \mathcal{V}^d$, $x[\cdot, t_0, x_0, U]$ and $y[\cdot, t_0, y_0, V]$. In obtaining (2.10) the following two properties were used:

(1) The reachability domains of (2.2) and (2.3) are

$$X = X[\vartheta, t_0, x_0, U \div H] = \bigcup_{U \in \mathcal{U}} X[\vartheta, t_0, x_0, U] = \bigcup_{U \in \mathcal{U}^d} X[\vartheta, t_0, x_0, U],$$

$$Y = Y[\vartheta, t_0, y_0, V \div Q] = \bigcup_{V \in \mathcal{V}} Y[\vartheta, t_0, y_0, V] = \bigcup_{V \in \mathcal{V}^d} Y[\vartheta, t_0, y_0, V];$$

(2) $F^{(1)} \not> F^{(2)} \iff F^{(1)} + a \not> F^{(2)} + a$ for any constant vector $a \in \mathbb{R}^N$.

The relations in (2.10) signify that the situation $(U^{(1)}, V^{(2)})$ is the Slater saddle point of (2.1). In a similar way the situation $(U^{(2)}, V^{(1)})$ is also found to be a Slater saddle point. Q.E.D.

Propositions 2.1 and 2.2 may be proved in a similar way for the other saddle points of (2.1).
Proposition 2.3. If the strategies $U^K \in \mathcal{U}^K$ and $V^K \in \mathcal{V}^K$, the situation $(U^K, V^K)$ is the Pareto ($K = P$), Geoffrion ($K = G$), and A ($K = A$) saddle point of (2.1). The saddle points (for every fixed $K = P, G, A$) are interchangeable, their set is not empty, and, by Property 1.1 from Chapter 5,

$$\mathcal{A} \subset \mathcal{G} \subset \mathcal{P} \subset \mathcal{S},$$

where $\mathcal{A}$ is the set of A-saddle points, $\mathcal{G}$ is the set of Geoffrion saddle points, $\mathcal{P}$ is the set of Pareto saddle points.

Corollary 2.1. The above findings suggest a procedure for obtaining vector-valued saddle points of (2.1). For illustration, this tool is applied to Pareto saddle points.

(1) Find at least one Pareto-maximal strategy $V^P$ in the multicriterial problem (2.5).

(2) Obtain the Pareto-minimal strategy $U^P$ of (2.4).

Then the Pareto saddle point of (2.1) is the situation $(U^P, V^P)$. By Remark 2.1 and Proposition 3.1 from Chapter 3, if the entire sets of such Pareto-optimal strategies $\mathcal{U}^P$ and $\mathcal{V}^P$ have been found, the set of values of the vector-valued payoff function $F(x[\vartheta], y[\vartheta]) = F^{(1)}(x[\vartheta]) + F^{(2)}(y[\vartheta])$ on all Pareto saddle points is

$$F^{(1)}(X[\mathcal{U}^P]) + F^{(2)}(Y[\mathcal{V}^P]),$$
where
Example 2.2. Consider the game (2.1), where the system $\Sigma$ is described by the equations

$$\dot{x} = u, \quad \dot{y} = v, \quad x[0] = y[0] = 0_2, \qquad (2.11)$$

where $x = (x_1, x_2)$, $y = (y_1, y_2)$; the sets are

$$H = \{u = (u_1, u_2) \mid u_1 \le 1,\ u_2 \le 1,\ u_1 + u_2 \ge 0\}, \quad Q = \{v = (v_1, v_2) \mid v_1^2 + v_2^2 \le 1\},$$

time $t \in [0, \vartheta]$, $\vartheta = 1$; the two-component payoff function is

$$F(x[\vartheta], y[\vartheta]) = (F_1(x[\vartheta], y[\vartheta]), F_2(x[\vartheta], y[\vartheta])) = x[\vartheta] + y[\vartheta] = (x_1[\vartheta] + y_1[\vartheta],\ x_2[\vartheta] + y_2[\vartheta]). \qquad (2.12)$$

We will denote the game as $\Gamma_{(2.1)}$. The reachability domains $X$ and $Y$ of the first and second subsystems from (2.11) are shaded in Fig. 7.2.1. By Remark 2.1, for the static two-criterial problem

$$\langle X, F^{(1)}(x) = x \rangle,$$

we obtain a set $X_P$ of Pareto-minimal strategies $x_P$ (in Fig. 7.2.1 the set is shown as a heavy line $AB$) and for the problem

$$\langle Y, F^{(2)}(y) = y \rangle,$$

we obtain a set $Y^P$ (the quadrant $CD$ of the circle) of Pareto-maximal strategies $y^P$. Then the set of values of the two-component payoff function (2.12) which can be reached on the entire set of Pareto saddle points will, by Corollary 2.1, coincide with the algebraic sum of the sets (Fig. 7.2.2),

$$F(X_P, Y^P) = (F_1(X_P, Y^P), F_2(X_P, Y^P)) = F^{(1)}(X_P) + F^{(2)}(Y^P) = X_P + Y^P.$$
Figure 7.2.1.

Figure 7.2.2.

Following the procedure of Corollary 2.1, we obtain a subset of Pareto saddle points for the differential game. We wish to find the strategies $U^P$ of the first player such that

$$x[\vartheta, t_0, x_0, U^P] \in X_P,$$

and $V^P$ of the second player such that

$$y[\vartheta, t_0, y_0, V^P] \in Y^P.$$

These conditions are easily seen to be satisfied by any strategies of the form

$$U^P \div u(t, x) = (u_1, u_2 \mid u_1 + u_2 = 0,\ u_1 = \text{const} \le 1,\ u_2 = \text{const} \le 1),$$

$$V^P \div v(t, y) = (v_1, v_2 \mid v_1^2 + v_2^2 = 1,\ v_1 = \text{const} \ge 0,\ v_2 = \text{const} \ge 0).$$

Every situation $(U^P, V^P)$ is a Pareto saddle point of $\Gamma_{(2.1)}$. The entire set of Pareto saddle points of $\Gamma_{(2.1)}$ is larger than the set of situations $(U^P, V^P)$ thus obtained. They do not, for instance, include the Pareto saddle points $(U^P, V^P)$ for which $X[\vartheta, t_0, x_0, U^P]$ is a subset (not a point) of $X_P$ and $Y[\vartheta, t_0, y_0, V^P]$ is also a subset of the set $Y^P$. Besides, they do not include the strategies $U^P \in \mathcal{U}^d$ or $V^P \in \mathcal{V}^d$ of the form $U \div u(t, x, y)$ and $V \div v(t, x, y)$ such that

$$X[\vartheta, t_0, x_0, U^P] \subseteq X_P, \quad Y[\vartheta, t_0, y_0, V^P] \subseteq Y^P.$$

The sets $\mathcal{U}^d$ and $\mathcal{V}^d$ are defined in (1.6).
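The Pareto sets of this example can be reproduced approximately on a grid. The sketch below rests on assumptions that are not stated in the text: the reachability domains are taken to coincide with $H$ and $Q$ themselves (plausible here because both sets are convex and $\vartheta = 1$), and they are sampled on an integer grid scaled by $1/n$; the helper names and the resolution `n` are hypothetical.

```python
def dominates(q, p):
    """q <= p in every component and q differs from p."""
    return all(a <= b for a, b in zip(q, p)) and q != p


def pareto_minimal(points):
    """Points not dominated from below by any other sample point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def pareto_maximal(points):
    """Points not dominated from above by any other sample point."""
    return [p for p in points if not any(dominates(p, q) for q in points)]


n = 10  # grid resolution; coordinates are integers i, j standing for i/n, j/n

# Sample of X = H = {u1 <= 1, u2 <= 1, u1 + u2 >= 0}.
X = [(i, j) for i in range(-n, n + 1) for j in range(-n, n + 1) if i + j >= 0]
# Sample of Y = Q = unit disk.
Y = [(i, j) for i in range(-n, n + 1) for j in range(-n, n + 1)
     if i * i + j * j <= n * n]

X_P = pareto_minimal(X)   # expected to lie on the segment u1 + u2 = 0
Y_P = pareto_maximal(Y)   # expected to lie in the quadrant v1 >= 0, v2 >= 0
```

On the grid, `X_P` consists of the sample points of the segment $AB$ ($u_1 + u_2 = 0$), and `Y_P` lies in the nonnegative quadrant, approximating the arc $CD$.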
2.3. Vector-Valued Maximin and Minimax
For the case of the differential game (2.1) with separable payoff function, Proposition 2.7 from Chapter 6 may be made more specific regarding the position of the vector-valued minimaxes, maximins, and the set of values of the vector-valued payoff function that can be obtained on the set of all saddle points. For instance, when the notion of Slater optimality is used, the set of Slater maximins coincides with $\mathrm{Fr}^{(S)} F(\mathcal{S})$, the Slater-maximal boundary of $F(\mathcal{S})$, i.e., of the set of values of the payoff function that can be obtained on the set of all Slater saddle points, while the set of Slater minimaxes coincides with $\mathrm{Fr}_{(S)} F(\mathcal{S})$, the set of Slater minima for $F(\mathcal{S})$. This fact is established in the next proposition.
Proposition 2.4. For the differential game (2.1),

Here

$$F^{(1)}(X_S) = \bigcup_{x \in X_S} F^{(1)}(x), \quad F^{(2)}(Y^S) = \bigcup_{y \in Y^S} F^{(2)}(y),$$

and $X_S$ ($Y^S$) is the set of Slater-minimal (Slater-maximal) solutions of problem (2.7) (problem (2.8)). Let us prove (2.13); equation (2.14) may be proved in a similar way. The proof will proceed in two stages. In the first stage we establish the following inclusion for the Slater maximins:

$$\bigcup \max^{S}_{V \in \mathcal{V}^d} \min^{S}_{x[\cdot], y[\cdot]} F(x[\vartheta, t_0, x_0, V], y[\vartheta, t_0, y_0, V]) \subset \mathrm{Fr}^{(S)} [F^{(1)}(X_S) + F^{(2)}(Y^S)], \qquad (2.15)$$

and in the second stage, the converse inclusion

$$\mathrm{Fr}^{(S)} [F^{(1)}(X_S) + F^{(2)}(Y^S)] \subset \bigcup \max^{S}_{V \in \mathcal{V}^d} \min^{S}_{x[\cdot], y[\cdot]} F(x[\vartheta, t_0, x_0, V], y[\vartheta, t_0, y_0, V]). \qquad (2.16)$$

From these two inclusions (2.13) will follow.

Proof. First stage. The system (2.2), (2.3) is represented in the form
Here the vector $z = (x, y)$. By Definition 1.1 (Chapter 6) of the maximin strategy $V^{(S)}$ of the game (1.1, Chapter 6), every strategy $V \in \mathcal{V}^d$ is associated with the set $Z_{(S)}[V]$ of Slater-minimal solutions $z_S[V] = (x_S[V], y_S[V])$ of the problem

$$\langle Z[V], F(z) = F^{(1)}(x) + F^{(2)}(y) \rangle, \qquad (2.17)$$
where

$$Z[V] = \bigcup_{z[\cdot]} z[\vartheta, t_0, z_0, V] = \bigcup_{x[\cdot], y[\cdot]} (x[\vartheta, t_0, x_0, V], y[\vartheta, t_0, y_0, V]).$$

Because of the specific form of the dynamic system (2.2), (2.3), the first subsystem (2.2) is obviously independent of the strategy $V$. Then, by the way the quasimotions are obtained (subsection 2.3, Chapter 1), the set

$$Z[V] = X \times Y[V],$$

where the reachability domain of the system (2.2)

$$X = \bigcup_{U \in \mathcal{U}} X[\vartheta, t_0, x_0, U]$$

and the set $Y[V] = \bigcup_{y[\cdot]} y[\vartheta, t_0, y_0, V]$ is the intersection of the bunch $\mathcal{Y}[t_0, y_0, V]$ of quasimotions $y[\cdot, t_0, y_0, V]$ of the system (2.3) and the hyperplane $t = \vartheta$. Therefore, for every point $z_S[V] = (x_S[V], y_S[V])$ of the set $Z_{(S)}[V]$,

$$F^{(1)}(x_S[V]) + F^{(2)}(y_S[V]) \not> F^{(1)}(x) + F^{(2)}(y[V]) \qquad (2.18)$$

for any $x \in X$ and $y[V] \in Y[V]$. Assuming in (2.18) the vector $y[V] = y_S[V]$, we have

$$F^{(1)}(x_S[V]) \not> F^{(1)}(x), \quad \forall x \in X.$$

Consequently, for any strategy $V \in \mathcal{V}^d$ the collection of all $x_S[V]$ forms the set $X_S$ of Slater-minimal solutions of the static multicriterial problem (2.7). This set $X_S$ is independent of the choice of the strategy $V$, and every strategy $V$ in (2.18) may be associated with any point $x_S[V] = x_S \in X_S$, which is also independent of the choice of $y_S[V]$. Assuming now in (2.18) the vector $x = x_S = x_S[V]$, we have

$$F^{(2)}(y_S[V]) \not> F^{(2)}(y), \quad \forall y \in Y[V].$$

Therefore, the set of all points $y_S[V]$ coincides with the set $Y_S[V]$ of Slater-minimal solutions $y_S[V]$ of the problem

$$\langle Y[V], F^{(2)}(y) \rangle,$$

and the choice of the specific $y_S[V] \in Y_S[V]$ is also independent of the point $x_S = x_S[V]$ in (2.18). Consequently, we have

$$Z_{(S)}[V] = X_S \times Y_S[V] \qquad (2.19)$$

and any point $x_S[V] \in X_S$.
With $V^{(S)}$ the Slater-maximin strategy of (2.1), by Definition 1.1 in Chapter 6 this implies that there exists

$$\hat{z}_S[V^{(S)}] = (\hat{x}_S[V^{(S)}], \hat{y}_S[V^{(S)}])$$

such that

$$F^{(1)}(\hat{x}_S[V^{(S)}]) + F^{(2)}(\hat{y}_S[V^{(S)}]) \not< F^{(1)}(x_S[V]) + F^{(2)}(y_S[V]) \qquad (2.20)$$

for any $V \in \mathcal{V}^d$, $x_S[V] \in X_S$, and $y_S[V] \in Y_S[V]$. Assuming now in (2.20) the vector $x_S[V] = \hat{x}_S[V^{(S)}] = x_S \in X_S$, we have

$$F^{(2)}(\hat{y}_S[V^{(S)}]) \not< F^{(2)}(y_S[V]) \qquad (2.21)$$

with $\forall V \in \mathcal{V}^d$. The set of strategies $\mathcal{V} \subset \mathcal{V}^d$ ($\mathcal{V}$ is defined in subsection 2.1), and among the strategies $\mathcal{V}$ there are singleton strategies, i.e., for any $y^* \in Y$, the reachability domain of the system (2.3) from the position $(t_0, y_0)$, there exists (see Corollary 4.1, Chapter 1) a singleton strategy $V_{y^*} \in \mathcal{V}$ such that $y^* = y[\vartheta, t_0, y_0, V_{y^*}]$ for all quasimotions $y[\cdot, t_0, y_0, V_{y^*}]$ of the system (2.3). For such strategies

$$y^* = Y[V_{y^*}] = Y_S[V_{y^*}],$$

since the entire set $Y[V_{y^*}]$ degenerates to the point $y^*$. For such singleton strategies, equation (2.21) signifies that

$$F^{(2)}(\hat{y}_S[V^{(S)}]) \not< F^{(2)}(y), \quad \forall y \in Y;$$

thus

$$\hat{y}_S[V^{(S)}] \in Y^S, \qquad (2.22)$$

the set of Slater-maximal solutions of problem (2.8). In light of the inclusions (2.19) and (2.22),

$$F^{(1)}(\hat{x}_S[V^{(S)}]) + F^{(2)}(\hat{y}_S[V^{(S)}]) \in F^{(1)}(X_S) + F^{(2)}(Y^S).$$

For singleton strategies $V_{y^*}$, where $y^* \in Y$, bearing in mind the inclusion (2.19), (2.20) may be reformulated as

$$F^{(1)}(\hat{x}_S[V^{(S)}]) + F^{(2)}(\hat{y}_S[V^{(S)}]) \not< F^{(1)}(x_S) + F^{(2)}(y)$$

for every $x_S \in X_S$ and $y \in Y$. But then this relation holds also with any $(x_S, y) \in X_S \times Y^S$; in other words, the situation $(\hat{x}_S[V^{(S)}], \hat{y}_S[V^{(S)}])$ terminates in the Slater maximum on the set $F^{(1)}(X_S) + F^{(2)}(Y^S)$. Consequently, any Slater maximin of (2.1) satisfies the inclusion (2.15).

Second stage. Proceeding now to the proof of (2.16), suppose that for some strategy $V^* \in \mathcal{V}^d$ and point $\hat{z}_S[V^*] = (\hat{x}_S[V^*], \hat{y}_S[V^*])$ the following inclusion holds:

$$F^{(1)}(\hat{x}_S[V^*]) + F^{(2)}(\hat{y}_S[V^*]) \in \mathrm{Fr}^{(S)} [F^{(1)}(X_S) + F^{(2)}(Y^S)]. \qquad (2.23)$$
In this case the strategy $V^*$ will be shown to be Slater-maximin for the game (2.1), while $\hat{z}_S[V^*] = (\hat{x}_S[V^*], \hat{y}_S[V^*])$ is the associated point from Definition 1.1 (Chapter 6). From the inclusion (2.23), it follows that

$$F^{(1)}(\hat{x}_S[V^*]) + F^{(2)}(\hat{y}_S[V^*]) \not< F^{(1)}(x_S) + F^{(2)}(y^S) \qquad (2.24)$$

for every $x_S \in X_S$ and $y^S \in Y^S$. By the strategy $V \in \mathcal{V}^d$ of the second player, a set $Z_{(S)}[V]$ is obtained of Slater-minimal solutions of (2.17). By the first stage of the proof, $Z_{(S)}[V] = X_S \times Y_S[V]$. The set $Y^S$ of Slater-maximal solutions $y^S$ of (2.8) is externally stable [70, p. 158]. Therefore, for any quasimotion $y[\cdot, t_0, y_0, V]$ of the system (2.3) (including those for which $y[\vartheta, t_0, y_0, V] = y_S[V] \in Y_S[V]$), there exists $y^S \in Y^S$ such that

$$F^{(2)}(y_S[V]) \le F^{(2)}(y^S).$$

By this inequality and the proof of the equality $X_S = \{x_S[V]\}$ for any $V \in \mathcal{V}^d$ in the first stage, (2.24) may be given in the form

$$F^{(1)}(\hat{x}_S[V^*]) + F^{(2)}(\hat{y}_S[V^*]) \not< F^{(1)}(x_S[V]) + F^{(2)}(y_S[V])$$

for any $V \in \mathcal{V}^d$ and $(x_S[V], y_S[V]) \in Z_{(S)}[V]$. Hence, by Definition 1.1 (Chapter 6), $V^*$ is the Slater-maximin strategy of the differential game (2.1). This proves the proposition.

Remark 2.4. The above procedure of proving Proposition 2.4 is applicable to the cases of Pareto, Geoffrion, and A-optima. In this case the following equalities may be established for (2.1):
F ( K=) u maxKu minK q x [ e , to, xo, V ] , y [ e , to, yo, V]) VEYd
5K) =u
*c~l,Yc~l
minKu maxK [F(x[O,to, xo, U l , y[O, to, yo, U l )
UEqd
*C.I,YC.l
(2.25)
where $K, k = (P, p;\ G, g;\ A, a)$. These equalities and Remark 2.3 help reveal the structure, i.e., with $K, k = P, p$, of the Pareto-optimal solutions of (2.1). First, the set

$$F(\mathcal{P}) = \bigcup_{(U, V) \in \mathcal{P}}\ \bigcup_{x[\cdot], y[\cdot]} F(x[\vartheta, t_0, x_0, U], y[\vartheta, t_0, y_0, V])$$

of values of the vector-valued payoff function $F(x[\vartheta], y[\vartheta])$ on all Pareto saddle points is the sum

$$F^{(1)}(X_P) + F^{(2)}(Y^P),$$

where $X_P$ is the set of Pareto-minimal solutions of problem (2.7) and $Y^P$ is the set of Pareto-maximal solutions of problem (2.8). Second, the Pareto maximins of this game form the Pareto-maximal boundary of the set $F(\mathcal{P})$, while the Pareto minimaxes, the Pareto-minimal boundary of this set. Thus in the game $\Gamma_{(2.1)}$ of Example 2.1, any point of the curve $LTHK$ (heavy line) is the Pareto maximin while every point of the segment $LK$ (double line), the Pareto minimax.
2.4. Existence of ZS-Solutions With N = 2
Recall that by the ZS-solution of the differential game (1.1, Chapter 6) we understand the Slater saddle point $(U^{ZS}, V^{ZS})$ for which there exists a point $\hat{x}[U^{ZS}, V^{ZS}] = \hat{x}[\vartheta, t_0, x_0, U^{ZS}, V^{ZS}]$ and a Slater maximin and Slater minimax such that

$$F(\hat{x}[U^{ZS}, V^{ZS}]) = \max^{S}_{V \in \mathcal{V}} \min^{S}_{x[\cdot]} F(x[\vartheta, t_0, x_0, V]) = \min^{S}_{U \in \mathcal{U}} \max^{S}_{x[\cdot]} F(x[\vartheta, t_0, x_0, U]).$$

In this subsection the existence of a ZS-solution will be demonstrated in the differential game (2.1) with two-component payoff function ($N = \{1, 2\}$). In effect, we consider a differential game (2.1), where $F^{(j)} = (F_1^{(j)}, F_2^{(j)})$, $j = 1, 2$. This game is associated with two two-criterial static problems:

$$\Gamma^{(1)} = \langle X, \{F_1^{(1)}(x), F_2^{(1)}(x)\} \rangle, \quad \Gamma^{(2)} = \langle Y, \{F_1^{(2)}(y), F_2^{(2)}(y)\} \rangle,$$
where $X$ is the reachability domain of the system (2.2) with $U \in \mathcal{U}$ and $Y$ an analogous domain for the system (2.3), where $V \in \mathcal{V}$. In this context some propositions from the theory of multicriterial problems will be helpful.

Lemma 2.2. For every $i = 1, 2$ in problem $\Gamma^{(1)}$,

$$\min_{x \in X} F_i^{(1)}(x) = \min_{x \in X_S} F_i^{(1)}(x),$$

where $X_S$ is the set of Slater-minimal solutions of $\Gamma^{(1)}$; similarly,

$$\max_{y \in Y} F_i^{(2)}(y) = \max_{y \in Y^S} F_i^{(2)}(y),$$

where $Y^S$ is the set of Slater-maximal solutions of $\Gamma^{(2)}$.

This Lemma is established in [70, p. 75].

Lemma 2.3. If
Fil) = min F\')(x), *EX,
(2.26)
there exists
(2.27) such that
(2.28) in a similar way, i j
there exists
Fi2)= min F\')(y) Y E Y'
such that
Here, as before,
Proof. The implication (2.26) $\Rightarrow$ (2.27), (2.28) is proved here; the second part of the lemma may be established in a similar way. The set $X_S$ of Slater-minimal solutions of $\Gamma^{(1)}$ is a compact subset of $X$ (provided $X$ is a compactum and $F^{(1)}(x)$ is continuous). Under a continuous mapping, the image of a compactum is a compactum; thus the set $F^{(1)}(X_S)$ is closed and bounded in $\mathbb R^2$. But then the set $F^{(1)}(X_S) \cap \{F_1^{(1)} = \bar F_1^{(1)}\}$ is also a compactum. Therefore there exists a number $\bar F_2^{(1)}$ such that
$$\bar F_2^{(1)} = \max_{(F_1^{(1)}, F_2^{(1)}) \in F^{(1)}(X_S) \cap \{F_1^{(1)} = \bar F_1^{(1)}\}} F_2^{(1)}. \tag{2.29}$$
Let us show that fil)in (2.29) satisfies conditions (2.27) and (2.28). We assume the contrary, i.e., that there exists a vector F(') = (Fill, Fi'))E ff(')(X,) such that F\') # f\') and
(2.30) where
f\')
is defined in (2.29). Because F(')eF(')(Xs) and
g(') =
(f\'),Pi1))€F(')(X,), from (2.30) and the internal stability of the set F(')(X,), we have
But F\') # because of the way cannot be true, for by Lemma 2.2,
F(') is obtained in (2.30) and Fil) < f\') (2.31)
and
B(')E F'"(X,) = F("(X). The resultant contradiction F"'E{F"'(x,)
max
n { F\"
= P,')]]
F\') =
max
F"'EIF"' (X.)
proves the equality
F\1) = pa) 2
whence follows (2.27). The inclusion (2.28) is a corollary of (2.31).
Remark 2.5. Propositions similar to Lemmas 2.2 and 2.3 in the case of a Pareto optimum have been proved in [70, pp. 75, 118].
Proposition 2.5. In the differential game (2.1) with $N = 2$ there exists a ZS-solution, i.e., a Slater saddle point $(U^{ZS}, V^{ZS})$ such that
$$F(x[\theta, t_0, x_0, U^{ZS}], y[\theta, t_0, y_0, V^{ZS}]) = \max^{S}_{V \in \mathcal V_d} \min^{S}_{x[\cdot], y[\cdot]} F(x[\theta, t_0, x_0, V], y[\theta, t_0, y_0, V]) = \min^{S}_{U \in \mathcal U_d} \max^{S}_{x[\cdot], y[\cdot]} F(x[\theta, t_0, x_0, U], y[\theta, t_0, y_0, U])$$
for any quasimotions $x[\cdot, t_0, x_0, U^{ZS}]$, $y[\cdot, t_0, y_0, V^{ZS}]$.
Proof. Note that by virtue of the way quasimotions are obtained and of the specific form of the systems (2.2) and (2.3),
$$X[\theta, t_0, x_0, V] = X[\theta, t_0, x_0, U \div H] = X, \qquad Y[\theta, t_0, y_0, U] = Y[\theta, t_0, y_0, V \div Q] = Y,$$
where $X$ (respectively, $Y$) is the reachability domain of the system (2.2) (or (2.3)). By Lemmas 2.3 and 2.2, there exists a Slater-minimal solution $x_S \in X$ of the problem $\Gamma^{(1)}$ such that
$$F_1^{(1)}(x_S) = \min_{x \in X_S} F_1^{(1)}(x) = \min_{x \in X} F_1^{(1)}(x), \qquad F_2^{(1)}(x_S) = \max_{x \in X_S} F_2^{(1)}(x), \tag{2.32}$$
and the point $x_S \in X_S \subset X$. Similarly, in problem $\Gamma^{(2)}$ there exists a Slater-maximal solution $y^S \in Y^S$ for which
$$F_1^{(2)}(y^S) = \max_{y \in Y^S} F_1^{(2)}(y) = \max_{y \in Y} F_1^{(2)}(y), \qquad F_2^{(2)}(y^S) = \min_{y \in Y^S} F_2^{(2)}(y). \tag{2.33}$$
Now let us consider the multicriterial problem
$$\langle X_S \times Y^S, F^{(1)}(x) + F^{(2)}(y) \rangle, \tag{2.34}$$
where, as noted above, the sets $X_S$ and $Y^S$ are compact. By virtue of (2.32) the situation $(x_S, y^S)$ is Slater-minimal in problem (2.34) and, by (2.33), the same situation is the Slater-maximal solution of this multicriterial problem. Consequently, at the point $(x_S, y^S)$,
$$F^{(1)}(x_S) + F^{(2)}(y^S) = \max^{S}_{(x, y) \in X_S \times Y^S} [F^{(1)}(x) + F^{(2)}(y)] = \min^{S}_{(x, y) \in X_S \times Y^S} [F^{(1)}(x) + F^{(2)}(y)],$$
or
$$F^{(1)}(x_S) + F^{(2)}(y^S) \in \mathscr F^{(S)}[F^{(1)}(X_S) + F^{(2)}(Y^S)], \qquad F^{(1)}(x_S) + F^{(2)}(y^S) \in \mathscr F_{(S)}[F^{(1)}(X_S) + F^{(2)}(Y^S)]. \tag{2.35}$$
In addition, by Lemma 2.1 the situation $(x_S, y^S)$ is the Slater saddle point of the game (2.6) with $N = 2$, because $x_S \in X_S$ and $y^S \in Y^S$; thus
$$F^{(1)}(x) + F^{(2)}(y^S) \not< F^{(1)}(x_S) + F^{(2)}(y^S) \not< F^{(1)}(x_S) + F^{(2)}(y) \tag{2.36}$$
for every $x \in X$ and $y \in Y$. By Corollary 4.1 from Chapter 1 there exist strategies $U^{ZS} \in \mathcal U_d$ and $V^{ZS} \in \mathcal V_d$ such that
$$x[\theta, t_0, x_0, U^{ZS}] = x_S, \qquad y[\theta, t_0, y_0, V^{ZS}] = y^S \tag{2.37}$$
for all quasimotions $x[\cdot, t_0, x_0, U^{ZS}]$ and $y[\cdot, t_0, y_0, V^{ZS}]$. Since for any $U \in \mathcal U_d$ and $V \in \mathcal V_d$,
$$x[\theta, t_0, x_0, U] \in X, \qquad y[\theta, t_0, y_0, V] \in Y,$$
it follows from (2.36) and (2.37) that
$$F^{(1)}(x[\theta, t_0, x_0, U]) + F^{(2)}(y[\theta, t_0, y_0, V^{ZS}]) \not< F^{(1)}(x[\theta, t_0, x_0, U^{ZS}]) + F^{(2)}(y[\theta, t_0, y_0, V^{ZS}]) \not< F^{(1)}(x[\theta, t_0, x_0, U^{ZS}]) + F^{(2)}(y[\theta, t_0, y_0, V])$$
for any strategies $U \in \mathcal U_d$ and $V \in \mathcal V_d$ and quasimotions $x[\cdot, t_0, x_0, U]$, $y[\cdot, t_0, y_0, V]$. The latter chain of relations signifies that the situation $(U^{ZS}, V^{ZS})$ is the Slater saddle point of the differential game (2.1) with $N = 2$. By (2.36) and (2.35), for the Slater saddle point $(U^{ZS}, V^{ZS})$,
$$F^{(1)}(x[\theta, t_0, x_0, U^{ZS}]) + F^{(2)}(y[\theta, t_0, y_0, V^{ZS}]) \in \mathscr F^{(S)}[F^{(1)}(X_S) + F^{(2)}(Y^S)],$$
$$F^{(1)}(x[\theta, t_0, x_0, U^{ZS}]) + F^{(2)}(y[\theta, t_0, y_0, V^{ZS}]) \in \mathscr F_{(S)}[F^{(1)}(X_S) + F^{(2)}(Y^S)].$$
Then, by Proposition 2.4,
is simultaneously the Slater maximin and Slater minimax. Consequently, the situation $(U^{ZS}, V^{ZS})$ obtained in (2.32), (2.33), and (2.37) is the ZS-solution of the differential game (2.1) with $N = 2$. Note also that, by Proposition 2.8 from Chapter 6, the Slater-maximin strategy in this ZS-solution is $V^{ZS}$ with the associated point $(\bar x[V^{ZS}], \bar y[V^{ZS}])$ and the Slater-minimax strategy is $U^{ZS}$ with the associated point $(\bar x[U^{ZS}], \bar y[U^{ZS}])$.
Thus, in Example 4.2 in Chapter 5 there are two ZS-solutions, to which the points C and D in Fig. 5.4.4 correspond. The point C is associated with the first ZS-solution, the Slater saddle point
$$(U^{(1)}, V^{(1)}) \div (u^{(1)}, v^{(1)}) = ((-1, +1), (0, 1)),$$
with $U^{(1)}$ the Slater-minimax and $V^{(1)}$ the Slater-maximin strategies. The point D represents the second ZS-solution
$$(U^{(2)}, V^{(2)}) \div (u^{(2)}, v^{(2)}) = ((+1, -1), (1, 0)).$$
Remark 2.6. In a similar way it may be proved that in (2.1) with $N = 2$ there exists a Pareto saddle point $(U^P, V^P)$ that is the ZP-solution, i.e., $F(\bar x[\theta, t_0, x_0, U^P, V^P])$ coincides with the Pareto maximin and minimax. In particular, there are two such solutions in Example 2.1. In Fig. 7.2.2 they are represented as the points L and K. For L the Pareto saddle point is $(U^{(L)}, V^{(L)}) \div ((-1, +1), (0, +1))$, the associated Pareto-maximin strategy is $V^{(L)}$ with point
$$(\bar x[\theta, t_0, x_0, U^{(L)}], \bar y[\theta, t_0, y_0, V^{(L)}]) = ((-1, +1), (0, +1)),$$
and the Pareto-minimax strategy is $U^{(L)}$ with the same point. The point K is represented as the Pareto saddle point $(U^{(K)}, V^{(K)}) \div ((+1, -1), (+1, 0))$; here the Pareto-minimax strategy is $U^{(K)}$ with point $((1, -1), (1, 0))$ and the Pareto-maximin strategy is $V^{(K)}$ with the same point. Finally, the point L represents the Pareto maximin
$$\max^{P}_{V \in \mathcal V} \min^{P}_{x[\cdot], y[\cdot]} F(x[\theta], y[\theta]),$$
which is equal to $(-1, 2)$, and the Pareto minimax
$$\min^{P}_{U \in \mathcal U} \max^{P}_{x[\cdot], y[\cdot]} F(x[\theta], y[\theta]),$$
and for the point K,
$$\max^{P}_{V \in \mathcal V} \min^{P}_{x[\cdot], y[\cdot]} F(x[\theta], y[\theta]) = \min^{P}_{U \in \mathcal U} \max^{P}_{x[\cdot], y[\cdot]} F(x[\theta], y[\theta]) = (2, 1).$$
3. The ZS-solution in the Competition Problem
This section considers basically the properties of vector-valued saddle points, maximins, and minimaxes in a differential game with a mirror vector-valued payoff function. Then, using the properties of such solutions for games with separable and mirror payoff functions, a procedure is proposed for deriving a ZS-solution in a competition model.
3.1. A Game with Mirror Payoff Function (Saddle Points)
In a differential game
$$\langle \Sigma_M, \mathcal U_M, \mathcal V_M, F(x[\theta], y[\theta]) \rangle, \tag{3.1}$$
the control system $\Sigma_M$ is described by equations (1.5),
$$\dot x = f(t, x, u), \qquad \dot y = f(t, y, v), \tag{3.2}$$
where $x, y \in \mathbb R^n$; time $t \in [t_0, \theta]$; the constants $\theta > t_0 \ge 0$; the control actions of the players $u \in Q \in \operatorname{comp} \mathbb R^q$, $v \in Q$ (or $H = Q$); the vector function $f(t, x, u)$ satisfies Condition 1.1 of Chapter 1. The initial positions of both subsystems of (3.2) are assumed to be the same, i.e., the initial position is
$$(t_0, x_0, y_0 = x_0). \tag{3.3}$$
The sets of strategies of the players are defined in (1.6), where $H = Q$, that is,
$$\mathcal U_M = \{U \div u(t, x, y) \mid u(t, x, y) \in Q\}, \qquad \mathcal V_M = \{V \div v(t, x, y) \mid v(t, x, y) \in Q\}. \tag{3.4}$$
Because the right-hand sides of the subsystems in (3.2) are the same, the initial positions $(t_0, x_0)$ and $(t_0, y_0)$ coincide, and the constraints (inclusions) in (3.4) are the same, the reachability sets of the two subsystems in (3.2) coincide, that is,
$$X[\theta, t_0, x_0, U \div Q] = X = Y = Y[\theta, t_0, y_0 = x_0, V \div Q]. \tag{3.5}$$
The components of the vector-valued function $F(x, y)$ are continuous on $X \times Y = X^2$. Besides, $F(x, y)$ is assumed to be mirror (anticommutative):
$$F(x, y) = -F(y, x) \tag{3.6}$$
for any $x \in X$ and $y \in Y$. In particular, it follows from (3.6) that
$$F(x, x) = 0_N \in \mathbb R^N, \quad \forall x \in X. \tag{3.7}$$
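The mirror property (3.6) and its consequence (3.7) are easy to test on a finite set of points. The sketch below is an illustration under assumptions: `criterion` is a hypothetical two-component function $G$, and the payoff is taken in the difference form $F(x, y) = G(x) - G(y)$; integer points are used so that the comparison is exact.

```python
def criterion(x):
    """Hypothetical two-component criterion G (an assumption, not from the text)."""
    return (x[0] ** 2 - x[1], x[0] + 3 * x[1])

def mirror_payoff(x, y):
    """F(x, y) = G(x) - G(y); by construction F(x, y) = -F(y, x)."""
    return tuple(a - b for a, b in zip(criterion(x), criterion(y)))

def is_mirror(payoff, points):
    """Check property (3.6) on every ordered pair of the given points."""
    return all(payoff(x, y) == tuple(-c for c in payoff(y, x))
               for x in points for y in points)

points = [(0, 0), (1, 2), (-1, 3), (2, -1)]
print(is_mirror(mirror_payoff, points))
```

A payoff such as $G(x) + G(y)$ fails the same check, which makes the test a quick guard when a payoff is assembled from several terms.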
Among the mirror vector-valued functions are
$$F(x, y) = [x \circ y]'(x - y), \quad \text{where } x \circ y = (x_1 y_1, \ldots, x_n y_n); \qquad F(x, y) = F(x) - F(y).$$
The latter example forces us to take a special look at the class of differential games (3.1) with mirror vector-valued payoff function. The vector-valued payoff function $F(x[\theta], y[\theta])$ of (3.1) that has the property (3.6) will be referred to as mirror and the game (3.1) itself as the zero-sum game with mirror vector-valued payoff function.
The following is an auxiliary proposition that will be helpful in the proofs of this subsection. Let $U^* \div u^*(t, x, y)$ and $V^* \div v^*(t, x, y)$ be some strategies from the sets $\mathcal U_M$ and $\mathcal V_M$, respectively, and let
$$X[U^*] = \bigcup_{x[\cdot]} x[\theta, t_0, x_0, U^*], \qquad Y[V^*] = \bigcup_{y[\cdot]} y[\theta, t_0, y_0, V^*].$$
Consider the strategies $U_y^* \div u^*(t, y, x)$ and $V_x^* \div v^*(t, y, x)$; these differ from $U^* \div u^*(t, x, y)$ and $V^* \div v^*(t, x, y)$ in that in the functions $u^*(t, x, y)$ and $v^*(t, x, y)$ the vectors $x$ and $y$ are interchanged. Let
$$X[V_x^*] = \bigcup_{x[\cdot]} x[\theta, t_0, x_0, V_x^*], \qquad Y[U_y^*] = \bigcup_{y[\cdot]} y[\theta, t_0, x_0, U_y^*].$$
Lemma 3.1. The following equalities hold:
$$X[U^*] = Y[U_y^*], \qquad Y[V^*] = X[V_x^*].$$
Indeed, in devising stepwise quasimotions of the first (respectively, second) subsystem of (3.2) with $U = V_x^* \div v^*(t, y, x)$ (respectively, with $V = U_y^* \div u^*(t, y, x)$), we have, in fact, the same system of stepwise equations, where the vectors $x$ and $y$ have been interchanged. Hence the lemma follows.
Proposition 3.1. If the situation $(V_x^S, U_y^S)$ is the Slater saddle point of (3.1), so is the situation $(U^S, V^S)$.
Proof. Indeed, for the Slater saddle point $(U^S, V^S)$, we have, by Definition 1.1 in Chapter 5,
$$F(x[\theta, t_0, x_0, U], y[\theta, t_0, x_0, V^S]) \not< F(x[\theta, t_0, x_0, U^S], y[\theta, t_0, x_0, V^S]) \not< F(x[\theta, t_0, x_0, U^S], y[\theta, t_0, x_0, V])$$
for any $U \in \mathcal U_M$ and $V \in \mathcal V_M$ and quasimotions $x[\cdot, t_0, x_0, U]$, $y[\cdot, t_0, x_0, V]$. In light of (3.6), these relations are equivalent to
$$-F(y[\theta, t_0, x_0, V^S], x[\theta, t_0, x_0, U]) \not< -F(y[\theta, t_0, x_0, V^S], x[\theta, t_0, x_0, U^S]) \not< -F(y[\theta, t_0, x_0, V], x[\theta, t_0, x_0, U^S])$$
or
$$F(y[\theta, t_0, x_0, V^S], x[\theta, t_0, x_0, U]) \not> F(y[\theta, t_0, x_0, V^S], x[\theta, t_0, x_0, U^S]) \not> F(y[\theta, t_0, x_0, V], x[\theta, t_0, x_0, U^S])$$
for any $U \in \mathcal U_M$, $V \in \mathcal V_M$, and quasimotions $x[\cdot, t_0, x_0, U]$, $y[\cdot, t_0, x_0, V]$. By (3.5) and Lemma 3.1, the latter relations may be represented as
$$F(x[\theta, t_0, x_0, V_x^S], y[\theta, t_0, x_0, U_y]) \not> F(x[\theta, t_0, x_0, V_x^S], y[\theta, t_0, x_0, U_y^S]) \not> F(x[\theta, t_0, x_0, V_x], y[\theta, t_0, x_0, U_y^S])$$
for any strategies $U \in \mathcal U_M$ and $V \in \mathcal V_M$ and quasimotions $x[\cdot, t_0, x_0, V_x^S]$, $y[\cdot, t_0, x_0, U_y]$, $x[\cdot, t_0, x_0, V_x]$, $y[\cdot, t_0, x_0, U_y^S]$, which proves Proposition 3.1.
In a similar way the following proposition is found to be true: for the situation $(U^K, V^K)$ to be the Pareto (with $K = P$), Geoffrion ($K = G$), or A-saddle point ($K = A$) of (3.1) it is necessary and sufficient that the situation $(V_x^K, U_y^K)$ be the saddle point of the corresponding kind.
Remark 3.2. By Theorem 2.1 in Chapter 5 the set of A-saddle points is nonempty in the differential game (3.1, Chapter 4) and, consequently, by Property 1.1 from Chapter 5, there exist Geoffrion, Pareto, and Slater saddle points. But in a static (nondifferential) game the mirror nature of the game is not sufficient for the existence of vector saddle points, which in the case $N = 1$ (scalar payoff function) coincide with the ordinary saddle point. Let us demonstrate this case for a matrix zero-sum game with scalar payoff function.
Example 3.1. Both players in a static zero-sum game with scalar payoff function $\langle X, Y, F_1(x, y) \rangle$ have three strategies each:
$$X = (x^{(1)}, x^{(2)}, x^{(3)}), \qquad Y = (y^{(1)}, y^{(2)}, y^{(3)}).$$
Now let
$$a_{ij} = F_1(x^{(i)}, y^{(j)}), \quad i, j = 1, 2, 3.$$
If in this game the strategies of the first player are associated with the rows, and those of the second one with the columns of the matrix, we have a table (payoff matrix)
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.$$
The strategies of the first player can then be associated with the ordinal numbers of the appropriate rows of the matrix $A$ and those of the second one with the ordinal numbers of its columns. The game itself with a specified payoff matrix is usually represented as a collection
$$\Gamma_{(3.1)} = \langle \tilde X, \tilde Y, A \rangle,$$
where now $\tilde X = \{1, 2, 3\}$ and $\tilde Y = \{1, 2, 3\}$. Consequently, $\Gamma_{(3.1)}$ is a matrix game whose analysis is a central concern of general game theory. This game has a mirror payoff function if
$$a_{ij} = -a_{ji},$$
that is, the matrix $A$ must be skew-symmetric. Let us see that in $\Gamma_{(3.1)}$ with mirror payoff matrix
there exists no vector-valued saddle point, which for the matrix game $\Gamma_{(3.1)}$ coincides with the ordinary saddle point $(i_0, j_0) \in \tilde X \times \tilde Y$, that is,
$$\max_{j \in \tilde Y} \min_{i \in \tilde X} a_{ij} = \min_{i \in \tilde X} a_{i j_0} = \max_{j \in \tilde Y} a_{i_0 j} = \min_{i \in \tilde X} \max_{j \in \tilde Y} a_{ij}.$$
Indeed, for the game $\Gamma_{(3.1)}$ with $A = A^*$,
$$\max_j \min_i a_{ij} = -1 < +1 = \min_i \max_j a_{ij},$$
thus in $\Gamma_{(3.1)}$ with $A = A^*$ there is no Slater or, consequently, Pareto, Geoffrion, or A-saddle point. What is important is that the symmetry of the matrix $A$ is also not sufficient for the saddle point to exist. Thus, for a symmetric matrix
+1
-1
A = [+1 0
0 -1
-:I,
+1
it is true, as in the preceding case, that
$$\max_j \min_i a_{ij} = -1 < +1 = \min_i \max_j a_{ij}.$$
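The gap between the pure maximin and minimax in a matrix game can be computed directly. The sketch below is an illustration only: both matrices are hypothetical choices (the skew-symmetric one is the familiar rock-paper-scissors matrix), not the matrices of Example 3.1. Rows are indexed by the first player's strategy $i$ and columns by the second player's strategy $j$, as in the example.

```python
def maximin(a):
    """max over columns j of min over rows i (the second player picks the column)."""
    return max(min(row[j] for row in a) for j in range(len(a[0])))

def minimax(a):
    """min over rows i of max over columns j (the first player picks the row)."""
    return min(max(row) for row in a)

def is_skew_symmetric(a):
    """Mirror payoff in the matrix case: a_ij = -a_ji."""
    n = len(a)
    return all(a[i][j] == -a[j][i] for i in range(n) for j in range(n))

# Hypothetical skew-symmetric (mirror) matrix: rock-paper-scissors.
skew = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]
# Hypothetical symmetric matrix with the same gap.
sym = [[1, 0, -1], [0, -1, 1], [-1, 1, 0]]

for name, a in (("skew", skew), ("sym", sym)):
    # A pure saddle point exists only when the two values coincide.
    print(name, maximin(a), minimax(a), maximin(a) == minimax(a))
```

In both cases the maximin is strictly below the minimax, so neither the mirror property nor symmetry alone guarantees a saddle point in pure strategies.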
3.2. Vector-Valued Maximin and Minimax of the Game (3.1)
By the way quasimotions are obtained (subsection 2.3, Chapter 1), every fixed strategy $V \in \mathcal V_M$ is associated, at the time $t = \theta$ when the game ends, with the set
$$Z[V] = X \times Y[V],$$
where $X = \mathscr X[t_0, x_0, U \div Q] \cap \{t = \theta\}$ is the reachability domain of the first subsystem in (3.2), and $Y[V] = \mathscr Y[t_0, x_0, V] \cap \{t = \theta\}$ is the intersection of the bundle of quasimotions of the second subsystem in (3.2), generated by the strategy $V$ from the initial position $(t_0, x_0)$, with the hyperplane $t = \theta$.
Similarly, every strategy $U \in \mathcal U_M$ of the first player is associated with the set
$$Z[U] = X[U] \times Y,$$
where
$$Y = \mathscr Y[t_0, x_0, V \div Q] \cap \{t = \theta\}, \qquad X[U] = \mathscr X[t_0, x_0, U] \cap \{t = \theta\}.$$
Now, let us consider two multicriterial static games,
Lemma 3.2. For every strategy $V \in \mathcal V_M$, the set $Z_{(S)}[V]$ coincides with $Z^{(S)}[V_x]$ and, for any $U \in \mathcal U_M$, $Z^{(S)}[U] = Z_{(S)}[U_y]$.
Proof. Let $V$ be some strategy from the set $\mathcal V_M$. If $z_S[V] = (x_S[V], y_S[V]) \in Z_{(S)}[V]$, then, for every $x \in X$ and $y[V] \in Y[V]$,
$$F(z_S[V]) = F(x_S[V], y_S[V]) \not> F(x, y[V])$$
or, recalling (3.6),
$$-F(y_S[V], x_S[V]) \not> -F(y[V], x).$$
Therefore, since $Y[V] = X[V_x]$ (Lemma 3.1) and $X = Y$ (see (3.5)),
$$F(y_S[V], x_S[V]) \not< F(x[V_x], y) \tag{3.8}$$
for every $x[V_x] \in X[V_x]$ and $y \in Y$. But, by the form of the system (3.2), for the strategy $V_x \in \mathcal U_M$ there exist quasimotions $x[\cdot, t_0, x_0, V_x]$ and $y[\cdot, t_0, x_0, V_x]$ such that
$$x^S[V_x] = x[\theta, t_0, x_0, V_x] = y_S[V], \qquad y^S[V_x] = y[\theta, t_0, x_0, V_x] = x_S[V].$$
Therefore (3.8) means that
$$F(x^S[V_x], y^S[V_x]) \not< F(x[V_x], y), \quad \forall y \in Y, \ x[V_x] \in X[V_x],$$
or
$$(y_S[V], x_S[V]) = (x^S[V_x], y^S[V_x]) \in Z^{(S)}[V_x].$$
Consequently, the following inclusion is true:
$$Z_{(S)}[V] \subset Z^{(S)}[V_x].$$
The converse inclusion may be proved in a similar way. This proves the lemma.
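Proposition 3.2. If the strategy $V^{(S)} \in \mathcal V_M$ is Slater-maximin in (3.1) with point $(\bar x_S[V^{(S)}], \bar y_S[V^{(S)}])$, then $V_x^{(S)}$ is the Slater-minimax strategy in this game, the associated point is $(\bar x^S[V_x^{(S)}], \bar y^S[V_x^{(S)}]) = (\bar y_S[V^{(S)}], \bar x_S[V^{(S)}])$, and the associated Slater maximin $F(\bar x_S[V^{(S)}], \bar y_S[V^{(S)}]) = -F(\bar x^S[V_x^{(S)}], \bar y^S[V_x^{(S)}])$, where $F(\bar x^S[V_x^{(S)}], \bar y^S[V_x^{(S)}])$ is the Slater minimax. Conversely, if the strategy $U^{(S)} \in \mathcal U_M$ of the first player is Slater-minimax in the game (3.1) with point $(\bar x^S[U^{(S)}], \bar y^S[U^{(S)}])$, then $U_y^{(S)}$ is Slater-maximin, while $(\bar x_S[U_y^{(S)}], \bar y_S[U_y^{(S)}]) = (\bar y^S[U^{(S)}], \bar x^S[U^{(S)}])$ and $F(\bar x^S[U^{(S)}], \bar y^S[U^{(S)}]) = -F(\bar x_S[U_y^{(S)}], \bar y_S[U_y^{(S)}])$.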
Proof. If Vcs)is the Slater-maximin strategy of (3.1) from Chapter 4, then, by Definition 1.1 from Chapter 6, there exists a point ( i S [ V ( ' ) ] , j s [ V c s ) ] ) ~ Z ( s , [ V such ( S ) ] that ~(%cV's)19h C ~ ' " ' 3 ) 4:
%m7 Yl S,C Y I )
for every V € V M , ( x s [ V ] ,y s [ V ] ) ~ Z ( S ) [ VBecause ]. F(x, y ) is mirror (3.6),
~ ( ~ s c ~ Mc V” ( ls ’~l )4 V for any V E VM and Z,,, [ V ] = 2”’ [ V,], thus
Y S c n XSCVI)
( x s [ V ] ,y s [ V ] )Z,,[V]. ~
By
(3.9) Lemma
3.2,
(XSCVI, Y S C V I ) = (YSCV,I), XSCVxI).
Among the quasimotions ( x [ ., to,xo, Cs],y[ . ,to,xo, V:]) of the system (3.2) there are those for which P[V,CS’]= x y o , to, xo, V,’”] = js[V‘”], j y v 3 = y y e , to,xo, V 3 = f,[V‘S’]
and (Y[I/X(S’],
because Z,,,[ V”)] the form
=
j”Cv,c”])EZ(S)[V:/x(”’]
Z”’[ VL”]. Consequently, (3.9) may be represented in
w[V,‘S’l, j”P’’3)
4 F ( X S C V 1 ? YSCV,l)
for every V . E @ ~and (xs[Vx],y s [ V , ] ) E Z ( ~ ) [ V , The ] . latter relation implies that VLS)is the minimax strategy of (3.1) and that the associated point is
(-;s[V,‘q
j”cV,cS’])
= (jS[V(S’], 2s[V‘s’]).
The latter part of the proposition is proved in a similar way.
Proposition 3.3. In the differential game (3.1) with mirror payoff function, the set $\mathscr F^{(S)}$ of all Slater maximins is symmetric, with respect to the point $F = 0_N \in \mathbb R^N$, to the set $\mathscr F_{(S)}$ of all its Slater minimaxes.
Proof. By (3.6) and Proposition 3.2, the following chain of equations holds:
$$\max^{S}_{V \in \mathcal V_M} \min^{S}_{x[\cdot], y[\cdot]} F(x[\theta, t_0, x_0, V], y[\theta, t_0, x_0, V]) = F(\bar x_S[V^{(S)}], \bar y_S[V^{(S)}]) = -F(\bar x^S[V_x^{(S)}], \bar y^S[V_x^{(S)}]) = -\min^{S}_{U \in \mathcal U_M} \max^{S}_{x[\cdot], y[\cdot]} F(x[\theta, t_0, x_0, U], y[\theta, t_0, x_0, U]),$$
that is, for every Slater maximin there exists a minimax that is symmetric with respect to the point $F = 0_N$. Analogously, for any Slater-minimax strategy of (3.1),
$$\min^{S}_{U \in \mathcal U_M} \max^{S}_{x[\cdot], y[\cdot]} F(x[\theta, t_0, x_0, U], y[\theta, t_0, x_0, U]) = -\max^{S}_{V \in \mathcal V_M} \min^{S}_{x[\cdot], y[\cdot]} F(x[\theta, t_0, x_0, V], y[\theta, t_0, x_0, V]),$$
that is, every Slater minimax is associated with a maximin symmetric with respect to $F = 0_N$, whence follows Proposition 3.3.
Finally, the next proposition may be proved in a similar way.
Proposition 3.4. In (3.1) the set of all vector-valued maximins $\mathscr F^{(K)}$ is symmetric (with respect to the point $F = 0_N$) to the set of minimaxes $\mathscr F_{(K)}$ ($K = P, G, A$).
In other words, the following sets are symmetric (with respect to $F = 0_N$): (1) the sets of Pareto maximins and minimaxes; (2) the sets of Geoffrion maximins and minimaxes; (3) the sets of A-maximins and A-minimaxes.
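The symmetry stated in Proposition 3.3 has a simple finite, static analogue that can be checked by enumeration. The sketch below rests on assumptions that simplify the text's setting: quasimotion bundles are replaced by finite strategy sets with $X = Y$, the set of Slater maximins is read as the Slater-maximal part of the union over $y$ of the Slater-minimal values of $F(\cdot, y)$, and the payoff `payoff` is a hypothetical mirror function.

```python
def strictly_less(a, b):
    """a < b in every component (the Slater comparison)."""
    return all(x < y for x, y in zip(a, b))

def slater_min(vals):
    return {v for v in vals if not any(strictly_less(w, v) for w in vals)}

def slater_max(vals):
    return {v for v in vals if not any(strictly_less(v, w) for w in vals)}

def maximins(payoff, X, Y):
    """Slater-maximal part of the union of inner Slater minima over x."""
    inner = set()
    for y in Y:
        inner |= slater_min({payoff(x, y) for x in X})
    return slater_max(inner)

def minimaxes(payoff, X, Y):
    """Slater-minimal part of the union of inner Slater maxima over y."""
    inner = set()
    for x in X:
        inner |= slater_max({payoff(x, y) for y in Y})
    return slater_min(inner)

def payoff(x, y):
    """Hypothetical mirror payoff: payoff(x, y) == -payoff(y, x) componentwise."""
    return (x[0] - y[0] + x[0] * y[1] - y[0] * x[1], x[1] - y[1])

S = [(0, 0), (1, 0), (0, 1), (2, 1), (1, 2)]  # common strategy set, X = Y
reflected = {tuple(-c for c in v) for v in maximins(payoff, S, S)}
print(reflected == minimaxes(payoff, S, S))
```

The reflection through the origin maps one set onto the other because the mirror property turns every inner Slater maximum into the negative of an inner Slater minimum.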
3.3. Obtaining ZS-Solutions in the Competition Problem
Now let us prove that the procedure leading to a ZS-solution (subsection 1.5) is valid. We consider a multicriterial dynamic problem
$$\Gamma_y = \langle \Sigma \div (1.3), \mathcal V, F(y[\theta]) \rangle,$$
where the system $\Sigma$ is described by equation (1.3) with initial position $(t_0, x_0)$; the set of strategies is
$$\mathcal V = \{V \div v(t, y) \mid v(t, y) \in Q\},$$
and the vector criterion $F(y[\theta])$ is defined on the reachability domain $Y$ of the system (1.3).
Proposition 3.5. If $V^S \div v^S(t, y)$ is the Slater-maximal strategy of problem $\Gamma_y$, the situation $(V_x^S, V^S)$, where $V_x^S \div v^S(t, x)$, is the ZS-solution of the competition problem (1.4), while the associated point, see Definition 1.1 (2), $(\bar x[\theta, t_0, x_0, V_x^S], \bar y[\theta, t_0, x_0, V^S])$ is defined in such a way that $\bar x[\theta, t_0, x_0, V_x^S] = \bar y[\theta, t_0, x_0, V^S]$ and $\bar x[\cdot, t_0, x_0, V_x^S]$ is any quasimotion of the system (1.1) generated by the strategy $V_x^S$ from position $(t_0, x_0)$.
Proof. Let us show that the situation $(V_x^S, V^S)$ is the Slater saddle point of (1.4). Because $V^S \in \mathcal V^S$, the set of Slater-maximal strategies of problem $\Gamma_y$, by Proposition 2.1 it would be sufficient to show that the strategy $V_x^S \div v^S(t, x)$ is Slater-minimal in the problem
$$\langle \Sigma \div (1.1), \mathcal U, -F(x[\theta]) \rangle \tag{3.10}$$
or Slater-maximal in the problem
$$\Gamma_x = \langle \Sigma \div (1.1), \mathcal U, F(x[\theta]) \rangle.$$
But the problem $\Gamma_x$ coincides with $\Gamma_y$ when $x$ is replaced by $y$. Therefore, $V_x^S$ is the Slater-maximal strategy in $\Gamma_x$ and, consequently, Slater-minimal in problem (3.10).
Let us show now that, for any quasimotion
$$(\bar x[\cdot, t_0, x_0, V_x^S], \bar y[\cdot, t_0, x_0, V^S])$$
of the system (1.5) such that
$$\bar x[\theta] = x[\theta, t_0, x_0, V_x^S] = y[\theta, t_0, x_0, V^S] = \bar y[V^S],$$
the point
$$0_N = F(\bar y[V^S]) - F(\bar x[V_x^S]) \in \mathscr F^{(S)}[F(Y^S) - F(X_S)], \tag{3.11}$$
where $X_S$ and $Y^S$ are the sets of Slater-maximal solutions of the associated problems $\langle X, F(x) \rangle$ and $\langle Y, F(y) \rangle$; since $X = Y$,
$$X_S = Y^S. \tag{3.12}$$
That is, the Slater saddle point $(V_x^S, V^S)$ is such that the value of the payoff function attained on it is a Slater maximum of the set $F(Y^S) - F(X_S)$. Indeed,
$$\bar y[V^S] = y[\theta, t_0, x_0, V^S] \in Y^S, \qquad \bar x[V_x^S] = x[\theta, t_0, x_0, V_x^S] \in X_S,$$
by Proposition 3.1 in Chapter 2 on the structure of solutions to dynamic multicriterial problems. Therefore,
$$F(\bar y[V^S]) - F(\bar x[V_x^S]) \in F(Y^S) - F(X_S).$$
Now (3.11) will be proved by contradiction. Assume that there exists a saddle point $(x^S, y^S)$ of the game
$$\langle X, Y, F(y) - F(x) \rangle$$
such that
$$F(y^S) - F(x^S) > F(\bar y[V^S]) - F(\bar x[V_x^S]) = 0_N.$$
Hence
$$F(y^S) > F(x^S). \tag{3.13}$$
However, by Lemma 2.1, $y^S \in Y^S$ and $x^S \in X_S$. Then inequality (3.13) contradicts (3.12) and the internal stability of the Slater-maximal solutions of the static multicriterial problem $\langle Y, F(y) \rangle$. The resultant contradiction proves (3.11). The inclusion
$$F(\bar y[V^S]) - F(\bar x[V_x^S]) \in \mathscr F_{(S)}[F(Y^S) - F(X_S)]$$
is proved in a similar way. From this expression, (3.11), and Proposition 2.4, it follows that there exist a Slater maximin and a Slater minimax equal to $F(\bar y[V^S]) - F(\bar x[V_x^S])$, which proves Proposition 3.5.
Analogously, the ZS-solution of (1.4) is the situation $(U^S, U_y^S)$, where the strategy $U^S$ is Slater-maximal in the multicriterial problem $\langle \Sigma \div (1.1), \mathcal U, F(x[\theta]) \rangle$, while the associated point $(\bar x[\theta, t_0, x_0, U^S], \bar y[\theta, t_0, x_0, U_y^S])$ is such that $\bar x[\theta, t_0, x_0, U^S] = \bar y[\theta, t_0, x_0, U_y^S]$. This is exactly the proposition used in subsection 1.5 to obtain the ZS-solution of the competition problem (1.4).
Remark 3.2. A similar procedure for obtaining a vector-valued saddle point, where the associated vector-valued maximin is equal to the minimax, can also be formulated in the cases of the Pareto, Geoffrion, and A-optimum.
Thus, in the case of the Pareto optimum (for the ZP-solution) the procedure is as follows:
(1) Find the Pareto-maximal strategy $U^P \in \mathcal U^P$, $U^P \div u^P(t, x)$, of the multicriterial problem $\Gamma_x$ and, simultaneously, obtain at least one quasimotion $\bar x[\cdot, t_0, x_0, U^P]$ for the system (1.1);
(2) The situation $(U^P, U_y^P)$, $U_y^P \div u^P(t, y)$, is the Pareto saddle point such that for $\bar y[\theta, t_0, x_0, U_y^P] = \bar x[\theta, t_0, x_0, U^P]$,
$$F(\bar y[\theta, t_0, x_0, U_y^P]) - F(\bar x[\theta, t_0, x_0, U^P]) = \max^{P}_{V \in \mathcal V} \min^{P}_{x[\cdot], y[\cdot]} [F(y[\theta, t_0, x_0, V]) - F(x[\theta, t_0, x_0, V])] = \min^{P}_{U \in \mathcal U} \max^{P}_{x[\cdot], y[\cdot]} [F(y[\theta, t_0, x_0, U]) - F(x[\theta, t_0, x_0, U])],$$
that is, $(U^P, U_y^P)$ is the ZP-solution of the competition problem (1.1).
This procedure is implemented in the next section in constructing a mathematical model of an identical research problem that is solved by two competing teams.
4. Model of Competing Research Activities
Sets of Pareto minimaxes and maximins are found and a ZP-solution is obtained in the case of the Pareto optimum.
4.1. Single-Firm Model
The model is constructed along the lines of [77], where only the Nash equilibrium was investigated. Two teams are engaged in identical research. The time $\theta > 0$ allotted for this research is specified in advance. If neither team is successful within this time, the research ceases, since by that time the problem is no longer relevant. Let $t_i$ denote a random time $t_i \in (0, \theta]$ at which the $i$-th ($i = 1, 2$) team scores a success. If $u_i(t)$ is the rate at which the $i$-th team acquires (at time $t \in [0, \theta]$) the knowledge needed to solve the problem, the knowledge acquired by the $i$-th team increases according to the equation
$$\dot z_i = u_i(t), \quad z_i(0) = 0, \quad i = 1, 2, \tag{4.1}$$
where the scalar functions $u_i(t)$ are Borel-measurable and $u_i(t) \in [0, a_i]$ at every $t \in [0, \theta)$; $a_i$ is a specified positive constant. The sets of such functions $u_i(\cdot) : [0, \theta) \to [0, a_i]$ will be denoted $\mathcal U_i$, $i = 1, 2$. The initial conditions $z_i(0) = 0$, $i = 1, 2$, signify that at the initial time $t = 0$ neither team has any insight into the problem.
The probability that the $i$-th team will solve the problem if the amount of knowledge about it is $z_i$ is described [77] by the formula $F_i(z_i) = 1 - \exp[-\lambda z_i]$. From this distribution law it follows, in particular, that
$$P\{t_i \in (t, t + dt) \mid t_i > t\} = \lambda u_i(t)\,dt, \quad t \in [0, \theta],$$
or, if by time $t$ the $i$-th team has not solved the problem, the probability of a success over the nearest time span of $dt$ is directly proportional to $u_i(t)$, while the probability of failure by time $t$ is $\exp[-\lambda z_i(t)]$. If the value of the patent by the current time is a constant $L$, the mean income from securing a patent by the $i$-th team is described by the functional
$$J_i^{(1)}(u_i) = \lambda L \int_0^\theta u_i(t) \exp[-\lambda z_i(t)]\,dt. \tag{4.2}$$
Let us now proceed to estimating the costs of the $i$-th team in the course of the research activities. The cost of obtaining additional knowledge at time $t$ is estimated [77] as $0.5 u_i^2(t)$. Therefore the mean payment over the entire time span $\theta$ is given by the functional
$$\frac{1}{2} \int_0^\theta u_i^2(t) \exp[-rt] \exp[-\lambda z_i(t)]\,dt,$$
where the rate of discounting $r$ is assumed to be a specified positive constant. Then the second criterion of the team's performance is the functional
$$J_i^{(2)}(u_i) = -\frac{1}{2} \int_0^\theta u_i^2(t) \exp[-rt - \lambda z_i(t)]\,dt. \tag{4.3}$$
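For a constant control the two criteria can be evaluated both numerically and in closed form, which gives a quick consistency check. The sketch below is an illustration under assumptions: the control is $u(t) = c$ (so $z(t) = ct$), the constant in the success law is written `lam`, the parameter values in the usage are hypothetical, and the function names are introduced here only for the example.

```python
import math

def criteria_numeric(c, lam, L, r, theta, n=2000):
    """Trapezoidal approximation of J1 (income) and J2 (minus cost) for u(t) = c."""
    h = theta / n

    def income(t):
        return lam * L * c * math.exp(-lam * c * t)

    def cost(t):
        return 0.5 * c * c * math.exp(-r * t - lam * c * t)

    def trapz(f):
        inner = sum(f(k * h) for k in range(1, n))
        return h * (0.5 * f(0.0) + inner + 0.5 * f(theta))

    return trapz(income), -trapz(cost)

def criteria_closed_form(c, lam, L, r, theta):
    """Exact integrals for u(t) = c, valid for r + lam * c > 0."""
    k = r + lam * c
    j1 = L * (1.0 - math.exp(-lam * c * theta))
    j2 = -0.5 * c * c * (1.0 - math.exp(-k * theta)) / k
    return j1, j2

# Hypothetical parameter values, chosen only for illustration.
print(criteria_numeric(1.0, 1.0, 1.0, 0.1, 1.0))
print(criteria_closed_form(1.0, 1.0, 1.0, 0.1, 1.0))
```

The income criterion grows with the research rate while the second criterion becomes more negative, which is the trade-off the two-criterial problem formalizes.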
If it is required to analyze the optimal behavior of the $i$-th team alone, regardless of the other team, the mathematical model could be the two-criterial dynamic problem
$$\langle \dot z_i = u_i, \ z_i(0) = 0; \ \mathcal U_i; \ \{J_i^{(1)}(u_i), J_i^{(2)}(u_i)\} \rangle, \tag{4.4}$$
where the functionals $J_i^{(1)}(u_i)$ and $J_i^{(2)}(u_i)$ are described in (4.2) and (4.3), respectively. In this problem, however, the $i$-th team generates the control $u_i(\cdot) \in \mathcal U_i$ in order to maximize both criteria $J_i^{(1)}(u_i)$ and $J_i^{(2)}(u_i)$. This signifies that the $i$-th team tries to minimize the cost $-J_i^{(2)}(u_i)$ and, simultaneously, maximize the income $J_i^{(1)}(u_i)$. In terms of the theory of multicriterial problems the solution of (4.4) may be the Pareto-maximal control $u_i^P(\cdot) \in \mathcal U_i$, that is, for any $u_i(\cdot) \in \mathcal U_i$ the following system of inequalities is inconsistent:
$$J_i^{(1)}(u_i) \ge J_i^{(1)}(u_i^P), \qquad J_i^{(2)}(u_i) \ge J_i^{(2)}(u_i^P),$$
at least one of which is strict; now let $\mathcal U_i^P = \{u_i^P\}$. Note that the Pareto-minimal solution $u_i^*(\cdot) \in \mathcal U_i$ of (4.4) is defined by the inconsistency of the system of inequalities
$$J_i^{(1)}(u_i) \le J_i^{(1)}(u_i^*), \qquad J_i^{(2)}(u_i) \le J_i^{(2)}(u_i^*),$$
at least one of which, as in the preceding definition, is a strict inequality. Consequently, in the case of seeking the optimal solution for one team, disregarding the other team, it is sufficient to find, and then use, the Pareto-optimal control for problem (4.4). This, however, is insufficient in the case of two competing teams.
4.2. Game-theoretic Model of Competition
In the competition the $i$-th team tries to do better than the other team in terms of both criteria or at least in terms of one criterion without having the other team's chances. For the first team this means trying to choose the appropriate $u_1(\cdot) \in \mathcal U_1$ so as to increase both components of the vector
$$I(u_1, u_2) = (I_1(u_1, u_2), I_2(u_1, u_2)), \tag{4.5}$$
where
$$I_i(u_1, u_2) = J_1^{(i)}(u_1) - J_2^{(i)}(u_2), \quad i = 1, 2. \tag{4.6}$$
The objective of the other team is just the opposite, i.e., to find its own control $u_2(\cdot) \in \mathcal U_2$ in order to reduce the components of the vector $I(u_1, u_2)$, which, by virtue of the fact that $\min[-J] = -\max J$, implies increasing the differences
$$J_2^{(1)}(u_2) - J_1^{(1)}(u_1) = -I_1(u_1, u_2), \qquad J_2^{(2)}(u_2) - J_1^{(2)}(u_1) = -I_2(u_1, u_2).$$
The process dynamics are described by the system
$$\dot z_i = u_i, \quad z_i(0) = 0, \quad i = 1, 2,$$
with constraints $u_i(\cdot) \in \mathcal U_i$.
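The payoff vector built from componentwise differences of the two teams' criteria is a mirror function in the sense of Section 3 whenever both teams share the same functionals. The sketch below illustrates this for constant controls only; the closed forms assume $u_i(t) = c_i$, the constant in the success law is written `lam`, and all default parameter values are hypothetical.

```python
import math

def criteria(c, lam=1.0, L=1.0, r=0.1, theta=1.0):
    """Closed forms of the two criteria for the constant control u(t) = c >= 0."""
    k = r + lam * c
    j1 = L * (1.0 - math.exp(-lam * c * theta))
    j2 = -0.5 * c * c * (1.0 - math.exp(-k * theta)) / k
    return j1, j2

def payoff(c1, c2):
    """Vector I(u1, u2): first team's criteria minus the second team's, componentwise."""
    first, second = criteria(c1), criteria(c2)
    return tuple(a - b for a, b in zip(first, second))

# Swapping the teams flips the sign of every component.
print(payoff(1.0, 0.5), payoff(0.5, 1.0))
```

With equal controls the payoff is the zero vector, matching property (3.7) of mirror functions.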
Then the mathematical competition model may be represented as a differential zero-sum game with two-component payoff function
$$\langle \Sigma, \mathcal U_1, \mathcal U_2, I(u_1, u_2) \rangle, \tag{4.7}$$
where $\mathcal U_i$ is the set of strategies of the $i$-th player (subsection 4.1); the control system $\Sigma$ is described by the ordinary differential equations (4.1); the vector-valued payoff function $I(u_1, u_2)$ is specified in (4.5), (4.6).
Now let us proceed to the notion of a solution to (4.7). The situation $(u_1^P(\cdot), u_2^P(\cdot)) \in \mathcal U_1 \times \mathcal U_2$ will be referred to as the Pareto saddle point of the game (4.7) if
$$I(u_1, u_2^P) \ngeq I(u_1^P, u_2^P), \quad \forall u_1(\cdot) \in \mathcal U_1,$$
and
$$I(u_1^P, u_2^P) \ngeq I(u_1^P, u_2), \quad \forall u_2(\cdot) \in \mathcal U_2.$$
The set of all Pareto saddle points is denoted $\mathcal P$.
The solution of (4.7) for the first player, the Pareto-maximin strategy $u_1^{(P)}(\cdot)$, is defined in the following way. A new class of strategies, called the counter-strategies of the second player, will be needed. The second player's counter-strategy is associated with the Borel-measurable (over $t$, $u_1$) functions $u_2(t, u_1)$ such that $u_2(t, u_1) \in [0, a_2]$ for every $(t, u_1) \in [0, \theta) \times [0, a_1]$. The set of such counter-strategies will be denoted $\mathcal U_2(u_1)$, the inclusion $\mathcal U_2 \subset \mathcal U_2(u_1)$ being obvious. In a similar way the set $\mathcal U_1(u_2)$ of counter-strategies $u_1(t, u_2) : [0, \theta) \times [0, a_2] \to [0, a_1]$ of the first player is introduced.
The strategy $u_1^P(\cdot) \in \mathcal U_1$ of the first player will be referred to as the Pareto maximin in the game (4.7) if
(1) for every strategy $u_1(\cdot) \in \mathcal U_1$ of the first player there exists a counter-strategy $\tilde u_2(\cdot) = \tilde u_2(\cdot, u_1(\cdot)) \in \mathcal U_2(u_1)$ of the second player such that
$$I(u_1, \tilde u_2) \ngeq I(u_1, u_2), \quad \forall u_2(\cdot) \in \mathcal U_2(u_1);$$
the set of such counter-strategies $\tilde u_2(\cdot)$ will be denoted $\tilde{\mathcal U}_2(u_1)$;
(2) there exists a counter-strategy $\tilde u_2^P = \tilde u_2^P(\cdot, u_1^P(\cdot)) \in \tilde{\mathcal U}_2(u_1^P)$ for which
$$I(u_1^P, \tilde u_2^P) \nleq I(u_1, \tilde u_2), \quad \forall u_1(\cdot) \in \mathcal U_1, \ \tilde u_2(\cdot) \in \tilde{\mathcal U}_2(u_1).$$
The constant vector $I(u_1^P, \tilde u_2^P)$ will be referred to as the Pareto maximin of (4.7) and denoted
$$I(u_1^P, \tilde u_2^P) = \max^{P}_{u_1(\cdot) \in \mathcal U_1} \min^{P}_{u_2(\cdot) \in \mathcal U_2(u_1)} I(u_1, u_2).$$
7. Tbe Competition Problem
The notion of a Pareto-minimax strategy u;( .) of the second player (using the set 421(uz) of counter-strategies of the first player) and of the Pareto minimax of the game (4.7) are introduced in the same way:
I(;:, u r ) = minP u U2(. ) E %
maxP I(ul, uz). UI(.)E%(UZ)
Definition 4.1. The solution of (4.7) is the situation (u:, u ? ) which (1) is the Pareto saddle point in this game;
(2) the strategy u:( .) is the Pareto-maximin and u t ( .), the Pareto-minimax of (4.7); (3) the value of the two-component payoff function I(ul, u z ) in the saddle point (u:, u t ) coincides with the Pareto minimax and maximin, that is, I(u:, u t ) = maxP u Ul(. ) E
-
8,
minP I(ul, u z ) = u 2 ( .) E . W , ( U , )
minP I(u:, u z ) U2(.)E%(U:)
maxP I(ul, u t ) = minP u maxP I(ul, uz). %(4 U2( , ) E @ 2 Ul(. )E4 m 2 )
U l ( . )E
This solution (u:, u t ) is internally stable over the entire set of Pareto saddle points of (4.7), specifically
w,
UZ)
a I ( U 7 , 4),w , ur) $ 1(u7, 4)
for any Pareto saddle points
( U ~ ( - ) , U ~ ( * ) )of E ~the
game (4.7).
The solution of (4.7) introduced by Definition 4.1 is the ZP-solution introduced in Definition 2.1 of Chapter 6. In the last subsection of this section the analytic form of this solution will be found.
4.3. Pareto-Optimal Controls
That the following method of obtaining a Pareto-maximal control is valid is proved in [102]. Specifically, it is required to find the optimal control $u_i^0(\cdot) \in \mathcal U_i$ in the problem
$$J_i(u_i) = \beta J_i^{(1)}(u_i) + (1 - \beta) J_i^{(2)}(u_i) \to \max_{u_i(\cdot) \in \mathcal U_i} \tag{4.8}$$
with constraints
$$\dot z_i = u_i, \quad z_i(0) = 0 \tag{4.9}$$
and at least one constant $\beta \in (0, 1)$. Then the control $u_i^0(\cdot) \in \mathcal U_i$ is Pareto-optimal in problem (4.4).
The solution of (4.8) and (4.9) will be obtained by the dynamic programming method combined with the method of Lyapunov functions [102]; for brevity of notation, the subscript $i$ will be omitted in the further discussion. The dynamic programming equation for (4.8) and (4.9) takes the form
$$\varphi(\theta, z) = 0, \quad \forall z \in \mathbb R. \tag{4.10}$$
In solving this equation, the constraint $u(\cdot) \in [0, a]$ will be neglected for now and its validity will be tested below. Since we have a quadratic function of $u$ in the braces of (4.10), the maximum in (4.10) is obtained with
(4.11)
Substituting $u^0$ found in (4.11) in the first equality of (4.10) gives us the following partial differential equation for obtaining the function $\varphi(t, z)$:
(4.12)
The solution of (4.12) is sought in the form
$$\varphi(t, z) = h(t) e^{-\lambda z}. \tag{4.13}$$
Substituting (4.13) in (4.12), we find an ordinary differential equation for determining the function $h(t)$.
Its solution is (4.14)
7. The Competition Problem
Hence, in light of (4.13) and (4.1l),
' + 2(1 A2- B)r
(1 - B)[(BL)-
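A numerical check of the weighted problem (4.8), (4.9) is sketched below. It rests on assumptions: the closed form used for $u^0(t)$ is the one obtained by solving the dynamic programming equation with the substitution $\varphi = h(t)e^{-\lambda z}$ and ignoring the constraint $u \in [0, a]$ (a reconstruction, with `lam` standing for the constant in the success law), and the parameter values are hypothetical. The check compares $h(0)$ with $\beta J^{(1)}(u^0) + (1 - \beta) J^{(2)}(u^0)$ computed by integrating along $u^0$.

```python
import math

def g(t, beta, lam, L, r, theta):
    """g(t) = beta*L - h(t), assumed closed form with h(theta) = 0."""
    inv = 1.0 / (beta * L) + lam ** 2 * (math.exp(r * theta) - math.exp(r * t)) / (
        2.0 * (1.0 - beta) * r)
    return 1.0 / inv

def u0(t, beta, lam, L, r, theta):
    """Assumed unconstrained maximizer: depends on time only."""
    return lam * math.exp(r * t) * g(t, beta, lam, L, r, theta) / (1.0 - beta)

def weighted_payoff(beta, lam, L, r, theta, n=4000):
    """beta*J1 + (1-beta)*J2 along u0, with z and both integrals by the trapezoidal rule."""
    dt = theta / n
    z, j1, j2 = 0.0, 0.0, 0.0
    u_prev = u0(0.0, beta, lam, L, r, theta)
    f1_prev = lam * L * u_prev
    f2_prev = -0.5 * u_prev * u_prev
    for k in range(1, n + 1):
        t = k * dt
        u = u0(t, beta, lam, L, r, theta)
        z += 0.5 * dt * (u_prev + u)                      # z' = u
        f1 = lam * L * u * math.exp(-lam * z)             # income integrand
        f2 = -0.5 * u * u * math.exp(-r * t - lam * z)    # minus cost integrand
        j1 += 0.5 * dt * (f1_prev + f1)
        j2 += 0.5 * dt * (f2_prev + f2)
        u_prev, f1_prev, f2_prev = u, f1, f2
    return beta * j1 + (1.0 - beta) * j2

def h0(beta, lam, L, r, theta):
    """Value phi(0, 0) = h(0) predicted by the assumed closed form."""
    return beta * L - g(0.0, beta, lam, L, r, theta)

params = (0.5, 1.0, 1.0, 0.1, 1.0)  # beta, lam, L, r, theta (hypothetical)
print(weighted_payoff(*params), h0(*params))
```

Agreement of the two numbers supports the reconstructed formula; it says nothing about whether the bound $u \le a$ is met, which has to be checked separately for a given $a$.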
The resultant Pareto-maximal control uo(.) has certain properties. First, u o ( . ) is not explicitly dependent on the state variable, thus in problem (4.4) it would be sufficient to use only programmed, or exclusive time-dependent controls, with t time. Second, uo(t) > 0 at every t E [O, 131 for p = const E (0, l), which implies that it is desirable to conduct the research over the entire time span [O, 01. Third,
that is, uo(t) is bounded at every t E [0,19]. This requirement is obvious if an allowance for the research cost is to be made. Now, let us proceed to obtain in the criterion space {J1,J2}the set of values of the functionals (J,(u),J2(u)) that can be achieved on the set of all Pareto-maximal controls of problem (4.4). Since the set of controls @ is convex and the criteria J,(u), and J2(u))are strictly quasiconcave [70, p. 991, then, following [70, p. 1071, we have the entire set uo( .) of Pareto-maximal controls for problem (4.4)if the optimal control problem (4.8),(4.9) is solved (i.e., if uo( .) is found) as the parameter B varies from 0 to 1. From (4.13) and (4.14),
Hence (4.16) Simultaneously, by the dynamic programming method,
$$\varphi(0, 0) = \beta J^{(1)}(u^0) + (1 - \beta) J^{(2)}(u^0).$$
Denoting in (4.16)
4. Model of Competing Research Activities
we have
+
aL2cos +a~cos
= sin+
‘(O,’)
+
+
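With $\psi = \pi/2$ we have $\varphi(0, 0) = 0$, that is, the scalar product $(J^{(1)}(u^0), J^{(2)}(u^0)) \cdot (0, 1) \le 0 \Rightarrow J^{(2)}(u^0) \le 0$ for all $u^0(\cdot) \in \mathcal{U}^0$. With $\psi = 0$ we have $\varphi(0, 0) = L$, that is, $(J^{(1)}(u^0), J^{(2)}(u^0)) \cdot (1, 0) \le L \Rightarrow J^{(1)}(u^0) \le L$ for all $u^0(\cdot) \in \mathcal{U}^0$. In Fig. 7.4.1 the set $J(\mathcal{U}^0)$ of values of the criteria $(J^{(1)}(u), J^{(2)}(u))$ on the set $\mathcal{U}^0$ of Pareto-maximal controls $u^0(\cdot)$ is shown as a heavy line. Here also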
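$$J(\mathcal{U}) = \bigcup_{u(\cdot) \in \mathcal{U}} \bigl(J^{(1)}(u), J^{(2)}(u)\bigr), \qquad J(\mathcal{U}^0) = \bigcup_{u(\cdot) \in \mathcal{U}^0} \bigl(J^{(1)}(u), J^{(2)}(u)\bigr).$$

Figure 7.4.1.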
4.4. Decision Making in the Competition Model
In this subsection the results of the preceding subsection are shown to lead to the explicit form of a solution of the game (4.7) (Definition 4.1). For brevity of notation, we introduce two vectors
$$J_i(u_i) = \bigl(J_i^{(1)}(u_i), J_i^{(2)}(u_i)\bigr), \quad i = 1, 2,$$
and consider a two-criterial dynamic problem
$$\Gamma_1 = \langle \dot z_1 = u_1,\ z_1(0) = 0;\ \mathcal{U}_1;\ J_1(u_1) \rangle. \tag{4.17}$$
Let $\mathcal{U}_1^0 = \mathcal{U}^0$ be the set of Pareto-maximal controls of $\Gamma_1$ and
$$J_1(\mathcal{U}_1^0) = \bigcup_{u_1 \in \mathcal{U}_1^0} \bigl(J_1^{(1)}(u_1), J_1^{(2)}(u_1)\bigr)$$
the set of values of the vector-valued criterion $J_1(u_1)$ obtained on $\mathcal{U}_1^0$. Simultaneously with $\Gamma_1$ let us consider a two-criterial dynamic problem
$$\Gamma_2 = \langle \dot z_2 = u_2,\ z_2(0) = 0;\ \mathcal{U}_2;\ -J_2(u_2) \rangle,$$
where $-J_2(u_2) = (-J_2^{(1)}(u_2), -J_2^{(2)}(u_2))$; in this subsection it is assumed that
Thus the set of controls $\mathcal{U}_1$ of the first player coincides with the set of controls of the second player. Because the multicriterial problems $\Gamma_1$ and $\Gamma_2$ are of the same kind, the set $\mathcal{U}_1^0$ of Pareto-maximal controls of $\Gamma_1$ satisfies
$$\mathcal{U}_1^0 = \mathcal{U}_2^0,$$
where $\mathcal{U}_2^0$ is the set of Pareto-minimal controls of $\Gamma_2$ or, equivalently, the set of Pareto-maximal controls of the two-criterial problem
$$\langle \dot z_2 = u_2,\ z_2(0) = 0;\ \mathcal{U}_2;\ J_2(u_2) \rangle.$$
Recall that, by (4.15),
As in Lemma 2.1, we may establish the following proposition.
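Proposition 4.1. Any situations $(u_1^0(\cdot), u_2^0(\cdot)) \in \mathcal{U}_1^0 \times \mathcal{U}_2^0$, and only such situations, are Pareto saddle points of (4.7). The values of the vector-valued payoff function $I(u_1, u_2)$ obtained on the set $\mathcal{U}_1^0 \times \mathcal{U}_2^0$ of all Pareto saddle points are defined by the difference
$$I(\mathcal{U}_1^0, \mathcal{U}_2^0) = J_1(\mathcal{U}_1^0) - J_2(\mathcal{U}_2^0) = \bigcup_{u_1^0(\cdot) \in \mathcal{U}_1^0,\ u_2^0(\cdot) \in \mathcal{U}_2^0} \bigl[J_1(u_1^0) - J_2(u_2^0)\bigr]. \tag{4.18}$$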
Figure 7.4.2 shows the sets $J_1(\mathcal{U}_1^0)$ and $J_2(\mathcal{U}_2^0)$, and Fig. 7.4.3 the set
$$I(\mathcal{U}_1^0, \mathcal{U}_2^0) = J_1(\mathcal{U}_1^0) - J_2(\mathcal{U}_2^0).$$
Pareto minimaxes and maximins are obtained by solving the dynamic two-criterial problem
$$\langle \dot z_i = u_i,\ z_i(0) = 0;\ \mathcal{U}_1^0 \times \mathcal{U}_2^0;\ I(u_1, u_2) \rangle, \tag{4.19}$$
where the solutions $(u_1^0(\cdot), u_2^0(\cdot))$ are defined on the set $\mathcal{U}_1^0 \times \mathcal{U}_2^0$ of Pareto saddle points of the game (4.7).

Figure 7.4.2.
Proposition 4.2. The set of Pareto maximins of the game (4.7) coincides with the Pareto-maximal values of the vector-valued payoff function $I(u_1, u_2)$ in problem (4.19) obtainable on the set $\mathcal{U}_1^0 \times \mathcal{U}_2^0$ of all Pareto saddle points, or
where $J_1(\mathcal{U}_1^0) - J_2(\mathcal{U}_2^0)$ is defined in (4.18).
The set of Pareto minimaxes of (4.7) coincides with the Pareto-minimal values of the vector-valued payoff function $I(u_1, u_2)$ in problem (4.19) obtainable on the set of all Pareto saddle points, or
The set of Pareto minimaxes is symmetric (with respect to the point $0_2 \in \mathbb{R}^2$) to the set of Pareto maximins, and the set of Pareto-maximin (Pareto-minimax) strategies coincides with the set of Pareto-maximal (Pareto-minimal) controls of problem (4.19).
Figure 7.4.3.
Proofs may be obtained using the procedures applied to prove Propositions 2.4, .2, and 3.3, with the relation $>$ replaced by $\ge$. In Fig. 7.4.3 the set of Pareto maximins is shown as a double line and the set of Pareto minimaxes as a broken line. In the latter figure the point $I(0, 0)$ is seen to agree with the solution $(u_1^0(\cdot), u_2^0(\cdot))$ of the game (4.7) (Definition 4.1). There the Pareto minimax coincides with the maximin and with the value of the payoff function $I(u_1, u_2)$ at the Pareto saddle point. This solution is implemented, in particular, by any situation $(u_1^0(\cdot), u_2^0(\cdot))$ with $u_2^0(t) = u_1^0(t)$, $u_1^0(\cdot) \in \mathcal{U}_1^0$. Then

Proposition 4.3. Any situation $(u_1^0(\cdot), u_2^0(\cdot))$, where $u_2^0(t) = u_1^0(t)$ at $t \in [0, \vartheta)$ and $u_1^0(\cdot) \in \mathcal{U}_1^0$ (the set of Pareto-maximal controls of problem (4.17)), is the solution of (4.7) (Definition 4.1).
The proof is analogous to that of Proposition 3.5. From Proposition 4.3 we have an algorithm for obtaining a solution (Definition 4.1) in the model (4.7) of two research projects run simultaneously by two different teams.

Algorithm.

(1) Choose some Pareto-optimal control $u_1^0(\cdot)$ of (4.17) using the formula (4.20).
(2) The solution of (4.7) is represented by the situation $(u_1^0(\cdot), u_2^0(\cdot))$, where $u_2^0(t) = u_1^0(t)$ at $t \in [0, \vartheta)$.

The choice (at stage (1)) of the Pareto-maximal control $u_1^0(\cdot)$ is equivalent to "assigning" a specific constant $\beta \in (0, 1)$ and then using (4.20). Note that $\beta$ must also satisfy the constraint
Chapter 8
A Pursuit Game with Noise
In this chapter the problem of the pursuit of a moving object with some points fixed is discussed. Concerning some of these points only the domain of their possible location is known. The case of two or more target points is investigated. Minimax Pareto strategies and saddle points are plotted.
1. Statement of Problem
The mathematical model of a cooperative variety of the pursuit problem is described and the sources of noise (uncertainties) indicated.
1.1. Cooperative Pursuit Game with Negligible Noise
Assume that the change of the position $x[t]$ of the moving point $M$ in the state space $\mathbb{R}^n$ is described by the equation
$$\dot x = f(t, x, u), \tag{1.1}$$
where $x \in \mathbb{R}^n$; time $t \in [t_0, \vartheta]$; the constants $\vartheta > t_0 \ge 0$; the control action $u \in H \in \operatorname{comp} \mathbb{R}^h$; the function $f(t, x, u)$ satisfies Condition 1.1 in Chapter 1; and the initial position $x[t_0] = x_0$ of the point in $\mathbb{R}^n$ is given. The quasimotions $x[t, t_0, x_0, U]$, $t_0 \le t \le \vartheta$, generated by
$$U \in \mathcal{U} = \{U \div u(t, x) \mid u(t, x) \in H\}$$
are formalized by the tools of section 2.3 in Chapter 1. Assume that in $\mathbb{R}^n$ the goal points $M_i$, $i \in N$, are fixed.
It is necessary to choose a strategy $U \in \mathcal{U}$ such that, by the time $\vartheta$ at which the game ends, all the distances
$$\|x[\vartheta, t_0, x_0, U] - M_i\| = F_i(x[\vartheta, t_0, x_0, U]), \quad i \in N, \tag{1.2}$$
assume the lowest possible values. For different types of the system (1.1) and different numbers of points $M_i$, this problem, referred to as a cooperative pursuit game, has been analyzed in [66, pp. 105-122]. That work concentrated on the dynamic stability of numerous optimal solutions of the game, and solutions were formalized in terms of the theory of non-zero-sum static games without side payments (with nontransferable payoffs). Indeed,
$$\langle \Sigma \div (1.1),\ \mathcal{U},\ F(x[\vartheta]) \rangle, \tag{1.3}$$
where $F(x) = (F_1(x), \ldots, F_N(x))$, $F_i(x) = \|x - M_i\|$, is a multicriterial dynamic problem (Chapters 2-4), where $U \in \mathcal{U}$ must be selected so as to obtain the lowest possible values of all the components of the vector-valued criterion $F(x)$ simultaneously. The optimal solution of (1.3) can be a Slater, Pareto, Geoffrion, or A-minimal strategy. In the procedure of [66] for obtaining the set of Pareto minima of the problem (1.3), it is assumed that $M = \operatorname{conv}\{M_i, i \in N\}$ is the convex hull of the points $M_1, M_2, \ldots, M_N$. The set $M$ is an $r$-dimensional polyhedron and so is contained in some $r$-dimensional hyperplane of the $n$-dimensional Euclidean space $\mathbb{R}^n$ ($r \le n$) but not entirely in any $(r - 1)$-dimensional hyperplane; the boundary of $M$ consists of a finite number of $(r - 1)$-dimensional polyhedrons. The orthogonal projection $\pi_X M$ of the set $M$ on the reachability domain $X$ is the set
Theorem 1.1. If the reachability set $X$ of (1.1) is convex, then [66, p. 115] the set of all Pareto minima of (1.3) is $\pi_X M$, that is, the set of Pareto-minimal values of the vector-valued criterion $F(x[\vartheta]) = (F_1(x[\vartheta]), \ldots, F_N(x[\vartheta]))$ is attained on the orthogonal projection of the convex hull of the "target" points $M_1, \ldots, M_N$ on the reachability domain $X = X[\vartheta, t_0, x_0, U \div H]$ of the system (1.1).

Consider the meaning of (1.3) in problems of mechanics and economics.
In problems of the mechanics of control systems, the point $x[t, t_0, x_0, U]$ usually characterizes the position (at time $t \in [t_0, \vartheta]$) of the process $\Sigma$ in the state space. Then the optimal solution (the strategy $U^* \in \mathcal{U}$) of problem (1.3) assures the highest possible proximity, at the desired time $t = \vartheta$, to all the fixed points simultaneously. If in the system (1.1) the vector $u = (u_1, \ldots, u_N)$, where $u_i$ is the controlling action of the $i$-th player, who forms "his" strategy
$$U_i \in \mathcal{U}_i = \{U_i \div u_i(t, x) \mid u_i(t, x) \in H_i \in \operatorname{comp} \mathbb{R}^{h_i}\}, \quad i \in N,$$
in order to reach the lowest possible value of his own payoff function $F_i(x[\vartheta, t_0, x_0, U_1, \ldots, U_N])$, the problem (1.3) is a differential tug-of-war game. Here every $i$-th player chooses (and uses) his own strategy in order to drag the point $x[\vartheta, t_0, x_0, U_1, \ldots, U_N]$ as close as possible to his objective point $M_i$. Such games in the case of three players have been studied, for example, in [116, p. 42], where the Nash equilibrium situation was selected as the solution.

In problems of economics, the desirable final results have to be obtained under the existing constraints on material and technical resources. Making use of this principle, the future economic performance is estimated by the deviation of the planned indices $x = (x_1, \ldots, x_n)$ from certain setpoints $M_i = (m_1^{(i)}, \ldots, m_n^{(i)})$, $i \in N$. The existence of several setpoints can be traced to several causes. For instance, these setpoints may reflect differences in understanding the ideal production plan from diverse points of view (that of the economic unit, region, or nation) or from technological, social, ecological, or other points of view.

In all publications on this problem, noise and possible disturbances are neglected. The multicriterial nature of this problem is a much less serious difficulty than the recognition of noise (uncertainties). The approach below is the outcome of the analysis described in Chapters 5 and 6.

1.2. Recognition of Noise (Uncertainties) in Problem (1.3)
The first cause of noise (uncertainties) in the mathematical description of the convergence problem is the incompleteness of a priori data on the system state. In numerous control problems only the ranges over which certain system parameters or exogenous signals may assume any unpredictable values are known. In particular, a highway measured on a large-scale map where the
terrain is not shown can turn out to be much longer than expected. At the initial design stages the engineers have to deal with this range. Also, uncertainties may be caused by temperature fluctuations or variations of other atmospheric conditions. Most frequently, uncertainties result from incomplete information on the system environment, such as demand for the products, production indices, natural resources, etc. For such variables only ranges may be specified, e.g., by experts. In numerous cases the probabilistic characteristics remain highly questionable. Other reasons why uncertainties may come into the picture include:
(1) admissible errors in the readings of measuring devices;
(2) recognition of the power of wind or current (when a vehicle moves in the air or water);
(3) inaccurate constraints imposed on the control action.

The above uncertainties can, in most cases, be recognized by introducing a parameter $v$ (which characterizes the uncertainty) in the equation describing the system $\Sigma$. Then the time variation will be described by a system of differential equations of the type (1.1, Chapter 1) rather than (1.1). Thus in recognizing the disturbing forces we substitute
$$\dot x = f(t, x, u) + Bv$$
for (1.1), where $B$ is some constant $(n \times q)$-dimensional matrix; with inaccurate constraints on the control actions, $u$ in (1.1) can be replaced by $u + v$, thus
$$\dot x = f(t, x, u + v).$$
Note that these are special cases of equations (1.1) in Chapter 1.

Second, the criteria in (1.2) (performance indices) may also be specified inaccurately. For example, the points $M_i$ may be known only as lying within a sphere of radius $r_i$ with center at $A_i$, that is,
$$M_i \in W_{r_i}(a_i) = \{x \in \mathbb{R}^n \mid \|x - a_i\| \le r_i\}.$$
This uncertainty can be caused by fog, a smoke screen, or a cloud in which $M_i$ is found. In economic problems uncertainties are sometimes attributable to price variations. For instance, let $c_i$ be the price of the $i$-th product and $x_i[\vartheta]$ its quantity, produced by the deadline $\vartheta$. It is known in advance that the prices $(c_1, \ldots, c_n) = c$ are able to vary in a certain range, with the vector $c$ belonging to a given polyhedron $C \subset \mathbb{R}^n$. The production amount $x_i[\vartheta]$ of
the $i$-th type manufactured by time $\vartheta$ is the sum $c_i x_i[\vartheta] = F_i(x[\vartheta])$. Then every vector $x[\vartheta] = (x_1[\vartheta], \ldots, x_n[\vartheta])$ is associated with some set
[F~:(~[OI) = {qX[el)
= E~XCO] IE E
q
in the criterial space; the prime stands for transposition.
2. Pursuit of Two Target Points, One of Which with Uncertain Location

The location of one target point is known accurately, while for the other only the range of its possible locations is known. A static analog is associated with the differential pursuit game with two points. A geometrically illustrative way to solve the game is proposed.
2.1. Problem Statement and Essential Results
Assume that in the three-dimensional Euclidean space $\mathbb{R}^3$ the point $M_1$ is fixed (its location is characterized by the vector $a_1$) and that the other target point $M_2$ can assume any unpredictable location inside a sphere
$$W_r(a_2) = \{x \in \mathbb{R}^3 \mid \|x - a_2\| \le r\}$$
of radius $r$ and center at the point $A_2$, or inside the set $a_2 + Y$, where
$$Y = \{y \in \mathbb{R}^3 \mid \|y\| \le r\}. \tag{2.1}$$
The state $x[t]$, $t_0 \le t \le \vartheta$, of the moving point changes by the equation
$$\dot x = u, \tag{2.2}$$
where the vectors $x, u \in \mathbb{R}^3$, the constraint on the control action is
$$H = \{u \in \mathbb{R}^3 \mid \|u\| \le R = \text{const}\}; \tag{2.3}$$
the initial position $(t_0, x_0) = (0, 0_3) \in \mathbb{R}^1 \times \mathbb{R}^3$ and the time $\vartheta = 1$ when the game ends are fixed. By (2.3) and the definition of the quasimotions $x[\cdot, t_0, x_0, U]$ of the system (2.2) generated by the strategy $U \in \mathcal{U} = \{U \div u(t, x) \mid \|u(t, x)\| \le R\}$ from the initial position $(t_0, x_0) = (0, 0_3)$, the reachability domain of the system (2.2), $X = X[\vartheta, t_0, x_0, U \div H]$, is a sphere in $\mathbb{R}^3$ of radius $R(\vartheta - t_0) = R\vartheta = R$ with center at the origin $0_3$.
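The claim about the reachability domain can be checked numerically. The sketch below is illustrative only: it replaces quasimotions by Euler endpoints of piecewise-constant program controls, and the names `endpoint`, `random_admissible_controls`, the value `R = 2.0` and the point `x_star` are assumptions introduced here, not part of the text.

```python
import numpy as np

def endpoint(controls, theta=1.0):
    """Endpoint x(theta) of x' = u, x(0) = 0, for piecewise-constant controls."""
    dt = theta / len(controls)
    return np.sum(controls, axis=0) * dt

def random_admissible_controls(rng, steps, R):
    """Sample `steps` control vectors in R^3 with norm <= R (hypothetical helper)."""
    u = rng.normal(size=(steps, 3))
    u /= np.linalg.norm(u, axis=1, keepdims=True)   # unit directions
    return u * R * rng.uniform(size=(steps, 1))     # scale into the ball H

R = 2.0                                             # assumed bound in (2.3)
rng = np.random.default_rng(0)
ends = np.array([endpoint(random_admissible_controls(rng, 50, R))
                 for _ in range(1000)])
max_norm = np.linalg.norm(ends, axis=1).max()       # never exceeds R * theta = R

# A constant control equal to a point of the ball steers the state to that point.
x_star = np.array([1.0, -1.0, 0.5])                 # assumed target, norm 1.5 <= R
reached = endpoint(np.tile(x_star, (50, 1)))
```

Every sampled endpoint stays inside the ball of radius $R$, and a constant control of norm at most $R$ reaches any prescribed point of that ball at $\vartheta = 1$.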
Remark 2.1. By Corollary 4.1 in Chapter 1, for every point $x^* \in X$ there exists a strategy of "its own" $U^* \in \mathcal{U}$ such that
$$x^* = x[\vartheta, t_0, x_0, U^*]$$
for all quasimotions $x[\cdot, t_0, x_0, U^*]$ of the system (2.2) generated by the strategy $U^*$ from the initial position $(t_0, x_0) = (0, 0_3)$.

The two-component payoff function in this pursuit game is specified as
Here the second criterion $F_2(x[U], y)$ characterizes the distance from the point $x[U]$ to $a_2 + y$, where $y$ can assume any value from the sphere $Y$.

In a pursuit game with points $M_1$ and $M_2$, it is required to find a strategy $U^*$ that would assure the closest possible proximity of the state point $x[\vartheta]$ to $M_1$ and $M_2$ at the same time. In this problem the location of $M_2$ is not known accurately; only the set (the sphere $W_r(a_2)$) inside which $M_2$ can assume any location is known. By Remark 2.1, the problem of determining the strategy $U^*$ reduces to finding a point $x^* \in X$ as close to $M_1$ and $M_2$ as possible. For this reason the game will be associated with a static zero-sum game with a two-component payoff function
$$\langle X,\ Y,\ F(x, y) \rangle. \tag{2.5}$$
Here $X = \{x \in \mathbb{R}^3 \mid \|x\| \le R\}$ is the reachability domain of the system (2.2), which for the game (2.5) is the set of strategies of the minimizing (first) player; the set $Y$ of the strategies $y$ of the maximizing (second) player is determined in (2.1). By the way they are obtained and by Proposition 2.3 from Chapter 1, the sets $X$ and $Y$ are compacta in $\mathbb{R}^3$. The vector-valued payoff function will be given, by (2.4), as
$$F(x, y) = (F_1(x, y), F_2(x, y)) = (\|x - a_1\|^2, \|x - y - a_2\|^2). \tag{2.6}$$
Now let us assume that the following condition holds.

Condition 2.1. The constants $R > r > 0$.

The solution of the game (2.5) in the eyes of the first player is assumed to be the Pareto-minimax strategy $x_{(P)}$ (see Appendix 4), and in the eyes of the second player it is supposed to be the Pareto-maximin strategy $y^{(P)}$; finally, the solution satisfying both players simultaneously will be the Pareto saddle point $(x^P, y^P)$ at which the value of the payoff function $F(x^P, y^P)$ coincides with the Pareto minimax and maximin (an analog of the ZP-solution in the static
case). This solution belongs to the internally stable (i.e., nonimprovable for all Pareto saddle points) subset of the set of Pareto saddle points. For the game (2.5) the results are as follows. In Fig. 8.2.1 the vectors $a_2 = \overrightarrow{OA_2}$ and $a_1 = \overrightarrow{OM_1}$, and the plane of the drawing coincides with the plane in $\mathbb{R}^3$ passing through the points $O$, $A_2$ and $M_1$.
(1) The set of Pareto-minimax strategies $x_{(P)}$ is equal to the arc $CB$, which is the orthogonal projection of the segment $A_2M_1$ on the set $X$.

(2) Any minimax strategy $x_{(P)} = \overrightarrow{OE}$, together with the strategy $y^{(P)} = \overrightarrow{A_2K}$, forms a Pareto saddle point of the game (2.5), that is,
$$F(x_{(P)}, y) \not\geq F(x_{(P)}, y^{(P)}) \not\geq F(x, y^{(P)}) \quad \forall x \in X,\ y \in Y.$$
Figure 8.2.1 shows an easily comprehensible geometrical method of plotting $\overrightarrow{A_2K}$ from $\overrightarrow{OE}$.
(3) The sole Pareto-maximin strategy of the game (2.5) will be $y^{(P)} = \overrightarrow{A_2\mathcal{D}}$, which, together with the Pareto-minimax strategy $x_{(P)} = \overrightarrow{OC}$, forms a Pareto saddle point; as a result, the value $F(x_{(P)}, y^{(P)})$ is simultaneously a
Figure 8.2.1.
Pareto minimax and Pareto maximin and coincides with the payoff at the Pareto saddle point $(x_{(P)}, y^{(P)})$, or is a ZP-solution of the game (2.5).

Remark 2.2. Finally, to plot the strategy $U^* \in \mathcal{U}$ realizing the equality
$$x_{(P)} = x[\vartheta, t_0, x_0, U^*] \quad \forall x[\cdot, t_0, x_0, U^*], \tag{2.7}$$
Remark 2.1 must be used together with the fact that if $x_{(P)} = (x_1^*, x_2^*, x_3^*)$ is a constant vector, (2.7) is realized by the strategy $U^* \div u^*(t, x) = (x_1^*, x_2^*, x_3^*)$. Its validity follows from the form of (2.2), the initial position $(t_0, x_0) = (0, 0_3)$ and $\vartheta = 1$.

2.2. Pareto Minimaxes
Let us now proceed to the proof of the propositions in section 2.1. The notation used here is explained in Appendix 4.
Proposition 2.1. The set of Pareto-minimax strategies $x_{(P)}$ of the game (2.5) consists of vectors determined by the orthogonal projection on the set $X$ of the segment $M_1A_2$, i.e., it coincides with the arc $BC$ (see Fig. 8.2.1).

Proof. $a_2 \notin X$. Because the maximum (over $y$) of the function $\|z - y\|$ on the set $Y = \{y \in \mathbb{R}^3 \mid \|y\| \le r\}$ with $\|z\| \ne 0$ is reached at the unique point $y = -r z \|z\|^{-1}$, for every strategy $x \in X$ ($x \ne a_2$) the set $Y^P(x)$ of Pareto-maximal strategies of the two-criterial problem $\Gamma_x = \langle Y, F(x, y) \rangle$ consists of the single vector
$$y^P(x) = -\frac{x - a_2}{\|x - a_2\|}\, r. \tag{2.8}$$
Substituting (2.8) in (2.6) we obtain $F(x, y^P(x)) = (F_1(x, y^P(x)), F_2(x, y^P(x)))$, where
$$F_1(x, y^P(x)) = \|x - a_1\|^2, \qquad F_2(x, y^P(x)) = (\|x - a_2\| + r)^2. \tag{2.9}$$
Let us now consider the two-criterial problem
$$\langle X,\ F(x, y^P(x)) \rangle. \tag{2.10}$$
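The best response (2.8) and the resulting value of the second criterion can be verified by sampling. The following is a minimal sketch; the vectors `a2`, `x`, the radius `r` and the helper names are assumptions chosen for illustration.

```python
import numpy as np

def worst_case_y(x, a2, r):
    """Maximizer's response (2.8): y = -r (x - a2) / ||x - a2||."""
    z = x - a2
    return -r * z / np.linalg.norm(z)

def F2(x, y, a2):
    """Second criterion of (2.6): squared distance from x to the point a2 + y."""
    return np.linalg.norm(x - y - a2, axis=-1) ** 2

a2 = np.array([3.0, 1.0, 0.0])      # assumed center of the uncertain target
x = np.array([1.0, 0.0, 0.0])       # assumed strategy of the first player
r = 0.5                             # assumed radius of the sphere Y

y_star = worst_case_y(x, a2, r)
best = F2(x, y_star, a2)
closed_form = (np.linalg.norm(x - a2) + r) ** 2

# Random points of the ball Y never give a larger value of F2.
rng = np.random.default_rng(1)
d = rng.normal(size=(20000, 3))
d /= np.linalg.norm(d, axis=1, keepdims=True)
samples = d * r * rng.uniform(size=(20000, 1)) ** (1.0 / 3.0)
sampled_max = F2(x, samples, a2).max()
```

Here `best` agrees with the closed form $(\|x - a_2\| + r)^2$ used in the proof, and no sampled $y \in Y$ exceeds it.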
By definition (Appendix 4, section 4.2) the problem of finding a Pareto-minimax strategy of the game (2.5) reduces to plotting a Pareto-minimal strategy of the two-criterial problem (2.10), (2.9). Before plotting, we transform problem (2.10) into a more suitable form. For this purpose let us apply results from [70, pp. 60-64] on equivalent vector-valued criteria. Specifically, two sets of criteria (vector-valued criteria) are equivalent if their Pareto-minimal strategies coincide. The following propositions are true.

(1) If the scalar function $\psi(F_2)$ increases with $F_2$, the collections $(F_1(x), F_2(x))$ and $(F_1(x), \psi(F_2(x)))$ are equivalent; in our case we take $\psi(F_2) = \sqrt{F_2}$, $F_2 \ge 0$.
(2) The collections $(F_1(x), F_2(x))$ and $(F_1(x), F_2(x) + r)$ are equivalent for every constant $r$.

By these propositions, the collections of criteria
$$(\|x - a_1\|^2, (\|x - a_2\| + r)^2) \quad \text{and} \quad (\|x - a_1\|^2, \|x - a_2\|)$$
are equivalent. Using, finally, the former proposition with the function $\psi(F_2) = F_2^2$, $F_2 \ge 0$, we obtain two equivalent collections of criteria:
$$(\|x - a_1\|^2, (\|x - a_2\| + r)^2) \quad \text{and} \quad (\|x - a_1\|^2, \|x - a_2\|^2). \tag{2.11}$$
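The equivalence (2.11) can be illustrated on a finite sample of strategies: the indices of the Pareto-minimal points are the same under both collections of criteria. This is only a sketch on sampled data, with `a1`, `a2`, `r` and the helper `pareto_min_indices` introduced here as assumptions.

```python
import numpy as np

def pareto_min_indices(values):
    """Indices of rows that no other row dominates (<= everywhere, < somewhere)."""
    keep = set()
    for i, v in enumerate(values):
        dominated = np.any(np.all(values <= v, axis=1) & np.any(values < v, axis=1))
        if not dominated:
            keep.add(i)
    return keep

rng = np.random.default_rng(2)
a1 = np.array([2.0, 0.0, 0.0])                    # assumed target points
a2 = np.array([0.0, 2.0, 0.0])
r = 0.5                                           # assumed radius
xs = rng.uniform(-1.0, 1.0, size=(400, 3))        # sampled strategies x

d1 = np.linalg.norm(xs - a1, axis=1)
d2 = np.linalg.norm(xs - a2, axis=1)
crit_shifted = np.column_stack([d1**2, (d2 + r)**2])   # first collection in (2.11)
crit_plain = np.column_stack([d1**2, d2**2])           # second collection in (2.11)

same_pareto_set = pareto_min_indices(crit_shifted) == pareto_min_indices(crit_plain)
```

Because both transformations of the second criterion are increasing, the dominance relation between any two sampled strategies is unchanged, so `same_pareto_set` is `True`.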
Consequently the problem of finding Pareto-minimal strategies in (2.10), (2.9) reduces to plotting these strategies in a two-criterial problem
$$\langle X,\ F(x, y^P(x)) \rangle, \tag{2.12}$$
where
$$F(x, y^P(x)) = (F_1(x, y^P(x)), F_2(x, y^P(x))) = (\|x - a_1\|^2, \|x - a_2\|^2). \tag{2.13}$$
Such a problem (rapprochement game) has been discussed in detail in [66, p. 115]. The Pareto-minimal strategies in (2.12) have been found to be the orthogonal projection of the segment $M_1A_2$ on the set $X$. This projection (see Fig. 8.2.1) coincides with the arc $BC$. But then, by the equivalence of the collections of criteria in (2.11), the arc $BC$ forms the set of Pareto-minimal
strategies in (2.9), (2.10) and, consequently, the set of Pareto-minimax strategies of the game (2.5).

$a_2 \in X$. From (2.6) we have $F_2(x^*, y) = \|y\|^2$ with $x^* = a_2$, and the Pareto-maximal strategy $y^P(x^*)$ will be every vector $y$ satisfying the equality $\|y\| = r$. For every strategy $x \ne a_2$ and the corresponding $y^P(x) = -(x - a_2)\|x - a_2\|^{-1} r$, we have
$$F_2(x, y^P(x)) = (\|x - a_2\| + r)^2.$$
Hence
$$F_2(x^*, y^P(x^*)) = \|y^P(x^*)\|^2 = r^2 < F_2(x, y^P(x))$$
and then
$$F(x, y^P(x)) \not\leq F(x^*, y^P(x^*)).$$
The latter relation signifies that $x^* = a_2$ is a Pareto-minimax strategy. The proposition is proved.

2.3. Pareto Saddle Points
Proposition 2.2. The Pareto-minimax strategies $x_{(P)}$ of the game (2.5) ($x_{(P)} \ne a_2$), together with the strategy
$$y^P = y^P(x_{(P)}) = -\frac{x_{(P)} - a_2}{\|x_{(P)} - a_2\|}\, r,$$
form the Pareto saddle points of (2.5).

Proof. Consider an arbitrary Pareto-minimax strategy $x_{(P)}$ of the game (2.5) and the associated strategy $y^P = y^P(x_{(P)})$.

From the definition of a Pareto saddle point (in subsection 3.1, Chapter 6) and of the sets $X^P(y)$ and $Y^P(x)$ (Appendix 4, section 4.2) it follows that the situation $(x_{(P)}, y^P)$ is a Pareto saddle point of the game (2.5) iff $x_{(P)} \in X^P(y^P)$ and $y^P \in Y^P(x_{(P)})$. Since by construction
$$y^P \in Y^P(x_{(P)}) = \{y^P(x_{(P)})\},$$
$(x_{(P)}, y^P)$ turns out to be a Pareto saddle point of the game (2.5) if $x_{(P)} \in X^P(y^P)$. Let us prove this inclusion.
Consider a two-criterial problem
$$\langle X,\ F(x, y^P) \rangle, \tag{2.14}$$
where
$$F(x, y^P) = F(x, y^P(x_{(P)})) = (\|x - a_1\|^2, \|x - (y^P + a_2)\|^2). \tag{2.15}$$
The Pareto minimum in problem (2.14), (2.15) is reached [66, p. 111] at every point $x$ of
$$X^P(y^P) = \pi_X[a_1, a_2 + y^P], \tag{2.16}$$
the orthogonal projection on the set $X$ of the segment with endpoints $a_1$ and $a_2 + y^P$. The points of the segment $[a_1, a_2]$ can be written as
$$z = a_1 + (a_2 - a_1)\beta, \quad \beta \in [0, 1].$$
Let us consider the case $\|z\| \le R$ ($z \ne a_2$, or $\beta \ne 1$). Then the orthogonal projection of $z$ on the set $X$ will be $x_{(P)} = z = a_1 + \beta(a_2 - a_1)$. The strategy is
$$y^P = -\frac{x_{(P)} - a_2}{\|x_{(P)} - a_2\|}\, r = -\frac{a_1 + (a_2 - a_1)\beta - a_2}{\|a_1 + (a_2 - a_1)\beta - a_2\|}\, r = -\frac{a_1 - a_2}{\|a_1 - a_2\|}\, r.$$
Therefore, the set $X^P(y^P)$ from (2.16) can be represented as follows:
$$X^P(y^P) = \pi_X\bigl[a_1,\ a_2 + (a_2 - a_1)\|a_2 - a_1\|^{-1} r\bigr].$$
The segment $[a_1, a_2 + (a_2 - a_1)\|a_2 - a_1\|^{-1} r]$ includes the closed interval $[a_1, a_2]$ and, consequently, the point $z$. For this reason the projection of the segment on $X$ contains $x_{(P)}$. Then for $\|z\| \le R$ we have $x_{(P)} \in X^P(y^P)$. Let us now consider the case $\|z\| > R$, where also
The quantity llzll can, by virtue of the fact that llzll > R, be represented as where y = const > 0. We consider the associated corresponding strategy a1
+ (a2 - a1)B
- a,
or
Then the internal
- a d 1 P)RY + Il(a1 - azH1 - B)R - azyll (a2
-
The resulting interval $[M, B]$ is located relative to $[a_1, a_2]$ as shown in Fig. 8.2.2, where
Figure 8.2.2.
The point 9 of the segment [M,B] is projected to the same point of the set X as is D,, that is, X(p) = 7@E%Cal,
a2
+ YP(X(,))I.
Consequently x(,) E Xp(yp(x(,,)). Remark 2.2. Because the set 9 of all Pareto saddle points of the game (2.5) (see Fig. 8.2.1) is
-
IICM, 11 = min IIx - a,I12 = min F , ( x , y ) , ( X > Y )E 9
( x . Y)E 9
llC9 )I = max IIx - y - a21I2= max F , ( x , y ) (X>Y)EB
( x .Y ) E 9
the Pareto saddle point of the game (2.5) (xo,yo) = (E, A 2 9 ) is such that for any situation (x(,), yP)€B, VX(,,,
YP)
2 wo,Y O ) , W(,),Y") $ m0,Y O ) .
Moreover, yo is the Pareto-maximin strategy of the game (2.5) and xo the minimax, or (xo,yo) will be internally stable in the face of the situations from 9and the vector-valued payoff IF(xo,yo) at this Pareto saddle point coincides with the Pareto minimax and maximin. Consequently, this situation is more suitable as the solution of the game (2.5) and is an analog of the ZP-solution in the static case.
Example 2.1. Consider the game (2.5) with $a_1$, $a_2$, $r$, and $R$ shown in Fig. 8.2.3a. The range of values of the two-component payoff function $F(x, y) = (\|x - a_1\|^2, \|x - y - a_2\|^2)$ is shown in Fig. 8.2.3b. The vertical lines stand for sets of the form $F(x, Y) = \{F(x, y) \mid y \in Y\}$. The range $C_1B_1B_2C_2$ is the set of values of $F(x_{(P)}, y)$ reached through an exhaustive search of the Pareto-minimax strategies (belonging to the arc $CB$) and all $y \in Y$. The arc $C_1B_1$ is the set of all Pareto minimaxes in this game. It coincides with the values of the payoff function at the Pareto saddle points (plotted by Proposition 2.2).
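A numerical counterpart of this example can be sketched as follows. The data behind Fig. 8.2.3 are not given in the text, so the values of `a1`, `a2`, `r` and `R` below are hypothetical; the code projects points of the segment $[a_1, a_2]$ on the ball $X$ (Proposition 2.1) and evaluates the payoff at the worst-case response, using the closed form $(\|x - a_1\|^2, (\|x - a_2\| + r)^2)$ from the proof of Proposition 2.1.

```python
import numpy as np

def project_on_ball(z, R):
    """Orthogonal projection of z onto the ball X = {x : ||x|| <= R}."""
    n = np.linalg.norm(z)
    return z if n <= R else z * (R / n)

# Hypothetical data satisfying Condition 2.1 (R > r > 0), with a2 outside X.
a1 = np.array([3.0, 0.0, 0.0])
a2 = np.array([0.0, 3.0, 0.0])
r, R = 0.5, 2.0

betas = np.linspace(0.0, 1.0, 11)
arc = np.array([project_on_ball(a1 + b * (a2 - a1), R) for b in betas])

# Payoff components along the arc at the maximizer's worst-case response.
F1 = np.linalg.norm(arc - a1, axis=1) ** 2
F2 = (np.linalg.norm(arc - a2, axis=1) + r) ** 2
```

With these values the whole segment lies outside $X$, so the projected points lie on the boundary sphere; moving along the arc increases `F1` while decreasing `F2`, which is the trade-off that the arc of Pareto minimaxes represents.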
3. Pursuit of Two Target Points

We assume that the location of two target points is known, and that the location of the object approaching them is determined with an error of known range.
3.1. The Problem

As in the preceding section, let us assume that a change of the current state of the process is described by a three-dimensional vector $x[t]$, $0 = t_0 \le t \le \vartheta$, $\vartheta = \text{const} > 0$. Here $x[\cdot] = \{x[t], t \in [0, \vartheta]\}$ is a quasimotion of the system (2.2) generated from the initial position $(t_0, x_0) = (0, 0_3)$ by the strategy
$$U \in \mathcal{U} = \{U \div u(t, x) \mid \|u(t, x)\| \le \alpha\}. \tag{3.1}$$
Then the reachability domain
$$X = X[\vartheta, t_0, x_0, U \div H] = \bigcup_{U \in \mathcal{U}} x[\vartheta, t_0, x_0, U]$$
is a sphere in $\mathbb{R}^3$ with center at the origin and radius $\alpha\vartheta$. Unlike section 2, let us assume that the locations of the two targets $A_1$ and $A_2$ are known exactly. They are determined by the vectors $a_1$ and $a_2$, respectively.
The location of the process $\Sigma$ itself (at the time $\vartheta$ when the process ends) is measured with error $y$, of which only the range is known. In other words, $y$ can assume any unpredictable value from the sphere
$$Y = \{y \in \mathbb{R}^3 \mid \|y\| \le \gamma\}, \quad \gamma = \text{const} > 0. \tag{3.2}$$
In meaningful terms, if the error is neglected, we would have to select a strategy $U^* \in \mathcal{U}$ that assures the maximal possible proximity of the state vector $x[\vartheta, t_0, x_0, U^*]$ to both points $A_1$ and $A_2$ simultaneously. When the error is recognized, $A_1$ and $A_2$ are approached not by the point $x[\vartheta] = x[\vartheta, t_0, x_0, U^*]$, but by the point $x[\vartheta] + y$, where the vector $y$ can assume any value from the sphere $Y$; that is, the points $A_1$ and $A_2$ are approached by the set
$$x[\vartheta] + Y.$$
Because we know only that $y \in Y$, by the principle of the guaranteed result [30], when choosing $U^*$ the values $y \in Y$ most unfavorable for the process $\Sigma$, or catastrophe, can be expected. As a result, because $\Sigma$ can be anywhere in the set $x[\vartheta, t_0, x_0, U] + Y$, the strategy $U$ must be chosen so that the object approaches $A_1$ and $A_2$ as closely as possible simultaneously (Fig. 8.3.1). In light of Remarks 2.1 and 2.2 the problem reduces to the selection of a point $x^* \in X$ close to $A_1$ and $A_2$ in the above sense.
Figure 8.3.1.
Consequently, the static analog of the pursuit game can be represented as
$$\langle X,\ Y,\ F(x, y) \rangle, \tag{3.3}$$
where the set of the first (minimizing) player's strategies $x$ is the reachability domain $X$ of the system (2.2) from the initial position $(t_0, x_0) = (0, 0_3)$. The set of strategies $y$ of the second (maximizing) player is determined in (3.2); the two-component payoff function is
$$F(x, y) = (F_1(x, y), F_2(x, y)) = (\|x + y - a_1\|^2, \|x + y - a_2\|^2). \tag{3.4}$$
The solution of (3.3) will be the situation (xg,yg)), that is, the Geoffrion saddle point (Definition 3.1, Chapter 6). From the point of view of the minimizing player, the use of the strategy xg from (xg,y g ) assures for him values of the vector-valued payoff function ff(xg, y g )for all Y E Y (Fig. 8.3.2). ff(xg, y), t l y ~ such that IF(xg, y) Here the points of the set
aG
wg,Y ) = (J wg,y) YEY
cannot “penetrate” the shaded obtuse angular region with vertex at
Figure 8.3.2.
IF(xg, yg).
In other words, once the strategy $x^g$ is chosen, the first player assures for himself the payoff vector $F(x^g, y^g)$ whatever the behavior (strategy $y \in Y$) of the second player. This section discusses how to plot the strategies $x^g$ and $y^g$. First, auxiliary information is needed from the theory of zero-sum games with vector-valued payoff function for the method of plotting Geoffrion saddle points by means of Lagrangian multipliers. Then the explicit form of the Geoffrion saddle point of the game (3.3) is found.
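3.2. Auxiliary Information

Here the game
$$\Gamma = \langle X,\ Y,\ F(x, y) \rangle \tag{3.5}$$
of general type is examined, where $X \in \operatorname{comp} \mathbb{R}^h$, $Y \in \operatorname{comp} \mathbb{R}^q$, and the components $F_i(x, y)$, $i \in N$, of the vector-valued payoff function $F(x, y)$ are continuous over $X \times Y$. The game $\Gamma$ will be associated with the zero-sum game with scalar payoff function
$$\Gamma_\beta = \Bigl\langle X,\ Y,\ F_\beta(x, y) = \sum_{i \in N} \beta_i F_i(x, y) \Bigr\rangle, \tag{3.6}$$
where the positive constants $\beta_i$, $i \in N$, are fixed.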
Remark 3.1. Analogously to [70, pp. 80-81], the validity of the following proposition is established:
any saddle point $(x^g, y^g) \in X \times Y$ of the game $\Gamma_\beta$, or
Note that (3.7) is equivalent to satisfying the chain of equalities
$$\max_{y \in Y} \min_{x \in X} F_\beta(x, y) = \min_{x \in X} F_\beta(x, y^g) = \max_{y \in Y} F_\beta(x^g, y) = \min_{x \in X} \max_{y \in Y} F_\beta(x, y). \tag{3.8}$$
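The chain (3.8) can be checked on a grid for a toy game. The sketch below assumes scalar strategies on $X = Y = [-2, 2]$ and two hypothetical criteria that are convex in $x$ and concave in $y$; none of these choices come from the text, they only make the scalarized game $F_\beta$ of (3.6) easy to inspect.

```python
import numpy as np

beta = np.array([0.5, 0.5])                       # assumed positive weights
grid = np.linspace(-2.0, 2.0, 81)                 # discretized X = Y = [-2, 2]
X, Y = np.meshgrid(grid, grid, indexing="ij")     # X[i, j] = x_i, Y[i, j] = y_j

# Hypothetical criteria, convex in x and concave in y.
F1 = X**2 - Y**2
F2 = (X - 1.0)**2 - (Y - 1.0)**2
F_beta = beta[0] * F1 + beta[1] * F2              # scalar payoff as in (3.6)

maximin = F_beta.min(axis=0).max()                # max over y of min over x
minimax = F_beta.max(axis=1).min()                # min over x of max over y
i = F_beta.max(axis=1).argmin()                   # index of the minimax strategy
j = F_beta.min(axis=0).argmax()                   # index of the maximin strategy
x_g, y_g = grid[i], grid[j]

inner_min_at_yg = F_beta[:, j].min()              # min over x of F_beta(x, y_g)
inner_max_at_xg = F_beta[i, :].max()              # max over y of F_beta(x_g, y)
```

For this payoff the four quantities in (3.8) coincide, and the saddle point sits at $x^g = y^g = 0.5$, the common minimizer of $\beta_1 t^2 + \beta_2 (t - 1)^2$.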
Let us employ Lagrangian multipliers to find the saddle points of the game (3.6) and, consequently, the Geoffrion saddle points of the game (3.5). Assume that the sets $X$ and $Y$ in the game (3.6) are specified by the inequalities
$$X = \{x \in \mathbb{R}^h \mid d_r(x) \ge 0,\ r = 1, \ldots, l\}, \qquad Y = \{y \in \mathbb{R}^q \mid h_j(y) \ge 0,\ j = 1, \ldots, k\}. \tag{3.9}$$
The Lagrangian function for the game (3.6), (3.9) is
$$L(x, y, \mu, \lambda) = \sum_{i \in N} \beta_i F_i(x, y) + \sum_{r=1}^{l} \mu_r d_r(x) + \sum_{j=1}^{k} \lambda_j h_j(y),$$
where the constants $\mu_r \le 0$ and $\lambda_j \ge 0$ are Lagrangian multipliers. We introduce the column vectors
$$\mu = (\mu_1, \ldots, \mu_l), \quad \lambda = (\lambda_1, \ldots, \lambda_k), \quad d = (d_1, \ldots, d_l), \quad h = (h_1, \ldots, h_k).$$
Then the constraints (3.9) and the Lagrangian function assume the form
$$X = \{x \in \mathbb{R}^h \mid d(x) \ge 0_l\}, \qquad Y = \{y \in \mathbb{R}^q \mid h(y) \ge 0_k\},$$
where $0_n$ is the null vector from $\mathbb{R}^n$,
$$L(x, y, \mu, \lambda) = \beta' F(x, y) + \mu' d(x) + \lambda' h(y).$$
(Recall that the prime signifies the transpose.) Then, for example, $\beta'$ is a row vector and $\beta' F(x, y)$ the scalar product of the vectors $\beta$ and $F(x, y)$. We will need some propositions on methods of obtaining the saddle point $(x^g, y^g)$ of the game (3.6) using the Lagrangian function $L(x, y, \mu, \lambda)$. The functions $F_i(x, y)$, $i \in N$, and the components of $d(x)$ and $h(y)$ are assumed to be continuous over the product of compacta $X \times Y$, and the compacta $X$ and $Y$ are specified by the inequalities (3.9).

Lemma 3.1 [27, p. 116]. The following equalities are valid:
$$\max_{y \in Y} \min_{x \in X} \sum_{i \in N} \beta_i F_i(x, y) = \max_{y \in Y} \min_{x \in X} \inf_{\lambda \ge 0_k} \sup_{\mu \le 0_l} L(x, y, \mu, \lambda),$$
$$\min_{x \in X} \max_{y \in Y} \sum_{i \in N} \beta_i F_i(x, y) = \min_{x \in X} \max_{y \in Y} \sup_{\mu \le 0_l} \inf_{\lambda \ge 0_k} L(x, y, \mu, \lambda).$$
If there exists
$$y^g = \arg\max_{y \in Y} \min_{x \in X} \inf_{\lambda \ge 0_k} \sup_{\mu \le 0_l} L(x, y, \mu, \lambda),$$
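then
$$\max_{y \in Y} \min_{x \in X} \beta' F(x, y) = \min_{x \in X} \beta' F(x, y^g).$$
Analogously, if there exists
$$x^g = \arg\min_{x \in X} \max_{y \in Y} \sup_{\mu \le 0_l} \inf_{\lambda \ge 0_k} L(x, y, \mu, \lambda),$$
then
$$\min_{x \in X} \max_{y \in Y} \beta' F(x, y) = \max_{y \in Y} \beta' F(x^g, y).$$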
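The role of the multipliers in Lemma 3.1 can be made explicit by a short worked step. The following is a sketch under the sign conventions $\mu \le 0_l$, $\lambda \ge 0_k$ stated above, with the multipliers ranging over the whole orthants; it is an illustration, not the proof given in [27].

```latex
\sup_{\mu \le 0_l} \mu' d(x) =
\begin{cases}
0, & d(x) \ge 0_l \ (\text{i.e. } x \in X),\\
+\infty, & \text{otherwise},
\end{cases}
\qquad
\inf_{\lambda \ge 0_k} \lambda' h(y) =
\begin{cases}
0, & h(y) \ge 0_k \ (\text{i.e. } y \in Y),\\
-\infty, & \text{otherwise}.
\end{cases}
```

Hence for $x \in X$ and $y \in Y$ both multiplier terms vanish at the optimum and $\inf_{\lambda \ge 0_k} \sup_{\mu \le 0_l} L(x, y, \mu, \lambda) = \beta' F(x, y)$, while a violated constraint is penalized against the player who violates it: $+\infty$ for the minimizing player leaving $X$, and $-\infty$ for the maximizing player leaving $Y$.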
If the constraints (3.9) specifying the sets satisfy regularity conditions (see Lemma 2.2, Chapter 2), then in Lemma 3.1 constraints may be imposed on the ranges of $\mu$ and $\lambda$. For example, the first part of Lemma 3.1 can be reformulated as follows [27, p. 118].
Lemma 3.2. Assume that the scalar functions $F_i(x, y)$, $i \in N$, satisfy a Lipschitz condition in $(x, y)$ over $X \times Y$, the functions $d_r(x)$ and $h_j(y)$ are continuous, and
$$\min_{r = 1, \ldots, l} d_r(x) \le -\gamma \rho(x, X) \ \text{with } x \notin X, \qquad \min_{j = 1, \ldots, k} h_j(y) \le -\gamma \rho(y, Y) \ \text{with } y \notin Y, \tag{3.10}$$
where $\gamma > 0$ is constant and, for example, $\rho(x, X) = \min_{z \in X} \|x - z\|$. Then constant vectors $\lambda^* \ge 0_k$ and $\mu^* \le 0_l$ exist such that
$$\max_{y \in Y} \min_{x \in X} \min_{0_k \le \lambda \le \lambda^*} \max_{\mu^* \le \mu \le 0_l} L(x, y, \mu, \lambda) = \max_{y \in Y} \min_{x \in X} \beta' F(x, y).$$
The problem of finding the saddle point of the game (3.6) (and, consequently, the Geoffrion saddle point of the game (3.5)) can be reduced to constructing the saddle point of a zero-sum game (without the constraints X or Y) with scalar payoff function $L(x, y, \mu, \lambda)$. In that game the first player chooses the strategy $(x, \lambda)$ to minimize $L(x, y, \mu, \lambda)$ and the second player chooses $(y, \mu)$ to maximize this function. Numerical methods are applicable (cf. [18]). The conditions [27, p. 119] under which this is possible are as follows:
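Proposition 3.1. Assume that

$$\sup_{y \in Y,\ \mu \le 0_l} \ \inf_{x \in X,\ \lambda \ge 0_k} L(x, y, \mu, \lambda) = \inf_{x \in X,\ \lambda \ge 0_k} \ \sup_{y \in Y,\ \mu \le 0_l} L(x, y, \mu, \lambda). \tag{3.11}$$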
8. A Pursuit Game with Noise
Then

$$\max_{y \in Y} \min_{x \in X} \beta' F(x, y) = \min_{x \in X} \max_{y \in Y} \beta' F(x, y). \tag{3.12}$$
In particular, if $((x^g, \lambda^g), (y^g, \mu^g))$ is the saddle point of the Lagrangian function, the situation $(x^g, y^g)$ will be the saddle point of the payoff function $\beta' F(x, y)$ on $X \times Y$ and, consequently, the Geoffrion saddle point of the game (3.5). If $d_r(x)$ and $h_j(y)$ are concave and the functions $F_i(x, y)$, $i \in N$, are convex over x and concave over y on the convex compacta X and Y respectively, then (3.11) results from (3.12). Let us formulate necessary and sufficient conditions for the existence of the saddle point of the game (3.6).

Proposition 3.2. [27, p. 121]. Assume that

(1) The functions $d_r(x)$ and $h_j(y)$ are concave.
(2) The function $\beta' F(x, y)$ is concave over y for every $x \in X$ and convex over x for every $y \in Y$.
(3) The sets X and Y are convex compacta.
(4) The regularity condition (3.10) is satisfied.

Then $(x^g, y^g)$ will be the saddle point of the function $\beta' F(x, y)$ over $X \times Y$ iff constant vectors $\lambda^g \ge 0_k$ and $\mu^g \le 0_l$ exist such that the pair $((x^g, \lambda^g), (y^g, \mu^g))$ is the saddle point of the Lagrangian function, or

$$\max_{y \in Y,\ \mu \le 0_l} \ \inf_{x \in X,\ \lambda \ge 0_k} L(x, y, \mu, \lambda) = \inf_{x \in X,\ \lambda \ge 0_k} L(x, y^g, \mu^g, \lambda) = \sup_{y \in Y,\ \mu \le 0_l} L(x^g, y, \mu, \lambda^g) = \min_{x \in X,\ \lambda \ge 0_k} \ \sup_{y \in Y,\ \mu \le 0_l} L(x, y, \mu, \lambda).$$

In section 3.4 the method of Lagrangian multipliers helps find the Geoffrion (and, consequently, Pareto and Slater) saddle point in the pursuit game of section 3.1.
Example 3.1. A Quadratic Programming Problem. We wish to find the Geoffrion saddle point for the game (3.5) in the case when

(1) Every function $F_i(x, y)$ has the form

$$F_i(x, y) = x' A_i x + y' B_i y + a_i' x + b_i' y, \quad i \in N,$$

where $x \in \mathbb{R}^h$, $y \in \mathbb{R}^q$; $A_i$ and $B_i$ are symmetric constant matrices of appropriate dimensions, the vectors $a_i$ and $b_i$ are constant, and the prime signifies the transpose.
(2) No constraints are imposed on the strategy sets; in other words, $X = \mathbb{R}^h$ and $Y = \mathbb{R}^q$.

In effect, we are discussing the following game:

$$\bigl(\{1, 2\},\ \{\mathbb{R}^h, \mathbb{R}^q\},\ \{F_i(x, y) = x' A_i x + y' B_i y + a_i' x + b_i' y\}_{i \in N}\bigr). \tag{3.13}$$

Proposition 3.3. If positive constants $\beta_i$, $i \in N$, exist such that the quadratic forms $x'\bigl(\sum_{i \in N} \beta_i A_i\bigr)x$ and $-y'\bigl(\sum_{i \in N} \beta_i B_i\bigr)y$ are positive definite, the Geoffrion saddle point $(x^g, y^g)$ and, consequently, the Pareto and Slater saddle points as well, of the game (3.13) have the form
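$$x^g = -\frac{1}{2}\Bigl(\sum_{i \in N} \beta_i A_i\Bigr)^{-1} \sum_{i \in N} \beta_i a_i, \qquad y^g = -\frac{1}{2}\Bigl(\sum_{i \in N} \beta_i B_i\Bigr)^{-1} \sum_{i \in N} \beta_i b_i. \tag{3.14}$$

To prove the assertion we need only apply Remark 3.1. The saddle point of a zero-sum game with scalar payoff function $\beta' F(x, y) = \sum_{i \in N} \beta_i F_i(x, y)$, where $\beta_i = \text{const} > 0$, $i \in N$, will also be the Geoffrion saddle point of the game (3.13). Sufficient conditions for the existence of the saddle point of (3.13) are as follows:

$$\frac{\partial (\beta' F)}{\partial x}\bigg|_{(x^g, y^g)} = 0_h, \qquad \frac{\partial (\beta' F)}{\partial y}\bigg|_{(x^g, y^g)} = 0_q, \tag{3.16}$$

$$\frac{\partial^2 (\beta' F)}{\partial x^2}\bigg|_{(x^g, y^g)} = 2 \sum_{i \in N} \beta_i A_i > 0, \qquad \frac{\partial^2 (\beta' F)}{\partial y^2}\bigg|_{(x^g, y^g)} = 2 \sum_{i \in N} \beta_i B_i < 0, \qquad \frac{\partial^2 (\beta' F)}{\partial x\, \partial y}\bigg|_{(x^g, y^g)} = 0_{h \times q}. \tag{3.17}$$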
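The stationarity argument of Example 3.1 can be checked numerically. The following is only a sketch under stated assumptions: the dimensions are fixed at h = q = 2, all matrices, vectors and weights are hypothetical illustration data, and the helper names (`quadratic_saddle`, `payoff`, and so on) are not taken from the text. It solves the first-order conditions of the scalarized payoff and verifies the definiteness conditions by Sylvester's criterion.

```python
# Sketch (assumed 2x2 data): saddle point of sum_i beta_i * F_i(x, y) with
# F_i(x, y) = x'A_i x + y'B_i y + a_i'x + b_i'y, as in game (3.13).

def mat_comb(beta, mats):
    """Weighted sum of 2x2 matrices."""
    return [[sum(b * m[r][c] for b, m in zip(beta, mats)) for c in range(2)]
            for r in range(2)]

def vec_comb(beta, vecs):
    """Weighted sum of 2-vectors."""
    return [sum(b * v[r] for b, v in zip(beta, vecs)) for r in range(2)]

def solve2(m, rhs):
    """Solve the 2x2 linear system m z = rhs by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(m[1][1] * rhs[0] - m[0][1] * rhs[1]) / det,
            (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det]

def is_pos_def(m):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return m[0][0] > 0 and m[0][0] * m[1][1] - m[0][1] * m[1][0] > 0

def quadratic_saddle(beta, A, B, a, b):
    """Solve 2*SA*x + sa = 0 and 2*SB*y + sb = 0 for the weighted sums."""
    SA, SB = mat_comb(beta, A), mat_comb(beta, B)
    neg_SB = [[-v for v in row] for row in SB]
    if not (is_pos_def(SA) and is_pos_def(neg_SB)):
        raise ValueError("definiteness conditions are not satisfied")
    sa, sb = vec_comb(beta, a), vec_comb(beta, b)
    x = solve2(SA, [-0.5 * s for s in sa])
    y = solve2(SB, [-0.5 * s for s in sb])
    return x, y

def quad(m, v):
    """Quadratic form v'mv."""
    return sum(v[r] * m[r][c] * v[c] for r in range(2) for c in range(2))

def payoff(beta, A, B, a, b, x, y):
    """Scalarized payoff sum_i beta_i * F_i(x, y)."""
    return sum(bi * (quad(Ai, x) + quad(Bi, y)
                     + sum(p * q for p, q in zip(ai, x))
                     + sum(p * q for p, q in zip(bi_vec, y)))
               for bi, Ai, Bi, ai, bi_vec in zip(beta, A, B, a, b))

# Hypothetical data for two criteria.
BETA = [0.5, 0.5]
A = [[[2, 0], [0, 1]], [[1, 0], [0, 3]]]
B = [[[-1, 0], [0, -2]], [[-3, 0], [0, -2]]]
a = [[2, 0], [4, 4]]
b = [[4, 0], [0, 8]]

X_G, Y_G = quadratic_saddle(BETA, A, B, a, b)
```

With these sample numbers the stationary point is x = (-1, -0.5), y = (0.5, 1); perturbing x increases the scalarized payoff and perturbing y decreases it, which is the saddle property the definiteness conditions are meant to secure.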
In (3.17) the condition C > 0 (respectively, C < 0) for the matrix C signifies that the quadratic form z'Cz is positive- (respectively, negative-) definite. The relations (3.17) are satisfied by virtue of Proposition 3.3 and (3.16) leads to (3.14).

3.3. Solution of the Pursuit Game
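Following the reasoning of section 3.1 let us consider a static pursuit game with a two-component payoff function,

$$\langle X, Y, \{\|x + y - a_1\|^2,\ \|x + y - a_2\|^2\}\rangle, \tag{3.18}$$

where the compacta

$$X = \{x \in \mathbb{R}^3 \mid \|x\| \le \alpha\theta\}, \qquad Y = \{y \in \mathbb{R}^3 \mid \|y\| \le \gamma\},$$

where $\gamma$, $\alpha$ and $\theta$ are specified positive constants and $a_1$ and $a_2$ are vectors associated with the fixed points $A_1$ and $A_2$ (Fig. 8.3.1). Using the results of section 3.2, let us find the Lagrangian function

$$L(x, y, \mu, \lambda) = \beta\|x + y - a_1\|^2 + (1 - \beta)\|x + y - a_2\|^2 + \mu(\|x\|^2 - \alpha^2\theta^2) + \lambda(\gamma^2 - \|y\|^2),$$

where the constants to be determined are $\mu > 0$, $\lambda > 0$, and $\beta \in (0, 1)$. For a zero-sum game with scalar payoff function $L(x, y, \mu, \lambda)$, where $x \in \mathbb{R}^3$, $y \in \mathbb{R}^3$, the saddle point $(x^g, y^g)$ exists if, recalling (3.7), (3.8) and Lemma 3.1 from Chapter 1,

$$\begin{aligned}
\frac{\partial L}{\partial x}\bigg|_{(x^g, y^g)} &= 2\beta[x^g + y^g - a_1] + 2(1 - \beta)[x^g + y^g - a_2] + 2\mu x^g = 0_3, \\
\frac{\partial L}{\partial y}\bigg|_{(x^g, y^g)} &= 2\beta[x^g + y^g - a_1] + 2(1 - \beta)[x^g + y^g - a_2] - 2\lambda y^g = 0_3;
\end{aligned} \tag{3.19}$$

$$\frac{\partial^2 L}{\partial x^2}\bigg|_{(x^g, y^g)} = 2(1 + \mu)E_3 > 0, \qquad \frac{\partial^2 L}{\partial y^2}\bigg|_{(x^g, y^g)} = 2(1 - \lambda)E_3 < 0; \tag{3.20}$$

$$\|x^g\| \le \alpha\theta, \qquad \|y^g\| \le \gamma; \tag{3.21}$$

here $E_3$ is an identity (3 x 3)-dimensional matrix.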
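From (3.19) it follows that

$$\mu x^g + \lambda y^g = 0_3.$$

Hence

$$y^g = -\frac{\mu}{\lambda} x^g. \tag{3.22}$$

Substituting this $y^g$ in the first equation of (3.19) we have

$$x^g\Bigl(1 + \mu - \frac{\mu}{\lambda}\Bigr) = \beta a_1 + (1 - \beta)a_2.$$

Hence, from (3.22), if $(1 + \mu - \mu/\lambda) \ne 0$,

$$x^g = \Bigl(1 + \mu - \frac{\mu}{\lambda}\Bigr)^{-1}[\beta a_1 + (1 - \beta)a_2], \qquad y^g = -\frac{\mu}{\lambda}\Bigl(1 + \mu - \frac{\mu}{\lambda}\Bigr)^{-1}[\beta a_1 + (1 - \beta)a_2].$$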
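If $\mu > 0$ and $\lambda > 1$, we have: first, from the fact that $1 + \mu - \mu/\lambda > 0$, $1 + \mu - \mu/\lambda \ne 0$; second, the conditions (3.20) are satisfied, that is, the quadratic forms $x'E_3(1 + \mu)x$ and $-x'E_3(1 - \lambda)x$ are positive-definite. The constraints (3.21) hold if

$$\|\beta a_1 + (1 - \beta)a_2\|^2 = \alpha^2\theta^2\Bigl(1 + \mu - \frac{\mu}{\lambda}\Bigr)^2, \qquad \|\beta a_1 + (1 - \beta)a_2\|^2 = \gamma^2\frac{\lambda^2}{\mu^2}\Bigl(1 + \mu - \frac{\mu}{\lambda}\Bigr)^2. \tag{3.23}$$

Let the constants $\alpha$ and $\theta$ be such that $\alpha\theta > 1$. Assume that $\lambda = \alpha\theta > 1$ and $\mu = \gamma > 0$; substituting them in (3.23), we have

$$\|\beta a_1 + (1 - \beta)a_2\|^2 = \alpha^2\theta^2\Bigl(1 + \gamma - \frac{\gamma}{\alpha\theta}\Bigr)^2 = [\alpha\theta + \gamma(\alpha\theta - 1)]^2. \tag{3.24}$$

Under this condition the second equality in (3.23) is also satisfied. Then (3.21) is satisfied if (3.24) has at least one root $\beta^* \in (0, 1)$. Now we have the following.

Proposition 3.4. If in the game (3.18) $\alpha\theta > 1$ and the equation

$$\|\beta a_1 + (1 - \beta)a_2\|^2 = [\alpha\theta + \gamma(\alpha\theta - 1)]^2 \tag{3.25}$$

has at least one positive real root $\beta^* < 1$, the Geoffrion saddle point $(x^g, y^g)$ of the game (3.18) has the form

$$x^g = \alpha\theta\,\frac{\beta^* a_1 + (1 - \beta^*)a_2}{\|\beta^* a_1 + (1 - \beta^*)a_2\|}, \qquad y^g = -\gamma\,\frac{\beta^* a_1 + (1 - \beta^*)a_2}{\|\beta^* a_1 + (1 - \beta^*)a_2\|}.$$

Consequently, the vectors $x^g$ and $y^g$ are parallel to $\beta^* a_1 + (1 - \beta^*)a_2$ and oppositely directed (Fig. 8.3.3), $x^g$ having the same direction as $\beta^* a_1 + (1 - \beta^*)a_2$. The length of the vector $x^g$ is equal to $\alpha\theta$ and $\|y^g\| = \gamma$.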
Proposition 3.4 and the above geometric interpretation lead to an algorithm for constructing the Geoffrion saddle point $(x^g, y^g)$ of the game (3.18) and, consequently, a guaranteeing strategy $x^g$. To simplify the notation we consider only the case of n = 2.

Algorithm. Let the vector components be $a_i = (a_i^{(1)}, a_i^{(2)})$, i = 1, 2.

1. Find the positive root $\beta^* < 1$ of equation (3.24):

$$\beta^2\bigl[(a_1^{(1)} - a_2^{(1)})^2 + (a_1^{(2)} - a_2^{(2)})^2\bigr] + 2\beta\bigl[a_2^{(1)}(a_1^{(1)} - a_2^{(1)}) + a_2^{(2)}(a_1^{(2)} - a_2^{(2)})\bigr] + [a_2^{(1)}]^2 + [a_2^{(2)}]^2 - [\alpha\theta + \gamma(\alpha\theta - 1)]^2 = 0.$$

Cases can be identified when such a root exists depending on the constants $\alpha$, $\theta$ and $\gamma$ and the components of the vectors $a_i$, i = 1, 2.

2. Construct the Geoffrion saddle point $(x^g, y^g)$ by means of the formulas

$$x^g = \alpha\theta\,\frac{\beta^* a_1 + (1 - \beta^*)a_2}{\|\beta^* a_1 + (1 - \beta^*)a_2\|}, \qquad y^g = -\gamma\,\frac{\beta^* a_1 + (1 - \beta^*)a_2}{\|\beta^* a_1 + (1 - \beta^*)a_2\|}. \tag{3.26}$$

Figure 8.3.3.

Note that

$$\|\beta^* a_1 + (1 - \beta^*)a_2\| = \alpha\theta + \gamma(\alpha\theta - 1) \ne 0$$

by virtue of (3.25). Then the guaranteeing strategy $U^g$ in the original differential pursuit game will, by virtue of (3.26) and Remarks 2.1, be $U^g \div u^g(t, x) = (1/\theta)x^g$.
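The two steps of the algorithm can be sketched in a few lines for n = 2. This is an illustration only: the function name `geoffrion_saddle_2d` and the sample constants are assumptions, and the final formulas are the ones read off from the geometric description above (both vectors parallel to β*a₁ + (1 − β*)a₂, with lengths αθ and γ).

```python
import math

def geoffrion_saddle_2d(a1, a2, alpha, theta, gamma):
    """Return (beta_star, x_g, y_g) for the planar case, or None if the
    quadratic equation has no root strictly between 0 and 1."""
    if alpha * theta <= 1:
        raise ValueError("the construction assumes alpha * theta > 1")
    target = alpha * theta + gamma * (alpha * theta - 1)  # required norm
    d = (a1[0] - a2[0], a1[1] - a2[1])
    # Coefficients of qa*beta^2 + qb*beta + qc = 0 (step 1 of the algorithm).
    qa = d[0] ** 2 + d[1] ** 2
    qb = 2 * (a2[0] * d[0] + a2[1] * d[1])
    qc = a2[0] ** 2 + a2[1] ** 2 - target ** 2
    if qa == 0:
        return None  # a1 == a2: the equation does not depend on beta
    disc = qb * qb - 4 * qa * qc
    if disc < 0:
        return None
    roots = [(-qb + s * math.sqrt(disc)) / (2 * qa) for s in (1, -1)]
    admissible = [r for r in roots if 0 < r < 1]
    if not admissible:
        return None
    beta = admissible[0]
    # Step 2: scale the convex combination to the prescribed lengths.
    c = (beta * a1[0] + (1 - beta) * a2[0], beta * a1[1] + (1 - beta) * a2[1])
    norm = math.hypot(c[0], c[1])
    x_g = (alpha * theta * c[0] / norm, alpha * theta * c[1] / norm)
    y_g = (-gamma * c[0] / norm, -gamma * c[1] / norm)
    return beta, x_g, y_g

# Hypothetical sample data: alpha*theta = 2, gamma = 1, so the target norm is 3.
RESULT = geoffrion_saddle_2d((5.0, 0.0), (1.0, 0.0), 2.0, 1.0, 1.0)
```

For this sample the quadratic is 16β² + 8β − 8 = 0 with roots 0.5 and −1, so β* = 0.5 is the only admissible root. Returning `None` when no root lies in (0, 1) is a design choice of the sketch; the text only notes that existence depends on the constants and on the vectors aᵢ.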
4. A Three-Criterial Pursuit Problem
We now examine a cooperative pursuit game with three target points. The distance from the vehicle to two of the target points is measured with an error for which only the ranges are known.
4.1. Problem Statement
The vehicle $\Sigma$ is studied whose current location in the state space $\mathbb{R}^3$ is described by the equation

$$\dot{x} = u. \tag{4.1}$$

Here $x, u \in \mathbb{R}^3$, the initial position $(t_0, x_0) = (0, 0_3)$ and the time $\theta > 0$ at which the movement ends are fixed. The set of strategies is

$$\mathcal{U} = \{U \div u(t, x) \mid \|u(t, x)\| \le R = \text{const}\}$$

and the quasimotions $x[\cdot, t_0, x_0, U]$ of the system (4.1) generated by the strategy $U \in \mathcal{U}$ from the initial position $(t_0, x_0)$ are determined as in section 2.3, Chapter 1. Note that the reachability domain of the system $\Sigma$ is a sphere in $\mathbb{R}^3$ with center at the origin $0_3$ and radius $R\theta$, or

$$X = \bigcup_{U \in \mathcal{U}} \bigcup_{x[\cdot]} x[\theta, t_0, x_0, U] = \{x \in \mathbb{R}^3 \mid \|x\| \le R\theta\}. \tag{4.2}$$
The three points $A_1$, $A_2$ and $A_3$ and the associated vectors $a_1$, $a_2$ and $a_3$ are assumed known. Let us also assume that the distance from $\Sigma$ (at time t = 0) to the point $A_1$ is known accurately and, at the same time, the distance to $A_2$ and $A_3$ is estimated with an error y which can be any vector from the sphere (Fig. 8.4.1)

$$Y = \{y \in \mathbb{R}^3 \mid \|y\| \le r\}, \quad r = \text{const} > 0. \tag{4.3}$$

In solving such a pursuit problem we must be aware of the fact that the error $y \in Y$ can be most unfavorable for $\Sigma$. This problem resulted from studying the vibrations experienced by a subject seated in a cockpit mounted on a rotating centrifuge. A strategy $U^* \in \mathcal{U}$ has to be formulated such that, with the above measurement errors, the vehicle $\Sigma$ must (at time $\theta$) be as close as possible to all the three points $A_1$, $A_2$ and $A_3$. Bearing in mind Remarks 2.1 and 2.2 the problem reduces to a choice of the point $x^* \in X$ close to $A_i$ (i = 1, 2, 3) as above.
Figure 8.4.1.
For this reason, in the static analog of the game the following vector-valued criterion can be used:

$$F(x, y) = (F_1(x, y), F_2(x, y), F_3(x, y)) = (\|x - a_1\|^2,\ \|x + y - a_2\|^2,\ \|x + y - a_3\|^2), \tag{4.4}$$

with $x \in X$, $y \in Y$. In meaningful terms it is required to choose a strategy $U^*$ that would assure (with $U = U^*$) the lowest possible values of all three criteria simultaneously:

$$\begin{aligned}
F(x[\theta, t_0, x_0, U^*], y) &= (F_1(x[\theta, t_0, x_0, U^*], y),\ F_2(x[\theta, t_0, x_0, U^*], y),\ F_3(x[\theta, t_0, x_0, U^*], y)) \\
&= (\|x[\theta, t_0, x_0, U^*] - a_1\|^2,\ \|x[\theta, t_0, x_0, U^*] + y - a_2\|^2,\ \|x[\theta, t_0, x_0, U^*] + y - a_3\|^2).
\end{aligned}$$

Let us associate the dynamic pursuit problem with its static analog, a zero-sum game with a three-component payoff function

$$\langle X, Y, F(x, y)\rangle, \tag{4.5}$$

Figure 8.4.2.
where the compactum X is defined in (4.2), Y in (4.3) and the function F(x, y) in (4.4). By the solution of the game (4.5) the Pareto saddle point $(x^P, y^P)$ will be understood, that is,

$$F(x, y^P) \nleq F(x^P, y^P) \nleq F(x^P, y), \quad \forall x \in X,\ y \in Y.$$

With the strategy $x^P$ as a solution (and, in light of Remarks 2.1 and 2.2, the strategy $U^* \div u(t, x) = x^P$) we guarantee that the values of the criteria $F_i(x[\theta], y) = F_i(x[\theta, t_0, x_0, U^*], y)$, i = 1, 2, 3 (for any $y \in Y$) cannot all simultaneously be at least equal to $(F_1(x^P, y^P), F_2(x^P, y^P), F_3(x^P, y^P))$ with at least one of them strictly greater (Fig. 8.4.2). In other words, all the possible values of $F(x^P, y)$ from $F(x^P, Y) = \bigcup_{y \in Y} F(x^P, y)$ remain outside the angular region G (Fig. 8.4.2), with only the point $F(x^P, y^P)$ in common with G,

$$F(x^P, Y) \cap G = F(x^P, y^P).$$
4.2. An Auxiliary Proposition
Lemma 4.1. For every $x \in X$ such that

$$0_3 \ne \beta(x - a_2) + (1 - \beta)(x - a_3), \quad \forall \beta \in [0, 1],$$

the set

$$Y^P(x) = \bigcup_{\beta \in [0, 1]} \left\{ r\,\frac{\beta(x - a_2) + (1 - \beta)(x - a_3)}{\|\beta(x - a_2) + (1 - \beta)(x - a_3)\|} \right\},$$

where $Y^P(x)$ is a set of Pareto-maximal solutions of the three-criterial problem

$$\langle Y, F(x, y)\rangle$$

obtained from (4.5) by fixing $x \in X$.

Proof. The strategy y is explicitly contained only in the criteria $F_2(x, y)$ and $F_3(x, y)$. Since

$$F_2(x, y) = \|x + y - a_2\|^2 = \rho^2(-y, x - a_2), \qquad F_3(x, y) = \|x + y - a_3\|^2 = \rho^2(-y, x - a_3),$$

the values of the criteria can be interpreted as the squared distances from the vector $-y$ to the vectors $x - a_2$ and $x - a_3$, respectively.

Let us fix an arbitrary strategy $\hat{x} \in X$ and construct the associated set $Y^P(\hat{x})$ of Pareto-maximal solutions of the problem $\langle Y, F(\hat{x}, y)\rangle$. By Lemma 3.1,

$$0_3 \ne \beta(\hat{x} - a_2) + (1 - \beta)(\hat{x} - a_3), \quad \forall \beta \in [0, 1],\ \hat{x} \in X,$$

and, consequently, there exists a constant k > 0 so large (Fig. 8.4.3) as to make

$$[k(\hat{x} - a_2), k(\hat{x} - a_3)] \cap Y = \varnothing$$

true. For $\hat{y} = y^P(\hat{x})$ the equality $\|\hat{y}\| = r$ holds and the values of $F_2(\hat{x}, y^P(\hat{x}))$ and $F_3(\hat{x}, y^P(\hat{x}))$ become

$$\begin{aligned}
F_2(\hat{x}, y^P(\hat{x})) &= \|\hat{x} + y^P(\hat{x}) - a_2\|^2 = \|\hat{x} - a_2\|^2 + 2[y^P(\hat{x})]'(\hat{x} - a_2) + r^2, \\
F_3(\hat{x}, y^P(\hat{x})) &= \|\hat{x} + y^P(\hat{x}) - a_3\|^2 = \|\hat{x} - a_3\|^2 + 2[y^P(\hat{x})]'(\hat{x} - a_3) + r^2.
\end{aligned} \tag{4.6}$$

Now consider the criteria

$$f_2(\hat{x}, y) = \|y - k(\hat{x} - a_2)\|^2, \qquad f_3(\hat{x}, y) = \|y - k(\hat{x} - a_3)\|^2. \tag{4.7}$$

Let us find a set $Y_P(\hat{x})$ of strategies that make the collection $f(\hat{x}, y) = (f_2(\hat{x}, y), f_3(\hat{x}, y))$ Pareto-minimal. The solution, by Theorem 1.1 in Chapter 8, will be the strategies $y_P(\hat{x})$, which are the orthogonal projection of the segment $[k(\hat{x} - a_2), k(\hat{x} - a_3)]$ onto the set Y. The segment does not have any points in common with the set Y. Consequently, the projections $\pi_Y[k(\hat{x} - a_2), k(\hat{x} - a_3)]$ form the arc BC (Fig. 8.4.3) and the strategies $y_P(\hat{x})$ that make $f(\hat{x}, y)$ from (4.7) Pareto-minimal have the form

$$y_P(\hat{x}) = r\,\frac{\beta(\hat{x} - a_2) + (1 - \beta)(\hat{x} - a_3)}{\|\beta(\hat{x} - a_2) + (1 - \beta)(\hat{x} - a_3)\|}, \quad \beta \in [0, 1]. \tag{4.8}$$

Figure 8.4.3.

With this $y_P(\hat{x})$ the set of criteria (4.7) assumes the values

$$\begin{aligned}
f_2(\hat{x}, y_P(\hat{x})) &= k^2\|\hat{x} - a_2\|^2 - 2k[y_P(\hat{x})]'(\hat{x} - a_2) + r^2, \\
f_3(\hat{x}, y_P(\hat{x})) &= k^2\|\hat{x} - a_3\|^2 - 2k[y_P(\hat{x})]'(\hat{x} - a_3) + r^2.
\end{aligned} \tag{4.9}$$

Comparing (4.9) to (4.6) and bearing in mind that $y_P(\hat{x})$ makes $f(\hat{x}, y)$ Pareto-minimal, $y^P(\hat{x}) = y_P(\hat{x})$ from (4.8) makes $F(\hat{x}, y)$ Pareto-maximal on Y. Consequently,

$$y^P(\hat{x}) = r\,\frac{\beta(\hat{x} - a_2) + (1 - \beta)(\hat{x} - a_3)}{\|\beta(\hat{x} - a_2) + (1 - \beta)(\hat{x} - a_3)\|}, \quad \beta \in [0, 1]. \tag{4.10}$$

This proves the lemma.

4.3. Pareto Saddle Points of the Game (4.5)
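A numerical sanity check of the strategy of the form (4.10) may help here. The sketch below is an assumption-laden illustration, not part of the original argument: the points, the radius and the function names are hypothetical, and the strategy is taken as r times the normalized vector x − βa₂ − (1 − β)a₃, the form that appears in (4.14). Only the criteria F₂ and F₃ are compared, since F₁ does not depend on y.

```python
import math
import random

def y_pareto(x, a2, a3, r, beta):
    """Candidate Pareto-maximal error: r * v / ||v|| with
    v = beta*(x - a2) + (1 - beta)*(x - a3)."""
    v = [xi - beta * p - (1.0 - beta) * q for xi, p, q in zip(x, a2, a3)]
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("degenerate direction: the lemma's hypothesis fails")
    return [r * c / norm for c in v]

def criteria(x, y, a2, a3):
    """The two criteria that depend on y: (F2, F3)."""
    f2 = sum((xi + yi - p) ** 2 for xi, yi, p in zip(x, y, a2))
    f3 = sum((xi + yi - q) ** 2 for xi, yi, q in zip(x, y, a3))
    return (f2, f3)

def dominates(u, v):
    """True if u >= v componentwise with at least one strict inequality."""
    return (all(ui >= vi for ui, vi in zip(u, v))
            and any(ui > vi for ui, vi in zip(u, v)))

def sample_ball(r, rng):
    """Rejection sampling of a point of the ball ||y|| <= r in R^3."""
    while True:
        y = [rng.uniform(-r, r) for _ in range(3)]
        if sum(c * c for c in y) <= r * r:
            return y

def is_undominated(x, a2, a3, r, beta, trials=2000, seed=0):
    """Check by random sampling that no y in the ball dominates y_pareto."""
    rng = random.Random(seed)
    base = criteria(x, y_pareto(x, a2, a3, r, beta), a2, a3)
    return not any(dominates(criteria(x, sample_ball(r, rng), a2, a3), base)
                   for _ in range(trials))
```

Random sampling can only fail to find a dominating point; it does not prove Pareto-maximality. It is nevertheless a cheap way to catch a sign error in the direction of the strategy.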
The strategy $y^P(\hat{x})$ of the form (4.10) will be denoted as $y^P(\hat{x}, \beta)$, so that

$$Y^P(\hat{x}) = \bigcup_{\beta \in [0, 1]} y^P(\hat{x}, \beta)$$

and the set $Y^P(\hat{x})$ coincides with the arc BC (Fig. 8.4.3).

Proposition 4.1. Any strategy $x^P \in X$ which, together with some strategy $y^P(x^P, \beta^*)$, forms the Pareto saddle point of the game (4.5) belongs to the projection of $\Delta(a_1, a_2, a_3)$ on the set X. Here $\Delta(a_1, a_2, a_3)$ is a triangle with vertices $A_1$, $A_2$, $A_3$ (Fig. 8.4.1).

Proof. Consider an arbitrary point $(x^P, y^P(x^P, \beta^*)) \in X \times Y$. It will be the Pareto saddle point of the game (4.5) iff $x^P \in X^P(y^P)$, where $y^P = y^P(x^P, \beta^*)$. This fact follows from the definition of the Pareto saddle point and the inclusion $y^P \in Y^P(x^P)$ established in Lemma 4.1. Consider a three-criterial problem

$$\langle X, F(x, y^P)\rangle \tag{4.11}$$
obtained from (4.5) by setting $y = y^P$. In (4.11) the vector-valued criterion is

$$F(x, y^P) = (F_1(x, y^P), F_2(x, y^P), F_3(x, y^P)) = (\|x - a_1\|^2,\ \|x - (a_2 - y^P)\|^2,\ \|x - (a_3 - y^P)\|^2). \tag{4.12}$$

The set $X^P(y^P)$ is the union of all strategies that make the collection of criteria in (4.12) Pareto-minimal. Such strategies are defined by the orthogonal projection of $\Delta(a_1, a_2 - y^P, a_3 - y^P)$ on the set X (Theorem 1.1, Chapter 8):

$$X^P(y^P) = \pi_X[\Delta(a_1, a_2 - y^P, a_3 - y^P)].$$

Here every strategy $x^P(y^P)$ has the form

$$x^P(y^P) = \pi_X z^*,$$

where

$$z^* = a_1 + \lambda_1^*(a_2 - a_1 - y^P) + \lambda_2^*(a_3 - a_1 - y^P), \quad \lambda_1^* + \lambda_2^* \le 1, \quad \lambda_i^* \ge 0 \ (i = 1, 2).$$

If the inclusion $x^P \in X^P(y^P)$ is true, the point $z^* \in \Delta(a_1, a_2 - y^P, a_3 - y^P)$ also exists such that the vectors $z^*$ and $x^P$ point in the same direction, or

$$a_1 + \lambda_1^*(a_2 - a_1 - y^P) + \lambda_2^*(a_3 - a_1 - y^P) = k x^P, \tag{4.13}$$

where

$$y^P = \frac{x^P - a_2\beta^* - a_3(1 - \beta^*)}{\|x^P - a_2\beta^* - a_3(1 - \beta^*)\|}\, r. \tag{4.14}$$

Substitute (4.14) in (4.13) and transform (4.13) into
(4.15) Let us introduce the notation
The equality (4.15) signifies that the vectors $x^P$ and $\bar{z}$ point in the same direction. By (4.10), the end of the vector $\bar{z}$ belongs to $\Delta(a_1, a_2, a_3)$. The vector $\bar{z}$ from (4.16) points in the same direction as the vector $a_2\beta^* + a_3(1 - \beta^*)$ (Fig. 8.4.4). Then the direction of the vector $x^P$ must coincide with that of the vector $\bar{z}$ (Fig. 8.4.4). Consequently, the vector $x^P$ satisfying the necessary condition (4.8) belongs to the closed interval [0, B] intersecting $\Delta(a_1, a_2, a_3)$. This proves the proposition.

Figure 8.4.4.

We introduce the following notation for the points of $\Delta(a_1, a_2, a_3)$:

$$z(\lambda_1, \lambda_2) = a_1 + (a_2 - a_1)\lambda_1 + (a_3 - a_1)\lambda_2.$$
Proposition 4.2. Every strategy $x^P = x^P(\lambda_1, \lambda_2) = \pi_X z(\lambda_1, \lambda_2)$, together with the strategy $y^P(x^P, \lambda_1/[\lambda_1 + \lambda_2])$, forms the Pareto saddle point of the game (4.5). Here $y^P(x, \beta) = y^P(x)$ is determined in (4.10).
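As in the proof of Proposition 4.1, the situation $(x^P(\lambda_1, \lambda_2), y^P)$ will not be the Pareto saddle point of the game (4.5) unless $x^P(\lambda_1, \lambda_2) \in X^P(y^P)$. Let us prove this inclusion. The set $X^P(y^P)$ is the orthogonal projection of $\Delta(a_1, a_2 - y^P, a_3 - y^P)$ onto X. By (4.18), we have

$$a_2 - y^P = a_2 - \frac{x^P - a_2\frac{\lambda_1}{\lambda_1 + \lambda_2} - a_3\frac{\lambda_2}{\lambda_1 + \lambda_2}}{\left\|x^P - a_2\frac{\lambda_1}{\lambda_1 + \lambda_2} - a_3\frac{\lambda_2}{\lambda_1 + \lambda_2}\right\|}\, r, \qquad a_3 - y^P = a_3 - \frac{x^P - a_2\frac{\lambda_1}{\lambda_1 + \lambda_2} - a_3\frac{\lambda_2}{\lambda_1 + \lambda_2}}{\left\|x^P - a_2\frac{\lambda_1}{\lambda_1 + \lambda_2} - a_3\frac{\lambda_2}{\lambda_1 + \lambda_2}\right\|}\, r. \tag{4.19}$$

Substituting (4.17) in (4.19),

$$a_j - y^P = a_j - r\,\frac{a_1 + (a_2 - a_1)\lambda_1 + (a_3 - a_1)\lambda_2 - a_2\frac{\lambda_1}{\lambda_1 + \lambda_2} - a_3\frac{\lambda_2}{\lambda_1 + \lambda_2}}{\left\|a_1 + (a_2 - a_1)\lambda_1 + (a_3 - a_1)\lambda_2 - a_2\frac{\lambda_1}{\lambda_1 + \lambda_2} - a_3\frac{\lambda_2}{\lambda_1 + \lambda_2}\right\|}, \quad j = 2, 3.$$

We next transform these equalities into

$$a_j - y^P = a_j + \frac{a_2\frac{\lambda_1}{\lambda_1 + \lambda_2} + a_3\frac{\lambda_2}{\lambda_1 + \lambda_2} - a_1}{\left\|a_2\frac{\lambda_1}{\lambda_1 + \lambda_2} + a_3\frac{\lambda_2}{\lambda_1 + \lambda_2} - a_1\right\|}\, r, \quad j = 2, 3,$$

using the fact that the numerator in the previous formula equals $(1 - \lambda_1 - \lambda_2)\bigl[a_1 - a_2\frac{\lambda_1}{\lambda_1 + \lambda_2} - a_3\frac{\lambda_2}{\lambda_1 + \lambda_2}\bigr]$ with $\lambda_1 + \lambda_2 < 1$.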
Therefore the triangle A(a,, a, - yp, a3 - yp) is located as shown in Fig. 8.4.5. relative to A(ul,u2,u3). Here x , ~ [ A , , 9 ] ;in other words, it belongs to A(a1, ~2 - yp,a3 - yP), thus x PE X p ( y p ) .
Figure 8.4.5.
Case 2. llz(1,,
1,)ll > RO. Then
Let /~z(1,,1,)~~ = R e
+ 6, where 6 is const > 0. In this case, A,
A,
(4.20)
a1 -
+ (a, - 4
1 1
Re
+ (a3 - a&,
+6
xp - a,-
4
- a,11
11
11 + 1,
We transform (4.20) into
Now introduce the notation
- a3-
I,
+ 1,
+ 1,
- a321
+ 1,
1
‘r.
Figure 8.4.6.

With this notation $\Delta(a_1, a_2 - y^P, a_3 - y^P)$ can be represented as $\Delta(a_1, a_2 + b + d, a_3 + b + d)$ (Fig. 8.4.6). Here a point $\hat{z} \in \Delta(a_1, a_2 - y^P, a_3 - y^P)$ exists for which $\pi_X \hat{z} = x^P$ (Fig. 8.4.6) and, consequently, $x^P \in X^P(y^P)$.
Appendix 1

Concepts from Topology

A1.1. The Topological Space
Let H be a set. By a topological structure (or, in short, topology) in H is understood a collection $\tau$ of subsets of the set H satisfying the following conditions (called the "axioms of the topological structure"):

(1) The intersection of any finite family of elements from $\tau$ is an element of $\tau$.
(2) The union of any (finite or infinite) family of elements from $\tau$ is an element of $\tau$.

In particular, it follows from (1) that $H \in \tau$ and from (2) that $\varnothing \in \tau$. If $\tau$ is the topology in H, the pair $(H, \tau)$ is referred to as a topological space with topology $\tau$ and the elements of the collection $\tau$ as open sets in $(H, \tau)$. If the topology $\tau$ in H is implied, only the term "topological space" is used and the notation $(H, \tau)$ is replaced by H.

If $\tau$ is the topology in H, its basis is a collection $\nu \subset \tau$ such that every element from $\tau$ is a union of some subset of elements from $\nu$. For example, the set $\mathbb{R}$ of real numbers has, as a rule, a topology whose basis consists of various intervals (a, b). This topological space is generally denoted as $\mathbb{R}^1$. Topological spaces with an enumerable basis make an important class. Thus in the topological space $\mathbb{R}^1$ the enumerable basis consists of all possible intervals with rational ends.

Let $(H, \tau)$ be a topological space; the neighborhood of a point (set) is any subset in H containing an open set in which this point (set) lies. A subset in H is open iff it is the neighborhood of each of its points. The sets complementary in H to open ones are called closed in H. In particular, $\varnothing$ and H are closed as complements in H to the open H and $\varnothing$, respectively. By definition the closure $\bar{L}$ of the set $L \subset H$ is the smallest closed set containing L. The set is referred to as everywhere dense in H if its closure $\bar{L} = H$. The topological space H is referred to as separable if it possesses an enumerable everywhere dense subset L. The topological space H is referred to as Hausdorff if any two of its different points have nonintersecting neighborhoods.

The covering of the set $L \subset H$ is a collection $\mathcal{O}$ of subsets of H whose union contains L. The covering $\mathcal{O}'$ of the set L is referred to as a subcovering of the covering $\mathcal{O}$ of this set if $\mathcal{O}' \subset \mathcal{O}$. The topological space H is called bicompact if every covering of it by open sets contains a finite subcovering. A bicompactum is by definition a Hausdorff bicompact topological space. The topological space H is a bicompactum iff the intersection of any centered family of its closed subsets is nonempty; a family is centered if any finite subfamily of it has a nonempty intersection.

Let $(H, \tau)$ be a topological space and $H^*$ some subset of H. In the case of a collection $\tau^* = \{G \cap H^* \mid G \in \tau\}$, $\tau^*$ is a topology in $H^*$ and the space $(H^*, \tau^*)$ is called the topological subspace of $(H, \tau)$. The topology $\tau$ is said to induce the topology $\tau^*$ on the set $H^*$. The set $H^*$ is called a bicompactum if the topological space $(H^*, \tau^*)$ is a bicompactum.
A1.2. Metric Spaces

The distance (or metric) on the set H is a nonnegative function $d(h^{(1)}, h^{(2)})$ defined on the Cartesian product $H \times H$ and satisfying the following conditions (metric axioms):

(1) $d(h^{(1)}, h^{(2)}) = 0$ iff $h^{(1)} = h^{(2)}$;
(2) $d(h^{(1)}, h^{(2)}) \le d(h^{(3)}, h^{(1)}) + d(h^{(3)}, h^{(2)})$, $\forall h^{(1)}, h^{(2)}, h^{(3)} \in H$.

The metric space is the pair (H, d). Usually if the metric d on H is known, (H, d) is replaced by H. The set

$$W_r(h_0) = \{h \in H \mid d(h, h_0) < r\},$$

where r > 0 is fixed, is referred to as the sphere of radius r with center at the point $h_0$. The set G is open (in H) if, together with every one of its points, it also contains some sphere with center at this point. Every sphere is an open set. The collection of all the open sets (in H) specifies some topology $\tau_0$ in H which is referred to as a metric topology. Consequently, every metric space is a topological space. The basis of the topology $\tau_0$ in the space H can be the various spheres $W_r(h)$ where $h \in H$, r = const > 0. The set

$$W_r[h_0] = \{h \in H \mid d(h, h_0) \le r\}$$

is a closed sphere of radius r with center at the point $h_0$. Because $W_r[h_0] = \overline{W_r(h_0)}$, the set $W_r[h_0]$ is closed in H.

If the metric space H is separable (in other words, if it contains an enumerable set L that is everywhere dense) it also has an enumerable basis. The basis of the topology $\tau_0$ can be chosen as a set of spheres $W_{1/m}(h)$ where $h \in L$ and m runs through all the natural numbers.

The subset L of the metric space H is compact if a subsequence converging to an element $h \in L$ can be identified in any of its sequences $h^{(k)} \in L$. The closed subset of a compactum is a compactum and any compactum is separable. In a metric space the notions of bicompactness and compactness are equivalent.

The sequence $(h^{(k)})$ of the space H is fundamental if for every $\varepsilon = \text{const} > 0$ an ordinal number $K(\varepsilon)$ exists such that

$$d(h^{(l)}, h^{(K(\varepsilon))}) < \varepsilon, \quad \forall l > K(\varepsilon).$$

The space H is called complete if every fundamental sequence of its points $(h^{(k)})$ is a converging sequence, or $h^{(0)} \in H$ exists such that $\lim_{k \to \infty} d(h^{(k)}, h^{(0)}) = 0$. Every compactum in a metric space is a complete space and in Euclidean space $\mathbb{R}^n$ the set X is compact iff it is closed and bounded; in the space $\mathbb{R}^n$ the distance is $\|h - z\|$.
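A1.3. Convergence (in the Topological Sense)

The point h belongs to the lower limit of the sequence of the sets $H^{(1)}, H^{(2)}, \dots$,

$$h \in \mathop{\mathrm{Li}}_{k \to \infty} H^{(k)},$$

if any neighborhood of the point h intersects all the sets $H^{(k)}$ starting with some k. The following propositions are equivalent:

(1) $h \in \mathop{\mathrm{Li}}_{k \to \infty} H^{(k)}$,
(2) $\lim_{k \to \infty} \rho(h, H^{(k)}) = \lim_{k \to \infty} \min_{z \in H^{(k)}} \|h - z\| = 0$,
(3) A sequence of the points $(h^{(k)})$ exists such that $h = \lim_{k \to \infty} h^{(k)}$ and $h^{(k)} \in H^{(k)}$.

The point h belongs to the upper limit of the sequence of the sets $H^{(1)}, H^{(2)}, \dots$,

$$h \in \mathop{\mathrm{Ls}}_{k \to \infty} H^{(k)},$$

if any neighborhood of the point h intersects an infinite number of sets $H^{(k)}$. The following propositions are equivalent:

(1) $h \in \mathop{\mathrm{Ls}}_{k \to \infty} H^{(k)}$,
(2) $\liminf_{k \to \infty} \rho(h, H^{(k)}) = \liminf_{k \to \infty} \min_{z \in H^{(k)}} \|h - z\| = 0$.

The above definitions of the upper and lower limits lead to

$$\mathop{\mathrm{Li}}_{k \to \infty} H^{(k)} = \bigcap \mathop{\mathrm{Ls}}_{l \to \infty} H^{(k_l)} \subset \bigcup \mathop{\mathrm{Li}}_{l \to \infty} H^{(k_l)} = \mathop{\mathrm{Ls}}_{k \to \infty} H^{(k)},$$

where the operations $\bigcap$ and $\bigcup$ are applied to any possible increasing sequence of the subscripts $(k_l)$. If $h \in \mathop{\mathrm{Ls}}_{k \to \infty} H^{(k)}$, there exists a sequence such that $h^{(k_l)} \in H^{(k_l)}$ and $h = \lim_{l \to \infty} h^{(k_l)}$; consequently $h \in \mathop{\mathrm{Li}}_{l \to \infty} H^{(k_l)}$.

A sequence of the sets $(H^{(k)})$ is said to converge to the set $H^{(0)}$, $H^{(0)} = \mathop{\mathrm{Lim}}_{k \to \infty} H^{(k)}$, if

$$\mathop{\mathrm{Li}}_{k \to \infty} H^{(k)} = H^{(0)} = \mathop{\mathrm{Ls}}_{k \to \infty} H^{(k)}.$$

Converging sequences have, in particular, the following two properties:

(1) $H^{(k)} \subset V^{(k)} \Rightarrow \mathop{\mathrm{Lim}}_{k \to \infty} H^{(k)} \subset \mathop{\mathrm{Lim}}_{k \to \infty} V^{(k)}$,
(2) $h^{(k)} \in H^{(k)}$ and $h = \lim_{k \to \infty} h^{(k)} \Rightarrow h \in \mathop{\mathrm{Lim}}_{k \to \infty} H^{(k)}$.

A Generalized Bolzano-Weierstrass Theorem. Any sequence $(H^{(k)})$ of subsets of a separable space contains a converging subsequence $(H^{(k_l)})$, or

$$\mathop{\mathrm{Ls}}_{l \to \infty} H^{(k_l)} = \mathop{\mathrm{Li}}_{l \to \infty} H^{(k_l)},$$

whose limit can be an empty set.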
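A1.4. The Space $\{2^X\}_{\mathrm{comp}}$

Let X be a compactum in $\mathbb{R}^n$; in other words, the set X is closed and bounded. The space of all the compacta of the set X is denoted as $\{2^X\}_{\mathrm{comp}}$. The distance $\mathrm{dist}(X^{(1)}, X^{(2)})$ between any two elements $X^{(1)}$ and $X^{(2)}$ from $\{2^X\}_{\mathrm{comp}}$ is supposed to be the number

if one of the sets $X^{(1)}$ or $X^{(2)}$ is empty, and the other is singleton.

(A1.1)

The function $\mathrm{dist}(\cdot)$ is a metric over the set $\{2^X\}_{\mathrm{comp}}$ and is referred to as the Hausdorff metric. Its properties include:

(1) $\mathrm{dist}(\{x^{(1)}\}, \{x^{(2)}\}) = \|x^{(1)} - x^{(2)}\|$,
(2) $\mathrm{dist}(X^{(1)}, X^{(2)}) = \inf\{\varepsilon \mid X^{(1)} \subset G^{\varepsilon}(X^{(2)}) \text{ and } X^{(2)} \subset G^{\varepsilon}(X^{(1)})\}$, where $X^{(1)}$ and $X^{(2)}$ are not empty and, for example, $G^{\varepsilon}(X^{(2)})$ is the $\varepsilon$-neighborhood of the set $X^{(2)}$, or

$$G^{\varepsilon}(X^{(2)}) = \{\hat{x} \in \{2^X\}_{\mathrm{comp}} \mid \mathrm{dist}(\hat{x}, X^{(2)}) < \varepsilon\};$$

(3) $\mathrm{dist}(X^{(1)}, X^{(2)}) < \varepsilon \Leftrightarrow X^{(1)} \subset G^{\varepsilon}(X^{(2)})$ and $X^{(2)} \subset G^{\varepsilon}(X^{(1)})$.

(A1.2)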
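For finite nonempty point sets the Hausdorff distance can be computed directly, which gives a quick way to experiment with properties (1) to (3). The sketch below is an illustration only: the function name `hausdorff` is an assumption, finite sets stand in for general compacta, and the empty-set cases that the text treats separately are simply rejected.

```python
import math

def hausdorff(X1, X2):
    """Hausdorff distance between two finite nonempty sets of points in R^n.

    It is the larger of the two directed distances: how far the worst point
    of one set is from the other set.
    """
    if not X1 or not X2:
        raise ValueError("this sketch handles nonempty sets only")

    def directed(P, Q):
        # sup over p in P of the distance from p to the set Q
        return max(min(math.dist(p, q) for q in Q) for p in P)

    return max(directed(X1, X2), directed(X2, X1))
```

For two singletons the value reduces to the Euclidean distance between the points, in agreement with property (1); for example, `hausdorff([(0, 0)], [(3, 4)])` is 5.0.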
The set X is compact and, consequently, separable. The separability of X leads to [46, p. 351] the bicompactness of the space $\{2^X\}_{\mathrm{comp}}$ and, by the metrizability of $\{2^X\}_{\mathrm{comp}}$, to its compactness. Let the sequence $(X^{(k)})$ of elements of the space $\{2^X\}_{\mathrm{comp}}$ be fundamental in that for every $\varepsilon = \text{const} > 0$ there exists an ordinal number $K(\varepsilon)$ such that for all $k > K(\varepsilon)$,

$$\mathrm{dist}(X^{(k)}, X^{(K(\varepsilon))}) < \varepsilon.$$

Because the compactum X is a complete metric space, so is $\{2^X\}_{\mathrm{comp}}$ [46, p. 417]. This signifies that for every fundamental sequence $(X^{(k)}) \subset \{2^X\}_{\mathrm{comp}}$ there exists a compactum $X^{(0)} \in \{2^X\}_{\mathrm{comp}}$ such that

$$\lim_{k \to \infty} \mathrm{dist}(X^{(k)}, X^{(0)}) = 0.$$

In other words, for every $\varepsilon = \text{const} > 0$ a sequential number $K(\varepsilon)$ can be specified such that for every $k > K(\varepsilon)$ the following inclusions are true:

$$X^{(k)} \subset G^{\varepsilon}(X^{(0)}) \quad \text{and} \quad X^{(0)} \subset G^{\varepsilon}(X^{(k)}),$$

where $G^{\varepsilon}(Y)$ is an $\varepsilon$-neighborhood of Y in the space $\{2^X\}_{\mathrm{comp}}$.

Now, from the separability of X it follows [46, p. 351] that $\{2^X\}_{\mathrm{comp}}$ is also separable. Then, by the generalized Bolzano-Weierstrass theorem, a converging subsequence $(X^{(k_l)})$ can be identified in any sequence $(X^{(k)})$ in $\{2^X\}_{\mathrm{comp}}$ such that $\mathop{\mathrm{Lim}}_{l \to \infty} X^{(k_l)} = X^{(0)}$ and, by the compactness of $\{2^X\}_{\mathrm{comp}}$, it is true that $X^{(0)} \in \{2^X\}_{\mathrm{comp}}$. Note also that any converging sequence is fundamental. Consequently, if X is compact in $\mathbb{R}^n$, in any infinite sequence $(X^{(k)}) \subset \{2^X\}_{\mathrm{comp}}$ a converging subsequence $(X^{(k_l)})$ can be identified such that

$$\lim_{l \to \infty} \mathrm{dist}(X^{(k_l)}, X^{(0)}) = 0 \tag{A1.3}$$

with $X^{(0)} \in \{2^X\}_{\mathrm{comp}}$. Note that from $\lim_{k \to \infty} \mathrm{dist}(X^{(k)}, X^{(0)}) = 0$ it follows that $\mathop{\mathrm{Lim}}_{k \to \infty} X^{(k)} = X^{(0)}$ [46, p. 350].
Appendix 2

Upper Semicontinuous Multivalent Mappings

The multivalent mapping Y of the set H in X associates with every point $\hat{h} \in H$ a nonempty set $Y(\hat{h}) \subset X$, referred to as the image of $\hat{h}$. For the set $\hat{H} \subset H$ the set

$$Y(\hat{H}) = \bigcup_{\hat{h} \in \hat{H}} Y(\hat{h})$$

is referred to as the image of the set $\hat{H}$ under the mapping Y. The small protoimage of the set $M \subset X$ is the set

$$Y_L^{-1}(M) = \{\hat{h} \in H \mid Y(\hat{h}) \subset M\}.$$

The full protoimage of the set $M \subset X$ is

$$Y^{-1}(M) = \{\hat{h} \in H \mid Y(\hat{h}) \cap M \ne \varnothing\}.$$

For these protoimages the following relations are valid:

$$\hat{H} \subset Y_L^{-1}(Y(\hat{H})), \quad \hat{H} \subset Y^{-1}(Y(\hat{H})), \quad C\,Y_L^{-1}(M) = Y^{-1}(CM), \quad C\,Y^{-1}(M) = Y_L^{-1}(CM),$$

where, for example, $CM = X \setminus M$ is the complement of the set M in X, and $C\,Y^{-1}(M) = H \setminus Y^{-1}(M)$.

A multivalent mapping Y is referred to as upper semicontinuous at the point $\hat{h} \in H$ if for every open set $M \subset X$ such that $Y(\hat{h}) \subset M$ a neighborhood $G(\hat{h})$ of the point $\hat{h}$ exists such that

$$Y(G(\hat{h})) \subset M.$$

A multivalent mapping Y is upper semicontinuous on $\hat{H} \subset H$ if it is upper semicontinuous at every point $\hat{h} \in \hat{H}$. The following propositions are equivalent:

(1) a multivalent mapping Y is upper semicontinuous on H,
(2) for every open set $M \subset X$ the set $Y_L^{-1}(M)$ is open in H,
(3) for any closed set $M \subset X$ the set $Y^{-1}(M)$ is closed in H.

Let $\{2^X\}_{\mathrm{comp}}$ be a set of compact subsets in X with a Hausdorff metric $\mathrm{dist}(\cdot)$, where X is a specified compactum in $\mathbb{R}^n$ (here X is equivalent to the reachability domain $X[\theta, t_0, x_0, U \div H, V \div Q]$ of the system (1.7) from Chapter 5 with fixed initial position $(t_0, x_0)$). The space $\{2^X\}_{\mathrm{comp}}$ is thoroughly analyzed in Appendix 1. A multivalent mapping $Y: \{2^X\}_{\mathrm{comp}} \to X$ is given. The multivalent mapping

$$Y^{\varepsilon}(X^0) = G^{\varepsilon}(Y(X^0)) = \{x \in X \mid \rho(x, Y(X^0)) < \varepsilon\}$$

is referred to as the $\varepsilon$-bulge of the mapping Y; here

$$\rho(x, Y(X^0)) = \min_{z \in Y(X^0)} \|x - z\|.$$

Since the space $\{2^X\}_{\mathrm{comp}}$ is metric, the mapping $Y: \{2^X\}_{\mathrm{comp}} \to X$ will be upper semicontinuous at the point $X^0 \in \{2^X\}_{\mathrm{comp}}$ iff for every $\varepsilon > 0$ there exists a neighborhood $G(X^0)$ of the element $X^0$ such that $\hat{X} \in G(X^0)$ implies

$$Y(\hat{X}) \subset Y^{\varepsilon}(X^0).$$

Theorem [9, p. 23]. For $Y: \{2^X\}_{\mathrm{comp}} \to X$ an upper semicontinuous multivalent mapping, if H is a compact subset of $\{2^X\}_{\mathrm{comp}}$, its image Y(H) is a compactum in X.
Appendix 3
Auxiliary Propositions from the Theory of Multicriterial Problems
Consider a sequence of static multicriterial problems

$$\Gamma^{(k)} = (X^{(k)}, F(x)),$$

where the components $F_i(x)$, $i \in N$, of the vector function F(x) are continuous, and $(X^{(k)})$ is a sequence of compacta from $\mathbb{R}^n$ converging to the compactum $X^* \subset \mathbb{R}^n$ in the Hausdorff metric, or

$$\lim_{k \to \infty} \mathrm{dist}(X^{(k)}, X^*) = 0. \tag{A3.1}$$

$X_S^{(k)}$ will denote the set of Slater-minimal alternatives (solutions) $x_S^{(k)}$ of the problem $\Gamma^{(k)}$ and $X_S^*$ the set of Slater-minimal solutions $x_S^*$ of the limiting multicriterial problem $(X^*, F(x))$.
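The difference between Slater-minimal and Pareto-minimal alternatives is easiest to see on a finite set. The following sketch is an illustration under assumptions: finite lists of points stand in for the compacta of the text, the function names are hypothetical, and the criteria are passed as an arbitrary function F returning a tuple.

```python
def slater_minimal(points, F):
    """Points x for which no z satisfies F(z) < F(x) in every component."""
    vals = [F(p) for p in points]
    return [p for p, fp in zip(points, vals)
            if not any(all(g < f for g, f in zip(fz, fp)) for fz in vals)]

def pareto_minimal(points, F):
    """Points x for which no z satisfies F(z) <= F(x) with F(z) != F(x)."""
    vals = [F(p) for p in points]
    return [p for p, fp in zip(points, vals)
            if not any(all(g <= f for g, f in zip(fz, fp))
                       and any(g < f for g, f in zip(fz, fp))
                       for fz in vals)]

# Hypothetical data: corners of the unit square, criteria F(x) = (x1, x2).
CORNERS = [(0, 0), (0, 1), (1, 0), (1, 1)]
SLATER = slater_minimal(CORNERS, lambda p: p)
PARETO = pareto_minimal(CORNERS, lambda p: p)
```

On this sample the only Pareto-minimal corner is (0, 0), while (0, 1) and (1, 0) are Slater-minimal as well, because no point is strictly smaller than them in both coordinates. Every Pareto-minimal point is Slater-minimal, but not conversely.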
Lemma. If the sequence $x_S^{(k)}$, k = 1, 2, ..., is such that

$$x_S^{(k)} \to x^*, \quad x_S^{(k)} \in X_S^{(k)}, \quad k = 1, 2, \dots, \tag{A3.2}$$

then $x^* \in X_S^*$.

Proof. The lemma will be proved by contradiction. Let the vector $x^*$ from (A3.2) be not Slater-minimal in the problem $\Gamma^* = (X^*, F(x))$. Consequently, $\hat{x} \in X^*$ exists such that

$$F(\hat{x}) < F(x^*). \tag{A3.3}$$

It follows from (A3.1) and the last proposition of Appendix 1 that

$$X^* = \mathop{\mathrm{Lim}}_{k \to \infty} X^{(k)} = \mathop{\mathrm{Li}}_{k \to \infty} X^{(k)}.$$

Since $\hat{x} \in X^*$, by the definition of the lower limit (Li) of the sequence of sets $X^{(k)}$, any neighborhood $G(\hat{x})$ of the point $\hat{x}$ intersects all sets $X^{(k)}$ starting with some k. Choose the points $x^{(k)} \in G(\hat{x}) \cap X^{(k)}$, k = 1, 2, ..., in such a way that

$$x^{(k)} \to \hat{x}. \tag{A3.4}$$

It is important that $x^{(k)}$ belong to the compactum $X^{(k)}$. Then for "sufficiently large" natural numbers k it follows from the continuity of F(x), the convergences (A3.2) and (A3.4), and the strict inequalities (A3.3) that

$$F(x^{(k)}) < F(x_S^{(k)}),$$

which is inconsistent with the Slater-minimality of the solution $x_S^{(k)}$ in the multicriterial problem $\Gamma^{(k)}$.

Remark. The assertion of the lemma does not hold for Pareto-minimal solutions $x_P^{(k)}$ of the problem $\Gamma^{(k)}$.
Example. Let
Figure A.3.1.
Auxiliary Propositions from the Theory of Multicriterial Problems
38 1
$\operatorname{co}\{\cdot\}$ signifies the closed convex hull of the set $\{\cdot\}$. As $\varepsilon_k \to +0$, the sequence of compacta $\{X^{(k)}\}$ converges to the compactum $X^{(0)}$ in the Hausdorff metric ($X^{(0)}$ is the square $ABCD$). If $\varepsilon_k > 0$, the point $x_p^{(k)} = (-1 - \varepsilon_k, 0)$ is Pareto-minimal in the problem $\Gamma^{(k)}$, but $x_p^* = (-1, 0)$ is not Pareto-minimal in the problem $(X^{(0)}, \{x_1, x_2\})$. Consequently, the limit $x_p^*$ of the sequence of Pareto-minimal solutions $x_p^{(k)}$ of the problems $\Gamma^{(k)}$ with (A3.2) is not generally a Pareto-minimal solution of the limiting problem $\Gamma^* = (X^*, F(x))$.
However, the validity of the following proposition can be established [SS]. For any Pareto-minimal solution $x_p^*$ of the problem $\Gamma^*$ there exists a sequence of Pareto-minimal solutions $x_p^{(k)}$ of the problems $\Gamma^{(k)}$ such that $x_p^{(k)} \to x_p^*$.
For Slater-minimal solutions this proposition does not hold. For instance, if in the sequence of the two-criterial problems $\Gamma^{(k)}$ of this example, $k = 1, 2, \ldots$, it is assumed that $\varepsilon_k < 0$ and $\varepsilon_k \to -0$ as $k \to \infty$, the point $x_s^* = (0, -1)$ is a Slater-minimal solution of the limiting problem $\Gamma^* = (X^{(0)}, \{x_1, x_2\})$. However, no sequence of Slater-minimal solutions $x_s^{(k)}$ of the problems $\Gamma^{(k)}$, $k = 1, 2, \ldots$, converging to $x_s^*$ exists. In all the problems $\Gamma^{(k)}$ with $\varepsilon_k < 0$ there exists a unique Slater-minimal solution $(-1, -1)$.
Appendix 4
Vector-Valued Maximins in Static Problems
A4.1. Slater Maximin
Consider the multicriterial static problem with uncertain factors
$$\Gamma = (X, Y, F(x, y)),$$
where $X \in \operatorname{comp} R^n$ is the set of uncertainties $x$, $Y \in \operatorname{comp} R^m$ is the set of solutions $y$, and the components of the vector $F(x, y)$ are continuous over $X \times Y$. The problem $\Gamma$ can be interpreted also as a zero-sum static game with vector-valued payoff function $F(x, y)$, where the first (minimizing) player employs a strategy $x \in X$ such that all the components of the vector-valued payoff function $F(x, y)$ assume the lowest possible values, and the objective of the second (maximizing) player is the opposite: he seeks a strategy $y \in Y$ such that the components of $F(x, y)$ will be the largest possible.
Denote by $X_s(y)$ the set of all the Slater-minimal uncertainties $x_s(y)$ in the multicriterial problem $\Gamma_y$ obtained from $\Gamma$ by fixing $y \in Y$. Consequently,
$$X_s(y) = \{ x_s(y) \in X \mid F(x_s(y), y) \not> F(x, y),\ \forall x \in X \}.$$
For every $y \in Y$ the set $X_s(y)$ is a compact subset of the set $X$.
The strategy $y^{(s)} \in Y$ is referred to as a Slater-maximin strategy in the game $\Gamma$ if there exists a strategy $\hat{x}_s(y^{(s)}) \in X_s(y^{(s)})$ such that
$$F(\hat{x}_s(y^{(s)}), y^{(s)}) \not< F(x_s(y), y), \quad \forall y \in Y,\ x_s(y) \in X_s(y).$$
The vector $F(\hat{x}_s(y^{(s)}), y^{(s)})$ will be referred to as the Slater maximin.
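For finite strategy sets the definitions above can be evaluated by direct enumeration. The following Python sketch is an illustration only: the grids `X`, `Y` and the two-criteria payoff `F` are hypothetical and not taken from the text, and finite grids stand in for the compacta $X$ and $Y$. It builds $X_s(y)$ for every $y$ and then keeps those pairs whose payoff is not strictly exceeded by any other candidate $F(x_s(y), y)$.

```python
def strictly_less(a, b):
    """True if a < b in every component (the strict vector order)."""
    return all(ai < bi for ai, bi in zip(a, b))


def slater_minimal(X, y, F):
    """X_s(y): points x for which no x' gives F(x', y) < F(x, y)."""
    return [x for x in X
            if not any(strictly_less(F(xp, y), F(x, y)) for xp in X)]


def slater_maximin(X, Y, F):
    """Triples (y, x_hat, value) satisfying the Slater-maximin condition."""
    # All candidate values F(x_s(y), y) with y in Y and x_s(y) in X_s(y).
    candidates = [(y, x, F(x, y)) for y in Y for x in slater_minimal(X, y, F)]
    # Keep a candidate if no other candidate is strictly larger in every criterion.
    return [(y, x, v) for (y, x, v) in candidates
            if not any(strictly_less(v, w) for (_, _, w) in candidates)]


# Hypothetical data (assumption for illustration, not from the text).
X = [0, 1]
Y = [0, 1, 2]


def F(x, y):
    return (x + y, (1 - x) + (y - 1) ** 2)


result = slater_maximin(X, Y, F)
print(result)  # [(0, 0, (0, 2)), (2, 0, (2, 2)), (2, 1, (3, 1))]
```

In this example $y = 1$ is not a Slater-maximin strategy, while $y = 0$ is one even though its value $(0, 2)$ is weakly dominated by $(2, 2)$; the strict order $<$ does not exclude such points.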
Analogously, let us introduce the set
$$Y^s(x) = \{ y^s(x) \in Y \mid F(x, y^s(x)) \not< F(x, y),\ \forall y \in Y \}$$
of the Slater-maximal solutions of the multicriterial problem $\Gamma_x = (Y, F(x, y))$, which is obtained from the game $\Gamma$ by fixing $x \in X$.
The strategy $x^{(s)} \in X$ will be Slater-minimax for the game $\Gamma$ if $\hat{y}^s(x^{(s)}) \in Y^s(x^{(s)})$ exists such that
$$F(x^{(s)}, \hat{y}^s(x^{(s)})) \not> F(x, y^s(x)), \quad \forall x \in X,\ y^s(x) \in Y^s(x).$$
The vector $F(x^{(s)}, \hat{y}^s(x^{(s)}))$ will be the Slater minimax.
The following propositions hold [107, 109].
(1) The set of Slater-maximin strategies of the game $\Gamma$ forms a nonempty compactum in $Y$, and the set of Slater maximins is also a compact subset of $R^N$; a similar property is characteristic of the set of Slater-minimax strategies and Slater minimaxes.
(2) With $N = 1$ ($F(x, y) = F_1(x, y)$), the Slater-maximin strategy $y^{(s)}$ is equivalent to the maximin strategy $y^0$ of the game $(X, Y, F_1(x, y))$, that is,
$$\max_{y \in Y} \min_{x \in X} F_1(x, y) = \min_{x \in X} F_1(x, y^0).$$
(3) The set of Slater maximins is internally stable, or
$$F(\hat{x}_s(y^{(1)}), y^{(1)}) \not< F(\hat{x}_s(y^{(2)}), y^{(2)})$$
for any two Slater maximins $F(\hat{x}_s(y^{(j)}), y^{(j)})$, $j = 1, 2$, and externally stable for the points of the set
$$F(X_s(y), y) = \bigcup_{x \in X_s(y)} F(x, y),$$
or, for every $y \in Y$ and every $F(x_s(y), y) \in F(X_s(y), y)$ that is not itself a Slater maximin, there exists a Slater maximin $F(\hat{x}_s(y^{(s)}), y^{(s)})$ such that
$$F(x_s(y), y) < F(\hat{x}_s(y^{(s)}), y^{(s)}).$$
Similar kinds of stability are valid also for Slater minimaxes.
(4) If $(x^s, y^s)$ is the Slater saddle point of the game $\Gamma$,
$$F(x, y^s) \not< F(x^s, y^s) \not< F(x^s, y), \quad \forall x \in X,\ y \in Y$$
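Proposition (2) can be checked numerically on finite grids. The sketch below is a minimal illustration under stated assumptions: the grids and the scalar payoff `F1` are hypothetical, and the scalar payoff is wrapped in a one-component tuple so that the vector definition applies unchanged. The strategies selected by the Slater-maximin definition are compared with the ordinary maximin strategies.

```python
def less(a, b):
    """Strict vector order: a < b in every component."""
    return all(p < q for p, q in zip(a, b))


def slater_maximin_strategies(X, Y, F):
    """Strategies y that admit a Slater-minimal x with a Slater-maximal payoff."""
    cands = [(y, F(x, y)) for y in Y for x in X
             if not any(less(F(xp, y), F(x, y)) for xp in X)]
    return sorted({y for y, v in cands
                   if not any(less(v, w) for _, w in cands)})


# Hypothetical data (assumption for illustration, not from the text).
X = [0, 1, 2]
Y = [0, 1, 2]


def F1(x, y):
    return (x - y) ** 2 + y


# Ordinary scalar maximin: max over y of min over x.
scalar_value = max(min(F1(x, y) for x in X) for y in Y)
scalar_strategies = [y for y in Y
                     if min(F1(x, y) for x in X) == scalar_value]

# Vector definition applied to the one-criterion payoff (F1,).
vector_strategies = slater_maximin_strategies(X, Y, lambda x, y: (F1(x, y),))

print(scalar_value, scalar_strategies, vector_strategies)  # 2 [2] [2]
```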
A4.2. Other Notions of Vector-Valued Maximins
Let us define the Pareto-maximin and Pareto-minimax strategies of the game $\Gamma$. We introduce the sets
$$X_p(y) = \{ x_p(y) \in X \mid F(x_p(y), y) \not\geq F(x, y),\ \forall x \in X \}$$
of the Pareto-minimal uncertainties of the problem $\Gamma_y$, and
$$Y_p(x) = \{ y_p(x) \in Y \mid F(x, y_p(x)) \not\leq F(x, y),\ \forall y \in Y \}$$
of the Pareto-maximal solutions of the problem $\Gamma_x$. Note that $X_p(y) \neq \emptyset$ and $Y_p(x) \neq \emptyset$.
The strategy $y^{(p)} \in Y$ (respectively, $x^{(p)} \in X$) is referred to as the Pareto-maximin (respectively, Pareto-minimax) in the game $\Gamma$ if
$$\exists\, \hat{x}_p(y^{(p)}) \in X_p(y^{(p)}) \mid F(\hat{x}_p(y^{(p)}), y^{(p)}) \not\leq F(x_p(y), y), \quad \forall y \in Y,\ x_p(y) \in X_p(y).$$
In a similar way,
$$\exists\, \hat{y}_p(x^{(p)}) \in Y_p(x^{(p)}) \mid F(x^{(p)}, \hat{y}_p(x^{(p)})) \not\geq F(x, y_p(x)), \quad \forall x \in X,\ y_p(x) \in Y_p(x).$$
The vector $F(\hat{x}_p(y^{(p)}), y^{(p)})$ is the Pareto maximin in the game $\Gamma$, and $F(x^{(p)}, \hat{y}_p(x^{(p)}))$ is the Pareto minimax.
Now let us consider the Geoffrion maximin. We associate with every strategy $y \in Y$ the set $X_g(y)$ of Geoffrion-minimal strategies $x_g(y)$ of the problem $\Gamma_y$, that is:
(1) The strategy $x_g(y)$ is Pareto-minimal in $\Gamma_y$:
$$F(x_g(y), y) \not\geq F(x, y), \quad \forall x \in X;$$
(2) There exists a constant $M_y > 0$ such that for $i \in \mathbf{N}$ and $(x, y) \in X \times \{y\}$ for which
$$F_i(x, y) < F_i(x_g(y), y)$$
and some $j \in \mathbf{N}$ such that
$$F_j(x_g(y), y) < F_j(x, y)$$
the following inequality is true:
$$F_i(x_g(y), y) - F_i(x, y) \leq M_y \left[ F_j(x, y) - F_j(x_g(y), y) \right].$$
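On a finite set of outcomes the two conditions can be tested directly, and the smallest admissible trade-off bound can be computed. The sketch below is illustrative only: `values` is a hypothetical list of criteria vectors $F(x, y)$ for one fixed $y$, and the function name is an assumption. For each outcome that improves criterion $i$, the best available compensating ratio over the criteria $j$ that get worse is taken, and the worst case over all improvements is returned.

```python
def geoffrion_constant(values, k):
    """Smallest trade-off bound for values[k] under minimization.

    Returns None if values[k] is not Pareto-minimal. A return value of
    0.0 means no outcome improves any criterion, so any M > 0 works.
    """
    f_star = values[k]
    n = len(f_star)
    bound = 0.0
    for f in values:
        for i in range(n):
            if f[i] < f_star[i]:  # f improves criterion i
                # Criteria j that f makes worse, with the gain/loss ratio.
                ratios = [(f_star[i] - f[i]) / (f[j] - f_star[j])
                          for j in range(n) if f_star[j] < f[j]]
                if not ratios:
                    # f improves i and worsens nothing: f dominates f_star.
                    return None
                bound = max(bound, min(ratios))
    return bound


# Hypothetical outcomes F(x, y) for a fixed y (assumption, not from the text).
values = [(0, 4), (1, 2), (3, 1), (3, 3)]
bounds = [geoffrion_constant(values, k) for k in range(len(values))]
print(bounds)  # [2.0, 0.5, 2.0, None]
```

Any $M_y$ that is positive and not smaller than the returned bound satisfies condition (2). On a finite set every Pareto-minimal outcome therefore passes the test; the condition becomes restrictive only for infinite sets, where the ratios may be unbounded.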
The strategy $y^{(g)} \in Y$ will be referred to as Geoffrion-maximin for the game $\Gamma$ if there exists $\hat{x}_g(y^{(g)}) \in X_g(y^{(g)})$ such that the situation $(\hat{x}_g(y^{(g)}), y^{(g)})$ is a Geoffrion-maximal solution of the multicriterial problem
$$\Big( \bigcup_{y \in Y} \{ X_g(y), y \},\ F(x, y) \Big),$$
or
(1) The situation $(\hat{x}_g(y^{(g)}), y^{(g)})$ is a Pareto-maximal solution of this problem:
$$F(\hat{x}_g(y^{(g)}), y^{(g)}) \not\leq F(x_g(y), y), \quad \forall y \in Y,\ x_g(y) \in X_g(y);$$
(2) There exists a constant $M > 0$ such that for $i \in \mathbf{N}$ and $(x_g(y), y)$ for which
$$F_i(x_g(y), y) > F_i(\hat{x}_g(y^{(g)}), y^{(g)})$$
and some $j \in \mathbf{N}$ such that
$$F_j(x_g(y), y) < F_j(\hat{x}_g(y^{(g)}), y^{(g)})$$
the following inequality holds:
$$F_i(x_g(y), y) - F_i(\hat{x}_g(y^{(g)}), y^{(g)}) \leq M \left[ F_j(\hat{x}_g(y^{(g)}), y^{(g)}) - F_j(x_g(y), y) \right].$$
The vector $F(\hat{x}_g(y^{(g)}), y^{(g)})$ is a Geoffrion maximin of the game $\Gamma$. The notions of a Geoffrion-minimax strategy and of a Geoffrion minimax may be introduced in a similar way.
Note that no systematic study of the properties of Pareto and Geoffrion maximins has yet been carried out; such a study would undoubtedly be very enlightening.
Now let some constant $(N \times N)$-dimensional matrix $A$ with positive elements be specified. Associate with every strategy $y \in Y$ the set of $A$-minimal uncertainties $x_a(y)$ in the problem $\Gamma_y$:
$$X_a(y) = \{ x_a(y) \in X \mid A F(x_a(y), y) \not> A F(x, y),\ \forall x \in X \}.$$
The strategy $y^{(a)} \in Y$ will be referred to as $A$-maximin in the game $\Gamma$ if there exists $\hat{x}_a(y^{(a)}) \in X_a(y^{(a)})$ such that
$$A F(\hat{x}_a(y^{(a)}), y^{(a)}) \not< A F(x_a(y), y), \quad \forall y \in Y,\ x_a(y) \in X_a(y).$$
The vector $F(\hat{x}_a(y^{(a)}), y^{(a)})$ is the $A$-maximin in the game $\Gamma$. By definition, the situation $(\hat{x}_a(y^{(a)}), y^{(a)})$ is a Slater-maximal solution of the multicriterial problem
$$\Big( \bigcup_{y \in Y} \{ X_a(y), y \},\ F(x, y) \Big)$$
and the solution $y^{(a)}$ will be a Slater-maximin in the game
$$\Gamma_a = (X, Y, A F(x, y)).$$
As a consequence, the results of section 4.1 of Chapter 1 on the Slater maximin in the game $\Gamma$ hold also for $\Gamma_a$ (only the substitutions $F(x, y) \to A F(x, y)$ and $s \to a$ are needed in this formulation). In particular, the set of $A$-maximins of the game $\Gamma$ is a nonempty closed and bounded set in $R^N$.
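The reduction to the game $\Gamma_a$ suggests a simple way to compute $A$-maximins on finite sets: apply the Slater construction to the transformed payoff $A F(x, y)$ and report the original vectors $F$. The Python sketch below is a hedged illustration; the grids, the payoff `F`, and the matrix `A` are hypothetical choices, not data from the text.

```python
def less(a, b):
    """Strict vector order: a < b in every component."""
    return all(p < q for p, q in zip(a, b))


def mat_vec(A, v):
    """Product of the matrix A (tuple of rows) with the vector v."""
    return tuple(sum(a * c for a, c in zip(row, v)) for row in A)


def a_maximin(X, Y, F, A):
    """Triples (y, x_hat, F value) that are A-maximin on finite sets."""
    def AF(x, y):
        return mat_vec(A, F(x, y))

    # Pairs (y, x) with x an A-minimal uncertainty for the fixed y.
    cands = [(y, x) for y in Y for x in X
             if not any(less(AF(xp, y), AF(x, y)) for xp in X)]
    # Keep the pairs whose transformed payoff is not strictly exceeded.
    return [(y, x, F(x, y)) for (y, x) in cands
            if not any(less(AF(x, y), AF(x2, y2)) for (y2, x2) in cands)]


# Hypothetical data (assumption for illustration, not from the text).
X = [0, 1]
Y = [0, 1, 2]
A = ((2, 1), (1, 2))  # positive elements, as the definition requires


def F(x, y):
    return (x + y, (1 - x) + (y - 1) ** 2)


a_result = a_maximin(X, Y, F, A)
print(a_result)  # [(2, 0, (2, 2)), (2, 1, (3, 1))]
```

With this particular `A`, only $y = 2$ is an $A$-maximin strategy. Under the plain strict order the same payoff also admits $y = 0$ with the weakly dominated value $(0, 2)$, so in this example the positive matrix filters that point out.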
1. Arrow, K. I., Barankin, E. W., and Blackwell, D. Admissible points of convex sets. Contributions to the Theory of Games, 2, Princeton, 1953, pp. 87-92.
2. Aumann, R. I. and Peleg, B. Von Neumann-Morgenstern solutions for cooperative games without side payments. Bull. Amer. Math. Soc., 1960, 66(3), pp. 173-179.
3. Barantsev, A. V. The rule of multipliers for a vector optimization problem. Tr. Rostovsk. Un-ta. Matem. Analiz Primen., 1975, 7, pp. 184-190 (in Russian).
4. Basar, T. and Olsder, J. Dynamic Noncooperative Game Theory. Academic Press, New York and London, 1982.
5. Berezovskiy, B. A., Baryshnikov, Yu. M., and Kempner, L. M. Multicriterial Optimization. Mathematical Aspects. Nauka, Moscow, 1989 (in Russian).
6. Bilchev, S. I. Z-equilibrium in a differential game described by parabolic equations. In: Differential Multi-player Games. (Visshe tekhnichesko uchilishte, Rousse, Bulgaria), 1984, pp. 47-52 (in Russian).
7. Blaquiere, A., Gerard, F., and Leitmann, G. Quantitative and Qualitative Games. Academic Press, New York and London, 1969.
8. Borisenko, M. V. On multiple-valued guaranteed estimates in differential games with a vector-valued performance criterion. Vestnik Moskovsk. Un-ta. Vychislitel'naya matematika i kibernetika, 1983, 2, pp. 71-74 (in Russian).
9. Borisevich, Yu. G., Gel'man, B. D., Myshkis, A. D., and Obukhovskiy, V. V. Multivalent mappings. Itogi nauki i tekhniki. Matem. analiz, 1982, 19, VINITI, Moscow, pp. 127-230 (in Russian).
10. Breakwell, I. V. Some Differential Games with Interesting Discontinuities. Internal report, Stanford University, 1973.
11. Burshtein, F. V. and Korelov, E. S. Multicriterial problems of decision making under uncertainty and risk. Teoreticheskaya kibernetika, 1980 (Metsniyereba, Tbilisi), pp. 143-148 (in Russian).
12. Case, I. H. Applications of the theory of differential games to economic problems. In: H. W. Kuhn and G. P. Szego (eds.), Differential Games and Related Topics, pp. 345-371. North-Holland, Amsterdam, 1971. 13. Case, I. H. Economics and the Competitive Press. New York University Press, New York, 1979. 14. Chernous'ko, F. L. and Melikyan F. F. Game Problems of Control and Search. Nauka, Moscow, 1976 (in Russian). 15. Chetayev, N. G. Stability of Motion. Gostekhnizdat, Moscow, 1955 (in Russian). 16. Chikriy, A. A. Nonlinear diflerential evasion games. Doklady AN SSSR, 1979, 246(6), pp. 1306- 1309 (in Russian). 17. Coddington, E. A. and Levinson, N. Theory of Ordinary Differential Equations. McGraw-Hill, New York, 1955. 18. Danskin, I. M. Fictitious play for continuous games. Naval Res. Logist. Quart., 1954, 1(4), pp. 313-320. 19. Dem'yanov, V. F., Introduction into Minimax. Nauka, Moscow, 1972 (in Russian). 20. Zhukovskiy, V. I. and Dochev, D. T. (eds.). Differential Multi-Player Games. A List of References. Visshe tekhnichesko uchilishte, Rousse (Bulgaria), 1985 (in Russian). 21. Zhukovskiy, V. I. and Dochev, D. T. (eds.). Diferential Multi-player Games. A Collection. Visshe teknichesko uchilishte, Rousse, (Bulgaria), 1984, 26, series 9 (in Russian). 22. Zhukovskiy, V. I. and Dochev, D. T. (eds.). Differential Non-zero-sum Games. A Collection. Visshe tekhnichesko uchilishte, 1981, 25, series 9 (in Russian). 23. Dochev, D. T., and Stoyanov, N. V. Existence of Z-equilibrium in a differential multi-player game. Differential Multi-player Games, 26, series 9, pp. 64-72. Visshe tekhnichesko uchilishte, (Rousse, Bulgaria), 1984 (in Russian). 24. Dragustin, C. Minimax pour des criteres multiples. Recherche operationelle, 1979, 12(2), pp. 169-180. 25. Yemelyanov, S. V., Kostyleva, N. Ye., Matich, B. P., Ozernoi, V. M., and Zimokha, V. A. Multicriterial Estimation of Local Process Control Systems. Institute of Control Sciences, Moscow, 1983 (in Russian). 26. Yeroshov, S. 
A. Investigation of Games with a Vector-valued Payoff Function. Cand. Sc. thesis. MGU, Moscow, 1982 (in Russian). 27. Fyodorov, V. V. Numerical Maximin Methods. Nauka, Moscow, 1979 (in Russian). 28. Friedman, A. Diferential Games. Wiley, New York, 1971. 29. Gaidov, S. D. Z-equilibrium in stochastic differential games. Diyerential Multiplayer Games. (Visshe teknichesko uchilishte, Rousse, Bulgaria), 1984, 26, series 9, pp. 53-63 (in Russian). 30. Germier, Yu. B. Introduction to Operations Research. Nauka, Moscow, 1971 (in Russian).
31. Gorokhovik, V. V. The optimization of system with a vector-valued objective function. In: Multivar. Technol. Syst. Proc. Symp. Manchester, 1974. London, pp. S31-1-S.32-4. 32. Grigorenko, N. L. Pursuit of one evader by several objects of various types. Doklady AN SSSR, 1983, 268(3), pp. 529-533 (in Russian). 33. Grote, J. D., The Theory and Applications of Dtfferential Games. Reidel, Netherlands, 1975. 34. Gusev, M. I., and Kurzhanskiy, A. B. Equilibrium situations in multicriterial game problems with non-conflicting interests. Doklady AN SSSR, 1976, 229(6), pp. 1295-1298 (in Russian). 35. Isaacs, R. Differential Games. 2nd ed. Kruger, Huntington, 1965. 36. Jentsch, G. Some thoughts on the theory of cooperative games. Advances of Game Theory. Ann. Math. Studies, 1964, 52, pp. 407-442. 37. Kalchev, B. D. On certain equilibria in cooperative differential games without side payments. Proc. 3rd Con$ Differential Equations and Applications ( I ) . Rousse, Bulgaria, 1985, pp. 171- 174 (in Russian). 38. Kononenko, A. F. The structure of an optimal strategy in dynamic control systems. Zhurnal vychislitel'noi matematiki i maternaticheskoi jiziki, 1980, 20(5), pp. 1105- 1 1 16 (in Russian). 39. Kononenko, A. F. and Konurbaev, I. M. Existence of equilibrium situations, optimal in the class of positional strategies, Pareto-optimal for certain differential games. In: Game Theory and Its Application. Kemerovo University, 1983, pp. 105- 114 (in Russian). 40. Konurbaev, I. M. Effective Equilibrium Situations in Dynamic Systems with Hierarchical Vector of Interests. Academy of Sciences Computing Center, Moscow, 1985 (in Russian). 41. Kornienko, I. A. On nonscalar minimax problems. In: Operations Research and Analytical Design in Technology. Kazan Aviation Institute, 1979, pp. 3-8 (in Russian). 42. Krasovskiy, N. N. Control of a Dynamic System. Nauka, Moscow, 1985 (in Russian). 43. Krasovskiy, N. N. and Subbotin, A. I. Differential Games. Mir, Moscow, 1974. 44. Kuhn, N. W. 
and Szego, G. P. (eds.). Differential Games and Related Topics. North-Holland, Amsterdam, 1971. 45. Kuhn, H. W. and Tucker, A. W. Nonlinear programming. Proc. Second Berkeley Symp. Math. Stat. Probab. Univ. California Press, Berkeley, 1951, pp. 481-492, 46. Kuratovskiy, K. Topology. Academic, New York and London, 1966. 47. Kurzhanskiy, A. B. Control and Observation under Uncertainty. Nauka, Moscow, 1977 (in Russian). 48. Kurzhanskiy, A. B. and Gusev, M. I. On multicriteria solutions in gametheoretic problems of control. IIASA Proc. Workshop Decision Making with Multiple ConJicting Objectives. Laxenburg, Austria, 1975, 2, pp. 51 -67.
49. Lee, E. B. and Markus, L. Foundations of Optimal Control Theory. Wiley, New York, 1967. 50. Leitmann, G. Cooperative and Non-Cooperative Many-Player Diyerential Games. Springer Verlag, Vienna, 1974. 5 1. Leitmann, G. (ed.). Multicriteria Decision Making and Differential Games. New York, Plenum, 1976. 52. Mishchenko, E. F., Nikol’skiy, M. S., and Satimov, N. Yu. Evasion problems in differential multi-player games. Proc. Moscow Mathematical Institute, 1977,143, pp. 105-128 (in Russian). 53. Molostvov, V. S. and Zhukovskiy, V. I. On A-Optimality in a Class of Cooperative Many-Player Differential Games. Lecture Notes in Control and Information Sciences. Springer Verlag, 1980, t y l ) ,pp. 489-498. 54. Molostvov, V. S., Zhukovskiy, V. I., and Korhonen, P. A Maximum Approach to Solving MCDM Problems under Uncertainty. Proc. 7th European Congress Operations Research. Bologna, Italy, June 16- 19, 1985, p. 43. 55. Morozov, V. V. Mixed strategies in a game with vector-valued payoffs. Vestnik MGU, (Computer Mathematics and Control Science), 1978, (4), pp. 44-49 (in Russian). 56. Morozov, V. V. On mixed strategies in a game with vector-valued payoff function. Proc. 3rd All-Union Conf. Operations Research. Gorki, 1978 (in Russian). 57. Morozov, V. V., On the properties of a set of non-inferior vectors. Vestnik MGU, (Computer Mathematics and Control Sciences), 1977, (4), pp. 47-51 (in Russian). 58. Morozov, V. V., Sukharev, A. G., and Fyodorov, V. V. Operations Research in Problems and Exercises. Vysshaya shkola, Moscow, 1986 (in Russian). 59. Moulin, H. Game Theory for Economics and Politics. A Collection of Methods. Hermann, Paris, 1981 (in French). 60. Zhukovskiy, V. I. (ed.). Non-Zero-Sum Differential Games. A Collection AllUnion Mechanical Enginering Correspondence Institute, Moscow, 1986 (in Russian). 61. Nikolskiy, M. S. On guaranteed estimates in differential games with vectorvalued performance criterion. lzvestiya AN SSSR, Tekhnicheskaya kibernetika, 1980 (2), pp. 
37-43 (in Russian). 62. Nogin, V. D. On optimality conditions in multi-goal optimization. In: Numerical Methods of Nonlinear Programming. Kharkov, 1979, pp. 139- 140 (in Russian). 63. Nogin, V. D. Duality in multi-goal programming. Zhurnal vychislitel‘noy matematiki i matematicheskoy Jiziki, 1977 (l), pp. 254-258 (in Russian). 64. Pareto, V. Manuel d’economie politique. Giard, Paris, 1909. 65. Petrosyan, L. A. Stability of solutions in differential multiplayer games. Vestnik Leningradskogo universiteta, 1977 (19), pp. 46-52 (in Russian). 66. Petrosyan, L. A. and Danilov, N. N. Cooperative Differential Games and Their Applications. Tomsk University, 1985 (in Russian).
67. Petrosyan, L. A. and Tomskiy, G. V. Geometry of Simple Pursuit. Nauka, Novosibirsk, 1983 (in Russian). 68. Podinovsky, V. V. General zero-sum games. Zhurnal oychisliternoy matematiki i matematicheskoyjziki, 1981, 21 (3,pp. 1140- 1153 (in Russian). 69. Podinovskiy, V. V. Effective plans in multicriterial decision problems under uncertainty. In: Models of Decision Making Processes. Far Eastern Center, USSR Academy of Sciences, 1978, pp. 102-113 (in Russian). 70. Podinovskiy, V. V. and Nogin, V. D. Pareto-Optimal Solutions of Multicriterial Problems. Nauka, Moscow, 1982 (in Russian). 71. Pontryagin, L. S. Linear differential games. I. Sowiet. Math. Doklady, 1967,6, pp. 769-771. 72. Pontryagin, L. S., Boltyanskiy, V. G., Gamkrelidze, R. V. and Mishchenko, E . F. The Mathematical Theory of Optimal Processes. Interscience, New York, 1962. 73. Prokop’yev, V. A. On a stable cooperative solution of one differential game. Diyerential Multi-Player Games. (Vysshe tekhnichesko uchilishte, Rousse, Bulgaria), 1984, 26, series 9, pp. 131-143 (in Russian). 74. Prokop’yev, V. A., and Zhukovskiy, V. I. Stability of solutions in one class of differential games. Proc. Fourth Inter-Republican Workshop on Operations Research and Systems Analysis. Kutaisi, USSR, 1985, p. 120 (in Russian). 75. Rashkov, P. I. Sufficient conditions of 2-equilibrium in a differential game in Banach space. Differential Multi-Player Games. (Vysshe tekhnichesko uchilishte, Rousse, Bulgaria), 1984, 26, series 9, pp. 91-99 (in Russian). 76. Rodder, W. A generalized saddlepoint theory. Its application to duality theory for linear vector optimum problems. Europ. J. Operat. Res., 1977 (l),pp. 55-59. 77. Reingaum, J. F. A class of differential games for which the closed loop and openloop equilibria coincide. J. Optimization Theory Appl., 1982,36(2), pp. 253-362. 78. Rozen, V. V. The equilibrium situation in games with ranked outcomes. Current Lines of Research in Game Theory, Moklas, Vilnius, 1976, pp. 
115-118 (in Russian). 79. Rozen, V. V. Properties of outcomes in equilibrium situations. Mathematical Models of Behavior. Saratov University, 1975 (2), pp. 45-49 (in Russian). 80. Salukvadze, M. Vector- Valued Optimization Problems in Control Theory. Academic Press, New York, 1979. 81. Salukvadze, M. An approach to the solution of the vector optimization problem of dynamic systems, J. Optimization Theory Appl. 38, pp. 409-422, 1982. 82. Slater, M. Lagrange multipliers revisited: A contribution to nonlinear programming. Cowles Commission Discussion Paper. Math. (403), November, 1950. 83. Subbotin, A. I. and Chentsov, A. G. optimization of Guarantees in Control Problems. Nauka, Moscow, 1981 (in Russian). 84. Terziayn, S. A. 2-equilibrium in a differential game described by a hyperbolic equation. In: Diyerential Multi-Player Games (Vysshe tekhnichesko uchilishte, Rousse, Bulgaria), 1984, 26, series 9, pp. 106- 111 (in Russian). 85. Tynyanskiy, N. T. and Zhukovskiy, V. I. Differential non-zero-sum games
(cooperative variety). Itogi nauki i tekhniki. Matematicheskiy analiz, 17, VINITI, Moscow, 1977, pp. 199-266 (in Russian). 86. Tynyanskiy, N. T. and Zhukovskiy, V. I. Differential non-zero-sum games (non-coalition variety). It Ogi nauka i tekhniki. Matematicheskiy analiz, 15, VINITI, Moscow, 1977, pp. 199-266 (in Russian). 87. Tynyanskiy, N. T. and Prokop’yev, V. A. The cooperative form of a noncoalition positional differential game. In: Hierarchical Multi-Step, Differential Games and Their Applicutions, Kalinin University, 1984, pp. 21 -41 (in Russian). 88. Vaisbord, E. M. and Zhukovskiy, V. I. Introduction to Multi-Player Differential Games and Their Applications. Gordon and Breach, New York, 1988 (Translated from Russian edition of 1980). 89. Vorobiev, N. N. Game Theory. Springer Verlag, Berlin, 1977. 90. Vorobiev, N. N. The state-of-art in game theory. Uspekhi matem. nauk, 1970, 25(2; 152), pp. 81- 140 (in Russian). 91. Warga, J. Optimal Control of Diyerential and Functional Operations. Academic, New York, 1976 (references are to the Russian translation of 1977). 92. Yu, P. L. Cone convexity, cone extreme points and nondominated solutions in decision problems with multiobjectives. J. of Optimization Theory Appl., 1974, 14(3), pp. 319-377. 93. Yu, P. L. and Leitmann, G. Comprosise solutions, domination structures and Salukvadze’s solutions. J. optimization Theory Appl., 1974, 13,pp. 362-378. 94. Zhautykov, 0. A., Zhukovskiy, V. I., and Zharkynbaev, S. Diyerential Games of Several Players (with Time Lag). Nauka, Alma-Ata, 1988 (in Russian). 95. Zhukovin, V. Ye. Multicriterial Models of Decision Making under Uncertainty. Metsniyereba, Tbilisi, 1983 (in Russian). 96. Zhukovskiy, V. I. Continuity of values of the cost functions in differential games. Differential Equations and Application ( I I ) , Proc 3rd Conf, Rousse, Bulgaria, 1987, pp. 999-1004. 97. Zhukovskiy, V. I. Cooperative many-player differential games. Proc. 9th IFIP Conf Optimization Techn., Warsaw. 
1972, pp. 221 -222. 98. Zhukovskiy, V. I. On a paradox in differential games. Proc. 4th Conf Dierential Equations and Applications, Rousse, Bulgaria, 1989, p. 498. 99. Zhukovskiy, V. I. Some Problems of non-antagonistic differential games. Mathematical Methods in Operations Research, Bulgarian Academy of Sci., Sofia, 1985, pp. 103-195. 100. Zhukovskiy, V. I. and Chernyavskiy, I. V. The &-saddlepoint in a zero-sum game with vector-valued payoff function. In: Multicriterial Systems under Uncertainty and Their Applications. Bashkiria University Press, Ufa, 1988, pp. 34-36 (in Russian). 101. Zhukovskiy, V. I. and Dochev, D. T. Some unsolved problems. Differential Multi-Player Games, (Visshe tekhnichesko uchilishte, Rousse, Bulgaria), 1984, 26, series 9, pp. 9-13 (in Russian). 102. Zhukovskiy, V. I. and Dochev, D. T. Vector-valued Optimization of Dynamic Systems. Visshe tekhnichesko uchilishte, Rousse, Bulgaria, 1981 (in Russian).
103. Zhukovskiy, V. I. and Molostvov, V. S. On Pareto-optimality in cooperative differential games. In: Mathematical Methods in Game Theory, 1980, (6), pp. 196208 (All-Union Research Institute of Systems Studies, Moscow), (in Russian). 104. Zhukovskiy, V. I. and Molostvov, V. S. The equilibrium situation in games with multivalent payoff functions. In: Dynamic Game Systems, 1982 (4),pp. 47-66. AllUnion Research Institute of Systems Studies, Moscow, (in Russian). 105. Zhukovskiy, V. I. and Molostvov, V. S. Multicriterial Decision Making under Uncertainty. International Research Institute for Management Sciences, Moscow, 1988 (in Russian). 106. Zhukovskiy, V. I. and Molostvov, V. S. Vector-valued optimization under uncertainty. Proc. 14th International Con$ Mathematical Optimization Theory and Applications. Eisenach, December 11- 15, 1989. 107. Zhukovskiy, V. I. and Moukhine, V. V. Multicriterial and perturbed problems. Mathematical Methods in Operations Research. Bulgarian Academy of Sci. Sofia, 1990. 108. Zhukovskiy, V. I. and Molostvov, V. S. Multicriterial Systems Optimization under Uncertainty. International Research Institute for Management Sciences, Moscow, 1990, (in Russian). 109. Zhukovskiy, V. I. and Salukvadze, M. E. Multicriterial Control Problems under Uncertainty. Metsniyereba, Tbilisi, 1991 (in Russian). 110. Zhukovskiy, V. I. and Stoyanov, N. V. Unsolved problems. Diferential NonZero-Sum-Games (Visshe tekhnichesko uchilishte, Rousse, Bulgaria), 1981, 25, series 9, pp. 9-15 (in Russian). 111. Zhukovskiy, V. I. Possible lines of research in differential multi-player games. I. Slater-optimality. Godishnik na V U Z , Prilozhna matematika. Bulgaria, 1981, 17(4), pp. 37-50 (in Russian). 112. Zhukovskiy, V. I. and Stoyanov, N. V. Possible lines of research in differential multi-player games. 11. Nash-equilibrium situation. Godishnik na V U Z , Prilozhna matematika. Bulgaria, 1984, 20(4), pp. 9-23 (in Russian). 113. Zhukovskiy, V. I. and Stoyanov, N. V. 
Possible lines of research in differential multi-player games. 111. The Nash equilibrium. Godishnik na V U Z , Prilozhna matematika. Bulgaria, 1985, 21(I), pp. 7-23 (in Russian). 114. Zhukovskiy, V. I. and Stoyanov, N. V. Possible lines of research in differential multi-player games. IY. Active equilibria in counter-strategies. Goldishnik na V U Z , Prilozhna matematika. Bulgaria, 1985, 21(2), pp. 9-22 (in Russian). 115. Zhukovskiy, V. I. and Tynyanskiy, N. T. Optimality in noncoalition differential games. Non-Zero-Sum Diyerential Games and Their Applications. All-Union Correspondence Mechanical Engineering Institute, Moscow, 1986, pp. 3-7 (in Russian). 116. Zhukovskiy, V. I. and Tynyanskiy, N. T. Equilibrium Controls of Multicriterial Dynamic Systems. Moscow University, 1984 (in Russian). 117. Zhukovskiy, V. I. and Vaisman, K. S. Specifics of zero-sum games with a vectorvalued payoff function. Multicriterial Systems under Uncertainty and Their Applications. Bashkiria University, Ufa, 1988, pp. 22-27 (in Russian).
Author Index
A
Arrow, K. I., 389 Aumann, R. I., 389 B Barankin, E. W., 389 Barantsev, A. V., 389 Baryshnikov, Yu. M., 389 Basar, T., 389 Bellman, 114 Berezovskiy, 8. A,, 389 Bilchev, S. I., 389 Blackwell, D., 389 Blaquire, A,, 389 Boltyanskiy, V. G., 393 Bolzano, 374, 376 Borel, 2, 11, 12, 13, 66, 110, 111, 322, 325 Borisenko, M. V., 389 Borisevich, Yu.G., 389 Breakwell, 1. V., 389 Burshtein, F. V., 389 C
Case, 1. H., 390 Chentsov, A. G., 66, 393 Chernous’ko, F. L., 390 Chernyavskiy, I. V., 394 Chetaev, N. G., 390 Chikriy, A. A,, 390 Coddington, E. A,, 390
D
Danilov, N. N., 392 Danskin, 1. M., 390 Dem’yanov, V. F., 390 Dochev, D. T., 390, 394 Dragustin, C., 390
E Essar, xiii Euclide, 109
F Fleming, 103 Friedman, A., 103, 390 Fyodorov, V. V., 390, 392
G Gaidov, S. D., 390 Gamkrelidze, R. V., 393 Gel’rnan, B. D., 389 Gerald, F., 389 Geofrion, vii, xii, xvi, 131-369, 379-387 Germier, Yu. B., 121, 390 Gorokhovik, V. V., 129, 391 Grigorenko, N. L., 391 Grote, J. D., 391 Gusev, M.I., 391
H Hamilton, 103 Hausdorff, 372. 375,378, 381
I
Molostvov, V. S., 392, 395 Morgenstern, 389 Morozov, V. V., 392 Moukhine, V. V., 395 Moulin, H., 392 Myshkis. A. D., 389 N
Isaacs, R., xi, 391
J Jakobi, 103 Jentsch, G.. 391
Nash, 109. 322, 337, 395 Newton, 170 Nikol'skiy, M. S., 392 Nogin, V. D.. 392, 393 0
K Kalchev, B. D., 391 Kempner, L. M., 389 Kononenko, A. F., v, 6, 8, 10, 391 Konurbaev, 1. M., 391 Korelov, E. S., 389 Korhonen, P., 392 Kornienko, I. A,, 391 Kostyleva, N. Ye., 390 Krasovskiy, N. N., xii, 29, 113, 114, 391 Kuhn, H. W., 390, 391 Kuratovskiy, K., 391 Kurzhanskiy, A. B., 391
L Lagrange, 351,352, 354, 356, 393 Lee, E. B., 392 Leitmann, G., xiii, 103, 389, 392, 394 Leons, 103 Levinson, N., 390 Lipschitz, 39, 59, 68, 69, 81, 110, 173, 195, 281, 353 Lyapunov, 1 13, 327
M Markus, L., 392 Matich, B. P., 390 Melikyan, F. F., 390 Mishchenko, E. F., 392, 393
Obukhovskiy, V. V., 389 Olsder, J., 389 Ozernoi, V. M., 390
P Pareto, V., vi, vii, viii, ix, xii, xvi, 81 -369, 379-387 Peleg, B., 389 Petrosyan, L. A,, 392, 393 Podinovskiy, V. V., 91, 393 Pontryagin, L. S., 393 Prokop'yev, V. A., 393, 394
R Rashkov, P. I., 393 Reingaum, J. F., 393 Riccati, I15 Rodder, W., 393 Rozen, V. V., 393
S Salukvadze, M., iii, xiii, 393, 394, 395 Satimov, N. Yu.,392 Silvestre, 45 Slater, M., v, vi, vii, viii, ix, xii, xvi, 49-369, 379-387
Stoyanov, N. V., 390, 395 Subbotin, A. I., v, xii, 8, 9, 66, 391, 393 Sukharev, A. G., 392 Szego, G. P., 390, 391
T Terziayn, S. A,, 393 Tomskiy, G. V., 393 Tucker, A. W., 391 Tynyanskiy, N. T., 393,394, 395
V
Vaisbord, E. M., 394 Vaisman, K. S . , 395 Von Neumann, 389 Vorobiev, N. N., 394
W Warga, J., 394 Wiener, N., xi Weierstrass, 314, 316 Y
Yemelyanov, S. V.. 390 Yeroshov, S. A,, 390 Yu, P. L., 394
Zharkynbaev, S., 394 Zhautykov, 0. A,, 394 Zhukovin, V. E., 394 Zhukovskiy, V. I., iii, xiii, 120, 390, 392, 393, 394, 395 Zimokha, V. A,, 390
Subject Index
A
Algorithm of saddle point, GeoNrion, 225, 358 of strategy, Pareto-optimal, 117, 333 C
Control, Pareto-optimal, 326, 327, 332, 333 Criterion, vector-valued, 50
E Equation differential linear, 42, 44,47, 77, 110, 187, 220 matrix, 44,47, 1 15, 118, 223 Ricatti, 115, 223 separable, 36, 178, 198 Example, 6, 9, 23, 36, 54, 74, 77, 86, 97, 126, 135,161,170,178,187,189,200,210, 2 13,229,234,272,277,298,314,347, 354,380 counter Kononenko's, 6 Subbotin's. 8
F Function Borel-measurable, 2, 1 I, 12, 13
increasing over >, 208 over 2 ,208 strictly, 100, 207 locally Lipschitz, 50 payoN, 29 guaranteed, 35 mirror, 291, 311 scalar, 29 separable, 291 vector-valued, 1, 188, 221 strictly quasiconcave, 91 Functional goal, 124 vector-valued, 50, 78, 110
G Gain, maximin, 97, 99 Game, 5, 191, 192, 204, 206, 260 differential, 5, 36, 189, 207, 209, 218, 226, 227, 228, 3 I I affinely equivalent, 203 antagonistic, 99, 172, 181, 186 linear-quadratic, 47, 187, 220 positional, 5, 96, 169, 195 zero-sum, 25, 56, 65, 102, 152, 202, 205,212,237, 243, 284, 325 with payoff function mirror, 31 1 separable, 291 of power, 5
Game (Cont.) pursuit, 335, 339, 350, 356, 359 cooperative, 335 of quality, 5 rapprochement, 343 static zero-sum, 196, 270, 293, 340, 351 tug-of-war, 97, 337 Guarantee, vector-valued, 237, 251
Motion bunch, 4, 9 quasi, I , 9 bunch, 17, 18, 19,20 stepwise, 8, 10, 11, 12, 13, 14, 67 piecewise-continuous, 8 stepwise, 1, 3, 8
0 I Image full proto, 377 small proto, 377 Inheritance, 61, 62, 89, 191, 260 Instability, internal, 234, 236, 266
M Mapping, multivalent, 377, 378 Maxima, Slater, 239 Maximin, 29, 249 A-, 242, 248, 387 A-E-, 280 Geoffrion, 241, 248, 385, 386 Pareto, 241, 247, 248, 325, 326, 385 E-, 270, 277, 278 scalar, 243 Slater, 239, 245, 246, 248, 252, 257, 260, 283, 285, 286, 383 C-, 250 vector-valued, 36, 237, 242, 243, 251, 265, 266, 300, 315, 317,383, 385 Metric, Hausdorff, 375 Minima A-, 248 Pareto, 248 Slater, 238, 246, 248 Minimax Geoffrion, 386 Pareto, 247, 326, 342, 385 Slater, 239, 285, 384 vector-valued, 36, 237, 265, 300, 3 15, 3 17 Model of competing research activities, 322 of competition, 281, 284, 324 decision making, 330
Optimality A-, 153 Geoffrion, I3 I Pareto, 81, 93, 97, 107, 1 I I , 121 PI-, 120, 125 Slater, 49, 70
P Point goal, 335 utopian, 109 Problem competition, 281 decision-making, 285 linear-quadratic, 42, 354 optimal control, 48, 53, I 1 1, 112 multicriterial, 49, 61, 81, 1 1 1, 119, 131, 147, 156, 379 dynamic, 77, 290, 292, 319, 330, 336 full static, 58 linear-quadratic, I 10 quasiconcave, 90, 91 static, 51, 57, 93, 98, 123, 138, 155, 237, 256, 288, 383
Q Quadratic form negative-definite, 113, 118, 188, 221 positive-definite, 43, 188, 221
R Rejection, 61, 63, 90, 192
S
Saddle point, 5, 25, 26, 29, 33, 47, 262, 354 A-, 166, 176, 181, 227, 228, 272, 294 c-, 270, 272, 273, 274, 275, 276, 277 game, 6 differential, 32, 34 Geoffrion, 169, 175, 181, 218, 220, 221, 222, 225,271, 277, 294, 355, 358 interchangeable, 6, 34, 36 Pareto, 169, 175, 181, 206, 209, 216, 217, 271, 276, 294, 297, 325,326,331,334, 355, 364,366 Slater, 169, 174, 181, 187, 188, 194, 204, 206, 207,209,212,214,235,265,271, 275, 285,294,295,296,308,313,355 strict, 217 vector-valued, 169, 203, 228, 249, 262, 293, 297 invariance, 202 Set, 3 of maximins, 262 A-, 259, 319 Geoffrion, 319, 332 Pareto, 319, 332 Slater, 258, 318 vector-valued, 319 of minimaxes, 262 A-, 259, 319 Geoffrion, 3 19 Pareto, 319, 332 Slater, 258, 318 vector-valued, 319 of motions, 3 stepwise quasi, 13 of saddle-points A-, 177, 186,200,272, 297 Geoffrion, 176, 200, 271, 297 Pareto, 175, 200, 271, 297, 325, 347 Slater, 174, 198, 271, 297 of solutions Pareto-minimal, 336 Slater-maximal, 238 Slater-minimal, 255 of strategies Geoffrion-maximal, 132 Pareto-maximal, 82, 90 Slater-maximal, 51, 90 Solution, 266 of dilTerential game, 266, 268, 269
Z-, 269 ZP-, 269 ZS-, 269, 270, 286, 290, 308, 31 I , 319 Space, 371 bicompactum, 372 complete, 373 Hausdorff, 372 metric, 372 topological, 371 separable, 372 sub, 372 (2'),, 375, 378 Stability, 25, 57, 84, 259 dynamic, 59,85, 122, 129, 145, 191, 192 external, 57, 85, 122, 129, 142, 384 of A-maximins, 262 of Slater maximins, 260 internal, 59, 84, 122, 129, 137, 182, 236, 267, 326, 384 of A-maximins, 261 of Geoffrion maximins, 261 of Pareto maximins, 261 of Slater maximins, 259 of ZS-solution, 290 Strategy, 2 effective, 147 non-properly, I32 properly, 138 counter, 13 extremal, 26, 27 maximal A-, 155-167 Geofirion, 131- 167 Pareto, 81-167 PI-, 120-126 Slater, 49, 51, 53, 55, 56, 59, 62, 65, 72, 78, 83, 84, 89, 91, 164, 165, 319 universal, 66, 68 maximin, 5, 31, 100, 102, 129,240,243 A-, 240, 241 Geoffrion, 240, 241 Pareto, 240, 241 Slater, 239, 256, 257, 265, 383 vector-valued, 242 mean-square, 109 minimal A-, 156 GeolTrion, 133, 134 Pareto, 83, 133, 134 Slater, 53, 66
minimax, 5, 30, 102, 122
A-, 240, 242 Geoffrion, 240, 242 Pareto, 240, 242 Slater, 239, 257, 265 optimal, 7, 10 A-, 158 Geoffrion, 131, 132 Pareto, 81, 83, 111, 115, 116, 119, 124, 125, 129, 134 PI-, 125
Slater, 74, 75, 134 player's, 2 System control, 1 conflict-controlled, 1, 152, 195
T Topology, 371 metric, 373
Mathematics in Science and Engineering Edited by William F. Ames, Georgia Institute of Technology Recent titles
T. A. Burton, Volterra Integral and Differential Equations
C. J. Harris and J. M. E. Valenca, The Stability of Input-Output Dynamical Systems
George Adomian, Stochastic Systems
John O’Reilly, Observers for Linear Systems
Ram P. Kanwal, Generalized Functions: Theory and Technique
Marc Mangel, Decision and Control in Uncertain Resource Systems
K. L. Teo and Z. S. Wu, Computational Methods for Optimizing Distributed Systems
Yoshimasa Matsuno, Bilinear Transformation Method
John L. Casti, Nonlinear System Theory
Yoshikazu Sawaragi, Hirotaka Nakayama, and Tetsuzo Tanino, Theory of Multiobjective Optimization
Edward J. Haug, Kyung K. Choi, and Vadim Komkov, Design Sensitivity Analysis of Structural Systems
T. A. Burton, Stability and Periodic Solutions of Ordinary and Functional Differential Equations
Yaakov Bar-Shalom and Thomas E. Fortmann, Tracking and Data Association
V. B. Kolmanovskii and V. R. Nosov, Stability of Functional Differential Equations
V. Lakshmikantham and D. Trigiante, Theory of Difference Equations: Applications to Numerical Analysis
B. D. Vujanovic and S. E. Jones, Variational Methods in Nonconservative Phenomena
C. Rogers and W. F. Ames, Nonlinear Boundary Value Problems in Science and Engineering
Dragoslav D. Siljak, Decentralized Control of Complex Systems
W. F. Ames and C. Rogers, Nonlinear Equations in the Applied Sciences
Christer Bennewitz, Differential Equations and Mathematical Physics
Josip E. Pecaric, Frank Proschan, and Y. L. Tong, Convex Functions, Partial Orderings, and Statistical Applications
E. N. Chukwu, Stability and Time-Optimal Control of Hereditary Systems
E. Adams and U. Kulisch, Scientific Computing with Automatic Result Verification
Viorel Barbu, Analysis and Control of Nonlinear Infinite Dimensional Systems
Yang Kuang, Delay Differential Equations: With Applications in Population Dynamics
W. F. Ames, E. M. Harrell II, and J. V. Herod, Differential Equations with Applications to Mathematical Physics
V. I. Zhukovskiy and M. E. Salukvadze, The Vector-Valued Maximin
ISBN 0-12-779950-8