MATHEMATICS IN SCIENCE AND ENGINEERING
A SERIES OF MONOGRAPHS AND TEXTBOOKS
Edited by Richard Bellman, University of Southern California

1. TRACY Y. THOMAS. Concepts from Tensor Analysis and Differential Geometry. Second Edition. 1965
2. TRACY Y. THOMAS. Plastic Flow and Fracture in Solids. 1961
3. RUTHERFORD ARIS. The Optimal Design of Chemical Reactors: A Study in Dynamic Programming. 1961
4. JOSEPH LASALLE and SOLOMON LEFSCHETZ. Stability by Liapunov's Direct Method with Applications. 1961
5. GEORGE LEITMANN (ed.). Optimization Techniques: With Applications to Aerospace Systems. 1962
6. RICHARD BELLMAN and KENNETH L. COOKE. Differential-Difference Equations. 1963
7. FRANK A. HAIGHT. Mathematical Theories of Traffic Flow. 1963
8. F. V. ATKINSON. Discrete and Continuous Boundary Problems. 1964
9. A. JEFFREY and T. TANIUTI. Non-Linear Wave Propagation: With Applications to Physics and Magnetohydrodynamics. 1964
10. JULIUS T. TOU. Optimum Design of Digital Control Systems. 1963
11. HARLEY FLANDERS. Differential Forms: With Applications to the Physical Sciences. 1963
12. SANFORD M. ROBERTS. Dynamic Programming in Chemical Engineering and Process Control. 1964
13. SOLOMON LEFSCHETZ. Stability of Nonlinear Control Systems. 1965
14. DIMITRIS N. CHORAFAS. Systems and Simulation. 1965
15. A. A. PERVOZVANSKII. Random Processes in Nonlinear Control Systems. 1965
16. MARSHALL C. PEASE, III. Methods of Matrix Algebra. 1965
17. V. E. BENES. Mathematical Theory of Connecting Networks and Telephone Traffic. 1965
18. WILLIAM F. AMES. Nonlinear Partial Differential Equations in Engineering. 1965
19. J. ACZÉL. Lectures on Functional Equations and Their Applications. 1966
20. R. E. MURPHY. Adaptive Processes in Economic Systems. 1965
21. S. E. DREYFUS. Dynamic Programming and the Calculus of Variations. 1965
22. A. A. FEL'DBAUM. Optimal Control Systems. 1965
23. A. HALANAY. Differential Equations: Stability, Oscillations, Time Lags. 1966
24. M. NAMIK OĞUZTÖRELI. Time-Lag Control Systems. 1966
25. DAVID SWORDER. Optimal Adaptive Control Systems. 1966
26. MILTON ASH. Optimal Shutdown Control of Nuclear Reactors. 1966
27. DIMITRIS N. CHORAFAS. Control System Functions and Programming Approaches. (In Two Volumes.) 1966
28. N. P. ERUGIN. Linear Systems of Ordinary Differential Equations. 1966
29. SOLOMON MARCUS. Algebraic Linguistics; Analytical Models. 1967
30. A. M. LIAPUNOV. Stability of Motion. 1966
31. GEORGE LEITMANN (ed.). Topics in Optimization. 1967
32. MASANAO AOKI. Optimization of Stochastic Systems. 1967

In preparation
A. KAUFMANN. Graphs, Dynamic Programming, and Finite Games
MINORU URABE. Nonlinear Autonomous Oscillations
A. KAUFMANN and R. CRUON. Dynamic Programming: Sequential Scientific Management
Y. SAWARAGI, Y. SUNAHARA, and T. NAKAMIZO. Statistical Decision Theory in Adaptive Control Systems
F. CALOGERO. Variable Phase Approach to Potential Scattering
J. H. AHLBERG, E. N. NILSON, and J. L. WALSH. The Theory of Splines and Their Applications
HAROLD J. KUSHNER. Stochastic Stability and Control
COPYRIGHT © 1967, BY ACADEMIC PRESS INC.
ALL RIGHTS RESERVED. NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM, BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.
ACADEMIC PRESS INC. 111 Fifth Avenue, New York, New York 10003
United Kingdom Edition published by ACADEMIC PRESS INC. (LONDON) LTD. Berkeley Square House, London W.1
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 66-16442
PRINTED IN THE UNITED STATES OF AMERICA
List of Contributors

Numbers in parentheses indicate the pages on which the authors' contributions begin.

A. BLAQUIÈRE (263), Faculty of Sciences, University of Paris, Paris, France.
E. K. BLUM (417), Department of Mathematics, University of Southern California, Los Angeles, California.
STEPHEN P. DILIBERTO (373), Department of Mathematics, University of California, Berkeley, California.
BORIS GARFINKEL (3, 27), Ballistic Research Laboratories, Aberdeen Proving Ground, Maryland.¹
HUBERT HALKIN (197),² Bell Telephone Laboratories, Whippany, New Jersey.
HENRY J. KELLEY (63), Analytical Mechanics Associates, Inc., Westbury, Long Island, New York.
RICHARD E. KOPP (63), Research Department, Grumman Aircraft Engineering Corporation, Bethpage, Long Island, New York.
G. LEITMANN (263), Department of Mechanical Engineering, University of California, Berkeley, California.
A. I. LURIE (103), Leningrad Polytechnic Institute, Leningrad, USSR.
K. A. LURIE (147), Department of Mathematical Physics, A. F. Ioffe Physico-Technical Institute, Academy of Sciences of the USSR, Leningrad, USSR.
H. GARDNER MOYER (63), Research Department, Grumman Aircraft Engineering Corporation, Bethpage, Long Island, New York.
BERNARD PAIEWONSKY (391), Institute for Defense Analyses, Arlington, Virginia.

¹ Present address: Yale University Observatory, New Haven, Connecticut.
² Present address: Department of Mathematics, University of California, La Jolla, California.
Preface

Four years ago a multi-author volume, "Optimization Techniques with Applications to Aerospace Systems" (Academic Press, New York and London, 1962), appeared in this series. In the intervening years many facets of optimization theory and its application to problems in engineering and applied science have been explored. This book includes many results which extend or generalize the findings recorded in the earlier one.

This volume contains ten contributions to the field of optimization of dynamical systems. The reason for assembling these contributions and publishing them under one cover stems from my belief that, while too long for inclusion in technical journals (some chapters are of monograph length), they deserve recording in more or less permanent format.

The book is divided into two parts of five chapters each. The investigations reported in Part 1 are based on variational techniques and constitute essentially extensions of the classical calculus of variations. Part 2 of the volume contains contributions to optimal control theory and its applications; here the arguments are primarily geometric in nature.

Some chapters deal with the solutions to particular problems while others are devoted primarily to theory. However, these latter chapters also contain problems for purposes of illustrating various aspects of the theory. Thus I hope that this volume will be of interest to both theoreticians and practitioners.

G. LEITMANN
Berkeley, California
August, 1966
Contents

List of Contributors . . . vii
Preface . . . ix

Part 1
A Variational Approach

Chapter 1. Inequalities in a Variational Problem
BORIS GARFINKEL
1.0 Introduction . . . 3
1.1 Condition I . . . 5
1.2 Conditions II and III . . . 7
PART A-CASE (a) . . . 7
1.3 Preliminary Considerations . . . 7
1.4 Singularities in Case (a) . . . 10
1.5 The Extremaloid Index . . . 12
1.6 The Imbedding Construction . . . 13
1.7 Condition IV . . . 13
1.8 Proof of Sufficiency . . . 14
1.9 Numerical Example . . . 14
PART B-CASE (b) . . . 17
1.10 Preliminary Considerations . . . 17
1.11 Singularities in Case (b) . . . 19
1.12 The Imbedding Construction . . . 20
1.13 Numerical Example . . . 21
1.14 Discussion of the Results . . . 23
References . . . 25

Chapter 2. Discontinuities in a Variational Problem
BORIS GARFINKEL
2.0 Introduction . . . 27
2.1 Conditions Ic and Id . . . 29
PART A-CASE (a) . . . 30
2.2 Conditions Ia, Ib, II, and III . . . 30
2.3 Preliminary Considerations . . . 31
2.4 The Function h(y') . . . 32
2.5 Zermelo Diagram . . . 35
2.6 The Imbedding Construction . . . 36
2.7 Condition IV' . . . 38
2.8 The Hilbert Integral . . . 39
2.9 Proof of Sufficiency . . . 41
2.10 Numerical Example . . . 42
2.11 Discussion of the Results . . . 44
PART B-CASE (b) . . . 45
2.12 Conditions Ia, Ib, II, and III . . . 45
2.13 Preliminary Considerations . . . 46
2.14 Zermelo Diagram . . . 47
2.15 Corner Manifolds . . . 49
2.16 Conditions II' and II_N' . . . 50
2.17 Free Corners . . . 52
2.18 A Special Case . . . 52
2.19 The Imbedding Construction . . . 54
2.20 Proof of Sufficiency . . . 55
2.21 Numerical Example . . . 56
2.22 Discussion of the Results . . . 61
References . . . 62

Chapter 3. Singular Extremals
HENRY J. KELLEY, RICHARD E. KOPP, AND H. GARDNER MOYER
3.0 Introduction . . . 63
3.1 Second Variation Test for Singular Extremals . . . 64
3.2 A Transformation Approach to the Analysis of Singular Subarcs . . . 79
3.3 Examples . . . 84
References . . . 100

Chapter 4. Thrust Programming in a Central Gravitational Field
A. I. LURIE
4.1 General Equations Governing the Motion of a Boosting Vehicle in a Central Gravitational Field . . . 104
4.2 Integrals of the Basic System of Equations . . . 111
4.3 Boundary Conditions: Various Types of Motion . . . 117
4.4 Orbits on a Spherical Surface . . . 127
4.5 Boosting Devices of Limited Propulsive Power . . . 135
4.6 Singular Control Regimes . . . 141
References . . . 145

Chapter 5. The Mayer-Bolza Problem for Multiple Integrals: Some Optimum Problems for Elliptic Differential Equations Arising in Magnetohydrodynamics
K. A. LURIE
5.0 Introduction . . . 147
5.1 Optimum Problems for Partial Differential Equations: Necessary Conditions of Optimality . . . 149
5.2 Optimum Problems in the Theory of Magnetohydrodynamical Channel Flow . . . 160
5.3 Application to the Theory of MHD Power Generation: Minimization of End Effects in an MHD Channel . . . 165
Appendix . . . 189
References . . . 192
Part 2
A Geometric Approach

Chapter 6. Mathematical Foundations of System Optimization
HUBERT HALKIN
6.0 Introduction . . . 198
6.1 Dynamical Polysystem . . . 201
6.2 Optimization Problem . . . 204
6.3 The Principle of Optimal Evolution . . . 206
6.4 Statement of the Maximum Principle . . . 209
6.5 Proof of the Maximum Principle for an Elementary Dynamical Polysystem . . . 210
6.6 Proof of the Maximum Principle for a Linear Dynamical Polysystem . . . 214
6.7 Proof of the Maximum Principle for a General Dynamical Polysystem . . . 220
6.8 Uniformly Continuous Dependence of Trajectories with Respect to Variations of the Control Functions . . . 226
6.9 Some Uniform Estimates for the Approximation z(t; ~) of the Variational Trajectory y(t; ~) . . . 229
6.10 Convexity of the Range of a Vector Integral over the Class d of Subsets of [0, 1] . . . 230
6.11 Proof of the Fundamental Lemma . . . 242
6.12 An Intuitive Approach to the Maximum Principle . . . 247
Appendix A. Some Results from the Theory of Ordinary Differential Equations . . . 247
Appendix B. The Geometry of Convex Sets . . . 256
References . . . 260

Chapter 7. On the Geometry of Optimal Processes
A. BLAQUIÈRE AND G. LEITMANN
7.0 Introduction . . . 265
7.1 Dynamical System . . . 266
7.2 Augmented State Space and Trajectories . . . 268
7.3 Limiting Surfaces and Optimal Isocost Surfaces . . . 268
7.4 Some Properties of Optimal Isocost Surfaces . . . 269
7.5 Some Global Properties of Limiting Surfaces . . . 270
7.6 Some Local Properties of Limiting Surfaces . . . 273
7.7 Some Properties of Local Cones . . . 276
7.8 Tangent Cone 𝒞_Σ(x) . . . 282
7.9 A Nice Limiting Surface . . . 286
7.10 A Set of Admissible Rules . . . 292
7.11 Velocity Vectors in Augmented State Space . . . 294
7.12 Separability of Local Cones . . . 300
7.13 Regular and Nonregular Points of a Limiting Surface . . . 301
7.14 Some Properties of a Linear Transformation . . . 303
7.15 Properties of Separable Local Cones . . . 305
7.16 Attractive and Repulsive Subsets of a Limiting Surface . . . 311
7.17 Regular Subset of a Limiting Surface . . . 312
7.18 Antiregular Subset of a Limiting Surface . . . 317
7.19 Symmetrical Subset of Local Cone Y(x) . . . 318
7.20 A Maximum Principle . . . 323
7.21 Boundary Points of g* . . . 336
7.22 Boundary and Interior Points of g* . . . 351
7.23 Degenerated Case . . . 360
7.24 Some Illustrative Examples . . . 364
Appendix . . . 370
Bibliography . . . 371

Chapter 8. The Pontryagin Maximum Principle
STEPHEN P. DILIBERTO
8.0 Introduction . . . 373
8.1 The Extended Problem . . . 374
8.2 The Control Lemma . . . 376
8.3 The Controllability Theorem . . . 379
8.4 The Maximum Principle (Part I) . . . 382
8.5 The Maximum Principle (Part II) . . . 384
8.6 The Bang-Bang Principle . . . 387
Reference . . . 389

Chapter 9. Synthesis of Optimal Controls
BERNARD PAIEWONSKY
9.0 Introduction . . . 391
9.1 Neustadt's Synthesis Method . . . 392
9.2 Computational Considerations . . . 401
9.3 Final Remarks . . . 413
References . . . 415

Chapter 10. The Calculus of Variations, Functional Analysis, and Optimal Control Problems
E. K. BLUM
10.0 Introduction . . . 417
10.1 The Problem of Mayer . . . 418
10.2 Optimal Control Problems . . . 420
10.3 Abstract Analysis, Basic Concepts . . . 423
10.4 The Multiplier Rule in Abstract Analysis . . . 427
10.5 Necessary Conditions for Optimal Controls . . . 431
10.6 Variation of Endpoints and Initial Conditions . . . 439
10.7 Examples of Optimal Control Problems . . . 443
10.8 A Convergent Gradient Procedure . . . 453
10.9 Computations Using the Convergent Gradient Procedure . . . 457
References . . . 459

Author Index . . . 463
Subject Index . . . 466
Inequalities in a Variational Problem

BORIS GARFINKEL†
BALLISTIC RESEARCH LABORATORIES, ABERDEEN PROVING GROUND, MARYLAND

1.0 Introduction . . . 3
1.1 Condition I . . . 5
1.2 Conditions II and III . . . 7
PART A-CASE (a) . . . 7
1.3 Preliminary Considerations . . . 7
1.4 Singularities in Case (a) . . . 10
1.5 The Extremaloid Index . . . 12
1.6 The Imbedding Construction . . . 13
1.7 Condition IV . . . 13
1.8 Proof of Sufficiency . . . 14
1.9 Numerical Example . . . 14
PART B-CASE (b) . . . 17
1.10 Preliminary Considerations . . . 17
1.11 Singularities in Case (b) . . . 19
1.12 The Imbedding Construction . . . 20
1.13 Numerical Example . . . 21
1.14 Discussion of the Results . . . 23
References . . . 25

1.0 Introduction
The frequent occurrence of optimization problems with bounded state and control variables has revived interest in the old subject of inequalities in the calculus of variations. In this chapter we are concerned with inequalities in the problem of Lagrange, illustrated by the following examples:

(a) Find the curve $y(x)$ that joins the points $(-3, -11)$ and $(2, 2)$ in the domain $y - x^3 + 3x \ge 0$, and minimizes the integral

$$\int_{-3}^{2} (1 + y'^2)^{1/2}\,dx$$

† Present address: Yale University Observatory, New Haven, Connecticut.

(b) Find the curve $y(x)$ that joins the points $(0, 0)$ and $(3, 5)$, satisfies the inequality $y' - x \ge 0$, and minimizes the integral

Both problems are solved in the text with the aid of the theory developed in this chapter.

The general problem can be formulated as follows. Let $\mathcal{D}$ be a set of points $(x, y, w)$ in which the functions $f(x, y, w)$ and $\phi(x, y, w)$ have continuous derivatives up to order $r + 2$, with $r \ge 1$, and which satisfies the inequality

$$\phi \ge 0 \tag{1.1}$$

An admissible curve $y(x)$ has its elements $(x, y, y')$ in $\mathcal{D}$, and has a piecewise continuous first derivative $y'$. In the class of admissible curves we seek a $y(x)$ that joins the fixed points $(x_1, y_1)$ and $(x_2, y_2)$, and minimizes the integral

$$I = \int_{x_1}^{x_2} f(x, y, y')\,dx \tag{1.2}$$

In the notation of Bolza, $y(x)$ is of class $(C', D')$, admitting a finite number of corners. Generally, $y$ is an $n$-vector. To simplify the exposition, we shall restrict ourselves to the case $n = 1$.

Two cases of the problem, illustrated by the examples (a) and (b) above, are distinguished: (a) $\phi = \phi(x, y)$, (b) $\phi = \phi(x, y, y')$. Case (a) has been treated by Bolza,¹ Mancil,² and others, and case (b) by Valentine.³ The singularities that arise on the locus $\phi = 0$ have been investigated by Garfinkel and McAllister.⁴

The two cases are treated in Parts A and B of this chapter, following Sec. 1.2. The necessary and sufficient conditions of the calculus of variations are discussed in Secs. 1.1, 1.2, 1.7, and 1.12. For a strong relative minimum, the former include conditions I-IV, in the notation of Bliss⁵; the latter comprise I, II_N', and IV'. The proof of sufficiency appears in Secs. 1.9 and 1.13. The corner condition is applied in Secs. 1.4 and 1.11 to the points of junction of the region and the boundary subarcs. The singularities that arise on $\phi = 0$ are treated in Secs. 1.4, 1.5, and 1.11. The imbedding of a test-curve in a field of extremals is carried out in Secs. 1.6 and 1.12. Numerical examples are worked out in Secs. 1.9 and 1.13. Finally, the results are summarized in Sec. 1.14.

Let R be the open region $\phi > 0$, and B the boundary $\phi = 0$. The terms R-arcs and B-arcs will designate arcs in R and B respectively. A curve is said to be normal if $\phi_y \ne 0$ in case (a), and $\phi_{y'} \ne 0$ in case (b), for all $(x, y, y')$ belonging to the curve. Normality will be assumed.
1.1 Condition I

Construct the function

$$F = f + \lambda\phi \tag{1.3}$$

where the Lagrange multiplier $\lambda(x)$ satisfies

$$\lambda\phi = 0 \tag{1.4}$$

An admissible curve is generally compounded of R-subarcs, on which $\phi > 0$ and $\lambda = 0$, and of B-subarcs, on which $\phi = 0$ and $\lambda \ne 0$ is admitted. Let such a composite curve lie in R for $x_1 \le x < \xi$, and in B for $\xi \le x \le x_2$, and have no corners except possibly at the junction $x = \xi$. Generally, the symbol $\Delta$ will denote the jump in a function $g(x)$ that has limits from the left and from the right at $x = \xi$; i.e.,

$$\Delta g(\xi) = g_+(\xi) - g_-(\xi) = \lim_{\varepsilon \to 0}\,[g(\xi + \varepsilon) - g(\xi - \varepsilon)], \qquad \varepsilon > 0 \tag{1.5}$$
For a weak variation $\delta y$, the equation of a neighboring admissible curve $Y(x)$ can be written as

$$Y(x) = y(x) + \delta y(x)$$

Define the variations $\delta y'$, $\delta\phi$, and the differentials $dy$, $dy'$, $d\phi$ by

$$\delta y' = Y' - y', \qquad \delta\phi = \phi_y\,\delta y + \phi_{y'}\,\delta y' \tag{1.6}$$

$$dy = \delta y + y'\,dx, \qquad dy' = \delta y' + y''\,dx, \qquad d\phi = \phi_x\,dx + \phi_y\,dy + \phi_{y'}\,dy' \tag{1.7}$$

and similarly for any function of $x, y, y'$. An admissible variation $\delta y(x)$ will be defined as one that satisfies

$$\delta y(x_1) = \delta y(x_2) = 0 \tag{1.8}$$

and also satisfies $\delta\phi \ge 0$ on the boundary $\phi = 0$. A variation is said to be bilateral if it remains admissible upon a change of sign. Such are the variations on R-subarcs, where $\phi \ne 0$, and the constrained variations on B-subarcs, where $\delta\phi = 0$ for $n \ge 2$. A unilateral variation is a variation on a B-subarc for which $\delta\phi > 0$. In view of (1.2), (1.3), and (1.4), the functional $I$ can be written as
$$I = \int_{x_1}^{\xi} F\,dx + \int_{\xi}^{x_2} F\,dx - \int_{\xi}^{x_2} \lambda\phi\,dx \tag{1.9}$$

Let $Y(x)$ intersect $\phi = 0$ at $(\xi + dx,\ y(\xi) + dy)$. Since $F$ may have a jump $\Delta F$ at $x = \xi$, and since $\lambda = \lambda(x)$, the first variation of $I$ is given by

$$\delta I = -[\Delta F\,dx]_\xi + \int_{x_1}^{x_2} \delta F\,dx - \int_{\xi}^{x_2} \lambda\,\delta\phi\,dx \tag{1.10}$$

The identity

$$\delta y' = \frac{d}{dx}\,\delta y \tag{1.11}$$

leads to (1.12). Integration by parts and the use of (1.7) convert (1.10) into

$$\delta I = -[\Delta(F - y'F_{y'})\,dx + \Delta F_{y'}\,dy]_\xi + \cdots \tag{1.13}$$

where $dx$ and $dy$ inside the brackets satisfy $d\phi = 0$ at $x = \xi$. In order that $I$ be a minimum, it is necessary that

$$\delta I \ge 0 \tag{1.14}$$

Let the arbitrary $\delta y(x)$ be successively chosen as: (1) a bilateral variation with a fixed $\xi$, (2) a bilateral variation with a variable $\xi$, and (3) a unilateral variation. Then the Du Bois-Reymond lemma and (1.14) imply in succession the following three conditions: the Euler condition Ia,

$$\frac{d}{dx}F_{y'} = F_y, \qquad \lambda\phi = 0 \tag{1.15}$$

holding between corners; the Weierstrass-Erdmann corner condition Ib,

$$[\Delta(F - y'F_{y'})\,dx + \Delta F_{y'}\,dy]_\xi = 0 \tag{1.16}$$

which must hold at corners for all $dx$ and $dy$ satisfying $d\phi = 0$; and the convexity condition Ic,

$$\lambda \le 0 \tag{1.17}$$

The system of two equations (1.15) in the unknowns $y(x)$ and $\lambda(x)$ defines the extremals of the problem. We shall use the term extremaloid to describe curves that satisfy condition I, which comprises Ia, Ib, and Ic. The particular extremaloid that satisfies the prescribed end-conditions will be referred to as the test-curve, and designated as $E_{12}$.
1.2 Conditions II and III

The derivation of the necessary conditions II and III does not differ essentially from that in the problem without the inequality $\phi \ge 0$. Accordingly, we merely state them here without proof. Let the E-function be defined by

$$E \equiv F(x, y, Y') - F(x, y, y') - (Y' - y')F_{y'}(x, y, y') \tag{1.18}$$

A curve $E_{12}$ is said to satisfy condition II of Weierstrass if $E \ge 0$ for all $(x, y, y', \lambda)$ of $E_{12}$ and for all sets $(x, y, Y') \ne (x, y, y')$ that satisfy $\phi \ge 0$. With the sign $\ge$ replaced by $>$, II becomes II'.

A curve $E_{12}$ is said to satisfy condition II_N' if there exists an N-neighborhood of $(x, y, y', \lambda)$ on $E_{12}$ with $\lambda\phi = 0$, in which $E > 0$ for all $(x, y, y', \lambda)$ in N and for all sets $(x, y, Y') \ne (x, y, y')$ that satisfy $\phi \ge 0$.

A curve $E_{12}$ is said to satisfy condition III of Legendre if $F_{y'y'} \ge 0$ for all $(x, y, y')$ of $E_{12}$. With the sign $\ge$ replaced by $>$, III becomes III'. A strengthened form of III' is IIIF', which requires that $F_{y'y'} > 0$ for all $(x, y)$ of $E_{12}$ and for all $y'$. Because of the assumed continuity of $f$ and $\phi$, it is equivalent to III_F' of Bliss, which extends the condition to an F-neighborhood of $(x, y)$. We note that III_F' implies II_N' and that the latter implies III'. The remaining conditions IV and IV' are discussed in Secs. 1.8 and 1.13.

PART A-CASE (a)

1.3 Preliminary Considerations
The existence and uniqueness of an extremal continuation at points that are neither corners nor junctions will be proved with the aid of the Hilbert⁵ theorem. On the R- and the B-subarcs respectively, the hypothesis of the theorem is the nonvanishing of the Hilbert determinants; for $n = 1$ we write (1.19). This condition is assured by III' and the assumed normality $\phi_y \ne 0$ on the extremal. We note that the corresponding Euler equations (1.15) can be written

$$\frac{d}{dx} f_{y'} = f_y + \lambda\phi_y, \qquad \lambda\phi = 0 \tag{1.20}$$

Since

$$\phi'' = \phi_y y'' + \phi_{yy} y'^2 + 2\phi_{xy} y' + \phi_{xx}, \qquad \phi' = \phi_y y' + \phi_x \tag{1.21}$$

Eqs. (1.19) imply that $y''$ and $\lambda$ can be determined from Eqs. (1.20) when $x, y, y'$ are given. By the implicit function theorem, $y''$ and $\lambda$ exist and are in $C^r$, and the Euler equations have a unique solution.
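The determination of $y''$ and $\lambda$ can be made explicit. The following worked step is a sketch rather than part of the original argument; it assumes case (a), so that $\phi = \phi(x, y)$ and $F_{y'} = f_{y'}$, and it uses only the Euler equation (1.20), the expression (1.21) for $\phi''$, and the normality $\phi_y \ne 0$.

```latex
% Expanding the total derivative in the Euler equation (1.20):
\frac{d}{dx} f_{y'} = f_{y'x} + f_{y'y}\,y' + f_{y'y'}\,y''

% R-subarc: \lambda = 0, and f_{y'y'} \ne 0 by III', so
y'' = \frac{f_y - f_{y'x} - f_{y'y}\,y'}{f_{y'y'}}, \qquad \lambda = 0

% B-subarc: \phi \equiv 0 along the arc, hence \phi'' = 0, and (1.21) gives
y'' = -\,\frac{\phi_{yy}\,y'^2 + 2\phi_{xy}\,y' + \phi_{xx}}{\phi_y}

% after which the Euler equation yields the multiplier
\lambda = \frac{f_{y'x} + f_{y'y}\,y' + f_{y'y'}\,y'' - f_y}{\phi_y}
```

In both cases the divisor is exactly the quantity that III' or normality keeps away from zero, which is why these two hypotheses suffice for a unique continuation.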
The behavior of extremals at junctions is investigated with the aid of the corner condition (1.16). Since $dx$ and $dy$ in (1.16) satisfy

$$d\phi = \phi_x\,dx + \phi_y\,dy = 0 \tag{1.22}$$

and since $\phi_{y'} = 0$, (1.16) implies

$$\Delta(f - y'f_{y'}) = \kappa\phi_x, \qquad \Delta f_{y'} = \kappa\phi_y \tag{1.23}$$

where $\kappa$ is a constant. From Eq. (1.23), the two unknowns $y_+'$ and $\kappa$ are to be determined. Corners can be classified as free and constrained, depending on whether $\kappa = 0$ or $\kappa \ne 0$. The occurrence of corners at junctions of the R- and the B-subarcs is discussed in Theorems 1 and 2.
Theorem 1. If a curve satisfies condition IIIF', then it has no corner at a junction of a region and a boundary subarc.

PROOF. The E-function, defined in Eq. (1.18), assumes the form

$$E(y', Y') = f(Y') - f(y') - (Y' - y')f_{y'} \tag{1.24}$$

where the arguments $(x, y)$ have been deleted to save writing. At a corner, with $y' = y_-'$ and $Y' = y_+'$, Eq. (1.24) becomes

$$E(y_-', y_+') = \Delta f - (\Delta y')f_{y'} \tag{1.25}$$

and in view of Eqs. (1.23) and (1.21),

$$E(y_-', y_+') = \kappa\phi_+' \tag{1.26}$$

For entry into the boundary, $\phi_+' = 0$, and Eq. (1.26) implies $E = 0$. Since $f$ is in $C^{r+2}$, $f_{y'y'}$ exists and is continuous. By the mean value theorem applied to Eq. (1.24), there exists a number $\theta$ such that

$$E(y_-', y_+') = \tfrac{1}{2}\,f_{y'y'}(y_-' + \theta\,\Delta y')\,(\Delta y')^2, \qquad 0 < \theta < 1 \tag{1.27}$$

Since $f_{y'y'} > 0$ by IIIF', Eq. (1.27) and $E = 0$ imply $\Delta y' = 0$. Since the transition is reversible, the same argument applies to exit from the boundary.
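The role of the E-function in this proof can be checked numerically on the integrand of example (a), $f = (1 + y'^2)^{1/2}$, for which $f_{y'y'} > 0$ for every slope. The sketch below is illustrative only; the function names and the sample slopes are assumptions introduced here, not taken from the text. It evaluates Eq. (1.24) on a small grid of slope pairs and confirms that $E$ vanishes only when the two slopes coincide, which is the mechanism that rules out a corner.

```python
import math

def f(p):
    """Integrand of example (a) as a function of the slope p = y'."""
    return math.sqrt(1.0 + p * p)

def f_p(p):
    """Partial derivative of f with respect to the slope."""
    return p / math.sqrt(1.0 + p * p)

def weierstrass_E(p, P):
    """E(y', Y') = f(Y') - f(y') - (Y' - y') f_{y'}(y'), cf. Eq. (1.24)."""
    return f(P) - f(p) - (P - p) * f_p(p)

# Hypothetical sample of slopes on either side of a junction.
slopes = [-3.0, -1.0, 0.0, 0.5, 2.0, 9.0]
values = {(p, P): weierstrass_E(p, P) for p in slopes for P in slopes}

# E is zero when there is no jump in slope and positive for every jump.
no_jump_zero = all(abs(values[(p, p)]) < 1e-12 for p in slopes)
jump_positive = all(v > 0.0 for (p, P), v in values.items() if p != P)
print(no_jump_zero, jump_positive)
```

Because $E > 0$ for every pair of distinct slopes, the requirement $E = 0$ at entry can only be met with $\Delta y' = 0$, in agreement with the theorem.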
Theorem 2. If a curve satisfies condition II', then it has no constrained corner at a junction of a region and a boundary subarc.

PROOF. Consider the function $E(Y') = E(y_-', Y')$. Since $E = 0$ at $Y' = y_+'$ by the corner condition, II' implies

$$\min_{Y'} E = E(y_-', y_+') = 0 \tag{1.28}$$

Since $E(Y')$ is continuous, the relation $E_{Y'} = 0$ must hold at the minimum $Y' = y_+'$. By the differentiation of (1.24) and the use of (1.23),

$$E_{Y'} = f_{y'}(y_+') - f_{y'}(y_-') = \Delta f_{y'} = \kappa\phi_y \tag{1.29}$$
The normality $\phi_y \ne 0$ then implies $\kappa = 0$.

Free corners, corresponding to $\kappa = 0$, have been extensively treated in the literature, in particular by Bolza, Reid, Graves, and others. Such corners are commonly investigated by means of the corner manifolds, discussed in Sec. 2.18 of the next chapter. Since their occurrence on the boundary $\phi = 0$ is incidental, we shall limit ourselves here to continuous transitions only, characterized by

$$\Delta y' = 0, \qquad \Delta\phi' = 0 \tag{1.30}$$

Then the corner condition is trivially satisfied with $\kappa = 0$, and the incident extremal must be tangent to the boundary $\phi = 0$. Entry is thus subject to the conditions

$$\phi = 0, \qquad \phi' = 0 \qquad \text{at } x = \xi \tag{1.31}$$

connecting the two unknowns $\xi$ and $\alpha$, where $\alpha$ is the parameter of the family $y(x, \alpha)$ of extremals. The Jacobian determinant of (1.31) is

$$J = (\phi_-''\,\phi_y\,y_\alpha)_{x = \xi} \tag{1.32}$$

Since $\phi_y \ne 0$, and since $y_\alpha \ne 0$ by IV', a solution of (1.31) exists if $\phi_-'' \ne 0$. The ambiguous case $\phi_-'' = 0$ is resolved in Sec. 1.4 by considering higher-order derivatives.

It is noteworthy that a degree of freedom is lost upon entry, with the one-parameter family $y(x, \alpha)$ degenerating into the B-extremal $\phi = 0$. At a point of exit from the boundary, the tangency condition (1.30) is established by symmetry. Thus, while entry is limited to R-arcs that are tangent to the boundary, exit from the boundary is unrestricted, except as noted in Sec. 1.4. The value $x = \xi$ at the point of exit thus becomes the parameter of the family $y(x, \xi)$ of the emergent extremals, and the degree of freedom lost on entry is restored on exit.
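As an illustration of the entry conditions, consider example (a) of the Introduction, where $\phi = y - x^3 + 3x$ and the integrand is the arc length. The sketch below rests on assumptions that are not spelled out in this section: that the R-extremals of the arc-length integrand are straight lines, so that the central family issuing from $(-3, -11)$ is $y = -11 + \alpha(x + 3)$, and that the entry point lies in the bracket $[-3, -1]$. The helper names are hypothetical. The condition $\phi' = 0$ fixes the slope $\alpha = 3\xi^2 - 3$, and the remaining condition $\phi = 0$ is then solved for $\xi$ by bisection.

```python
def phi(x, y):
    """Constraint function of example (a); admissible points satisfy phi >= 0."""
    return y - x**3 + 3.0 * x

def residual(xi, x1=-3.0, y1=-11.0):
    """phi at x = xi on the line through (x1, y1) whose slope makes phi' = 0 there."""
    alpha = 3.0 * xi**2 - 3.0          # from phi' = y' - 3x^2 + 3 = 0
    y_at_xi = y1 + alpha * (xi - x1)   # straight-line R-extremal (assumed form)
    return phi(xi, y_at_xi)

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign on [lo, hi]."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0.0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Bracket chosen by inspection (an assumption of this sketch).
xi = bisect(residual, -3.0, -1.0)
alpha = 3.0 * xi**2 - 3.0
print(xi, alpha)
```

Under these assumptions the two conditions (1.31) reduce to the cubic $2\xi^3 + 9\xi^2 - 20 = 0$, so the single parameter $\alpha$ and the entry abscissa $\xi$ are fixed together, which is the loss of a degree of freedom noted in the text.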
Since $y'$ is continuous, and since $y''$ and $\lambda$ are generally discontinuous at junctions, it follows from the Hilbert theorem that $y''(x)$ and $\lambda(x)$ are in $(C^r, D^0)$. That continuous transitions defined by (1.30) are not always possible will be shown in Sec. 1.4.

1.4 Singularities in Case (a)

On a composite arc, the system (1.15) is linear in the unknowns $y''$, $\lambda$ and has the Hilbert determinant (1.33). A unique solution exists iff $H \ne 0$ at $x = \xi$. Accordingly, we define singular points by the condition $H = 0$, which implies $\phi = 0$ in virtue of III'. Singularities therefore occur on $\phi = 0$ in transitions classified in Table I.

TABLE I
TYPES OF TRANSITION

Type                Symbol   Name       Remarks
Region-boundary     RB       Entry      Nontrivial
Boundary-region     BR       Exit       Nontrivial
Region-region       RR       Nonentry   Trivial
Boundary-boundary   BB       Nonexit    Trivial
We have assumed that all transitions are continuous, with $\Delta y' = 0$. Accordingly, let $p$ be defined as the least integer such that $0 \le p \le r$ and $y_+^{(p+2)}(\xi) \ne y_-^{(p+2)}(\xi)$ in a nontrivial transition.

Lemma 1a. If $\phi_{y'} \equiv 0$ and if $\phi(\xi) = 0$, then at the point $x = \xi$ of a continuous nontrivial transition

$$\Delta\phi^{(p+2)}\,\Delta\lambda^{(p)} > 0 \tag{1.34}$$

PROOF. For a nontrivial transition, an application of the jump operator $\Delta$ to the Euler equation in (1.15) yields

$$f_{y'y'}\,\Delta y''(\xi) = \phi_y\,\Delta\lambda(\xi) \tag{1.35}$$

in view of the continuity of $y$ and $y'$. It follows by III' and normality that both $\Delta y''$ and $\Delta\lambda$ vanish or do not vanish simultaneously. The definition of $p$ and the successive differentiation of the Euler equation lead to

$$f_{y'y'}\,\Delta y^{(p+2)}(\xi) = \phi_y\,\Delta\lambda^{(p)}(\xi) \ne 0 \tag{1.36}$$

By analogous reasoning, the application of the $\Delta$-operator to the function $\phi''$ yields

$$\phi_y\,\Delta y''(\xi) = \Delta\phi''(\xi) \tag{1.37}$$

and

$$\phi_y\,\Delta y^{(p+2)}(\xi) = \Delta\phi^{(p+2)}(\xi) \tag{1.38}$$

The elimination of $\Delta y^{(p+2)}$ from Eqs. (1.38) and (1.36) now leads to

$$\Delta\phi^{(p+2)} = f_{y'y'}^{-1}\,(\phi_y)^2\,\Delta\lambda^{(p)} \tag{1.39}$$

and finally, by III' and normality, the conclusion follows.

Corollary 1a. Under the same hypothesis

$$\phi^{(p+2)}\,\lambda^{(p)} < 0 \tag{1.40}$$

where the two factors belong to the R and the B continuations respectively.

PROOF. Two cases occur, which shall be referred to as R and B.
CASE R. If $\xi$ belongs to an R-arc, the RB and the RR transitions must be considered. For the RB transition, $\lambda(x) = 0$, $\lambda_-^{(p)}(\xi) = 0$ for $x \le \xi$, and $\phi(x) = 0$, $\phi_+^{(p+2)}(\xi) = 0$ for $x \ge \xi$. Hence (1.34) becomes $\phi_-^{(p+2)}\lambda_+^{(p)} < 0$, where the two factors are the lowest nonvanishing derivatives at $\xi$. For the RR transition all the existing derivatives of $y(x)$ are continuous at $\xi$, so that $\phi_+^{(p+2)} = \phi_-^{(p+2)}$. Since $\phi_-$ refers to the same R-arc, the last two equations imply (1.40).

CASE B. If $\xi$ belongs to a B-arc, the BR and BB transitions must be considered. Analogous reasoning shows that $\phi_+^{(p+2)}\lambda_-^{(p)} < 0$ and $\lambda_+^{(p)} = \lambda_-^{(p)}$, again leading to (1.40).

We shall now inquire whether the two continuations indicated in (1.40) meet both the requirements $\phi \ge 0$ and $\lambda \le 0$. The question is settled by

Theorem 3a. If the hypothesis of Lemma 1a holds, then the extremal has either two continuations or none: (1) for $p$ even there exists a continuation in the region and a continuation in the boundary; (2) for $p$ odd there is no continuation.

PROOF. CASE R. If $\xi$ belongs to an R-arc, the dominant terms of the Taylor series expansion of $\lambda(x)$ and $\phi(x)$ about $\xi$ in powers of $\varepsilon = |x - \xi|$ are given by

$$\lambda(\xi + \varepsilon) = \varepsilon^p\,\lambda_+^{(p)}(\xi)/p! + \cdots \qquad \text{(RB)}$$
$$\phi(\xi \pm \varepsilon) = (\pm 1)^p\,\varepsilon^{p+2}\,\phi_-^{(p+2)}(\xi)/(p+2)! + \cdots \qquad \text{(RR)} \tag{1.41}$$

for the RB and the RR continuations respectively. Since $\phi(\xi - \varepsilon) > 0$, Eqs. (1.41) and (1.40) imply two possibilities:

(1) When $p$ is even: $\phi^{(p+2)} > 0$, $\phi(\xi + \varepsilon) > 0$, $\lambda^{(p)} < 0$, $\lambda(\xi + \varepsilon) < 0$. The RB and RR transitions satisfy the respective requirements $\lambda(\xi + \varepsilon) < 0$ and $\phi(\xi + \varepsilon) > 0$. Both continuations being possible, an extremaloid arc tangent to the boundary splits off a boundary subarc.

(2) When $p$ is odd: $\phi^{(p+2)} < 0$, $\phi(\xi + \varepsilon) < 0$, $\lambda^{(p)} > 0$, $\lambda(\xi + \varepsilon) > 0$. Both transitions violate the respective requirements, and neither continuation is possible; the extremaloid thus comes to a dead-end.

CASE B. If $\xi$ belongs to a B-arc, (1.41) is replaced by

$$\phi(\xi + \varepsilon) = \varepsilon^{p+2}\,\phi_+^{(p+2)}(\xi)/(p+2)! + \cdots \qquad \text{(BR)}$$
$$\lambda(\xi \pm \varepsilon) = (\pm 1)^p\,\varepsilon^p\,\lambda_-^{(p)}(\xi)/p! + \cdots \qquad \text{(BB)} \tag{1.42}$$

Since $\lambda(\xi - \varepsilon) < 0$, (1.42) and (1.40) again imply two possibilities:

(1) When $p$ is even: $\lambda^{(p)} < 0$, $\lambda(\xi + \varepsilon) < 0$, $\phi^{(p+2)} > 0$, $\phi(\xi + \varepsilon) > 0$. Both continuations being possible at $x = \xi$, a boundary extremaloid splits off a tangent subarc.

(2) When $p$ is odd: $\lambda^{(p)} > 0$, $\lambda(\xi + \varepsilon) > 0$, $\phi^{(p+2)} < 0$, $\phi(\xi + \varepsilon) < 0$. Neither continuation being possible, the extremaloid comes to a dead-end.

An illustration of the theory is furnished by the following example. Consider a boundary extremal subarc with

$$f = (1 + y'^2)^{1/2}, \qquad \phi = y - x^3$$

Then $y(x) = x^3$, $y' = 3x^2$, $y'' = 6x$, $y''' = 6$, and $\lambda(x) = 6x(1 + 9x^4)^{-3/2}$, with the aid of (1.20). At $\xi = 0$, we find $\lambda(\xi) = 0$, $\lambda'(\xi) = 6$. It follows from (1.40) that $p = 1$. Since $p$ is odd, the diagnosis is a dead-end. A string stretched along a convex boundary $\phi = 0$ provides a physical interpretation of the fact that a geodesic $y(x)$ has no continuation beyond a point of inflection, where $y''(\xi) = 0$ and $y'''(\xi) > 0$. More sophisticated examples can be found in problems of optimum control, such as Garfinkel.⁶
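The multiplier quoted in this example can be reproduced numerically. The sketch below is a check under stated assumptions, not the author's computation: it takes the Euler equation in the form $\lambda = (d f_{y'}/dx - f_y)/\phi_y$ with $f_y = 0$ and $\phi_y = 1$, approximates the derivative along the boundary $y = x^3$ by a central difference, and compares the result with the closed form $6x(1 + 9x^4)^{-3/2}$. The helper `lowest_nonvanishing_order` and the sample abscissas are hypothetical names introduced for illustration.

```python
def y_prime(x):
    """Slope along the boundary y = x**3 of the example."""
    return 3.0 * x * x

def f_yp(x):
    """f_{y'} = y' / (1 + y'^2)^(1/2), evaluated along the boundary."""
    p = y_prime(x)
    return p / (1.0 + p * p) ** 0.5

def lam_numeric(x, h=1e-5):
    """lambda = d/dx f_{y'} (since f_y = 0 and phi_y = 1), by central difference."""
    return (f_yp(x + h) - f_yp(x - h)) / (2.0 * h)

def lam_closed(x):
    """Closed form quoted in the text."""
    return 6.0 * x * (1.0 + 9.0 * x**4) ** -1.5

def lowest_nonvanishing_order(derivs, tol=1e-9):
    """Index of the first entry that is not numerically zero, or None."""
    for order, value in enumerate(derivs):
        if abs(value) > tol:
            return order
    return None

xs = [-1.0, -0.3, 0.0, 0.4, 1.2]
max_err = max(abs(lam_numeric(x) - lam_closed(x)) for x in xs)

# Derivatives of lambda at the inflection point xi = 0.
h = 1e-4
lam0 = lam_closed(0.0)
dlam0 = (lam_closed(h) - lam_closed(-h)) / (2.0 * h)
p = lowest_nonvanishing_order([lam0, dlam0])
print(max_err, lam0, dlam0, p)
```

The lowest nonvanishing derivative of $\lambda$ at $\xi = 0$ is the first, so $p$ is odd, matching the dead-end diagnosis; the sign change of $\lambda$ at $x = 0$ is also what would violate condition Ic, $\lambda \le 0$, on any continuation along the boundary.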
1.5 The Extremaloid Index

Since $y^{(k+2)}(x)$ at $x = \xi_-$ exists for $k = 0, 1, \ldots, r$, the successive derivatives of $\phi(x)$ and $\lambda(x)$ at $x = \xi_-$ can be determined. This can be done by the repeated differentiation of $\phi(x) = \phi(x, y(x))$ if the subarc is in R, or of the Euler equation (1.20) if the subarc is in B. Let $q$ be the order of the lowest nonvanishing derivative of the set $(\phi^{(k+2)}, \lambda^{(k)})$ at $x = \xi_-$, and let the index $i(\xi)$ be defined as 0 if $q$ is even, and as 1 if $q$ is odd. In terms of the index $i(\xi)$, the behavior of the extremaloid can be summarized by the following:

Theorem 4a. If $\phi_{y'} \equiv 0$ and if $\phi(\xi) = 0$, then at $x = \xi$ the extremaloid either splits, undergoing both a trivial and a nontrivial transition, or it comes to a dead-end, depending on whether the index is zero or one.

Note that the statement covers the situation $\phi' < 0$, not included in the proof based on (1.40). Here $q = i = 1$, and a dead-end occurs if a corner is excluded.

1.6 The Imbedding Construction
In order to test the sufficient conditions, the test-curve $E_{12}$ is imbedded in families of extremals. If both end-points are in R, a typical test curve is of the structural type RBR; i.e., it contains one B-subarc. Let $\xi_1$ and $\xi_2$ refer to the points of entry and exit respectively. In the imbedding construction, two one-parameter families $y(x, \alpha)$ and $e(x, \xi)$ will be used. The first one is the central family of R-extremals issuing from $(x_1, y_1)$; the second is generated on the locus $\phi = 0$ by the BR transition. The existence of the latter family is assured by $\lambda < 0$ and Theorem 4a. The parameter $\xi$ is the value of $x$ at the point of exit. The two families are separated by their common extremal, which is the trivial continuation beyond $x = \xi_1$ of the member of $y(x, \alpha)$ that is tangent to $\phi = 0$. The union of the two families is a simply connected region. Exclusive of the B-subarc, the test-curve $E_{12}$ is imbedded for some $\alpha = \alpha_0$ in the central family $y(x, \alpha)$, and for $\xi = \xi_0$ in the exiting family $e(x, \xi)$, where $\xi_0 = \xi_2$. The B-subarc $\phi \equiv 0$, being tangent to the members of the family $e(x, \xi)$, is contained in the family envelope, satisfying on $\phi = 0$ the relation

    $e_\xi(\xi, \xi) = 0$    (1.43)

From the assumed continuity of $f$ it follows that there exists an N-neighborhood of $(x, y, y')$ on the B-subarc that is covered by the extremal family $e(x, \xi)$. The extension of the analysis to the case of several B-subarcs presents no difficulties.

1.7 Condition IV

In view of (1.43), condition IV' is stated as follows: A curve $E_{12}$ is said to satisfy condition IV' if it does not contain points belonging to the envelopes, distinct from $\phi = 0$, of the families imbedding $E_{12}$. The analytic statement of the condition is given by the requirement

    $y_\alpha(x, \alpha_0) \ne 0, \quad x_1 < x \le \xi_1; \qquad e_\xi(x, \xi_0) \ne 0, \quad \xi_0 < x \le x_2$    (1.44)
The continuity of $f$ implies the extension of IV' to some neighborhood $N_2$ of $E_{12}$. There the functions $\alpha(x, y)$ and $\xi(x, y)$ exist, and the covering of $N_2$ by extremals is said to be simple. As shown in Sec. 2.9, IV' and $n = 1$ imply that the families $y(x, \alpha)$ and $e(x, \xi)$ are fields in $N_2$. With "$\le x_2$" in (1.44) replaced by "$< x_2$," condition IV' becomes the necessary condition IV.

1.8 Proof of Sufficiency
Conditions I, II$_N$', and IV' are sufficient for a strong relative minimum. We shall show that a test-curve $E_{12}$ satisfying these conditions yields a lower value for the integral $I$ than any other admissible curve joining the end-points and lying in some neighborhood $M$. Let $N_1$ be the xy-projection of the N-neighborhood in which II$_N$' holds; let $N_2$ be the xy-neighborhood in which IV' holds. In the light of Secs. 1.5 and 1.6, the neighborhood $M$ defined by

    $M = N_1 \cap N_2$    (1.45)

has the following properties:

1. $M$ is simply covered by one-parameter families $y(x, \alpha)$ and $e(x, \xi)$ of extremals.
2. Each of the two families is a field in $M$.
3. The boundary between the two fields is their common extremal.
4. $M$ is the union of the two fields and is simply connected.
5. On every curve in $M$, the E-function, calculated with $y'$ the slope of the field and $Y'$ the slope of the curve, is positive for all $Y' \ne y'$.

The proof of sufficiency depends on Lemmas 3 and 4 of the next chapter, and is essentially the same as the proof given there in Sec. 2.9. If the minimum is unique and if $M$ extends over the entire region $\phi \ge 0$, then the minimum is absolute.

1.9 Numerical Example
The solution of a problem contains the following stages:

1. Construct the central family $y(x, \alpha)$ of extremals through $(x_1, y_1)$, and a family $e(x, \xi)$ of extremals exiting from $\phi = 0$.
2. Determine the point $\xi_1$ of entry.
3. From the end-conditions determine the parameters $\alpha_0$ and $\xi_0$.
4. Test the sufficient conditions.

The outline is designed for extremaloids of the type RBR. Degenerate cases, symbolized by R, B, RB, and BR, must also be considered.
In example (a) of Sec. 1.0,

    $f = (1 + y'^2)^{1/2}, \qquad \phi = y - x^3 + 3x;$
    $x_1 = -3, \quad y_1 = -11; \qquad x_2 = 1, \quad y_2 = 2$

Observe that: (1) R-extremals are straight lines, and the B-extremal is a cubic; (2) the end-points are in R, and the line joining them violates the inequality $\phi \ge 0$. We therefore seek a solution of the form RBR. The R and B types of arc are characterized in Table II, with $m$ and $b$ appearing as constants of integration in the solution of the Euler equations.

TABLE II
FAMILIES OF EXTREMALS

                R                          B
$y(x)$          $mx + b$                   $x^3 - 3x$
$\lambda(x)$    $0$                        $6x(10 - 18x^2 + 9x^4)^{-3/2}$
$\phi(x)$       $-x^3 + (m + 3)x + b$      $0$

The initial condition yields $b = 3m - 11$, and the central family of extremals,

    $y(x, m) = m(x + 3) - 11$    (1.46)

with the corresponding $\phi$ given by

    $\phi = -x^3 + (m + 3)x + 3m - 11$    (1.47)
For a continuous entry, the relations $\phi = \phi' = 0$ yield

    $m = 9, \qquad x = \xi_1 = -2$    (1.48)

which corresponds to the extremal

    $y = 9x + 16$    (1.49)
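The entry point can be checked numerically. The sketch below is a plain bisection, under the assumption that the relevant root is bracketed by $[-3, -1]$; it eliminates $m$ through $\phi' = 0$ and then solves $\phi = 0$. The helper names `entry_residual` and `bisect` are hypothetical.

```python
def phi(x, m):
    """phi along the central-family line y = m(x + 3) - 11, as in (1.47)."""
    return -x**3 + (m + 3) * x + 3 * m - 11


def entry_residual(x):
    """phi after eliminating m through phi' = -3x^2 + m + 3 = 0."""
    return phi(x, 3 * x**2 - 3)


def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign on [lo, hi]."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


xi_1 = bisect(entry_residual, -3.0, -1.0)   # bracket chosen by inspection
m_entry = 3 * xi_1**2 - 3                   # slope of the entering extremal
```

The values obtained this way reproduce (1.48) to within the bisection tolerance.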
On the B-subarc,

    $y = x^3 - 3x, \qquad y' = 3x^2 - 3$    (1.50)

For a continuous exit at $x = \xi$, the relation $\Delta y' = 0$ leads to

    $y_+' = 3\xi^2 - 3$    (1.51)

The equation of the exiting family $e(x, \xi)$ can then be written as

    $y = e(x, \xi) = 3(\xi^2 - 1)(x - \xi) + \xi^3 - 3\xi = 3(\xi^2 - 1)x - 2\xi^3$    (1.52)
The terminal condition yields $\xi_0 = \xi_2 = -1$, which corresponds to the extremal

    $y = 2$    (1.53)
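As a check on (1.52) and (1.53), the exit point can be recovered from the terminal condition. This is a minimal sketch, assuming the exit point lies in $-2 \le \xi \le 0$, where the residual is monotone in $\xi$; the names `e_xi` and `exit_point` are illustrative.

```python
def e(x, xi):
    """Exiting family (1.52): the tangent line to y = x^3 - 3x at x = xi."""
    return 3 * (xi**2 - 1) * (x - xi) + xi**3 - 3 * xi


def e_xi(x, xi):
    """Partial derivative of e with respect to xi, which is 6*xi*(x - xi)."""
    return 6 * xi * (x - xi)


def exit_point(x2, y2, lo=-2.0, hi=0.0, tol=1e-12):
    """Bisection for the xi in [lo, hi] whose extremal passes through (x2, y2)."""
    def residual(xi):
        return e(x2, xi) - y2

    f_lo = residual(lo)
    if f_lo * residual(hi) > 0:
        raise ValueError("no exit point bracketed in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = residual(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


xi_0 = exit_point(1.0, 2.0)   # terminal point (x2, y2) = (1, 2)
```

The function `e_xi` vanishes at $x = \xi$, which is the envelope relation (1.43).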
Thus we obtain a test-curve defined by

    $y = 9x + 16$        $(-3 \le x \le -2)$
    $y = x^3 - 3x$       $(-2 \le x \le -1)$    (1.54)
    $y = 2$              $(-1 \le x \le 1)$
To test the sufficient conditions, we note that: (1) condition Ic is satisfied, since $\lambda < 0$ on the B-subarc; (2) since

    $f_{y'y'} = (1 + y'^2)^{-3/2} > 0$    (1.55)

condition III$_N$' and, a fortiori, II$_N$' hold; (3) condition IV' is satisfied in virtue of the following relations:

    $y_\alpha(x, 9) = x + 3 \ne 0, \quad -3 < x \le -2$
    $e_\xi(x, -1) = -6x - 6 \ne 0, \quad -1 < x \le 1$    (1.56)
A strong relative minimum is thus assured. The order $q$ of the lowest nonvanishing derivative of either $\phi(x)$ or $\lambda(x)$ on $\phi = 0$, and the corresponding index $i(x)$, are listed in Table III.

TABLE III
EXTREMALOID INDEX $i(x)$

$x = -2$         $\phi'' = 12$                 $q = 2$    $i = 0$
$-2 < x < 0$     $\lambda < 0$                 $q = 0$    $i = 0$
$x = 0$          $\lambda' = 6(10)^{-3/2}$     $q = 1$    $i = 1$

In view of Theorem 4a, splitting occurs at every point of $\phi = 0$ except at $x = 0$, which is a dead-end. Generally, it is necessary that $i(x) = 0$ at all points of $E_{12}$ except possibly at $x = x_2$. Clearly, this requirement is satisfied by our solution. If the terminal condition is replaced by $x_2 = a$, $y_2 = b$ with $b \ge a^3 - 3a$, then in the various regions of the ab-plane, the solution curve belongs to the structural types R, RB, or RBR, as indicated in Table IV. The simple covering of the region $\phi \ge 0$ is depicted in Fig. 1. Since the minimum is unique, and since the neighborhood $M$ extends over the entire region, the minimum is absolute.
TABLE IV
SOLUTION TYPE

Type    Location of the terminal point
R       $b \le 9a + 16$, $-3 \le a \le -2$; or $b \ge 9a + 16$
RB      $b = a^3 - 3a$, $-2 \le a$
RBR     $b \le 9a + 16$, $-2 \le a$

FIG. 1. The imbedding of the test curve.
PART B-CASE (b)

1.10 Preliminary Considerations
As in Sec. 1.3, an extremal has a unique continuation between corners and junctions. A significant change in the analysis is the replacement of (1.19), (1.20), and (1.21) by ff2 =
14,42
(1.57)
and

    $\phi' = \phi_x + \phi_y y' + \phi_{y'} y'' = 0$    (1.58)
Again, the Hilbert condition is assured by III' and the assumed normality $\phi_{y'} \ne 0$ on the extremal. It follows that $y''$ and $\lambda'$ exist and are in C', and that the Euler equations have a unique solution. Since $\phi_{y'} \ne 0$, the differentials $dx$ and $dy$ in (1.16) are arbitrary, and the corner condition becomes

    $\Delta(F - y'F_{y'}) = 0, \qquad \Delta F_{y'} = 0$    (1.59)

from which the two unknowns $y_+'$ and $\lambda_+$ are to be determined. Since $F = f + \lambda\phi$ and $\lambda\phi = 0$, (1.59) can be written

    $\Delta[f - y'(f_{y'} + \lambda\phi_{y'})] = 0, \qquad \Delta(f_{y'} + \lambda\phi_{y'}) = 0$    (1.60)

Theorem 1 of Sec. 1.3 remains valid; in the proof $f$ is replaced by $F$, and $K$ is set equal to zero, again leading to

    $E(y_-', y_+') = 0$    (1.61)
at the junction. As in case (a), we limit ourselves to continuous transitions, characterized by

    $\Delta y' = 0$    (1.62)

For such transitions (1.60) implies

    $\Delta\lambda = 0$    (1.63)

so that, in contrast to case (a), the multiplier $\lambda(x)$ is continuous. Entry is thus subject to the conditions

    $\phi = 0, \qquad \Delta y' = 0, \qquad \Delta\lambda = 0$    (1.64)

By symmetry, the conditions

    $\lambda = 0, \qquad \Delta y' = 0, \qquad \Delta\lambda = 0$    (1.65)

hold for exit. In both transitions, the unknowns $\xi$, $y_+'$, $\lambda_+$ are to be determined from a system of three equations. For a family $y(x, \alpha)$, the solution furnishes $\xi_1(\alpha)$ and $\xi_2(\alpha)$, which define the entry locus $E_1$ and the exit locus $E_2$ of the family. In contrast to case (a), both transitions preserve the number of degrees of freedom. Since $y'$ and $\lambda$ are continuous, and since $y''$ and $\lambda'$ are generally discontinuous at junctions, it follows from the Hilbert theorem that $y''(x)$ and $\lambda'(x)$ are in $(C, D^0)$.
1.11 Singularities in Case (b)
For a composite arc, the system (1.15) is replaced by

    $\frac{d}{dx} F_{y'} = F_y, \qquad \frac{d}{dx}(\lambda\phi) = 0$    (1.66)

which is linear in $y''$, $\lambda'$, and has the Hilbert determinant

    $H = F_{y'y'}\phi - \lambda\phi_{y'}^2$    (1.67)

From III', the normality, and $\lambda\phi = 0$ it follows that $H = 0$ iff $\phi = \lambda = 0$.

Lemma 1b. If $\phi_{y'} \ne 0$ and if $\phi(\xi) = 0$, then at the point $x = \xi$ of a continuous nontrivial transition

    $\Delta\phi^{(p+1)} \, \Delta\lambda^{(p+1)} < 0$    (1.68)
The proof proceeds as in Lemma 1a of Sec. 1.4, with the replacement of $\lambda$, $\phi_y$, $\phi''$ by $-\lambda'$, $\phi_{y'}$, $\phi'$, respectively, in (1.35)-(1.39).

Corollary 1b. Under the same hypothesis,

    $\phi^{(p+1)} \lambda^{(p+1)} > 0$    (1.69)

where the two factors belong to the R and the B continuations respectively. The proof proceeds as in Corollary 1a of Sec. 1.4, with the replacement of $\phi^{(p+2)}$, $\lambda^{(p)}$ by $\phi^{(p+1)}$, $\lambda^{(p+1)}$, respectively.

Theorem 3b. If the hypothesis of Lemma 1b holds, then the extremal has a unique continuation: (1) for $p + 1$ even, an R-arc continues in R, and a B-arc continues in B; (2) for $p + 1$ odd, an R-arc continues in B, and a B-arc continues in R.
PROOF. Case R. Equation (1.41) is replaced by

    $\phi(\xi \pm \epsilon) = (\pm 1)^{p+1} \epsilon^{p+1} \phi^{(p+1)}(\xi)/(p + 1)! + \cdots$    (RR)
    $\lambda(\xi + \epsilon) = \epsilon^{p+1} \lambda_+^{(p+1)}(\xi)/(p + 1)! + \cdots$    (RB)    (1.70)

Then (1.70), (1.69), and $\phi(\xi - \epsilon) > 0$ imply either (1) or (2): (1) $p + 1$ is even: the quantities $\phi^{(p+1)}$, $\phi(\xi + \epsilon)$, $\lambda_+^{(p+1)}$, $\lambda(\xi + \epsilon)$ are positive; no entry can occur, and the extremaloid continues in the region; (2) $p + 1$ is odd: the same quantities are negative, so that entry is the unique extremaloid continuation.
Case B. Equation (1.42) is replaced by

    $\phi(\xi + \epsilon) = \epsilon^{p+1} \phi_+^{(p+1)}(\xi)/(p + 1)! + \cdots$    (BR)
    $\lambda(\xi \pm \epsilon) = (\pm 1)^{p+1} \epsilon^{p+1} \lambda^{(p+1)}(\xi)/(p + 1)! + \cdots$    (BB)    (1.71)

Now (1.71), (1.69), and $\lambda(\xi - \epsilon) < 0$ imply two possibilities: (1) when $p + 1$ is even: the quantities $\lambda^{(p+1)}$, $\lambda(\xi + \epsilon)$, $\phi_+^{(p+1)}$, $\phi(\xi + \epsilon)$ are negative; no exit can occur, and the extremaloid continues in the boundary; (2) when $p + 1$ is odd: the same quantities are positive, so that exit is the unique extremaloid continuation.

In terms of the index $i(x)$, defined in Sec. 1.5, the behavior of an extremaloid can be summarized by the following.

Theorem 4b. If $\phi_{y'} \ne 0$ and if $\phi(\xi) = 0$, then at $x = \xi$ the extremaloid undergoes a trivial or a nontrivial transition depending on whether the index is zero or one.

Note that the statement covers the situation $\lambda < 0$, not included in the proof based on (1.69). Here $q = i = 0$, and only the trivial transition BB occurs.

1.12 The Imbedding Construction
Since $y'$ is not prescribed at the end-points, it is not possible to ascertain a priori whether the end-points of the solution curve lie in $\phi > 0$ or in $\phi = 0$. Extremals that issue from $(x_1, y_1)$ therefore comprise a central family $y(x, \alpha)$ of R-extremals and one B-extremal, to be designated as $B_1$. The entry-locus $E_1$ of the family $y(x, \alpha)$, and the corresponding $\xi_1(\alpha)$, are determined from the solution of (1.64). With $\xi_1(\alpha)$ known, the B-subarcs are constructed as follows. Let $y = g(x, \beta)$ and $\lambda = \lambda(x, \gamma)$ be the solutions of the differential equation $\phi = 0$ and the Euler equation (1.58). Since $y$ and $\lambda$ are continuous at $x = \xi_1$, the parameters $\beta$ and $\gamma$ are expressed in terms of $\alpha$ with the aid of the equations

    $g(\xi_1, \beta) = y(\xi_1, \alpha), \qquad \lambda(\xi_1, \gamma) = 0$    (1.72)

leading to $\beta = \beta(\alpha)$ and $\gamma = \gamma(\alpha)$. With $\beta(\alpha)$ and $\gamma(\alpha)$ known, the equations of the family of B-subarcs can be written in the form $y = g(x, \alpha)$, $\lambda = \lambda(x, \alpha)$. The exit-locus $E_2$ of the family can then be determined from

    $\lambda(\xi_2, \alpha) = 0$    (1.73)

leading to $\xi_2 = \xi_2(\alpha)$. The family $y(x, \alpha)$ thus generates a central family of RBR extremaloids, with $y(x, \alpha)$ assuming different functional representations on various subarcs.
In contrast to case (a), the imbedding construction extends to the entire curve $E_{12}$. The case of more than one B-subarc presents no difficulties. On the $B_1$-extremal through $(x_1, y_1)$, the multiplier $\lambda$ is obtained from (1.58) in the form $\lambda = \lambda(x, \xi)$, where $\xi$ is arbitrary. At every point of $B_1$ for which there exists a $\xi$ such that $\lambda = 0$ and $\lambda' > 0$ an extremal exits into R. The set of such points belongs to the exit locus $E_2$, generating a family $e(x, \xi)$, which is enveloped by $B_1$. Without any loss of generality, $\xi$ can be taken as the value of $x$ for which $\lambda = 0$.

Condition IV' requires that $E_{12}$ contain no points belonging to the envelopes, distinct from $B_1$, of the families of extremals that imbed $E_{12}$. If $E_{12}$ is imbedded for $\alpha = \alpha_0$ in the central family $y(x, \alpha)$ of extremaloids, IV' takes the form

    $y_\alpha(x, \alpha_0) \ne 0, \qquad x_1 < x \le x_2;$    (1.74)

if $E_{12}$ is also imbedded for $\xi = \xi_0$ in a noncentral family $e(x, \xi)$, then (1.74) is replaced by (1.75).

As in case (a), IV' implies its extension to a neighborhood of $(x, y)$, in which the imbedding families are fields. The proof of sufficiency is essentially the same as in case (a). A novel feature is that adjoining fields are separated by one of the loci $E_1$ and $E_2$, on which the corner condition is trivially satisfied.

1.13 Numerical Example
The solution generally involves the following stages:

(1) Construct the central family of R-extremals and the $B_1$-extremal through $(x_1, y_1)$.
(2) Construct the entry and exit loci $E_1$ and $E_2$.
(3) Construct the continuations of the initial subarcs of step (1), and determine the parameters $\alpha_0$ and $\xi_0$ from the end-conditions.
(4) Test the sufficient conditions.

In example (b) of Sec. 1.0,

    $f = y'^2, \qquad \phi = y' - x;$
    $x_1 = 0, \quad y_1 = 0; \qquad x_2 = 3, \quad y_2 = 5$

Observe that: (1) the R-extremals are straight lines and the B-extremals are parabolas; (2) the straight line joining the end-points violates the requirement $\phi \ge 0$. Therefore we seek a solution of the structural types RBR and BR. The R and B types of arc are characterized in Table V, with $m$, $b$, $\beta$, and $\gamma$ appearing as constants of integration in the solution of Euler equations.

TABLE V
FAMILIES OF EXTREMALS

                R            B
$y(x)$          $mx + b$     $\frac{1}{2}(x^2 + \beta)$
$\lambda(x)$    $0$          $2(\gamma - x)$
$\phi(x)$       $m - x$      $0$

The initial condition yields $b = 0$ and $\beta = 0$. The central family of R-extremals can then be written as

    $y(x, m) = mx$    $0 \le x \le m$ if $m \ge 0$;  $0 < x$ if $m < 0$;    (1.76)
the $B_1$-extremal is

    $y = \frac{1}{2}x^2, \qquad \lambda = 2(\xi - x) \qquad (0 \le x \le \xi)$    (1.77)

with $\gamma = \xi$. The entry locus $E_1$, obtained from (1.64), corresponds to

    $x = \xi_1 = m, \qquad y = \eta_1 = m^2$    (1.78)

The B-subarcs generated by the RB transition are determined as in (1.72), with $g = \frac{1}{2}(x^2 + \beta)$ and $\lambda = 2(\gamma - x)$, leading to

    $\beta = m^2, \qquad \gamma = m$    (1.79)

and

    $y = \frac{1}{2}(x^2 + m^2), \qquad \lambda = 2(m - x)$    (1.80)

Since $\lambda' < 0$ on the B-subarcs, including $B_1$, the exit locus $E_2$ does not exist, and we are limited to extremaloids of the type RB. The terminal condition yields $m = 1$, which corresponds to the test-curve defined by

    $y = x$                                                   $(0 \le x \le 1)$
    $y = \frac{1}{2}(x^2 + 1), \quad \lambda = 2(1 - x)$      $(1 < x \le 3)$    (1.81)
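A quick numerical comparison illustrates the minimizing property of (1.81). The sketch below evaluates $\int y'^2\,dx$ by Simpson's rule on the test-curve, on one hypothetical admissible comparison curve $y = \frac{1}{2}x^2 + \frac{1}{6}x$ (which joins the same end-points and satisfies $y' - x = \frac{1}{6} \ge 0$), and on the straight line joining the end-points, which is not admissible. The comparison curve and all function names are assumptions made for the illustration.

```python
def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3


def slope_test_curve(x):
    """y' on the RB test-curve (1.81): 1 on [0, 1], x on (1, 3]."""
    return 1.0 if x <= 1.0 else x


def slope_comparison(x):
    """y' of the hypothetical admissible curve y = x^2/2 + x/6."""
    return x + 1.0 / 6.0


def cost(slope, breakpoints):
    """Integral of y'^2, split at the breakpoints so each piece is smooth."""
    return sum(
        simpson(lambda x: slope(x) ** 2, a, b)
        for a, b in zip(breakpoints, breakpoints[1:])
    )


cost_test = cost(slope_test_curve, [0.0, 1.0, 3.0])
cost_comparison = cost(slope_comparison, [0.0, 3.0])
cost_line = 3.0 * (5.0 / 3.0) ** 2   # straight line y = 5x/3, violates y' >= x
```

The straight line gives a smaller integral than the test-curve, but only by violating the inequality $\phi \ge 0$ for $x > 5/3$; the admissible comparison curve gives a larger one.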
To test the sufficient conditions, we note that: (1) condition Ic is satisfied, since $\lambda \le 0$ on the B-subarc; (2) since

    $F_{y'y'} = 2$    (1.82)

condition III$_N$' and, a fortiori, II$_N$' are satisfied; (3) condition IV' holds in virtue of the following relations:

    $y_m(x, 1) = x \ne 0, \quad 0 < x \le 1; \qquad y_m(x, 1) = 1 \ne 0, \quad 1 < x \le 3$    (1.83)

A strong relative minimum is thus assured. The extremaloid index $i(x)$ is calculated in Table VI, which is analogous to Table III of case (a):

TABLE VI
EXTREMALOID INDEX $i(x)$

$x = m$    $\phi' = -1$               $q = 1$    $i = 1$
$x > m$    $\lambda = 2(m - x)$       $q = 0$    $i = 0$
In view of Theorem 4b, entry occurs at $x = m$, and no exit is possible for $x > m$, in agreement with previous conclusions. If the terminal condition is replaced by $x_2 = a$, $y_2 = b$, then in various regions of the ab-plane the solution curve belongs to the structural types R, RB, and B, as indicated in Table VII.

TABLE VII
SOLUTION TYPE

Type           Location of the terminal point
R              $b \ge a^2$
RB             $\frac{1}{2}a^2 < b < a^2$
B              $b = \frac{1}{2}a^2$
No solution    $b < \frac{1}{2}a^2$

The simple covering of the region $\phi \ge 0$, $x^2 \le 2y$ is depicted in Fig. 2. Since the minimum is unique, and since the neighborhood $M$ extends over the entire region, the minimum is absolute.
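The classification of Table VII can be expressed as a short routine. This is a sketch only, assuming $a > 0$; the function names and the tolerance used to detect the boundary case $b = \frac{1}{2}a^2$ are illustrative.

```python
def solution_type(a, b, tol=1e-12):
    """Structural type of the solution curve from (0, 0) to (a, b), a > 0."""
    if a <= 0:
        raise ValueError("the sketch assumes a > 0")
    if b >= a * a:
        return "R"
    if abs(b - 0.5 * a * a) <= tol:
        return "B"
    if b > 0.5 * a * a:
        return "RB"
    return "no solution"


def entry_slope(a, b):
    """Parameter m of an RB solution, from b = (a^2 + m^2)/2 as in (1.80)."""
    return (2 * b - a * a) ** 0.5


kind = solution_type(3.0, 5.0)   # terminal point of example (b)
m = entry_slope(3.0, 5.0)
```

For the terminal point $(3, 5)$ of the example this reproduces the type RB with $m = 1$.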
1.14 Discussion of the Results

If the results of this chapter are to be extended to $n > 1$, the sufficient conditions must include the vanishing of the Hilbert integral $I^*$ on every closed path in $M$.
FIG. 2. Solid line: solution curve; dash-dot line: entry locus $E_1$.
A control problem with state variables $y_i$; $i = 1, \ldots, n$, and control variables $u_j$; $j = 1, \ldots, m$ can be transformed into a standard problem of the calculus of variations by means of the substitution

    $u_j = y'_{j+n}$

Inequalities of the form $\phi(x, y) \ge 0$, $\phi(x, y, u) \ge 0$, where $y$ and $u$ are vectors, will then correspond to cases (a) and (b) respectively, and the conclusions of this chapter will apply. Singularities that occur on the locus $\phi = 0$ can be taken into consideration by writing condition Ic in the following form ($\lambda \le 0$):

    $i(x) = 0$    if $x < x_2$                           case (a)
    $i(x) = 0$    if $(x - \xi_1)(x - \xi_2) \ne 0$      case (b)    (1.84)
    $i(x) = 1$    if $(x - \xi_1)(x - \xi_2) = 0$        case (b)

Here $i(x)$ is the extremaloid index, defined in Sec. 1.5, and $\xi_1$, $\xi_2$ refer to the points of entry and exit, respectively.
REFERENCES

1. O. Bolza, "Lectures on the Calculus of Variations." Chelsea, New York, 1904.
2. J. P. Mancil, The minimum of a definite integral with respect to unilateral variation, in "Contributions to the Calculus of Variations." Univ. of Chicago Press, Chicago, Illinois, 1933-1937.
3. F. A. Valentine, Problem of Lagrange with differential inequalities, in "Contributions to the Calculus of Variations." Univ. of Chicago Press, Chicago, Illinois, 1933-1937.
4. B. Garfinkel and G. T. McAllister, Singularities in a variational problem with an inequality, Pacific J. Math. 273 (1966).
5. G. A. Bliss, "Lectures on the Calculus of Variations." Univ. of Chicago Press, Chicago, Illinois, 1946.
6. B. Garfinkel, Minimal problems in airplane performance, Quart. Appl. Math. 9, No. 2, 149 (1951).
2

Discontinuities in a Variational Problem

BORIS GARFINKEL†
BALLISTIC RESEARCH LABORATORIES, ABERDEEN PROVING GROUND, MARYLAND
2.0 Introduction
2.1 Conditions Ic and Id
PART A-CASE (a)
2.2 Conditions Ia, Ib, II, and III
2.3 Preliminary Considerations
2.4 The Function h(y')
2.5 Zermelo Diagram
2.6 The Imbedding Construction
2.7 Condition IV'
2.8 The Hilbert Integral
2.9 Proof of Sufficiency
2.10 Numerical Example
2.11 Discussion of the Results
PART B-CASE (b)
2.12 Conditions Ia, Ib, II, and III
2.13 Preliminary Considerations
2.14 Zermelo Diagram
2.15 Corner Manifolds
2.16 Conditions II' and II$_F$'
2.17 Free Corners
2.18 A Special Case
2.19 The Imbedding Construction
2.20 Proof of Sufficiency
2.21 Numerical Example
2.22 Discussion of the Results
References
2.0 Introduction

The frequent occurrence of optimization problems with discontinuous control variables has focused the attention of mathematicians and engineers upon the old subject of discontinuous solutions in the calculus of variations. A solution curve is generally of class D', admitting discontinuities in $y'$. Such discontinuities, commonly referred to as corners, are subject to the Weierstrass-Erdmann corner condition. Three types of corner may be distinguished: (1) free corners, arising from the nonconvexity of the integrand function $f(x, y, y')$ with respect to $y'$; (2) reflection corners, occurring on the boundary $\phi(x, y) = 0$ of the region $\phi \ge 0$; (3) refraction corners, arising from the discontinuity of $f$ on some locus $\phi(x, y, y') = 0$. The terms reflection and refraction have been borrowed from geometrical optics, governed by the variational Fermat principle.

† Present address: Yale University Observatory, New Haven, Connecticut.

In this chapter we are primarily concerned with refraction in the problem of Lagrange, illustrated by the following two examples: (a) Find the curve $y(x)$ that joins the points $(0, 0)$ and $(13, 61)$, and minimizes the integral

    $\int_0^{13} f \, dx$
where $f = y'^2$ if $\phi \equiv y - \frac{1}{4}x^2 - 16 > 0$, $f = \frac{1}{2}y'^2$ if $\phi < 0$, and either value of $f$ can be assumed if $\phi = 0$. (b) Find the curve $y(x)$ that joins the points $(0, 0)$ and $(9, 54)$, and minimizes the integral
+ 4 if 4 < 0, and either value where f = Sy" + y' if 4 = y' - x > 0, f = off can be assumed if 4 = 0. Both problems are solved in the text with the aid of the theory developed in this chapter. 2 be a set of The general problem can be formulated as follows. Let 9 points (x, y, w)in which the functions f -(x, y , w),f ' ( x , y , w),and 4(x, y , w) are defined, and have continuous derivatives up to the third order. An admissible curve y(x) has its elements (x, y , y ' ) in 9, and has a piecewise continuous first derivative y'. In the class of admissible curves, we seek a y ( ~that ) joins the fixed points (xl, yl) and (x2,y 2 ) ,and minimizes the integral where
f=ff=f+ f=f-
if 4 < 0 or f = f i
if 4 > 0 if 4 = 0
(2.2)
In the notation of Bolza,¹ $y(x)$ is of class (C', D'), admitting a finite number of corners. Generally, $y$ is an n-vector. In order to simplify the exposition, two assumptions will be made: (1) an admissible curve contains no more than one refraction corner; (2) $n = 1$.

A special case, solved by Bliss,² is characterized by the following restrictions: (1) $\phi$ is a function of $x$ and $y$ only; (2) an admissible curve contains no more than one point of $\phi = 0$. As this case does not cover situations frequently arising in optimum control,³ there is a need for a more general solution of the refraction problem. Such a solution, removing the two restrictions imposed by Bliss, will be constructed in this chapter. Two cases of the problem, illustrated by examples (a) and (b) above, are distinguished: (a) $\phi = \phi(x, y)$; (b) $\phi = \phi(x, y, y')$. These cases are treated in Parts A and B, respectively, following Sec. 2.1.

The necessary and sufficient conditions of the calculus of variations are discussed in Secs. 2.1, 2.2, 2.7, 2.12, and 2.17-2.20; the proof of sufficiency appears in Secs. 2.9 and 2.20. In Secs. 2.3 and 2.14 the four basic types of refraction are classified, and the corner condition is used to construct a refracted continuation of an extremal. A graphical construction, based on the Zermelo diagram, is described in Secs. 2.5 and 2.14. The question of the existence and uniqueness of such continuations is investigated in Secs. 2.4 and 2.16, and the results are used in Secs. 2.6 and 2.19 to construct the fields required in sufficiency proofs. Numerical examples are worked out in Secs. 2.10 and 2.21. Finally, the results are summarized in Secs. 2.11 and 2.22. The boundary $\phi = 0$ is the analog of the interface of geometrical optics.
Since two values of $f$ are associated with a point in $\phi = 0$, we shall distinguish the two sides $B^-$ and $B^+$ of the locus $\phi = 0$, with $B^-$ corresponding to the choice $f = f^-$, and $B^+$ to $f = f^+$. We shall use the notation $R^-$, $R^+$ for the regions $\phi < 0$ and $\phi > 0$, denote the closures $R^- \cup B^-$ and $R^+ \cup B^+$ by $\phi \le 0$ and $\phi \ge 0$, respectively, and follow the terminology of the last chapter in referring to arcs in R and in B as R-arcs and B-arcs, respectively. A curve is said to be normal if $\phi_y \ne 0$ in case (a), and $\phi_{y'} \ne 0$ in case (b), for every $(x, y, y')$ belonging to the curve. Normality will be assumed.

2.1 Conditions Ic and Id
Condition Ic, stated in the previous chapter as Eq. (1.84), requires the following modification:

    $\lambda \ge 0$ if $f = f^-$; \qquad $\lambda \le 0$ if $f = f^+$    (2.3)

Condition Id, which must be satisfied on $\phi = 0$, is

    $f = \min(f^+, f^-)$    (2.4)

In other words, on a B-subarc of a minimizing curve, $f$ must assume the lesser of the two values $f^+$ and $f^-$ for given $(x, y, y')$. The derivation of this condition involves the concept of a "crossover" variation, involving the exchange of the values $f^+$ and $f^-$ on $\phi = 0$ without any alteration of the values $x$, $y$, $y'$. Such a variation is defined by

    $\delta y(x) \equiv 0$
    $\delta f = f^+ - f^-$    if $f = f^-$
    $\delta f = f^- - f^+$    if $f = f^+$    (2.5)
In order to derive (2.4), let the crossover variation be applied on some interval $(\xi_1, \xi_2)$ of a B-subarc of a minimizing curve. Then the integral $I$ of (2.1) receives an increment

    $\Delta I = \int_{\xi_1}^{\xi_2} \delta f \, dx$    (2.6)

Since it is necessary that $\Delta I \ge 0$, and since $(\xi_1, \xi_2)$ is arbitrary, it follows that $\delta f \ge 0$ for all $x$ on the B-subarc, and (2.4) follows immediately.

Conditions Ia and Ib of the previous chapter are retained here without change; the combination Ia, b, c, d will be designated as condition I. We shall apply the term extremaloid to a curve that satisfies condition I, in contrast to extremal, which is a solution of the Euler equations. The particular extremaloid that satisfies the prescribed end-conditions will be referred to as the test-curve, and designated as $E_{12}$.

PART A-CASE (a)

2.2 Conditions Ia, Ib, II, and III
The necessary conditions for a strong relative minimum include I-IV; the sufficient conditions comprise I, III'', and IV'. Condition I consists of Ia, Ib, Ic, and Id, of which the last two have been discussed in Sec. 2.1. The Euler condition Ia assumes the form

    $\frac{d}{dx} f_{y'} = f_y, \quad \lambda = 0$         if $\phi \ne 0$
    $\frac{d}{dx} f_{y'} = f_y + \lambda\phi_y$            if $\phi \equiv 0$    (2.7)

The corner condition Ib becomes

    $\Delta(f - y'f_{y'}) = K\phi_x, \qquad \Delta f_{y'} = K\phi_y$    (2.8)

where $\Delta$ denotes the jump at the corner, and $K$ is a constant. Condition II, stated in the previous chapter, is retained here with the E-function defined by

    $E = f(Y') - f(y') - (Y' - y')f_{y'}$    (2.9)
where $x$, $y$, $y'$ belong to the curve and $\phi(x, y) \ne 0$. The following condition II$_F$' will be used in the proof of sufficiency: A curve $E_{12}$ is said to satisfy condition II$_F$' if there exists an F-neighborhood of $(x, y)$ on $E_{12}$ such that $E > 0$ for all $(x, y)$ in F and for all $Y' \ne y'$. Condition III requires that $f_{y'y'} \ge 0$ for all $(x, y, y')$ of $E_{12}$. Condition III'' is a strengthened form of III' of the preceding chapter. It requires that $f_{y'y'} > 0$ for all $(x, y)$ of $E_{12}$ and for all $y'$, including $y' = \pm\infty$.

We shall show that III'' implies II$_F$'. If $\phi(x, y) \ne 0$, define a unilateral F-neighborhood of $(x, y)$ as one that contains no point of $\phi = 0$. Since $f$ is $C^3$ in F, III'' implies its extension, III$_F$'', to F. Furthermore, the mean value theorem applied to (2.9) in F leads to

    $E = \frac{1}{2}(Y' - y')^2 f_{y'y'}[y' + \theta(Y' - y')] \qquad (0 < \theta < 1)$    (2.10)

Since $f_{y'y'} > 0$ by III$_F$'', it follows that $E > 0$. On the other hand, if $\phi(x, y) = 0$ on a subarc of $E_{12}$, III'' holds on both $B^-$ and $B^+$. By the preceding argument, III'' can be extended to a bilateral F-neighborhood, again leading to $E > 0$. The remaining conditions IV and IV' will be discussed in Sec. 2.7.

2.3 Preliminary Considerations
Let the end-points $(x_1, y_1)$ and $(x_2, y_2)$ lie in $\phi < 0$ and in $\phi > 0$, respectively. The six basic transitions on $\phi = 0$, listed in Table I, include four types of refraction, as well as the entry and the exit described in the previous chapter.

TABLE I
BASIC TRANSITIONS

$R^-R^+$    Region → region refraction
$R^-B^+$    Region → boundary refraction
$B^-R^+$    Boundary → region refraction
$B^-B^+$    Boundary → boundary refraction
$R^-B^-$    Entry
$B^+R^+$    Exit

In the latter two transitions, III'' implies that the R-subarc of the minimizing curve is tangent to the locus $\phi = 0$ at the junction of the R and the B-subarcs. From $\phi = \phi(x, y)$ we derive

    $\phi' = \phi_x + \phi_y y'$    (2.11)

At a corner, the left and the right-hand values of $\phi'$ are

    $\phi_-' = \phi_x + \phi_y y_-', \qquad \phi_+' = \phi_x + \phi_y y_+'$    (2.12)

The case of reflection, where $\phi_-'\phi_+' < 0$, has been excluded in this chapter. For refraction it is necessary that $\phi_-'\phi_+' \ge 0$, with the equality holding only where refraction is critical. We shall use the latter term to describe the situation where the incident or the refracted extremal lies in the locus $\phi = 0$ or is tangent to it. Refraction is thus governed by the relations

    $\Delta(f - y'f_{y'}) = K\phi_x, \qquad \Delta f_{y'} = K\phi_y, \qquad \phi_-'\phi_+' \ge 0$    (2.13)
of which the first two constitute the corner condition (2.8). For a given extremal $y(x)$ intersecting $\phi = 0$, the corner condition furnishes two equations from which $y_+'$ and $K$ are to be determined. The nonvanishing of the Jacobian $(\phi' f_{y'y'})_+$ is sufficient for the existence of a solution. Since $f_{y'y'} > 0$ by III'', a solution exists if $\phi_+' \ne 0$. The exceptional case $\phi_+' = 0$ includes the transitions $R^-B^-$ and $R^-B^+$. Then the extremal indeed has two continuations, as shown in the previous chapter, with the boundary extremal splitting off an R-extremal tangent to the boundary $\phi = 0$. The phenomenon of critical reflection in geometrical optics illustrates the fact that the corner condition may give rise to an imaginary refracted continuation. The question of whether or not (2.13) has a real and unique solution will be investigated next. For this purpose we shall use the function $h(y')$ of Sec. 2.4.

2.4 The Function h(y')
The slope $\tilde{y}'$ of the locus $\phi = 0$, obtained from (2.11) with $\phi' = 0$, is given by

    $\tilde{y}' = -\phi_x/\phi_y$    (2.14)

Let $h(y')$ be defined on $\phi = 0$ by

    $h(y') = f(y') + (\tilde{y}' - y')f_{y'}$    (2.15)

We shall distinguish $h^-$ and $h^+$, corresponding to $f = f^-$ and $f = f^+$, respectively. Several theorems concerning $h(y')$ will be proved.
Theorem 1. If $(x, y, y')$ belongs to an extremaloid, the function $h(y')$ is continuous at a corner on $\phi = 0$.

PROOF. The left and the right-hand values of $h$ and the jump in $h$ are given by

    $h^- = f^-(y_-') + (\tilde{y}' - y_-')f^-_{y'}$
    $h^+ = f^+(y_+') + (\tilde{y}' - y_+')f^+_{y'}$    (2.16)
    $\Delta h = h^+ - h^-$

in the notation of the previous chapter. Then it follows from Eqs. (2.13)-(2.15) that

    $\Delta h = \Delta(f - y'f_{y'}) + \tilde{y}'\Delta f_{y'} = 0$    (2.17)

If the transition occurs in the reverse sense, i.e., from $\phi \ge 0$ to $\phi < 0$, the superscripts + and - are to be interchanged. Thus the corner condition can be written $\Delta h = 0$, and (2.13) becomes

    $h^+ = h^-, \qquad (y_+' - \tilde{y}')(y_-' - \tilde{y}') \ge 0$    (2.18)
from which $y_+'$ is to be determined for given $(x, y, y_-')$.

Lemma 1. If III'' holds, then (a) the function $h(y')$ has one and only one stationary point at $y' = \tilde{y}'$, which is a maximum; (b) $h(\infty) = h(-\infty) = -\infty$; (c) if $\max h > 0$, then $h(y')$ has two distinct roots lying on the opposite sides of $\tilde{y}'$; if $\max h = 0$ there is one root $y' = \tilde{y}'$; if $\max h < 0$, there are no real roots.

To prove (a), note that

    $h_{y'} = (\tilde{y}' - y')f_{y'y'}, \qquad h_{y'y'} = -f_{y'y'} + (\tilde{y}' - y')f_{y'y'y'}$    (2.19)

Since $f_{y'y'} > 0$, it follows that $h_{y'} = 0$ implies $y' = \tilde{y}'$ and $h_{y'y'} < 0$. Thus $y' = \tilde{y}'$ yields the one and only stationary point,

    $\max_{y'} h(y') = h(\tilde{y}') = f(\tilde{y}')$    (2.20)

To prove (b), we use the mean value theorem and (2.19), leading to

    $h(y') = h(\tilde{y}') + (y' - \tilde{y}')\bar{h}_{y'} = h(\tilde{y}') - \theta(y' - \tilde{y}')^2 \bar{f}_{y'y'}$    (2.21)

where the bar over a letter denotes its mean value, corresponding to the argument $\tilde{y}' + \theta(y' - \tilde{y}')$ with $0 < \theta < 1$. Since $f_{y'y'} > 0$, and since $\lim f_{y'y'} \ne 0$ as $|y'| \to \infty$, the conclusion follows. To prove (c), we use (a), (b), and the $C^2$ continuity of $f$.

The question of the existence and uniqueness of a real refracted continuation for an extremal intersecting $\phi = 0$ can now be answered.
34
BORIS CARFINKEL
Theorem 2. If an extremal intersects the locus $\phi = 0$, and if III'' holds, then the corner condition (2.18) furnishes one real $y^{+\prime}$ or none, depending on the sign of the discriminant $D$ defined by

$$D \equiv f^+(\tilde{y}') - h^- \tag{2.22}$$

Three cases arise: (a) if $D > 0$, then $y^{+\prime} \ne \tilde{y}'$; (b) if $D = 0$, then $y^{+\prime} = \tilde{y}'$; (c) if $D < 0$, then $y^{+\prime}$ is imaginary.

PROOF. Consider the function

$$g(y') \equiv h^+(y') - h^- \tag{2.23}$$

Since $h^-$ is a constant, Lemma 1 applies to $g(y')$, leading to the conclusion that $g(y')$ is maximized at $y' = \tilde{y}'$, with

$$\max g(y') = D \tag{2.24}$$

and with real roots $y_1'$ and $y_2'$ such that $y_1' \le \tilde{y}' \le y_2'$ if $D \ge 0$. Note that if the roots are distinct, only one of them satisfies the second part of Eq. (2.18), and is to be identified with $y^{+\prime}$.

TABLE II
THE DISCRIMINANT D

Case       Transition types          Remarks
$D > 0$    $R^-R^+$, $B^-R^+$        $y^{+\prime} \ne \tilde{y}'$
$D = 0$    $R^-B^+$, $B^-B^+$        $y^{+\prime} = \tilde{y}'$
$D < 0$    None                      $y^{+\prime}$ imaginary
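The case distinction of Theorem 2 can be illustrated numerically. The following is only a sketch under stated assumptions: the function names, the bisection search, and the bracketing width `span` are illustrative choices, not part of the text; the integrands $f^- = y'^2$ and $f^+ = \tfrac{4}{3} y'^2$ are borrowed from the numerical example of Sec. 2.10. When $y^{-\prime} = \tilde{y}'$ the second part of (2.18) does not single out a root, and the sketch then arbitrarily takes the root above $\tilde{y}'$.

```python
import math


def hilbert_h(f, df, y_prime, locus_slope):
    """h(y') = f(y') + (ytilde' - y') * f_{y'}(y'), cf. Eq. (2.16)."""
    return f(y_prime) + (locus_slope - y_prime) * df(y_prime)


def refracted_slope(f_minus, df_minus, f_plus, df_plus,
                    y_minus, locus_slope, span=1.0e3):
    """Return (D, y_plus) for the corner condition (2.18).

    y_plus is None when D < 0 (no real refracted continuation).
    Assumes f_plus is convex in y' (condition III''), so that
    g(y') = h_plus(y') - h_minus decreases away from ytilde'.
    `span` is an assumed bracketing width for the bisection.
    """
    h_minus = hilbert_h(f_minus, df_minus, y_minus, locus_slope)
    D = f_plus(locus_slope) - h_minus  # Eq. (2.22)
    if D < 0.0:
        return D, None
    if D == 0.0:
        return D, locus_slope

    def g(y_prime):  # Eq. (2.23)
        return hilbert_h(f_plus, df_plus, y_prime, locus_slope) - h_minus

    # Second part of (2.18): y_plus lies on the same side of ytilde' as y_minus.
    # Tie (y_minus == ytilde') resolved toward larger slopes: an assumption.
    direction = 1.0 if y_minus >= locus_slope else -1.0
    inner, outer = locus_slope, locus_slope + direction * span
    if g(outer) > 0.0:
        raise ValueError("root not bracketed; increase span")
    for _ in range(200):  # plain bisection on the sign change of g
        mid = 0.5 * (inner + outer)
        if g(mid) > 0.0:
            inner = mid
        else:
            outer = mid
    return D, 0.5 * (inner + outer)


def f_minus(s):
    return s * s


def df_minus(s):
    return 2.0 * s


def f_plus(s):
    return 4.0 * s * s / 3.0


def df_plus(s):
    return 8.0 * s / 3.0


if __name__ == "__main__":
    # Incoming slope 5 meeting a locus of slope 2 (illustrative values).
    D, y_plus = refracted_slope(f_minus, df_minus, f_plus, df_plus, 5.0, 2.0)
    print(D, y_plus, 2.0 + math.sqrt(7.75))
```

For these quadratic integrands the root can also be found in closed form, which is what the final `print` compares against; the bisection is kept only because it works for any convex $f^+$.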
The three cases of Theorem 2 are summarized in Table II for $\phi^{-\prime} > 0$. The case $D = 0$ corresponds to refraction into $B$. In particular, for the $B^-B^+$ transition, $D = 0$ implies

$$\delta \equiv f^+(\tilde{y}') - f^-(\tilde{y}') = 0 \tag{2.25}$$

in agreement with Eq. (2.4). If $\delta'(x) \ne 0$ along the extremal, then the relations

$$\delta = 0, \quad \delta' < 0; \qquad \lambda^- > 0, \quad \lambda^+ < 0 \tag{2.26}$$

comprise Ic and Id for the $B^-B^+$ transition. Points of $\phi = 0$ where such a transition occurs will be called critical points, and will be designated by $x^*$. Their existence is proved by the following example:
$$f^- = (3 + y'^2)^{1/2}, \qquad f^+ = \tfrac{1}{2} y'^2 + 3y - 2x^2 + 2, \qquad \phi = y - \tfrac{1}{2} x^2 = 0$$

Then Eq. (2.25) and the second part of Eq. (2.7) yield

$$\delta = 2 - (3 + x^2)^{1/2}, \qquad \lambda^- = 3(3 + x^2)^{-3/2}, \qquad \lambda^+ = -2$$

Clearly, $x^* = 1$ leads to $\delta = 0$, $\delta' = -\tfrac{1}{2}$, $\lambda^- = \tfrac{3}{8}$, $\lambda^+ = -2$, which satisfies (2.26). An important consequence of Theorem 2 is the following corollary.
Corollary. At every point of an extremaloid subarc in $\phi = 0$ there exists a refracted continuation.

The proof depends on the observation that on a $B^-$-arc

$$h^- = f^-(\tilde{y}'), \qquad D = f^+(\tilde{y}') - f^-(\tilde{y}') \ge 0 \tag{2.27}$$

where the latter inequality follows from Id. The same argument applies to a $B^+$-arc with a suitable interchange of the $+$ and $-$ signs. In addition to a refracted continuation, a $B$-arc also has the trivial continuation and the exiting continuation, discussed in detail in the previous chapter. Altogether there are three continuations, symbolized by $B^-R^+$, $B^-B^-$, and $B^-R^-$ respectively.

2.5 Zermelo Diagram

A useful geometrical interpretation of the corner condition and of the E-function is provided by the graph of $f = f(y')$ with $(x, y)$ fixed. From Eqs. (2.14), (2.15), and (2.18) it follows that: (1) the tangent to $f^-(y')$ at $y' = y^{-\prime}$ and the tangent to $f^+(y')$ at $y' = y^{+\prime}$ intersect at the point $(\tilde{y}', h^-)$; (2) $y^{+\prime}$ and $y^{-\prime}$ lie on the same side of $\tilde{y}'$. A geometrical construction of $y^{+\prime}$ for given $(x, y, y^{-\prime})$ is shown in Fig. 1.

FIG. 1. Function g(y).
Condition II receives the following geometrical interpretation: A curve $E_{12}$ satisfies II if the tangent to the curve $f = f(y')$ lies entirely below the curve for every $(x, y, y')$ of $E_{12}$. As seen from Fig. 2, II is assured by the convexity of $f(y')$, as previously remarked in Sec. 2.2.

[Figure: graph of $f(y')$ against $y'$, with the abscissas $y^{-\prime}$, $\tilde{y}'$, and $y^{+\prime}$ marked.]

FIG. 2. The corner condition, case (a)

2.6 The Imbedding Construction
In the proof of sufficiency, the neighborhood of the test-curve $E_{12}$ is covered by extremals. Four types of singly refracted extremaloids can arise from a combination of the basic transitions of Table I. They will be designated $R^-R^+$, $R^-B^-R^+$, $R^-B^+R^+$, $R^-B^-B^+R^+$. We shall show that the last type is the most general, and includes the others as special cases. Let $\xi_1$, $\xi_2$, $\xi_r$, $x^*$ refer to the points of entry, exit, refraction, and the critical point of $E_{12}$. In the extremaloid $R^-B^-B^+R^+$, all four points exist and satisfy the relation

$$\xi_1 < \xi_r = x^* < \xi_2$$

The type $R^-R^+$, considered by Bliss,⁴ and the type $R^-B^-R^+$ can be regarded as degenerate forms of $R^-B^-B^+R^+$; $\xi_1$, $\xi_2$, and $x^*$ are absent in the first, and $\xi_2$ and $x^*$ are absent in the second. If $E_{12}$ belongs to the third type, $R^-B^+R^+$, a dual problem can be formulated with the end-points interchanged, and the sign of $\phi$ and the sense of integration reversed. Then $E_{12}$ assumes the structure $R^-B^-R^+$, thereby reducing the problem to a previous case. Accordingly, the third type can be dismissed from further consideration.
In the imbedding construction for the type $R^-B^-B^+R^+$, a set $S$ of six one-parameter families of extremals will be used. They will be designated as $y(x, \alpha)$, $r(x, \xi)$, $r^+(x, \xi)$, $e^-(x, \xi)$, $r^-(x, \xi)$, $e^+(x, \xi)$. The first one is the central family of the $R^-$-extremals issuing from $(x_1, y_1)$; the second is its refracted continuation in $R^+$, generated by the $R^-R^+$ transition; the rest are generated on $\phi = 0$ by the $B^-R^+$, $B^-R^-$, $B^+R^-$, $B^+R^+$ transitions respectively. The parameter $\xi$ is the value of $x$ at the point of transition. These families are listed in Table III and illustrated in Fig. 3.

TABLE III
THE IMBEDDING FAMILIES

Family             Generating transition    Description
$y(x, \alpha)$     $R^-$
$r(x, \xi)$        $R^-R^+$                 refraction
$r^+(x, \xi)$      $B^-R^+$                 critical refraction
$e^-(x, \xi)$      $B^-R^-$                 exit
$e^+(x, \xi)$      $B^+R^+$                 exit
$r^-(x, \xi)$      $B^+R^-$                 critical refraction

The boundaries $b_i$, $i = 1, 2, \ldots, 5$, separating two adjoining families of $S$ are listed and described below:

$b_1$: the unrefracted continuation beyond $x = \xi_1$ of the $R^-$-extremal through $(x_1, y_1)$ tangent to $\phi = 0$ at $\xi_1$
$b_2$: its refracted continuation into $R^+$
$b_3$: the $R^-$-extremal generated by the $B^-R^-$ transition at $x^*$
$b_4$: the $R^+$-extremal generated by the $B^+R^+$ transition at $x^*$
$b_5$: the locus $\phi = 0$.
FIG. 3. The imbedding of a test-curve
It appears that the boundary between any two adjoining families is either their common extremal or the locus $\phi = 0$. The union of the six families is a simply-connected region. The existence of the set $S$ in some neighborhood of $E_{12}$ is established in the following lemma.

Lemma 2. There exists a neighborhood of the test-curve $E_{12}$ that can be covered by a set of one-parameter families of extremals.

PROOF. The interval $(\xi_1 - \varepsilon, \xi_2 + \varepsilon)$ of the locus $\phi = 0$ is decomposed into three subintervals, which will be examined in succession.

Subinterval $(\xi_1 - \varepsilon, \xi_1)$. Since an extremal through $(x_1, y_1)$ is tangent to $\phi = 0$ at $x = \xi_1$, there exists an $\varepsilon_1$ such that every point in the subinterval can be reached by a member of the central family $y(x, \alpha)$. Since $\xi_1$ belongs to a $B^-$-subarc of $E_{12}$, $D(\xi_1) > 0$ by the corollary of Sec. 2.4; furthermore, since $D(\xi)$ is continuous along $\phi = 0$, there exists an $\varepsilon_2$ such that $D > 0$ on the entire subinterval. Let $\varepsilon = \min(\varepsilon_1, \varepsilon_2)$. Then by Theorem 2, every member of $y(x, \alpha)$ that intersects $\phi = 0$ on $(\xi_1 - \varepsilon, \xi_1)$ has a refracted continuation in $R^+$.

Subinterval $(\xi_1, x^*)$. By Ic and the corollary of Sec. 2.4, at every point an extremal exits from $B^-$ into $R^-$, and is also critically refracted into $R^+$.

Subinterval $(x^*, \xi_2 + \varepsilon)$. Since $\lambda(\xi_2) < 0$ by Ic, and since $\lambda(x)$ is continuous at interior points of a $B$-subarc of a minimizing curve, as shown in the previous chapter, there exists an $\varepsilon$ such that $\lambda < 0$ on the entire subinterval. By Ic and the corollary of Sec. 2.4, an extremal exits from $B^+$ into $R^+$, and is also critically refracted into $R^-$.

The $B$-subarc of $E_{12}$, separating adjacent families of extremals, is not a member of either. Exclusive of the $B$-subarc, the test-curve $E_{12}$ is generally imbedded for some value $\alpha = \alpha_0$ in the central family $y(x, \alpha)$, and for the appropriate value $\xi = \xi_0$ in one of the families $r(x, \xi)$, $r^+(x, \xi)$, $e^+(x, \xi)$. The $B$-subarc of $E_{12}$ adjoins a subset of $(r, r^+, r^-, e^+, e^-)$, depending on the extremaloid type. A neighborhood of the subarc always includes $r(x, \xi)$, although the latter region touches $\phi = 0$ only at the point $x = \xi_1$. The parameters $\xi$ are independent of $\alpha$ except in the family $r(x, \xi)$, generated by the $R^-R^+$ transition, where $\xi$ is a function of $\alpha$.

2.7 Condition IV'
Depending on the type of extremaloid, the neighborhood of $E_{12}$ consists of two, four, or six families of $S$. We note that $\phi = 0$ is tangent to all the members of the families $e^-$ and $e^+$ of exiting extremals. Consequently, the locus $\phi = 0$ is their envelope, and the relations

$$e^-_\xi(\xi, \xi) = 0, \qquad e^+_\xi(\xi, \xi) = 0 \tag{2.28}$$

are satisfied on $\phi = 0$. In view of this circumstance, condition IV' is stated as follows.

A curve $E_{12}$ satisfies condition IV' if it does not contain points belonging to the envelopes, distinct from $\phi = 0$, of the families that imbed $E_{12}$.

The analytic formulation appears in Table IV. In the first two types, $\xi_0 = \xi_r$; in the last one, $\xi_0 = \xi_2$.

TABLE IV
CONDITION IV' OF JACOBI

Along $\phi = 0$, no conditions have been imposed on the $e$-families, enveloped by $\phi = 0$; the normality $\phi_y \ne 0$ precludes the occurrence of cusps. The continuity of $f$ implies the extension of IV' to a neighborhood of $E_{12}$. In such a neighborhood the functions $\alpha(x, y)$ and $\xi(x, y)$ exist, and the covering is said to be simple. With "$\le x_2$" replaced by "$< x_2$," condition IV' becomes the necessary condition IV.

2.8 The Hilbert Integral
A field is defined as a region $F$ of the $xy$-space with a slope function $p(x, y)$ having the following properties: (a) It is single-valued and is of class $C'$. (b) The line integral $I^*$ defined by

$$I^* \equiv \int h \, dx, \qquad h \equiv f(y') + (Y' - y') f_{y'}, \qquad y' = p(x, y) \tag{2.29}$$

where $Y'$ is the slope of the path of integration, vanishes on every closed path in the region.

Since the function $f$ must satisfy the Euler equation in order that (b) hold, it follows that the differential equation

$$y' = p(x, y) \tag{2.30}$$

is identically satisfied by a one-parameter family $y(x, \alpha)$ of extremals. By the implicit function theorem, $y = y(x, \alpha)$ can be solved for $\alpha$, to yield $\alpha(x, y)$, provided $y_\alpha(x, \alpha) \ne 0$. Then in some $xy$-region, the function $y' = p(x, y)$, obtained by the elimination of $\alpha$ from $y' = y_x(x, \alpha)$, will have the property (a). Conversely, the function $p(x, y)$ so constructed has the property (b), provided $n = 1$, as has been assumed in this chapter.

Two further properties of the Hilbert integral $I^*$ will be stated. Let $I$ be defined by

$$I \equiv \int f \, dx \tag{2.31}$$

and let $E$ and $C$, respectively, denote an extremal of the field and any curve in the field. Then

$$I(E) = I^*(E), \qquad I(C) - I^*(C) = \int_C E \, dx \tag{2.32}$$

where the last $E$ is the E-function with $Y'$ the slope of $C$ and $y'$ the slope of the field. The result follows from the definitions (2.29) and (2.8). In the proof of sufficiency we shall use two lemmas.
Lemma 3. Let $\psi = 0$ be the equation of the boundary separating two adjacent fields $\mathcal{F}_1$ and $\mathcal{F}_2$ whose union $U$ is a simply connected region. If the Hilbert integrand $h$ in the direction of the locus $\psi = 0$ is continuous across $\psi = 0$, then the Hilbert integral $I^*$ vanishes on every closed path in $U$.

The proof involves the decomposition of the closed path in $U$ into a sum of a closed path in $\mathcal{F}_1$ and a closed path in $\mathcal{F}_2$, with the cancellation on $\psi = 0$ of their respective contributions to $I^*$.

Lemma 4. The hypothesis of Lemma 3 holds if the locus $\psi = 0$ is either (1) the common extremal of the fields $\mathcal{F}_1$ and $\mathcal{F}_2$ or (2) a locus on which the extremals of $\mathcal{F}_1$ and $\mathcal{F}_2$ are joined by means of the corner condition.

PROOF. In case (1), the slope $Y'$ of $\psi = 0$ is equal to the common value $y'$ of the slopes of the two fields on $\psi = 0$. Hence $h^- = h^+ = f(y')$, in view of the second part of Eq. (2.29). In case (2), the equality $h^- = h^+$ is established by the argument of Theorem 1 of Sec. 2.4.
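The two properties (2.32) and the path independence of $I^*$ can be checked numerically on a simple field. The sketch below is an illustration only, under assumptions that are not in the text: the integrand $f = y'^2$, the field of straight lines through the origin with slope function $p = y/x$ (valid for $x > 0$), the comparison curves, and the Simpson quadrature are all chosen for convenience.

```python
def simpson(fn, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    step = (b - a) / n
    total = fn(a) + fn(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * fn(a + k * step)
    return total * step / 3.0


def f(slope):
    """Assumed integrand f = y'^2 (its extremals are straight lines)."""
    return slope * slope


def f_slope(slope):
    """Derivative of f with respect to y'."""
    return 2.0 * slope


def p(x, y):
    """Slope function of the assumed field: lines y = m x through the origin."""
    return y / x


def hilbert_integrand(x, y, Y):
    """h = f(y') + (Y' - y') f_{y'} with y' = p(x, y), cf. Eq. (2.29)."""
    s = p(x, y)
    return f(s) + (Y - s) * f_slope(s)


def e_function(x, y, Y):
    """Weierstrass E-function with field slope y' = p(x, y)."""
    s = p(x, y)
    return f(Y) - f(s) - (Y - s) * f_slope(s)


def along(curve, slope, integrand, a, b):
    """Integrate integrand(x, y(x), Y'(x)) dx along the curve y = curve(x)."""
    return simpson(lambda x: integrand(x, curve(x), slope(x)), a, b)


if __name__ == "__main__":
    # Two curves joining (1, 1) and (2, 4): a parabola and a chord.
    parabola, parabola_slope = (lambda x: x * x), (lambda x: 2.0 * x)
    chord, chord_slope = (lambda x: 3.0 * x - 2.0), (lambda x: 3.0)

    I_C = along(parabola, parabola_slope, lambda x, y, Y: f(Y), 1.0, 2.0)
    I_star_C = along(parabola, parabola_slope, hilbert_integrand, 1.0, 2.0)
    I_star_chord = along(chord, chord_slope, hilbert_integrand, 1.0, 2.0)
    E_C = along(parabola, parabola_slope, e_function, 1.0, 2.0)

    print("I*(parabola), I*(chord):", I_star_C, I_star_chord)
    print("I(C) - I*(C), integral of E:", I_C - I_star_C, E_C)
```

For this particular field $h\,dx$ is the exact differential $d(y^2/x)$, which is why the two values of $I^*$ agree; the same check along a field line such as $y = 2x$ reproduces $I(E) = I^*(E)$.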
2.9 Proof of Sufficiency
Conditions I, III'', and IV' are sufficient for a strong relative minimum. We shall show that a test-curve $E_{12}$ satisfying these conditions yields a lower value for the integral $I$ than any other admissible curve joining the end-points and lying in some neighborhood $M$. In view of the continuity of $f$, III'' and IV' imply their extensions to some $xy$-neighborhoods, say $F_1$ and $F_2$, respectively. Let $F_0$ be the neighborhood of Sec. 2.6, in which the extremal families of the set $S$ exist. In the light of Secs. 2.6 and 2.8, the neighborhood $M$ defined by

$$M = F_0 \cap F_1 \cap F_2 \tag{2.33}$$

has the following properties:

(1) $M$ is simply covered by one-parameter families of extremals.
(2) Each one of these families is a field in $M$.
(3) The boundary between any two adjoining fields is either their common extremal or the locus $\phi = 0$, on which the corner condition is satisfied by the imbedding construction.
(4) $M$ is the union of the fields of (2), and is simply connected.

Therefore, Lemmas 3 and 4 of Sec. 2.8 apply, with the conclusion that the Hilbert integral $I^*$ vanishes on every closed path in $M$. Let $E_{12}$ be the test-curve, and $C_{12}$ any other curve joining the same end-points and lying in $M$. Since $I^*(C_{12}) = I^*(E_{12})$, we can write with the aid of (2.32)

$$I(C_{12}) - I(E_{12}) = I(C_{12}) - I^*(C_{12}) + I^*(E_{12}) - I(E_{12}) = \int_{C_{12}} E \, dx \tag{2.34}$$

Now III'' implies II$_N$', as shown in Sec. 2.2. Hence $E > 0$ on $C_{12}$, and

$$I(E_{12}) < I(C_{12}) \tag{2.35}$$

We have proved the following theorem.

Theorem 3. If the locus of discontinuity of $f$ is of the form $\phi(x, y) = 0$, and if an admissible curve satisfies Ia, Ib, Ic, Id, III'', and IV', then it furnishes a strong relative minimum.

If the class of admissible curves is restricted to curves containing no more than one point of $\phi = 0$, only the $R^-R^+$ refraction is possible. Then Id can be deleted, and the rather stringent condition III'' can be relaxed to the combination of the classical II$_N$' and III'. Condition IV' is reduced to the first line of Table IV. This special case, treated by Bliss,² is covered by the following corollary.
Corollary. If the class of admissible curves is restricted to curves containing no more than one point of $\phi = 0$, and if an admissible curve satisfies Ia, Ib, Ic, II$_N$', III', and IV', then it furnishes a strong local minimum.

Generally, if the minimum is unique, and if $M$ extends over the entire $xy$-plane, then the minimum is absolute.

2.10 Numerical Example

The solution of a problem contains the following stages:

(1) Construct the central family of extremals through $(x_1, y_1)$.
(2) Construct the families $r$, $r^+$, $e^-$; determine the entry $x = \xi_1$, the critical point $x = x^*$, and the exit $x = \xi_2$.
(3) Determine the parameters $\alpha_0$ and $\xi_0$ from the end-conditions. If there is no real solution, attack the dual problem with the end-points interchanged.
(4) Test the sufficient conditions.

In example (a) of Sec. 2.0,

$$f^- = y'^2, \qquad f^+ = \tfrac{4}{3} y'^2, \qquad \phi = y - \tfrac{1}{4} x^2 - 16$$

$$x_1 = 0, \quad y_1 = 0; \qquad x_2 = 13, \quad y_2 = 61$$
We observe that (1) extremals in $R$ are straight lines, and the $B$-extremal is a parabola; (2) no tangent can be drawn to $\phi = 0$ from the terminal point. Consequently, only the $R^-R^+$ and $R^-B^-R^+$ types of extremaloid need be considered. The $R$ and the $B$ types of arc are characterized in Table V, with $m$ and $b$ appearing as constants of integration in the solution of the Euler equation. The quantity $\delta(x)$ is obtained from (2.25).

TABLE V
FAMILIES OF EXTREMALS (columns R and B)

The initial condition furnishes $b = 0$, and the central family of extremals,

$$y(x, m) = mx \tag{2.36}$$

At a point corresponding to $x = \xi$, where the family intersects the locus $\phi = 0$, the corner condition involves the following calculations:

$$h^- = -y^{-\prime 2} + 2\tilde{y}' y^{-\prime}, \qquad h^+ = \tfrac{4}{3}\left(-y^{+\prime 2} + 2\tilde{y}' y^{+\prime}\right), \qquad \tilde{y}' = \xi/2 \tag{2.37}$$
For the $R^-R^+$ transition,

$$y^{-\prime} = m = \tfrac{1}{4}\xi + 16\xi^{-1}, \qquad 0 < \xi \le 8 \tag{2.38}$$

In view of the relations $h^- = h^+$ and $\phi^{-\prime} > 0$, $\phi^{+\prime} > 0$, we then obtain

$$y^{+\prime} = \tfrac{1}{2}\xi + \left(\tfrac{7}{64}\xi^2 - 6 + 192\xi^{-2}\right)^{1/2} \tag{2.39}$$

The $r$-family of directly refracted extremals can now be written as

$$y = r(x, \xi) = \left[\tfrac{1}{2}\xi + \left(\tfrac{7}{64}\xi^2 - 6 + 192\xi^{-2}\right)^{1/2}\right](x - \xi) + \tfrac{1}{4}\xi^2 + 16, \qquad 0 < \xi \le 8 \tag{2.40}$$

The value $\xi = 8$ in (2.38) corresponds to $m = 4$ in (2.36), leading to

$$b_1: \quad y = 4x \tag{2.41}$$

for the member of the central family tangent to $\phi = 0$. For the same value of $\xi$, (2.40) yields

$$b_2: \quad y = 6x - 16 \tag{2.42}$$

for its refracted continuation. These are the boundaries $b_1$ and $b_2$, respectively, in the notation of Sec. 2.6. Since the $r$-family lies in the region $\phi > 0$, $y > 6x - 16$, it excludes the terminal point $(13, 61)$. The extremaloid type $R^-R^+$ can therefore be dismissed. The point of entry is given by

$$x = \xi_1 = 8, \qquad y = 32 \tag{2.43}$$

For the $B^-R^+$ transition, we substitute

$$y^{-\prime} = \tilde{y}' = \tfrac{1}{2}\xi \tag{2.44}$$

into (2.37), leading to

$$y^{+\prime} = \tfrac{3}{4}\xi \tag{2.45}$$

The $r^+$-family is then given by

$$y = r^+(x, \xi) = \tfrac{3}{4}\xi(x - \xi) + \tfrac{1}{4}\xi^2 + 16 = \tfrac{3}{4}\xi x - \tfrac{1}{2}\xi^2 + 16, \qquad 8 \le \xi < \infty \tag{2.46}$$

The terminal condition $y(13) = 61$ yields $\xi_0 = 12$, corresponding to the extremal

$$y = 9x - 56 \tag{2.47}$$

Thus we obtain a test-curve of the type $R^-B^-R^+$, defined by

$$y = 4x \quad (0 \le x \le 8), \qquad y = \tfrac{1}{4}x^2 + 16 \quad (8 \le x \le 12), \qquad y = 9x - 56 \quad (12 \le x \le 13) \tag{2.48}$$

(see Fig. 4).
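The arithmetic leading to (2.48) can be re-derived with a few lines of code. This is a sketch for checking purposes, not part of the original solution: the function names are illustrative, and the quadratic $2\xi^2 - 3x_2\xi + 4(y_2 - 16) = 0$ is simply the terminal condition $r^+(x_2, \xi) = y_2$ rearranged from (2.46).

```python
import math


def locus(x):
    """The locus phi = 0, i.e. y = x^2/4 + 16."""
    return 0.25 * x * x + 16.0


def locus_slope(x):
    """Slope ytilde' = x/2 of the locus."""
    return 0.5 * x


def h_minus(slope, xi):
    """h^- of Eq. (2.37) for f^- = y'^2."""
    return -slope ** 2 + 2.0 * locus_slope(xi) * slope


def h_plus(slope, xi):
    """h^+ of Eq. (2.37) for f^+ = (4/3) y'^2."""
    return (4.0 / 3.0) * (-slope ** 2 + 2.0 * locus_slope(xi) * slope)


def r_plus(x, xi):
    """Member of the r^+ family, Eq. (2.46), leaving the locus at x = xi."""
    return 0.75 * xi * (x - xi) + locus(xi)


def refraction_points(x2, y2):
    """Values xi >= 8 for which r_plus(x2, xi) = y2."""
    disc = 9.0 * x2 * x2 - 32.0 * (y2 - 16.0)
    if disc < 0.0:
        return []
    root = math.sqrt(disc)
    candidates = [(3.0 * x2 - root) / 4.0, (3.0 * x2 + root) / 4.0]
    return [xi for xi in candidates if xi >= 8.0]


def minimizing_curve(x, xi0=12.0):
    """Piecewise curve of Eq. (2.48): line, parabola, line."""
    if x <= 8.0:
        return 4.0 * x
    if x <= xi0:
        return locus(x)
    return r_plus(x, xi0)


if __name__ == "__main__":
    print("refraction point(s):", refraction_points(13.0, 61.0))
    for xi in (8.0, 12.0):
        # Corner condition h^- = h^+ with y^-' = xi/2 and y^+' = 3 xi/4.
        print(xi, h_minus(0.5 * xi, xi), h_plus(0.75 * xi, xi))
    print("end-point ordinate:", minimizing_curve(13.0))
```

The second root of the quadratic, $\xi = 7.5$, is rejected because the $r^+$ family is defined only for $\xi \ge 8$.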
To test the sufficient conditions, we note that

(1) On the $B^-$-subarc, $\lambda > 0$ and $\delta > 0$. Therefore Ic and Id are satisfied.
(2) III'' is satisfied, since $f^-_{y'y'} = 2$ and $f^+_{y'y'} = \tfrac{8}{3}$.
(3) IV' is satisfied in virtue of the following relations:

$$\begin{aligned}
y_m(x, 4) &= x \ne 0, & 0 < x < 8 \\
r_\xi(8, 8) &= -2 \ne 0, & x = \xi_1 = 8 \\
r^+_\xi(\xi, \xi) &= -\tfrac{1}{4}\xi \ne 0, & 8 \le x = \xi \le 12 \\
r^+_\xi(x, 12) &= \tfrac{3}{4}x - 12 \ne 0, & 12 \le x \le 13
\end{aligned} \tag{2.49}$$

A strong relative minimum is thus assured. This minimum is unique but is not absolute, for the family $r^+(x, \xi)$ in (2.46) has an envelope.

FIG. 4. The minimizing curve (solid line).
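The last two relations of (2.49), and the envelope mentioned above, can be examined directly from (2.46). The sketch below is illustrative only: the helper names and the finite-difference step are assumptions, and the envelope is obtained by the usual elimination of $\xi$ between $r^+ = y$ and $r^+_\xi = 0$.

```python
def r_plus(x, xi):
    """r^+ family of Eq. (2.46): y = (3/4) xi x - (1/2) xi^2 + 16."""
    return 0.75 * xi * x - 0.5 * xi * xi + 16.0


def d_r_plus_d_xi(x, xi, step=1.0e-6):
    """Central-difference estimate of the partial derivative r^+_xi."""
    return (r_plus(x, xi + step) - r_plus(x, xi - step)) / (2.0 * step)


def envelope_contact(xi):
    """Abscissa where r^+_xi = (3/4) x - xi vanishes for the member xi."""
    return 4.0 * xi / 3.0


def envelope(x):
    """Envelope of the family: the member xi = (3/4) x evaluated at x."""
    return r_plus(x, 0.75 * x)


if __name__ == "__main__":
    # On the final arc (xi0 = 12, 12 <= x <= 13) the derivative stays nonzero.
    for x in (12.0, 12.5, 13.0):
        print(x, d_r_plus_d_xi(x, 12.0), 0.75 * x - 12.0)
    # The member xi0 = 12 touches the envelope only beyond the end-point x2 = 13.
    print("contact abscissa:", envelope_contact(12.0))
    print("envelope there:", envelope(16.0), "member there:", r_plus(16.0, 12.0))
```

Under these formulas the envelope is the parabola $y = \tfrac{9}{32}x^2 + 16$, and the member through the terminal point reaches it at $x = 16$, outside the interval of the test-curve.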
2.11 Discussion of the Results

A solution of the refraction problem in case (a) has been obtained under the following restrictions:

(1) There is only one dependent variable $y$.
(2) An admissible curve contains no more than one refraction corner.
(3) III'' is satisfied.

We shall discuss the possibility of the removal of these restrictions. If (1) is removed, the sufficient conditions must include the vanishing of the Hilbert integral $I^*$ on every closed path in the $M$-neighborhood of $E_{12}$. Restriction (2) can be removed if the $RB$ refraction does not occur on $E_{12}$ of either the given problem or the dual problem. Condition III'' can be regarded as a combination of III' with the requirement

$$\lim f_{y'y'} \ne 0 \quad \text{as } y' \to \pm\infty$$

The latter requirement can be removed if the problem can be expressed in parametric form; e.g., $f = p(x, y)(1 + y'^2)^{1/2}$. The entire condition III'' can be replaced by the combination of the weaker conditions III' and II$_N$', if the class of admissible curves is restricted to curves containing no more than one point of $\phi = 0$.

PART B: CASE (b)

2.12 Conditions Ia, Ib, II, and III
The necessary conditions for a strong relative minimum are I, II, III, IV; the sufficient conditions comprise I, II$_N$', IV'. Condition I contains Ia, Ib, Ic, and Id, the last two of which were discussed in Sec. 2.1. Conditions Ia and Ib, derived in the previous chapter for case (b), assume the forms

Ia    (2.50)

Ib    $\Delta(F - y' F_{y'}) = 0, \qquad \Delta F_{y'} = 0$    (2.51)

Since $F = f + \lambda\phi$ with $\lambda\phi = 0$, Eqs. (2.50) and (2.51) can be written

Ia    (2.52)

Ib    $\Delta[f - y'(f_{y'} + \lambda\phi_{y'})] = 0, \qquad \Delta(f_{y'} + \lambda\phi_{y'}) = 0$    (2.53)

It is understood that at the discontinuity, the derivatives of $f$ with respect to its arguments are one-sided.
Condition II of the previous chapter is retained here, with the E-function defined by

$$E = F(Y') - F(y') - (Y' - y') F_{y'} \tag{2.54}$$

Condition III requires that $F_{y'y'} \ge 0$ for all $(x, y, y')$ belonging to the curve. Conditions II$_N$', IV, and IV' are discussed in Secs. 2.16 and 2.19.

2.13 Preliminary Considerations
At a refraction corner, with $\Delta y' \ne 0$,

$$\Delta\phi = \phi_{y'} \Delta y' \ne 0, \qquad \phi^-\phi^+ \le 0 \tag{2.55}$$

The equality in the second part of (2.55) holds only in the case of the critical refraction. Thus, in contrast to case (a), where $\phi = 0$ and is continuous at a corner, $\phi$ is generally discontinuous in case (b), and need not go through the value zero. Since $y'$ is not prescribed at $x_1$ and $x_2$, it is not possible to ascertain a priori whether the end-points lie in $\phi < 0$, $\phi = 0$, or $\phi > 0$. In addition to the six transitions of Sec. 2.3, we must also consider the six reversed transitions $R^+R^-$, etc. Altogether, there are eight basic types of singly refracted extremaloids, listed in Table VI. If the structural formula is truncated from either end or from both, there result 32 degenerate types, such as $R^+B^+R^-$, $R^+B^-R^+$, $R^+R^-$, $B^+R^+$, $R^-$, which occur in the example of Sec. 2.20.

TABLE VI
EXTREMALOID TYPES

$B^-R^-R^+B^+$        $B^+R^+R^-B^-$
$B^-R^-B^+R^+$        $B^+R^+B^-R^-$
$R^-B^-B^+R^+$        $R^+B^+B^-R^-$
$R^-B^-R^+B^+$        $R^+B^+R^-B^-$

Refraction is governed by the relations

$$\Delta(F - y' F_{y'}) = 0, \qquad \Delta F_{y'} = 0, \qquad \lambda^+\phi^+ = 0, \qquad \phi^-\phi^+ \le 0 \tag{2.56}$$

The first two equations constitute the corner condition (2.51); the third is a consequence of the general identity $\lambda\phi = 0$; the last is in (2.55). For a given extremal $y(x)$, the system (2.56) furnishes three equations, from which the three unknowns $x = \xi$, $y^{+\prime}$, $\lambda^+$ are to be determined. For refraction into the region, $\lambda^+ = 0$; for refraction into the interface, $\phi^+ = 0$ and a nonzero $\lambda^+$ is admitted. The Jacobian determinant $J$ of the left-hand members of the first three equations of (2.56) with respect to $x$, $y^{+\prime}$, $\lambda^+$ can be expressed in the form

$$J = \Omega H^+ \tag{2.57}$$

where $H$ is the Hilbert determinant (1.67) of the previous chapter, and the Carathéodory function $\Omega$ is defined by

$$\Omega \equiv \Delta F_x + y^{-\prime} F_y^+ - y^{+\prime} F_y^- \tag{2.58}$$

The derivation of (2.57) is based on the formulas

$$\lim_{x \to \xi - 0} \frac{d}{dx} F(x, y(x), y^{+\prime}) = (F_x)^+ + y^{-\prime} F_y^+, \qquad \frac{d}{dx}(F - y' F_{y'}) = F_x, \qquad \frac{d}{dx} F_{y'} = F_y \tag{2.59}$$

holding along an extremal. Except for points of entry, where $\lambda^+ = \phi^+ = 0$, the determinant $H^+$ does not vanish. Thus

$$\Omega \ne 0 \tag{2.60}$$

is a sufficient condition that a solution of (2.56) exist. As in case (a), the solution may be imaginary. The question of the existence of a real refracted continuation will be investigated next.

2.14 Zermelo Diagram
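To simplify the analysis, let $\phi(x, y, y')$ be written

$$\phi = y' - \tilde{y}'(x, y) \tag{2.61}$$

Then the Zermelo curve $f(y')$, with $(x, y)$ fixed, has a left branch $f^-(y')$ and a right branch $f^+(y')$, defined on the intervals $-\infty < y' \le \tilde{y}'$ and $\tilde{y}' \le y' < \infty$, with the jumps

$$\Delta f = f^+(\tilde{y}') - f^-(\tilde{y}'), \qquad \Delta f_{y'} = f^+_{y'}(\tilde{y}') - f^-_{y'}(\tilde{y}') \tag{2.62}$$

Let $\mathcal{E}$ designate the special form of the E-function of (2.54) defined by

$$\mathcal{E}(y', Y') = f(Y') - f(y') - (Y' - y') f_{y'}, \qquad \begin{cases} -\infty < y' \le \tilde{y}' \le Y' < \infty & \text{if } \Delta f \le 0 \\ -\infty < Y' \le \tilde{y}' \le y' < \infty & \text{if } \Delta f \ge 0 \end{cases} \tag{2.63}$$

Clearly, $\mathcal{E}$ is the distance from the line tangent to the curve $f(y')$ at $y'$, measured vertically to the curve at $Y'$.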
A line is said to support a curve if it contains two points of the curve and does not cross it. If $f(y')$ is supported from below at points $y_1'$ and $y_2'$ such that

$$-\infty < y_1' \le \tilde{y}' \le y_2' < \infty \tag{2.64}$$

then the following relations hold:

$$\min_{Y'} \mathcal{E}(y_1', Y') = \mathcal{E}(y_1', y_2') = 0 \qquad \text{if } y_1' < \tilde{y}' \le y_2' \text{ and } \Delta f \le 0 \tag{2.65.1}$$

and

$$\min_{Y'} \mathcal{E}(y_2', Y') = \mathcal{E}(y_2', y_1') = 0 \qquad \text{if } y_1' \le \tilde{y}' < y_2' \text{ and } \Delta f \ge 0 \tag{2.65.2}$$

It follows from (2.64) and (2.65) that the following relations hold at $y_1'$ and $y_2'$:

$$\begin{aligned} f_{y_1'} &= f_{y_2'} & &\text{if } y_1' < \tilde{y}' < y_2' \\ f_{y_1'} &\le f_{y_2'} & &\text{if } (y_1' - \tilde{y}')(y_2' - \tilde{y}') = 0 \end{aligned} \tag{2.66}$$

In the first part of Eq. (2.66), the supporting line is a double tangent with $y_1'$ and $y_2'$ as the points of contact; in the second part of Eq. (2.66), one of the points of support is the point $\tilde{y}'$ of the discontinuity of $f$. A remarkable property of the numbers $y_1'$ and $y_2'$ is described in the following theorem.
Theorem 4. If the curve $f(y')$ with a jump $\Delta f$ is supported from below at $y_1'$ and $y_2'$, then there exist numbers $\lambda_1$ and $\lambda_2$ satisfying $\lambda\phi = 0$ and the convexity condition Ic, and together with $y_1'$ and $y_2'$ satisfying the corner condition.

PROOF. With $(x, y)$ fixed, consider the system of four equations

$$\begin{aligned} f(y_1') - y_1'(f_{y_1'} + \lambda_1) &= f(y_2') - y_2'(f_{y_2'} + \lambda_2) \\ f_{y_1'} + \lambda_1 &= f_{y_2'} + \lambda_2 \\ \lambda_1(y_1' - \tilde{y}') &= 0, \qquad \lambda_2(y_2' - \tilde{y}') = 0 \end{aligned} \tag{2.67}$$

The first two equations constitute the corner condition (2.53) with $y^{-\prime} = y_1'$, $y^{+\prime} = y_2'$, $\lambda^- = \lambda_1$, $\lambda^+ = \lambda_2$, and with $\phi_{y'} = 1$ in consequence of (2.61); the last two equations follow from the general identity $\lambda\phi = 0$. We shall show that (2.67) is satisfied by the points of support, and that $\lambda_1$ and $\lambda_2$ determined by (2.67) satisfy Ic. Four cases are distinguished:

(1) $y_1' < \tilde{y}' < y_2'$:                          $\lambda_1 = \lambda_2 = 0$
(2) $y_1' < \tilde{y}' = y_2'$, $\Delta f < 0$:          $\lambda_1 = 0$, $\lambda_2 < 0$
(3) $y_1' = \tilde{y}' < y_2'$, $\Delta f > 0$:          $\lambda_1 > 0$, $\lambda_2 = 0$
(4) $y_1' = \tilde{y}' = y_2'$, $\Delta f = 0$:          $0 < \lambda_1 < \Delta f_{y'}$, $\lambda_2 = \lambda_1 - \Delta f_{y'} < 0$

In case (1), $f_{y_1'} = f_{y_2'}$ by the first part of Eq. (2.66), so that the supporting line becomes a double tangent. In view of (2.65), the points of contact satisfy (2.67) with $\lambda_1 = \lambda_2 = 0$. In case (2), $f_{y_1'} \le f_{y_2'}$ by the second part of Eq. (2.66).
[Figure: panels labeled case 1, case 2, and case 4.]

2.15 Corner Manifolds
If the hypothesis of Theorem 4 is satisfied, then to every $(x, y)$ correspond real $(y_1', \lambda_1)$ and $(y_2', \lambda_2)$ such that $\lambda\phi = 0$. The functions $y_i'(x, y)$ and $\lambda_i(x, y)$ for $i = 1, 2$ can be determined as the solution of (2.67). If these functions satisfy the relations

then the two sets $M_1(x, y, y_1', \lambda_1)$ and $M_2(x, y, y_2', \lambda_2)$ are designated as corner manifolds. They can be used to determine the points of entry $\xi_1$, the points of exit $\xi_2$, and the points of refraction $\xi^*$ for a family of extremals $y = y(x, \alpha)$, $\lambda = \lambda(x, \alpha)$. As shown in the previous chapter for case (b), $\lambda$ vanishes at $x = \xi_1$ and $x = \xi_2$. The following cases arise:

(1) If there exists a $\xi_1$ such that

$$y_x(\xi_1, \alpha) = \tilde{y}'(\xi_1, y(\xi_1, \alpha)) = y_i', \qquad 0 = \lambda(\xi_1, \alpha) = \lambda_i \tag{2.71}$$

then the first part of Eq. (2.71) yields $\xi_1 = \xi_1(\alpha)$, which defines the entry locus of the family.

(2) If the relations

$$y_x(x, \alpha) = \tilde{y}'(x, y(x, \alpha)) = y_i', \qquad \lambda(x, \alpha) = \lambda_i \ne 0 \tag{2.72}$$

hold on some interval of $x$, and if there exists $\xi_2$ such that

$$\lambda(\xi_2, \alpha) = 0 \tag{2.73}$$

then (2.73) yields $\xi_2 = \xi_2(\alpha)$, which defines the exit locus of the family.

(3) If there exists a $\xi^*$ such that

$$y_x = y_i' \ne \tilde{y}', \qquad \lambda = \lambda_i = 0 \tag{2.74}$$

or

$$y_x = y_i' = \tilde{y}', \qquad \lambda = \lambda_i \ne 0 \tag{2.75}$$

we say that the family intersects the corner manifold $M_i$. The intersection yields $\xi^* = \xi^*(\alpha)$, which defines the corner locus $C$ of the family; its $xy$-projection can be written in the form

$$\psi(x, y) = 0 \tag{2.76}$$

Cases (2.74) and (2.75) are distinguished as refraction on $R$- and $B$-subarcs, respectively.

2.16 Conditions II' and II$_N$'
A weakened form of II', used here, admits $E = 0$ for $Y' \ne y'$ at the corners of the curve. Indeed, with $x = \xi$, $y' = y^{-\prime}$, and $Y' = y^{+\prime}$, the E-function of (2.54) vanishes in virtue of the corner condition (2.51). Its left- and right-hand derivatives $E'$ along an extremal assume the values $\pm\Omega$ at the corner, with $\Omega$ defined by (2.58). The corresponding mathematical statements

$$\begin{aligned} E(y^{-\prime}, y^{+\prime}) &= 0 \\ E^{-\prime} &= \lim_{x \to \xi - 0} \frac{d}{dx} E(y', y^{+\prime}) = \Omega \\ E^{+\prime} &= \lim_{x \to \xi + 0} \frac{d}{dx} E(y^{-\prime}, y') = -\Omega \end{aligned} \tag{2.77}$$

are derived with the aid of (2.59). In order that II' hold it is necessary that

$$\Omega \le 0 \tag{2.78}$$

Two cases arise:

(1) If $\Omega < 0$, then (2.77) implies that $E > 0$ in a neighborhood of $x = \xi$ on the refracted continuation of the extremal, and that $E$ becomes negative for the trivial continuation $\Delta y' = 0$. More generally, let $\Omega(x)$ be defined on the extremal, with $y^{-\prime}$ replaced by $y'$ in (2.58), and let $\Omega^{(p)}$ be the lowest nonvanishing derivative of $\Omega(x)$ at $x = \xi$. Then the requirement

$$(-1)^p \Omega^{(p)} < 0 \tag{2.79}$$

includes $\Omega < 0$ for the special case $p = 0$, and implies the same conclusion.

(2) If $\Omega(x) = 0$ on some interval of $x$, then the corner condition does not yield a solution for $\xi$. Indeed, at every point of an extremal subarc in the corner locus $\psi = 0$, the extremal splits into a refracted and a trivial continuation, which both satisfy $E > 0$ in the neighborhood of $x = \xi$.

The question of the existence and uniqueness of the refracted continuation can now be answered.
Theorem 5. Let the curve $f(y')$ be supported from below by one and only one line. With the corner manifolds designated as $M_1$ and $M_2$, let an extremal intersect $M_1$ at $x = \xi^*$. If conditions I and II' are satisfied for $x \le \xi^*$, then they are also satisfied on the refracted continuation $x \ge \xi^*$, determined by the transition $M_1 \to M_2$. If $\Omega < 0$, the continuation satisfies II' uniquely. The theorem remains valid if the subscripts 1 and 2 are interchanged.

PROOF. The $M_1 \to M_2$ transition is characterized by

$$y^{-\prime} = y_1', \quad \lambda^- = \lambda_1, \qquad y^{+\prime} = y_2', \quad \lambda^+ = \lambda_2 \tag{2.80}$$

Condition Ia is satisfied by the solution of the Euler equation with the initial conditions provided by $(x, y, y^{+\prime}, \lambda^+)$. The corner condition Ib and the convexity condition Ic are satisfied by Theorem 4. Condition Id, Eq. (2.4), holds if a corner occurs on a $B$-subarc, inasmuch as the curve $f(y')$ is supported from below. Observe that $E_{12}$ satisfies II' if and only if the tangent to the curve $f(y')$ at $y'$ lies entirely below the curve for all $(x, y, y')$ of $E_{12}$. In view of (2.70), the condition is equivalent to the requirement

$$\chi \equiv (y' - y_1')(y' - y_2') \ge 0 \tag{2.81}$$

which excludes the interval $(y_1', y_2')$ from the domain of $y'$. Since $\chi$ is positive initially and vanishes at corners, $\Omega \le 0$ implies that $\chi \ge 0$ between corners, so that II' is satisfied.

A unilateral $N$-neighborhood of $(x, y, y', \lambda)$ on $E_{12}$ is defined as one whose $xyy'$-projection lies entirely in $\chi \ge 0$. Condition II$_N$' will be stated as follows.

A curve $E_{12}$ is said to satisfy condition II$_N$' if there exists a unilateral $N$-neighborhood of $(x, y, y', \lambda)$ on $E_{12}$ such that $E > 0$ for all $(x, y, y', \lambda)$ in $N$ that satisfy $\lambda\phi = 0$, and for all $Y' \ne y'$.

The assumption that there be but one supporting line is removed in Sec. 2.17.

2.17 Free Corners

Generally, $f$ is not a convex function of $y'$. As such, it has a supporting line for each pair of consecutive zeros of $f_{y'y'}$. Let the points of support from below be designated by $y'_{1j}$ and $y'_{2j}$ for $j = 1, 2, \ldots, m$. If $y'_{1j}$ and $y'_{2j}$ belong to the same branch of $f$, the corresponding corners are said to be free. For such corners, the results of Secs. 2.14-2.16 remain valid with the following modifications. Theorem 4 reduces to one case, $(y_1' - \tilde{y}')(y_2' - \tilde{y}') \ne 0$, which corresponds to $R^-R^-$ and $R^+R^+$ transitions. The points $\xi_j$ of these transitions belong to some corner locus $C_j$ and the corner manifolds $M_{1j}$ and $M_{2j}$. In the proof of Theorem 5, (2.81) is replaced by

$$\chi_j \equiv (y' - y'_{1j})(y' - y'_{2j}) \ge 0, \quad j = 1, 2, \ldots, m; \qquad \chi = \chi_1 \cap \chi_2 \cap \cdots \cap \chi_m \tag{2.82}$$

(see Fig. 6). The restriction (2.61) that $\phi = 0$ have only one real root can also be removed; the case of several roots $\tilde{y}_k'$ can be treated as in (2.82). The subject of free corners has been studied by Bolza, Reid, Graves, and others.
2.18 A Special Case

Free corners do not occur, and the testing of II$_N$' is considerably simplified, if we impose condition III'' of Sec. 2.2. This condition implies the following consequences. By differentiation of (2.63) we obtain

$$\mathcal{E}_{Y'} = f_{Y'} - f_{y'}, \qquad \mathcal{E}_{Y'Y'} = f_{Y'Y'}, \qquad \mathcal{E}_{y'} = -(Y' - y') f_{y'y'} \tag{2.83}$$

and deduce the following properties of $\mathcal{E}$: (1) $\mathcal{E}$ is continuous in $Y'$; (2) in view of III'', the second part of Eq. (2.83), and Eq. (2.61), $\mathcal{E}_{Y'Y'} > 0$ for all $Y'$, including $Y' = \pm\infty$. Therefore $\mathcal{E}(Y')$ has one and only one minimum $m$, occurring at some value $Y' = y^{*\prime}$. Accordingly, let $m(y')$ be defined by

$$m(y') = \min_{Y'} \mathcal{E}(y', Y') = \mathcal{E}(y', y^{*\prime}) \tag{2.84}$$

The following properties of $m(y')$ are to be noted: (1) $m(y')$ is continuous; (2) in view of the second part of Eq. (2.83), (2.63), and III'', $m(y')$ is monotonic. A line supporting the curve $f(y')$ from below at $y_1'$ and $y_2'$ is characterized by

$$\begin{aligned} m(y_1') &= 0, & y^{*\prime} &= y_2' & &\text{if } \Delta f \le 0 \\ m(y_2') &= 0, & y^{*\prime} &= y_1' & &\text{if } \Delta f \ge 0 \end{aligned} \tag{2.85}$$

The minimum $m$ occurs either in the interior of the domain of $Y'$ or on its boundary $Y' = \tilde{y}'$. The two cases,

$$\mathcal{E}_{Y'} = f_{y_2'} - f_{y_1'} \begin{cases} = 0 & \text{if } y_1' < \tilde{y}' < y_2' \\ \ge 0 & \text{if } (y_1' - \tilde{y}')(y_2' - \tilde{y}') = 0 \end{cases} \tag{2.86}$$

are in accord with (2.66).

FIG. 6. Supporting line: free corner.
Theorem 6. Condition III'' is both necessary and sufficient in order that the curve $f(y')$ with a jump discontinuity have one and only one supporting line from below, with the corresponding corner manifolds $M_1$ and $M_2$.

PROOF. If $\Delta f \le 0$ in (2.63), the mean value theorem leads to

$$\mathcal{E}(y', Y') = f^+(Y') - f^-(Y') + \tfrac{1}{2}(Y' - y')^2 f^-_{y'y'}[y' + \theta(Y' - y')], \qquad 0 < \theta < 1 \tag{2.87}$$

Condition III'' and (2.84) respectively imply $\mathcal{E}(-\infty, Y') = \infty$ and $m(-\infty) = \infty$. Since

$$\mathcal{E}(\tilde{y}', \tilde{y}') = \Delta f \le 0 \tag{2.88}$$

and since (2.84) implies

$$m(\tilde{y}') \le \mathcal{E}(\tilde{y}', \tilde{y}') \tag{2.89}$$

we conclude that $m(\tilde{y}') \le 0$. Since the continuous and monotonic function $m(y')$ changes sign in the domain $(-\infty, \tilde{y}')$, there exists a $y_1'$ such that $m(y_1') = 0$, and the corresponding $y_2'$ in the second part of Eq. (2.85). A similar argument applies to the case of $\Delta f \ge 0$.

Theorem 7. If $E_{12}$ satisfies III'' and $\chi(x_1) > 0$, and if it has a corner at every intersection with the corner manifolds for $\Omega < 0$, then it satisfies II' and II$_N$'.

The proof depends on the fact that (2.81) implies II', and the combination of III'' and II' implies II$_N$' by an argument similar to the one in Sec. 2.2.

2.19 The Imbedding Construction
Extremals that issue from $(x_1, y_1)$ comprise a central family $y(x, \alpha)$ of R-extremals and one B-extremal, to be designated as $B_1$. The latter is either $B^-$ or $B^+$, depending on which one satisfies Id. For these extremals, the corner manifolds $M_1$ and $M_2$ furnish the C-locus, the entry locus $E_1$, the exit locus $E_2$, and the corresponding $\xi_2$, $\xi_1$, $\xi$. With $\xi_1(\alpha)$ known, the B-subarcs are constructed as follows. Let $y = g(x, \beta)$ and $\lambda = \lambda(x, \gamma)$ be the solutions of the differential equations $\phi = 0$ and the second part of (2.52), respectively. Since $y$ and $\lambda$ are continuous at $x = \xi_1$, the parameters $\beta$ and $\gamma$ are determined from the equations

$$y(\xi_1(\alpha), \alpha) = g(\xi_1(\alpha), \beta), \qquad \lambda(\xi_1(\alpha), \gamma) = 0 \tag{2.90}$$

leading to $\beta = \beta(\alpha)$ and $\gamma = \gamma(\alpha)$. The R-extremals of the family $y(x, \alpha)$ that intersect the C-locus at $x = \xi_2(\alpha)$ generate a family of refracted extremals. The union of the two families constitutes a central family $y(x, \alpha)$ of extremaloids, with $y(x, \alpha)$ assuming different functional representations on various subarcs.
2.
DISCONTINUITIES IN A VARIATIONAL PROBLEM
55
The splitting of extremals occurs in two cases described below. (1) On the $B_1$-extremaloid through $(x_1, y_1)$, the multiplier $\lambda$ is obtained from the second part of (2.52) in the form $\lambda = \lambda(x, \xi)$, where $\xi$ is arbitrary. Hence at every point of $B_1$ for which there exists a $\xi$ such that $\lambda = 0$ and $\lambda' > 0$ an extremal exits into R. The set of such points belongs to the exit locus $E_2$, generating a family $e(x, \xi)$, which is enveloped by $B_1$. Without any loss of generality, $\xi$ can be taken as the value of $x$ at the point of exit. (2) If an extremal subarc belongs to the C-locus, then at every point $x = \xi$ of the subarc there exists a refracted continuation, generating a family $r(x, \xi)$. Such points are characterized by $\Omega = 0$, as noted in Sec. 2.16. Hereafter we shall assume that $\Omega \ne 0$ on $E_{12}$. Then $E_{12}$ has no subarc in the C-locus $\Omega = 0$, and is embedded in its entirety in a family of extremaloids.

It appears that the boundary between two adjacent families is one of the loci C, $E_1$, and $E_2$. It is to be noted further that the corner condition is satisfied on the C-locus, and is trivially satisfied on $E_1$ and $E_2$. The union of the families $y(x, \alpha)$, $y(x, \xi)$ is simply-connected.

Condition IV′ requires that $E_{12}$ contain no points belonging to the envelopes, distinct from $B_1$, of the families of extremals that imbed $E_{12}$. If $E_{12}$ is imbedded for $\alpha = \alpha_0$ in the central family $y(x, \alpha)$ of extremaloids, IV′ takes the form (2.91); if $E_{12}$ is also imbedded for $\xi = \xi_0$ in a noncentral family, then the additional relations of the form (2.92) must be satisfied on the appropriate intervals of $x$. With "$< x_2$" in (2.91) and (2.92) replaced by "$\le x_2$", IV′ becomes the necessary condition IV. In view of the assumed continuity of $f$, IV′ implies its extension to a neighborhood of $E_{12}$. In such a neighborhood, the functions $\alpha(x, y)$, $\xi(x, y)$ exist, and the covering of the neighborhood by the extremals is simple.

2.20 Proof of Sufficiency
The sufficient conditions for a strong relative minimum comprise I, II$_N$′, and IV′. We shall show that a test-curve satisfying these conditions yields a lower value for the integral $I$ than any other admissible curve joining the end-points and lying in some neighborhood $M$.

Let $N_1$ be the $xy$-projection of the N-neighborhood in which II$_N$′ holds; let $N_2$ be the neighborhood in which IV′ holds. Then $M$ defined by

$$M = N_1 \cap N_2 \tag{2.93}$$

has the following properties:
(1) $M$ is simply covered by one-parameter families of extremals.
(2) These families are fields in $M$.
(3) The boundary between any two adjoining fields in $M$ is one of the loci C, $E_1$, $E_2$, on which the corner condition is satisfied by the imbedding construction.
(4) $M$ is the union of the fields of (2) and is simply-connected.
(5) On every curve in $M$, the $\mathcal{E}$-function, calculated with $y'$ the slope of the field and $Y'$ the slope of the curve, is positive for all $Y' \ne y'$.

The rest of the proof is identical with that of Sec. 2.9. We have proved the following theorem.
Theorem 8. If the locus of the discontinuity of $f$ is of the form $\phi(x, y, y') = 0$, and if an admissible curve satisfies the conditions Ia, Ib, Ic, Id, II$_N$′, and IV′, then it furnishes a strong relative minimum. If the minimum is unique and if $M$ extends over the entire $xy$-plane, then the minimum is absolute.

2.21 Numerical Example

The solution of a problem contains the following stages:
(1) Construct the central family of R-extremals and the $B_1$-extremal passing through $(x_1, y_1)$.
(2) Construct the corner manifolds $M_1$ and $M_2$, and the loci C, $E_1$, and $E_2$.
(3) Construct the continuations of the initial subarcs of step (1), and determine the parameters $\alpha_0$ and $\xi_0$ corresponding to an admissible curve.
(4) Test the sufficient conditions.

In example (b) of Sec. 2.0,

$$f^- = \tfrac{1}{4}y'^2 + y', \qquad f^+ = \tfrac{1}{4}y'^2 + 4, \qquad \phi = y' - x$$

$$x_1 = 0, \quad y_1 = 0; \qquad x_2 = 9, \quad y_2 = 54$$

Observe that extremals in R are straight lines, and extremals in B are parabolas. The R and B types of extremal are characterized in Table VII, with $m$, $b$, $\beta$, $\gamma$ appearing as constants of integration in the solution of the Euler equations. The quantity $S(x)$ is defined in (2.25); the $\pm$ sign refers to $B^+$ and $B^-$ arcs, respectively.
TABLE VII. Families of extremals

The initial condition yields $b = 0$, $\beta = 0$. Hence the central family of R-extremals is defined by

$$y(x, m) = mx, \quad \lambda = 0, \qquad \begin{cases} 0 < x < m & \text{if } m > 0 \ (R^+) \\ 0 \le x < \infty & \text{if } m \le 0 \ (R^-) \end{cases} \tag{2.94}$$

the $B_1$-extremal is

$$y = \tfrac{1}{2}x^2, \quad \lambda = \tfrac{1}{2}(\xi - x), \qquad 0 < x \le \xi \tag{2.95}$$

with $\gamma = \xi$, and $\lambda(x) \ge 0$. Since $F^-_{y'y'} = F^+_{y'y'} = \tfrac{1}{2}$, condition III″ holds. By Theorem 6, there exists a unique supporting line, with corner manifolds $M_1$ and $M_2$. The latter are determined by the corner condition (2.67). With $\bar{y}' = x$, (2.67) becomes
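$$\begin{aligned} -\tfrac{1}{4}y_1'^2 - \lambda_1 y_1' &= -\tfrac{1}{4}y_2'^2 + 4 - \lambda_2 y_2' \\ \tfrac{1}{2}y_1' + 1 + \lambda_1 &= \tfrac{1}{2}y_2' + \lambda_2 \\ \lambda_1(y_1' - x) &= 0, \qquad \lambda_2(y_2' - x) = 0 \end{aligned} \tag{2.96}$$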
The solution of (2.96) appears in Table VIII and in Fig. 7.

TABLE VIII. The corner manifolds

Domain of        M1                               M2
definition       λ1              y1′              λ2              y2′
5 < x < ∞        0               x − 2√(x − 4)    1 − √(x − 4)    x
3 ≤ x ≤ 5        0               3                0               5
0 ≤ x ≤ 3        √(4 − x) − 1    x                0               x + 2√(4 − x)
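The entries of Table VIII can be checked by direct substitution into the corner condition. The following is a minimal sketch, not part of the original text: it assumes the integrands $f^- = \tfrac{1}{4}y'^2 + y'$, $f^+ = \tfrac{1}{4}y'^2 + 4$ and the four relations of (2.96) in the form shown in the comments. The function names `corner_manifolds` and `corner_residuals` are hypothetical.

```python
import math

def corner_manifolds(x):
    """Return (lam1, y1p, lam2, y2p) as tabulated in Table VIII, for x >= 0."""
    if x > 5.0:
        s = math.sqrt(x - 4.0)
        return 0.0, x - 2.0 * s, 1.0 - s, x
    if x >= 3.0:
        return 0.0, 3.0, 0.0, 5.0          # free corner
    s = math.sqrt(4.0 - x)
    return s - 1.0, x, 0.0, x + 2.0 * s

def corner_residuals(x):
    """Residuals of the four relations assumed for (2.96); all should vanish."""
    lam1, y1p, lam2, y2p = corner_manifolds(x)
    return (
        # -y1'^2/4 - lam1*y1' = -y2'^2/4 + 4 - lam2*y2'
        (-0.25 * y1p**2 - lam1 * y1p) - (-0.25 * y2p**2 + 4.0 - lam2 * y2p),
        # y1'/2 + 1 + lam1 = y2'/2 + lam2
        (0.5 * y1p + 1.0 + lam1) - (0.5 * y2p + lam2),
        # lam1*(y1' - x) = 0 and lam2*(y2' - x) = 0
        lam1 * (y1p - x),
        lam2 * (y2p - x),
    )

if __name__ == "__main__":
    for x in (0.0, 1.5, 3.0, 4.0, 5.0, 6.5, 8.0):
        print(x, max(abs(r) for r in corner_residuals(x)))
```

The three branches join continuously at $x = 3$ and $x = 5$, which is why the sample points include both values.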
FIG.7. The corner manifolds.
It can be shown that the initial subarcs of (2.94) and (2.95) lead to five types of extremaloid, described in Table IX. We shall confine our analysis to the case $m > 5$. It is seen from the figure that the corresponding extremaloid is $R^+B^+R^-$, and that the refraction occurs in the sense $M_2 \to M_1$. The entry locus $E_1$ is obtained from Eqs. (2.71) and (2.94), with $y' = m$ and $y_2' = \bar{y}' = x$. In parametric form, the result appears as

$$x = \xi_1 = m, \qquad y = \eta_1 = m^2 \tag{2.97}$$

As seen from Table VII, the $B^+$-subarc generated by the $R^+B^+$ transition is of the form

$$y = \tfrac{1}{2}(x^2 + \beta), \qquad \lambda = \tfrac{1}{2}(\gamma - x) \tag{2.98}$$

The parameters $\beta$ and $\gamma$ are determined from Eq. (2.90) with $g = \tfrac{1}{2}(x^2 + \beta)$, leading to

$$\beta = m^2, \qquad \gamma = m \tag{2.99}$$

Then (2.98) becomes

$$y = \tfrac{1}{2}(x^2 + m^2), \qquad \lambda = \tfrac{1}{2}(m - x) \tag{2.100}$$

The C-locus is determined from (2.75) with $i = 2$, leading to $\lambda = \lambda_2$. With the aid of Table VIII and (2.100) we obtain

$$x = \xi_2 = m + 2(m - 5)^{1/2}, \qquad y = \eta_2 = m^2 + 2(m - 5) + 2m(m - 5)^{1/2} \tag{2.101}$$
The latter value of $x$ yields

$$\lambda^- = \lambda_2 = -(m - 5)^{1/2}, \qquad \lambda^+ = \lambda_1 = 0, \qquad y_1' = m - 2 \tag{2.102}$$

From (2.58) we obtain

$$\Omega = \lambda^- - \lambda^+ = -\Delta\lambda \tag{2.103}$$

For the $B^+R^-$ transition $\lambda^- \le 0$ and $\lambda^+ = 0$, so that the necessary condition $\Omega \le 0$ is satisfied. This transition generates the $R^-$-subarc

$$y = (m - 2)x + 4m + 4(m - 5)^{1/2} - 10 \tag{2.104}$$

No further transitions being possible, the $R^+B^+R^-$ family of extremaloids is defined as in Table IX. The terminal condition $y(9) = 54$ yields $m = m_0 = 6$, corresponding to a test-curve defined by

$$\begin{aligned} y &= 6x && (0 \le x \le 6) \\ y &= \tfrac{1}{2}x^2 + 18, \quad \lambda = 3 - \tfrac{1}{2}x && (6 \le x \le 8) \\ y &= 4x + 18 && (8 \le x \le 9) \end{aligned} \tag{2.105}$$
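The determination of $m_0$ from the terminal condition can be reproduced numerically. The sketch below is an illustration rather than the author's procedure: it assumes the $R^+B^+R^-$ family in the form $y = mx$, $y = \tfrac{1}{2}(x^2 + m^2)$, $y = (m-2)x + 4m + 4(m-5)^{1/2} - 10$ with junctions at $x = m$ and $x = m + 2(m-5)^{1/2}$, and it uses plain bisection, relying on the assumption that $y(x_2; m)$ increases with $m$ on the search interval. The names `extremaloid` and `solve_m` are hypothetical.

```python
import math

def extremaloid(x, m):
    """Ordinate of the R+B+R- extremaloid through the origin (assumes m >= 5)."""
    xi1 = m                               # entry into the B+ subarc
    xi2 = m + 2.0 * math.sqrt(m - 5.0)    # refraction back into R-
    if x <= xi1:
        return m * x
    if x <= xi2:
        return 0.5 * (x * x + m * m)
    return (m - 2.0) * x + 4.0 * m + 4.0 * math.sqrt(m - 5.0) - 10.0

def solve_m(x2, y2, lo=5.0, hi=9.0, tol=1e-12):
    """Bisection on the family parameter m so that the curve ends at (x2, y2)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if extremaloid(x2, mid) < y2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    m0 = solve_m(9.0, 54.0)
    print(m0, [extremaloid(x, 6.0) for x in (6.0, 8.0, 9.0)])
```

With the end-point $(9, 54)$ the bisection settles on $m = 6$, and the three subarcs of the test-curve join continuously at $x = 6$ and $x = 8$.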
The intersection of this solution with the corner manifolds is depicted in Fig. 7, with $\xi_1$ and $\xi_2$ marked.

TABLE IX. Extremaloid families

Type      Family                                                    Range of x                        Parameter
R+B+R−    y = mx                                                    (0, m)                            5 < m < ∞
          y = ½(m² + x²),  λ = ½(m − x)                             (m, m + 2√(m − 5))
          y = (m − 2)x + 4m + 4√(m − 5) − 10                        (m + 2√(m − 5), ∞)
R+R−      y = 5x                                                    (0, ξ)                            m = 5, 3 ≤ ξ ≤ 5
          y = 3x + 2ξ                                               (ξ, ∞)
R+B−R−    y = mx                                                    (0, m − 2 − 2√(5 − m))            4 ≤ m < 5
          y = ½(x² + m² + 4m − 8√(5 − m) − 24),  λ = ½(m − x) − 1   (m − 2 − 2√(5 − m), m − 2)
          y = (m − 2)x + 4m − 4√(5 − m) − 14                        (m − 2, ∞)
B−R−      y = ½x²,  λ = ½(ξ − x)                                    (0, ξ)                            0 ≤ ξ ≤ 2
          y = ξx − ½ξ²                                              (ξ, ∞)
R−        y = mx                                                    (0, ∞)                            m ≤ 0
To test the sufficient conditions, we note that
(1) On the $B^+$-subarc, $\lambda \le 0$ and $\delta = x - 4 > 0$. Therefore Ic and Id are satisfied.
(2) III″ is satisfied, since $F^-_{y'y'} = F^+_{y'y'} = \tfrac{1}{2}$.
(3) At $x = x_1 = 0$, the following values are assumed: $y' = 6$, $y_1' = 0$, $y_2' = 4$. Then (2.81) becomes

$$\chi(x_1) = 12 > 0 \tag{2.106}$$

so that II$_N$′ holds by Theorem 7.
(4) IV′ is satisfied in virtue of the following relations:

$$y_m(x, 6) = \begin{cases} x & (0 < x \le 6) \\ 6 & (6 \le x \le 8) \\ x + 6 & (8 \le x \le 9) \end{cases} \tag{2.107}$$
We conclude that (2.105) furnishes a strong relative minimum. The other extremaloid families of Table IX do not satisfy the prescribed terminal condition for real values of the parameters. Therefore the solution (2.105) is unique.
FIG.8. The corner, the entry, and the exit loci.
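As a numerical illustration of the minimizing property, the integral can be evaluated along the test-curve (2.105) and along the straight line $y = 6x$ joining the same end-points. This is a sketch under stated assumptions: the integrand is taken as $f^+ = \tfrac{1}{4}y'^2 + 4$ where $y' \ge x$ and $f^- = \tfrac{1}{4}y'^2 + y'$ where $y' < x$, and a simple midpoint rule is used for the quadrature. The names `integrand`, `slope_test_curve`, and `integral` are hypothetical.

```python
def integrand(x, yp):
    """Assumed discontinuous integrand: f+ on and above the locus y' = x, f- below it."""
    if yp >= x:
        return 0.25 * yp * yp + 4.0
    return 0.25 * yp * yp + yp

def slope_test_curve(x):
    """Slope of the R+B+R- test-curve: 6 on (0,6), x on (6,8), 4 on (8,9)."""
    if x < 6.0:
        return 6.0
    if x < 8.0:
        return x
    return 4.0

def integral(slope, a=0.0, b=9.0, n=90000):
    """Midpoint-rule value of the integral of f(x, y'(x)) over [a, b]."""
    h = (b - a) / n
    return h * sum(integrand(a + (k + 0.5) * h, slope(a + (k + 0.5) * h))
                   for k in range(n))

if __name__ == "__main__":
    print(integral(slope_test_curve))   # test-curve
    print(integral(lambda x: 6.0))      # straight line y = 6x
```

Under these assumptions the test-curve gives a value close to $118\tfrac{2}{3}$ and the straight line gives 123, so the extremaloid does yield the smaller integral.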
The splitting of extremals occurring on the C-locus and on the $B_1$-subarc, as discussed in Sec. 2.19, is illustrated by the families $R^+R^-$ and $B^-R^-$, respectively, with $\xi$ replacing $m$ as the family parameter.

To investigate the existence of the absolute minimum, we note the following:

(1) For the $B^-R^-$ family, the range of $\xi$ defined by $\chi \ge 0$ is $0 \le \xi \le 3$. For the interval $2 < \xi \le 3$, the family overlaps with $R^+B^-R^-$ in the domain $D$ defined by

$$\begin{aligned} 2x - 2 &< y \le \tfrac{1}{2}x^2 && (2 < x < 3) \\ 2x - 2 &< y \le 3x - \tfrac{9}{2} && (3 < x < \infty) \end{aligned} \tag{2.108}$$

If the terminal point lies in $D$, there are two minima $I_1$ and $I_2$, corresponding to $R^+B^-R^-$ and $B^-R^-$, respectively. It can be shown that $I_1 < I_2$. Thus, if we seek the least minimum, we can eliminate the overlap by decreasing the range of $\xi$ to $0 \le \xi \le 2$ as indicated in Table IX.

(2) It can be shown that the M-neighborhood of $E_{12}$ extends over the entire half-plane $x > 0$, which is covered by the five families of Table IX without an overlap, as shown schematically in Fig. 8.

We conclude that for any choice of $(x_2, y_2)$ there exists a unique solution, which furnishes an absolute minimum. A partial check of this statement is provided by the values $I(E_{12}) = 118\tfrac{2}{3}$ and $I(C_{12}) = 123$ for the curve $C_{12}$ defined by $y = 6x$ on $0 \le x \le 9$.

2.22 Discussion of the Results
A solution of the refraction problem in case (b) has been obtained under the restrictions $n = 1$ and $\Omega \ne 0$. The first restriction can be removed by the requirement that $I^*$ vanish on every closed path in $M$. If the second restriction is removed, subarcs in the corner locus $\Omega = 0$ must be admitted. Although such subarcs cannot be imbedded in a family of extremals, the difficulty can be overcome by the replacement of II$_N$′ by the more stringent condition III″. Indeed, the situation is quite analogous to that in case (a), with the locus $\Omega = 0$ playing the role of $\phi = 0$.

In conclusion we note the following application to the theory of optimum control. Let the locus of the discontinuity of $f$ be of the form

$$\phi = \phi(x, y_i, u_j) = 0 \tag{2.109}$$

where $y_i$ is the set of the state variables, and $u_j$ is the set of the control variables. Then the transformation

$$u_j = y'_{j+n} \tag{2.110}$$

converts the problem into case (b), treated in Part B of this chapter. For a practical illustration see Garfinkel.³
ACKNOWLEDGMENTS

The author wishes to express his appreciation to Dr. G. T. McAllister, who contributed the construction used in Sec. 7 and has made several other useful suggestions. The author also gratefully acknowledges the aid of Mrs. Bernice Krouse, who typed the manuscript.

REFERENCES

1. O. Bolza, "Lectures on the Calculus of Variations," Chelsea, New York, 1904.
2. G. A. Bliss, A problem in the calculus of variations in which the integrand is discontinuous, Trans. Am. Math. Soc. 8, 325 (1906).
3. B. Garfinkel, Minimal Problems in Airplane Performance, Quart. Appl. Math. 9, 149 (1951).
4. G. A. Bliss, "Lectures on the Calculus of Variations," University of Chicago Press, Chicago, Illinois, 1946.
5. W. T. Reid, Discontinuous solutions in the non-parametric problem of Mayer, Amer. J. Math. 69, 69 (1935).
Singular Extremals†

HENRY J. KELLEY
ANALYTICAL MECHANICS ASSOCIATES, INC., WESTBURY, LONG ISLAND, NEW YORK

RICHARD E. KOPP
H. GARDNER MOYER
RESEARCH DEPARTMENT, GRUMMAN AIRCRAFT ENGINEERING CORPORATION, BETHPAGE, LONG ISLAND, NEW YORK
3.0 Introduction
3.1 Second Variation Test for Singular Extremals
    3.10 Introduction
    3.11 Problem Formulation
    3.12 Second Variation
    3.13 First Special Control Variation
    3.14 Second Special Control Variation
    3.15 General Analysis
    3.16 Alternate Development of Necessary Conditions
    3.17 Junction Conditions
3.2 A Transformation Approach to the Analysis of Singular Subarcs
    3.20 Introduction
    3.21 Transformation to Canonical Form
    3.22 The Legendre-Clebsch Condition in the Transformed Variables
3.3 Examples
    3.30 Two Elementary Examples
    3.31 A Servomechanism Example
    3.32 A Midcourse Guidance Example
    3.33 Aircraft "Energy Climb"
    3.34 Goddard's Problem
    3.35 Lawden's Spiral
References

3.0 Introduction
Singular extremals are usually associated with variational problems which have the control variables appearing linearly in the system differential equations, singular arcs or subarcs arising when the pseudo-Hamiltonian function $\mathcal{H}$ is not explicitly a function of the control variable over a nonzero time interval. When such a situation occurs, neither the maximum principle nor the classical variational theory provides adequate tests for minimality of the arc. The subject of singular extremals is not merely of academic interest since many problems in rocket and air vehicle flight exhibit solutions which include singular subarcs.

Although the following analyses discuss explicitly only the case in which the control variable appears linearly in the system differential equations, the results are more widely applicable. Other examples which offer the possibility of singular subarcs are those problems in which the surface in "hodograph space"¹ is nonconvex. The nonconvex surfaces in these cases may be replaced by their convex hulls to obtain "relaxed" variational problems²,³ in which appropriately chosen control variables are sectionally linear.

The possible appearance of singular subarcs in a problem is accompanied by considerable analytical difficulty. There is no general method available for determining these subarcs and the manner in which they form segments of the minimizing arc. Valuable insight into these questions is provided by the Green's theorem method of Miele.⁴ This method also provides necessary and sufficient conditions for minimality, although it is severely restricted in its applicability in terms of the number of variables present. A fairly comprehensive treatment of singular arcs in problems linear in the state has been given by LaSalle.⁵ However, in such problems, singularity is equivalent to degeneracy in the sense of nonuniqueness of solution. The present chapter will derive conditions for minimality of singular arcs over short intervals of time.

† This work was partially supported by the U.S. Air Force Office of Scientific Research of the Office of Aerospace Research, under Contracts AF 49(638)-1207 and AF 49(638)-1512, and also by the Theoretical Division of NASA Goddard Space Flight Center under Contracts NAS 5-2535 and NAS 5-9085.
The derivative material is taken mainly from four previous papers⁶⁻⁹ by the authors, with extensions and considerations of the interrelationships of the two following approaches: first, a second variation test for singular subarcs employing special control variations, and second, a transformation approach to the analysis of singular subarcs. No pretense of complete treatment can be made since difficult questions on the number and sequence of singular and nonsingular subarcs remain unanswered; however, some material on junction conditions for singular arcs is included. The problem of synthesizing solutions containing singular subarcs is not considered.

3.1 Second Variation Test for Singular Extremals

3.10 Introduction

The approach presented in this section follows the work of Kopp and Moyer⁶ and Kelley.⁸ The positive semidefiniteness of the second variation of the payoff function is examined for a special class of explicitly defined control variations. A parameter $\tau$ of the special control variations is allowed
to approach zero in the limit and the dominant term of the second variation is evaluated, providing a test for determining the minimality of the singular subarc over short lengths of arc. The special class of control variations has been constructed so that terminal conditions can be satisfied by a weak additional control variation which does not contribute to the dominant term in the second variation. The authors are aware of some similar work concerning singular subarcs and the second variation which is being done by H. Robbins. His analysis follows, somewhat, the approach used in Sec. 3.16.

3.11 Problem Formulation

In the usual format for trajectory and control problems, we are given a system of differential equations and boundary conditions

$$\begin{aligned} \dot{x}_i &= f_i(x_1, \ldots, x_n, u_1, \ldots, u_r, t), && i = 1, \ldots, n \\ x_i(t_0) &= x_{i0}, && i = 1, \ldots, n \\ x_i(t_f) &= x_{if}, && i = 1, \ldots, m, \quad m \le n \end{aligned} \tag{3.1}†$$

and are required to minimize a function $P(x_{m+1,f}, \ldots, x_{nf}, t_f)$ of the open final state variables $x_{if}$. This is called the general problem of Mayer. The minimization is usually subject to explicit constraints on the control variables $u_k$, which require the control vector $u$ to belong to a class of admissible controls $U$. However, such explicit constraints will present no difficulty in this analysis, since it will be assumed that the control corresponding to a singular subarc is interior to the boundary of $U$, as is usually the case. Necessary conditions for $P$ to be a minimum are that the Pontryagin pseudo-Hamiltonian function $\mathcal{H}$ be a minimum for all admissible controls:

$$\mathcal{H}(\bar{u}_1 + \Delta u_1, \ldots, \bar{u}_r + \Delta u_r) \ge \mathcal{H}(\bar{u}_1, \ldots, \bar{u}_r) \tag{3.2}‡$$

where

$$\mathcal{H} = \sum_{i=1}^{n} \lambda_i f_i(x_1, \ldots, x_n, u_1, \ldots, u_r, t) \tag{3.3}$$

The $\lambda_i$ variables are the usual Lagrange multipliers, and satisfy the differential equations and natural boundary conditions (3.4), the latter applying for $i = m + 1, \ldots, n$.

† Dot denotes differentiation with respect to time $t$.
‡ Here $\bar{u}_k$ denotes an optimal value of $u_k$. Here it is assumed that $\lambda_0 \ne 0$. In particular, the value of $\lambda_0$ is taken to be positive, namely, $\lambda_0 = 1$; hence, we have a minimum principle.
Subarcs of the solution, along which the determinant whose elements are the second partial derivatives $\partial^2\mathcal{H}/\partial u_k\,\partial u_s$, $k, s = 1, \ldots, r$, vanishes over a nonzero interval of time, are referred to classically as singular subarcs. In this case the maximum principle, or the Weierstrass condition (3.2), fails to determine the nature of extremal arcs over short lengths of arc. Such subarcs commonly arise in problems in which the functions $f_i$ are linear in one or more of the control variables, the function $\mathcal{H}$ thus being correspondingly linear. Another situation which offers the possibility of singular subarcs is that in which the surface in "hodograph space" is nonconvex. This surface was defined by Contensou¹ through the equation $\dot{x}_i = f_i(x, u, t)$ with $x$ and $t$ fixed. In this case the surface may be replaced by its convex hull to obtain a "relaxed" variational problem²,³ in which the functions $f_i$ and $\mathcal{H}$ are sectionally linear in appropriately chosen control variables. For the remainder of this analysis, we will consider the case in which a single control variable appears linearly in the system differential equations. However, this in no way limits the wider applicability of the analysis.
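To make the notion of a singular subarc concrete, consider a hypothetical two-state Mayer problem that is not taken from the text: $\dot{x}_1 = u$, $\dot{x}_2 = x_1^2$, with $P = x_2(t_f)$. The sketch below forms the pseudo-Hamiltonian as the sum of $\lambda_i f_i$ and checks numerically that it is linear in $u$, so that $\partial^2\mathcal{H}/\partial u^2$ vanishes and, wherever $\lambda_1 = 0$, the minimum condition does not single out a control. The names `f`, `hamiltonian`, and `d2H_du2` are illustrative only.

```python
def f(x, u):
    """Right-hand sides of the assumed toy system: x1' = u, x2' = x1**2."""
    x1, _x2 = x
    return (u, x1 * x1)

def hamiltonian(lam, x, u):
    """Pseudo-Hamiltonian H = sum_i lam_i * f_i(x, u)."""
    return sum(l * fi for l, fi in zip(lam, f(x, u)))

def d2H_du2(lam, x, u, h=1e-3):
    """Central second difference of H with respect to the scalar control u."""
    return (hamiltonian(lam, x, u + h)
            - 2.0 * hamiltonian(lam, x, u)
            + hamiltonian(lam, x, u - h)) / (h * h)

if __name__ == "__main__":
    x = (0.3, 0.0)
    print(d2H_du2((0.7, 1.0), x, 0.2))             # zero up to rounding: H is linear in u
    for u in (-1.0, 0.0, 1.0):
        print(u, hamiltonian((0.0, 1.0), x, u))    # identical values when lam1 = 0
```

In this toy problem the coefficient of $u$ in $\mathcal{H}$ is $\lambda_1$, and an arc along which $\lambda_1$ vanishes over an interval is singular in the sense described above.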
3.12 Second Variation

The total variation in the payoff functional $P$ due to a variation in the control vector $u$ is given by (3.5), where

$$\begin{aligned} \Delta\dot{x}_i &= f_i(x + \Delta x, u + \Delta u, t) - f_i(x, u, t), && i = 1, \ldots, n \\ \Delta x_i(t_0) &= 0, && i = 1, \ldots, n \\ \Delta x_i(t_f) &= 0, && i = 1, \ldots, m \end{aligned} \tag{3.6}$$

When one of the control variables appears linearly in $\mathcal{H}$, the first term on the right-hand side of Eq. (3.5) may be identically zero during a nonzero time
interval, producing a singular subarc. In this case, the additional terms in Eq. (3.5) must be examined to determine the nature of the extremal path. Under the assumption that the singular control is interior to its boundary, Eq. (3.5) is evaluated for variations in the singular control $\Delta u = K\,\delta u$ to second order terms in $K$.
K2
+
where
2
82P(x
i,jE+
6xi(ta) = 0, 6xi(tJ)= 0,
t ) J' 6Xi(tf) 8 X j ( f f ) ax;,axj,
i = 1, ... , n i = I , ... , m
(3.7)
(3.8)
The usual approach to the study of the second variation encounters difficulties as a result of the control variations appearing only linearly in Eq. (3.7). The problem of minimizing $\Delta P_2$, subject to constraints (3.8), the classical accessory minimum problem, cannot be treated in the usual manner owing to the vanishing of $\partial^2\mathcal{H}/\partial u^2$.
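A small numerical experiment can indicate how a second-order change in the payoff survives even though $\partial^2\mathcal{H}/\partial u^2$ vanishes. The sketch below uses a hypothetical toy problem, $\dot{x}_1 = u$, $\dot{x}_2 = x_1^2$, $P = x_2(t_f)$, whose candidate singular arc is $u = 0$, $x_1 = 0$. The control is perturbed by $K$ times a doublet of half-width $\tau$; this shape is an assumption made for illustration and is not claimed to be the variation of the chapter's figures. The names `doublet` and `payoff_change` are illustrative only.

```python
def doublet(t, tau):
    """Assumed shape: +1 on [-tau, 0), -1 on [0, tau), zero elsewhere."""
    if -tau <= t < 0.0:
        return 1.0
    if 0.0 <= t < tau:
        return -1.0
    return 0.0

def payoff_change(K, tau, T=1.0, n=20000):
    """Integrate x1' = u, x2' = x1**2 on [-T, T] with u = K * doublet.

    The nominal arc has u = 0 and x1 = x2 = 0, so the returned x2(T) is the
    change in the payoff; x1(T) is returned to show the end condition on x1.
    """
    h = 2.0 * T / n
    x1 = x2 = 0.0
    for k in range(n):
        t_mid = -T + (k + 0.5) * h
        u = K * doublet(t_mid, tau)
        x1_mid = x1 + 0.5 * h * u      # x1 is piecewise linear within a step
        x2 += h * x1_mid ** 2          # midpoint rule for the integral of x1**2
        x1 += h * u
    return x2, x1

if __name__ == "__main__":
    for tau in (0.1, 0.2):
        print(tau, payoff_change(1.0, tau))
```

For this toy problem the change in payoff equals $\tfrac{2}{3}K^2\tau^3$: it is positive, quadratic in $K$, and shrinks with the width of the variation, while $x_1$ returns to its nominal terminal value. The exponent of $\tau$ belongs to this assumed shape only.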
3.13 First Special Control Variation

It has long been appreciated that carefully chosen special variations are useful in deriving necessary conditions. There exists, for example, a classical derivation of the Legendre necessary condition along these lines. With this in mind, we search out control variations which will satisfy boundary conditions imposed on Eq. (3.8) and which will allow the positive semidefiniteness of $\Delta P_2$ to be tested. The first member of a set of control variations intended to accomplish this is shown in Fig. 1 and is designated as $\varphi_0^1(t, \tau)$. The time $t = 0$ is designated as the center of the interval $2\tau$, and may occur at any interior point of the singular subarc. The parameter $\tau$ will be allowed to approach zero in the limit. Successive integrations of $\varphi_0^1(t, \tau)$ with respect to $t$ are designated by $\varphi_\nu^1(t, \tau)$, that is (3.9).

Equation (3.8) can be integrated for $\delta u = \varphi_0^1(t, \tau)$, subject to the initial conditions $\delta x_i(t_0) = 0$, $i = 1, \ldots, n$. For the present, the boundary conditions at $t_f$ will be relaxed and will be satisfied later by auxiliary control variations. Let

$$\delta x_i(t) = A_{i,1}(t)\,\varphi_1^1(t, \tau) + A_{i,2}(t)\,\varphi_2^1(t, \tau) + \xi_i^1(t) \tag{3.10}$$

(Note that the superscripts of $\xi$ and $\varphi$ do not denote exponentiation.) Substituting Eq. (3.10) into Eq. (3.8) and equating coefficients gives

$$A_{i,1} = \frac{\partial f_i}{\partial u} = \frac{\partial^2 \mathcal{H}}{\partial \lambda_i\,\partial u} \tag{3.11}$$

and (3.12).

The necessary smoothness properties of the coefficients in Eq. (3.8) required for the existence of Eqs. (3.11) and (3.12) are assumed. An evaluation of the function $\varphi_3^1(t, \tau)$ shows that $\xi_i^1(t)$ is of order $\tau^3$; that is (3.13). The variation of the functional $P(x, t)$ can easily be evaluated by substituting Eq. (3.10) into Eq. (3.7) and integrating by parts, retaining only the dominant terms in $\tau$.
In this and the subsequent analysis, it is assumed that the Taylor series in $\tau$ used to evaluate $\Delta P_2$ has a nonzero interval of convergence. A necessary condition for $P$ to be a minimum is that $\Delta P_2 \ge 0$. Since $t = 0$ is any interior point on the singular subarc, a necessary condition for the extremal arc to be minimizing is (3.15) along the entire singular subarc. (The inequality is reversed for a maximum.) This can be shown to be equivalent to the condition (3.16).

In light of the terminal boundary conditions imposed on Eq. (3.8), one is justified in being concerned with the admissibility of such a control variation. In regard to this, the authors are prepared to give only a plausibility argument concerning the satisfaction of boundary conditions. From Eq. (3.10) it is observed that the dominant term of $\delta x_i(t_f)$ is of order $\tau^3$ or smaller. Therefore, corrections in $\Delta x_i(t_f)$ to satisfy boundary conditions to first order in $K$ can be made with auxiliary weak control variations $\Delta u$ of order $K\tau^3$ which contribute terms of order $K^2\tau^6$ to $\Delta P_2$. Thus, the dominant term in $\Delta P_2$ is unchanged in Eq. (3.14). The existence of such variations is equivalent to a normality assumption. A rigorous demonstration, which would require that boundary conditions be satisfied exactly, would follow arguments similar to those used by Pontryagin and others to show that there are points that can be reached on all curves having their tangent rays interior to the cone of attainability. No attempt has been made by the authors along these lines.

3.14 Second Special Control Variation

If Eqs. (3.15) and (3.16) are met marginally (equality), the nature of the extremal subarc is still undetermined. In this case, the second member $\varphi_0^2(t, \tau)$ (see Fig. 2) of the class of special control variations is used and a procedure similar to that above is followed. Successive integrations of $\varphi_0^2(t, \tau)$ are designated by $\varphi_\nu^2(t, \tau)$, that is (3.17). Equation (3.8) is then solved for $\delta u = \varphi_0^2(t, \tau)$, subject to initial conditions $\delta x_i(t_0) = 0$, $i = 1, \ldots, n$, with relaxed terminal conditions as previously mentioned. The variations $\delta x_i(t)$ in this case are expressed by (3.18), with the coefficients given by (3.19) and (3.20). An evaluation of $\varphi_3^2(t, \tau)$ shows that $\xi_i^2(t)$ is of order $\tau^6$. Substituting Eq. (3.18) into Eq. (3.7) and integrating by parts gives
a2s
+
z1
i,zaxi
j=
I
d22
I
+K2JT
axj A i , l A j , l ] { q l 2 ( f ?7))'
(--[ 1 d2
--I
2dt2
a 2 2
n
+ c - A .
axi -f:- a22 i=
Ai,l]-
au
;,j=1
a x i axj
+ O(r12)
2 dt
i,j=
Ai,2
dt
iy a u a x i A i , z ] +-2 d'[t ---
r 3 4
au axi
c-a~a 2axi2
A.,3]
-A i , l Aj . 2 1 ax, a x j
~ r , 2 ~ j , 2 ] { q 2 2 (z )t )>2
dt
(3.21)
The integrand of the first term on the right-hand side of Eq. (3.21) which would lead to terms of order $\tau^8$ vanishes identically by assumption. This result is identical to the term arising out of the control variation $\varphi_0^1(t, \tau)$. Evaluating Eq. (3.21) and retaining only the dominant terms gives (3.22), from which we obtain the second in a sequence of necessary conditions. The admissibility with regard to boundary constraints of such a control variation follows the same type of argument as before. The variation of the state variables at $t = t_f$ is of order $K\tau^6$ and, thus, terminal boundary conditions can be satisfied by auxiliary weak control variations $\Delta u$ of order $K\tau^6$ which do not contribute to the dominant term in $\Delta P_2$, as given in Eq. (3.22). The inequalities in Eq. (3.22) can be simplified to (3.23).

3.15 General Analysis

If the equalities are satisfied in Eqs. (3.22) and (3.23), the third control variation of the class is used and so on. The motivation for choosing such a class of control variations arises from the theory of distributions. To find $\varphi_0^{q+1}(t, \tau)$, consider the derivative of $\varphi_0^{q}(t, \tau)$ accepting the Dirac delta; approximate this distribution by a pulse of width $\tau^{q+1}$, and scale so that the magnitude of the pulses is unity. From the previous discussion of the control variations $\varphi_0^1(t, \tau)$ and $\varphi_0^2(t, \tau)$, it becomes evident that it is not necessary to actually construct the specific control variation, but only to be assured that such variations exist for which a constructive method has been given. With these thoughts, $\Delta P_2$ for the $q$th special control variation [$\Delta u = \varphi_0^q(t, \tau)$] will be evaluated. The variation $\delta x_i(t)$ becomes

(3.24)
where $\xi_i^q(t)$ is of order $\tau^{(q+1)(q+2)/2}$. The coefficients $A_{i,s}$ are given by

(3.25)

where $A_{i,1} = \partial^2\mathcal{H}/\partial\lambda_i\,\partial u$. Substituting Eq. (3.25) into Eq. (3.7) and integrating by parts gives

(3.26)

The notation $\eta_{1,k-1}(A_{i,\nu+2})$ designates that all terms of the form $A_{i,\nu}$ appearing in $\eta_{1,k-1}$ are to have their second indices increased by 2. A similar rule is used where the $A_{i,\nu}$ terms appear as products in the definition of $\eta_{s,k}$ in Eq. (3.27).

Let $q$ be chosen such that the dominant term in $\tau$ for Eq. (3.26) results from the integrand which contains the term $[\varphi_q^q(t, \tau)]^2$. That is, all coefficients of $[\varphi_k^q(t, \tau)]^2$ are identically zero for $k < q$. The dominant term in Eq. (3.26)
will then be of order $\tau^{(q+1)(q+2)-1}$. Therefore, a necessary condition for the extremal path to be minimizing is
thy+c ax,a2cF axj 1
2 i,j=
s= I
I
(3.28)
-A i,qAj , q 2 0
To satisfy boundary conditions on $x_i(t_f)$, auxiliary control corrections are made in the remaining interval of time which will add to $\Delta P_2$ terms which are of one degree higher in $\tau$ than those arising out of the dominant term in Eq. (3.26). Thus, they can be neglected as $\tau$ approaches zero in the limit. Although it is not readily apparent, it will be shown in the next section that the necessary condition given by Eq. (3.28) can be expressed equivalently as (3.29). We are indebted to Robbins for arriving first at this form of the test.

3.16 Alternate Development of Necessary Conditions

We consider here an alternate development of the necessary conditions expressed by Eq. (3.29). This approach proceeds similarly to a proof outlined by Robbins which was forwarded to the authors in the form of unpublished notes dated July 1964. To second-order terms in $K$
a2s
6x,6xj-
a2x ,=c axi au 6 X i 6u n
I
+TIai,au 6Ai 6u
(3.30)
a2z
(3.31)
n
where hi-,=
a2& 1+ aaA i2aus 6 U al, axj
61,=
-
n
6Xj
a2H
~
j =
6xjc ax,a2x axj n
~
j = I
a2H 1 ax, n
~
j=
6Aj - ___
ax, au
6U
Equation (3.30) together with Eq. (3.7) yields
(3.32)
The boundary conditions for Eqs. (3.31) are

$$\delta x_i(t_0) = 0, \quad i = 1, \ldots, n; \qquad \delta x_i(t_f) = 0, \quad i = 1, \ldots, m$$

However, as before, the terminal conditions imposed on $\delta x_i(t_f)$ will be relaxed and satisfied with auxiliary control variations which will not contribute to the dominant term in $\Delta P_2$. The special control variation $\delta u = \varphi_0^q(t, \tau)$ will be used, and the dominant term in $\tau$ of $\Delta P_2$ will be evaluated. The coefficient of $\delta u$ in the integrand of Eq. (3.32) is recognized to be $\delta(\partial\mathcal{H}/\partial u)$, the first order difference between $\partial\mathcal{H}/\partial u$ with and $\partial\mathcal{H}/\partial u$ without the control variation $\delta u$ along the singular subarc:

(3.33)

where $(\ )^*$ designates evaluation along the subarc with the control variation $\delta u$. Substituting Eq. (3.33) into Eq. (3.32) and integrating by parts $q$ times gives
i= 1
l,
i,j=m+l
ax.'f ax.Jf
We will now assume that the first explicit appearance of $u$ is always in an even-order time derivative of $\partial\mathcal{H}/\partial u$. A proof of this assumption will be given subsequently. The parameter $q$ will be chosen so that this even order is equal to $2q$. The coefficient of the $\varphi_q^q(t, \tau)$ term in the integrand of Eq. (3.34) can be written as (3.35). Expanding the integrand of Eq. (3.35) in a Taylor series and retaining only the dominant term in $K$ gives
The first term on the right-hand side of Eq. (3.36) is identically equal to zero, along the singular subarc. It is left for the reader to verify that the dominant term in Eq. (3.34) arises from the term
in Eq. (3.36) and is of order K 2 T ~ ( ~ + * ) ( ~ + ~ )Substituting - *]. this term into Eq. (3.35) yields
The coefficient of qoq(t,7)in the integrand of Eq. (3.37) is expanded as a power series in time about t = 0, any interior point on the singular subarc; and the integration indicated is performed
Equation (3.38) is now substituted into Eq. (3.34) giving
(3.39) The terms $\delta x_i$ and $\delta\lambda_i$ are of order $\tau^{(q+1)(q+2)/2}$ and thus do not contribute to the dominant term in $\Delta P_2$. Terminal boundary conditions on $\delta x_j(t_f)$ can be satisfied with auxiliary control variations of order $\tau^{[(q+1)(q+2)/2]}$ and also will not contribute to the dominant term in $\Delta P_2$ as given by Eq. (3.39). Since $t = 0$ is any interior point on the singular subarc, a necessary condition for the extremal arc to be minimizing is (3.40) along the entire singular subarc, the inequality being reversed for a maximizing extremal.
We will now proceed to prove that if $\partial\mathcal{H}/\partial u$ is successively differentiated with respect to time, then $u$ cannot first explicitly appear in an odd-order derivative. It will first be necessary to derive a few basic relationships. Given a scalar function $F(x,\lambda)$, where the components of the $x$ and $\lambda$ vectors obey the differential equations $\dot x_i = \partial\mathcal{H}/\partial\lambda_i$ and $\dot\lambda_i = -\partial\mathcal{H}/\partial x_i$ $(i = 1,\dots,n)$, then
\[
\frac{d}{dt}F(x,\lambda) = \sum_{i=1}^{n}\frac{\partial F}{\partial x_i}\frac{\partial\mathcal{H}}{\partial\lambda_i} - \sum_{i=1}^{n}\frac{\partial F}{\partial\lambda_i}\frac{\partial\mathcal{H}}{\partial x_i} = -(\nabla\mathcal{H})^T S\,\nabla F \tag{3.41}
\]
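where
(3.42)
and $S$ is the $2n \times 2n$ matrix
(3.43)
The explicit dependence of the function $F$ on time $t$ is not considered; however, this assumption is not restrictive, since $t$ can be eliminated by adjoining an additional component to the $x$ vector whose derivative is equal to unity. The time derivative of the gradient of the function $F$ is
\[
\frac{d}{dt}\nabla F = \nabla\frac{dF}{dt} + [\nabla(\nabla\mathcal{H})^T]\,S\,\nabla F \tag{3.44}
\]
The relationships $S^T = -S$ and $[\nabla(\nabla\mathcal{H})^T]^T = \nabla(\nabla\mathcal{H})^T$ are now used to obtain the remaining preliminary equations:
\[
\begin{aligned}
\frac{d}{dt}\left[(\nabla A)^T S\,\nabla B\right]
 &= \left(\frac{d}{dt}\nabla A\right)^T S\,\nabla B + (\nabla A)^T S\,\frac{d}{dt}\nabla B \\
 &= \left(\nabla\frac{dA}{dt}\right)^T S\,\nabla B + \left[\nabla(\nabla\mathcal{H})^T S\,\nabla A\right]^T S\,\nabla B
  + (\nabla A)^T S\,\nabla\frac{dB}{dt} + (\nabla A)^T S\,[\nabla(\nabla\mathcal{H})^T]\,S\,\nabla B \\
 &= \left(\nabla\frac{dA}{dt}\right)^T S\,\nabla B + (\nabla A)^T S\,\nabla\frac{dB}{dt}
\end{aligned}
\tag{3.45}
\]
In the following, the term
\[
\frac{\partial}{\partial u}\,\frac{d^k}{dt^k}\,\frac{\partial\mathcal{H}}{\partial u}
\]
will be denoted by $\alpha_k$. We will assume that the $\alpha_k$ $(k = 1,\dots,p-1)$ and all their time derivatives are equal to zero. Our task is to prove that $\alpha_p = 0$ when $p$ is odd. Using Eq. (3.41), we have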
(3.46)
Similarly, (3.47) Differentiating Eq. (3.47) with respect to time and using Eq. (3.45) gives
(3.48)
From Eqs. (3.45) and (3.47) $d\alpha_{p-1}/dt$ is (3.49) and upon differentiating with respect to time, we have
(3.50) Continually repeating this process yields the general equation
(3.51)
We will now assume that $p$ is odd; that is, $p = 2\nu + 1$. From Eqs. (3.46), (3.48), and (3.51)
(3.52)
since the quadratic form
\[
Z^T S Z = 0 \tag{3.53}
\]
where $Z$ is any $2n$-vector. Therefore $u$ cannot first appear in an odd-order time derivative of $\partial\mathcal{H}/\partial u$.
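The vanishing of the quadratic form in Eq. (3.53) depends only on the skew-symmetry $S^T = -S$. The following is a minimal numerical sketch of that fact. It assumes, for illustration only, that $S$ has the block form $[[0, I], [-I, 0]]$, which is consistent with Eq. (3.41) but is not quoted from Eq. (3.43); the function names are hypothetical.

```python
import random


def symplectic_matrix(n):
    """Build the assumed 2n x 2n block matrix [[0, I], [-I, 0]] as nested lists."""
    size = 2 * n
    s = [[0.0] * size for _ in range(size)]
    for i in range(n):
        s[i][n + i] = 1.0    # upper-right identity block
        s[n + i][i] = -1.0   # lower-left negative identity block
    return s


def quadratic_form(z, s):
    """Return the scalar z^T S z."""
    size = len(z)
    return sum(z[i] * s[i][j] * z[j] for i in range(size) for j in range(size))


random.seed(0)
n = 3
S = symplectic_matrix(n)

# Skew-symmetry check: S^T = -S.
is_skew = all(S[i][j] == -S[j][i] for i in range(2 * n) for j in range(2 * n))

# Evaluate z^T S z for random 2n-vectors; only rounding error should remain.
samples = [[random.uniform(-5.0, 5.0) for _ in range(2 * n)] for _ in range(100)]
max_abs_form = max(abs(quadratic_form(z, S)) for z in samples)
```

Each off-diagonal product $z_i z_{n+i}$ enters once with a plus sign and once with a minus sign, so the form cancels for every $Z$, whatever the particular skew-symmetric $S$.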
3.17 Junction Conditions
Nothing has been said so far about the joining of singular subarcs with nonsingular subarcs. Although an extensive analysis on this subject is not available, it is worthwhile making some observations concerning necessary conditions at junction points. Let us consider the possibility of joining a singular arc $a$-$b$ to a nonsingular arc $b'$-$c$ at time $t_i$ with a discontinuity in the control time history, as shown in Fig. 3a. Along the singular subarc the inequality in Eq. (3.40) is satisfied, while along the nonsingular arc $(\partial\mathcal{H}/\partial u)\,\Delta u > 0$. Referring to Fig. 3a we see that $\Delta u$, which is any variation from the nonsingular subarc, must be negative, thus requiring $\partial\mathcal{H}/\partial u < 0$. However, $\partial\mathcal{H}/\partial u$ can also be evaluated in the neighborhood of the junction point on the nonsingular subarc by a Taylor series expansion, which leads to the conclusion that at the junction point (3.54) In order to satisfy the inequalities in both Eqs. (3.40) and (3.54), $q$ must be odd. A similar analysis considering a jump discontinuity in the control $u$ to the lower boundary produces the same conclusions. Therefore, if a minimizing singular subarc is joined to a nonsingular subarc at a corner (jump discontinuity in $u$), $q$ must be odd.

FIG. 3. Junction conditions: control $u$ versus time $t$ near the junction time $t_i$, panels (a), (b), and (c).

The situation will now be examined in which a singular subarc joins the nonsingular subarc with onset of saturation (no jump in $u$), as shown in Fig. 3b. In this case, an evaluation of $\partial\mathcal{H}/\partial u$ on the nonsingular subarc in the neighborhood of the junction leads to the conclusion that
(3.55) In this case, to satisfy the inequalities in Eqs. (3.40) and (3.55), $q$ must be even. We may summarize these results as follows:
(1) If $q$ is odd, a jump discontinuity in control from a locally minimizing singular arc [satisfying (3.40)] to either bound will find the condition $(\partial\mathcal{H}/\partial u)\,\Delta u \ge 0$ satisfied at slightly later $t$, i.e., corner junctions are permitted. In this case, if singular control were maintained until the control saturated, the control must jump to the opposite bound, Fig. 3c.
(2) If $q$ is even, jump discontinuities in control from singular arcs satisfying (3.40) in strengthened form are ruled out.
From the incompatibility of (3.40) and (3.54) in $q$-even problems, there is a temptation to conjecture that minimizing singular arcs of the $q$-even variety are isolated, except for the saturation possibility, since the most common type of junction, the corner junction, has been ruled out and, hence, that they are of only minor importance in the structure of a family of solutions. That this is not the case is illustrated by a simple example presented in Sec. 3.30 exhibiting complex junction phenomena.
3.2 A Transformation Approach to the Analysis of Singular Subarcs
3.20 Introduction
In the following section, we present an approach to the analysis of singular subarcs by means of a transformation to a new system of state variables.
Viewed as a practical analysis tool, the approach has a fairly serious shortcoming, namely that "closed form" solution of a system of nonlinear differential equations is required for synthesis of the transformation. While this consideration obviously limits the approach, there are nevertheless a number of applications in which it proves useful. From a theoretical viewpoint, the transformation approach is suggestive in providing the idea of attacking singular problems in a state space of reduced dimension, and in this respect may prove helpful in future research efforts. It has been called to the authors' attention that the transformation scheme presented here is similar to a scheme developed by Faulkner$^{13-15}$ for treatment of the case in which the differential equations of state are reducible to a single total differential equation. The two schemes appear to be equivalent for problems of that type, which have a fairly frequent occurrence in applications.
3.21 Transformation to Canonical Form
We investigate first the possibility of a canonical form for the problem under examination, which resembles that chosen by Wonham and Johnson$^{16}$ for study of a special class of problems featuring linear constant coefficient systems and cost functional an integral quadratic form in the state variables. We write the equations of state (3.1) in the form
\[
\dot x_i = p_i(x_1,\dots,x_n,t) + q_i(x_1,\dots,x_n,t)\,u, \qquad i = 1,\dots,n \tag{3.56}
\]
The control variable $u$ is subject to an inequality constraint of the form
\[
u_1 \le u \le u_2 \tag{3.57}
\]
We consider the possibility of introducing new variables
\[
z_j = r_j(x_1,\dots,x_n,t), \qquad j = 1,\dots,m \tag{3.58}
\]
which will satisfy equations of state whose right members are not dependent upon the variable $u$ explicitly:
\[
\dot z_j = \frac{\partial r_j}{\partial t} + \sum_{i=1}^{n}\frac{\partial r_j}{\partial x_i}\,p_i \tag{3.59}
\]
where the vanishing of the collected coefficient of the variable $u$
\[
\sum_{i=1}^{n} q_i\,\frac{\partial r_j}{\partial x_i} = 0 \tag{3.60}
\]
has been assumed. If we restrict attention to transformations in which the functions $r_j$ are independent, it follows that $m < n$; for if $m \ge n$, the identical vanishing of the $q_i$ would be implied by (3.60). For the purpose of determining functions $r_j$ having the desired property, we seek the solutions of the linear homogeneous first-order partial differential
equation (3.60). From the theory of characteristics, we are led to consideration of the ordinary differential equations
\[
\frac{dx_i}{ds} = q_i(x_1,\dots,x_n,t), \qquad i = 1,\dots,n \tag{3.61}
\]
in which $s$ is a parameter and $t$ is regarded as fixed. If one of the $x_i$ is adopted instead of $s$ as independent variable, the general solution may be represented in terms of $n - 1$ parameters:
\[
c_k = \varphi_k(x_1,\dots,x_n,t), \qquad k = 1,\dots,n-1 \tag{3.62}
\]
The $c_k$ are constants of integration, and the $\varphi_k$ are mutually independent integrals of the system. Each integral $\varphi_k$ is a solution of the partial differential equation (3.60). The first $n - 1$ of the new variables $z_j$ are then to be defined according to $r_j = \varphi_j$. The $n$th variable $z_n$ we define as
\[
z_n = x_l \tag{3.63}
\]
choosing $l$ such that $q_l \ne 0$ over the domain of interest, a choice which we assume open to us for the time being. The transformation between the variables $z$ and $x$ is nonsingular by the nonvanishing of the Jacobian determinant (3.64) which is ensured by the mutual independence of the functions $\varphi_j$ together with $q_l \ne 0$. More generally, the function $z_n$ may be chosen as any function of the $x_i$ which possesses continuous first partial derivatives, which satisfies $\Delta \ne 0$, and whose time derivative contains the variable $u$ with a coefficient which does not vanish in the neighborhood of the singular subarc to be tested.
It should be mentioned in passing that the functions $r_j$, $j = 1,\dots,n-1$, obtained by solution of the system (3.61) are not unique, since arbitrary once-differentiable functions of the integrals $\varphi_k$ are also solutions of the partial differential equation (3.60) and also have the desired properties. Thus, the canonical form may be realized in a number of ways, and, in general, the choice may be decided upon for convenience in subsequent analysis. A systematic method for exploiting the choice may be of interest as a topic for future research.
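The construction just described can be tried on a small case. The sketch below is an illustration under stated assumptions, not an example from the text: it takes a hypothetical two-state system with $q_1 = 1$, $q_2 = x_1$, for which the characteristic equations $dx_1/ds = 1$, $dx_2/ds = x_1$ have the integral $\varphi = x_2 - x_1^2/2$, and checks by central differences that $\varphi$ satisfies the partial differential equation (3.60). All function names are hypothetical.

```python
def q(x):
    """Hypothetical coefficients q_i(x) of the control in a two-state system."""
    x1, _x2 = x
    return (1.0, x1)


def phi(x):
    """Candidate integral of dx1/ds = 1, dx2/ds = x1."""
    x1, x2 = x
    return x2 - 0.5 * x1 * x1


def gradient(f, x, h=1e-6):
    """Central-difference gradient of the scalar function f at the point x."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


def pde_residual(x):
    """Left member of the first-order PDE: sum over i of q_i * d(phi)/dx_i."""
    return sum(qi * gi for qi, gi in zip(q(x), gradient(phi, x)))


points = [(-2.0, 1.0), (0.5, -3.0), (1.7, 0.2)]
residuals = [pde_residual(p) for p in points]
```

With $z_1 = \varphi$ and $z_2 = x_1$ (permissible here because $q_1 = 1 \ne 0$), the Jacobian determinant of the transformation is $-1$, so the change of variables is nonsingular everywhere for this hypothetical system.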
3.22 The Legendre-Clebsch Condition in the Transformed Variables
To provide intuitive motivation for our next step, we digress momentarily, considering the possibilities offered by our transformation in (rarely occurring) problems devoid of inequality constraints on the control variable. In
such cases, we are led to an equivalent problem in a state space of smaller dimension, the $z_j$, $j = 1,\dots,n-1$, becoming the state variables and $z_n = x_l$ the control variable. This change occurs through the identical vanishing of the Lagrange multiplier associated with the $n$th equation of state
(3.65)
In this equation, as well as in the first $n - 1$ equations of state (3.59), the variables $x_i$ are presumed eliminated in favor of the $z_j$ by use of the inverse transformation. It should be noted that jump discontinuities in the new control variable $z_n(t) = x_l(t)$ occurring at corner points of the solution imply impulsive behavior of $u(t)$. Such behavior would be admissible in the absence of an inequality constraint on $u$, which we have momentarily assumed, the Weierstrass necessary condition then being directly applicable. Unless the transformed equations are linear in the new control variable $z_n = x_l$, the Weierstrass necessary condition can be employed in conjunction with the Euler equations for the transformed problem to yield information not obtainable via the corresponding condition in the original problem. The extremals of the transformed problem are the singular extremals of the original, and those extremals satisfying the strengthened version of the Weierstrass condition are minimizing, at least over short intervals. In the special case in which the transformed equations of state (3.56) are linear in the new control variable $x_l$, an additional transformation to a state space of still smaller dimension is indicated.
Redirecting attention to the problem of main interest in which the inequality constraint (3.57) is operative, we perceive that the course of action just described is not open to us. We may, however, examine subarcs over which the control variable $u$ takes on values intermediate between the specified bounds
\[
u_1 < u < u_2 \tag{3.66}
\]
with similar considerations in mind.
If $u = \bar u(t)$ is the optimal control, we must, evidently, restrict attention to small variations $\delta u(t) = \varepsilon\,\eta(t)$, where $\eta(t)$ is an arbitrary piecewise continuous function and the magnitude of $\varepsilon$ is vanishingly small, so that $u = \bar u + \delta u$ satisfies (3.66). In the literature of classical variational theory, such variations are often referred to as weak variations, and the Legendre-Clebsch condition, necessary for a weak relative minimum, plays a role loosely analogous to that of the Weierstrass condition whenever a restriction to vanishingly small variations is either assumed or imposed. We rewrite Eqs. (3.59) with the notation $a_j$ for the functions appearing on the right as
\[
\dot z_j = a_j(x_1,\dots,x_n,t), \qquad j = 1,\dots,n-1 \tag{3.67}
\]
and with the variables $x_i$ eliminated in favor of $z_j$, as
\[
\dot z_j = b_j(z_1,\dots,z_n,t), \qquad j = 1,\dots,n-1 \tag{3.68}
\]
Introducing the usual Lagrange multipliers $\lambda_j$, $j = 1,\dots,n-1$, we form the Hamiltonian
(3.69)
and write the Euler-Lagrange equations corresponding to the $z_j$
\[
\dot\lambda_j = -\partial\mathcal{H}/\partial z_j, \qquad j = 1,\dots,n-1 \tag{3.70}
\]
and that corresponding to $z_n$
\[
\partial\mathcal{H}/\partial z_n = 0 \tag{3.71}
\]
The Legendre-Clebsch necessary condition is
(3.72)
for $\delta z_n \ne 0$, or
\[
\partial^2\mathcal{H}/\partial z_n^2 \ge 0 \tag{3.73}
\]
Solutions of the system (3.68), (3.70), and (3.71) are the extremals of the transformed problem; the condition (3.73) provides an additional criterion for screening these candidates. If the left member of (3.73) is positive, the singular subarc is locally minimizing, i.e., over short time intervals; if the left member is negative, the singular subarc is locally maximizing. The vanishing of the left member of (3.73) corresponds to the special case, mentioned earlier, in which $z_n$ enters the function $\mathcal{H}$ linearly. Thus along singular arcs of the original problem Eq. (3.73) partially fills the gap created when the Weierstrass necessary condition is being trivially satisfied.
The preceding argument lacks rigor because no account has been taken of the restriction on control variations imposed by terminal boundary conditions involving the $n$th state variable. It can be shown by direct calculation that the second variation test applied to a problem already transformed to canonical form produces precisely the Legendre-Clebsch criterion as given by Bliss and, hence, it becomes clear that the result rests upon the same assumptions as the second variation test.
Regarding choice between the two approaches to testing singular arcs, ease of application would seem to favor the second variation test, since no laborious synthesis of a transformation is required. The transformation scheme, when it can be carried through, however, has two attributes to recommend it. One is a sufficiency statement: the strong form of (3.73) guarantees the minimizing character of the singular solution over a sufficiently short
length of arc. (The strengthened Legendre-Clebsch condition ensures a weak relative minimum over a sufficiently short length of arc and, since only weak variations in the new control variable $z_n$ are admissible as a result of bounds on the control $u$, the restriction to a weak minimum loses significance.) The second is the convenience of the system in canonical form in analyzing the structure of the solution in terms of possible subarcs. While, in general, insufficient rules are available to permit systematic piecing together of subarc sequences into a composite solution, special cases may be more amenable to suitable specialized attacks if the canonical form is employed.
3.3 Examples
3.30 Two Elementary Examples
To illustrate the necessary condition in application of the second variation test, we will first consider two elementary examples. For the first example in our illustration, the differential equations of constraint are
\[
\dot x_1 = u, \qquad \dot x_2 = x_1^2 \tag{3.74}
\]
where the control $u$ is constrained by $|u| \le 1$. The problem is to determine $u(t)$, $t_0 \le t \le t_f$, so that the final value of the state variable $x_2$ is minimized subject to fixed values of $t_0$, $t_f$, $x_1(t_0)$, $x_2(t_0)$, and $x_1(t_f)$. The pseudo-Hamiltonian function $\mathcal{H}$ is
\[
\mathcal{H} = \lambda_1 u + \lambda_2 x_1^2 \tag{3.75}
\]
where
\[
\dot\lambda_1 = -2\lambda_2 x_1, \qquad \dot\lambda_2 = 0, \qquad \lambda_2(t_f) = 1
\]
Along the singular subarc
\[
\partial\mathcal{H}/\partial u = \lambda_1 = 0 \tag{3.76}
\]
thus leading to the conditions $x_1 = 0$ and $u = 0$. The application of the second variation test gives
(3.77)
thus satisfying the necessary condition for a minimizing extremal.
For the second example, we consider the differential equations of constraint to be
\[
\dot x_1 = x_2, \qquad \dot x_2 = u, \qquad \dot x_3 = x_1^2 \tag{3.78}
\]
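where $|u| \le 1$. The pseudo-Hamiltonian function $\mathcal{H}$ is
\[
\mathcal{H} = \lambda_1 x_2 + \lambda_2 u + \lambda_3 x_1^2 \tag{3.79}
\]
where
\[
\dot\lambda_1 = -2\lambda_3 x_1, \qquad \dot\lambda_2 = -\lambda_1, \qquad \dot\lambda_3 = 0, \qquad \lambda_3(t_f) = 1
\]
Along the singular subarc
\[
\partial\mathcal{H}/\partial u = \lambda_2 = 0 \tag{3.80}
\]
thus leading to the conditions $x_1 = x_2 = 0$ and $u = 0$. The first application of the second variation test gives
(3.81)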
which is inconclusive. The next application of the second variation test gives (3.82) which satisfies the necessary condition for a minimizing extremal. Since $q$ is even in Eq. (3.40), from the discussion of junction conditions in Sec. 3.17 one might be tempted to conjecture that such singular arcs are isolated and of only minor importance. That this is not so is illustrated by considering a problem treated recently by Johansen. As part of a more complex min-max problem, Johansen treats a special case of the problem just discussed with $t_0$ and $t_f$ fixed, $x_1(t_0)$ fixed, $x_2(t_0)$ unspecified, and $x_3(t_0) = 0$, with $x_1(t_f)$ and $x_2(t_f)$ unspecified. While these boundary conditions suffice to illustrate the phenomenon of interest, it will also appear for a large variety of other boundary conditions. The solution of this problem, as presented in Johansen, consists of a sequence of an infinite number of switchings between $u = -1$ and $u = +1$ with the time between switchings rapidly decreasing. The limit of the sum of times between switchings is finite, with $x_1$ and $x_2$ vanishing at the limit point. Thus the joining of this arc with the singular subarc $x_1 = x_2 = 0$ is possible at the essential discontinuity of $u(t)$. If the class of admissible functions specified in the problem statement required $u(t)$ to be piecewise continuous, one would say that no minimum exists, merely a lower bound. If $u(t)$ is required to be only measurable, however, the difficulty disappears. From an engineering viewpoint, such solutions are of interest since they can be approximated as closely as one wishes by better behaved functions. The point of the example is that singular arcs of the $q$-even type may play a role in solutions containing other types of arcs as subarcs.
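The minimizing character of the singular arc in the first example of this section ($\dot x_1 = u$, $\dot x_2 = x_1^2$) can be illustrated numerically. The following is a minimal sketch only: it assumes forward-Euler integration, the illustrative values $t_0 = 0$, $t_f = 1$, $x_1(t_0) = x_2(t_0) = 0$, and a hypothetical comparison control that returns $x_1$ to zero at $t_f$. None of these values or names come from the text.

```python
def simulate(control, t0=0.0, tf=1.0, steps=10000):
    """Forward-Euler integration of x1' = u, x2' = x1**2 starting from x1 = x2 = 0.

    Returns the pair (x1(tf), x2(tf)).
    """
    dt = (tf - t0) / steps
    x1 = 0.0
    x2 = 0.0
    for k in range(steps):
        t = t0 + k * dt
        u = max(-1.0, min(1.0, control(t)))  # enforce the bound |u| <= 1
        x2 += x1 * x1 * dt                   # uses x1 at the start of the step
        x1 += u * dt
    return x1, x2


def singular_control(t):
    """Control along the singular arc: u = 0, so x1 stays at zero."""
    return 0.0


def comparison_control(t):
    """Hypothetical admissible control that leaves and then returns to x1 = 0."""
    return 0.5 if t < 0.5 else -0.5


x1_singular, cost_singular = simulate(singular_control)
x1_comparison, cost_comparison = simulate(comparison_control)
```

Because $\dot x_2 = x_1^2 \ge 0$, any excursion of $x_1$ away from zero can only increase the final value of $x_2$, which is what the comparison run shows.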
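3.31 A Servomechanism Example
In Miele$^{4}$ and Johnson and Gibson,$^{20}$ the following problem has been studied in some detail. Given the system
\[
\dot x_1 = x_2 + y \tag{3.83}
\]
\[
\dot x_2 = -y \tag{3.84}
\]
\[
\dot x_3 = x_1^2/2 \tag{3.85}
\]
\[
|y| \le 1 \tag{3.86}
\]
the control taking the system from a specified initial state to $x_1 = x_2 = 0$ and extremizing the final value of $x_3$ is sought. The structure of the solution of this problem is rather complex, belying its innocuous appearance. The Hamiltonian function is
\[
\mathcal{H} = \lambda_1(x_2 + y) + \lambda_2(-y) + \lambda_3(x_1^2/2) \tag{3.87}
\]
and the Euler-Lagrange equations are
\[
\dot\lambda_1 = -\partial\mathcal{H}/\partial x_1 = -\lambda_3 x_1 \tag{3.88}
\]
\[
\dot\lambda_2 = -\partial\mathcal{H}/\partial x_2 = -\lambda_1 \tag{3.89}
\]
\[
\dot\lambda_3 = -\partial\mathcal{H}/\partial x_3 = 0 \tag{3.90}
\]
\[
\partial\mathcal{H}/\partial y = \lambda_1 - \lambda_2 = 0 \tag{3.91}
\]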
The necessary second variation condition for minimality of singular subarcs is (3.92)
which indicates the possible appearance of singular subarcs in the solution of the problem of minimizing the final value of $x_3$ since, in this case, $\lambda_{3f} = 1$. The test rules out the possibility of such subarcs in the problem of maximizing $x_{3f}$, thus leading to the conclusion, in this case, that the optimal control is bang-bang. The transformation scheme leads to new variables $z_1 = x_1 + x_2$, $z_2 = x_3$, $z_3 = x_2$ satisfying state equations
\[
\dot z_1 = z_3, \qquad \dot z_2 = (z_1 - z_3)^2/2, \qquad \dot z_3 = -y \tag{3.93}
\]
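The transformed state equations can be checked against the original system (3.83)-(3.85) by differentiating the new variables along the original dynamics. The sketch below does this at a few sample points. It takes $z_3 = x_2$, which is what the third transformed equation $\dot z_3 = -y$ requires; the function names and the sample values are hypothetical.

```python
def original_rates(x, y):
    """Right members of the original servomechanism system for state x and control y."""
    x1, x2, _x3 = x
    return (x2 + y, -y, 0.5 * x1 * x1)


def to_new_variables(x):
    """Transformation z1 = x1 + x2, z2 = x3, z3 = x2."""
    x1, x2, x3 = x
    return (x1 + x2, x3, x2)


def rates_by_chain_rule(x, y):
    """dz/dt obtained by differentiating z = r(x) along the original dynamics."""
    dx1, dx2, dx3 = original_rates(x, y)
    return (dx1 + dx2, dx3, dx2)


def transformed_rates(z, y):
    """Right members of the transformed system; only the z3 equation contains y."""
    z1, _z2, z3 = z
    return (z3, 0.5 * (z1 - z3) ** 2, -y)


samples = [((0.3, -1.2, 2.0), 0.7), ((-1.5, 0.4, 0.0), -1.0), ((2.0, 2.0, 5.0), 0.0)]
max_mismatch = 0.0
for x, y in samples:
    expected = rates_by_chain_rule(x, y)
    actual = transformed_rates(to_new_variables(x), y)
    for e, a in zip(expected, actual):
        max_mismatch = max(max_mismatch, abs(e - a))
```

The control $y$ cancels in $\dot z_1$ because it enters $\dot x_1$ and $\dot x_2$ with opposite signs, which is the property the transformation is designed to produce.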
Identifying the Lagrange multipliers corresponding to the new variables as $l_1$, $l_2$, $l_3$ in order to avert possible confusion, the Hamiltonian function is
\[
\mathcal{H} = l_1 z_3 + l_2\left[(z_1 - z_3)^2/2\right] + l_3(-y) \tag{3.94}
\]
and the Euler-Lagrange equations are
\[
\dot l_1 = -\partial\mathcal{H}/\partial z_1 = -l_2(z_1 - z_3) \tag{3.95}
\]
\[
\dot l_2 = -\partial\mathcal{H}/\partial z_2 = 0 \tag{3.96}
\]
\[
\dot l_3 = -\partial\mathcal{H}/\partial z_3 = -l_1 + l_2(z_1 - z_3) \tag{3.97}
\]
\[
\partial\mathcal{H}/\partial y = -l_3 = 0 \tag{3.98}
\]
the latter equation being satisfied along singular subarcs. Along such subarcs the Legendre-Clebsch necessary condition
\[
\partial^2\mathcal{H}/\partial z_3^2 = l_2 \ge 0 \tag{3.99}
\]
provides the same criterion as the second variation test; namely, expressed in terms of the original multipliers corresponding to the variables $x_1$, $x_2$, $x_3$, the result is $\lambda_3 \ge 0$.
3.32 A Midcourse Guidance Example
As a second illustration, we examine the simplified version of the optimal midcourse guidance problem treated by Striebel and Breakwell$^{21}$ by the Green's theorem method. The equations of state are (3.100)
(3.101) The variable $x_1$ is the variance of an extrapolated terminal miss estimate; $b(t) \ge 0$ is an "information rate" quantity, and $g(t) > 0$, a decreasing function of $t$, is a measure of control effectiveness. Control linear in $x_1(t)$ has been assumed, with a "feedback gain" $y(t) \ge 0$ taking on the role of control variable in the variational problem. The terminal value of $x_2$, the expected propellant expenditure, is to be minimized subject to fixed initial and terminal value specifications on $x_1$. Depending upon the boundary values and the given functions $g(t)$ and $b(t)$, one or more singular subarcs may appear in the solution. They are characterized by = -b*9/2g (3.102)
The necessary second variation condition takes the form (3.103) or (3.104)
which is satisfied, since both $b$ and $g$ are nonnegative from the nature of the midcourse problem.
The transformation scheme leads to new variables $z_1 = [2(x_1)^{1/2}/g] + x_2$ and $z_2 = x_1$, the latter becoming control variable along singular subarcs. Introducing multipliers $l_1$ and $l_2$ corresponding to the new variables, we have (3.105)
which leads to (3.102), and (3.106)
Since the quantity to be minimized, in terms of the new variables, is the terminal value of $z_1 - (2/g)z_2^{1/2}$, we have $l_1 \ge 0$ and the Legendre-Clebsch condition is satisfied.
3.33 Aircraft "Energy Climb"
Equations of state corresponding to a greatly simplified model of aircraft climb performance are given by
\[
\dot h = V\sin\gamma \tag{3.107}
\]
\[
\dot V = \frac{T - D}{m} - g\sin\gamma \tag{3.108}
\]
In these equations $h$ is altitude, $V$ airspeed, $\gamma$ the angle of the flight path to the horizontal, $T$ engine thrust, $D$ aerodynamic drag, $m$ the vehicle's mass, and $g$ the acceleration of gravity. The approximation has been made that the thrust is directed along the tangent to the flight path, and it is also usual to assume, in the course of the simplified "energy climb" analysis, that the mass $m$ is constant, i.e., fixed at some suitable average value. A further assumption, which drastically affects the character of optimal flight path solutions, is that aerodynamic drag is a function of altitude and airspeed only, $D = D(h, V)$, given by the drag for level flight. This procedure neglects induced drag changes arising from departures from level flight, possibly including substantial changes associated with maneuvering. Under these assumptions, the flight path angle $\gamma$ takes on the role of a control variable, and since only $\sin\gamma$ appears in the state equations, the substitution $u = \sin\gamma$, $-1 < u < 1$, brings the problem into the format of a single control variable appearing linearly. The version of the flight performance problem most often considered is minimum time from fixed initial $h$, $V$ to fixed final $h$, $V$. It should be noted that problems whose statement includes specifications on the range variable $x$, having the state equation $\dot x = V\cos\gamma$, are not amenable to the type of analysis presently under consideration.
Analysis of the Euler-Lagrange equations for this problem leads to a singular extremal characterized by
\[
V\frac{\partial}{\partial h}(T - D) - g\frac{\partial}{\partial V}(T - D) - \frac{g}{V}(T - D) = 0 \tag{3.109}
\]
which defines a curve in $h$, $V$ coordinates. This model of simplified aircraft climb performance has been analyzed by numerous investigators including Kaiser,$^{22}$ Rutowski, and Miele,$^{25}$ the latter most comprehensively by means of his Green's theorem device. It was recognized at an early date that (3.109) corresponds to stationary points of excess power $V(T - D)$ along contours of constant energy $h + (V^2/2g)$, i.e., that the result may be stated as
(3.110)
or as
\[
\frac{\partial}{\partial V}\Big[V(T - D)\Big]_{h + (V^2/2g) = \text{const}} = 0 \tag{3.111}
\]
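The statement that the singular extremal picks out stationary points of excess power along constant-energy contours can be illustrated with a simple search. The sketch below is hypothetical throughout: it assumes constant thrust and a drag law $D = kV^2$ with no altitude dependence, with made-up numerical values, so that the maximum along a contour is known in closed form as $V = (T/3k)^{1/2}$. It is not a model taken from the text.

```python
import math

G = 9.81  # acceleration of gravity, m/s^2


def excess_power(h, v, thrust=20000.0, k=0.5):
    """Excess power V*(T - D) for a hypothetical model with D = k*V**2.

    The altitude h is accepted for generality but unused by this simple drag law.
    """
    return v * (thrust - k * v * v)


def best_speed_on_energy_contour(energy_height, n=20001):
    """Grid search for the airspeed maximizing excess power on h + V^2/(2g) = const."""
    v_limit = math.sqrt(2.0 * G * energy_height)  # airspeed at which h falls to zero
    best_v = 0.0
    best_p = -math.inf
    for i in range(n):
        v = v_limit * i / (n - 1)
        h = energy_height - v * v / (2.0 * G)     # stay on the energy contour
        p = excess_power(h, v)
        if p > best_p:
            best_v, best_p = v, p
    return best_v, best_p


v_star, p_star = best_speed_on_energy_contour(5000.0)
v_closed_form = math.sqrt(20000.0 / (3.0 * 0.5))
```

With a drag law that also depends on altitude, the same search applies unchanged, since each trial airspeed is paired with the altitude that keeps the energy constant.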
The minimum time solution will be composed, in general, of vertical climb ($\gamma = \pi/2$) subarcs, vertical dive ($\gamma = -\pi/2$) subarcs, and subarcs of the singular arc. Of main interest in connection with the subject of the present chapter is the testing of the singular extremal for minimality over short lengths of arc. The transformation scheme leads naturally to energy as a state variable, $z_1 = h + (V^2/2g)$, $z_2 = h$, and the Legendre-Clebsch condition is (3.112) Since it is clear from the interpretation of the Lagrange multiplier variables that $l_1 < 0$ for minimum time problems, the criterion follows that (3.113)
which implies that the stationary points of excess power along constant energy contours must be maxima, a result in accord with engineering intuition. The second variation test, discussed in the beginning of this chapter, leads to the identical result. In analyses of aircraft capable of supersonic flight, the rise in drag through the region of transonic airspeeds will produce two maxima separated by a minimum in curves of excess power along contours of constant energy. Thus, a portion of the singular extremal in the vicinity of the transonic region will not furnish minimum time over short lengths of arc, and the sequence of
subarcs for some specified combinations of initial and final states may be quite complex. While analysis via the simplified model has largely been superseded in favor of more satisfactory models in applications work, it is of historical interest that some of the earliest engineering applications of variational theory encountered singular arcs; however, the accompanying mathematical difficulties were not appreciated until some time later. The relationship between optimal paths according to the simplified model and to more complex models has been investigated to a certain extent, with central portions of optimal flight paths for the complex model often resembling flight along the singular extremal of the simplified theory.
3.34 Goddard's Problem
The problem of determining the optimal thrust program for the vertical flight of a sounding rocket has been extensively studied in the astronautical literature. The state variables in this problem are altitude $h$, velocity $v$, and mass $m$. Using these variables, the differential constraints are
\[
\dot h = v \tag{3.114}
\]
\[
\dot v = \frac{T - D(h, v)}{m} - g(h) \tag{3.115}
\]
\[
\dot m = -\frac{T}{c} \tag{3.116}
\]
in which the rocket thrust $T$ is bounded above and below according to
\[
0 \le T \le T_{\max} \tag{3.117}
\]
The function $D$ is aerodynamic drag, $g$ is the acceleration of gravity, and $c$ is rocket exhaust velocity. The pseudo-Hamiltonian function $\mathcal{H}$ is
\[
\mathcal{H} = \lambda_1 v + \lambda_2\left[\frac{T - D(h, v)}{m} - g(h)\right] - \lambda_3\frac{T}{c} \tag{3.118}
\]
where
\[
\dot\lambda_1 = \lambda_2\left[\frac{1}{m}\frac{\partial D}{\partial h} + \frac{\partial g}{\partial h}\right], \qquad
\dot\lambda_2 = -\lambda_1 + \frac{\lambda_2}{m}\frac{\partial D}{\partial v}, \qquad
\dot\lambda_3 = \frac{\lambda_2}{m^2}\left[T - D(h, v)\right] \tag{3.119}
\]
The singular subarc occurs when (3.120)
An evaluation of the constants for the necessary second variation condition gives 1 A,, = A , , = 0, m (3.121)
Substituting these relationships into Eq. (3.15) yields, as a necessary condition for maximum summit altitude, (3.122) This expression, together with the relationship -
dt
dT
___
(3.123)
satisfied along the singular subarc, gives
[
A2 0 : ; ~ ) 2 d D -m3
a2D] +--+-av av2 2 0
(3.124)
An application of the transformation method requires the solution of the following differential equations:
\[
dh/ds = q_1 = 0, \qquad dv/ds = q_2 = 1/m, \qquad dm/ds = q_3 = -1/c \tag{3.125}
\]
The constants of integration are evaluated, thus leading to the transformation
\[
z_1 = h, \qquad z_2 = v, \qquad z_3 = m\,e^{v/c} \tag{3.126}
\]
i, = u (3.127)
Letting $z_2$ take on the role of the new control variable, the new pseudo-Hamiltonian is
(3.128) where
i,
,
= ;g(z,) 13
(3.129)
The stationary solution of the transformed problem corresponding to the singular subarc of the original problem occurs when
The Legendre-Clebsch necessary condition requires for a maximal extremal
a22
-=
azZ2
I, --
D(z,, z 2 )
c
2 aD azD] +--+y SO (3.131) c
aZ2 aZ2
which is identical to the expression reached by the necessary second variation condition. The advantage of employing the above variables $z_1$ and $z_2$ was first recognized by Ross,$^{28}$ who established the maximal character of the variable thrust subarc for the square-law drag case. In the case of a more general drag law, e.g., one which exhibits sharp variation in the vicinity of sonic velocity, the Legendre-Clebsch condition may rule out intermediate thrust operation over a certain velocity range. This was pointed out by Leitmann.
3.35 Lawden's Spiral
Rocket flight in an inverse square law field has been studied extensively by Lawden and others. However, until recently, the nature of the singular arcs has been unresolved. A recent, rather comprehensive analysis of this problem and the singular arcs was given by Robbins. The system equations are
\[
\dot u = Y + (T/m)\sin\theta, \qquad \dot v = X + (T/m)\cos\theta, \qquad \dot y = u, \qquad \dot x = v, \qquad \dot m = -T/c \tag{3.132}
\]
where
\[
Y = -\mu y/R^3, \qquad X = -\mu x/R^3, \qquad R = (x^2 + y^2)^{1/2} \tag{3.133}
\]
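and the problem is to choose $T$ and $\theta$ such that $P = -m_f$ is minimized (minimum fuel) subject to the constraint $0 \le T \le T_{\max}$. The pseudo-Hamiltonian function $\mathcal{H}$ becomes
\[
\mathcal{H} = \lambda_1\left(-\frac{\mu y}{R^3} + \frac{T}{m}\sin\theta\right) + \lambda_2\left(-\frac{\mu x}{R^3} + \frac{T}{m}\cos\theta\right) + \lambda_3 u + \lambda_4 v - \lambda_5\frac{T}{c} \tag{3.134}
\]
where the $\lambda_i$ are the adjoint variables and obey the differential equations
\[
\dot\lambda_i = -\partial\mathcal{H}/\partial x_i, \qquad i = 1,\dots,5 \tag{3.135}
\]
From the classical theory $\mathcal{H}$ is minimized with respect to $T$ and $\theta$, giving
\[
\sin\theta = -\frac{\lambda_1}{(\lambda_1^2 + \lambda_2^2)^{1/2}}, \qquad
\cos\theta = -\frac{\lambda_2}{(\lambda_1^2 + \lambda_2^2)^{1/2}}, \qquad
T = T_{\max} \ \text{when}\ \rho < 0, \qquad T = 0 \ \text{when}\ \rho > 0 \tag{3.136}
\]
where
\[
\rho = -\frac{(\lambda_1^2 + \lambda_2^2)^{1/2}}{m} - \frac{\lambda_5}{c} \tag{3.137}
\]
The singular condition occurs when $\rho$ identically equals zero over a nonzero interval of time. That $(\lambda_1^2 + \lambda_2^2)^{1/2}$ is then a constant derives from $d(m\lambda_5)/dt = 0$. Without loss of generality we set $(\lambda_1^2 + \lambda_2^2)^{1/2} = 1$. Applying the first test given by Eq. (3.28) with $q = 1$, we obtain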
)?I1
c s
Ai,lAj,l 2 0 + 2- i,.i= I axi axj
(3.138)
Substituting for ql this expression becomes
(3.139) The only nonzero terms will be contributed by ( a 2 2 / a T a x S ) A 5 ,and ,
(8’H / a m 2 A:, ) where
a22 --
- sin
aT ax, a 2 x
am2
-
2
0+
m
cos 0
)
=
1
m
2T TEA, sin 8 + A2 cos 01 = - 3 m3 m
(3.140)
Evaluating Eq. (3.139) with these terms, we find that it is satisfied with the equality sign. Applying the second test, that is Eq. (3.28) with q = 2, we obtain 1
5
where
The evaluation of Eq. (3.142) shows q1,2 and u],,, to be identically equal to zero. Furthermore, the only terms that contribute to the sum in Eq. (3.141) are for i and j equal to 3 and 4. Without loss of generality, we choose our coordinate system such that x j = R and x4 = 0 when the test is applied. The terms that contribute to Eq. (3.141) are: a 2 x
-=-ax,2
d2&
6p~1 R4 '
3p~2
--ax3 ax, R4 '
I . m
A 3 , 2= - sin 8,
A4,2
a2z- 3 p ~ 1 ax,'
R4
1
(3.143)
= - cos 8
m
Substituting Eq. (3.143) into Eq. (3.141) and using Eq. (3.136), we obtain as a necessary condition
3 p sin 0 -~ (3 - 5 sin2 e} 2 o
(3.144)
m2R4
From Eqs. (5.94) and (5.105) of Lawden, we see that along the time-open singular arc ($\mathcal{H} = 0$) $R$ and $T/m$ are given by
R= and 1 - 3 sin2 cp T =6 (
m
3 - 5sin2 cp
a sin6 cp 1 - 3 sin2 cp
[27 - 75 sin2 cp + 60 sin4 cp] sin" cp
(3.145)
1
(3.146)
where $a$ and $b$ are positive constants and $\varphi$ is the angle between the thrust direction and the local horizontal. Note that $\varphi$ equals $\theta$ for the position coordinates we have chosen. From Eq. (3.146) we see that $\sin\theta$ must be positive and from Eq. (3.145) that

$$0 < \sin\theta < 1/\sqrt{3} \tag{3.147}$$
which violates inequality (3.144), thus showing that the singular arc is not minimizing.

Application of the transformation approach to Lawden's problem encounters two complications not met in the simpler examples treated in the preceding pages. The first is that two successive transformations become necessary, for reasons which will become apparent; the second is a proliferation of state variables, arising from the appearance of time derivatives of the second control variable $\theta$ in the transformed equations of state. The following treatment is essentially the same as the one previously given in Kelley,⁹ with additional material on the synthesis of transformations from Kelley.⁷

A transformation designed to rid all save one of the state equations of the linearly appearing control variable $T$ is obtained by application of the method described earlier as

$$\alpha = u\cos\theta - v\sin\theta, \qquad \beta = u\sin\theta + v\cos\theta + c\log m, \qquad V = -c\log m \tag{3.148}$$

The inverse transformation is

$$u = \alpha\cos\theta + \beta\sin\theta - c\log m\,\sin\theta, \qquad v = -\alpha\sin\theta + \beta\cos\theta - c\log m\,\cos\theta, \qquad m = e^{-V/c} \tag{3.149}$$
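A quick round-trip check confirms that the inverse relations undo the forward transformation. This is a minimal sketch with illustrative function names; it writes $-c\log m$ as $V$ in the inverse, which is equivalent by the third relation of the forward transformation.

```python
import math

def forward(u, v, m, theta, c):
    """Forward transformation: (u, v, m) -> (alpha, beta, V)."""
    s, co = math.sin(theta), math.cos(theta)
    alpha = u * co - v * s
    beta = u * s + v * co + c * math.log(m)
    V = -c * math.log(m)
    return alpha, beta, V

def inverse(alpha, beta, V, theta, c):
    """Inverse transformation, with V standing in for -c log m."""
    s, co = math.sin(theta), math.cos(theta)
    u = alpha * co + beta * s + V * s
    v = -alpha * s + beta * co + V * co
    m = math.exp(-V / c)
    return u, v, m
```

Applying `inverse(*forward(u, v, m, theta, c), theta, c)` should return the original `(u, v, m)` to rounding error for any positive mass.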
The transformed equations of state are

$$\begin{aligned}
\dot{\alpha} &= Y\cos\theta - X\sin\theta - (V+\beta)\,\omega\\
\dot{\beta} &= Y\sin\theta + X\cos\theta + \alpha\omega\\
\dot{y} &= \alpha\cos\theta + \beta\sin\theta + V\sin\theta\\
\dot{x} &= -\alpha\sin\theta + \beta\cos\theta + V\cos\theta\\
\dot{V} &= e^{V/c}\,T\\
\dot{\theta} &= \omega
\end{aligned} \tag{3.150}$$

in which $\theta$ has now assumed the status of a state variable due to the appearance of its first time derivative. As previously noted, there is some freedom of choice in selection of new variables during the synthesis procedure and, in this case, the choice has been governed by simplification of subsequent manipulations. The new variable $V$ now has acquired some of the properties
of a control variable, since arbitrary weak variations in $V$ may be approximated by suitable thrust variations satisfying the thrust inequality (3.136). However, the exceptional case has arisen in which the new variable $V$ also appears linearly in the transformed equations of state and, as discussed earlier in this connection, a second transformation is indicated. The result also may be obtained by the synthesis procedure as

$$\psi = y\sin\theta + x\cos\theta + \alpha/\omega, \qquad \gamma = y\cos\theta - x\sin\theta, \qquad \Phi = y\sin\theta + x\cos\theta \tag{3.151}$$

where the functions $\psi$ and $\gamma$ are integrals, and the function $\Phi$ is destined to assume the role of a new control variable. The two transformations may be combined into a single transformation, which may take the form

$$\begin{aligned}
\psi &= y\sin\theta + x\cos\theta + (1/\omega)(u\cos\theta - v\sin\theta)\\
\beta &= c\log m + u\sin\theta + v\cos\theta\\
\gamma &= y\cos\theta - x\sin\theta\\
\Phi &= y\sin\theta + x\cos\theta\\
V &= -c\log m
\end{aligned} \tag{3.152}$$
It is readily verified that this transformation is nonsingular, by noting the nonvanishing of the Jacobian determinant

$$\frac{\partial(\psi,\beta,\gamma,\Phi,V)}{\partial(u,v,y,x,m)} = -\frac{c}{\omega m} \tag{3.153}$$
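The nonsingularity claim can be probed numerically. The sketch below differentiates the combined transformation by central differences and evaluates the determinant by Gaussian elimination. The ordering of old variables $(u, v, y, x, m)$ and new variables $(\psi, \beta, \gamma, \Phi, V)$ is an assumption made here, and it fixes the sign of the result; with that ordering the determinant works out to $-c/(\omega m)$, which is nonzero for finite $\omega$ and $m$.

```python
import math

def transform(state, theta, omega, c):
    """Combined transformation (u, v, y, x, m) -> (psi, beta, gamma, Phi, V)."""
    u, v, y, x, m = state
    s, co = math.sin(theta), math.cos(theta)
    phi = y * s + x * co
    psi = phi + (u * co - v * s) / omega
    beta = c * math.log(m) + u * s + v * co
    gamma = y * co - x * s
    V = -c * math.log(m)
    return [psi, beta, gamma, phi, V]

def jacobian(state, theta, omega, c, h=1e-6):
    """Central-difference Jacobian; jac[i][j] = d(new_i)/d(old_j)."""
    n = len(state)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up, dn = list(state), list(state)
        up[j] += h
        dn[j] -= h
        f_up = transform(up, theta, omega, c)
        f_dn = transform(dn, theta, omega, c)
        for i in range(n):
            jac[i][j] = (f_up[i] - f_dn[i]) / (2.0 * h)
    return jac

def det(mat):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in mat]
    n, d = len(a), 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            return 0.0
        if p != k:
            a[k], a[p] = a[p], a[k]
            d = -d
        d *= a[k][k]
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
    return d
```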
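The equations of state in the new system of variables are obtained as

$$\dot{\psi} = \gamma\omega + (1/\omega)(Y\cos\theta - X\sin\theta) - (\rho/\omega)(\psi - \Phi) \tag{3.154}$$

$$\dot{\beta} = Y\sin\theta + X\cos\theta + \omega^2(\psi - \Phi) \tag{3.155}$$

$$\dot{\gamma} = \omega(\psi - 2\Phi) \tag{3.156}$$

$$\dot{\Phi} = \beta + \gamma\omega + V \tag{3.157}$$

$$\dot{V} = T/m = Te^{V/c} \tag{3.158}$$

$$\dot{\theta} = \omega \tag{3.159}$$

$$\dot{\omega} = \rho \tag{3.160}$$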
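The transformed equations of state can be cross-checked against the original ones by the chain rule. The sketch below rests on two assumptions not shown in this part of the text: that the original equations, Eq. (3.135), read $\dot{u} = (T/m)\sin\theta + Y$, $\dot{v} = (T/m)\cos\theta + X$, $\dot{y} = u$, $\dot{x} = v$, $\dot{m} = -T/c$, and that gravity follows an inverse square law. The constants `K` and `C` and all function names are illustrative.

```python
import math

K, C = 1.0, 2.0  # gravitational constant and exhaust speed (illustrative values)

def gravity(y, x):
    r3 = (x * x + y * y) ** 1.5
    return -K * y / r3, -K * x / r3  # Y, X

def original_rates(s, T, rho):
    """Assumed original state equations plus theta' = omega, omega' = rho."""
    u, v, y, x, m, th, om = s
    Y, X = gravity(y, x)
    return [T / m * math.sin(th) + Y, T / m * math.cos(th) + X, u, v, -T / C, om, rho]

def new_vars(s):
    """New variables (psi, beta, gamma, Phi, V) from the combined transformation."""
    u, v, y, x, m, th, om = s
    si, co = math.sin(th), math.cos(th)
    phi = y * si + x * co
    psi = phi + (u * co - v * si) / om
    beta = C * math.log(m) + u * si + v * co
    gamma = y * co - x * si
    V = -C * math.log(m)
    return [psi, beta, gamma, phi, V]

def rates_by_chain_rule(s, T, rho, h=1e-6):
    """Directional derivative of the new variables along the original flow."""
    f = original_rates(s, T, rho)
    up = new_vars([a + h * b for a, b in zip(s, f)])
    dn = new_vars([a - h * b for a, b in zip(s, f)])
    return [(p - q) / (2.0 * h) for p, q in zip(up, dn)]

def rates_from_text(s, T, rho):
    """Right-hand sides of the transformed equations for psi, beta, gamma, Phi, V."""
    u, v, y, x, m, th, om = s
    psi, beta, gamma, phi, V = new_vars(s)
    Y, X = gravity(y, x)
    si, co = math.sin(th), math.cos(th)
    return [gamma * om + (Y * co - X * si) / om - rho / om * (psi - phi),
            Y * si + X * co + om * om * (psi - phi),
            om * (psi - 2.0 * phi),
            beta + gamma * om + V,
            T * math.exp(V / C)]
```

Agreement of the two rate vectors at an arbitrary state is consistent with the transformed system as written, under the stated assumptions.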
It has been tacitly assumed in the course of the manipulations leading to Eqs. (3.154) through (3.160) that the steering angle $\theta$ is twice differentiable, i.e., that the derivatives $\dot{\theta} = \omega$ and $\dot{\omega} = \rho$ exist. Examination of Eqs. (3.135) and (3.136) indicates that such an assumption is justified if the gravitational force components $Y$ and $X$ possess first partial derivatives, except for a finite number of points along the trajectory corresponding to thrust direction reversals at which $\lambda_1$ and $\lambda_2$ vanish simultaneously. We exclude such reversal points from the segments of arc analyzed in the following.

In accordance with the objectives of the transformation, it is observed that the variables $T$ and $V$ appear only in Eqs. (3.157) and (3.158) and that, as a consequence of this, the multipliers $\lambda_\Phi$ and $\lambda_V$ vanish along the singular subarcs. We note that the coefficients of $T$ in Eq. (3.158) and of $V$ in Eq. (3.157) never vanish and, accordingly, that an admissible variation in thrust $\delta T$ may be found which produces an approximation as close as one wishes to an arbitrary variation $\delta\Phi(t)$, provided that the magnitude of $\delta\Phi$ is sufficiently small. With $\Phi$ in the role of control variable and small variations being assumed, the intermediate thrust arcs must satisfy the Legendre-Clebsch necessary condition for a weak relative minimum. The Euler-Lagrange equations for the system (3.154), (3.155), (3.156), (3.159), (3.160) are

(3.161)
ax
a
ax 10 -- - - - =ao
-0
(3.162)
(Ysin O
(3.1 63)
+ X c o s 0)
a
a
- ~ ~ - ( ( ~ c o s O - ~ s i n O ) - A- ( Y s i n O + X c o s d )
ae
-2Apw($
ao
- 0)- A?($
- 20) - a,
(3.164)
(3.165)
(3.166)
(3.167)
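The Legendre-Clebsch necessary condition for a weak relative minimum is

$$\frac{\partial^2\mathcal{H}}{\partial\Phi^2}\,d\Phi^2 + 2\,\frac{\partial^2\mathcal{H}}{\partial\Phi\,\partial\rho}\,d\Phi\,d\rho + \frac{\partial^2\mathcal{H}}{\partial\rho^2}\,d\rho^2 \ge 0 \tag{3.168}$$

for arbitrary $d\Phi$, $d\rho$. Positive semidefiniteness of this quadratic form requires that

$$\frac{\partial^2\mathcal{H}}{\partial\Phi^2} \ge 0 \tag{3.169}$$

$$\frac{\partial^2\mathcal{H}}{\partial\rho^2} \ge 0 \tag{3.170}$$

$$\frac{\partial^2\mathcal{H}}{\partial\Phi^2}\,\frac{\partial^2\mathcal{H}}{\partial\rho^2} - \left(\frac{\partial^2\mathcal{H}}{\partial\Phi\,\partial\rho}\right)^2 \ge 0 \tag{3.171}$$

We have

$$\frac{\partial^2\mathcal{H}}{\partial\Phi^2} = \lambda_\beta\,\frac{\partial^2}{\partial\Phi^2}(Y\sin\theta + X\cos\theta) + \frac{\lambda_\psi}{\omega}\,\frac{\partial^2}{\partial\Phi^2}(Y\cos\theta - X\sin\theta) \tag{3.172}$$

$$\frac{\partial^2\mathcal{H}}{\partial\rho^2} = 0 \tag{3.173}$$

$$\frac{\partial^2\mathcal{H}}{\partial\Phi\,\partial\rho} = \frac{\lambda_\psi}{\omega} \tag{3.174}$$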
From Eqs. (3.171), (3.173) and (3.174), it follows that $\lambda_\psi = 0$. With this simplification and the elimination of the multiplier variables from the Euler-Lagrange equations (3.161) through (3.167), we arrive at
az
w2+-=0
a@ az
-p + - = 0
aY
(3.1 75) (3.176) (3.177)
in which 2 - YsinB+XcosO
(3.178)
is the component of gravitational force along the thrust direction. In the case of an inverse square law gravitational field

$$Y = \frac{-ky}{(x^2+y^2)^{3/2}}, \qquad X = \frac{-kx}{(x^2+y^2)^{3/2}} \tag{3.179}$$
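For the inverse square field, the gravitational component along the thrust direction, $Y\sin\theta + X\cos\theta$, depends only on $\Phi = y\sin\theta + x\cos\theta$ and $\gamma = y\cos\theta - x\sin\theta$. The sketch below checks this and compares a finite-difference second derivative with respect to $\Phi$ against the closed form $3k\Phi(3\gamma^2 - 2\Phi^2)/R^7$. That closed form is worked out here for illustration and is not quoted from the text; the function names are hypothetical.

```python
import math

def thrust_axis_gravity(y, x, theta, k=1.0):
    """Y sin(theta) + X cos(theta) for the inverse square field."""
    r3 = (x * x + y * y) ** 1.5
    Y, X = -k * y / r3, -k * x / r3
    return Y * math.sin(theta) + X * math.cos(theta)

def to_xy(phi, gamma, theta):
    """Invert Phi = y sin + x cos, gamma = y cos - x sin."""
    s, co = math.sin(theta), math.cos(theta)
    return phi * s + gamma * co, phi * co - gamma * s  # y, x

def component_in_phi_gamma(phi, gamma, k=1.0):
    """Same component written in the rotated coordinates."""
    return -k * phi / (phi * phi + gamma * gamma) ** 1.5

def second_partial_phi(phi, gamma, k=1.0, h=1e-4):
    f = component_in_phi_gamma
    return (f(phi + h, gamma, k) - 2.0 * f(phi, gamma, k) + f(phi - h, gamma, k)) / (h * h)

def closed_form_second_partial(phi, gamma, k=1.0):
    r2 = phi * phi + gamma * gamma
    return 3.0 * k * phi * (3.0 * gamma * gamma - 2.0 * phi * phi) / r2 ** 3.5
```

When $3\gamma^2 > 2\Phi^2$, the sign of this second derivative is the sign of $\Phi$.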
and Eqs. (3.175), (3.176) and (3.177) become

(3.180) (3.181) (3.182)

If $\omega$ is eliminated between Eqs. (3.180) and (3.182), we obtain

(3.183)

The vanishing of the first factor, $\Phi = 0$ (circumferential thrust), leads to $\rho = 0$, $\omega = \text{constant}$ and

$$|\omega| = (k/R^3)^{1/2} \tag{3.184}$$

where $R = (\Phi^2 + \gamma^2)^{1/2}$ is the radius. This equation is the orbital frequency for free fall circular motion. The vanishing of the second factor indicates that Eq. (3.182) is satisfied identically along solutions of Eqs. (3.180) and (3.181), these being the equations of Lawden's intermediate thrust solutions, although in rather different notation.

A further result may be obtained from analysis of the consequences of inequality (3.169), which went unnoticed in the original study. Evaluation of the second partial derivative appearing in (3.169) combined with slight algebraic manipulation yields
Since $k > 0$ and the bracketed quantity can be shown to be positive, it follows that

$$\lambda_\beta\,\Phi \ge 0 \tag{3.186}$$

is a necessary condition for minimality of an intermediate thrust arc. By the interpretation of Lagrange multipliers as influence functions of the functional being minimized with respect to the state variables, one can show that the multiplier $\lambda_\beta$ will be negative for minimum fuel problems. Thus

$$\Phi = y\sin\theta + x\cos\theta \le 0 \tag{3.187}$$

is a necessary condition requiring that the scalar product of radius vector and thrust direction be negative; i.e., the thrust must be inwardly directed
if an intermediate thrust arc is minimizing. This is the same result as that obtained via the second variation test, and is the same as the result first given by Robbins.¹⁰

While the application of the necessary conditions rules out some members of the family of singular arcs, including Lawden's spiral arising in the time-open case, some others qualify and remain as candidates. Since this is a $q$-even problem ($q = 2$), corner junctions of these locally minimizing candidates with nonsingular arcs are not permissible. There remain the possibilities of control saturation junctions and chattering junctions, as well as the possibility that "bang-bang" thrusting arcs merely cluster around the minimizing singular arcs without ever joining with them. The structure of the family of solutions of Lawden's problem, in other words, remains largely unexplored, even after numerous efforts by qualified investigators.
REFERENCES

1. P. Contensou, "Etude theorique des trajectoires optimales dans un champ de gravitation. Application au cas d'un centre d'attraction unique," Astronaut. Acta 8, 134-150 (1962).
2. J. Warga, Relaxed variational problems, J. Math. Anal. Appl. 4, 111-127 (1962).
3. J. Warga, Necessary conditions for minimum in relaxed variational problems, J. Math. Anal. Appl. 4, 129-145 (1962).
4. A. Miele, Extremization of linear integrals by Green's theorem, in "Optimization Techniques" (G. Leitmann, ed.), Chapter 3, pp. 69-98. Academic Press, New York, 1962.
5. J. P. LaSalle, The time optimal control problem, in "Contributions to the Theory of Nonlinear Oscillations," Vol. V, pp. 1-24. Princeton Univ. Press, Princeton, New Jersey, 1960.
6. R. E. Kopp and H. G. Moyer, Necessary conditions for singular extremals, AIAA J. 3, 1439-1444 (1965).
7. H. J. Kelley, A transformation approach to singular subarcs in optimal trajectory and control problems, J. SIAM Control 2, 234-240 (1964).
8. H. J. Kelley, A second variation test for singular extremals, AIAA J. 2, 1380-1382 (1964).
9. H. J. Kelley, Singular extremals in Lawden's problem of optimal rocket flight, AIAA J. 1, 1578-1580 (1963).
10. H. M. Robbins, Optimality of intermediate-thrust arcs of rocket trajectories, AIAA J. 3, 1094-1098 (1965).
11. R. E. Kopp, Pontryagin maximum principle, in "Optimization Techniques" (G. Leitmann, ed.), Chapter 7, pp. 255-279. Academic Press, New York, 1962.
12. R. Courant and D. Hilbert, "Methods of Mathematical Physics," Vol. I, p. 215. Wiley (Interscience), New York, 1953.
13. F. D. Faulkner, Direct methods, in "Optimization Techniques" (G. Leitmann, ed.), Chapter 2, pp. 33-67. Academic Press, New York, 1962.
14. F. D. Faulkner, The problem of Goddard and optimum thrust programming, in "Advances in the Astronautical Sciences," Vol. I. Plenum Press, New York, 1957.
15. F. D. Faulkner, A degenerate problem of Bolza, Proc. Amer. Math. Soc. 6, 847-854 (1955).
16. W. M. Wonham and C. D. Johnson, Optimal bang-bang control with quadratic performance index, Proc. Fourth Joint Automatic Control Conf., pp. 101-112 (1963).
17. R. Courant and D. Hilbert, "Methods of Mathematical Physics," Vol. II. Wiley (Interscience), New York, 1962.
18. G. A. Bliss, "Lectures on the Calculus of Variations." Univ. of Chicago Press, Chicago, Illinois, 1946.
19. D. E. Johansen, Solution of a linear mean square estimation problem when process statistics are undefined, Joint Automatic Control Conf., Troy, New York, June 1965.
20. C. D. Johnson and J. E. Gibson, Singular solutions in problems of optimal control, IEEE Trans. Automatic Control 8, 4-15 (1963).
21. C. T. Streibel and J. V. Breakwell, Minimum effort control in interplanetary guidance, IAS Preprint No. 63-80 (January, 1963).
22. F. Kaiser, Der Steigflug mit Strahlflugzeugen-Teil 1, Bahngeschwindigkeit besten Steigens, Versuchsbericht 262-02-L44, Messerschmitt A. G., Augsburg (April, 1944). (Translated as Ministry of Supply RTP/TIB, Translation GDC/15/148 T.)
23. K. J. Lush, A review of the problem of choosing a climb technique with proposals for a new climb technique for high performance aircraft, Aeronautical Research Council Report Memo. No. 2557 (1951).
24. E. S. Rutowski, Energy approach to the general aircraft performance problem, Aero/Space Sci. J. 21, 187-195 (1954).
25. A. Miele, Optimum climbing technique for a rocket-powered aircraft, Jet Propulsion 25, 385-391 (1955).
26. H. J. Kelley, M. Falco and D. J. Ball, Air vehicle trajectory optimization, SIAM Symp. Multivariable System Theory, Cambridge, Massachusetts, November 1-3, 1962.
27. H. Heermann, The minimum time problem, J. Astronaut. Sci. 2, 93-107 (1964).
28. S. Ross, Minimality for problems in vertical and horizontal rocket flight, Jet Propulsion 28, 55-56 (1958).
29. G. Leitmann, An elementary derivation of the optimal control conditions, in Proc. 12th Intern. Astronaut. Congr., 1961 (R. M. L. Baker and M. W. Makemson, eds.), pp. 275-298. Academic Press, New York, 1963.
30. D. F. Lawden, Optimal intermediate-thrust arcs in a gravitational field, Astronaut. Acta 8, 106-123 (1962).
31. D. F. Lawden, "Optimal Trajectories for Space Navigation," Butterworth, Washington, D.C., 1963.
Thrust Programming in a Central Gravitational Field

A. I. LURIE
LENINGRAD POLYTECHNIC INSTITUTE, LENINGRAD, USSR
4.1 General Equations Governing the Motion of a Boosting Vehicle in a Central Gravitational Field
    4.11 The Equations of Motion: Vector Form
    4.12 Statement of the Problem
    4.13 The Boosting Devices
    4.14 The Mayer-Bolza Problem
    4.15 Conditions of Stationarity
    4.16 Equations Valid for all Types of Boosting Devices
    4.17 Additional Conditions of Stationarity: Boosting Devices of the Second Type
    4.18 Additional Conditions of Stationarity: Boosting Devices of the Third Type
4.2 Integrals of the Basic System of Equations
    4.21 Scalar Integral
    4.22 Vector Integral
    4.23 The Equivalence among Different Systems of Equations
    4.24 The First-Order Differential Equation for the Vector λ
    4.25 Orbital Axes
    4.26 The Differential Equations Written Relative to the Orbital Axes
4.3 Boundary Conditions: Various Types of Motion
    4.31 General Boundary Problem
    4.32 Turning of the Orbital Plane
    4.33 Three-Dimensional Motions
    4.34 Plane Motions
    4.35 Special Case of Plane Motion
    4.36 Moving along a Keplerian Arc
4.4 Orbits on a Spherical Surface
    4.41 The Statement of the Problem of Spherical Motions
    4.42 The Differential Equations of Spherical Motions
    4.43 Spherical Motions when w = const (Devices of the First Type)
4.5 Boosting Devices of Limited Propulsive Power
    4.51 The Differential Equations of Motion
    4.52 Plane Motions
    4.53 Spherical Motions
    4.54 An Application of the Ritz Method
4.6 Singular Control Regimes
    4.61 Statement of the Variational Problem
    4.62 The Differential Equations
References
4.1 General Equations Governing the Motion of a Boosting Vehicle in a Central Gravitational Field
4.11 The Equations of Motion: Vector Form

Let $\mathbf{r}$ denote the position vector of the center of mass of a vehicle, this vector originating at the attractive center. Furthermore, let $\mathbf{v}$ denote the velocity vector. These vectors together determine the instantaneous position of the orbital plane $\Pi$. The angular momentum vector of a unit mass

$$\mathbf{k} = \mathbf{r}\times\mathbf{v} = k\mathbf{n} \tag{4.1}$$

is directed along the normal (its unit vector being $\mathbf{n}$) to the orbital plane. The differential equations governing the motion of the center of mass can be written down in the following form:†

$$\dot{\mathbf{r}} = \mathbf{v}, \qquad \dot{\mathbf{v}} = -(\mu/r^3)\,\mathbf{r} + \mathbf{w} \tag{4.2}$$

Here $\mu$ denotes a constant equal to the product of the attractive mass and the gravitational constant; $\mathbf{w}$ denotes thrust acceleration,

$$\mathbf{w} = (cq/m)\,\mathbf{e} = w\mathbf{e} \tag{4.3}$$

$c$ denotes the exhaust speed, and $q$ the mass flow rate; the latter satisfies the equation of the flow rate

$$\dot{m} = -q \tag{4.4}$$

Last, we shall denote by $\mathbf{e}$ the unit vector in the direction of the thrust; thus,

$$1 - \mathbf{e}\cdot\mathbf{e} = 0 \tag{4.5}$$

We observe, in conclusion, that according to the angular momentum theorem,

$$\dot{\mathbf{k}} = \mathbf{r}\times\mathbf{w} \tag{4.6}$$
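The angular momentum relation follows from the product rule, because the gravitational term is parallel to the position vector and drops out of the cross product. A minimal numerical sketch, with illustrative function names and vectors stored as three-element lists:

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def velocity_rate(r, w, mu):
    """Acceleration: central gravity plus thrust acceleration w."""
    r3 = sum(x * x for x in r) ** 1.5
    return [-mu * ri / r3 + wi for ri, wi in zip(r, w)]

def k_rate_by_product_rule(r, v, w, mu):
    """d/dt (r x v) = v x v + r x (dv/dt)."""
    a = velocity_rate(r, w, mu)
    return [p + q for p, q in zip(cross(v, v), cross(r, a))]

def k_rate_by_theorem(r, w):
    """Rate of change of angular momentum per unit mass: r x w."""
    return cross(r, w)
```

The two rates agree to rounding error for any state, and both vanish on a coasting arc where the thrust acceleration is zero.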
4.12 Statement of the Problem

The vectors $\mathbf{r}$ and $\mathbf{v}$ and the mass $m$ will further be defined as the "coordinates"‡ of the system whereas the quantities $\mathbf{e}$, $c$, and $q$ will serve as "controls." The latter will be subjected to some additional constraints depending on the type of the boosting device considered (see Sec. 4.13).

† The dot denotes differentiation with respect to time $t$; in the case of vectors, such differentiation is relative to an "inertial" frame.
‡ The components of $\mathbf{r}$ and $\mathbf{v}$ together with $m$ are the "state variables" of the system.

The coordinates belong to the class of continuous functions of time on the interval $[0, t_1]$. In what follows their initial values will always be assumed prescribed:

$$t = 0: \qquad \mathbf{r} = \mathbf{r}_0, \quad \mathbf{v} = \mathbf{v}_0, \quad m = m_0 \tag{4.7}$$

Their values $\mathbf{r}^1$, $\mathbf{v}^1$, $m^1$ at the right end of the interval are subjected to the following system of conditions:

(1) the vectors $\mathbf{r}^1$, $\mathbf{v}^1$ are connected by the relations

$$t = t_1: \qquad \varphi_l(\mathbf{r}^1, \mathbf{v}^1) = 0, \qquad l = 1, 2, \ldots, r \le 6 \tag{4.8}$$

where the quantity $t_1$ is either prescribed a priori ($t_1 = t_1^*$) or unknown;

(2) either the final mass is assumed known, i.e.,

$$t = t_1: \qquad m = m^1 = m_*^1 \tag{4.9}$$

or else minimum fuel consumption is required, i.e.,

$$J = m_0 - m^1 = \min \tag{4.10}$$

(3) the requirement (4.10) is replaced by the minimum time condition (the quick-operation problem), i.e.,

$$J = t_1 = \min \tag{4.11}$$

where now $m^1$ is either prescribed or unknown.

The basic problem is to prescribe the thrust operation, that is, to determine the time dependence of the control functions $\mathbf{e}$, $c$, and $q$ in the class of piecewise-continuous functions, in such a way as to satisfy the conditions formulated above.

4.13 The Boosting Devices
In what follows, we shall consider three types of boosting devices:

(1) Devices which guarantee the constant magnitude of the thrust-acceleration vector ($w = \text{const}$) while the exhaust speed $c$ is either constant or is given as a function of time. The flow rate equation can now be integrated to give

$$m = m_0 \exp\left(-w\int_0^t \frac{dt}{c}\right)$$

and the mass is then excluded from future consideration. The quick-operation requirement now guarantees the minimum of fuel consumption. The vector $\mathbf{e}$ serves as the control function.
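For a device of the first type the mass history follows from a single quadrature of $1/c$. The sketch below evaluates it with the trapezoidal rule; the function name, the step count, and the sample exhaust-speed laws used to exercise it are illustrative assumptions.

```python
import math

def mass_history(m0, w, c_of_t, t_end, steps=1000):
    """m(t_end) = m0 * exp(-w * integral of dt / c(t)), by the trapezoidal rule."""
    h = t_end / steps
    integral = 0.0
    for i in range(steps):
        t_left, t_right = i * h, (i + 1) * h
        integral += 0.5 * h * (1.0 / c_of_t(t_left) + 1.0 / c_of_t(t_right))
    return m0 * math.exp(-w * integral)
```

With constant exhaust speed this reduces to $m_0 e^{-wt/c}$, so for fixed $w$ and $c$ the fuel spent grows monotonically with the burn time.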
(2) Propulsive power limited boosters (plasma or ionic). Now the quantities $c$ and $q$ are also treated as controls, and are connected by the relation

$$N(t) = \tfrac{1}{2}c^2q = \tfrac{1}{2}(m^2w^2/q), \qquad N_{\min} \le N(t) \le N_{\max} \tag{4.12}$$

where $N(t)$ is the propulsive power. Following the technique suggested in Leitmann¹ and in Miele² we can write these inequalities in the form of equalities

$$[N_{\max} - N(t)][N(t) - N_{\min}] - v_2^2 = 0 \tag{4.13}$$

where the quantity $v_2$ is treated as an artificial (or auxiliary) "control."

(3) Mass flow rate limited booster (chemical rockets). Here the exhaust speed is assumed constant and the flow rate bounded, i.e.,

$$c = \text{const}, \qquad 0 \le q(t) \le q_{\max} \tag{4.14}$$

Now $q$ and $v_3$ serve as the control functions, so that

$$q(t)[q_{\max} - q(t)] - v_3^2 = 0 \tag{4.15}$$
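The auxiliary control turns a two-sided bound into an equality: a real value of the auxiliary variable exists exactly when the flow rate lies inside its admissible range. A minimal sketch for the flow-rate bound, with an illustrative function name:

```python
import math

def slack_for_flow_rate(q, q_max):
    """Return v3 >= 0 with q * (q_max - q) - v3**2 == 0, or None if q is inadmissible."""
    product = q * (q_max - q)
    if product < 0.0:
        return None  # q lies outside [0, q_max]; no real v3 exists
    return math.sqrt(product)
```

The auxiliary variable vanishes only at the two limiting values of the flow rate, zero and the maximum. The same construction applies to the power bounds of the second type of device.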
4.14 The Mayer-Bolza Problem

In the fundamental papers¹⁻⁵ the formulated problem is treated from the point of view of the Mayer-Bolza problem which is well known in the calculus of variations. A detailed account of the latter problem was given by Bliss;⁶ various recent publications can also be consulted. Different topics on the optimal control of rocket flight are analyzed in Tarasov.

We introduce three types of the Lagrange multipliers.

(1) The vectors $\boldsymbol{\lambda}_r$, $\boldsymbol{\lambda}_v$, and also the scalar $\lambda_m$ for boosting devices of the second and third types. With the aid of these multipliers, we construct the "Hamiltonian" corresponding to the right-hand sides of the equations of motion and flow rate:†

$$H_\lambda = \boldsymbol{\lambda}_r\cdot\mathbf{v} - \boldsymbol{\lambda}_v\cdot\mathbf{r}\,(\mu/r^3) + \boldsymbol{\lambda}_v\cdot\mathbf{e}\,w \tag{4.16$_1$}$$

$$H_\lambda = \boldsymbol{\lambda}_r\cdot\mathbf{v} - \boldsymbol{\lambda}_v\cdot\mathbf{r}\,(\mu/r^3) + \boldsymbol{\lambda}_v\cdot\mathbf{e}\,(cq/m) - \lambda_m q \tag{4.16$_{2,3}$}$$

(2) Scalar multipliers $\mu_e$, $\mu_k$, corresponding to the finite constraints imposed on the "control functions." These multipliers appear in the second term of the Hamiltonian

$$H_\mu = \mu_e(1 - \mathbf{e}\cdot\mathbf{e}) = 0 \tag{4.17$_1$}$$

$$H_\mu = \mu_e(1 - \mathbf{e}\cdot\mathbf{e}) + \mu_1(c^2q - 2N) + \mu_2[(N_{\max} - N)(N - N_{\min}) - v_2^2] = 0 \tag{4.17$_2$}$$

$$H_\mu = \mu_e(1 - \mathbf{e}\cdot\mathbf{e}) + \mu_3[q(q_{\max} - q) - v_3^2] = 0 \tag{4.17$_3$}$$

† A subscript to an equation number refers to the corresponding type of boosting device.
(3) Constant scalar multipliers $\rho_l$, introduced for the derivation of the boundary conditions. These multipliers serve for the formulation of the "indicating function"

$$\theta = J + \theta_1, \qquad \theta_1 = \sum_{l=1}^{r}\rho_l\,\varphi_l(\mathbf{r}^1, \mathbf{v}^1) + \rho_t(t_1 - t_1^*) + \rho_m(m^1 - m_*^1) = 0 \tag{4.18}$$

Here we have denoted by $J$ the functional (4.10) or (4.11) to be minimized. If the quantities $t_1$ and (or) $m^1$ are not known a priori, then $\rho_t$ and (or) $\rho_m$ are assumed to be zero. Having set the variation of the functional

$$I = \theta + \int_0^{t_1}\left(\boldsymbol{\lambda}_r\cdot\dot{\mathbf{r}} + \boldsymbol{\lambda}_v\cdot\dot{\mathbf{v}} - H_\lambda - H_\mu + \lambda_m\dot{m}\right)dt \tag{4.19}$$

equal to zero, we arrive at three groups of equations; namely, equations of stationarity, the Erdmann-Weierstrass conditions at the points $t_*$ of possible lack of continuity of the "controls," and boundary conditions for the Lagrange multipliers $\boldsymbol{\lambda}_r$, $\boldsymbol{\lambda}_v$, $\lambda_m$ and for the Hamiltonian $H_\lambda$. These requirements are complemented by the fundamental Weierstrass necessary criterion for a minimum. We construct the function $E$, having for the above-described types of boosting devices the following form:
E
=I,
E
= ( l/m)k,
*
(e - e*)
- (cqe - c*q*e*)- Am(q - 4")
E = (c/m)I,* (qe - q*e*) - A,(q - q*)
(4.20), (4.201, (4.20)3
Here the asterisk denotes an admissible control which satisfies the constraints imposed on the control functions. The Weierstrass criterion requires that ErO
(4.21)
4.15 Conditions of Stationarity These conditions constitute the system of differential equations and finite relations for the Lagrange multipliers of the first and the second types; namely,
i,= -grad,
H A= - I ,
(4.22) (4.23)
grad,(H, grad,(H,
+ H,)
+ H,)
= wl, - 2p,e = 0
=
(cq/m>5,- 2p,e
(4.25),
=0
(4.25) 2.3
4.16 Equations Valid for all Types of Boosting Devices

Such equations are the stationarity conditions (4.22), (4.23), and (4.25). From Eq. (4.25) it follows that the vectors $\boldsymbol{\lambda}_v$ and $\mathbf{e}$ are parallel to each other; the Weierstrass condition now shows that they are equally directed. In fact, the unknown control function should be chosen among the admissible ones and, therefore, it becomes clear that the Weierstrass criterion must be valid for $c = c^*$, $q = q^*$. Now we see that inequality (4.20) is true for all types of devices considered. Having now set $\boldsymbol{\lambda}_v = \lambda_v\mathbf{e}$, where $\lambda_v = \pm|\boldsymbol{\lambda}_v|$, we get

$$E = (cq/m)\,\boldsymbol{\lambda}_v\cdot(\mathbf{e} - \mathbf{e}^*) = (cq/m)\,\lambda_v(1 - \mathbf{e}\cdot\mathbf{e}^*) \ge 0, \qquad \lambda_v = |\boldsymbol{\lambda}_v|$$

this being the required relation. Thus, we have

$$\boldsymbol{\lambda}_v = \lambda_v\mathbf{e} \tag{4.27}$$

On eliminating the quantity $\boldsymbol{\lambda}_r$ from Eq. (4.23) by use of (4.22), and choosing the notation $\boldsymbol{\lambda}$ for $\boldsymbol{\lambda}_v$, we arrive at the second-order differential equation for the vector $\boldsymbol{\lambda}$

$$\ddot{\boldsymbol{\lambda}} = (\mu/r^3)\,[3(\boldsymbol{\lambda}\cdot\mathbf{r}/r^2)\,\mathbf{r} - \boldsymbol{\lambda}] \tag{4.28}$$

This completes the system of stationarity conditions for the first type of boosting devices.
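The right-hand side of the second-order equation for the multiplier vector is the gravitational acceleration $-(\mu/r^3)\mathbf{r}$ differentiated in the direction of that vector. The sketch below compares the closed form with a central-difference directional derivative; the function names are illustrative and vectors are plain three-element lists.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gravity(r, mu):
    """Central-field acceleration -(mu / r^3) r."""
    r3 = dot(r, r) ** 1.5
    return [-mu * x / r3 for x in r]

def multiplier_rhs(lam, r, mu):
    """(mu / r^3) [3 (lam . r / r^2) r - lam]."""
    r2 = dot(r, r)
    scale = 3.0 * dot(lam, r) / r2
    return [mu / r2 ** 1.5 * (scale * ri - li) for ri, li in zip(r, lam)]

def gravity_derivative_along(lam, r, mu, h=1e-6):
    """Directional derivative of gravity(r) along lam, by central differences."""
    up = gravity([ri + h * li for ri, li in zip(r, lam)], mu)
    dn = gravity([ri - h * li for ri, li in zip(r, lam)], mu)
    return [(a - b) / (2.0 * h) for a, b in zip(up, dn)]
```

Agreement of the two vectors reflects the symmetry of the gravity-gradient matrix, which is why the same expression appears whether one differentiates along or against the multiplier direction.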
4.17 Additional Conditions of Stationarity: Boosting Devices of the Second Type Now it will be supposed that the exhaust speed is finite. From this assumption it follows due to Eq. (4.12) that the mass flow rate q # 0; having eliminated pl from Eq. (4.26)2, we assert with the help of Eqs. (4.27) and (4.24), that
A being an integration constant. Now we have
2A c=- , Am
2A2 .M = +c2q=-A 2 m 2 q '
cq m
w=-=-
2A Am''
and, by virtue of the flow rate equation (4.4),
On eliminating the quantity p l , the third equation (4.26)2 can be presented in the following form: d/mc + p z ( ~ r n a x .Mrnin- 2.M) = 0 and so the assumption p2 = 0-F leads to A = 0, h = 0, = 0, A,, = 0. It will become clear from material following that this leads t o a contradiction with the boundary conditions of the problem of fuel consumption minimization as well as with those of the problem of minimizing t , . Thus the assumption p 2 = 0 should be withdrawn, and there remain the possibilities v2 = 0 and .M = Xmax or A" = Nmin. But from the Weierstrass condition it now becomes clear that we must retain only the first possibility. Indeed, turning t o Eq. (4.20), and taking Eqs. (4.27) and (4.29) into consideration, we arrive at the inequalities :
+
Of the three admissible control functions, c*, q*, and .M* only two are independent, and the third may be chosen arbitrarily; setting c = c*, we see from the second inequality that .M - N*2 0, and setting q = q*, we infer from the first inequality that c 2 c*, or equivalently, N 2 .M*. But the quantity .M* may be chosen arbitrarily near t o Nmax without violation of inequalities (4.12), and from here it follows that these inequalities require that .M = .Mma"". On a nonzero interval.
We have now deduced that for the type of boosting devices considered, the vector $\boldsymbol{\lambda}$ given by (4.30) differs from $\mathbf{w}$ only by a constant multiplier, and the mass is determined from the equation

(4.31)

4.18 Additional Conditions of Stationarity: Boosting Devices of the Third Type
Instead of $\lambda_m$, it seems now convenient to introduce the quantity $\eta$ called the switching function:

$$\eta = (c/m)\lambda - \lambda_m \tag{4.32}$$

From Eq. (4.24), it follows that the switching function satisfies the differential equation

$$\dot{\eta} = (c/m)\dot{\lambda} \tag{4.33}$$

Equations (4.26)$_3$ can now be presented as follows:

$$\eta + \mu_3(q_{\max} - 2q) = 0, \qquad \mu_3 v_3 = 0$$

From the assumption $v_3 = 0$ and Eq. (4.15) we deduce that the mass flow rate must assume only one of its limiting values $q = 0$ or $q = q_{\max}$. The Weierstrass criterion Eq. (4.20)$_3$ is now written in the form $\eta(q - q^*) \ge 0$ and there appears the possibility of two regimes, the regime of maximum thrust

$$q = q_{\max} \quad \text{when} \quad \eta > 0 \tag{4.34$_1$}$$

and that of moving along a Keplerian arc

$$q = 0 \quad \text{when} \quad \eta < 0 \tag{4.34$_2$}$$
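The two regimes amount to a sign test on the switching function, written here as `eta`. The sketch below is illustrative only: the function names are hypothetical, and the case of a vanishing switching function is returned as undecided, since a sign test cannot settle it.

```python
def switching_function(c, m, lam, lam_m):
    """eta = (c / m) * lam - lam_m, with lam the magnitude of the multiplier vector."""
    return c / m * lam - lam_m

def flow_rate(eta, q_max):
    """Maximum thrust for eta > 0, Keplerian (coasting) arc for eta < 0."""
    if eta > 0.0:
        return q_max
    if eta < 0.0:
        return 0.0
    return None  # eta == 0: not decided by the sign test

def weierstrass_ok(eta, q, q_max, samples=11):
    """Check eta * (q - q_star) >= 0 for admissible q_star sampled in [0, q_max]."""
    return all(eta * (q - q_max * i / (samples - 1)) >= 0.0 for i in range(samples))
```

For either sign of `eta`, the flow rate returned by `flow_rate` passes `weierstrass_ok`, while the opposite limiting value fails it.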
At the instant of switching, the Lagrange multipliers and, consequently, $\eta$ and $\dot{\eta}$, suffer no discontinuity. Thus, we have

$$\eta(t_*) = 0 \tag{4.35}$$

and $\dot{\eta}(t_* - 0) > 0$ when the vehicle proceeds from a Keplerian arc to the maximum thrust regime, and $\dot{\eta}(t_* - 0) < 0$ during the inverse maneuver. The case $\dot{\eta}(t_*) = 0$ is doubtful since the sign of the second derivative $\ddot{\eta}(t_* \pm 0)$ (which is discontinuous) provides no information about the subsequent regime. None of the three possibilities can as yet be withdrawn. These are (1) conservation of the preceding regime if the sign of $\eta(t)$ is conserved;
(2) transition from the maximum thrust regime to moving along a Keplerian arc when ~ ( tgoes ) over from negative values to positive ones, and (3) the inverse transition when previously negative values of ~ ( tbecome ) positive. In what was stated above, it had been supposed that v 3 = 0; one cannot, however, withdraw the possibility that v3 # 0 on some time-interval (t* , t,*). In this case (4.36) Here we have the regime of “ singular control ’’twith variable thrust; now the mass flow rate is some unknown function of time determined from the second Eq. (4.36) and from the equation of motion: (4.37) where, however, no further integration is as yet required. We see now that with the condition e(t*) = 0 we cannot exclude the possibility that an arc of “ singular control” is included in the solution, if we can simultaneously retain the continuity of r and v. In the “ singular control -’ regime, 1 = 0 and A = mAn,/c is a constant different from zero, since the assumption A = 0 would be equivalent to the vanishing of all Lagrange multipliers (a, i, A,); such vanishing is presumed impossible in the MayerBolza problem. The constant A is now determined from the requirement of continuity at the moment I,, and it remains to verify that the vectors 1,iand # 2 = 0). So long as A # 0, the Hamiltonian H Aare continuous (of course, the verification reduces to checking the continuity of the vectors e and e. The moment z,, - 0 of termination of the “singular control” is as yet unknown; this moment is probably not that when q(t) reaches either q = 0 or q = qmax,because the quantity q can be discontinuous. It seems useful to retain t** as the unknown quantity determined by boundary conditions (see also Sec. 4.62~). 4.2
Integrals of the Basic System of Equations
4.21 Scalar Integral The equations of motion and the mass flow rate equation together with the stationarity conditions allow the first integral,
H A = 11 expressing the constancy of the Hamiltonian.
t See Chapter 3.
(4.38)
For boosting devices of the first type, by virtue of Eqs. (4.16), (4.22), and (4.27), this integral can be presented in the form
-
-i v - I. r ( p / r 3 )+ W A = h
(4.3911
9
For boosting devices of the second type, we infer from Eqs. (4.16), and (4.29) that
- v - I. * r ( p / r 3 )+ QAw
-X
(4W2
=h
In this equation, we may replace the vector li. by the proportional vector w, so that -w v - w * r ( p / r 3 )+ + w Z = h, , h, = (Mma,/A)h
-
Last, taking account of Eq. (4.32) for the switching function, we see that for the third type of booster
-
- i, * v - 1 r(p/r3)
+ qq = h
(4.3913
Here q is constant (qmaxor zero) if q # 0, and q = q ( t ) if q = 0. It can be immediately verified by differentiation that (4.38) in fact is an integral of the equations noted above. The calculation proceeds identically for all cases concerned, and therefore we shall only analyze the first one. In view of the relations X=,ie+Ac,
i-w=iw
e-6=0,
we present the result of differentiation of Eq. (4.39), in the following way
(k +
1
31
r3
r5
- I. - - A * r r)
" 1
- v + i- ( i . + 2 r -w r
=
o
(4.40)
It is observed that the vectors in brackets vanish due to the equations of motion together with those of stationarity, Eq. (4.28). We shall make use of Eq. (4.40) in Sec. 4.23.
4.22 Vector Integral The system of Eqs. (4.2) and (4.28) complemented with condition (4.27) also allows a vector integral. Consider the vector product of both sides of Eq. (4.28) with the vector r; in view of Eqs. (4.2) and (4.27), we get 1
..
d dt
~xr+li.x-5r=li.xr-li.x~=-(~xr-~xv)=~ r
so that
ixr-I.xv=a=ao
Here a denotes a constant vector, a
=
la1 ; o is a constant unit vector.
(4.41)
The subsequent results of this paper are based to a considerable extent upon the use of both scalar and vector integrals. The existence of the first one is very well known while the second has been mentioned neither in the fundamental papers,'-' nor in a more recent publication"; in Isayev and Sonin" it was observed that the vector integral exists in a special case of plane motion, but no use was made of this fact.? 4.23
The Equivalence among Different Systems of Equations
Differentiating Eq. (4.41) and making use of the equations of motion Eqs. (4.2), we arrive at the relation
(4.42) From this we deduce that the vector in brackets is collinear with r and therefore may be presented as br, h being a scalar multiplier. Substitution in Eq (4.40) now shows that (h - $ k . r ) r - v = O where we have eliminated the second term in view of Eq. (4.2).Thus, we have 5+ "
P 3v ~ = ?h5 * r r r r
when r - v # O
It is now clear that in view of the equations of motion (4.2), the secondorder equation (4.28) is equivalent to the pair of first-order equations (4.39) and (4.41); these involve four scalar constants: h and the vector a. The case r v = 0, corresponding t o spherical orbits, requires separate consideration.
-
4.24
The First-Order Differential Equation for the Vector 5
Having constructed the scalar product of Eq. (4.41) by r, v, and k, respectively, and accounting for Eqs. (4.1), we obtain 5*k=ao-r
(4.43)
3, k
=aa-v
(4.44)
(3, r + 5 vjr * v - (3, vr2 + h * rv2) = ua * k
(4.45)
-
-
-
It should be noted that expression (4.43) represents an integral of Eq. (4.44) [see Eqs. (4.3), (4.6), and (4.27)].
t See also Melbourne and Sauerzo.
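If $\mathbf{r}\cdot\mathbf{v} \neq 0$, we can determine from Eqs. (4.44), (4.45), and (4.39) the covariant components $\dot{\boldsymbol{\lambda}}\cdot\mathbf{r}$, $\dot{\boldsymbol{\lambda}}\cdot\mathbf{v}$, and $\dot{\boldsymbol{\lambda}}\cdot\mathbf{k}$ of the vector $\dot{\boldsymbol{\lambda}}$ in the vector basis $\mathbf{r}$, $\mathbf{v}$, $\mathbf{k}$. With these components we may form the following expression for this vector:

$$\dot{\boldsymbol{\lambda}} = \frac{1}{k^2}\left(\dot{\boldsymbol{\lambda}}\cdot\mathbf{r}\;\mathbf{v}\times\mathbf{k} + \dot{\boldsymbol{\lambda}}\cdot\mathbf{v}\;\mathbf{k}\times\mathbf{r} + \dot{\boldsymbol{\lambda}}\cdot\mathbf{k}\;\mathbf{k}\right)$$

This is just the differential equation required; on substitution for $\dot{\boldsymbol{\lambda}}\cdot\mathbf{r}$, $\dot{\boldsymbol{\lambda}}\cdot\mathbf{v}$, and $\dot{\boldsymbol{\lambda}}\cdot\mathbf{k}$, this equation can be presented as $\dot{\boldsymbol{\lambda}} = -$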
7
r-v
L * ( v x n ) v x n - - 5PL - r r + a r
1 cr-nv x n + - r . v o . v n k
(
1 1 +r9
(4.46)
where
3
9 = +wA
= W A - h,
-
h,
9 = qq - h
(4.47)1,2,3
for boosters of the first, second, and third types, respectively.
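4.25 Orbital Axes

Determine an orthogonal orbital triad by a triplet of unit vectors: $\mathbf{e}_r$ and $\mathbf{e}_\varphi$, lying in the instantaneous orbital plane $\Pi$, and the vector $\mathbf{n}$ perpendicular to this plane, where

$$\mathbf{e}_r = \frac{\mathbf{r}}{r}, \qquad \mathbf{e}_\varphi = \mathbf{n}\times\mathbf{e}_r, \qquad \mathbf{n} = \frac{1}{k}\,\mathbf{r}\times\mathbf{v} \tag{4.48}$$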
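Let $\boldsymbol{\omega}$ stand for the angular velocity vector of this triad, $\boldsymbol{\omega} = \omega_r\mathbf{e}_r + \omega_\varphi\mathbf{e}_\varphi + \omega_n\mathbf{n}$, and let us determine its components along the orbital axes. Taking account of the kinematic relations

$$\dot{\mathbf{e}}_r = \boldsymbol{\omega}\times\mathbf{e}_r = -\mathbf{n}\,\omega_\varphi + \mathbf{e}_\varphi\omega_n, \qquad \dot{\mathbf{e}}_\varphi = \boldsymbol{\omega}\times\mathbf{e}_\varphi = -\mathbf{e}_r\omega_n + \mathbf{n}\,\omega_r, \qquad \dot{\mathbf{n}} = \boldsymbol{\omega}\times\mathbf{n} = -\mathbf{e}_\varphi\omega_r + \mathbf{e}_r\omega_\varphi$$

we construct the following expressions for the vectors $\mathbf{v}$ and $\mathbf{k}$:

$$\mathbf{v} = (r\mathbf{e}_r)^{\boldsymbol{\cdot}} = \dot{r}\,\mathbf{e}_r + r(-\mathbf{n}\,\omega_\varphi + \mathbf{e}_\varphi\omega_n), \qquad \mathbf{k} = \mathbf{r}\times\mathbf{v} = r^2(\omega_\varphi\mathbf{e}_\varphi + \omega_n\mathbf{n})$$

Observe now that the vector $\mathbf{v}$ lies in the $\Pi$ plane and the vector $\mathbf{k}$ is directed perpendicularly to it; from here we deduce that

$$\omega_\varphi = 0, \qquad \omega_n = \frac{k}{r^2}, \qquad \mathbf{v} = v_r\mathbf{e}_r + \frac{k}{r}\,\mathbf{e}_\varphi, \qquad v_r = \dot{r} \tag{4.49}$$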
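As an illustration of Eqs. (4.48) and (4.49), the following minimal sketch builds the orbital triad from a given position and velocity and rebuilds the velocity from $v_r$ and $k/r$. The function name `orbital_triad` and the sample vectors are hypothetical.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def scale(a, s):
    return tuple(s * x for x in a)

def orbital_triad(r_vec, v_vec):
    """Return e_r, e_phi, n and the scalars r, k, v_r of Eqs. (4.48)-(4.49)."""
    k_vec = cross(r_vec, v_vec)          # angular momentum per unit mass
    r, k = norm(r_vec), norm(k_vec)
    e_r = scale(r_vec, 1.0 / r)
    n = scale(k_vec, 1.0 / k)
    e_phi = cross(n, e_r)
    v_r = dot(v_vec, e_r)                # radial velocity component
    return e_r, e_phi, n, r, k, v_r

r_vec, v_vec = (1.0, 0.2, -0.1), (0.1, 0.9, 0.3)
e_r, e_phi, n, r, k, v_r = orbital_triad(r_vec, v_vec)
# Velocity rebuilt from its components in the orbital plane, Eq. (4.49)
v_rebuilt = tuple(v_r * a + (k / r) * b for a, b in zip(e_r, e_phi))
```

The construction fails for rectilinear motion, where $\mathbf{r}\times\mathbf{v} = 0$ and the orbital plane is undefined.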
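It remains now to determine $\omega_r$; to do this, substitute the expression for $\mathbf{v}$ above into the equation of motion (4.2), so that

$$\dot{\mathbf{v}} = \dot{v}_r\mathbf{e}_r + v_r\frac{k}{r^2}\,\mathbf{e}_\varphi + \left(\frac{\dot{k}}{r} - \frac{k}{r^2}\,\dot{r}\right)\mathbf{e}_\varphi + \frac{k}{r}\left(-\mathbf{e}_r\frac{k}{r^2} + \mathbf{n}\,\omega_r\right) = -\frac{\mu}{r^2}\,\mathbf{e}_r + w\mathbf{e} \tag{4.50}$$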
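Let $\alpha_1$, $\alpha_2$, and $\alpha_3$ denote the direction cosines of the thrust vector relative to the orbital axes; namely,

$$\mathbf{e} = \alpha_1\mathbf{e}_r + \alpha_2\mathbf{e}_\varphi + \alpha_3\mathbf{n} \tag{4.51}$$

where it is of course understood that

$$\alpha_1^2 + \alpha_2^2 + \alpha_3^2 = 1 \tag{4.52}$$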
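Turning back now to Eq. (4.50), we obtain, first, the equations of motion in the instantaneous orbital plane

$$\dot{r} = v_r, \qquad \dot{v}_r = \frac{k^2}{r^3} - \frac{\mu}{r^2} + w\alpha_1, \qquad \dot{k} = rw\alpha_2 \tag{4.53}$$

and, second, the expression for the vector

$$\boldsymbol{\omega} = \frac{k}{r^2}\,\mathbf{n} + w\,\frac{r}{k}\,\alpha_3\mathbf{e}_r \tag{4.54}$$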
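A minimal numerical sketch of the in-plane equations of motion, assuming they read $\dot r = v_r$, $\dot v_r = k^2/r^3 - \mu/r^2 + w\alpha_1$, $\dot k = rw\alpha_2$, and using a classical fourth-order Runge-Kutta step. Holding $w$, $\alpha_1$, $\alpha_2$ constant over a step is a simplification made here for illustration; the units and sample values are arbitrary.

```python
def rhs(state, mu, w, alpha1, alpha2):
    """Right-hand sides of the in-plane equations of motion (4.53)."""
    r, v_r, k = state
    return (v_r, k**2 / r**3 - mu / r**2 + w * alpha1, r * w * alpha2)

def rk4_step(state, dt, *params):
    """One classical Runge-Kutta step with thrust parameters frozen."""
    def add(s, d, h):
        return tuple(x + h * y for x, y in zip(s, d))
    k1 = rhs(state, *params)
    k2 = rhs(add(state, k1, dt / 2), *params)
    k3 = rhs(add(state, k2, dt / 2), *params)
    k4 = rhs(add(state, k3, dt), *params)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def energy(state, mu):
    """Keplerian energy per unit mass, conserved when w = 0."""
    r, v_r, k = state
    return 0.5 * (v_r**2 + (k / r)**2) - mu / r

mu = 1.0
coast0 = (1.0, 0.1, 1.1)            # r, v_r, k at the start of a coasting arc
coast = coast0
for _ in range(1000):
    coast = rk4_step(coast, 1e-3, mu, 0.0, 0.0, 0.0)   # w = 0: Keplerian arc

boost = rk4_step(coast0, 1e-3, mu, 0.1, 0.0, 1.0)      # transverse thrust
```

With $w = 0$ the angular momentum $k$ and the energy stay constant, which gives a quick consistency check; with purely transverse thrust ($\alpha_2 = 1$) the angular momentum grows.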
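The differential equations of the orbital triad's rotation can now be presented in the form

$$\dot{\mathbf{e}}_r = \frac{k}{r^2}\,\mathbf{e}_\varphi, \qquad \dot{\mathbf{e}}_\varphi = -\frac{k}{r^2}\,\mathbf{e}_r + w\,\frac{r}{k}\,\alpha_3\mathbf{n}, \qquad \dot{\mathbf{n}} = -w\,\frac{r}{k}\,\alpha_3\mathbf{e}_\varphi \tag{4.55}$$

Let $\mathbf{e}^*$ denote the vector whose components along the orbital axes are equal to $\dot{\alpha}_1$, $\dot{\alpha}_2$, $\dot{\alpha}_3$. We may write

$$\dot{\mathbf{e}} = \mathbf{e}^* + \boldsymbol{\omega}\times\mathbf{e} \tag{4.56}$$

Consider now the angular velocity vector $\boldsymbol{\nu}$ of the vector $\mathbf{e}$; one may of course set $\boldsymbol{\nu}\cdot\mathbf{e} = 0$. The relation between the vectors $\boldsymbol{\omega}$ and $\boldsymbol{\nu}$ is given by

$$\dot{\mathbf{e}} = \boldsymbol{\nu}\times\mathbf{e} = \mathbf{e}^* + \boldsymbol{\omega}\times\mathbf{e}, \qquad \boldsymbol{\nu} = \mathbf{e}\times(\boldsymbol{\nu}\times\mathbf{e}) = \boldsymbol{\omega} + \mathbf{e}\times\mathbf{e}^* - \boldsymbol{\omega}\cdot\mathbf{e}\;\mathbf{e} \tag{4.57}$$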
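The vector $\boldsymbol{\sigma}$, which enters into the expression for the vector integral, is of constant direction; from this we infer that

$$\dot{\boldsymbol{\sigma}} = \boldsymbol{\sigma}^* + \boldsymbol{\omega}\times\boldsymbol{\sigma} = 0$$

and, having denoted by $\sigma_s$ its components along the orbital axes, we arrive at the following equations:

$$\dot{\sigma}_1 = \frac{k}{r^2}\,\sigma_2, \qquad \dot{\sigma}_2 = -\frac{k}{r^2}\,\sigma_1 + \frac{r}{k}\,w\alpha_3\sigma_3, \qquad \dot{\sigma}_3 = -\frac{r}{k}\,w\alpha_3\sigma_2 \tag{4.58}$$

Equations (4.58) obviously allow the integral

$$\sigma_1^2 + \sigma_2^2 + \sigma_3^2 = 1 \tag{4.59}$$

this relation connects the initial values as well, namely,

$$(\sigma_1^0)^2 + (\sigma_2^0)^2 + (\sigma_3^0)^2 = 1 \tag{4.60}$$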
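The integral $\sigma_1^2 + \sigma_2^2 + \sigma_3^2 = 1$ can be observed numerically. The sketch below assumes the component equations (4.58) in the form $\dot\sigma_1 = (k/r^2)\sigma_2$, $\dot\sigma_2 = -(k/r^2)\sigma_1 + (r/k)w\alpha_3\sigma_3$, $\dot\sigma_3 = -(r/k)w\alpha_3\sigma_2$, and, purely for illustration, freezes the two coefficients `n_rate` $= k/r^2$ and `big_w` $= (r/k)w\alpha_3$ at constant values.

```python
def sigma_rhs(s, n_rate, big_w):
    """Right-hand sides of (4.58) with n_rate = k/r**2, big_w = (r/k)*w*alpha3."""
    s1, s2, s3 = s
    return (n_rate * s2, -n_rate * s1 + big_w * s3, -big_w * s2)

def integrate_sigma(s0, n_rate, big_w, t_end, steps):
    """Classical RK4 integration with constant coefficients (illustrative only)."""
    h = t_end / steps
    s = s0
    for _ in range(steps):
        k1 = sigma_rhs(s, n_rate, big_w)
        k2 = sigma_rhs(tuple(x + h / 2 * d for x, d in zip(s, k1)), n_rate, big_w)
        k3 = sigma_rhs(tuple(x + h / 2 * d for x, d in zip(s, k2)), n_rate, big_w)
        k4 = sigma_rhs(tuple(x + h * d for x, d in zip(s, k3)), n_rate, big_w)
        s = tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                  for x, a, b, c, d in zip(s, k1, k2, k3, k4))
    return s

sigma_end = integrate_sigma((0.6, 0.0, 0.8), 1.0, 0.3, 5.0, 5000)
norm_sq = sum(x * x for x in sigma_end)
```

The unit length is preserved because the right-hand sides describe a pure rotation of the components, whatever the time dependence of the coefficients.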
For reference axes we choose the orbital axes $\mathbf{e}_r^0$, $\mathbf{e}_\varphi^0$, and $\mathbf{n}^0$ at the initial instant. Relative to these axes, the orbital triad can be oriented by means of three Euler angles $\Omega$, $i$, and $u$; these are, respectively, the longitude of the ascending node, the instantaneous inclination of the orbital plane $\Pi$ to the plane $\Pi^0$ (of the vectors $\mathbf{e}_r^0$, $\mathbf{e}_\varphi^0$), and the angle in the $\Pi$ plane between the direction $\mathbf{m}$ of the ascending node and that of the position vector $\mathbf{r}$. The angular velocity vector can be presented in terms of the derivatives of the Euler angles, that is,

$$\boldsymbol{\omega} = \dot{\Omega}\,\mathbf{n}^0 + \frac{di}{dt}\,\mathbf{m} + \dot{u}\,\mathbf{n} \tag{4.61}$$
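Comparison with Eq. (4.54) leads to the system of differential equations

$$\frac{di}{dt} = \frac{r}{k}\,w\alpha_3\cos u, \qquad \dot{\Omega} = \frac{r}{k}\,w\alpha_3\,\frac{\sin u}{\sin i}, \qquad \dot{u} = \frac{k}{r^2} - \frac{r}{k}\,w\alpha_3\sin u\cot i \tag{4.62}$$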
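The Euler-angle rates can be cross-checked against the components of the angular velocity. The sketch below assumes the rates $di/dt = (r/k)w\alpha_3\cos u$, $\dot\Omega = (r/k)w\alpha_3\sin u/\sin i$, $\dot u = k/r^2 - (r/k)w\alpha_3\sin u\cot i$, and the standard resolution of $\mathbf{n}^0$ and $\mathbf{m}$ along the orbital axes; function names and sample values are hypothetical.

```python
import math

def euler_rates(inc, u, r, k, w, alpha3):
    """Rates of the node longitude, inclination and argument of latitude, Eq. (4.62)."""
    big_w = (r / k) * w * alpha3
    d_inc = big_w * math.cos(u)
    d_omega = big_w * math.sin(u) / math.sin(inc)      # singular for inc = 0
    d_u = k / r**2 - big_w * math.sin(u) / math.tan(inc)
    return d_omega, d_inc, d_u

def omega_components(d_omega, d_inc, d_u, inc, u):
    """Components of d_omega*n0 + d_inc*m + d_u*n along e_r, e_phi, n.

    Uses n0 = sin(i) sin(u) e_r + sin(i) cos(u) e_phi + cos(i) n and
    m = cos(u) e_r - sin(u) e_phi.
    """
    w_r = d_omega * math.sin(inc) * math.sin(u) + d_inc * math.cos(u)
    w_phi = d_omega * math.sin(inc) * math.cos(u) - d_inc * math.sin(u)
    w_n = d_omega * math.cos(inc) + d_u
    return w_r, w_phi, w_n

inc, u, r, k, w, alpha3 = 0.5, 0.8, 1.2, 1.0, 0.05, 0.7
rates = euler_rates(inc, u, r, k, w, alpha3)
w_r, w_phi, w_n = omega_components(*rates, inc, u)
```

The recovered components should reproduce Eq. (4.54): $\omega_r = (r/k)w\alpha_3$, $\omega_\varphi = 0$, $\omega_n = k/r^2$.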
Integration of this system provides no new constants because we have already assumed the initial values of the Euler angles to be zero. We may obtain a more symmetrical form of these equations if we introduce the complex Cayley-Klein parameters $\alpha$ and $\beta$, satisfying

$$\dot{\alpha} = \frac{i}{2}\left(\frac{k}{r^2}\,\alpha + \frac{r}{k}\,w\alpha_3\beta\right), \qquad \dot{\beta} = \frac{i}{2}\left(-\frac{k}{r^2}\,\beta + \frac{r}{k}\,w\alpha_3\alpha\right) \tag{4.63}$$

where†

$$\alpha = \cos(i/2)\exp[(i/2)(\Omega + u)], \qquad \beta = i\sin(i/2)\exp[(i/2)(\Omega - u)] \tag{4.64}$$

† There should be no confusion between $i = (-1)^{1/2}$ and the angle $i$, since this angle appears only as the argument of trigonometric functions.
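A short sketch of the Cayley-Klein parameters as defined in Eq. (4.64), with a round trip back to the inclination and node longitude. The inverse uses the two relations $\alpha\bar\alpha - \beta\bar\beta = \cos i$ and $2\alpha\beta = i\sin i\,\exp(i\Omega)$, which follow directly from the definitions; the function names are illustrative.

```python
import cmath
import math

def cayley_klein(node, inc, u):
    """Cayley-Klein parameters of Eq. (4.64) from the Euler angles."""
    alpha = math.cos(inc / 2) * cmath.exp(0.5j * (node + u))
    beta = 1j * math.sin(inc / 2) * cmath.exp(0.5j * (node - u))
    return alpha, beta

def node_and_inclination(alpha, beta):
    """Recover the node longitude and the inclination (0 < inc < pi assumed)."""
    cos_inc = abs(alpha) ** 2 - abs(beta) ** 2
    inc = math.acos(max(-1.0, min(1.0, cos_inc)))
    node = cmath.phase(2 * alpha * beta / 1j)   # 2*alpha*beta = i sin(inc) exp(i*node)
    return node, inc

alpha, beta = cayley_klein(0.4, 0.7, 1.1)
node_back, inc_back = node_and_inclination(alpha, beta)
```

The argument of latitude $u$ does not enter either relation, so it cannot be recovered from these two alone.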
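Equations (4.63) are integrated with the initial conditions $\alpha = 1$, $\beta = 0$. Note that the components $\sigma_s$ of the vector $\boldsymbol{\sigma}$ are expressed through their initial values and the Cayley-Klein parameters in the following way:

$$\begin{aligned}
\sigma_1 + i\sigma_2 &= (\sigma_1^0 + i\sigma_2^0)\,\bar{\alpha}^2 - (\sigma_1^0 - i\sigma_2^0)\,\beta^2 + 2\sigma_3^0\,\bar{\alpha}\beta \\
\sigma_1 - i\sigma_2 &= -(\sigma_1^0 + i\sigma_2^0)\,\bar{\beta}^2 + (\sigma_1^0 - i\sigma_2^0)\,\alpha^2 + 2\sigma_3^0\,\alpha\bar{\beta} \\
\sigma_3 &= -(\sigma_1^0 + i\sigma_2^0)\,\bar{\alpha}\bar{\beta} - (\sigma_1^0 - i\sigma_2^0)\,\alpha\beta + \sigma_3^0(\alpha\bar{\alpha} - \beta\bar{\beta})
\end{aligned} \tag{4.65}$$

the overbar denoting the complex-conjugate quantities.

4.26 The Differential Equations Written Relative to the Orbital Axes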
With the aid of Eqs. (4.27) and (4.56) we may replace the differential equation (4.46) by the following system:
r-v
e * (v x n)e x [(v x n) x el
a 3 e x ((v x n) x e}
r-v +k
(r *
-
P r2
- a l e x (r x e)
v e x (n x e) (4.67)
where, if $\alpha_3 \neq 0$, we have, according to Eq. (4.43),

$$\lambda = \frac{ar\sigma_1}{k\alpha_3} \tag{4.68}$$

If $\alpha_3 = 0$ (it will be seen later that this equation holds in a special case of plane motion), then the last relation is dropped (at the same time we shall observe that $\sigma_1 = 0$, $\sigma_2 = 0$).

4.3 Boundary Conditions: Various Types of Motion

4.31 General Boundary Problem

At the instant $t_1$ the position and velocity of the vehicle will be assumed prescribed: $\mathbf{r}^1 = \mathbf{r}_*^1$, $\mathbf{v}^1 = \mathbf{v}_*^1$. By a traditional technique we can now determine the values of $r^1$, $v_r^1$, $k^1$ and the position of the orbital triad, that is, the vectors $\mathbf{e}_r^1$, $\mathbf{e}_\varphi^1$, $\mathbf{n}^1$ (either the Euler angles, the Cayley-Klein or Hamilton-Rodrigues parameters, etc.). The term $\Theta_1$ in Eq. (4.18) for the indicating function can now be presented as

$$\Theta_1 = \boldsymbol{\rho}_1\cdot(\mathbf{r}^1 - \mathbf{r}_*^1) + \boldsymbol{\rho}_2\cdot(\mathbf{v}^1 - \mathbf{v}_*^1) + \rho_t(t_1 - t_1^*) + \rho_m(m^1 - m_*^1) = 0 \tag{4.69}$$

where $\boldsymbol{\rho}_1$, $\boldsymbol{\rho}_2$ denote the Lagrange multipliers. Since the multipliers $\boldsymbol{\lambda}^1$, $\dot{\boldsymbol{\lambda}}^1$ and the boundary value of the Hamiltonian $H_\lambda^1$ are determined from the relations (4.70), we have $\boldsymbol{\lambda}^1 = -\boldsymbol{\rho}_2$, $\dot{\boldsymbol{\lambda}}^1 = \boldsymbol{\rho}_1$, and no information about the boundary values of $\boldsymbol{\lambda}$ and $\dot{\boldsymbol{\lambda}}$ can now be supplied. If the minimized functional is given by Eq. (4.10), then $\rho_m = 0$ and

$$\lambda_m^1 = 1, \qquad h = \rho_t \tag{4.71}$$

that is, $h$ stays unknown if $t_1$ is presumed fixed, and vanishes when $t_1$ is not prescribed. If, however, time is minimized [the functional is given by Eq. (4.11)], then $\rho_t = 0$ and $h = 1$, $\lambda_m^1 = \rho_m$; that is, $\lambda_m^1$ remains undetermined when $m^1$ is prescribed and vanishes otherwise.
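4.32 Turning of the Orbital Plane

The unit normal $\mathbf{n}^1 = \mathbf{n}_*^1$ to the instantaneous orbital plane is assumed given at the instant $t_1$, so that the indicating function contains the term $\boldsymbol{\rho}\cdot(\mathbf{n}^1 - \mathbf{n}_*^1)$, where $\mathbf{n}^1 = \mathbf{r}^1\times\mathbf{v}^1/|\mathbf{r}^1\times\mathbf{v}^1|$. With the aid of Eq. (4.70) we get

$$\dot{\boldsymbol{\lambda}}^1 = \frac{1}{|\mathbf{r}^1\times\mathbf{v}^1|}\operatorname{grad}_{\mathbf{r}^1}\boldsymbol{\rho}\cdot(\mathbf{r}^1\times\mathbf{v}^1) - \frac{\boldsymbol{\rho}\cdot(\mathbf{r}^1\times\mathbf{v}^1)}{|\mathbf{r}^1\times\mathbf{v}^1|^2}\operatorname{grad}_{\mathbf{r}^1}|\mathbf{r}^1\times\mathbf{v}^1| = \frac{1}{|\mathbf{r}^1\times\mathbf{v}^1|^3}\left[\mathbf{v}^1\times\boldsymbol{\rho}\,|\mathbf{r}^1\times\mathbf{v}^1|^2 - \boldsymbol{\rho}\cdot(\mathbf{r}^1\times\mathbf{v}^1)\,\mathbf{v}^1\times(\mathbf{r}^1\times\mathbf{v}^1)\right]$$

and setting $\boldsymbol{\rho}^* = (1/k^1)\boldsymbol{\rho}$, this becomes

$$\dot{\boldsymbol{\lambda}}^1 = \mathbf{v}^1\times\boldsymbol{\rho}^* - \boldsymbol{\rho}^*\cdot\mathbf{n}^1\;\mathbf{v}^1\times\mathbf{n}^1 = \mathbf{v}^1\times[\mathbf{n}^1\times(\boldsymbol{\rho}^*\times\mathbf{n}^1)]$$

The vector $\boldsymbol{\lambda}^1$ can be calculated in the same way. We write

$$\dot{\boldsymbol{\lambda}}^1 = \mathbf{v}^1\times\mathbf{b}, \qquad \boldsymbol{\lambda}^1 = \mathbf{r}^1\times\mathbf{b}, \qquad \mathbf{b} = \mathbf{n}^1\times(\boldsymbol{\rho}^*\times\mathbf{n}^1) \tag{4.72}$$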
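where $\mathbf{b}$ denotes a vector lying in the $\Pi^1$ plane, since $\mathbf{b}\cdot\mathbf{n}^1 = 0$. The vectors $\dot{\boldsymbol{\lambda}}^1$ and $\boldsymbol{\lambda}^1$ are perpendicular, respectively, to the pairs of vectors ($\mathbf{v}^1$ and $\mathbf{b}$, $\mathbf{r}^1$ and $\mathbf{b}$) disposed in the $\Pi^1$ plane; from here it follows that $\boldsymbol{\lambda}^1$ and $\dot{\boldsymbol{\lambda}}^1$ and, consequently, $\mathbf{e}^1$ and $\dot{\mathbf{e}}^1$ are collinear with $\mathbf{n}^1$, so that

$$\alpha_1^1 = 0, \qquad \alpha_2^1 = 0, \qquad \alpha_3^1 = \varepsilon = \pm 1 \tag{4.73}$$

and Eqs. (4.56) and (4.52) show that

$$\dot{\alpha}_1^1 = 0, \qquad \dot{\alpha}_2^1 = (r^1/k^1)\,w^1, \qquad \dot{\alpha}_3^1 = 0$$

Turning now to the vector integral (4.41), we see that

$$a\boldsymbol{\sigma} = (\mathbf{v}^1\times\mathbf{b})\times\mathbf{r}^1 - (\mathbf{r}^1\times\mathbf{b})\times\mathbf{v}^1 = \mathbf{b}\times(\mathbf{r}^1\times\mathbf{v}^1) = k^1\,\mathbf{b}\times\mathbf{n}^1 \tag{4.74}$$

and this means (since for nonplane motion we have $a \neq 0$, as will be shown below) that the vector $\boldsymbol{\sigma}$ is disposed in the $\Pi^1$ plane and is perpendicular to $\mathbf{b}$. We now have

$$\boldsymbol{\sigma}\cdot\mathbf{n}^1 = 0, \qquad \sigma_3^1 = 0, \qquad \sigma_1^1 = \sin p, \qquad \sigma_2^1 = -\cos p \tag{4.75}$$

where the angle $p$ is introduced by the relation $\mathbf{b} = b(\mathbf{e}_r^1\cos p + \mathbf{e}_\varphi^1\sin p)$. Projecting now the vector $\boldsymbol{\sigma}$ onto the initial system of axes, we arrive at the relations

$$\begin{aligned}
\sigma_1^0 &= \cos\Omega^1\sin(p + u^1) + \cos i^1\sin\Omega^1\cos(p + u^1) \\
\sigma_2^0 &= \sin\Omega^1\sin(p + u^1) - \cos i^1\cos\Omega^1\cos(p + u^1) \\
\sigma_3^0 &= -\sin i^1\cos(p + u^1)
\end{aligned} \tag{4.76}$$

where $\Omega^1$ and $i^1$ are prescribed angles which determine $\mathbf{n}^1$ in the initial system of orbital axes. From Eqs. (4.46) and (4.68) we deduce [having also taken Eq. (4.72) into account]

$$\lambda^1/a = \varepsilon\,(r^1/k^1)\sin p, \qquad \vartheta^1 = 0 \tag{4.77}$$

This relation shows, in particular, that $\varepsilon$ and $\sin p$ have the same sign.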
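The terminal construction of this section can be illustrated numerically. The sketch assumes $\mathbf{b} = \mathbf{n}^1\times(\boldsymbol{\rho}^*\times\mathbf{n}^1)$, $\dot{\boldsymbol{\lambda}}^1 = \mathbf{v}^1\times\mathbf{b}$, $\boldsymbol{\lambda}^1 = \mathbf{r}^1\times\mathbf{b}$, and the vector integral in the form $\dot{\boldsymbol{\lambda}}\times\mathbf{r} - \boldsymbol{\lambda}\times\mathbf{v} = a\boldsymbol{\sigma}$; the sample vectors and helper names are arbitrary.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(a, s):
    return tuple(s * x for x in a)

def terminal_multipliers(r1, v1, rho_star):
    """Terminal multipliers for the plane-turning problem, Eqs. (4.72) and (4.74)."""
    k_vec = cross(r1, v1)
    k = math.sqrt(dot(k_vec, k_vec))
    n1 = scale(k_vec, 1.0 / k)
    b = cross(n1, cross(rho_star, n1))      # projection of rho* onto the orbital plane
    lam_dot = cross(v1, b)
    lam = cross(r1, b)
    a_sigma = sub(cross(lam_dot, r1), cross(lam, v1))   # assumed form of (4.41)
    return n1, k, b, lam, lam_dot, a_sigma

r1, v1, rho_star = (1.0, 0.2, -0.1), (0.1, 0.9, 0.3), (0.3, -0.5, 0.8)
n1, k, b, lam, lam_dot, a_sigma = terminal_multipliers(r1, v1, rho_star)
expected = scale(cross(b, n1), k)           # right-hand side of Eq. (4.74)
```

The checks confirm that $\mathbf{b}$ lies in the terminal orbital plane, that $\boldsymbol{\lambda}^1$ is collinear with $\mathbf{n}^1$, and that $a\boldsymbol{\sigma} = k^1\,\mathbf{b}\times\mathbf{n}^1$.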
4.33 Three-Dimensional Motions

Equations (4.66) and (4.67) may [in view of Eq. (4.68)] be presented in the following form:
+ 03
~ ( 1 ( ~ ( 2 03 ~ 3 0 2 )
oi, =
-
I:(7 1 k2 - $),,%,
-2 k C1,2u2
rz
k
-( I rv,
- ~ 1 ' )
+1'
I
+
1~
v,
9
ka, ara, ~
(4.79)
r
r
(4.80)
(4.81)

Equation (4.78) follows immediately from Eqs. (4.68) and (4.79)-(4.81) together with (4.58); this equation is used to formulate expression (4.33) for the switching function. In the most difficult case of boosters of the third type, we have 11 unknown quantities, namely

$$r,\; v_r,\; k,\; m,\; \alpha_1,\; \alpha_2,\; \alpha_3,\; \sigma_1,\; \sigma_2,\; \sigma_3,\; \eta \tag{4.82}$$

To determine all these, we have the three differential equations of motion, Eqs. (4.53), the flow rate equation (4.4), Eqs. (4.79), (4.63), or (4.62) for determination of the Cayley-Klein parameters (or Euler angles) to express the values of $\sigma_s$ [by virtue of Eqs. (4.65)], and equation (4.33) for the switching function. For fixed values of $r^0$, $v_r^0$, $k^0$, and $m^0$, the expression for the Cauchy integral will include nine constants $\sigma_s^0$, $\alpha_s^0$, $\eta^0$, $h$, and $a$; of these only seven may be treated as independent, since $\alpha_s^0$ and $\sigma_s^0$ must satisfy relations (4.52) and (4.59). In the general boundary problem, when, for instance, $t_1$ is minimized and the terminal mass is prescribed, the constant $h$ equals unity, but $t_1$ enters into the set of constants; we have then to determine seven quantities out of just the same number of conditions, which express that $r^1$, $v_r^1$, $k^1$, and $m^1$ together with the Euler angles $\Omega^1$, $i^1$, and $u^1$ are given. If, however, the mass is not prescribed, then $\lambda_m^1$ is equal to zero and the seventh condition is given by $\eta^1 = (c/m^1)\,ar^1\sigma_1^1/k^1\alpha_3^1$. This relation follows immediately from Eqs. (4.32) and (4.68).
Some items concerning the problem of integration of the basic equations when moving along Keplerian arcs, as well as those about the matching of various solutions at points of switching, will be developed below in Sec. 4.36. In problems formulated for boosting devices of the first type ($w = \text{const}$), the flow rate equation and that for the switching function are deleted, and the time $t_1$ is minimized. We must then find six independent constants ($\sigma_s^0$, $\alpha_s^0$, $a$, $t_1$). This number is reduced to five in the problem of turning the orbital plane, since by virtue of Eq. (4.76) we may express $\sigma_s^0$ through one constant $p$. The five equations for these constants are given by two of the trio of Eqs. (4.73), two equations prescribing the fixed values for $\Omega^1$ and $i^1$, and, last, the relation $w^1 a\sin p = \varepsilon(k^1/r^1)$, which follows from Eqs. (4.77) and (4.47)$_1$.

4.34 Plane Motions

For motions along plane orbits, the vector $\mathbf{n}$ remains constant, $\dot{\mathbf{n}} = 0$; also [by Eq. (4.55)] $\alpha_3 = 0$, which means that thrust acts in the orbital plane. The vector $\boldsymbol{\lambda}$ is disposed in the same plane, and Eqs. (4.43) and (4.58) now show that $\boldsymbol{\sigma}$ is collinear with $\mathbf{n}$: $\sigma_1 = 0$, $\sigma_2 = 0$, $\sigma_3 = \pm 1$, and $a\boldsymbol{\sigma} = \tilde{a}\mathbf{n}$, where $\tilde{a} = \pm a$. Setting now $\alpha_1 = \cos\psi$, $\alpha_2 = \sin\psi$, where $\psi$ is the angle between $\mathbf{e}$ and the radius vector, we arrive at a pair of differential equations equivalent to Eqs. (4.78)-(4.81); it should be mentioned that Eq. (4.68) has now been deleted, and we must again introduce $\lambda$ into Eqs. (4.78)-(4.81). We get
*
-
sin$+-cos$
- J(cos
Ar
rur
II, +
5 sin rur
cos +-) ur
$1
-
*a
9
sin 9 u , I+!J
(4.83)
(4.84)
These equations must be used together with Eqs. (4.53), (4.4), and (4.33). They combine into a system of the seventh order, and nine constants enter into the general solution. The order can be reduced to the sixth in the case of boosters of the first type. Considerable simplification may then be achieved, since there is no necessity to observe the sign of the switching function, and transitions from one regime to another are absent in this case.

The solution to the (plane) general boundary problem for boosters of the third type reduces to the determination of five constants (for fixed values of $r^0$, $v_r^0$, $k^0$, and $m^0$). Four equations arise from the prescription of the values for $r^1$, $v_r^1$, $k^1$, and the angle

(4.85)

This angle gives the change in true anomaly. Let, for instance, $m^1$ be minimized and $t_1$ given; the fifth equation will then be written as $\eta^1 = (c/m^1)\lambda^1 - 1$.
4.35 Special Case of Plane Motion

This case holds whenever $\sigma_3 = 0$. Under this condition, it follows from Eq. (4.58) that either $\alpha_3 = 0$ or $\sigma_2 = 0$. In the first case we have, according to Sec. 4.34, $\sigma_1 = 0$, $\sigma_2 = 0$, and $a = 0$; and in the second case Eq. (4.58) shows that $\sigma_1 = 0$. In both cases $a = 0$, the constant $\tilde{a}$ drops out of Eqs. (4.83)-(4.84), and the solution to the general boundary problem is no longer possible. The problem is soluble if we prescribe, at the right end, not the vectors $\mathbf{r}^1$, $\mathbf{v}^1$ themselves but their scalar invariants $r^1$, $v^1$, $\mathbf{r}^1\cdot\mathbf{v}^1$. Indeed, if the indicating function is presented by the relation

$$\Theta_1 = \Theta_1(r^1, v^1, \mathbf{r}^1\cdot\mathbf{v}^1)$$
and substitution into the vector integral Eq. (4.41) shows that a = 0. The differential equation (4.83) is now written as follows: 2k
A
9
r 2
u, r2
r3
.
2k
(4.86)
The elimination of d from these equations is uniquely performed for any prescribed 9. We have
Here 9 is taken in the form of Eq. (4.47),. By an analogous argument, Eqs. (4.47)' and (4.47), become, respectively,
On substitution for IjL and S / L in these expressions we arrive (in view of the equations of motion) at the second-order equation
u W +22 $ + - sin $ = 0 r r
(4.87)
Sometimes it may be preferable to introduce the criterion
5 = Y-
(4.88)
1
instead of q. Eq. (4.33) is then replaced by
Equation (4.87) should be referred to when the initial value of the radial velocity component $v_r$ is equal to zero (thrust imposed either at the perigee or at the apogee of an orbit). Consider, for example, the problem of minimization of time, and let the functional relation $\varphi(r^1, v^1) = 0$ be prescribed at the right end of the time interval. Then $\Theta = t_1 + \rho\,\varphi(r^1, v^1)$, and by virtue of Eqs. (4.70), (4.27), and (4.57) we have
X'
=
(I' cos
- acp -P$er
- a l V 3 lsin $l)e:
+ (I' sin $' +
cos $ ' ) e t
1
where h = 1 , v3' = $' + k ' / ( r l ) , . Having eliminated i l / A ' and p/A' from the four resulting equations, we arrive at the three boundary conditions required : cot
$1
v,lr'
=-
k' '
$1
=
k'
--
(?'I2
(I
++dd lInn u1 r
cp(rl, 0')
=o
For example, if cp
= ( v ' ) ~- a(,u/r') = 0,
d In v 1 - I -_d l n r1 2'
$1
then =
1 k' 2 (r1>2
Here c( = 1 for the problem of acceleration to orbital velocity, and for escape velocity. In the problem of transfer to a circular orbit
I
+ p2 r1 -
H = p 1 (u')' and having eliminated pl/A, p J A , and we arrive at the relations
a =2
vl
1/A from the four resulting equations,
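4.36 Moving along a Keplerian Arc

The differential equation (4.28) for the vector $\boldsymbol{\lambda}$ is simply an equation of variations for the system of the equations of motion. In fact, we have

$$\delta\dot{\mathbf{r}} = \delta\mathbf{v}, \qquad \delta\dot{\mathbf{v}} = -\frac{\mu}{r^3}\,\delta\mathbf{r} + \frac{3\mu}{r^5}(\mathbf{r}\cdot\delta\mathbf{r})\,\mathbf{r}$$

and it suffices to put $\delta\mathbf{r} = \boldsymbol{\lambda}$, $\delta\mathbf{v} = \dot{\boldsymbol{\lambda}}$. But the solution of the equations of motion is well known, and the derivatives of the position vector $\mathbf{r}$ with respect to the six constants of the motion lead to a system of linearly independent solutions to the equations of variations. Having combined some linear forms of these derivatives, we obtain the system of vectors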
1
3n(t - I") 3n(t - t o ) (I 2(1 - & 2 ) 1 / 2 E sin cp er - 2(1 - &2)''2 q2 = - cos cp e, q3 = sin cp e,
Q5 =
+ 21 ++
+ 21 ++
E &
cos cp sin cp e,,, E cos fp E
cos cp cos fp cos cpe,,
r(l - 2 ) cos cp n ,
~
P
44 =
r(1 - 2 ) e,
P
q, = [ r ( l - ~ ' ) / psin ] cp n
+
E
cos cp)e,
4.
1 -i12= n
-
- sin cp e,
r(1 - & 2 ) 3 ’ 2
1. -43 = - r ( l - & 2 ) 3 / 2 (cos cp e, n
-I 4 5 -- n
sin cp (1
- &2)1/2
1
’
+ c + cos cp
-+
-q6 = n
sin cp
cos 40 + E n ( I - &2)1’2
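By $\varphi$ we have denoted the true anomaly (measured from the ellipse's perigee). The expressions for $r$, $v_r$, and $k = r^2\dot{\varphi}$ are well known:

$$r = \frac{p}{1 + \varepsilon\cos\varphi}, \qquad v_r = \frac{np\,\varepsilon}{(1 - \varepsilon^2)^{3/2}}\sin\varphi, \qquad k = \frac{np^2}{(1 - \varepsilon^2)^{3/2}}, \qquad n = (\mu/p^3)^{1/2}(1 - \varepsilon^2)^{3/2}$$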
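The Keplerian expressions can be evaluated directly. This sketch assumes the standard relations $r = p/(1 + \varepsilon\cos\varphi)$, $v_r = np\,\varepsilon\sin\varphi/(1 - \varepsilon^2)^{3/2}$, $k = np^2/(1 - \varepsilon^2)^{3/2}$, with mean motion $n = (\mu/p^3)^{1/2}(1 - \varepsilon^2)^{3/2}$ for an elliptic orbit; here `ecc` stands for the eccentricity $\varepsilon$ and all sample values are arbitrary.

```python
import math

def kepler_state(p, ecc, mu, phi):
    """Return r, v_r, k and the mean motion n on an ellipse (0 <= ecc < 1)."""
    n = math.sqrt(mu / p**3) * (1 - ecc**2) ** 1.5
    r = p / (1 + ecc * math.cos(phi))
    v_r = n * p * ecc * math.sin(phi) / (1 - ecc**2) ** 1.5
    k = n * p**2 / (1 - ecc**2) ** 1.5
    return r, v_r, k, n

def energy(r, v_r, k, mu):
    """Energy per unit mass from the radial and transverse velocity components."""
    return 0.5 * (v_r**2 + (k / r) ** 2) - mu / r

p, ecc, mu = 1.3, 0.4, 1.0
energies = []
for phi in (0.0, 0.9, 2.0, 3.5):
    r, v_r, k, n = kepler_state(p, ecc, mu, phi)
    energies.append(energy(r, v_r, k, mu))
```

Along the arc the energy should equal $-\mu(1 - \varepsilon^2)/2p$ and the angular momentum should equal $(\mu p)^{1/2}$, independent of the true anomaly.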
The constants of a Keplerian orbit and the true anomaly at the moment of transfer are expressed through r * , v,*, and k , according to the relations
so that
where h, denotes a constant (energy) of the Keplerian orbit; the moment of passing through perigee, t o , is determined either by well known formulas or from the tables of solutions of the Keplerian equation. We can now present the solution of Eq. (4.28) in the following form:
(4.90)

In accordance with the Erdmann-Weierstrass condition, we must choose the constants $C_k$ in such a way that the vectors $\boldsymbol{\lambda}$ and $\dot{\boldsymbol{\lambda}}$ and the Hamiltonian $H_\lambda$ are continuous at the point of discontinuity of the control function (at this point $\eta = 0$).
From Eq. (4.39), taken along the Keplerian arc (w = 0) we get
Only the first term on the left-hand side of this equation differs from zero ; we find C, = - 2h[(1 - ~ ’ ) / n ’ p ]= - (h/h,)[p/(1 - c 2 ) ] (4.91) An analogous calculation with the vector integral Eq. (4.41) shows that
+
+
+ ( - 3 ~ , & ~ , ) [ n p / (-l ~ ~ ) ~ / [np/(l ~ ] n- E
~ ) ’ ~ ~+] C6i,) ( c , =~ uc ,
where i , is a unit vector directed t o the perigee of the orbit, and i, Thus, we have ua3 = f ( - 3 C , + & C 2 )
aa, = (C, cos 9 + C6 sin cp>[np/(1 - E ~ ) ” ~ ] ua2 = (- C, sin q +- C, cos cp)[np/(1 - E ~ ) ” ~
]
=n x
i,.
(4.92) (4.93)
The values of c3and of the components of 0 along fixed directions i, and i, have naturally become constant; one could have foreseen the values of o1 and n2 in view of Eq. (4.68), using an independent variable p (drp = k/r2dt). The coefficients C, and C4have dropped from the scalar and vector integrals. To determine these, we must use the first of Eqs. (4.90); this equation shows that
The Iast relation can be rewritten as
and from here we arrive at Eq. (4.68). The thrust direction program is now determined by the following formulas : 4
4
CI,
1 CtiqK* e,* = a,* 1 c,q, *
ti=
I
k= 1
*
e, , etc.
The relations determining the coefficients $C_k$ through the values of $a\sigma_s$, $\alpha_s$, $r$, $k$, and $v_r$ at the moment $t_*$ are rather involved, in general, and there is no need to write them down here; these relations can immediately be derived with the aid of the equations cited above. A remarkable simplification ($C_1 = 0$) is obtained if $h = 0$: such is the situation in the problem of minimization of fuel consumption where $t_1$ is not fixed.
From Eq. (4.33) and in view of the constancy of mass $m$ along the Keplerian arc we find

$$\eta = \frac{c}{m^*}(\lambda - \lambda^*) \tag{4.95}$$

Determination of the vector $\boldsymbol{\lambda}$ (in the plane motion problem) has been discussed. It is obvious from what has been said above that this determination requires no further integrations.
4.36a The $\boldsymbol{\lambda}$ vector in the special case of a plane motion problem and when $h = 0$. So long as $a = 0$, it follows immediately from Eqs. (4.91), (4.92), and (4.93) that $C_1 = C_2 = C_5 = C_6 = 0$; the constant parameters $C_3$ and $C_4$ are determined from Eqs. (4.94). These equations are now written in the form C , sin cp
=A
cos tj,
C,
2
I
+ +
E E
cos cp cos cp cos cp
= A sin tj + c, 1 +1-52, 52 cos cp
Having replaced in these relations cos cp and sin cp by r and u,, respectively, we arrive at the formula tan tj = - 2 0, t
2
+ B) ,
B = ( ~ / p ~ ) ” I~+[ &(I - - E~)”~(C,/C,)I
This relation can also be obtained after integration of Eq. (4.84) under the assumptions $\tilde{a} = 0$, $\vartheta = q\eta = 0$. There is no trouble in combining the expressions for $\lambda$ and the $\eta$ criterion. Consideration of the $\zeta$ criterion leads to a more compact expression. We write the differential equation (4.89) in the form
and having determined the integration constant and taken care of the expression for tan $, we arrive at the relation
4.4 Orbits on a Spherical Surface

4.41 The Statement of the Problem of Spherical Motions
For motion on a spherical surface, the $r$-coordinate is subjected to the condition

$$\mathbf{r}\cdot\mathbf{r} = r^2 = \text{const} \tag{4.96}$$

We must now introduce an additional term into the functional (4.19), this term being

$$\frac{1}{2}\int_{t_0}^{t_1}\nu(\mathbf{r}\cdot\mathbf{r} - r^2)\,dt$$

where $\nu$ denotes a Lagrange multiplier. This will add the term $\nu\mathbf{r}$ to the stationarity condition Eq. (4.23), and Eq. (4.28) will then be rewritten in the form

$$\ddot{\boldsymbol{\lambda}} = \mu\left(\frac{3}{r^5}\,\boldsymbol{\lambda}\cdot\mathbf{r}\;\mathbf{r} - \frac{1}{r^3}\,\boldsymbol{\lambda}\right) - \nu\mathbf{r} \tag{4.97}$$

All the other equations remain unchanged. It is readily observed that the term $\nu\mathbf{r}$ in Eq. (4.97) is collinear with $\mathbf{r}$ and has no influence on the derivation of the vector integral (see Sec. 4.22) or on the construction of the scalar integral (since the value of $H_\lambda$ stays the same). But in the preceding section (Sec. 4.23), the collinearity of the vectors $\ddot{\boldsymbol{\lambda}} + (\mu/r^3)\boldsymbol{\lambda}$ and $\mathbf{r}$ followed immediately from the vector integral and the equations of motion. Thus, when using the integrals (4.39) and (4.41) (together with the equations of motion, flow rate, and switching function), one need not worry about the validity of Eq. (4.97): it may be satisfied by a suitable choice of $\nu$. It would be a mistake to consider spherical motions on the basis of the differential equation (4.28).

4.42 The Differential Equations of Spherical Motions
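For spherical motions,

$$\mathbf{r}\cdot\mathbf{r} = r^2 = \text{const}, \qquad \mathbf{r}\cdot\mathbf{v} = 0, \qquad v_r = 0 \tag{4.98}$$

The equations of motion are written in the form

$$w\alpha_1 = \frac{\mu}{r^2} - \frac{k^2}{r^3}, \qquad \dot{k} = rw\alpha_2 \tag{4.99}$$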
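The first of the spherical equations of motion fixes the radial direction cosine of the thrust once $r$, $k$, and $w$ are known. A minimal sketch, assuming $w\alpha_1 = \mu/r^2 - k^2/r^3$ and $\alpha_1^2 + \alpha_2^2 + \alpha_3^2 = 1$; the function name and the feasibility check are illustrative additions.

```python
import math

def radial_cosine(r, k, mu, w):
    """Radial direction cosine alpha_1 needed to stay on the sphere, Eq. (4.99)."""
    alpha1 = (mu / r**2 - k**2 / r**3) / w
    if abs(alpha1) > 1.0:
        raise ValueError("thrust acceleration w too small to hold the radius")
    return alpha1

mu, r, w = 1.0, 1.0, 0.5
alpha1_circular = radial_cosine(r, math.sqrt(mu * r), mu, w)   # circular speed
alpha1_slow = radial_cosine(r, 0.9 * math.sqrt(mu * r), mu, w) # 90% of circular speed
# What remains of the unit thrust direction for alpha_2 and alpha_3
remaining = 1.0 - alpha1_slow**2
```

At circular speed no radial thrust component is needed; below circular speed a positive (outward) component must make up the deficit of centrifugal acceleration.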
Consider now Eqs. (4.78)-(4.81). Having retained in these equations only the terms containing $v_r$ in the denominator, we arrive at the single relation (resulting from all equations considered)

(4.100)

From Eqs. (4.80) and (4.81) we deduce the relation which does not include $v_r$, namely,
i 3
k
a3
r2
oi, - - CI, = - 2 -
k
- -(02a2 r201
+ o3a3)+ w kr- (1 - a I 2 )
(4.101)
As was demonstrated in Sec. 4.35, the function $\vartheta$ satisfies the differential equation

(4.102)

Note that, for boosters of the third type, this equation presents a modification of the differential equation for the switching function $\eta$; the latter can be expressed through $\vartheta$ [see Eq. (4.47)$_3$]. Nine quantities $k$, $m$, $\alpha_s$, $\sigma_s$, and $\vartheta$ are thus connected by ten equations, these equations being: Eq. (4.99) of motion; Eq. (4.4) of mass flow rate; four equations (4.100)-(4.102) and (4.52); and, last, the three equations of (4.58). But we observe that Eq. (4.102) may be rewritten with the aid of Eqs. (4.100) and (4.99) in the form

(4.103)

and this one is obviously satisfied in view of the remaining equations (this point may readily be verified by differentiation). The value of $\dot{\alpha}_1$ is then determined by differentiation from the equations of motion (4.99), namely,

(4.104)

The values of $\dot{\alpha}_2$ and $\dot{\alpha}_3$ follow from Eq. (4.101) together with the relation $\alpha_1\dot{\alpha}_1 + \alpha_2\dot{\alpha}_2 + \alpha_3\dot{\alpha}_3 = 0$. Thus, Eq. (4.102) is withdrawn from further use, and Eq. (4.101) serves only for calculation of the switching function in view of Eq. (4.47)$_3$. We arrive at eight equations for just the same number of unknowns $k$, $m$, $\alpha_s$, and $\sigma_s$.
4.43 Spherical Motions when $w = \text{const}$ (Devices of the First Type)

For simplicity, we introduce nondimensional variables

Here $\nu$ denotes the ratio of the speed $v = k/r$ to that along a Keplerian orbit of the same radius, $n$ represents the angular speed in the same orbit, and $\zeta$ the ratio of thrust to gravitational force,† also in the same orbit. With minimization of time $t_1$ in mind, we write, in view of Eq. (4.47)$_1$, that $\vartheta/\lambda = w - h/\lambda = w - 1/\lambda = w - k\alpha_3/ar\sigma_1$, and Eqs. (4.99)-(4.102) will now be rewritten (a prime denoting differentiation with respect to $\tau$) as

† There should be no danger of confusing this notation with that for the switching function, because the latter does not appear in this section.
(4. loo),
One of the two latter equations follows from the others. It seems natural to retain the finite relation Eq. (4.100),. The equations already written must, of course, be complemented by Eqs. (4.58) and (4.52). The latter equation, together with Eqs. (4.99), and (4.100),, leads to the relation
In this equation, the values of o3 and g1 are expressed by Eqs. (4.65) through the Cayley-Klein parameters, these being determined from the system of Eqs. (4.59). This system may now be presented as
;[ v a + -
a ! = -2
01
v2
i2 - (1 - v2)2 6-V03
81'
i
P ' = - [2- V a + 7
Ol
c2 - (1 -
V2)*
6 - VO3
4
(4.107)

and a pair of equations for the complex conjugate values $\bar{\alpha}$ and $\bar{\beta}$. The equations are integrated with initial conditions $\alpha^0 = 1$, $\bar{\alpha}^0 = 1$, $\beta^0 = 0$, $\bar{\beta}^0 = 0$. Note that the following integral is valid:

$$\alpha\bar{\alpha} + \beta\bar{\beta} = 1 \tag{4.108}$$
The constants $\nu^0$, $\sigma_s^0$, and $\delta$ will enter into the general solution. For the general boundary problem there are prescribed the values of $\nu^0$ and $\nu^1$ together with the vectors $\mathbf{e}_r^1$ and $\mathbf{e}_\varphi^1$ giving the position of the mass center and its velocity direction at the instant $\tau_1$. For a fixed value of $\nu^0$, at the right end the following conditions should be satisfied:

$$\nu^1 = \nu(\sigma_s^0, \delta, \tau_1), \qquad \alpha^1 = \alpha(\sigma_s^0, \delta, \tau_1), \qquad \beta^1 = \beta(\sigma_s^0, \delta, \tau_1), \qquad \bar{\alpha}^1 = \bar{\alpha}(\sigma_s^0, \delta, \tau_1), \qquad \bar{\beta}^1 = \bar{\beta}(\sigma_s^0, \delta, \tau_1)$$

Of these relations only four may be treated as independent, because the Cayley-Klein parameters are connected by Eq. (4.108). On the other hand, we have also four independent constants among $\sigma_s^0$, $\delta$, and $\tau_1$ at our disposal. It is stressed that the values of $\alpha^1$, $\bar{\alpha}^1$, $\beta^1$, $\bar{\beta}^1$ are determined by those prescribed for $\mathbf{e}_r^1$, $\mathbf{e}_\varphi^1$, $\mathbf{n}^1 = \mathbf{e}_r^1\times\mathbf{e}_\varphi^1$.
4.43a Turning of a circular orbit. Consider the partial solution to the problem of the preceding section constructed under the condition that thrust be collinear with the normal $\mathbf{n}$ to the instantaneous orbital plane:

$$\alpha_1 = 0, \qquad \alpha_2 = 0, \qquad \alpha_3 = \pm 1 = \varepsilon \tag{4.109}$$

From (4.99)$_1$ it follows that for motion of this kind $\nu = 1$, which means that the speed is held constant and equal to that on a circular orbit of radius $r$. In view of Eq. (4.100)$_1$ [or Eq. (4.106)] we obtain the equation

$$\sigma_3 + \varepsilon\zeta\sigma_1 = \delta \tag{4.110}$$

equivalent to the stationarity condition for the simplest case considered. The differential equations (4.58) can now be written in the form

$$\sigma_1' = \sigma_2, \qquad \sigma_2' = -\sigma_1 + \varepsilon\zeta\sigma_3, \qquad \sigma_3' = -\varepsilon\zeta\sigma_2 \tag{4.111}$$

According to these equations, the $\tau$-derivative of the left-hand side of Eq. (4.110) vanishes, and from this we deduce that the corresponding situation is realizable. Equations (4.107) now present a differential system with constant coefficients, namely,

$$\alpha' = \frac{i}{2}(\alpha + \varepsilon\zeta\beta), \qquad \beta' = \frac{i}{2}(-\beta + \varepsilon\zeta\alpha) \tag{4.112}$$

whose general solution is

$$\alpha = \alpha^0\kappa_1 + \beta^0\kappa_2, \qquad \beta = \alpha^0\rho_1 + \beta^0\rho_2 \tag{4.113}$$

Here we have introduced a system of solutions with the unit matrix of initial conditions:

$$\begin{aligned}
\kappa_1(\tau) &= \cos^2(\chi/2)\,e^{i\omega} + \sin^2(\chi/2)\,e^{-i\omega}, & \kappa_2(\tau) &= \tfrac{1}{2}\varepsilon\sin\chi\,(e^{i\omega} - e^{-i\omega}), \\
\rho_1(\tau) &= \tfrac{1}{2}\varepsilon\sin\chi\,(e^{i\omega} - e^{-i\omega}), & \rho_2(\tau) &= \sin^2(\chi/2)\,e^{i\omega} + \cos^2(\chi/2)\,e^{-i\omega}
\end{aligned} \tag{4.114}$$

where

$$\tan\chi = \zeta, \qquad \omega = \tau/(2\cos\chi) \tag{4.115}$$
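The table of cosines (Table I) corresponds to the values $\alpha^0 = 1$, $\beta^0 = 0$, that is, to the transfer from the initial position of the orbital triad. The equation of motion of the center of mass is now formulated with the aid of the first line of this table:

$$\mathbf{r} = r\mathbf{e}_r = r\left[\mathbf{e}_r^0(\sin^2\chi + \cos^2\chi\cos 2\omega) + \mathbf{e}_\varphi^0\cos\chi\sin 2\omega + \mathbf{n}^0\varepsilon\sin\chi\cos\chi\,(1 - \cos 2\omega)\right] \tag{4.116}$$

From this equation it is easy to deduce that the following relations are valid:

$$\mathbf{r}\cdot\mathbf{N} = r\sin\chi, \qquad \mathbf{N} = \mathbf{e}_r^0\sin\chi + \varepsilon\mathbf{n}^0\cos\chi \tag{4.117}$$

These relations show that an orbit is a minor circle of a sphere resulting from its intersection with the plane perpendicular to the vector $\mathbf{N}$ and moved the distance $r\sin\chi$ from the attractive center.

TABLE I. COSINES

$$\begin{array}{c|ccc}
 & \mathbf{e}_r^0 & \mathbf{e}_\varphi^0 & \mathbf{n}^0 \\ \hline
\mathbf{e}_r & \sin^2\chi + \cos^2\chi\cos 2\omega & \cos\chi\sin 2\omega & \varepsilon\cos\chi\sin\chi\,(1 - \cos 2\omega) \\
\mathbf{e}_\varphi & -\cos\chi\sin 2\omega & \cos 2\omega & \varepsilon\sin\chi\sin 2\omega \\
\mathbf{n} & \varepsilon\cos\chi\sin\chi\,(1 - \cos 2\omega) & -\varepsilon\sin\chi\sin 2\omega & \cos^2\chi + \sin^2\chi\cos 2\omega
\end{array}$$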
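The table of cosines can be checked for internal consistency. The sketch below assumes the entries correspond to a rotation through the angle $2\omega$ about the fixed axis $\varepsilon\mathbf{N}$ (rows $\mathbf{e}_r$, $\mathbf{e}_\varphi$, $\mathbf{n}$; columns $\mathbf{e}_r^0$, $\mathbf{e}_\varphi^0$, $\mathbf{n}^0$), and verifies orthonormality and the minor-circle relation $\mathbf{e}_r\cdot\mathbf{N} = \sin\chi$; the function name is illustrative.

```python
import math

def cosine_table(chi, om, eps):
    """Direction cosines of e_r, e_phi, n relative to the initial triad."""
    c, s = math.cos(chi), math.sin(chi)
    c2, s2 = math.cos(2 * om), math.sin(2 * om)
    return [
        [s * s + c * c * c2, c * s2, eps * c * s * (1 - c2)],     # e_r
        [-c * s2, c2, eps * s * s2],                              # e_phi
        [eps * c * s * (1 - c2), -eps * s * s2, c * c + s * s * c2],  # n
    ]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

chi, om, eps = 0.4, 1.1, -1.0
table = cosine_table(chi, om, eps)
big_n = (math.sin(chi), 0.0, eps * math.cos(chi))   # the vector N of Eq. (4.117)
gram = [[dot(table[i], table[j]) for j in range(3)] for i in range(3)]
offset = dot(table[0], big_n)                        # e_r . N
```

Since $\mathbf{e}_r\cdot\mathbf{N}$ does not depend on $\omega$, the tip of $\mathbf{r}$ stays in a plane at the fixed distance $r\sin\chi$ from the center.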
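The formulated solution does not allow one to prescribe a priori the direction of the vector $\mathbf{n}^1$ normal to the instantaneous $\Pi^1$ plane. Indeed, this direction is fixed by two angles $\Omega^1$ and $i^1$, while we have only one quantity $2\omega_1 = \tau_1/\cos\chi$ at our disposal. For this reason it would seem reasonable to obtain a solution by changing, at some suitable instant $\tau_*$, the thrust direction by 180° (going from $\alpha_3 = \varepsilon$ to $\alpha_3 = -\varepsilon$). Under this assumption, we have [see Eq. (4.113)]

$$\begin{aligned}
0 < \tau < \tau_* - 0: &\quad \alpha_3 = \varepsilon, & \alpha &= \kappa_1^+(\tau), & \beta &= \rho_1^+(\tau) \\
\tau_* + 0 \le \tau \le \tau_1: &\quad \alpha_3 = -\varepsilon, & \alpha &= \alpha^*\kappa_1^-(\tau - \tau_*) + \beta^*\kappa_2^-(\tau - \tau_*), & \beta &= \alpha^*\rho_1^-(\tau - \tau_*) + \beta^*\rho_2^-(\tau - \tau_*)
\end{aligned}$$

where $\kappa_i^+$, $\rho_i^+$ are determined by Eqs. (4.114), and $\kappa_i^-$, $\rho_i^-$ by the same equations with $\varepsilon$ replaced by $-\varepsilon$. We get

$$\alpha^1 = \alpha(\tau_1) = \kappa_1^+(\tau_*)\,\kappa_1^-(\tau_1 - \tau_*) + \rho_1^+(\tau_*)\,\kappa_2^-(\tau_1 - \tau_*)$$

and

$$\beta^1 = \beta(\tau_1) = \kappa_1^+(\tau_*)\,\rho_1^-(\tau_1 - \tau_*) + \rho_1^+(\tau_*)\,\rho_2^-(\tau_1 - \tau_*)$$

and after the calculation, in view of Eqs. (4.114), we obtain

$$\begin{aligned}
\alpha^1 &= i\cos\chi\sin\omega_1 + \cos\omega_1 + \sin^2\chi\,[\cos(\omega_1 - 2\omega_*) - \cos\omega_1] \\
\beta^1 &= \varepsilon\sin\chi\,\{\cos\chi\,[\cos(\omega_1 - 2\omega_*) - \cos\omega_1] - i\sin(\omega_1 - 2\omega_*)\}
\end{aligned} \tag{4.118}$$

$$(\omega_1 = \tau_1/2\cos\chi, \qquad \omega_* = \tau_*/2\cos\chi)$$

The unknown parameters $\tau_*$ and $\tau_1$ are the least positive roots of the system of equations

$$\alpha^1\bar{\alpha}^1 - \beta^1\bar{\beta}^1 = \cos i^1, \qquad 2\alpha^1\beta^1 = i\sin i^1\exp(i\Omega^1) \tag{4.119}$$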
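The two-arc composition and its closed form can be compared numerically. The sketch assumes the fundamental solutions of Eq. (4.114), the composition rule of Eq. (4.113) applied at the switching instant, and the closed form $\alpha^1 = i\cos\chi\sin\omega_1 + \cos\omega_1 + \sin^2\chi[\cos(\omega_1 - 2\omega_*) - \cos\omega_1]$, $\beta^1 = \varepsilon\sin\chi\{\cos\chi[\cos(\omega_1 - 2\omega_*) - \cos\omega_1] - i\sin(\omega_1 - 2\omega_*)\}$; it also evaluates the left-hand sides of Eq. (4.119). Names and sample values are hypothetical.

```python
import cmath
import math

def arc(tau, zeta, eps):
    """kappa1, kappa2, rho1, rho2 of Eq. (4.114) for one thrust arc."""
    chi = math.atan(zeta)
    om = tau / (2 * math.cos(chi))
    ep, em = cmath.exp(1j * om), cmath.exp(-1j * om)
    c2, s2 = math.cos(chi / 2) ** 2, math.sin(chi / 2) ** 2
    off = 0.5 * eps * math.sin(chi) * (ep - em)
    return c2 * ep + s2 * em, off, off, s2 * ep + c2 * em

def two_arc(tau_star, tau_1, zeta, eps):
    """Compose the arc with alpha3 = eps and the arc with alpha3 = -eps."""
    alpha_s, _, beta_s, _ = arc(tau_star, zeta, eps)   # state at the switch
    k1, k2, r1, r2 = arc(tau_1 - tau_star, zeta, -eps)
    return alpha_s * k1 + beta_s * k2, alpha_s * r1 + beta_s * r2

def closed_form(tau_star, tau_1, zeta, eps):
    """Closed-form terminal values, Eq. (4.118)."""
    chi = math.atan(zeta)
    w1, ws = tau_1 / (2 * math.cos(chi)), tau_star / (2 * math.cos(chi))
    d = math.cos(w1 - 2 * ws) - math.cos(w1)
    alpha1 = 1j * math.cos(chi) * math.sin(w1) + math.cos(w1) + math.sin(chi) ** 2 * d
    beta1 = eps * math.sin(chi) * (math.cos(chi) * d - 1j * math.sin(w1 - 2 * ws))
    return alpha1, beta1

args = (1.3, 2.9, 0.5, 1.0)                 # tau_*, tau_1, zeta, eps
a_comp, b_comp = two_arc(*args)
a_form, b_form = closed_form(*args)
cos_inc = abs(a_form) ** 2 - abs(b_form) ** 2          # = cos i^1
node = cmath.phase(2 * a_form * b_form / 1j)           # = Omega^1 when sin i^1 > 0
```

Solving Eq. (4.119) for $\tau_*$ and $\tau_1$ at prescribed $\Omega^1$ and $i^1$ would require a root finder, which is outside the scope of this sketch.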
For $0 \le \tau \le \tau_1 - 0$, an orbit is combined of two parts of minor circles equally removed from the attractive center; their common tangent at the point $r\mathbf{e}_r(\tau_*)$ is directed along the vector $\mathbf{e}_\varphi(\tau_*)$, and the vectors $\mathbf{N}_1$ and $\mathbf{N}_2$ serve as normals to the circles' planes. Since thrust ceases its action at the instant $\tau_1 + 0$, the Keplerian orbit becomes a major circle whose plane is suitably oriented in space (its normal given by $\mathbf{n}^1$). In the problem considered above, the stationarity conditions and the Weierstrass criterion have resulted in the single requirement, Eq. (4.110). This condition is valid for either of the transfers $\mathbf{e}_r^0 \to \mathbf{e}_r^*$ and $\mathbf{e}_r^* \to \mathbf{e}_r^1$ taken separately; but for the equivalent transfer $\mathbf{e}_r^0 \to \mathbf{e}_r^1$ it may be satisfied only under the additional requirement $\sigma_1(\tau_*) = 0$, since $\boldsymbol{\sigma}$ is a constant vector and its components $\sigma_s$ along the orbital axes must be continuous functions of $\tau$. But as long as $\alpha_1 = 0$, $\alpha_2 = 0$, $\alpha_3 = \varepsilon$, we may write
and
l(z,) = 0 ,
r2
h'(z,) = - a2(z*)n*,
8P
(HA),=,*= 0
We have arrived at a contradiction with the equality $h = 1$, so that one of the Erdmann-Weierstrass conditions is violated, and the transfer $\mathbf{e}_r^0 \to \mathbf{e}_r^1$ is not an optimal one. It was postulated at the very outset that the thrust vector may be arbitrarily oriented relative to the orbital axes; it seems probable that under this assumption the minimum time requirement can be realized only when there is a component of thrust in the instantaneous orbital plane. The above partial solution does not meet this condition.
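4.43b Optimal turning of the plane of a circular orbit. We now alter the statement of the problem: it will be assumed in what follows that the orbit is a spherical curve and the thrust acceleration (of bounded magnitude) is collinear with the normal to the instantaneous orbital plane. Under these conditions, the velocity will stay constant in magnitude and collinear with $\mathbf{e}_\varphi$ (this statement comes from the equation of motion (4.99)$_2$). The other equations of motion reduce to Eqs. (4.55). In view of the nondimensional notation introduced above, we may present Eqs. (4.55) in the form

$$\mathbf{e}_r' = \mathbf{e}_\varphi, \qquad \mathbf{e}_\varphi' = -\mathbf{e}_r + \varepsilon\zeta\mathbf{n}, \qquad \mathbf{n}' = -\varepsilon\zeta\mathbf{e}_\varphi \tag{4.120}$$

where $|\varepsilon(\tau)| \le 1$. The choice of the "control" $\varepsilon(\tau)$ must be such as to minimize the transition time from the initial position of the orbital plane to the terminal one described by the normal vector $\mathbf{n}^1 = \mathbf{n}_*^1$. In this connection, we give the indicating function in the form $\Theta = \tau_1 + \boldsymbol{\rho}\cdot(\mathbf{n}^1 - \mathbf{n}_*^1)$. We introduce three Lagrange vectors $\boldsymbol{\lambda}_1$, $\boldsymbol{\lambda}_2$, and $\boldsymbol{\lambda}_3$, and write the Hamiltonian $H_\lambda = \boldsymbol{\lambda}_1\cdot\mathbf{e}_\varphi - \boldsymbol{\lambda}_2\cdot\mathbf{e}_r + \varepsilon\zeta(\boldsymbol{\lambda}_2\cdot\mathbf{n} - \boldsymbol{\lambda}_3\cdot\mathbf{e}_\varphi)$. From Pontryagin's maximum principle it follows that $\varepsilon(\tau)$ may take either of its limiting values:

$$\varepsilon = +1 \ \text{ if } \ \boldsymbol{\lambda}_2\cdot\mathbf{n} - \boldsymbol{\lambda}_3\cdot\mathbf{e}_\varphi > 0, \qquad \varepsilon = -1 \ \text{ if } \ \boldsymbol{\lambda}_2\cdot\mathbf{n} - \boldsymbol{\lambda}_3\cdot\mathbf{e}_\varphi < 0$$

The moment of switching $\tau_*$ is found from the equation

$$\boldsymbol{\lambda}_2^*\cdot\mathbf{n}^* - \boldsymbol{\lambda}_3^*\cdot\mathbf{e}_\varphi^* = 0 \tag{4.121}$$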
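The bang-bang rule can be written down in a few lines. The sketch assumes the Hamiltonian $H_\lambda = \boldsymbol{\lambda}_1\cdot\mathbf{e}_\varphi - \boldsymbol{\lambda}_2\cdot\mathbf{e}_r + \varepsilon\zeta(\boldsymbol{\lambda}_2\cdot\mathbf{n} - \boldsymbol{\lambda}_3\cdot\mathbf{e}_\varphi)$ and a positive $\zeta$; returning `0.0` when the switching function vanishes is a convention chosen here to flag a switching instant, and the sample vectors are arbitrary.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def switching_function(lam2, lam3, n, e_phi):
    """Coefficient of eps*zeta in the Hamiltonian; its zero gives Eq. (4.121)."""
    return dot(lam2, n) - dot(lam3, e_phi)

def control(lam2, lam3, n, e_phi):
    """Limiting value of eps that maximizes the Hamiltonian (zeta > 0 assumed)."""
    s = switching_function(lam2, lam3, n, e_phi)
    if s > 0:
        return 1.0
    if s < 0:
        return -1.0
    return 0.0   # switching instant: the sign is not determined by this rule

def hamiltonian(lam1, lam2, lam3, e_r, e_phi, n, eps, zeta):
    return (dot(lam1, e_phi) - dot(lam2, e_r)
            + eps * zeta * switching_function(lam2, lam3, n, e_phi))

e_r, e_phi, n = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
lam1, lam2, lam3 = (0.2, 0.1, -0.3), (0.1, -0.4, 0.5), (0.3, 0.2, 0.1)
eps = control(lam2, lam3, n, e_phi)
h_best = hamiltonian(lam1, lam2, lam3, e_r, e_phi, n, eps, 0.5)
h_other = hamiltonian(lam1, lam2, lam3, e_r, e_phi, n, -eps, 0.5)
```

Because the Hamiltonian is linear in $\varepsilon$, its maximum over $|\varepsilon| \le 1$ is always attained at one of the two limiting values unless the switching function vanishes.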
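The stationarity conditions lead to the linear system
$$\lambda_r' = \lambda_\varphi, \qquad \lambda_\varphi' = -\lambda_r + \varepsilon\lambda_n, \qquad \lambda_n' = -\varepsilon\lambda_\varphi \tag{4.122}$$
whose partial solution is $\lambda_r = e_r$, $\lambda_\varphi = e_\varphi$, $\lambda_n = n$; the general solution, including nine arbitrary constants, may be presented in the form
$$\lambda_r = A \cdot e_r, \qquad \lambda_\varphi = A \cdot e_\varphi, \qquad \lambda_n = A \cdot n$$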
where through $A$ we have denoted a constant tensor of the second rank. The boundary conditions may be formulated in terms of the indicating function introduced above; these conditions are
$$\lambda_r^1 = 0, \qquad \lambda_\varphi^1 = 0, \qquad H_\lambda^1 = 1 \tag{4.123}$$
Having presented $A$ as a sum of three dyadics
$$A = a_1 e_r^1 + a_2 e_\varphi^1 + a\, n^1$$
we find that $a_1 = 0$ and $a_2 = 0$. The solution of Eqs. (4.122) with the boundary conditions of Eq. (4.123) is now given by
$$\lambda_r = a\, e_r \cdot n^1, \qquad \lambda_\varphi = a\, e_\varphi \cdot n^1, \qquad \lambda_n = a\, n \cdot n^1$$
where $a$ is a constant vector collinear with the Lagrange multipliers $\lambda$.
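The solution just stated can be verified by direct substitution. The short check below is a sketch that assumes the equations of motion in the form $e_r' = e_\varphi$, $e_\varphi' = -e_r + \varepsilon n$, $n' = -\varepsilon e_\varphi$ and the adjoint system in the form $\lambda_r' = \lambda_\varphi$, $\lambda_\varphi' = -\lambda_r + \varepsilon\lambda_n$, $\lambda_n' = -\varepsilon\lambda_\varphi$; it uses only the constancy of $a$ and $n^1$ and the orthogonality of the terminal triad.

```latex
\begin{aligned}
\lambda_r' &= a\,(e_r' \cdot n^1) = a\,(e_\varphi \cdot n^1) = \lambda_\varphi, \\
\lambda_\varphi' &= a\,(e_\varphi' \cdot n^1)
   = -a\,(e_r \cdot n^1) + \varepsilon\, a\,(n \cdot n^1)
   = -\lambda_r + \varepsilon \lambda_n, \\
\lambda_n' &= a\,(n' \cdot n^1) = -\varepsilon\, a\,(e_\varphi \cdot n^1) = -\varepsilon \lambda_\varphi, \\
\lambda_r^1 &= a\,(e_r^1 \cdot n^1) = 0, \qquad
\lambda_\varphi^1 = a\,(e_\varphi^1 \cdot n^1) = 0, \qquad
\lambda_n^1 = a\,(n^1 \cdot n^1) = a .
\end{aligned}
```

The first three lines reproduce the adjoint system, and the last line shows that the two homogeneous terminal conditions hold for any constant vector $a$.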
The scalar integral which expresses the constancy of the Hamiltonian is now transformed into
$$(n^1 \times a) \cdot (n + \varepsilon e_r) = 1 \tag{4.124}$$
and the notation $n^1 \times a = (1/\delta)\sigma$ allows us to present it in the form of Eq. (4.110). The vector $\sigma$ now turns out to be disposed in the $\Pi^1$ plane in accordance with boundary condition, Eq. (4.79), previously found in the general problem of turning the orbital plane. An instant $\tau^*$ may be determined from Eq. (4.121) transformed into
$$a \cdot (n^*\, e_\varphi^* \cdot n^1 - e_\varphi^*\, n^* \cdot n^1) = a \cdot [n^1 \times (n^* \times e_\varphi^*)] = (a \times n^1) \cdot (n^* \times e_\varphi^*) = 0$$
or, equivalently,
$$(1/\delta)\, \sigma^* \cdot e_r^* = 0, \qquad \sigma_1(\tau^*) = 0.$$
Nothing prevents us from assuming that the latter condition is satisfied. From Eq. (4.124) we also infer that $\varepsilon(\tau_1)\sigma_1(\tau_1) = \delta$, and $\varepsilon(\tau_1) = \operatorname{sgn} \sigma_1(\tau_1)$. This relation serves for the choice of the sign of $\varepsilon(\tau_1)$. This conclusion is also confirmed by Eq. (4.77). The stationarity conditions and the Weierstrass criterion (maximum principle) are satisfied in the statement of the problem above, while the expressions for the $\lambda$-vectors and the Hamiltonian [see Eqs. (4.121)] leave no doubt that the Erdmann-Weierstrass conditions are also fulfilled. The differential equations of motion [Eqs. (4.120)] were analyzed above in the equivalent form, Eq. (4.112). The orbit so found consists of two parts of minor circles and is optimal for the statement of the problem formulated in this Section; the minimum time and the switching instant may be calculated from Eqs. (4.119). This minimum is of course larger than that obtained for a more general problem when we allow the components of thrust to be disposed in the instantaneous orbital plane. The problem of turning the plane of a circular orbit has been analyzed by others.13,14,15

4.5 Boosting Devices of Limited Propulsive Power

4.51 The Differential Equations of Motion
In Sec. 4.17 it was shown that for this type of booster the vector 3, differs from the thrust acceleration vector only by the constant multiplier A / N m a x . With this in mind, we place the expression (4.30) for the vector 3, in the
differential equation (4.28) and arrive at the fourth-order differential equation for the vector r, namely,
L(r)=
=O
(4.125)
The total fuel consumption will be presumed to be minimized; in this case
e = -ml + el(rl, i') + p t ( t l - t,*)
(4.126)
where pr = h ; and if tl is not prescribed, then h = 0. The boundary conditions at the right end are formulated through the vector r; in view of Eqs. (4.30) and (4.70), these conditions are written as t = t,,
(r (P
+
5
r):,
= grad,,
+ -$r)t, = -grad;,
At the left end the vectors for H A will be written below.
To,
8,, (4.127)
O,,
HAIf,
=h
ioare presumed to be fixed. The expression
4.51a. Scalar and vector integrals. Substitution of Eq. (4.30) into Eqs. (4.39), and (4.41) transforms these integrals into HA = -
(Fxr).--2
(-r + - r ) x i = b = b o rt
(4.129)
where b and h denote constants. By the same argument as in Sec. 4.23, it can be shown that the system of Eqs. (4.128) and (4.129), now containing four constants, is equivalent to the initial equation (4.125) under the additional requirement r i # 0. Equation (4.68) which may also be presented in the form
-
provides the first integral of Eq. (4.129). This relation may also be written in the easily verified form r * ( ix r ) = b o . r (4.130) Having constructed the cross-product of Eq. (4.129) with i and r, we obtain and
(4. I 3 I)
r * ( ix T.)=bo*i
-
r (r x ii)
= bo
*
(i:
+ (p/r3)r]
(4.132)
the latter by use of Eq. (4.130). We may disregard Eq. (4.131) because it is an immediate consequence of Eq. (4.130). The problem has now been reduced to the system of three scalar equations (4.128), (4.130) and (4.132). These equations determine covariant components of the vector i: in the vector basis r x i, r x r, i, and with their aid this vector is expanded along the axes of the conjugate basis (l/bA)(r x r) x i, (I/bA)i x (r x i), (I/bA)(r x i) x (r x r) where [see Eq. (4.130)]
bA
=r
* i ( r x i>* r
=r
* ib c * r
(4.133)
We arrive at the following result:
+c
i i x ( r x r)
+ c - r i x (r x f)
(4.134)
This result is valid when r * i # 0 and b # 0 (except for spherical and plane motions). The three differential equations (4.134) of the third order allow an integral (4.130). The system depends on three independent constants; namely, the constant unit vector G and h and one other constant, b, which enters into the first integral. The control program for the thrust acceleration is determined from the relation w=r
+(p/~~)r
(4.135)
4.52 Plane Motions

For the motion along plane orbits, the vectors $r$, $\dot r$, $\ddot r$, $\dddot r$ are coplanar, and from Eqs. (4.130)-(4.132) it is readily seen that $b = \delta n$, $\delta = \pm b$. Equation (4.129) can now be presented in the form $\dddot r \times r = [\ddot r + 2(\mu/r^3) r] \times \dot r + \delta n$. Consider now the vector product of this equation with $\dot r$; having replaced $\dddot r \cdot \dot r$ in the resulting expression with the aid of Eq. (4.128), we arrive at the following presentation of the vector $\dddot r$:
- i -2Pr ~ i i ~ i ( i x ~ ) + 6 i x n r3
(4.136)
We now have a system of two third-order differential equations containing two constants $\delta$, $h$. For the special case of plane motion, $\delta = 0$.
4.53 Spherical Motions
In accordance with what has been stated in Sec. 4.4, the differential equation of motion Eq. (4.125) should be complemented by the term vr collinear with r. But this term obviously does not change the form of the integrals (4.128), (4.129). Combined, they present the differential equations of the problem. In terms of the nondimensional variables of (4.105), these equations are written as follows: -
-
(e,” e,,), + je,“ * el” + ern* e, = (e,” x er)’- 2(e,” + e,) x e,‘ = a6
(4.137)
-
where we have accounted for e, e,’ = 0 and e,’ * e,’ = -e, * e,”, and z, had been presumed a priori unknown (h = 0). We must adjoin to these equations the following kinematic relations e,‘
e,’ = ve,,
=
-ve,
+ -P n, V
Pe n‘ = - v
,
(4.138)
where v denotes the speed and p the component of the thrust acceleration along the normal to the instantaneous orbital plane, both parameters presented in nondimensional form. These quantities will from now on be treated as new dependent variables. We write e,” = v’e, - v2e, + pn
(4.139)
Substitution into Eqs. (4.137) leads to the four relations pv = a,6
(v’/v)p
(4.140)
+ p‘ = 0 2 6
(4.141) (4.142)
p2
=
f
+
2 v2
+ I vv” -
f v’2
-
(4.143)
v4
where we have already used Eq. (4.143) for presentation of Eq. (4.142). Only two of these four relations may be treated as independent. Indeed, since a is a constant unit vector, we have 61’ = v c p
,
6 2 ’ = - vfJ1
+ -P c3, V
P c3’= - - ( T 2 , V
c*2
+ 022 + 032
=
1
(4.144)
It is obvious that Eq. (4.141) is a consequence of Eq. (4.140). Furthermore, we arrive at Eq. (4.141) by differentiation of Eq. (4.142) and use of Eq. (4.143). It remains only to require that the right-hand sides of Eqs. (4.140)-(4.142) be connected by the second equation of Eqs. (4.144). This condition is equivalent to the differential relation
$$p'' + \frac{v'}{v}\, p' = \frac{p}{v^2}\,(1 - 2p^2 - 2v^4) \tag{4.145}$$
which must be treated together with Eq. (4.143). The first integral of the system of Eqs. (4.143) and (4.145) is already known: we mean the last of Eqs. (4.144) expressed in terms of the left-hand sides of Eqs. (4.140)-(4.142). We have five free parameters (four constants of integration and $\tau_1$) at our disposal, and it is necessary to choose them in such a way as to satisfy prescribed values of $v^0$, $v^1$, and Euler angles $\Omega^1$, $i^1$, $u^1$. In this way we infer that the general boundary problem for spherical motion is well formulated. The program of thrust operation is determined from Eqs. (4.135) and (4.139):
$$w = (\mu/r^2)(e_r'' + e_r) = (\mu/r^2)\,[(1 - v^2)\, e_r + v' e_\varphi + p\, n] \tag{4.146}$$
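A small arithmetic check may help in reading the closing remark of this Section on constant-speed regimes. The sketch assumes that Eq. (4.145) reads $p'' + (v'/v)\,p' = (p/v^2)(1 - 2p^2 - 2v^4)$; for constant $v$ and $p \ne 0$ the left side vanishes, so the bracket on the right must be zero. Only this one relation is tested; Eq. (4.143) is not used, and the helper name is hypothetical.

```python
from fractions import Fraction

def bracket_4_145(v2, p2):
    """Bracket (1 - 2 p^2 - 2 v^4) from the assumed form of Eq. (4.145).

    Arguments are v^2 and p^2. A constant regime with p != 0 requires
    this bracket to vanish.
    """
    return 1 - 2 * p2 - 2 * v2 * v2

# Thrust along the normal at circular speed: v = 1, p = +1 or -1.
normal_thrust = bracket_4_145(Fraction(1), Fraction(1))

# Constant-speed regime quoted in the text: v^2 = 1/4, p^2 = 7/16.
constant_speed = bracket_4_145(Fraction(1, 4), Fraction(7, 16))

print(normal_thrust, constant_speed)
```

The first value is nonzero and the second is exactly zero, in agreement with the statement that $v = 1$, $p = \pm 1$ is not a partial solution while $v^2 = 1/4$, $p^2 = 7/16$ is compatible with Eq. (4.145).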
The control program when the thrust vector is directed along the normal to the instantaneous orbital plane is by no means optimal because the values $v = 1$ and $p = \pm 1$ do not present a partial solution to the system of equations (4.143) and (4.145). These equations could have been satisfied by setting the values of $v^2$ and $p^2$ equal to 1/4 and 7/16, respectively, but for the regime of constant speed ($v = 1/2$) one can choose $\tau_1$ so as to prescribe only one of the three Euler angles $\Omega^1$, $i^1$, $u^1$.

4.54 An Application of the Ritz Method
One could have derived the solution of the thrust-acceleration control problem under limited propulsive power without reducing it to the MayerBolza problem as it was done above. In fact, turning back to the basic equations (4.2), (4.3), (4.4), and (4.12), we can write
and, consequently, (4.148) Note that the magnitude of the acceleration w depends on two independent functions of time: N ( t )and q(t). We might, indeed, choose some program for
N(t) and by suitable alteration of the flow rate law q(t) obtain any functional dependence of w(t) prescribed from the outset. Thus, it becomes obvious that we have independent quantities under the sign of the integral in Eq. (4.128). From this point and taking Eq. (4.12) into account, we deduce the following inequality: (4.149) The problem of minimum fuel consumption (maximum final mass $m^1$) is reduced to the determination of the law r(t) of motion, this law leading to the minimum value of integral (4.149). When $t_1$ is fixed equal to $t_1^*$, this provides the "simplest problem" of the calculus of variations, and if $t_1$ is not prescribed we arrive at the problem with free boundary. If boundary values of $r^1$, $v^1$, $t_1$ are connected by the relation
0 = O,(rl, i') + h(t, - tl*)
=0
(4.150)
one must set the first variation of
R, =O,(rl, i') + h(tl - tl*) + equal t o zero. Note that the upper limit of the integral must also be varied ; we get
[
- (ii
+ f r)
* -
grad,, 8,
] - Ar)
r=tt
- [ ( H ) , ,- hl
= 6r'
+ i' 6t,,
A i l = hi'
+ P'
=0
(4.152)
where A denotes the total variations at the right end: Ar'
(4.151)
6tl
(4.153)
The left end is assumed fixed (6r0 = 0, 6i0 = 0); in Eq. (4.152) H means the left-hand side of Eq. (4.128), L(r) denotes the differential operator (4.125). The requirement that the first variation must vanish leads, of course, to differential equation (4.125) with boundary conditions equation (4.70). We may now use the Ritz method to minimize integral I in Eq. (4.149), and so determine the vector r(t). The latter is t o be found in the class of functions continuous with their derivatives up t o third order, and satisfying the complemented boundary conditions. Having chosen functions of this class containing some unknown constants C, , .. . , C,, we obtain the minimum problem for the function Z(C,, ... , C,) of these constants. This problem is
obviously reduced to a system of s finite equations for the same number of constants $C_k$. If "geometrical" boundary conditions are prescribed (the vectors $r^1$ and $\dot r^1$) and $t_1$ is fixed, then the choice of a suitable expression for r(t) is fairly easy; this case is quite difficult to solve in the equivalent Mayer-Bolza problem since the values of $\lambda^1$ and $\dot\lambda^1$ are all unknown. Inversely, the case of $\lambda^1 = 0$, $\dot\lambda^1 = 0$, treated as the simplest one in the Mayer-Bolza problem, corresponds to "statical" boundary conditions (in terms used in the problem of bending of plates), and under these conditions there arise difficulties in the construction of a suitable expression for r(t). From Eq. (4.152) it follows, however, that one need not trouble much about "statical" conditions since they are met automatically when $\delta r^1 = 0$, $\delta \dot r^1 = 0$ (and $\delta t_1 = 0$) and I attains its minimum value. It is, of course, obvious that one can hardly rely upon such a solution in practice. The reader may find a very detailed account of optimal problems for boosters of limited propulsive power in Grodzovskii et al.,16 who also presented an extensive bibliography.
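The Ritz procedure described in this Section can be sketched on a one-dimensional toy problem. The example below is an illustration only and is not the functional (4.149) itself: it minimizes $I = \int_0^1 (\ddot x + kx)^2\,dt$, a scalar stand-in for the integral of the squared thrust acceleration with a linear restoring term of hypothetical stiffness $k$, under the "geometrical" conditions $x(0) = \dot x(0) = 0$, $x(1) = 1$, $\dot x(1) = 0$. The trial function is $x = x_0 + \sum_j C_j \varphi_j$ with $x_0 = 3t^2 - 2t^3$ and $\varphi_j = t^{2+j}(1-t)^2$, all chosen here for convenience. Because $I$ is quadratic in the constants $C_j$, the minimum problem reduces to a linear system of $s$ equations, solved exactly with rational arithmetic.

```python
from fractions import Fraction

# Polynomials are coefficient lists, lowest power first.
def add(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]

def scale(a, c):
    return [c * x for x in a]

def mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def d2(a):
    """Second derivative of a polynomial."""
    return [Fraction(i * (i - 1)) * a[i] for i in range(2, len(a))] or [Fraction(0)]

def integral01(a):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(Fraction(c) / (i + 1) for i, c in enumerate(a))

def residual(p, k):
    """x'' + k*x, the toy analog of the thrust acceleration."""
    return add(d2(p), scale(p, k))

def solve(G, rhs):
    """Gauss-Jordan elimination with exact fractions."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(G)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ritz(s, k=Fraction(1)):
    """Return (C, I_min) for s Ritz constants."""
    x0 = [Fraction(0), Fraction(0), Fraction(3), Fraction(-2)]       # 3t^2 - 2t^3
    phis = [[Fraction(0)] * (2 + j) + [Fraction(1), Fraction(-2), Fraction(1)]
            for j in range(s)]                                        # t^(2+j) (1-t)^2
    r0 = residual(x0, k)
    rs = [residual(p, k) for p in phis]
    G = [[integral01(mul(ri, rj)) for rj in rs] for ri in rs]         # dI/dC_i = 0
    b = [-integral01(mul(ri, r0)) for ri in rs]
    C = solve(G, b) if s else []
    total = r0
    for c, r in zip(C, rs):
        total = add(total, scale(r, c))
    return C, integral01(mul(total, total))

for s in range(4):
    C, I = ritz(s)
    print(s, float(I))
```

Since the trial spaces are nested, the computed minimum cannot increase as $s$ grows. For $k = 0$ the cubic $x_0$ is already the exact extremal, so all constants vanish.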
4.6 Singular Control Regimes†

4.61 Statement of the Variational Problem
It was noted in Sec. 4.8 that for boosting devices of limited flow rate the possibility of “singular control” regimes cannot be excluded. In such regimes, the mass flow rate is determined by Eq. (4.37) under additional constraints expressed by inequalities (4.14). For a “singular control” program, Eq. (4.36) is satisfied; from this it follows that the derivative of the switching function is also equal to zero, and from Eq. (4.33) we infer that j. = 0, A = const. The differential equation (4.28) becomes, in view of Eq. (4.27), the equation for unit vector e of the thrust direction: (4.154) We might derive this result immediately from the variational problem for minimum fuel consumption. From Eq. (4.4) we find
-cq_ m
where
-c?
m
=
If +
$rl,
mo 1 In - = I = -{‘If m’ c o
+f
rI dr
(4.155)
(4.156)
† For a general discussion of singular control, see Chapter 3.
The problem is reduced (like that considered in Sec. 4.54) to the calculation of the first variation of the functional
R,
=
dl(rl, i') + h(t, - tl*)
+
1" 0
(4.157)
Note that the upper limit must also be free; we have 652,
= grad,,
O1 * Ar'
+ grad;,
8' * A i l
+ h S t , + ( r + f r)
- e1,,6t, (4.158)
where A denotes the total variation. The variation of the expression in square brackets is now transformed into
Now, taking Eq. (4.156) into account, we may present the integrand in the following form :
3
= (e.
Si)' - ( e * 6r)' + L(e, r) * 6r
In view of Eq. (4.156), after integration we obtain the following expression for the first variation:
6R,
= Jf'
L(e, r) 6r dt
0
+ (grad;,
+ (grad,,
0, - 6 ' ) * Ar'
+ e l ) - A i l + (h - H),, S t ,
(4.159)
where
H=-
r.e (C . i + -r" 3
(4.160)
This expression coincides with that for the previously introduced quantity $H_\lambda$ [see Eqs. (4.39)] for $\lambda = \text{const}$ and $\dot\lambda = 0$. Setting the first variation equal to zero, we arrive at the differential equation (4.154) and boundary conditions Eq. (4.70), where now we must set $\lambda = \text{const}$.
4.62 The Differential Equations The unknown function-the mass flow rate q(t)-is excluded from the problem formulation : this function is determined from Eq. (4.37) after the whole solution is found. It is required to determine the vector r and the unit vector e from Eq. (4.154) and the equation (4.161) resulting from the equation of motion after elimination of q(r). This system allows both vector and scalar integrals e x r - e x k = a = ao
(4.162) (4.163)
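From Eq. (4.57) we get
$$\dot e = \nu \times e, \qquad \ddot e = \dot\nu \times e - \nu^2 e \tag{4.164}$$
and substitution into Eq. (4.154) gives
$$\dot\nu \times e - \nu^2 e = \frac{\mu}{r^3}\,(3 e_r u_1 - e) \qquad (u_1 = e_r \cdot e)$$
This relation shows that
$$\nu^2 = \frac{\mu}{r^3}\,(1 - 3u_1^2), \qquad \dot\nu = \frac{3\mu}{r^3}\, u_1\, e \times e_r \tag{4.165}$$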
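The two relations of Eq. (4.165) follow in two lines. The derivation below is a sketch under stated assumptions: it takes the preceding relation in the form $\dot\nu \times e - \nu^2 e = (\mu/r^3)(3 e_r u_1 - e)$ with $u_1 = e_r \cdot e$, and assumes that the angular velocity $\nu$ of the unit vector $e$ is chosen orthogonal to $e$, so that $\dot\nu \cdot e = -\nu \cdot \dot e = 0$.

```latex
% Scalar product with e, using e.e = 1 and (\dot\nu \times e).e = 0:
-\nu^2 = \frac{\mu}{r^3}\,(3u_1^2 - 1)
\quad\Longrightarrow\quad
\nu^2 = \frac{\mu}{r^3}\,(1 - 3u_1^2) \ge 0 .

% Vector product of e with the same relation, using
% e \times (\dot\nu \times e) = \dot\nu - e\,(e \cdot \dot\nu) = \dot\nu:
\dot\nu = \frac{3\mu}{r^3}\, u_1\, e \times e_r .
```

Since $\mu/r^3 > 0$, the non-negativity of $\nu^2$ bounds the projection of the thrust direction on the radius, $u_1^2 \le 1/3$.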
From the first of these relations we deduce the inequality $u_1 \le 1/\sqrt{3}$ noted in the literature.17

4.62a Plane motion.18,19 In Eqs. (4.83) and (4.84) we set A/A = 0 and 912 = 0, and arrive at a pair of equations which may be transformed into --=-
r
'
r3
cos $ +
a"
- - = 2; cos $ r
$2
cos 1c/ - V"
(:
- sin
$ + $ cos $
)
(4.166) (4.167)
We have denoted by v" the component of the vector v along the normal n to the orbital plane, so that
v=v"n,
v"=J/+klr2=$+@
(4.168)
From Eq. (4.165) it follows that 82
=
4( I
r
- 3 cos2
3P . 5 = -sin $ cos II,
$1,
(4.169)
r3
Equations (4.166) and (4.167) together with the first Eq. (4.169) show that a"; = h
3P +C O S ~$ r2
(4.170) (4.171)
and differentiation of Eq. (4.170) yields, in view of the second Eq. (4.169), that i
a"
2 - cos2 (CI + 311, sin $ cosJ!+I = - sin $ r r
(4.172)
This equation must be solved together with Eq. (4.167). We have i--
'
r
= cos
2 sin $ (F 3 - 5cos2 $ r
1 [!(I $ (3 - 5 cos2 $1 r
+ 3v cos $)
-'3
COS'
(4.173)
$) - 4v cos3 $
1
(4.174)
Equation (4.171) determines r in terms of $ and of constants a", h. The other unknowns v, i and 11, are also expressed through the same quantities [see Eqs. (4.170), (4.173), and (4.174)]; after determining the values of v and I), we also find 9.It is easily seen that substitution of these values of i and 11, into the equation resulting from Eq. (4.170) after differentiation leads to an identity. In this calculation, we made use of Eq. (4.169) for 5 ; this could have been avoided since the initial system of Eqs. (4.166) and (4.167), together with the first of Eqs. (4.169), suffice for the determination of the three unknowns r, $ and k. The result may be presented explicitly'* if t , is not fixed so that h = 0. Then, in view of (4.169), we get P- -- - 1 - 3 cos2 r 9 cos6 $
and
a 1 - 3 cos2 $
rii = 3 y $ = - -
C O S ~$
*
1 - 3c0s2 $ 3 cos $ 3 - 5 cos2 ' a"
'
i = 2a" y@ =
(4.175)
* *
1 - 2 cos2 $ sin $ 3 - 5 cos2 cos2
(4.176)
1 - 3c0s2 $ (3 - 4 cos2 *) 3 C O S ~$ 3 - 5 C O S ~$ (4.177) a"
~
*
From Eq. (4.176) we get (4.178)
All unknown quantities are expressed parametrically through the angle I) between the thrust vector and the position vector. With the aid of Eqs. (4.175) and (4.177) we may also present time in terms of the same parameter: G3
-
* *
3 - 5 cos2 (30s’ - 3 cos2 * ) 2
ant= (1
d*
(4.179)
Integration introduces one more constant, C, . Turning now to Eq. (4.37), we infer [see also Eq. (4.163)] that
(4.180) This relation also determines the mass together with the flow rate q; one must also check the fulfillment of inequalities (4.14). The general solution depends on four constants $\tilde a$, $C_1$, $C_2$, and $m^0$. It is obvious from its structure that we may choose these constants so as to assign prescribed initial values to the mass, true anomaly $\varphi^0$, and only two of the three quantities $r^0$, $r^0\dot\varphi^0$, and $\dot r^0$; at the right side there remains the possibility of only one quantity to be prescribed, this prescription being achieved by suitable choice of $t_1$. If it is required to include an arc of singular control in the program, then at the switching point not only "coordinates" $r$, $\varphi$, $\dot r$, and $r\dot\varphi$ but also "controls" $\psi$ and $\dot\psi$ must be continuous as required by the Erdmann-Weierstrass conditions. Such continuous transition is possible only under fairly specialized boundary conditions.

REFERENCES

1. G. Leitmann, On a class of variational problems in rocket flight, J. Aerospace Sci. 26, 586-591 (1959).
2. A. Miele, General variational theory of flight paths of rocket-powered aircraft, missiles and satellite carriers, Astronaut. Acta 4, 264-288 (1958).
3. D. F. Lawden, Interplanetary rocket trajectories, Advan. Space Sci. Technol. 1, 1-53 (1959).
4. D. F. Lawden, Optimal programming of rocket thrust direction, Astronaut. Acta 1, 41-56 (1955).
5. G. Leitmann, Minimum transfer time for a power-limited rocket, J. Appl. Mech. 28, 1-8 (1961).
6. G. A. Bliss, "Lectures on the Calculus of Variations." Univ. of Chicago Press, Chicago, Illinois, 1946.
7. V. A. Troitskii, O variatsionnykh zadachakh optimizatsii protsessov upravleniia (On variational problems of control processes), Prikl. Mat. Mekh. 26, 29-38 (1962).
8. L. D. Berkovitz, Variational methods in problems of control and programming, J. Math. Anal. Appl. 3, 145-187 (1961).
9. E. V. Tarasov, "Optimal'nie rezhimi poliota letatel'nykh apparatov" ("Optimal Regimes of Flight"). Oborongiz, Moscow, 1963.
10. L. Dahlard, Application of Pontryagin's maximum principle in determining the control of a variable mass-vehicle, Progr. Astron. Rocketry 8, 21-29 (1962).
11. V. K. Isayev and V. V. Sonin, Ob odnoi nelineinoi zadache optimal'nogo upravleniia (On certain non-linear problems of optimal control), Avtomat. Telemekh. (Automat. Remote Control) 23, 1117-1129 (1962).
12. A. I. Lurie, Svobodnoe padenie material'noi tochki v kabine sputnika (Free fall of a material point inside of a satellite's cabin), Prikl. Mat. Mekh. 27, 3-9 (1963).
13. V. F. Illarionov and L. M. Shkadov, Povorot ploskosti krugovoi orbiti sputnika [Turning of the satellite's orbital plane (circular orbit)], Prikl. Mat. Mekh. 26, 15-21 (1962).
14. H. Lass and C. Solloway, Motion of a satellite under the influence of a constant normal thrust, ARS J. 32, 97-100 (1962).
15. Yu. P. Gus'kov, Metod upravleniia povorotom ploskosti krugovoi orbiti sputnika (Method of control over the satellite's orbital plane turning), Prikl. Mat. Mekh. 27, 578-582 (1963).
16. G. L. Grodzovskii, Yu. P. Ivanov, and V. V. Tokarev, Mekhanika kosmicheskogo poliota s maloi tiagoi (Mechanics of the low-thrust space flight), I: Inzh. Zh. 3, 590-616 (1963); II: ibid. 3, 748-768 (1963); III: ibid. 4, 168-195 (1964).
17. B. D. Fried, Trajectory optimization for powered flight in two or three dimensions, in "Space Technology," Chap. IV. Wiley, New York, 1959.
18. D. F. Lawden, Optimal powered arcs in an inverse square law field, ARS J. 31, 566-568 (1961).
19. D. F. Lawden, Optimal intermediate-thrust arcs in a gravitational field, Astronaut. Acta 8, 106-123 (1962).
20. W. G. Melbourne and C. G. Sauer, Jr., Optimum interplanetary rendezvous with power-limited vehicles, AIAA J. 1, 54-60 (1963).
5 The Mayer-Bolza Problem for Multiple Integrals: Some Optimum Problems for Elliptic Differential Equations Arising in Magnetohydrodynamics

K. A. LURIE
DEPARTMENT OF MATHEMATICAL PHYSICS, A. F. IOFFE PHYSICO-TECHNICAL INSTITUTE, ACADEMY OF SCIENCES OF THE USSR, LENINGRAD, USSR
5.0 Introduction
5.1 Optimum Problems for Partial Differential Equations: Necessary Conditions of Optimality
5.11 Formulation of the Problem
5.12 Admissible Controls and Possible Behavior of State Variables
5.13 Euler Equations and Natural Boundary Conditions
5.14 The Weierstrass Condition
5.2 Optimum Problems in the Theory of Magnetohydrodynamical Channel Flow
5.3 Application to the Theory of MHD Power Generation: Minimization of End Effects in an MHD Channel
5.31 Basic Equations and Statement of the Problem
5.32 Euler Equations and Boundary Conditions
5.33 The Weierstrass Condition
5.34 The Case of a Homogeneous Magnetic Field
5.35 The Case of the Magnetic Field Varying along the Channel
5.36 Some Limiting Cases
Appendix
References

5.0 Introduction
In recent years, considerable interest has arisen in optimum problems for objects whose characteristics are continuously distributed in space and time. Problems of that kind appear in various branches of mathematical physics, especially in continuum mechanics and electromagnetic theory; e.g., different problems of continuous heating in metallurgy, of drying processes,1,2 some problems in the theory of growth of crystals,3 in the theory of flight,4 and so on. In short, optimum problems naturally arise for any continuous system controlled from outside. The control itself may be either concentrated or distributed in space; the latter case seems to be most difficult in practice, but simultaneously most promising from the point of view of optimization. In what follows, there will be presented an example of distributed control over the conductivity of the working fluid in magnetohydrodynamical channel flow, and we shall be able to estimate the advantages in power generation gained from it. Control over the conductivity of gas flow seems to be quite possible even now; it is very likely that analogous controls would become possible in such "classical" fields as mechanics of structures; if so, we should be led to optimum problems in the mathematical theory of elasticity. It seems worthwhile to add that not any admissible but only optimal control need be realizable in practice. Moreover, in certain cases one may considerably simplify theoretical considerations by extending the class of admissible controls so that some physically unrealizable controls will be included as mathematically admissible. In a variety of practically important cases, the resulting optimal control will be realizable; if not, we may inquire into the possibilities of its approximate realization, and so on. These considerations will be illustrated in Sec. 5.3. Mathematically, the optimum problems in question may be described as variational problems of the Mayer-Bolza type with partial differential equations as side conditions. There may be distinguished a class of optimum problems with integral equations as side conditions. Methods of solution for such problems began to be investigated only very recently. Bellman and Osborn5 applied dynamic programming to the derivation of the well-known Hadamard formula for the variation of Green's function of the Laplace operator.
Butkovsky and Lerner6 formulated a very general optimum problem for systems with distributed parameters. Their work was followed by a series of publications by Butkov~ky’.~ in which there was demonstrated an analog of Pontryagin’s maximum principle for optimum problems with distributed parameters and with integral equations as side conditions. The case of constraints formulated by partial differential equations has also been examined by different writers. A. 1. Egorov’ has given the optimality conditions for processes described by quasilinear hyperbolic equations. Yu. V. Egorov’ has outlined analogous conditions for a class of equations in Banach space, this class including hyperbolic and parabolic equations. There is a vast class of optimum problems for hyperbolic equations which has been studied for a long time by different authors : namely, the determination of optimum forms of nozzles and bodies of revolution in hypersonic gas flow. In this field, there can be mentioned well-known papers by Guderley and Hantsch,’ Rao,” and S h m i g l e ~ s k y . ’ ~In * ’ ~these investigations, however, it has been possible (due to the special form of isoperimetric conditions: wth
given total length of a body or a nozzle) to reduce an optimum problem with two independent variables to a one-dimensional optimum problem. The general case was only recently considered by Guderley and Armitage.14 In their paper, the reduction to a one-dimensional problem is no longer possible because of a more general type of isoperimetric condition (area of the nozzle's surface given). Guderley and Armitage formulated only Euler equations and natural boundary conditions; the Weierstrass condition remained beyond the scope of their work. These ideas were quite recently exploited by Kraiko.15 The present author has considered a general problem of optimization with partial differential equations of any type given as side conditions.16 The investigation was carried out according to traditional methods of the calculus of variations, the necessary condition of Weierstrass having been considered equivalent to an analog of Pontryagin's maximum principle. In the following sections, a detailed outline of these considerations will be given, together with an illustration of general principles by means of an example from magnetohydrodynamics.

5.1 Optimum Problems for Partial Differential Equations: Necessary Conditions for Optimality
5.11 Formulation of the Problem Let S denote a closed domain in the xy plane with piecewise continuous boundaries C,, C, (Fig. I). In this domain, let us consider a system of partial differential equations
-
=i
=- X i ( Z , i,u ; x,y ) = 0 f3X aZi
The latter equations contain total derivatives with respect to all the arguments included. Vector functions $z = (z^1, \ldots, z^n)$, $\zeta = (\zeta^1, \ldots, \zeta^\nu)$ of the arguments x, y describe the mechanical system itself, while the functions $u = (u^1, \ldots, u^p)$ of the same arguments present the "distributed controls." The couple of vector functions $z$, $\zeta$ will be called the state of a system. Equations (5.1) present a standard form of any system of partial differential equations (so-called special case of the Pfaffian system).17 In other words, any
such system may be written in the form (5.1) (with the number of dependent variables increased if necessary). It should be mentioned that the presence of $\zeta$ variables is typical for the majority of applications. For example, the Helmholtz equation $z^1_{xx} + z^1_{yy} + u z^1 = 0$ is equivalent to the system
$$z^1_x = z^2, \quad z^1_y = z^3; \qquad z^2_x = -\zeta^2 - u z^1, \quad z^2_y = \zeta^1; \qquad z^3_x = \zeta^1, \quad z^3_y = \zeta^2$$
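The reduction of the Helmholtz equation to first-order form can be checked on a concrete field. The sketch below assumes the system reads $z^1_x = z^2$, $z^1_y = z^3$, $z^2_x = -\zeta^2 - u z^1$, $z^2_y = \zeta^1$, $z^3_x = \zeta^1$, $z^3_y = \zeta^2$, so that $\zeta^1 = z^1_{xy}$ and $\zeta^2 = z^1_{yy}$. The test field $z^1 = \sin(Ax)\cos(By)$ with constant control $u = A^2 + B^2$, the wave numbers, and the sample point are hypothetical choices made only for this illustration; derivatives are approximated by central differences.

```python
import math

A, B = 1.3, 0.7          # hypothetical wave numbers of the test field
U = A * A + B * B        # constant control u for which z1 solves the Helmholtz equation

def z1(x, y):    return math.sin(A * x) * math.cos(B * y)
def z2(x, y):    return A * math.cos(A * x) * math.cos(B * y)        # dz1/dx
def z3(x, y):    return -B * math.sin(A * x) * math.sin(B * y)       # dz1/dy
def zeta1(x, y): return -A * B * math.cos(A * x) * math.sin(B * y)   # d2z1/dxdy
def zeta2(x, y): return -B * B * math.sin(A * x) * math.cos(B * y)   # d2z1/dy2

def ddx(f, x, y, h=1e-5):
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def ddy(f, x, y, h=1e-5):
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def residuals(x, y):
    """Left side minus right side for each of the six first-order equations."""
    return [
        ddx(z1, x, y) - z2(x, y),
        ddy(z1, x, y) - z3(x, y),
        ddx(z2, x, y) - (-zeta2(x, y) - U * z1(x, y)),
        ddy(z2, x, y) - zeta1(x, y),
        ddx(z3, x, y) - zeta1(x, y),
        ddy(z3, x, y) - zeta2(x, y),
    ]

worst = max(abs(r) for r in residuals(0.4, -0.9))
print("largest residual:", worst)
```

The two equations containing $\zeta^1$ express the equality of the mixed derivatives of $z^1$, while the pair containing $\zeta^2$ carries the Helmholtz equation itself; this is the sense in which the auxiliary variables absorb the higher derivatives.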
The wave equation z:.~- (kz,'), = 0 is evidently identical to the system 2,' = - ( I l k , zY' = ['; Z, 2 = C2, zY2= -['
FIG.I . Closed domain in xy plane.
The form (5.1) corresponds to the problems in a sense more general than those given by differential equations of higher order. For example, the system z,I
= [I,
zy' =
-C2
+ u;
z, 2 = [=, zy2 =
['
is equivalent t o equations
Az'
=
du/dy,
Az2 = -&/dX
only if u is differentiable. Note that the latter pair of equations contains derivatives of the control and not the control itself. The class of optimum problems for higher-order equations depending on controls (and not on their derivatives) was examined by A. I . Egorov.2 Returning t o (5.l), let us formulate the restrictions imposed on the control functions. The first r l of these restrictions will be expressed by finite equalities
G J u ; x, y ) = 0,
k = I , ... , r1
(5.2)
the remaining r - r1 having the form of finite inequalities
G,(u;x,y)~O, Suppose that the first n, _< assumed known.
PI
k = r , + 1 , ..., r < p
(5.3)
functions z'are prescribed at C,, this boundary
So far we have zilrl = z,‘(t),
i = 1, ... ,n,
(5.4)
The number n, is determined by the conditions of any given problem. The outer curve C, is not assumed known a priori; it is only supposed that along it there are prescribed n2 I n ordinary differential equations of the form
These equations involve a set of functions VK = vK(t)
IC
=
1 , ... , 71
of the parameter t , which will be called boundary controls. The values of z k ( t = 0) are assumed known. The boundary controls are also connected by constraints expressed by equalities
k
gk(L’; t ) = 0
=
1, ... , p1
(5.6)
and inequalities The total number of these constraints is equal to p I 71. It is essential that there will be supposed t o exist a solution of Eqs. (5.1), (5.4), (5.5) under the restrictions (5.2), (5.3), (5.6), (5.7) imposed on the control functions. This requirement is satisfied for any properly formulated physical problem, and we may say that it produces a recipe for distinguishing between iand u variables in basic equations (5.1). In fact, the u variables are distinguished because they serve as immediate control variables. On the other hand, the i variables cannot be altered directly from outside, and generally there is no proper solution t o Eqs. (5.1), (5.4), (5.5) when ( variables are prescribed more o r less arbitrarily on S. The existence of such solutions for some u would be equivalent t o the nonuniqueness of the solution z,[ for the u in question. In what follows, we shall not consider such cases. The Mayer-Bolza problem is now formulated as follows. I n a suitable class of functions, determine the state variables zi, i j and controls uk, vK so as t o minimize the functional
+f
J = l / F ( z , i,u ; X, Y> dx d~ S
XI
.fi(z, t ) dt
+f
h ( z , 0; f)dt
z2
(5.8)
subject to side conditions (5.1)-(5.7). The functions X i , Y i ,F, f i , ,f2 are assumed differentiable with respect t o all their arguments. In the following section we shall explain the notion of “suitable class” of
functions because this point is essential to the very existence of the solution to the optimal problem.
5.12 Admissible Controls and Possible Behavior of State Variables

For further consideration, it is necessary to specify the class of admissible controls and possible behavior of state variables. The distributed controls $u$ will be assumed to belong to a class of functions no less wide than the class of piecewise continuous functions of two independent variables. Possible discontinuities of distributed controls may occur along smooth, closed, isolated curves $\Sigma_0$. In what follows, we shall suppose for the sake of simplicity that there is one such discontinuity along a curve $\Sigma_0$ lying entirely inside $S$, and that this curve may be continuously deformed to any of the boundary curves $\Sigma_1$ or $\Sigma_2$. The state variables $z^i$ are assumed continuous across the curve $\Sigma_0$; the variables $\zeta^j$ are in general discontinuous, but their values on both sides of $\Sigma_0$ are connected by the requirement that the tangential derivatives $dz^i/dt$ along $\Sigma_0$ be continuous, i.e.,

$$[X^i x_t + Y^i y_t]_-^+ = 0, \qquad i = 1, \ldots, n \tag{5.9}$$

Boundary controls $v$ also will be assumed to belong to a class of functions no less wide than that of the piecewise continuous functions of $t$. For simplicity, there will be hypothesized only one point $t_*$ of such discontinuity, this point being a point of continuity for $z^i(t)$ and a corner point of the curve $\Sigma_2$. The latter assumption is necessary for discontinuity of $dz^{i_k}/dt$ across $t_*$, because otherwise we should have to permit some line of discontinuity of $\partial z^{i_k}/\partial x$, $\partial z^{i_k}/\partial y$ starting from $t_*$ on the boundary $\Sigma_2$ and going inside $S$. Such a line of discontinuity can only be connected with a jump of the distributed control $u$ across it, but we have already assumed above that there is no such line intersecting the boundaries $\Sigma_1$ or $\Sigma_2$.†
5.13 Euler Equations and Natural Boundary Conditions

First we shall try to transform the restrictions (5.3) and (5.7) expressed by inequalities to others expressed by equalities. In the traditional way,¹⁸ we introduce (real) artificial controls $u_* = (u_*^{r_1+1}, \ldots, u_*^{r})$, $v_* = (v_*^{p_1+1}, \ldots, v_*^{p})$ by virtue of the following equations:

$$G_k^* = G_k(u; x, y) - (u_*^k)^2 = 0, \qquad k = r_1 + 1, \ldots, r \tag{5.10}$$

$$g_k^* = g_k(v; t) - (v_*^k)^2 = 0, \qquad k = p_1 + 1, \ldots, p \tag{5.11}$$
† Note added in proof: In hyperbolic problems, a corner point of the boundary may initiate some line of discontinuity of the first derivatives going inside $S$ without any jump of the distributed control across it (see Kraiko¹⁵).
these equations now taking the place of (5.3) and (5.7). Thus we have passed from a closed region of control variation to an open region, but for an increased number of control functions. To proceed, let us now introduce the Lagrange multipliers

$$\xi_i^\pm(x, y),\ \eta_i^\pm(x, y),\quad i = 1, \ldots, n; \qquad \Gamma_k^\pm(x, y),\quad k = 1, \ldots, r_1; \qquad \Gamma_k^{*\pm}(x, y),\quad k = r_1 + 1, \ldots, r;$$

$$\theta_{i_k}(t),\quad i_k = i_1, \ldots, i_{n_2}; \qquad \gamma_k(t),\quad k = 1, \ldots, p_1; \qquad \gamma_k^*(t),\quad k = p_1 + 1, \ldots, p$$
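The passage from inequality restrictions to equalities by means of real artificial controls can be illustrated numerically. The following is a minimal sketch, assuming an inequality of the form $G(u) \ge 0$ replaced by $G(u) - u_*^2 = 0$ as in (5.10); the particular function `G` (an interval constraint) and the helper name `artificial_control` are hypothetical and chosen only for illustration.

```python
import math

def G(u, u_min=0.0, u_max=1.0):
    # Hypothetical inequality constraint: G(u) >= 0 exactly when u_min <= u <= u_max.
    return (u_max - u) * (u - u_min)

def artificial_control(g_value):
    """Return a real u_* satisfying g_value - u_***2 == 0, or None if none exists."""
    if g_value < 0.0:
        return None  # the inequality is violated, so no real artificial control exists
    return math.sqrt(g_value)

# Interior points give u_* != 0, boundary points give u_* == 0,
# and inadmissible points admit no real u_* at all.
for u in (0.0, 0.5, 1.0, 1.5):
    print(u, artificial_control(G(u)))
```

The closed admissible set for `u` is thus traded for an unrestricted real variable `u_*`, at the price of one extra control function per inequality.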
With the aid of these multipliers, we construct a functional
$$\Pi = J + \iint_{S^+} (\xi^+ \Xi^+ + \eta^+ \mathrm{H}^+ + \Gamma^+ G^+ + \Gamma^{*+} G^{*+})\,dx\,dy + \iint_{S^-} (\xi^- \Xi^- + \eta^- \mathrm{H}^- + \Gamma^- G^- + \Gamma^{*-} G^{*-})\,dx\,dy \tag{5.12}$$

Here $S^+$ denotes a region bounded by the curves $\Sigma_1$ and $\Sigma_0$, and $S^-$ a region bounded by $\Sigma_0$ and $\Sigma_2$ (Fig. 1); the products of vector functions are interpreted as scalar products. The functional $\Pi$ is always equal to $J$, these two being simultaneously stationary. Let $L$, $l_1$, $l_2$ denote the Lagrangians

$$L = F + \xi\Xi + \eta\mathrm{H} + \Gamma G + \Gamma^* G^*, \qquad l_1 = f_1, \qquad l_2 = f_2 + \theta\Theta + \gamma g + \gamma^* g^* \tag{5.13}$$
The first variation of $\Pi$ is composed of double integrals, line integrals along $\Sigma_1$, $\Sigma_0$, $\Sigma_2$, and nonintegral terms. The double-integral part of the first variation is equal to
(5.14)
Consider the line integral $\oint_\Sigma f\,dt$, where $f$ denotes the limiting value on the curve $\Sigma$ of a function determined in the domain bounded by $\Sigma$ and continuously differentiable there up to the curve $\Sigma$. To take the variation of this integral, one should follow the rule

(5.15)

Here $\rho$ denotes the radius of curvature of $\Sigma$, and $\delta n$ the variation of the outer normal to this curve. In the expression for the first variation there appears an integral along $\Sigma_1$, namely,

(5.16)

The analogous integral taken along $\Sigma_0$ is

(5.17)

To construct an integral along $\Sigma_2$, it is necessary to take into account the discontinuity of boundary controls. This may actually be done at a corner point $t_*$ on the curve $\Sigma_2$. We have ($\delta_i^j$ denotes the Kronecker symbol)

(5.18)
The last term in this formula denotes a contribution to the first variation from the corner point $t_*$, and is equal to the difference between the expressions in brackets taken just before and after $t_*$ (the words “before” and “after” correspond to the positive direction determined on $\Sigma_2$). Along the line $\Sigma_0$ we have

(5.19)

Here $\Delta f$ denotes the total variation of $f$. The functions $z^i$ are supposed continuous across $\Sigma_0$, this being also true for their total variations. The functions $\zeta^j$ and $u^k$ are discontinuous across $\Sigma_0$. Thus we can write (5.17) in the following way:

(5.20)
At the point $t_*$ of discontinuity of the boundary controls on $\Sigma_2$ there applies the relation $\delta z^i = \Delta z^i - (\operatorname{grad} z^i \cdot \delta r)$. Here $\delta r$ denotes a variation of the position vector of the corner point. Taking into account the continuity of the total variations of $z^i$ at the corner point, we may present the last term in (5.18) in the following form:

$$[\theta_{i_k}(t_*)]_-^+\,\Delta z^{i_k}(t_*) - [\theta_{i_k}(t_*) \operatorname{grad} z^{i_k}]_-^+ \cdot \delta r \tag{5.21}$$

The first variation of $\Pi$ is composed by summing up expressions (5.14), (5.16), (5.18) and (5.20). Now we can follow the usual argument of the calculus of variations to derive necessary conditions of stationarity. We have: In the regions $S^\pm$,
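$$\frac{\partial \xi_i}{\partial x} + \frac{\partial \eta_i}{\partial y} - \frac{\partial L}{\partial z^i} = 0, \quad i = 1, \ldots, n; \qquad \frac{\partial L}{\partial \zeta^j} = 0, \quad j = 1, \ldots, \nu \tag{5.22}$$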
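Along the line $\Sigma_1$,

Along the line $\Sigma_2$,

$$\frac{\partial l_2}{\partial v^\kappa} = 0, \quad \kappa = 1, \ldots, \pi; \qquad \frac{\partial l_2}{\partial v_*^\kappa} \equiv -2\gamma_\kappa^* v_*^\kappa = 0, \quad \kappa = p_1 + 1, \ldots, p$$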
L-+-+ 12 - = 812 o Y2
dn
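At the point $t_*$ of discontinuity of boundary controls,

$$\theta_{i_k}^-(t_*) = \theta_{i_k}^+(t_*), \qquad \theta_{i_k}^-(t_*) \operatorname{grad} z^{i_k -}(t_*) = \theta_{i_k}^+(t_*) \operatorname{grad} z^{i_k +}(t_*), \qquad i_k = i_1, \ldots, i_{n_2} \tag{5.25}$$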
Along the line $\Sigma_0$ of discontinuity of distributed controls,
(5.26)
The last equation can be transformed with the aid of the Hadamard-Hugoniot theorem and the first of Eqs. (5.26); we may write

$$(L)_-^+ - \xi_i^+ (z_x^i)_-^+ - \eta_i^+ (z_y^i)_-^+ = 0 \tag{5.27}$$
The stationarity conditions (5.22)-(5.26) may be rewritten in a Hamiltonian form. To do this, we first introduce “impulses” $\partial L/\partial z_x^i$, $\partial L/\partial z_y^i$, and make certain that these are just the same as the Lagrange multipliers $\xi_i$, $\eta_i$. Let $H$ denote the “Hamiltonian”

$$H = \left[z_x^i L_{z_x^i} + z_y^i L_{z_y^i} - L\right]_{z_x^i = X^i,\ z_y^i = Y^i} = \xi_i X^i + \eta_i Y^i - F - \Gamma G - \Gamma^* G^* \tag{5.28}$$

The following relations are evident:

$$H_x = -L_x, \quad H_y = -L_y, \quad H_{u^k} = -L_{u^k}, \quad H_{u_*^k} = -L_{u_*^k}, \quad H_{z^i} = -L_{z^i}, \quad H_{\zeta^j} = -L_{\zeta^j}, \quad H_{\xi_i} = X^i, \quad H_{\eta_i} = Y^i \tag{5.29}$$
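Making use of these formulas, let us replace the first pair of Eqs. (5.1) and the first of Eqs. (5.22) with the following relations:

$$z_x^i = \frac{\partial H}{\partial \xi_i}, \qquad z_y^i = \frac{\partial H}{\partial \eta_i}, \qquad \frac{\partial \xi_i}{\partial x} + \frac{\partial \eta_i}{\partial y} = -\frac{\partial H}{\partial z^i} \tag{5.30}$$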
These equations have just the same form as the canonical equations of Volterra. The coincidence is as yet, however, merely formal because the $H$ function is by no means Hamiltonian until we eliminate the variables $\zeta^j$ and controls $u$ with the aid of the rest of Eqs. (5.22). This is the reason for the quotation marks in the definition of $H$ above [Eq. (5.28)]. The other equations may also be rewritten in a similar way. We have

$$\frac{\partial H}{\partial u_*^k} \equiv 2\Gamma_k^* u_*^k = 0, \qquad k = r_1 + 1, \ldots, r \tag{5.31}$$
Equation (5.27) can be represented in the following form:

$$(H)_-^+ = z_x^{i-} (\xi_i)_-^+ + z_y^{i-} (\eta_i)_-^+ \tag{5.32}$$

By the same argument we can see that the “impulses” $\partial l_2/\partial z_t^{i_k}$ calculated from the Lagrangian $l_2$ [Eq. (5.13)] coincide with the Lagrange multipliers $\theta_{i_k}$. We introduce the “Hamiltonian”

$$h = \left[z_t^{i_k} l_{2 z_t^{i_k}} - l_2\right]_{z_t^{i_k} = T^{i_k}} = \theta_{i_k} T^{i_k} - f_2 - \gamma g - \gamma^* g^* \tag{5.33}$$

and write down the “canonical equations”
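The control variables $v^\kappa$, $v_*^\kappa$ are to be eliminated from $h$ with the aid of relations

$$\frac{\partial h}{\partial v^\kappa} = 0, \quad \kappa = 1, \ldots, \pi; \qquad \frac{\partial h}{\partial v_*^\kappa} \equiv 2\gamma_\kappa^* v_*^\kappa = 0, \quad \kappa = p_1 + 1, \ldots, p \tag{5.35}$$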
5.14 The Weierstrass Condition
In this section, we present a detailed derivation of the necessary condition for a minimum. Its analog for an ordinary Mayer-Bolza problem is well known as the Weierstrass condition. First of all, we shall formulate the basic theorem. Let us denote by $E^{(1)}$ and $E^{(2)}$ the Weierstrass functions

$$E^{(1)} = L(z, Z_x, Z_y, \mathrm{Z}, U, U_*; \xi, \eta, \Gamma, \Gamma^*, x, y) - L(z, z_x, z_y, \zeta, u, u_*; \xi, \eta, \Gamma, \Gamma^*, x, y) \tag{5.36}$$

(5.37)

In these formulas $z$, $\zeta$, $u$ and $v$ correspond to the optimum values of state variables and controls, and $Z$, $\mathrm{Z}$, $U$, $V$ denote any set of admissible functions which satisfy the conditions formulated in Sec. 5.12. The Weierstrass conditions necessary for a strong relative minimum reduce to the following inequalities:

$$E^{(1)} \ge 0, \qquad E^{(2)} \ge 0 \tag{5.38}$$
The demonstration will be based on the assumption that the extremal surface $S$ bounded by closed curves $\Sigma_1$ and $\Sigma_2$ may be imbedded in a family of integral surfaces $S(b)$ bounded by the curves $\Sigma_1$ and $\Sigma_2(b)$, depending on $n_1 + n_2$ parameters $b_i$ ($i = 1, \ldots, n_1 + n_2$). In this family we define the functions $z^i(b; x, y)$,
i = I , ... , n ; u k ( b ;x , y ) ,
k
< j ( b ; x, y ) , j = 1, ..., p
=
I , ... , v
(5.39)
and along the boundary curves $\Sigma_1$ and $\Sigma_2(b)$ the functions

(5.40)
Both sets of functions are determined in such a way that they satisfy Eqs. (5.1)-(5.7) and convert into the functions corresponding to the extremal surface $S(\Sigma_1, \Sigma_2)$ when the parameters $b_i$ vanish. In what follows, it will be supposed that all restrictions imposed on the control variables are already expressed by finite equalities (cf. Sec. 5.13). Let $\Sigma'$ denote a smooth closed curve bounding the region $S'$ and lying entirely inside $S^+$ or $S^-$ so that the curves $\Sigma'$ and $\Sigma_0$ have no common points. Let us also choose some (noncorner) point $t'$ on the curve $\Sigma_2$ ($t' \ne t_*$).† Consider now a closed curve $\Sigma_e'$ drawn outside $\Sigma'$ parallel to it at the distance $e > 0$ and let $S_e' - S'$ denote a ring-shaped region bounded by both curves. The equations of $\Sigma'$ and $\Sigma_e'$ are as follows:

$$(\Sigma')\colon\quad x = x'(t), \qquad y = y'(t)$$

$$(\Sigma_e')\colon\quad x = x'(t) + e\cos nx, \qquad y = y'(t) + e\cos ny \tag{5.41}$$

Here $\cos nx$, $\cos ny$ denote direction cosines of the outer normal to $\Sigma'$. When $e \to 0$, the curves $\Sigma'$ and $\Sigma_e'$ coincide and the region $S_e' - S'$ vanishes. The part of $S(b)$ lying outside $\Sigma_e'$ will be denoted by $S_b - S_e'$. Consider the following sets of functions:
$$z_1^i(b, e; x, y),\ \zeta_1^j(b, e; x, y),\ u^k(b; x, y), \qquad (x, y) \in S'$$

$$Z^i(b, e; x, y),\ \mathrm{Z}^j(b, e; x, y),\ U^k(x, y), \qquad (x, y) \in S_e' - S'$$

$$z_3^i(b, e; x, y),\ \zeta_3^j(b, e; x, y),\ u^k(b; x, y), \qquad (x, y) \in S_b - S_e' \tag{5.42}$$
The first and the third of these families satisfy Eqs. (5.1) and (5.2), together with the boundary conditions (5.4); the second satisfies the same equations with $z^i$ replaced by $Z^i$, and so on. In the same way, we consider a segment $(t', t' + \varepsilon)$ on the curve $\Sigma_2(b)$ ($0 \le t \le t_1$) and introduce two families of functions
$$Z_2^i(b, e; t),\ \mathrm{Z}_2^j(b, e; t),\ u_2^k(b; t),\ V^\kappa(t), \qquad t' \le t \le t' + \varepsilon$$

$$z_2^i(b, e, \varepsilon; t),\ \zeta_2^j(b, e, \varepsilon; t),\ u_2^k(b; t),\ v^\kappa(b; t), \qquad 0 < t < t',\quad t' + \varepsilon < t < t_1 \tag{5.43}$$
These functions satisfy Eqs. (5.5), (5.6), and corresponding initial conditions (see Sec. 5.11). For the sake of brevity, we introduce the notation

† Here and henceforth we do not distinguish the notation used for surfaces and curves from that used for their projections onto the $xy$ plane.
The sets of functions constructed above are subjected to the following conditions: The sets (5.42) to
Iz, = z1'(b, e ; x,Y ) Ir,
ZYh, e ; x,Y > Z'(b, e ; x,Y ) grad ( Z i - zi)
Iz,
lX,
= z,'(b, c;
Ire,
x,Y ) 6,zl']lz, = n[(S,zi):],,
= n[6,z3' -
the last equation being valid in the limit e = 0. The sets (5.43) to Z2'(b,e ; t ) = z,'(b, e, E ; t ' ) Z,'(b, e ; t' + E ) = z,'(b, e, E ; t' + E ) Zk,(t') = z&(t') + S E Z 2 i ( f ' )
(5.45)
(5.46)
Now we consider the functional
f'+C
+[
I[Z2(b,e ; t ) , Z2r(b,e ; t ) , V r ) l dr
Making use of the stationarity conditions (5.22)-(5.26), we may write (5.48) and
Let us now take advantage of boundary conditions (5.4) and those prescribed for the $z^{i_k}$ functions at the point $t = 0$ on $\Sigma_2$. These conditions provide $n_1$ finite relations among the parameters $b_i$, and $n_2$ relations among $b_i$, $e$ and $\varepsilon$. Suppose now that it is possible to determine from these equations all the $b_i$ as functions of $e$ and $\varepsilon$, these functions vanishing when $e = \varepsilon = 0$. The total differential of $\Pi(b, e, \varepsilon)$ evaluated for $e = \varepsilon = 0$ on the extremal is now equal to

$$\left[\frac{\partial \Pi}{\partial b_i}\,db_i + \frac{\partial \Pi}{\partial e}\,de + \frac{\partial \Pi}{\partial \varepsilon}\,d\varepsilon\right]_{e = \varepsilon = 0}$$

This expression can be modified by virtue of (5.45), (5.46) and (5.48)-(5.50) to the form

$$d\Pi = \left[\oint_{\Sigma'} E^{(1)}\,dt\right] de + E^{(2)}(t')\,d\varepsilon \tag{5.51}$$

To make $\Pi$ (and $J$) minimum, it is necessary that $d\Pi \ge 0$. Note that only positive values of $e$, $\varepsilon$ (or $de$ and $d\varepsilon$) correspond to admissible surfaces (boundary curves). This is equivalent to the requirement that

$$\oint_{\Sigma'} E^{(1)}\,dt \ge 0, \qquad E^{(2)}(t') \ge 0$$
The arbitrariness of the curve $\Sigma'$ on the extremal surface and of the point $t'$ on its boundary leads us now to the Weierstrass conditions (5.38). These conditions can be rewritten in terms of the “Hamiltonians” $H$ and $h$ [see Eqs. (5.28) and (5.30)], the modified form being

$$H(z, \mathrm{Z}, U, U_*; \xi, \eta, \Gamma, \Gamma^*) \le H(z, \zeta, u, u_*; \xi, \eta, \Gamma, \Gamma^*), \qquad h(z, V, V_*; \theta, \gamma, \gamma^*) \le h(z, v, v_*; \theta, \gamma, \gamma^*) \tag{5.52}$$

We stress that the “admissible” variables $\mathrm{Z}$, $U$ included here are to be so chosen as to satisfy Eq. (5.9), which expresses the continuity of $z^i$ across any admissible line; in particular, for elliptic problems this means that the values $x_t$, $y_t$ are arbitrary except for the restriction $x_t^2 + y_t^2 = 1$. This formulation of the Weierstrass conditions provides an analog to the well-known maximum principle derived by Pontryagin for ordinary minimum problems. We may add that the “artificial controls” $u_*$ and $v_*$ do not actually enter into the expressions for $H$ and $h$ [Eqs. (5.28), (5.30)]. Lastly, inequalities (5.52) are also valid for corner lines (and points) due to considerations of continuity.

5.2 Optimum Problems in the Theory of Magnetohydrodynamical Channel Flow
The motion of a conducting fluid through magnetic fields is characterized by some peculiar effects studied by means of magnetohydrodynamics (mhd).
The most interesting and important applications of mhd are connected with the problem of power generation. In this problem, there is a variety of difficulties caused by technical restrictions currently in effect. It is necessary, for example, to maintain highly intensive magnetic fields in large volumes, to keep the conductivity of the working gas sufficiently high, to reduce the flow of heat to the walls of a channel, and so on. The variety of different demands (which are sometimes conflicting) attaches considerable importance to considerations of optimality of power generation design, and the problems concerned are of great significance for applications of mhd. Theoretical aspects of the power generation problem are based upon the investigations of the mhd channel flow. Corresponding calculations are often carried out according to the so-called one-dimensional approximation, in which only the dependence upon the longitudinal coordinate is considered. This approximation provides considerable information about certain obstacles to the intensification of the power conversion process, but of course it does not take into account two- and three-dimensional phenomena connected with the bending of the current lines inside the channel. This bending reduces the total current emitted from the electrodes and consequently presents an additional source of loss in the device. The reader is referred to a detailed survey2* of the most important losses in the channel of an mhd generator. Methods of optimization of mhd conversion regimes are largely determined by the nature of the losses to be decreased. There may be outlined a group of factors which can be considered as controls within the one-dimensional approximation, and suitable choice of these factors may well reduce the corresponding losses.
The factors in question are the distribution of the external magnetic field along the channel, the choice of the channel’s length and its transverse dimensions, and the selection of the initial data. One can indicate various criteria of optimization. Either the total current $I$ flowing from the electrodes to the outer load $R$ or the value of Joule heating $Q$ is often chosen as such a criterion. We can also introduce the effectivity $p$ defined as the ratio $I^2R/(I^2R + Q)$. Optimum regimes are characterized either by the maximum value of $I$ (or of $p$) or by the minimum value of $Q$. Some one-dimensional optimum problems in magnetohydrodynamics have already been examined in the literature. It should be noted that there is no principal difference between these problems and those arising in the theory of rocket flight. In both cases, optimum problems for ordinary differential equations are encountered. The theory of mhd power conversion presents, however, more complicated two-dimensional optimum problems connected with so-called end effects in an mhd channel. In such a channel, there is observed a sort of current loss caused by inversely directed currents induced at the ends of a zone occupied
by the external magnetic field. In other words, some part of the total current is branched out into the current loops inside those regions of the channel that are free of electromotive forces, instead of being turned into the external network. It is important to minimize losses of that kind, and such minimization can be effected in different ways. We may, for example, arrange either the optimum distribution of the conductivity of the working gas or the optimum distribution of the external magnetic field. Both factors are to be so chosen as to decrease end losses to the greatest possible extent. The same argument may be followed to formulate several more complicated problems. Of these, we shall outline the optimum problems with the Hall effect taken into account. Two-dimensional effects inside an mhd channel have already been considered in the literature. For a bibliography, the reader is referred to the references at the end of this chapter, but we shall outline the main points in what follows. In magnetohydrodynamics, the current density $\mathbf{j}$ depends on the electric field $\mathbf{E}$, magnetic field $\mathbf{H}$, and velocity $\mathbf{v}$ of the conducting fluid according to Ohm’s law. For many interesting applications, this law can be written in the following form:
(5.53)

Here $\sigma$ denotes the conductivity of the fluid and $c$ the speed of light. By virtue of the Maxwell equation

$$\operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t} \tag{5.54}$$

we may eliminate the electric field $\mathbf{E}$ from Eq. (5.53), the resulting equation taking the form

(5.55)

It can be readily verified that the second term in the right-hand side of this equation prevails over the first term if

$$\mathrm{Re}_m = \frac{VL}{\nu_m} \ll 1 \tag{5.56}$$

Here $L$ and $V$ denote typical linear dimension and speed of the fluid, respectively; $\nu_m = c^2/4\pi\sigma$ represents the so-called magnetic viscosity. The parameter $\mathrm{Re}_m$ is called the magnetic Reynolds number. It is clear from Eq. (5.55) that for small values of $\mathrm{Re}_m$, we can to a first approximation neglect
the dependence of H on the velocity v of the conducting fluid. In this approximation, only the external magnetic field should be taken into account. This situation is typical for power conversion conditions where we have, for example :
$$L = 10^2\ \text{cm}, \qquad V = 10^5\ \text{cm/sec}, \qquad \nu_m = 10^9\ \text{cm}^2/\text{sec}, \qquad \mathrm{Re}_m = 10^{-2}$$
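The estimate of the magnetic Reynolds number can be checked directly from its definition $\mathrm{Re}_m = VL/\nu_m$ with $\nu_m = c^2/4\pi\sigma$. The following is a minimal sketch; the function names and the rounded value of the speed of light are assumptions made for illustration, and Gaussian units are assumed throughout.

```python
import math

C_LIGHT = 2.998e10  # speed of light in cm/sec (rounded; Gaussian units assumed)

def magnetic_viscosity(sigma):
    """Magnetic viscosity nu_m = c^2 / (4 pi sigma), sigma in 1/sec (Gaussian)."""
    return C_LIGHT ** 2 / (4.0 * math.pi * sigma)

def magnetic_reynolds_number(V, L, nu_m):
    """Magnetic Reynolds number Re_m = V L / nu_m."""
    return V * L / nu_m

# Typical power-conversion values quoted in the text.
re_m = magnetic_reynolds_number(V=1e5, L=1e2, nu_m=1e9)
print(re_m)  # small compared with unity, so the induced field may be neglected
```

With these values the induced magnetic field is a correction of relative order $\mathrm{Re}_m$, which is why only the external field is retained.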
Some additional remarks concerning the velocity of the working fluid are to be made. In general, the velocity distribution is governed by the totality of the basic equations of mhd.²⁴ In certain circumstances, however, it is possible to separate the equations for the velocity from those for the magnetic field and electric currents. A detailed account of such possibilities can be found elsewhere. For our present aim, it suffices to restrict the analysis to those cases when the density of the Lorentz force is small compared with that of the inertial force. The ratio of the two is characterized by the so-called mhd parameter of interaction $N$.

Here $B_0$ and $\rho_0$ denote typical values of the magnetic field and gas density, respectively; $h$ represents a typical transverse dimension. We shall suppose that $N \ll 1$, this assumption corresponding to the weak interaction between the flow and the outer load. In fact, from this it follows that the influence of the outer load on the velocity distribution is negligible, and that one can treat this distribution as given from the outside. This is equivalent to the assumption that the supply of kinetic energy of the working gas dominates that part of it which is converted into electricity. Thus, the inequalities $\mathrm{Re}_m \ll 1$, $N \ll 1$ allow us to treat the distributions of magnetic induction $\mathbf{B} = \mathbf{H}$ and the velocity $\mathbf{v}$ of the working gas as given functions of the coordinates. The outcome not only considerably simplifies the fundamental equations, but also permits us to treat the functions $\mathbf{B}$ and $\mathbf{v}$ as control functions. Under the formulated assumptions, the mhd equations reduce to Eqs. (5.53) and (5.54) combined with the equation of continuity

$$\operatorname{div}\mathbf{j} = 0 \tag{5.57}$$

for the electric current density $\mathbf{j}$. Let us now suppose that there is only one component of the velocity $\mathbf{v}$, namely, $\mathbf{v} = V(y)\mathbf{i}_1$, and a single component of magnetic induction, $\mathbf{B} = -B(x)\mathbf{i}_3$. Having denoted by $\rho(x, y)$ the specific resistance of the working fluid at the point $(x, y)$, we may write the fundamental equations in the following form ($\partial/\partial t = 0$):

$$\operatorname{div}\mathbf{j} = 0, \qquad \mathbf{j} = \rho^{-1}\left(-\operatorname{grad} z^1 + \frac{1}{c}\,\mathbf{v} \times \mathbf{B}\right) \tag{5.58}$$
Here we have introduced the following notation: $\mathbf{E} = -\operatorname{grad} z^1$ and $\mathbf{j} = -\operatorname{curl}\mathbf{i}_3 z^2$, $z^1$ and $z^2$ being the electric potential and current function, respectively. The system (5.58) was examined by various authors for a variety of boundary conditions and under different assumptions about the functions $\rho = \rho(x)$, $V(y)$, $B(x)$, these functions in all cases being given. For optimum problems corresponding to Eqs. (5.58) it is typical that the functions $\rho$, $V$, and $B$ are treated as control functions. There is a variety of ways to obtain distributed control of this kind. We can alter the magnetic field distribution by a suitable choice of currents in the exciting magnet networks. To obtain control over the conductivity distribution, we may use numerous methods of ionization, especially those connected with seeding the vapors of certain alkali metals such as cesium or potassium into the working gas. On the other hand, we may considerably reduce the conductivity of the hot gas by seeding some quantities of electronegative gases into it (for instance, sulfur hexafluoride or water vapor).²⁶ It is of great importance that there always exist some well-determined intervals for possible values of the control functions. The lower and upper limits of such intervals are determined by the technology now available, and may be either constant or varying with the coordinates. The closed character of an admissible region for the values of control functions appears to be of special importance for those optimum problems which depend linearly on the control functions. An absence of boundaries for the values of the control functions is equivalent in such cases to an absence of optimum control itself. One may observe from Eqs. (5.58) that there is a considerable mathematical difference between two possible cases of optimization. On the one hand, we may treat $\rho(x, y)$ as fixed, and $V(y)$ and $B(x)$ as variable controls to be determined. Inversely, we may fix the functions $V(y)$ and $B(x)$, and try to prescribe an optimum control $\rho(x, y)$. In problems of the first type, the controls enter into the free term of the basic equations (5.58). This is why such problems can, in principle, be reduced to the simplest variational problems, this reduction being always possible if Green's function is assumed known. Only if it is difficult to construct Green's function may it be practically impossible to realize such a reduction. On the contrary, problems of the second type cannot be reduced to the simplest problems, for we cannot construct Green's function so long as the optimal control $\rho(x, y)$ is unknown. In what follows, we shall examine problems of the second type: the functions $V(y)$ and $B(x)$ will be assumed fixed, while $\rho(x, y)$ will represent an unknown optimum control to be determined together with the state variables.
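The remark that problems linear in the control have no optimum unless the admissible values are bounded can be seen in a one-variable caricature. The sketch below assumes a hypothetical objective $c\,u$ with a scalar control $u$; the function names are illustrative only and are not taken from the text.

```python
def maximize_linear(c, u_min, u_max):
    """Maximize c*u over the closed interval [u_min, u_max].

    Returns the maximizing control, or None when c == 0 and every
    admissible control gives the same value.
    """
    if c > 0:
        return u_max
    if c < 0:
        return u_min
    return None

def objective_along(c, trial_controls):
    """Values of c*u along a sequence of trial controls (no bounds imposed)."""
    return [c * u for u in trial_controls]

# With bounds, the optimum sits on the boundary of the admissible interval.
print(maximize_linear(2.0, 0.1, 5.0))
# Without bounds, the objective keeps growing and no optimum control exists.
print(objective_along(2.0, [1.0, 10.0, 100.0, 1000.0]))
```

The optimum, when it exists, lies on a limit of the admissible interval, which is the elementary counterpart of a control taking only its limiting values.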
5.3 Application to the Theory of MHD Power Generation: Minimization of End Effects in an MHD Channel²⁷
5.31 Basic Equations and Statement of the Problem

Consider the rectilinear motion ($\mathbf{v} = [V(y), 0, 0]$) of a conducting fluid along a plane channel of width $2\delta$. Let the specific resistance $\rho(x, y)$ of the fluid be restricted by (constant) limits $\rho_{\min}$ and $\rho_{\max}$. The walls of the channel will be assumed insulating everywhere except for two sections of equal length $2\lambda$ occupied by ideally conducting electrodes located opposite each other on different walls (Fig. 2). The electrodes are connected through the outer load $R$.
FIG. 2. Scheme of mhd-conversion device.
As soon as the transverse magnetic field $\mathbf{B} = -\mathbf{i}_3 B(x)$ is imposed on the moving fluid, an electric current of density $\mathbf{j}(\zeta^1, \zeta^2)$ is induced inside the channel, and through the outer load there flows a total current equal to

(5.59)

Provided the magnetic Reynolds number $\mathrm{Re}_m$ is small compared with unity, we may neglect the induced magnetic field as compared with the external field; if, moreover, the mhd parameter of interaction $N$ is also small, it is possible to neglect the Lorentz force in the dynamic equation so that the velocity distribution will be considered as prescribed by the purely hydrodynamical problem of rectilinear motion in a channel. These two assumptions simplify the basic mhd equations²⁴ to the form (5.58). Having introduced the notation (see Sec. 5.2)

$$\mathbf{j} = -\operatorname{curl}\mathbf{i}_3 z^2, \qquad j_x = \zeta^1, \qquad j_y = \zeta^2 \tag{5.60}$$
we may present the system (5.58) and Eq. (5.59) in the following standard form:

(5.61)

$$I = z^2(\lambda, \pm\delta) - z^2(-\lambda, \pm\delta)$$
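The expression of the total current as a difference of the current function at the electrode ends follows from $\mathbf{j} = -\operatorname{curl}\mathbf{i}_3 z^2$, which gives $j_x = -\partial z^2/\partial y$ and $j_y = \partial z^2/\partial x$, so that the current crossing the electrode $-\lambda \le x \le \lambda$ is $\int j_y\,dx$. The sketch below checks this numerically. The sample current function `z2_sample` is purely hypothetical, and the orientation (current counted along the $y$ direction) is an assumption of the sketch.

```python
import math

def current_density(z2, x, y, h=1e-6):
    """j = -curl(i_3 z2): j_x = -dz2/dy, j_y = dz2/dx (central differences)."""
    jx = -(z2(x, y + h) - z2(x, y - h)) / (2.0 * h)
    jy = (z2(x + h, y) - z2(x - h, y)) / (2.0 * h)
    return jx, jy

def total_current(z2, lam, delta, n=2000):
    """Integrate j_y over the electrode -lam <= x <= lam on the wall y = delta
    using the midpoint rule."""
    dx = 2.0 * lam / n
    return sum(current_density(z2, -lam + (i + 0.5) * dx, delta)[1]
               for i in range(n)) * dx

def z2_sample(x, y):
    # Hypothetical smooth current function, used only to exercise the identity.
    return math.atan(x) * (1.0 + 0.1 * y)

lam, delta = 1.0, 1.0
I_integral = total_current(z2_sample, lam, delta)
I_endpoints = z2_sample(lam, delta) - z2_sample(-lam, delta)
print(I_integral, I_endpoints)  # the two values agree to discretization error
```

Because the current is obtained from a current function, the integral depends only on the values of $z^2$ at the ends of the electrode, whatever the distribution of $j_y$ in between.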
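According to the preceding discussion, we subject the function $\rho(x, y)$ to inequalities

$$\rho_{\min} \le \rho(x, y) \le \rho_{\max} \tag{5.62}$$

the limits $\rho_{\min}$ and $\rho_{\max}$ being known and constant. The upper limit corresponds to the resistance of the fluid when all external ionization factors are withdrawn; the lower limit characterizes the maximum number of ionization possibilities. It seems relevant to write down the second-order equations for $z^1$, $z^2$ following from (5.61) on elimination of the $\zeta$ variables. These equations are

$$\frac{\partial}{\partial x}\frac{1}{\rho}\frac{\partial z^1}{\partial x} + \frac{\partial}{\partial y}\frac{1}{\rho}\frac{\partial z^1}{\partial y} = \frac{1}{c}\frac{\partial}{\partial y}\frac{VB}{\rho}, \qquad \frac{\partial}{\partial x}\rho\frac{\partial z^2}{\partial x} + \frac{\partial}{\partial y}\rho\frac{\partial z^2}{\partial y} = \frac{1}{c}\frac{\partial}{\partial x}VB \tag{5.63}$$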
As to the complementary boundary conditions, we shall examine the situation when the walls of the channel are as described above. The boundary conditions will be discussed in Secs. 5.34 and 5.35. At infinity, the components $\zeta^1$, $\zeta^2$ of the current density will be assumed to vanish. The basic problem is to choose the optimum control function $\rho(x, y)$ in a class of functions of two independent variables including all piecewise continuous functions satisfying inequalities (5.62) in such a way as to furnish the maximum value of the functional $I$ [Eq. (5.59)].

5.32 Euler Equations and Boundary Conditions

Following the technique outlined in Sec. 5.13, we construct the $H$ function corresponding to the basic system (5.61):
(5.64)
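Here $\rho_*$ denotes an artificial control variable defined by the equation

$$(\rho_{\max} - \rho)(\rho - \rho_{\min}) - \rho_*^2 = 0 \tag{5.65}$$

and $\xi_i$, $\eta_i$ ($i = 1, 2$) represent the Lagrange multipliers. These multipliers satisfy the Euler equations [cf. Eq. (5.22)]

$$\frac{\partial\xi_1}{\partial x} + \frac{\partial\eta_1}{\partial y} = 0, \qquad \frac{\partial\xi_2}{\partial x} + \frac{\partial\eta_2}{\partial y} = 0$$

$$\rho\xi_1 + \eta_2 = 0, \qquad \rho\eta_1 - \xi_2 = 0$$

$$\zeta^1\xi_1 + \zeta^2\eta_1 - \Gamma^*(2\rho - \rho_{\max} - \rho_{\min}) = 0, \qquad \Gamma^*\rho_* = 0 \tag{5.66}$$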
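The first two pairs of these equations can be modified to an equivalent form with the aid of the “flow functions” $\omega_1(x, y)$, $\omega_2(x, y)$ defined as follows:

$$\xi_1 = -\partial\omega_1/\partial y, \qquad \eta_1 = \partial\omega_1/\partial x, \qquad \xi_2 = -\partial\omega_2/\partial y, \qquad \eta_2 = \partial\omega_2/\partial x \tag{5.67}$$

We observe that the first pair of Eqs. (5.66) is identically satisfied, the second pair now being rewritten in the following way:

$$\rho\,\partial\omega_1/\partial y = \partial\omega_2/\partial x, \qquad \rho\,\partial\omega_1/\partial x = -\partial\omega_2/\partial y \tag{5.68}$$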
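On rearranging these equations, we get

$$\frac{\partial}{\partial x}\rho\frac{\partial\omega_1}{\partial x} + \frac{\partial}{\partial y}\rho\frac{\partial\omega_1}{\partial y} = 0, \qquad \frac{\partial}{\partial x}\frac{1}{\rho}\frac{\partial\omega_2}{\partial x} + \frac{\partial}{\partial y}\frac{1}{\rho}\frac{\partial\omega_2}{\partial y} = 0 \tag{5.69}$$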
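The rearrangement is a cross-differentiation. As a sketch, assume the flow functions are twice continuously differentiable in a region where $\rho$ is smooth, and start from the two relations $\rho\,\partial\omega_1/\partial y = \partial\omega_2/\partial x$ and $\rho\,\partial\omega_1/\partial x = -\partial\omega_2/\partial y$. Substituting them under the outer derivatives gives

```latex
\begin{aligned}
\frac{\partial}{\partial x}\Bigl(\rho\,\frac{\partial\omega_1}{\partial x}\Bigr)
+ \frac{\partial}{\partial y}\Bigl(\rho\,\frac{\partial\omega_1}{\partial y}\Bigr)
&= -\frac{\partial^2\omega_2}{\partial x\,\partial y}
   + \frac{\partial^2\omega_2}{\partial y\,\partial x} = 0, \\
\frac{\partial}{\partial x}\Bigl(\frac{1}{\rho}\,\frac{\partial\omega_2}{\partial x}\Bigr)
+ \frac{\partial}{\partial y}\Bigl(\frac{1}{\rho}\,\frac{\partial\omega_2}{\partial y}\Bigr)
&= \frac{\partial^2\omega_1}{\partial x\,\partial y}
   - \frac{\partial^2\omega_1}{\partial y\,\partial x} = 0 .
\end{aligned}
```

Each flow function therefore satisfies a second-order equation of divergence form, one with coefficient $\rho$ and the other with coefficient $1/\rho$.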
Boundary conditions for the Lagrange multipliers $\xi_i$, $\eta_i$ (or, equivalently, for $\omega_1$, $\omega_2$) are defined by initial boundary conditions for the variables $z^1$, $z^2$ together with the functional $I$ to be maximized. We put off the formulation of these conditions to Secs. 5.34 and 5.35. The last pair of Eqs. (5.66) show that two types of solutions are possible: the solutions characterized by the relations $\rho_* = 0$, $\Gamma^* \ne 0$, and those described by $\rho_* \ne 0$, $\Gamma^* = 0$. For the first type of these solutions, it is seen from Eq. (5.65) that the control function $\rho(x, y)$ can take on only the limiting values $\rho_{\min}$ or $\rho_{\max}$, these values being optimum controls. But the Euler equations themselves provide no information about conditions necessary for realization of any regime of control $\rho(x, y)$ (sign of $\Gamma^*$). For solutions of the second type, the control $\rho(x, y)$ does not generally take on limiting values. Moreover, restriction (5.62) on the control function is not
taken into account at all in these solutions. The related situation is quite analogous to the case of “singular controls” appearing in ordinary control problems²⁸† and determined by the requirement that the penultimate Eq. (5.66), where $\Gamma^* = 0$, is satisfied inside some two-dimensional domain. The necessary condition of Weierstrass discussed in the following section yields the criteria for realizing any control regime presented by the Euler equations.

5.33 The Weierstrass Condition

The Weierstrass condition requires that the Weierstrass function

$$E = H(\zeta^1, \zeta^2, \rho; x, y) - H(Z^1, Z^2, P; x, y) = -\xi_1(\rho\zeta^1 - PZ^1) + \xi_2(\zeta^2 - Z^2) - \eta_1(\rho\zeta^2 - PZ^2) - \eta_2(\zeta^1 - Z^1) \tag{5.70}$$

be nonnegative for any admissible set of functions $Z^1$, $Z^2$ and $P$, admissible functions being those satisfying the conditions of the original problem (cf. Sec. 5.31) and connected with the optimum values of $\zeta^1$, $\zeta^2$ and $\rho$ by the relations

$$(\rho\zeta^1 - PZ^1)x_t + (\rho\zeta^2 - PZ^2)y_t = 0, \qquad (\zeta^2 - Z^2)x_t - (\zeta^1 - Z^1)y_t = 0 \tag{5.71}$$

Here $x_t$, $y_t$ denote any set of real numbers satisfying the relation $x_t^2 + y_t^2 = 1$. Equations (5.71) are equivalent to the conditions of continuity of the electric potential $z^1$ and of the normal component of the electric current (continuity of $z^2$) across any possible line of discontinuity of the control $\rho(x, y)$. It is necessary to note that both functions $\rho$ and $P$ satisfy inequalities (5.62). Now we can eliminate variables $Z^1$ and $Z^2$ from the Weierstrass function (5.70) by virtue of Eqs. (5.71). We get

(5.72)

† See also Chaps. 3 and 4.
It is easy to verify that the Weierstrass condition $E \ge 0$ can now be represented as follows:

(5.73)

Here $\mathbf{n}(y_t, -x_t)$ denotes the unit normal to the line of possible discontinuity of the control $\rho(x, y)$. We must now require that inequality (5.73) be satisfied for any direction $\mathbf{n}$ of this normal. It is obvious from the structure of the left-hand side of Eq. (5.73) that two different cases are possible, namely,

$$S = \frac{\rho - P}{\rho} > 0, \qquad \Delta = S j_n \frac{\partial\omega_2}{\partial n} - \mathbf{j} \cdot \operatorname{grad}\omega_2 \le 0$$

FIG. 3. Descriptive scheme for Sec. 5.33.

Consider case (1). It is seen from the inequality $S > 0$ and restricting relations (5.62) that $\rho = \rho_{\max}$. When $\rho = \rho_{\max}$, the parameter $S$ may vary within the limits

$$0 \le S \le 1 - \rho_{\min}/\rho_{\max} = S_{\max} \le 1 \tag{5.74}$$

Suppose that $\mathbf{j} \cdot \operatorname{grad}\omega_2 < 0$. It is clear that the inequality $\Delta \le 0$ cannot then be satisfied, for we can always choose a direction $\mathbf{n}$ for which the term $S j_n(\partial\omega_2/\partial n)$ is equal to zero. Thus, there remains the case when $\mathbf{j} \cdot \operatorname{grad}\omega_2 > 0$. If now the direction $\mathbf{n}$ is disposed inside the hatched sectors (Fig. 3) bounded by the straight lines perpendicular to the directions $\mathbf{j}$ and $\operatorname{grad}\omega_2$, respectively, then $j_n\,\partial\omega_2/\partial n < 0$ and the inequality $\Delta \le 0$ is satisfied. If, on the other hand, the $\mathbf{n}$ direction is disposed outside of the above-mentioned sectors, then we have to calculate the maximum value of the function (Fig. 3)

$$f(\varphi, \psi) = S_{\max}\, j_n \frac{\partial\omega_2}{\partial n} = S_{\max}\, j\, |\operatorname{grad}\omega_2| \cos\varphi\cos\psi$$
170
K. A. LURIE
under the additional conditions $\psi = \chi + \varphi$, $\chi = \text{const}$, and to require that the corresponding value of $A$ be nonpositive. The function $f(\varphi, \chi + \varphi)$ of the variable $\varphi$ is readily seen to be maximum when $\varphi = -\chi/2$, that is, for the direction bisecting the acute angle $\chi$. For this direction
$$f_{\max} = f(-\chi/2, \chi/2) = S_{\max}\, j\, |\operatorname{grad}\omega_2| \cos^2(\chi/2)$$
The corresponding value of $A$ is equal to
$$A_{\max} = j\, |\operatorname{grad}\omega_2|\, [S_{\max}\cos^2(\chi/2) - \cos\chi]$$
According to the preceding discussion, we must have
$$S_{\max}\cos^2(\chi/2) - \cos\chi \le 0$$
or
$$\chi \le \arccos p \tag{5.75}$$
where the parameter $p$ is determined by the relation
$$p = \frac{\rho_{\max} - \rho_{\min}}{\rho_{\max} + \rho_{\min}} \tag{5.76}$$
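The passage from the inequality on $A_{\max}$ to (5.75) and (5.76) takes only a few lines of trigonometry. The following worked steps are a sketch using only the quantities defined above ($S_{\max} = 1 - \rho_{\min}/\rho_{\max}$ and the angles $\varphi$, $\psi = \chi + \varphi$); no new assumptions are introduced.

```latex
% Maximum of cos(phi) cos(chi + phi) over phi:
\cos\varphi\,\cos(\chi+\varphi)
  = \tfrac12\bigl[\cos\chi + \cos(\chi + 2\varphi)\bigr]
  \le \tfrac12\,(1 + \cos\chi) = \cos^2(\chi/2),
\qquad \text{with equality at } \varphi = -\chi/2 .

% Condition A_max <= 0:
S_{\max}\,\tfrac12\,(1+\cos\chi) \le \cos\chi
\;\Longleftrightarrow\;
\cos\chi \ge \frac{S_{\max}}{2 - S_{\max}} .

% Substituting S_max = 1 - rho_min / rho_max:
\frac{S_{\max}}{2 - S_{\max}}
  = \frac{1 - \rho_{\min}/\rho_{\max}}{1 + \rho_{\min}/\rho_{\max}}
  = \frac{\rho_{\max} - \rho_{\min}}{\rho_{\max} + \rho_{\min}} = p ,
\qquad\text{hence}\qquad \chi \le \arccos p .
```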
Now let us take into account that $\mathbf{j} \cdot \operatorname{grad}\omega_2 > 0$, whence it follows that $|\chi| \le \pi/2$. We observe that inequality (5.75) presents an upper limit for the value of the acute angle $\chi$ between the vectors $\mathbf{j}$ and $\operatorname{grad}\omega_2$. The value of the limit depends on the parameter $p$, being equal to $\pi/2$ when $p = 0$ and to zero when $p = 1$. By analogous argument, we observe that in the second case outlined above the inequalities
$$\mathbf{j} \cdot \operatorname{grad}\omega_2 < 0, \qquad \chi \ge \pi - \arccos p \tag{5.77}$$
are necessary for optimality. The second inequality of (5.77) presents a lower limit for the obtuse angle $\chi$ between the vectors $\mathbf{j}$ and $\operatorname{grad}\omega_2$. This limit depends on the value of $p$ and is equal to $\pi/2$ when $p = 0$, and to $\pi$ when $p = 1$. It must be added that the condition $A = 0$ can be satisfied only along some separate curves, not inside any two-dimensional regions, the latter case being impossible since $A$ contains the noninvariant term $S j_n\,\partial\omega_2/\partial n$. This remark shows that there can be no regime of "singular controls" in the two-dimensional problem considered.28 We summarize the results of this section in the following form:
Theorem. The functional [Eq. (5.59)] can achieve its maximum value under restrictions (5.61) and (5.62) only for the following values of the control function $\rho(x, y)$:

(1) $\rho = \rho_{\max}$, $\quad \mathbf{j} \cdot \operatorname{grad}\omega_2 > 0$, $\quad \chi \le \arccos p$
(2) $\rho = \rho_{\min}$, $\quad \mathbf{j} \cdot \operatorname{grad}\omega_2 < 0$, $\quad \chi \ge \pi - \arccos p$ (5.78)

where the parameter $p$ is defined by Eq. (5.76).
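The two regimes of the theorem can be checked pointwise once the vectors $\mathbf{j}$ and $\operatorname{grad}\omega_2$ are known. The Python sketch below is illustrative only: the function name `weierstrass_regime`, the representation of vectors as 2-tuples, and the string labels are assumptions made for this example and do not appear in the text.

```python
import math

def weierstrass_regime(j, grad_w2, rho_min, rho_max):
    """Return the regime admitted by conditions (5.78) at one point.

    j, grad_w2 : 2-tuples (hypothetical representation of the two vectors)
    Returns "rho_max", "rho_min", or None if neither condition holds.
    """
    p = (rho_max - rho_min) / (rho_max + rho_min)      # Eq. (5.76)
    dot = j[0] * grad_w2[0] + j[1] * grad_w2[1]
    norm = math.hypot(*j) * math.hypot(*grad_w2)
    if norm == 0.0:
        return None                                    # angle undefined
    # chi is the angle between j and grad(omega_2), clamped against rounding
    chi = math.acos(max(-1.0, min(1.0, dot / norm)))
    if dot > 0 and chi <= math.acos(p):
        return "rho_max"                               # case (1)
    if dot < 0 and chi >= math.pi - math.acos(p):
        return "rho_min"                               # case (2)
    return None

# With rho_max = 3 rho_min we have p = 0.5, so arccos p = pi/3.
aligned = weierstrass_regime((1.0, 0.0), (1.0, 0.0), 1.0, 3.0)
opposed = weierstrass_regime((1.0, 0.0), (-1.0, 0.0), 1.0, 3.0)
oblique = weierstrass_regime((1.0, 0.0), (1.0, 2.0), 1.0, 3.0)
```

In the last call the angle between the vectors exceeds $\arccos p$, so neither regime satisfies the necessary condition at that point.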
It is worthwhile to make some additional observations regarding this theorem. In our preceding considerations concerning the Weierstrass condition, we have never utilized the functional $I$ itself. The statement of the theorem is therefore true for any functional depending only on the boundary values of the state variables. The case of the functional $(-I)$ to be minimized allows, however, an immediate interpretation of the Weierstrass condition. This interpretation proceeds from certain intuitive considerations, not rigorous but quite instructive in themselves. More precisely, we may in a sense approximate the continuous medium (the conducting fluid flowing through the external magnetic field) by an arbitrarily complicated linear network containing concentrated resistances and perhaps electromotive forces, the latter corresponding to a zone occupied by the magnetic field. We may distinguish some (arbitrary) resistance as the outer load $R$, and try to calculate the total current $I$ through this resistance. It is readily observed from Kirchhoff's laws that the current $I$ flowing through any resistance $R$ depends on any other resistance $\rho$ of the network according to the relation
$$I = \frac{a\rho + b}{c\rho + d} \tag{5.79}$$
where the coefficients $a$, $b$, $c$, and $d$ depend linearly on $R$ and all other resistances of the network, and $a$, $b$ also contain linearly the included electromotive forces. If we now admit the resistance $\rho$ to vary within the limits $(\rho_{\min}, \rho_{\max})$, then we can readily observe from Eq. (5.79) that only limiting values of $\rho$ can be optimum, for the right-hand side of Eq. (5.79) is a monotonic function of $\rho$. We have now arrived at the statement of the theorem, but of course without any criteria for distinction between the two regimes. It can be added that by quite the same argument we can get analogous results for the electromotive forces [that is, for the $VB/c$ term in the basic system (5.61), considered as the control term], but this investigation lies beyond our present discussion.
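The monotonicity argument can be illustrated numerically. The derivative of the right-hand side of Eq. (5.79) with respect to the variable resistance is $(ad - bc)/(c\rho + d)^2$, which keeps one sign as long as the denominator does not vanish on the interval. The sketch below is only an illustration: the function names and the numerical coefficients are hypothetical, and the nonvanishing denominator is an assumption stated in the code.

```python
def load_current(rho, a, b, c, d):
    """Right-hand side of Eq. (5.79): I = (a*rho + b) / (c*rho + d)."""
    return (a * rho + b) / (c * rho + d)

def optimal_resistance(a, b, c, d, rho_min, rho_max):
    """Return the endpoint of [rho_min, rho_max] that maximizes I.

    Assumes c*rho + d does not vanish on the interval, so that
    dI/drho = (a*d - b*c) / (c*rho + d)**2 has a constant sign.
    """
    return rho_max if a * d - b * c > 0 else rho_min

# Hypothetical coefficients: a*d - b*c = 3 - 20 < 0, so I decreases with rho.
a, b, c, d = 1.0, 10.0, 2.0, 3.0
best = optimal_resistance(a, b, c, d, 1.0, 4.0)

# A brute-force scan over the interval confirms that no interior value does better.
grid = [1.0 + 3.0 * k / 300 for k in range(301)]
scan_max = max(load_current(r, a, b, c, d) for r in grid)
```

As the text notes, this argument shows only that the optimum lies at a limiting value; it gives no criterion for choosing between the two regimes at a given point of the channel.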
5.34 The Case of a Homogeneous Magnetic Field”
For the case of constant conductivity of the working fluid throughout the channel, the current distribution was examined by Vatazhin.29 In what follows, we shall outline the solution of the optimum problem to illustrate our general considerations. Boundary conditions indicating the constancy of the potential along the electrodes and the vanishing of the normal component of the current density along the insulators, as well as the conditions at infinity and Ohm’s law for
the outer load, may be expressed as follows:
z'(x, 5 6 ) = z*'
z2(x, +6)(,,,
= const,
1x1 < /2
z2(x, kij)lx<-A =Z
= z+' = const,
zy00, 6) - z'(00, -6)
= z'(
- 00,6) - z'( - 00,- 6 )
1
6
c
-6
=-BJ'
- ~ =
const
(5.80)
Vdy=&
z2(co, 4 6 ) - z2(- 00, + 6 ) = R-'(z+' - 2 - l )
For the Lagrange multipliers, the boundary conditions must be constructed in view of relations (5.80) among the boundary values of the state variables $z^1$ and $z^2$ or, equivalently, among their variations. In view of the last Eq. (5.61) (the functional $-I$ to be minimized) together with Eqs. (5.66), we write the first variation of the functional $-I$ [cf. Eq. (5.12)] as follows (Fig. 2):
(5.81)
-6z+2 4-6 Z d 2
The variations entering into these relations are not independent but are subjected to constraints arising from Eqs. (5.80) after taking variations. Conditions at infinity immediately follow from those on the vertical paths C'(-S)C'(S) and C( -6)C(6) after moving both to infinity. We get (Fig. 2): On electrodes B'B and DID, y2 = 0. On insulators C'B', BC, C'D' and DC, y1 = 0. At infinity, C'(-6)
JW) C'( - 6 )
tldt=J
C(6)
t1df=0
CC-6)
t 2dt
=
J
C(6)
C(-d)
t2 dt = 0
The other terms remaining in Eq. (5.81) combine into the relation
+/
B'
B
q1 dt .6z+' - JDDq1 dt . 6 z - ' = 0
(5.82)
The variations in the left-hand side are connected by the relation [cf. Eq. (5.80)]
$$\delta z_+^1 - \delta z_-^1 = R(\delta z_+^2 - \delta z_-^2)$$
This relation permits us to eliminate $\delta z_+^1$ from Eq. (5.82), the resulting equation having all the variations independent and the corresponding coefficients equal to zero. We may write
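The boundary conditions just obtained may be rewritten with the aid of the "flow-functions" $\omega_1$ and $\omega_2$. We get:

On the electrodes,
$$\omega_2(x, \pm\delta) = \omega_{2\pm} = \text{const}, \qquad \partial\omega_1/\partial y = 0 \tag{5.83}$$
On the insulators,
$$\omega_1(x, \pm\delta)\big|_{x > \lambda} = \omega_{1+} = \text{const}, \qquad \omega_1(x, \pm\delta)\big|_{x < -\lambda} = \omega_{1-} = \text{const}, \qquad \partial\omega_2/\partial y = 0 \tag{5.84}$$
At infinity,
$$\omega_1(\infty, \delta) = \omega_1(\infty, -\delta), \qquad \omega_1(-\infty, \delta) = \omega_1(-\infty, -\delta)$$
$$\omega_2(\infty, \delta) = \omega_2(\infty, -\delta), \qquad \omega_2(-\infty, \delta) = \omega_2(-\infty, -\delta) \tag{5.85}$$
In addition to these conditions, we have
$$\omega_{2+} - \omega_{2-} + 1 = R[\omega_{1+} - \omega_{1-}] \tag{5.86}$$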
If now we introduce the function u by virtue of the relation (5.87)
then we may write Eqs. (5.58) in the following form:
the vector $\mathbf{j}$ being equal to
$$\mathbf{j} = -\frac{1}{\rho}\operatorname{grad} u \tag{5.89}$$
We observe that Eqs. (5.88) coincide with those of (5.68), if we substitute in the latter the function u for w 2 ,and z2 for q. Having compared the boundary conditions (5.80) and (5.83)-(5.86), we see that, for any p ( x , y ) , we may write Z2
= EmI,
(5.90)
U = &W2
These relations show that the vectors $\mathbf{j}$ and $\operatorname{grad}\omega_2$ are antiparallel everywhere in the channel ($\chi = \pi$); the Weierstrass condition (5.78) now shows that in the optimum regime we must have $\rho = \rho_{\min}$ throughout the channel. This result seems to be in complete accord with physical considerations. We would certainly have arrived at this statement on investigating Vatazhin’s solution,29 but then we could have verified the optimality of $\rho = \rho_{\min}$ only relative to the class of functions which are constant everywhere. Utilizing the general theory (cf. Sec. 5.2), we have established this optimality relative to a wider class of functions of two independent variables. There is, in general, no possibility of obtaining any analytical solution for arbitrary functions of that class. Figure 4 shows the ratio ITc/Zcpn of the total current gained under optimum
FIG.4. Dependence of the ratio I n c / l a i .on A/8.
’
conditions (Z,,)29 t o that obtained when the conductivity p - vanishes outside the electrode zone. The parameter c( is equal t o the ratio K(k’)/K(k), k = exp( -nA/6), k2 k f 2 = 1 ; K(k) being the complete elliptic integral of the first kind. The ratio is taken as equal t o unity. We see that the difference between I , ] and I , , vanishes when the parameter ,l/d goes to infinity.
+
5.35 The Case of the Magnetic Field Varying along the Channel3'
For a homogeneous magnetic field, we have just seen that for optimum conditions, the conductivity is uniform and constant throughout the channel. This result changes qualitatively when the magnetic field is assumed t o vary along the channel. In the latter case, we obtain discontinuities in the optimum distribution of conductivity. Consider, for instance, the case when the magnetic field is presented by a n even function of x vanishing outside the electrode zone. The velocity of the fluid will be assumed constant. We are going t o prove that the distribution p = const throughout the channel is not optimal under these conditions. For the case of p = const, the current distribution was investigated by V a t a ~ h i n . ' The ~ vector j may be generally given as the sum of two parts: the vector j(') generated by the motion of the working fluid in a homogeneous magnetic field of intensity
and the vector j(2) with components [0, (I/pc)VB(x)].The vector lines j(') and j are presented by Figs. 5a, and 5b, re~pectively.'~ It follows from Eqs. (5.89) and (5.90) (where we must now put r < 0) that the vector lines grad o2are parallel to those o f f ' ) .
FIG. 5. Vector lines of $\mathbf{j}^{(1)}$ (a) and $\mathbf{j}$ (b) (Sec. 5.35).
Comparison of Figs. 5a and 5b shows that the Weierstrass condition (see theorem) is not satisfied because the vector lines $\mathbf{j}$ and $\operatorname{grad}\omega_2$ intersect each other at acute as well as obtuse angles, this being in contradiction with the presupposed constancy of the conductivity. It can be shown by more detailed consideration that the mutual orientation of the vector lines corresponds, roughly speaking, to the regime $\rho = \rho_{\min}$ in the electrode zone, and to the regime $\rho = \rho_{\max}$ in the outer region. Physically, this means that for optimization of the current distribution, it is necessary to intensify the current density inside the electrode zone and to reduce that in the outer part of the channel to the greatest possible extent. Analogously, it can be shown that the regime of constant conductivity is by no means optimal when the magnetic field zone extends beyond the limits prescribed by the electrode zone. There thus arises the problem of determining the lines of discontinuity dividing the regions of different regimes of conductivity. In what follows, we shall present a formulation of this problem where it will be supposed that, in general, the magnetic field zone spreads out of the electrode zone. Along the unknown lines $\Sigma_0$ of discontinuity there applies the condition that the tangential component of the electric field intensity is continuous, the same being also true for the normal component of the current density. These conditions can be expressed as follows:†
$$[z^i]_1^2 = 0, \qquad i = 1, 2 \tag{5.92}$$
Now we must formulate the Erdmann-Weierstrass conditions along the line $\Sigma_0$ [cf. Eqs. (5.26)]. We get
$$[\omega_i]_1^2 = 0, \qquad i = 1, 2 \tag{5.93}$$
(5.94)
Here we have denoted by $\mathbf{t}(x_t, y_t)$, $\mathbf{n}(y_t, -x_t)$ the tangential and normal directions, respectively, to the line $\Sigma_0$. We may represent Eqs. (5.92)-(5.94) in an equivalent form. To do this, let us expand the functions $z^i$ into the sums
(5.95)
Here $P^i$ denotes a suitably chosen solution to the corresponding Eq. (5.63), continuous together with its first derivatives everywhere inside the channel.
† We attach the index “1” to quantities defined inside the region where $\rho = \rho_{\max}$ (regime 1), and the index “2” to those for the region where $\rho = \rho_{\min}$ (regime 2).
The first term in the right-hand side of Eq. (5.95) is responsible for the discontinuity of the normal derivatives of $z^i$ across the line $\Sigma_0$, these variables themselves being continuous across it [cf. Eq. (5.92)]. We observe, for example, that representation (5.95) of the function $z^1$ satisfies the corresponding relation (5.92) identically. The other relation (5.92), written in terms of $z^1$, leads immediately to the integral equation for the density $\mu^1$. To derive this equation, let us consider the well-known relations for the limiting values of normal derivatives of the potential of a surface distribution31:
cos
*, dt
(5.96)
Here $\psi_0$ denotes the angle between the normal $\mathbf{n}_0$ to the curve $\Sigma_0$ and the radius vector $\mathbf{r}_0$ of the running point relative to the pole at the point of observation. The relation $[z^2]_1^2 = 0$ is equivalent to
+-
VB
VB
CPmax
Pmin
2
CPmin
xt
Having now substituted the corresponding expressions in the latter equation by virtue of Eqs. (5.95) and (5.96), we get (5.97)
The parameter $p$ is defined by Eq. (5.76); $\mathbf{n}_0$ denotes the normal to $\Sigma_0$ external to region 1, where $\rho = \rho_{\max}$. By the same argument, we proceed to the formulation of the related equations for the density $\mu^2$, as well as for the densities $\nu_1$, $\nu_2$ corresponding to the “flow-functions” $\omega_1$, $\omega_2$ [cf. Eqs. (5.69)]. The last Erdmann-Weierstrass condition (5.94) may now be rewritten as follows:

It remains now to develop a recipe for the determination of the function $F^1$ in the right-hand side of Eq. (5.97). The determination of this function reduces to the boundary problem generated by the initial problem formulated for the state variable $z^1$. More precisely, we assume for simplicity that the magnetic field vanishes outside the electrode zone, and that the velocity of the
t
Function G , corresponds to F 2 for w 2 .
working fluid is constant. Under these assumptions, we suppose that the following distribution of conductivity is optimum: in the middle region PP'Q'S'SQP (cf. Fig. 2), lying entirely within the limits of the electrode zone, there is the regime $\rho = \rho_{\min}$; in the symmetrically disposed outer regions (including infinity), there is the regime $\rho = \rho_{\max}$. Needless to say, the (symmetrical) lines S'Q'P' and PQS of discontinuity of the control function $\rho(x, y)$ are to be determined together with the solution to the optimum problem. We shall only suppose in what follows that these lines have their ends on the electrodes (cf. Fig. 6). Under the assumptions formulated, we observe that the boundary problem for the determination of $F^1$ may be stated in the following way [see Eqs. (5.63) and (5.80)].† It is required to determine a harmonic function $F^1$ in the half-strip $x \ge 0$, $|y| < \delta$ by the following conditions along the boundaries: F'(x,
+a)
= z+' -
Jx p' In o
(5.99)
At infinity, the derivatives $\partial F^1/\partial x$, $\partial F^1/\partial y$ are equal to zero. The constant difference $z_+^1 - z_-^1$ is calculated by virtue of Ohm's law for the outer load, namely,

An analogous problem had been examined by Vatazhin;32 following his paper we introduce the analytic function
satisfying the boundary conditions (for notation see Fig. 6)
, sin q dt = - S ( x , y )
along CB, DC
I' The postultitnate Eq. (5.80) must now be replaced by z'(+ co,y ) = 0.
FIG. 6. Notations for calculations of Sec. 5.35.
=-I
R
Pmin
*I 0
, sin cp
R sin cp +jx’ dx (jxo p1 dt) r ~
Pmin
0
R V *
(5.101)
The analytic function
w=t+iw=
sin(niz/26) cosh(d/26) ’
z=x+iy
(5.102)
realizes the conformal mapping of the half-strip $x \ge 0$, $|y| \le \delta$ into the upper half-plane. Corresponding points are denoted on Fig. 7.

FIG. 7. Points corresponding to one another under conformal mapping.
On the boundary of the mapped region, the following relations are satisfied: z=
z=-
-
cosh(nx/26) cosh(nA/26)'
x20, y=6
sin(ny/2d) cosh(nA/26) '
x
IyJI 6
= 0,
(5.103)
We must now construct the function cD'[z(w)]= Bl1(w)= u1 + iu, which is analytic in the upper half-plane, vanishes at infinity, and satisfies mixed boundary conditions along the real axis u1 =C,(r) if
u1 = - S , ( T ) if
- 1 < z < 1, o = O
1 < / T I < co, o = 0
- $[Pl(O,
6) - P'(0, - S ) ]
(5.104)
Here we have introduced the following notation :
Ix, 1
N'
=
kl =
S(x, 6) dx 1
cosh(nA/26)'
+V
C
I Jx,
B(x) dx
z, =
cosh(nx,/26) (5.105) cosh(nA/26)
This problem can be immediately solved by virtue of the Keldish-Sedov formula.33 We get
(5.106) In this formula we have taken the branch of the square root which is positive along that part of the real axis where t > 1. The constant parameter y can be determined from the last of Eqs. (5.104). Having set w = z, 171 < 1 in Eq. (5.106), we get u 1 ( t , 0 ) = ul*(t, 0) -
Y (1
-
q
/
2
- *(t)
Having substituted this expression for v l ( t , 0) into Eq. (5.104), we calculate the constant parameter y. After simple but rather extensive calculations (see Appendix) we arrive a t the following result:
(5.108j
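In this formula
$$F(\varphi, k) = \int_0^{\varphi} \frac{d\beta}{(1 - k^2\sin^2\beta)^{1/2}}$$
is the incomplete elliptic integral of the first kind,
$$K(k) = F(\pi/2, k)$$
is the complete elliptic integral of the first kind,
$$\Pi(\varphi, n, k) = \int_0^{\varphi} \frac{d\beta}{(1 + n\sin^2\beta)(1 - k^2\sin^2\beta)^{1/2}}$$
is the elliptic integral of the third kind, and $k_1^2 + k_1'^2 = 1$.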
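For numerical work with formulas of this kind, the three Legendre integrals can be evaluated directly from their defining integrals. The sketch below is a minimal illustration, assuming the sign convention $(1 + n\sin^2\beta)$ for the integral of the third kind; the function names `ellip_F`, `ellip_K`, `ellip_Pi` and the composite Simpson quadrature are choices made for this example, not part of the text.

```python
import math

def _simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def ellip_F(phi, k):
    """Incomplete elliptic integral of the first kind, modulus k (|k| < 1)."""
    return _simpson(lambda b: 1.0 / math.sqrt(1.0 - (k * math.sin(b)) ** 2), 0.0, phi)

def ellip_K(k):
    """Complete elliptic integral of the first kind, K(k) = F(pi/2, k)."""
    return ellip_F(math.pi / 2, k)

def ellip_Pi(phi, n, k):
    """Elliptic integral of the third kind with the (1 + n sin^2) convention."""
    def integrand(b):
        s2 = math.sin(b) ** 2
        return 1.0 / ((1.0 + n * s2) * math.sqrt(1.0 - k * k * s2))
    return _simpson(integrand, 0.0, phi)

K_half = ellip_K(0.5)                          # modulus k = 0.5
Pi_check = ellip_Pi(math.pi / 2, 3.0, 0.0)     # closed form: pi / (2 sqrt(1 + n))
```

Note that many libraries parametrize these integrals by $m = k^2$ rather than by the modulus $k$ used here, and some define the third-kind integral with $(1 - n\sin^2\beta)$, so signs and arguments should be converted before comparing values.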
Equations (5.106) and (5.108) together determine the function $\Phi^1(x, y)$. Having calculated the derivative $\partial F^1/\partial n_0$ and introduced the result into the right-hand side of Eq. (5.97), we arrive at the integral equation for the unknown density $\mu^1$. It must be added that Eq. (5.94) represents just the extra condition necessary for determination of the discontinuity line $\Sigma_0$. An expression for the optimum value of the total current $I$ is of primary interest for our investigation. We can readily obtain this expression on writing
I=
46 z+1 - z - 1 - - R nR
1, (kI2 kl
vl(z’
- T’)’/’
dz
+ 1 [P1(O,6)
-
P’(0, -6)l
Having performed the calculations (see Appendix), we obtain --I
P’(0, 6) - P’(0, -6)
+n
Here we have used the notation
(5.1 10)
0 = k , ~0sh(nx/26)
5.36 Some Limiting Cases

Let us consider in more detail the case of small values of the parameter $p$. If $p = 0$, then the conductivity of the working fluid is constant and fixed. Of course, no optimum problem can arise in this case. On the other hand, if the parameter $p$ is sufficiently small compared with unity, then the angles $\arccos p$ and $(\pi - \arccos p)$ are quite close to $\pi/2$. By virtue of the Weierstrass condition (see theorem) we can easily indicate the limiting position of the (vanishing!) lines $\Sigma_0$ of discontinuity. It is readily observed that for this position we must take the locus $\Gamma$ of those points where the vector lines $\mathbf{j}$ and $\operatorname{grad}\omega_2$, drawn for $\rho = \text{const}$, intersect each other at a right angle. (For the case when $B(x)$ vanishes for $|x| > \lambda$, we can easily draw these limiting curves if we lay Figs. 5a and 5b on one another.) For three variants of the magnetic field graph $B(x)$ represented by Fig. 8, the corresponding curves $\Gamma$ are drawn in Fig. 9. For values of $p$ not too different from zero, the lines $\Sigma_0$ of discontinuity do not differ very much from $\Gamma$, the Weierstrass conditions leading to the mutual disposition of regions with $\rho = \rho_{\max}$ or $\rho = \rho_{\min}$ just coinciding with that described in the preceding section. We are interested most of all in the change in the value of the functional $I$ due to optimization of the conductivity distribution as compared with that for conductivity equal to $\rho_{\min}^{-1}$ everywhere. The corresponding variation may be treated as
FIG.8. Various shapes of the magnetic field graph.
consisting of the part (1) caused by decrease of conductivity to the value of $\rho_{\max}^{-1}$ in the regions C'S'Q'P'C' and CPQSC of the channel (Fig. 2), and of the variation (2) due to the subsequent transition from $\Gamma$ to the line $\Sigma_0$ of discontinuity. The latter variation, however, turns out to be of a higher order of smallness. To demonstrate this, we may represent the function $z^2$ by means of Green's functions for the corresponding regions, and then pass to the variations (1) and (2). It will become obvious that the latter variation is composed of terms containing products of the parameter $p$ with quantities whose
FIG. 9. $\Gamma$ curves drawn for various profiles of external magnetic field.
order of magnitude is that of the variations of Green's functions caused by the transition from the boundary line $\Gamma$ to $\Sigma_0$. The demonstration may be regarded as complete because the variations of Green's functions vanish together with $p$. It is now obvious that we may calculate the variation of the total current $I$ under the assumption that the line of discontinuity coincides with $\Gamma$. To do this, we must use Eq. (5.109), where it is sufficient to keep only terms of the same order of magnitude as $p$. We must first of all calculate the density $\mu^1$ to the same approximation; from Eq. (5.97) we infer that
The right-hand terms are calculated along the curve $\Gamma$. The directions $\mathbf{n}_0$ and $\mathbf{t}_0$ are normal and tangential, respectively, to $\Gamma$; the subscript zero denotes that the corresponding term is taken for $p = 0$ ($\rho_{\max} = \rho_{\min}$). Having eliminated the density $\mu^1$ from Eq. (5.109) by virtue of Eq. (5.111), we obtain the following expression for the variation of the total current $I$ (where the terms $O(p^2)$ are dropped):
(5.112)
We have denoted by $I_{\lambda c}$ the value of total current for $\rho$ everywhere equal to $\rho_{\min}$, this value being given by $I_{\lambda c}$
=
p m i n +RE* c
-A
B ( x ) dx
(5.113)
In Table I we present the values of the ratio $\Delta I/pI_{\lambda c}$ calculated according to Eq. (5.112) for three variants of the magnetic field distribution $B(x)$ indicated on Fig. 8. The parameter $\lambda/\delta$ is chosen equal to 1 and 2. It is obvious from this table that optimization of the conductivity distribution can provide a considerable increase in the total current. An analogous investigation can also be conducted for the case when the magnetic field zone spreads beyond the limits of the electrode zone. It should be expected, however, that the effect of conductivity optimization should then be less than that in the preceding case, for now the vortex currents would be
TABLE I
VALUES OF CHARACTERISTIC RATIO $\Delta I/pI_{\lambda c}$

Case    $\lambda/\delta = 1$    $\lambda/\delta = 2$
a       0.375                   0.228
b       0.474                   0.253
c       0.583                   0.263
withdrawn from the electrode zone. Thus, we should consider the disposition of the magnetic field abatement region outside of the electrode zone as the optimization factor itself.32 Let us now proceed to the particular case $\rho_{\max} = \infty$. For this case, the parameter $p$ is equal to unity. In the region where $\rho = \rho_{\max}$ there can be no current, whereas in the region $\rho = \rho_{\min}$, the Weierstrass condition (theorem) requires that the vector lines $\mathbf{j}$ and $\operatorname{grad}\omega_2$ be parallel with each other. It is easy to verify that these requirements are met in certain circumstances by the following conductivity distribution: (5.114)
The components of the current density are now expressed by ('=O
(5.115)
The functional $I$ is equal to
$$I_l = \frac{G_l}{\rho_{\min} + Rl/\delta}, \qquad G_l = \frac{V}{c}\int_{-l}^{l} B(x)\,dx \tag{5.116}$$
The electric field is homogeneous, its only $y$ component being $-I_l R/2\delta$. We can see that under these circumstances the problem is one-dimensional; from this it follows that the vector $\operatorname{grad}\omega_2$ now has only its $y$ component different from zero, this component being constant and negative. Equation (5.94) provides a restriction imposed on the values of the abscissas $\pm l$ of the vertical lines of the conductivity jump. It follows now from this equation that
these lines should be critical; in other words, the current density should vanish along them:
$$\zeta^2(\pm l) = 0 \tag{5.117}$$
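The condition that the current density vanish at $x = \pm l$ can be solved numerically for a given field profile. The sketch below is an illustration under stated assumptions, not the author's procedure: it assumes a one-dimensional Ohm's law $\rho_{\min}\, j(x) = (V/c)B(x) - I_l R/(2\delta)$ in the conducting zone, which is consistent with Eq. (5.116) and with the homogeneous field $-I_l R/2\delta$ quoted above; the parabolic profile $B(x)$, the unit parameter values, and all function names are hypothetical.

```python
def simpson(f, a, b, steps=200):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def total_current(l, B, v_over_c, rho_min, R, delta):
    """I_l from Eq. (5.116) for a conducting zone |x| < l."""
    G = v_over_c * simpson(B, -l, l)
    return G / (rho_min + R * l / delta)

def edge_current_density(l, B, v_over_c, rho_min, R, delta):
    """Current density at x = l under the assumed one-dimensional Ohm's law."""
    I_l = total_current(l, B, v_over_c, rho_min, R, delta)
    return (v_over_c * B(l) - R * I_l / (2.0 * delta)) / rho_min

def critical_half_width(B, lam, v_over_c, rho_min, R, delta, iters=200):
    """Bisection for l in (0, lam] at which the edge current density vanishes."""
    g = lambda l: edge_current_density(l, B, v_over_c, rho_min, R, delta)
    lo, hi = 1e-9 * lam, lam
    if g(lo) <= 0.0 or g(hi) >= 0.0:
        raise ValueError("no sign change of the edge current density on (0, lam]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam = 1.0
B = lambda x: 1.0 - (x / lam) ** 2      # hypothetical field, zero at |x| = lam
params = (1.0, 1.0, 1.0, 1.0)           # v_over_c, rho_min, R, delta (hypothetical)
l_star = critical_half_width(B, lam, *params)
```

Under the same assumption, differentiating Eq. (5.116) gives $dI_l/dl = 2\rho_{\min}\, j(l)/(\rho_{\min} + Rl/\delta)$, so the root of the edge condition is also a stationary point of $I_l$ as a function of $l$; for the sample profile the current at `l_star` exceeds the current obtained with the whole electrode zone conducting.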
Equation (5.117), together with Eq. (5.115), determines the unknown parameter $l$. Provided the least root of this equation does not exceed $\lambda$, this root gives just the abscissa to be determined. The very existence of such a root [or, equivalently, the existence of an optimal regime expressed by Eq. (5.114)] is essentially determined by the profile of the external magnetic field $B(x)$. It follows particularly from Eq. (5.115) that for the realization of
FIG. 10. Possible forms of external magnetic field.
regime (5.114) it is necessary that the function $B(x)$ decrease sufficiently rapidly towards the ends of the interval $(-\lambda, \lambda)$ (Fig. 10a); if the function $B(x)$ is nonmonotonic, then the optimal conductivity distribution may take the form of alternating zones of maximum and vanishing conductivity (Fig. 10b), the former occupying the neighborhoods of the maxima of the function $B(x)$. It can be immediately demonstrated that in the optimal regime (5.114) the total current $I_l$ exceeds, for instance, that obtained for constant conductivity of the fluid, equal to $\rho_{\min}^{-1}$. In the latter case we get (see Sec. 5.34 for notation)
For the profile of a magnetic field of the type presented in Fig. 10a, the following inequality is valid :
This inequality follows from the fact that along the intervals $(l, \lambda)$, $(-\lambda, -l)$ electric currents flow in the negative direction [cf. Eqs. (5.115) and (5.117)]. The inequality
21
xK(1 - k2)1'2 > I, 2 In k K ( k )
k
= exp( - d
/h)
valid for any $0 < k < 1$, indicates that $I_\lambda > I_{\lambda c}$, or, according to what has already been stated, $I_l > I_{\lambda c}$.
FIG. 11. Characteristic graphs of the ratios $I_l/I_{\lambda c}$ and $I_l/I_\lambda$ as functions of $\lambda/\delta$ (Sec. 5.36).
Figure 11 presents the graphs of $I_l/I_{\lambda c}$ and $I_l/I_\lambda$ as functions of the parameter $\lambda/\delta$ for three variants of the external magnetic field and the ratio $R/\rho_{\min}$ equal to unity. The first of these functions characterizes the effectiveness of the conductivity optimization as compared with the case when conductivity is everywhere constant and equal to $\rho_{\min}^{-1}$. In both cases the external magnetic field is concentrated in the electrode zone (the variants a-c, Fig. 8). It is obvious from the graphs (continuous curves) that the effectiveness of the optimization increases rapidly with transition from variant a to c or, equivalently, together with extension of the region of the magnetic field decrease towards the ends of the electrode zone. For any variant, the effectiveness of the optimization decreases slowly when the characteristic ratio $\lambda/\delta$ grows, because the relative part of the end losses is then being reduced. We have
already observed an analogous situation in the case of small values of $p$ (Table I). The ratio $I_l/I_\lambda$ (dashed curves on Fig. 11) characterizes the effectiveness of the conductivity optimization as compared to the case when conductivity vanishes outside the electrode zone. It may be thought that in the latter case a sort of partial optimization of the conductivity distribution is accomplished, for now there is no longer the possibility of the current lines' departure from the electrode zone. From this it naturally follows that the effectiveness of the optimization decreases as compared with the preceding case (the dashed curves on Fig. 11 show lower values than the continuous ones). As before, however, the effectiveness decreases quite rapidly during the transition from a to c. On the other hand, the effectiveness increases together with the ratio $\lambda/\delta$, the influence of the partial optimization now being decreased.
APPENDIX

The constant parameter $\gamma$ is determined from the last Eq. (5.104). Having substituted Eq. (5.107) for $u_1(\tau, 0)$ into Eq. (5.104), we write the latter in the following form (for notation see Sec. 5.35):
26
dr
[(I - T 2 ) ( k I 2 - r 2 )I 12,+% ;
- $[P'(O,
6) - P'(0, -611
26
(5.1 18)
The left-hand side of this equation can be rewritten as follows [see Eq. (5.107)]:
Furthermore, we have
The left-hand side of Eq. (5.1 18) is now equal to -2, k, kPi 2
1
-K(kl)
1 (5.1 19)
Consider now the right-hand side of Eq. (5.118). We write (for notation see Sec. 5.35):
x
{J
-
-I
--*;p-z
(y1/2 + i" (-y2 IPI-1
S1(p)d p
dz [(l - r 2 ) ( z 2- k, 2 )]l / Z - 4:
I
1 p-1 p-7 P + l
Sl(P) dP)
-fk:
(””)’”(!-’
” ( t 2 - k12)1/2 1 - 5
... dp + -0c
1
m
1
... dp)
The latter integral can be transformed by virtue of the substitution t =
into the form
(1 - k ; 2 t 2 ) 1 / 2 , k I 2+ ki2 = 1
Here we let n’ = ki2/(p2- 1). The integral R is now equal to
R
= -
1
[II(arcsin
P -1
1-t ki2 4, ,z kl P -1
k,
-
I )
(2, $, 2 p -1
Now we can easily write down the resulting expression for the right-hand side of Eq. (5.118); namely,
-
k l ’ ) ]+
t12)1’2,
kl‘
I ~
F(arcsin
Plll:!X
(1 - t 1 2 ) 1 / l2 k, f k2 l ’ ) k,’ ’p2- I
Pmin
( I - r12)‘/2
( 1 - rl2)’l2
-=(?,&, 2 p -1
k’,’
(5.120)
Pmax
1
- It 26 Ry
[&[
[P1(O,6) - P’(0, -6)]
F ( i , kl’) - F (arcsin (1 - ? , 2 ) 1 ’ 2
kl (1 - t
1 2 y
k,‘
kl ’
The parameter y can now be determined on setting expressions (5.1 19) and (5.120) equal to each other [see Eq. (5.108)]. The total current I is given by 46 I = -nR
kl
10
‘I(‘’
( k 1 2- T
~ )
1
+ -~ [P’(O, 6) - P’(0, -6)l ~ / R
dt
Making use of the preceding equations, we find
kl’
Pmin
Pmax
46
- f [P’(O, 8 ) - P’(0, -6)l
This expression can be immediately transformed to the form of Eq. (5.109).

REFERENCES
1. R. Bellman and R. Kalaba, J. Basic Eng. 83 (1961).
2. A. I. Egorov, Prikl. Mat. Mekh. 27, 688-696 (1963).
3. A. G. Butkovsky, A. Ya. Lerner, and S. A. Malii, Dokl. Akad. Nauk SSSR 153, 772-775 (1963).
4. T. K. Sirazetdinov, Izv. Vysshikh Uchebn. Zavedenii, Aviats. Tekhn. 2, 11-21 (1961).
5. R. Bellman and H. Osborn, J. Math. Mech. 7, 1 (1958).
6. A. G. Butkovsky and A. Ya. Lerner, Dokl. Akad. Nauk SSSR 134, 778-781 (1960).
7. A. G. Butkovsky, Avtomat. Telemekh. (Automat. Remote Control) 22, 17-26 (1961).
8. A. G. Butkovsky, Avtomat. Telemekh. 22, 1288-1301 (1961).
9. Yu. V. Egorov, Dokl. Akad. Nauk SSSR 150, 241-244 (1963).
10. K. G. Guderley and E. Hantsch, Z. Flugwiss. 3, H. 9, S. 305-313 (1955).
11. G. V. R. Rao, Jet Propulsion 28, 377-382 (1958).
12. Yu. D. Shmiglevsky, Prikl. Mat. Mekh. 21, 195-206 (1957).
13. Yu. D. Shmiglevsky, Prikl. Mat. Mekh. 26, 110-125 (1962).
14. K. G. Guderley and J. V. Armitage, Paper presented at the Symposium on Extremal Problems in Aerodynamics, Boeing Scientific Research Laboratories, Flight Sciences Laboratory, Seattle, Washington, December 3-4, 1962.
15. A. N. Kraiko, Prikl. Mat. Mekh. 28, 285-295 (1964).
16. K. A. Lurie, Prikl. Mat. Mekh. 27, 842-853 (1963).
17. P. K. Rashevsky, “Geometricheskaya Teoria Uravnenii s Tchastnimi Proisvodnimi” (“Geometrical Theory of Partial Differential Equations”), pp. 323-324. Gostekhizdat, Moscow, 1947.
18. A. Miele, in “Optimization Techniques With Applications to Aerospace Systems” (G. Leitmann, ed.), p. 112. Academic Press, New York, 1962.
19. V. Volterra, Atti Accad. Nazl. Lincei Rend. 4, 6’ (1890).
20. R. E. Kopp, in “Optimization Techniques With Applications to Aerospace Systems” (G. Leitmann, ed.), pp. 255-279. Academic Press, New York, 1962.
21. G. W. Sutton, Vistas Astron. 3, 53-64 (1960).
22. J. H. Drake, AIAA J. 1, 2053-2057 (1963).
23. A. E. Sheindlin, A. V. Gubarev, V. I. Kovbasijuk, and V. A. Prokudin, Izv. Akad. Nauk SSSR, Otd. Tekhn. Nauk Energ. i Avtomat. 6, 36-38 (1962).
24. T. G. Cowling, “Magnetohydrodynamics.” Wiley (Interscience), New York, 1957.
25. A. B. Vatazhin and S. A. Regirer, Prikl. Mat. Mekh. 26 (1962).
26. G. G. Cloutier and A. I. Carswell, Phys. Rev. Letters 10, 327-329 (1963).
27. K. A. Lurie, Prikl. Mat. Mekh. 28, 258-267 (1964).
28. L. I. Rozonoer, Avtomat. Telemekh. (Automat. Remote Control) 20, 1320-1334, 1441-1458, 1561-1578 (1959).
29. A. B. Vatazhin, Prikl. Mat. Mekh. 25, 965-968 (1961).
30. K. A. Lurie, Zh. Prikl. Mekh. i Tekhn. Fiz. No. 2 (1964).
31. O. D. Kellogg, “Foundations of Potential Theory.” Springer, Berlin, 1929.
32. A. B. Vatazhin, Izv. Akad. Nauk SSSR, Otd. Tekhn. Nauk Mekh. i Mashinostr. No. 1 (1962).
33. M. A. Lavrentjev and B. V. Shabat, “Methods of the Theory of Functions of the Complex Variable.” Gostekhizdat, Moscow, 1951.
Mathematical Foundations of System Optimization
HUBERT HALKIN†
BELL TELEPHONE LABORATORIES, WHIPPANY, NEW JERSEY
6.0 Introduction
6.1 Dynamical Polysystem
    6.11 System Evolution
    6.12 Control Functions
    6.13 Trajectories
6.2 Optimization Problem
    6.21 Terminal Constraints and Performance Criterion
    6.22 Remarks on the Structure of the Function $f(x, u, t)$
    6.23 Remarks on the Evolution Interval
6.3 The Principle of Optimal Evolution
    6.31 Reachable Sets
    6.32 Propagation Analogy
    6.33 A Preview of the Maximum Principle
6.4 Statement of the Maximum Principle
6.5 Proof of the Maximum Principle for an Elementary Dynamical Polysystem
    6.51 An Elementary Dynamical Polysystem
    6.52 Reachable Set
    6.53 Maximum Principle
    6.54 Proof of the Convexity of the Set W
6.6 Proof of the Maximum Principle for a Linear Dynamical Polysystem
    6.61 A Linear Dynamical Polysystem
    6.62 Fundamental Matrix
    6.63 Reachable Set
    6.64 Maximum Principle
    6.65 Remarks on Convexity
    6.66 Comoving Space along a Trajectory
6.7 Proof of the Maximum Principle for a General Dynamical Polysystem
    6.71 Fundamental Matrix along an Optimal Trajectory
    6.72 Variational Trajectory
† Present address: Department of Mathematics, University of California, La Jolla, California.
    6.73 Approximation Trajectory
    6.74 Remarks on Linearization Techniques
    6.75 Statement of the Fundamental Lemma
    6.76 Maximum Principle
    6.77 Derivation of the Properties of the Function $k(t; \mathcal{U})$
6.8 Uniformly Continuous Dependence of Trajectories with Respect to Variations of the Control Functions
    6.81 Distance between Two Control Functions
    6.82 Uniform Boundedness of Variational Trajectories
6.9 Some Uniform Estimates for the Approximation $z(t; \mathcal{U})$ of the Variational Trajectory $y(t; \mathcal{U})$
6.10 Convexity of the Range of a Vector Integral over the Class $\mathcal{A}$ of Subsets of [0, 1]
    6.101 Multiple Balayage of Vector Integrals
    6.102 Convexity of the Range of a Vector Integral
6.11 Proof of the Fundamental Lemma
    6.111 Proof of the Fundamental Lemma in the Case $r = n - 1$
    6.112 Extension of the Preceding Proof to the General Case
6.12 An Intuitive Approach to the Maximum Principle
Appendix A Some Results from the Theory of Ordinary Differential Equations
Appendix B The Geometry of Convex Sets
References
6.0 Introduction
This chapter is devoted to a mathematical analysis of some of the problems encountered in the optimal control of deterministic systems described by nonlinear differential equations. It contains very few results which are not yet common tools for the growing number of engineers interested in control theory. The main purpose of this chapter is to give a rigorous but still easily understandable derivation of some of the most important of these results. The method developed here is the same as the method which we have introduced in earlier publications devoted to more complex versions of similar problems (Halkin).
Many engineers, who were successful practitioners of the operational calculus of Heaviside long before Laurent Schwartz gave it mathematical respectability, could dispute the necessity of adding mathematical rigor to the proof of those results in optimal control which have been accepted for many years. From a purely utilitarian standpoint such an attitude would be acceptable if these results, even if not fully understood, were applied correctly and successfully. Unfortunately, this is not the case: a great majority of the papers on optimal control in the American engineering literature is there to testify that these results are more often misused than not.
Another shortcoming of engineering applications of optimal control is the present state of the art in computing techniques. Even after the pioneering work of Bryson
and Denham,⁸ Kelley, Neustadt, and Paiewonsky and Woodrow,⁴² we must admit that computing techniques for optimization problems are still far behind the wishful manipulations of formulas that we find in so many engineering papers. We seriously believe that a sound understanding of the theory of optimal control would be of considerable help in applying it correctly and in devising efficient computational methods.
The pièce de résistance in any mathematical treatment of the theory of optimal control is always the proof of the maximum principle. There have been some remarkable proofs of the maximum principle in the case of linear systems (Bellman et al., LaSalle,³⁴ and Neustadt) but very few rigorous proofs of the maximum principle in the case of nonlinear systems (Pontryagin et al.,⁴³ Warga,⁴⁸ Berkovitz, and Halkin²⁴). The proof given by Pontryagin and his associates is often obscure and incomplete. The proof of Warga is short and precise but applies only to problems which have been “relaxed” by a proper convexification. The proof of Berkovitz is based on McShane's proof of the multiplier rule for the abnormal case of the problem of Bolza and presupposes an extensive knowledge of the classical calculus of variations. The proof of Pontryagin et al.⁴³ is based on the construction of special variations which were developed initially by McShane. The special variations of McShane lead to the construction of some convex cones. The proof of Halkin²⁴ is also based on special variations which are different from the variations of McShane and which lead to the construction of some convex sets. The convex cones of McShane are the cones spanned by these convex sets. The properties of a convex cone spanned by a convex set are determined completely by this convex set. However, the reverse is not true: there are many properties of a convex set which are not recoverable from the knowledge of the properties of the convex cone spanned by this convex set.
It is to be expected, and it is in fact the case, that from considerations of convex sets we have been able (Halkin²⁴) to derive all the results obtained from convex cones. Indeed, we have been able to derive some new results which are particularly useful when it comes to computational methods (Halkin).
In addition to the preceding proofs, there are a great number of heuristic derivations of the maximum principle (Desoer, Dreyfus, Flügge-Lotz and Halkin,¹⁵ and Kalman³²). All these heuristic† derivations follow the pattern of Carathéodory's method of “Geodätisches Gefälle,” or equivalently of Bellman's “Dynamic Programming.” These derivations are excellent mnemonic devices to check the correctness of the final formulas. In this chapter we shall present a simple and complete proof of the maximum principle.
† It should be added that these papers, especially Kalman,³² give proofs which are rigorous for some special cases.
Our proof of the maximum principle proceeds in three steps. We are first interested in open loop systems, i.e., systems which are described by differential equations independent of the state variables or, equivalently, such that the output can be represented by the integral of a function depending on the input only. From there we go to time varying linear systems and finally to a general type of nonlinear systems. In so doing we are not only solving three separate problems of increasing difficulty but we build up a single proof in three steps, each step closely related to one of the three types of systems mentioned above. If, in the case of linear systems, we follow the guiding lines of the proof of McShane or Pontryagin, we obtain a proof which is only slightly simpler than the general proof for nonlinear systems. But if, in the case of linear systems, we follow the approach given in this chapter, we obtain the simple proof which has been developed specifically for linear systems by LaSalle.
The mathematical prerequisites for the present chapter are very elementary: calculus, matrix theory, finite dimensional Euclidean space, etc. The reader will find numerous footnotes to refresh his memory of these prerequisites. Moreover, we have collected in Appendix A all prerequisites related to the theory of ordinary differential equations and in Appendix B all prerequisites related to convexity in finite dimensional Euclidean spaces. The only nonelementary mathematical result quoted without proof in this chapter is the fixed point theorem of Brouwer for closed convex subsets of finite dimensional Euclidean spaces. Even if the proof of this theorem is not elementary, its statement uses only elementary mathematical concepts and its content can be visualized easily. This chapter does not contain a single one of those measure theoretical concepts which have been for too long a necessary evil in the mathematical theory of optimal control.
The end of this paragraph is intended only for the reader familiar with measure theory. An important step in the proof of the maximum principle given in Halkin²⁴ is based on a well-known theorem in measure theory which states that: if $f$ is a Lebesgue integrable function from [0, 1] into $E^n$ and if $\mathcal{B}$ is the class of all Borel subsets of [0, 1] then the set $\{\int_E f(t)\,dt : E \in \mathcal{B}\}$ is convex. In Halkin we have proved a new result of the same type: if $f$ is a piecewise continuous function from [0, 1] into $E^n$ and if $\mathcal{A}$ is the class of all subsets of [0, 1] which are the union of a finite number of disjoint intervals, then the set $\{\int_E f(t)\,dt : E \in \mathcal{A}\}$ is convex. This new result allows us to take full advantage of the geometrical content of the proof given in Halkin²⁴ without using any measure theoretical results and, correspondingly, without introducing measurable control functions which are mathematical artifacts completely out of place in control theory.
Geometry is at the center of this chapter. The first step in this direction is
the principle of optimal evolution (Halkin) which states that: “Every event of an optimal process belongs to the boundary of the set of possible events.” The principle of optimal evolution transforms the given optimization problem into a purely topological (geometrical) problem. We have solved this topological problem by proving that every trajectory belonging to the boundary of the set of possible events satisfies a maximum principle of the Pontryagin type (Halkin).
6.1 Dynamical Polysystem
6.11 System Evolution
We shall consider a system whose state is described by a vector $x = (x_1, x_2, \ldots, x_n)$ of an n-dimensional Euclidean space $E^n$ and whose evolution is given by the differential equation
$$\dot{x} = f(x, u, t) \qquad (\dot{x} = dx/dt) \tag{6.1}$$
where $u = (u_1, \ldots, u_m)$ is an m-dimensional control vector of a Euclidean space $E^m$ and $f(x, u, t) = (f_1(x, u, t), f_2(x, u, t), \ldots, f_n(x, u, t))$ is a given n-dimensional vector valued function of $x$, $u$, and of the time $t$. In this chapter we shall assume that we have selected an orthonormal base in $E^n$ and that the Euclidean length of a vector $x = (x_1, x_2, \ldots, x_n)$ is given by $|x| = (\sum_{i=1}^{n} (x_i)^2)^{1/2}$. Similarly, we assume that we have selected an orthonormal base in $E^m$ and that the Euclidean length of a vector $u = (u_1, u_2, \ldots, u_m)$ is given by $|u| = (\sum_{i=1}^{m} (u_i)^2)^{1/2}$. We assume that the vector valued function $f(x, u, t)$ is defined for all† $x \in E^n$, all $u \in E^m$, and all‡ $t \in [0, 1]$. Moreover, we assume that the function $f(x, u, t)$ is twice§ continuously differentiable with respect to $x$, continuous with respect to $u$, and piecewise continuous with respect to $t$. More precisely, we assume that there exist a finite set¶ $\{t_0, t_1, \ldots, t_k\} \subset [0, 1]$ with $t_0 = 0 < t_1 < \cdots < t_k = 1$ and a finite collection of vector valued functions $\{f_1(x, u, t), f_2(x, u, t), \ldots, f_k(x, u, t)\}$ such that for each $i = 1, 2, \ldots, k$
† If $A$ is a set then “$a \in A$” means “$a$ is an element of $A$” and “$B \subset A$” means “$B$ is a subset of $A$.”
‡ If $a$ and $b$ are two real numbers with $a \le b$ then $[a, b]$ denotes the closed interval from $a$ to $b$, i.e., the set of all real numbers $t$ such that $a \le t \le b$. Similarly, $(a, b)$ denotes the open interval from $a$ to $b$, i.e., the set of all real numbers $t$ such that $a < t < b$.
§ Most of the results of the theory of optimal control, for instance the maximum principle of Pontryagin, can be proved by assuming only that $f(x, u, t)$ is once continuously differentiable with respect to $x$ instead of twice (see Halkin²⁴). However, these proofs are greatly simplified by making the present assumption, which is valid for all known practical applications.
¶ The set denoted by $\{t_0, t_1, \ldots, t_k\}$ is the set whose elements are $t_0, t_1, \ldots,$ and $t_k$.
(i) the vector valued function $f_i(x, u, t)$ and all its first and second partial derivatives with respect to $x$ are defined and continuous with respect to $x$, $u$, and $t$ for all $x \in E^n$, all $u \in E^m$, and all $t \in [t_{i-1}, t_i]$;
(ii) $f(x, u, t) = f_i(x, u, t)$ for all $x \in E^n$, all $u \in E^m$, and all $t \in (t_{i-1}, t_i)$.
6.12 Control Functions
A bounded† subset $\Omega$ of $E^m$ is given. This set $\Omega$ is called the set of admissible control vectors. Let $F$ be the class of all piecewise continuous functions from [0, 1] into $\Omega$. This means that each function in the class $F$ is continuous at every point of (0, 1) with the exception of at most a finite number of points where it has finite right‡ and left limits, and that it has a finite right limit at the point 0 and a finite left limit at the point 1. The preceding definition implies, in particular, that every function in the class $F$ is bounded. The class $F$ is called the class (or the set) of admissible control functions. We shall use script letters $\mathcal{U}, \mathcal{V}, \mathcal{W}, \ldots$ to denote functions in the class $F$. It is understood that the function $\mathcal{U}$ is the function whose value at the time $t$ is $u(t)$; similarly, the function $\mathcal{V}$ is the function whose value at the time $t$ is $v(t)$, etc. We shall reserve the symbols $u, v, w, \ldots$ to denote vectors in the space $E^m$. From the definition given above we see immediately that the class $F$ of control functions has the following two fundamental properties: (1) if $v \in \Omega$ then the function $\mathcal{V}$ defined by $v(t) = v$ for all $t \in [0, 1]$ belongs to the class $F$;
† A subset $\Omega$ of $E^m$ is bounded if there exists a constant $k$ such that for all $u \in \Omega$ we have $|u| \le k$.
‡ Let $f(t)$ be a function defined on some open interval $(a, b)$. The right limit of the function $f(t)$ at the point $a$ is, whenever it exists,
$$\lim_{\varepsilon \to 0,\ \varepsilon > 0} f(a + \varepsilon)$$
Similarly, the left limit of the function $f(t)$ at the point $b$ is, whenever it exists,
$$\lim_{\varepsilon \to 0,\ \varepsilon < 0} f(b + \varepsilon)$$
The function $f(t)$ is continuous at a point $\tau \in (a, b)$ if its right and left limits exist and are finite and equal at the point $\tau$. The function $f(t)$ has a jump of the first kind at a point $\tau \in (a, b)$ if its right and left limits exist and are finite but not equal at the point $\tau$. If the function $f(t)$ has a finite right limit at the point $\tau$ which is equal to the value of the function $f(t)$ at the point $\tau$, we say that the function $f(t)$ assumes its right limit at the point $\tau$. A similar convention holds for a left limit.
and (2) if $\mathcal{U}_1, \mathcal{U}_2 \in F$ and $\tau \in (0, 1)$ then the function $\mathcal{V}$ satisfying the relations†
$$v(t) = u_1(t) \quad \text{for all } t \in [0, \tau)$$
$$v(t) = u_2(t) \quad \text{for all } t \in (\tau, 1]$$
belongs also to the class $F$. With the help of the second of these properties we may prove easily that if $\mathcal{U}_1$ and $\mathcal{U}_2 \in F$ and if $A$ is a subset of [0, 1] which is the union of a finite number of disjoint intervals then the function $\mathcal{V}$ satisfying the relations‡
$$v(t) = u_1(t) \quad \text{for all } t \in [0, 1] - A$$
$$v(t) = u_2(t) \quad \text{for all } t \in A$$
belongs also to the class $F$. If $\mathcal{U} \in F$ then the set $\Theta(\mathcal{U})$ will be defined as the set of points in (0, 1) at which the function $\mathcal{U}$ and the function $f(x, u, t)$ are continuous with respect to $t$. Since, by definition, every function $\mathcal{U}$ in the class $F$ and the function $f(x, u, t)$ are piecewise continuous with respect to $t$, it follows immediately that there is only a finite number of points on the interval [0, 1] which do not belong to the set $\Theta(\mathcal{U})$ and that the set $\Theta(\mathcal{U})$ is open.§
6.13 Trajectories
If $\mathcal{U}$ is a control function in the class $F$ we shall denote by $x(t; \mathcal{U})$ a continuous function of $t$, differentiable over $\Theta(\mathcal{U})$, and such that‖
(i) $\dot{x}(t; \mathcal{U}) = f(x(t; \mathcal{U}), u(t), t)$ for all $t \in \Theta(\mathcal{U})$    (6.2)
(ii) $x(0; \mathcal{U}) = 0$    (6.3)
From the theory of ordinary differential equations we know that for every $\mathcal{U} \in F$ we have one and only one of the following two possibilities:
(i) the function $x(t; \mathcal{U})$ exists and is unique and bounded over the interval [0, 1];
† The set $[0, \tau)$ is the set of all $t$ such that $0 \le t < \tau$. Similarly, the set $(\tau, 1]$ is the set of all $t$ such that $\tau < t \le 1$. The set $[0, \tau)$ is sometimes called the left-closed right-open interval from 0 to $\tau$. Similarly, $(\tau, 1]$ is called the left-open right-closed interval from $\tau$ to 1.
‡ If $A$ and $B$ are two sets then $A - B$ denotes the set of all points of $A$ which do not belong to $B$.
§ A subset $A$ of the real line is open if for every $a \in A$ there is an $\varepsilon > 0$ such that any real number $b$ with $|a - b| < \varepsilon$ belongs also to the set $A$.
‖ There is no loss of generality in assuming that the initial value of the state vector $x$ is zero. This assumption greatly simplifies the notation and can always be made after an appropriate translation of the coordinate system.
(ii) there exists a $\tau \in (0, 1]$ such that the function $x(t; \mathcal{U})$ exists and is unique and bounded for every $t \in [0, \tau^*]$ with $\tau^* < \tau$ and is unbounded as $t$ tends to $\tau$.
When this is the case, $\tau$ is called a finite escape time of the state variable $x$ for the control function $\mathcal{U}$. Many developments in the theory of optimal control are greatly simplified if we may assume a priori that the possibility of a finite escape time of the state variable $x$ is ruled out for any control function $\mathcal{U}$ in the class $F$. This can be done in many ways. In this chapter we assume that there exists a constant $M < +\infty$ such that
$$|f(x, u, t)| \le M(|x| + 1) \tag{6.4}$$
for all $x \in E^n$, all $u \in \Omega$, and all $t \in [0, 1]$. In the preceding relation $|x|$ denotes the Euclidean norm of the vector $x$, i.e., $|x| = (\sum_{i=1}^{n} (x_i)^2)^{1/2}$. The final results given in this chapter are valid even when the assumption (6.4) cannot be made, but the corresponding proofs are a little longer. (These proofs are given in Halkin.) With the help of assumption (6.4) it is easy to prove† that for every control function $\mathcal{U}$ in the class $F$ the function $x(t; \mathcal{U})$ exists, is unique, and is bounded over the interval [0, 1]. Moreover, it is easy to prove that the functions $x(t; \mathcal{U})$ are uniformly bounded‡ over [0, 1] for all $\mathcal{U} \in F$. The vector valued function $x(t; \mathcal{U})$ will be called the trajectory corresponding to the control function $\mathcal{U}$. Equation (6.1) together with the definition of the class $F$ of control functions and the various assumptions made in this section constitute what is called a dynamical polysystem (see Bushaw and Halkin²⁸).
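The definitions of Sec. 6.12 and 6.13 can be tried out numerically. The following is only a minimal sketch under assumptions that are not in the text: the scalar dynamics `f`, the two-piece control `control`, and the fourth-order Runge-Kutta integrator are all hypothetical choices made for illustration. The dynamics satisfy assumption (6.4) with $M = 1$ on $\Omega = [-1, 1]$, and the script checks the growth estimate $|x(t)| \le e^{Mt} - 1$, which follows from (6.4) and $x(0) = 0$ by a Gronwall-type argument and rules out a finite escape time.

```python
import math

M = 1.0  # growth constant of assumption (6.4) for the hypothetical f below

def f(x, u, t):
    # Hypothetical scalar dynamics: |sin x + u| <= |x| + 1 whenever |u| <= 1.
    return math.sin(x) + u

# A piecewise continuous control with one jump at t = 0.5, stored as
# (start, end, function) pieces; each piece is continuous on its closed interval.
control = [(0.0, 0.5, lambda t: 1.0), (0.5, 1.0, lambda t: -math.cos(t))]

def trajectory(f, control, steps_per_piece=200):
    """Integrate x' = f(x, u(t), t) from x(0) = 0 with RK4, piece by piece."""
    x = 0.0
    ts, xs = [0.0], [x]
    for start, end, u in control:
        h = (end - start) / steps_per_piece
        for k in range(steps_per_piece):
            t = start + k * h
            k1 = f(x, u(t), t)
            k2 = f(x + 0.5 * h * k1, u(t + 0.5 * h), t + 0.5 * h)
            k3 = f(x + 0.5 * h * k2, u(t + 0.5 * h), t + 0.5 * h)
            k4 = f(x + h * k3, u(t + h), t + h)
            x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
            ts.append(t + h)
            xs.append(x)
    return ts, xs

ts, xs = trajectory(f, control)
# Growth estimate implied by (6.4): |x(t)| <= exp(M t) - 1 (small numerical tolerance).
bound_ok = all(abs(x) <= math.exp(M * t) - 1 + 1e-9 for t, x in zip(ts, xs))
print("x(1) =", xs[-1], "| bound respected:", bound_ok)
```

Restarting the integrator at each discontinuity of the control mirrors the role of the set $\Theta(\mathcal{U})$: the differential equation is only required to hold away from the finitely many jump points, while the trajectory itself stays continuous across them.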
6.2 Optimization Problem
6.21 Terminal Constraints and Performance Criterion
We shall now define an optimization problem for the dynamical polysystem introduced in Sec. 6.1. In Sec. 6.1 we stated that the state vector $x$ is zero at the time $t = 0$. Let $r$ be some integer such that $0 \le r \le n - 1$. Let $\{s_1, s_2, \ldots, s_r\}$ be a given set of
† See Appendix A.
‡ A function $f(t)$ defined over the interval [0, 1] is bounded if there exists a $K < +\infty$ such that $|f(t)| < K$ for all $t \in [0, 1]$. Let $A$ be a given set and for every $\alpha \in A$ let $f_\alpha(t)$ be a function which is defined over the interval [0, 1]. The functions $f_\alpha(t)$ are uniformly bounded over [0, 1] for all $\alpha \in A$ if there exists a $K < +\infty$ such that $|f_\alpha(t)| < K$ for all $t \in [0, 1]$ and all $\alpha \in A$. Warning: in the case of a finite collection of functions it is true that the collection is uniformly bounded if each of its functions is bounded, but in the case of an infinite collection of functions it is quite possible for each function of the collection to be bounded without the collection being uniformly bounded.
real numbers. We shall prescribe the final values (at the time $t = 1$) of the first $r$ coordinates of the state vector $x$ to be $\{s_1, s_2, \ldots, s_r\}$ and we shall require the final value (at the time $t = 1$) of $x_n$, the nth coordinate of the state vector $x$, to be maximum. The optimization problem is then formally stated as follows: we are given a set†
$$S = \{x : x \in E^n,\ x_i = s_i \text{ for } i = 1, \ldots, r\} \tag{6.5}$$
and we want to find a control function $\mathcal{V}$ in the class $F$ such that
(1) $x(1; \mathcal{V}) \in S$    (6.6)
(2) for all $\mathcal{U} \in F$ such that $x(1; \mathcal{U}) \in S$    (6.7)
shall hold the relation
$x_n(1; \mathcal{V}) \ge x_n(1; \mathcal{U})$    (6.8)
The control function $\mathcal{V}$ satisfying the preceding conditions is called an optimal control function and the corresponding trajectory $x(t; \mathcal{V})$ is called an optimal trajectory.
6.22 Remarks on the Structure of the Function $f(x, u, t)$
We should mention here two fundamental differences between the statement of the optimization problem given above and the problem treated by Pontryagin and his associates.⁴³ In our formulation we allow the function $f(x, u, t)$ to be dependent on the variable $x_n$ which is to be maximized at time $t = 1$. We are allowing this dependence for practical and aesthetic reasons: to make the assumption that $f(x, u, t)$ is independent of $x_n$ would lead to very little simplification of the subsequent developments but would nevertheless break the symmetry among the state variables. Moreover, many practical problems show a natural dependence of the differential equations on the variable to be maximized: in the classical problem of the maximization of the payload of a rocket, the evolution of the rocket depends on its mass at every intermediate instant of time. Also, in contradistinction to Pontryagin's formulation, we do not require the differential system to be time independent. It is well known that in the case of a time varying differential system which is continuously differentiable
† We use the notation $\{x : P(x)\}$ to represent the set of all elements $x$ such that the condition $P(x)$ is satisfied. With this convention the set $S$ defined by relation (6.5) is the set of all $x$ in $E^n$ such that $x_i$, the ith coordinate of $x$, is equal to $s_i$ for each $i = 1, \ldots, r$.
with respect to $t$ we may add a new artificial state variable and transform the problem corresponding to this time varying system into the problem treated by Pontryagin and his associates. However, in the more general case of a time varying system which is only piecewise continuous with respect to $t$, this transformation cannot be performed, Pontryagin's results do not apply, and the new results contained in this chapter are needed.
6.23 Remarks on the Evolution Interval
In this chapter we assume that $t = 0$ is the initial time and that $t = 1$ is the terminal time. Any problem with fixed initial and terminal times can easily be transformed into a problem of the type treated in this chapter.
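One way to carry out this transformation, given here only as a sketch, combines an affine change of time with the translation of coordinates mentioned in Sec. 6.13. Suppose the original problem is $dx/d\tau = g(x, u, \tau)$ on $[\tau_0, \tau_1]$ with $x(\tau_0) = x_0$. Setting $t = (\tau - \tau_0)/(\tau_1 - \tau_0)$ and $y = x - x_0$ gives $\dot{y} = (\tau_1 - \tau_0)\, g(y + x_0, u, \tau_0 + (\tau_1 - \tau_0) t)$ with $y(0) = 0$ on [0, 1]. The names `normalize`, `rk4_endpoint`, and the test dynamics `g` below are hypothetical and serve only to check the change of variables on a case with a known solution.

```python
import math

def normalize(g, x0, tau0, tau1):
    """Return f(y, u, t) on [0, 1] with y(0) = 0, equivalent to
    x' = g(x, u, tau) on [tau0, tau1] with x(tau0) = x0 (scalar state here)."""
    span = tau1 - tau0
    def f(y, u, t):
        # Chain rule: dy/dt = (dtau/dt) * dx/dtau, with x = y + x0.
        return span * g(y + x0, u, tau0 + span * t)
    return f

def rk4_endpoint(f, u, steps=1000):
    """Integrate y' = f(y, u(t), t) from y(0) = 0 over [0, 1]; return y(1)."""
    h = 1.0 / steps
    y = 0.0
    for k in range(steps):
        t = k * h
        k1 = f(y, u(t), t)
        k2 = f(y + 0.5 * h * k1, u(t + 0.5 * h), t + 0.5 * h)
        k3 = f(y + 0.5 * h * k2, u(t + 0.5 * h), t + 0.5 * h)
        k4 = f(y + h * k3, u(t + h), t + h)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

# Hypothetical original problem: x' = x + u on [1, 3], x(1) = 2, with u = 0,
# whose exact solution is x(3) = 2 * exp(2).
g = lambda x, u, tau: x + u
f = normalize(g, x0=2.0, tau0=1.0, tau1=3.0)
y1 = rk4_endpoint(f, u=lambda t: 0.0)
x_final = y1 + 2.0  # undo the translation of coordinates
print(x_final, 2 * math.exp(2))
```

Under this change of variables a control is expressed as a function of the normalized time $t$, and prescribed terminal values $s_i$ would have to be shifted by the corresponding coordinates of $x_0$. Piecewise continuity with respect to time is preserved because the time change is affine.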
6.3 The Principle of Optimal Evolution
6.31 Reachable Sets
In this section we introduce some geometrical concepts which are of fundamental importance to the theory of optimal control. If $t \in [0, 1]$ and if $x$ is a state such that there exists a control function $\mathcal{U}$ in the class $F$ with $x(t; \mathcal{U}) = x$, we shall say that the state $x$ is reachable at the time $t$. For every $t \in [0, 1]$ we define the set $W(t)$ as the set of all states which are reachable at the time $t$, i.e.,
$$W(t) = \{x(t; \mathcal{U}) : \mathcal{U} \in F\} \tag{6.9}$$
We shall show here that the study of the sets $W(t)$ and of their boundaries† $\partial W(t)$ is closely related to the solution of the optimization problem stated in Sec. 6.2. Let us first consider the intersection‡ $S \cap W(1)$ of the terminal set $S$ with $W(1)$, the reachable set at the time $t = 1$. If $S \cap W(1)$ is empty, which is denoted $S \cap W(1) = \emptyset$, the problem has no feasible solution and a fortiori no optimal solution. If $S \cap W(1)$ is not empty and if there is a point $\hat{x}$ of $S \cap W(1)$ for which $\hat{x}_n$ takes the greatest value,§ then our problem has an
† If $A$ is a certain subset of the n-dimensional Euclidean space $E^n$ then $\partial A$, the boundary of the set $A$, is the set of all points $a$ in $E^n$ such that for every $\varepsilon > 0$ there exist a point $b$ in $A$ and a point $c$ not in $A$ such that $|a - b| < \varepsilon$ and $|a - c| < \varepsilon$. Note that in general a boundary point of a set is not necessarily an element of that set. A closed set, however, contains all its boundary points. This last property is sometimes used to define a closed set.
‡ If $A$ and $B$ are sets then $A \cap B$ is the intersection of the sets $A$ and $B$, i.e., the set $A \cap B$ is the set of all elements which belong to both sets $A$ and $B$. Similarly, $A \cup B$ is the union of the sets $A$ and $B$, i.e., the set $A \cup B$ is the set of all elements which belong to at least one of the sets $A$ and $B$.
§ From the assumptions of Sec. 6.1 we know that the set $W(1)$ is bounded. The set $S \cap W(1)$ is a fortiori bounded. However, the reader should realize that a bounded set
optimal solution, and there is a control function $\mathcal{V}$ in the class $F$ such that $x(1; \mathcal{V}) = \hat{x}$. From now on, in this chapter we shall assume that such an optimal solution exists and we shall try to derive as many of its properties as possible. We shall first note that $x(1; \mathcal{V})$ is a boundary point of $W(1)$ since otherwise there would exist a point $x^*$ in $W(1) \cap S$ with $x_n^* > x_n(1; \mathcal{V})$, which would contradict the optimality of the control function $\mathcal{V}$. Moreover, if† $N(x(t_1; \mathcal{V}), \varepsilon_1) \subset W(t_1)$ for some $t_1 \in [0, 1]$ and $\varepsilon_1 > 0$, then for every $t_2 \in [t_1, 1]$ there exists an $\varepsilon_2 > 0$ with $N(x(t_2; \mathcal{V}), \varepsilon_2) \subset W(t_2)$ since, from the smoothness of $f(x, u, t)$ assumed in Sec. 6.1, the point $x(t_2; \mathcal{V})$ is an interior‡ point of the set of all points reachable at the time $t_2$ with the same control function $\mathcal{V}$ from the points in $N(x(t_1; \mathcal{V}), \varepsilon_1)$ at the time $t_1$. The preceding results can be summarized as follows (Flügge-Lotz and Halkin,¹⁵ and Halkin):
Principle of Optimal Evolution. If $\mathcal{V}$ is an optimal control function, then for every $t \in [0, 1]$ the state $x(t; \mathcal{V})$ belongs to the boundary of the set $W(t)$.
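The principle can be checked by hand on a very small example, sketched below under assumptions chosen purely for illustration: the planar system $\dot{x}_1 = u_1$, $\dot{x}_2 = u_2$ with $|u_1| \le 1$ and $|u_2| \le 1$, for which $W(t)$ is the square $\max(|x_1|, |x_2|) \le t$, together with the hypothetical terminal constraint $x_1(1) = 0.3$ and the requirement that $x_2(1)$ be maximum. The helper names `integrate` and `on_boundary_of_W` are not from the text.

```python
def integrate(u, steps=100):
    """Trajectory of x1' = u1, x2' = u2 from x(0) = (0, 0).
    `u` maps a time t to a pair (u1, u2); it is sampled at step midpoints."""
    h = 1.0 / steps
    x1 = x2 = 0.0
    path = [(0.0, x1, x2)]
    for k in range(steps):
        u1, u2 = u((k + 0.5) * h)
        x1 += h * u1
        x2 += h * u2
        path.append(((k + 1) * h, x1, x2))
    return path

def on_boundary_of_W(t, x1, x2, tol=1e-9):
    # For this system W(t) is the square max(|x1|, |x2|) <= t, so a point lies
    # on the boundary exactly when max(|x1|, |x2|) equals t.
    return abs(max(abs(x1), abs(x2)) - t) <= tol

optimal = lambda t: (0.3, 1.0)      # reaches x1(1) = 0.3 with the largest possible x2(1) = 1
suboptimal = lambda t: (0.3, 0.5)   # also reaches x1(1) = 0.3, but with x2(1) = 0.5

opt_path = integrate(optimal)
sub_path = integrate(suboptimal)
opt_stays_on_boundary = all(on_boundary_of_W(*p) for p in opt_path)
sub_leaves_boundary = not any(on_boundary_of_W(*p) for p in sub_path[1:])
print(opt_stays_on_boundary, sub_leaves_boundary)
```

The optimal trajectory $(0.3t, t)$ runs along the upper edge of the growing square at every instant, whereas the admissible but non-optimal trajectory $(0.3t, 0.5t)$ falls into the interior of $W(t)$ as soon as $t > 0$.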
6.32 Propagation Analogy
If we drop a pebble on the surface of a lake, we create a certain perturbation which will propagate with time. The boundary of the time varying set $W(t)$ is very similar to the wavefront associated with that propagation. In the remainder of this section we shall assume that for every $t \in (0, 1]$ the sets $W(t)$ have smooth boundaries§ and, in particular, have tangent hyperplanes at
$S \cap W(1)$ has not necessarily a point $\hat{x}$ for which $\hat{x}_n$ takes the greatest value. This would be the case if the set S would be closed. The very important question of the closure of the sets S will not be considered in this chapter. The reader is referred to Filippov,¹⁴ Roxin, Warga,⁴⁷ LaSalle,³⁴ Neustadt, and Halkin.
† The set $N(x(t_1; \mathcal{V}), \varepsilon_1)$ is the set of all points with a distance to $x(t_1; \mathcal{V})$ less than $\varepsilon_1$.
‡ By definition the point $a$ is interior to a set $A$ if there is an $\varepsilon > 0$ with $N(a, \varepsilon) \subset A$. A precise justification of the property stated in the text is given in Proposition A.8 of Appendix A.
§ For many classical problems in the calculus of variations this assumption is indeed valid. (The proof of that fact follows easily from the classical theory of Hamilton-Jacobi partial differential equations.) However, this assumption is not valid for a large class of problems in optimal control. In the case of the system
$$\dot{x}_1 = u_1, \qquad \dot{x}_2 = u_2$$
with the initial condition
$$x_1(0) = x_2(0) = 0$$
and the set of admissible control
$$\Omega = \{u = (u_1, u_2) : |u_1| \le 1 \text{ and } |u_2| \le 1\}$$
we have immediately
$$W(t) = \{x = (x_1, x_2) : |x_1| \le t \text{ and } |x_2| \le t\}$$
We see that the set $W(t)$ is a square and hence has a boundary which is not smooth.
each of their boundary points. This assumption allows us to state and prove in a very natural way some fundamental properties of optimal solutions. In later sections the same results will be derived without making these assumptions, but this more general derivation will be longer, as one could expect. Let $\mathcal{V}$ be an optimal control function and let $x(t; \mathcal{V})$ be the corresponding optimal trajectory. By the principle of optimal evolution we have
$$x(t; \mathcal{V}) \in \partial W(t) \quad \text{for all } t \in [0, 1] \tag{6.10}$$
Let $\lambda(t)$ be a nonzero outward normal to the set $\partial W(t)$ at the point $x(t; \mathcal{V})$. The length of the vector $\lambda(t)$, which is not yet defined, will be determined later up to a multiplicative factor. This means that later we shall add some restrictions on the vector function $\lambda(t)$ such that the knowledge of the length of $\lambda(\tau)$ for some $\tau \in [0, 1]$ will be sufficient to derive the length of $\lambda(t)$ for all $t \in [0, 1]$. As in geometrical optics we can define the velocity of the wavefront $\partial W(t)$ at the point $x(t; \mathcal{V})$. This wavefront velocity is parallel to the vector $\lambda(t)$. We shall denote by $s(t)$ the wavefront speed, i.e., the length of the wavefront velocity. We now have two important results†:
(1) For every propagation issued from $x(t; \mathcal{V})$ at the time $t$ the projection on the normal vector $\lambda(t)$ of the propagation velocity is at most the wavefront speed $s(t)$, i.e.,
$$f(x(t; \mathcal{V}), u, t) \cdot \frac{\lambda(t)}{|\lambda(t)|} \le s(t) \quad \text{for all } u \in \Omega \text{ and all } t \in \Theta(\mathcal{V}) \tag{6.11}$$
(2) The propagation issued from $x(t; \mathcal{V})$ at the time $t$ and corresponding to the optimal trajectory must “keep up” with the wavefront $\partial W(t)$ (see the principle of optimal evolution), hence we have
$$f(x(t; \mathcal{V}), v(t), t) \cdot \frac{\lambda(t)}{|\lambda(t)|} = s(t) \quad \text{for all } t \in \Theta(\mathcal{V}) \tag{6.12}$$
6.33 A Preview of the Maximum Principle
We define a function $H(x, u, t, \lambda)$ by the relation
$$H(x, u, t, \lambda) = f(x, u, t) \cdot \lambda \tag{6.13}$$
† We recall that the scalar product of two vectors $\alpha$ and $\beta$ is denoted by $\alpha \cdot \beta$ and that the norm, or Euclidean length, of the vector $\lambda$ is denoted $|\lambda|$, i.e., $|\lambda| = (\sum_{i=1}^{n} (\lambda_i)^2)^{1/2}$, where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the components of the vector $\lambda$.
MATHEMATICAL FOUNDATIONS OF SYSTEM OPTIMIZATION
209
From the relations (6.1 1) and (6.12) we conclude that along the optimal trajectory x(t; V )we have
The inequality (6.14) is a preview of the maximum principle which we shall state in detail in the next section. 6.4
Statement of the Maximum Principle
In this section we shall state the maximum principle of Pontryagin corresponding to the optimization problem defined in Sec. 6.2. We first define the function
$$H(x, u, t, \lambda) = f(x, u, t) \cdot \lambda \tag{6.15}$$
where $x$, $u$, and $t$ are the variables introduced in Sec. 6.1 and where $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_n)$ is an element of the n-dimensional space $E^n$. By $f(x, u, t) \cdot \lambda$ we mean the scalar product of the vectors $f(x, u, t)$ and $\lambda$, i.e., $f(x, u, t) \cdot \lambda = \sum_{i=1}^{n} f_i(x, u, t) \lambda_i$. The function $H(x, u, t, \lambda)$ is called a Hamiltonian. To the vector $\lambda$ and the vector valued function $\lambda(t)$, defined later, are associated many different names: adjoint vector, momentum vector, costate vector, Lagrange multipliers, sensitivity parameters, Green functions, influence functions, etc.
Pontryagin Maximum Principle. If Y is an optimal control function, then there exists a vector valued function I ( t ) defined and continuous over [0, I ] differentiable over O ( Y ) ,and nonidentically zero such that: ti)
H(x(t; V ) ,v(t), 2, W t ) )2 H(x(t; Y ) ,u, t, Vf)) for all t E O ( V ) andall U E R
(6.16)
(6.17) for all t E O ( V ) (iii) &(l)
=0
(iv) A,(l) 2 0
for
i=r
+ 1, ... , n - 1
(6.18) (6.19)
In relation (6.17) we denote by dH(x, u, f , 1)/ax the vector whose ith component is aH(x, u, t, I ) / a x , . By aH(x, v(t), t, I ( t ) ) / d ~ l ~ = ~we ( ~ mean ;~) the vector aH(x, u, t, I)/ax evaluated at the point x = x(t; Y ) . The condition (6.18) is sometimes called the transversality condition.
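As a rough illustration of condition (6.16), the sketch below evaluates the Hamiltonian of relation (6.15) and selects, from a finite set of admissible control values, one that maximizes it for a given state and adjoint vector. The double-integrator dynamics, the two-element control set, and all function names are assumptions made for this example only; none of them comes from the text.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def hamiltonian(f: Callable[[Vector, float, float], Vector],
                x: Vector, u: float, t: float, lam: Vector) -> float:
    """H(x, u, t, lam) = f(x, u, t) . lam, as in relation (6.15)."""
    return sum(fi * li for fi, li in zip(f(x, u, t), lam))


def maximizing_control(f, x: Vector, t: float, lam: Vector,
                       omega: Sequence[float]) -> float:
    """Return an element of the finite set omega that maximizes H."""
    return max(omega, key=lambda u: hamiltonian(f, x, u, t, lam))


def double_integrator(x: Vector, u: float, t: float) -> Vector:
    """Hypothetical dynamics: x1' = x2, x2' = u."""
    return (x[1], u)


omega = (-1.0, 1.0)                      # hypothetical admissible control values
u_star = maximizing_control(double_integrator, (0.0, 0.0), 0.0, (1.0, -2.0), omega)
```

Here $H = x_2\lambda_1 + u\lambda_2$, so with $\lambda_2 < 0$ the maximizing value is the smallest admissible $u$. A finite control set keeps the maximization a plain search; a continuous $\Omega$ would need an optimizer instead.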
210
HUBERT HALKIN
We have stated above the usual form of the maximum principle corresponding to the optimization problem given in Sec. 6.2. It is sometimes interesting to consider a purely geometrical form of the maximum principle in terms of some properties of the set $W(1)$ defined in Sec. 6.3 (see Halkin$^{24}$).

Geometrical Maximum Principle. If $x(1;\mathcal{V})$ is a boundary point of the set $W(1)$, then there exists a vector valued function $\lambda(t)$, defined and continuous over $[0,1]$, differentiable over $\Theta(\mathcal{V})$, and not identically zero, such that the relations (6.16) and (6.17) are satisfied.

6.5 Proof of the Maximum Principle for an Elementary Dynamical Polysystem

6.51 An Elementary Dynamical Polysystem
In this section we shall assume that the vector valued function $f(x,u,t)$ is independent of the state vector $x$ and takes the form $\varphi(u,t)$, i.e., we assume that the evolution of the dynamical polysystem under consideration is described by the equation

$$\dot x = \varphi(u,t) \tag{6.20}$$

In conformity with the assumptions made in Sec. 6.1 for the vector valued function $f(x,u,t)$, we shall assume in the present section that the vector valued function $\varphi(u,t)$ is continuous with respect to $u$ and piecewise continuous with respect to $t$. In other words, we assume that there exist a finite set $\{t_0, t_1, \dots, t_k\} \subset [0,1]$ with $t_0 = 0 < t_1 < \dots < t_k = 1$ and a finite collection of vector valued functions $\{\varphi_1(u,t), \varphi_2(u,t), \dots, \varphi_k(u,t)\}$ such that for each $i = 1, 2, \dots, k$: (i) the vector valued function $\varphi_i(u,t)$ is defined and continuous with respect to $u$ and $t$ for all $u \in E^m$ and all $t \in [t_{i-1}, t_i]$; (ii) $\varphi(u,t) = \varphi_i(u,t)$ for all $u \in E^m$ and all $t \in (t_{i-1}, t_i)$. We shall also assume that the vector valued function $\varphi(u,t)$ is uniformly bounded over $[0,1]$ for all $u \in \Omega$, i.e., that there exists an $M < +\infty$ such that

$$|\varphi(u,t)| \le M \quad \text{for all } u \in \Omega \text{ and all } t \in [0,1] \tag{6.21}$$

The relation (6.21) is a particular case of the relation (6.4). Under these assumptions the trajectory $x(t;\mathcal{U})$ takes the simple form

$$x(t;\mathcal{U}) = \int_0^t \varphi(u(\tau),\tau)\,d\tau$$

and we have immediately

$$|x(t;\mathcal{U})| \le M \quad \text{for all } \mathcal{U} \in F \text{ and all } t \in [0,1]$$

where $M$ is the real number introduced in relation (6.21).
6.52 Reachable Set

We define† the set $W$ of all states which can be reached at the time $t = 1$ from the initial state $x = 0$ at the time $t = 0$ with some control function in the class $F$. In other words we have

$$W = \{x(1;\mathcal{U}) : \mathcal{U} \in F\} \tag{6.22}$$

or equivalently

$$W = \left\{\int_0^1 \varphi(u(t),t)\,dt : \mathcal{U} \in F\right\} \tag{6.23}$$

Let $\mathcal{V}$ be an optimal control function. We shall define the set $S^+$ of all states in $S$ with an $n$th coordinate greater than $x_n(1;\mathcal{V})$. In other words

$$S^+ = \{x : x \in S,\ x_n > x_n(1;\mathcal{V})\} \tag{6.24}$$

The sets $S^+$ and $W$ have no point in common, since the existence of such a point would contradict the optimality of the control function $\mathcal{V}$. Moreover, the sets $S^+$ and $W$ are convex. We recall that a set $A$ is convex if for every $a$ and $b \in A$ and every $\mu \in [0,1]$ we have $\mu a + (1-\mu)b \in A$. The convexity of the set $S^+$ is easy to verify. The proof of the convexity of the set $W$ will be given later. Since the sets $S^+$ and $W$ are convex and have no point in common, there is at least one hyperplane separating them. This means that there exist a unit vector $\lambda$, the normal to this hyperplane, and a real number $h$, the distance from the origin to this hyperplane, such that:

$$x\cdot\lambda \le h \quad \text{for every } x \in W \tag{6.25}$$

$$x\cdot\lambda \ge h \quad \text{for every } x \in S^+ \tag{6.26}$$

If we denote by $\bar S^+$ the closure‡ of the set $S^+$, then the relation (6.26) may be strengthened to

$$x\cdot\lambda \ge h \quad \text{for every } x \in \bar S^+ \tag{6.27}$$

By construction the point $x(1;\mathcal{V})$ belongs to both sets $W$ and $\bar S^+$, hence we have

$$x(1;\mathcal{V})\cdot\lambda \le h \le x(1;\mathcal{V})\cdot\lambda \tag{6.28}$$

which implies

$$x(1;\mathcal{V})\cdot\lambda = h \tag{6.29}$$

We have then finally

$$x\cdot\lambda \le x(1;\mathcal{V})\cdot\lambda \quad \text{for every } x \in W \tag{6.30}$$

and

$$x\cdot\lambda \ge x(1;\mathcal{V})\cdot\lambda \quad \text{for every } x \in \bar S^+ \tag{6.31}$$

† The set $W$ is identical with the set $W(1)$ introduced in Sec. 6.3. In the present section we prefer to use the simpler notation $W$ instead of $W(1)$, since we shall not consider $W(t)$ for any $t$ besides $t = 1$.

‡ The closure of a set is the smallest closed set containing it. In this particular case $\bar S^+ = \{x : x \in S,\ x_n \ge x_n(1;\mathcal{V})\}$. The most useful property of a closed set $B$ is the following: if $a \in E^n$ and if $a_1, a_2, \dots$ is a sequence in $B$ such that $\lim_{i\to\infty}|a_i - a| = 0$, then $a \in B$.
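6.53 Maximum Principle

We shall prove that the constant nonzero vector $\lambda$ satisfies the four conditions of the maximum principle. We shall first prove by contradiction that condition (6.16) is satisfied, i.e., that

$$\varphi(v(t),t)\cdot\lambda \ge \varphi(u,t)\cdot\lambda \quad \text{for all } t \in \Theta(\mathcal{V}) \text{ and all } u \in \Omega \tag{6.32}$$

The relation (6.32) expresses the following simple fact: since the control function $\mathcal{V}$ is chosen such that $x(1;\mathcal{V})$ is as far as possible in the direction $\lambda$, the projection of the derivative of $x(t;\mathcal{V})$ on the vector $\lambda$ must be as large as possible for all $t$ for which this derivative is defined, i.e., for all $t \in \Theta(\mathcal{V})$. If relation (6.32) does not hold, then there are a $u \in \Omega$, a $t \in \Theta(\mathcal{V})$, and an $\eta > 0$ such that

$$\varphi(v(t),t)\cdot\lambda < \varphi(u,t)\cdot\lambda - 2\eta \tag{6.33}$$

We shall show that relation (6.33) leads to a contradiction. The set $\Theta(\mathcal{V})$ is open, hence if $t \in \Theta(\mathcal{V})$ then there is an $\bar\varepsilon > 0$ such that $\tau \in \Theta(\mathcal{V})$ for all $\tau$ with $|\tau - t| \le \bar\varepsilon$. By definition the functions $\varphi(v(\tau),\tau)$ and $\varphi(u,\tau)$ are continuous functions of $\tau$ over the interval $[t-\bar\varepsilon, t+\bar\varepsilon]$, hence, from relation (6.33), there exists an $\varepsilon$ with $0 < \varepsilon \le \bar\varepsilon$ such that

$$\varphi(v(\tau),\tau)\cdot\lambda \le \varphi(u,\tau)\cdot\lambda - \eta \quad \text{for all } \tau \in [t-\varepsilon, t+\varepsilon] \tag{6.34}$$

If we define a function $\mathcal{V}^* \in F$ by the relations

$$v^*(\tau) = v(\tau) \quad \text{if } \tau \in [0,1] - [t-\varepsilon, t+\varepsilon]$$

$$v^*(\tau) = u \quad \text{if } \tau \in [t-\varepsilon, t+\varepsilon]$$

then we obtain the relation

$$\left(\int_0^1 \varphi(v^*(t),t)\,dt\right)\cdot\lambda \ge \left(\int_0^1 \varphi(v(t),t)\,dt\right)\cdot\lambda + 2\eta\varepsilon \tag{6.35}$$

i.e.,

$$x(1;\mathcal{V}^*)\cdot\lambda \ge x(1;\mathcal{V})\cdot\lambda + 2\eta\varepsilon \tag{6.36}$$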
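By construction we have $x(1;\mathcal{V}^*) \in W$, hence the relation (6.36) contradicts the relation (6.30). The condition (6.17) is trivially satisfied in the case of the elementary problem considered in this section, since the Hamiltonian does not depend on $x$ and the vector $\lambda$ is constant. Let $e_i$ be the unit vector parallel to the $i$th axis. For every $i = r+1, \dots, n-1$ we have

$$x(1;\mathcal{V}) + e_i \in \bar S^+ \quad \text{and} \quad x(1;\mathcal{V}) - e_i \in \bar S^+ \tag{6.37}$$

which implies

$$(x(1;\mathcal{V}) + e_i)\cdot\lambda \ge x(1;\mathcal{V})\cdot\lambda \tag{6.38}$$

and

$$(x(1;\mathcal{V}) - e_i)\cdot\lambda \ge x(1;\mathcal{V})\cdot\lambda \tag{6.39}$$

i.e.,

$$e_i\cdot\lambda = 0 \tag{6.40}$$

or

$$\lambda_i = 0 \quad \text{for } i = r+1, \dots, n-1 \tag{6.41}$$

The last relation proves condition (6.18). We have also

$$x(1;\mathcal{V}) + e_n \in \bar S^+ \tag{6.42}$$

hence

$$(x(1;\mathcal{V}) + e_n)\cdot\lambda \ge x(1;\mathcal{V})\cdot\lambda \tag{6.43}$$

i.e.,

$$e_n\cdot\lambda \ge 0 \tag{6.44}$$

or

$$\lambda_n \ge 0 \tag{6.45}$$

The last relation proves condition (6.19).

6.54 Proof of the Convexity of the Set W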
In Sec. 6.10 we shall prove the following result.

Proposition 10.2. If $f(t)$ is a piecewise continuous vector valued function defined over $[0,1]$ and if $\mathcal{A}$ is the class of all subsets of $[0,1]$ which are the union of a finite number of disjoint intervals, then the set

$$\left\{\int_0^1 f(t)\,\chi(E)\,dt : E \in \mathcal{A}\right\} \tag{6.46}$$

is convex. In the last relation $\chi(E)$ denotes the characteristic function of the set $E$, i.e., a function† which is one for all $t$ in $E$ and zero for all $t$ not in $E$.

† The reader should realize immediately that for every set $E$ in the class $\mathcal{A}$ the characteristic function $\chi(E)$ is piecewise continuous.
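We shall now prove the convexity of the set $W$ with the help of Proposition 10.2. Let $a$ and $b \in W$ and $\mu \in [0,1]$. We shall prove that $\mu a + (1-\mu)b \in W$. If $a$ and $b \in W$ then there are functions $\mathcal{U}_a$ and $\mathcal{U}_b$ in $F$ such that

$$a = \int_0^1 \varphi(u_a(t),t)\,dt \tag{6.47}$$

$$b = \int_0^1 \varphi(u_b(t),t)\,dt \tag{6.48}$$

Let $L$ be the set

$$L = \left\{\int_0^1 \big(\varphi(u_b(t),t) - \varphi(u_a(t),t)\big)\,\chi(E)\,dt : E \in \mathcal{A}\right\} \tag{6.49}$$

and $L^*$ be the set

$$L^* = \left\{\int_0^1 \varphi(u_a(t),t)\,dt + x : x \in L\right\} \tag{6.50}$$

The set $L$ is convex by Proposition 10.2, which implies immediately the convexity of the set $L^*$. For every $E \in \mathcal{A}$ the function $\mathcal{U}_E$ defined by the relations

$$u_E(t) = u_a(t) \quad \text{if } t \in [0,1] - E$$

$$u_E(t) = u_b(t) \quad \text{if } t \in E$$

belongs also to the class $F$. Moreover, for every $E \in \mathcal{A}$ we have

$$\varphi(u_E(t),t) = \varphi(u_a(t),t) + \big(\varphi(u_b(t),t) - \varphi(u_a(t),t)\big)\,\chi(E) \tag{6.51}$$

which implies

$$\int_0^1 \varphi(u_E(t),t)\,dt = \int_0^1 \varphi(u_a(t),t)\,dt + \int_0^1 \big(\varphi(u_b(t),t) - \varphi(u_a(t),t)\big)\,\chi(E)\,dt \tag{6.52}$$

We have then $L^* \subset W$. For $E = \emptyset$ we have $\mathcal{U}_E = \mathcal{U}_a$, and for $E = [0,1]$ we have $\mathcal{U}_E = \mathcal{U}_b$. The points $a$ and $b$ belong then to a convex subset $L^*$ of $W$. Hence $\mu a + (1-\mu)b \in W$.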
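The argument of Sec. 6.5 can be checked numerically on a small case. The following sketch, under assumptions of our own choosing (a two-dimensional $\varphi(u,t) = (u, ut)$, the control set $\{-1, 1\}$, a fixed direction `lam`, and a midpoint quadrature), builds the control that maximizes $\varphi(u,t)\cdot\lambda$ at each time, as in relation (6.32), and compares the resulting endpoint with endpoints reached by randomly chosen piecewise constant controls, as in relation (6.30). It is an illustration, not part of the proof.

```python
import random

OMEGA = (-1.0, 1.0)                 # hypothetical finite control set


def phi(u, t):
    """Hypothetical right-hand side phi(u, t) of relation (6.20), with n = 2."""
    return (u, u * t)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def endpoint(control, steps=200):
    """Midpoint-rule approximation of x(1; U), the integral of phi(u(t), t) over [0, 1]."""
    h = 1.0 / steps
    x = [0.0, 0.0]
    for k in range(steps):
        t = (k + 0.5) * h
        p = phi(control(t), t)
        x[0] += h * p[0]
        x[1] += h * p[1]
    return tuple(x)


def pointwise_maximizer(lam):
    """Control that maximizes phi(u, t) . lam at each time t."""
    return lambda t: max(OMEGA, key=lambda u: dot(phi(u, t), lam))


def piecewise_constant(values):
    """Control equal to values[i] on the i-th of len(values) equal subintervals."""
    n = len(values)
    return lambda t: values[min(int(t * n), n - 1)]


lam = (1.0, -2.0)
best = dot(endpoint(pointwise_maximizer(lam)), lam)

random.seed(0)
others = [
    dot(endpoint(piecewise_constant([random.choice(OMEGA) for _ in range(10)])), lam)
    for _ in range(50)
]
```

For this choice $\varphi(u,t)\cdot\lambda = u(1 - 2t)$, so the maximizing control switches from $1$ to $-1$ at $t = 1/2$, and no sampled control produces an endpoint farther in the direction `lam` than `best`.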
6.6 Proof of the Maximum Principle for a Linear Dynamical Polysystem

6.61 A Linear Dynamical Polysystem

In this section we shall assume that the function $f(x,u,t)$ has the particular form

$$f(x,u,t) = A(t)x + \varphi(u,t) \tag{6.53}$$
In conformity with the assumptions made in Sec. 6.1 for the function $f(x,u,t)$, we shall assume in the present section that the matrix valued function $A(t)$ is piecewise continuous with respect to $t$ and that the vector valued function $\varphi(u,t)$ is continuous with respect to $u$ and piecewise continuous with respect to $t$. In other words, we assume that there exist a finite set $\{t_0, t_1, \dots, t_k\} \subset [0,1]$ with $t_0 = 0 < t_1 < \dots < t_k = 1$, a finite collection of matrix valued functions $\{A_1(t), A_2(t), \dots, A_k(t)\}$, and a finite collection of vector valued functions $\{\varphi_1(u,t), \varphi_2(u,t), \dots, \varphi_k(u,t)\}$ such that for each $i = 1, 2, \dots, k$: (i) the matrix valued function $A_i(t)$ is defined and continuous with respect to $t$ for all $t \in [t_{i-1}, t_i]$; (ii) the vector valued function $\varphi_i(u,t)$ is defined and continuous with respect to $u$ and $t$ for all $u \in E^m$ and all $t \in (t_{i-1} - \varepsilon, t_i + \varepsilon)$; (iii) $A(t) = A_i(t)$ for all $t \in (t_{i-1}, t_i)$; (iv) $\varphi(u,t) = \varphi_i(u,t)$ for all $u \in E^m$ and all $t \in (t_{i-1}, t_i)$.

6.62 Fundamental Matrix

Let $G(t)$ be the fundamental matrix associated to the linear system

$$\dot x = A(t)x \tag{6.54}$$

By this we mean that $G(t)$ is a matrix valued function defined and continuous for all $t \in [0,1]$, continuously differentiable for all $t \in [0,1]$ for which the matrix $A(t)$ is continuous, and such that†

$$\dot G(t) = -G(t)A(t) \quad \text{for all } t \in [0,1] - \{t_0, t_1, \dots, t_k\} \tag{6.55}$$

and

$$G(1) = I \tag{6.56}$$

where $I$ is the identity $n \times n$ matrix. It is a trivial matter to prove that the matrix valued function $G(t)$ exists, is unique, and has a continuous inverse $G^{-1}(t)$. This inverse matrix $G^{-1}(t)$ satisfies the differential equation

$$\frac{d}{dt}G^{-1}(t) = A(t)G^{-1}(t) \quad \text{for all } t \in [0,1] - \{t_0, t_1, \dots, t_k\} \tag{6.57}$$

and the terminal condition

$$G^{-1}(1) = I \tag{6.58}$$

† If the $(i,j)$ element of $A(t)$ is $a_{ij}(t)$ and if the $(i,j)$ element of $G(t)$ is $g_{ij}(t)$, then relation (6.55) stands for

$$\frac{d}{dt}g_{ij}(t) = -\sum_{h=1}^{n} g_{ih}(t)\,a_{hj}(t)$$

for all $i = 1, 2, \dots, n$, all $j = 1, 2, \dots, n$, and all $t \in [0,1] - \{t_0, t_1, \dots, t_k\}$.
An easy way to verify the compatibility of the relations (6.55) through (6.58) is to see that we have

$$G(1)\,G^{-1}(1) = I \cdot I = I \tag{6.59}$$

and that for all $t \in [0,1] - \{t_0, t_1, \dots, t_k\}$ we have

$$\frac{d}{dt}\big(G(t)\,G^{-1}(t)\big) = -G(t)A(t)G^{-1}(t) + G(t)A(t)G^{-1}(t) = 0 \tag{6.60}$$

We shall show below that for every $\mathcal{U} \in F$ the vector valued function $x(t;\mathcal{U})$ takes the form

$$x(t;\mathcal{U}) = G^{-1}(t)\int_0^t G(\tau)\,\varphi(u(\tau),\tau)\,d\tau \quad \text{for every } t \in [0,1] \tag{6.61}$$

Since we know already that the function $x(t;\mathcal{U})$ exists and is unique, we have only to verify that the form given in relation (6.61) satisfies the differential equation

$$\frac{d}{dt}x(t;\mathcal{U}) = A(t)\,x(t;\mathcal{U}) + \varphi(u(t),t) \tag{6.62}$$

for all $t \in \Theta(\mathcal{U})$ and the boundary condition

$$x(0;\mathcal{U}) = 0 \tag{6.63}$$

Indeed, for all $t \in \Theta(\mathcal{U})$ we have

$$\frac{d}{dt}x(t;\mathcal{U}) = A(t)G^{-1}(t)\int_0^t G(\tau)\,\varphi(u(\tau),\tau)\,d\tau + G^{-1}(t)G(t)\,\varphi(u(t),t) = A(t)\,x(t;\mathcal{U}) + \varphi(u(t),t) \tag{6.64}$$

and the boundary condition (6.63) follows from the definition (6.61). We have then

$$x(1;\mathcal{U}) = \int_0^1 G(t)\,\varphi(u(t),t)\,dt \tag{6.65}$$
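The representation (6.61) can be sanity-checked in the scalar case $n = 1$. The sketch below is only an illustration under assumed data: the coefficient $a(t) = t$, for which $G(t) = \exp((1 - t^2)/2)$ satisfies (6.55) and (6.56), and a forcing term $\cos 3t$ standing in for $\varphi(u(t),t)$ along some fixed control. It compares the quadrature formula with a direct Runge-Kutta integration of (6.62).

```python
import math


def a(t):
    return t                               # hypothetical A(t) for n = 1


def G(t):
    return math.exp((1.0 - t * t) / 2.0)   # solves G' = -G a with G(1) = 1


def phi(t):
    return math.cos(3.0 * t)               # hypothetical phi(u(t), t) along a fixed control


def x_formula(t, steps=2000):
    """Relation (6.61): x(t) = G(t)^(-1) * integral of G(s) phi(s) over [0, t] (midpoint rule)."""
    h = t / steps
    integral = sum(G((k + 0.5) * h) * phi((k + 0.5) * h) for k in range(steps)) * h
    return integral / G(t)


def x_rk4(t_end, steps=2000):
    """Integrate x' = a(t) x + phi(t), x(0) = 0, by the classical fourth-order Runge-Kutta method."""
    def rhs(t, x):
        return a(t) * x + phi(t)

    h = t_end / steps
    x, t = 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, x + h * k1 / 2)
        k3 = rhs(t + h / 2, x + h * k2 / 2)
        k4 = rhs(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x


difference = abs(x_formula(1.0) - x_rk4(1.0))
```

The two values agree up to quadrature error. In higher dimensions $G(t)$ would itself have to be obtained by integrating (6.55) backward from $t = 1$.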
6.63 Reachable Set

As in Sec. 6.5 we shall define the set $W$ of all states which can be reached at the time $t = 1$ from the initial state $x = 0$ at the time $t = 0$ with some control function in the class $F$. In other words we have

$$W = \{x(1;\mathcal{U}) : \mathcal{U} \in F\} \tag{6.66}$$

or equivalently,

$$W = \left\{\int_0^1 G(t)\,\varphi(u(t),t)\,dt : \mathcal{U} \in F\right\} \tag{6.67}$$

Let us compare the two formulas (6.23) and (6.67). We see immediately that they have a similar structure, the only difference being that the piecewise continuous vector valued function $\varphi(u(t),t)$ is now replaced by the piecewise continuous vector valued function $G(t)\varphi(u(t),t)$. By repeating word for word what has been stated in Sec. 6.5, we can prove that for every optimal control function $\mathcal{V}$ there exists a nonzero constant vector $\lambda$ such that

$$G(t)\varphi(v(t),t)\cdot\lambda \ge G(t)\varphi(u,t)\cdot\lambda \quad \text{for all } t \in \Theta(\mathcal{V}) \text{ and all } u \in \Omega \tag{6.68}$$

$$\lambda_i = 0 \quad \text{for } i = r+1, \dots, n-1 \tag{6.69}$$

$$\lambda_n \ge 0 \tag{6.70}$$

6.64 Maximum Principle

We shall now define a vector valued function $\lambda(t)$ by the following relation

$$\lambda(t) = G^T(t)\lambda \quad \text{for all } t \in [0,1] \tag{6.71}$$

where $G^T(t)$ is the transpose of the matrix $G(t)$. Since the constant vector $\lambda$ is different from zero and since the matrix valued function $G(t)$ is continuous and piecewise differentiable over $[0,1]$ and has an inverse, it follows that the vector valued function $\lambda(t)$ is continuous, piecewise differentiable, and not identically zero on the interval $[0,1]$. We shall prove that the vector valued function $\lambda(t)$ satisfies the four conditions of the maximum principle. The relation (6.68) can be written in the form

$$\varphi(v(t),t)\cdot(G^T(t)\lambda) \ge \varphi(u,t)\cdot(G^T(t)\lambda) \tag{6.72}$$
which implies

$$\big(A(t)\,x(t;\mathcal{V}) + \varphi(v(t),t)\big)\cdot\lambda(t) \ge \big(A(t)\,x(t;\mathcal{V}) + \varphi(u,t)\big)\cdot\lambda(t) \quad \text{for all } t \in \Theta(\mathcal{V}) \text{ and all } u \in \Omega \tag{6.73}$$

For the linear problem considered in this section the relation (6.73) is equivalent to relation (6.16). From the relations (6.71) and (6.55) we obtain

$$\dot\lambda(t) = \frac{d}{dt}\big(G^T(t)\big)\lambda = \big(-G(t)A(t)\big)^T\lambda \tag{6.74}$$

$$= -A^T(t)\,G^T(t)\lambda = -A^T(t)\lambda(t) \tag{6.75}$$

This last relation is equivalent to relation (6.17). Finally, we note that relations (6.69) and (6.70) are equivalent to relations (6.18) and (6.19), since $G(1) = I$ and hence $\lambda(1) = \lambda$. This concludes the proof of the maximum principle in the case of a linear dynamical polysystem.

6.65 Remarks on Convexity

Many early papers in the theory of optimal control were devoted to the study of systems whose evolutions are described by equations of the form

$$\dot x = A(t)x + B(t)u \tag{6.76}$$

where $A(t)$ is a piecewise continuous $n \times n$ matrix valued function and $B(t)$ is a piecewise continuous $n \times m$ matrix valued function. In these early papers it was always assumed that the set $\Omega$ of admissible control vectors is convex.† This assumption is very convenient in order to obtain the following direct proof of the convexity of the set $W$, i.e., a proof of the convexity of the set $W$ which does not depend on Proposition 10.2. In the case of Eq. (6.76) the relation (6.67) takes the form

$$W = \left\{\int_0^1 G(t)B(t)\,u(t)\,dt : \mathcal{U} \in F\right\} \tag{6.77}$$

† In many practical applications $\Omega$ is either some hypercube or the vertices of some hypercube. In the first case $\Omega$ is convex and each component of the control vector may take any value between its minimum and maximum values. In the second case $\Omega$ is not convex and each component of the control vector may take only its minimum or maximum values. The theory of the first case is easy but the apparatus (rheostats, function generators, etc.) is costly. The theory of the second case is not so easy but the apparatus (contactors, relays, etc.) is inexpensive. It has been proved that a system of the second type can always do what a system of the first type can do. (See LaSalle$^{34}$ and Halkin.)
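Let $a$ and $b \in W$ and $\mu \in [0,1]$. We shall prove that $c = \mu a + (1-\mu)b \in W$. If $a$ and $b \in W$, then there are control functions $\mathcal{U}_a$ and $\mathcal{U}_b$ in the class $F$ such that

$$a = \int_0^1 G(t)B(t)\,u_a(t)\,dt \tag{6.78}$$

and

$$b = \int_0^1 G(t)B(t)\,u_b(t)\,dt \tag{6.79}$$

Let $\mathcal{U}_c$ be the control function defined by the relation

$$u_c(t) = \mu\,u_a(t) + (1-\mu)\,u_b(t) \tag{6.80}$$

Since the set $\Omega$ is convex it follows that $\mathcal{U}_c \in F$, hence there is a vector $c$ in the set $W$ such that

$$c = \int_0^1 G(t)B(t)\,u_c(t)\,dt \tag{6.81}$$

By the relations (6.78) through (6.80) and the linearity of the integral, this vector is $c = \mu a + (1-\mu)b$.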
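The direct convexity argument rests only on the linearity of the integral in the control. A minimal numerical sketch, with assumed data (a scalar weight `g(t)` standing in for the product $G(t)B(t)$, the convex control set $[-1, 1]$, two arbitrary admissible controls, and a convexity weight `mu`), is:

```python
import math


def endpoint(u, g, steps=1000):
    """Midpoint-rule value of the integral of g(t) * u(t) over [0, 1]."""
    h = 1.0 / steps
    return sum(g((k + 0.5) * h) * u((k + 0.5) * h) for k in range(steps)) * h


def g(t):
    return math.exp(1.0 - t)        # hypothetical scalar stand-in for G(t) B(t)


def u_a(t):
    return 1.0                      # admissible: values in [-1, 1]


def u_b(t):
    return math.sin(5.0 * t)        # admissible: values in [-1, 1]


mu = 0.3                            # convexity weight in [0, 1]


def u_c(t):
    """Pointwise convex combination of the two controls, as in relation (6.80)."""
    return mu * u_a(t) + (1.0 - mu) * u_b(t)


a, b, c = endpoint(u_a, g), endpoint(u_b, g), endpoint(u_c, g)
```

The endpoint `c` coincides with `mu * a + (1 - mu) * b` up to rounding, and `u_c` stays inside $[-1, 1]$ because that interval is convex. With a nonconvex control set such as $\{-1, 1\}$ the combined control would no longer be admissible, which is why the general proof relies on Proposition 10.2 instead.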
6.66 Comoving Space along a Trajectory

This paragraph does not contain any new results but gives a geometrical interpretation of some results obtained earlier in this section. We shall show that for every trajectory $x(t;\mathcal{V})$ we can introduce a new state variable $y = (y_1, y_2, \dots, y_n)$ and a certain time varying transformation from the old state variable $x$ into the new state variable $y$ in such a way that the differential equation in $y$ corresponding to the differential equation in $x$ given by relation (6.53) is of the type given by relation (6.20). We have defined earlier the continuous and piecewise differentiable $n \times n$ matrix valued function $G(t)$ over the interval $[0,1]$ by the relations

$$\dot G(t) = -G(t)A(t) \quad \text{for all } t \in [0,1] - \{t_0, t_1, \dots, t_k\} \tag{6.82}$$

and

$$G(1) = I \tag{6.83}$$

For every $t \in [0,1]$ and every $x \in E^n$ the new state variable $y$ is defined by

$$y = G(t)\big(x - x(t;\mathcal{V})\big) \tag{6.84}$$

We have then

$$\begin{aligned}
\dot y &= \dot G(t)\big(x - x(t;\mathcal{V})\big) + G(t)\big(\dot x - \dot x(t;\mathcal{V})\big) \\
&= -G(t)A(t)\big(x - x(t;\mathcal{V})\big) + G(t)\big(A(t)x + \varphi(u,t) - A(t)x(t;\mathcal{V}) - \varphi(v(t),t)\big) \\
&= G(t)\big(\varphi(u,t) - \varphi(v(t),t)\big)
\end{aligned} \tag{6.85}$$
and we see that the last equation is of the type given in relation (6.20). The system of coordinates corresponding to the new state variable $y$ is called the comoving space along the trajectory $x(t;\mathcal{V})$. It follows then that a linear dynamical polysystem can always be considered as an open loop system if we look at it from the comoving space along a trajectory. The trajectory $x(t;\mathcal{V})$ can be expressed in the comoving space by the relation

$$y \equiv 0 \tag{6.86}$$

and any trajectory corresponding to the same control function $\mathcal{V}$ but with a different initial condition can be expressed in the comoving space by the relation

$$y = \text{const} \tag{6.87}$$

In other words, the time varying transformation from the state variable $x$ into the state variable $y$ is a transformation which stretches and twists the field of all trajectories with the same control function $\mathcal{V}$ into a nice field of parallel rectilinear trajectories.

6.7 Proof of the Maximum Principle for a General Dynamical Polysystem

6.71 Fundamental Matrix along an Optimal Trajectory

In this section we shall assume that $\mathcal{V}$ is an optimal control function for the optimization problem stated in Sec. 6.2. We have already defined the trajectory $x(t;\mathcal{V})$ corresponding to the control function $\mathcal{V}$. Let $A(t)$ be the $n \times n$ matrix valued function of the time $t$ whose $(i,j)$ element is

$$a_{ij}(t) = \left.\frac{\partial f_i(x,v(t),t)}{\partial x_j}\right|_{x=x(t;\mathcal{V})} \tag{6.88}$$

From the assumptions made in Sec. 6.1 it follows that the matrix valued function $A(t)$ is well defined and piecewise continuous over the interval $[0,1]$. More precisely, the function $A(t)$ is continuous at every point $t$ in $\Theta(\mathcal{V})$. As in Sec. 6.6 we define a matrix valued function† $G(t)$, continuous with respect to $t$ over $[0,1]$ and differentiable with respect to $t$ over $\Theta(\mathcal{V})$, by the relations:

$$\dot G(t) = -G(t)A(t) \quad \text{for all } t \in \Theta(\mathcal{V}) \tag{6.89}$$

and

$$G(1) = I \tag{6.90}$$
† Compare the content of this paragraph with Appendix A. You will note that the matrix $G(t)$ defined here corresponds to the inverse matrix $K^{-1}(t;\bar t,\bar x)$ defined in Appendix A with the boundary condition $\bar t = 1$ and $\bar x = x(1;\mathcal{V})$.
The reader should realize that the matrices $A(t)$ and $G(t)$ have been defined for the particular control function $\mathcal{V}$. Similar but different matrices could be defined for the other control functions in the class $F$. The characteristic property of the linear system considered in Sec. 6.6 is that the matrices $A(t)$ and $G(t)$ are the same for every control function in the class $F$. This last property does not hold for the general nonlinear system considered in the present section. In this section we shall not have to consider matrices $A(t)$ and $G(t)$ corresponding to other control functions besides the optimal control function $\mathcal{V}$. This is fortunate, since it enables us to avoid clumsy notations of the form $A(t;\mathcal{V})$, $A(t;\mathcal{U})$, $G(t;\mathcal{V})$, and $G(t;\mathcal{U})$.

6.72 Variational Trajectory

For every control function $\mathcal{U}$ in the class $F$ we define the vector valued function $y(t;\mathcal{U})$ by the relation:

$$y(t;\mathcal{U}) = x(t;\mathcal{U}) - x(t;\mathcal{V}) \quad \text{for all } t \in [0,1] \tag{6.91}$$

The vector valued function $y(t;\mathcal{U})$ is called the variational trajectory for the control function $\mathcal{U}$ with respect to the control function $\mathcal{V}$. Let†

$$\Theta^*(\mathcal{U}) = \Theta(\mathcal{U}) \cap \Theta(\mathcal{V}) \tag{6.92}$$

By construction the vector valued function $y(t;\mathcal{U})$ is continuous with respect to $t$ for all $t \in [0,1]$, differentiable with respect to $t$ for all $t \in \Theta^*(\mathcal{U})$, and we have

$$\dot y(t;\mathcal{U}) = f(x(t;\mathcal{U}),u(t),t) - f(x(t;\mathcal{V}),v(t),t) \quad \text{for every } t \in \Theta^*(\mathcal{U}) \tag{6.93}$$

Let $\varphi(u,t)$ and $k(t;\mathcal{U})$ be vector valued functions defined by the following relations:

$$\varphi(u,t) = f(x(t;\mathcal{V}),u,t) - f(x(t;\mathcal{V}),v(t),t) \tag{6.94}$$

$$k(t;\mathcal{U}) = f(x(t;\mathcal{U}),u(t),t) - f(x(t;\mathcal{V}),v(t),t) - \varphi(u(t),t) - A(t)\big(x(t;\mathcal{U}) - x(t;\mathcal{V})\big) \tag{6.95}$$

The vector valued function $\varphi(u,t)$ is continuous with respect to $u$ and piecewise continuous with respect to $t$. For every $\mathcal{U} \in F$ the vector valued function $k(t;\mathcal{U})$ is piecewise continuous with respect to $t$.

† If $A$ and $B$ are sets then $A \cap B$ is the intersection of the sets $A$ and $B$, i.e., the set of all elements which belong to both sets $A$ and $B$. Similarly, $A \cup B$ is the union of the sets $A$ and $B$, i.e., the set of all elements which belong to at least one of the sets $A$ and $B$.
We can then write

$$\dot y(t;\mathcal{U}) = A(t)\,y(t;\mathcal{U}) + \varphi(u(t),t) + k(t;\mathcal{U}) \quad \text{for all } t \in \Theta^*(\mathcal{U}) \tag{6.96}$$

and

$$y(t;\mathcal{U}) = G^{-1}(t)\int_0^t G(\tau)\big(\varphi(u(\tau),\tau) + k(\tau;\mathcal{U})\big)\,d\tau \quad \text{for all } t \in [0,1] \tag{6.97}$$

The vector valued function $k(t;\mathcal{U})$ is identically zero for the linear problem considered in Sec. 6.6. Although $k(t;\mathcal{U})$ is not identically zero for the nonlinear problem considered in this section, it has nevertheless two interesting properties:

(i) there exists a $K_1 < +\infty$ such that for all $\mathcal{U} \in F$ we have

$$|k(t;\mathcal{U})| \le K_1\,|y(t;\mathcal{U})|^2 \quad \text{for all } t \in [0,1] \text{ such that } u(t) = v(t) \tag{6.98}$$

(ii) there exists a $K_2 < +\infty$ such that for all $\mathcal{U} \in F$ and all $t \in [0,1]$ we have

$$|k(t;\mathcal{U})| \le K_2\,|y(t;\mathcal{U})| \tag{6.99}$$

These two properties are direct consequences of the definitions (6.91), (6.94), and (6.95) and of the assumptions made in Sec. 6.1. An explicit derivation of these two properties is given in the last paragraph of this section.

6.73 Approximation Trajectory

For every $\mathcal{U} \in F$ let $z(t;\mathcal{U})$ be a vector valued function of $t$, defined and continuous over $[0,1]$, differentiable over $\Theta^*(\mathcal{U})$, and such that

$$\text{(i)}\quad \dot z(t;\mathcal{U}) = A(t)\,z(t;\mathcal{U}) + \varphi(u(t),t) \quad \text{for all } t \in \Theta^*(\mathcal{U}) \tag{6.100}$$

$$\text{(ii)}\quad z(0;\mathcal{U}) = 0 \tag{6.101}$$

or, equivalently, such that

$$z(t;\mathcal{U}) = G^{-1}(t)\int_0^t G(\tau)\,\varphi(u(\tau),\tau)\,d\tau \quad \text{for all } t \in [0,1] \tag{6.102}$$

We have then

$$y(t;\mathcal{U}) - z(t;\mathcal{U}) = G^{-1}(t)\int_0^t G(\tau)\,k(\tau;\mathcal{U})\,d\tau \quad \text{for all } t \in [0,1] \tag{6.103}$$
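The size of the gap $y - z$ can be observed on a toy problem. The sketch below is an illustration under assumptions of our own: the scalar dynamics $f(x,u,t) = \sin x + u$, a reference control $v(t) = 1$ playing the role of $\mathcal{V}$ (no optimality is claimed for it), a comparison control equal to $-1$ on an interval of adjustable length and to $1$ elsewhere, and explicit Euler integration. For these data $A(t) = \cos x(t;\mathcal{V})$ and $\varphi(u,t) = u - v(t)$.

```python
import math


def simulate(length, steps=10000):
    """Euler integration on [0, 1] of a reference trajectory x_v, a comparison
    trajectory x_u, and the approximation trajectory z of (6.100)-(6.101).

    Hypothetical data: f(x, u, t) = sin(x) + u, v(t) = 1, and u(t) = -1 on
    [0.4, 0.4 + length), u(t) = 1 elsewhere. Returns (y(1), z(1)) where
    y = x_u - x_v is the variational trajectory of (6.91).
    """
    h = 1.0 / steps
    x_v = x_u = z = 0.0
    for k in range(steps):
        t = k * h
        v = 1.0
        u = -1.0 if 0.4 <= t < 0.4 + length else 1.0
        a = math.cos(x_v)            # A(t): derivative of f in x along x(t; V)
        phi = u - v                  # phi(u(t), t) of relation (6.94)
        dz = a * z + phi
        dx_v = math.sin(x_v) + v
        dx_u = math.sin(x_u) + u
        z += h * dz
        x_v += h * dx_v
        x_u += h * dx_u
    return x_u - x_v, z


y_big, z_big = simulate(0.10)
y_small, z_small = simulate(0.05)
gap_big = abs(y_big - z_big)
gap_small = abs(y_small - z_small)
```

Halving the length of the interval on which the controls differ roughly halves $y(1)$ but reduces the gap $|y(1) - z(1)|$ by a noticeably larger factor, in line with a remainder $k$ that is of second order in $y$ wherever the two controls coincide.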
The vector valued function $z(t;\mathcal{U})$ is called the approximation trajectory for the variational trajectory $y(t;\mathcal{U})$. The name "approximation trajectory" is well justified, since the difference $y(t;\mathcal{U}) - z(t;\mathcal{U})$ is small whenever $k(t;\mathcal{U})$ is small (relation (6.103)), i.e., whenever $y(t;\mathcal{U})$ is small (relations (6.98) and (6.99)), i.e., whenever the comparison trajectory $x(t;\mathcal{U})$ is close to the optimal trajectory $x(t;\mathcal{V})$.

6.74 Remarks on Linearization Techniques

All the formulas listed above are familiar to everyone interested in the optimal control of nonlinear systems. We have taken great care to write down all these definitions because we have found a great lack of precision and rigor in most of the available expositions of these questions. Many authors replace the basic differential equation by the first two terms of its Taylor expansion plus a remainder term, which corresponds here to the function $k(t;\mathcal{U})$. At various stages of the reasoning they state, without proof, that the effect of this remainder term can be neglected. Such derivations are, in our opinion, very unsatisfactory, since they are strictly equivalent to replacing, from the very start, the nonlinear system by its linear approximation around the comparison trajectory. In our approach we give a precise definition of the linear approximation. Moreover, in the whole development we maintain a clear distinction between the nonlinear system and its linear approximation, and when an interesting property of the linear approximation could be fruitfully and legitimately applied to the nonlinear system we prove that this application is wholly justified.

6.75 Statement of the Fundamental Lemma

Let us define the sets $W$ and $\widetilde{W}$ by the following relations

$$W = \{x(1;\mathcal{V}) + y(1;\mathcal{U}) : \mathcal{U} \in F\} \tag{6.104}$$

$$\widetilde{W} = \{x(1;\mathcal{V}) + z(1;\mathcal{U}) : \mathcal{U} \in F\} \tag{6.105}$$

Equivalently, the set $W$ could be written under the form

$$W = \{x(1;\mathcal{U}) : \mathcal{U} \in F\} \tag{6.106}$$

and the set $\widetilde{W}$ could be written under the form

$$\widetilde{W} = \{x(1;\mathcal{V}) + z : z \in Z\} \tag{6.107}$$

where $Z$ is the set

$$Z = \{z(1;\mathcal{U}) : \mathcal{U} \in F\} \tag{6.108}$$
From the results of Sec. 6.6 we know that the set $Z$ is convex, hence that the set $\widetilde{W}$ is convex. As in Sec. 6.5 we define the set $S^+$ by the relation

$$S^+ = \{x : x \in S,\ x_n > x_n(1;\mathcal{V})\} \tag{6.109}$$

where $S$ is the terminal set introduced in Sec. 6.2. The set $S^+$ is convex. Since the function $z(t;\mathcal{U})$ is a certain approximation of the function $y(t;\mathcal{U})$, it is natural to expect that the set $\widetilde{W}$ is a certain approximation of the set $W$ (compare relations (6.104) and (6.105)). We have indeed the following important result:

Fundamental Lemma. If there is no hyperplane separating the convex sets $S^+$ and $\widetilde{W}$, then the sets $S^+$ and $W$ have at least one point in common.

The proof of the fundamental lemma is given in Sec. 6.11. We see immediately that the fundamental lemma is a tautology in the case of a linear system, since for a linear system we have $W = \widetilde{W}$. An immediate corollary of the fundamental lemma states that:

If the sets $S^+$ and $W$ have no point in common, then there is a hyperplane which separates the convex sets $S^+$ and $\widetilde{W}$.

6.76 Maximum Principle

The sets $S^+$ and $W$ cannot have a point in common, since the existence of such a point would contradict the optimality of the control function $\mathcal{V}$. If the sets $S^+$ and $W$ have no point in common then, from the corollary of the fundamental lemma, we conclude that there is a hyperplane separating the convex sets $S^+$ and $\widetilde{W}$. Hence, according to Sec. 6.6, there exists a vector valued function $\lambda(t)$, defined, continuous, piecewise differentiable, and not identically zero on $[0,1]$, such that

$$\text{(i)}\quad \varphi(v(t),t)\cdot\lambda(t) \ge \varphi(u,t)\cdot\lambda(t) \quad \text{for all } t \in \Theta(\mathcal{V}) \text{ and all } u \in \Omega \tag{6.110}$$

$$\text{(ii)}\quad \frac{d}{dt}\lambda(t) = -A^T(t)\lambda(t) \quad \text{for all } t \in \Theta(\mathcal{V}) \tag{6.111}$$

$$\text{(iii)}\quad \lambda_i(1) = 0 \quad \text{for } i = r+1, \dots, n-1 \tag{6.112}$$

$$\text{(iv)}\quad \lambda_n(1) \ge 0 \tag{6.113}$$

With the help of definitions (6.88) and (6.94) these relations can be transformed immediately into the four conditions of the maximum principle.
6.77 Derivation of the Properties of the Function $k(t;\mathcal{U})$

In this paragraph we give an explicit derivation of the inequalities (6.98) and (6.99) mentioned earlier. From the assumptions of Sec. 6.1 and from relation (6.91) we know that there exists a $K < +\infty$ such that

$$|y(t;\mathcal{U})| \le K \quad \text{for all } t \in [0,1] \text{ and all } \mathcal{U} \in F \tag{6.114}$$

From the relations (6.94) and (6.95) we have

$$k(t;\mathcal{U}) = f(x(t;\mathcal{U}),u(t),t) - f(x(t;\mathcal{V}),u(t),t) - A(t)\big(x(t;\mathcal{U}) - x(t;\mathcal{V})\big) \tag{6.115}$$

Hence, for all $t \in [0,1]$ such that $u(t) = v(t)$ we have

$$k(t;\mathcal{U}) = f(x(t;\mathcal{U}),v(t),t) - f(x(t;\mathcal{V}),v(t),t) - A(t)\big(x(t;\mathcal{U}) - x(t;\mathcal{V})\big) \tag{6.116}$$

i.e., from relation (6.91),

$$k(t;\mathcal{U}) = f(x(t;\mathcal{V}) + y(t;\mathcal{U}),v(t),t) - f(x(t;\mathcal{V}),v(t),t) - A(t)\,y(t;\mathcal{U}) \tag{6.117}$$

Let us denote by $\psi(t;y)$ the vector valued function

$$\psi(t;y) = f(x(t;\mathcal{V}) + y,v(t),t) - f(x(t;\mathcal{V}),v(t),t) - A(t)\,y \tag{6.118}$$

We see immediately that $\psi(t;0) = 0$. From the assumptions of Sec. 6.1 we know that all the second partial derivatives with respect to $y$ of $\psi(t;y)$ exist, are continuous with respect to $y$, and are piecewise continuous with respect to $t$. We have then (6.119) and there exists a $K_1 < \infty$ such that (6.120) holds for all $t \in [0,1]$ and for all $y$ with $|y| \le K$. From the definition of the matrix $A(t)$ we have

$$\left.\frac{\partial\psi(t;y)}{\partial y}\right|_{y=0} = 0 \tag{6.121}$$

We obtain finally

$$|\psi(t;y)| \le K_1\,|y|^2 \tag{6.122}$$

for all $t \in [0,1]$ and for all $y$ with $|y| \le K$, i.e.,

$$|k(t;\mathcal{U})| \le K_1\,|y(t;\mathcal{U})|^2 \tag{6.123}$$

for all $\mathcal{U} \in F$ and all $t \in [0,1]$ such that $u(t) = v(t)$. This concludes the derivation of inequality (6.98). The relation (6.115) may be written

$$k(t;\mathcal{U}) = f(x(t;\mathcal{V}) + y(t;\mathcal{U}),u(t),t) - f(x(t;\mathcal{V}),u(t),t) - A(t)\,y(t;\mathcal{U}) \tag{6.124}$$

Let us define the vector valued function $\mu(t;y,u)$ by the relation

$$\mu(t;y,u) = f(x(t;\mathcal{V}) + y,u,t) - f(x(t;\mathcal{V}),u,t) - A(t)\,y \tag{6.125}$$

For every $t \in [0,1]$ and every $u \in \Omega$ we have $\mu(t;0,u) = 0$, and from the assumptions made in Sec. 6.1 we know that $\mu(t;y,u)$ has a derivative with respect to $y$ which is bounded for all $t \in [0,1]$, all $u \in \Omega$, and all $y$ such that $|y| \le K$. Hence, there is a constant $K_2 < \infty$ such that

$$|\mu(t;y,u)| \le K_2\,|y| \tag{6.126}$$

for all $t \in [0,1]$, all $u \in \Omega$, and all $y$ with $|y| \le K$. In other words, there is a constant $K_2 < \infty$ such that

$$|k(t;\mathcal{U})| \le K_2\,|y(t;\mathcal{U})| \tag{6.127}$$

for all $t \in [0,1]$ and all $\mathcal{U} \in F$. This concludes the proof of inequality (6.99).

6.8 Uniformly Continuous Dependence of Trajectories with Respect to Variations of the Control Functions
6.81 Distance between Two Control Functions
In Sec. 6.1 we defined a vector valued function x(t; %), called a trajectory, for each control function % in the class F. The purpose of the present section is to state and prove a result which we could colloquially express as follows: if the change from the control function 4Y1 to the control function a2is to the trajectory “small” then the change from the trajectory x(t; x ( t ; a,) will also be “small.” We shall first state precisely what we mean by a “small” change from a control function %l t o a control function a2.In Sec. 6.4 we defined d as the class of all subsets of [0, I] which are the union of a finite number of disjoint intervals. For each set E in the collection d let p ( E ) be the length of the set E,
6.
MATHEMATICAL FOUNDATIONS OF SYSTEM OPTIMIZATION
227
i.e., the sum of the lengths of the finite number of disjoint intervals constituting the set E. We shall say that the distance? between two control functions q1and q2is at most the nonnegative number o! if there exists a set E in the collection at such that p(E) 5 a, and such that the two control functions %1 and q2differ at most on the set E, i.e., such that u,(t) = u2(t) for all t E [0, 11 E. Remark. There are many possible definitions of the distance between two control functions : for instance one could have defined the distance between the two functions %Y1 and a2as the supremum of lu,(t) - u2(t)l for all t E [0, 11. For many readers this second definition could seem more natural” than the first definition. However, we have chosen the first given definition not for its “ naturalness ” but for its convenience in the derivation of further results. The reader who is well acquainted with classical calculus of variations will realize immediately that the first definition corresponds to the concept of strong variations and that the second definition corresponds to the concept of weak variations. “
6.82 Uniform Boundedness of Variational Trajectories This paragraph will be devoted to the proof of the following result:
Proposition 8.1. There exists an N < + 00 such that for all and all E E at with ul(t) = u2(r)for ul/t E 10, I ] E we have
-
(x(r;Q1) - X(T; q2)1INp(E)
for all
TE
[0, 11
and 4Y2 E F (6.128)
PROOF. The scalar valued function Ix(t; %1) - x(t; a2)1 is continuous over [0, 11 and differentiable over @(al) n @(q2). We have immediately d
Jx(t; dt
- x(t;
%2) 1
I I%(t;Q1) - k ( t ; q2)1
(6.129)
† This footnote is intended only for the reader familiar with measure theory. As we said in the introduction we want to avoid any measure theoretical consideration in this chapter. Without such a stringent restriction we would have naturally defined the distance between two control functions $\mathcal{U}_1$ and $\mathcal{U}_2$ as the measure of the set of points where these two functions differ. If $\mathcal{U}_1$ and $\mathcal{U}_2 \in F$ the set $S = \{t : t \in [0, 1],\ u_1(t) \ne u_2(t)\}$ is not necessarily in the class $\mathcal{A}$. If in Sec. 5.1 we would have required all functions in the class $F$ to be piecewise analytic, then the set $S$ would, indeed, be in the class $\mathcal{A}$ and the writing of the present section would have been somewhat simplified. However, we have decided that this simplification was not worth such a supplementary requirement on the class $F$.
HUBERT HALKIN
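i.e.,

$$\frac{d}{dt}|x(t; \mathcal{U}_1) - x(t; \mathcal{U}_2)| \le |f(x(t; \mathcal{U}_1), u_2(t), t) - f(x(t; \mathcal{U}_2), u_2(t), t)| + |f(x(t; \mathcal{U}_1), u_1(t), t) - f(x(t; \mathcal{U}_1), u_2(t), t)| \tag{6.131}$$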
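for all $t$ in $\Theta(\mathcal{U}_1) \cap \Theta(\mathcal{U}_2)$. We have seen in Sec. 6.1 that $x(t; \mathcal{U})$ is uniformly bounded over $[0, 1]$ for all $\mathcal{U}$ in $F$. Hence, from the assumptions stated in Sec. 6.1 for the function $f(x, u, t)$ we know that there are constants $L_1$ and $L_2 < +\infty$ such that

$$\frac{d}{dt}|x(t; \mathcal{U}_1) - x(t; \mathcal{U}_2)| \le L_1 |x(t; \mathcal{U}_1) - x(t; \mathcal{U}_2)| + L_2\, \chi(E)$$

where $\chi(E)$ is the characteristic function of the set $E$, i.e., a function equal to one when $t \in E$ and equal to zero when $t \in [0, 1] \sim E$. From a generalization of Gronwall's inequality proved in Proposition A.1 of Appendix A we obtain

$$|x(\tau; \mathcal{U}_1) - x(\tau; \mathcal{U}_2)| \le \int_0^1 L_2 e^{L_1} \chi(E)\, dt \tag{6.135}$$

for all $\tau \in [0, 1]$, i.e.,

$$|x(\tau; \mathcal{U}_1) - x(\tau; \mathcal{U}_2)| \le N\mu(E) \tag{6.136}$$

where $N = L_2 e^{L_1}$. This concludes the proof of Proposition 8.1.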
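The estimate (6.128) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the scalar system $\dot{x} = -x + u$, the initial state $x(0) = 0$, the control values restricted to $[-1, 1]$, and the explicit Euler scheme are all hypothetical choices that do not come from the text. For this system one may take $L_1 = 1$ and $L_2 = 2$, so that $N = L_2 e^{L_1}$.

```python
import math

def simulate(control, steps=2000):
    """Explicit Euler integration of x' = -x + u(t) on [0, 1], x(0) = 0."""
    h = 1.0 / steps
    x, path = 0.0, [0.0]
    for k in range(steps):
        x += h * (-x + control(k * h))
        path.append(x)
    return path

def max_gap(width, start=0.3):
    """Largest distance between the trajectories of two controls that
    differ only on the interval E = [start, start + width)."""
    u1 = lambda t: 1.0 if start <= t < start + width else 0.0
    u2 = lambda t: 0.0
    p1, p2 = simulate(u1), simulate(u2)
    return max(abs(a - b) for a, b in zip(p1, p2))

# Assumed constants for this example: Lipschitz constant in x, and a
# bound for |u1 - u2| when both controls take values in [-1, 1].
L1, L2 = 1.0, 2.0
N = L2 * math.exp(L1)  # the constant N = L2 * e^{L1} of the proof

for width in (0.2, 0.1, 0.05):
    # observed gap versus the bound N * mu(E)
    print(width, max_gap(width), N * width)
```

In each row the observed gap stays below $N\mu(E)$ and shrinks with the length of $E$, whatever the size of the control change on $E$, which is the strong variation point of view adopted above.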
6. MATHEMATICAL FOUNDATIONS OF SYSTEM OPTIMIZATION

6.9 Some Uniform Estimates for the Approximation $z(t; \mathcal{U})$ of the Variational Trajectory $y(t; \mathcal{U})$
In Sec. 6.7 we have assumed that the control function $\mathcal{V}$ was an optimal control function for the optimization problem stated in Sec. 6.2. To this optimal control function $\mathcal{V}$ is associated an optimal trajectory $x(t; \mathcal{V})$. For every control function $\mathcal{U}$ in the class $F$ we have also a trajectory $x(t; \mathcal{U})$ and in Sec. 6.7 we have defined $y(t; \mathcal{U})$ as the difference between $x(t; \mathcal{U})$ and $x(t; \mathcal{V})$. The vector valued function $y(t; \mathcal{U})$ is called the variational trajectory for the control function $\mathcal{U}$ with respect to the control function $\mathcal{V}$. In Sec. 6.7 we have also defined an approximation $z(t; \mathcal{U})$ for the variational trajectory $y(t; \mathcal{U})$. The aim of the present section is to state and prove some results which characterize how well the trajectory $z(t; \mathcal{U})$ approximates the variational trajectory $y(t; \mathcal{U})$. More precisely, the remainder of this section will be devoted to the proof of the following result:
Proposition 9.1. There exists a $K < +\infty$ such that for all $\mathcal{U} \in F$ and all $E \in \mathcal{A}$ with $u(t) = v(t)$ for all $t \in [0, 1] \sim E$ we have

$$|y(\tau; \mathcal{U}) - z(\tau; \mathcal{U})| \le K(\mu(E))^2 \quad \text{for all } \tau \in [0, 1] \tag{6.137}$$
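We have seen in Sec. 6.8 that

$$|x(t; \mathcal{U}) - x(t; \mathcal{V})| \le N\mu(E) \quad \text{for all } t \in [0, 1] \tag{6.138}$$

i.e.,

$$|y(t; \mathcal{U})| \le N\mu(E) \quad \text{for all } t \in [0, 1] \tag{6.139}$$

We have seen in Sec. 6.7 that

$$\dot{y}(t; \mathcal{U}) - \dot{z}(t; \mathcal{U}) = k(t; \mathcal{U}) \quad \text{for all } t \in \Theta(\mathcal{U}) \tag{6.140}$$

and

$$|y(0; \mathcal{U}) - z(0; \mathcal{U})| = 0 \tag{6.141}$$

It follows then that

$$|y(\tau; \mathcal{U}) - z(\tau; \mathcal{U})| \le \int_0^{\tau} |k(t; \mathcal{U})|\, dt \quad \text{for all } \tau \in [0, 1] \tag{6.142}$$

We may write

$$\int_0^{\tau} |k(t; \mathcal{U})|\, dt \le \int_{[0,1] \sim E} |k(t; \mathcal{U})|\, dt + \int_E |k(t; \mathcal{U})|\, dt \tag{6.143}$$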
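We have seen in Sec. 6.7 that there is a $K_1 < +\infty$ such that

$$|k(t; \mathcal{U})| \le K_1 |y(t; \mathcal{U})|^2 \quad \text{for all } t \in [0, 1] \sim E \tag{6.144}$$

i.e.,

$$|k(t; \mathcal{U})| \le K_1 N^2 (\mu(E))^2 \quad \text{for all } t \in [0, 1] \sim E \tag{6.145}$$

which implies

$$\int_{[0,1] \sim E} |k(t; \mathcal{U})|\, dt \le K_1 N^2 (\mu(E))^2 \tag{6.146}$$

We have also seen in Sec. 6.7 that there is a $K_2 < +\infty$ such that

$$|k(t; \mathcal{U})| \le K_2 |y(t; \mathcal{U})| \quad \text{for all } t \in E \tag{6.147}$$

i.e.,

$$|k(t; \mathcal{U})| \le K_2 N \mu(E) \quad \text{for all } t \in E \tag{6.148}$$

which implies

$$\int_E |k(t; \mathcal{U})|\, dt \le K_2 N (\mu(E))^2 \tag{6.149}$$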
Combining relations (6.142), (6.143), (6.146), and (6.149) we obtain

$$|y(\tau; \mathcal{U}) - z(\tau; \mathcal{U})| \le (K_1 N^2 + K_2 N)(\mu(E))^2 \quad \text{for all } \tau \in [0, 1] \tag{6.150}$$

We may write $K = K_1 N^2 + K_2 N$ and so obtain relation (6.137). This concludes the proof of Proposition 9.1.

6.10 Convexity of the Range of a Vector Integral over the Class $\mathcal{A}$ of Subsets of [0, 1]
6.101 Multiple Balayage of Vector Integrals

In this section we shall prove two theorems which have been used repeatedly throughout this chapter. Both theorems are somewhat related to a well-known theorem of Lyapounov on the range of a vector integral. We shall denote by $f(t)$ some given piecewise continuous† function from $[0, 1]$ into an $n$-dimensional Euclidean space $E^n$. The vector $I = \int_0^1 f(t)\, dt$ represents the average of the function $f(t)$ on the interval $[0, 1]$. The vector $I$ is also the value at $t = 1$ of the function $g(t) = \int_0^t f(\tau)\, d\tau$. If we consider the estimation of the average $I$ as a continuous process then the vector $g(t)$ may be regarded as a certain approximation of the fraction $tI$ of the average vector $I$. This continuous estimation process is not very accurate if the function $f(t)$ fluctuates greatly. Instead of basing our estimation on a single balayage† of the interval $[0, 1]$ we could consider a simultaneous balayage of each of the intervals $[0, \tfrac{1}{2}]$ and $[\tfrac{1}{2}, 1]$ and introduce a function $g_1(t)$ defined over $[0, 1]$ by the relation

$$g_1(t) = \int_0^{t/2} f(\tau)\, d\tau + \int_{1/2}^{1/2 + t/2} f(\tau)\, d\tau \tag{6.151}$$

† A vector valued function $f(t)$ defined over $[0, 1]$ is piecewise continuous over $[0, 1]$ if it is continuous at all points of $(0, 1)$, with the exception of a finite number of points where it has finite right and left limits and if, moreover, it has a finite right limit at the point 0 and a finite left limit at the point 1. From this definition it follows that a piecewise continuous function $f(t)$ is bounded, i.e., there exists a $K < +\infty$ such that $|f(t)| \le K$ for all $t \in [0, 1]$, where $|f(t)|$ is the Euclidean norm (or length) of the vector $f(t)$.
as an approximation of the fraction $tI$ of the average vector $I$. Intuitively, we could hope that the fluctuations of the function $f(t)$ on the intervals $[0, \tfrac{1}{2}]$ and $[\tfrac{1}{2}, 1]$ will compensate each other and that $g_1(t)$ will be a better approximation of $tI$ than $g(t)$. This process may be refined further: for each integer $k$ we partition the interval $[0, 1]$ into $2^k$ consecutive intervals of equal length $1/2^k$ and we define a function $g_k(t)$ over $[0, 1]$ by the relation

$$g_k(t) = \sum_{i=1}^{2^k} \int_{(i-1)/2^k}^{(i-1+t)/2^k} f(\tau)\, d\tau \tag{6.152}$$

or, equivalently, by the relation

$$g_k(t) = \int_{D_t^k} f(\tau)\, d\tau \tag{6.153}$$

where the sets $D_t^k$ are defined by the relation‡

$$D_t^k = \bigcup_{i=1}^{2^k} \left[\frac{i-1}{2^k},\ \frac{i-1+t}{2^k}\right) \tag{6.154}$$

In other words, the set $D_t^k$ is obtained by dividing the interval $[0, 1]$ into $2^k$ consecutive and equal intervals and by taking the union of the first fraction $t$ of each of these $2^k$ intervals. In Proposition 10.1 we prove that the functions $g_1(t), g_2(t), \ldots, g_k(t), \ldots$ are becoming more and more accurate approximations of the function $tI$ as $k$ increases. More precisely, we shall prove the following result:

† The mathematical term "balayage," which comes from the French, means literally: "The act of sweeping."

‡ We recall that if $a \le b$ then $[a, b)$ is the set of all $t$ such that $a \le t < b$. If $A_1, A_2, \ldots, A_k$ is a finite collection of sets then $\bigcup_{i=1}^{k} A_i$ is the set of all points which belong to at least one of the sets $A_1, A_2, \ldots, A_k$. Similarly, $\bigcap_{i=1}^{k} A_i$ is the set of all points which belong to each of the sets $A_1, A_2, \ldots, A_k$.
Proposition 10.1. If $f(t)$ is a piecewise continuous function from $[0, 1]$ into an $n$-dimensional Euclidean space and if $\varepsilon > 0$, then there exists an integer $K$ such that

$$\left|\int_{D_\alpha^k} f(t)\, dt - \alpha \int_0^1 f(t)\, dt\right| \le \varepsilon \quad \text{for all } \alpha \in [0, 1] \text{ and all } k \ge K \tag{6.155}$$
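PROOF. We shall first assume that $n = 1$, i.e., that the function $f(t)$ is real valued. Since a piecewise continuous function on $[0, 1]$ is a fortiori bounded, there exists a real number $M < +\infty$ such that

$$|f(t)| \le M \quad \text{for all } t \in [0, 1] \tag{6.156}$$

We have then

$$\int_0^1 |f(t)|^2\, dt \le M^2 < +\infty \tag{6.157}$$

and from the theory of Fourier series we know that there exists a sequence of real numbers $a_0, a_1, a_{-1}, a_2, a_{-2}, \ldots$ such that

$$|a_0|^2 + \sum_{i=1}^{\infty} \left(|a_i|^2 + |a_{-i}|^2\right) < +\infty \tag{6.158}$$

$$f(t) \approx a_0 + \sum_{i=1}^{\infty} \left(a_i \cos 2\pi i t + a_{-i} \sin 2\pi i t\right) \tag{6.159}$$

and

$$a_0 = \int_0^1 f(t)\, dt \tag{6.160}$$

where "$\approx$" in the relation (6.159) means that

$$\lim_{k \to \infty} \int_0^1 \left| a_0 + \sum_{i=1}^{k} \left(a_i \cos 2\pi i t + a_{-i} \sin 2\pi i t\right) - f(t) \right|^2 dt = 0 \tag{6.161}$$

Let $f_1(t), f_2(t), \ldots$ be real valued functions defined over $[0, 1]$ by the relations

$$f_1(t) = \frac{1}{2}\left(f\!\left(\frac{t}{2}\right) + f\!\left(\frac{t}{2} + \frac{1}{2}\right)\right) \tag{6.162}$$

$$f_{k+1}(t) = \frac{1}{2}\left(f_k\!\left(\frac{t}{2}\right) + f_k\!\left(\frac{t}{2} + \frac{1}{2}\right)\right) \quad \text{for } k = 1, 2, \ldots \tag{6.163}$$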
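We have then

$$f_1(t) \approx a_0 + \sum_{i=1}^{\infty} a_i \left(\tfrac{1}{2}\cos i\pi t + \tfrac{1}{2}\cos i\pi(t + 1)\right) + \sum_{i=1}^{\infty} a_{-i} \left(\tfrac{1}{2}\sin i\pi t + \tfrac{1}{2}\sin i\pi(t + 1)\right) \tag{6.164}$$

We know that for an even integer $i$ we have

$$\cos i\pi t = \cos i\pi(t + 1) \tag{6.165}$$

$$\sin i\pi t = \sin i\pi(t + 1) \tag{6.166}$$

and that for an odd integer $i$ we have

$$\cos i\pi t = -\cos i\pi(t + 1) \tag{6.167}$$

$$\sin i\pi t = -\sin i\pi(t + 1) \tag{6.168}$$

Relation (6.164) may then be written

$$f_1(t) \approx a_0 + \sum_{i = 2, 4, 6, \ldots} \left(a_i \cos i\pi t + a_{-i} \sin i\pi t\right) \tag{6.169}$$

i.e.,

$$f_1(t) \approx a_0 + \sum_{i = 1, 2, 3, \ldots} \left(a_{2i} \cos 2i\pi t + a_{-2i} \sin 2i\pi t\right) \tag{6.170}$$

By repeating the same procedure we could prove easily that

$$f_k(t) \approx a_0 + \sum_{i = 1, 2, 3, \ldots} \left(a_{i2^k} \cos 2i\pi t + a_{-i2^k} \sin 2i\pi t\right) \tag{6.171}$$

It follows then that the mean square deviation of $f_k(t)$ from $a_0$ is controlled by the coefficients whose index is a multiple of $2^k$ (6.172), i.e.,

$$\lim_{k \to \infty} \int_0^1 |f_k(t) - a_0|^2\, dt = 0 \tag{6.173}$$

since, by relation (6.158), the sum of the squares of those coefficients tends to zero as $k$ increases (6.174). From relation (6.173) we know that there exists an integer $K$ such that

$$\left(\int_0^1 |f_k(t) - a_0|^2\, dt\right)^{1/2} \le \varepsilon \quad \text{for all } k \ge K \tag{6.175}$$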
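It is a trivial matter† to verify that

$$\int_{D_\alpha^k} f(t)\, dt = \int_0^{\alpha} f_k(t)\, dt \tag{6.176}$$

We have then

$$\left|\int_{D_\alpha^k} f(t)\, dt - \alpha \int_0^1 f(t)\, dt\right| = \left|\int_0^{\alpha} (f_k(t) - a_0)\, dt\right| \le \int_0^1 |f_k(t) - a_0|\, dt \tag{6.177}$$

By the Cauchy-Schwarz inequality‡ we have

$$\int_0^1 |f_k(t) - a_0|\, dt \le \left(\int_0^1 |f_k(t) - a_0|^2\, dt\right)^{1/2} \tag{6.178}$$

We have already proved (see relation (6.175)) that the right side of the previous inequality is not larger than $\varepsilon$ when $k \ge K$. Hence we have

$$\left|\int_{D_\alpha^k} f(t)\, dt - \alpha \int_0^1 f(t)\, dt\right| \le \varepsilon \quad \text{for all } \alpha \in [0, 1] \text{ and all } k \ge K \tag{6.179}$$

This concludes the proof of Proposition 10.1 in the case of a real valued piecewise continuous function $f(t)$.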
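† We have immediately

$$\int_{D_\alpha^1} f(t)\, dt = \int_0^{\alpha} f_1(t)\, dt$$

since

$$\int_0^{\alpha/2} f(t)\, dt + \int_{1/2}^{1/2 + \alpha/2} f(t)\, dt = \frac{1}{2}\int_0^{\alpha} f\!\left(\frac{t}{2}\right) dt + \frac{1}{2}\int_0^{\alpha} f\!\left(\frac{t}{2} + \frac{1}{2}\right) dt$$

By repeating the same procedure we obtain easily relation (6.176) for $k = 2, 3, 4, \ldots$.

‡ We recall the Cauchy-Schwarz inequality for Riemann integrals (see Courant and Hilbert, Vol. I, p. 49). If $g(t)$ and $h(t)$ are two real valued piecewise continuous functions over $[0, 1]$ then

$$\left(\int_0^1 g(t) h(t)\, dt\right)^2 \le \int_0^1 g(t)^2\, dt \int_0^1 h(t)^2\, dt$$

Here we let $g(t) = |f_k(t) - a_0|$ and $h(t) = 1$.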
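The mechanism behind relations (6.164) to (6.173) is that each averaging step (6.163) removes the odd harmonics and halves the index of the even ones. The following sketch illustrates this on a hypothetical test function with mean $a_0 = 0.5$ and harmonics 1, 2, and 3; the function, the sample count, and the helper names are assumptions made for the illustration only.

```python
import math

def f(t):
    # Hypothetical test function: mean 0.5 plus harmonics 1, 2 and 3.
    return (0.5 + math.cos(2 * math.pi * t)
            + 0.5 * math.sin(4 * math.pi * t)
            + 0.25 * math.cos(6 * math.pi * t))

def average_step(g):
    """Return the function t -> (g(t/2) + g(t/2 + 1/2)) / 2,
    i.e. one step of the recursion (6.162)-(6.163)."""
    return lambda t: 0.5 * (g(t / 2) + g(t / 2 + 0.5))

def l2_deviation(g, a0, samples=4096):
    """Midpoint-rule estimate of the integral of |g(t) - a0|^2 on [0, 1]."""
    h = 1.0 / samples
    return sum((g((j + 0.5) * h) - a0) ** 2 for j in range(samples)) * h

a0 = 0.5
g = f
deviations = [l2_deviation(g, a0)]
for _ in range(3):
    g = average_step(g)
    deviations.append(l2_deviation(g, a0))
print(deviations)
```

For this particular function the first step leaves only the former second harmonic and the second step leaves the constant $a_0$, so the printed mean square deviations fall to zero (up to rounding) after two steps. For a general piecewise continuous function the decrease is only asymptotic, as stated in (6.173).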
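In the case of an $n$-dimensional vector valued piecewise continuous function $f(t) = (f^1(t), f^2(t), \ldots, f^n(t))$ the previous result can be applied to each component: for each $i = 1, \ldots, n$ there exists an integer $K_i$ such that†

$$\left|\int_{D_\alpha^k} f^i(t)\, dt - \alpha \int_0^1 f^i(t)\, dt\right| \le \frac{\varepsilon}{n} \quad \text{for all } \alpha \in [0, 1] \text{ and all } k \ge K_i \tag{6.180}$$

Let $K = \max_{i = 1, \ldots, n} K_i$. We have immediately

$$\left|\int_{D_\alpha^k} f(t)\, dt - \alpha \int_0^1 f(t)\, dt\right| \le \varepsilon \quad \text{for all } \alpha \in [0, 1] \text{ and all } k \ge K \tag{6.181}$$

This concludes the proof of Proposition 10.1.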
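Proposition 10.1 can be observed numerically. The sketch below, written under stated assumptions, evaluates the worst deviation in (6.155) over a grid of values of $\alpha$ for increasing $k$. The integrand (a sine wave with one jump), the midpoint quadrature rule, and the grid of $\alpha$ values are hypothetical choices for illustration; the set $D_\alpha^k$ is taken to be the union of the first fraction $\alpha$ of each of the $2^k$ equal subintervals, as described in Sec. 6.101.

```python
import math

def f(t):
    # Hypothetical piecewise continuous integrand with a jump at t = 0.4.
    return math.sin(7 * t) + (2.0 if t >= 0.4 else 0.0)

def integral(g, a, b, samples=200):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / samples
    return sum(g(a + (j + 0.5) * h) for j in range(samples)) * h

def balayage_integral(g, alpha, k):
    """Integral of g over D_alpha^k: the first fraction alpha of each
    of the 2**k equal subintervals of [0, 1]."""
    n = 2 ** k
    return sum(integral(g, i / n, (i + alpha) / n) for i in range(n))

def worst_error(g, k, grid=20):
    """Largest deviation from alpha * (integral of g over [0, 1]),
    taken over alpha = 0, 1/grid, ..., 1."""
    total = integral(g, 0.0, 1.0, samples=20000)
    return max(abs(balayage_integral(g, j / grid, k) - (j / grid) * total)
               for j in range(grid + 1))

errors = [worst_error(f, k) for k in range(7)]
print(errors)
```

With $k = 0$ the set $D_\alpha^0$ is the single interval $[0, \alpha)$ and the deviation is large; by $k = 6$ it has dropped by more than an order of magnitude, in agreement with the uniform convergence asserted by the proposition.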
6.102 Convexity of the Range of a Vector Integral

Before introducing the statement of Proposition 10.2 we define $\mathcal{A}$ as the class of all subsets of $[0, 1]$ which are the union of a finite number of intervals. The sets $D_\alpha^k$ defined earlier are examples of sets in the class $\mathcal{A}$. The class $\mathcal{A}$ is an algebra of sets. This fact could be colloquially expressed as follows: after a finite number of operations on sets in the class $\mathcal{A}$, one obtains sets which are also in $\mathcal{A}$. More precisely, we have the following properties‡:

(i) $\emptyset \in \mathcal{A}$;
(ii) $[0, 1] \in \mathcal{A}$;
(iii) if $A$ and $B \in \mathcal{A}$ then $A \cap B \in \mathcal{A}$;
(iv) if $A$ and $B \in \mathcal{A}$ then $A \cup B \in \mathcal{A}$;
(v) if $A \in \mathcal{A}$ then $([0, 1] \sim A) \in \mathcal{A}$.

† For each $i = 1, 2, \ldots, n$ and each $k = 1, 2, 3, \ldots$ the function $f_k^i(t)$ is defined in a way similar to (6.163), i.e.,

$$f_{k+1}^i(t) = \frac{1}{2}\left(f_k^i\!\left(\frac{t}{2}\right) + f_k^i\!\left(\frac{t}{2} + \frac{1}{2}\right)\right)$$

and the vector valued function $(f_k^1(t), f_k^2(t), \ldots, f_k^n(t))$ is denoted by $f_k(t)$.

‡ We do not need these five properties to define an algebra of sets: the reader may verify for instance that properties (ii) and (iii) can be derived from properties (i), (iv), and (v). We recall that $\emptyset$ denotes the empty set, i.e., the set with no element, that $A \cap B$ denotes the intersection of the sets $A$ and $B$, i.e., the set of elements which belong to both $A$ and $B$, that $A \cup B$ denotes the union of the sets $A$ and $B$, i.e., the set of elements which belong either to $A$, or to $B$, or to $A$ and $B$, and that $[0, 1] \sim A$ denotes the complement of $A$, i.e., the set of elements of $[0, 1]$ which do not belong to $A$.
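The closure properties of the class $\mathcal{A}$ can be made concrete with a small data structure. The sketch below is a simplification made for illustration: it represents only finite unions of half-open intervals $[a, b)$ inside $[0, 1)$, stored as lists of pairs, whereas the class $\mathcal{A}$ of the text allows every kind of interval. All function names are hypothetical. Intersection is obtained from union and complement alone, which mirrors the remark that property (iii) can be derived from properties (i), (iv), and (v).

```python
def normalize(intervals):
    """Sort half-open intervals [a, b) and merge those that overlap or touch."""
    merged = []
    for a, b in sorted((a, b) for a, b in intervals if a < b):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def union(A, B):
    return normalize(A + B)

def complement(A, lo=0.0, hi=1.0):
    """Complement of A relative to [lo, hi)."""
    result, cursor = [], lo
    for a, b in normalize(A):
        if cursor < a:
            result.append((cursor, a))
        cursor = max(cursor, b)
    if cursor < hi:
        result.append((cursor, hi))
    return result

def intersection(A, B):
    # De Morgan: A n B is the complement of (complement A) u (complement B).
    return complement(union(complement(A), complement(B)))

def measure(A):
    """Sum of the lengths of the disjoint intervals constituting A."""
    return sum(b - a for a, b in normalize(A))

A = [(0.0, 0.5)]
B = [(0.25, 0.75), (0.9, 1.0)]
print(union(A, B), intersection(A, B), complement(A), measure(union(A, B)))
```

Every operation returns again a finite list of disjoint intervals, which is the content of the statement that $\mathcal{A}$ is an algebra of sets.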
Let $f(t)$ be a piecewise continuous function from $[0, 1]$ into an $n$-dimensional Euclidean space. For each set $A$ in the class $\mathcal{A}$ there exists a vector $\int_A f(t)\, dt$. Let us consider the set $L(f)$ of all these vectors corresponding to all the sets in the class $\mathcal{A}$. Formally we have then

$$L(f) = \left\{\int_A f(t)\, dt : A \in \mathcal{A}\right\} \tag{6.182}$$

We shall prove that

Proposition 10.2. If the vector valued function $f(t)$ is piecewise continuous then the set $L(f)$ is convex.†

The proof of Proposition 10.2 will proceed in three steps. In Lemma 1 we shall prove that $\bar{L}(f)$, the closure‡ of $L(f)$, is convex. This result is relatively easy to prove, using Proposition 10.1, and is already a good indication of the plausibility of Proposition 10.2. However, we must note immediately that Lemma 1 is not yet as strong as Proposition 10.2, since it is quite possible for a set $A$ to have a closure $\bar{A}$ which is convex without being itself convex.§ In Lemma 2 we shall prove that any point interior¶ to the convex hull†† of $L(f)$ belongs also to $L(f)$. The proof of this result is based on Lemma 1 and on Brouwer's fixed point theorem.‡‡ Again we note that Lemma 2 is stronger than Lemma 1 but not yet as strong as Proposition 10.2, since it is quite
† We recall that a set $A$ is convex if for any pair of points $a$ and $b$ in $A$ and any real number $\mu$ with $0 < \mu < 1$ there exists a point $c$ in $A$ with $c = \mu a + (1 - \mu) b$.

‡ If $A$ is a set in an $n$-dimensional Euclidean space then $\bar{A}$ is the closure of the set $A$ defined as follows: $\bar{A}$ is the smallest closed set containing $A$ or equivalently $\bar{A}$ is the union of $A$ and of the set of the accumulation points of $A$. We recall that a point $a$, not necessarily in $A$, is an accumulation point of $A$ if for every $\varepsilon > 0$ there exists a point $b$ in $A$ with $b \ne a$ and $|a - b| \le \varepsilon$.

§ The set $A = \{x : x \in E^n,\ |x| \le 1,\ |x| \ne 0\}$ is not convex but the set $\bar{A} = \{x : x \in E^n,\ |x| \le 1\}$ is convex.

¶ We recall that a point $a$ is interior to a set $A$ in an $n$-dimensional Euclidean space if there exists an $\varepsilon > 0$ such that for all $b$ with $|a - b| < \varepsilon$ we have $b \in A$. We shall denote by int $A$ the set of points which are interior to $A$ and by $N(a, \varepsilon)$ the set of all points $b$ such that $|a - b| < \varepsilon$.

†† If $A$ is a set in an $n$-dimensional Euclidean space, then co $A$ denotes the convex hull of the set $A$, i.e., the smallest convex set containing $A$. The fundamental property of co $A$ is the following: if $a \in$ co $A$ then there exist $n + 1$ points $a_1, a_2, \ldots, a_{n+1}$ in $A$ and $n + 1$ nonnegative real numbers $\mu_1, \mu_2, \ldots, \mu_{n+1}$ such that

$$\sum_{i=1}^{n+1} \mu_i = 1 \quad \text{and} \quad a = \sum_{i=1}^{n+1} \mu_i a_i.$$

‡‡ Lemma 2 could also be proved using some generalizations of the implicit function theorem or using Peano's theorem, a theorem establishing the existence, but not necessarily the uniqueness, of the solution of some type of systems of ordinary differential equations.
possible to construct† a set $A$ which is not convex and such that any point interior to co $A$, the convex hull of $A$, belongs also to the set $A$. In Lemma 3 we prove finally that the set $L(f)$ is convex. The proof of Lemma 3 is a proof by induction based on the fact that for a space of dimension one Lemmas 2 and 3 are equivalent.

Lemma 1. The set $\bar{L}(f)$ is convex.
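PROOF. Let $a^*$ and $b^* \in \bar{L}(f)$ and $\mu \in [0, 1]$; we have to prove that $c^* = \mu a^* + (1 - \mu) b^* \in \bar{L}(f)$, i.e., that for any $\varepsilon > 0$ there exists a $c \in L(f)$ such that $|c^* - c| \le \varepsilon$. There exist an $a$ and a $b \in L(f)$ such that $|a - a^*| < \varepsilon/4$ and $|b - b^*| < \varepsilon/4$. Let $A$ and $B \in \mathcal{A}$ with $a = \int_A f(t)\, dt$ and $b = \int_B f(t)\, dt$. From Proposition 10.1 there exists an integer $k_a$ such that‡

$$\left|\int_{D_\mu^k} f(t)\chi(A)\, dt - \mu \int_0^1 f(t)\chi(A)\, dt\right| \le \frac{\varepsilon}{8} \tag{6.183}$$

for all $k \ge k_a$ and an integer $k_b$ such that

$$\left|\int_{D_\mu^k} f(t)\chi(B)\, dt - \mu \int_0^1 f(t)\chi(B)\, dt\right| \le \frac{\varepsilon}{8} \tag{6.184}$$

for all $k \ge k_b$. Let $K = \max\{k_a, k_b\}$. We have then

$$\left|\int_{D_\mu^K} f(t)\chi(A)\, dt - \mu \int_0^1 f(t)\chi(A)\, dt\right| \le \frac{\varepsilon}{8} \tag{6.185}$$

and

$$\left|\int_{D_\mu^K} f(t)\chi(B)\, dt - \mu \int_0^1 f(t)\chi(B)\, dt\right| \le \frac{\varepsilon}{8} \tag{6.186}$$

From relation (6.186) we have immediately

$$\left|\int_{[0,1] \sim D_\mu^K} f(t)\chi(B)\, dt - (1 - \mu) \int_0^1 f(t)\chi(B)\, dt\right| \le \frac{\varepsilon}{8} \tag{6.187}$$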
† Let $B$ be the closed unit square in the plane, i.e., $B = \{x = (x_1, x_2) : |x_1| \le 1,\ |x_2| \le 1\}$ and let $A = B \sim \{(0, 1)\}$. In other words, the set $A$ is a closed square with one point missing in the middle of one of its sides. The set $A$ is not convex but every point interior to the convex hull of $A$ belongs to $A$.

‡ The characteristic function of the set $A$ is denoted $\chi(A)$, i.e., $\chi(A)$ is a function whose value is 1 for all $t \in A$ and 0 for all $t \in [0, 1] \sim A$.
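Let

$$C = (A \cap D_\mu^K) \cup (B \cap ([0, 1] \sim D_\mu^K)) \tag{6.188}$$

and let $c = \int_C f(t)\, dt$. We have then from (6.185) and (6.187)

$$\left|\int_C f(t)\, dt - \mu \int_A f(t)\, dt - (1 - \mu) \int_B f(t)\, dt\right| \le \frac{\varepsilon}{4} \tag{6.189}$$

i.e.,

$$|c - \mu a - (1 - \mu) b| \le \frac{\varepsilon}{4} \tag{6.190}$$

We have now

$$|c - c^*| = |c - \mu a^* - (1 - \mu) b^*| \le |c - \mu a - (1 - \mu) b| + |a - a^*| + |b - b^*| \le \varepsilon/4 + \varepsilon/4 + \varepsilon/4 < \varepsilon \tag{6.191}$$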
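The construction (6.188) can be tried out numerically. In the sketch below, offered under stated assumptions, the two-dimensional integrand, the sets $A = [0, 0.3)$ and $B = [0.5, 1)$, the values $\mu = 0.25$ and $K = 8$, and the midpoint quadrature are all hypothetical choices made for illustration. The set $C$ takes the first fraction $\mu$ of every subinterval from $A$ and the remaining fraction from $B$, and its integral is compared with $\mu a + (1 - \mu) b$.

```python
import math

def f(t):
    # Hypothetical two-dimensional integrand.
    return (math.cos(2 * math.pi * t), t * t)

def in_D(t, mu, K):
    """True when t lies in D_mu^K, i.e. in the first fraction mu of its
    subinterval of length 2**-K."""
    s = t * 2 ** K
    return s - math.floor(s) < mu

def integrate(member, samples=65536):
    """Midpoint-rule integral of f over the set {t in [0, 1): member(t)}."""
    h = 1.0 / samples
    total = [0.0, 0.0]
    for j in range(samples):
        t = (j + 0.5) * h
        if member(t):
            y = f(t)
            total[0] += y[0] * h
            total[1] += y[1] * h
    return total

in_A = lambda t: t < 0.3
in_B = lambda t: t >= 0.5
mu, K = 0.25, 8
# C = (A n D) u (B n complement of D), as in (6.188)
in_C = lambda t: (in_A(t) and in_D(t, mu, K)) or (in_B(t) and not in_D(t, mu, K))

a, b, c = integrate(in_A), integrate(in_B), integrate(in_C)
target = [mu * a[i] + (1 - mu) * b[i] for i in range(2)]
gap = math.hypot(c[0] - target[0], c[1] - target[1])
print(c, target, gap)
```

The printed gap is of the order of $2^{-K}$, so the point $\mu a + (1 - \mu) b$ is approached as closely as desired by points of $L(f)$ when $K$ grows, which is what the lemma requires.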
This concludes the proof of Lemma 1.

Lemma 2. The set int co $L(f)$ is a subset of $L(f)$.

PROOF. Let $a \in$ int co $L(f)$. We shall prove that $a \in L(f)$. There is an $\varepsilon > 0$ and a set $S^* = \{a_i^* : i = 1, \ldots, n + 1\} \subset$ int co $L(f)$ such that $N(a, 2\varepsilon) \subset$ co $S^*$. We have int co $L(f) \subset$ int co $\bar{L}(f)$ and from Lemma 1 we have co $\bar{L}(f) = \bar{L}(f)$; it follows then that int co $L(f) \subset$ int $\bar{L}(f)$ and that $S^* \subset$ int $\bar{L}(f)$. Hence, there exists† a set $S = \{a_i : i = 1, \ldots, n + 1\} \subset L(f)$ such that $N(a, \varepsilon) \subset$ co $S$. For every $a_i \in S$ there exists a set $A_i \in \mathcal{A}$ with $\int_{A_i} f(t)\, dt = a_i$. For each $i = 1, \ldots, n + 1$ we know by Proposition 10.1 that there exists a positive integer $K_i$ such that

$$\left|\int_{D_\alpha^k \cap A_i} f(t)\, dt - \alpha \int_{A_i} f(t)\, dt\right| \le \frac{\varepsilon}{4(n + 1)} \tag{6.192}$$

for all integers $k \ge K_i$ and all $\alpha \in [0, 1]$. Let $K = \max_{i = 1, 2, \ldots, n+1} K_i$. We have then

$$\left|\int_{D_\alpha^k \cap A_i} f(t)\, dt - \alpha \int_{A_i} f(t)\, dt\right| \le \frac{\varepsilon}{4(n + 1)} \tag{6.193}$$

for all integers $k \ge K$, all $i = 1, 2, \ldots, n + 1$, and all $\alpha \in [0, 1]$.

† A rigorous proof of this intuitive fact follows: For each $a_i^* \in \bar{L}(f)$ let $a_i^1, a_i^2, a_i^3, \ldots$ be a sequence in $L(f)$ converging to $a_i^*$. Let $S_j = \{a_i^j : i = 1, 2, \ldots, n + 1\}$ and $\varepsilon_j = \max_{i = 1, 2, \ldots, n+1} |a_i^j - a_i^*|$. We have co $S^* \subset N(\text{co } S_j, \varepsilon_j)$. We recall that if $A$ is a set and $\varepsilon > 0$ then $N(A, \varepsilon)$ is the union of all the spheres of radius $\varepsilon$ with centers in $A$. Since $\lim_{j \to \infty} \varepsilon_j = 0$ there exists an integer $m$ such that $\varepsilon_m \le \varepsilon$. We have then $N(a, 2\varepsilon) \subset$ co $S^* \subset N(\text{co } S_m, \varepsilon)$ and $N(a, \varepsilon) \subset$ co $S_m$. We conclude by writing $a_i = a_i^m$ for $i = 1, 2, \ldots, n + 1$.
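Let

$$\Lambda = \left\{\lambda : \lambda \in E^{n+1},\ \lambda_i \ge 0,\ \sum_{i=1}^{n+1} \lambda_i = 1\right\}$$

For each $\lambda \in \Lambda$ we define a set $A(\lambda)$ by the following rules†

$$M(i, \lambda) = \sum_{j=1}^{i-1} \lambda_j \tag{6.194}$$

(6.195) (6.196) (6.197)

We note immediately that for a given $\lambda \in \Lambda$ the sets $A_1(\lambda), A_2(\lambda), \ldots, A_{n+1}(\lambda)$ are disjoint. From relation (6.193) and the fact that $a_i = \int_{A_i} f(t)\, dt$ we have

$$\left|\int_{D_{M(i,\lambda)}^K \cap A_i} f(t)\, dt - M(i, \lambda)\, a_i\right| \le \frac{\varepsilon}{4(n + 1)} \tag{6.198}$$

and

$$\left|\int_{D_{N(i,\lambda)}^K \cap A_i} f(t)\, dt - N(i, \lambda)\, a_i\right| \le \frac{\varepsilon}{4(n + 1)} \tag{6.199}$$

for all $i = 1, 2, \ldots, n + 1$ and all $\lambda \in \Lambda$. From relations (6.198) and (6.199) we obtain then (6.200) for all $i = 1, 2, \ldots, n + 1$ and all $\lambda \in \Lambda$. We have then (6.201) for every $\lambda \in \Lambda$. Let us prove now that $\int_{A(\lambda)} f(t)\, dt$ is a continuous function of $\lambda$ on $\Lambda$.

† Let $A_1, A_2, \ldots, A_k$ be a finite collection of sets. The set $\bigcup_{i=1}^{k} A_i$ is defined as the set of all points which belong to at least one of the sets $A_1, A_2, \ldots, A_k$. Similarly, the set $\bigcap_{i=1}^{k} A_i$ is defined as the set of all points which belong to each of the sets $A_1, A_2, \ldots, A_k$.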
For any $i = 1, \ldots, n + 1$ and any $\lambda'$ and $\lambda'' \in \Lambda$ we have (see relations (6.194), (6.195) and (6.197))

$$|M(i, \lambda') - M(i, \lambda'')| \le |\lambda' - \lambda''| \tag{6.202}$$

$$|N(i, \lambda') - N(i, \lambda'')| \le |\lambda' - \lambda''| \tag{6.203}$$

$$|\lambda_i' - \lambda_i''| \le |\lambda' - \lambda''| \tag{6.204}$$

and

$$\mu(A_i(\lambda') \,\Delta\, A_i(\lambda'')) \le 3\,|\lambda' - \lambda''| \tag{6.205}$$

which implies

$$\left|\int_{A(\lambda')} f(t)\, dt - \int_{A(\lambda'')} f(t)\, dt\right| \le 3M(n + 1)\,|\lambda' - \lambda''| \tag{6.207}$$

Let $h(x)$ be the function defined over co $S$ by the relation

$$h(x) = a - \int_{A(\lambda(x))} f(t)\, dt + x \tag{6.209}$$

We have then

$$|h(x) - a| = \left|\int_{A(\lambda(x))} f(t)\, dt - \sum_{i=1}^{n+1} \lambda_i(x)\, a_i\right| \tag{6.210}$$

which implies that $h(x)$ is a continuous function mapping co $S$ into itself. By Brouwer's fixed point theorem† there exists an $\bar{x} \in$ co $S$ such that $h(\bar{x}) = \bar{x}$, i.e., $a = \int_{A(\lambda(\bar{x}))} f(t)\, dt$, which implies $a \in L(f)$. This concludes the proof of Lemma 2.

† The fixed point theorem is the only nonelementary mathematical result quoted without proof in the present chapter. This theorem states that if a continuous function maps a closed convex subset of a Euclidean space $E^n$ into itself then it has a fixed point. In other words, if $h(x)$ is a function defined and continuous on the closed convex set $A$ such that $h(x) \in A$ for all $x \in A$ then there is an $\bar{x} \in A$ such that $h(\bar{x}) = \bar{x}$. The proof of the fixed point theorem can be found in most textbooks of topology.
Lemma 3. The set $L(f)$ is convex.

PROOF. The proof of Lemma 3 will proceed by induction. If $n = 1$ nothing needs to be proved since the statements of Lemmas 2 and 3 are equivalent for $n = 1$. Let us assume that Lemma 3 is true for $n = \nu$ and then prove it for $n = \nu + 1$. Let $a$ and $b \in L(f)$. We have to prove that $\mu a + (1 - \mu) b \in L(f)$ for all $\mu \in (0, 1)$. If $a = b$ the previous statement is immediate. Let us assume that $a \ne b$. We have now two possibilities†

(i) $\mu a + (1 - \mu) b \in \partial\, \text{co}\, L(f)$ for all $\mu \in [0, 1]$; (6.211)
(ii) $\mu a + (1 - \mu) b \in \text{int co}\, L(f)$ for some $\mu \in [0, 1]$. (6.212)

In case (ii) we have immediately‡

$$\mu a + (1 - \mu) b \in \text{int co}\, L(f) \quad \text{for all } \mu \in (0, 1) \tag{6.213}$$

which, by Lemma 2, implies that

$$\mu a + (1 - \mu) b \in L(f) \quad \text{for all } \mu \in (0, 1) \tag{6.214}$$

and hence proves Lemma 3 for case (ii). In case (i) there exists a supporting hyperplane to the convex set co $L(f)$ passing through the point $\tfrac{1}{2} a + \tfrac{1}{2} b$. Let $p$ be a nonzero outward normal to this supporting hyperplane. We have immediately§

$$p \cdot a = p \cdot b = p \cdot (\mu a + (1 - \mu) b) \quad \text{for all } \mu \in [0, 1] \tag{6.215}$$

and

$$p \cdot a \ge p \cdot x \quad \text{for all } x \in \text{co}\, L(f) \tag{6.216}$$

We shall consider the sets

$$\Sigma = \left\{\int_D f(t)\, dt : D \in \mathcal{A},\ D \subset (A \,\Delta\, B)\right\} \tag{6.217}$$

and

$$L^* = \left\{\int_{A \cap B} f(t)\, dt + x : x \in \Sigma\right\} \tag{6.218}$$

† If $A$ is a subset of $E^n$ then $\partial A$ denotes the set of boundary points of $A$. We recall that a point $a$ is a boundary point of the set $A$ if for every $\varepsilon > 0$ there is a $b \in A$ and a $c \notin A$ with $|a - b|$ and $|a - c| < \varepsilon$.

‡ Suppose that there is a $\bar{\mu} \in [0, 1]$ such that $\bar{\mu} a + (1 - \bar{\mu}) b \in$ int co $L(f)$. This implies that there is an $\varepsilon > 0$ such that $N(\bar{\mu} a + (1 - \bar{\mu}) b, \varepsilon) \subset$ int co $L(f)$. Then for all $\mu \in (0, 1)$ the point $\mu a + (1 - \mu) b$ belongs to the interior of the convex hull of the set $N(\bar{\mu} a + (1 - \bar{\mu}) b, \varepsilon) \cup \{a\} \cup \{b\}$ and a fortiori to the set int co $L(f)$. We recall that $\{a\}$ denotes the set having a single element $a$.

§ We denote by $p \cdot a$ the scalar product of the two vectors $p$ and $a$.
We have immediately $a$ and $b \in L^*$ and $L^* \subset L(f)$. We conclude by proving that the set $L^*$ is convex or equivalently by proving that the set $\Sigma$ is convex. We shall prove that for every $x \in \Sigma$ we have $p \cdot x = 0$, which implies that the set $\Sigma$ has at most dimension $\nu$ and hence is convex by the induction hypothesis.

Indeed, if there is an $x^+ \in \Sigma$ with $p \cdot x^+ < 0$ let $D^+ \in \mathcal{A}$ be such that $D^+ \subset (A \,\Delta\, B)$ and $\int_{D^+} f(t)\, dt = x^+$. Let $x_A^+ = \int_{D^+ \cap (A \sim B)} f(t)\, dt$ and $x_B^+ = \int_{D^+ \cap (B \sim A)} f(t)\, dt$. We have then $x_A^+ + x_B^+ = x^+$. We cannot have $p \cdot x_A^+ \ge 0$ and $p \cdot x_B^+ \ge 0$ because this would imply $p \cdot x^+ \ge 0$. Let us assume that $p \cdot x_A^+ < 0$ (a similar proof can be made for the case $p \cdot x_B^+ < 0$). Let $x^* = \int_{A \sim (D^+ \cap (A \sim B))} f(t)\, dt$. We have then $x^* \in L(f)$ and $a = x^* + x_A^+$. We obtain then the contradiction $p \cdot a < p \cdot x^*$.

Similarly, if there is an $x^+ \in \Sigma$ with $p \cdot x^+ > 0$ we define $D^+$, $x_A^+$ and $x_B^+$ as above. We cannot have $p \cdot x_A^+ \le 0$ and $p \cdot x_B^+ \le 0$ because this would imply $p \cdot x^+ \le 0$. Let us assume that $p \cdot x_A^+ > 0$ (a similar proof can be made for the case $p \cdot x_B^+ > 0$). Let $x^* = \int_{B \cup (D^+ \cap (A \sim B))} f(t)\, dt$. We have then $x^* \in L(f)$ and $x^* = b + x_A^+$. We obtain then the contradiction $p \cdot x^* > p \cdot b$. This concludes the proof of Lemma 3.
6.11 Proof of the Fundamental Lemma

6.111 Proof of the Fundamental Lemma in the Case $r = n - 1$

In Sec. 6.7 we have stated the following result:

Fundamental Lemma. If there is no hyperplane separating the convex sets $S^+$ and $\widetilde{W}$ then the sets $S^+$ and $W$ have at least one point in common.

We recall the definition of the sets $S^+$, $\widetilde{W}$, and $W$:

$$S^+ = \{x : x \in S,\ x_n > x_n(1; \mathcal{V})\} \tag{6.219}$$

$$\widetilde{W} = \{x(1; \mathcal{V}) + z(1; \mathcal{U}) : \mathcal{U} \in F\} \tag{6.220}$$

$$W = \{x(1; \mathcal{V}) + y(1; \mathcal{U}) : \mathcal{U} \in F\} \tag{6.221}$$

It is convenient to introduce three new sets $S_*^+$, $\widetilde{W}_*$, and $W_*$ by the relations

$$S_*^+ = \{x : x + x(1; \mathcal{V}) \in S^+\} \tag{6.222}$$

$$\widetilde{W}_* = \{x : x + x(1; \mathcal{V}) \in \widetilde{W}\} \tag{6.223}$$

$$W_* = \{x : x + x(1; \mathcal{V}) \in W\} \tag{6.224}$$

In other words, the sets $S_*^+$, $\widetilde{W}_*$, and $W_*$ are respective translations of the sets $S^+$, $\widetilde{W}$, and $W$ along the vector $x(1; \mathcal{V})$. The sets $S_*^+$, $\widetilde{W}_*$, and $W_*$ can be equivalently expressed as follows:

$$S_*^+ = \{x : x_i = 0 \text{ for } i = 1, \ldots, r \text{ and } x_n > 0\} \tag{6.225}$$

$$\widetilde{W}_* = \{z(1; \mathcal{U}) : \mathcal{U} \in F\} \tag{6.226}$$

$$W_* = \{y(1; \mathcal{U}) : \mathcal{U} \in F\} \tag{6.227}$$

The fundamental lemma stated above is then equivalent to the following result:

Modified Fundamental Lemma. If there is no hyperplane separating the convex sets $S_*^+$ and $\widetilde{W}_*$ then the sets $S_*^+$ and $W_*$ have at least one point in common.

We shall first prove the modified fundamental lemma in the case $r = n - 1$. In a later paragraph we shall show how this proof can be extended to the general case. The proof given here is of some interest even when $r < n - 1$ since it is sufficient in order to prove the maximum principle without the transversality condition. In the case $r = n - 1$ the set $S_*^+$ is of the form

$$S_*^+ = \{(0, 0, \ldots, 0, \alpha) : \alpha > 0\} \tag{6.228}$$
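If there is no hyperplane separating the convex sets $S_*^+$ and $\widetilde{W}_*$, then† there is an $\alpha_* > 0$ such that $x_* = (0, 0, \ldots, 0, \alpha_*) \in$ int $\widetilde{W}_*$. Hence, there is an $\eta > 0$ and a subset $\{e_1, e_2, \ldots, e_n\}$ of $\widetilde{W}_*$ such that $N(x_*, \eta) \subset \Delta = \text{co}\,\{0, e_1, e_2, \ldots, e_n\}$. Let

$$\Lambda = \left\{\lambda : \lambda \in E^n,\ \lambda_i \ge 0,\ \sum_{i=1}^{n} \lambda_i \le 1\right\}$$

For every $x \in \Delta$ there is a unique $\lambda(x) \in \Lambda$ such that

$$x = \sum_{i=1}^{n} \lambda_i(x)\, e_i \tag{6.229}$$

For every $\lambda \in \Lambda$ there is a unique $x(\lambda) \in \Delta$ such that

$$x(\lambda) = \sum_{i=1}^{n} \lambda_i e_i \tag{6.230}$$

and there is an $L < +\infty$ such that

$$|x(\lambda') - x(\lambda'')| \le L\,|\lambda' - \lambda''| \quad \text{for all } \lambda' \text{ and } \lambda'' \in \Lambda \tag{6.231}$$

and

$$|\lambda(x') - \lambda(x'')| \le L\,|x' - x''| \quad \text{for all } x' \text{ and } x'' \in \Delta \tag{6.232}$$

† For a proof of that statement see Appendix B.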
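The correspondence between a point $x$ of the simplex co $\{0, e_1, \ldots, e_n\}$ and its coefficient vector $\lambda(x)$ in (6.229) and (6.230) can be illustrated in the plane. The sketch below assumes $n = 2$ and two hypothetical, linearly independent vectors $e_1$ and $e_2$; independence is what makes $\lambda(x)$ unique. The function names are illustrative only.

```python
def coordinates(x, e1, e2):
    """Solve x = l1*e1 + l2*e2 for (l1, l2) by Cramer's rule (case n = 2)."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if det == 0:
        raise ValueError("e1 and e2 must be linearly independent")
    l1 = (x[0] * e2[1] - x[1] * e2[0]) / det
    l2 = (e1[0] * x[1] - e1[1] * x[0]) / det
    return l1, l2

def point(lam, e1, e2):
    """Inverse map of (6.230): the point with coefficient vector lam."""
    return (lam[0] * e1[0] + lam[1] * e2[0],
            lam[0] * e1[1] + lam[1] * e2[1])

def in_simplex(lam, tol=1e-12):
    """True when lam_i >= 0 and the sum of the lam_i is at most 1."""
    return lam[0] >= -tol and lam[1] >= -tol and lam[0] + lam[1] <= 1 + tol

e1, e2 = (2.0, 1.0), (-1.0, 3.0)   # hypothetical vertices
x = (0.5, 1.0)
lam = coordinates(x, e1, e2)
print(lam, in_simplex(lam), point(lam, e1, e2))
```

Both maps are linear, so a constant $L$ as in (6.231) and (6.232) exists; for the inverse map it depends on how close $e_1, \ldots, e_n$ are to being linearly dependent.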
Let $\varepsilon$ and $\gamma$ be two positive numbers smaller than one such that

$$KL^2(\alpha_* + \eta)^2 \gamma^2 + 2n\varepsilon \le \gamma\eta/2 \tag{6.233}$$

where $K$ is the quantity introduced in Sec. 6.9. The reason for introducing $\varepsilon$ and $\gamma$ in the given form will become clear at the end of the proof. For every $i = 1, 2, \ldots, n$ there is a control function $\mathcal{U}_i$ in the class $F$ such that

$$e_i = z(1; \mathcal{U}_i) \tag{6.234}$$

We know also that

$$z(1; \mathcal{U}_i) = \int_0^1 G(t)\, d(u_i(t), t)\, dt \tag{6.235}$$

From the results of Sec. 6.10 we know that there is an integer $k_i$ such that (6.236) where (6.237). Let $\nu = \max_{i = 1, \ldots, n} k_i$. For every $x \in \Delta$ and every $i = 1, 2, \ldots, n$, let

$$\alpha_i(x) = \sum_{j=1}^{i-1} \lambda_j(x) \tag{6.238}$$

(6.239) and (6.240). We have then

$$\mu(M(x)) = |\lambda(x)| \tag{6.241}$$

For every $x \in \Delta$ let $\mathcal{U}_x$ be a control function defined as follows:

$u_x(t) = u_i(t)$ if $t \in M_i(x)$ for some $i = 1, \ldots, n$
$u_x(t) = v(t)$ otherwise

The function $\mathcal{U}_x$ is well defined since for every $x \in \Delta$ the sets $M_i(x)$ are disjoint. It follows then that $\mathcal{U}_x$ is an admissible control function in the class $F$.
From the results of Sec. 6.9 we have then

$$|z(1; \mathcal{U}_x) - y(1; \mathcal{U}_x)| \le K(\mu(M(x)))^2 \tag{6.242}$$

i.e.,

$$|z(1; \mathcal{U}_x) - y(1; \mathcal{U}_x)| \le K\,|\lambda(x)|^2 \le KL^2\,|x|^2 \tag{6.243}$$

It is a trivial matter† to verify that

$$|z(1; \mathcal{U}_x) - x| \le 2n\varepsilon \tag{6.244}$$

From the relations (6.241) and (6.242) it follows that

$$|x - y(1; \mathcal{U}_x)| \le KL^2\,|x|^2 + 2n\varepsilon \tag{6.245}$$

We define a set $\Delta^*$ by the following relation

$$\Delta^* = \{\gamma x : x \in N(x_*, \eta)\} \tag{6.246}$$

Since $N(x_*, \eta) \subset \Delta$, $\gamma \le 1$ and $\Delta$ is the convex hull of a set containing the origin we have immediately $\Delta^* \subset \Delta$. We define a function $h(x)$ on the set $\Delta^*$ by the relation

$$h(x) = x - y(1; \mathcal{U}_x) + \gamma x_* \tag{6.247}$$

From the results of Sec. 6.8 and the preceding definitions it follows that $y(1; \mathcal{U}_x)$ is a continuous function of $x$ over $\Delta^*$. This implies that the function

† Indeed, we have

$$z(1; \mathcal{U}_x) - x = z(1; \mathcal{U}_x) - \sum_{i=1}^{n} \lambda_i(x)\, z(1; \mathcal{U}_i)$$

Bounding the right side term by term then yields relation (6.244).
$h(x)$ is also continuous with respect to $x$ over $\Delta^*$. We shall prove that the function $h(x)$ maps the set $\Delta^*$ into itself.† We have indeed

$$|h(x) - \gamma x_*| \le |x - y(1; \mathcal{U}_x)| \le KL^2\,|x|^2 + 2n\varepsilon \le KL^2(\alpha_* + \eta)^2\gamma^2 + 2n\varepsilon \quad \text{for all } x \in \Delta^* \tag{6.248}$$

i.e., using finally relation (6.233)

$$|h(x) - \gamma x_*| \le \gamma\eta/2 \tag{6.249}$$

which proves that $h(x)$ maps the set $\Delta^*$ into itself. From Brouwer's fixed point theorem‡ it follows that there is an $\bar{x} \in \Delta^*$ such that $h(\bar{x}) = \bar{x}$, i.e., such that

$$y(1; \mathcal{U}_{\bar{x}}) = \gamma x_* \tag{6.250}$$

We have $y(1; \mathcal{U}_{\bar{x}}) \in W_*$ and $\gamma x_* \in S_*^+$. This concludes the proof of the fundamental lemma in the case $r = n - 1$.

6.112 Extension of the Preceding Proof to the General Case

Let $S_P^+$, $\widetilde{W}_P$, and $W_P$ be the projections§ of the sets $S_*^+$, $\widetilde{W}_*$, and $W_*$ on the $(r + 1)$-dimensional Euclidean space $E^{r+1}$ obtained by taking the dimensions $1, 2, \ldots, r$ and $n$ of the original Euclidean space $E^n$. To prove the modified fundamental lemma in the general case it is sufficient to prove that:

Proposition 11.1. If there is no hyperplane in $E^{r+1}$ separating the convex sets $S_P^+$ and $\widetilde{W}_P$ then the sets $S_P^+$ and $W_P$ have at least one point in common.

Let us prove that Proposition 11.1 implies the modified fundamental lemma in the general case. If there is no hyperplane in $E^n$ separating the convex sets $S_*^+$ and $\widetilde{W}_*$ then there is no hyperplane in $E^{r+1}$ separating the convex sets $S_P^+$ and $\widetilde{W}_P$. By Proposition 11.1 the last statement implies that the sets $S_P^+$ and $W_P$ have at least one point $x_P$ in common. There is some point $x$ in $W_*$ whose projection is $x_P$. By definition of the set $S_*^+$ we know that $x$ belongs also to $S_*^+$. This concludes the proof that Proposition 11.1 implies the modified fundamental lemma in the general case. We note, finally, that Proposition 11.1 is identical with the modified fundamental lemma in the case $r = n - 1$.

† This means that for all $x \in \Delta^*$ we have $h(x) \in \Delta^*$.

‡ The fixed point theorem is the only nonelementary mathematical result quoted without proof in the present chapter. This theorem states that if a continuous function maps a closed convex subset of a Euclidean space $E^n$ into itself then it has a fixed point. In other words, if $f$ is a function defined and continuous on the closed convex set $A$ such that $f(x) \in A$ for all $x \in A$ then there is an $\bar{x} \in A$ such that $f(\bar{x}) = \bar{x}$. The proof of the fixed point theorem can be found in most textbooks of topology.

§ This elegant procedure is due to Warga.
6.12 An Intuitive Approach to the Maximum Principle
We shall conclude this chapter by a heuristic and intuitive description of the maximum principle. For every control function $u(t)$ in the class $F$ we have defined a trajectory and a matrix $G(t)$ along that trajectory. The basic property of the matrix $G(t)$ is that a little variation $dx$ of the state vector at the time $t$ produces a little shift $G^{-1}(t)\, dx$ of the state vector at the terminal time. One possible way to create at the time $t$ a little variation $dx$ of the state vector is to replace the control function $u(t)$ for a time $dt$ by some other admissible control vector $v$. An optimal trajectory is characterized by the fact that it is impossible to find an admissible variation which will produce a shift of the terminal state vector in a defined direction (or group of defined directions) which we call "interesting directions." Since different shifts of the terminal state vector can be added linearly, we must also require that all possible shifts of the terminal state vector be located on one side of a certain hyperplane. Otherwise we could combine different "uninteresting" shifts to make a shift in an "interesting" direction. This terminal hyperplane is transformed by the matrix $G(t)$ into a corresponding hyperplane $\Pi(t)$ at each time $t$. We must then require that at each time $t$ all possible variations be on one side of this hyperplane $\Pi(t)$. This is the maximum principle.
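The first step of this heuristic argument, replacing the control for a short time $dt$ and observing the resulting shift of the terminal state, can be tried on a toy problem. The sketch below is an illustration under stated assumptions: the scalar linear system $\dot{x} = ax + u$ with $a = -0.5$, the nominal control $u \equiv 0$, the replacement value $v = 1$, and the Euler scheme are hypothetical choices. For this particular system a small perturbation $dx$ of the state at time $t$ is carried to the terminal time as $e^{a(1 - t)}\, dx$, so the predicted shift for a needle of length $dt$ at time $t_0$ is $e^{a(1 - t_0)}\, v\, dt$.

```python
import math

def terminal_state(control, a=-0.5, x0=1.0, steps=20000):
    """Explicit Euler integration of x' = a*x + u(t) on [0, 1]; returns x(1)."""
    h = 1.0 / steps
    x = x0
    for k in range(steps):
        x += h * (a * x + control(k * h))
    return x

def needle_shift(t0, dt, v=1.0, a=-0.5):
    """Shift of x(1) when the nominal control u = 0 is replaced by the
    value v on the short interval [t0, t0 + dt)."""
    base = lambda t: 0.0
    needle = lambda t: v if t0 <= t < t0 + dt else 0.0
    return terminal_state(needle, a) - terminal_state(base, a)

a, t0, v = -0.5, 0.4, 1.0
for dt in (0.04, 0.02, 0.01):
    predicted = math.exp(a * (1.0 - t0)) * v * dt  # first-order prediction
    print(dt, needle_shift(t0, dt, v, a), predicted)
```

The observed and predicted shifts agree to first order in $dt$, and shifts produced by needles placed at different times simply add for this linear system, which is the additivity used in the heuristic argument.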
APPENDIX A
SOME RESULTS FROM THE THEORY OF ORDINARY DIFFERENTIAL EQUATIONS

In this appendix we shall state and prove some results from the theory of ordinary differential equations which are of particular interest in the study of optimization problems. We consider a differential equation

$$\dot{x} = f(x, t) \tag{6.251}$$

where the state variable $x = (x_1, x_2, \ldots, x_n)$ is an element of an $n$-dimensional Euclidean space $E^n$, where the time $t$ is an element of the closed interval† $[0, 1]$ and where the $n$-dimensional vector valued function $f(x, t) = (f_1(x, t), f_2(x, t), \ldots, f_n(x, t))$ is given.

† If $a$ and $b$ are real numbers with $a \le b$ we denote by $[a, b]$ the closed interval from $a$ to $b$, i.e., the set of all real numbers $t$ such that $a \le t \le b$. Similarly, we denote by $(a, b)$ the open interval from $a$ to $b$, i.e., the set of all real numbers $t$ such that $a < t < b$.
We assume that the vector valued function f(x, t) is defined for all† $x \in E^n$ and all $t \in [0, 1]$, continuously differentiable with respect to x, and piecewise continuous with respect to t. More precisely, we assume that there exists a finite set‡ $\{t_0, t_1, \ldots, t_k\} \subset [0, 1]$ with $t_0 = 0 < t_1 < \cdots < t_k = 1$, and a finite collection of vector valued functions $\{f_1(x, t), f_2(x, t), \ldots, f_k(x, t)\}$ such that for each $i = 1, 2, \ldots, k$,
(i) the vector valued function $f_i(x, t)$ and all its first partial derivatives with respect to x are defined and continuous with respect to x and t for all $x \in E^n$ and all $t \in [t_{i-1}, t_i]$;
(ii) $f(x, t) = f_i(x, t)$ for all $x \in E^n$ and all $t \in (t_{i-1}, t_i)$.
Moreover, we shall assume that there is a positive real number $M < +\infty$ such that

$$|f(x, t)| \le M(1 + |x|) \tag{6.252}$$

for all $x \in E^n$ and all $t \in [0, 1]$. In the previous expression |x| stands for the Euclidean length of the vector x, i.e.,

$$|x| = \Bigl(\sum_{j=1}^{n} x_j^2\Bigr)^{1/2} \tag{6.253}$$

Similarly,

$$|f(x, t)| = \Bigl(\sum_{j=1}^{n} \bigl(f(x, t)\bigr)_j^2\Bigr)^{1/2} \tag{6.254}$$

where $(f(x, t))_j$ is the j-th component of the vector f(x, t).

Let $\Theta$ be the set§ $[0, 1] - \{t_0, t_1, \ldots, t_k\}$. Let $\bar{x} \in E^n$ and $\bar{t} \in [0, 1]$. A solution of Eq. (6.251) with the boundary condition $x = \bar{x}$ at $t = \bar{t}$ is a vector valued function $\varphi(t; \bar{t}, \bar{x})$ which satisfies the following conditions:
(i) $\varphi(t; \bar{t}, \bar{x})$ is continuous with respect to t over [0, 1];
(ii) $\varphi(t; \bar{t}, \bar{x})$ is differentiable with respect to t over $\Theta$;
(iii) $\frac{d}{dt}\varphi(t; \bar{t}, \bar{x}) = f(\varphi(t; \bar{t}, \bar{x}), t)$ for all $t \in \Theta$;
(iv) $\varphi(\bar{t}; \bar{t}, \bar{x}) = \bar{x}$.
We have then the following results:

Theorem 1. For every $\bar{x} \in E^n$ and every $\bar{t} \in [0, 1]$ there exists one and only one solution $\varphi(t; \bar{t}, \bar{x})$ with the boundary condition $x = \bar{x}$ at $t = \bar{t}$.
Theorem 2. For every t and $\bar{t} \in [0, 1]$ the function $\varphi(t; \bar{t}, \bar{x})$ is continuously differentiable with respect to $\bar{x}$.

† If A is a set then “$a \in A$” means “a is an element of A” and “$B \subset A$” means “B is a subset of A.”
‡ The set denoted by $\{t_0, t_1, \ldots, t_k\}$ is the set whose elements are $t_0, t_1, \ldots$, and $t_k$.
§ If A and B are sets then A − B is the set of all points in A which are not in B.
With the help of these two classical† theorems we shall prove some results which are very useful in the theory of optimization.

Proposition A.1.‡ Suppose that f(t) is a real-valued continuous and piecewise differentiable function defined over [0, 1], that g(t) is a real-valued piecewise continuous function defined over [0, 1], and that L is a constant. We assume, moreover, that the following relation holds for all $t \in [0, 1]$ for which f(t) is differentiable and g(t) is continuous:

$$\dot{f}(t) \le L f(t) + g(t) \tag{6.255}$$

Then the following relation holds for all $t \in [0, 1]$:

$$f(t) \le e^{Lt}\Bigl(f(0) + \int_0^t e^{-L\tau} g(\tau)\,d\tau\Bigr) \tag{6.256}$$
PROOF. Let us denote by $\Theta$ the set of points t in [0, 1] for which the function f(t) is differentiable and the function g(t) is continuous. We define a function h(t) over the interval [0, 1] as the continuous solution of the differential equation

$$\dot{h}(t) = L h(t) + g(t) \quad \text{for all } t \in \Theta \tag{6.257}$$

with the initial condition

$$h(0) = f(0) \tag{6.258}$$

From Theorem 1 stated above we know that the function h(t) exists and is unique. Moreover, we can easily verify that the function h(t) has the following form:

$$h(t) = e^{Lt}\Bigl(f(0) + \int_0^t e^{-L\tau} g(\tau)\,d\tau\Bigr) \tag{6.259}$$

It remains to prove that for all $t \in [0, 1]$ we have

$$f(t) \le h(t) \tag{6.260}$$
Let us consider the family of solutions of the differential equation (6.257) corresponding to varying initial conditions at t = 0. For increasing t the differential inequality (6.255) prevents the function f(t) from crossing upward the family of solutions of the differential equation (6.257). Hence, the function f(t) is always on or below the curve h(t) passing through f(0) at t = 0 and we obtain the inequality (6.260).

† The proof of these two theorems can be found in any good textbook on ordinary differential equations such as Coddington and Levinson's “Theory of Ordinary Differential Equations.” McGraw-Hill, New York, 1955.
‡ Generalized Gronwall's inequality.

To the geometrical ideas stated above corresponds the following analytical proof. The previous argument is based entirely on the concept of relative motion of the point f(t) with respect to the family of solutions

$$e^{Lt}\Bigl(h_0 + \int_0^t e^{-L\tau} g(\tau)\,d\tau\Bigr) \tag{6.261}$$
of the differential equation (6.257) for varying initial condition $h_0$ at time 0. This relative motion of the point f(t) with respect to the family of solutions (6.261) is best described by introducing a function k(t) in the following manner: f(0) + k(t) is the value at time 0 of the particular solution (6.261) which has the value f(t) at time t, so that k(t) measures how far this initial value lies above f(0). In other words, we define the function k(t) by the relation

$$e^{Lt}\Bigl(f(0) + k(t) + \int_0^t e^{-L\tau} g(\tau)\,d\tau\Bigr) = f(t) \tag{6.262}$$

which gives immediately

$$k(t) = e^{-Lt} f(t) - f(0) - \int_0^t e^{-L\tau} g(\tau)\,d\tau \tag{6.263}$$

and, by relation (6.259),

$$k(t) = e^{-Lt}\bigl(f(t) - h(t)\bigr) \tag{6.264}$$
We conclude this analytical proof by showing that

$$k(t) \le 0 \quad \text{implies} \quad f(t) \le h(t) \tag{6.265}$$

and

$$k(t) \le 0 \quad \text{for all } t \in [0, 1] \tag{6.266}$$

The relation (6.265) follows immediately from relation (6.264). Let us now consider relation (6.266). From relation (6.263) we know that the function k(t) is continuous over [0, 1] and we have immediately

$$k(0) = 0 \tag{6.267}$$

and

$$\dot{k}(t) = -L e^{-Lt} f(t) + e^{-Lt}\dot{f}(t) - e^{-Lt} g(t) \le -L e^{-Lt} f(t) + e^{-Lt}\bigl(L f(t) + g(t)\bigr) - e^{-Lt} g(t) = 0 \tag{6.268}$$

for all $t \in \Theta$. The relation (6.266) is then satisfied. This concludes the proof of Proposition A.1.
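As a quick numerical sanity check of Proposition A.1, the following sketch evaluates the right-hand side of (6.256) by trapezoidal quadrature and compares it with f(t) on a grid. The particular choices f(t) = sin t, L = 1, and g(t) = cos t − sin t + t² are illustrative assumptions and do not come from the text; they were picked so that f′(t) = L f(t) + g(t) − t² ≤ L f(t) + g(t) on [0, 1]. The function name `gronwall_bound` is likewise hypothetical.

```python
import math

def gronwall_bound(f0, L, g, t, steps=2000):
    """Right-hand side of (6.256):
    exp(L t) * (f(0) + integral_0^t exp(-L tau) g(tau) d tau).
    The integral is approximated with the composite trapezoidal rule."""
    if t == 0.0:
        return f0
    h = t / steps
    total = 0.5 * (g(0.0) + math.exp(-L * t) * g(t))
    for i in range(1, steps):
        tau = i * h
        total += math.exp(-L * tau) * g(tau)
    return math.exp(L * t) * (f0 + h * total)

# Illustrative data (assumptions, not from the text):
# f' = cos t = L f + g - t^2 <= L f + g on [0, 1].
L = 1.0
f = math.sin
g = lambda t: math.cos(t) - math.sin(t) + t * t

grid = [i / 20 for i in range(21)]
# gap = (bound from (6.256)) - f(t); Proposition A.1 says it is nonnegative.
gaps = [gronwall_bound(f(0.0), L, g, t) - f(t) for t in grid]
bound_holds = all(gap >= -1e-9 for gap in gaps)
print(bound_holds)
```

For this particular choice the gap can also be computed by hand: it equals $e^{t}\int_0^t e^{-\tau}\tau^2\,d\tau$, which is 2e − 5 at t = 1.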
Proposition A.2. For every $x_1 \in E^n$ and every $t_1$, $t_2$, and $t_3 \in [0, 1]$ we have the identity

$$\varphi(t_3; t_2, \varphi(t_2; t_1, x_1)) = \varphi(t_3; t_1, x_1) \tag{6.269}$$
PROOF. This proposition could be colloquially expressed as follows: if we integrate the system (6.251) from the state $x_1$ at the time $t_1$ to the time $t_2$ and then integrate the system (6.251) from the new state $\varphi(t_2; t_1, x_1)$ at the time $t_2$ to the time $t_3$, we obtain a final state identical to the state which we would have obtained by integrating directly the system (6.251) from the state $x_1$ at the time $t_1$ to the time $t_3$. The proof of this proposition is a direct consequence of Theorem 1. By definition we know that (i) the functions $\varphi(t; t_2, \varphi(t_2; t_1, x_1))$ and $\varphi(t; t_1, x_1)$ are both solutions of the system (6.251); (ii) $\varphi(t; t_2, \varphi(t_2; t_1, x_1)) = \varphi(t; t_1, x_1)$ for $t = t_2$. Hence, from Theorem 1, we have

$$\varphi(t; t_2, \varphi(t_2; t_1, x_1)) = \varphi(t; t_1, x_1) \quad \text{for all } t \in [0, 1]$$

and in particular,

$$\varphi(t_3; t_2, \varphi(t_2; t_1, x_1)) = \varphi(t_3; t_1, x_1)$$

This concludes the proof of Proposition A.2.
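The identity (6.269) can be observed numerically. The sketch below is only an illustration under stated assumptions: the scalar right-hand side f(x, t) = sin x + t and the helper `flow`, which approximates $\varphi(t_{end}; t_{start}, x)$ with the classical fourth-order Runge-Kutta method, are hypothetical choices and are not part of the text. Note that $t_3 < t_2$ here, which shows that the three times need not be ordered.

```python
import math

def flow(f, t_start, t_end, x, steps=400):
    """Approximate phi(t_end; t_start, x) for the scalar equation x' = f(x, t)
    with the classical fourth-order Runge-Kutta method.
    The step is negative when t_end < t_start (backward integration)."""
    h = (t_end - t_start) / steps
    t = t_start
    for _ in range(steps):
        k1 = f(x, t)
        k2 = f(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = f(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = f(x + h * k3, t + h)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

# Hypothetical right-hand side: smooth in x, with linear growth as in (6.252).
def f(x, t):
    return math.sin(x) + t

t1, t2, t3, x1 = 0.2, 0.9, 0.5, 1.0
two_stage = flow(f, t2, t3, flow(f, t1, t2, x1))  # left side of (6.269)
direct = flow(f, t1, t3, x1)                      # right side of (6.269)
print(abs(two_stage - direct))
```

The two values agree only up to the discretization error of the integrator, so the printed difference is small but not exactly zero.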
Proposition A.3. For every $\bar{x} \in E^n$ and every t and $\bar{t} \in [0, 1]$ we have

$$|\varphi(t; \bar{t}, \bar{x})| \le (|\bar{x}| + 1)\,e^{M|t - \bar{t}|} \tag{6.270}$$
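PROOF. Let $r(t) = |\varphi(t; \bar{t}, \bar{x})|$. The function r(t) is continuous over [0, 1] and differentiable over $\Theta$. For every $t \in \Theta$ we have immediately

$$\dot{r}(t) \le |f(\varphi(t; \bar{t}, \bar{x}), t)| \tag{6.271}$$

From relation (6.252) we have

$$|f(\varphi(t; \bar{t}, \bar{x}), t)| \le M\bigl(1 + |\varphi(t; \bar{t}, \bar{x})|\bigr)$$

i.e.,

$$|f(\varphi(t; \bar{t}, \bar{x}), t)| \le M\bigl(1 + r(t)\bigr)$$

We have then

$$\dot{r}(t) \le M\bigl(1 + r(t)\bigr)$$

and also

$$r(\bar{t}) = |\bar{x}|$$

From Proposition A.1 we obtain then

$$r(t) \le (|\bar{x}| + 1)\,e^{M(t - \bar{t})} \quad \text{for all } t \in [\bar{t}, 1] \tag{6.272}$$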
A similar argument would lead to the relation

$$r(t) \le (|\bar{x}| + 1)\,e^{M(\bar{t} - t)} \quad \text{for all } t \in [0, \bar{t}]$$

By combining the last two relations we obtain the desired result:

$$r(t) \le (|\bar{x}| + 1)\,e^{M|t - \bar{t}|} \quad \text{for all } t \in [0, 1] \tag{6.273}$$

This concludes the proof of Proposition A.3.

We shall denote by $K(t; \bar{t}, \bar{x})$ the matrix whose (i, j) element is $\partial\varphi_i(t; \bar{t}, \bar{x})/\partial\bar{x}_j$, and by $D(t; \bar{t}, \bar{x})$ the matrix whose (i, j) element is $\partial f_i(x, t)/\partial x_j\,|_{x = \varphi(t; \bar{t}, \bar{x})}$, i.e., the function $\partial f_i(x, t)/\partial x_j$ evaluated at the point $x = \varphi(t; \bar{t}, \bar{x})$. From Theorem 2 we know that for every t and $\bar{t} \in [0, 1]$ and every $\bar{x} \in E^n$ the matrix $K(t; \bar{t}, \bar{x})$ exists and is continuous with respect to $\bar{x}$.
Proposition A.4. The matrix $K(t; \bar{t}, \bar{x})$ is continuous with respect to t over [0, 1], differentiable with respect to t over $\Theta$, and satisfies the following matrix differential equation†

$$\dot{K}(t; \bar{t}, \bar{x}) = D(t; \bar{t}, \bar{x})\,K(t; \bar{t}, \bar{x}) \quad \text{for all } t \in \Theta \tag{6.274}$$

with the boundary condition

$$K(\bar{t}; \bar{t}, \bar{x}) = I \tag{6.275}$$

where I is the n × n identity matrix.
PROOF. The boundary condition (6.275) is immediately satisfied. From the definition of $\varphi(t; \bar{t}, \bar{x})$ we have immediately

$$\varphi(t; \bar{t}, \bar{x}) = \bar{x} + \int_{\bar{t}}^{t} f(\varphi(\tau; \bar{t}, \bar{x}), \tau)\,d\tau \quad \text{for all } t \in [0, 1] \tag{6.276}$$

We may take the partial derivative with respect to $\bar{x}$ of both sides of relation (6.276). This operation is justified by Theorem 2 and by our assumptions concerning the function f(x, t). We obtain

$$\frac{\partial\varphi(t; \bar{t}, \bar{x})}{\partial\bar{x}} = I + \int_{\bar{t}}^{t} \left.\frac{\partial f(x, \tau)}{\partial x}\right|_{x = \varphi(\tau; \bar{t}, \bar{x})} \frac{\partial\varphi(\tau; \bar{t}, \bar{x})}{\partial\bar{x}}\,d\tau \quad \text{for all } t \in [0, 1] \tag{6.277}$$

i.e.,

$$K(t; \bar{t}, \bar{x}) = I + \int_{\bar{t}}^{t} D(\tau; \bar{t}, \bar{x})\,K(\tau; \bar{t}, \bar{x})\,d\tau \quad \text{for all } t \in [0, 1] \tag{6.278}$$

The elements of the matrix $D(t; \bar{t}, \bar{x})$ are uniformly bounded for all $t \in [0, 1]$ and hence, from Theorem 1, the differential system (6.274) with boundary condition (6.275) admits a unique solution which must coincide with the solution given by relation (6.278). This concludes the proof of Proposition A.4.

† If we denote by $K_{ij}(t; \bar{t}, \bar{x})$ the (i, j) element of the matrix $K(t; \bar{t}, \bar{x})$ and by $D_{ij}(t; \bar{t}, \bar{x})$ the (i, j) element of the matrix $D(t; \bar{t}, \bar{x})$ then the relation (6.274) could be explicitly written as follows:
$$\dot{K}_{ij}(t; \bar{t}, \bar{x}) = \sum_{l=1}^{n} D_{il}(t; \bar{t}, \bar{x})\,K_{lj}(t; \bar{t}, \bar{x})$$
for all i = 1, 2, ..., n, all j = 1, 2, ..., n, and all $t \in \Theta$.

Proposition A.5. The matrix $K(t; \bar{t}, \bar{x})$ has an inverse $K^{-1}(t; \bar{t}, \bar{x})$ which is continuous with respect to t over [0, 1], differentiable with respect to t over $\Theta$, and satisfies the following matrix differential equation:

$$\frac{d}{dt}K^{-1}(t; \bar{t}, \bar{x}) = -K^{-1}(t; \bar{t}, \bar{x})\,D(t; \bar{t}, \bar{x}) \quad \text{for all } t \in \Theta \tag{6.279}$$

with the boundary condition

$$K^{-1}(\bar{t}; \bar{t}, \bar{x}) = I \tag{6.280}$$

where I is the n × n identity matrix.
PROOF. The differential system (6.279) with the boundary condition (6.280) has a unique solution, which we denote by $K^{-1}(t; \bar{t}, \bar{x})$. It remains to prove that

$$K^{-1}(t; \bar{t}, \bar{x})\,K(t; \bar{t}, \bar{x}) = I \tag{6.281}$$

for all $t \in [0, 1]$. Indeed we have

$$K^{-1}(\bar{t}; \bar{t}, \bar{x})\,K(\bar{t}; \bar{t}, \bar{x}) = I \tag{6.282}$$

and

$$\frac{d}{dt}\bigl(K^{-1}(t; \bar{t}, \bar{x})\,K(t; \bar{t}, \bar{x})\bigr) = 0 \quad \text{for all } t \in \Theta \tag{6.283}$$

since

$$\frac{d}{dt}\bigl(K^{-1}K\bigr) = \Bigl(\frac{d}{dt}K^{-1}\Bigr)K + K^{-1}\dot{K} = -K^{-1}DK + K^{-1}DK = 0 \tag{6.284}$$

where every matrix is evaluated at $(t; \bar{t}, \bar{x})$. This concludes the proof of Proposition A.5.
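Propositions A.4 and A.5 lend themselves to a small numerical experiment. The sketch below is a hedged illustration, not the author's construction: it assumes the scalar system x′ = sin x + t (so that D reduces to cos x along the trajectory), integrates the state together with K and $K^{-1}$ using a hand-written Runge-Kutta helper `rk4`, and compares K with a central finite difference of the flow with respect to the initial state. All names and numerical values are hypothetical.

```python
import math

def rk4(rhs, state, t0, t1, steps=1000):
    """Integrate d(state)/dt = rhs(state, t) from t0 to t1 (classical RK4)."""
    h = (t1 - t0) / steps
    s, t = list(state), t0
    for _ in range(steps):
        k1 = rhs(s, t)
        k2 = rhs([a + 0.5 * h * b for a, b in zip(s, k1)], t + 0.5 * h)
        k3 = rhs([a + 0.5 * h * b for a, b in zip(s, k2)], t + 0.5 * h)
        k4 = rhs([a + h * b for a, b in zip(s, k3)], t + h)
        s = [a + h * (b1 + 2 * b2 + 2 * b3 + b4) / 6
             for a, b1, b2, b3, b4 in zip(s, k1, k2, k3, k4)]
        t += h
    return s

def augmented(s, t):
    """State s = [x, K, Kinv] for the assumed scalar example x' = sin(x) + t."""
    x, K, Kinv = s
    D = math.cos(x)               # df/dx evaluated along the trajectory
    return [math.sin(x) + t,      # state equation (6.251)
            D * K,                # variational equation (6.274)
            -Kinv * D]            # inverse equation (6.279)

x_bar, t_bar, t_end = 1.0, 0.0, 1.0
x_end, K_end, Kinv_end = rk4(augmented, [x_bar, 1.0, 1.0], t_bar, t_end)

# Central finite difference of the flow with respect to the initial state.
eps = 1e-5
x_plus = rk4(augmented, [x_bar + eps, 1.0, 1.0], t_bar, t_end)[0]
x_minus = rk4(augmented, [x_bar - eps, 1.0, 1.0], t_bar, t_end)[0]
K_fd = (x_plus - x_minus) / (2 * eps)

print(abs(K_end - K_fd), abs(K_end * Kinv_end - 1.0))
```

Both printed numbers should be small: the first measures agreement between (6.274) and the definition of K as a derivative of the flow, the second measures how well (6.281) holds for the computed solutions.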
Proposition A.6. For every $x_1 \in E^n$ and every $t_1$, $t_2$, and $t_3 \in [0, 1]$ we have

$$K(t_3; t_2, \varphi(t_2; t_1, x_1))\,K(t_2; t_1, x_1) = K(t_3; t_1, x_1) \tag{6.285}$$
PROOF. Let us take the partial derivative with respect to $x_1$ of the relation (6.269). This operation is justified by Theorem 2, and from the chain rule and the definition of the matrix $K(t; \bar{t}, \bar{x})$ we obtain immediately relation (6.285). This concludes the proof of Proposition A.6.

In the particular case $t_3 = t_1$ the identity (6.285) becomes

$$K(t_1; t_2, \varphi(t_2; t_1, x_1)) = K^{-1}(t_2; t_1, x_1) \tag{6.286}$$
Proposition A.7. We assume that
(i) y(t) is a vector valued function, continuous over [0, 1], differentiable over $\Theta$, and such that
  (α) $\dot{y}(t) = D(t; \bar{t}, \bar{x})\,y(t)$ for all $t \in \Theta$   (6.287)
  (β) $y(\bar{t}) = \bar{y}$   (6.288)
(ii) p(t) is a vector valued function, continuous over [0, 1], differentiable over $\Theta$, and such that†
  (α) $\dot{p}(t) = -D^T(t; \bar{t}, \bar{x})\,p(t)$ for all $t \in \Theta$   (6.289)
  (β) $p(\bar{t}) = \bar{p}$   (6.290)
Then
(i) $y(t) = K(t; \bar{t}, \bar{x})\,\bar{y}$ for all $t \in [0, 1]$   (6.291)
(ii) $p(t) = (K^{-1}(t; \bar{t}, \bar{x}))^T\,\bar{p}$ for all $t \in [0, 1]$   (6.292)
(iii) $y(t) \cdot p(t) = \bar{y} \cdot \bar{p}$ for all $t \in [0, 1]$   (6.293)
† The transpose of the matrix $D(t; \bar{t}, \bar{x})$ is denoted $(D(t; \bar{t}, \bar{x}))^T$ or more simply $D^T(t; \bar{t}, \bar{x})$ when no confusion is possible. Accordingly, $(K^{-1}(t; \bar{t}, \bar{x}))^T$ is the transpose of the matrix $K^{-1}(t; \bar{t}, \bar{x})$.

PROOF. The matrix $D(t; \bar{t}, \bar{x})$ is uniformly bounded for $t \in [0, 1]$. Hence the vector valued functions y(t) and p(t) exist and are unique. It remains to verify that relations (6.287) and (6.288) hold for the function y(t) defined by relation (6.291) and that relations (6.289) and (6.290) hold for the function p(t) defined by relation (6.292). We note first that $K(\bar{t}; \bar{t}, \bar{x}) = K^{-1}(\bar{t}; \bar{t}, \bar{x}) = I$, and hence that relations (6.288) and (6.290) are verified. From relations (6.274) and (6.291) we obtain for all $t \in \Theta$

$$\dot{y}(t) = \dot{K}(t; \bar{t}, \bar{x})\,\bar{y} = D(t; \bar{t}, \bar{x})\,K(t; \bar{t}, \bar{x})\,\bar{y} = D(t; \bar{t}, \bar{x})\,y(t) \tag{6.294}$$

and from relations (6.279) and (6.292) we obtain for all $t \in \Theta$

$$\dot{p}(t) = \frac{d}{dt}\bigl(K^{-1}(t; \bar{t}, \bar{x})\bigr)^T\bar{p} = \bigl(-K^{-1}(t; \bar{t}, \bar{x})\,D(t; \bar{t}, \bar{x})\bigr)^T\bar{p} = -D^T(t; \bar{t}, \bar{x})\bigl(K^{-1}(t; \bar{t}, \bar{x})\bigr)^T\bar{p} = -D^T(t; \bar{t}, \bar{x})\,p(t) \tag{6.295}$$

We have, finally, for all $t \in [0, 1]$

$$y(t) \cdot p(t) = \bigl(K(t; \bar{t}, \bar{x})\,\bar{y}\bigr) \cdot \bigl((K^{-1}(t; \bar{t}, \bar{x}))^T\bar{p}\bigr) = \bigl(K^{-1}(t; \bar{t}, \bar{x})\,K(t; \bar{t}, \bar{x})\,\bar{y}\bigr) \cdot \bar{p} = \bar{y} \cdot \bar{p} \tag{6.296}$$
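The conservation law (6.293) is easy to observe numerically. The following sketch is an illustration under assumptions: the 2 × 2 matrix returned by `D(t)` is a hypothetical stand-in for $D(t; \bar{t}, \bar{x})$, the initial vectors are arbitrary, and `rk4` is a hand-written Runge-Kutta helper. It integrates y′ = D y and p′ = −Dᵀ p side by side and compares the scalar product at the two ends of the interval.

```python
def rk4(rhs, state, t0, t1, steps=1000):
    """Integrate d(state)/dt = rhs(state, t) from t0 to t1 (classical RK4)."""
    h = (t1 - t0) / steps
    s, t = list(state), t0
    for _ in range(steps):
        k1 = rhs(s, t)
        k2 = rhs([a + 0.5 * h * b for a, b in zip(s, k1)], t + 0.5 * h)
        k3 = rhs([a + 0.5 * h * b for a, b in zip(s, k2)], t + 0.5 * h)
        k4 = rhs([a + h * b for a, b in zip(s, k3)], t + h)
        s = [a + h * (b1 + 2 * b2 + 2 * b3 + b4) / 6
             for a, b1, b2, b3, b4 in zip(s, k1, k2, k3, k4)]
        t += h
    return s

def D(t):
    """Hypothetical time-varying matrix standing in for D(t; t_bar, x_bar)."""
    return [[0.0, 1.0], [-(1.0 + t), -0.5]]

def rhs(s, t):
    y1, y2, p1, p2 = s
    d = D(t)
    return [d[0][0] * y1 + d[0][1] * y2,      # y' = D y, as in (6.287)
            d[1][0] * y1 + d[1][1] * y2,
            -(d[0][0] * p1 + d[1][0] * p2),   # p' = -D^T p, as in (6.289)
            -(d[0][1] * p1 + d[1][1] * p2)]

y_bar, p_bar = [1.0, -2.0], [0.5, 3.0]        # arbitrary initial vectors
dot_start = y_bar[0] * p_bar[0] + y_bar[1] * p_bar[1]
end = rk4(rhs, y_bar + p_bar, 0.0, 1.0)
dot_end = end[0] * end[2] + end[1] * end[3]
print(dot_start, dot_end)
```

The integrator does not preserve the scalar product exactly, so the two printed values agree only up to its discretization error.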
Proposition A.8. If $t_1 \in [0, 1]$ and $x_1 \in \operatorname{int} A$ then for all $t_2 \in [0, 1]$ the state $\varphi(t_2; t_1, x_1)$ is interior to the set† $\{\varphi(t_2; t_1, x_1) : x_1 \in A\}$.
PROOF. The vector $\varphi(t_2; t_1, x_1)$ is continuously differentiable with respect to $x_1$ and $K(t_2; t_1, x_1)$, the matrix of partial derivatives with respect to $x_1$, has a nonzero determinant (since it has an inverse $K^{-1}(t_2; t_1, x_1)$). The proof of Proposition A.8 follows then immediately from the inverse function theorem.‡

Proposition A.9. Suppose that $\bar{t} \in [0, 1]$ and that $z(\varepsilon)$ is a continuously differentiable function over [0, a] for some a > 0. For every $t \in [0, 1]$ and every $\varepsilon \in [0, a]$ define $z(t; \varepsilon) = \varphi(t; \bar{t}, z(\varepsilon))$. Then for every $t \in [0, 1]$ the function $z(t; \varepsilon)$ is continuously differentiable with respect to $\varepsilon$. Define y(t) as the derivative $\frac{d}{d\varepsilon} z(t; \varepsilon)$ evaluated at $\varepsilon = 0$. The function y(t) satisfies then the following conditions:
(i) y(t) is continuous for all $t \in [0, 1]$;
(ii) y(t) is differentiable for all $t \in \Theta$;
(iii) $\dot{y}(t) = D(t; \bar{t}, z(0))\,y(t)$ for all $t \in \Theta$.
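PROOF. By taking the total derivative with respect to $\varepsilon$ of the relation

$$z(t; \varepsilon) = \varphi(t; \bar{t}, z(\varepsilon)) \tag{6.297}$$

we obtain

$$\frac{d}{d\varepsilon} z(t; \varepsilon) = K(t; \bar{t}, z(\varepsilon))\,\frac{d}{d\varepsilon} z(\varepsilon) \tag{6.298}$$

and for $\varepsilon = 0$

$$y(t) = K(t; \bar{t}, z(0)) \left.\frac{d}{d\varepsilon} z(\varepsilon)\right|_{\varepsilon = 0} \tag{6.299}$$

i.e., since $K(\bar{t}; \bar{t}, z(0)) = I$,

$$y(t) = K(t; \bar{t}, z(0))\,y(\bar{t}) \tag{6.300}$$

From relation (6.300) and the properties of the matrix $K(t; \bar{t}, z(0))$ listed in Proposition A.4, we obtain, immediately, the required properties of the vector y(t). This concludes the proof of Proposition A.9.

† The notation $\{\varphi(t_2; t_1, x_1) : x_1 \in A\}$ stands for “the set of all vectors $\varphi(t_2; t_1, x_1)$ such that $x_1 \in A$.” We recall that a point a is interior to a set A if there exists an $\varepsilon > 0$ such that for all b with $|a - b| \le \varepsilon$ we have $b \in A$. The notation “$a \in \operatorname{int} A$” means “a is interior to the set A.”
‡ See T. M. Apostol, “Mathematical Analysis,” p. 144. Addison-Wesley, Reading, Massachusetts, 1957.

APPENDIX B
THE GEOMETRY OF CONVEX SETS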
In this appendix we shall state and prove some results about convex sets.† These results are needed at various stages of the theory of optimal control.

For any positive integer n the n-dimensional Euclidean space will be denoted by $E^n$. The length of a vector a in $E^n$ will be denoted by |a|. The scalar product of two vectors a and b in $E^n$ will be denoted by $a \cdot b$. We have obviously $a \cdot a = |a|^2$ for all $a \in E^n$. We write “$a \in A$” to mean “a is an element of the set A” and “$A \subset B$” to mean “the set A is a subset of the set B.” Similarly, “$a \notin A$” means “a is not an element of the set A.” If A and B are sets then $A \cap B$, the intersection of A and B, is the set of all elements belonging to both sets A and B; $A \cup B$, the union of A and B, is the set of all elements belonging to at least one of the sets A and B; A − B is the set of all elements of A which do not belong to B.

We start by recalling some definitions concerning the topology of Euclidean spaces:

a point a is interior to a subset A of $E^n$ if there exists an $\varepsilon > 0$ such that for all $b \in E^n$ with $|a - b| \le \varepsilon$ we have $b \in A$;
the interior of a set A is the set of all interior points of A. The interior of a set A is denoted by int A;
a point a is a boundary point of a subset A of $E^n$ if for every $\varepsilon > 0$ there exist a $b \in A$ and a $c \notin A$ such that $|a - b| \le \varepsilon$ and $|a - c| \le \varepsilon$;
the boundary of a set A is the set of all its boundary points. The boundary of a set A is denoted by $\partial A$;
a set A in $E^n$ is open if all its points are interior points of A;
a set A in $E^n$ is closed if the set $E^n - A$ is open;
the closure of a set A is the smallest closed set containing A. The closure of a set A is denoted by $\bar{A}$.

† For a more balanced study of convex sets see H. G. Eggleston, “Convexity.” Cambridge Univ. Press, London and New York, 1958.
We introduce now the definition of a convex set. A set A in $E^n$ is convex if for any a and $b \in A$ and any $\mu \in [0, 1]$ we have $\mu a + (1 - \mu) b \in A$. In other words a set A is convex if any line segment connecting two points of A is entirely contained in A.

Proposition B.1. If the set A is convex and if $a \notin \bar{A}$ then there exists a nonzero vector p such that

$$p \cdot x \le p \cdot a \quad \text{for all } x \in A \tag{6.301}$$
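PROOF. Let b be a point of $\bar{A}$ which is closest to a, i.e., such that

$$|a - b| \le |a - x| \quad \text{for all } x \in \bar{A} \tag{6.302}$$

We have $a \ne b$. Let $p = a - b$. It remains to prove that

$$(a - b) \cdot x \le (a - b) \cdot a \quad \text{for all } x \in A \tag{6.303}$$

Indeed, for all $x \in A$ and all $\mu \in [0, 1]$ we have $(1 - \mu) b + \mu x \in \bar{A}$, which implies

$$|b - a|^2 \le |(1 - \mu) b + \mu x - a|^2 \tag{6.304}$$

i.e.,

$$2\mu\bigl((x - b) \cdot (b - a)\bigr) + \mu^2\bigl((x - b) \cdot (x - b)\bigr) \ge 0 \tag{6.305}$$

Since relation (6.305) holds for all $\mu \in [0, 1]$, we have

$$(x - b) \cdot (b - a) \ge 0 \tag{6.306}$$

i.e., $(a - b) \cdot x \le (a - b) \cdot b$. Since $(a - b) \cdot b = (a - b) \cdot a - |a - b|^2 \le (a - b) \cdot a$, this gives

$$(a - b) \cdot x \le (a - b) \cdot a \tag{6.307}$$

This concludes the proof of Proposition B.1.

Proposition B.2. If the set A is convex and if $a \in \partial A$ then there exists a nonzero vector p such that

$$p \cdot x \le p \cdot a \quad \text{for all } x \in A \tag{6.308}$$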
PROOF. Let $a_1, a_2, \ldots$ be a sequence of points converging to the point a and such that $a_i \notin \bar{A}$ for i = 1, 2, .... For each $a_i$ let $p_i$ be the nonzero vector given by Proposition B.1 such that

$$p_i \cdot x \le p_i \cdot a_i \quad \text{for all } x \in A \tag{6.309}$$

We may assume that $|p_i| = 1$ for all i = 1, 2, .... There exists a vector p with |p| = 1 such that some subsequence $p_{k_1}, p_{k_2}, \ldots$ of $p_1, p_2, \ldots$ converges to p. We have then for any $x \in A$

$$p \cdot x = \lim_{i \to \infty} (p_{k_i} \cdot x) \le \lim_{i \to \infty} (p_{k_i} \cdot a_{k_i}) = p \cdot a \tag{6.310}$$

This concludes the proof of Proposition B.2.

Let us introduce another definition. Two convex sets A and B are separated if there exist a nonzero vector p and a scalar $\alpha$ such that

$$x \cdot p \le \alpha \ \text{ for all } x \in A, \qquad x \cdot p \ge \alpha \ \text{ for all } x \in B \tag{6.311}$$

Colloquially speaking we could say that two convex sets A and B are separated if there exists a hyperplane P such that the set A is on one side of P and the set B is on the other side of P. A hyperplane P is completely determined by a nonzero vector p, normal to P, and a real number $\alpha$. In that case the hyperplane P is the set of all vectors x such that

$$x \cdot p = \alpha \tag{6.312}$$
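The construction used in the proof of Proposition B.1 (take the closest point b of the closed convex set and set p = a − b) can be tried out directly. The sketch below is a minimal illustration under assumptions that are not in the text: the convex set is the closed unit square in the plane, for which the closest point is obtained by clamping each coordinate, and the point a and the helper names are hypothetical.

```python
import random

def project_onto_box(a, lo=0.0, hi=1.0):
    """Closest point of the box [lo, hi]^n to a (componentwise clamping)."""
    return [min(max(ai, lo), hi) for ai in a]

def dot(u, v):
    """Scalar product of two vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

a = [2.0, -0.5]                        # a point outside the closed unit square
b = project_onto_box(a)                # closest point of the square to a
p = [ai - bi for ai, bi in zip(a, b)]  # p = a - b, as in the proof of B.1

# Check p . x <= p . a on random points of the square and on its corners.
random.seed(0)
samples = [[random.random(), random.random()] for _ in range(1000)]
samples += [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
separated = all(dot(p, x) <= dot(p, a) for x in samples)
print(b, p, separated)
```

In the language of (6.311), the vector p together with any scalar between the largest value of p · x on the square and p · a defines a hyperplane with the square on one side and the point a on the other.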
Proposition B.3. Let A be a convex set in $E^n$. Let $a \in A$ and $b \in E^n$ with $a \ne b$. The set $\{a + \lambda(b - a) : \lambda > 0\}$ is denoted by B. Suppose that the convex sets A and B are not separated. Then there exists an $\alpha > 0$ such that $a + \alpha(b - a) \in \operatorname{int} A$.
PROOF. The set $\bar{A}$, closure of the convex set A, is also convex. There are three possibilities: (i) $a + \lambda(b - a) \notin \bar{A}$ for all $\lambda > 0$; (ii) there exists a $\bar{\lambda} > 0$ such that $a + \lambda(b - a) \in \partial A$ for all $\lambda \in (0, \bar{\lambda})$; (iii) there exists a $\bar{\lambda} > 0$ such that $a + \lambda(b - a) \in \operatorname{int} A$ for all $\lambda \in (0, \bar{\lambda})$. Possibility (iii) leads immediately to the required result. We shall show now that possibilities (i) and (ii) lead to contradictions.

First we consider possibility (i). For every i = 1, 2, 3, ... let $p_i$ be the nonzero vector given by Proposition B.1 such that

$$p_i \cdot x \le p_i \cdot \Bigl(a + \frac{1}{i}(b - a)\Bigr) \quad \text{for all } x \in A \tag{6.313}$$

We may assume that $|p_i| = 1$ for all i = 1, 2, .... There exists a vector p with |p| = 1 such that some subsequence $p_{k_1}, p_{k_2}, \ldots$ of the sequence $p_1, p_2, \ldots$ converges to p. For every i = 1, 2, ... we have then

$$p_{k_i} \cdot a \le p_{k_i} \cdot \bigl(a + k_i^{-1}(b - a)\bigr) \tag{6.314}$$

Hence, for every $\lambda > k_i^{-1}$ we have

$$p_{k_i} \cdot a \le p_{k_i} \cdot \bigl(a + \lambda(b - a)\bigr) \tag{6.315}$$

For any $x \in A$ we have

$$p \cdot x = \lim_{i \to \infty} (p_{k_i} \cdot x) \le \lim_{i \to \infty} \bigl(p_{k_i} \cdot (a + k_i^{-1}(b - a))\bigr) = p \cdot a \tag{6.316}$$

and for any $\lambda > 0$ we have

$$p \cdot \bigl(a + \lambda(b - a)\bigr) = \lim_{i \to \infty} \bigl(p_{k_i} \cdot (a + \lambda(b - a))\bigr) \ge \lim_{i \to \infty} (p_{k_i} \cdot a) = p \cdot a \tag{6.317}$$
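From relations (6.316) and (6.317) we conclude that the sets A and B are separated, which contradicts our assumption.

Now we shall consider possibility (ii). Let p be the nonzero vector given by Proposition B.2 such that

$$p \cdot x \le p \cdot \bigl(a + (\bar{\lambda}/2)(b - a)\bigr) \quad \text{for all } x \in A \tag{6.318}$$

In particular, we have

$$p \cdot a \le p \cdot \bigl(a + (\bar{\lambda}/2)(b - a)\bigr) \tag{6.319}$$

and

$$p \cdot \bigl(a + \lambda(b - a)\bigr) \le p \cdot \bigl(a + (\bar{\lambda}/2)(b - a)\bigr) \quad \text{for all } \lambda \in (0, \bar{\lambda}) \tag{6.320}$$

Relations (6.319) and (6.320) imply that $p \cdot (b - a) = 0$, i.e., that

$$p \cdot \bigl(a + \lambda(b - a)\bigr) = p \cdot a \quad \text{for any } \lambda > 0 \tag{6.321}$$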
From relations (6.318) and (6.321) we conclude that the sets A and B are separated, which contradicts our assumption.

Let us introduce a last definition. If A is a set in $E^n$, we define the convex hull of A as the smallest convex set containing A. The convex hull of A is denoted co A.

Proposition B.4. If A is a set in $E^n$ and if $a \in \operatorname{co} A$ then a is the convex combination of n + 1 vectors in A, i.e., there exist n + 1 vectors $a_1, a_2, \ldots, a_{n+1}$ in A and n + 1 nonnegative numbers $\mu_1, \mu_2, \ldots, \mu_{n+1}$ such that $a = \sum_{i=1}^{n+1} \mu_i a_i$ and $1 = \sum_{i=1}^{n+1} \mu_i$.
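PROOF. Let A* be the set of all points of the form $\sum_{i=1}^{k} \mu_i a_i$ where $a_1, a_2, \ldots, a_k \in A$, $\mu_1, \mu_2, \ldots, \mu_k \ge 0$, and $\sum_{i=1}^{k} \mu_i = 1$. The set A* is convex and hence co A ⊂ A*. If $a = \sum_{i=1}^{k} \mu_i a_i \in A^*$, with the terms numbered so that $\mu_1 > 0$, we may construct the sequence $a_1^*, a_2^*, \ldots, a_{k-1}^*$ by the following rules:

$$a_1^* = \frac{\mu_1 a_1 + \mu_2 a_2}{\mu_1 + \mu_2} \tag{6.322}$$

$$a_i^* = \frac{\bigl(\sum_{j=1}^{i} \mu_j\bigr) a_{i-1}^* + \mu_{i+1} a_{i+1}}{\sum_{j=1}^{i+1} \mu_j} \quad \text{for } i = 2, 3, \ldots, k - 1 \tag{6.323}$$

We have then $a_i^* \in \operatorname{co} A$ for all i = 1, 2, ..., k − 1 and $a_{k-1}^* = a$. This proves that A* ⊂ co A and hence, A* = co A.

Now we shall prove by contradiction that any $a \in A^*$ can be expressed as a convex combination of n + 1 vectors in A. Let k be the smallest integer such that there exist vectors $a_1, a_2, \ldots, a_k$ in A and real numbers $\mu_1, \mu_2, \ldots, \mu_k > 0$ with $\sum_{i=1}^{k} \mu_i = 1$ and $\sum_{i=1}^{k} \mu_i a_i = a$. If $k \le n + 1$ then nothing remains to prove. We shall show that k > n + 1 leads to a contradiction. If k > n + 1 then the k − 1 > n vectors $a_1 - a_k, a_2 - a_k, \ldots, a_{k-1} - a_k$ are linearly dependent and hence there exist real numbers $\alpha_1, \alpha_2, \ldots, \alpha_k$, one of them at least being different from zero, such that $\sum_{i=1}^{k} \alpha_i = 0$ and $\alpha_1 a_1 + \alpha_2 a_2 + \alpha_3 a_3 + \cdots + \alpha_k a_k = 0$. Let $\Theta$ be the set of real numbers $\theta$ such that $\theta\alpha_i \ge -\mu_i$ for i = 1, 2, ..., k. The set $\Theta$ is closed, nonempty (since it contains the number zero), and does not contain the entire real line (since one at least of the $\alpha_i$ is different from zero). Let $\theta_0$ be a boundary point of the set $\Theta$. We have $\theta_0\alpha_j = -\mu_j$ for some j among the integers 1, 2, ..., k. We have then $a = \sum_{i=1}^{k} (\mu_i + \theta_0\alpha_i) a_i$, i.e.,

$$a = \sum_{\substack{i=1 \\ i \ne j}}^{k} (\mu_i + \theta_0\alpha_i)\,a_i$$

Since

$$\mu_i + \theta_0\alpha_i \ge 0 \quad \text{and} \quad \sum_{\substack{i=1 \\ i \ne j}}^{k} (\mu_i + \theta_0\alpha_i) = 1$$

the point a is a convex combination of k − 1 vectors of A, which contradicts the minimality of k. This concludes the proof of Proposition B.4.

ACKNOWLEDGMENTS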
I am very grateful to my wife Carolyn Halkin and to Messrs. A. A. Fredericks, J. W. Holtzman, S. Horing, R. A. Horn, J. C. Hsu, A. G. Lubowe, and V. O. Mowery for their valuable comments on this paper. During the Fall of 1964 I gave a series of lectures on the mathematical theory of optimal control at the Bell Telephone Laboratories. The “textbook” for these lectures was an earlier version of the present chapter. On that occasion I was most fortunate to receive many pertinent suggestions for which I thank S. B. Alterman, H. G. Ansell, P. J. Buxbaum, and W. L. Nelson.

REFERENCES

1. M. Aoki, Minimal Effort Control Systems with an Upper Bound of the Control Time, IEEE Trans. Automat. Control No. 1, 60-61 (1963).
2. R. Bellman, “Adaptive Control Processes, a Guided Tour.” Princeton Univ. Press, Princeton, New Jersey, 1961.
3. R. Bellman, “Dynamic Programming.” Princeton Univ. Press, Princeton, New Jersey, 1957.
4. R. Bellman, I. Glicksberg, and O. Gross, On the “Bang-Bang” Control Problems, Quart. Appl. Math. 14, 11-18 (1956).
5. L. D. Berkovitz, Variational Methods in Problems of Control and Programming, J. Math. Anal. Appl. 3, 145-169 (1961).
6. A. Blaquière and G. Leitmann, On the Geometry of Optimal Processes, Div. Appl. Mech., Univ. California, Berkeley, Tech. Rept. 64-10 (1964).
7. J. V. Breakwell, The Optimization of Trajectories, J. Soc. Indust. Appl. Math. 7, No. 2, 215-247 (1959).
8. A. E. Bryson and W. R. Denham, A Steepest-Ascent Method for Solving Optimum Programming Problems, Raytheon Company, Tech. Rept. BR-1303 (1961).
9. D. Bushaw, Dynamical Polysystems and Optimization, Contributions to Differential Equations 2, No. 3, 351-365 (1963).
10. C. Carathéodory, Variationsrechnung, in “Die Differential- und Integralgleichungen der Mechanik und Physik” (Ph. Frank and R. v. Mises, eds.), pp. 227-279. Vieweg, Braunschweig, Germany, 1930.
11. R. Courant and D. Hilbert, “Methoden der Mathematischen Physik.” Springer, Berlin, 1937.
12. C. A. Desoer, Pontryagin's Maximum Principle and the Principle of Optimality, J. Franklin Inst. 271, 413-426 (1961).
13. S. Dreyfus, Dynamic Programming and the Calculus of Variations, J. Math. Anal. Appl. 1, 228-239 (1960).
14. A. F. Filippov, On Certain Questions in the Theory of Optimal Control, Vestn. Mosk. Univ. Ser. Mat. Mekhan. Astron. Fiz. Khim. No. 2, 25-32 (1959). [Engl. Transl.: J. Soc. Ind. Appl. Math. Ser. A pp. 76-84 (1962).]
15. I. Flügge-Lotz and H. Halkin, Pontryagin's Maximum Principle and Optimal Control, Stanford Univ., Stanford, California, Tech. Rept. 130 (1961).
16. I. Flügge-Lotz and R. Marbach, The Optimal Control of Some Altitude Control Systems for Different Performance Criteria, ASME J. Basic Eng. Ser. D 85, 165-176 (1963).
17. I. Flügge-Lotz and M. Maltz, Analysis of Chatter in Contactor Control Systems, with Applications to Dual-Input Plants, Stanford University, Stanford, California, Tech. Rept. SUDAER 155 (1963).
18. B. Fraeijs de Veubeke, Méthodes variationnelles et performances optimales en aéronautique, Bull. Soc. Math. Belg. 8, 136-157 (1956).
19. A. T. Fuller, Bibliography of Optimum Nonlinear Control of Determinate and Stochastic-Definite Systems, J. Electronics Control 13, 589-611 (1962).
20. A. T. Fuller, Bibliography of Pontryagin's Maximum Principle, J. Electronics Control 15, 513-517 (1963).
21. R. V. Gamkrelidze, Optimal Sliding States, Dokl. Akad. Nauk SSSR 143, 1243-1245 (1962). (In Russian.)
22. H. Halkin, Lyapounov's Theorem on the Range of a Vector Measure and Pontryagin's Maximum Principle, Arch. Rational Mech. Anal. 10, 296-304 (1962).
23. H. Halkin, The Principle of Optimal Evolution, in “Nonlinear Differential Equations and Nonlinear Mechanics” (J. P. LaSalle and S. Lefschetz, eds.), pp. 284-302. Academic Press, New York, 1963.
24. H. Halkin, On the Necessary Condition for Optimal Control of Nonlinear Systems, J. Anal. Math. 12, 1-82 (1964).
25. H. Halkin, A Generalization of LaSalle's “Bang-Bang” Principle, SIAM J. Control 2, 199-203 (1965).
26. H. Halkin, On a Generalization of a Theorem of Lyapounov, J. Math. Anal. Appl. 10, 325-329 (1965).
27. H. Halkin, Some Further Generalizations of a Theorem of Lyapounov, Arch. Rational Mech. Anal. 17, 272-277 (1964).
28. H. Halkin, Topological Aspects of Optimal Control of Dynamical Polysystems, Contributions to Differential Equations 3, 377-385 (1964).
29. H. Halkin, Method of Convex Ascent, in “Computing Methods in Optimization Problems” (A. V. Balakrishnan and L. W. Neustadt, eds.), pp. 211-239. Academic Press, New York, 1964.
30. H. Halkin, Optimal Control for Systems Described by Difference Equations, Advan. Control Systems 173-196 (1964).
31. T. J. Higgins, A Resume of the Basic Literature of State-Space Techniques in Automatic Control Theory, JACC (1962).
32. R. E. Kalman, The Theory of Optimal Control and the Calculus of Variations, RIAS Report 61-3 (1961).
33. H. J. Kelley, Methods of Gradients, in “Optimization Techniques” (G. Leitmann, ed.). Academic Press, New York, 1962.
34. J. P. LaSalle, The Time Optimal Control Problem, in “Contributions to the Theory of Nonlinear Oscillations,” Vol. V, pp. 1-24. Princeton Univ. Press, Princeton, New Jersey, 1960.
35. G. Leitmann, “Necessary Conditions for Optimal Control and Applications,” Parts I and II, Div. Appl. Mech., Univ. California, Berkeley, California, Tech. Rept. 64-3 (1964).
36. G. Leitmann, On a Class of Variational Problems in Rocket Flight, J. Aerospace Sci. 26, No. 9, 586-591 (1959).
37. A. Lyapounov, Sur les fonctions vecteurs complètement additives, Bull. Acad. Sci. USSR Phys. Ser. 4, 465-478 (1940). (In Russian with a French résumé.)
38. E. J. McShane, On Multipliers for Lagrange Problems, Amer. J. Math. 61, 809-819 (1939).
39. A. Miele, A Survey of the Problem of Optimizing Flight Paths of Aircraft and Missiles, Boeing Scientific Research Laboratories, Tech. Rep. No. 27, July 1960.
40. L. W. Neustadt, The Existence of Optimal Controls in the Absence of Convexity Conditions, J. Math. Anal. Appl. 7, 110-117 (1963).
41. J. J. O'Donnell, Bounds on Limit Cycles in Two-Dimensional Bang-Bang Control Systems with an Almost Time Optimal Switching Curve, Proc. JACC (1964).
42. B. Paiewonsky and P. J. Woodrow, The Synthesis of Optimal Controls for a Class of Rocket Steering Problems, AIAA Summer Meeting 1963.
43. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, “The Mathematical Theory of Optimal Processes” [Engl. Transl.] (L. W. Neustadt, ed.). Wiley (Interscience), New York, 1962.
44. E. Roxin, The Existence of Optimal Controls, Michigan Math. J. 9, 109-119 (1962).
45. J. A. Stiles, Time Optimal Control of a Two Variable System (Ph.D. dissertation), Cambridge Univ., Cambridge, 1964.
46. L. G. Stoleru, A Quantitative Model of Growth of the Algerian Economy, Inst. Math. Studies Soc. Sci., Stanford University, Stanford, California, Tech. Rept. 124 (1963).
47. J. Warga, Relaxed Variational Problems, J. Math. Anal. Appl. 4, 111-128 (1962).
48. J. Warga, Necessary Conditions for Minimum in Relaxed Variational Problems, J. Math. Anal. Appl. 4, 129-145 (1962).
7
On the Geometry of Optimal Processes†

A. BLAQUIÈRE
FACULTY OF SCIENCES, UNIVERSITY OF PARIS, PARIS, FRANCE

G. LEITMANN
DEPARTMENT OF MECHANICAL ENGINEERING, UNIVERSITY OF CALIFORNIA, BERKELEY, CALIFORNIA

7.0 Introduction
7.1 Dynamical System
  7.11 Transfer of System
  7.12 Performance Index and Optimality
  7.13 Additivity Property; Union of Paths
7.2 Augmented State Space and Trajectories
7.3 Limiting Surfaces and Optimal Isocost Surfaces
7.4 Some Properties of Optimal Isocost Surfaces
7.5 Some Global Properties of Limiting Surfaces
  7.51 Lemma 1
  7.52 Lemma 2
  7.53 A- and B-Points
  7.54 Lemma 3
  7.55 Theorem 1 and Corollary 1
7.6 Some Local Properties of Limiting Surfaces
  7.61 First Basic Assumption
  7.62 Definitions of Local Cones $\mathscr{C}_A(x)$ and $\mathscr{C}_B(x)$
  7.63 Interior Points of $\mathscr{C}_A(x)$ and $\mathscr{C}_B(x)$; A Second Basic Assumption; Lemma 4
  7.64 Local Cone $\mathscr{S}(x)$
7.7 Some Properties of Local Cones
  7.71 A Partition of $E^{n+1}$; Lemma 5; Corollaries 2 and 3
  7.72 Lemmas 6 and 7; Corollary 4
  7.73 Lemma 8
  7.74 Lemma 9
7.8 Tangent Cone $\mathscr{C}_\Sigma(x)$
  7.81 Definition of $\mathscr{C}_\Sigma(x)$
  7.82 Closure of a Subset of $\Sigma$
  7.83 A Partition of $\Sigma$; Lemma 10
7.9 A Nice Limiting Surface
  7.91 Closure of a Subset of a Nice $\Sigma$; Lemma 11
  7.92 Cones $\mathscr{S}(x)$ and $\mathscr{C}_\Sigma(x)$ of a Nice $\Sigma$; Lemma 12
  7.93 Lemmas 13 and 14
7.10 A Set of Admissible Rules
  7.101 State Equations and Control
  7.102 An Integral Performance Index and the Trajectory Equation
7.11 Velocity Vectors in Augmented State Space
  7.111 Lemma 15
  7.112 Lemma 16
  7.113 Lemmas 17 and 18; Corollary 5
7.12 Separability of Local Cones
  7.121 Separating Hyperplane
  7.122 Cone of Normals
7.13 Regular and Nonregular Interior Points of a Limiting Surface
7.14 Some Properties of a Linear Transformation
  7.141 Variational Equations; Lemma 19
  7.142 Adjoint Equations
7.15 Properties of Separable Local Cones
  7.151 Lemma 20
  7.152 Lemma 21
  7.153 Corollaries 6 and 7
  7.154 Theorems 2 and 3
  7.155 Corollaries 8 and 9
7.16 Attractive and Repulsive Subsets of a Limiting Surface
  7.161 Corollaries 10 and 11
7.17 Regular Subset of a Limiting Surface
  7.171 A Local Property of a Regular Subset; Corollary 12
  7.172 A Maximum Principle; Theorem 4
  7.173 Relation between Gradient and Adjoint Vectors
  7.174 Relation to Dynamic Programming
7.18 Antiregular Subset of a Limiting Surface
  7.181 Corollary 13
7.19 Symmetrical Subset of Local Cone $\mathscr{S}(x)$
  7.191 Lemmas 22 and 23
  7.192 Dimension of a Symmetrical Subset; Theorems 5 and 6
  7.193 Symmetrical Subset as Hyperplane; Theorem 7
  7.194 Symmetrical Subset and Separating Hyperplane; Lemmas 24 and 25
7.20 A Maximum Principle
  7.201 Assumptions; Lemma 26
  7.202 The Cone $\mathscr{C}''$
  7.203 A Property of Cone $\mathscr{C}''$; Lemma 27
  7.204 Convex Closure of Cone $\mathscr{C}''$
  7.205 The Cone $\mathscr{C}'$; Lemma 28
  7.206 Convex Closure of Cone $\mathscr{C}'$
  7.207 Theorem 8
7.21 Boundary Points of $\mathscr{E}^*$
  7.211 Lemma 29
  7.212 Lemmas 30 and 31
  7.213 Lemma 32
  7.214 A Fundamental Analogy
  7.215 Third Basic Assumption
  7.216 Definition of Local Cones $\mathscr{C}_I(x)$ and $\mathscr{C}_O(x)$
  7.217 Interior Points of $\mathscr{C}_I(x)$ and $\mathscr{C}_O(x)$; A Fourth Basic Assumption; Lemma 33
  7.218 Local Cone g(x)
  7.219 Lemmas 34 and 35; Corollary 14
  7.2110 Lemmas 36 and 37
  7.2111 Lemma 38
  7.2112 Separability of Local Cones $\mathscr{C}_I(x)$ and $\mathscr{C}_O(x)$
  7.2113 Cone of Normals at Boundary Point
  7.2114 Regular and Nonregular Points of the Boundary
  7.2115 Properties of Local Cones at Boundary Points; Lemmas 39-41; Theorems 9 and 10
  7.2116 A Maximum Principle (Abnormal Case); Theorem 11
7.22 Boundary and Interior Points of $\mathscr{E}^*$
  7.221 Lemma 42
  7.222 Lemmas 43 and 44
  7.223 An Assumption Concerning Boundary E; Lemma 45
  7.224 Another Fundamental Analogy
  7.225 Some Basic Assumptions; Local Cones WJ(x) and 0,(x)
  7.226 Local Cone G(x)
  7.227 Tangent Cone $\mathscr{C}_E(x)$ at a Point of E
  7.228 A Nice Boundary E; Lemma 46
  7.229 Concluding Remarks
7.23 Degenerated Case
  7.231 B-Degenerated Point; Lemmas 47-49
  7.232 A-Degenerated Point; Lemmas 50-52
  7.233 Corollary 15
  7.234 A Trivial Maximum Principle
  7.235 Concluding Remarks
7.24 Some Illustrative Examples
  7.241 One-Dimensional Regulator Problem
  7.242 Power-Limited Rocket Problem
  7.243 A Navigation Problem
Appendix
Bibliography

† This work was supported by the U.S. Office of Naval Research under Contract Nonr-3656(31).
ON THE GEOMETRY OF OPTIMAL PROCESSES

7.0 Introduction
This chapter contains an investigation of the geometry in state space of a dynamical system which behaves in an optimal fashion. The general notion of a dynamical system is introduced in terms of a set of admissible rules which determine the motion of the system in its state space. A cost is associated with the transfer of the system by means of an admissible rule. Optimality is then defined by the requirement that an admissible rule render the minimum value of the cost associated with transfer between prescribed end states.
A. BLAQUIÈRE AND G. LEITMANN
Under the sole assumption that the cost obeys an additivity property, the existence of so-called limiting surfaces in cost-augmented state space is exhibited. Each member of the one-parameter family of limiting surfaces is the locus of all optimal trajectories whose initial points belong to it. Furthermore, each such surface belongs to the boundary of the region which contains all trajectories emanating from that surface. These global properties of a limiting surface are of fundamental importance to a discussion of optimal processes.

Under various assumptions concerning the geometry of limiting surfaces, local properties of these surfaces are deduced. In particular, for systems described by the usual set of differential state equations and for integral cost, additional geometric aspects of limiting surfaces are discussed.

While it is not the primary purpose of the investigation reported here to present a derivation of the maximum principle, this principle is found to be a consequence of the global and local properties of limiting surfaces. Finally, the relation between the maximum principle and dynamic programming is established from the geometric point of view.

A word concerning notation and nomenclature may be in order. Unless specifically stated, they are the ones in common use. New symbols are defined, usually in a footnote, where they are first introduced.

7.1 Dynamical System
7.11 Transfer of System
We shall consider a dynamical system whose state is defined by n real numbers $x_1, x_2, \ldots, x_n$. We may think of a state as a point in an n-dimensional Euclidean space $E^n$, termed the state space. In state space we select a rectangular coordinate system so that we may define a point by the state vector $x = (x_1, x_2, \ldots, x_n)$. We shall suppose that the behavior of the dynamical system, that is, the evolution in time t of the system's states, is governed by any one in a prescribed set of rules. A rule which belongs to the prescribed set will be termed an admissible rule. Given an initial state of the system and an admissible rule, the state vector is a function† of time, $x(t)$, which is defined on the time interval during which the rule is operative. In other words, given a point $x^0$ and an admissible rule r, the point $x(t)$ moves along a path p in $E^n$.‡

† We shall use the same symbol to represent a function of time and its value at a given time. The intended meaning should be evident from the context.
‡ If $x(t)$ is defined on $[t_0, t_1]$, then the corresponding path p is the image of $[t_0, t_1]$ under graph $\{(t, x) : t \in [t_0, t_1],\ x = x(t)\}$.
In the subsequent discussion we shall be concerned with the paths generated by all admissible rules. In particular, we shall be interested in transferring the system from an initial point $x^0$ to a prescribed terminal point $x^1$. In general, some admissible rules generate paths from $x^0$ which terminate at the prescribed point $x^1$, while other admissible rules generate paths from $x^0$ which terminate at some point $\bar{x}^1 \neq x^1$.
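The distinction between rules that reach the prescribed terminal point and rules that do not can be made concrete with a toy model. The sketch below is only an illustration: the constant-velocity "rules", the names `sample_path`, `toy_rules`, and the numerical values are all assumptions, not part of the theory developed here.

```python
# Hypothetical toy system in E^2: each "admissible rule" holds a constant
# velocity for a fixed duration (an assumption made only for illustration).

def sample_path(x_init, velocity, duration, steps=100):
    """Return sampled points of the path generated from x_init by one rule."""
    dt = duration / steps
    return [(x_init[0] + velocity[0] * k * dt,
             x_init[1] + velocity[1] * k * dt) for k in range(steps + 1)]

x_init = (0.0, 0.0)      # initial point x^0
x_target = (1.0, 1.0)    # prescribed terminal point x^1

toy_rules = {
    "r_a": ((1.0, 1.0), 1.0),   # its path terminates at x^1
    "r_b": ((1.0, 0.0), 1.0),   # its path terminates at some other point
}

def reaches_target(end_point, target, tol=1e-9):
    return all(abs(a - b) < tol for a, b in zip(end_point, target))

for name, (velocity, duration) in toy_rules.items():
    end_point = sample_path(x_init, velocity, duration)[-1]
    verdict = "terminates at x^1" if reaches_target(end_point, x_target) else "misses x^1"
    print(name, verdict)
```

Only the first of the two hypothetical rules effects the desired transfer; the second generates a path from the same initial point to a different terminal point.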
7.12 Performance Index and Optimality
Next let us adopt a rule, or functional, which assigns a unique real number to each transfer† effected by an admissible rule governing the behavior of the dynamical system. Such a rule will be termed a performance index. The number which it assigns to a transfer will be called the cost of the transfer. We shall admit a performance index such that the cost of transfer from $x^0$ to $x^f$, where $x^f = x^1$ or $\bar{x}^1$, depends on the generating rule r and the corresponding path p. We shall denote the cost of such a transfer by $V(x^0, x^f; r, p)$. We shall call a rule optimal for a transfer from $x^0$ to $x^1$, and denote it by $r^*$ and the corresponding path by $p^*$, if the cost takes on its minimum value; that is,

$$V(x^0, x^1; r^*, p^*) \le V(x^0, x^1; r, p) \tag{7.1}$$

for all admissible rules. While there may exist more than one optimal rule, the definition of optimality expressed by inequality (7.1) implies that the minimum cost is unique. Thus, for prescribed terminal point $x^1$, the minimum cost depends only on the initial point $x^0$. To emphasize this fact, we write

$$V^*(x^0; x^1) \triangleq V(x^0, x^1; r^*, p^*), \quad \forall r^* \tag{7.2}$$
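The remark that several optimal rules may exist while the minimum cost is unique can be checked on a finite set of rules. In the sketch below the rule names and cost values are invented for illustration; `minimum_cost` is a hypothetical helper, not a construct of the text.

```python
# Hypothetical costs V(x^0, x^1; r, p) of four admissible rules that all
# effect the same transfer from x^0 to x^1 (values chosen for illustration).
transfer_costs = {"r1": 4.0, "r2": 2.5, "r3": 2.5, "r4": 7.0}

def minimum_cost(costs):
    """Return V* and the (possibly several) rules that attain it, as in (7.1)-(7.2)."""
    v_star = min(costs.values())
    optimal_rules = sorted(rule for rule, cost in costs.items() if cost == v_star)
    return v_star, optimal_rules

v_star, optimal_rules = minimum_cost(transfer_costs)
print(v_star, optimal_rules)   # 2.5 ['r2', 'r3']
```

Two distinct rules are optimal here, yet they share the single value $V^*$, which is why (7.2) may be written without reference to a particular optimal rule.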
7.13 Additivity Property; Union of Paths

Rather than specify at the outset the set of rules which govern the behavior of the dynamical system and the rule or performance index which assigns a cost to a transfer, we shall make certain assumptions concerning these rules. Regarding the performance index, we shall assume that the cost obeys an additivity property; in particular,

$$V(x^0, x^f; r, p) = V(x^0, x^i; r, p^i) + V(x^i, x^f; r, p_i), \quad \forall x^i \in p, \quad p = p^i \cup p_i \tag{7.3}$$

where

$$\lim_{x^i \to x^f} V(x^i, x^f; r, p_i) = 0$$
† The term transfer is to be understood as a change of the system's state from one given state to another.
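Additivity property (7.3) holds, in particular, for a cost given by an integral along the path. The following sketch checks this numerically for a sampled path; the integrand `f0`, the path $x(t) = t^2$, and the trapezoidal rule are assumptions chosen for illustration only.

```python
def integral_cost(samples, f0):
    """Trapezoidal approximation of the integral of f0(x(t)) dt along sampled (t, x) pairs."""
    total = 0.0
    for (t_a, x_a), (t_b, x_b) in zip(samples, samples[1:]):
        total += 0.5 * (f0(x_a) + f0(x_b)) * (t_b - t_a)
    return total

def f0(x):
    # hypothetical integrand of the performance index
    return 1.0 + x * x

N = 200
whole_path = [(k / N, (k / N) ** 2) for k in range(N + 1)]   # path p, with x(t) = t^2

i = 80                                   # index of the intermediate point x^i
first_portion = whole_path[: i + 1]      # p^i, from x^0 to x^i
last_portion = whole_path[i:]            # p_i, from x^i to x^f

cost_whole = integral_cost(whole_path, f0)
cost_split = integral_cost(first_portion, f0) + integral_cost(last_portion, f0)
print(abs(cost_whole - cost_split) < 1e-12)          # additivity (7.3)

# The cost of the remaining portion shrinks as x^i approaches x^f.
tail_costs = [integral_cost(whole_path[j:], f0) for j in (N - 20, N - 5, N - 1, N)]
print(tail_costs[-1] == 0.0)
```

The second check mirrors the limit condition attached to (7.3): the cost of the portion $p_i$ tends to zero as $x^i$ tends to $x^f$.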
Note that rule r, which generates a transfer from $x^0$ to $x^f$ along p, also generates a transfer from $x^0$ to $x^i$ along $p^i$ and then from $x^i$ to $x^f$ along $p_i$. We shall also assume that the union of paths, or of portions of paths, is generated by an admissible rule.

7.2 Augmented State Space and Trajectories
We shall find it convenient to introduce another variable, $x_0$, and to consider an $(n + 1)$-dimensional Euclidean space $E^{n+1}$ of points $\mathbf{x}$, where $\mathbf{x} = (x_0, x) = (x_0, x_1, \ldots, x_n)$ is the vector which defines a point relative to a rectangular coordinate system in $E^{n+1}$, the augmented state space. Next let us define a trajectory in $E^{n+1}$; namely,

$$\Gamma \triangleq \{\mathbf{x}^i : x_0^i + V(x^i, x^f; r, p_i) = C,\ p_i \subset p\} \tag{7.4}$$

where p is a path from $x^0$ to $x^f$ generated by rule r, and C is a constant parameter. Thus, path p is the projection on $E^n$ of a trajectory in $E^{n+1}$. If a rule is optimal, the corresponding trajectory will be called an optimal trajectory and denoted by $\Gamma^*$; that is,

$$\Gamma^* \triangleq \{\mathbf{x}^i : x_0^i + V(x^i, x^1; r^*, p_i) = C,\ p_i \subset p^*\} \tag{7.5}$$

where $r^*$ is an optimal rule for a transfer from $x^0$ to $x^1$, and $p^*$ is the corresponding path. Let $\mathbf{x}^0$ and $\mathbf{x}^f$ denote the initial and terminal points, respectively, of a trajectory $\Gamma$. Then

$$x_0^0 = C - V(x^0, x^f; r, p), \qquad x_0^f = C$$

so that $x_0^f - x_0^0$ is equal to the cost of transfer from $x^0$ to $x^f$. Thus, if $x_0^0 = 0$, then C is equal to the cost of this transfer. If $\Gamma$ is a trajectory whose projection on $E^n$ is a path p which terminates at the prescribed terminal point $x^1$, then $\Gamma$ terminates at a point $\mathbf{x}^1$ on a line $X^1$ which is parallel to the $x_0$-axis and intersects $E^n$ in $x^1$. Hence, an optimal trajectory $\Gamma^*$ from $\mathbf{x}^0$ is one for which the value of $x_0^1$ is minimum with respect to all other trajectories from $\mathbf{x}^0$.

7.3 Limiting Surfaces and Optimal Isocost Surfaces
Let us now consider two subsets of state space $E^n$:

(i) the set $E$ of all initial points, $x^0$, for which there exist admissible rules transferring the system to the prescribed terminal point $x^1$, that is,

$$E \triangleq \{x^0 : \exists\, p \text{ from } x^0 \text{ to } x^1\}$$

(ii) the set $E^*$ of all initial points, $x^0$, for which there exist optimal rules, that is,

$$E^* \triangleq \{x^0 : \exists\, p^* \text{ from } x^0 \text{ to } x^1\}$$

Of course, $E^* \subseteq E \subseteq E^n$, so that $E^*$ and $E$ may possess boundary points.

Since $V^*(x^0; x^1)$ is defined for all $x^0 \in E^*$, the equation

$$x_0 + V^*(x; x^1) = C \tag{7.6}$$

where C is a constant parameter, defines a single-sheeted surface $\Sigma$ in $\mathcal{E}^* \triangleq E^* \times x_0$. In other words, $\Sigma$ is a set of points which are in one-to-one correspondence with the points of $E^*$. The function $x_0 = C - V^*(x; x^1)$ is defined on $E^*$ and, in general,† vanishes on a surface S whose equation is

$$V^*(x; x^1) = C \tag{7.7}$$
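Equations (7.6) and (7.7) can be evaluated explicitly once a minimum-cost function is known. The sketch below assumes, purely for illustration, that $V^*(x; x^1)$ is the Euclidean distance from $x$ to $x^1$ in the plane (as it would be for a hypothetical unit-speed, minimum-time problem); the function names are likewise assumptions.

```python
import math

x_terminal = (0.0, 0.0)   # prescribed terminal point x^1

def v_star(x):
    # Assumed minimum cost: Euclidean distance from x to x^1 (illustration only).
    return math.hypot(x[0] - x_terminal[0], x[1] - x_terminal[1])

def x0_on_limiting_surface(x, C):
    """x0-coordinate of the point of Sigma lying above state x, from (7.6)."""
    return C - v_star(x)

def on_isocost_surface(x, C, tol=1e-12):
    """True if x lies on the optimal isocost surface S of (7.7)."""
    return abs(v_star(x) - C) < tol

x = (3.0, 4.0)
print(x0_on_limiting_surface(x, 5.0))    # 0.0
print(on_isocost_surface(x, 5.0))        # True
# Two limiting surfaces differ by a translation along the x0-axis:
print(x0_on_limiting_surface(x, 7.0) - x0_on_limiting_surface(x, 5.0))   # 2.0
```

Under this assumption each limiting surface is a cone with axis parallel to the $x_0$-axis, and the isocost surface S is the circle on which that cone meets the plane $x_0 = 0$.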
As the value of parameter C is varied, equations (7.6) and (7.7) define two one-parameter families of surfaces, namely, $\{\Sigma\}$ in $\mathcal{E}^*$ and $\{S\}$ in $E^*$. We shall call the former limiting surfaces and the latter optimal isocost surfaces. The first of these names is motivated by a property of $\Sigma$ to be discussed in Sec. 7.5. The second name follows from the definitions of minimum cost $V^*(x; x^1)$ and surface S; namely, a given S surface is the locus of all initial points from which the system can be transferred to the given terminal point with the same minimum cost.

7.4 Some Properties of Optimal Isocost Surfaces
The function $V^*(x; x^1)$ is termed sign-definite in $E^*$ if

(i) it is sign-invariant for all $x \in E^*$;
(ii) it is zero only at $x = x^1$;
(iii) partial derivatives $\partial V^*(x; x^1)/\partial x_j$, $j = 1, 2, \ldots, n$, are defined and continuous on $\mathring{E}^*$.‡

If $V^*(x; x^1)$ is sign-definite, then it possesses the following properties:

(i) If $x^1$ is an interior point of $E^*$, then there exists a region $D \subseteq E^*$ such that $x^1$ is an interior point of D, and such that every S surface in D is a closed surface which surrounds $x^1$.

(ii) Furthermore, the optimal isocost surfaces contract towards $x^1$ as $|C| \to 0$. Here we distinguish between (a) $V^*(x; x^1)$ positive definite, i.e., $C > 0$, in which case S contracts towards $x^1$ as C decreases to zero; and (b) $V^*(x; x^1)$ negative definite, i.e., $C < 0$, in which case S contracts towards $x^1$ as C increases to zero.

(iii) If $E^* \subset E^n$, and $x^1$ belongs to the boundary of $E^*$, the optimal isocost surfaces are no longer closed surfaces. However, a connected curve which starts at $x^1$, lies in $E^*$ and terminates at a point $x'$ for which $V^*(x'; x^1) = C'$, intersects every S surface corresponding to $C \le C'$ if $V^*(x; x^1)$ is positive definite, and to $C \ge C'$ if $V^*(x; x^1)$ is negative definite. Here also the optimal isocost surfaces contract towards $x^1$ as $|C| \to 0$.

Finally, we note that optimal isocost surfaces, corresponding to different values of parameter C, do not intersect each other.

† In the exceptional case of $V^*(x; x^1)$ independent of x, it can be shown that $V^*(x; x^1) = 0$. In that case surface S is not defined.
‡ $\mathring{E}^*$ is the set of all interior points of $E^*$.
7.5 Some Global Properties of Limiting Surfaces
Here we shall deduce some preliminary lemmas concerning trajectories in augmented state space. Based on these lemmas we shall then prove a fundamental theorem which embodies some global properties of limiting surfaces.
7.51 Lemma 1

First among the preliminary lemmas is

Lemma 1.† Any optimal trajectory $\Gamma^*$ which intersects line $X^1$ at point $\mathbf{x}^1$ lies entirely in the $\Sigma$ surface passing through $\mathbf{x}^1$.

To prove this lemma we shall invoke additivity property (7.3). From this property it follows that $V^*(x^1; x^1) = 0$, so that definition (7.6) implies that one and only one $\Sigma$ surface passes through point $\mathbf{x}^1$. Furthermore, we need show that

$$V(x^i, x^1; r^*, p_i) = V^*(x^i; x^1), \quad p_i \subset p^* \tag{7.8}$$

so that, in view of (7.5) and (7.6), the values of $x_0^i$ on $\Gamma^*$ and on $\Sigma$, respectively, passing through $\mathbf{x}^1$, coincide for every point $x^i$ on the corresponding optimal path $p^*$ from $x^0$ to $x^1$. From additivity property (7.3) we have

$$V^*(x^0; x^1) = V(x^0, x^1; r^*, p^*) = V(x^0, x^i; r^*, p^i) + V(x^i, x^1; r^*, p_i)$$

for all $x^i \in p^*$, where $p^* = p^i \cup p_i$. Suppose now that (7.8) is false, namely,

$$V^*(x^i; x^1) \triangleq V(x^i, x^1; r_i^*, p_i^*) < V(x^i, x^1; r^*, p_i)$$

where $r_i^*$ is an optimal rule for a transfer from $x^i$ to $x^1$, and $p_i^*$ is the corresponding path. Then we may transfer the system from $x^0$ to $x^i$ along path $p^i$ using rule $r^*$, and then from $x^i$ to $x^1$ along path $p_i^*$ using rule $r_i^*$. However, invoking again (7.3), it follows that

$$V(x^0, x^1; r, p) = V(x^0, x^i; r^*, p^i) + V(x^i, x^1; r_i^*, p_i^*)$$

where rule r is made up of $r^*$ from $x^0$ to $x^i$ and of $r_i^*$ from $x^i$ to $x^1$, and $p = p^i \cup p_i^*$. Thus, we arrive at

$$V(x^0, x^1; r, p) < V(x^0, x^1; r^*, p^*)$$

which contradicts the optimality of rule $r^*$. And so (7.8), and hence the lemma, is established.

† If $x^1 \notin E^*$, it belongs to the boundary of $E^*$; in that case it is the closure of $\Sigma$ in $E^{n+1}$, which contains $\mathbf{x}^1$.
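Relation (7.8) has a familiar discrete counterpart: on a weighted graph, the cost remaining along a shortest path equals the shortest-path cost from every intermediate node. The sketch below is such a discrete analog, not the continuous setting of the lemma; the graph, its weights, and the helper `cost_to_go` are assumptions for illustration.

```python
import heapq

# Hypothetical weighted digraph: nodes play the role of states, edge weights
# the role of additive transfer costs; "d" plays the role of x^1.
graph = {
    "a": {"b": 1.0, "c": 4.0},
    "b": {"c": 2.0, "d": 6.0},
    "c": {"d": 1.0},
    "d": {},
}

def cost_to_go(graph, target):
    """Minimum cost V*(node; target) for every node, by Dijkstra on reversed edges."""
    reverse = {u: {} for u in graph}
    for u, nbrs in graph.items():
        for v, w in nbrs.items():
            reverse[v][u] = w
    best = {target: 0.0}
    heap = [(0.0, target)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in reverse[u].items():
            if d + w < best.get(v, float("inf")):
                best[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return best

V = cost_to_go(graph, "d")
optimal_path = ["a", "b", "c", "d"]       # total cost 1 + 2 + 1 = 4 = V["a"]

# Cost remaining along the optimal path from each of its nodes.
remaining = {"d": 0.0}
running = 0.0
for u, v in reversed(list(zip(optimal_path, optimal_path[1:]))):
    running += graph[u][v]
    remaining[u] = running

print(all(remaining[node] == V[node] for node in optimal_path))   # True
```

Every tail of the optimal path is itself optimal, which is the content of (7.8) in this discrete setting.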
7.52 Lemma 2

The second preliminary lemma is

Lemma 2. An optimal trajectory $\Gamma^*$ with one point on a limiting surface $\Sigma$ lies entirely on $\Sigma$; that is, limiting surfaces $\{\Sigma\}$ are the loci of all optimal trajectories.

In view of definition (7.6), the members of the one-parameter family of limiting surfaces can be deduced from one another by translation parallel to the $x_0$-axis. Furthermore, these surfaces are ordered along the $x_0$-axis in the same way as the value of parameter C. Clearly, one and only one $\Sigma$ surface passes through a given point in $\mathcal{E}^*$. This latter property and Lemma 1 lead at once to Lemma 2.

7.53 A- and B-Points

Before deducing the next lemma let us introduce a bit more nomenclature. It is quite obvious from definition (7.6) that a given $\Sigma$ surface separates $\mathcal{E}^*$ into two disjoint regions. We shall denote these regions by $A/\Sigma$ ("above" $\Sigma$) and $B/\Sigma$ ("below" $\Sigma$), respectively. For a limiting surface $\Sigma$, corresponding to parameter value C, we have

$$A/\Sigma \triangleq \{\mathbf{x} : x_0 > C - V^*(x; x^1),\ \forall x \in E^*\} \tag{7.9}$$

and

$$B/\Sigma \triangleq \{\mathbf{x} : x_0 < C - V^*(x; x^1),\ \forall x \in E^*\} \tag{7.10}$$

A point $\mathbf{x} \in A/\Sigma$ will be called an A-point relative to $\Sigma$, and a point $\mathbf{x} \in B/\Sigma$ a B-point relative to $\Sigma$.
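The classification into A-points and B-points can be tried out on a discrete analog in which states are nodes of a weighted graph and $x_0$ increases by the cost of each step, in keeping with (7.4). Everything in the sketch below (the graph, the tabulated minimum costs, the helper names) is a hypothetical illustration, not part of the continuous theory.

```python
# Hypothetical weighted digraph; "d" plays the role of the terminal point x^1.
graph = {
    "a": {"b": 1.0, "c": 4.0},
    "b": {"c": 2.0, "d": 6.0},
    "c": {"d": 1.0},
    "d": {},
}
# Minimum costs to "d" for this graph (worked out by hand for the example).
V_STAR = {"a": 4.0, "b": 3.0, "c": 1.0, "d": 0.0}

def classify(x0, node, C):
    """Position of the augmented point (x0, node) relative to x0 + V*(node) = C."""
    s = x0 + V_STAR[node]
    if s > C:
        return "A"
    if s < C:
        return "B"
    return "on Sigma"

def reachable_points(node, x0):
    """Yield every augmented point reachable from (x0, node); the graph is acyclic."""
    yield x0, node
    for nxt, weight in graph[node].items():
        yield from reachable_points(nxt, x0 + weight)

C = 10.0
start = "a"
x0_start = C - V_STAR[start]              # the initial point lies on Sigma
labels = {classify(x0, node, C) for x0, node in reachable_points(start, x0_start)}
print(sorted(labels))                     # ['A', 'on Sigma']
```

In this toy model the optimal walk stays on the surface, as Lemma 2 asserts for optimal trajectories, and no walk that starts on the surface ever produces a B-point, which is the discrete counterpart of Theorem 1 of Sec. 7.55.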
7.54 Lemma 3

We can now proceed with the proof of

Lemma 3. There exists no trajectory which starts on a given limiting surface $\Sigma$ and intersects line $X^1$ at a B-point relative to $\Sigma$.

Consider a limiting surface $\Sigma$ corresponding to parameter value C; a point $\mathbf{x}^0 \in \Sigma$, whose projection on $E^n$ is $x^0$; and a non-optimal trajectory $\Gamma$, corresponding to parameter value $C'$, which emanates from point $\mathbf{x}^0$ and terminates at point $\mathbf{x}^1$ on $X^1$. The equation of surface $\Sigma$ is

$$x_0 + V^*(x; x^1) = C$$

whereas the $x_0$-coordinate of $\Gamma$ is given by

$$x_0^i + V(x^i, x^1; r, p_i) = C', \quad p_i \subset p$$

where r is a non-optimal rule which generates path p, the projection of $\Gamma$ on $E^n$. Both relations hold at the initial point $\mathbf{x}^0$, so that $C - V^*(x^0; x^1) = C' - V(x^0, x^1; r, p)$. Since r is a non-optimal rule, it follows that

$$V^*(x^0; x^1) < V(x^0, x^1; r, p)$$

so that

$$C' > C$$

Consequently, a non-optimal trajectory which reaches line $X^1$ intersects it at an A-point relative to the limiting surface on which it starts. This conclusion and Lemma 2 result in Lemma 3.

7.55 Theorem 1 and Corollary 1
We shall now utilize the preliminary lemmas to establish
Theorem 1. A trajectory (optimal or non-optimal) whose initial point belongs to a given limiting surface $\Sigma$ has no B-point relative to $\Sigma$.

And

Corollary 1. A trajectory whose initial point is an A-point relative to a given limiting surface $\Sigma$ has no B-point relative to $\Sigma$, nor, indeed, a point on it.

Consider a limiting surface, say $\Sigma_1$, corresponding to a parameter value $C_1$, and a non-optimal trajectory $\Gamma$ which emanates from a point $\mathbf{x}^0$ on $\Sigma_1$. Suppose now that a point, say $\mathbf{x}'$, of $\Gamma$ is a B-point relative to $\Sigma_1$. Let $\Sigma_0$ denote the limiting surface which passes through $\mathbf{x}'$, and let $C_0$ be the corresponding parameter value. Equation (7.6) and the definition of a B-point lead at once to

$$C_0 < C_1$$

Consequently, the intersection of $\Sigma_0$ and line $X^1$, that is, point $(C_0, x^1)$, is a B-point relative to $\Sigma_1$.

Let us now postulate a trajectory which consists of the portion of non-optimal trajectory $\Gamma$ from $\mathbf{x}^0$ to $\mathbf{x}'$, followed by an optimal trajectory from $\mathbf{x}'$ to line $X^1$. According to Lemma 2, the optimal trajectory which starts at $\mathbf{x}'$ lies entirely in $\Sigma_0$, and hence intersects $X^1$ at a B-point relative to $\Sigma_1$. However, in view of Lemma 3, a trajectory which starts on $\Sigma_1$ does not intersect line $X^1$ at a B-point relative to $\Sigma_1$. We conclude that a non-optimal trajectory which issues from a point on a given limiting surface $\Sigma$ has no B-point relative to $\Sigma$. This result and Lemma 2 establish Theorem 1. Corollary 1 follows from an analogous argument in which we consider the limiting surface passing through the initial point of the trajectory.

Theorem 1 embodies the limiting property of surfaces $\{\Sigma\}$; namely, a given $\Sigma$ surface belongs to the boundary of the region which contains all trajectories emanating from that surface.

7.6 Some Local Properties of Limiting Surfaces
In the last section we deduced some global properties of limiting surfaces, global in the sense that they pertain to the behavior of trajectories relative to limiting surfaces as a whole. In this section we shall investigate some local properties of limiting surfaces, local in the sense that they pertain to neighborhoods of points on a given $\Sigma$ surface. In the following discussion we shall find it convenient to refer to regions

$$\overline{A/\Sigma} \triangleq (A/\Sigma) \cup \Sigma \tag{7.11}$$

and

$$\overline{B/\Sigma} \triangleq (B/\Sigma) \cup \Sigma \tag{7.12}$$

namely, the complements in $\mathcal{E}^*$ of regions $B/\Sigma$ and $A/\Sigma$, respectively. In other words, we have

$$(\overline{A/\Sigma}) \cup (B/\Sigma) = \mathcal{E}^*, \qquad (\overline{A/\Sigma}) \cap (B/\Sigma) = \emptyset$$

and alternately

$$(\overline{B/\Sigma}) \cup (A/\Sigma) = \mathcal{E}^*, \qquad (\overline{B/\Sigma}) \cap (A/\Sigma) = \emptyset$$
7.61 First Basic Assumption

Henceforth we shall make a basic assumption concerning the way a given limiting surface $\Sigma$ separates $\mathcal{E}^*$ into the two disjoint regions $A/\Sigma$ and $B/\Sigma$. Let $\eta$ be any bound vector at an interior† point $\mathbf{x}$ of a given limiting surface $\Sigma$. We shall assume that for every vector $\eta$ there exists a scalar $\delta > 0$ such that for every $\varepsilon$, $0 < \varepsilon < \delta$, the point $\mathbf{x} + \varepsilon\eta$ belongs either to region $\overline{A/\Sigma}$ or to region $B/\Sigma$. This assumption will be understood to apply hereafter as, for example, in the definitions introduced in the next section.
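7.62 Definitions of Local Cones $\mathcal{C}_A(\mathbf{x})$ and $\mathcal{C}_B(\mathbf{x})$

Next let us define two cones associated with every interior point of a limiting surface. Again let $\eta$ be a bound vector at an interior point $\mathbf{x}$ of a given limiting surface $\Sigma$. Then we let

$$\mathcal{C}_A(\mathbf{x}) \triangleq \{\mathbf{x} + \eta : \exists\, \alpha > 0 \text{ such that } \forall \varepsilon,\ 0 < \varepsilon < \alpha,\ \mathbf{x} + \varepsilon\eta \in \overline{A/\Sigma}\} \tag{7.13}$$

Similarly, we let

$$\mathcal{C}_B(\mathbf{x}) \triangleq \{\mathbf{x} + \eta : \exists\, \beta > 0 \text{ such that } \forall \varepsilon,\ 0 < \varepsilon < \beta,\ \mathbf{x} + \varepsilon\eta \in B/\Sigma\} \tag{7.14}$$

Note that $\mathcal{C}_A(\mathbf{x})$ and $\mathcal{C}_B(\mathbf{x})$ are local cones with vertex at point $\mathbf{x}$.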
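Definitions (7.13) and (7.14) can be probed numerically for a simple surface. The sketch below assumes a one-dimensional toy problem with $V^*(x; x^1) = |x|$ and tests a finite sequence of values of $\varepsilon$; such a finite probe only suggests membership and does not prove the "for all $\varepsilon$" clause. Exact rational arithmetic is used so that directions tangent to the surface are not misclassified by roundoff. All names are hypothetical.

```python
from fractions import Fraction as F

C = F(3)   # parameter value of the limiting surface

def v_star(x):
    # Assumed minimum cost of a one-dimensional toy problem (illustration only).
    return abs(x)

def in_closed_A(x0, x):
    """Membership in the closed region: x0 >= C - V*(x)."""
    return x0 + v_star(x) >= C

def probe_cone(point, eta, delta=F(1, 4), probes=20):
    """Classify x + eta by probing eps = delta / 2^k, k = 1..probes (a heuristic check)."""
    x0, x = point
    hits = [in_closed_A(x0 + eps * eta[0], x + eps * eta[1])
            for eps in (delta / 2 ** k for k in range(1, probes + 1))]
    if all(hits):
        return "C_A"
    if not any(hits):
        return "C_B"
    return "undecided"

p = (C - 1, F(1))                       # a point of the limiting surface
print(probe_cone(p, (F(1), F(0))))      # C_A  (pointing in the positive x0-direction)
print(probe_cone(p, (F(-1), F(0))))     # C_B  (pointing in the negative x0-direction)
print(probe_cone(p, (F(1), F(-1))))     # C_A  (direction lying in the surface)
```

For this smooth point the two cones are complementary half-spaces: the direction lying in the surface falls in $\mathcal{C}_A(\mathbf{x})$ because that cone is defined with the closed region.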
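7.63 Interior Points of $\mathcal{C}_A(\mathbf{x})$ and $\mathcal{C}_B(\mathbf{x})$; A Second Basic Assumption; Lemma 4

Before introducing a second basic assumption concerning the way a limiting surface separates $\mathcal{E}^*$, let us define an interior point of a local cone $\mathcal{C}_A(\mathbf{x})$ or $\mathcal{C}_B(\mathbf{x})$. We shall say that $\mathbf{x} + \eta$ is an interior point of $\mathcal{C}_A(\mathbf{x})$ (or $\mathcal{C}_B(\mathbf{x})$), if

(i) $\mathbf{x} + \eta \in \mathcal{C}_A(\mathbf{x})$ (or $\mathcal{C}_B(\mathbf{x})$); and
(ii) there exists an open ball $B(\mathbf{x} + \eta)$ in $E^{n+1}$ with center at $\mathbf{x} + \eta$ such that all points of $B(\mathbf{x} + \eta)$ belong to $\mathcal{C}_A(\mathbf{x})$ (or $\mathcal{C}_B(\mathbf{x})$).

The definition of the open local cones, $\mathring{\mathcal{C}}_A(\mathbf{x})$ and $\mathring{\mathcal{C}}_B(\mathbf{x})$, follows at once; namely,

$$\mathring{\mathcal{C}}_A(\mathbf{x}) \triangleq \{\mathbf{x} + \eta : \mathbf{x} + \eta \text{ is interior point of } \mathcal{C}_A(\mathbf{x})\}$$

$$\mathring{\mathcal{C}}_B(\mathbf{x}) \triangleq \{\mathbf{x} + \eta : \mathbf{x} + \eta \text{ is interior point of } \mathcal{C}_B(\mathbf{x})\}$$

† By "interior point of $\Sigma$" we shall always mean a point which belongs to $\Sigma$ and is an interior point of $\mathcal{E}^*$.

Our second basic assumption is the following: Let $\mathbf{x} + \eta'$ be an interior point of $\mathcal{C}_A(\mathbf{x})$ (or $\mathcal{C}_B(\mathbf{x})$); namely, there exists an open ball $B(\mathbf{x} + \eta')$ in $E^{n+1}$ which belongs to $\mathcal{C}_A(\mathbf{x})$ (or $\mathcal{C}_B(\mathbf{x})$). Then there exists an open ball $B'(\mathbf{x} + \eta')$ in $E^{n+1}$ which belongs to $\mathcal{C}_A(\mathbf{x})$ (or $\mathcal{C}_B(\mathbf{x})$) and which has the property that for every point $\mathbf{x} + \eta$ in $B'(\mathbf{x} + \eta')$ there exists a positive number $\alpha$ (independent of $\eta$) such that for all $\varepsilon$, $0 < \varepsilon \le \alpha$, point $\mathbf{x} + \varepsilon\eta$ belongs to $\overline{A/\Sigma}$ (or $B/\Sigma$).†

With the aid of this assumption we shall now prove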
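Lemma 4. Consider an interior point $\mathbf{x}$ of a limiting surface $\Sigma$, and a point $\mathbf{x} + \eta' \in \mathring{\mathcal{C}}_A(\mathbf{x})$. Then there exists an open ball $B'(\mathbf{x} + \eta') \subset \mathcal{C}_A(\mathbf{x})$ and a positive number $\beta$ such that for all $\mathbf{x} + \eta \in B'(\mathbf{x} + \eta')$ and all $\varepsilon$, $0 < \varepsilon < \beta$, $\mathbf{x} + \varepsilon\eta \in A/\Sigma$.

According to our second basic assumption, there exists an $\alpha > 0$ such that for all $\varepsilon$, $0 < \varepsilon \le \alpha$, and for all $\mathbf{x} + \eta$ in some open ball $B'(\mathbf{x} + \eta')$, $\mathbf{x} + \varepsilon\eta \in \overline{A/\Sigma}$. Consider now a conic neighborhood

$$N(\mathbf{x}) \triangleq \{\mathbf{x} + k\eta : k > 0,\ \mathbf{x} + \eta \in B'(\mathbf{x} + \eta')\}$$

and an open ball‡

$$B(\mathbf{x}) \triangleq \{\mathbf{x} + \varepsilon\rho : |\rho| = 1,\ 0 < \varepsilon < \alpha\} \tag{7.15}$$

In view of our second basic assumption, we have

$$N(\mathbf{x}) \cap B(\mathbf{x}) \subset \overline{A/\Sigma}$$

But $N(\mathbf{x})$ and $B(\mathbf{x})$ are open sets, so that

$$\mathbf{x} + \varepsilon\eta \text{ is an interior point of } N(\mathbf{x}), \quad \varepsilon > 0$$

and

$$\mathbf{x} + \varepsilon\eta \text{ is an interior point of } B(\mathbf{x}), \quad 0 < \varepsilon < \alpha/|\eta|$$

Now let $\beta \triangleq \alpha/|\eta|_{\max}$, where $|\eta|_{\max}$ denotes the maximum value of $|\eta|$ for all $\mathbf{x} + \eta \in B'(\mathbf{x} + \eta')$. Then we conclude that for all $\varepsilon$, $0 < \varepsilon < \beta$,

$$\mathbf{x} + \varepsilon\eta \in A/\Sigma$$

which establishes Lemma 4.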
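† Note that here, as in the definitions of $\mathcal{C}_A(\mathbf{x})$ and $\mathcal{C}_B(\mathbf{x})$, we have $\overline{A/\Sigma}$ but $B/\Sigma$.
‡ $|(\ )|$ denotes the norm (Euclidean length) of vector $(\ )$.

7.64 Local Cone $\mathcal{S}(\mathbf{x})$

Local cones $\mathcal{C}_A(\mathbf{x})$ and $\mathcal{C}_B(\mathbf{x})$ may be neither open nor closed, depending on the local properties of the corresponding limiting surface. Let us now consider the closed† local cones $\bar{\mathcal{C}}_A(\mathbf{x})$ and $\bar{\mathcal{C}}_B(\mathbf{x})$, and define another local cone at $\mathbf{x}$. We shall denote by $\mathcal{S}(\mathbf{x})$ the intersection of $\bar{\mathcal{C}}_A(\mathbf{x})$ and $\bar{\mathcal{C}}_B(\mathbf{x})$; that is,

$$\mathcal{S}(\mathbf{x}) \triangleq \bar{\mathcal{C}}_A(\mathbf{x}) \cap \bar{\mathcal{C}}_B(\mathbf{x}) \tag{7.16}$$

In the next section we shall show that $\mathcal{S}(\mathbf{x})$ is not empty; in fact, that it is the common boundary of $\mathcal{C}_A(\mathbf{x})$ and $\mathcal{C}_B(\mathbf{x})$.

7.7 Some Properties of Local Cones

7.71 A Partition of $E^{n+1}$; Lemma 5; Corollaries 2 and 3

We shall now prove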
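Lemma 5. If $\mathbf{x}$ is an interior point of a limiting surface $\Sigma$, then local cones $\mathcal{C}_A(\mathbf{x})$ and $\mathcal{C}_B(\mathbf{x})$ constitute a partition of augmented state space $E^{n+1}$; in other words‡

$$\mathcal{C}_A(\mathbf{x}) = \operatorname{comp} \mathcal{C}_B(\mathbf{x}) \quad \text{and} \quad \mathcal{C}_B(\mathbf{x}) = \operatorname{comp} \mathcal{C}_A(\mathbf{x})$$

Let us note first of all that definitions (7.13) and (7.14) of $\mathcal{C}_A(\mathbf{x})$ and $\mathcal{C}_B(\mathbf{x})$ lead at once to

$$\mathcal{C}_A(\mathbf{x}) \cup \mathcal{C}_B(\mathbf{x}) = E^{n+1} \tag{7.17}$$

To prove Lemma 5, we need only show that neither cone is empty, and that their intersection is empty. Consider two open rays, $L_-$ and $L_+$, which emanate from point $\mathbf{x}$, are parallel to the $x_0$-axis and point into the negative and positive $x_0$-directions, respectively. As a consequence of definitions (7.9), (7.10), and (7.11) of $A/\Sigma$, $B/\Sigma$, and $\overline{A/\Sigma}$, we have

$$\mathbf{x} + \eta \in L_+ \Rightarrow \mathbf{x} + \varepsilon\eta \in \overline{A/\Sigma}, \qquad \mathbf{x} + \eta \in L_- \Rightarrow \mathbf{x} + \varepsilon\eta \in B/\Sigma$$

Thus, it follows from the definitions of $\mathcal{C}_A(\mathbf{x})$ and $\mathcal{C}_B(\mathbf{x})$ that

$$\mathcal{C}_A(\mathbf{x}) \neq \emptyset, \qquad \mathcal{C}_B(\mathbf{x}) \neq \emptyset \tag{7.18}$$

Next we shall show that

$$\mathcal{C}_A(\mathbf{x}) \cap \mathcal{C}_B(\mathbf{x}) = \emptyset \tag{7.19}$$

† Unless otherwise specified, $\overline{(\ )}$ denotes the topological closure in $E^{n+1}$ of $(\ )$.
‡ Notation comp $(\ )$ denotes the complement of $(\ )$ in $E^{n+1}$.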
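To prove this property, suppose that it is incorrect; namely,

$$\mathcal{C}_A(\mathbf{x}) \cap \mathcal{C}_B(\mathbf{x}) \neq \emptyset$$

and consider a point $\mathbf{x} + \eta$ such that

$$\mathbf{x} + \eta \in \mathcal{C}_A(\mathbf{x}) \quad \text{and} \quad \mathbf{x} + \eta \in \mathcal{C}_B(\mathbf{x})$$

Then

$$\exists\, \alpha > 0 \text{ such that } \forall \varepsilon,\ 0 < \varepsilon < \alpha,\ \mathbf{x} + \varepsilon\eta \in \overline{A/\Sigma}$$

$$\exists\, \beta > 0 \text{ such that } \forall \varepsilon,\ 0 < \varepsilon < \beta,\ \mathbf{x} + \varepsilon\eta \in B/\Sigma$$

Suppose, for instance, that $\alpha < \beta$; then $\forall \varepsilon$, $0 < \varepsilon < \alpha$,

$$\mathbf{x} + \varepsilon\eta \in \overline{A/\Sigma} \quad \text{and} \quad \mathbf{x} + \varepsilon\eta \in B/\Sigma$$

which implies that

$$(\overline{A/\Sigma}) \cap (B/\Sigma) \neq \emptyset$$

This is not the case, and so (7.19) is valid. Conditions (7.17)-(7.19) establish Lemma 5.

Corollary 2. Local cone $\mathcal{S}(\mathbf{x}) \triangleq \bar{\mathcal{C}}_A(\mathbf{x}) \cap \bar{\mathcal{C}}_B(\mathbf{x})$ is the common boundary of local cones $\mathcal{C}_A(\mathbf{x})$ and $\mathcal{C}_B(\mathbf{x})$.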
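It is well known that†

$$\partial\mathcal{C}_A(\mathbf{x}) = \bar{\mathcal{C}}_A(\mathbf{x}) \cap \overline{\operatorname{comp} \mathcal{C}_A(\mathbf{x})}, \qquad \partial\mathcal{C}_B(\mathbf{x}) = \bar{\mathcal{C}}_B(\mathbf{x}) \cap \overline{\operatorname{comp} \mathcal{C}_B(\mathbf{x})}$$

These relations together with Lemma 5 result in Corollary 2. Furthermore, we have

Corollary 3. The following relations are valid for local cones $\mathcal{C}_A(\mathbf{x})$ and $\mathcal{C}_B(\mathbf{x})$:

$$\operatorname{comp} \bar{\mathcal{C}}_A(\mathbf{x}) = \mathring{\mathcal{C}}_B(\mathbf{x}), \qquad \operatorname{comp} \bar{\mathcal{C}}_B(\mathbf{x}) = \mathring{\mathcal{C}}_A(\mathbf{x})$$

This corollary is a consequence of Lemma 5 with

$$\operatorname{comp} \bar{\mathcal{C}}_A(\mathbf{x}) = (\operatorname{comp} \mathcal{C}_A(\mathbf{x}))^{\circ}, \qquad \operatorname{comp} \bar{\mathcal{C}}_B(\mathbf{x}) = (\operatorname{comp} \mathcal{C}_B(\mathbf{x}))^{\circ}$$

where $(\ )^{\circ}$ denotes the set of interior points of $(\ )$.

7.72 Lemmas 6 and 7; Corollary 4

Next we shall deduce two lemmas which will be useful during the subsequent discussion. The first of these is

† Notation $\partial(\ )$ denotes the boundary of $(\ )$.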
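Lemma 6. Let $\eta$ be a bound vector at an interior point $\mathbf{x}$ of a limiting surface $\Sigma$. If $\eta = \eta(\varepsilon)$ is a function of a parameter $\varepsilon$ with the following properties:

(i) $\eta(\varepsilon) \to l$ as $\varepsilon \to 0$
(ii) $\exists\, \gamma > 0$ such that $\forall \varepsilon$, $0 < \varepsilon < \gamma$, $\mathbf{x} + \varepsilon\eta(\varepsilon) \in \overline{A/\Sigma}$

then

$$\mathbf{x} + l \in \bar{\mathcal{C}}_A(\mathbf{x})$$

To prove this lemma let us suppose that it is incorrect†; namely,

$$\mathbf{x} + l \notin \bar{\mathcal{C}}_A(\mathbf{x})$$

so that, by Corollary 3, we have

$$\mathbf{x} + l \in \mathring{\mathcal{C}}_B(\mathbf{x})$$

Then, according to our second basic assumption, there exists an open ball $B'(\mathbf{x} + l)$ and a positive number $\alpha$ such that for every point $\mathbf{x} + \eta$ in $B'(\mathbf{x} + l)$ and all $\varepsilon$, $0 < \varepsilon < \alpha$, the point $\mathbf{x} + \varepsilon\eta$ belongs to $B/\Sigma$. However, since $\eta(\varepsilon) \to l$ as $\varepsilon \to 0$, there exists a positive number $\delta < \gamma$ such that for $\varepsilon$, $0 < \varepsilon < \delta$, the point $\mathbf{x} + \eta(\varepsilon)$ belongs to $B'(\mathbf{x} + l)$. Consequently, there exists a positive number $\beta$ such that for all $\varepsilon$, $0 < \varepsilon < \beta$,

$$\mathbf{x} + \varepsilon\eta(\varepsilon) \in B/\Sigma$$

which contradicts (ii) of Lemma 6, and so establishes the lemma. Conversely, we have

Lemma 7. Let $\eta$ be a bound vector at an interior point $\mathbf{x}$ of a limiting surface $\Sigma$. If $\eta = \eta(\varepsilon)$ is a function of a parameter $\varepsilon$ with the following properties:

(i) $\eta(\varepsilon) \to l$ as $\varepsilon \to 0$
(ii) $\exists\, \gamma > 0$ such that $\forall \varepsilon$, $0 < \varepsilon < \gamma$, $\mathbf{x} + \varepsilon\eta(\varepsilon) \in \overline{B/\Sigma}$

then

$$\mathbf{x} + l \in \bar{\mathcal{C}}_B(\mathbf{x})$$

Suppose again that the lemma is incorrect,‡ that is,

$$\mathbf{x} + l \notin \bar{\mathcal{C}}_B(\mathbf{x})$$

so that, according to Corollary 3,

$$\mathbf{x} + l \in \mathring{\mathcal{C}}_A(\mathbf{x})$$

† Note that the lemma is immediately valid if $\mathring{\mathcal{C}}_B(\mathbf{x}) = \emptyset$.
‡ Again, the lemma is clearly valid if $\mathring{\mathcal{C}}_A(\mathbf{x}) = \emptyset$.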
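Then, as in the proof of Lemma 6, we can state that there exists a positive number $\delta < \gamma$ such that for all $\varepsilon$, $0 < \varepsilon < \delta$,

$$\mathbf{x} + \eta(\varepsilon) \in \mathring{\mathcal{C}}_A(\mathbf{x})$$

But then Lemma 4 allows us to conclude that there exists a positive number $\beta < \delta$ such that for all $\varepsilon$, $0 < \varepsilon < \beta$,

$$\mathbf{x} + \varepsilon\eta(\varepsilon) \in A/\Sigma$$

which contradicts hypothesis (ii) of Lemma 7 and so establishes the lemma. As an immediate consequence of these lemmas we have

Corollary 4. Let $\eta$ be a bound vector at an interior point $\mathbf{x}$ of a limiting surface $\Sigma$. If $\eta = \eta(\varepsilon)$ is a function of a parameter $\varepsilon$ with the following properties:

(i) $\eta(\varepsilon) \to l$ as $\varepsilon \to 0$
(ii) $\exists\, \gamma > 0$ such that $\forall \varepsilon$, $0 < \varepsilon < \gamma$, $\mathbf{x} + \varepsilon\eta(\varepsilon) \in \Sigma$

then

$$\mathbf{x} + l \in \mathcal{S}(\mathbf{x})$$

Since

$$\Sigma = (\overline{A/\Sigma}) \cap (\overline{B/\Sigma})$$

we have

$$\mathbf{x} + \varepsilon\eta(\varepsilon) \in \overline{A/\Sigma} \quad \text{and} \quad \mathbf{x} + \varepsilon\eta(\varepsilon) \in \overline{B/\Sigma}$$

In view of Lemmas 6 and 7, respectively, it follows that

$$\mathbf{x} + l \in \bar{\mathcal{C}}_A(\mathbf{x}) \quad \text{and} \quad \mathbf{x} + l \in \bar{\mathcal{C}}_B(\mathbf{x})$$

so that

$$\mathbf{x} + l \in \bar{\mathcal{C}}_A(\mathbf{x}) \cap \bar{\mathcal{C}}_B(\mathbf{x}) = \mathcal{S}(\mathbf{x})$$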
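7.73 Lemma 8

Consider now any conic neighborhood

$$N(\mathbf{x}) \triangleq \{\mathbf{x} + k\eta : k > 0,\ \mathbf{x} + \eta \in B(\mathbf{x} + l)\}$$

where $B(\mathbf{x} + l)$ is an open ball in $E^{n+1}$ with center at $\mathbf{x} + l$. We shall prove

Lemma 8. Let $l$ be a bound vector at an interior point $\mathbf{x}$ of a limiting surface $\Sigma$. If $\mathbf{x} + l \in \mathcal{S}(\mathbf{x})$ and $N(\mathbf{x})$ is a conic neighborhood, then

$$N(\mathbf{x}) \cap \mathcal{C}_A(\mathbf{x}) \neq \emptyset$$

and

$$N(\mathbf{x}) \cap \mathcal{C}_B(\mathbf{x}) \neq \emptyset$$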
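Moreover

$$N(\mathbf{x}) \cap \mathring{\mathcal{C}}_A(\mathbf{x}) \neq \emptyset \quad \text{provided} \quad \mathring{\bar{\mathcal{C}}}_B(\mathbf{x}) = \mathring{\mathcal{C}}_B(\mathbf{x})$$

and

$$N(\mathbf{x}) \cap \mathring{\mathcal{C}}_B(\mathbf{x}) \neq \emptyset \quad \text{provided} \quad \mathring{\bar{\mathcal{C}}}_A(\mathbf{x}) = \mathring{\mathcal{C}}_A(\mathbf{x})$$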
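Suppose that the lemma is false; in particular, that

$$N(\mathbf{x}) \cap \mathcal{C}_A(\mathbf{x}) = \emptyset$$

Then it follows from Lemma 5 that

$$N(\mathbf{x}) \subset \mathcal{C}_B(\mathbf{x})$$

In view of the definitions of a conic neighborhood and of $\mathring{\mathcal{C}}_B(\mathbf{x})$, the last relation implies that

$$\mathbf{x} + l \in \mathring{\mathcal{C}}_B(\mathbf{x})$$

This, in turn, contradicts the hypothesis of the lemma that $\mathbf{x} + l \in \mathcal{S}(\mathbf{x})$. Similarly by supposing that

$$N(\mathbf{x}) \cap \mathcal{C}_B(\mathbf{x}) = \emptyset$$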
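we arrive at a contradiction. Now suppose that

$$N(\mathbf{x}) \cap \mathring{\mathcal{C}}_A(\mathbf{x}) = \emptyset$$

Then it follows from Corollary 3 that

$$N(\mathbf{x}) \subset \bar{\mathcal{C}}_B(\mathbf{x})$$

and, since $N(\mathbf{x})$ is open, we have

$$N(\mathbf{x}) \subset \mathring{\bar{\mathcal{C}}}_B(\mathbf{x})$$

Furthermore, from the assumption $\mathring{\bar{\mathcal{C}}}_B(\mathbf{x}) = \mathring{\mathcal{C}}_B(\mathbf{x})$, it follows that

$$N(\mathbf{x}) \subset \mathring{\mathcal{C}}_B(\mathbf{x})$$

which leads to a contradiction, as discussed above. Similarly, by supposing that

$$N(\mathbf{x}) \cap \mathring{\mathcal{C}}_B(\mathbf{x}) = \emptyset$$

we arrive at a contradiction. Hence, Lemma 8 is proved.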
7.74 Lemma 9

Next we shall prove

Lemma 9. Let $x'$ be an interior point of a limiting surface $\Sigma$, and let $L$ be a connected curve which joins $x_A \in \bar{\mathcal{C}}_A(x')$ and $x_B \in \bar{\mathcal{C}}_B(x')$. Then $L$ intersects $\mathcal{S}(x')$, that is,
$$L \cap \mathcal{S}(x') \neq \emptyset$$

The lemma is clearly valid if either $\mathcal{C}_A(x') = \emptyset$ or $\mathcal{C}_B(x') = \emptyset$, since
$$\mathcal{C}_A(x') = \emptyset \Rightarrow x_A \in \mathcal{S}(x'), \qquad \mathcal{C}_B(x') = \emptyset \Rightarrow x_B \in \mathcal{S}(x')$$

If neither $\mathcal{C}_A(x')$ nor $\mathcal{C}_B(x')$ is empty, and the lemma is false, that is,
$$L \cap \mathcal{S}(x') = \emptyset$$
then
$$x_A \in \mathcal{C}_A(x') \quad \text{and} \quad x_B \in \mathcal{C}_B(x')$$

However, according to Lemma 5 and Corollary 2,
$$\mathcal{C}_A(x') \cup \mathcal{C}_B(x') \cup \mathcal{S}(x') = E^{n+1}$$
so that a point of $L$ belongs to one of the two sets
$$\Lambda_A \triangleq L \cap \mathcal{C}_A(x'), \qquad \Lambda_B \triangleq L \cap \mathcal{C}_B(x')$$
so that $\Lambda_A \cup \Lambda_B = L$.

We shall denote by $d(x_i, x_j)$ the curvilinear distance between $x_i \in L$ and $x_j \in L$, that is, the distance along curve $L$ (properly parametrized). Consider now two sets of points
$$\{x : x = x_i^A \in \Lambda_A,\ i = 1, 2, \ldots, k\} \quad \text{and} \quad \{x : x = x_i^B \in \Lambda_B,\ i = 1, 2, \ldots, k\}$$
constructed in the following manner: Consider the midpoint of $L$; it belongs either to $\Lambda_A$ or to $\Lambda_B$. If it belongs to $\Lambda_A$, we shall denote it by $x_1^A$ and let $x_1^B = x_B$. If it belongs to $\Lambda_B$, we shall denote it by $x_1^B$ and let $x_1^A = x_A$. Then we consider the midpoint of the segment of $L$ between $x_1^A$ and $x_1^B$. If this point belongs to $\Lambda_A$, we shall denote it by $x_2^A$ and let $x_2^B = x_1^B$. If it belongs to $\Lambda_B$, we shall denote it by $x_2^B$ and let $x_2^A = x_1^A$. By repeating this process, we obtain two sets of points having the following properties:
$$d(x_A, x_1^A) \le d(x_A, x_2^A) \le \cdots \le d(x_A, x_k^A)$$
$$d(x_B, x_1^B) \le d(x_B, x_2^B) \le \cdots \le d(x_B, x_k^B)$$
$$d(x_1^A, x_1^B) > d(x_2^A, x_2^B) > \cdots > d(x_k^A, x_k^B)$$

Furthermore, since $L$ is connected,
$$d(x_k^A, x_k^B) \to 0 \quad \text{as } k \to \infty$$
so that $x_k^A$ and $x_k^B$ tend to the same limit, $x_L$, as $k$ increases; that is,
$$x_k^A \to x_L, \qquad x_k^B \to x_L \quad \text{as } k \to \infty$$

According to our supposition, $x_L$ belongs either to $\Lambda_A$ or to $\Lambda_B$. Suppose, for instance, that $x_L \in \Lambda_A$ and hence $x_L \in \mathcal{C}_A(x')$. Then there exists a $\delta > 0$ such that an open ball $B(x_L)$ with center at $x_L$ and radius $\rho$, $0 < \rho < \delta$, belongs to $\mathcal{C}_A(x')$. But this contradicts the fact that
$$x_k^B \to x_L \quad \text{as } k \to \infty$$
since every $x_k^B$ belongs to $\Lambda_B$. By the same token we arrive at a contradiction, if $x_L \in \Lambda_B$. Consequently, our supposition is incorrect and, in fact, we have
$$L \cap \mathcal{S}(x') \neq \emptyset$$
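The halving construction in the proof of Lemma 9 is, in effect, a bisection search along the curve $L$. The following Python sketch illustrates it under stated assumptions: the parametrization `sample_curve`, the membership test `sample_in_lambda_A`, and the sets they describe are hypothetical examples rather than objects from the text, and the curve parameter $s \in [0, 1]$ is halved in place of the curvilinear distance.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]


def bisect_curve(curve: Callable[[float], Point],
                 in_lambda_A: Callable[[Point], bool],
                 k: int) -> Tuple[float, float]:
    """Return the parameters (s_A, s_B) of x_k^A and x_k^B after k halvings.

    Assumes curve(0) = x_A lies in Lambda_A and curve(1) = x_B lies in Lambda_B.
    """
    s_a, s_b = 0.0, 1.0
    for _ in range(k):
        s_mid = 0.5 * (s_a + s_b)      # midpoint of the current segment of L
        if in_lambda_A(curve(s_mid)):
            s_a = s_mid                # the midpoint becomes the next A-point
        else:
            s_b = s_mid                # the midpoint becomes the next B-point
    return s_a, s_b


# Hypothetical example: the two sets are separated by the line x1 = 0,
# and the curve crosses that line at s = 1/3.
def sample_curve(s: float) -> Point:
    return (3.0 * s - 1.0, s * s)


def sample_in_lambda_A(p: Point) -> bool:
    return p[0] < 0.0


s_A, s_B = bisect_curve(sample_curve, sample_in_lambda_A, 40)
```

After $k$ halvings the two parameters differ by $2^{-k}$, so both point sequences are forced toward a common limit, which is the step on which the contradiction in the proof rests.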
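7.8 Tangent Cone $\mathcal{C}_\Sigma(x)$

7.81 Definition of $\mathcal{C}_\Sigma(x)$

We shall say that a unit vector $t_\Sigma$ is tangent to a limiting surface $\Sigma$ at a point $x$ in $\Sigma$, if the following conditions are fulfilled:

(i) There exists a vector function $\eta(\varepsilon)$ and a positive scalar function $m(\varepsilon)$, both of the same parameter $\varepsilon$, such that
$$\eta(\varepsilon) \to t_\Sigma \quad \text{and} \quad m(\varepsilon) \to 0 \quad \text{as } \varepsilon \to 0$$

(ii) There exists an infinite sequence
$$S_\Sigma \triangleq \{\varepsilon : \varepsilon = \varepsilon_i,\ i = 1, 2, \ldots, k, \text{ and } \varepsilon_k \to 0 \text{ as } k \to \infty\}$$
and a positive number $\alpha$ such that, for all $\varepsilon \in S_\Sigma$ and $0 < \varepsilon < \alpha$, the point
$$x + m(\varepsilon)\eta(\varepsilon) \in \Sigma$$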
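As an illustration of conditions (i) and (ii), the Python sketch below builds $m(\varepsilon)$ and $\eta(\varepsilon)$ for a hypothetical surface, the unit circle in the plane, at the point $x = (1, 0)$. The surface, the sequence $\varepsilon_k = 2^{-k}$, and the helper `secant_data` are assumptions chosen for the example only; the expected tangent direction in this case is $(0, 1)$.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def secant_data(x: Vec, p: Vec) -> Tuple[float, Vec]:
    """Return (m, eta) such that p = x + m * eta with |eta| = 1."""
    dx, dy = p[0] - x[0], p[1] - x[1]
    m = math.hypot(dx, dy)
    return m, (dx / m, dy / m)


# Hypothetical surface: the unit circle; base point x = (1, 0).
x = (1.0, 0.0)
epsilons = [2.0 ** -i for i in range(1, 21)]      # a sequence with eps_k -> 0

history: List[Tuple[float, Vec]] = []
for eps in epsilons:
    p = (math.cos(eps), math.sin(eps))             # a point of the circle near x
    history.append(secant_data(x, p))

m_last, eta_last = history[-1]                     # m -> 0 and eta -> (0, 1)
```

Every point $x + m(\varepsilon)\eta(\varepsilon)$ lies on the circle by construction, so condition (ii) holds for the whole sequence, while the computed values indicate the limits required by condition (i).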
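We define now the tangent cone $\mathcal{C}_\Sigma(x)$ of $\Sigma$ at $x$; namely,
$$\mathcal{C}_\Sigma(x) \triangleq \{x + k t_\Sigma : k > 0,\ \forall t_\Sigma\} \qquad (7.20)$$

Obviously, we can extend this definition to apply for a subset† (open, closed, or neither) of $\Sigma$. For instance, let $\Sigma_\nu$ denote a subset of $\Sigma$. In defining a unit vector $t_{\Sigma_\nu}$ tangent to $\Sigma_\nu$ at a point $x$ in $\bar\Sigma_\nu$, we retain condition (i) with $t_\Sigma = t_{\Sigma_\nu}$. In condition (ii) we require the existence of an infinite sequence $S_{\Sigma_\nu}$ and of a positive number $\alpha$ such that for all $\varepsilon \in S_{\Sigma_\nu}$, $0 < \varepsilon < \alpha$, the point $x + m(\varepsilon)\eta(\varepsilon) \in \Sigma_\nu$. The tangent cone of $\Sigma_\nu$ at $x$ is given by
$$\mathcal{C}_{\Sigma_\nu}(x) \triangleq \{x + k t_{\Sigma_\nu} : k > 0,\ \forall t_{\Sigma_\nu}\}$$

Clearly, the definition applies to the closure of $\Sigma_\nu$; thus, the tangent cone of $\bar\Sigma_\nu$ is defined by
$$\mathcal{C}_{\bar\Sigma_\nu}(x) \triangleq \{x + k t_{\bar\Sigma_\nu} : k > 0,\ \forall t_{\bar\Sigma_\nu}\}$$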
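7.82 Closure of a Subset of $\Sigma$

It is easily shown that
$$\mathcal{C}_{\bar\Sigma_\nu}(x) = \mathcal{C}_{\Sigma_\nu}(x) \qquad (7.21)$$

If $t_{\bar\Sigma_\nu} \in \mathcal{C}_{\bar\Sigma_\nu}(x)$, then conditions (i) and (ii) are satisfied. Let $\varepsilon \in S_{\bar\Sigma_\nu}$, $0 < \varepsilon < \alpha$, so that
$$x + m(\varepsilon)\eta(\varepsilon) \in \bar\Sigma_\nu$$
and consider a ball $B(x + m(\varepsilon)\eta(\varepsilon))$ with center at $x + m(\varepsilon)\eta(\varepsilon)$ and radius $\rho$. Since $x + m(\varepsilon)\eta(\varepsilon)$ is either a point of $\Sigma_\nu$ or a limit point of $\Sigma_\nu$, a ball $B(x + m(\varepsilon)\eta(\varepsilon))$ contains a point of $\Sigma_\nu$; let us denote such a point by
$$x + m'(\varepsilon)\eta'(\varepsilon) \in \Sigma_\nu \qquad (7.22)$$
where $|\eta'(\varepsilon)| = 1$. Thus, we have
$$m'(\varepsilon)\eta'(\varepsilon) - m(\varepsilon)\eta(\varepsilon) = \rho'(\varepsilon), \qquad |\rho'(\varepsilon)| < \rho \qquad (7.23)$$
so that, since $|\eta(\varepsilon)| = |\eta'(\varepsilon)| = 1$, we have
$$m'(\varepsilon) \le m(\varepsilon) + |\rho'(\varepsilon)|$$

Now let
$$m'(\varepsilon) = m(\varepsilon) + \rho'', \qquad 0 \le \rho'' \le |\rho'(\varepsilon)| \qquad (7.24)$$
and substitute in (7.23); that is,
$$m(\varepsilon)[\eta'(\varepsilon) - \eta(\varepsilon)] = \rho'(\varepsilon) - \rho''\eta'(\varepsilon)$$
so that
$$|\eta'(\varepsilon) - \eta(\varepsilon)| \le \frac{2|\rho'(\varepsilon)|}{m(\varepsilon)}$$

Since radius $\rho$ of $B(x + m(\varepsilon)\eta(\varepsilon))$ is arbitrary, let us choose it such that $\rho / m(\varepsilon) \to 0$ as $\varepsilon \to 0$. Consequently,
$$\eta'(\varepsilon) \to t_{\bar\Sigma_\nu} \quad \text{as } \varepsilon \to 0 \qquad (7.25)$$
and, in view of (7.24), also
$$m'(\varepsilon) \to 0 \quad \text{as } \varepsilon \to 0 \qquad (7.26)$$

Thus, (7.22), (7.25), and (7.26) imply the satisfaction of conditions (i) and (ii); and so
$$t_{\bar\Sigma_\nu} \in \mathcal{C}_{\Sigma_\nu}(x)$$

Conversely, it is clear that
$$t_{\Sigma_\nu} \in \mathcal{C}_{\Sigma_\nu}(x) \quad \text{implies that} \quad t_{\Sigma_\nu} \in \mathcal{C}_{\bar\Sigma_\nu}(x)$$
since
$$x + m(\varepsilon)\eta(\varepsilon) \in \Sigma_\nu \quad \text{implies that} \quad x + m(\varepsilon)\eta(\varepsilon) \in \bar\Sigma_\nu$$

This completes the proof of (7.21).

† Note, however, that in this case a tangent vector might not be defined; e.g., if $\Sigma_\nu$ is a point.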
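7.83 A Partition of $\Sigma$; Lemma 10

We shall now prove

Lemma 10. Let $\{\Sigma_1, \Sigma_2, \ldots, \Sigma_\mu\}$ constitute a partition† of a limiting surface $\Sigma$, and consider a point $x \in \bigcap_{i=1}^{\gamma} \bar\Sigma_i$, where the $\Sigma_i$, $i = 1, 2, \ldots, \gamma \le \mu$, belong to the partition and are all the subsets whose closures contain $x$.‡ Then
$$\mathcal{C}_\Sigma(x) = \bigcup_{i=1}^{\gamma} \mathcal{C}_{\Sigma_i}(x)$$

To prove this lemma, consider a vector $t_\Sigma$ such that
$$x + t_\Sigma \in \mathcal{C}_\Sigma(x)$$
so that conditions (i) and (ii), defining a tangent vector, are met. Condition (i) implies that for every ball $B(x + t_\Sigma)$ with center at $x + t_\Sigma$ there exists a $\beta > 0$ such that
$$\forall \varepsilon,\ 0 < \varepsilon < \beta, \qquad x + \eta(\varepsilon) \in B(x + t_\Sigma) \qquad (7.27)$$

Also, according to condition (ii), there exists a sequence $S_\Sigma$ of $\varepsilon$, and a positive number $\alpha$, such that
$$\forall \varepsilon,\ 0 < \varepsilon < \alpha,\ \varepsilon \in S_\Sigma, \qquad x + m(\varepsilon)\eta(\varepsilon) \in \bigcup_{i=1}^{\gamma} \Sigma_i \qquad (7.28)$$

Thus, there exists a positive number $\sigma$, $0 < \sigma \le \alpha$, such that (7.27) and (7.28) are met for all $\varepsilon$, $0 < \varepsilon < \sigma$. Condition (7.28) then implies that
$$\forall \varepsilon,\ 0 < \varepsilon < \sigma,\ \varepsilon \in S_\Sigma, \qquad x + m(\varepsilon)\eta(\varepsilon) \in \Sigma_i$$
where
$$\Sigma_i \in \{\Sigma_1, \Sigma_2, \ldots, \Sigma_\gamma\}$$

We must now consider two possibilities:

(a) Let $S_\Sigma'$ denote an infinite subsequence of $S_\Sigma$, and suppose that
$$\forall \varepsilon,\ 0 < \varepsilon < \sigma,\ \varepsilon \in S_\Sigma', \qquad x + m(\varepsilon)\eta(\varepsilon) \in \Sigma_1 \qquad (7.29)$$
Then
$$t_\Sigma \in \mathcal{C}_{\Sigma_1}(x)$$

(b) Alternatively, there does not exist a $S_\Sigma'$ for which (7.29) is fulfilled; for instance,
$$\forall \varepsilon,\ 0 < \varepsilon < \sigma_1 \le \sigma,\ \varepsilon \in S_\Sigma, \qquad x + m(\varepsilon)\eta(\varepsilon) \notin \Sigma_1$$
Then we replace (7.27) and (7.28), respectively, by the statement that there exists a $\sigma_1$, $0 < \sigma_1 \le \sigma \le \alpha$, such that
$$\forall \varepsilon,\ 0 < \varepsilon < \sigma_1, \qquad x + \eta(\varepsilon) \in B(x + t_\Sigma)$$
and
$$\forall \varepsilon,\ 0 < \varepsilon < \sigma_1,\ \varepsilon \in S_\Sigma, \qquad x + m(\varepsilon)\eta(\varepsilon) \in \bigcup_{i=2}^{\gamma} \Sigma_i$$

We repeat this process at most $\gamma - 1$ times. Since conditions (i) and (ii), and hence (7.27) and (7.28), must be met, we conclude that case (a) must arise for at least one $\Sigma_i$, $i = 1, 2, \ldots, \gamma$. In other words,
$$t_\Sigma \in \bigcup_{i=1}^{\gamma} \mathcal{C}_{\Sigma_i}(x)$$
which completes the proof of Lemma 10.

† That is, $\Sigma_i \subset \Sigma$ and $\Sigma_i \neq \emptyset$, $i = 1, 2, \ldots, \mu$; $\Sigma_i \cap \Sigma_j = \emptyset$, $i \neq j$; and $\bigcup_{i=1}^{\mu} \Sigma_i = \Sigma$.
‡ This can always be done by renumbering the members of the partition.

7.9 A Nice Limiting Surface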
We shall say that a limiting surface $\Sigma$ is nice, if it possesses the following properties:

(i) There exists a partition $\{\Sigma_1, \Sigma_2, \ldots, \Sigma_\mu\}$ of $\Sigma$ such that at every point $x$ in $\bar\Sigma_i$, $i = 1, 2, \ldots, \mu$, the tangent cone $\mathcal{C}_{\Sigma_i}(x)$ is defined and belongs to a $k$-dimensional plane $T_{\Sigma_i}(x)$, $k \le n$, through point $x$.

(ii) If $x$ is an interior point of $\mathcal{E}^*$, consider a ball $B(x)$ with center at $x$ and sufficiently small radius so that $B(x) \subset \mathcal{E}^*$. Consider also points†
$$x_A \in B(x) \cap A/\Sigma, \qquad x_B \in B(x) \cap B/\Sigma$$
Then there exists a positive number $\alpha$ such that, for every pair of points $x_A$ and $x_B$, the point
$$\alpha x_A + (1 - \alpha) x_B \in \Sigma$$

† This is always possible in view of definitions (7.9) and (7.10) of $A/\Sigma$ and $B/\Sigma$, respectively.

7.91 Closure of a Subset of a Nice $\Sigma$; Lemma 11

We shall now prove

Lemma 11. Let $\Sigma$ be a nice limiting surface, and consider a point $x \in \bigcap_{i=1}^{\gamma} \bar\Sigma_i$, $\gamma \le \mu$.‡ If $t_\Sigma$ is a vector tangent to $\bigcap_{i=1}^{\gamma} \bar\Sigma_i$ at $x$, then
$$x + t_\Sigma \in \bigcap_{i=1}^{\gamma} T_{\Sigma_i}(x)$$

According to property (ii) of a tangent vector $t_{\bar\Sigma_i}$ of $\bar\Sigma_i$ at $x$, there exist an infinite sequence $S_{\bar\Sigma_i}$ and a positive number $\alpha$ such that for $\varepsilon \in S_{\bar\Sigma_i}$, $0 < \varepsilon < \alpha$, the point
$$x + m(\varepsilon)\eta(\varepsilon) \in \bar\Sigma_i, \qquad i = 1, 2, \ldots, \gamma$$
Hence,
$$x + t_\Sigma = x + t_{\bar\Sigma_i} \in \mathcal{C}_{\bar\Sigma_i}(x), \qquad i = 1, 2, \ldots, \gamma$$
and so, by (7.21), we have
$$x + t_\Sigma \in \mathcal{C}_{\Sigma_i}(x), \qquad i = 1, 2, \ldots, \gamma$$
But according to property (i) of a nice surface $\Sigma$,
$$\mathcal{C}_{\Sigma_i}(x) \subseteq T_{\Sigma_i}(x), \qquad i = 1, 2, \ldots, \gamma$$
Consequently,
$$x + t_\Sigma \in T_{\Sigma_i}(x), \qquad i = 1, 2, \ldots, \gamma$$
whence follows Lemma 11.

‡ That is, the set $\{\Sigma_1, \Sigma_2, \ldots, \Sigma_\gamma\}$ belongs to $\{\Sigma_1, \Sigma_2, \ldots, \Sigma_\mu\}$ which is a partition of $\Sigma$ satisfying property (i) of a nice limiting surface.

7.92 Cones $\mathcal{S}(x)$ and $\mathcal{C}_\Sigma(x)$ of a Nice $\Sigma$; Lemma 12
Another salient property of a nice limiting surface is embodied in

Lemma 12. Let $x$ be an interior point of a nice limiting surface $\Sigma$ with local cone $\mathcal{S}(x)$ and tangent cone $\mathcal{C}_\Sigma(x)$. Then
$$\mathcal{C}_\Sigma(x) = \mathcal{S}(x)$$

Consider a unit vector $l$,
$$x + l \in \mathcal{S}(x), \qquad |l| = 1$$
and any conic neighborhood
$$N(x) \triangleq \{x + k\eta : k > 0,\ x + \eta \in B(x + l)\}$$
where $B(x + l)$ is an open ball in $E^{n+1}$ with center at $x + l$ and radius $\rho$. Consider another open ball $B'(x)$ in $E^{n+1}$ with center at $x$ and the same radius $\rho$. Since $x$ is an interior point of $\mathcal{E}^*$, there exists a $\delta > 0$ such that
$$\forall \rho,\ 0 < \rho < \delta, \qquad B'(x) \subset \mathcal{E}^*$$

We shall show that
$$N(x) \cap B'(x) \cap \Sigma \neq \emptyset \qquad (7.30)$$
Suppose (7.30) is false, that is,
$$N(x) \cap B'(x) \cap \Sigma = \emptyset \qquad (7.31)$$
We shall show first of all that (7.31) implies
$$N(x) \cap B'(x) \subset A/\Sigma \quad \text{or} \quad N(x) \cap B'(x) \subset B/\Sigma \qquad (7.32)$$

Consider two points $x_A$ and $x_B$ in $N(x) \cap B'(x)$, and assume that
$$x_A \in A/\Sigma \quad \text{and} \quad x_B \in B/\Sigma$$
Now, since $N(x)$ and $B'(x)$ are convex, $N(x) \cap B'(x)$ is convex; that is, for all $\alpha$ and $\beta$, $\alpha \ge 0$, $\beta \ge 0$, and $\alpha + \beta = 1$,
$$\alpha x_A + \beta x_B \in N(x) \cap B'(x)$$
But, since $\Sigma$ is nice, there exist $\alpha$ and $\beta$, $\alpha > 0$, $\beta > 0$, $\alpha + \beta = 1$, such that
$$\alpha x_A + \beta x_B \in \Sigma$$
which contradicts (7.31), proving that (7.31) implies (7.32).

However, if (7.32) is valid, then it follows from the definitions of $\mathcal{C}_A(x)$ and $\mathcal{C}_B(x)$, and of interior points, that $x + l$ is an interior point of $\mathcal{C}_A(x)$ or of $\mathcal{C}_B(x)$. But that contradicts the assumption of the lemma that $x + l$ belongs to $\mathcal{S}(x)$ and hence establishes (7.30).

Since both $N(x)$ and $B'(x)$ are defined in terms of the same radius $\rho$, condition (7.30) implies that for all $\rho$ there exist a number $m$, $0 < m < \rho$, and a vector $\eta$, $|\eta| = 1$, such that
$$x + m\eta \in N(x) \cap B'(x) \quad \text{and} \quad x + m\eta \in \Sigma$$
Therefore, we can associate with every value of $\rho$ a vector
$$\eta = \eta(\rho), \qquad |\eta(\rho)| = 1$$
and a scalar
$$m = m(\rho)$$
such that
$$\eta(\rho) \to l, \qquad m(\rho) \to 0 \quad \text{as } \rho \to 0$$
Furthermore, since (7.30) is valid for all values of $\rho$, it follows that for any sequence of $\rho$
$$\rho_1, \rho_2, \ldots, \rho_k, \ldots \quad \text{where } \rho_k \to 0 \text{ as } k \to \infty$$
we have
$$x + m(\rho)\eta(\rho) \in \Sigma$$

Thus, properties (i) and (ii) of a tangent vector to $\Sigma$ at $x$ are fulfilled; that is,
$$x + l \in \mathcal{C}_\Sigma(x)$$
and so
$$x + l \in \mathcal{S}(x) \Rightarrow x + l \in \mathcal{C}_\Sigma(x)$$
In other words,
$$\mathcal{S}(x) \subseteq \mathcal{C}_\Sigma(x) \qquad (7.33)$$

However, a tangent vector to $\Sigma$ at $x$ satisfies conditions (i) and (ii) of Corollary 4; consequently,
$$x + l \in \mathcal{C}_\Sigma(x) \Rightarrow x + l \in \mathcal{S}(x)$$
so that
$$\mathcal{C}_\Sigma(x) \subseteq \mathcal{S}(x) \qquad (7.34)$$

Lemma 12 follows directly from (7.33) and (7.34).
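7.93 Lemmas 13 and 14

Let $\{\Sigma_1, \Sigma_2, \ldots, \Sigma_\mu\}$ constitute a partition of a nice limiting surface $\Sigma$, which satisfies property (i) of such a surface. Consider a point
$$x \in \bigcap_{i=1}^{\gamma} \bar\Sigma_i, \qquad \gamma \le \mu$$
where the $\Sigma_i$, $i = 1, 2, \ldots, \gamma \le \mu$, belong to the partition, and are all the subsets of the partition whose closures contain $x$. Furthermore, we shall suppose that $x$ is an interior point of $\mathcal{E}^*$. Then it follows from Lemma 10 that
$$\mathcal{C}_\Sigma(x) = \bigcup_{i=1}^{\gamma} \mathcal{C}_{\Sigma_i}(x)$$
so that, in view of property (i) of a nice $\Sigma$,
$$\mathcal{C}_\Sigma(x) \subseteq \bigcup_{i=1}^{\gamma} T_{\Sigma_i}(x)$$
Furthermore, according to Lemma 12,
$$\mathcal{S}(x) = \mathcal{C}_\Sigma(x)$$
Consequently,
$$\mathcal{S}(x) \subseteq \bigcup_{i=1}^{\gamma} T_{\Sigma_i}(x) \qquad (7.35)$$

Finally, consider a vector $t_\Sigma$ which is tangent to $\bigcap_{i=1}^{\gamma} \bar\Sigma_i$ at $x$; then, by Lemma 11, we have
$$x + t_\Sigma \in \bigcap_{i=1}^{\gamma} T_{\Sigma_i}(x) \qquad (7.36)$$

Concerning the situation outlined above, we shall now prove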
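Lemma 13. Provided $\bar{\mathcal{C}}_B(x) = \mathcal{C}_B(x)$, then for all $\eta$ such that
$$x + \eta \in \bar{\mathcal{C}}_A(x)$$
and all $\alpha \ge 0$
$$x + \eta + \alpha t_\Sigma \in \bar{\mathcal{C}}_A(x) \qquad (7.37)$$

Clearly, if $\mathcal{C}_B(x) = \emptyset$ so that $\bar{\mathcal{C}}_A(x) = E^{n+1}$, the lemma is correct. So let us consider
$$\mathcal{C}_B(x) \neq \emptyset$$
and suppose the lemma is false; namely, there exists a value of $\alpha$, say $\alpha'$, such that
$$x + \eta + \alpha' t_\Sigma \notin \bar{\mathcal{C}}_A(x)$$
and hence
$$x + \eta + \alpha' t_\Sigma \in \mathcal{C}_B(x) \qquad (7.38)$$
Then there exists an open ball with center at $x + \eta + \alpha' t_\Sigma$ and belonging to $\mathcal{C}_B(x)$, say
$$B'(x + \eta + \alpha' t_\Sigma) \subset \mathcal{C}_B(x)$$

Next consider an open ball $B(x + \eta)$ with center at $x + \eta$ and radius $\rho$. Then no matter how small radius $\rho$, $\rho > 0$,
$$B(x + \eta) \cap \mathcal{C}_A(x) \neq \emptyset \qquad (7.39)$$
For, if
$$B(x + \eta) \cap \mathcal{C}_A(x) = \emptyset$$
we have that
$$B(x + \eta) \subset \bar{\mathcal{C}}_B(x)$$
Since we have assumed that $\bar{\mathcal{C}}_B(x) = \mathcal{C}_B(x)$, it follows that
$$B(x + \eta) \subset \mathcal{C}_B(x)$$
which contradicts the hypothesis of the lemma which implies that $x + \eta$ is not an interior point of $\mathcal{C}_B(x)$.

Now, in view of (7.39), we may consider a point $x + \eta'$ and an open ball $B''(x + \eta')$ centered at that point, such that
$$x + \eta' \in B(x + \eta) \cap \mathcal{C}_A(x)$$
and
$$B''(x + \eta') \subset B(x + \eta) \cap \mathcal{C}_A(x)$$
Thus, there exists a point $x + \eta''$ such that

(i) $x + \eta'' \in \mathcal{C}_A(x)$
(ii) $|\eta'' - \eta| < \rho$
(iii) $x + \eta'' \notin \bigcup_{i=1}^{\gamma} T_{\Sigma_i}(x)$

Indeed, $x + \eta''$ may be any point which belongs to ball $B''(x + \eta')$ and satisfies condition (iii) above.

Next consider the line $L$ which passes through point $x + \eta''$ and is parallel to $t_\Sigma$. Then line $L$ is also parallel to line
$$L' \triangleq \{x + \eta + \alpha t_\Sigma : \forall \alpha\}$$
Clearly the (minimum) distance between lines $L$ and $L'$ is less than radius $\rho$ of ball $B(x + \eta)$. Consequently, for sufficiently small $\rho$, line $L$ intersects the ball $B'(x + \eta + \alpha' t_\Sigma)$. This implies that, for sufficiently small $\rho$, there exist two points of $L$, say
$$x_A = x + \eta'' \quad \text{and} \quad x_B \in B'(x + \eta + \alpha' t_\Sigma)$$
such that
$$x_A \in \mathcal{C}_A(x), \qquad x_B \in \mathcal{C}_B(x)$$
In that event, it follows from Lemma 9 that
$$L \cap \mathcal{S}(x) \neq \emptyset$$

But this conclusion results in a contradiction. For, since $L$ is parallel to $t_\Sigma$, it follows from (7.36) that $L$ is parallel to $T_{\Sigma_i}(x)$, $i = 1, 2, \ldots, \gamma$. Furthermore,
$$x_A \in L \quad \text{and} \quad x_A \notin T_{\Sigma_i}(x), \qquad i = 1, 2, \ldots, \gamma$$
so that
$$L \cap T_{\Sigma_i}(x) = \emptyset, \qquad i = 1, 2, \ldots, \gamma \qquad (7.40)$$
But (7.40) with (7.35) implies that
$$L \cap \mathcal{S}(x) = \emptyset$$
and so Lemma 13 is established.

By similar arguments one can prove

Lemma 14. Provided $\bar{\mathcal{C}}_A(x) = \mathcal{C}_A(x)$, then for all $\eta$ such that
$$x + \eta \in \bar{\mathcal{C}}_B(x)$$
and all $\alpha \ge 0$
$$x + \eta + \alpha t_\Sigma \in \bar{\mathcal{C}}_B(x)$$

7.10 A Set of Admissible Rules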
7.101 State Equations and Control

Thus far we have not specified the set of rules which govern the behavior of the system; rather, we have made certain assumptions regarding the system's behavior. Henceforth, we shall restrict the analysis to systems whose state variables are solutions of state equations
$$\dot{x}_j = f_j(x_1, x_2, \ldots, x_n; u_1, u_2, \ldots, u_m), \qquad j = 1, 2, \ldots, n \qquad (7.41)†$$
where $u_1, u_2, \ldots, u_m$ are parameters. The control vector, or simply the control
$$u = (u_1, u_2, \ldots, u_m)$$
defines a point in an $m$-dimensional Euclidean space $E^m$. Given the control as a function of time
$$u = u(t), \qquad t_0 \le t \le t_1$$
the state equations provide a rule which governs the system's behavior during time interval $[t_0, t_1]$.

† $(\dot{\ })$ denotes differentiation of $(\ )$ with respect to time $t$.

We shall assume that

(i) control $u(t)$ is defined and piecewise continuous† on $[t_0, t_1]$;
(ii) $u(t) \in \Omega$, $\forall t \in [t_0, t_1]$, where $\Omega$ is a given subset of $E^m$.

A control which satisfies both of these conditions will be termed an admissible control. Clearly, the set of admissible controls together with state equations (7.41) constitutes the set of admissible rules.

We shall assume that functions $f_j(x, u)$ and $\partial f_j(x, u)/\partial x_i$, $i, j = 1, 2, \ldots, n$, are continuous on $E^n \times \Omega$. Consequently, for a given admissible control $u(t)$, $t_0 \le t \le t_1$, and prescribed initial condition
$$x(t_0) = x^0$$
the solution $x(t)$ of state equations (7.41) is unique and continuous on $[t_0, t_1]$.
7.102 An Integral Performance Index and the Trajectory Equation

Now we shall consider an integral performance index
$$\int_{t_0}^{t_1} f_0(x, u)\, dt \qquad (7.42)$$
where $u(t)$, $t_0 \le t \le t_1$, is a control which transfers the system from given initial state $x^0$ at time $t_0$ to a terminal state $x^1$ at time $t_1$. As in Sec. 7.2, we introduce a variable $x_0$ such that
$$x_0(t) - x_0(t_0) = \int_{t_0}^{t} f_0(x, u)\, dt$$
so that
$$\dot{x}_0 = f_0(x, u) \qquad (7.43)$$

We shall assume that $f_0(x, u)$ and $\partial f_0(x, u)/\partial x_j$, $j = 1, 2, \ldots, n$, are continuous on $E^n \times \Omega$. Equations (7.41) and (7.43) constitute a set of $n + 1$ scalar equations which we shall write in vector form
$$\dot{x} = f(x, u) \qquad (7.44)$$
where
$$f(x, u) = (f_0(x, u), f_1(x, u), \ldots, f_n(x, u))$$

† If $u(t)$ is discontinuous at $t = t_c$, we shall take $u(t_c) = u(t_c - 0)$. Also, without loss of generality, we shall assume that $u(t)$ is continuous at $t = t_0$ and $t = t_1$.

Equation (7.44) is the trajectory equation; namely, for given admissible control $u(t)$, $t_0 \le t \le t_1$, and prescribed initial condition
$$x(t_0) = x^0$$
Equation (7.44) possesses a unique, continuous solution $x(t)$ on $[t_0, t_1]$. This solution defines a trajectory $\Gamma$ in $E^{n+1}$. An optimal control $u^*(t)$, $t_0 \le t \le t_1$, results in a transfer of the system from prescribed initial state $x^0$ to prescribed terminal state $x^1$ while rendering the minimum value of integral (7.42) or, equivalently, of $x_0(t_1) - x_0(t_0)$. Note that the value of integral (7.42) is independent of the initial value of $x_0$ so that we may choose $x_0(t_0) = 0$. A solution of trajectory equation (7.44) with $u = u^*(t)$, $t_0 \le t \le t_1$, will be denoted by $x^*(t)$; it defines an optimal trajectory $\Gamma^*$ in $E^{n+1}$.
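The augmentation of the state by $x_0$ can be illustrated numerically. The Python sketch below integrates a trajectory equation of the form (7.44) with a classical fourth-order Runge-Kutta scheme, so that the performance index is read off as $x_0(t_1)$. The scalar system $\dot{x}_1 = u$, the integrand $f_0 = x_1^2 + u^2$, the constant control, and all function names are illustrative assumptions, not taken from the text.

```python
from typing import Callable, List

Vector = List[float]


def augmented_rhs(f0: Callable[[Vector, float], float],
                  f: Callable[[Vector, float], Vector],
                  u: Callable[[float], float]) -> Callable[[float, Vector], Vector]:
    """Build the right-hand side (f0, f1, ..., fn) of the augmented system."""
    def rhs(t: float, z: Vector) -> Vector:
        x = z[1:]                       # z = (x0, x1, ..., xn); f does not depend on x0
        ut = u(t)
        return [f0(x, ut)] + list(f(x, ut))
    return rhs


def rk4(rhs: Callable[[float, Vector], Vector],
        z0: Vector, t0: float, t1: float, steps: int) -> Vector:
    """Integrate z' = rhs(t, z) from t0 to t1 with fixed-step RK4."""
    h = (t1 - t0) / steps
    t, z = t0, list(z0)
    for _ in range(steps):
        k1 = rhs(t, z)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(z, k1)])
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(z, k2)])
        k4 = rhs(t + h, [a + h * b for a, b in zip(z, k3)])
        z = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(z, k1, k2, k3, k4)]
        t += h
    return z


# Hypothetical example: x1' = u, f0 = x1^2 + u^2, u(t) = 1, x1(0) = 0, x0(0) = 0.
z_final = rk4(
    augmented_rhs(lambda x, u: x[0] ** 2 + u ** 2, lambda x, u: [u], lambda t: 1.0),
    [0.0, 0.0], 0.0, 1.0, 50,
)
cost, x1_final = z_final
```

For this example $x_1(t) = t$, so the integral of $t^2 + 1$ over $[0, 1]$ gives $4/3$, which is the value the sketch should return for `cost`.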
7.11 Velocity Vectors in Augmented State Space
Let us now consider the vector $f(x, u)$ in trajectory equation (7.44). Given a constant value $u^b$ of admissible control, that is,
$$u(t) = u^b \in \Omega, \qquad -\infty < t < \infty$$
the vector function $f(x, u)$ defines a field of velocity vectors $f(x, u^b)$, $\forall x \in E^{n+1}$, which has the following properties:

(i) a field line, namely the curve whose tangent at $x$ is $f(x, u^b)$, is an integral curve of (7.44), that is, a trajectory in $E^{n+1}$;
(ii) through every point of $E^{n+1}$ there passes one and only one field line whose tangent is defined at that point.

7.111 Lemma 15

Consider now a field line, say $L$, defined parametrically by
$$x = x(t), \qquad t \in (-\infty, \infty)$$
and let
$$\eta_+(\Delta t) \triangleq \frac{\Delta x_+}{\Delta t}, \qquad \eta_-(\Delta t) \triangleq \frac{\Delta x_-}{\Delta t}$$
where
$$\Delta x_+ \triangleq x(t_i + \Delta t) - x(t_i), \qquad \Delta x_- \triangleq x(t_i - \Delta t) - x(t_i), \qquad x(t_i) = x, \quad \Delta t > 0 \qquad (7.45)$$

Clearly, by (7.45), we have
$$x + \Delta x_+ \in L, \qquad x + \Delta x_- \in L$$
Thus $\eta_+(\Delta t)$ and $\eta_-(\Delta t)$ are continuous functions of $\Delta t$, and
$$\eta_+(\Delta t) \to f(x, u^b), \qquad \eta_-(\Delta t) \to -f(x, u^b) \quad \text{as } \Delta t \to 0 \qquad (7.46)$$

Suppose now that $x$ is an interior† point of a limiting surface $\Sigma$; then it follows from Theorem 1 and Corollary 1, respectively, that
$$x + \Delta x_+ = x + \eta_+(\Delta t)\,\Delta t \in A/\Sigma, \qquad x + \Delta x_- = x + \eta_-(\Delta t)\,\Delta t \in B/\Sigma \qquad (7.47)$$
for sufficiently small $\Delta t$. Conditions (7.46) and (7.47) together with Lemmas 6 and 7 result at once in

Lemma 15. At an interior point $x$ of a limiting surface $\Sigma$
$$x + f(x, u) \in \bar{\mathcal{C}}_A(x), \qquad x - f(x, u) \in \bar{\mathcal{C}}_B(x)$$
for all $u \in \Omega$.
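The limits in (7.46) can be checked numerically on a field line that is known in closed form. The Python sketch below does so for a hypothetical planar field $f(x) = (-x_2, x_1)$, whose field line through $(1, 0)$ is the unit circle; the field, the point $t_i = 0.7$, and the step $\Delta t = 10^{-5}$ are assumptions made for the example, with the constant control absorbed into the definition of the field.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def flow(t: float) -> Vec:
    """Exact field line of the hypothetical field f(x) = (-x2, x1) through (1, 0)."""
    return (math.cos(t), math.sin(t))


def field(x: Vec) -> Vec:
    """Velocity vector f(x) of the hypothetical field."""
    return (-x[1], x[0])


def quotients(t_i: float, dt: float) -> Tuple[Vec, Vec]:
    """Return eta_plus(dt) and eta_minus(dt), the difference quotients built from (7.45)."""
    x = flow(t_i)
    xp, xm = flow(t_i + dt), flow(t_i - dt)
    eta_plus = ((xp[0] - x[0]) / dt, (xp[1] - x[1]) / dt)
    eta_minus = ((xm[0] - x[0]) / dt, (xm[1] - x[1]) / dt)
    return eta_plus, eta_minus


t_i = 0.7
f_x = field(flow(t_i))
eta_plus, eta_minus = quotients(t_i, 1e-5)   # close to f_x and to -f_x, respectively
```

The forward quotient approaches the velocity vector and the backward quotient approaches its negative, which is the sign pattern that places $x + f$ and $x - f$ on opposite sides in Lemma 15.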
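7.112 Lemma 16

Consider now an optimal trajectory $\Gamma^*$ generated by $u^*(t)$, $t_0 \le t \le t_1$, and given by $x^*(t)$. Here we distinguish between two cases:

(i) $x^*(t) \neq$ const for a nonzero time interval, that is,
$$f(x^*(t), u^*(t)) \neq 0, \qquad t' \le t \le t'', \quad t'' > t'$$
(ii) $x^*(t) =$ const for a nonzero time interval, that is,
$$f(x^*(t), u^*(t)) = 0, \qquad t' \le t \le t'', \quad t'' > t'$$

‡ That is, an interior point of $\mathcal{E}^*$.

Now let
$$\eta_+(\Delta t) \triangleq \frac{\Delta x_+}{\Delta t}, \qquad \eta_-(\Delta t) \triangleq \frac{\Delta x_-}{\Delta t}$$
where for case (i)
$$\Delta x_+ \triangleq x^*(t_c + \Delta t) - x^*(t_c), \qquad \Delta x_- \triangleq x^*(t_c - \Delta t) - x^*(t_c)$$
$$x^*(t_c) = x, \quad t_c \in (t', t''), \quad \Delta t > 0, \qquad u_+^* \triangleq u^*(t_c + 0), \quad u_-^* \triangleq u^*(t_c - 0)$$
and for case (ii)
$$\Delta x_+ \triangleq x^*(t'' + \Delta t) - x^*(t''), \qquad \Delta x_- \triangleq x^*(t' - \Delta t) - x^*(t')$$
$$x^*(t') = x^*(t'') = x, \quad \Delta t > 0, \qquad u_+^* \triangleq u^*(t'' + 0), \quad u_-^* \triangleq u^*(t' - 0)$$

Thus, $\eta_+(\Delta t)$ and $\eta_-(\Delta t)$ are continuous functions of $\Delta t$, where
$$\eta_+(\Delta t) \to f_+^* \triangleq f(x, u_+^*), \qquad \eta_-(\Delta t) \to -f_-^* \triangleq -f(x, u_-^*) \quad \text{as } \Delta t \to 0 \qquad (7.48)$$

Furthermore, if $x$ is an interior point of $\mathcal{E}^*$ and $\Sigma$ is the limiting surface on which $\Gamma^*$ lies, we have from Lemma 2 that
$$x + \Delta x_+ = x + \eta_+(\Delta t)\,\Delta t \in \Sigma, \qquad x + \Delta x_- = x + \eta_-(\Delta t)\,\Delta t \in \Sigma \qquad (7.49)$$
for sufficiently small $\Delta t$. Conditions (7.48) and (7.49) together with Corollary 4 result in

Lemma 16. At an interior point $x$ of a limiting surface $\Sigma$
$$x + f_+^* \in \mathcal{S}(x), \qquad x - f_-^* \in \mathcal{S}(x)$$
provided $f_+^*$ and $f_-^*$, respectively, are defined.† Of course, if $u_+^* = u_-^*$ then $f_+^* = f_-^* \triangleq f^*$ and
$$x + f^* \in \mathcal{S}(x), \qquad x - f^* \in \mathcal{S}(x)$$

† Clearly, $f_-^*$ is not defined if $x = x^*(t_0)$, and $f_+^*$ is not defined if $x = x^*(t_1)$.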
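7.113 Lemmas 17 and 18; Corollary 5

We shall now prove

Lemma 17. For every vector $\eta$ at an interior point $x$ of a limiting surface $\Sigma$, such that
$$x + \eta \in \bar{\mathcal{C}}_A(x)$$
and for every $\alpha$, $\alpha \ge 0$,
$$x + \eta + \alpha f(x, u) \in \bar{\mathcal{C}}_A(x), \qquad \forall u \in \Omega$$

Let us first establish the lemma for
$$x + \eta \in \mathcal{C}_A(x) \qquad (7.50)$$
whence, according to definition (7.13) of $\mathcal{C}_A(x)$, there exists a $\sigma > 0$ such that for all $\varepsilon$, $0 < \varepsilon < \sigma$,
$$x + \varepsilon\eta \in A/\Sigma \qquad (7.51)$$

The subsequent arguments are similar to those employed in the proof of Lemma 15. Here we consider a trajectory, $\Gamma$, which passes through point $x + \varepsilon\eta$ at time $t_i$, and which is generated by a constant admissible control
$$u(t) = u^b \in \Omega, \qquad t \in (-\infty, \infty)$$
In other words, $\Gamma$ is the integral curve of
$$\dot{x} = f(x, u^b)$$
passing through $x + \varepsilon\eta$. Let $x(t)$, $t \in (-\infty, \infty)$, be a point of $\Gamma$. We have
$$x(t_i) = x + \varepsilon\eta, \qquad x(t_i + \Delta t) = x(t_i) + f(x + \varepsilon\eta, u^b)\,\Delta t + o(\Delta t), \quad \Delta t > 0 \qquad (7.52)$$
It follows from (7.51) with Theorem 1 and Corollary 1 that
$$x(t_i + \Delta t) \in A/\Sigma$$
for sufficiently small $\Delta t$. Thus, (7.52) leads to
$$x + \varepsilon\eta + f(x + \varepsilon\eta, u^b)\,\Delta t + o(\Delta t) = x + \varepsilon\eta + f(x, u^b)\,\Delta t + \frac{\partial f}{\partial x}\,\varepsilon\eta\,\Delta t + o(\varepsilon)\,\Delta t + o(\Delta t) \in A/\Sigma \qquad (7.53)†$$

† Here $\partial f/\partial x$ denotes the $(n + 1) \times (n + 1)$ matrix $[\partial f_i/\partial x_j]$, $i, j = 0, 1, \ldots, n$, evaluated at the point $(x, u^b)$ in $E^{n+1} \times E^m$.

Now we shall choose $\varepsilon$, $0 < \varepsilon < \sigma$, and $\Delta t$, $\Delta t > 0$, sufficiently small, such that
$$\Delta t / \varepsilon = \alpha$$
where $\alpha$ is any bounded positive scalar constant. Clearly, this is possible for $\alpha$, $0 < \alpha < \infty$. Rewriting (7.53), we have
$$x + \varepsilon\left[\eta + \alpha f(x, u^b) + \alpha\,\frac{\partial f}{\partial x}\,\varepsilon\eta + \alpha\, o(\varepsilon) + \frac{o(\varepsilon)}{\varepsilon}\right] \in A/\Sigma \qquad (7.54)$$
where the quantity in brackets is a continuous vector function, $\eta(\varepsilon)$, of $\varepsilon$, such that
$$\eta(\varepsilon) \to \eta + \alpha f(x, u^b) \quad \text{as } \varepsilon \to 0$$
Condition (7.54) together with Lemma 6 results at once in
$$x + \eta + \alpha f(x, u^b) \in \bar{\mathcal{C}}_A(x) \qquad (7.55)$$
for all $u^b \in \Omega$ and all $\alpha > 0$.

Now if $\mathcal{C}_A(x)$ is closed, then $\mathcal{C}_A(x) = \bar{\mathcal{C}}_A(x)$, and Lemma 17 is established. On the other hand, if $\mathcal{C}_A(x)$ is not closed, there exist limit points of $\mathcal{C}_A(x)$ which do not belong to $\mathcal{C}_A(x)$; in that case, let us consider†
$$x + \eta \in \mathcal{S}(x)$$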
Let $N(x)$ denote a conic neighborhood, that is,
$$N(x) \triangleq \{x + k\xi : k > 0,\ x + \xi \in B(x + \eta)\}$$
where $B(x + \eta)$ is an open ball in $E^{n+1}$ with center at $x + \eta$ and radius $\rho$. As a consequence of Lemma 8, we have
$$N(x) \cap \mathcal{C}_A(x) \neq \emptyset, \qquad \forall \rho > 0 \qquad (7.56)$$
Since $\rho$ can be chosen arbitrarily small, it follows from (7.56) that there exists a sequence of vectors $\eta_1, \eta_2, \ldots, \eta_\nu$, such that
$$x + \eta_i \in \mathcal{C}_A(x), \quad i = 1, 2, \ldots, \nu, \qquad \eta_\nu \to \eta \quad \text{as } \nu \to \infty$$
which implies, as shown earlier, that
$$x + \eta_\nu + \alpha f(x, u^b) \in \bar{\mathcal{C}}_A(x), \qquad \forall u^b \in \Omega, \quad \forall \alpha > 0$$
But
$$x + \eta_\nu + \alpha f(x, u^b) \to x + \eta + \alpha f(x, u^b) \qquad (7.57)$$
as $\eta_\nu \to \eta$, that is, as $\nu \to \infty$. Thus, it is readily seen that
$$x + \eta + \alpha f(x, u^b) \in \bar{\mathcal{C}}_A(x)$$
for all $u^b \in \Omega$ and all $\alpha > 0$. For, suppose this conclusion is false and
$$x + \eta + \alpha f(x, u^b) \in \mathcal{C}_B(x)$$
Then it follows from (7.57) that there exists a positive number $\mu$ such that for $\nu > \mu$
$$x + \eta_\nu + \alpha f(x, u^b) \in \mathcal{C}_B(x)$$
But this is in contradiction to the result of the first portion of the proof, since
$$x + \eta_\nu \in \mathcal{C}_A(x)$$
Finally, we note that Lemma 17 is trivially valid for $\alpha = 0$.

† Note that $\bar{\mathcal{C}}_A(x) = \mathcal{C}_A(x) \cup \mathcal{S}(x)$ is the union of the set of all points of $\mathcal{C}_A(x)$ and the set of all limit points (accumulation points) of $\mathcal{C}_A(x)$, and $\mathcal{C}_A(x) \subseteq \bar{\mathcal{C}}_A(x)$. Thus, the limit points which do not belong to $\mathcal{C}_A(x)$ must belong to $\mathcal{S}(x)$.

By arguments analogous to those employed above, one can prove

Lemma 18. For every vector $\eta$ at an interior point $x$ of a limiting surface $\Sigma$, such that
$$x + \eta \in \bar{\mathcal{C}}_B(x)$$
and for every $\alpha$, $\alpha \ge 0$,
$$x + \eta - \alpha f(x, u) \in \bar{\mathcal{C}}_B(x), \qquad \forall u \in \Omega$$

It is readily seen now that Lemma 15 is a direct consequence of Lemmas 17 and 18, obtained by setting $\eta = 0$ and $\alpha = 1$. Lemmas 17 and 18 lead at once to
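Corollary 5. If $x$ is an interior point of a limiting surface $\Sigma$, and
$$\sum_{\nu=1}^{r} \alpha_\nu f(x, u^\nu)$$
is any linear combination of velocity vectors, where
$$\alpha_\nu \ge 0, \qquad u^\nu \in \Omega, \qquad \nu = 1, 2, \ldots, r$$
then
$$x + \sum_{\nu=1}^{r} \alpha_\nu f(x, u^\nu) \in \bar{\mathcal{C}}_A(x), \qquad x - \sum_{\nu=1}^{r} \alpha_\nu f(x, u^\nu) \in \bar{\mathcal{C}}_B(x)$$

To prove this corollary, we invoke an argument by recursion. In view of Lemma 17, we have
$$x + \alpha_1 f(x, u^1) \in \bar{\mathcal{C}}_A(x)$$
Furthermore, suppose
$$x + \eta \triangleq x + \sum_{\nu=1}^{s} \alpha_\nu f(x, u^\nu) \in \bar{\mathcal{C}}_A(x)$$
Then it follows from Lemma 17 that
$$x + \eta + \alpha_{s+1} f(x, u^{s+1}) = x + \sum_{\nu=1}^{s+1} \alpha_\nu f(x, u^\nu) \in \bar{\mathcal{C}}_A(x)$$
This establishes the first part of the corollary. The second part follows from analogous arguments invoking Lemma 18.

7.12 Separability of Local Cones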
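7.121 Separating Hyperplane

We shall now introduce some definitions. We shall say that an $n$-dimensional hyperplane $\mathcal{T}(x)$, containing an interior point $x$ of a limiting surface $\Sigma$, is an $n$-dimensional separating hyperplane of closed cone $\bar{\mathcal{C}}_A(x)$ (or $\bar{\mathcal{C}}_B(x)$), if every point
$$x + \eta \in \bar{\mathcal{C}}_A(x) \quad (\text{or } \bar{\mathcal{C}}_B(x))$$
lies in one of the closed half spaces determined by $\mathcal{T}(x)$. The corresponding closed half space will be denoted by $\bar{R}_A$ (or $\bar{R}_B$), and the corresponding open half space by $R_A$ (or $R_B$). Then, if there exists an $n$-dimensional separating hyperplane of $\bar{\mathcal{C}}_A(x)$ (or $\bar{\mathcal{C}}_B(x)$), we shall say that cone $\bar{\mathcal{C}}_A(x)$ (or $\bar{\mathcal{C}}_B(x)$) is separable. Indeed, if $\mathcal{T}(x)$ is an $n$-dimensional separating hyperplane of $\bar{\mathcal{C}}_A(x)$ or $\bar{\mathcal{C}}_B(x)$, respectively, then
$$\bar{\mathcal{C}}_A(x) \subseteq \bar{R}_A \quad (7.58) \qquad \Rightarrow \qquad \mathcal{C}_A(x) \subseteq \bar{R}_A \quad (7.59)$$
$$\bar{\mathcal{C}}_B(x) \subseteq \bar{R}_B \quad (7.60) \qquad \Rightarrow \qquad \mathcal{C}_B(x) \subseteq \bar{R}_B \quad (7.61)$$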
7.122 Cone of Normals

If $\bar{\mathcal{C}}_A(x)$ (or $\bar{\mathcal{C}}_B(x)$) is separable, let us consider a separating hyperplane $\mathcal{T}(x)$ and, at point $x$, a bound vector $n(x)$, $|n(x)| = 1$, which is normal to $\mathcal{T}(x)$. Furthermore

(i) if $\bar{\mathcal{C}}_A(x)$ is separable, we shall choose $n(x)$ such that
$$x + n(x) \in \operatorname{comp} \bar{R}_A \qquad (7.62)$$
(ii) if $\bar{\mathcal{C}}_B(x)$ is separable, we shall choose $n(x)$ such that
$$x + n(x) \in R_B \qquad (7.63)$$

Note that (i) and (ii) are equivalent if both $\bar{\mathcal{C}}_A(x)$ and $\bar{\mathcal{C}}_B(x)$ are separable. Finally, if $\bar{\mathcal{C}}_A(x)$ (or $\bar{\mathcal{C}}_B(x)$) is separable, we shall consider the set $\{n(x)\}$ of vectors $n(x)$ defined above, using (i) or (ii) as applicable, for all separating hyperplanes of $\bar{\mathcal{C}}_A(x)$ (or $\bar{\mathcal{C}}_B(x)$), and define the cone of normals at point $x$ by
$$\mathcal{C}_n(x) \triangleq \{x + k\,n(x) : k > 0,\ n(x) \in \{n(x)\}\} \qquad (7.64)$$

7.13 Regular and Nonregular Interior Points of a Limiting Surface
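We shall say that an interior point $x$ of $\mathcal{E}^*$ is a regular interior point of the limiting surface $\Sigma$ which passes through $x$, if both $\bar{\mathcal{C}}_A(x)$ and $\bar{\mathcal{C}}_B(x)$ are separable. In order to make this concept more precise, let us suppose that $x \in \Sigma$ is a point of this kind and that $\mathcal{T}_A(x)$ is a separating plane of $\bar{\mathcal{C}}_A(x)$, for instance. Accordingly, $\bar{R}_A$ is the closed half space determined by $\mathcal{T}_A(x)$ which contains $\bar{\mathcal{C}}_A(x)$. As pointed out in (7.58), we have
$$\bar{\mathcal{C}}_A(x) \subseteq \bar{R}_A$$
But
$$\mathcal{C}_B(x) = \operatorname{comp} \bar{\mathcal{C}}_A(x)$$
so that
$$\bar{\mathcal{C}}_A(x) \subseteq \bar{R}_A \Rightarrow \operatorname{comp} \bar{R}_A \subseteq \mathcal{C}_B(x) \Rightarrow \overline{\operatorname{comp} \bar{R}_A} \subseteq \bar{\mathcal{C}}_B(x) \qquad (7.65)$$

Now let $\mathcal{T}_B(x)$ denote an $n$-dimensional separating hyperplane of $\bar{\mathcal{C}}_B(x)$, and $\bar{R}_B$ the closed half space which is determined by $\mathcal{T}_B(x)$ and contains $\bar{\mathcal{C}}_B(x)$. Then
$$\bar{\mathcal{C}}_B(x) \subseteq \bar{R}_B \qquad (7.66)$$
It follows from (7.65) and (7.66) that
$$\overline{\operatorname{comp} \bar{R}_A} \subseteq \bar{\mathcal{C}}_B(x) \subseteq \bar{R}_B \qquad (7.67)$$
Since both planes $\mathcal{T}_A(x)$ and $\mathcal{T}_B(x)$, which bound closed half spaces $\overline{\operatorname{comp} \bar{R}_A}$ and $\bar{R}_B$, respectively, pass through point $x$, (7.67) implies that
$$\overline{\operatorname{comp} \bar{R}_A} = \bar{R}_B \qquad (7.68)$$
which, in turn, implies that
$$\mathcal{T}_A(x) = \mathcal{T}_B(x) \triangleq \mathcal{T}(x) \qquad (7.69)$$
Furthermore, (7.67) implies that
$$\bar{\mathcal{C}}_B(x) = \bar{R}_B \qquad (7.70)$$
However, since
$$\mathcal{C}_A(x) = \operatorname{comp} \bar{\mathcal{C}}_B(x)$$
it follows from (7.68) and (7.70) that
$$\mathcal{C}_A(x) = R_A$$
whence†
$$\bar{\mathcal{C}}_A(x) = \bar{R}_A \qquad (7.71)$$
Finally, from (7.70) and (7.71) we have
$$\mathcal{S}(x) = \mathcal{T}(x) \qquad (7.72)$$

These conclusions are important since they show the special features of a regular interior point of $\Sigma$; namely, $\bar{\mathcal{C}}_A(x)$ and $\bar{\mathcal{C}}_B(x)$ possess the same separating hyperplane
$$\mathcal{T}(x) = \mathcal{S}(x)$$
and, moreover, this separating hyperplane is unique, since $\mathcal{S}(x)$ is unique. If $\Sigma$ is a nice limiting surface, then
$$\mathcal{S}(x) = \mathcal{C}_\Sigma(x)$$
Thus, at a regular interior point $x$ of a nice limiting surface $\Sigma$, the tangent cone $\mathcal{C}_\Sigma(x)$ is an $n$-dimensional hyperplane, namely, the tangent plane $T_\Sigma(x)$ of $\Sigma$ at point $x$; that is,
$$\mathcal{C}_\Sigma(x) = T_\Sigma(x) \qquad (7.73)$$
This tangent plane is the common separating hyperplane of $\bar{\mathcal{C}}_A(x)$ and $\bar{\mathcal{C}}_B(x)$; it is unique.

† Indeed $\mathcal{C}_A(x) = R_A$ and $\mathcal{C}_A(x) \subseteq \bar{\mathcal{C}}_A(x)$. It follows that $\bar{R}_A \subseteq \bar{\mathcal{C}}_A(x)$, and since $\bar{\mathcal{C}}_A(x) \subseteq \bar{R}_A$ we have (7.71).

On the other hand, if not both $\bar{\mathcal{C}}_A(x)$ and $\bar{\mathcal{C}}_B(x)$ are separable, then point $x$ will be called a nonregular interior point of $\Sigma$. We distinguish among three possibilities:

(i) $\bar{\mathcal{C}}_A(x)$ is separable, and $\bar{\mathcal{C}}_B(x)$ is not.
(ii) $\bar{\mathcal{C}}_B(x)$ is separable, and $\bar{\mathcal{C}}_A(x)$ is not.
(iii) Neither $\bar{\mathcal{C}}_A(x)$ nor $\bar{\mathcal{C}}_B(x)$ is separable; in that case, we shall call $x$ an antiregular point of $\Sigma$.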
7.14 Some Properties of a Linear Transformation

7.141 Variational Equations; Lemma 19

Consider a vector
$$\eta = (\eta_0, \eta_1, \ldots, \eta_n)$$
at point $x^*(t)$ of optimal trajectory $\Gamma^*$ generated by control $u^*(t)$, $t_0 \le t \le t_1$, whose components $\eta_j = \eta_j(t)$ comprise the solution of variational equations
$$\dot{\eta}_j = \sum_{i=0}^{n} \left.\frac{\partial f_j(x, u)}{\partial x_i}\right|_{x = x^*(t),\ u = u^*(t)} \eta_i, \qquad j = 0, 1, \ldots, n \qquad (7.74)$$
for given initial condition $\eta(t') = \eta'$, $t_0 \le t' \le t_1$. The solution of Eqs. (7.74) defines a nonsingular linear transformation $A(t', t)$ such that
$$\eta(t) = A(t', t)\eta', \qquad t_0 \le t' \le t \le t_1 \qquad (7.75)$$
For $t = t''$, we write $\eta(t'') \triangleq \eta''$, so that
$$\eta'' = A(t', t'')\eta', \qquad t_0 \le t' \le t'' \le t_1 \qquad (7.76)$$

Since $A(t', t'')$ is a linear operator which transforms $\eta'$ at time $t'$ into $\eta''$ at time $t''$, it may be written in matrix form. This amounts to representing $A(t', t'')$ by a set of coefficients $a_{ij}(t', t'')$, and the linear transformation by
$$\eta_i'' = \sum_{j=0}^{n} a_{ij}(t', t'')\,\eta_j', \qquad i = 0, 1, \ldots, n$$
Henceforth, we shall not distinguish between linear operator $A(t', t'')$ and its matrix representation. Since transformation $A(t', t'')$ is nonsingular, an inverse transformation $A^{-1}(t', t'')$ is defined such that
$$\eta' = A^{-1}(t', t'')\eta'', \qquad t_0 \le t' \le t'' \le t_1 \qquad (7.77)$$
Indeed, (7.77) is equivalent to (7.76). It is readily seen that (7.77) renders the solution $\eta_j(t)$ at time $t'$ for initial conditions $\eta_j(t'') = \eta_j''$, $j = 0, 1, \ldots, n$.
In other words, (7.77) implies backward integration, whereas (7.76) implies forward integration. Accordingly, we shall put
$$A^{-1}(t', t'') = A(t'', t')$$
so that (7.77) reads
$$\eta' = A(t'', t')\eta'', \qquad t_0 \le t' \le t'' \le t_1$$

The remarks above allow us to extend the notation in (7.75) to cases in which $t_0 \le t \le t' \le t_1$. Thus, if $t_p, t_{p+1}, \ldots, t_{q-1}, t_q$ is a sequence of points, not necessarily ordered, belonging to $[t_0, t_1]$, then it follows from the multiplication rule for linear operators that
$$A(t_p, t_q) = A(t_{q-1}, t_q)\,A(t_{q-2}, t_{q-1}) \cdots A(t_{p+1}, t_{p+2})\,A(t_p, t_{p+1}) \qquad (7.78)$$
so that
$$\eta(t_{q-1}) = A(t_{q-2}, t_{q-1})\,\eta(t_{q-2}), \qquad \eta(t_q) = A(t_{q-1}, t_q)\,\eta(t_{q-1})$$
Moreover, we have the obvious relations
$$A(t_p, t_p) = 1, \qquad A(t_q, t_p)\,A(t_p, t_q) = 1$$

Finally, from the properties of linear transformation $A(t', t)$, $t_0 \le t' \le t_1$ and $t_0 \le t \le t_1$, we have

Lemma 19. The transform $\Pi(x^*(t))$ of a plane $\Pi(x^*(t'))$ containing point $x^*(t')$ of $\Gamma^*$, due to linear transformation $A(t', t)$, has the following properties:
(i) $\Pi(x^*(t))$ is defined for all $t \in [t_0, t_1]$;
(ii) $\Pi(x^*(t))$ is a plane of the same dimension as $\Pi(x^*(t'))$;
(iii) the orientation of $\Pi(x^*(t))$ is continuous on $[t_0, t_1]$.
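The multiplication rule (7.78) and the relation between forward and backward integration can be checked numerically. The Python sketch below builds the matrix of $A(t', t)$ by integrating a linear system column by column and compares a direct transformation with a chained one. The two-dimensional coefficient matrix `F`, the chosen instants, and the Runge-Kutta integrator are assumptions for illustration; in the text the coefficients would be the partial derivatives evaluated along $x^*(t)$, $u^*(t)$.

```python
import math
from typing import List

Matrix = List[List[float]]


def F(t: float) -> Matrix:
    """Hypothetical time-varying coefficient matrix of the variational equations."""
    return [[0.0, 1.0], [-(1.0 + 0.5 * math.sin(t)), 0.0]]


def matvec(M: Matrix, v: List[float]) -> List[float]:
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def matmul(P: Matrix, Q: Matrix) -> Matrix:
    cols = list(zip(*Q))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in P]


def propagate(eta: List[float], t_from: float, t_to: float, steps: int = 400) -> List[float]:
    """Integrate eta' = F(t) eta from t_from to t_to with RK4 (the step may be negative)."""
    h = (t_to - t_from) / steps
    t, y = t_from, list(eta)
    for _ in range(steps):
        k1 = matvec(F(t), y)
        k2 = matvec(F(t + h / 2), [a + h / 2 * b for a, b in zip(y, k1)])
        k3 = matvec(F(t + h / 2), [a + h / 2 * b for a, b in zip(y, k2)])
        k4 = matvec(F(t + h), [a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        t += h
    return y


def A(t_from: float, t_to: float) -> Matrix:
    """Matrix of A(t_from, t_to); its columns are the images of the basis vectors."""
    cols = [propagate(e, t_from, t_to) for e in ([1.0, 0.0], [0.0, 1.0])]
    return [list(row) for row in zip(*cols)]


t_p, t_mid, t_q = 0.0, 0.8, 0.3                   # not ordered, as allowed in (7.78)
direct = A(t_p, t_q)
chained = matmul(A(t_mid, t_q), A(t_p, t_mid))    # A(t_mid, t_q) A(t_p, t_mid)
round_trip = matmul(A(t_q, t_p), A(t_p, t_q))     # should be close to the identity
```

Up to discretization error, `chained` agrees with `direct`, and `round_trip` is the identity, which corresponds to putting $A^{-1}(t', t'') = A(t'', t')$.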
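7.142 Adjoint Equations

Consider now the equations adjoint to variational Eqs. (7.74), namely,
$$\dot{\lambda}_j = -\sum_{i=0}^{n} \left.\frac{\partial f_i(x, u)}{\partial x_j}\right|_{x = x^*(t),\ u = u^*(t)} \lambda_i, \qquad j = 0, 1, \ldots, n \qquad (7.79)$$
whose solution $\lambda(t) = (\lambda_0(t), \lambda_1(t), \ldots, \lambda_n(t))$, for given initial condition $\lambda(t') = \lambda'$, $t_0 \le t' \le t_1$, is unique and continuous on $[t_0, t_1]$. As a consequence of (7.74) and (7.79), we have
$$\lambda(t) \cdot \eta(t) = \text{const}, \qquad \forall t \in [t_0, t_1] \qquad (7.80)†$$

Let us now choose the following initial conditions for (7.74) and (7.79), respectively:

(i) $\eta' \neq 0$ and $x^*(t') + \eta' \in \Pi(x^*(t'))$;
(ii) $\lambda' \neq 0$ and normal to $\Pi(x^*(t'))$.

Then it follows from (7.80) that
$$\lambda(t) \cdot \eta(t) = 0, \qquad \forall t \in [t_0, t_1] \qquad (7.81)$$
Since (7.81) is valid for all $\eta'$ satisfying (i) above, it follows that
$$\lambda(t) \text{ is normal to } \Pi(x^*(t)) \qquad (7.82)$$
for all $t \in [t_0, t_1]$.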
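Relation (7.80) can be observed numerically by integrating the variational and adjoint systems side by side. The Python sketch below does this for a hypothetical two-dimensional coefficient matrix `F(t)` with entries standing for $\partial f_j/\partial x_i$ along the optimal trajectory; the matrix, the initial vectors, and the Runge-Kutta integrator are assumptions for illustration. The initial vectors are chosen orthogonal, as in conditions (i) and (ii), so the inner product should remain near zero.

```python
import math
from typing import List


def F(t: float) -> List[List[float]]:
    """Hypothetical matrix with entries F[j][i] = df_j/dx_i along the optimal trajectory."""
    return [[0.0, 1.0], [-(1.0 + 0.5 * math.sin(t)), -0.2]]


def rhs(t: float, z: List[float]) -> List[float]:
    """Stacked system: eta' = F eta as in (7.74), lambda' = -F^T lambda as in (7.79)."""
    M = F(t)
    eta, lam = z[:2], z[2:]
    d_eta = [sum(M[j][i] * eta[i] for i in range(2)) for j in range(2)]
    d_lam = [-sum(M[i][j] * lam[i] for i in range(2)) for j in range(2)]
    return d_eta + d_lam


def rk4_path(z0: List[float], t0: float, t1: float, steps: int) -> List[List[float]]:
    """Return all RK4 iterates of z' = rhs(t, z) on [t0, t1]."""
    h = (t1 - t0) / steps
    t, z, path = t0, list(z0), [list(z0)]
    for _ in range(steps):
        k1 = rhs(t, z)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(z, k1)])
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(z, k2)])
        k4 = rhs(t + h, [a + h * b for a, b in zip(z, k3)])
        z = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(z, k1, k2, k3, k4)]
        t += h
        path.append(z)
    return path


eta0, lam0 = [1.0, 2.0], [2.0, -1.0]          # chosen so that lam0 . eta0 = 0
path = rk4_path(eta0 + lam0, 0.0, 2.0, 2000)
dots = [z[0] * z[2] + z[1] * z[3] for z in path]   # lambda(t) . eta(t) at each step
```

The invariance follows because the derivative of $\lambda \cdot \eta$ is the sum of two terms that cancel by (7.74) and (7.79); the numerical inner products deviate from zero only by the integration error.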
7.15 Properties of Separable Local Cones

7.151 Lemma 20

Let us now prove

Lemma 20. Let $x^*(t')$ and $x^*(t'')$, $t_0 \le t' \le t'' \le t_1$, be two points of optimal trajectory $\Gamma^*$, corresponding to control $u^*(t)$ and solution $x^*(t)$, $t_0 \le t \le t_1$. Let $\eta'$ be a vector at $x^*(t')$, and $\eta''$ its transform at $x^*(t'')$, due to linear transformation $A(t', t'')$; namely,
$$\eta'' = A(t', t'')\eta'$$
If
$$x^*(t') + \eta' \in \bar{\mathcal{C}}_A(x^*(t'))$$
then
$$x^*(t'') + \eta'' \in \bar{\mathcal{C}}_A(x^*(t''))$$

† Dot denotes inner product; e.g., $a \cdot b \triangleq \sum_{j=0}^{n} a_j b_j$.

First of all, let us suppose that
$$x^*(t') + \eta' \in \mathcal{C}_A(x^*(t'))$$
According to definition (7.13) of $\mathcal{C}_A(x)$, there exists a positive number $\alpha$ such that for all $\varepsilon$, $0 < \varepsilon < \alpha$,
$$x^*(t') + \varepsilon\eta' \in A/\Sigma \qquad (7.83)$$

Next consider a trajectory which starts at point $x^*(t') + \varepsilon\eta'$, and which is generated by the same control as $\Gamma^*$, namely $u^*(t)$, $t_0 \le t \le t_1$, on $[t', t'']$. The solution of trajectory equation (7.44), corresponding to this trajectory, is
$$x(t) = x^*(t) + \varepsilon\eta(t) + o(t, \varepsilon) \qquad (7.84)$$
where $o(t, \varepsilon)/\varepsilon$ tends to zero uniformly as $\varepsilon \to 0$, $t' \le t \le t''$, and $\eta(t)$ is the solution of variational equations (7.74) with initial condition $\eta(t') = \eta'$. Thus, $\eta(t)$ is given by linear transformation $A(t', t)$. According to (7.83), together with Theorem 1 and Corollary 1, we have
$$x(t'') = x^*(t'') + \varepsilon\eta(t'') + o(t'', \varepsilon) \in A/\Sigma$$
for sufficiently small $\varepsilon$. Furthermore,
$$\eta(t'') + \frac{o(t'', \varepsilon)}{\varepsilon} \to \eta(t'') \triangleq \eta'' \quad \text{as } \varepsilon \to 0$$
Consequently, it follows from Lemma 6 that
$$x^*(t'') + \eta'' \in \bar{\mathcal{C}}_A(x^*(t'')) \qquad (7.85)$$

Thus, if $\mathcal{C}_A(x^*(t'))$ is closed,
$$\bar{\mathcal{C}}_A(x^*(t')) = \mathcal{C}_A(x^*(t'))$$
and the lemma is established. If $\mathcal{C}_A(x^*(t'))$ is not closed, we must take into account the limit points of $\mathcal{C}_A(x^*(t'))$ which do not belong to $\mathcal{C}_A(x^*(t'))$; that is, we must consider
$$x^*(t') + \eta' \in \mathcal{S}(x^*(t'))$$

Now our arguments are similar to those utilized in the proof of Lemma 17. In particular, we note that for every conic neighborhood
$$N(x^*(t')) \triangleq \{x^*(t') + k\xi : k > 0,\ x^*(t') + \xi \in B(x^*(t') + \eta')\}$$
where $B(x^*(t') + \eta')$ is an open ball in $E^{n+1}$ with center at point $x^*(t') + \eta'$, it follows from Lemma 8 that
$$N(x^*(t')) \cap \mathcal{C}_A(x^*(t')) \neq \emptyset$$
Hence, we may consider a sequence of vectors $\eta_1', \eta_2', \ldots, \eta_\nu'$, such that
$$x^*(t') + \eta_i' \in \mathcal{C}_A(x^*(t')), \quad i = 1, 2, \ldots, \nu, \qquad \eta_\nu' \to \eta' \quad \text{as } \nu \to \infty$$
It follows from the first part of the proof that
$$x^*(t'') + \eta_\nu'' \in \bar{\mathcal{C}}_A(x^*(t'')) \qquad (7.86)$$
where
$$\eta_\nu'' \triangleq \eta_\nu(t'') = A(t', t'')\eta_\nu'$$
Furthermore, since the transformation is linear,
$$\eta_\nu'' \to \eta'' \quad \text{as } \eta_\nu' \to \eta'$$
Suppose now that
$$x^*(t'') + \eta'' \notin \bar{\mathcal{C}}_A(x^*(t''))$$
namely,
$$x^*(t'') + \eta'' \in \mathcal{C}_B(x^*(t''))$$
Then there exists a positive number $\mu$ such that for $\nu > \mu$
$$x^*(t'') + \eta_\nu'' \in \mathcal{C}_B(x^*(t''))$$
However, this contradicts (7.86); thus, Lemma 20 is established.
7.152 Lemma 21

Next we shall prove

Lemma 21. Let $x^*(t')$ and $x^*(t'')$, $t_0 \le t' \le t'' \le t_1$, be two points of optimal trajectory $\Gamma^*$, corresponding to control $u^*(t)$ and solution $x^*(t)$, $t_0 \le t \le t_1$. Let $\eta''$ be a vector at $x^*(t'')$ such that

$x^*(t'') + \eta'' \in \bar{\mathscr{C}}_B(x^*(t''))$

Then $\eta''$ is the transform, due to linear transformation $A(t', t'')$, of a vector $\eta'$ at $x^*(t')$ such that

$x^*(t') + \eta' \in \bar{\mathscr{C}}_B(x^*(t'))$

Since the proof of this lemma is similar to that of Lemma 20, we shall only present its salient features. We shall suppose first that

$x^*(t'') + \eta'' \in \mathscr{C}_B(x^*(t''))$
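Then, according to definition (7.14) of $\mathscr{C}_B(x)$, there exists a positive number $\beta$ such that for all $\varepsilon$, $0 < \varepsilon < \beta$,

$x^*(t'') + \varepsilon\eta'' \in B/\Sigma$    (7.87)

We shall consider a trajectory which passes through point $x^*(t'') + \varepsilon\eta''$, and which is generated by $u^*(t)$, $t_0 \le t \le t_1$, on $[t', t'']$. As before, the corresponding solution of trajectory equation (7.44) is given by (7.84), but now with $\eta(t'') = \eta''$. As a consequence of (7.87), together with Theorem 1 and Corollary 1, we have

$x(t') \notin \overline{A/\Sigma}$

since $x(t') \in \overline{A/\Sigma}$ implies

$x(t'') = x^*(t'') + \varepsilon\eta'' \in \overline{A/\Sigma}$

which contradicts (7.87). Thus,

$x(t') = x^*(t') + \varepsilon\eta(t') + o(t', \varepsilon) \in B/\Sigma$

for sufficiently small $\varepsilon$. Furthermore,

$\eta(t') + \dfrac{o(t', \varepsilon)}{\varepsilon} \to \eta(t') \triangleq \eta'$ as $\varepsilon \to 0$

so that Lemma 7 leads to

$x^*(t') + \eta' \in \bar{\mathscr{C}}_B(x^*(t'))$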
In the second part of the proof, we consider

$x^*(t'') + \eta'' \in \mathscr{S}(x^*(t''))$

Again, we shall employ a sequence of vectors $\eta_1'', \eta_2'', \ldots, \eta_\nu''$ at $x^*(t'')$, such that

$x^*(t'') + \eta_i'' \in \mathscr{C}_B(x^*(t''))$, $i = 1, 2, \ldots, \nu$; $\eta_\nu'' \to \eta''$ as $\nu \to \infty$

The existence of such a sequence is again ensured by Lemma 8, according to which

$N(x^*(t'')) \cap \mathscr{C}_B(x^*(t'')) \ne \emptyset$

for every conic neighborhood

$N(x^*(t'')) \triangleq \{x^*(t'') + k\xi : k > 0,\ x^*(t'') + \xi \in B(x^*(t'') + \eta'')\}$

From the first part of the proof, we have

$x^*(t') + \eta_\nu' \in \bar{\mathscr{C}}_B(x^*(t'))$

where

$\eta_\nu' = A(t'', t')\eta_\nu''$

But

$\eta_\nu' \to \eta'$ as $\eta_\nu'' \to \eta''$

and we can show that

$x^*(t') + \eta' \in \bar{\mathscr{C}}_B(x^*(t'))$

employing an argument analogous to the one used in Sec. 7.151. Hence, Lemma 21 is established.
7.153 Corollaries 6 and 7

The following corollaries are straightforward consequences of Lemmas 20 and 21.

Corollary 6. If a vector $\eta'$ at point $x^*(t')$ of optimal trajectory $\Gamma^*$ satisfies

$x^*(t') + \eta' \in \mathscr{C}_A(x^*(t'))$

then its transform $\eta''$, due to linear transformation $A(t', t'')$, at $x^*(t'')$, $t'' \ge t'$, satisfies

$x^*(t'') + \eta'' \in \mathscr{C}_A(x^*(t''))$

and

Corollary 7. If a vector $\eta''$ at point $x^*(t'')$ of optimal trajectory $\Gamma^*$ satisfies

$x^*(t'') + \eta'' \in \mathscr{C}_B(x^*(t''))$

then it is the transform, due to linear transformation $A(t', t'')$, of a vector $\eta'$ at $x^*(t')$, $t' \le t''$, which satisfies

$x^*(t') + \eta' \in \mathscr{C}_B(x^*(t'))$

For instance, Corollary 6 follows at once from Lemma 21. For suppose our assertion is false, that is,

$x^*(t'') + \eta'' \in \bar{\mathscr{C}}_B(x^*(t''))$
Then, according to Lemma 21, we have

$x^*(t') + \eta' \in \bar{\mathscr{C}}_B(x^*(t'))$

which contradicts the hypothesis of the corollary. Corollary 7 can be proved in similar fashion, invoking Lemma 20.
7.154 Theorems 2 and 3

Let $x^*(t')$ and $x^*(t'')$, $t_0 \le t' \le t'' \le t_1$, be two points of optimal trajectory $\Gamma^*$, corresponding to control $u^*(t)$ and solution $x^*(t)$, $t_0 \le t \le t_1$. We shall prove

Theorem 2. If $\bar{\mathscr{C}}_B(x^*(t'))$ is separable, then $\bar{\mathscr{C}}_B(x^*(t''))$ is separable.

Theorem 3. If $\bar{\mathscr{C}}_A(x^*(t''))$ is separable, then $\bar{\mathscr{C}}_A(x^*(t'))$ is separable.

To prove Theorem 2, let us consider an n-dimensional separating hyperplane $\mathscr{T}(x^*(t'))$ of $\bar{\mathscr{C}}_B(x^*(t'))$. This plane passes through point $x^*(t')$, and determines closed half space $\bar{R}_B'$, $\bar{\mathscr{C}}_B(x^*(t')) \subseteq \bar{R}_B'$, and open half space comp $\bar{R}_B'$. Thus,

$x^*(t') + \eta' \in \bar{\mathscr{C}}_B(x^*(t')) \Rightarrow x^*(t') + \eta' \in \bar{R}_B'$    (7.88)

Now let $\Pi(x^*(t''))$ and $\bar{P}_B''$ denote the transforms, by linear transformation $A(t', t'')$, of $\mathscr{T}(x^*(t'))$ and $\bar{R}_B'$, respectively.† Since transformation $A(t', t'')$ is linear and nonsingular, one can show readily that
(i) $\Pi(x^*(t''))$ is the common boundary of $\bar{P}_B''$ and comp $\bar{P}_B''$, namely, $\Pi(x^*(t'')) = \bar{P}_B'' \cap \overline{\mathrm{comp}\,\bar{P}_B''}$;
(ii) the transform of comp $\bar{R}_B'$ is comp $\bar{P}_B''$.

Now consider any vector $\eta''$ at $x^*(t'')$, such that

$x^*(t'') + \eta'' \in \bar{\mathscr{C}}_B(x^*(t''))$    (7.89)

and suppose that

$x^*(t'') + \eta'' \in \mathrm{comp}\,\bar{P}_B''$

Then, according to (ii) above, it is the transform of a vector $\eta'$ at $x^*(t')$, such that

$x^*(t') + \eta' \in \mathrm{comp}\,\bar{R}_B'$    (7.90)

Moreover, in view of (7.89) and Lemma 21, we have

$x^*(t') + \eta' \in \bar{\mathscr{C}}_B(x^*(t'))$    (7.91)

† Note that $R_B'$ is open, by definition.
However, (7.91) together with (7.90) is incompatible with (7.88). Hence, we conclude that

$x^*(t'') + \eta'' \in \bar{P}_B''$

for every $\eta''$ which satisfies (7.89). Consequently, we conclude that
(i) $\bar{\mathscr{C}}_B(x^*(t''))$ is separable;
(ii) $\Pi(x^*(t''))$ is a separating hyperplane $\mathscr{T}(x^*(t''))$ of $\bar{\mathscr{C}}_B(x^*(t''))$; and
(iii) $\bar{\mathscr{C}}_B(x^*(t''))$ belongs to $\bar{P}_B''$, which is the transform of $\bar{R}_B'$. Hence, we shall denote $\bar{P}_B''$ by $\bar{R}_B''$.
Thus, Theorem 2 is established.

Theorem 3 can be proved in a similar fashion, employing Lemma 20. In particular, if $\mathscr{T}(x^*(t''))$ is a separating hyperplane of $\bar{\mathscr{C}}_A(x^*(t''))$, it determines a closed half space $\bar{R}_A''$, $\bar{\mathscr{C}}_A(x^*(t'')) \subseteq \bar{R}_A''$, such that
(iv) $\mathscr{T}(x^*(t''))$ is the transform, due to linear transformation $A(t', t'')$, of a separating hyperplane $\mathscr{T}(x^*(t'))$ of $\bar{\mathscr{C}}_A(x^*(t'))$;
(v) $\bar{\mathscr{C}}_A(x^*(t'))$ belongs to $\bar{P}_A'$, whose transform is $\bar{R}_A''$. Thus, we shall denote $\bar{P}_A'$ by $\bar{R}_A'$.

7.155 Corollaries 8 and 9

From Theorems 2 and 3 there follow at once

Corollary 8. An optimal trajectory $\Gamma^*$ cannot join points $x' \triangleq x^*(t')$ and $x'' \triangleq x^*(t'')$, $t'' > t'$, if $\bar{\mathscr{C}}_B(x')$ is separable and $\bar{\mathscr{C}}_B(x'')$ is not separable.

and

Corollary 9. An optimal trajectory $\Gamma^*$ cannot join points $x' \triangleq x^*(t')$ and $x'' \triangleq x^*(t'')$, $t'' > t'$, if $\bar{\mathscr{C}}_A(x')$ is not separable and $\bar{\mathscr{C}}_A(x'')$ is separable.
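The step in the proofs of Theorems 2 and 3 that a nonsingular linear transformation carries a separating hyperplane, together with its closed half space, into a separating hyperplane and half space can be checked numerically. The following is a minimal sketch under stated assumptions: the 2 x 2 matrix `A` merely stands in for $A(t', t'')$, and the normal and sample cone vectors are invented for the illustration. It uses the elementary identity $m \cdot (A\eta) = n \cdot \eta$ with $m = (A^{-1})^T n$.

```python
# Sketch: transport of a separating hyperplane by a nonsingular linear map.
# All numbers below are illustrative assumptions, not data from the text.

def mat_vec(m, v):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("the transformation must be nonsingular")
    return [[d / det, -b / det], [-c / det, a / det]]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

A = [[2.0, 1.0], [1.0, 1.0]]        # stands in for A(t', t''); det = 1
normal_1 = [0.0, 1.0]               # normal of the hyperplane at x'
# Sample vectors eta' of a cone lying in the closed half space normal_1 . eta' <= 0
cone_1 = [[1.0, -1.0], [-2.0, -0.5], [3.0, 0.0]]

# The hyperplane {eta : n . eta = 0} is mapped onto {xi : m . xi = 0},
# where m = (A^{-1})^T n, because m . (A eta) = n . eta.
normal_2 = mat_vec(transpose(inverse_2x2(A)), normal_1)
cone_2 = [mat_vec(A, eta) for eta in cone_1]

sides_1 = [dot(normal_1, eta) for eta in cone_1]
sides_2 = [dot(normal_2, xi) for xi in cone_2]
print("transformed normal:", normal_2)
print("signed distances before:", sides_1)
print("signed distances after: ", sides_2)
```

Every transformed vector stays on the same side of the transformed hyperplane, which is the mechanism behind conclusions (i) to (iii) of Theorem 2; the example does not, of course, replace the use of Lemmas 20 and 21 in the proofs.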
7.16 Attractive and Repulsive Subsets of a Limiting Surface
We shall now introduce two subsets of a limiting surface $\Sigma$: An attractive subset $M_{att}$ of $\Sigma$ is a set of nonregular interior points of $\Sigma$, at each of which $\bar{\mathscr{C}}_B(x)$ is separable but $\bar{\mathscr{C}}_A(x)$ is not separable. A repulsive subset $M_{rep}$ of $\Sigma$ is a set of nonregular interior points of $\Sigma$, at each of which $\bar{\mathscr{C}}_A(x)$ is separable but $\bar{\mathscr{C}}_B(x)$ is not separable.

7.161 Corollaries 10 and 11

The adjectives attractive and repulsive have been adopted in view of the following corollaries, which are a direct consequence of Corollaries 8 and 9.
Corollary 10. If $x^*(t')$ and $x^*(t'')$, $t'' > t'$, are points of optimal trajectory $\Gamma^*$ on limiting surface $\Sigma$, and if $x^*(t')$ belongs to $M_{att}$, then $x^*(t'')$ cannot be a regular interior point of $\Sigma$ nor can $x^*(t'')$ belong to $M_{rep}$.

However, apparently an optimal trajectory which emanates from a regular interior point of $\Sigma$, or from a point on a repulsive subset $M_{rep}$, can reach a point on an attractive subset $M_{att}$.

Corollary 11. If $x^*(t')$ and $x^*(t'')$, $t'' > t'$, are points of optimal trajectory $\Gamma^*$ on limiting surface $\Sigma$, and if $x^*(t')$ is a regular interior point of $\Sigma$, or $x^*(t')$ belongs to $M_{att}$, then $x^*(t'')$ cannot belong to $M_{rep}$.

However, apparently an optimal trajectory can leave a repulsive subset $M_{rep}$.
7.17 Regular Subset of a Limiting Surface

Yet another subset of a limiting surface $\Sigma$ is a regular subset $M_{reg}$, which is a set of regular interior points of $\Sigma$, that is, a set of points at each of which both $\bar{\mathscr{C}}_A(x)$ and $\bar{\mathscr{C}}_B(x)$ are separable.

7.171 A Local Property of a Regular Subset; Corollary 12

One can readily prove

Corollary 12. If $x^*(t')$ and $x^*(t'')$, $t'' > t'$, are points of an optimal trajectory $\Gamma^*$ for which every point $x^*(t)$, $t \in [t', t'']$, is an interior point of limiting surface $\Sigma$, and if $x^*(t')$ and $x^*(t'')$ belong to $M_{reg}$, then $x^*(t)$ belongs to $M_{reg}$ for all $t \in [t', t'']$.

Indeed, since $\bar{\mathscr{C}}_A(x^*(t''))$ is separable, it follows from Theorem 3 that $\bar{\mathscr{C}}_A(x^*(t))$, $t \in [t', t'']$, is separable; and since $\bar{\mathscr{C}}_B(x^*(t'))$ is separable, it follows from Theorem 2 that $\bar{\mathscr{C}}_B(x^*(t))$, $t \in [t', t'']$, is separable. Thus, $x^*(t)$ belongs to $M_{reg}$ for all $t \in [t', t'']$.
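The bookkeeping behind Corollaries 10 to 12 reduces to two rules: by Theorem 2, separability of the B cone propagates forward in time, and by Theorem 3, separability of the A cone propagates backward. The sketch below encodes only these two necessary conditions; the function names and the label used for the case in which neither cone is separable (treated in Sec. 7.18) are our own choices for the illustration.

```python
# Sketch: which kinds of interior points an optimal trajectory may visit in
# succession, judged only by the necessary conditions of Theorems 2 and 3.

def classify(sep_a: bool, sep_b: bool) -> str:
    """Name the kind of interior point from the separability of the two cones."""
    if sep_a and sep_b:
        return "regular"
    if sep_b:
        return "attractive"     # B cone separable, A cone not
    if sep_a:
        return "repulsive"      # A cone separable, B cone not
    return "antiregular"        # neither cone separable

FLAGS = {classify(a, b): (a, b) for a in (True, False) for b in (True, False)}

def transition_allowed(earlier: str, later: str) -> bool:
    """Necessary condition for x*(t') of kind `earlier` and x*(t'') of kind `later`."""
    a1, b1 = FLAGS[earlier]
    a2, b2 = FLAGS[later]
    theorem_2 = (not b1) or b2   # B separable at t' implies B separable at t''
    theorem_3 = (not a2) or a1   # A separable at t'' implies A separable at t'
    return theorem_2 and theorem_3

for kind in ("regular", "attractive", "repulsive", "antiregular"):
    reachable = [k for k in FLAGS if transition_allowed(kind, k)]
    print(f"{kind:12s} -> {sorted(reachable)}")
```

The table printed by the loop lists transitions that are not excluded; whether each of them actually occurs for a given problem is a separate question.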
7.172 A Maximum Principle; Theorem 4

Now we shall restrict our attention to a regular optimal trajectory $\Gamma^*$, namely, one for which $x^*(t)$ belongs to $M_{reg}$ for all $t \in [t_0, t_1]$. We shall derive Pontryagin's maximum principle for this case.
Let $x = x^*(t)$, $t \in [t_0, t_1]$, be a point of regular optimal trajectory $\Gamma^*$, and consider cones $\mathscr{C}_A(x)$ and $\mathscr{C}_B(x)$ at that point. From the analysis of Sec. 7.13, we know that
(i) $\bar{\mathscr{C}}_A(x)$ and $\bar{\mathscr{C}}_B(x)$ possess a common separating hyperplane $\mathscr{T}(x)$, and that this separating hyperplane is unique;
(ii) $\mathscr{T}(x) = \mathscr{S}(x)$;
(iii) $\mathscr{C}_A(x) = R_A(x)$ and $\mathscr{C}_B(x) = R_B(x)$, where $R_A(x)$ and $R_B(x)$ are the two open half spaces determined by $\mathscr{T}(x)$.

For instance, consider the separating hyperplane $\mathscr{T}(x^0)$ at the initial point $x^0 = x^*(t_0)$. Since $\mathscr{T}(x^0)$ is the separating hyperplane of $\bar{\mathscr{C}}_B(x^0)$, it follows from the proof of Theorem 2 (see Sec. 7.154) that its transform at any time $t$, $t_0 \le t \le t_1$, due to linear transformation $A(t_0, t)$, is the separating hyperplane $\mathscr{T}(x)$ of $\bar{\mathscr{C}}_B(x) = \bar{R}_B(x)$, $x = x^*(t)$. In view of the uniqueness of the separating hyperplane, the cone $\mathscr{C}_n(x)$ of normals at point $x$, defined in Sec. 7.122, contains a single vector $n(x)$, $|n(x)| = 1$, which is normal to $\mathscr{T}(x)$, and such that

$x + n(x) \in R_B(x)$    (7.92)

On the other hand, it follows from Lemma 15 that

$x + f(x, u) \in \bar{\mathscr{C}}_A(x) = \bar{R}_A(x)$, $\forall u \in \Omega$    (7.93)

and from Lemma 16 that

$x + f_+^* \in \mathscr{S}(x)$, $x - f_-^* \in \mathscr{S}(x)$, $\mathscr{S}(x) = \mathscr{T}(x)$    (7.94)

where

$f_+^* \triangleq f(x, u_+^*)$, $f_-^* \triangleq f(x, u_-^*)$, $u_+^* \triangleq u^*(t + 0)$, $u_-^* \triangleq u^*(t - 0)$

Now let us consider the solution $\lambda(t)$, $t_0 \le t \le t_1$, of adjoint Eqs. (7.79) with initial condition $\lambda(t_0) = \lambda^0$ such that $\lambda^0 \ne 0$, $\lambda^0$ normal to $\mathscr{T}(x^0)$ and directed into $R_B(x^0)$; that is,

$\lambda^0 = \Lambda^0 n(x^0)$, $\Lambda^0 > 0$    (7.95)

Then $\lambda(t)$ is a nonzero vector which is normal to $\mathscr{T}(x)$, according to (7.82); namely,

$\lambda(t) = \Lambda(t)\,n(x^*(t))$, $\forall t \in [t_0, t_1]$    (7.96)

Since $\lambda(t)$ and $n(x^*(t))$, $t_0 \le t \le t_1$, are nonzero continuous vector functions of $t$, it follows that $\Lambda(t)$ is a nonzero continuous scalar function of $t$. Since
$\Lambda(t_0) = \Lambda^0 > 0$, we conclude that $\Lambda(t) > 0$ for all $t \in [t_0, t_1]$. Consequently, (7.92) and (7.96) lead to

$x + \lambda(t) = x + \Lambda(t)\,n(x) \in R_B(x)$, $x = x^*(t)$, $\Lambda(t) = |\lambda(t)|$    (7.97)

Conditions (7.93), (7.94), and (7.97) embody Pontryagin's maximum principle for the case of regular optimal trajectories. Thus, letting

$\mathscr{H}(\lambda, x, u) \triangleq \lambda \cdot f(x, u)$    (7.98)

we may state

Theorem 4. If $u^*(t)$, $t_0 \le t \le t_1$, is an optimal control, and $x^*(t)$ is the corresponding solution of trajectory equation (7.44), then there exists a nonzero continuous vector function $\lambda(t)$ which is a solution of adjoint equations (7.79), such that
(i) $\sup_{u \in \Omega} \mathscr{H}(\lambda(t), x^*(t), u) = \mathscr{H}(\lambda(t), x^*(t), u^*(t))$;
(ii) $\mathscr{H}(\lambda(t), x^*(t), u^*(t)) = 0$;
(iii) $\lambda_0(t) = \mathrm{const} \le 0$;
for all $t \in [t_0, t_1]$.
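The three conditions of Theorem 4 can be checked by hand on a very small example. The sketch below is an illustration under our own assumptions, not a construction from the text: a minimum-time problem with augmented state $(x_0, x_1)$, $\dot{x}_0 = 1$, $\dot{x}_1 = u$, $|u| \le 1$, target $x_1 = 0$, and the candidate adjoint vector $\lambda = (-1, -1)$ on the arc $x_1 > 0$. The supremum over $\Omega$ is approximated on a grid that contains the end points.

```python
# Toy minimum-time problem (an illustrative assumption, not taken from the text):
#   dx0/dt = 1 (elapsed time), dx1/dt = u, |u| <= 1, target x1 = 0.

def f(x, u):
    """Augmented velocity vector (f0, f1); it does not depend on x here."""
    return (1.0, u)

def hamiltonian(lam, x, u):
    """H(lambda, x, u) = lambda . f(x, u), as in (7.98)."""
    return sum(l_j * f_j for l_j, f_j in zip(lam, f(x, u)))

def optimal_control(x1):
    """Drive x1 toward the origin at full speed."""
    return -1.0 if x1 > 0 else 1.0

# Candidate adjoint vector along the arc x1 > 0. It is constant because
# dH/dx = 0 for this f; the value (-1, -1) is assumed for the illustration.
lam = (-1.0, -1.0)
control_grid = [k / 10.0 - 1.0 for k in range(21)]   # samples of Omega = [-1, 1]

report = []
for x1 in (2.0, 1.0, 0.5):
    x = (2.0 - x1, x1)                 # state reached from x1 = 2 after time 2 - x1
    h_sup = max(hamiltonian(lam, x, u) for u in control_grid)
    h_star = hamiltonian(lam, x, optimal_control(x1))
    report.append((x1, h_sup, h_star))
    print(f"x1={x1}: sup H = {h_sup:.3f}, H(u*) = {h_star:.3f}, lambda_0 = {lam[0]}")
```

In this example the supremum is attained at $u^* = -1$, the maximized Hamiltonian vanishes, and $\lambda_0 = -1$ is a nonpositive constant, in agreement with (i) to (iii).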
Indeed, it follows from (7.93) and (7.97) that

$\lambda(t) \cdot f(x^*(t), u) \le 0$, $\forall u \in \Omega$

whence we have condition (i) of the theorem. For condition (ii) we invoke (7.94) with (7.96). Here we distinguish between the two cases discussed in Sec. 7.112. Thus, if $u^*(t)$ is discontinuous at $t = t_c$, and $t_c$ belongs to a nonzero time interval on which $x^*(t) \ne$ const, we have

$\lambda(t_c - 0) \cdot f_-^* = \lambda(t_c + 0) \cdot f_+^* = 0$

On the other hand, if $x^*(t) =$ const on a nonzero time interval $[t', t'']$, then

$\lambda(t' - 0) \cdot f_-^* = \lambda(t'' + 0) \cdot f_+^* = 0$

Furthermore,

$\lambda(t) \cdot f(x^*(t), u^*(t)) = 0$, $t' \le t \le t''$

since the orientation of $\lambda(t)$ remains fixed during that time interval. Of course, this latter relation is also valid because

$f(x^*(t), u^*(t)) = 0$, $t' \le t \le t''$

Finally, condition (ii) of the theorem is clearly satisfied at all other points of $[t_0, t_1]$.
Since functions $f_j(x, u)$, $j = 0, 1, \ldots, n$, do not depend explicitly on $x_0$, we have

$\dfrac{d\lambda_0(t)}{dt} = 0$

so that

$\lambda_0(t) = \mathrm{const}$, $\forall t \in [t_0, t_1]$

Furthermore, it follows from (7.96) that

$\lambda_0(t) = |\lambda(t)|\,n_0(x^*(t))$    (7.99)

where $n_0(x)$ is the zeroth component of $n(x)$. We note also (see Sec. 7.71) that

$L_-(x^0) \subset \bar{\mathscr{C}}_B(x^0) = \bar{R}_B(x^0)$

so that, in view of (7.92),

$n_0(x^0) \le 0$

Thus, condition (iii) of the theorem is established.

7.173 Relation between Gradient and Adjoint Vectors
If $\Sigma$ is a nice limiting surface, then separating hyperplane $\mathscr{T}(x)$ is the tangent plane of $\Sigma$ at point $x$; namely, as shown in Sec. 7.13, we have

$\mathscr{T}(x) = \mathscr{S}(x) = \mathscr{C}_\Sigma(x) = T_\Sigma(x)$

Let us recall the defining Eq. (7.6) of $\Sigma$, that is,

$\Phi(x) \triangleq x_0 + V^*(x; x^1) = C$

If grad $\Phi(x)$ is defined at point $x$, the following conditions apply:
(i) It is normal to tangent plane $T_\Sigma(x)$.
(ii) It is directed into region $A/\Sigma$; that is, there exists a $\sigma > 0$ such that for all $\varepsilon$, $0 < \varepsilon < \sigma$,

$x + \varepsilon\,\mathrm{grad}\,\Phi(x) \in A/\Sigma$

Indeed, this means that grad $\Phi(x)$ is directed into $R_A(x)$.
(iii) The zeroth component of grad $\Phi(x)$ is

$(\mathrm{grad}\,\Phi(x))_0 = \dfrac{\partial\Phi(x)}{\partial x_0} = 1$
From the definition of normal $n(x)$, together with (i) and (ii) above, it follows that

$\mathrm{grad}\,\Phi(x) = -|\mathrm{grad}\,\Phi(x)|\,n(x)$    (7.100)

and $|\mathrm{grad}\,\Phi(x)|$ is defined, since grad $\Phi(x)$ is defined. In view of (iii) above, grad $\Phi(x)$ is not normal to the $x_0$-axis, and hence neither is $n(x)$. Thus, $n_0(x) \ne 0$ and so, according to (7.99), $\lambda_0(t) \ne 0$ for $x = x^*(t)$. From (iii) above, together with (7.100), we have

$(\mathrm{grad}\,\Phi(x))_0 = 1 = -|\mathrm{grad}\,\Phi(x)|\,n_0(x)$    (7.101)

Hence, for $x = x^*(t)$, it follows from (7.99) and (7.101) that

$\lambda_0(t) = -\dfrac{|\lambda(t)|}{|\mathrm{grad}\,\Phi(x^*(t))|}\,(\mathrm{grad}\,\Phi(x^*(t)))_0 = -\dfrac{|\lambda(t)|}{|\mathrm{grad}\,\Phi(x^*(t))|}$

so that

$\lambda(t) = \lambda_0(t)\,\mathrm{grad}\,\Phi(x^*(t))$, $\lambda_0(t) = \mathrm{const} < 0$

Since $|\lambda^0|$ can be chosen arbitrarily, we may write

$\lambda(t) = -\mathrm{grad}\,\Phi(x^*(t))$    (7.102)
Recall that (7.102) is valid provided $\Sigma$ is nice and grad $\Phi(x)$ is defined at $x = x^*(t)$.

7.174 Relation to Dynamic Programming

Upon use of (7.102) in conditions (i) and (ii) of Theorem 4, we obtain

$\sup_{u \in \Omega} [\lambda(t) \cdot f(x^*(t), u)] = \sup_{u \in \Omega} [-\mathrm{grad}\,\Phi(x^*(t)) \cdot f(x^*(t), u)] = 0$

Alternately, we have

$\min_{u \in \Omega} [\mathrm{grad}\,\Phi(x^*(t)) \cdot f(x^*(t), u)] = 0$    (7.103)

which is the functional equation of dynamic programming. Equation (7.103) may be obtained in a more direct way, invoking the global property of limiting surfaces. Consider a trajectory $\Gamma$, corresponding to solution $x(t)$, $t_0 \le t \le t_1$, of trajectory equation (7.44). Function $\Phi(x)$ is defined for all $x$ in $\mathscr{E}^*$. Moreover, for $x \in \Gamma$, $\Phi(x)$ becomes a function of $t$; namely,

$\Phi(x(t)) = x_0(t) + V^*(x(t); x^1) = C(t)$

In other words, to each value of time $t$ there corresponds a point $x = x(t)$ on $\Gamma$, and hence a $\Sigma$ surface through that point.
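The correspondence between time $t$ and the surface parameter $C(t) = \Phi(x(t))$ can be tabulated for a small example. The sketch below again assumes the toy minimum-time problem $\dot{x}_0 = 1$, $\dot{x}_1 = u$, $|u| \le 1$ with target $x_1 = 0$, for which $V^*(x) = |x_1|$; the three control laws and the step size are our own illustrative choices.

```python
# Sketch: the surface parameter C(t) = Phi(x(t)) along three trajectories of
# the toy problem dx0/dt = 1, dx1/dt = u, |u| <= 1, target x1 = 0.

def phi(x):
    """Phi(x) = x0 + V*(x), with V*(x) = |x1| for this toy problem."""
    return x[0] + abs(x[1])

def surface_parameters(control, x1_start=2.0, dt=0.125, steps=8):
    """Return the values C(t) along an Euler-integrated trajectory.

    The velocity is piecewise constant here, so the Euler steps are exact.
    """
    x = [0.0, x1_start]
    values = [phi(x)]
    for _ in range(steps):
        u = control(x[1])
        x = [x[0] + dt, x[1] + dt * u]
        values.append(phi(x))
    return values

c_optimal = surface_parameters(lambda x1: -1.0 if x1 > 0 else 1.0)  # full speed to target
c_slow = surface_parameters(lambda x1: -0.5 if x1 > 0 else 0.5)     # half speed to target
c_wrong = surface_parameters(lambda x1: 1.0)                        # away from target

print("optimal:", c_optimal)
print("slow:   ", c_slow)
print("wrong:  ", c_wrong)
```

In this example $C(t)$ stays constant along the optimal trajectory and increases along the two nonoptimal ones, which is the behavior the remainder of this section establishes in general.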
From Theorem 1, together with the fact that the members of $\{\Sigma\}$ are ordered along the $x_0$-axis in the same way as the value of parameter $C$, it follows that $C(t)$ is a nondecreasing function along $\Gamma$. Thus, provided the derivative† exists, we have

$\dfrac{d\Phi}{dt} \ge 0$    (7.104)

along $\Gamma$. Furthermore, according to Lemma 2, an optimal trajectory $\Gamma^*$ belongs entirely to one $\Sigma$ surface; that is, $C(t) =$ const, $t_0 \le t \le t_1$. Hence, we have

$\dfrac{d\Phi}{dt} = 0$    (7.105)

along $\Gamma^*$. Now if grad $\Phi(x)$ is defined at a point $x$ of $\Gamma$ and $\Gamma^*$, we may write

$\dfrac{d\Phi}{dt} = \sum_{j=0}^{n} \dfrac{\partial\Phi}{\partial x_j}\,\dot{x}_j$

where $d\Phi/dt$ is evaluated along $\Gamma$ and $\Gamma^*$, respectively. Thus, (7.104) leads to

$\sum_{j=0}^{n} \dfrac{\partial\Phi(x)}{\partial x_j}\,f_j(x, u) \ge 0$    (7.106)

for $x = x^*(t)$, $u = u(t)$, whereas (7.105) results in

$\sum_{j=0}^{n} \dfrac{\partial\Phi(x)}{\partial x_j}\,f_j(x, u) = 0$    (7.107)

for $x = x^*(t)$, $u = u^*(t)$. Functional equation (7.103) follows at once from (7.106) and (7.107). It should be noted that (7.104) and (7.105) may be less restricted than (7.106) and (7.107), since the former may be valid even though $x$ is not a regular interior point of $\Sigma$ where grad $\Phi(x)$ is defined.

7.18 Antiregular Subset of a Limiting Surface
A set of interior points of a limiting surface $\Sigma$ will be called an antiregular subset of $\Sigma$, if neither $\bar{\mathscr{C}}_A(x)$ nor $\bar{\mathscr{C}}_B(x)$ is separable at a point $x$ of the set. We shall not discuss such subsets in detail at this time. However, one salient feature of an antiregular subset is embodied in the following corollary.

† We mean here the one-sided derivative $\lim_{\Delta t \to 0} \Delta\Phi/\Delta t$.
7.181 Corollary 13

Theorems 2 and 3 lead at once to

Corollary 13. Let $x^*(t')$ and $x^*(t'')$, $t'' > t'$, be points of optimal trajectory $\Gamma^*$ on limiting surface $\Sigma$.
(i) If $x^*(t')$ belongs to an antiregular subset, then $x^*(t'')$ cannot belong to a regular or to a repulsive subset.
(ii) If $x^*(t')$ belongs to a regular or to an attractive subset, then $x^*(t'')$ cannot belong to an antiregular subset.

7.19 Symmetrical Subset of Local Cone $\mathscr{S}(x)$
We shall call $\mathscr{I}(x)$ a symmetrical subset of local cone $\mathscr{S}(x)$, if $\mathscr{I}(x)$ is the set of all points, $x + \eta$, such that

$x + \eta \in \mathscr{S}(x)$ and $x - \eta \in \mathscr{S}(x)$
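7.191 Lemmas 22 and 23

Let us now prove

Lemma 22. Let $x' \triangleq x^*(t')$ and $x'' \triangleq x^*(t'')$, $t'' \ge t'$, be points of optimal trajectory $\Gamma^*$. If $\bar{\mathscr{C}}_A(x'')$ is separable, then

$x' + \eta' \in \mathscr{I}(x') \Rightarrow x'' + \eta'' \in \mathscr{I}(x'')$

where

$\eta'' = A(t', t'')\eta'$

Since $\bar{\mathscr{C}}_A(x'')$ is supposed separable, it follows from Theorem 3 that $\bar{\mathscr{C}}_A(x')$ is separable. Let $\mathscr{T}(x'')$ be a separating hyperplane of $\bar{\mathscr{C}}_A(x'')$. According to the results of Sec. 7.154, $\mathscr{T}(x'')$ is the transform, due to linear transformation $A(t', t'')$, of a separating hyperplane $\mathscr{T}(x')$ of $\bar{\mathscr{C}}_A(x')$. Let $\bar{R}_A''$ and $\bar{R}_A'$ denote the closed half spaces determined by $\mathscr{T}(x'')$ and $\mathscr{T}(x')$, respectively, where

$\bar{\mathscr{C}}_A(x') \subseteq \bar{R}_A'$ and $\bar{\mathscr{C}}_A(x'') \subseteq \bar{R}_A''$

Recall here that $\bar{R}_A''$ is the transform of $\bar{R}_A'$, due to $A(t', t'')$ (see Sec. 7.154, remark (v)). Consider a vector $\eta'$ at $x'$, such that

$x' + \eta' \in \mathscr{I}(x')$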
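From the definition of symmetrical subset $\mathscr{I}(x')$, we have

$x' + \eta' \in \mathscr{S}(x')$ and $x' - \eta' \in \mathscr{S}(x')$

Thus,

$x' + \eta' \in \bar{\mathscr{C}}_A(x')$

so that, according to Lemma 20,

$x'' + \eta'' \in \bar{\mathscr{C}}_A(x'')$

Now, if

$x'' + \eta'' \in \mathscr{C}_A(x'')$

then

$x'' + \eta'' \in R_A''$

which implies that

$x'' - \eta'' \in \mathrm{comp}\,\bar{R}_A''$

But this is not possible, since

$x' - \eta' \in \mathscr{S}(x') \Rightarrow x'' - \eta'' \in \bar{\mathscr{C}}_A(x'') \subseteq \bar{R}_A''$

Consequently,

$x'' + \eta'' \in \mathscr{S}(x'')$

Moreover, since

$x' - \eta' \in \mathscr{I}(x')$

the same arguments apply to $-\eta'$; hence,

$x'' - \eta'' \in \mathscr{S}(x'')$

This establishes the lemma. In an analogous fashion, employing Theorem 2 and Lemma 21, one can prove

Lemma 23. Let $x' \triangleq x^*(t')$ and $x'' \triangleq x^*(t'')$, $t'' \ge t'$, be points of optimal trajectory $\Gamma^*$. If $\bar{\mathscr{C}}_B(x')$ is separable, then

$x'' + \eta'' \in \mathscr{I}(x'') \Rightarrow x' + \eta' \in \mathscr{I}(x')$

where

$\eta'' = A(t', t'')\eta'$

Of course, these lemmas are trivially valid for the case of null vectors.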
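The definition of a symmetrical subset can be made concrete on a surface with a corner. The sketch below is a hypothetical illustration: it takes the surface $x_0 + |x_1| = C$ in the $(x_0, x_1)$ plane and assumes that its local cone at a point is the set of vectors $\eta$ with $\eta_0 + \eta_1 = 0$ when $x_1 > 0$, with $\eta_0 - \eta_1 = 0$ when $x_1 < 0$, and with $\eta_0 + |\eta_1| = 0$ at the corner $x_1 = 0$. The function names are ours.

```python
# Sketch: membership test for the symmetrical subset of a local cone,
# illustrated on the surface x0 + |x1| = C (an assumed example).

def in_local_cone(eta, x1, tol=1e-12):
    """True if x + eta lies in the assumed local cone at a point with coordinate x1."""
    eta0, eta1 = eta
    if x1 > 0:
        return abs(eta0 + eta1) <= tol      # tangent line on the branch x1 > 0
    if x1 < 0:
        return abs(eta0 - eta1) <= tol      # tangent line on the branch x1 < 0
    return abs(eta0 + abs(eta1)) <= tol     # two half lines meeting at the corner

def in_symmetrical_subset(eta, x1):
    """x + eta belongs to the symmetrical subset iff both x + eta and x - eta lie in the cone."""
    minus_eta = (-eta[0], -eta[1])
    return in_local_cone(eta, x1) and in_local_cone(minus_eta, x1)

print(in_symmetrical_subset((1.0, -1.0), x1=1.0))   # smooth point
print(in_symmetrical_subset((-1.0, 1.0), x1=0.0))   # corner, nonzero vector
print(in_symmetrical_subset((0.0, 0.0), x1=0.0))    # corner, null vector
```

At a smooth point the symmetrical subset is the whole tangent line, whereas at the corner only the null vector survives, so that the symmetrical subset there reduces to the point $x$ itself.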
7.192 Dimension of a Symmetrical Subset; Theorems 5 and 6

We shall say that points $x + \eta_\nu$, $\nu = 1, 2, \ldots, r$, are linearly independent, if

$\sum_{\nu=1}^{r} \alpha_\nu \eta_\nu = 0 \Rightarrow \alpha_\nu = 0$, $\nu = 1, 2, \ldots, r$

In the absence of such an implication, we shall say that these points are linearly dependent. Furthermore, we shall say that the dimension of symmetrical subset $\mathscr{I}(x)$ is $\gamma$, if
(i) there exist $\gamma$ points $x + \eta_\nu$, $\nu = 1, 2, \ldots, \gamma$, in $\mathscr{I}(x)$, which are linearly independent; and
(ii) more than $\gamma$ distinct points in $\mathscr{I}(x)$ are necessarily linearly dependent.

Lemmas 22 and 23 lead readily to

Theorem 5. Let $x' \triangleq x^*(t')$ and $x'' \triangleq x^*(t'')$, $t'' \ge t'$, be points of optimal trajectory $\Gamma^*$, and let $\gamma'$ and $\gamma''$ be the dimensions of $\mathscr{I}(x')$ and $\mathscr{I}(x'')$, respectively. If $\bar{\mathscr{C}}_A(x'')$ is separable, then $\gamma'' \ge \gamma'$;

and

Theorem 6. Let $x' \triangleq x^*(t')$ and $x'' \triangleq x^*(t'')$, $t'' \ge t'$, be points of optimal trajectory $\Gamma^*$, and let $\gamma'$ and $\gamma''$ be the dimensions of $\mathscr{I}(x')$ and $\mathscr{I}(x'')$, respectively. If $\bar{\mathscr{C}}_B(x')$ is separable, then $\gamma'' \le \gamma'$.

For instance, consider the case for which $\bar{\mathscr{C}}_A(x'')$ is separable, and suppose the theorem is false; namely, $\gamma'' < \gamma'$. Then consider a set of $\gamma'' + 1$ linearly independent points $x' + \eta_\nu'$, $\nu = 1, 2, \ldots, \gamma'' + 1$, in $\mathscr{I}(x')$; obviously we can always do this in view of our hypothesis that $\gamma' > \gamma''$. Since $A(t', t'')$ is linear and nonsingular, the points $x'' + \eta_\nu''$, $\nu = 1, 2, \ldots, \gamma'' + 1$, where

$\eta_\nu'' = A(t', t'')\eta_\nu'$

are also linearly independent. Indeed,

$\sum_{\nu=1}^{\gamma''+1} \alpha_\nu \eta_\nu' = A(t'', t') \sum_{\nu=1}^{\gamma''+1} \alpha_\nu \eta_\nu''$

Hence

$\sum_{\nu=1}^{\gamma''+1} \alpha_\nu \eta_\nu'' = 0 \Rightarrow \sum_{\nu=1}^{\gamma''+1} \alpha_\nu \eta_\nu' = 0$
which, in turn, implies $\alpha_\nu = 0$, $\nu = 1, 2, \ldots, \gamma'' + 1$, since points $x' + \eta_\nu'$, $\nu = 1, 2, \ldots, \gamma'' + 1$, are linearly independent. Furthermore, according to Lemma 22, points $x'' + \eta_\nu''$, $\nu = 1, 2, \ldots, \gamma'' + 1$, belong to $\mathscr{I}(x'')$. Thus, we have arrived at a contradiction, since $\mathscr{I}(x'')$ has dimension $\gamma''$. This establishes Theorem 5. Theorem 6 follows from an analogous argument, invoking Lemma 23.

7.193 Symmetrical Subset as Hyperplane; Theorem 7
From Lemmas 22 and 23 one can readily deduce

Theorem 7. If
(i) $\mathscr{I}(x')$ and $\mathscr{I}(x'')$ are hyperplanes at points $x' \triangleq x^*(t')$ and $x'' \triangleq x^*(t'')$, $t'' \ge t'$, of optimal trajectory $\Gamma^*$, where $\bar{\mathscr{C}}_A(x'')$ or $\bar{\mathscr{C}}_B(x')$ is separable; and
(ii) the dimensions of $\mathscr{I}(x')$ and $\mathscr{I}(x'')$ are equal;
then $\mathscr{I}(x'')$ is the transform of $\mathscr{I}(x')$, due to linear transformation $A(t', t'')$.

Suppose, for instance, that $\bar{\mathscr{C}}_A(x'')$ is separable. By hypothesis, the dimensions of $\mathscr{I}(x')$ and $\mathscr{I}(x'')$ are equal; that is, $\gamma' = \gamma'' \triangleq \gamma$. Consider a set of $\gamma$ linearly independent points $x' + \eta_\nu'$, $\nu = 1, 2, \ldots, \gamma$, in $\mathscr{I}(x')$. This set of points forms a basis for hyperplane $\mathscr{I}(x')$; namely, to each point $x' + \eta'$ of plane $\mathscr{I}(x')$ one can associate one and only one set of $\gamma$ real numbers $\alpha_\nu \gtrless 0$, $\nu = 1, 2, \ldots, \gamma$, such that

$\eta' = \sum_{\nu=1}^{\gamma} \alpha_\nu \eta_\nu'$    (7.108)

Conversely, to each set of $\gamma$ real numbers $\alpha_\nu \gtrless 0$, $\nu = 1, 2, \ldots, \gamma$, one can associate one and only one point $x' + \eta'$ in $\mathscr{I}(x')$ by means of relation (7.108). This statement can be expressed by

$x' + \eta' \in \mathscr{I}(x') \Leftrightarrow \eta' = \sum_{\nu=1}^{\gamma} \alpha_\nu \eta_\nu'$, $\alpha_\nu \in \mathscr{R}$    (7.109)

where $\mathscr{R}$ is the set of real numbers. Since transformation $A(t', t'')$ is linear and nonsingular, the set of points

$x'' + \eta_\nu''$, $\eta_\nu'' = A(t', t'')\eta_\nu'$, $\nu = 1, 2, \ldots, \gamma$    (7.110)

is linearly independent, as discussed in Sec. 7.192. Moreover, it follows from Lemma 22 that this set of points belongs to $\mathscr{I}(x'')$. Thus, since $\mathscr{I}(x'')$ is $\gamma$-dimensional, this set of points forms a basis for hyperplane $\mathscr{I}(x'')$.
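The fact used here and in Sec. 7.192, that a nonsingular linear transformation carries linearly independent vectors into linearly independent vectors, is easy to verify with exact arithmetic. In the sketch below the matrices and vectors are invented for the illustration; `A` stands in for $A(t', t'')$, and a singular matrix is included only to show that nonsingularity cannot be dropped.

```python
# Sketch: a nonsingular linear map preserves linear independence; a singular
# one need not. Exact rational arithmetic avoids round-off in the rank test.
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def apply(matrix, v):
    """Image of vector v under the linear map given by `matrix`."""
    return [sum(Fraction(m) * Fraction(c) for m, c in zip(row, v)) for row in matrix]

A = [[1, 2, 0], [0, 1, 3], [0, 0, 1]]          # nonsingular (unit upper triangular)
S = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]          # singular, for contrast
etas_1 = [[1, 0, 0], [1, 1, 0]]                # two independent vectors at x'

etas_2 = [apply(A, v) for v in etas_1]         # their transforms at x''
collapsed = [apply(S, v) for v in etas_1]

print(rank(etas_1), rank(etas_2), rank(collapsed))
```

The first two ranks agree, while the singular map collapses the pair onto a single direction.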
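Let us denote by $x'' + \xi''$ a point in $\mathscr{I}(x'')$.† We shall show that $\xi'' = A(t', t'')\eta'$ where $x' + \eta' \in \mathscr{I}(x')$, so that we will be able to set $\xi'' = \eta''$. Since the set of points (7.110) forms a basis for hyperplane $\mathscr{I}(x'')$, it follows that

$x'' + \xi'' \in \mathscr{I}(x'') \Leftrightarrow \xi'' = \sum_{\nu=1}^{\gamma} \alpha_\nu \eta_\nu''$, $\alpha_\nu \in \mathscr{R}$

and hence

$x'' + \xi'' \in \mathscr{I}(x'') \Leftrightarrow \xi'' = A(t', t'') \sum_{\nu=1}^{\gamma} \alpha_\nu \eta_\nu'$, $\alpha_\nu \in \mathscr{R}$    (7.111)

Upon comparing (7.109) and (7.111), we see that

$\xi'' = A(t', t'')\eta'$, $x' + \eta' \in \mathscr{I}(x')$

and so we put

$\xi'' = \eta''$

and conclude that

$x' + \eta' \in \mathscr{I}(x') \Leftrightarrow x'' + \eta'' \in \mathscr{I}(x'')$

Employing Lemma 23, this result is obtained in similar fashion for the case of $\bar{\mathscr{C}}_B(x')$ separable; and so Theorem 7 is established.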
7.194 Symmetrical Subset and Separating Hyperplane; Lemmas 24 and 25

Finally, we shall prove

Lemma 24. At an interior point $x$ of limiting surface $\Sigma$, where $\bar{\mathscr{C}}_A(x)$ is separable, any separating hyperplane $\mathscr{T}(x)$ of $\bar{\mathscr{C}}_A(x)$ contains symmetrical subset $\mathscr{I}(x)$.

Let $\bar{R}_A$ be the closed half space which is determined by $\mathscr{T}(x)$ and contains $\bar{\mathscr{C}}_A(x)$; that is,

$\bar{\mathscr{C}}_A(x) \subseteq \bar{R}_A$

Consider a vector $\eta$ at point $x$, such that

$x + \eta \in \mathscr{I}(x)$

so that

$x + \eta \in \mathscr{S}(x)$ and $x - \eta \in \mathscr{S}(x)$

† Since we have not as yet shown that $\xi''$ is the transform of vector $\eta'$, we shall avoid the notation $\eta''$.
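Since

$x + \eta \in \mathscr{S}(x) \Rightarrow x + \eta \in \bar{\mathscr{C}}_A(x) \Rightarrow x + \eta \in \bar{R}_A$

it follows that

$x - \eta \in \overline{\mathrm{comp}\,\bar{R}_A}$    (7.112)

where, indeed,

$\overline{\mathrm{comp}\,\bar{R}_A} = \mathrm{comp}\,R_A$    (7.113)

On the other hand,

$x - \eta \in \mathscr{S}(x) \Rightarrow x - \eta \in \bar{\mathscr{C}}_A(x) \Rightarrow x - \eta \in \bar{R}_A$    (7.114)

In view of (7.112)-(7.114), we have

$x - \eta \in \bar{R}_A \cap \mathrm{comp}\,R_A = \mathscr{T}(x)$

and consequently

$x + \eta \in \mathscr{T}(x)$

Hence, Lemma 24 is established. Analogous arguments lead to

Lemma 25. At an interior point $x$ of limiting surface $\Sigma$, where $\bar{\mathscr{C}}_B(x)$ is separable, any separating hyperplane $\mathscr{T}(x)$ of $\bar{\mathscr{C}}_B(x)$ contains symmetrical subset $\mathscr{I}(x)$.

7.20 A Maximum Principle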
7.201 Assumptions; Lemma 26 Let us now consider an optimal trajectory r* represented parametrically by x = x*(t), to I t II , . We shall make the following assumptions: 6 ) The limiting surface 2, which contains r*,is nice. (ii) All points of r*are interior points of €*. (iii) At every point x = x*(t), t E ( t o , tl), gWA(x) = GA(x), and @;,(x) = @js(X). A point x will be called a degenerated point if either @,(x) or @,(x) is empty. Thus, assumption (iii) above implies that no point of r*is degenerated, except possibly its initial point xo = x*(to) and its terminal point x1 = x*(t,). However, we shall make the additional assumption that (iv) not both xo and x1 are degenerated points.
Let us recall the discussion of Sec. 7.112, according to which there exists a forward tangent and a backward tangent of $\Gamma^*$ at every point $x = x^*(t)$, $t \in (t_0, t_1)$; namely, these tangents are the supporting rays of

$f_+^* \triangleq f(x, u_+^*)$

and of

$-f_-^* = -f(x, u_-^*)$

respectively. Moreover, $f_+^*$ may differ from $f_-^*$ on the two sets of points described in cases (i) and (ii) of Sec. 7.112. Both of these sets of points are sets of measure zero with respect to position variable $x$. At all other points of $\Gamma^*$, we have $u_+^* = u_-^*$ so that

$f_+^* = f_-^* \triangleq f^*$

At such points, the tangent of $\Gamma^*$ is defined, namely, the forward tangent and the backward tangent possess the same supporting line, provided $f^* \ne 0$. It is readily seen that $f^* = 0$ on a set of points which has measure zero with respect to $x$. For other points, we shall find it convenient to consider unit vector

$t^* \triangleq \dfrac{f^*}{|f^*|}$

According to property (i) of a nice limiting surface, there exists a partition $\{\Sigma_1, \Sigma_2, \ldots, \Sigma_\mu\}$ of $\Sigma$ such that at every point $x$ in $\Sigma_i$, $i = 1, 2, \ldots, \mu$, the tangent cone $\mathscr{C}_{\Sigma_i}(x)$ is defined and belongs to a $k$-dimensional plane $T_{\Sigma_i}(x)$, $k \le n$, through point $x$. Of course, every point $x = x^*(t)$, $t \in [t_0, t_1]$, belongs to a subset $\bigcap_{i=1}^{\gamma} \bar{\Sigma}_i$, $\gamma \le \mu$, where the $\Sigma_i$, $i = 1, 2, \ldots, \gamma$, belong to the partition and are all the members of the partition whose closures contain $x$. The subset $\bigcap_{i=1}^{\gamma} \bar{\Sigma}_i$ may not be the same for all points of $\Gamma^*$; not only $\gamma$, but also the members of the partition which make up the subset, may vary.† Henceforth, in order to emphasize the dependence of the subset on $x = x^*(t)$, $t_0 \le t \le t_1$, we shall write

$M(x) \triangleq \bigcap_{i=1}^{\gamma} \bar{\Sigma}_i$

Note now that the number of such subsets is finite if $\mu$ is finite; if $\mu$ is not finite, the set of subsets is denumerable.‡

† This may require renumbering the members of the partition.
‡ See Appendix.
We shall now consider only those points of $\Gamma^*$ where its tangent is defined and $f^* \ne 0$, and we shall prove

Lemma 26. At every point $x = x^*(t)$, $t \in [t_0, t_1]$, of optimal trajectory $\Gamma^*$, except on a set of measure zero with respect to $x$, there exists a vector which is tangent both to $\Gamma^*$ and to the subset $M(x)$.

Let $B(x)$ be an open ball in $E^{n+1}$ with center at $x = x^*(t)$, $t \in [t_0, t_1]$, and radius $\varepsilon$. We shall distinguish between two cases:
(i) There exists a $\sigma > 0$ such that, for all $\varepsilon < \sigma$, no point of $\Gamma^*$, other than $x$, belongs to both $B(x)$ and $M(x)$.
(ii) No matter how small $\varepsilon$, there exists a point $x_\varepsilon \ne x$ which belongs to both $\Gamma^*$ and $M(x)$.

Since the set of all distinct subsets $\bigcap_{i=1}^{\gamma} \bar{\Sigma}_i$ is either finite or at worst denumerable, we shall designate this set by

$\{M_1, M_2, \ldots, M_\nu, \ldots\}$

Then $M(x)$ may be any member of this set. Of course,

$M_1 \cup M_2 \cup \cdots \cup M_\nu \cup \cdots = \Sigma$

Next consider the subsets of $\Gamma^*$

$\Gamma^* \cap M_1,\ \Gamma^* \cap M_2,\ \ldots,\ \Gamma^* \cap M_\nu,\ \ldots$

In each subset $\Gamma^* \cap M_i$, $i = 1, 2, \ldots, \nu, \ldots$, consider the set of points $D_i$ for which the condition of case (i) above is met. Set $D_i$ is clearly discrete, and hence has measure zero. Now, since there is either a finite number or a denumerable infinity of subsets $D_i$, say

$D_1, D_2, \ldots, D_\nu, \ldots$

and since the union of a finite number or of a denumerable infinity of sets, each of which has measure zero, is a set of measure zero, it follows that the points of $\Gamma^*$, for which the condition of case (i) above applies, constitute a set of measure zero.

Next let us turn to case (ii). Since

$x_\varepsilon \ne x = x^*(t)$, $t \in [t_0, t_1]$, $x_\varepsilon \in \Gamma^*$

we have

$x_\varepsilon = x^*(t_\varepsilon)$, $t_\varepsilon \in [t_0, t_1]$, $t_\varepsilon \ne t$
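We now distinguish between two subcases:
(a) $t_\varepsilon$ belongs to an infinite sequence of times such that $t < t_\varepsilon$ for all $\varepsilon$, and $|x_\varepsilon - x| \to 0$ as $\varepsilon \to 0$.
(b) The condition of (a) is not met. Since $\varepsilon$ may be arbitrarily small, $t_\varepsilon$ belongs to an infinite sequence of times such that $t_\varepsilon < t$ for all $\varepsilon$, and $|x_\varepsilon - x| \to 0$ as $\varepsilon \to 0$.

Thus, for subcase (a),

$\eta(\varepsilon) \triangleq \dfrac{x_\varepsilon - x}{|x_\varepsilon - x|} \to t^*$ as $\varepsilon \to 0$

and for subcase (b),

$\eta(\varepsilon) \triangleq \dfrac{x_\varepsilon - x}{|x_\varepsilon - x|} \to -t^*$ as $\varepsilon \to 0$

Furthermore,

$x_\varepsilon = x + m(\varepsilon)\eta(\varepsilon) \in M(x)$

where

$m(\varepsilon) \triangleq |x_\varepsilon - x| \to 0$ as $\varepsilon \to 0$

Hence, according to Sec. 7.81, we conclude that
$t^*$ is tangent to $M(x)$ in subcase (a);
$-t^*$ is tangent to $M(x)$ in subcase (b).

Since both $t^*$ and $-t^*$ are tangent to $\Gamma^*$, Lemma 26 is established. By the way, of course, it may happen that both $t^*$ and $-t^*$ are tangent to $M(x)$. Henceforth, we shall let $t(x)$ denote a vector which is tangent to both $\Gamma^*$ and $M(x)$ at point $x$. Thus it follows from the discussion above that

$t^* = t(x)$ or $-t^* = t(x)$    (7.115)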
7.202 The Cone $\mathscr{C}''$

From now on we shall consider points of $\Gamma^*$ for which the following conditions apply:
(i) The tangent of $\Gamma^*$ is defined, and $f^* \ne 0$.
(ii) There is a one-to-one correspondence between point $x \in \Gamma^*$ and time $t \in [t_0, t_1]$; that is, $x \ne x^*(t)$, $t \in (t', t'')$, if $f(x^*(t), u^*(t)) = 0$ on $[t', t'']$.
(iii) Lemma 26 applies.
(iv) Point $x \in \Gamma^*$ is a nondegenerated point.

According to assumption (iii) of Sec. 7.201, no point of $\Gamma^*$ is degenerated, except possibly the end points. Thus, we exclude from consideration $x^*(t_0)$ or $x^*(t_1)$, if it is degenerated. Thus, we are disregarding a set of points which is of measure zero in $E^{n+1}$. On the other hand, we shall be concerned with those points of $\Gamma^*$ for which (i)-(iv) above are met. The corresponding set of times, $t \in [t_0, t_1]$, will be denoted by $\Theta$; that is,

$\Theta \triangleq \{t : x^*(t) \text{ satisfies (i)-(iv) above}\}$
Now consider the set A,(x) which js the union of all vectors v, where v is (i) af(x, u), a 2 0, u E Q; (ii) bt(xj, b 5 0, t(x) tangent to both I?* and M(x), It(x)l = 1 ; (iii)? ci(x), c > 0 , i(x) E L+(x), li(x)l = 1 ; at x = x*(t), t E 0. Let us note a fundamental property of any vector v: namely, for all q such that
x and a
2
0, v
E
A,(x), x
+ q E @,(XI
+ q + CIV E qA(X)
(7.116)
For vectors defined in (i) and (ii) above, this property follows from Lemmas 17 and 13, respectively.‡ For vectors defined in (iii) above, it is a consequence of the definitions of $\bar{\mathscr{C}}_A(x)$ and $A/\Sigma$. We wish to prove that

$x + \eta \in \bar{\mathscr{C}}_A(x)$ implies $x + \eta + \alpha i(x) \in \bar{\mathscr{C}}_A(x)$, $\alpha \ge 0$

First of all suppose that

$x + \eta \in \mathscr{C}_A(x)$

† Recall that $L_+(x)$ and $L_-(x)$ are $x_0$-cylindrical half rays from $x$, pointing into the positive and negative $x_0$-directions, respectively.
‡ Note that $\mathscr{C}_A(x) \ne \emptyset$, according to assumption (iv) of this section.
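namely, there exists a $\beta > 0$ such that for all $\varepsilon$, $0 < \varepsilon < \beta$,

$x + \varepsilon\eta \in A/\Sigma$

Consequently, it follows from the definition of $A/\Sigma$ that

$x + \varepsilon\eta + \alpha i(x) \in A/\Sigma$, $\alpha \ge 0$, $0 < \varepsilon < \beta$

Hence, we conclude that

$x + \eta + \alpha i(x) \in \mathscr{C}_A(x)$

Secondly, suppose that

$x + \eta \in \mathscr{S}(x)$

and consider a sequence of vectors $\eta_1, \eta_2, \ldots, \eta_\nu$ such that

$\eta_\nu \to \eta$ as $\nu \to \infty$; $x + \eta_i \in \mathscr{C}_A(x)$, $i = 1, 2, \ldots, \nu$

Thus, as we just showed,

$x + \eta_i + \alpha i(x) \in \mathscr{C}_A(x)$, $i = 1, 2, \ldots, \nu$

But, if

$x + \eta + \alpha i(x) \in \mathscr{C}_B(x)$

then there exists a $\gamma > 0$ such that for all $\nu > \gamma$,

$x + \eta_\nu + \alpha i(x) \in \mathscr{C}_B(x)$

which leads to a contradiction. Since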
eA(x)= gA(X)u Y(x) our result is established. We shall be concerned with the vectors of the set V Li {v: v E A,(x), and with the cone %?" 4 {x"
x = x*(t), V f
E
0)
+ k5 : k 2 0, 5 = A ( t , t " ) ~ ,VV E V,
tI t"}
(7.1 17)
where X" 4 x*(t"), t" E [ t o , tl], is a nondegenerated point. However, t" need not belong to 0.t
t Provided I < t", of course.
As a direct consequence of (7.1 16), we have
x so that Lemma 20 leads to X"
and consequently
+ v E 8,(x)
+ 6 E qA(x") (7.118)
W" E @,(x")
We shall find it convenient to identify a particular member of set V by subscript; that is, v = {vl, Y z , ... ,v , , .. .> Thus, an index r determines a time t', and hence a point x, x * ( t r ) where set A,(xr) is defined, as well as the member v, E Au(xr).We shall also let
6, & A(t', It")vr,
tr It", v, E
v
(7.1 19)
Clearly, in view of definition (7.1 I7),
+ 6, E W"
X"
(7.120)
7.203 A Property of Cone $\mathscr{C}''$; Lemma 27

Consider now a vector $\varphi$ which is a linear combination, having nonnegative coefficients, of a finite number of vectors $\xi_r$ defined by (7.119); namely,

$\varphi = \sum_{r=1}^{s} \alpha_r \xi_r$, $\alpha_r \ge 0$    (7.121)†

Cone $\mathscr{C}''$ has a noteworthy property embodied in

Lemma 27. If $x'' \triangleq x^*(t'')$ is a nondegenerated point of optimal trajectory $\Gamma^*$, and $\varphi$ is a vector at $x''$ defined by (7.121), then

$x'' + \varphi \in \bar{\mathscr{C}}_A(x'')$

We shall suppose that

$t^1 \le t^2 \le \cdots \le t^s \le t''$

Then, in view of (7.119) and (7.121), we have

$\varphi = \sum_{r=1}^{s} \alpha_r A(t^r, t'')\nu_r = \sum_{r=1}^{s} A(t^r, t'')\alpha_r\nu_r$    (7.122)

† By appropriate renumbering of the set $V$, any such linear combination can be formed.
As pointed out in Sec. 7.141, since $[t^1, t'']$ is divided into subintervals $[t^1, t^2], [t^2, t^3], \ldots, [t^s, t'']$, we have

$A(t^1, t'') = A(t^s, t'') \cdots A(t^2, t^3)\,A(t^1, t^2)$
$A(t^2, t'') = A(t^s, t'') \cdots A(t^3, t^4)\,A(t^2, t^3)$
$\cdots$
$A(t^s, t'') = A(t^s, t'')$

This permits us to replace (7.122) by

(7.123)
The lemma will now be established by a recursive argument. As a consequence of (7.116), we have
$x_1 + \alpha_1 v_1 \in \bar{\mathscr{C}}_A(x_1)$
so that (7.123) with Lemma 20 results in
$x_2 + \eta_2 \in \bar{\mathscr{C}}_A(x_2)$
In general, if
$x_r + \eta_r \in \bar{\mathscr{C}}_A(x_r)$
then (7.116) leads to
$x_r + \eta_r + \alpha_r v_r \in \bar{\mathscr{C}}_A(x_r)$
and (7.123) with Lemma 20 leads to
$x_{r+1} + \eta_{r+1} \in \bar{\mathscr{C}}_A(x_{r+1})$
Thus, we conclude that
$x_s + \eta_s + \alpha_s v_s \in \bar{\mathscr{C}}_A(x_s)$
whence (7.123) with Lemma 20 establishes Lemma 27.
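The recursive argument rests on the composition rule for the transformations $A(t^r, t'')$. The following sketch is only an illustration under stated assumptions: arbitrary 2x2 matrices stand in for $A(t^r, t^{r+1})$, and the names `steps`, `vs`, `alphas`, `phi_direct`, and `phi_recursive` are hypothetical. It checks numerically that summing the transported vectors as in (7.122) agrees with transporting a running partial sum one subinterval at a time.

```python
# Illustrative sketch: 2x2 matrices stand in for A(t^r, t^{r+1}); all numbers are arbitrary.

def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(u, v):
    return [u[0] + v[0], u[1] + v[1]]

def scale(c, v):
    return [c * v[0], c * v[1]]

# steps[0] ~ A(t^1, t^2), steps[1] ~ A(t^2, t^3), steps[2] ~ A(t^3, t'')  (s = 3)
steps = [[[1.0, 0.5], [0.0, 1.0]], [[1.0, 0.0], [0.25, 1.0]], [[2.0, 0.0], [0.0, 0.5]]]
vs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # v_1, v_2, v_3
alphas = [0.5, 2.0, 1.5]                    # nonnegative coefficients

def transition_to_end(r):
    """Composite matrix steps[-1] ... steps[r], playing the role of A(t^{r+1}, t'')."""
    m = [[1.0, 0.0], [0.0, 1.0]]
    for step in steps[r:]:
        m = matmul(step, m)
    return m

def phi_direct():
    """Sum of alpha_r * A(t^r, t'') v_r, as in (7.122)."""
    total = [0.0, 0.0]
    for r in range(len(vs)):
        total = add(total, scale(alphas[r], matvec(transition_to_end(r), vs[r])))
    return total

def phi_recursive():
    """Add alpha_r v_r to the running vector, then transport it across one subinterval."""
    eta = [0.0, 0.0]
    for r in range(len(vs)):
        eta = matvec(steps[r], add(eta, scale(alphas[r], vs[r])))
    return eta
```

Both functions return the same vector because each `steps[r]` is linear, which is the only property of the transformations that the recursion uses.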
7.204 Convex Closure of Cone $\mathscr{C}''$
Since cone $\mathscr{C}''$ may be nonconvex, we shall find it useful to consider its convex closure
$K_A \triangleq \{x'' + \varphi : \varphi \text{ defined by (7.121)}\}$
Note that the convexity of $K_A$ follows from its definition together with that of $\varphi$, whence
$x'' + \varphi_1 \in K_A,\ x'' + \varphi_2 \in K_A \ \Rightarrow\ x'' + \alpha_1\varphi_1 + \alpha_2\varphi_2 \in K_A, \quad \alpha_1 \ge 0,\ \alpha_2 \ge 0$
First of all, as a consequence of Lemma 27,
$K_A \subseteq \bar{\mathscr{C}}_A(x'')$  (7.124)
Secondly, it is clear from their definitions that
$\mathscr{C}'' \subseteq K_A$  (7.125)
And finally, since linear transformation $A(t, t'')$ transforms a vector which is parallel to the $x_0$-axis into one which is parallel to the $x_0$-axis, and since
$c\,i(x) \in \Lambda_A(x), \quad x = x^*(t),\ t \in \Theta,\ c > 0$
it follows that
$L_+(x'') \subset \mathscr{C}''$
and hence
$L_+(x'') \subset K_A$  (7.126)
7.205 The Cone $\mathscr{C}'$; Lemma 28
Since the discussion of this section parallels that of the preceding sections, we shall avoid repeating the details of all arguments. We shall now consider the set $\Lambda_B(x)$ which is the union of all vectors $w$, where $w$ is
(i) $-a f(x, u)$, $a \ge 0$, $u \in \Omega$;
(ii) $b\,t(x)$, $b \ge 0$, $t(x)$ tangent to both $\Gamma^*$ and M(x), $|t(x)| = 1$;
(iii) $c\,j(x)$, $c > 0$, $j(x) \in L_-(x)$, $|j(x)| = 1$;
at $x = x^*(t)$, $t \in \Theta$.
Lemmas 18 and 14, respectively, on the one hand, and the definitions of $\mathscr{C}_B(x)$ and of $B/\Sigma$ on the other hand, lead to the following property of $w$: For all $\eta$ such that $x + \eta \in \bar{\mathscr{C}}_B(x)$, and $\alpha \ge 0$, $w \in \Lambda_B(x)$,
$x + \eta + \alpha w \in \bar{\mathscr{C}}_B(x)$  (7.127)
Here we shall be concerned with vectors of the set
$W \triangleq \{w : w \in \Lambda_B(x),\ x = x^*(t),\ \forall t \in \Theta\}$
and with the cone
$\mathscr{C}' \triangleq \{x' + k\zeta : k \ge 0,\ \zeta = A(t, t')w,\ \forall w \in W,\ t \ge t'\}$  (7.128)
where $x' \triangleq x^*(t')$, $t' \in [t_0, t_1]$, is a nondegenerated point. However, $t'$ need not belong to $\Theta$.†
In view of (7.127), we have
$x + w \in \bar{\mathscr{C}}_B(x)$
so that Lemma 21 leads to
$x' + \zeta \in \bar{\mathscr{C}}_B(x')$
and consequently
$\mathscr{C}' \subseteq \bar{\mathscr{C}}_B(x')$  (7.129)
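We shall again use index $r$ to identify a member of set
$W = \{w_1, w_2, \ldots, w_r, \ldots\}$
and consider vectors
$\zeta_r \triangleq A(t^r, t')w_r, \quad t^r \ge t', \quad w_r \in W$
and
$\psi = \sum_{r=1}^{s} \beta_r \zeta_r, \quad \beta_r \ge 0$  (7.130)
where, in view of definition (7.128), it is clear that
$x' + \zeta_r \in \mathscr{C}'$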
By arguments analogous to those employed in Sec. 7.203, one can prove
Lemma 28. If $x' \triangleq x^*(t')$ is a nondegenerated point of optimal trajectory $\Gamma^*$, and $\psi$ is a vector at $x'$ defined by (7.130), then
$x' + \psi \in \bar{\mathscr{C}}_B(x')$
7.206 Convex Closure of Cone $\mathscr{C}'$
Now we consider the convex closure of $\mathscr{C}'$, namely,
$K_B \triangleq \{x' + \psi : \psi \text{ defined by (7.130)}\}$
† Provided $t' < t_1$, of course.
From Lemma 28, we have
$K_B \subseteq \bar{\mathscr{C}}_B(x')$  (7.131)
and from their definitions it follows that
$\mathscr{C}' \subseteq K_B$  (7.132)
Furthermore, by arguments similar to those used in Sec. 7.204, we conclude that
$L_-(x') \subset \mathscr{C}'$
and hence
$L_-(x') \subset K_B$  (7.133)
7.207 Theorem 8
Let us first suppose that
$\mathscr{C}_A(x^0) \ne \emptyset, \quad x^0 = x^*(t_0)$  (7.134)
and let $x' = x^0$, so that $K_B$ is a cone with vertex at $x^0$. Since $K_B$ is convex and, in view of (7.131), does not fill the entire space, there exists a separating hyperplane $\mathscr{T}(x^0)$ of $K_B$ at $x^0$. In other words, $K_B$ belongs to one of the closed half spaces determined by $\mathscr{T}(x^0)$, say $\bar{R}_B(x^0)$; that is,
$K_B \subset \bar{R}_B(x^0)$  (7.135)
Let $\Pi(x^*(t))$ and $\bar{P}_B(x^*(t))$ denote the transforms, due to $A(t_0, t)$, $t \in \Theta$, of $\mathscr{T}(x^0)$ and $\bar{R}_B(x^0)$, respectively; that is
$x^0 + \eta \in \mathscr{T}(x^0) \ \Rightarrow\ x^*(t) + A(t_0, t)\eta \in \Pi(x^*(t))$
$x^0 + \eta \in \bar{R}_B(x^0) \ \Rightarrow\ x^*(t) + A(t_0, t)\eta \in \bar{P}_B(x^*(t))$
In fact, since the transformation is linear and nonsingular, $\Pi(x^*(t))$ is the boundary of $\bar{P}_B(x^*(t))$. As a consequence of the definitions of $\Lambda_B(x)$ and $\mathscr{C}'$, we have
$x^0 - A(t, t_0) f(x^*(t), u) \in \mathscr{C}', \quad \forall u \in \Omega$
where $\mathscr{C}'$ is now a cone at $x^0$. Thus, it follows from (7.132) that
$x^0 - A(t, t_0) f(x^*(t), u) \in K_B, \quad \forall u \in \Omega$
which, in view of (7.135), implies
$x^0 - A(t, t_0) f(x^*(t), u) \in \bar{R}_B(x^0)$
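And so
$x^*(t) - f(x^*(t), u) \in \bar{P}_B(x^*(t)), \quad \forall u \in \Omega$
whence we conclude that
$x^*(t) + f(x^*(t), u) \in \mathrm{comp}\, P_B(x^*(t)), \quad \forall u \in \Omega$  (7.136)
Furthermore, it follows from the definitions of $\Lambda_B(x)$ and $\mathscr{C}'$ that
$b\,t(x^*(t)) \in \Lambda_B(x^*(t)),\ b \gtrless 0 \ \Rightarrow\ x^0 + bA(t, t_0)\,t(x^*(t)) \in \mathscr{C}'$
Hence, we have
$x^0 + bA(t, t_0)\,t(x^*(t)) \in K_B, \quad b \gtrless 0$
and then
$x^0 + bA(t, t_0)\,t(x^*(t)) \in \bar{R}_B(x^0), \quad b \gtrless 0$
This implies
$x^*(t) + b\,t(x^*(t)) \in \bar{P}_B(x^*(t)), \quad b \gtrless 0$
which, in turn, implies
$x^*(t) + t(x^*(t)) \in \Pi(x^*(t))$
since $b \gtrless 0$; that is, both the vector and its negative lie in the same closed half space.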
Furthermore, it follows from (7.115) that $f(x^*(t), u^*(t))$ is collinear with $t(x^*(t))$, so that
$x^*(t) + f(x^*(t), u^*(t)) \in \Pi(x^*(t)), \quad \forall t \in \Theta$  (7.137)†
Let us now consider the solution $\lambda(t)$, $t_0 \le t \le t_1$, of adjoint equations (7.79) with initial condition $\lambda(t_0) = \lambda^0 \ne 0$ and such that
$\lambda^0$ is normal to $\mathscr{T}(x^0)$, $\quad x^0 + \lambda^0 \in \bar{R}_B(x^0)$
As a consequence of (7.82), and since $\lambda(t)$ is a nonzero continuous vector function, we have
$\lambda(t)$ is normal to $\Pi(x^*(t))$, $\quad x^*(t) + \lambda(t) \in \bar{P}_B(x^*(t))$
† Of course, this is trivially so if $f(x^*(t), u^*(t)) = 0$.
And so, we conclude from (7.136) and (7.137) that
$\lambda(t) \cdot f(x^*(t), u) \le 0, \quad \forall u \in \Omega,\ \forall t \in \Theta$  (7.138)
and
$\lambda(t) \cdot f(x^*(t), u^*(t)) = 0, \quad \forall t \in \Theta$  (7.139)
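The step from "normal to $\mathscr{T}(x^0)$" to "normal to $\Pi(x^*(t))$" uses the fact that the inner product of an adjoint solution with a transported vector stays constant along the trajectory. We read (7.82) as expressing this property, which is an assumption about an equation outside this section. The sketch below is a discrete-time stand-in, not the chapter's construction: a vector is advanced by a nonsingular matrix $M$, the adjoint by the inverse transpose of $M$, and the inner product is recorded at each step. All names and numbers are hypothetical.

```python
# Discrete stand-in (assumption): eta -> M eta, lam -> (M^{-1})^T lam on each step.

def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def inv2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def inner_products(steps, eta0, lam0):
    """Inner products lam . eta at the start and after each transformation step."""
    eta, lam = list(eta0), list(lam0)
    products = [dot(lam, eta)]
    for m in steps:
        eta = matvec(m, eta)                    # transported vector
        lam = matvec(transpose(inv2(m)), lam)   # adjoint vector
        products.append(dot(lam, eta))
    return products

steps = [[[1.0, 0.5], [0.0, 1.0]], [[1.0, 0.0], [0.25, 1.0]], [[2.0, 0.0], [0.0, 0.5]]]
tangent_case = inner_products(steps, [1.0, 2.0], [2.0, -1.0])  # starts orthogonal
generic_case = inner_products(steps, [1.0, 2.0], [1.0, 1.0])   # starts with product 3
```

In `tangent_case` the product starts at zero and remains zero, which mirrors a normal staying normal to the transformed hyperplane; in `generic_case` the sign of the product is preserved, which mirrors the adjoint vector staying on the same side.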
If $\mathscr{C}_A(x^0) = \emptyset$, then we shall suppose that
$\mathscr{C}_B(x^1) \ne \emptyset, \quad x^1 = x^*(t_1)$  (7.140)
We arrive again at conditions (7.138) and (7.139) by letting $x'' = x^1$ and considering cone $K_A$ at terminal point $x^1$. Cone $K_A$ is convex, and since (7.124) applies, $K_A$ does not fill the entire space. Hence, there exists a separating hyperplane $\mathscr{T}(x^1)$ of $K_A$, so that
$K_A \subset \bar{R}_A(x^1)$  (7.141)
where $\bar{R}_A(x^1)$ is one of the closed half spaces determined by $\mathscr{T}(x^1)$. Let $\Pi(x^*(t))$ and $\bar{P}_A(x^*(t))$ denote the transforms, due to $A(t, t_1)$, $t \in \Theta$, of $\mathscr{T}(x^1)$ and $\bar{R}_A(x^1)$, respectively. In view of the definitions of $\Lambda_A(x)$ and $\mathscr{C}''$, we have
$x^1 + A(t, t_1) f(x^*(t), u) \in \mathscr{C}'', \quad \forall u \in \Omega$
where $\mathscr{C}''$ is now a cone at $x^1$. Thus, it follows from (7.125) that
$x^1 + A(t, t_1) f(x^*(t), u) \in K_A, \quad \forall u \in \Omega$
which implies, as a consequence of (7.141),
$x^1 + A(t, t_1) f(x^*(t), u) \in \bar{R}_A(x^1), \quad \forall u \in \Omega$
whence
$x^*(t) + f(x^*(t), u) \in \bar{P}_A(x^*(t)), \quad \forall u \in \Omega$  (7.142)
By arguments analogous to those utilized in the proof of (7.137), we can show again that
$x^*(t) + f(x^*(t), u^*(t)) \in \Pi(x^*(t))$  (7.143)
Here we consider the solution $\lambda(t)$, $t_0 \le t \le t_1$, with "initial" condition $\lambda(t_1) = \lambda^1 \ne 0$ and such that
$\lambda^1$ is normal to $\mathscr{T}(x^1)$, $\quad x^1 + \lambda^1 \in \mathrm{comp}\, R_A(x^1)$
As before, we conclude that
$\lambda(t)$ is normal to $\Pi(x^*(t))$, $\quad x^*(t) + \lambda(t) \in \mathrm{comp}\, P_A(x^*(t))$
and so conditions (7.138) and (7.139) are valid.
Finally, since $x_0$ does not appear explicitly in the trajectory equation, we have
$\lambda_0(t) = \text{const}, \quad \forall t \in [t_0, t_1]$
Furthermore, either
$x^0 + \lambda(t_0) \in \bar{R}_B(x^0)$
and, by (7.133) and (7.135),
$L_-(x^0) \subset \bar{R}_B(x^0)$
or
$x^1 + \lambda(t_1) \in \mathrm{comp}\, R_A(x^1)$
and, by (7.126) and (7.141),
$L_+(x^1) \subset \bar{R}_A(x^1)$
Thus, it follows that
$\lambda_0(t) = \text{const} \le 0, \quad \forall t \in \Theta$  (7.144)
In conclusion we can state
Theorem 8. If the following assumptions are met:
(i) limiting surface $\Sigma$ is nice;
(ii) every point of optimal trajectory $\Gamma^*$ is an interior point of $\mathscr{E}^*$;
(iii) no point of $\Gamma^*$ is degenerated, except possibly its initial or its terminal point but not both;
and if $u^*(t)$, $t_0 \le t \le t_1$, is an optimal control, and $x^*(t)$ is the corresponding optimal solution of trajectory equation (7.44), then there exists a nonzero continuous vector function $\lambda(t)$ which is a solution of adjoint equations (7.79), such that
(i) $\sup_{u \in \Omega} \mathscr{H}(\lambda(t), x^*(t), u) = \mathscr{H}(\lambda(t), x^*(t), u^*(t))$;
(ii) $\mathscr{H}(\lambda(t), x^*(t), u^*(t)) = 0$;
(iii) $\lambda_0(t) = \text{const} \le 0$;
for all $t \in \Theta$.
7.21 Boundary Points of $\mathscr{E}^*$
Let us recall now the definitions of sets $E$ and $E^*$, respectively. Set $E$ is the set of all states in $E^n$ from which a given terminal state, $\mathbf{x}^1$, can be reached along a path generated by means of an admissible rule. Set $E^*$ is the set of all states in $E^n$ from which $\mathbf{x}^1$ can be reached along an optimal path.
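Limiting surfaces $\{\Sigma\}$ are defined in a domain $\mathscr{E}^*$ of $E^{n+1}$, whose projection on $E^n$ is $E^*$. If $E^*$ has a boundary, then $\mathscr{E}^*$ possesses an $x_0$-cylindrical boundary whose intersection with $E^n$ is the boundary of $E^*$. It is clear that $E^* \subseteq E$. Henceforth we shall assume that $E^* = E$; that is, terminal state $\mathbf{x}^1$ cannot be reached from a state from which it cannot be reached along an optimal path. Thus, we have
$\mathscr{E}^* \triangleq E^* \times x_0 = E \times x_0 \triangleq \mathscr{E}$
A point $\mathbf{x} \in E$ is an interior point of $E$ if there exists an open ball in $E^n$, whose center is $\mathbf{x}$ and all of whose points belong to $E$. Then $\mathring{E}$ denotes the set of all interior points of $E$, and $\bar{E}$ denotes the closure of $E$ in $E^n$, that is, the union of the set of all points of $E$ and the set of all limit points of $E$ in $E^n$. Likewise, a point $x \in \mathscr{E}$ is an interior point of $\mathscr{E}$ if there exists an open ball in $E^{n+1}$, whose center is $x$ and all of whose points belong to $\mathscr{E}$. Then $\mathring{\mathscr{E}}$ denotes the set of all interior points of $\mathscr{E}$, and $\bar{\mathscr{E}}$ denotes the closure of $\mathscr{E}$ in $E^{n+1}$. Analogous definitions hold for $\overset{\circ}{\mathrm{comp}\,\mathscr{E}}$ and $\overline{\mathrm{comp}\,\mathscr{E}}$, the set of interior points of $\mathrm{comp}\,\mathscr{E}$ and the closure of $\mathrm{comp}\,\mathscr{E}$ in $E^{n+1}$. Indeed, we have
$\mathring{E} \subseteq E \subseteq \bar{E}$
and
$\mathring{\mathscr{E}} \subseteq \mathscr{E} \subseteq \bar{\mathscr{E}}$
Furthermore, the boundaries of $E$ and $\mathscr{E}$, respectively, are
$\partial E \triangleq \bar{E} \cap \overline{\mathrm{comp}\,E}, \quad \mathrm{comp}\,E \subset E^n$
and
$\partial\mathscr{E} \triangleq \bar{\mathscr{E}} \cap \overline{\mathrm{comp}\,\mathscr{E}}, \quad \mathrm{comp}\,\mathscr{E} \subset E^{n+1}$
so that, for instance, the boundary of $\mathrm{comp}\,\mathscr{E}$ is
$\partial(\mathrm{comp}\,\mathscr{E}) = \partial\mathscr{E}$
Note that $E$ and $\mathscr{E}$ may be open, closed, or neither.
7.211 Lemma 29
We shall now prove
Lemma 29. Let $x' \triangleq x(t')$ and $x'' \triangleq x(t'')$ be points of a trajectory $\Gamma$ (optimal or nonoptimal) in $E^{n+1}$. If $x' \in \mathrm{comp}\,\mathscr{E}$ then
$x'' \notin \mathscr{E}, \quad \forall t'' \ge t'$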
To prove this lemma let us suppose it is false; namely,
$x'' \in \mathscr{E}$
Then there exists a trajectory which emanates from $x''$ and intersects the line $X^1$ at some point $x^1$, that is, a trajectory whose projection on $E^n$ is a path which reaches the given terminal state $\mathbf{x}^1$. Consequently, there exists a trajectory which starts at $x'$ and intersects the line $X^1$, namely, the union of the portion of $\Gamma$ from $x'$ to $x''$ with the trajectory from $x''$ to $x^1$. But this conclusion is incompatible with the hypothesis that $x' \in \mathrm{comp}\,\mathscr{E}$, and hence Lemma 29 is established.
7.212 Lemmas 30 and 31
Next we shall prove
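Lemma 30. Let $x' \triangleq x(t')$ and $x'' \triangleq x(t'')$ be points of a trajectory $\Gamma$ (optimal or nonoptimal) in $E^{n+1}$. If
$x' \in \partial\mathscr{E}$
then
$x'' \notin \mathring{\mathscr{E}}, \quad \forall t'' \ge t'$
First of all we note that $\mathrm{comp}\,\mathscr{E} \ne \emptyset$ since
$\mathrm{comp}\,\mathscr{E} = \emptyset \ \Rightarrow\ \mathscr{E} = E^{n+1} \ \Rightarrow\ \partial\mathscr{E} = \emptyset$
But this is contrary to the hypothesis of the lemma. Let $B(x')$ be an open ball in $E^{n+1}$, whose center is at $x'$. Now
$B(x') \cap \mathrm{comp}\,\mathscr{E} = \emptyset \ \Rightarrow\ B(x') \subset \mathscr{E} \ \Rightarrow\ x' \in \mathring{\mathscr{E}}$
which contradicts the hypothesis of the lemma. Hence
$B(x') \cap \mathrm{comp}\,\mathscr{E} \ne \emptyset$
no matter how small the radius of $B(x')$. Now consider a point
$x' + \varepsilon\eta' \in \mathrm{comp}\,\mathscr{E}, \quad \varepsilon > 0$  (7.145)
and let $\Gamma'$ denote a trajectory which starts at point $x' + \varepsilon\eta'$ and which is generated by the same control that generates trajectory $\Gamma$ for $t \ge t'$. At time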
$t'' \ge t'$, the point of trajectory $\Gamma'$ is given by the solution of trajectory equation (7.44) with initial condition $x = x' + \varepsilon\eta'$ at $t = t'$; namely, it is
$x'' + \varepsilon\eta'' + o(t'', \varepsilon)$
where $o(t'', \varepsilon)/\varepsilon$ tends to zero uniformly as $\varepsilon \to 0$, and
$\eta'' = A(t', t'')\eta'$
Suppose now that the assertion of the lemma is false and
$x'' \in \mathring{\mathscr{E}}$
Then there exists a positive number $\alpha$ such that
$x'' + \varepsilon\eta'' + o(t'', \varepsilon) \in \mathscr{E}, \quad \forall \varepsilon < \alpha$
However, this relation together with (7.145) violates Lemma 29; thus, Lemma 30 is valid. Now we can establish
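Lemma 31. A trajectory whose initial point belongs to $\overset{\circ}{\mathrm{comp}\,\mathscr{E}}$ has no point on boundary $\partial\mathscr{E}$, nor, indeed, a point in $\mathscr{E}$.
First of all consider a point $x \in \partial\mathscr{E}$, and let $B(x)$ be an open ball in $E^{n+1}$ with center at $x$. If
$B(x) \cap \mathscr{E} = \emptyset$
then
$B(x) \subset \mathrm{comp}\,\mathscr{E}$
Hence $x$ is an interior point of $\mathrm{comp}\,\mathscr{E}$. But this is not possible since
$x \in \partial\mathscr{E} \ \Rightarrow\ x \in \partial(\mathrm{comp}\,\mathscr{E})$
Consequently
$x \in \partial\mathscr{E} \ \Rightarrow\ B(x) \cap \mathscr{E} \ne \emptyset$  (7.146)
Now let $x' \triangleq x(t')$ and $x'' \triangleq x(t'')$, $t'' > t'$, be points of a trajectory $\Gamma$, and suppose that
$x' \in \overset{\circ}{\mathrm{comp}\,\mathscr{E}}$ and $x'' \in \partial\mathscr{E}$
In other words, suppose that Lemma 31 is false.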
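Now let $B(x'')$ be an open ball in $E^{n+1}$ with center at $x''$ and radius $\rho$, and consider a point
$x'' + \varepsilon\eta'' \in B(x'') \cap \mathscr{E}$
According to (7.146), this is possible. Consider also a trajectory $\Gamma'$, generated by the same control that generates $\Gamma$ on $[t', t'']$, which passes through point $x'' + \varepsilon\eta''$ at time $t''$. At time $t'$, the point of $\Gamma'$ is
$x' + \varepsilon\eta' + o(t', \varepsilon)$  (7.147)
where $o(t', \varepsilon)/\varepsilon$ tends to zero uniformly as $\varepsilon \to 0$, and
$\eta' = A(t'', t')\eta''$  (7.148)
Since $x' \in \overset{\circ}{\mathrm{comp}\,\mathscr{E}}$ and $x'' + \varepsilon\eta'' \in B(x'')$, it follows from (7.147) and (7.148) that
$x' + \varepsilon\eta' + o(t', \varepsilon) \in \mathrm{comp}\,\mathscr{E}$
for sufficiently small radius $\rho$. But this is not possible in view of Lemma 29 together with
$x'' + \varepsilon\eta'' \in \mathscr{E}$
Hence
$x' \in \overset{\circ}{\mathrm{comp}\,\mathscr{E}} \ \Rightarrow\ x'' \notin \partial\mathscr{E}$
Of course, Lemma 29 leads at once to the result that
$x' \in \overset{\circ}{\mathrm{comp}\,\mathscr{E}} \ \Rightarrow\ x'' \notin \mathscr{E}$
Thus, Lemma 31 is established.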
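The concatenation argument behind Lemma 29 has a simple finite analog, sketched below under loudly stated assumptions: states are nodes of a directed graph, an edge is one admissible move, and the set of nodes that can reach a target node plays the role of $\mathscr{E}$. The graph, the node names, and the helper functions are hypothetical and serve only to illustrate why no move can lead from outside the set into it, while moves out of the set remain possible.

```python
# Finite analog (assumption): nodes = states, directed edges = admissible moves.

def can_reach_target(edges, target):
    """Set of nodes from which `target` is reachable (the target itself included)."""
    reverse = {}
    for src, dst in edges:
        reverse.setdefault(dst, []).append(src)
    seen = {target}
    stack = [target]
    while stack:
        node = stack.pop()
        for prev in reverse.get(node, []):
            if prev not in seen:
                seen.add(prev)
                stack.append(prev)
    return seen

def entering_edges(edges, target):
    """Edges that start outside the reachable set and end inside it."""
    inside = can_reach_target(edges, target)
    return [(a, b) for a, b in edges if a not in inside and b in inside]

edges = [("a", "b"), ("b", "goal"), ("c", "a"), ("d", "e"), ("e", "d"), ("b", "d")]
reachable = can_reach_target(edges, "goal")
violations = entering_edges(edges, "goal")
```

If an edge led from an outside node into the set, that outside node could reach the target by concatenation and would itself belong to the set, so `violations` is necessarily empty; the edge from "b" to "d" shows that leaving the set is not excluded.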
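7.213 Lemma 32
Another salient feature of boundary $\partial\mathscr{E}$ is embodied in
Lemma 32. If point $x^*(t')$ of optimal trajectory $\Gamma^*$, given by $x^*(t)$, $t_0 \le t \le t_1$, belongs to boundary $\partial\mathscr{E}$, then $x^*(t'')$ belongs to $\partial\mathscr{E}$, $t' \le t'' \le t_1$.
As a consequence of Lemma 30, no point of $\Gamma^*$ on $[t', t_1]$ belongs to $\mathring{\mathscr{E}}$. Also, since $\Gamma^*$ reaches $x^1$, no point of $\Gamma^*$ belongs to $\mathrm{comp}\,\mathscr{E}$. Hence
$x^*(t) \notin \overset{\circ}{\mathrm{comp}\,\mathscr{E}}, \quad \forall t \in [t', t_1]$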
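But
$\mathring{\mathscr{E}} \cup \overset{\circ}{\mathrm{comp}\,\mathscr{E}} \cup \partial\mathscr{E} = E^{n+1}$
and so it follows that
$x^*(t) \in \partial\mathscr{E}, \quad \forall t \in [t', t_1]$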
7.214 A Fundamental Analogy
It is readily seen that Lemmas 30, 31, and 32 exhibit strong similarities to Theorem 1, Corollary 1, and Lemma 2, respectively, provided we invoke the correspondence between:
$\partial\mathscr{E}$ and $\Sigma$
$\mathring{\mathscr{E}}$ and $B/\Sigma$
$\overset{\circ}{\mathrm{comp}\,\mathscr{E}}$ and $A/\Sigma$
However, there is one difference between these corresponding regions;† namely, $\mathring{\mathscr{E}}$ or $\overset{\circ}{\mathrm{comp}\,\mathscr{E}}$ may be empty, whereas $B/\Sigma$ and $A/\Sigma$ are never empty. This difference plays a role in some of the derivations which follow. Many of the results which have been established for interior points of $\mathscr{E}^*$ will be seen to apply for points of boundary $\partial\mathscr{E}$, provided we replace $\Sigma$ by $\partial\mathscr{E}$, $A/\Sigma$ by $\overset{\circ}{\mathrm{comp}\,\mathscr{E}}$, and $B/\Sigma$ by $\mathring{\mathscr{E}}$. We shall not repeat the details of all derivations, but rather we shall list the pertinent properties of boundary points and point out their similarities to those of interior points.
7.215 Third Basic Assumption
Henceforth we shall make the following assumption: Let $\eta$ be a bound vector at point $x \in \partial\mathscr{E}$. We shall assume that for every vector $\eta$ there exists a scalar $\delta > 0$ such that for every $\varepsilon$, $0 < \varepsilon < \delta$, the point $x + \varepsilon\eta$ belongs either to region $\mathring{\mathscr{E}}$ or to region $\overline{\mathrm{comp}\,\mathscr{E}}$. This assumption is similar to the first basic assumption (see Sec. 7.61).
† It should also be noted that $\partial\mathscr{E}$, $\mathring{\mathscr{E}}$, and $\overset{\circ}{\mathrm{comp}\,\mathscr{E}}$ are unique, but there is a one-parameter family of $\Sigma$, $B/\Sigma$, and $A/\Sigma$. Thus, the correspondence is between $\partial\mathscr{E}$ and a given $\Sigma$, etc.
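7.216 Definition of Local Cones $\mathscr{C}_I(x)$ and $\mathscr{C}_O(x)$
Next let us define two cones associated with every point $x$ of $\partial\mathscr{E}$. Again let $\eta$ be a bound vector at point $x \in \partial\mathscr{E}$, and let
$\mathscr{C}_I(x) \triangleq \{x + \eta : \exists \alpha > 0 \text{ such that } \forall \varepsilon,\ 0 < \varepsilon < \alpha,\ x + \varepsilon\eta \in \mathring{\mathscr{E}}\}$  (7.149)
and
$\mathscr{C}_O(x) \triangleq \{x + \eta : \exists \beta > 0 \text{ such that } \forall \varepsilon,\ 0 < \varepsilon < \beta,\ x + \varepsilon\eta \in \overline{\mathrm{comp}\,\mathscr{E}}\}$  (7.150)
Cones $\mathscr{C}_O(x)$ and $\mathscr{C}_I(x)$ are local cones with vertices at point $x$. They correspond to cones $\mathscr{C}_A(x)$ and $\mathscr{C}_B(x)$, respectively. However, unlike $\mathscr{C}_O(x)$, cone $\mathscr{C}_I(x)$ may be empty; this happens if
$B(x) \cap \mathring{\mathscr{E}} = \emptyset$
where $B(x)$ is an open ball with center at $x \in \partial\mathscr{E}$ and sufficiently small radius. On the other hand, cone $\mathscr{C}_O(x)$ cannot be empty. For, if $X$ denotes the $x_0$-cylindrical line through point $x \in \partial\mathscr{E}$, then
$X \subset \partial\mathscr{E} \subset \overline{\mathrm{comp}\,\mathscr{E}}$
and hence
$x + \eta \in X \ \Rightarrow\ x + \eta \in \mathscr{C}_O(x)$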
7.217 Interior Points of $\mathscr{C}_I(x)$ and $\mathscr{C}_O(x)$; A Fourth Basic Assumption; Lemma 33
Here we shall invoke arguments similar to those of Sec. 7.63; namely, we shall say that $x + \eta$ is an interior point of $\mathscr{C}_I(x)$ (or $\mathscr{C}_O(x)$) if
(i) $x + \eta \in \mathscr{C}_I(x)$ (or $\mathscr{C}_O(x)$); and
(ii) there exists an open ball $B(x + \eta)$ in $E^{n+1}$ with center at $x + \eta$ such that all points of $B(x + \eta)$ belong to $\mathscr{C}_I(x)$ (or $\mathscr{C}_O(x)$).
Then we have
$\mathring{\mathscr{C}}_I(x) \triangleq \{x + \eta : x + \eta \text{ is interior point of } \mathscr{C}_I(x)\}$
$\mathring{\mathscr{C}}_O(x) \triangleq \{x + \eta : x + \eta \text{ is interior point of } \mathscr{C}_O(x)\}$
Our fourth basic assumption is the following: Let $x + \eta'$ be an interior point of $\mathscr{C}_I(x)$ (or $\mathscr{C}_O(x)$); namely, there exists an open ball $B(x + \eta')$ in $E^{n+1}$ which belongs to $\mathscr{C}_I(x)$ (or $\mathscr{C}_O(x)$). Then there exists an open ball $B'(x + \eta')$ in $E^{n+1}$ which belongs to $\mathscr{C}_I(x)$ (or $\mathscr{C}_O(x)$) and which has the property that for every point $x + \eta$ in $B'(x + \eta')$ there exists a positive number $\alpha$ (independent of $\eta$) such that for all $\varepsilon$, $0 < \varepsilon \le \alpha$, point $x + \varepsilon\eta$ belongs to $\mathring{\mathscr{E}}$ (or $\overline{\mathrm{comp}\,\mathscr{E}}$). The proof of the following lemma is similar to the proof of Lemma 4, provided we invoke the correspondence between the appropriate regions as discussed in Sec. 7.214.
Lemma 33. Consider a point $x \in \partial\mathscr{E}$, and a point $x + \eta' \in \mathring{\mathscr{C}}_O(x)$. Then there exists an open ball $B'(x + \eta') \subset \mathscr{C}_O(x)$ and a positive number $\beta$ such that for all $x + \eta \in B'(x + \eta')$ and all $\varepsilon$, $0 < \varepsilon < \beta$,
$x + \varepsilon\eta \in \overset{\circ}{\mathrm{comp}\,\mathscr{E}}$
Let us now recall that Wo(x) # 4, and let us dejine another local cone at x E d b . Let B(x) denote the boundary of Wo(x); that is, g(x) If q1(x) #
G?,(X) n comp Wo(x)
(7.151)
4, then it follows from definitions (7.149) and (7.150) that W0(x) u W,(X) = E"+l
(7.152)
Furthermore, we can readily see that Wo(x)
WdX)
=
4
(7.153)
For suppose that ~ o ( f-74 gl(x)
+4
and consider a point x
+ q E Wo(x)
and x
+ q E W1(x)
Then 3tl > O
such that VE, 0 < E < a ,
38 > 0 such that VE, 0 < E < p,
Suppose, for instance, that VE, O < E < a ,
tl
+E ~ E $ x + e q E comp F x
< B ; then x + ~ q i 6 , and
x+~q~comp&
which implies that
6 n comp 8 # But this is impossible, and hence (7.153) is correct.
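A small numerical illustration may help fix ideas about the two local cones and why they cannot overlap. The sketch assumes a concrete region that is not taken from the text, the closed unit disk in the plane, and it reads the two cones as "stays in the interior" and "stays in the closure of the complement" for all sufficiently small $\varepsilon$. It replaces "for all sufficiently small $\varepsilon$" by a finite sample of small values, so it is a heuristic check and not a proof. The function names and the sample values are hypothetical.

```python
import math

# Assumed region: closed unit disk in the plane (not from the text).
def in_interior(p):
    return math.hypot(p[0], p[1]) < 1.0

def classify(x, eta, eps_values=(1e-3, 1e-4, 1e-5, 1e-6)):
    """'I' if x + eps*eta is interior for every sampled eps,
    'O' if it is in the closure of the complement for every sampled eps,
    'mixed' otherwise (the sample is then inconclusive)."""
    flags = [in_interior((x[0] + e * eta[0], x[1] + e * eta[1])) for e in eps_values]
    if all(flags):
        return "I"
    if not any(flags):
        return "O"
    return "mixed"

boundary_point = (1.0, 0.0)
labels = {
    "inward": classify(boundary_point, (-1.0, 0.0)),
    "outward": classify(boundary_point, (1.0, 0.0)),
    "tangent": classify(boundary_point, (0.0, 1.0)),
    "steep_inward": classify(boundary_point, (-1.0, 5.0)),
}
```

A direction can never receive both labels, since a point is either interior or not; the tangent direction lands in 'O' because the disk curves away from its tangent line.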
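Finally, if
$\mathscr{C}_I(x) \ne \emptyset$
then
$\mathscr{C}_I(x) = \mathrm{comp}\,\mathscr{C}_O(x)$
and hence
$\mathscr{B}(x) = \bar{\mathscr{C}}_O(x) \cap \bar{\mathscr{C}}_I(x)$  (7.154)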
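7.219 Lemmas 34 and 35; Corollary 14
The proofs of the following two lemmas are similar to those of Lemmas 6 and 7. We shall not repeat the proofs, but merely state the lemmas.
Lemma 34. Let $\eta$ be a bound vector at a point $x \in \partial\mathscr{E}$. If $\eta = \eta(\varepsilon)$ is a function of a parameter $\varepsilon$ with the following properties:
(i) $\eta(\varepsilon) \to l$ as $\varepsilon \to 0$,
(ii) $\exists \gamma > 0$ such that $\forall \varepsilon$, $0 < \varepsilon < \gamma$, $\ x + \varepsilon\eta(\varepsilon) \in \overline{\mathrm{comp}\,\mathscr{E}}$
then
$x + l \in \bar{\mathscr{C}}_O(x)$
Lemma 35. Let $\eta$ be a bound vector at a point $x \in \partial\mathscr{E}$. If $\eta = \eta(\varepsilon)$ is a function of a parameter $\varepsilon$ with the following properties:
(i) $\eta(\varepsilon) \to l$ as $\varepsilon \to 0$,
(ii) $\exists \gamma > 0$ such that $\forall \varepsilon$, $0 < \varepsilon < \gamma$, $\ x + \varepsilon\eta(\varepsilon) \in \mathring{\mathscr{E}}$
and provided
$\mathscr{C}_I(x) \ne \emptyset$
then
$x + l \in \bar{\mathscr{C}}_I(x)$
The following corollary, which is the analog of Corollary 4, follows directly from Lemmas 34 and 35.
Corollary 14. Let $\eta$ be a bound vector at a point $x \in \partial\mathscr{E}$. If $\eta = \eta(\varepsilon)$ is a function of a parameter $\varepsilon$ with the following properties:
(i) $\eta(\varepsilon) \to l$ as $\varepsilon \to 0$,
(ii) $\exists \gamma > 0$ such that $\forall \varepsilon$, $0 < \varepsilon < \gamma$, $\ x + \varepsilon\eta(\varepsilon) \in \partial\mathscr{E}$
and provided
$\mathscr{C}_I(x) \ne \emptyset$
then
$x + l \in \mathscr{B}(x)$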
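7.2110 Lemmas 36 and 37
To deduce the next two lemmas we employ arguments similar to those used in the proof of Lemma 15. Consider a field line, say $L$, defined parametrically by
$x = x(t), \quad t \in (-\infty, \infty)$  (7.155)
and let
$\eta_+(\Delta t) \triangleq \Delta x_+ / \Delta t$
where
$\Delta x_+ \triangleq x(t_i + \Delta t) - x(t_i), \quad x(t_i) = x, \quad \Delta t > 0$
Clearly
$x + \Delta x_+ \in L$
so that $\eta_+(\Delta t)$ is a continuous function of $\Delta t$, and
$\eta_+(\Delta t) \to f(x, u^b), \quad u^b \in \Omega, \quad \text{as } \Delta t \to 0$  (7.156)
Suppose now that
$x \in \partial\mathscr{E}$
Then it follows from Lemma 30 that
$x + \Delta x_+ = x + \eta_+(\Delta t)\Delta t \in \overline{\mathrm{comp}\,\mathscr{E}}$  (7.157)
Condition (7.157) together with Lemma 34 leads at once to
Lemma 36. At a point $x \in \partial\mathscr{E}$
$x + f(x, u) \in \bar{\mathscr{C}}_O(x)$
for all $u \in \Omega$.
If we now assume that
$\mathscr{C}_I(x) \ne \emptyset$
then we can utilize arguments similar to those employed in Sec. 7.111, and as a consequence of Lemma 35 we can state
Lemma 37. If $\mathscr{C}_I(x) \ne \emptyset$ at a point $x \in \partial\mathscr{E}$, then
$x - f(x, u) \in \bar{\mathscr{C}}_I(x)$
for all $u \in \Omega$.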
7.2111 Lemma 38
If
$\mathscr{C}_I(x) \ne \emptyset$
then arguments similar to those of Sec. 7.112 can be used in conjunction with Corollary 14 to prove
Lemma 38. If $\mathscr{C}_I(x) \ne \emptyset$ at a point $x \in \partial\mathscr{E}$, then
$x + f_+^* \in \mathscr{B}(x)$
$x - f_-^* \in \mathscr{B}(x)$
provided $f_+^*$ and $f_-^*$, respectively, are defined.
7.2112 Separability of Local Cones $\bar{\mathscr{C}}_O(x)$ and $\bar{\mathscr{C}}_I(x)$
Henceforth we shall assume that
$\mathscr{C}_I(x) \ne \emptyset$
and we shall introduce some definitions which are similar to those introduced in Sec. 7.12. We shall say that an $n$-dimensional hyperplane $\mathscr{T}(x)$, containing a point $x \in \partial\mathscr{E}$, is an $n$-dimensional separating hyperplane of closed cone $\bar{\mathscr{C}}_O(x)$ (or $\bar{\mathscr{C}}_I(x)$), if every point $x + \eta \in \bar{\mathscr{C}}_O(x)$ (or $\bar{\mathscr{C}}_I(x)$) lies in one of the closed half spaces determined by $\mathscr{T}(x)$. The corresponding closed half space will be denoted by $\bar{R}_O$ (or $\bar{R}_I$), and the corresponding open half space by $R_O$ (or $R_I$). Then, if there exists an $n$-dimensional separating hyperplane of $\bar{\mathscr{C}}_O(x)$ (or $\bar{\mathscr{C}}_I(x)$), we shall say that cone $\bar{\mathscr{C}}_O(x)$ (or $\bar{\mathscr{C}}_I(x)$) is separable.
7.2113 Cone of Normals at Boundary Point
If $\bar{\mathscr{C}}_O(x)$ (or $\bar{\mathscr{C}}_I(x)$) is separable, we shall consider a separating hyperplane $\mathscr{T}(x)$ and, at point $x \in \partial\mathscr{E}$, a bound vector $n(x)$, $|n(x)| = 1$, which is normal to $\mathscr{T}(x)$. Furthermore,
(i) if $\bar{\mathscr{C}}_O(x)$ is separable, we shall choose $n(x)$ such that
$x + n(x) \in \mathrm{comp}\,\bar{R}_O$  (7.158)
(ii) if $\bar{\mathscr{C}}_I(x)$ is separable, we shall choose $n(x)$ such that
$x + n(x) \in R_I$  (7.159)
As in Sec. 7.122, we shall consider the set $\{n(x)\}$ of vectors $n(x)$ for all separating hyperplanes of $\bar{\mathscr{C}}_O(x)$ (or $\bar{\mathscr{C}}_I(x)$), and define the cone of normals
$\mathscr{C}_n(x) \triangleq \{x + kn(x) : k > 0,\ n(x) \in \{n(x)\}\}$  (7.160)
Here we know that the $x_0$-cylindrical line $X$ through point $x \in \partial\mathscr{E}$ belongs to $\partial\mathscr{E}$. Thus, if $\mathscr{T}(x)$ exists,
$X \subset \mathscr{T}(x)$
In other words, a separating hyperplane $\mathscr{T}(x)$ (of $\bar{\mathscr{C}}_O(x)$ or $\bar{\mathscr{C}}_I(x)$) is "vertical." Hence, $n(x)$ is normal to $X$, and the cone of normals $\mathscr{C}_n(x)$ belongs to a hyperplane which is perpendicular to the $x_0$-axis.
7.2114 Regular and Nonregular Points of the Boundary
We shall say that $x \in \partial\mathscr{E}$ is a regular point of the boundary, if both $\bar{\mathscr{C}}_O(x)$ and $\bar{\mathscr{C}}_I(x)$ are separable. On the other hand, if not both $\bar{\mathscr{C}}_O(x)$ and $\bar{\mathscr{C}}_I(x)$ are separable, point $x$ will be called a nonregular point of $\partial\mathscr{E}$. The discussion of Sec. 7.13 is applicable, provided one invokes the correspondence between appropriate quantities as specified in Sec. 7.214. The notions of regular and nonregular points of the boundary, together with their properties, are valuable since they permit one to extend the validity of Lemmas 20 and 21 to optimal trajectories which contain boundary points.
7.2115 Properties of Local Cones at Boundary Points; Lemmas 39-41; Theorems 9 and 10
One can easily prove
Lemma 39. Let $l$ be a bound vector at point $x \in \partial\mathscr{E}$. If $x + l \in \mathscr{B}(x)$ and $N(x)$ is a conic neighborhood, that is,
$N(x) \triangleq \{x + k\eta : k > 0,\ x + \eta \in B(x + l)\}$
where $B(x + l)$ is an open ball in $E^{n+1}$ with center at $x + l$, then
$N(x) \cap \mathscr{C}_O(x) \ne \emptyset$
and
$N(x) \cap \mathscr{C}_I(x) \ne \emptyset$
The proof is similar to that of Lemma 8. Suppose that the lemma is false and that
$N(x) \cap \mathscr{C}_O(x) = \emptyset$
But, according to the results of Sec. 7.218,
$\mathscr{C}_I(x) \ne \emptyset \ \Rightarrow\ \mathscr{C}_I(x) = \mathrm{comp}\,\mathscr{C}_O(x)$
so that
$N(x) \subset \mathscr{C}_I(x)$  (7.161)
In view of the definitions of $N(x)$ and $\mathscr{C}_I(x)$, relation (7.161) implies that
$x + l \in \mathring{\mathscr{C}}_I(x)$
However, this contradicts the hypothesis of the lemma, which states that $x + l \in \mathscr{B}(x)$.
Similarly, by supposing that
$N(x) \cap \mathscr{C}_I(x) = \emptyset$
we again arrive at a contradiction. Thus the lemma is established. Next we deduce the analog to Lemma 20, that is,
Lemma 40. Let $x^*(t')$ and $x^*(t'')$, $t_0 \le t' \le t'' \le t_1$, be two points of optimal trajectory $\Gamma^*$, corresponding to control $u^*(t)$ and solution $x^*(t)$, $t_0 \le t \le t_1$, such that $x^*(t') \in \partial\mathscr{E}$ (and consequently $x^*(t'') \in \partial\mathscr{E}$). Let $\eta'$ be a vector at $x^*(t')$, and $\eta''$ its transform due to linear transformation $A(t', t'')$; namely,
$\eta'' = A(t', t'')\eta'$
If
$x^*(t') + \eta' \in \bar{\mathscr{C}}_O(x^*(t'))$
then
$x^*(t'') + \eta'' \in \bar{\mathscr{C}}_O(x^*(t''))$
Let us first consider
$x^*(t') + \eta' \in \mathscr{C}_O(x^*(t'))$
According to definition (7.150) of $\mathscr{C}_O(x)$, there exists a positive number $\beta$ such that for all $\varepsilon$, $0 < \varepsilon < \beta$,
$x^*(t') + \varepsilon\eta' \in \overline{\mathrm{comp}\,\mathscr{E}}$  (7.162)
Now consider a trajectory which starts at point $x^*(t') + \varepsilon\eta'$, and which is generated by the same control as $\Gamma^*$, that is, $u^*(t)$, $t_0 \le t \le t_1$, on $[t', t'']$. This trajectory is defined by the corresponding solution of trajectory equation (7.44), which is
$x(t) = x^*(t) + \varepsilon\eta(t) + o(t, \varepsilon)$
where $o(t, \varepsilon)/\varepsilon$ tends to zero uniformly for all $t \in [t', t'']$ as $\varepsilon \to 0$, and $\eta(t) = A(t', t)\eta'$.
From (7.162), together with Lemmas 30 and 31, we have
$x(t'') = x^*(t'') + \varepsilon\eta(t'') + o(t'', \varepsilon) \in \overline{\mathrm{comp}\,\mathscr{E}}$
for all $\varepsilon$, $0 < \varepsilon < \beta$. Furthermore,
$\eta(t'') + \dfrac{o(t'', \varepsilon)}{\varepsilon} \to \eta(t'') \triangleq \eta'' \quad \text{as } \varepsilon \to 0$
Consequently, it follows from Lemma 34 that
$x^*(t'') + \eta'' \in \bar{\mathscr{C}}_O(x^*(t''))$
Thus, if $\bar{\mathscr{C}}_O(x^*(t')) = \mathscr{C}_O(x^*(t'))$ the lemma is established. Thus far, the proof is entirely analogous to the first part of the proof of Lemma 20. The remainder of the proof is equally similar to the remainder of the proof of Lemma 20; we shall not repeat the whole derivation. Let us only note that the rest of the proof begins with the supposition that
$x^*(t') + \eta' \in \mathscr{B}(x^*(t'))$
Then, employing Lemma 39 which corresponds to Lemma 8, one arrives at
$x^*(t'') + \eta'' \in \bar{\mathscr{C}}_O(x^*(t''))$
This concludes the demonstration of Lemma 40. By arguments analogous to those used in the proof of Lemma 21, one can also establish
Lemma 41. Let $x^*(t')$ and $x^*(t'')$, $t_0 \le t' \le t'' \le t_1$, be two points of optimal trajectory $\Gamma^*$, corresponding to control $u^*(t)$ and solution $x^*(t)$, $t_0 \le t \le t_1$, such that $x^*(t') \in \partial\mathscr{E}$ (and consequently $x^*(t'') \in \partial\mathscr{E}$). Let $\eta''$ be a vector at $x^*(t'')$ such that
$x^*(t'') + \eta'' \in \bar{\mathscr{C}}_I(x^*(t''))$
Then $\eta''$ is the transform, due to linear transformation $A(t', t'')$, of a vector $\eta'$ at $x^*(t')$ such that
$x^*(t') + \eta' \in \bar{\mathscr{C}}_I(x^*(t'))$
Now one arrives readily at the analogs to Theorems 2 and 3. Let $x^*(t')$ and $x^*(t'')$, $t_0 \le t' \le t'' \le t_1$, be two points of optimal trajectory $\Gamma^*$, corresponding to control $u^*(t)$ and solution $x^*(t)$, $t_0 \le t \le t_1$, such that $x^*(t') \in \partial\mathscr{E}$ (and consequently $x^*(t'') \in \partial\mathscr{E}$). Then we have
Theorem 9. If $\bar{\mathscr{C}}_I(x^*(t'))$ is separable, then $\bar{\mathscr{C}}_I(x^*(t''))$ is separable.
and
Theorem 10. If $\bar{\mathscr{C}}_O(x^*(t''))$ is separable, then $\bar{\mathscr{C}}_O(x^*(t'))$ is separable.
The proofs of these theorems are similar to those of Theorems 2 and 3. By analogy to the properties of optimal trajectories at interior points of $\mathscr{E}^*$, one can obtain many interesting properties of optimal trajectories at boundary points. Thus far we have stated some of the salient properties. However, we have not as yet defined the notion of a nice boundary; hence, we have not stated associated results. Some properties of an optimal trajectory at boundary points exhibit special features. We shall now turn to some of them.
7.2116 A Maximum Principle (Abnormal† Case); Theorem 11
We are now ready to state
Theorem 11. If $u^*(t)$, $t_0 \le t \le t_1$, is an optimal control, and $x^*(t)$ is the corresponding solution of trajectory equation (7.44), and if $x^*(t)$ is a regular point of boundary $\partial\mathscr{E}$ for all $t \in [t_0, t_1]$, then there exists a nonzero continuous vector function $\lambda(t)$ which is a solution of adjoint equations (7.79), such that
(i) $\sup_{u \in \Omega} \mathscr{H}(\lambda(t), x^*(t), u) = \mathscr{H}(\lambda(t), x^*(t), u^*(t))$;
(ii) $\mathscr{H}(\lambda(t), x^*(t), u^*(t)) = 0$;
(iii) $\lambda_0(t) = \text{const} = 0$;
for all $t \in [t_0, t_1]$.
Conditions (i) and (ii) of the theorem can be proved by arguments similar to those used to establish conditions (i) and (ii) of Theorem 4. In particular, consider the separating plane $\mathscr{T}(x^0)$ of both $\bar{\mathscr{C}}_I(x^0)$ and $\bar{\mathscr{C}}_O(x^0)$ at initial point $x^0 = x^*(t_0)$. In view of the proof of Theorem 9, the transform of $\mathscr{T}(x^0)$ due to linear transformation $A(t_0, t)$, $t_0 \le t \le t_1$, is the separating hyperplane $\mathscr{T}(x)$ of $\bar{\mathscr{C}}_I(x) = \bar{R}_I(x)$, $x = x^*(t)$. Next consider the solution $\lambda(t)$, $t_0 \le t \le t_1$, of adjoint equations (7.79) with initial condition $\lambda(t_0) = \lambda^0$ such that $\lambda^0 \ne 0$, $\lambda^0$ normal to $\mathscr{T}(x^0)$ and directed into $R_I(x^0)$. Then $\lambda(t)$ is normal to $\mathscr{T}(x^*(t))$, $t_0 \le t \le t_1$. The remainder of the proof is similar to that of Theorem 4. Condition (iii) is a consequence of the concluding remark of Sec. 7.2113; namely, $X \subset \mathscr{T}(x^*(t))$ and $n(x^*(t))$ is normal to $X$. Consequently, $\lambda(t)$ is normal to $X$ and so $\lambda_0(t) = 0$.
† This nomenclature is adopted from the classical calculus of variations where the problem with $\lambda_0 = 0$ is dubbed "abnormal." See case (iii) of Sec. 7.243.
7.22 Boundary and Interior Points of $\mathscr{E}^*$
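In general, some points of an optimal trajectory may be interior points while others may be boundary points of region $\mathscr{E}^*$. Thus, we require a theory which applies for both interior and boundary points. We begin by defining two regions for each limiting surface $\Sigma$; namely,
$\mathscr{E}_B \triangleq \mathscr{E} \cap (B/\Sigma)$  (7.163)
and
$\mathscr{E}_A \triangleq \mathrm{comp}\,\mathscr{E}_B$  (7.164)
Next we consider the boundary of $\mathscr{E}_A$ and $\mathscr{E}_B$, that is,
$\Xi \triangleq \bar{\mathscr{E}}_B \cap \overline{\mathrm{comp}\,\mathscr{E}_B} = \bar{\mathscr{E}}_A \cap \bar{\mathscr{E}}_B$  (7.165)†
Since
$\mathscr{E}_B = B/\Sigma$
we have
$\mathscr{E}_B \ne \emptyset$  (7.166)
Moreover, $\mathscr{E}_B$ is not all of space $E^{n+1}$; hence
$\mathscr{E}_A \ne \emptyset$  (7.167)
A point $x \in \mathscr{E}_A$ is an interior point of $\mathscr{E}_A$ if there exists an open ball $B(x)$ in $E^{n+1}$, whose center is at $x$ and all of whose points belong to $\mathscr{E}_A$. The set of all interior points of $\mathscr{E}_A$ is $\mathring{\mathscr{E}}_A$. The union of the set of all points of $\mathscr{E}_A$ and of all limit points of $\mathscr{E}_A$ is the closure $\bar{\mathscr{E}}_A$ of $\mathscr{E}_A$ in $E^{n+1}$. Entirely analogous remarks apply to $\mathscr{E}_B$. Similarly, we designate the closures in $E^{n+1}$ of $\Sigma$, $A/\Sigma$ and $B/\Sigma = \mathscr{E}_B$, respectively, by $\bar{\Sigma}$, $\overline{A/\Sigma}$ and $\overline{B/\Sigma}$. Let us note that
$\mathscr{E}_A \triangleq \mathrm{comp}\,\mathscr{E}_B = \mathrm{comp}[\mathscr{E} \cap (B/\Sigma)] = (\mathrm{comp}\,\mathscr{E}) \cup \mathrm{comp}(B/\Sigma)$  (7.168)
Since‡ $(B/\Sigma) \cup \Sigma \cup (A/\Sigma) = \mathscr{E}$ and $\mathscr{E} \cup \mathrm{comp}\,\mathscr{E} = E^{n+1}$ we have
$\mathrm{comp}(B/\Sigma) = \Sigma \cup (A/\Sigma) \cup \mathrm{comp}\,\mathscr{E}$  (7.169)
† Note there exists a one-parameter family of boundaries $\{\Xi\}$ corresponding to the one-parameter family of limiting surfaces $\{\Sigma\}$.
‡ More precisely, $\{B/\Sigma, \Sigma, A/\Sigma\}$ is a partition of $\mathscr{E}$, and $\{\mathscr{E}, \mathrm{comp}\,\mathscr{E}\}$ is a partition of $E^{n+1}$.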
Thus it follows from (7.168) that
$\mathscr{E}_A = \Sigma \cup (A/\Sigma) \cup \mathrm{comp}\,\mathscr{E}$  (7.170)
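Relations (7.168) to (7.170) are pure set algebra (De Morgan's law plus the two partitions), so they can be checked on any finite model. The sketch below uses small integer sets as stand-ins; the particular sets and the names `universe`, `region`, `below`, `surface`, and `above` are hypothetical and carry no geometric meaning.

```python
# Finite stand-ins (assumption): integers play the role of points of E^{n+1}.
universe = set(range(12))      # whole space
region = set(range(0, 9))      # the region partitioned by the limiting surface
below = {0, 1, 2, 3}           # B/Sigma
surface = {4, 5}               # Sigma
above = {6, 7, 8}              # A/Sigma

def comp(s):
    """Complement relative to the whole space."""
    return universe - s

e_b = region & below           # definition (7.163)
e_a = comp(e_b)                # definition (7.164)

eq_7_168 = e_a == comp(region) | comp(below)
eq_7_169 = comp(below) == surface | above | comp(region)
eq_7_170 = e_a == surface | above | comp(region)
```

The three booleans are true for any choice of sets in which `below`, `surface`, and `above` partition `region`, which is exactly the hypothesis stated in the footnote.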
7.221 Lemma 42
One can now prove
Lemma 42. Let $x(t')$ and $x(t'')$, $t'' > t'$, be points of a trajectory in $E^{n+1}$. If $x(t')$ belongs to $\mathscr{E}_A$, then $x(t'')$ cannot belong to $\mathscr{E}_B$.
In view of (7.170)
$x(t') \in \mathscr{E}_A \ \Rightarrow\ x(t') \in \mathrm{comp}\,\mathscr{E} \ \text{ or } \ x(t') \in \Sigma \cup (A/\Sigma)$
If
$x(t') \in \mathrm{comp}\,\mathscr{E}$
then it follows from Lemma 29 that
$x(t'') \notin \mathscr{E}_B, \quad t'' > t'$
If
$x(t') \in \Sigma \cup (A/\Sigma)$
then it follows from Theorem 1 and Corollary 1 that
$x(t'') \notin B/\Sigma, \quad t'' > t'$
In view of (7.163), Lemma 42 is then established.
7.222 Lemmas 43 and 44
Next we shall deduce
Lemma 43. Let $x' \triangleq x(t')$ and $x'' \triangleq x(t'')$, $t'' > t'$, be points of a trajectory $\Gamma$ (optimal or nonoptimal) in $E^{n+1}$. If $x' \in \Xi$ then $x'' \notin \mathring{\mathscr{E}}_B$.
The proof of this lemma is similar to that of Lemma 30. First of all, note that $\mathscr{E}_A \triangleq \mathrm{comp}\,\mathscr{E}_B \ne \emptyset$ and, if $B(x')$ is an open ball in $E^{n+1}$ with center at $x'$, then
$B(x') \cap \mathscr{E}_A \ne \emptyset$
no matter how small the radius of $B(x')$. For, if $B(x') \cap \mathscr{E}_A = \emptyset$ then $B(x') \subset \mathscr{E}_B$ and hence $x'$ is an interior point of $\mathscr{E}_B$. But this contradicts the hypothesis of the lemma. Let $\varepsilon\eta'$, $\varepsilon > 0$, be a vector at point $x'$, such that
$x' + \varepsilon\eta' \in \mathscr{E}_A$  (7.171)
and consider a trajectory $\Gamma'$ which starts at point $x' + \varepsilon\eta'$ and which is generated by the same control which generates $\Gamma$ on $[t', t'']$. At time $t'' > t'$, the point of $\Gamma'$ is
$x'' + \varepsilon\eta'' + o(t'', \varepsilon)$
where $\eta'' = A(t', t'')\eta'$. Suppose now that
$x'' \in \mathring{\mathscr{E}}_B$
Then there exists an $\alpha > 0$ such that for all $\varepsilon$, $0 < \varepsilon < \alpha$,
$x'' + \varepsilon\eta'' + o(t'', \varepsilon) \in \mathscr{E}_B$
However, according to Lemma 42 with (7.171), this is not possible; and so Lemma 43 is established. Next we prove
Lemma 44. A trajectory whose initial point belongs to $\mathring{\mathscr{E}}_A$ has no point on boundary $\Xi$, nor, indeed, a point in $\mathscr{E}_B$.
The demonstration of this lemma is similar to that of Lemma 31. Let $B(x)$ be an open ball in $E^{n+1}$ with center at point $x$. Then it is readily shown that
$x \in \Xi \ \Rightarrow\ B(x) \cap \mathscr{E}_B \ne \emptyset$  (7.172)
Now consider points $x' \triangleq x(t')$ and $x'' \triangleq x(t'')$, $t'' > t'$, of trajectory $\Gamma$. Suppose that
$x' \in \mathring{\mathscr{E}}_A$
and that
$x'' \in \Xi$
Consider also an open ball $B(x'')$ in $E^{n+1}$ with center at $x''$ and radius $\rho$. According to (7.172) we may choose a point
$x'' + \varepsilon\eta'' \in B(x'') \cap \mathscr{E}_B$
Let $\Gamma'$ be a trajectory which is generated by the same control that generates $\Gamma$ on $[t', t'']$, and which passes through $x'' + \varepsilon\eta''$ at time $t''$. At time $t'$ the point of $\Gamma'$ is
$x' + \varepsilon\eta' + o(t', \varepsilon)$  (7.173)
where $o(t', \varepsilon)/\varepsilon$ tends to zero uniformly as $\varepsilon \to 0$, and
$\eta' = A(t'', t')\eta''$  (7.174)
Now, since $x' \in \mathring{\mathscr{E}}_A$ and $x'' + \varepsilon\eta'' \in B(x'')$, it follows from (7.173) and (7.174) that
$x' + \varepsilon\eta' + o(t', \varepsilon) \in \mathscr{E}_A$
for sufficiently small radius $\rho$. But this is impossible because of Lemma 42 with
$x'' + \varepsilon\eta'' \in \mathscr{E}_B$
Hence
$x' \in \mathring{\mathscr{E}}_A \ \Rightarrow\ x'' \notin \Xi$
Also, Lemma 42 leads at once to the result that
$x' \in \mathring{\mathscr{E}}_A \ \Rightarrow\ x'' \notin \mathscr{E}_B$
Thus, Lemma 44 is established.
7.223 An Assumption Concerning Boundary $\Xi$; Lemma 45
Henceforth we shall assume that
$(A/\Sigma) \cap \Xi = \emptyset$  (7.175)
However and hence
Furthermore and
so that, in view of assumption (7.175), we have
8, n ( A / C ) = 4
(7.176)
An important property of boundary Z is embodied in
Lemma 45. Let r* be an optimal trajectory given by x*(t), to I t 5 t,. If x*(t’), t‘ E [ t o ,tl], belongs to a boundary E,and iJ ( A / C ) n Z = 4, then x*(t) belongs to E for all t , t’ It 5 t , . According to definition (7.165) of E,
Then it follows from Lemma 43 that x*(t”) Since gA= comp
b,,
t” > t‘
(7.177)
b,, condition (7.177) implies that x*(t”)E Z A , t“ > t‘
(7.178)
On the other hand, x*(t’) necessarily belongs to region &, so that x*(t’)E Z implies that x*(t’) E 8 n ZUB But & = (B/C) u C u ( A / Z ) and hence x*(t’)E [((BIZ) u C) n I , ] u [ ( A / C )n ZBl Thus, in view of (7.176), x*(t’) E [(B/C)u C] n 2,
Moreover, it follows from the definition of B/C that all points of I:are limit points of BIZ. But 8, = B/C, SO that B/C c 2,
cc6,
(7.179)
8B
Consequently, [(BIZ)u C] n 8 ,
so that
= (B/C)u
x*(t’) E (BIZ) u C
C
(7.180)
Then it follows from the global properties of a limiting surface, embodied in Lemma 2, that t” > t‘
x*(t”)E (BIZ) u C, and hence, by (7.179), that
t“ > t’
x*(t”)E I , ,
(7.181)
Finally, (7.178) and (7.180) lead to x*(t”)E Zja n Z B = E,
t“ > t‘
which establishes Lemma 45.
7.224 Another Fundamental Analogy It is readily seen now that Lemmas 43,44, and 45 are analogous to Lemmas 30, 31, and 32, and to Theorem 1, Corollary 1, and Lemma 2, respectively, provided one invokes the correspondence between :
E
6,
and
ab
and
d
or C or B]C 0
6,
and
n
c o m p d or A/C
and provided one introduces assumptions which correspond to those introduced earlier in the discussion of interior points. Under these provisos, the properties deduced previously for interior points of $\mathscr{E}^*$, and particularly those of optimal trajectories at interior points, are equally valid for boundary points of $\mathscr{E}^*$. The sole exception to this statement pertains to those properties which depend on the concept of a nice $\Sigma$ surface, since we have not as yet introduced the notion of a nice boundary E. Before taking up this latter concept, let us retrace some other steps of our earlier discussion.
7.225 Some Basic Assumptions; Local Cones %?,(x)
and Wb(x)
We shall now introduce a basic assumption which is similar to the first and third basic assumptions of Sec. 7.61 and 7.215, respectively. Letq be a bound vector at point x E E. We shall assume that for every vector q there exists a scalar 6 > 0 such that for every E , 0 < E < 6, the point x + ~q belongs either to region 2, or to region 6,.
Now we define two local cones, namely, %?,(x) 2 ( x
+ q : 3ci > 0
such that VE, 0 < E < ci,
x + ~q E a,}
(7.182)
and Wb(x) Lk
{x + q : 3p > 0 such that VE, 0 < E < p,
x
+ q E b,}
(7.183)
In view of (7.170), we have
d,
=
c v ( A T ) v comp d
Furthermore, ifL+ denotes the Lalf-ray which emanates from x E E,and which is parallel to the xo-axis and points into the positive xo-direction, then L, c comp 8 u
(q)
whence L , c ga Consequently, U,(x) #
&,
vx EE
(7.184)
&,
vx E s
(7.185)
Moreover, we shall assume that U,(x) #
By arguments similar to those of Sec. 7.217, one can define interior points of %?:,(x) and U,,(X),and then open cones @,(x) and @b(x). Finally, we now introduce another basic assumption which is analogous to the fourth basic assumption stated in Sec. 7.217; we need only replace g0(x) by V J x ) , and Ul(x) by ub(x)’ 7.226 Local Cone G ( x )
By analogy to cone g ( x ) defined in Sec. 7.218, we now have
The properties of cones U a ( x ) , %?b(X),and a ( x ) can be deduced by arguments similar to those employed for cones g 0 ( x ) , %?,(x), and g ( x ) , respectively. One arrives at analogous lemmas. We shall not repeat these derivations; rather, we shall discuss some of the salient features of boundary E.
7.227 Tangent Cone U,(x) at a point of E -
We shall say that a unit vector tE is tangent to boundary E at a point x in E , if the following conditions are fulfilled : (i) There exists a vector function (q~)and a positive scalar function m(e), both of the same parameter E , such that
(ii) There exists an infinite sequence and E ~ + O as k - + c o }
S s A { ~ : ~ = ~i =i l, , 2 ,..., k,
and a positive number the point
CI
such that, for all
E E
S, and 0 < E
< a,
x + m(E)q(e)E E
Next we define the tangent cone %?=(x) of E a t x; namely,
WZ(x) {x + kt, :k > 0, Vt,) Properties of this cone can be derived readily by arguments similar to those of Sec. 7.8.t
7.228 A Nice Boundary %; Lemma 46 Let us now turn to the concept of a nice boundary E. In preparation, we shall deduce a lemma. Consider a point x E Z and an open ball B(x) in E"+l with center at x. As a consequence of definition (7.182) of %?*(x) together with the assumption that gb(x)is not empty, we have
vx E E ~ ( xn ) 6, z 4, no matter how small the radius of B(x). If we assume, furthermore, that ~ ( xn ) 6,
z 4,
vx E
z
no matter how small the radius of B(x), we can prove Lemma 46. If
~ ( xn ) 6, x, E B ( X ) n d , x,
E
then there exists a positive number CI such that, for every pair of points xA and + (1 - C I ) X E~ Z.
x,, the point
-
t If Bi is a subset of E and x E Ei,one can also define tangent
cone VXI(x) at point x.
vector tSi and tangent
Let us note, first of all, that
E = comp(d, u 6,) Indeed, consider a point x
6 6,
u
(7.187)
d B; namely,
6 &A
(7.188)
X 4 d B
(7.189)
Since &, 4comp b,, whence 8, = comp dBand 6 , = comp g A , it follows from (7.188) and (7.189), respectively, that x E g B and x E d, , and hence that x E d, n Z B 4E. This establishes (7.187). Our subsequent arguments are similar to those utilized in the proof of Lemma 9. Let L be the line segment (connected portion of a straight line) which joins points x, and x B .If Lemma 46 is false, then it follows from (7.187) that a point of L belongs to one of the two sets A , & L n 8,-
A, A L n dB
so that
A, v AB=L Let d(xi,xi) denote the distance between xi E L and x, EL, and consider the two sets ( x : x = x ~ ~ E A ,i,= 1,2, ..., k } and { x : x = x i B ~ A Bi,= 1,2, ..., k ) constructed as follows: Consider the midpoint of L ; it belongs either to A, or to AB . If it belongs to A, , we shall denote it by xIAand let xIB= xB. If it belongs to AB , we shall denote it by x1 and let xIA= xA . Then we consider the midpoint of the segment of L between xIAand xIB,and proceed as before. As in Sec. 7.74, we obtain two sets of points, X: and xiB,i = I , 2, ... , k, such that XkA and xkBtend to the same limit, x L , as k increases; that is, XkA
* XL
XkB
3
XL
as k j c o
According to our supposition, x L belongs either to A, or to A B . Suppose, for instance, that x L E A, and hence xLE 6,. Then there exists a 6 > 0 such that an open ball B(xL) in En", having its center at xLand radius p, 0 < p < 6,
belongs to
6,.
But this contradicts the fact thatf
xkB-+xL as k + c o Thus, our supposition is incorrect, namely,
LnE#4 whence follows Lemma 46. Note now that Z possesses a property which is analogous to property (ii) of a nice I: defined in Sec. 7.9. However, property (ii) of a nice C is introduced by way of the definition, whereas the analogous property of E is assured by Lemma 46. Now we can define a nice boundary E as follows : Boundary E is nice if there exists a partition {El, E2,... , G ~ of } 2 such that at every point x in Zi, i = 1 , 2, ... ,p, the tangent cone g Z i ( x )is defined and belongs to a k-dimensional plane T,,(x), k I n, through x.
-
7.229 Concluding Remarks

One can readily verify that all the properties which we obtained for interior points of a nice $\Sigma$ surface are also valid for points of a nice boundary E, provided one invokes the correspondence specified in Sec. 7.224 together with the assumptions introduced throughout Sec. 7.22.

7.23 Degenerated Case
We should not close this chapter without saying a few words about the degenerated case; namely, the case in which either (i) J , ( x > = 4, or (ii) J,(x) = 4 at an interior point of b*
n
In case (i) point x will be called B-degenerated, and in case (ii) it will be called A-degenerated.
7.231 B-Degenerated Point; Lemmas 47-49 Let us consider case (i) and prove
Lemma 47. Let X ' A x*(t') and X " g x*(t"), t" 2 t', be points oj' optimal trajectory r*.I f x' is a B-degenerated point and X " is an interior point of b*, then xn is also B-degenerated. According to hypothesis of the lemma, =
4
@B(x")#
4
@B(X')
(7.190)
Suppose the lemma is false, that is, and consider a vector q" at point x" such that X"
+ q" E @,(x")
According to Corollary 7,q" is the transform, due to linear transformation A(t', t"), of a vector q' at x' such that
x'
+ 11' E
%?B(X')
This contradicts (7.190) and so establishes Lemma 47. Another interesting lemma is
Lemma 48. Let x' x*(t') and X " 4 x*(t"), t" 2 t', be points of optimal trajectory I'*.I f x' is a B-degenerated point, and ifq" is a vector at point x'' such that X"
+ q"E (Yx")
then q" is the transform, due to linear transformation A(t', t"), of a vector q' at x' such that x'
+ q' E Y ( x ' )
In view of Lemma 21 together with the second hypothesis of the lemma, we have
x'
+ 11' E @',(X')
But as a consequence of the lemma's first hypothesis, it follows that gB(X')
and so the lemma is proved.
= Y(x')
@,(XI)
=
4.
Thus
One can easily demonstrate the validity of
Lemma 49. A t a B-degeneratedpoint x of a limiting surface C,
x - f(x, u) E 9 ( x ) Since @,(x)
= Y(x),
for all u E R
Lemma 49 is a direct consequence of Lemma 15.
7.232 A-Degenerated Point; Lemmas 50-52 In a manner entirely analogous to that used in proving the lemmas of Sec. 7.231, but employing Corollary 6 , Lemma 20, and Lemma 15, respectively, one can deduce
Lemma 50. Let x‘ x*(t‘) and x” g x*(t”), t” 2 t’, be points of optimal trajectory r*.If x” is an A-degenerated point and is an interior point of d*, then x’ is an A-degeneratedpoint. XI
Lemma 51. Let x’ A x*(t’) and X” 4 x*(t”), t” 2 t‘, be points of optimal trajectory r*.Ifx” is an A-degeneratedpoint, and ifq‘ is a vector at point x’ such that
x‘ + q’E 9(x‘) then the transform of q’,due to linear transformation A(t’, t”),is a vector q ” at
x”,such that x” + q” E 9(x”) Lemma 52. A t an A-degeneratedpoint x of a limiting surface X,
x
+ f(x, u) E 9(x)
f o r all u E SZ
7.233 Corollary 15 Consider a nonzero vector q at point x. In view of the discussion of Sec. 7.71, if
x
+ cq E L , ,
E
> 0, then x + q E gA(x)
and if
x+eqiL-,
c>0,
then x+qeEB(x)
However, if and if
gPA(x) = (b
then gA(x)c 9 ( x )
@,(x) = (b
then VB(x)c 9 ( x )
Thus we arrive at Corollary 15. At an A-degeneratedpoint x of a limiting surface C ,
x+EqEL+,
E>O,
*x+qE9(x)
and at a B-degeneratedpoint x of a limiting surface C ,
x+EqEL-,
E>0,
*x+qEY(X)
7.234 A Trivial Maximum Principle? Let us suppose that (i) the initial point x*(to)= xo of optimal trajectory r*is a B-degenerated point; (ii) 9(xo) s T(xo) where T(xo) is a n-dimensional hyperplane through point xo; (iii) all points of r*are interior points of b*. From Lemma 47 together with hypotheses (i) and (iii) above, it follows at once that all points of r*are B-degenerated. Let x = x*(t), to I t I t,, be any point of r*,and at that point consider a vector q such that x +q E 9(x) (7.191) Vector q may be considered to be the transform, due to linear transformation A ( t o , t), of a vector qo at xo. According to Lemma 48, it follows that
xo + '10 E 9(XO)
so that, as a consequence of hypothesis (ii) above, we have Thus,
+ '10 E T(X0) x + q E T(x)
xo
where T(x) is the transform of 7'(xo), due to linear transformation A(to, t ) ; and so it follows from (7.191) that
9 ( x ) E T(x) i See also Sec. 8.4.
(7.192)
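Now Lemma 49 together with (7.192) leads to

$$\mathbf{x} - \mathbf{f}(\mathbf{x}, \mathbf{u}) \in T(\mathbf{x}), \qquad \forall \mathbf{u} \in \Omega \tag{7.193}$$

Furthermore, it follows from Corollary 15 with (7.192) that $T(\mathbf{x})$ is a “vertical” plane, since

$$\mathbf{x} + \varepsilon\boldsymbol{\eta} \in L_-, \quad \varepsilon > 0, \quad \Rightarrow \quad \mathbf{x} + \boldsymbol{\eta} \in T(\mathbf{x})$$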
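We are now ready to state

Theorem 12. If $\mathbf{u}^*(t)$, $t_0 \le t \le t_1$, is an optimal control, and $\mathbf{x}^*(t)$ is the corresponding solution of trajectory equation (7.44), and if conditions (i), (ii), and (iii) above are fulfilled, then there exists a nonzero continuous vector function $\boldsymbol{\lambda}(t)$ which is a solution of adjoint equations (7.79), such that

(i) $\mathscr{H}(\boldsymbol{\lambda}(t), \mathbf{x}^*(t), \mathbf{u}) = 0$, $\forall \mathbf{u} \in \Omega$
(ii) $\lambda_0(t) = 0$ for all $t \in [t_0, t_1]$.

To prove this theorem we need only choose the initial value of $\boldsymbol{\lambda}(t)$ such that $\boldsymbol{\lambda}(t_0) = \boldsymbol{\lambda}^0$ is normal to $T(\mathbf{x}^0)$. Then, as shown in Section 7.142, $\boldsymbol{\lambda}(t)$ is normal to $T(\mathbf{x}^*(t))$ for all $t \in [t_0, t_1]$. But $T(\mathbf{x}^*(t))$ is a “vertical” plane so that

$$\lambda_0(t) = 0, \qquad \forall t \in [t_0, t_1]$$

Furthermore it follows from (7.193) that

$$\mathscr{H}(\boldsymbol{\lambda}(t), \mathbf{x}^*(t), \mathbf{u}) \triangleq \boldsymbol{\lambda}(t) \cdot \mathbf{f}(\mathbf{x}^*(t), \mathbf{u}) = 0, \qquad \forall \mathbf{u} \in \Omega$$

Thus, Theorem 12 is established.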
7.235 Concluding Remarks

The results of the preceding sections can be extended to cover optimal trajectories containing boundary points; this can be done by employing the conclusions of Secs. 7.21 and 7.22. By degeneracy of a boundary point $\mathbf{x}$ we mean that either $\mathring{\mathscr{C}}_a(\mathbf{x}) = \phi$ or $\mathring{\mathscr{C}}_b(\mathbf{x}) = \phi$.

7.24 Some Illustrative Examples

In order to illustrate some of the concepts introduced in this chapter, we shall now discuss briefly three simple time-optimal control problems. Rather than present an exhaustive discussion, we shall state the equations of the $\Sigma$-surfaces and point out a few salient features. We shall leave it to the reader to utilize these examples for a further check on the results obtained in this chapter.
7.241 One-Dimensional Regulator Problem

Consider the system whose state equations are

$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = u$$

and control set $\Omega$ given by

$$|u| \le 1$$

It is required to transfer the system in minimum time from $(x_1^0, x_2^0)$ to $(0, 0)$. As is well known, optimal control is bang-bang with at most one switch. The switching curve, that is, the locus of states at which optimal control switches, divides the state space $E^2$ into two open regions denoted by regions I and II on Fig. 1.

FIG. 1. S-surface.

The minimum cost $V^*(\mathbf{x}; \mathbf{x}^1)$, that is, the minimum transfer time from state $\mathbf{x} = (x_1, x_2)$ to $\mathbf{x}^1 = (0, 0)$, is given by

$$V^*(\mathbf{x}; \mathbf{x}^1) = \begin{cases} x_2 + \sqrt{2x_2^2 + 4x_1} & \text{if } \mathbf{x} \in \text{region I} \\ -x_2 + \sqrt{2x_2^2 - 4x_1} & \text{if } \mathbf{x} \in \text{region II} \\ |x_2| & \text{if } \mathbf{x} \in \text{switching curve} \end{cases}$$

The equations of S and $\Sigma$-surfaces are simply

$$S: \quad V^*(\mathbf{x}; \mathbf{x}^1) = C$$

$$\Sigma: \quad x_0 + V^*(\mathbf{x}; \mathbf{x}^1) = C$$

Figures 1 and 2 show an S and a $\Sigma$-surface, respectively. An optimal isocost surface S consists of two parabolic arcs, $S_{\mathrm{I}}$ and $S_{\mathrm{II}}$. The parabolic arc in region I is tangent to the switching curve at A, whereas the arc in region II is tangent to the switching curve at B.
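As a check on the minimum-cost expression above, the following sketch evaluates it and then steers the system with a one-switch bang-bang control. It assumes that region I is the side of the switching curve on which $x_1 + \tfrac{1}{2} x_2 |x_2| > 0$ (the side on which the first expression is real-valued); the function names `min_time` and `steer` are illustrative and do not come from the text.

```python
import math

def min_time(x1, x2):
    """Minimum transfer time to the origin for dx1/dt = x2, dx2/dt = u, |u| <= 1."""
    # Switching curve: x1 + x2*|x2|/2 = 0 (assumed labelling of regions I and II).
    s = x1 + 0.5 * x2 * abs(x2)
    if s > 0:   # region I: u = -1 first, then u = +1
        return x2 + math.sqrt(2 * x2 * x2 + 4 * x1)
    if s < 0:   # region II: u = +1 first, then u = -1
        return -x2 + math.sqrt(2 * x2 * x2 - 4 * x1)
    return abs(x2)  # on the switching curve: no switch is needed

def steer(x1, x2):
    """Apply the one-switch bang-bang control; return (final x1, final x2, elapsed time)."""
    s = x1 + 0.5 * x2 * abs(x2)
    sigma = 1.0 if (s > 0 or (s == 0 and x2 > 0)) else -1.0
    t_second = math.sqrt(max(0.0, 0.5 * x2 * x2 + sigma * x1))  # duration of second arc
    t_first = sigma * x2 + t_second                             # duration of first arc
    for u, tau in ((-sigma, t_first), (sigma, t_second)):
        # Exact integration of the double integrator under constant u.
        x1, x2 = x1 + x2 * tau + 0.5 * u * tau * tau, x2 + u * tau
    return x1, x2, t_first + t_second

if __name__ == "__main__":
    for point in [(1.0, 0.0), (-2.0, 1.0), (0.3, 0.7), (0.5, -1.0)]:
        print(point, min_time(*point), steer(*point))
```

The time returned by `steer` should agree with `min_time`, and the final state should be the origin up to rounding.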
FIG. 2. $\Sigma$-surface.
A limiting surface $\Sigma$ possesses an edge ACB whose projection on $E^2$ lies on the switching curve. Edge ACB is an attractive subset of $\Sigma$. The remaining portion of $\Sigma$ is a regular subset. Of course, $\Sigma$ is nice.

7.242 Power-Limited Rocket Problem

The behavior of a power-limited rocket in rectilinear flight is described by

$$\dot{x}_1 = u, \qquad \dot{x}_2 = u^2$$

with control set $\Omega$ given by

$$|u| \le 1$$

Here we wish to transfer the system from $(x_1^0, x_2^0)$ to $(x_1^1, x_2^1)$ in minimum time. The region $\mathscr{E}^*$ of initial states is the open half-plane given by $x_2 < x_2^1$. Region $\mathscr{E}^*$, in turn, is divided by lines of slope 1 and $-1$ into regions I, II, and III as shown on Fig. 3. For $(x_1^0, x_2^0)$ in region I, any bang-bang control is optimal (provided it results in transfer to terminal state $\mathbf{x}^1$). For $(x_1^0, x_2^0)$ in region II or III, optimal control is constant and unique. The minimum cost of transfer from $\mathbf{x}$ to $\mathbf{x}^1$ is given by

$$V^*(\mathbf{x}; \mathbf{x}^1) = \begin{cases} x_2^1 - x_2 & \text{if } \mathbf{x} \in \text{region I} \\ \dfrac{(x_1^1 - x_1)^2}{x_2^1 - x_2} & \text{if } \mathbf{x} \in \text{region II, III} \end{cases}$$

The lines separating regions I and II, and regions I and III, may be considered to belong to either region. It is readily seen that an S-surface consists of a line segment, $S_{\mathrm{I}}$, parallel to the $x_1$-axis and a parabolic arc, $S_{\mathrm{II}} \cup S_{\mathrm{III}}$, which is symmetric about the $x_2$-axis; see Fig. 3.

FIG. 3. S-surface. Note: Letters with a wavy underline in the figures are equivalent to boldface letters in the text.

A $\Sigma$-surface is made up of a triangular section of a plane, $\Sigma_{\mathrm{I}}$, inclined at 45° to the state plane together with a portion of a parabolic cone, $\Sigma_{\mathrm{II}} \cup \Sigma_{\mathrm{III}}$, as shown in Fig. 4.

FIG. 4. $\Sigma$-surface.

A $\Sigma$-surface possesses two edges, AC and BC, each of which is an attractive subset. The remaining portion of $\Sigma$ is a regular subset. $\Sigma$ is nice.

7.243 A Navigation Problem
The state equations of a vehicle, which moves with constant speed relative to a stream having constant velocity, are

$$\dot{x}_1 = s + u_1, \qquad \dot{x}_2 = u_2$$

where $s = \text{const}$, and $\Omega$ is given by

$$u_1^2 + u_2^2 = 1$$

We require the time-optimal transfer from $(x_1^0, x_2^0)$ to $(x_1^1, x_2^1)$. Here we distinguish among three cases; namely, (i) $s < 1$; (ii) $s = 1$; and (iii) $s > 1$. In each case optimal control corresponds to constant steering angle, that is,

$$u_1^*(t) = \text{const}, \qquad u_2^*(t) = \text{const}$$

However, concerning initial state region $\mathscr{E}^*$ and minimum cost $V^*(\mathbf{x}; \mathbf{x}^1)$, we must take up each case separately.

(i) $s < 1$. Here $\mathscr{E}^* = E^2$, and

$$V^*(\mathbf{x}; \mathbf{x}^1) = \frac{-s\,\Delta x_1 + [(\Delta x_1)^2 + (1 - s^2)(\Delta x_2)^2]^{1/2}}{1 - s^2}$$

where

$$\Delta x_1 \triangleq x_1^1 - x_1, \qquad \Delta x_2 \triangleq x_2^1 - x_2$$

An S-surface is a circle which surrounds terminal state $\mathbf{x}^1$; see Fig. 5. A $\Sigma$-surface is a circular cone as shown in Fig. 6. $\Sigma$ is regular and nice.

FIG. 5. S-surface ($s < 1$).

FIG. 6. $\Sigma$-surface.

(ii) $s = 1$. Here $\mathscr{E}^*$ is the open half-plane given by $x_1 < x_1^1$, and

$$V^*(\mathbf{x}; \mathbf{x}^1) = \frac{(\Delta x_1)^2 + (\Delta x_2)^2}{2\,\Delta x_1}$$

An S-surface is a circle with terminal state $\mathbf{x}^1$ deleted; that is, $\mathbf{x}^1$ belongs to the boundary of $\mathscr{E}^*$. This is shown in Fig. 7. A $\Sigma$-surface is a circular cone with the directrix on the $x_0$-axis deleted; see Fig. 8. Again, $\Sigma$ is regular and nice.

FIG. 7. S-surface.

FIG. 8. $\Sigma$-surface.

(iii) $s > 1$. Here $\mathscr{E}^*$ is a closed triangular region in the left half-plane as shown in Fig. 9. Minimum cost is given by

$$V^*(\mathbf{x}; \mathbf{x}^1) = \frac{s\,\Delta x_1 - [(\Delta x_1)^2 - (s^2 - 1)(\Delta x_2)^2]^{1/2}}{s^2 - 1}$$

An S-surface is a circular arc which is tangent to the boundary of $\mathscr{E}^*$; see Fig. 9. A $\Sigma$-surface is a portion of a circular cone as illustrated in Fig. 10. $\Sigma$ possesses boundary points of $\mathscr{E}^*$ along AC and BC. At interior points, $\Sigma$ is regular and nice. Boundary E is made up of $\Sigma$ and “vertical” planar sections $AC\mathbf{x}^1$ and $BC\mathbf{x}^1$. Boundary E is nice.

FIG. 9. S-surface.

FIG. 10. $\Sigma$-surface.
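The minimum-cost expressions of Secs. 7.242 and 7.243 can be checked numerically. The sketch below is illustrative only; the function names are hypothetical. It assumes the stream flows along the $x_1$-axis, takes as arguments the displacements $\Delta x_1 = x_1^1 - x_1$ and $\Delta x_2 = x_2^1 - x_2$, and, for $s > 1$, uses the smaller root of the reachability condition $(\Delta x_1 - sT)^2 + (\Delta x_2)^2 = T^2$ as the minimum time.

```python
import math

def rocket_min_time(dx1, dx2):
    """Power-limited rocket: dx1/dt = u, dx2/dt = u**2, |u| <= 1."""
    if dx2 <= 0:
        raise ValueError("initial state outside E*: need x2 < x2^1")
    if abs(dx1) <= dx2:        # region I: any admissible bang-bang control, |u| = 1
        return dx2
    return dx1 * dx1 / dx2     # regions II, III: constant control u = dx2 / dx1

def navigation_min_time(dx1, dx2, s):
    """Vehicle of unit relative speed in a stream of speed s along x1."""
    if s == 1.0:
        if dx1 <= 0:
            raise ValueError("initial state outside E*: need x1 < x1^1")
        return (dx1 * dx1 + dx2 * dx2) / (2.0 * dx1)
    disc = dx1 * dx1 + (1.0 - s * s) * dx2 * dx2
    if s < 1.0:
        return (-s * dx1 + math.sqrt(disc)) / (1.0 - s * s)
    if dx1 <= 0 or disc < 0:
        raise ValueError("initial state outside E*")
    return (s * dx1 - math.sqrt(disc)) / (s * s - 1.0)   # smaller root

def reach_residual(dx1, dx2, s):
    """Residual of (dx1 - s*T)**2 + dx2**2 = T**2 at the computed minimum time T."""
    T = navigation_min_time(dx1, dx2, s)
    return (dx1 - s * T) ** 2 + dx2 ** 2 - T ** 2

if __name__ == "__main__":
    print(rocket_min_time(0.5, 1.0), rocket_min_time(2.0, 1.0))
    for s, d in [(0.5, (1.0, 2.0)), (1.0, (1.0, 1.0)), (2.0, (3.0, 1.0))]:
        print(s, d, navigation_min_time(d[0], d[1], s), reach_residual(d[0], d[1], s))
```

In the rocket problem the two branches agree on the lines $|\Delta x_1| = \Delta x_2$, which is consistent with the remark that these lines may be assigned to either region.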
APPENDIX

If $\{\Sigma_i : i = 1, 2, \ldots, p, \ldots\}$ is a partition of $\Sigma$, we wish to show that the set of subsets $\bigcap_{i} \Sigma_i$ is denumerable. Consider the set whose elements are

$$(\Sigma_i, \Sigma_j), \qquad i, j = 1, 2, \ldots, p, \ldots$$

that is, the Cartesian product $A \times A$, where

$$A \triangleq \{\Sigma_i : i = 1, 2, \ldots, p, \ldots\}$$

Similarly, the set whose elements are

$$(\Sigma_i, \Sigma_j, \Sigma_k), \qquad i, j, k = 1, 2, \ldots, p, \ldots$$

is $A \times A \times A$. Since the Cartesian product of two denumerable sets is denumerable, the sets $A$, $A \times A$, $A \times A \times A$, ... are denumerable, and the set

$$\{A,\; A \times A,\; A \times A \times A,\; \ldots\}$$

is also denumerable, since there is a one-to-one correspondence between its elements and the integers. Furthermore, since the union of a denumerable set of denumerable sets is itself denumerable, the set

$$A \cup (A \times A) \cup (A \times A \times A) \cup \cdots$$

is denumerable. Now, the elements of this set are

$$\Sigma_i, \qquad i = 1, 2, \ldots, p, \ldots$$
$$(\Sigma_i, \Sigma_j), \qquad i, j = 1, 2, \ldots, p, \ldots$$
$$(\Sigma_i, \Sigma_j, \Sigma_k), \qquad i, j, k = 1, 2, \ldots, p, \ldots$$

With each such element of this set, we may associate an intersection

$$\Sigma_i, \qquad i = 1, 2, \ldots, p, \ldots$$
$$\Sigma_i \cap \Sigma_j, \qquad i, j = 1, 2, \ldots, p, \ldots$$
$$\Sigma_i \cap \Sigma_j \cap \Sigma_k, \qquad i, j, k = 1, 2, \ldots, p, \ldots$$

Hence, the set of all such intersections is denumerable. In particular, the set of all distinct intersections is denumerable; we denote it by

$$\{M_1, M_2, \ldots, M_\nu, \ldots\}$$

and note that $\Sigma_i$ may be any† member of this set whose union is clearly $\Sigma$.

BIBLIOGRAPHY
1. A. Blaquière and G. Leitmann, On the Geometry of Optimal Processes, Parts I, II, III, Univ. of California, Berkeley, IER Repts. AM-64-10, AM-65-11, AM-66-1.
2. A. Blaquière, Sur la théorie de la commande optimale, course notes, Fac. Sci., Univ. of Paris (1963).
3. G. Leitmann, Some Geometrical Aspects of Optimal Processes, J. SIAM, Ser. A: Control 3, No. 1, 53 ff (1965).
4. A. Blaquière, Further Investigation into the Geometry of Optimal Processes, J. SIAM, Ser. A: Control 3, No. 2, 19 ff (1965).
5. A. Blaquière and G. Leitmann, Some Geometric Aspects of Optimal Processes, Part I: Problems with Control Constraints, Proc. Congr. Automatique Théorique, Paris (1965).
6. K. V. Saunders and G. Leitmann, Some Geometric Aspects of Optimal Processes, Part II: Problems with State Constraints, Proc. Congr. Automatique Théorique, Paris (1965).
7. H. Halkin, The Principle of Optimal Evolution, in “Nonlinear Differential Equations and Nonlinear Mechanics” (J. P. LaSalle and S. Lefschetz, eds.), p. 284 ff. Academic Press, New York, 1963.
8. E. Roxin, A Geometric Interpretation of Pontryagin’s Maximum Principle, in “Nonlinear Differential Equations and Nonlinear Mechanics” (J. P. LaSalle and S. Lefschetz, eds.), p. 303 ff. Academic Press, New York, 1963.
9. R. E. Bellman, “Dynamic Programming.” Princeton Univ. Press, Princeton, New Jersey, 1957.
10. S. E. Dreyfus, “Dynamic Programming and the Calculus of Variations,” RAND Rept. R-441-PR (1965).
11. R. E. Kalman, The Theory of Optimal Control and the Calculus of Variations, in “Mathematical Optimization Techniques.” Univ. of California Press, Berkeley, California, 1964.
12. L. S. Pontryagin et al., “The Mathematical Theory of Optimal Processes.” Wiley (Interscience), New York, 1962.
13. J. L. Kelley, “General Topology.” Van Nostrand, Princeton, New Jersey, 1955.
14. R. Isaacs, “Differential Games.” Wiley, New York, 1965.

† Note again that this may require renumbering the elements of the partition.
The Pontryagin Maximum Principle†

STEPHEN P. DILIBERTO
DEPARTMENT OF MATHEMATICS, UNIVERSITY OF CALIFORNIA, BERKELEY, CALIFORNIA

8.0 Introduction
8.1 The Extended Problem
8.2 The Control Lemma
8.3 The Controllability Theorem
8.4 The Maximum Principle (Part I)
8.5 The Maximum Principle (Part II)
8.6 The Bang-Bang Principle
Reference

8.0 Introduction
The purpose of this paper is twofold: First, we present a reformulation of the proof of the Pontryagin maximum principle. Second, we apply these techniques to give a proof of the bang-bang principle. As a byproduct of these proof methods we shall readily establish the validity of the bang-bang principle for nonlinear systems.

Our object in reformulating the proof of the Pontryagin maximum principle is to emphasize one central fact. That fact is the existence of a simple family of controls (discontinuous in general) which produce effects on a solution which are differentiable. The types of controls so produced are rich enough (i.e., have enough different members) so as to provide a full range of comparison trajectories.

The measure of controllability introduced by Pontryagin et al.¹ is a cone attached to each point of the trajectory (and generated by the special perturbations of the control function). This cone is a measure of controllability in that it indicates what are possible changes in direction producible by a change in control at that instant or even for earlier control changes. Now cones in general are determined only by linear combinations of vectors with positive coefficients. If, however, among the vectors one is using to span the cone, both some vector $\boldsymbol{\xi}$ and $-\boldsymbol{\xi}$ occur as generators then the cone will include a linear space containing $\boldsymbol{\xi}$.

Our proof of the bang-bang principle stems from these facts: (1) If a one parameter family of vectors $\boldsymbol{\xi}(s)$ does not move in a “degenerate” way then in an arbitrarily small interval $s_0$ to $s_n$, there will be $n + 1$ linearly independent vectors $\boldsymbol{\xi}(s_0), \boldsymbol{\xi}(s_1), \ldots, \boldsymbol{\xi}(s_n)$ where $s_0 < s_1 < \cdots < s_n$. Thus if $\mathscr{K}$ is the cone spanned by them, $\mathscr{K}$ will include all of $E^{n+1}$. (2) If the control function has a regular point in the interior of the control set there must necessarily be such a vector $\boldsymbol{\xi}$. Thus, to eliminate this optimality denying condition one must avoid interior points. (Since we are considering measurable controls the value of the control at individual points is, of course, immaterial. What is to be denied here is that the control variable will have any interior point sets with positive measure.)

If the boundaries of the control set are piecewise linear then our argument applies to any face of the boundary. Consequently, the final conclusion is that: “In general, the control variable must be at the vertex of the control set except for a set of measure zero.” Note that the result derived is true for any set whose boundaries consist of flat space. In particular our result does not require the control set to be convex.

Notation. Vectors will be denoted by lowercase boldface Latin letters and components by subscripts, thus $\mathbf{x} = (x_1, \ldots, x_n)$, different vectors by superscripts, matrices by capital letters. We shall employ, except for these things, the notation of our only reference.¹

† The work reported in this chapter was supported by ONR under Nonr-222(88).

8.1 The Extended Problem
The extended control problem begins with the differential equation

$$d\mathbf{x}/dt = \mathbf{f}(\mathbf{x}, \mathbf{u}) \tag{8.1}$$

where it is assumed $\mathbf{x} = (x_0, x_1, \ldots, x_n)$, $\mathbf{f} = (f_0, f_1, \ldots, f_n)$, and $\mathbf{u} = (u_1, \ldots, u_m)$ are vectors, $\mathbf{f}$ is continuous in $(\mathbf{x}, \mathbf{u})$ but differentiable (continuously) in $\mathbf{x}$, and $\mathbf{f}$ is independent of $x_0$. A successful control $\mathbf{u} = \mathbf{u}(t)$ is one such that it carries a solution with initial point $\mathbf{x}^0$ of the form $\mathbf{x}^0 = (0, \mathbf{y}^0)$ [i.e., its first component is zero] into a terminal point $\mathbf{x}^1 = (x_0^1, \mathbf{y}^1)$ where $\mathbf{y}^1$ is specified. An optimal control is one for which $x_0^1$ is a minimum. Because the value of the control function is immaterial outside the time interval $(t^0, t^1)$ during which the solution goes from $\mathbf{x}^0$ to $\mathbf{x}^1$ we may assume the control function defined for all $t$. To avoid certain trivial cases we shall assume that, if $\mathbf{x}(t^0) = \mathbf{x}^0$ and $\mathbf{x}(t^1) = \mathbf{x}^1$, there is no inbetween point $\tau$ such that $\mathbf{x}(\tau) = \mathbf{x}^0$ or $\mathbf{x}^1$.
To make certain that results are physically reasonable it has been customary to assume that $f_0(\mathbf{x}, \mathbf{u}) \ge 0$. For the results presented here it is only required that $x_0(t^1) > -\infty$. Any hypothesis that guarantees this is sufficient for our purposes, e.g., $f_0 \ge -M > -\infty$ ($M$ being a positive constant). If $E^{n+1} \times \Omega$ is the set in which $\mathbf{f}$ is defined (i.e., for $\mathbf{x}$ in $E^{n+1}$ and $\mathbf{u}$ in $\Omega$), then an allowable control function is one such that (i) $\mathbf{u}(t) \in \Omega$ for all $t$ and (ii) $\mathbf{u}(t)$ is piecewise continuous.

The variations to be considered are only within the class of admissible control sets. These are determined by the three conditions:

(i) The modification which changes any $\mathbf{u}(t)$ to an allowable constant in an interval is admissible.
(ii) If a finite set of controls is admissible then any control which is equal to one of these on one finite set of intervals (whose union is the whole line) is admissible.
(iii) The translation of an admissible control is admissible.

Formally, these conditions are:

(i) If $I$ is an interval $(-\infty, t^0]$ or $[t^0, t^1]$ or $[t^1, +\infty)$, and $\mathbf{u}(t)$ is admissible and $\mathbf{u}^0 \in \Omega$, then $\bar{\mathbf{u}}(t)$ is admissible where

$$\bar{\mathbf{u}}(t) = \begin{cases} \mathbf{u}(t) & (t \notin I) \\ \mathbf{u}^0 & (t \in I) \end{cases}$$

(ii) If $\mathbf{u}^i$ ($i = 1, \ldots, k$) are admissible, if $I^i$ ($i = 1, \ldots, k$) are disjoint intervals as above and $\bigcup_i I^i = (-\infty, +\infty)$, then $\bar{\mathbf{u}}(t)$ defined by $\bar{\mathbf{u}}(t) = \mathbf{u}^i(t)$ in $I^i$ ($i = 1, \ldots, k$) is admissible.

A regular point of a control function $\mathbf{u}$ is a point $t$ where
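For the two main classes of control functions, measurable ones or piecewise continuous ones, regular points are almost all points (measurable case) and nonjump points (piecewise continuous case).

Let $V(\mathbf{f}) \triangleq [\partial f_i / \partial x_j]$ be the Jacobian matrix of the vector $\mathbf{f}$ with respect to the argument $\mathbf{x} = (x_0, \ldots, x_n)$. Let $\boldsymbol{\lambda}(t)$ be the solution of the adjoint variational equations

$$\frac{d\boldsymbol{\lambda}}{dt} = -V^T(\mathbf{f})\,\boldsymbol{\lambda}$$

Here $V^T$ denotes the transpose of $V$. It is to be assumed that in the $\mathbf{x}$-arguments of this equation an appropriate solution function has been substituted. Define

$$\mathscr{H}(\boldsymbol{\lambda}, \mathbf{x}, \mathbf{u}) \triangleq \boldsymbol{\lambda} \cdot \mathbf{f}(\mathbf{x}, \mathbf{u}) = \sum_{\alpha = 0}^{n} \lambda_\alpha f_\alpha \tag{8.4}$$

Define $\mathscr{M}(\boldsymbol{\lambda}, \mathbf{x})$ by

$$\mathscr{M}(\boldsymbol{\lambda}, \mathbf{x}) \triangleq \sup_{\mathbf{u} \in \Omega} \mathscr{H}(\boldsymbol{\lambda}, \mathbf{x}, \mathbf{u}) \tag{8.5}$$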
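To make definitions (8.4) and (8.5) concrete, here is a small sketch that evaluates $\mathscr{H}$ and approximates $\mathscr{M}$ by scanning a finite sample of the control set. The right-hand side `f_extended` is a hypothetical illustration, namely a time-optimal double integrator written in extended form with $f_0 = 1$, and the grid `omega_grid` stands in for $\Omega = \{u : |u| \le 1\}$; neither is taken from this chapter.

```python
def f_extended(x, u):
    """Hypothetical extended right-hand side (f0, f1, f2): cost rate 1, dx1/dt = x2, dx2/dt = u."""
    _x0, _x1, x2 = x
    return (1.0, x2, u)

def hamiltonian(lam, x, u, f=f_extended):
    """H(lam, x, u) = lam . f(x, u), as in (8.4)."""
    return sum(l * fi for l, fi in zip(lam, f(x, u)))

def maximized_hamiltonian(lam, x, controls, f=f_extended):
    """Approximate M(lam, x) of (8.5) by a maximum over a finite sample of the control set."""
    best_u = max(controls, key=lambda u: hamiltonian(lam, x, u, f))
    return hamiltonian(lam, x, best_u, f), best_u

# Finite sample of the control set |u| <= 1 (endpoints are exactly -1.0 and 1.0).
omega_grid = [(k - 100) / 100 for k in range(201)]

if __name__ == "__main__":
    print(maximized_hamiltonian((-1.0, 0.3, -2.0), (0.0, 1.0, 0.5), omega_grid))
```

Because this particular $\mathscr{H}$ is linear in $u$, the maximum sits at an endpoint of the control interval, chosen by the sign of the last adjoint component.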
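Theorem 1 (Maximum Principle of Pontryagin). If $\mathbf{x}^*(t)$ is an optimal solution and $\mathbf{u}^*(t)$ its control function ($t^0 \le t \le t^1$), there exists a solution $\boldsymbol{\lambda}(t)$ of the adjoint variational equations such that

(1) If $\mathscr{H}^*(t, \mathbf{u}) \triangleq \mathscr{H}(\boldsymbol{\lambda}(t), \mathbf{x}^*(t), \mathbf{u})$, then $\mathscr{H}^*$ attains its maximum for $\mathbf{u} = \mathbf{u}^*(t)$ at all regular points, i.e.,

$$\mathscr{H}^*(t, \mathbf{u}^*(t)) = \mathscr{M}(\boldsymbol{\lambda}(t), \mathbf{x}^*(t))$$

(2) $\mathscr{M}(\boldsymbol{\lambda}(t), \mathbf{x}^*(t)) = \text{const} = 0$.

8.2 The Control Lemma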
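Control Lemma.† Let $\mathbf{x}(t)$ be a solution with control $\mathbf{u}(t)$ on $t^2 \le t \le t^5$. For positive constants $d_1$, $d_2$, and $\varepsilon$ define inbetween points $t^3 = t^5 - \varepsilon(d_1 + d_2)$ and $t^4 = t^5 - \varepsilon d_2$. Let $\mathbf{u}(t, \varepsilon)$ be the control defined by

$$\mathbf{u}(t, \varepsilon) = \begin{cases} \mathbf{u}(t) & (t^2 \le t < t^3) \\ \mathbf{v} & (t^3 \le t < t^4) \\ \mathbf{u}(t) & (t^4 \le t \le t^5) \end{cases}$$

Let $\mathbf{u}(t)$ be regular at $t^5$ and $\mathbf{x}(t, \varepsilon)$ be the solution with control $\mathbf{u}(t, \varepsilon)$ determined by the initial condition $\mathbf{x}(t^2, \varepsilon) = \mathbf{x}(t^2)$, for all $\varepsilon$. Let $\mathbf{x}^i(\varepsilon) = \mathbf{x}(t^i, \varepsilon)$ and $\mathbf{x}^i = \mathbf{x}(t^i)$. Also let $\mathbf{u}^5 = \mathbf{u}(t^5)$. Then the following size estimates hold for $\mathbf{x}(t)$ and $\mathbf{x}(t, \varepsilon)$:

(1)' $t^3 \le t \le t^5$: $\quad \mathbf{x}(t) = \mathbf{x}^3 + (t - t^3)\,\mathbf{f}(\mathbf{x}^5, \mathbf{u}^5) + o(\varepsilon)$

(2)' $t^3 \le t < t^4$: $\quad \mathbf{x}(t, \varepsilon) = \mathbf{x}^3(\varepsilon) + (t - t^3)\,\mathbf{f}(\mathbf{x}^5(\varepsilon), \mathbf{v}) + o(\varepsilon)$

(3)' $t^4 \le t \le t^5$: $\quad \mathbf{x}(t, \varepsilon) = \mathbf{x}^3(\varepsilon) + \varepsilon d_1\,\mathbf{f}(\mathbf{x}^5(\varepsilon), \mathbf{v}) + (t - t^4)\,\mathbf{f}(\mathbf{x}^5, \mathbf{u}^5) + o(\varepsilon)$

Differences between $\mathbf{x}(t, \varepsilon)$ and $\mathbf{x}(t)$ may be established thus (recall $|\mathbf{f}| \le M$):

(1) $\mathbf{x}^3(\varepsilon) = \mathbf{x}^3$

(2) $|\mathbf{x}(t, \varepsilon) - \mathbf{x}(t)| \le 2(t - t^3) M$ for $t^3 \le t \le t^5$

(3) $\mathbf{x}^5(\varepsilon) - \mathbf{x}^5 = \varepsilon d_1 [\mathbf{f}(\mathbf{x}^5, \mathbf{v}) - \mathbf{f}(\mathbf{x}^5, \mathbf{u}^5)] + o(\varepsilon d_1)$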
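Conclusion (3) of the lemma can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it uses a hypothetical scalar system $\dot{x} = -x + |u|$ (continuous but not differentiable in $u$), a nominal control $u(t) = \sin t$, the constant $v = -1$, and a classical fourth-order Runge-Kutta integrator applied separately on each of the three subintervals so that the jumps of the perturbed control fall on step boundaries.

```python
import math

def f(x, u):
    """Hypothetical right-hand side: continuous in u but not differentiable at u = 0."""
    return -x + abs(u)

def u_nominal(t):
    """Hypothetical nominal control; continuous, so every t is a regular point."""
    return math.sin(t)

def integrate(x, t_start, t_end, control, steps=400):
    """Classical RK4 for dx/dt = f(x, control(t)) on [t_start, t_end]."""
    h = (t_end - t_start) / steps
    t = t_start
    for _ in range(steps):
        k1 = f(x, control(t))
        k2 = f(x + 0.5 * h * k1, control(t + 0.5 * h))
        k3 = f(x + 0.5 * h * k2, control(t + 0.5 * h))
        k4 = f(x + h * k3, control(t + h))
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return x

def endpoint(eps, v, d1, d2, t2, t5, x2, perturbed):
    """State at t5, with control v inserted on [t3, t4) when perturbed is True."""
    t3 = t5 - eps * (d1 + d2)
    t4 = t5 - eps * d2
    x = integrate(x2, t2, t3, u_nominal)
    x = integrate(x, t3, t4, (lambda t: v) if perturbed else u_nominal)
    return integrate(x, t4, t5, u_nominal)

def needle_variation_check(eps, v=-1.0, d1=2.0, d2=1.0, t2=0.0, t5=1.0, x2=0.5):
    """Return ((x5(eps) - x5)/eps, d1*[f(x5, v) - f(x5, u5)])."""
    x5 = endpoint(eps, v, d1, d2, t2, t5, x2, perturbed=False)
    x5_eps = endpoint(eps, v, d1, d2, t2, t5, x2, perturbed=True)
    predicted = d1 * (f(x5, v) - f(x5, u_nominal(t5)))
    return (x5_eps - x5) / eps, predicted

if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, needle_variation_check(eps))
```

As $\varepsilon$ decreases, the difference quotient should approach $d_1[f(x^5, v) - f(x^5, u^5)]$, and changing `d2` should not change the limit.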
† Points $t^i$, $i = 2, 3, 4, 5$, belong to $(t^0, t^1)$.

Remark. This lemma is surprising for several reasons. First, if one wishes to discuss the possible differentiability of $\mathbf{x}(t, \varepsilon)$ with respect to $\varepsilon$ one would expect to pattern the “standard” differentiability arguments. In these one writes the integral form of the differential equation

$$\mathbf{x}(t, \varepsilon) = \mathbf{x}^3 + \int_{t^3}^{t} \mathbf{f}(\mathbf{x}(\tau, \varepsilon), \mathbf{u}(\tau, \varepsilon))\, d\tau$$

and then differentiates with respect to $\varepsilon$. If we let $\mathbf{v} = d\mathbf{x}/d\varepsilon$, this will allow us to write the differentiated equation as†

But this is impossible here because we have assumed only that $\mathbf{f}$ is continuous in $\mathbf{u}$ and not differentiable. Despite this, (3) above asserts that $\mathbf{v}(t^5, 0)$ exists and is given by

$$\mathbf{v}(t^5, 0) = d_1[\mathbf{f}(\mathbf{x}^5, \mathbf{v}) - \mathbf{f}(\mathbf{x}^5, \mathbf{u}^5)]$$

Equally noteworthy with the existence of $\mathbf{v}(t^5, 0)$ is the fact that it does not contain $d_2$.
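PROOF. Since $|\mathbf{f}| \le M$ one has for $t \ge t^3$ that for any solution $\mathbf{z}$, with any control (we omit the $d\tau$ in the integrand),

$$|\mathbf{z}(t) - \mathbf{z}(t^3)| = \left| \int_{t^3}^{t} \mathbf{f} \right| \le (t - t^3) M$$

Using this for $\mathbf{z} = \mathbf{x}(t, \varepsilon)$ and $\mathbf{z} = \mathbf{x}(t)$ one has

$$|\mathbf{x}(t) - \mathbf{x}^3| \le (t - t^3) M \tag{8.6}$$

$$|\mathbf{x}(t, \varepsilon) - \mathbf{x}^3| \le (t - t^3) M \tag{8.7}$$

Put

$$\mathbf{x}(t, \varepsilon) - \mathbf{x}(t) = (\mathbf{x}(t, \varepsilon) - \mathbf{x}^3) + (\mathbf{x}^3 - \mathbf{x}(t))$$

take norms, and use $\mathbf{x}^3(\varepsilon) = \mathbf{x}^3$; this yields

$$|\mathbf{x}(t, \varepsilon) - \mathbf{x}(t)| \le |\mathbf{x}(t, \varepsilon) - \mathbf{x}^3| + |\mathbf{x}^3 - \mathbf{x}(t)|$$

Now using these estimates for the two terms on the right hand side (RHS) one has

$$|\mathbf{x}(t, \varepsilon) - \mathbf{x}(t)| \le 2(t - t^3) M$$

and this establishes (2). We establish (1)' in this manner: In the integral equation for $\mathbf{x}(t)$, namely,

$$\mathbf{x}(t) = \mathbf{x}^3 + \int_{t^3}^{t} \mathbf{f}(\mathbf{x}(\tau), \mathbf{u}(\tau))\, d\tau$$

add and subtract the terms $\mathbf{f}(\mathbf{x}^5, \mathbf{u}^5)$ and $\mathbf{f}(\mathbf{x}^5, \mathbf{u}(\tau))$, rearranging the terms in the following manner (we omit the $d\tau$ in the integrand)

$$\mathbf{x}(t) = \mathbf{x}^3 + \int_{t^3}^{t} \mathbf{f}(\mathbf{x}^5, \mathbf{u}^5) + \int_{t^3}^{t} [\mathbf{f}(\mathbf{x}(\tau), \mathbf{u}(\tau)) - \mathbf{f}(\mathbf{x}^5, \mathbf{u}(\tau))] + \int_{t^3}^{t} [\mathbf{f}(\mathbf{x}^5, \mathbf{u}(\tau)) - \mathbf{f}(\mathbf{x}^5, \mathbf{u}^5)] \tag{8.8}$$

† Here $\partial \mathbf{f}/\partial \mathbf{u} = [\partial f_i/\partial u_k]$ with $\mathbf{u} = \mathbf{u}(t)$, $\mathbf{x} = \mathbf{x}(t)$.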
Since $f$ is Lipschitzian the integrand of the second term of the RHS is majorized by $K|x(\tau) - x^5|$, $K = \text{const} > 0$. By using (8.6) this is majorized by $K\varepsilon M$. This, in turn, implies the second integral of the RHS is majorized by $(t - t^3)\varepsilon K M$ or $\varepsilon^2 (d_1 + d_2) K M$. Thus the second term of the RHS is certainly $o(\varepsilon)$. For the third term of the RHS one has the ready estimate

\[ |t - t^3| \sup_{\tau \in I_\varepsilon} |f(x^5, u(\tau)) - f(x^5, u^5)| \]

This means that

\[ \frac{1}{|t - t^3|} \left| \int_{t^3}^{t} [f(x^5, u(\tau)) - f(x^5, u^5)] \right| \le \sup_{\tau \in I_\varepsilon} |f(x^5, u(\tau)) - f(x^5, u^5)| \qquad (8.9) \]

where $I_\varepsilon$ is the interval $t^3 = t^5 - \varepsilon(d_1 + d_2) \le t \le t^5$. That the left hand side (LHS) of this inequality goes to zero (i.e., that the integral itself is $o(\varepsilon)$) is now a consequence of the fact that the RHS does. And this in turn depends on the fact that $t^5$ is a regular point (i.e., point of continuity) of $u(t)$.
Aside. For "measurable" control functions the argument changes here, but the conclusion remains. Namely, inequality (8.9) does not hold. However, that the LHS $\to 0$ with $\varepsilon$ is precisely the definition of a regular point, and, of course, almost all points are regular for a measurable function.

Combining these two estimates with Eq. (8.8) it follows then that

\[ x(t) = x^3 + (t - t^3) f(x^5, u^5) + o(\varepsilon) \qquad (8.10) \]
To establish (2)' and (3)' add and subtract the terms $f(x^5(\varepsilon), v)$ and $f(x^5(\varepsilon), u(\tau, \varepsilon))$ to the integral equation for $x(t, \varepsilon)$. Thus (again leave off the $d\tau$)

\[ x(t, \varepsilon) = x^3 + \int_{t^3}^{t} f(x(\tau, \varepsilon), u(\tau, \varepsilon)) \]

becomes

\[ x(t, \varepsilon) = x^3 + \int_{t^3}^{t} f(x^5(\varepsilon), v) + \int_{t^3}^{t} [f(x(\tau, \varepsilon), u(\tau, \varepsilon)) - f(x^5(\varepsilon), u(\tau, \varepsilon))] + \int_{t^3}^{t} [f(x^5(\varepsilon), u(\tau, \varepsilon)) - f(x^5(\varepsilon), v)] \]

The last two integrals of the RHS are both $o(\varepsilon)$ precisely as before and so for $t^3 \le t \le t^4$,

\[ x(t, \varepsilon) = x^3 + (t - t^3) f(x^5(\varepsilon), v) + o(\varepsilon) \]

Note that by putting $t = t^4$ in this equation it has the consequence that

\[ x^4(\varepsilon) = x(t^4, \varepsilon) = x^3 + \varepsilon d_1 f(x^5(\varepsilon), v) + o(\varepsilon) \]

Paralleling the derivation for (1)' and (2)' one finds easily that in $t^4 \le t \le t^5$

\[ x(t, \varepsilon) = x^4(\varepsilon) + (t - t^4) f(x^5(\varepsilon), u^5) + o(\varepsilon) \]

and this combined with the last equation establishes (3)'.
(1) is the definition of $x^3(\varepsilon)$. (2) follows in one step from (8.6) and (8.7) by subtraction; (3) follows from (1)' and (3)' and (2). If in (1)' and (3)' one puts $t = t^5$ they become, respectively [also use (1)]

\[ x^5 = x^3 + \varepsilon (d_1 + d_2) f(x^5, u^5) + o(\varepsilon) \]
\[ x^5(\varepsilon) = x^3 + \varepsilon d_1 f(x^5(\varepsilon), v) + \varepsilon d_2 f(x^5(\varepsilon), u^5) + o(\varepsilon) \]

(2), being valid for any $t$ in the interval, implies for $t = t^5$ that $x^5 = x^5(\varepsilon) + O(\varepsilon)$. Thus the expression for $x^5(\varepsilon)$ just given may be rewritten by replacing each $x^5(\varepsilon)$ on the RHS by $x^5$ so that

\[ x^5(\varepsilon) = x^3 + \varepsilon d_1 f(x^5, v) + \varepsilon d_2 f(x^5, u^5) + o(\varepsilon) \]

Subtracting the last expression for $x^5$ from this gives (3).

8.3 The Controllability Theorem
We shall first establish a lemma which will allow us to determine at time t the effect of a control at an earlier time. Using this lemma we define the cones which measure controllability. The principal result will be to prove the extent to which these cones do allow one to modify a solution.
Lemma. Let $x(t)$ be the solution of

\[ dx/dt = f(x, u) \qquad (8.11) \]

defined by $x(t_0) = x_0$ and for given control $u(t)$. Let $W(t, t_0)$ be the matrix solution of $dW/dt = V(t) W$ where $V(t) = [\partial f_i/\partial x_j](x(t), u(t))$ and $W(t_0, t_0) = I$. If $x(t, \varepsilon)$ is the solution of (8.1) determined by the initial value $x(t_0, \varepsilon) = x_0 + \varepsilon \zeta + o(\varepsilon)$, $\zeta$ a fixed vector, then

\[ x(t, \varepsilon) = x(t) + \varepsilon W(t, t_0) \zeta + o(\varepsilon) \]

PROOF. If $v^j = \partial x(t, t_0, x_0)/\partial x_j$ then $v^j = W(t, t_0) E_j$ where $E_j$ is the $j$th unit vector (all zeros and one in the $j$th spot). Since $x(t, \varepsilon) = x(t, t_0, x_0 + \varepsilon \zeta)$,

\[ \frac{dx}{d\varepsilon} = \sum_j v^j \zeta_j = W(t, t_0) \Big( \sum_j \zeta_j E_j \Big) = W(t, t_0) \zeta \]

And since $x(t, \varepsilon) = x(t) + \varepsilon (dx/d\varepsilon) + o(\varepsilon)$, the result is established.
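The lemma can be checked numerically. The following is a minimal sketch under stated assumptions: the pendulum-like system, the control $u(t) = 0.5\cos t$, the vector $\zeta$, the initial state, and the fixed-step Runge-Kutta integrator are all illustrative choices that do not come from the text. It integrates $x$ and $W$ together and measures the remainder $|x(t, \varepsilon) - x(t) - \varepsilon W(t, t_0)\zeta|$, which should shrink faster than $\varepsilon$.

```python
import math

def f(t, x):
    """Right-hand side f(x, u(t)) for an illustrative pendulum with control u(t)."""
    u = 0.5 * math.cos(t)              # assumed control, not taken from the text
    return [x[1], -math.sin(x[0]) + u]

def jacobian(x):
    """V = [df_i/dx_j] evaluated at the current state."""
    return [[0.0, 1.0], [-math.cos(x[0]), 0.0]]

def rhs(t, s):
    """Joint system: s = [x1, x2, W11, W12, W21, W22], with dW/dt = V W."""
    x, W = s[:2], [s[2:4], s[4:6]]
    V = jacobian(x)
    dW = [[sum(V[i][k] * W[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return f(t, x) + dW[0] + dW[1]

def rk4(s, t0, t1, n):
    """Classical fixed-step fourth-order Runge-Kutta integration of rhs."""
    h, t = (t1 - t0) / n, t0
    for _ in range(n):
        k1 = rhs(t, s)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(s, k1)])
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(s, k2)])
        k4 = rhs(t + h, [a + h * b for a, b in zip(s, k3)])
        s = [a + h / 6 * (p + 2 * q + 2 * r + w)
             for a, p, q, r, w in zip(s, k1, k2, k3, k4)]
        t += h
    return s

def remainder(eps, zeta=(1.0, -0.5), x0=(0.3, 0.0), T=2.0, n=2000):
    """Norm of x(T, eps) - x(T) - eps * W(T, 0) * zeta."""
    identity = [1.0, 0.0, 0.0, 1.0]
    base = rk4(list(x0) + identity, 0.0, T, n)
    pert = rk4([x0[0] + eps * zeta[0], x0[1] + eps * zeta[1]] + identity, 0.0, T, n)
    W = [base[2:4], base[4:6]]
    lin = [W[i][0] * zeta[0] + W[i][1] * zeta[1] for i in range(2)]
    return math.hypot(pert[0] - base[0] - eps * lin[0],
                      pert[1] - base[1] - eps * lin[1])
```

Calling `remainder(1e-2)` and `remainder(1e-3)` and comparing the two values shows the remainder falling much faster than $\varepsilon$ itself, which is what the $o(\varepsilon)$ term in the lemma asserts.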
Definition. For any $x_0$, $u_0$:

$\mathcal{R}(x_0, u_0)$ = set of vectors generated by finite sums with positive coefficients of $[f(x_0, v) - f(x_0, u_0)]$ for all $v \in \Omega$.

$\mathcal{D}(x_0, u_0)$ = space generated by all finite linear combinations of $[f(x_0, v) - f(x_0, u_0)]$ for all $v \in \Omega$, i.e., the coefficients are no longer required to be positive.

Definition. For any solution $x(t)$ with control $u(t)$ defined on $I$ ($t_0 \le t \le t'$) and any $\tau \in I$ ($\tau \ne t_0$):

$\mathcal{K}(\tau) \triangleq \mathcal{K}(x_\tau, u_\tau)$ = set of vectors generated by finite linear sums with positive coefficients of $W(\tau, \tau')\zeta$ for all $\tau' < \tau$ and regular, and $\zeta \in \mathcal{R}(x_{\tau'}, u_{\tau'})$.

$\mathcal{L}(\tau) \triangleq \mathcal{L}(x_\tau, u_\tau)$ = space generated by all finite linear combinations of $W(\tau, \tau')\zeta$ for $\tau'$ regular and $\zeta \in \mathcal{D}(x_{\tau'}, u_{\tau'})$.

For cones $A$ and $B$ of vectors let $A \oplus B$ be the cone generated by $A$ and $B$, i.e., the set of all combinations $\alpha x + \beta y$ where $\alpha, \beta \ge 0$, $x \in A$ and $y \in B$.

Definition. $\bar{\mathcal{K}}(\tau) \triangleq \mathcal{K}(\tau) \oplus \{\pm f(x_\tau, u_\tau)\}$, and $\bar{\mathcal{L}}(\tau) \triangleq \mathcal{L}(\tau) \oplus \{\pm f(x_\tau, u_\tau)\}$.

Theorem 2. If $\tau$ is a regular point, if $t_0 < \tau \le t'$, if $\dim \bar{\mathcal{K}}(\tau) = n + 1$ and if the vertical segment below $x(\tau)$ [i.e., $(x) = (x_0(\tau) + \mu, x_1(\tau), \ldots, x_n(\tau))$ with $-\delta < \mu < 0$] lies in $\bar{\mathcal{K}}$ then $x(t)$ is not optimal.
PROOF. To prove the theorem it is sufficient to show from the hypothesis that one can construct a solution $\tilde{x}(t)$ such that $\tilde{x}_0(t') < x_0(t')$. This will be accomplished by (A) showing that if $\hat{x}(t)$ is a solution such that $\hat{x}_0(\tau) < x_0(\tau)$ then $\tilde{x}$ is directly constructable from $\hat{x}$; (B) showing how to construct $\hat{x}$.

Part A. We shall assume $x(t)$ with control $u(t)$ and $\hat{x}(t)$ with control $\hat{u}(t)$ are both given where: (i) $\hat{x}_0 = \hat{x}(t_0) = x_0 = x(t_0)$; (ii) $\hat{x}(\tau')$ lies on $L$, the vertical line through $x(\tau)$, and below it, i.e., $\hat{x}_0(\tau') < x_0(\tau)$. Construct

\[ \tilde{u}(t) = \hat{u}(t) \quad (t_0 \le t \le \tau'), \qquad \tilde{u}(t) = u(t - [\tau' - \tau]) \quad (\tau' < t \le t' + (\tau' - \tau)) \]

Let $\tilde{x}$ be the solution determined by $\tilde{x}(t_0) = x_0$ with control function $\tilde{u}(t)$. We assert:

(i) $\tilde{x}(t) = \hat{x}(t)$ ($t_0 \le t \le \tau'$)

(ii) $\tilde{x}(t) = x(t - [\tau' - \tau]) + P$ ($\tau' \le t \le t' + (\tau' - \tau)$) for $P = (\hat{x}_0(\tau') - x_0(\tau), 0, 0, \ldots, 0)$.

(i) is trivial. To prove (ii) we need only show that both LHS and RHS are solutions and that they agree at some value of $t$ (in this case at $t = \tau'$). The LHS is a solution. Since the differential equation is independent of $t$ and $x_0$ it follows that if $x(t)$ is a solution so is $x(t + h) + (a, 0, 0, \ldots, 0) = x(t + h) + A$. It remains to choose $A = P$ and $h = -[\tau' - \tau]$. For $t = \tau'$,

\[ x(t - [\tau' - \tau]) + P = x(\tau' - [\tau' - \tau]) + P = x(\tau) + P = x(\tau) + (\hat{x}_0(\tau') - x_0(\tau), 0, 0, \ldots, 0) = (\hat{x}_0(\tau'), x_1(\tau), x_2(\tau), \ldots, x_n(\tau)) = \hat{x}(\tau') = \tilde{x}(\tau') \]

and (ii) is established.
Part B. First, we observe that the set of points $B$ determined by $(d_1, \ldots, d_{n+1})$ where $d_i \ge 0$ and $\sum d_i = 1$ is an $n$-simplex. In fact it is that part of the $n$-plane cut off by the first quadrant in $(n + 1)$-space by cutting the $i$th axis at $d_i = 1$ along it in the positive direction. The cone $C_1$ determined by $B$ and the origin is the set of all points $\varepsilon(d_1, \ldots, d_{n+1})$ with $0 \le \varepsilon \le 1$ and $\sum d_i = 1$. Let $C$ be the extension to $0 \le \varepsilon < \infty$. This cone has $n + 1$ sides, the $j$th side $S_j$ being determined by the conditions

\[ 0 \le \varepsilon \le 1, \qquad d_i \ge 0, \ i \ne j, \qquad d_j = 0, \qquad \text{and} \qquad \sum d_i = 1 \]
The existence of an $\hat{x}$ of the type desired will be phrased as a mapping problem on a set of variables $(\varepsilon, d_1, \ldots, d_{n+1})$ where $\varepsilon \ge 0$, $d_i \ge 0$, and $\sum d_i = 1$. In particular, the properties of the mappings will be in terms of the sets $B$ and $S_j$ defined above. The vertical segment downward through $x(\tau)$ is given by

\[ x(\tau) - \rho E_0 \]

where $0 \le \rho < +\infty$, $E_0 = (1, 0, \ldots, 0)$. The fact that this segment lies inside the cone $\bar{\mathcal{K}}$ says that there are $n + 1$ vectors $\zeta^i$ ($i = 1, \ldots, n + 1$) such that

\[ E_0 = \sum d_i \zeta^i \qquad \Big( \sum d_i = 1, \ d_i > 0 \Big) \]

The mapping $m$ defined on $(\varepsilon, d_1, \ldots, d_{n+1}) = (\varepsilon, d)$ given by

\[ m(\varepsilon, d) = x(\tau) - \varepsilon \sum d_i \zeta^i \]

maps $C_1$ onto $\mathcal{C}$, the part of $\bar{\mathcal{K}}$ spanned by $(\zeta^1, \ldots, \zeta^{n+1})$. The mapping $m$ restricted to $B$ and $S_1, \ldots, S_{n+1}$ determines a base $\mathcal{B}$ and sides $\mathcal{S}^j$ of $\mathcal{C}$ which include part of the segment $x(\tau) - \rho E_0$ for $0 \le \rho \le \rho_0$ ($\rho_0$ positive and determined by $\zeta^1, \ldots, \zeta^{n+1}$) in its interior. Any map $\tilde{m}$ defined for $0 \le \varepsilon \le \varepsilon'$ and $d_i \ge 0$ with $\sum d_i = 1$ by

\[ \tilde{m} = m + o(\varepsilon) \]

has the obvious properties
(i) $\tilde{m}$ is 1-1 if $\varepsilon'$ is small enough.

(ii) For any $\delta > 0$ there is an $\varepsilon'$ so that if $\varepsilon < \varepsilon'$ then $(1 - \delta)|m| \le |\tilde{m}| < (1 + \delta)|m|$.

(iii) For any $\theta' > 0$ there is an $\varepsilon'$ so that if $\varepsilon < \varepsilon'$ then $\angle(m, \tilde{m}) < \theta'$.

From this it follows that if $\varepsilon'$ is suitably chosen (i.e., small enough) the map $\tilde{m}$ on the set of points $B, S_1, \ldots, S_{n+1}$ determines an image which is topologically a sphere and includes in its interior $x(\tau) - \rho E_0$ for $0 \le \rho < \rho'$. Standard topological results (i.e., Brouwer's fixed point theorem) imply that the map $\tilde{m}$ extended to the interior of $C_1$ will cover, i.e., map onto, this segment. To apply this result observe that it will be sufficient to find a control function $u(t, \varepsilon, d)$ (which is a modification of $u(t)$) so that the solution $x(t, \varepsilon, d)$ based on it has the form

\[ x(\tau, \varepsilon, d) = x(\tau) - \varepsilon \sum d_i \zeta^i + o(\varepsilon) \]

This is done as follows: Each $\zeta^i = W(\tau, t^i) z^i$ where $z^i = f(x^i, v^i) - f(x^i, u^i)$, $x^i = x(t^i)$, and $u^i = u(t^i)$. The control change that replaces $u(t)$ by the constant $v^i$ in the interval $t^i - \varepsilon(d_i + \delta_i)$ to $t^i - \varepsilon \delta_i$ changes $x$ at $t^i$ by the amount $z^i$ and at $\tau$ by $\zeta^i = W(\tau, t^i) z^i$. Let the control function incorporating all of these changes be $u(t, \varepsilon, d)$. This is the desired family of changes required.
8.4 The Maximum Principle (Part I)
The last result, the controllability theorem, establishes that if a trajectory is optimal then its cone $\bar{\mathcal{K}}(t)$ will never be all of $(n + 1)$-space (thus the cones $\mathcal{L}(t)$ have dimensions $< n + 1$ at all points). This implies that there will always exist a hyperplane (a linear space of dim $n$) through the vertex of the cone having one open half space (not including the plane itself) with no points of the cone. If, therefore, a vector $a$ with base at the vertex of the cone lies in this half space not containing the cone then $\angle(a, \zeta)$ for $\zeta \in \mathcal{K}$ is certainly positive. If $a$ were perpendicular to the plane then $\angle(a, \zeta) \ge 90°$ and if $\theta = \angle(a, \zeta)$ denotes this angle then $\cos\theta \le 0$. These observations plus the relationship of the solutions of the adjoint equations are the heart of the maximum principle. Note that if $\mathcal{K}$ had no point in the separating plane then $\sup_{\zeta \in \mathcal{K}} \cos(a, \zeta)$ could be negative. If, however, the cone happens to include an entire line (and ours shall) then $\mathcal{K}$ must have points in the plane. This will imply that the smallest (least) value $\sup_{\zeta \in \mathcal{K}} \cos(a, \zeta)$ can take is zero. We emphasize here that the vector $a$, which will be the initial value of $\lambda$, will be arbitrary except for the restriction that $\cos(a, \zeta) \le 0$ for $\zeta \in \mathcal{K}$. Therefore, $a$ will be unique in one and only one case; namely, the cone $\mathcal{K}$ is a half space and the vector $a$ is perpendicular to the plane determining it (i.e., the boundary of $\mathcal{K}$).

Lemma. If $\dot{\lambda} = -V(t)^T \lambda$, $\dot{v} = V(t) v$, then $\lambda \cdot v = \lambda \cdot W(t, t_0) v(t_0) = \text{const}$.

PROOF.

\[ \frac{d}{dt}(\lambda \cdot v) = \dot{\lambda} \cdot v + \lambda \cdot \dot{v} = -V^T \lambda \cdot v + \lambda \cdot V v = -\lambda \cdot V v + \lambda \cdot V v = 0 \]
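The invariance of $\lambda \cdot v$ is easy to confirm numerically. The sketch below is illustrative only: the time-varying matrix $V(t)$, the initial vectors, and the fixed-step Runge-Kutta integrator are assumptions chosen for the demonstration, not quantities from the text. It integrates the adjoint equation and the variational equation side by side and reports how far $\lambda \cdot v$ drifts from its initial value.

```python
import math

def V(t):
    """An assumed time-varying coefficient matrix (illustrative only)."""
    return [[0.0, 1.0], [-math.cos(t), -0.1 * t]]

def rhs(t, s):
    """s = [lam1, lam2, v1, v2]; d(lam)/dt = -V^T lam, dv/dt = V v."""
    lam, v = s[:2], s[2:]
    A = V(t)
    dlam = [-(A[0][i] * lam[0] + A[1][i] * lam[1]) for i in range(2)]
    dv = [A[i][0] * v[0] + A[i][1] * v[1] for i in range(2)]
    return dlam + dv

def invariant_drift(T=3.0, n=3000):
    """Return |lam(T).v(T) - lam(0).v(0)| after RK4 integration on [0, T]."""
    s = [1.0, -2.0, 0.5, 1.5]          # lam(0) and v(0): arbitrary choices
    dot0 = s[0] * s[2] + s[1] * s[3]
    h, t = T / n, 0.0
    for _ in range(n):
        k1 = rhs(t, s)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(s, k1)])
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(s, k2)])
        k4 = rhs(t + h, [a + h * b for a, b in zip(s, k3)])
        s = [a + h / 6 * (p + 2 * q + 2 * r + w)
             for a, p, q, r, w in zip(s, k1, k2, k3, k4)]
        t += h
    return abs(s[0] * s[2] + s[1] * s[3] - dot0)
```

The drift returned by `invariant_drift()` is at the level of the integrator's truncation error, even though $\lambda$ and $v$ individually change substantially over the interval.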
Definition. A vector $x$ is separated from a set of vectors $V$, denoted by $x[S]V$, if there exists a hyperplane containing $V$ in one (closed) half space and $x$ in the other (open) half space.
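Lemma. Let $a = \lambda(\tau)$. Then, if $a[S]\mathcal{K}(\tau)$ or $a[S]\mathcal{L}(\tau)$ or $a \perp \mathcal{K}(\tau)$ or $a \perp \mathcal{L}(\tau)$ one also has, for $t < \tau$, with both $t$ and $\tau$ regular, that $\lambda(t)[S]\mathcal{R}(t)$ or $\lambda(t)[S]\mathcal{D}(t)$ or $\lambda(t) \perp \mathcal{R}(t)$ or $\lambda(t) \perp \mathcal{D}(t)$, respectively.

PROOF. Let $\zeta$ be elementary, i.e., let

\[ \zeta = f(x(t), v) - f(x(t), u(t)) \in \mathcal{R}(t) \qquad (t < \tau) \]

Then $z = W(\tau, t)\zeta \in \mathcal{K}(\tau)$. Let $a = \lambda(\tau)$. Then $a \perp$ or $[S]\,\mathcal{K}(\tau)$ implies $a \cdot z = 0$ or $a \cdot z \le 0$. But by the last lemma $a \cdot W(\tau, t)\zeta = \lambda(\tau) \cdot W(\tau, t)\zeta$ is constant in $\tau$. Hence for $\tau = t$, $\lambda(t) \cdot W(t, t)\zeta = \lambda(t) \cdot \zeta = 0$ or $\le 0$ according as $a \cdot z = 0$ or $\le 0$. A similar proof is true for $\sum \alpha_i \zeta_i$ independently of whether or not the $\alpha_i$ are positive.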
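Definition. Let $\mathcal{H}(\lambda, x, u) \triangleq \lambda \cdot f$, $\mathcal{M}(\lambda, x) \triangleq \sup_{u \in \Omega} \mathcal{H}(\lambda, x, u)$.

Theorem 3. Given any solution $x(t)$ and control $u(t)$, if there exists $a$ such that $a[S]\bar{\mathcal{K}}(t')$ then there exists $\lambda(t)$ such that

\[ \mathcal{M}(\lambda(t), x(t)) = \mathcal{H}(\lambda(t), x(t), u(t)) \]

at all regular $t < t'$.

PROOF. It will be sufficient to show that for each regular $t$

\[ \mathcal{H}(\lambda(t), x(t), u(t)) \ge \mathcal{H}(\lambda(t), x(t), v) \qquad (\text{all } v) \]

or

\[ 0 \ge \lambda(t) \cdot f(x(t), v) - \lambda(t) \cdot f(x(t), u(t)) \]
\[ 0 \ge \lambda(t) \cdot [f(x(t), v) - f(x(t), u(t))] \qquad (\text{all } v) \]
\[ 0 \ge \lambda(t) \cdot \zeta, \qquad \zeta \in \mathcal{R}(t) \]

i.e., if $\lambda(t)[S]\mathcal{R}(t)$ for all $t$. By the last lemma this is possible if $\lambda(t') = a[S]\mathcal{K}(t')$. Since $\mathcal{K}(t') \subset \bar{\mathcal{K}}(t')$ it is enough to have $a[S]\bar{\mathcal{K}}(t')$.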
Corollary 1. This theorem implies the first conclusion of the maximum principle.

PROOF. If $x^*(t)$ with control $u^*(t)$ is optimal we know that the cone $\bar{\mathcal{K}}(t')$ can not contain the vertical (downward) segment through $x^*(t')$. It is an elementary proposition that if a cone does not contain a given segment then the segment can be separated from the cone (by a hyperplane).

Corollary 2.† If $\dim \bar{\mathcal{L}}(t') < n + 1$ there exists $\lambda = \lambda(t)$ such that $\mathcal{H}^*(t, u) \triangleq \mathcal{H}(\lambda(t), x^*(t), u) = 0$ for all $u \in \Omega$. In this case, since $\mathcal{H}^*(t, u)$ is constant,

\[ \max_{u \in \Omega} \mathcal{H}^* = \min_{u \in \Omega} \mathcal{H}^* = \mathcal{M}(\lambda(t), x^*(t)) \]

PROOF. $\dim \bar{\mathcal{K}}(t') = \dim \bar{\mathcal{L}}(t')$. If the dimension of $\bar{\mathcal{L}}$ is $\le n$, then $\bar{\mathcal{L}}$ lies in a hyperplane and there will exist an $a = \lambda(t')$ such that $a \perp \bar{\mathcal{L}}$. Then, by the last lemma $\lambda(t) \perp \mathcal{D}(t)$, and, consequently, $\mathcal{H}^*(t, u) = 0$ for all $u \in \Omega$.
8.5 The Maximum Principle (Part II)
The result of the last section established the most quoted part of the maximum principle, which states that for an optimal solution and its control there will always exist at least one solution of the adjoint variational equations for which $\mathcal{H} = \mathcal{M}$. The most profound part of the maximum principle, however, is the fact that for this choice of $\lambda$ ($\lambda = \lambda(t)$) the maximum function is continuous. In other words, for every optimal trajectory there is a continuously moving plane through each point of the trajectory ($\lambda$ is the normal to it) such that every discontinuity in the tangent to the trajectory lies in that plane. Here we shall follow Pontryagin by introducing a function $m$ which lies between $\mathcal{M}$ and $\mathcal{H}$. First, we establish that $m$ is well behaved, and then use it as the key to the properties of $\mathcal{M}$.
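Definition. $F \triangleq \{u \mid u \in \Omega \text{ and } u = u(t),\ t_0 \le t \le t'\}$, $G \triangleq \bar{F}$ (the closure of $F$), and

\[ m(\lambda, x) \triangleq \max_{u \in G} \mathcal{H}(\lambda, x, u) \]

Lemma. $\mathcal{M}(\lambda, x) \ge m(\lambda, x) \ge \mathcal{H}(\lambda, x, u)$.

† See also Chapter 7.

Lemma. For an optimal solution $x^*(t)$ there exists a $\lambda(t)$ such that

\[ \mathcal{M}(t) \triangleq \mathcal{M}(\lambda(t), x^*(t)) = m(\lambda(t), x^*(t)) = \mathcal{H}(\lambda(t), x^*(t), u^*(t)) \triangleq \mathcal{H}(t) \]

at all regular $t$.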
Lemma. For an optimal solution $x^*(t)$ there is a $\lambda(t)$ such that $\mathcal{M}(\lambda(t), x^*(t))$ is lower semicontinuous.

Lemma. For an optimal solution $x^*(t)$ there is a $\lambda(t)$ such that $m(\lambda(t), x^*(t))$ is absolutely continuous and $dm/dt = 0$.

Lemma. If $a(t)$ is l.s.c., $b(t)$ continuous, $a(t) \ge b(t)$, and $a(t) = b(t)$ almost everywhere then $a(t) = b(t)$.

The proof of the first lemma is a direct consequence of the definition of $m$ and of Theorem 3. The proof of the second lemma follows from the first and Theorem 3. To prove the third lemma we need merely show that given $\varepsilon > 0$ and any $t'$, $\mathcal{M}(t) \ge \mathcal{M}(t') - \varepsilon$ for all $t$ sufficiently close to $t'$. Given any $t'$, by definition there exists $u'$ so that

\[ \mathcal{H}(\lambda(t'), x^*(t'), u') \ge \mathcal{M}(\lambda(t'), x^*(t')) - \varepsilon/2 \]

Since $\mathcal{H}(\lambda(t), x^*(t), u)$ is continuous in $t$ for $u$ fixed there exists $\delta(t', \varepsilon)$ such that $|\mathcal{H}(\lambda(t), x^*(t), u') - \mathcal{H}(\lambda(t'), x^*(t'), u')| < \varepsilon/2$ for $|t - t'| \le \delta(t', \varepsilon)$. Then

\[ \mathcal{M}(\lambda(t), x^*(t)) = \sup_{u \in \Omega} \mathcal{H}(\lambda(t), x^*(t), u) \ge \mathcal{H}(\lambda(t), x^*(t), u') \]

and

\[ \mathcal{H}(\lambda(t), x^*(t), u') \ge \mathcal{M}(\lambda(t'), x^*(t')) - \varepsilon \]
The statement that $m(t)$ is absolutely continuous is equivalent to the following: Given $\varepsilon$ there exists $\delta$ such that if $\sum_{1}^{N} \Delta_i = \delta$ ($\Delta_i \ge 0$) and if $|t_i - t_i'| < \Delta_i$, then $\sum_{1}^{N} |m(t_i) - m(t_i')| < \varepsilon$. Since $x(t)$ and $\lambda(t)$ are absolutely continuous (i.e., are integrals of measurable functions) and

\[ \mathcal{H}(\lambda, x, u) = \lambda \cdot f(x, u), \]

$\mathcal{H}$ is Lipschitzian in $(\lambda, x)$. For regular points $t$, $t'$, it follows that

\[ m(t) - m(t') \triangleq m(\lambda(t), x^*(t)) - m(\lambda(t'), x^*(t')) = \mathcal{H}(\lambda(t), x^*(t), u^*(t)) - \mathcal{H}(\lambda(t'), x^*(t'), u^*(t')) \]
\[ \ge \mathcal{H}(\lambda(t), x^*(t), u^*(t')) - \mathcal{H}(\lambda(t'), x^*(t'), u^*(t')) \ge -K\{|\lambda(t) - \lambda(t')| + |x^*(t) - x^*(t')|\} \]

where $K$ is a positive constant. Similarly,

\[ m(t) - m(t') \le \mathcal{H}(\lambda(t), x^*(t), u^*(t)) - \mathcal{H}(\lambda(t'), x^*(t'), u^*(t)) \le K\{|\lambda(t) - \lambda(t')| + |x^*(t) - x^*(t')|\} \]

Thus

\[ |m(t) - m(t')| \le K\{|\lambda(t) - \lambda(t')| + |x^*(t) - x^*(t')|\} \]

and the result follows. That is, $m$ is Lipschitzian in $\lambda$ and $x$ for the right solution $\lambda$. We shall now prove that $dm/dt = 0$ at regular $t'$. Since

\[ m(\lambda(t'), x^*(t')) = \mathcal{H}(\lambda(t'), x^*(t'), u^*(t')) \]

one has for any $t$

\[ m(\lambda(t), x^*(t)) \ge \mathcal{H}(\lambda(t), x^*(t), u^*(t')) \]

Thus, subtracting and dividing by $t - t'$ one has
or Let (2 -
m(t) - m(t’) X ( l ) - X ( t ’ ) 2 t - t‘ t - t’
if
t > t’
(case i)
m(t) - m(t’) < X ( t ) - X ( t ’ ) t-tf t - t’
i f t < t‘
(case ii)
t’)Z(t, t ’ )
Then
[ X ( l ( t ) ,x*(t), u*(t))- X ( k ( t ’ ) , X*(f),u*(t))] [ X ( l ( f ’ ]s*(t), , U*(f))- X(l(t’),X*(f’),U*(f))]
+
X f t ) - X(t’) = Z ( t , t’) t - t’ s Z ( t , t’) 2
Z ( t , t’)
+ X(h(t’),x*(t’) u*(t))t -- 2t’( l ( t ‘ ) ,X * ( f ’ ) , if
t < t‘
if
t > t‘
u*(t’))
Since 2 is differentiable in the components of 3, and f, and f is differentiable in the components of x , and x is differentiable in t , we conclude that limt+,t, Z(t, t‘) exists. Moreover, this limit exists with t > t‘ or t < t’. The resulting limit (essentially equivalent to taking the derivative of X assuming u constant) may be computed thus: Since 2 = k - f , we have dX dl -.-=-. dl
dt
dl dt
Therefore, $dm/dt \le 0$ and $dm/dt \ge 0$. These imply $dm/dt = 0$.

Remark. $a[S]\bar{\mathcal{K}}$ implies $a \cdot f = 0$, since $\pm\alpha f \in \bar{\mathcal{K}}$ and $a[S]\bar{\mathcal{K}}$ implies $a \cdot z \le 0$ for all $z$. This means that $a \cdot \alpha f \le 0$ and $a \cdot (-\alpha f) \le 0$. Thus, $a \cdot f = 0$. Since $0 \ge \mathcal{M}(t) \ge \mathcal{H}(t)$ and $\mathcal{H}(\lambda(t'), x^*(t'), u^*(t')) = 0$ it follows that

\[ \mathcal{M}(\lambda(t'), x^*(t')) = 0 \]

8.6 The Bang-Bang Principle
The result to be derived is this:

Theorem 4. If the control set $\Omega$ is compact and bounded by a finite number of planes, the control $u^*(t)$ for an optimal solution will have all its regular points at vertices of $\Omega$ provided the nondegeneracy conditions are satisfied at every regular point of the trajectory.

Remark. The theorem says that if at any one regular point $\tau$, $t_0 < \tau \le t'$, the conditions of nondegeneracy hold, then $u^*(t)$ must be at a vertex for all $t$ at which $x^*(t)$ is regular.

Remark. This result removes the restriction that the system be linear, that the control set be convex, and that the cost be time.

Nondegeneracy conditions. Let $f(x, u)$ have at least $n + 3$ derivatives (possibly mixed). Let

\[ A(t) \triangleq (\partial f/\partial x), \qquad B(t) \triangleq (\partial f/\partial u) \]

with $x = x^*(t)$, $u = u^*(t)$. Let $\zeta$ be a vector lying in $\Omega$ or in any of its faces. Let $\zeta^0 = B\zeta$ and $\zeta^k = A\zeta^{k-1} - \frac{d}{dt}(\zeta^{k-1})$, $k = 1, 2, \ldots, n$. Nondegeneracy requires the linear independence of $\zeta^0, \ldots, \zeta^n$.

Lemma. If $\pm\zeta \in \mathcal{K}(t)$ for all $t$ and

\[ \frac{d}{dt} W(t) = AW, \]

then $\mathcal{K} = \mathcal{K}(0)$ contains $\pm\zeta, \pm A\zeta, \ldots, \pm A^n\zeta$ ($A$ a constant matrix).
PROOF. For any $\varepsilon$ there is a $t_\varepsilon$ such that if $|t| < t_\varepsilon$ the difference between $(I + tA)\zeta$ and $\left(\sum_{n=0}^{\infty} t^n A^n/n!\right)\zeta$ is less than $Et$, where $|E| \le \varepsilon$. Thus, the difference between $W(t)\zeta$ and $\zeta$ can be written as $(I + tA)\zeta + Et - \zeta = tA\zeta + Et = t(A\zeta + E)$. When normalized to a unit vector this becomes $A\zeta + E$. Because $W(t, t_0) = W(t - t_0)$ it follows that if a vector in the cone at $t_0$ is transported to $t_0 + t$, it is at that point just $W(t)\zeta$. Therefore, the value at $t = 0$ of the control vector applied at time $-t$ is $\pm W(t)\zeta$. When this effect is added to that determined by $\mp\zeta$ in $\mathcal{K}(0)$ one has $A\zeta + E(t)$, and as $t \to 0$ this is $A\zeta$. Thus the cone at $t = 0$ certainly includes $\pm\zeta$ and $\pm A\zeta$. A repetition of this argument establishes the lemma: By the above argument the cone at every point must contain both $\pm\zeta$ and $\pm A\zeta$ since the conditions at $t = 0$ are invariant under translation. Now, apply this argument to $\pm A\zeta$ (instead of $\zeta$) and this implies that $\pm A^2\zeta$ is also in the cone at each point, etc.
Lemma. If $\pm\zeta \in \mathcal{K}(t)$ for all $t$ and ($A(t)$ continuous)

\[ dW/dt = A(t)W \]

then $\mathcal{K}(t)$ contains $\pm\zeta, \pm A(t)\zeta, \ldots, \pm A^n(t)\zeta$.

PROOF. Continuity of $A(t)$ implies that $W(t)\zeta = (I + tA(0) + tE)\zeta$ where $E \to 0$ as $t \to 0$. The last argument implies then that $\mathcal{K}(t)$ must contain both $\zeta$ and $A(t)\zeta$.
dW/dt = A(t)W
then X ( t ) contains <(t)and A ( t ) ( ( t )- i ' ( t ) , where c'(t) = dl;(t)/dt.
PROOF. Since W(0, -t)<(-t) =
+
+
{ I tA(0) E t } { ( ( O ) - t<'(O) = G(0) t(A(O)( - <'(O)) Et =
it follows that
+
((0)= w(0, - t)<(- t ) =
-
+
f[A(o)< - 5'(0)]
+Et}
+ Et
Normalizing and taking limits one has that X ( 0 ) contains i-(A(O)((O) ('(0)).
Lemma. If $\pm\zeta(t) \in \mathcal{K}(t)$ and

\[ dW/dt = A(t)W \]

then $\mathcal{K}(t)$ contains

\[ \zeta^0(t), \zeta^1(t), \ldots, \zeta^m(t) \]

where

\[ \zeta^s(t) = A(t)\zeta^{s-1}(t) - \dot{\zeta}^{s-1}(t) \qquad \text{for } s = 1, 2, \ldots, m, \qquad \zeta^0 = \zeta \]
PROOF. This result is obtained by iterating the statement of the last lemma.

The proof of the general bang-bang result now follows from the observations that if $\tau$ is a regular point with $u^*(\tau)$ in the interior of $\Omega$ at which the nondegeneracy conditions are satisfied, there will be a set of positive measure whose closure includes $\tau$ for which the above lemmas hold. This would therefore imply that the cone at $\tau$ had dimension $n + 1$. The impossibility of this for optimal solutions implies thus that any regular point must be on the boundary of $\Omega$. This argument does not use the dimensionality of $\Omega$ in any way. Let $C$ be a face of $\Omega$. If $u^*(\tau)$ is an interior point of $C$ the same argument gives some vector $\pm\zeta \in \mathcal{K}$. The nondegeneracy conditions, again, lead to a contradiction. Hence, if $u^*(\tau)$ is on $C$ it must be on a face of $C$. This keeps up until one runs out of dimensions, i.e., is at a vertex.

From the above arguments it follows that, if there is an interior point in the control set and if the nondegeneracy condition holds at that point, then $\mathcal{K} = \mathcal{L}$ and $\dim \mathcal{L} = n + 1$; however, if $\dim \mathcal{L} = n + 1$ the trajectory cannot be optimal. This contradiction proves the theorem, for it establishes that if the nondegeneracy condition holds at one regular point then the control must be bang-bang there. Professor Gamkrelidze has pointed out to us that requiring the nondegeneracy condition at only one point is insufficient; it may happen that $f(x, u)$ is independent of $u$ on some interval of time. There are extensions of the above result which relate all conditions to one point. These follow by applying the last lemma not only to $\zeta(t) = B\zeta_0$ but also to $W(t, t')\zeta(t')$.

REFERENCE

1. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, "The Mathematical Theory of Optimal Processes" (K. N. Trirogoff, transl.; L. W. Neustadt, ed.), Wiley (Interscience), New York, 1962.
Synthesis of Optimal Controls
BERNARD PAIEWONSKY
INSTITUTE FOR DEFENSE ANALYSES, ARLINGTON, VIRGINIA
9.0 Introduction
9.1 Neustadt's Synthesis Method
9.11 Time-Optimal Control
9.12 Time-Optimal Control with Effort Constraints
9.13 Minimum Effort Control
9.2 Computational Considerations
9.21 Two Convergence Acceleration Methods
9.22 Optimal Space Rendezvous
9.3 Final Remarks
References

9.0 Introduction
Optimization problems for linear systems with bounded controls are generally easy to formulate using the maximum principle but from a computational standpoint they are often difficult to solve. Many ideas have been put forward for synthesizing minimum time and minimum effort controllers for linear systems but only a few of these have proved successful when applied to actual computation. Algorithms for synthesizing minimum time and minimum effort controllers have been developed by Neustadt¹⁻³ and these algorithms do provide convergent iterative procedures for solving the two-point boundary value problems associated with the optimization. The best accounts of Neustadt's methods for the time-optimal control problem and the minimum-effort control problem are found in Neustadt's original papers. These papers are clearly written and combine a lucid presentation with a high standard of mathematical rigor and precision. The purpose of this chapter is to present an introductory account of these optimal control synthesis ideas together with some examples of engineering applications and computational techniques.

There have been other investigations along the general mathematical lines followed by Neustadt. Krasovskii⁴ and Gamkrelidze⁵ independently derived synthesis procedures for time-optimal problems similar in some respects to Neustadt's. Results of Russian computational studies along these lines have not yet appeared in publication. Eaton⁶ has also investigated a synthesis procedure based on similar geometrical ideas but again no computational results are available. Fadden and Gilbert⁷ have applied Neustadt's method to the time-optimal regulation of the system x = u ($|u| \le 1$) and have made numerical studies of the convergence of the iterations.

9.1 Neustadt's Synthesis Method
The synthesis method described here was developed by Neustadt for the time-optimal problem with bounded controls and was later extended to include time-optimal systems with control effort constraints as well as minimum effort controllers. The time-optimal control problem will be discussed in detail first; the extension to systems with effort constraints will be taken up afterward.

9.11 Time-Optimal Control

Let us suppose that we want to control a system governed by linear differential equations with variable coefficients of the form:

\[ \dot{x}(t) = A(t)x(t) + B(t)u(t) \qquad (9.1) \]
In this equation $x$, an $n$-vector, is the state of the system at time $t$, $u$ is an $m$-vector of control variables, $A$ is an $(n \times n)$ matrix, and $B$ is an $(n \times m)$ matrix. Suppose also that we are given the initial conditions $x(0) = x_0$. We want to determine $u(t)$ on the interval $[0, t^*]$ such that $t^*$ is the smallest value of $t$ for which $x(t^*) = 0$. Neustadt treats the case where the system is to be transferred from $x_0$ to $x_1$ but in this section we will discuss only the case of $x_1 = 0$. The control vector function $u(t)$ is assumed to be piecewise continuous and the values of $u(t)$ are required to lie in a set $\Omega$. This set is closed, bounded, and convex and contains the origin as an interior point. Any $u(t)$ satisfying these conditions is called an admissible control function. To find the optimal control we introduce the adjoint variables $\lambda_i(t)$ ($i = 0, \ldots, n$), form the Hamiltonian† of Eq. (9.2), and maximize it over all admissible $u$. The adjoint variables satisfy the differential equations

\[ \dot{\lambda}_i = -\sum_{j=1}^{n} A^T_{ij} \lambda_j \quad (i = 1, \ldots, n), \qquad \dot{\lambda}_0 = 0 \qquad (9.3) \]

† The elements of matrices $A$ and $B$ are denoted by $A_{ij}$ and $B_{ik}$, respectively; $A^T_{ij}$ denotes an element of $A^T$, the transpose of $A$.
The solutions of the adjoint equations can be found in terms of the inverse of the fundamental solution matrix to the system (9.1). The matrix $X^{-1}(t)$ satisfies the differential equation:

\[ \frac{d}{dt} X^{-1} = -X^{-1} A \qquad (X^{-1}(0) = I) \qquad (9.4) \]

We may write

\[ \lambda(t) = -[X^{-1}]^T \eta \qquad (9.5) \]

with $\lambda_i(0) = -\eta_i$. The relation between the optimal control and the adjoint comes from the maximization condition:

\[ \max_{u \in \Omega} \mathcal{H} = \max_{u \in \Omega} \sum_{i=1}^{n} \sum_{k=1}^{m} \lambda_i B_{ik} u_k \]
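For a concrete feel for the maximization condition, here is a minimal sketch under one added assumption: the control set is taken to be the box $|u_k| \le 1$, which is only one special case of the convex sets the text allows. For that box the maximizing control is the sign of the switching function $s_k = \sum_i \lambda_i B_{ik}$. The function names are hypothetical, and the brute-force search over the corners of the box is included only as a cross-check.

```python
from itertools import product

def switching_function(lam, B):
    """Return s_k = sum_i lam_i * B[i][k], one entry per control component."""
    return [sum(l * row[k] for l, row in zip(lam, B)) for k in range(len(B[0]))]

def maximizing_control(lam, B):
    """Maximizer of sum_k s_k * u_k over the assumed box |u_k| <= 1.

    Where s_k == 0 the maximizer is not unique; 0 is returned for that entry.
    """
    return [(s > 0) - (s < 0) for s in switching_function(lam, B)]

def brute_force(lam, B):
    """Cross-check: search the corners of the box for the largest value."""
    s = switching_function(lam, B)
    return max(product((-1, 1), repeat=len(s)),
               key=lambda u: sum(a * b for a, b in zip(s, u)))
```

With `lam = [0.5, -2.0, 1.0]` and `B = [[1, 0], [0, 1], [1, 1]]` the switching function is `[1.5, -1.0]`, so both routines select the corner `(1, -1)`. A zero entry of the switching function is exactly the situation excluded by the uniqueness assumption.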
It is assumed that for almost every $t$ a unique maximum exists. In the time-optimal bang-bang problem this is LaSalle's normality condition.⁸ The optimal control law obtained from the maximum principle contains $(n - 1)$ parameters. These are the $(n - 1)$ independent initial conditions for the homogeneous system of adjoint differential equations. The problem now is to find the initial conditions $\lambda_i(0) = -\eta_i$ for the adjoint variables so that the system (9.1) goes from the initial state to the final state when the control $u^*(t, \eta)$ derived from the maximum principle is applied. The solution of the system (9.1) can be written as

\[ x(t) = X(t) \left[ x(0) + \int_0^t X^{-1}(\tau) B(\tau) u(\tau, \eta)\, d\tau \right] \qquad (9.6) \]

where $X(t)$ is the fundamental solution matrix of (9.1). Let the desired final state, $x(T)$, be the origin. The set $C(t)$, $t \ge 0$, consisting of the set of points swept out by the vectors

\[ z(t) = -\int_0^t X^{-1}(\tau) B(\tau) u(\tau)\, d\tau \qquad (9.7) \]

for a given positive $t$ and for all admissible $u$ is called the reachable set with respect to the origin. This set contains all points from which the origin can be reached within time $t$. The boundary of $C(t)$ for each $t$ is an optimal isochrone. The conditions that $\Omega$ be compact and convex insure that $C(t)$ will be closed,
bounded, and convex. Furthermore, if $t < t'$ then $C(t)$ is contained in $C(t')$. The normal to the support plane or tangent plane at a point of the boundary of $C(t)$ is the vector $\eta$, i.e., the initial condition for the adjoint. The function

\[ z(t, \eta) = -\int_0^t X^{-1}(\tau) B(\tau) u(\tau, \eta)\, d\tau \qquad (9.8) \]

for any $\eta \ne 0$ and $t > 0$ is a mapping of vectors $\eta$ into vectors $z(t, \eta)$. This function generates the boundary of $C(t)$ for a given $t > 0$.

We begin an iterative search for the optimal $\eta$ by guessing the slope of the support plane at $x(0)$, i.e., guessing a starting value for $\eta$ (call it $\eta^1$). In computational studies it may be convenient to use a unit vector parallel to $x(0)$ as a starting value for $\eta$ ($\eta^1 = -x(0)/|x(0)|$ if no better information is available). The key to understanding the geometrical significance of the various steps involved in this method is found in the theory of convex sets. We generate the vector $z(t, \eta^i)$ as a function of $t$ and we seek the point where this curve crosses the trial support plane with normal $\eta^i$ through $x(0)$. This is shown in Fig. 1(a). More specifically, we seek the time when $\eta^i \cdot [z(t, \eta^i) - x(0)] = 0$. This time is called $F(\eta^i)$, i.e., $\eta^i \cdot [z(F(\eta^i), \eta^i) - x(0)] = 0$. Let $f(t, \eta)$ denote the function $\eta \cdot [z(t, \eta) - x(0)]$. This relationship can be rewritten as $f(F(\eta), \eta) = 0$. The scalar product $\eta^i \cdot [z(t, \eta^i) - x(0)]$ is a monotonic increasing function of time. It starts from a negative value $\eta^i \cdot (-x(0))$ and vanishes when $z(t, \eta^i)$ crosses the hyperplane through $x(0)$ with normal $\eta^i$. The location of this point on the hyperplane is also shown in Fig. 1(a). This $\eta^i$ is optimal for the point $z[F(\eta^i), \eta^i]$, so $\eta^i$ is the normal to the boundary of $C[F(\eta^i)]$ at the point $z[F(\eta^i), \eta^i]$. The time $F(\eta^i)$ is less than the optimal time $t^*$ because $C[F(\eta^i)]$ is contained in $C(t^*)$.

The justification for these assertions depends on the convexity of $C(t)$. We see that $x(0)$ lies on the boundary of the set $C(t^*)$. The convexity of $C[F(\eta^i)]$ assures us that $\eta^i \cdot z[F(\eta^i), \eta^i] > \eta^i \cdot y$ for all vectors $y$, $y \ne z[F(\eta^i), \eta^i]$, in $C[F(\eta^i)]$. This is easy to see by looking at the projection of $z[F(\eta^i), \eta^i]$ onto the normal to the hyperplane as shown in Fig. 1. But $\eta^i \cdot x(0) = \eta^i \cdot z[F(\eta^i), \eta^i]$ by definition of $F(\eta^i)$.
Therefore, $x(0)$ is not in $C[F(\eta^i)]$ but lies outside it and the optimal time $t^*$ is greater than $F(\eta^i)$ unless $z[F(\eta^i), \eta^i] = x(0)$; in that case $F(\eta^i) = t^*$. This means that $F(\eta)$ is maximized when $z[F(\eta), \eta] = x(0)$. The two-point boundary-value problem is now transformed into the problem of locating the maximum value of a function of several variables. The goal of this iterative method is to find the $\eta$ that maximizes $F(\eta)$. This vector, $\eta^*$, causes the boundary conditions to be satisfied when $u^*(t, \eta^*)$ is applied to the system (9.1). It is important to note that it is the location of the maximum value of $F(\eta)$ which determines the optimal control. The maximum value itself is not used in the solution of the two-point boundary-value problem.

FIG. 1. Geometrical aspects of Neustadt's method. (a) Path visualization: the support plane through $x(0)$, the boundary of $C(t^*)$, the boundary of $C(F[\eta^i])$, and the path traced by $z(t, \eta^i)$ as a function of $t$ for fixed $\eta^i$; at the crossing point $\eta^i \cdot (z(F[\eta^i], \eta^i) - x(0)) = 0$. (b) Normal projection: $\eta \cdot z(F[\eta], \eta) > \eta \cdot y$ for all $y \in C(F[\eta])$, $y \ne z$.

Neustadt has shown, in the references cited, that $z(t, \eta)$ is a continuous function of $\eta$, and the gradient of the function being maximized, $\nabla F(\eta^i)$, lies along the error vector $z[F(\eta^i), \eta^i] - x(0)$. This result is the key to the practical application of steep ascent methods in maximizing $F(\eta)$.

Example 1. Here is a concrete example which will illustrate the method. Consider the second order system:

\[ \dot{x}_1 = x_2, \qquad \dot{x}_2 = u \qquad (|u| \le 1) \]
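Suppose we want to design a time-optimal regulator, i.e., the terminal state is x^1 = 0. We apply the maximum principle. The Hamiltonian is
$$\mathcal{H} = \lambda_1 x_2 + \lambda_2 u + 1$$
The optimal control is u* = sgn λ_2. The adjoint equations λ̇_1 = 0, λ̇_2 = -λ_1 are easily integrated to give λ_1 = -q_1, λ_2 = -q_2 + q_1 t. Consequently u* = sgn(q_1 t - q_2). The matrices X(t) and X^{-1}(t) are easily found:
$$X(t) = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}, \qquad X^{-1}(t) = \begin{pmatrix} 1 & -t \\ 0 & 1 \end{pmatrix}$$
With z(t, q) = -∫_0^t X^{-1}(τ) B u*(τ, q) dτ and B = (0, 1)^T, the components of z(t, q) are
$$z_1(t, q) = \int_0^t \tau\,\operatorname{sgn}(q_1\tau - q_2)\,d\tau, \qquad z_2(t, q) = -\int_0^t \operatorname{sgn}(q_1\tau - q_2)\,d\tau$$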
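The stopping time F(q) of Example 1 can be explored numerically. The following is a minimal sketch, not the author's program: it assumes the sign convention z(t, q) = -∫ X^{-1}(τ) B u*(τ, q) dτ (the convention used for Eq. (9.19) in Example 2), and the function names `z_path` and `stopping_time`, the grid size, and the test point x(0) = (1, 0) are illustrative choices. For that point the time-optimal control switches once at t = 1 and the optimal time is 2, which corresponds to q proportional to (1, 1).

```python
import numpy as np

def z_path(q, t_max=10.0, n=20001):
    """Times and z(t, q) for x1' = x2, x2' = u, |u| <= 1.

    Assumed convention: z = -integral of X^{-1}(tau) B u*(tau, q),
    with u* = sgn(q1*tau - q2) and X^{-1}(tau) B = (-tau, 1).
    """
    t = np.linspace(0.0, t_max, n)
    u = np.sign(q[0] * t - q[1])
    integrand = np.stack([t * u, -u], axis=1)   # integrand of (z1, z2)
    dt = t[1] - t[0]
    z = np.zeros_like(integrand)
    # cumulative trapezoidal rule
    z[1:] = np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * dt, axis=0)
    return t, z

def stopping_time(q, x0, **kwargs):
    """First time at which f(t, q) = q.[z(t, q) - x0] reaches zero, and z there."""
    t, z = z_path(q, **kwargs)
    f = (z - x0) @ q
    hits = np.nonzero(f >= 0.0)[0]
    if hits.size == 0:
        raise ValueError("f(t, q) did not reach zero on the time grid")
    k = hits[0]
    if k == 0:
        return t[0], z[0]
    w = -f[k - 1] / (f[k] - f[k - 1])           # linear interpolation weight
    return t[k - 1] + w * (t[k] - t[k - 1]), z[k - 1] + w * (z[k] - z[k - 1])

x0 = np.array([1.0, 0.0])
t_opt, z_opt = stopping_time(np.array([1.0, 1.0]), x0)    # q at the optimum
t_other, z_other = stopping_time(np.array([1.0, 0.5]), x0)  # a non-optimal trial
print(t_opt, z_opt - x0)
print(t_other, z_other - x0)
```

Under these assumptions the first trial returns the optimal time with a vanishing error vector, while the second returns a smaller stopping time with a nonzero error vector, in line with the statement that F(q) does not exceed t*.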
The boundaries of C(t) (optimal isochrones) corresponding to four values of t are shown in Fig. 2. The isochronal curves in this example have sharp corners at the switching curves and the q vectors are not unique at these points.† To examine what takes place when we apply Neustadt's method, let us focus our attention on a typical initial point, x(0), shown in the upper right-hand portion of the figure. The first trial value q^1 is taken opposite x(0) and a piece of the trial support plane normal to q^1 is shown passing through x(0). The path traced by the function z(t, q^1) is shown by the dashed line beginning at the origin, coinciding with the upper branch of the switching curve for a short time and then reversing direction and intersecting the plane normal to q^1 when f(t, q^1) = 0. The gradient of F(q^1) lies along the line z(F(q^1), q^1) - x(0), and a small change in q taken in this direction will rotate q^2 to the left and reduce the inclination of the trial support plane for the next iteration. The eighth and ninth trials using steepest ascent also appear in the figure, and it can be seen that relatively small changes in the slope of the supporting hyperplane produce relatively large changes in the error vector z(F(q), q) - x(0). This is due in part to the small curvature of the isochrones in the vicinity of the initial point. The numerical aspects of the iterations, such as the number of trials needed to obtain satisfactory results, will be discussed in more detail later on in connection with another example.
Remark. A digital computer study of a simplified rocket steering problem using the equations
$$\dot{x}_1 = \frac{\beta c}{m - \beta t}\cos\alpha, \qquad \dot{x}_2 = x_3, \qquad \dot{x}_3 = \frac{\beta c \sin\alpha}{m - \beta t} - g \tag{9.9}$$
† See also Chapter 7.
FIG. 2. Optimal isochrones, ẋ_1 = x_2, ẋ_2 = u(t).
pointed out certain aspects of the problem which had not been fully appreciated at the start. The function F(q) was a very flat function of q. The optimum step size could not be found satisfactorily by comparing values of F(q). Table I (from Paiewonsky et al.) shows the insensitivity of F(q) to small changes in q. The error vector z(F(q), q) - x(0), however, is a very sensitive function of q, and Table I also shows how this varied in a particular instance. These digital computer studies showed that a significant reduction in computation time was possible by integrating the z(t, q) equations in advance and avoiding numerical integration. In general, however, it is not possible to evaluate the integrals explicitly and the problem of the growth of numerical integration errors must be faced. Powell's method worked very well here but the details are omitted (see Paiewonsky) because Example 2 provides a more comprehensive comparison of several methods used for accelerating steep ascent.
TABLE I
TYPICAL VALUES OF F(q) AND |z(F(q), q) - x(0)|

q_1          q_2          q_3         F(q)        |z - x(0)|
0.54580027   0.01316405   1.0729829   148.56863    22,700
0.54580027   0.01316405   1.0729839   148.56906    52,705
0.54580027   0.01316405   1.0734830   148.56824     7,410
0.54580027   0.01314405   1.0734830   148.56763   121,610
Hybrid analog-digital computer studies (Paiewonsky et al.¹⁰) showed that it is entirely feasible to carry out the iterations using simplified step selection procedures. The computation time can be made quite short by proper scaling but the accuracy of the computation is poorer than in the digital studies. In these examples time scales of 320 and 3200 times real time were used with typical run times of 500 and 50 ms, respectively. Convergence was obtained, in typical cases, in fewer than 20 trials including 7 steps; the time was 15 sec. Higher speed operation yielded similar results in 1.5 sec. The upper limit on time scale is due to the particular capacitors on the integrators. Special purpose analog computers could undoubtedly operate in swifter modes. It makes sense to talk about iteration convergence time only when some measure of terminal error is prescribed and there is a preassigned threshold of acceptance for terminal errors. The digital computer iterations were terminated whenever the system terminal errors were within one part in 5000 of the specified end conditions. The hybrid analog-digital computer used a criterion which ranged between 1 part in 50 and 1 part in 200. This accuracy can be improved by rescaling and using digital techniques for selecting the step size.
9.12 Time-Optimal Control with Effort Constraints
We have so far discussed only the time-optimal control problem for linear systems with bounded controls. Many engineering problems, however, require consideration of a different type of constraint on the controls. These constraints are due to controls which depend on the storage or accumulation of required commodities. For example, systems depending upon controls which act by the expulsion of mass are clearly limited by the total amount of expellant carried along. The expelled mass may be accelerated by a mechanism which depends on a supply of stored energy. Separately powered rockets (e.g., ion rockets) require a source of electrical power and a supply of propellant, and there will be two constraints to examine. Chemical rockets, on the other hand, supply their own energy for the expulsion of the propellant and only a constraint on the total propellant mass may appear. The limitations on the systems as just described can be concisely expressed as follows:
$$E(u(t)) = \int_0^t \varphi(u)\,dt \le M \tag{9.10}$$
The functional E(u(t)) is called the control effort and φ(u) is called the effort function. There may be more than one effort function appearing in the problem. We now ask for an admissible control which transfers the system from an initial state point to a terminal state point in the least time, and has the property that E(u(t)) ≤ M. The constant M represents an assigned storage limit of the general type just described. It is important to keep in mind that these effort constraints are quite different from restrictions or bounds on the state variables. Constraints on the state variables are more difficult to handle and they will not be treated here at all. The synthesis of time-optimal controllers for systems with effort constraints follows the general path described in Sec. 9.11. The notion of the reachable sets C(t) in n-space is modified to include the control effort as coordinate n + 1. The additional component of the adjoint will be q_{n+1} (q_{n+1} > 0). The Hamiltonian for this problem contains terms dependent on the control effort:
$$\mathcal{H} = \lambda\cdot(Ax + Bu) + q_{n+1}\,\varphi(u) + \lambda_0$$
Define C̃(t) as the set swept out by the augmented vectors z̃(t) = (z_1, ..., z_n, z_{n+1}) for all admissible controls (u ∈ Ω). The system can be brought to the origin from any point x̃ = (x_1, ..., x_n, x_{n+1}) in C̃(t) by an admissible control with effort x_{n+1}. The reachable sets are nested as before:
$$\tilde{C}(t') \supset \tilde{C}(t) \quad \text{if } t' > t$$
We again assume that Ω is closed, convex, and contains the origin as an interior point. We do not need to assume that Ω is bounded. This allows us to remove the constraint on the control magnitude and to treat certain idealized physical problems with impulsive controls. Neustadt's results show that two further assumptions on A, B, φ, and Ω are needed. First, we assume that C̃(t) is closed and convex for every t ≥ 0 and furthermore, if x̃ ∈ C̃(τ) for all τ > t, then x̃ ∈ C̃(t). Second, we require a generalized unique maximum condition to be satisfied. It is assumed that λ(t, q)·Bu + q_{n+1}φ(u) has a unique maximum in Ω for almost every t, 0 ≤ t < ∞. The function φ(u) vanishes at the origin and Ω contains the origin, so the maximum is nonnegative.
Neustadt's results show that if the above conditions are satisfied and if there is an admissible control which satisfies the effort constraint and which also brings the system from x^0 to the origin, then there is a unique time-optimal control u*(t, q̃), where q̃ is any vector which maximizes F(q̃, x̃^0). This function is the smallest value of t for which f(t, q̃, x̃^0) vanishes, where
$$f(t, \tilde{q}, \tilde{x}^0) = q\cdot[z(t, \tilde{q}) - x^0] + q_{n+1}\,[z_{n+1}(t, \tilde{q}) - x^0_{n+1}]$$
The computation of the optimal q̃ may be a little more complicated in this case because of the nonlinear maximization which occurs in the application of the maximum principle, and also because of possible discontinuities in F(q̃, x̃^0).
If there is an optimal control which satisfies the conditions of the problem and if the effort is less than the maximum allowed, then the solution is clearly the same as the corresponding time-optimal control without the constraint. As the boundary conditions are changed to make them more difficult to satisfy, or as the allowed control effort is reduced in magnitude, the problem will resemble a classical isoperimetric problem with an equality constraint of the form ∫ φ(u) dt = M.
9.13 Minimum Effort Control
There is a third optimization problem, closely related to the time-optimal problem with an effort constraint. They are, in effect, dual problems. We are given a time T > 0. We seek to find an admissible control u(t) which transfers the system (9.1) from x^0 to the origin in time T, and minimizes the control effort ∫_0^T φ(u) dt. Assume as before in Sec. 9.12 that Ω is closed and convex, C̃(T) is closed and convex, φ(u) is continuous and bounded from below on Ω, and the generalized unique maximum condition is satisfied. We also assume that there is an admissible control which will transfer x^0 to the origin in time T and, furthermore, that there is more than one possible value for the control effort so that the minimization is not trivial. Neustadt has proved that if the foregoing conditions are met then there is a unique minimum effort control u(t, q̃), where q̃ = (q, -1), λ(t, q) is the solution of the adjoint equation with λ(0) = -q, and q is any vector which maximizes the function q·x(0) - q̃·z̃(T, q). In the expression above z̃ = (z, z_{n+1}), where z_{n+1}(T, q) = ∫_0^T φ[u(t, q)] dt. The maximum of this function is the minimum effort ∫_0^T φ(u(t, q̃)) dt.
The computation of the optimal control in this case is actually the most direct of the three. As before, an iteration procedure based on steepest ascent
can be used. Neustadt³ shows that the gradient of [-q̃·z̃(T, q)] is -z(T, q) (n components). Furthermore,†
$$z_i(T, q) = \sum_j X^{-1}_{ij}(T)\,x_j(T, q) \qquad (i = 1, \ldots, n)$$
where x(t, q) is the solution of the system (9.1) with u(t) = u*(t, q).
9.2 Computational Considerations
We have seen that Neustadt's synthesis procedures for the three optimal control problems require the solution of an ordinary maximization problem in each case. In fact the successful application of Neustadt's method depends almost entirely on being able to locate the maximum point accurately with a reasonable amount of computation. In this section we discuss some methods which can be used and we shall also present examples and results of computational studies. It was pointed out in Sec. 9.1 that the gradient of the function being maximized lies along the error vector z(F(q), q) - x(0). This function can easily be computed and it should come as no surprise that we attempt to use the gradient in a hill-climbing routine. The terms "gradient method" and "steep ascent method" as used here apply to maximization techniques which take successive steps in directions obtained by linear transformations of the local gradient vector. The method of steepest ascent is the special case wherein steps are always taken in the direction of the local gradient vector. It is well known that the convergence of steep ascent iterations can be slow and, because of this, special techniques have been developed to speed up the rate of convergence and eliminate time-consuming oscillations. These accelerated gradient methods determine the direction of successive steps from observations of the gradient vectors at several points instead of using only the local gradient. Suppose we want to maximize a given function of n variables, F(x). We make an initial guess at the solution and at the kth step the algorithm determines a vector g^k (the step direction) and a real number h^k (the step size). The (k + 1)st point in the iteration is x^{k+1} = x^k + h^k g^k. The variations of the steep ascent idea differ according to the rules used to generate the step size h^k and the direction of the step g^k. One rule for choosing the step size is simply to make it a constant.
† The elements of the matrix X^{-1} are denoted by X^{-1}_{ij}.
This rule is easy to apply but it may lead to difficulties in obtaining convergent iterations. As successive steps are taken in a given direction, say in the direction of ∇F, the values of F(x) (in that direction) will increase initially but may begin to decrease again for a large enough step away from the initial point. The step (h^k)* corresponding to the point where the rate of change of F(x) along the line of march vanishes is called the optimum step. That is, the optimum step occurs at the first local maximum along a line whose direction is specified by the rule of the gradient method being used. An obvious way to compute the optimum step is to search for it. Select a small quantity δh and successively compute F in the direction given by g^k at the points
$$x^k + \delta h\, g^k, \quad x^k + 2\,\delta h\, g^k, \quad \ldots$$
Sooner or later F will decrease; if this happens at the point x^k + m δh g^k, go one step back, halve δh, and try the point x^k + (m - ½) δh g^k, etc. One of the best available methods⁹ᵃ to find (h^k)* makes use of the fact that the derivative of the function along the line of march,
$$\frac{d}{dh}\left[F(x^k + h\,\nabla F(x^k))\right] = \nabla F[x^k + h\,\nabla F(x^k)]\cdot\nabla F(x^k)$$
vanishes at h^k = (h^k)*. We have to solve the equation
$$\nabla F(x^k + (h^k)^*\,\nabla F(x^k))\cdot\nabla F(x^k) = 0 \tag{9.12}$$
for (h^k)*, the optimal step size.
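Equation (9.12) is a scalar root-finding problem in h. The sketch below is one possible way to solve it, by bracketing and bisection on the directional derivative; it is an illustration under stated assumptions, not the procedure of the cited reports. The name `optimum_step`, the initial bracket `h_init`, and the quadratic test function are hypothetical, and the routine assumes that the first sign change found along the line of march is the first local maximum. It returns the lower end of the final bracket, so the step is slightly smaller than the optimum rather than larger.

```python
import numpy as np

def optimum_step(grad_F, x, h_init=1.0, tol=1e-10):
    """Approximate the optimum step along grad F(x) by solving Eq. (9.12).

    slope(h) = grad F(x + h g) . g, with g = grad F(x), is positive before
    the optimum step and negative after it (assumed single sign change).
    """
    g = grad_F(x)

    def slope(h):
        return grad_F(x + h * g) @ g

    lo, hi = 0.0, h_init
    for _ in range(60):                 # expand until the slope changes sign
        if slope(hi) <= 0.0:
            break
        lo, hi = hi, 2.0 * hi
    else:
        raise ValueError("no sign change found along the line of march")
    while hi - lo > tol:                # bisection on the directional derivative
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return lo                           # never exceeds the optimum step

# Illustrative test function: F(x) = -0.5 x^T Q x, so grad F(x) = -Q x.
Q = np.diag([1.0, 10.0])
grad_F = lambda x: -(Q @ x)
x_k = np.array([1.0, 1.0])
h_star = optimum_step(grad_F, x_k)
g_k = grad_F(x_k)
print(h_star, grad_F(x_k + h_star * g_k) @ g_k)
```

For a quadratic the optimum step is known in closed form, (g·g)/(g·Qg), which gives a direct check on the bisection result.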
FIG.3. The gradient direction method for finding the optimum step.
Suppose we start at the point x^k and march in the direction ∇F as shown in Fig. 3. As we move along the line the direction of the local gradient is calculated at successive points. The arrows show the angle between ∇F(x^k + h^k ∇F(x^k)) and ∇F(x^k) at a typical point. The local gradient will be orthogonal to the original direction at the optimal step. In many cases it will be unnecessary to find the optimal step with great accuracy. It may suffice to ensure that the size of the step is less than the optimum. The sign and magnitude of the dot product ∇F(x^k + h ∇F(x^k))·∇F(x^k) provide useful criteria for this.
9.21 Two Convergence Acceleration Methods
Davidon-Fletcher-Powell Method. This convergence acceleration scheme was originally due to Davidon¹¹ and later modified by Fletcher and Powell.¹² The papers discuss the problem of minimizing functions of n variables but the method can be applied to maximization by making obvious modifications. It is an iterative gradient technique which computes the gradient of F(x) at successive values of the n-vector x and attempts to find the places where ∇F = 0 and [∂²F/∂x_i∂x_j] is positive definite. If the function being minimized is quadratic, then [∂²F/∂x_i∂x_j] is a matrix of constants. If this matrix (the Hessian) is known, then the minimum point x^0 can be found in one step:
$$x^0 - x = -\left[\frac{\partial^2 F}{\partial x_i\,\partial x_j}\right]^{-1}\nabla F(x) \tag{9.13}$$
For a general function the Hessian is not a constant but is a function of position. Furthermore the Hessian is often not known explicitly but must be obtained by numerical differentiation. In Davidon's method the inverse of the Hessian is not computed directly but successive approximations are made. A positive definite symmetric matrix H is chosen initially and is modified at each step according to a prescribed rule. That is, instead of stepping in the direction opposite the gradient at all times (as in the steepest descent method), the direction of the step is modified so that the step at the ith stage is taken in the direction -H^i ∇F^i. The unit matrix is a satisfactory first choice for H. The sequence of operations at the ith stage is listed below. The superscript indicates the stage of the iterations.
1. Find the optimum step in the direction -H^i ∇F(x^i).
2. Evaluate F(x^{i+1}) and ∇F(x^{i+1}).
3. Set y^i = ∇F(x^{i+1}) - ∇F(x^i).
4. Modify H: H^{i+1} = H^i + A^i + B^i.
The matrices A^i and B^i are obtained by direct calculation according to these definitions:
$$A^i = \frac{\alpha^i[-H^i\nabla F(x^i)] \otimes \alpha^i[-H^i\nabla F(x^i)]}{\alpha^i[-H^i\nabla F(x^i)]\cdot y^i}, \qquad B^i = \frac{-H^i y^i \otimes H^i y^i}{y^i\cdot H^i y^i}$$
The symbol a ⊗ b means that a linear operator D_{ij} is formed from vectors a and b; the matrix elements of this operator are D_{ij} = a_i b_j. α^i is a positive constant corresponding to the length of the optimal step at the ith stage.
Fletcher and Powell show that the process is stable and that the minimum of a quadratic form in n variables is obtained in n iterations.
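The four steps listed above, together with the n-step property for quadratics, can be sketched as follows. This is a minimal illustration for the minimization form of the method, assuming a quadratic F(x) = ½ xᵀQx - bᵀx so that the optimum step has a closed form; the function name `dfp_minimize_quadratic` and the test matrix are hypothetical and are not taken from the cited papers.

```python
import numpy as np

def dfp_minimize_quadratic(Q, b, x0):
    """Davidon-Fletcher-Powell iteration on F(x) = 0.5 x^T Q x - b^T x.

    Q is assumed symmetric positive definite. The optimum step along each
    direction is computed exactly, which is possible only for a quadratic.
    """
    x = np.asarray(x0, dtype=float)
    n = x.size
    H = np.eye(n)                                # unit matrix as first choice for H
    grad = Q @ x - b
    for _ in range(n):
        if np.linalg.norm(grad) < 1e-12:
            break
        d = -H @ grad                            # step direction -H grad F
        alpha = -(grad @ d) / (d @ Q @ d)        # step 1: optimum step length
        sigma = alpha * d
        x_new = x + sigma
        grad_new = Q @ x_new - b                 # step 2: new gradient
        y = grad_new - grad                      # step 3: gradient difference
        Hy = H @ y
        A_i = np.outer(sigma, sigma) / (sigma @ y)
        B_i = -np.outer(Hy, Hy) / (y @ Hy)
        H = H + A_i + B_i                        # step 4: modify H
        x, grad = x_new, grad_new
    return x, H

Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x_min, H_final = dfp_minimize_quadratic(Q, b, np.zeros(3))
print(x_min, np.linalg.norm(Q @ x_min - b))
```

In this example the gradient vanishes after n = 3 iterations and the final H agrees with the inverse Hessian, which is the behavior Fletcher and Powell establish for quadratic forms.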
Powell's Method. Powell¹³ has described an effective method which is also guaranteed to converge to the maximum in the ideal case of a quadratic, negative definite polynomial in n variables. Powell's method is closely related to the parallel tangent or "Par-Tan" methods developed by Shah et al.¹⁴ The method depends on the following fact: If F(x) is a quadratic function with a maximum at x*, then pairs of points x^1, x^2 lying on a line through x* have the property that ∇F(x^1) is parallel to ∇F(x^2). Conversely, if the gradients at two distinct points x^1 and x^2 are parallel, then x* lies on the line connecting x^1 and x^2 or its extension. In Powell's method, whenever steps are to be taken in the gradient direction, this means steps of the optimum size, although this may not be stated explicitly each time. In practice, because of finite computing accuracy, a step size close to the optimum, but consistently smaller, will have to do. Powell shows that the maximization problem can be solved in n dimensions if it can be solved in (n - 1) dimensions. This leads to a recursive procedure in which a series of problems of decreasing dimension are solved in a cyclic order. For example, to solve two-dimensional maximization problems it is necessary to solve a series of one-dimensional problems (the determination of the optimum step is a one-dimensional problem). Three-dimensional maximizations require the use of a computational routine to do maximizations in spaces of only two dimensions. Suppose that we are given a point x^1 in an n-dimensional space. We must find a second point x^2, x^1 ≠ x^2, such that ∇F(x^2) is parallel to ∇F(x^1). A two-dimensional case is shown in Fig. 4(a) with the initial point at x^1. We travel in the direction of steepest ascent. At the point ξ, the vectors ∇F(x^1) and ∇F(ξ) are orthogonal and so ξ corresponds to the optimal step from x^1. Now, using ∇F(ξ) as the new direction, we find x^2 such that ∇F(x^2) is orthogonal to ∇F(ξ).
Consequently, ∇F(x^1) is parallel to ∇F(x^2). If F(x) is a quadratic function, then the maximum point x* will lie on the line through x^1 and x^2 (or on its extension beyond x^1 or x^2). Suppose that F(x) is not quadratic. In that case the maximum point along the line connecting x^1 and x^2 may not be the maximum of F(x). The process is to be repeated until the absolute value of the gradient of F(x) is below a preselected threshold and the maximum point is considered to be located within desired tolerances. Each set of n(n - 1) iterations is called a cycle of Powell's method. A three-dimensional example is shown in Fig. 4(b) for a single cycle. Starting at the initial point x^1 we compute ∇F(x^1) and determine the point x^2 along that line where ∇F(x^1) is orthogonal to ∇F(x^2). We now restrict our
FIG. 4. (a) Powell's method in two dimensions. (b) Powell's method in three dimensions.
attention to the plane P through x^2 and normal to ∇F(x^1). Suppose that the maximum of F(x) in the plane P occurs at x*. We know that the projection of ∇F(x*) onto P vanishes, and it follows that ∇F(x*) is parallel to ∇F(x^1) and the maximum of F(x) in the three-dimensional space lies on the line passing through x^1 and x*. We have already described how to find the maximum in a plane and the steps involved are also depicted in Fig. 3. We go in the plane P in the direction of the projection of ∇F(x^2) on P until a local maximum is reached. Call this point x^3 and now follow ∇F(x^3) until another local maximum is reached at a point we call x^4. The final step in the planar maximization requires the maximum to be found along the line joining x^2 and x^4. Recall that this was named x*. The final step in the cycle is the determination of the maximum x** along the line connecting x^1 and x*. Some of the more important technical details involved in the programming of Powell's method for a computer are described in reports by Woodrow¹⁵ and Paiewonsky and Woodrow.¹⁶
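The two-dimensional construction of Fig. 4(a) can be checked on a small quadratic. The sketch below assumes F(x) = -½ xᵀQx + bᵀx with Q positive definite, so every optimum step has a closed form; the names `exact_step` and `powell_2d_cycle` and the numerical values are illustrative and are not taken from Powell's paper or the cited reports.

```python
import numpy as np

def exact_step(Q, g, d):
    """Optimum step along d for F(x) = -0.5 x^T Q x + b^T x,
    where g is the gradient of F at the starting point of the step."""
    return (g @ d) / (d @ Q @ d)

def powell_2d_cycle(Q, b, x1):
    """One two-dimensional cycle: x1 -> xi -> x2 -> maximum on the line x1, x2."""
    grad = lambda x: b - Q @ x
    g1 = grad(x1)
    xi = x1 + exact_step(Q, g1, g1) * g1          # optimum step along grad F(x1)
    g_xi = grad(xi)                               # orthogonal to g1
    x2 = xi + exact_step(Q, g_xi, g_xi) * g_xi    # optimum step along grad F(xi)
    d = x2 - x1                                   # line through x1 and x2
    x_star = x1 + exact_step(Q, g1, d) * d        # maximum along that line
    return xi, x2, x_star

Q = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])
x1 = np.array([2.0, -1.0])
xi, x2, x_star = powell_2d_cycle(Q, b, x1)
g1, g2 = b - Q @ x1, b - Q @ x2
print(x_star, g1[0] * g2[1] - g1[1] * g2[0])
```

For this quadratic the gradients at x^1 and x^2 are parallel (their two-dimensional cross product vanishes) and the line maximum coincides with the true maximum Q⁻¹b, as the text asserts.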
9.22 Optimal Space Rendezvous
Example 2. This example is based on a linearized three-dimensional time-optimal terminal rendezvous problem with bounded thrust and limited fuel. The three-dimensional powered flight equations are linearized by assuming that the distance between the target and the maneuvering vehicle is always small compared to the distance between the target and the center of the earth. It is also assumed that the total propellant used is a small fraction (e.g., 5%) of the vehicle's total mass. The equations of motion do not include the effect of the time-varying total mass. The aim of this example is to outline the computational results. The results of the rendezvous study and the optimal trajectories can be found in Paiewonsky and Woodrow.¹⁶ A uniformly rotating coordinate system is employed as shown in Fig. 5. The rotating rectangular system with axes labeled x_1, x_3, and x_5 has its origin at the nominal target radius and moves with the target's mean motion. The x_1-axis is in the orbital plane in the tangential direction, opposite to the direction of the rotation, the x_3-axis is in the outward radial direction, and the x_5-axis is orthogonal to both the x_1- and x_3-axes. Figure 5 shows the x_1x_3x_5 trihedron located at the nominal target radius. The trihedron XYZ is an earth-centered nonrotating rectangular coordinate system used as a reference frame for the initial orbit. A target in a circular orbit at the nominal radius will be stationary if placed at the origin of the rotating system. Target vehicles in orbits with eccentricity correspond to a rendezvous with an object moving with respect to the origin of the x_1x_3x_5 system; i.e., there will be relative motion between the target and the origin of the coordinate system. The linearized equations of motion are:
$$\begin{aligned} \dot{x}_1 &= x_2, & \dot{x}_2 &= 2\omega x_4 + u_1 \\ \dot{x}_3 &= x_4, & \dot{x}_4 &= -2\omega x_2 + 3\omega^2 x_3 + u_2 \\ \dot{x}_5 &= x_6, & \dot{x}_6 &= -\omega^2 x_5 + u_3 \end{aligned} \tag{9.14}$$
where the dot indicates d/dt and
$$u_1 = A(t)\cos\theta\cos\varphi, \qquad u_2 = A(t)\cos\theta\sin\varphi, \qquad u_3 = A(t)\sin\theta$$
The thrust acceleration constraint is given by the equation
$$u_1^2 + u_2^2 + u_3^2 = A^2(t) \qquad (0 \le A(t) \le A_{\max})$$
and the propellant constraint is given by the requirement that (m_0/c) ∫_0^T A(t) dt ≤ m_p(0), where m_p(0) is the initial propellant mass. The total vehicle mass, m_0, is assumed to be constant.
FIG. 5. Definition of coordinate system and steering angles. (a) Coordinate system. (b) Steering angles.
It is convenient to transform the variables and rescale the time in terms of the angular velocity ω. Let t' = ωt, and define
$$y_1 = \omega^2 x_1,\quad y_2 = \omega x_2,\quad y_3 = \omega^2 x_3,\quad y_4 = \omega x_4,\quad y_5 = \omega^2 x_5,\quad y_6 = \omega x_6$$
The transformed equations of motion are written below. The prime denotes d/dt'.
$$\begin{aligned} dy_1/dt' = y_1' &= y_2, & dy_4/dt' = y_4' &= -2y_2 + 3y_3 + u_2 \\ dy_2/dt' = y_2' &= 2y_4 + u_1, & dy_5/dt' = y_5' &= y_6 \\ dy_3/dt' = y_3' &= y_4, & dy_6/dt' = y_6' &= -y_5 + u_3 \end{aligned} \tag{9.15}$$
A new variable, y_7, is introduced to account for the fuel constraint. The variable y_7 satisfies the differential equation
$$\frac{dy_7}{dt'} = -\omega\,\frac{c}{m_0}\,\frac{dm_p}{dt'} = -A(t'), \qquad y_7(0) = \omega x_7(0) = \omega\,\Delta V \tag{9.16}$$
where x_7(0) = -c ln[(m_0 - m_p)/m_0] = ΔV, dm_p/dt' is the propellant mass flow, c is the rocket effective exhaust velocity, and m_0 is the total vehicle mass. The fundamental matrix solution and its inverse for the first six equations are
$$Y(t') = \begin{pmatrix} 1 & 4\sin t' - 3t' & 6(t' - \sin t') & 2(1 - \cos t') & 0 & 0 \\ 0 & 4\cos t' - 3 & 6(1 - \cos t') & 2\sin t' & 0 & 0 \\ 0 & 2(\cos t' - 1) & -3\cos t' + 4 & \sin t' & 0 & 0 \\ 0 & -2\sin t' & 3\sin t' & \cos t' & 0 & 0 \\ 0 & 0 & 0 & 0 & \cos t' & \sin t' \\ 0 & 0 & 0 & 0 & -\sin t' & \cos t' \end{pmatrix} \tag{9.17}$$
$$Y^{-1}(t') = \begin{pmatrix} 1 & -4\sin t' + 3t' & 6(\sin t' - t') & 2(1 - \cos t') & 0 & 0 \\ 0 & 4\cos t' - 3 & 6(1 - \cos t') & -2\sin t' & 0 & 0 \\ 0 & 2(\cos t' - 1) & -3\cos t' + 4 & -\sin t' & 0 & 0 \\ 0 & 2\sin t' & -3\sin t' & \cos t' & 0 & 0 \\ 0 & 0 & 0 & 0 & \cos t' & -\sin t' \\ 0 & 0 & 0 & 0 & \sin t' & \cos t' \end{pmatrix} \tag{9.18}$$
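The fundamental matrix can be checked numerically against the transformed equations of motion (9.15). The sketch below is only a consistency check, with the coefficient matrix `A` read off from (9.15) with u = 0; the function names `Y` and `Y_inv`, the sample time, and the finite-difference step are arbitrary illustrative choices. It uses the fact that for a constant-coefficient system the inverse of Y(t') is Y(-t').

```python
import numpy as np

# Coefficient matrix of y' = A y, read off from Eq. (9.15) with u = 0.
A = np.zeros((6, 6))
A[0, 1] = 1.0                      # y1' = y2
A[1, 3] = 2.0                      # y2' = 2 y4
A[2, 3] = 1.0                      # y3' = y4
A[3, 1], A[3, 2] = -2.0, 3.0       # y4' = -2 y2 + 3 y3
A[4, 5] = 1.0                      # y5' = y6
A[5, 4] = -1.0                     # y6' = -y5

def Y(t):
    """Fundamental matrix of Eq. (9.17), with Y(0) = I."""
    s, c = np.sin(t), np.cos(t)
    return np.array([
        [1.0, 4*s - 3*t, 6*(t - s), 2*(1 - c), 0.0, 0.0],
        [0.0, 4*c - 3,   6*(1 - c), 2*s,       0.0, 0.0],
        [0.0, 2*(c - 1), 4 - 3*c,   s,         0.0, 0.0],
        [0.0, -2*s,      3*s,       c,         0.0, 0.0],
        [0.0, 0.0,       0.0,       0.0,       c,   s],
        [0.0, 0.0,       0.0,       0.0,      -s,   c],
    ])

def Y_inv(t):
    """Inverse of Eq. (9.18); for constant A this is Y(-t)."""
    return Y(-t)

t, h = 0.7, 1e-6
dY_dt = (Y(t + h) - Y(t - h)) / (2.0 * h)        # central difference
print(np.max(np.abs(dY_dt - A @ Y(t))))          # residual of Y' = A Y
print(np.max(np.abs(Y(t) @ Y_inv(t) - np.eye(6))))
```

Both residuals should be at round-off or finite-difference level, which confirms that the matrices are consistent with (9.15).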
Recall that z(t', q) = -∫_0^{t'} Y^{-1}(τ)B(τ)u(τ, q) dτ, where u(τ, q) is obtained from the maximum principle. The matrix B(τ) is obtained by inspection of the equations of motion.
We can now write out the components of z(t', q):
$$\begin{aligned}
z_1(t', q) &= -\int_0^{t'} [(3\tau - 4\sin\tau)u_1 + 2(1 - \cos\tau)u_2]\,d\tau \\
z_2(t', q) &= -\int_0^{t'} [(4\cos\tau - 3)u_1 - (2\sin\tau)u_2]\,d\tau \\
z_3(t', q) &= -\int_0^{t'} [2(\cos\tau - 1)u_1 - (\sin\tau)u_2]\,d\tau \\
z_4(t', q) &= -\int_0^{t'} [(2\sin\tau)u_1 + (\cos\tau)u_2]\,d\tau \\
z_5(t', q) &= -\int_0^{t'} [(-\sin\tau)u_3]\,d\tau \\
z_6(t', q) &= -\int_0^{t'} [(\cos\tau)u_3]\,d\tau
\end{aligned} \tag{9.19}$$
We now introduce adjoint variables λ_i, i = 1, ..., 6, satisfying the differential equations λ' = -A^T λ. The solution of these equations is
$$\lambda(t') = -[Y^{-1}(t')]^T q \tag{9.20}$$
where λ_i(0) = -q_i. The matrix A is obtained by inspection of the equations of motion. The adjoint variable λ_7 corresponding to y_7 satisfies the equation dλ_7/dt' = 0. We may expand the vector equation and obtain relations between the components of λ and q:
$$\begin{aligned}
\lambda_1(t') &= -q_1 \\
\lambda_2(t') &= (4q_1 - 2q_4)\sin t' - (4q_2 + 2q_3)\cos t' - 3q_1 t' + 3q_2 + 2q_3 \\
\lambda_3(t') &= -3(2q_1 - q_4)\sin t' + 3(2q_2 + q_3)\cos t' + 6q_1 t' - (6q_2 + 4q_3) \\
\lambda_4(t') &= (2q_2 + q_3)\sin t' + (2q_1 - q_4)\cos t' - 2q_1 \\
\lambda_5(t') &= -q_6\sin t' - q_5\cos t' \\
\lambda_6(t') &= q_5\sin t' - q_6\cos t' \\
\lambda_7(t') &= -q_7
\end{aligned} \tag{9.21}$$
We find the time-optimal control by forming the Hamiltonian ℋ = Σ λ_i y_i' + λ_0 (i = 1, ..., 7) from the adjoint variables and the state velocity vector, and then maximizing it with respect to u subject to all constraints on u.
Bounded thrust : (9.22)
Limited propellant :
In this expression, m_p is the total propellant mass and m_0 is the total mass at the initial time. The propellant constraint can be thought of as a limit on the total ideal field-free ΔV available for the maneuver.
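As a small worked instance of the ΔV interpretation, the relation x_7(0) = -c ln[(m_0 - m_p)/m_0] quoted with Eq. (9.16) can be evaluated for the 5% propellant fraction mentioned at the start of this example. The function name `ideal_delta_v` and the exhaust velocity of 10,000 ft/sec are assumptions made only for illustration; they are not values from the study.

```python
import math

def ideal_delta_v(c, m0, mp):
    """Ideal field-free velocity increment -c ln[(m0 - mp)/m0].

    c  : effective exhaust velocity
    m0 : total vehicle mass at the initial time
    mp : total propellant mass (0 <= mp < m0)
    """
    return -c * math.log((m0 - mp) / m0)

# Hypothetical numbers: c = 10,000 ft/sec, propellant = 5% of total mass.
dv = ideal_delta_v(10000.0, 1.0, 0.05)
print(round(dv, 1))   # roughly 5.1% of the exhaust velocity
```

For small propellant fractions the result is close to c·m_p/m_0, which is why the linear propellant constraint (m_0/c)∫A dt ≤ m_p(0) and the ΔV limit are nearly interchangeable here.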
FIG. 6. Error vector length and stopping time vs total number of steps for a modified steepest ascent method. Each mark represents an acceptable step. A step is accepted when ∇F(q^k + h^k ∇F(q^k))·∇F(q^k) > 0.
FIG. 7. Error vector length and stopping time vs total number of steps for Powell's method. Each mark represents the result of an optimum step. The abscissa scale is the total number of trial steps including those needed to find the optimum step.
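When the Hamiltonian is maximized with respect to θ, φ, and A, we find that:
$$\sin\theta^* = \lambda_6 / r \tag{9.24}$$
$$\tan\varphi^* = \lambda_4/\lambda_2 \tag{9.25}$$
and
$$A^*(t') = \begin{cases} A_{\max} & \text{when } (\lambda_2^2 + \lambda_4^2 + \lambda_6^2)^{1/2} - \lambda_7 > 0 \\ 0 & \text{when } (\lambda_2^2 + \lambda_4^2 + \lambda_6^2)^{1/2} - \lambda_7 < 0 \end{cases} \qquad \left(r = \sqrt{\lambda_2^2 + \lambda_4^2 + \lambda_6^2}\right) \tag{9.26}$$
The optimal control components are
$$u_1^* = A^*(t')\lambda_2/r, \qquad u_2^* = A^*(t')\lambda_4/r, \qquad u_3^* = A^*(t')\lambda_6/r \tag{9.27}$$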
It is now possible to compute z(t', q) as all of the required functions are available. The results of a computational study on this problem are reported in Paiewonsky and Woodrow,¹⁶ and Figs. 5-10 are based on that work. Figures 6-9 show convergence of |z(F(q), q) - x(0)| to zero and F(q) to the optimal time for the initial point x_1(0) = x_3(0) = x_5(0) = 100,000 ft and x_2(0) = x_4(0) = x_6(0) = -100 ft/sec. A modified method of steep ascent is shown in Fig. 6, Powell's method in Fig. 7, the Davidon-Fletcher-Powell method in Fig. 8, and an unaccelerated steepest ascent in Fig. 9. In each case the optimal time is established after relatively few iterations but the rate of reduction in the error varies widely between the different techniques. It is important to keep in mind that the significant quantities are the errors in the boundary conditions in the physical coordinates. These are the errors which would be incurred at time T = F(q^i) if the control u*(t, q^i) were applied to the physical system.
FIG. 8. Error vector length and stopping time vs total number of steps for the Davidon-Fletcher-Powell method. Each mark represents an optimum step.
FIG. 9. Unaccelerated steepest ascent. Each mark represents an optimum step.
Figure 10 shows how the terminal errors in range (x_1² + x_3² + x_5²)^{1/2} are reduced as the number of iterations increases. The terminal errors in velocity (x_2² + x_4² + x_6²)^{1/2} follow a similar pattern. In this problem the terminal velocity error threshold was 1 ft/sec and the terminal range error threshold was 300 ft. The initial range was on the order of 20 miles and the magnitude of the initial velocity was on the order of 200 ft/sec (vehicles separating).
9.3 Final Remarks
We have shown how Neustadt's method can be implemented to compute optimal controls, and two convergence acceleration methods have been described. It should be understood that there are many other ways to carry out the maximization and in particular instances some of these may prove to be simpler or quicker, or just more appealing to the individual's taste. The examples presented are illustrations of steep ascent techniques which worked for specific cases. A brief account of some methods which did not work is contained in Paiewonsky et al.⁹ᵃ A fuller account of steep ascent methods and convergence acceleration schemes can be found in Shah et al.,¹⁴ Edelbaum,¹⁷ Wilde,¹⁸ and Spang.¹⁹ An analysis by Neustadt²⁰ of optimal space trajectories is available.
FIG. 10. Terminal range error vs number of steps for the modified steepest ascent, Powell, and Davidon-Fletcher-Powell methods.
ACKNOWLEDGMENT
The numerical examples are drawn from studies carried out while the author was at Aeronautical Research Associates of Princeton Inc. (ARAP). This work was supported
in part by the U.S. Air Force Flight Dynamics Laboratory under Contract AF33(657)7781. I wish to acknowledge the assistance of Peter Woodrow of ARAP in carrying out the numerical analyses and preparing the computer programs. Portions of Example 2 and Figs. 5-10 will appear in the Journal of Spacecraft and Rockets and are used here with permission of the AIAA.
REFERENCES
1. L. W. Neustadt, Synthesizing time-optimal control systems, J. Math. Anal. Appl. 1, 484-493 (1960).
2. L. W. Neustadt, Minimum effort control systems, J. Soc. Ind. Appl. Math. Ser. A, Control 1, 16-31 (1960).
3. L. W. Neustadt and B. H. Paiewonsky, On synthesizing optimal controls, Proc. 2nd IFAC Congr., London, 1963. Butterworth, London, 1965.
4. N. N. Krasovskii, On the theory of optimal controls, Automat. Remote Control 18, No. 11, 960-970 (1957); see also Prikl. Mat. Mekh. 23, 625-639 (1959).
5. R. V. Gamkrelidze, Theory of time-optimal processes in linear systems, Izv. Akad. Nauk SSSR Ser. Mat. 22, 449-474 (1958) (English transl., Dept. of Eng., Univ. of California, Los Angeles, California, Rept. 61-7 (1961)).
6. J. N. Eaton, An iterative solution to time-optimal control, J. Math. Anal. Appl. 5, 329-344 (1962); see also Errata and Addenda to above article, J. Math. Anal. Appl. 9, 147-152 (1964).
7. E. J. Fadden and E. G. Gilbert, Computational aspects of the time-optimal problem, in "Computing Methods in Optimization Problems" (A. V. Balakrishnan and L. W. Neustadt, eds.), pp. 167-192. Academic Press, New York, 1964.
8. J. P. LaSalle, The time-optimal control problem, in "Contributions to the Theory of Nonlinear Oscillations" (S. Lefschetz, ed.), vol. V, pp. 1-24. Princeton Univ. Press, Princeton, New Jersey, 1960.
9a. B. H. Paiewonsky, P. J. Woodrow, F. Terkelsen, and J. McIntyre, A study of synthesis techniques for optimal control, Aeronautical Systems Div., Wright-Patterson AFB, Ohio, Rept. ASD-TDR-63-239 (June, 1964).
9b. B. H. Paiewonsky and P. J. Woodrow, The synthesis of optimal controls for a class of rocket steering problems, AIAA Paper 63-224, June, 1963.
10. B. H. Paiewonsky, P. J. Woodrow, W. Brunner, and P. Halbert, Synthesis of optimal controllers using hybrid analog-digital computers, in "Computing Methods in Optimization Problems" (A. V. Balakrishnan and L. W. Neustadt, eds.). Academic Press, New York, 1964.
11. W. L. Davidon, Variable metric method for minimization, Argonne Natl. Lab., Argonne, Illinois, Rept. ANL-5990-Rev. (May, 1959).
12. R. Fletcher and M. J. D. Powell, A rapidly convergent descent method for minimization, Comput. J. pp. 163-168 (July, 1963).
13. M. J. D. Powell, An iterative method for finding stationary values of a function of several variables, Comput. J. pp. 147-151 (1962).
14. B. V. Shah, R. J. Buehler, and O. Kempthorne, Some algorithms for minimizing a function of several variables, J. SIAM 12, 74 (1964).
15. P. J. Woodrow, Dept. of Electrical Eng., Princeton Univ., Princeton, New Jersey, Control Systems Lab. Tech. Rept. No. 4, November, 1963.
16. B. H. Paiewonsky and P. J. Woodrow, A study of time-optimal rendezvous in three dimensions, AFFDL-TR-65-20, vol. I (January, 1965).
17. T. N. Edelbaum, Theory of maxima and minima, in "Optimization Techniques" (George Leitmann, ed.), Chapter 1. Academic Press, New York, 1962.
18. D. J. Wilde, "Optimum Seeking Methods." Prentice-Hall, Englewood Cliffs, New Jersey, 1964.
19. H. A. Spang, III, A review of minimization techniques for nonlinear functions, SIAM Rev. 4 (1962).
20. L. W. Neustadt, A general theory of minimum fuel space trajectories, J. Soc. Ind. Appl. Math. Ser. A, Control 3, 317-356 (1965).
10
The Calculus of Variations, Functional Analysis, and Optimal Control Problems
E. K. BLUM
DEPARTMENT OF MATHEMATICS, UNIVERSITY OF SOUTHERN CALIFORNIA, LOS ANGELES, CALIFORNIA
10.0 Introduction
10.1 The Problem of Mayer
10.2 Optimal Control Problems
10.3 Abstract Analysis, Basic Concepts
10.4 The Multiplier Rule in Abstract Analysis
10.5 Necessary Conditions for Optimal Controls
10.6 Variation of Endpoints and Initial Conditions
10.7 Examples of Optimal Control Problems
10.8 A Convergent Gradient Procedure
10.9 Computations Using the Convergent Gradient Procedure
References

10.0 Introduction
The calculus of variations has a venerable place in the history of mathematics and physics. Its roots lie deep in the classical analysis of the eighteenth and nineteenth centuries and its development extends well into the current century. From it stems at least one branch of modern abstract mathematics, the branch now referred to as functional analysis, although this origin is usually lost sight of by modern functional analysts. In this article, we shall develop some of the relationships between the classical calculus of variations and that part of functional analysis which we shall call "abstract analysis" and show how the abstract approach simplifies the derivation of classical results. We shall do this by introducing another subject of current research in mathematics and engineering: optimal control theory. It is generally recognized that many of the problems of optimal control theory are subsumed by the classical results of the calculus of variations. Conversely, many basic problems in the calculus of variations can be restated as optimal control problems. We shall review how this equivalence is established and then apply
the methods of abstract analysis to certain general problems in optimal control theory and thereby arrive at theorems in the calculus of variations. Furthermore, we shall show how these abstract methods can be used to construct practical methods of solving optimal control problems. Finally, we shall consider some specific examples which arise in orbital mechanics concerned with low-thrust rocket trajectories. Thus, we shall run the gamut from classical mathematical analysis to modern abstract analysis and then from pure mathematics to very applied mathematics, hopefully illuminating some general unifying principles along the way and showing how these general ideas can be used to produce specific practical methods for solving very real problems.

10.1 The Problem of Mayer
We have said that the calculus of variations had its beginnings in the eighteenth century. Actually, one of the most famous problems in the calculus of variations was originally discussed by Galileo in the early seventeenth century [1,2]. This is the brachistochrone problem. Although most readers are familiar with the brachistochrone problem and it is discussed in every book on the calculus of variations (e.g., see Bliss [3]), we shall present it here for convenient reference, since we shall use it as an example to show how a problem of Mayer can be transformed into an optimal control problem and to illustrate a computational method of solving variational problems.

It will be recalled that the brachistochrone problem deals with a unit mass moving along an arc in a uniform gravitational field, for example, a bead sliding down a frictionless wire. The endpoints, $P_0$ and $P_1$, of the arc are fixed, as is the initial velocity at time t = 0. Hence, the time of descent of the bead from $P_0$ to $P_1$ depends only on the shape of the arc. It is required to find, in a certain class of arcs joining $P_0$ and $P_1$, that arc which minimizes the time of descent. The brachistochrone ("shortest time") problem was posed as a challenge to mathematicians of the time by James Bernoulli in 1696 and solved by James and his brother John [4] in 1697. The origin of the calculus of variations may be traced to that event.

It is a simple matter for us now to formulate the brachistochrone problem analytically. To simplify it even further, we shall assume that the arc lies in the (x, y)-plane so that the equation of motion is given by

\[ \frac{d^2 s}{dt^2} = g\cos\theta = g\,\frac{dy}{ds} \tag{10.1} \]

where g is the gravitational constant, $\theta$ is the angle between the vertical gravity vector (in the y-direction) and the tangent to the arc at a point P: (x, y), and s is arc length from $P_0$ to P. Integrating (10.1) with respect to t, we obtain $(ds/dt)^2 = 2gy + \text{const}$. Without loss of generality, we may take $P_0$ as the point (0, 1) and $P_1$ as $(x_1, 2)$. Assume further, for illustrative purposes, that the initial velocity is zero. In this case the constant equals $-2g$, and we have

\[ \frac{dt}{dx} = \frac{dt}{ds}\,\frac{ds}{dx} = \left[\frac{1 + y'^2}{2g(y - 1)}\right]^{1/2} \]

where $y' = dy/dx$. Integrating with respect to x, we obtain the formula for the time of descent, J:

\[ J = \int_0^{x_1} \left[\frac{1 + y'^2}{2g(y - 1)}\right]^{1/2} dx \tag{10.2} \]
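The dependence of the descent time on the shape of the arc can be seen numerically. The sketch below is an added illustration, not part of the original development: it approximates the integral in (10.2) for two sample arcs joining (0, 1) and $(x_1, 2)$, a straight line and the curve $y = 1 + (x/x_1)^{1/2}$. The value $x_1 = 1$, the constant g = 9.81, the substitution $x = x_1 s^4$ (used to tame the integrable singularity at x = 0), and the midpoint rule are all assumptions of the sketch.

```python
import math

X1 = 1.0    # abscissa of the endpoint P_1 (assumed sample value)
G = 9.81    # gravitational constant (assumed units: m/s^2)

def descent_time(drop, slope, x1, g=G, n=2000):
    """Approximate J = integral of sqrt((1 + y'^2) / (2 g (y - 1))) dx on [0, x1].

    drop(x) returns y(x) - 1 and slope(x) returns y'(x).  The substitution
    x = x1 * s**4 removes the singularity at x = 0; the midpoint rule is then
    applied in s on [0, 1], so the endpoint s = 0 is never evaluated.
    """
    ds = 1.0 / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        x = x1 * s**4
        dx_ds = 4.0 * x1 * s**3
        integrand = math.sqrt((1.0 + slope(x) ** 2) / (2.0 * g * drop(x)))
        total += integrand * dx_ds * ds
    return total

# Straight line y = 1 + x / x1.
t_line = descent_time(lambda x: x / X1, lambda x: 1.0 / X1, X1)

# Steeper start: y = 1 + sqrt(x / x1), with y' = 1 / (2 sqrt(x * x1)).
t_curve = descent_time(lambda x: math.sqrt(x / X1),
                       lambda x: 0.5 / math.sqrt(x * X1), X1)

print(t_line, t_curve)
```

For these inputs the arc with the steeper initial descent gives a shorter time than the straight line, which is the qualitative effect that the brachistochrone problem makes precise.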
The brachistochrone problem then is to find in the class of arcs, $y = y(x)$ with $y(0) = 1$ and $y(x_1) = 2$, that arc which minimizes the integral in (10.2). This is typical of the kind of problem we are interested in here. We shall reformulate it below after considering the problem of Mayer.

The problem of Mayer includes a very general class of problems in the calculus of variations. As formulated by Bliss [3], the problem of Mayer may be stated as follows. Let† $y^T = y^T(t) = (y_1(t), \ldots, y_q(t))$ be a vector function of t in the real interval $t_0 \le t \le t_1$. y is assumed to be a member of some set of "admissible" functions, S. For example, Bliss takes S to consist of all functions which are piecewise continuously differentiable in $[t_0, t_1]$ and have values y(t) in some open region of $R^q$, q-dimensional Euclidean space. Further, let

\[ \phi_i(t, y, y') = 0, \qquad i = 1, \ldots, n < q \tag{10.3} \]

be a set of differential equations, where the functions $\phi_i$ are assumed to have continuous third-order partial derivatives in some suitable (2q + 1)-dimensional open region $R_1$, and the matrix $(\partial\phi_i/\partial y_j')$ is to have rank n in $R_1$. Let

\[ \psi_j(t_0, y(t_0), t_1, y(t_1)) = 0, \qquad j = 1, \ldots, p \le 2q + 2 \tag{10.4} \]

be endpoint constraints, where the functions $\psi_j$ have continuous third-order partial derivatives in some (2q + 2)-dimensional open region $R_2$ in which the matrix

\[ \left( \frac{\partial\psi_j}{\partial t_0},\ \frac{\partial\psi_j}{\partial y_i(t_0)},\ \frac{\partial\psi_j}{\partial t_1},\ \frac{\partial\psi_j}{\partial y_i(t_1)} \right) \]

has rank p. Finally, let the function of endvalues

\[ J = J(t_0, y(t_0), t_1, y(t_1)) \tag{10.5} \]

have continuous third-order partials in the region $R_2$. The problem of Mayer is to find $y \in S$ which minimizes J while satisfying Eqs. (10.3) and (10.4). (If J is to be maximized, we simply minimize $-J$.)

† $(\ )^T$ denotes the transpose of ( ).

As an example of a problem of Mayer let us again consider the brachistochrone problem. First, we change the independent variable from x to t. Thus, y(x) becomes $y_1(t)$ and $y_1' = dy_1/dt$. We introduce

\[ y_2(t) = \int_{t_0}^{t} \left[\frac{1 + y_1'^2}{2g(y_1 - 1)}\right]^{1/2} d\tau \]

so that

\[ y_2' - \left[\frac{1 + y_1'^2}{2g(y_1 - 1)}\right]^{1/2} = 0 \tag{10.6} \]

is a differential equation which must be satisfied by $(y_1(t), y_2(t))$. Also, $y_1$ and $y_2$ must satisfy the endpoint conditions

\[ \psi_1 = y_1(t_0) - 1 = 0, \qquad \psi_2 = y_2(t_0) = 0, \qquad \psi_3 = y_1(t_1) - 2 = 0 \tag{10.7} \]

where $t_0 = 0$ and $t_1 = x_1$. Finally, the time of descent, J, is given by

\[ J = y_2(t_1) \tag{10.8} \]

It is required to find $y^T = (y_1(t), y_2(t))$ satisfying (10.6) and (10.7) and minimizing J in (10.8). Thus, (10.6), (10.7), (10.8) is a special instance of (10.3), (10.4), (10.5) which represents the brachistochrone problem formulated as a problem of Mayer.

10.2 Optimal Control Problems
A large class of problems in the theory of optimal control can be formulated as variational problems of the following type. Let $x^T = x^T(t) = (x_1(t), \ldots, x_n(t))$ and $u^T = u^T(t) = (u_1(t), \ldots, u_m(t))$ be functions of t on the interval $t_0 \le t \le t_1$. x is the "state" function and u is the "control" function. The control is assumed to be in some specified set of admissible controls. For example, u is usually assumed to be piecewise continuous with values in some region, R, of $R^m$. We shall take R to be an open set, although other alternatives are possible. Let

\[ \frac{dx}{dt} = f(x, u) \tag{10.9} \]

be a system of n differential equations, where $f = (f_1, \ldots, f_n)$ and the $f_i$, $1 \le i \le n$, are assumed to have continuous partials $\partial f_i/\partial x_j$ and $\partial f_i/\partial u_k$ in some prescribed open set, $\Gamma \subset R^n \times R^m$. We assume further that the control region R is contained in the projection of $\Gamma$ on $R^m$. Let

\[ \psi_j(t_0, x(t_0), t_1, x(t_1)) = 0, \qquad 1 \le j \le p, \quad p \le 2n + 2 \tag{10.10} \]

be a set of endpoint constraints on the state function, where the $\psi_j$ have continuous first-order partial derivatives in some open set, $R_1$, and the matrix

\[ \left( \frac{\partial\psi_j}{\partial t_0},\ \frac{\partial\psi_j}{\partial x_i(t_0)},\ \frac{\partial\psi_j}{\partial t_1},\ \frac{\partial\psi_j}{\partial x_i(t_1)} \right) \]

has rank p. Finally, let

\[ J = J(t_0, x(t_0), t_1, x(t_1)) \tag{10.11} \]

be a function of endvalues of the state and time. We assume that J has continuous first partials in $R_1$. For each admissible control, u, a state function x which satisfies the "control equations" (10.9)† is called a "trajectory corresponding to u." An "optimal control" is one which has a corresponding trajectory which satisfies the constraints (10.10) and minimizes J. The "optimal control problem" is to find such an optimal control u and its corresponding trajectory x. The pair (x, u) is an "optimal solution" of the problem.

† Also called "state equations."

One should, of course, remark that there are other kinds of problems in control theory which are referred to as optimal control problems. In particular, mention should be made of problems in which the constraints are in the form of inequalities (see Pontryagin et al. [5]).‡ We shall restrict ourselves to equality constraints in this paper, although the connection between these two kinds of problems is well known (see Refs. 5-7).

‡ See also Chapters 1, 6, and 7.

The optimal control problem becomes a problem of Mayer if we let $y_i = x_i$, $1 \le i \le n$, and $y'_{n+j} = u_j$, $1 \le j \le m$. Conversely, the problem of Mayer is transformed into the optimal control problem as follows. First, we solve Eqs. (10.3) for n of the derivatives, say $y_1', \ldots, y_n'$, and set $x_i = y_i$, $1 \le i \le q$, and $u_j = y'_{n+j}$, $1 \le j \le q - n$. Then we adjoin to the n differential equations obtained by solving (10.3) for $y_i'$, $1 \le i \le n$, the additional equations $x'_{n+j} = u_j$, $1 \le j \le q - n$. This produces a system of q differential equations for the q state variables as in (10.9). Similarly, (10.4) and (10.5) become (10.10) and (10.11) respectively, thereby establishing the well-known equivalence of the problem of Mayer and the optimal control problem.

As a simple illustration of these ideas, let us return to the brachistochrone problem. It was formulated as a problem of Mayer in (10.6), (10.7), (10.8). To restate it as an optimal control problem, we introduce the state function $x(t) = (x_1(t), x_2(t))$, where $x_1 = y_1$ and $x_2 = y_2$. Then we introduce the control function $u(t) = u_1(t) = y_1'$. Equation (10.6) and the latter equation yield the control equations,

\[ \frac{dx_1}{dt} = u_1, \qquad \frac{dx_2}{dt} = \left[\frac{1 + u_1^2}{2g(x_1 - 1)}\right]^{1/2} \tag{10.6'} \]

Similarly, the endpoint conditions (10.7) become the endpoint constraints on the state function,

\[ \psi_1 = x_1(t_0) - 1 = 0, \qquad \psi_2 = x_2(t_0) = 0, \qquad \psi_3 = x_1(t_1) - 2 = 0 \tag{10.7'} \]

Finally, the function of endvalues of the state and time which is to be minimized is just the final value of $x_2$, that is,

\[ J = x_2(t_1) \tag{10.8'} \]
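As a quick check on this formulation, consider the constant control $u_1 \equiv 1/t_1$. This choice is made here only for illustration: it is admissible but not optimal, and it corresponds to the straight line from $P_0$ to $P_1$. Assuming $t_0 = 0$ and $t_1 > 0$, the control equations (10.6') can be integrated in closed form:

```latex
\begin{align*}
x_1(t) &= 1 + \frac{t}{t_1}, \qquad x_1(t_0) = 1, \quad x_1(t_1) = 2, \\
\frac{dx_2}{dt} &= \left[\frac{1 + t_1^{-2}}{2g\,(t/t_1)}\right]^{1/2}
  = \left[\frac{1 + t_1^{2}}{2g\,t_1\,t}\right]^{1/2}, \qquad x_2(t_0) = 0, \\
J = x_2(t_1) &= \int_0^{t_1} \left[\frac{1 + t_1^{2}}{2g\,t_1\,t}\right]^{1/2} dt
  = \left[\frac{1 + t_1^{2}}{2g\,t_1}\right]^{1/2} 2\sqrt{t_1}
  = \left[\frac{2\,(1 + t_1^{2})}{g}\right]^{1/2}.
\end{align*}
```

The resulting trajectory satisfies all three constraints in (10.7'), and the value of J agrees with the elementary result for a mass sliding from rest down an incline of length $(1 + t_1^2)^{1/2}$ with unit vertical drop. An optimal control must do at least as well.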
We shall show later how to solve the brachistochrone problem as an optimal control problem by an approximation procedure. The solution given by this procedure can then be compared with the known analytic solution.

A first necessary condition for a solution of the problem of Mayer and, therefore, for a solution of the optimal control problem as formulated above is given by the classical multiplier rule [3]. Most derivations of the multiplier rule are based on classical methods of the calculus of variations. In the following sections, we shall give an alternate derivation using methods of abstract analysis insofar as possible. The main ideas of this new derivation have been known for some time, but they do not seem to have been applied as they are here. Indeed, as we have already remarked in the Introduction, much of the research in abstract analysis was initially motivated by the problems of the calculus of variations. We cite the early work of Volterra, at the turn of the century, on what he called "functions of lines" and the work of Hadamard at that time, which he termed "analyse fonctionnelle," giving the subject its name. The paper by Bolza [8] contains a result similar to one of our theorems. The works of Fréchet [9] and Gâteaux [10] set forth the main concepts which we shall use, in a somewhat modified form. Later work of Goldstine [11,12] and Liusternik [13] comes closest to the approach taken here. More recent research using functional analysis in variational theory is that of Hestenes [14], Balakrishnan [15], Neustadt [16], Browder [17], and the author. Several books [18-21] give fuller accounts of the basic ideas in various forms.
We believe that the techniques of our proof, being geometrical in spirit, are easily grasped. They are based on very natural generalizations of the ideas used in establishing the Lagrange multiplier rule of ordinary calculus. The main theorem (Theorem 2) is formulated in an abstract setting so that it applies at once to minimization problems with equality constraints in both the ordinary calculus (finite dimensional spaces) and the calculus of variations (function spaces). It also provides the basis for a new method of solving such problems, the "convergent gradient method" to be explained later.

10.3 Abstract Analysis, Basic Concepts
In this section and the next are assembled some general results of abstract analysis based on ideas going back to Fréchet [9] and Gâteaux [10]. Essentially, the abstract analysis which concerns us is the generalization of the differential calculus to abstract spaces. The development of abstract analysis which we shall present here parallels that given in such references as Hille [18] and Liusternik and Sobolev [19], and the reader is referred to these sources for general background information. However, our development differs in details which are important for the intended application to the optimal control problem. For example, Liusternik and Sobolev formulate the definition of abstract differential for functions defined on a normed linear space, whereas we restrict ourselves to functions defined on a pre-Hilbert space. In this respect, their treatment is more general. However, we require differentials of functionals to exist only in finitely open sets (Definition 1), whereas they deal with open sets. In this respect, our results are more general and more directly applicable to the optimal control problem. Furthermore, the weaker hypothesis of finitely open sets and the existence of an inner product permit us to give a simpler proof of the abstract multiplier rule. For example, our proofs are based on the classical implicit function theorem for real functions of several real variables rather than on the implicit function theorem in abstract spaces [27]. Our results are designed to have just the right amount of generality for application to the optimal control problem considered here, i.e., with equality constraints. There are indications that problems with inequality constraints may require the greater generality of the results in Liusternik and Sobolev [19]. Throughout this and the following section, E and $E_1$ will denote normed linear spaces over the real numbers, and f will denote an arbitrary mapping from a subset of E to $E_1$.
In particular, $E_1$ may be one-dimensional (i.e., the real line) in which case f is a functional. For $u \in E$ we write $\|u\|$ for the norm of u. We begin with some definitions.
Definition 1. A subset $D \subset E$ is finitely open at u, $u \in D$, if for any $h_1, \ldots, h_n \in E$, $n \ge 1$, there is a neighborhood, $N = N(h_1, \ldots, h_n)$, of the origin in $R^n$ such that for any $(t_1, \ldots, t_n) \in N$ the point $u + \sum_{i=1}^{n} t_i h_i$ is in D. Let $D_u$ denote the totality of such points; i.e.,†

\[ D_u = \Bigl\{ u + \sum_{i=1}^{n} t_i h_i \Bigm| n \ge 1,\ h_i \in E,\ (t_1, \ldots, t_n) \in N(h_1, \ldots, h_n) \Bigr\} \]

$D_u$ is called a star-neighborhood of u relative to D. If D is finitely open at every $u \in D$, then we say that D is a finitely open set.

† $\{e \mid P\}$ denotes the set with member e having property P.

We observe that a star-neighborhood, $D_u$, of u need not be a neighborhood of u in the norm topology, that is, there may be no $\delta > 0$ such that the sphere $\|y - u\| < \delta$ is contained in $D_u$. On the other hand, it is clear that a star-neighborhood of u is finitely open at every point in the star-neighborhood. In the applications, the functionals which arise have certain desired properties in sets which are finitely open but not necessarily open in the usual norm topology. The continuity property given in the next definition is an example of such a property, as we shall see later.

Definition 2. Let f be defined on $D \subset E$, with D finitely open at u. Let $D_u$ be a star-neighborhood of u and let $y \in D_u$. We say that f is finitely continuous at y if

\[ f\Bigl( y + \sum_{i=1}^{n} t_i h_i \Bigr) \]

is a continuous function of $(t_1, \ldots, t_n)$ in $R^n$ at the origin in $R^n$ for all choices of $h_i \in E$ and n. f is finitely continuous in $D_u$ if it is finitely continuous at all y in $D_u$.

In the applications, D will be a set of controls in the space, E, of admissible controls. Using well-known continuity properties of solutions of ordinary differential equations, we shall show that various functions on D are finitely continuous. We come now to the matter of generalizing the concept of a differential. There are several ways to do this, all being variations of the Gâteaux [10] or Fréchet [9] differentials (also see Refs. 18 and 19). We shall choose one which is close to the Gâteaux differential.

Definition 3. Let f be defined on a set $D \subset E$ which is finitely open at u. If

\[ \lim_{t \to 0} \frac{f(u + th) - f(u)}{t} = \delta f(u; h) \]

exists for all $h \in E$, it is called the weak differential of f at u with increment h.
Remark 1. It is evident that

\[ \delta f(u; h) = \frac{d}{dt} f(u + th) \Big|_{t=0} \tag{10.12} \]

This follows at once from the usual definition of the derivative of a vector function of a scalar t. It is also obvious that $\delta f(u; h)$ is homogeneous in h; i.e.,

\[ \delta f(u; sh) = \lim_{t \to 0} \frac{f(u + tsh) - f(u)}{t} = s \lim_{t' \to 0} \frac{f(u + t'h) - f(u)}{t'} = s\,\delta f(u; h) \]

where $t' = ts$. If f is a real functional, then for u fixed $\delta f(u; h)$ is also a real functional, defined for all $h \in E$. In the case where f is a real functional and $\delta f(y; h)$ exists for y in a star-neighborhood of u and is finitely continuous at the point u, it is easy to show that $\delta f(u; h)$ is also an additive real functional of h. For arbitrary $h_1, h_2 \in E$ and real variables $t_1$ and $t_2$, let us define

\[ g(t_1, t_2) = f(u + t_1 h_1 + t_2 h_2) \]

g is a real-valued function of $t_1$ and $t_2$. From (10.12) we see that

\[ \frac{\partial g}{\partial t_1} = \delta f(u + t_1 h_1 + t_2 h_2; h_1) \qquad\text{and}\qquad \frac{\partial g}{\partial t_2} = \delta f(u + t_1 h_1 + t_2 h_2; h_2) \]

for $|t_1|$ and $|t_2|$ sufficiently small to ensure that $u + t_1 h_1 + t_2 h_2$ is in D. It follows that $\partial g/\partial t_1$ and $\partial g/\partial t_2$ are continuous in $t_1$ and $t_2$ at (0, 0) and exist in a neighborhood of (0, 0). By ordinary calculus,

\[ g(t_1, t_2) = g(0, 0) + t_1 \frac{\partial g}{\partial t_1}\Big|_{(0,0)} + t_2 \frac{\partial g}{\partial t_2}\Big|_{(0,0)} + \varepsilon \]

where $\varepsilon/(t_1^2 + t_2^2)^{1/2} \to 0$ as $(t_1, t_2) \to (0, 0)$ along a ray through the origin. This may be rewritten as

\[ f(u + t_1 h_1 + t_2 h_2) - f(u) = t_1\,\delta f(u; h_1) + t_2\,\delta f(u; h_2) + \varepsilon \]

Taking $t_1 = t_2 = t$, we obtain

\[ \lim_{t \to 0} \frac{f(u + t(h_1 + h_2)) - f(u)}{t} = \delta f(u; h_1) + \delta f(u; h_2) \]

The left member of this equation is $\delta f(u; h_1 + h_2)$, by definition. This establishes the additivity.

In the application to optimal control theory, we shall be concerned with a linear space of functions in which a naturally defined inner product exists, that is, with a pre-Hilbert space. A real pre-Hilbert space is a real linear space, E, in which a real inner product, (u, v), is defined for all $u, v \in E$. From this point on, E will denote a pre-Hilbert space unless there is an explicit statement to the contrary. A Hilbert space is a complete pre-Hilbert space. As is well known, if g is a bounded linear functional on a Hilbert space, H, then there exists a $y \in H$ such that $g(h) = (y, h)$ for all $h \in H$. Although this result does not hold in general in a pre-Hilbert space, it holds in many cases important in our applications. In particular, it may hold for $\delta f(u; h)$, which is a real functional of h when f is a real functional and is linear in h when the continuity conditions in Remark 1 are satisfied. These considerations lead us to the next definition.

Definition 4. Let f be a real functional defined on a subset of the pre-Hilbert space E. If there exists a vector $\nabla f(u) \in E$ such that $\delta f(u; h) = (\nabla f(u), h)$ for all $h \in E$, then $\nabla f(u)$ is called the weak gradient of f at u.
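To make Definition 4 concrete, the following sketch (an illustration added here, not taken from the text) discretizes a space of functions on [0, 1] with the inner product $(u, v) = \int_0^1 u v\,dt$ and considers the functional $f(u) = (u, u)$, for which $\delta f(u; h) = 2(u, h)$ and hence $\nabla f(u) = 2u$. The grid size, the midpoint quadrature, and the particular u and h are assumptions chosen only for the demonstration.

```python
import math

N = 1000                                   # number of midpoint cells (assumed)
T = [(k + 0.5) / N for k in range(N)]      # midpoints of the cells of [0, 1]

def inner(u, v):
    """Midpoint-rule approximation of the inner product (u, v)."""
    return sum(a * b for a, b in zip(u, v)) / N

def f(u):
    """The functional f(u) = (u, u)."""
    return inner(u, u)

def weak_differential(func, u, h, t=1e-6):
    """Difference quotient (func(u + t h) - func(u)) / t from Definition 3."""
    shifted = [a + t * b for a, b in zip(u, h)]
    return (func(shifted) - func(u)) / t

u = [math.sin(s) for s in T]    # sample point u(t) = sin t
h = [s for s in T]              # sample increment h(t) = t
grad = [2.0 * a for a in u]     # candidate weak gradient, 2u

quotient = weak_differential(f, u, h)   # approximates delta f(u; h)
pairing = inner(grad, h)                # (grad f(u), h)
print(quotient, pairing)
```

The two printed numbers differ only by the term $t\,(h, h)$ left over from the finite step, which is the discrete counterpart of the limit in Definition 3.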
If $\nabla f(u)$ exists at a point u, then $\delta f(u; h)$ is obviously a bounded linear functional of h. Note that in this case to obtain linearity in h it is not necessary to assume that $\delta f(u; h)$ is finitely continuous at u. In fact, not even the existence of $\delta f(y; h)$ for $y \ne u$ need be assumed. We remark that for h a unit vector $(\nabla f(u), h)$ may be regarded as a "directional derivative" of f in the direction of h. It is of some interest to consider now the "strong differential." Although we shall make no use of this notion until Sec. 10.8, it is nevertheless pertinent to discuss it here for purposes of comparison with the weak differential and for its possible application to other types of optimal control problems.
Definition 5. Let f be defined in a neighborhood of $u \in E$ where E is any normed linear space, with values in a normed linear space $E_1$. If there exists a bounded (i.e., continuous) linear operator, $f'(u)$, mapping E into $E_1$ and such that

\[ f(u + h) - f(u) = f'(u)h + \varepsilon(u, h) \]

where $\|\varepsilon\|/\|h\| \to 0$ as $h \to 0$, then $f'(u)h$ is the strong differential of f at u with increment h and $f'(u)$ is the strong derivative of f at u.

Note that $f'(u)h$ is an element of $E_1$, the result of the linear operator $f'(u)$ applied to h. $f'(u)h$ is defined for every $h \in E$. The strong differential is essentially Fréchet's concept. If f is a real functional (i.e., $E_1$ is the real line) and E is a Hilbert space, then the operator $f'(u)$ is also a real functional on E. Since it is bounded and linear, there exists an element, $\nabla_s f(u)$, in E such that $f'(u)h = (\nabla_s f(u), h)$ for all $h \in E$. We shall call $\nabla_s f(u)$ the strong gradient of f at u.
It is a simple matter to show that if the strong differential exists at u, then the weak differential also exists and the two differentials are equal. In proof, we have

\[ f(u + th) - f(u) = t f'(u)h + \varepsilon(u, th) \]

for any $h \in E$ and scalar t, where $\|\varepsilon\|/(|t|\,\|h\|) \to 0$ as $t \to 0$. Thus,

\[ \delta f(u; h) = \lim_{t \to 0} \frac{f(u + th) - f(u)}{t} = f'(u)h \]
If f is a functional and the strong gradient exists, then so does the weak gradient and the two are equal. On the other hand, the weak gradient may exist when the strong gradient does not. For example, let f be the function given in polar coordinates in the plane by $f(r, \theta) = \cos(r/\theta)$ for $0 < \theta < 2\pi$ and $f(r, 0) = 1$ for all r. (Thus, f is a functional on a two-dimensional vector space.) The directional derivative of f at (0, 0) along any ray is zero; i.e., for $\theta \ne 0$, $\partial f/\partial r = -\sin(r/\theta)/\theta$, which is zero for r = 0. Hence, the weak gradient of f at (0, 0) is the zero vector. Now, if the strong gradient exists at (0, 0), it must also be the zero vector, since it must be equal to the weak gradient. However, this would imply that $\cos(r/\theta) - 1 = f(r, \theta) - f(0, 0) = \varepsilon(r, \theta)$, where $\varepsilon/r \to 0$ as $r \to 0$. But $|\cos(r/\theta) - 1|/r = 1/r$ for $\theta = 2r/\pi$, which contradicts $\varepsilon/r \to 0$ as $r \to 0$. Hence, the strong gradient does not exist.

It may also happen that the weak differential exists when the weak gradient does not. A simple example illustrates this. Let $f(x, y) = (x^2 + y^2)^{1/2}$ and consider u to be the origin. Then for any vector $h = (h_1, h_2)$, letting t tend to zero through positive values,

\[ \lim_{t \to 0^+} \frac{f(u + th) - f(u)}{t} = \lim_{t \to 0^+} \frac{f(th_1, th_2)}{t} = (h_1^2 + h_2^2)^{1/2} = \|h\| \]
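Both examples can be checked numerically. The sketch below is an added illustration with assumed sample values: it evaluates the difference quotient of Definition 3 for $f(x, y) = (x^2 + y^2)^{1/2}$ at the origin with a small positive t, shows that the result is not additive in h, and then evaluates the polar example along the curve $\theta = 2r/\pi$.

```python
import math

def norm_f(x, y):
    """f(x, y) = (x^2 + y^2)^(1/2)."""
    return math.hypot(x, y)

def quotient_at_origin(func, h, t=1e-8):
    """Difference quotient (func(t h) - func(0)) / t for a small positive t."""
    return (func(t * h[0], t * h[1]) - func(0.0, 0.0)) / t

h1, h2 = (3.0, 4.0), (-3.0, 4.0)
h_sum = (h1[0] + h2[0], h1[1] + h2[1])

d1 = quotient_at_origin(norm_f, h1)          # close to ||h1||
d2 = quotient_at_origin(norm_f, h2)          # close to ||h2||
d_sum = quotient_at_origin(norm_f, h_sum)    # close to ||h1 + h2||, not d1 + d2

def polar_f(r, theta):
    """f(r, theta) = cos(r / theta) for 0 < theta < 2*pi, and 1 for theta = 0."""
    return 1.0 if theta == 0.0 else math.cos(r / theta)

# Along theta = 2 r / pi the argument r / theta equals pi / 2, so f vanishes
# and |f - 1| / r behaves like 1 / r instead of tending to zero.
radii = (1e-1, 1e-2, 1e-3)
ratios = [abs(polar_f(r, 2.0 * r / math.pi) - 1.0) / r for r in radii]
print(d1, d2, d_sum, ratios)
```

The failure of additivity in the first part and the growth of the ratios in the second part are the numerical counterparts of the two nonexistence arguments in the text.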
Thus, $\delta f(0; h) = \|h\|$, which is clearly not a linear functional of h. Therefore, the weak gradient of f cannot exist at the origin. It follows that the strong gradient also does not exist. The relation between the weak and strong differential is further illuminated by the following theorem, a proof of which can be found in Liusternik and Sobolev [19].

Theorem. If the weak differential $\delta f(u'; h)$ exists for $\|u' - u\| \le r$ and is uniformly continuous in $u'$ and continuous in h, then the strong differential also exists in this set (and the two differentials are equal).

10.4 The Multiplier Rule in Abstract Analysis
Using the concepts of the preceding section, we shall now consider extremal problems in abstract spaces.
Definition 6. Let J and $g_1, \ldots, g_p$ ($p \ge 1$) be real functionals defined on $D \subset E$. The set $C(g_i) = \{y \mid g_i(y) = 0\}$ is an equality constraint. The intersection $C = \bigcap_{i=1}^{p} C(g_i)$ is also an equality constraint. Let $u \in C$. If there is a neighborhood, $N_u$, of u such that $J(y) \ge J(u)$ for all y in $C \cap N_u \cap D$, then u is a relative minimum of J on the constraint C.

Definition 7. Let $C = \bigcap_{i=1}^{p} C(g_i)$ be an equality constraint defined by the functionals $g_1, \ldots, g_p$. Let $u \in C$ and let the gradients $\nabla g_1(u), \ldots, \nabla g_p(u)$ exist. The set of all $h \in E$ such that $(\nabla g_i(u), h) = 0$, $i = 1, \ldots, p$, is called the tangent subspace of C at the point u and is denoted by $T_u$.
A first necessary condition for u to be a relative minimum of J on the constraint C is given by the Lagrange multiplier rule. We shall now establish an abstract generalization of the multiplier rule in pre-Hilbert spaces. This generalization includes the multiplier rule of the ordinary calculus and the multiplier rule of the calculus of variations as special cases. As we have previously pointed out, our generalization differs from others in its assumptions about the domains of definition of the functionals and their differentials. In particular, we shall not require differentials to exist in neighborhoods but only in finitely open sets. Furthermore, we shall not require strong differentials but shall make weak gradients suffice. The basis for most proofs of the multiplier rule is a theorem of the type represented by Theorem 1, which follows.

Theorem 1. (i) Let J, $g_1, \ldots, g_p$ ($p \ge 1$) be real functionals defined on a set D in a real pre-Hilbert space, E, and let u be a relative minimum of J on the constraint $C = \bigcap_{i=1}^{p} C(g_i)$. Let D be finitely open at u. (ii) Let the weak gradients $\{\nabla g_1(u), \ldots, \nabla g_p(u)\}$ exist and form a linearly independent set in E. (iii) Let $T_u$ be the tangent subspace of C at u. For any $h \in T_u$, let $\nabla J(y)$ and $\nabla g_i(y)$, $1 \le i \le p$, exist for all $y = u + t_0 h + \sum_{i=1}^{p} t_i \nabla g_i(u)$, where $(t_0, t_1, \ldots, t_p)$ is any point in some neighborhood, N, of the origin in $R^{p+1}$. Further, suppose that the $\nabla g_i(y)$ are continuous in the $t_i$ in the neighborhood N and that $\nabla J(y)$ is continuous in the $t_i$ at the origin. If $h \in T_u$, then $(\nabla J(u), h) = 0$.

PROOF. Let $t = (t_1, \ldots, t_p)$ be a real p-dimensional vector and s a real scalar. Define

\[ y_{ts} = u + sh + \sum_{i=1}^{p} t_i \nabla g_i(u), \]

where h is an arbitrary point in $T_u$. (We assume $h \ne 0$, since the result follows trivially for h = 0.) Then we define the functions $F_i(t, s) = F_i(t_1, \ldots, t_p, s) = g_i(y_{ts})$, $i = 1, \ldots, p$. The functions $F_i$ have the properties:

1. $F_i(0, 0) = g_i(u) = 0$;
2. the partial derivatives $F_{ij} = \partial F_i/\partial t_j$ and $F_{is} = \partial F_i/\partial s$ exist as continuous functions of (t, s) in some neighborhood of (0, 0);
3. the Jacobian matrix $(F_{ij}(t, s))$ has rank p for (t, s) in some neighborhood of (0, 0).

Also $F_{is}(0, 0) = 0$ for all $i = 1, \ldots, p$. To establish property 2, we observe that

\[ F_{ij}(t, s) = \frac{\partial}{\partial t_j}\, g_i(y_{ts}) = \delta g_i(y_{ts}; \nabla g_j(u)) = (\nabla g_i(y_{ts}), \nabla g_j(u)) \]

This is a consequence of hypothesis (iii) which asserts that the $g_i$ and their gradients are defined for $|t_i|$ and $|s|$ sufficiently small. The continuity of $F_{ij}$ in (t, s) follows from the assumed continuity of $\nabla g_i(y_{ts})$. To obtain property 3, we note that

\[ F_{ij}(0, 0) = (\nabla g_i(u), \nabla g_j(u)) \]

Since the $\{\nabla g_i(u)\}$ are assumed to be linearly independent, it follows by well-known results of finite-dimensional linear algebra that the $p \times p$ matrix $(F_{ij}(0, 0))$ has rank p. By continuity, the matrix $(F_{ij}(t, s))$ also has rank p for (t, s) in some neighborhood of (0, 0). Finally,

\[ F_{is}(t, s) = \frac{\partial}{\partial s}\, g_i(y_{ts}) = \delta g_i(y_{ts}; h) = (\nabla g_i(y_{ts}), h) \]

Hence, $F_{is}(t, s)$ is continuous in a neighborhood of (0, 0) and $F_{is}(0, 0) = (\nabla g_i(u), h) = 0$ for $i = 1, \ldots, p$, since $h \in T_u$.

By virtue of these properties, we can invoke the classical implicit function theorem to obtain functions $G_i(s)$, $i = 1, \ldots, p$, such that $F_i(G_1(s), \ldots, G_p(s), s) = 0$ for all s in a neighborhood of s = 0. Furthermore, $G_i(0) = 0$ and the $G_i$ are continuously differentiable in a neighborhood of s = 0. In fact, since $F_{js}(0, 0) = 0$ for all $j = 1, \ldots, p$, it follows from classical results that $G_i'(0) = 0$ for all $i = 1, \ldots, p$. Since $G_i(s) = sG_i'(\theta_i s)$ for s sufficiently small ($0 < \theta_i < 1$), it follows that $G_i(s)/s \to 0$ as $s \to 0$.
Now, consider the vectors $y_s = u + sh + \sum_{j=1}^{p} G_j(s) \nabla g_j(u)$. For all s in some neighborhood of s = 0,

\[ g_i(y_s) = F_i(G_1(s), \ldots, G_p(s), s) = 0 \]

Hence, for all such s, we have $y_s \in C$. Furthermore, since $\lim_{s \to 0} G_j(s) = 0$, we have $\lim_{s \to 0} y_s = u$. Defining $H(t, s) = J(y_{ts})$, we obtain for the partial derivatives of H, using the same methods as above, $H_{t_j}(0, 0) = (\nabla J(u), \nabla g_j(u))$ and $H_s(0, 0) = (\nabla J(u), h)$. Since $H_{t_j}$ and $H_s$ exist in a neighborhood of (0, 0) and are continuous at (0, 0) we have

\[ J(y_s) - J(u) = s\Bigl( \sum_{j=1}^{p} (\nabla J(u), \nabla g_j(u))\, G_j'(0) + (\nabla J(u), h) + \varepsilon \Bigr) = s\bigl( (\nabla J(u), h) + \varepsilon \bigr) \]

where $\varepsilon \to 0$ as $s \to 0$. If $(\nabla J(u), h) = -a^2 \ne 0$, then for all sufficiently small positive s we would have $J(y_s) < J(u)$, contradicting the hypothesis that u is a relative minimum of J on C. Likewise, if $(\nabla J(u), h) = a^2 \ne 0$, then for s < 0 and $|s|$ sufficiently small we obtain the same contradiction. Therefore, $(\nabla J(u), h) = 0$, as was to be proven.

Remark. It is of interest to note that if we assume that the strong gradient of J merely exists at u, the theorem remains valid. In proof, let $h_s = sh + \sum_{j=1}^{p} G_j(s) \nabla g_j(u)$. Then $y_s = u + h_s$ and $\|h_s\| \le |s| \bigl( \|h\| + \sum_{j=1}^{p} |G_j(s)/s| \cdot \|\nabla g_j(u)\| \bigr)$ and $h_s \to 0$ as $s \to 0$. By the definition of strong gradient, we have $J(y_s) - J(u) = (\nabla_s J(u), h_s) + \varepsilon$, where $\varepsilon/\|h_s\| \to 0$ as $h_s \to 0$. But

\[ (\nabla_s J(u), h_s) = s\Bigl( (\nabla_s J(u), h) + \sum_{j=1}^{p} \frac{G_j(s)}{s}\, (\nabla_s J(u), \nabla g_j(u)) \Bigr) \]

Since $G_j(s)/s \to 0$ as $s \to 0$, we have

\[ J(y_s) - J(u) = s\bigl( (\nabla_s J(u), h) + \varepsilon_1 \bigr) + \varepsilon(s) \]

where $\varepsilon_1 \to 0$ as $s \to 0$. Since $\|h_s\| \le |s| (\|h\| + \varepsilon_2)$, we have $|\varepsilon|/|s| \le |\varepsilon| (\|h\| + \varepsilon_2)/\|h_s\|$ so that $\varepsilon/s \to 0$ as $s \to 0$. We may then apply the argument of the theorem to obtain $(\nabla_s J(u), h) = 0$.

The geometric interpretation of Theorem 1 is self-evident and leads directly to the multiplier rule, which is stated next as Theorem 2.
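The construction used in the proof of Theorem 1 can be traced in a finite-dimensional case. The sketch below is an added illustration under assumed data: E is the plane, the single constraint is $g(x, y) = x^2 + y^2 - 1$, the point is u = (1, 0), and h = (0, 1) spans the tangent subspace. Here the correction G(s) can be written in closed form instead of being obtained from the implicit function theorem, and the code checks that $y_s$ stays on the constraint while G(s)/s tends to zero.

```python
import math

def g(x, y):
    """Constraint functional g(x, y) = x^2 + y^2 - 1 (the unit circle)."""
    return x * x + y * y - 1.0

u = (1.0, 0.0)         # point on the constraint
grad_g = (2.0, 0.0)    # gradient of g at u
h = (0.0, 1.0)         # tangent direction: (grad_g, h) = 0

def G(s):
    """Correction solving g(u + s h + G(s) grad_g) = 0 near s = 0."""
    return (math.sqrt(1.0 - s * s) - 1.0) / 2.0

def y_s(s):
    """The curve y_s = u + s h + G(s) grad g(u) used in the proof."""
    c = G(s)
    return (u[0] + s * h[0] + c * grad_g[0],
            u[1] + s * h[1] + c * grad_g[1])

samples = [0.1, 0.01, 0.001]
residuals = [abs(g(*y_s(s))) for s in samples]   # constraint violation along y_s
ratios = [abs(G(s) / s) for s in samples]        # |G(s) / s|, which shrinks with s
print(residuals, ratios)
```

Because G(s) is of second order in s, the curve $y_s$ leaves u in the direction h to first order, which is exactly what the proof needs in order to compare $J(y_s)$ with $J(u)$.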
Theorem 2. Let $J, g_1, \dots, g_p$ be real functionals satisfying conditions (i), (ii), and (iii) of Theorem 1. If $\nabla J(u) \ne 0$, then there exist unique real scalars $\lambda_1, \dots, \lambda_p$ not all zero such that
$$\nabla J(u) = \sum_{j=1}^{p} \lambda_j\, \nabla g_j(u) \tag{10.13}$$
PROOF. Let $(\lambda_1, \dots, \lambda_p)$ be the unique solution of the linear system of equations
$$\sum_{j=1}^{p} \langle \nabla g_i(u), \nabla g_j(u)\rangle\, \lambda_j = \langle \nabla g_i(u), \nabla J(u)\rangle, \qquad i = 1, \dots, p$$
Let $T_u$ be the tangent subspace at $u$. By Theorem 1, if $h \in T_u$ then $\langle \nabla J(u), h\rangle = 0$. Hence, $\nabla J(u)$ cannot be in $T_u$, for that would imply $\nabla J(u) = 0$. Consequently, for some $\nabla g_i(u)$ we must have $\langle \nabla g_i(u), \nabla J(u)\rangle \ne 0$; i.e., the above linear system is nonhomogeneous. (Its determinant is nonzero because the $\{\nabla g_i(u)\}$ are linearly independent.) Hence, not all the $\lambda_j$ are zero. Now, define $v = \nabla J(u) - \sum_{1}^{p} \lambda_j\, \nabla g_j(u)$. We have $\langle \nabla g_i(u), v\rangle = 0$ for $i = 1, \dots, p$. Therefore, $v \in T_u$. However, for any $h \in T_u$,
$$\langle v, h\rangle = \langle \nabla J(u), h\rangle - \sum_{1}^{p} \lambda_j\, \langle \nabla g_j(u), h\rangle = 0$$
again by Theorem 1 and the definition of $T_u$. Hence, $\langle v, v\rangle = 0$, which implies
$$\nabla J(u) - \sum_{1}^{p} \lambda_j\, \nabla g_j(u) = 0$$
as was to be proven.
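The proof of Theorem 2 is constructive: the multipliers are obtained by solving the Gram system formed from the constraint gradients. The following is a minimal sketch of that computation, assuming the gradients are represented as finite-dimensional vectors (a stand-in for elements of the pre-Hilbert space). The helper names `inner`, `solve`, and `multipliers` are illustrative and do not come from the text.

```python
def inner(a, b):
    """Euclidean inner product, standing in for the inner product of the space."""
    return sum(x * y for x, y in zip(a, b))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= m * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def multipliers(grad_J, grad_gs):
    """Solve sum_j <grad g_i, grad g_j> lambda_j = <grad g_i, grad J>, i = 1..p."""
    gram = [[inner(gi, gj) for gj in grad_gs] for gi in grad_gs]
    rhs = [inner(gi, grad_J) for gi in grad_gs]
    return solve(gram, rhs)


# Illustrative data (not from the text): grad J = 2*g1 - 3*g2.
g1, g2 = [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]
grad_J = [2.0 * a - 3.0 * b for a, b in zip(g1, g2)]
lam = multipliers(grad_J, [g1, g2])
```

The Gram matrix is nonsingular exactly when the gradients are linearly independent, which is the hypothesis used in the proof; for dependent gradients the elimination above would divide by zero.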
Corollary (The Multiplier Rule). Let $J, g_1, \dots, g_p$ ($p \ge 1$) be real functionals defined on a set $D$ in a real pre-Hilbert space $E$. Let $u$ be a relative minimum of $J$ on the constraint $C = \bigcap_{1}^{p} C(g_i)$ and let $D$ be finitely open at $u$. Let the weak gradients $\nabla g_1(y), \dots, \nabla g_p(y)$ exist and be finitely continuous in a star-neighborhood, $D_u$, of $u$. Let the weak gradient $\nabla J(y)$ exist in $D_u$ and be finitely continuous at $u$. Finally, let $\{\nabla g_1(u), \dots, \nabla g_p(u)\}$ be a linearly independent set in $E$. If $\nabla J(u) \ne 0$, then there exists a unique nonzero real vector $(\lambda_1, \dots, \lambda_p)$ such that
$$\nabla J(u) = \sum_{1}^{p} \lambda_j\, \nabla g_j(u)$$
Remark. Following Liusternik and Sobolev, we could have proceeded by introducing the constraint $g(y) = (g_1(y), \dots, g_p(y))$. Thus, $g$ is not a functional but has its values in $R^p$. They require the strong derivative $g'(y)$ to exist in a neighborhood of $u$. Since $g'(y) = (g_1'(y), \dots, g_p'(y))$, the same requirement applies to the $g_i'$. Also, they require $g'(u)$ to be a mapping from $E$ onto all of $R^p$. This is equivalent to the linear independence of the functionals $\{g_1'(u), \dots, g_p'(u)\}$, since $g'(u)h = (g_1'(u)h, \dots, g_p'(u)h)$ for $h \in E$.
10.5 Necessary Conditions for Optimal Controls
We shall now apply the abstract multiplier rule of the previous section to the optimal control problem described in Sec. 10.2 and shall obtain necessary conditions for an optimal solution. In view of the equivalence of the optimal control problem and the problem of Mayer, this will yield a new derivation of the multiplier rule for the problem of Mayer.
Let $R$ be an open set in $R^m$, Euclidean $m$-space. A control, $u^T = u^T(t) = (u_1(t), \dots, u_m(t))$, is “admissible” on the closed interval $[t_0, t_1]$ if it is a piecewise continuous function of $t$ for $t_0 \le t \le t_1$ and the values $\{u(t)\}$ lie in $R$ for all such $t$. Now, the set, $C_m$, of all piecewise continuous functions $u(t)$ defined on an interval $[a, b]$ and having values in $R^m$ constitutes a linear space in the usual way, i.e., $u + v = u(t) + v(t)$ and $\lambda u = \lambda u(t)$ define the operations of addition and scalar multiplication in the space. It becomes a pre-Hilbert space if we define the inner product, $\langle u, v\rangle$, of any two functions $u, v \in C_m$, where $u^T = u^T(t) = (u_1(t), \dots, u_m(t))$, $v^T = v^T(t) = (v_1(t), \dots, v_m(t))$, as
$$\langle u, v\rangle = \int_a^b u^T(t)\, v(t)\, dt$$
For $t_0$ and $t_1$ in $[a, b]$, the set of admissible controls on $[t_0, t_1]$ is a subset of $C_m$ if we take $u(t) = 0$ for $t$ outside of $[t_0, t_1]$. In the ensuing discussion, we shall take $C_m$ as the underlying pre-Hilbert space. Suppose that the initial values $x(t_0)$ and the initial time $t_0$ are fixed. Suppose further that there exists an admissible control $u = u(t)$ such that the control equations (10.9) have a solution $x(t)$ in the interval $t_0 \le t \le t_1$, with initial values $x(t_0)$, and the points $(x(t), u(t))$ lie in the open region $r$ in which $f(x, u)$ has continuous partials. Consider the one-parameter system of differential equations
$$\frac{dx}{dt} = f(x, u + sh) \tag{10.9$_s$}$$
where $h = h(t)$ is an arbitrary admissible control on $[t_0, t_1]$ and $s$ is a scalar parameter. From the theory of ordinary differential equations (see Coddington and Levinson,$^{23}$ p. 29, for example), it is known that (10.9$_s$) has a solution in $[t_0, t_1]$ for all sufficiently small $|s|$. In fact, for $k$ arbitrary admissible controls $h_1, \dots, h_k$, the $k$-parameter system
$$\frac{dx}{dt} = f\Big(x, u + \sum_{1}^{k} s_i h_i\Big)$$
has a solution in $[t_0, t_1]$ for $|s_i|$ sufficiently small. In all cases, the initial value is taken to be $x(t_0)$. We shall designate the solution of (10.9$_s$) by $x(t, s)$ when $h$ is being held fixed in the discussion. The final values $x(t_1, s)$ are then functionals depending on the control $y = u + sh$. In the general case, we have solutions $x(t, s_1, \dots, s_k)$ with final values depending on $y = u + \sum_{1}^{k} s_i h_i$. Thus, the final values of the solutions of (10.9$_s$) with initial values $x(t_0)$ are functionals defined on a domain, $D \subset C_m$, which is finitely open at $u$ (Definition 1) for any $u \in D$. We express this by writing $x(t_1) = x_1(y)$ for $y \in D$. It follows immediately that the functions $\psi_j$ in (10.10) which define the endpoint constraints can also be regarded as functionals on $D$. With Definition 6 in mind, let us define
$$g_j(y) = \psi_j(t_0, x(t_0), t_1, x_1(y)), \qquad j = 1, \dots, p$$
Similarly, we define
$$J(y) = J(t_0, x(t_0), t_1, x_1(y))$$
If $u$ minimizes $J$ on the constraint $C = \bigcap_{1}^{p} C(g_j)$ (see Definition 6), then $J(y) \ge J(u)$ for all $y \in D \cap C \cap N_u$, where $N_u$ is some neighborhood of $u$. If $\nabla J(y)$ and $\nabla g_j(y)$ exist and satisfy the hypotheses of the abstract multiplier rule (Theorem 2, Corollary), then we may apply it immediately to the control problem to obtain the necessary conditions of the multiplier rule of the problem of Mayer. Thus, the derivation from this point on consists primarily of a calculation of the gradients of $J$ and the $g_j$. We shall now carry out this calculation using the well-known technique of the adjoint equation. In the following discussion, let $h = h(t)$ be an arbitrary but fixed function in $C_m$. Let $u = u(t)$ be an optimal control on $[t_0, t_1]$. For all $|s|$ sufficiently small, $u + sh$ is an admissible control and, as explained above, has a corresponding trajectory $x(t, s)$ in $[t_0, t_1]$. For $s = 0$, the corresponding trajectory is the optimal trajectory $x = x(t, 0)$. Using the notation explained above, we have $x_1(u + sh) = x(t_1, s)$. Now, let $(\partial J/\partial x_1)_*$ denote the $n \times 1$ matrix of partial derivatives, $\partial J/\partial x_{1j}$, of $J(t_0, x(t_0), t_1, x(t_1))$ with respect to the variables $x_j(t_1)$, $j = 1, \dots, n$, and evaluated for $x(t_1) = x(t_1, 0) = x_1(u)$, i.e., at the final value of the optimal trajectory. It is assumed that the point $(t_0, x(t_0), t_1, x(t_1, 0))$ lies in the region $R_1$ in which $J$ and $\psi_j$ have continuous first-order partials (see Sec. 10.2). Since $R_1$ is an open set, and since $x(t_1, s)$ is continuous in the parameter $s$ (for a proof see Ref. 23 again), it follows that the points $Q_s$: $(t_0, x(t_0), t_1, x(t_1, s))$ are also in $R_1$ for sufficiently small $|s|$. Hence, $J(Q_s)$ and $\psi_j(Q_s)$ are defined for $|s|$ sufficiently small and $J$ and $\psi_j$ have continuous first-order partials at such $Q_s$. In this discussion, $t_0$, $x(t_0)$ and $t_1$ are not being varied. Therefore, we assume that $p \le n$ and that the $p \times n$ matrix $(\partial \psi_j/\partial x(t_1))$ has rank $p$. Now, by Definition 3, Remark 1, we have
$$\delta J(u; h) = \Big(\frac{\partial J}{\partial x_1}\Big)_*^T \delta x_1(u; h) \tag{10.14}$$
Let $\partial f/\partial u$ denote the $n \times m$ matrix of partial derivatives $(\partial f_i/\partial u_j)$ and $\partial f/\partial x$ the $n \times n$ matrix $(\partial f_i/\partial x_j)$, both evaluated at a point $P_s$: $(x(t, s), u + sh)$, where $x(t, s)$ is the trajectory corresponding to $u + sh$. Observe that $P_s \in r$ for $t_0 \le t \le t_1$ so that $f$ has continuous first-order partials at $P_s$. Writing $x(t, s) = (x_1(t, s), \dots, x_n(t, s))$, we let $\partial x/\partial s$ be the $n \times 1$ matrix of partials $(\partial x_i/\partial s)$.
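Once again, by appeal to known results in the theory of ordinary differential equations (e.g., see Struble, p. 72), we can assert the existence of the $\partial x_i/\partial s$ for $|s|$ sufficiently small. For any such $s$, the partial derivatives satisfy the variational equations
$$\frac{d}{dt}\Big(\frac{\partial x}{\partial s}\Big) = \frac{\partial f}{\partial x}\, \frac{\partial x}{\partial s} + \frac{\partial f}{\partial u}\, h$$
Since
$$\delta x_1(u; h) = \frac{dx_1(u + sh)}{ds}\Big|_{s=0} = \frac{\partial x(t_1, s)}{\partial s}\Big|_{s=0}$$
it follows that $\delta x_1(u; h)$ exists and is the final value of a solution of the $n$th order system,
$$\frac{dv}{dt} = \Big(\frac{\partial f}{\partial u}\Big)_* h + \Big(\frac{\partial f}{\partial x}\Big)_* v \tag{10.15}$$
where the asterisk subscripts indicate that the partials are to be evaluated at the point $(x(t, 0), u(t))$ of the optimal solution. The initial values $v(t_0)$ are to be taken as zero when $x(t_0)$ is not to be varied. Otherwise, $v(t_0)$ will be arbitrary. The adjoint equations of (10.15) are given by
$$\frac{dy}{dt} = -\Big(\frac{\partial f}{\partial x}\Big)_*^T y \tag{10.16}$$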
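Hence, $d(y^T v)/dt = y^T(\partial f/\partial u)_* h$ for any solutions $y$ of (10.16) and $v$ of (10.15). Integrating with respect to $t$, we get
$$y^T(t_1)\, v(t_1) - y^T(t_0)\, v(t_0) = \int_{t_0}^{t_1} y^T \Big(\frac{\partial f}{\partial u}\Big)_* h\, dt \tag{10.17}$$
If we take for $y$ the solution, $J_x(t)$, of (10.16) which has final values $J_x(t_1) = (\partial J/\partial x_1)_*$, then since $v(t_1) = \delta x_1(u; h)$, it follows from (10.14) and (10.17) that
$$\delta J(u; h) = \int_{t_0}^{t_1} J_x^T \Big(\frac{\partial f}{\partial u}\Big)_* h\, dt + J_x^T(t_0)\, v(t_0) \tag{10.18}$$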
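The identity $d(y^T v)/dt = y^T(\partial f/\partial u)_* h$ on which (10.17) rests follows in one line from (10.15) and (10.16). The short derivation below is offered as a sketch, assuming $y$ and $v$ are differentiable at the instant considered (at the finitely many discontinuities of $u$ and $h$ the identity is understood piecewise):

```latex
\begin{aligned}
\frac{d}{dt}\left(y^{T} v\right)
  &= \left(\frac{dy}{dt}\right)^{T} v + y^{T}\,\frac{dv}{dt} \\
  &= -\,y^{T}\left(\frac{\partial f}{\partial x}\right)_{*} v
     + y^{T}\left[\left(\frac{\partial f}{\partial u}\right)_{*} h
     + \left(\frac{\partial f}{\partial x}\right)_{*} v\right] \\
  &= y^{T}\left(\frac{\partial f}{\partial u}\right)_{*} h .
\end{aligned}
```

The two terms involving $(\partial f/\partial x)_*$ cancel, which is precisely why the adjoint system carries the transposed matrix with a minus sign.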
The function $J_x^T(\partial f/\partial u)_*$ is piecewise continuous on $[t_0, t_1]$, since $J_x^T$ is a solution of the system of differential equations (10.16). Hence, $J_x^T(\partial f/\partial u)_* \in C_m$. Since $t_0$, $x(t_0)$ and $t_1$ are not being varied in this discussion, we must take $v(t_0) = 0$. Using the inner product notation, (10.18) can be rewritten as $\delta J(u; h) = \langle J_x^T(\partial f/\partial u)_*, h\rangle$, so that
$$\nabla J(u) = J_x^T \Big(\frac{\partial f}{\partial u}\Big)_* \tag{10.19}$$
that is, the gradient of $J$ exists at $u$ and may actually be computed by solving the adjoint Eqs. (10.16) for $J_x(t)$, integrating backward from $t_1$ to $t_0$ with the “starting” values $(\partial J/\partial x_1)_*$. (Since $u$ is piecewise continuous in $[t_0, t_1]$, so is $(\partial f/\partial x)_*$ and a piecewise differentiable solution of (10.16) exists in $[t_0, t_1]$ for all starting values.) By similar calculations it is easy to show that the gradients of the $g_j$ are
$$\nabla g_j(u) = \psi_{jx}^T \Big(\frac{\partial f}{\partial u}\Big)_* \tag{10.20}$$
where $\psi_{jx}(t)$ is the solution of the adjoint Eqs. (10.16) having as final values $\psi_{jx}(t_1) = (\partial \psi_j/\partial x_1)_*$. Finally, as we have already remarked, for any admissible controls $h_1, \dots, h_k$ and all $(s_1, \dots, s_k)$ with $|s_i| < \delta$, the control $u + \sum_{1}^{k} s_i h_i$ is also admissible and has a corresponding trajectory $x(t, s_1, \dots, s_k)$ which is continuous in $(s_1, \dots, s_k)$. Hence, the matrices $(\partial f/\partial u)$ and $(\partial f/\partial x)$ evaluated at the points $(x(t, s_1, \dots, s_k), u + \sum_{1}^{k} s_i h_i)$ in $R^n \times R^m$ are continuous functions of the parameters $(s_1, \dots, s_k)$ in some neighborhood of the origin in $R^k$. Consequently, any solution $\psi_{jx}$ of $dy/dt = -(\partial f/\partial x)^T y$ is also continuous in the parameters $s_1, \dots, s_k$ in the same neighborhood in $R^k$. It follows from this that the gradients
$$\nabla g_j\Big(u + \sum_{1}^{k} s_i h_i\Big) = \psi_{jx}^T \Big(\frac{\partial f}{\partial u}\Big), \qquad j = 1, \dots, p \tag{10.21}$$
exist and are continuous in the $s_i$. This means that $\nabla g_j$, $j = 1, \dots, p$, exists and is finitely continuous (Definition 2) in a star-neighborhood $D_u$, i.e., $D_u$ is the set of all controls of the form $u + \sum_{1}^{k} s_i h_i$ for arbitrary $h_1, \dots, h_k \in C_m$ and $|s_i| < \delta$, where $\delta > 0$ depends on the choice of $h_i$. $D_u$ is, as we pointed out in Sec. 10.3, a set which is finitely open at $u$. In similar fashion, we establish that $\nabla J$ also exists and is finitely continuous in a set which is finitely open at $u$. This is actually more than is required by the Corollary of Theorem 2 with respect to $\nabla J$, where we require only that $\nabla J$ be finitely continuous at $u$. Thus, we have proven that $\nabla J$ and $\nabla g_j$ satisfy all the hypotheses of that corollary except for the linear independence of the set $\{\nabla g_1(u), \dots, \nabla g_p(u)\}$. Let us now consider the gradients $\{\nabla g_j(u)\}$. If they are not linearly independent, then there are scalar multipliers $\lambda_1, \dots, \lambda_p$ not all zero and such that $\sum_{1}^{p} \lambda_j\, \nabla g_j(u) = 0$. On the other hand, if the $\{\nabla g_j(u)\}$ are linearly independent, then the multiplier rule (Theorem 2, Corollary) can be applied, that is, there are again scalar multipliers $\lambda_1, \dots, \lambda_p$ not all zero and such that $\nabla J(u) = \sum_{1}^{p} \lambda_j\, \nabla g_j(u)$, as in (10.13). Both of these cases can be subsumed under one general principle by asserting the existence of scalars $l_0, \lambda_1, \dots, \lambda_p$ not all zero such that
$$l_0\, \nabla J(u) + \sum_{1}^{p} \lambda_j\, \nabla g_j(u) = 0 \tag{10.22}$$
where $l_0 = 1$ if the $\{\nabla g_j(u)\}$ are linearly independent and $l_0 = 0$ if they are not linearly independent (note that if $\nabla J(u) = 0$, we simply take $l_0 = 1$ and $\lambda_j = 0$, $j = 1, \dots, p$). Applying (10.22) to the optimal control problem, we obtain the following necessary conditions for an optimal solution:
$$l_0\, J_x^T \Big(\frac{\partial f}{\partial u}\Big)_* + \sum_{1}^{p} \lambda_j\, \psi_{jx}^T \Big(\frac{\partial f}{\partial u}\Big)_* = 0 \tag{10.23a}$$
$$\frac{dJ_x}{dt} = -\Big(\frac{\partial f}{\partial x}\Big)_*^T J_x \tag{10.23b}$$
$$J_x(t_1) = \Big(\frac{\partial J}{\partial x_1}\Big)_* \tag{10.23c}$$
$$\frac{d\psi_{jx}}{dt} = -\Big(\frac{\partial f}{\partial x}\Big)_*^T \psi_{jx}, \qquad j = 1, \dots, p \tag{10.23d}$$
$$\psi_{jx}(t_1) = \Big(\frac{\partial \psi_j}{\partial x_1}\Big)_*, \qquad j = 1, \dots, p \tag{10.23e}$$
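Equations (10.23b) and (10.23d) can be combined into a single set of differential equations as follows. Let
$$l_x(t) = -l_0\, J_x(t) - \sum_{1}^{p} \lambda_j\, \psi_{jx}(t) \tag{10.24}$$
From (10.23b)-(10.23e) it follows at once that $l_x$ is the solution of the system of differential equations
$$\frac{dl_x}{dt} = -\Big(\frac{\partial f}{\partial x}\Big)_*^T l_x \tag{10.25}$$
with final values
$$l_x(t_1) = -l_0 \Big(\frac{\partial J}{\partial x_1}\Big)_* - \sum_{1}^{p} \lambda_j \Big(\frac{\partial \psi_j}{\partial x_1}\Big)_* \tag{10.26}$$
Equation (10.23a) then becomes
$$l_x^T \Big(\frac{\partial f}{\partial u}\Big)_* = 0 \tag{10.27}$$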
We shall show that the components $l_{x_i}(t)$, $i = 1, \dots, n$, of $l_x$ are the “multipliers” of Bliss' formulation of the multiplier rule (see Bliss,$^3$ p. 202) and Eqs. (10.25) and (10.27) are the Euler-Lagrange equations. In order to prove this, we refer to Bliss$^3$ (p. 203) and to the statement of the problem of Mayer given in Sec. 10.1.
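Following Bliss, we introduce the function
$$F(t, y, y') = \sum_{j=1}^{n} l_j(t)\, \Phi_j(t, y, y')$$
where the $\Phi_j$ are the functions in (10.3). Let $F_y$ be the $n \times 1$ matrix of partials $(\partial F/\partial y_i)$, $i = 1, \dots, n$. Similarly, let $F_{y'}$ be the $n \times 1$ matrix of partials $(\partial F/\partial y_i')$. The Euler-Lagrange equations may be written in vector form as
$$\frac{dF_{y'}}{dt} = F_y \tag{10.28}$$
Now, in the optimal control problem, the functions $\Phi_j$ are of the form
$$\Phi_j = x_j' - f_j(x_1, \dots, x_n, u_1, \dots, u_m), \qquad j = 1, \dots, n$$
As explained in Sec. 10.2, the optimal control problem is transformed into a problem of Mayer by setting $y_i = x_i$, $i = 1, \dots, n$, and $y_{n+k} = u_k$, $k = 1, \dots, m$. Hence, for $i = 1, \dots, n$, we have
$$\frac{\partial F}{\partial y_i} = \frac{\partial F}{\partial x_i} = -\sum_{j=1}^{n} l_j\, \frac{\partial f_j}{\partial x_i}$$
and
$$\frac{\partial F}{\partial y_i'} = \frac{\partial F}{\partial x_i'} = l_i$$
Substituting in (10.28), we obtain the Euler-Lagrange equations in the form
$$\frac{dl_i}{dt} = -\sum_{j=1}^{n} l_j\, \frac{\partial f_j}{\partial x_i}, \qquad i = 1, \dots, n \tag{10.29}$$
and for $i = n + k$, $k = 1, \dots, m$,
$$\sum_{j=1}^{n} l_j\, \frac{\partial f_j}{\partial u_k} = 0 \tag{10.30}$$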
Comparing (10.29), (10.30) with (10.25), (10.27), we see that they are identical systems of equations. This proves that the Euler-Lagrange equations are necessary conditions for a solution of the problem of Mayer. Furthermore, Eqs. (10.26) for the final values are just the “transversality conditions” obtained by Bliss$^3$ (p. 202, Eqs. 74.9) by setting the coefficients of his $dy_{i2}$ equal to zero, $i = 1, \dots, n$. The constants $e_\mu$ of Bliss correspond to our $\lambda_j$ and the multipliers $l_0, l_1, \dots, l_n$ of Bliss are precisely our $l_0, l_{x_1}, \dots, l_{x_n}$, respectively. As in Bliss, it is clear that $l_0$ and $l_{x_i}(t)$, $i = 1, \dots, n$, do not all vanish simultaneously at any point $\bar t$ in the interval $[t_0, t_1]$. If $l_0 = 1$, this is trivially the case. If $l_0 = 0$, then $l_{x_i}(\bar t) = 0$ for all $i = 1, \dots, n$ implies that $l_x(t) = 0$ for all $t$ in $[t_0, t_1]$, since $l_x$ is the unique solution of the system of homogeneous linear equations (10.25). Consequently,
$$\sum_{1}^{p} \lambda_j \Big(\frac{\partial \psi_j}{\partial x_1}\Big)_* = 0$$
by (10.23) and (10.24). However, this contradicts the hypothesis that the matrix $(\partial \psi_j/\partial x_1)_*$ has rank $p$. Hence, $l_x(t) \ne 0$ for all $t$ in $[t_0, t_1]$, which completes the derivation of the multiplier rule as formulated in Bliss for the case in which the endpoints, $t_0$ and $t_1$, and the initial conditions $x(t_0)$ are not varied. In the next section, we shall consider the case in which these quantities are varied also.
At this point, it is pertinent to make some observations regarding “normality.” In Bliss$^3$ (pp. 210-219), an arc, $y(t)$, which satisfies the multiplier rule for the problem of Mayer (or the equivalent problem of Bolza, as it is presented in Bliss), is said to have abnormality of order $q$ if it has exactly $q$ linearly independent sets of multipliers of the form $\{l_0 = 0, l_x^{(\sigma)}\}$, $\sigma = 1, \dots, q$. If $q = 0$, then $y(t)$ is said to be a normal arc. It is clear that a normal arc can have at most one (hence exactly one) set of multipliers with $l_0 = 1$. The concept of normality is illuminated very clearly by looking at the corresponding optimal control problem. We see at once that normality is equivalent to the linear independence of the set of gradients $\{\nabla g_1(u), \dots, \nabla g_p(u)\}$, with all that this connotes geometrically. Abnormality of order $q$ is equivalent to the existence of precisely $p - q$ linearly independent vectors among the $\{\nabla g_j(u)\}$, that is, the dimension of the subspace, $G$, spanned by the gradients $\{\nabla g_1(u), \dots, \nabla g_p(u)\}$ is precisely $p - q$. When viewed from this abstract standpoint, many of the results concerning normality are easily derived by the methods used in the proof of Theorem 1. For example, Theorem 77.1 in Bliss$^3$ (p. 214) asserts that near a normal arc which satisfies the differential equations (10.3) and the end-conditions (10.4) there is a one-parameter family of such arcs. The equivalent assertion in abstract spaces is that in a neighborhood of a point $u$ on the constraint $C = \bigcap_{1}^{p} C(g_i)$ which is such that the gradients $\{\nabla g_1(u), \dots, \nabla g_p(u)\}$ are linearly independent, there is a one-parameter family of points which also lie on the constraint $C$. We established this result in the proof of Theorem 1 (Sec. 10.4).
Another remark of a rather different nature is appropriate at this juncture. Let us consider the Hamiltonian function
$$\mathcal{H}(l_x, x, u) = \sum_{j=1}^{n} l_{x_j}\, f_j(x, u) = l_x^T f$$
where $x$, $u$ and $l_x$ are regarded as independent vector variables of dimension $n$, $m$, and $n$, respectively. We have
$$\frac{\partial \mathcal{H}}{\partial u_i} = \sum_{j=1}^{n} l_{x_j}\, \frac{\partial f_j}{\partial u_i}, \qquad i = 1, \dots, m$$
If $(x(t), u(t))$ is an optimal solution of the optimal control problem and $l_x(t)$ is the corresponding solution of (10.25), given by (10.24), then (10.27) yields
$$\frac{\partial \mathcal{H}}{\partial u_i} = 0, \qquad i = 1, \dots, m$$
for all $t$ in the interval $[t_0, t_1]$. But this is a necessary condition for $\mathcal{H}(l_x(t), x(t), u)$ to attain a relative maximum with respect to $u$ at the point $u = u(t)$. Hence, we obtain in this case (i.e., when the control region $R$ is an open set) conditions which are an immediate consequence of the maximum principle of Pontryagin, without appeal to that principle. The Pontryagin principle also contains the result that $\mathcal{H} = \mathcal{H}(t) = \mathcal{H}(l_x(t), x(t), u(t))$ is a constant along an optimal solution. This property of the Hamiltonian was, of course, long known in the case where $u(t)$ is piecewise differentiable. Indeed, we then have
$$\frac{d\mathcal{H}}{dt} = \frac{d(l_x^T f)}{dt} = \frac{dl_x^T}{dt}\, f + l_x^T\, \frac{\partial f}{\partial x}\, \frac{dx}{dt} + l_x^T\, \frac{\partial f}{\partial u}\, \frac{du}{dt}$$
Since $dl_x^T/dt = -l_x^T(\partial f/\partial x)$ by (10.25), $dx/dt = f$ and $l_x^T(\partial f/\partial u) = 0$ by (10.27), we obtain $d\mathcal{H}/dt = 0$ for all $t$. Thus, $\mathcal{H}$ is constant along an optimal solution.
10.6 Variation of Endpoints and Initial Conditions
We shall now extend the results of the previous section to the case where $t_0$, $x(t_0)$, $t_1$, $x(t_1)$ are all varied simultaneously. This can readily be done if we choose the underlying pre-Hilbert space to be $E = C_m \times R^n \times R^2$ in which an arbitrary element is of the form $e = (u(t), v_1, t_{01})$ with $u = u(t) \in C_m$, $v_1 \in R^n$ and $t_{01} = (t_0, t_1) \in R^2$. Thus, $E$ is the direct sum of $C_m$ as the “control space,” $R^n$ as the “initial-value space,” and $R^2$ as the “space of endpoints.” If $e^* = (u^*(t), v_1^*, t_{01}^*)$ is any element of $E$, then we have the usual definition of the sum $e + e^* = (u(t) + u^*(t), v_1 + v_1^*, t_{01} + t_{01}^*)$. The inner product of $e$ and $e^*$ is defined as
$$\langle e, e^*\rangle = \int_a^b u^T u^*\, dt + v_1^T v_1^* + t_{01}^T t_{01}^*$$
Suppose for an arbitrary initial point $t_0$, an arbitrary final point $t_1$, an arbitrary set of initial values $x(t_0) \in R^n$, and an arbitrary admissible control $u$, that system (10.9) has a solution, $x(t)$, $t_0 \le t \le t_1$. The final value $x(t_1)$ can be regarded as a function on $E$ to $R^n$. Writing $e = (u, x(t_0), t_{01})$, we can denote this function by $x_1(e)$, following the usage of the previous section. It is clear that $J$ and the $\psi_j$ are functionals on $E$. The results of Sec. 10.5 can be extended in a very natural way to apply to this more general space. We begin by noting that system (10.9) is autonomous and, therefore, we may shift the origin of the time axis without changing the problem. This allows us to set $t_0 = 0$, which we shall do for convenience in the derivation to follow. Thus, variation of the endpoints can be effected by varying $t_1$ only, and we are dealing with the initial-value problem
$$\frac{dx}{dt} = f(x, u), \qquad x(0) = x_0$$
where $x_0$ denotes an arbitrary initial value which we may treat as a parameter of the problem. Now, let $y = x - x_0$, so that $y(0) = x(0) - x_0 = 0$. We obtain a corresponding initial-value problem for $y$, that is,
$$\frac{dy}{dt} = \bar f(y, u, x_0), \qquad y(0) = 0 \tag{10.31}$$
where $\bar f(y, u, x_0) = f(y + x_0, u)$. Thus, the parameter $x_0$ is included in the differential equations themselves rather than in the initial conditions. At the same time, the endpoint constraints (10.10) are replaced by the constraints
$$\Psi_j(x_0, t_1, y(t_1)) = 0, \qquad j = 1, \dots, p \tag{10.32}$$
where $\Psi_j(x_0, t_1, y(t_1)) = \psi_j(0, x_0, t_1, y(t_1) + x_0)$. Similarly, the functional to be minimized is now $\bar J = \bar J(x_0, t_1, y(t_1))$, where $\bar J(x_0, t_1, y(t_1)) = J(0, x_0, t_1, y(t_1) + x_0)$. Then, instead of $C_m \times R^n \times R^2$, we may take $E = C_m \times R^n \times R$, so that an arbitrary $e \in E$ is of the form $e = (u(t), x_0, t_1)$. If $e \in E$ is such that system (10.31) has a solution in the interval $0 \le t \le t_1$, we may solve (10.31) and obtain the final values $y(t_1)$. Hence, the final values may be regarded as a function of $e$ and we can write $y(t_1) = y_1(e)$ as in Sec. 10.5. Similarly, we have $\bar J(x_0, t_1, y(t_1)) = \bar J(x_0, t_1, y_1(e)) = \bar J(e)$ and $\Psi_j(x_0, t_1, y(t_1)) = \Psi_j(e)$. To calculate gradients, we consider $\bar J(e + s\bar h)$, where $\bar h = (h(t), dx_0, dt_1) \in E$ and $e$ is a relative minimum of $\bar J$ satisfying $\Psi_j(e) = 0$. As before, $y(t, s)$ is the solution of (10.31) corresponding to $e + s\bar h$.
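We have
$$\delta \bar J(e; \bar h) = \frac{d\bar J(e + s\bar h)}{ds}\Big|_{s=0} = \Big(\frac{\partial \bar J}{\partial x_0}\Big)_*^T dx_0 + \Big(\frac{\partial \bar J}{\partial t_1}\Big)_* dt_1 + \Big(\frac{\partial \bar J}{\partial y(t_1)}\Big)_*^T\, \frac{dy_1(e + s\bar h)}{ds}\Big|_{s=0} \tag{10.33}$$
and
$$\frac{\partial \bar J}{\partial x_0} = \frac{\partial J}{\partial x_0} + \frac{\partial J}{\partial x(t_1)}, \qquad \frac{\partial \bar J}{\partial t_1} = \frac{\partial J}{\partial t_1}, \qquad \frac{\partial \bar J}{\partial y(t_1)} = \frac{\partial J}{\partial x(t_1)} \tag{10.34}$$
where all partials of $J$ are evaluated at the point $(0, x_0, t_1, y(t_1) + x_0)$. Also, $dx/dt = dy/dt$. Since $y_1(e + s\bar h) = y(t_1 + s\,dt_1, s)$,
$$\frac{dy_1(e + s\bar h)}{ds}\Big|_{s=0} = \Big(\frac{dy}{dt}\Big)_* dt_1 + y_s(t_1, 0) \tag{10.35}$$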
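where we have written $y_s$ for $\partial y/\partial s$ and the asterisk denotes the optimal solution (i.e., $s = 0$). Now, $y(t, s)$ is a solution of the differential equation $dy/dt = \bar f(y, u + sh, x_0 + s\,dx_0)$. Letting $v(t) = y_s(t, 0)$, we know that $v$ is a solution of the variational equations
$$\frac{dv}{dt} = \Big(\frac{\partial \bar f}{\partial y}\Big)_* v + \Big(\frac{\partial \bar f}{\partial s}\Big)_*$$
But
$$\frac{\partial \bar f}{\partial s} = \frac{\partial \bar f}{\partial u}\, h + \frac{\partial \bar f}{\partial x_0}\, dx_0$$
Since $\bar f(y, u, x_0) = f(y + x_0, u)$, we have $(\partial \bar f/\partial u)_* = (\partial f/\partial u)_*$ and $(\partial \bar f/\partial x_0)_* = (\partial \bar f/\partial y)_*$. Furthermore, $\partial \bar f/\partial y = \partial f/\partial x$, which means that $y_s(t, 0)$ is the solution of the system
$$\frac{dv}{dt} = \Big(\frac{\partial f}{\partial x}\Big)_* v + \Big(\frac{\partial f}{\partial u}\Big)_* h + \Big(\frac{\partial f}{\partial x}\Big)_* dx_0, \qquad v(0) = 0$$
(Note that $v(0) = y_s(0, 0) = \lim_{s \to 0}\, (y(0, s) - y(0, 0))/s = 0$.) Letting $\bar v = v + dx_0$, we have
$$\frac{d\bar v}{dt} = \Big(\frac{\partial f}{\partial x}\Big)_* \bar v + \Big(\frac{\partial f}{\partial u}\Big)_* h$$
For any solution, $z$, of the adjoint equation $dz/dt = -(\partial f/\partial x)_*^T z$,
$$z^T(t_1)\, \bar v(t_1) = \int_0^{t_1} z^T \Big(\frac{\partial f}{\partial u}\Big)_* h\, dt + z^T(0)\, \bar v(0) \tag{10.36}$$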
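If $z$ is $J_x(t)$, the solution having final values $J_x(t_1) = (\partial \bar J/\partial y(t_1))_* = (\partial J/\partial x(t_1))_*$, then since $\bar v(t_1) = v(t_1) + dx_0$ and $\bar v(0) = dx_0$, Eq. (10.36) becomes
$$J_x^T(t_1)\big(v(t_1) + dx_0\big) = \int_0^{t_1} J_x^T \Big(\frac{\partial f}{\partial u}\Big)_* h\, dt + J_x^T(0)\, dx_0$$
Combining this with (10.34), (10.35) in (10.33), we get
$$\frac{d\bar J}{ds}\Big|_{s=0} = \int_0^{t_1} J_x^T \Big(\frac{\partial f}{\partial u}\Big)_* h\, dt + \Big[\Big(\frac{\partial J}{\partial x_0}\Big)_* + J_x(0)\Big]^T dx_0 + \Big[\Big(\frac{\partial J}{\partial t_1}\Big)_* + J_x^T(t_1)\, x'(t_1)\Big] dt_1$$
Since $d\bar J/ds|_{s=0} = \langle \nabla \bar J(e), \bar h\rangle$, we obtain
$$\nabla \bar J(e) = \Big(J_x^T \Big(\frac{\partial f}{\partial u}\Big)_*,\ \Big(\frac{\partial J}{\partial x_0}\Big)_* + J_x(0),\ \Big(\frac{\partial J}{\partial t_1}\Big)_* + J_x^T(t_1)\, x'(t_1)\Big) \tag{10.37}$$
where $J_x^T(t)$ is the solution of the adjoint equation (10.16) having final values $J_x^T(t_1) = (\partial J/\partial x(t_1))_*$ and the subscript $*$ indicates that all quantities are evaluated on the optimal solution $(x(t), u(t), x_0, t_1)$.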
A formula analogous to (10.37) holds for $\nabla g_j(e)$ with $\psi_j$ replacing $J$. The remainder of the derivation of Sec. 10.5 carries over mutatis mutandis to the present case. We obtain the following additional transversality conditions arising from the variation of $x_0$ and $t_1$, respectively:
$$l_0 \Big(\frac{\partial J}{\partial x_0}\Big)_* + \sum_{1}^{p} \lambda_j \Big(\frac{\partial \psi_j}{\partial x_0}\Big)_* - l_x(0) = 0 \tag{10.38}$$
$$l_0 \Big(\frac{\partial J}{\partial t_1}\Big)_* + \sum_{1}^{p} \lambda_j \Big(\frac{\partial \psi_j}{\partial t_1}\Big)_* - l_x^T(t_1)\, x'(t_1) = 0 \tag{10.39}$$
Together with the transversality conditions in (10.26) and the Euler-Lagrange equations in (10.25), (10.27), the transversality conditions (10.38), (10.39) constitute the multiplier rule for the problem of Mayer formulated as an optimal control problem. In the case where $J$ and $\psi_j$ do not involve $t_1$ explicitly, we have $\partial J/\partial t_1 = 0$ and $\partial \psi_j/\partial t_1 = 0$. The transversality condition (10.39) then becomes the relation $l_x^T(t_1)\, x'(t_1) = 0$. However, since $x' = f$ and $\mathcal{H} = l_x^T f$, this means that $\mathcal{H}(t_1) = 0$. Since the Hamiltonian, $\mathcal{H}$, is constant along an optimal solution, we obtain the result that $\mathcal{H}(t) = 0$ along an optimal solution when the endpoint $t_1$ is being varied. Finally, we observe that in the case of a nonautonomous system of control equations, the same techniques can be applied to obtain an additional transversality condition, arising from the variation of $t_0$, and of the same form as (10.39) with $t_1$ replaced by $t_0$. (We proceed by introducing the new independent variable $\bar t = t - t_0$ and the system $dx/dt = f(x, u, t)$ becomes $dx/d\bar t = f(x, u, \bar t + t_0)$, containing $t_0$ as a parameter. We shall not carry out the detailed steps here.)
10.7 Examples of Optimal Control Problems
We have already presented one example of an optimal control problem in our reformulation of the brachistochrone problem. We shall now consider two examples which arise in the flight mechanics of low-thrust rockets and which have recently received attention in the study of certain space missions. For purposes of illustration, we shall limit our attention to two-dimensional trajectories, although our remarks apply equally well to three-dimensional trajectories. Thus, we consider a rocket of unit mass moving in a central inverse-square-law gravitational field under the force of a continuous (i.e., nonimpulsive) thrust engine. Let $(r, \theta)$ be the polar coordinates of the rocket at time $t$ and let $u$ and $v$ be the radial and circumferential components of velocity, respectively. Let $a_r$ and $a_\theta$ be the radial and circumferential components of thrust acceleration, respectively. Then the equations of motion are, as is well known,
$$\frac{du}{dt} = \frac{v^2}{r} - \frac{1}{r^2} + a_r, \qquad \frac{dv}{dt} = -\frac{uv}{r} + a_\theta, \qquad \frac{dr}{dt} = u, \qquad \frac{d\theta}{dt} = \frac{v}{r} \tag{10.40}$$
For given initial conditions at time $t = 0$ and a particular choice of the functions $a_r(t)$, $a_\theta(t)$, Eqs. (10.40) determine the motion of the rocket for $0 \le t \le t_1$. It is required to choose $a_r$ and $a_\theta$ from a class of admissible controls so as to produce a trajectory which satisfies certain end conditions at time $t_1$ and also minimizes the amount of rocket fuel consumed. It can be shown (e.g., see Irving and Blum3’) that minimizing the fuel consumed is effected analytically by minimizing the integral
$$J = J(t_1) = \int_0^{t_1} (a_r^2 + a_\theta^2)\, dt$$
or, if we write $dJ/dt = a_r^2 + a_\theta^2$, by minimizing the final value of the function $J(t)$. The other requirements, as represented by the end conditions, are determined by the mission to be accomplished. We shall consider two missions, “escape” and “rendezvous.” In both missions it will be assumed that the rocket is initially in a circular orbit so that at time $t = 0$,
$$u(0) = u_0, \qquad v(0) = v_0, \qquad r(0) = r_0, \qquad \theta(0) = 0, \qquad J(0) = 0 \tag{10.41}$$
where $u_0 = 0$. (We shall continue to write $u_0$ to show that most of the discussion applies also to arbitrary initial conditions.) In the escape mission, it is required to escape from the gravitational pull in a fixed time $t_1$, that is, the total energy, $E_1$, at time $t_1$ must be zero. The total energy is given by
$$E = \frac{u^2 + v^2}{2} - \frac{1}{r} \tag{10.42}$$
Actually, we shall consider “escape” to mean $E_1 \ge 0$. To formulate the escape problem as an optimal control problem, we let $x_1 = u$, $x_2 = v$, $x_3 = r$, $x_4 = J$ be the state variables and $u_1 = a_r$, $u_2 = a_\theta$ be the control variables. Using (10.40) and the differential equation for $J$, we obtain for the control equations,
$$\frac{dx_1}{dt} = \frac{x_2^2}{x_3} - \frac{1}{x_3^2} + u_1, \qquad \frac{dx_2}{dt} = -\frac{x_1 x_2}{x_3} + u_2, \qquad \frac{dx_3}{dt} = x_1, \qquad \frac{dx_4}{dt} = u_1^2 + u_2^2 \tag{10.43}$$
The endpoint constraints are obtained from (10.41), (10.42):
$$\begin{aligned} \psi_1 &= x_1(0) - u_0 = 0, \\ \psi_2 &= x_2(0) - v_0 = 0, \\ \psi_3 &= x_3(0) - r_0 = 0, \\ \psi_4 &= x_4(0) = 0, \\ \psi_5 &= \big[(x_1(t_1))^2 + (x_2(t_1))^2\big]/2 - 1/x_3(t_1) - E_1 = 0 \end{aligned} \tag{10.44}$$
where $u_0$, $v_0$, $r_0$, $E_1$ are constants determined by physical conditions. Finally, the function to be minimized is simply
$$J = x_4(t_1) \tag{10.45}$$
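The equations of motion (10.40) and the energy (10.42) translate directly into code. The following is a minimal sketch, assuming the normalized units of the text (unit mass and unit gravitational parameter); the function names `polar_rhs`, `energy`, and `escape_constraint` are illustrative and not part of the original.

```python
def polar_rhs(state, a_r, a_theta):
    """Right-hand side of the equations of motion (10.40).

    state = (u, v, r, theta): radial velocity, circumferential velocity,
    radius and polar angle.
    """
    u, v, r, theta = state
    du = v * v / r - 1.0 / (r * r) + a_r
    dv = -u * v / r + a_theta
    dr = u
    dtheta = v / r
    return (du, dv, dr, dtheta)


def energy(u, v, r):
    """Total energy (10.42)."""
    return 0.5 * (u * u + v * v) - 1.0 / r


def escape_constraint(u, v, r, e1=0.0):
    """Final-time constraint psi_5 of (10.44): energy at t_1 minus E_1."""
    return energy(u, v, r) - e1


# Circular orbit of unit radius (u0 = 0, v0 = 1, r0 = 1) with zero thrust.
circular = (0.0, 1.0, 1.0, 0.0)
rates = polar_rhs(circular, 0.0, 0.0)
e_circular = energy(0.0, 1.0, 1.0)
```

On this circular orbit the velocity components and the radius are stationary and only the angle advances, while the energy is negative, so some thrust is needed before the escape condition $E_1 \ge 0$ can hold.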
Comparing (10.43), (10.44), (10.45) with (10.9), (10.10), (10.11), respectively, we see that we have an optimal control problem of the type described in previous sections. Hence, an optimal solution must satisfy the multiplier rule conditions as given by (10.23a)-(10.23e). It is useful to calculate the gradients of $J$ and the $\psi_j$ in this case, both for illustrative purposes and for use in the convergent gradient procedure to be described below. From (10.19) we see that we need the function $J_x(t)$ and the matrix $(\partial f/\partial u)$. The latter is easily calculated from (10.43) to be
$$\frac{\partial f}{\partial u} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 2u_1 & 2u_2 \end{pmatrix} \tag{10.46}$$
$J_x(t)$ usually cannot be given analytically but must be obtained numerically as the solution of the adjoint equations (10.16). In this example, the adjoint equations are as follows:
$$\begin{aligned} dy_1/dt &= (x_2/x_3)_*\, y_2 - y_3, \\ dy_2/dt &= -(2x_2/x_3)_*\, y_1 + (x_1/x_3)_*\, y_2, \\ dy_3/dt &= (-2/x_3^3 + x_2^2/x_3^2)_*\, y_1 - (x_1 x_2/x_3^2)_*\, y_2, \\ dy_4/dt &= 0 \end{aligned} \tag{10.47}$$
The end values which determine $J_x(t)$ are the values of $(\partial J/\partial x(t_1))_*$. Hence, from (10.45) we obtain
$$J_x^T(t_1) = (0, 0, 0, 1) \tag{10.48}$$
In this case it is a trivial matter to solve the adjoint equations (10.47). Clearly, we have $J_x^T(t) = (0, 0, 0, 1)$ for all $t$. Hence, applying (10.19), we obtain
$$\nabla J(u) = (2u_1(t), 2u_2(t)) \tag{10.49}$$
A similar procedure may be applied to the constraints (10.44). First, however, since the initial conditions are not being varied, we may discard the first four constraints. Thus, we have a problem with $p = 1$ constraint, $\psi_5 = 0$. The corresponding function, $\psi_x = \psi_{5x}(t)$, is the solution of (10.47) with final values $(\partial \psi_5/\partial x(t_1))_*$. From (10.44) we get
$$\psi_{5x}^T(t_1) = \big(x_1(t_1),\ x_2(t_1),\ 1/(x_3(t_1))^2,\ 0\big)$$
Using these final values as starting values, we can integrate (10.47) backward in time to obtain the components of $\psi_x = (\psi_{x_1}(t), \psi_{x_2}(t), \psi_{x_3}(t), \psi_{x_4}(t))$. We see at once that $\psi_{x_4}(t) = 0$ for all $t$. Hence, by (10.46) and (10.20), we obtain
$$\nabla g_1(u) = (\psi_{x_1}(t), \psi_{x_2}(t)) \tag{10.50}$$
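The recipe behind (10.50) is: integrate the control equations (10.43) forward, then integrate the adjoint equations (10.47) backward from the final values of $\psi_{5x}$, and read off the first two components. The following is a rough numerical sketch of that recipe, not the computational method of the text. It assumes a fixed-step fourth-order Runge-Kutta scheme, linear interpolation of the state at half steps, the trapezoidal rule for the inner product, and illustrative data (`X0`, `T1`, `N`, the control `u` and the variation `h` are all made up for the example).

```python
import math


def f(x, a):
    """Control equations (10.43); x = (x1, x2, x3, x4), a = (u1, u2)."""
    x1, x2, x3, _ = x
    u1, u2 = a
    return [x2 * x2 / x3 - 1.0 / x3 ** 2 + u1, -x1 * x2 / x3 + u2,
            x1, u1 * u1 + u2 * u2]


def adjoint_rhs(x, y):
    """Adjoint equations (10.47): dy/dt = -(df/dx)^T y along the state x."""
    x1, x2, x3, _ = x
    y1, y2, y3, _ = y
    return [(x2 / x3) * y2 - y3,
            -(2.0 * x2 / x3) * y1 + (x1 / x3) * y2,
            (x2 ** 2 / x3 ** 2 - 2.0 / x3 ** 3) * y1 - (x1 * x2 / x3 ** 2) * y2,
            0.0]


def axpy(x, k, c):
    return [xi + c * ki for xi, ki in zip(x, k)]


def rk4(x, k1, k2, k3, k4, dt):
    return [xi + dt * (a + 2 * b + 2 * c + d) / 6.0
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def trajectory(control, x0, t1, n):
    """Integrate (10.43) forward with n RK4 steps; returns the states on the grid."""
    dt, xs = t1 / n, [list(x0)]
    for i in range(n):
        t, x = i * dt, xs[-1]
        k1 = f(x, control(t))
        k2 = f(axpy(x, k1, dt / 2), control(t + dt / 2))
        k3 = f(axpy(x, k2, dt / 2), control(t + dt / 2))
        k4 = f(axpy(x, k3, dt), control(t + dt))
        xs.append(rk4(x, k1, k2, k3, k4, dt))
    return xs


def final_energy(xs):
    x1, x2, x3, _ = xs[-1]
    return 0.5 * (x1 * x1 + x2 * x2) - 1.0 / x3


def constraint_gradient(xs, t1):
    """Integrate (10.47) backward from psi_x(t1); returns (psi_x1, psi_x2) on the grid."""
    n = len(xs) - 1
    dt = t1 / n
    x1, x2, x3, _ = xs[-1]
    y = [x1, x2, 1.0 / x3 ** 2, 0.0]
    grad = [None] * (n + 1)
    grad[n] = (y[0], y[1])
    for i in range(n, 0, -1):
        xm = [0.5 * (p + q) for p, q in zip(xs[i], xs[i - 1])]  # half-step state
        k1 = adjoint_rhs(xs[i], y)
        k2 = adjoint_rhs(xm, axpy(y, k1, -dt / 2))
        k3 = adjoint_rhs(xm, axpy(y, k2, -dt / 2))
        k4 = adjoint_rhs(xs[i - 1], axpy(y, k3, -dt))
        y = rk4(y, k1, k2, k3, k4, -dt)
        grad[i - 1] = (y[0], y[1])
    return grad


def inner(grad, h, t1):
    """Trapezoidal approximation of the inner product <grad, h> on [0, t1]."""
    n = len(grad) - 1
    dt = t1 / n
    vals = [g[0] * h(i * dt)[0] + g[1] * h(i * dt)[1] for i, g in enumerate(grad)]
    return dt * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


# Illustrative data: circular initial orbit, a small smooth thrust program.
X0, T1, N = (0.0, 1.0, 1.0, 0.0), 1.0, 200
u = lambda t: (0.01, 0.02 * math.cos(t))
h = lambda t: (math.sin(t), 1.0)


def directional_derivative(s=1e-4):
    """Central-difference estimate of d/ds of the final energy at u + s*h."""
    shifted = lambda c: (lambda t: (u(t)[0] + c * h(t)[0], u(t)[1] + c * h(t)[1]))
    plus = final_energy(trajectory(shifted(s), X0, T1, N))
    minus = final_energy(trajectory(shifted(-s), X0, T1, N))
    return (plus - minus) / (2.0 * s)


adjoint_value = inner(constraint_gradient(trajectory(u, X0, T1, N), T1), h, T1)
```

Comparing `adjoint_value` with `directional_derivative()` checks the defining property of the gradient, namely that the inner product of the gradient with $h$ equals the derivative of the constraint along $h$; the two numbers should agree up to discretization error. One backward integration yields the derivative in every direction $h$, whereas finite differences need a fresh pair of trajectories for each direction.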
Formulas (10.49) and (10.50) give the gradients for any control $u$ as well as for the optimal control. The asterisk in (10.47) is thus to be interpreted as specifying that $x_1 = x_1(t)$, $x_2 = x_2(t)$ and $x_3 = x_3(t)$ which appear in the equations are the components of the state function, $x(t)$, which is obtained by solving (10.43) with the given control $u$ and the fixed initial conditions. We shall return to these points again when we consider methods of computation.
The second example which we shall consider is the mission of rendezvous by low-thrust rocket. The equations of motion are given by (10.40), as before, and the function to be minimized is the same $J$. However, in place of (10.42), the end conditions to be satisfied are given by
$$u(t_1) - u_1 = 0, \qquad v(t_1) - v_1 = 0, \qquad r(t_1) - r_1 = 0, \qquad \theta(t_1) - \theta_1 = 0 \tag{10.51}$$
In physical terms, it is required to arrive at a preassigned point, $(r_1, \theta_1)$, in space with a preassigned velocity $(u_1, v_1)$ in a fixed time $t_1$. Of all the thrust “programs,” $u(t)$, which will bring the rocket to this rendezvous, we wish to find the one which minimizes $J$. Again, we shall assume the rocket starts from a circular orbit so that the initial values are fixed and given by (10.41). To formulate the rendezvous problem as an optimal control problem, we proceed as before, introducing state and control variables, except that now we have the additional state variable, $\theta$. We set $x_1 = u$, $x_2 = v$, $x_3 = r$, $x_4 = \theta$, $x_5 = J$. The control equations are as in (10.43) with $x_4$ replaced by $x_5$ and the additional equation, $dx_4/dt = x_2/x_3$. The adjoint equations are
$$\begin{aligned} dy_1/dt &= (x_2/x_3)_*\, y_2 - y_3, \\ dy_2/dt &= -(2x_2/x_3)_*\, y_1 + (x_1/x_3)_*\, y_2 - (1/x_3)_*\, y_4, \\ dy_3/dt &= (-2/x_3^3 + x_2^2/x_3^2)_*\, y_1 - (x_1 x_2/x_3^2)_*\, y_2 + (x_2/x_3^2)_*\, y_4, \\ dy_4/dt &= 0, \\ dy_5/dt &= 0 \end{aligned} \tag{10.52}$$
We now have J,'(tl) = (0, 0, 0, 0, 1) so that the gradient of J is again given by (10.49). The endpoint constraints are obtained from (10.52) as the p = 4 constraints, $1 = x,(t,) - u 1 = 0, $ 3 = x,(t,) - r l = 0 (10.53) $2 = x,(t,) - ~1 = 0 , $4 = x4(tl) - 01 = 0 To obtain Vgl(u) we must compute the function + l x ( t ) , which is the solution of the adjoint equations (10.52) having +T,(tl) = (a$,/ax(t,))' = (1,0,0,0,0). Similarly, $ z x ( t , ) = (0, 1, 0, 0 , 0), $T,(tl) = (0, 0, 1, 0, 0) and $:,(tl) = (0, 0, 0, 1, 0). It is clear that the last component of + j x ( t ) , j= 1, ... , 4, is the identically zero function. Therefore, by (10.20) and (10.46), Vgj(u) = ( $ j x l ( l ) >
j = 1,
$jx,(t>>,
... 4 5
(1 0.54)
where $ j x , and $ j x r are the first two components of $J~,.We repeat that these results hold for any control, u, and its corresponding trajectory x. It is easy to show that the Vgj(u) in (10.54) are linearly independent. Suppose Vgj(u) = 0. This implies A j ~ j x , ( t l= ) Aj$j,,(tl) = 0. The first summation is just A, and the second is 2 , . Thus, A, = A, = 0. This implies that A3$3xa(t) A4$4x2(t) = 0 and A 3 $ j x r ( t ) A,$,,,(t) = 0. If 1, # 0, then d$,,/dt = A d$,,/dt, where A = -A3/A4. From the first equation in (10.52), we see that d~,h~,,/dt = 1 and d$,,,/dt = 0 at t , . Hence, /z = 0, which means ,I3 = 0. This implies A4 d$,,,/dt = 0 for all t. Since d$,,,/dt = 1 at t , , as the second equation of (10.52) shows, we have A, = 0, which is a contradiction. Hence, A4 = 0, which implies A 3 d$3xz/dt= 0. Since d$,,,/dt = 1 at t , , we must also have A3 = 0, establishing linear independence. From the linear independence of the Vgj it follows that I , = 1 in (10.24) Aj$jx, where not all Aj are zero. of the multiplier rule. Thus, I , = -J, Since J, = (0, 0, 0, 0, 1) for all t , we have as the transversality condition (10.26) in this case Z,(tl) = (--Al, - A 2 , - A 3 , - A 4 , - 1). From the necessary conditions (10.27) and (10.47) we find that 2u1 = ,Ij$j,, and 2u2 = A j $ j x 2 . Thus, u , and u, are components of a solution of the adjoint equations, and consequently they must be differentiable functions. Similar considerations for the escape problem show that ZXT = - J,' A$,' = (-Wxl, - A * x 2 , -A*,3, - 1). Also,
$$l_x^T(\partial f/\partial u) = (-\lambda\psi_{x_1} - 2u_1,\; -\lambda\psi_{x_2} - 2u_2) = 0 \tag{10.55}$$
Since $\psi_{x_1}(t_1) = u(t_1)$ and $\psi_{x_2}(t_1) = v(t_1)$, we arrive at the following necessary condition for an optimal solution:
$$a_r(t_1) = (-\lambda/2)\,u(t_1) \quad \text{and} \quad a_\theta(t_1) = (-\lambda/2)\,v(t_1) \tag{10.56}$$
i.e., the final thrust vector must be along the final velocity vector, a result obtained by somewhat different techniques in Blum and Irving.30 For some time it was conjectured that a tangential thrust program was the optimal
448
E. K. BLUM
solution of the escape problem. However, numerical computations have shown this conjecture to be false. Of course, a demonstration by numerical calculation cannot be accepted as a mathematical proof in general, however convincing the data may be. Furthermore, in this type of computation, exact estimates of the error arising from various approximation procedures (e.g., numerical integration) are not available. Perhaps more important, in this particular example, is the fact that all the numerical solutions involve some sort of iteration procedure. Even when the iteration procedure is known to converge (and not all procedures in current use possess this property), in this particular example the neighborhood of convergence is quite large; that is, thrust programs which are quite far from optimum produce escape trajectories having almost the same fuel consumption as the optimum. Therefore, it is difficult to conclude with absolute certainty from an approximate numerical solution that the optimal solution does or does not have certain qualitative properties. The same is true of approximations obtained by various analytic techniques such as linearization and perturbation methods. Therefore, it is of some interest to establish qualitative properties by rigorous mathematical proof wherever possible. Using the results developed here, we can now prove rigorously that the optimal thrust program for the escape problem is not tangential, is not circumferential, and is not of constant magnitude, although there are programs of each of these three types which produce escape trajectories having fuel consumptions close to the minimum. We prove these results in the following three theorems.
Theorem 3. The optimal thrust program for the escape problem is not tangential at all points of the trajectory.
PROOF. Assume that the optimal thrust program is tangential for all $t$, that is, suppose that
$$a_r(t) = z(t)u(t), \quad a_\theta(t) = z(t)v(t), \quad 0 \le t \le t_1 \tag{10.57}$$
We shall prove that this assumption implies a zero thrust program which, of course, cannot produce escape. From (10.55) and (10.57) it follows immediately that $z(t)v(t) = (-\lambda/2)\psi_{x_2}(t)$. Since $\psi_{x_2}$ is the second component of a solution of (10.47), we obtain
$$\frac{d}{dt}(zv) = -\frac{\lambda}{2}\frac{d\psi_{x_2}}{dt} = -\frac{\lambda}{2}\Big(\frac{u}{r}\psi_{x_2} - \frac{2v}{r}\psi_{x_1}\Big) \tag{10.58}$$
Again, from (10.55) and (10.57), we have $\psi_{x_1}(t) = (-2/\lambda)z(t)u(t)$. (We may assume $\lambda \ne 0$, since otherwise $a_r(t) = a_\theta(t) = 0$ by (10.55) and the proof is complete.) Applying this to (10.58), we find
$$\frac{d}{dt}(zv) = \frac{u}{r}zv - \frac{2v}{r}zu = -\frac{zuv}{r} \tag{10.59}$$
10.
OPTIMAL CONTROL PROBLEMS
449
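Using the second equation of (10.43), we get
$$\frac{d}{dt}(zv) = v\frac{dz}{dt} + z\frac{dv}{dt} = v\frac{dz}{dt} + z\Big(a_\theta - \frac{uv}{r}\Big) \tag{10.60}$$
Equating these two expressions for $d(zv)/dt$, we obtain, provided that $v \ne 0$,
$$\frac{dz}{dt} = -\frac{za_\theta}{v} = -\frac{z}{v}(zv) = -z^2 \tag{10.61}$$
Proceeding similarly with the first adjoint equation in (10.47), we obtain
$$\frac{d}{dt}(zu) = \frac{v}{r}(zv) + \frac{\lambda}{2}\psi_{x_3} \tag{10.62}$$
Using the first equation in (10.43), we get
$$\frac{d}{dt}(zu) = u\frac{dz}{dt} + z\Big(\frac{v^2}{r} - \frac{1}{r^2} + a_r\Big) \tag{10.63}$$
Combining (10.62) and (10.63), we have
$$u\frac{dz}{dt} = \frac{\lambda}{2}\psi_{x_3} + \frac{z}{r^2} - z^2u \tag{10.64}$$
It follows from (10.61) and (10.64) that
$$\psi_{x_3} = -\frac{2z}{\lambda r^2} \tag{10.65}$$
Finally, the third equation in (10.47) yields
$$\frac{d\psi_{x_3}}{dt} = \frac{4zu}{\lambda r^3} \tag{10.66}$$
But (10.65) implies that
$$\frac{d\psi_{x_3}}{dt} = \frac{4zu}{\lambda r^3} - \frac{2}{\lambda r^2}\frac{dz}{dt} = \frac{4zu}{\lambda r^3} + \frac{2z^2}{\lambda r^2} \tag{10.67}$$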
Equations (10.66) and (10.67) imply that $2z^2/\lambda r^2 = 0$, which implies that $z(t) = 0$ whenever $v(t) \ne 0$. However, $v(t)$ is a continuous function of $t$ and $v(0) = 1$. Therefore, $v(t) \ne 0$ for $0 \le t \le \varepsilon$, where $\varepsilon$ is some positive number. Hence, $z(t) = 0$ for $0 \le t \le \varepsilon$, so that the orbit remains circular in this time interval. Thus, $v(\varepsilon) = 1$. The same argument can be applied at $t = \varepsilon$ to obtain $v(t) = 1$ for $t > \varepsilon$. In fact, for any interval, $0 \le t \le t'$, in which $v(t) = 1$, this argument proves that $v(t) \ne 0$ for $t$ in a small interval beyond $t = t'$. Hence, $z(t) = 0$ and $v(t) = 1$ in this small interval. It follows that $v(t) = 1$ and $z(t) = 0$
for $0 \le t \le t_1$. Thus, the orbit is the initial circle, which is the optimal orbit corresponding to the case $E_1 = -1/2$ in the last equation of (10.44). This proves that the only optimal solution which satisfies (10.57) (i.e., tangential thrust) is the zero thrust program.
Before proving the next theorem, let us consider the Hamiltonian function defined at the end of Sec. 10.5. Since $\mathcal{H} = l_x^T f$, we have, using (10.43) and the expression for $l_x$ given prior to (10.55),
$$\mathcal{H} = -\lambda\psi_{x_1}(x_2^2/x_3 - 1/x_3^2 + u_1) - \lambda\psi_{x_2}(u_2 - x_1x_2/x_3) - \lambda\psi_{x_3}x_1 - u_1^2 - u_2^2$$
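Rewriting this in terms of the variables of the escape problem and letting $a^2 = a_r^2 + a_\theta^2$, we have by (10.55),
$$\mathcal{H} = 2a_r\Big(\frac{v^2}{r} - \frac{1}{r^2}\Big) - \frac{2a_\theta uv}{r} - \lambda\psi_{x_3}u + a^2 \tag{10.68}$$
At time $t = 0$, using the initial conditions (10.41) with $u(0) = 0$ and $v(0) = 1/\sqrt{r(0)}$ for a circular orbit, we find that $\mathcal{H}(0) = a^2(0)$. At time $t = t_1$, by virtue of the end conditions
$$\psi_{x_1}(t_1) = u_1, \quad \psi_{x_2}(t_1) = v_1, \quad \psi_{x_3}(t_1) = 1/r_1^2 \tag{10.69}$$
previously derived from (10.44) (see above), we have $\mathcal{H}(t_1) = a^2(t_1)$. Since $\mathcal{H}(t_1) = \mathcal{H}(0)$ on the optimal solution, we obtain
$$a^2(t_1) = a^2(0) \tag{10.70}$$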
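The two endpoint evaluations of the Hamiltonian can be checked numerically. The following is a minimal sketch, assuming the Hamiltonian takes the form obtained by substituting $-\lambda\psi_{x_1} = 2a_r$ and $-\lambda\psi_{x_2} = 2a_\theta$ from (10.55) into $\mathcal{H} = l_x^T f$; the function name `hamiltonian` and all sample values are illustrative and not from the text.

```python
def hamiltonian(u, v, r, a_r, a_theta, lam_psi3):
    """Escape-problem Hamiltonian; lam_psi3 stands for the product lambda * psi_x3.

    Assumed form: 2 a_r (v^2/r - 1/r^2) - 2 a_theta u v / r - lambda psi_x3 u + a^2.
    """
    a_sq = a_r ** 2 + a_theta ** 2
    return (2.0 * a_r * (v ** 2 / r - 1.0 / r ** 2)
            - 2.0 * a_theta * u * v / r
            - lam_psi3 * u
            + a_sq)

# t = 0: circular orbit, u(0) = 0 and v(0) = 1/sqrt(r(0)); thrust values are arbitrary.
r0 = 1.0
u0, v0 = 0.0, 1.0 / r0 ** 0.5
a_r0, a_th0 = 0.3, -0.4
H0 = hamiltonian(u0, v0, r0, a_r0, a_th0, lam_psi3=0.7)

# t = t1: end conditions (10.69) give psi_x = (u1, v1, 1/r1^2), and (10.56) gives
# a_r = -(lambda/2) u1, a_theta = -(lambda/2) v1. Sample numbers are arbitrary.
lam, u1, v1, r1 = -1.3, 0.4, 0.9, 2.5
a_r1, a_th1 = -0.5 * lam * u1, -0.5 * lam * v1
H1 = hamiltonian(u1, v1, r1, a_r1, a_th1, lam_psi3=lam / r1 ** 2)

print(H0 - (a_r0 ** 2 + a_th0 ** 2))  # difference vanishes up to rounding
print(H1 - (a_r1 ** 2 + a_th1 ** 2))  # difference vanishes up to rounding
```

In both cases every term other than $a^2$ cancels, which is why constancy of the Hamiltonian forces the initial and final thrust magnitudes to agree.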
that is, the initial and final thrust magnitudes must be equal, a result given also in Blum and Irving.30 It is this result which first led to the conjecture that the optimal thrust program is of constant magnitude at all points of the trajectory. In Theorem 4, we prove that this conjecture is false.
Theorem 4. The optimal thrust program for the escape problem is not of constant magnitude at all points of the trajectory.
PROOF. We observe that if $u(t) = 0$ for $0 \le t \le t_1$, then the orbit is simply the initial circle and therefore is not an escape trajectory. Hence, we may suppose $u(t) \ne 0$ for some $t$. Let us set $\Delta a^2 = a^2(t) - a^2(0)$.
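Since $\mathcal{H}(t) = \mathcal{H}(0) = a^2(0)$ for all $t$, we may use (10.68) to solve for $\lambda\psi_{x_3}/2$. For $u \ne 0$, this yields
$$\frac{\lambda}{2}\psi_{x_3} = \frac{a_r}{u}\Big(\frac{v^2}{r} - \frac{1}{r^2}\Big) - \frac{v}{r}a_\theta + \frac{\Delta a^2}{2u} \tag{10.71}$$
From (10.55) and the first adjoint equation in (10.47), we get
$$\frac{d}{dt}a_r = \frac{v}{r}a_\theta + \frac{\lambda}{2}\psi_{x_3} \tag{10.72}$$
Combining (10.71) and (10.72), we have
$$\frac{d}{dt}a_r = \frac{a_r}{u}\Big(\frac{v^2}{r} - \frac{1}{r^2}\Big) + \frac{\Delta a^2}{2u} \tag{10.73}$$
Similarly, from (10.55) and the second equation of (10.47), we get
$$\frac{d}{dt}a_\theta = (ua_\theta - 2va_r)/r \tag{10.74}$$
Differentiation of $a^2$ and substitution of formulas (10.73) and (10.74) for the derivatives yields
$$\frac{d}{dt}a^2 = 2a_r\frac{d}{dt}a_r + 2a_\theta\frac{d}{dt}a_\theta = \frac{2a_r^2}{ru}\Big(v^2 - \frac{1}{r}\Big) + \frac{a_r\,\Delta a^2}{u} + \frac{2u}{r}a_\theta^2 - \frac{4v}{r}a_ra_\theta$$
Thus,
$$\frac{d}{dt}a^2 = \frac{2}{ru}\big((ua_\theta - va_r)^2 - a_r^2/r\big) + a_r\,\Delta a^2/u \tag{10.75}$$
Now, suppose $\Delta a^2(t) = 0$ for all $t$, i.e., suppose $a^2(t) = A^2$, where $A$ is a constant. Then $da^2/dt = 0$, and it follows from (10.75) that
$$(ua_\theta - va_r)^2 - a_r^2/r = 0$$
Hence,
$$ua_\theta = (v \pm 1/\sqrt{r})\,a_r$$
Together with $a_r^2 + a_\theta^2 = A^2$, this implies $a_r = Au/w$ and $a_\theta = A(v \pm 1/\sqrt{r})/w$, where $w^2 = u^2 + (v \pm 1/\sqrt{r})^2$. To determine the sign, we recall that $u(0) = 0$ and $v(0) = 1/\sqrt{r(0)}$ in the initial circular orbit. If the minus sign is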
chosen, then $a_r(0) = a_\theta(0) = 0$, which means that $A = 0$ and the orbit remains circular. Hence, we choose the plus sign, thereby obtaining
$$a_r = Au/w, \quad a_\theta = A(v + 1/\sqrt{r})/w, \quad w^2 = u^2 + (v + 1/\sqrt{r})^2 \tag{10.76}$$
From (10.73), with $\Delta a^2 = 0$, we have
$$u\frac{d}{dt}a_r = a_r\Big(\frac{v^2}{r} - \frac{1}{r^2}\Big) = a_r\Big(\frac{du}{dt} - a_r\Big)$$
Hence, for $a_r \ne 0$, we have $d(u/a_r)/dt = 1$, whence,
$$a_r = u/(t + c), \quad a_\theta = (v + 1/\sqrt{r})/(t + c) \tag{10.77}$$
where $c$ is a constant. Differentiating (10.77) and using the second equation in (10.41), we obtain after some simplification
$$\frac{d}{dt}a_\theta = \frac{-u}{r(t + c)}\Big(v + \frac{1}{2\sqrt{r}}\Big)$$
Another expression for $da_\theta/dt$ is obtained by substituting the expressions for $a_r$ and $a_\theta$ given by (10.77) into formula (10.74). This yields
$$\frac{d}{dt}a_\theta = \frac{-u}{r(t + c)}\Big(v - \frac{1}{\sqrt{r}}\Big)$$
Comparing the two preceding formulas for $da_\theta/dt$, we conclude that
$$\frac{3u}{2r^{3/2}(t + c)} = 0$$
This implies that $u(t) = 0$, which contradicts the assumption that $u(t) \ne 0$ and $a^2(t) = A^2$. Hence, either $a^2(t)$ is not constant or $u(t) = 0$ for all $t$. In the latter case, the thrust magnitude must be zero and cannot produce escape.
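The comparison that closes this proof can be checked with a few lines of arithmetic. The sketch below assumes the thrust components of (10.77), the polar equation $dv/dt = -uv/r + a_\theta$, and formula (10.74); the function names and the sample state are illustrative only.

```python
import math

def da_theta_by_differentiating(u, v, r, t, c):
    """d(a_theta)/dt from differentiating a_theta = (v + r**-0.5)/(t + c),
    using dr/dt = u and dv/dt = -u*v/r + a_theta (assumed equations of motion)."""
    return -u / (r * (t + c)) * (v + 1.0 / (2.0 * math.sqrt(r)))

def da_theta_from_adjoint(u, v, r, t, c):
    """d(a_theta)/dt from (10.74) with a_r and a_theta taken from (10.77)."""
    a_r = u / (t + c)
    a_theta = (v + 1.0 / math.sqrt(r)) / (t + c)
    return (u * a_theta - 2.0 * v * a_r) / r

# Arbitrary sample state with u != 0.
u, v, r, t, c = 0.3, 0.8, 1.7, 2.0, 0.5
gap = da_theta_from_adjoint(u, v, r, t, c) - da_theta_by_differentiating(u, v, r, t, c)
expected = 3.0 * u / (2.0 * r ** 1.5 * (t + c))
print(gap, expected)  # the two numbers agree up to rounding
```

The gap between the two expressions is proportional to $u$, so they can agree only when $u = 0$, which is the contradiction used in the proof.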
Theorem 5. The optimal thrust program for the escape problem is not circumferential at all points of the trajectory.
PROOF. Suppose $a_r(t) = 0$ for $0 \le t \le t_1$. Together with (10.55) this implies that $\psi_{x_1}(t) = 0$ for $0 \le t \le t_1$. The Hamiltonian in (10.68) now becomes
$$\mathcal{H} = -2a_\theta uv/r - \lambda\psi_{x_3}u + a_\theta^2 \tag{10.78}$$
Since $\mathcal{H}$ is constant along an optimal solution and $\mathcal{H}(0) = a_\theta^2(0)$, we may set the right side of (10.78) equal to $a_\theta^2(0)$ and solve for $\lambda\psi_{x_3}u$. This yields
$$\lambda\psi_{x_3}u = -2a_\theta uv/r + a_\theta^2(t) - a_\theta^2(0) \tag{10.79}$$
Since $\psi_{x_1}(t) = 0$ for all $t$, the derivative of $\psi_{x_1}$ is also zero. Hence, the first adjoint equation in (10.47) becomes $0 = (v/r)\psi_{x_2} - \psi_{x_3}$. This yields, using (10.55),
$$\lambda\psi_{x_3}u = -2a_\theta uv/r \tag{10.80}$$
Combining (10.79) and (10.80), we conclude that $a_\theta^2(t) = a_\theta^2(0)$ for all $t$, that is, $a_\theta$ is constant. By (10.55), this implies that $\psi_{x_2}$ is also constant and therefore $d\psi_{x_2}/dt = 0$ for all $t$. By the second adjoint equation in (10.47), this implies further that $u\psi_{x_2}/r = 0$ for all $t$. If $u(t') \ne 0$ for some $t'$ in the interval $[0, t_1]$, then $\psi_{x_2}(t') = 0$. Since $\psi_{x_2}$ is constant, it follows that $\psi_{x_2} = 0$ for all $t$. This implies $a_\theta = 0$ for all $t$ and the orbit remains the original circle. On the other hand, if $u = 0$ for all $t$, then again we have the case of zero thrust. Therefore, the only optimal circumferential program is the zero thrust program.
Remark. Although the constant magnitude thrust program given in (10.76) is not optimal unless $A = 0$, it nevertheless has interesting properties which suggest that it may be a first approximation to the optimal solution. In fact, numerical computations show that (10.76) produces escape trajectories having a value of $J(t_1)$ which is approximately six percent larger than the optimum value of $J(t_1)$. It seems possible to investigate this further analytically. However, we shall not pursue this line here, but rather we turn our attention to the numerical computation of the optimal solution. In the next section we shall describe a rather general procedure for numerical solution which has proven to be suitable for modern digital computers.

10.8 A Convergent Gradient Procedure
The examples discussed in the previous section illustrate the difficulties in obtaining analytic solutions, or even analytic approximations to solutions, of optimal control problems. In all but the simplest problems, to obtain good approximations to a solution one must have recourse to numerical methods. In this section we shall present one such method based on gradients. Gradient-type approximation procedures are not new, but they have received considerable attention recently in a variety of contexts, as in Balakrishnan,15 Goldstein,16 Hart and Motzkin,31 and Kelley,32 to mention a few recent works. These procedures generate a sequence of approximations to the solution of a minimization problem by using gradients in various ways, but usually
involving the direction of steepest descent. The method to be described now is not of the steepest descent type. Furthermore, we shall prescribe sufficient conditions to ensure convergence of the approximating sequence. A convergence proof was first given in Blum.33 The treatment given here is a slight modification of that work. We refer to Sec. 10.3 for the basic concepts and definitions to be employed in the abstract formulation of our gradient procedure.
Our procedure is based on the multiplier rule as given in the corollary to Theorem 2 and as summarized by Eq. (10.13). It is convenient to restate (10.13) in a somewhat different form. As before, let $J, g_1, \dots, g_p$ be real functionals on a pre-Hilbert space $H$. Suppose that the weak gradients $\nabla J(u)$, $\nabla g_i(u)$, $1 \le i \le p$, exist at $u \in H$. Suppose further that the $p \times p$ Gram matrix
$$D(u) = (\langle\nabla g_i(u), \nabla g_j(u)\rangle)$$
is nonsingular. Let $\mathbf{p}$ be the $p$-dimensional row vector
$$(\langle\nabla J(u), \nabla g_1(u)\rangle, \dots, \langle\nabla J(u), \nabla g_p(u)\rangle)$$
and define $\lambda = (\lambda_1, \dots, \lambda_p)$ as the vector $\lambda = \mathbf{p}D^{-1}$. As is well known, the "projection," $\nabla J_G$, of $\nabla J$ on the subspace $G \subset H$ spanned by $\{\nabla g_1(u), \dots, \nabla g_p(u)\}$ is given by
$$\nabla J_G(u) = \sum_{j=1}^p \lambda_j\,\nabla g_j(u)$$
The component of $\nabla J$ orthogonal to $G$ is $\nabla J_T(u) = \nabla J(u) - \nabla J_G(u)$. The multiplier rule states that if $u^*$ is a relative minimum of $J$ on the constraint $C = \bigcap_{i=1}^p C(g_i)$ (and the conditions of the corollary are satisfied), then
$$\nabla J_T(u^*) = 0 \tag{10.81}$$
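In a finite-dimensional space with the ordinary dot product, the projection just described reduces to solving a small linear system with the Gram matrix. The following is a minimal sketch under that assumption; the helper names `dot`, `solve`, and `project`, and the sample vectors, are illustrative and not part of the text.

```python
def dot(x, y):
    """Euclidean inner product, standing in for the inner product on H."""
    return sum(a * b for a, b in zip(x, y))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def project(grad_J, grad_g):
    """Return (lam, J_G, J_T) for a list grad_g of constraint gradients."""
    D = [[dot(gi, gj) for gj in grad_g] for gi in grad_g]   # Gram matrix
    rho = [dot(grad_J, gj) for gj in grad_g]                # row vector of inner products
    lam = solve(D, rho)            # D is symmetric, so lam D = rho is D lam = rho
    J_G = [sum(l * g[k] for l, g in zip(lam, grad_g)) for k in range(len(grad_J))]
    J_T = [a - b for a, b in zip(grad_J, J_G)]
    return lam, J_G, J_T

# Two constraint gradients in R^3 and a sample gradient of J.
grad_g = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
grad_J = [1.0, 2.0, 0.0]
lam, J_G, J_T = project(grad_J, grad_g)
print(lam, J_G, J_T)
print([dot(J_T, g) for g in grad_g])  # J_T is orthogonal to every constraint gradient
```

The orthogonality printed on the last line is the defining property of the decomposition; a point is stationary in the sense of (10.81) exactly when `J_T` vanishes.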
Definition 8. Let $J, g_1, \dots, g_p$ be real functionals on $H$ with weak gradients defined at $u$ and the Gram matrix $D(u)$ nonsingular. If $\nabla J_T(u) = 0$ and $g_i(u) = 0$, $1 \le i \le p$, then $u$ is a stationary point of $J$ on the constraint $\bigcap_{i=1}^p C(g_i)$.
Remark on Notation. In what follows, for any $u \in H$, we shall write $\bar{u}$ to denote the unit vector $(1/\|u\|)u$. For any real functional, $f$, we shall write $f(x) \gg 0$ to indicate that $f(x) > a^2 > 0$ for some positive constant $a$.
Our gradient procedure generates a sequence $\{u_n\}$ which converges to a stationary point $u^*$ provided that certain "regularity" conditions are satisfied in a neighborhood of $u^*$. These conditions are given in the next definition.
Definition 9. Let $u^*$ be a stationary point of $J$ on the constraint $\bigcap_{i=1}^p C(g_i)$. $u^*$ is a regular stationary point if there exists a neighborhood, $N = N(u^*)$, of $u^*$ such that the following conditions are satisfied for $u \in N$:
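(1) The strong gradient $\nabla_s J(u)$ exists as a continuous function of $u$ and $\nabla J_G(u) \ne 0$;
(2) The strong gradients $\nabla_s g_i(u)$, $1 \le i \le p$, exist as continuous functions of $u$ and $\nabla g_i(u) \ne 0$;
(3) $D(u)$ is nonsingular;
(4) Let $\theta(u) = \arcsin(\|\nabla J_T(u)\|/\|\nabla J(u)\|)$. For those $u$ in $N$ such that $\nabla J_T(u) \ne 0$ the strong gradient $\nabla_s\theta(u)$ exists and $\|\nabla\theta(u)\| \gg 0$. At $u^*$, the weak differential $\delta\theta(u^*; h)$ exists and for $u = u^* + \Delta u$, $\Delta u \ne 0$, $\langle\nabla\theta(u), \overline{\Delta u}\rangle \to \delta\theta(u^*; \overline{\Delta u})$ as $\Delta u \to 0$;
(5) For $u = u^* + \Delta u$, $\Delta u \ne 0$, let
$$\alpha_i = \alpha_i(u) = \arccos\langle\overline{\nabla g_i(u)}, \overline{\Delta u}\rangle, \quad 1 \le i \le p,$$
$$\alpha_0 = \alpha_0(u) = \arccos\langle\overline{\nabla J_T(u)}, \overline{\Delta u}\rangle,$$
$$\beta = \beta(u) = \arccos\langle\overline{\nabla\theta(u)}, \overline{\Delta u}\rangle,$$
$$\gamma = \gamma(u) = \arccos\langle\overline{\nabla\theta(u)}, \overline{\nabla J_T(u)}\rangle.$$
There exist positive constants $a_1$ and $a_2$ such that (i) if $\nabla J_T(u) \ne 0$, then $|\cos\gamma| > a_2$ and $\sum_{i=1}^p \cos^2\alpha_i + \cos\alpha_0\cos\beta/\cos\gamma > a_1$; (ii) if $\nabla J_T(u) = 0$, then $\sum_{i=1}^p \cos^2\alpha_i > a_1$.
A neighborhood such as $N$ is called a regular neighborhood of $u^*$.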
Remark. In view of our earlier discussion in Sec. 10.3 (see Definition 5 and what follows), we may write $\nabla J$ instead of $\nabla_s J$, since the existence of the strong gradient implies the existence of the weak gradient, and the two are equal. Hence, having made it clear that we require the strong gradient, it is simpler to omit the subscripts.
Theorem 6. Let $u^*$ be a regular stationary point of $J$ on the constraint $\bigcap_{i=1}^p C(g_i)$ and let $N$ be a regular neighborhood of $u^*$ which contains no other stationary point. For $u = u^* + \Delta u \in N$, $\Delta u \ne 0$, let
$$h_G = h_G(u) = -\sum_{i=1}^p (g_i(u)/\|\nabla g_i(u)\|)\,\overline{\nabla g_i(u)} \tag{10.82}$$
$$h_T = h_T(u) = -\frac{1}{\|\nabla J_G(u)\|\,\langle\nabla\theta(u), \overline{\nabla J_T(u)}\rangle}\,\nabla J_T(u) \ \text{ if } \nabla J_T(u) \ne 0, \qquad h_T = 0 \ \text{ if } \nabla J_T(u) = 0 \tag{10.83}$$
$$h = h(u) = h_G + h_T \tag{10.84}$$
Let $d = 2a_1a_2^2/(a_2^2p^2 + 2p + 3)$, where $a_1$ and $a_2$ are the constants given in Definition 9 (condition (5)). Let $\{s_n\}$ be a sequence of real numbers with
$d/2 < s_n < d$, $n = 0, 1, 2, \dots$. There exist positive constants, $r$ and $k$, with $k < 1$, such that if $\|u_0 - u^*\| < r$, then the sequence $\{u_n\}$ defined by
$$u_{n+1} = u_n + s_nh(u_n), \quad n = 0, 1, 2, \dots \tag{10.85}$$
converges to $u^*$ and $\|u_n - u^*\| < k^n\|u_0 - u^*\|$.
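To make the iteration concrete, here is a minimal sketch in the plane with a single constraint. Everything specific in it is an assumption chosen for illustration: the functional $J(u) = u_0^2 + 2u_1^2$, the constraint $g(u) = u_0 + u_1 - 1$, the fixed step $s = 0.5$ (picked by trial, not computed from the constant $d$ of the theorem), the forward-difference estimate of $\langle\nabla\theta, \overline{\nabla J_T}\rangle$, and the cutoff `tol` below which $h_T$ is set to zero.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def axpy(a, x, y):
    """Return a*x + y for lists x and y."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def J_grad(u):                      # gradient of J(u) = u0^2 + 2*u1^2
    return [2.0 * u[0], 4.0 * u[1]]

def g(u):                           # single constraint g(u) = u0 + u1 - 1
    return u[0] + u[1] - 1.0

def g_grad(u):
    return [1.0, 1.0]

def split(u):
    """Decompose grad J into its parts along and orthogonal to grad g."""
    gJ, gg = J_grad(u), g_grad(u)
    coef = dot(gJ, gg) / dot(gg, gg)
    J_G = [coef * x for x in gg]
    J_T = [a - b for a, b in zip(gJ, J_G)]
    return gJ, J_G, J_T

def theta(u):
    gJ, _, J_T = split(u)
    return math.asin(min(1.0, norm(J_T) / norm(gJ)))

def h(u, ds=1e-6, tol=1e-12):
    """Correction h = h_G + h_T in the spirit of (10.82)-(10.84)."""
    gg = g_grad(u)
    _, J_G, J_T = split(u)
    h_G = [-g(u) * x / dot(gg, gg) for x in gg]   # -(g/|grad g|) * unit(grad g)
    n_T = norm(J_T)
    if n_T < tol:                                 # treat grad J_T as zero
        return h_G
    d = [x / n_T for x in J_T]                    # unit vector along grad J_T
    dtheta = (theta(axpy(ds, d, u)) - theta(u)) / ds   # finite-difference estimate
    scale = -1.0 / (norm(J_G) * dtheta)
    h_T = [scale * x for x in J_T]
    return [a + b for a, b in zip(h_G, h_T)]

u = [1.5, -0.2]                     # initial guess
s = 0.5                             # fixed step, chosen by trial
for _ in range(60):
    u = axpy(s, h(u), u)
print(u)                            # approaches the constrained minimizer (2/3, 1/3)
```

For this quadratic example the constrained minimizer can be found by hand from the multiplier rule, which gives $u^* = (2/3, 1/3)$; the sketch is only meant to show how $h_G$ drives the constraint to zero while $h_T$ drives $\theta$ to zero.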
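PROOF. First we shall prove that there exist $r$ and $k$, with $k < 1$, such that if $u = u^* + \Delta u \in N$ and $0 < \|\Delta u\| < r$, then $\|u + sh - u^*\| < k\|\Delta u\|$ whenever $d/2 < s < d$. For convenience we shall adopt the notational convention that throughout the proof any quantity designated by $\varepsilon$ with appropriate subscripts or superscripts is such that $\varepsilon/\|\Delta u\| \to 0$ as $\Delta u \to 0$. We shall make no further mention of this property. By the definition of strong gradient (see the remark following Definition 5 of Sec. 10.3), we have for $1 \le i \le p$,
$$g_i(u) = g_i(u) - g_i(u^*) = \langle\nabla g_i(u^*), \Delta u\rangle + \varepsilon_i$$
By the continuity of $\nabla g_i$ at $u^*$ (condition (2) of Definition 9), $g_i(u) = \langle\nabla g_i(u), \Delta u\rangle + \varepsilon_i'$. Using these relations in (10.82), we obtain
$$h_G = -\|\Delta u\|\sum_{i=1}^p \cos\alpha_i\,\overline{\nabla g_i(u)} + \varepsilon_G \tag{10.86}$$
Since $u + sh - u^* = \Delta u + sh_G + sh_T$, we have
$$\|u + sh - u^*\|^2 = \|\Delta u\|^2 + 2s\langle\Delta u, h_G\rangle + 2s\langle\Delta u, h_T\rangle + s^2(\|h_G\|^2 + \|h_T\|^2) \tag{10.87}$$
From (10.86) it follows that
$$\langle h_G, \Delta u\rangle = -\|\Delta u\|^2\sum_{i=1}^p \cos^2\alpha_i + \langle\varepsilon_G, \Delta u\rangle \tag{10.88}$$
where the $\alpha_i$ are defined in condition (5) of Definition 9. Applying the Schwarz inequality to (10.86) yields
$$\|h_G\|^2 \le \|\Delta u\|^2\Big(p\sum_{i=1}^p \cos^2\alpha_i + \omega_G(\Delta u)\Big) \tag{10.89}$$
where $\omega_G(\Delta u) \to 0$ as $\Delta u \to 0$. Using (10.83), we find that if $h_T \ne 0$,
$$\langle h_T, \Delta u\rangle = -\frac{\tan\theta(u)\,\|\Delta u\|\cos\alpha_0}{\|\nabla\theta(u)\|\cos\gamma}$$
where $\alpha_0$ and $\gamma$ are given in condition (5) of Definition 9. By condition (4) of Definition 9, using the mean-value theorem, $\tan\theta(u) = \tan\theta(u^*) + \sec^2\theta(u')\langle\nabla\theta(u'), \Delta u\rangle$, where $u' = u^* + s\,\Delta u$, $0 < s < 1$. Again, by condition (4),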
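$\langle\nabla\theta(u'), \overline{\Delta u}\rangle = \langle\nabla\theta(u), \overline{\Delta u}\rangle + \omega'$, where $\omega' \to 0$ as $\Delta u \to 0$. Since $\theta(u)$ is continuous at $u^*$ and $\theta(u^*) = 0$, it follows that $\tan\theta(u) = \langle\nabla\theta(u), \Delta u\rangle + \varepsilon$. Since $\langle\nabla\theta(u), \Delta u\rangle = \|\nabla\theta(u)\|\,\|\Delta u\|\cos\beta$, we obtain
$$\langle h_T, \Delta u\rangle = -\|\Delta u\|^2(\cos\alpha_0\cos\beta/\cos\gamma + \omega_0(\Delta u)) \tag{10.90}$$
where $\omega_0(\Delta u) \to 0$ as $\Delta u \to 0$. Similarly, from (10.83) it follows that
$$\|h_T\|^2 = \|\Delta u\|^2(\cos^2\beta/\cos^2\gamma + \omega_T) \tag{10.91}$$
where $\omega_T \to 0$ as $\Delta u \to 0$. Combining (10.87)-(10.91), we see that
$$\|u + sh - u^*\|^2 \le \|\Delta u\|^2(\Phi_1(s) + 2s\omega_1(\Delta u) + s^2\omega_2(\Delta u)) \tag{10.92}$$
where $\omega_1 \to 0$ and $\omega_2 \to 0$ as $\Delta u \to 0$ and $\Phi_1(s) = 1 - 2bs + c_1s^2$, with
$$b = \sum_{i=1}^p \cos^2\alpha_i + \cos\alpha_0\cos\beta/\cos\gamma \ \text{ if } \nabla J_T(u) \ne 0, \qquad b = \sum_{i=1}^p \cos^2\alpha_i \ \text{ if } \nabla J_T(u) = 0,$$
$$c_1 = p\sum_{i=1}^p \cos^2\alpha_i + \cos^2\beta/\cos^2\gamma \ \text{ if } \nabla J_T(u) \ne 0, \qquad c_1 = p\sum_{i=1}^p \cos^2\alpha_i \ \text{ if } \nabla J_T(u) = 0.$$
If we replace $\Phi_1(s)$ by $\Phi(s) = 1 - 2bs + cs^2$, where $c = c_1 + 2\sum_{i=1}^p \cos^2\alpha_i + 1 + 1/|\cos\gamma|$, the inequality in (10.92) remains valid. Now, $0 < \Phi(s) < 1$ for $0 < s < d$, where $d = 2a_1a_2^2/(a_2^2p^2 + 2p + 3)$. For $d/2 < s < d$ and $0 < \|\Delta u\| < r$ with $r$ sufficiently small, there exists a constant $k^2 < 1$ such that
$$0 < \Phi(s) + 2s\omega_1(\Delta u) + s^2\omega_2(\Delta u) < k^2 < 1$$
Hence, $\|u + sh - u^*\| < k\|\Delta u\|$ whenever $d/2 < s < d$, as we wished to prove. The remainder of the theorem now follows directly, for if $\|u_0 - u^*\| < r$, then $\|u_1 - u^*\| < k\|u_0 - u^*\|$ and, by induction, $\|u_n - u^*\| < k^n\|u_0 - u^*\|$. This, of course, implies $u_n \to u^*$.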
10.9 Computations Using the Convergent Gradient Procedure
We shall illustrate the use of the convergent gradient procedure by applying it to the brachistochrone problem discussed in Sec. 10.2. The equations which
define this problem are (10.6'), (10.7'), and (10.8'). In order to apply our gradient procedure, we must calculate gradients according to formulas (10.19) and (10.20). Thus, we must derive the adjoint equations of (10.6') and the matrix $(\partial f/\partial u)$. According to (10.16), the adjoint equations in the brachistochrone problem are the following: (10.93) The matrix $(\partial f/\partial u)$ is the $2 \times 1$ matrix
$$\frac{\partial f}{\partial u} = \begin{pmatrix} 1 \\ u/(2gx_1(1 + u^2))^{1/2} \end{pmatrix}$$
(Here we have written $u = u_1$, since the control function is one-dimensional.) To obtain the function $J_x(t) = (J_{x_1}(t), J_{x_2}(t))$ we must solve (10.93) with the final values $J_x(t_1) = (\partial J/\partial x_1(t_1), \partial J/\partial x_2(t_1))$. Hence, from (10.8') we see that $J_x(t_1) = (0, 1)$ and the required solution is given by (10.94).
With regard to the constraints in (10.7'), we may simplify the problem by considering the initial values as being held fixed. Thus, $x_1(t_0) = 1$ and $x_2(t_0) = 0$ are taken as fixed initial conditions for the solution of Eq. (10.6'). Thus, the "control space" in this problem is the space of functions $u_1(t)$. We must choose $u_1(t)$ to obtain a solution of (10.6') having the prescribed initial values $(1, 0)$ and satisfying the single constraint
$$\psi = g_1 = x_1(t_1) - 2 = 0 \tag{10.95}$$
According to the theory of Sec. 10.5, we must determine a function, $\psi_x(t) = (\psi_{x_1}(t), \psi_{x_2}(t))$, which is a solution of (10.93) and which has final values $\psi_x(t_1) = (\partial\psi/\partial x_1(t_1), \partial\psi/\partial x_2(t_1))$. From (10.95) we have $\psi_x(t_1) = (1, 0)$. The solution of (10.93) having these final values can be seen by inspection to be $\psi_{x_1}(t) = 1$, $\psi_{x_2}(t) = 0$. Hence, the gradient $\nabla\psi = \psi_x^T(\partial f/\partial u)$, as given by (10.20), becomes in this case simply
$$\nabla\psi = 1 \tag{10.96}$$
The gradient $\nabla J = J_x^T(\partial f/\partial u)$ is given by
$$\nabla J = J_{x_1}(t) + \frac{u}{(2gx_1(1 + u^2))^{1/2}} \tag{10.97}$$
In order to obtain $\nabla J(t)$, we choose a control function $u = u(t)$ and integrate
the control Eqs. (10.6') with the initial values $(1, 0)$. The solution $x_1 = x_1(t)$ is then used together with $u$ in (10.93) to determine the coefficients. Equation (10.93) is solved for $J_x(t)$ using the final values $(0, 1)$, as explained previously. Finally, (10.97) yields $\nabla J$ as a function of $t$ obtained from $u$, $x_1$, and $J_{x_1}$. To apply the convergent gradient procedure, we must calculate $\nabla J_G$ and $\nabla J_T$. Since there is only one constraint in this example, it is readily seen that $\nabla J_G = \langle\nabla J, \overline{\nabla\psi}\rangle\,\overline{\nabla\psi}$. (As before, the bar denotes the unit vector.) By the definition of the inner product given in Sec. 10.5, we have from (10.96) and (10.97)
$$\nabla J_G = \frac{1}{t_1}\int_0^{t_1} \nabla J(t)\,dt$$
Applying formula (10.82) with $p = 1$ and $g_1 = \psi$, we have
$$h_G = -(x_1(t_1) - 2)/t_1$$
To compute $h_T$ from (10.83), we require $\nabla J_T$ and $\langle\nabla\theta, \overline{\nabla J_T}\rangle$. We have immediately that $\nabla J_T = \nabla J - \nabla J_G$. However, to obtain $\langle\nabla\theta, \overline{\nabla J_T}\rangle$ we use a numerical approximation, making use of the fact that $\langle\nabla\theta, \overline{\nabla J_T}\rangle = \delta\theta(u; \overline{\nabla J_T}) = d\theta/ds|_{s=0}$ in the direction $\overline{\nabla J_T}$. Thus we compute $\theta = \arcsin(\|\nabla J_T\|/\|\nabla J\|)$ for $u$ and again for $u + \Delta s\,\overline{\nabla J_T}$, choosing the value of the scalar $\Delta s$ to be "sufficiently small." We then approximate $d\theta/ds$ by $\Delta\theta/\Delta s$ and use this in (10.83) to obtain $h_T$ and then obtain $h$ from (10.84). The iteration procedure defined by (10.85) is carried out by choosing some initial guess $u_0 = u_0(t)$ for the control function and computing all the above quantities for this $u_0$. To compute the next approximation, $u_1$, from (10.85) requires an estimate of $s_0$. In the absence of information about the constant $d$ of Theorem 6, we are forced to guess $s_0$ by trial and error and then improve the guess by various computational techniques of doubling and halving. These techniques have proved effective in solving the brachistochrone problem and the escape and rendezvous problems.
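On a computer the control and its gradients are stored as samples on a time grid, and the inner product becomes a quadrature. The following is a minimal sketch of the decomposition for the single constraint with $\nabla\psi = 1$, assuming a uniform grid on $[0, t_1]$, the trapezoid rule for the integral inner product, and a made-up sample for $\nabla J(t)$; none of the names or numbers come from the text.

```python
import math

def inner(f, q, dt):
    """Trapezoid-rule approximation to the integral of f(t)*q(t) on a uniform grid."""
    prods = [a * b for a, b in zip(f, q)]
    return dt * (sum(prods) - 0.5 * (prods[0] + prods[-1]))

def decompose(grad_J, dt):
    """Split sampled grad J into its part along grad psi = 1 and the remainder."""
    ones = [1.0] * len(grad_J)
    coef = inner(grad_J, ones, dt) / inner(ones, ones, dt)   # mean value of grad J
    J_G = [coef] * len(grad_J)
    J_T = [a - coef for a in grad_J]
    ratio = math.sqrt(inner(J_T, J_T, dt) / inner(grad_J, grad_J, dt))
    return J_G, J_T, math.asin(min(1.0, ratio))

def h_G(x1_final, target, t1):
    """Constant correction -(x1(t1) - target)/t1 for the endpoint constraint."""
    return -(x1_final - target) / t1

n, t1 = 101, 2.0
dt = t1 / (n - 1)
grid = [i * dt for i in range(n)]
grad_J = [1.0 + math.cos(math.pi * t / t1) for t in grid]   # made-up sample gradient
J_G, J_T, theta = decompose(grad_J, dt)
print(J_G[0], theta, h_G(2.5, 2.0, t1))
```

Repeating `decompose` for the perturbed control $u + \Delta s\,\overline{\nabla J_T}$ and differencing the two values of `theta` gives the estimate $\Delta\theta/\Delta s$ described above; in a real computation `grad_J` would come from integrating the state and adjoint equations rather than from a formula.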
REFERENCES
1. Galileo, "Dialog über die beiden hauptsächlichsten Weltsysteme," pp. 471-472 (Transl.: Strauss), 1630.
2. Galileo, "Dialogues Concerning Two New Sciences," p. 239 (Transl.: Crew and De Salvio), 1638.
3. G. A. Bliss, "Lectures on the Calculus of Variations." Univ. of Chicago Press, Chicago, Illinois, 1946.
4. J. Bernoulli, "Acta Eruditorum," p. 269. Leipzig, 1696.
5. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, "The Mathematical Theory of Optimal Processes" (K. N. Trirogoff, transl.; L. W. Neustadt, ed.). Wiley (Interscience), New York, 1962.
6. F. A. Valentine, "The Problem of Lagrange with Differential Inequalities as Added Side Conditions, Contributions to the Calculus of Variations 1933-1937." Univ. of Chicago Press, Chicago, Illinois, 1937.
7. L. Berkovitz, Variational Methods of Control and Programming, J. Math. Anal. Appl. 3, 145-169 (1961).
8. O. Bolza, An Application of the Notions of "General Analysis" to a Problem of the Calculus of Variations, Bull. Am. Math. Soc. 16, 402-407 (1910).
9. M. Fréchet, La notion de différentielle dans l'analyse générale, Ann. Sci. École Norm. Sup. 42, 293-323 (1925).
10. R. Gâteaux, Sur les fonctionelles continues et les fonctionelles analytiques, Bull. Soc. Math. France 50, 1-21 (1922).
11. H. H. Goldstine, A Multiplier Rule in Abstract Spaces, Bull. Am. Math. Soc. (1938).
12. H. H. Goldstine, The Calculus of Variations in Abstract Spaces, Duke Math. J. (1942).
13. L. A. Liusternik, On Relative Extrema of Functionals (in Russian), Mat. Sbornik 41, 390-401 (1934).
14. M. R. Hestenes, Hilbert Space Methods in Variational Theory and Numerical Analysis, Proc. Congr. Math. Amsterdam 3, 229-236 (1954).
15. A. V. Balakrishnan, An Operator-Theoretic Formulation of a Class of Control Problems and a Steepest Descent Method of Solution, J. SIAM Control Ser. A 1, No. 2 (1963).
16. A. A. Goldstein, Minimizing Functionals on Hilbert Space, Proc. Symp. Computing Methods in Optimization Probl., UCLA. Academic Press, New York, 1964.
17. E. K. Blum, Minimization of Functionals with Equality Constraints, J. SIAM Control Ser. A 3, No. 2, 299-316 (1965). (See also United Aircraft Res. Rept. C-11005-814, 1964.)
18. E. Hille, "Functional Analysis and Semi-Groups" (AMS Colloq. Publ. Vol. 31). Am. Math. Soc., Providence, Rhode Island, 1948.
19. L. Liusternik and V. Sobolev, "Elements of Functional Analysis." Ungar, New York, 1961.
20. N. I. Akhiezer, "The Calculus of Variations." Ginn (Blaisdell), Boston, 1962.
21. J. Dieudonné, "Foundations of Modern Analysis." Academic Press, New York, 1960.
22. H. A. Antosiewicz and W. Rheinboldt, Numerical Analysis and Functional Analysis, in "Survey of Numerical Analysis" (J. Todd, ed.). McGraw-Hill, New York, 1962.
23. E. A. Coddington and N. Levinson, "Theory of Ordinary Differential Equations." McGraw-Hill, New York, 1955.
24. R. A. Struble, "Nonlinear Differential Equations." McGraw-Hill, New York, 1962.
25. P. C. Rosenbloom, The Method of Steepest Descent, AMS Proc. Symp. Appl. Math. 6. McGraw-Hill, New York, 1956.
26. L. M. Graves, A Transformation of the Problem of Lagrange in the Calculus of Variations, Trans. Am. Math. Soc. 35, 675-682 (1933).
27. T. H. Hildebrandt and L. M. Graves, Implicit Functions and Their Differentials in General Analysis, Trans. Am. Math. Soc. 29, 127-153 (1927).
28. F. E. Browder, Variational Methods for Nonlinear Elliptic Eigenvalue Problems, Bull. Am. Math. Soc. 71, No. 1, 176-183 (1965).
29. L. W. Neustadt, Optimal Control Problems as Extremal Problems in a Banach Space, USCEE Rept. 133 (1965).
30. E. K. Blum and J. Irving, Comparative Performance of Low-Thrust and Ballistic Rocket Vehicles for Flight to Mars, in "Vistas in Astronautics," Vol. 11. Pergamon Press, Oxford, 1956.
31. W. Hart and T. S. Motzkin, A Composite Newton-Raphson Gradient Method for the Solution of Systems of Equations, Pacific J. Math. 6, 691-707 (1956).
32. H. J. Kelley, Method of Gradients, in "Optimization Techniques" (G. Leitmann, ed.), pp. 205-252. Academic Press, New York, 1962.
33. E. K. Blum, A Convergent Gradient Procedure in Pre-Hilbert Spaces, Pacific J. Math. 17, No. 1 (1966). (See also United Aircraft Rept. D-110058-18 (1965).)
Author Index Numbers in Darentheses are reference numbers and indicate that an author's work is referred to although his name is not cited in the text. Numbers in italics show the page on which the complete reference is listed. Akhiezer, N . I., 422 (20), 460 Antosiewicz, H. A., 460 Armitage, J. V., 149, 193 Balakrishnan, A. V., 422, 453, 460 Ball, D. J., 90 (26), 101 Bellman, R., 147 (I), 148, 192, 199, 261, 371 Benkovitz, L., 460 Berkovitz, L. D., 106 (8), 146,261 Bernoulli, J., 418, 459 Blacquikre, A., 261, 371 Bliss, G. A., 4, 25, 29, 41, 62, 83, 101, 106, 145,418,419,422 (3), 436,438,459 Blum, E. K., 422, 444, 447, 450, 454, 460, 461 Boltyanskii, V. G., 199 (43), 205 (43), 262, 373 (l), 389,421 (3,26,38, 460 Bolza, O., 4,25,28,52,62,422,460 Breakwell, J. V., 87,101,261 Browder, F. E., 422, 460 Brunner, W., 398 (lo), 415 Bryson, A. E., 198,261 Buehler, R. J., 414, 415 Bushaw, D., 204, 261. Butkovsky, A. G., 147 (3), 148, 192 Carathkodory, C., 261 Carswell, A. I., 164 (26), 193 Cloutier, G. G., 164 (26), 193 Coddington, E. A., 432,460 Contensou, P., 66, 100 Courant, R., 67 (12), 81 (17), 100, 101, 234, 26 I Cowling, T. G., 163 (24), 165 (24), 193 Dahlard, L., 113 (lo), 146 Davidson, W. L., 403,415
Denham, W. R., 199, 261 Desoer, C. A,, 199, 261 Dieudonne, J., 422 (21), 460 Drake, J. H., 161 (22), 193 Dreyfus, S., 199, 261,371 Eaton, J. N., 392, 415 Edelbaum, T. N., 414,415 Egorov, A. I., 147 (2), 148, 150, 192 Egorov, Yu. V., 148,192 Fadden, E. J., 392,415 Falco, M., 90 (26), I01 Faulkner, F. D., 80 (14, IS), 100 Filippov, A. F., 207, 261 Fletcher, R., 403, 415 Flugge-Lotz, I., 199, 207, 261 Fraejis de Veubeke, B., 261 Frechet, M., 422,423,424,460 Fried, B. D., 146 Fuller, A. T., 261 Gamkrelidze, R. V., 199 (43), 205 (43), 261, 262, 373 (I), 389, 391, 415, 421 ( 5 ) , 460 Garfinkel, B., 4, 12,29 (3), 61,25,62 Ggteaux, R., 422,423,424,460 Gibson, J. E., 86,101 Gilbert, E. G., 392,415 Glicksberg, I., 199 (4), 261 Goldstein, A. A., 453,460 Goldstine, H. H., 422,460 Graves, L. M., 423 (27), 460 Grodzovskii, G. L., 141, 146 Gross, O., 199 (4), 261 Gubarev, A. V., 161 (23). 193 Guderley, K. G., 148, 149,192,193 Gus'kov, Yu. P., 135 (15), 146
Halbert, P., 398 (lo), 415 Halkin, H., 198, 199,200,20l,204,207,210, 218,261,262,371 Hantsch, E., 148,192 Hart, W., 453,461 Heermann, H., 90 (27), I01 Hestenes, M. R., 422,460 Higgins, J. J., 262 Hilbert, D., 67 (12), 81 (17), 100, 101, 234, 261 Hildebrandt, T. H., 423 (27), 460 Hille, E., 422 (18), 423, 460 Illarionov, V. F., 135 (13), 146 Irving, J., 444, 447,450,460 Isayev, V. K., 113, I46 Ivanov, Yu. P., 141 (16), 146 Johansen, D. E., 85,101 Johnson, C. D., 80,86, I01 Kaiser, F., 89, I01 Kalaba, R., 147 (I), 192 Kalman, R. E., 199,262,371 Kelley, H. J., 64 (7, 8, 9), 90 (26), 92 (9), 95, 96 (9), 99 (9), 100,101, 199,262, 453, 461 Kelley, J. L., 371 Kellog, 0. D., 177 (31), 193 Kempthorne, O., 414,415 Kopp, R. E., 64 (6), 160 (20), 100, 193 Kovbasijuk, V. I., 161 (23), 193 Kraiko, A. N., 149, I93 Krasovskii, N. N., 391,415 LaSalle, J. P., 64, 100, 199, 207, 218, 262, 393,415 Lass, H., 135 (14), 146 Lavrentjev, M. A., 180 (33), 193 Lawden, D. F., 92,94, 106 (3, 4), 113 (3,4), 127 (3), 143 (18, 19), 144 (18), 101, 145, 146 Leitmann, G., 92, 106 (I, 5), 113 (I, 5), 101, 145,261,262,371 Lerner, A. Ya., 147 (3), 148,192 Levinson, N., 432,460 Liusternik, L., 422 (19), 423, 427,460 Lurie, A. I., 124 (12), 146 Lurie, K . A., 149 (16), 171 (27), 174 (16, 27), 193
Lush, K. J., 89,101 Lyapounov, A., 262 McAllister, G. T., 4,25 McIntyre, J., 397 (9a), 402 (9a), 414 (9a), 415 McShane, E. J., 199,262 Malii, S. A,, 147 (3), 192 Maltz, M., 261 Mancil, J. P., 4, 25 Marbach, R., 261 Melbourne, W. G., 113, 146 Miele, A,, 64, 86, 89, 106 (2), 113 (2), 152 (18), 164(18), 100, 101,145,193, 262 Mishchenko, E. F., 199 (43), 205 (43), 262, 373 (I), 389,421 (5), 460 Motzkin, T. S., 453,461 Moyer, H. G., 64 (6), 100 Neustadt, L. W., 199, 207, 262, 391, 399, 401,416,422,460 O’Donnell, J. J., 262 Osborn, H., 148,192 Paiewonsky, B. H., 199, 262, 391 (3), 397, 398, 399 (2,3), 401 (3), 402 (9a), 405,406, 41 I , 414, 415 Pontryagin, L. S., 199, 205, 262, 371, 373 (l), 389,421,460 Powell, M. J. D., 403,404,415 Prokudin, V. A., 161 (23), 193 Rao, G. V. B., 148,193 Rashevsky, P. K., 149 (17), 193 Regirer, S. A,, 163 (25), 193 Reid, W. T., 52,62 Rheinboldt, W., 460 Robbins, H. M., 65,13,92,100 Rosenbloorn, P. C., 460 Ross, S., 92, 101 Roxin, E., 207, 262, 371 Rozonoer, L. I., 168 (28), 170 (28), 193 Rutowski, E. S., 89,101 Sauer, C . G., Jr., 113, Saunders, K. V., 371 Shabat, B. V., 180 (33), I93
Shah, B. V., 414,415 Sheindlin, A. E., 161 (23), 193 Shkadov, L. M., 135 (13), 146 Shmiglevsky, Yu. D., 148, 193 Sirazetdinov, T. K., 147 (4), I92 Sobolev, V., 422 (19), 423,427, 460 Solloway, C., 135 (14), 146 Sonin, V. V., 113,146 Spang, H . A. 111,414,416 Stiles, J. A,, 262 Stoleru, L. G., 262 Streibel, C. T., 87, ZOZ Struble, R. A., 434,460 Sutton, G. W., 161 (21), 193 Tarasov, E. V., 106,146
Terhelsen, F., 397 (9a), 402 (9a), 414 (9a), 415 Tokarev, V. V., 141 (16), 146 Troitskii, V. A,, 106 (7), 146 Valentine, F. A., 4,25,460 Vatazhin, A. B., 163 (25), 171 (29), 174, 175, 178, 186 (32), 193 Volterra, V., 156, 193 Warga, J., 66 (2, 3), 100,207,262 Wilde, D. J., 414.416 Wonham, W. M., 80,101 Woodrow, P. J., 199, 262, 397 (9a, 9b), 398 (lo), 402 (9a), 405,406,411,414 (9a), 415
Subject Index A
Abstract multiplier rule, 431 Accessory minimum problem, 67 Additivity property, 267 Adjoint equations, 305 Adjoint initial conditions, 392, 394 Adjoint variables, 392 Adjoint vector, 305 relation to gradient, 3! 5 Admissible controls, distributed, 152 B
Bang-bang principle, 387 Bilateral neighborhood, 31 Bilateral variation, 5 Bolza, problem of, 199 Boosting devices, 105 constant thrust acceleration, 105 mass flow rate limited, 106 propulsive power limited, 106 Boundary, 35 1 nice, 358 Boundary controls, 151 Boundary point, 256, 336f cone of normals at, 346 local cones at, 342, 356 Bounded, 204 Brachistochrone problem, 418
C
Canonical form, 80f
Caratheodory function, 47
Characteristic function, 213
Chattering, 100
Closed interval, 201
Closed set, 211, 256
Closure, 211, 256
Comoving space along trajectory, 219
Computation of optimum controls, 458
Cone of controllability, 380
Cone of normals
    at boundary point, 346
    at interior point, 301
Conic neighborhood, 275
Constrained corner, 8
Constrained variation, 5
Continuous transition, 9
Control, 292
    admissible, 293
    allowable, 375
    measurable, 376
    optimal, 104, 294, 374
    successful, 376
Control functions, 202
Control sets, admissible, 375
Control variations, 67f
Convergence acceleration, 403
Convex set, 211, 256
Convexity condition, 6, 29
Corner locus, 50
Corner manifold, 49
Cost, 267
Critical point, 34
Critical reflection, 32
Critical refraction, 32
Cross-over variation, 30
D
Davidon-Fletcher-Powell method, 403
Dead-end, 12
Degenerated point, 323, 360
Distributed controls, 149
Distributed parameter problems, 147f
Dynamic programming, 199
    functional equation of, 316
Dynamical polysystem, 204
Dynamical system, 266
    transfer of, 266
E
Effort constraints, time-optimal control, 398
E-function, 7, 31, 46
Enter condition, 6, 30, 45
Entry, 8
Entry locus, 50
Equations of motion, optimal thrust problems, 104
Erdmann-Weierstrass condition, 107
Essential discontinuity, 85
Euler equations
    distributed parameter problems, 152
    for end effects minimization, 167
Euler-Lagrange equations, 437
Exhaust speed, 104
Exit, 8
Exit locus, 50
Extremal, 6
Extremaloid, 6, 30
Extremaloid index, 12
F
Field, 39
Free corner, 8, 28, 52
Fixed point theorem, 246
Functional, 267
    integral, 293
    multiple integral, 151
Fundamental analogy, 356
Fundamental matrix, 215, 220

G
Gradient method, 401
Gradient procedure, 453
Green’s theorem, 64, 87, 89
Gronwall’s inequality, 249

H
Hamiltonian
    distributed parameter problems, 156
    optimal thrust problems, 106
Hilbert determinant, 7, 17
Hilbert integral, 39
Hilbert theorem, 7
Hodograph space, 64, 66
Hybrid analog-digital computer, 398
I
Imbedding, 13, 20, 36, 54
Interface, 29
Interior point, 256
Intersection, 256
J
Jacobi condition, 13, 39, 55
Jacobian determinant, 81, 96
Jump conditions, 64, 85, 87f
Junction condition, 66, 82f
L
Lagrange multiplier, 5
Lagrange multipliers
    distributed parameter problems, 153
    optimal thrust problems, 106
Lagrangian, 153
Lawden’s spiral, 92f
Left limit, 202
Legendre condition, 7, 31, 67
Legendre-Clebsch condition, 81, 87, 92, 97
Limiting surface, 269
    nice, 286
    partition of, 285
    point of,
        antiregular, 303
        nonregular, 303
        regular, 301
    properties of,
        global, 270f
        local, 273f
    subset of,
        antiregular, 317
        attractive, 311
        regular, 312
        repulsive, 311
    tangent cone of, 282
Linear dynamical polysystem, 214
Linear transformation, 303
Linearization, 223
Local cones
    at boundary points, 336
    at interior points, 274, 275
    properties of, 276f
    of separable, 305f
    separability of, 300
    symmetrical subset of, 318
Lyapounov theorem, 230
M
Magnetohydrodynamics (MHD)
    minimization of end effects, 165
    optimum problems, 160f
Magnetohydrodynamical channel flow, 160
Mass flow rate, 104
Maximum principle, 312, 323, 376, 439
    abnormal case, 350
    trivial, 363
Mayer problem, 419
Mayer-Bolza problem
    multiple integrals, 147f
    optimal thrust, 106f
McShane, variations of, 199
Midcourse guidance, 87f
Minimum effort control, 400
Multiplier rule, 199

N
Natural boundary conditions, 152
Necessary conditions
    for optimal controls, 431
    for singular subarcs, 69f
Neustadt’s synthesis method, 392
Nondegeneracy conditions, 387
Normality, 4, 29, 438
O
Open interval, 201
Open set, 256
Optimal, 267
    control, 294
    control problem, 420
    isochrones, 393
    isocost surface, 269
    space rendezvous, 406
    trajectory, 268, 294
Optimum rocket trajectories, 104f, 444
Optimum step, 402
Ordinary differential equations, 228

P
Partial differential equations, 149
Path, 266
Performance index, 267
    integral, 293
Pfaffian system, 149
Piecewise continuous, 201
Pontryagin maximum principle, 209, 376
Powell’s method, 404
Principle of optimal evolution, 207
R
Reachable set, 206, 393
Reflection corner, 28
Refraction corner, 28
Regular point, 375
Relative minimum, 428
Relaxed variational problems, 64, 66
Right limit, 202
Ritz method, 139
Rule, 265
S
Scalar integral, optimal thrust problems, 111
Second variation, 66f
Second variation test, 64, 66, 84, 89, 91, 93
Separating hyperplane, 211, 300
Simple covering, 14
Singular control in optimal thrust problems, 111, 114
Singular extremals, 63f
Singularity, 10, 19
Sounding rocket, 90
Starting values for iterations, 396
State, 266
    equations, 292
    space, 266
    variables, 292
    vector, 201, 266
Stationarity conditions
    distributed parameter problems, 155
    optimal thrust problems, 107
Stationary point, 454
Strong differential, 426
Strong gradient, 428
Strong minimum, 4
Strong variations, 227
Sufficiency, 83f
Support plane, 394
Supporting line, 48
Switching function, optimal thrust problems, 110
T
Tangent cone, 282
Test curve, 6
Thrust acceleration, 104
Time-optimal control, 392
Trajectory, 202, 268
    equation, 293
    optimal, 268, 294
    regular optimal, 312
Transformation approach, 64, 79, 86, 88, 91, 95
Transversality conditions, 438, 443
U
Uniformly bounded, 204
Unilateral neighborhood, 31
Unilateral variation, 5
Union, 256
V
Variation
    of endpoints, 439
    of initial conditions, 439
Variational equations, 303, 375
Vector integral, optimal thrust problems, 112
W
Weak differential, 424
Weak gradient, 426
Weak relative minimum, 82, 84
Weak variation, 5, 227
Weierstrass condition, 7, 31, 66, 82
    for distributed parameter problems, 157
Weierstrass criterion, optimal thrust problems, 107
Weierstrass-Erdmann corner condition, 6, 31, 45
Z
Zermelo diagram, 35, 47