GEOMETRY OF FEEDBACK AND OPTIMAL CONTROL
PURE AND APPLIED MATHEMATICS
A Program of Monographs, Textbooks, and Lecture Notes

EXECUTIVE EDITORS
Earl J. Taft, Rutgers University, New Brunswick, New Jersey
Zuhair Nashed, University of Delaware, Newark, Delaware

EDITORIAL BOARD
M. S. Baouendi, University of California, San Diego
Jane Cronin, Rutgers University
Jack K. Hale, Georgia Institute of Technology
S. Kobayashi, University of California, Berkeley
Marvin Marcus, University of California, Santa Barbara
W. S. Massey, Yale University
Anil Nerode, Cornell University
Donald Passman, University of Wisconsin, Madison
Fred S. Roberts, Rutgers University
Gian-Carlo Rota, Massachusetts Institute of Technology
David L. Russell, Virginia Polytechnic Institute and State University
Walter Schempp, University of Siegen
Mark Teply, University of Wisconsin, Milwaukee
GEOMETRY OF FEEDBACK AND OPTIMAL CONTROL

edited by
Bronisław Jakubczyk
Witold Respondek
Polish Academy of Sciences
Warsaw, Poland

MARCEL DEKKER, INC.
New York, Basel, Hong Kong
ISBN 0-8247-9068-5

The publisher offers discounts on this book when ordered in bulk quantities. For more information, write to Special Sales/Professional Marketing at the address below.

This book is printed on acid-free paper.

Copyright © 1998 by MARCEL DEKKER, INC. All Rights Reserved.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher.

MARCEL DEKKER, INC.
270 Madison Avenue, New York, New York 10016
http://www.dekker.com

Current printing (last digit): 10 9 8 7 6 5 4 3 2 1

PRINTED IN THE UNITED STATES OF AMERICA
Preface

As in other fields, especially mathematical physics, geometric methods provide a powerful tool for the analysis of nonlinear control systems. The subject has been developing quickly since the 1980s, and the first monographs to appear were A. Isidori, Nonlinear Control Systems (Springer, 1985 and 1989) and H. Nijmeijer and A. J. van der Schaft, Nonlinear Dynamical Control Systems (Springer, 1990). The rapid development of geometric methods in control theory, together with the broad variety of problems treated by those methods, resulted in a need to collect the most important and most illustrative results devoted to particular subfields of geometric control theory in separate volumes. We believe that optimal control, nonlinear feedback, and the geometry arising in both form a "triangle" of ideas and results which deserves to be the content of such a volume.

We have attempted to present methods and results of geometric control theory in a way that is accessible to a graduate student and to an applied mathematician who is not a specialist in nonlinear control. On the other hand, such a volume should also be attractive to some pure mathematicians, including those who work in the field and those who are looking for new problems. Our goal has been to present the contributions of geometric methods to optimal control and feedback transformations in a way which is accessible to nonspecialists and attractive to experts.

One part of the volume is devoted to applications of geometry to optimal control and exhibits nice links between the Pontryagin Maximum Principle, symplectic geometry, and the geometry of reachable sets (together with their singularities). The second theme of the volume concerns feedback transformations and covers such problems as feedback stabilization, feedback classification, and feedback invariants. Some presentations and many results and examples are devoted to small dimensions (n = 2, 3). This is because many structural properties are already fairly well understood and analyzed in the lower dimensional cases (e.g., feedback stabilization, the structure of time-optimal trajectories and of reachable sets, feedback classification in the plane, structural stability). All those problems provide a wide spectrum of results, thus exhibiting in a very illustrative way what geometric control theory deals with. On the other hand, many of the difficulties of the subject already appear in small dimensions, thereby showing the complexity of the questions studied and suggesting new problems to be solved. We hope this volume will be a stimulus to research as well as a useful reference work.

The TeX typesetting of this volume was prepared by Ms. Agnieszka Swiatkiewicz. We would like to express our gratitude for her professional work and kind cooperation.

Bronisław Jakubczyk
Witold Respondek
Contents

Preface
Contributors

Introduction

1. Symplectic Methods for Optimization and Control
   A. Agrachev and R. Gamkrelidze

2. Singular Trajectories, Feedback Equivalence, and the Time Optimal Control Problem
   B. Bonnard

3. Controllability of Generic Control Systems on Surfaces
   Aleksey Davydov

4. Recent Advances in the Stabilization Problem for Low Dimensional Systems
   W. P. Dayawansa

5. Asymptotic Stabilization via Homogeneous Approximation
   Henry Hermes

6. Critical Hamiltonians and Feedback Invariants
   Bronisław Jakubczyk

7. Optimal Control Problems on Lie Groups: Crossroads Between Geometry and Mechanics
   V. Jurdjevic

8. Nonlinear Control and Combinatorics of Words
   Matthias Kawski

9. Feedback Classification of Nonlinear Control Systems on R^2 and R^3
   Witold Respondek

10. Time-Optimal Feedback Control for Nonlinear Systems: A Geometric Approach
    Heinz Schättler

11. Qualitative Behavior Control Problem and Stabilization of Dynamical Systems
    A. N. Shoshitaishvili

12. An Introduction to the Coordinate-Free Maximum Principle
    H. J. Sussmann

Index
Contributors

A. Agrachev, Steklov Institute, Moscow, Russia
B. Bonnard, Laboratory of Topology, University of Bourgogne, CNRS UMR 5584, Dijon, France
Aleksey Davydov, Vladimir State University, Vladimir, Russia
W. P. Dayawansa, Texas Tech University, Lubbock, Texas
R. Gamkrelidze, Steklov Institute, Moscow, Russia
Henry Hermes, University of Colorado, Boulder, Colorado
Bronisław Jakubczyk, Institute of Mathematics, Polish Academy of Sciences, Warsaw, Poland
V. Jurdjevic, University of Toronto, Toronto, Ontario, Canada
M. Kawski, Arizona State University, Tempe, Arizona
Witold Respondek, Institute of Mathematics, Polish Academy of Sciences, Warsaw, Poland
Heinz Schättler, Washington University, St. Louis, Missouri
A. N. Shoshitaishvili, Institute of Control Sciences, Moscow, Russia
H. J. Sussmann, Rutgers University, New Brunswick, New Jersey
Introduction
What are the main problems of nonlinear control theory and how are they related to other domains of mathematics? We hope that the 12 surveys presented in this volume will help to answer this question. We start the introductory chapter with an overview of the field of nonlinear control theory as such. The overview presents the areas covered by this volume and introduces the reader to its subject. In the second part of the Introduction we present the contents of the 12 survey chapters which form this book.

1. Control systems
Generally speaking, finite dimensional control systems are underdetermined systems of ordinary differential equations. A control system in the most general autonomous form is a system

\Pi : \quad \dot{x} = f(x, u),

where x(t) are elements of X, called the state space, and u(t) belongs to U, the control space. Usually one assumes that the state space X is a finite dimensional differentiable manifold or R^n. The control space U is taken to be a subset of a differentiable manifold or of R^m. The variable x represents the state of the system and its role is analogous to that of the state of a dynamical system (i.e., the memory of the system). The variable u represents an external influence on the system which depends on the particular problem. It is the presence of the variable u which makes the system underdetermined.
Only when the control u is specified as a function of time or a function of x do solutions of the system \Pi become completely determined by the initial condition for x. Therefore, one usually assumes regularity of f and regularity of u which guarantees (local) existence and uniqueness of solutions, called trajectories.

Very often the dynamics f exhibits a special structure, affine or linear with respect to u, and then the control set U is assumed to be (a subset of) R^m. In the former case the system reads

\Sigma : \quad \dot{x} = f(x) + \sum_{i=1}^{m} u_i g_i(x),

where f, g_1, ..., g_m are vector fields on X, and it is called a control-affine system. In the latter case the system takes the form

\Lambda : \quad \dot{x} = \sum_{i=1}^{m} u_i g_i(x),

and is called a control-linear system. A special role is played by systems whose dynamics f is linear with respect to both the state and the control (we additionally assume that X = R^n). Then \Pi is called a linear system and is of the form

\dot{x} = Ax + Bu,

with A and B being matrices of appropriate sizes.

Control systems \Pi, \Sigma, and \Lambda can also be viewed more geometrically. One natural way is to represent \Pi as the family of vector fields

F = \{ f_u \}_{u \in U}, \qquad f_u(x) = f(x, u).
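To make these notions concrete, here is a small illustrative sketch of ours (not taken from the text; the kinematic unicycle is a standard example chosen arbitrarily) of a control-affine system viewed as a family of vector fields, with no drift and two control vector fields g_1, g_2:

```python
import numpy as np

def g1(x):
    """First control vector field of the unicycle: drive forward."""
    theta = x[2]
    return np.array([np.cos(theta), np.sin(theta), 0.0])

def g2(x):
    """Second control vector field: rotate in place."""
    return np.array([0.0, 0.0, 1.0])

def f(x, u):
    """Control-affine dynamics xdot = u1*g1(x) + u2*g2(x) (no drift term)."""
    return u[0] * g1(x) + u[1] * g2(x)

# The system as a family of vector fields {f_u}: fixing u gives one ordinary
# vector field on the state space X = R^2 x S^1.
x0 = np.array([0.0, 0.0, np.pi / 4])
for u in [(1.0, 0.0), (0.0, 1.0), (1.0, -0.5)]:
    print("u =", u, " admissible velocity f_u(x0) =", f(x0, np.array(u)))
```

The printed vectors are elements of F(x0), the set of velocities admissible at x0; letting u range over all of U sweeps out the whole set F(x0).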
If the parametrization with respect to controls is of secondary importance, then an even more geometric way of representing \Pi is to consider it as a differential inclusion

\dot{x} \in F(x), \qquad F(x) = f(x, U).

Observe that F(x) is the set of all velocities admissible at x, and thus the dependence on the control u is suppressed. In this setting the control system
is represented by a subset F of the tangent bundle TX, where

F = \bigcup_{x \in X} F(x).

In this way we can also define a control system as a trivializable subset of the tangent bundle TX. In the control-linear case of system \Lambda the field of velocities F becomes a distribution

F(x) = D(x) = \mathrm{span}\{ g_1(x), \dots, g_m(x) \},

and the system can be regarded as a trivial subbundle D \subset TX of the tangent bundle. For control-affine systems \Sigma one gets an affine distribution

F(x) = f(x) + D(x),

which can be treated as an affine subbundle f + D of the tangent bundle. (The condition of global triviality in our definition is often replaced by local triviality.) The trajectories of the above systems are, respectively, integral curves of the distribution D or of the affine distribution f + D. This inherently relates the theory of nonlinear control systems with problems of differential geometry.
2. Open loop and closed loop controls

There are two natural ways of making the system \Pi a determined system. The first is to choose u as a function of time, u = u(t), which gives the following non-autonomous differential equation:

\dot{x}(t) = f(x(t), u(t)).

If an initial condition x(0) = x_0 is given then the trajectory x = x(t) is uniquely determined for t sufficiently small, provided the system and the control are sufficiently regular. Usually, one assumes that f is at least Lipschitz with respect to x and continuous with respect to u. For many problems, including most of the chapters of this book, one assumes f to be smooth or analytic with respect to both its arguments. The class \mathcal{U} of admissible controls is chosen depending on the problem. In most cases it includes all piecewise constant controls with values
in U, and is contained in the class of all measurable, essentially bounded, U-valued functions. Under these assumptions solutions of \dot{x} = f(x, u) exist and are unique when an initial condition is given.

The second way of specifying the control is to choose it as a function of the state, u = u(x). Then we obtain a dynamical system of the form

\dot{x} = f(x, u(x)).

Also in this case, in order to get existence and uniqueness of trajectories one has to impose regularity conditions on f and on u = u(x).

The first way of choosing controls, when u is taken to be a function of time, u = u(t), is called open loop control. In the second case, i.e., when u is chosen as a function of the state, u = u(x), one speaks about closed loop control. This terminology is suggested by applications. In the latter case the information about the instantaneous state of the system is fed back to the system (hence the name "closed loop"), while in the former no information on the state x(t) is used when choosing the control u(t) ("open loop" control).
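A minimal numerical sketch of ours (the scalar system and the particular controls are arbitrary illustrative choices) contrasts the two ways of closing the system: an open loop control prescribed as a function of time versus a feedback law depending on the current state.

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(x, u):
    """A simple scalar system xdot = -x + u."""
    return -x + u

# Open loop: u is a prescribed function of time.
def open_loop(t, x):
    u = np.sin(t)
    return f(x[0], u)

# Closed loop: u is a function of the current state (a feedback law).
def closed_loop(t, x):
    u = -2.0 * x[0]
    return f(x[0], u)

x0 = [1.0]
t_span = (0.0, 5.0)
sol_open = solve_ivp(open_loop, t_span, x0)
sol_closed = solve_ivp(closed_loop, t_span, x0)
print("open loop   x(5) =", sol_open.y[0, -1])
print("closed loop x(5) =", sol_closed.y[0, -1])
```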
3. Control problems

Three basic control problems will be considered in this book. To describe the first one, observe that for a dynamical system \dot{x} = f(x) an initial condition x(0) = x_0 uniquely determines the trajectory starting from it. On the contrary, for a control system \Pi there is a whole family of trajectories starting from x_0, parametrized by all admissible controls. The set of all points reachable by all trajectories starting from x_0 is called the reachable set from x_0 and is denoted by R(x_0). This set can have a complicated structure and, generally speaking, controllability problems concern various aspects of this structure. In particular, the complete controllability problem is to characterize systems for which any two states can be joined by an admissible trajectory. Another problem is that of accessibility, meaning that the reachable set R(x_0) has a nonempty interior. Especially important is the problem of local controllability, i.e., the problem of determining whether R(x_0) contains x_0 in its interior. It is clear that open loop controls are natural in controllability problems.
The Kalman rank condition [5] for complete controllability of a linear system was one of the results which gave birth to control theory. Attempts to generalize this rank condition have led to considering natural Lie algebras of vector fields associated to the system \Pi. Let

L = \mathrm{Lie}\{ F \}

be the Lie algebra generated by the vector fields of F = \{ f_u \}_{u \in U}. In terms of the Lie algebra L, and the family of its generators, various nonlinear controllability properties can be characterized.

The stabilization problem is to find a control u = u(x), called a stabilizing feedback, such that the closed loop system \dot{x} = f(x, u(x)) becomes (asymptotically) stable at x_0. Various versions of the problem are studied, such as local or global stabilizability properties and regularity of the stabilizing feedback. Stated in this way, the stabilization problem naturally uses closed loop controls. From its very statement the stabilizability problem leads to classical problems of stability theory, which goes back to Lyapunov [6]. Hence it is natural that concepts and tools from the theory of dynamical systems are used [2].

It is a classical result of linear control theory that for a linear system complete controllability is equivalent to asymptotic stabilizability with arbitrary exponential decay. One of the goals of nonlinear control theory is to establish a similar correspondence for nonlinear systems.
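As an illustration of the linear theory referred to above, the following sketch of ours (with arbitrarily chosen matrices) checks the Kalman rank condition rank[B, AB, ..., A^{n-1}B] = n for a linear system \dot{x} = Ax + Bu:

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack [B, AB, A^2 B, ..., A^(n-1) B] column-wise."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_completely_controllable(A, B):
    """Kalman rank condition: xdot = Ax + Bu is completely controllable
    iff the controllability matrix has full rank n."""
    C = controllability_matrix(A, B)
    return np.linalg.matrix_rank(C) == A.shape[0]

# Example: a chain of two integrators driven through the second state.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
print(is_completely_controllable(A, B))   # True: rank [B, AB] = 2
```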
Optimal control problems are natural generalizations of problems considered in the calculus of variations. Besides a control system \Pi one specifies a functional on the space of trajectories and controls, which may be of the type

J = \int_{I} L(x(t), u(t)) \, dt,

where either the time interval I = [0, T] is fixed or T is not fixed and an end-point condition for x(T) is imposed. One fixes a class \mathcal{U} of admissible controls and an initial condition. The optimal control problem is the problem of minimizing the functional J over the class of controls u(\cdot) in \mathcal{U}.
In particular, in the time-optimal problem one wants to find a control which steers the system from an initial point x_0 to a terminal point x_1 (or a terminal set) in minimal time. In this case one takes L \equiv 1. The Lagrange problem is to find a control which steers x_0 to x_1 in a given time T and minimizes the functional J. A widely studied version of this problem is to minimize J where L is quadratic with respect to u (which corresponds to minimizing energy).

The problem as stated above involves open loop controls, not using any information about the current state. However, optimal controls can often be expressed through a function of the state (not regular, in general) and thus give rise to a closed loop control u = u(x). Such a way of expressing optimal controls is called optimal synthesis. The study of optimal control problems since the late fifties led to the discovery of the Maximum Principle by Pontryagin and co-workers [8] and to the development of mathematical control theory.
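For a closely related infinite-horizon linear-quadratic problem (our illustrative sketch, not part of the text), the optimal synthesis is the classical linear feedback u = -Kx obtained from the algebraic Riccati equation; the sketch computes it for a double integrator with L(x, u) = x'Qx + u'Ru:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double integrator xdot = Ax + Bu with quadratic cost x'Qx + u'Ru.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Solve the algebraic Riccati equation and form the optimal feedback gain.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)          # u = -K x is the optimal synthesis

# The closed loop matrix A - BK should be Hurwitz (stable).
print("K =", K)
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))
```

The point of the example is that the open loop optimal controls admit a closed loop expression u = u(x), which is exactly what the text calls an optimal synthesis.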
4. Natural groups, feedback equivalence

Attempts to solve specific control problems, such as the ones mentioned above, have been one of the main early motivations for developing control theory, and they remain so. Later, a need to understand the structure of classes of control systems and their geometry appeared, as is usual in the development of any mathematical theory. This need is realized by defining natural groups of transformations and, more generally, equivalence relations, and studying their invariants.

The simplest group used in nonlinear control theory is the group of diffeomorphisms of the state space. It maps trajectories of the system into trajectories corresponding to the same controls u(\cdot). Therefore, this group is natural for open loop control, and for those problems where the parametrization with respect to the control is important. The group of diffeomorphisms acts naturally on the family F = \{ f_u \}_{u \in U} of vector fields, preserving the parametrization with respect to u \in U. This leads to a complete description of invariants in terms of the vector fields in F and their iterated Lie brackets (elements of the Lie algebra L). This set of invariants is very rich
since the group of diffeomorphisms of X is "small" compared with the class of all systems \Pi. Indeed, the group is parametrized by n functions of n state variables (x_1, \dots, x_n) while the class of systems is parametrized by n functions of n + m variables (n state variables and m control variables (u_1, \dots, u_m)).

A larger group, leading to deeper invariants, is that of invertible transformations \chi = (\varphi, \psi) of the state and control of the form

\tilde{x} = \varphi(x), \qquad \tilde{u} = \psi(x, u),

called feedback transformations (the name comes from the fact that the relation between the transformed control and the original control u uses information on the state x). This group of transformations, called the feedback group, is natural for the problems whose solutions are of closed-loop type. In the case of control-linear systems \Lambda and control-affine systems \Sigma the transformation \psi is taken, respectively, linear or affine with respect to u. The problem of feedback equivalence then reduces to equivalence of the corresponding distributions D, or of the affine distributions F = f + D, under the group of diffeomorphisms. This is a classical problem in geometry going back to Pfaff, Darboux and E. Cartan [1].
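As a toy illustration of a feedback transformation (our example, not one from the text), the planar control-affine system x1' = x2, x2' = x1^2 + u is mapped to the linear double integrator by the feedback \tilde{u} = \psi(x, u) = x1^2 + u together with the identity state transformation; the sketch below verifies this symbolically:

```python
import sympy as sp

x1, x2, u, u_tilde = sp.symbols('x1 x2 u u_tilde')

# Original control-affine system: x1' = x2, x2' = x1**2 + u.
dynamics = sp.Matrix([x2, x1**2 + u])

# Feedback transformation: keep the state, set u_tilde = psi(x, u) = x1**2 + u,
# i.e. substitute the inverse relation u = u_tilde - x1**2.
transformed = dynamics.subs(u, u_tilde - x1**2)

print(sp.simplify(transformed))   # Matrix([[x2], [u_tilde]]) -- a linear system
```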
It turns out that the invariants of the feedback group are also infinite dimensional (functional moduli) if m < n. Therefore, one also considers weaker equivalence relations called dynamical (feedback) equivalence. A similar equivalence was already considered by E. Cartan in his work on absolute equivalence of systems of differential equations.

5. Hamiltonian lift
At an early stage of the development of mathematical control theory it turned out to be fruitful to consider a Hamiltonian lift of the original system \Pi (cf. [8]). This lift is important for optimal control problems and also, as it has turned out recently, for the static feedback equivalence problem. The idea is to consider additionally, at every point x \in X, a cotangent vector p \in T_x^* X and to replace the original system \Pi by the system on the
cotangent bundle written in coordinates as

\Pi^* : \quad \dot{x} = f(x, u), \qquad \dot{p} = -p \, \frac{\partial f}{\partial x}(x, u),

with (p, x) \in T^* X, called the Hamiltonian lift of \Pi. A delicate point in defining the second equation of the system \Pi^* (called the adjoint equation) is to make sure that its definition does not depend on the (canonical) coordinates, as a priori we do not have a connection on the cotangent bundle which would allow us to define \dot{p} in a coordinate independent way. This is established by the fact that the Hamiltonian system (HS) below can be defined in a coordinate free way using the canonical symplectic form on T^* X. To do it, let us introduce the Hamiltonian H : T^* X \times U \to \mathbf{R} of \Pi defined as

H(p, x, u) = p f(x, u),

where p f is defined by the duality of cotangent and tangent vectors. Then the system \Pi^* can be equivalently written (in canonical coordinates (p, x)) as

(HS) : \quad \dot{x} = \frac{\partial H}{\partial p}, \qquad \dot{p} = -\frac{\partial H}{\partial x},

i.e., it is the Hamiltonian system corresponding to H(\cdot, \cdot, u) via the standard symplectic structure on T^* X.

It is clear that \Pi^* is another control system, with a state space of dimension 2n. Its basic advantage is that it is Hamiltonian and it allows us to analyze the original system \Pi microlocally (i.e., locally with respect to the point x and the "direction" p). Using the Hamiltonian H one can analyze some properties of the system \Pi via properties of the function H(p, x, \cdot). In particular, critical points of the function H(p, x, \cdot) are crucial for optimality properties of \Pi. The critical values of this function are basic invariants of static feedback. Let us describe more closely these two features of the Hamiltonian H.
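The following sketch of ours (a one-dimensional example chosen for brevity) forms the Hamiltonian H = p f(x, u) symbolically and checks that the Hamiltonian equations (HS) reproduce the dynamics and the adjoint equation of the lift \Pi^*:

```python
import sympy as sp

x, p, u = sp.symbols('x p u')

# A scalar control system xdot = f(x, u); any smooth f would do here.
f = x**2 + sp.sin(x) * u

# Hamiltonian of the lift: H(p, x, u) = p * f(x, u).
H = p * f

xdot = sp.diff(H, p)    #  dH/dp ==  f(x, u)
pdot = -sp.diff(H, x)   # -dH/dx == -p * df/dx  (the adjoint equation)

print(sp.simplify(xdot - f))                     # 0
print(sp.simplify(pdot + p * sp.diff(f, x)))     # 0
```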
6. Maximum Principle
To illustrate the role of the Hamiltonian H and that of the Hamiltonian lift \Pi^* in optimal control problems we will state a simple version of the Maximum Principle for the time-optimal problem.
Suppose that an initial point x_0 \in X and a target point x_1 \in X are fixed.

Maximum Principle. If \hat{u}(\cdot) and \hat{x}(\cdot) are an admissible control and the corresponding trajectory joining x_0 with x_1 in a minimal time T, then there exists an absolutely continuous curve \hat{p}(t) \in T^*_{\hat{x}(t)} X, \hat{p}(t) \neq 0 on [0, T], such that the triple (\hat{p}, \hat{x}, \hat{u})(\cdot) satisfies the equations (HS) and the following maximum condition

(MC) : \quad H(\hat{p}(t), \hat{x}(t), \hat{u}(t)) = \max_{v \in U} H(\hat{p}(t), \hat{x}(t), v)

for almost all t \in [0, T], i.e., \hat{u}(t) maximizes the Hamiltonian with respect to the control variable.

The same principle holds for the Lagrange problem, except that the Hamiltonian should be taken in the form

H = p f(x, u) + q L(x, u)

and instead of \hat{p}(t) \neq 0 one claims that (\hat{p}(t), q) \neq 0, with q a constant. The Maximum Principle, together with its various generalizations, was a cornerstone in the development of mathematical control theory. It relates optimal control with the calculus of variations but is more general, since the set of control values U need not be open and optimal controls often take values at extreme points of U ("bang-bang" controls). If U is open and we control all derivatives of x(t), i.e., the dynamics f(x, u) is given as \dot{x} = u, the Maximum Principle yields the Euler-Lagrange equations of the calculus of variations.
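As a standard textbook illustration of ours (not the authors') of how the maximum condition determines the control, consider the time-optimal problem for the double integrator x1' = x2, x2' = u with |u| <= 1. The sketch below builds H = p1 x2 + p2 u and the adjoint equations; since p2(t) is affine in t, the maximizing control u = sign(p2) switches at most once, the classical bang-bang result.

```python
import sympy as sp

t = sp.symbols('t')
x1, x2, p1, p2, u = sp.symbols('x1 x2 p1 p2 u')

# Time-optimal double integrator: H(p, x, u) = p1*x2 + p2*u, |u| <= 1.
H = p1 * x2 + p2 * u

# Adjoint equations pdot = -dH/dx.
p1dot = -sp.diff(H, x1)      # 0
p2dot = -sp.diff(H, x2)      # -p1

# Solve the adjoint system: p1 is constant, p2 is affine in t.
a, b = sp.symbols('a b')                 # initial values p1(0) = a, p2(0) = b
p1_t = a
p2_t = b + sp.integrate(-a, (t, 0, t))   # p2(t) = b - a*t

# The maximum condition picks u(t) = sign(p2(t)); an affine function of t
# changes sign at most once, so optimal controls have at most one switch.
print(p1dot, p2dot, p2_t)
```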
7. Optimal control and feedback invariants

It has recently been revealed that the feedback equivalence problem and the time-optimal problem are closely related. Namely, using the formalism of the Maximum Principle and taking into consideration all critical trajectories defined by this formalism (possibly including the complex ones), one can find a complete set of feedback invariants.

In order to explain this, let us assume that the control set U is an open subset of a manifold. The maximum condition (MC) clearly implies the
following equality

(C) : \quad \frac{\partial H}{\partial u}(p, x, u) = 0

for critical points of H as a function of u. If (C) is solvable with respect to u, i.e., there exists u = \hat{u}(p, x) satisfying (C), then we may define a critical Hamiltonian

\hat{H}(p, x) = H(p, x, \hat{u}(p, x)).

The crucial, although simple, observation is that \hat{H} is feedback invariant.
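A minimal symbolic sketch of ours (the scalar system with quadratic dependence on the control is an arbitrary illustration, not an example from the text) of solving (C) for u and forming the critical Hamiltonian:

```python
import sympy as sp

x, p, u = sp.symbols('x p u')
g0, g1, g2 = sp.Function('g0')(x), sp.Function('g1')(x), sp.Function('g2')(x)

# A scalar system nonlinear in the control: xdot = g0 + g1*u + g2*u**2/2.
f = g0 + g1 * u + g2 * u**2 / 2
H = p * f

# Condition (C): dH/du = 0, solved for the critical control u_hat(p, x).
u_hat = sp.solve(sp.diff(H, u), u)[0]        # u_hat = -g1/g2 (for g2 != 0)

# Critical Hamiltonian H_hat(p, x) = H(p, x, u_hat(p, x)).
H_hat = sp.simplify(H.subs(u, u_hat))
print(u_hat)
print(H_hat)                                 # p*(g0 - g1**2/(2*g2))
```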
In the analytic case we may consider the equation (C) in the complex domain; then the critical Hamiltonians (which are nonunique) are, in general, complex valued. The trajectories of the Hamiltonian vector fields on the cotangent bundle defined by the critical Hamiltonians can be used to construct a complete set of feedback invariants. Among these trajectories there are time-optimal trajectories of the control system \Pi (which are real).

In the case of a control-affine system \Sigma the critical condition (C) takes the form

p\, g_i(x) = 0, \qquad i = 1, \dots, m.

It is therefore impossible to derive the critical controls directly from this condition. In this case, in order to compute them one has to keep differentiating this condition along the system \Pi^* until the controls appear explicitly. Again, it can be shown that the critical controls computed in this way and the corresponding critical (singular) trajectories are feedback invariants.

Vice versa, feedback equivalence can be used for analyzing optimality properties of control systems. This is especially useful when preliminary normal forms can be obtained so that the optimality properties of the system become more transparent. Finding the feedback invariants responsible for optimality properties of a system is a crucial problem here.

The above overview does not exhaust the subject of geometric nonlinear control theory. The reader who wants to learn the foundations of this theory more systematically may consult the books [3], [4], [7], [9], and [10].
THE VOLUME: A DESCRIPTION OF THE CONTENTS

The chapters are arranged alphabetically. However, we have made an attempt to present them here by topics, and we hope that this will help to exhibit links and relations between the chapters.
1. Optimal control

The survey of Schättler deals with the most basic time-optimal problem, for systems of the form

\dot{x} = f(x) + u g(x), \qquad |u(t)| \le 1,

where u is scalar. Since time-optimal trajectories are necessarily boundary trajectories, i.e., for all times t they satisfy x(t) \in \partial R_t(x_0), where \partial R_t(x_0) denotes the boundary of the reachable set at time t, a natural approach used in this chapter is to analyze directly the boundary of the reachable set. As shown by the author, this approach turns out to be fruitful in low dimensions. The idea is to construct the small-time reachable sets as stratified CW-complexes by successively attaching cells of increasing dimension to the 0-dimensional cell consisting of the initial point. For nondegenerate 2- and 3-dimensional cases it is enough to consider submanifolds obtained by concatenating bang-bang trajectories, i.e., the ones corresponding to the controls u(t) = \pm 1. In higher dimensions one has to consider also singular trajectories. A precise description of these geometric constructions is the content of the survey.

The survey of Jurdjevic discusses a few basic problems from classical mechanics and geometry which can be considered in a uniform framework as Lagrange optimal control problems (the Lagrangian L in these problems is quadratic with respect to the control). These are the dynamic equations of the rigid body, the ball-plate problem, various versions of the Euler elastica problem (the shape of curves with minimal integral of the square of the curvature) and the related Dubins problem. The state spaces M for each of these problems are Lie groups. Using the Maximum Principle and the invariance of the problems with respect to a natural action of the Lie group it is possible to compute
explicitly the optimal solutions for these problems. The geometry of the solutions is analyzed and striking connections between solutions of different problems are revealed. This stresses the unifying role of the Maximum Principle as a tool for understanding the underlying geometry of different optimal problems.

The survey of Agrachev and Gamkrelidze explains the effectiveness of the Hamiltonian approach and the symplectic formalism in the analysis of optimization problems and, in particular, optimal control problems. It starts with a general constrained optimization problem: given two smooth functions

\phi : U \to \mathbf{R}, \qquad f : U \to M,

find u_0 \in U which minimizes \phi(u) under the constraint f(u) = q, with a fixed q \in M. It is shown in the survey that the necessary conditions of Lagrange for optimality can be interpreted geometrically in the cotangent bundle T^* M, using its natural symplectic structure. The notions of Lagrange submanifold and Lagrange embedding appear naturally. The Maslov and Arnold-Maslov indices are then used in the analysis of optimality, analogously to the way the Morse index appears in the analysis of variational problems. The authors devote a special section to various definitions and descriptions of properties of the Maslov index. Then they show its role in obtaining sufficient conditions for optimality of trajectories of control systems.
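The finite-dimensional constrained problem above is the classical setting of Lagrange multipliers; the following small sketch of ours (with an arbitrary choice of \phi and f) solves the first-order conditions symbolically:

```python
import sympy as sp

u1, u2, lam = sp.symbols('u1 u2 lam')

# Minimize phi(u) = u1**2 + u2**2 subject to f(u) = u1 + 2*u2 = q (take q = 5).
phi = u1**2 + u2**2
f = u1 + 2 * u2 - 5

# Lagrange necessary conditions: grad(phi) = lam * grad(f), together with f = 0.
L = phi - lam * f
eqs = [sp.diff(L, u1), sp.diff(L, u2), f]
sol = sp.solve(eqs, [u1, u2, lam], dict=True)[0]
print(sol)   # {u1: 1, u2: 2, lam: 2}
```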
The article of Sussmann concentrates on the Maximum Principle. It presents a version of this principle which is, in many respects, the strongest known at present among first order conditions. The word "strongest" concerns three features. First, the author imposes the weakest assumptions needed for the validity of a theorem of maximum principle type. The second feature is that the systems are considered on differentiable manifolds and coordinate free notions are used. Finally, the usual definition of a control system as a family of vector fields parametrized by a finite-dimensional control parameter is weakened to a parametrized family of time dependent vector fields. This allows one to avoid unnecessary regularity conditions with respect to the control parameter.

The article begins with a description of how the Maximum Principle generalizes the classical Euler-Lagrange necessary conditions in the calculus of variations. The formulation of the Maximum Principle particularly stresses its differential geometric aspects and shows the role of such geometric objects as Lie brackets, connections along curves and the symplectic formalism.

The survey of Bonnard discusses the time-optimal problem for control-affine systems and its relations to feedback equivalence. As no constraints on the control are imposed (i.e., the control set U is R^m), the time-optimal problem is invariant under feedback equivalence. Moreover, the time-optimal trajectories are singular trajectories and are invariant under feedback. In the survey, feedback transformations are used to bring the system to a normal form so that the optimality status of the trajectories can be studied in detail. In particular, a method of describing conjugate points on an optimal trajectory is given (the points where the optimal trajectory ceases to be optimal). Conversely, the optimal trajectories and singular curves can be used to describe a complete set of feedback invariants. This is shown under certain assumptions in the second part of the survey.

2. Controllability
The survey of Kawski describes how some basic properties of a nonlinear control system are related to algebraic properties of Lie brackets of the vector fields generated by the system. This leads to a combinatorial analysis of words and of properties of bases of free Lie algebras. The unifying object is the Chen-Fliess series, which gives an algebraic way to represent (formal) solutions of a control system. The survey starts with a discussion of the small-time local controllability property, which is the ability of covering a neighborhood of an initial point by the trajectories of the system in arbitrarily small time. Several criteria in terms of the iterated Lie brackets from the Lie algebra L = \mathrm{Lie}\{ f, g_1, \dots, g_m \} are described. Then nilpotent control systems (i.e., systems with nilpotent Lie algebra L) are discussed together with their relevance to controllability and optimal stabilizability.
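Since iterated Lie brackets are the basic computational object here, the following sketch of ours computes the bracket [f, g](x) = Dg(x) f(x) - Df(x) g(x) of two vector fields symbolically; the fields chosen are those of the standard nonholonomic (Brockett) integrator, whose bracket [g_1, g_2] generates the missing direction:

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
x = sp.Matrix([x1, x2, x3])

def lie_bracket(f, g, x):
    """[f, g](x) = Dg(x) f(x) - Df(x) g(x) for column vector fields f, g."""
    return g.jacobian(x) * f - f.jacobian(x) * g

# Control vector fields of the nonholonomic integrator on R^3.
g1 = sp.Matrix([1, 0, -x2])
g2 = sp.Matrix([0, 1,  x1])

print(lie_bracket(g1, g2, x).T)   # Matrix([[0, 0, 2]]): a new, independent direction
```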
The author presents an algorithm to produce a nilpotent approximation of a given system. Then he illustrates the role of the Chen-Fliess series, showing a product expansion formula and an application of the series to a path planning problem. In the second part of the survey the main emphasis is put on algebraic and combinatorial aspects of the problems, related to various choices of bases for free Lie algebras. The shuffle product and its role in the characterization of Lie elements and of exponential Lie series is explained. Finally, the chronological product, which plays an important role in control theory, is discussed from an algebraic point of view.

The chapter written by Davydov deals with controllability properties and their stability under small perturbations of the system. Control systems of the local form \dot{x} = f(x, u) are considered, where x(t) \in X, a smooth surface. One says that a point x \in X has the local transitivity property if for any neighborhood V of x one can find a neighborhood \tilde{V} of x such that any two points of \tilde{V} are attainable one from another without leaving V. The local transitivity zone is the set of all points of X which have the local transitivity property. The first group of results describes generic singularities of local transitivity zones. The second group of results concerns generic singularities of the (positive time) reachable set and its boundary, for control systems on closed orientable surfaces.
In order to consider some global controllability problems, the author defines the nonlocal transitivity zone as the intersection of the positive and negative orbits of any point. Results concerning generic singularities of the boundary of nonlocal transitivity zones are given for systems on closed orientable surfaces. It turns out that all these singularities are stable under small perturbations of the system. Finally, the author defines a structurally stable control system as a system whose positive and negative orbits are homeomorphic to those of any nearby system. It turns out that a generic control system on a closed orientable surface is structurally stable.
3. Stabilization

The survey of Dayawansa is devoted to the stabilization problem. Contrary to the linear case, where controllability is equivalent to stabilizability with an arbitrary exponential decay, for nonlinear systems (local) controllability does not imply (local) asymptotic stabilizability. In the first part of the survey the author discusses necessary conditions for that implication to hold. Although the stabilization problem is of fundamental importance in control theory, its complexity implies that for nonlinear systems only low dimensional cases are relatively well understood. The survey gives a detailed analysis of the 2-dimensional case (in its full generality) and of the 3-dimensional case (for homogeneous systems). Finally, some open problems are stated.

The chapter written by Hermes concerns the stabilization problem and its relation to optimal control. A control system in R^n of the form \dot{x} = f(x) + u g(x) with scalar control u is considered. The goal is to find a stabilizing feedback u = \hat{u}(x) so that the closed loop system becomes asymptotically stable at 0 \in R^n. This goal is achieved by considering the problem of finding a control which minimizes a cost functional J(u) = \int L(x, u)\, dt with a specified positive L. An optimal control in closed loop form, u = u(x), becomes a solution of the stabilization problem. An important issue in this method is that the original system is replaced by a homogeneous (with respect to a dilation) approximation. A method of constructing such an approximation is presented which is based on a filtration of the Lie algebra \mathrm{Lie}\{ f, g \} generated by f and g.

The paper of Shoshitaishvili is an attempt to treat the stabilization problem and the optimal control problem using a unified approach based on the so-called Ashby relation. Most of the paper is devoted to the former problem, which is considered as a qualitative behavior control problem, shortly the QBC-problem, defined as follows. For a given control system \dot{x} = f(x, u) and a dynamical system \dot{z} = \phi(z), find a feedback u = \alpha(x) such that there exists a map z \to x transforming the trajectories of the latter system into trajectories of the closed loop system \dot{x} = f(x, \alpha(x)).
The author shows a correspondence between the QBC-problem and the Ashby relation, whose application allows one to construct a stabilizing feedback (a branching feedback, in general). In particular, he studies in detail the stabilization problem in the plane. In the final part the author considers the optimal control problem in terms of the Ashby relation. Instead of the stable dynamical system \dot{z} = \phi(z) added to the system in the case of the stabilization problem, the adjoint equation is added in the case of the optimal control problem. In both cases the existence of a finitely branched integral manifold of the completed system is of crucial importance.

4. Feedback equivalence
The chapter of Respondek deals with the feedback classification of control-affine systems of the form \dot{x} = f(x) + \sum u_i g_i(x). Most of the results, methods, and interesting phenomena of feedback classification in the general case are present already in the low-dimensional case, and thus the chapter provides an illustrative introduction to the subject. A complete local classification of generic systems on R^2 and R^3 is given, and lists of models and normal forms (including parameters) are provided. It turns out that already among generic systems functional moduli appear. Such a functional modulus is analyzed for the planar case and its geometric and control-theoretic interpretation is provided. A basic object for the classification is a pair (E, S) consisting of two curves, in the planar case, or two hypersurfaces, in the 3-dimensional case. In both cases E is the set of equilibria while S is a "singular" set: for planar systems it consists of the points where the vector fields g_1 and [f, g_1] are linearly dependent, whereas in the 3-dimensional case it consists of the points where the distribution spanned by g_1 and g_2 fails to be contact. The geometry of singularities is described in terms of E, S, and their relations.

In the paper of Jakubczyk the static feedback equivalence problem is studied for systems \dot{x} = f(x, u). The geometry of this problem and its inherent relation with the time-optimal control problem is revealed. More precisely, a construction of a complete set of feedback invariants is presented which shows this relation.
Critical Hamiltonians are defined in the same way as for the time-optimal control problem, allowing additionally complex valued controls. These Hamiltonians contain all the information about the feedback equivalence classes of systems. The same holds for their symmetric functions, called symbols, which are real analytic functions. It is shown that the iterated Poisson brackets of the symbols, evaluated at a point, form a set of feedback invariants. Under a rank assumption these invariants, together with an additional invariant, determine the feedback equivalence classes (i.e., they are complete invariants). The symbols can be computed effectively using complex logarithmic residua. In the case of systems polynomial with respect to the control the symbols are computed explicitly.
REFERENCES

[1] R. L. Bryant, S. S. Chern, R. B. Gardner, H. L. Goldschmidt, and P. A. Griffiths, Exterior Differential Systems, Springer Verlag, New York, 1991.
[2] W. Hahn, Stability of Motion, Springer Verlag, New York, 1967.
[3] A. Isidori, Nonlinear Control Systems, 2nd ed., Springer Verlag, Berlin, 1989.
[4] V. Jurdjevic, Geometric Control Theory, Cambridge Studies in Advanced Mathematics, Vol. 52 (to appear).
[5] R. E. Kalman, P. L. Falb, and M. A. Arbib, Topics in Mathematical System Theory, McGraw-Hill, New York, 1969.
[6] M. A. Lyapunov, Stability of Motion, New York, 1966.
[7] H. Nijmeijer and A. J. van der Schaft, Nonlinear Dynamical Control Systems, Springer Verlag, New York, 1990.
[8] L. S. Pontryagin, V. G. Boltyanski, R. V. Gamkrelidze, and E. F. Mischenko, The Mathematical Theory of Optimal Processes, Wiley, New York, 1962 (translation from the Russian edition, Fizmatgiz, Moscow, 1961).
[9] E. D. Sontag, Mathematical Control Theory: Deterministic Finite Dimensional Systems, Springer Verlag, New York, 1990.
[10] H. J. Sussmann, ed., Nonlinear Controllability and Optimal Control, Marcel Dekker, New York and Basel, 1990.
1 Symplectic Methods for Optimization and Control

A. Agrachev and R. Gamkrelidze
Steklov Institute, Ul. Gubkina 8, Moscow GSP-1, 117966, Russia
Abstract. The goal of this paper is to demonstrate the power and productivity of the symplectic approach to optimization problems and to draw the attention of specialists interested in Control Theory to this promising direction of investigation. We start from the classical problem of conditional extremum and then turn to optimal control. All basic notions and facts from symplectic geometry that we use are introduced in the paper. The geometry of the Lagrange Grassmannian is studied in greater detail.
1. Introduction

1. The language of Symplectic Geometry is successfully employed in many branches of contemporary mathematics, but it is worth recalling that the original development of Symplectic Geometry was greatly influenced by variational problems. In Optimal Control a crucial role was played by the Hamiltonian system of Pontryagin's Maximum Principle, which is itself an object of Symplectic Geometry. In the further development of Optimal Control, priority was given to Convex Analysis. Though Convex and Functional Analysis are very helpful in developing the general theory of
Extremal problems, they are not at all effective for investigating essentially non-linear problems in higher approximations, when the convex approximation fails. Therefore, since the discovery of the Maximum Principle, there have always been attempts to introduce geometric methods of investigation, though not as universal as the Convex and Linear methods. These new geometric methods were applied mainly for obtaining optimality conditions of higher orders and for constructing the optimal synthesis, and today we already have many ingenious devices and beautiful concrete results. It seems very probable that there should be a general framework which could unify the different directions of these geometric investigations and merge the Maximum Principle with the theory of fields of extremals of the classical Calculus of Variations. We are convinced that the appropriate language for such a unification can be provided by Symplectic Geometry, and as a justification for this conviction we consider the statement according to which "the manifold of Lagrange multipliers in the problem of conditional extremum is a Lagrangian manifold". Thus two "Lagrangian" objects, Lagrange multipliers (the main object of the theory of Extremal problems) and Lagrangian submanifolds (main objects of Symplectic Geometry), which existed independently for a long time, can be unified.

2. The present article has an expository character. Its main aim is to demonstrate the power and productivity of the "symplectic" language in optimization problems and to draw the attention of specialists interested in Control Theory to this promising direction of investigation. All basic notions and facts from Symplectic Geometry, needed in the sequel, are introduced in the article. A big part of our exposition is devoted to the reformulation of well-known results into symplectic language, many of which have become folklore. In these cases it is almost impossible to give accurate references, and we do not give them, which should certainly not mean that the authors claim any originality. We also do not claim that we have indicated all essential connections of Optimal Control with Symplectic Geometry. For example, we have omitted matrix Riccati equations, which play an essential role in linear-quadratic optimal problems. These equations define flows on Grassmannians, and in the case of symmetric matrices on Lagrange
Symplectic Methods for Optimization and Control
21
Grassmannians. Such flows axestudied in detail in many publications. We do not consider in the article such a big and important theme as the variational methods of investigation of Hamiltonian systems - a good example of a feedback influence of Optimization on Symplectic Geometry. In the sequel smoothness always means differentiability of appropriate order, a smooth mapping of manifolds is called submersion (immersion)if its differential at every point is surjective (injective). Let M be a smooth manifold. We call a smooth submanifold an M an arbitrary immersed submanifold, i.e. every one-to-one immersion of a manifold into M . We call a Lipschitz submanifold in M an arbitrarylocally-Lipschitzone-to-one mapping 0 : W + M of a smooth manifold W into M . A Lipschitz curve on a Lipschitz manifold is a curve t + O(w(t)),where W(-)is a Lipschitz mapping of a segment into W . To simplify the exposition we shall indicate in the sequel only the image @(W)assuming that the smooth or Lipschitz mapping 0 is given. We assume that every space of smooth mappings of smooth manifolds is always endowed with the standard topology of uniform convergence of derivatives of all orders on arbitrary compact sets. We say that a generic smooth mapping has a given property if the set of all mappings with this property contains a countable intersection of open everywhere dense subsets in the space of all mappings. Weusein this article standard notations. T,M denotes the tangent space to M at x E M, T,*M is the cotangent space (conjugate to thespace T,M). the tangent bundle to M is denoted by T M and is represented as a smooth vector bundle with base M , total space UzEMT,M and the canonical projection v + x,v E T,M. Correspondingly, T*M is the cotangent bundle, which plays a special role in Symplectic Geometry. Therefore we introduce a special symbol for the canonical projection of the total space U,,"T*M onto the base:
Let A : E + E' be a linear mapping of linear spaces, then ker A denotes the kernel of the mapping, imA is its image and rank A = dim(im A ) , A
Agrachev and Gamkrelidze
22
quadratic form q : E + R is represented as q(e) = b(e, e),where b(e1, e 2 ) , e l , e2 E E is a symmetric bilinear form on E. In this case kerq = {e E E 1 b(e, e')= 0 Ve' E E).We write q > 0 (< 0) and we say that the form q is positive (negative), if q(e) > 0 (< 0) for nonzero e E E. If we substitute symbols > (<) by 2 ( S ) we obtain nonnegative (nonpositive) forms. 3. The remainder of this section is devoted to some initial facts from Symplectic geometry; for details we recommend [13]. Let C be a finite-dimensional real vector space. A symplectic form on C is an arbitrary real nondegenerate skew-symmetric bilinear form on C, i.e. a bilinear mapping Q : C x C + R such that o(z1,zz) = -Q(Zz,Z1) VZl,Z2
E C,
and the relation ~ ( zz'), = 0 Vz' E C implies z = 0. The space C with a given symplectic form Q on it is called symplectic. It is easily seen that symplectic forms exist only on even-dimensional spaces and all such forms are transformed into each other by linear substitutions of variables. More precisely, let dim C = 2n, then there exists a basis el,. , , ,en, f1,. , , ,fn in C such that g ( e i ,e j )
(1)
= P(f i , fj) = 0 V i , j ;
o ( e i ,fj) = 0 V i # j ,
o(ei,fi) = 1,
i = 1,. . . ,n.
Every basis of a symplectic space which satisfies (1) is called a canonical basis. Let S C C,the subspace
S~={ZECIQ(~,~)=OV~ES)~C. is called the skew-orthogonal complement to S.For every subspace E C C the relations dimEL = 2n - dimE, (EL)L = E hold. The subspace E is called isotropic if E C EL.Every one-dimensional subspace is isotropic, and the dimension of every isotropic subspace does
Symplectic Methods for Optimization and Control
23
not exceed n. A subspace E is called Lagrangian if E = EL.Thus Lagrangian subspaces in C are exactly the n-dimensional isotropic subspaces. The symplectic group Sp(C) is the group of all linear transformations of the symplectic space C which preserve the symplectic form:
+
It is a connected Lie group of dimension n(2n 1).The elements of this group are called the symplectic transformations of the space C. The Lie algebra of the symplectic group is given by the expression sp(C) = {A E gl(C) I o ( A z ~ , z ~ ) = Q ( A zV~Z, z~~, )EZ C}. ~ Let h be a real quadratic form on C and d,h be the differential of h at the point z E C. Then d,h is a linear form on C which depends linearly on z. For every z E C there exists a unique vector K(z) E C which satisfies the condition cr(Z(z), .) = d,h.
It is easy to show that the linear operator K : C + E belongs to sp(E), and the mappings h + K is an isomorphism of the space of quadratic forms onto sp(C). We obtain the linear Hamiltonian systemof differential equations L = 6 ( L ) corresponding to the quadratic Hamiltonian h. Let e l , , . , ,e,, f l , , . . ,f n be a canonical basis in C and let us write z = (xiei + tifa). The Hamiltonian system takes the standard form in coordinates xi, ti:
cy=,
4. Let N be a 2n-dimensional smooth manifold. A smooth nondegene-
rate closed differential 2-form cr on N is called a symplectic structure on this manifold. The nondegeneracy of the form : y + uy,y E N , means that cry is a symplectic form on the tangent space TUNVy E N ; “closed” means that da = 0, where d is the exterior differential. A manifold N with a given symplectic structure cr on it is called a symplectic manifold.
24
Gamkrelidze
Agrachev and
An immersion (9 : W 4 N is called isotropic if @*U = 0. An isotropic immersion (9 is calledLagrangian if dim W = n. Correspondingly is defined an isotropic (Lagrangian) smooth submanifold of a symplectic manifold N . It is defined by the condition that its tangent space at every point p must be an isotropic (Lagrangian) subspace of the space T,N provided with the symplectic form cy. The most important symplectic manifolds inmany situations arecotangent bundles which carry a natural symplectic structure. To define it suppose M is an n-dimensional smooth manifold and let RM : T * M + M be its cotangent bundle. Let 19 E T * M , v E Tfi(T*M),then T M * W E T T M ( 6 ) M . Denote the pairing of the vector TM*V with the covector 19 E T,,(fi)M by S M ( W ) = 19(n~,v).The correspondence W + S M ( W ) , v E T(T*M),defines a differential l-form S M on T * M . The closed 2-form
defines the standard symplectic structure on the 2n-dimensional manifold T * M .
Remark. To avoid ambiguities we emphasize that the covector 19 E T * M is an element of the 2n-dimensional manifold T * M , (the covector defines the fibre to which it belongs). Often we omit the lower index in the symbols S M , U M , if the manifold M is determined from the context.
A Lipschitz submanifold (9 : W "t T * M is called isotropic if Sa(,) S M = 0 for every closedLipschitz curve y which is contractible in W . It is called Lagrangian if, additionally, dimW = n. An immersion (9 : W + T * M is isotropic iff @ * S M is a closed form on W . An isotropic (Lagrangian) immersion is called exact if the form @ * S M is exact on W . Correspondingly, a Lipschitz isotropic submanifold (9 : W 4 T*M is called exact if S*(r) S M = 0 foreveryclosed Lipschitz curve 7 on W . Among n-dimensional submanifolds in T * M the smooth sections, i.e. differential l-forms on M , are singled out. Such a section (l-form) is a Lagrangian submanifold iff the form is closed. It is an exact Lagrangian submanifold iff the l-form is exact, i.e. is a graph of a differential of a I
Symplectic Methods for Optimization Control and
25
smooth function. The fibres TZM, x E M, of the bundle T*M are exact Lagrangian submanifolds as well. 6, Suppose Ni is a 2n-dimensional symplectic manifold with the sym-
plectic form ai, i = 1,2. The diffeomorphism P : NI + is called symplectomorphism if P*q = 61. A well-known theorem of Darboux states that all symplectic manifolds of equal dimension are locally symplectomorphic:
For 'dzi E Ni thereexistneighbourhoods 0,, of thepoints zi in Nd, i = 1 , 2 and a symplectomorphism P : 0,, + 0,,, P(z1) = z2. Suppose C is an embedded Lagrangian submanifold of a symplectic manifold N . (An immersed submanifold is embedded if its topology is induced by the topology of N ) . It turns out that in this case a certain neighbourhood of C in N is symplectomorphic to a neighbourhood of the trivial section in T*C. Functions on symplectic manifolds are called Hamiltonians. Let h be a, smooth Hamiltonian on C and dh be its differential (which is an exact l-form). Since Q is a nondegenerate 2-form there exists a uniquely defined vector field h' on N, satisfying the condition ~'JQ =~
4
( h*), = dh.
The field h' is called a Hamiltonian field corresponding to the Hamiltonian h, and the differential equation on N , i = h'(i) is the corresponding Hamiltonian system. According to Darboux's theorem symplectic manifold is locally symplectomorphic to symplectic space: in the neighbourhood of an arbitrary point of the manifold we can introduce local coordinates such that in these coordinates the form Q has constant coefficients. If in addition the coordinates correspond to a canonical basis of the symplectic space then the Hamiltonian system in these coordinates with the Hamiltonian function h takes the form (2). Let u7, T E [0,t ] , be a fmily of smooth vector fields, measurable in T and uniformly bounded on every compact in N . Consider the nonstatio-
Agrachev and Gamkrelidze
26
nary differential equation
and suppose that all its solutions can be extended to the whole interval [O,t]. Then the equation (3) defines a (nonstationary) flow on N , i.e. a locally Lipschitz, with respect to r , family of diffeomorphisms P, : N + N, r E [0,t ] ,which satisfies the conditions
We denote Symp N the group of all symplectomorphisms of N. It is easy to show that P, E Symp N VT E [0,t] iff the l-form v,Ja is closed for almost all r E [0, t].At the same time (3) is a (nonstationary) Hamiltonian system iff the form w,]a is exact for almost all r E [ O , t ] . Thus the flow generated by a Hamiltonian system consists of symplectomorphisms. The Poisson bracket of smooth Hamiltonians h1 , h2 is the Hamiltonian
The Poisson bracket defines the structure of a Lie algebra on the space of all Hamiltonians: the operation is skew-symmetric and satisfies the Jacobi identity:
{hl, {h29 h 3 ) ) - (h2, { h ,h 3 ) ) = { { h , h219 hS)Furthermore
{ h , h z ) 0 P = { h , op,h2 0 P) for VP E SympN, where h o P ( z ) = h ( P ( z ) ) ,z E N. Let P, E SympN, 0 5 r 5 t , be a flow defined by the nonstationary Hamiltonian system
and
'p
be a smooth Hamiltonian. Then
Inparticular, the function {h,, 'p) 0.
=
'p
is the first integral of the system (4) iff
Symplectic Methods for Optimization and27Control
Concluding this introductory section we give the variation fornula for a pair of nonstationary Hamiltonians h,, g,.
F, E SympN , 0 5 r 5 t , be flows defined by the Hamiltonians
Let P,, hr, h, g r :
+
Then F, = P, o R,, where R, is the flow generated by the Hamiltonian 97.0 p,:
To prove this relation it is sufficient, according to (5), to show that
a
-('POP~oR~)={h,+gr,cp}oProRr
Sr for an arbitrary smooth function p on N . Using again ( 5 ) we obtain
Propositions, theorems, formulas and figures are numbered independently in each section. References to other sections use double numbering.
2. Lagrange Multipliers and Lagrangian Submanifolds 1. Consider a standaid problem of conditional extremum. Let U, M be smooth manifolds, f : U + M be a smooth mapping, and let p : U + R be a smooth real-valued function. We have to minimize cp on a level set of the mapping f Put N = R x M and define the mapping F : U + N by the relation F(u)= (p(u),f (U)), U ' € U.If at a given point'u thefunction cp attains its minimum on the level set of f then imF and the ray {(p(u) - t, f (U))E I
28
Agrachev and Gamkrelidze
N I t > 0) have an empty intersection. In particular, F(u) is a boundary point of im F. Let FA : T,U -+ TF(,)N be the differential of the mapping F at U (which is a linear mapping of the corresponding tangent spaces). We remind that a point of the manifold U is calleda regular pointof the mapping F if the differential of F at this point is surjective. If the point is not regular it is called a critical point of the mapping. The image of a critical point of F is called a critical value of F. The implicit function theorem implies that theimage of a regular point of the mapping F belongs to the interior of the set im F. Thus if F(u) is a boundary point of im F then im F; # TF(,)N.Hence there exists a nonzero covector E T&,,N which is orthogonal to im F;. Let V be an arbitrary linear space. We shall denote the pairing of a covector I9 E V * with a vector v E V simply by I9v considering the expression as a product of a row by a column. In particular, the relation wlimFA can be written as WFLV= 0 Vv E T,U,or simply as wFL = 0. Thus we obtained the simplest form of the Lagrange multiplier rule: ifthe function cp attains at U anextrema1valueonthe level set of the mapping f then there,e&ts a nonzero W E Ti+.,N such that wFL = 0. Definition 2. Let U,N be arbitrary smooth manifolds 'and F : U -+ N be a smooth mapping. We call a Lagrangian point of the mapping F an arbitrary pair ( W , U ) , where U E U and W E Ticu,N\Osatisfies the equality wFL = 0.The covector W is called a Lagrange multiplier and U is called a critical point corresponding to the Lagrangian point ( w , ~ )The . set of all Lagrangian points of the mapping F is denoted by CF. Let F * ( T * N )be a vector bundle over U, induced by the bundle T*N under the mapping F : U -+ N . As usually, we shall denote the totalspace of a bundle and the bundle itself with the same letter. According to the definition of the induced bundle we have
F * ( T * N )= ( ( 8 , ~I U) E U,I9 E T&,)N} C T * N X U. We identify the manifold U with the trivial section in the bundle F*(T*N):
U = ( ( 0 , ~I U ) E V } C F * ( T * N ) .
Symplectic Methods for Control Optimization and
29
Then, CF C F * ( T * N )\ U . Let (19, U ) E F * ( T * N ) ,hence 29Fi E T*U.
Definition. The mapping F is called a Morse mapping if the mapping (29, U ) + 29FL, (29, U ) E F * ( T * N )\ U is transversal to the trivial section in T*U. In other words, the mapping F is said to be a Morse mapping if the system of equations dFL = 0 is regular at 29 # 0. For N = R, i.e. in case F is a real-valued function, the condition that F is a Morse mapping is equivalent to the assertion that the Hessian of the function F is a nondegenerate quadratic form at every critical point, If the mapping F is a Morse mapping then CF is a smooth submanifold in F * ( T * N )for every N . Since the dimension of the total space of the bundle F* (T*N ) is equal to dim U +dim N and the codimension of trivial section in T*U is equal to dim U we obtain dim CF = dim N . From Thorn's transversality theorem now easily follows
U
Proposition 1. For arbitrary manifolds U , N ,a generic mapping F is a Morse mapping.
;
+N
Let dim N = n and suppose the mapping F : U + N is a Morse mapping. If ( W , U ) E CF then (tw,u) E CF Vt E W \ 0.Hence the ndimensional submanifold CF C F * ( T * N )defines an (n - 1)-dimensional submanifold W Fin the projectivization
PF*(T*N)= F * ( T * N ) / { ( d , u ) (td,.), t E R\O> of the bundle F * ( T * N ) . The projection ( W , U ) c) U maps the manifold CF onto the set of critical points of F . The latter set is the main object of investigation in the singularity theory of smooth mappings. According to Thom-Bordman for every typical F the set of its critical points is a finite union of submanifolds in U of dimensions 5 ( n - 1). At the same time, the set of critical points is not necessarily a smooth submanifold and has, in general, highly complicated singularities even for generic F. Proposition 1 implies that all these singularities are resolved by simply adjoining the Lagrange multipliers.
30
Gamkrelidze
Agrachev and
3. Let F : U -+ N be a smooth mapping, U E U. We shall define the notion of the second derivative of F , which isnot quite trivialif we attempt to obtain aninvariant notion, independent of the choice of local coordinates in U and N , and which should reflect the local structure of F “of the second order”. For example, if U is regular for F then, according to the implicit function theorem, F is represented in some local coordinates as a linear mapping, and thereis no senseto’speak about the second derivative. Todefine ‘the second derivative at a critical point it is appropriate to complement it, as above, by Lagrange multipliers.
Thus let ( W , U ) E CF,W E ker FA, and suppose g and 2 are smooth sections of vector bundles F * ( T * N )T,U , respectively, such that g(u) = W ,
-
= W. Consider a smooth function wF’2 : G e g(G)FAy(G)an U.It is easily seen that the differential (gF’2); of this function at U depends anly on W , W and does not depend on the sections g, 2.
).(W
Definition. The second derivative of a smooth mapping F at a Lagrangian point ( W , U ) is the linear mapping
defined by the relation wF:w = (gF’g)L,v E kerF4. The Hessian of F at a Lagrangian point ( W , U ) is defined to be the real-valued quadratic form wF,h on ker F;, which is defined by the relation wF,h(v) = (wFlw)v, W E ker F :. It is easily seen that the bilinear form (211, W Z ) I+ (wF:wl)wz is symmetric, hence it i s restored by the quadratic form wF,h. The following test for being a Morse mapping is established by a straightforward computation. Proposition 2. A smooth mapping’Fis a Morse mapping iff the linear mapping wF$ is injective at every Lagrangian point ( W , U ) . Suppose ( W , U ) E CF.If the Hessian wF,h is a nondegenerate quadratic form then wF; is injective. The opposite is not always true. For example,
Symplectic Methodsfor Optimizationand Control
-
31
for the following Morse mapping
(ti)
U1
(U1uz+(ul)S
)
U'
E R, i = 1,2,
for which the origin is a cusp point, the Hessian is equal to zero for (0;l), 211 = 212 = 0.
W
=
4. Denote by
Fc : CF + T * N the mapping ( W , U ) t+ manifold.
W , (W, U )
Proposition 3. If F : U exact Lagrangian immersion.
E CF.We remind that T * N is a symplectic
+M
is a Morse mappingthen FC is an
Proof. We shall use local coordinates and, to simplify the exposition, we shall suppose that U,N are vector spaces. Hence we can assume that T*U = U* X U , T * N = N* X N , CF = { ( < , U ) E N * X V I < $ =O},
Fc : K, 4 t+ (5, F ( U ) ) . According to Proposition 2 the property of F to be a Morse mapping is equivalent to the relation &(<%v) # 0 for (<,U) E C F ,W E ker v # 0. Suppose that Fc is not an immersion at ( E , U ) . Then 3(q,v ) E T ( t I u ) C ~ such that q = 0, U E ker $, The definition of T ( t I U ) Cimplies ~ that &(<%v) = 0. We come to the contradiction with the assumption that F is a Morse mapping. It is left to show that the immersion FC is Lagrangian. We have that S N = d y , ( E , U) E N* X N = T * N , hence F ~ S = N
E,
<
kgrachev and Gamkrelidze
32
We have proved that T N o CF defines a smooth mapping of a (dim N - 1)dimensional manifold EFinto N , The following assertion formulates conditions under which the latter mapping is an immersion. Proposition 4. Let F be a Morsemappingand ( W , U ) E C F . The digerential of the mapping T N o F c has rankdim N - 1 at @,U) if and only if rank FA = dim N - 1 and the Hessian wF,h of the mapping F at ( w , ~ is ) a nondegenerate quadratic form.
Proof. We use the notations from the proof of Proposition 3. We have ker(m 0 Fc)tWtu)= (N* x ker F:) n T ( € , , ) ~ F ,
(2)
where W = (5, F ( u ) ) ,cf. (1). Let v E ker FA. According to (1)the existence of a pair (v,W ) contained in the space (2) is equivalent to the relation W E ker wF,h. Moreover the pair ( ~ , 0 )belongs to the space (2) iff 71 im F:. 5. We return to the situation considered in nol,,when N = Wx M , F ( u ) = (cp(u),f ( U ) ) . Let ( W , U ) be a Lagrangian point, W E T&u,N, U E U. Then W = (a,X),a E R, X E T;cu)M,0 = wF: = Xf; +ap$.According to
the usual terminology in the theory of extremal problems the Lagrangian point ((a, X), U ) will be called noma1 if a # 0, and will be called abnormal if a = 0..Consider the set of normal Lagrangian points. Since the Lagrange multipliers W = (a, X) are defined up to a nonzero multiplier we can normalize them by fixing the value of Q arbitrarily. This procedure reduces the dimension and permits to consider the normal Lagrangian points as elements of the space f * ( PM ) rather then of F * ( T * N ) .We shall use the normalization Q! = -1. Put Cf,p
{(X,U)
E
f * ( T * M )[ U E V, X
fc : 0, U ) l+ X,
E Tj,,)M,
Xf;
= 0},
(X, U ) E Cf,p.
The following assertions are easy modifications of the results of n04. Proposition 5. If F = (p, f ) is a Morsemapping then is a smooth submanifold in.f * ( P M ) ,and fc : C,,,-+ T * M is a Lagrangian immersion, where f 5 S M = d p .
Symplectic Methods
forControl Optimization and
33
Proposition 6. Let F = (cp, f) be a Morse mapping and ( & U ) E Cj,cp. The differential of the mapping T M o fc at (X, U) is invertible i f f : is surjective and the Hessian of the mapping F at (-1, A), U ) is a nondegenerate quadratic form. Let ( X , U ) E Cj,cp, W = ( - l , A ) , and w F t be a Hessian of the mapping F at the Lagrangian point ( w , ~ ) If. the quadratic form wFt is negative definite (positive definite) then it is easily seen that U is a point of strong local minimum (maximum) of the function cp on the level set of the mapping f . On the other hand, if f; is surjective and the quadratic form w F t is indefinite then U is not a point of a local extremum of the function cp on the level set of the mapping F . Combining these assertions with Propositions 5 and 6 we come to the
Corollary. Let F = (v, f ) be a Morse mapping and W be a connected open subset in C,,, such that T M o fclw is a diffeomorphismof W onto f ( W )C M . If there exists a point (Ao, U O ) E W such that uo is a point of a local minimum (maximum) of the function cp on the level set off then V(A, U ) E W U is a point of strong local minimum (maximum) of c p . on the level set o f f . Furthermore,
1
(o('h) - (P(U0) = sM Y
for V(Ai,ui) E W , i = 0 , l and for every smooth curve 7 : [O, l]+ fc(W), satisfying the condition T M ( Y ( ~ )= ) .f(ui), i = 0,l.
3. The Problem of Optimal Control 1. Among many similar formulations of the optimal control problem we choose the problem with the optimized integral functional, free time, and fixed end-points. Let M be a smooth manifold and V a subset in R?', We call admissible controls locally bounded measurable mappings v(*) : W, + V ,where W, = { t E W 1 t 2 0). Let go : M x V + P be a continuous function, smooth in the first variable, and let g : M x V + TM be a continuous mapping, smooth
Agrachev and Gamkrelidze
34
in the first variable, and subject to the condition g(z, W ) E T,M VI E M , W E V . In particular, foreveryfixed W the mapping I C ) g(z, W ) is a smooth vector fieldon M . We consider an initial. point OI E M which will not change in the sequel. For every admissible control W(-)there exists a unique absolutely continuous curve T c) I(T; W(-)) in M , defined on a half-interva1.7 E [0, t w ) which , satisfies the condition ~ ( 0W(.>) ; = ICO and is a solution of the differential equation I
.
(1) Denote by.U the set of pairs (t,W(.)), such that Z ( T ; W(-)) is defined.for T E [Ojt]. Define a function cp : U -+ B and a mapping f : U -+ M by relations t
cp(t,V('))
1
= go(+;
9
W(*)),W ( T ) ) d T ,
f(t,W(')) = 4 t ; 4)).
0
Since the infinitedimensional space U is not necessarily a smooth manifold we can not apply directly to cp and f the results of the previous section. Nevertheless we have in the given situation a reasonable substitute of a smooth structure for U.In order to define it we have to consider generalized controls, cf. [g]. We shall not give here the appropriate definitions which would leadus too far aside. We only indicate that therole of the differential at (t,W(.)) of f is taken by the mapping t
f [ t , w ): (0, 4 ' ) )
1
c) e
*
(g(4T;W), 4 7 )
+ 4 7 ) ) - g ( 4 7 ;4 4 T ) ) ) d T
0
+g(+;
+
@,
+
where (t B, W W) E U,(and assuming that t is a Lebesgue point for W(-)).Here P: is a diffeomorphism of a neighbourhood of the point Z ( T ;W ) on a neighbourhood of z(t; W ) which is uniquely defined by the conditions that P,' z and that the curves T C ) PS("), T E [ ~ ' , t are ] , absolutely continuous solutions of the differential equation (1).
=
Let F = (9,f ) : U + B x M . Substituting M by R x M , the point zo by (0, $0) e R x M , and the differential equation (1) by the equation
Symplectic Methods for Optimization and Control
35
d -(zO, 3) = (gO(z, W ( T ) ) ,g(z1V ( T ) ) ) , dt we obtainan explicit expression for = f[t,,,). We call the point ( t ,W) E U a critical point of F if the image of the mapping F(,,v), which belongs to P x T,(t,v)M,does not contain a neighbourhood of the
(~i~,~),
origin. It is easily seen that the image of is convex, hence criticality of ( t ,W) is equivalent to theexistence of a nonvanishing element W = (a,X) E Rx T&ulM such that wFl,,+)(e,W ) 5 0 for V(e, W )satisfying the condition (t 8, W W ) E U.We call W the Lagrange multiplier and the triple W , ( t ,W). a Lagrangian pointfor F. If ( t ,W ) is a critical point for F then (7,v) is also a critical point for F, for VT 5 t. Put X, = P:*X E T,*(,,v)M,then (a, X), is a Lagrange multiplier corresponding to thecritical point (7,W). It is easily seen that the curve T C ) X, in T * M is a trajectory of a nonstationary Hamiltonian system on T * M corresponding to a nonstationary Hamiltonian
+
(2)
+
hT(E)= &7(nMs1W(T)) +W0(TM<,w(T)),
< E T*M, 0 5
5 t.
7
An elementary calculation leads to the.following
Proposition 1. The system (a,X), ( t ,W ) is a Lagrangian point of the mapping F = ( V , f ) i f for this systemthe Pontryagin Maximum Principle holds: 0 = h,(X,) = F E y ( X T g ( " ( T ; v),U )
+ ago("(T;v), U))),
05
5t,
where h, is given by ( 8 ) andthecurve T C ) X, is a trajectory of the Hamiltoniansystem in T * M defined b y thenonstationaryHamiltonian h,, T E [O,t],with the boundary condition At = X. 2. A Lagrangian point (a, X), ( t ,W ) is callednormal if a # 0 and abnormal in the opposite case. As in S2 we normalize the Lagrange multipliers for normal points by putting Q = -1. Let
Agrachev and Gamkrelidze
36
Put
Proposition 1 implies that im fc C H-l(O). Let (X,t,W ) E Cf,,, then there exists an absolutely continuous curve X, E T ' M , 0 5 T 5 t , such that (X,,T, W ) E Cf,, VT, and Xt = X. Without any regularity conditions it isdifficult to expect that C,,, could be provided with the structureof a Lagrangian manifold, and that fc is a Lagrangian immersion. Still it turns out thatif a Lagrangian manifold iscontainedin H"(0) and contains the curve X, 0 5 T 5 t , we can make some essential conclusions about the optimality of the control W(-) independently from any assumptions about the analytic nature of Cf,, and fc. Theorem 1. Let C C H-'(O) be an exact Lagrangian Lipschitz submanifold in T * M such that the preimageof an arbitrary pointin ?TM(C)under the mapping ~ ~ l : l C : + M is a connected Lipschitz complex in C , and the preimage of an arbitrary admissible trajectory in M under the mapping ?TM(L:is a Lipschitz complex in C. Let X, E C , 0 5 r 5 t , be an absolutely continuous cu'rwe such that (X,,r, W ) E C,,, for some admissible control W(-). Then for V(
<
9
t
j g 0 ( 4 7 ;W), 4.))
dT
5 j g 0 W ; $1, W ) )
0
0
holds.
Proof. We denote Z ( T ) = X ( T ; w ( T ) ) , O(T) = x(T;G(T)). Under the assumptions of Theorem 1 one can show that there exists a Lipschitz curve X, E C and a nondecreasing Lipschitz function O(T), 0 5 T 5 1, such that the following conditions are satisfied A
n
(1) X0 = Xo,
(2)
A
X1
= X;
e(o) = 0,e(1) = 6
(3) ?TM(X,) = q e ( T ) ) ,
o 5 T 5 1.
Symplectic Methods for Optimizationand Control
where
denotes the curve r
c)
IT.(Weused
37
here the relations 0 =
H(%)= m ~ E v ( ~ , g ( d ( T )U) ), - gO(z(e(T)),U))). Denote by 7 the curve T c) X .,
Since the form S M is exact on C we have
1S M = 5 S M CV
7
t
=
t
&Z(T)& 0
t
1
5
0
0
= X r g ( 5 ( 7 ) ,u(7))dT =
g o ( Z ( T ) ,V ( 7 ) ) d T .
Remark. The exact Lagrangian submanifold C evidently satisfies conditions of Theorem 1 if the projection TMIL is Lipschitz invertible, fig. 1; “vertical pieces” can be also permitted, fig. 2, but “folds” are not allowed, fig. 3.
Fig. 1 Fig. 2 Fig. 3 3. Let (X,t,v) E C,,, and X, 0 5 r 5 t , be a trajectory of the Hamiltonian system defined by the nonstationary Hamiltonian
with the boundary condition Xt = X. We call the curve X, a norrnal Pontryagin extrema1corresponding t o the control v(*).According to Proposition 1 (X,,r,v) E Cf,, for 0 5 T 5 t.
Agrachev and Gamkrelidze
38
The simplest case when a Lagrangian manifold can be constructed, which contains a given Pontryagin extrema1 X, is given when the Hamiltonian (3)
H ( [ ) = y$tg(n(t),4
- g0(.(t)14)9
t E T*M
is smooth. It is easy to show that in this case X, is a trajectory of the Hamiltonian system defined by the Hamiltonian H . The following assertion is a geometric formulation of the classical method of characteristics for solution of differential equations in partial derivatives of first order.
Proposition 2. Let H : T*M -+ R be a smooth function and assume that CO C T * M is a smooth Lagrangian submanifold. Suppose that H i I TACO# 0 VX E CO n H-l(O). Let t .I+ p(t,X) Be a trajectory of theHamiltoniansystemwithHamiltonian H andthe initialcondition p ( 0 , X) = X E COn H-'(O). Then the mapping p , which is defined on an open set in W x (COn H-'(O)) and with values in H-'(O), is a Lagrangian immersion.
z(X)
Proof. Condition H i I TACOis equivalent to the statement that is not skew orthogonal to TACOin the symplectic space TA(T*M).Since CO is a Lagrangian submanifold the last statement is equivalent to g(X) not being tangent to CO,Hence p is indeed an immersion. Since the Hamiltonian flow preserves the symplectic structure it is sufficient to check that the immersion ( t ,X) t ) p(t,X) is Lagrangian only for t'= 0. We have
Proposition 2 can be used to construct a Lagrangian submahfold from Theorem 1 ifwe take for CO a Lagrangian submanifold in T&M, or more generally, ifwe take a Lagrangian submanifold (4
{ X E T,,M
I5
E
x,XLTzX),
,,,
, ,
where X is a smooth submanifold in M . The submanifold (4) is often used
Symplect,fcMethods for Optimization and Control
39
in problems of optimal control with a variable left end-point z(0) E X , and an obvious generalization of Theorem 1 isvalidfor this case. For nonsmooth Hamiltonians (3) there might not exist even a nonsmooth flow consisting of Pontryagin extremals - the extremals are inevitably intersecting and branching near any singular extremal. Therefore it is important to emphasize that to apply Theorem 1 it is not at all necessary to have a Hamiltonian flow of extremals: the corresponding Lagrangian manifold can be constructed in a different way. It is often possible to construct an auxiliary Morse mapping, cf. 52, for which the Lagrange multipliers coincide with the Lagrange multipliers of the optimal problem in consideration and hence constitute a Lagrangian manifold in the level set of the Hamiltonian (3) corresponding to the value 0. We shall return to this problem later. After the appropriate Lagrangian submanifold C C T + M is constructed we have to examine its projection on M . The local properties of the projection are defined by a mutual disposition of the Lagrangian subspaces TxC and Tx(T,*(,)M)in Tx(T*M),A E C. The next section contains some basic information about the manifold of Lagrangian subspaces of a symplectic space. This information turns out to be useful not only for investigation of the projections T I L , but in many other cases as well.
TIC
4. Geometry of Lagrange Grassmanians 1. Let C be a symplectic vector space with a symplectic form U with dim C = 271. We denote by &(C) the set of all Lagrangian subspaces in C and call it the Lagrange Grassmannian. The subspaces Ao, AI E &(C) are transversal iff A0 nA, = 0. The set of all Lagrangian subspaces transversal to a given A0 E L(C)will be denoted by A$. If p : C + C is a linear symplectic transformationand A C C is a Lagrangian subspace then pA is also Lagrangian. Thus the symplectic group of all symplectic'transformationsacts naturallyon L@). This action automatically defines an action of Sp(C) on the sequences of k points from
Agrachev and Gamkrelidze
40
L@):
Proposition 1. The group Sp(C) acts transitively on the set
{(Ao,Al)I A o , h E W ) ,
n A1 = 0 )
of pairs of transversalLagrangiansubspaces.The restriction p I+ PIA, defines an isomorphism of stablesubgroups Cp E Sp(C) 1pAi = Ai, i = 0,l} on GL(A0).
Proof. The bilinear form (1)
0 1 1 x01 l+ Q
( h 1
x011
xi E Ai,
i = 0, 1,
defines a nondegenerate pairing of spaces A0 and Al. Let el,. .. ,en be a basisinhland f l , ...,f, thedualbasisinAo.Thene1, ...,e n , f l , ...,f, is a canonical basis in C.Proposition 1 follows from the assertion that a linear transformation in C is symplectic iff it maps a canonical basis into a canonical basis. Proposition 1 implies that L(C)is a homogeneous space of the group Sp(C); in particular, L(C)has the structure of a Cw-manifold.We shall indicate a standard family of coordinate neighbourhoods on L(C)defined by pairs of transversal Lagrangian subspaces. Let Ao, A1 E L(C ) , A0 n A1 = 0. The pairing (1) defines an isomorphism A0 W A i . Correspondingly C = A0 61A1 FY A i 6I Al. The indicated is,omorphism identifies0 with the standard symplectic form
(2)
( ( t l , X I ) , K 2 1 2 2 1 1 I+ t 2 Z l
E A,, ti E A:, i
-tlX2, 1,2.
,
The form (2) will be also denoted by Q. Every m-dimensionalsubspace H C hi @,A1W C transversal to A i , (in particular, every subspace sufficiently close to A I ) , is a graph of a linear
Symplectic Methods for Optimization and Control
mapping QH : A1
-+
41
Ai:
It is easy to show that H is a Lagrangian subspace in A i @A1 iff Q& = QH. Let Io : C + hi @ A1 be the isomorphism of symplectic spaces defined above, which sends A0 into A i . Then A I+
(3)
Q ~ A ,
A E A$
is a diffeomorphism of A$ onto the space of selfadjoint mappings of A1 into Ai. Thus A$ is a coordinate neighbourhood in L(C),and the mapping (3) defines local coordinates. Selfadjoint transformations from A1 into A i are actually equivalent to quadratic forms on AI: to a mapping;Q : A1 + Ai there corresponds the form q : z (Qz)z, z E Al. The space of all real-valued quadratic forms on A1 is denoted P(A1).We have
A$
E
P(A,),
dimL(C) = dimP(A1) =
n(n
+ 1)
2
'
It is easily seen that L(C)= @. The manifold L(C)is compact. Taking into account the isomorphism A$ I P(A1) we can say that L(C)is a compactification of the space of quadratic forms of n real variables. We shall nowshow that the geometry of the Lagrange Grassmannian considered as a homogeneous space of the group Sp(C) is intimately connected with the geometry of the space of quadratic forms, which explains the effectiveness of symplectic methods in many problems where we have to deal with families of quadratic forms. Consider a subgroup in Sp(C) which preserves A$. A linear transformation of the space C preserves A$ iff it preserves Ao. The isomorphism IOpermits to consider correspondingly L(Ai @AI)and Ai instead of L@), and Ao. A linear transformation of A i @ A1 is symplectic and preserving AT iff it is represented aa
(E, z) (4)
< E Ay,
2 E
(A*(
+ BA-lz, A"z)
A i , A E GL(Al), B : A1 + h;, B* = B .
<
The mapping (4) transforms the subspace {((,S)I = Qz} C A i @ A1 into subspace {(E,z)I< = (A*QA B)z}. In other words, the Lagran-
+
Agrachev and Gamkrelidze
42
gian subspace corresponding to the quadratic form q(o) = (Qo)z, o E A1 is transformed into the subspace corresponding to the quadratic form q(Az)+ ( B z ) z ,obtained from q by coordinate transformation and translation. Thus the sympIectic transformations which preserve A$’ P P(A1) are “variable substitutions” and translations in the space of quadratic forms of n variables. It turns out that the group Sp(C) is generated already by translations and a single transformation which interchanges the subspaces A, and A l . Suppose that a scalar product (.I.) is defined on A1 which identifies Af with A1 and Af @ A1 with AI @ AI, so that
Proposition 2. The symplectic group of the space A1 @ A1 P C with the symplectic form ( 5 ) is generated b y transformations
and the tramfornation
Suppose Q : A1 subspace
”t
AI, Q = Q*. The intersection of the Lagrangian
with 0 @ A1 coincides with the subspace 0 @ ker Q. In particular, the subspace (6) is not transversal to O@Al iff Q is degenerate. If it is nondegenerate then the symplectic mapping J transforms the subspace (6) into the subspace { ( g , z)I y = Q-lz}, corresponding to the operator -8-l. But, contrary to theoperation of matrix inversion, J is defined fordegenerate Q as well. Speaking not quite formally, we can say that the Lagrange Grassmannian of a 2n-dimensional symplectic space is a compactification of the space of symmetric n x n-matrices (the space of quadratic forms of n variables) for which the inversion of all symmetric matrices, including the degenerate matrices, is possible: the cone of the “ideal matrices”, which
Symplectic Methods for Optimization and Control
43
are inverses to degenerate matrices, is "attached at the infinity" to the space of symmetric matrices. Repeating in the opposite order the identifications of the symplectic spaces we have made, we obtain
A$
\ A?
W
{q E P
( h )I ker q # 0).
2. Let A'"@) be the subspace of k-linear skew-symmetric forms on C,
k = 0, l , ,. . ,2n;
E A2(C). If w E A'"(C), then kerw gf{x E C I ZJW = 0). It iseasy to show that dimkerw 5 2n - k for every nonzero form
form w is called decomposable if dim ker W = 2n - k. A decomposable form is restored by its kernel up to a nonzero scalar multiplier.
W
E Ak (C). The
Let P Ak (C) = (iik(C) \ O ) / { w E QW, w E C, Q! E R, QW # 0) be a projectivization of the space Ak(C), and denote by G the image of a nonzero form w under the canonical factorization Ak (C) 0 + P Ak (C). The mapping
(8)
ker W
c)G,
w is decomposable in Ak (C),
-
is a standard projective Plucker embedding of the Grassmannian of (2n k)-dimensional subspaces in C. It turns out that for k = n the image of the Lagrange Grassmannian under the embedding (8) is an intersection of the image of the standard Grassmannian with a projective subspace in P Ak (C). Namely, the following assertion is valid.
Proposition 3. Let
W
E An@) be a decomposable form. Then
kerwEL(C)wuAw=O. We consider in more detail the Lagrange Grassmannians for n = 1,2. Every one-dimensional subspace in R2 is Lagrangian, hence L(R2) = RP1. Furthermore Sp(R2) = SL(R2), therefore symplectic transformations coincide with the orientation preserving projective transformations. Thus ,!,(R2) is an oriented projective line, topologically - an oriented circle. To describe L(R4) we use the Plucker embedding. The form w E A2(R4)
Gamkrelidze
44
Agrachev and
is decomposable iff wAw=O.
(9)
The equation (9) defines a quadric of signature (+++- - -) in PAa(R4)= R P 6 . To obtain the image of L(R4) under the Pluckerembedding we must intersect the quadric with the hyperplane, defined by the equation o Aw = 0, As a result we obtain a quadric of signature (+ - -) in W . Topologically this quadric, and hence also L(R4),are represented as a quotient of S1x S2relative to the equivalence relation (21, 22) M (-21, -%a), zi E S i , i = 1 , 2 . This is three-dimensional nonorientable manifold, cf. [lo], where two “proofs” are given of the relation L@*) M S1x S2.
++
3. Proposition 1 asserts that the symplectic group acts transitively on pairs of transversal Lagrangian subspaces. Nowwe shall consider the action of this group on triples of Lagrangian subspaces. Let Ai E L@), i = 0,1, where A0 n A1 = 0. The isomorphism introduced in nol, 10 : C + AT CB A I , transforms A0 into A i , hence for VA E A$ we have a selfadjoint linear mapping of A1 into A i :
I o A = { ( € , X ) E A;
Al I € = Q I O A ~ : ) .
The subgroup in Sp(Ai CB A I ) which preserves A1 and AT consists of transformations of the form
((,x)
c)
(A*€,A-lx),
A E GL(A1).
The Lagrangian subspace defined by the equation 6 = Qz is transformed by this mapping into the subspace defined by the equation E = A’QAx. We assign to each A E A0 the quadratic form Q P A ( ~ ) = (&rpnX)z,
2
Eh .
The given considerations imply that the existence of a symplectic transformation which carries the triple of Lagrangian subspaces (Ao, AI, A)into the triple (Ao, A,, A‘) is equivalent to the existence of a linear change of coordinates transforming the form qIOA into the form qrOA]. If q is a real-valued quadratic form on a vector space E we denote by sgnq the difference between the number of positive and negative squares
Symplectic Methodsfor Optimization and Control
45
in the diagonal form of q: sgnq = max{dimH I qIH
> 0, H c E) - max(dimH I qIH < 0, H C E}.
Hence the forms q and q' are transformed into each other by a linear change of variables iff sgn q = sgn q', dim ker q = dimker q'. Note that ker qIOA = A n A l . Denote P(A0, AI 1 A) = sgn qloh.
The number p(Ao,A1, A) is called the Maslov indez of the triple of the Lagrangian subspaces Ao, AI, A.We shall give nowanother, more invariant definition of the Maslov index, which does not presupposes the assumption about the transversality of the subspaces A and A1 to Ao. Let Ai E L@), i = 0,1,2. Define a quadratic form q on AI n (ho+A2) by relations q(A1)
= u(X0, Xz),
A1
= X0
+ A2,
X i E Ai, i = 0,1,2.
+
The vector X1 E AI n(A0 A2) is in general represented as a sum of vectors from A0 A2 not uniquely, but the skew-scalar product of these vectors depends only of XI, which is a direct consequence of the isotropy of Ao, A2, The Maslov index of the triple Ao, 111, A2 is called the number
+
1 . 0 0 , A I , A21
= sgn q.
It is easily seen that q = qI0Al if A2 n A0 = 0. The given definition of the Maslov index uses only the symplectic structure of the space, therefore the index is a symplectic invariant of the triple of Lagrangian subspaces. We can give still another definition of the Maslov index, in which the subspaces Ai, i = 0,1,2,enter symmetrically. Define on the 3n-dimensional space A0 @ A1 @ A2 a quadratic form @ by the relation
q x o + x1 + X 2 ) = a(Xo,h) - ope, x 2 1 + u ( h ,A2). One can show, cf. [ll],that &Io,
....... ..
.
I
A I , A2) = sgn?.
..".. ../.,. . I . . . . ,...... .,. .._l .
,.
... . ..,
,
..,.,. ,.,...._.
. .
.
.
..
,
I
. ...... .
Agrachev and Gamkrelidze
46
The last representationimplies the skew-symmetry of the Maslov index in all three variables: p(Ao, h , Aa) = - p ( h , Ao, ha) = -p(Ao, h , AI).
Somewhat more difficult is,to prove the following identity rule, cf. [U]:
- the
chain
p(Ao,A1,Az) - p ( A o , A l , A 3 ) + p ( A o , A z I A ~ -) p ( A 1 , A a 1 A 3 ) = O f
(10)
VAi E L(C), i = 011,2,3.
Except the Maslov index there are the following trivial invariants of the triple Ao, A I , Az:
(n a
dim(Ai n Ai),
0 5i
< j 5 2,
dim
Ai).
i=O
We have proved that there are no other invariants if A o n A l = A0nA.a = 0. It turns out that this is true without any assumptions.
Proposition 4. A triple of Lagrangian subspaces (Ao,Al,A,) can be transformed into the triple (Ab, A i , Ab) by a symplectic mapping ifl p(A0, A I , 112) = p(AL,A i , A;),
dim(Ai n Ai) = dim(A: n Ai),
Consider the case n = 1 in more detail. Since L(Ra) is an oriented circle the Maslov index is an invariant of a triple of points on the oriented circle. If two of the three pointscoincide the index is zero. LetSO, SI,82 be three different points located on S1in such an order that if we move along S1in the positive direction we pass consecutively through the points sil, s i p , S i 3 . The parity of the substitution (il,i z , is) does not depend on the choice of the initial state of the movement.The Maslov index p(el,salss) is equal to 1 if this substitution is even and is equal to -1 if it is odd. 4. We consider in more detail the tangent spaces TAL(C),A E L ( C ) . To every quadratic form h E P@) there corresponds a linear Hamiltonian field h' and a one-parameter subgroup t C) etiEin Sp(C). Consider the linear
Symplectic Methods for Optimization and Control
47
of the space of quadratic forms to TAL(C).The set of all linear Hamiltonian fields isa Lie algebra of the group Sp(C). At the same time the action of the group Sp(C) on L(C)is transitive. Hence the mapping (11) is surjective. This mapping is certainly not invertible since dimP(C) = n(2n + 1) and dim L(C)= It is easy to show that the kernel of the mapping (10) consists of all quadratic forms which vanish on A. Thus to two different forms from P(C)there correspond equal vectors from TAL(C)iff the restrictions of these forms on A coincide. We obtained a natural identification of the space TAL@)with the space P ( A ) of the quadratic forms on A. The correspondence TAL(C)+ P ( A ) could be described more explicitly, without considering quadratic forms on C.Suppose A(t) is a smooth curve in L@), A(0) = A. We assign to every smooth curve At E A(t) the number $a(% X,). The isotropy of the spaces A(t) imply that In other words, tothe tangent this number depends onlyon XO,
W.
I t = O,
vector
dt
L
O
E TAL@)there
corresponds the quadratic form
It is not difficult to show that this correspondence coincides with the isomorphism TAL(C)W ?(A) defined above. We shall use this identification in the sequel without any further mentioning. The cone of non-negative quadratic forms defines a partial ordering in P ( A ) , We call a curve A(t) E L(C)monotone non-decreasing, (monotone non-increasing) if 2o 5 0)Vt. Suppose that thecurve A ( t ) is contained in a coordinate neighbourhood of the manifold L(C)W L(Ay €B A,) considered in nO1.Thus
v (v
At = ( ( 5 , .) E A; €B A1 I E = Qt.),
where Qt : A1 + AT is a selfadjoint mapping smoothly depending on T.To the tangent vector corresponds a quadratic form on A l . It is easily
%
Agrachev and Gamkrelidze
48
seen that this is the form
x I-) -(%x)x,
x E Al.
Thus the curve A, is non-decreasing (non-increasing) iff
*
5 0 (32
0)* We know that A1 is isomorphic to the space of all quadratic forms on W , though not canonically isomorphic. Consider the remaining part of the Lagrange Grassmannian
which accordingto V. Arnold is calledthe train of the Lagrangian subspace A l . The relation (7) implies that theintersection of Mhl with an arbitrary coordinate neighbourhood A$, where Ao, is transversal to AI, coincides with the set of all degenerate quadratic forms on R". To a subspace A, which has a k-dimensional intersection with AI, there corresponds a form with a k-dimensional kernel. Degenerate forms constitute an algebraic hypersurface in the space of This hypersurface has singularities: its singular points are all formsP(Rn). all the forms which haveat least two-dimensional kernel. At the same time the forms with at least two-dimensional kernelconstitute an algebraic subset of codimension 3 in P(Rn), cf. fig. 1, which represents the hypersurface of the degenerate forms in the three-dimensional space P(Ra).
Fig. 1
Symplectic Methods forControl Optimization and
49
&om this we obtain that M A , is an algebraic hypersurface in L@), and its singular points constitute an algebraic subset of codimension 3 in M A , . Hence M A , is a pseudo-manifold. Let A be a nonsingular point of the hypersurface M A . It is not difficult to show that vectors from TAL&) corresponding to the positive definite and negative definite quadratic forms on A are not tangent to the hypersurface M A , , cf. fig. 2, where the dispositions of the positive and negative cones in TAL@)relative to M A , are given for n = 2.
Fig. 2 We define a canonical coorientationof the hypersurface M A , in L(C) at a non-singular point A, considering as positive that side of the hyperplane towards which the positive definite elements of TAL@) are directed and as negative - towards which the negative definite elements are directed. The defined coorientation of the train permits to define correctly the intersection index . A ( . ) :M A , of an arbitrary continuous curve A(t) in L @ ) , with endpoints outside of M A , , with the hypersurface M A , , If A(t) is smooth and transversally intersecting M A , in nonsingular points the index is defined in the usual way: every intersection point of h(i)adds +l or -1 into the value of the intersection index according to the direction of the vector %lt=;. respectively to the positive or negative side of MA,, At the same time, since the singularities of M A , are of codimension 3 in L@), not only the generic curves, but as well aa generic homotopies of such curves, do not intersect the singularities of Mhl. Thus with a small change of an arbitrarycontinuous curve with endpointsoutside of Mhl we
50
Agrachev and Gamkrelidze
can bring it into a transversal position with M A , , with the intersection index not depending on the perturbation. Furthermore, the intersection index is constant under an arbitrary homotopy which leaves the endpoints outside M A , . Let A ( t ) , t E S1, be a closed curve. Thenthe intersection index A(.) M A , does not depend on A l . Indeed, for VA' E L(C)there exists a P E Sp(C) such that A' = PAl. Since the definitions of a train and of the intersection index are invariant under symplectic transformations we have P M A , = M p h , , (PA(-)) M P A , = A(.) ' M A , . The group Sp(C) isarcwise connected, hence there exists a continuous curve Pt E Sp(E) such that PO = id, P1 = P. Taking into account the homotopic invariance of the intersection index we obtain
A(.) Mp,n, = (PCIA(*)) M A , = A(.) M A , . Definition. The intersection index A(.) M A , of a closed curve A(.) with M A , , which does not depend on A,, is called the Maslov index of the closed curve and is denoted Ind A(.). a
There is a close connection of the described intersection index with the Maslov index of the triples of Lagrangian subspaces. The latter definition permits to express explicitly the intersection index without bringing the curve into general position or solving nonlinear equations. If B segment of the curve A ( t ) for to 5 t 5 tl belongs to thecoordinate P ( A l ) , qt is the quadratic form corresponding to neighbourhood A$ A ( t ) , and to the subspace A1 corresponds the vanishing form, then it is easily shown that 1
(AI[to,tl1) "A,
= 5(sgnqtO- sgnqt,).
Since sgnqt = p(A0, A I , A ( t ) ) then 1
*MA,= Z(,4ho,Al,A(to)) - p(Ao,Al,A(tl))).
(AI[to,t1])
We can subdivide an arbitrary curve A ( t ) into segments A J[ti,ti+l~,to tl < ,.. < tl+l forwhich h(t) is transversal to some Ai E L(C) for
Symplectic Methods for Optimizationand Control
51
where the relation (12) is valid also in case A(ti) n A1 # 0, 1 5 i
5 1.
5. For a global study of C W R"* R, it is convenient to use a special complex structure C" on R,* CB R". Todo this we put z = iz,6 E W * , 2 E R,. The structure C" has a standard Hermitian form h ( z , ~=) C;=, z j d . The real part of h is a usual scalar product
+
n
(13)
(21
I z2) = C(S!Ei+ &), j=l
and the imaginary part coincides with the symplectic form 6.Thus
+
h ( z ,W ) = ( z I W ) io(%, W).
The unitary group UJ(C") preserves h, hence it preserves 6 ,and U(Cn) C Sp(Rn* CB R,) = Sp(C). We shall show that U ( P ) acts transitively on
LO=). It is enough to show that an arbitrary Lagrangian subspace A can be obtained by a unitary transformation from the real subspace A0 = ( E + io I E R"*}, Indeed, suppose el, . . ,e, is an orthonormal basis relative to the real scalar product. Since A is a Lagrangian subspace we have a ( e j , e h ) = 0, hence the basis is orthonormal also relative to theHermitian form h. Therefore the transformation which carries over the standardbasis of the arithmetic space C" into e l , . . . ,e, is unitary. Furthermore, the unitary transformation U : C" + C" is carrying over the real subspace A0 C C" into itself iff its matrix is real, in other words, when U belongs to the orthogonal group: O(R") C U(Cn). Thus the Lagrange Grassmannian is the homogeneous space
.
L(C)W U(C,)/O(R,). Using the representation we can obtain an embedding of L(C)into the space of complex symmetric n x n-matrices as a Lagrangian submanifold. We emphasize that the space of complex symmetric (not selfadjoint) m&
52
Gamkrelidze
Agrachev and
trices is considered. A symplectic structure on this space is given by the imaginary part of the Hermitian form
Bj' = Bj,
(B1,B2) c)tr(B1,&),
j = 1,2,
where T denotes the transposition of a matrix. The imbedding of
L(C) U(C!")/O(C") into the indicated space of matrices is given by the relation
UO(R")
c)
UUT,
U E U(C").
We compute now the fundamental group nl(L(E)). Since m(O(R")) = %,
.l(U(C")) = Z,
we have 7rl(U(Cn)/O(Rn)) = Z. The mapping UO(W") c) detU2 from LJ(C")/O(W") into S1 C C! induces an isomorphism of fundamental groups. Thus if with every closed curve A(t) = U(t)O(IW"),t E S1 from L(C)M U(C")/O(C!") we associate the degree of the mapping t I+ detU2(t) from S1 into S1we obtain an isomorphism of nl(L(C)) onto Z. There is a simple explicit formula for this isomorphism:
1
deg(det U(-)')= xi
1 tr S'
We have introduced in the previous no the Maalov index of a closed curve, IndA(.), which is a homotopy invariant and induces a homomorphism of the fundamental group xl(L(C)) into 2. It turns out that this homomorphism coincideswith the isomorphism (14).To prove this we can compute the Maslov index and the right-hand side of (14)for an arbitrary nontrivial curve. We omit this simple calculation. Thus we have 1
IndA(.) = xi
1 tr(U(t)U(t)-l) dt,
A ( t ) = U(t)O(R").
S'
We give now another integral formula for Maslov index which does not use the representation of L(E) as a homogeneous space.The expression A(t) E TA(~,L(C) is interpreted as the quadratic form A(t) : At e ia(it,At)AtE A(t) on A(t).,The restriction of the scalar product (13)on the subspace
Symplectic Methods for Optimization and Control
53
A(t) determines a representation of this form as
A(t)(X)= (Qh(t)X l X), where Q A ( ~ :) A(t)
X E Nt),
+ A(t) is a symmetric operator. We have 2
IndA(-)= lr
tr Qh(t)dt. S‘
Since the tangent space TAL(C)is identified with the space of linear symmetric operators Q : A + A a Riemannian structure is defined n
( Q 1 1 Q 2 ) e t r ( Q 1 9(2Q)I, Q ) = t r Q 2 = C a j ( Q ) ’ , j=1
where a1(Q), . . . , a n ( & ) are the eigenvalues of the linear operator Q : A + A. Let l(A(.)) be the length of the curve A(.). Then
Monotone nondecreasing curves in L(C)are characterized by the condition a j ( Q ~ ( ~2)0,) j = l , ,. . ,n. For nonnegative aj the inequality n
I
n
n
holds. Hence for a nondecreasing curve A(-) the inequalities
are obtained. 6, Concluding thus survey of the geometry of Lagrange Grassmannians we shall prove a multidimensional generalization of Sturm’s classical theorems about the zeros of the solutions of differential equations of second order,
Proposition 5. Let A(t), to 5 7 5 t l , be a continuous curve in L@), (not necessarily closed), and suppose that AI, A2 E.L(C)satisfy the rela-
Agrachev and Gamkrelidze
54
tions hi n A(t0) = Ai n A(t1) = 0 , i = 1,2. Then
In(-)MA, - A(*) MA?I5 n. *
Proof, Since A? isarcwise connected we can complement the curve h(.)to a closed curve joining A(t1) and A(to) by a curve A C A?. Then
a(.),
-
h(.) M A , = A(.) Mhl = Indx(.) MA? a
= h ( u ) . M ~+, A *M A , . At the same time
1 A * M A ,= 5(p(Al,Az,A(t1)) - p(Al,Az,A(to))). Hence IA MA?I 5 n. Note that we not only estimated the difference of intersection indices, but expressed this difference explicitly as a function of the endpoints of h ( * ) .
Corollary. If A(.) is a monotone curve, nondecreasing or nonincreasing, then the differencebetweenthenumbers of its intersection points with Mal and M A , is not greater then n. Indeed, every intersection point of a monotone curve with MA* gives an increment into the index of the same sign. Proposition 6. Let PT E Sp(C), 0 5 T 5 t , be a continuous curue in Sp(C), PO = id, andsuppose ho,Ab E L@). Put A(T) = P&, A’(T) = P&. Then for ‘#h1 E L(C)the inequality (14)
I N . ) M A , - A’(*) - Mhl l 5 n
holds. Proof. We join h0 and Ab with a continuous curve A and construct a curvilinear quadrangle The closed curve A(.) o PtA o h’(-)-1 o A-l, which is obtained by a successive passage of h(.),Pt A, 7 H h‘(t - T ) , and the reversed curve of A, is contractible. Indeed, the homotopy ( 0 , A ) H P(i-,qJ, 0 C [0,l],
Symplectic Methods for Optimization andControl
55
Fig. 3
A E L(C),contracts the considered closed curve onto the curve A o A-l. Hence
At the same time PtA M A , = A M p;1Al and hence, according to Proposition 5 , ( A M p , - l A , - A Mhl I 5 n. Remark. Suppose P, is an absolutely continuous curve. Then P, can be represented as a flow generated by a nonstationary linear Hamiltonian system in C:
a
“ P , z = &(P,Z), z E C, 87 where is a linear Hamiltonian field on C associated with the quadratic Hamiltonian Q,. If q, is nonnegative (nonpositive) then for VA E L(C)the curve r r-) PTAis monotone nondecreasing (nonincreasing), and the inter-
a
section index in (14) could be substituted by the number of the intersection points.
Theorem 1. Let qr, h, be quadratic nonstationary Hamiltonians on C , where h, 2 0, 0 5 r 5 t , and let PT, FT E Sp(C) be linear Hamiltonian flows on C , generated by the Hamiltonian fields 4 , q, 6,:
+
a
-P, 87
= &P,,
= (gr
+ &)F,,
-
Po = Po = id.
Agrachev and Gamkrelidze
56
Finally, let A ( T ) , X ( T ) be trajectories of the Corresponding flows o n L(C): A ( r ) = P,A(O),
X ( T ) = FrX(0), 0 5 T 5 t.
Then forVA1 E L , which is, transversal to the endpoints of the curves A(.), X(.),the inequality A(-) M A , - n 5
x(.) Mhl'
is valid. Proof. The variation formula, given in Introduction, implies that Ft = PrR,, where R, is a flow corresponding to thenonnegative nonstationary Hamiltonian r, : z I-+ h,(P,z),z E C. In other words,
8 " R ,= ?,R,, 8r
0 5r
5 t,
=id.
Consider the monotone nondecreasing curve A(T)= R,X(O) and the curve A'(T) = P,A(t), 0 5 T 5 t , in L(C).The identity p, e P,R, implies that the closed curve A(.) o A'(.) o @ - ) - l is contractible. Hence Ind(A o A' o
X-') = 0 , X
M A , = A M A , + A' MA,. m
At the same time, A - M A ,2 0, and,according to Proposition 6, A'sMA,1 A - Mhl - n.
5. The Index of the Second Variation and the Maslov Index 1. We shall consider l-cocycles on manifolds with half-integer values, and start with definitions. We call a real-valued l-cochain on the manifold M an arbitrary real function c defined on the set of 4 continuous curves on M subject to the condition
C(Y I [ t o , t , ] ) = C(Y I [ t o , t ] ) + 47 I[t,tl])
vt E (tO,,tl),
where y : [to,tl] + M is an arbitrary continuous curve on M . A l-cochain is a l-cocycle if for every 2 E M there exists a function d, : 0, + R, defined on a neighbourhood 0, of 2,such that C(?) = d , ( y ( t 1 )-) d, ( $ t o ) )
Symplectic Methods for Optimization and Control
57
for Vy : [to, tl] + 0,.Every l-cocycle vanishes on all singular cycles, homologous to zero, (in particular, on all contractible closed curves.) A l-cocycle is called ezact or cohomologous to zem if there exists a function d : M + B such that 47) = d(r(t1)) - d(y(t0))
forevery continuous curve y : [to,t13 + M., Evidently, exact cocycles vanish on every closed curve. Two cocycles are cohomologow if their difference is exact. Suppose ll E L(C). We define a cocycle Indn on L(C) such that Indn h(.) = A(.) Mn foreverycurve in L(C)with endpoints outside on Mn,in particular, Indn A ( - )= Indh(.) for every closed curve h(.).It is sufficient to define Indn on parts of the curve, which belong to some coordinate neighbourhoods. Let A ( , ) be an arbitrary continuous curve in L(C)and suppose that h(t)E Amfor to 5 t 5 tl, for some A E L(C).Put (1)
1 Indn(A I[to,tl1) = s ( p ( A , n ,A(to))- p ( A , n ,A(t1)).
We have to prove that the right-hand side of (1) does not depend on the choice of A. Using the chain rule (4.10) we obtain (2)
1 Indn(h I [ t 0 , t l ] )= ;Z(PP,WO),"tl))
- P(A,h(tO>,Ntl)).
Hence it is remained only to prove that p(A, A(to),h(t1)) does not depend on the choice of A. Let A(t) E AmnA'*, to 5 t 5 tl. The chain rule (4.10) and the equality (4.12) imply
Thus formulas (1)-(2) define a cocycle Indn on L(C)correctly. Formula (4.12) .implies that if the ends of the curve A(.) are outside of Mn then Indn h(.)coincides with the intersection index A(.) Mn.The cocycles Indn are cohomologous for different n E L(C).Indeed, the equality (2) implies that
Agrachev and Gamkrelidze
58
1
= -(P(nl,II12,A(tl)) 2
- P(nl,n2,qto)).
Let ( s , t ) c) A B ( t ) ,S E [0,l],t E [ C Z , , ~be ] a homotopy of continuous curves lII)do not depend on in L ( C ) , where dim(A,(cu) n n) and dim(A,(p) r 8 E [0, l].It is easy to show that in this case Indn A B ( - is ) also independent of S. 2. Vector bundle is called symplectic if every fibre carries a symplectic
structure continuously depending on the fibre. A subbundle of a vector bundle is calledLagrangian if its fibres are Lagrangian subspaces. Cocycles on a Lagrange Grassmannian, considered in the previous section, permit to assign to every pair of Lagrangian subbundles a “characteristic l-cocycle” in the base. We shall consider here the most important particular case of this procedure: we describe the Maslov cocycleof a Lagrangian immersion. Let C be a smooth manifold, dimC = dim M , and 0 : C + T * M be a Lagrangian immersion. For I E C we denote
C, = T@(,)(T*M),n, = % ( z ) ( q M ( * ( z ) ) M ) , A z = @ * ( W ) .
I ,is the tangent Then C, is a symplectic space, TI,, A, E L(&), where I space at $(I) to the fibre in T * M , and Az is the tangent space to @(C) at the same point. Let I(.) be a continuous curve in C, and [to,t~] is a segment in the domain of definition of the curve such that there exists a continuous family of Lagrangian subspaces At E C(X,(t)), to 5 t 5 tl, satisfying conditions (4)
At n nz(t) = At n A,(t) = 0,
vt E [to,h ] .
It is clear that for sufficiently small segments such families exist. Put (5)
1
m(4[to,t1])
= ~ ( P ( A t o , n z ( t o ) ! ~ , , o-) CL(Atl,~I,(tl),n,,tl,>>.
The expression ( 5 ) does not depend on the choice of the family At. Indeed, the symplectic group acts transitively on pairs of transversal Lagrangian
Symplectic Methods for Optimization and Control
59
subspaces, hence there exists a continuous family of symplectic mappings At : &(t) + & ( t o ) ) such that AtA,(t) = Az(o)l At&(t) = &(o)- Therefore, p(AtlJL(t),L(t))= cl(Ato,~,(t0),Ath,(t)). Hence the right-hand side of (5) coincides with Indn,,,,,(Ah, I[to,tll). Suppose that A: E is another family of Lagrangian subsp& ces satisfying conditions (4), and suppose A: : &(t) + C,(to) is a family of symplectic mappings such that A: A;,,, = Al,,,), A;II,(t) = ll,(o). Then Ai = PtAt, where P ( t ) E Sp(C,(o)), P(0) = id, P(t)lI,(o) = II,(o),
5 t 5 t l . Furthermore1 P ( A 3 - L ( t ) A ( t ) ) = cl(Ab,~,(o)lP(t)Ath,(t))l therefore, ifwe substitute in ( 5 ) the family At by the family Ai, we obtain for the corresponding index the expression Indn,(to,( P ( - ) A . A , to
I[to,tl~).
Put A"(t) = P(st)AtA,(t),S E [0,l],t E [to,tl].Since the transform& tion P ( T )preserves II,(o) V r the value of the expression dim(h"(t) n &,) does not depend on S. Hence Indn,(tl A"(.) also does not depend on S. In particular,
Thus the right-hand side of (5) does not depend on the choice of A, and hence defines a l-cocycle mip on C mip is called the Maslov cocycle of the Lagrangian immersion 9 : C + T " M . Consider the composition ~ ~ of the 0 immersion 9 9 with the canonical projection of T * M on M . The critical points of the mapping ?TM o 9 are those z E C for which h, n I f ,# 0. For a generic immersion the set
is a pseudomanifold of codimension 1 in C with a natural coorientation, cf. [5]. In this case the intersection index of the hypersurface (6) with an arbitrary curve in C, with endpoints outside of (6), is correctly defined. This index is called the Maslov-Arnold cocycle. The results of nO1imply that it coincides with mip, At the same time, the cocycle ma isalways defined - for every immersion and every continuous curve, and is compu-
Agrachev and Gamkrelidze
60
ted according to simple explicit formulas, which do not require either the general position considerations nor the location of intersection points.
S. We shall pass now from arbitrary Lagrangian immersions to manifolds of Lagrangian points described in $2. Let U, M be smooth manifolds, f : U + M be a smooth mapping and cp : U + W be a real-valued function. Suppose that F = (p,f) : U + W x M is a Morse mapping. Let C,,,be the manifold of (normed) normal Lagrangian points and fc : C,,,+ T*M be the corresponding Lagrangian immersion, cf. $2, n05. Furthermore, let (X,U) E W = (-1, X) E W 93 TicU,M.We remind that wF,h denotes the Hessian of the mapping F at the Lagrangian point (W, U). Thus wF,h is a real-valued quadratic form on the space ker F: = kerf:.
Theorem 1. Let ~ ( t=) (Xt,ut) E C,+,,t E [0, l] be anarbitrary continuous curve in Cj,,and wt = (-1, X,). Then (7)
sgn(w&)
- sgn(woF,h,) = 2qJc(T(*)).
In other words, the Maslovcocycle of the Lagrangian immersion f c : C,,,+ T*M is the coboundary of the function ( ) ( , U ) e sgn((-1, X)F,h) on Cj,,with half-integer values. Note that the right-hand side of (7) depends only on the dim M dimensional manifold of normal Lagrangian points, at the same time the left-hand side contains the signatures of quadratic forms, defined on spaces of dimensions not less than dim U - dim M . In the extrema1 problemsdimM represents the “number of relations”, dimU is the number of “variables”, and the quadratic form wF,h is the “second variation of the functional”. Usually the number of variables .is much greater than the number of relations, therefore Theorem 1 gives an effective method to compute the signature of the second variation. We considered the finite-dimensionalcase as,apreparation to the study of the .optimal control problem, cf. $3, in which U is substituted by the infinite-dimensional space of admissible controls. Therefore it is reasonable to rewrite the equation (7) in a form which is meaningful,at least,formally, in case of an infinite-dimensional U & well. We have to rewrite only the left-hand side of (7) since right side contains the Maslov cocycle of the
4
Symplectic Methods for Optimization and Control
61
Lagrangian immersion into T ' M , which arise in optimal control problems naturally, as it was shown in $3. Let q be a quadratic form ona linear space E.Positive (negative)index of inertia of q is given by the expression ind+ q = sup{dim E' I E' C E, q [E'
> 0), (ind- q = sup{dimE' I E' C E, q [ E ! < 0)), hence sgn q = ind+ q - ind- q. If E is infinite-dimensional, ker q is finitedimensional, then ind' q or ind' q is infinite and the expression for sgn q has no sense. At the same time, the left-hand side of (7)is meaningful if at least one of the indices ind+ q, ind- q is finite. Corollary. Under the conditions of Theorem 1 we have (8)
(inds(WIF,h,) + -1 dimkerWlF,h, - -21 corank fAl 2
- (ind+(WoF,h,) + 51 dim
l kerwoF,h,- 2 corank
fA0)
= rnf.
( ~ ( e ) ) .
Returning to the optimal control problem of 53 we suppose that theHamiltonian (3.3) is smooth and satisfies to the conditions of Proposition 3.2 for CO = TioM.The Lagrangian immersion in Proposition 3.2
( t ,X)
C)
p ( t , X),
t > 0,
X E to nH-~(o)
substitutes in this case the immersion fc. Let X0 E COn H-l(O). If we put ~ ( 7 = ) (T,XO), 7 > 0, then the equality (8) turns into a direct generalization of the classical Morse formula stating that the increment of the inertia index of the second variation of a regular integral functional along an extremal equals to the sum of multiplicities of conjugate points of the Jacobi equation. Indeed, the curve X, = p(7,Xo) is a Pontryagin extremal. The solutions of the Jacobi equation along this extremal are the curves < =,( & p ( ~ X)IA-Ao)<~, , < E, TA,(T*M).The time-instant t > 0 is conjugate to zero forthe Jacobi equation, if there exists a nonzero solution t7.of the equation satisfying conditions Hio& = 0, TM,
62
Gamkrelidze
Agrachev and
solutions. Employing notations of n02 for the considered Lagrangian immersion we obtain that the multiplicity of a conjugate point t is equal to dim(A,(t) n This is equal tothe contribution of the point y ( t ) into the intersection index of the curve y(.) with the hypersurface ((774 E
h.x .CO I H ( P ( T , X))
= 0, A(,,A)
n rI(,,X) # 01,
in case when the intersection point is isolated and the intersection has the positive sign. In the regular case of a variational problem the intersection points, i.e. the conjugate points of the Jacobi equation, areindeed isolated, but intersections take negative sign ifwe adopt the orientations of this paper. Hence the sum of multiplicities of the conjugate points is equal -m,y(.). In more general cases the conjugate points are not necessarily isolated, but the explicit formulas for the Maslovcocyclegivenin this section permit to ignore this phenomenon. In addition, the Maslov index can be computed along any curve, and not necessarily along extremals. Concluding this section we note that if the index of the second variation is infinite, formula (8) read from right to left, defines the “increment of the index along a curve”, which turns out tobe quite useful for developing of:the analogs of the Morse theory for the indefinite functionals. . ,
6. Bang-Bang Extremals
1. We return to the Optimal Control problem of 53 and consider the case when the Maximum principle generates a nonsmooth Hamiltonian. The problem of description of (possibly nonsmooth) Lagrangian submanifolds contained in the level sets of such Hamiltonians should play a central role for the optimal synthesis. Here will be given some preliminary results in this direction, more detailed results will be published elsewhere. We start with bang-bang controls. Let V(r) = vi, ~ i - 1 < T < ri, i = 1,. . . , l , where 0 = TO < TI < . . . < 71, V(r) = v1+1 for r > 71. Suppose g(z0, VO) # 0 and let Z(T) = z ( T ; ~ ( . )be) a trajectory of (3.1) corresponding to the control V(.). Suppose E Ti(,)M,0 5 T 5 tl, A, # for T # T ‘ , is a normal Pontryagin extremal, corresponding to
x+,
xT
Symplectic Methods for Control Optimization and
63
the control C(.). Thus i T , O 5 T 5 t l , is a trajectory of the Hamiltonian system in T * M defined by the Hamiltonian
Furthermore, 0 = hT(x7)= H&),
0 5T
5 tl,
where, according to (3.3), we have
~ ( 5= ~max(iTg(Z(T),u) ) - go(Z(T),u)). UEV Since the control C(-)is piece-wiseconstant the Hamiltonian h,(X) depends . on T alsopiece-wise constantly: hT(X)= hi(A), 76-1 < 7 < ~ i Denote 8 = (e', , , .,e l ) , where gi are real numbers, eo = 0. If the vector 8 is sufficientlyclose to T = (71,. , . ,n) then the control
is well defined. The control ve(-)defines a Hamiltonian
h!(X) = hi(X) for
t < ei, i = 1, .. . ,l.
In particular, ht(X) = h: (A). Put
The mapping F is a Lipschitz mapping. Furthermore, it is smooth in the region
v q t , e ) l t # e i , ea-1 <ei, i = 1 , . . . , l ) . Proposition 1. Let ( t ,8 ) E V ,t E (&-l,&),A t E T;(,,,$V. Then (X!, t , 8 ) E Cf,cpif hj(X&)= 0 , j = 1 , . . . ,i, where T I+ X$ is a trajectory of the Hamiltonian system defined by the Hamiltonian h!,
Agrachev and Gamkrelidze
64
Proposition 2. Let t E (ri,Ti+l). The restriction of the mapping F to some neighbourhood of the point ( t , T ) is .a Morse mapping if
.
{hj+l,hj}(X,) # 0, j = 1,. . ,i. Propositions 1,2 together with the Proposition 2.5 imply the following Theorem 1. suppose that H(X)= m={hi(X)
I i = 1, . . , l }
xr,
for all X E T * M suficiently close to the points of the curve 0 5 r 5 ti. Let, furthermore, hj(&) < 0 for ~ i - 1 < r < ~ i j, # i; hj(1,) < 0 for j # i , i + 1, and {hi+l, hi}(iri)# 0, i = 1,. , l . Then the% exists a neighbourhood of 10 5 T 5 t l } in T * M such that the set
{xr
(1)
..
{X,“ E 0 I H(X:) = 0, 0 5 7 5 t }
is a Lagrangiansubmanifold in T * M . (We supposehere, as well as in Proposition 1, that &X$ = @(X:).) According to Theorem 3.1 the trajectory Z(r),r E [0,t]is locally optimal if the submanifold (1) is “well-projected” onto M. To construct the manifold (1)is muchharder than toproof its existence. Neither is it simple to investigate how it is projected onto M . In case of a smooth Lagrangian submanifold one can avoid these difficulties by the following assertion:
‘If the Maslov cocycle vanishes on every part of a given curve then some neighbourhood of this curve is well projected onto M . At the same time, to compute the value of the Maslov cocycle it is sufficient to know the tangent spaces to the Lagrangian submanifold only at several points of the curve. Since the Lagrangian submanifold (1) is a Lipschitz manifold, (not everywhere smooth,) hence the Maslov cocycle is not defined on an arbitrary curve as the tangent spaces does not exist at every point of the curve. Nevertheless, it turns out that thecorresponding function with half-integer values can be defined in a natural way, at least for the extremals X.: In accordance with the results of $5 the doubled value of this function on the extremal X$, r E [to,tl],coincides with the diffe-
Symplectic Methods forControl Optimization and
65
rence of the signatures of the Hessians of the mapping F in the Lagrangian points Al! , (tl ,e) and ,X,! (to,0). 2. We shall need a modification of the Maslov index of a triple of
Lagrangian subspaces. Suppose Ai E L@), i = 0,1,2. Put (2)
1 indA,(h, h ) = ~ ( P ( A Oh , , A d
- dim(A1 n h ) + n),
where dimC = 2n. According to the definition indA, (AI, ha) is a symplectic invariant of the triple of Lagrangian subspaces with half-integer values.One can show, cf. [2], 53, that the invariant (2) is nonnegative and satisfies the triangle inequality: indA,(hl,Aa)
I i n d ~ ~ ( A l , A+ind~,,(A,,A,) 2)
VAi E L(C),i = 0,1,2,3.
A continuous curve A(T) E L @ ) , to I t l , is called simple if 3A such that A(T)n A = 0 VT E [to,t l ] .
E
L(C)
Proposition 3 ([2], 53). I f h ( ~ )T ,E [to,tl],is a simple nondecreasing curve in L(C)and ll E L @ ) , then Indn(A(*))= indII(h(to),4 ) ) . In particular, the value of the cocycle Indn on the simple nondecreasing curve depends only on II and the endpoints of the curve, cf. the expression (5.1) valid for an arbitrary simple curve. Now under the assumptions of Theorem 1 suppose that X! belongs to the manifold (1).If, in addition, (t,e) E D then the manifold (1) is smooth near the point A$ Let A! C TA:(T*M)be the tangent space to the manifold (1) at X!. These spaces can be described more explicitly. Let P: E Symp(T*M), T 1 0 be a piecewise smooth curvein Symp(T*M), which is smooth on the intervals (di-l,ei), i = 1,. . , l , (01, +W) and defined by conditions
.
Agrachev and Gamkrelidze
66
The subspaces A! are uniquely determined by the conditions e
A-0
= I~A;
(4)
where I I A = spaces given in (4) can be considered as a linearization of the method of characteristics described in Proposition 3.2. Note that P,*IIA= I I I ~ , ( A ) "7,X. Hence the Mwlov cocycle vanishes on the curves Ase l(e,,ei+l), i.e. rn(Ame[(ei,ei+L)) = 0, i'= 0,1,. . .. Define
m(ASe I [ t 0 , t l ) )=
(5)
c
indn,& (A!i-o,A;i+o),
Vt0,tl.
togi
In other words, to find the value of m on the curve 7 c) A! we must first paste the discontinuities of the curve of tangent spaces r ++ A$ with simple nondecreasing curves in Lagrange Grassmannians and then compute the Maslov cocycle in the usual way. Proposition 4. Let A t be contained in the manifold (l),and suppose l3 n A t o = 0. ( W e use here and in the sequel the abbreviated notation Il = ISx!). Then Hessian of F at the Lagrangian point A!, (t,O), is nondegenerate and its positive index of inertia is computed according to the formula
where
W
= (-I.,X!).
Corollary. Undertheassumptions of Proposition 4 and p&tided rn(h.' Ip,t)) > thecontrol ve isnot locally optimal.
Since dim(A;,-o r lA:,+o) = n - 1 every summand in the right side of (2) is 0, or or 1. At the same time indn(A;i-o,A$$+o) = iff dim(n n # dim(Il fl AZG+o).Note also that
4,
dim(A: n Il)2 n - i
for Oi-l
< 7 < Oi,
Symplectic Methods for Optimization and Control
67
and for a local injectivity of the mapping X c) n(X)of the manifold (1)into M at the point X! it is necessary that he f l ll = 0. Thus local injectivity the mapping X c) n(X), can hold only in case t > B n - l . For t 5 though not locally injective, is still "good" (in the sense of Theorem 3.1) near the point X!, if the intersection A!+, n ll has the minimal possible dimension.
Theorem 2. Suppose that under the assumptions of Theorem 1 and for every ri 5 t the relations indn (A:< -o, A:i +o) =
{ 0,
51 , O 5 i l n - l ,
izn,
hold, then there exists a neighbourhood C of the curve (1) satisfying the conditions of Theorem 3.1.
x)[O,t~ in the manifold
Remark. It is easy to show that under the condition A;-o n l 3= AT,, n IT = 0 the local injectivity of the mapping X c) .(X) of .the Lagrangian manifold (1) into M near the point & is equivalent to the equality indn(&, AT+o) = 0, Typical locations of the trajectories z(., ve) = n(Xe) for different values of indn(AT-o,A;+o) are represented on the figure.
ind 0
ind=1
Fig. 1 Standard topological considerations lead to thefollowing global version of Theorem 2. Suppose W is a closed subset in
{ ( t , e ) a +x ~ ! ~ e ~ - , < ei ~=,l , ...,I, eo=o),
Gamkrelidze
68
where ( t , e )E W
=$
Agrachev and
( ~ ~ E0W) VT E [O,t],and put
To every X f E Cw we assign a Lagrangian subspace At in T p ( T * M ) defined by the relations (3), (4). Theorem 3. Suppose for every X! E LW,t9i 5 t the following relations hold:
If C is arcwise connectedand .(LW) is simplyconnected then there exists a Lagrangian submanifold C 3 &W in T'M satisfying the conditions of Theorem 3.1.
7. Jacobi Curves 1. The results of $6 can be applied, first of all, to the case when the
I
set V of control parameters is finite. In this case the sets of admissible velocities g(z, V ) are also finite, and hence the optimal control problem is badly formulated: the set of admissible trajectories is not closed in the uniform metric, (except the trivial case when g(z,V) is a single point), though V is.compact. As is well knownthe natural completion (relaxation) of the initial system is the system with the sets of admissible controls convg(z, V ) ,z E M . All Pontryagin extremals are preserved under the relaxation, but new ones appear, the so-called singular extremals. After evident changes innotations the relaxed problem takes the form
Symplectic Methodsfor Optimization and Control
69
The functional
5c t ,r
d t , 4'))=
(2)
dT
d(T)&2(7))
0 j=O
is minimized on the level set of the mapping f : (t,v(-)) I+ z(t,v(.)).Here v(') = ( v o ( ~ ) , v l ( ~ ).,..,vr(-)) is an admissible control. The space of all admissible controls, i.e. of measurable vector-functions on R+with values in
Ar = {(vo,.
. . , v r ) l d 20,
= l},
j=O
denote by V . Every problem with right-hand side affinein controls and with a convex polyhedron as the set of control parameters can be represented (l),(2). 0 5 T 5 t l , be a normal Pontryagin extremal for the problem Let (l),(2), corresponding to an admissible control V(*) = ( V " ( , ) , ,TT(.)). Then 1, is a trajectory of the nonstationary Hamiltonian system generated by the Hamiltonian
x,,
. ..
r
&(A)
-
= c d ( T ) ( x g j ( T ( x ) ) g!(T(x))),
x E T'M.
j=O
Additionally, we have "
0 = h,(&) = H p T ) ,
0
5 7 5t,
where H ( A ) = m a z o < j < v ( X g j ( T ( X ) ) - $(.(X))). The mapping'@: JRk -+ V is called a smooth famil9 of variations of the control g(-)if there exist smooth mappings
A :J P -+ L'(&; A'), a:
+L~(R+;&), .
A(0) = V(.), a(O)(t)= 1 (t E [ O , t , ] ) ,
such that O ( z ) ( t ) = A ( z ) ( $a ( ~d )~ )Vz , E Rk,t 1 0, The dimension of the family k can be arbitrary.If 0 is a smooth family of variations of V(.) then x ( t l ) ,( t l , 0) is a normal Lagrange point of the mapping
F@ : (t,z) I+ (v@, @(z>),f ( 4 @(z)>),
t
> 0, z E JRk.
Agrachev and Gamkrelidze
70
x,,
Definition. Positive index of an extremal T E [0, t l ] ,is called the supremum of the positive indices of inertia of Hessians of the mappings F+ at theLagrange point x(t,), ( t l ,0), aver all smooth families @ of variations of the control F(.).
x.
The positive index of the extremal 1 , is denoted by ind+ It is a nolinegative integer or +m. 0 5 T 4 t l . Denote by @(V) the set of all Let Z(T) = z ( T ; V ( - ) )= Pontryagin extremals X, satisfying the relation (X.), = E ( T ) , T E [O,tl]. Then @(F) is a convex closed subset of dimension not greater than n in the space of sections of the bundle T * M ( q . ) .
..(x,),
Proposition 1. Suppose V(.) is locally optimal control in the L1-norm . = then ind+x'= of the problem (l), (2) on the segment [0,t ~ ]#*(F) 0; if @(F) is compact and dim @(F) = m > 0 , then there exists a X E @(V) such that ind+ X, 5 m - 1.
{x,}
Remark. If dim@(F)> 1 then there exist deeper conditions of optimality which take into account the dependence of ind+ X. from X, E @(V), cf. [2]. , e
2. According to a procedure already tested, the index of the extremal should be computed through the Maslov index or its corresponding generalization for the case of a possibly nonsmooth Lagrangian submanifold, which contains and is contained in HV1(0).All we need to know about the Lagrangian submanifold to compute the index is the knowledge of its tangent spaces at thepoints For the particular case of bang-bwg controls the corresponding computations were done in 56,n02. Note that for tangent Lagrangian spaces were dependent explicit formulas (6.1), (6.2) only on linearizations of Hamiltonian flows corresponding to Hamiltonians h, along the extremal. Furthermore, these formulas correctly define a family in T of Lagrangian subspaces, independently from the existence of the appropriate Lagrangian submanifold: we remind that the existence of a Lagrangian submanifold was proved under sufficiently strong conditions of regularity. We can not guarantee that an arbitrary (not necessarily a bang-bang)' extremal is contained in an appropriate Lagrangian subma-
x.
x.
x,.
x
Symplectic Methods.for Optimization and Control
71
nifold in H-' (0). But it turns out thatif ind+ 5; < +W then there exists a corresponding family of Lagrangian subspaces KT c Tx7( P M ) "tangent" to H-'(O). Construction of these subspaces and an explicit expression for ind+ 3. generalize formulas (6.3)-(6.6). Let Pt E Symp(T*M),0 5 t 5 t l , be a Hamiltonian flow defined by a nonstationary Hamiltonian E t , PO = id. Instead of constructing the Lagrangian Subspaces Et directly, we shall describe subspaces At = PG1& contained in the fixed Lagrange Grassmannian L(TA,(T*M))for Vt E [0,t l ] .Denote
- go(.(X)),
hj(X) = Agj(.(A))
X E T * M , j = 0,1,. . ., T ,
-rt = span{&(&) Ivj(t) > 0, o 5 j
5 7);
The subspace Tt is contained in Txt( T * M ) ,0 5 t 5 t l . Let 7 C [0, t l ] be the set of density points of the measurable vectorfunction 5 ( ~on ) the interval (0, t l ) , hence the set [0, t 1 ] / 7is of measure zero. Proposition 2. If ind'x, Tit( T * M )for Vt E 7 .
< +COthenisanisotropicsubspace
in
From here on we suppose that rt, t E 7 are isotropic subspaces; Put rt = P;'f;t C Tx,(T*M).Since the transformations Pt. are symplectic rt is isotropic for Vt E 7 . h r t h e r developments will take place in a fixed symplectic space C = Ti, ( T * M ) .Among the Lagrange subspaces of C a special role is plaid by the tangent spaces to the fiver at the.point TO.We shall denote it by I'I = Ti,(T&M). Let A be a Lagrange subspace, l? an isotropic subspace in C. Put
hr=(A+r)nrL=hnrL+r. It is easily seen that Ar is a Lagrange subspace in C, and the mapping A H Ar is a projection of L ( C ) on the submanifold in L ( C ) consisting of all Lagrange subspaces containing. l?. The mapping A c) Ar is discontinuous on L@), butits restriction onevery submanifold of the form {A E L ( C ) 1 dim(h n I') = const} i s smooth. '
Gamkrelidze
72
Agrachev and
We go now to the construction of Lagrangian subspaces At, We shall describe the curve t I+ At in L(C)using special piecewise-constant approximations of the curve. Let D = {TI,.. .t Q} C 7, where 7 1 < . , < T k , Tk+1 = tl. Define a piecewise-constant curveA@),, 0 5 t 5 tl, in L(C)
.
bY
and put
It is easily seenthat 0 5 r ( D ) 5 $. The expression (3) coincides with Indn of a continuous curvein L(C),which isobtained by successively connecting the values of A ( D ) .with simple nondecreaaing curves, cf.Proposition 6.3. Denote by D the directed set of finite subsets of 7 with inclusions of subsets as the partial order. Theorem 1. The following relations hold: (1)
(2)
Indn A(D1) - .(Dl) 5 Indn A(D2).- r ( D z ) , VD1 C D2 E ind+X, = supD,,(Indn A ( D ) .- r(d)).
'D.
Theorem 2. Suppose that ind+ X. < +m. Then (1) (2)
(3)
For V t E [0,tl] the limit D-limA(D)t = At existe. The curve t c) At in L(C) has at most denumerableset of points of discontinuity and for every t E (0,t ~(t] E [0,t l ) ) the limits lirn,+t-o A,, (lim,+t+o AT) e&t. The curve A. is differentiablealmosteverywhere on [O,tl], and at every point of differentiability 9 2 0. I
.
.Definition. Suppose ind+X, -c +m. The curve At in L@), which exists..accordingt o Theorem ?) is called a Jacobi curve corresponding to the extrema1
x,,
Symplectic Methods Control Optimization andfor
73
It follows from Theorem 2 that a Jacobi curve has properties similar to those of a monotone real-valued function. We already have dealt with smooth monotone curves on L@). Using the properties of the invariant (6.2), it is possible to describe a wider class of curves for which a theory could be developed verysimilar to thatof monotone real-valued functions, For example, the assertion (1) of Theorem 2 is based on a "Lagrange" analogue of compactness principle of Helly. Basic facts of this method are briefly described in [4]. We call At a Jacobi curve since it generalizes solutions of the Jacobi equation of the classical calculus of variations. Under sufficiently weak assumptions of regularity every Jacobi curve turns out to be piecewise smooth and on the intervals of differentiability it satisfies the differential equation for which the right-hand side can be explicitly expressed through I't , cf. [2], [3]. Put Kt = Pt*ht. If V(.) is a bang-bang control, i.e. if O(T) E we(7) for some 8, cf. $6, we obtain E A!. 3. All our basic constructions until now were invariant under smooth coordinate transformations in M . As a final theme we shall discuss the action on Jacobi curves of another important class of transformations of Control systems, the so-called feedback transformations. We remind the basic definition. If the inequalities w j 2 0 are omitted in (1)we come to the system T
(4)
X =C
T
dgj(x),
j=O
C W= ~ 19.
X(O) =EO,
j=O
for which the admissible velocities form affine subspaces. Let b i j ( s ) , E E M , i, j = 0,1,. . , ,T , be smooth functions on M , where CLob+j(z) E 1, j = 0,1,. . , , T , and the (1 + r ) x (1 r)-matrix b ( x ) = ( b i j ( z ) ) T , j is nondegenerate for tlx E M . Put w j = C"3 -0 b i j u j , j = 0,1, . . ,r , where the uj are considered as new control parameters, we come to the system
+
I
c
The system ( 5 ) is said to be obtainedfrom (4) by feedbacktransformation. It is clear that the admissible trajectories of systems (4) and ( 5 ) coincide, so that they are equivalent indeed. After the feedback transformation the
Agrachev and Gamkrelidze
74
functional (2) takes the form
It is easily seen that every (normal) Pontryagin extremal of the problem (4),(2) is a (normal) Pontryagin extremal of the problem (5), (6) and conversely. Let 0 5 t 5 t l , be a normal Pontryagin extremal, E(t) = n(1t). Then h j ( x t ) = hj"(&)= 0 , 0 5 t 5 t l , j = O , l , . . . , T , where hj(X) = Xgj(.rr(X)) - gO(.rr(A)),h$(X) = bijhi(X). An elementary calculation shows that K$(&) = CLobij(Z(t))hi(X). Put
x,,
-
"
r, = span{hj(Xt) 1 j
= 0,.
. .,T } = ~pm{Z;(&) I j
= 0,.
..
,TI.
We suppose that, as in n02, Ft is an isotropic subspace in T J ~ ( T * Mi.e. ), {h&}(&) = 0, 2,j = 0, * . * , T , t E [O,tl]. Suppose r
T
r
U "z = p ( t ) g j ( E )= C E j ( t )&(E)g@),
dt
j=O
j=O
j=O
and put Et = c i = o V j ( t ) h j , = C j ' = o ~ j ( t ) hLet ; , Pt,P," E Symp(T*M), t 5 t l , be Hamiltonian flows generated by nonstationary Hamiltonians -0ht5and 4 h,, PO = P: = id, and supposet'I = PG'Tt, I ' != (P,")r1Ttare isotropic subspaces in Tx,(T*M)= E,0 5 t 5 t l . Applying to the families of isotropic subspaces the same procedure as in n02 we can correspond to every finite subset D C (0, t l ) locally-constant curves A ( D ) t and A(D),bin L@). It is every probable that the following assertion holds:
If At = D-limA(D)t exists for eve4.y t E [0,t] then A! = D-limA(D),b also exists for W E [0,t ] ,and
P,*& = P&Ai, i.e. the curve
if,e' Pt.:&
t E [0,t i ] ,
is preserved by the feedback tmnsfomation.
We cannot prove the assertion in the formulated generality, but un-
Symplectic Methods for Optimization and Control
75
der some additional assumptions of regularity it is possible not only to prove the assertion, but also represent At as a solution of an explicitly written linear Hamiltonian system. We shall give here the corresponding calculations under sufficiently strong conditions of regularity.
-
Put hjt = hj o Pt, ht = ht o Pt. Then
+
Finally, let h; = hjt = { ht ,hjt }, Consider the family of (r 1) x ( r+ 1)t E [O,tl],where matrices A(t) = Ilaij(t)(J,
6
Differentiating with respect to t of the identity {hi, hj}(xt) = 0 gives the relations aaj(t) = aji(t), V i , j , A(t)T(t) 0. Hence thematrix A(t) is symmetric and of rank not greater than r. We call an extremal x t nondegenerate if rank A ( t ) = r , 0 5 t 5 t l . It is easily seen that the property of an extremal to be nondegenerate is preserved under the feedback transformations. Proposition 3.If the extremalxt is nondegenerate thenfor Qt E [0,t l ] the limit At = D-limA(D)t exists, wherethecurve t H At in L(E) is Lipschitz on the half-interval(0,t l ] ,and the curve & = Pt*At is preserved under the feedback transformations.
We call the curve At the Jacobi curve associated with the nondegenerate extremal 'T;t of the problem (4),(2). To find the Hamiltonian -system whichis satisfied by the Jacobi curve At we remark that since At is nondegenerate the kernel of the symmetric matrix A ( t ) coincides with the straight line in R'+' through the vector V(t).Hence a symmetric (T 1) x (r+ 1)-matrix d ( t ) = Ilaij(t)ll exists which satisfies the condition (d(t)A(t)w- W ) E E ( t )for Vw E R'+1.
+
Let, as above, ll = Tx,(T&M) be a Lagrangian submanifold in the symplectic space C = Tx,(T*M) with the symplectic form 6.
76 Gamkrelidze
and
Agrachev
Proposition 4. Under the assumptions of Proposition 3 we have
o
},
ZoEnrO
t E (O,tl], A0 = 17. I n other words, At = Q t I I r O , where Qt E Sp(C) is a linear Hamiltonian flow in C corresponding to the quadratic nonstationary Hamiltonian qT(Z)
C .ij(T)(dhz)Z(dhyT)Z,
l T
=5
Q0
= id.
i,j=O
Let A! be a Jacobi curve associated with the same extremal It for the problem ( 5 ) , (6), which is obtained from the problem (4), (2) by the feedback transformation. Proposition 3 implies that P !& = P&, 0 t 5 t1. Thus At = (P;’ o P,”),,A,”. Observe that (P;’oP,”),, is a symplectic transformation of C which preserves II for V t E [0,tl].Therefore the following Corollary holds.
<
Corollary. Let It be a nondegeneratePontryaginextremal and At, 0 t 5 t l , be the associated Jacobi curve. The expressions Indn(Al[r,tl), 0 < T < t < t l , are invariants of feedback transformations, as well as of
<
arbitrary smooth variable substitutions in M .
REFERENCES
[l] R.Abraham and J. Marsden, Foundations of Mechanics, 2-nd Ed., The Benjamin/Cummings Publ. Comp., London, 1978. [2] A. A. Agrachev, Quadraticmappings in geometric control theory, Itogi Nauki, Problemy Geometrii 20 (1988),VINITI, Moscow, 111205; English translation in J. Soviet Math. 51,no. 6 (1990).
Control Optimization and Symplectic for Methods
77
A. A. Agrachev and R. V. Gamkrelidze, The Morse index and the Maslov index for the extremalsof control systems, Dokl. A M . Nauk SSSR 287 (1986), 521-524; English translation in Soviet Math. Dokl., 33 (1986). A. A, Agrachev and R. V. Gamkrelidze, Symplectic geometry and necessary conditions for optimality,Matem. Sbornik 182 (1991), 3654; English translation in Math. USSR Sbornik 72 (1992). V. I. Arnold, Characteristic class entering in conditions of quantization, F'unkcjonal. Anal. i Prilozen. 1 (1967), 1-14; English translation in Functional Anal. Appl. 1 (1967). V. I. Arnold, Mathematical methods of classical mechanics, Springer, New York, 1978. V. I. Arnold, S t u r n ' s theorem and Symplectic geometry, F'unkcional. Anal. i Prilozen. 19 (1985), 1-14; English translation in Functional Anal. Appl. 19 (1985). V. I. Arnold and A. B. Givental, Symplectic geometry, Itogi Nauki, Sovr. probl. matem., Fundamentalnye napravleniya 4 (1985),VINITI, Moscow, 5-139; English translation in Encyclopaedia of Math. Scien., vol. 4, Springer. R. V. Gamkrelidze, Pt-incipples of optimal control,Plenum Press,New York, 1978. V. Guillemin and S.Sternberg, Geometric asymptotics, Amer. Math. Soc., Providence, R.I., 1977. G. Lion and M. Vergne, The Weil representation, Maslov index and theta series, Birkhauser, Boston, Mass., 1980. H. Sussmann, Ed., Nonlinearcontrollabilityandoptimalcontrol, Marcel Dekker, New York, 1990. A. Weinstein, Lectures on symplectic manifolds, Amer. Math. Soc., Providence, R.I., 1977. M. I. Zelikin and V. F.Borisov, Optimal synthesis containing chattering arcs and singular arcs of the second order, in Nonlinear Synthesis. Proc. IIASA Workshop, Sopron, Hungary, June 1989 (1991), Birkhauser, Boston, Mass..
2 Singular Trajectories, Feedback Equivalence and the Time Optimal Control Problem B. Bonnard Universit6 de Bourgogne, Laboratoire de Topologie - CNRS UMR 5584, 9, Avenue Alain Savary - B.P. 400, 21011 Dijon Cedex, France
Abstract. In this article we recall the equivalence between the feedback classification and the time optimal control problem, for affine control systems. Using examples we show how this identification is useful in computing feedback invariants, Conversely, one can use the action of the feedback group to analyze the time optimal control problem, for instance constructing normal forms to evaluate the accessibility set in a neighborhood of a reference trajectory.
1. Introduction Let X,Y1,. . . ,Y, be analytic vector fields on Rn,we consider systems of the form
dx
(1)
= X(x(t)) + Y(x(t))u(t) dt Rm, Y = (Y1,. . . ,Y,), they are
where x E R", U E called @ne. Let (X, and (X', Y) beY') two affinesystems These systems are called feedback equivalent if there exists a C" diffeomorphism cp of R" and a feedback U = .(x) + @(x)u', where Q! E Cc.(Rn,Rn), p E Cu(Rn,GL(rn,R)) such 79
Bonnanl
80
that
(9
x'= cp*x+ cp*Y.a,
(ii)
Y' = (p*y.p,
where cp acts ona vector field Z according to the rule,p*Z = dcp-l(Z 09). This action defines a group stucture on' the set of triplets ((p,a,p ) and this group, called the (affine) feedback group, is denoted by G). The set Am of affine m-inputs systems, endowed with the G) action, defines a geometry whose understanding is a key problem in system theory.For instance the controllability and the stabilizability properties are invariant properties. It is a generalization of the well-understood problem of classification of linear systems with respect to linear changes of coordinates and linear feedback [6]. Also it can be extended to analyze more general = f(z,U ) , localized around an equilibrium control systems of the form point or a reference trajectory, but in this article everything will be kept at the heuristic level. The first objective of this article is to relate the feedback classification problem with a classification problem of a family of constrained Hamiltonian vector fields.Let us briefly explain how. Assumethat (1)is initialized be the trajectory corresponding to a bounded by z(0) = zo and let zu(.) measurable input U ( . ) . Endow the set of inputs with the L"-norm topology and consider the input/state mapping U ( . ) + z(.,u).This mapping has singularities which are called singular trajectories. One can show that they are parametrized by the Maximum Principle (in brief PMP),applied to the time optimal control problem of system (l),as follows, Introduce the hamiltonian H , definedby H ( z , p , u ) = ( p , X ( z ) Y ( s ) u ) ,where p EJ P \ IO}, U E Rm and ( , ) denotes the standard inner product. The singular trajectories are the projection on the z-space of the solutions of the constrained Hamiltonian equation
+
where E1 is the surface { ( x , p ) ;(@, Y (z)) = 0). This equation encodes two objects, the surface E1 and the solutions that stay in El,One can prove
Singular Trqjectories Time the and Optimal Control Problem
81
that they are the solutions of a vector field 2 defined on a maximal subset C1 of Cl. Moreover one can show that a singularity of the input/state mapping is feedback invariant. Now, let us define the action of G; on (2) as follows: (p,a,P) E G; acts on (Cl,2) as the symplectic diffeomorphism p' given by z = p(g) and p = q( where p , q are row vectors. Now, in [4] we prove that under mild assumptions, two systems (1) are feedback equivalent if and only if if their associated equations (2) are G; equivalent.
S),
The objective of this article is to indicate how to apply this connection. It is organized as follows. In Section 2, we recall some well-known results concerning singular trajectories. In Section 3, we present the results from [5]. Using the action of the feedback group, we compute semi-normal forms in a neighborhood of a given singular trajectory. This allows to evaluate the accessibility set and toanalyze the optimality problem. In particular, it will clarify the problem of computing conjugate points in the time optimal control problem. Conversely, in Section 4,feedback invariants are computed and geometrically interpreted as invariants of the time optimal control problem. We shall indicate how to proceed for systems in small dimensions because to write a complete dictionary is an inextricable problem. Ourworkis related to similar efforts in understanding (locally) the feedback equivalence problem. In particular, we must point out the following contributions. Earliest works deal with the problem of classifying distributions: Darboux [12], Gardner [g]. This problem is also connected to sub-Riemannian geometry. Singularity theory was used by Jakubczyk in [l31 to study the feedback classification problem. See also the contribution from Respondek [l91 to this volume. Elie Cartan's moving frame and equivalence method was applied by Gardner and Shadwick in several publications to classify systems, see for instance [lo]. A variant ofthis approach was developped by I. Kupka in [15]. The connection between the feedback classification problem and the time optimal control problem was exploited by Jakubczyk in [l41 to compute feedback invariants. This approach is close to our point of view, although it is exploited for a different class of systems for understanding the (local) problem.
Bonnard
82
2. Preliminaries
2.1. Definitions. Consider a system on Rn of the form
where U(.) is a bounded measurable map from an interval [O,T] into R"' and f is an analytic mapping from R" x Rm into Rn. Let zo E Rn,and denote by z(t,2 0 ,U ) (briefly z ( t ) )the solution of (3) corresponding to U ( . ) such that z(0) = 50. Fix zo E Rn,T > 0 and let U( ,) be a control defined on [O,T] such that x(.,Q, U ) is defined on the whole interval [O,T]. Let us endow the set of controls defined on [O,T] with the Loo-norm i.e., [ U [ = SUP,,[,,,^ Iu(t)l. Let EXOlT denote the input/state mapping defined so,W). The control U and the on a neighborhood V of U by W E V + s(T, corresponding trajectory are called singular on [O,T]if the Frkchet derivative of ExOtT,denoted by dExolT and evaluated at U is not of full rank. 2.2. Taylor expansion of
The input/state mapping is in fact C" and its Taylor series can be computed as follows. Let us write x(.) v(.) the trajectory corresponding to the control U(.) W(.). Developping f we have: EXOlT.
EXOlT
+
+
f(z+ v, U + v) = f(z,U ) + fac (5,U)V
+ fu (2,U)V + f x u (2,U ) ( % v) + 1/2facx(x,U)(Y, V) + 1 / 2 f u u ( S , U ) ( V , W ) + * * *
+ +
Since s(0) y(0) = z(0) = 50, we have y(0) = 0. Morerover y can be written Slz 62s . ., where 61s is linear in W , 62s is quadratic etc., and using the constraint 9 = f(s,U ) we get
+.
j . + ~ , s82x+ + +...=f(x+b l s + b z s +
...,U+W).
Identifying we get &z = f X ( 2 , (4)
822
U)&Z
= fx(.,U)SZa:
+ fu(z,u)v + fzu(xC1 4 (
h v )
+ 1/2fxac(z,U)(dls, 61s) + l/2fuu(Z,~)(~,W)r which are integrated with the initial conditions b 1 s ( 0 ) = 62s(O) = 0.
Singular Trajectoriesand the Time Optimal Control Problem
83
Let us set
4 4 = fx(x(t,xo,U ) , 4 t ) ) ,
= f d d t ,S O , U ) ,
and let M ( t ) be the n x n matrix solution of
h ( t )= A ( t ) M ( t ) , M ( 0 ) = identity. &om (4),the R4chet derivative is T
(5)
5
dEXogT(v) = M ( T ) M-'(t)B(t)v(t)dt. 0
2.3. Proposition. The following assertions are equivalent: (i) ( x ( . ) ,U ( . ) ) is singular on [ & T I ; (ii) Thelinearsystem = f x ( x , u ) & z + f u (u)v x , is notcontmllable (iii) (S(.),U ( , ) ) is theprojectiononthe space ( s , u ) of a solution (S(.),p ( .),U ( .)) of the constrained Hamiltonian equation
dx = H p ,
dP = -Hx, H, = 0 dt dt where the Hamiltonian is H(x,p,U ) = ( p , f ( x ,U ) ) , and p is non zero.
Proof. By definition we have T
dim
{ 5 M-l(t)B(t)v(t)
dt;v E LW} < n.
0
Hence there exists a non zero row vector p' such that p'M-l(t)B(t) = 0 holds for almost every t. Let p ( t ) be p'M-l(t)B(t).Hence we have that $f = -Hx, and p ( t ) B ( t )= 0 almost everywhere. 2.4. Singular Trajectories and Time Optimal Control. Let us write the Maximum Principle (PMP)for the problem of controlling (3) in minimum time. We get that if (S*(.),U * ( . ) ) is optimal on [0,t*]then there exists p * ( . ) E Rn \ (0) such that a.e.
(8)
H ( s * , ~ * , u *=) ' M a u e ~ mH ( E * , ~ * , u )
H(s* , p * ,U * ) >, 0 and constant.
Bonnard
84
Equation ( 8 ) implies H, = 0. Hence an optimal trajectory is singular. 2.5. Definitions. Using the previous remark a singular trajectory
(x(.), U ( . ) ) will be also called an eztremal and a triplet (x(,),p(.),U ( . ) , ) , an extremal lift. The surface C1 : H, = 0 is called the constraints set. An extremal lift is called totally singular if HuU(x(t),p ( t ) , u ( t ) )= 0. If we are given an extremal (I(.), U ( . ) ) on [O,T],the order of abnormality k ( t ) 2 1, for t E ]O,T],is the codimension of the image of LW by dEzolt. Let p' E Wn \ (0) such that (p', dE"o**(v)) = OVv E LW.The second order intrinsic derivative is the mapping v + (p',& x ( t ) ) ,where v E Ker dExolT and 62 is defined by (4).
2
= f(x,U ) and = g(y, U ) are called feedback equivalent if they are related by a transformation of the form I = cp(g), U = $(g, v ) where the maps y + cp(y) and v + $(., v ) are P-diffeomorphism. These transformations define a group action on the set of systems and the associated group Gf is called the feedback group. The group G; defined in the introduction is its natural restriction to the class of affine control systems (1). 2.6. Feedback group. Two systems (3)
2.7. Computations of Singular Trajectories. To compute explicitly we have to solve equations (6) i.e., H, = 0, takinginto account = Hp, = -H,. The implicit function the differential constraints
9
3
g
theorem tells us that if the Hessian matrix restricted to H, = 0 is of full rank, then a singular control can be locally computed aa a mapping ( q p ) + .^(.,p). This rank is a feedback invariant. For affine systems a2H/au2 = Q and geometrically it means that at each point x, the space {f(x,U ) ; U E Wm } is an affine space in ?Rn.From now on we shall restrict our attention to &ne systems (l),and moreover we shall consider only the single input case: U E W,see [4]for a more complete study. We shall prove that in order to compute the singular controls, one needs only to solve linear equations with respect to U and as quoted in [l41 the main difference between the affine and non &ne case is the numbers of solutions of H, = 0. This multiplicity generates the complexity of the behaviors of
Singular Trajectories Time the and Optimal Control
Problem
85
extremals for the time optimal control problem and it is fully exploited in [l41 to compute a complete set of feedbkk invariants. 2.8. Notations and Definitions. From now on we shall consider
only single input affine systems ,(l),written as (X, Y) , where X and Y are vector fields and let A be the set of such systems. We have H(x,p,U ) = ( p , X+uY) and the constraintsset H, = 0 is identifiedwith C1 = { ( % , p )E Itz"; (p,Y(z)) = 0). The Lie bracket of two vector fields V, W is computed with the convention [V,W] = W,V - V,W and we set ZZ=, { ( % , p ) E Cl;(p,[X, Y](x)) = 0) and S = { ( % , p ) ;< p , [Y, [X, Y]](s) >= 0). Let G be the restriction of (x,p)-+ 1to &\S and let fT = (p,X+W). An extremal lift is said of minimal order if it is contained in \ S. 2.9. Proposition. The extremals 1i.s of minimal order
are the solu-
tions of:
contained in &\S, the singular control being definedby u(t)=G(z(t),p(t)).
2
Proof. An extremal lift (x,p,u) is a solution of = HP' L!?dt = -H, satisfying (p(t),Y(z(t))) = 0, Vt. Differentiating twice this equation with respect to t, we get:
+ 4w 1x3YIl(z:(t)))= 0.
( p ( t ) ,[X, [X, Yll(s(t))
Hence, the extremal is contained in E2 and one can define the singular control in CZ \'S by the dynamic feedback:
= ';i(z(t>,p(t)>. On the other hand in C2 \ S we have gp= Hp and H, = H,.
+ (p,GP(q p)Y) = Hp
A
2.10. Remark and Notations. On C2 \ S, a singular control U(.) is uniquely definedby the dynamic feedback G. The extremal lifts of minimal order are the solutions of an analytic differential equation, denoted by 2. Let X be the map (X, Y)+ (2,E l ) ,
Bonnard
86
2.11. G)-action on ( 2 , C l ) . Let cp be a diffeomorphism of Rn and we lift cp into the symplectic transformation c p'defined by x = cp(y), p = q s . We define the action of (p,a,P ) on (2,C,) m the action induced by the change of coordinates defined by p'. In particular the feedback acts trivially. 2.12'.Definitions. Let E and F be two R-vector spaces and let G be a group acting on E and F. A homomorphism x : G + R \ (0) is called a character. A semi-invariant of weight x is a map S : E + R such that Vg E G,Vz E E,S(g.z) = x(g)S(z). It is an invariant if x = 1. A map 6 : E + F is called a semi-covariant, of weight x, if Vg E G,V x E E, d(g.x) = x(g)g.G(x). It is called a covariant if x = 1. (N.B.: the concept of invariant and covariant does not require E, F to be vector spaces). Now, we can formulate the following result, see [4]for the proof. 2.13. Theorem. The map X : ( X , Y ) + ( 2 , C l ) is a covariantfor the respective G;-actions. Let A, be the class of single input a f i n e systems such that (i) Y never vanishes, (ii) Y ,[X, Y ] and [Y, [ X ,Y ] ]are almost everywhere linearly independent (in particular n 2 3). Two elements ( X , Y ) and ( X ' , Y ' ) of A, are feedback equivalent if and only if A ( X , Y ) and X ( X ' , Y ' ) are G; equivalent. 2.14. Applications. The previous theorem tells us that ifwe substract from the set of systems a bad set, X is a complete covariant. This property which identifies two geometries is important because G) acts on the constrained Hamiltonian differential equations as changes of coordin& tes. In particular all the classical invariants of the vector field 2 related to the behaviors of trajectories: equilibrium points, limit cycles etc. are feedback invariants. Moreover we shall compute other invariants connected to the time optimality problem for singular trajectories, mainly related to the spectral properties of the second order intrinsic derivative. 2.16. I-D Transformation. A system of the form 9 = f(x,U ) can be = v. In control theory, this kind of related to an affine system ifwe set transformation is very important for practical and theoretical reasons. In
9
87
Singular Trqjectories andTime the Optimal Control Problem
particular, as observed in [13],it allows to relate the general classification problem to the classification of affine systems where the distribution x + SpanY(x) is integrable. In optimal control, the inverse transformation, called Goh's transformation;plays an important role. We shall discuss the main properties of this transformation. 2.16. Definitions. Let n I: 2, and consider a single input system (X, Y).Let x0 E RrLbe such that Y (to) # 0 and U be an open neighborhood of 10 such that therestriction of Y to U is &, where ( x 1 ,. . . ,x.) are the coordinates of x E R". The restriction of ( X ,Y ) to U can be written
dx' = X'(x',t"), dt
dx" - X"(X') + U ,
dt where x' = ( x 1 , .. , E The system = X ' ( x ' , x n ) , defined on an open subset U' of R"" and where the control variable is x", iscalled the reduced system associated to ( X , Y ) (restricted to U).Let H ( x , p ,U ) = (p,X u Y ) and let H'(x',p',tn) = (p',X'(x', x")) be the reduced hamiltonian where p' = (PI,. . . ,pn-l) is the vector dual to X'.
.
%
+
2.17. Lemma. For the restriction of ( X ,Y )to U,( x l p , u ) is an extremal lift if and only if (x',p',xn) is an extrema1 lift of the reduced system.
Morever, we have OH' (i)
d BH --(x,p,u) = --(x',p',t"); dt Bu ax" B d2 BH Ba "- (s,p,u)= - w ( H' (ii) X ' , P ' , X ~ ) . au dt2 au Proof, The concept of singular trajectory is feedback invariant, hence = U. Since themap U ( . ) + x"(.) isreone may set X" = 0, and gular, singular trajectories of both systems are in correspondence. Now, computing, since (p,Y ( x ) )= 0,we have p, = 0 and clearly
2.18. Remark. The relations are in fact coming from the connection between the implicit function theorem and the Cauchy-Lipschitz theorem
Bonnard
88
about the.existenceof solutions for differential equations. Indeed if (x,p,U ) is an extrema1 lift of minimal order, the computation of U as U^ in Proposition 2.9 is in fact equivalent to the computation of z" as a function of ( x / , p / by ) solving = 0, using the implicit function theorem.
3. Feedback Action and Time Optimal Control In this section we present the main techniques from [5] using the feedback action, in order to clarify the time optimality problem for single input f f i e systems (X, Y). 3.J.Assumption HO.Consider a system (X,Y) and let (-/,U)be a reference trajectory defined on [O,T].Since the control domain is P, only the singular trajectories are candidate for absolutly continuous time optimal trajectories. Moreover, in order to be time optimal, 7 cannot be a periodic solution. Hence, we shall make assumption HO:7 is a oneto-one analytic singular trajectory. Therefore in a sufficiently small COneighborhood of 'y, it can be identified with t + (t,O,.. ,0) and using a proper feedback, U can be identified with the zero-input. Rom now on we shall assume (7, U ) of this form.
.
3.2. Planar case. Consider a system in R2 and let v = (z,y) be the coordinates. Assume Y transverse to y : t + ( t , O ) , then locally one may set Y = L ,The singular trajectory is contained in S : det(Y, [X, = 0,Y]) 8Y which is identifiedto y = 0. An adjoint variable p = (p1,pz)satisfies p2 = 0 and if (y,p,U ) is of minimal order we have (p,a d 2 Y.XI,)# 0. The system can be written for y small:
dx
dt = 1 +
c
ai(z)yi
a22
and if p is oriented according to the convention < p , X(,> I 0, the sign of ( p , ad2 Y.X)),at 0 is given by the sign of a2. Clearly we have:
Singular Trajectories and the
Time Optimal Control Problem
89
3.2.1. Lemma. If a2 < 0 (resp. > 0) then 7 is time minimizing ( r a p . maximizing) with respect to all solutions contained in a suflciently small neighborhood of y. This examples illustrates the powerof using the feedback group to analyze the optimality problem. Conversely, for planar systems, the assumptions in Theorem 2.13 are not satisfied, and the pair (2, Cl) doesn't contain enough information to classify systems. Take for instance the two systems:
g
For both systems the distribution W + R Y ( v ) and the singular flow = x2 are the same, but they are notfeedback equivalent in any neighborhood of 0. Indeed because in the first case, a non-trivial singular arc is time maximizing and in the second case minimizing. Hence, the time optimality, Cl) has to be taken into account to classifg. which is not encoded in (2, An additional covariant has then to be used. For that, we consider 'the one-form W defined on the set where X and Y are not colinear by w ( X ) = 1 and W ( Y )= 0.This form has the following properties:
:1
(i) If yl is a solution of the system defined on [O,tl],'then dt = t l ; (ii) dw = 0 on the singular trajectory; (iii) For y small, the sign of dw is the sign of az(z)y.
S.,l
W
=
A more complete classification can be obtained by developping this approach. For instance, the set S can have singular points and around such points the normal form is more complicated, see 1191.Now, we shall extend our approach to then-dimensional case. First, we recall a standard result. 3.3. Lemma. The image of P [ O , t ] ,T 1 t > 0 by dEXOlt computed along (7,U = 0 ) is K ( t ) = Span{adk X . Y ( r ( t ) ) ;k E N}, which coincide8 with the first-order Pontryagin's cone defined in [16].For every extrema1 lift (y,u,p),the vectorp(t) is orthogonal to K ( t ) .
Bonnard
90
3.4. Definitions. Assume that foreach t > 0, the codimension of K ( t ) is one i.e., the order of abnormality is k ( t ) = 1 (minimal). Choose an extrema1 lift ( y , u , p ) such that HI, = @,XI,) 2 0. The trajectory 7 iscalled ezceptional if HI, = 0, and hyperbolic if HI, > 0 and < p , ad2Y.XJ, < 0, and elliptic if HI, > 0 and < p , ad2Y.XI, > 0.
3.5. Definition. Let ( y , u ) be a singular trajectory, defined on [O,T].
:
We shall denote by t,,if existing, the first t E ]O,T[such that y is time optimal on ]O,t,[ and ceases to be optimal if t > t,, for all solutions of the system, with same initial and terminal points, and contained in each CO-sufficientlysmall neighborhood of y. The point y ( t c )is called the (first) conjugate point t o $0). (This corresponds to the concept of strong minimum, in classical calculus of variations.) , The main problem when analyzing an optimal control problem is to compute the conjugate points. For the time optimal control we shall prove that there is no general algorithm, contrary to the Riemannian case. 3.6. Assumptions. Let (?,U) be a HO-singular trajectory on [0,T ] which is normalized by y : t + ( t ,0,. , ,0) and U = 0. Since the planar case has been already analyzed, one may assume n 2 3. Moreover,we have the equality K ( t ) = Span{adkX.Y(y(t));k E N),for each t E ] O , T ] . We assume that
.
(Hl) (H2) (H3)
Vt E [O,T],ad2Y.X(y(t))9 Span{adk X.Y(y(t)); k E N) V t E [O,T],the (n - 1) vectors {ad'X.Y(y(t)); k = 0,. . ,n - 2) are linearly independent (Hence K ( t ) is of codimension one). Therefore X(, can be written as g ad2Y.X(,+Cyzt hi addXx.Y(,, where g , ho, . . . , h,-2 are analytic functions from y into R. '
.
3.7. Lemma. g = 0 along y or g ( x ) # 0,Vx E y. In the second case h%-,g-' is a feedback invariant if n 2 4. ,
.
Proof.X can be written in a neighborhood of 7 n-2
Singular Trajectories Time the and Optimal Control Problem
91
where g' and hi are analytic functions whose respective restrictions to y are g and hi. Moreover the restriction of 2 to y is zero. Applying adX to both sides and I'estricting the equation to y we get 0 = g ad X. ad2Y.X(,
- 8 ( X ) ga d 2 XXI,
mod Span{adi X.Y; i = 0,.
. . ,n - 2}/,,
where 8 is the Lie derivative. Now since ad2Y.X and the vectors adiXX.Y, a = 0, . , . , n - 2 are independent along y, there exists an analytic mapping j such that I
adX. ad2Y.XI, = - j
ad2
Y.XI, mod Span{adi X.Y; i = 0,. .. , n- 2}ly.
Hence we get
gj
+ 8x9 = 0
and since along y, X = &, we have t
g ( d ) = g(0)exp
- I j ( t )dt. 0
Hence the first assertion is proved. In order to prove that is a feedback invariant, one consider around y the action of triplets (p,cy,p) which preserves the normalizations on (7, U ) , p(y) = y and the restriction of CY to y is zero. The proof is straightforward and is left to the reader. 3.8. Remark. The previous result shows how to compute feedback invarimts in the neighborhood of a reference singul&r trajectory ( ? , U ) . This trajectory is normalized at the origin, and then feedback invariants are related to the Lie algebraic structure along y. Under our assumptions HO-H2, the first order Pontryagin's cone along y doesn't provide feedback invariants. This is due to the lack of integrabilityconditions along the curve y. If these assumptions are not satisfied, invariants associated to
the classification of non-autonomous linear systems corresponding to the linearized system along y can be obtained, one trivial being the order of abnormality.
Bonnard
92
3.9. Proposition. Let ( 7 , ~be) a trajectory which is hyperbolic or elliptic. Then, under assumptionsHOthe H2,system isfeedback equivalent in a suficiently small neighborhood of 7 to a system of the form
where ann is strictly positive (resp. negative) on[0,T ] if 7 is elliptic (resp. = hyperbolic)and Zi= o , ~(Ix'l), where tx' = (x2,. . ,xn) and lim 0 when x' -+ 0.
.
Proof. The proof of this result is given in [5]. 3.10. Geometric Interpretation. The trajectory 7 is identified to t + (t,0,.. . , 0) and the following conditions are satisfied
(9
adk X.YI,= (-l)k-,
8
..
k = 0,. , n- 2
and is equal'to 0 if k (ii)
> n - 2,
8
XI, = 821'
(iii)
A,.
Hence KI, = Span{ ,.,&}l,. being in standard Brunovsky's form,
The linearized system along 7
.
Let (7,u , p ) be an extremd lift, then one may set p = ( E , 0,. , ,O), where E = +l ,in the elliptic case and E = -1 in the hyperbolic case. The Hamiltonian is HI, = E , Morever (p,ada'Y.XJ,) = Eann, The intrinsic second-order derivative along 7 is identified to:
93
Singular TqjectoriesTime the and Optimal Control Problem
3.11. Definition. System
(X,,,, Y,,,), where
will be called a model. It will be used to evaluate the input/statemapping. 3.12. Self-adjoint Differential Operators. We briefly recall some results from [18]. 3.12.1. Definitions and Notations. We assume n 2 3. Let t E ] O , T ] and C2(n-2)([0, t ] )be the set of maps y : [0,t] + R, of class (C2('"') and U?("-') ([0,t],0) those which satisfy the boundary conditions
.
y ( ~= ) , . = p - 3 ) ( 0 ) = y ( t )=
. . . = y(7+")(t)= 0.
We endow C2(n-2)([0, t ] )with the scalar product
Let bdj, i,j = 0,. . . ,n - 2 be in C('"') ([0,t ] )and such that V i , j. Let q be the map defined on C2(n-2)([0, t])by
bij
= bji,
n-2
i,j=O
which can be written ty'Sy' where ty' = (g1,. . , If 2,y E C2(n-2)([0,t ] )we set b(z,y) = tx'Sy' and
and S =
(bij).
t
B@,I d)= j b(x(s)',€/(a)) fils. 0
Hence we have q ( y ) = b(y, y) and moreover we set Q(y) = B(y, g). Let a, b E a = (a1,. . . , an-'), b = ( b l , . . , and consider ([0,t]) the problem of minimizing Q(g) among all the curves of satisfying the boundary conditions y(i)(0) = ai+1, ~ ( ~ )=(bi++'. t ) Let Dt be the 2(n - 2) differential operator defined on (C2("-2).([0,t ] )by
.
Bonnard
94
This operator is called the Euler-Lagrange operator, associated to the minimization problem and can be written:
In the remaining of this section, we shall assume that Dt satisfies the strong Legendre condition bn-2,n-z > 0 on [0,TI. (Hence it is a non singular differential operator.) Let Dlt be the restriction of D?,to.C2(n-a)([0,t], 0). Integrating by parts we get 3.12.2. Lemma. If x E"C'(~-~)([O, t ] )lind y E (C'(n-z)([O,t], 0 ) , we have B(x,y) = ( D t x ,y). I n particular D"\ is self-adjoint. 3.12.3. Definition. Let t €10,TI, t will be said conjugate to 0 if there exists a non trivial solution y of D' t y = 0. 3.12.4.Proposition. Foreach t E ] O , T ] , there exists a sequence cy = +l, ,.. ,+W svch that
(e,,t, A,,t),
(i) ea,t are elements of C ' ( n - a ) ( [ ~ , t ] , ~~ ,),, t E W satisfying: (e,,t, ep,t) = 1 if cy # P, 0 otherwise and Dttea,t = (ii) . ..Aa,t 2 . . . 2 Xlt; 1 (iii) Each y E ( [0,t ] ,0) can be written as anuniformly conuergent series:
a=l
,
Now, one can state Morse's theorem concerning the spectrum of D't. 3.12.5.Proposition. Let yEE2(n-2)([0,t],0), written as y,ea,t. Then Q(y) = C;t=", A,ty:. The first timetlCconjugate to 0 is the smallest t such that = 0. If t < tic, the only curve minimizing Q is a, = 0 and if t > tic, the minimum of Q is -&l 3.13. TimeOptimality. Now we can outline the method of analyzing the optimality of a reference trajectory (7,U ) , satisfying assumptions HOH2, in the hyperbolic or elliptic case, see [5]for the technical details. One
Singular Trqjectodes andTime the Optimal
Control Problem
95
can assume the system in the normal form (9). First let us consider the model (lo), xi being idenhfied with t: n
dxi = 1 + C aij(t)zix’, dt
dt
i,j=2
Let us set
E
= ( t ,0,.
dx2 dxn = z 3 , ., . , - = U. dt
.. ,0) + 5, hence we have
We let y = t2,
and
and recall that bn-2,n-z. < 0 (resp. elliptic) case. Now clearly we have: . .
> 0) on [ O , T ] in the hyperbolic (resp.
3.13.1. Lemma. The reference trajectory 7 is time minimizing (resp. maximizing) on [O,T],for themodel, if andonly if Vt E ] O , T ] , t -+ ( t ,0,. . . , 0 ) is not accessible from 0, in a time t’ < t (resp. t’ > t ) . 3.13.2. Lemma. The reference trajectory 7 is time minimizing (resp. maximizing),on [O,T],for the model, if and only if the quadratic form t
&(v)
5
= q(v(s)) ds 0
satisfies Q(y) 5 0 (resp. 2 0 ) whenevaluatedontheset of curves v satisfying the boundary conditions y(0) = . . ~ ( “ - ~ ) ( 0=) y(t) = . . . p - y t ) = 0.
.
Now, since U E R, the direction RY = R& is a jump direction, and the variable = xn can be considered as the control variable (this corresponds to Goh’s transformation introduced in 2.15). We have to study
Bonnard
96
the sign of Q on the set of curves y satisfying the boundary conditions:
l d ( ~= ) .
.., p - 3 ) ( 0 )
..
= y ( t )= . = y(n-s)(t)= o
(the constraints in the jump direction y(n-2) are omitted). The extrema of Q correspond to smooth solutions and using Section 3.12 we get:
3.13.3. Proposition. If we eqvand y into CLz.ycrea!,t, ea,t being eigenvectors of theoperator D't associated to Q and X,,t the corresponding eigenvalues, we have Q(g) = C;t=", Aa!,tg:. Let tlCbe the first time conjugate to 0 for D', if t < tic, we have X,,t < 0 (resp. > 0 ) , Va, in the hyperbolic (resp. elliptic) case and y is time minimizing (resp. maximizing). If t > tic, y ceases to be minimal (resp. mwirnal). We can represent in the ( t , z ' ) plane the set of points z ' ( t ) , accessible from 0 when the boundary conditions ai(t) = 0, i = 2, . ,n-l are satisfied (it is a projection of the accessibility set). The d-coordinatecorresponding to y ( t ) is t and every point above (resp. below) the diagonal is accessible, in the elliptic (resp. hyperbolic) case if t < tic.
..
3.14. Intrinsic Computations of Conjugate Points 3.14.1. Definitions and Notations. Let (?,U) be a singular trajectory defined on [0,T] and msume (7,U ) of minimal order. Then from Proposition 2.9, there exists an extrema1 lift such that (?,p) is a solution of g = Z ( z ) , contained in c = 0, where z = (%,p)E Ran, 2 = (Rp, and c : ( % , p ) + ((p,Y ( z ) )(p, , [ X ,Y])) (here Z is considered as a vector field defined a.e. on R2").Let us assume Y and [ X ,Y]linearly independent along y. Let 6a be a solution of the variational constrained differential equation, computed along y : = Z,(y(t))Gz,c,.& = 0. For t E ]O,T],let P ( t ) be the vector space generated by all the vectors dz(t),where (dz(t), 6p(t))is the (n - 2)-space generated by the solutions ,of the variational equation 6z(t)satisfying h ( 0 ) E RY ( ~ ( 0 ) ) . From [17],we have:
-Rz)
3.14.2. Proposition. Let m ( t ) be the dimension of the vector space generated by P ( t ) , Y ( r ( t )and ) X('y(t)).If (?,U) is a singular trajectory satisfying HO-H2, then the first conjugate time, defined in Section 3.13,
Singular Trqiectories and the Time Optimal Control Problem
97
coincides with the first t E IO, TI such that m ( t ) < n. If 7 is exceptional then Vt E ]O,T],m ( t ) < n. 3.15. Conclusion. Using standard results concerning differentialoperators, one can compute the input/state mapping for the model, in particular it is a fold. This allows to decide about time minimality (hyperbolic case) or maximality (elliptic case) of a reference trajectory satisfying assumptions HO-H2 and to compute conjugate points, for the model. This computation requires only integrating the variational equation (it is a linear equation). One can prove, see [5], that the same is true around the reference trajectory, for the original system. At tic, the reference trajectory is still minimizing or maximizing for the model, although it is not any more the unique optimal solution. For the original system, the optimality status at tl, is coded in the &'S occuring in (9). This generates additional feedback invariants. When assumptions HO-H2 are not satisfied, different models haveto be used to analyze the optimality properties, see for instance the analysis developped in [5] concerning the exceptional case. The problem becomes intricate, because the accessibility set can be muchmore complicated. In our case the reference trajectory satisfying HO-H2 is CO-isolated i.e., it is the only trajectory joining ~ ( 0to ) y(T)in a time T, ifwe restrict the system to a sufficiently small neighborhood. This has the following consequence. For each cost function ;1 fo(z,U ) d t , the trajectory 7 is locally optimal. The existence of conjugate points is detectable in terms of the behaviors of extremals in the neighborhood of the reference trajectory. This will be clarified in the next section.
4. Feedback Invariants as Geometric Invariants
in Time Optimal Control in W3 4.1. Notations. Consider the following system
dv
dt = X ( v )+ uY(v), where v = @,glz) E R3.Let D f S y= det(Y, [X, Y], [Y, [X, Y]]), D f P y= det(Y, [X, Y], [X, [X,Y]]), and D f I y = det(Y, [X, Y], X ) (the dependence
Bonnard
98
with respect to X and Y will be omitted when no mistake is possible). Let Il be the set of points where X and Y are dependent and 12 the set where X and [X, Y]are dependent. Theyare clearly feedback invariants. For the remainder of this section, we consider only systems (X,?')such that D1 is not identically 0. By singular trajectory, we mean a singular trajectory contained in R3\ {Dl = 0). 4.2. Lemma. The singular trajectories are the solutions of:
restricted to R3 \ {Dl = 0 ) . Proof. Since p # 0, the relations ( p ,Y)= ( p ,[ X ,Y]) = (p,[X, [X, l']])+ u(p,[Y, [X, Y]]) = 0, defining the singular controls imply D2 + uD1 = 0. Thus, on the set R3\ {Dl = 0}, the singular control is given by a feedback
S be the vector field defined on R3 \ (01= 0) Y )+ S. Observe that S is the by X - e Y and let X, be the map (X, restriction of r*Z to R3 \ {Dl = 0}, where Z is the Hamiltonian vector 4.3. Definitions. Let
field whose solutions are the extremals lifts of minimal order, and T is the projection ( z , p ) + 2. The action of G ) on Z induces the following action on S : (p,cu,p).S= p e s , and from Theorem 2.13, we have the following proposition: 4.4. Proposition. The map X= is a covariant. Moreover the feedback Y),D1 being non identically zero,is equivalent t o classification of pairs (X, the Gf-classificationof ( Z ,RY),(RI' represents the distribution associated to Y).
Now we shall describe feedback invariants related to the time optimal control problems. 4.5. Lemma. Let ( p ,CY,p ) E G;, then:
(i)
F) D x p y ( p ( w ) ) , for i = 1,2,3;
DY*ys'*y(w) = (det 6p"
Singular Trajectories and the Time Optimal Control Problem
= (det
99
g)
det(Y, [X, Y], [Y, [X,Y]](cp(v))etc.
f : R3 + R and define the action of(cp, a,0) E G) o n f as follows: (cp,a,P).f = f o cp. Then, we have the following serni4.6. Corollary. Let
covariants
+ x 3 : (X,Y) -+ DfIY.
(i)
A1
(ii)
: (X,Y)
4.7. Proposition. The exceptional trajectories are contained in D3 = 0, the hyperbolic (resp. elliptic) trajectories in D1D3 > 0 (resp. DlD3 < 0). The sets defined by D3 = 0, 0 1 0 3 > 0 and DlD3 < 0 are invariant
sets. Proof. On R3 \ { D l = 0}, Y and [X, Y] are independent. Let 7 be a singular trajectory, an associated adjoint vector p satisfies (ply) = (p,[X,Y]) = 0, along y. If y is exceptional, then HI, = @,X)\, = 0, hence they are contained in D3, which is an invariant set. Now D 1 0 3 > 0 and 0 1 0 3 < 0 are also invariant, because in R3 \ { D l = 0}, the Lie bracket [Y, [X, Y]] cannot cross the linear span of Y and [X,Y]. By definition, these sets contain respectively, hyperbolic and elliptic trajectories. 4.8. Time Optimality. In thie section we shall describe in details the
optimality results concerning singular trajectories given in [5], and shortly described in Section 3 of this article, in the hyperbolic and elliptic case. We consider only singular trajectories contained R3 \ { D l = 0) and one
! .
100
Bonnard
may assume y one-to-one. Hence 7 can be taken as t + (t,O,O) and by setting U' = U + y is the trajectory corresponding to U' = 0.Using the action of the feedback group around y, one can transform the system into one of the following systems. In the exceptional case we may take the system in the form
3,
where a > 0 on [O,T],and in the hyberbolic or elliptic caae we transform it to theform
+
+
where L(t,g,z ) = a ( t ) z 2 2b(t)yz c ( t ) y 2 , a < 0 (resp. a 3 0) on [ O , T ] in the hyperbolic (resp. elliptic) case and R is a vector field which can be neglected i.e., one can set R = 0 and analyze only the problem for the models. First, we have. 4.8.1. Lemma. I n the exceptional case, the reference trajectory iden-
tified with t + (t,0,O) is CO locally time optimal. Proof, This is clear for the model, since a > 0 andthe equation z ( t ) = a(s)g2(s)ds = 0 imply g = 0, almost everywhere. Hence 7 is the only trajectory joining y(0) to y ( t ) , for T 1 t > 0, in time t, for the model. It is still valid for the system if it is restricted to a sufficiently small neighbourhood.
$i
4.8.2. Proposition. Assyme y hyperbolic or elliptic. The time tl, b conjugate to 0 if there exists a non trivial solution ,$ of Euler-Lagmnge equationon [0,tl,] : - = 0 , with p(0) = p ( t l c ) , whichcan be
$B
written as
Singular Tqjectories and the Time Optimal Control Problem
101
in the canonical form d2 J + K ( t )J dt2
= 0.
By analogy with the Riemaniann case K is called first cuwature, associated to the time optimal control problem. It is an invariant describing the distribution of conjugate points (see for instance Sturm's theorem). 4.8.4. Geometric interpretation. Let y be a hyperbolic or elliptic
singular trajectory, identified with t + ( t ,0, 0),defined on [0, and corresponding to thezero control. First, let us assume that thesystem coincides with the model
dt
B B 8 (l+L(t,y,z))~+z-++sld 8.2 and the associated reduced system defined in Section 2.16 is then dx dV -=l+L(t,y,z), -=z, dt where z is the control variable. By definition of tl,, there exists a curve 9' such that = 1, ~ ' ( 0 )= g'(t1,) = 0, y'(t) # 0 on ]O,tl,[ and $:c L(t,y', dt = 0. Hence the corresponding solution initiating from (0,O)satisfies x'(t1,) = tl, and ( X ' , V') intersects (]O,t1,],0)only at (tl,, 0).
c)
Let E E R and x: be the solution of (17) starting from 0 and corresponding to the controls z = cy'. From our previous analysis the family of curves (zL,~y')intersects. (]O,tl,],O)only at tl,. As in the Riemaniann case, one can show that for system (lli),7 being identified with t + (t,0,O) and Y to (tlc,O)is the limiting point of the intersections of the projection of singular trajectories on the reduced space (they concide with singular trajectories of the reduced system).
g,
4.8.5. Conclusion. Differential equation (12),whose solutions are the singular trajectories'encodes invariants connected with the time optimdity status. Given a singular trajectory the conjugate point along this trajectory can be detected in terms of the behaviors of the neighboring singular trajectories, 4.9. Feedback Invariants for Quadratic Control Systems. Clas-
sical invariant theory studies the actions of Gl(n,R) on the spaces often-
Bonnard
102
sors: vectors, covectors, etc. Computing a complete set of invariants for such actions is the main problem when analyzing the associated geometries. Thus numerous algorithms have been described to achieve this task, see for instance [8].It is interesting to connect this theory with the feedback classificationproblem by using a specific classof polynomial systems. 4.9.1. Preliminaries. We consider the set C of control systems in iR3,
of the form
dv
= &(V) dt
+~ 6 ,
where Q = (Q1,Q z , Q 3 ) , each Qi being a quadratic form and 6 a constant vector field. Let G”f be the subgroup ofG;of triplets (P,a l p ) ,where P E Gl(n,W), a is a quadratic form and p a non zero constant. G”! is a Lie group and acts on system (Q,b) by the action induced by G). The family C is stable for this action. n o m Lemma 4.2, the singular trajectories which are not contained in It3 \ {Dl= 0) are the solution of
and it is a homogeneous differential equation of degree 2. Let 2 be a vector field on R3 and let f be a map from R3 into W. The action of G”f on 2 and f is the action defined by: if (P,a,p) E G”f, then (P,a,~).Z=P*Z=P-l(ZoP)and(P,a,~).f=f~P. By Proposition 4.4,the G”!-classification of pairs (Q, b) is equivalent to theG;-classification of pairs (S,W), where S designs the vector field d& fined on R3 \ (Dl= 0) by equation (19) and D1 is assumed not identically zero. Nowwe shall relate the G”f-classification of pairs (Q, b) to the linear classification of the vector field S. The computations, which are lengthy, are given in [2]. The aim of this section is to give the geometric interpretati ons of the various semi-covariants. 4.9.2. Notations. Themap v + -[Q, 6](v) islinear and the associated matrix is denoted by A . Let ad A be the adjoint matrix corresponding
Singular Trajectories Time and the Optimal Control Problem
103
to A. Set w = adA(b) and observe that [Q, b](w)iscolinear to b. Let L1 = Iwb, L2 = Rw and let S' be the analytic vector field of R3 given by S' = D1Q - D2b. From Lemma 4.4, we have the following results: 4.9.3. Lemma.
For the G"f-actions we have the following semi-co-
variants
4.9.4. Remarks. All these maps are semi-covariants, not covariants.
Thus, only the sets D1 = 0, D3 = 0, and D1 = OnD2 = 0 have an invariant meaning. This is due to the following property. Since S is homogeneous, with degree 2, then the map v + E V , E # 0 transforms S into &S.Thus, in our classification, we must identify S with ES and we are dealing in fact with projective geometry, in which we have no non-constant polynomial invariants. In particular, the Xi's cannot be covariants. 4.9.6. Geometric interpretation of S'. Consider equation
gb.
8=
We set do = and we get g = S'(v). Hence S' is a time reparametrization of S,depending upon the semi-covariant Dl,its trajectories are reparametrized singular trajectories. Observe that S' is a homogeneous cubic vector field of Et3.
Q-
4.9.6, Geometric interpretation of (D1= 0). Computations show that D1 = 0 is the plane generated by L1 = Iwb and L3 = Rw.In particular Adt! = Dl& - D2b = -Dzb, in D1 = 0. Since b is tangent to Dl,this plane is an invariant set for the solutions of = S'(w).This set is the union of two types of sirigularities. First the set of points where b and [Q,b] are colinear, which are the projections on R3 of the singularities of the surface (p,6) = (p,[&,b])= 0. Secondly, the points where [b,[Q, b]] crosses the
2
Bonnard
104
linear span of .b and [Q,b]. This is related to the hyperbolic-elliptic status of singular trajectories. 4.9.7. Geometric interpretation of ( 0 3 = 0). This set is outside
{Dl = 0) formed with the exceptional trajectories. In particular ( 0 3 = 0) = S’(v). Now observe that D3 is an invariant set for the solutions of is a cubic form. Linear classification of cubic forms on C is well known, in particular they have a n o n trivial mtional invariant, called the modulw. This invariant is very important in our classification problem.
8
{Dl = 0) n {D2 = 0). First, observe that the map W + &(v) restricted to the plane D1 = 0 is a cubic form in two variables. Computing we get that the solutions of D1 = 0 n D2 = 0 are the lines L2 = Iww and if a discriminant S, which will be computed later , is positive, two lines denoted by L3 and L4. We have the 4.9.8. Geometric interpretation of
following nice properties. Property 1. In control theory, a system is called weakly controllable if for all W E Et3, the rank of {Q, ~)A.L.(w)is 3 , where {Q, ~ ) A . L is . the Lie algebra generated by the two vector fields Q and b. This propertyis clearly feedback invariant. Let us introduce the following’(constant) vector fields: 211 = b, v2 = [[&,v11,4 , v3 = [[Q,v1], vz] and v4 = [ [ Q , v z ]~, 2 1 , From 131, the system is weakly controllable if and only if the rank of th ese vectors is 3. Now, computing, one can check that it is the case if and only if DZ restricted to D1 = 0 is not identically 0. In other words, the semi-covariant A 2 maps the set of non weakly controllable pairs onto zero. Property 2, Here, we restrict our study to systems (Q, b) such that Span{vl, ~ 2 , 2 1 3 )= Et3. By convention, in this section a singular trajectory is a solution of (19) contained in Iw3 \ {Dl = 0). One may ask the following question: when does a system (Q, b) admit non zero singular trajectories, contained in D1 = 0 ? An associated extremal lift has to satisfy the equations:
(P,b) = (P,[Q, bl) = (P,[b,[Q, bll) = b,[Q, IQ,
bll) = 0
and is then contained in { D1 = 0) f l { D2 = 0). Further computationsshow
Singular Trajectories and
the Time Optimal Control Problem
105
the following. There exists a non trivial singular trajectory in (Dl= 0) if and only if L3 = L4 (L3 and L4 are the two lines previously defined) i.e., 6 = 0. Moreover, such a trajectory is supported by the lines L3 = L4.
3
4.9.9, Definitions and Notations. Since the equations = S’(v) is homogeneous, the distribution W -+ W’(v) is invariant with respect to the transformations W + E W , E # 0 and can be projected onto the sphere S’. More precisely, if we set r = (z2+g2 .z2)l/’, v’ = v / r and r’dt = d a (reparametrimtion), equation = ~ ‘ ( v is ) equivalent to
6
dr da
+
- (w’S’(v’))r
”
dw’ = S’(V’) - (v‘, S‘(w‘))v’ da
(21)
Equation (21) isdefined on S2 and iscalled the projectedequation associated with = S’(w).Clearly,thisequationencodesmoat of the behaviors of singular trajectories and has to be camfully studied. This will be the objective of the remaining part of this article. Let v0 E R3\ {0}, a line RWOsuch that S‘(v0) is colinear to v0 is called a ray. If S’(w0) = 0, a ray is a set of singular points for = S‘(v), otherwise a ray is an asymptoticdirection for solutions of this equation. Clearly a ray corresponds to a singular point for the projected equation and conversely. Now observe that S’(w)can be written as Sc‘(v) 1/5 div (S’(w)).v, where ,Sc‘ is a cubic vector field such that div Sc‘ = 0. Both S’ and Sc’ have the same projection on S2 and thiscorresponds to thedecomposition of = S’(v) into (20) and (21). We have the following result.
9
9
+
9
4.9.10. Lemma. The maps
X 4 : ( Q ,b)
-+ diu S’ and XS : (Q,b) -+ Sc’
are semi-covariants. Now, we are going to parametrize the G”j-orbits and relate the parameters with invariants describing the behaviors and the optimality status of singular trajectories.
4.9.11.Lemma. For an open dense setof pairs (Q,b ) , a G”j-orbit can be parametrized by: Q = ( a l z 2 + a ~ y 2 + a 4 z y + y z ,b~zC2+b2y2+.z2/2+22/,0), b = (O,O,1).
Bonnard
106
Proof. Let e l , ea, e3 be R3-canonical basis, Q = (91,&a, Qs),where Q1 = a1z2 + azy2 + a32' a42y a521 agyz and Qz, Q 3 are respectively defined by changing ai into bi and C i . We set b = e3 and hence applying a feedback one may assume Q3 = 0. Now suppose that the vectors V I , V Z , v3 defined in 4.9.8 are independent. Equivalently it means that V I , v2, W is a basis of R3.We compute Q in the basis el = W , e2 = v2 and e3 = b. This induces the following normalizations:
+
+
as = a5 = b5 = 0 ,
+
ag = 1, b3 = 1 / 2 ,
and we get a pseudo-normal form. In order to get a G"f-representation with a minimal number of parameters, we proceed as follows. We consider the action on the two first components of Q of the linear tranformations P E Gl(3,R) such that P preserves We3 andthe pseudo-normalform. An easy computation shows that these transformations coincide with the aubgmup of matrices of the form
(5 + i),
azo.
Using the action of p, one of the coefficients a2, bz or be can be set to 0. The parameter cy allows to normalize a1 or b4 to 1, if they are non vanishing. We choose be = 0 and b4 = 1 if b4 # 0. Nowwe have to check that each element of a G"f-orbit has the same representation. For that, let (Q, e3) and (Q', e3) be two elements parametrized respectively by the coefficients a l , aa, a4, bl, ba and a:, a i , a i ,b:, bi and let us assume that they are feedback equivalent. Hencethere exists P E G1(3,R), leaving e3 invariant, such that Q' = P * Q mod e3 (this means that the last component of Q' and P * Q is omitted). Computing we get P = identity. This proves the assertion. Proposition 4.9.12. I n the representation of ( Q ,b) given above we have:
Singular Trajectories Time and the Optimal Control Problem
107
+
+
+
+
(iii) 0 3 = -b2y3 - blx2y - zy2 alx2z a2g2z 1/2yz2 a 4 q z (iv) Using the projectivecoordinates U = x / y and W = z/y in which the set D1 = 0 is sent at the infinity, the projected system becomes: = S ( u , w ) ,$ = P z ( u , v ) ,whereP1 = a z + ( a 4 - b 2 ) u + w + ( a 1 - 1 ) - b l u 3 1/2uw2, P2 = 1 2blu (2b2 - a4) - 2a2v2 (I - 2a1)uw - w3 - a&; (v) diu S’ = 2b1z2 6b2v2 - z2 4x9 - 2a4xz - 4azyz.
6
+
+ +
+
+
Now,we shall indicate how to compute a complete set of feedback invariants. In fact, we compute semi-invariants which have to be combined rationally to get rid off the weights and providing invariants. 4.9.13. Singular points. The first step when analyzing the behaviors of the solutions of = S’(w)is to compute the rays and the behaviors of the neighboring solutions, by linearizing. Computing we have:
2
Property 1. If N is the numbers of rays it satisfies generically the inequality 13 2 N 2 3. Indeed in D1 = 0, we have 4 2 N 2 2 and in Dl # 0, 9 2 N 2 1. They are located in the set where b and Q are colinear. (Observe that for a general cubic equation, one has 16 2 N 2 1). Property 2. The behaviors of the solutions near 0 1 = 0 has been classified in [2]in the generic and codimension one case. In particular we have: Near Lz. The two eigenvalues of the linearized projected system me equal to bl. It is a node. Near L3, L4. We set U = a,/x and W = z / x . By definition La, L4 are the roots in U = 0 of equation w2/2 a421 - bl = 0. Hence if 6 = ai 2b1 > 0, we have L3 = -a4 4,L4 = -a4 - d.Computing, we get that the eigenvalues of the linearized projected systems are (X, -X), where 1x1 = 12bl - a4L1, L = L3 or L4. It is a saddle.
+
+
+
Near L1. Recall that L1 is the line wb. By setting U = y/z, W = x / z and W = l/z, the system is near L1, CO-equivalent to = U , = w/2, = w/2. The eigenvalues of the linearized projected system are 1 and 1 / 2 (it is a node) and each line in a, = 0 parallel to e3, is an asymptotic direction.
9
9
9
Bonnard
108
Conclusion. From this analysis we deduce the following. The two parameters a4 and bl are describing the behaviors of singular trajectories near D1 = 0. The line Rb is a blowing-up direction for the solutions, in particularthedistribution Rb is encoded in Q = S'(v). Other semiinvariants can be computed similarly, connected to the behaviors near the rays outside D1 = 0. 4.9.14. Classification of Ternary Forms.Classical invariant theory can be usedin order to compute feedback invariants connected to the linear classification of the following forms: D1 (linear), D3 (cubic), div S' (quadratic) and D1 = 0 n D2 = 0. We briefly indicate how. First the set D3 = 0 provides an invariant which is the modulus. Now, we can compute the invariants givenby the combinations of these forms. For example, consider D2, div S', D3,all restricted to D1 = 0. In D2 = 0 n D1 = 0, there is a root given by Rw,which has a feedback invariant meaning: b and [Q, b](w)are colinear. Hence the cubic form D2 can be factorized as zK1 where K1 = z2/2-b1z2+a4zz, where a/ = I = 0 is the line Rw.Hence K1 is a semi-covariant. Now, div S' and D3 restricted t o a/ = 0 are respectively K2 = 2b1x2 - z2 - 2a4zz and zK3, K3 = a1z2. The eigenvalues of the pencil associated to the two quadratic forms K1 and K2 are depending upon bl and a4. The parameter a1 can be recovered from the eigenvalues of the pencils formed from K3 and K1 or K Z .Further invariants can be computed using the quadratic form div S' and the cubic form D3.In the general case it provides 7 invariants, plus the modulus.
REFERENCES [l] B. Bonnard, On singular extremals in the time minimal control problem in R3,SIAM J. Control Optim., 23, 1985, 794-802.
[2] B. Bonnard, Contribution d l'dtude des trajectoires singudidres; Rapport L.A. Grenoble, 1987.
Singular Trqjectories and the Time Optimal Control Problem
109
B. Bonnard, Quadratic control systems, Math.Contro1 Signals Systems, 4, 1991, 139-160. B. Bonnard, Feedback dquivalence for nonlinear systems and the time optimal control problem, SIAM J. Control Optim., 29, 1991, 13001321. B. Bonnard, I. Kupka, Thtorie des singularitds de l'application entr6e Forum Mathematicum 5, 1992, 111-159.
/ sortie et optimalitt des traje ctoires singulibres,
P. Brunovsky, A classification of linear controllable systems, Kybernitika, 6, 1970, 173-183. E.Cartan, Oeuvres complhtes, Gauthier-Villars, Paris 1953. A. Dieudonnd, J.-B. Carrel,Invariant theory: old and new,Academic Press, New York, 1971. R. B. Gardner, Differential methods interfacing control theory, in Differential Geometric Control Theory, Brockett, Millman and Sussmann, eds., Birkhauser, Boston 1983. R. B. Gardner, Lectures on the method of equivalence with applications to control theory, CBMS-NSF Regional Conference Series in Applied Mathematic, 58, SIAM, Philadelphia, 1989. R. B. Gardner, W.-F. Shadwick, G.-R. Wilkens, A geometric isomorphism wath applications to closed loop control systems, SIAM J. Control Optim., 27,1989, 1361-1368. C. Godbillon, Gtomdtrie diffdrentielle et mtcanique analytique, Herman, Paris, 1969. B. Jakubczyk, Equivalence and invariants of non linear control systems, in Nonlinear Controllability and Optimal Control, H.J. Sussmann ed., Marcel Dekker, New York, 1990. B. Jakubczyk, Microlocal feedback invariants, preprint 1992. I. Kupka, O n feedback equivalence, Canadian Math. Society Conf. Proceedings, 12,1992. E.B, Lee, L. Markus, Foundations of optimal control theory, John Wiley, New York, 1967.
110
Bonnard
[l71 J. de Morant, ContrSleentempsminimaldesr6acteurs chirniques discontinus, Phd. Thesis, U.of ROuen, 1992. [l81 M. A. Naimark, Lineardifferentialoperators,Part I , Frederick Unga. Pub. Co., New York, 1967. [l91 W. Respondek, Feedback classification of control systems ori Ita, R3, this volume. [20] K. S. Sibirsky, Introduction to the algebraic theory of invariants of differential equations, Manchester U. Press, Manchester, 1987.
Controllability of Generic Control Systems on Surfaces Alelcsey Davydov * Steklov Institute of Mathematics, ul. Vavilova 42, GSP-1, 117966 Moscow, Russia
Abstract. We present a complete classification of singularities of controllability boundaries for generic control systems on surfaces. This classification includes three lists of singularities. Namely, Tables 1-3 (given below) describe the generic singularities of boundaries for local transitivity zones, attainable sets and nonlocal transitivity zones, respectively. It turns out thatfor a generic control system all these singularities are stable under small perturbations of the system. These results were proved using the tools and results of singularity theory, theory of dynamical systems and control theory. Our aim here is to explain these results and to illustrate them by examples. For the full proofs we refer the reader to [l,2, 3,41.
1. Classes of control systems Here we define a space of control Bystems considered below.
* Permanent address: Department of Mathematics, Vladimir State University, Gorkii street 87, 600026 Vladimir, Russia 111
Davydov
112
1.1. Space of control systems
We will only investigate control systems on smooth surfaces. We assume that a control .system can be locally described by an equation of the form
where z is a point of the phase space M of our system (the surface M), U is a control parameter belonging to a smooth closed manifold U (or to a disjoint union U of finitely many smooth closed manifolds) which has at least two distinct points, f is a smooth mapping and 2 = &/Bt. Let P be a bundle space over M with fiber U. On the whole surface a control system can be defined bya mapping F of P into the tangent bundle space TM such that 7r o F = r , where r : P + M is the bundle projection and 7r : TM + M is the canonical projection. We identify the space of control systems with the set of such mappings and endow it with the fine Ck(Whitney) topology. The value of k will be separately specified for each class of control systems considered below. A generic control system or a control system in geneml position is a system from a subset of this space, open in the fine Cktopology and dense in fine CO”topology. All the results formulated below are valid fork 2 4 but for somespecial type of control systems they are also true for some smaller IC,
1.2. Polydynamical systems
A control system is called polydynamical if the set of values of the control parameter consists of finitely many points. In the simplest case this set has only two distinct points, and such systems are bidynamical. Example 1. Bidynamical system. Suppose that in the flat sea R&, there are a stream of water with a velocity field v($, y) and a stream of air over the water with a velocity field w(z,y). Our ship without inertia has lost control h d its movement takes place according to one of the two velocityfields. It is easy to see that the ship’s possiblemovements are
Controllabilityof Generic ControlSystems on Surfaces
113
described by the control system
(& ,i) = U+,
Y) + (1 - u)w(z,P)
where an admissible control is a piecewise continuous function U on the real line with two possible values 0 and 1. So the movement of the ship is described by a bidynamical system. For polydynamicd systems all the results formulated below are valid if k 2: 2. Remark. Here and henceforth, unless otherwise specified, 2, 21 are standard plane coordinates and zero is the origin.
1.3. Simplest differential inequalities
Example 2. In the plane RE,a,consider a swimmer driven by a smooth flow of water with a velocity field v(%, 3). In standing water the swimmer can swim in any direction with a velocity bounded by the value of a positive function g. The possibilities of the swimmer can be described by the differential inequality ( k - 211 (z,
+ (0 - v a ( 5 , ~ ) )5~g2(z,Y)
or by the control system
+
(h,0) = ‘U(%,?/l P ( 4
+ +
where U is a point of the two-dimensional sphere U: U; U: = 1, and p maps the point (211, u2,US) of this sphere to the velocity (g(2, y)ul, g ($9
Yb2)
For such systems all the results formulated below are valid if k 2 2.
2. Stability of sets of points with the same local
controllability properties Usually, states in the phase space may have different local controllability properties. Near (the word ‘hem ” means here and everywhere below
Davydov
114
“in a sufficiently small neighborhood of”) some of them control system stays for any positive time period under a suitable choice of an admissible control, and near other ones it is not possible to prevent an unwished evolution of the system. Here we define different notions of local controllability and, for a generic two-dimensional control system, we study the regions of the phase space consisting of all points with the same local controllability properties. We show that these regions are stable under small perturbations of the system.
2.1. Local transitivity zones and the steep domain
An admissible control is a piecewise continuous mapping U : t I+ u(t) E U of the time axis into the set of values ofthe control parameter. Choosing an initial point, an initial time and an admissible control we can define an admissible motion of the control system, at least for times sufficiently close to the initial time. A state of a control system is attainable in a time t from some state of this system if there exists an admissible motion of this system steering the second state to the first one in this time. A point z of the phase space has the local transitivity property (LTP) if for any neighborhood V of z there exist T > 0 and a neighborhood of z such that any two points of are attainable from one an other in a positive time less than T without leaving V . A point z of the phase space has the small time local transitivity p m perty (SLTP) if for any neighborhood V of z and any T > 0 there exists a neighborhood of z such that any two points of are attainable from one an other in a positive time less than T without leaving V . The local transitivity (respectively the smalltime local transitivity) zone is the set of all points of the phase space which haveLTP (respectively SLTP). We denote these zones by LTZ and SLTZ, respectively.
v
v
v
v
Theorem 1 ([l]). L T 2 and SLTZ of a generic control system have the same interiors and the Same closures, and the closures coincide with the closures of the interiors.
Controllabilityof Generic Control Systems
on Surfaces
115
The proof of Theorem 1 is based on investigation of singularities of the field of limiting directions defined below. Obviously, SLTZ E L T 2 but very often these zones are different.
Example 3. For the control system of Example 1, SLTZ = { (3, v) I z = 0 < y}, L T 2 = { ( s l y ) 1 z = 0 5 y} if w(z,y) = (1,O) and w(z, y) = (z - y, x). Indeed, at each point outside the set z = 0 < y there is a drift because both admissible velocities at this point are nonzero and make an angle not exceeding 180'. So this point does not have LTP and SLTP. Further any point z of the positive y-axis has SLTP (and hence LTP). At z we can observe the following phenomenon. The point z belongs to the collinearity line of the velocity fieldsW and W .At z these velocity fields have opposite directions and are not tangent to this line. At any point of this line except the origin, the phase curves of W and W have first order tangency. So near z the velocity fields W and W have first integrals I(z,y) = y and J ( z ,y) = y - s 2 h ( zy), , respectively, where h ( z ) > 0. In thenew coordinates = y and P = z ( h ( z y))lI2 , near z , these integrals take the forms (tildes omitted) I(s,y) = y and J ( z ,y) = y - z2, respectively. Now it is easy to see that in any neighborhood of z there is an admissible cyclic motion surrounding this point with two switchings from one admissible field to theother during one cycle and with any sufficiently small period (Fig. 1; in Figures 1 and 2 the boundary of the steep domain (SD-boundary) is indicated by the thin double line, and the phase curves of the fields are depicted by dashed and solid lines, respectively). Hence any two points of the region bounded by this cycle are attainable from one an other. In fact, using at the beginning the field (1,O) to reach the cycle, then the field W for the motion along the cycle, and, finally, the field (1,O) again to reach the end point, we can steer any initial point to any end point. The time of such transfer is bounded by some value tending to zero together with the period of the cyclic motion. So any point of the positive y-axis has SLTP. Finally, the origin has LTP. Indeed, in any of its neighborhoods there is an admissible cyclic motion surrounding this point with two switchings from one admissible field to the other during one cycle and with some period. This cycle is attainable from every point of the region bounded by
c
116
Davydov
"
Fig. 1.
t
Fig. '2.
Controllabilityof Generic Control Systems
on Surfaces
117
it by using the field v and conversely, any point of this region is attainable from some point of the cycle by using the same field (Fig. 2). This implies LTP for the origin. But the origin does not have SLTP because the point (0, -a) is attainable from ( a ,0) (where a > 0 is sufficiently small) in time no less than a quarter of the period of a revolution around the origin with the vector field W . Example 3 shows that, in general, S L T Z # L T Z . The velocity indicatrix at a point of the phase space is defined to be the set of admissible velocities at this point. The positive linear hull of the velocity indicatrix at a point is called the cone of this point. The steep domain of a control system is defined to be the set of all points of the phase space whose cone does not contain the zero velocity. The sides of the cone of a point of the closure of the steep domain are called limiting directions at this point. An integral curve of the field of limiting directions is called a limiting curve.
Theorem 2 ([l]).The steep domain of a generic control system is the complement of the closure of LTZ (SLTZ), and its boundary is exactly the set of all points for which the convex hull of the velocity indicatrix contains the zero velocity on its boundary. The proof of Theorem 2 is also based on investigation of singularities of the field of limiting directions. Example 4. The steep domain of the polydynamical system given by the three fields of admissible velocities ( H , -y+x2) and (0,+l)coincides with the region y < x 2 . The cone of any point in 21 > x2 coincides with the tangent plane, and so this point belongs to SLTZ [5]. All points of the parabola y = x z except the origin belong to SLTZ (the argument is the same as in Example 3 for points of the positive y-axis). The origin does not have LTP because for any a > 0 the point (0,-a) is not attainable from the origin. In the steep domain two branches of the field of limiting directions are defined by the first two fields of admissible velocities. Thus the limiting curves coincide in this region with the phase curves of these two fields. Figure 3 illustrates the behavior of the net of limiting curves
Davydov
118
near the origin. The SD-boundary is indicated by a thin double line, the interior of LT2 (andSLTZ) is shaded,and the integral curves of the two different branches of the field of limiting directions are depicted by dashed and solid lines, respectively. It is easy to see that the steep domain coincides with the complement of the closure of LT2 (SLTZ).
Fig. 3.
2.2. Types of points of the boundary of the steep domain
It is easy to see from Figure 3 that the behavior of the net of limiting curves (LC-net) near the origin isdifferentfrom its behavior near any other point of the SD-boundary. In other words, the singularity of the LC-net at zero is different from its singularity at any other point of the boundary. It was shown above that zero does not have LTP and any other point of this boundary has SLTP. Hence the singularities of the LC-net have some influence onlocal controllability. To classify the singularities of a generic control system on its SD-boundary it is useful to introduce some new notions, in particular, different types of SD-boundary points.
Controllabilityof Generic Control
Systems Surfaces on
119
An admissible velocity at a point is called limiting at this point if it belongs to a limiting direction at this point. A value of the control parameter is called limiting at a point if it defines a limiting velocity at this point. A point of the SD-boundary will be called a &turning point (respectively a +passing point) if at this point 1) there are exactly two distinct limiting values of the control parameter, 2) the corresponding limiting velocities are nonzero and 3) they have opposite directions and are (respectively are not) tangent to the SD-boundary. Example 5. The SD-boundary of the control system given by the two y - X') is the parabola y = z2. All points admissible velocity fields (&l, of the parabola except the origin are &passing points, and the origin is a 8-turning point.
A point of the SD-boundary will be called a double B-passing point if at this point 1) there are exactly three different limiting values of the control parameter, and 2) the corresponding limiting velocities are nonzero. Example 6. The SD-boundary of the control system givenby the three admissible velocity fields (1,0),(-1,z - g ) , (-1,g z) is the set y2 - z2 = 0. All points of this set except the origin are &passing points, and the origin is a double &turning point.
+
A point z of the SD-boundary will be called a zero-passing point if, at z , there is an isolated limiting value of the control parameter giving the
zero limiting velocity. Example 7. The SD-boundary of the control system givenby the three admissible velocity fields (3, -2y), (1,l), (-1,l) is the set 29 = lzl. All points of this set except the origin are B-passing points, and the origin is a zero-passing point. A point z of the SD-boundary will be called a nonzero-passing point if, at z , 1) there are exactly two different limiting values of the control parameter, and 2) one of them gives a nonzero velocity, while the other gives a zero velocity, and is a nonisolated value of the control parameter.
Davydov
120
A point x of the SD-boundary will be called a black (respectively regular) zero-point if, at x , 1) there is a unique limiting value of the control parameter, and 2) the limiting directions are (respectively are not) tangent to the SD-boundary. Example 8. Consider the control system in the plane given by the velocity fields(y+cos U, sin U ) , where u E S1is an angle. The steep domain of the system is the region 1y1 > 1. In this region the field of limiting velocities is ( g - l / y , &(l- l / y 2 ) 1 / 2 ) Let , us add one more admissible field of velocities, namely (z,1). It is collinear with the field of limiting velocities of the previous system on the hyperbola g2- x’ = 1. The SD-boundary of the new system is the closure of the union of the parts of this hyperbola lying in the second and fourth quadrants and the parts of the lines g2 = 1 lying in the first and third quadrants. Thepoints (0,f l ) of this boundary are nonzero-passing points, and (-a, a’) where a = -((l f i ) / 2 ) 1 / 2 is a &turning point. The other points of this boundary belonging to the hyperbola (respectively to the set y2 = 1 ) are B-passing points (respectively regular zero-points).
+
Example 9. The steep domain of the control system in the plane given by the velocity fields (z+ cos U, Icy + sinu) (where k > 1 and U E S1 is an angle) is the region z2 + k2y2 > 1. Its boundary points ( f 1 , O ) and (0,f l / k ) are black zero-points, and all other points of the boundary are regular zero-points. Theorem 3 ( [ l ] )For . a generic ‘control system any point of the SDboundary is of one of the following seven types: a zero-passing point, a B-passing point, a 6-turning point, a double &passing point, a nonzeropassing point, a regular zero-point, or a black zero-point.
2.3. Singularities and local controllability on the boundary
of the steep domain First we introduce a notion of “singularity”. Two objects of the same nature (sets, mappings, vector fields, .) are called equivalentat a point
Controllabilityof Generic Control Surfaces Systems on
121
if they coincide in some neighborhood of this point. An object’s germ at a point is defined to be its equivalence class at this point. T w o germs of objects of the same nature are called Ck-di#eomorphic if there exists a germ of Ck-diffeomorphism that maps one of the germs to the other. A Ck-singularity(or singularity) is defined to be a class of Ck-diffeomorphic germs.Words‘‘An object A has a singularity B at some point” means “the class of Ck-diffeomorphicgerms defined by the germ of this object at this point is the singularity B”.
Example 10. The set g 2 1s’ - 11 has the same C”-singularity at the two points (h1,O).The function f : s I-+ 1s’ - 11 has the same Cm-singularity at fl (Fig. 4).
Fig. 4. Now we have all tools to describe the singularities of a generic control system on ,the SD-boundary. First we investigate such singularities in a simplest case, namely for a generic bidynamical system (#V = 2), next for a generic polydynamical system (#V E N), then for a generic simplest differentialinequality, and finally fora generic control system (with any admissible U). We will showwhat alterations takeplace on the SD-boundary of a generic control system when changing the set of control parameters from the simplest case to a general one. For simplicity we suppose that our phase space is RE,,.
Davydov
122
Bidynamical systems. Admissible velocitiesV=(VI, v 2 ) and w = ( w 1 , 2 0 2 ) of a generic bidynamical system do not vanish simultaneously, and zero is a noncritical value of the function ~ 1 - ~ 2 2 021 .Hence these velocities are collinear on a smooth curve, and admissible velocities define a field of straight lines along it. In general, last field rotates along the curve and may be tangent to it at some points. In a generic case each such tangency is of the first order. Moreover, all singular points of admissible fields of velocities are nondegenerate and different from any such point of tangency. So any such tangency point lying on the SD-boundary is a 6-turning point, and any singular point of an admissible field of velocities isa zero-passing point. All other points of the SD-boundary are &passing points. The following theorem gives a complete classification of singularities on the SD-boundary for a generic bidynamical system. Theorem 4 ([l,6 , 71). Let z be a point of the boundary of the steep domain of a generic bidynamical system. Then: 1. The point z is either a) a zero-passing point, b) a &passing point, or c) a &turning point,
and the C"-singularity at z of thesteepdomain Cm-singularity at zero of the set, respectively,
a) (Y # 0) U (c.01,
is thesame as the
w-4 9 # 0.
2, The C"-singularity at z of the LC-net is the same as the Cm-singu1a-
rity atzero of the LC-net of the bidynamical system given by one of the following three pairs of vector fields: a)
(*L.),
-
b) (&l,%/
c)
( f b 2
- v)
according as z is, respectively, a) a &passing point, b)-c) a &turning point. 3. If z is a zero-passing point, then the Co-singularityat z of the LC-net is
Controllabilityof Generic Surfaces Control Systems on
123
the same as the CO-singularity at zeroof the LC-net of the bidynamical system given b y vector field ( 1 , l ) and one of the following vector fields:
4 (5,-g),
b) (-5,-2Y) or (5,2Y),c)
-
(-. !/,S)07- (5- Yr .),
according as the corresponding singular point of the admissible field of velocities is, respectively, a saddle, b) a stable or an unstable node, c) a stable or an unstable focus. a)
4. The point z has S L T P (respectively LTP) if and only if it is a &passing point (respectively either a &passing or a focal zero-passing point). 5.
If z is a singular point of a field v1 of admissible velocities then the ratio of the eigenvalues of the linearization of this field at z is different from +l, and the value of the other admissible vector field at this point is not an eigenvector of the linearization of the field w1 at z .
2a
Fig. 5 (parts 2a, 2b, 2c)
Davydov
124
Figure 5 illustrates the singularities described in Theorem 4. The thick line represents the boundary of the steep domain, the double line represents the rest of the line v1202 - v2wl = 0 (the collinearity line of the admissible vector fields), the dashed and solid lines represent limiting curves of the two branches of the field of limiting directions, and thepoint z is indicated by a small triangle if it has LTP, and with a small circle if not. Subcaaes are labeled according to the statement of Theorem 4.
3a
on
3b
on
3c Fig. 5.
Controllabilityof Generic Control Systems
on Surfaces
125
Remark. The bidynamical systems pointed out in statements 2 and 3 of Theorem 4 represent stable realizations of the corresponding singularities. Polydynamical systems. Adding to U one more value of the control parameter and considering three-dynamical systems, we get a wider list of generic singularities on the boundary of the steep domain. Note that according to Theorem 2 for a generic three-dynamical system any point of the SD-boundary is the point of the SD-boundary of some bidynamical subsystem. According to Theorems 3 and 4 for a generic bidynamical system the SD-boundary is the union of all zero-passing points, B-passing points and B-turning points. It is not difficult to prove the following fact. Proposition. For any bidynamical subsystem of a generic three-dynamica1 system all statements of Theorem 4 are valid. firtherrnore, the boundaries of the steep domains of any two bidynamical subsystems can only havetransversal intersections, and suchan intersection canonly occur either at a zero-passing point or at a B-passing point for both subsystems simultaneously. The following theorem gives a complete classification of singularities on the SD-boundary of a generic three-dynamical system. This theorem is a simple consequence of the Proposition and Theorem 4.
Theorem 5 ([l]).Let z be a point of the boundary of the steep domain of a generic three-dynamical system. Then: 1. The point z is either a) a zero-passing point,
b) a &passing point, c) a &turning point, or d) a double B-passing point,
and the C*-singularity at z of the steep domain is the same as the C*-singularity at zero of the set, respectively,
Davydov
126
2. The Cm-singularity at z of the LC-net is the same as the COO-singularity at zero of the LC-net of the three-dynamical system given by one of the following five triplesof vector fields: a) ( f 1 , z ) and ( 0 , l ) (family of semiparabolus), b ) ( f l , y - z 2 ) and eitherbl) ( 0 , l ) (folded focus) orb2)(0, -1) (folded
saddle), c ) ( f l , z 2 - y ) and eithercl) ( 0 , l ) (folded saddle) o r 4 (0, -1) (folded focus) , according as z i s , respectively, a &passing point, b)-c) a &turning point. a)
3. If z is a zero-passing point, then the Co-singularity atz of the LC-net is the same as the Co-singularity of the LC-net of the three-dynamical system given by one of the following eight triples of vector fields: 81) (5, -g),
( 1 , l ) and either (2, -1) or (-2,l) (folded saddle);
a2)
(5, -y),
(-1,2) and ( 2 , l ) (family of semiparabolus);
as)
(2,- g ) ,
( 2 , l ) and (1,2) (folded monkey saddle);
bl) (1, & l ) and either (-z, -2y) or (z, ay) (folded saddle); b2) ( f l , 1) and either (-z, -2y) or (z,2y)(folded node); b3)
(-1,2), (2, -1) and either
(-3,
-29) or (z, 2y) (family of semipa-
mbolas); b4)
(1, l ) , (2,3) and either (-z, -29) or (z, 2y) (folded saddle-node);
c ) (l,O), (0,l) and (z - y,z) (family of semipambolas), according as the corresponding singular point of the admissible field of velocities is, respectively, a saddle, bl)-b4) a node (stable or unstable, which is clear from the kind of a triple), c ) a focus, &1)-83)
and (except case c ) )
Controllability of Generic Control Systemson Surfaces
127
al) onlyone eigendirection, a2) both eigendirections, as) neither eigendirection, bl) only the eigendirection with eigenvalue smaller in modulus, b2) only the eigendirection with eigenvalue larger in modulus, b3) both eigendirections, b4) neither eigendirection
can be given by velocities of the cone of z . 4.
If z is a double &passing point, then the CO-singularityat z of the LCnet is the same as C’=”singudarity of the LC-net of the three-dynamical system given b y one of the following three triples of vector fields: a) (-1,O) and either (1, -9 f z) or (l, f 2); b) (0, -1) and ( y f q1).
if it is either a &passingpoint (maybe double) or a &turning point with the “foldedfocus” singularity of the LC-net (see 2b1) and 2 ~ 2 ) )and has LTP if and only if it either has SLTP or is a zero-passing point with the ‘?family of semiparabolas” singularity of the LC-net (see 3a2), 3b3) and 3c)).
5. Thepoint z has SLTP if andonly
Remarks. 1. Folded nodes, foci and saddle-nodes may be stable or unstable. In the case of a zero-passing point the stability of a folded node and a folded saddle-node is defined to be the stability of the corresponding node of the field of admissible velocities. For a folded focus the kind of stability is clear from the directions of the motion along limiting curves with limiting velocities, namely we have a stable (respectively an unstable) folded focus if the motion is directed towards (respectively away from) the singular point. 2. The three-dynamical systems exhibited in statements 2, 3 and 5 of Theorem 5 represent stable realizations of the corresponding singularities. 3. The topological classification of singularities of the LC-net of a control system on the plane given by finitely many analytic vector fields was found in [8].
Davydov
128
For polydynamical systems with #U > 3, we get the same list of generic singularities on the SD-boundary with the same nature except those at double B-passing points. Theorem 6 ([l]). Let z be a double B-passing point of a generic control system with #U > 3. Then: 1. The C”-singularity at z of the steep domain is the same as the Coosingularity at zem of the set y > 1x1, and z has SLTP. 2. The C’-singularity at z of the net of limiting curves is the same as the CQ-singularityat zem of the net of limiting curves of the polydynamical system given by one of the following two quadruplesof vector fields:
either a) ( - l , O ) , (1,l) and (1,-y/fz),
or b) (0, -l), (1,l) and (yfz,1).
For more general control systems with U including some smooth closed manifold of values of the control parameter, we get topologically the same list of generic singularities on the SD-boundary as for #U = 4.But there may appear folded saddles, nodes and foci of other nature. To see this we will consider a simplest differential inequality. Simplest differential inequality.Consider a differential inequality of the form (5 - v ( w ) ) 2
+ (0- w(z,2/))2 5 1
where ( v ,W )is a smooth vector field. Note that theinequality has the same local controllability properties as the control system
S = v(z, y) + cos U ,
Q = w(z,y) + sinu
where U E S1. Their steep domains coincide with the region v 2 + w2 > 1. In a generic case, 1 is a noncritical value of the function v2 + w2 (or, in other words, d(v2 w2) # 0 if v2 w2 = 1). So the SD-boundary is a smooth curve (v2 w2 = 1). On this curve the field of limiting directions can be defined by the vector field (-W,v). In general, the last field rotates along this curve and may be tangent to it at some points. In a generic case any such tangency is of the first order. Any such tangency point is a black zero-point, and the other points of this curve are regular zero-points,
+ +
+
Controllability of Generic Control Systems on Surfaces
Theorem 7. For a generic simplest differential inequality control system) the following two statements are valid:
129
(a generic
1 ([l]). By means of a n appropriate Coo-diffeomorphism of the phase space and by multiplication with some positive function the germ at a point x of the field of limiting velocities can be transformed to the germ at zero of the vector field
either a)
(v, fy'/')
or b) ( f y ' / ' , y b ( z ,y) F y 1 / 2 a ( z g, ) )
where a , bare smoothfunctions,b(0,O) = &l, a(0,O) = 0 # a,(O,O) # 1/8, if z isrespectively a) aregular zero-pointor b) a black zero-point, and near z the SD-bounday is smooth. 2 ([3]). Any regular zero-point has SLTP. No black zero-point has S L T P , but it has L T P if and only if it gives a "folded focus" singularity of the LC-net (precisely, this happens when 1/8 a,(O,O)). Remark. At a black zero-point the LC-net of a generic control system has a folded saddle, a node, or a focus for a,(O, 0) < 0 , 0 < a,(O,O) 1/8, and 1/8 < a,(O,O), respectively [g]. Topologically, all folded saddles are the same, as well as all nodes, and all foci. In Figure 6 the four singularities of Theorem 7 are depicted. In addition to the marking of Fig. 3, a point is marked by a small triangle if it has LTP and by a small circle if it does not. Consider now the generic case with any possible set U. If the set has no isolated values of the control parameter (for example, it is the n-dimensional sphere, n E N) then for a generic control system each boundary point of the steep domain is either a &passing point, a &turning point, a double &passing point, a black zero-point, a regular zero-point, or a nonzero-passing point. Furthermore, at a point of each of the first five types the control system has the respective singularities described above (in Theorems 5 , 6) [l]. To understand the generic behavior near a nonzero-passing point it is useful to introduce some new notions. Locally a control system can be written in the form (1). Denote by L(z,u) the subspace Rf ( z ,U ) f,,(z,u)T,U of T,M. A value U of the control parameter iscalled singularat a point z if L ( z , u ) # T,M (a
+
130
Davydov
singular value of the control parameter gives a singular velocity). The union of all points ( z , U ) with a singular control U at z is called the singular surface of the control system. The restriction of the projection (z,U ) I+ z to thesingular surface is called the folding of the system. The restriction of a control system to its singular surface gives the field of singular velocities of the system.
Remark. The field of singular velocities and the singular surface play an important role in the investigation of the field of limiting directions. Each limiting velocity is singular. But very often a singular velocity at a point belongs to the interior of the cone of the paint, and so it is not limiting at this point.
Controllabilityof Generic Control Surfaces Systems on
131
According to the definition, at a nonzero-passing point z there are exactly two different limiting values of the control parameter. One of them(denoteit by u l ( z ) ) gives a nonzero velocity at thispoint, and the other u ~ ( z gives ) a zero velocity and is a nonisolated value of the control parameter. In a generic case the points (z, ul(z)), ( z , uz(z)) of the singular surface of the system are a regular point and a critical point of Whitney fold type of the system’s folding, respectively. The restriction of the control system to the germs of its singular surface at these points gives the germs at z of two vector fields v1,v2, respectively. Near z the field v1 is a smooth branch of the field of singular velocities, but the field v2 is not. In the generic case v2 has the same singularity at z as the field of limiting velocities of a generic control systemhas at a regular zero-point (see Theorem 7). The germ at ( z , u ~ ( z ) of ) the set of critical points of the folding is mapped by this folding into the germ at z of a smooth curve y. Near z the field v2 is identically zero on this curve, but by means of the subspaces L(., .) it defines a smooth field L2 of straight lines on this curve. The field v1 also defines a smooth field L1 of straight lines near z (Fig. 7; velocity indicatrices are depicted by double thinlines). The fields L1, Lz rotate along the curve, and in general they may be tangent at some points. In the generic case the nonzero-passing points are exactly those points of such tangency where the fields have limiting directions. The following theorem describes the singularities of a generic control system at its nonzero-passing points. Theorem 8 ([l]). Let z be a nonzero-passing point of a generic control system.Then: 1. The steep domain has the same Cm-singularity at z as the set g
> 21x1
has at zero. 2. The limiting directions at z do not lie in the tangent space of the SD-
boundary at a. 3. The LC-net has the same CO-singularity at z as the LC-net of threedynamical system given by the vector fields (It1,x) and ( 0 , l ) has at
Davydov
132
any of its &passing points (see 2a) of Theorem 5 ) , and the point z has SLTP. 4. The corresponding points ( . z , u ~ ( z ) and ) ( z , u ~ ( z ) are ) a regular point and a critical point of Whitney fold type of the folding of this system, respectively.
Fig. 7. Table 1 contains a complete list of generic singularities which can appear on the SD-boundary of a generic control system. Types of points are indicated in the first column, and the corresponding possible subcases are indicated in the second column. The list of the corresponding singularities of the SD-boundary and the LC-net are indicated respectively in the third and fourth columns (the abbreviation, for example, 5 . 2 ~means “singularity 2a) from Theorem 5 ” ) . Finally, local controllability properties of a point with such singularity are indicated in the fifth column, and the singularity is possible when the number of different values of the control parameter is as indicated in the last column.
Theorem 9 ([l]).The singularities listed in Table 1 are the only singularities that can appear on the boubdary of the steep domain of a generic control system, and they all are stable under small perturbations of the system. Namely, for any control system suflciently close to the original system the steep domainsof these systems and pointsof boundaries of their
Controllabilityof Generic Control
Systems on Surfaces
133
steep domains with the same singulardies canbe mapped into each other by a Ccz-diffeomorphism of the phase space that is C1-close to the identity. Table 1. Singularities of the field of limiting directions on the boundary of the steep domain,
I
I
I
I
I
The following fact is an immediate consequence of Theorem 9. According to the typeof the singular point of the corresponding field of admissible velocities in the case of a zero-passing point and to the type of the corresponding folded singular point of the limiting curves net in the other cases. “-”, “LTP” and “SLTP”mean respectively “fails LTP”, “has LTPbutnot SLTP” and “has SLTP”.
Davydov
134
Corollary. For a generic control system the sets of points with the same local controllability properties are stable under small perturbationsof the system.
2.4. Singularities of the field of limiting directions
in the steep domain These singularities do notinfluence the local controllability near points of the steep domain because at a point of this domain we have a drift. Namely, any admissible motion starting from this point leaves its small neighborhood in a time close to zero. So points of the steep domain fail LTP. But these singularities are important in the investigation of nonlocal controllability which will be done below. Nowwe are going to describe generic singularities of the field of limiting directions in the steep domain. Assume that the phase space is oriented, and a smooth reference direction for measuring angles in the tangent planes is fixed (all results of this subsection automatically carry over to the case of an unorientable phase space), Denote by n1(.) and n2(.) the angles corresponding to theminimal L1(.)and maximal L2(.) limiting directions, respectively. A limiting control at a point z will be called an i-limiting control if the velocity it gives at E lies on the straight line RLi(z), i = 1,2.The i-passing set is defined to be the union of all points having at least two different i-limiting controls. The passing set is the union of the l-passing set and the 2-passing sets. A point z in the steep domain is called an i-nonsingular (respectively, i-passing, i-turning, double i-passing, or a - c u t o f ) point if a) the number of different i-limiting controls at this point is equal to 1 (respectively, 2, 2, 3, l),b) the order of contact of the indicatrix at the point with the line RLi((z) is < 2 at each of the common points (respectively, < 2, < 2, < 2, = 3), and c) only in the case of an i-turning point this line is tangent to the closure of the i-passing set. Remarks. 1. The order of contact of a straight line and a set at their common point is defined to be 1 less than the order at this point of the zero of the function “distance from a point on the line to the set”,
Controllabilityof Generic Control Systems
on Surfaces
135
2. Here and henceforth, unless otherwise stated, the indices i and j take values 1 and 2, and, if they occur simultaneously, then i # j.
Example 11. The i-passing set of the bidynamical system in the plane given by the velocity fields (l,k ( y - z2))is the parabola y = z2. All points of the parabola except the origin are i-passing points, and the origin is a i-turning point (compare with Example 5 ) . Example 12. The passing set of the three-dynamical system in the plane given by the velocity fields (1,0), (1,y - s),(1,s y ) coincides with the set z(s2-y2) = 0. The i-passing set is the union of the set y =1 . 1 and the negative y-axis for i = 1, and is the union of the set y = -131 and the positive y-axis for i = 2 (the positive direction being counterclockwise). All points of the i-passing set except the origin are i-passing points. The origin is a double i-passing point (compare with Example 6).
+
Example 13. The origin is a l-cutoff point of the control system in z cos U y sin U and the plane defined by the equations j. = "(cos U = 1 sinu where U E S'.
+
+
+
Theorem 10 ([l]).Let z be a point of the steep domain of a generic control system. Then: 1. The point is of one of the following types: a) an i-nonsingular point,b) an i-passing point, c) an i-turning point, d) a double i-passing point, e) an i-cutoff point. 2. The Cw-singularity of the net of i-limiting curves at x is the same as the Cw-singularity at zero of the family of integral curves of the equation, respectively, a) y' = 0, b) y' = zsgns, c) y' = ( y - s2)(1 sgn(y - z2)),d) y' = max(0; zY1 (z, y); (x y ) Y 2 ( s , y ) }where , YIand Y2 are smooth functions with Yl(0,O) > yZ(0,O)> 0, e) y' = Y(s,y), where the function Y is continuous and continuously differentiable in y, with Y ( 0 , O ) = 0 , andthederivative Y, iscontinuouseverywhere except the positive y-axis. 3. Thepassingsetanditssubsets of points of thesametype are stable under small perturbations of the system. Namely, for any control
+
+
Davydov
136
system suflciently close to the original system the passing sets of the two systems and their subsets of points of the same typemapped to one another by a C"-diffeomorphism which is C'-close to the identity. In particular, any i-turning (respectively, double i-passing) point is j-nonsingular if #U > 2 (respectively #U > 3). Obviously, singularities of the i-limiting curves net are connected with the singularities of the function ni. This function is the maximum (i = 2) or the minimum (i = 1) of a family of functions depending on a parameter (here the parameter is a point of the steep domain). The generic singularities of such functions are studied in [lo, 11, 121.In our case for a generic control system and any point of its steep domain the germ at this point of the function ni can be mapped (by means of a C"-diffeomorphism and adding a smooth function) the germ at zero of one of the following four functions: 0,
(-1)ilzl,
I
( - ~ ) i ( ~ ~ + ~ ~ (-1)immax{-w4+w2y+wz ~ ~ + / ~ ~ ) , W
E R)
according as this point is, respectively, an i-nonsingular point; either an ipassing point or an i-turning point; a double i-passing point; or an i-cutoff point.
Remark. In the analytical case the normal forms of the germ of the net of i-limiting curves at an i-turning point and the germ of the LC-net at a &turning point contain functional moduli connected with the moduli described by Ecalle [l31 and Voronin [14].
3. The attainable set and its singularities Here and henceforth, unless otherwise stated, control systems on a closed orientable surfaces are considered. In this section we investigate the attainable set ,(= the positive orbit of some initial set) of such systems. We will show that for a generic control system the attainable set is stable under small perturbations of the .system and its boundary is either empty or a smoothly embedded curve with singular points of finitely many types.
Controllabilityof Generic Control Systems
on Surfaces
137
3.1. Asymptotic stability of the attainable set Sometimes a control system may beginan admissible motion from some chosen set of the phase space. This set will be called here the start set. By the attainable set of the control system we understand the positive orbit of the start set. Example 14. Consider the bidynamical system in the plane given by the velocity fields(fl, z2-v). If a start set S lies in the region y > z2 then the attainable set is the open region situated above the union of the phase curves of the fields (or the limiting curves) outgoing from zero (Fig. 8; i-limiting lines are indicated by either solid or dashed lines, the attainable set is shaded, and its boundary is depicted by the respective thick lines). The boundary of attainability is the smoothly embedded curve near any of its points except the origin. At the origin the boundary (respectively the attainable set) has the same Cm-singularity as the set y = (%l3 (respectively y > [ . l 3 ) has at zero. It is easy to see that these singularities are stable under .small perturbations of the system. Namely, the attainable set of any system sufficiently closeto the original system has the same singularities at some point that is close to the origin. A subset A of the phase space is called stable (with respect to a system) if for any E > 0 there exists a 6 > 0 such that each trajectory of the system with initial point z ( t o ) in the &neighborhood of A exists for t o 5 t < 00 and lies in the +neighborhood of A. If, in addition, the distance from the point z ( t ) of this trajectory to A tends to zero as t + 00, then A is called asymptotically stable [15]. Henceforth we will suppose that a start set is the union of finitely many points and some closed one-dimensional submanifold in the phase space. Theorem 11. Theattainableset of agenericcontrolsystemona closed orientable surface is asymptotically stable. Theorem 12 ([2]). For a generic control system on aclosed orientable surface, if a closed set belongs to the interior of its positive orbit, then the orbit is asymptotically stable.
138
Davydov
3'"
Fig. 8.
Fig. 9.
2
Controllability of Generic Control
Systems on Surfaces
139
Remark. Theorem 11 can be proved analogously to Theorem 12.
Example 15. Consider the bidynamical system in the plane given by the velocity fields (-z - 1;- 2 . 1 ~- 2.1) and ( - S 1; -2.1~2.1). These fields drive everything to the points (-1, -1)and (1,l), respectively. The positive orbit of the origin is bounded by the union of the phase curves of the fields outgoing from their singular points. It is easy to see from Fig. 9 (the marking in this figure is the same as in Fig. 8) that the orbit is open and asymptotically stable.
+
+
3.2. Structure of the boundary of the attainable set Obviously SLTZ does not intersect the boundary of the attainable set, nor does L T Z . Hence the boundary is contained in the union of all points failing LTP. Moreover, in the generic case the boundary of the attainable set has a definite structure, and we are going to describe it. Let a point z of the start set belong to the boundary of attainability. Then z will be called an ordinary start point if either z is an isolated point of the start set or near z the boundary coincides with the start set, and z will be called a breakdown (respectively, return) start point if z is a nonisolated point of the start set and the germ of the boundary at z contains the germ of a limiting curve outgoing from z (respectively, incoming to z). Example 16. Consider the bidynamical system in the plane given by the velocity fields (1,H ) . If the startset is ( O , O ) U ( ( S - ~ ) ~ + ( ~ - ~=) ’8) then the attainable setcoincides with theclosed set shadedin Fig. 10 (with the same marking &S in Fig. 8). The points A and B(= (1,4)) are a return start point and a breakdown start point, respectively.
Theorem 13 ([4]).For a generic control system on a closed orientable surface the following three assertions hold: 1. Any point z of the start set belonging to the boundary of the attainable set is either a) an ordinary start point, b) a breakdown start point, or c) a return start point.
Davydov
140
as the C"singularityatzero of, respectively,theset a) either al) p 0 or a2) p 2 1x1 (if z is a nonisolated start point or an isolated start point, respectively), b) g 2 x [ ~ [and , c) either cl) p 5 1x1 or CZ){p < )1.1 U {{v = x) n {Z 2 0))3. The germ of the bounday! of attainability at z is respectively the germ 2. The P-singularity at z of the attainable set is the same
a1) ( s , z ) ,
212)
(17l+(z)w,+(z),z),
>
b) (s+u17W,.z), 4 (s+u17i(.z),z),
where S and S+ are respectively the start set and its part contained in the bounday! of attainability, and q F ( z ) ,$ ( z ) are limiting curves also contained, near this point, in the boundary. Remarks. 1. The singularity 2c2) (or 3c)) of Theorem 13 has a stable realization on the boundary of the attainable set for the bidynamical - p), if the start system in the plane given by the velocity fields (fl,s2 set is the circle (x + + (g - 2 - 2e-l)' = 1 (Figure 11; in addition to the marking of Figure 10, singular points of the boundary of attainability are marked by natural numbers according to their types defined below). 2. By (V,z ) we denote the germ of the set V at the point z, and BV is the boundary of the set V . Outside the start set the boundary of attainability of a generic control system has the same structure (and so the same singularities) as the boundary of the positive orbit of a closed subset lying in the interior of its orbit. To describe this structure we introduce some new notations. Fix a continuous reference direction for measuring angles in the tangent planes of the phase space. Just as above, for z in the closure of the steep domain we denote by L1 ( z ) and L2 ( z ) the minimal and maximal directions, respectively, of the admissible velocities at z. By vi(%) ( q t ( z ) and q;(z), respectively) we denote the limiting curve of the branch Li of the field of limiting directions with the point z included and passing through z (leaving z and incoming to z , respectively). Denote by O+(S)and O-(S) the positive and negative orbit, respectively, of a subset S of the phase space.
Controllabilityof Generic ControlSystems on Surfaces
Fig. 10.
fl
141
2
(x-3) +
Fig. 11.
.. .
. ..
..
_,
.
.... ,.., I
. I .
.... ... .....
I
....... ..... .
,
.
..I
.
. .. . ..
Davydov
142
It is known that for a generic control system the closure of the positive (negative) orbit of any subset of the phase space coincides with the closure of the interior of that orbit and is a manifold with boundary. Moreover, the boundary is a CO-hypersurface in the phase space [lS].
Theorem 14 ([2]). For a generic control system on a closed orientable surface, let S be aclosed subset contained in the interior of its positive orbit. Then: 1. The boundary of that an orbit is contained in the union of the steep domain and the set of folded singular points of saddle, stable node,
monkey saddle, and stable saddle-node type. 2. The germ of the boundary at any of its points lying in the steep domain coincides with one of the following three germs: ai)
(rlik),
z);
b)
(771(4U va(4, 2).
If (qa(z),z) = (OO+(S),z),then: 3. The limiting curveqi(z) is an incoming separatrix for no folded singular point, and one of the following two assertions holds: 4. qa(z) is a cycle contained in 6O+(S), and (qi(z),A ) = (BO+(S), A ) for
each A
E qi(z);
5 . qi(z) is an outgoingseparatrix of a foldedsingularpoint
of saddle,
monkey saddle or stable saddle-node type. Theorem 15 ([2]). For a generic control system on a closed orientable surface, let S be a closed subset contained in the interior of its positive orbit, and let z be any point of the intersection of the boundary of this orbit with the boundary of the steep domain. Then: 1. The germ of this orbit at z contains the germ of S L T Z ( L T Z ) at z. 2. The germ at z of the boundary of the orbit coincides with one
of the
following germs: a folded singular point of saddle type; in this case near z the outgoing separatrices $ ( z ) and $ ( z ) separate the incoming sepamtrices q; ( z ) and q ; ( z ) from S L T Z ( L T Z ) ;
a) ( $ ( z ) U q $ ( z ) , z ) if z is
Controllabilityof Generic Control Systems on Surfaces
143
b) (qi( z ) U qg ( z ) ,z ) if z is a folded singular point of stable node type; in this case the curves q,(z) and q g ( z ) are not separatrices of this node; c) (qf(z) U q$(z),z ) if z is a folded singular point of monkey saddle type; in this case the curves qf(z) and $ ( z ) are outgoing sepamtrices of this singular point, as well as outgoing sepamtrices of a viewed as the singular point of saddle type of the admissible field of velocities determined by the corresponding isolated value of the control parameter; d) eitherdl) ( q ; ( z ) U q g ( z ) , z ) ordz) (qJz)Uq+(z),z) i f z is a folded singular point of stable saddle-node type; in this c w e only the curve $ ( z ) is a separatria: of this saddle-node.
Remarks. 1. If #V = 2, then for a generic control system a folded saddle and a focus occur simultaneously at a &turning point. Thus in this case a folded focus can also appear on the boundary of attainability. 2. In the case of the germ 2b) of Theorem 14 the point z will be called a point of fusion. 3. When changing the direction of movement, positive orbits become negative and vice versa. Hence, Theorems 14 and 15 also hold for negative orbits, butit is necessary to change the upper indices “+” of limiting curves to ‘$-”and vice versa, and the words “incoming”, “outgoing”, and “stable” to the words “outgoing”, “incoming”, and “unstable”, respectively. Figure 12 illustrates the structure of the boundary of attainability described in Theorems 14 and 15. In addition to themarking of Figure 8 the interior of the attainable set is shaded by slanted lines and the interior of SLTZ (LTZ) by vertical lines. 3.3. Singularities on the boundary of the attainable set
In Theorem 13 we have described the generic singularities on the boundary of the attainable set at its points belonging to the start set. Here we deal with the singularities on the boundary of attainability outside the start set.
..
. ,.
.
..... ... ,.
,_,.. jC*:LL.Y1”,,.“,.”“..~......_...,.,
“...,.,.......,..‘._..*.*~,,I
.I
I
,
,
. .... . ... . , . . .-.,
.,”.
...,...,_..
144
Davydov
14.2a
14.2b
15.2a
15.2 b
1Sd, Fig. 12
Controllabilityof Generic Surfaces Control Systems on
145
A point of the boundary of attainability will be called a singular point of type p , 1 5 p 5 6 , if the germ of the interior of the attainable set at that point is C"-diffeomorphic to the germ at zero o f l+)y > (1 or 1-) y < 1 3 1 for p = 1; y > I ~ l z p -for ~ p = 2,3; < y < z, where E = f l , for p = 4; y < ( E c(sgnz - l))lzla,where E = f l and e/2 # c E R,for p = 5 ; y < h(z)for p = 6 , where the germ at zero of the graph of y = h(z) is the germ at zero of the closure of the union of two nonsingular phase curves of a folded node
+
y = (dy/dz
+ a(a + 1)-2z/2)2,
entering zero in opposite directions; everywhere a > 1 is not an integer.
Remark. A singular phase curve of a (folded) node is a phase curve incoming to or outgoing from the corresponding singular point which is extendable through the singular point as a smoothly embedded curve; for noninteger exponent of the node there are exactly four such phase curves. The exponent of a singular point of node or saddle type is the ratio of the largest (in modulus) to the smallest eigenvalue of the linearization of the field at the point. The exponent of a folded singular point is equal to the exponent of the singular point of the preimage. We observe singularities of type 1 at the points (0,O)and A and of type 2 at B in Fig. 10, of type 3 at zero in Fig. 8 and of type 4 at ( 1 , l ) and (-1, -1) in Fig. 9. All these singularities are stable under small perturbations of the considered systems. The following examples give stable realizations of the singularities of types 1, 2, 4, 5 and 6 (see also [2, 9, 171). Example 17. Consider again the bidynamical system in the plane given by the velocity fields (-S - 1; - 2 . 1 ~- 2.1) and (-z+ 1; -2.lyf2.1) (see Example 15). If the start set is the union of the point (0, -2) and the + (y - = 1 thentheattainable set isshownin Fig. 13 circle (z + (Figs. 13, 14 have the same markings as Fig. 8). On its boundary one can see singularities of types 1, 2, 4 and 5 labeled with the respective numbers. Example 18. Consider the control system in the plane with velocity indicatrices consisting of the fields (z cosu, ky sinu), where u E S1 and k > 2 is not an integer. The steep domain is the region za k2y2 > 1
+
+
+
Davydov
146
I
Fig. 13.
Fig. 13
Fig. 14.
Controllabilityof Generic Control Systems
on Surfaces
147
and has four black zero-passing points on its boundary (see Example g), namely, at (&l, 0) we have folded saddles with exponent (1 - k)“k, and at (0, &l/lc)folded nodes with exponent a = k - 1. A start set consisting of two points A and B (Fig. 14) may be chosen in such a way that the attainable set has (the same) singularity of type 6 at the two nodes. A point of the boundary of the attainableset will be called a nonaingular point or a singular point of type p = 0 if near this point the boundary is a smoothly embedded curve. Table 2 contains a complete list of generic singularities which can appear on the boundary of attainability of a generic control system. Types of points which can lie on that boundary are indicated in the first column, and the corresponding possible subcases of types of singularities at those points are given in the second column. The list of the corresponding singularities of the LC-net and the germs of the boundary of attainability are given in the third and fourth columns, respectively. In the fifth and sixth columns we indicate whether the attainable set contains the parts of limiting curves lying on the boundary near the singular point, and thispoint itself, respectively (“+” , “-” and “f” mean respectively “yes”, “no”, “sometimes yes, sometimes no”; the number of signs is equal to the number of those parts). Note that the attainable set contains only those parts which are contained in the limiting curves outgoing from either ordinary isolated start points or breakdown start points.
Theorem l6 [2,4]. For a generic control system ona closed orientable surface, the following three assertions hold: 1. The closure of attainable set coincides with the closure of its interior. 2. The singularities listed in Table 2 are the only ones which may appear on the boundary of the attainable set. 3. All these singularities are stable under small perturbations of the system. Namely, for any control system suflciently close to the original systemtheattainablesets of the two systems can be transformed to each other by a homeomorphism of the phase space which is closeto the identity; the homeomorphism canbe chosen tobe a C“-diffeomorphism everywhere except possibly at singular points of types 4,5, 6.
Davydov
148 Table 2. Singularities on the boundary of attainability.
“SD”= “a point of the steep domain has no folded type”; monkey saddle, saddlenode and saddle types for zero-passing points give singularities of the net from Theorem 5 if #V > 2; here stable folded singular points of node and saddle-node type are admitted only. ‘h++” = “1- and Z--nonsingular”, ‘h&” = “the point is either an i-nonsingular point, i = 1,2,or a passing point of one of the two branches (and if #U = 2 of both of them) of the field of limiting directions and a nonsingular point for the other”. “no” = “near such a point the boundary of attainability coincides with the start set”; in the “i-nonsingular (i-passing) point” or “fusion point” case, in the last column we have <‘+”if there is at least one l‘+” in the preceding column.
-
Controllabilityof Generic Control Systems
on Surfaces
149
Remarks. 1. The exponent a, of a singular point of type 4, 5 , or 6 is invariant under Cm-diffeomorphisms.It can change under small perturb& tions of the control system. So in general the transforming homeomorphism will not be smooth. 2. Theorems 11, 13, 14, 15, 16 automatically carry over to the case of the unorientable phase space [4]. The following fact is an immediate consequence of Theorems 10 and 16. Corollary. For a generic control system any point of the boundary of attainability belonging to thesteepdomaincanonlg be one of two types, namely, either an i-nonsingular point or an i-passing point.
4. Stability of nonlocal controllability Here we investigate nonlocal controllability properties (that have no connection with the startset) for genericcontrol systems on a closed orientable surface. In particular, we study open domains D in the phase space with property that D coincides with the intersection of the positive and the negative orbit of any point of D. It will be shown that for a generic control system, such domains are stable under small perturbations of the system. Finally, we will show that the family of trajectories (or, in the other words, family of orbits of points) of a generic control system are structural stable in the classical sense like the family of phase curves of a generic differential equation on a closed orientable surface. 4.1. Structure of boundaries of nonlocal transitivity zones
An open subset of the phase space coinciding with the intersection of the positive and the negative orbits of any of its points is called a nonlocal transitivity zone (NTZ). It is easy to see that any two points of a NTZ are attainable from each other, and hence a N T Z either does not intersect the positive or negative orbit of a point or is contained in that orbit.
150
Davydov
Example 19. Consider again the control system in the plane with velocity indicatrices consisting of the fields (z+cos U, ky+sinu), where U E S” and k > 2 is not an integer (see Example 18). Its nonlocal transitivity zone is the open region bounded by the closure of the union of the outgoing separatrices of folded saddles (Fig. 15; here in contrast to the marking in Fig. 11 the N T Z is shaded and its boundary is depicted by the respective thick lines). Here the positive orbit of N T Z is the zone itself, and the negative orbit is the whole plane.
Fig. 15. The following theorem describes the structure of the boundary of a N T Z in the generic case.
Theorem 17 [2]. Let Z be a nonlocal transitivity zone for a generic control system on a closed orientable surface. Then: 1. The germ of Z at each point a of its boundary is equal to the germ at z of O + ( Z ) (O-(Z), respectively) if z E O - ( Z ) ( a E O + ( Z ) , respecti2.
. .
. ..
vely). If z E BO+(Z)t l BO-(Z)(angular point), then either
Controllabilityof Generic Control Systems Surfaces on
151
a) z is a point of the steep domain, and then the germ at z of 6 2 is equal to the germ ( q i ( z ) U q:(z),z), in this case ( 8 O + ( Z ) , z ) = ( q i ( z ) ,2) and (aO-(z),z) = ( ~ j ( z )z); , or
b) z is a folded singular point of monkey saddle type and the germ at z of az is equal to (177( z ) U ;7 (z),z). 3. The germ of Z at each point of the intersection of its boundary with the boundary of the steep domain contains the germ of LTZ at this point.
4.2. Singularities on boundaries
of nonlocal transitivity zones It is easy to see fromTheorem 17 that in the generic casethe boundary of N T Z can exibit all singularities that the attainability boundary does. We define a singular point of type p , 0 5 p 5 6, of the boundary of N T Z in the same way as for the boundary of attainability. It turns out that, generically, these are the only singularities that may appear on the boundary of N T Z . Precisely:
Theorem 18 [2]. For a generic control system on a closed orientable surface, the following four assertions hold: 1. O n the boundary of any of i t s N T Z there can only appear singularities listed in Table 3. 2, All these singularities are stable under small perturbations of the system. Namely, for any system suficiently close to the original system its nonlocal transitivityt zones aremapped over to the corresponding zones of the original system by a homeomorphism of the phase space that is close to the identity; the homeomorphism can be chosen to be a Coo-diffeomorphism everywhere except possibly at singular points of types 4, 5 , 6 . 3. The closures of different nonlocal tmnsitivity zones a= disjoint. Remark. As discussed above, singular points of types 4, 5, 6 have exponents preserved under C*-diffeomorphisms. But the exponents can
152
Davydov
Table 3. Singularities on the boundaries of nonlocal transitivity zones.
change under small perturbations of the system. So, in general, the transforming homeomorphism will not be smooth. The following examples give the stable realizations of type 1-5 singularities on the boundary of N T Z . Example 20. Consider the three-dynamical system in the plane given by the velocity fields(-z-l; -2.ly-2.1), (-a+l; -2.ly+2.1) and (-(z2 ) , - 2 . 1 ~(compare ) with Examples 15,17) . The fields havesingular points (1, l), (- 1, -1) and (2,0), respectively. The boundary of NTZ consists of these points, of the phase curve of the first field outgoing from the point (2,O) and of the phase curves of the second field outgoing from (-1, -1) and (2,O) (Fig. 16; the marking of Fig. 16, Fig. 17 is the same as Fig. 15). The boundary has singular points of type 1, 4 and 5 at (2,0), (-1,-l) and (1, l ) , respectively.
Controllabilityof Generic Control Systems on Surfaces
153
Fig, 16.
Example 21. The nonlocal transitivity zone of bidynamical system in the plane given bythe velocity fields(y-2z, -32) and (y-22+1, -3(z-l)) (having the “absorbing” focus funnel at (0,O)and (1, l), respectively) is bounded by a limit cycle of one of the branches of the field of limiting directions, encircling singular points of the fields. This cycle, which is the boundary of the zone, has singular points of type 2 at its intersection with the line y = z (Fig. 17). Example 22. The nonlocal transitivity zone of the bidynamical system in the plane given bythe velocity fields( H , z2-p) coincides with the positive orbit of any point lying over the parabola y = z2. The boundary has a singular point of type 3 at zero (Fig. 8; compare with Example 14). Remark. A singularity of type 6 has a stable realization in a control system which is a suitable perturbation of the system from Example 18. Moreover, the perturbation can be chosen arbitrarily small and localized in any neighborhood of folded singular points of node type.
Davydov
154
5. Structural stability of generic control systems Here we consider control systems on a closed orientable surface and introduce a notion of structurally stable control system. The notion is similar to the classical concept of structurally stable (rough) vector field introduced by A. A. Andronov and L. S. Pontryagin. We will show that for a generic control system its family of trajectories (= family of positive and negative orbits of points) is stable under small perturbations of the system. This means that with respect to orbit equivalence a generic control system behaves like a generic vector field [18,191.
5.1. Stability of systems We call a control system structurally stable if the family of the positive and negative orbits of points of any system which is sufficiently close to the original one can be transformed to the corresponding family for the original system by a homeomorphism of the phase space that is close to the identity.
S ,i
Example 23. Let an object on the torus (= x S$ where 4 and $ are an angles on the circle) admit motions with two smooth structurally stable velocity vector fields both of modulus one. One field of velocities has two simple cycles c j = fn/4, the +component of the field depends only on q5 and has a graph looking as in Fig. 18. The other field of velocities is the image of the first under the diffeomorphism (+,$) ts (-4 n,$).
+
This control system has two nonlocal transitivity zones whichare annuli n/4 < 141 < 3n/4. The zones are separated by two annuli in which the behavior of the phase curves of the fields istopologically the same, namely, the phase curves of one field of admissible velocities are $ = const and the phase curves of the other field spiral away from one cycle of this field and to the other cycle (Fig. 19; marking is the same as in Fig. 15). What do the orbits of points of the torus look like? For a point of any nonlocal transitivity zone, the. positive orbit coincides with the zone, and the negative orbit is the entire torus.
Controllabilityof Generic Control Systems on Surfaces
E Fig. 17.
-* t Fig. 18.
155
156
Davydov
Fig. 19.
Fig. 20.
Controllabilityof Generic Surfaces Control Systems on
157
Consider an interior point of one of the annuli. Through this point we draw the phase curves of the fields of admissible velocities and take two segments of the positive and negative semitrajectory of each of these curves: the segments lie between our point and the nearest two points of intersection of these curves. The positive (respectively negative) orbit of our point is located between the union of two of the four segments lying on the positive (respectively negative) semitrajectories and one of the stable (respectively unstable) cycles which is not nearest to the point (Fig. 20; here in contrast to Fig. 8 the orbits are shaded and their boundaries are depicted by thick lines). Moreover, the union of these two segments is contained in the orbit but the cycle is not. For a point of the boundary of the annulus we have practically the same picture of orbits, except that on the boundary of the orbits we have the corresponding limit cycle (containing the point) instead of the union of the two segments. Apparently, the structure of orbits will be the same for any control system sufficiently close in the Cl-topology (and hence also for any system sufficiently close in the C4-topology) to the original one; by a home omorphism CO-close to the identity the orbits of one of these systems are mapped to orbits of the other one. Thus, our control system is structurally stable.
Theorem 19 [2]. A generic control system on a closed orientable surface is structurally stable. Theorem 19 is an analog of classical theorems of Andronov, Pontryagin and Peixoto [19,201 about structural stability of a generic vector field.
5.2. Sufficient conditions for stability
Usually the classical structural stability theorems for a generic vector fieldgivenecessary and sufficient conditions for such stability. Here we describe some similar conditions for control systems. For a generic control system (namely, for a system whose field of limiting directions has only generic singularities described above) its limit
Davydov
158
curve is called singular if it is either a closed curve lying entirely in the steep domain (such a curve is called a cycle) or the separatrix of at least one folded singular point of the field of limiting directions. Aswe saw above, for a generic control system the separatrices are observed in folded singular point of saddle, node, monkey saddle, and saddle-node type. It is easy to see from Figs. 6.2b1, 6.2b2, 5.3a, and 5.3b, that at such singular point 4, 2, 6, and 4 separatrices are “born”, respectively. The set of singular limiting curves of a generic control system is called structurally stable if for each sufficiently close system its set of singular limiting curves is carried over to the original set by a homeomorphism of the phase space that is close to the identity. Similarly, one defines the structural stabilityof the net of i-limiting curves (or the field of i-limiting directions),
Theorem 20. For a generic control system on a closed orientable surface, the structural stability of the set of singular limiting curves implies the structural stability of the system. Remarks. 1. In Theorem 20 (and in Theorems 21 and 22 below) under a generic control system we understand the one whose field of limiting directions has only generic singularities (this also includes the transversality of jet and multiple jet extensions of.the system to the corresponding submanifolds). 2. The structural stability of the set of the singular limiting curves is not necessary for the structural stability of the system itself.
Example 24. A bidynamical system on the torus given in standard coordinates 4 and 1/1 (see Example 22) by the vector fields (1,O) and ( 0 , l ) is structurally stable, but itsset of singular limiting curves is not. Indeed,the positive (negative) orbit of any point is the entire torus bothfor this system and for any sufficientlyclose system. Hence the system is structurally stable. Anyof its limiting curvesis a cycle (either 4 = const or @ = const) which is not simple. Hence the set of singular limiting curves is not structurally stable [19,203.
.
..
.
..
.
Controllability of Generic Control Systems on Surfaces
159
Theorem 21 [2]. For #U > 3 f o r a generic control system ona closed orientable surface, the structural stabilityof the fieldof i-limiting directions is equivalent to the following three conditions being satisfied:
A. There are no double separatrices, i.e., no limiting curves of the field that are separatrices of folded singular points both f o r increasing and descreasing time. B . Any closed limiting curve of this field lying in the steep domain is a simple cycle. C. Any limiting curve of this field diffeerent from a cycle, if extended to either side, ends at a boundary point of the steep domain or spirals onto a cycle of this field. Theorem 22 (21. For a generic control system on a closed orientable of the set of singularlimitingcurvesis surface, the structural stability equivalent to the following four conditions being satisfied: conditions A C of Theorem 21 f o r each of thetwobranches of thefield of limiting directions, and the condition
D. Only one singular limiting curve can enter each point
of the boundary
of the steep domain that is not a folded singular point.
Remark. Conditions A-C of Theorem 21 are analogues of the corresponding conditions for the structural stabilityof a smooth vector field on a closed orientable surface [19,201. Theorem 23 [2]. For a generic control system on a closed orientable surface, the set of singular limit curves and, for #U > 2, each of the two branches of the field of limiting directions are structurally stable. The following fact is an immediate consequence of Theorems 23 and 9.
Corollary. For a generic control system on a closed orientable surface, no singular limiting curve contains i-cutoff points, double i-passing points, or i-turning points, and any intersection point of such curves is a nonsingular point of each of the branches of the fields of limiting directions.
Davydov
160
6, Appendix: open problems
Here we will briefly discribe some open problems in the theory of controllability of control systems. There are many fundamental results concerning local controllability near a point. Some of them are provedby using the techniques of Lie commutators and special variations of controls (see papers [21, 22, 23, 241 and the references in these works), others are based on geometrical considerations and the theory of normal forms (see [l, 6, 25, 26, 27, 28, 291). But there are rather fewer papers dealing with the sets of points with the same local controllability properties and with stability of such sets under small perturbations of systems (see [l, 6, 71). It seems that problems concerning these sets and their stability are completely not solved even for polydynamical systems on a k-dimensional manifold with k > 2. Probably the same problems for general control systems (for example, like the systems considered in this paper) are more difficult. The nonlocal controllability, transitivity and limit behavior of control systems axe extensively studied topics in the theory of control systems 32, 33, 34, 35, In 361). particular, the results about NTZ (see [2,7, 30, 31, formulated above imply a complete description of the (chain) control sets (defined in [35])for a generic control system on a closed orientable surface. I have the resonable arguments that Theorems 17,18and 19 are also valid in the case of a unorientable phase space, and I hope that proofs of these results will be found. Finally, the singularities of time in time optimal problems is one more topic attracting the attention of reseachers. If the start set is a smoothly embedded closed submanifold, if the indicatrices of admissible velocities are strictly convex and if the interior of the convex hull of each of them contains zero velocity, then these singularities have a fairly complete d e 381). question is what are the genescription (see [lo,11, 12, 17, 37, The ric singularities of the time minimum function without these assumptions about either the zero velocity or the convexity?
Controllabilityof Generic Control Systems on Surfaces
161
REFERENCES A. A. Davydov, Singularities of field of limiting directions of twodimensional control systems, Math. USSR Sbornik 64 (1989), No. 2, 471-493. A. A. Davydov, Structural stability of generic control system on the orientable surface, Math. USSR Sbornik 72 (1992), No. 2, 1-28. A. A. Davydov, Qualitative Theory of Control Systems, Submitted to Mir-AMS. A. A. Davydov, Singularities in two-dimensionalcontrol systems, Thesis, Moscow State University, Moscow, 1982 (in Russian). N. N. Petrov, Controllability of autonomous control systems, Differential Equation 4 (1968), 606-617. J. B. Goncalves, Local controllability of nonlinear systems on surfaces, preprint Porto University, 1990. M. M. Baitman, Controllability regions in the plane, Differential equations 14 (1978), 407-417. A. F. Filippov, Singular points of differential inclusion in the plane, Moscow Univ. Math. Bull. 40 (1985), No. 6, 54-61. A. A. Davydov, Thenormal form of diferential equationthat is not solved for derivative, in a neighbourhood of a singular point, Functional Anal. Appl. 19 (1985), No. 1, 81-89. L. N. Bryzgalova, Singularities of the maximum function of a parametrically dependent jbnction, Functional Anal. Appl. ll (1977), 49-51. L. N. Bryzgalova, Maximum functions of a family of functions depending on parameters, Functional Anal. Appl. 12 (1978), 50-51. V. I. Matov, Topologicalclassification of germs of function of the maximum and minimax of families of functions in general position, Russian Math. Surveys 57 (1982), No. 4, 127-128 J. Ecalle, Theorieiterative:introductiona la theorie des invariants holomorphes, J. Math. Pures Appl. (9) 54 (1975), 183-258.
162
Davydov
S. M. Voronin, Analitic classification of pair of involution and its applications, Functional Anal. Appl. l0 (1981), No. 2, 94-10. P 51 A. F.Filippov, Stability for differential equations with discontinuous and multivalued right-hand sides, Differential Equations 16 (1979), NO.6, 720-727. A. A. Davydov, The quasi-Holder nature of the boundary of attainability, Selecta Mathematica Sovietica 9 (1990), No. 3, 229-234. V. I. Arnol'd, Catastrophe Theory,2nd edition, Springer-Verlag, 1988. V. I. Arnol'd, Geometrical Methods in the Theory of Ordinary Differential Equation, Grundlehren 250, Springer, 1983. J. Palis, W. De Melo, Geometric Theory of Dynamical Systems, An Introduction, Springer, New-York, 1982. M. M, Peixoto, Structural stability on two-dimensional manifolds, Topology, 1 (1962), NO.2, 101-120. H.Sussmann, A generaltheorem o n local controllability, SIAM J. Control Optimization, 25 (1987), No. 1, 158-194. A. A. Agrachev, R. V. Gamkrelidze, Local controllability for families of diffeomorphisms, System and Control Letters 19,1992, A, A. Agrachev, R. V. Gamkrelidze, Local controllability and semigroups of diffeomorphisms, to appear in Acta Appl. Math. M. Kawski, High-order small-time local controllability, in "Nonlinear controllability and optimal control (Ed. H. Sussmann), Marcel Dekker, 1990, 431-467. 1251 J. B. Goncalves, Local controllability in 3-manifold, Systems and Control Letters 14 (1990), No. 1, 45-49. J. B. Goncalves, Geometric condition for local controllability, J. Differential Equations, 89 (1991), No. 2, 388-395. J. B. Goncalves, Local controllability of scalar input systems on 3manifolds, System and Control letters, to appear. B. Jakubczyk, F. Przytycki, Singularities of Ic-tuples of vector fields, Dissertationes Math., 213 (1984). A. Babichev, A. Butkovskii, N. Lepe. Singular sets on phase portraits
Controllability of Generic Control
Systems on Surfaces
163
of dynamical systems with control,Aut. Rem. Control 47 (1986), No, 5 , 607-614, NO.7, 912-919. J. M. Gronski, Classification of closed sets of attainability in the plane, Pacific J. Math. . 77 (1978), No. 1. C. Lobry, Controllability of nonlinear systems on compact manifolds, SIAM J. Control 12 (1974), No. 1, 1-3. H.Sussmann, Some properties of vector field systems that are not altered by small perturbation, J. Differential Equations, 20 (1976), NO. 2, 292-315. M. Sieveking, A generalization of the Poincare-Bendixson theorem, The Institute of Applied MathematicsandStatistics, Vancouver, Technical Report No. 88-11, December, 1988. F. Colonius, W. Klieman, Limit behaviour and genericity for nonlinear control system, Schwerpuktprogramm der Deutschen Forschungsgemeinschaft “Anwendungsbezogene Optimierungund Steuerung”, Report No. 305, Institut fiir Mathematik, Universitat Augsburg, 1991,1-27. F. Colonius, W. Klieman, Controltheoryanddynamicalsystems, Schwerpuktprogramm der Deutschen Forschungsgemeinschaft “Anwendungsbezogene Optimierung und Steuerung”, Report No. 328, Institiit fur Mathematik, Universitat Augsburg, 1991, 1-49. A. Bacciotti, Complete controllability and stabilizability,in “Dynamical systems and Microphysics, Control theory and Mechanics” (Proceedings of Third International Seminar on Mathematical Theory of Dynamical Systems and Microphysics, Udine, Italy, September 4-9, 1983). Ed. A. Blaguere, G. Leitmann. Academic Press, 1984, 1-12. V. I. Arnol’d, Singularities in calculus of variations, in “J. of Soviet Math. ” 27, 1984. V. M. Zakalyukin, Metamorphoses of fronts and caustics, depending on parameter, versality of mappings, J. of Soviet Math. 27 (1984), 2713-2735.
Recent Advances in the Stabilization Problem for Low Dimensional Systems W.P. Dayaluansa Department of Mathematics, Texas Tech University, Lubbock, Tx 79409
Abstract. We survey recent advances on the stabilization problem for two and three dimensional, single input, affine nonlinear systems.
1. Introduction There has been a tremendous interest in the nonlinear stabilization problem in the recent past, as evidenced by numerous research articles, and the recent book byBacciotti [Ba]. One of the main contributing factors has been the realization that modern robotic systems and advanced aircraft etc cannot be analyzed by using linear techniques alone, and more advanced theories are necessary in order to meet these design challenges. This has lead to thegeneralization of well known lineartheories such as stabilization of passive systems (see [BIW], [HMI, [KS], [MA], [MO]etc.) and generali-
This paperwas written while the authorwas at the department of Electrical Engineering and Systems Research Center, University of Maryland, College Park, MD 20742; Supportedin Part by NSF Grants #ECS 9096121, #EID 9212126 and the Engineering Research Center Program Grant #CD 8803012.
165
166
Dayawansa
zation of the notion of minimum phase systems to the nonlinear setting (see [B121 etc.). On the other hand, it has been pointed out that there are interesting classes of highly nonlinear, nongeneric systems which arise as low dimensional subsystems after various types of dimension reduction techniques such m center manifold reduction (see [Ay], [Ba]), modification of zero dynamics by redefining the output function (see [Ba]), and partial feedback linearization (see [Mars], [Ba] etc.). Analysis of the stabilization problem for these latter classes of systems require innovative techniques which have no counterparts in the theory of linear systems. Our focus in this paper will be on this latterclass. Due to thecomplexity of the problem we will only consider this problem in the two and three dimensional cases. We will consider a single input, &ne, nonlinear system given by,
where, IC E W , ( n = 2 or 3), U E 8,and f , g are Ck (k > 0) vector fields. Wewill assume that the originis an equilibrium point of the unforced system, i.e. f ( 0 ) = 0, and that g(0) # 0. Our aim here is to review recent developments on the stabilization problem for (1.1),when it describes a highly nonlinear system, and give a flavor of some of the techniques that have been developed in order to study this problem. Even though there has been much interesting work done on bilinear systems in low dimensional cases, we will not discuss this class here. Interested reader is referred to [Ba] for a recent account on the stabilization problem for bilinear systems. It has been recognized for quite some time now that there is a strong relationship between the stabilizability and various notion of controllability. Indeed, local asymptotic stabilizability clearly implies that thesystem is locally asymptotically controllable to the origin (see section 2 for the definition). First modern attempts to relate stabilizability to controllability seems to be contained in the work of Sussmann (see [Su]), where he proved that for real analytic systems, a certain notion of controllability implies the existence of a piecewise analytic feedback control law which steers all points in the state space to the origin. He also gave an example
Stabilization Problem for
Low Dimensional Systems
167
to illustrate that , in general there may not exist an analytic stabilizing feedback control law. However, it was Brockett (see [Br]) who dispelled the myth that small time local controllability (see section 2 for the definition), implies stabilizability. Brockett showed in [Br] that there are obstructions to stabilizability which are topological in nature. Zabczyk (see [Za]) pointed out that this obstruction is closely related to a necessary condition for stabilizability due to Zabreiko (see [KZ]). A similar result wasused by Byrnes and Isidori (see [BIl]) to prove that the attitude stabilization problem for a rigid body controlled by gas jets can be solved if and only if there are three independent control torques exerted along independent axes. Coron (see[Cor]) used Zabreiko’s theorem to strengthen Brockett’s necessary condition. Brockett’s necessary condition showed for the first time that asymptotic stabilization problem for highly nonlinear systems is much more complex than it was anticipated. This has lead to a renewedeffort to understand the stabilization problem, and in particular focus shifted to low dimensional cases for which there is a variety of topological and analytical tools available. Ayels (see[Ay])pointed out the relevance of the center manifold theorem in order to extract “the highly nonlinear” part of a given control system. This now leads to the problem of stabilizing a nonlinear system in which the linear approximation of the drift vector field is identically equal to zero. Kawski showed (see [Kal]) thatfor a two dimensional real analytic system given by (1.1), small time local controllability implies stabilizability by using a Holder continuous feedback function. This is a landmark result in several respects. On one hand, itshowed oncemore that there is a strong relationship between the controllability property and the stabilizability property. On the other hand, it emphasised the fact that one should not be confined to the class of smooth feedback functions in seeking a solution for the stabilization problem, a statement that has already been made by Artstein (see [Ar]), Sontag and Sussmann (see [SSl]) etc. Boothby and Marino (see [BM3]) gave an obstruction to the stabilizability of such systems by using C1 feedback. Dayawansa, Martin and Knowles showed (see [DMK], [DM31 etc.) that in the two dimensional real
168
Dayawansa
analytic case, local asymptotic controllability to the origin is a necessary and a sufficient condition for the local asymptotic stabilizability of (1.1) by using continuous feedback. They also gave some necessary conditions and sufficient conditions for the asymptotic stabilizability of (1.1)by using C kfeedback for 0 5 IC 5 m. In the three dimensional case, we will restrict out attentionto homogeneous and weighted homogeneoussystems. An example of the former type is the angular velocity control system of a rigid body (see [AS],[SSZ]).However, our primary motivation for studying this class is that if the leading homogeneous part of a system of differential equations is asymptotically stable, then the addition of the higher order terms does not destroy the local asymptotic stability of such a system (see [Ha]). Thus, a theorem due to Coleman (see [Col]) on the asymptotic stability of three dimensional homogeneous systems is of relevance to our problem. Coleman attributes the two dimensional version of this theorem to Forster [Fo] whichhas been discovered iudependently by Haimo (see [Hail) (see Samardzija [Sa] also). Kawski (see [Ka2], [Ka3]) and Hermes (see [Hell) pointed out that the nilpotent approximation of certain three dimensional small time locally controllable systems are dominated by a weighted homogeneous system (instead of a homogeneous part), and showed that the local asymptotic stability of a weighted homogeneous system is preserved if one adds “higher order” terms. They also extended Coleman’s theorem in [Col] to the weighted homogeneous setting. These considerations have motivated the asymptotic stabilizationproblem for homogeneous,and weighted homogeneous systems, and some of their generalizations. Stabilization problem in the case of two dimensional homogeneous systems is relatively straightforward, aa was observed by Samardaija (see [Sa]) and Dayawansa, Martin and Knowles (see [DMK]). However, three dimensional case turns out to be quite challenging. There are well known examples of small time locally controllable systems which do not satisfy Brockett’s necessary condition (see section 3 for an example). It has been pointed out by Coron that there are small time locally controllable three dimensional systems which satisfy Brockett’s necessary condition, but violate a necessary condition
Stabilization Problem for Low Dimensional Systems
169
given in [Cor]. Dayawansa and Martin (see [DMl]) and Kawski (see [Ka2], [Ka3]) have given additional necessary conditions and sufficient conditions for the asymptotic stabilizability of three dimensional homogeneous and weighted homogeneous systems. Hermes has shown (see [He2],[He3]) that for certain classes of weighted homogeneous systems one can derive an asymptotically stabilizing feedback control law by solving a regulator problem. More recently, Dayawansa and Martin and Samelson have shown in hitherto unpublished work (in [DMS2], [DMS3]) that single input, three dimensional, homogeneous polynomial systems of a given odd degree are generically asymptotically stabilizable (a sketch of the proof is given here in section 4). They have also given necessary and sufficient conditions for the asymptotic stabilizability of a generic classof homogeneous polynomial systems of a given even degree (see [DMSl]). This paper is organized as follows. In section two we will give the basic definitions applicable to the problems discussed here. In section three, we will discuss Brockett's, Zabczyk's and Coron's necessary conditions and givesome examples. In sections four and fivewe willreview the work on the asymptotic stabilization problem fortwo and three dimensional systems respectively. In section six, we will conclude the paper by stating a few challenging unsolved problems in the context oflow dimensional stabilization.
2. Basic definitions and notation Throughout the paper I(.([ denotes the Euclidean norm, and B,denotes the open ball of radius E in an Euclidean space. An overlineOver an already defined set denotes the closure of the set and 8 before a set denotes its boundary. Here, we will consider a nonlinear control system having the structure, (2.1)
k = F(z,u),
where, z E W , U E !Rm, F is a family of Ckvector fields. We will assume
Dayawansa
170
that the origin of !JP is an equilibrium point of the unforced system, i.e. F(0,O) = 0. Sometimes, we will assume that (2.1)has the affine structure,
(2.2)
= f (x)+ S(Z>U,
where, IC E W , f and g are smooth and U E f (0) = 0, and that f and g are Ck.
!Rm,
and we will aasume that
W.
Definition 2.1. Let 20 and 21 be elements of Then in (2.1) 11 is reachable from x0 at time T with control bound M if there exists a piecewise continuous control input, U : [O,T]+ R"' where Ilu(t)ll IM for all t E [O,T], such that the corresponding solution of (2.1) starting at a0 at t = 0 passes through I C ~at t = T . The set of points which are reachable from ICO at a fixed time T with control bound M is called the reachable set from x0 at time T with control bound M ,and it is denoted by RM(T,X O ) . The system (2.1) is small time locally controllable at x0ifx0 belongs to the interior of RM(T,ICO) for all M,T> 0. The system (2.1)is small time locally controllable if (2.1) is small time locally controllable at the origin. Definition 2.2. The system (2.1)is locally asymptotically controllable to the origin if for an arbitrary open neighborhood U of the origin there exists an open neighborhood W C U of the origin such that for all 20 E W there is a piecewise continuous control input U : [O,oo) -+ 8 ' with the property that the solution of (2.1) starting at ICO at t = 0 stays in U at all positive times, and converges to the origin as t -+
03.
Definition 2.3. The system (2.1) is locally continuously stabilizable if there exists a continuous feedback control law U = U(.) defined on a neighborhood U of the origin such that the following properties hold: (i) For each ICO E U ,the closed loop system admits a unique solution in forward time. (ii) The closed loop system is locally asymptotically stable at the origin. Such a feedback control law will be called a locally continuously stabilizing feedback control law. Such a control lawis an almost Cp locally
Stabilization Problem for Low Dimensional Systems
171
stabilizing control law, and correspondingly the system (2.1) is locally almost C ' stabilazable if in addition to (i) and (ii) above, U ( . ) is C' on U \ (0). The system (2.1) is locally C ' stabilizable if a C' feedback control law, which satisfies (ii) above, can be found. If a continuous feedback control law can be found on U\{O} which satisfies (i) and (ii) above, then the system is said to be locally almost continuous stabilizable, and such a feedback control law will be called a locally almost continuously stabilizing feedback control law. An important concept that plays a vital role in some of the work on the stabilization problem is that of a control Lyapunov function.
Definition 2.4. Let V : U + 8 be a positive definite smooth function, where U is an open neighborhood of the origin in 8".V is called a control Lyapunov function if at each x E U there exists U, such that v V ( z ) F ( zU,) , < 0 (here no requirements on the continuity of z + U, are stipulated). Control Lyapunov functions have playeda key role in some recent work on the asymptotic stabilization problem (see [CP],[PNC], [TK], [Tsl], [Ts2] etc). Key result in this direction is a Theorem due to Artstein (see [Ar]), which states that if a control system admits a control Lyapunov function, then the system is locally almost continuous stabilizable. Sontag has given (see [Sol], [So21 etc.) explicit constructions for finding such feedback laws,and has analyzed the problem of finding almost C", stabilizing, feedback control laws. An excellent survey was given by Sontag in [SoZ]. Due to the limitations in space we will not discuss this wpect any further. For the sake of self containment of this paper, we will now define the notions of degree and index. The interested reader is referred to [GP], [De], [Sp] etc. for a the validity of the definition, and general details. Let q5 : M + N be a C' map, where M and N are oriented manifolds of the same dimension and M is compact. Then the oriented degree of 9, (denoted by deg(q5, M , N ) ), is defined as follows. According to Sard's theorem (see [GP]) almost all points in the codomain are regular values. Pickone such value yo. It can be seeneasily that A := (q5)-'{yc1} is
.,
.. . .
L
....
......,
,..,.
...
"
....._
."
,.., _...._.... , ,
.
,
...
.
. . ... ,.
. .,
. ,, ..
,,
Dayawansa
172
.
either empty or else it is a finite set. Let A = {z1,q,. . ,zk}(if A = 0 we take k = 0). Now, at each zi,the derivative of 4 is nonsingular. If the determinant is positive at xi then 4 is orientation presemring at xi, and else it is orientation reversing. If 4 is orientation preserving at zi, assign the algebraic multiplicity 1 to zi,and otherwise assign -1, Now, the degree of 4 is defined to be the sum of the algebraic multiplicities at 4 at xi,i = l , ., , ,IC. It can be shown that this integer does not depend on the choice of the regular value. One of the important properties of the degree is that it is homotopy invariant. Now,if 4 is merely continuous, then one can approximate it by a smooth map, and compute the degree. As a consequence of the homotopy invariance property it follows that this integer does not depend on the choice of the approximation, as long as the approximation is closeenough with respect to the uniform metric topology. The integer obtained this way is referred to as the oriented degree of 4.
A closely related notion is that of the topological degree. Let, 4, M and N be as before except that M may not be compact. Let U be a relatively compact open subset of M and let y E N be such that y 6 BU. Then, the degree of f with respect to (6,y) (denoted by d ( 4 , 8 , y) ) is defined sgn(det c$*zi) where { X I , . . . ,zk} = 4-'(2 , where @ is a regular as value of 4 and @ is arbitrarily close to y. As described in the previous paragraph, this notion is homotopy invariant as well, and thus can be extended to continuous maps also.
c:=,
Notion of index is a local version of the degree, defined in the following way. Let 4, M , N be as in the previous paragraph. Let y E N ,and let K C $-l{y} be a closed subset such that there exists an open neighborhood U of K in M with the property that an$-'{y} = K.Then, index of4 with respect to (K, y) (written aa ind(4, K,y)) is defined aa d(4, 8,~)). If K is a singleton {z}, then we write ind(4, z, y) instead of ind(4, {z},v). Let us now consider a system of differential equations,
Wewill assume that $ is continuous, and that 0 E
Rn is an isolated
Stabilization Problem for Low Dimensional Systems
173
equilibrium point. Now, for small e > 0 we may define a map, (2.4) (2.5)
e, : S"-' + S"-' e,@:>= ~ ~ ~ ~ ~ / l l ~ ~ ~ ~ ~ l l *
Definition 2.5. Index of (2.3) at the origin is deg(e,, Sn-l, small enough E > 0.
for
,Sn-l)
As a consequence of the homotopy invariance property it follows that this number does not depend on the choice of e as long as thereare no other equilibrium points of (2.3) inside a closed ball of radius E. It is possible to show that the index of (2.3) at the origin is equal to ind($,O,O). We will now define several notions associated to homogeneous and weighted homogeneous systems, which willbe used in the proceeding sections.
Definition 2.6. System (2.2) is called a homogeneous system (respectively, a positively homogeneous system) of degree p if g ( x ) is a nonzero constant n x m matrix, and f (3)is a vector field which is homogeneous of degree p , i.e., f ( h ) = (X)Pf(z)for all X E R (respectively, positively homogeneous of degree p , i.e., !(A.) = ( X ) P f (x)for all A > 0). Stabilization problem for homogeneous systems arise naturally as that of stabilizing the leading set of terms in a system with null linear part. It is well known that (see [Ha] and the discussion in the proceeding sections) that addition of higher order terms will not affect the stability of a homogeneous system. A very important recent observation made by Kawski (see[Ka2], [Ka3]) and Hermes (see [Hell) is that for certain highly nonlinear systems, one can select coordinates of the state space in such a way that the leading terms form a weighted homogeneous system. This is done by considering a certain nilpotent approximation of the system (see [Ka2], [Hell, [He2], [He31 etc. for details). We will define this notion below ( for the most part we will follow [Ka3] here). Let P = { T I , . . , ,rn}, where ri are positive integers. Let (XI,.. . ,S,,) be a fixed set of coordinates on R". A one parameter family of dilations parametrized by E > 0 is AT : R+ X R" + !Rn, defined by, Af (x) =
Dayawansa
174
(erlxl,,, , ,epnxn).A function 4 : !Un if 4 o Ar = E"#. A vector field
+ Q is A'-homogeneous
of order p
is A' - homogeneous of order p if Xj (x)is A' - homogeneous of order p+rj. Corresponding to theusual Euler field C xj the A'-homogeneous Euler field is defined as C rjxj Solution curves of the A' - homogeneous Euler field will be called A'-homogeneous rays. Anosov and Arnold [AA] use the term, quasihomogeneous, instead of weighted homogeneous, or A'-homogeneous. Also, according to the terminology of Hermes in [Hell etc., the above definition refers to a A'homogeneous of order -p object (instead of order p ) .
&.
&,
Definition 2.7. Let r = {rl, . . . , m } , where rpi are positive integers. The system (2.2) is a A'-homogeneous system of order p if f is a A'homogeneous vector field of order p, and g(x) is a constant n x m matrix. In this paper we will only consider single input systems and in the case of A"-homogeneous systems, we will asume that g(%) = [0, .,0,1IT. Let r = (r1,. . , ,rn},and let us denote R := diag{rl,. ,T,,}. Let G denote the one parameter group generated by the A'-homogeneous Euler field, i.e. G = {exp(sR) I s E R}. By definition, G acts on Rn and the G orbits are just A'-homogeneous rays. We say that two points in R" are G equivalent, if they lie on the same G orbit. Let us consider a A'-homogeneous vector field in Rnl
.. ..
(2.6)
k = X(X).
One can "almost" project (2.7) onto the orbit space R"+\ {O}/G.This follows from the observation that (2.7)
X(t1Xo) = exP(qx(exP(-sP)tl exP(-smso),
for all s E R where, x(t, 20) denotes the solution of (2,6) at time t starting at x0 at zero time. But, this just means that, if two solutions start at G-equivalent points, then it is possible to rescale the time in one of the solutions in such a way that the two solutions will remain G-equivalent at
Stabilization Problem for Low Dimensional Systems
175
all times. In other words, the phase portrait of a "projected system" is well defined on the orbit space. In particular, we can obtain a representation of the "projected system" on any sphere as long as the A'-homogeneous Euler field is transversal to it. Here we will consider the projected system on the unit Euclidean sphere P " , and call it projected dynamics of X , and denote it by .(X).
3. Necessary conditions for the asymptotic stabilization
of nonlinear systems In this section, we will consider a nonlinear control system having the structure, (3.1)
X
= F(s, U ) ,
where, 5 E W , U E W , F is a family of Ckvector fields. We will assume that the origin of .!Rn is an equilibrium point of the unforced system, i.e. F(0,O) = 0. Sometimes, we will assume that (3.1) has the affine structure, (3.2)
li: = f ( x )+ g(x)u,
where, 2 E W , f and g are smooth, and U E W , and we will assume that f(0) = 0, and that f and g are C kvector fields. We will assume that 8F 0 0 # 0 in (3.1) (respectively, g ( 0 ) # 0 in (3.2)). It is clear that local smooth stabilizability of (3.1)implies that the OF00 linear approximation O F 0 0 +) is a stabilisable pair, and that (3.1) islocally asymptotically controllable to the origin. Brockett showed in [Br] that there is a more interesting topological obstruction to asymptotic stabilizability.
(,e,
Theorem 3.1 [Br]. Suppose that (3.1) is locally C1 stabilizable. Then the following are satisfied. (Bl:) The uncontrollable eigenvalues of the linearized system are in the closed left half of the complex plane. (B2:) (1.1) is locally asymptotically controllable to the origin.
176
(B3:) The function (x,U ) c+ F ( x ,U ) : 8" x (0,O).
Dayawansa
R* + Rn is locally onto at
Zabczyk [Za] observedthat an intermediate step in Brockett's proof is already contained as a theorem in [KZ].This theorem is commonlyreferred to as Zabreiko's theorem in the control theory literature.
Theorem 3.2 [KZ]. Considerthesystem of diflerentialequations, 6 = X ( x ) ; x E !Rn, where X is continuous.Supposethatthesystemis locally asymptotically stable at the origin, and that it has unique solutions in forward time for all initial conditions in anopenneighborhood of the origin. Then, the index of the system at the origin is equal to (-1)".
A simple way to prove this theorem (see [KZ], [Za] etc. for details) is to consider the homotopy,
where 4: denotes the flow of X. As r + 00, we get the antipodalmap onwhich has the degree (-l)n.Hence, it follows that theindex of X at the origin is equal to (-l)* as well. Zabreiko's theorem can be used to prove (B3) of Brockett's theorem in the followingway. Let U = U(.) be any smooth stabilizing feedback function. Then, the vector field F ( x , U(.)) has index (-l)n at the origin. In particular, this is nonzero. Hence, it follows from Sard's theorem that x c) F ( x ,~ ( x )is) locally onto at the origin (otherwise, since regular values are dense, it follows that thereexists a regular value with empty preimage). Thus (B3) has been established. Brockett gave the following example (see [Br]) which satisfies (Bl) and ,u~) (B2) but not (B3). Let n = 3 and m = 2 and F ( x 1 , x ~ , 2 9 , ~ 1= (ul,u2,x1u1 - ~ 2 ~ 2 Clearly, ) . (O,O,a ) is in the preimage only if a = 0. Byrnes and Isidori essentially used (B3) in [B111 to prove' that the attitude stabilization problem for a rigid body by using gas jet controls is solvable if and only if the system is feedback linearizable.
Stabilization Problem
for Law Dimensional Systems
177
More recently Coron (see [Cor]) has strengthened (B3). Coron's theorems still depend on Zabreiko's theorem. Let M be a topological space, and let F ( M ) denotes either the singular homology functor with real or integral coefficients, or the homotopy functor, and for a continuous map B : M + N , let T(B): 3 ( M ) + T ( N ) denotes the induced map. Theorem 3.3 [Cor]. For small E
> 0 define,
C€:= {(~,u)lll(z,u)ll c E and F ( z , u )# 0). Supposethat (3.1) is continuous feedback stabilizable.Thentheinduced map 3 ( F ) : 3(C,) + 3 (R" \ (0)) is onto. The idea of the proof given by Coron (see [Cor]) is the following. Let U = a ( . ) be a continuous, locally stabilizing feedback control law. Let, Me := {z E W" I ]l(z,a(z)llC E } . Then, the map z I-) F ( z , a ( z ): M, + W" \ 0 has degree equal to (-l)", and hence it induces an isomorphism F(F(., a(.))) : F(Me)-+ T ( R n\ (0)). But, this map can be factorized as, F(F(.,a(.)))= F(F)oT((id,a)). Now it follows that T ( F )is onto for all small enough E. Coron also derived (see [Cor]) a corollary of this theorem, whichis applicable for (3.2).Let, us assume without any loss of generality that the system has been transformed into the form, & = [fb"']
+
[;l
U
by using a change of coordinates, and a feedback transformation. For small E > 0 let be defined &S {z E S"-llf(z) # 0). Let H, denotes the singular homology functor. Coron established the following: Corollary 3.1. 3(f) : H,-2(%)
+ H,-2(9?"-' \ (0))
is onto.
Proof of this corollary is relatively straightforward by using the MayerVietoris sequence to the pair, (E,X (-€,e), B, x ((-E,€) \ (0))). Details c m be found in [Cor]. Now we describe another necessary condition, which we believe is equivalent to the condition in the corollary. It is trivial to see the equivalence
Dayawansa
178
for homogeneous and weighted homogeneous systems, but it is less clear in the general case. This is an expansion of a necessary condition given for homogeneous systems in [DM11(Theorem 3.7). Theorem 3.4. There exists a continuos function a defined on a neighborhood of the origin in %" such that the index of [f'(x),a(x)]' is equal to (-1)" ifandonly if for small e > 0 there exist closed subsets C1 and C2 of B,(O)\ (0) such that the following hold: (i) Ci C (f)-'{~}. (ii) K1 U K2 = ( f / & , ~ ~ ( ~ ~ ) - 'whew { 0 } , Ki = Ci naB,(O). (iii) ind(f18Bc(0),K1,0)= 1.
A sketch of the proof goes along the following lines(details of the proof will be given in [DMS2]). Suppose that U = a(.) is a continuous function such that the index of [f'(s),a(z)]' is equal to (-1)".Let E > 0 be small enough such that [ f ' ( z ) , a ( ~ #) ]0~for all x E B,(O) \ (0). Now, a # 0 on (fleBc(o)\{o})-'{O). Let Cl := .{ E aB€(O)\ {O)la(4 > 0) and C2 := {x E aB,(O) \ {O)lcr(x) < 0). Let Ki = Ci nBB,(O). The problem of computing ind((f, a),0,O) can be considered as that of counting the number of points in the preimage of (f,a) of a point near (0,l)E Sn-l with appropriate multiplicities. But, this is the same as the problem of computing ind(flsB,(o), K',0). Conversely, suppose that (i)-(iii) of the theorem are satisfied. Then, itis possibleto define a continuous function on B,(O)such that a l ~=, (-l)",a l =~ (-l)n+l, ~ and crl~~l-)-l~o} # 0. Be ( 0 ) Thus, it now follows that ind(((f, a),O,O)= (-1)". 3.1. Necessary conditions for the stabilizability of single input homogeneous systems
Here we consider (3) under the hypothesis that, f is homogeneous of degree p ( p > 0). In this case, the necessary condition for continuous local stabilizability in Theorem 3.4 reduces to the following:
Theorem 3.5. Let { K ) i c ~ denotetheconnectedcomponents of (fls~~(~))-l(O). Then, there exists a continuous feedback finction cy defi-
Stabilization Problem
for Low Dimensional Systems
179
ned on a neighborhood of the origin such that the index of [ f T ( z ) , a ( ~ ) ] T is equal to (-l)n if and only if there exists a partition I = Il U .lz such that Kj := U ~ ~ I ~j E = 1, , 2 are closed subsets of6B1(0) and ind(f)sB,(o),Kj,o) = 1.
Coron gave the following example in [Cor] to illustrate that small time local controllability, and (B3) (see Theorem 3.1) are not sufficient for continuos local stabilizability. We will apply Theorem 3.5 to this example.
z3
Example 3.1. Let z denote the complex number, be a scalar variable. Consider,
(31
+ isz),and let
i = i(z - 5 3 ) 3 , x3
=U.
It is easy to verify by computing Lie brackets that this system is small time locally controllable. It clearly satisfies (B3). However, there are two on ' S and each has points in the preimage of 0 by ( z , x g ) I+ i(z multiplicity equal to three. Therefore, this system violates the necessary condition given in Theorem 3.5, and therefore it is not locally continuously stabilizable. 3.1.1. Necessary Conditions for the Stabilizability of Single Input Homogeneous Systems by Using Homogeneous Feedback . In this section, we will discuss some necessaryconditions for the (global) continuous feedback stabilizability of (3.3) by using a continuos, positively homogeneous, feedback function of the same degree of homogeneity p as $. It has been known for a fairlylong time (see [Ha]) that a C1homogeneous asymptotically stable system 5 = X ( x ) admits a Ck,positively homogeneous Lyapunov function. Recently, it has been shown by Rosier (see [Etos])that this assertion remains valid even ifwe replace C1 by continuous in the statement. We will sketch a slightly modified versionof Rosier's argument below. First we observe that it follows from Kurzweil's theorem ([Ku]) that the system admits a local, C" Lyapunov function W ( z )defined on a small neighborhood of the origin. Let e > 0 be small enough such that the level set of W-l{d) is a (homotopy) sphere for all 6 < 3 ~ Let, . a : [O,oo) + [O,oo)
Dayawansa
180
be a nondecreasing smooth function such that al[o,a~ = 0 and al[2a,oo) = 1. Now, define a smooth, homogeneous, degree p function, V : R" + R by, 00
a{W(z/(X1/P)))dX,
V ( S )=
(34
0
where, ~ { W ( S / ( A ' / P )is) }taken to be equal to 1 when z/X1/P is not in the domain of W . It is easy to verify that V is CP, homogeneous of degree p , and VV(s)X(z) is negative definite (interested readers are referred to [Ros] for further details). bsier's theorem has very important consequences with regard to the stabilization problem for positivelyhomogeneous systems. Wewill first define certain important subsets related to a system and then describe a necessary condition for stabilizability. It is convenient for us to write our single input positively homogeneous control system in the form,
where, z E Rm,y E R, U E R, and f is positively homogeneous of some degree p , i.e. f(Xz, Ay) = ( X ) p f ( z , y) for all X 2 0. Let us refer to the points (0,l) and (0, -1) of Sm as the north pole and the south pole respectively. For arbitrary 6 E R let us define A
(3.5)
Let A~ = A 6
A6
n S".
= {(S,g) E
Rm+' I f ( ~ , y = ) 6s).
Let,
A+ = ua,oAs,
Theorem 3.6 [DMl]. Suppose that there exists a continuous curve p : [0, l] + S" such that,
.... .. .., .
Stabilization Problem for Low Dimensional Systems
181
(i) p ( 0 ) = north pole and p(1) = south pole, (ii) p C A+o.
Then the system does not admit a continuos positivelg homogeneous asymptotically stabilizing feedback function. Observe that this obstruction does not depend on whether or not one can find a feedback function such that the index of the closed loop vector field is equal to (-l)n.At this stagewe don't know whether the restriction on the class of feedback functions can be removed. One way to prove it is to argue as follows(see [DMl]). Suppose on the contrary to the assertion in the theorem that the system admits an asymptotically stabilizing homogeneous feedback function. It follows from from Rosier's theorem ([Ros], see above) that the closed loop system admits a positively homogeneous C" Lyapunov function. It can be assumed without any loss of generality that the feedback function is positive at the north pole and negative at the southpole (if this is not the case, then small bump functions defined near the poles could be used to modify the feedback functions so that they are as stated above). Now, by continuity, there is a point 0 on p at which the feedback function is equal to zero. But since p c A+, it follows that that theclosed loop vector field points outward of the level set of the Lyapunov function at 0, which is a contradiction. 3.1.2. Necessary Conditions for the Asymptotic Stabilizabiiity of A"Homogeneous Systems by Using A"-Homogeneous Feedback. Rosier's theorem is valid inthe context of A"-homogeneous systems as well (see [Ros]). In the construction of the Lyapunov function, simply replace the integrand in (3.3) by {l/(A)k+l)a{W(A'lzl,.,.,X"nzc,))}. It now follows that Theorem 3.10 is valid for A'-homogeneous systems as well, provided that one is seeking for a A'-homogeneous, continuous feedback function. The only 6 in (3.5) by, modification needed is to replace the definition of 2
The following example due to Kawski (see [K&], [Ka3]) established the fact that, small time locally controllable and (B3) or more generally
Dayawansa
182
Coron's necessary condition (Theorem 3.3)does not imply asymptotic st& bilizability by using A'-homogeneous feedback. Example 3.2. j 1
= 321 - 2 / ' 3
52 = x2 - 2/
(3,7)
?j = U,
where,
,xz,y) E R, and U E R. This is a A{Q1311)-homogeneous system,
(a1
Kawski observed that the equator belongs to A+, and hence and st& bilizing feedback function U = .(x) cannot change its sign on the equator. NOW,if alequator > 0, then the set S- := { ( z ~ , Q v), I 21 5 0, 5 0, 2/ 2 0) is an invariant set for the projected dynamics (see the last paragraph of section 2 for the definition) of the closed loop system. On the other hand, if .(equator < 0, then the set S+ := { ( ~ 1 , ~ 2 ,1x1 9 ) 2 0 , ~2. 0 , 5 ~ 0) is an invariant set for the projected dynamics of the closed loop system. In either case, by the Poincare Bendixon theory it follows that there is an equilibrium point of the projected dynamics in S- U S+.Each such point corresponds to a one dimensional invariant A'-homogeneous ray for the closed loop system. However, it is easily seen that (S- U S+)f l A- = 0. Thus, the restriction of the closed loop system to these invariant rays is not asymptotically stable. We note here that Theorem 3.10 also establishes that this example is not asymptotic stabilizable by using A'-homogeneous feedback. Clearly, [(Szn {x1 = xzy6)) n (S-U S+)]U {equator) c A+. Thus, the hypotheses of Theorem 3.10 are satisfied, and therefore we conclude that the system is not stabilizable by any A{9~3~1)-homogeneous, continuous, feedback.
4. Sufficient conditions for asymptotic stabilizability of low dimensional systems Current interest on the low dimensional stabilization problem seems to have started from the work of Ayels (see [Ay]), in which he showed that it
Stabilization Problem for Low Dimensional Systems
183
is possible to extract out from a given nonlinear control system, a highly nonlinear part which essentially contains all difficulties associated with the stabilization problem. This, alongside with Brockett's observation that even small time local controllability does not imply stabilizability sparked a vigorous study on the stabilization problem for smooth two dimensional systems of the form,
where, IC E iR2,
U
E R, and F(0,O) = 0.
One of the first such studies was carried out by Abed and Fu (see [AFl]).An important aspect of their work is that their stabilization procedure can be carried out without first performing a reduction step (such as Lyapunov, Schmidt reduction) on the original system. The key tool used in their work is the Hopf Bifurcation Theorem. Let us consider a smooth one parameter family of ordinary differential equations,
where, z E W , p E (-6,6) is a parameter, and X,(O) = 0 for all p. Let A, := We will assume that (4.2) satisfy the following:
W.
(i) (n-2)-eigenvalues of A0 have negative real parts, and the remaining eigenvalues are f i w o , where WO is a nonzero real number.
(ii) Two of the eigenvalues of A, have the form a, fSaw, where a0 = 0 and &(cy,(O)) > 0. Then, Hopf Bifurcation Theorem (see [GS], [MM] etc.) asserts that there exists a smooth map E c) p ( € )= a z k € 2 k + O ( E 2 k ) : ( - E O , €0) -+ (-6,6) for somestrictly positive integer k and a 2 k # 0, such that (4,2)has a family (t) period near 27rw". Exactly one of the of periodic orbits z , ~ ( ~ ) having characteristic exponents is near zero, and it isgivenby a smooth even function,
Dayawansa
184
where, @zq # 0 and depends only on Fo.Hence, the periodic orbit is stable if Pa, < 0 and unstable if Paq > 0. Abed and Fu gave the following algorithm (due to Howard [Ho]) for computing Pz. Let us write
where, Lx is linear, and Q(z, z) and C(z, z, z) are homogeneous symmetric quadratic and cubic functions respectively. Let T and l be the right and left eigenvectors of L associated with the eigenvalue iwo. Normalize T and l such that the first component of T is equal to one and lr = 1 . Algorithm: Solve,
-LOO,= ( 1 / 2 ) Q (F~) , (2iwo-l- L1)b = (1/2)Q(r,r)for a and b. Then, PZ = 2Re (2lQ(r,a)
+ @(F,b)
+ (3/4)lC(T, e ) } . T,
The main contribution in [AFl] is an analysis of the problem of modifying by using smooth feedback control. To apply their procedure it is not necessary to assume that the state space of (4.1) is two dimensional. They key hypothesis is that all except two of the eigenvalues of have negative real parts, and the two remaining eigenvalues are nonzero, and lie on the imaginary axis. They considered a parametrized family of feedback functions of the form U = a(.) = cTz zTQux C,(z,z, z) and obtained an expression for p; of the closed loop system as a sum of 0: of the open loop system, and another term which depends on c, Qu and C,, and the order three approximation of ( 4 . 1 ) . They showed that, in the case when the imaginary pair of eigenvalues inthe linear approximation of (4.1) are controllable, then one can modify ,& by taking U = C,(z, z, z), i.e. homogeneous cubic feedback. Moreprecisely, p; = (3/4)CU(7-,7-,F)l7, where T and l are the right and the left eigenvectors of corresponding to thepair of imaginary eigenvalues, and 7 = They also gave
9
+
+
+
9 9.
Stabilization Problem for Low Dimensional Systems
185
a sufficient condition for the stabilizability when the pair of imaginary eigenvalues in the linear approximation are not controllable. The work by Abed and Fu which began in [AFl] in 1986 has been greatly expanded in [AF2], [AF3], [AF4] etc. to include the cases of Stationary Bifurcation, degenerate Hopf Bifurcation, Robustness Analysis etc. Perhaps, the most important step in the low dimensional stabilization problem was taken by Kawski in [Kal]. He proved that a real analytic, two dimensional affine system is locally continuous feedback stabilizable provided that it is small time locally controllable. This showed that Brockett's obstruction (B3) is perhaps not so severe as it appeared for a while, and once again there is some hope for relating stabilizability to some version of controllability. Let us consider,
where, z E !R2, f , g are Cw-vector fields, f(0) = 0,g(0) # 0. Kawski proved the following:
Theorem 4.1 [Kal]. Suppose that (4.4) is small time locally controllable. Thenthew existsa Holder continuous locally stabilizing feedback function. The essential idea of his proof is the following. Without any loss of generality we may assume that (4.4)has the following form.
where, fl E C" and fl(0,O) = 0. Let us write W
fl(zl,z2)=214(51,22)+
C amzF, m=k
where ak # 0,and 4 is C". Without any loss of generality we may assume that ak = 1. Small time local controllability implies that k is odd. Therefore, for large enough M > 0 and small enough y > 0, the real valued
Dayawansa
186
function 22
v(zl,z2)=
1 f1(z1,e) de +
MZ:+~
0
is positive definite on a small neighborhood of the origin in R2 (use Holder's inequality). Now define the feedback function U(.)
=
BV
- -*6V
"
BSI
az2
(E)
It follows that for the closed loop system, V = 5 0. Also, 1S 0 for some solution implies that q ( t ) is constant and hence zz(t) is con0 also. Hence x(.) lies on the criticalset of V. stant. Thus However, since V is positive definite, on a small enough neighborhood of the origin, this set is just the origin itself. Hence, by LaSalle invariance locally principle it follows that the feedback function U ( % ) = asymptotically stabilizes the system. Observe that one may have to pick 7 < 1 in order to ensure that V is positive definite (for example, fl(z1,z 2 ) q - z~~(known as the Kawslci ezample). Observe that this does not satisfy (Bl).Therefore it is not C1 feedback stabilizable). Thus, Kawski's theorem illustrates that one may need to resort to Holder continuos feedback in order to stabilize 'highly nonlinear systems.
v=
-E E
Necessary and sufficient conditions for the asymptotic stabilizability of (4.5) (hence (4.4) were givenby Dayawansa, Martin andKnowles in [DMK].
Theorem 4.2. Consider the system (4.5). The following conditionsare equivalent. (i) The system (hence (4.4) is locally almost Cooasymptotically stabilizable. (ii) Bmckett condition (BZ)is satisfied. (iii) For all E > 0 there e d s t ~ E B , ( On) R$ and qeB,(O) n : R such R = ((51,z2) 1 2 1 > 0) and R? = that fl(p) < 0 and f l ( q ) > 0. (Here : ((z1,zz)lzl 0)s
f i r t h e r , a Holder continuous stabilizing found.
feedbackcontrol
lawcan be
Stabilization Problem for Low Dimensional Systems
187
The C1 and C" feedback stabilizability are much more subtle even in the two dimensional case. Dayawansa, Martin and Knowles derived some sufficient conditions in [DMK]. We first define two indices. Since multiplication of f1 by a strictly positive function and coordinate changes do not affect stabilizability of (4.5), we may assume without any loss of generality that f1 is a Weierstrass polynomial, al(z2)zY-l + . . . am(z2) and ai(0) = 0, 1 5 i 5 m. It is well known that the zero set of a Weierstrass polynomial can be written locally as the finite union of graphs of convergent rational power series z2 = $(XI) where z1 E [0,e) or $1 E (-E, 01. Let us denote the positive rationals by Q+ and define,
zr +
+
D+ =
(y E Q+ I fl(zl,+(zl))
C0
for all 2 1 E ( 0 , ~ )for some
E
>0
and for some convergent rational power series +(SI)with
D- = ( 7
leading exponent equal to
'1
-Y
Q+ I fl(-q,+(zl)> o for all 2 1 E (o,E) for some e > o
E
and for some convergent rational power series with leading exponent equal to
+(%l)
'1
-
Y Definition 4.1. The index of stabilizability of f is max{inf,,D+{r},
inf,eo- { Y H . Definition 4.2. The fundamental stabilizability degree of f l is the order of the zero of am(z2) at $2 = 0. The secondary stabilizability degree of fl is the order of the zero of am-1(z2) at zz = 0. Notation:
I := Index of stabilizability of
f1
s1 := Fundamental stabilizability degree of s2 := Secondary stabilizability degree of f1.
fl
Dayawansa
188
S1
Theorem 4.3. The system (4.5) andhence (4,4)) is C1-stabilizableif > 21 - 1 If s1 5 1 2s2 and 91 is odd, then (4.5) is C" stabilizable. If s1 < 1 2.92, then (4.5) is not C" stabilizable.
+ +
Essential idea behind the proofs of Theorems 4.2 and 4.3 given in [DMK]is to consider f I - l ( O ) and observe that it can be written as a union of finitely many analytic arcs. Therefore, it is possible to construct an open neighborhood base of the origin of !R2 and an almost C" feedback function such that each of these open sets is positively invariant, from which asymptotic stability of the closed loop system follows easily. Recently, Coron and Praly have given a more elegant proof of Theorem 4.2, (see [CP]) by using control Lyapunov functions. Another interesting approach to thelocal asymptotic stabilization problem for twodimensional systems was taken by Jakubczyk and Respondek (see [JR]). They considered the case in which g(0) and adjg(0) are linearly independent for some k 5 3 in (4.4). They employed techniques from Singularity Theory (see [GS], [Mal], [TL] etc.) to show that it is possible to find coordinates in R2 such that, up to a multiplication by a strictly positive real analytic function, f 1 takes one of the following forms. za;
X . ( + a;); (zi+ a(z1)za+ ex!);
k =1 k =2 IC = 3,
where, E = fl,a(z1) is a C" function and a(0) = 0. They carried out a detailed analysis of these three cases. Even though stabilizability can be determined in each of these cases by using Theorem 4.3, the work reported in [JR] is more constructive. We also note here the interesting work of Boothby and Marino (see [BMl], [BM2], [BM3],etc.) and Crouch and Igheneiwa [CI]. Boothby and Marino focussed on finding obstructions to C1 stabiliaability. In [Cz]the main aim is to use Newton Polygons in order to find suitable coordinate changes which can simplify the stabilization problem, and hence to give some sufficient conditions for stabilizability.
Stabilization Problem for Low Dimensional Systems
189
4.1. Asymptotic stabilization problem
for three dimensional systems Unlike in the two dimensional case, the picture is much lessclear in the three dimensional stabilization problem. Here we attempt to summarize the key work on this problem. Let us first consider the local asymptotic stabilization problem for the system,
k = F(z,u),
(4.6)
where, s E R3, U E R, F is Coo,and F(0,O) = 0. Let us write down the Taylor polynomial in the form,
k = F ( z ,0)
(4.7)
+ [b+ H ( s ,4 3 2 1 ,
where, b E R3, and H(0,O) = 0. Here again we usume that this system ishighly nonlinear in the sense that the linear approximation of (4.6) is not stabilizable. The principal interest here is in the cases in which (4.7)contains a "dominant part" of a particularly simple form, such that the stabilization problem for this part yields a locally stabilizing feedback function for (4.6). The simplest case of such a dominant subsystem is the following. Let us use the expansion F ( s ,0) = C& $i(s), where +i is homogeneous of degree i. Let k be the smallest positive integer such that $k # 0. Now let f := $k and consider the stabilization problem for, j. = f(.)
(4.8)
+ bu.
Suppose that there exists a continuous, positively homogeneous, stabilizing feedback control function CY of degree k for (4.8).Nowby Itosier's theorem the closed loop system admits a smooth, positively homogeneous Lyapunov function V ( z ) .Now, the remaining terms in (4.7)add terms of order k 1 and higher to the closed loop system. Hence V ( s )remains a local Lyapunov function. Thus we observe that a sufficient condition for the continuous stabilizability of (4.6)is the continuous stabilizability of (4.8)by using a positively homogeneous feedback function.
+
. .
Dayawansa
190
Recently, Kawski (see [Ka2], [Ka3]) and Hermes (see [Hell) observed that for a class of three dimensional, affine, small time local controllable systems, it is possible to find a AT-homogeneous“dominant part” by constructing a Nilpotent approximation of the system. Once again by Rosier’s theorem, it follows that continuous stabilizability of this A‘-homogeneous approximation by using AT-homogeneousfeedback of the same order implies the continuous stabilizability of the original system. Inview of these discussions, we willonlyconsider the stabilization problem for homogeneous and A‘-homogeneous systems here. There are fundamental theorems on the asymptotic stability of two and three dimensional systems. Let us consider the system of ordinary differential equations,
li: = X(z),
(4.9)
z EW
,
n = 2 or 3,
where, X(z) is almost Cl,positively homogeneous vector field of degree p (our standing hypothesis is that p is an integer not less than 1). Let us denote the projected dynamics (see the last paragraph of section 2 for the definition) by,
e =n(x)(e), e E
(4.10)
Theorem 4.4. Let n = 2. Then (4.9) is asymptotically stable if and only if one of the following hold: (a) The system does not have any one dimensional invariant subspaces
and 211.
5
0
+
cos OX1(cos 8, sin 0 ) sin OX2(COS 8, sin e) de cos ex2(cose, sin e) - sin ex1(cos 0, sin e)
<0
where X = (Xl, z2). (b) Restriction of the system to each of its one dimensional invariant subspaces is asymptotically stable. Theorem 4.5. Let n = 3. Let C denotes the w-limit set of (4.10), and let CSZ denotes the cone generated by C . Then, (4.9) is asymptotically stable if and only if the restriction of (4.10)to CSZ is asymptotically stable.
StabilizationProblem for Low Dimensional Systems
191
Note that the w-limit set C of the projected dynamics in Theorem 4.5 consists of a finite union of periodic orbits, equilibrium points, and some homoclinic and heteroclinic orbits between them. Determination of stability on rayscorresponding to theequilibria is trivial andon the cones generated by equilibrium point can be carried out by using (b) in Theorem 4.4. If the restriction of the system to the cones generated by equilibria and periodic orbits is asymptotically stable, then the same holds true for the cones generated by the homoclinic and heteroclinic orbits. Theorem 4.5 is due to Coleman (see [Col]). Kawski observed in [Ka3] that it remains valid for all integers n. Coleman attributes Theorem 4.4 to Forster (see [For]), and it has been discovered independently byHaimo [Hail. These two theorems follow rather easily from the G-equivalence of the solutions of a homogeneous system, i.e. (2.7). Complete proofs can be found in [Ha], [Col]etc. Kawski and Hermes have shownthat these two theorems remain valid (with obvious modifications) in the case of A'-homogeneous systems (see [Ka2], [K&] and [Hell for details). In particular Kawski noted in [Ka2]and [Ka3]that in the case n = 3, if the w-limit set of (4.10) consists of a Jordan curve, if the set of equilibrium points is nonempty and that the system is asymptotically stable on each ray corresponding to theequilibria, then (4.9) is asymptotically stable. Asymptotic stabilization problem for homogeneous systems was considered for the first time by Samardzija [Sa]. There was a resurgence of interest on this class after Andreini, Bacciotti and Stefani proved the following theorem in [ABS]. Let us consider a system, (4.11)
where, IC E some p 2 1.
j. = !(IC, y),
y E 8,U E R, and
Q = U,
f ishomogeneous of degree p for
Theorem 4.6 [ABS], [DM4]. ,Suppose that k = f(z,O) is asymptoticallystable. Then the feedback function U = -pp globally asymptotically stabilizes (4.11).
192
Dayawansa
This theorem follows from the observation that the closed loop system has the structure of a “block upper triangular” system with stable diagonal blocks. Therefore, it follows that the closedloop system islocally asymptotically stable (see [Vi], [So31 etc.). Now, the G-equivariance property (2.7) establishes global asymptotic stability. A A’-homogeneous version of this theorem was given by Kawski in
Pa31 Theorem 4.6 can be strengthened in the following way. Let l? C P” be an embedded ( n- 2)-sphere which is transversal to themeridians (here we call the points (0,l)and (0, -1) in S”-1the north and thesouth pole respectively, and a meridian is a geodesic whichjoins the north pole to the south pole). Let ril denotes the cone generated by l?. Now, there exists a unique function U : I’il + X such that ril is an invariant set of (4.11). Denote the resulting system by Cr.
Theorem 4.7 [DMl]. Suppose that there exists l? C an embedded ( n - 2)-sphere which is tmnsversal to the meridians, such that & isasymptoticallystable.Then (4.11)isalmost C” stabilizable by using positively homogeneous feedback of degree p . Essential idea of the proof is to deform the sphere along meridians such that l? coincides with the equator, and then apply Theorem 4.6.The deformation maybemerely continuous at the origin. However, a close examination of the problem shows that this will not affect the continuity of the feedback function at the origin. Theorem 4.7 is particularly important in the case of three dimensional systems, where the stability of Cr can be determined by using Coleman’s theorem (Theorem 4.5).In particular, if there exists an embedded circle I’ which is transversal to the merideans, meets A - , but does not meet A+o, then it follows from ( b ) of Theorem 4.5,and from Theorem 4.4 that the hypothesis of Theorem 4.7are satisfied, and thus existence of such a circle is a sufficient condition for the stabilisability. This and the necessary condition given in Theorem 3.10 are strong enough to determine whether or not a given single input, three dimensional, homogeneous quadratic system is asymptotic stabilizable by CO (equivalently almost P ) homogeneous
Stabilization Problem for Low Dimensional Systems
193
feedback (see [DMSl] for a generic class, and [DMSZ] for arbitrary systems). However, for higherdegree cases, one has to allow for I'to be tangential to the meridians at certain points. We will describe such a theorem now. For the sake of simplicity, wewill only describe this theorem for three dimensional systems.
Definition 4.3. Let I' c S2 be an embedded circle. A point 8 E I' is on an upper arc, (respectively o n a lower arc) if a particular meridian meets l? for the first time (respectively, for the last time) at 8. Theorem 4.8 [DMSP], [DMSS]. Consider (4.11) in the case when n = 3. Suppose that there exists l? C S2, a n embedded circle, which meets Aat a point on a lower arc or an upper arc, and that I' does not meet A+o, and the poles. Then (4.11) is almost C" stabilizable by using positively homogeneous feedback. Proof of this theorem is fairly complicated. It involves a veryclose examination of the topology of a level set of a homogeneous Lyapunov function for (4.11).Details of the proof will be contained in [DMSP].
4.2. Stabilization of single input homogeneous polynomial systems
In this section we will sketch out how one can apply Theorem 4.8 to determine whether or not a generic, single input, three dimensional, homogeneous polynomial system of given degree p (p is an integer greater than 1) can be stabilized by using homogeneous feedback. Let us consider a homogeneous polynomial system of the form,
(4.12)
Q = U, where, (zI, z2,y) E
$l3;
f l , and j 2 are homogeneous polynomials of degree
Dayawansa
194
p for some integer greater than 1. We will fix p , and study a generic class of (4,12). Clearly, for a generic class, we may assume without any loss of generality that,
(4.13) P-2
Now, A&, the cone generated by A s is given by the degree (p projective algebraic curve,
(4.14)
ZZfl(2,9)
+ 1)
- 21f2(2, P) = 0.
We will only consider the case in which (4.14)describes a nonsingular curve. Note that nonsingularity is a generic property in the space of plane algebraic curves of a given degree.The particular structure of (4.14)( note that there is a ~ 1 ~ y P -term l , a 22yp term, and there are no vP+l terms) does not affect the genericity, since 22yP s12yP-1 ":2 + e 1 = 0 i s a nonsingular curve for all integers p greater than 1. Letus attempt to visualize the structure of (4.14)on the cylinder, S1 x R in R2 x R, where S1 = {(cos(B),ein(B) 1 B E [-7r,7r]}. Wewill identify the north and southpoles for the projected dynamics with S' x CO and S1 x -CO respectively. Nonsingularity of (4.14)implies that each real branch is smooth and no two real branches meet each other. Also, since the coefficient of yp term is 22, and thecoefficient of the yp-l term is %l2, it follows that there is exactly one branch goes to CO as B + O+, and one branch goes to - W as 8 -+ 0-. Since the antipodal map leaves the curve invariant, similar statements hold true as B + rTTi. These are the only asymptotes to the curve. Let oo-,oo+, os-,and a,+ denote the branches which have the asymptotic behavior described above. Then it follows that exactly one of the following hold:
+
(i) bo- = go+,and 6-, = a,+, (ii) oo- = D,+, and go+= c,-.
+
Stabilization Problem for Low Dimensional Systems
19s
C a s e 1: p is odd
Theorem 4.9. In either ofthe cases(i) or (ii), there exists a n embedded does circle in S' X ?R which meets A- at a point on an upper arc, and not meet A+o. Hencesingle input, homogeneous, odd degree polynomial systems in three states are generically stabilizable. Moreover, the feedback functions can be found to be almost C" and homogeneous of degree p . In order to prove the theorem, the only thing needed to observe here is the fact that points close to 03 on uT- and close to -03 on u0- are on A - . Since each branch consists of a smooth curve, and branches do not meet, it is always possible to find a curve as stipulated in 4.9. C a s e 2: p is even Observe that points close to 0;) on uTt and close to -0;) on uo- are on A+ and points close to 03 on uT- and close to -m on uo+ are on A - . Thus it follows that in case (ii) above, it is always possible to find a curve as in the statement of theorem 4.9, and in this case the system is almost C" stabilizable with homogeneous feedback. In the case (i), generically one of the following cases occur: (a)
(b)
uo+ = uo- C
A+o. In this case, Theorem 3.10 applies and hence we conclude that the system is not asymptotically stabilizable by using homogeneous continuous feedback. cot nA- # 0. Then, Theorem 4.8 applies and we conclude that the system is almost C" stabilizable by using homogeneous feedback.
In the case when p = 2, (a) and (b)above can be described in terms of algebraic equations (see [DMSl] for details). A comprehensive analysis of the quadraticcase, including the nongeneric cases, will be given in [DMS2]. Of course, the arguments given here, and the conclusions drawn remain valid for weighted homogeneous systems as well. Here the odd degree case corresponds to AT-homogeneoussystems of order p , when r1, T Z , r3 are odd integers and p is an even integer, or when TI, r2, rg axe even integers and p is an odd integer, and the even degree case corresponds to the case when either r1, i = 1,2,3 and p are all odd or all even.
Daytlwansa
196
The exposition in this section has followed along lines which best describes out personal tastes, and biases. An entirely different approach to the stabilization problem for three dimensional systems has been taken by Hermes. His objective has been to extend the quadratic regulator theory to the case of weighted homogeneous systems, by considering a weighted homogeneous cost function. He has been successful in giving an alternate proof of Kawski’s theorem (Theorem 4.1) by using this approach. He has also worked out some three dimensional examples. However, as far as we know, there hasn’t been any characterization of a reasonable set of intrinsic conditions under which this approach leads to a solution in the three dimensional case. Interested readers are referred to [He3], [He41 etc. for further details.
5. Concluding Remarks
We have summarized some of the work that has been done on the asymptotic stabilization problem for two and three dimensional systems in the recent past. Much of the discussion reflects our personal biwes, and scant attention was given to important aspects of the problem such as the work based on Control Lyapunov Functions. The interested readers are referred to [So21 for a detailed review of this aspect. In our opinion, we, the nonlinear community, has acquired a reasonable understanding of the complexity of the stabilization problem for homogeneous ‘and weighted homogeneous systems. Theorem 4.8 is very encouragingin the sense that it shows that there cannot be “too many other unknown obstructions” to stabilizability, However, finding them ought to be an important task ahead. There is at least one other theoretical question raised by Theorem 4.8. Does the genericity statement hold in higher dimensions? Theorem 3.10 gives a necessary condition for the stabilizability of homogeneous systems by using homogeneous feedback. Does this remain as an obstruction, if we remove the restriction on homogeneityof the feedback function?
Stabilization Problem for Low Dimensional Systems
197
We believe that much interesting work waits ahead in answering these questions. This is bound to a highly fertile area for trying out everyones favorite tools from algebraic geometry, algebraic topology, optimal control theory etc. We hope that these attempts will be successful.
6. Acknowledgements We wish to thank Professor Clyde Martin, Dr. Gareth Knowles, Dr. Sandy Samelson, Dr. D. Chen for the highly fruitful collaborations with us on this subject in the recent past. We also wish to thank Professors David Gilliam and Christopher Byrnes for helpful discussions related to algebraic curves, which eventually led us to Theorem 4.8.
REFERENCES [AA] D. V. Anosov, V. I. Arnold eds. Dynamical Systems I, Encyclopedia of Mathematical Sciences, Vol.1 ; Springer Verlag, Berlin, 1988. [ABS] A. Andrelni, A. Baciotti and G. Stefani, Global Stabilization of Homogeneous Vector Fields of Odd Degree, Systems and Control Letters, 10,(1988), 251-256. [AFl] E. H. Abed and J. H.Fu, Local feedback stabilization and bifurcation control, I, Hopf bifurcation, Systems and Control Letters, 7, (1986), 11-17. [AFB] E. H. Abed and J. H. Fu, Local feedback stabilization and bifurcationcontrol, 11, Stationary bifurcation, Systems and Control Letters, 8, (1987), 467-473. [AF3] J. H. Fu and E. H. Abed, Linear feedback stabilization of nonlinear systems, (preprint) (to appear in Automatica). [AF4] J. H. Fu and E. H. Abed, Families of Lyapunov functions for
Dayawansa
198
nonlinear systems in critical cases, (preprint) (to appearin IEEE Trans. Automat. Contr .).
2.Artstein, Stabilization with relaxed controls, Nonl. Anal, TMA 7,(1983), 1163-1173. D. Ayels and M. Szafranski, Comments on the stabilizability of the angular velocity of a rigid body, Systems and Control Letters 10,(1988), 35-39. D. Ayels, Stabilization of a class of nonlinear systemsby a smooth feedback, Systems and Control Letters, 5, (1985), 181-191.
A. Bacciotti, Local Stabilizability of Nonlinear Control Systems, Series on Advances in Mathematics for Applied Sciences - Vol. 8, World Scientific Press, Singapore, 1992.
W. M. Boothby and R. Marino, Feedback stabilization of planar nonlinear systems, Systems and Control Letters, 12,(1989), 8792. W. M. Boothby and R. Marino, Feedback stabilization of planar nonhear systems 11,28th IEEE Conf. on Decision and Control, (1989), 1970-1974. W. M. Boothby and R. Marino, The center manifold theorem in feedback stabilization of planar, single input, systems, Control Theory and Advanced Technology, 6, (1990), 517-532. C. I. Byrnes and A. Isidori, On the attitude stabilization of rigid spacecraft, Automatica, 27, (1991), 87-95. C. I. Byrnes and A. Isidori, Asymptotic stabilization of minimum phase systems, IEEE Transac. Automat. Contr, AC-36, (1991), 1122-1137. C. I. Byrnes, A. Isidori and J. C. Willems, Passivity, feedback equivalence and the global stabilization of minimum phase nonlinear systems, IEEE Transac. Automat. Contr., AC-36, (1991), 1228-1240.
R. Brockett, Asymptotic stability and feedback stabilization in: Differential Geometric Control Theory, Birkhauser, Boston, 1983.
Stabilization Problem for Low Dimensional Systems
199
J. Carr, Applications of Center Manifold Theory, Springer Verlag, NY, 1981. C. Coleman, Asymptotic stability in %space, in: Contributions to the Theory of Nonlinear Oscillations - Vol. V, Annals of Mathematics Studies, Vol. 45, eds. L. Cesari, J. P. LaSalle and S. Lefschetz, Princeton Univ. Press, 1960. J.-M. Coron, A necessaryconditionfor feedback stabilization, Syst. Contr. Lett., 14, (1990), 227-232. J.-M. Coron, L. Praly, Adding an integrator for stabilizationproblem, Syst. Contr. Lett., 17,(1991), 89-105. P. E. Crouch and I, S. Igheneiwa, Stabilization of nonlinear control systems: the role of Newton diagrams, International Journal of Control, 49, (1989), 205-211. D. Deimling, NonlinearFunctionalAnalysis, Springer Verlag, New York, 1988. W. P. Dayawansa, Asymptotic stabilization of low dimensional systems, in: Nonlinear Synthesis, Progress in Systems and Control Theory, Vol. 9, Birkhauser, Boston, 1991. W. P. Dayawansa and C. F. Martin, Some suficient conditions for the asymptotic stabilizability of three dimensional homogeneous syStemS, Proc. IEEE Conf. on Decision and Control, Tampa, Dec. 1989, 1366-1370. W. P. Dayawansa and C. F.Martin, Asymptotic Stabilization of Two Dimensional Real Analytic Systems, Systems and Control Letters, 12,(1989), 205-211. W. P. Dayawansa and C. F. Martin, A Remark on a Theorem of Andreini, Bacciotti and Stefani, Syst. and Contr. Lett., 13, (1989), 363-364. W. P, Dayawansa, C. Martin and G.Knowles, Asymptotic stabilization of a class of smooth two dimensional systems, SIAM J. on Control and Optimization, 28, (1990), 1321-1349. [DMSl] W. P, Dayawansa, C. F. Martin,and S. Samelson, Asymptotic stabilization of three dimensional homogeneous quadratic sys-
200
[DMSB]
[DMS3]
[Hal [Hail
Dayawansa
tems, submitted to Proc. of the 31et IEEE Conf. on Dec. and Contr. W. P. Daywansa, C. F. Martinand S. Samelson, Asymptotic stabilization of single input, three dimensional, homogeneouspolynomial systems, under preperation. W. P. Daywansa, C. F. Martin and S. Samelson, Asymptotic stabilization of generic singleinput, three dimensional, homogeneous polynomial systems of a given degree, submitted to Systemand Control Letters. H. Forster, Uber das Verhalten der Integmlkurven einer gewohnlichen Digerentialgleichung erster Ordnung in der Umgebung eines singuliiren Punktes, Mathematische Zeitschrift, 43, (1937), 271320. V. Guillemin and A. Pollack, Differential Topology,Prentice Hall, New Jersey, 1974. M. Golubitsky and D. G.Shaeffer, Singularities andGroups in Bifurcation Theory, Vol. 1, Springer Verlag, NY, 1985. W. Hahn, Stability of Motion, Springer Verlag, New York, 1967. V. T. Haimo, An algebraicapproachtononlinear stabilization, Nonlinear Theory Methods and Applications, 10,1986. H. Hermes, Homogeneous coordinates and continuous asymptotically stabilizing feedback controls, Proceedings of the Conference on Differential Equations Applications to Stability and Control, Colorado Springs, 1989. H. Hermes, Nilpotent approximation of control systems and distributions, SIAM J. Contr. Optimiz., 24, (1986), 731-736. H. Hermes, Asymptotic stabilizingfeedback controls and the nonlinear regulator problem, SIAM J. Contr. Optimia., 29, (1991), 185-196. H. Hermes, Asymptotic stabilizationof planar nonlinearsystems, Syst. Contr. Lett., 17,(1991), 437-445. D. J. Hill and P.J . Moylan, Dissipative dynamical systems: Basic input-output properties,J. Franklin Inst. 309, (1980), 327-357.
Stabilization Problem for Low Dimensional Systems
201
L. N. Howard, Nonlinear Oscillations, in: Nonlinear Oscillations in Biology, ed. F. C. Hopensteadt, Amer. Math. Soc., Providence, 1979, 1-68. B. Jakubczyk and W. Respondek, Feedback equivalence of planar systems and stabilizability, in: Robust Control of Linear Systems and nonlinear Control, eds. M. A. Kaashoek, J. H. van Schuppen and C. A. M. Ran, Birkhauser, New York, 1990, 447-456. M. Kawski, Stabilization of nonlinear systems in the plane, Syst. Contr. Lett., 12,(1989), 169-175. M. Kawski, Homogeneous feedback lawsin dimension three,Proc. IEEE Conf. on Decision and Control, Tampa, Dec 1989, 13701376. M. Kawski, Homogeneous Stabilizing Feedback Laws, Control Theory and Advanced Technology, 6, (1990), 497-516.
-
P, V. Kokotovic and H. J. Sussmann, A positive real condition for global stabilization of nonlinear systems, Systems ans Control Letters, 13,(1989), 125-134.
J. Kurzweil, On the inversion of Lyapunov’s second theorem o n stability of motion, Amer. Math. Soc. Transl., Ser. 2. 24, (1956), 19-77. M.A. Krosnosel’skii and P.P.Zabreiko, Geometric Methods of Nonlinear Analysis, Springer Verlag, W ,1984. P. J. Moylan and B. D. 0. Anderson, Nonlinear regulator theory IEEE Transac. Automat. Contr., andinverseoptimalcontrol, AC-18, (1973), 460-464. J. N. Mather, Stability of Cm-mappings, I, 11,Ann. Math. 87 (1962), 89-104; 89, (1969), 254-291. R. Marino, High-gain stabilization and partial feedback stabilization, 25th IEEE Conference on Decision and Control, (1986), 209-213. R. Marino, O n the largest feedback linearizable subsystem, Systems and Control Letters, 6, (1986), 345-352.
Dayawansa
202
J. E. Marsden and M. McCracken, The Hopf Bifurcation Theorem and its Applications, Springer Verlag, New York, 1976. P. J. Moylan, Implications of passivity in a class of nonlinear systems, IEEE Transac. Automat. Contr., AC-19, (1974), 373381 L.Rosier, Homogeneous Lyapunov functions for homogeneous continuos vector field, (preprint). L. Praly, B. d'Andrea/-Novelland J. M. Coron, Lyapunov design fo stabilizing controllers, 28th IEEE Conference on Decision and Control, 1989, 1047-1052. N. Samardzija, Stability properties of autonomous homogeneous polynomial differential systems, J. Differential Eq., 48, (1983), 60-70. E. D. Sontag, A universal construction of Artstein's theorem on nonlinear stabilization, Systems and Control Letters, 13,(1989), 117-123. E. D.Sontag, Feedback stabilization of nonlinear systems, in: Itobust Control of Linear and Nonlinear systems, eds. M. A. K& ashoek and J. H. van Schuppen, A. C. M. Ran, Birkhauser, Boston, 1990. E. D. Sontag, firther facts about inputtostatestabilization, IEEE Trans. Automat. Contr., AC-35, (1990), 473-477. E. H.Spanier, Algebraic Topology,McGraw Hill,New York, 1966. E. D.Sontag and H. J. Sussmann, Remarks on continuous feedback, Proc. IEEE Conf.Decision and Control, Albuquerque, (1980), 916-921. E. D. Sontag and H. J. Sussmann, firthercommentsonthe stabilizability of the angular velocity of a rigid body, Systems and Control Letters, 12,(1989), 213-217. H.J. Sussmann, Subanalytic sets and feedback control, J. Differential Equations, 31, (1979), 31-52. R.Thom, H. Levine, Singularities of differentiablemappings, Lecture Notes in Math 192, Springer Verlag, NY, 1971. I
Stabilization Problem for Low Dimensional Systems
203
[TK] J. Tsinias and N. Kalouptsidis, Output Feedback Stabilization, IEEE Transactions on Automatic Control, AC-35, (1990), 951954. [Tsl] J. Tsinias, Existence of Control Lyapunov Fhnctions and Applications to State Feedback Stabilizability of Nonlinear Systems, SIAM J. Contr. Optimiz., 29, (1991), 457-473. [Ts2] J. Tsinias, Optimal Controllers and output feedback stabilization, Systems and Control Letters, 15, (1990), 277-284. [Ts3] J. Tsinias, Remarkson feedbackstabilizability of homogeneous systems, Control - Theory and Advanced Technology 29, (1991), 457-473. [Vi]M. Vidyasagar, Decomposition techniques for large scale systems with nonadditive interactions: stability and stabilizability, IEEE Tansac. Automat. Contr., AC-25, (1980), 773-779. [Wi] J. C.Willems, Dissipative dynamical systems, part I: general theory, Arch. Rational Mech. Anal., 45, 321-351. [Za] J. Zabczyk, Somecommentsonstabilizability, Applied Mathematics and Optimization, l 9 (1989), 1-9.
Asymptotic Stabilization via Homogeneous Approximation Henry H e m e s Department of Mathematics, University of Colorado, Box 395,Boulder, CO 80309-0395
Abstract. A structural stable property of a system of differential equations, or a control system, can be studied by studying this property for the linearization of the system, i.e. if the linearization possesses the property so does the original system. A linearization, however, often loses relevant information. The main thesis, here, is to study the structural stable property “asymptotic stability’’ in the case where the linearization is inconclusive but a “high order, homogeneous approximation” retains high order, relevant, information. The method consists of first constructing a natural, high order, homogeneous approximating system and then attempting to construct an asymptotically stabilizing feedback control (ASFC) for this approximation, which will then also be an ASFC for the original system. While general homogeneous systems are not as tractable as their special case, linear systems, their analysis is often somewhat easier than that of the original system. This research waa funded by NSF Grant DMS 95-30973.
205
206
Hemes
0. Introduction This paper will present the main ideas of the program, initiated by M. Kawski and the author, to use homogeneous approximations for the problem of constructing asymptotically stabilizing feedback controls (hereafter ASFC) for affinecontrol systems which do notsatisfy the first order local controllability condition. For ease of presentation we consider a single input, n-dimensional, affine control system
x = X o ( z )+ u X 1 ( 2 )X, o ( 0 )
(1)
= 0, X l ( 0 ) # 0,
with X,, X1 real analytic vector fields on R". The methods may be considered as nonlinear extensions of linear approximations and feedback design via use of the linear regulator problem. If linear methods were applicable, one would proceed as follows. First, expand X O , X 1as
XO(X) = AZ + X ~ " ' ( Z+) .
(2)
X ~ ( Z=) b
;
+ X { ' ' ( Z ) + ...
where X ( J ' ) ( z )denotes a vector field having components which are homogeneous polynomials of degree j in the local coordinates. The linear approximation of system (1) is then
5 = A z + bu.
(3)
One would like to design an ASFC U * ( % ) for (3) and have this be a local ASFC for (1). It is important to note that if such a U* has the form u*(z)= Q! X , Q! = ( ~ 1 , .,aa), ~ . i.e., is linear, eq. (3), with control U * @ ) , has the form j: = Bz where B = ( A b 8 a) is a stability matrix. Then (l), with control u * ( z ) , can be written as x = X o ( z ) u * ( z ) X ~ ( z=) Bz (terms of order 2 2) and hence U* is a local ASFC for (1) by standard stability theory via linear approximations. This need not be the case had u " ( z ) been nonlinear1 For the linear theory to apply one must make the controllability assumption
+
+
(4)
6, Ab,
. . ,A"-'b
+
are linearly independent.
Assume this for the moment and (following R. E. Kalman, [l])seek an
Asymptotic Stabilization via Homogeneous Approximation
207
ASFC for (3) by choosing it as the feedback controller which minimizes the quadratic cost functional W
( 5 ) C ( U )=
1 [eu2(c)+ x‘(c)&x(a)]dc, e > 0, x‘ denotes transpose,
0
with Q a positive definite symmetric matrix. Clearly to minimize this cost solutions must tend to zero as t + 00. Applying the Pontryagin maximum principle to the optimization problem (3), ( 5 ) , let p = col(pl,, , ., p n ) and define H ( x , p , u ) = p’Ax p’bu eu2 + x’Qx. Minimize H with respect to U to obtain G(x,p) = -(1/2e)b‘p. Next let H * ( x , p ) = H ( z , p , u * ( x , p ) ) and the “value function” v(x), i.e., .(x) denotes the optimal cost starting from x, must satisfy the Hamilton-Jacobi-Bellman (HJB) equation H * ( qv,(x)) = 0. Specifically, here the HJB equation is (6)
+
+
+
-
( - 1 / 4 e ) ( b v,(s))~ ( A z ) v,(x) = -x’Qx,
v(0) = 0, v(x) > 0
if z # 0.
This equation admits adilation group as a symmetry group. Indeed, if v(.) is a solution and we scale by letting x = x(y) = ~y and v ( E ~=) e2w(y), E > 0, then W is also a solution. Another wayof viewing this is that if v(.) is a solution of (6), then so is ( ~ / E ~ ) V ( E ZThus ). if (6) has a unique (positive definite) solution v(e), we must have ~ ( E z )= .z2v(x),i.e., v is homogeneous of degree 2 . The controllability condition ( 4 ) insures the existence of a solution of (6) which is, in fact, unique; see [ 2 , Chap. 31. This suggests one try v(z) = x‘Ex, E a positive definite symmetric matrix to be determined, as a solution. Substitution readily leads to the classical “stationary Riccati equation” A’E E A - (l/e)Ebb‘E= -Q, which has a unique positive definite, symmetric solution E . The optimal control is then u*(z)= G(z,v,(z))= (-1/2e)b’Ez and v(z)is a Lyapunov function for (3) with trajectory derivative $(S)= -x’Qx. This shows U* is an ASFC for (3), Furthermore, U * , as constructed, turned out to be linear in z which isbasic (m mentioned above) in assuring that U* is also a local ASFC for ( 1 ) . The above is classical and well known (but perhaps not in this form) to control engineers. Our goal is to extend these ideas to the case when
+
Hemes
208
the linear approximation is not controllable, for example to a system of the form (1) with X O ( E )= x78/8x2, X1 = 8 / B q which has the zero vector field as the linear approximation of X,. For simplification, since X l ( 0 ) # 0 we may choose local coordinates so X1 = B/azl and then, if X O ( E= ) & ‘ ai(z)B/8zi,use feedback .(E) = - a l ( z ) + p ( z ) to eliminate the first component of XO.We assume this has been done, i.e., hereafter X O ( E= ) CY=, ui(z)O/dzi.The method proceeds as follows.
A. Determine how to decompose a smooth vector field X ,with X ( 0 ) = 0, as X ( E )= R ( E )where X ( k )is an approximation of X and R is a “higher order remainder” in such a way that E = 0 being an asymptotically stable solution of 5 = X ( k ) ( ~ implies ) the same for 5 = X ( E ) .We call such an approximation one which preserves asymptotic stability. (Here X @ ) no longerneedhave components homogeneous polynomials of degree IC in E. However, as we shall see, it will be a vector field homogeneous of degree IC with respect to some dilation.)
+
B. Construct an approximating system to (l), i.e., (7)
5 = XA”(z)
+ ux1
a (XI =
being its own approximation
+
and (if possible) an ASFC U * ( % ) for (7) which is such that ( X r ) ( z ) u * ( z ) X 1 )is an approximation of ( X O ( E ) u * ( z ) X l )which preserves asymptotic stability. This will assure u*(E)is also a local ASFC for the full system (1). As may be expected, the construction of an ASFC for (7), when possible, will be via a nonlinear regulator problem which leads to an HJBequation admitting a symmetry group.
+
Before beginning the program outlined above, we wish to stress that the problem of the existence and construction of an ASFC for systems of the form (1) has been an extremely active area of research during the period 1985-1995. This paper is meant to present one approach, not to survey the field. The reader interested in a large list of references and the many different methods is referred to the recent book [3].
Asymptotic Stabilization Homogeneous viaApproximation
209
1. Approximations which preserve asymptotic stability Let 6;z = ( ~ ~ 1 ~. .1,cPnzn), , . E > 0, with 1 5 r1 5 , , , 5 rn integers, denote an arbitrary dilation of J P , i.e., 6: : Rn + Rn. A function h : W n + W1 is said to be homogeneous of degree k with respect to 6; if h(6Tz) = Ekh(x).We denote this by h E Hk, i.e., H k consists of functions homogeneous of degree IC. The special (classical) dilation with 1 = r1 = . . . = rnwill be denoted 6.; A vector field X ( z ) = C;=, ac(z)B/Bzg is homogeneous of degree m with respect to 6); denoted X = X ( m ) ,if ai E HTi+m-l, i = 1,. , . ,n.This differs from the convention adopted, for example, in [4], [5], but has the advantage of making a linear vector field X ( z ) = Az homogeneous of degreeone with respect to 6; as is classical. A standard, elementary, result of stability theory is that if, relative to 6,; we expand a smooth vector field X with X ( 0 ) = 0 as X ( z ) = X ( l ) ( z ) X ( 2 ) ( z ) . . . = X ( l ) ( z ) R ( z ) ,then the origin z = 0 is a locally asymptotically stable solution of X ( z ) if it is an asymptotically stable solution of the approximation 5 = X ( l ) ( z )= A s . This was extended by Massera, [6],for homogeneity relative to 6; as follows. If X ( $ ) = X ( k ) ( z ) X ( k + l ) ( z ) . . . = X(’)((a) R ( z ) and the zero solution of 5 = x ( ” ( is ~asymptotically ) stable (and this will be global by homogeneity), then the zero solution of = X ( z ) is locally asymptotically stable.. The generalization of Massera’s result for an arbitrary dilation 6; was,shown to be valid in [7]. We state this as
+
+
+
+
+
+
x
Theorem 1 ([7]). Let X ( k ) be a continuous vector field o n R“ with X ( k ) ( 0 )= 0 which is homogeneous of degree IC with respect to a dilation 6;x = ( ~ ~ 1 x, 1. ,,~~ ~ “ and 2 ~ such) that solutions to initial value problems for x = X ( k ) ( x )are unique. Let R be a continuous s u m of vector fields homogeneous of degree 2 (IC + 1) with respect to 6.: Then .if the zero solution of x = X ( k ) ( x )is asymptotically stable the same is true for the zero solution of x = X(’)(z) R(z). An extension of this theorem has been proved, under weaker hypothesis, in [g], where it is shown that asymptotic stability can be established via a homogeneous Lyapunov function.
+
Hemes
210
Example 1.1. Let X ( I ) = ( - I : + I ~ ) L ~ / L ~ I ~ - I and ~ / 6;s ~ L= ~ / ( & Z 1 , E 3 z 2 ) . Then - I : E H3 = HTI+3-l, xi E H6 = HT,+6-1,- I : / ~ E H5 = HT2+3-1. Thus we may write X ( z ) = X @ ) ( I )+ X ( 6 ) ( ~where ) X @ ) ( I )= -z:a/azl - 1;’~a/622and X @ ) ( I )= siL3/611. Clearly I = 0 is an asymptotically stable solution of x = X ( 3 ) ( ~ hence ) I = 0 is also an asymptotically stable solution of 5 = X ( I ) .Note that this dilation makes x$ of “higher order” than z:. m 2. High order approximations of afflne systems If the linear approximation of (1) is not controllable, we wish to replace the linear approximating system (3) with a higher order approximation of the form (7) where, in particular, X i k ) will denote a homogeneous approximation of X0 of degree k with respect to some dilation S:. But homogeneity and linearity are not coordinatefree notions, Is there a natural way to choose the local coordinates and dilation dl? While we do not give a definitive answer to this question, we will describe a construction giving a dilation, local coordinates, and homogeneous approximating system which retains desired high order information about system (1). Let L = L ( X 0 , X l ) denote the Lie algebra generated by the vec‘tor fields X,, X1 and L(Xo,Xl)(O)the elements of L evaluated at 0. If dimL(Xo,XI)(O) = k < n all solutions of (1) initiating from the origin will lie on a k-dimensional manifold. While one could possibly still have an ASFC for (1) it is not a case wewish to consider. We thus assume, throughout, that (8)
dimL(Xo,X1)(O)= n.
For vector fields X , Y our notation will be (adX, Y ) = [ X ,Y], the Lie The solution, product of X and Y ,while (ad”’ X , Y)= [ X ,( a d k X , Y)]. at time t , of the differential equation j. = X ( z ) , 1 ( 0 )= will be denoted (exp t X )(v).
Deflnition. An extended filtration, F,of L at zero is a sequence of subspaces {Fj : -CO < j < CO} of L ( X ,Y )such that for all integers i, j:
~ I ~ ,
Asymptotic Stabilivationvia Homogeneous Approximation
211
We will construct these filtrations for L(X0, XI) via “weights,” The filtrations will determine preferred (or induced) local coordinates and an induced dilation relative to which we form the approximating system to (l), Let C(X0,Xl) denote the free Lie algebra generated by the symbols X,,XI. There is a natural homomorphism from this free Lie algebra to which sends a Lie polynomial in the symbols X,,XI, into the L(X0, XI), vector field which isthe corresponding Lie product of the vector fields XO, XI. See [lo] for details. Assign integer (negative integers are permitted) weights to thesymbols XO, XI, denoted wt(Xo), wt(X1) respectively, and let the weight of a Lie product be the sum of the weights of its factors. The weight induced filtration F then has subspace Fj consisting of the homomorphic image of all elements in L(X0, X,) having weight a 5 j . Notice that condition (iv) puts a severe restriction on the admissible values of wt(Xo), wt(X1). Example 2.1. Let XO(Z)= a!B/Bs2, X1 = B/asl. There is one Lie product (ad3 XI, Xo) = -6B/azz which doesnot vanish at zero. One could assign wt(X0) = -2, wt(X1) = 1 and have a weight induced filtration. D Example 2.2. Let Xo(s) = (Q - ~ : ) 0 / 0 5 2X1 , = 8/8sl, the product (a d 3 XI, XO)is nonzero at zero but so is (adk X0(ad3 XI, XO)).Here condition (iv) forces the smallest weight one can assign X0 to be zero. D In general, we will assign the weights of Xo, X1 as small as possible. Induced Local Coordinates and Dilation. Let 3 = (F’ : “00 < j 00) be an extended filtration of L(Xo,X1) at zero.Define nk = dim F k (O), -00 c k c 00. Then property (iv) shows n k = 0 if k 5 0 while (8) means dim F N ( O ) = n for some integer N Choose X,, , . ,XRnl E F1 m
Hermes
212
such that these are linearly independent at zero. Adjoin X,,,,+,, . . . ,X,,,, E F 2 such that X,, (0), . . . ,X,,, (0) are linearly independent and continue in this fashion to get X,, ,. . . , X,, with
+ 15 i 5 nj. 5 121, rj = 2 for nl + 1 5 j 5 nz,etc., i.e., rj = k
X,, E Fj,
(9)
nj-1
Choose rj = 1 for 1 5 j for nk-1 1 5 j 5 n k . The dilation Grwith r = ( T I , . . , r,) determined as above is called the filtration induced dilation or dilation adapted to the filtration. In a specific problem, vector fields are initially given relative to some local coordinates, say y = (VI,.. . ,g,) for a neighborhood of zero. Define a local coordinate change z = cp-l(y) where
+
.
g = p(z) = (exp zlX,,) o
.. . o (exp z,X,,>(O).
Then cp is a local diffeomorphism; the coordinates z = (51,. local corodinates induced by ( o r adapted to) the filtration.
. . ,z),
are
Remark 2.1. Since Xl(0) # 0 ifwe assign wt(X1) = 1, which will usually be the case, we can choose X,, = X1 and if in the initial coordinates we had X,(g) = B/Bg1we again have, in the induced coordinates, Xl(Z) = B / B X l . Theorem 2 (See, for example [5, Thm. 21). Let 3 = {Fj : "00 < j < 00) be an eztended filtration at zero for L(X0,Xl) whichsatisfies dimL(Xo,X1)(0) = n, and z = ( ~ 1 ,... ,x,) be local coordinates induced by 3,:S the dilation induced by 3.Then if X E Fl-j, (11)
X(z) = X q z )
+ X(j+l)(Z)+ * ..
where X(i) is homogeneous of degree i with respect to til, Example 2.1 (continued). With wt(X0) = -2, wt(X1) = 1 we have a filtration 3 with dimFl(0) = 1, dimFz(0) = 1, dimF~(0)= 2. This gives G; with r = (1,l).Choose X,, = X1 and X,, = (ad3 X1,Xo). Then the original coordinates are (to within scalar multiples) the induced coordinates. Here Xo(z) = z:B/Bzz and wt(X0) = -2 means X0 E F-2 = F1-3 or j = 3. Indeed, Xo(z)= X f ) ( z )since z: E H3 = H,,+3-1.
Asymptotic Stabilization via Homogeneous Approximation
213
In summary, given the system (l), assignweights wt(Xo),wt(X1), usually as small as possible, so that these induce an extended filtration 7 of L(X0, XI). The filtration induces a dilation : 6 and local coordins tes (which we again denote by S) relative to which we expand Xo(z) = Xi"'(z) +X:"')(S) . I . If we assign wt(X1) = l, in which case we may as well choose X,, = XI, we will again have Xl(z) = Xl0)(z) = 8/Bz1. Note that k = 1 - wt(X0). The approximating system to (1) is
+.
If the system had been linearly controllable, i.e., if Xl(O), (ad Xo,X1)(O), . . . , (adn-lXo,X1)(0) areindependent, the assignment wt(X0) = 0, wt(X1) = 1 gives :S with T = (l,. .. ,l) and system (12) becomes the linear approximating system (3). Homogeneous approximations of the above form are basic tools in the study of small time local controllability, see [lo], or large time local controllability, see [ll].
3. Constructing an ASFC for the approximating system In general, the original system (l), or its approximation (12), need not admit an ASFC. Necessary conditions for the existence of a continuous ASFChavebeengivenby Brockett [g]. These are: B1. For every point y in a neighborhood of zero there is an (open loop) control t -+ uy(t) such that the solution of (1) initiating from y using control uy tends to zero as t + 03. B2. {Xo(z) u X l ( z ) : z in a nbd. of zero, U E W} covers a neighborhood of zero. Condition B1 is satisfied if system (1) is small time locally controllable (STLC) at zero, i.e., if the set of all points attainable from zero at any time tl > 0 contains a neighborhood of zero. Condition B 1 is also satisfied if system (1) is large time locally controllable (LTLC) at zero, i.e. there is some tl > 0 such that zero is in the interior of the attainable set at time tl. Computable conditions for STLC are known, see [lo], while conditions for LTLC can be found in [ll]. It can be shown (see Ill] for references) that if the approximating system (12)
+
Hemes
214
is STLC, then system (1) is either STLC or LTLC. We will deal, here, with systems (examples) having approximations (12)which are STLC and satisfy B2. Systems which do not satisfy B 2 are topics of current research using "dynamic" feedback ordiscontinuous feedback controls. For example, a basic result established by Coron, [12],is that in dimension n 2 4, STLC implies the existence of a continuous (time-periodic) "dynamic" feedback control u ( t , z ) which drives the system to zero in finite time. Our goal isto construct an ASFC for(12)as the optimal control relative to a cost functional of the form 00
1
+
C ( U ) = [elu(a)18 h+(z(a))lda,
(13)
e > 0, 8 > 1,
0
with h+ homogeneous (relative to SF) and positive definite (of homogeneous degree to be determined) while S is to be chosen to assure the associated HJB equation admits a dilation symmetry group. Again, using = aj(z)S/Szj.jFrom somefeedback if necessary, assume X$")(z) the maximum principle for the problem (12), (13), one defines H (x,p , U ) = p l u Cy=,aj(z)pj elule h+(z).Next, minimize H with respect to U to find
+
+
+
Letting H * ( z , p ) = H(z,p,.^(z,p))the value function ).(V the HJB equation H* (x,v,(z)) = 0 or
(15)
- rlvzl(z)Ia/('-') +
c n
aj(z)v,, (z)
then satisfies
+ h+(z)= 0,
j=2
v(0)= 0,).(V where r =
(G)( G ) 1
l/(")
>0
> 0 if I # 0,
-+
, ~ 00 as e "+ 0.
Since X$') is homogeneousof degree with respect to , :S ai E Hrc+k-l, i = 2,. , . ,n. Suppose we were to seek a solution v E He of (15). Then v,( E He-ri while aiv,, E He+k-I. We then would want [vzlls/(8-1) also in He+k-1 which requires (e - T I ) = L + IC - 1 or
("-51- 1
(16)
L = s(T1 + IC - 1) + 1 - k,
IC = 1 - wt(X0).
Asymptotic Stabilization via
Homogeneous Approximation
If (16) is satisfied and h+ is chosen in
215
then if (15) has a unique (positive definite) solution W , it will be true thatW E He.If (12) is STLC at zero, a positive definite solution (in the viscosity sense) of (15) will exist. Furthermore, from (14) we see the optimal control U * ( % ) = .^(x,wzl(z)) satisfies U* E H , , + k - l . This means the vector field (Xik)(z)+ .*(S)&) is homogeneous of degree k with respect to the dilation S; and from Theorem 1, if U* is an ASFC for (12), it is also a local ASFC for (1). The construction proceeds by choosing L and 8 > 1 to satisfy (16). One then seeks a positive definite solution W E He of the HJB inequality (17)
-rlwzl(Z)I~' . ( ~-' )
+
c
He+k-l,
n
q(z)wzj(2) I 0 (negative definite).
j=2
If such a W can be found, it is a Lyapunov function for the approximating system (12) having U = u*(z)= G(z, w,(z)) as given by (14). Furthermore, u*(z)will be a local ASFC for system (1). The advantage of the method is that the form of the function W which one should try as a solution of (17) is known! If (12) is STLC at zero but does not satisfy the necessary condition B2 for a continuous ASFC (e.g., = U , si.2 = 21, 5 3 = z! is such a system), one can expect at best a continuous (viscosity) solution W E He for (17). l = L and This means W could contain terms such as I z 1 l n 1 z ~ , n l rn2r2 (even for small values L ) the general form of W can be quite complex.
si.,
+
4. Examples
Early papers with examples on the use of homogeneity in the construction of ASFC are [13], [14]. Several (somewhat complicated) fully worked examples using the method described here can be found in [15], [16], [17]. We next give an easy example to illustrate the calculations involved in $3, Example 4.1. Let n = 2, Xo(z) = z!B/Bza, X1 = B/Bzl. The XO)= -6d/Bz2. One can nonzero Lie brackets are X1 itself and ( a d 3 XI, assign wt(X0) = -2, wt (X,) = 1; this assignment does satisfy conditions (iv) for an extended filtration 3.Here both X1 and (ad3 XI, XO) have
Hemes
216
weight 1 hence dim F1 (0) = 2 and the induced dilation :S has T = (1,l). The original coordinates are (upto scalar multiples which are insignificant) induced coordinates so we make no coordinate changes. Also,X0 = X?) is its own homogeneous approximation while X1 = X?) = a / B q remains t18 is. Choose .t = 2 (the smaller 4 is, in general, the easier calculations are). Then k = 1 - wt(X0) = 3 and (16) shows S = 4 / 3 which is an admissible value. (We require S > 1.) The HJB inequality (17) is (i) - y ( v z 1 ( ~ ) ) 4 E ~ v ,(z) , 5 0 where 5 0 means negative definite. We seek a solution v E H Z ,i.e., this tells us the form to try is
+
Theory of quadratic forms tells us v will be positive definite if (ii) E2 > a 2 / 4 .Substituting for v in (i) shows we need (iii) - 7 ( 2 ~ 1+ a ~ az? ~ ) ~ 2 E z ~ g z c5~ 0. Since y can be made arbitrarily large by choosing e > 0 sufficiently small, (iii) will be negative definite if 2Ezz1s2 5 0 when 2x1 az2 = o or equivalently if (iv) (a- ( 2 E a ) / a ) 4 5 0. One may easily satisfy (ii) and (iv)simultaneously, e.g., choose Q = 1, E2 = 2 . Thus u * ( z ) = -(3/4e)3(2zl z2)3is an ASFC for e > 0 small.
+
+
+
+
+
where .t Onemay note that in this example u * ( E ) has the form is a linear function of z. In [18], Dayawansa and Martin give an inductive proof to show that the “cubic integrator”
) ) ~l is a linear function of admits an ASFC of the form u * ( E )= ( l ( ~where E . In general, the odd pth power integrator admits areal analytic ASFC of the form u * ( z )= ( l ( z ) ) P and a real analytic, positive definite, solution of the associated HJB equation (15). Necessary conditions (which’sometimes are sufficient) that equation (15) have a real analytic, positive definite, ) , given homogeneous solution v which is such that u * ( E )= G ( z , v Z l ( z ) as by (14),is a real analytic ASFC in Hrl+k+l are given in [l91 in terms of the Lie bracket structure of X,, X1 at zero.
Asymptotic Stabilization via Homogeneous Approximation
217
REFERENCES R. E. Kalman, Contributions to the theory of optimal control, Bol. Soc. Mat. Mexico 5 (1960), 102-119. E. B. Lee and L. Markus, Foundations of Optimal Control Theory, John Wiley & Sons, NY (1967). A. Bacciotti, Local Stabilizability of Nonlinear Control Systems, SeriesonAdvances in Mathematics forAppliedSciences 8 , World Scientific (1991). L. P. Rothschild and E. M. Stein, Hypoelliptic differential operators and nilpotent groups, Acta Math. 137 (1976), 247-320. H.Hermes, Nilpotent and high-order approximations of vector field systems, SIAM Review 33 (1991), 238-264. J. L.Massera, Contributions to stability theory, Annals of Math. 64 (1956), 182-206. H. Hermes, Homogeneouscoordinates and continuous asymptotically stabilizing feedback controls, Differential Equations, Stability and Control, (S.Elaydi, ed.), Lecture Notes in Pure & Applied Math. #127, Marcel Dekker, Inc., NY (1991), 249-260. R. Rosier, Homogeneous Lyapunov functions for continuous vector fields, System and Control Letters, 19 (1992), 467-473. R. W. Brockett, Asymptotic stability andfeedback stabilization, Differential Geometric Control Theory, (Brockett, Millman, Sussmann, eds.), Vol. 27, Birkhauser, Boston (1983), 181-191. H. J. Sussmann, A general theorem on local controllability, SIAM J. Control €4 Opt.. 25 (1987), 158-194. H. Hermes, Large-time local controllability via homogeneous appre ximations, SIAM J. Control t3 Opt., 34 (1996), 1291-1299. J. M. Coron, On the stabilization in finite time of locally controllable systems by means of continuous time-varying feedback laws, SIAM J. Control €4 Opt., SS (1995), 804-833. M Kawski, Stabilization and nilpotent approximations, Proc. 27th IEEE Conference on Decision and Control, 11, (1988), 1244-1248.
218
Hemes
[l41 A. Andreini, A. Bacciotti and G. Stefani, Global stability of homogeneous vector fields of odd degree, Systems & Control Letters 10 (1988), 251-256. [l51 H. Hermes, Asymptotically stabilizing feedback controls and thenonlinear regulator problem, SIAM J. Control & Opt. 29 (1991), 185196. [l61 H. Hermes, Asymptotically stabilizing feedback controls, J. Dig. Eqs. 92 (1991), 76-89. [l71 H. Hermes, Asymptotic stabilization of planar systems, Systems & Control Letters, 17 (1991), 437-444. [l81 W. Dayawansa and C. Martin, Asymptotic stabilization oflow dimensional systems, Progress in Systems and Control Theory, 9, (C. Byrnes and A. Kurzhansky, eds.), Birkhauser, Boston (1991), 53-68. [l91 H. Hermes, Smooth homogeneous asymptotically stabilizing feedback controls, ESAIM J. Control, Opt. & Calc. of Variations, (http://ww.emath.fr/cocv/), 2 (1997), 13-32.
Critical Hamiltonians and Feedback Invariants Bronislaw Jalcubczylc Institute of Mathematics, Polish Academy of Sciences, Qniadeckich8, 00-950 Warsaw, Poland
Abstract. For analytic control systems in the general form k=f(z+), ZEX, U € U we study the local feedback equivalence problem, that is the problem of equivalence of such systems under the local, invertible, analytic transformations z
-+
O(z),
U
+ qz,U ) .
If the spaces (manifolds) X and U satisfy the condition dimX > dimU > 0, then the problem has functional moduli, that is complete local invariants are infinite dimensional. In this paper we construct these invariants explicitely. The construction uses critical Hamiltonians analogous to the ones used in time-optimal control problems. However, we also use complex solutions to theequation for critical points of the Hamiltonian. After symmetrization this gives a set of analytic functions on the cotangent bundle, called symbols. The symbols together with their Poisson brackets form a complete set of microlocal feedback invariants when a rank condition is satisfied. The symbols can be computed explicitely in many cases using complex logarithmic residua. We give an explicit formula for systems polynomial with respect to the control. Partially supported by KBN grant 2P03A 004 09.
219
Jakubczyk
220
1. Introduction, equivalence problems We will consider control systems of the form TI:
k = f(E,U),
E
E
x,U E U
where X = Rn and U C R", an open subset, or more generally X and U are real analytic differential manifolds. We assume that f is analytic with respect to E and U . The main aim of this paper is to present (without proofs) constructions of complete sets of local, and microlocal, feedback invariants for such systems. The idea of this construction is explained in the following section and can be realized in two ways: the invariants can be constructed via critical Hamiltonians (as in the Pontriagin maximum principle), or via complex logarithmic residua. Both ways turn out to be equivalent. The proofs are partially contained in [Ja2](for scalar control); the general version will be presented in a future paper. System of the above form have been systematically studied for at least last fourty years, starting with the work of Pontriagin and coworkers on optimal control problems and that of Kalman on controllability of linear systems. However, in the early period mainly specific problems were studied like various versions of optimal control problems, controllability problems, stabilizability, etc. Despite deep theorems and theories developed concerning these problems, until early seventies the only general results concerning the internal structure of such systems, independent of specific problems, seem to be orbit theorems (Rashevski-Chow, Hermann-Nagano, Stefan-Sussmann). (The structure and geometry of linear systems is relatively well explained by the so called Brunovsky canonical form and the Kalman decomposition theorem.)
A progress in understanding the local geometry of general nonlinear systems has been made after introducing natural equivalence relations in the class of such systems, or in subclasses (cf. [Kr], [Bro], [JRl]). In consistency with the Klein's Erlangen program, a method of understanding the structure (or the geometry) of a class of objects is to define equivalence
Critical Hamiltonians and Feedback Invariants
221
relations, or group actions, and to study their invariants. The following are the most natural equivalence relations for control systems II. Consider another analytic system of the form
E: Definition 1.1. Two systems II and are called (state, pure feedback, or feedback) equivalent if there exist invertible analytic transformations ( % , U ) = x ( E , G ) of the form
=G($), (PFT) z = E, (FT) 3 = @(E),
(ST)
2
-
(state equivalence), U = *(E, G) (pure feedback equivalence), U = *(E, E ) (feedback equivalence), U =U
which transform one system into the other. Until early ninties local invariants have been constructed for state equivalence only, with the exception of feedback invariants of planar systems [JR2]. These are the iterated Lie brackets of the vector fields f,, = f(.,u), U E U, considered at a given point p E X under the action of the group of linear transformations of the tangent space Tp,see [Kr], [Su]. We will state this result at theend of this section. Our main aim here is to construct a complete set of invariants of the weakest of the above equivalence relations, i.e., feedback equivalence. An earlier version of our construction was carried out in [Ja2] for scalar control only. Another approach based on the method of moving frame of Elie Cartan was proposed by Gardner [Gal, and for scalar controls WEN constructed by Kupka [Kul], [Ku2] and by Gardner and Wilkens(seealso [BG] for another geometrization of control systems). In order to constructinvariants of feedback equivalenceit will be usefull to find invariants of pure feedback equivalence, first. (Note that there is a trivial, but rather useless, complete invariant of pure feedback equivalence which isthe image f (z, U) .) Our invariants of pure feedback, called critical Hamiltonians or symbols, will be constructed in Sections 3 and 4. They are of independent interest as they reflect the intrinsic geometry of the system. They also explain certain very general fact which can vaguely be formulated as the principle:
222
Jakubczyk
Complex critical trajectories of control systems, considered up to state equivalence, are complete invariants of feedback equivalence. It seems that this principle holds under very week assumptions. In particular, it holds under our finite multiplicity assumption (see Sections 3, and 4). Some evidencefor this principle to hold follows also from the papers [Boll, BO^], and [RZh], (for control-affine systems) and from [Zh], [MO], and [JZh] (for control-linear systems, i.e., distributions). In these papers only real critical trajectories were considered, which introduces additional restrictions. In [Boll B. Bonnard introduced an idea of lifting the feedback equivalence problem to the cotangent bundle. This idea will also be used here. However, we will use it in a stronger sense introducing objects which are not defined on the whole fibers of the cotangent bundle, but only in some open cones. This idea (called microlocalization) is crutial for our paper. In Section 5 we introduce microlocal feedback equivalenceand describe its invariants. They are also basic invariants for the (slightly stronger) local feedback equivalence. The idea of microlocalization and microlocal invariants is outlined in Section 2. This section serves as an introduction to the main results of the paper. Before explaining the main idea of our invariants it will be helpful to recall the main result of Krener and Sussmann on state equivalence. The following result holds in the analytic case [Kr], [Su].We denote by [g, h] the Lie bracket of vector fields g and h. Recall that a system II satisfies the controllability rank condition at x0 if the Lie algebra Lie{fu)uEv of vector fields on X generated by the vector fields fu = f(-,U ) , U E U ,spans the whole tangent space T,, at 20. Theorem 1.1. Let two analytic control systems ll and have the same control space U = 6 and they satisfy the controllability rank condition at points x0 and 30,respectively. They are locally state equivalent at x0 and 30,respectively, if andonly if thereexists a linear isomorphism of the tangent spaces
Invariants Feedback Hamiltonians and Critical
223
such that the equality
holds for any k 2 1 and any u1,.. . ,uk E U . If additionally X and 2 are simply connected and the vector fields f,, and A, U E U , are complete, then local state equivalence implies global state equivalence and so the above criterion is global. In the above theorem the left iterated Lie brackets
can be replaced by all iterated Lie brackets. An important consequence of this theorem is the fact that any diffeomorphism invariant local property of an analytic system I Idepends only on its iterated Lie brackets of the vector fields f,,, U E U,and it is enough to know the relations between them at one point. (In the C" case it is enough to know some of the relations between the Lie brackets in a neighborhood of a given point, cf. [Jal].) An example of a deep structural propertyof a system analysed in terms of relations between Lie brackets of the vector fields of the system is the property of local controllability (cf. the survey of Kawski in this volume). Another problem which is well understood is that of characterizing nonlinear control systems which are locally feedback equivalent to linear control systems
&=Aa+Bu, (see [JRl], [HSM], and the book [Is]).
- an outline
2. Microlocalization and feedback invariants
Wewill outline the construction of local feedback invariants carried out in Sections 3, 4, 5 , and 6 of thispaper. Roughly, the idea of this construction has its inspiration in the above theorem on state equivalence and some notions introduced for solving the time-optimal control problem.
. . . . . .
I
. . . . . . . . .
, .
,
.
I
.
..I^.
.........,..,,.
. . ..
,
,
,.
,
.
.I..
Jakubczyk
224
Namely, we replace the control system II by a finite number of Hamiltonian dynamical systems on the cotangent bundle T * X .This Hamiltonian systems are defined by critical Hamiltonians which are candidates for solutions of the time-optimal control problem. Then the problem reduces to establishing when two families of Hamiltonian vector fields are equivalent via a diffeomorhism of a special kind (the lift of a state diffeomorphism), a problem similar to the one solved in Theorem 1.1. Assume that X = W , the control space U C R"' is an open subset, and f = (fl, . . . ,fa) is analytic. In order to be able to analyse a system IT at a point z E X separately along any direction p in the cotangent space T,*X we introduce the Hamiltonian of the system as the function H:T*XxU+R
The Hamiltonian measures the velocity of the system in a given direction p = (p1,.. . ,p,), depending on the control U. Let us assume z and p be fixed and consider H as a funcion of U . Its most interesting points are critical points where the derivative vanishes, i.e., the ones satisfying the equation
aH -(p,z,u) aU
= 0.
If additionally the rank of the matrix
is maximal (equal to m ) , then from the implicit function theorem it follows that the above equation has locally a unique analytic solution U = U @ , S). Plugging this solution in the Hamiltonian we obtain the critical Hamiltonian
H1 (P,x) = Hb,2,424S)) which represents a dynamical system on T*X
Critical Hamiltonians and Feedback Invariants
225
This is called a critical Hamiltonian system (its trajectories are candidates for time-minimal or time-maximal trajectories of J I in ‘‘different directions”). From the point of view of finding invariants of feedback equivalence it is more interesting to consider not just regular critical pointsas above, giving rise to a locally unique solution U = ~ ( pz),, but consider degenerated critical points which lead to several (bifurcating) solutions. Such points usually exist. For example, if the control U is scalar then for a given x we may consider the direction p which anihilates the vectors aiffl8ui(z), i = 1,.. . , p . If p < n then such a direction always exists. If additionally B p + l f / 8 ~ p + ~ ( z )# 0, then we can choose our direction p so that
Then the equation %(p, z, U ) = 0 may have p solutions ‘L11 = w ( P , z ) , . . . r u p= up(P,z)r
(in fact it has exactly p complex solutions, when counted with multiplicity). Ifwe plug each of them in the Hamiltonian, we obtain p critical Hamiltonians
H1(p,z)=H(p,z,Ul(p,z)),...,Hp(p,z) =H(P,z,Up(P,z)). The number p of critical Hamiltonians near a given point W = ( p , x,U ) is called multiplicity at W and is an important invariant. It will be assumed to be finite. Our first result presented in Section 4 says that under mild assumptions the criticalHamiltonians represent the system II uniquely up topure feedback equivalence. This result can also be stated in the following way. Define the critical trajectories of our system as the absolutely continuous curves in T * X satisfying the equations
8H
-(p, au
x,U ) = 0.
226
Jakubczyk
Since we assume our functions to be analytic, we may consider these equations in the complex domain (p, z, and U be complex), locally around a given point W O = (po, 10, UO). One of results in Section 4 can be stated as thefollowing specialization of our principle from Section 1.
Complex critical trajectories (in aneighborhood of W O ) f o r m a complete set of invariants of local pure feedback equivalence, if the multiplicity is finite and nonzero at W O .Two analytic systems of finite, nonzero multiplicity at W O and 60 are locally feedback equivalent if and only if their sets of complex critical trajectories are related by a local real-analytic diffeomorphism of the state space lifted to the cotangent bundle, which sends W O into 60. The idea of microlocalization works here by lifting the problem to the cotangent bundle and using local considerations there. Generally, the critical Hamiltonians are not regular functions if the z) are only implicit function theorem is not applicable (the solutions v~((p, continuous functions of (p,z)). This makes a direct analysis of singular trajectories difficult. However, we are able to omit this difficulty by introducing symmetric functions SI,. . . ,S, of the complex critical Hamiltonians,
called symbols, which are real analytic functions. They represent certain Hamiltonian dynamical systems onT * X .The remaining problem for constructing feedback invariants reduces to finding invariants of this family of Hamiltonian systems. We solve this problem in Sections 5 and 6. The idea of this solution is to use a result similar to theresult stated in Theorem 1.1. More precisely, Theorem 1.1 has its version for Hamiltonian vector fields which we state below. Suppose that %! = { h , } , E ~is a family of analytic functions h, : + R, where U is a set with at least two elements. We will use the standard
CriticalHamiltoniansandFeedbackInvariants
227
,
Poisson bracket of two functions g , h : R2"
P defined by
where ( P I , . . . ,p,, z1, . . , x n ) are the standard coordinates on R2".Recall that the standardsymplectic form onR'" is the 2-form W = dpiAdxi. A symplectomorphism of Rpn is a diffeomorphism of R2" whichpreserves the symplectic form (transforms W into W ) . Two families of functions { hu},EU and { ? I , } , E ~ are called (locally) symplectically equivalent if there exists a (local) symplectomorphism 0 : + such that
cy=,
-
h, o 0 = h,,
forall
U
E U.
We say that a family { hu},eU satisfies the rank condition at zo = (PO, t o ) E Pzn if there exist 2n functions 91,. . . ,g2nr each of them of the form of an iterated Poisson bracket gi={h,l,.",{h,k-l,h,k}.',}
(the indices u1,,.. ,U k , and k depend on i ) , such that the differentials dgl , . , dgp, are linearly independent at ZO,
..
Theorem 2.1. Two families {h,},EU and { % , } , E of ~ analytic f i n c tions on Rpn satisfying the rank condition at the points zo and go, respectively, are locally symplectically equivalent if and only if they satisfy the equalities {
h
~
~
~
~
"
~
{
h
~
~
~
-
. l
.
-
~
h
~
~
}
~
'
~
}
(
Z
~
~
~
f o r a n y k > l a n d a n y u l ,...,U k E U . The left iterated Poisson brackets in the above theorem can be replaced by all iterated Poisson brackets. In our construction of feedback invariants the role of the family { h U } U E ~ is played by the finite family of symbols (symmetrized critical Hamiltonians) S I ,. . . , S, mentioned above. We will illustrate our method by considering a class of systems with scalar control for which the critical Hamiltonians and the symbols can be easily computed.
{
h
~
~
~
~
~
~
~
{
h
Jakubczyk
228
Example 2.2. Assume that the control set is the circle, U = S1, and the sets of velocities F ( s ) = f ( s , U ) are ellipses in T,X. Such a system can be written in the form
+
5 = V ( s ) g1(z) COS(U)
+ g2(3) sin(u),
where V , 91, and g2 are analytic vector fields on X . The Hamiltonian of the system can be written as
For a generic p E T,*X the Hamiltonian has two critical points with respect to U, and so there are two critical Hamiltonians (maximal and minimal)
The corresponding symbols are
+
S1 = H1 + H 2 = 2pv, S2 = (Hl)2 (H2)2= 2(Pv)2 +2(pg1)2 + 2 ( p g 2 ) 2 . More generally, if the sets of velocities F ( s ) = f(s,U) are m-dimensional ellipsoids in T,X, then we can describe the system by the equations
+ g1(z)v1 + . . . + gm+l(Z)vm+l,
j: = V ( s )
v12
+ +
= 1,
* * *
with V , 91, , , . ,gm+l analytic vector fields on X (the control set U is the m-dimensional sphere). The Hamiltonian of the system can be written as
H = p V ( s )+ P ~ I ( z ) + ~ I* +~gm+~(s)vm+l. Again, there are two critical Hamiltonians (maximal and minimal)
and the symbols are of the form
S1 = H1
+ H2 = 2pv,
S2 =
+ ( H 2 ) 2 = 2(pV)2+ 2
c
m+l i=l
These symbols will anable us to state criteria for microlocal and local feedback equivalence of systems within the class described above.
Critical Hamiltonians and Feedback Invariants
229
The reader who wishes to learn quickly the nature of our results is advised to omit Sections 3 and 4,in which critical Hamiltonians and symbols are introduced for general systems II, and read the results in Sections 5 and 6 which apply to the class of systems described in the above example (the symbols computed above are called global with respect to the control), Already systems considered in Example 2.2 have rather complicated and interesting local geometry, understood fully in special casesonly. If V 0 and dimX = n > 2, then such systems on X describe subRiemannian structures (cf. e.g., [Ag]). If additionally the Lie algebra generated by the vector fields g l , . . ,gm+l is of full rank (the Hormander condition), then minimal time of passing between two different points on X defines a metric on X and the resulting metric space, called CarnotCaratheodory space, has very interesting properties, cf. [Gr]. Some very basic facts about generic local geometry of such spaces of dimension n = 3 have been revealed only recently (see [AEGK]). Note that if V 0, then the symbol S2 coincides with the principal symbol of the sublaplacian m
A=
+ 2(g2)2,
where the vector fields gi are understood as differential operators of order one. There is one-to-one correspondence between linear differential operators of order two (with subelliptic quadratic part) and control systems described in the above example. Therefore, the local geometry of such operators is closely related to the local geometry of corresponding systems.
3. Basic assumptions and multiplicity Consider our system II:
x =f(z,u),
2 E
x, U E U.
For most of the paper we assume that X = ]Wn and U C ]Wm be an open subset (possibly equal to Etm). By T X and T*X we denote the tangent and cotangent bundles of X which, under our choice of X ,can be identified
Jakubczyk
230
with R" x R". However, the elements of the first copy of R" will be treated as the tangent (respectively, cotangent) vectors, and the duality between a cotangent vector p and a tangent vector v at the same point z will be simply written as pv (meant as the value of the functional p at the vector v). Our results will be true on manifolds, but we will onlytreat theproblem locally (with respect to x). We assume that f = ( f l , , . ,fn) is analytic with respect to x and U . We will use the following regularity assumption I
Bf = m. rank -
(RI)
BU
Let us define the set of admissible velocities of II as
F(%)= f(z,V ) = {v E T,X
: v = f ( x , ~ )U,E V } ,
and
F=
U F(%). XEX
In some cases we will need a slightly stronger regularity assumption in addition to (Rl),
(R21
F
C
T X is a regular submanifold,
(in order to avoid selfintersections of F).Let us introduce the Hamiltonian of the system ll as the function H : T * X x U + W defined by n
H ( p ,2,U ) = pf (x,U ) = &f&,
U).
i=1
We will often denote z = 01,s)E T ' X
W
R" x W".
The following equation
BH
-(z, BU
U)
=0
(called critical real equation) will play a basic role in our considerations. Its solutions (z,U ) will be called real critical points. In fact, they are critical points of the Hamiltonian H = H ( %U, ) considered as a function of U (with
Critical Hamiltonians and Feedback Invariants
231
fixed z ) . The set of solutions U of (CR) with fixed z will be called the real critical set and will be denoted by
However, we will also need complex solutions of the equation (C), and so our method will use complexanalytic functions rather thenreal analytic. Therefore, we will complexify our Hamiltonian H. If H is a polynomial, then the complexification 2 of H is the same polynomial with the real variables replaced by complex variables. In the general case we assume a point ( z o , ug) = (PO,20,U O ) be given. Then H is defined uniquely by its Taylor series in a neighborhood of this point. The complexification 3 of H can be defined locally as the complex analytic function of the variables (p,2,U ) E (6" x Ccn x C"' given by the same Taylor series (i.e. we replace the real variables by complex variables, denoted by the same letters). Then il is defined in a (small) polydisc W=W1 XW2CC2* XC"', where
W1 = {l211
El,.
a .
, l%ml < E2n)r
61,. l ~ m l 6"').
W2 = (11.1
S . ,
Consider the critical equation for complex critical points of respect to U )
ai?
-(%,U)
8U
a (with
= 0,
and the discriminant equation a2il
det ~
821
(
zU ) = , 0.
We define the critical set C and the discriminant set V of the system as the sets
c = {(%,U) E W
ail
: -8U (%,U)
= O},
ail
= 0,
D = { ( % , U ) E W : -au (%,U)
. ...
.
.
I.
."..., .. ....
.j
.,.,... _.,,,,.,L
,
,..,,
.
..
823 det 7aU (z,u)
.
..,,..
,
..
= 0).
,
,
,.
.. . .
Jakubczyk
232
A more important role willbe played by discriminant of the systemdefined as the projection of V onto Wl1 D = (z, E
W1 : 3u
E W2 such that
(%,U)E
V },
From now on we will analyse our Hamiltonian near a point (z0,uo) which is a (real) critical point, i.e. it satisfies (CR). We introduce a third regularity assumption which willbe called finite multiplicity condition (for reasons which will become clear after stating a proposition below).
(FM)
.
The point uo is an isolated solution in W2 of the critical equation ( C ) with jixed z = zo.
The following properties can be proved using well known techniques of functions of several complex variables and the regularity assumption (Rl).
Proposition 3.1. (a) The critical set C C W is a (complex) submanifold. (b) The discriminant set V C W and the discriminant D C W1 are nontrival analytic subsets of W and W1 (inparticular, they are closed and nowhere dense in W and W I ) . (c) Under the assumption ( F M ) we can choose a neighborhood ( a polydisc) W = W1 X WZ of (zo,uo), possibly smaller then the original one, so that there exists a number p < 00 such that for any z E W1 \ D the equation ( C ) has exactly p digerent solutions in W2 U1
= U&),
..
I
,
u p = U&),
which will be called critical controls. (d) The solutions ui = ui(z),i = 1,.. . l p , are local analytic functions of z on Wl\D and can be extended t o a continuous finite-valued multifunction U c p on the whole W1, defined by Ucr(z)= { U
E
wz : -ail (z,u) aU
=
O>
'
The number p, called multiplicity, is just the number of complex critical controls around the point (20,uo) = (po, X O , U O ) . We will only be interested in analysing our system near a critical point (zo, U O ) ,therefore this number
Invariants Feedback Hamiltonians and Critical
233
will be positive. In fact, the larger this number is, the more information we can recover about the local geometry of the system, as it will become clear in further considerations. One can give explicit algebraic formulas for the muliplicity (cf. [Ar]for more information and further insight).
Proposition 3.2. (a) I n the case of scalar unique number p = p(z0, U O ) ,such that
U
the multiplicity is the
(b) In the general case p is equal to the dimension of the local algebra at uo of the function h(u)= H(z0, U ) , i.e., it is equal to the dimension of the quotient linear space p = dim{C,w,(Rrn;W)/{hl9 .
m
,M } ,
where C:,,(Rm;R) denotes the algebra of germs at uo of real analytic functions, hi are the partial derivatives OH ha = - ( z o , - ) , i = 1,.. ,m, dui and { h l ,. , , , hm} is theideal of this algebra generated by these functions. I
It follows then that multiplicity is a rathereasily computable invariant. Now we will argue that it is reasonable to assume that the multiplicity islarger then one. Suppose that our control U is scalar. If n > 2, then for any point (SO, U O ) we can choose a covector (a direction) p0 such that p = p(po,z0, U O ) is larger then one. Namely, we see that it is enough to take p0 which anihilates the vectors
Then from the statement (a) in Proposition 3.2 and from the equality
Si2 S'f = p i 821' 8u we see that p 2 2. In general, for U in W and S in Rn we can always choose a direction p0 for which the multiplicity is at least n 1. By a similar argument used for general U one can prove the following fact.
-
Jakubczyk
234
Proposition 3.3. Given n and 0 < m < n, then for any point x0 E X , uo E U them exists a direction p E TZoX,p # 0 , such that p(po,zo,uo)2 n - 1,
providedthat m = 1,
or p ( p o , x o ,U O ) ) 2 2,
provided that n > 2m, or n > m + 1 and m is odd.
Remark. If m is even, then the assumption n > m+ 1 is not sufficient for the property of the above proposition to hold. Even more, there exists a system Il represented by the germ at (Q, U O ) of an f such that for any real direction p E T:oX we have p ( p , 50,U O ) 5 1. However, there always exist complex directions p E Cm such that the (complex) multiplicity satisfies the inequality p(p,zo,uo) 2 2. In this paper we will not analyse our systems near complex directions, however.
As we are interestedin analysing a system around a given point (xo,U O ) , we can choose our direction freely. It will be reasonable to choose the direction p~ so that the multiplicity is large, or at least larger then one. Until now all our considerations were local with respect to all variables p , z, and U. In many cases it is more reasonable to analyse the system globally with respect to U . In order to do this we define the global multiplicity at a point zo = (PO,20) as the sum of local multiplicities over all critical points PkO) =
c
lU(Z0,U).
uEU:7(zO)
We introduce the global finite multiplicity condition as
This condition means that there are only finitely many critical points in the set U&(ZO),each of them of finite multiplicity (i.e. satisfying (FM)). From our previous proposition it follows that this multiplicity is usually larger then one, if the direction is chosen appropriately.
Critical Hamiltonians and Feedback Invariants
235
Example 3.4. Let us consider systems which are polynomial of order 3 with respect to scalar control U E U = W,
with fi, i = 0 , 1 , 2 , 3 , analytic vector fields on X. The Hamiltonian takes the form
H = ao(z) + ual(z) + u2a,(z) + u 3 a 3 ( x ) , with ai(z)= p f i ( s ) . Any point zo = (PO, 20)such that POf3(20)
#0
is of global multiplicity 2 or 0, depending whether the critical equation al(Zo) -k
2UUz(Zo) -t3U2U3(%o)= 0
has real or complex roots. A point (PO,20,U O ) satisfying the critical equation is of local multiplicity 1 if A # 0, and of local multiplicity 2 if A = 0, where
82H
A = =(PO,
+
20,U O ) = ~ P O ~ Z ( ~ O 6)~ 0 ~ 0 f 3 ( 2 0 ) .
4. Critical Hamiltonians and symbols In this section we will introduce critical Hamiltonians and symbols, which are complete invariants of pure feedback equivalence.Later theywill form a basis of our construction of microlocal and local feedback invariants. Let us fix a critical point ( Z O , U O ) E T*X x U of finite multiplicity p (that is, a point satisfying (FM)).It follows fromProposition 1.1 that there exists a complex neighborhood W1 of Z O , a polydisc, and a nowhere dense analytic subset D C W1 such that for any point z1 in the open set W1 \ D one has p complex critical controls u1 ( z ) ,. , . ,U,(%) in a neighborhood of zl. We define critical Hamiltonians as
Hl(Z)= 3 ( Z , U 1 ( % ) ) , . . . ,H,(%) = fi(a,u,(s)). In a neighborhood of functions.
a1
in
W1
\D
these are well defined holornorphic
Jakubczyk
236
However, ingeneral they can not be extended as holomorphic functions to thewhole W1. Even worse, they can not be defined as univalent functions on the whole W1. For example, if H = a(z)u+b(z)u3,a = pg(s),b = p h ( z ) , and a(z0) = 0, b(z0) # 0, then the critical equation BH/& = 0 has two solutions
which are not defined as univalent functions in a full neighborhood of zo (note that the coefficient a = p g ( z ) changes sign when p varies near p0 such that pOg(lc0) = 0, if g(z0)# 0). The second drawback (the lack of univalent extensions) can be eliminated by introducing the multivalued critical Hamiltonian in the following definition Hc,(z) = E ( z , U c r ( z ) ) ,
for z E Wl and thecritical set VC,defined in Proposition 3.1. Additionally, it follows fromthis proposition that H,, is finite valued, continuous on W1, and holomorphic on W1 \ D. The first drawback of critical Hamiltonians can be eliminated by introducing symmetric functions of critical Hamiltonians. Using a symmetrization procedure explained below we will gain not only better regularity but also effective computability of our functions (explained in Section 7). We introduce the following functions, called symbols of n. For z E W1 \ D we define where q ( z ) , . . . , u m ( z ) are complex critical controls, i.e. solutions of (C). For z E D we put sk(Z) =
lim Sk(z), Y+Z
where y E W1 \D. Correctness of this definition (and continuity of Sk)follows from statement (d) of Proposition 3.1.w e see that sk are symmetric functions of our critical Hamiltonians, i.e.
S&) = H&)
+ ...+ H J Z ) ,
Critical Hamiltonians and Feedback Invariants
S2(z) = (Hl(2))'
237
+ .. + (HP(z))', ,.
.
+. +(H,(Zp
&(z) =
whenever right hand sides are well defined. For systems polynomial with respect to scalar control U we shall compute thesymbols explicitly using complex logarithmic residua (Section 9. For polynomials of order 3 this can be done directly.
Example 4.0. Consider a system which is polynomial of order 3 with respect to scalar control U E U = P (asin Example 3.4). The Hamiltonian is of the form
+
+
H = ao(z) u a l ( z ) u'az(z)
+ u3a3(z),
with aa(z) = p f i ( z ) . We assume that a 3 ( ~ 0#) 0 and the critical equation
has two real solutions at zo
(they become complex for z # zo, if a! tonians are
- 3a1a3 < 0)
The critical Hamil-
and one can easily compute the global symbols
S 1 = H + + H - = 2 a o - - - 2 ala2 I 4 a! 3 a3 27aE' 4 a o a l a ~ 8 aoa; Sz= (H+)'+ (H-)' = 2ai - -+" -"8 a! 3 27a3 ai 27a327
-"l6 ala: 81 a!
+--14a:ai ai
16 ai +"729 ai'
Note that even if the critical Hamiltonians are not real analytic, the symbols are if only a3 # 0.
Jakubczyk
238
In general the symbols are not only defined on the whole ofW1, but they are also regular and real analytic.
Lemma 4.1. (a) The functions &, k 1 1, are holomorphicon W1, and are also real analytic. (b) The functions SI, are polynomials of SI, . . ,S ,, for k > p. (c) The function Sk is homogeneous of order IC with respect to p , i.e., I
Sk(tp,z)= tksk(p,z), when both sides are defined (but it is not necessarily a polynomial of p ) . The second statement of the lemma shows that we can deal with the first p functions sk, only. Therefore, we define the complete symbol of the system ll at a point (zo, u O ) (of finite multiplicity p ) as S = (Sl,.* . , SP)* The above definition is local with repect to z = (p,z)and U. It can be modified to a definition which is global with respect to U under the global multiplicity assumption. Namely, we repeat our construction of critical Hamiltonians and symbols as follows. From our global finite multiplicity assumption (GFM)and from Proposition 1.1 it follows that there are finitely many real critical points
W,
S .
,UY
E U,'(ZO),
v 5 p*
For each wa = ( Z O ,ui), i = 1,.. . , v , we choose a neighborhood W"i = W,Wi x W,Wi of this point as the one defined in statement (c) of Proposition 3.1. For any other point W = (ZO,U ) E {eo} x U we choose a (complex) neighborhood W w of this point in the domain of H so that it does not contain any critical points, i.e. points satisfying the equation (C). Then, we define W=
U
W",
wE{ro)xU
which is a complex neighborhood of the set { d o } x U, on which a holomorphic extension i? of H is defined. Let the set W1 be defined as the
Invariants Feedback Hamiltonians and Critical
239
intersection of all neighborhoods W?', i = l,. , , ,U corresponding to critical points. We define the discriminant set D C Wl as the union of all discriminant points in W,wi corresponding to critical points w l ,, , , ,W". We can now define the global (with respect to U ) counterparts of the critical set V,,.(z), the critical Hamiltonians HI(%), .,. ,H,(%) and H,(%), and the symbol S(%)= (SI(%), . .. , S,(z)) in precisely the the same way as earlier, except that our multiplicity p = p(z0) is global and so we take into account all critical points bifurcationg from w1,. . ,w y . They have all properties mentioned in Lemma 4.1 (this follows from the local results). The propertyof real analyticity of the symbol mentioned in Lemma 4.1 is very important as it allows us to perform all operations applicable to analytic funcions (for example, taking their Poisson brackets as we will do later). Also homogeneity properties of symbols will be used, they anable us to analyse our systems on open cones in the cotangent bundle (i.e., microlocally), The symbol S = (SI, . , , ,S,) has also another crutial property; it contains all information about the system II, up to a parametrization of control. Namely, the following basic result holds.
.
Theorem 4.2. If two analytic systems II and fi on X = 2 satisfy theconditions (Rl), (R2)at (ZO,UO) and (zo,i.Zo), respectively,then for any nonzero direction p0 E T&A such that ( p o , 20,uo) and (po,zo,Zo) have the same finite multiplicities the following conditions are equivalent, locally around these points. (a) II and fi are locally pure feedback equivalent at (50,uo) and (XO,GO). (b) H,, = H,, locally around (p0,zo) and ( ~ 0 ~ x 0 ) . (c) S = 3 locally around (p0,zo) and (p0,oo). I f , additionally,thecondition of global finitemultiplicity (GFM)is satisfied at (PO,xo),and the sets of admissible velocities F and F are closed and connected, then the above statements hold globally with respect to U (with global definitions of U,, and S ) . Note that the (complex) trajectories of the finite-valued Hamiltonian vector field corresponding to the Hamiltonian H,, are the critical trajec-
Jakubczyk
240
tories of the system. Therefore, the equivalence of conditions (a) and (b) of the above theorem implies the result stated informally in Section 2. Our further analysis will be based onequivalence of conditions (a) and (c). Namely, the above theorem allows us to reduce the problem of feedback equivalence to an equivalence problem of symbols. More precisely the problem reduces to the problem of equivalence of symbols under tranformations of the cotangent bundle T * X which are liftings of transformations of the base X . We are going to solve this problem in the following two sections. More precisely, our reduction will be based on the following theorem, which is a consequence of Theorem 4.2. For a diffeomorphism 0 : 2 + X we denote by 0, : T . 2 + T * X its cotangent lifting defined by @,E)
a@
= @*@,F) = @(-(Z))-l,@(Z)). 6%
Assume that ll and fi satisfy the conditions (Rl), (R2), (FM) at( z o , ? ~ ~ ) and (ZO,EO),respectively. Theorem 4.3. (a) If II and transformations E
fi are locally feedback
= a(%),
U
equivalent by the
= Q(Z,G),
then the corresponding local symbols are related by the cotangent lifling of 0 , i.e. sk
0
@* = g k ,
for k 2 1.
(b) Vice versa, if there exists a local diffeomorphism 0 such that the equalities
- .
S1 0 0, = SI,.. ,S,
-
0
9, = S,
hold in a neighborhood of 20,then systems l3 and fi are locally feedback equivalent at ( X O , U O ) and (Zo,Go),respectively. If in addition the global finite multiplicity condition(GFM)is satisfied, and F and F are closed and connected, then the above statements hold for the global symbol S and feedback equivalence global with respectU.to
Critical Hamiltonians and Feedback Invariants
241
5. Microlocal equivalence Before trying to understand the local geometry of our systems 11, we will attempt to realise an easier task. Namely, we will find invariants of microlocalfeedbackequivalence,wheremicrolocalwill mean local with respect to state3 and direction p . In order to do this we extend a system II to thecotangent bundle by adding a natural linear differential equation for adjoint variable p (covector) in the same way as it is done in time-optimal control theory. We will analyse the new system locally with respect to p and 3.More precisely, we will introduce invariants which will describe the original system in a neighborhood of a given point x0 and a given direction defined by a line R p o C T & X (microlocally). We introduce the cotangent extension of the system 11 as the system IT* on T ' X defined as
This definition makes sense (i.e., the derivative B f /Ox is well defined) if X = Rn and the tangent spaces T,X at different x are identified by the standard shift in
Etn.
Remark. If X is a differentiable manifold then the system 11* can be defined as follows. Namely, in any coordinate system we can use the above definition. The same system can also be written in the form
x=-BH p=" BP '
BH ax '
or equivalently
d
%(P,x) = R@'2, U ) ' where R(., U) is the Hamiltonian vector field on T * X corresponding to the Hamiltonian H ( - , U) via the canonical symplectic structure on the cotangent bundle T * X . This shows that our definition of 11* is independent of the choice of coordinate system. a,
a ,
Suppose that we also have another system fi with the cotangent exten-
Jakubczyk
242
sion
5:
-z = f ( E , G ) , p”=-p:(E,G), -8.78X ”
@,E) E T * Z , G E 6.
Let us recall our notation z = ( p , z ) ,Z = @,E) for the elements of the cotangent bundle. By a conic neighborhood of a point (po, zo) in T * X , p0 # 0, wewill mean a neighborhood V of this point which is invariant under multiplication of p by nonzero real numbers, i.e., (p,z) E V implies (tp,z)E V , for any t f 0.
Definition 5.1. Two systems If and fi will be called micmlocull~feedback equivalent at thepoints (zo, uo)and (& Go), respectively, if there exist invertible analytic transformations ( z , U ) = ~ ( 2G), , (zo, uo) = x(Z0, Go), of the form
defined on a conic neighborhood of (PO, $0, GO), with q a symplectomorphism, which transform 11* into g*. Note that if the two systems are locally feedback equivalent at (20, uo) and (EO,GO)via a transformation ( Q , Q), then they are microlocally feedback equivalent at (po, z ~u ,~ and ) @o, EO,Go) for any po, such that j70 = p 8 @ / 8 E ( Z o ) , with the transformations giving the microlocal equivalence of the form
where Q, is the cotangent lifting of a. In general, microlocal equivalence does not imply local equivalence as it will be shown at the end of this section. Let us assume that a critical point (z0,uo) = ( p g , z o , u o ) is of finite multiplicity 2 5 p < m. Then the symbol S introduced in the preceeding section is well defined in a conic neighborhood of the point zo = (po, zo), that is, we are given p real analytic functions
Critical Hamiltonians and Feedback Invariants
243
Let us introduce their Poisson brackets
and the iterated Poisson brackets Sili$...ik
= { s ~ ~ , ~ ~ ~ ~ { s ~ ~5 i_l ,~* *,* s, i k~ ~ } " ~ } ,
It is not difficult to prove that if two systems are microlocally feedback equivalent via the transformations (MFT), then their critical Hamiltonians and their symbols are related by q, i.e.,
-
si o q = si,
i = l, . . . , p .
As symplectomorphisms preserve the Poisson bracket, we have a more general property. Proposition 5.2. If II and fi are microlocally equivalent via the transformations ( M F T ) , then Sili2..sik
= Sdli2...ikn
The iterated Poisson brackets of symbols S I ,. . , , S, form a family of real analytic functions SW = Sil**,ik
defined locally around zo E T*X for all words W=il
... ik,
w i t h k 2 1 , and 1 5 i l ,..., i k < p .
We will denote the set of all such words by A* (they are simply all words composed of the letters from the finite alphabet A = {l,.. . , p ) ) . We will say that the system II satisfies the rank condition at a point zo = (p0,zo) if the following condition is satisfied.
(RC)
There exist words w l , , . . ,WZ,, such that the differentials dSwl,. . . . . , ,dS,,, are linearly independent at Z O .
Theorem 5.3. Assume that systems II and fi satisfy conditions ( R l ) and (FM) at pol 2 0 , U O , and fjo, ZO, GO,respectively. Then the following statements hold.
Jakubczyk
244
(a) If systems II and fi are microlocally equivalent at points (z0,uo) and (&,GO),then the following conditions are satisfied
(A).
(B)
P(X0,'LLO)
= P ( Z 0 , Go),
2
=si~...ik(%)3
sil...~k(zo)
'5
21,'*',Zh
5 P'
(b) Vice versa, if II and 0 satisfy theconditions ( A ) and ( B ) and they satisfy the rank condition ( R C ) at zo, YO,respectively, then they are microlocally feedback equivalent at these points. Additionally, the transformation x = ( q ,$) establishing this equivalence is unique, and q preserves the Liouville form, i.e., q*(Cpidxi)= Ci7idZi. Suppose that both systems satisfy the rank condition at (zg,ug) and
(&,G,-,), with the same words
W = (Wl,.
v . ,
WZn).
Then
= (SW,
3
-
* 7
SW,,)
and
= (SW1
*
* 3
swan)
form coordinate systems around these points. It follows that local diffeomorphisms a;' and Z;' are well defined. The following result follows from Proposition 5.2 and Theorem 5.3.
Theorem 5.4. Theorem 5.3 holds with condition ( B ) replaced by the condition
(B1)
-
&l...ik
00,'
= S i l . . . i k 0 0 m 13 "
k 2: 1, 1 5 i l ,
,ik
5 P,
where the equalities of the functions hold in a neighborhood of the point so = a&))
= Zm(Z)).
The collection of numbers in condition (B), m well as the collection of functions in condition (Bl)can be regarded as sets of complete invariants of microlocal feedback equivalence (note that the multiplicity p is implicitely contained in these families). Obviously, the second set of invariants is too rich, in a sense. However, it is often more useful to use the set of functions (condition (Bl)), rather then .touse its values at the given point of consideration, only (condition (B)).
Critical Hamiltonians and Feedback Invariants
245
Corollary 6.5. The unique transformation q establishing equivalence in Theorems 5.3 and 5.4 is of the form
Remark. From the above corollary it follows that, once we are able to find the unique transformation q establishing microlocal feedback equivalence of both systems, we can also establish whether the systems are locally feedback equivalent. Namely, it is enough to check whether the (unique) transformation q is the lift of a diffeomorphism 4 : j? -+ X.We will state this as a formal result in the following section. An efficient way of finding the transformation q establishing microlocal (or local) feedback equivalence is suggested by the following proposition. Proposition 5.6. If q is a symplectomorphism from Definition 5.1 establishing microlocal feedback equivalence of systems TI and E, then we have the equality
-
Toq=T
-
for any functions T and ? obtained from SI,,. . ,S, (from 31,. . . ,S,, respectively) b y a finite number of operations of the type: (i) taking Poisson (iii) taking bracket, (ii) takinglinearcombinations,andmoregenerally, compositions of a finite number of functions with an analytic jhction of several variables.
,.
For example, if Si o q = Si,i = 1,.. . , p , then
,si
i-35;
-
"
4-35;
The proof of the proposition follows from the equality Si o q = & and the fact that each of the operations (i), (ii), and(iii) preserves the equality in the proposition. We complete this section with an example showing that, without further assumptions, microlocal feedback equivalence does not imply local feedback equivalence. In our example the rank condition will be satisfied, therefore it will be possible to find the transformation q establishing this
Jakubczyk
246
equivalence (which is unique by Theorem 5.3). We will see that it preserves the Liouville form but it is not a lift of a diffeomorphism of the basic manifolds. Example 5.7. Let us consider a family of systems like in Example 2.2 with X the half-plane {(SI,x2) : x1 > 0) and U = S1the circle,
+
x = V(X) vlgl(x) + v2g2(x), v: +v; = 1, x1 > 0, where the vector fields
a
8
g1 = 6%' Q2 = a 8x2
depend on two parameters a and b. We will find systems which are microlocally (but not locally) equivalent for different valuesof these parameters. From Example 2.2 we know that the global multiplicity of this system at any point (p, x), p # 0, is p = 2 and the symbols are
+
S1= 2pV = 2~1x2 26~2, S2 =
1 + 2(pg1)' + ~ ( p g 2 =) ~-S: + 2p:x1+ 2
2a2pi.
Consider two systems I'I and fi of the above form, with the parameters (a, b) and (E,;), respectively. Suppose that they aremicrolocally equivalent at all points which correspond to each other via a transformation (p,z) = q ( F , Z ) , which is a symplectomorphism. This implies the equality of the symbols, up to the transformation 9: Sl(p,X) =&E,% SZ(p,X)= &E,Z),
b,X)
=rlE,q,
(this follows already from Theorem 4.3 applied to the lifted systems ll* and F). Also, their iterated Poisson brackets are equal, up to the same transformation. Even more, we can define the functions
1
TI = ;zS1= plxz
+ bpz,
1 1 2 Tz = 5S2 - 4s: = p1xl + a'pi,
and applying Proposition 5.6 we get the equality of functions Tl (p,2) =
Hamiltonians and Critical
Feedback Invariants
247
The equalities p1 = 51,p2 = F2,and TI( p ,z) = F1 imply &x2
+6
2
=
+F&
These equations and G2 = a2 transformation
and
+ 1,
+ a'$
F,Z), Tz(p,x) = F2 (F,2) = $51
+ Z2$.
= b - 2 allowus to compute the
It is easy to see that thistransformation(togetherwith p1 = F1 and p 2 = &) establishes microlocal feedback equivalence of both systems as it preserves the Liouville form, i.e., p l d z l + p z d ~= j71d31 +i32dEz. On the other hand, the trasformations of 2 1 and 82 depend on the variables F1 and &, and so the unique transformation 7 does not give local feedback equivalence.
6. Local feedback equivalence From Theorem 5.3 it follows that, under the rank assumption(RC), the numbers Sil,..ik(zo) contain complete information about the system If, up to microlocal feedback transformations (MFT) (Definition 5.1). However,
Jakubczyk
248
it follows from Example 5,7 that the invariants from condition (B) of Theorem 5.3, and even the ones from condition (Bl)of Theorem 5.4, do not form a complete system of invariants of local feedback equivalence. They “know all” about the symplectic form on T * X ,and even about the , it is impossible, in general, to determine Liouville form a = C p i d z i but from these numbers the natural fibration of the cotangent bundle. This is because they “know” about the form a only away of the trivial section of T * X . In order to identify the cotangent fibration we have to introduce an additionalinvariant. It will be called polarization with analogy to the terminology used in quantization theory, and it will represent locally the cotangent fibration. Suppose that both systems satisfy the rank condition at (~0,210)and (Zo,i&),with the same collection of words Ti7 = (wl,. . . , w ~ ~Then ) . the families of functions
-
= ( S W 1 1 . * * 9 Swan),
-
I
= (SzUlt
* * 9
swan)
form coordinate systems (local diffeomorphisms) around these points. Our polarization is defined as the involutive distribution given in coordinates U, by the formula
where z E T*X near zo, and
L(%)= T,(T,*X), z = ~ ( z ) is the tangent space at z to the cotangent fiber T,+X (T : T*X -+ X is the natural projection).
Theorem 6.1.Assume that ll and fi satisfy the regularity assumption (Rl) at given points ( Z O , U O ) and (Zo,.iio). Then the following statements hold. (a) If II and fi are locally feedback equivalent at (ZO, U O ) and (ZO, GO), and they satisfy the finite multiplicity condition ( F M ) at some (p,ZO,U O )
Invariants Feedback Hamiltonians and Critical
249
and F,EO,CO),then thereexistdirections p0 and (in fact, many of them) such that the condition (FM)issatisfied at ( ~ O , I C ~=, U(zo,uo) ~) and (Fo, EO,GO)= (ZO,GO),and the following conditions hold.
(A) (B)
(C)
- A z o , Uo) = W
O , CO).
si1,,.ik(zO) = Sil.,,dk(zO),2
1, 1 5 i l ,
,ik
5 /.h.
A, = A, in a neighborhood of SO = oa(z0) = 5,(Zo),
(To have ( C ) we additionally assume the rank condition ( R C ) at zo and ZO, respectively.) (b) Vice veraa,if 030, SO, uo) and (PO, 20,CO)are j h e d , II and fi satisfy the finite multiplicity condition( F M ) and the rank condition ( R C ) at these points, respectively, and they satisfy the conditions ( A ) , ( B ) (equivalently, (B1)), and ( C ) , then they are locally feedback equivalent at (Q, uo) and
(Eo,Go). It follows then that the triple {~(zO,~O),{S~(~O)}urEA*,A~}
forms a complete system of invariants of local feedback equivalence. The condition (C) in the above theorem can be replaced by other conditions as stated in the following result. Theorem 6.2. Theorem 6.1 remains true if condition ( C ) is replaced by one of the following conditions:
(here 7r : T*X + X is the natural projection),
(C2) the map a;'
o Z,jj has an analytic extension to the point
(0,Eo),
From Corollary 5.5 it follows that thetransformation establishing equivalence of II and fi is of the form r] = a;' o Z0 when considered aa a transformation of cotangent bundles. Then condition (Cl) simply means that q preserves the cotangent fibration. Together with the fact that it preserves the Liouville form it follows that it is a lift of a diffeomorphism
Jakubczyk
250
4 : 2 + X , where ~=?rou,-~oijaoi~
and ix : X + T * X is the natural embedding of the trivial section of T * X . Condition (C2) guarantees that the transformation r] has an analytic extension to the trivial section of T * X , and so the Liouville form is preserved at the trivial section, too. This also guarantees that the map 77 is a lift of a diffeomorphism of the base. The most efficient way of finding the diffeomorphism 4 establishing local feedback equivalence isto use Proposition 5.6. After finding sufficiently many functions T and 5 satisfying T o r] = 5 (this is possible if the two systems are microlocally feedback equivalent) we may compute 7 explicitely, if the rank condition is satisfied. Then, from Corollary 5.5 it follows that r] = a;' o ija and so we may check whether one of the conditions (C), (Cl), or (C2) is satisfied. If it is so this means that r] is in fact the lift of a diffeomorphism 4 : 2 + X given by the formula in the preceeding paragraph. There is a class of systems II satisfying certain strong rank condition (global with respect to directions p ) for which microlocal feedback inriants in Theorems 5.3 and 5.4 form a complete set of invariants of local feedback equivalence. This will be discussed in a future paper.
7. Logarithmic residua and symbols In this section we present explicit formulas for the symbols Sh of our system using logarithmic residua of functions of one or several complex variables. Let us consider again the complexification 3 = U ) of the Hamiltonian of our system. In fact,we will need the variable U be complex, while the variables z = ( p ,U ) will play a role of a parameter and could be taken real only. . have the following Let us denote 3;= aa/au and 2;= 8 2 3 / 8 u 2We result in the case of scalar U .
s(z,
Critical Hamiltonians and Feedback Invariants
25 1
Theorem 7.1. (a) If (z0,uo) is of finite multiplicity, i.e. it satisfies ( F M ) , then the following complex integral formula holds for the local symbol Sk:
with E > 0 small enough. (b) If zo as of finite global multiplicity, i.e. it satisfies the asssumption (GFM), then the following formula holds for the global symbols
with e > 0 small enough, where & ( z ) in Section 3.
denotes the real critical set defined
Let us note that the integral formula for the symbol Sk can also be written in the form
1
1 2Ti
gk(z,u)d,(ln
‘L(z, U ) ) ,
h
\iYLl=e
(with d, denoting the differential with respect to U ) hence this integral is called logarithmic residuum. Using the above theorem we can compute explicitely the symbols in the case of systems polynomial with respect to control, or more generally, when they are given by a power series with respect to control. Namely, let us suppose that the Hamiltonian is the following polynomial function of scalar U , ~ ( zU ), = ao(z)
+ a I ( z ) ( u- uo) + . . . + a N ( z ) ( u-
Let us assume that al(z0)
= . . . = ar-l(zo) = 0 and ar(zo) # 0.
Theorem 7.2. (a) If r = N , then the point (zo,uo) is of multiplicity p = N - 1 and the local (aswell as global) symbols Sk,k 2 1, are given by the following rational functions of the coeficients ao, . . . , aN (infact they
Jakubczyk
252
are polynomial with respect to ao, .
. ,aN-1):
where the sum is finite and taken over the following set of indices: O I i l , ...,
25j5N,
lSjl,..,,jeSN-l, l10,
il+ . . . + i k + j + j l + . . . + j e = ( l + l ) N . (b) If 1 Ir IN - 1, then the same formula holds for the global symbol around a point zo under the assumption that p(z0) = N - 1, i.e. when the polynomial equation OH "OU (20, U ) = al(zo) 2a2(zo)u . NaN(zo)uN-l = 0,
+
+ +
has exactly N - 1 roots in U C R (counted with their multiplicities,cf. the remark stated after this theorem). (c) Finally, if the Hamiltonian is given by the series H ( Z ,U ) =
C ai(z)(ui>O
.
converging for ( z , ~ in ) a neighborhood of ( z o , ~ ~and ) , if al(z0) = , , aN-l(zo) = 0 and a N ( z 0 ) # 0 , then the local symbols sk are given by the formula in statement (a), where the sum is infinite and is taken over the set of indices
O S i 1 , ...,i k s N ,
25j5N,
j ~ , . . . , j t ? Lj ~ # N . . . , j e # N , L 2 0 , il+
...+ i k + j + j l +
. . . + j e = (e + l ) N .
This sum converges absolutely for z in a (possibly smaller) neighborhood of zo *
Remark. The assumption of statement (b) can be equivalentlystated as the condition that r < N and the equation
+
+ . .. + NaN(Zo)UN-' = 0,
ra,.(zo) + ( r l)a,.(zo)u
Critical Hamiltonians and Feedback Invariants
253
having exactly N - P roots in U (counted with multiplicity). This condition is easyto check, if U = R and thedifference N - r is not large. In particular, if N - P = 1, then this assumption is equivalent to the condition
. , N - 2,
ai(z0) = 0, i = l,..
If N
alv-l(zo)
- P = 2, our assumption is equivalent to ai(z0) = 0, i
= 1,.. . , N - 3,
# 0 # alv(z0).
the conditions
aN-2(zo)
# 0 # alv(zo)
and
(N -
- 4 N ( N - 2)ahrap&" 2 0.
Logarithmic residua can also be used for computing symbols in the caSeof multidimensional control U. However, in this case we have to deal with the theory of holomorphic functions of many variables and one should not expect to obtain results in computationally complete form (contrary to the polynomial case with scalar control). In order to state one result concerning this case let us denote
aii
83
h1 = -, . . . , h m = -. 8% 8Um We assume that the complex mapping
is proper. For positive numbers cycle
E
= (el,.
. . , E m ) we define a (complex)
r,(z) = {U E Cm : Jhl(z,u)l= Em,. .. , Ihm(z,u)I = Em}. n o m Sard's theorem it follows that for a generic E the cycle F,(%) is a submanifold of Cm of real dimension m. It is compact by our assumption on h, that is why we call it a cycle. We willalso introduce the "real part" F:(z) of F,(z) defined as the collection of these components of F&) which converge to { z o } x U, when z converges to zo and E converges to 0 E Rm (we want to eliminate cycles which are not born on the real manifod { z o } x V).
Jakubczyk
254
Theorem 7.3. (a) If map h is proper,thenthe integral
is a point of finite multiplicity andthe local symbol Sk at thispointisgiven by the
(ZO,UO)
where d , denotesthe(holomorphic)differentialwith
-
respect to
U
=
( ~ l r *rum)*
(b) If the global multiplicity ~ ( z o is) finite, then the global symbols S h are given by the same formula, with the cycle r , ( z ) replaced by the “real” cycle
I’:.
For a Hamiltonian which is polynomial with respect to U we can state additional properties of the symbols sk,analogous to theproperties in the scalar case. In particular, it can be proved that the symbol is a rational function of the coefficients of the polynomial. Acknowledgments.In thecourse of writing this paper have I profited from discussions with many colleagues including B. Bonnard, J.-P. Gauthier, V. Jurdjevic, W. Respondek, and M. Zhitomirskii. Special thanks are to Ivan Kupka who suggested a way of extending some of the results to the case of multidimensional control. The final version was written when the author was visiting INSA-ROUEN as invited Professor of the Institut Universitaire de France.
REFERENCES [Ag] A. A. Agrachev, Methods of ControlTheory in Nonholonomic Geometry, in Proceedings of the International Congress of Mathematicians, Zurich, Switzerland 1994, 1473-1483, Birkhhuser, Base1 1995. [AEGK] A. A. Agrachev, E1-H. C. El Alaoui, J. -P. Gauthier, I. Kupka,
Critical Hamiltonians and Feedback Invariants
255
Generic Singularities of Sub-Riemannian Metrics onR3,Comptes Rendus Ac. Sci. Paris 322,SBrie I (1996), 377-384. V. I. Arnold et al. Singularities, Local and Global Theory, in “Dynamical Systems VI”, V.I. Arnold ed. Encyclopaedia of Mathematical Sciences, Springer 1993. B. Bonnard, Feedback Equivalence for Nonlinear Systems and the Time Optimal Control Problem, SIAM J. Control and Optimiz. 29 (1991), 1300-1321.
B. Bonnard, Quadratic Control Systems, Mathematics of Control, Signals, and Systems 4 (1991), 139-160. R. W. Brocket, Feedback Invariants for Nonlinear Systems, in Proc. IFAC Congres, Helsinki 1978.
R. L.Bryant, R. B. Gardner, Control Strmctures, in “Geometry of Nonlinear Control and Differential Inclusions”, B. Jakubczyk, W. Respondek, T. Rzeiuchowski eds. , Banach Center Publications 32, Warszawa 1995, 111-121.
R. B. Gardner, “The Method of Equivalence and Its Applications” , CBMS Regional Conference Series in Applied Mathematics, CBMS 58, SIAM 1989. M. Gromov, Carnot-Carathdodory Spaces Seen from Within, in “Sub-Riemannian Geometry”, A. Bellai’che, 1.-J. Risler eds. Progress in Mathematics 144, Birkhauser 1996, 85-323. L. R.Hunt, R. Su, G. Meyer, DesignforMuti-InputNonlinear Systems, in “Differential Geometric Control Theory”, R. Brockett, R. Millman, H. Sussmann eds. 268-298, Birkhauser 1983.
A. Isidori, “Nonlinear Control Systems; An Introduction”, Springer, New York 1989. B.Jakubczyk, Equivalence and Invariants of Nonlinear Control Systems, in “Nonlinear Controllability and Optimal Control”, H. J. Sussmann (ed.), 177-218, Marcel Dekker, New York-Basel, 1990.
256
Jakubczyk
B. Jakubczyk, Microlocal Feedback Invariants, preprint 1992. B. Jakubczyk and W. Respondek, O n linearization of control systems, Bull. Acad. Polon. Sci. Ser. Sci. Math. 28 (1980), 517522. B. Jakubczyk and W. Respondek, Feedback Classification of Analytic Control Systems in the Plane, in “Analysis of Controlled DynamicalSystems”, B. Bonnardet al. eds. , 263-273, Progress in Systems and ControlTheory 8, Birkhauser, BostonBase1 1991. B. Jakubczyk and M. Zhitomirskii, in preparation. A. J. Krener, On the Equivalence of Control Systems and the Linearization of Nonlinear Systems, SIAM J. Control 11 (1973), 670-676. I. Kupka, O n Feedback Equivalence, CanadianMath. Society Conference Proceedings, Vol. 12, 1992, 105-117. I. Kupka, Linear Equivalence of Curves, manuscript 1992. R. Montgomery, A survey o n Singular Curves in Sub-Riemannian Geometrg, Journal of Dynamical and Control Systems 1 (1995), 49-90. W. Respondek and M. Zhitomirskii, Feedback classification of Nonlinear Control Systems on 3-Manifolds, Mathematics of Control, Signals, and Systems 8 (1996), 299-333. H. J. Sussmann, An Extension of a Theorem of Naganoon Transitive Lie algebras, Proc. Am. Math. Soc. 45 (1974), 349356. M. Ya. Zhitomirskii, Typical Singularitiesof Diferential l-Forms and Pfafian Equations, Translations of Mathematical Monographs, vol. 113, AMS, Providence, 1992.
7 Optimal Control Problems on Lie Groups: Crossroads Between Geometry and Mechanics V. Jurdjevic Department of Mathematics, University of Toronto 100, St George Street, Toronto, Ontario, M5S 1Al Canada
Abstract. The paper presents in a uniform framework several problems from classical mechanics and geometry. The framework is the geometric language of optimal control theory, where the Lagrange variational formalism is extended to Pontriagin’s Maximum Principle. The problems considered include the dynamic equations of the rigid body, the ball-plate problem, various versions of Euler elastica problem and a relatedDubbin’s problem. The statespace in each case is a Lie group. Using the Maximum Principle and invariance of the problems with respect to a natural action of the Lie group it is possible to compute explicitly the optimal solutions. The geometry of the solutions is analyzed and strikingconnections between different problems are revealed.
Introduction Optimality and feedback, two basic concerns of modern control theory, provide mathematics with new perspectives from which someof its theories can be seen in distinctive and interesting ways. The point of view offered by these perspectives stirs up certain classical ideas from the past and 257
258
Jurdjevic
moves them along new directions to both unexpected and common destinations. This paper will follow a particular direction motivated by optimal control theory through Pontryagin’s Maximum Principle and the associe ted Hamiltonian formalism. The attention will be focused to systems on Lie groups as a natural geometric setting for problems of mathematical physics, elasticity, differential geometry and dynamical systems. The method of moving frames in problems of classical mechanics and its counterpart in geometric problems on Riemannian spaces whichlift to their frame bundles, lead to differential systems onLie groups and make a naturalcontact with control theory. Many such problems which are fundamentally variational, when viewed as optimal control problems, can be effectively analyzed through the Maximum Principle. This approach leads directly to the underlying geometry of the solutions, which then serves as a guide for exploiting further symmetries. The significance of the control theoretic approach to problems of classical mechanicsis best illustrated through the equations of motion for the heavy top, and it is for that reason that that this paper contains a complete derivation of its classical equations. The Plate-Ball problem, introduced recently by Brockett and Dai further illustrates the use of the Maximum Principle in mechanics. In contrast to theEuler-top, which can be associated with a left-invariant Riemannian metric on SOs(R), the Plate-Ball problem corresponds to a sub-Riemannian left-invariant metric defined over a two dimensional distribution in a five dimensional group R2 x SO3(R). ([15]) The problem of describing the equilibrium configurations of a thin elastic rod in R3 is a classic companion of the heavy top due to a famous discovery of Kirchhoff in 1859 known as thekinetic analogue for the elastic rod. In this paper we refer to this problem as the elastic problem, and we show that Kirchhoff’s equations for the elastic rod are the extrema1 equations of an optimal control problem on the group of motions of R3. The control theoretic formalism shows that in spite of Kirchhoff’s Theorem, the geometry of the solutions for the heavy top, isverydifferent from the geometries of the solutions of the elastic problem largely due to the
Optimal Control Problems on Lie Groups
259
non-holonomic nature of the elastic problem, This difference is best seen through the planar elastic problem and itsrelations to its kinetic analogue, the mathematical pendulum. Inspired by the recent studies of differential geometers ([7],[lo], [19], [ZO])we also include the extensions of the elastic problem to non-Euclidean spaces of constant curvature. We treat these problems BB optimal control problemson the isometry group of the space. For a surface form, this approach leads to optimal control problems on E2, SO3(R) and SO(2, l), while the corresponding extensions for a space form leadto optimal control problems on E3, SOr(R), and SO(3,l). The extremal equations for a surface form reveal the geometry described by the conservation laws which reflects the non-holonomic nature of the problem, and which ultimately leads to a complete description of the solutions. For a space form, the extremal equations make an interesting associated with Hamillink with various integrability studies ([18], [12]), toninan systems on Lie algebras, and in particular, provide a geometric context for an abstract study of 0. Bogoyavlenski for Hamiltonian systems on six dimensional Lie algebras ( [ 5 ] ) . Dubins’ geodesic problem in R2 ([a], [26])with constraints on the curvature, considered as a time optimal control problem on the group of motions of R2 with the control function u ( t ) constrained by 1u(t)l 5 1, completes the pot-pourri of mechanical and geometric problems discussed in this paper which illustrate the general theory. All the problems treated in this paper are of the following type: G denotes a Lie group, and C(G) denotes its Lie algebra. denotes a leftinvariant vector field on G whose value at the group identity is equal to V. It will be assumed that the most general control system on G is of the following form
v
for some fixed
elements VO,VI,. . . , V,
in C(G). The control functions
Jurdjevic
260
. . ,urn(t))are assumed to take values in a prescribed con-
u(t) = ( u l ( t ),,
straint set U in R". In addition, it will be assumed that there is a prescribed smooth cost function L : G x R" -+ R, which will be also called a Lagrangian. Leaving aside the technical assumptions required to define the class of admissible controls, we shall be interested in the following optimal problem: Among all the admissible trajectories ( g ( t ) ,u ( t ) ) of the control system (1)which satisfy the prescribed boundary conditions g(0) = go and g ( T ) = g1 find the one which minimizes L ( g ( t ) ,u ( t ) )dt. If T is an a priori fixed number the above problem will be called a &ed time problem. Otherwise it will be called a free time problem. The Maximum Principle, as a necessary condition of optimality, very often leads to a Hamiltonian dynamical system on the cotangent bundle T*G of G. Its integral curves are called estremab. The structure of the extremals, and its relation to the symmetries of the problem is best elucidated when T*G is realized as the product of G with the dual of its Lie algebra. This realization of T*G leads to non-canonical coordinates, and therefore its use demands a solid understanding of the underlying symplectic geometry. Although much of this knowledge is already firmly established in classical mechanics ([3], [4]), yet in some subtle details the geometric theory requires further clarifications, and merits additional considerations. Our treatment of the heavy top clarifies the geometric significance of Killing form on SOa(R) in relation to its classical equations of motion, and the elastic problem elucidates the geometric nature of its conservation laws. The analogy shows that these conservation laws correspond to Casimir elements associated with invariant bilinear forms. The paper is organized as follows: Section I recalls the basic concepts from Lagrangian mechanics and contains control theoretic descriptions of each of the problems mentioned earlier. Section I1 contains a discussion of the Maximum Principle, and its role in obtaining the appropriate Hamiltonian corresponding to each problem defined in Section I. Section I11 describes the geometric foundation required for the analysis of extremals. This ,section also contains a
Control Optimal
Problems on Lie Groups
261
discussion of the conservation laws for optimal control problems on Lie groups, and their connections to the symmetries of the problem. Some of the above problems, such as the elastic problem admit extra integrals of motion (of Kowalewski type) which provide interesting extensions of the classical integrability theory.
I. Lagrangian Mechanics, Elastic Rods, Dubins' Problem As a way of introducing the mechanical problems which will be discussed in the paper, we shall first recall the basic definitions from Lagrangian mechanics ([4]).A Lagrangian system on a smooth manifold M is a triple ( M ,T ,V ) where T is a Riemannian metric on M , and V is a smooth function on M . T and V are respectively generalizations of the kinetic and the potential energy. The Lagrangian associated with this triple is given by L = T - V . Thus L is a function on the tangent bundle of M . The basic principle of mechanics, known as Lagrange's Principle, states that the actual motion of this system is such that it minimizes L(((t))dtover all curves ( ( t )in the tangent bundle of M which satisfy $T o ( ( t )= ( ( t ) , and the prescribed boundary conditions ~ ( ( ( 0 ) ) = go, and n(((T))= 41. In this notation, x denotes the natural projection from T M onto M . In order to rephrase Lagrange's principle as an optimal control problem it will be convenient to assume that M is a parallelizable manifold, such as a Lie group. That means that there exists a global frame VI, . .. , Vn of vector fields on M , On a Lie group this frame usually consists of either left or right invariant vector fields. Any differentiable curve ( ( t )on T M which satisfies &(((t))= ( ( t )is uniquely expressed as n
r(t>= CUi(t)WS(t))) i=l
for some functions u1 ( t ) ,. . , ,un(t). Conversely, any choice of functions u l ( t ) ,, . . ,un(t)uniquely determines a curve q ( t ) on M , provided that the initial point q(0) = go is fixed; g ( t ) is the solution curve of the following
Jurdjevic
262
initial value problem:
If we regard u1,, . . ,un as the control functions, then paths on M may be viewed as trajectories of the above control system. Evidently, this control system is of the type defined in the introduction with V0 = 0, and m = n = dimM. In terms of this notation the Lagrangian L becomes:
with the matrix (Tij) positive definite for each q E M, where Tdj denotes
T(K,V;.). We shall now turn our attention to theheavy top, and themethod of moving frames, as a prototype of mechanical systems on Lie groups.
A. The Rigid Body Let E3 denote R3 with its Euclidean metric on it. &, 82,Z3 is an orthonormal frame fixed at the origin Of of Es, and dl, 82, ii3 is an orthonormal frame fixed on the body with its origin at a point Ob.We shall refer to a,, &, d3 as the moving frame. Our notational conventions will follow Arnold’s convention, whenever possible ([3]). Thus, capital letters willrefer to the coordinates relative to the moving frame, while small letters refer to coordinates relative to the fixed frame. Let r(t) denote the coordinates relative to thefixed frame of the vector OfOb. For any point P on the body let q denote the coordinate vector of 0 7 relative to the fixed frame, and let Q denote the coordinate vector of 0 2 relative to the moving frame. Finally let R denote the rotation matrix defined byq = r RQ. It follows that the i-thcolumn of R is equal to the coordinates of relative to the fixed frame. As the body moves through space, the velocity of the point P is given by = 8 + %Q. Since R(t) is a path in S 0 3 ( R ) ,% is a tangent vector
*
+
%
Optimal Control Problems on Lie Groups
263
to S03(R) at R(t).The tangent space at R can be described by either left or right translation by R of the tangent space at theidentity. The tangent space at the identity consists of all anti-symmetric matrices. Assuming that T R S O ~ ( Ris) equal to { A R : At = -A), then % = A(w)R where
A(w)=
0
-W3
W2
~3
0
-WI
-W2
W1
. Thus,
0
dr
dr -dq= - +ARQ, or -dq= A(q - r ) . dt dt dt dt Let W denote the coordinates of w l & + wzZ2 + w3Z3, Then, Aq = W x q for any coordinate vector q. Hence,
+
-dq= - dr + W x (q - r ) . dt dt If m is the point mass of a point P, then its kinetic energy is given by The total kinetic energy of the body is equal to the totality of the kinetic energy due to point masses as P varies over the body. This quantity is given byan integral over the body in terms of the mass density p. We have,
Denote by A(52) the antisymmetric matrix A(52) =
0
-523
522
523
0
-521
-522
521
0
where 52 = R-lw. It follows that A(w) = RA(52)R-1 &S c m be easily verified through the following argument:
R-'A(w)q = R"(w
X
q)
= R-'w
X
R-lq = 52 X R-lq = A(52)R-lq;
therefore, R-lA(w) = A(52)R-l. 52 is called the angular velocity relative to the body. The total kinetic energy is conveniently expressed in terms of 52 as follows: Denote by h4 the total mass of the body, that is, M = IBody pdQ.
Jurdjevic
264
Then, using q - r = RQ and W x RQ = R(O x Q ) we get,
In particular, if the center of the moving frame is situated at the center of mass of the body then the middle integral is equal to zero, and
We shall now assume that the point Ob is fixed in the absolute space, and therefore = 0. Hence, T = $Body pllO X &I[’%& = +(a,PO) where P is some positive definite matrix depending on the geometry of the body. Its eigenvalues I l , I2, 4 are called the principal moments of inertia.
2
4
We shall further assume that the body is moving under the gravitational force field F’ = -Cc3 (C = the gravitational constant). Then, the potential energy of a point mass m at q is given by V(q)= mq3 = m(q,e3). The total potentialenergy consists of the sum of the contributions due to point masses. Thus,
where Q0 denotes the coordinates of the center of mass of the body. (We are not necessarily assuming that ob is situated at center of mass of the body.) For notational convenience denote c = MC. The Lagrangian L associated with the body is given by L = T - U = ;,(a,PG)- c(RQo,e3).Thus, L is a function on SO3(R) x R3 depending on R and O. In order to formulate this problem as an optimal control problem we will regard O as the control function: the dynamics of the control problem = A(w)R = RA(O)R-lR = R A ( 0 ) .Let are given by
9
Optimal Control Problems on Lie Groups
265
Fig. 1. Kinematics of the top 0 0 Ai= 0 0
G 1
0 -1, 0
A2=
0
0
0
0 0
1
-1
0 0
l
0 -1 0 and A 3 = 1 0 0 , 0 0 0
Then, letting Ai denote the left-invariant vector field on S 0 3 ( R ) whose value at the identity is Ai, we have (2)
dt = Ol.&(R)
+ Oa&(R) + O3&(R).
Thus Lagrange's Principle leads to minimizing
over the trajectories of the above control system which satisfy the prescribed boundary conditions. In the mechanics literature this problem is called the heavy top problem. The heavy top is a prototype of holonomic systems in mechanics, which means that every path in SOs(R) is a trajectory of (2). The bdl-plate problem, which we discuss next, is a natural example of a non-holonomic system.
Jurdjevic
266
B. The Ball-Plate Problem The ball-plate problem consists of the following kinematic situation: a ball rolls, without slipping between two horizontal plates separated by the distance equal to the diameter of the ball. It is assumed that the lower plate is k e d , and that theball is rolled through the horizontal movement of the upper plate. The problem is to transfer the ball from a given initial position and a given initial orientation to a prescribed final position and a final orientation along a path which minimizes c (lw(t)1I2 dt among all paths which satisfy the given boundary conditions. w(t) is the velocity of the moving plate, T is the timeof the transfer, and c is a positive constant which will be conveniently chosen. Let & , 82, 83 be the moving frame with its origin at the center of the ball. We shall assume that thefixed frame & , &, Z3 is situated on the fixed plate with 23 pointing at the moving plate. Denote the velocity w ( t ) of the upper plate by w l ( t ) e l + v z ( t ) e z . Then, using the same notations as in the description of the rigid body, we get that = W x ( g - r ) . Assume that the radius of the ball is equal to 1, and let ( ~ ( ty)( t,) , 1) denote the coordinates of the center of the ball. Then, no slipping assumption means = 0 when q - r = -e3, and that = w(t) when q - r = e3. that It follows that = w2, = -~ 1 ~1, = % + ~ 2 and , 212 = -~ 1 and therefore v1 = 2w2, w2 = - 2 ~ 1 .We shall regard the components the velocity of the center of the ball as the control functions u1 and u2. Then, &E= dt U19 = u2 and
sf
3 8+
3
9
9
dR dt
-
"
3
0
-W3
w3
0
U1 ~2
-U1
-212
0
R.
The preceding differential system is a system on G = R2 x S 0 3 ( R ) . Let V1 = e l e A2, V2 = e2 A1 and V3 = -A3 be the elements of C(G), where A I ,A2 and A3 have the same meaning aa in the previous section. Let g ( t ) = ( ~ ( ty(t), ) , R-l ( t ) )Then, .
,
Optimal Control Problems on Lie Groups
267
0
Fig. 2. The Ball-Plate Kinematics
6
being the left-invariant vector fields induced by K , with VI, VZ,and V2 and VS.The above control system is controllable since the Lie algebra generated by and VZ is equal to L(G). We shall now choose c = so that c I(v(t)((dt = 3I T(ul a U : ) dt. The principal moments of inertia of a homogenous sphere are all equal, and hence the Lagrangian L associated with the rolling sphere is equal to
so +
6
Using the fact that = I ( w ( ( ~ , it follows thatthe restriction of the Lagrangian to the distribution defined by equations (3) is given by
MI L=-(u:+tL;+w;)+ 2
M T(u:
+
U;).
According to the theory of non-holonomic mechanical systems, such aa the case of a rolling sphere, the true motion of the system is not obtained through Lagrange’s Principle, but instead through D’Alembert’s principle of virtual work ([23]). Then it is known, that for any actual motion of the ball w3 = constant. We will further assume that w3 = 0.
Jurdjevic
268
C. The Elastic Problem The elastic problem in the plane, goes back to Euler, and consists of the following: Minimize /I d t among all curves s(t) in R2 which = 1 for all t , and which in addition satisfy the prescribed satisfy boundary conditions s(0) = ZO, S(0)= 20, z(T)= 21,and $f((T) = k1. Here, kd is an arbitrary tangent vector at si,i = 0 , l . The Serret-F'renet method lifts this problem to an optimal control problem on the group of motions of the plane with the geodesic curvature function playing the role of a control. For reader's convenience, and further extensions to spaces of constant curvature, we shall quickly outline this procedure. Denote by k ( t ) = $&(t).Let P ( t ) denote the positively oriented unit vector perpendicular to k ( t ) .Let R(t) denote the orthogonal matrix defined by R(t)(;) = ak(t) ,@(t).Since R(t) is a curve in SOz(R),there
4 sf 9 1 1
11211
+
= R(t)
exists a function u ( t ) such that
(
-"d"'>,
It follows
immediately that & k ( t ) = u(t)G(t),or that ll$ll = u(t). Thus, u(t) is equal to the geodesic curvature associated with s(t). The group of motions of the plane G is equal to thesemi-direct product of R2 with SOz(R). We shall denote it by Ez.E2 can be realized tt8 the subgroup of GL3(R) of all matrices of the form each curve g ( t ) =
(4)
dt
with
Vl((e) =
(
(: i).In particular,
$ t ) ) is a solution curie of
o u
0
0 0 0 0 0 0 1 0 0 and V 3 ( e ) = 0 0 -1 0 0 0 0 1 0
It is easy to check that
[VI,% ] ( e ) =
Io
'
1. .
0 0 O -O1 0 We shall denote this
I 1 0 01 matrix by %. Hence, VI, l4 and V 3 form a frame on G, and conforms to the following Lie bracket table: [VI, V21 = 0, [VI, Vs] = R , [Vz, V3]= -Vl.
Optimal Control Problems Lie on
Groups
269
Euler called the projections of the solution curves on R2 the elastica. We shall now consider the extensions of the elastica problem to its nonEuclidean neighbors, the sphere S2, and the hyperboloid H 2 = ((2, y,z) : z2 y2 - z2 = - 1, z > 0). For notational convenience we will use M to denote either of the above spaces. We shall refer to M as a non-Euclidean surface form. In order to establish the required formalism, let (2,g)€ denote the bilinear form q y l 2 2 9 2 e23g3 with e = &l.The bilinear form with e = 1 induces a Riemannian metric on S2, while e = - 1induces a metric on Hz, Let z ( t )be any curve in R3 which also belongs to M . Then, ( z ( t )$)e , = 0, and therefore the tangent plane to M at z consist of all tangent vectors perpendicular to 2. If we further assume that ll$lle = 1, then it follows that = 0. We shall use x to denote $f and DX to denote the covariant derivative of E ; Dj!is the orthogonal projection (relative to the above form) of on T,M, and therefore D5 = $7 - € ( I C , But (z, = -1 as can beverifiedby differentiating the relation (3, $)€ = 0, and hence D& = ez. The geodesic curvature k ( t ) associated with the curve z(t)is defined by llD$$ll = k ( t ) . Let g(t) denote the unit vector such that D 9 = k ( t ) g ( t ) . It follows that (j.,b)e = 0. Then, D$(t) = &(t) - e ( z , v ) c z .Differentiating (z(t),g(t))€ = 0, we get that ( z ( t ) , = 0. Therefore, &(t) = $$(t). Since 11$1I e = 1, it follows that (P, &$)e = 0, and therefore &j = ai for some scalar Q. Differentiating the relation = 0 we obtain Q = = = -(Dk,g) = - k ( t ) . We have now obtained the Serret-Frenet differential system for M :
+
+
+
(9, e)€
9 9+
v)€
-(e,$)€ dx dt
- x,
"
Dk = k ( t ) $ and D$ = -kS.
As in the Euclidean case we will assemble these equations into a single matrix g ( t ) defined by g ( t ) e l = z(t),g ( t ) e 2 = j. and g ( t ) e s = Q. That is, z ( t ) ,$(t),and $(t)are the appropriate columns of g ( t ) . It follows that
Jurdjevic
270
Matrix g ( t ) satisfies (gw,gw), = ( w , w )for ~ any W , W in R3. Thus, g ( t ) is a curve in the Lie group G which leaves the bilinear form ( w , ~ ) , invariant. It is easy to verify that for E = 1, G = SOs(R), and for E = -1, G = SO(2,l). In either case, the Lie algebra of G consists of all matrices 0 --Ea1 -Ea2 0 -a3 for arbitrary values of a l , a2 and a3. of the form al a2 a3 0 It will be convenient to choose the following basis for C(G): 0 VI= 1 0
-€
0 0
0 0 , 0
0 0
--E
V2= 0 0 1 0
0 0
,
and
h=
0 0 0 0 0 -1. 0 1 0
Then, the equation ( 5 ) can be expressed in the more familiar form:
with Ju(t)(= k ( t ) . G is the isometry group of a two dimensional simply connected space M of constant curvature, and theelastic problem is to minimize $;f u2dt over the trajectories of (6). We shall refer to this variational problem as the elastic problem for a two dimensional space form M . The projections on M of the solution curves will be called respectively Euclidean and non-Euclidean elastica. The Lie bracket table corresponding to all three cases issummarized below in Table 1
4
Table 1 €
= 0,
€=
E
1,
the Euclidean plane the sphere S'
= -1, the hyperboloid H'.
A convention remark. Throughout this paper,the Lie bracket [X, Y] of any vector fields X and Y is defined by [X, Y]f= Y(Xf)- X(Yf). The lie bracket on a Lie algebra of a Lie group G is defined through leftinvariant vector fields; that is, [X, = [2?,Y] ?](e). In particular, when G is
Optimal Control Problems on Lie Groups
271
a subgroup of GL,(R), then elements of its Lie algebra are matrices, and
the above formula reduces to [X, Y]= YX - XY.This convention (and not its negative) produce the relations in Table 1. The extensions of this problem to higher dimensions are motivated by the elastic rod considerations. An elastic rod is described by its central line, andanorthonormal frame along the central linewhich measures the amount of bending and twisting relative to some fixed frame in the unstressed state of the rod. (see, for instance [2], [ll], and [21]). We shall state the elastic problem in R3 geometrically as follows: We shall consider the totality of all curves z(t) on R3 along with a. fixed orthonormal frame R ( t ) defined along x ( t ) and adapted to the curve s(t) by requiring that g = Rel. In this notation,R(t)is an element of SOs(R) attached to a right-handed orthonormal frame a'l (t),izz(t),&(t)through the formula R ( t )
(3 P
= a&(t)+p&(t) +y&(t) for all (a,P,r)in R3.
The requirement that 9 = dl corresponds to therod which isinextensible, since Il%ll = 1, and which is not subjected to shearing forces. (See [l11 for further details). For each such curve z ( t ) ,the frame R ( t )traces a curve in SOs(R), and therefore there exists functions u l ( t ) ,uz(t), ug(t) such that
0 ~ 3 ( t )
dt
-uz(t)
"213(t)
0
ul(q
uz(t) -~l(t)
0
In the theory of elasticity, u l ( t ) ,uZ(t), and u3(t) are known as the strain functions. The total elastic energy associated with the rod is given by the following expression
lT E = - j(C,u:(t)
+ czu;(t) + C&t))
dt,
2 o
where Cl, C2 and C3 are constants depending on the physical and geometric characteristics of the rod. According to thebasic principles of elasticity theory, the equilibrium configuration of an elastic rod, subjected to fixed
Jurdjevic
272
terminal conditions minimizes the total elastic energy of the rod (among all configurations satisfying the given boundary conditions).
1
1
As in the planar case we shall identify each configurationof the rod m a curve g(t) =
in the group of motions of R3 (realized
R(t) subgroup of G L I ( R ) ) .We will use
a
to denote the group of motions of of a curve g(t) isgivenby
R3.The derivative
E3
t=l
with 0 0 0 01
10 0 0
1 0 0 0
Io
0 0 0 01
IO
0
0 01
Io
-1
0 01
0 0
-1 0 1
'
0
and 0 0 - 1 0 0 1 0 0
loo
0
e
o I
4
+
Our optimal control problemconsist of minimizing $r(C1uf C3ui)dt among all trajectories of (7) which satisfy the prescribed boundary conditions. The corresponding elastic problem fora three dimensional space form M is defined analogously as follows: Equations (7) describe the dynamics of the elastic problem for R3. The extensions to S3and H 3 are simple: The isometry group of S3is S04(R).
..
.
Optimal Control Problems on Lie Groups
273
The appropriate dynamics on G are the same as in equation (7)except 10 -1 0 01
that the drift term f o need to be changed to
f,=
+ +
The hyperbolic space H 3 is defined by z: z$ zi- z i = -1,z4> 0. The isometry group of H 3 is 50(3,l), and the drift term for the come 0 1 0 0
=
sponding elastic problem is
0 0 . 0 0 0 0 0 0
4
. For each
of the above
+
+
problems we shall take the Lagrangian to be S,'(Cluf C24 C&) dt for some positive constants Cl,CZand CS. Before leaving this problem it will be wise to recogni.ze certain geometric facts whichwillforce a change in notations. In each of the Lie algebras above there is a natural splitting, called the Cartan decomposition, into two spaces K and P. K consists of all matrices of the form 0 '0 0 0 0 a3 0 -a2
matrices that
0
a1 0
0 1
. as ( a l ,u2, a s ) vary over R3, and P consists of the
0 -6bl
-6b2
0 0 0
bl b2 b3
0 0 0
-~b3
0 0
for
,
(bl b2, b 3 )
in R3. It is easy to check
[K,K]c K , [K,P] C P and that [P,P] C K. We shall work with the following bases for 0 0 A1 = 0 0
0 0 0 0
0 - € 1 0 B1 = 0 0 0 0
0 0 0 -1' 1 0
0 0
0 0 0 0
0 0 0 ' 0
0 0 0 0 A2 = 0 0 0 - 1
K and P:
0 0 0 1 0 0 ' 0 0
0 0 - € 0
B2 =
1 0 0 0
0 0
0 ' 0
0 As = 0 0 0
0 0 0 0 - 1 0 1 0 0 0 0 0
000"e 0 0 0 0 BB= 0 0 0 0 1 0 0 0
Their Lie bracket table whichwill be needed in the sequel is given
Jurdjevic
274
below: Table 2
This change in notations allows us to rewrite the dynamics of (7) by a single equation 3
9 = &(g) + Cui(t)Ai(g) dt i=l
3
+
+
and our problem is to minimize sr(Clu: C 2 . i C3ui)dt over the trajectories of (8). We will refer to the problem as the elastic problem for a space form M .
D. Dubins Problem This geometric problem introduced and solved by Dubin’s in ([S]),was recently brought to my attention by H. J. Sussmann ([26]). Dubins’ problem consist of finding a curve z ( t ) of minimal length which connects zo to z1, among all curves which satisfy Ilg(t)ll = 1, 11 5 1 for all t in their interval of definition [O,T],and which further satisfy the boundary tangency condition 9.0) = $ 0 , and $(T)= 51. We shall consider this problem as the minimal time problem for the control system described by equations (4) on the group of motions of R2 subject to the condition that Iu(t)l 5 1, or more generally aa the minimal time problem for its non-Euclidean extension, control system (5). It
1 1 9
275 Groups Lie
onProblems Control Optimal
is relatively easy to imagine generalizations of this problem to higher dimensions, in terms of equation (8), but unfortunately, because of a lack of space, we will not pursue such ideas further. Instead, we will turn our attention to the Maximum Principle, and the solutions of the preceding problems.
11. The Maximum Principle The Maximum Principle is a necessary condition for optimality, which was originally stated and proved by Pontryagin and his co-workers in 1958 for control systems in R", with controls restricted to compact sets ([25]). The restriction on the values of admissible controls marked an interesting departure from the usual problem of Lagrange in the calculus of variations, and at that time posed serious mathematical difficulties, since the classical idea of variations was no longer applicable. The variations used in the Maximum Principle constituted a significant generalization of the Weierstrass' original ideas in the calculus of variations, and the theorem itself was regarded as an important accomplishment of the period. The authors were awarded the Lenin prize for this work in 1962, but apart from this recognition, this beautiful theorem remained an obscure theorem outside the field of control theory. It is curious to find that, in the literature on non-holonomic systems, the Maximum Principle is entirely ignored, although the subject matter overlaps considerably with differential systems with controls ([4],[lo]). To some extent the problems discussed in this paper are deliberately chosen in order to illustrate the connections of geometric control theory to problems of geometry, mechanics and elasticity. More importantly, these problems are chosen because of their rich geometry of solutions, and the main motivation for this paper is to understand the geometric foundation of the theory which they comprise. In this paper we will focus on the geometric content of the Maximum Principle, and concentrate on its applications to the variational problems mentioned in the last section. For that reason, and also to avoid unneces-
Jurdjevic
276
sary technical details, the discussion of the Maximum Principle will not be done at the highest level of generality. As we stated in the introduction we will consider the problem ofof minimizing L ( g ( t ) ,u(t))d t over all trajectories of a left-invariant control system
S;f
on a Lie group G which satisfy the prescribed boundary conditions g ( 0 ) = go and g(T) = 91.The Lagrangian L is a given function on G x Rm,and it will be assumed that the control functions are constrained to take values in a constraint set U in R”. We will not make any assumptions about U. In particular, U can be equal to Rm. The terminal time T > 0 can be either fixed, or it can be free. In order to get to the geometric foundation for the problems, it is necessary to absorb the Lagrangian into the dynamics, and consider the extended system in R x G with
Then the trajectory ( p ( t ) ,E ( t ) ) which is optimal relative to thegiven data, when regarded as a trajectory (E(t),g(t)lE ( t ) ) with E ( t ) = L(g(t),‘ii(t)) of the extended system, must be on the boundary of the reachable set from (0, go) because the cost component E(T)is minimal among all other trajectories of (9) which satisfy the given boundary conditions g ( 0 ) = go and g ( T ) = 91. (Fig. 3) The infinitesimal analogue of this geometric characterization of the problem is expressed in terms of the “perturbationcone” at the tangent space , states that the interior of R x G at the terminal point ( E ( T ) , g ( T ) )and of the perturbation cone can not contain the negative axis (X, 0), X 5 0 in its interior. The perturbation coneis a “local approximation’’ of the reachable set from (O,go), at ( E ( t ) , g ( T ) ) .
Optimal Control Problems on Lie Groups
277
Id
Fig. 3. The geometry of optimality The Maximum Principle, is a statement of this fact on the cotangent bundle of G in terms of the separating hyperplanes in each tangent space Tg(tlGabove the optimal curve B(t). In order to be able to write itsprecise statement it will be necessary first to recall the basic facts from symplectic geometry. Let W denote the canonical symplectic form on T*G. W sets up a correspondence between functions F on T*G and their Hamiltonian vector fields P. This correspondence isgivenby dFg(w)= W ( W ,P(g)) for any v E TgG. Any vector fieldX on G induces a function Hx on T*G by the formula: Hx(E) = ( ( X ( g ) )for all E E TgG.The corresponding Hamiltonian vector field I?x on T*G is called the lifting of X to T*G.The same lifting notion extends to time varying vector fields on G; upon replacing G by R x G , the differential system (9) lifts to T*(Rx G) by the procedure described above. The lifted Hamiltonian 3t contains the control function u(t) as the parameter, and is of the form m
Jurdjevic
278
for any (Eo,<) in the cotangent space of R x G at a point CY,^). Here, we have identified T * ( Rx G)with T'R x T*G. Evidently, Q is a cyclic coordinate of "l, that is, 7-f does not explicitly depend on Q. Thus, its dual coordinate EO is constant on the trajectories of g.Therefore, the Hamiltonian 31 may be reduced to P G ,with (0 regarded &S an arbitrary parameter. This parameter is usually normalized to be either €0 = -1 or (0 = 0. We shall denote the dependence of the reduced Hamiltonian on this parameter explicitly by
c m
"lA(<, 'LlW =
-XL(9,4t))
+ E(VO(9)) +
%%g))
d=1
where X is either equal to zero, or equal to 1, and where E is an arbitrary point of T*G.We shall use Ho, . ,.,H,to denote the Hamiltonians of VO,VI,, . . ,Vm respectively. Recall that Hi(()= ( ( q ( g ) ) , i = 0,1,.. , ,n. Then the system Hamiltonian NAcan be expressed &S
We are now ready to state theMaximum Principle:
The Maximum Principle Suppose that ( g ( t ) , U ( t ) )is an optimal trajectory of (9) on an interval [0,TI. Then, g ( t ) is a projection of an integral curve c(t)of the Hamiltonian vector field 7?A(0,TZ(t))defined for all t in [O,T]such that the following hold:
(lo) if X = 0, then z ( t ) is not identically zero on [0,T ] (20) ~ P ( z ( t ) , ~ = ( tsupuELI )) ~ ~ ( c ( t ) for , u )almost all t in [o,T]. (3O) If the terminal time T is fixed then HA(c(t),TZ(t)) = constant, and if T is free then H A ( c ( t ) , E ( t )= ) 0 for all t in [ O , T ] .
A pair (E(t),u(t))of curves defined on an interval [ O , T ] where U is an admissible control function u(t) = (211, . . . ,u ( t ) )and ( is an integral curve is called an of gA(.,u(t)>such that (lo) and ( 2 O ) hold along (((t),u(t))
.. ... .
..
.
Optimal Control Problems on Lie Groups
279
extremal pair. That is, ' t l Y E ( t ) , W ) = SUP'tlA(E(t),V) UEU
for almost all t in [0,TI, and { ( t )is not identically equal to zero, whenever A = 0. The extremals which correspond to A = 1 will be called regular. We shall follow the terminology established by Bliss, and call the extremals associated with X = 0 abnormal. If an extremal { ( t ) is regular, and if the associated control function u(t) belongs to the interior of U for almost all t in the interval [ O , T ] , then condition (2O)implies that
for each i = 1,2,.. . ,m, and almost all t in [0,T]. In this notation 71 is the natural projection from T*G toG. If furthermore, the Hessian matrix OaL is positive definite along ? r ( E ( t ) ) and u ( t ) ,then equation (10) can be solved foru ( t ) in terms of {(t).This procedure leads to theHamiltonian of the system, and is a natural generalization of the classical technique from the calculus of variations involving the Legendre transform. We will now illustrate itsuse in obtaining the Hamiltonians for the problems introduced earlier in the text. Remark. In this paper we shall concentrate on the regular extremals. For the problems mentioned earlier except possibly for Dubins' problem, it is true thateither there areno abnormal extremals (such as for instance for the rigid body), or that theabnormal extremals which project onto optimal trajectories arealso regular, (asin the Plate-Ball Problem) and hence may be ignored without any loss of generality. In general, however, abnormal extremals should not be ignored, as shown by Montgomery ([22]). The Rigid Body Problem
L = ;(a,Pa)- MC(RQ0,e3). For simplicity of expressions we shall assume that P is diagonal, with its principal moments of inertia I1,Ia and 13
equal to the diagonal entries. Then, equations (10) show that along the
. . . . . . . .. . .. . . .
_...,
"./I...
. . " I
,.,.
1 .
....
....I. x
" I I .
.. . . .
.
..
Jurdjevic
280
regular extremals
8L an, H,(()= O
"
i = 1,2,3
and therefore Q ( t ) = A H i ( ( ( t ) ) .Hence, the regular extremals must be integral curves of a single Hamiltonian H given by
Since
it follows that
where R denotes the projection of ( on G = SO3(R). Thus for the rigid body problem, the Maximum Principle yields the totalenergy of the body as its Hamiltonian with the angular velocity n equal to i = 1,2,3.
2,
The Ball Plate Problem Let h1 and h2 denote the Hamiltonians of the constant vectors Z1 and Z2 in the plane, and let H I , HZ and H3 have the same meaning W in. the discussion of the rigid body. Then, 3 1 1 ( t , u l , u 2 ) = - ~1 ( u : + u , 2) + u l ( h z + H l ) + u 2 ( - h l + H 2 ) .
The controls which yield extremals satisfy equation (lo), and therefore must be of the form
Upon substituting these values in X,we obtain a single Hamiltonian H given by 1 1 H = -(h2 +H# -(-h1 +Hz)? 2 2
+
Optimal Control Problems on Lie Groups
281
The Elastic Problem Fortwo dimensional space form. the appropriate Hamiltonian is H = %Hi H1 where H1 and H3 are the Hamiltonians associated with
+
0 -e VI= 1 0 0 0
0 0 0
0 0 0 and V3= 0 0 - 1 . 0 1 0
For a three dimensional space form the Hamiltonian is given by
with HI, HZ,H3 denoting the Hamiltonians of is the Hamiltonian of &.
21,A2 and A3,while G1
Dubins' Problem. As for the elastic problem, let HI; H2 and H3 be the Hamiltonians associated with VI, v 2 and v3 in Table 1. Since L = 1, it follows that ?lx = -X H1 uH3. Then each extrema1 pair ( ( ( t ) , u ( t )must ) satisfy (l")and (2") of the Maximum Principle. Upon applying ( 2 O ) we get that, along ( ( t )the following must hold:
+ +
u(t) = 1
for all t forwhich H 3 ( ( ( t ) ) > 0,
u(t)= -1
for all t for which H3(E(t)) < 0,
H3 = 0 is the switching surface for this problem. As long as H3(((t))= 0, the Maximum Principle gives no information about u ( t ) . We shall need additional geometric tools to investigate further details of this problem, and for that reason, we will postpone its discussion until later.
111. The Geometry of the Extremals Throughout the paper { H , F } will denote the Poisson bracket of two functions on T*G,or more generally on the cotangent bundle of any m nifold M . Its definition, and its relevant properties which will be used in the paper are contained in the following proposition.
282
Jurdjevic
P,roposition. Suppose that H and F are smooth functions on T * M . Denote by { exp tI?} the one-parameter groupof di$eomo.rphisms generated by the Hamiltonian field If of H . Then, (i) {F,H } ( x )= &F o ezptl?(z)(,=ofor each z E T * M . (ii) The Lie bracket [P,l?] of two Hamiltonian vector fields F' and l? is a Hamiltonian vectorfield, and [P,i f ] = {F-}. (iii) If H x and H y are the Hamiltonian functions which correspond to smooth vector fields X and Y on M then { H x , H y = } H[x,y],
Remark. All of these statements along with their proofs can be found in Arnold's book ([3], Ch. 8) with an unfortunate transposition of the terminology: Poisson bracket of vector fields in Arnold is equal to the Lie bracket of this paper, while the Poisson bracket of this paper is the same as the Lie bracket in Arnold. Property (iii) of the Proposition implies in particular that the Hamiltonians of left (respectively right) invariant vector fields on a Lie group G form an algebra under the Poisson bracket. Rather than expressing the extremals through the canonical coordinates of T*G, and the familiar differential equation % = and =
g, 9
88H X i ' it will be more appropriate to regard T*G as GxC* with C*equal to the dual of the Lie algebra C(G)of G , and proceed with the non-canonical coordinates ( g , p ) relative to this decomposition. Recall that the above decomposition is accomplished in the following way. For each g E G , let X, be the left-translation by g, i.e., X,(x) = gx for all x E G. The differential dX, at x maps T,G onto T,,G. We shall denote the dual mapping of dX, by (dA,)*. (This notation suffers from the standard notational weakness in differential geometry, that it suppresses the explicit dependence on the base point z.)In particular then, (dX,-1)* maps T,*G onto T i G . The mapping ( g , p ) + (dX,-l)*(p) realizes T*G as G x C*. In thissetting, T(T*G)',the tangent bundle of T*G, is realized as T G x TC*.Upon identifying T G with G xC via theidentification (g, A ) -+ dX,(A) for each (g,A ) E G x C,T ( T * G )becomes G x C x C*x C*.We will further think of T ( T * G )as G x C* x C x C*with the first two coordinates "
Optimal Control Problems on Lie Groups
283
describing the base point. Each element ( g , p , X , Y * )of T(T*G)is to be regarded as the vector ( X ,Y * )in C x C* based at ( g , p ) of T*G. It is not difficult to show that the natural symplectic form W on T*G in the above representation of T(T*)Gtakes the following form:
for any ( g , p ) in G x C*,and any vectors ( X i , y i * ) in C x C* i = 1,2.
Remark. Abraham and Marsden ([l], Proposition 4.4.1, p. 345) obtain the same expression as above. However they take W = -de, rather than W = de, which is the convention used here. 6 denotes the canonical l-form on T*G.This discrepancy in sign disappears when discussingHamiltonian vector fields, sinceAbraham and Marsden define Hamiltonian vector fields through the formula dH(w) = w(Z,W), which is the negative of the one used here. Therefore any smooth function H on G X C*determines its Hamiltonian vector field l?(g,p) = ( X ,Y * )through the formula:
for all (V,W * )in C x C*,and all ( g , p )in G x C*.Hence any integral curve ( g ( t ) , p ( t ) )of Z must satisfy: (1W
W*( g - Y t ) 2 ) = W * @ )= dH(,(t)lp(t))(O,W*>,
and (1lb)
*dt( V ) = Y * ( V )= P ( t ) [ X ,VI
- d q g ( t ) , p ( t ) ) ( v0),
for all (V,W * )in C x C*. We shall now investigate the nature of the extremals associated with the Hamiltonians obtained earlier through the Maximum Principle. The rigid body problem is different from the others in the sense that its Hamiltonian is neither left, nor right invariant unless, the center of the gravity of the body coincides with its fixed point.
Jurdjevic
284
The Rigid Body Problem. Recall that G = SO3(R), and that
and c is constant (c = MC). For the representation of where E E Ti(G), T*Gas G x ,C*,each Hi is a linear function on C*since Hi(p) = p ( A i ) .In this representation E = ( g , p ) and therefore H ( g , p ) = 5l ( rII n + r 11n +
y)+
e3). Suppose that q ( e ) is any curve in C,' which satisfies q(0) = p , and $(O) = W*.Then C(gQ0,
Since each Hi is a linear function of p , it follows that $Hi o q(e)I,=o = H i ( W * ) .Hence
Hence any integral curve ( g ( t ) , p ( t )of) l? must satisfy (lla), and therefore
Since W* is an arbitrary element of C* we get that
Thus equation (lla) recovers the original dynamics
9 dt = u1(t)&(g) + uz(t)Az(g(t)) + u3(t)A3(9(t))
,=-v,
with ui(t) i = 1,2,3. We will use (Ilb) to get the differential equation for p ( t ) ,and therefore we need to first calculate dH(g,pl(V, 0) for an arbitrary element V of C. Let g ( € ) = gexge(V) be a curve in G with V E C. Then, = gV, and dH(,,p)(V,O)= $H(g(E),P)Jc=O' It follows that &H(g(€),P)lc=o=
$l,=o
Optimal Control Problems on Lie Groups
285
c(gVQ0,e3). Hence equation ( l l b ) becomes
Equation (12) is the classical equation of Euler. In order to recognize it in its more familiar form, it is necessary to recall the Killing form on so3(R)
K ( A ,B ) = --2 m c e ( A B )
(13)
for any A, B in C. Let us recall the common practice in mechanics whichidentifies R3 and its vector product with C: for that reason, let A^ denote the column vector 0 -a3 a2 associated with A = a3 0 -a1 . Then, K ( A , B ) = (A^,@ -a2 a1 0 where ( , ) denotes the Euclidean inner product in R3. It is well-known, and easy to verify that:
(ii)
[ATB]= A^ x
(14)
B^
for any A, B in C
K ( [ A ,B1,C) = K ( A ,[B, Cl), and VQ = p x Q for any V in C, and Q in R3. We shall identify each element p in C* with an element P in C via the formula p ( v ) = K(P,V) for all V in C. Since p(v) = (B,V ) , it follows that
B=
(2) p 1
with Pi = p(Ai), i = 1,2,3. Inparticular, the curve
P(t)associated with an extrema1 p ( t ) isgivenby p ( t )= terms of these notations equation (12) becomes ($9
v)
= (P^(t),
It then follows from (14) that
(:;)
H2(t)
[G]) - c(g(t)VQO,e3)-
(B,[XTV]) = ([PTX],p),and that
(g(t)VQo,e,) = (0 x Qo,g-’(t)ed = (Q0 x g”(t)e3,p).
ar. In
Juaevic
286
Thus, (12) becomes
Since it must be true for all
%&+
v in R3, we get its usual form:
+
with 6 = ?E'2 9 Z 3 . Recall that g ( t ) = R ( t ) ,where R ( t ) is the rotation matrix which transforms the coordinates relative to the moving frame into the coordinates of the fixed frame. In particular, cg-l(t)ea are the coordinates of c& relative to the moving frame. We shall use d(t)to denote cg-l (t)e3.Then, (16)
dd - dR-l dt dt
ce3 = -A(n)cR-les = - A ( n ) d ( t ) = d ( t ) x
"
5.
Consider now the quantity m ( t )= R ( t ) p ( t ) It . follows that
dt
dm d = -R(t)F(t) = R ( t ) A ( n ) F ( t+) R(t)(? X 6) + c R ( t ) ( d ( t X) QO) dt = c(e3 x R(t)Qo)= q(t) x
F'
where # = -MC33 and q ( t ) = R(t)Qo. It follows that m ( t ) is equal t o the total angular momentum, and the relation $ = q(t)x 9 expresses the well known lawof mechanics: the mte of change of the total angular momentum is equal to the external torqzle. Evidently R ( t ) d ( t )= ce3, and therefore Ild(t)l12is constant along each motion of the top. It follows that ( p ,d ) is another integral of motion, aa verified below:
=(Bxn+dxe3,d)+(B,dx6)=0. The existence of this integral can be explained much easierthrough the elastic problem, as we shall presently do, rather than just in the context of the heavy top. We shall show that the elastic problem is a left invariant variational problem on the group of motions, and that the above integral
Optimal Control Problems on287 Lie Groups
of motion is a consequence of that fact. In this context, it is the elastic problem which elucidates some of the properties of the heavy top, rather than the other way around, as the Kirchhoff's theorem is usually applied. Wewillnow turn attention to the other problems introduced earlier. In each of the remaining problems, the Hamiltonian does not depend explicitly on elements of G, that is, H is a function on C* only. Then, dH@,,)(V,0) = 0 for all V E C,and hence equation (llb) has a particularly simple form: with
9 =q g ) , dt
Therefore, p ( t ) ( V ) = po(g(t)Vg-'(t))forsome p0 in C*,as can be easily verified by differentiation. Putting g-'(t)Vg(t) for V we obtain (17)
p(t)(g-'(t)Vg(t))= constant
for 'all V in C. This condition has a very simple interpretation in terms of the symmetries of the problem: Let VT be a right invariant vector fieldwhosevalue at the identity is V . Then, exp& : G + G is a left-translation by exp tV. Since both the control system, and the Lagrangian are left-invariant, it follows that is a symmetry for H i.e., ( P ,R} = 0. That the Hamiltonian lift P of means that F o exp tI? = constant. The Hamiltonian lift of isgiven by F ( p ) = p(g"Vg). Hence along an integral curve (p(t),g(t))of 2, p(g-l(t)Vg(t))= constant. We shall refer to (17) as the consemration laws for the problem. For the remainder of this section we will exploit the conservation laws associated with each problem, and describe the basic geometry of the solutions which they determine. We shall first return to the rigidbody, and assume that the center of gravity of the body coincides with the fixed point of the body. In the mechanics literature, this case is known as the Euler top. Then the potential energy of the body is equal to zero, and its Lagrangian becomes left-invariant.
Jurdjevic
288
Continuing with the notations established earlier, the Euler top conservation laws, when expressed on the Lie algebra of S 0 3 ( R ) in terms of the matrix P ( t ) ,obtained through the identification via the Killing form, become
K ( P ( t ) , g - ' ( t ) V g ( t ) )= constant. Since K ( P ( t ) , g - l V g ) = K ( g P g - l , V ) , it follows that g(t)P(t)g-l(t)= constant along each integral curve of I?. Therefore the spectrum of P ( t ) must be constant. It is easy to check that thecharacteristic p6lynOmid of P ( t )is given by4(X) = X(X2+Ht+H,"+H,"). Thus Ht(t)+Hi(t)+Hi(t) = IIp(t)112must be constant along the integral curves of I?. This argument reaffirms the classical discovery of Euler that the integral curves of l? are contained in the intersection of the energy ellipsoid H = v1 (+ Ha $) with the momentum sphere M = H; + H i + H $ , It is interesting to contrast the Euler top, with the elastic problem for a two dimensional space form M. The corresponding problem on its isometry group G is also left-invariant and lives on a 3-dimensional Lie group. The comparisons are easier made for non-Euclidean elastica, since the Killing form is non-degenerate on both SOs(R) and SO(2,l). (This is not the case on E2).
+ $+
In either the elliptic case (E = l),or the hyperbolic case (e = -l), let K ( A , B ) = -tTrace(AB) for any A and B in C(G).Proceeding as 0 -€H1 -€H2 in the Euler top case on SOa(R), P(t) = H1 0 -€H3 is the H2 eHs 0 identification in C(G)of a curve p ( t ) in C*. In particular along an extremal curve ( g ( t ) , p ( t ) the ) conservation laws are
g(t)P(t)g" ( t )= constant. The characteristic polynomial $(X) of P ( t ) is equal $(X) = -X(X2 + e(@ H i +EH:)). Hence, M = Ht(t).+H,"(t) e H i ( t ) is a constant of motion. M is a sphere only in the elliptic case. It can be easily verifiedthat M is an integral of motion for the Euclidean case also ( E = 0). Thus each elastic curve in a 2-dimensional space form must be in the intersection of
+
+
.
.
Optimal Control Problems on Lie Groups
289
“the energy” cylindrical paraboloid H = +Hi+HI,and the “momentum” surface M = HT H,2 CH:. In the Euclidean case, M = HT H; is a cylinder, while in the hyperbolic case, the surface M = HT H$ - Hi depends on the sign of M : for M > 0, it is a hyperboloid of one sheet, for M = 0, it is a double cone, and for M < 0, it is a hyperboloid of two sheets. The figures below show all the geometric types of solutions which can occur,
+ +
+
+
l
Fig. 4. Euclidean case There is a natural angle f3 defined in the HI, Hz plane which links elastica to the mathematical pendulum. This angle is obtained as follows: Let HI,H2 and H3 be a curve in the intersection of H = constant, and M = constant. Then, (e - Hl(t))’
+ H i ( t ) = 1 - 2 ~ H 1+ H;+Hi =l-2€
+
(H”; 3+ H , 2 + H i = 1 - 2 € H + M .
Hence, (e - H l ( t ) ) 2 H;(t) is constant. We shall denote this constant by J 2 ; that is,
JZ=1-2eH+M.
J = 0 implies that H3 = constant, or that k ( t ) = constant. This situation occurs when the surfaces H = constant, and M = constant intersect at
Jurgjevic
290
I
U,
Fig. 5. Hyperbolic case
Optimal Control Problems on Lie Groups
291
Fig. 6. Elliptic case isolated points. It is easy to check that any constant value for the curvature is a solution. For the hyperbolic plane, k < 1 occurs when M > 0, k = 1 occurs for M = 0, and k > 1 occurs when M < 0. Assuming that J2# 0, let 8 be the angle defined by: e - Hl(t) = JcosB(t),
andH2(t)
= JsinB(t).
Then -Jsind(t)-
dB dH1 = -dt dt
- -H3(t)H2(t)
= -&(t)JsinO(t).
Therefore, g = &(t) = k(t). Thus, 8 is the angle that the tangent vector to the elastic curve makes with a parallel direction to the curve. Differentiating further, @ = = -Hz(t) = -JsinB(t), or +JsinB(t) = 0.
3
g
Jurdjevic
292
This is the equation of a mathematical pendulum. Furthermore,
Hence, H - E is its “total energy”. It follows that, dB (t) = - = f . \ / 2 [ ( ~- E ) dt
+J
COS e(t)].
In the Euclidean case E = 0, and H is equal to the total energy of the pendulum. Kirchhoff was the first one to notice the existence of the mathematical pendulum, and he named it the kinetic analogue of the elastic problem (see A. E. Love ([21]) for further details.) There are two distinct caaes which A. E. Love calls inflectional and non-inflectional. The inflectional case corresponds to H < J . In this case there is a cut-off angle e, and the pendulum oscillates inside of this angle. The curvature k ( t ) changes sign at the cut-off angle (hence the name). The remaining case H 2 J corresponds to the non-inflectional case, since the pendulum has enough energy to go around the top. The curvature does not change sign. In the non-Euclidean case, the energyis decreased by the sectional curvature, and everything else remains the same. These geometric considerations lead naturally to a unified integration procedure which provides a complete classification of the elastica. We refer the reader to [l41 for further details. The Ball-Plate Problem shows a remarkable connection to the elastica problem. This connection is easily established through its theconservation laws. Recall that the Hamiltonian H for this problem is given by
H = -(h1 1 -
+ +h2 1 + Hd2.
2
The Lie algebra C of R2 x SO3(R) is given by all pairs (a, A ) with a E R 2 , and A an antisymmetric matrix. The bilinear form p((a,A ) , (b,B))= a b + K ( A ,B), where a b denotes the scalar product in R’, and K denotes the Killing form on SO3(R), identifies a curve p ( t ) in C*(G)with
-
9
Optimal Control Problems on Lie Groups
293
(h(t),P(t))in C such that h(t)= ( h l ( t )h, z ( t ) ) and
It then follows from the conservation laws (17) that hl(t) = constant, h z ( t ) = constant, and that H f ( t ) H z ( t ) H i ( t ) = constant along each extrema1 curve ( g ( t ) ,P ( t ) ) .That means that B ( t ) = ( H l ( t ) , H z ( t )H, 3 ( t ) ) is contained in the intersection of the "energy" cylinder H = (H2 $(H1 + h2)2 with the momentum sphere M = H! Hi Hi,as shown in Figure 7
+
+
+
+ +
Fig. 7. The Ball-Plate extremals There is a natural angle 6 defined through the formula
~ ~ -( hl t =) &%cos(e where a = tan-'
km
+ CUI,
H' ( t )+ h2 = d%sin(e
+ a)
Jurdjevic
294
Then -&%sin(@
+da)-dedt = -(h1 dt = -(hz
- Hz(t)) = --dHZ dt
+
=- a s i n ( 8
+ a)HS(t).
Hence,
- H3(t).
"
dt Differentiating further we get that:
d28 dH3 - -hlH1- hzHz dt2 - dt = - h l ( m s i n ( 8 +a) - ha) - h2(hl - m c o s ( 8
"
+ a))
+ a)+ h z m c o s ( O +a). Since -h1 sin(8 + a)+ h2 cos(8 + a)= - d m i s i n 8 , it follows that = -hl&%sin(8
(18)
fi dt + Asin8 = 0,
with A =
JW
Equation (18) is the equation for the mathematical pendulum,Its total energy E is determined by our integrals of motion M = H; + H2 + Hi, the system Hamiltonian H, and the constant h: + h:.We have
M=H;+H;+H;
Therefore -1( M - ~ H - ~ : - ~ ~ ) = - A c o s ~ + 2 2z Thus, E = and
$(M- 2H - (h:
+ h:)) is
*
the total energy of the pendulum,
Optimal Control Problems Lie on
Groups
295
+
+
Recall that 211 = h1 - H2 = &EcOs(e a),and that 2 ~ 2= h2 HI= m s i n ( e + a).Therefore the center of the ball moves according to the following differential equations
v
Let 3 and denote the new coordinates of the center of the ball given by Z = z c o s a + y s i n a , a n d g = - x s i n a + y c o s a . Then,
e these equations become &%COS e !E= &Z sin 8
when reparametrized by dZ
"
de
- . \ / 2 ( ~+ ACOS e)'
de
,/2(~
+A C O S ~ )
8
when H = these equations are the same as the equations for the Euclidean elastica. Therefore, we obtain the following interesting result:
Each optimal path of the center of the ball which has energy H = 4 also minimizes $IC2 de where IC denotes the geodesic curvature of the path.
4
There are three geometric types of solutions which can occur. They are described by the following picture.
non-inflectional case
critical case Fig. 8.
inflectional case
Jurdjevic
296
They are characterized by the following conditions: E > A. The mathematical pendulum has enough energy to clear the top. In an analogy with the elastica problem we call this case noninflectional E = A. The mathematical pendulum has just enough energy to reach the top in infinite time. This case is called critical. E < A . The mathematical pendulum oscillates between its extreme values. This case is called inflectional. These cases are investigated in a considerable detail in ([15])where it is shown that the solutions bifurcate at the critical case. We refer the reader to this paper for further details. We now turn to Dubins’ problem, and continue with the discussion of extremals originated on page 24. Suppose now that there is an extremal ( ( t )which remains in H3 = 0 for an open interval of time. Then, d 0 = -H3(((t)) = (H3,H}(E)= {H3,Hl}(E =)-H2(<(t)). dt Differentiating further we get: d
0 = -H2(((t)) = {HZ, H}(() = {HZ, Hl} dt
+ U(t){H2,
H3}(((t))*
Thus U ( t ) H l ( ( ( t ) )= 0 for all t for which H3(((t)) = 0.Since {Ha,Hl}(<)
=~H3(((t))=O,and{H~,H~}=-H~,wegetthat~(t)H~(((t))=Oforall t for which H3(((t))= 0,Hence, u(t)= 0,and therefore the corresponding trajectory G projects onto a geodesic on the surface M . Thus the Maximum Principle predicts that each optimal trajectory must be a concatenation of the bang-bang trajectories corresponding to U = fl,and the singular arcs which are thegeodesics. In the original paper Dubins showed that each optimal trajectoryis either a concatenation of an arc of a circle, followed by a straight line, followed by an arc of a circle (or any sub-path of this situation), ar a concatenation of no more than three circular arcs. It would be quite interesting to obtain analogous results on the sphere and the hyperbolic space. We now return to the 3-dimensional elastic problem. This problem is certainly the most interesting of all of the problems considered in this paper, and much of its theory still needs to be done ([20]).
Optimal Control Problems on Lie Groups
297
We shall begin by deriving the famous result of Kirchhoff known as the kinetic analogue for the elastic problem. Recall that H I ,Ha, H3 denote respectively the Hamiltonians of &, & and &, while G I ,G2 and G3 denote
Assume that (g(t),p(t))as an integral curve of l?. Denote by Hi(t) = Hi o p ( t ) , and by Gi(t) = Gi o p ( t ) for each i = 1,2,3. The following derivations will be based on the Poisson algebra isomorphic to Table 2:
~ G -I ( G I ,H } = --G3 H2 dt c2
H3 + -G2, c 3
H1 - (G3,H } = --"G2 Cl
+ "H2 G 1 C2
"
and
dG3 dt
"
- €H2
This information can be neatly written as
where
Equations (19) are of the same type as the ones considered by 0. Bogoyavlenski in ( [ 5 ] ) . In order to arrive to Kirchhoff's Theorem we need first to make some geometric observations. Assume that E = 0.Then equations (19) resemble equations (15) and (16) of the heavy top with cg-le3 identified with G(t), and the center of gravity Q 0 situated along the axis one unit from the
Jurdjevic
298
origin Ob.Equations (19) are naturally linked to theextremal curves of H via the following bilinear form: ((U,
1 A ) ,(b, B ) ) = a b - ~ T T ( A B )
for .any elements (a, B ) and (b, B) in C ( & ) . Here is is understood that ( a , A ) ,and @ , B )are the coordinates in the basis of C(&) described in Table 2. Then for every integral curve ( g ( t ) ,p ( t ) ) of I?, the projection p ( t ) is identified with
+
+
+
P ( t ) = Gl(t)B1 G2(t)B2 G3(t)B3 *l(t)Al Recalling our earlier notations that g ( t ) = following conservation laws:
1
+ H2(t)A2 + H3(t)A3.
P;t,
I
i t ) we have the
+
R(t)G(t)= U , and R ( t ) a ( t ) s(t)X a = b
(20)
for some constants a and b. Equation (20) can be either verified by differentiating using (19), or through (17). It then follows from (20) that 1@(t)112 = G: G; G: = constant, and that
+ +
G(t) f i ( t )= G l H l + GzH2 + G3H3 = constant. 9
Let us now consider the non-Euclidean elastic problem. It is easy to verify through equations (19) that l\e(t)llzE l l f i ( t ) 1 1 2 and d(t) 8(t)are the integrals of motion. These integrals correspond to the Casimir elements in the enveloping algebra generated by the Poisson algebra spanned by HI, Hz, H3, GI, G2 and G3 (for further details concerning Casimir elements see ([6] and [13])). These Casimir elements correspond to the following Poisson algebra Hz,H3, GI, G2 representations: Let C denote the algebra spanned by HI, and G3. &present each element L = CLl aiHi biGi by the matrix
+
+(L)=
IG I
such that
A^ =
+
in g& (R) where A and B are 3 x 3 antisymmetric matrices
(ii)
and
3=
(ii) .
Jurdjevic
300
dG3 dt
1 - Z(-HlGz + H z G l ) - EHZ.
”
Denote by z ( t ) = +(H1+iHz), and by w ( t ) = G l ( t ) + i G z ( t -) E . Then, 1 2-d z = -HzH3 dt 2
+i(-
+ G3) = iH3.z + iG3.
Thus dz d 22- = - z 2 ( t ) = -iH3z2 dt dt
+i G 3 ~ .
Furthermore,
= i G 3 ~- i H 3 ~ . Therefore,
d -(z2 dt
-W)
= -iH3(z2 -W).
But then thecomplex conjugate zz - W satisfies $ ( z 2 - W ) = iH3(z2 - W), and therefore & ( z 2 - w ) ( z 2- W ) = 0. Thus, I a ( H 1 + i H ~ ) ~ ( G l + i G z - e ) 1 ’ is another integral of motion. Recalling our earlier notations a1 = and 0 2 = the above integral can be expressed more simply as
%
9,
(0: - 0:
- G1 + e)’
+ (2n1Clz - G z ) =~ constant.
When E = 0, this integral is the same as the one obtained by Kowalewski for the rigid body with its center of gravity Q 0 in the equatorial plane; that is, its component in the Z3 direction is equal to zero ([M]). Recent papers of J. P. Francoise, E. Horozov and P. Van Moerbeke listed in the bibliography explain Kowalewski’s mysterious integration techniques in terms of the geometry of certain Abelian varieties associated with the integrals of motion. While it seems perfectly plausible that these exple nations could be extended to non-Euclidean situations to yield analogous results, this paper however suggests that thereare more fundamental issues which are not yet fully understood. It might be of some interest to end the paper with an observation concerning the Lax pair form of equations (19).
Optimal Control Problems on Lie Groups
301
Both S04(R) and SO(3,l) are semi-simple groups, and therefore the trace form is non-degenerate on them. If we identify the elements of C* with C via the trace form K(A,B) = -&Trace(AB) for A and B in C, then an element p in C* is identified with
I
-H G 32
I
0
.H1
Then the projection p ( t ) of an extrema1 curve ( g ( t ) , p ( t )is) identified with P ( t ) in C. The conservation laws (17) imply that g ( t ) P ( t ) g - l ( t )is constant. Upon differentiating this relation we get
where
10 A(52)= I o
6,
Io
-E
0 I
0
523 O
0
-521
-522
S21
0
I
I
with 5 2 i= i = 1,2,3. It is easy to verify that equations (19) and (21) are the same for E # 0. These considerations imply that P ( t ) is an isospectral path in C. The characteristic polynomial r$(X) associated with P ( t ) is equal to
d(X) = -x4 - € P ( l ( i l ( t ) 1 1 2 + E @ ( t ) l l 2 )
+ €E@)G@). '
The isospectral considerations do not contain any new information, since we already know that the coefficients ofp ( X ) are conserved along the flow. This observation shows that the extraintegral of motion in the Kowalewski case is not a consequence of an isospectral form.
REFERENCES [l] R. Abraham, J. Marsden, Foundations of Mechanics, Benjamin/ Cummings, 2nd edition, Reading, Ma 1978.
Jurdjevic
302
S. S. Antman, K.B. Jordan, Qualitative aspects of the spatial deformation of non-linearly elastic rods,Proceedings of the Royal Society of Edinburgh 73A, 5, (1974) 75 85-105. V. I. Arnold, Les Mdthodes Mathdmatiques de la Mdcanique Classique, Editions Mir, 1976. V. I. Arnold, V. V. Kozlov, A. I. Neishtadt, Mathematical Aspects of Classical and Celestial Mechanics, Encyclopedia of Mathematical Sciences, Vol. 3, Springer-Verlag, 1988.
o n SO(4) andtheir 151 0. Bogoyavlenski, IntegrableEulerequations physical applications, Communications in Mathematical Physics 93 (1984), 417-436. N. Bourbaki, Elements of Mathematics, Lie Group and Lie Algebras, Capter 1-3, Spriger-Verlag, 1989.
R. Bryant, P. Griffiths, Reductions for constrained variational problems and $ $ k2 ds, American Journal of Mathematics 108 (1986), 525-570.
L.E.Dubins, O n curves of minimal length with a constraint o n average curvature and with prescribed initial and terminal position and tangents, Americal Journal of Mathematics 79 (1957), 497-516. J. P.Francoise, Action-angles and Monodromy, Asterisque 150-151 (1987), 87-108.
P. Griffiths, Exterior Differential Systems and the Calculus
of Va-
riations, Birkhauser, Boston (1983).
P. Holmes, A. Mielke, Spatially complete equilibria of buckled rods, Archive for Rational Mechanics and Analysis 101 (1988), 319-348.
E. Horozov, P. Van Moerbeke, The full geometry of Kowalewski’s top and ( l ,2)-Abelian surfaces, Comm. on Pure and Applied Math. Vol.
XL11 (1989), 357-407. J. D. Humphreys, Introduction to Lie Algebras and Representation Theory, Graduate Texts in Mathematics, Springer-Verlag, 1972. V. Jurdjevic, Non-euclideanelastica, Amer. J. Math. 1995 (117), 93-125.
Optimal Control Problems on Lie Groups
303
V. Jurdjevic, Thegeometry of theplate-ballproblem, Arch. Rat. Mech. Anet. 1993 (124), 305-328. V. Jurdjevic, Optimal control problems on Lie Groups, Analysis of Controlled Dynamical Systems, Progress in Systems andControl Theory, Birkhauser, 1991, 275-284. V. Jurdjevic, I. Kupka, Control systems on semi-simple Lie Groups and their homogeneous spaces, Ann. Inst. Fourier 31 (4) (1981), 151179. S.Kowalewski, Sur le probldrne de la rotation d’un corps solid autour d’un point f i e , Acta Mathematica 12 (1989), 177-232. J. Langer, D. Singer, The total squaredcurvature of closed curves, Journal of Differential Geometry 20 (1984), 1-22. J. Langer, D.Singer, Hamiltonian aspects of the Kireh0.f elastic rod, a preprint. A. E. Love, A treatiseontheMathematicalTheory of Elasticity, Cambridge University Press (4th edition), 1927. R. Montgomery, Abnormal minimizers, SIAM J. Control Optim, 32 (1994), 1605-1621. J. I. Neimark, N. A. Fufaev, Dynamics of Non-holonomic Systems, Translations of Mathematical Monographs, Amer. Math. Soc., Vol 33, 1972. H,PoincarB, Sur uneforme nouvelle des equations de la Mechanique, Comptes Rendus des Sciences, Tome 132 (1901), 369-371. L. S. Pontryagin, V. G. Boltyanski, R. V. Gamkrelidze, E. F. Mischenko, The Mathematical Theoryof Optimal Processes,Wiley, New York, 1962. H.J. Sussmann, G. Tang, Shortest paths for the Reeds-Shepp car: A worked out example of the use of geometric techniques in non-linear optimal control, a preprint.
Nonlinear Control and Combinatorics of Words Matthias Kawslci Department of Mathematics, Arizona State University, Tempe, Arizona 85287
Abstract. This article attempts to bring together recent work in nonlinear control systems and in algebraic combinatorics of words. On the control side recent results in controllability, nilpotent approximating systems, and applications to path planning are surveyed. On the other side, after a brief survey of various bases for free Lie algebras recent developments in the algebraic combinatorics of words are discussed. These two areas are linked together by the applications to and of the Chen series, Fliess series, and product expansions of these series, culminating in the introduction of chronological algebras.
0. Introduction This article surveys an area where nonlinear control theory and algebraic combinatorics overlap. The unifying theme is the increasingly sophisticated treatment, analysis and usage of the Chen-Fliess series. On the This work WBB partially supported by NSF-grants 90-07547 and 93-08289.
306
KawsM
control side we restrict our attention to the problems of local controllability, nilpotent approximations and very briefly,the pathplanning problem. On the combinatorial side we survey the development of the theory of bases for free Liealgebras, and then concentrate on the problem of resolving the identity map on some free algebra, which leads to the problem of finding explicit formulas for the dual bases of the Poincard-Birkhoff-Witt basis built on bases of the free Lie algebra. In trying to be accessible to audiences of both control theorists and the combinatorists, most topics are motivated at an elementary level, and complete proofs are given only where they are particularly illuminating (mostly in the last sections which unite the control and the combinatorial points of view). The article gives an extensive list of references where original proofs can be found; while this list is far from complete, an effort has been made to include many different points of view. The organization of the article is as follows: Starting with an applied mechanical problem, the controllability property is defined and several recently obtained conditions for controllability are reviewed, emphasizing the importance of the internal structure of the iterated Lie brackets involved. In the second section nilpotent control systems are investigated: After presenting coordinate realizations of nilpotent systems, algorithms for the construction of nilpotent approximating systems are given. Both of these rely onthe graded and filtered structures of the Lie algebras of vector fields. The robustness of controllability and stability under such approximations is discussed. The third section reviews the history of adaptations of the Chen path integrals to control, then called the Chen-Fliess series, presents Sussmann’s explicit product expansion of the series, and concludes with some remarks about applications to the pathplanning problem. The second part of the article takes a more combinatorial point of view and begins with elementary properties of various algebras associcc ted with words, with particular emphasis on the shuffle product, its role in the characterization of Lie elements and of exponential Lie series. The next section reviews the development of various bases forfree Lie algebras: Starting with P. Hall’s work in group theory, various bases have been di-
Nonlinear Control and Combinatoricsof Words
307
scovered, and eventually all have been shown to come from one underlying principle. Nonetheless, there are subtile differencesin the suitability of these to applications in control. The last section tries to put the combinatorial and thecontrol theoretic perspective even closer together. One focus is on the relation of Sussmann’s product expansion of the Fliess series in entirely control theoretic terms and a combinatorial resolution of the identity mapin the algebra of words. The other focal point is the chronological product, which combinatorially essentially is simply a pre-shuffle product, but from a geometric-control point of view it appears to play a much more fundamental role.
1. Controllability This section gives the basic definitions relating to controllability and briefly reviews the main criteria in use for deciding controllability. Throughout the section we refer to the following example for motivation and illustration. Consider a penny rolling without slipping on a plane, but able to rotate about its vertical axis (MClamroch and Bloch [l]).Denote by (g,%) the point of contact with the plane R2,by 0 the steering angle (e.g. with respect to the y-axis, and by 9 the rolling angle (i.e. about the horizontal symmetry axis, relative to a fixed orientation). Furtherlet TI and Tz be the torques about the vertical and a suitable horizontal axis. The equations of motion are: Jlde, = TI
JzQ = Tz (1)
p
= TQCOSO
i =rbsin0,
where Ji denote moments of inertia and T is the radius of the penny. One typical controllability question to ask is: By appropriately choosing the torques Ti (asfunctions of time), is it possible to get from any given initial state (position, orientation, and angular velocities) to any desired
Kawski
308
terminal state? One may consider unbounded torques, or more realistically torques bounded by someconstants. A local controllability question to ask is whether it is possible to reach any nearby state (from a given state) without long excursions. If it is possible to reach a certain state, one may further ask which is the best (optimal) way to reach that state. Optimality criteria could be to minimize the integral of the energy, to minimize time, or rpany others. Once it is known that one can reach a state, one may ask for an algorithm that produces a control (in this case, torques as functions of time) that steers the system to that state (a special case of the path planning problem). To phrase such questions in a generalized setting introduce the state variable x = ( u / T , Z / T , a, O , b ,6 ) E M = R' x T 2xR2 that takes values in the six dimensional manifold M (here TZdenotes the torus [0,27r] x [0,27r] with appropriately identified edges). As controls ui take the normalized torques JtTITi. Also introduce the vector fields f, 91, g2
a
a
a + x5 sinx4- + x5- + z6-a ax4 8x2 ax3
f(x) = x5 cosx48x1
(or write the vector fields as column vectors f(x) = (x5 cosx4,x5 sinx4, x5, x69 O,O, ) T , g1(x) = (O,O,O,O,1,O)T and g2 = (O,O,O,O,0,1)*). With this, the controlled equations of motion are in the following general form (3) of a nonlinear control system (&ne in the controls). Before coming back to the example, we review the some standard terminology and a few known criteria for controllability. In the following consider controlled dynamical systems of the form m
j=1
where the state x takes values in an analytic manifold M", f and gj are analytic vector fields on M , and thecontrol U ( * ) takes values in a nonempty compact subset U E R"' containing 0 E Rm in its interior. (If 0 6 U,pick
Nonlinear Control and Combinatorics of Words
309
+C
any constant U' E intU, define newvector fields F(x) = f ( x ) ~jogj(~), Z j ( x ) = g j ( x ) , and new controls G(t) = u ( t ) - uo taking values in = {W E R" : W U' E U} which now contains zero in its interior,)
+
fi
The reachable set R P ( T )from p E M at time T is the set of all endpoints x(T,u) of trajectories x ( . , U ) of (3) that start at p at time t = 0 and that correspond to admissible controls U defined on [0,TI. Here take the set of all measurable U-valued mappings that are defined on [O, 00) as the set of admissible controls U.When appropriate also consider controls defined on finite intervals [O,T].(In other places, various authors require admissible controls to have various other regularity properties, e.g. to be of class C', or piecewise constant, piecewise linear, piecewise C', and the like. For a detailed study of how controllability properties depend on the regularity properties of the controls and the regularity properties of the vector fields see Grasse [2]). Many definitions for various special kinds of controllability that have been given in the literature. We only consider the following: The system (3) is globally controllable if every terminal state q E M can be reached from every initial point p E M in some time T < 00 by some admissible control; i.e., UT
0. The system (3) is small-time locally controllable (STLC) about p if p E int q ( T ) for every T > 0.
npEM
The next task is to find simple criteria that allow one to algorithmically decide whether a system is controllable or not. In the case of linear systems x = Ax Bu with x E R" and U E R" with matrices A and B of the appropriate dimensions such a criterion is given by the Kalman-rankcondition: The (linear) system is controllable if and only if the compound . A"-lB ,. has matrix formed from all columns of the matrices B , A B , A 2.B full rank. Here controllable may be taken to mean e.g. globally controllable with piecewise constant controls, or STLC about zero with controls taking values in the cube [-l, lIm.
+
The key observation towards similarly elegant criteria for general nonlinear systems on a manifold, is that the matrix products in the Kalman-
Kawski
310
rank-condition are to be generalized to Lie products of the vector fields f and g j (compare e.g. Sussmann, [3, 41). The Lie product (or Lie bracket) of two smooth vector fields v and W is defined as the unique vector field [v,w]such that [v,w]cp= w(vcp) - v(wcp) for every smooth function cp E C w ( M ) . In local coordinates write [ v , ~=] ( D v ) w - ( D w ) where ~ DV is the Jacobian matrix of the (column) vector field v. For higher order iterated brackets it is convenient to also use the notation ( a d o v , W ) = W and inductively (adj" v , W ) = [v,( a d j v , W)]. Going back to the example of a rolling penny one easily finds that [g1, g2]= 0 , [f,g21 = whereas e.g. [f,gl](z)= cos x4 8 sin z4 8 G B ,and [ f , [ f , g a ] ] (= z )- ~ 5 s i n 3 4 & + ~ 5 c o s z 4 & .
+
&
+
Lie brackets not only are defined coordinate-free (or alternatively, they transform asdesired under coordinate transformations),but moreover, in the analytic case that we consider here, they contain all the geometric information of the system. This is made precise in the theorem (for detailed definitions see the original reference):
Theorem 1.1 (Sussmann [3]). Let M ,MU be simply connected real analytic manifolds and let L , be complete transitiveLie subalgebras of the Lie algebras of all analytic vector fields on M and G, respectively. Supand p E M and pose that @ is a Lie algebra isomorphism from L onto qE are such thatimage of the isotropy algebra of L at p under @ is the isotropy algebra of at q. Then there exists a unique diffeomorphism F :M + such that F ( p ) = q and such that F , ( X ) = @ ( X )for every x E L,
z,
Here, and in the following, L( f,g ) is the Lie algebra spanned by d l iterated Lie brackets of the vector fields f and g l , . , ,gm, and W ( p ) = ( v ( p ) : v E W } for a set W of vector fields. If m = 1, use the notation Sj for the linear span of all iterated Lie brackets containing j times the vector field g = g1 and an arbitrary number the field f.
.
One of the best studied cases of controllability is that of small-time local controllability about a rest point p of the uncontrolled vector field f (i.e.
of Words
Nonlinear Combinatorics and Control
311
f (p) = 0). For a detailed survey of the known conditions see Kawski [5]. m
(4) x = f(5)+ C u j g J a ) ,
5
E Adn, a(0) = p , f ( p ) = 0 , U E [-l, lIrn
j=l
with analytic vector fields f and systems.
gj.
We list the main results for such
Theorem 1.2. System (4) is accessible from p if and only if dimL(f,g)(p) = T I .
Theorem 1.3 (Linear condition). System (4) is STLC about p if dimS1(f,g)(p)= n. The following most simple example on M = R2 illustrates the need to consider brackets with several factors g. x1 = U
(5)
x2
= 5:
Iu(.)I 5 1 so = 0.
&
Here g(a) = and ( a d k g , f)(O) = (-)kk!&, while every iterated bracket containing any nonzero number of f's and fewer than k factors g v& nishes at so = 0. Clearly this system is accessible from so, but if k is an even number it is not STLC from zo (or controllable in practically any sense from so)because no points x E R2 with a2 < 0 can be reached from ao:.This motivates the Hermes condition, proved for planar systems by Hermes [6],and in the general case by Sussmann [7]:
Theorem 1.4 (Hermes condition). System (4)(with m = 1) is STLC about p if it is accessibleand S2k(f,g)(p)C_ (f,g ) ( p ) for every integer k > 0. A corresponding necessary condition for STLC was given byStefani [8]:
Theorem 1.5 (Stefani). If ( a d 2 ' g , f ) ( p ) $! Sak"'(f,g)(p) for some k E Z then the system (4) (with m = 1) is not STLC about p . In 1985 (Stefani [g])showed that the system x = U , = a, 2 = zsy is STLC. In this case the direction is obtained only from the bracket
g
Kawski
312
[[f,[f,g]],g], g], g], containing the controlled vector field g an even number of times. The following theorem generalizes the Hermes condition to the multi-input case, allows formore flexible weightassignments, and puts this example into a general framework: Theorem 1.6 (Sussmann [lo]). Suppose that there exists a weight 0 E [0,l] such that whenever k is odd and 11,L2, . . .l, are even then (6)
g)(O) E
c
L ( " W , d(0)
where the sum extends overall (k', l') such that W+Cjl$ < Qk+Cjl,. Then system (4) is S T L C about 'x = 0. Going back to the rolling penny, the following iterated brackets span the tangent space at so = 0 (and hence the system is accessible from xo):
Upon closer investigation one sees that the system is also STLC about xo by virtue Sussmann's general condition for controllability. Practically this means that it is possible e.g. to reach in arbitrarily short time (and thus without large excursions) states of the form x = (0, O,O,E , 0,O)(with small E # 0) from so = 0 using suitable torques. The following two examples (Kawski [5]) do not satisfy the above condition, yet by other means were shown to be STLC: In the system
x1 = U
of Words
Combinatorics Nonlinear and Control
3 13
an even number of factors g and anodd number of factors f . In thesystem
x1
=U
$1
= 2 23
+ 2 27
the onlyLie brackets giving the &-direction, are of the form fn = ( a d 2 (ad3g,f)f) and fu = ( a d 7 [ f , g ] , f ) Here . the first with six gs and three f s clearly corresponds to the uncontrollable term zi, but there is no admissible weighting that will make the other bracket fu with seven gs and eight f s be of overall lower weight. Nonetheless, it is shown in [5] that the term xi may dominate the term 21 even for arbitrarily small times and control bounds. A most recent more general controllability result by Agrachev and Gamkrelidze [l13 tries to put these examples into a more general framework. These examples and theorems clearly exhibit how conditions for controllability rely on the analysis of the iterated Lie brackets of the vector fields that define the system. The more general the conditions are, themore they require a detailed analysis of the internal structure of the brackets involved.
2. Nilpotent control systems
If L is any Lie algebra define the central descending series as the sequence of ideals L(i) c L givenby L(’) = L and inductively = [L,L(i)]= {[V,W ]: V E L , W E L ( i ) } .The Lie algebra L is called nilpotent if its central descending series terminates, i.e. if there is an integer k such that L ( k )= (0). A control system (4)is called a nilpotent system if the Lie algebra generated by the vector fields f , 91, . . . gm is nilpotent. The importance of nilpotent systems is that on one side they are very convenient to work with, e.g. solution curves z(., U ) may be found by simple quadratures (after a suitable decomposition), and on the other side the
3 14
Kawski
class of nilpotent systems is sufficiently rich that, when used as approximating systems, they preserve many geometric properties of the original system. We introduce some terminology (compare Goodman [12]):
For a fixed choice of coordinates (X',, , .S,) on R" and a sequence of positive integers T I , . . rn define a one-parameter family of dilations 6= by St(.) = ( t r l q , .. . ,trnxn).A polynomial function p = p ( s ) is homogeneous of degree k 2 0 w.r.t. the dilation d if p o 6, = t m p for all t > 0. Denote by P k the set of all polynomials that are homogeneous of degree k. A vector field V on R" with polynomial coefficients is called homogeneous of degree L if VPk E forall k 2 0, -4. Write for the set of all vector fields that are homogeneous of degree L with respect to 6. I
For example if r = (1,2,5) then p ( s ) = x! the vector field V ( x )= xi& E B-1.
&+
+ X! + 81x3 E P6 whereas
Note that a dilation induces gradations and filtrations on the spaces of polynomial functions P = @ k z O P k and vector fields with polynomial coefficients v = @e m.Specifically, one easily verifies that P k P e c %+c for all k, L 2 0, and similarly [24k, ne] 5 z k + l . To see the latter, if V E z k , WE and p E P, with S >_ -k L and S 2 0 then [ V , W ] p= W ( V p ) v(wP)c V(Pa+k) w('P,+t) c %+k+t.
+
-
U = Ut Nt. Again [ N k ,Ne] c N k + e . If e > rj for all dilation exponents rj, then N-, = 8. Consequently, if V ' , . . .V m E N-1 are vector fields (with polynomial coefWith
Ne =
Ujse3, also consider the filtration
ficients) of negative homogeneous degrees, then they generate a nilpotent Lie algebra. The following theorem essentially affirms that the converse is true also.
Theorem 2.1 (Kawski [13]). Let V 1 , ...V m be real analytic vector fields o n a real analytic manifold M " , whichgeneratethenilpotentLie algebra L = L(V',. , , V m ) .If p E M is such that diml(p) = n then there are local coordinates (X',. . .x,) aboutp and a family of dilations d = (6t)t>0 such that relative to these coordinates the vector fields V ' , . , V" havepolynomialcoeficientsand are of degree (-1) with respect to the dilation d.
.
Words of
Nonlinear Combinatorics and Control
315
A slightly stronger theorem with weakened regularity hypotheses W&S proved by Grabowski [14]. The proof of the theorem is constructive: Key steps in the construction are the selection of the dilation exponents (9)
Ti
= max{j
2 1 : dimL(j)(p)= (n+ 1)- j } ,
a choice of vector fields Wi E L(Ti)such that {W1(p), , , . ,Wn(p)}are linearly independent and finally the definition of the preferred coordinates ( X I , .I . ,zn)as the inverse of the map
@(x) = (expznW") 0... o (expzlW1)(p)
(10)
that isdefined on a neighbourhood of 0 E Rn. (Themap ( t , q ) + (exp tW)(q)denotes the flow of the vector field W . ) If a control system is not nilpotent then it may be possible to nilpotenP(z)v where cy and p are tize it by using a suitable feedback U = a(.) analytic Rm and Rmx"-valued functions, respectively, p(.) is invertible for every z E M, and v is considered a new control. The new vector fields y(z) = f(z) C j aj(z)gj(z)and E(.) = Cj$(z)gj(z) may sometimes be chosen in such a way that the Lie algebra they generate is nilpotent. Presently little is known when such a nilpotentizing feedback is possible. A necessary condition wasgivenin Hermes, Lundell, and Sullivan[15]. Recently, nilpotentizing feedback was used by Murray [16, 171 in the path planning problem. When such exact nilpotentization cannot be achieved one may try to approximate the system x = f(z) Cj U j g j ( 3 ) by a nilpotent system x = ?(x) + C j uj$j(z). The following construction is due to Hermes [18], seealso Stefani [19]: Suppose that the original system is accessible at zo = 0. Find iterated Lie brackets f w , , ,.. ,f w , of lengths r1 ,.. rn of the vector fields f and gj that are linearly independent at zo = 0 and that are such that the lengths ra are minimal. If necessary perform a linear change of coordinates to achieve that f?,,(O) = Consider the family of dilations defined by these coordinates and the exponents ri. Expand the vector fields f and gi in homogeneous polynomials and truncate the series at terms of degree 0 for f and -1 for gi. E.g. if f(z) = Cc-, .f'j(z), then set y(z) = C;=-, fj(z). Then ?E NOand 2 E N-1, and consequently the
+
+
+
&.
.
.I
3 16
Kawski
Lie algebra generated by the vector fields { ( a d gi) : 1 5 i I. n,j 1 0) is nilpotent. (To obtain L(T,g nilpotent simply truncate the series forf also at terms of degree -1.) This construction assures that the approximating system is also accessible. Moreover,if this approximating system is STLC, then the original system is also STLC [18]. Going back to the example of the rolling penny from the preceding section, choose e.g. V1 = 91, V2 = g2, V3 = [f,g2], V4 = [f,911, VS = [h, V41 and v6 = [h, Vs].The corresponding dilation exponents are T = (1,1,2,2,4,6).Introduce the new coordinates z = ( 2 5 , 2 ( $ , 2 4 , 2 3 , 2 2 , 2 1 23) and obtain thenew system with vector fields
Another important application of systems whose vector fields are homogeneous with respect to some dilation is in the problem of feedback stabilization, which is to find a feedback law U = a(.) (i.e. the control is a sufficiently regular function of the state), such that the closed loop system li: = f(s)+ ai(z)gi(z)is (asymptotically) stable about the origin z = 0. In this case the main results are of the type: If a feedback law that is homogeneous w.r.t. a dilation asymptotically stabilizes a system that also is homogeneous then the same feedback law will locally asymptotically stabilize a system that is perturbed by vector fields that areof higher degrees w.r.t. the dilation. For details see e.g. Hermes [20]and Rosier [21]. With the results mentioned above it makes sense to consider nilpotent systems as model systems that one may want to analyze first when studying e.g. controllability and stabilizability properties. For an optimal control perspective compare the investigations of the structure of the small-time reachable sets of free nilpotent systems, and for higher order normal forms by Krener and Kang [22] and Krener and Schattler [23]. The problems arise how to generate all possible nilpotent systems, and moreover, how to pick a suitable coordinate realization of these systems. We will return to thefirst question when considering realizations of free
xi
Combinatorics of and Nonlinear Control
Words
317
nilpotent systems in the following sections. Regarding the second question consider the following two nilpotent systems on R4:
In both cases the Lie algebras generated by the vector fields are four dimensional, spanned by the fields (9, [f,g ] , [f,If, g ] ] , [[f,[f,g ] , [f,g ] ] ) and by Theorem 1.1 the systems are equivalent up to a diffeomorphism. In this case this diffeomorphism takes the form of the coordinate change y = ( x I , x ~ , 23x2 x ~ , - 2 4 ) . Clearly the second system is not STLC, and consequently the first one is not STLC either. While this lack of controllability is apparent in the second realization, it is by no means obvious in the first. This exemplifies that not all coordinate realizations (in polynomial cascade form) are equally useful, and the question arises how to pick a coordinate representation that is particularly suitable for e.g. studying controllability. To conclude this section consider the following example of a nilpotent dilation homogeneous system which shows the need to not only carefully choose the coordinates, but also the iterated Lie brackets one calculates (compare [ 5 ] ) .
While this system as a whole is clearly not STLC, consider the projection
Kawski
318
R f S 4 ’ ( t=)
((xg,xg,x10,x11,x12)(t,u)} of the
reachable sets onto a five
dimensional subspace. Calculate the following iterated Lie brackets all containing three times the factor f and four times the factor g (i.e. all of the type of possible obstructions to controllability according to Sussmann’s general theorem):
The first three brackets correspond to obviously uncontrollable directions, exemplified by three supporting hyperplanes of R f ’ 4 ) ( ti.e. ) , xj(t,U) 1 0, j = 8,9,10 forall ( t , u ) . On the other hand it has been shown [5, 111 that the other two directions are independently controllable. While there are many more iterated Lie brackets containing three f s and four gs, all others can be expressed as linear combinations of the above only using anticommutativity and the Jacobi identity (the above form a basis for the homogeneous component L@14)(f,g) of the free Lie algebra generated by f and g considered as indeterminants.) The above choice nicely splits into bases for the two and the three dimensional subspaces of controllable and uncontrollable directions; clearly not every basis is expected to do this. Very loosely stated this leads to the problem of finding a basis of formal brackets for the free Lie algebra on (m 1) generators that splits into bases for the controllable and the uncontrollable directions. A final remark in this section concerns the use of nilpotent systems for the pathplanning problem. Considering systems of the form (4), initialized at a point p E M , the goal is an algorithm that given a desired terminal state q will automatically generate an admissible control U = uq that will steer the system from p to q. In the special case of nilpotent systems
+
Nonlinear Control and Combinatorics of Words
319
(or nilpotentizable systems) (assumed to be globally controllable) several solutions have been given, compare e.g. Laffaribre and Sussmann [24], Sussmann [25], Jacob [26, 271, Murray and Sastry [16]. First one brings the systems into cascade form by transforming to preferred coordinates (e.g. using Theorem 2.1). Next one picks a suitable family of controls ua that are parametrized by a finite number of real parameters CY = ( ~ 1 , .. ,ae), e.g. linear combinations of sinusoids with suitably related frequencies, piecewise constant or polynomial controls etc. Using the polynomial cascade form of the system explicitly compute the endpoints of the corresponding trajectories z(.,uor),and finally symbolically or numerically invert this map to obtain Q as a function of the endpoint. To obtain one general all-purpose algorithm one may consider the largest possible nilpotent systems, i.e. nilpotent systems that are aa free as possible. In thenext sections it is shown howto obtain such systems, but it remains open which of the in general many equivalent systems one should use, a question which again is closely related to picking a suitable basis for the free nilpotent Lie algebra.
.
3. Lie series in control This section surveys the key role played by the Chen-Fliess series in the study of affine nonlinear systems. The following sections will return to this series from a more combinatorial and algebraic perspective, here the emphasis is on the control theoretic point of view. In 1957 K.T. Chen [28] associated the formal power series
to a (smooth) path Q : [a,b] + R". The Xi are noncommutative indeterminants and starting from the line integral ,S dzi, the integrals for p 2 2 are inductively defined by b
(15)
S dzil . . .dxi, = S ( at1 dzil . ..d ~ i , - ~dai,(t), )
a
a
KawsM
320
where at denotes the portion of a ranging from a to t. It is shown that the logarithm of this series is a Lie element. The formal series is employed to derive certain Euclidean invariants of the path a. In the early seventies M. Fliess (e.g, [29, 301) realized the importance of this formal series for control theoretic purposes. Rewritten in various forms this formal series is still at the heart of many studies of properties of nonlinear control systems. It plays a particularly central role in the problem of finding state space realizations of input-output systems, typical are bilinear and nonlinear realizations studied by Krener [31],Jakubczyk [32], Crouch and Lamnabhi-Lagarrigue [33, 341 and Sontag and Wang. We follow the presentation of the Chen-Fliess series as in Sussmann [7]: Consider the free associative algebra A = A ( X 1 , . . ,Xm) generated by m indeterminants X I , . . . X m . For a multiindex I = (il, . .is) with integers 0 5 ij 5 m set X r = Xil Xi, X i a .Also considerthe algebra A^ = i ( X 1 , . . ,X,) of formal power series in XI, .. . ,X , with real coefficients. Let U be the set of Lebesgue integrable functions defined on someinterval [O,T] and taking values in Rm.For a point P E A^ and a U ( . ) E U consider the differential equation in A^ (initial value problem)
.
. ..
.
I
dS dt = S(X0
m
+ j=1 CUiXi),
S(0) = P.
A solution to (16) is a function S : [O,T]+ A^ such that S(0) = P and (16) holds for each coefficient. Hence, if P = C p r X I and S ( t ) = C s r ( t ) X I then for each I = (il, . . .is)
where J = ( i l , ,, . iSv1).In the particular case that P = 1, the solution is sr(t) = U r , where the integral stands for
Si
(18)
t
t t.
0
0 0
ta-1
ta
SUI = I 1 . - . 0
...U i l ( t z ) U i l ( t ~ ) d t l d t a... d t s .
ui.(te)Ui,_l(tr-l) 0
If U E U define Ser(T, U ) - the formal series of U - to be the solution of (16) with initial condition P = 1at time T. The set U is a semi-group under
Nonlinear Combinatorics and Control
of Words
321
the concatenation product, defined as follows: If U , W E U have domains [0,2'1] and [0,2'21, respectively, define the product uflw E U with domain [0, 2'1 2'21 by uflv(t) = u ( t ) if t 5 21' and u j v ( t ) = w(t -)1'2 otherwise. A key property of S e r is:
+
Theorem 3.1 (Sussmann [7]). The map Ser : U + A^ is a semi-group monomorphism, i.e. Ser is injective and satisfies Ser(ugw) =Ser(u)Ser(v). Returning to thecontrol system, we now use f j in place of gj for better notation m
j=1
with z E M", and here we consider C" vector fields f j on M . Each vector field f j is a first-order differential operator on the space C"(M)of smooth functions on M . For a multi-index I = (il, . . , is) with 0 5 i j 5 m the product f I = fij . , . fi, is an s-order partial differential operator on C"(M). Upon substituting the vector fields fi for the indeterminants Xi one obtains the formal series of partial differential operators on M m
Applying the operators fI to a smooth function 9 E C m ( M )one obtains a formal series of smooth functions on M
In Sussmann [7] it is shown that (when evaluated at zo and considered as a function of 2') this series (Serf(T,u)O)(q) is an asymptotic series for the propagation of 9 along trajectories of (19). In the case that the vector fields f j and thefunction 0 are analyticthe series (Serf(?', u)@)(zo) converges to @(z(T,U ;zO)):
Theorem 3.2 (Sussmann [7]). Consider system (19) wath f j analytic vector fields, K C h4 compact, 9 E C w ( M )and A > 0. Then thew exists T > 0 such that for every zo E K , for every u E U with Iu(.)I 5 A that
Kawski
322
is defined on an interval [O,T,] E [O,T]the solution curve x(t,u;xO) is defined o n [O,T,], the series (Serf(t,u)@)(xo)converges to @(x(t,u;xo)) and the convergence is uniform as long as x E K , U E U with lu(.)I 5 A , and Tu5 T . One of the many uses of this series is for proving necessary conditions for controllability, e.g. of Theorem 1.5. Assuming that (adzkg, f)(p) is linearly independent of all brackets in S2”l (f, g) for some k > 0 there is a smooth function @ : Rn + R with @(p)= 0, (ad2kg , !)@(p) = 1 and f?,@(p)= 0 for all fir E S2”-’(f,g). After suitable manipulations of the series and repeated integrations by parts etc. it is shown that essentially T
@ ( x ( t u;p)) , = ((2k)!)-’ I ( u ( t ) ) 2 k d t+ 0(lT1~”+’) 0
from which one concludes that for small T the reachable set from p cannot contain a neighbourhood of p , i.e. the system is not STLC from p . In Kawski [35]a similar necessary condition for STLC, dealing with brackets that have four factors g and three factors f was established. The proof required a partial factorization of the Fliess series in order to be able to get useful estimates of the iterated integrals. More specifically, all terms in the seriesinvolving three or fewer gs were rewritten in terms of a PoincarB-Birkhoff-Witt basis with corresponding iterated integrals simplified after repeated integrations by parts. Then an argument similar to the one used by Stefani can be used to prove this necessary condition. This proof made it very apparent that theway in which the Fliess series is written is not the most convenient one for applications to control. Also, the Lie series character of the series, already shown by Chen [28], is not at all apparent from the formula. We will come back to the question when a formal power series is an exponential Lie series in the next sections. In 1985 Sussmann [36]gave a product expansion of the Fliess series as a directed infinite product of exponentials: c
(22)
Ser(T,u) =
exp(CH(T,u)H). H€%
Combinatorics and Nonlinear Control
of Words
323
Here, the product is taken over a Hall basisof the free Liealgebra generated by the indeterminantsX,, X I , . . ,X,, see below,and explicit formulas for the coefficients CH(T, U ) as iterated integrals of the controls U are given. Here, a Hall set is defined (compare Bourbaki [37]) M a totally ordered subset 'F1 of the set M = M ( X 0 , X,) of formal brackets such that U
I
1. Each generator X i is in 'F1 2. If H , H' E 31, H 5. H' then length(H) 5 length(H') 3. If H E M is not a generator, so that H = [HI, Hz] with Hi E M , then H E 'F1 if and only if H I , H ZE 74 with H1 < H and either (i) HZ is a generator, or (ii) HZ = [Hzl, HZZ] with H21 5 H I . Note that as formal brackets e.g. [Xo,[X1, [Xo,X1]]] and [X,, [XO, [Xo,X1]]]are different, whereas the Jacobi identity implies that as elements of a Lie algebra they are equal. Under the canonical map from the space M of formal brackets into the free Lie algebra L = L(&, . . .X,) the image of a Hall set is a basis for L, compare M. Hall [38].
To every formal bracket H E 'F1 and every control U E U defined on [0,TI associate two functions C H ( U ) and C H ( U )also defined on [0,T]. C H ( U ) will be integrable and C ~ ( u ) ( = t ) $ c ~ ( u ) ( s ) d s If . H = X i then let Cxi = ui. If H = (adk H I ,Hz) (as usual (ad v, denotes the mapping W + [v, W ] )with H I ,HZ E 31 (and either HZ is a generator or its left factor is different from H I ) then define S )
The proof of the product expansion given in [36]uses standard differential equations techniques (variation of parameters). That article also gives bounds for the iterated integrals, thus providing for convergence results for the infinite product. Section 6 will return to such infinite product expansions, that time from a more combinatorial point of view, showing that the combinatorial formulas for the coefficients in this product expansion in some sensedate back as far as 1958 Schutzenberger [39], however, there clearly without any analytic interpretations, i.e. no integrals and no convergence results.
Kawski
324
4. Words and the shuffle product We rather closely followthe terminology and notation of Lothaire '[40]. Let Z be a finite set that we call an alphabet. Its elements are called letters. A wordover Z is any finite sequence W = (al,az, . . ,,a,) with aj E 2. The length ( W (of W is the length S of the sequence. Write e for the empty word (empty sequence). The set of all words of length S is denoted by Z 8 ,and the set of all words is Z*.It is equipped with a binary operation obtained by concatenating two sequences (al,az, . . ,a,)(bl, bz, ,a,) = (al, Q,. , . , a,, bl, bz, . . , ,a,). As this product is clearly associative write the word (noncommutative monomial) as W = a1a2.. a,. The empty word e satisfies we = W = ew for any word W E 2". The set Z* together with the concatenation product has a monoid structure. The closely related free semigroup is Z+ = 2" \ {e}. Note that Z* has the graded structure Z* = Zn, where w z E ZTS8whenever w ~ Z ~ a n d z E Z ~ .
.
...
.
e,"==,
The free associative algebra over 2, that is the linear space of finite linear combinations of words W E Z* with real coefficients, is denoted by A . Also refer to A as the algebra of noncommutative polynomials in m = 121 indeterminants. This space also has a natural graded structure
A = $n=O An where each homogeneous component An is the subspace spanned by all words of length n. Write A+ for the algebra of 2'. Finally, A* is the linear space of all formal power series C,EZ.c,w (with real coefficients c,). To construct the free Lie algebra L = L(Z)generated by 2 first consider the free magma M = M (Z), the set of all parenthesized words over 2 (compare Bourbaki [37]): By induction define the sets Z(i) by setting Z(1)= Z and Z(8)= U::: 2(,) x 2(,-,). E.g. if Z = { a , b, c) then ( ( ( a , c ) , b )@,c)) , E Z(5). The free magma is M ( Z ) = Uzl Z(,)and may also be identified with the set of all formal brackets over 2, compare the preceding section. The algebra of the magma with real coefficients is denoted Lib(Z), e.g. ( ( a , b ) , c )- 3 ( b , c ) E Lib(2). The free algebra L over Z is the quotient
of Words
Combinatorics and Nonlinear Control
325
algebra L ( 2 ) = Lib(Z)/3 where 3 is the two sided ideal generated by ) ( ( c , a ) , b ): a,b,c E Lib(Z)}. E.g. as the set { ( a , a ) , ( ( a , b ) , c ) ( ( b , c ) , a + elements of Lib({a, b} the formal brackets (a, (b, (a, b ) ) ) and (b, (a, (a,b ) ) ) are distinct elements, however as elements of L they are identical. Note that each element in M that is not in 2 has uniquely determined left and right factors, and thus a unique factorisation, whereas elementsof the free Lie algebra do not have right or left factors.
+
Consider the map : Lib(2) + A definedonbasis elements ( w , q ) E M \ Z by L ( ( w , q ) ) = ?I(w)x(q) - ;(q)L(u),and ;(a) = a for a E 2. It is clear that 3 lies in the kernel of and thus there is a map h : L + A such that = h o where x : Lib + L is the projection map. Elements x E A that liein the image of h are called Lie elements. For example a = h(a) E 2 C A and aab - Paba baa = h((a,(a,b ) ) ) E A are Lie elements. A simple criterion to decide when an element x E A is Lie uses the diagonal mapping, that is the algebra morphism A : A + A @ A that is defined on letters a E 2 as
x,
x
+
A(a) = e @ a + a @ e .
+ +
+
+
Thus e.g. A(&) = ab @ e b @ a a @ b e @ ab. Note that, however, A(ab - ba) L (ab- ba) @ e e @(ab- ba). Let 9 E A be theset of elements W E A such that A(w) = e @ W W @ e. Clearly 2 E G. Moreover, if w,z E B then
+
Thus the set B is a Lie mbalgebra of A (with the commutator product) and it contains all Lie elements. The following theorem asserts that the converse is true, also.
Kawskl
326
Theorem 4.1 (Friedrich's criterion). An element W E A is Lie if and only if A(w) = w @ e + e @ w . One may identify A* with the space of all real valued linear mappings defined on A . Specifically, for
C AWw E A * X = C xWwE A* A=
and
define the linear map 1
This sum is finite since for every z E A all except a finite number of the coefficients zWvanish. The shuffle product is defined as the transpose W : A* 8 A* -b A* of the diagonal mapping A : A + A 8 A. Specifically, for words v, W E Z* define v W W as the unique element in A* satisfying (27)
(V
W W ,z ) = (v @ W ,A(z))
for all z E Z*.
For practical calculation the following recursive formula, in terms of the concatenation product, which may also serve as an alternative definition, is more convenient. For any word W E Z* set e W W = W = W W e. For (possibly empty) words w,z E Z* and letters a,b E 2 define e W a = aWe=aand (28)
+
( a w )W ( b z ) = a(w LLI ( b z ) ) b((4w)LLI z),
or equivalently (29)
+
(wa) W ( z b ) = (W W (zb))a ( ( w a )W z)b.
Then extend linearly to A*. The equivalence of these two definitions is straightforward: Let a, b, c E Z and v, w,zE Z*. Using the Kroneckerdelta
Nonlinear Control and Combinatorics of Words
327
The shuffle product is obviously commutative. Via a somewhat lengthy, but straightforward calculation one may directly verify that the shuffle product is associative. Every element x E A* c m be written as x = Cr=,x,,where each x,, is a finite linear combination of words of length n. The formal power series x E A* is called a Lie element if each xn.is a Lie element. For every x E A* with zero constant term (i.e. the coefficient of the empty word is zero) define exp(x) = C;=,(P/n!)and log(e x) = CG1(-)j+lxj/j where as usual 'x = e etc. One easily verifies that log(exp(x)) = x and exp(log(e+z) = (e+x). In [28] Chen showed that thelogarithm of the Chen series is a Lie element, or in other words that the series is an exponential Lieseries. In [41]Reegave the following criterion for exponential Lie series, which is closely related to the Friedrich's criterion:
+
Theorem 4.2 (Ree 1958 [41]). A series x = CwEZ. zww E A* with x, = 1 is an exponential Lie series, i.e. log(x) is a Lie element, if and only if the coeficients xw satisfy the shufle relations, that is x,xz = <(W W z),
where
((CvEz. cvv) = CvEZ. cvzv.
Using this criterion, it is easy to verify:
Proposition 4.3. The Chen-Fliess series is an exponential Lie series. To every word W E Z* associate m iterated integral. Specifically, let U be the space of integrable functions defined on intervals [O,T]or [0, 00) with values in Rm.(If necessary, extend a control U defined on a finite interval [O,T] by setting u(t) = 0 for all t > T.)Inductively define the
Kawski
328
mapZ:Z*xR+xU+R.IfT10,anduEUthen Z ( e , T , u )= 1 (31)
and
T
1
E(aw,T,u) = u a ( t ) Z ( w , t , u )d t
if a E 2,and W E Z*.
0
(Here the letters a E Z index the component functions U, of the Rm-valued function U . This amounts to choosing an ordering of 2.) Extend Z linearly %,W E A * , T 2 0 and U E U and if the series to all of A. If z = C,EZ* ~ , E ( W ,T,U ) is absolutely convergent then write E(., t ,U ) for this series. A slightly different view is to consider the associated map a mapping A* into the space of maps from U into the space of locally absolutely continuous functions on R+, i.e. a : A + Mappings (U, AC([O,oo),R)) defined on Z* by a(w)(u)(T)= Z(w,u,T). We refer to the image of A under S as the space of iterated integralsZ,and use the convenient notation a, = .(W) for W E Z*,and more generally a, = a ( . ) = z,a, for z= Z,W E A. While not hard, it still is instructive to once directly verify the proposition, using Ree's theorem: Since the linearity properties are clear we = only need to verify that awwr = awarfor a11 w,z E Z*,i.e. a,~,(T,u) a, (T, u)a,(T, U ) for all U and all T. Proceeding by induction on the sum k = IwI lzl of the lengths of the words W and z , the start is trivial: If k = 0,i.e. W = e = z then W W z = e W e = e and also a,a, = 1 = a,.The case k = 1 is similarly simple. The pattern already emerges in the case of = IzI = 1, i.e. both W = a and z = b being letters.
+
%Ub(T, U )
= Sab+ba(T, U ) = a a b (TIU ) -k a b a (T, U)
Combinatorics and Nonlinear Control
of Words
329
after one simple integration by parts. For the induction step use the recursive characterization of the shuffle product in terms of the concatenation product. For letters a, b E 2 and words W,z E Z*
= Qaw (TV U ) a b z (T, U)
(integration by parts).
This establishes that a is an algebra homomorphism from the algebra of noncommuting polynomials (power series) with the shuffle product to the algebra of iterated integrals with pointwise multiplication. As a consequence there exists a Lie .element C such that Ser(T,u) = ezp(C(T,U ) ) . Sussmann gave an explicit formula for C = log(Ser(T, U ) ) in [36]. However, even after symmetrization the formula is still too unwieldy for many control theoretic applications. Alternatively, one may try to express the series Ser(T,u) as an (infinite) product of exponentials of simple Lie elements. This immediately leads to the problem of finding a suitable basis for L and then obtaining explicit formula for the coefficients, that again are iterated integrals.
5. Hall bases The first bases for free Lie algebras have their root in the collecting process of P.Hall [43] who was investigating higher commutators in free
330
Kawski
groups. Witt [44] demonstrated that there is an isomorphism between free Lie algebras and thehigher commutator groups of free groups. For further group theoretical investigations see e.g. Meier-Wunderli [45]. The first applications to Lie algebras are by Magnus [46] and M. Hall [47]. Compare section 3 for the standard definition of Hall bases (also: Bourbaki [37]). Since then various bases for Bee Lie algebras have been introduced. The bases by Siriov [48] and Chen-Fox-Lyndon [49], originally thought to be different, turned out to be essentially the same. Using the notation from the last section, a Lyndon word is any word W E Z* that in lexicographical order is strictly less than all its cyclic permutations, i.e. if W = uv with U , v E Z* then uv < vu. For any Lyndon word W E Z* \ Z let e(w)E Z* be the longest proper right factor of W that is a Lyndon word. One may show that the corresponding left factor X(w) is also a Lyndon word. The factorisation $(W) E M of a Lyndon word W is #(W) = W if W E Z and ($(X(w)),#(~(w))) if W E Z* \ 2. A basis for the free Lie algebra is formed by the images of these formal brackets under the canonical projection map into the free Lie algebra L (compare the discussion of Hall bases in the preceding sections). In the seventies Viennot demonstrated that all the known bases forfree Lie algebras arise from the underlying principle of unique factorisations of the free monoid Z * . Specifically, a complete factorisation of Z* is an ordered subset l3 = {ui : i 2 l} E Z* such that every word W E Z* can be written in a unique way as W = u ~ ~ u ~. , .uui, ~ ,with each udj E B and uil 2 ~ i ,2 Ui,. Closely related to Viennot's factorisations are Lmard factorisations, compare Bourbaki (the elimination Theorem 3 2.9 in[37]) or Lothaire (Problem 5.4.5 in (401). In a special case this asserts that if a E Z then L = L ( Z ) is the direct sum of the one-dimensional subspace { t a : t E R} and the Lie subalgebra that admits as a basic family the set ( ( a d k a, b) : k 2 O,b E 2 \ { a } } . We shall refer to the single large family of bases for free Lie algebras singled out by Viennot as general Hall bases. The difference to the definition of Hall bases in the narrow sense as given in Section 3 is that the
Nonlinear Control and Combinatorics of Words
331
compatibility of the ordering of the bases with the length of the basis elements is replaced by the weaker requirement that the any Hall element is larger than its left factor. A generalized Hall set over 2 is defined to be a totally ordered subset 31 C M ( 2 ) of formal brackets that satisfies the following (compare Viennot [50] Theorem 1.2):
1. Each generator a E 2 is in 3.1 2. If H E M is not a generator, so that H = [HI, Hz] with Hi E M , then
H1
H,
3. If H = [HI, Hz] E M with Hi E M , then H E 3t if and only if HI, H2 E 3.1 with H1 < H2 and either (i) H2 is a generator, or (ii) H2 = [H21, H221 with H21 5 HI.
Melanson and Rmtenauer [52, 531 give essentially the same definition, however they replace each bracket (a,b) E M by R(a,b) = (R(b),R(a)) and R(a) = a for letters a E 2. Also their ordering is reversed: “C” instead of “>”. The definition we give is more in line with the standard literature in control theory, and, in particular, facilitates the preferred notation involving flows of vector fields. As an important consequence of the unique factorisation principle investigated by Viennot the restriction of deparenthesization map 6 : M + A* to a generalized Hall set 3.1 C M is injective. This allows us to use the deparenthesized words instead of the formal brackets in the Hall set, similar to the way one conveniently writes Lyndon words in place of the parenthesized Lyndon words. This much facilitates the notation when these sets are used to index e.g. Lie brackets of vector fields and iterated integrals. However, for better readability, we use parenthesized words in the examples below. Consider the following examples of parenthesized words (i.e. elements of the free magma) for a two letter alphabet 2 = {a,b } . The first is a list of Hall elements in the narrow sense of length at most six.
Kawski
332
c((aW(a(ab)))) ( ( a w ” ) )
((a(ab))(b(ab))).
Following is a list of parenthesized Lyndon words of length at most six, again in ascending order for the alphabet 2 = { a , b).
c ((ab)b)c (((ab)b)b) ((((ab)b)b)b)c (((((ab)b)b)b)b) b. And finally we rewrite these Lyndon words to illustrate that they are a special case of a generalized Hall set. Specifically, we substitute the letters B and A for the letters a and 6, replace every bracket ( U , v) by the reversed bracket R(u,v)= ( R ( v )R(u)) , and replace “C” by “>”. When read backwards each of the resulting words is lexicographical larger than any of its cyclic rearrangements, and the factorisation is given by looking at the longest proper left factor that satisfies the same requirement.
A ( A ( A ( A ( A ( A B ) ) ) )()A ( A ( A ( A ( B ) ) )()A ( A ( A B ) ) ) c ( A w l ) < ( ( A ( A ( A B ) ) ) ( A Bc) ) ( ( A ( A B ) ) ( A B ) ) ( A B ) c ( ( A ( A ( A ( A B ) ) ) ) B )
of Words
Combinatorics and Nonlinear Control
333
Notice that most of the listed words are also strictly less then their cyclic rearrangements in standard lexicographical order (when read forward). However, this is a mere coincidence, due to the relative shortness of the words listed here. Indeed, the word ABAABB is not a Lyndon word, however it is a reverse Lyndon word. The conditions for a generalized Hallbasisallowfor many more different bases, seee.g. Melanson and Reutenauer [52] for another example. So far none of these bases has been shown to split in a nice way for a basis for the controllable directions and a basis for the uncontrollable directions of the control system (4).
A few further remarks regarding these bases are in order: The special case of the Lyndon words has the particular advantage that one easily recognizes whether any given word is Lyndon or not, and one most easily can write down its factorisation. Extensive use of this has been made in the path planning algorithms by Jacob [26, 271, and in particular in the computer implementations of various related algorithms by Petitot [54]. The most undesirable property of Lyndon words is their ordering that does not at all correspond to the length of the words. In applications such as the path planning problem, one may take a first approximation, say, using only flows of vector fields that correspond to words of length at most four. If one then desires to add further correction terms, e.g. of length five, then these have to be added at various intermediate steps, rather than added in the front or in the back. (For special tricks how to deal with this situation see [26, 27, 541.) This is quite the opposite of the case of Hall bases in the narrow sense, where one simply composes the earlier result with the new correction terms, not requiring that everything be computed over again (see the infinite directed product (22)). Finally, it is most straightforward to generate a list of all Hall words (in the narrow sense) up to a certain order. E.g. suppose that an ordered list = (wl, , . .W,) of Hall words of length at most IC has been generated. Then for j = l to 2 j IC consider each possible combination of list elements U and v that have lengths = j and [v1= k 1 j and check whether the parenthesized word ( W ) satisfies the third condition for Hall sets; if it does add it to the end of the list. This naive algorithm clearly
>
+ -
334
Kawski
can be much improved, yet it already compares favorably with the task of generating all Lyndon words in a naive way. For efficient algorithms to generate Lyndon words see the references in [53]. In order to get to explicit formulas for the iterated integrals in the infinite product expression of the Chen-Fliess-series we need to consider the enveloping algebra of the free Lie algebra L , and suitable bases for it. Following Humphrey [55], given a Lie algebra L an universal enveloping algebra for L is an associative algebra U with a linear map i : L "t U, satisfying i([x,y]) = i(x)i(y) - i(y)i(x) for all x,y E L and such that the followingholds: If S is any associative algebra and j : L "t S is a linear map satisfying j([x,y]) = j(x)j(y) - j(y)j(x) for all x,y E L, then there exists a unique algebra homomorphism T : U + S such j = 7 o i. The uniqueness of the pair (U,i) (up to isomorphism) is straightforward. Regarding the existence consider the tensor algebra 7 of L (see e.g. [55] or Lang [56] chapter XVI.) Let 3 be the two-sided ideal generated by all x@yyy@x-[x,y]withz,y~L.SetU=7/9,andleti:L+Ube the restriction of the canonical homomorphism ?r : 7 + U to L = T 1 ( L ) . Let S = 712 be the symmetric algebra where Z is the two-sided ideal generated by x 8 y - y 8 2 with x, y E L. Furthermore Q is the graded algebra associated with the filtered algebra of the universal enveloping algebra U (compare Bourbaki [37] section 1.2.7 or [55]section 17.3). The Poincarb-Birkhoff-Witt theorem asserts that the canonical homomorphism W : S + 9 is an isomorphism. As an immediate corollary of fundamental importance for us is: If ( h l ,hz, hs, . . .) is an ordered basis for the Lie algebra L then the elements ~ ( h l8, hl, 8 . . . h ~ # S) ,> 0 and i l 2 iz 2 .. .is form a basis for U, referred to as the Poincarh-Birkhoff-Witt basis for U. In the case that L = L ( 2 ) is the free algebra over the alphabet 2,we may identify the universal enveloping algebra U of L with the free associative algebra A = A(.??), (for technical details compare Bourbaki [37] Corollary 11.3.1.1). Of main interest to us are explicit formulas for the dual basis to the PBW-basis obtained from Hall sets as described above. For illustration, if a, [a[ab]],[ab],[[ablb], b are the first five basis elements of an basis for
of Words
Combinatorics and Nonlinear Control
335
L({a,b}), then the following are elements of the corresponding PBW basis: a, b, aa, [ab]= ab - ba,ba, aaa, [a[ab]]= aab - 2aba
baa, [[ablb]= abb - 2bab
+ baa,
+ bba,
bb,
[abla = aba
b[ab]= abb
- baa,
- bab, bba,
bbb.
For W E Z* , let G denote the linear map G : A + R defined by G(%)= S,,,,, for any word z E Z * , i.e. G ( z ) = 0 if z # W and G(w) = 1. We want to express the dual basis to the PBW-basis as linear combinations (that turn out to be finite linear combinations) of the maps G for W E Z*.If we write {Sh : h E PBW - basis) for the dual basis to the PBW basis, then one easily calculates the following relations for the above example (all missing matrix entries are zero): (34)
N
1
1 2 1
1 1 (35)
1 1 2 1 1 1 1 1
bbb
In this example the basis is formed from Lyndon words, and theresulting transformation matrix has a remarkable structure, being triangular with ones on the diagonal and nonnegative (!) integer values above. (The block structure is due to the multigrading structure of A ( Z ) . ) This is not a coincidence, and general results have been proved, compare e.g. MelanCon and Fkutenauer [51]and Garsia [57].
KawsM
336
The first general formula for the dual basis to thePBW basis resulting from Hall sets (in the narrow sense) was obtained by Schiitzenberger in 1958, see the notes of the SBminaire Dubreil [39]. In 1989 Melanson and Reutenauer obtained an explicit expression for the dual basis corresponding to theLyndon basis for L.They remarked that it is a “surprising fact that exactly the same formula holds for the Hald-$irSov basis as shown in [39]”. Further work by Melanson and Reutenauer [52] clarified the situation, compare also the next section. The formula that was derived in both cases makes use of the shufle product in A*. Specifically, in the case of Lyndon words [51] S, = E, if x =, aw is Lyndon with a E 2, W E Z* then S, = as,,,, and if W = w i l w t . . , W $ E Z* is any worddecomposed into Lyndon words w1 > w2 > . . . W , then
where the exponents denote shuffle exponentiation w1 = W and wh+l = W W Wk. The main surprise was that theformula,S , = a s , for Lyndon words is essentially exactly the same in the case of Hall words studied by Schiitzenberger. We will come back to thisin the next section. Note that Sussmann’s formula [36] from 1985for the iterated integrals contains in a different setting essentially the same combinatorial formula for the dual basis of the PBW basis constructed from a Hall basis.
6. Chronological products The left and right translations by a letter a E Z are the linear maps X, Q, : A + A , defined on words W E Z* by X, = aw and @,(W) = wa. The transposes ao, da : A* + A* are defined on words by (37)
(a 0 W , 2 )
= (W’ X&))
and
(W
Q
a, x ) = ( W , e a ( % ) )
for all x E Z*.In particular, if W = wb with b E Z and v E Z* then w b Q a = da,bW. Observe that ao, Qa are derivations on A* when considered
Nonlinear Control
and Combinatorics of Words
337
with the shuffleproduct, compare Ree [41,42]. Specifically, forv, W, z E Z* and a E Z ((v
W )Q a, 2) =
(v
W ,%a)
= (v @ W ,A(z)A(a))
+ (v @ W ,A(x)(e 8 a ) ) = ((v a a) @ W , A(%))+ (v @ (W a),A(z)) = (v 8 W ,A(z)(a@ e))
(38)
Q
= ((vQa)LUw+vLU(wQa),z). Alternatively, using the recursive characterization of the shuffle product, for letters a, b, c E 2 and words v, W E Z*
((vb) W
(WC)) Qa
(39)
+ Qa = (v W (wc))sa,, + ((vb) W W)&,, = ((vb)a a) W (WC)+ (vb)W ((WC)Q a). = (((v LU ( w c ) ) ~ ((vb) LU W)C)
The calculations for the left translation and derivation are analogous. In general the composition of two Dl and V2 is not a derivation, but their commutator [VI, D,]= D1 oDz - D,o D1 always is a derivation. The set of derivations forms a Lie algebra under composition. Here identify the elements of the Lie algebra L = L(%)with derivations on A* (with the shuffle product). In particular, for u,v E L and W E A* inductively (on the length of U and v) define
(40)
wa[u,v]=(WQv)QU-((WQu)Qv.
The (right) chronological product is the linear map * : A* x A* defined on words as follows: If a E Z,W E Z+ and I E Z*
(41)
w*e=O
+ A*
and w * ( z a ) = ( w * z + z * w ) a .
In general a (right) chronological product is a linear map defined on a linear space satisfying the following identity:
(42)
T*(S*t)=(T*8+S*T)*t.
As a consequence the commutative map
(T, 8 )
+ T * *S
=T
* 8 + 8 * T is
Kawski
338
associative:
r**(s**t)=r*(s*t+t*s)+(s*t+t*s)*r =(r*s+s*r)*t+(r*t+t*r)*s+(8*t+t*S)*P (43)
=(r*s+s*r)*t+t*(r*s+s*r) = (r**s) **t.
The chronological products of words are closely related to theshuffle product:
(44)
w*z+z*w=wujz.
On the space of absolutely continuous functions defined on intervals [0,T ] define a (right). chronological product by (using g to denote the derivative of 9) 1
(45)
* g ) ( t ) = 1f(s)b(s)ds. 0
One easily verifies that this product satisfies the identity (42): t
(f * (9* h ) ) ( t )= 1f ( s ) g ( a ) b )ds 0
Another chronological algebra considered by Liu and Sussmann (private communication) defines a chronological product on the space of polynomial functions in one variable by setting x" * xm = *x"+". Agrachev and Gamkrelidze [58]have used chronologicalproducts extensively in the study of products of one-parameter families of diffeomorphisms. However, while our chronological product is a pre-associative product, theirs is a pre-Lie product and is required to satisfy the identity: a * ( b * c) - b * (a c) = (a * b ) * c - (b*a) *c. A geometrical application is in terms of left invariant connections: (X, Y )+ VxY.This map is a chronological product if and V y ]= V p y ] . only if the connection is flat, i.e. [Vx,
*
Nonlinear Control and Combinatorics of Words
339
While ab (for a E 2) still is a derivation on A* now considered as a chronological algebra (47)
aD(w*z)=(aDw)*z+w*(aDz)
aa no longer is a derivation:
(48)
*
(W ( z a ) )a b = (W I l l z)a Q b = Go,b(w U] z).
Here, w,zE 2" and a, b E 2. This propertyof chronological products complements the following property of Hall words (seeViennot [50], also Melanqon and Reutenauer (521)
Theorem 6.1. Let 74 C M be a Hall set (in the general sense). Suppose z = wa is a deparenthesized Hall word with W E Z*, and a E 2, and W has the unique factorisation into a nonincreasing product of Hall words W = h, , , , h2hl with h, 2 2 h2 2 h1 then the parenthesization of z = wa is @(wa)= (@(h,),(. . . ( @ ( h z ) , ( @ ( h l ) , a ) ) ) . I
n
Very closely related to this property is the following most simple r e cursive formula for a free control system. Using the alphabet 2 to label the controls uara E 2, and a Hall set 3.1 E A ( Z ) to label the coordinate directions X H ,H E 3.1 a free system is explicitly given by
{
sHkK
ifaEZ if (H, K) E N.
Recall that due to theunique factorization properties one may as well use the deparenthesized word HK as an index. In practical applications, e.g. investigations of controllability properties, reachable sets (e.g. Krener and Schattler [22]) or in path planning algorithms (see e.g. Murray [17]),one usually considers a finite dimensional nilpotent subsystem. One only needs ) coordinate direction in the system, then to take care that if X C ( H ~ , His~one also X H ~and X H ~are coordinate directions. Using the convenient notation of xcwfor xh, XhzZhl if W factors as W = h,, , , h2hl into a nonincreasing product of Hall elements, this system may be written in the usual form h = C a E z u a f a ( x Without ). worrying about the free infinite dimensional case, the vector fields take formally the
340
Kawski
form (50) Using the multi-grading of the Lie algebra generated by the vector fields one easily shows that in particular f ~ ( 0 = ) CH& for some nonzero constant CH,and consequently the Lie algebra L(f) generated by these vector fields is isomorphic to the free Lie algebra L = L(Z).All of this makes perfect sense if one only considers suitable subsets of 74 corresponding to nilpotent Lie algebras. These realizations of free nilpotent Lie algebras are closely related to the general procedure in Theorem 2.1. For alternative constructions see Grayson and Grossman [60]. In terms of chronological products the same system is written as (51)
{ x, =
U0
x ( H , K ) = XH
* XK
ifaEZ if (H, K )E 74.
If the last letter of the (deparenthesized) Hall word H K is a E 2, i.e. HK = wa for some W E Z*and W factors W = h , . . . h2hl with h, 2 . . , 2 h2 1 h1 then a complete expansion of the formula in (49) is (52)
= xh.
%a
’ xhaxh1ua
or in terms of chronological products (53)
xwa
= %h.
*
(e
(xhs
* (xhl * xh1))
*
*)*
This formula makes the analogy to the iterated integrals in Sussmann’s product expansion equation (22)very clear. The only differenceis the omission of the multi-factorials (they reappear aa the nonzero constants CH above). In a vector space V with basis { e i } i and dual basis {ei)r the identity ei 8 ei with the usual map id : V + V may be writtenas id = identification Hom(V, V) ?r V 8 V * where V * denotes the dual of V . In our case the words W E Z* (including the empty word e) form a basis for the free associative algebra A of (noncommuting) polynomials, and we write d for the elements of the dual basis, now considered as elements of A*, the associative algebra of noncommuting power series with the letters
Combinatorics and Nonlinear Control
of Words
341
in 2 considered as indeterminants. Thus the identity map id : A be written as
+ A may
(54)
If 3.1 E M is a Hall set, write ?? C L for the corresponding Hall-basis for the free Lie algebra L = L ( Z ) ,and let P C A be the corresponding PBW basis. Then (55)
and this latter expression can be rewritten as a directed infinite product (compare e.g. Sussmann [36], Agrachev and Gamkrelidae [58, 591 and MelanGon t%nd Reutenauer [51]) e bEP
h€?i
where ^h stands for the image of h E M in L under the natural projection map. (Usually we do not distinguish between h and g.) Finally, we comeback to the product expansion of the Chen-Fliess series in control. For brevity let us consider the control system (19) defined in terms of the vector fields f a , a E 2;and consider the series Serf obtained from the series Ser (by substituting the vector fields fa for the indeterminants X , ) . The shuffle algebra homomorphism Q or E (Section 4 equation (31)) mapping words W E A* (with the shuffle product) to iterated integrals E(w)(u)(T)= (with pointwise multiplication) combined with the homomorphism mapping words W to the partial differential operators f" may then be combined (in the sense of Hopf algebras) to map the formula (54) to the Chen Fliess series (20). Recall that given a fixed initial point zo E M, disregarding any complications arising form lack of convergence (compare section 4 when there are no problems), the series Serf maps the pair (a,U ) E P ( M ) X U into G?(z(Tu,u)), the value of CI, along the trajectory of (19). The correspondence between the infinite products (56) and (22) isgiven by the same
shuffle algebra homomorphism combined with the Lie algebra homomorphism mapping parenthesized words $h \in L(Z)$ to the Lie brackets $f_h$ of vector fields.
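To make the cascade structure of (52) and its identification with iterated-integral coordinates concrete, here is a minimal numerical sketch. It assumes a two-letter alphabet Z = {a, b}, takes (a, b) as the only Hall word of length two, normalizes the constants c_H to one, and uses illustrative controls; none of these choices come from the text.

    # Minimal numerical sketch (assumed data): the cascade (52) for Z = {a, b}
    # with the single length-two Hall word (a, b),
    #     x_a' = u_a,   x_b' = u_b,   x_ab' = x_a * u_b,
    # integrated by an explicit Euler scheme.  Up to discretization error the
    # coordinate x_ab agrees with the iterated integral attached to the word ab.
    def cascade(u_a, u_b, T=1.0, steps=4000):
        dt = T / steps
        x_a = x_b = x_ab = 0.0
        for k in range(steps):
            t = k * dt
            ua, ub = u_a(t), u_b(t)
            x_ab += x_a * ub * dt   # bracket coordinate: product of a lower
                                    # coordinate with a control, as in (52)
            x_a += ua * dt          # letter coordinates: driven directly by controls
            x_b += ub * dt
        return x_a, x_b, x_ab

    # illustrative controls u_a = 1, u_b = t; then x_ab(1) = \int_0^1 t * t dt = 1/3
    print(cascade(lambda t: 1.0, lambda t: t))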
REFERENCES

[1] A. Bloch and H. McClamroch, Controllability and stabilizability properties of nonholonomic control systems, Proc. of 29th IEEE CDC: 1312 (1990).
[2] K. Grasse, Math. Control Signals and Systems, 5: 41 (1992).
[3] H. J. Sussmann, An extension of a theorem of Nagano on transitive Lie algebras, Proc. Amer. Math. Soc., 45: 349 (1974).
[4] H. J. Sussmann, Lie brackets, real analyticity, and geometric control, Differential Geometric Control Theory (R. W. Brockett, R. S. Millman, H. J. Sussmann, eds.), Birkhauser, Boston: 1 (1983).
[5] M. Kawski, in Nonlinear Controllability and Optimal Control (H. J. Sussmann, ed.), Dekker, New York: 431 (1990).
[6] H. G. Hermes, Controlled stability, Annali di Matematica Pura ed Applicata, CXIV: 103 (1977).
[7] H. J. Sussmann, Lie brackets and local controllability: a sufficient condition for scalar-input systems, SIAM J. Control & Opt., 21: 686 (1987).
[8] G. Stefani, On the local controllability of a scalar-input system, Proc. 24th IEEE CDC (1985).
[9] G. Stefani, Local controllability of nonlinear systems, an example, Systems and Control Letters, 6: 123 (1985).
[10] H. J. Sussmann, A general theorem on local controllability, SIAM J. Control & Opt., 25: 158 (1987).
[11] A. Agrachev and R. Gamkrelidze, Local controllability and semigroups of diffeomorphisms, to appear in: Acta Applicandae Math.
[12] R. W. Goodman, Nilpotent Lie Groups, Lecture Notes Math. 662, Springer, Berlin, Heidelberg (1975).
[13] M. Kawski, J. Reine Angew. Math., 388: 1 (1988).
[14] J. Grabowski, Remarks on nilpotent Lie algebras, J. Reine Angew. Math., 406: 1 (1990).
[15] H. G. Hermes, A. Lundell, D. Sullivan, Nilpotent bases for distributions and control systems, J. Diff. Equ., 55: 385 (1984).
[16] R. Murray and S. Sastry, Nonholonomic path planning: steering with sinusoids, Memorandum UCB/ERL M91/45, University of California at Berkeley, 1991.
[17] R. Murray, Nilpotent bases for a class of nonintegrable distributions with applications to trajectory generation for nonholonomic systems, Memorandum CIT/CDS 92-002, California Inst. Technology, 1992.
[18] H. G. Hermes, Nilpotent approximations of control systems and distributions, SIAM J. Control & Opt., 24: 731 (1986).
[19] G. Stefani, Local properties of nonlinear control systems, Inst. Technical Cybernetics, Techn. University Wroclaw, Poland, no. 70, Conf. 29: 219 (1986).
[20] H. G. Hermes, Asymptotically stabilizing feedback controls, J. Diff. Equns., 92: 76 (1991).
[21] L. Rosier, On the existence of homogeneous Lyapunov functions for homogeneous asymptotically stable vector fields, Systems and Control Letters, 19: 467 (1992).
[22] A. Krener and W. Kang, Degree two normal forms of control systems and generalized Clebsch-Legendre conditions, in: Analysis of Controlled Dynamical Systems, Progress in Systems and Control Theory, Birkhauser, Boston, Mass. (1991).
[23] A. Krener and H. Schattler, The structure of small-time reachable sets in low dimensions, SIAM J. Control & Opt., 27: 120 (1989).
[24] G. Lafferriere and H. J. Sussmann, Motion planning for controllable systems without drift: a preliminary report, Colloque d'Analyse des Systemes Dynamiques Controles, Birkhauser, Boston, Mass. (1990).
[25] H. J. Sussmann, New differential geometric methods in nonholonomic path finding, Progress in Systems and Control Theory, 12: 365 (1992).
[26] G. Jacob, Lyndon discretization and exact motion planning, Proceedings of ECC 1991, Grenoble, France (1991).
[27] G. Jacob, Motion planning by piecewise constant or polynomial inputs, Proceedings of NOLCOS 1992, Bordeaux, France (1992).
[28] K. T. Chen, Integration of paths, geometric invariants and a generalized Baker-Hausdorff formula, Annals of Mathematics, 65: 163 (1957).
[29] M. Fliess, Realizations of nonlinear systems and abstract transitive Lie algebras, Bull. Amer. Math. Soc., 2: 444 (1980).
[30] M. Fliess, Fonctionnelles causales non lineaires et indeterminees non commutatives, Bull. Soc. Math. France, 109: 3 (1981).
[31] A. Krener, Bilinear and nonlinear realizations of input-output maps, SIAM J. Control & Opt., 13: 827 (1975).
[32] B. Jakubczyk, Existence of global analytic realizations of nonlinear causal operators, Bull. Polish Acad. Sciences Math., 34: 729 (1986).
[33] P. E. Crouch and F. Lamnabhi-Lagarrigue, Algebraic and multiple integral identities, Acta Applicandae Math., 15: 235 (1989).
[34] P. E. Crouch and F. Lamnabhi-Lagarrigue, State realizations of nonlinear systems defined by input-output differential equations, Lect. Notes Control Inf. Sciences, 11: 138 (1988).
[35] M. Kawski, A necessary condition for local controllability, AMS Contemporary Mathematics, 68: 143 (1987).
[36] H. J. Sussmann, A product expansion for the Chen series, Theory and Applications of Nonlinear Control Systems (C. Byrnes and A. Lindquist, eds.), Elsevier, North-Holland: 323 (1986).
[37] N. Bourbaki, Lie Groups and Lie Algebras, Springer, Berlin (1989).
[38] M. Hall, Theory of Groups, Macmillan (1959).
[39] M. P. Schutzenberger, Sur une propriete combinatoire des algebres de Lie libres pouvant etre utilisee dans un probleme de mathematiques appliquees, Seminaire P. Dubreil, Algebres et Theorie des Nombres, Faculte des Sciences de Paris (1958/59).
[40] M. Lothaire, Combinatorics on Words, Addison-Wesley, Reading, Mass. (1983).
[41] R. Ree, Lie elements and an algebra associated with shuffles, Annals of Mathematics, 68: 210 (1958).
[42] R. Ree, Generalized Lie elements, Canadian J. Mathematics, 12: 493 (1960).
[43] P. Hall, A contribution to the theory of groups of prime power order, Proc. London Math. Soc., 36: 29 (1934).
[44] E. Witt, Treue Darstellung Liescher Ringe, J. Reine Angew. Math., 177: 152 (1937).
[45] H. Meier-Wunderli, Note on a basis of P. Hall for the higher commutators in free groups, Commentarii Mathematici Helvetici, 26: 1 (1952).
[46] W. Magnus, On the exponential solution of differential equations for a linear operator, Comm. Pure Appl. Math., 7: 649 (1954).
[47] M. Hall, A basis for free Lie rings and higher commutators in free groups, Proc. Amer. Math. Soc., 1: 575 (1950).
[48] A. I. Shirshov, On bases for a free Lie algebra, Algebra i Logika Sem., 1: 14 (1962).
[49] K. T. Chen, R. H. Fox, and R. C. Lyndon, Free differential calculus IV. The quotient groups of the lower central series, Annals Math., 68: 81 (1958).
[50] X. G. Viennot, Algebres de Lie Libres et Monoides Libres, Lect. Notes Math. 691, Springer, Berlin (1978).
[51] G. Melancon and C. Reutenauer, Lyndon words, free algebras and shuffles, Canadian J. Mathematics, XLI: 577 (1989).
[52] G. Melancon and C. Reutenauer, Combinatorics of Hall trees and Hall words, J. Comb. Th. Ser. A, 59: 285 (1992).
[53] G. Melancon and C. Reutenauer, Computing Hall exponents in the free group, preprint.
[54] M. Petitot, Algebre non commutative en Scratchpad, These, Universite des Sciences et Techniques de Lille Flandres Artois (1992).
[55] J. E. Humphreys, Lie Algebras and Representation Theory, Grad. Texts Math. 9, Springer, New York, Heidelberg (1970).
[56] S. Lang, Algebra, Addison-Wesley, Reading, Mass. (1965).
[57] A. M. Garsia, Combinatorics of the free Lie algebra and the symmetric group, Analysis, et cetera, Acad. Press (1990).
[58] A. Agrachev and R. Gamkrelidze, The exponential representation of flows and the chronological calculus, Math. USSR Sbornik, 36: 727 (1978).
[59] A. Agrachev and R. Gamkrelidze, The shuffle product and symmetric groups, preprint.
[60] M. Grayson and R. Grossman, Models for free nilpotent Lie algebras, J. Algebra, 136: 177 (1990).
Feedback Classification of Nonlinear Control Systems on R2 and R3

Witold Respondek*
INSA de Rouen, LMI UPRES-A CNRS 6085, Place Emile Blondel 8, 76131 Mont Saint Aignan, France
Abstract. We present the static feedback classification of nonlinear control systems on R2 and R3. We give lists of normal forms, provide checkable conditions for feedback equivalence to those forms, and describe the geometry of singularities. We also discuss the question of how singular curves of the time-optimal problem determine feedback equivalence classes of the presented normal forms.
1. Introduction

Geometric control theory began in the sixties with studying nonlinear control systems of the form

Π : ẋ = F(x, u)

using tools coming from geometry. At the beginning, basic control-theory notions, like controllability, observability, and realizations, were studied in the nonlinear context and many important results concerning those
* On leave from the Institute of Mathematics, Polish Academy of Sciences, Warsaw, Poland; this work was partially supported by KBN Grant 2P03A 004 09.
notions were obtained. Already in this pioneering period, crucial for further developments, a more systematic viewpoint also appeared: to consider not only a given system (and analyse its properties) but to distinguish important classes of systems, to define natural classes of transformations acting on systems, and to describe those systems which are transformable, via such natural transformations, to systems belonging to the distinguished classes. Since geometric control theory uses tools from differential geometry, the results of geometric control theory (both assumptions and statements) are coordinate-free, and thus one class of transformations appears very naturally: diffeomorphisms of the state space. To be more precise, consider another control system
Π̃ : ẋ̃ = F̃(x̃, ũ).

Suppose that U = Ũ. We say that Π and Π̃ are state space equivalent if there exists a diffeomorphism Φ mapping the state space X of Π onto the state space X̃ of Π̃ and transforming the dynamics F(·, u) to F̃(·, u). It is clear that if Π and Π̃ are state space equivalent they have, up to a diffeomorphism, the same trajectories (corresponding to the same controls u(·)).
Another very important equivalence relation for control systems is that of feedback equivalence. Physically, feedback control means that we use controls depending on the state: u = u(x). Applying feedback we change dynamical properties of the system, and thus feedback control is used to achieve desired dynamical properties: stabilizability, decoupling, disturbance rejection, model matching and so on (cf. [I] and [NS]). Considered from the point of view of transformations of systems, applying feedback transformations means that we also modify the controls (which remain unchanged for state space equivalence) and we allow their transformations to depend on the state. More precisely, we say that Π and Π̃ are feedback equivalent if there exist a diffeomorphism x̃ = Φ(x) between the state spaces and an invertible transformation ũ = Ψ(x, u) of controls such that the pair (Φ(x), Ψ(x, u)) brings Π into Π̃. Observe that feedback equivalent systems have geometrically the same
set of trajectories, although they are differently parametrized by controls. Indeed, a trajectory of Π corresponding to a control u(·) is mapped by Φ into a trajectory of Π̃ corresponding to ũ(·) = Ψ(x(·), u(·)). A geometric characterization of systems which are feedback equivalent to a linear system was obtained in [JR1] and [HS]. The following observation is the starting point for the general problem of feedback classification. Although for feedback equivalence there are still, as for state space equivalence, many non-equivalent systems, an argument based on counting the dimensions of the space of systems to be classified and of the feedback group acting on them shows that for systems affine with respect to controls, of the form

ẋ = f(x) + Σ_{i=1}^m u_i g_i(x),
where m = n − 1, we can expect open orbits (and more generally, models, i.e., parameterless canonical forms) to exist (see [J1] and [T]). We describe briefly this "dimensional" argument in Section 2. Here we want to emphasize that the only "promising" cases are affine control systems where the difference between the number of states and that of controls is one. A systematic analysis began with planar systems with one control, studied by Jakubczyk and the author ([JR2] and [JR3]), who obtained a complete classification. Then Zhitomirskii and the author classified generic systems with two controls on R3 in [RZ1]. Recently the same authors have obtained in [RZ2] (see also [ZR]) a complete list of all simple control affine systems (i.e., roughly speaking, all systems admitting canonical forms without parameters) in the general case of n states and m = n − 1 controls. We would like to stress the following invariant description of feedback equivalence. The set of velocities of Π is the subset F = ∪_{x∈X} F(x) of the tangent bundle TX, where F(x) = F(x, U) and U is the set of control values. A control system can thus be represented, if parametrization with respect to controls is irrelevant, as a subset of the tangent bundle. In
particular, a control-linear system Λ of the form

Λ : ẋ = Σ_{i=1}^m u_i g_i(x),

with unbounded controls, i.e., u_i(·) ∈ R, and pointwise linearly independent vector fields g_1, ..., g_m, can be interpreted geometrically as the distribution G spanned by g_1, ..., g_m. The control-affine system Σ with unbounded controls and pointwise linearly independent vector fields g_1, ..., g_m can be interpreted geometrically as the affine distribution f + G. Observe that feedback equivalence of control-affine (resp. control-linear) systems is exactly the equivalence of the corresponding affine distributions f + G (resp. distributions G) under the natural group of diffeomorphisms. If there are m = n − 1 controls and, moreover, the vector fields f, g_1, ..., g_{n−1} are pointwise independent, then there exists a unique differential 1-form η such that η(g_i) = 0, i = 1, ..., n − 1, and η(f) = 1, and in this case the problem of feedback classification is just that of classification of differential 1-forms under the action of diffeomorphisms. These observations, although simple, are crucial for the problem. Firstly, they provide an invariant geometric formulation; secondly, they link the problem with the classical studies on classification of differential forms and distributions. The aim of this survey is two-fold. The first is to present the feedback classification results of [JR2] and [JR3] for planar systems with one control and of [RZ1] for systems with three states and two controls. We will give lists of normal forms, provide geometric checkable conditions for feedback equivalence to those forms, and describe the geometry of singularities. The second goal is to show how singularity theory appears in a natural way in the problem of feedback classification. For instance, classical results of Darboux and Martinet give immediately models for the feedback classification out of the equilibria set. More involved results use standard tools from singularity theory, like the homotopy method. In Section 2 we give basic definitions and notations. In Section 3 we will deal with systems on R2 and in Section 5 with systems on R3. In Section 4 we will discuss a classification of differential 1-forms on R2 which, as we have just explained, is closely related with feedback classification of
control affine systems on R2. Finally, in Section 6 we will compute singular curves for all canonical forms of the presented classifications. We will briefly discuss the question (stated and studied by Bonnard [Bo] and then by Jakubczyk [J3]): to what extent do singular curves determine the feedback equivalence class of the system? Our survey discusses the cases of R2 and R3 only. This simplifies the exposition and makes it more illustrative. On the other hand, many important and interesting phenomena are already present in the low-dimensional cases. We would like to mention that besides the above discussed equivalence relations for control systems there is a third very important one, namely dynamic feedback equivalence. Recall that both state space equivalence and feedback equivalence preserve the trajectories. This is the crucial starting observation for defining dynamic equivalence: one says that two control systems are dynamically equivalent if they have the same smooth trajectories (up to an invertible transformation of the prolonged systems). Dynamic equivalence has recently attracted a lot of attention, especially the problem of characterizing systems dynamically equivalent to linear ones (such systems are called flat or free, see [FL+1], [FL+2], [J2], [NW], [P]).
2. Basic notations

Consider a smooth nonlinear control system of the form

Π : ẋ = F(x, u),   x(t) ∈ X,   u(t) ∈ U,

where X is an open subset of Rn (or an n-dimensional manifold), U is an open subset of Rm, and F(·, ·) is smooth with respect to both arguments, i.e., F defines a family of smooth vector fields on X smoothly parametrized by the control parameter u. The class U of admissible controls is contained in the space of all U-valued measurable functions. Throughout the paper the word smooth will always mean C∞-smooth. Consider another control system of the same form with the same class
of admissible controls U

Π̃ : ẋ̃ = F̃(x̃, ũ),   x̃(t) ∈ X̃,   ũ(t) ∈ Ũ,
where X̃ is an open subset of Rn (or an n-dimensional manifold) and Ũ is an open subset of Rm. Assume that we do not change controls, i.e., u = ũ and U = Ũ. Analogously to the transformation Φ*g of a vector field g(·) by the differential dΦ of a diffeomorphism x̃ = Φ(x), we define the transformation of F(·, u) by dΦ. Put
(Φ*F)(x̃, u) = dΦ(Φ⁻¹(x̃)) F(Φ⁻¹(x̃), u), where dΦ denotes the differential of Φ. We say that the control systems Π and Π̃ are state space equivalent (respectively, locally state space equivalent at points x₀ and x̃₀) if there exists a diffeomorphism Φ : X → X̃ (respectively, a local diffeomorphism Φ : X₀ → X̃, Φ(x₀) = x̃₀, where X₀ is a neighborhood of x₀) such that
Φ*F_u = F̃_u,   u ∈ U.

Define the following families of vector fields: F = {F_u | u ∈ U} and F̃ = {F̃_ũ | ũ ∈ Ũ}, where F_u(·) = F(·, u) and F̃_ũ(·) = F̃(·, ũ). State space equivalence is a very natural equivalence relation for control systems since it establishes a one-to-one correspondence between trajectories corresponding to the same controls. However, it is very strong. For instance, in the analytic case two systems are locally state space equivalent at x₀ and x̃₀, respectively, if and only if there exists a linear isomorphism of the tangent spaces T_{x₀}X and T_{x̃₀}X̃ transforming all iterated Lie brackets defined by the family F into those defined by F̃ and corresponding to the same controls (see [Kr], [S]). Therefore there are too many equivalence classes and there is no chance for any reasonable classification. Another very natural equivalence relation is that of static feedback equivalence. Recall that when considering state space equivalence the controls remain unchanged. The idea of feedback equivalence is to enlarge the class of
state space transformations by allowing controls to be transformed as well, and to transform them in a way which depends on the state: thus feeding the state back into the system, which justifies the name. Consider two general control systems Π and Π̃. We say that Π and Π̃ are feedback equivalent if there exists a diffeomorphism χ : X × U → X̃ × Ũ of the form

(1)   χ(x, u) = (Φ(x), Ψ(x, u)),
which transforms the first system into the second, i.e.,

dΦ(x) F(x, u) = F̃(Φ(x), Ψ(x, u)),
where dΦ denotes the differential of Φ. Observe that Φ plays the role of a change of coordinates in X while Ψ, called a feedback transformation, changes coordinates in the control space in a way which is state dependent. When studying dynamical control systems with parameters (cf. the bifurcation theory) we classify the same space of parametrized vector fields F(·, u), but the transformations are defined differently: coordinate changes in the parameter space are state-independent, while coordinate changes in the state space may depend on the parameters. Thus the role of the parameter is opposite to that played by the control. Feedback transformations play a crucial role in control theory: applying them we can change fundamental dynamical properties of the system, e.g., we can achieve stabilization, decoupling, create unobservable dynamics, or obtain other desired properties of the dynamics (compare the monographs [I] and [NS]). In order to explain the nature of feedback transformations we start by observing the following simple but crucial property. If Π and Π̃ are feedback equivalent and γ = x(t, u(t), x₀) is a trajectory of Π passing through x₀ and corresponding to a control u(·), then its image γ̃ = Φ(γ) is a trajectory of Π̃ passing through x̃₀ = Φ(x₀) and corresponding to the feedback-modified control ũ(t) = Ψ(x(t), u(t)). Therefore, two feedback equivalent systems have geometrically the same set of trajectories (up to a diffeomorphism in the state space), which are parametrized differently by controls.
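As a toy illustration of a pair (Φ, Ψ) with Ψ affine in the control (the control-affine case treated next), the sketch below applies a feedback u = α(x) + β(x)v and a change of coordinates z = Φ(x) to an assumed planar system; the system, Φ, α and β are illustrative choices made for this example, not data from the text.

    # Hedged toy example (all data assumed): feedback u = alpha(x) + beta(x)*v
    # and coordinates z = Phi(x) applied to  x1' = x2 + x1**2,  x2' = u.
    import sympy as sp

    x1, x2, v = sp.symbols('x1 x2 v')
    X = sp.Matrix([x1, x2])
    f = sp.Matrix([x2 + x1**2, 0])            # drift
    g = sp.Matrix([0, 1])                     # control vector field

    Phi = sp.Matrix([x1, x2 + x1**2])         # change of coordinates z = Phi(x)
    alpha = -2*x1*(x2 + x1**2)                # feedback offset (illustrative)
    beta = sp.Integer(1)                      # feedback gain   (illustrative)

    closed = f + g*(alpha + beta*v)           # dynamics after feedback
    z_dot = sp.simplify(Phi.jacobian(X) * closed)   # push forward along Phi
    print(z_dot)   # Matrix([[x1**2 + x2], [v]]), i.e. z1' = z2, z2' = v

In the new coordinates this particular system becomes linear, the simplest instance of the feedback equivalence just defined.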
In the control-affine case we consider systems Σ of the form

Σ : ẋ = f(x) + g(x)u = f(x) + Σ_{i=1}^m u_i g_i(x),
where f, g_1, ..., g_m are smooth vector fields on X, u = (u_1, ..., u_m)^t, and g(x) = (g_1(x), ..., g_m(x)). In order to preserve the control-affine form of Σ, we will restrict feedback transformations to control-affine ones

(2)   ũ = Ψ(x, u) = α̃(x) + β̃(x)u,

where ũ = (ũ_1, ..., ũ_m)^t, β̃(x) is an invertible m × m matrix and the elements α̃_i of α̃ and β̃_ij of β̃ are smooth functions. Denote the inverse feedback transformation by u = α(x) + β(x)ũ; then the feedback equivalence of Σ and the system

Σ̃ : ẋ̃ = f̃(x̃) + Σ_{i=1}^m ũ_i g̃_i(x̃)

is given by Φ together with the feedback pair (α, β). For control-linear systems Λ (i.e., systems without drift) we consider feedback transformations of the form (2) satisfying α̃(x) = 0. Assume that the vector fields g_1, ..., g_m are pointwise independent. Then (local) feedback equivalence of two control-linear systems Λ and Λ̃ coincides with (local) equivalence of the distributions G and G̃ spanned respectively by the vector fields g_i's and g̃_i's, the latter meaning the existence of a (local) diffeomorphism Φ such that Φ*G = G̃. If the vector fields g_1, ..., g_m are pointwise linearly independent then the distribution G can be considered as the field of kernels ker E, where E
is the codistribution spanned by n − m independent differential 1-forms ω_1, ..., ω_{n−m} annihilating G, i.e., by the system of Pfaffian equations {ω_i = 0, i = 1, ..., n − m}. The classical problem of classification of nonintegrable distributions and Pfaffian equations goes back to Pfaff, Darboux, Goursat, and Cartan. Many interesting and deep results have been obtained since then by Martinet, Bryant, Hsu, Jakubczyk, Mormul, Przytycki, Roussarie, Zhitomirskii, and others (see [BC+], [BH], [JP], [JZ], [M], [MR], [Zh2]). The control-linear system Λ corresponds to the distribution G spanned by the vector fields g_i's. Analogously, one can interpret a control-affine system Σ of the form ẋ = f(x) + Σ_{i=1}^m u_i g_i(x), where the g_i's are pointwise independent, as the affine distribution A = f + G. By an affine distribution we mean a map which assigns to every p ∈ X an affine subspace A(p) = f(p) + G(p) of T_pX. Therefore the feedback classification problem for control-affine systems is the geometric problem of classification of affine distributions under the action of the natural group of diffeomorphisms: two control-affine systems Σ and Σ̃ are (locally) feedback equivalent if and only if there exists a (local) diffeomorphism Φ such that Φ*A = Ã. Consider a control-affine system Σ with n − 1 controls and assume that g_1, ..., g_{n−1} are pointwise linearly independent. Then the distribution G spanned by the g_i's is given by the kernel of a nonvanishing 1-form ω defined by ω(g_i) = 0, i = 1, ..., n − 1. The form ω is defined up to multiplication by a nonvanishing function. If we assume, however, that the vector field f is independent of g_1, ..., g_{n−1}, we can define a unique differential 1-form η such that
η(g_i) = 0,   i = 1, ..., n − 1,   and   η(f) = 1.
Points p such that f, g_1, ..., g_{n−1} are independent at p are called nonequilibrium points, and they satisfy 0 ∉ A(p), i.e., A(p) is not a linear subspace of T_pX. We have the following simple but important observation (see [Zh2], App. C).
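At a nonequilibrium point the 1-form η is determined pointwise by a linear system, so it can be computed symbolically. The sketch below does this for an assumed toy system on R3 with two controls; the vector fields are illustrative and not taken from the text.

    # Hedged sketch (assumed vector fields): the unique 1-form eta on R^3 with
    # eta(g1) = eta(g2) = 0 and eta(f) = 1, obtained by solving a 3x3 linear
    # system pointwise.
    import sympy as sp

    x, y, z = sp.symbols('x y z')
    f  = sp.Matrix([1, 0, y])     # drift, independent of g1, g2 everywhere
    g1 = sp.Matrix([0, 1, 0])
    g2 = sp.Matrix([0, 0, 1])

    M = sp.Matrix.vstack(f.T, g1.T, g2.T)     # rows: f, g1, g2
    eta = M.solve(sp.Matrix([1, 0, 0]))        # eta(f) = 1, eta(g1) = eta(g2) = 0
    print(eta.T)                               # -> Matrix([[1, 0, 0]]), i.e. eta = dx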
Proposition 1. Assume that the vector fields f, g_1, ..., g_{n−1} are pointwise independent. Then the control systems Σ and Σ̃ are (locally) feedback equivalent if and only if the corresponding differential 1-forms η and η̃ are (locally) equivalent, i.e., there exists a (local) diffeomorphism Φ such that
Φ*η̃ = η.

Therefore the feedback classification problem in this case is that of classification of differential 1-forms under the action of diffeomorphisms. This is a classical problem which goes back to Darboux, whose results were followed by those of Martinet [M] and Zhitomirskii [Zh1], [Zh2]. In Section 4 we discuss this problem on R2. To give another explanation of the nature of feedback transformations recall (compare the Introduction to this volume) that there are two principal ways of controlling the system Π. The first possibility is to use open loop controls, i.e., to put u = u(t) and thus to transform Π into a time-varying differential equation. The second possibility is to use closed loop controls, i.e., to put u = u(x) and thus to transform Π into a dynamical system (a vector field). It turns out that two control-affine systems, without any assumption concerning pointwise independence of the vector fields g_i, are feedback equivalent if and only if the two corresponding families of dynamical systems, obtained by using all smooth feedback controls u = u(x), are the same (up to a diffeomorphism in the state space). To be more precise, put

V(Σ) = { f(x) + g(x)u(x) | u = (u_1, ..., u_m)^t, u_i ∈ C∞(X) }   and
Ṽ(Σ̃) = { f̃(x̃) + g̃(x̃)ũ(x̃) | ũ = (ũ_1, ..., ũ_m)^t, ũ_i ∈ C∞(X̃) }.
Note that V and Ṽ have the structure of affine modules over the ring of smooth functions. We have the following characterization of feedback equivalence [J1].
Proposition 2. Two systems Σ and Σ̃ are locally feedback equivalent if and only if there exists a local diffeomorphism Φ such that Ṽ = Φ*V. Consider the problem of (local) feedback classification of control-affine systems of the form Σ. The object to be classified is thus the (m + 1)-tuple
of vector fields (f, g_1, ..., g_m), defined in local coordinates by (m + 1)n functions. On the other hand, the transformation group (1), with Ψ defined by (2), is given by n smooth functions defining Φ and m + m² smooth functions defining the feedback pair (α̃, β̃). Thus a necessary condition for open orbits to exist is that (m + 1)n ≤ m + m² + n, which is equivalent to n ≤ m + 1. In order to make this argument precise one should consider the restriction of the feedback group action to the space of k-jets of (m + 1)-tuples of vector fields and show that the codimension of each orbit tends to infinity as k → ∞ (see [J1], [T]). If we add the natural assumption that the number of controls is smaller than the dimension of the state space, we see that one can expect open orbits to exist only if m = n − 1. Such orbits indeed exist if m = n − 1, as successive results concerning the feedback classification problem show. Planar systems with one control, i.e., n = 2, m = 1, have been studied and classified by Jakubczyk and the author [JR2], [JR3]. A generic classification of systems with 2 controls on 3-manifolds has been obtained by Zhitomirskii and the author [RZ1]. Recently the same authors have obtained a complete list of all systems admitting canonical forms without parameters (more precisely, simple germs) in the general n-dimensional case [RZ2] (see also [ZR]). In the two following sections we discuss the 2- and 3-dimensional cases. Some classification results will be stated for generic systems, i.e., systems belonging to a residual set (a countable intersection of open dense sets) in the space of all control systems, in other words, of all (m + 1)-tuples of vector fields. This space will be considered with the C∞ Whitney topology (see e.g. [GG]).
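The inequality behind this dimension count can be checked mechanically; the following snippet verifies, on an arbitrarily chosen small range of dimensions, that (m + 1)n ≤ m + m² + n is the same condition as n ≤ m + 1.

    # Check the dimension-count inequality used above:
    #   (m+1)*n <= m + m**2 + n   is equivalent to   n <= m + 1   (for m >= 1).
    for n in range(1, 8):
        for m in range(1, 8):
            assert ((m + 1) * n <= m + m**2 + n) == (n <= m + 1)
    print("(m+1)n <= m + m^2 + n  <=>  n <= m + 1  verified on the sample range")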
3. Static feedback classification on R2
In this section we present the results of [JR2], [JR3] providing a classification of nonlinear control systems on the plane. Consider a nonlinear system, with scalar control entering in an affine way, evolving on R2 or, more generally, on a 2-dimensional manifold X. In local coordinates it is
given by

(3)   Σ : ẋ = f(x) + g(x)u,
where f and g are smooth vector fields on R2. Throughout this section we will assume that g(x) ≠ 0. The two following sets play a crucial role in the feedback classification of Σ given by (3). The equilibria set is given by
E = {p ∈ X | f(p) and g(p) are linearly dependent}.

It is clear that if p ∈ E and g(p) ≠ 0 then, locally around p, there exists a smooth function α(x) such that (f + αg)(p) = 0. This means that E consists of equilibrium points of the drift f (up to the feedback action). The singular set S is defined by
S = {p ∈ X | g(p) and [f, g](p) are linearly dependent}.

Recall that the Lie bracket of two vector fields f and g is defined in local coordinates x by [f, g](p) = (∂g/∂x)(p) f(p) − (∂f/∂x)(p) g(p). To interpret S, assume that g(p) ≠ 0 and rectify g locally around p by introducing local coordinates (x, y) such that g = ∂/∂y. Let (f_1(x, y), f_2(x, y)) denote the components of f in the (x, y) coordinates. We have [g, f](x, y) = (∂f_1/∂y)(x, y) ∂/∂x + (∂f_2/∂y)(x, y) ∂/∂y, and thus S consists of the points (x, y) where (∂f_1/∂y)(x, y) = 0, i.e., the points where the motion transversal to the trajectories of g can attain its extremal velocity. This interpretation is the starting point for an approach to the feedback classification problem based on its relations with the time-optimal control problem (see [Bo], [J3]).
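Both sets can be computed symbolically as zero sets of determinants; here is a hedged sketch for an assumed toy planar system, with the Lie bracket taken in the coordinate form used above.

    # Hedged sketch (assumed planar system): compute the determinants whose zero
    # sets are the equilibria set E = {det(f,g) = 0} and the singular set
    # S = {det(g,[f,g]) = 0}, with [f,g] = (Dg) f - (Df) g.
    import sympy as sp

    x, y = sp.symbols('x y')
    X = sp.Matrix([x, y])
    f = sp.Matrix([x + y**2, 0])   # illustrative drift
    g = sp.Matrix([0, 1])          # illustrative control field

    def bracket(a, b):
        return b.jacobian(X) * a - a.jacobian(X) * b

    nu = sp.det(sp.Matrix.hstack(f, g))               # E: x + y**2 = 0
    mu = sp.det(sp.Matrix.hstack(g, bracket(f, g)))   # S: 2*y = 0, i.e. y = 0
    print(sp.expand(nu), sp.expand(mu))

For this assumed example E is the parabola x = −y² and S is the line y = 0, which intersect transversally at the origin.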
Proposition 3. For a generic control system, E and S are smooth curves and they intersect transversally.
We will call them, respectively, equilibria curve and singular curve. In the analytic case they are branching curves. We begin with the classification of generic systems.
Theorem 1. Any generic system Σ, given by (3), is locally feedback equivalent, around any point at which g does not vanish, to one of the normal forms (M1)-(M4), (NF5), (NF6).
Above, λ ∈ R is an invariant, while the smooth function a(x) satisfies a(0) ≠ 0 and has the following invariance property: two systems in the normal form (NF6), given by y³ + xy + a(x) and y³ + xy + ã(x), respectively, are equivalent if and only if a(x) = ã(x) for x ≤ 0.
and P = det (91 [ f , g I ) .
Of course, v and p depend on the choice of, coordinates but the ideals generated by them, and thus their zero level sets, are invariantly related to the system. It is obvoius that E and S are the zero level sets of v and p, respectively. In order to describe the equivalence classes (orbits under the feedback group action) corresponding to the canonical forms listed above let us introduce the following sets
C = { p E X I g(p), ad, f (p), and ad: f (p) are dependent}.
Respondek
360
and
D = { p E C I p $ E, d p b ) # 0, and g@) and ad; fCp) are independent}. At points p of D the vecor field g is tangent to S which is a smooth hypersurface because dp@) # 0. For a generic system, D consists of isolated points and coincides with C # The geometry of the equivalence classes listed in Theorem 1 can be described as follows. Theorem 2. Consider a control system C, given by (3), and assume that g ( x 0 ) # 0. x0
x0
(i) C is, localy at
20,
equivalent to model (Ml) if and only if x0 $E,
$ S. (ii) C i s , localy at
20,
equivalent to model (M2)if and only if x0 € E ,
# S.
(iii) C is, localy at 20, equivalent to one of models (M3) or (M4) if and only if x0 # E, x0 E S but x0 !$ C. (iv) C is, localy at 20, equivalent to the nomalf o m (NF5), with X # 0, if and only if 20'E E, x0 E S but x0 $ C,and moreover dv(x0) # 0. (v) C is, localy at 20, equivalent to the nomal form (NF6), with a(0) # 0 , if and only if x0 E D. It is clear that all conditions involved in (i)-(v) are invariantly related to the control system (they depend neither on coordinates system nor on feedback). Remark 1. We can distinguish the equivalence classesof models (M3) and (M4) as follows. For both cases f and ad: f are transversal at 0 E R2 to the line spanned by g(0). For systems equivalent at p to (M3), fb)and ad: f(p) point out the same side of the line while for systems equivalent to (M4) they point out opposite sides. Another way to distinguish (M3) and (M4),based on timi-optimal properties of the systems, will be given in Section 6. Following the standardnotion of the singularity theory (see e.g. [AVG] or [GG]) we say that a control system C is k-determined at p E Rn (more
Feedback Classification
of Nonlinear Control Systems
361
precisely, its germ at p E W )if all control systems having the same k-jet at p are locally feedbackequivalent. C is called finitely determined if there exists k such that C is k-determined. We say that C is structuraly stable at p if for any sufficently small perturbation 3 of C we can find a point F close to p such that C at p and at F are locally feedback equivalent.
E
As a corollary of Theorem 1 we get the twofollowing results. The germ of a control system C on a 2-manifold is structurally stable if and only if it is equivalent to one of the models (Ml)-(M4). A generic germ is finitely determined if and only if it is structurally stable or equivalent to lS, the germ is l-determined the normal form (NF5). At points p of E r and the real parametr X can be interpreted as follows. If C is given by normal form (NF5) its first order approximation at 0 E R2 is the linear uncontrollable system & = XZ, 6 = ii and X is the eigenvalue corresponding to the uncontrollable mode.
At the points of tangency of S and the trajectories of the vector field g the germ of a control system is not finitely determined and a functional
module appears. It can be given the following geometric interpretation. In a punctured neighborhood of such a tangency point p one can find a unique feedback control U($) = a(.) such that f ~ ( x )= f(x) a(x)g(x) is tangent to the singular curve S (U is not defined at p ; see also Section 6 for optimal control interpretation of U(.) and f1 (x)).Now observe that in a neighborhood of p the curve S is intersected twice by the trajectories of g which thus define an involution 4 of S (compare the normal form (NF6), where S is given by the parabola 3y2 x = 0 while the trajectories of g= by x = const.). Having defined two vector fields, f1 and its image f2 = C#J& by 4, on l-dimensional manifold S \ { p ) , we get a functional invariant. Indeed, although f1 and f2 are not defined at p their ratio can be prolonged to a smooth function which gives riseto thefunctional invariant. If we parametrize S, given by 3y2 x = 0, with the help of y then the invariant is given as a smooth even function of y. This explains why in the normal form (NF6) the function a(s) is an invariant for x 5 0 only We will give another interpretation of this invariant when discussing relations with classification of l-forms in Section 4.
+
+
+
Respondek
362
For m R-valued application $ defined in a neighborhood of 0 E Rn (its germ at 0 E Rn)we denote by j r $ the infinite jet of q5 at 0. Theorem 1 can be generalized as follows.
Theorem 3. Consider the system there exists an integer k such that dim span(g(zo),
(4)
C , given by (3), and assume that
ad:+'f (20)) = 2.
(i) Then C is locally, around XO,feedback equivalent to the following system, defined in a neighborhood of 0 E It2,
where q is the smallest integer k satisfying (4) and ai(z),i = 0, , , , ,q - 1, are smooth functions satisfying aa(0) = 0. (ii) Consider C given by (5)-(6) and
E given by
If C and z! are locally feedbackequivalentthen q = andthereexists a function germ t$ at 0 E W,and a function germ II, at 0 E R2,such that
is a germ of a diffeomorphism of W2,preserving 0 E Ra, andjrII,
=
*jr(yId4(z)I*),
(iii) If 0 E R2 is not an equilibrium point then we can choose a normal form (5)-(6) such that a0 = e, where E = 1 if q is even and E = fl if q is odd. Assume that C and z! given locally by (5)-(6) and (7)-(8) (resp.) satisfy a0 = E and ZO = E (resp.). IfC and z! are locally feedback equivalent then for any i = 1,. ,q 1 we have j r Z i = j r a i if q is even and either j r Z i = j$ai or j r & = (-l)ij$ai if q is odd.
.. -
Feedback Classificationof Nonlinear Control Systems
363
&call that a control system is accessible at p if the set R@) of points attainable from p has a nonempty interior. If the set R,(p)of points attainable from p in time t has a nonempty interior we say that the system is strongly accesible at p .
Remark 2. Observe that the set of systems, for which an integer IC satisfying (4)does not exist, is of infinite codimension. Moreover,if C is analytic then there exists k satisfying (4) if and only if C is strongly accessible at IO (which, if 10 E E, i.e., ifwe work around an equilibrium, is equivalent to accessibility),
E
Remark 3. The item (ii) implies that if C and given locally by (5)( 6 ) and (7)-(B), respectively, are locally feedbackequivalent then q = Fand there exists a function germ q!~ at 0 E R2,satisfying $(O) = 0, d$(O) # 0, such that j p i = fj~((d4($-1))(9+1-i))/.(9+1) . ai($-l)),
In order to obtain the form ( 5 ) - ( 6 ) one uses Mather's theorem on versal unfoldings of functions (see e.g. [AVG] or [BL]). In the analytic case the picture is thus as follows.Any strongly accessible analytic system can be put, using an appropriate feedback and a coordinate transformation, to thenormal form (5)-(6). Moreover two analytic systems C and 2, given respectively by ( 5 ) - ( 6 ) and (7)-(8), are locally feedback equivalent if and only if there exists m analytic function germ 4, satisfying #(O) = 0, dq!J(O) # 0, such that iii = ~ ~ d ~ ( $ - ' ) ~ ( ~ + ' - ~ ) ~ ( ~ + ' ) ai($-l). If 0 E R2 is not an equlibrium point then we can always assume that a. = cg where cq = 1 if q iseven and = fl if q is odd. Fortwo systems ( 5 ) - ( 6 ) and (7)-(B), satisfying a0 = E q and EO = cg, respectively, which are locally feedback equivalent, we have Zi = ai if q is even and either iii = ai or iii = (-l)iai if q is odd. Therefore out of the equlibrium, the set {ai, a = 1,, , . ,q - l}, of analytic function germs is a complete invariant of the local feedback classification (in the odd-dimensinal case, while in the even-dimensinal case, it is so modulo a discrete group). For a complete local classification of analytic control systems we send the reader to [JW].
Respondek
364
In this Section we discuss briefly relations between the feedback classification of control-affine systems on lR2 and classification of nonvanishing differential l-forms on R'. Consider a smooth control-affine system (9)
C : x = f(s)+ g(z)u,
where z ( t ) E R'. Recall that if p l-form q such that
E , then we can define a differential
(10)
2
q(d
0,
and, moreover, (11)
df)
1.
As we already discussed, away from the equilibrium set feedback equiv& lence classes of the system C given by (9) are in a one-to-one correspondence with differential l-forms defined by (10)-(11), up to a diffeomorphism. Indeed, consider another control system E : & = F(Z) +g(Z)C away from its equilibria set and define $ by E 0 and E 1. Then Proposition 1asserts that the control systems C and 5, considered away from their equilibria sets, are locally feedbak equivalent if and only if the corresponding differential l-forms 7 and $ are locally equivalent, i.e., there exists a local diffeomorphism @ such that
$(a
f(7)
@*V= 7). Therefore results on classification of differential l-forms (see [M] and [Zh2]) provide immediately corresponding results on classification of control systems away from equilibria. To start with, recall the result of Zhitomirskii [Zhl], [Zh2] asserting that all finitely determined nonvanishing local differential l-forms are exhausted by the classical modelsof Darboux and Martinet. This result gives thus all finitely determined control-affine systems out of the set of equlibria. In particular, the only finitely determined differential l-forms on R' are, up to a diffeomorhism, the models of
.
..
Feedback Classiflcationof Nonlinear Control Systems
365
Darboux and Martinet given, respectively, by
and (M&) (g' f 1)dz. By an easy calculation one can see that (D) gives model (Ml) while (M&) gives models (M3) and (M4) of Theorem 1. Theorem 3, which gives a classification of all control systems on R2, modulo a set of codimension infinity, has the following analogue which gives the local classification of all nonvanishing differential l-forms on R2 (modulo a set of codimension infinity). By L,w we denote the Lie derivative of a differential form W along a vector field g . For a differential l-form W on R' such that wlp # 0 let g be a nonvanishing vector field satisfying w ( g ) = 0. We have
Theorem 4. Consider a differential l-form W o n R' such that wlp # 0. Assume that there exits k 2 0 such that
L;+'wlp
(12)
# 0.
(i) Then W is locally equivalent to the following form at 0 E R' (13)
(gq+'
+ pp-'
bq-1
+ ... + l/bl (S)+ Eq)dS,
(S)
.
where q is the smallest integer k satisfying (12) and bi, i =,1,.. , q - 1, are smooth functions satisfying bi(0) = 0 while Eq = 1 if q is even and eq = f l if q is odd. (ii) If two differential l-forms W and G, given respectively b y (13). and (14)
(p" + p-'&-l(Z)+ . . . + & ( F ) + ?;)dZ,
-
-
are locally equivalent then F = g, E;; = E q , and j$bi = j$bi if q is even and either j$& = j$bi or j$ba = (-l)ij$bi if q is odd. (iii) Any analytic differential l-form is equivalent to the form (13) at 0.E R', where bi are analytic functions germs, bi(0) = 0. Moreover, two analytic differential l-forms, given respectively by (13) and (14), are locally I
Respondek
366
-
equivalent if and only if F = g, F; = E ~ and , bi = bi an case q is even and either bi = bi or bi = (-l)ibi in case p is odd.
-
It is easy to see that the integer q does not depend on the choice of the vector field g which is defined up to multiplication by a nonvanishing function. Of course, the integers q defined for the control system C and its correspoding differential form 7, given by (10)-(11), coincide. We have the following corollary ([Zhl], [Zh2]).
Corollary 1. A genericdifferential 1-form W satisfying wlp # 0 is equivalent to one of the following canonical forms at 0 E R':
(D) (a, + 1)dx (M*) (y2 f 1)dz (NF) (g3 + + b(x))dx, where b is a smooth function, b(0) # 0. Moreover,two l-forms in the normal f o r m (NF) given by (y3 + yz + b(x))dz and (y3 + yx +x(x))dx are equivalent if and only if b(x) = ;(x) f o r x 5 0. Consider a nonvanishing differential l-form W on R2.Its field of kernels is spanned be a nonvanishing vector field g satisfying w ( g ) 0. Let S denote the set of points p such that L,w(p) = 0. For a generic nonvanishing differential l-form W the set S^ is a smooth curve. The normal form (NF) appears at isolated points S^g, where S^ is tangent to g. Model ( M f ) appears at points of the curve 5 away from the isolated points of 5,. Finally, model (D) appears at points of thetopen : x = f(z) g ( x ) u complement of S^. Define a control-affine system by taking g such that w ( g ) = 0 and f such that w(f),'= 1. Clearly, C is defined up to thefeedback equivalence. Moreover, the equlibriam curve E is empty for C. Observe that the singular curve S of C coincides with S^ and the set of points of tangency of g and S coincides with 3, thus giving a control-theory interpretation of the geometry of singularities of differential l-forms. Observe that we can give the following interpretation of the existence of a functional invariant for the form (NF) (see [Zhl] and [AG, Chapter .4, Section 1.41). In suitable coordinates the differential form W can be A
+
Feedback Classification of Nonlinear Control Systems
367
expressed as +(x,v)dx which defines a field of directions ker dx = 0. On the integral curves x = const. of this field we get a family of smooth functions +(x,-). If the point under considerations bifurcates into two critical points of the family we can choose the parameter x such that the sum of the critical values becomes 1. Then the difference of the critical values, considered as a function of the parameter, provides a functional invariant of the equivalence class of W .
5. Static feedback classification on R3 In this section we present results of [RZl] on feedback classification for nonlinear control systems with two controls on three-dimensional state spaces. Consider a control system of the form (15)
c : x = f(.)
+ g1 b)u1 + 92(3)u2,
where x(t) E X ,an open subset of R3or a smooth three-dimensional manifold, and f , gl,g2 are smooth vector fields on X.Throughout this Section we will assume that g1 and g2 are pointwise independent everywhere on X. Define the two followingsubsets of X which willbe crucial in describing the geometry of the. problem. The equilibria set is given by
E = { p E X I f (p), g1 (p), and g2 (p) are linearly dependent}. Since g1 and g2 are independent, it is clear that locally around any p E E there exist smooth functions a1(x), az(x)such that (f+algl+or~gz)(p)=O. This means that, like in the two-dimensional case, E consists of equlibrium points of the drift f (up to thefeedback action). The second set is defined as
Proposition 4. For a generic control system
(i) E and S are smooth surfaces, (ii) they intersect transversally.
Respondek
368
E is called the equlzbria surface and S is called the Martinet surface. The latter plays an extremaly important role when studying classification and singularities of differential forms and Pfaffian equations as observed by Martinet [M], see also [Zh2]. Consider a distribution B given by B = span{gl,g2}. It can be represented by the Pfaffian equation { W = 0}, where W is a nonvanishing l-form satisying w(gi) = 0, i = 1,2. As we already discussed in Section 2, the system C defines the affine distribution A = f + 9 on X (Le., at any point p E X the affine subspace d(p)= f (p) + B(p) of the tangent space T p X is given), which is preserved under the feedback action. Defining an affine distribution on X is thus a feedback-invariant way of describing the control system C. The form W anihilating 9 is defined up to multiplication by a nonvanishing function. Ifwe consider p E X such that p 6 E then locally at p there exists a unique l-form q such that (16)
q(gi) E 0,
i = 1,2,
and, moreover,
(17)
q(f) 1-
Observe that all vector fields U satisfying q ( v ) = 1 span an affine distribution. Therefore away from the set of equilibria the three notions become one: a control system (defined up to feedback and such that thegi's are linearly independent), an affine distribution, and a differential l-form (compare Proposition 1 in Section 2). When studying C at points p E S it is convenient to express C by q. Let CO denote the set
i.e., the set of points where the nontrivial kerdq is tangent to S. Let C denote the closure of Co.We define two subsets of C: the intersection C1 = C n E and CZ= { p E S \ E I dq1, = 0). Moreover, one can show that C, defined as C, = { p I O(p) = T p S } is also a subset of C.
Feedback Classificationof Nonlinear Control Systems
369
Proposition 5. For a generic control system (i) C is a smooth curve contained in S and transversal to E , (ii) Ci C C, i = 1,2,3, consist of isolated points, (iii) Ci n Cj = 0, i # j , where i,j = 1,2,3, The following result describes the feedbackclassification of a generic system away from C. Below (x,g, %) denote suitable local coordinates on nx3.
where p, B
E B,B
2 0, are invariants.
Let us define L as the set of points in E \ S, where the contact distribution B is tangent to E , i.e.,
(18)
L ={pE E \S
I
Q ( p ) = TpE).
One can show that for a generic system the set L consists of isolated points. In order to discuss the normal forms (NF7) and (NF8)we need the following analysis of points of L. Take a differential l-form W such that B = kerw. The restriction (pullback) of w to the equlibria surface E is a l-form o vanishing at points p of L. To any volume form M on E there
Respondek
370
corresponds a unique vector field V given by i v s l = U ,where iv denotes the contraction with V . Recall that W , and consequently U ,are defined up to multiplication by a nonvanishing function, as is Q , and therefore the module of vector fields (V) generated by V is invariantly related to C.At p E L the vector field V has an equlibrium and thus the eigenvalues AI, A2 of its linear part at p are invariants (up to multiplication by a common factor and ordering) of C. Choose a local coordinate system and put
v
det (f,g1,ga).
Of course, v is defined up to multiplication by a nonvanishing function but its zero level set, which is the set E of equlibria, is invariantly related to the system. Now we are ready to describe the feedback equivalence classes (orbits) corresponding to the canonical forms listed above.
Theorem 6. Consider, on a 3-manifold, a nonlinear control system C , with two control vector f i e l h g1 and g2 independent at p. (i) C is, locally at p, equivalent to model ( M l ) if and only if p
# E,
P#S* (ii) C is, localy at p , equivalent to one of models ( M 2 ) or ( M 3 ) if and only if p $! E , p E S but p # C . (iii) C i s , localy at p, equivalent to model (M4)if and only if p E E ,
d4P)#O,P$L, andp$!S. (iv) C is, localy at p , equivalent to one of models (M5)or (M6)if and only if p E E , dv(p) # 0 and p E S but p # C,and, moreover, E and S are transversal at p and B is transversal to E n S at p. (v) C is, localy at p , equivalent to the normalform (NF7) if d u b ) # 0, p E L, Im X1 = Im A 2 = 0, A1 # 0,A 2 # 0, and is an irrational number. (vi) C is, localy at p , equivalent to the normalform (NF8) if dv(p) # 0, p E L , X1 = a bi, A2 = a - bi, a # 0, and b # 0.
e
+
Remark 4. A way to distinguish ( M 2 ) from ( M 3 ) and (M5)from ( M 6 ) , based on time-optimal properties of those models, will be given in Section 6.
Feedback Classification of Nonlinear Control Systems
371
Remark 5. The invariant p in (NF7) and (NFS) has the following interpretation. The first order approximation of C at p is not controllable and p is the eigenvalue corresponding to the uncontrollable mode. The invariant 0 can be calculated with the help of the eigenvalues XI, X2 as follows. For (NF7) we have
For (NFS) we have
Notice that among generic systems there are no feedback linearizable ones. This is clear because the involutivity of the distribution B spanned by g1 and g 2 , which is needed for linearizability (see [HS] and [JRl]), is highly nongeneric: it holds for a set of systems of codimension infinity. Observe that all the above conditions describing canonical forms listed in Theorem 5 involve the 1- or 2-jet of C, only. Recall the notions of structural stability and finite determinacy discussed in Section 3. The list of canonical forms given by Theorem 5 exhausts all, up to feedback equivalence, finitely determined (in particular, structurally stable) germs of generic control systems. More precisely, we have
Theorem 7. (i) The germ of a control system C is structurally stable if and only af it is equivalent to one of models (Ml)-(M6). (ii) The germ of a generic control system C is finitely determinedif and only ifat is either structurally stable or equivalent to one of the canonical forms (NF7)-(NFS). (iii) For a generic control system the set of points where the system is not finitely determined is the smooth curve C. As we already discussed, away from the equlibrium surface E we can consider the differential l-form given by (16)-(17) which uniquely determines the affine distribution f B. Therefore feedback classification results for the control system C at p 6 E follow immediately from results on classification of differential l-forms on R3 under the action of diffeo-
+
Respondek
372
morhisms. In particular, model (Ml) corresponds to q equivalent to the Darboux canonical form
(D) 5 d g + d I while (M2)and (M3)correspond to the Martinet canonical forms
+
( M f ) (1 5)dQ f IdZ (the latter correspondence needs a suitable change of coordinates). The fact that the germ at p E C of a control system is not finitely determined followsfrom a result of Zhitomirskii [Zhl] (seealso[Zh2]) stating that all finitely determined germs of any nonvanishing differential l-form are equivalent either to the Darboux or Martinet models. The classification problem at points of the equlibrium manifold E requires other methods. In order to get models (M4),(M5), (M6)and the normal forms (NF7), (NF8)we used in [RZl] thehomotopy method which requires, as a subproblem, to find the Lie algebras of infinitesimal automorphisms for 2-distrbutions kerw, where w is either the Darboux model (see [Ar]) or the Martinet model. At points of L the contact distribution 0 is tangent to the equlibria surface E. Such pairs, a contact distribution and a hypersurface, have been classified by Lychagin [L]. In our case we have additionally a vector field f , defined modulo 8, and hence two real invariants appear. We will come back to thier interpretation in Section 6.
6. Singular curves
A very interesting approach to the feedback equivalence problem has been originated and developed by Bonnard [Bo]. The main idea is to a n & lyse feedback equivalence of control systems via their singular curves in the time-optimal problem (a systematic approach based on this idea and using also complex extrema1 trajectory is presented in [J3]). Consider the control system
Feedback Classification Nonlinear Control Systems of
373
Fix an initial condition 20 and a terminaltime T > 0. Consider the inputs-state mapping End associating to any control U the end-point z(T,zo,u) E Rn of the trajectory passing through 20 and corresponding to U . We say that U is a singularcontrol if the derivative at U of the mapping End is not surjective (we consider bounded measurable controls U E U and endow the space U of controls with Lw-norm). The trajectory z ( t ,zo, U ) corresponding to a singular control is called a singular cume. According to Pontriagin Maximum Principle, singular controls and corresponding singular curves are candidates for solutions of the time-minimal (and time-maximal) control problem. We say that a smooth vector field f a on a submanifold X 8 of X is singular for C if its integralcurves ~ ( tare ) singular curves. Given a singular vector field f 8 on X 8 let a8 = (ai,. . ,a&) denote a feedback control such that fa($) = f(z) CL1a i ( z ) g i ( z )on X 8 . We will call a8(z)a feedback singular control. In this section we compute singular controls for generic control systems with 2- and 3-dimensional state spaces. Let us start with the 2-dimensional case. Consider
.
+
c : k = f(z)+ g(z)u, where z ( t )E Rz and C is given by one of the canonical forms (Ml)-(M4) or (NF5)-(NF6) of Theorem 1. Recall the definitions of the curves E and S, and of the isolated points D given in Section 3. We can easily see that singular curves do not exists away from S. In fact, at any p E R' \ S the vector fields g and [f,g ] are independent and the linearization of c is controllable along any trajectory. On the other hand, a direct calculation shows that in a neighborhood of any 20 E S \ D we can compute a unique smooth feedback singular control aa(z)defining a smooth singular vector field f a = f + a*g on X 8 = S \ D. In fact, in S \ D the control a8 can be given by the condition
where y ( t ) is the flow of We have:
fa
and 9 denotes the distribution spanned by g.
Respondek
374
for (Ml) there are no singular curves; for (M2) there are no singular curves; for (M3) oa=o,
fa=
a8=0,
fa=-
a 8X
on S = {y = 0);
for (M4)
a .8
onS={y=O);
for (NF5) Q’
= 0,
a
f” = Ax-
BX
on S = {y = 0);
for (NF6)
thus
We can summarize the above list aa follows. If we are away from S then there are no singular curves. In fact, the system C is feedback linearizable in this case ([HS], [JRl]) and linearizable systems do not possess, of course, singular curves. If C is equivalent to one of the models (M3)-(M4) then the singular vector field f a is a nonvanishing vector field on S. It does not distinguish model (M3) from (M4). However its optimality properties do. In fact, if 50 and XT are two points of S, both sufficiently close to 0 E Et2, such that there exists a trajectory of C joining them, i.e., z(O,xo,u) = x0 and z(T, Q,U) = ZT, T > 0 then there is a singular curve joining them and, moreover, it is time-maximal if is given by (M3) and time-minimal if C is given by (M4). For the normal form (NF5) the singular vector field f a is a linear vector field on S (a linearizable vector field €or systems equivalent to (NF5)). Its eigenvalueis an invariant of feedback equivalence and is equal to the parameter X of (NF5); different eigenvalues distinguish between nonequivalent systems. .(S,
S ,
a)
of Nonlinear Systems Control
Feedback Classification
375
Finally, for the normal form (NF6) the singular vector field is not defined at 0 E It2, it contains a functional parameter which, together with the vector field g, gives rise to a functional invariant in the feedback classification. If a singular curve approaches the point D (where g and S are tangent) when t + T, the corresponding singular control u'(t) tends to infinity. Thus no singular curve, corresponding to a L"-bounded control, passes trough D. Notice, however, that U' grows like (T - t)-lI2, or grows small like -(T - t ) - l I 2 , and thus U" E L1. Thus we can interpret it as a L1-singular curve touching D. We will come back to this when dicussing the 3-dimensional case. To compute singular curves in the case of control systems of the form
where x(t) E X , a 3-dimensional manifold, recall the definitions of the surfaces S and E , and of the isolated points L. In a neighborhood of any x0 E X\S the matrixT(x) is invertible, where Tij(x) = ( W , [si,gj])(x)and W is a nonvanishing l-form anihilating the distribution 6 spanned by g1 and Sa, Therefore, see [Bo],there exists a unique feedback singular control aS(x)= (@(x), .;(x)) such that f a = f aig1 +a;g2is a singular vector field on X' = X \ S. We can compute it by the formula
+
where 7 ( t )is the flow of f a . For the models and normal forms listed in Theorem 5 we get: for (Ml)
for ( M 2 )
-
= 1 22 '
for (M3) a; = - -
1
2x '
a; = 0,
.;
= 0,
f" = (1 +g)- 6
6x
fa
= (1+g)-
1 6 + -" 2X6X'
6 6z
1 6 - -" 22 ax'
Respondek
376
for (M4)
for (M5)
for (M6)
for (NF7)
for (NF8)
Note that for models (M2), (M3), (M5), (M6) the function a!, and thus the singular vector field fa, is not defined at the surface S which for all those models is given &S {x = 0). The above list allows us to compare the (local) geometry of singularities of a generic control system with that of the singular vector field f a . At a generic point of X and a generic equilibrium point, f a is a nonsingular vector field. It does not distinguish the equivalence class of (Ml) and of (M4). Observe, however, that in the latter case f a is tangent to the equilibria surface E . At the points of L (tangency points) the singular vector field f s vanishes and different values of the pair of parameters p, B in (NF7), (NF8) can be distinguished in terms of the eigenvalues of f a which are equal to p(1/2+B),p(1/2-0),pfor (NF7) andp(1/2+2iO),p(1/2-2iB),pfor (NF8).
Feedback Classification of Nonlinear Control Systems
377
Now, we proceed to the points of S. Although f is not defined on S, the behaviour of its trajectoriesin a neigborhood of S allows to distinguish the orbit of (M2) from that of (M3) as well as (M5) from (M6). We can observethat a neighbourhood 0 of any point of S\C is foliated by one-dimensional leaves, which intersect S transversally, such that any trajectory of f " starting in 0 is contained in a single leaf of the foliation and moreover lies completely on one side of S. In the case of (M3) and (M6) the trajectories off" are directed towards S and each of them reaches S in a finite time with infinite velocity. In the case of (M2) and (M5) the situation is opposite: the above picture is valid after inverting the time, i.e., replacing f" by -f".
To summarize, the singular vector field f" distinguishes two parts of S separated by the curve C. However, it distinguishes neither the equivalence class of (M2) fromthat of (M5) nor (M3) from (M6) (notice, however, that in the case of (M5) and (M6) it is tangent to the surface E of equlibria). It is easy to see that there are no singular curves contained in S. Consider models (M3) and (M6). If a singular curve approaches S when t + T then the coresponding singular control u"(t) tends to infinity and thus there are no singular curves, corresponding to LW-boundedcontrols, which touch S. Observe, however, that the feedback singular control ai = -& corresponds to the singular controls u f ( t ) = f1/2(T - t)"I2 which are L1-bounded. Thus there are L1-singular curves reaching S in a finite time and with infinite but integrable velocity (in the case of (M3) and (M6)) or escaping from S (in the case of (M2) and (M5)) with an infinite but integrable velocity. Finally, observe that the way the presence of S determines the behaviour of singular curves for distributions is completely opposite. For any 2-distribution on IR3 there are no singular curves away from S and in a neighborhood of its any generic point the surface S is foliated by singular curves (see e.g. [JZ]).For a control system with two controls on R3,as we have just shown, in a neighborhood 0 of a generic point of S there are no singular curves contained in S but the intersection of R3 \ S with 0 is foliated by singular curves.
Respondek
378
Acknowledgment. This survey is based on a joint work and joint papers with Bronidaw Jakubczyk and Michail Zhitomirskii. I am very grateful to them for our common journey through the wonderful world of nonlinear feedback.
REFERENCES [Ar] V. I.Arnold, MathematicalMethods in ClassicalMechanics, Springer, New York, 1978 (translated from the Russian edition, Moscow, Nauka, 1974). [AG]V.1.Arnold and A. B. Givental, SymplecticGeometry, in:Dynamical Systems, vol. 4, VINITI, Moscow, 1985 and SpringerVerlag, New York, 1989. [AVG] V. I. Arnold, A. N. Varchenko, S. M. Gussein-Sade, Singularities of digerentiable maps, vol. 1, Birhhauser, 1985 (translated from the Russian edition, Moscow, Nauka, 1982). [Bo]B. Bonnard, Feedback equivalence f o r nonlinear control systems and the time optimal control problem, SIAM J. Contr. and Opt., 29 (1991), 1300-1321. [BL] T. Brocker and L. Lander, Differential Germs and Catastrophes, London Math. Soc. Lecture Note Series 17,Cambridge University Press, Cambridge, 1975. [BCS] R. L. Bryant, S. S. Chern, R. B, Gardner, H. L. Goldschmidt, and P. A. Griffiths, ExteriorDigemntialSystems, Springer-Verlag, New York, 1991. [BH] R. L. Bryant and L. Hsu, Rigidity of integral curves of rank two distributions, Invent. Math., 114 (1993), 435-461. [FL+l] M. Fliess, J. LBvine, Ph. Martin, and P.Rouchon, Nonlinear control and Lie-Backlund transformations: towardsa new differential geometric standpoint, Proc. 33rd CDC, Lake Buena Vista, 1994, 223-267.
Feedback Classification of Nonlinear Control Systems
379
M. Fliess, J. LBvine, Ph. Martin, and P. Rouchon, Flatness and defect in nonlinear systems: introductory theory and applications, Internat. J. Control, 61 (1995), 1327-1361. M. Golubitsky and V. Guillemin, Stable Mappings and their Singularities, Springer, New York-Heidelberg-Berlin, 1973. H. Hermes, A. Lundell, and D. Sullivan, Nilpotent bases for distributions and control systems,Journal of Differential Equations, 55 (1984), pp. 385-400. L. R. HuntandR.Su, Linearequivalentsandnonlineartimevarying systems, Proc. Int. Symp. MTNS, Santa Monica, 1981, pp, 119-123. A. Isidori, Nonlinear Control Systems, 2nd ed. Springer-Verlag, New York, 1989. B. Jakubczyk, Equivalence and Invariants of Nonlinear Control Systems, in: Nonlinear Controllability andOptimal Control, H. J. Sussmann (ed.), Marcel Dekker, New York-Baael, 1990. B. Jakubczyk, Invariants of dynamic feedbacks and free systems, Proc. ECC, Groningen, 1993, 1510-1513. B. Jakubczyk, this volume. B. Jakubczyk and F.Przytycki, Singularities of k-tuples of vector fields, Dissertationes Mathematicae, 213 (1984), 1-64. B. Jakubczyk and W. Respondek, O n linearization of control systems, Bull. Acad. Polon. Sci. Ser. Sci. Math. 28 (1980), 517-522. B. Jakubczyk and W. Respondek, Feedback equivalence of planar systems and stabilizability, in: Robust Control of Linear Systems and Nonlinear Control, M. A. Kaashoek, J. H. van Schuppen and A. C. M. Ran (eds.), Birkhauser, Boston, 1990, 497-456. B. Jakubczyk and W. Respondek, Feedback classification of analytic control systems, in: Analysis of Controlled Dynamical Systems, B. Bonnardet al. (eds.), Birkhiiuser, Boston, 1991,262-273. B. Jakubczyk and M. Zhitomirskii, Odd-dimensional Pfafianequations: reduction to hypersurface of singular points, preprint.
Respondek
380
A. J. Krener, O n the equivalence of control systems and linearization of nonlinear systems, SIAM J. Contr. and Opt., 11 (1973), 670-676.
V. V. Lychagin, Local classification of l-order nonlinear partial differential equations, Uspekhi Matem. Nauk, 30 (1975), 101-171 (English translation in: Russian Mathematical Surveys).
J. Martinet, Sur les singularities des formes differentielles, Ann. Inst. Fourier, 20 (1970), 95-178. P. Mormul and R. Roussarie, Geometry of triples of vector fields on R4,North Holland Math. Stud., vol. 103, North Holland, Amsterdam, 1985, 89-98. [NRM] M. van Nieuwstadt, M. Ratinam, and R. M. Murray, Differentia1 flatness and absolute equivqlence, Proc. 33TdIEEE Conf. Decision Control, 1994, 326-332. H. Nijmeijer and A. J. van der Schaft, Nonlinear Dynamical Control Systems, Springer-Verlag, New York, 1990.
J. B. Pomet, A differential geometric setting for dynamic linearization, in: Geometry in Nonlinear Control and Differential Inclusions, B. Jakubczyk, W. Respondek, and T. Rxezuchowski (eds.), Banach Center Publications 32, Warsaw, 1995, 319-339. W. Respondek and M. Zhitomirskii, Feedback classification of nonlinear control systems on 3-manifolds,Math. of Control, Systems, and Signals, 8 (1995), 299-333. W. Respondek and M. Zhitomirskii, Feedback classification of nonlinear control systems: simple normal forms, preprint. H. J. Sussmann, Lie brackets, real analycity, and geometric control, in: Differential Geometric Control Theory, R. W. Brockett, R. S. Millman, and H. J. Sussmann (eds.), Birkhauser, Boston, 1983, 1-116.
K.Tchofi, On normalforms of a@ne systems under feedback,in: New Trends in Nonlinear Control Theory, J. Descusse, M. Fliess,
Feedback Classification of Nonlinear Control Systems
381
A. Isidori, and D. Leborgne (eds.), Lect. Notes in Contr. and Inf. Sci., vol. 122, Springer, Berlin, 1989, 23-32. [Zhl] M. Zhitomirskii, Finitelydetermined l-forms W , # 0 arereduced to the models of Darboux and Martinet, Functional Analysis and its Applications, 19 (1985), 71-72. [Zh2] M. Zhitomirskii, Typical Singularities of Diflerential l-Forms and Pfafian Equations, Translations of Mathematical Monographs, vol. 113, AMS, Providence, 1992. [ZR] M. Zhitomirskii and W. Respondek, Simple germs of corank one afine distributions, to appear in Banach Center Publications.
10 Time-Optimal Feedback Control for Nonlinear Systems: A Geometric Approach Heinz Schattler Department of Systems Science and Mathematics, Campus Box 1040, Washington University, One Brooking8 Drive, St. Louis, Missouri, 63130-4899
Abstract. This paper describes in an expository style selective aspects of a recently developed geometric approach to construct optimal feedback control laws. The focus is on givingideas rather than technical proofs and on illustrating the results with detailed descriptions of low-dimensional examples. The results have already appeared in the literature and proofs with various degrees of detail can be found there. Here (sometimes new) proofs are included to the extent that they illustrate important features of the overall method.
1. Introduction The goal inany optimal control problem is to construct optimal feedback control laws. There exist two distinctively different, but actuallyequivalent approaches to this problem: dynamic programming and regular synthesis. In dynamic programming the value function is calculated as solution of the Hamilton-Jacobi-Bellman equation. This can easily be done in speSupported in part by the National Science Foundation under Grants No. DMS9100043 and NO.DMS-9503356.
383
384
Schattler
cia1 cases, like linear-quadratic control problems, but is a in general quite difficult undertaking with few explicit solutions possible. Significant theoretical advances have been made over the past years with the introduction of viscosity solutions and methods from the theory of Partial Differential Equations currentlyseem to be in the forefront of research activity. Still, to obtain explicit solutions numerical methods become necessary and hence theoretical insight gets lost. Regular synthesis is a generalization of the classical method of characteristics for first-order partial differential equations to the Hamilton-Jacobi-Bellman equation and hence another way to realize dynamic programming. Following classical ideas from Calculus of Variations, the guiding principle here is to synthesize a field of optimal extremals (i.e. of trajectories which satisfy the necessary conditions for optimality) which cover the state space. Then optimality of the corresponding feedback should follow. This is indeed so if certain technical assumptions are met. However, what precisely these conditions must be is the crux of the matter. A first set of sufficient technical assumptions was given by Boltyansky [l]who also coined the phrase regular synthesis. Though quite stringent, and these conditions were weakened significantly later on by Brunovsky [2] and then Sussmann [3], many low dimensional problems can satisfactorily be dealt with by usingBoltyansky's framework. Also for the systems considered here his original results suffice. The main issue in constructing a regular synthesis is how to achieve a reduction of the family of extrema1 trajectories to a sufficient family of optimal trajectories. In general, first order necessary conditions for optimality leave the structure of extremals wide open and even higher-order conditions may be inadequate to achieve the desired reduction, To est& blish terminology, consider a system of the form C j. = f ( Z , U ) ,
x E M,
U
EU
where M is a C" manifold and the control set U is arbitrary. The dynamical law f assigns to every (x,U ) E M x U a tangent vector f(x,U ) E T,(M), the tangent space to M at x. We assume that f is continuous in (x,u), differentiable in x for each fixed U E U ,and that the partial derivative
Optimal Time Feedback Nonlinear Control for Systems
385
D J ( z , u ) is continuous in ( z , ~ )Admissible . controls are Lebesgue measurable functions U(.) defined on a compact interval [ O , T j with values in a compact subset of U. Given an admissible control defined on some open interval J which contains 0, there exists a unique absolutely continuous curve z(-), defined on a maximal open subinterval I of J , such that z(0) = p , k ( t ) = f ( z ( t )~, ( t )holds ) almost everywhere on I , and z(t) lies in M for t E I . This curve is called the corresponding trajectory. A point q is said to be reachable from p in time t if there exists an admissible control U such that the corresponding trajectory z satisfies q = z(t). If q is not reachable in smaller time, then the trajectory is called time-optimal. The set of all points which are reachable in time t is denoted by Reachc,t(p). The reachable set of the system C fromp, Reachc(p), consists of all points which are reachable from p in nonnegative times and Reachc, 0. Choosing a control ii which steers z(0) to z(t E) in time t and concatenating it with the original control on [t+E, TI defines a control which steers z(0) to z(T)in time T - E.) Call a trajectory defined on [0,TI a boundary trajectory if for all t E [O,T)
+
+
So time-optimal trajectories are necessarily boundary trajectories. The Pontryagin Maximum Principle [4] (see Section 2) gives necessary conditions for a trajectory to be a boundary trajectory by approximating the reachable set with a convex cone at the endpoint of the trajectory. Clearly, if instead it is possible to analyze the boundary of the reachable set directly, no information will be lost and bettercharacterizations of time-optimal trajectories can be derived.
386
Schlttler
Although this appears an all but impossible task in general, this is indeed a feasible approach in low dimensions. We employ it for a local analysis of time-optimal controls near a reference point p for systems which are affine in the control with bounded control values,
Here M is a sufficiently small open neighborhood of some reference point p and admissible controls are Lebesgue measurable functions U with values in the closed interval [-l, l] almost everywhere. In general, even for systems of the form (3), the structure of the (small-time) reachable set may be arbitrarily complicated making an analysis prohibitively difficult. A complete analysis akin to Sussmann’s results on time-optimal control in the plane [5, 6, 71 appears too ambitious even in dimension 3, but results similar to Piccoli’s [g] classification of generic singularities for planar time-optimal control syntheses seem feasible. Indeed, in the past 10 years numerous results about time-optimal control havebeen obtained which analyze genericcases in low dimensions, mostly in dimension 3. In the papers by Bressan [g] and Krener and Schattler [lo] the local structure of time-optimal trajectories near an equilibium point for f is determined in R3 while Shin [l11analyzes the structure of extremals for this problem in R4.Our papers [12, 131 analyze the local structure of extremal trajectories in R3 away from an equilibrium in generic cases. Saturation phenomena on singular extremals (see Section 2) do not occur in the non-degenerate case in R3,but they aregeneric and were analyzed by Bonnard and deMorant [14, 151 and by Jankovic and Schattler [16].The paper by Bonnard and de Morant [l51 gives a beautiful geometric analysis for the time-optimal control of a chemical batch reactor which has saturated singular arcs in its optimal synthesis. In thecontext of constructing small-time reachable sets, which will be pursued in this paper, we gave a precise description of the boundary of the small-time reachable set for the 3-dimensional problem (respectively a 4-dimensional system if ‘time’ is added as extra variable) in [lo, 16, 171 constructing explicit optimal syntheses for nondegenerate situations.
Time Optimal Feedback Control for Nonlinear Systems
387
In view of these results it is reasonable, and this is the guiding principle of our analysis, to pursue an approach which proceeds from the “general” to the “special”. Like the local behavior of an analytic function near a point p is determined by its Taylor coefficients, so is the local behavior of an analytic system determined by the values of the control vector fields and their Lie-brackets at p [18]. (Recall that the Lie bracket of two vector fields f and g is again a vector field, denoted by [f,g], which,in local coordinates, can be expressed as
where D denotes the matrix of partial derivatives of the respective vector field [19].)Let us loosely call this the Lie bracket configuration of the system at p . In this sense, the most general situation arises then if all the vector fields f , g and their relevant (typically low-order) Lie brackets are in general position, that is if no linear dependencies (equality relations) exist. We refer to this as codimension 0 conditions and depending on how many nontrivial, linearly independent equality relations exist, we distinguish cases of positive codimension. (Of course, there always exist trivial relations because of skew-symmetry and the Jacobi-condition and these are ignored.) In this sense, the results in [lo] give the precise structure of the small-time reachable set from a reference point p in dimension 4 under codimension 0 conditions and in [l61 the codimension 1 case is analyzed. In short, the concept of codimension is used to organize the Lie bracket conditions into groups of increasing degrees of degeneracy. Liebracket conditions are systematically used to describe the different cases. Hence our results establish a precise theoretical link between these conditions and the structure of the small-time reachable sets in these cases. Once the precise structure of the small-time reachable set (of the system with time added as extra coordinate) has been established, a regular synthesis of time-optimal trajectories can be constructed by projection into the state space along the ‘time’ coordinate. This was done in [20] for the problem of stabilizing an equilibrium point p in dimension 3 under codimension 0 conditions and the result for the codimension l case is stated
388
Schattler
in [16].As a result, the optimal control problem has been solved locally near p .
It is important to point out two essential features of this approach. First, it may appear as a limitation of the method that the construction is done locally in the state space. But dynamic programming, or regular synthesis, are in essence local procedures. This is precisely the contents of Bellman’s principle of optimality. If for a certain control problem local solutions have been found near any point, then in principle (assuming that certain technical conditions are satisfied, which essentially form the core of the definition of a regular synthesis) it is possible and straightforward to synthesize a globally optimal control in the state space. Therefore, the problems which must be solved are all the local ones. The reason for this, feature is that even a local regular synthesis has a global character on the level of trajectories. This is beautifully illustrated by the codimension 0 case indimension 3. In this problem there exist extremals which are strong relative minima in the sense of Calculus of Variations (i.e. are locally optimal relative to trajectories which lie in a sufficiently small neighborhood of the curve y : (O,T]+ z(t) ), but are not optimal near p = z(0) [21]. In retrospect, variational arguments alone are therefore in general insufficient to analyze the local optimality of trajectories in the state space, no matter how small the neighborhood of the initial point is chosen. Our approach of constructing the entire small-time reachable set avoids this problem by its very setup. Clearly variational arguments are indispensible to limit the possible candidates for optimality (and are constantly employed in our construction as such), but in general they are insufficient to solve the problem completely.
2. Small-time reachable sets as cell complexes In the Introduction we postulated the construction of the small-time reachable set as a viable approach to time-optimal control for nonlinear systems of the form (3). In this section we outline the basic idea and
Time Optimal Feedback Control for Nonlinear Systems
389
motivation behind the construction leaving the actual mechanics for later sections. Time-optimal trajectories lie in the boundary of the reachable set for the extended system where 'time' has been added as extra coordinate, 50 EE 1. Itather than to consider thisparticularsituation, we consider the extended system directly without assuming any special equation and our aim is to analyze boundary trajectories. The Pontrgagin M&mum Principle [4]gives first order necessary conditions for a trajectory to lie on the boundary of the small-time reachable set. Let y = (z(-),u(.))be an admissible pair defined over the interval [0,TI (i.e. U ( - ) is an admissible control and z(.) is the corresponding trajectory which is assumed to exist on [0,TI)and suppose s(t) lies in the boundary of the reachable set from z(0) for all t E [0,TI,Then there exists an absolutely continuous curve X(.) : [O,T] +-J P , called the adjoint variable, whichis not identically zero, such that in local coordinates we have almost everywhere on [0,T ] (5)
m = - x ( W f ( z ( t ) )+ Dg(z(t))u(t))
(6)
(X(t),g(z(t)))U(t) E min (X(t),g(z(t)))v 1451
(We write X as a row vector, (.,.) denotes the Euclidean inner product, and Df and Dg are the Jacobian matrices of f and g , respectively.) We call trajectories which satisfy these conditions extremal, Note that the absolutely continuous function d,(t) := (X(t),g(z(t))) uniquely determines the control aa (8)
u(t) = - sgn d,(t)
away fromthe set Z(y) = {t E [0,T]: @,(t)= 0). We call d, the switching function. But all we know about Z(y) is that it is a closed subset of [O,T]. In principle it can be any closed subset, e.g. a Cantor-like set of measure arbitrarily close to T. This leaves the question of regularity properties of u wide open. The Fuller phenomenon [22]shows that optimal controls can have an infinite number of switchings in arbitrarily small intervals, but currently it is not known whether pathological structures are possible for
Schattler
390
analytic systems. (For smooth systems this is the case and examples cam easily be given [23].)In order to investigate the structure of the set Z(y) further, it is a naturalnext step to consider the derivatives of the switching function. For instance, if a,(?) = 0, but by@)# 0, then T is an isolated point of Z(y) and the control has a bang-bang switch at time T. The following Lemma, though completely elementary, is extremely important. In a certain sense it explains why the Lie brackets determine the structure of optimal controls.
Lemma 2.1. Let y = (z(-),U ( ' ) ) : [0, -+ M x U be an eztremal pair with adjoint variable X. Let Z be a smooth vector field and define
W
) = ( 4 % Z(4t))).
Then W t ) = (W),[f,Z l ( 4 t ) )+ W g , Z l ( 4 t ) ) ) . Proof. The proof is a direct computation. Omitting the variable t and writing the inner product as dot product, we have
@, = AZ(5) + ADZ(Z)k + u D g ( z ) ) Z ( z+) XDZ($)(f(lt.)+ u g ( 4 ) = WZ(zC)f(x)- Df(z)Z(z))+ U W w M 4 - W M = W , Zl(S)+ uNg, ZI(4 = -Wf(.)
4 )
and this proves the Lemma. For the switching function 0, we get therefore (11)
M
)=
[ f , g l ( s ( t ) )+ U(t)[g,gl(4t)))
= (W),[ f , S l ( W ) )
(12)
W = (x(% [f,[f,gI(z(t>) +w
g ,[f,gl1(4t))).
Even though the control U is in principle undetermined on Z(.y),if Z(y) contains an open interval I, then also all the derivatives of the switching function vanish on I and this may determine the control. For instance, if ) ) not vanish on I , in equation (12) the quantity (A(t),[ g , [ f , g ] ( z ( t )does
Time Optimal Feedback Control for Nonlinear Systems
391
then this determines the control as
Of course, this expression only defines an admissible control if u ( t ) takes values in the control set. Controls of this type which are calculated by equating derivatives of the switching function with zero are called singular. Once a singular control violates the control constraint, it is no longer admissible and must be terminated. We call this saturation. In the literature several higher-order necessary conditions for optimality of singular controls are known. For instance, the generalized Legendre Clebsch condition states that
along a time-optimal singular extremal [24].Also necessary conditions for optimality at junctions with singular arcs are known (see, for instance [25]-[28]).But we will not need these here. It is natural to attempt torestrict the structureof extremal trajectories further by higher-order necessary conditions. But in hindsight, the example in [lo, 201 shows that for theoretical reasons it is in general impossible to obtain a complete reduction since trajectories which are strong relative extrema in the sense of Calculus of Variations, but are not optimal in a neighborhood of the initial point, cannot be excluded with variational arguments. Far from being an exotic aberration, our example in [lo, 201 and the analysis of bang-bang trajectories in [29] suggests that this is the typical, i.e. nondegenerate situation in dimensions greater than three. This then indicates a direct construction of the small-time reachable set as viable alternative. Even though the Maximum Principle and higher-order extensions of it are in general not able to restrict the structure of extremal trajectories completely, nevertheless they clearly single out the controls U = fl and singular controls as prime candidates and impose further restrictions on the possible structures. Hence, rather than to pursue a typically difficult analysis of higher-order conditions, an alternative approach is to simply
Schiittler
392
initiate a construction of portions of the reachable set by integrating extremal trajectories which correspond to simple structures suggested by the necessary conditions, such as bang-bang trajectories with a small number of switchings or simple concatenations of bang and singular arcs. Choosing increasingly more and more complex sequences and analysing the geometric structureof the resulting surfaces, this construction attempts a precise description of the boundary of the small-time reachable set. If this set has a regular structure, which is a reasonable expectation in nondegenerate situations (but no result to this effect is currently known), this geometric construction appears a promising undertaking.
As a general principle, the idea is to construct the small-time reachable set as a stratified CW-complex by successivelyattaching cells of increasing dimension to the 0-dimensional cell consisting of the initial condition. In the cases we considered so far (nondegenerate Lie bracket configurations in small dimensions), it was always possible to carry out this construction in a very special and simple way:giventwoIc-dimensionalcellswhich have common relative boundary and whose projections into a (k 1)dimensional submanifold of M do not intersect, two (k 1)-dimensional cells are attached which are parametrized over the set “enclosed” by the k-dimensional cells and then their geometric structure is investigated. If the projections again “enclose” a (k 1)-dimensional set in a (k 2)dimensional submanifold of M , then the construction is iterated, otherwise the intersection is analyzed. We formalize this as a general principle as follows: We postulate that on M there exists a coordinate chart (eo, . , , , m ) centered at p such that for sufficientlysmall E > 0 the small-time reachable set restricted tothe coordinatecube C(€)= {(eo,. . . ,&) : 5 E for i = o , ...n } ,
+
+
+ +
can be constructed as a CW-complex in the following way by successively attaching higher-dimensional cells which consistof admissible trajectories: 0
CO = (P)
0
Attach to CO two l-dimensional cells C-,l and C+J which are parame-
Time Optimal Feedback Control for Nonlinear Systems
393
trized over D1 = [0, E ] . Specifically, construct continuous maps @lB*
0
: D1 -+
M,
Q
* (&*(Q),
&*(Q),
..
*
,&*(a!))
which are smooth on the open interval ( O , E ) , satisfy O t ( 0 ) = p , and have the property that theprojections of these curves into (50,El)-spwe do not intersect otherwise. Connect the l-dimensional cells C-,1 and C+J bytwo2-dimensional cells C-,2 and C+J which are parametrized over the set D2 between the projections of C-,l and C+J into (€o,E1)-space.Moreprecisely, construct continuous maps
G2I* : D2
-+
M,
(Eo, El)
* (Eo, E l ,
&*(EO,
El),
. . . &*(eo, 9
El))
which are smooth on the interior of D2, satisfy
= @'!*(a) for a! E [o,E],
@ 2 ~ * ( ~ ~ ~ * ( ~ ) , ~ ~ l * ( a ! ) )
i d for which &-(~o, E l ) < {l) holds for (Q, (1) in the interior of D2. Inductively continue this procedure until two (n - 1)-dimensional cells C+,+,,-land C-,,+1 have been constructed. Connect the cells C-,n-l and C+,n-l by two n-dimensional cells CN and CS which are parametrized over r$Xl+((o,
0
0
Specifically,construct continuous functions *g;$
: D,
-+ M , ( E o , . * . ,En-1)
4:l*(Eo,.
.
*
,En-l)h
which axe piecewise smooth on the interior of D,, with the properties that their graphs
Schlittler
394
.
and that . . ,&-l) in the interior of D,. #$9-(<0,
0
< #:I+(
.. .,&-l)
holds for ((0,
. . . ,[,-l)
The small-time reachable set restricted to C(€) is given as the set of points between the graphs of these functions:
In no way do we mean to suggest that this is a generally valid structure, but we have persistently encountered it in low dimensional codimension 0 situations. If the small-time reachable set can be constructed in this way, we say that it is a regular (n l)-dimensional conical cell complex in the coordinates ( t o , , . . , m ) . Under these conditions there exists a direction v = ( v o , ~ in ) (&,,&)-space such that all slices {wo(o +vl(l = c) for sufficiently small c are stratified n-dimensional disks with ,Snm1 as boundary. These spheres can be described as union of a lower hemisphere S,, an equator E,, and an upper hemisphere N,. The equator consists of points on the lower dimensional cells SA,^ for i = l , ,. . ,n 1. In our constructions so far it consisted always of endpoints of bang-bang trajectories with at most n - 2 switchings. The hypersurfaces S = U,>& and N = U,>oN, could be constructed as entry, respectively exit strata into the small-time reachable set for a basis vector field 2.The usual transversality conditions of the Maximum Principle [4]give necessary conditions for the structure of trajectories on N and S which can be used to initiate the construction. Our motivation for considering a qualitative structure of this kind, which can be considered the most regular structure possible, simply is the guiding principle to analyse nondegenerate cases first. For nondegenerate Lie-bracket configurations there seems to be reasonable expectation of a simple structure and our outline above simply provides a precise framework to get started with. If necessary, it is possible to deviate from this stringent set-up at any moment in the construction by analyzing the geometric properties of the strata which are being attached.
+
-
Time Optimal Feedback Control for Nonlinear Systems
395
3. Algebraic preliminaries In the construction outlined above, one main question has been ignored so far, namely how to find the "correct" set of coordinates. Depending on this choice the problem may become simple or extremely complicated. Here the Lie bracket configuration for the system at p comes into play, In nondegenerate situations it plainly dictates a choice of canonical coordinates using the flows along basis vector fields. Several examples will be given in Section 4. In thissection we establish the notation and recall some algebraic results which will be indispensible to perform the necessary calculations which accompany the construction of the small-time reachable set, We use exponential notation for the flow of a vector field, i.e. we write qt = qOetz for the point at time t on the integral curve of the vector field 2 which starts at qo at time 0. Note that we let the diffeomorphisms act on the right. This agrees with standard Lie-algebraic conventions [30]and simplifies formal calculations. Accordingly, we use the same notation for the flow of 2, that is qoXetZ denotes the vector X transported from qo to qt along the flowof 2. But we freely use X(q0) for qoX and we switch between qoXetZ = qte-tzXetz and the more standard notation e-tadzX(qo).&call that a d Z ( X ) = [ Z , X ]and that e-tadZ has the asymptotic Taylor-series representation [31]
We also need an asymptotic product expansion for an exponential Lieseries due to Sussmann [32]:Let X = 1x1,, . . ,X T } be a family of noncommutative indeterminates and let L(%) denote the Lie-subalgebra of the free associative algebra generated by X I , . . , X T . Let V i be integrable functions of a scalar variable t defined over [ O , T ] . It is shown in [32]that the solution S ( t ) to the initial value problem
Schiittler
396
has a welldefined infinite product expansion in terms of exponentials of elements ina Philip Hall basisof L(%) with coefficients whichcan be computed by successive integrations and algebraic manipulations. Lemma 3.1 belowgives the form of this expansion for the case r = 2 taking into account all brackets of order 5 4.
Lemma 3.1. Let A and B be noncommutative indeteminates. Let v and W be integrable functions oft on [0,TI and set t
V(t) = u(s) d s
t
5
and W ( t )= w(s) de. 0
0
Let S(t) be the solution of the initial value problem (17)
S = S(vA + wB),
S(0) = Id.
Then S has the asymptotic product expansion
S = . , , e ( $ SwVW'da)[BI[Bv[A*BIIIe(i S ~ V ' W W ~ ) [ B , [ A ~ [ A , B I I I e ( &j w V 3 d~)[A,[A~IA,B]IIe(~~VWd~)[B,[A,BII ~~~'d~)[A,[A,Blle(S~Vds)[A,BleWBeVA
where the remainder is an infinite product of exponentials of Lie-brackets of A and B of orders 2 5 . The proof of a more general expansion which gives also all terms of order 5 is given in [l61and we refer the reader to this paper for the proof. In the special case when we take v(t) =
1 ifO
and
w(t)= 0 i f O < t < r
1 ifrst
Time Optimal Feedback Control for Nonlinear Systems
397
4. Geometric analysis of bang-bang trajectories We now make precise the construction of the small-time reachable set in codimension 0 situations as outlined above. In this section we consider submanifolds obtained by concatenating bang-bang trajectories. For the nondegenerate 2- and 3-dimensional cases this is enough to construct the small-time reachable set, but in dimension _> 4 also singular arcs need to be considered. This will be the topic of Section 5. We denote the vector fields corresponding to the constant controls U E fl by X = f - g and Y = f + g , Since admissible controls are convex combinations of U = -1 and U = +l, there exists a time ? > 0 with the property that every trajectory corresponding to an admissible control definedon [O,?] exists on [O,?] and liesin M . Furthermore, in a codimension 0 set-up we always assume that the vector fields f , g. and low order Lie brackets of f and g are in general position if the dimension allows. If X,Y,22,. ,2, are vector fields whose values at p are linearly independent, then, by shrinking M if necessary, we can choose a coordinate system on M centered at p of canonical coordinates of the second kind of the form
..
(19)
((0,
. .. ,<, ) peen . . .e 6 I-+
zn
2'
eel
et0
x.
We will always assume that M is a sufficiently small cube M=M(p)={(50,...,5,):15iJ~p, i=O,*..,~},
which we will shrink whenever necessary for the argument. Then without loss of generality we may assume that any linear independence relations which hold at p are valid on all of M . Furthermore, we may assume that for all e < p the vector field Y is transversal to thecoordinate hyperplane {& = e}. It is then impossible that trajectories leave and reenter the neighborhood M e = M n (50 5 E } n 1( 5 E ) . Also note that for E > 0 sufficiently small there exists a time T = T ( E )5 F such that every trajectory of the system startingat p must leave M Cin time T . Henceforth T denotes this time and we shall always assume that the totaltime along any trajectory is 5 T. Also, let us fix our notation for concatenations of
Schattler
398
bang-bang trajectories once and for all. Set
CO = so = cp}, Cx = {peax : S 2 01, Sx = {pesX : S > 01, Cy CXY
= {pet'
:t
2 01,
Sy = (petY : t
= { W ',etY : s , t 2
CY, = { p e t y e a x : S,t
> 01,
o}, s x y = {peexetY : s , t > O},
2 o},
s y x = {petyeax
: S,t
> 0)
and analogously for concatenations of more pieces. Also denote intersections of sets with the halfspaces (50 5 e} and {
Clearly,
and no points outside of the first quadrant are reachable forward in time within M? ,
I
Time Optimal Feedback Controlfor Nonlinear Systems
c;
1-
399
........................
I
G
€0
Fig. 1. f A g # 0 in dimension 2 This trivial structure shows up as a lower dimensional decomposition of the codimension 0 3-dimensional case. Here we msume that X and Y and their Lie bracket [X, Y) are linearly independent and choose as coordinates on M (50,<1,52) c)p e t ~ [ X * Y l e t l Y e ~ o X ,
(21)
Clearly the stratified surface C$, is present as the first quadrant in the coordinate plane (52 = 0). For a XY-trajectory Corollary 3.1 gives (22)
pesXetY
= pest(l+O(T))[X,Y]et(l+O(Tz))Yee(l+O(Ta))X
where U ( T k )denotes terms which can be bounded by CTk for some constant C < W . Thus (23)
(0
+
= ~ ( lO ( T 2 ) ,
51
= t(1
+ O(T2),
52
+
= ~ t ( l O(T)).
The equations for 50 and 51 can be solvedfor S and t near (0,O) and hence for E sufficiently small we c m write (2 as a function of 50 and (1 on Dz= {(&J,[~): 0 5 5 ~ ,=i 1,2}. In fact, (24)
52
= 50E1(1+
’*
m)
Schattler
400
and this expression is positive for 0 C 5 E, i = 1,2.Hence the surface S k y can be described as the graph of a smooth function EZXY(10,El)
= 4J;8+(Eo,sl),
(E0,El)
E D2,
which lies entirely above S+x.Note that the latter is given as the graph over D2 of the trivial function s ~ x ( E o , E 1 )= 4:*-(Eo,El)
= 0.
We claim that the small-time reachable set Reachc,g(p) restricted to {Eo 5 E , (1 5 E } , Reach5 < T ( p ) (see Fig. 2), is given by 1-
(25) ReachE,s&)
= {(~o,€1,~2): ( € o , ~ 1 ) E ~
5
2 , o (2
5 EZX~(E~,EI)}.
t 7 Fig. 2. f A g A [f,g] # 0 in dimension 3
To see this, temporarily call this set D3. Take an arbitrary point on Sky and integrate the vector field X forward in time, i.e. consider Skyx. In our coordinates X = (1,0, O)T and thus, also taking into account equation (24), it follows that the flowof X starting from Sky covers the relative interior of D3,i.e. SfCYX
= {(Eo,S1,Ea) : (E0,El)
E Dz,O
< E2 < € z X Y ( € O , E l ) ) .
Optimal Time Feedback Nonlinear Control for Systems
401
Henceall these pointsare reachable and, in fact, D3 = Ciyx, (Using Y = (*, 1, *)T,where the asterisk denotes terms of order T,it can similarly be shown that also D3 = C C x y . ) It remains to show that no other points in M e are reachable. One way to see this is geometrically. It is easy to verify (using the explicit formulas) that X points inside D3 at points in S%y \ (50 = E} U ((1 = e} and that Y points inside 0 3 at points in S;, \ {to= E}U{& = e}. Since admissible controls are convex combinations of fl, this implies that any admissible trajectory which starts ata boundary point of D3 \ { E o = e} U ((1 = e} either is tangent to or enters into the interior of 0 3 , Hence the small-time reachable set restricted to { t o 5 E, & 5 e) must lie in D3. This proves the claim. An alternative approach, which will be needed in higher dimensions, is to prove that the construction exhausts all boundary trajectories. To this end suppose y is a boundary trajectory and that @,(tl) = 0. Then it follows from equation (7) that also (A(tl),f(z(t1))) = 0 and thus
But f , g, and [f,g] are linearly independent on M and so A(t1) cannot vanish against [f,g] at Ic(t1). Hence b,(tl) # 0 and tl is a bang-bang switch. Thus boundary trajectories are bang-bang with isolated switchings. Suppose there is a second switching at time t z > tl and without loss of generality assume that U 3 -1 on (tl, tz). Then, as in equation (26), A(tz) vanishes against both X and Y at z(t2). Let q1 = z(tl), qz = z(tz) and move the vector Y(qz) back to q1 along the flowof X. This is done by integrating the variational equation. In exponential notation this simply reads
where we made use of the asymptotic expansion (15). The covector A is moved backward along the flow of X by integrating the covariational equation. But this is the adjoint equation (5) and so we simply get A(t1).
Schattler
402
Furthermore, by the definition of the adjoint operator,
(28)
( A ( t l ) ,q2Ye(t2-t1)X) = (X(tl)(e(t2-t')x)*,qzY)
= (X(tz),Y(qz)) = 0. Hence A(t1) vanishes against X , Y and e(t2-t1)adXY, all evaluated at q 1 . By the nontriviality of X these vectors must therefore be linearly dependent. Hence
(29)
0 = X A y A e(ta-t')adXY =x A Y A Y
+ (t2 - t l ) [ xY] , + O((tz - t i ) ' )
+
= ( t 2 - t l ) ( l o(t2- tl))(X A Y A [X,Y ] ) .
But this contradicts the linear independence of f, g and [f,g].Hence boundary trajectories are bang-bang with at most one switching. Thus 0 3 exhausts all possible boundary trajectories and no additional reachable points can exist in small time. (It followsfrom general results that the small-time reachable set R e a c h x : , g ( p )is compact). The second argument is more technical than necessary for this particular example, but it illustrates a general line of reasoning which becomes indispensible in higher dimensions as a method to restrict the possible candidates for extremals. In general, if a boundary trajectoryis bang-bang and has n switchings at times t l < t z < . . , < t,, then A ( t i ) vanishes against X and Y evaluated at qa = z(ti) for i = 1 , . . ,n. Moving these vectors to one point, say 41, gives (n 1) conditions of the type
+
(30)
(A(tl),
Zi)= 0
.
for i = 0 , 1 , .. . , n
with 20 = X(q1)and 2 1 = Y(q1).For instance, if U E -1 on (tl,tZ), then Zz = e(ta-tl)adXYas calculated above. Note that the vector X, when transported backward from 92 to q1 along the X-trajectory, gives no additional information since e ( t 2 - t l ) a d X X= X. Hence everyextra junction adds one more condition. Again, by the nontriviality of A ( t 1 ) the vectors X, Y,2 2 , . . . , 2, must be linearly dependent: (31)
X A Y A Z 2 A ... A Z n = O .
Time Optimal Feedback Control for Nonlinear Systems
403
This puts restrictions on the times t l , . , , ,t,. Following [33] we call the points (q1, . . . ,q,) a conjugate n-tuple and equation (31) a conjugate point relation. In this terminology, our calculation shows that there exist no conjugate 3-tuples in dimension 3 i f f , g and [f,g] are linearly independent. This 3-dimensional example was first analyzed by Lobry in a landmark paper on nonlinear control [34]. The proof given here shows the power of using a good set of coordinates. This will be a permanent feature in all constructions. A second one is the inductive nature of the construction mentioned already. Lobry’s example contains the structure of the trivial 2-dimensional casein the form of Cxy and the 3-dimensionalreachable set is constructed on top of this structure. We will see similar reductions in every step to a higher dimension. This is particularly useful since it allows to utilise earlier results as the dimension is increased. These low-dimensional examples were easy to analyse because the strata S X , S y ,and S y x became trivial with the choice of coordinates. In higher dimensions longer concatenations must be considered and it is here where the Lie-algebraic framework shows its full power. We illustrate this at the 4-dimensional codimension 0 case rederiving some of the results presented first in [lo]. We assume that the vector fields X ,Y, [X, Y] and [Y, [X, Y]] are linearly independent on M and choose as coordinates
The projection onto (53 = 0) gives Lobry’s example analyzed above. It follows therefore that E~=C~~UC~X=SOUSXUS~USX~US~X
(33)
is a regular 2-dimensional conical cell-complex in these coordinates, Set (34)
D3
= { ( 5 0 , 5 1 , 5 2 ) : (Eo751) E D290 5 5 2 5
EZXY(EO,El)).
We have seen above that the small-time reachable set for Lobry’s example can be described as either Cxyx or as C y x y . For the Cdimensional problem these surfaces are now separated in the direction of &,, We continue our inductive construction starting with E E and we show first that both S x y x and S y x y can be parametrized as graphs over D3.
Schattler
404
Lemma 4.1. The cells CkYx and C g x y are graphs of continuous jbnc: D3 + M . These functions are smooth on the relative interior tions of D3, have smooth extensions to the boundary strata Sxy and S y x , and to S y for S x y x , respectively to Sx for S y x y . Furthermore, these maps attach the $dimensional cells to E;.
$x1'
Proof of Lemma 4.1:Without loss of generality consider Sxyx and let q = pea1Xee~yes3X be a point on Sxyx. Set g1 = pealX and 92 = qleasy. We show simultaneously that Sxyx is a 3-dimensional manifold and that always points to one side of Sxyx and the coordinate vector field hence passes the vertical line test in canonical coordinates. The partial derivatives of q with respect to S I , s2, and 9 3 ,
&
&
give tangent vectors to Sxyx at q. To prove that points to one side of S X Y X it , suffices to verify that the wedge product of these vectors with 8 has constant nonzero sign. This also proves the linear independence BE3 of these vectors and hence they span the tangent space to Sxyx at q. Now, rather than calculate the wedge product at g, wemove all vectors backward along the trajectory to the point 41. This gives the vectors
405
Since moving the vectors forward is a diffeomorphism, the determinant stays negative if we move the vectors forward from q1 to q and therefore the tangent vectors
(44) and
& are also linearly independent. Hence the matrix
(45)
is nonsingular and so is then the principal 3 x 3 minor. By the implicit function theorem it is therefore possible to solve for the times (sl,s2,83) as smooth functions of ( t o , 5 1 , 5 2 ) near q provided 92 # 0. Since is transversal to S x y x everywhere, the entire 3-manifold is the graph of a smooth function c$:#-. Furthermore, these calculations show that $#- has a smooth extension to a neighborhood of s1 = 0 and 93 = 0 provided 92 > 0. It is clear that these extensions must agree with S y , S x y and S Y X .The vertical line test breaks down only along the X-trajectory. But
&
Schslttler
406
= 0) the hypersurface S x y x simply reduces to S x and therefore O,O,O): 60 > 0). This proves the Lemma for S x y x . The argument runs completely parallel for the case of S y x y . (It is clear from the construction that the domain of the maps is given by 0 3 . ) So far we have considered bang-bang trajectories with at most two switchings regardless of whether they are extremals or not. Also, at this point in the construction, there is no reason to believe that bang-bang trajectorieswith more switchings arenotextremal. Investigations into whether or not bang-bang trajectories with two switchings are extremal lead us back to the concept of conjugate points.Consider an arbitrary X Y trajectory of the form q2 = qOeaXetY which is extremal and has junctions at qo, q1 = qOeSX and at q z , Then there exists an adjoint vector A which vanishes against X and Y at every junction. Transporting all vectors to 40, it follows that X vanishes at qo also against for
(s2
&- extends continuously to Sx = {(to,
+:l*
qlye-sX
(46)
= esadX
Y(q1)
and against qz~e-tYe-eX
(47)
= eaadX etadYX
(42).
Again, by the nontriviality of X, these four vectors are linearly dependent and thus (48)
0 = X A Y A eeadxY A eaadXetadYX eeadX
y A eeadX
S
(
-'>x)
= StQXY (qo; S, t ) ( X A y A [ X ,YI A [Y,[ X ,Yll),
where the last equationdefines Q x y ( #S,; t ) as the quotient of two 4-vectors. The conjugate point relation becomes then (49)
Q X Y (qo; S,
t ) = 0,
where qo denotes the first junction. By using equation ( 1 5 ) , an expansion for Q x y in terms of S and t can readily be calculated. Write
(50) [X, [X,
y11 = a x + PY
+ r [ X ,YI + qy, [X,y11.
Time Optimal Feedback Control for Nonlinear Systems
407
where Q, 0, y and 6 are smooth functions on M.Then eeadX
(51)
XAYA
- 'y A eaadX
(
- ')x
S
= X A Y A [ X , Y ] 1+ ~ S [ X , [ X..., Y ] ] + A
1
-[X, - ;t[Y, Y][X,
1 0 0 1
+ ..Y]] - s[X,
+[X, ...
Y]]
0 0
0 0 1+ U ( T )
* * ;sd+u(T2) * * -1+U(T) - i t -sd+O(T2) and therefore Q'Xy(q0; S, t ) =
1 -2t
1 - -S6 + O(T2). 2
Let us now also assume that 6 does not vanish on M. Then, by shrinking M if necessary, we may also assume that ( S ( q ) ( dominates terms which are of order U(T)for any q. Then the linear terms in this expansion dominate the remainders and therefore 0
0
if 6 > 0, then for all qo E M and all S,t > 0 we have Qxy(q0;8 , t ) < 0 contradicting equation (49). Hence -XYtrajectories cannot be extremal. (The dot denotes a junction at the beginning and end of the trajectory). if 6 C 0, then the equation Qxy(q0;S,t ) = 0 uniquely determines the time t along an .XYm extrema1 as a function of 40 and S, namely 'i = -6(qo)s U ( S 2 ) .
+
Repeating the calculations for a -YX. trajectory, the analogous result follows. In particular, we have: Lemma 4.2. If 6 ispositiveon M , then bang-bang trajectories with more than two switchings are not boundary trajectories. All bang-bang trajectories with at most two switchings are extremals.
Schiittler
408
If 6 > 0, then it can indeed be shown that if q E Sxyx and r E Syxy have the same (Eo, (1, E2) coordinates, then the (3 coordinate of r is always smaller than the E3 coordinate of g. (This can also be seen from the cutlocus calculation given below.) Therefore Syxy entirely lies below Sxyx. It is not difficult to verify that the small-time reachable set is a regular 4dimensional conical cell complex inthe coordinates ((0,(1, (2, (3). Indeed, its structure and thisproof are a direct extension of Lobry’s 3-dimensional example to dimension 4. From nowonwe assume 6 < 0. Then the conjugate point relations determine the lengths of successive arcs. Even in this case, it follows from results of Agrachev and Gamkrelidze [35] and of Sussmann [36]that bangbang trajectories with more than two switchings are not optimal and so Cxyx and Cyxy exhaust all possible bang-bang boundary trajectories. For the 3-dimensional time-optimal control problem to stabilize an equilibrium this was shown earlier by Bressan in [g]. We will not need to utilize these results in our construction, however. We now investigate the geometric structure of these surfaces further, in particular, whether there is a nontrivial intersection of these surfaces. We will see that this is indeed the case if 6 < 0 and we call this set the cut-locus. It is determined by the (nontrivial) nonnegative solutions to the equation (53)
Pe
S I X aaYeasX
e
=
petlYetaXetsY.
(Trivial solutions are obtained if some of the times are zero and they correspond to trajectories in E = Cxy U Cyx). We first show that the presence of conjugate points implies the existence of a nontrivial cut-locus [29]. By the implicit function theorem, equation (53) can be solved uniquely in terms of ( ~ 1 ~ 5 near 2 ) a solution point q if the Jacobian in (53;t1,tzrt3)is nonsingular. Let qo = petlY, g1 = qoetax and q = Then these partial derivatives are givenby (54) (55) Ifwemove
all vectors back to go along the flowof the YXY-trajectory,
Time Optimal Feedback Controlfor Nonlinear Systems
409
then their wedge-product takes the form (56)
X A Y A etladXYA etaadXetsadYX
This expressioniszero if and only if the points ( q o , q l , q ) are a conjugate triple. Hence, by the implicit function theorem, the cut-locus equation (53) has a unique (and hence only the trivial) solution everywhere on SXYexcept possibly along a curve rxy defined as the solution to Qxy(P;tz,t 3 ) = 0. And along this curve indeed a nontrivial solution t o equation (53) bifurcates. This also can be seen geometrically by considering the tangent space to Syxy at q. If we transport the tangent space back to q1, then it is spanned by the vectors X, Y and etaadXY. But (57)
X A Y A etladXYA [Y, [X,Y]]
t a ( x A Y A [X,Y]A [Y, [X, Y]])
and together with equation (56) this implies that [Y, [X, Y]] and X point to the same side of T,,(Syxy) if Qxy > 0 and to opposite sides if qxy < 0. This implies that the surfaces Syxy and SXYX crosseach other near rxy C Sxy [29]. Elementary geometric considerations show that trajectories which maximize the coordinate [3 near Sxy are of the form YXY if qxy > 0 and of the form XYX if Qxy < 0. We now calculate this cut-locus using Corollary 3.1. We have (58)
pea1xea2yea3x
- p , . .e 3 a l a ~ [ Y , [ ~ , ~ I I ~ 3 a : ~ a [ x , [ x , Y I I ~ 4 ; a ~ ~ ~ [ ~ , y I l ~ a , ~ ~ ( a l + a a ) ~ -pe~a18~(~l6+a'+...)[y,[x,y]]~~~~aa(l+...)[X,Y]]e(a~+~~.)Ye(a~+a~+.~.)X and
410
(64)
Schlttler
816
+ sz + O(S) = 2t1 + t a d + t 3 + O(T).
Equations (60), (61) and (64) can be solved uniquely in terms of We have, for instance,
S
or t.
+ t z + O(T2) sZ = tl + t 3 + o ( ~ 3 ) l = --ti + o(T2). S l
81
(67)
= -t1
6
S3
We only remark on the side that the conjugate point relations imply that all these times (and also t 3 calculated below) are nonnegative for extremals. Now substitute these functions for S into equation (62) to obtain "
(68)
1
tl ( z t l + t a
+' > + op3)= 0. 3t3
In general, the quadratic terms need not dominate the cubic remainders. However, if the times ti satisfy a relation of the type
(69)
2 ET =
+ tz +
E
then this equation can be solved for
(70)
t3
= -t1
tl 1 -(ta
t3)
l - E
t3
+ts),
as
- St2 + O(T2).
Note that this solution is well-defined near
(t3
= 0) (i.e. r y x ) and thus a
Time Optimal Feedback Control for Nonlinear Systems
411
nontrivial cut-locus r,
extends beyond S y x at r y x . Similarly, by solving equations (60),(61) and (64) for t as a function of S, we can show that I' also extends beyond SXY at r x y . The curves r x y and r y x of intersection of I' with SYX and SXY are precisely the curves of conjugate points. Except for the stated transversality, which is easily verifiable, we have therefore proved the following result: Proposition 4.1. If 6 < 0, then S y x y and S x y x intersect transversally along a 2-dimensional surface l?. This surface extends smoothly across SYX and S x y and the intersections withthese surfaces are the curves r y x and I'xy of conjugate points. The cut-locus is the decisive structure for local optimality of bangbang trajectories. Note that, as subset of S x y x (or S y x y ) , ?! can be ((0, (1,(2) which descridescribed as the graph of the function (3 = bes S x y x over a 2-dimensional submanifold Dr of D3. Dr divides D3 into two connected components D3,+ and 0 3 , - which have the property that S x y x lies above S y x y in direction of 53 over D3,- and below S y x y over D3,+. If we denote the corresponding substrata by a superscript f,then only the trajectories in
(xyx
(72)
N = siyxn r n S,+,,
maximize the (3-coordinate over D3 (see Fig. 3). The surfaces S s y x and SFxy in fact lie in the interior of the small-time reachable set and therefore cannot be time-optimal. Within our construction the natural way to see this is to complement the construction with a stratified hypersurface S such that the small-time reachable set is a regular 4-dimensional conical cell complex with upper hemisphere N and lower hemisphere S in the coordinates ((0, (I,&, (3). This is indeed possible, but unlike the case 6 > 0, now bang-bang trajectories no longer suffice, but S consists of concatenations with singular arcs. We briefly postpone a more detailed description of S to the next section. Here let us just remark that a lo-
SchHttler
412
cal regular synthesis can easily be constructed from the precise structure of the small-time reachable set [20].This synthesis implies the following result [21].
ryx Fig. 3. The northern hemisphere for 6 < 0. Proposition 4.2. Suppose 6 < 0 andlet M be a suflciently small neighborhood of p . Then all extrema1 bang-bang trajectories whichlie in M are strong relative extrema in the sense of Calculus of Variations upto the third switching time. However, in M they are only time-optimal until the time determined by the cut-locus is reached. This occurs after the second, but prior to the third switching. After this time the trajectory lies in the interior of the small-time reachable set. We close this section with some remarks about the 5-dimensional case. Here it is assumed that the vector fields X ,Y ,[X, Y],[X, Y]] [X, and [Y, [X, Y]] are linearly independent on M and we choose as coordinates
Because of the extra d,imension the surfaces S x y x and S y x y are now separated completely. It is a straightforward extension of the arguments
Time Optimal Feedback Controlfor Nonlinear Systems
413
given above to show that (74)
E3
= CXYX UC Y X Y = so U sx U SY U S X Y U s y x U s x y x U s y x y
is a regular 3-dimensional conical cell-complex in these coordinates. Set D4 = {(50,51,52r53): (
YXY
53
5
(50,51,5a) 53
5 53XYX(50,5l,52)).
Analogous to the step from dimension 3 to dimension 4, the surfaces S x y x y and S y x y x , which would lie in the interior of the small-time reachable set for the 4-dimensional problem, are now separated in direction of the additional coordinate 54. Like above, it can be shown that SXYXY and S y x y x aregraphs of continuous functions : D4 M. These functions are smooth on the relative interior of D4 with smooth extensions to the boundary strata S x y x , SYXY,and to S y x for c&, respectively to S x y for Furthermore, these maps attach the Pdimensional cells to E;,Making two additional linear independence assumptions involving forth order brackets, it also follows that conjugate point relations restrict the lengths of successive legs of extrema1 bang-bang trajectories with 3 switchings. Like above, this implies that a nontrivial cut-locus bifurcates from the conjugate points. The Lie-algebraic formalism can then be used to calculate the canonical coordinates.and to derive the cut-locus equation. Using blow-ups of the form Fl:$
$:l+.
(75)
91
=~
$ 3and
94
= Ps2,
the determining equation takes the form
= 0, where = indicates the presence of higher-order perturbations at each summand and a2 and c1 are coefficientswhich arise when the forth order brackets [X, [X, Y]]] [X, and [Y, [Y, [X, Y]]] are expressed in terms of the basis. From this equation a complete analysis of the cut-locus can be
414
SchPttler
given. This becomes technically much more involved because of the cubic terms, but also since now additional sections of the trivial cut-locus from S x y and S y x which are not conjugate points lie in the relative boundary of the cut-locus as well. The result shows that this case still exhibits the same qualitative features as the Cdimensional case. In particular, again the cut-locus limits the optimality of bangcbang trajectories with three switchings near p . These results were obtained in joint work with M, Jankovic and appear in her thesis [37].They improve on the results in Ill] in the sense that they not only limit the structure of extrema1 bang-bang trajectories to have at most three switchings, but also exclude some of these from optimality near p . But Shin [l11 also analyzes the structure of singular extremals showing that if a trajectory contains a singular subarc then at most concatenations of the form BBSB and BSBB can be optimal. These results establish the importance of the cut-locus for the local structure of optimal controls. The construction of the small-time reachable set allows to distinguish global from local optimality on the level of the trajectories. Precisely these explicit calculations display the exact structure of the time-optimal trajectories.
5. Some remarks about the geometric analysis of singular extremals Above we used bang-bang trajectories to illustrate the mechanics of the construction of the small-time reachable set. The reason was that the relevant Liealgebraic calculations are simpler for the constant control vector fields X and Y.But very few nonlinear systems have a pure bang-bang structure andtypically singular arcs must be incorporated in the construction. This is the case for the codimension 0 Cdimensional example if S < 0 and for any typical higher dimensional system. Here we only round up the more detailed exposition of bang-bang trajectories with some comments on these aspects of the construction. We return to the codimension 0 4-dimensional example for S 0.
for Nonlinear Systems
Time Optimal Feedback Control
415
Lemma 6.1. If 7 = (s(+), U(.)) is an eztrenal pair defined over the interval [O,T] with the property that the control i s singularon an open subinterval I , then necessarily (77)
Proof. Since U is singular on I , the switching function
W
) := (X(t>, .9(X(t)))
and all its derivatives vanish on I , Hence for t E I X vanishes against X, Y and [X,Y] along s(t).Since X, Y, [X,Y] and [Y, [X,Y]] are linearly independent, and since X is nontrivial, X cannot vanish against [Y, [X,Y]] and thus
as desired. The singular control (77) gives rise to a smooth feedback control
(79) defined on M. Note that this defines an admissible control for S < 0. Also, in a sufficiently small neighborhood of p , the quotient (1 d ) / ( l - d) is bounded away from fl and so no saturation is possible in M . The concept of conjugate points can be invoked to show that singular controls cannot be concatenated with bang-bang trajecories which have additional switchings. Consider an arbitrary SX-trajectory of the form q2 = qOerSeeXand let q1 = qOefS denote the junction. If the trajectory is extremal, then there exists an adjoint multiplier which vanishes against X, Y and [X,Y] at 41, We call any point q1 where this holds a singular junction regardless of whether a singular arc is actually present or not. If q2 is another switching, then X vanishes also against X and Y at q2.
+
Schattler
416
By moving Y backward along the trajectory to q l , again four conditions are imposed on X. By the nontriviality of the multiplier these four vectors must be linearly dependent. Here we get 0 = X AY A [X,Y] A esadXY l
= ;s2(d
+ O(s))(XA Y A [X,Y ]A [Y,[X,Y]]).
But this contradictsthe linear independence of X , [X, Y] Y, and [Y, [X, Similar calculations verify that extrema1bang-bang trajectories which contain a singular subarc can at most have the structureBSB where B stands for either X or Y. Having established this structure, the next step is to show that the surfaces SXSX, SXSY, Sysx and S y s y form astratified hypersurface which can be described as the graph of a piecewise smooth function (3 = El, E z ) with domain D3, In principle this verification is exactly the same as for bang-bang trajectories, but the fact that thesingular control is a smooth feedback must be taken intoaccount. We leave it to theattentive reader to supply the details. Then it follows easily that
(80)
S = cxsx U CXSY U CYSX U CYSY =souSxuSyussusxsussxusysuSsy U sxsx U SXSY U SYSX U SYSY
provides the lower hemisphere to thereachable set (see Fig. 4).This concludes the construction of the small-time reachable set for the nondegenerate Cdimensional case. In [l61 the small-time reachable set from a point p where 6(p) = 0 was analyzed under additional linear independence assumptions on higher order Lie-brackets. In this codimension 1 situation saturation of singular arcs occurs and this makes the calculations near saturation points even more complicated since higher order terms in the expansions need to be tracked carefully. Still the precise structure of the small-time reachable set and the corresponding time-optimal synthesis near p can be established. We refer the reader to [l61for these results. This saturation phenomenon for singular arcs is also analyzed in the papers by Bonnard et al. [14, 151
Y]],
Time Optimal Feedback Control for Nonlinear Systems
417
SXY
Fig. 4. The southern hemisphere for 6 < 0 for the control of a chemical batch reactor and a complete solution is given there as well.
6. Conclusion The construction of the small-time reachable set is an effective method to solve local time-optimal control problems for contol systems which are affinein the controls and have bounded control values.Necessaryconditions for optimality single out the constant controls fl and singular controls as prime candidates, but typically do not give the precise structure. In this paper we outlined selected aspects of a construction of the small-time reachable set as a cell complexby attaching inductively cellslof increasing dimensions. These cells consisted of extrema1 trajectories formed by concatenations of increasing lengths of possible candidates for optimal trajectories like bang-bang or singular trajectories. The construction is entirely geometric and relies on a Lie-algebraic framework to perform explicit calculations to establish the geometric properties of. the strata. Structural features which were found in this way, like cut-loci, played decisive roles in the analysis of the low-dimensional examples considered so far. Thenfrom the precise structure of the small-time reachable set for the extended system where 'time' has been added as extra coordinate,,a local
Schattler
418
regular synthesis of time-optimal controls can be derived by projecting out the variable ‘time’. For more details on the latter aspect of our approach we refer the reader to [20] where this construction is carried out for the codimension 0 Cdimensional case. Hence the construction of the small-time reachable set is indeed an effective method to solve time-optimal control problems. More details on the technical aspects and the proofs can be found in [lo, 201 and in particular in [l61 where the construction is carried out in detail for the technically difficult codimension 1 4-dimensional case in the same spirit as it was explained here.
REFERENCES V. G. Boltyansky, Suficient conditions for optimality andthe justification of the dynamic programming method, SIAM Journal of Control, Vol. 4, No. 2, pp. 326-361, (1966). P. Brunovsky, Existence of regular synthesisfor geneml control problems, Journal of Differential Equations, Vol, 38,pp. 317-343,. (1978). H. Sugsmann, Synthesis, presynthesis,. suficient conditions for optimality and subanalytic sets, in: Nonlinear Controllability and Optimal Control (H. Sussmann, Ed.), Marcel Dekker, pp. 1-19, (1990). [41 L. Pontryagin, V. Boltyansky, R. Gamkrelidze, and E.Mishchehko, The mathematical theory of optimal processes, Wiley-Interscience, New York, (1962). [51 H. Sussmann, The structure of time-optimal tmjectories for singleinput system8 in the plane: the Coononsingular case, SIAM J. on Control and Optimizatian, Vol. 25, No. 2, pp. 433-465, (1987). .H:Qussmmn, .The structure of time-optimal tmjectories for singleinput systems in the .plane: the geneml real-analytic case, SIAM J. Qn Control and Optimization, Vol. 25, No. 4, pp. 868-904, (1987). [71 H. Sussmann, Regular synthesis for time-optimal control of single-
Time Optimal Feedback Control
for Nonlinear Systems
419
input realanalyticsystems in the plane, SIAM J. on Control and Optimization, Vol, 25, No. 5 , pp. 1145-1162, (1987). B. Piccoli, Classification of genetic singularitiesfor the planar timeoptimal synthesis, SIAM Journal on Control and Optimization, Vol. 34,NO.6,pp. 1914-1946, (1996).
A. Bressari, The generic local time-optimal stabilizing controls in dimension 3, SIAM Journal on Control and Optimization,Vol. 24,No. 1, pp. 177-190, (1986). A. J. Krener and H.Schattler, The structure of small time reachable sets an low dimension, SIAM Journal on Control and Optimization, Vol. 27,NO.1, pp. 120-147 (1989). Ch. E. Shin, On the structure of time-optimal stabilizing controls in R4,Bollettino U.M. I. ,Vol. (7)9-B, pp. 299-320, (1995).
H.Schattler, On the local structure of time-optimal bang-bang trujectories in R3,SIAM Journal on Control and Optimization, Vol. 26, NO.1, pp. 186-204, (1988). . . H.Schattler, The local structure of time-optimal trajectories in di,
mension 3 under generic conditions, SIAM Journal on Control and Optimization, Vol. 26,No, 4,pp. 899-918, (1988). B. Bonnard, J , P. Gauthier and J. deMorant, Geometrictirneoptimalcontrol for batch reactors, Part I, in: Analysis of Controlled Dynamical Systems, (B. Bonnard, B. Bride, J. P. Gauthier and I Kupka, Eds.), Birkhauser, (1991);Part 11, in Proceedings.of the 30th IEEE Conference on Decision and Control, Brighton, United Kingdom, (1991). B. Bonnard and J. de Morant, Toward a geometrictheory in the time-minimal control of chemical batch reactors, ,SIAM Journal on Control and optimization, Vol. 33, No. 5 , pp. 1279-1311, (1995).
H.Schattler and M. Jankovic, A synthesis of time-optimal controls in the presence of saturated singular arcs, Forum Mathematicum, Vol. 5 , pp, 203-241, (1993). H.Schattler, Regularity Properties of Optimal Trajectories: Recen-
Schiittler
420
tly Developed Techniques,in: Nonlinear Controllability and Optimal Control (H. Sussmann, Ed.), Marcel Dekker, pp. 351-381, (1990). H. Sussmann, Lie brackets and real analyticity in control theory, in: , Mathematica1,Control Theory, Banach Center Publications, Vol. 14, Polish Scientific Publishers, Warsaw, Poland, pp. 515-542, (1985). W. Boothby, An introduction to differentiablemanifoldsand Riemannian geometry,Academic Press, New York, (1975). H. Schattler, A local feedback synthesis of time-optimal stabilizing controls in dimensionthree, Mathematics of Control, Signals and Systems, Vol. 4,pp. 293-313, (1991). H. Schiittler, Extrema1 trajectories, small-timereachable sets and local feedback synthesis: a synopsis of the three-dimensional case, in: No'nlinear Synthesis (Proceedings of the IIASA Conference on Nonlinear Synthesis, Sopron, Hungary, June 1989), Christopher I. Byrnes and Alexander Kurzhansky (Eds.), Birkhiiuser, Boston, 258-269, (1991). 1221 A. T. Fuller, Study of an optimum non linear system, J. Electronics "Control, Vol. 15, pp. 63-71, (1963). 1231 H.Sussmann, Recent developments in"the regularity' theory of optimal trajectories, in: Proceedings of the Conference on Linear and Nonlinear Mathematical Control Theory, Torino, Italy, (1986). p41 A. J. Krener, The high order maximum principle and its application to singular eztremals, SIAM Journal on Control and Optimization, Vbl. 15, pp. 256-293, (1977). 'B.b. Goh, Necessary conditionsfor singular extremals involving multiplecontrolvariables, SIAM J. on Control, Vol. 5,, pp, 716-731, "(1966). J. 'P. McDanell and W.F.Pbwers, Necessary conditions for joining optimal singular and nonsingular subarcs, SIAM J. on Control, Vol. g., pp. 161-1173, (1971). H.Maurer, An example of a continuous junction for a singular control problem of even order, SIAM J. on Control, Vol. 13. , pp, 899903,',(1975);
pp.
'
'
'
Optimal Time Feedback Control
for Nonlinear Systems
421
H. W. Knobloch, Higher order necessary conditionsin moptimal control theory, Lecture notes in Control and Information Sciences, Vol. 34,Springer Verlag, Berlin, Germany, (1981). H. Schiittler, Conjugate points and intersections of bang-bang trajectories, Proceedings of the 28th IEEE Conference on Decision and Control, Tampa, Florida, pp. 1121-1126, (1989). N. Jacobson, Lie algebras, Dover, (1979). Bourbaki, Elements of Mathematics, Lie groups and Lie algebras, Chapters 1-3, Springer Verlag, (1989). H.Sussmann, A product expansion f o r the Chen series, in: Theory and Applications of Nonlinear Control Systems, C. Byrnes and A. Lindquist, Eds., North-Holland, pp. 323-335, (1986). H. Sussmann, Envelopes,conjugatepoints,andoptimal bang-bang eztremals, in: Proceedings of the 1985 Paris Conference on Nonlinear Systems (M. Fliess, M. Hazewinkel, Eds.), Reidel Publishing, The Netherlands, (1987). C. Lobry, Contr6labilitd des Systbmes nonlindaiws, SIAM J. Control, Vol. 8,pp. 573-605, (1970). A. A.Agrachev and R. V. Gamkrelidze, Symplecticgeometry for optimal control, in: Nonlinear Controllability and Optimal Control (H. Sussmann, Ed.), Marcel Dekker, pp. 263-277, (1990). H.Sussmann, Envelopes, high order optimality conditions and Lie brackets, Proceedings of the 28th IEEE Conference on Decisionand Control, Tampa, Florida, pp. 1107-1112, (1989). M. Jankovic, Thegeometricanalysis of bang-bang trajectories in the boundary of the small-time reachable set for nondegenerate fivedimensional systems, D. Sc. Thesis, Department of Systems Science and Mathematics, (1994).
Qualitative Behavior Control .Problem and Stabilization of Dynamical Systems A . N. Shoshitaishvili Math. Department, University of Arizona, Tucson, AZ On leave from Institute of Control Sciences, Moscow, Russia
Abstract The classical stabilization .problem is treated as a particular case of the following problem: for a given control dynamical system CD of the form k = v(z,U ) and a system D without control defined by f = w(z) find a feedback f such that there exists a map z I-) z which maps the trajectories of D to thetrajectories of x = v@, !(S)). This gives a possibility of stabilizing systems with degeneration of controllability (meaning that low order Taylor approximations of CD are not controllable).
0. Introduction The paper consists of five sections. In the first section a qualitative behavior control problem is posed and thepassage fromit to thestabilization problem is explained. In the second,section a synthesis of feedback in the qualitative behavior control problem by means of a so called Ashbyrelation is given. In the third section we discuss a monodromy type obstacle for asymptotic stabilization. In the fourth section we stabilize CD systems This work W93-011-1728.
supported by the Russian Fond of Fundamental Researches Grant
423
424
Shoshitaishvili
which have low codimensional degenerations of controllability (meaning that low order Taylor approximations of CD are not controllable). In this case the ordinary synthesis is generally impossible and the Ashby relation leads to synthesis with a branching feedba'ck.In thelast section we consider some relations between three fundamental principles of control: optimization, homeostasis (or coexistence principle) and imitation (the qualitative behavior control problem is a possible formalization of the latter).
1. Qualitative Behavior Control Problem 1.1. The classical stabilization problem (see [l-31 and the references therein) is to choose a feedback such that some state of the system becomes stable or asymptotically stable. A generalization is to stabilize some dynamical regime of a given nature (for example, a periodic regime or a quasiperiodic one). The next step in this hierarchy is the problem of finding a feedback such that system has an attractor of a given nature. All those problems can be viewed as particular cases of the Qualitative Behavior Control Problem (QBC-problem) ([4-61). A Controlled Dynamical System (CDsystem) is a commutative dia-
gram
E
.L
4 TX dB X
where X is a manifold, 7 : E + X is a fibre bundle, and p : TX + X is the tangent bundle. Each section 8 : X + E is called a feedback. Consider a dynamical system D (D system) on a manifold Z defined by a section W : Z + TZ.The Qualitative Behavior Control Problem for the above CD system relative to the Ck-type (0 5 k 5 00) of the D system is to find a section S : X -+ E such that' the dynamical system v o S : X -+ TX is Ck-equivalent t o W : Z -+ TZ,i.e. there exists a Ckdiffeomorphism Z + X rnapp'ing the trajectories of W onto the trajectories of v o S. The local variant of the problem (to be called the LQ-problem) considers germs instead of global objects.
and Control
Stabilization of Dynamical Systems
425
Recall (see, for example, [7])that the germ at zo of a function f : R" + R is the set G of functions such that for any g E G there exists a neighborhood U of zo such that flu = g1u. Any assertion about the germ should to be understood as the assertion which is true for all g E G and U small enough. The k-jet of the germ at zo of a C1-function f ( l 2 IC) is the set G of C'-germs g such that Ilf(z) - g(z)11 = O(1lz - sollh)>. The manifold of k-jets is parametrized by derivatives of orders at most k. The germs and jets of maps, vector fields and other objects are defined in the same manner. The main topic of investigation will be stabilization, with a feedback U = f(z),f ( 0 ) = 0, of the equilibrium x = 0 for the germ in the origin of the CD system given by (1)
5 = Az
+ BU + b(z,u)
(z E R*, U E R"),
where A , B are constant matrices, and b is a Ch second order term ( b = 4 2 ,4 ) '
To stabilize (1) it issufficient to solve the LQ-problem for the CD system (1) relative to the Ci-type (0 5 l 5 m) of the germ at the stable or asymptotically stable equilibrium z = 0 of the system (2)
i = DZ
+ d(z)
(Z
E R"),
where D is a constant matrix, d is a C' second order term. Indeed, if U = f(z)is a solution of the LQ-problem (l),(2), then any trajectory of ( l ) l u = ~ ( z ~ may be represented as F ( z ( t ) )where F is a homeomorphism. So z ( t ) is bounded or tends to zero if and only if ~ ( thas ) the respective property.
Definition [S]. A pair of linear operators A : R" + R",B : R" + R" is called controllable provided the mapping (R")" + R", ( d l ) , . . ,v(")) I+ B&) + + , . . An-lBdn), is epimorphic, If the pair ( A ,B) is controllable, then there exists a linear feedback U = La: such that A + BL is Hurwitzian, and thesystem (1)with this feedback becomes asymptotically stable [S].So the passage from stabilization to the LQ-problem is especially interesting for uncontrollable pairs (A,B ) .
+
.
426
Shoshitaishvili
2. Ashby Relation and Ashby Control 2.1. Let us recall some results [5] on solution of the QBC-problem.
A n Ashby relation is a quadruple (CD, D ,H , M ) , where C D is a CDsystem and D a D-system, and H : Z + r(E)is a Ck-mapping from 2 to the section space r ( E ) of the fibre 'bundle E, equipped with a Ck Banach manifold structure, M C X x Z is an integral manifold of the dynamical system CDH x D on X x Z defined by the section X x 2 + T(X x Z),(z,z ) I+ ( v ( H ( z ) ( z ) ) , w ( z ) )An . Ashby relation is called local for an Ashby L-relation) if CD, D, H , M are germs at the point (z = 0,u = 0,z= 0 ) , H ( z ) ( s )= (z,u= h ( z , z ) ) ,M is the graph of z = F ( % ) , and h : (R"x Rnl,O) + (Rm,O),F : (Rnl,O) + (R",O)'are germs of Ck-mappings. 2.2. Theorem. The QBC-problem (LQ-problem) is .solvable iff there
exists an Ashby relation (L-relation) (CD, D,H , M ) such that M is the graph of aCk-diffeomorphism F : Z + X(F : (R"1;O) + (R",0 ) , respectively). Proof Let a feedback S solve the QBC-problem and let F : Z + X be a Ck-diffeomorphismtaking the phase portrait of D to thatof v o S. Then the corresponding Ashby relation is defined by H such that H ( % )= S for all z E 2 and by M which is the graph of F . I
Conversely, consider an Ashby relation such that M is the graph of a Ck-diffeomorphism F : Z + X. Define s(z) = H ( F " ( z ) ) ( z ) . Then the dynamical system on X defined by the section v o S : X + TX is Ck-equivalent to D via F . The theorem is proved.
Remark. The feedback f = h(z,F " ( z ) ) coming from an Ashby relation is not the only possibility to make the behavior of C D similar to that of D. To do that, it is not necessary to eliminate variables z by means of the relation z = F ( z ) . It is enough to consider the system CDD = CDI,=h(,,,) x D and use any tool which forces states of CDD to be on M . (For example, it may be a sliding mode with M as sliding
Stabilization and Control
of Dynamical Systems
427
surface. In this case the set of x-components of CDD trajectories is image of the D trajectories by the mapF,)This remark is still true for branching feedbacks (which willbe discussed further). Solutions of the QBC-problem obtained by using the tools mentioned above other than a feedback mode may be useful in many cases. For example, in stabilization of C D systems with noncontrollable linearization, the use of slide mode realization of Ashby relation makes it possible to avoid parasitic solutions which lie in F(C) where C is the set of singular points of the map F (generally C is not empty for those CD systems). 2.3. It followsfrom Theorem 2.2 that any theorem on existence of integral manifolds in dynamical systems gives a possibility to construct a feedback for the &BC or LQ problem. Consider the germ of the system
+ BJ + a(%,I), f = DJ + d(z,z), (x E Rn,z E P ) '
S = Ax
(3)
where a(x,z) and d(x,e)are second order terms. 2.3.1. Theorem. [g], [lo]. Suppose the spectrum of A is separatedfmm that of D by a strip c1 5 Rex 5 c2 (here c1,cz are real numbers). Then (3) has the germs of integral manifolds (4)
M = { ( x , ~:)x = F(%) (F(0) = 0)},
(5)
N = {(x, z ) : z = @(x) (O(0) = O)},
which are the graphs of the germs of differentiable mappings
F : (R",O)+ (Rn,0);9 : (R",O) + (R",O). Moreover, if c1 < c2, c1 < 2 ~ 2 ,...,c1 C (k+l)cz then F and 9 are at least k times differentiable and then k-jets at the origin of F and 9 are uniquely determined by those of (1). If (3) is analytic and either the spectrum of A lies to the left of the strip and c2 > 0 or it lies to the right of the strip and cl > 0, then F is analytic.
..
,..
.. .,.
,
... .._, .. ,
"
....,.".,,_
.,../ . . .
"
..,.._. _._..,..... ...
I
,
I..,......
.. ... ...., .
....
,
,
,
,
.*_
.
Shoshitaishvili
428
2.3.2.Let a(x,z ) , d(x,z ) be analytic and let the linearization of (3), i.e. x = Ax+ Bz, i = Dz, be equivalent to x = Ax, i = Dz. Suppose the eigenvalues X I , . . ,Am of A and p l , . . . ,pn of D have no ( A ,D)-resonances, i.e., there are no relations of the f o r m Xi = mlpl + . . . + mn,un, whew m l , . . . ,mn are nonnegative integers with m1 + . . . + m n 2. Then there exists a series F ( z ) such that if it is convergent in some neighborhood of z = 0 then the manifold defined by x = F ( z ) is an analytic integral manifold of (3) [ll].If the convex hullof XI,, . . ,X m , p l , ., . , p n does not contain zero (i.e., if the spectrum of A D is contained in the Poincare domain) then F ( z ) is convergent.
>
2.4. Consider the C D and D systems (1) and (2). Evidently, a ne1 is the cessary condition for the LQ-problem to be solvable with IC solvability of the linearized LQ-problem
>
i.e., the existence of constant matrices G, @ such that
@ - l ( A+ B G ) 9 = D.
(7)
Note that if the pair A , B is controllable, then necessary and sufficient conditions for the matrices G, @ to exist axe given by Rosenbrock's structural theorem. The condition (7) is sufficient for solvability of the QBC-problem with (A,B) controllable even in the caae when the right-hand sides of the equations CD and D depend on parameters E (maybe, varying with time). Namely, consider the germs at x = 0,U = 0, E = 0 and at z = 0, E = 0 of the systems (8)
k=Ax+Bu+ls+b(s,u,E), i = R E + r ( & )( X E R ~ , U E R ~ , E E R P )
(9)
+ As + d(z,E ) , 6 = RE + ( z E R", E E RP)
i. = Dz
T(E)
Control and Stabilization of Dynamical Systems
429
respectively, where A, B, 1, R,D , A are constant matrices, and b, d, r are Ck+l higher order terms ( b = o ( ~ , u , E r) ,= o ( E ) , d = o ( z , E ) ) .
Theorem. Let the pair A , B be controllable. Then the LQ-problem (8),(9) with k 2 1 is solvable iff the same is true for the linearized LQproblem (6).
Proof. In (8)put Then
U
= gx
+ (G - g ) @ z , where g is some l x n matrix.
+ Bg)z + B(G - g)@%+ IS + b ' ( ~ , where b'(x, z ) = b(z,gz + (G - g)@z,E ) . j.= (A
Z, E ) ,
Since the pair A, B is controllable, for arbitrary complex numbers XI,. , . , X, there exists a matrix g such that the spectrum of A Bg is {Xl,, . . ,A,}. Choose g in such a way that the spectra of A Bg and D @ R satisfy the hypotheses of Theorem 2.3.1.Then the system
+
+
+ Bg)z + B(G - g)@%+ l&+ b'(z,Z , E ) , i = DZ+ AE+ d(z,E),
j. = (A
d = RE
+P(&)
has an integral manifold x = F(%, E ) , where F is a Ck-mapping having a uniquely determined k-jet at the origin. Setting E = 0 in the condition for the manifold to be integral, one finds that the l-jet F,(O,0) satisfies the relation ( A + Bg)F,(O, 0) B(G - g)@= F,(O,0)D.By (7)this is true for F,(O,O)= 0:indeedB(G-g)@+(A+Bg)@ = (A+BG)@ [email protected], since the k-jet of F is uniquely determined, F,(0,O)coincides with 0 and is )the germ of a diffeomorphism. is invertible; thus ( z , E ) H ( F ( z , E ) , E Theorem 2.4 now follows from Theorem 2.2.
+
2.5. Ashby control Suppose the spectra of A and D satisfy the assumptions of Theorem 2.3.1or Theorem 2.3.2,Sincefor any vector (or analytic) vectorvalued function e ( x ) ( @ ( O ) = 0) the eigenvalues of the linearized system (l), (2)with U = e ( z ) coincide with those of A and D , Theorem 2.3.1 (or Theorem 2.3.2)implies the existence of an Ashby L-relation ((l), (2), @,MQ);
430
Shoshitaishvili
here M Pis the graph of a mapping FP: (R", 0) -$ (R"', 0) whose k-jet at the origin is uniquely determined by that of e (and of (1) and (2), both being fixed). Given an Ashby relation, one can define a one-to-many control U* (z) = e(F'l(z)) such that the phase portrait of (1) is the image of that of (2) under the mapping z H FQ(z). If the latter is a ramified covering of finite multiplicity, then u * ( z ) isdefined in a neighbourhood of z = 0 and is realized as an ordinary feedback u(z,j)in the dynamical system k ( t )= M t )
(10)
+ C.(.(t),j(t)) + b ( z ( t ) , 4 z ( t ) , j ( t ) ) ,
j ( t ) = Q(.(t),j(t - 01, defined on a set A C R" x { 1,2,. .. ,p } : Q is definedon A; j and Q assume a finite number of values l , . . . , p , (z, ~(z,j)) E A. The solution of (10) with initial values zo,jofor t = to((zo,jo) E A) is a pair of functions z ( t ) , j ( t )satisfying (IO) for t 2 to,z(to)= zo, j ( t 0 ) = a ( z 0 jo). ,
Let us describe how to find a and U(., j). Represent the set of all solutions of z = F J z ) with respect to z as a collection of vector-valued functions q ( z ) ( j = 1,.. . , p ) , each defined on its own domain w j . In other words, z = FQ(zj(z))for any z E W j ( j= 1,. , p ) ; if z = Fp(z),then there exists a j such that z = q ( z ) ;the union of wj's is the image of F@, while the union UT==, Qj of nj = { z : z = z j ( z ) , z E w j } is the domain of FP.. Assume that there exists a function P ( z ) with values in {l, , , . , p } such that foreach trajectory z ( t ) ( t 2 t o ) of (2), p'(t) = P(z(t))' (t 2 t o ) is piecewise constant, continuous from the right, has a finite number of jumps in each finite time interval and z ( t ) E f 2 p ( z ( t ) ) for any t 2 to. This means that during each finite time interval the trajectoryz ( t ) passes from one of the ai's to another a finite number of times and after transition, at time t , from C2p(z(t"O)) to f l p ( z ( t ) ) stays for a while within the latter. Put
Then for any initial value ( z o , j o )E A there exists a solution z(t),j(t), (t 2 t o ) of (10) where z ( t ) is the z-projection of the trajectory z(t);z(t)of
Stabilization and Control
of Dynamical Systems
431
1.1.(1)and 1.1.(2) with U = e and with initial values x(t0) = xo,z(to) = zj(t,)(x(to)) lyingon MQ,while j ( t ) = P ( z ( t ) ) ( t 2 to). In general, the aforementioned solution is not unique. However, the following proposition is true. Suppose that
4 4 = z a ( z , j ) ( x ) h j ) )E A). Proposition. Let (xO,jO) E A be a point such that det BZFp(zjo(x0)) # 0. Then the trajectory (11) with x(T) = xo, j(T)= a(xO,jo)is locally unique, i.e., there exists a S > 0 such that the trajectory x(t), j ( t ) of (1) with x(T) = zo, j(T)= Q (xo,jo) is uniquely determined for T S 5 t 5 T+6 ( T E R ~ ) .
-
Thus, local uniqueness of trajectories for (10) can be violated only for x, j such that za(o,j,(x) E C = { z : det BZFp(z) = 0). The control U(., j) defined by the Ashby relation is called an Ashby control. An Ashbycontrol enables us to accomplish the synthesis of the required dynamics for a D-system 1.1(1)whose initial Taylor approximations are not controllable. However, in this case one must eliminate the parasitic solution of (l),i.e., solutions that are not the images of trajectories of 1,1(2) under FP.To do this one needs to use near C the vector-valued function z ( t ) directly and not to eliminate it by means of Fp: at any time to an Ashby control can be realized not only as a feedback U(Z,j ) for (lo), but also m an open loop control for (1.1(1)) with U = e(z(t,zj(to,(xo))). After overpassing C one returns to the Ashby feedback U(., j). The other possibilities are to use a discontinuous control.u'(x,j) which has a small jump near C and coincides with u(x,j) out of a small neighborhood of C or to use a sliding mode near C (see Remark 2.2) and so on. It is natural to call allcontrol modes which one.can obtain by means of Ashby relations Ashby controls. But here we usually employ this term for the branching feedback (10). The branching feedback will sometimes also be called an Ashby feedback.
Shoshitaishvili
432
3. Obstacles to Stabilization of Equilibrium 3.1. There exist at least two types of obstacles to stabilization which have a topological nature. Thefirst is the index of the vector field germ W at the isolated equilibrium x = 0 : w(0) = 0,w(x)= 0,x = 0.It is known [l21 that it is equal to (-l)n for an asymptotically stable equilibrium. So if it is impossible to construct a feedback u(x)such that ~ ( xU(.)), has index (-l)n then it is impossible to asymptotically stabilize the equilibrium x = 0 in the C D system = ~ ( xU ),. The second type of obstacle is associated with a nontrivial monodromy in a CD-system. It were considered in [l31 in connection with the problem of tracing periodic motions, where an Ashby control was used with branching of the type of the Riemann surface d-. Neither index nor monodromy is an obstacle for asymptotic stabilization by means of a branching Ashby control. ,3.2.Example. Consider the CD system [6] ,
x1
= U: -2154,
x2
= 2u1u2,
x1
= .:(x)
$2
= 2u1(4u2(4,
The system
- U;@),
with any continuous functions ul(x), u2(2) (ul(0) = 0) does not have a cycle 7 .that passes through the origin 2 = 0 and has no .fixed points. Moreover, the zero equilibrium of (2) cannot be asymptotically stable. Indeed index S of the vector field (2) with respect to zero is the product of the degree v of the mapping x .(x) and the degree p of the mapping U (U! - ui, 2.~1,u ~ )Since . p = 2, one has S = 2v. This is impossible if (2)has a cycle as above or an asymptotically stable origin because then S = 1. ' i )
+
of Dynamical Systems
Stabilization and Control
433
Corollary. The equilibrium x = 0 of (2) cannot be stable. It is possible to estimate the number of branches for a class of CD systems including (1). Namely, if a CD system li: = v(x,u) is such that for any continuous functions x(%), u(z) ( x ( 0 ) = 0,u(0) = 0, z E R") the map (Rn,O) + (R",O), z C ) v ( z ( z ) , u ( z ) has ) a degree x (w.r.t. I = 0) and 1x1 2 k for some integer k if v : z I+ v ( e ( z ) , u ( z ) is ) onto, then for any Ashby relation = v(z,u), e(z), L = w(z), M = { ( x , z ): x = F ( z ) ) with asymptotically stable t = v ( z ) and a differentiable branching covering F, the map F has a degree x with 1x1 2 k . This assertion reduces to the following one. Let the degree w.r.t. z = 0 of the map z C ) w(z), be 1 or -1 and the degree of the map z ++ v'@) = F,(z)w(z)(w.r.t. z = 0) be x ( x # 0). Then the degree of F w.r.t. a = 0 is some d ,I d1 2 k . As examples of such C D systems one can consider the system (1) of the form k c = U:, where xc = 21 ix2, uc = u1+ iuz,i = as well as CD-system kc = u t or, more generally, the systems (&)l = Pp((zc)z,uc), (&)S = Pm((zc)2, uc),where Pp,Pm are series with homogeneous polynomials of degree p resp. m(p 2 1,p + m 2 2) as lowest order terms and the map ( ( z C ) z , u cC)) (~p((~c)~,uc),P,((xc)2,~c)) has degreepm. It is easy to check that both problems for the CD-system kc = ut have a solution in the class of Ashby controls with k branches (the minimal possible number of branches). Namely, the Ashby control can be defined from the equation ut = w(xc), where x = w(s) is asymptotically stable or has a limit cycle 7.
(x
+
3.3. J.-M. Coron proved the following theorem. Let F denote the homology or homotopy functor.
Theorem [14]. For small E > 0 define C, = ( ( a ,U ) : 11(x,u)II < E and v(x,u) # 0). Supposethat 1.1(1)is stabilizable by means of continuous feedback. Then the map F ( C , ) + F@", 0 ) inducedby (x,u) v(z,u) is onto. 3.4. To give an example of a monodromy type obstacle consider the C D system [l31
Shoshitaishvili
434 BC
= 2iec
i, = iz,
+ azcuc,
+ uc,
where T~ = ( e c , z c , u cE) CSand i2 = -1.
Proposition. For any B6 = (7, : + lzc12 = 6) where 0 < 6 the z,) following is true. There is no unbranching continuous feedback uc(ec, such that for any (e:, 2,") E B6 there ercists a unique trajectory &(t,,:e z,?), zc(t,e:, z,") of the system BC
(1)
= 2 i e c + 2zcuc(ec, 4 ,
i = izc
+
uc(ec,
4 ,
such that ec(O,e:, z:) = e:, zc(O,e:, z,?) = z,?, and (2)
&(t,e:, z,")
+ Oast -+
0.
Proof. Consider y, = ec + z,". Then gc = 2iy, by (1).Suppose that the unbranching feedback exists. By uniqueness w.r.t. the initial point, the trajectories of (1) continuously depend on ,:Q z,?, t [15]. From (2), for any -y > 0 there exists T > 0 such that Jec(t, ,:Q z:)I < -y for t 2 T and for any (e:, z:) E &. This means that (3)
lzc(T, , :e z,">
- ( \ / ? m e:, ZC0))jl c l(?),
where l(?) 1 0 is a continuous function, l(0) = 0, ( J ) j ( j = 10772) is one of the two branches of root, and yc(T,,:Q 2,") is time T point on the trajectory of the system LC = 2iyc with initial point g: = & ( z , ? ) ~ , Suppose that j = 1 in (3). Consider v,: = eiZnry,?and = yc,, 0 (z:)~,0 5 T 5 1. It isclear that the point makes a circle if changes from e:,o = e: to @:,l' For any T , O 5 T 2 1, one has (see (3))
+
Control and Stabilization of Dynamical Systems
435
- (d-)2l
= I z c ( T , e:,z:,
W .
Choose y such that 21(y) < [@l. Then (3) and ( 5 ) cannot both hold. This completes the proof. Q.E.D. Just in the same manner one can prove the following 3.5. Proposition. There is nounbranchingcontinuous z , ) such that the system
feedbacks
U,(@,,
e, = (21 - 2x)ec + 2 z c U c ( e c , Zc),
(6)
4 = (i -
+ U,(@,,
zc),
has uniqueness property w.r.t. the initial point and ec(t,,:Q 2): decays like e-kt as t + m((&,2): E Ba).Here k E R and x E R are constants and IC > x . On the other hand, a branching Ashby control such that ec(t,&,2,") + 0 as t + 00 and ec(t,e:, z:) e-kt as t + 00 can be constructed. Indeed, one has (d/dt)(@ - z,) = (i - x ) ( @ - z,) U,, Consider uc = -(i x IC)(@ - z c ) , i.e. N
+
+
(7)
(U,
- (i - x + k)zc)2 - (i - x + k)2yc = 0
For any yc # 0 the equation (7) has two distinct solutions, Consider constants $1, $2 such that 0 < $1 < $2 < r/2, the straight lines I j = {gc E C : =g yc = $ j } and the four closed half planes ~ r j( j = 0,1,2,3): rj = {pc.€ C : $ j 5 argy, I4j r) ( j = 0, l)},~3 = C/rl, ~4 = C/rz. Define four branches uc(zcryc)(j = 0 , 1 , 2 , 3 ) as follows: uc(zc,y c , j ) is defined on rj by ~ , ( z , , y , , j )= (i - x + k)z, (i - x + k ) ~ e d 8 r g ~ c / a , Define cu(ec,zc,j)= j if yc E T j and a ( e c , z c , j ) = j 1mod4 if E 8rjTrajectories of the system (6) U j ( t ) = cu(ec(t),z c ( t ) , j ( t- 0)) me uni= 0; ec(t) quely determined by the initial points yc # 0 as well as for 'I/, eVkt as t tends to infinity.
+
+
+
N
Shoshitaishvili
436
3.6. Consider the CD system
where 41
+ $2
e,, Z c r U,, E
x
> 0 are defined as in
(6), g, = g1 + isz E
c, 4,
=
C,and (u,,gc) are control variables.
Proposition There are no unbmnchingcontinuous feedbacks F = E U3 and t r j ( t , e:,@) + 0 as t + 00. Here, Us is an arbitrary neighborhood of the origin in C3 including the disk { (O,O,@)}, where 0 5 I#,\ 5 3. (gc(ec,zc, 4:)
Proof It is easy to check that M = {(ec,zc,q5,) E U3 : &z," = QC} is an integral submanifold for (8). Choose (e:,&@E) E M ( $ # 0) and consider the loop e& = eiT& z& = z!, $'&/(z:)~ (05 T 5 27r). One has l#&l < (Id1 + 21z:12)/(1z:1)2 = (le:l/lz:I2) + 2. Choose e% such that l ~ ~ l / l< z ~1.1Then ~ the curve (e&, z:,~,$&) (0 5 T 5 1) lies in U3.The function &(t) = e,(t)/zZ(t)tends to zero as t + 00 for a trajectory of (8) by assumption. So
Control and Stabilization of Dynamical Systems
(9)one has Izc,zrr(t)
437
- 2W)(Izc,o(t)l +
- zc,o(t)l > I d a 2- & m 1 l
( t )l)*
Izc,2?ro
Thus zc,zn(t) # zc,o(t). But zc,2n = zc,o by the definition, which is a contradiction. Q.E.D.
3.7. Remarks. 1. All the considerations above are valid for a system (8) where gc G 0. But this case is not interesting because here zc = 0 is an integral submanifold for (8) and the variable ec is uncontrollable if a, = 0. 2. The system (8) is locally asymptotically stabilizable, but not globally. 3.8. The situation which was described in 3.4-3.7 is the particular case of the following one, Consider dynamical systems
k = w1(2),
(1) ( z E Rn,yE
p = v($/),
.i = W(Z,U)
Rn,zE R",u E R"), and a map $ : R" + R" such that
(2)
a,$(z>Vlb) = w(+(z)) (x E R"),
(3)
w(z,O) = 2)1(z).
The condition (2) means that W is lifted by $ to VI. (Recall, that v is lifted by $ to wl if +.w1 = W. The analytical field W is lifted by analytical @ if W is tangent to the image +(C) where C is 'a set of singular points of map $ and is not degenerate [lS, 171.) Consider the following problem of attainability of the standard motion [13,181. It is required to construct a feedback ~ ( yz,, t ) such that
+
for a given k > 0, where y ( t ) = $(z(t)) and c : arbitrary positive function continuous at zero. To do this write (4)
e= -
and consider the system (5)
= G(& z , U ) ,
f = W(Z,U),
R1 x R'
+ R1 is an
438
Shoshitaishvili
where C(@,%,U ) = v(y)(u=p+$(z)- t,b,v(z, U ) . We have to find U ( @ ,z ) such that
e(t) e-kt as t
(6)
+ 00
for trajectories of (5)lu=u(p,z). If t,b : R" + R" is a branching covering with a nontrivial action of the homotopy group of the base on the fiber, then thereasonings of subsections 3.1-3.5 show that there is no unbranching feedback solving (4)-(6), but there exists an Ashby feedback which gives a solution of (4)-(6) and has the branching type of t,b.In particular, the map $ can be chosen in such a way that degq = 1, in contrast to the maps in subsection 3.2 and in Coron's Theorem 3.3. For example, t,b : ( ~ 1 ~ c) ~ (yI,yz), 2 ) y1 = z;+qz2, y2 = 2 2 , v = 3 Y 1 8 y , + 2 ~ 2 v18 =~xlOxl ~ ~ 2 2 ~ or 8 v~ =~2y;$, - 9y18,, , v1 = (32: 222)8xl - 9(2? 2 1 2 2 ) 8 x , .
+
+
+
Remark. The considerations above show that there are no unbranching asymptotic observers f = v(z, U ( % ,y, t ) )for systems 2 = v1 (x)with outputs y = $(x) where )t : Rn + R" is a branching covering with a nontrivial action of the homotopy group of the base on the fiber.
4. Stabilization of Systems with Low Codimensional
Degenerations of Controllability This section contains, in particular, the proofs of some theorems of [6]. Suppose z = ( 2 1 , ~ )E R2, U E R1 and suppose the pair A, C is non controllable, i.e., the rank of the map R2 X R2 c) R2, (v1,v2) c) Cv1+ACv2, is less than 2. Suppose it is 1. Then 1.1(1) can be transformed by a linear transformation ~ ' ( 2to ) 21 = az; 0(11z',u1/~), 2; = + T ~+ U b(z1, S;,U ) . Suppressing the primes for simplicity and redenote U to be 7-222 q u #(XI, 2 2 , U ) one obtains
+
+
(1)
21
+
+ 611x2~+ kzu2 + 22 = = bzoozf + + b l o l ~ +l ~o(ll~,~11~); and a,
= UZI
where b3 constants.
+
6202;
611021~2
63(21,~2,~),
U,
b i j , biik
are
Control and Stabilization of Dynamical Systems
439
Assume that a 2 0 (otherwise the system ( 1 ) is obviously stabilizable) and b is Cm-smooth (though for most of the following results C3- or C4smoothness is sufficient). Remark. In Taylor series expansions we shall denote by bij the coefficient of xiuj, by bijk - that of xi&uk, and by dij and rij-that of zi ,j 1 2-
4.1. Theorem. Suppose
(2)
a2b&
+ 2abozbll + 4bozbzo > 0
and a > 0. Then stabilization of the origin is impossible, that is, there exist constants c > 0, S > 0 and initial values x:, x: (which can be chosen arbitrarily close to x1 = 0 , x2 = 0 ) such that for any (possibly, discontinuous) control u(t) (t 2 0 ) whose absolute value is bounded by c, the trajectory of the system ( l ) l u = u ~ t ~ with initial values x!, x! leaves the S-neighbourhood of x1 = 0, x2 = 0.
+
Proof. Consider xi = x1 p x i , where p is some constqnt. Then from the system (1) one has x' = ax: (bzo - a p ) x i (bzr 2p)xzu 42u2 + % ( X ~ , X ~ , U )If. for some p the quadratic form Q = (b20 - a p ) x i + (bll + 2p)xZu + bo2u2 is of fixed sign then the theorem is true. Indeed, let for definiteness Q > 0. Then there exist c 2 0 and a neighborhood 'U of the origin in R2 such that ax' + Q(x2,U ) b3(x1,x2, U ) > 0 for xi > 0,U 5 c, ( x i ,x2) E V . So the xi-component of the trajectory with initial condition (xio,x:) E V will be increasing ~ t 8long as (x:(t),xz(t))stays within V , For Q to be of fixed sign for some p , it is necessary and sufficient that the inequality b2obo2 - apboz - (1/4)bT1- blip - p 2 > 0 has a solution, and ( 2 ) is precisely the condition of that solvability. Q.E.D.
+
+
+
+
+
4.2. Theorem. Let
(3)
+ abozbll + 2b02b20 < 0
and bo2 # 0. Then it is possible to stabilize the equilibrium x = 0 of the system (1) by means of an Ashby feedback which satisfies u3 + $2(x)u2 +
440
Shoshitaishvili
#1(z)u+#o(z)
= 0, where the #d are diflerentiablefunctione with
=0
#i(O)
(i = 1,2,3). Proof. Consider an auxiliary dynamical system i = Dz + d(z) (where z E R, D is a Hurwitzian matrix, d(z) = o ( z ) ) , some function e(z) (e(0) = 0) and anAshby relation ((l), e(z), t = Dz+d(z),M = ((I, z ) : I = F ( z ) ) (it exists by Theorem 2.3.1).Suppose &e(O) # 0 and the pair (B,e(O), D) is observable, i.e., the vectors &e(O),&e(O)D are linearly independent. One has &Fz(z) x (Dz+ d ( z ) ) = @(z)from the condition for M to be (i = 1,2) are the components of the map F) an integral manifold (Fi(z) and it is easy to check that det(~,e(O),&F(O))# 0. Rename F2 and e as 22, z1 respectively. Now it is possible to write the Ashby relation in the form of the system (4)
h1
+ bozzf + b2002C: + + bllozlzz + ho151z1 + b3oZ; + + + bas$ + P(ZI,ZI,
= ax1
+b204 +
bllZ2Zl
bllos:
b21421
x2
(5)
b12~2~:
~ 2 ) ,
= 21,
.i.1 = dlozl
+ dolzz + d 2 0 4 + d11~1zz+
+ A(%),
ti = Z l , and an integral manifold (6)
51
2
= a l O ~ l ~ O-kl ~a2021 2
+
+ + aO2z; + a30z; + a0.32: + = z2. allZl.%Z
a21%:%2 +'alZzlz;
x(zl,zZ),z2
Here
+ 1z113+ k 2 l 3 + Izlllzll + I X ~ I I Z ~ I ) , ~ C ( Z ~ , Z= ~ '0(lz1l3 ) + 1z213). = 0(l.112
Consider the condition of being an integral manifold:
(7)
+
& , s l ( z ) f l ( ~ ) &z1(z)&(z)
=h(Il(Z),%),
where zl(z), i l ( z ) , t ~ ( z )Sl(z1,z) , are the left-hand sides of (6), (5), (4). 'If a20 # 0 then the map F : z I+ 2 is not a covering of the 2-space. So we should look for a system ( 5 ) such that a20 = 0. Equating the coefficients
of Dynamical Systems
Stabilization and Control
441
Rom ( z l ) , (qz2), (22)one obtains dl0 = a+bll/boz - 2a02/boz, dol = a(aoz/boz) b2o/bo2. As dl0 < 0,dol < 0 (the matrix D is Hurwitzian) we get adlo 2d01 < 0,i.e., a2 abll/bo2 2b2o/b02 < 0.Multiplying by bi2 yields (3). If (3) is satisfied then it is possible to choose a constant ai;2 such that die, d& is a solution of (z:), ( Z I Z ~ (22) ) , and dio < 0,'d& < 0. Choose an arbitrary azo # 0. Then there exist a$l (see (z!)), ai3 (see (zlzj)), d;j2(see (z;), where all # 0 because all = according to ( z : ) and bo2 # 0 by the hypotheses) which solve ( z ! ) ,( z l z i ) , (zfz2) m d (z,5)-equations. Thus under the hypotheses of Theorem 4.1 an Ashby relation can be constructes such that
+
. ,
..
+
+
,
,. . . . . . . . . . _ x . , , _ .......,
+
.. _ . , . ...........,.,
.
,.
.
....L
I . . . . .
.
.
.
, . .
.
. I . .
.l..l..
442
Shoshitaishvili
for some functions $i (i = 1,2,3)with $i(O, 0) = 0. The theorem is proved because u = z1. Q.E.D. 4.3. Definition. A jet of order k of the system (l)at the point a: = 0, u = 0 is called sufficient if all germs of
CD systems which have the same
k-jet can be asymptotically stabilized by some Ashbyrelation or none can be asymptotically stabilized by an Ashby relation. Note that a sufficient k-jet of a CD system which can be stabilized determines a type (or a set of types) of maps F : z I+ a: that give integral manifolds M = { ( I C , &:)IC = F ( z ) } in the Ashby relations stabilizing (1). Recall that the typeof the map F is by definition the equivalence class (orbit) of F w.r.t. the action of (local) diffeomorphisms $’(IC)and z’(z). 4.4. Proposition. Suppose in (1) one has bo2 = 0, ab11
and either a
+ 2b20 # 0
> 0 or bllo = blol = 0. Then the 2-jet of (1) is not suficient.
Proof. We show that under the assumptions of the proposition the system
(10)
5 = a21
+ b201~i+ b111~2u+ blloa:1a:2 + blola::1u,
&Z
=U
+
is not stabilizable. Consider $ = a:1 - (b11/2)z$ One has $ = a’$ b’a:;, where a‘ = a+blloz2+blolu, b’ = (a’bl1+2b20/2. For definiteness suppose (2bzo + b l l a ) / 2 is positive. Then for sufficiently small a:z,u, and initial $0 > 0 the function 4 will be increasing and, hence, the system (10) is not stabilizable. For 2b20 + blla < 0 one should consider initial $0 < 0. On the other hand, for all systems (1) which have a 2-jet as in (10) and a nondegenerate 3-jet there exists an asymptotically stabilizing Ashby relation (see Theorem 4.4 below). Q.E.D, 4.4. Theorem. I n the space J 3 of 3-jets of the systems (1) there exists
an algebraic hypersurface C = J 3 such that if bo2 = 0 and the 3-jet of (1) does not belong to C then there exist constants 61, 62, d l < 0 , d2 < 0 and an Ashby relation ((l),e = z i &2,2 822122, 21 = z2, 22 = dlz1 d222, M = (a:, z ) : 5 2 = 2122,2 1 = $(21,z2))) such that the map F : ( 2 1 , z2) c) ( z l z 2 , $ ( z l , z z ) ) can be transformed by a change of variable to the fown (21,22) (z122,21“- z,”).
+
+
+
Control and Stabilization of Dynamical Systems
443
Hencefor any S near the origin there exist two preimages and the Ashby control U = @ ( F - l ( z ) ) makes the system (1) asymptotically stable. The submanifold M' = ( ( 5 , ~ ): @(s,u)= 0) in the (z,u)-space to which (z,U = @(F-' (S))) belongs is not smooth at the origin and has a singularity. 4.5. By means of Ashby relations it is possible to develop an approach to stabilization problems which is similar to the classification approach in singularity theory. Namely, in the space J k of jets of systems (1) at the origin with a fixed order k three subsets axe defined: the subset CYk of the sufficient ubstabilizable jets, the subset p k of insufficient jets (for any jet in P k there exist both stabilizable and unstabilizable systems (1)with this jet), and thesubset yk of sufficient stabilizable jets. The subset r k is stratified by means of the type of Ashby relations. Two Ashby relations have the same type if there exists a change of variable z'(z), z'(z) which transforms the objects involved in one Ashby relation to thecorresponding objects for the other relation. The 'yk is also stratified by means of the projection to J k of equivalenceclasses w.r.t. various other equivalence relation in the space of Ashby relations (for example, equivalence w.r.t. the type of F). Such stratifications can be considered as classes of new equivalence relations in the space of controlled systems. Let T j , k : J j -+ J k be the natural jet projection. One has n i : l , k ( y k ) C T k + l , r ; ; l , k ( a k ) C ak+l, r;:l,k(pk) n'yk+l
# O , r i i l , k ( P k ) n a k + l # 0, r ; : I , h ( p h ) n p k + l
The investigation of systems (1) whose k-jet are in
pk
# 0.
involves jets from
fli2-:l,k(h)a
The results of items 4.1-4.4 are summarized in Table 1. Note that if j E s2 or j E s5 then it is impossible to stabilize the system (1) by means
of an Ashby control with the corresponding branching type. If in these cases stabilization is possible it is necessary to use an Ashby control with more complicated branching or with a higher Holder exponent at zero.
Note that the greater the codimension of degeneration (by definition it is the codimension in J k of the stratum7' of y k such that j'((CD(1)) E 7') of the CD system (l),the greater the degeneration of observability for the
444
Shoshitaishvili
system f = W ( Z ) under the output e(%).The latter degeneration is measured by the codimension of the corresponding stratum in the space of pairs of jets e, W. This connection between degeneration of CD controllability and degeneration of observability of the pair e, W expresses the duality of the QBC-problem and the problem of identification [6], Table 1 rhe order of jet
Kind of sufficiency sufficient, unstabilizable suff., stab.
Branching
insuff. insuff.
suff. stab.
4.6. Consider a CD system
where ( z 1 , 2 2 ) E R 2 ,U E R", Q ( 2 2 , ~ is) a quadratic form w.r.t. 2 2 , U , and @ ( U ) , b 3 ( 2 , U ) are sufficiently smooth functions with &$1(0) # 0. Denote by J& the space of 3-jet j & at the origin of systems (l"). There exists a semialgebraic subset C , C J;j, of codimension 1 such that the following holds. Below, "almost every" means "outside a surface in the space of 2-jets".
445
of Dynamical Systems
Stabilization and Control
Theorem. 1. If m 2 2 and j m E C , then for almost every 2 = D r (with Hurwitzian D ) there exists an Ashby relation ((lm),
(11)
+ d ( z ) , el(%)
e = (el(%),e 2 ( ~ ) , .- . e m ( z ) ) , 1
+
2 = DZ d(z),
M = {(x,Z ) : x = F ( s ) )
such that F has the type of the tuck x1 = z; + Z ~ Z Z x2 , =12. 2. I f m 2 3 and j m€Crnthen for almost any2 = D z + d ( z ) , e l ( z )ea(%) , (with Hurwitzian D ) there exists an Ashby relation (11) such that F has x 2 = z2, which is a homeomorphism. thetype of themap x1 = 2; , can be stabilized bymeans of an Thus the system ( l m ) with j m E C unbranching feedback which is diflerentiable at any x # 0 and is Holder with exponent 1/3.
+
4.7. The previous results can be generalized to smooth CD systems 5 = Ax Bu b ( x , u ) (x E R,u E R ) , where the pair A , B has some uncontrollable modes. By means of a linear transformation s'(s) we bring the C D system to the form
+
(12)
+
$1
=alzl+
$2
= a2x2
k3
= ~ 1 3 x 3 BU
+
Q ( z 2 , ~ 3U ),
+bl(z,
U),
b 2 ( z , U),
+ +b 3 ( 2 , U ) ,
+ +
where x1 E R k l , 2 2 E Rka, x 3 E Rks, kl kz k 3 = n, and a l , a z , a s , B axe ICl x kl, k 2 x kz, ks x kg, m x k 3 matrices. The eigenvalues of a1 have nonnegative real parts, the matrix a2 is Hurwitsian, the pair a3, B is controllable, Q ( x z , x 3 , U ) is a vector of quadratic forms depending on and b l ( z , U) = o(llmII + / 1 ~ 1 1 2 + 11412), b d z , 4 = o(ll4l + 1 1 4 1 > (i = 2,3). Fix the matrices a l , a2, a 3 , B , and consider all systems (12)with those matrices. 22,23,U
Theorem. In the space of quadratic vector forms Q ( Z ~ , Z ~ , Uthere ) are two open semialgebraic subsets 71,7 2 and a semialgebraic subset 79 of
446
Shoshitaishvili
codimension 1 (clearly, the explicit form of yi depends o n a l , a2, a3, B , but not on b2, b 3 ) such that (i) if Q E
71, then
the system (12) is not stabilizable (even by means of an open loop control u(t,zo)); (ii) the subset y3 is a subset of insuficient jets; (iii) if Q E 7 2 then the system (12) can be asymptotically stabilized by means of theAshbycontrol. If dims1 = 1 thenthebranchingtype is z3 q51(z)u q52(2) = 0. If dim21 2 2 thenthebranchingtypeis given by M and e, where e = L I Z L Z X Z L323, M = { ( x , z ) E R” x R” 2 1 = ~ ~ 2 2 2a323 ~ ( x xz ~, , z ) 4 ( ~ 2 , 5 3 ,z)}, Li : Ri -+ R” (i = 1,2,3)is linear, cq, a2,a3 are matrices, q is a vector of quadratic forms depending o n 5 2 , 2 3 , z and such that x c) q(O,O, z ) is a branching (~+ 1 1 ~ 1 1 ~ ) .Hence the branching covering, and q5(22,23,z)= O ( ( ~ X Z ~ (123112 of U is quadratic.
+
+
+
+
+
+
+
+
Rename 2 1 8 2 2 , a1 8 a2 as 2 1 , a l , respectively, and rewrite the system (12) as
+
+
where Q‘ is quadratic, b l ( x ,U ) = O(llz1ll 1 1 ~ 1 1 ~ 1 1 ~ 1 1 ~ ) .Then the above theorem is true (with other y1,yz and dim22 = 0). 4.8. Theorem. I n the space of quadratic forms depending o n 23 and
there exists a semialgebraic open subset 74 such that if Q’ E 7 4 then the system (12’)can be asymptotically stabilized by means of an Ashby control with branching type given by e ( z ) , M , where e(%)is a vector with components homogeneous of degree 2, M = ( ( 2 , ~: )x1 = F l ( z ) , x 3 = F3(z)}, F1(z) = Fi4’(z) Fi5’(z), F3(z) = Fj2’(z) + F3’3’(z),F.y’(z), F3’2’(z) of degree 4 and 2 respectively, Fi5)(z)= are vectorswithcomponents 0(1)z11~),F J 3 ’ ( z )= 0(11z)1~),andthemaps z t) (Fi4’(z),F3(2)(z)), and z H (F1( z ), F3 ( z ) ) are branching covering.
U
+
Note that y4 n y2 # 0 and y4 n y3 # 0.
Control and Stabilization of Dynamical Systems
447
4.9. Theorems 4.7, 4.8 can be proved as Theorems 4.1-4.4 by means
of the following lemma. Consider the germ of a system of ordinary differential equations k = Dz d(z), where D is Hurwitzian with spectrum lying to the left of the spectrum of the matrix a1 (see (12)) and separatedfrom it by some strip S parallel to the imaginary axis. Suppose the spectrum of a3 lies to the left of S (otherwise we can use the transformation U' = Lz3 U for a suitable L because the pair a3, B is controllable). Consider the germ at the origin of a vector-valued function e(%)= e("(z)(z E R",~ ( 0 = ) 0, e E am)where has components homogeneous of degree i, and e(>')(z) = o(11z11'). Suppose the strip S is sufficiently wide for the manifold M in the LAshby relation {(12), e(%),k = Dz d ( z ) } , M = ((x,z ) € R" X R", = F ~ ( z ) ,=zF3(z)} ~ to be Ci-smooth (see Theorem 2.3.1).
+
+
.:
+
+
Lemma. The functions F1 and F3 have the form F3 = Fii)(z) F3(>')(z),Fl(z) = F,(ai)(z) F . > 2 " ( z ) where , F3(')(z)has components homogeneous of degree i and depends only on as, B , e(i);F,(2i)has components homogeneous of degree 2i and depends only on al,B , e(i),Q;F,(>2i)= O(llz))2i), = O(llzlli).
+
Proof. From the hypothesis it follows that there exists an integral manifold 21 = c $ ( ~z,) of the system
51= a m + & ( S , e(%))+ h ( s ,e ( z > ) , (13)
+
53 = a323 + Be(%) b3(q e(%)), k = D z
+ d(z).
It is clear that the manifold EI = c$(F3(z),z ) , 1c3 = F3(z) is also integral for (13). By Theorem 2.3.1 the i-jets of F1 and F3 are uniquely determined so 4(F3(z),z ) and F1 (2) coincide up to i-thorder terms. Since z1 = c$(z3, z ) is an integral manifold, c$rs53 & =ia19 &(SEI, e) bl(c$,z3,e(z)). As e begins with i-th order terms, the (i - 1)-jet of c$(23, z ) coincides with the (i - 1)-jet of #'(Q,z ) which gives an integral manifold of the system (13) with e E 0. This means that 9(a3,z ) = c$(53,0) 4 ( 2 4 ( z 3 , z ) .As c$(z3,O)begins with second order terms and P S ( % with ) l-th order terms (for some l ) , F l ( z ) = c$(F3(z),z)begins with terms of
+
+
+
+
Shoshitaishvili
448
order min(21,2i). Terms of b3(1, &)) which depend on z1 are of the second order. So they do not affect the lowest terms of F 3 ( z ) because z1 as the function on z are of the order min(2l, 2i). Hence the lowest terms of F3 are found from the condition of integrability for 1 3 = FS(z)to be an integral manifold for the system x 3 = a313 + &i)(z), i = Dz,and 1 is equal i. Q.E.D. 4.10. We proceed with our discussion of stabilization of the CD system
(1)in the plane. Let the hypotheses of Theorem 4.2 be satisfied. Then ,as it is cleaf fromthe proof of Theorem 4.2, it is possibleto stabilize the system (1) by means of an Ashby relation with 22 = z1, i 1 = dlozl +dolz2 +d(z), where dl0 and dol are arbitrary negative constants such that
Theorem. Under the hypotheses of Theorem 4.2, f o r a n y k from the interval 0 < k 5 ( d m a ) / 2 there exists an Ashby feedback which asymptotically stabilizes a CD system (1) in such a way that x ( t ) e-at as t 00, andthenumber of switchings will be 0 or 1. The number 'of control branches is 4
+
N
.
Proof. Let dol and dl0 be as in (14) and such that the system 22 = z1,& = dlozl + dol22 has two distinct real eigenvalues. Then d &4J + (1/2)(1- adlo) > 0,i.e.,
Thus z ( t ) and, hence, x(t) decay like e-kt where -k = minReXi(i = 1,2), X: Aid10 - dol = 0. By (14), -IC = d10/2 .\/d:,/4 (1/2)(1 adlo). Under the conditions 1 < 0, a 2 0, and (15) this expression is increasing in dl0 and has a minimum - ( a d m ) / 2 at dlo = a - d-, Consider the subset C = { z : & F = 0) of the set of critical points of themap F : ( 2 1 ,z2) I+ ( F l ( z ) , z z ) where the graph of F gives the integral manifold M . The equation of the tangent line to C at the origin
+
-
+
+
-
Control and Stabilizationof Dynamical Systems
is allz2 = 0, i.e., z2 = 0 because all = bo2
w(z) = (22 = Z l , i l = d 1 o a
449
# 0; The trajectories of
+
do112
+d(z)}
are transversal to C at any point except z = 0. Indeed, dyz2 = z1 and this is zero for E E C iff z = 0. Suppose dl0 < 0, dol < 0 are such that the system i = w(z) is a node, i.e., it has two separatrices (or one) tangent to the eigenvectors of B,w(O) which divide the plane into four (or two) quadrants. The trajectories of i = W(.) cannot escape from any of these quadrants. The set C \ 0 consists of two components situated cross-wise. Any trajectory of i = W(%)either intersects transversely one of the components of C \ 0 and then remains in the corresponding quadrant or does not intersect C \ 0 and lies in the quadrant from the very beginning. Theorem 4.10 is proved.
4.11. Let us describe the function Q which gives passages from one branch to another for the Ashby control from Theorem 4.10. By (8) the integral manifold is the graph 1c1 = & ( a ) , z2 = z2, where F l ( z ) = a& allzlza x ( z 1 , z z ) , x ( z 1 , z z ) is a linear combination of z:z2, 21~22,z j and o ( l z 1 1 ~ 1 ~ 2 1 ~ Because ). all = bo2 # 0, a30 # 0, after a change of variable d(s),z'(z) the manifold M can be represented in the form of the standard tuck zi = zi3 +zizi, = zh. In the new coordinates, C is a parabola 3zP zi = 0, F ( C ) = {(zi,z;): $/4 + zas/27 = 0 } , J"l(C) = PlOP2, P1 = C , P2 = { ( ~ i ,:~ (3/2)zi2 i ) + Z Z = 0). AS P1 a d P2 have a common tangent at the origin, the vector field w(z) is transversal = a11.z~ o(lz212), it is to P2 \ 0 as well as to C \ 0. As BWF1(a)l~,(,)=O easy to check that therestriction of W to (P1\ 0) n (P2\ 0) is homotopic to the restriction to (P1 \ 0) t l (P2\ 0) of the field .ii = -zh, k i = z1 or of the field i: = zh, 26 = -zi. The first caae occurs if 030011 < 0, and the second if a30411 C 0. As a30 can be chosen to be any constant (by selecting d(z) in Dz + d ( z ) ) one can aasume that 020811 > 0. The multi-valued map J " l : d C ) z' can be represented in the form of four continuous maps (branches) fj(z) = ( z 1 ( z , j ) , z z ( z , j )()j = 1,.. ,4). The domain Mi of the i-thbranch is defined to be the set of points.(xl, 2 2 )
+ +
+
+
+
.
Shoshitaishvili
450
for which the following inequalities are satisfied: MI H
Mz
M4
M3
5 0 50 5 0
20
+
where H = xi2/4 2 2 / 2 . The range mi of the 6thbranch is definedto be the set of points (z1, z2) for which the following inequalities are satisfied:
where 71 = 32i2/2
+ z i , 72 = 32i2 + 2;.
The domain of the function cy is A = {(I', j ) : I' E Mj (IE R2,j= ( E ' , j ) E A and suppose E is an interior point of M j . Then cy(z', j ) = j . If ( E ' , j ) E A and E' E aMj then (as a3oall < O)a(z',j ) is defined from the table 1,.. . ,4)). Let
4.12. Let now the system (1) be such that
+
+
b30 # 0 or b l l # 0 or bo3 ab12 # 0. Let b 4 ( z , U ) = alqh(z,U ) c$~(I,u), r#~(0,0)= 0 , ~ ~ ( I z ,=u o((121~ ) 1 ~ 1 ~ )Then . for any k > 0 the
where
+
equilibrium of the system (16) is stabilized by means of an Ashby control such that asymptotically the trajectories converge to theorigin as e-Pt for some constant p IC. The number of switchings is not greater than one. The number of branches is 8. (Note, that 8 is not minimal number of the branches.)
>
Control and Stabilization of Dynamical Systems
451
5. Imitation, Optimization, and Homeostazing Approaches and Their Relation Usually a control problem is approaching in several ways, such as imitation, Optimization, and Homeostazing. These approaches manifest themselves as goals of control; for example, the optimization approach sets the goal to optimizing some cost function. Among those approaches the most universal ones stand out. Universality should be understood as a set of control problems which can be formulated within the given approach. Universality of the approaches leads to relations between them. They show up in the intersection of the corresponding sets or in the possibility of restating a control problem formulated within one approach in terms of an other Control theory is a cybernetic science. Cybernetic analysis reveals motives and peculiarities of this mathematical discipline. The fundamental problem of such analysis is to find universal approaches to control and to investigate their relationship. What follows is an attempt at making a further step in the study of relationships between imitation, optimization and homeostazing approaches.
5.1. Imitation means organization of the system's behavior in such a way that it becomes similar to the behavior of some given system. One of possible formalization of the imitation approach is the Qualitative Behavior Control Problem (see Section 1). For the optimization approach, it is enough to say here that one of the possible formalizations of it is the Optimal Control Problem. Homeostazing means organizing the behavior of interacting systems so that they can coexist. The universality of the homeostazing principle was first realized by Canon in [19].He introduced the term homeostasis to denote the fact of coexistence of systems or of a system and its environment. Ashby [20]formalized the notion of homeostasis. Ashby's formulation restated in modern terms is something very similar to thedefinition of Ashby
452
Shoshitaishvili
relation (see Section 2). It will be useful to have a definition of Ashby relation in the more general form (5 = 211(z,z , 4 ,
(1) 2
e&,
4 ,
e2(z,z),
= 212(z, % , g ) , M = ( ( 5 , z ) : @(z,z) = O)),
where M is an integral manifold of the system 5 = 211
(5, z,el(z,z ) ) ,i
=
212(z,z,e2(51z)).
The vector @(z,z ) is the vector of significant variables in Ashby’sterms. The equation @(x, z) = 0 is the condition of existence (or coexistence; very often this is just the same)of the z-system 5 = v1(z,z , U) and thez-system 2 = 212(z, z , g ) . The homeostazing approach means here to find el, e2 such that the relation @(z,z ) = 0 gives an integral manifold, and to suggest a tool which forces the states of the z, z systems belong to M. There are many versions of the homeostazing problem. They depend on requirements on the Ashby relation (see [e], where the Ashby relation is applied to various identification problems). For example, @(lc,z)can be a concrete vector-valued function (a first version) or any vector-valued function with nice good properties (an other version). In particular, for the QBC-problem it is enough for 0 to determine a branching covering 7T
: ( 2 , z ) c) z((z, z ) E M)).
Each of the considered approaches generates the corresponding formalism incontrol theory. The question about theuniversality of the principles reduces to the possibility of a passing from one formalism to an other.
Example. The stabilization problem can be formulated &S the QBCproblem (the imitation principle) and the latterreduces to theconstruction of an Ashby relation (the homeostazing principle).
Example,. Consider a local Ashby relation (1) such that 211, el, e2, @ are zero at z = 0, z = 0, U = 0. If @(S,%) is nondegenerate w.r.t. x and z , i.e. of &@(O,O) and 8,@(0,0) are invertible, then the zsystem is qualitatively equivalent to the z-system. Indeed, in this case the relation @ ( s , z ) = 0 can be writtenboth in the form z = f(z) and z = q5(z). The set of trajectories ( z ( t ) , z ( t ) E ) M of the system 5 = q ( 2 ,z , g l ( a , z ) ) ,z = V ~ ( Z , Zez(z,z)) , coincides with that for the 212,
Stabilization and Control
of Dynamical Systems
453
5.2. Consider the interaction of homeostazing and optimization ap-
proaches by examples of optimal synthesis in which integral manifolds of corresponding Hamiltonian systems are used. Stable integral manifolds of Hamiltonian system were used for optimal synthesis in the linear-quadratic problem (211, [22], [23] and its nonlinear perturbations [24], [25], [26]. 5.3. Consider the nonlinear controlled system and the cost function j! = v(2,U ) , (zE R*,U
E R"),v(0,O) = 0,
T
1
&?(W)= f(z(t),U(t))dt, 0
where v and f are sufficiently smooth and the process W = ( z ( t ) , ~ ( t ) ) (0 < t < 6 ( ~ ) , 6 ( w5 ) 00) is admissible. This means that z(0) = 20 for a given 20,u(t) (0 < t < S(w),b(w) 5 m) is a measurable function, s(t) is absolutely continuous (0 < t < 6(w),d(w) 5 m), limt++) z(t) = 0, j!(t)= v(x(t),u(t))a.e. on 0 < t < 6(w), and theintegral &(W) exists for all T < 6(w). We now study the problem with infinite horizon [27l.
Definition. The process WO is called Q-optimal (Q-suboptimal) for some set Q of admissible processes in the sense of infinite horizon if for any admissible process W E Q (and for any positive E ) there exist continuous belonging to some functions lo : [0,m] + [0,~ ( w o ) 1,] , : [0,00]+ [O,~(W)], class ll of functions, and the number 3 E [0,m] which depends on W (and E ) , such that
454
Shoshitaishvili
Further we assume that is the class of all admissible processes which are defined on [0,03]and II consists of the identical map, or that II is the class of all continuous functions and P! is the class of all admissible processes suchthat l i m F ~ ( w ) ( 3 T 6(w)) exists. In these cases suboptimal processes obey the Pontryagin Maximum Principle. p* Consider the function @ ( x , p U, ) = p*v(x, U ) - f (2,U),where p E P, W = plvl . .+pnwn, and thefunction H ( z , p ) = max, @ ( x , pU). , Suppose there exist a C2-smoothfunction g : ( x ,p ) I+ U such that @ ( xp, , g(z,p))= H ( x ,p ) . Consider the Hamiltonian system
+.
k = dpH(x,p),
(3)
p = -a,x(x,p).
Let H(0,O) = 0, B,H(O, 0) = 0,&H(O, 0) = 0 and suppose the equilibrium x = 0,p = 0 is hyperbolic, i.e. in some neighborhood 0 C R" x R" of the origin x = 0,p = 0 the flow of the system (3) is topologically conjugated to the flow of the standard saddle. Denote by L a positive-time integral submanifold for the system (3), such that thepoint x = 0,p = 0 belongs to L and is asymptotically stable equilibrium for the restriction of (3) to L. Let the neighborhood 0 be so small that ( 3 ) ( is ~ globally stable. Consider ?r : L" + R, (.,p) H x . Let L be the graph z = @ ( p ) of some map 0.Consider a normal suboptimal process, i.e., such that the corresponding Pontryagin extremal is normal (for definition of normal extremal see Section 5.5).
Theorem. If there exist normal suboptimal processes for x E 0 then they have the form ( x ( t ) , g ( x ( t ) , p ( t ) ) where ), x ( t ) , p ( t ) is a trajectory of the system (3) which lies on L. I f there exists an optimal synthesis then it can be done by means of an Ashby control with Ashby relation {x = &mw4,g(.,P),2j = -&H(.,P),L). 5.4. Definition. A matrix A is called anti-Hurwitzian if - A is Hur-
witzian. Consider an optimal problem (2) in someneighborhood 0 = R" x R" of 5 = 0, U = 0. Assume that f is a C3-smooth function with Vaf(0,O) = 0,
Control and Stabilizationof Dynamical Systems
455
and
(4)
~ ( zU,) = Ax
+ a(z,
U),
where A is Hurwitzian or anti-Hurwitzian, and a is a C3-smooth second order term (a(0,O) = 0,Vf(0,O) = 0). Let the function g ( z , p ) and the Hamiltonian system (3) be given. Proposition (see [25]). The Hamiltonian system (3) has a hyperbolic equilibrium x = 0 , p = 0, and all conclusions of Theorem 5.3 are valid. 5.5. In [28] wide class of optimal synthesis problem is reduced to the
construction of a (not necessarily stable) integral Lagrangian manifold, for the corresponding Hamiltonian system, we consider a particular case of the result. d~ = min, S = g(z,u), Consider the problem 4 = $igo(x(T),u(T)) z(0) = 20, z(t) = 21;x E S C R",U E U C R" (S is a submanifold, U is an arbitrary subset inR', all objects are sufficiently smooth). Recall that a Pontryagin normal extremal is a triple ( p ( ~ )z,( T ) ,U(.)) (0 5 T 5 t ) such that for any T one has U ( T ) E L,p(T)g(z(T),u(T)) -~O(Z(T),U(T)) = H ( p ( T ) ,X ( T ) , U ( T ) ) , where H ( P , Z , U) = m&€u P 9 k , U ) - 9O(Z, U ) , and
Let L C T'S (here T * S is the cotangent bundle) be a simply connected Lagrangian submanifold and suppose that
Assertion [28]. E v e y normal Pontyagin extrema1 p ( ~ ) z, ( T ) , U ( T ) suchthat ( ~ ( T ) , z ( T ) E ) L (0 5 T 5 t ) minimize the functional 4 an the class of all admissibletrajectories in D (in theusual sense,not in the sense of infinite horizon) with the given boundary conditions. If u(r) = g ( p ( ~ )~, ( 7 )for ) some function g , then the optimal synthesis canbe realized by a feedback U = g ( ~ - ~ ( x ) , x ) .
456
Shoshitaishvili 5.6. Applying the approach of [28] one can obtain the following exten-
sions of the results of [24], [25], [26]. Suppose that for the system (3) there exists a positive-time integral manifold L such that ( 3 ) l ~has a global asymptotically stable equilibrium s=o,p=o. Theorem. Supp0s.e that for any x E .rr(L),.-l(.) is path-connected. Then for any x0 E ?r(L)there exists a @-suboptimalprocess zo(t),uo(t), where @ is the set of admissible processes x ( t ) , u ( t ) such that x ( t ) belongs to .rr(L), and it has the form ( z o ( t ) , g ( z o ( t ) , p o ( twhere ) ) ) (xO(t),po(t)) is the trajectory of ( 3 ) with any po(0)such that initial condition ( z ~ , p ~ ( O ) ) E
L. 5.7. Corollary. Undertheassumption
of Theorem 5.3, supposethe
manifold L is thegraph p = F ( x ) of somemap F . Then thereexists anoptimalsynthesisand it can be done by means of anAshbycontrol with Ashby relation {k = $H(z,p),g(a,F(z)),p= -BsH(x,p),L}. Thus the system ( 4 ) with A Hurwitzian the above mentioned optimal synthesis exists and is given b y U = g(x,& O ( x ) ) for some function 9 such that Os@ = F(z). 5.8. Consider the system (l),cost function (2) and processes which meet all conditions on an admissible process except z ( t ) + 0, ( t + d(w)). The last condition is replaced by x ( t ) + x0 = x ( 0 ) ( t + b(z0)).
Definition. These processes are called loop processes. Consider a family 9 of cost functions fr(x,u) = ?@(U) - r$(x) depending on r, where #(). and @ ( U ) are smooth functions. In formula (2) assume f = fr. Let 0 be a neighborhood of x0 and let V! be the set of all loop processes such that z ( t ) lies in 0. Definition A. The local infinite horizon @-normat the point x0 (briefly, li9xo) is the infimum of positive numbers r such that there exists a
and Control
Stabilization of Dynamical Systems
457
neighborhood 0 in which for a Q-suboptimal process one has T
"5 f
u(t))dt2 0
0
if T is sufficiently large.
Definition B.If, in Definition A, Q is the set of all loop processes W such that z ( t ) lies in 0 and lirnt+, ;1 f y ( x ( t ) , u ( t )d)t exists and d(w) < 00, then l? is called the local finite horizon @-normat the point x0 (briefly,
I f @xo). In formula (2) assume
5 = Ax
(5)
+ Bu + a ( z , u ) ,
where A is an n x n-matrix, B is an n x m-matrix, a(x,U ) denotes higher order term. Let (6)
w40(4
= lIC4I2,
+(U)
for some matrix C and Euclidean norms
= Ilul12
11 ((1.
Theorem. Suppose the pair A , B is stabilizable and the pair C,A be observable. Then M O = Lf@O = H,(A, B,C),where H,(A, B , C)is an Hoc n o m for the linear system x = Ax Bu with the output = Cx.
+
The proof of the theorem follows fromthe existence of a stable integral subspace L for the linear Hamiltonian system corresponding to thelinearquadratic problem with
(7)
2 = AZ
+ Bu,
V = CX,
-
~ Y ( z , u ) = 7((u(I2 I
~CZ~(~
and 7 = H,(A, B , C). This L is complementary to the subspace x = 0 (in standard coordinates p , x in T*Rn and its projection on Rn is one-to-one [23].After a perturbation of this situation with small nonlinear terms one obtains an integral submanifold M in some vicinity of (x = 0,p = 0) with the tangent space L at the origin. Application of Theorem 5.3 completes the proof (compare [26]).
458
Shoshitaishvili
5.9. We now change the assumptions of Theorem 5.8. Namely, let ( 5 )
have the form j.1
(8)
= AlZl
+ a1(21,22,74,
~2=A222+B~+aa(x1,22,2~),
where A1 is Hurwitzian or anti-Hurwitzian, the pair A2, B is controllable, a l , a2 are second order terms, and C x = X l x l + C z x z ,where the pair C2, A2 is observable and IlC1ll is small. Then the stablemanifold M C T * X are defined for some vicinity X of the point x = 0. Very often for A1 anti-Hurwitzian the manifold M is situated w.r.t. X in such a way that T : M -+ X , T : ( p , x ) I“+ x , is one-to-one for a small neighborhood of any point belonging to T ( M )except z = 0.
Theorem. In the above situation one has I f @ x 5 7 for any x E n(M),
x #O.
Proof. Let W be any loopprocess x ( t ) , u ( t ) , (0 5 t 5 6 ( w ) ) such that x ( 0 ) = x = x ( 6 ( w ) ) and z ( t ) belongs to a small neighborhood of x . For small E > 0 consider the process W‘ whichcoincides with W on 0 5 t 5 6(w) and with z ‘ ( t ) on 6(w) 5 t 5 E , where z’(t)is the xcomponent of the trajectory T of the Hamiltonian system (3) with initial . by F1, F2, F3 thevalues conditions x(6(w)) = x , p ( 6 ( w ) )= ~ - l ( x )Denote of the cost function on the processes W ,T , W‘. Then by Theorem 5.7 one has Fl F2 = F3 2 Fz, i.e., F2 2 0. Q.E.D,
+
5.10. Theorems 5.3-5.9 show that itis possibleto describe the solutions
of the optimization problem in terms of Ashby relations. It is easy to find parallel between solutions of optimal problems and solutions of QBC-
Control Stabilization and
of Dynamical Systems
459
problems. In both cases the state variables z are doubled. In the first case they are doubled by the impulse variables p , and in the second - by the variables L where z are the state variables of the standardsystem. In both cases the existence of an integral manifold which is finitely branched over the base is of crucial importance for synthesis. Now the question of the correspondence between optimal problems and QBC-problems can be posed as follows. Suppose one has an Ashby relation A = ( k = W ( Z , U ) , U = e ( z ) , i = w(z), M ) which gives a solution of the qualitative behavior control problem for j! = w(z,U ) relative to the type of i = w(z). It is clear that for
where 4, $, y vanish on M , the Ashby relation
A’ = (5= d
( ~Z ,,U ) , U
= e‘(z, z , u ) , = ~ ~’(5, %,.),M)
is defined. Do there exist 4, $, y and a symplectic structure such that M is a Lagrangian submanifold and the vector field ( v ’ , ~ ‘ )is Hamiltonian? Do M , (W’,U ’ ) correspond to an optimal problem?
Remark. In [29]the following question is investigated: whether for a given Lagrangian manifold L in the form of p = L(t)z (here L(t) is a matrix depending on time) there exists a linear-quadratic optimization problem such the synthesis for it is accomplished by L. We say that the QBC problem and the Optimal problem correspond if the Ashby relation A can be transformed by abovementioned procedure to the Ashby relation A’. By means of this corresponding the relation between imitation and Optimization approach is defined. Acknowledgements. I would like to thank A, Agrachev, P. Brunovsky, A. Sarychev for helpful discussions.
460
Shoshitaishvili
REFERENCES I. G.Malkin, Theory of Stability of Motions, Nauka, MOSCOW, 1965 (in Russian). R. Brocket, Asymptotic stability and feedback stabilization, in Differential Geometric Control Theory, Birkhauser, Boston, 1983. W. P. Dayawansa, Resent Advances in the Stabilization Problem f o r Lou, Dimensional Systems,in Proceedings of Nonlinear Control Systems Design Symposium (M. Fliess, ed. ), Bordeaux, 1992, 1-8. [41 A. N. Shoshitaishvili, Qualitative control of dynamical systems,Uspekhi Mat. Nauk 43 (1988), English 180; transl. in Russian Math. Surveys 43 (1988). 151 A, N. Shoshitaishvili, Ashby relation and qualitative control of dynamicalsystems, Funktional Anal. i Prilozhen. 24 (1990), 94-95; English transl. in Functional Anal. Appl. 24 (1990). A. N. Shoshitaishvili, Singularities for projections of integral manifolds with applications to control and’ observation problems, in ,,Advances in Soviet Mathematics”, v. 1 (Arnold’svolume “Theory of Singularities Publ. of American Mathematics1 Society, Providence, R. I., 1990, 295-333. V. I. Arnold, A. N. Varchenko and S. M. Gusein-Zade, Singularities of differentiable maps, vol. 1, “Nauka”, Moscow, 1982. R.E. Kalman, P. L. Falb, and M. A. Arbib, Topics in Mathematical Systems Theory, McGraw Hill Book Company, New’York, 1969. A. N. Shoshitaishvili, O n bifurcations of the topological type of singularities of vector field depending on parameters, Funktional. Anal. i prilozhen. A. N. Shoshitaishvili, O n bifurcations of the topological type of singularities of vector field depending on parameters, Proc. Petrovskii 1975, 279-309; English transl. in Seminar, vol. 1, ,,Nauka”, MOSCOW, Amer. Math. Soc. Transl. (2) 118 (1982). V. I. Arnold and Yu. S.Il’yashenko, Ordinary differential equations, l , in Dynamical Systems - l , Ordinary Differential Equations and
Stabilization and Control
of Dynamical Systems
461
Smooth Dynamical Systems (D. V, Anosov,V. I. Arnold, eds.), Springer-Verlag, New-York, 1988, 7-149. H.Poincard, Memoire sur les courbes definie par les dpation diffdr151entielles, Iv partier-J. Math. Pures et Appl., 4e ser.; 1886, 2, 217 (Oeuvres, t. I, 167-222). A. N. Shoshitaishvili, T h c i n g and synthesis of periodic motion with prescribed j k q u e n c y f o rcontrollability and observability degeneracy, All-Union Seminar on Dynamics of Nonlinear Control processes, Tallinn, 1987.Abstracts of papers, Inst. Control Sci. , MOSCOW, 1987,p. 81. (Russian). J. -M. Coron, A necessary condition f o r feedback stabilization, Syst, Contr. Lett. 14 (1990), 227-232. E. A. Coddington, N. Levinson, Theory of ordinary differential equations, Wiley, New York, 1961, V. I. Arnold, Singularities in calculw of variations, Itogi Nauki i Tekhniki, Modern problems in Mathematics, v. 22, 3-55. V. M. Zakaljukin, Perestroikas of wave fronts, caustics, depending o n parameters, versality of majpings, Itogi Nauki i Tekhniki, Modern problems in Mathematics, v. 22, 53-59. A. N. Shoshitaishvili, On attainmentof the standard motion,in Proc. of Institute of Systems Investigations, 13, Moscow, 1991, 93-102 (Russian) W. B. Cannon, The wisdom of the body, London, 1932. W. R. Ashby, An Introduction to Cybernetics, Chapman and Hall, London, 1956. J. Kurzweil, O n the analytical construction of regulators, Automatika i telemekhanika 22 (1961), 688-695 (in Russian). J. C. Willems, Least-squares stationary optimal control and the algebraic Riccati equation, IEEE Trans. Automat. Control 16 (1971), 621-634. J. C. Doyle, K. Glover, P. P. Khargonekar and B. A. Francis, StateH2 and H- controlproblems, IEEE space solutionstostandard Trans. Automat. Control 34 (1990), 831-846.
462
Shoshitaishvili
(241 P. Brunovsky, O n the optimal stabilization of nonlinear systems, Czechoslovak Mathematical Journal, 18 (93) 1968, 278-293. [25] A. N, Shoshitaishvili, On the erctremals of some infinite horizon problems, 7th IFAC workshop on Control Application of Nonlinear programming and Optimization, Tbilisi, 1988, Abstracts of papers, Inst. Control Sci., Moscow, 1988, 86-87. [26]A. J. van derSchaft, O n a state space approach tononlinear Hoc control, Systems and Control Letters 16 (1991), 1-8. [27]H. Halkin, Necessaryconditionsforoptimalcontrolproblemswith infinite horizon, Econometrica 42:2 (1974). [28]A. A. Agrachev, R. V. Gamkrelidze, Symplectic methods for optimization and control, this volume. [29] P. Brunovsky, J. Kamornik, ThematrixRiccatiequationandthe non-controllable linear-quadratic problem with terminal constraints, SIAM J. Control Opt. 21 (1983), 280-288.
12 An Introduction to the Coordinate-Free Maximum Principle H. J. Sussmann Department of Mathematics, Rutgers University, New Brunswick, New Jersey 08903
Abstract. We present a version of the Maximum Principle of Optimal Control Theory that stresses its geometric, intrinsic formulation on manifolds, using the theory of connections along curves to write the adjoint equation in a coordinate-free fashion, so that the momentum functions arising from all vector fields are treated equally. By working with large systems of vector fields, we avoid making ad hoc choices such as selecting a local basis.of sections of a distribution. We then give examples showing how this coordinate-free version can be used to analyze the structure of optimal trajectories in terms of relations among iterated Lie brackets of a given control system, working in each case with the adjoint equation in terms of the momentum functions that are adequate to the particular problem under consideration.
Partially supported by NSF grant DMS92-02554. Part of this work was done while the author was a visitor at Institute for Maths matics and its Applications (I,M,A,),University of Minnesota, Minneapolis, Minnesota 55455.
463
Sussmann
464
1. Introduction
The purpose of this note is to present an introduction to a version of the finite-dimensional Pontryagin Maximum Principle that stresses its geometric, coordinate-free formulation on manifolds, and at the same time takes into account some recent developments that extend and generalize more classical versions. The Maximum Principle (abbr. MP) is a “principle” in the true sense of the word. That is, it is a somewhat vague statement that can be made precise in more than one way, in various situations, under different technical assumptions, and with similar but not identical conclusions. So the MP gives rise to a cluster of results about several closely related problems (cf. [l], [2], [3],[4], [7], [8], [ll], [l31for various versions). While all these results may look nearly identical to theuninitiated, they differ significantly in the precise technical details of their formulation. These differences-e.g. on technical assumptions on Lipschita conditions, or on Carathhodory-type requirements on integral bounds for derivatives or Lipschitz constants of time-dependent, functions or vector fields-mayseem unintuitive to the differential geometer, who is accustomed to dealing with very smooth objects, but are often the crux of the matter for the control theorist. For this reason, there is the danger that a truly general version of the Maximum Principle may appear at first sight to involve intricate lists of unmotivated technical conditions, making it hard for the reader to appreciate the statement’s rather natural geometric content. Our objective here is to provide a formulation that particularly stresses the differential geometric aspect and brings to the fore the crucial role of geometric objects such as Lie brackets, connections along curves, symplectic structures and Poisson brackets, without making unduly restrictive technical assumptions. We insist on providing coordinate-free formulations that make sense on manifolds directly, i.e. without having to choose particular’coordinate charts, or bases of sections of vector bundles. We will emphasize three features of our version, namely,
Coordinate-FreeMaximum Principle
465
F1. the fact that a control system is defined to be a general family of vector fields,
F2. the use of minimal technical assumptions, thatare in particular strictly weaker than those of previous versions such as the nonsmo0th “Maximum Principle under minimal hypotheses’’ of [3],
F3.
the intrinsic coordinate-free formulation on manifolds.
Our presentation of the MP is preceded by a review of the classical Euler-Lagrange equations, stated so as to make the transition from the classical variational problem to thecontrol problem straightforward. This should facilitate comprehension to those readers who are more familiar with the classical Calculus of Variations (abbr. C. of V.) than with modern Optimal Control Theory. At the same time, our path from the C. of V. to Optimal Control shows the latter to be a very far-reaching generalization of the former: the kinematics (i.e. the set of curves) of a C. of V. problem is trivial, since it consists of the set of all curves, and is generated by the family of all vector fields, so that all the interesting structure of a C. of V. problem arises from its Lagrangian. Optimal Control problems, on the other hand, can have a mu,& more nontrivial kinematics IC, generated by fairly general collections 3 of vector fields, whilethe Lagrangian L can still be as complicated as in the C. of V. case. It follows that in the study of these problems a much larger set of issues arises, including many that have no C. of V. counterpart, such as for example various “controllability” questions, i.e. questions about the properties of the set R(p) of all points reachable from a given point p by means of curves in I C , such as whether R(p) is the whole space, whether it is a neighborhood of p , whether it is a submanifold, whether it has a smooth or piecewise smooth boundary, or whether it is subanalytic. Toanswer these questions, one has to bring into play a much richer geometric structure, and in particular the Lie-algebraic relations among the vector fields in 3. Moreover, those issues that do have a C. of V. c0unterpart-e.g. the understanding of the structure of optimal trajectories-require an even richer structure, involving the interplay between the Lie-algebraic properties of 3 and the Lagrangian L.
. . .
. . . . . . . . . . . . .
..,.......
...
r . -
........x“ .,,~.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ._.....
,._I.
466
Sussmann
Because of limitations of space, many important issueswill not be discussed. In particular, (i) we will only deal with the MP f o r problems without state space constraints. Moreover, (ii) we will not touch upon another important aspect of the new version of the MP, namely, the inclusion of high-order necessary conditions, and (iii) we will not discuss the M& ximum Principle for differential inclusions. Finally, (iv) we will omit the proof. (A rather complete outline of the proof for the case without state space constraints is given in [22] and [23]. These two papers only deal with the case when the state space is an open subset of a Euclidean space Rn, but they allow systems with with jumps. The manifold caae is then a trivial corollary, derived by applying the Euclidean space result to a system where the jumps are coordinate changes.) Other than the four limitations listed above, our statement will be general, our presentation self-contained, and our technical assumptions minimal. The results will be stated in the general nonsmooth setting under which they were proved in [22] and [23]. In this setting, the reference arc is generated by a control q* that gives rise to a time-varying vector field (x,t ) + F,, (x,t ) that is only assumed to be locally Lipschitz with respect to the state variable x (with an appropriate integral bound of Carathdodory type for the Lipschitz constant as a function of t). On the other hand, for extra clarity, most of our discussion will be confined to the smooth case, in which F,, is of class C1with respect to x. We incorporate Feature F1 into our version by following the ideas of B. Kaskosz and S. Lojasiewicz Jr. on “generalizedcontrol systems” (cf. [6]). We define a control system to be a parametrized collection of time-vaying vector fields, i.e. a family F = {F,},Eu of time-dependent vector fieldscalled control vector fields2, abbr. CVF’s-(x, t ) ”t F,(%,t ) on a manifold M . This ismore general, and in our viewalso more natural, than the more traditional approach of considering systems j: = f(z,t,u), U E U (or, more generally, systems j: = f(z, t , u ) ,U E U ( t ) ) ,where U (or each The reader is warned that this terminology is not standard, since the expression “control vector field” has been used by other authors (for example, F. Albrecht in [l]) with a different meaning.
Coordinate-Free Maximum Principle
467
of the sets V ( t ) )is usually a subset of a Euclidean space, and “open-loop controls” are functions t + ~ ( tE) V ( t )required to satisfy some measurability conditions and some bounds. For us, U will just be an abstract set, whose elements 7 will still be called “open-loop controls.” Naturally, measurability properties and bounds are still needed, but they now become conditions of the Caratheodory type on the control vector fields F,, rather than requirements on the parameter values v. Dujectories are then locally absolutely continuous curves t + ( ( t )in M that satisfy the O.D.E. ( ( t )= F,(E(t), t ) for some 7. The MP deals with pairs (<,v),where E is an arc and 7 is a control that generates 5. (In this paper, the word “arc” always means “absolutely continuous curve defined on a compact interval.”) We call such pairs “controlled arcs.”3 Feature F2 is included by following through on a deep observation due to Lojasiewicz4,who pointed out that the vector fields other than F,, can be assumed to be just continuous with respect to the state.Thanks to this extra generality, the approach presented here turns out to settle the optimality question even in some situations where other nonsmooth versions of the Maximum Principle fail, as remarked by Kaskosz and Lojasiewicz, one of whose examples is reviewed below (cf. Example 15.4). The incorporation of Feature F3 is the most important goal of this paper. In order to provide an intrinsic coordinate-free formulation of the MP, we first present a detailed discussion of the precise type of differentialgeometric structure that is required for this purpose. As is well known, the conclusion ofthe MP involves three basic ingredients, namely, (11) the adjoint equation (abbr. AE), (12) Hamiltonian minimization, and (13) transversality conditions. The last two are manifestly intrinsic, but the AE for a controlled arc +-y = ( E , 7) would appear to involve some choices, sinceit has the form ( = 0, and the intrinsic meaning of -the Jacobian
+ S.2
2
In our view this conveys the intended meaning much better than expressions such as %dmissible pair,” that are often encountered in the literature. Naturally, there is no need to distinguish between arcs and controlled arcs for systems such as those of the classical Calculus of Variations, in which a trajectory cannot begenerated by more than one control. 4 Personal communication.
468
Sussmann
with respect to x of the CVF x + &,(x, t)-and of the derivative t require further clarification. (It is clear that the correct differential-geometric interpretation of the “adjoint vector” C is as a field of covectors along the curve E , but then is not well defined, since one does not know how to differentiate fields of vectors or covectors along curves.) The proper m w e r turns out tobe-if F,,is sufficientlysmooth-that the expression (+C.% must not be broken up into the sum of two separately meaningful terms, but has to be regarded instead as the covariant derivative of C relative to a connection L7 naturally associated with r.This is explained in detail in Section 4. The paper is organized as follows. In Section 2 we review the EulerLagrange equations and show how to generalize them to get the Control Theory MP. In Section 3 we begin our formal discussion by introducing a number of general definitions, notational conventions and abbreviations that will be used throughout the paper. In Section 4 we discuss the theory of connections along curves and study the adjoint equation along a controlled curve, i.e. Item (11) in our list of three main ingredients of the MP. Section 5 contains the definition of control systems, Lagrangians and reachable sets. Section 6 studies Hamiltonian minimization-i.e. Item (12) in the list-and introduces the important property of separability, which plays a crucial role in the transition from “weak” to “strong” minimization. Section 7 studies the thirditem in our list, namely, transversality, and defines the concept of an approximating cone. Section 8 then presents the statement of various forms of the MP, starting with the basic version of the MP as a theorem aboutset separation, anddeducing from it other versions for controllability and optimal control. In Sections 9 and 10 we specialize to systems of the form k = f ( x , u )and k = f(x,t,u) and derive the corresponding versions of the MP, showing in particular that in these cases the strong form of the minimization condition can in fact be used instead of the weak one. Sections 11 to 15 contain illustrations of the application of the MP to various differential-geometric and optimal control questions. In view of the geometric orientation of this paper, we choose as our first example the well known equation of the Riemannian geodesics, and show
Coordinate-FreeMaximum Principle
469
in Section 11 how to derive it in a simple way from the MP, using only the characterization of the Riemannian connection as the unique torsion-free connection for whichthe covariant derivative of the metric tensor vanishes, and without having to take coordinates or write formulas for the Christoffel symbols. (It turns out that the MP gives a necessary condition that can be stated in terms of an arbitrary connection C along the given curve. When L happens to be the pullback of the Levi-Civita connection, then the condition takes a simpler form, which is precisely the geodesic equation.) In Section 12 we look at a problem lying at the opposite extreme of the C. of V. -Control continuum, namely minimum-time single-input control-&ne optimal control. (This means that L 1, so L is trivial, and F essentially consists of two vector fields, twobeing the minimum number of vector fields needed in order to have nontrivial control.) We show how the application of the MP-using the geometric form of the AE-directly leads to reltG tionships between properties of the optimal trajectories and Lie algebraic relations among the two vector fields defining the kinematics. In particular, Section 12 provides a very brief introduction to some key concepts and facts in the general regularity theory of optimal trajectories. In Section 13 we let the pendulum swing back to thegeometric side, and look at translation-invariant control problems on Liegroups. In particular, we give an example of a problem in the rotation group SO(3) which is not Riemannian “since there are only two independent directions of motion at each point- but where the MP implies smoothness of the optimal arcs. Section 14 discusses sub-Riemannian minimization, and shows a coordinatefree, basis-free derivation of the necessary conditions for optimality, highlighting the importance of the so-called “abnormal extremals.” Section 15 presents a number of examples pertaining to various technical issues discussed in the text.These examples show: (i) that theconditions of the MP are not local (Example 15.1); (ii) that our version of the MP, by allowing the vector fields other than the reference vector field to be just continuous, gives in some cases better results than other nonsmooth versions (Example 15.2); (iii) that the conditions of the MP can be satisfied for a controlled trajectory (<,v) but be violated for another controlled trajectory (E, ?)
Sussmann
470
with the same 5 (Example 15.3), (iv) that by enlarging the original control system-i.e. by adding more vector fields without changing the class of trajectories-one can sometimes rule out arcs that satisfy the conditions of the MP for the original system (Example 15.4), (v) that therequirement that thereference vector field be locally integrably Lipschitz cannot be weakened (Example 15.5).Finally, in Section 16, we briefly point out that the MP is only the beginning of a much richer theory, involving “high-order conditions.” We present a simple example of a problem for which the MP in the form of this paper does not suffice, whilethe addition of a high-order condition involving a Lie bracket leads to a complete solution.
2. The Euler-Lagrange equations and the maximum principle Let SZ E. Rn be open, let a, b E P,a < b, and let SZ x Rn x [a,b] 3 + L(z,U ,t ) E B be a function that satisfies technical conditions to be made precise later. In its simplest version, the classical Calculus of Variations deals with the following minimization problem: given Z and 2 in SZ,characterize those arcs (i.e. absolutely continuous maps) & : [a,b]+ 0 that minimize the integral JL(<)= L(<(t), ( ( t ) ,t )d t among all the arcs : [a, b] + SZ such that [ ( a ) = I and ((a) = 53. (z, U ,t )
Under suitable technical conditions on L and
(ELI)
&,,the implication
a minimizing arc necessarilysatisfies (EL)
is true, where (EL) stands for the Euler-Lagrange equations, usually written in the form $ = %, For sufficiently smooth curves and Lagrangians (e.g. for &,L E C2)the meaning of this equationis clear. Under more general conditions (for example, if is Lipschitz but not Cl),some interpretation is required. Define a vector-valued function -) : [a,b] -+ R” (the momentum) by = - % ( E * ( t ) , i * ( t ) , t ) ,(i.e. c&) = -E(E*(t),i*(t),t) for i = 1,. . ,n). Then the Euler-Lagrange equations say that
(g)
.
m
- c(
Coordinate-Free Maximum Principle
(EL)
471
is equal a.e. toanabsolutelycontinuous-obviouslyuniquevector function <,that satisfies &(t) $$(<*(t),(*(t),t ) = o for a.e. t E [a,b], i = 1,.. ,n.
+
Then (ELI) becomes a rigorous theorem if, for example, (i) L is a function of class Cl,and (ii) E* is Lipschitz. Theseare not, however, the most general hypotheses that make (ELI) true. Much weaker assumptions still suffice, although they are more complicated to state, and the proof of the implication (ELI) becomes more delicate. For example, it suffices to require -ifwe use 7(&, S) to denote the set 7(&, 6) = ((2,t ) : a 5 t 5 b, 11z - E.(t)ll 5 6)”that the following conditions hold on 7 ( ( , , 6 ) for some S > 0:
EL1 EL2
L ( a ,U , t ) is continuous with respect to (2,U ) for each fixed t , L($,U, t ) is Lebesgue measurable with respect to t for each fixed
(x,U ) , EL3
EL4 MIN
L($,U , t ) is continuously differentiable with respect to z for each fmed ( U , t ) ,and a “Carathkodory type” estimate g ( z , &(t), t)ll 5 cp(t) is satisfied for all (z, t ) in 7(&, 6), where cp : [a,b] + W is an integrable function, L ( z ,U , t ) is everywhere differentiable with respect to U for each fixed (z, t ) , JL(&) 5 JL(<)for all absolutely continuous curves : [a,b] + Sl such that [ ( a ) = & ( a ) , ( ( b ) = C*@), and ( ( ( t ) , t )E 7 ( & , 6 ) for all t E [a,b].
11
The Euler-Lagrange equations can also be written in “Hamiltonian form” by introducing the Hamiltonian X L : Cl X Rn X W X R* + W given by
(2.1)
xyz,z , t,U) = ( & U ) + L(z,u,t).
Ifwepluginz=&,(t),u=(,(t),z=C(t),andletl?(t) = ( & ( t ) , ~ ( t ) , t , ( * ( t ) ) , then theO.D.E. t ( t )= ( * ( t )t)-which , holds a.e. on [a,b]-just says that, along r, the identity
-g(&(t),
Sussmann
472
holds for a.e. t E [a, b].Clearly, x = U also holds a.e., and this, together = 0, says that Hamilton's with the Euler-Lagrange equations equations
+
hold for a.e. t E [a, b],The Hamiltonian version of the classical necessary conditions then becomes:
Theorem 2.1. Let SZ be a n open subset of R", and let E* : [a,b] + SZ be absolutelycontinuous.Let L : Cl x Rn x [a,b] + R be a function such that Conditions EL1, EL2, EL3, EL4 and MIN are satisfied o n someset 7 ( & , 6 ) . Thenthere is a n absolutelycontinuousfunction C : [a,b] + R" such that, if we define ?LL by (2,1), then (2.2) and (2.3) hold for i.e. t E [a, b],i = 1,.. . ,n, along the curve 'I given by t + (x,z, t, U ) =
(s*(t),c(t),t,i*(t)>. This can be strengthened by observing that (2.2) is a consequence of the following assertion (2.4)
?L'(~*(t),C(t),t,j*(t)) = m i n ( ~ ' ( r ( t > * , 6 ( t ) , t l V: )v E an) for a.e. t E [a, b],
known as the W e i e r s t m s side condition. It turns out that (2.4) is also necessary for optimality. So we may reformulate our necessary conditions using (2.4) in place of (2.2). This has two deep consequences. First, the new result is better even for a Lagrangian L and arc that satisfy the technical conditions of Theorem 2.1, since (2.4) is in general strictly stronger than (2.2). A second and even more important consequence of the new formulation is that, since (2.2) was the onlyplacewhereournecessary conditions involved the partial derivatives these derivatives have now completely disappeared. It is then to beexpected that the differentiability condition EL4 might no longer be needed. Indeed, one can prove:
c*
e,
Theorem 2.2. Let SZ be a n open subset of R", and let .$ : [a,b] + SZ be absolutely continuous. Let L : Cl x R" x [a,b] + R be a function such that EL1, EL2, EL3 and MIN hold ,onsome set T(&,6).Then there is
Coordinate-Free Maximum Principle
473
an absolutely continuous function 6 : [a,b] + Rn such that, if we define ?lL by (2,1),then (2.3) and (2.4) hold for a.e. t E [a,b], i = 1,.. . ,n, along the curve l? given by t + (x,z , t , u ) = (c*(t),<(t),t,(*(t)). Theorem 2.2 is a particular case of the general Maximum Principle, stated below. What we have done so far can be described in the following alternative way: we started with the dynamical equation 5 = q(t),in which x belongs to an open subset S2 of Rn and the “control” r] is an arbitrary integrable Rn-valued function, and with a function (x,U , t ) + L(o,U , t ) ,the “Lagrangian.” For each contro1,function7, let L,(x,t ) = L(%,q ( t ) , tLetting ). U be the set of all control functions 7, we can define a family X L = { H { } q ) E ~ of time-varying Hamiltonian functions, parametrized by U ,by letting
Associated with each Hamiltonian function H : S2 x R* x I + R -where I is a subinterval of R-there is a time-varying vector field g on 0 x Rn, whose components are . . , ~OHa ,-,m 8H , .. , So to thefamily X L of Hamiltonians there corresponds a family gL = {d$},,u of t i m e varying vector fields on S2 x Rn.
(g,.
I
-E).
The necessary conditions for optimality then say, simply, that an optimal arc and its corresponding control function q* must satisfy the following: there is an arc 6 in Rn such that (i) S = 6 ) is an integral curve of the Hamiltonian system on S2 X Rn whose dynamical equations are A ( t ) = G,L (X ( t ) , t(ii) ) , for almost every t , r]* minimizes the function U 3 7 + ? l t ( E ( t ) , tMore ) , succinctly:
c*
(c*,
(a)
to eachLagrangian L : 0 x Rn x I + R therecorresponds a Hamiltonian system-i.e. a collection dL of time-varying Hamiltonian vector fields-on 0 x Rn; and
(b)
every optimal arc correspondingto a control q. can be lifted to an integral curve 3 of the Hamiltonian system in such a way that the control q* minimizes the Hamiltonian.
c*
474
Sussmann
To understand the control theory generalization, notice that the situation considered above really has two ingredients, namely (a) a dynamics (or kinematics) E, which singles out the class of curves over which the minimization is to be performed, and (b) a Lagrangian L. The dynamics C happens to be trivial, in the sense that C is the class of all absolutely continuous curves in a. Let us use Un to denote the class of all constant vector fields on a, and let Un be the set of all absolutely continuous maps t + q(t) E Un.Now let us identify our set U of controls with Ua in an obvious way, by thinking of our control functions q as Un-valued functions (so each value q(t) E Rn is now thought of as a constant vector field, and then q itself becomes a time-varying vector field). Then C is the set of all arcs that are integral curves of time-varying vector fields q E Un.
The control theory generalization consistsof replacing this special class C by much more general classes of curves, arising from more general families of time-varying vector fields. More precisely, rather than take U to be the special class Un,we now let U be an arbitrary parameter set, and consider families F = {F,),Eu of time-varying vector fields, parametrized by q. The dynamical equations then become 5 = F,(z,t), and the Lagrangian will be, as in the classical case, a family of functions of a,t parametrized by 9. (The classical case corresponds to letting F,(z,t ) = q(t).) It is clear that once we have decided to operate on this level of generality there is no longer any need to work on open subsets of Euclidean spaces, so the natural spaces for our optimization problem$are manifolds. In the Calculus of Variations problem, the O.D.E. 5(t) = q(t) determines the control Q given the curve a ( . ) ,so the cost can be viewed as functional on the space of curves. In themore general setting, it may happen that a curve .$ is generated by more than one control v, and the cost J could depend on q as well as .$.For this reason, it is now natural to introduce the concept of a “controlled trajectory,’’ i.e. of a pair (E, q) such q is a control and .$ is a trajectory generated by q.
Coordinate-Free Maximum Principle
475
Example 2.1. Consider "the control system f l = ql(t),kz = q2(t)xl, ( ~ 1 ~ x E2 Etz, ) q integrable.'' That is, we take
U=
U
Ll([a,b],R2)
-w
and associate with each q E U with domain I the time-varying vector field
* (ql(t),r]Z(t)zl),i.e. Fq(',t) = ql(t)& + qZ(t)zlG. 8 Let the Lagrangian L begivenby L ( q , z 2 , ~ 1 , u=~ ut ) + us, i.e. let L,(xl,xz,t) = q ~ ( t+) q~z ( t ) 2 . Let ( : [0,1] + M be the arc givenby F,
: (xl,xZ,t)
( ( t )= (0,O). Then ( is generated by the control q(t) = (0,O)and also by the control q(t)= (0,l). The costs of the controlled arcs ((,r)) and ((,g are different, since J ( ( , q ) = 0 and J((,;i) = 1. m Pursuing the analogy with the classical case, it is natural to assign to each control system C = (M,U, F)-where F = {F,),Eu-andLagrangian L = { L , } q E ~a ,family of Hamiltonian functions ?l:*", defined bY
(This is formally analogous to (2.5), since both formulas can be written, somewhat less precisely, as 31 = ( z , k) L.) Since F,(Ic, t ) is a tangent vector at S,it is clear that, for (2.6) to make sense, z has to be a covector at x. So each should be regarded as a function on the cotangent bundle T * M . We recall that, for a Haniiltonian function H on T * M , there is a well-defined Hamiltonian vector field l?, whose integral curves obey, 8H ,i = -in coordinates, the Hamilton equations k = 8x
+
31fi"tL
g,
Having defined the Hamiltonian functions, it is natural to expect that, under suitable technical assumptions (such as continuous differentiability with respect to IC of each Fq and each L,, which is needed to guarantee the existence of the partial derivatives of the Hamiltonians with respect to x), the obvious analogue of Theorem 2.2 may be true. Precisely, suppose C = (M,U, F) is a control system andL is a Lagrangian for C.Use Carc(C) to denote the set of all controlled arcs of C. If 7 = ((,v) E Carc(C) has
Sussmann
476
domain [a,b], write
87 = (
57 = (a,W
E W , W)), JC%)
,b , E ( W ,
5
= L,(E(t),t)dt. I
Call a y* = (E*, 9*)E Carc(C) with domain [a,b] a minimizer if
(2.7)
JCtL(r*) 5 J C s L ( 7 for ) all 7 E Carc(C) such that
= 83y*.
The natural extension of Theorem 2.2 would then say that if y* is a minimizer then there is an absolutely continuous function : [a,b] + T*M such that (i) <(t)E T&]M for each t E [a,b],(ii) the arc t + (&(t),<(t)) in T * M is an integral curve of the time-varying vector field t + and (iii) the control Q+ minimizes the Hamiltonian, in the sense that
<
(2.8)
q y ( E * ( t ) , C ( t ) , t=) min(~tt,CI”(E*(t),<(t),t) :9 E W for a.e. t E [a,b].
It turns out that the above statement is not quite correct in general, even under the most restrictive smoothness assumptions. However, a slightly modified version is correct. First, one has to to work not just with the Hamiltonian but with all the Hamiltonians ‘?dC*yL,for all Y 2 0. (The new parameter v is called the “abnormal multiplier.”) And second, one has to substitute for the “strong” Hamiltonian minimization condition (2.8) the weaker property ‘?d’lL,
(2.9)
‘(V7 E U ) ( % y ( E * ( t ) ,C(t),t)5 xt,CIVL(E*(t),<(t),t)
for a.e. t E [a,b ] ) . If we write Z = ( E , <) as before, use T # M to denote the cotangent bundle of M with the zero section removed, and let ? r T n M denote the canonical projection from T * M to M , then the correct statement of the Maximum Principle turns out to be-modulo technical assumptions-the following:
(MP)
Let C = ( M ,U , F ) be a control system, and let L be a Lagrangian for C . Let y* = (&,q*) E Carc(C), = (a,z-,b,z+) be a minimizer. Then thereexist a constant v 2 0 andanintegral
Coordinate-Free Maximum Principle
477
curve E, : [a,b] + T * M of the time-varying vector field $!:,+‘L, such that (i) T T * M oE*= <*,(ii) either v > 0 or a* takes values an T # M , and (iii) (2.9) holds. This is not yet a rigorous theorem, because the technical assumptions have not been stated precisely. A precise statement of a more general result under minimal technical conditions will be given below in Section 8. However, some observations are already in order at this point:
1. Except for the presence of the "abnormal multiplier" ν and the use of "weak" rather than "strong" Hamiltonian minimization, (MP) is an almost verbatim transcription of Theorem 2.2 to our more general setting.

2. If we write Ξ*(t) = (ξ*(t), ζ*(t)), then Condition (i) of (MP) says, in coordinates, that ζ* satisfies the "adjoint equation"

ζ̇*(t) = −ζ*(t) (∂F_{η*}/∂x)(ξ*(t), t) − ν (∂L_{η*}/∂x)(ξ*(t), t),

where we should think of ζ* and ∂L_{η*}/∂x as row vectors, and of ∂F_{η*}/∂x as a square matrix. If we add to these the equation ν̇ = 0, we obtain a system which is linear in (ζ*, ν). Therefore the function t ↦ (ζ*(t), ν) either vanishes identically or vanishes nowhere, so the nontriviality condition (ii) of (MP) is equivalent to the requirement that (ζ*(t), ν) ≠ (0,0) for some t. (A matrix form of this linear system is sketched after this list.)
3. The pair ζ*(t) ≡ 0, ν = 0 always satisfies all the conditions of (MP) other than the nontriviality property (ii). Hence (ii) is essential, for without it (MP) would give no information.
4. If a pair (ζ*, ν) satisfies the conditions of (MP), and ρ > 0, then the pair (ρζ*, ρν) also satisfies those conditions. Hence, if one can satisfy the conditions with ν ≠ 0 then it is in fact possible to satisfy them with ν = 1. Hence the only real effect of introducing the abnormal multiplier is that we now have to consider two Hamiltonians, namely, H^{C,L} and H^{C}, the ones corresponding to ν = 1 and ν = 0.

5. Under fairly general conditions (cf. Sections 9 and 10 below) the weak form (2.9) of Hamiltonian minimization actually implies the strong form

(2.10)    H^{C,νL}_{η*}(ξ*(t), ζ(t), t) = min{ H^{C,νL}_η(ξ*(t), ζ(t), t) : η ∈ U } for a.e. t ∈ [a,b].
6. For the classical Calculus of Variations situation the possibility that ν = 0 can be excluded. Indeed, in that case the Hamiltonian H is just ⟨z, η(t)⟩ + ν L(x, η(t), t). If ν = 0, then H = ⟨z, η(t)⟩. For our reference arc ξ* and adjoint vector ζ*, this equals ⟨ζ*(t), η(t)⟩. An elementary argument shows that in this case (2.10) easily follows from (2.9). So the control value η*(t) = ξ̇*(t) minimizes the Hamiltonian among all possible choices of η, for almost all t. Since {η(t) : η ∈ U} = R^n, the vector η*(t) must minimize, for a.e. t, the linear functional v ↦ ⟨ζ*(t), v⟩ over all v ∈ R^n. Therefore ζ*(t) = 0, because a nonzero linear functional on R^n does not have a minimum. So the assumption that ν = 0 implies that ζ* ≡ 0 as well, contradicting the nontriviality condition (ii).
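To make the linearity claim in observation 2 concrete, here is one way to display the augmented adjoint system in matrix form (a sketch; the shorthand A(t) and b(t) below is mine, not notation from the text):

\[
A(t)=\frac{\partial F_{\eta^*}}{\partial x}(\xi^*(t),t),\quad
b(t)=\frac{\partial L_{\eta^*}}{\partial x}(\xi^*(t),t),\qquad
\frac{d}{dt}\bigl(\zeta^*(t),\,\nu\bigr)
 =\bigl(\zeta^*(t),\,\nu\bigr)
  \begin{pmatrix} -A(t) & 0 \\ -\,b(t) & 0 \end{pmatrix}.
\]

This is a linear ordinary differential equation with locally integrable coefficients, so a solution that vanishes at one time vanishes identically; that is exactly the "all or nothing" alternative invoked above.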
It is well known that (MP) can be false if the abnormal multiplier is not included, as the following simple example shows:
Example 2.2. Consider the dynamics ẋ = x² + u², with a control u ∈ R. Suppose we want to minimize ∫₀¹ u(t) dt among all trajectories that go from x = 0 to x = 0 in time 1 and are generated by integrable control functions [0,1] ∋ t ↦ u(t) ∈ R. Clearly, there is only one such trajectory, namely, x*(t) ≡ 0, corresponding to u*(t) ≡ 0. So x* is obviously minimizing. The Hamiltonian is H = ζ(x² + u²) + νu, and t ↦ ζ(t) must satisfy ζ̇ = −2xζ.
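A sketch of how the example forces the abnormal multiplier (my verification of the claim, not text from the original): along x* ≡ 0 the adjoint equation gives a constant ζ(t) ≡ c, and minimization of the Hamiltonian over u ∈ R at u* = 0 reads

\[
c\,u^{2}+\nu\,u \;\ge\; 0 \qquad\text{for all } u\in\mathbb R .
\]

Since the left-hand side has derivative ν at u = 0, this forces ν = 0, and then c ≥ 0; nontriviality forces c ≠ 0, so the conditions of (MP) can be met with ν = 0 and ζ ≡ c > 0, but they cannot be met with ν > 0. A version of the Maximum Principle that omitted the abnormal multiplier (i.e. insisted on ν = 1) would therefore be false for this example.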
Remark 2.1. The Hamiltonian corresponding to ν = 0 does not involve L. Notice that in Example 2.2 our reference trajectory is "rigid," in the sense that there is no other trajectory satisfying the desired endpoint conditions. This guarantees a priori that the reference trajectory must
be optimal for every cost functional, so it is not surprising that this fact should result in a condition not involving the Lagrangian at all.
3. Notations and basic definitions

A list of abbreviations used throughout the paper, of expressions whose meaning is either well known or will be defined later in the text, is provided in an appendix following the References.

An interval is a nonempty connected subset of R, so intervals can be closed, open or half-closed, bounded or unbounded. We use Z to denote the set of all nontrivial intervals. An interval is nontrivial if it is not reduced to a point. If I ∈ Z, then L^1_{loc,+}(I) is the set of all nonnegative locally Lebesgue integrable functions on I. We use D(f), f|S to denote, respectively, the domain of a map f and the restriction of f to a set S. The word "function" refers to an R-valued map, unless we specify otherwise (e.g. by talking about a "vector-valued function"). A family of objects parametrized by a set S is the same as a map f with domain S, and the notation {f(s)}_{s∈S} is just another name for the map f. However, the expression {f(s) : s ∈ S} refers to the set of values of f for s ∈ S, i.e. the set f(S). If S is a set, a time-varying map on S is a map whose domain is S × I for some I ∈ Z. If F is a time-varying map on S, and D(F) = S × I, then I is the time domain of F, and we write I = TD(F). If F_1, . . . , F_m are time-varying maps on the same set S, we write TD(F_1, . . . , F_m) = TD(F_1) ∩ . . . ∩ TD(F_m). For t ∈ TD(F), x ∈ S, we use F_t, F^x to denote, respectively, the partial maps F(·,t) and F(x,·), so D(F_t) = S and D(F^x) = TD(F).

"Smooth" will always mean "of class C^∞." A manifold is always, by definition, smooth, finite-dimensional, Hausdorff, without boundary, and of pure dimension (i.e. all connected components have the same dimension). If M, N are manifolds, f : N → M is a smooth map, F is a smooth manifold, and Ω ⊆ M is open, a smooth F-trivialization of f over Ω is a smooth diffeomorphism Φ : Ω × F → f^{-1}(Ω) such that f(Φ(x,v)) = x for all x ∈ Ω, v ∈ F. We call f F-locally tri-
vial if M can be coveredby open sets overwhich there is a smooth F-trivialization of f. A smoothbundleover M with fiber F is a manifold E endowed with an F-locally trivial smooth map T E : E + M . The submanifold E ( $ ) = r i 1 ( x ) is the fiber of E over x. If M is a class of manifolds endowed with extra structure - e . g . linear spaces, affine spaces- and F E M , then an "bundle with fiber F is a smooth bundle E equipped with an M-structure on each fiber E ( x ) , such that M can be covered by open sets Clcyover which there are F-trivializations @cy such that each map : W + O,(X,W) is an M-isomorphism between F and E ( x ) . If M is the class of finite-dimensional real linear spaces, then E is a smooth vector bundle. In that case, E* denotes the dual of E . (When E = T M , the tangent bundle of M, we write T*M rather than T M * , so T * M is the cotangent bundle of M , and we use T,M, T l M , rather than T M ( x ) , T * M ( xto ) denote the fibers, i.e. the tangent and cotangent spaces of M at x E M . We write T # M for the cotangent bundle of M with the zero section removed.) If y E E ( x ) , z E E*(a) (in particular if y E T,M and z E T,*M),we use indistinctly the notations ( z , y), z.y and z(y) for the value at y of the linear functional z. If M is a manifold, E is a smooth vector bundle over M , and k E R, then C k ( M ) ,rk(E)denote, respectively, the space of real-valuedfunctions of class Ck on M and the Ck(M)-module of sections of class Ck of E. In particular, r k ( T M )and r k ( T * M )are, respectively, the spaces of vector fields and l-forms of class Ckon M , If M is second countable, then the spaces C k ( M ) and r k ( E ) are separable €+&het. Convergence in Ck(M) (resp. rk(E)) is uniform convergence on compact sets of the functions (resp. sections of E) together with all their partial derivatives of order 5 IC. (Precisely: (i) a sequence {pj} of functions in C k ( M )converges in Ck(M)to a function cp if for every i such that 0 2 i c k l and every i-tuple (X,, . . ,Xi)of smooth vector fields on M , the functions XlXz . . . Xapj converge to XlXz . . . Xap uniformly on compact sets, and (ii) a sequence { Sj 1 in rk( E )converges in rk( E )to a section S if for every S' E r m ( E * )the scalar functions (S*,Sj) converge to (S*,S) in C k ( M ) .
The metrizability and separability of these spaces follow from the fact that M is second countable.) It is easy to see that C k ( M )is dense in C i ( M ) whenever k 5 a. If A is a topological space, we use Comp(A) to denote the set of all compact subsets ofA. A Camthdodory function (CF) on a topological space A is a time-varying function cp on A such that (i) cpt is continuous for every t E T D ( p ) ,and (ii) cpz is Lebesgue measurable for every z E A. (If A is separable metric, it then followseasily that cp must be jointly BorelxLebesgue measurable on D(cp).) We use CF(A), CF(I,A) to denote, respectively, the set of all CF's on A and the set of all cp E CF(A) such that TD((p)= I . If cp E CF(A), then cp will be called locally integrably bounded (LIB) if for every K E Comp(A) there exists a t,b E Lfoc,+(TD(cp)) such that Icp(z,t)l5 $(t)for all ( z , t )E K x TD(cp).If A is metric, with distance function d, then a cp E CF(A) willbecalled locally Lipschitz (LL) if every cpt is locally Lipschitz (i.e. cp satisfies a Lipschitz condition (cp(rc) - cp(y)l 5 C K , t d ( z , y ) for ($,P) E K x K for every K E Comp(A), t E TD(cp))and locally integmbly Lipschitz (LIL) if it is LIB and LL, and for every K E Comp(A) the Lipschitz constants C K , t can be chosenso Notice that these definithat the function t + C K , t is in LfOc,+(TD(cp)). tions only involve the restriction of the metric to compact subsets of A, so the'concepts of a LL and a LIL CF on A only depend on the class of d modulo local equivalence. (Twodistance functions d l , d2 on a topological space A are locally equivalent if for every K E Comp(A) there exist constants c1 > 0,c2 > 0 such that q d l ( z , y ) 5 d z ( z , y ) 5 c z d l ( z , y ) for,all ( q v ) E K x K.) In particular, if A is a manifold, then all the distance functions arising from Riemannian metrics on M are locally equivalent, so the class of LIL CF's on M is well defined. If M is a manifold, and cp E C F ( M ) ,we will call cp of class Ck if cpt E C k ( M )for every t E TD(cp),and locally integmbly of class Ckabbreviated LICk- if cp is of class Ck and the function X l X 2 . . .XkCp : D(cp) "t P is LIB for every. k-tuple X = (XI,. . . ,x,)of smooth vector fields on M . (For k = 0, this just amounts to saying that cp is LIB.)
If A is a topological space and P is any property of CF's that makes sense on A (e.g. LIB, LL and LIL if A is metric, C^k or LIC^k if A is a manifold), then CF_P(A), CF_P(I,A) will denote, respectively, the class of all φ ∈ CF(A) that have Property P, and the set of φ ∈ CF_P(A) with time domain I.
A Carathéodory vector field, or control vector field, abbreviated CVF, on M is a time-varying map F on M such that (i) F(x,t) ∈ T_xM for every (x,t) ∈ D(F), and (ii) Fφ ∈ CF(M) for every φ ∈ C^∞(M). (By definition, (Fφ)(x,t) = F(x,t)φ, so D(Fφ) = D(F).) We use CVF(I,M) to denote the set of all CVF's on M with time domain I, and write CVF(M) := ∪_I CVF(I,M). If F ∈ CVF(M), we say that F is LL, of class C^k, LIB, LIL, LIC^k, if for every φ ∈ C^∞(M) the function Fφ is LL, of class C^k, LIB, LIL, LIC^k. If P is any property of CVF's, such as LL, C^k, LIB, LIL, LIC^k (or the LIBU property introduced below), we use CVF_P(I,M) to denote the set of those F ∈ CVF(I,M) that have P, and write CVF_P(M) = ∪_I CVF_P(I,M). (Notice that CVF_{C^0}(M) is just CVF(M).)
A curve in M is a continuous map ξ : I → M, where I = D(ξ) is an interval. The graph of a curve ξ is the subset G(ξ) := {(ξ(t), t) : t ∈ D(ξ)} of M × R. A curve ξ : I → M is locally absolutely continuous (LAC), locally Lipschitz (LL), of class C^k, if φ ∘ ξ is LAC, LL, or of class C^k for every φ ∈ C^∞(M). An arc is a LAC curve ξ such that D(ξ) is compact. In that case, ξ is in fact absolutely continuous (AC), i.e. φ ∘ ξ is AC for every φ ∈ C^∞(M). The self-explanatory notations CRV(I,M), CRV^{LAC}(I,M), CRV(M), CRV^{LAC}(M), ARC(M) stand, respectively, for the sets of (i) all curves ξ : I → M, (ii) all LAC members of CRV(I,M), (iii) all curves in M, (iv) all LAC curves in M, (v) all arcs in M. When I is compact we write ARC(I,M) rather than CRV^{LAC}(I,M).

An integral curve (abbr. IC) of an F ∈ CVF(M) is a LAC curve ξ such that D(ξ) ⊆ TD(F) and ξ̇(t) = F(ξ(t),t) for almost all t ∈ D(ξ). We use IC(F) to denote the set of all IC's of F. A CVF F has the existence property if for every point (x̄, t̄) ∈ M × TD(F) there exists a ξ ∈ IC(F) such that (i) D(ξ) is a neighborhood of t̄ relative to TD(F), and (ii) ξ(t̄) = x̄. We say
that F has the uniqueness property if, whenever ξ₁, ξ₂ are IC's of F whose graphs intersect, it follows that ξ₁(t) = ξ₂(t) for every t ∈ D(ξ₁) ∩ D(ξ₂). The Carathéodory existence and uniqueness theorems assert that (i) every LIB CVF F has the existence property, and (ii) if F is LIL, then F has the uniqueness property. It is easy to see that, if F is a CVF that has the existence property, then through every point (x̄, t̄) of D(F) there passes a maximal IC of F, i.e. an IC ξ : I → M that cannot be extended to an IC ξ̃ : Ĩ → M of F defined on an interval Ĩ that strictly contains I. (If F is LIB, then it is easily shown that a ξ ∈ IC(F) is maximal if and only if (i) G(ξ) is relatively closed in D(F) and (ii) D(ξ) is relatively open in TD(F).)

A LIBU CVF is a LIB CVF that has the uniqueness property.
A controlled curve in M is a pair γ = (ξ, F) such that F ∈ CVF(M) and ξ ∈ IC(F). The domain D(γ) of a controlled curve γ = (ξ, F) is, by definition, the domain of ξ. We call a controlled curve LIB-controlled (resp. LIBU-controlled, LIL-controlled, LIC^k-controlled) if F is LIB (resp. LIBU, LIL, LIC^k). If ℱ is a set of CVF's in M, we write IC(ℱ) := ∪_{F∈ℱ} IC(F), and we use Ccrv(ℱ) to denote the set of all controlled curves γ = (ξ, F) such that F ∈ ℱ. If P is any property of CVF's, such as LL, C^k, LIB, LIL, LIC^k, LIBU, we use Ccrv_P(ℱ) to denote the set of those γ = (ξ,F) ∈ Ccrv(ℱ) such that F has P. (In particular, Ccrv_{C^0}(ℱ) = Ccrv_{LAC}(ℱ) = Ccrv(ℱ).)

If (N, Ω) is a symplectic manifold (i.e. a manifold N equipped with an everywhere nonsingular closed 2-form Ω), and H ∈ C^k(N), k ≥ 1, then we can associate with H a vector field H⃗ ∈ Γ^{k-1}(TN), known as the Hamiltonian vector field of H. More generally, if H ∈ CF_{C^k}(N), k ≥ 1, then H⃗ is a CVF of class C^{k-1}, with time domain TD(H), defined as follows. If x ∈ N, t ∈ TD(H), then H⃗(x,t) is the vector v ∈ T_xN such that ⟨dH_t(x), w⟩ = Ω_x(v, w) for all w ∈ T_xN. The Poisson bracket {H, K} of two functions H, K ∈ C^1(N) is, by definition, the derivative of K along the IC's of H⃗, i.e. {H, K} := ⟨dK, H⃗⟩. More generally, if H, K ∈ CF_{C^1}(N), then {H, K} is defined on N × TD(H, K) by {H, K}(x,t) := ⟨dK_t(x), H⃗(x,t)⟩, i.e. by {H, K}(x,t) := {H_t, K_t}(x).
A system of canonical coordinates on a symplectic manifold (N, Ω) is a chart κ = (x^1, . . . , x^n, z_1, . . . , z_n) of N such that Ω = Σ_{i=1}^n dz_i ∧ dx^i on D(κ). If κ is a system of canonical coordinates on (N, Ω), and H ∈ C^1(N), then the expression H⃗^κ of H⃗ with respect to κ is given, for (x, z, t) ∈ D(κ) × TD(H), by H⃗^κ = Σ_{i=1}^n ( (∂H/∂z_i) ∂/∂x^i − (∂H/∂x^i) ∂/∂z_i ), so the IC's of H⃗ satisfy Hamilton's equations ẋ^i = (∂H/∂z_i)(x, z, t), ż_i = −(∂H/∂x^i)(x, z, t). The Poisson bracket of H, K ∈ CF_{C^1}(N) is then given, on D(κ) × TD(H, K), by the formula {H, K} = Σ_{i=1}^n ( (∂H/∂z_i)(∂K/∂x^i) − (∂H/∂x^i)(∂K/∂z_i) ).

An important special instance of the above situation arises when N is the cotangent bundle T*M of a manifold M, in which case N is endowed with a canonical symplectic form Ω_M := dω_M, where ω_M(w) = ⟨z, dπ_{T*M}(w)⟩ for z ∈ T*_xM, w ∈ T_{(x,z)}T*M. In that case, whenever F ∈ Γ^0(TM) and L ∈ C^0(M), and, more generally, whenever F ∈ CVF(M) and L ∈ CF(M), we can associate with the pair (F, L) the Hamiltonian H_{F,L} ∈ CF(T*M), with time domain TD(F, L), defined by

H_{F,L}(x, z, t) := z·F(x,t) + L(x,t) for x ∈ M, z ∈ T*_xM, t ∈ TD(F, L).

It is clear that H_{F,L} has any of the properties C^k, LIB, LIL, LIC^k iff both F and L do. If F ∈ CVF_{C^k}(M), L ∈ CF_{C^k}(M), and k ≥ 1, then H⃗_{F,L} ∈ CVF_{C^{k-1}}(T*M). The CVF H⃗_{F,L} is the inhomogeneous Hamiltonian lift of F with Lagrangian L. When L = 0 we just write H_F, H⃗_F, and refer to H_F, H⃗_F as the Hamiltonian and Hamiltonian lift of F. Naturally, if F and L are not time-varying, then H_{F,L} is also independent of t, so in that case H_{F,L} is just a function on T*M.

If κ = (x^1, . . . , x^n) is a chart of M, and we let z_i(x, z) = ⟨z, ∂^κ_i(x)⟩ for z ∈ T*_xM, so that z = Σ_{i=1}^n z_i(x, z) dx^i, then κ* = (x^1, . . . , x^n, z_1, . . . , z_n) (where we are writing x^i rather than x^i ∘ π_{T*M}) is a system of canonical coordinates on T*M, with domain π_{T*M}^{-1}(D(κ)), since ω_M = Σ_i z_i dx^i. If F ∈ CVF(M) and L ∈ CF(M), then H_{F,L}(x, z, t) = Σ_{i=1}^n z_i F^{κ,i}(x,t) + L(x,t) on D(κ*) × TD(F, L), where the F^{κ,i} are the κ-components of F, i.e. F^{κ,i}(x,t) := ⟨dx^i, F(x,t)⟩. Therefore the IC's of H⃗_{F,L} satisfy, with respect to the chart κ*, the equations

(3.1)    ẋ^i(t) = F^{κ,i}(x(t), t),    ż_i(t) = −Σ_{j=1}^n z_j(t) (∂F^{κ,j}/∂x^i)(x(t), t) − (∂L_t/∂x^i)(x(t), t),
i.e. ẋ = F^κ(x,t), ż = −z (∂F^κ/∂x)(x,t) − dL_t(x), where F^κ is the vector-valued function (x,t) ↦ (F^{κ,1}(x,t), . . . , F^{κ,n}(x,t)), and we are writing F^κ(x,t) and the coordinates of x (also labeled x) as column vectors and the coordinates of z as a row vector, and letting (∂F^κ/∂x)(x,t) be the square matrix whose columns are the vectors (∂F^κ/∂x^i)(x,t), i = 1, . . . , n. This shows that, if Ξ is an IC of H⃗_{F,L}, then its projection π_{T*M} ∘ Ξ is an IC of F, justifying the name "Hamiltonian lift." It is clear that, if F ∈ CVF_{LIC^k}(M), L ∈ CF_{LIC^k}(M), and k ≥ 1, then H⃗_{F,L} ∈ CVF_{LIC^{k-1}}(T*M), and H⃗_{F,L} has existence and uniqueness of IC's even when k = 1. (For k = 1, H⃗_{F,L} need not be locally Lipschitz, but (i) the equation ẋ = F^κ(x,t) only involves the components of F, which are LIC^1, (ii) the equation ż = −z (∂F^κ/∂x)(x(t),t) − dL_t(x(t)) is linear affine in z for any given function t ↦ x(t), and (iii) the matrix function t ↦ (∂F^κ/∂x)(x(t),t) and the vector function t ↦ dL_t(x(t)) are integrable if x(·) is continuous, because F and L are LIC^1.)
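As a concrete instance of (3.1) (an illustration with data chosen by me, not taken from the text): let M = R^n, let F(x,t) = Ax for a constant matrix A, and let L(x,t) = ½ x^T Q x with Q symmetric. Then

\[
H_{F,L}(x,z,t)=zAx+\tfrac12\,x^{\mathsf T}Qx,
\qquad
\dot x = Ax,\qquad \dot z = -\,zA - x^{\mathsf T}Q,
\]

with z a row vector; this is the familiar state/costate pair of linear-quadratic problems, and projecting onto the first factor indeed recovers the flow of F.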
4. Lie Differentiations; the variational and adjoint equations It is well known that, if M is a smooth manifold and : [a,b] + M is a smooth curve, then there is no canonical way to define how to differentiate along E fields of vectors or covectors along (, unless someextra structureis associated with the pair ( M ,E ) . The most widely known kind of structure that makes such a differentiation possible is an affine connection on M , but this is by no means needed. It turns out that the "adjoint equation" of the MP involves a type of differentiation of a field of covectors along a curve that need not arise from a connection; The purpose of this section is to spell out in detail precisely what additional structure is needed for a curve on a manifold M to give rise to a "Lie differentiation along 6." If M is a manifold, E is a smooth bundle over M , and E E C R V ( I ,M ) , then a section of E along 6 is a map U with domain I that assigns to each t E I a member u(t) of E(E(t)).Wease r ( E , ( )to denote the set of sections of E. along 6 . .The members of r(TM,E ) , r(T*M,(),are known, respectively, as fields of vectors and fields of covectors along E. We use l?mea*(E, E),
Γ^{L1}(E,ξ), Γ^{LB}(E,ξ), Γ^k(E,ξ) if k ∈ N, Γ^{LAC}(E,ξ), Γ^{LL}(E,ξ) to denote, respectively, the set of all σ ∈ Γ(E,ξ) that are measurable, LI, LB, of class C^k, LAC, LL. (Notice that π_E ∘ σ = ξ, so the classes Γ^k(E,ξ), Γ^{LAC}(E,ξ), Γ^{LL}(E,ξ) are void unless ξ itself is of class C^k, LAC, or LL.)

Let M be a manifold, and let (x,v) ∈ TM. A covariant differentiation, or Lie differentiation, at x in the direction of v is an R-linear map L : Γ^∞(TM) → T_xM that satisfies L(aX) = a(x)LX + (va)X(x) for all a ∈ C^∞(M), X ∈ Γ^∞(TM). We use LD(x,v) to denote the set of all Lie differentiations at x in the direction of v. It is easy to see that if L ∈ LD(x,v) then L(X) only depends on the 1-jet of X at x, so L can also be thought of as a map from J^1_x(TM) to T_xM, where J^1_x(TM) is the space of 1-jets at x of vector fields on M. Clearly, if L_i ∈ LD(x,v_i) and r_i ∈ R for i = 1, 2, then r_1L_1 + r_2L_2 ∈ LD(x, r_1v_1 + r_2v_2). So each LD(x,v) is closed under affine combinations (i.e. under linear combinations with coefficients that add up to 1). We let LD(x) = ∪_{v∈T_xM} LD(x,v), and LD[M] = ∪_{x∈M} LD(x). Then LD(x) is a linear space, and there is a natural projection π_{LD,x} : LD(x) → T_xM that assigns to each L ∈ LD(x) the unique v ∈ T_xM such that L ∈ LD(x,v). If X is a smooth vector field on M, then X gives rise to an element L_{X,x} ∈ LD(x, X(x)), defined by L_{X,x}(Y) = [X,Y](x). The map X ↦ L_{X,x} from Γ^∞(TM) to LD(x) is clearly linear. It is easily verified that the kernel of this map is precisely the set of vector fields that vanish at x together with their first derivatives. Moreover, the map X ↦ L_{X,x} is clearly surjective. So LD(x) can be canonically identified with J^1_x(TM). Therefore, if n = dim M then LD[M] is a smooth vector bundle over M, with fiber dimension n² + n, and we can associate with LD[M] a bundle LD[[M]] over TM whose fiber at each (x,v) is LD(x,v), so LD[[M]] is a bundle of affine (but not linear) n²-dimensional spaces.

If κ = (x^1, . . . , x^n) is a chart near x, and ∂^κ_i = ∂/∂x^i, then an L ∈ LD(x,v) is determined by an array {L^j_i}_{1≤i,j≤n} of coefficients, defined by L(∂^κ_i) = Σ_j L^j_i ∂^κ_j(x). The Lie derivative LY of an arbitrary vector field Y = Σ_j Y^j ∂^κ_j is then given by LY = Σ_j (LY)^j ∂^κ_j(x), where (LY)^j = vY^j + Σ_i Y^i(x) L^j_i.
We now specialize to the case when X is a vector field of class C^1 defined near x, and L = L_{X,x}. (Since L_{X,x} only depends on the 1-jet of X at x, it is well defined for any X which is of class C^1 near x.) If Y ∈ Γ^∞(TM), then the components of [X,Y] are given by [X,Y]^j(x) = X(x)Y^j − Σ_i (∂_i X^j)(x) Y^i(x). This says that, under the identification described earlier, the Lie differentiation L_{X,x} that corresponds to X has coefficients (L_{X,x})^j_i = −∂_i X^j(x).
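A quick check of the sign convention (my example): on R² take X = x^2 ∂/∂x^1 (the second coordinate function times the first coordinate field), so that ∂X^1/∂x^2 = 1 and all other partials vanish; then (L_{X,x})^1_2 = −1 and all other coefficients are zero. For any smooth Y,

\[
(L_{X,x}Y)^j
= X(x)Y^j+\sum_i Y^i(x)\,(L_{X,x})^j_i
= \sum_i\Bigl(X^i(x)\,\partial_i Y^j(x)-Y^i(x)\,\partial_i X^j(x)\Bigr)
= [X,Y]^j(x),
\]

as it should be.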
A Lie differentiation L ∈ LD(x, v) gives rise to a similar operator, also denoted by L, acting on arbitrary tensor fields on M, which assigns to a section σ of the tensor bundle T^{k,ℓ}M := (⊗^k TM) ⊗ (⊗^ℓ T*M) an element Lσ of T^{k,ℓ}_x M, in such a way that the product differentiation rule L(B(σ₁,σ₂)) = B(Lσ₁,σ₂) + B(σ₁,Lσ₂) holds when B is a contraction Γ^∞(T^{k₁,ℓ₁}M) → Γ^∞(T^{k₁-1,ℓ₁-1}M) of a covariant and a contravariant index or a canonical tensor product map Γ^∞(T^{k₁,ℓ₁}M) × Γ^∞(T^{k₂,ℓ₂}M) → Γ^∞(T^{k₁+k₂,ℓ₁+ℓ₂}M). Since this extension to general tensors will not be needed here, we will only describe the one case that will actually be used, namely, the extension to fields ω of covectors, i.e. 1-forms. If ω ∈ Γ^∞(T*M), and L ∈ LD(x, v), we define Lω ∈ T*_xM by the formula ⟨Lω, y⟩ = v⟨ω, Y⟩ − ⟨ω(x), LY⟩, for y ∈ T_xM, Y ∈ Γ^∞(TM) such that Y(x) = y. The equation L(aY) = a(x)LY + (va)Y(x) implies that ⟨Lω, y⟩ does not depend on Y, so Lω is well defined as a covector at x. Also, the identity L(aω) = (va)ω(x) + a(x)Lω is easily verified, and the product rule L⟨ω, Y⟩ = ⟨Lω, Y(x)⟩ + ⟨ω(x), LY⟩ is a trivial consequence of the definition, if we interpret Lφ as vφ when φ is a function. In coordinates, if ω = Σ_i ω_i dx^i, then Lω = Σ_i (Lω)_i dx^i, where (Lω)_i = vω_i − Σ_j L^j_i ω_j(x). In particular, if X ∈ Γ^∞(TM), then (L_{X,x}ω)_i = vω_i + Σ_j (∂X^j/∂x^i)(x) ω_j(x).

If L ∈ LD(x̄, v̄), I is a nontrivial interval, ξ : I → M is a curve, t̄ ∈ I, ξ is differentiable at t̄, ξ(t̄) = x̄, and ξ̇(t̄) = v̄, then we can define the Lie derivative Lw of any continuous field I ∋ t ↦ w(t) ∈ T_{ξ(t)}M of vectors along ξ which is differentiable at t̄. We do this by first finding a time-varying vector field M × I ∋ (x,t) ↦ W(x,t) ∈ T_xM which is smooth in x for each t, jointly differentiable at (x̄, t̄), and such that W(ξ(t),t) = w(t) for t ∈ I, t near t̄. (The existence of W is obvious: let f_1, . . . , f_n ∈
Γ^∞(TM) be such that f_1(x̄), . . . , f_n(x̄) is a basis of T_{x̄}M. Write w(t) = Σ_i w^i(t) f_i(ξ(t)) for t near t̄, where the functions w^i are continuous near t̄ and differentiable at t̄. Extend the w^i arbitrarily to continuous functions on I. Let W(x,t) = Σ_{i=1}^n w^i(t) f_i(x) for (x,t) ∈ M × I.) We then define

(4.1)    Lw := L W_{t̄} + D_t W(x̄, t̄),

where D_t = ∂/∂t. It is then easy to see that Lw does not depend on the choice of the extension W. (Indeed, choose a chart κ = (x^1, . . . , x^n) near x̄, let ξ(t) have components ξ^1(t), . . . , ξ^n(t), and write

w(t) = Σ_{j=1}^n w^j(t) ∂^κ_j(ξ(t)),    W(x,t) = Σ_{j=1}^n W^j(x,t) ∂^κ_j(x).

Then W^j(ξ(t), t) = w^j(t), and

L W_{t̄} + D_t W(x̄, t̄) = Σ_{j=1}^n ( v̄ W^j_{t̄} + D_t W^j(x̄, t̄) + Σ_{i=1}^n L^j_i W^i(x̄, t̄) ) ∂^κ_j(x̄),

so that, since v̄ W^j_{t̄} + D_t W^j(x̄, t̄) = ẇ^j(t̄),

Lw = Σ_{j=1}^n ( ẇ^j(t̄) + Σ_{i=1}^n L^j_i w^i(t̄) ) ∂^κ_j(x̄),

proving that Lw does not depend on W.) In exactly the same way, we can define the Lie derivative Lw of a continuous field I ∋ t ↦ w(t) ∈ T*_{ξ(t)}M of covectors along ξ which is differentiable at t̄. The resulting formula, in
coordinates, is as follows: if w(t) = Σ_i w_i(t) dx^i(ξ(t)), then

Lw = Σ_i ( ẇ_i(t̄) − Σ_j L^j_i w_j(t̄) ) dx^i(x̄).
With the above terminology, an afine connection on M is a smooth section V of LD[[M]]-i.e. a choice, for each (z,v) E TM, of a Lie differentiation V: E L D ( q v) depending smoothly on (S,v)-such that V;, regarded as a member of the linear space LD(s), depends linearly on v for each 2. On the other hand, if IC 2 1, then a vector field X E l?((TM) givesrise to a section Cx-called Lie differentiation along X - o f class of the bundle L D [ q , defined by letting C x ( 2 ) = L,J. It then (% x ()z)) for all 2-and follows that nLD[[M]]o & x = X-i.e. ~ L D , ~ ( ~ X= CXY = [X,Y] for all Y E rl(TM). We can also define connections (or "Lie differentiations") along a LAC curve in M defined on an interval I. Such a connection will be a map L that assigns to every t E I a member Ct of LD([(t)), in such a way that Ct E LD(((t),i(t)). It clear that there is a well defined concept of measurablity of such maps, and we shall incorporate the measarability requirement into the definition. (Naturally, C is measurable iff for every Y E P ( T M ) the map I 3 t + CtY E T M is measurable.) Moreover, it is natural to identify two connections if they agree almost everywhere, and this leads us to the following Definition 4.1. Let M be a smooth manifold, and assume that 6 : I + M is a LAC curve. A connection-or Lie differentiation, or covariant differentiation-along E is an equivalence class, modulo equality a.e., of Lebesgue measurable sections C of L D [ M along E such that R L D [ [ M I oI C = a.e., i.e. C ( t ) E LD(E(t),i(t)) for a.e. t E I. H We use LD([) to denote the set of all Lie differentiations along 4 and, for C E LD(E), we write ,Ct rather than C(t). It is clear that LD(5) is an affine (but not a linear) space.
Relative to any chart IC = ( d ,. . . ,zn) and any subinterval J of I such that ((J) D(Ic),an C E LD(() has well defined connection coeflcients L! : J + R, which are measurable functions. Let (ECRV'AC(I,M), C E LD((). If v E r o ( T M , ( ) ,W E r0(T*M,(), and v, W are a.e. differentiable, then the Lie derivatives CV,LW,are well defined a.e., and are given in coordinates by the obvious formulae
(Cw)j(t)= 6 q t ) + C L ! ( t ) v i ( t ) , (4.4) j
+
Clearly, the product formula &(w,v) = Cw.v w.Cw holds. In particular, the Lie derivatives CY,LWalong ( of C1vector fields Y and C' l-forms W are well defined-via CY C(Y o E ) and Cw gfC(w o e)-as fields of vectors or covectors along (. We call an C E LD(() locally integrable (LI) if the scalar function w . t Y is L1 for all Y E r w ( T M ) W , E P ( T * M ) .(In view of the product rule, this is equivalent to saying that Cw.Y is L1 for all Y ,W , since t + ( w . Y ) ( < ( t ) ) is LAC.) We use LDL'(() to denote the class of L1 connections along It is clear that C is L1 iff I can be covered by. intervals Ja that are mapped to domains of coordinate charts relative to which the coefficients L! are LI, and in that case the L! are L1 for every chart n and every interval J such that ((J)C D ( K ) . If C E LDL'(() and v E rLAC(TM,E), then the components ( C W )of~ CV relative to a chart are LI, so CV E rL'(TM,(). So C induces an Rlinear map-labeled Cvec-from rLAC(TM, () to rL'(TM,4). Moreover, given any y E rL'(TM,(),? E I, 'ii E T - M , there exists a unique €(t) w E r L A C ( T M , (such ) that Cvecv = y and v@) = F. (All this follows trivially if ((I) is contained in the domain of a chart, because the system of equations d j E{L ( ( t ) g j ( t )= d ( t )is linear and the functions L: are integrable, and then the conclusion can be easily globalized over all of I . ) In particular, C""" is onto, and dimKer C""" = dimM. Similarly, C induces a surjective map % i f
<.
+
C""": r L A C ( T * M , (+ ) rL1(T*M,(),
such that forevery z E M , E ) , 3 E I , Ti7 E T;(q M , there exists a unique C E YLAC(T* M , 6) such that LcovC= z and C@) = m. The equations Lvecv= p, Lcov<= z are known, respectively, as the inhomogeneous variational equation (IVE) and inhomogeneous adjoint equation (IAE) arising from ( and L, forgiven p E r L r ( T M , ( )and z E rL'(T*M,().(If p or a vanishes, then we just talk about the variational equation (VE) and adjoint equation (AE).) We will use I V V ( ( ,C,p), IAV(<,L,z)to denote the sets of solutions of Lvecv= p and LcovC = z, and write V V (
+
b
(4.5)
C(b).v(b)- C(a).v(a)= s(LC.v
+ (.Lv)(t)dt
for a, b E I.
a
In particular, (4.6)
the function t +C(t).v(t)is constant if v E V V ( ( ,L ) , C E AV((,C ) .
One obvious way to define a connection along a LAC curve ( is to construct a LICl CVF M x D(()3 ( z , t ) + F ( z , t ) , such that ( ( t ) = F ( < ( t ) t, ) for almost every t. Given F , we can define E LDL'(E) by LE'" %*LE ( t ) , F tso , that, if Y E r l ( T M ) ,then (4.7)
C f l F ( Y )= [ F t , Y ] ( ( ( t ) ) for a.e. t
Relative to a chart (4.8)
IC
E D(().
as above, we have
c(
BFj ( L f f F ( Y ) )=j d(Yjo c)(t)- & - Y i ()( ( t ) )for a.e. t. dt i=l
In the terminology of Section 3, constructing such a field F amounts to realizing 4 as the first component of a Cl-controlled curve ( ( , F ) . It is easy to see - using (4.4) and (4.8)- that the coefficients L! fo LclF are given by L! = - s o ( . h particular, if F is LICl then E LDLr(().So every LIC? CVF F such that ( E I C ( F ) gives rise to a LI Lie diferentiation
along (. Equivalently, to every LIC-controlled curve y = ((, F ) there -or LTcorresponds in a canonical way a LI Liedifferentiation along (. It is easy to show that, conversely, every L1 connection along a LAC curve arises from some F , and F can in fact be taken to be LICm: Proposition 4.1. Let ( : I + M be a LAC curve, and 'let L be a LI connection along (. Then C = ttgF for some L I P control vector field F with domain M x I .
Proof. It clearly suffices to assume that ( is contained in the domain Cl of a chart /c = ( d ,. . ,z"), and I is compact, Let L! : I + R be the coefficients of C relative to /c. Let Ft(z) = CY=, ( i j ( t )- ~ y Li(t)(zi = ~ -
.
( i ( t ) ) ) 8 r ( x ) .Then
F
satisfies the desired condition, except for the fact
that the Ft are only defined on Q. To get a globally defined F , multiply F by a cp E Cm(M)that equals 1 near ((I)and is such that support(cp)
ca. W Let y = (5, F) be a LIC1-controlled curve. Suppose W E rLAC(TM, E), and cp is a function of class C1 on some open set that contains ( ( J ) . Let t E J, p = ((t).Then (.C:v)cp = dcp(p).L:v = dcp(((s)).w(s)(Czdcp).w(t).On the other hand, if X is a smooth vect;r=held, then
$1
(C:dcp).X(p) = Ft(p)(dcp.X)- dcp(P).L:x = W P ) ( X c p )- dcpOl)*[Ft,XIb) = (FtXcp)(P)- W
t , XIcp>(P>
= (XFtcplb)= d(Ftcp)WX(Io). Therefore Czdcp = d(Ftp)(p),and then (4.9)
So, if E T,M, and 0 is any set of C1 functions cp near ( ( t ) whose differentials at ( ( t )span T&+kf,we have
(4.10)
CTv = B iff
I
de
a=t
v(s)cp = w(t)(Ftcp)+j&ofor all cp E 0.
In particular, if IC = ( x 1 , .. , ,x n ) is a chart near [ ( t ) , and we apply the above with cp = x i , we get the familiar formulation of the IVE in coordinates:
(4.11) Similarly, if 5 E rLAC(T*M, E ) , and X is a vector field of class <(t),we have
C' near
(LT<).X(p)=
(4.12)
So, if f E T,*M, and f& is any set of C' vector fields X near [ ( t ) whose values at <(t)span TE(,)M,we have
= C ( t ) . [ F t , X ] ( [ ( t+ ) )Z . X ( [ ( t ) ) for all X In particular, for a chart IC near [ ( t ) , by taking X = coordinate formulation of the IAE:
E Q.
&,
we get the
(4.14) The preceding remarks contain the proof of most of the following:
Proposition 4.2. Let 7 = (<,F) be a LIO-controlled curve in a manifold M , and let L E CFLICI( M ) be suchthat D ( 7 ) E T D ( L ) . Let E rLAC(T*M,E). Let w ( t ) = dLt(E(t)). Write E ( t ) = (E(t),C(t)), I = D([).Let @ be a set of functions, of class C1 near B ( I ) , such that { d K ( B ( t ) ): K E @}spans T&,)T*M for every t E I . Let f& be a set of vector fields of class C1 near [ ( I ) , such that { X ( [ ( t ) ): X E @}spans Tc(tlMfor every t E I . Then the following conditions are equivalent:
<
<
(i) satisfies the IAE L T < + W = 0, (ii) & < ( t ) . X ( [ ( t )= ) <(t).[~~,~]( [X( Lt t)()E ( t ) ) a.e. for all vector fields X that are of class C1on some open set containing [ ( I ) ,
-
(iii) & ( t ) . X ( ( ( t ) ) = C ( t ) . [ F t , X ] ( ( ( t ) )X L t ( ( ( t ) ) a.e. f o r all X E q , (iv) z is a n integral curve of iiF,L, (v) $ K ( E ( t ) ) = { H F t , L t , K } ( B ( t ) )a.e. f o r all functions K that are of class C1 on a neighborhood of B in T*M, (vi) $ K ( E ( t ) ) = { H F ~ , L , , K } ( Z ( ~ ) a.e. ) for all K E 0.
Proof, The equivalence of (i), (ii) and (iii) has already been proved. The equivalence of (i) and(iv) followsby comparing the coordinate expression (4.14)of the IAE (with .Ti = -&(((t))) with the formula (3.1) for the IC’s of I?F,L. The implication (iv)+(v) is a trivial consequence of the definition of the Poisson bracket, and(v)+(vi) is trivial. Finally, (vi)=+(iv) because, if (vi) holds and t E I is such that $ K ( S ( t ) ) = { H F $ , LK~ }, ( S ( t ) )for all K E 0, and Z is differentiable at t , then d K ( E ( t ) ) . s ( t )= dK(B(t)).I?F,L(B(t)) for all K E ,!V so s(t)= &(W)). Definition 4.2. If M is a manifold, 7 = ( E l l ’ ) is a LIC1-controlled G T D ( L ) , then curve in M , and L E C F ( M ) is LICl and such that D(<) a solution 5 of the IAE 1275 W = O-where W ( $ ) = dLt(t(t))-is called an L-adjoint vector along 7 . If L = 0 we just call C an adjoint vector along
+
Y.
We use Adj((,F, L ) , or Adj(y, L ) , to denote the set of all L-adjoint vectors along?. If L = 0, we just write Adj(<,F),or Adj(7). If n = dimM, then it is clear’that Adj(y,L ) is an n-dimensional affine space, and Adj(7) is linear. We stress that we have not defined “adjoint vectors along a curve’’ in M . The definition of an adjoint vector makes sense for a LI@-controlled curve. If y = ( E , F), then the space Adj(y) will in general depend on F and not on ( alone. 8
Of the various equivalent conditions of Proposition 4.2, the most important one is (ii), so we will reformulate it in terms of "generalized switching functions." The generalized switching function φ_{X,ζ,γ} associated with a vector field X along (γ, ζ) is the real-valued function on D(γ) given by
~ x , ~ ,kf ~ ((( (t t))X(((t))). , Theformula of Condition (ii) then says that
Therefore:
AV
If y = (S, F) is a LIC1-controlled curve in M , L E CFcl(M), I = D([)C T D ( L ) ,and is a LAC field of covectors along 7,then C E Adj(7,L) if (4.15) holds for every vector field X of class C’ on a neighborhood ofS.
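As an illustration (mine, not from the text), consider the classical single-input control-affine case: F_t = f + u(t) g with smooth vector fields f, g on M, L = 0, and an adjoint vector ζ along γ = (ξ, F). Taking X = g in (4.15), the switching function φ(t) = ⟨ζ(t), g(ξ(t))⟩ satisfies

\[
\dot\varphi(t)=\bigl\langle\zeta(t),[\,f+u(t)g,\;g\,](\xi(t))\bigr\rangle
=\bigl\langle\zeta(t),[f,g](\xi(t))\bigr\rangle
\quad\text{for a.e. } t,
\]

since [g,g] = 0; this is the standard differentiation rule for switching functions used in bang-bang arguments.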
We conclude this section by describing the generalization of the AVE to the case of a LIL-controlled curve. If M and N aresmooth manifolds, and $ : M -+ N is a locally Lipschitz map, then the well-known Rademacher Theorem says that $ is differentiable almost everywhere. Ifwe let Diff($) be the set of points where t)J is differentiable, then D$(.) is a welldefined linear map from T,M to T$(,)N for each z E Diff($). The graph G(D($))of D$ is : z E Diff(M)}, which is a subset of the bundle the set ((2, $(S),Dl(($)) J I C w ( M ,N ) of l-jets of smooth maps from M to N . (If we use Hom(X, Y ) to denote the set of linear maps from X to Y ,whenever X ,Y are linear spaces, then an element Q of J I C w ( M ,N ) can be thought of in a natural way as a triple (z,p, L), where z E M , y E N , and L E Hom(T,M, T ! N ) . So J I C w ( M ,N ) can be identified with the bundle Hom(TM,TN) Over M x N whose fiber at (z, y) is Hom(T,M,T’N).) The Clarke generalized Jacobian of a locally Lipschitz map $ : M + N at a point z E M is the subset B$(z) of Hom(T,M,T$(,$V) whichis the convex hull of the set of all limits lim+,w D$(zj), ranging over all sequences {xi} in Diff(t)J) such that z j + z and limj-toc D $ ( s j ) exists. It is clear that B$(z) is compact and convex. In the special case when N = T M and t)J is a locally Lipschitz vector field on M , then each D$(z), z E Diff($), can be regarded as a l-jet of vector fieldson M , i.e. as a member of LD(z), so B$(z) is a compact convex subset of LD(z).
Definition 4.3. Let y = (5, F ) be a LIL-controlledcurve in M . A connection associated with y is a measurable map D(<)3 t + Ct E LD(E(t))such that Ct E BFt(t(t))for a.e. t E D(5). We write C’’ to denote the set of all connections along y. m The LIL analogue of the IAE is the “inhomogeneous adjoint inclusion’’ (abbr. IAI),defined as follows. Suppose y = ( E , F) is a LIL-controlled curve in M , and L E C F ( M )is LIL and such that I = D(y) C ‘TD(L). Then we can define the “augmented” map F L : M x ‘TD(F,L ) + T M x W given by F L ( z , t )= ( F ( z , t ) , L ( s , t For ) ) . fixed t , the map F: is Lipschitz, so it has a well defined Clarke generalized Jacobian OF:. For each I E M , OF:(z) is a compact convex subset of the product L D ( z )xT,*M, We will use to denote the set of all measurable selections of OFL o y, so the members of LYgL are the measurable maps I 3 t + (Ct,wt) E LD(<(t))x T‘((,yI4 such that (Ct,ut) E 8 F t ( c ( t ) )for almost all t E I .
Definition 4.4. If M is a manifold, y = ( E , F ) is a LIL-controlled curve in M , and L E C F ( M )is LIL and such that D ( t ) C T D ( L ) ,then a solution of the IAI for y and L is a LAC field of covectors C along 4 such that [IAIS]
there ,exists a selection (C,W ) E icy’Lfor which
-(cg)(t)= wt(<(t))
for a.e. t E D([).
A solution C of the IAI for y and L will be called an L-adjoint vector along y. If L = 0 we just call ( a solution of the AI (adjoint inclusion) or an adjoint vector along y. We use Adj(6, F, L ) , or Adj(y, L ) , to denote the set of all L-adjoint vectors along y. If L = 0,we just write Adj((,F), or Adj(r). 9 Remark 4.1. If C is a solution of the IAI for y and L, corresponding to a selection (C,W ) E .CyvL, then in particular .C E Cr and W is a selection of OL 0 y. so [IAIS*]
( satisfies the equation
-C,(
E 8Lt
a.e. for some ( t , w ) E P L .
Notice, however, that [IAIS”]is in general strictly weaker than [IAIS], since a F t ( z ) aFt(z) x a L t ( z ) ,but the inclusionneed not be an equality. m
5 . Control systems, reachable sets, lagrangians Definition 5.1. A control system is a triple C = (M,U, F ) such that (i) M is a manifold, (ii) U is a set, and (iii) F = { F , , } l l Eis~ a family of control vector fields on M . The manifold M is the state space of C, U is the open-loop control space, the elements of U are the open-loop controls or, simply, controls, and F is the dynamics. a If C = (M,U,F) is a control system, we use F@), P ( C ) to denote, respectively, the sets {Fq: q E U} and {P : (3F E F(C))(3J E Z)(J C 7’0( F )A F^ = F [ M x J)} . We say that a CVF G E C V F ( M ) occurs in C if G E F(C). If U^ is any subset of U ,we will use C [U^to denote the restriction of C to U^, i.e. the system (M,U^,F[U^), where F[U^ = {Fv: 17 E If P is any property of CVF’s (for example, LIB, LIBU, LIL, or LIC”), then a control q E L4 will be called P-admissible (for example, LIB-admissible) or UP,^, for C if the corresponding CVF F,, satisfies P. We will use UF,F, to denote the class of all P-admissible q E U ,and write C[P for C[Up,,.
c}.
Definition 6.2. A controlled trajectory of a system C = (M,U, F) is a pair y = (t,q) such that q E U and ([,F,) is .a controlled curve. If D(<)is compact, then y is a controlled arc of C. The domain D(?)of y is the domain o f t . We use Ctraj(C) (resp. Carc(C)) to denote the set of all controlled trajectories (resp. controlled arcs) of C. A trajectory (resp. arc) of C is a E such that ([,v) E Ctraj(C) (resp. ([,v) E Carc(C)) for some q E U.If P is any property of CVF’s, such as LIB or LIL or LIC1, a 7 = ([,v) E Ctraj(C) is P-controlled if q is P-admissible. We use Ctrajp(C) (resp. Carcp(C)) to denote the set of all P-controlled trajectories (resp. ?-controlled arcs) of C. m Often, we will use the self-explanatory expression “the control system 2 = F,(z,t), z E M , q E U” as a name for the control system (M,U, F).
Definition 5.3. Let C = (M,U,F) be a control system and let a, b ∈ R, a ≤ b. If p, x ∈ M, we say that x is C-reachable from p over [a,b] if there exists a controlled arc (ξ,η) of C such that ξ(a) = p and ξ(b) = x. We say that x is C-reachable from p if it is C-reachable from p over [a,b] for some a, b. We define R^C(p) (resp. R^C_{[a,b]}(p)), the C-reachable set from p (resp. the C-reachable set from p over [a,b]), to be the set of all points x ∈ M that are C-reachable from p (resp. C-reachable from p over [a,b]). ∎

Definition 5.4. If C = (M,U,F) is a control system, a Lagrangian for C is a family {L_η}_{η∈U} of Carathéodory functions on M such that TD(L_η) = TD(F_η) for all η ∈ U. ∎

Given a Lagrangian L for C = (M,U,F), we would like to define the cost functional J^{L,C} : Carc(C) → R by letting

J^{L,C}(γ) = ∫_{D(γ)} L_η(ξ(t), t) dt

if γ = (ξ, η). This is possible provided that the integral makes sense. If we want to use Lebesgue integration, then the minimal requirement, since the integrand is guaranteed to be measurable, is that the function λ_{γ,L} : t ↦ L_η(ξ(t),t) be bounded below or above by an integrable function. (In that case, the integral makes sense, although its value could be +∞ or −∞.) Let us call a γ ∈ Ctraj(C) weakly acceptable for L if at least one of the functions λ⁺_{γ,L} : t ↦ max(L_η(ξ(t),t), 0), λ⁻_{γ,L} : t ↦ −min(L_η(ξ(t),t), 0) is locally integrable, and acceptable for L if λ_{γ,L} is locally integrable. If γ is weakly acceptable for L and has domain [a,b], then the function ρ_{γ,L} : [a,b] → [−∞, +∞] given by ρ_{γ,L}(t) = ∫_a^t L_η(ξ(s), s) ds is the running cost along γ. (More generally, any function ρ : D(γ) → [−∞, +∞] such that ρ(t) = ∫_α^t L_η(ξ(s), s) ds + r for some finite constant r and some α ∈ D(γ) will be called a running cost for γ.) We define J^{L,C}(γ) = ρ_{γ,L}(b). So J^{L,C} is well defined as a map from Carc_{wa}(C,L) to [−∞, +∞], where Carc_{wa}(C,L) is the set of all γ ∈ Carc(C) that are weakly acceptable for L.
Naturally, one could also allow integrability in the sense of some more general conditionally convergent integration theory, but this possibility appears to be without interest and will not be pursued here.
The augmented system C^L associated with a system C = (M,U,F) and a Lagrangian L = {L_η : η ∈ U} is the system

(5.1)    ẋ = F_η(x,t),    ẋ⁰ = L_η(x,t),    η ∈ U,

in which "the running cost x⁰ has been added as a new state variable." That is, C^L is the triple (M × R, U, F^L) where, for η ∈ U, F^L_η ∈ CVF(M × R) is given by F^L_η(x, x⁰, t) = (F_η(x,t), L_η(x,t)), with domain {(x, x⁰, t) : x ∈ M, x⁰ ∈ R, t ∈ TD(F_η)}. According to our general definitions, a trajectory of the augmented system for a control η ∈ U is a pair (ξ, ξ⁰) such that ξ is a trajectory of C for η, (ξ,η) is acceptable for L, and ξ⁰ is a running cost for (ξ, η). In particular, a controlled arc γ = (ξ, η) of C may fail to give rise to controlled arcs (ξ, ξ⁰, η) of C^L, since γ may fail to be acceptable for L.

We now define the key substitution properties needed to construct "control variations," as required by the proof of the MP sketched in [22] and [23]. If F ∈ CVF(M) and J is a subinterval of TD(F), then F|J will denote the restriction of F to M × J, so TD(F|J) = J.
Definition 5.5. Let F, G be CVF's on M, and let E ⊆ R. Then

1. SUB(E, F, G) (i.e. "the CVF obtained from F by substituting G over E") is defined if E ⊆ TD(F, G). In that case, SUB(E, F, G) has time domain TD(F), and

(5.2)    SUB(E, F, G)(x,t) = χ_E(t) G(x,t) + (1 − χ_E(t)) F(x,t) for (x,t) ∈ D(F).

2. Assume that E = [a,b] is a subinterval of TD(F). Then DEL(E, F) (i.e. "the CVF obtained from F by removing F|E") is the CVF on M with time domain T = ((−∞, a] ∩ TD(F)) ∪ {t : t ≥ a, t + b − a ∈ TD(F)}, such that, for (x,t) ∈ M × T,

(5.3)    DEL(E, F)(x,t) := χ_{(−∞,a]}(t) F(x,t) + (1 − χ_{(−∞,a]}(t)) F(x, t + b − a).
3. Assume that E = [a,b] is a compact interval, and let r = b − a. Then INS(E, F, G) (i.e. "the CVF obtained from F by inserting G over E") is defined if a ∈ TD(F) and E ⊆ TD(G). In that case, the time domain of INS(E, F, G) is

(5.4)    T = ((−∞, a] ∩ TD(F)) ∪ [a,b] ∪ {t : t ≥ b, t − (b − a) ∈ TD(F)},

and the value of INS(E, F, G) at (x,t) ∈ M × T is

(5.5)    INS(E, F, G)(x,t) := χ_{(−∞,a]}(t) F(x,t) + χ_{(a,b]}(t) G(x,t) + χ_{(b,∞)}(t) F(x, t − r).
‘Definition5.6. Let C = (M,U, F) be a control system. Then 1. C is closed undersubstitutions over members of a class S of subsets of the real line if,whenever 71,72 belong to U , E E S, and SUB(E,F,,, ,Fqa)is defined, it follows that SUB(E,F,,, ,FVa)occurs in C. 2. If C is closed under substitutions over members of the class of all compact subsets (resp. measurable subsets, subintervals, compact subintervals) of W, then C will be said to have the compact (resp. measurable, interval, compact interval) substitution property. 3. C is closed underintervaldeletions if,whenever q E U and E is a compact subinterval of TD(F,,), it follows that DEL(E,F,,) occurs in C. 4. C = ( M ,U ,F) is closed under interval insertions if, whenever 71, 72 E U are two controls, and E is a compact interval such that I N S ( E ,F,,,,Fq2)is defined, it follows that I N S ( & F,,,,Fq2) occurs in C. B
6. Hamiltonian minimization and separability

Throughout this section, we fix a control system C = (M,U,F), and a Lagrangian L for C. We associate with the pair (C, L) the Hamiltonian H^{C,L}, which is the family {H_{F_η,L_η}}_{η∈U} of Hamiltonians of all the pairs
(F_η, L_η). In other words, for each η ∈ U we define H^{C,L}_η ∈ CF(T*M) by H^{C,L}_η(x, z, t) := z·F_η(x,t) + L_η(x,t), so the time domain of H^{C,L}_η is TD(F_η). Alternatively, we may view H^{C,L} as a partially defined function on T*M × R × U, with domain {(x, z, t, η) ∈ T*M × R × U : t ∈ TD(F_η)}, given by

(6.1)    H^{C,L}(x, z, t, η) := z·F_η(x,t) + L_η(x,t).

When L = 0, we omit the L and write H^C, H^C_η. If Ξ is a curve in T*M, and η ∈ U, we call η weakly minimizing along Ξ for (C, L) if (i) D(Ξ) ⊆ TD(F_η), and (ii) for every η̃ ∈ U there is a subset E(η̃) of D(Ξ) ∩ TD(F_η̃) of Lebesgue measure zero such that H^{C,L}(Ξ(t), t, η) ≤ H^{C,L}(Ξ(t), t, η̃) for all t belonging to (D(Ξ) ∩ TD(F_η̃)) \ E(η̃). We call η strongly minimizing along Ξ for (C, L) if

(6.2)    H^{C,L}(Ξ(t), t, η) = min{ H^{C,L}(Ξ(t), t, η̃) : η̃ ∈ U } for a.e. t ∈ TD(F_η) ∩ D(Ξ).

Equivalently, η is strongly minimizing along Ξ for (C, L) iff it is weakly minimizing and, in addition, the set E(η̃) of Condition (ii) can be chosen to be independent of η̃. If γ = (ξ, η) is a LIL-controlled trajectory of a system C = (M,U,F), and L is a Lagrangian for C, then a field of covectors ζ along γ will be called weakly (resp. strongly) minimizing for γ for (C, L) if η is weakly (resp. strongly) minimizing for (C, L) along the curve Ξ = (ξ, ζ). We write Adj^C(γ, L) := Adj(ξ, F_η, L_η) if C = (M,U,F) is a control system, L is a Lagrangian for C, and γ = (ξ, η) ∈ Ctraj_{LIL}(C). (The elements of Adj^C(γ, L) are called C-L-adjoint vectors along γ, or simply L-adjoint vectors along γ if the meaning of C is clear from the context.) Also, we write Adj^C_{wmin}(γ, L), Adj^C_{smin}(γ, L) to denote the classes of ζ ∈ Adj^C(γ, L) that are weakly and strongly minimizing. As usual, when L = 0 we drop the reference to L, and talk about controls η that are minimizing for C along a curve in T*M, or fields of covectors that are minimizing for C for a controlled trajectory γ. We then write Adj^C_{wmin}(γ) := Adj^C_{wmin}(γ, 0) and Adj^C_{smin}(γ) := Adj^C_{smin}(γ, 0).
The conclusion of the Maximum Principle asserts the existence of a nontrivial C E AdjEmin(T,L) along a controlled trajectory T, having certain additional properties. It is important to know when this implies the stronger property that C is strongly minimizing. It turns out that the answer is affirmative whenever a very general “separability” condition holds. To make this precise, we introduce some definitions. Definition 6.1.A control system C = ( M ,U ,F )is separable along an arc ( : [a,b] -+ M if there exists a countable subset U0 of U ,such that (i) (a,b] E 7 D ( F , ) for every 77 E UO,and (ii) for every t E [a, b],the set of values { F , ( ( ( t ) , t ) : 77 E UO}is dense in {F,(E(t),t) : 9 E U , [a,b] C 7-D(Fv)},The system C is separable if it is separable along every trajectory of C. m The following observation is completely trivial, but crucial for the passage from weak to strong minimization: Proposition 6.1. For a separable system, weak and strong Hamiltonian minimization are equivalent. m
7. Separation, approximating cones, transversality

Definition 7.1. Let S₁, S₂ be subsets of a topological space A, and let q ∈ A. We say that S₁ and S₂ are separated at q if S₁ ∩ S₂ = {q}, and that S₁ and S₂ are locally separated at q if there is a neighborhood U of q such that S₁ ∩ S₂ ∩ U = {q}. ∎

Recall that a cone in a real linear space V is a nonempty subset C of V such that v ∈ C, r ≥ 0 implies rv ∈ C.

Definition 7.2. Let S be a subset of a manifold M, and let x ∈ S. A strongly approximating cone of S at x is a closed convex cone C in the tangent space T_xM with the property that there exist a neighborhood N of 0 in T_xM and a continuous map F : N ∩ C → S such that

(7.1)    F(v) = x + v + o(‖v‖) as v → 0 through values in N ∩ C.
(Formula (7.1)has a clear meaning in local coordinates, and it is easy to see that it is in fact independent of the choice of coordinates.) An approximating cone of S at x is a closed convex cone which is the closure of an increasing union of strongly approximating cones. m
It is easy to verify that, if S is a submanifold of M, then the ordinary tangent space T_xS is an approximating cone of S at x. If S is a convex set near x relative to some chart κ (i.e. if S is such that κ(S) is convex for some chart κ near x), then the set of limiting directions of S at x is an approximating cone. (A limiting direction of S at x ∈ S is a vector v ∈ T_xM such that either v = 0 or there exist a sequence (x_k)_{k=1}^∞ of points of S and a sequence of real numbers h_k > 0 that satisfy x_k ≠ x, h_k → 0, and x_k = x + h_k v + o(h_k) as k → ∞. This in principle depends on the choice of a chart κ, but is easily shown to be in fact independent of κ.) It is easy to see that every vector v belonging to an approximating cone of S at x must be a limiting direction of S at x. Let us call a set S AC-regular at x if the set of all limiting directions of S at x is an approximating cone of S at x. Then the preceding examples simply say that every set which is convex relative to some chart is AC-regular at x.

Definition 7.3. If V is a finite-dimensional real linear space, a vector z ∈ V* is transversal (or polar) to a subset C ⊆ V if ⟨z, v⟩ ≤ 0 for all v ∈ C. We use C⁺ to denote the set of all vectors polar to C. ∎
8. The maximum principle We are now ready to state theMP. We will first present the separation theorem that constitutes the most basic form of the principle, and then show how other results-such as necessary conditions for controllability and optimality-follow from it. In its simplest form, the separation theorem gives a necessary condition for a “reference trajectory” &, : [a,b] + M of a control system C =
(M,U,F) to be such that, if p , = <*(a)and q* = &(b), then the reachable set R from p,, is locally separated from a set S at q. However, it will be convenient to formulate the basic separation result in a slightly more general form, by considering in addition a map g from a neighborhood U of q* to some other manifold N , and taking S to be a subset of N , inwhichcase the result becomes a necessary condition for g ( R ) to be locally separated from S at g(q*). Moreover, we will consider together the case when R is the “fixed time interval” reachable set R&,,,(p)and that when R is the “variable time interval” reachable set Rc(p) . The necessary condition is givenin termsof [* together witha control q* that generates&, and says that there must exist a field of covectors C along &, that satisfies four requirements, namely: (i) the “adjoint equation,” which says that C is an adjoint vector along the controlled trajectory = (&,q*),(ii) the minimization condition,that says that C is weakly minimizing (so that, if C is separable, then C is actually strongly minimizing), (iii) nontriviality, that is, 6 # 0 and, finally, (iv) transversality, which is the condition that relates C to the set S. In addition, there axe some minor variations. For example, when the system is closed under deletions and insertions, we can , ( ( ( t ) , tbe ) ) constant add the requirement that the Hamiltonian ( c ( t )F,,, as a function o f t and, in the variable time-interval case, this constant has to be 0.
From the separation theorem one can easily deduce necessaryconditions for noncontrollabilityalong ,$* (i.e. for the point q to lieon the boundary of R),and for optimality, i.e.for to minimizesome Lagrangian cost functional JL*’-or, more generally, J L * c 9 qcf. , below-among all trajectories that satisfy an endpoint condition a( E S, where S is a given subset of M x M . The noncontrollability condition is exactly like the condition for sepqation, except that in that case there is no set S, and thetransversality requirement disappears.. (Again, when the system is closed under deletions and insertions one gets a constant Hamiltonian, and if the time interval is free then the constant must vanish.) The optimality condition is exactly 1ike.the one for separation, except that now one has to use the Hamiltonian %!.cbuLfor some v 2 0.
In order to state the Maximum Principle, we make the following assumptions:
A1. C = (M,U,F) is a control system;
A2. either (i) C has the compact substitution property, or (ii) C has the interval substitution property and every F_η, η ∈ U, is LIB;
A3. a, b ∈ R, a ≤ b, p*, q* ∈ M, ξ* : [a,b] → M is a trajectory of C, p* = ξ*(a), and q* = ξ*(b);
A4. η* ∈ U is such that γ* = (ξ*, η*) is a LIL-controlled arc of C;
A5.a. N is a smooth manifold, W is a neighborhood of q* in M, and g is a map from W to N;
A5.b. g is Lipschitz, and G = ∂g(q*) is the Clarke generalized Jacobian of g at q*;
A6. S is a subset of N and C is an approximating cone of S at g(q*);
A7. either (i) C is not a linear subspace of T_{g(q*)}N, or (ii) C = {0} but g(q*) belongs to Clos(S − {g(q*)}).

We will call ξ*, η*, γ* the reference arc, reference control, and reference controlled arc, respectively. In addition to the basic hypotheses A1, . . . , A7, we will want to consider the special case when C satisfies the stronger condition

A8. C is closed under interval insertions and deletions.

The separation properties of interest to us are

(SEP-F) the sets g(R^C_{[a,b]}(p*)) and S are locally separated at g(q*),
(SEP-V) the sets g(R^C(p*)) and S are locally separated at g(q*).

Actually, the necessary condition for (SEP-V) is in fact necessary for the following weaker separation property:

(SEP-V#) for some α > 0, the sets g(R^C_{[a,b,α]}(p*)) and S are locally separated at g(q*),

where

(8.1)    R^C_{[a,b,α]}(p) = ∪{ R^C_{[a,c]}(p) : b − α ≤ c ≤ b + α }.
We recall that the Clarke generalized Jacobian (cf. Section 4) G of g : W → N at q* is a set of linear maps from T_{q*}M to T_{g(q*)}N, so every G ∈ G has a well defined transpose G^t, which is a linear map from T*_{g(q*)}N to T*_{q*}M.

Theorem 8.1 (The basic form of the Maximum Principle). Suppose that the data C, M, U, F, a, b, ξ*, η*, γ*, p*, q*, N, W, g, G, S and C are such that A1, . . . , A7 hold. Assume that (SEP-F) holds. Then there exist a θ ∈ T*_{g(q*)}N, a G ∈ G, and a weakly minimizing adjoint vector ζ ∈ Adj^C(γ*) along γ*, such that θ ≠ 0, θ belongs to the polar cone C⁺ of C, and ζ(b) = G^t θ. If in addition A8 holds, then ζ can be chosen such that the Hamiltonian t ↦ H^C_{η*}(ξ*(t), ζ(t), t) is a.e. constant. If both A8 and (SEP-V#) hold, then ζ can be chosen such that H^C_{η*}(ξ*(t), ζ(t), t) = 0 a.e. ∎
A fairly self-contained outline of the proof of Theorem 8.1 is given in [22] and [23], for the case when the state space is an open subset of a Euclidean space ℝⁿ. However, since the result of [22] and [23] allows systems with jumps, the manifold case is an immediate corollary, obtained by applying the Euclidean space result to a system where the jumps are coordinate changes.
A special case of Theorem 8.1 is a result on controllability. Let us say that a reference trajectory ξ_* : [a,b] → M is a fixed-interval boundary trajectory if ξ_*(b) is not an interior point of the reachable set R_{[a,b]}(ξ_*(a)). (The negation of this is fixed-interval local controllability along ξ_*, i.e. the property that ξ_*(b) ∈ Int(R_{[a,b]}(ξ_*(a))).) Similarly, we say that ξ_* is a variable-interval boundary trajectory if for some α > 0 ξ_*(b) is not an interior point of the reachable set R_{[a,b±α]}(ξ_*(a)). (The negation of this is variable-interval local controllability along ξ_*.) It turns out that Theorem 8.1 implies that a necessary condition for ξ_* : [a,b] → M to be a fixed-interval boundary trajectory is that there exist a nontrivial minimizing adjoint vector ζ, with the additional constancy of the Hamiltonian property if A8 holds, and the vanishing of the Hamiltonian if ξ_* is a variable-interval boundary trajectory. We state a slightly more general result, that contains the controllability condition as a particular case, obtained by taking N = W = M and g = identity.
Theorem 8.2. Let Σ, M, U, F, a, b, ξ_*, η_*, γ_*, p_*, q_*, N, W, g and G be such that A1, ..., A5 hold. Assume that g(q_*) is not an interior point of g(R_{[a,b]}(p_*)). Then there exist a θ ∈ T^*_{g(q_*)}N, a G ∈ G, and a weakly minimizing adjoint vector ζ ∈ Adj^Σ(γ_*) along γ_* such that θ ≠ 0 and ζ(b) = G^t θ. If in addition A8 holds, then ζ can be chosen such that t → H_Σ(ξ_*(t), ζ(t), t) is a.e. constant. If A8 holds and, for some α > 0, g(q_*) is not an interior point of g(R_{[a,b±α]}(p_*)), then ζ can be chosen such that H_Σ(ξ_*(t), ζ(t), t) = 0 a.e.

Proof. It suffices to apply Theorem 8.1 to the case where the set S is just a sequence of points converging to g(q_*) but not equal to g(q_*), and C = {0}. ∎
Remark 8.1. In Theorems 8.1 and 8.2, Assumption A5.b can be weakened, using the concept of a "semidifferentiable map," introduced in [22] and [23]. It suffices to assume that
A5.b#. g is continuous on W and semidifferentiable at q_*, and G is a semidifferential of g at q_*.
As explained in [22] and [23], this includes in particular two cases that are not directly covered by Theorem 8.1, namely, (a) when g is continuous near q_* and differentiable at q_*, and G = {dg(q_*)}; (b) when g is Lipschitz and G is a Warga derivate container of g at q_*. ∎
Finally, we turn to the optimality result. We now assume that we are given a collection of data Σ, M, U, F, a, b, ξ_*, η_*, γ_*, p_*, q_* satisfying A1, A2, A3, A4 as in the previous theorems, and in addition:
OPT1.a. W is a neighborhood of (p_*, q_*) in M × M, and φ : W → ℝ is a function,
OPT1.b. φ is Lipschitz, and G = ∂φ(p_*, q_*) is the Clarke generalized gradient of φ at (p_*, q_*),
OPT2. S is a subset of M × M such that (p_*, q_*) ∈ S, and C ⊆ T_{p_*}M × T_{q_*}M is an approximating cone of S at (p_*, q_*),
OPT3. L is a Lagrangian for Σ, and L_{η_*} is LIL.
Let us write Carc_L(Σ) to denote the set of γ ∈ Carc(Σ) that are acceptable for L. We can then consider the variable-interval optimal control problem:
PROBLEM P(Σ, L, S, φ): minimize the cost functional

(8.2)    J_{L,Σ,φ}(γ) =_def ∫_a^b L_η(ξ(t), t) dt + φ(ξ(a), ξ(b))

in the set of all triples (a, b, γ) such that γ = (ξ, η) ∈ Carc_L(Σ), D(ξ) = [a,b], and (ξ(a), ξ(b)) ∈ S, and, for a given a, b, the fixed-interval optimal control problem:

PROBLEM P_{[a,b]}(Σ, L, S, φ): minimize J_{L,Σ,φ}(γ) in the set of all controlled arcs γ = (ξ, η) ∈ Carc_L(Σ) such that D(ξ) = [a,b] and (ξ(a), ξ(b)) ∈ S.
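For readers who want to see the bookkeeping of (8.2) in concrete form, here is a small numerical sketch; it is an added illustration (not from the original text), it assumes NumPy is available, and the trajectory, running cost L and endpoint cost phi used below are made-up placeholders.

    # Sketch: evaluating a cost of the form (8.2) for a sampled controlled arc.
    import numpy as np

    def cost(ts, xs, us, L, phi):
        """J = integral of L along the arc plus the endpoint cost phi(xi(a), xi(b)),
        approximated with the trapezoidal rule on the samples."""
        vals = np.array([L(x, u, t) for x, u, t in zip(xs, us, ts)])
        return np.trapz(vals, ts) + phi(xs[0], xs[-1])

    # Example data: scalar trajectory x(t) = t^2 with control u(t) = 2t,
    # running cost u^2/2 and endpoint cost (x(b) - 1)^2.
    ts = np.linspace(0.0, 1.0, 101)
    xs = ts ** 2
    us = 2 * ts
    print(cost(ts, xs, us,
               L=lambda x, u, t: 0.5 * u ** 2,
               phi=lambda xa, xb: (xb - 1.0) ** 2))   # approximately 2/3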
An important particular case of variable-interval optimal control is minimum time control, in which the Lagrangian is equal to 1 and the function φ is zero. (Usually, such problems are considered for autonomous systems, in which case one can always take a to be 0.) We will use P_time(Σ, S) to denote the problem P(Σ, 1, S, 0). To get necessary conditions for these problems, we will apply Theorem 8.1 to an auxiliary system Σ̂. This will require that Assumption A2 be satisfied for the augmented system Σ^L rather than just for Σ, i.e. that
OPT4. either (i) Σ^L has the compact substitution property, or (ii) Σ^L has the interval substitution property and F_η and L_η are LIB for all η ∈ U.
Similarly, we will need the analogue of A8 for Σ^L:
OPT5. Σ^L is closed under interval insertions and deletions.
With this terminology, we have
Theorem 8.3 (The optimal control version of the Maximum Principle). Assume that Σ, M, U, F, L, γ_*, ξ_*, η_*, a, b, p_*, q_*, S, C, φ and G are such that Hypotheses A1, A3, A4, OPT1, OPT2, OPT3 and OPT4 hold, and γ_* is a solution of the minimization problem P_{[a,b]}(Σ, L, S, φ). Then there exist a constant ν ≥ 0 and a νL_{η_*}-adjoint vector ζ along γ_* such that (i) ζ is weakly minimizing for (Σ, νL) along γ_*, (ii) ζ satisfies the transversality condition

(8.3)    ⟨(−ζ(a), ζ(b)) − νλ, v⟩ ≤ 0 for all v ∈ C, for some λ ∈ G,

and (iii) either ζ(b) ≠ 0 or ν > 0. If OPT5 holds, then ζ and ν can be chosen so that the Hamiltonian ⟨ζ(t), F_{η_*}(ξ_*(t), t)⟩ + νL_{η_*}(ξ_*(t), t) is a.e. constant. If OPT5 holds and (a, b, γ_*) is a solution of the free-interval problem P(Σ, L, S, φ), then ζ and ν can be chosen so that the constant value of the Hamiltonian is zero.

Proof. We apply Theorem 8.1 to an auxiliary control system Σ̂, defined on the interval [a − 1, b], whose state consists of a pair x₁, x₂ of variables in M, as well as a real variable y, and whose dynamical law is

    ẋ₁ = ẋ₂ = v,  ẏ = 0,  v arbitrary,  for a − 1 ≤ t < a,
    ẋ₁ = 0,  ẋ₂ = F_η(x₂, t),  ẏ = L_η(x₂, t),  for a ≤ t ≤ b.
Precisely, we define M̂ = M × M × ℝ, and Û = Ũ × V, where Ũ = {η ∈ U : [a,b] ⊆ TD(F_η)}, and V is the set of all piecewise constant maps from the interval [a − 1, a] to the space Γ^∞(TM) of smooth vector fields on M. For η̂ = (η, V) ∈ Û and (x₁, x₂, y) ∈ M̂, we let F̂_η̂(x₁, x₂, y, t) = (V(t)(x₁), V(t)(x₂), 0) for a − 1 ≤ t < a, and define F̂_η̂(x₁, x₂, y, t) = (0, F_η(x₂, t), L_η(x₂, t)) for a ≤ t ≤ b. We then take Σ̂ = (M̂, Û, F̂).
We let σ_*(t) = ∫_a^t L_{η_*}(ξ_*(s), s) ds for a ≤ t ≤ b (so σ_* is the running cost along γ_*). We then let η̂_* = (η_*, 0), and define ξ̂_* : [a − 1, b] → M̂ by letting ξ̂_*(t) = (p_*, p_*, 0) for a − 1 ≤ t < a, and ξ̂_*(t) = (p_*, ξ_*(t), σ_*(t)) for a ≤ t ≤ b. We then let γ̂_* = (ξ̂_*, η̂_*). Then γ̂_* is a controlled arc of Σ̂, such that ξ̂_*(a − 1) = ξ̂_*(a) = (p_*, p_*, 0) and ξ̂_*(b) = (p_*, q_*, c), where c = σ_*(b).
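The device used here, adding the running cost as an extra state coordinate, is the standard state-augmentation trick; the following toy sketch (an added illustration with made-up dynamics, assuming NumPy) shows the bookkeeping: the last coordinate of the augmented trajectory reproduces the running cost σ_*(t).

    # Sketch: augmenting the state with the running cost, as in the proof above.
    import numpy as np

    def F(x, t):
        return -x + np.sin(t)          # made-up dynamics on R

    def L(x, t):
        return 0.5 * x ** 2            # made-up running cost

    def augmented_rhs(z, t):
        x, y = z                        # y is the accumulated-cost coordinate
        return np.array([F(x, t), L(x, t)])

    a, b, n = 0.0, 2.0, 2000
    dt = (b - a) / n
    z = np.array([1.0, 0.0])            # start at x(a) = 1 with zero accumulated cost
    cost_direct = 0.0
    for k in range(n):
        t = a + k * dt
        cost_direct += L(z[0], t) * dt   # integrate the Lagrangian directly
        z = z + dt * augmented_rhs(z, t) # Euler step of the augmented system
    print(z[1], cost_direct)             # identical by construction; both approximate sigma_*(b)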
We let N = M̂, and define g by g(x₁, x₂, y) = (x₁, x₂, y + φ(x₁, x₂)). Then g is defined on a neighborhood of (p_*, q_*, c) and is Lipschitz. Pick a smooth function ψ : M × M → ℝ such that ψ(p_*, q_*) = 0 and ψ(x₁, x₂) > 0 for (x₁, x₂) ≠ (p_*, q_*). Then define c_* = c + φ(p_*, q_*), and

(8.4)    Ŝ = {(x₁, x₂, y) ∈ N : (x₁, x₂) ∈ S and y ≤ c_* − ψ(x₁, x₂)}.

Then g(p_*, q_*, c) = (p_*, q_*, c_*) ∈ Ŝ. If γ̂ = (ξ̂, η̂) ∈ Carc_L(Σ̂) is such that D(ξ̂) = [a − 1, b], ξ̂(a − 1) = (p_*, p_*, 0), and in addition ξ̂(b) is of the form (q₁, q₂, y) with g(q₁, q₂, y) ∈ Ŝ, then (q₁, q₂) ∈ S and y + φ(q₁, q₂) ≤ c + φ(p_*, q_*) − ψ(q₁, q₂). On the other hand, the fact that γ_* is a solution of P_{[a,b]}(Σ, L, S, φ) easily implies that y + φ(q₁, q₂) ≥ c + φ(p_*, q_*). These two inequalities imply ψ(q₁, q₂) = 0, so q₁ = p_*, q₂ = q_*. Then y = c as well, so g(q₁, q₂, y) = (p_*, q_*, c_*). Therefore the sets g(R̂_{[a−1,b]}(p_*, p_*, 0)) and Ŝ are separated at (p_*, q_*, c_*). Moreover, if γ_* actually solves the free interval problem, then for any α > 0 the sets g(R̂_{[a−1,b±α]}(p_*, p_*, 0)) and Ŝ are separated at (p_*, q_*, c_*). It is easy to see that Ĉ = C × (−∞, 0] is an approximating cone for Ŝ at (p_*, q_*, c_*). Moreover, Ĉ is not a linear space, so Theorem 8.1 can be applied to obtain a weakly minimizing adjoint vector ζ̂ along γ̂_* for the system Σ̂, a θ̂ ∈ T^*_{(p_*,q_*,c_*)}N \ {0} such that θ̂ ∈ Ĉ^†, and a Ĝ ∈ ∂g(p_*, q_*, c) such that ζ̂(b) = Ĝ^t θ̂.
Now ζ̂(t) ∈ T^*_{ξ̂_*(t)}(M × M × ℝ), and ξ̂_*(t) = (p_*, ξ_*(t), σ_*(t)) for a ≤ t ≤ b, ξ̂_*(t) = (p_*, p_*, 0) for a − 1 ≤ t < a. Then ζ̂(t) = (ζ₁(t), ζ₂(t), ν(t)), where ζ₁(t) ∈ T^*_{p_*}M for t ∈ [a − 1, b], ζ₂(t) ∈ T^*_{p_*}M for t ∈ [a − 1, a], ζ₂(t) ∈ T^*_{ξ_*(t)}M for t ∈ [a, b], and ν(t) ∈ ℝ. On the interval [a, b] the adjoint equation says that ν̇(t) = 0, ζ̇₁(t) = 0, and −ζ̇₂(t) ∈ (ζ₂(t), ν(t)) · ∂F^L_{η_*,t}(ξ_*(t)), where F^L_{η_*,t}(x) = (F_{η_*}(x, t), L_{η_*}(x, t)). So ζ₁ and ν are constant on [a, b]. On [a − 1, a] the adjoint equation says that ζ₁, ζ₂ and ν are constant. Moreover, the Hamiltonian minimization condition on [a − 1, a] implies that ζ₁(t) + ζ₂(t) = 0 for t ∈ [a − 1, a].

If we define ζ(t) = ζ₂(t) for t ∈ [a, b], it follows easily that ζ ∈ Adj^Σ(γ_*, νL_{η_*}). On the other hand, the Clarke generalized Jacobian ∂g(p_*, q_*, c) is easily seen to be equal to {G_λ : λ ∈ ∂φ(p_*, q_*)}, where
G_λ : T_{p_*}M × T_{q_*}M × ℝ → T_{p_*}M × T_{q_*}M × ℝ is the map (v₁, v₂, r) → (v₁, v₂, r + λ(v₁, v₂)). (Recall that each λ is a map from T_{p_*}M × T_{q_*}M to ℝ.) For λ ∈ ∂φ(p_*, q_*), the transpose G_λ^t is therefore the map from T^*_{p_*}M × T^*_{q_*}M × ℝ to T^*_{p_*}M × T^*_{q_*}M × ℝ given by

(8.5)    G_λ^t(θ₁, θ₂, ρ) = ((θ₁, θ₂) + ρλ, ρ) ∈ T^*_{(p_*,q_*)}(M × M) × ℝ ≅ T^*_{p_*}M × T^*_{q_*}M × ℝ.

In particular, if we let θ̂ = (θ₁, θ₂, ρ), and pick λ such that G_λ = Ĝ, then Ĝ^t(θ̂) = ((θ₁, θ₂) + ρλ, ρ). Therefore (−ζ(a), ζ(b), ν) = (−ζ₂(a), ζ₂(b), ν) = (ζ₁(a), ζ₂(b), ν) = (ζ₁(b), ζ₂(b), ν) = ζ̂(b) = Ĝ^t(θ̂) = ((θ₁, θ₂) + ρλ, ρ). Hence ν = ρ and (−ζ(a), ζ(b)) − νλ = (θ₁, θ₂). Since θ̂ belongs to the polar cone of Ĉ = C × (−∞, 0], we get ν = ρ ≥ 0 and ⟨(θ₁, θ₂), v⟩ ≤ 0 for all v ∈ C, which is the transversality condition (8.3). Finally, if ζ(b) = 0 and ν = 0, then ρ = 0 and, since the adjoint inclusion is linear in ζ, ζ ≡ 0 on [a, b]; hence θ̂ = 0, contradicting θ̂ ≠ 0, which proves (iii). ∎
Remark 8.2. If ζ and ν satisfy the conclusions of Theorem 8.3, and r > 0, then it is easy to see that rζ and rν satisfy the conclusions as well. Therefore the number ν can always be assumed to be equal to 1 or to 0. ∎
Remark 8.3. Notice that in Theorem 8.3 we no longer need the special hypothesis A7, because the cone Ĉ is not a linear subspace, even if C is. ∎
Remark 8.4. The transversality condition says that there is a λ ∈ ∂φ(p_*, q_*) such that ⟨(−ζ(a), ζ(b)) − νλ, v⟩ ≤ 0 for all v ∈ C. In the special case when C is a linear subspace, this actually implies that ⟨(−ζ(a), ζ(b)) − νλ, v⟩ = 0 for all v ∈ C. Therefore, in this case the transversality condition simply says that (−ζ(a), ζ(b))|C = νλ|C. This happens, in particular, if S is a submanifold of M, in which case C can be taken to be the tangent space to S at q_*. In the special case when φ = 0 and S is a submanifold of
M × M, the transversality condition says that (−ζ(a), ζ(b)) is orthogonal to T_{(p_*,q_*)}S. If S is of the form {p_*} × Z, where Z is a submanifold of M (i.e. if ξ(a) is fixed and we only have a terminal point constraint ξ(b) ∈ Z), then the transversality condition says that ζ(b) is orthogonal to T_{q_*}Z. ∎
Remark 8.5. If S = {p_*} × M and φ is just a function of q, then we have a free terminal point problem. In that case, the transversality condition simply says that ζ(b) ∈ ν∂φ(q_*). If ν = 0 then ζ(b) = 0, contradicting the fact that ζ(b) and ν cannot both vanish. So for a free terminal point problem we can always assume that ν = 1. ∎
Remark 8.6. The possibility that ν = 0 can also be excluded in other cases. For example, for the classical Calculus of Variations, in which the dynamics is just ẋ = u, the Hamiltonian in the sense of Theorem 8.3 is given by (z, u) → ⟨z, u⟩ + νL(x, u, t). If ν were equal to zero, the minimization condition would require the control u(t) being used to minimize the linear functional v → ⟨ζ(t), v⟩ for v ranging over the whole space ℝⁿ. Clearly, such a functional does not have a minimum unless ζ(t) = 0. But this possibility is excluded by the nontriviality condition. So for a classical Calculus of Variations problem we can always assume that ν = 1. In particular, Theorem 8.3 implies Theorem 2.2. ∎
An important particular case of the previous theorem is:
Theorem 8.4 (The Maximum Principle for time-optimal control). Assume that the data Σ, M, U, F, ξ_*, η_*, γ_*, a, b, p_*, q_*, S and C are such that A1, A2, A3, A4, A8 and OPT2 hold. Assume that γ_* is a solution of the minimization problem P_time(Σ, S). Then there exists a nonzero weakly minimizing adjoint vector ζ along γ_* such that (−ζ(a), ζ(b)) ∈ C^† and the Hamiltonian ⟨ζ(t), F_{η_*}(ξ_*(t), t)⟩ is constant and ≤ 0 almost everywhere.
Proof. This is a trivial consequence of Theorem 8.3, taking g = identity and φ = 0. Indeed, the Hamiltonian in the sense of Theorem 8.3 is given by (z, x, t) → ⟨z, F_η(x, t)⟩ + ν. So Theorem 8.3 gives a ν ≥ 0 and a ζ ∈ Adj^Σ(γ_*, ν) such that ζ(b) and ν cannot both vanish, this Hamiltonian vanishes a.e. along Ξ (where Ξ(t) = (ξ_*(t), ζ(t))), and ζ satisfies the appropriate transversality condition. The condition that ζ ∈ Adj^Σ(γ_*, ν) just says that ζ ∈ Adj^Σ(γ_*). In
terms of the Hamiltonian H_Σ(z, x, t) = ⟨z, F_η(x, t)⟩ of Σ, the vanishing condition along Ξ just says that H_Σ is constant and ≤ 0 along Ξ. If ζ ≡ 0, then the vanishing of the Hamiltonian of Theorem 8.3 along Ξ implies that ν = 0 as well, contradicting the nontriviality condition. So ζ ≢ 0. Finally, since φ = 0, the transversality condition just says that (−ζ(a), ζ(b)) ∈ C^†. ∎
Definition 8.1. A controlled arc that satisfies the necessary condition of Theorem 8.1, 8.2, 8.3, or 8.4, for the corresponding problem (separation, controllability, optimal control or minimum time control) is a controlled extremal for the problem. ∎
9. Classical systems: the autonomous case

In this and the next section we show how the Maximum Principle applies to systems of the "classical" form ẋ = f(x, t, u), and in particular we explain how in this case one can obtain strong rather than weak minimization in the conclusion, under rather general technical conditions. We will do this in two cases, namely, (a) that of systems of the form ẋ = f(x, u), u ∈ U, and (b) systems ẋ = f(x, t, u), u ∈ U(t). In the former case, we will allow the control set U to be a completely general set, with no structure whatsoever, and will only require f(x, u) to be continuous in x for each fixed u. In the latter case, we select a setting similar to that of F. Clarke's version of the MP, presented in [3], [4].
We now discuss Case (a). Here the relevant objects of interest will be systems generated by a family f = {f_u}_{u∈U} of ordinary vector fields on M, and we will have to define in a precise way how such objects give rise to control systems Σ = (M, U, F) in the sense of Definition 5.1. Roughly speaking, if we write f(x, u) = f_u(x), then Σ ought to be "the control system ẋ = f(x, u), u ∈ U." This, however, is still vague, because it fails to specify the "class U of admissible open-loop controls," i.e. the set of functions t → u(t) that can actually be plugged into the formal expression ẋ = f(x, u) to produce a true differential equation ẋ = f(x, u(t)). For this reason, the system Σ associated with a given f will not be unique, because there is no canonical choice of U.
Definition 9.1. A vector field family on a manifold M is a parametrized family f = {f_u}_{u∈U} of vector fields on M (i.e. continuous sections of TM). A vector field system is a triple V = (M, U, f) such that f = {f_u}_{u∈U} is a vector field family on M. ∎
If V = (M, U, f) is a vector field system, we let 𝒞 denote the set of all vector fields f_u, u ∈ U. We say that an F ∈ CVF(M) is generated by V if F_t ∈ 𝒞 for every t ∈ TD(F). We call a CVF F piecewise constant if there is a locally finite partition P of TD(F) into intervals such that the map t → F_t is constant on each J ∈ P.
Definition 9.2. A control system Σ = (M, U, F) is generated by a vector field system V = (M̃, Ũ, f) if (i) M = M̃, (ii) F_η is generated by V for every η ∈ U, and (iii) whenever I is any interval such that I = TD(F_{η₀}) for some η₀ ∈ U, and G is a piecewise constant CVF generated by V with time domain I, then there exists an η ∈ U such that G = F_η on M × I. ∎
Remark 9.1. Condition (iii) is a slightly weaker form of the natural requirement that "all control vector fields arising from piecewise constant U-valued controls" should occur in Σ. We choose only to impose this weaker condition because we want to allow, for example, systems with a fixed time interval. ∎
The following general result, together with Proposition 6.1, says that for systems generated by vector field systems the Maximum Principle can always be applied in the "strong" form, i.e. with strong rather than weak minimization.
Theorem 9.1. Every control system Σ = (M, U, F) generated by a vector field system V = (M, U, f) is separable.

Proof. Let ξ : [a,b] → M be a trajectory of Σ corresponding to a control η₀. Let I = TD(F_{η₀}), so [a,b] ⊆ I. Let K ∈ Comp(M) be such that {ξ(t) : a ≤ t ≤ b} ⊆ Int(K). Let Γ⁰(K) denote the space of continuous vector fields on K, i.e. the set of restrictions to K of continuous sections of TM (or, equivalently, the set of continuous maps X : K → TM such that π_{TM} ∘ X = id_K). Then Γ⁰(K) is a separable Banach space, if we
pick a Riemannian metric on M and let ‖X‖ = sup{‖X(x)‖ : x ∈ K}. So the subspace W = {f_u|K : u ∈ U} is separable as well. Let W₀ be a countable dense subset of W, and pick for each g ∈ W₀ a u₀ = ρ(g) such that g = f_{u₀}|K. Let U₀ = {ρ(g) : g ∈ W₀}. Then U₀ is countable. Let Θ₀ be the set of all piecewise constant U₀-valued maps θ : I → U₀ that are constant on the intervals of a finite partition P_θ of I with rational partition points. Then Θ₀ is countable. If θ ∈ Θ₀, then Definition 9.2 implies that there exists an η = ν(θ) in U such that I ⊆ TD(F_η) and f_{θ(t)}(x) = F_η(x, t) for (x, t) ∈ M × I. Let 𝒰₀ = {ν(θ) : θ ∈ Θ₀}. Then 𝒰₀ is countable as well. If η ∈ 𝒰₀, then of course I ⊆ TD(F_η). Moreover, if t ∈ [a,b], then {F_η(ξ(t), t) : η ∈ 𝒰₀} is clearly dense in {f_u(ξ(t)) : u ∈ U}, which contains the set {F_η(ξ(t), t) : η ∈ U, [a,b] ⊆ TD(F_η)}. So Σ is separable along ξ. ∎
Remark 9.2. At first sight, it may appear surprising that a general separability result can be proved in the setting where the control set U is completely arbitrary, and in particular is not required to be a separable metric space. However, our definition of separability only involves the values of the vector fields F_η(x, t), and the control values are just parameters. So all that really matters is the separability of the Fréchet space of vector fields. ∎
For a vector field system V = (M, U, f), we define a Lagrangian to be a family L = {L_u}_{u∈U} of continuous functions on M. The augmented vector field system V^L is the system (M × ℝ, U, f^L), where f^L = {f_u^L : u ∈ U}, f_u^L(x, x⁰) = (f_u(x), L_u(x)). We can then consider various control systems generated by V^L.
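To make the notion of a system generated by a vector field family concrete, here is a minimal sketch (an illustration added here, not from the original text; it assumes NumPy), in which a two-element family on ℝ² is driven by a piecewise constant U-valued control, as in Definition 9.2 (iii).

    # Sketch: a control system generated by a vector field family {f_u},
    # with a piecewise constant control. Illustration only.
    import numpy as np

    # A two-element family on R^2: a rotation field and a translation field.
    family = {
        'rotate': lambda x: np.array([-x[1], x[0]]),
        'slide':  lambda x: np.array([1.0, 0.0]),
    }

    def trajectory(x0, schedule, dt=1e-3):
        """Integrate xdot = f_{eta(t)}(x) for a piecewise constant control eta,
        given as a list of (control value, duration) pairs."""
        x = np.array(x0, dtype=float)
        for u, duration in schedule:
            f = family[u]
            for _ in range(int(round(duration / dt))):
                x = x + dt * f(x)        # Euler step with the frozen vector field
        return x

    # Rotate for a quarter turn, then slide for one time unit.
    print(trajectory([1.0, 0.0], [('rotate', np.pi / 2), ('slide', 1.0)]))
    # roughly [1.0, 1.0]: (1, 0) rotates to about (0, 1), then slides to about (1, 1).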
10. Classical systems: the time-varying case

We now turn to the second case, namely, systems ẋ = f(x, t, u). We select a setting similar to that of F. Clarke's version of the MP, presented in [3], [4].
Recall that (a) the Borel σ-algebra B(X) of a metric space X is the σ-algebra generated by the open sets, and (b) a Polish space is a separable topological space X whose topology arises from a metric d such that (X, d) is complete. If I is an interval, we use 𝓛(I) to denote the σ-algebra of Lebesgue measurable subsets of I. If P is a Polish space, I is an interval, and M is a smooth manifold, we define a classical control dynamics (abbr. CCD) on M with time domain I and ambient control space P to be a pair (U, f) such that
(I) U is an 𝓛(I) × B(P)-measurable subset of I × P, such that U(t) = {u : (t, u) ∈ U} is nonempty for every t ∈ I;
(II) (x, t, u) → f(x, t, u) is a map from M × U to TM, such that f(x, t, u) ∈ T_xM for all (x, t, u) ∈ M × U;
(III) every map f(·, t, u), (t, u) ∈ U, is continuous;
(IV) every map f(x, ·, ·), x ∈ M, is 𝓛(I) × B(P)-measurable.
(In particular, (I), (III) and (IV) imply that f is B(M) × 𝓛(I) × B(P)-measurable.) A control is a Lebesgue measurable map η : J → P such that J is a subinterval of I and η(t) ∈ U(t) for each t ∈ J. If we use U to denote the class of all controls, and then write F_η(x, t) = f(x, t, η(t)), F = (F_η)_{η∈U}, then Σ = (M, U, F) is a control system. A Σ arising in this way from a CCD (U, f) will be called a classical control system.
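As a small illustration of how a CCD induces a control system (a sketch added here with made-up data, assuming NumPy; the f and the control below are not from the text), one freezes a measurable control η and integrates the resulting time-varying vector field F_η(x, t) = f(x, t, η(t)).

    # Sketch: from a classical control dynamics f(x, t, u) and a control eta(t)
    # to the control vector field F_eta(x, t) = f(x, t, eta(t)). Illustration only.
    import numpy as np

    def f(x, t, u):
        return np.array([x[1], -np.sin(x[0]) + u])   # made-up pendulum-like dynamics

    def eta(t):
        return 1.0 if t < 1.0 else -1.0              # a measurable (here piecewise constant) control

    def F_eta(x, t):
        return f(x, t, eta(t))                       # the induced control vector field

    # A trajectory of the classical control system for this control, on [0, 2]:
    x, dt = np.array([0.0, 0.0]), 1e-3
    for k in range(2000):
        x = x + dt * F_eta(x, k * dt)
    print(x)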
Theorem 10.1. A classical control system is separable and has the measurable substitution property.

Proof. The substitution property is trivial. Separability is proved in [23]. ∎
It follows from Theorem 10.1 that for a classical control system the MP applies, with the extra conclusion that the adjoint vector is strongly minimizing. (For the optimality results, we have to require that the augmented system Σ^L, obtained by adding the running cost as a new state variable, satisfy the same technical assumptions. That is, the La-
grangian (x, t, u) → L(x, t, u) has to be continuous in x for each (t, u), and 𝓛(I) × B(P)-measurable in (t, u) for each x, exactly as in [4].) For Theorem 8.1 to apply, the reference CVF f_* : (x, t) → f(x, t, u_*(t)) has to be LIL. This is a weaker assumption than those of [4], Theorems 5.1.2 and 5.2.1, where (a) f(x, t, u) has to be Lipschitz with respect to x for each t, u, with a Lipschitz constant k(t, u), and (b) the function t → k(t, u_*(t)) has to be integrable. Condition (b) is our requirement that f_* be LIL (this hypothesis is essential, as can be shown by simple examples, cf. Example 15.5 below), but Condition (a) is no longer needed.
11. An example: the derivation of the geodesic equation

Our first example of an application of the Maximum Principle will be the derivation of the well-known equations of the Riemannian geodesics. This will illustrate both the power and the simplicity of the control theory method, especially when it is combined with an intrinsically geometric formulation and with the approach of regarding a control system as a (possibly very large) collection of vector fields. In particular, we work with the geometric formulation of the adjoint equation in terms of a connection along a curve, which turns out to be tailor-made for use in a differential-geometric setting.
Let M be a Riemannian manifold, with metric tensor G. Our goal is to characterize those arcs ξ : [a,b] → M that minimize length among all arcs ξ̃ such that ∂ξ̃ = ∂ξ. (Recall that if D(ξ) = [a,b] then ∂ξ = (ξ(a), ξ(b)).) It is well known that every such arc can be reparametrized by a constant times arc-length in such a way that a = 0, b = 1, and that the minimum-length arcs ξ : [0,1] → M that are so parametrized are precisely those that minimize the integral J(ξ) = ½∫₀¹ G(ξ̇(t), ξ̇(t)) dt. So it suffices to characterize the arcs ξ : [0,1] → M that minimize J(ξ) among all ξ̃ such that ∂ξ̃ = ∂ξ. Let A be the set of all arcs ξ : [0,1] → M for which J(ξ) < ∞, so A is the Sobolev space W^{1,2}([0,1], M). For p, q ∈ M, let M(p,q) = {ξ ∈ A : ∂ξ
= (p,q)}, and let M = ⋃_{p,q} M(p,q). Our goal is then to characterize the class M by deriving a necessary condition for an arc ξ ∈ A to belong to M. Naturally, the condition is going to be the familiar equation for the
geodesics.
To apply the Maximum Principle, we first realize the class A as the set of all trajectories of a control system generated by a vector field system (U, f). The simplest way to do this is by letting U = Γ^∞(TM), the space of all smooth vector fields u on M. Let us introduce the somewhat redundant notation f_u for a vector field u ∈ U, so f_u(x) is just equal to u(x), and let us also write f(x, u) = f_u(x). With these notations, we can think of the family of vector fields {f_u}_{u∈U} as a control system ẋ = f(x, u) in the traditional control theory sense, except that our choice of control space U is somewhat unorthodox, since U is a very large space of vector fields. A control will be, for us, a LIC^∞ control vector field (x, t) → F(x, t) on M with time domain [0,1]. Equivalently, we may think of a control as a function [0,1] ∋ t → η(t) ∈ U such that the map (t, x) → f_{η(t)}(x), i.e. the map (t, x) → η(t)(x), is measurable with respect to t for each fixed x, and is bounded uniformly with respect to t together with all x-derivatives of all orders on each compact subset of M. (That is, a control is a bounded measurable map from [0,1] to the Fréchet space Γ^∞(TM).) We let 𝒰 be the class of all controls, and write F_η(x, t) = η(t)(x) for η ∈ 𝒰. Then Σ = (M, 𝒰, F) is a control system in the sense of our definition. It is clear that every trajectory of a control is in A, and the converse, that is, that every arc ξ ∈ A arises as a trajectory for some control η, is trivially true. So A is exactly the class of all trajectories of Σ.
For any control η ∈ 𝒰, define L_η : M × [0,1] → ℝ by

    L_η(x, t) = ½ G(η(t)(x), η(t)(x))

for x ∈ M. Then L = {L_η : η ∈ 𝒰} is a Lagrangian for Σ in the sense of our definitions. Given p, q ∈ M, the members of M(p,q) are exactly the solutions of the optimal control problem with fixed endpoints in which one seeks to minimize the integral ∫₀¹ L_η(ξ(t), t) dt among all controlled trajec-
tories γ = (ξ, η) of Σ for which ξ(0) = p, ξ(1) = q. (In the terminology of Section 8, this is Problem P_{[0,1]}(Σ, L, {(p,q)}, 0).)
The Maximum Principle then gives the following necessary condition for a ξ_* ∈ A, arising from a control η_*, to belong to M(p,q): there exist a field of covectors t → ζ(t) along ξ_* and a ν ≥ 0 that satisfy the following five conditions: (i) nontriviality, i.e. (ζ(t), ν) ≠ (0,0) for t ∈ [0,1], (ii) absolute continuity, (iii) the adjoint equation, (iv) the Hamiltonian minimization condition, which says that for almost every t ∈ [0,1] the function η → ⟨ζ(t), η(t)(ξ_*(t))⟩ + νL_η(ξ_*(t), t) is minimized by η_*, and (v) the constancy of the Hamiltonian condition, which says that there is a constant μ such that ⟨ζ(t), η_*(t)(ξ_*(t))⟩ + νL_{η_*}(ξ_*(t), t) = μ a.e. (Our system Σ is clearly invariant under deletions and insertions, so ζ can be chosen so as to satisfy (v), and separable, so (iv) holds.)
We now show how these conditions imply the geodesic equation. First of all, it is clear that the Hamiltonian minimization condition holds for a particular t iff η_*(t)(ξ_*(t)), i.e. ξ̇_*(t), minimizes the functional T_{ξ_*(t)}M ∋ y → ⟨ζ(t), y⟩ + ½G(y, y). This functional does not have a minimum if ν = 0, ζ(t) ≠ 0, so ν cannot vanish. Therefore we may assume that ν = 1. Next, recall that the metric G gives rise to an isomorphism G(x) : T_xM → T^*_xM such that ⟨G(x)v, w⟩ = ⟨v, w⟩_G whenever v, w ∈ T_xM. If we let w(t) = G(ξ_*(t))^{-1}(ζ(t)), then the vector ξ̇_*(t) minimizes the functional T_{ξ_*(t)}M ∋ y → G(w(t), y) + ½G(y, y). Now, if we have an inner product ⟨·,·⟩ on a finite-dimensional linear space S, and a nonzero vector w ∈ S, it is clear that the functional v → ⟨w, v⟩ + ½⟨v, v⟩ is minimized by v = −w. So ξ̇_*(t) = η_*(t)(ξ_*(t)) = −w(t), i.e. ξ̇_*(t) = −G(ξ_*(t))^{-1}(ζ(t)).

Condition (ii) then implies that ξ̇_*(t) is absolutely continuous. The constant μ is equal to ⟨ζ(t), η_*(t)(ξ_*(t))⟩ + L_{η_*}(ξ_*(t), t), i.e. to ⟨ζ(t), ξ̇_*(t)⟩ + ½G(ξ̇_*(t), ξ̇_*(t)), which equals −½‖ξ̇_*(t)‖²_G. So ‖ξ̇_*(t)‖_G is equal to a constant. We now use the remaining piece of information, namely, the adjoint equation. This says that Cζ = −ω, where C is the connection along ξ_* canonically associated with the controlled arc γ_* = (ξ_*, η_*), ω(t) = dL_{η_*,t}(ξ_*(t)), and L_{η_*,t}(x) =_def L_{η_*}(x, t).
So far, η_* is an arbitrary control that generates ξ_*, and we still have the freedom to choose it. Let us recall that, according to Proposition 4.1, every L¹ connection C along ξ_* is of the form C^{ξ_*,F} for some LIC^∞ CVF F with domain M × [0,1]. In particular, we can take C to be the pullback to ξ_* of the Riemannian connection, or Levi-Civita connection, of G, i.e. of the unique torsion-free affine connection ∇ such that ∇G = 0. (The torsion-free condition says that ∇_X Y − ∇_Y X = [X,Y] for every X, Y ∈ Γ^∞(TM).) The adjoint equation then becomes ζ̇ = −dL_{η_*,t}(ξ_*(t)), where η_*(t) = F_t, F_t(x) = F(x, t), and F is a LIC^∞ CVF such that ξ_* is an integral curve of F and C^{ξ_*,F}_t = ∇_{ξ̇_*(t)} for almost all t. In that case, 2L_{η_*}(x, t) = G(F_t(x), F_t(x)).
Now, if X, Y ∈ Γ^∞(TM), then XG(Y, Y) = 2G(∇_X Y, Y), because ∇_X G = 0. Then XG(Y, Y) = 2G(∇_Y X + [X, Y], Y), because ∇ is torsion-free. On the other hand, if 0 ≤ t ≤ 1, then C_t = C^{ξ_*,F}_t, so C_t X = [F_t, X](ξ_*(t)) for every X ∈ Γ^∞(TM). Therefore XG(F_t, F_t)(ξ_*(t)) = 2G(∇_{F_t}X + [X, F_t], F_t)(ξ_*(t)). Since C_t X = ∇_{F_t}X(ξ_*(t)), we have

    ∇_{F_t}X(ξ_*(t)) + [X, F_t](ξ_*(t)) = C_t X − C_t X = 0.

Since X is arbitrary, we have in fact shown that dG(F_t, F_t)(ξ_*(t)) = 0, i.e. that dL_{η_*,t}(ξ_*(t)) = 0. So the adjoint equation just says that ∇_{ξ̇_*(t)}(ζ) = 0. But then

    ∇_{ξ̇_*(t)}ξ̇_* = −∇_{ξ̇_*(t)}(G^{-1}ζ) = −G^{-1}(∇_{ξ̇_*(t)}(ζ)) = 0,

using again the fact that G is invariant under ∇. So we have shown that

    ∇_{ξ̇_*(t)}ξ̇_*(t) = 0 for almost all t,

which is the geodesic equation. ∎
Remark 11.1. Notice that in the above derivation we never took coordinates, and everything was done directly in an invariant way. In particular, we only used the two properties characterizing the Riemannian connection, but did not write the explicit formulas for the Christoffel symbols, and we applied all the conditions of the Maximum Principle, but did not express any of them, not even the adjoint equation, in local coordinates. ∎
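For readers who want to see the resulting equations at work numerically, here is a small sketch; it is an added illustration (not part of the original, coordinate-free argument), it assumes NumPy, and it integrates the equivalent Hamiltonian (cogeodesic) form of the geodesic equations for the Poincaré upper half-plane metric, checking that the speed ‖ξ̇‖_G stays constant, as derived above.

    # Sketch: cogeodesic flow for G = diag(1/y^2, 1/y^2) on the upper half-plane.
    import numpy as np

    def Ginv(q):
        # inverse metric at q = (x, y): G(q)^{-1} = y^2 * Identity
        return (q[1] ** 2) * np.eye(2)

    def step(q, p, dt):
        """One symplectic Euler step for H(q, p) = 1/2 p^T G(q)^{-1} p."""
        # dH/dq for this metric is (0, y (p_x^2 + p_y^2)); dH/dp = G^{-1} p.
        p = p - dt * np.array([0.0, q[1] * (p[0] ** 2 + p[1] ** 2)])
        q = q + dt * (Ginv(q) @ p)
        return q, p

    q, p = np.array([0.0, 1.0]), np.array([1.0, 0.0])   # start at (0, 1)
    speeds = []
    for _ in range(20000):
        q, p = step(q, p, 1e-4)
        v = Ginv(q) @ p                        # the velocity (up to the sign convention for the covector)
        speeds.append(np.sqrt(v @ v) / q[1])   # ||xi_dot||_G for this metric
    print(min(speeds), max(speeds))            # nearly equal: the speed is constant along the geodesic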
12. Single-input control-affine minimum time problems

Our second example of an application of the MP is in many ways diametrically opposed to the first one. Whereas the problem of Section 11 is one with a kinematics consisting of all curves, and the interesting structure arises from the Lagrangian, i.e. from the metric, we will now look at problems where the Lagrangian L is ≡ 1, which is as trivial as a function can be, and all the geometric structure comes from the kinematics. We will show how the MP can be applied to get strong conclusions about the structure of optimal controls.
We consider the simplest type of kinematics that involves nontrivial control. Clearly, this requires a minimum of two vector fields, so we will focus on "two-vector field" control systems, i.e. systems in which at each point x there are two directions of motion, X(x) and Y(x). (For general reasons, e.g. because we want to have existence of optimal controls, and/or we prefer that the reachable sets be closed, it is always a good idea to have convexity of the set of directions of motion at a point, so we will consider systems where the dynamics is such that ẋ belongs to the convex hull of X(x) and Y(x).) Writing f = ½(X + Y), g = ½(X − Y), we end up with a system of the form ẋ = f(x) + ug(x), |u| ≤ 1. The fact that L ≡ 1 means, of course, that we will be looking at minimum time control.
So we will consider control systems ẋ = f(x) + ug(x), |u| ≤ 1, where x takes values in a manifold M, and f, g are smooth vector fields on M. The controls are measurable functions u(·) : [a,b] → [−1,1], defined on some compact interval [a,b]. We would like to understand the structure of the time-optimal trajectories and their corresponding controls. Naturally, the MP can be applied, but this is only the first step in the analysis, since our real problem is how to translate the information provided by the MP into interesting facts about the optimal controls.
The MP, applied to this case, says that if ξ_* : [a,b] → M is a time-optimal trajectory corresponding to a control u_* : [a,b] → [−1,1], then there exists a nowhere vanishing AC field of covectors ζ along ξ_* that satisfies the adjoint equation (AE) and, for some constant μ ≥ 0, the Hamiltonian minimization condition:

(12.1)    −μ = ζ(t)·(f + u_*(t)g)(ξ_*(t)) = min{ ζ(t)·(f + vg)(ξ_*(t)) : v ∈ [−1,1] }    for a.e. t ∈ [a,b].
The problem is to interpret the above conditions and translate them into facts about u_*. Define the switching function φ : [a,b] → ℝ by letting φ(t) = ζ(t)·g(ξ_*(t)). Then (12.1) clearly implies that u_* = 1 as long as φ < 0, and u_* = −1 as long as φ > 0, so the switchings of u_*(t) occur at the zeros of φ, and therefore the properties of these zeros translate into information about the optimal controls. For example, if the switching function φ could be proved to have a discrete set of zeros, then it would follow that the optimal controls are "bang-bang," i.e. piecewise constant with values 1 or −1.
One possible course of action now would be to write the adjoint equation in local coordinates. This might seem reasonable, at least in the Euclidean case, i.e. when M is an open subset of ℝⁿ, so that there is a natural choice of coordinates. However, we shall argue that, even in the Euclidean case, the resulting system of equations, namely, ζ̇_i = −Σ_j ζ_j ∂(f_j + ug_j)/∂x_i, with x = ξ_*(t), u = u_*(t), is unnecessarily complicated, and fails to display the most relevant information about the problem. This is so because (a) the components ζ_i(t) of the adjoint vector ζ(t) are the inner products ζ(t)·e_i of ζ(t) with the members e_i of the canonical basis of ℝⁿ, and (b) the inner products ζ(t)·e_i are much less significant than the switching function φ, which is the inner product of ζ with the vector field g. (For example, we have explained above how the structure of the set of zeros of φ directly implies properties of the optimal controls.) For this reason, it is more convenient, even in the Euclidean case, to use a formulation of the MP where the canonical basis of ℝⁿ is not singled out for preferential treatment, and the products ζ·X of ζ with all possible vector fields X are treated alike. This is precisely what the intrinsic formulation of the AE given in Section 4 does, since it gives a condition on the derivative of the function t → ζ(t)·X for all possible vector fields X. The AE then turns out to say that the derivative of ζ(t)·X is another function of the same kind, where X is replaced by the Lie bracket [f + u_*(t)g, X]. That is,

(12.2)    (d/dt)(ζ(t)·X(ξ_*(t))) = ζ(t)·[f + u_*(t)g, X](ξ_*(t))    for a.e. t ∈ [a,b],

for all smooth vector fields X on M.
Example 12.1. According to (12.2), the time derivative φ̇ of φ is the function ψ given by ψ(t) = ⟨ζ(t), [f + u(t)g, g](ξ(t))⟩. Since [g,g] = 0, we have in fact ψ(t) = ⟨ζ(t), [f,g](ξ(t))⟩. This already gives interesting information about the optimal trajectories. For example, suppose that M is two-dimensional. Then, if g(x) and [f,g](x) are linearly independent at every point x ∈ M, it follows that every time-optimal trajectory is bang-bang. (Indeed, since ζ(t) ≠ 0, the functions φ and ψ cannot vanish simultaneously, so the set of zeros of φ is discrete.) More generally, this shows that the occurrence of non bang-bang optimal arcs is intimately related to the structure of the "singular" set, i.e. the set S of points x ∈ M such that g(x) and [f,g](x) are linearly dependent. Let us call a trajectory ξ_* : [a,b] → M arising from a control u_*(t) singular on a subinterval J of [a,b] if −1 < u_*(t) < 1 almost everywhere on J. Then (assuming still that dim M = 2) if ξ_* is singular on J, then ξ_*(J) ⊆ S. In other words, "singular optimal arcs," if they exist, have to lie in S. This observation is the first step of a long analysis that can be pursued all the way, until one obtains a complete characterization of the structure of optimal trajectories for general real-analytic problems of the form ẋ = f(x) + ug(x), |u| ≤ 1, in two dimensions (cf. [17], [18], [19]). It is easy to see why real analyticity should make a difference at this point: when f and g are real-analytic, the set S is the set of zeros of a real-analytic function, the determinant of f and [f,g], so the structure of S is particularly simple, since S is a locally finite union of points and analytic arcs.
The following example shows that “singular optimal arcs,” as defined in Example 12.1, can indeed occur in very simple situations.
Example 12.2. (Example 12.1, continued.) Consider the control system Σ in the plane whose dynamical equations are given by ẋ₁ = 1 − x₂², ẋ₂ = u, where x₁, x₂ are real variables and the control u is required to satisfy |u| ≤ 1. Let ξ_* : [0,1] → ℝ² be the trajectory given by ξ_*(t) = (t, 0), corresponding to the control u_*(t) ≡ 0. Then it is easy to see that ξ_* is time-optimal, because ξ_* goes from (0,0) to (1,0) in time 1, and the dynamics of Σ does not allow motion from left to right with horizontal speed > 1. On the other hand, it is clear that ξ_* is singular in the sense of Example 12.1. Our system is of the form ẋ = f(x) + ug(x), with f = (1 − x₂²)∂/∂x₁, g = ∂/∂x₂. A simple calculation shows that the Lie bracket [f,g] is equal to 2x₂ ∂/∂x₁. So g and [f,g] are linearly dependent on the set {(x₁, x₂) : x₂ = 0}, which is precisely where ξ_* lies, in agreement with the analysis of Example 12.1. ∎
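As a quick sanity check on the bracket computation in Example 12.2, the following sketch (an added illustration, not part of the original text; it assumes SymPy is available) computes [f,g] in coordinates and verifies the degeneracy on the singular set.

    # Sketch: verify the Lie bracket computation of Example 12.2 with SymPy.
    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    X = sp.Matrix([x1, x2])
    f = sp.Matrix([1 - x2**2, 0])   # drift vector field
    g = sp.Matrix([0, 1])           # control vector field

    def lie_bracket(a, b, X):
        """[a, b]^i = sum_j (a^j d b^i/dx_j - b^j d a^i/dx_j)."""
        return sp.simplify(b.jacobian(X) * a - a.jacobian(X) * b)

    fg = lie_bracket(f, g, X)
    print(fg)                        # expected: Matrix([[2*x2], [0]]), i.e. 2*x2 d/dx1
    # On the singular set {x2 = 0}, g and [f, g] are linearly dependent:
    print(sp.Matrix.hstack(g, fg).subs(x2, 0).rank())   # expected rank 1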
Example 12.3. A single-input linear system is a system of the form ẋ = f(x) + ug(x) on ℝⁿ, for which the vector field f is linear, i.e. f = V_A for some square matrix A, where V_A(x) =_def Ax, and g is constant, i.e. g = W_b for some b ∈ ℝⁿ, where W_b(x) =_def b. We will consider the case when the set of control values is the interval [−1,1]. The adjoint equation reduces in this case to ζ̇ = −ζ·A, and the switching function φ is given by φ(t) = ζ(t)·b, so the higher order derivatives of φ are φ^(k)(t) = (−1)^k ζ(t)·A^k b. If ν is the smallest positive integer such that A^ν b ∈ linear span(b, Ab, ..., A^{ν−1}b), then φ satisfies a linear constant-coefficient O.D.E. of the form

    φ^(ν) + c_{ν−1}φ^(ν−1) + ··· + c₀φ = 0.

This implies that the set of zeros of φ is discrete, and therefore u_* is bang-bang, provided that φ does not vanish identically. A system ẋ = Ax + ub in ℝⁿ is called controllable if linear span(b, Ab, ..., A^{n−1}b) = ℝⁿ. For a controllable system, ν = n, and φ can only vanish identically if ζ(t)·A^k b = 0 for k = 0, 1, ..., n − 1, in which case ζ(t) ≡ 0, contradicting the nontriviality condition. So for a controllable single-input linear system, every time-optimal trajectory is bang-bang. ∎
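The mechanics of Example 12.3 can be seen concretely for the double integrator ẋ₁ = x₂, ẋ₂ = u; the sketch below (an added illustration, assuming NumPy; the particular initial covector is made up) checks controllability via the Kalman matrix and integrates the adjoint equation to count the sign changes of the switching function.

    # Sketch: controllability and the switching function for a linear system.
    import numpy as np

    A = np.array([[0.0, 1.0],
                  [0.0, 0.0]])
    b = np.array([0.0, 1.0])

    # Controllability: rank of the Kalman matrix (b, Ab, ..., A^(n-1) b).
    n = A.shape[0]
    K = np.column_stack([np.linalg.matrix_power(A, k) @ b for k in range(n)])
    print(np.linalg.matrix_rank(K))            # 2: the system is controllable

    # Adjoint equation zeta_dot = -zeta A (row vector), switching function phi = zeta . b.
    zeta = np.array([1.0, 0.3])                 # an arbitrary nonzero initial covector
    ts = np.linspace(0.0, 5.0, 2001)
    dt = ts[1] - ts[0]
    phi = []
    for t in ts:
        phi.append(zeta @ b)
        zeta = zeta - dt * (zeta @ A)            # Euler step; exact here since A^2 = 0
    phi = np.array(phi)
    # For this A, phi is affine in t, so it changes sign at most once:
    print(int(np.sum(np.sign(phi[:-1]) != np.sign(phi[1:]))))   # at most 1 switching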
Example 12.4 (Example 12.3, continued). For a noncontrollable single-input linear system, all pathology would appear to be possible. For example, consider the system ẋ₁ = x₁, ẋ₂ = u, |u| ≤ 1, which is clearly not controllable, since b = (0,1)^t and A = ( 1 0 ; 0 0 ), so that Ab = 0. (Here the superscript t denotes transpose.) Let u_* : [0,T] → [−1,1] be an arbitrary measurable function. Let ξ_* be the corresponding trajectory, with initial condition ξ_*(0) = (1,0)^t. Then ξ_*(T) = (e^T, y)^t, where y depends on u_*, but it is clear that for any other trajectory ξ : [0,T'] → ℝ² going from (1,0)^t to (e^T, y)^t the time T' has to be equal to T. Therefore all trajectories that start at (1,0)^t are optimal. This means that optimal controls for this problem can be as pathological as arbitrary controls, i.e. arbitrary measurable functions.
On the other hand, the appearance of unlimited pathology is somewhat deceptive, at least for this problem, since one can slightly modify the question of strong piecewise regularity for optimal controls, in a very reasonable way, so as to get a positive answer. Let us call a set U of controls sufficient for time-optimality if, whenever x̄, ȳ are such that there is a time-optimal trajectory from x̄ to ȳ, it follows that there is a time-optimal trajectory ξ̃ from x̄ to ȳ that arises from a control in U. (In other words, U is sufficient if for the purpose of optimally controlling our system one can forget about the controls that do not belong to U, because all the optimal controlling that can be done with general controls can also be done with controls in U.) Then it is easy to see that the class U consisting of all bang-bang controls with at most one discontinuity is sufficient for our example, since the effect of a control u : [a,b] → [−1,1] is completely determined by its integral ∫_a^b u(t)dt, and all values of the integral that can be realized with general controls can also be realized with controls in U. ∎
The results of the previous examples can be generalized to arbitrary single-input linear systems of the form ẋ = Ax + ub in ℝⁿ, and to multi-input linear systems ẋ = Ax + Bu, where the control u is now a vector in ℝᵐ, A and B are, respectively, an n × n and an n × m matrix, and the set U of control values is a convex polyhedron K. Under these conditions, it can be proved that the controls that are bang-bang (i.e. piecewise constant with values in the set of vertices of K) form a sufficient class for time-optimality.
The preceding discussion shows that the statement that "time-optimal controls are bang-bang" is true, after a slight reinterpretation, for linear control problems, but can fail in a rather strong way as soon as the slightest deviation from linearity is permitted, as in Example 12.2. (It is easy to see that in Example 12.2 the only control that goes from (0,0) to (1,0) in time 1 is u_*, so the class of bang-bang controls is not sufficient.) Stronger nonlinearities can produce even stronger pathologies, as we now show.
Example 12.5. Let u_* : [0,T] → [−1,1] be an arbitrary measurable function. Let v(t) = ∫₀ᵗ u_*(s)ds. Let K be the graph of v, i.e. K = {(t, v(t)) : 0 ≤ t ≤ T}. Let θ : ℝ² → ℝ be a function of class C^∞ such that θ ≡ 0 on K and θ > 0 on ℝ² \ K. Consider the control system in ℝ³ given by the equations ẋ₁ = 1, ẋ₂ = u, ẋ₃ = θ(x₁, x₂). As before, the values of the control u are required to belong to [−1,1]. Let r = v(T). Let ξ_*(t) = (t, v(t), 0). Then it is easy to see that ξ_* goes from (0,0,0) to (T, r, 0) in time T, and every trajectory that goes from (0,0,0) to (T, r, 0) does so in time T and coincides with ξ_* or one of its time translates. In particular, u_* and its time translates are the only time-optimal controls that give rise to trajectories going from (0,0,0) to (T, r, 0). So no class U can be sufficient for time-optimality unless it contains u_* or one of its translates. ∎
The point of Example 12.5 is that u_* is arbitrary, so one can get arbitrary unavoidable pathology with just simple examples in ℝ³. This means, in particular, that the only class of [−1,1]-valued controls defined on intervals of the form [0,T] that is universally sufficient (i.e. sufficient for all systems) is the class of all controls. Notice, however, that the above construction does not work if one insists on staying within the class of real-analytic systems, since in that case the function θ will not exist unless K is very special. It turns out that
for real-analytic systems one can prove the existence of nontrivial universally sufficient classes. For example, the class of all controls that are real-analytic on an open dense subset of their domain is universally sufficient (cf. [20], [21]).
So far, everything that has been said would suggest that linearity of a system has much to do with the question whether time-optimal controls are bang-bang. The next example shows that the situation is not so simple.
Example 12.6. Consider the system
(12.3)    ẋ₁ = x₃² + u,
          ẋ₂ = 1 + x₁ + 2x₂²x₃ − x₂³ + 2x₁x₂x₃,
          ẋ₃ = x₁x₂ − x₂³,

with control constraint |u| ≤ 1. The system equations obviously look highly
nonlinear. However, if one writes them in the ẋ = f(x) + ug(x) form, as before, and computes iterated Lie brackets of f and g, one finds that [g,[f,g]] is a linear combination of g and [f,g] with smooth coefficients: [g,[f,g]] = αg + β[f,g] (actually, [g,[f,g]] = 2g), and that the vector fields g, [f,g] and [f,[f,g]] are linearly independent at every point. These facts can be exploited, using the adjoint equation in its intrinsic form (12.2). Let ξ_* be time-optimal, let u_* be a control giving rise to ξ_*, and let ζ be as in the conclusion of the MP. One then finds that

(12.4)    φ̇(t) = ζ(t)·[f,g](ξ_*(t)),

so

(12.5)    φ̈(t) = σ(t) + u_*(t)α(ξ_*(t))φ(t) + u_*(t)β(ξ_*(t))φ̇(t),

where σ(t) = ζ(t)·[f,[f,g]](ξ_*(t)). This equation implies that the set of zeros of φ is discrete. (Indeed: the absolute continuity of ζ together with (12.4) imply that φ is C¹ and φ̇ is absolutely continuous. If φ(t) = 0 and φ̇(t) ≠ 0 then it is clear that t is an isolated zero of φ. If φ(t) = 0 and
φ̇(t) = 0 then σ(t) ≠ 0, because ζ(t) ≠ 0 and g, [f,g] and [f,[f,g]] are linearly independent at ξ_*(t). So (12.5) implies that φ̈ has constant sign and is bounded away from zero in a neighborhood of t. Therefore t is, once again, an isolated zero of φ.) Since all zeros of φ are isolated, u_* is bang-bang. So all optimal controls are bang-bang, exactly as in the linear case. ∎
Examples 12.2, 12.3, 12.6 show that the amount of "nonlinearity" of a system is not a very good indicator of its control-theoretic properties. This should not be surprising, since properties such as the bang-bang nature of time-optimal arcs are clearly invariant under arbitrary nonlinear coordinate changes, whereas linearity itself is not. This means that one can easily produce plenty of examples of nonlinear systems whose time-optimal arcs are bang-bang by just taking linear systems and changing coordinates to render them "nonlinear." It is clear that facts about the behavior of a system that are invariant under general nonlinear coordinate changes should be related to properties that have the same invariance. As Example 12.6 suggests, what matters is the structural relations among the Lie brackets of f and g, i.e. whether or not certain iterated brackets are linear combinations of some other brackets. The coordinate-free formulation in terms of Equation (12.2) is the one that makes it easiest to relate these structural properties to the necessary conditions of the MP. We illustrate this by proving the following "simplified bang-bang theorem."
Theorem 12.1. Consider a single-input control system Σ : ẋ = f(x) + ug(x), |u| ≤ 1, where f and g are smooth vector fields on a manifold M. Let g_k be defined inductively by g₀ = g, g_{k+1} = [f, g_k]. Assume that (i) for every k the vector field [g, g_k] is a linear combination of g₀, ..., g_k with smooth coefficients, and (ii) for some k̄ the vectors g₀(x), ..., g_{k̄}(x) span T_xM for all x. Then all time-optimal arcs of Σ are bang-bang.
Proof. Write [g, g_k] = Σ_{j=0}^{k} a^k_j g_j. For an optimal arc ξ_* arising from a control u_*, let ζ be a nontrivial minimizing adjoint vector. Define φ_k(t) = ⟨ζ(t), g_k(ξ_*(t))⟩, and write Φ = (φ₀, ..., φ_{k̄}). Then Φ(t) = 0 iff ζ(t) = 0, so Φ(t) ≠ 0 for all t. On the other hand,

(12.6)    φ̇_k = φ_{k+1} + u_* Σ_{j=0}^{k} a^k_j φ_j    for k ∈ {0, ..., k̄}

along Ξ = (ξ_*, ζ). Given t ∈ D(ξ_*) such that φ₀(t) = 0, let k ∈ {1, ..., k̄} be such that φ_k(t) = c ≠ 0 but φ_j(t) = 0 for j < k. Since the φ_j are bounded, (12.6) implies that φ_j(t+h) = O(|h|) as |h| → 0 for j < k, and φ_k(t+h) = c + O(|h|). But then (12.6) implies that φ_{k−1}(t+h) = ch + O(h²), and φ_j(t+h) = O(h²) for j < k − 1. Repeated applications of (12.6) then yield φ_{k−i}(t+h) = (c hⁱ)/i! + O(|h|^{i+1}) for i = 0, ..., k. In particular, setting i = k, we find that φ₀(t+h) = ahᵏ + o(|h|ᵏ) for some a ≠ 0. So t is an isolated zero of φ₀. Therefore all the zeros of the switching function φ₀ are isolated. As explained before, this implies that u_* is bang-bang. ∎
yielding a stronger conclusion (local uniform bounds on the number of switchings) is proved in [16].The proof of the main result of [l61 is based on a technical lemma with a very long proof. A much simpler proof of the lemma has been found by Dmitruk (personal communication). m
Remark 12.2. For single-input linear systems, [g, g_k] = 0 for all k. Theorem 12.1 therefore contains and generalizes the result of Example 12.3. Moreover, the property that [g, g_k] = 0 for all k is invariant under general nonlinear coordinate changes, whereas linearity itself is not. Since the bang-bang property is itself invariant, it can be argued that an "invariant reason" for a system to have this property provides a more satisfactory explanation than a noninvariant one. If so, then Theorem 12.1 can be seen as explaining, from a nonlinear perspective, why linear systems have the bang-bang property.
13. Invariant systems on Lie groups

Let G be a Lie group, and let L(G) denote the Lie algebra of G, thought of as the set of left-invariant vector fields on G. A left-invariant vector field system on G is a parametrized family f = {f_u}_{u∈U} of members of L(G). We will think of f as a map U → L(G), and write f(u) instead of f_u. A control for f is a function η from some interval to U such that f ∘ η is measurable. A control η : I → U gives rise to a time-varying vector field f_η with domain G × I, defined by f_η(g, t) = f(η(t))(g) for g ∈ G. The vector field f_η is LIB iff f ∘ η is locally Lebesgue integrable, and in that case f_η ∈ CVF_{LIC^∞}(G).
A control system generated by such an f will be, according to our general definitions, a triple (G, 𝒰, F) such that (i) whenever an interval I is the time domain of some control vector field that occurs in F, then every f_η arising from a piecewise constant control η : I → U occurs in the family F, and (ii) every CVF that occurs in F is the restriction to its domain of some f_η. We will limit ourselves to systems that are generated by f in the following stronger sense: (I) the parameter set 𝒰 actually is a set of controls, and (II) for every η ∈ 𝒰 the control vector field F_η is actually f_η.
Specifying such a system amounts to specifying a set 𝓘 of intervals and a set 𝒰 of U-valued controls in such a way that D(η) ∈ 𝓘 for every η ∈ 𝒰, and 𝒰 contains all piecewise constant U-valued controls whose domain is in 𝓘.
An "invariant Lagrangian" should then be a family of functions {L_u}_{u∈U} such that each L_u is invariant under left translations. This, however, just means that each L_u is a constant function on G. So an invariant Lagrangian for an invariant family f is, simply, a real-valued function u → L(u) on U. In accordance with our general definitions, a controlled arc γ = (ξ, η) will be acceptable for L if the function L ∘ η is locally integrable. So in this case acceptability is just a property of η, and does not involve ξ. Given an invariant system Σ = (G, 𝒰, F) generated by f, and an invariant Lagrangian L for f, we can define L_η(t) = L(η(t)), and J_L(γ) = ∫_{D(ξ)} L(η(t)) dt for a controlled arc γ = (ξ, η) of Σ such that η is acceptable.
For invariant systems with an invariant Lagrangian, the concepts of an adjoint vector and an L-adjoint vector for some L coincide, since the "Lagrangian part" of the Hamiltonian is independent of g ∈ G. Let η : I → U be such that f ∘ η is locally integrable, and let ξ : I → G be a trajectory for η. Let γ = (ξ, η). An adjoint vector along γ is an absolutely continuous map ζ : I → T*G such that each ζ(t) is in T^*_{ξ(t)}G and ζ satisfies the adjoint equation. The cotangent spaces T^*_gG can all be identified with the dual L(G)* of L(G), via the maps Θ_g : T^*_gG → L(G)* that assign to each z ∈ T^*_gG the linear functional Θ_g(z) : L(G) → ℝ defined by Θ_g(z)(X) = ⟨z, X(g)⟩. With this identification, we can regard fields of covectors along a trajectory as L(G)*-valued functions. Let ζ be a field of covectors along ξ, and let ζ_Θ be the L(G)*-valued function that corresponds to ζ under this identification, so that ζ_Θ(t) = Θ_{ξ(t)}(ζ(t)), i.e. ⟨ζ_Θ(t), X⟩ = ⟨ζ(t), X(ξ(t))⟩ whenever X ∈ L(G). The adjoint equation says that
(13.1)    (d/dt)⟨ζ(t), X(ξ(t))⟩ = ⟨ζ(t), [f(η(t)), X](ξ(t))⟩

for every smooth vector field X on G. If X is left-invariant, then [f(η(t)), X] is left-invariant as well, so we can rewrite (13.1) as

(13.2)    (d/dt)⟨ζ_Θ(t), X⟩ = ⟨ζ_Θ(t), [f(η(t)), X]⟩.

So if ζ is an adjoint vector along γ then ζ_Θ must satisfy (13.2) for all X ∈ L(G). It is easy to show that, conversely, if ζ_Θ satisfies (13.2) for all X ∈ L(G) then ζ is an adjoint vector along γ. From now on, we identify Adj^Σ(γ) with a space of L(G)*-valued functions, so we think of ζ_Θ, rather than ζ, as an "adjoint vector." We can then drop the subscript Θ, and simply state that:

An absolutely continuous map ζ : D(ζ) → L(G)* is an adjoint vector along a controlled trajectory (ξ, η) of a system Σ generated by a left-invariant vector field system f = {f_u : u ∈ U} on the Lie group G, if and only if

(13.3)    (d/dt)⟨ζ(t), X⟩ = ⟨ζ(t), [f(η(t)), X]⟩ for a.e. t

for all X ∈ L(G). This condition can also be expressed in terms of the dual of the adjoint representation of L(G). For Y ∈ L(G), let ad_Y denote the map X → [Y, X] from L(G) to L(G), and let ad*_Y : L(G)* → L(G)* denote the dual map. Then ⟨ζ, [f(η(t)), X]⟩ = ⟨ζ, ad_{f(η(t))}(X)⟩ = ⟨ad*_{f(η(t))}(ζ(t)), X⟩. So we have shown that (13.3) holds iff ζ̇(t) = ad*_{f(η(t))}(ζ(t)) for a.e. t ∈ D(ζ).
Example 13.1. Let A₁, A₂ be two linearly independent skew-symmetric 3 × 3 matrices. Consider the system Ṙ = R(u₁A₁ + u₂A₂), evolving in G = SO(3), the group of 3 × 3 orthogonal matrices R such that det R = 1, with control constraint u₁² + u₂² ≤ 1. A control is a measurable function I ∋ t → (η₁(t), η₂(t)) with values in {(u₁, u₂) : u₁² + u₂² ≤ 1}, whose domain I is a compact interval. Then L(G) is naturally identified with so(3), the Lie algebra of 3 × 3 skew-symmetric matrices, by assigning to each A ∈ so(3) the vector field X_A given by X_A(R) = RA. Then [X_A, X_B] = X_{[A,B]}, where [A,B] = AB − BA.
Let us study the minimum time problem. If [a,b] ∋ t → R(t) ∈ G is a minimum time trajectory, corresponding to a control η = (η₁, η₂), then by the MP there exists a minimizing adjoint vector ζ : [a,b] → L(G)* such that ζ(t) ≠ 0 for all t, and the Hamiltonian is ≤ 0 along (R(·), ζ(·)). The Hamiltonian is H(R, z, u) = u₁⟨z, X_{A₁}(R)⟩ + u₂⟨z, X_{A₂}(R)⟩. Then

    H(R(t), ζ(t), u) = u₁⟨ζ(t), X_{A₁}(R(t))⟩ + u₂⟨ζ(t), X_{A₂}(R(t))⟩.

We rewrite this as H(R(·), ζ(·), u) = u₁φ₁ + u₂φ₂, where the "switching functions" φ_i are given by φ_i(t) = ⟨ζ(t), X_{A_i}(R(t))⟩. If we let ρ(t) = √(φ₁(t)² + φ₂(t)²), then the Hamiltonian minimization condition easily implies that η_i(t) = −φ_i(t)/ρ(t) if ρ(t) ≠ 0. The value of the Hamiltonian for u = η then turns out to be −ρ(t). The constancy of the Hamiltonian then says that ρ is constant. The adjoint equations say that φ̇₁(t) = −η₂(t)φ₃(t) and φ̇₂(t) = η₁(t)φ₃(t), where φ₃(t) = ⟨ζ(t), X_{A₃}⟩ and A₃ = [A₁, A₂]. If the constant ρ were equal to 0, then it would follow that φ₁ and φ₂ vanish identically. Since the three vector fields X_{A_i} form a basis of L(G), and ζ(t) ≠ 0, the function φ₃ never vanishes. But then the equations φ̇₁ = −η₂φ₃, φ̇₂ = η₁φ₃ imply that η₁(t) ≡ η₂(t) ≡ 0. So our trajectory R(·) is just a constant G-valued function, and is therefore not time-optimal unless b = a. If b > a, we have shown that ρ must be > 0. But then, if we write φ_i = ρψ_i for i = 1, 2, 3, we see that η_i = −ψ_i for i = 1, 2, and ψ̇₁ = −η₂ψ₃, ψ̇₂ = η₁ψ₃. On the other hand, the derivative of φ₃ can be computed using the adjoint equation once again. Since φ₃ = ⟨ζ, X_{A₃}⟩, we have

    φ̇₃ = η₁⟨ζ, X_{[A₁,A₃]}⟩ + η₂⟨ζ, X_{[A₂,A₃]}⟩.

Now, [A₁, A₃] = c₁₁A₁ + c₁₂A₂ and [A₂, A₃] = c₂₁A₁ + c₂₂A₂, for some constants c_ij. (Recall that so(3) is isomorphic to ℝ³ via an isomorphism that maps the Lie bracket to the cross product. Since the cross product v₃ = v₁ × v₂ of two vectors is orthogonal to v₁ and v₂, it follows that the cross products v₁ × v₃ and v₂ × v₃ are orthogonal to v₃, so they are linear combinations of v₁ and v₂.) So

(13.4)    φ̇₃ = c₁₁η₁φ₁ + c₁₂η₁φ₂ + c₂₁η₂φ₁ + c₂₂η₂φ₂.
If we divide through by ρ, use η_i = −ψ_i for i = 1, 2, and define η₃ = ψ₃, we see that the three functions η_i satisfy:

(13.5)    η̇₁ = η₂η₃,    η̇₂ = −η₁η₃,    η̇₃ = −(c₁₁η₁² + (c₁₂ + c₂₁)η₁η₂ + c₂₂η₂²).

We have shown that every optimal control (η₁, η₂) that is defined on an interval [a,b] with b > a satisfies the system (13.5) for some function η₃. In particular, this implies that all optimal controls are smooth. ∎
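The parenthetical identification of so(3) with ℝ³ used to obtain (13.4) can be checked numerically; the following sketch is an added illustration (not from the original text), assumes NumPy, and verifies that the hat map sends the cross product to the matrix commutator and that [A₁, A₃], [A₂, A₃] lie in the span of A₁, A₂.

    # Sketch: so(3) is isomorphic to R^3, with the Lie bracket going to the cross product.
    import numpy as np

    def hat(v):
        """Hat map: v in R^3 -> skew-symmetric 3x3 matrix with hat(v) w = v x w."""
        return np.array([[0.0, -v[2], v[1]],
                         [v[2], 0.0, -v[0]],
                         [-v[1], v[0], 0.0]])

    def bracket(A, B):
        return A @ B - B @ A

    rng = np.random.default_rng(0)
    v1, v2 = rng.standard_normal(3), rng.standard_normal(3)
    A1, A2 = hat(v1), hat(v2)

    # [A1, A2] corresponds to the cross product v1 x v2:
    print(np.allclose(bracket(A1, A2), hat(np.cross(v1, v2))))   # True

    # A3 = [A1, A2]; the coefficients c_ij of (13.4) come from expressing
    # v1 x v3 and v2 x v3 in the basis {v1, v2}.
    A3, v3 = bracket(A1, A2), np.cross(v1, v2)
    M = np.column_stack([v1, v2])
    c1, *_ = np.linalg.lstsq(M, np.cross(v1, v3), rcond=None)
    c2, *_ = np.linalg.lstsq(M, np.cross(v2, v3), rcond=None)
    print(np.allclose(bracket(A1, A3), c1[0] * A1 + c1[1] * A2),
          np.allclose(bracket(A2, A3), c2[0] * A1 + c2[1] * A2))  # True True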
Remark 13.1. Example 13.1 is a special case of the more general sub-Riemannian minimization problems to be considered in Section 14. For these problems, minimizers are "normal" or "abnormal" extremals, and normal extremals are always smooth. Example 13.1 is "strongly bracket-generating" (cf. Strichartz [15]), and this implies that all minimizers are normal extremals and hence smooth. ∎
14. Sub-Riemannian minimizers

If $M$ is a manifold, and $E$ is a subbundle of $TM$, we use $L(E)$ to denote the Lie algebra of vector fields generated by $\Gamma^\infty(E)$. The subbundle $E$ is called bracket-generating if $\{X(x) : X \in L(E)\} = T_xM$ for all $x \in M$. A Riemannian metric on $E$ is a smooth map $G$ that assigns to each $x \in M$ a strictly positive definite symmetric bilinear form $G(x)$ on $E(x)$. We will regard $G(x)$ as a linear map from $E(x)$ to $E(x)^*$, so $G(x)v \in E(x)^*$ if $v \in E(x)$, and the $G(x)$-inner product of $v, w \in E(x)$ is then $\langle G(x)v, w\rangle$. We will write $\|v\|_G$ to denote the length of a vector $v \in E(x)$.
A sub-Riemannian manifold is a triple $(M, E, G)$ such that $M$ is a manifold, $E$ is a bracket-generating subbundle of the tangent bundle $TM$ of $M$, and $G$ is a Riemannian metric on $E$. If $S = (M, E, G)$ is a sub-Riemannian manifold, then we define $\mathcal{A}(S)$, the class of $S$-admissible (or $E$-admissible) arcs, whose elements are the arcs $\xi$ in $M$ that satisfy $\dot\xi(t) \in E(\xi(t))$ for almost every $t \in D(\xi)$. For an arc $\xi \in \mathcal{A}(S)$, we define the
length
$$\lambda(\xi) = \int_{D(\xi)} \|\dot\xi(t)\|_G\, dt. \tag{14.1}$$
An arc $\xi : [a, b] \to M$ that belongs to $\mathcal{A}(S)$ is a minimizer if $\lambda(\xi) \le \lambda(\xi')$ for all $\xi' \in \mathcal{A}(S)$ that start at $\xi(a)$ and end at $\xi(b)$. We let $\mathcal{M}(S)$ be the set of all minimizers. We would like to find necessary conditions for an arc in $\mathcal{A}(S)$ to be a minimizer.

An arc $\xi \in \mathcal{A}(S)$ is parametrized by constant times arc-length (abbr. PCAL) if there exists a constant $c$ such that $\|\dot\xi(t)\|_G = c$ for almost all $t \in D(\xi)$. It is easy to see that every arc coincides, up to a time reparametrization, with a PCAL arc $\xi : [0, 1] \to M$. We let $\mathcal{A}([0,1], S) = \{\xi \in \mathcal{A}(S) : D(\xi) = [0, 1]\}$, and use $\mathcal{M}_{PCAL}(S)$ to denote the class of all PCAL minimizers $\xi \in \mathcal{A}([0,1], S)$. Then it is easily shown that $\xi_* \in \mathcal{M}_{PCAL}(S)$ iff $\xi_*$ minimizes the functional $\xi \mapsto J_G(\xi) = \int_0^1 \tfrac12\|\dot\xi(t)\|_G^2\, dt$ subject to $\partial\xi = \partial\xi_*$ (i.e., $\xi$ has the same endpoints as $\xi_*$). So we will study the minimizers of $J_G$ instead.

To be able to apply the MP, we proceed as in Section 11 and realize $\mathcal{A}([0,1], S)$ as the class of trajectories of a control system. As in Section 11, we do this by using a vector field system with a very large space of vector fields. Precisely, we take $U = \Gamma^\infty(E)$, and write $f_u(x) = f(x, u) = u(x)$. Then $f = \{f_u : u \in U\}$ is a vector field system. We then let $\mathcal{U}$ denote the set of all LI maps $\eta : [0, 1] \to U$. (This means that our space $\mathcal{U}$ of controls is exactly $L^1([0,1], U)$, the space of integrable functions with values in the Fréchet space $U$.) If $\eta \in \mathcal{U}$, we write $F_\eta = \eta$, so $F_\eta(x, t) = \eta(t)(x)$. Now let $\Sigma = (M, \mathcal{U}, F)$. Then $\Sigma$ is a control system generated by $(M, U, f)$ in the sense of our definitions. It is easy to see that the set of trajectories of $\Sigma$ is exactly $\mathcal{A}([0,1], S)$.

Next define $L_\eta(x, t) = \tfrac12 G(\eta(t)(x), \eta(t)(x))$. Then $L$ is a Lagrangian for $\Sigma$, and an arc $\xi_* \in \mathcal{A}([0,1], S)$, arising from a control $\eta_*$, belongs to $\mathcal{M}_{PCAL}(S)$ iff it minimizes $\int_0^1 L_\eta(\xi(t), t)\, dt$ subject to $(\xi, \eta) \in \mathrm{Carc}(\Sigma)$, $\xi(0) = \xi_*(0)$, $\xi(1) = \xi_*(1)$.
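To make the objects $\mathcal{A}(S)$ and $J_G$ concrete, the following sketch (illustrative only; NumPy assumed) uses a standard Heisenberg-type rank-two frame on $\mathbb{R}^3$ — an example chosen here for illustration, not taken from this chapter — to check that a given curve is $E$-admissible and to evaluate its energy $J_G$ numerically.

```python
import numpy as np

# A rank-two subbundle E of T(R^3) with G-orthonormal frame (g1, g2)
# (Heisenberg-type example, used purely for illustration):
g1 = lambda x, y, z: np.array([1.0, 0.0, -y / 2])
g2 = lambda x, y, z: np.array([0.0, 1.0,  x / 2])

# A candidate arc: planar circle of radius r, with the z-component chosen to make it horizontal.
r = 1.0
t = np.linspace(0.0, 2 * np.pi, 2001)
xi = np.stack([r * np.cos(t), r * np.sin(t), r**2 * t / 2], axis=1)
xidot = np.stack([-r * np.sin(t), r * np.cos(t), np.full_like(t, r**2 / 2)], axis=1)

# S-admissibility: xidot(t) must lie in E(xi(t)), i.e. xidot = u1*g1 + u2*g2 with u1 = xdot, u2 = ydot.
recon = np.array([d[0] * g1(*p) + d[1] * g2(*p) for p, d in zip(xi, xidot)])
print(np.allclose(recon, xidot))        # True: xi belongs to A(S)

# Since (g1, g2) is G-orthonormal, ||xidot||_G^2 = u1^2 + u2^2, and J_G is half its integral.
integrand = 0.5 * (xidot[:, 0]**2 + xidot[:, 1]**2)
J_G = float(np.sum(0.5 * (integrand[:-1] + integrand[1:]) * np.diff(t)))   # trapezoidal rule
print(J_G)                              # approximately pi * r**2
```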
Now let $\xi_* : [0, 1] \to M$ belong to $\mathcal{M}_{PCAL}(S)$, and pick an $\eta_* : [0, 1] \to U$ such that $\gamma_* = (\xi_*, \eta_*) \in \mathrm{Carc}(\Sigma)$. Then the MP tells us that there exist an absolutely continuous field $\zeta$ of covectors along $\xi_*$ and a $\nu \in \{0, 1\}$ such
that $(\zeta, \nu) \ne (0, 0)$ and $\zeta$ is a $\nu L$-adjoint vector along $\gamma_*$ that satisfies the minimization condition. Therefore the following hold for a.e. $t \in [0, 1]$:

(MC) $\;\langle\zeta(t), \eta_*(t)(\xi_*(t))\rangle + \nu L_{\eta_*}(\xi_*(t), t) \le \langle\zeta(t), \eta(t)(\xi_*(t))\rangle + \nu L_\eta(\xi_*(t), t)$ for all $\eta \in \mathcal{U}$;

(AE) $\;\dfrac{d}{dt}\langle\zeta(t), X(\xi_*(t))\rangle = \langle\zeta(t), [\eta_*(t), X](\xi_*(t))\rangle - \nu\,(XL_{\eta_*,t})(\xi_*(t))$ for all $X \in \Gamma^\infty(TM)$.
So far, the conditions on $\zeta$ appear to depend on the choice of a control $\eta_*$ that generates $\xi_*$. We now seek to get rid of this dependence, as we did in Section 11 for the Riemannian case. This is most easily done for the minimization condition (MC). Indeed, since $\eta_*(t)(\xi_*(t)) = \dot\xi_*(t)$, (MC) is equivalent to

(MC.b) $\;\langle\zeta(t), \dot\xi_*(t)\rangle + \tfrac{\nu}{2}G(\dot\xi_*(t), \dot\xi_*(t)) \le \langle\zeta(t), v\rangle + \tfrac{\nu}{2}G(v, v)$ for all $v \in E(\xi_*(t))$,

which does not involve $\eta_*$ at all.

Next, we use the fact that $G$ gives rise, for each $x$, to a linear map $G^\#(x) : T_x^*M \to E(x)$, characterized by the identity $G(G^\#(x)(z), v) = z(v)$ for $z \in T_x^*M$, $v \in E(x)$. (That is, $G^\#(x)(z)$ is the vector $w \in E(x)$ such that $G(w, \cdot) = z\restriction E(x)$. In the Riemannian case, $G(x)$ maps $T_xM$ to $T_x^*M$ and is invertible, and $G^\#(x)$ is precisely $G(x)^{-1}$, i.e. "the metric tensor with raised indices," usually denoted by $g^{ij}$. In the general sub-Riemannian case, $G(x)$ is invertible as a map from $E(x)$ to $E^*(x)$, so $G(x)^{-1}$ is a well defined map from $E^*(x)$ to $E(x)$, and $G^\#(x)$ is the composite of the restriction map $z \mapsto z\restriction E(x)$ with $G(x)^{-1}$. In more traditional language, all this amounts to saying that the $g^{ij}$ are well defined in the sub-Riemannian case even though the $g_{ij}$ are not.)

Define $\omega(t) = G^\#(\xi_*(t))(\zeta(t))$. Then $\langle\zeta(t), v\rangle = G(\omega(t), v)$ for $v \in E(\xi_*(t))$. So (MC.b) is equivalent to

(MC.c) $\;G(\omega(t), \dot\xi_*(t)) + \tfrac{\nu}{2}G(\dot\xi_*(t), \dot\xi_*(t)) \le G(\omega(t), v) + \tfrac{\nu}{2}G(v, v)$ for all $v \in E(\xi_*(t))$.
If $\nu = 0$, then (MC.c) is true iff $\omega(t) = 0$ for a.e. $t$. If $\nu = 1$, then (MC.c) holds iff $\omega(t) = -\dot\xi_*(t)$ for a.e. $t$. So (MC.c) is equivalent to

(MC.d) $\;\omega(t) = -\nu\dot\xi_*(t)$.

Now, if $\zeta(t) = 0$ for some $t$, then $\nu = 1$. Since (MC.d) holds a.e., $\omega$ is continuous, and $\omega(t) = 0$, the set $\{s : \|\dot\xi_*(s)\|_G \le \delta\}$ has positive measure for each $\delta > 0$. Since $\xi_*$ is PCAL, we see that $\dot\xi_*(t) = 0$ a.e., so $\xi_*$ is a constant arc. (Constant arcs are of course minimizers, and satisfy (AE) and (MC.d) with $\zeta \equiv 0$, $\nu = 1$.) From now on, we assume that $\xi_*$ is a nonconstant arc, so $\zeta(t) \ne 0$ for all $t$.

We distinguish two cases, depending on whether $\nu = 0$ or $\nu = 1$. A nonzero absolutely continuous field of covectors $\zeta$ along $\xi_*$ that satisfies (MC.d) and (AE) with $\nu = 1$ (resp. $\nu = 0$) is called a normal (resp. abnormal) minimizing adjoint vector along $\xi_*$, and the curve $(\xi_*, \zeta)$ will then be called a normal (resp. abnormal) extremal lift of $\xi_*$. A trajectory that has a normal (resp. abnormal) extremal lift is called a normal (resp. abnormal) extremal.

We begin by analyzing the normal case, i.e. we assume that $\zeta \ne 0$, $\nu = 1$, and (AE) and (MC.d) hold. Let $H_S : T^*M \to \mathbb{R}$ be the function defined by
$$H_S(x, z) = -\tfrac12\|G^\#(x)(z)\|_G^2. \tag{14.2}$$
Then $H_S$ will be called the Hamiltonian of the sub-Riemannian manifold $S$. (This should not be confused with the control-theory Hamiltonian, which is a family of Hamiltonian functions parametrized by the controls $\eta \in \mathcal{U}$.) If $g = (g_1, \ldots, g_m)$ is an orthonormal basis of sections of $E$ on an open subset $\Omega$ of $M$, and we write $\hat\Omega = \pi_{T^*M}^{-1}(\Omega)$, then $H_S = -\tfrac12\sum_{j=1}^m H_{g_j}^2$ on $\hat\Omega$. (Recall from Section 3 that if $X$ is a vector field on $M$ then $H_X$ is the function $(x, z) \mapsto z\,X(x)$ on $T^*M$.) If $X$ is any smooth vector field on $M$, then
$$\{H_S, H_X\} = -\sum_{j=1}^m H_{g_j}\{H_{g_j}, H_X\} = -\sum_{j=1}^m H_{g_j}H_{[g_j, X]}$$
on $\hat\Omega$. So, if $\omega$ is a smooth $1$-form on $\Omega$, and we write $Y = G^\#\omega$, then $\langle\omega(x), g_j(x)\rangle = G(Y, g_j)(x)$, and therefore
$$\{H_S, H_X\}(x, \omega(x)) = -\sum_{j=1}^m G(Y, g_j)(x)\,\langle\omega(x), [g_j, X](x)\rangle.$$
On the other hand, writing $Y = \sum_j G(Y, g_j)g_j$, we find
$$H_{[X,Y]}(x, \omega(x)) = \langle\omega(x), [X, Y](x)\rangle = -\sum_{j=1}^m G(Y, g_j)(x)\,\langle\omega(x), [g_j, X](x)\rangle + \tfrac12\bigl(XG(Y, Y)\bigr)(x).$$
So we have shown that
$$\{H_S, H_X\}(x, \omega(x)) = H_{[X,Y]}(x, \omega(x)) - \tfrac12\bigl(XG(Y, Y)\bigr)(x). \tag{14.3}$$
Returning now to our normal minimizer $\xi_*$, we can pick for each $t$ a smooth
$1$-form $\omega_t$ such that $\omega_t(\xi_*(t)) = \zeta(t)$ and $G^\#(\omega_t) = -\eta_*(t)$. (This is clearly possible, because $G^\#(\xi_*(t))(\zeta(t)) = \omega(t) = -\eta_*(t)(\xi_*(t))$.) Then (14.3) implies that
$$\{H_S, H_X\}(\xi_*(t), \zeta(t)) = H_{[\eta_*(t), X]}(\xi_*(t), \zeta(t)) - \tfrac12\bigl(XG(\eta_*(t), \eta_*(t))\bigr)(\xi_*(t)) = \langle\zeta(t), [\eta_*(t), X](\xi_*(t))\rangle - XL_{\eta_*,t}(\xi_*(t)) \tag{14.4}$$
for all $X \in \Gamma^\infty(TM)$. So the right-hand side of (AE) is equal to $\{H_S, H_X\}(\xi_*(t), \zeta(t))$, so (AE) is equivalent to
$$\frac{d}{dt}H_X(\xi_*(t), \zeta(t)) = \{H_S, H_X\}(\xi_*(t), \zeta(t)) \quad\text{for all } X \in \Gamma^\infty(TM). \tag{14.5}$$
This is easily seen to be equivalent to the condition that

(AE.N) $\;\Xi : t \mapsto (\xi_*(t), \zeta(t))$ is an integral curve of $\vec H_S$.
(It suffices to show that if $z \in T_x^*M \setminus \{0\}$ then every tangent vector $w \in T_{(x,z)}(T^*M)$ is of the form $\vec H_X(x, z)$ for some $X$, a fact that is easily verified in coordinates. So every covector at $(x, z)$ is of the form $dH_X(x, z)$ for some $X$. Then (14.5) says that $dH_X\cdot\dot\Xi = dH_X\cdot\vec H_S$ along $\Xi$ for every $X$, so $\dot\Xi = \vec H_S$ along $\Xi$.)

So we have proved that (MC) is equivalent to (MC.d), and that when these hold with $\nu = 1$ then (AE) is equivalent to (AE.N). Conversely, it is easy to show that if $t \mapsto (\xi(t), \zeta(t))$ is an integral curve of $\vec H_S$, then $\xi$ is admissible and PCAL and $\zeta$ satisfies (MC.d) with $\nu = 1$. (Relative to an orthonormal basis of sections $(g_1, \ldots, g_m)$ as before, Formula (14.2) implies that $H_S(x, z) = -\tfrac12\sum_j\langle z, g_j(x)\rangle^2$, so
$$\dot\xi(t) = -\sum_{j=1}^m\langle\zeta(t), g_j(\xi(t))\rangle\, g_j(\xi(t)),$$
showing that $\xi$ is admissible, and $\dot\xi(t) = -G^\#(\xi(t))(\zeta(t))$, so (MC.d) holds with $\nu = 1$. Finally, along $(\xi, \zeta)$ the function $H_S$ is itself constant, and in view of (MC.d) the value of this constant is $-\tfrac12\|\dot\xi(t)\|_G^2$, so $\xi$ is PCAL.)
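For a concrete picture of (AE.N), the sketch below (illustrative only; SciPy assumed; the rank-two frame is the same standard Heisenberg-type example used earlier, not taken from this chapter) integrates Hamilton's equations for $H_S(x, z) = -\tfrac12\sum_j\langle z, g_j(x)\rangle^2$ and checks numerically that the projected curve has constant $G$-speed, i.e. is PCAL.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Frame g1 = (1, 0, -y/2), g2 = (0, 1, x/2), declared G-orthonormal, so
# H_S(xi, zeta) = -( <zeta, g1(xi)>^2 + <zeta, g2(xi)>^2 ) / 2.
def ham_rhs(t, w):
    x, y, z, p1, p2, p3 = w
    h1 = p1 - 0.5 * y * p3              # <zeta, g1(xi)>
    h2 = p2 + 0.5 * x * p3              # <zeta, g2(xi)>
    # Hamilton's equations: xidot = dH_S/dzeta, zetadot = -dH_S/dxi.
    return [-h1, -h2, 0.5 * y * h1 - 0.5 * x * h2,
            0.5 * p3 * h2, -0.5 * p3 * h1, 0.0]

w0 = [0.0, 0.0, 0.0, 1.0, 0.0, 2.0]     # initial (xi(0), zeta(0))
sol = solve_ivp(ham_rhs, (0.0, 5.0), w0, rtol=1e-10, atol=1e-12)

X, Y, Z, P1, P2, P3 = sol.y
H1, H2 = P1 - 0.5 * Y * P3, P2 + 0.5 * X * P3
# The projection satisfies xidot = -H1*g1 - H2*g2, so it is admissible and (MC.d) holds with
# nu = 1; since H_S is conserved, ||xidot||_G = sqrt(H1^2 + H2^2) is constant (PCAL).
speed = np.sqrt(H1**2 + H2**2)
print(np.allclose(speed, speed[0]))     # True
```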
To study the case when $\nu = 0$, we let $E^\perp$ denote the annihilator of $E$, so $E^\perp(x) = \{z \in T_x^*M : (\forall v \in E(x))(zv = 0)\}$. If $w \in \Gamma^\infty(E^\perp)$, $V \in \Gamma^\infty(E)$ and $X \in \Gamma^\infty(TM)$, then $\langle w, [V, X]\rangle(x)$ depends, for a given $X$, only on the values of $w$ and $V$ at $x$. So each $X$ induces a bilinear map $\Theta_X(x) : E^\perp(x) \times E(x) \to \mathbb{R}$, such that $\Theta_X(z, v) = \langle w, [V, X]\rangle(x)$ whenever $w \in \Gamma^\infty(E^\perp)$, $V \in \Gamma^\infty(E)$ are such that $w(x) = z$, $V(x) = v$. In particular, (AE) says (since $\nu = 0$) that
$$\frac{d}{dt}\langle\zeta(t), X(\xi_*(t))\rangle = \langle\zeta(t), [\eta_*(t), X](\xi_*(t))\rangle \quad\text{for every } X \in \Gamma^\infty(TM). \tag{14.6}$$
On the other hand, (MC.d) implies that $\langle\zeta(t), v\rangle = 0$ for all $v \in E(\xi_*(t))$, so that $\zeta(t) \in E^\perp(\xi_*(t))$. Then (AE) is equivalent to

(AE.A) $\;\dfrac{d}{dt}\langle\zeta(t), X(\xi_*(t))\rangle = \Theta_X(\zeta(t), \dot\xi_*(t))$ for every $X \in \Gamma^\infty(TM)$.

We summarize the above discussion in the following
Theorem 14.1. Let $S = (M, E, G)$ be a sub-Riemannian manifold, and let $\Sigma$ be the corresponding control system. Let $\xi_* : [0, 1] \to M$ be a nonconstant $S$-admissible PCAL arc. Let $\eta_*$ be a control in $\mathcal{U}$ that generates $\xi_*$ as a trajectory of $\Sigma$. Let $\zeta$ be a nonzero absolutely continuous field of covectors along $\xi_*$. Then

[I] $\zeta$ satisfies (MC) and (AE) with $\nu = 1$ iff $\zeta$ satisfies (MC.d) with $\nu = 1$ and (AE.N), and in that case $H_S \ne 0$ along $\Xi$.

[II] $\zeta$ satisfies (MC) and (AE) with $\nu = 0$ iff $\zeta(t) \in E^\perp(\xi_*(t))$ for all $t$ and (AE.A) holds.

More generally, if $(\xi_*, \zeta) : [0, 1] \to T^*M$ is any integral curve of $\vec H_S$, then $\xi_*$ is admissible and PCAL and $\zeta$ satisfies (MC) with $\nu = 1$ and (AE). If $H_S \ne 0$ along $(\xi_*, \zeta)$, then $\xi_*$ is nonconstant, so $\xi_*$ is a normal extremal. ∎
Theorem 14.1 implies in particular (both for $\nu = 1$ and $\nu = 0$) that the pair of conditions (MC), (AE) is independent of the choice of $\eta_*$. Moreover, the nonconstant normal extremals of $\Sigma$ are exactly the projections on
$M$ of the integral curves of $\vec H_S$, from which it follows that all the normal extremals are smooth.

For a Riemannian metric, the abnormal case cannot arise, because $\zeta \in E^\perp$ implies $\zeta = 0$, which is excluded. However, in the sub-Riemannian case Theorem 14.1 does not rule out the possibility that some minimizers might be abnormal, or even strictly abnormal (i.e. abnormal but not normal). The possibility of the existence of strictly abnormal minimizers was incorrectly excluded in some papers such as [14], where it was asserted that every sub-Riemannian minimizer is a normal extremal, with the obvious corollary that every sub-Riemannian minimizer is smooth. Since then, it was recognized, e.g. in [15], that the proof of this theorem had a gap, due precisely to the possibility that $\nu = 0$. However, it was not until the work of R. Montgomery [12] that an example was found of a minimizer that is actually abnormal and not normal. The following much simpler example was presented in Liu-Sussmann [9].

Example 14.1. We let $M = \mathbb{R}^3$, with the usual coordinates $x_1, x_2, x_3$, and take $E$ to be the kernel of the $1$-form $\omega = x_1^2\,dx_2 - (1 - x_1)\,dx_3$. Since $\omega$ never vanishes, $E$ is a smooth two-dimensional subbundle of $TM$. The vector fields $f, g$ given by
$$f = \frac{\partial}{\partial x_1}, \qquad g = (1 - x_1)\frac{\partial}{\partial x_2} + x_1^2\frac{\partial}{\partial x_3}$$
form a global basis of sections of $E$. The Lie brackets $[f, g]$, $[f, [f, g]]$ are easily computed, and it turns out that $f(x)$, $g(x)$, $[f, g](x)$ and $[f, [f, g]](x)$ span $\mathbb{R}^3$ at every point. Therefore $E$ is bracket-generating. We now define a metric $G$ on $E$ by $G = dx_1^2 + h(x_1)(dx_2^2 + dx_3^2)$, where $h(x_1) = \bigl((1 - x_1)^2 + x_1^4\bigr)^{-1}$. Then the vector fields $f$ and $g$ form an orthonormal basis of sections of $E$. It is easy to see that for every $a, b$ such that $a < b$ the restriction to $[a, b]$ of the curve $\tilde\xi : t \mapsto (0, t, 0)$ is an abnormal extremal which is not normal. The following result was proved in [9]:
Theorem 14.2. The restriction of $\tilde\xi$ to any interval of length $\le 3$ is optimal.

(Footnote: A trajectory can be normal and abnormal at once, since it can have two different lifts with different values of the abnormal multiplier.)
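The bracket-generating and orthonormality claims in Example 14.1 are easy to check symbolically. The sketch below (illustrative only; SymPy assumed) computes $[f, g]$ and $[f, [f, g]]$ for the frame of Example 14.1, verifies that $f$, $[f, g]$, $[f, [f, g]]$ already span $\mathbb{R}^3$ at every point, and confirms that $f, g$ are sections of $\ker\omega$ and are $G$-orthonormal.

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
X = sp.Matrix([x1, x2, x3])

f = sp.Matrix([1, 0, 0])
g = sp.Matrix([0, 1 - x1, x1**2])

def bracket(a, b):
    """Lie bracket of vector fields on R^3: [a, b] = (Db) a - (Da) b."""
    return b.jacobian(X) * a - a.jacobian(X) * b

fg = bracket(f, g)          # (0, -1, 2*x1)
ffg = bracket(f, fg)        # (0, 0, 2)

# f, [f,g], [f,[f,g]] span R^3 at every point, so E is bracket-generating.
print(sp.simplify(sp.Matrix.hstack(f, fg, ffg).det()))        # -2, never zero

# omega = x1^2 dx2 - (1 - x1) dx3 annihilates f and g, so both are sections of E = ker(omega).
omega = sp.Matrix([0, x1**2, -(1 - x1)])
print(sp.simplify(omega.dot(f)), sp.simplify(omega.dot(g)))   # 0 0

# With G = dx1^2 + h(x1)(dx2^2 + dx3^2), h = ((1-x1)^2 + x1^4)^(-1), the frame is orthonormal.
h = 1 / ((1 - x1)**2 + x1**4)
Gmat = sp.diag(1, h, h)
print(sp.simplify((g.T * Gmat * g)[0, 0]), sp.simplify((f.T * Gmat * g)[0, 0]))   # 1 0
```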
A much more detailed study of abnormal extremals for sub-Riemannian manifolds with a rank-two distribution $E$ has been carried out in [10].

Remark 14.1. The above example shows that not all minimizers are normal extremals, so the smoothness of all minimizers cannot be proved by using the fact that the normal extremals are smooth, and it is natural to conjecture that nonsmooth minimizers exist. However, even though it is very easy to produce examples of abnormal sub-Riemannian extremals that are not smooth, all examples found so far of abnormal extremals that can be proved to be optimal have turned out to be smooth. As of this writing, the question whether there exist nonsmooth sub-Riemannian minimizers remains unanswered. ∎
15. Miscellaneous examples

In this section we present several examples, in order to illustrate some of the technical issues raised in the preceding discussion. Our first example pertains to the question of the locality of the conditions of the MP, and shows that there is a striking contrast between the classical Calculus of Variations and the more general optimal control situation. Indeed, it is rather obvious that the Euler-Lagrange equations (EL) are local, in the sense that a trajectory $\xi$ defined on an interval $I$ is a solution if and only if every $t \in I$ has a neighborhood $N(t)$ relative to $I$ such that the restriction $\xi\restriction N(t)$ of $\xi$ to $N(t)$ is a solution. The following elementary example shows that the analogous property is not true in general for optimal control problems.
Example 15.1 (Nonlocality of the MP conditions). Consider the minimum time problem for the control system in $\mathbb{R}^2$ given by $\dot x_1 = x_2$, $\dot x_2 = u$, $|u| \le 1$. Notice that for this system a trajectory $\xi$ uniquely determines the control that generates it, so in this case there is no need to distinguish between "trajectories" and "controlled trajectories." The Hamiltonian is given by $H = z_1x_2 + z_2u$. The adjoint equations for a covector field $t \mapsto \zeta(t) = (\zeta_1(t), \zeta_2(t))$ along a trajectory $t \mapsto \xi(t) = (\xi_1(t), \xi_2(t))$ say that
$\dot\zeta_1 = 0$ and $\dot\zeta_2 = -\zeta_1$. So $\zeta_2$ is a linear function of $t$, which is not identically zero. The minimization condition says that $u(t) = 1$ when $\zeta_2(t) < 0$ and $u(t) = -1$ when $\zeta_2(t) > 0$. Since the function $\zeta_2$ is linear, it can have at most one zero. This means that $u(t)$ is "bang-bang" (i.e. with values in the set $\{-1, 1\}$) with at most one switching. Moreover, the constant $\zeta_1$ determines the sense of the switching. Indeed, the condition that $H \le 0$ implies that at a switching time $\bar t$ we have $\zeta_1\xi_2(\bar t) \le 0$. If $\zeta_1 > 0$ (which can only happen if $\xi_2(\bar t) \le 0$) then $\dot\zeta_2(\bar t) < 0$, which means that $\zeta_2$ changes sign from positive to negative, and $u$ switches from $-1$ to $1$. Similarly, if $\zeta_1 < 0$ (which is only possible if $\xi_2(\bar t) \ge 0$) then $\dot\zeta_2(\bar t) > 0$, which means that $\zeta_2$ changes sign from negative to positive, so $u$ switches from $1$ to $-1$. (If $\zeta_1 = 0$ then $\zeta_2$ is a nonzero constant, so there are no switchings.) Therefore, if a trajectory is an extremal, then the control $u(t)$ is bang-bang with at most one switching. A switching from $u = 1$ to $u = -1$ can only occur in the upper half-plane $x_2 \ge 0$, and a switching from $u = -1$ to $u = 1$ can only occur in the lower half-plane $x_2 \le 0$.

Conversely, it is easy to see that every trajectory that satisfies these conditions is an extremal. Indeed, suppose that $\xi$ corresponds to a control $u(\cdot)$ that has exactly one switching. Suppose that this switching occurs at time $\bar t$, and has the "correct" direction, i.e. either (i) $u$ switches from $1$ to $-1$ and $\xi_2(\bar t) \ge 0$, or (ii) $u$ switches from $-1$ to $1$ and $\xi_2(\bar t) \le 0$. Let $t \mapsto \zeta_2(t)$ be a linear function that vanishes at $\bar t$ and has the correct slope, positive in Case (i) and negative in Case (ii). Let $\zeta_1 = -\dot\zeta_2$. Then $\zeta = (\zeta_1, \zeta_2)$ is a minimizing adjoint vector along $\xi$, along which $H \le 0$. So $\xi$ is an extremal.

Now, if $\xi : [a, b] \to \mathbb{R}^2$ is any trajectory corresponding to a bang-bang control with two switchings, and if these two switchings have the correct direction, then it is clear that $\xi$ is locally extremal (i.e. every $t$ has a relative neighborhood $N(t)$ in $[a, b]$ such that $\xi\restriction N(t)$ is an extremal) but $\xi$ is not an extremal. (An example of such a $\xi$ is the curve $\xi : [0, 4] \to \mathbb{R}^2$ that starts with $\xi(0) = (0, 0)$ and corresponds to a control $u(\cdot)$ such that $u(t) = -1$ for $0 \le t < 1$, then $u(t) = 1$ for $1 \le t < 3$ and, finally, $u(t) = -1$ for $3 \le t \le 4$.)
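The two-switching curve described at the end of Example 15.1 is easy to generate numerically. The sketch below (illustrative only; plain Python, no libraries) integrates the double integrator under the stated control and records $x_2$ at each switching time, confirming that both switchings have the "correct" direction even though the whole trajectory is not an extremal.

```python
def u(t):
    """Piecewise-constant control from Example 15.1: -1 on [0,1), +1 on [1,3), -1 on [3,4]."""
    return -1.0 if t < 1 else (1.0 if t < 3 else -1.0)

# Forward-Euler integration of x1dot = x2, x2dot = u from (0, 0) on [0, 4].
dt, n_steps = 1e-4, 40000
x1, x2, t = 0.0, 0.0, 0.0
x2_at_switch = {}
for k in range(n_steps):
    x1, x2 = x1 + dt * x2, x2 + dt * u(t)
    t = (k + 1) * dt
    for s in (1.0, 3.0):
        if abs(t - s) < dt / 2:
            x2_at_switch[s] = x2

# At t = 1 the switch is -1 -> +1 with x2 close to -1 (lower half-plane, "correct" direction);
# at t = 3 the switch is +1 -> -1 with x2 close to +1 (upper half-plane, also "correct").
# Each restriction to a neighborhood of a single switch is extremal, but a linear zeta_2
# cannot vanish twice, so the whole trajectory on [0, 4] is not an extremal.
print(x2_at_switch)   # approximately {1.0: -1.0, 3.0: 1.0}
```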
Our second example illustrates how the formulation based on continuous vector fields (due, as indicated in the introduction, to S. Lojasiewicz Jr.) adds extra strength to the MP.

Example 15.2. Consider the system $\dot x = u(1 + y^{1/3})$, $\dot y = x$, $|u| \le 1$, in two dimensions. Let $\xi(t) = (0, 0)$ for $0 \le t \le 1$. Let us study, using the MP, whether $\xi$ is a boundary trajectory. (It is quite easy to verify directly that $\xi$ is not a boundary trajectory, but our goal here is to see how various forms of the MP would deal with the problem.) Our trajectory corresponds to the vector field $(0, x)$, which is smooth. On the other hand, all the other vector fields of the system, corresponding to nonzero values of the control, are continuous but fail to be locally Lipschitz. So all standard formulations of the MP, including the "nonsmooth" ones, fail to be applicable. The version given here does apply, however, since it only requires the reference vector field to be LIL (and in our case this vector field is actually of class $C^1$), while the other vector fields are allowed to be just continuous. The Hamiltonian is $H = z_1u(1 + y^{1/3}) + z_2x$, and the Hamiltonian minimization condition along $\xi$ requires $\zeta_1 \equiv 0$. The adjoint equation along $\xi$ says $\dot\zeta_1 = -\zeta_2$, so $\zeta_2 \equiv 0$, contradicting the nontriviality condition. So $\xi$ is not a boundary trajectory.
The next two examples illustrate the importance of the distinction between "trajectories" and "controlled trajectories," and the usefulness of formulating the MP in a way that allows for large classes of control vector fields. The first example illustrates the possibility that a trajectory $\xi$ may arise from two different controls $\eta$ and $\tilde\eta$, in such a way that the controlled trajectory $(\xi, \eta)$ is a controlled extremal (for, say, the controllability problem) but $(\xi, \tilde\eta)$ is not. Naturally, since the property of being a boundary trajectory only depends on the trajectory itself, but not on the choice of a control that generates it, the fact that there is one control $\tilde\eta$ generating $\xi$ in such a way that $(\xi, \tilde\eta)$ is not a controlled extremal already suffices to imply that $\xi$ is not a boundary trajectory, but this would not have been detected by just looking at the controlled trajectory $(\xi, \eta)$. The second example, borrowed from the work of Kaskosz and Lojasiewicz
[6], illustrates an even more dramatic possibility. It can happen that a trajectory $\xi$ of a system $\Sigma$ is such that $(\xi, \eta)$ satisfies the necessary conditions of the MP (applied to $\Sigma$) for all possible choices of a control $\eta$ that generates $\xi$, but that there is a larger system $\hat\Sigma$ that gives rise to the same trajectories as $\Sigma$, for which there is a control $\hat\eta$ such that $(\xi, \hat\eta)$ is a LIC$^1$-controlled trajectory that violates the conditions of the MP for $\hat\Sigma$. In other words,
the power of the MP can be enhanced by applying it to a suitably enlarged system rather than to the original system. This is possible in our setting, because our formulation of the MP is stated for essentially arbitrary collections of vector fields, so the addition of many new vector fields does not take us out of the range of applicability of the MP.

Example 15.3. Consider the system $\dot x_1 = u_1$, $\dot x_2 = u_2x_1$, where the controls $u_1, u_2$ take values in the interval $[-1, 1]$, so that the control space $U$ is the two-dimensional square $[-1, 1]^2$. The curve $t \mapsto \xi(t) = (0, 0)$, $0 \le t \le 1$, is obviously a trajectory, corresponding to the control $\eta$ given by $\eta(t) \equiv (0, 0)$, and also to the control $\tilde\eta$ given by $\tilde\eta(t) \equiv (0, 1)$. We want to know if $\xi$ is a boundary trajectory.

The Hamiltonian is $H = z_1u_1 + z_2u_2x_1$. If $t \mapsto \zeta(t) = (\zeta_1(t), \zeta_2(t))$ is an adjoint vector along $(\xi, \eta)$ or $(\xi, \tilde\eta)$, then $\zeta$ must satisfy $\dot\zeta_1 = -u_2\zeta_2$, $\dot\zeta_2 = 0$. Moreover, the minimization of the Hamiltonian requires that $\zeta_1 \equiv 0$, since the reference value $u_1 = 0$ must minimize $\zeta_1(t)u_1$ over $u_1 \in [-1, 1]$. For the controlled trajectory $(\xi, \eta)$, the control $u_2$ vanishes, so we can satisfy the adjoint equations by taking $\zeta_1 \equiv 0$, $\zeta_2 \equiv -1$, and then it is easy to verify that the minimization condition is satisfied. On the other hand, for $(\xi, \tilde\eta)$, $u_2$ is equal to $1$, and therefore the only way to satisfy the adjoint equations with $\zeta_1 \equiv 0$ is to let $\zeta_2 \equiv 0$ as well, contradicting the requirement that $\zeta \ne 0$. Therefore the controlled trajectory $(\xi, \eta)$ is a controlled extremal, but $(\xi, \tilde\eta)$ is not. Since $(\xi, \tilde\eta)$ is not a controlled extremal, we conclude that $(0, 0)$ is an interior point of the reachable set from $(0, 0)$, a fact which is also easily verified directly. ∎
Example 15.4. Let $\Sigma$ be the control system $\dot x_1 = u_1$, $\dot x_2 = |x_1| + u_2 - 1$, where the controls $u_1, u_2$ take values in the interval $[-1, 1]$, so that the control space $U$ is the square $[-1, 1]^2$. (The controls are all measurable $U$-valued functions.) The curve $t \mapsto \xi(t) = (0, 0)$, $0 \le t \le 1$, is obviously a trajectory, corresponding to the control $\eta$ given by $\eta(t) \equiv (0, 1)$. Once again, we want to find out whether $\xi$ is a boundary trajectory. Notice, to begin with, that $\eta$ is the only control giving rise to $\xi$. The corresponding reference vector field is $(0, |x_1|)$, which is Lipschitz. The version of the MP stated in this paper therefore applies to $\Sigma$, and one might suspect that, if $\xi$ is not a boundary trajectory, then it might be possible to detect this by applying Theorem 8.2. We show that this is not the case, by proving:

(A) $(\xi, \eta)$ satisfies the necessary condition of Theorem 8.2 applied to $\Sigma$.

(B) There is a larger system $\hat\Sigma$, having exactly the same trajectories as $\Sigma$, and a control $\hat\eta$ for $\hat\Sigma$, such that $(\xi, \hat\eta)$ does not satisfy the necessary condition of Theorem 8.2 applied to $\hat\Sigma$.

To prove (A), we first seek to apply Theorem 8.2 to $\Sigma$. The Hamiltonian is $H = z_1u_1 + z_2(|x_1| + u_2 - 1)$. The adjoint differential inclusion says that $\dot\zeta_1 \in [-|\zeta_2|, |\zeta_2|]$ and $\dot\zeta_2 = 0$. For our reference controlled trajectory $(\xi, \eta)$, these conditions can be satisfied by taking $\zeta_1 \equiv 0$ and $\zeta_2 \equiv -1$. With this choice, the Hamiltonian minimization condition is satisfied as well, and the resulting value of the Hamiltonian is zero. So $\xi$ passes the test of Theorem 8.2 applied to $\Sigma$.

Now consider the vector field $f$ given by $f(x_1, x_2) = (0, x_1)$ for $x_1 \ge -1$, $f(x_1, x_2) = (0, |x_1| - 2)$ for $x_1 \le -1$. Then $f(x_1, x_2)$ belongs to the set $\{(u_1, |x_1| + u_2 - 1) : (u_1, u_2) \in [-1, 1]^2\}$ for every $(x_1, x_2)$. So, if we add this new vector field to our system, the resulting extended system has exactly the same trajectories. Precisely, define $\hat U = U \cup \{\#\}$, where $\#$ is a new object. Let $\hat{\mathcal{U}}$ be the set of measurable $\hat U$-valued functions defined on compact intervals. For $\hat\eta \in \hat{\mathcal{U}}$, define $F_{\hat\eta}(x, t) = f(x)$ if $\hat\eta(t) = \#$, and $F_{\hat\eta}(x, t) = (u_1, |x_1| + u_2 - 1)$ if $\hat\eta(t) = (u_1, u_2) \in U$. Then define $\hat\Sigma = (\mathbb{R}^2, \hat{\mathcal{U}}, F)$.
It is clear that $\xi$ is an integral curve of $f$. Therefore $(\xi, \hat\eta) \in \mathrm{Ctraj}(\hat\Sigma)$, if we define $\hat\eta$ by letting $\hat\eta(t) \equiv \#$. Moreover, the adjoint equation along the controlled arc $(\xi, \hat\eta)$ says $\dot\zeta_1 = -\zeta_2$ and $\dot\zeta_2 = 0$. The Hamiltonian minimization condition still requires $\zeta_1$ to be $\equiv 0$. But this forces $\zeta_2$ to be $\equiv 0$ as well, violating the nontriviality condition. So $(\xi, \hat\eta)$ fails to satisfy the necessary condition of Theorem 8.2 applied to $\hat\Sigma$.
Example 15.5. The purpose of this example is to show that the integral bounds on partial derivatives or Lipschitz constants of the reference CVF are crucial for the validity of the MP. We will do this by exhibiting an example of a system $\dot x = f(x, u, t)$, $x \in \mathbb{R}$, where $f$ is jointly continuous in $(x, u, t)$, globally bounded, and smooth in $x, u$ for each $t$, for which there exists a trajectory $\xi_* : [0, 1] \to \mathbb{R}$ that goes from $1$ to $0$ and corresponds to a control $u_*$, in such a way that (a) $\xi_*$ is smooth, (b) $u_*$ is constant, (c) the function $t \mapsto \frac{\partial f}{\partial x}(\xi_*(t), u_*(t), t)$ is bounded (actually, $\equiv 0$), and (d) $0$ is on the boundary of the reachable set from $1$, but (e) the conclusion of the MP is not true. The only hypothesis of the MP that fails to hold in this case is the integral bound $|\frac{\partial f}{\partial x}(x, u_*(t), t)| \le h(t)$, with $h$ integrable. In the MP, this is required to hold for the reference control for $x$ near the reference trajectory. (In the example, the bound does hold along the trajectory itself, because $\frac{\partial f}{\partial x}$ vanishes there. So in particular we are showing that it is important that the bound hold not just on $\xi_*$ but on a whole neighborhood.)

We let $\varphi : \mathbb{R}^2 \to \mathbb{R}$ be a continuous function such that (i) $\varphi(x, t)$ is smooth in $x$ for each fixed $t$, (ii) $\varphi$ is bounded, (iii) $\varphi((1-t)^3, t) = 3(1-t)^2$ for $0 \le t \le 1$, (iv) $\frac{\partial\varphi}{\partial x}((1-t)^3, t) = 0$ for $0 \le t \le 1$, (v) $\varphi(x, t) \le 0$ for $x \le 0$. (For example, we can construct such a $\varphi$ by starting with a smooth function $\psi : \mathbb{R} \to \mathbb{R}$ such that $\psi(1) = 1$, $\psi'(1) = 0$, $\psi(s) \le 0$ for $s \le 0$, and $|\psi(s)| \le 1$ for all $s$. Then plug in $s = x/(1-t)^3$, for $t < 1$. The resulting function $(x, t) \mapsto \theta(x, t) = \psi\bigl(x/(1-t)^3\bigr)$, defined for $t < 1$, is still smooth in both variables, and such that $\theta(x, t) \le 0$ for $x \le 0$, $\theta((1-t)^3, t) = 1$, and $|\theta(x, t)| \le 1$. Now multiply this by $3(1-t)^2$. The resulting function $(x, t) \mapsto \sigma(x, t) = 3(1-t)^2\psi\bigl(x/(1-t)^3\bigr)$ is still defined
for $t < 1$ and smooth in both variables. Moreover, $\sigma(x, t) \le 0$ for $x \le 0$, $\sigma((1-t)^3, t) = 3(1-t)^2$, and $|\sigma(x, t)| \le 3(1-t)^2$. Extend $\sigma$ to $\mathbb{R}^2$ by letting $\sigma(x, t) = 0$ for $t \ge 1$, and multiply by a smooth nonnegative function of $t$ that equals $1$ for $t \ge 0$ and $0$ for $t \le -1$. Then $\varphi$ satisfies all our conditions.)

Consider the control system $\dot x = -u^2\varphi(x, t)$, $u \in U$, where $U = \mathbb{R}$, or $U = [-K, K]$ with $K > 1$. The class of control functions can be fairly arbitrary, e.g. $L^1$ or $L^\infty$ or the class of all piecewise continuous controls. Let $\xi_*(t) = (1-t)^3$, $0 \le t \le 1$. Then $\varphi(\xi_*(t), t) = 3(1-t)^2 = -\dot\xi_*(t)$. So $\xi_*$ is a trajectory of our system, corresponding to the control $u_*(t) \equiv 1$. Clearly, $\xi_*(0) = 1$, $\xi_*(1) = 0$. So $0$ is reachable from $1$ in time $1$. On the other hand, it is easy to see that no point $x \in (-\infty, 0)$ can be reached from $1$, because every trajectory $\xi$ satisfies $\dot\xi(t) \ge 0$ whenever $\xi(t) < 0$. So $\xi_*(1)$ belongs to the boundary of the reachable set from $1$.

If the MP applied, it would follow that there exists a nowhere vanishing AC function $\zeta : [0, 1] \to \mathbb{R}$ that satisfies the adjoint equation and is such that the Hamiltonian $H(x, z, u, t) = -u^2z\varphi(x, t)$, evaluated for $x = \xi_*(t)$, $z = \zeta(t)$, is minimized as a function of $u \in U$ by $u = u_*(t)$, i.e. $u = 1$. On the other hand, the function $U \ni u \mapsto -u^2\zeta(t)\varphi(\xi_*(t), t)$ can only be minimized by $u = 1$ if $\zeta(t)\varphi(\xi_*(t), t) = 0$. Since $\varphi(\xi_*(t), t) = 3(1-t)^2 \ne 0$ for $0 \le t < 1$, we conclude that $\zeta(t) \equiv 0$, contradicting the nontriviality condition. Clearly, $\frac{\partial f}{\partial x} = 0$ along $\xi_*$, so our example has all the promised features.

Notice that a function $\varphi$ that satisfies our five conditions cannot possibly obey an integral bound $|\frac{\partial\varphi}{\partial x}(x, t)| \le h(t)$, $h \in L^1$, on a neighborhood of $(0, 1)$. Indeed, assume there was such a bound. Let $P_t$ denote the point $((1-t)^3, t)$, and let $S_t$ denote the segment in the $(x, t)$-plane joining $(0, t)$ to $P_t$. Since $\varphi(0, t) \le 0$, $S_t$ must contain a point $Q_t = (q_t, t)$ where $\varphi = 0$. But then the Mean Value Theorem implies that there is an $R_t$ in the segment from $Q_t$ to $P_t$ such that $\varphi(P_t) = \bigl((1-t)^3 - q_t\bigr)\frac{\partial\varphi}{\partial x}(R_t)$. Since $0 \le q_t < (1-t)^3$, and $\varphi(P_t) = 3(1-t)^2$, we conclude that
$$\Bigl|\frac{\partial\varphi}{\partial x}(R_t)\Bigr| = \frac{3(1-t)^2}{(1-t)^3 - q_t} \ge \frac{3}{1-t}.$$
Since $R_t \to (0, 1)$ as $t \to 1^-$, we see that $h(t) \ge 3/(1-t)$ for $t$ close to $1$, so $h$ cannot be integrable near $t = 1$. ∎
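One concrete choice of the auxiliary function in Example 15.5 — an assumption made here only for illustration, not necessarily the choice the text has in mind — is $\psi(s) = 2s/(1+s^2)$, which is smooth and satisfies $\psi(1) = 1$, $\psi'(1) = 0$, $\psi(s) \le 0$ for $s \le 0$ and $|\psi| \le 1$. The sketch below (SymPy assumed) builds $\sigma(x, t) = 3(1-t)^2\,\psi(x/(1-t)^3)$ and checks properties (iii)-(v) for $t < 1$, together with the $1/(1-t)$ blow-up of $\partial\sigma/\partial x$ away from the reference trajectory that defeats any integrable bound $h$.

```python
import sympy as sp

x, t, s = sp.symbols('x t s', real=True)

psi = 2 * s / (1 + s**2)                 # one admissible choice of psi (assumption)
sigma = 3 * (1 - t)**2 * psi.subs(s, x / (1 - t)**3)

# (iii): sigma((1-t)^3, t) = 3(1-t)^2
print(sp.simplify(sigma.subs(x, (1 - t)**3)))               # equals 3*(1-t)**2

# (iv): d(sigma)/dx vanishes along x = (1-t)^3 (this uses psi'(1) = 0)
print(sp.simplify(sp.diff(sigma, x).subs(x, (1 - t)**3)))   # 0

# (v): sigma(x, t) <= 0 for x <= 0 (since psi(s) <= 0 for s <= 0); numerical spot check
print(sigma.subs({x: -1, t: sp.Rational(1, 2)}) <= 0)       # True

# The x-derivative at x = 0 equals 6/(1-t), so any h(t) bounding |d sigma/dx| on a
# neighborhood of (0, 1) must blow up like 1/(1-t) and cannot be integrable near t = 1.
print(sp.simplify(sp.diff(sigma, x).subs(x, 0)))            # 6/(1-t), up to equivalent form
```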
16. Conclusion: beyond the Maximum Principle

We close this introduction to the Maximum Principle by pointing out that this very powerful result is only the beginning of a very rich theory. It can be shown by means of very simple examples that the Maximum Principle may fail to detect controllability or to exclude optimality in situations where other standard mathematical tools would easily settle the question.
Example 16.1. Consider the system $\dot x_1 = u_1$, $\dot x_2 = u_2$, $\dot x_3 = u_2x_1$, with controls $u_1, u_2$ such that $|u_1| \le 1$, $|u_2| \le 1$. Consider the trajectory $\xi$ given by $\xi(t) = (0, 0, 0)$, corresponding to the control $\eta \equiv (0, 0)$. Once again, we ask if $\xi$ is a boundary trajectory. It is easy to show that the necessary conditions of the Maximum Principle are satisfied along $(\xi, \eta)$. Moreover, $\eta$ is the only control giving rise to $\xi$. In fact, it can be shown that the necessary conditions of the Maximum Principle will still be satisfied for any enlarged system that has the same trajectories, so the enlargement procedure considered in the previous section will fail to detect the fact that $\xi$ is not a boundary trajectory.

However, this fact can be detected by elementary means using standard tools from other areas of mathematics. Precisely, let us write our system as $\dot x = u_1f_1(x) + u_2f_2(x)$, where $f_1$ is the vector field $(1, 0, 0)$ and $f_2$ has components $(0, 1, x_1)$. The system dynamics directly tells us that motion in the directions of $f_1$ and $f_2$ and their negatives is possible. To find out if a full neighborhood of the origin can be reached, we compute the Lie bracket of these two vector fields, and find that $[f_1, f_2] = (0, 0, 1)$. So $f_1$, $f_2$ and $[f_1, f_2]$ are linearly independent. This implies that motion in a third direction is possible, so $\xi$ is not a boundary trajectory.

The previous example shows that, as far as controllability is concerned (and similar examples can easily be given for optimality problems), the Maximum Principle only captures one aspect of a much more complex situation, while simple tools from differential geometry can deal with cases that are beyond the power of the various versions and extensions of Pontryagin's result. The controllability of the system of Example 16.1 is
a particular case of a general theorem relating controllability to Lie algebraic conditions, known as "Chow's Theorem," whose use was advocated by R. Hermann as far back as 1963 (cf. [5]), i.e. shortly after the publication of the book [13] that first presented the Maximum Principle. The natural conclusion to draw is that, to realize its full power, the Maximum Principle has to be combined with differential geometric tools. This, of course, is what the burgeoning field of "differential geometric optimal control theory" is about, and the many successes of this line of research in the last twenty years confirm the validity of this conclusion.
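The bracket computation that settles Example 16.1 is a one-line symbolic check. The sketch below (illustrative only; SymPy assumed) verifies that $f_1$, $f_2$ and $[f_1, f_2]$ span $\mathbb{R}^3$ at the origin, which is the Lie-algebraic (Chow-type) condition invoked above.

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
X = sp.Matrix([x1, x2, x3])

f1 = sp.Matrix([1, 0, 0])        # the two control vector fields of Example 16.1
f2 = sp.Matrix([0, 1, x1])

def bracket(a, b):
    """Lie bracket [a, b] = (Db) a - (Da) b of vector fields on R^3."""
    return b.jacobian(X) * a - a.jacobian(X) * b

f12 = bracket(f1, f2)
print(f12.T)                                               # Matrix([[0, 0, 1]])
M = sp.Matrix.hstack(f1, f2, f12).subs({x1: 0, x2: 0, x3: 0})
print(M.rank())                                            # 3: full rank at the origin
```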
REFERENCES

[1] F. Albrecht, Topics in Control Theory, Springer-Verlag, Berlin-Heidelberg-New York, 1968.
[2] L. D. Berkovitz, Optimal Control Theory, Springer-Verlag, New York, 1974.
[3] F. H. Clarke, The Maximum Principle under minimal hypotheses, S.I.A.M. J. Control Optim. 14 (1976), 1078-1091.
[4] F. H. Clarke, Optimization and Nonsmooth Analysis, Wiley Interscience, New York, 1983.
[5] R. Hermann, On the accessibility problem in control theory, in: International Symposium on Nonlinear Differential Equations and Nonlinear Mechanics, J. P. LaSalle and S. Lefschetz Eds., Academic Press, New York (1963).
[6] B. Kaskosz and S. Lojasiewicz Jr., A Maximum Principle for generalized control systems, Nonlinear Anal. TMA 9 (1985), 109-130.
[7] E. B. Lee and L. Markus, Foundations of Optimal Control Theory, J. Wiley, New York, 1967.
[8] G. Leitmann, An Introduction to Optimal Control, McGraw-Hill, New York, 1966.
[9] W. S. Liu and H. J. Sussmann, Abnormal sub-Riemannian minimizers, in: Differential Equations, Dynamical Systems and Control Science, K. D. Elworthy, W. N. Everitt, and E. B. Lee Eds., Lect. Notes Pure Appl. Math., Vol. 152, M. Dekker, New York (1993), 705-716.
[10] W. S. Liu and H. J. Sussmann, Shortest paths for sub-Riemannian metrics on rank-2 distributions, to appear in the Memoirs of the American Math. Society.
[11] J. Macki and A. Strauss, Introduction to Optimal Control Theory, Springer-Verlag, New York, 1982.
[12] R. Montgomery, Geodesics which do not satisfy the geodesic equations, 1991 preprint.
[13] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, E. F. Mischenko, The Mathematical Theory of Optimal Processes (translated by K. N. Trirogoff, L. W. Neustadt, editor), J. Wiley, 1962.
[14] R. Strichartz, Sub-Riemannian Geometry, J. Diff. Geom. 24 (1986), 221-263.
[15] R. Strichartz, Corrections to "Sub-Riemannian Geometry", J. Diff. Geom. 30, No. 2 (1989), 595-596.
[16] H. J. Sussmann, A bang-bang theorem with bounds on the number of switchings, S.I.A.M. J. Control and Optimization 17 (1979), 629-651.
[17] H. J. Sussmann, The structure of time-optimal trajectories for single-input systems in the plane: the C^∞ nonsingular case, S.I.A.M. J. Control and Optimization 25 (1987), 433-465.
[18] H. J. Sussmann, The structure of time-optimal trajectories for single-input systems in the plane: the general real-analytic case, S.I.A.M. J. Control and Optimization 25 (1987), 868-904.
[19] H. J. Sussmann, Regular synthesis for time-optimal control of single-input real-analytic systems in the plane, S.I.A.M. J. Control and Optimization 25, No. 5 (1987), 1145-1162.
[20] H. J. Sussmann, A weak regularity theorem for real analytic optimal control problems, Revista Matemática Iberoamericana 2, No. 3 (1986), 307-317.
[21] H. J. Sussmann, Trajectory regularity and real analyticity: some recent results, Proc. 25th I.E.E.E. Conference on Decision and Control, Athens, Greece (1986), 592-595.
[22] H. J. Sussmann, A strong version of the Lojasiewicz Maximum Principle, in: Optimal Control of Differential Equations, Nicolai H. Pavel Ed., M. Dekker Inc., New York (1994).
[23] H. J. Sussmann, A strong version of the Maximum Principle under weak hypotheses, Proc. 33rd IEEE Conf. Dec. Control, Orlando, FL, 1994, 1950-1956.
Appendix: List of abbreviations

AC: absolutely continuous
Adj(γ): the set of all adjoint vectors along the controlled curve γ = (ξ, F), i.e. Adj(γ, 0)
Adj(γ, L): the set of all L-adjoint vectors along the controlled curve γ
Adj^Σ(γ), Adj^Σ(γ, L): the set of all adjoint (resp. L-adjoint) vectors along the controlled trajectory γ of a control system Σ
Adj^Σ_smin(γ): Adj^Σ_smin(γ, 0)
Adj^Σ_smin(γ, L): the set of all strongly minimizing ζ ∈ Adj^Σ(γ, L)
Adj^Σ_wmin(γ): Adj^Σ_wmin(γ, 0)
Adj^Σ_wmin(γ, L): the set of all weakly minimizing ζ ∈ Adj^Σ(γ, L)
AE: adjoint equation
AI: adjoint inclusion
ARC(M): the set of all arcs in M
ARC(I, M): the set of all arcs in M with domain I
Carc(Σ): the set of all controlled arcs of a system Σ = (M, 𝒰, F)
Carc_P(Σ): the set of γ ∈ Carc(Σ) that have Property P
Carc_wa(Σ, L): the set of all γ ∈ Carc(Σ) that are weakly acceptable for L
C*: the polar of a cone C
Ccrv(𝓕): the set of all controlled curves γ = (ξ, F) such that F ∈ 𝓕
Ccrv_P(𝓕): the set of γ ∈ Ccrv(𝓕) that have Property P
CF: Carathéodory function
CF(M): the set of all CF's on M
CF_P(M): the set of all CF's on M that have Property P
CF_P(I, M): the set of all φ ∈ CF_P(M) with time domain I
CVF_P(M): the set of all CVF's on M that have Property P
CVF_P(I, M): the set of all f ∈ CVF_P(M) with time domain I
C^k(M): the space of real-valued functions of class C^k on M
C. of V.: Calculus of Variations
CRV(M), CRV_LAC(M): the set of all curves (resp. LAC curves) in M
CRV(I, M), CRV_LAC(I, M): the set of ξ ∈ CRV(M) (resp. CRV_LAC(M)) with domain I
Ctraj(Σ): the set of all controlled trajectories of a control system Σ
Ctraj_P(Σ): the set of all γ ∈ Ctraj(Σ) that have Property P
CVF: control vector field
CVF(M): the set of all CVF's on M
D(·): domain of ·
DEL(E, F): the CVF obtained from F by removing F↾E
F^L: M × I ∋ (x, t) → F^L(x, t) = (F(x, t), L(x, t)) ∈ TM × ℝ
𝓕(Σ): the set {F_η : η ∈ 𝒰}
𝓕̃(Σ): the set of CVF's that occur in Σ, cf. Section 5
G(·): graph of ·
H⃗: the Hamilton vector field corresponding to the function H
H_F: H_{F,0}, i.e. H_F(x, z, t) = z·F(x, t)
H_{F,L}: (for F ∈ CVF(M) and L ∈ CF(M)) the time-varying function on T*M given by H_{F,L}(x, z, t) = z·F(x, t) + L(x, t)
H^{Σ,L} (= {H_{F_η,L} : η ∈ 𝒰}): the Hamiltonian function associated with a control system Σ = (M, 𝒰, F) and a Lagrangian L for Σ
𝓘: the set of all nontrivial intervals
IAE: inhomogeneous AE
IAI: inhomogeneous AI
IC: integral curve
IC(F): the set of integral curves of the CVF F
IC(𝓕): ∪_{F∈𝓕} IC(F)
INS(E, F, G): the CVF obtained from F by inserting G over E
IVE: inhomogeneous VE
IVV(ξ, Σ, p), IAV(ξ, Σ, z): the sets of solutions of 𝓛_vec v = p and 𝓛_cov ζ = z, respectively
J^k(E): the bundle of k-jets of sections of the bundle E
J^k_x(E): the set of k-jets at x of sections of the bundle E
LAC: locally absolutely continuous
LB: locally bounded
LD^{L¹}(ξ): the class of L¹ connections along ξ
LD[M]: the bundle ∪_{x∈M} LD(x)
LD[[M]]: the bundle over TM with fiber LD(x, v) at (x, v)
LD(x): ∪_{v∈T_xM} LD(x, v)
LD(x, v): the set of all Lie differentiations at x in the direction of v
L_{x,X}: the element of LD(x, X(x)) defined by L_{x,X}(Y) = [X, Y](x)
LD(ξ): the set of all Lie differentiations along a curve ξ
LI: locally integrable
LIB: locally integrably bounded
LIBU: LIB with uniqueness
LIC^k: locally integrably of class C^k
LIL: locally integrably Lipschitz
LL: locally Lipschitz
𝓛_X: the map M ∋ x → L_{x,X} ∈ LD(x)
𝓛_γ: the connection canonically associated with a LIC¹-controlled curve γ = (ξ, F)
𝓛̃_γ: the set of connections canonically associated with a LIL-controlled curve γ = (ξ, F)
𝓛_γ^L: the set of all measurable selections of ∂F^L ∘ γ
𝓛_vec, 𝓛_cov: the maps defined in Section 4
L¹_loc,+(I): the set of all nonnegative locally Lebesgue integrable functions on I
MP: Maximum Principle
ℕ: the set of integers ≥ 0
ℕ̄: the set ℕ ∪ {∞}
ODE: ordinary differential equation
PCAL: parametrized by constant times arc-length
ℝ₊: the half-line [0, ∞)
ℝ̄: ℝ ∪ {+∞}
ℝ̿: ℝ ∪ {+∞, −∞}
ℝⁿ: the space of real n-component column vectors
R_Σ(x): the Σ-reachable set from x
R_Σ^{[a,b]}(x): the Σ-reachable set from x over [a, b]
R_Σ^{[a,b],*}(x): ∪{R_Σ^{[a,c]}(x) : a ≤ c ≤ b}
SUB(E, F, G): the CVF obtained from F by substituting G over E
TD(F): time domain of F
TD(F₁, …, F_m): TD(F₁) ∩ ⋯ ∩ TD(F_m)
TM: the tangent bundle of M
T*M: the cotangent bundle of M
T#M: T*M with the zero section removed
T_xM: the tangent space of M at x
T*_xM: the cotangent space of M at x
𝒰_{P,F} or 𝒰_{P,Σ}: the class of all P-admissible η ∈ 𝒰
VE: variational equation
VV(ξ, Σ), AV(ξ, Σ): the sets IVV(ξ, Σ, 0), IAV(ξ, Σ, 0), respectively
Γ(E): the set of sections of the bundle E
Γ(E, ξ): the set of sections of the bundle E along the curve ξ
Γ_P(E), Γ_P(E, ξ): the set of Q ∈ Γ(E) (resp. Q ∈ Γ(E, ξ)) that have Property P
Γ^k(E), Γ^k(E, ξ): Γ_{C^k}(E), Γ_{C^k}(E, ξ)
π_E: the canonical projection of a bundle E onto its base
π_{LD,x}: the natural projection LD(x) → T_xM
χ_S: the indicator function of the set S
Σ↾𝒰̂: the restriction of the control system Σ = (M, 𝒰, F) to 𝒰̂, if 𝒰̂ is a subset of 𝒰
Σ↾P: Σ↾𝒰_{P,Σ}
Σ^L: the augmented system associated with a control system Σ = (M, 𝒰, F) and a Lagrangian L = {L_η : η ∈ 𝒰}
Ω: the canonical symplectic form on T*M
{·, ·}: the Poisson bracket
[·, ·]: the Lie bracket
↾: restriction
^t: transpose