E. Evangelisti (Ed.)
Controllability and Observability Lectures given at a Summer School of the Centro Internazionale Matematico Estivo (C.I.M.E.), held in Pontecchio (Bologna), Italy, July 1-9, 1968
C.I.M.E. Foundation c/o Dipartimento di Matematica “U. Dini” Viale Morgagni n. 67/a 50134 Firenze Italy
[email protected]
ISBN 978-3-642-11062-7 e-ISBN: 978-3-642-11063-4 DOI:10.1007/978-3-642-11063-4 Springer Heidelberg Dordrecht London New York
© Springer-Verlag Berlin Heidelberg 2010. Reprint of the 1st ed. C.I.M.E., Ed. Cremonese, Roma 1969. With kind permission of C.I.M.E.
Printed on acid-free paper
Springer.com
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)
1° Ciclo - Sasso Marconi - dal 1-9 Luglio 1968

CONTROLLABILITY AND OBSERVABILITY
Coordinatore: Prof. G. EVANGELISTI

R. E. KALMAN: Lectures on Controllability and Observability      pag.
R. KULIKOWSKI: Controllability and Optimum Control                 "   151
A. STRASZAK: Supervisory Controllability                           "   193
L. WEISS: Lectures on Controllability and Observability            "   201
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)

LECTURES ON CONTROLLABILITY AND OBSERVABILITY

R. E. KALMAN (Stanford University)

Corso tenuto a Sasso Marconi (Bologna) dal 1 al 9 Luglio 1968
TABLE OF CONTENTS

 0. Introduction.                                               5
 1. Classical and modern dynamical systems.                    15
 2. Standardization of definitions and "classical" results.    23
 3. Definition of states via Nerode equivalence classes.       35
 4. Modules induced by linear input/output maps.               43
 5. Cyclicity and related questions.                           59
 6. Transfer functions.                                        78
 7. Abstract construction of realizations.                     92
 8. Construction of realizations.                              98
 9. Theory of partial realizations.                           112
10. General theory of observability.                          119
11. Historical notes.                                         133
12. References.                                               142
INTRODUCTION

The theory of controllability and observability has been developed, one might almost say reluctantly, in response to problems generated by technological science, especially in areas related to control, communication, and computers. It seems that the first conscious steps to formalize these matters as a separate area of (system-theoretic or mathematical) research were undertaken only as late as 1959, by KALMAN [1960b-c]. There have been, however, many scattered results before this time (see Section 12 for some historical comments and references), and one might confidently assert today that some of the main results have been discovered, more or less independently, in every country which has reached an advanced stage of "development", and it is certain that these same results will be rediscovered again in still more places as other countries progress on the road to development.

With the perspective afforded by ten years of happenings in this field, we ought not hesitate to make some guesses of the significance of what has been accomplished. I see two main trends:

(i) The use of the concepts of controllability and observability to study nonclassical questions in optimal control and optimal estimation theory, sometimes as basic hypotheses securing existence, more often as seemingly technical conditions which allow a sharper statement of results or shorter proofs.

(ii) Interaction between the concepts of controllability and observability and the study of the structure of dynamical systems, such as: formulation and solution of the problem of realization, canonical forms, decomposition of systems.

The first of these topics is older and has been studied primarily from the point of view of analysis, although the basic lemma (2.7) is purely algebraic. The second group of topics may be viewed as "blowing up" the ideas inherent in the basic lemma (2.7), resulting in a more and more strictly algebraic point of view.

There is active research in both areas. In the first, attention has shifted from the case of systems governed by finite-dimensional linear differential equations with constant coefficients (where success was quick and total) to systems governed by infinite-dimensional linear differential equations (delay differential equations, classical types of partial differential equations, etc.), to finite-dimensional linear differential equations with time-dependent coefficients, and finally to all sorts and subsorts of nonlinear differential equations. The first two topics are surveyed concurrently by WEISS [1969] while MARKUS [1965] looks at the nonlinear situation.

My own current interest lies in the second stream, and these lectures will deal primarily with it, after a rather hurried overview of the general problem and of the "classical" results.

Let us take a quick look at the most important of these "classical" results. For convenience I shall describe them in system-theoretic (rather than conventional pure mathematical) language. The mathematically trained reader should have no difficulty in converting them into his preferred framework, by digging a little into the references.

In area (i), the most important results are probably those which give more or less explicit and computable results for controllability and observability of certain specific classes of systems. Beyond these, there seem to be two main theorems:

THEOREM A.
A real, continuous-time, n-dimensional, constant, linear dynamical system Σ has the property "every set of n eigenvalues may be produced by suitable state feedback" if and only if Σ is completely controllable.

The central special case is treated in great detail by KALMAN, FALB, and ARBIB [1969, Chapter 2, Theorem 5.10]; for a proof of the general case with background comments, refer to WONHAM [1967]. As a particular case, we have that every system satisfying the hypotheses of the theorem can be "stabilized" (made to have eigenvalues with negative real parts) via a suitable choice of feedback. This result is the "existence theorem" for algorithms used to construct control systems for the past three decades, and yet a conscious formulation of the problem and its mathematical solution go back only to about 1963. (See Theorem D below.) The analogous problem for nonconstant linear systems (governed by linear differential equations with variable coefficients) is still not solved.
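The following short Python sketch (not part of the original lectures) illustrates Theorem A numerically for a constant system: given a completely controllable pair, an arbitrary self-conjugate set of n eigenvalues is produced by state feedback. The example matrices and the use of the SciPy routine scipy.signal.place_poles are illustrative assumptions, not the construction discussed in the text.

```python
import numpy as np
from scipy.signal import place_poles

# Hypothetical constant system dx/dt = F x + G w (a double integrator, chosen for illustration).
F = np.array([[0.0, 1.0],
              [0.0, 0.0]])
G = np.array([[0.0],
              [1.0]])

# Theorem A: since (F, G) is completely controllable, any desired set of n eigenvalues
# can be produced by feedback w = -K x.
desired = [-1.0, -2.0]
K = place_poles(F, G, desired).gain_matrix

print(np.linalg.eigvals(F - G @ K))   # approximately the desired eigenvalues -1, -2
```

In particular, choosing all desired eigenvalues with negative real parts "stabilizes" the system, as noted above.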
THEOREM B. ("Duality Principle") Every problem of controllability in a real (continuous-time or discrete-time), finite-dimensional, constant, linear dynamical system is equivalent to a problem of observability in a dual system.

This fact was first observed by KALMAN [1960a] in the solution of the optimal stochastic filtering problem for discrete-time systems, and was soon applied to several problems in system theory by KALMAN [1960b-c]. See also many related comments by KALMAN, FALB, and ARBIB [1969, Chapters 2 and 6]. As a theorem, this principle is not yet known to be valid outside the linear area, but as an intuitive prescription it has been rather useful in guiding system-theoretic research. The problems involved here are those of formulation rather than proof. The basic difficulties seem to point toward algebra and in particular category theory. System-theoretic duality, like the categoric one, is concerned with "reversing arrows". See Section 10 for a modern discussion of these points and a precise version of Theorem B.

Partly as a result of the questions raised by Theorem B and partly because of the algebraic techniques needed to prove Theorem A and related lemmas, attention in the early 1960's shifted toward certain problems of a structural nature which were, somewhat surprisingly at first, found to be related to controllability and observability. The main theorems again seem to be two:

THEOREM C. (Canonical Decomposition) Every real (continuous-time or discrete-time), finite-dimensional, constant, linear dynamical system may be canonically decomposed into four parts, of which only one part, that which is completely controllable and completely observable, is involved in the input/output behavior of the system.

The proof given by KALMAN [1962] applies to nonconstant systems only under the severe restriction that the dimensions of the subspaces of all controllable and all unobservable states are constant on the whole real line. The result represented by Theorem C is far from definitive, however, since finite-dimensional, linear, nonconstant systems admit at least four different canonical decompositions: it is possible and fruitful to dualize the notions of controllability and observability, thereby arriving at four properties, presently called reachability and controllability as well as constructibility* and observability. (See Section 2 for definitions.) Any combination of a property from the first list with a property from the second list gives a canonical decomposition result analogous to Theorem C. The complexity of the situation was first revealed by WEISS and KALMAN [1965]; this paper contributed to a revival of interest (with hopes of success) in the special problems of nonconstant linear systems. Recent progress is surveyed by WEISS [1969].

*WEISS [1969] uses "determinability" instead of constructibility. The new terminology used in these lectures is not yet entirely standard.
Intimately related to the canonical structure theorem, and in fact necessary to fully clarify the phrase "involved in the input/output behavior of the system", is the last basic result:

THEOREM D. (Uniqueness of Minimal Realization) Given the impulse-response matrix W of a real, continuous-time, finite-dimensional, linear dynamical system, there exists a real, continuous-time, finite-dimensional, linear dynamical system Σ_W which

(a) realizes W: that is, the impulse-response matrix of Σ_W is equal to W;

(b) has minimal dimension in the class of linear systems satisfying (a);

(c) is completely controllable and completely observable;

(d) is uniquely determined (modulo the choice of a basis at each t for its state space) by requirement (a) together with (b) or, independently, by (a) together with (c).

In short, for any W as described above, there is an "essentially unique" Σ of the same "type" which satisfies (a) through (c).

COROLLARY 1. If W comes from a constant system, there is a constant Σ which satisfies (a) through (c), and is uniquely determined by (a) + (b) or (a) + (c) (modulo a fixed choice of basis for its state space).
COROLLARY 2. All claims of Corollary 1 continue to hold if "impulse-response matrix of a constant, finite-dimensional system" is replaced by "transfer function matrix of a constant, finite-dimensional system".

The first general discussion of the situation with an equivalent statement of Theorem D is due to KALMAN [1963b, Theorems 7 and 8]. (This paper does not include complete proofs, or even an explicit statement of Corollaries 1 and 2, although they are implied by the general algorithm given in Section 7. An edited version of the original unpublished proof of Theorem D is given in KALMAN, FALB, and ARBIB [1969, Chapter 10, Appendix C].)

These results are of great importance in engineering system theory since they relate methods based on the Laplace transform (using the transfer function of the system) and the time-domain methods based on input/output data (the matrix W) to the state-variable (dynamical system) methods developed in 1955-1960. In fact, by Corollary 1 it follows that the two methods must yield identical results; for instance, starting with a constant impulse-response matrix W, property (c) implies that the existence of a stable control law is always assured by virtue of Theorem A. Thus it is only after the development represented by Theorems A-D that a rigorous justification is obtained for the intuitive design methods used in control engineering.
As with Theorem C, certain formulational difficulties arise in connection with a precise definition of a "nonconstant linear dynamical system". Thus, it seems preferable at present to replace in Theorem D "impulse-response matrix W" (or "abstract input/output map W") by "weighting pattern W" and "complete controllability" by "complete reachability". The definitive form of the 1963 theorem evolved through the works of WEISS and KALMAN [1965], YOULA [1966], and KALMAN; a precise formulation and modernized proof of Theorem D in the weighting pattern case was given recently by KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 13]. A completely general discussion of what is meant by a "minimal realization" of a nonconstant impulse-response matrix involves many technical complications due to the fact that such a minimal realization does not exist in the class of linear differential equations with "nice" coefficient functions. For the current status of this problem, consult especially DESOER and VARAIYA [1967], SILVERMAN and MEADOWS [1969], KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 13] and WEISS [1969].

From the standpoint of the present lectures, by far the most interesting consequence of Theorem D is its influence, via efforts to arrive at a definitive proof of Corollary 1, on the development of the algebraic stream of system theory.
The first proof of this important result (in the special case of distinct eigenvalues) is that of GILBERT [1963]. Immediately afterwards, a general proof was given by KALMAN [1963b, Section 7]. This proof, strictly computational and linear algebraic in nature, yields no theoretical insight although it is useful as the basis of a computer algorithm. Using the classical theory of invariant factors, KALMAN [1965a] succeeded in showing that the solution of the minimal realization problem can be effectively reduced to the classical invariant-factor algorithm. This result is of great theoretical interest since it strongly suggests the now standard module-theoretic approach, but it does not lead to a simple proof of Corollary 1 and is not a practical method of computation.

The best known proof of Corollary 1 was obtained in 1965 by B. L. Ho, with the aid of a remarkable algorithm which is equally important from a theoretical and computational viewpoint. The early formulation of the algorithm was described by HO and KALMAN [1966], with later refinements discussed in HO and KALMAN [1969], KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 11] and KALMAN [1969c]. Almost simultaneously with the work of B. L. Ho, the basic results were discovered independently also by YOULA and TISSI [1966] and by SILVERMAN [1966]. The subject goes back to the 19th century and centers around the theory of Hankel matrices; however, many of the results just referenced seem to be fundamentally new. This field is currently in a very active stage of development. We shall discuss the essential ideas involved in Sections 8-9. Many other topics, especially Silverman's generalization of the algorithm to nonconstant systems, unfortunately cannot be covered due to lack of time.
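To make the Hankel-matrix idea concrete, here is a minimal modern sketch in Python of constructing a constant minimal realization from Markov parameters. It is not the formulation of HO and KALMAN [1966] itself; the hidden system used to generate the data, the block size r, and the rank tolerance are illustrative assumptions.

```python
import numpy as np

# Hypothetical data: Markov parameters Y_k = H F^k G of an (unknown) 2-dimensional constant system.
F_true = np.array([[0.5, 1.0], [0.0, 0.3]])
G_true = np.array([[0.0], [1.0]])
H_true = np.array([[1.0, 0.0]])
Y = [H_true @ np.linalg.matrix_power(F_true, k) @ G_true for k in range(8)]

r = 3  # number of block rows/columns; assumed larger than the minimal dimension
Hank  = np.block([[Y[i + j]     for j in range(r)] for i in range(r)])
Hank1 = np.block([[Y[i + j + 1] for j in range(r)] for i in range(r)])

# The rank of the Hankel matrix is the minimal state dimension (cf. Theorem D, Corollary 1).
U, s, Vt = np.linalg.svd(Hank)
n = int(np.sum(s > 1e-9))
O = U[:, :n] * np.sqrt(s[:n])             # "observability" factor
C = np.sqrt(s[:n])[:, None] * Vt[:n]      # "reachability" factor

# A minimal realization, unique up to a change of basis in the state space.
F = np.linalg.pinv(O) @ Hank1 @ np.linalg.pinv(C)
G = C[:, :1]
H = O[:1, :]

# Check that the realization reproduces the given impulse-response data.
Y_hat = [H @ np.linalg.matrix_power(F, k) @ G for k in range(8)]
assert np.allclose(Y, Y_hat)
```

Any two systems produced this way from the same data are related by a basis change, which is the "essential uniqueness" asserted in Corollary 1.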
Acknowledgment

It is a pleasure to thank C.I.M.E. and its organizers, especially Professors E. Bompiani, E. Sarti, and E. Belardinelli, for arranging a special conference on these topics. The sunny skies and hospitality of Italy, along with Bolognese food, played a subsidiary but vital part in the success of this important gathering of scientists.
1. CLASSICAL AND MODERN DYNAMICAL SYSTEMS
In mathematics the term dynamical system (synonyms: topological dynamics, flows, abstract dynamics, etc.) usually connotes the action of a one-parameter group T (the reals) on a set X, where X is at least a topological space (more often, a differentiable manifold) and the action is at least continuous. This setup is physically motivated, but in a very old-fashioned sense. A "dynamical system" as just defined is an idealization, generalization, and abstraction of Newton's world view of the Solar System as described via a finite set of nonlinear ordinary differential equations. These equations represent the positions and momenta of the planets regarded as point masses and are completely determined by the laws of gravitation, i.e., they do not contain any terms to account for "external" forces that may act on the system.

Interesting as this notion of a dynamical system may be (and is!) in pure mathematics, it is much too limited for the study of those dynamical systems which are of contemporary interest. There are at least three different ways in which the classical concept must be generalized:

(i) The time set of the system is not necessarily restricted to the reals;

(ii) A state x ∈ X of the system is not merely acted upon by the "passage of time" but also by inputs which are or could be manipulated to bring about a desired type of behavior;

(iii) The states of the system cannot, in general, be observed. Rather, the physical behavior of the system is manifested through its outputs, which are many-to-one functions of the state.

The generalization of the time set is of minor interest to us here. The notions of input and output, however, are exceedingly fundamental; in fact, controllability is related to the input and observability to the output. With respect to dynamical systems in the classical sense, neither controllability nor observability are meaningful concepts.

A much more detailed discussion of dynamical systems in the modern sense, together with rather detailed precise definitions, will be found in KALMAN, FALB, and ARBIB [1969, Chapter 1]. From here on, we will use the term "dynamical system" exclusively in the modern sense (we have already done so in the Introduction). The following symbols will have a fixed meaning throughout the paper:
(1.1)   T   time set,
        U   set of input values,
        X   state set,
        Y   set of output values,
        Ω   input functions,
        φ   transition map,
        η   readout map.
The following assumptions will always apply (otherwise the sets above are arbitrary):

(1.2)   T = an ordered subset of the reals;

        Ω = a class of functions T → U such that

        (i) each function ω ∈ Ω is undefined outside some finite interval J_ω ⊂ T (dependent on ω);

        (ii) if ω, ω' ∈ Ω and J_ω ∩ J_ω' = ∅, there is a function in Ω which agrees with ω on J_ω and with ω' on J_ω'.

For most purposes later, T will be equal to Z = (ordered) abelian group of integers; U, X, Y, Ω will be linear spaces; "undefined" can be replaced by "equal to 0"; and "functions undefined outside a finite interval" will mean the same as "finite sequences".

The most general notion of a dynamical system for our present needs is given by the following
(1.3) DEFINITION. A dynamical system Σ is a composite object consisting of the maps φ, η defined on the sets T, U, Ω, X, Y:

the transition map

(1.4)    φ: T × T × X × Ω → X,    (t; τ, x, ω) ↦ φ(t; τ, x, ω),

defined whenever t ≥ τ (and undefined otherwise); and the readout map

         η: T × X → Y,    (t, x) ↦ η(t, x).

The transition map φ satisfies the following assumptions:
18 -
(1.5)    φ(t; τ, x, ω) = φ(t; s, φ(s; τ, x, ω), ω) for all s ∈ [τ, t];

(1.6)    if ω = ω' on [τ, t), then φ(s; τ, x, ω) = φ(s; τ, x, ω') for all s ∈ [τ, t).
The definition of a dynamical system on this level of generality should be regarded only as a scaffolding for the terminology; interesting mathematics begins only after further hypotheses are made. T, U,
instance, it is usually necessary to endow the sets Y
with a topology ~{PLE.
and then require that
cp
and
1]
n,
= B = reals,
X, and
be continuous.
The classical setup in topological dynamics may
be deduced from our Definition (1.3) in the following way. T
For
Let
regarded as an abelian group under the usual addition
and having the usual topology; let
n consist only of the nowhere-
defined function; let
X be topological space; disregard Y and
define
T
cp for all cp(t;
T,
t, X,
entire]~;
E T and write it as x·(t - T),
(I)
that is, a function of
1]
x
and
t -
T
alone.
Check (1.4-5); in
the new notation they become
x'O
x
and
x.(s + t)
Finally, require that the map
(1.8)
INTERPRETATION.
(x·s)·t.
(x, t) ~ x·t
be continuous.
The essential idea of Definition (1.3) is
that it axiomatizes the notion of state.
A dynamical system is informally
-
19 -
R. E. Kalman
a rule for state transitions (the function
together with suitable
~),
means of expressing the effect of the input on the state and the effect of the state on the output (the function as follows: time
T
"an input w,
~).
The map ~ is verbalized
applied to the system Z
produces the stati! ~(t; T, x, w)
at time
in state t."
x at
The peculiar
definition of an input function w is used here mainly for technical convenience; by (1.6) only equivalence classes of inputs agreeing over
h, t] enter into the determination of at
t
~( t ; T,
means no input acts on Z at time The pair
x, w).
"w not defined"
t.
(T, x) E T X X will be called an event of a dynamical
system L In the sequel, we shall be concerned primarily with systems which are finite-dimensional, linear, and continuous-time or discrete-time. Often these systems will be also real and constant (= stationary or time-invariant).
We leave the precise definition of these terms in
the context of Definition (1.3) to the reader (consult KALMAN, FALB, or ARBIE [1969, Chapter 1] as needed) and proceed to make some ad hoc definitions without detailed explanation. The following conventions will remain in force throughout the lectures whenever the linear case is discussed: Continuous-time.
n
=
T =~,
U
= gm,
all continuous functions
X = gn, Y = ~p, m R -+ R which vanish out-
side a finite interval. (1.10)
Discrete-time.
T =?!,
K
= fixed field (arbitrary),
-
20 -
R. E. Kalman
u =!fl, x = r, Z ~
!fl
Y
= KP, n = all
functions
which are zero for all but a finite number of
their arguments. Now we have, finally,
(loll)
dynamical system E time
A real, continuous-time, n-dimensional, linear
DEFINITION.
(F(·), G(·),
is a triple of continuous matrix functions of
H(.»
where
(n X n matrices over
~)
~
(n X m matrices over
~)
R -+
(p X n matrices over
~).
F(·) :
R
-)
G(.) :
R
H(.):
,
These maps determine the equations of motion of E in the following ~:
F(t)x + G(t)w(t),
dx/dt (1.12)
where
{
H(t)x(t),
y(t)
t E~,
x E
t,
(I)(~) E ~m,
and yet) E ~p.
To check that (1.12) indeed makes E
into a well-defined dynamical
system in the sense of Definition (1.3), it is necessary to recall the basic facts about finite systems of ordinary linear differential equations with continuous coefficients. iPF(t, '1"):
Define the map
~ X ~ ~ {n X n
matrices over
~}
to be the family of n X n matrix solutions of the linear differential
-
21 -
R. E. Kalman
equation F(t)x,
dx/dt
x E ~
subject to the initial condition unit matrix,
I
~F is of class Cl
Then
in both arguments.
transition matrix of (the system E matrix is)
F(·).
"E R.
It is called the
whose "infinitesimar'transition
From this standard result we get easily also the
fact that the transition map of E is explicitly given by
Ul)
while the readout map is given by (1.14)
11( t, x)
H(t) x.
It is instructive to verify that lence class of
Ul'S
which agree on
['" t].
In view of the classical terminology "linear differential equations with constant coefficients", we introduce the nonstandard (1.15)
DEFINITION.
A real, continuous-time, finite-dimensional
linear dynamical system E
= (F( ·),
G(·), H(·»
is called constant
iff all three matrix functions are constant. In strict analogy with (1.15), we say: (1.16)
DEFINITION.
A discrete-time, finite-dimensional, linear,
constant dynamical system E over
K
is a triple
(F, G, H)
of
-
22 -
R. E. Kalman
n X n, n X m, p X n
.
matr~ces
K over th e f<eld ~.
mine the equations of motion of i
Hx( t),
y(t) t E~,
in the following manner:
Fx(t) + Gw(t),
X( t + 1) {
x E
x",
w(t) E
If',
and
y(t) E KP•
In the sequel, we shall use the notations F, -, H)
These maps - deter-
(F, G, -)
or
to denote systems possessing certain properties which
are true for any
H or
G.
Finally, we adopt the following convention, which is already implicit in the preceding discussion:
(1.18) E
DEFINITION.
The dimension
n
is equal to the dimension of XE as a
of a dynamical system v~ctor
space.
-
23 -
R. E. Kalman
2.
STANDARDIZATION OF DEFINITIONS AND "CLASSICAL" RESULTS
In this section, we shall be mainly interested in finitedimensional linear dynamical systems, although the first two definitions will be quite general. Let
E
Section 1.
be an arbitrary dynamical system as defined in We assume the following slightly special property:
There exists a state
(j)(t;
T,
and an input
x*
x*, m*)
for all
x*
For simplicity, we write and
t,
T
and m*
x*
n have additive structure,
ing.)
m*
0
such that E T and as
O.
t>
T.
(When X
will have the usual mean-
The next two definitions refer to dynamical systems
with this extra property. DEFINITION.
An event
there exists a t E T and an on
(T,
x»
(T,
x)
wEn
is controllable iff§
(both
t
and w may depend
such that
(j)( t; T, x, Ul) In words:
0
an event is controllable iff it can be transferrc
to
0
in finite time by an appropriate choice of the input function
w.
Think of the path from
function defined over
(T, x)
to
(t, 0)
as the graph of a
[T, t].
§The technical word iff means if and only if.
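As a small numerical illustration (not in the original text) of controllability of an event in the discrete-time constant case: the sketch below computes an input sequence transferring a given event to the zero state in n steps. The system matrices and the initial state are hypothetical, and the unknowns are stacked and solved by least squares purely for convenience.

```python
import numpy as np

# Hypothetical discrete-time constant system x(t+1) = F x(t) + G w(t).
F = np.array([[1.0, 1.0],
              [0.0, 1.0]])
G = np.array([[0.0],
              [1.0]])
n = F.shape[0]

x0 = np.array([3.0, -1.0])   # the event (0, x0) to be transferred to 0

# F^n x0 + [F^{n-1}G, ..., FG, G] [w(0); ...; w(n-1)] = 0
R = np.hstack([np.linalg.matrix_power(F, n - 1 - k) @ G for k in range(n)])
w, *_ = np.linalg.lstsq(R, -np.linalg.matrix_power(F, n) @ x0, rcond=None)

# Verify: applying the computed inputs drives the state to (approximately) zero.
x = x0.copy()
for k in range(n):
    x = F @ x + (G @ w[k:k + 1]).ravel()
print(x)   # ~ [0, 0]
```

When the stacked matrix has full rank, such an input exists for every initial state, which is exactly complete controllability in this setting.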
-
24
-
R. E. Kalman Consider now a reflection of this graph about
T.
This
suggests a new definition which is a kind of "adjoint" of the definition of controllability: DEFINITION. 1s an
(T,
sET
x»
An event
and an
00
E
n
(T, x) is reachable iff there
(~
sand
00
may depend on
such that x
=
We emphasize: different concepts.
0,
00).
controllability and reachability are entirely A striking example of this fact is encountered
below in Proposition (4.26). We shall now review briefly some well-known criteria for and relations between reachability and controllability in linear systems.
(2.3) PROPOSITION. In a real, continuous-time, finite-dimensional, linear dynamical system Σ = (F(·), G(·), -), an event (τ, x) is

(a) reachable if and only if x ∈ range W(s, τ) for some s ∈ ℝ, s < τ, where

    W(s, τ) = ∫_s^τ Φ_F(τ, σ) G(σ) G'(σ) Φ_F'(τ, σ) dσ;

(b) controllable if and only if x ∈ range W(τ, t) for some t ∈ ℝ, t > τ, where W(τ, t) is given by the same formula.
The original proof of (b) is in KALMAN [1960b]; both cases are treated in detail in KALMAN, FALB, and
JL~IB
[1 969, Chapter 2,
is
-
25 -
R. E. Kalman Section 2].
Note that if G(·)
we cannot have reachability, and if G(·) zero on
(- 00, T)
is identically zero on is identically
(T, + 00) we cannot have controllability.
For a constant system, the integrals above depend only on the difference of the limits; hence, in particular
So we have
(2.4)
PROPOSITION.
In a
rea~continuous-time,
linear, constant dynamical system an event for all
T
finite-dimensional,
(T, x) is reachable
if and only if it is reachable for one
T;
an ev:nt
is reachable if and only if it is controllable. From (2.3) one can obtain in a straightfoF~~rd fashion also the following much stronger result: (2.5)
THEOREM.
In a rea1 continuous-time, n-dimensional,
linear, constant dynamical system L = (F, G, -)
a state
x
is reachable (or, equivalently, controllable) at a~ T E ~ if and only i f x E span (G, FG, ••• ) C ~n; if this condition is satisfied, we can choose with
°> 0
arbitrary.
s
=T
-
0, t
=T
(The span of a sequence of matrices is to
be interpreted as the vector space generated by the columns of these matrices.)
+ 0,
-
26
-
R. E. Kalman A proof o:r (2.5) may be found in KALMAN, HO, and NARENDRA
[1963] and in KALMAN, FALB, and ARBIB [1969, Chapter 2, Section
3]. A trivial but noteworthy consequence is the fact that the definition of reachable states of E is "coordinate-free":
(2.6)
COROLJJL~Y.
states of E ~
Xz,
The set of reachable (or controllable)
in Theorem (2.5) is a subspace of the real vector
the state space of E.
Very o:rten the attention to individual states is unnecessary and therefore many authors prefer to use the terminology completely reachable at x E
X~
.
T"
~s reachabl~ ""~ , or ~
event in E
for
"every event
( 'l", x),
completely reachable " for
is reachable", etc.
is
"l.
fixed,
'l" =
" every
Thus (2.5), together with the
Cayley-Hamilton theorem, implies the BASIC
LEMMA. A real, continuous-time, n-dimensional, linear, constant dynamical system Σ = (F, G, -) is completely reachable if and only if

(2.8)    rank (G, FG, ..., F^{n-1}G) = n.
Condition (2.8) is very well-known; it or equivalent forms of it have been discovered, explicitly used, or implicitly assumed by many authors.
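Condition (2.8) is also immediate to check numerically. The following sketch (not from the text; the example pairs are hypothetical) tests complete reachability of a constant pair by computing the rank of the matrix in (2.8).

```python
import numpy as np

def completely_reachable(F, G):
    """Condition (2.8): rank (G, FG, ..., F^{n-1} G) = n."""
    n = F.shape[0]
    R = np.hstack([np.linalg.matrix_power(F, k) @ G for k in range(n)])
    return np.linalg.matrix_rank(R) == n

# A reachable pair, and an unreachable one (the second state is never excited by the input).
F1 = np.array([[0.0, 1.0], [-2.0, -3.0]])
print(completely_reachable(F1, np.array([[0.0], [1.0]])))                    # True
print(completely_reachable(np.diag([1.0, 2.0]), np.array([[1.0], [0.0]])))  # False
```

By Proposition (2.15) below, the same rank test also characterizes complete reachability of discrete-time constant systems.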
A trivially equivalent form of (2.7) is given by
COROLLARY 1. A constant system Σ = (F, G, -) is completely reachable if and only if the smallest F-invariant subspace of X_Σ containing (all column vectors of) G is X_Σ itself.
_
27-
R. E. Kalman
A useful variant of the last fact is given by
(2.10)
COROLLARY 2.
(W. Hahn)
A constant system E = (F, G, -)
is completely reachable if and only if there is no nonzero eigenvector of F which is orthogonal to (every column vector of)
G.
Finally, let us note that, far from being a technical condition, (2.5) has a direct system-theoretic interpretation, as follows: PROPOSITION.
(2.il)
The state space
~
of a real, continuous-
time, n-dimensional, linear, constant dynamical system E
= (F,
G, -)
may be written as a direct sum
which induces a decomposition of the equations of motion as (obvious notations) dx/dt {
dx,jdt
The subsystem L a state
= (Fil, Gl, -)
x = (~, x E ~ 2) PROOF.
of E;
l
is completel v reachable.
is reachable i f and only if x 2
We define
Xl
tion, every state in Xl
Xl
= O.
to be the set of reachable states
by (2.5) this is an F-invariant subspace of
finite-dimensionality,
Hence
is a direct summand in
XE" ~"
Hence, by By
construc-
is reachable, and (every column vector of)
-
28
-
R. E. Kalman G belongs to F = 0, ll
Xl'
The F-invariance of Xl
implies that
which implies the asserted form of the
e~uations
of
motion.
0
REMARK.
Note that
X 2
is not intrinsically defined
(it depends on an arbitrary choice in completing the direct sum). Hence to say that state if
"(0, x
2)
is an unreachable (or uncontrollable)
x 2 -f 0" is an abuse of language.
More precisely: the
set of all reachable (or controllable) states has the structure of a vector suac~ bltthe set of all unreachable (or uncontrollable) states does not have such structure.
This fact is important to
bear in mind for the algebraic development which follows after this section and also in the definition of observabi1ity and constructibi1ity below. chosen in such a way that
In general, the direct sum cannot be F = 0. 12
While condition (2.8) has been fre~uently used as a technical re~uirement
in the solution of various optimal control problems in
the late 1950 s, it was only in 1959-60 that the relation between (2.8) and system theoretic questions was clarified by KALMAN (1960b-c] via Definition (2.2) and Propositions (2.5) and (2.11). 11 for further details.)
(See Section
In other words, without the preceding
discussion the use of (2.8) may appear to be artificial, but in fact it is not, at least in problems in which control enters, because, by (2.12) control problems stated for respect to the intrinsic subspace
Xl'
~
are nontrivial only with
-
29 -
R.E.Kalman
The hypothesis "constant" is by no means essential for Proposition (2.11), but we must forego further comments here. For later purposes, we state some facts here for discretetime, constant linear systems analogous to those already developed for their continuous-time counterparts.
The proofs are straight-
forward and therefore omitted (or given later, for illustrative purposes). (2.14)
PROPOSITION.
A state
n-dimensional, linear, con stant
x
of a real, discrete-time,
~~cal
system E
= (F,
G, -)
is reachable if and only if x E span (G, FG, ••• , ~-lG). Thus such a system is completely reachable if and only if (2.8) holds. (2.16)
PROPOSITION.
A state
x
of the system E
described
in Proposition (2.14) is controllable if and only if
) x E span ( F-1G, • ", F-nG, where {x:
PROPOSITION.
(2.18)
~x
column vectcr of
G}.
In a real, discrete-time, finite-dimensional,
linear, constant dynamical system E
= (F,
G, -)
a reachable state
is always controllable and the converse is always true whenever det F
f
O.
-
30 -
R. E. Kal ma n
Note also that Propositions (2.11) and its proof continue to be correct, without any modification, when "continuous-time" is replaced by "discrete-time". Now we turn to a discussion of observability. The original definition of observability by KALMAN' [1960b, Definition (5.23)] was concocted in such a way as to take advantage of vector-space duality.
The conceptual problems surround-
ing duality are easy to handle in the linear case but are still by no means fully understood in the nonlinear case (see Section 10).
In order to get at the main facts quickly, we shall consider
here only the linear case and even then we shall use the underlying idea of vector-space duality in a rather ad-hoc fashion. The reader wishing to do so can easily turn our remarks into a strictly dual treatment of facts (2.1)-(2.12) with the aid of the setup introduced in Section 10. DEFI}ilTION.
An event
("
x)
in a real, continuous-
time, finite-dimensionak linear dynamical system E
= (F(o),
-, H(·»
is unobservable iff
DEFI}ilTION. ("
x)
With respect to the same system, an event
is unconstructible* iff
*In the older literature, starting with KALMAN [1960b, Definition (5.23)], it is this concept which is called "observability". By hindsight, the present choice of words seems to be more natural to the writer.
-
31 -
R. E. Kalman
The motivation for the first
defi~ition
is obvious:
the
"occurrence" of an unobservable event cannot be detected by looking at the output of the system after time subsumes
ill
= 0,
linearity.)
T.
(The definition
but this is no 10s3 of generality because of
The motivation for the second definition is less
obvious but is in fact strongly suggested by statistical filtering theory (see Section ments Definition
10).
(2.20)
complements Definition
In any case, Definition
(2.21)
comple-
in exactly the same way as Definition
(2.1)
(2.2).
From these definitions, it is very easy to dedu ce the following criteria:
(2.21)
PROPOSITION.
In a real, continuous-time, finite-di mensional,
linear dynamical system E (a)
= (F('),
-, H('))
unobservable if and only if x for all
t E
~,
an event
(T, x)
is
E kernel M(T, t)
t > T, where
M(T, t)
(b)
unconstructible if and only if for all
s
E~,
s <
T,
where
x E kernel M( s, T)
-
PROOF.
32
R.E. Kalman
-
Part (a) follows i=ediatel,y from the observation:
x € kernel M(T, t ) ~ H(s)
F(s, T)X
=
0
for all
s € [T, t l .
o
(b) follows by an analogous argument.
REMARK.
Part
Let us compare this result with Proposition (2.3),
and let us indulge (onl,y temporaril,y) in abuses of language of the following sort:*
unreacha.ble # for all
t >
x € kernel
W( T,
t)
T
and observable # for some t >
x € range
M( T,
t)
T.
From these relations we can easil,y deduce the so-called "duality rules"; that is, problems involving observability (or constructibility) are converted into problems involving reachaoility (or ~ o nt rollability) in a suitabl,y defined dual system.
See KALMAN, FALB,
and ARBIB [1969, Chapter 2, Proposition (6.12)] and the broader discussion in Section 10. We will say: by slight abuse of language, that a system is completely observable whenever
0
is the onl,y unobservable state.
Thus the Basic Lemma (2 .7) "dualizes" to the PROPOSITION.
A real, continuous-time or discrete-time,
n-dimensional, linear, constant
dynamical system E
= (F,
- , H)
*A11 this would be strictly correct if we agreed to replace "direct sum" in Pr oposition (2.11) and its counterpart (2. 25) by "orthogonal. direct sum"; but thi s would be an arbitrary convention which, whi l e conve nie ~t, h~s no natural system-theoretic justification. Rer ead Rem1i.tk (2.13).
-
33 -
R. E. Kal ma n
is completely observable if and only if

(2.24)    rank (H', F'H', ..., (F')^{n-1}H') = n.
By duality, complete constructibility in a continuous-time system is equivalent to observability; in a discrete-time system this is not true in general but it is true when det F
f
O.
It is easy to see also that (2.11) "dualizes" to: PROPOSITION. time or discrete-time system r.
= (F,
The state space
Xr.
of a real,
continuous-
n-dimensional, linear, constant dynamical
-, H) may be written as a direct sum
and the equations of r. are decomposed correspondingly as dx/dt
Fllxl,
dx,jdt
F 21x l + F 22x2'
yet) PROOF.
H x (t ) . 2 2 Proceed dually to the proof of Proposition (2.11),
beginning with the definiticn of Xl states of r..
as the set of all unobservable
o
Combining Propositions (2.11) and (2.25) gives Theorem C as in KALMAN [1962].
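The canonical decomposition of Theorem C can also be exhibited numerically. The sketch below (not part of the text; the system data and tolerances are illustrative assumptions) computes the reachable and the unobservable subspaces of a constant system and reports the dimensions of the four parts.

```python
import numpy as np

# Hypothetical constant system (F, G, H).
F = np.diag([1.0, 2.0, 3.0, 4.0])
G = np.array([[1.0], [1.0], [0.0], [0.0]])
H = np.array([[1.0, 0.0, 1.0, 0.0]])
n = F.shape[0]

ctrb = np.hstack([np.linalg.matrix_power(F, k) @ G for k in range(n)])   # columns span reachable states
obsv = np.vstack([H @ np.linalg.matrix_power(F, k) for k in range(n)])   # kernel = unobservable states

def null_basis(M, tol=1e-9):
    # Orthonormal basis of the kernel of M (right singular vectors of ~zero singular value).
    _, s, Vt = np.linalg.svd(M)
    return Vt[int(np.sum(s > tol)):].T

R, N = ctrb, null_basis(obsv)
dim_R, dim_N = np.linalg.matrix_rank(R), N.shape[1]
dim_sum = np.linalg.matrix_rank(np.hstack([R, N]))
dim_RN = dim_R + dim_N - dim_sum

print("reachable and unobservable  :", dim_RN)
print("reachable and observable    :", dim_R - dim_RN)     # the only part seen in input/output behavior
print("unreachable and unobservable:", dim_N - dim_RN)
print("unreachable and observable  :", n - dim_sum)
```

For this particular choice of data each of the four parts has dimension 1.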
This completes our survey of the "classical" results related
-
34 -
R. E . Kalman
to reachability, controllability, observability, and constructibility.
The main motivation for the succeeding
developments will be the algebraic criteria (2.8) and (2.24) as well as a deeper examination of Theorems C and D of the Introduction.
-
35
-
R. E. Kalman
3.
DEFINITION OF STATES VIA NERODE EQUIVALENCE CLASSES
A classical dynamical system is essentially the action of the time set
T
(= reals)
on the states
X.
In other words} the
states are acted on by an abelian group, namely definition of addition). consequences. inputs
(~+
usual
This is a trivial fact, but it has deep
A (modern) dynamical system is the action of the
n on X;
in exact analogy ,nth the classical case, to
the abelian structure on
T there corresponds an (associative
but noncommutative) semigroup structure on
n.
The idea that
n
always admits such a structure was apparently overlooked until the late 1950's when it became fashionable in automata theory (school of SCHUTZENBERGER).
This seems to be the "right" way
of translating the intuitive notion of dynamics into mathematics, and it will be fundamental in our succeeding investigations. It is convenient to assume from now on, until the end of these lectures} that T
time - -set -
Z
additive (ordered) group of
integers. Since we shall be only interested in constant systems from here on} we shall adopt the following normalization convention:*
*In the discrete-time nonconstant case, we WOQld have to deal with ~ copies of n, each normalized with respect to a different particular value of T E ~.
-
36 -
R. E. Kalman No element of n is defined for
t >
In view of (3.2), we can define the "length"
max {-t E Z:
Iill I
is
ill
I!:Jt
Before defining the semigroup on fundamental notion of dynamics : defined for all
ari':
0
q~
in
n ~ n:
ill
O.
T
1(1)1
of ill
defined for any
n,
by
s < t}.
we introduce another
the (left) shift operator
an'
Z by
1-+
t ~ m(t + q).
arim:
Note that the definition of O"n
is compatible with the normaliza-
tion (3.2). If J
ill
n Jill ,
of (I) and
00'
(3.4)
v
(l)
= empty for
we define the join
ill, ill' E 0.,
as the function
(00
w'
lill'
on on
Jill' Jill'.
When n has an additive structure, then we replace
DEFINITION. o.
n X0
~
o.
0,
00 v 00'
by
ill
+ (I)'.
There is an associative operation
called concatenation, defined by
(00, v)
Note that, by
1-+
anIvl ill
v
v,
(3.2) through (3.4),
o
is well defined.
Note also that the asserted existence of concatenation rests on the fact that
0
intervals in
We might express the content of (3.5) also as:
T.
is made up of functions defined over finite
o is a semigroup with valuation, since evidently
l(I)ovl = Iwl + Ivl.
37 -
R.E.Kalman
In view of (3.5), it is natural to use an abbreviated notation* also for the transition function, as follows:
(3.6)
Iwl,
ep(o; -
Xow
w)
x,
Now we come to an important nonclassical concept in dynamical systems, whose evolution was strongly influenced by problems in communications and automata theory:
a discrete-time constant
input/output map f: 0 -+ Y: w >-+ few)
y(l)
We interpret this map as follows: system E
y(l)
is the output of some
(say, a digital computer) when E is subjected to
the (finite) input sequence w,
assuming that
E is some fixed
initial equilibrium state before the application of co,
This
definition automatically incorporates the notions of "discretetime" as well as "causal" or "dynamics" (the latter because yet)
is not defined for
t < 1).
However, (3.7) does not
clearly imply "constancy" (implicitly, however, this is clear from the normalization assumption (3.2) on more forceful, we extend
(3.8)
r:
n
-+
r
f
=
n).
To make the definition
to the map
Y X Y
(infinite cartesian product) (y(l), y(2), •••
Interpretation:
r
of the system E after
gives the output sequence t
=0
y
= (y(l),
). y(2),
resulting from the application of an
*Observe that x∘ω is the strict analog of the notation x·t customary in topological dynamics. The action of ω on x satisfies x∘(ω∘ν) = (x∘ω)∘ν in view of (1.5).
-
38 -
R. E. Kalman input
ill
which stops at
t
= o.
This definition expresses causality more forcefully and incorporates constancy, provided we define the (left) shift operator for any
~r
T
~
on
r
0, T E
so as to be compatible with let
~'
t
y( t + T)
l-t
:(y(l), y(2), ••• ) Note:
the operator
operator
~r
~n
1-4
(y(T + 1), y(T + 2), •.• )
"appends" an undefined term at
0,
the
"discards" the term y(l).
Now, dropping the bar over
(3.10)
(3.3). So,
DEFINITION.
f,
we adopt
A discrete-time, constant input/output map
(of some underlying d"vnamical system E)
is any map f
such that
the following diagram
is commutative.
f
is Hnc"r iff i t is a K-vcctor -------------_.------
§.P~c~\J..~:>m:>..r:~1.i.:s"1J!.
(3.10) as the external
It will be convenient to regard
definition of a dynamical system, in contrast to the internal definition set up in Section 1. Intuitively, we should think of kind of experimental data; namely,
f f
as a highly idealized incorporates all possible
information that could be gained by subjecting the underlying
-
39 -
R. E. Kalman
system to experiments in which only input/output data is available.
This point of view is related to experimental physics the
same way as the classical notion of a dynamical system is related to Newtonian (axiomatic) physics. The basic question which motivates much of what will follow can now be formulated as f ollows: PROBLEM OF REALIZATION. f
(but of course also of
~,
Given only the knowledge of
l1, and
r)
how can we discover,
in a mathematically consistent, rigorous, and natural
.~y,
the
properties of the system E which is supu os ed to underlie the given input/output map f? This suggests immediately the following fUndamental concept: DEFINITION.
A fixed dynamical system E
(internal
definition, as in Section 1) is a realization of a fixed input/ output map f
iff
fE' o the input/output map of Eo. o
f
o
=
that is,
f
o
is identical with
In view of the notations of Section 1 plus the special convention
(3.6),
the explicit form of the realization condition is
simply that f (m) o
for all m
l1.
~ (<11: (0; o 0 Tne symbol
brium state in which Eo application of m.
*
Iml, *, m» stands for an arbitrary equili-
remains, by definition, until the
(Later we simply take
*
to be
0.)
-
40 -
R. E. Kalman
To solve the realization problem, the critical step is to induce a definition of X (of some E)
from the given
o
f
o
•
It is rather surprising that this step turns out to be trivial,
on the abstract level.
(On the concrete level, however, there are
many unsolved problems in
actual~
computing what
Section 8, we shall solve this problem, too, but linear case.)
X h. o~
In
in the
The essential idea seems to have been published
first by NERODE [1958]:
(3.14)
DEFINITION.
Make the concatenation semigroup n into
a monoid by adjoining a neutral element defined function on ~).
Then (j) == f
--
equivalent to (j)' with respect to f«(j)'oV) There are reasons (which
rr~ny
a~e
f)
for all
¢
(which is the nowhere-
co'
(~:
(j) is Nerode
iff VE
n.
intuitive, physical, historical, and technical
scattered throughout the literature and
conce~·
trated especinl~ strong~ in KALMAN, FALB, and ARBIE [1969]) for using this as the MAIN DEFINITION. =f'
denoted as
X = ( (j))f: f input/output map f.
The set of equivalence classes under (j)
E
nl,
is the state set of the
Let us verify immediately that (3.15) makes mathematical sense :
-
41 -
R.E.Kalman PROPOSITION. f
For each linear, constant input/output maE
there exists a dynamical system (a)
~f
(b)
~
realizes
PROOF.
f;
f
We show how to induce
define the state set of ~f by (b). transition
such that
X.
=
f
~f
f\L~ction
of
~f
~f'
given
f.
We
Further, we define the
by
xov
We must check that
on the left of ~
o
o!),
two different uses of sentation of
x as
(m)f'
is well defined (note
that is, independent of the repreThis follows trivially from (3.14).
Now we define the readout map of
~f
by
(3.18) Again, this map is well defined since we can take special case in (3.14).
v =
¢
as a
Then
~ (xs v) f
and the realization condition
(3.6)
is verified.
Hence claim (a)
is correct. COMMENTS.
0 In' automata theory,
reduced form of any system which realizes
~f
f.
is known as the Clearly, any two
-
42 -
R. E. Kc l ma n
reduced forms are isomorphic, in the set-theoretic sense, since the set
X f
is intrinsically defined by
f.
(This observation
is a weak version of Theorem D of the Introduction; here "uniqueness" means "modulo a perwutation of the labels of elements in the set
X ".) f
~:ot, ice also that
since, by DefInition (3.15" is reachaole (W)f.
~a
any element
As to observability of
L
f
; s como1 etely reachable
everJ element W' ~f'
x
= (W)f
of X f
in the Nerode equivalence class see Section 10.
-
43 -
H. E. Kalman
4.
MODULES INDUCED BY LINEAR INPUT/OUTPUT MAPS
We are now ready to embark on the main topics of these lectures. It is assumed that the reader is conversant with modern algebra (especially: abelian groups, commutative rings, fields, modules, the ring of polynomials in one variable, and the theory of elementary divisors), on the level of, say, VAN DER WAERDEN, LANG [1965], HU [1965] or ZARISKI and SAMUEL [1958, Vol. 1].
The material covered from here
on dates from 1965 or later. Standing assumptions until Section 10:
(4.1)
All systems
L:
(F, G, H)
are di screte-time, linear,
constant, defined over a fixed field
K
(but not necessarily
finite-dimensional).
Our immediate objective is to provide the setup and proof for the
(4.2)
FUNDAMENTAL THEOREM OF LINEAR SYSTEM THEORY. The natural state set X_f associated with a discrete-time, linear, constant input/output map f over a fixed field K admits the structure of a finitely generated module over the ring K[z] of polynomials (with indeterminate z and coefficients in K).
COMME~ITS.
to the inputs to
L:,
admits the structure of a finitely
K
K[z] of polynomials (-hi t h indeterminate
K).
Since the ring
K[z] will be s een to be related
this result has a superficial resemblance t o the
fact that in an arbitrary dynamical system L: the state set the action of a semigroup, namely
nL:
~
admit s
(see (3.6) and related footnote).
It turns out, however, that this action of from combining the concatenation product in
n on X, which results
n
with the definition of
-44R. E. Kalman
states via Nerode equivalence, is incompatible with the additive structure of
n [KALMAN, 1967, Section 3].
Our theorem asserts the
existence of an entirely different kind of structure of X. structure, that of a
This
K[z]-module, is not just a consequence of
dynamics, but depends critically on the additive structure on and on the linearity of
f.
The
rele\~nt
n
multiplication is not
(noncommutative) concatenation but (c ommutative) convolution (because convolution is the natural product in
K[z]); dynamics is thereby
restated in such a way that the tools of commutative algebra become applicable.
In a certain rather definite sense (see also Remark
(4.30)), Theorem (4.2) expresses the algebraic content of the method of the Laplace transformation, especially as regards the practices developed in electrical engineering in the U.S . during the 1950's. The proof of Theorem (4.2) consists in a long sequence of canonical constructions and the verification that everything is well defined and works as needed. In view of (4.1) and the conventions made in Section 1, be viewed as a K-vector space and and all met)
mE n .
=0
for all
(a)
n
w(t )
=0
for almost all
n
may
t E ~
By convention (3.2 ), we have assumed also that t > O.
~ ~[z]
As a result, we have that:
as a K-vector space.
Let us exhibit the i somor-
phism explicitly as follows: (4.4)
By (3.2 ), the sum in (4.4) is always finite.
The isomorphism
-
45 -
R.E.Kalman
obviously preserves the K-linear structure on
n.
In the sequel, we
shall not distinguish sharply between (I) as a function
T ~ IfI
and
(I) as an m-vector polynomial. (b)
n
is a free K[z]-module with m generators, that is,
n ~ IfI[z] also in the K[z]-module sense. action of K[z]
on n by scalar multiplication as
.,
K[z ] X n
.·m
"[Z]
where
In fact, we define the
~
n:
(7T, (I))
t-7
7T.(I)
«(I). E K[ z ], j J
1, ••. , m).
The product of 7T with the components of the vector (I) is the product in K[z].
We write the scalar product on the left, to avoid
any confusion with notation axioms are verified;
It is easy to see that the module
n is obviously free, with generators
m~
(4.6)
(3.6) .
j·th posf.ti.on, j
(c)
On
n
1, ••• , m,
the action of the shift operator
by multiplication by
z.
rrn is represented
This. of course. is the main reason for
introducing the isomorphi sm
(4.4) in the first place.
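As a tiny illustration (not in the original text) of the isomorphism (4.4) and of item (c): an input supported on t ≤ 0 is stored as the list of its values ω(0), ω(-1), ω(-2), ..., which are exactly the polynomial coefficients of z⁰, z¹, z², ...; the shift operator then acts as multiplication by z.

```python
# Coefficient-list representation of an input under (4.4): entry k holds ω(-k).
omega = [2.0, 0.0, 5.0]          # ω(0) = 2, ω(-1) = 0, ω(-2) = 5, i.e. the polynomial 2 + 5 z**2

def shift(omega, q=1):
    """Left shift: (shifted ω)(t) = ω(t + q), again supported on t <= 0."""
    return [0.0] * q + omega      # precisely multiplication of the polynomial by z**q

print(shift(omega))               # [0.0, 2.0, 0.0, 5.0], i.e. 2 z + 5 z**3 = z * (2 + 5 z**2)
```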
-
46 -
R. E. Kalman
(d)
(4.4)
Each element of
sUggests viewing
z
is a formal power series in
r
t
z
as an abstract representation of
-1
In fact, - t E
~;
hence we define
(4.7) By
r
(3.8)
and
(4.1),
not defined) for
yet) E
t < 1.
KP
for ea ch
t > 1
and is zero (or
In general the sum is taken ove r infinitely ~ny
nonzero terms; there is no question of convergence and the right-hand side of
(4.7)
series.
is to be interpreted stict1y algebraically as a formal power Since
reO)
is always zero (see
(3.8»),
we can say also
that (e)
r
is isomorphic to the K-vector subspace of z-l ~~th coefficients in KP)
(formal power series in of all power series with
0
first
KP[[z-l]] consisting
t er~.
The first nontrivial construction is the following : (f)
r
has the structure of a
K[z]
module, with scalar
multiplication defined as
(4.8)
.:
K[z] X r
->
r:
(n, r) H7r.y
This product may be interpreted as the ordinary product of a power series in
z-l
by a polynomial in
z,
followed by the deletion of
all terms containir~ no negative powers of the module axioms is straightforward.
z.
The verification of
-
47 -
R. E. Kalman (g)
is a
f
K[ e l
quence of the fact that cation by (h)
homomorphism. This is an immediate consef = constant (see (3.10» "and that multipli-
corresponds to the left shift operators on n and
z
The Nerode equivalence classes of
n/kernel f.
f
r.
are isomorphic with
This is an easy but highly nontrivial lemma, connecting
Nerode equivalence with the module structure on n.
The proof is an
immediate consequence of the formula w.v
=
z lvlW + V.
(4.9) implies
In fact, by K-linearity of f, f(w.V)
=
f(w'.V)
for all
v E n
if and only i f fezk .WI) for all k> 0
k
fez -oi)
in Z.
The proof of Theorem (4.2) is now complete, since the last lemma identifies module
X f
as defined by (3.15) with the
quotient
n/kernel f.
We write elements of the latter as it is clear that since
K[z]
n
Xf
[w]f
=
W
+ kernel fj
as a K[z]-module is generated by
itself is generated by
e
l, that the scalar product in n/kernel f
••• , em
then
[el]f' ••• , [em]f'
(see (4.6».
Note also
is
(4.10) The last product abov" (that in n)
has already been defined in (4.5).
The reader should verify directly that (4.10) gives a well-defined scalar product.
-
48 -
R. E. Kalman
REMARK.
(4.11) define
f.
There is a strict duality in the setup used to
From the point of view of homological algebra [MAC LANE
19631, this duality looks as follows.
Since every free module is
projective, the natural map
exhibits
X as the image of a projective module. On the other f hand, there is a bijection between the set X and the set f
'::'f
fen)
'::'f is clearly a and so
X f
X f
K[ z 1- submodu.Le of
r
(with
f( z .m) ) ,
z- f(m)
K[r..] -modules.
It is
is an injective module [MAC LANE 1963, page 95,
r
Exercise 21
1'.
'::'f are isomorphic also as
and
known that
c
So the natural Inap X ~ :;;:f: f
as a submodule of an injective module.
[m1f
H
f(m)
exhibits
This fact is basic in the f (Section 7),
construction of the "transfer function" associated with
but it s full implications are not yet understood at present. There is an easy counterpart of Theorem (4.2) which concerns a dynamical system given in "internal" form: (4.12)
PROPOSITION.
The state set
X E
of every discrete-time,
ffnite-dimensional, linear, constant dynamical system admits the structure of a PROOF. K-vector space.
E
=
(F, G, -)
K[zl-~.
By definition (see (1.10)), We make it into a
X= ~
is already a
K[zl-module by defining
-
49 -
R. E. Ka l m a n
K[ z ] X JCl ->
(4.13)
.:
(4 .14)
COMMENT.
JCl:
(7T, x)
H
7T(F)x .
o
The construction used in the proof of (4 .12) is
t he classical trickof studying the properties of a fixed linear map F:
JCl
->
JCl
via the
K[ zl-module structure that
F
induces on
JCl
by (4 .13). In view of the canonical construction of L provided by f Proposition (3.16), the state set X can be treated as a K[zl module irrespective as to whether
X is constructed from
or given a priori as part of the specification of L the
(X
f
(X
= Xf)
= ~). Thus
K[zl-module structure on X is a nice way of uniting the "external"
and the "internal" definitions of a dynamical system .
Henceforth we
shall talk about a (discrete-time, linear, constant dynamical) system
L somewhat imprecisely via properties of its ass ociated
K[zl -module~.
We shall now give some examples of using module-theoretic language to express standard facts encountered before . (4. 15)
PROPOSITION .
FE is given by PROOF.
If X is the state-module of L,
X -) X: x
H
the map
z -x ,
This is obvious from (4 .13) if X
~, f
then we find that, by (1.17), x(l)
Fx(O) + Gw(O) , F[ ~lf + Gw( O);
since
x( O) results fr om input
~,
x(l)
resuJ.ts from input
z·~
+ w(O)
- so -
R. E. Kalman
and we get [Z· E + w(0) ] f'
z'[E]f + [w(O)]f' z'[E]f + GuJ(O).
o
So the assertion is again verified.
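A short numerical illustration (hypothetical data, not part of the text) of the module action (4.13) and of the preceding proposition: the scalar product π·x is π(F)x, so z·x is Fx, and the state reached from 0 by an input polynomial applied to a single input channel is obtained by evaluating the polynomial at F on the corresponding column of G (this is the content of Proposition (4.16) below).

```python
import numpy as np

F = np.array([[0.0, 1.0],
              [1.0, 1.0]])        # hypothetical system over K = reals
G = np.array([[1.0], [0.0]])
g1 = G[:, 0]

def act(pi, F, x):
    """Module scalar product (4.13): pi . x = pi(F) x, with pi given by its coefficient list."""
    return sum(c * np.linalg.matrix_power(F, k) @ x for k, c in enumerate(pi))

x = np.array([1.0, 2.0])
print(np.allclose(act([0.0, 1.0], F, x), F @ x))      # z . x = F x, as in (4.15)

# The input polynomial pi = 3 - z + 2 z**2 corresponds to the input values ω(-2)=2, ω(-1)=-1, ω(0)=3.
pi = [3.0, -1.0, 2.0]
xs = np.zeros(2)
for w in [2.0, -1.0, 3.0]:        # feed the sequence into x(t+1) = F x(t) + G w(t), starting at 0
    xs = F @ xs + g1 * w
print(np.allclose(xs, act(pi, F, g1)))                # True: the reached state is pi(F) g1
```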
Now we can replace Proposition (2.14) by the much more elegant

(4.16) PROPOSITION. A system Σ = (F, G, -) is completely reachable if and only if the columns of G generate X_Σ.
The claim is that complete reachability is equiva-
PROOF.
lent to the fact that every element
x E
~
is expressible as
m
x
1rj E K[z],
= ftl1rjgj'
G
[gl' ••• , ~].
In view of (4.15), this is the same as requiring that
x be expressible
as m
x
ftl1r/F)gj;
this last condition is equivalent to complete reachability by (2.14). (4.17)
COROLLARY.
The reachable states of E are precisely
those of the submodule of (4.18)
REMARK.
simply means that
0
~
generated by (the columns of)
The statement that X is
~
G.
"E is not completely reachable"
generated by those vectors which make up
the matrix G in the specification of the input side of the system
E.
-
51 -
R. E. Kalman
It does not follow that vectors.
X cannot be finitely generated by some other
In fact, to avoid unnecessary generality, we shall henceforth
assume that
X is always finitely generated over K[z]. From the system-theoretic point of view, the case
when we need
infinitely many generators, that is, infinitely many input channels, seems rather bizzare at present. The syst em X f
PROPOSITION. PROOF. is reached by
(4.20)
~
iff is
W
Obvious from the notation:
E
E
[O]f'
a state
x
[~]f
n.
o
PROPOSITION. PROOF.
is completely reachable.
The system X f
is completely observable.
Obvious from Lemma (h) above:
D([w]f)
= fe w) = 0
which says that the only unobservable state of X f
o
0 E X • f
Let us
g~nera!ize
the l ast r esult t o obtain a module-theoretic
for complete obs ervability. doing this.
c r it c r i ~n
There are two technically different ways of
The first depends on the observation that the "dual" of a
submodule (see Corollary
(4.17»
is a ~uotient module.
observability via the "dual" system Consider a dynamical system E K[z ]-module ~
(F', HI, -)
= (F,
The s econd defines
associated with
(F, -, H).
-, H) and the corresponding
and K-homomorphism H: ~ ~ Y = KP•
We can extend H
-
52 -
R. E. Kalman
to a K[z] -homomorphism
H:
~
--+
H
(look back at (?8)) by setting
r
x ~ (Hx, H(z'x), H(z2. x), .•• ). From Definition (2.19) we see that no nonzero element of the quotient module
~/kernel
can say that
H is unobservable. Hence, by abuse of language, we
~/kernel
H
is the module of observable states of E.
Thus we arrive at phrasing the counterparts of (4.16-17) in the following language:
(4.21)
A system E
PROPOSITION.
if anionly i f the quotient module
(4.22)
COROLLARY.
= (F,
-A
~/kernel
is compl et el y observable
TERMINOLOGY.
~/kernel
are to be identified
H.
The preceding cons i derat i ons suggest viewing
a system E as essentially the sane "thing" as a module speaking, however, knowing E
= (F,
G, H)
(see (4.13)) but also a quotient module module (that generated by
XEo =
~.
H is isomorphic with
The observable states of E
with the elements of the quotient module (4.23)
H)
G)
of ~,
~
X.
gives us not only (over kernel
H)
Strictly ~
= XF
of a sub-
that is
K[z]G/ kernel -H.
If ~ ~~ we say that
~
is canonical (relative to the given
G, H).
To be more precise, let us observe the following stronger version of (4.19-20):
(4.24) CORRESPONDENCE THEOREM. There is a bijective correspondence between K[z]-homomorphisms f: Ω → Γ and the equivalence classes of completely reachable and completely observable systems Σ modulo a basis change in X_Σ.

Detailed discussion of this result is postponed until Section 7.

A stricter observation of the "duality principle" leads to

(4.25) DEFINITION. The K-linear dual of Σ = (F, G, H) is Σ* = (F', H', G') (' = matrix transposition). The states of Σ* are called costates of Σ.

The following fact is an immediate consequence of this definition:
(4.26) PROPOSITION. The state set X_Σ* of Σ* may be given the structure of a K[z⁻¹]-module, as follows: (i) as a vector space, X_Σ* is the dual of X_Σ regarded as a K-vector space; (ii) the scalar product in X_Σ* is defined by (z⁻¹·x*)(x) = x*(Fx).

(4.26A) REMARK. We cannot define X_Σ* as Hom_{K[z]}(X_Σ, K[z]), because every torsion module M over an integral domain D has a trivial D-dual. However, the reader can verify (using the ideas to be developed in Section 6) that X_Σ* defined above is isomorphic with Hom_{K[z]}(X_Σ, K(z)/K[z]). See BOURBAKI [Algèbre, Chapter 7 (2e éd.), Section 4, No. 8].

Now we verify easily the following dual statements of (4.16-17):
(4.27) PROPOSITION. A system Σ = (F, -, H) is completely observable if and only if H' generates X_Σ*.

(4.28) COROLLARY. The observable costates of Σ are precisely the reachable states of Σ*, that is, those of the submodule of X_Σ* generated by H'.

We have eliminated the abuse of language incurred by talking about "observable states" through introduction of the new notion of "observable costates". The full explication of why this is necessary (as well as natural) is postponed until Section 10.

The preceding simple facts depend only on the notion of a module and are immediate once we recognize the fact that F may be eliminated from statements such as (2.8) by passing to the module induced by F via (4.13). But module theory yields many other, less obvious results as well, which derive mainly from the fact that K[z] is a principal-ideal domain.

We recall:
An element m of an R-module M (R = arbitrary commutative ring) has torsion iff there is an r ∈ R such that r·m = 0. If this is not the case, m is free. Similarly, M is said to be a torsion module iff every element of M has torsion; M is a free module if no nonzero element has torsion. If L ⊂ M is any subset of M, the annihilator A_L of L is the set

    A_L = {r: r·j = 0 for all j ∈ L};

it follows immediately that A_L is an ideal in R. Note also that the statement "M is a torsion module" does not imply in general that A_M is nontrivial, that is, A_M ≠ 0. (Counterexample: take an M which is not finitely generated.) Coupling these notions with the special fact that, for us, R = K[z], we get a number of interesting
system-theoretic results:

(4.29) PROPOSITION. Σ is finite-dimensional if and only if X_Σ is a torsion K[z]-module.

COROLLARY. If X_Σ is free, Σ is infinite-dimensional.

PROOF. We recall that "Σ = finite-dimensional" is defined to be "X_Σ = finite-dimensional as a K-vector space". See (1.18).

Sufficiency. By assumption X_Σ is finitely generated, by, say, x_1, ..., x_q (which are not necessarily the columns of G). Since K[z] is a principal-ideal domain, each of the annihilators A_{x_j} is a principal ideal, say ψ_j·K[z], with ψ_j ∈ K[z]. No ψ_j is zero (for otherwise x_j is free, which is a contradiction), nor is any ψ_j a unit (which would imply x_j = 0, contrary to assumption). Hence deg ψ_j = n_j > 0 for all j = 1, ..., q. Hence we can replace each expression π·x_j, π ∈ K[z], by the simpler one π̄_j·x_j with deg π̄_j < n_j, which shows that X_Σ, as a K-module, is generated by the finite set

    {x_j, z·x_j, ..., z^{n_j − 1}·x_j : j = 1, ..., q}.

Necessity. If X_Σ is finite-dimensional as a K-module, let ψ_F be the minimal polynomial of the map F: x ↦ z·x; then deg ψ_F > 0. This means (by the usual definition of the minimal polynomial in matrix theory, or more generally in linear algebra) that ψ_F annihilates every x ∈ X_Σ, so that X_Σ is a torsion K[z]-module. □
K[z]-modules.
In fact, the same argument gives us also the well-known (~030)
PROPOSITION.
Every finitely generated torsion module
over a principal-ideal domain ~M
given by
~
=
R has e nontrivial minimal pynomial
~~.
COROLLARY. q
If a
K[z]-module
generators and minimal polynomial
f
~X'
X is finitely generated with then
dim X (as K-vector space)
~
REMARK.
is completely reachable and is
The fact that
therefore generated by of L
M
L
f
q.deg ~X'
m vectors allows us to estimate the dimension
by (4.31) knowi.ng only
deg
but without having computed
~X
f
-
57 -
R. E. Kalman
X
itself.
(Knowing X explicitly means knowing F: x ~ z·x, etc.) f In other words, the module-theoretic setup considerably enhances the f
content of Proposition (3.16).
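The bound in (4.31) is easy to check numerically once deg ψ_X is known; the degree of the minimal polynomial can be found by testing when the powers of F become linearly dependent. A small sketch (Python with NumPy; the routine and the 3-dimensional example with two generators are ours):

```python
import numpy as np

def minimal_polynomial_degree(F, tol=1e-9):
    """deg psi_F: the first k for which I, F, ..., F^k are linearly dependent."""
    n = F.shape[0]
    powers, M = [], np.eye(n)
    for k in range(n + 1):
        powers.append(M.ravel())
        if np.linalg.matrix_rank(np.array(powers), tol=tol) < k + 1:
            return k
        M = M @ F
    return n

# Example: F has minimal polynomial (z-2)(z-3) of degree 2; with the two
# K[z]-generators (1,0,1)' and (0,1,0)' the bound (4.31) reads 2*2 = 4 >= 3 = dim X.
F = np.diag([2., 2., 3.])
d = minimal_polynomial_degree(F)
print(d, 2 * d)        # 2 4
```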
Guided by these observations, we shall develop in Section 8 explicit algorithms for calculating X_f directly from f without first having to compute F.

(4.33) PROPOSITION. If X_Σ is a free K[z]-module, no state of Σ
can be simultaneously reachable and controllable. PROOF.
We recall that "X_Σ = free" means that X_Σ is (isomorphic to) a finite direct sum of copies of K[z]. Suppose for simplicity that X_Σ = K[z]. Then x = reachable means that x = σ·1 for some σ ∈ K[z]. Similarly, x = controllable means that x is annihilated by the input ω, that is,

    z^{|ω|}·x + ω·1 = 0   for some ω ∈ K[z].

Hence if x has both properties, (z^{|ω|}·σ + ω)·1 = 0, so that 1 has torsion, which contradicts the assumption that X_Σ is free. □
The most important consequence of Theorem (4.2) is due to the fact that through it we can apply to linear dynamical systems the well-known

(4.34) FUNDAMENTAL STRUCTURE THEOREM FOR FINITELY GENERATED MODULES OVER A PRINCIPAL-IDEAL DOMAIN R (Invariant Factor Theorem for Modules). Every such module M with m generators is isomorphic to

(4.35)    M ≅ R/ψ_1R ⊕ ⋯ ⊕ R/ψ_rR ⊕ R^s,

where the R/ψ_iR are quotient rings of R viewed as modules over R, the ψ_i (called the invariant factors of M) are uniquely determined by M up to units in R, ψ_i | ψ_{i−1} for i = 2, ..., r, and, as usual, R^s denotes the free R-module with s generators; finally, r + s ≤ m.

Various proofs of this theorem are referenced in KALMAN, FALB, and ARBIB [1969, page 270], and one is given later in Section 6.

Note: The divisibility conditions imply that M is a torsion module iff s = 0, and then ψ_M = ψ_1.
VM = VI.
One important consequence of this theorem (others in Section is that it gives us the most general situation when torsion module
E.
~
7)
is not a
(4.33) with (4.34), we
For instance, combining
get PROPOSITION.
A system cannot be simultaneously completely
reachable and completely controllable if its oo-dimensional components (i.e.,
(4.37 )
REMARK.
s >0
in
K[z]-module
X has any
(4.35».
Although our entire development in this section may
be regarded as a deep examination of Proposition (2.14), most of our comments apply equally well to (2.7), since both statements rest on the ~ algebraic condition (2.8).
In fact, the only remaining
thing to be "algebraized" is the notion of "cont i nuous- t ime " . shall not do this here.
We
Once this last step is taken, the algebraization
of the Laplace transform (as related to ordinary linear differential equations) will be complete.
5.
CYCLICITY AND RELATED QUESTIONS
We recall that an R-module M (R = arbitrary ring) is cyclic iff there is an element m ∈ M such that M = Rm. [It would be better to say that such a module is monogenic: generated by one element m.]

If M is cyclic, the map μ: r ↦ r·m is an epimorphism R → M and has kernel A_m, the annihilating ideal of m. This plus the homomorphism theorem gives the well-known

(5.1) PROPOSITION. Every cyclic R-module M with generator m is isomorphic with the quotient ring R/A_m viewed as an R-module.

This result is much more interesting when, as in our case, R is not only commutative and a principal-ideal domain, but specifically the polynomial ring K[z].

So let X be a cyclic K[z]-module with generator g, and let A_g = ψ_g·K[z], where ψ_g is the minimal or annihilating polynomial of g. By commutativity and cyclicity, ψ_g is a minimal polynomial also for X. Write ψ_g = ψ_X = ψ. Hence, in view of (5.1),

    X ≅ K[z]/ψK[z].

Let us recall some features of the ring K[z]/ψK[z]:

(i) Its elements are the residue classes of polynomials π (mod ψ), π ∈ K[z]. Write these as [π] or [π]_ψ. Multiplication is defined as [π]·[σ] = [πσ].

(ii) Each [π] is either a unit or a divisor of zero. In fact, [π] is a unit iff (π, ψ) = greatest common divisor of π, ψ is a
unit in K[z] (that is, (π, ψ) ∈ K). If (π, ψ) = 1, there are σ, τ ∈ K[z] such that σπ + τψ = 1, so that [σ] is the inverse of [π]. On the other hand, if (π, ψ) = θ ≠ unit in K[z], then both [π] and [ψ/θ] are zero divisors since [π]·[ψ/θ] = [(π/θ)ψ] = 0.
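Computation by these rules in K[z]/ψK[z] is just polynomial arithmetic followed by reduction modulo ψ to a representative of degree < deg ψ. A small sketch (Python with NumPy; the coefficient convention and the example ψ are ours):

```python
import numpy as np

def mod_psi(pi, psi):
    """Representative of [pi] in K[z]/psi K[z]: the remainder of pi on division
    by psi (degree < deg psi). Polynomials are coefficient lists, highest degree first."""
    _, r = np.polydiv(pi, psi)
    return np.trim_zeros(r, 'f') if np.any(r) else np.array([0.])

def mult(pi, sigma, psi):
    """Multiplication rule [pi]*[sigma] = [pi*sigma], reduced mod psi."""
    return mod_psi(np.polymul(pi, sigma), psi)

psi = [1., 0., -2.]          # psi(z) = z^2 - 2 over K = Q (rationals as floats)
pi  = [1., 1.]               # pi(z)  = z + 1
sig = [1., -1.]              # sigma(z) = z - 1
print(mult(pi, sig, psi))    # [pi*sigma] = [z^2 - 1] = [1]  ->  [1.]
```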
(iii) If ψ is a prime in K[z] (that is, an irreducible polynomial with respect to coefficients over the ground field K), then by (ii) K[z]/ψK[z] is a field. This is a very standard construction in algebraic number theory.

Since it is awkward to compute with equivalence classes [π], we shall often prefer to work with the standard representative of [π], namely the polynomial π̄ of least degree in [π]. π̄ is uniquely determined by [π] and the condition deg π̄ < deg ψ. Henceforth π̄ will always be used in this sense.

The next two assertions are immediate:

(5.2) PROPOSITION. The K-vector space K[z]/ψK[z] is isomorphic to the K-vector space

    Θ(n) = {θ ∈ K[z]: deg θ < n = deg ψ};

Θ(n) is also isomorphic to K[z]/ψK[z] as a K[z]-module, provided we define the scalar product in Θ(n) as (π, θ) ↦ (πθ)‾.

(5.3) PROPOSITION. If X_Σ is cyclic with minimal polynomial ψ, then dim Σ = deg ψ.

Looking back at Theorem (4.34), we see that the most general K[z]-module is a direct sum of cyclic K[z]-modules. By combining (5.3) and (4.34) and using the fact that dimension is additive under direct summing, we can replace (4.31) by the following exact result:

(5.4) PROPOSITION. If X_Σ is a torsion module with invariant factors ψ_1, ..., ψ_q, then

    dim Σ = deg ψ_1 + ⋯ + deg ψ_q.
A simple but highly useful consequence of cyclicity is the so-called control canonical form [KALMAN, FALB, and ARBIB, 1969, page 44] for a completely reachable pair (F, g), where g is an n × 1 matrix. We shall now proceed to deduce this result.

Observe first that "(F, g) completely reachable" is equivalent to "g generates X_F, the module induced by F via (4.13)". Let

    χ_F(z) = det(zI − F) = z^n + α_1 z^{n−1} + ⋯ + α_n,   α_i ∈ K;

then χ_F is the characteristic (and also the minimal) polynomial for X_F. [This is a well-known fact of module theory. See for example KALMAN, FALB, and ARBIB [1969, Chapter 10, Section 7] for detailed discussion.]

As in KALMAN [1962], consider the vectors
(5.5)
    e_n     = g                                          = χ^{(1)}(z)·g,
    e_{n−1} = z·g + α_1·g                                = χ^{(2)}(z)·g,
    ⋮
    e_1     = z^{n−1}·g + α_1 z^{n−2}·g + ⋯ + α_{n−1}·g   = χ^{(n)}(z)·g

in X_F. [For consistency, χ_F^{(n+1)}(z) = χ_F(z).] These vectors are easily seen to be linearly independent over K. They generate X_F, since X_F ≅ Θ(n) as a K-vector space (Proposition (5.2)). Hence e_1, ..., e_n are a basis for X_F as a K-vector space. With respect to this basis, the K-homomorphism z: x ↦ z·x is represented by the matrix

(5.6)
    F = [  0        1         0      ⋯   0
           0        0         1      ⋯   0
           ⋮                              ⋮
           0        0         0      ⋯   1
         −α_n   −α_{n−1}   −α_{n−2}   ⋯  −α_1 ].

[This is proved by direct computation. In particular, it is necessary to use the fact that χ_F annihilates X_F:

    z·e_1 = z·χ^{(n)}(z)·g = (χ_F(z) − α_n)·g = −α_n·g = −α_n·e_n.]

Note that the last row of F in (5.6) consists of the coefficients of χ_F. By definition, g = e_n. Hence g as a column vector in K^n has the representation

(5.7)    g = (0, ..., 0, 1)'.

Conversely, suppose F, g have the matrix representation (5.6-7) with respect to some basis in K^n. Then (by direct computation) the rank condition (2.8) is satisfied and therefore (F, g) is completely reachable in both the continuous-time and discrete-time cases (Propositions (2.7) and (2.16)). We have now proved:
(5.8) PROPOSITION. The pair (F, g) is completely reachable if and only if there is a basis relative to which (F, g) is given by (5.6-7).
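Numerically, the basis (5.5) is assembled from the coefficients of χ_F, and the resulting change of basis carries any completely reachable single-input pair into the form (5.6)-(5.7). A sketch (Python with NumPy; the function name and the example pair are ours):

```python
import numpy as np

def control_canonical(F, g):
    """Return T and the canonical pair (T^-1 F T, T^-1 g) of (5.6)-(5.7)
    for a completely reachable single-input pair (F, g)."""
    n = F.shape[0]
    a = np.poly(F)                     # [1, a1, ..., an] = coefficients of chi_F
    cols = []
    for j in range(1, n + 1):
        # e_j = chi^(n+1-j)(F) g = F^(n-j) g + a1 F^(n-j-1) g + ... + a_{n-j} g
        e = np.zeros((n, 1))
        for i in range(n - j + 1):
            e += a[i] * np.linalg.matrix_power(F, n - j - i) @ g
        cols.append(e)
    T = np.hstack(cols)                # basis e_1, ..., e_n as columns
    Ti = np.linalg.inv(T)
    return T, Ti @ F @ Ti.dot(np.eye(n)) @ T if False else (T, Ti @ F @ T, Ti @ g)[1], Ti @ g

# Simpler, explicit call (chi_F = z^2 - 5z + 6 here):
F = np.array([[2., 1.], [0., 3.]])
g = np.array([[0.], [1.]])
T, Fc, gc = (lambda r: (r[0], r[1], r[2]))(control_canonical(F, g))
print(np.round(Fc, 6))                 # companion form: [[0, 1], [-6, 5]]
print(np.round(gc, 6))                 # [[0], [1]]
```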
(5.9) COROLLARY. Given an arbitrary n-th degree polynomial λ(z) = z^n + β_1 z^{n−1} + ⋯ + β_n in K[z], K = arbitrary field. There exists an n-vector ℓ such that

    λ = χ_{F−gℓ'}

if and only if (F, g) is completely reachable.
PROOF. Suppose that (F, g) is completely reachable. With respect to the same basis (5.5) which exhibits the canonical forms (5.6-7), define

    ℓ' = (β_n − α_n, ..., β_1 − α_1).

Then verify by direct computation that λ = χ_{F−gℓ'}.

Conversely, suppose that (F, g) is not completely reachable. Then, recalling Proposition (2.12) (which is an algebraic consequence of (2.8) and hence equally valid for both continuous-time and discrete-time), we have for all n-vectors ℓ

    χ_{F−gℓ'} = χ^{11}·χ^{22},   deg χ^{22} > 0,

where χ^{22} is the characteristic polynomial of the factor module X_F/X_1 and X_1 is the F-invariant subspace of X = K^n consisting of the reachable states. Since χ^{22} is independent of the choice of basis in K^n, it is also independent of ℓ. (In particular, χ^{22} does not depend on the arbitrary choice of X_2 satisfying the condition X = X_1 ⊕ X_2.) This contradicts the claim that, with suitable choice of ℓ, λ = χ_{F−gℓ'} holds for an arbitrary λ. □
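In canonical coordinates the first half of this proof is pure coefficient matching; the classical Ackermann formula packages the same computation without passing explicitly through the basis (5.5). A sketch (Python with NumPy; routine name, formula variant, and example are ours):

```python
import numpy as np

def ackermann(F, g, desired_poly):
    """Feedback row l' with char.poly(F - g l') = desired_poly (monic coefficients).
    Equivalent to the coefficient matching done in the canonical form (5.6)-(5.7)."""
    n = F.shape[0]
    R = np.hstack([np.linalg.matrix_power(F, k) @ g for k in range(n)])
    lamF = np.zeros_like(F)
    for c in desired_poly:                       # evaluate lambda(F) by Horner's rule
        lamF = lamF @ F + c * np.eye(n)
    return np.eye(n)[-1] @ np.linalg.inv(R) @ lamF

F = np.array([[0., 1.], [-6., 5.]])
g = np.array([[0.], [1.]])
l = ackermann(F, g, [1., 3., 2.])                # want lambda(z) = z^2 + 3z + 2
print(np.round(np.poly(F - g @ l[None, :]), 6))  # -> [1. 3. 2.]
```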
In view of the importance of this last result, we shall rephrase it in purely module-theoretic terms:
(5.11) THEOREM. Let K be an arbitrary field and X a cyclic K[z]-module with generator g and minimal polynomial χ of degree n. There is a bijection between n-th degree polynomials

    λ = z^n + β_1 z^{n−1} + ⋯ + β_n   in K[z]

and K-homomorphisms ℓ: X → X: χ^{(j)}(z)·g ↦ ℓ_j·g (j = 1, ..., n, the χ^{(j)} defined in (5.5)) such that λ is the minimal polynomial for the new module structure induced on X by the map

    z*: x ↦ z·x − ℓ(x).

Note that ℓ(x) in (5.11) corresponds to gℓ'x in (5.10). The map ℓ in (5.11) defines a control law for the system (F, g, -) corresponding to the module X. The passage from z to z*
basis for treat
l
l
represents the equivalence class is never a
l
that this choi ce of
= 1,
••• , n + 1.
"A(l)(z _
13 j
.e
- O:j'
implies
We
(that is, an operator
l'x = l(~ 'g), where
= (s:
[s]
s·g
= x).
Unless
K[zl-homomorphism and therefore
does not commute with nonunits in K[z]. Define
j
form a
is clearly a well-defined K-homomorphism.
K-vector space), by writing
identically zero,
.e
x(l) .g, ••• , x(n)'g
formally as an element of K[z]
on X is a ~
Since the vectors
j
= 1,
.•• , n.
"A(j)(z -
Use induction on
j.
We prove first
l) = x(j)(z) for By definition,
l) = x(l)(z). f I n the general case,
(inductive hypothesis), (def . of .£), (def. of .£ .), J
(def. of x(j+l)).
j = n + 1)
It follows (case regarded as a
K[z*]-module.
A(l)(z*).g, ••• , A(n)(Z*)'g space since
F=~~asitions
to the
A annihilates
g
X
On the other hand, the is a basis for
X as a
K-vector
X(l)(z).g : .• , x(n)(z).g was such a basis.
is cyclic with generator by
that
also as a
K[z*]-module.
(5.1-2) the annihilating ideal of
So X
Hence
g" with respect
K[z*)-module structure cannot be generated by a polynomial
of degree less than
n,
nomial with respect to
that is, z*.
A is indeed the minimal poly-
The correspondence
A
f-'
.£ is obviously
o
bijective. The proof immediately implies the following COROLLARY. ~
K[z]-module.
respect to the are related as
Hz)
~
Then
x x
=
~. g
be any element of X viewed
has the representation
K[z*]-module structure on
X,
where
~*.g
S
with
~
s*
So the open-loop/closed-loop transformation is essentially a change in the canonical basis, provided X is cyclic. It is interesting that the χ^{(j)} have long been known in algebra (they are related to the Tschirnhausen transformation discussed extensively by WEBER [1898, §46, 54, 74, 85, 96]), but their present (very natural) use in module theory seems to be new.

**Theorem (5.11) may be viewed as the central special case of Theorem A of the Introduction. Let us restate the latter in
Let us restate the latter in
precise form as follows: THEOREM. n
;>.(z) = z + I\Z There exists an if and only if
n-l
Given an arbitrary . + ••• + I3n E! K[z],
n X m matrix (F, G)
Lover
n-th
degree polynomial
K = arbitrary field. K such that
~-GL'
=A
is completely reachable.
For some time, this result had the st a t us of a well-known folk theorem, considered to be a straightfoniard consequence of (5.9). has been discovered independently by many pe ople .
The latter
(I first he ard
of it in 1958, proposed as a conjecture by J. E. Bertram and proved soon afterwards by the so-called root-locus method.)
Indeed, the
passage from (5.11) to (5.13) is primarily a tecnnical problem.
A
proof of (5.13) was given by LAIiGEliliOP [1964) and subsequent ly simplified by WON¥JU~ [1967).
Tne first proof was (~n_~ecessarily)
very long, but the second proof is also unsatisfactory; since it depends on arguments using a splitti ng field of
K
**The material between these marks was added after the Summer School.
and fail when K is a finite field.
We shall use this situation
as an excuse to illustrate the power of the module-theoretic approach and to give a proof of (5.13) valid for arbitrary fields. The procedure of LANGENHOP and WONHAM rests on the following fact, of which we give a module-theoretic proof: LEMMA.
Let
F be cyclic* and
!.!! m-vector
a E
K be an arbitrary but infinite field.
(F, G)
Ifl
completely reachable.
such that
(F, Ga)
Let
Then there is
is also completeq
reachable. We begin with a simple remark, which is also useful in reducing the proof of (5.13) to Lemma (5.18). SUBLEMMA.
Every submodule of a cyclic module over a
principal-ideal domain is cyclic. PROOF OF (5.14). m= 1
is trivial.
m.
The case
The general case amounts to the following.
Consider the submodule gl' ••• , ~-l
We use induction on
of G.
Y of X =
~
generated by the columns
In view of (5.15),
Y is cyclic.
By the
inductive hypothesis, we are given the existence of a cyclic generator of Y of the form
gy
We must prove:
a, J3 E K the vector
for suitable
is a cyclic generator for
=
a i gl + ••• + am_I· ~-l' a a.~
+
i
E K.
J3.~
X.
*Of course, this means that the is cyclic.
K[z]-module
X F
(see (4.13))
By hypothesis,
Sx'
X has an (abstract) cyclic generator
By cyclicity we have the representations
=
gy
and
TJ'~
Eim
Tj,
~,~,
~ E K[ e l-
Hence our problem is reduced to proving the following:
ex, tl
E
K the polynomial
aT)
~
+
is a unit in
K[
for suitable This,
Z]/~K[ z ] .
in turn, is equivalent to proving
(5.16) where
aT)
classes zero.
-
mod gi'
K[z]
are the unique prime factors of
Then no pair
(~i' ~i)'
X,
reachable.
= 1.
values of
X'
(F, G)
is completely ~
and
gy
the condition
tl from con sideration.
can be
is a proper sub-
are zero, then every ~ .
K[Z]/~K[z],
Then
.•• , r
that is, ~gi annihilates
whence
contradicting the fact that
I f all the
is a unit in
ex
X' = K[z]gy + K[z]9m'
= 1,
i
For if one is, then gil (~, TJ, ~),
module of
r
in
1, ... , r
i
~
mean the representative of least degree of equivalence
the submodule
So let
° (mod g.)
f-
gl' .,., gr Let
~.
+ ~
f-
0,
~. + ~ . ~
Since
~
= 0 eliminates at most K is infinite by
(5.16).
An essential part of the lemm~ is the stipulation that
"F
= cyclic
+ (F, G)
TJ
is already a cyclic generator.
hypothesis, there are always some tl which sati sfy
The hypothesis
so
0
a E ~.
= completely reachable" means that
that is, the le~~ i s trivially true for some a E ~[z]
sx = Ga.
But since we want
a E K,
since
there must be interaction
between vector-space structure and module structure, and for this reason the lemma is nontrivial. when K = finite field.
As a matter of fact, the lemma is false
The simplest counterexample is provided
when (5.12) rules out a single nonzero value of 13, out all
thereby ruling
13. COUNTEREXAMPLE.
Let
integers modulo the prime ideal
Notice that
K = y~, ~.
that is, the ring of
Consider
~ = Xl e X e ~
(as a K[z]-module), where the 2 minimal polynomials of the direct sumrrands are
').(z) X 2(z) X (Z)
3
z2 + z + I, z 2, z + 1. (Xl' X X = 1, hence 2, 3) gl generates Xl eX while
All these factors are relatively prime, X is cyclic. generates
Notice also that
X ex • A cyclic generator for 2 3
3
X is
A simple calculation gives
(z
4
2
+ Z
+ l)'~'
Conditions (5.16) are here a-I + f3'0
f
0
(mod Xl)'
+ f3.1
f
0
(mod X
a-I + f3-1
f
0
(mod X ) .
a-O
2),
3
These conditions have no solution in
g/~.
At this point, the following is the situation concerning Theorem (5.13): (1)
Its counterpart, Theorem A of the Introduction, was
claimed to be true in the continuous-time case under the hype
.~esis
of complete controllability. (2)
In the discrete-time case (5.13) with the preceding
hypothesis Theorem A is false, because of the counterexample: (F
= nilpotent,
~-GL'
G
= 0)
the pair
is completely controllable, but evidently
1s independent of L.
However, in view of (5.11),.Theorem
(5_13) might be true also in the discrete-time case if "complete controllability" is replaced by "complete reachability", this modification being immaterial in the continuous-time case. (3)
Because of (5.17), we might expect that a theorem like (5.13)
1s false for an arbitrary field
K.
(4)
If our general claim that reachability properties are
reflected in module-theoretic properties is true, then (5.13) should hold without assumptions concerning module-theoretic fact, that
K,
= principal
K[z]
independent of the specific choice of
because the principal ideal domain, is
K.
We now proceed to establish Theorem (5.13). hypotheses on
K will turn out to be irrelevant.
PROOF OF (5.13).
Necessity is proved exactly as in (5.8).
Sufficiency will follow by induction on m,
~~.
once we have proved it
m = 2:
in the special case
(5.18)
Let
K be an arbitrary field and let
K[z]-module generated by
gl' g2.
K[z*]-module structure on
Let
Case 1.
z*
=z
£ - £
£(x)
will change the
serve that on
o
or
x E Z.
on
Thus there exist polynomials
£
In (5.11)
Replacing
K[z]-module structure on
t.. on
z
Y but pre-
is prime to the unchanged minimal polynomial
y +
Z
by
so that the new minimal poly-
V,
a
such that
B.r hypothesis, every x E X has the representation x
induces a
g2·
X=YEllZ.
for all
nomial Z.
gl + g2
that is,
Z. Further, choose Y
z - £
Y = K[z]gl and Z = K[z]g2.
ynZ=O,
such that
=
z*
X then X is cyclic with respect to this
structure and is generated by either
PROOF.
X be a
There is a K-homomorphism £
(of the tyPe defined in (5.11] such that if
take an
That is, special
vt.. + o X
~Z
= 1.
X
Now verify that x
= (T]crX + svA)·(gl + g2)'
T]crX'g
l
+ SVA·g 2,
T](l - VA)·gl + s(l - crX)'g2' Tj'gl + s 'g2'
K[z*]-module.
ynz=wf o.
C2.s e2.
there is ag E K[z] cyclicity of Take same w Tj
-1
T]
g'g2
f
on
Then if
O.
generates
unit (mod
w
there is also a
Y,
f
su ch that
1Sc).
To show:
X such that
g' g2
Z = X.
3y ;lypotlle s i s,
and therefore, by
Tj E K[z]
such that
1Sc)
g'g2 = w = Tj'gl'
we are done because
In the nontrivial case,
there is a suitable new module structure
~ = unit (mod X* ) ,
nomial of X as a
'lE W.
T] = u,'.it (mod
and so
Y,
kt
X*
being the minimal poly-
K[ z* ]-moduLe,
The main facts we need are the following: SUB~~~.
deg X = n,
Let
X be a fixed element of
FX the companion matrix of
the cyclic module induced by X F• X
Then
Tj E K[z]
F)? and
is a unit modulo
X given by g
K[z]
,nth
(5.6),
X FX
a cyclic generator of
X if and only if
~'g
is
also a cyclic gener at or of X . FX PROOF.
Obvious.
o
.Jl-l )
f
( dety, FXY' ••. , .1"X Y
where
y
(5.19).
Same notations as in
SUB~~~.
(5.20)
Write
0,
is the column vector Tin
PROOF.
Since
X(1), ••• , x(n)
is the basis for the
K-vector space of all polynomials of degree (~l' ••• , ~n)
is uniquely determined by
< n,
By definition
~.
is the matrix representing the module operator to the special basis
e
l,
••• , en
in
~ X
the n-tuple
z: x
given by
~
z·x
FX relative
(5.5).
Similarly,
using one of the module axioms, we verifY that
£
J=l
[rt.x(j)(Z)]'g "J
'
Jl'iij[x(j)(z).gJ,
in other words, the numerical vector (5.22) represents the abstract vector
Ti·g
in
X relative to the same basis FX
e
l,
.•• , en'
Recall
that By
generates
Tj'g
(2.7)
~X
(F x, ll(FX) g)
is complete reachable.
the latter condition is equivalent to
follows from
(5.21)
Same notations as in
(5.19)
and
(5.20).
Given
n-vector (5.22), there exists a polynomial
X
is satisfied.
PROOF.
Let
11 1 , Ti 2 , X(z )
The rest
o
any nonzero nwnerical such that
(5.21).
(5.19).
SUBLEMVA.
numbers
iff
Ti r be the first member of the sequence of which is nonzero.
n +
Z
~z
and determine the first ll r
'i'ir+l
o
Tjr
o
o
n-l r
Write
+ •.. + an' coefficients of
X by the rule
~:J
:J
T}r
an
o
o
1
(Since all numbers belong to a field, the required values of a
r,
..• , an
exist.)
reduce the matrix in
Now check, by computation, that these conditions
(5.21)
to the direct sum of two triangular
matrices, each with nonzero elements on its diagonal . In view of always choose a new
(5.12), Xy = Xt
it follows from these facts that we can such that
Tjt
= unit
mod Xt •
o
The proof of Case 2 is not yet complete, however, because we must still extend the is easy .
Write first
Z
K[z*]-module structure from
= W$
Z·
and then
direct sum is now wi t h r espect to the
t
from Y to X
"by
£i Z'
s '.:ttins
polynomial
X*
(5.12),
is replaced by some
(5.24 )
~
defined over
K-~odule
O.
=
X Since ~*
X
Y to X. This
Y $ Z',
where the
structure of X.
Extend
;;o',{ we have a n(;w mi nima.L z*
= Zt on Y,
~*
= ~t .
By
such that
w
that is, our previous representation of
w~ 0
in W induces a
similar representation with respect to the new K[z*]-module structure on X. Since
~
Xr,
is a unit modul o
By (5.24), we have, with re sp ect t o the cy.
(~* ·g2)'
c-
(~* ·gl)'
we can
~T it e
K[z*]-s tructure,
(1 + TXt) ·gl'
gl· This proves that
52 generates both Y and Z; that is,
a cyclic generator f or
X end owed
~~ th
proof of Lemma ( 5 . 18 ) is now complete.
the
K[z*]-structure.
is The
o
-
77 -
R. E. Kal ma n It should be clear that Theorem (5.13) is not a purely moduletheoretic result, but depends on the interplay between module theory, vector-spaces, and elimination theory (via (5.21)). the fact that
£
ca~
be extended from
For instance,
Y to X, which was needed
in the proof of Case 2, is a typical vector-space argument.** There are many open (or forgotten) results concerning cyclic modules which are of interest in system theory. is easy to show that an
n Xn
real matrix is cyclic iff a certain is nonzero at
~
For instance, it
is roughly analogous to the polynomial
det
,
F'
the polynomial
in the same ring,
but, unlike in the latter case, the general form of
~
does not seem
to be known. We must not terminate this discussion without pointing out another consequence of cyclicity which work.
Since
K[z]jXg K[z], co~~tative
X = cyclic with generator it is clear that
Xg
g
the module frame-
is isomorphic with
X also has the structure of this
ring, that is, the product is defined as
xXy If
tra~scends
(~Tj) 'g.
irreducible, then
X has a galois group.
No one has
tion of this galois group. facts in the theory of
X is even a field. eve~
Hence, in particular,
given a dynamical
interpret~-
In other words, there are obvious algebraic
dyr~nical
from the dynamical point of view.
systems which have never been examined For some related comments in the
setting of topological semi groups, see DAY and WALLACE [1967].
6.
TRANSFER FUNCTIONS

(6.0) PREAMBLE.
There has been a vigorous tradition in engineer-
ing (especially in electrical engineering in the United States during 1940-1960) that seeks to phrase all results of the theory of linear constant dynamical systems in the language of the Laplace transform. Textbooks in this area often try to motivate their biased point of view by claiming that "the Laplace transform reduces the analytical problem of solving a differential equation to an algebraic problem". When directed to a mathematician, such claims are highly misleading because the mathematical ideas of the Laplace transform are never in fact used.
The ideas which are
complex function theory:
actu~lly
used belong to classical
properties of rational functions, the
partial-fraction expansion, residue calculus, etc.
More importantly,
the word "algebraic" is used in engineering in an archaic sense and the actual (modern) algebraic content of engineering education and practice as related to linear sy stems
i~
very meager.
For
eXfu~ple,
the crucial concept of the transfer function is usually introduced via heuristic arguments based on linearity or "defined" purely formally as "the ratio of Laplace transforms of the output over the input". do the job
~~
To
and to recognize the transfer function as a natural
and purely algebraic gadget, requires a drastically new point of view, which is now at hand as the machinery set up in Sections 3-5. essential idea of our present treatment was first published in KALMAN [1965b l.
The
The first purpose of this section i s to give an intrinsically algebraic definition of the transfer function associated with a discrete-time, constant, linear input/output map (see Definition (3.10)). Since the applications of transfer functions are standard, we shall not develop them in detail, but we do want to emphasize their role in relating the classical invariant factor theorem for polynomial matrices to the corresponding module theorem (4. 34). Consider an arbitrary
K[zl-homomorphism
(g) following Theorem (4.2)) . equivalent to the set
(f(e
j),
n~ r
f:
(see lemma
Then as a "mathematical object" i
1, ... , m,
e
j
f
is
defined by (4.6)),
since (6.1) (The scalar product on the right is that in the defined in Section 4.) power series in
z
-1
By definition of with vani shing
fir~t
r,
K[zl-module
each
term.
f(e .) J
r,
as
is a formal
We shall try to
represent these formal power series by ratios of polynomials (Which we shall call transfer functions~) and then we ca n replace formula (6.1) by a certain specially defined product of a ratio of polynomials by a polynomial .
Some algebraic sophistication will be needed to find the
correct rules of calculations.
These "rules" will consititute a
rigorous (and simple) version of Heaviside 's so-called "calculus". There are no conceptual complications of any sort.
(However, we are
dodging some difficulties by working solely in discrete-time.) *This entrenched terminology is rather unenlightening in the present algebraic context.
-
80 -
R. E. Kalman
X = n/kernel f be the state set of f regarded as f K[zl-module. We assume that X is a torsion module with nontrivial f Let
a
minimal polynomial
ljr.
=
ljr·f(e.)
(6.2)
J
Then, for each f(ljr·e.) J
=
j = 1,
ordinary product of the power series
no dot
O.
~([ljr.e.l) J
By definition of the module structure on
a (vector) polynomial.
•.. , m we have
r,
(6.2) means that the
f(e j )
by the polynomial
Hence (6.2) is equivalent to
ljr
is
(notation:
ordinary product) 1, ... , m.
!ntuit.i'y-e}:.y.:, we can solve this equat.Lon by writing
fee .) J
There are two vmys of making this idea rigorous. Method 1.
(6.3)
Define
=
f(e.) J
G./ljr J
as the formal division of
G.
by
1jr
Check that the coefficient of
ZO
is always
J
into ascending powers of O.
*(z-l)
Multiply both sides of (6.2 1 ) by
= z-nljr(z)
and
Q.(z-l) ~ z-nQ(z). J
Then
-1
Verify by computation
that the power series so obtained satisfies (6.2 1 ) Method 2.
z
.
z-m.
Write
~ E K[z-ll C K[[z-lJl
and (6.2 1 ) becomes (6.2")
~f( e .)
Moreover, the
J
O-th
coefficient of
~
is
1
(because of the convention
W
that the leading coefficient of K[[z-l]]
is
1),
hence
t
is a unit in
and therefore
(6.3' )
f(e.) J
Note that tions of
(6.3) and (6.3')
f(e.),
give slightly different defini-
depending on whether we use a transfer function with
J
z
respect to the variable
or
in the engineering literature.) preferable.
actu~lly
z
-1
(Both notations have been used
For us the form~lisffi of Method 1 is
(The calculations of Method 1 can be reduced by Method 2
to the better-known calculations of the inverse in the ring
K[[z-l]].)
Summarizing, we have the easy but fundamental result:
(6.4)
EXIS~~CE
OF
TRP~SFER
correspondence beblcen polynomial
~Ihere
Q
yuNCTIONS.
K[ z j-homomorpht er,s
wand transfer function
j E KP[z], deg
den ominator of
Q
j <
There is a bijcctive f: n
~trices
and
->
I'
with minimal
of the type
W is the lea.st common
Z.
In many contexts, it is preferable to deal with the ponding to
f
rat.he r t.han \,ith
f
itself.
Zf
dim Zf /',.
W z
and conversely. dim f
fare well-
Thus, for instan ce,
~ dim X
f;
least common denominator of minimal polynomial of
corre s-
Because the cor r e sponde nce
is bijective, it is clear that all objects induced by defined also for
Zf
fZ'
Z,
(6.5)
REMARK.
realization of
In view of Propositions (4.20-21), the natural
Z,
namely
D.
X = X z f, Z
well as completely observable. has caused a great confusion ,
is completely reachable as
Not having this fact available before 1960 Questions such as thoscresolved by Theorem (5.13)
tended to be attacked algorithmically, using special tricks amounting to elementary algebraic manipulations of elements of
Z.
Very few
theoretical results could be conclusively established by this route until the conceptual foundations of the theory of reachability and observability were developed. The preceding results may be restated as "rules" whereby the values of
f
may be computed using
Z.
We have in fact,
fern) = Z· rn,
(6.6)
wZ
multiply the polynomial matrix consisting of the numerators of Z with rn, reduce to minimaldegree polynomials modulo and then divide formally by W as in ~lethod 1 above.
*
We can also compute the entire output of the system E
Z
(that is,
all output values following the application of the first nonzero input value) by the rule
same as above, but do not reduce modulo
W.
In this second case, the output sequence will begin with a positive power of
z.
(The coefficients of the positive powers of
thrown away in the definition of
f
(see (3.7»
z
are
and in the definition
vhere
r,
of the scalar product in for
X f
= n/kernel
in order to secure a simple formula
f.)
Many other applications of transfer functions may be found in KAl1~, FALB, and ARBIB [1969, Chapter 10, Section 10].
It is easy to show that the transfer function associated with
= (F,
the system L f
G, H)
is given by
Zf
= H(zI
- F)-lG.
(This is
just the formal Laplace transform computed from the constant version of (1.12) by setting
= zx(t).)
x(t + 1)
z
= d/dt
or from (1.17) by setting
Probably the simplest way of computing
Z
is
via the formula
6.8)
q
where
1/I
F
is the minimal polynomial of the matrix
script denotes the special polynomials defined in identity
(6.8)
deg .1/I, F and the super-
(5.5).
The matrix
follows at once from the classical scalar identity
[WEBER, 1898, §4]
ttl . ( L) (z - w) .L. zJ;; q-a, (w),
7T(Z) - 7T( w) upon setting
w
= F,
7T
J= l
= 1/IF'
q
deg 7T,
and invoking the Cayley-Hamilton theorem.
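The identity behind (6.8) is easy to check numerically against the resolvent formula Z = H(zI − F)⁻¹G at sample points. A sketch (Python with NumPy; for simplicity the characteristic polynomial is used as the annihilating polynomial, and the test system is ours):

```python
import numpy as np

def Z_resolvent(F, G, H, z):
    """Transfer function via Z(z) = H (zI - F)^{-1} G."""
    n = F.shape[0]
    return H @ np.linalg.solve(z * np.eye(n) - F, G)

def Z_via_68(F, G, H, z):
    """Same value via (6.8): psi(z) Z(z) = sum_j psi^(j)(z) H F^(q-j) G,
    with psi an annihilating polynomial of F (characteristic polynomial here)
    and psi^(j)(z) = z^(j-1) + a1 z^(j-2) + ... + a_{j-1}."""
    a = np.poly(F)                       # [1, a1, ..., aq]
    q = len(a) - 1
    psi_z = np.polyval(a, z)
    total = np.zeros_like(H @ G, dtype=float)
    for j in range(1, q + 1):
        psi_j = np.polyval(a[:j], z)     # psi^(j)(z), coefficients [1, a1, ..., a_{j-1}]
        total = total + psi_j * (H @ np.linalg.matrix_power(F, q - j) @ G)
    return total / psi_z

F = np.array([[0., 1.], [-2., -3.]])
G = np.array([[0.], [1.]])
H = np.array([[1., 0.]])
print(Z_resolvent(F, G, H, 2.0), Z_via_68(F, G, H, 2.0))   # both ~ 1/12
```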
Much of classical linear system theory was concerned with computing Zr
In the modern context, this problem "factors" into first solving
the realization problem
f ~ L f
a~d
then applying formula
(6.8).
See
Sections 8 and 9. One of the mysterious features of Rule (6.6) (as contrasted with the conventional rule (6.7)) is the necessity of reducing mowllo The simplest way of understanding the importance of this
1/1.
aspect of the problem is to show how to relate the module invariant factors occuring in the structure theorem (4 .34) to the classical facts concerning the invariant factors of a polynomial matrix. INVARIANT FACTOR THEOREM FOR MATRICES.
Let
P be a
matrix with elements in an arbitrary principal-ideal domain p
(6.10) where
A and
Bare
p X P
diag
IT
= rank
P.
The
II. 1
and
Rand
Then
m X m matrices (not necessarily det A, det B units in
<\' ..., "-q'
is unique (u p to units in q
R.
AlIB,
unique) with elements in
(6.11)
p Xm
with
0, ... , 0)
R) with
R,
II. E R 1
= 1, ... ,
i
whEe
are called the invariant factors of
and
q - 1,
P.
As anyone would expect, there is a correspondence between the module structure theorem (~.34) and the matrix structure theorem (6.9) and, in particular, between the respective invariant factors and
Ill' .•• , IIq.
t l'
... ,
Let us sketch the standard proof of this fact follow-
ing CURTIS and REINER [1962, §13.3] who also give a proof of (6.9). PROOF OF (4.34). onto
M given by
basis elements of M ~ RSjN,
Clearly,
~:
e.
Consider the R-h omomorphism from where the
H
1
m R (recall where
N
(4.6))
elements.
Write each basis element
1
and the
= kernel
is a free sutmodule of
e.
~.
are the standard gi
generate
M.
It can be proved that
with a basis of at most fj
Rm
of
N as
L Pij'e i,
.e < m Pij E R.
r
(6.9)
Apply
to the R-matrix
= L aij·e i•
~j
P.
f j =L
f k = A.g i i'
(6.10-11),
~
Define
cij'fi,
C
= B-1,
Hence
Then, by "direct sum", M
~
Let
THEOREM.
given by
(6.9),
i
=r
r
o
r = rank P = ,e.
=0
..• , Aq be the invariant fact ors of
Xz
(\' W) = Qi'
Then the
are
wi \
Ccn c Lde'r the ;:[z]-~pimorphism
ill E [O]Z = kernel (mod
i = 1, ... , q.
for
CJ. = !'a;Jl: \jIZ.
m OO?
(WZ)ill
and
is the smallest integer su ch that
+ 1, . • . ,
Clearly,
'J.'
and let
invariant f a ctors of
where
~
the same type of calculations, we can prove also
(6.12)
wZ
r
(4.34) holds with W. = A.
That is, ~
R/AREIl
""
w).
I-l
iff
Z'ill = 0
(see
I-l:
n
-7
(6.6».
Xz:
ill
-7
[ill]Z'
Equivalently,
Using the representation whose existence is claimed
by
(6.9),
write
-1
W = D !,
o/Z
Define
where diag (0/1' 1lr 2, ••• , t r , 1, ""
l'
M = 0, (1lrZ)W = 0,
Then
(C, J\ D = matrices over K[z].)
CAD
and W has clearly maximal rank among K[z]So the columns of the matrix W consti-
matrices with this property. tute a basis for
1).
kernel~.
The rest follows easily, as in the proof
of (4.34).
0
(6.13)
~~K.
The preceding proof remains correct, without any
modification, if the representation 1jrZ is taken in the ring
K[z]/o/K[z],
= CAD,
det C, det D
rather than in K[z].
= units
The former
representation follows trivially from the latter but may be easier to compute.
(6.14)
REMARK.
factors of Xz factors of
Theorem (6.12) shows how to compute the invariant
from those of 1j:Z.
must define the invariant
vIe
Z to be the same as those of X
z
bijective corre spondence
Z H
(because of the
Xz), Consistency with (6.12) demands
that we write
(6.15) where
Q.
1.
/
is defined as in
(6.3).
In other words, the -
the denominators of the scalar transfer function
A./o/ 1.
t.1.
are
a ~ ~~r cancellation
-
- -
of all common factors. Theorems (4.34) and (6.12) do not fully reveal the significance of invariant factors in dynamical systems. deduce all properties of
~trix-invariant
Nor is it convenient to factors from the representation
-
87 -
R. E. Kalman theorem (6.9).
It is interesting that the sharpened results we present
below are much in the spirit of the original work of WEIERSTRASS, H. J. S. SNITH, KRONECKER, FROBENIUS, and HENSEL, as summarized in the well-known monograph of (6 .16)
[1899].
~IDTH
DEFllUTION.
orization domain V, W (over
R,
R.
Let A, B rects.ngular matrices over a unique factAlB
(read:
A divides
of appropriate sizes)
B)
such that
iff there are matrices B
=
VAW.
This is of course just the usual definition of "divide" in a ring, specialized to the noncommutative ring of matrice s. The following result [~H 1899, Theorems IIIa-b, p. 52] shows that in case of principal-ideal domains the correspondence between matrices and their invariant factors preserves the divide relation (is "functcrial" with respect to "divide"): ( 6.17)
THEOREM.
if and O:lJ.y i f PROOF.
Let
R be a principal-ideal domain.
A. (A)I /,. (B) 1.
Sufficiency.
AlB
all .i .
Write the representation (6.10) as
B =
A
By hypothesis, there is a B
fc~
1.
Then
~
(diagonal) such that
V2~~W2'
V2ViiAVl~WlW~1~W2' (V2Vil)A(W~1~W2)·
~~ =~.
Hence
Necessity.
(6.18)
LEMMA.
domain
R,
This is just the following
For an arbitrary
A. (A) I A. (B) •
A!B implies PROOF.
~~igue-factorization
l.
l.
By elementary determinant manipulations, as in
o
MUTH [1899, Theorem II, p, 16-17].
o
This completes the proof of Theorem (6.17)
(6.19)
REMARK.
Since (6.9) does not apply (why?) to unique factori-
zation domains, for purposes of using Lemma (6.18) we need WEIERSTRASS's if tJ..(A)
definition of invariant factors: all
j Xj
A. (A) = l.
minors of a matrix A,
c:l. (A)/tJ..l.- l(A).
=
J
greatest common factor of
with tJ. (A) o
= 1,
then
Of course, this definition can be shown to be
equivalent (over principal-ideal domains) to that implied by (6.9). In analogy with Definition (6.16), let us agree (note inversionl) on
(6.20) Zl'Z2
DEFINITION. (read:
such that
Zl
Zl
divides
= VZ 2W.
PROOF.
Let
Zl' Z2 be transfer-function matrices Z2)
(Note that
iff there are matrices Zl'Z2
This is the natural
V, ~1
implies at once:
cOlL~terpart
over
K[z]
*Z i*Z .) 1 2
of Theorem (6.16),
and follows from it by a simple calculation using the definition of ~i(Z)
given by (6.15).
o
(6.22) iff
~ I~,
that is, iff XE
X E2
[or isomorphic to a quotient module of X ]. E 2
121
is isomorphic to a submodule of
This definition is also functorially related to the definition of "divide" over a principal ideal domain
R because of the following
standard result:
(6.23)
THEOREM.
R-modules.
Then
Let
R be a principal-ideal domain and
X, Y
Y is (isomorphic) to a submodule or quotient module
of X i f and only i f w.(Y)llJr.(x), ~ ~ PROOF.
1, • • •, r(Y)
i
Sufficiency.
< r(X).
Take both X and
Y in canonical
form
(4.34), with xl' ••• , Xr(X) generating the cyclic pieces of X,
and
Yl' ••• , Yr(X) (with
assignment
y.
~
is, exhibits assignment
H
Yi
=0
(lJr.(x)/v.(Y)}x. ~
~
1
if
Y as (isomorphic to) a submodule of X. xi
H
Yi
Necessity
and
Y.
The that
Similarly, the
defines an epimorphism X --+ Y exhibiting
Y as
X.
(follovring BOURBAKI [Algebre, Chapter 7 (2 e ed.),
4, Exercise 8]). Let Y be a submodule of X.
X ~ LIN where theorem,
those of
defines a monomorphism Y --+ X,
(isomorphic to) a quotient module of
Section
i > r(Y))
L, N are free R-modules.
By
(4.34),
By a classical isomorphism
Y is isomorphic to a quotient module
MIN,
where
M is free (since submodules of a free module are free).
L:J M :J N
R. E.Kalman From the last relation, that, for any R-module
r(Y) ~ reX). X and any
Now observe, again using (4.34)
7f E R,
and therefore ideal generated by Since
7fY
is a submodule of
R1I'k (X) J R!V (Y), k
7fX
(7f:
for all
< k},
r(7fX)
7f E R,
it follows that
and the proof is complete for the case when
a submodule of X.
Y is
The proof of the other case is similar.
o
(6.24) Immediate from the fact that
PROOF. of
E (see Section Now
(6.25)
is a submodule
7).
o
we can summa.rize main results of this section as the PRD-lE DECOMFOSITION THEOREM FOR LINEAR DYNAMICAL SYSTEMS.
The following conditions are e 0uivalent: (i)
Zl
~i(ZI)
(ii) (iii)
E Z
PROOF. (6.23), since
divides
1
Z2·
divides
~i(Z2)
can be simulated by
for all E
Z
2
i.
•
This follows by combining Theorem (6.21) with Theorem
~.(Z) J.
= ~J..(EZ)
by definition.
o
INTERPRETATION.
(6.26)
The definition of Zl'Z2 means, in
syst~
theoretic terms, that the inputs and outputs of the machine whose transfer function is an input
Ul
2
Z2 are to be "recoded":
the original input
= B(z)wl
r 2 is replaced by an output
and the output
r 1 = A( z)r 2; with these "coding" operations, a machine with transfer function transfer function, the equation A, B are replaced by
A, B
Zl. Zl
L
Ul
2
is replaced by
wj.ll act like
2
In view of the definition of a
= AZ 2B
(reduced modulo
is
al~~ys
tz
2
).
satisfied whenever
This means that the
coding operations can be carried out physically given a delay of d = deg t
z2
units of' time (or more). No feedback is involved in coding,
it is merely necessary to store the
d
last elements of the input and
Hence, in view of Theorem (6.25) and Corollary (6.24),
output sequences.
we can say that it is possible to alter the dynamical behavior of a system L
2
arbitrarily by external coding involving delay but not
feedback if and only if the invariant factors of the desired external behavior
(Zl)
behavior
(ZL;)
called the
are divisors of invariant factors of the exter~al of the given system.
2 PRll~S
of linear systems:
The invariant factors may be they represent the atoms of system
behavior which cannot be simulated from smaller units using arbitrary but feedback-free coding.
In fact, there is a close (bot not isomorphic)
relationship between the Krohn-Rhodes primes of automata theory (see KAIMAN, FALE, and ARBIB [1969, Chapters 7-9]) and ours.
A full treat-
ment of this part of' linear system theory will be published elsewhere.
7.
ABSTRACT THEORY OF REALIZATIONS
The purpose of this short section is to review and expand those portions of the previous discussion which are relevant to the detailed theory of realizations to be presented in Sections 8 and 9.
The same
issues are examined (from a different point of view) also in KALMAN, FALB, and ARBIE [1969]. Let
f:
n~ r
construction of
X f,
(Sections 3 and 4). I-l :
f
n
be a fixed input/output map.
->
as a set and as carrying a It is clear that
X : m f
(i) f
=
Let us recall the K[z]-module structure
Lfo~f'
where
[to},
H
'f: X ~ r: [m]f 4 f
f(m)
(ii)
are K[z]-homomorphisms, and
I-l
f
= epimorphism while
L
f
= mononorphi.sm.
We have also seen that
(7.1)
C'
'f
=
epimorphism monomorphism
<=> Xf
is completely reachable;
< > Xf
is completely observable.
These facts set up a "functor" between system-theoretic notions and algebra which characterize
X f
uniquely.
Consequently, it is desirable
to replace also our system-theoretic definition of a r 2alization (3.12) by a purely algebraic one: DEFINITION. is any factoriz ation
A realization of a f
Kl z j-homomor-phd sm f: n
that is, any commutative di8§ram
->
r
-
93 -
R. E. Kalman of
K[z]-homomorphisms.
The
module of the realization. co~letely
A
K[z]-module
X is called the state
realization is canonical iff it is
reachable and completely observable, that is,
surjective and
= P,
(or X
(7.3)
REMARK.
~
= f,
X
= n,
E
G, H) by
F:
~
X:
~
restricted to the submodule
G
H
ln'
It is clear that a realization in the sense of (3.12)
= (F, X
~
= lp).
L
can always be obtained from a realization given by (7.2). define
is
is injective.
A realization always exists because we can take f
~
x
In fact,
z·x,
H
followed by the projection Y
(w:
Iw]
H
Y(l).
lJ.
It is easily verified that these rules will define a system with f, x
= f.
Given any such E, X
=
it is also clear that the rules
J1:,
~: w ~ t~ F~tGEw(t), V:
x
l->
(~x, ~FEx, •••
define a factorization of
f.
Hence the correspondence between (3.12)
~ (7.2) is bijective.
The quickest way to exploit the algebraic consequences of our definition (7.2) is via the following arrow-theoretic fact:
ZEIGER FILL-IN LEMMA. and
5
~
A, B, C, D be sets and ex,
s, r,
set maps for which the following diagram commutes: ex
4
>
B
./
./
r ./
VJi/
~
~
5
is surjective and
t3
./
C
If ex
./
./
;>
W D
5 . i s injective, there exists a unique set
corresponding to the dashed arrow which preserves commutativity.
This follows by straightforward "diagram-chasing", which proves at the same time the COROLLARY.
The claim of the lemma remains valid if "sets"
are replaced by "R-modules" and "set maps" by "R-homomorphisms". Applying the module version of the lemma twice, we get
(7.6)
PROPOSITION.
fixed
f:
Consider any two canonical realizations of a
the corresponding state-sets are isomorphic as K[z]-modyles.
Since every K[z]-module is automatically also a K-vector space, (7.6) shows that the two state sets are K-isomorphic, that is, have the same dimension as vector spaces.
The fact that they are also K[z]-isomorphic
implies, via Theorem (4.34), that they have the same invariant factors. We have already employed the convention that (in view of the bijection between
f
and l:f)' the invariant factors of
f
and X f
are to be
identified.
In view of
(7.6),
this is now a general fact, not dependent
on the special construction used to get
(7.6)
x. f
We can therefore restate
as the
(7.7)
ISOHORPHISM THEOREM FOR CANONICAL REALIZATIONS.
canonical realizations of a fixed
f
Any two
have isomorphic state module s.
The state module of a canonical realization is uniquely characterized (up to isom orphism) by its invariant factors, which may be also viewed as those of
f.
A simple exercise proves also
(7.8)
PROPOSITION.
realization
f,
then
If
X is the state module of a canonical
dim X (as a vector space) is minimum in the
class of all realizations of
f.
This result has been used in some of the literature to justify the terminology "minimal realization" as equivalent to "canonical realization".
'-Ie shall see in Section 9 that the two notions are
not aD~Ys equivalent; we prefer to view (7.2) as the basic definition and
(7.8)
as a derived fact.
REMARK. claimed (4.24).
2 = (F, G, H)
(7.7)
Theorem
constitutes a proof of the previously
To be more explicit:
if
E
(F, G, H)
and
are two triples of matrices defining canonical realiza-
tions of the same
f,
then
space isomorphism A: X -)
(7.7)
X
implies the existence of a vector-
such that
-
96 -
R . E. Kalman F
(7.10)
'" G
AG,
1\
If we identify X and X then A is simply a basis change and it follows that the class of all matrix triples which are canonical realizations of a fixed grOUp over
f
is isomorphic with the general linear
X. The actual computation of a canonical realization, that is,
of the abstract Nerode equivalence classes
[m]f'
require a consider-
able amount of applied-mathematical machinery, which will be developed in the next section. a factorization of
The critical hypthesis is the existence of f
such that
expressed by saying that
f
dim X
<
(this is sometimes
00.
has finite rank.)
Given any such reali-
zation, it is possible to obtain a canonical one by a process of reduction.
(7.n)
More precisely, we have THEOREM.
Every realization of
f
with state module
X
contains a subquotient (a quotient of a submodule, or equivalently, a submodule of a quotient)
X* of X which is the state-module of
a canonical realization of
f.
PROOF.
The reachable states
X
r
of X and so are the unobservable states X*
R<
X/Xr
n Xo
is a subquotient of X.
X* is a canonical state-module for
f.
=
ima ge
a r e a submodule
X = kernel r , o
Hence
It f'o.l.Lows immediately that
[The proof may be visualized
via the following commutative diagram, where the canonical injections and projections.]
~
j
IS
and
pI s
are
o
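Theorem (7.11) corresponds to a familiar numerical reduction: restrict to the reachable subspace, then factor out the unobservable part; the input/output map is unchanged. A sketch (Python with NumPy; orthonormal bases are used for the submodule and the quotient, and the example is ours):

```python
import numpy as np

def reduce_to_canonical(F, G, H, tol=1e-9):
    """Reduce a realization (F, G, H) to a canonical (completely reachable and
    completely observable) one with the same input/output map, as in (7.11)."""
    def colspace(M):
        U, s, _ = np.linalg.svd(M, full_matrices=False)
        return U[:, s > tol]
    n = F.shape[0]
    # Step 1: restrict to the reachable submodule X_r = image [G, FG, ...].
    R = np.hstack([np.linalg.matrix_power(F, k) @ G for k in range(n)])
    P = colspace(R)
    F1, G1, H1 = P.T @ F @ P, P.T @ G, H @ P
    # Step 2: factor out the unobservable states, kernel of [H1; H1 F1; ...].
    O = np.vstack([H1 @ np.linalg.matrix_power(F1, k) for k in range(P.shape[1])])
    Q = colspace(O.T)                    # orthonormal basis of (kernel O)-perp
    return Q.T @ F1 @ Q, Q.T @ G1, H1 @ Q

# A 3-dimensional realization whose canonical state module is 1-dimensional.
F = np.diag([0.5, 0.2, 0.9]); G = np.array([[1.], [1.], [0.]]); H = np.array([[1., 0., 1.]])
F2, G2, H2 = reduce_to_canonical(F, G, H)
print(F2.shape)                                                          # (1, 1)
print(H @ np.linalg.matrix_power(F, 3) @ G, H2 @ np.linalg.matrix_power(F2, 3) @ G2)
```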
REMARK.
Since any subquotient of X is isomorphic to a
submodule (or a quotient module) of X, that
it follows from Theorem (6.23)
X can be state-state module of a realization only if ~i(f)l~i(X)
for all
i
(recall also Corollary (6.24)).
not enough since the
~i
This condition, however, is
are invariants of module isomorphisms and not
isomorphisms of the commutative diagram (7.2). The preceding discussion should be kept in mind to gain an overview of the algorithms to be developed in the next sections.
8.
CONSTRUCTION OF REALIZATIONS
Now we shall develop and generalize the basic algorithm, originally due to B. L. Ho (see HO and KALMAN [1966]), for computing a canonical E = (F, G, H)
realization
of a given input/output map f.
Most of
thp. discussion will be in the language of matrix algebra. Here and in Section 9 boldface capital letters*
Notations.
will
denote block matrices or sequences of matrices; finite block matrices will be denoted by small Greek subscripts on boldface capitals; the elements of such matrices will be denoted by ordinary capitals.
This
is intended to make the practical aspects of the computations selfevident; no further explanations will be made. Let
f:
n~ r
the K-linearity of
be a given, fixed K[z]-homomorphism. f
Using only
we have that
f(oo) (1)
(8.1) where the
~
(k > 0)
a re
p X m matrices over the fixed field
K.
We denote the totality of these matrices by
Then it is clear that the spe cification of a
K[z]-ho ~omorphism
is equivalent to the specification of its matrix sequence over, if L
(8.2)
reali zes
f(oo) (1)
*Note to Printer:
f
~ (f).
(8.1) can be written explicitly as
t~ HF-tChJ( t) .
=
Indicated by double underline.
f More-
Comparing (8.1) and (8.2) we can translate (3.12) into an equivalent matrix-language DEFINITION. A dynamical system
~G,
realizes a
A iff the relation
(matrix) infinite seguence
~+l
L = (F, G, H)
0, 1, 2, ...
k
is satisfi ed, Let us now try to obtain sequence
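Definition (8.3) can be checked term by term for as many k as desired. A sketch (Python with NumPy; the example system is ours):

```python
import numpy as np

def markov_parameters(F, G, H, N):
    """The first N terms A_1, ..., A_N of the sequence realized by (F, G, H),
    i.e., A_{k+1} = H F^k G for k = 0, 1, ..., N-1 (Definition (8.3))."""
    return [H @ np.linalg.matrix_power(F, k) @ G for k in range(N)]

F = np.array([[0., 1.], [-0.5, 1.5]])
G = np.array([[0.], [1.]])
H = np.array([[1., 0.]])
for k, A in enumerate(markov_parameters(F, G, H, 5), start=1):
    print(k, A.ravel())          # A_1 = 0, A_2 = 1, A_3 = 1.5, ...
```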
A
also a matrix criterion for an infinite
to have a finite-dimensional realization.
The simplest
way to do that is to first write dO\{ll a matrix representation for the map f: n
->
r,
So let
r~l
A 2
R(A)
::: =
C
and verifY that ~
A 2 A 3 A4
A 3 A4 A 5
represents
~(~(f»
column vector ~~th elements
Classically, with A.
~(~)
... ,
when w E n
is ,~ewed as an
(wl(O), •.. , wm(o), Wl ( l ) , . .. ).
is known as the (infinite) Hankel matrix associated
We denote by
H
the
=~,v
~
ing in the upper left-hand corner of
(8.4)
f
X V block submatrix of H.
E be any realization of A.
PROPOSITION.
Let
rank ~~, v(~)
< dim
E
H appear-
for all
>1,
v > 1.
Then
(8.5)
COROLLARY.
realization only if
An infinite sequence
rank
has a finite-dimensional
~
is constant for all
~~,V(~)
~, V sufficiently
large. PROOF.
If
dim
E
=
00,
the claim of the proposition is
vacuous (although formally correct!).
Assume therefo:e that
dim E <
00
E the finite block matrices
and define from ~V
[G, FG, ••• , F
V-l G ]
and
Then O'R
=~=V
=
by the definition
and
rank 0
=~
H
=~,
(A)
V =
(8.3) of a realization.
are at most
n
=
dim E.
I t is clear that
rank R
=V
Thus our claim is reduced to
the standard matrix fact rank (AB)
o
< min (rank A, rank B}.
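Proposition (8.4) is likewise directly checkable: build a finite block Hankel matrix from the A_k and compare its rank with dim Σ. A sketch (Python with NumPy; helper and example ours):

```python
import numpy as np

def block_hankel(A, mu, nu):
    """mu x nu block Hankel matrix built from the list A = [A_1, A_2, ...],
    with (i, j) block A_{i+j-1}."""
    return np.block([[A[i + j] for j in range(nu)] for i in range(mu)])

# Markov parameters of a 2-dimensional system: rank of any finite block is <= 2.
F = np.array([[0., 1.], [-0.25, 1.]]); G = np.array([[0.], [1.]]); H = np.array([[1., 0.]])
A = [H @ np.linalg.matrix_power(F, k) @ G for k in range(8)]
print(np.linalg.matrix_rank(block_hankel(A, 3, 3)))   # 2 = dim Sigma
```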
Our next obj ective is the proof of the converse of the corollary. This can be done in several ways.
The original proof is due to HO and
similar results were obtained independently and and TISSI [1966] as well as by SILVE~~ [1966]. are analyzed and compared in Section 11].
KAh~,
FALB, and
~f
concurre~tly
[1966];
by YOULA
Two different proofs
JL~IB
[1969, Chapter 10,
All proofudepend on certain finiteness argument s.
We
shall give here a variant of the proof developed in HO and KALMAN [1 969].
(8.6)
DEFINITION.
the sequence
~
The infinite Hankel matrix H associated with A = (A', A")
has finite length
iff one of the follow-
ing two equivalent conditions holds: min {.e': rank
A" A'
(min
!!o, v =rank !io, v for all -~ , -~ +K,
.e": rarik!!-11,~0"
is the row length of H and
rank H 0"+K =j.4,~ A"
K,
for all
v
K,
1, 2,
... }
1, 2,
11
is the column length of H.
The equivalence of the two conditions is immediate from the equality of the row rank and column rank of a finite matrix.
The proof
of the following result (not needed in the sequel) is left for the reader as an exercise in familiarizing himself with the special pattern of the elements of a Hankel matrix: PROPOSITION. either both true
[~
For any has finite
g,
the following inequalities are
le~]
or both false [otherwise]:
< rank ~mA", A" < mA", A" < rank H pA', =71.' ,pA' < A'
The most direct consequence of the finiteness condition given by
(8.6) is the existence of a finite-dimensional representation Z of the shift operator
CIA
acting on a sequence
A.
will be the Hankel matrix associated with a given A.
~
and
The "operand" As we shall see
soon, this representation of the shift operator induces a rule for
< ee
.•• }<
co ,
computing the matrix we would expect:
F of a realization of ~.
This is exactly what
module theory tells us that, loosely speaking,
DEFINITIOX~
The shift ope=ator
~A
on an infinite sequence
A is given by
the corresponding shift ouerator on Hankel matrices is then
(Of course,
~H
is well-defined also on submatrices of a Hankel matrix.)
(8.8)
MAIN LEMMA.
seguence
~
A Hankel matrix
H associated with an infinite
has finite length if and only if the shift operator
has finite-dimensional left and right matrix representations. H has finite length and
l" X £11
A = (A', A")
block matrices
Sand
and
f,.
(8.9) is
A" X A".
PROOF. which satisfies
Sufficiency.
Ta.1{e any
X £.
such that
and furthermore the minimum size of these matrices satisfying A' X A'
Precisely:
if and only if there exist Z
cr H
£11 X £tI
(8.9). Compute the last column of
block matrix ~J.l,£ "~:
Z
(8.10)
= 0,
(where
j
of ~).
Relation (8.10) proves that
1, ...
Z
>Iv
rank ~K+l,£"
rank ~K+l,£"
(>I, v)th
is the
for all
for all
element block
0, 1, ...
K
the general case follows by repetition of the same argument. existence of the claimed
Z
cannot exceed the size of of the smallest
implies that the col umn length
Z.
If actually
Z which works in
the necessity part of the proof.
A"
Hence the A"
of
H
is smaller than the size
(8.9), we get a contradiction from
The claims concerning
S
are proved
by a strictly dual argument. Necessity. (A" + 1)
th
By the definition of
block column of
~>I, A"+l
exist
(8.U)
m X m matrices
Zl' .• . , ZA"
Aj+lZA" + Aj+2Z A"+l
holds identically for all
j
~>I,
= 0,
A"+l;
moreover, this
no matter how l arge.
So there
su ch that the relation
+ • • • + A O+' Z J "" 1
A" X A" block c ompanion mat r Lx of just defined:
>I,
each column of the
is linearly dependent on the
columns of the precedi ng block c olumns of property is true f or all intege rs
A",
1, ..••
=
A
JO+I+"'" 1\
Now define
Z t o be a n
m X m block made up from the
Zo ~
-
104 -
R. E. Ka lman
0
0
0
0
ZA"
I
0
0
0
Z""_l
0
I
0
0
Z""_2
o
o
o
o
o
o
o
I
Z
The verification of (8.9) is immediate, using (8.11) . of
A' X A'
block matrix
S
The existence
verifying (8.9) follows by a strictly
o
dual argument.
Now we have enough material on hand to prove the strong version of Corollary (8.12)
(8.5):
THEOREM.
An infinite s eguence
realization of dimension
n
if and only if the as sociated Hankel
matrix H has finite length PROOF.
A has a finite-dimensional
Sufficiency.
A=
o.I,
Let
A").
~AII, 1
be a
column matrix whose first block element is an the other blocks are
m X m zero matrices.
define
(8.13)
~AIl,l'
!!l -
1. 11 •
, F\
A" X 1
block
m X m unit matrix and
Using (8.9) with
.e"
=A",
-
105-
R. E. Kalman Then, for all
k
~
0,
comput ~
~l, A'l~A", l~ (J~l, A"~A", 1;
mXG
(8.9).
the second step uses
By definition of
(JA and ~, the last k ~«(JA(~))' namely A1+k'
matrix is just the
(1, l)th element of
Hence the given
is a realization of A.
E
Necessity.
This is immediate from Cor :
;a r y (8.5).
0
Now we want to attack the problem of finding a canonical realization of ~,
since the realization given by (8.13) is u~ually very far
from canoirl.cal.
Our succeeding consideratiorn here and in Section
9
are made more transparent if we digress for a moment to establish another consequence of
(8.8).
By outrageous abuse of language, we shall say that length iff (8.14) order ~
=
has finite length.
1i(~)
DEFINITION.
We note
An infinite sequence B is an extension of
N of (the initial part of) an infinite seguence B k
(8.15)
for
k
THEOREM.
= 1,
N of ~,
By
(8.8),
A iff
••• , N.
No infinite seauence of finite length
has distinct length-preserving extensions of any order PROOF.
A has finite
Suppose
(A', A")
N > A' + A".
B is a length-preserving extension of order
the length of both sequences being both sequences satisfy relation
(A', N'),
(8.9),
with
with suitable
N > A' + N'. ~ and
~B'
-
106-
R. E . Kalman
The sequence
A is uniquely determined by
from the left and the sequence acting on the matrix
N.
acting on
~AI, 7\"(~)
B is uniquely determined by from the right .
~A', A"(~)
are equal by hyp othesis on
~
~B
The two matrices
Moreover,
and
are also equal, since the matrices on the right-hand side depend only on the
2nd, •.• , N-th
member of ea ch sequence.
Using only this fact
and the associativity of the matrix product 11:-1
~AI, A"~~B ;:
So
'
k-l
~~AI, N'~B
'
o
B.
A
Now we can hope for a realization algorithm which uses only the first
A' + A"
terms of a sequence of finite length.
In fact, we have
(8.16)
B. L. HO' s REALIZATION AIDORITHM.
seguence
A of finite length with associated Hankel matrix
Consider any i nfinit e
following steps will lead to a canonical realization of
A:
H.
The
-
107 -
R . E. Kalman (i)
Determine
(ii)
Compute
nonsingular
pA'
X
pN
A', A". n = rank ~A'I A"; and
mA"
X'
mA"
in doing so, determine
matrices
P, Q su ch that
(8.17)
(iii)
(8.18)
Compute
Rn P!!" ,,,~, - /\ ,/\
G
H = are idempot ent "editing" matrices c orre spondi ng to the operations "r et a i n onl y the first
p
rows" and "retain only the first
m columns". We claim the (8.19)
REALIZATION THEOREl·! FOR INFINITE SEQUENCES.
seguence
~
(A', N'),
whos e a ssociat ed Hankel mat r i x
~
For any infinite
ha s f inite length
B. L. Ho's f or mula s (8.17-18) yi el d a canonical r ealization. PROOF.
If E
defined by (8.17-18 ) is a realization of ~,
then it is certainly cano ni ca l : the class of all realizations of
by ~
(8.4)
L
ha s minimal dimension in
and so it i s canonical by (7. 8) .
The required verification is int eresting. subscripts.
Observe that
l!
H
n
= QCRP
First, drop all
is a pseudo-inverse of
~,
that
-
108 -
R.E.Kalman is,
~~ =~.
Then, by definition of
~G
F, G, H,
II
.'m d
~,
(~Q.C)(RP[(J"r&]Q.C)k(~C),
~(~II[(J"~])~~C; by repeated application of (8.9),
~(~I1~)~~C ~~(~II~)k-~~C,
RS~~C,
~~C, R[(J"~]C. The last equation calls for picking out the first first (8.20)
m columns of COt~NT.
(J"~,
which is just
A+ l k,
p
rows and the
as required.
0
This is a considerably sharper result than Theorem
(8.12), in two respects: (i) use the matrix (ii) form:
It is no longer necessary to compute ~", , ,," «(J"At;;) ,
~:
we simply
which is part of the data of the problem.
Formulas (8.18) give the desired realization in minimal
there is no need to reduce (8.13) to a minimal realization (recall
here (7.11». Notice also that the proof of (8.19) does not re~uire (8.12) but depends (just like the latter) on direct use of (8.8).
-
109 -
R. E. Ka l ma n
An apparently serious limitation of the algorithm (8.16) is the
necessity to verify abstractly that
has finite length".
"~
Of
course, this can be done only on the basis of certain special hypotheses on ~'
given in advance.
(ii) ~
= coefficients
(Examples:
=0
(i) ~
for all
k > q;
of the T~lor expansion of a rational function.)
Fortunately, the difficulty is only apparent, for the preceding developments can be sharpened further: F1JNDA.MEN'rAL THEOREM OF LINEAR REALIZATION THEORY.
(8.21)
any infinite sequence
~
and the corresponding Hankel matrix H.
Suppose there exist integers (8.22)
1,1, 1,"
such that
rank
u., +,.r, 1 n,,(~), _
rank
~£ I, 1,"+1 q~) .
_.r,
'" of Then there exists unique extension A
such that with
A'
<
~=
1,1
i\' = 1, I, i\"
PROOF.
and
= 1,"
~
of order
I ,
1,1
+
1,"
moreover, applying formulas (8.17-18) gives a canonical realization of
A.
Exactly as in the necessity part of the proof of
(8.8), condition (8.22) implies the existence of ~
crJ£.e
Consider
1,11 (~)
'" of Define an extension A
~
of order
t
I
+ t"
k> 1.
by
and
Z such that
-
110-
R. E. Kalman By repeated application of (8.23), it follows that we have also
k > O.
Now i t is clear, from (8.8), that
A~
<
A""
£'
and
~
<
A""
£11.
ness of the extension follows immediately from (8.15). Theorem (8.19) is still valid, even though
(£', £11)
The uni.que-
Moreover, is not necessarily
minimal, because the proof of (8.19) depended only on (8.9) and not on theminimalityof
o
(£', £11).
Theorem (8.21) says, in effect, that a canonical realization of some extension of ~
is always possible as scon as (8.22) is satisfied.
Moreover, (8.22) can be used as a practical criterion for constructing by trial and error a canonical realization of any A known to have finite length (but without being given (8.24)
EXAMPLES.
(i)
A', A").
There is no scalar infinite seCJ.uence
(p
m "" 1)
A for which (8.22) is never satisfied.
(ii)
If ~£I,£11
is squar-e and has full rank (for instance,
in the scalar case), then (8.22) is automatically satisfied. (iii)
If the algorithm (8.16) is applied without any informa-
tion concerning condition (8.22), the system always realize some extension of
~,
E defined by (8.18) will
at least of order
1.
It is not
known, however', hO\'T to get a simple formula which would determine the maximal order of this extension of A. The remaining interesting CJ.uestion is then:
What can be said if
(8.22) is not satisfied for a finite amount of data
AI' ••• , ~
and
-
111 -
R.E.Kalman
any
£ I , t"
satisf'ying
£ I + £"
N.
This problem is the topic of
the next section.
(8.25)
FINAL COlt.l1Elf:'.
An essential feature of B. L.Ho's algorithm
is that is preserves the block structure of the data of the problem. course, one can obtain pa"allel r€'sults by treating
~£"£"
ordinary matrix, disregarding its block-Hankel structure. procedure requires looking at a minor of
~
Of
as an
Such a
of maximum rank, and was
described explicitly by SILVERl/AN [1966] and SILVERl'.AN and MEADOHS [1969]. There does not seem to be any obvious computational advantage associated with the second method.
-
112 -
R . E . Kalman
9.
THEORY OF PARTIAL REALIZATIONS
In one obvious respect the theory of realizations developed in the previous section is rather unsatisfactory: it is concerned with infinite sequences. From here on we call a system satisfying (8.3) a complete realization, to distinguish it from the practically more interesting case given by
=
1
2,
be an i nfi ni t e
••• )
K. A dynami cal
p x m matrice s ove r a fixed fi eld
2: = ( F, G, H)
is a partial realization of order
r
of
iff
~+l
~G for
k
0, 1,
...,
r•
We shall us e t he same ter :ninology if, Ln st.e a d of a n i nfinit e se quence s
~
r.
!';, we are given merely a f i nite s equence The re a son f or
t~i s
A
=s
convent ion will be cl ea r fr om the di s-
cussion to f ollow.
We sh all call the firs t
sequence (o f orde r
r).
r
terms of
A a partial
The con c epts of ca non ical par tial r eal ization a nd minima l partial reali za tion wi l l be under stood in exact ly the same sense as for a complet e reali zation .
We warn t he r ea der, bowever, that now the s e
two notions wi l l turn out t o be i nequivalent, in that minimal partial
=?
ca nonical partial
but not c onvers el y .
Our mai n intere st wi l l be t o determi ne a l l equivalence clas s es of mi nimal pa r t i al realizat i ons; i n gene ra l , a given sequence wi ll.
-
113 -
R. E. Kalman
have infinitely many inequivalent minimal partial r ealizations if r
is sufficiently small. According to the Main Theorem (8. 21) of the theory of realiza-
tions, the minimal partial realization problem has a unique solution whenever the rank condition (8.22) is satisfied.
If the length
r
of the
partial sequence is prescribed a pr i ori, it may well happen that (8. 22) does not hold.
vThat to do?
realization
(F, G, H)
sequence of
~r
Clearly, if we hav e a minimal partial
of order
r
we can extend the partial
on which this realization is based to an infinite
sequence ca noni ca l ly r ealized by
(F, G, H)
simply by setting
Consegu'ntly, we have the preliminary PROPOSITIor.. realization f or
~r
The
de ter~ inat io n
is e quivalent t o t he
extensions of a partial sequence
A
=:c
of a
mini ~~l
dete rminati ~n
partial of a l l
such t hat the extended
sequence i s (i) finite-dh ensj. =::?l .;:J.G, ,"ore st r ong' Y,
(ii) it" dimension is minimal in t he class of all ext.ensi.ons . It is trivial to prove that finite-d imensional exte nsi ons exi st for any pa rtial sequence ( of finit e length). reduced t o determining
extensi on~
Hence the problem i s
whi ch have mini mal dimen sion.
solution of this latter problem c onsists of t wo step s.
i ~e d i a tely
The
First, we show
by a trivial a r gument that the min imal dimensi on can be bounded fr om
-
114-
R.E.Kalman below by an examination of the Hankel array defined by the partial sequence.
Second, and this is rather surprising, we show that the
lower bound can be actually attained.
For further details, especially
the characterization of equivalence classes of the minimal partial realizations, see KALMAN [1969c and 1970b). DEFINITION.
(9.3) DEFINITION. By the Hankel array H(A_r) of a partial sequence A_r we mean the r × r block Hankel matrix whose (i, j)th block is A_{i+j−1} if i + j − 1 ≤ r, and undefined otherwise.

In other words, the Hankel array of a partial sequence A_r consists of block rows and columns made up of subsequences A_p, ..., A_r (1 ≤ p ≤ r) and blank spaces. Let n_0(A_r) be the number of rows of the Hankel array of A_r which are linearly independent of the rows above them.

(9.4) PROPOSITION. The dimension of a realization of A_r is at least n_0(A_r).
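The quantity n_0(A_r) is computable directly from the data. The sketch below (illustrative only; the function name, tolerance, and sample values are invented) scans the Hankel array from the top and, for each row, tests independence using only the columns that are actually defined for that row.

```python
import numpy as np

def n0_partial(A_blocks, tol=1e-10):
    """n_0(A_r): number of rows of the partial Hankel array linearly independent of the
    rows above them. A_blocks holds A_1, ..., A_r (each p x m). A sketch only."""
    r = len(A_blocks)
    count = 0
    rows_above = []                                    # previously scanned rows
    for b in range(1, r + 1):                          # block row b is defined on r-b+1 block columns
        block_row = np.hstack([A_blocks[b - 1 + j] for j in range(r - b + 1)])
        ncols = block_row.shape[1]
        for row in block_row:                          # the p scalar rows of this block row
            if not rows_above:
                indep = np.linalg.norm(row) > tol
            else:
                M = np.vstack([ra[:ncols] for ra in rows_above])
                indep = np.linalg.matrix_rank(np.vstack([M, row]), tol) > np.linalg.matrix_rank(M, tol)
            if indep:
                count += 1
            rows_above.append(row)
    return count

A_blocks = [np.array([[1.0]]), np.array([[2.0]]), np.array([[0.75]])]
print(n0_partial(A_blocks))      # 2: a minimal realization of this partial sequence has dimension 2
```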
PROOF. sequence of ~,
~
The rank of
~
Hankel matrix of an infinite
is a lower bound on the dimension of
by Proposition
(8.4).
~
of
This implies "filling
~r'
in" the blank spaces in the Hankel array of =:l'"
~r'
is filled in, the rank of the resulting
=r
matrix is bounded from below by By
realization
By Proposition (9.2), it suffices
to consider a suitable extension
H(A)
~
Regardless of how r Xr
block Hankel
n (A ). o =r
o
the block symmetry of the Hankel matrix, we would expect
to be able to determine
n (A) o =r
by an analogous examination of the
-
115 -
R. E. Kalman
columns of the Hankel array of lower bound.
fir'
thereby obtaining the
This is indeed true.
~
We prefer not to give a direct
proof, since the result will follow as a corollary of the Main Theorem
(9.7).
The critical fact is given by the
MAIN LEMMA.
fir
For a partial seguence
~:
smallest integer such that for row of H(A) - - - = =r
k'> A'
every
is linearly dependent on the
rows above it. AJ'(A ) =r
smallest integer such that for column in the
k-th
k" > A"
every
block column of H(A) = =r
is linearly dependent on the columns to the left of it. Every partial sequence seguence
~
may be extended to an infinite
A in at least one way such that the condition n (A) o =r
for all
~
> A' (A ), v > A" (A ) =r =r
is satisfied. PROOF.
The existence of the numbers
It suffices to show, for arbitrary such a way that the numbers
A', A",
and
Consider the first row of Ar+l
n
r,
is trivial.
how to select
Ar+l
remain constant.
and examine in turn all the
first rows of the first, second, third, ••. ,
!! U ). - -r
o
A'. A"
ALth
block rows in
If the first row of the first block row is linearly depen-
dent on the rows above it (that is,
0), we fill in the first row
in
-
116-
R. E. Kalman
of Ar+l
using this linear dependence (that is, we make the first
row of Ar+l
all zeros).
This choice of the first row of Ar+l
will preserve linear dependencies for the first row of every block row below the second block row, by the definition of the Hankel pattern.
If the first row in the first block row is linearly
independent of those above (that is, contributes
I
to
n (A
o =r
we pass to the second block row ana repeat the procedure.
»,
Eventually
the first row of some block row will become linearly dependent on those above it, except when
A' = r; in that case, choose the first
row of Ar+l
to be linearly dependent of the first rows of
••• , A r•
Repeating this process for the second, third,
~,
of each block row*, eventually ing
At
or
Ar+l
rows
is determined without increas-
n. o
To complete the proof, we must show that the above definition of Ar+l
also preserves the value of
~~
That is, we must show
that no new independent columns are produced in the Hankel array of ~r
when Ar+l
is filled in.
that the definition of Ar+l rank H =r, I rank I]-r- 1 , 2
rank HI = ,r
This is verified immediately by noting implies the conditions
rank!! -r+I , l' rank ~r, 2'
rank ~2, r
rank ~l,r+l.
*Of course, linear dependence in the first step does not imply that the corresponding row of A_{r+1} will be all zeros.
o
-
117 -
R. E. Kalman
With
th~
a.id of this simple but subtle observation, the problem
is reduced to that covered by the V~in Theorem
(8.21) of Section 8. We have:
MAIN THEORD1 FOR MINIMAL PARTIAL REALIZATIONS.* be a partial sequence.
flr
Let
Then:
(i) Every minimal realization of ~r
has dimension
n (A ). o =r
(ii) All minimal realizations may be determined with the aid of B. L. Ho's formulas as given by Lemma (iii) If -is unique.
(8.17-18) vdth
r> A'(A ) + A"(A) = =r =r
there are extensions of
~r
then the minimal realization
~~ny
satiSfying
By the Main Lemma
minima'
r=alizcti~ns
o
So we can apply the
as
(9.6).
(9.5),
every partial sequence
has at least one infinite extension "hich preserves n.
A" = A"(A ) =r
(9.5).
Othen,ise there are ss
PROOF.
and
A', A"
~r
and
(8.21) of the preceding section.
It fo.l Lovs that the minimal partial realization is uni que if (the A' (A ) + A"(A ) + 1 Hankel matrix can be =r =r = =r =r filled in completely with the available data); in the contrary case, the
r
> At (A ) + A" (A )
minimal extensions will depend on the
mar~er
in which the matrices
Ar+l' •••, AA'+ 1\' have been determined (subject to the requirement
o
(9.6) ).
In view of the theorem, we are justified in calling the integer
A• =or *A similar result was obtained sDT.lutaneously and independently by T. Tether (Stanford dissertation, 19c9).
-
(9.8)
REMARK.
R.E.Kalman
118-
The essential point is that the quantities
no'
A', and AU are uniquely determined already from partial data, irrespective of the possible nonuniqueness of the minimal extensions of the partial sequence.
We warn, however, that this result does
not generalize to all invariants of the minimal realization. instance, one cannot determine from
For
how many cyclic pieces a
~r
minimal realization of A will have: some minimal realizations =r may be cyclic and others may not [KALMAN 1970b). Finally, let us note also a second consequence of the Main Theorem: COROLLARY.
Suppose
nl(~r)
columns of the Hankel array of ~r
no(~r))· Then
dim ~r
PROOF.
If
is the number of independent
(defined analogously with
= nl(~r)· "l(~r)
>
no(~r)
then, using the Main Theorem,
we get a contradiction to the fact that the rank of any Hankel matrix of an infinite sequence is lower bound f0r the dimension of any realization (Proposition to any
~Al+~~l
equal to
(8.4)). If
nl(~r)
<
no(~r)
we contradict the fact that rank
then extending ~r ~A"N'
is at least
o
n (A ).
o =r
In other words, the characteristic property of rank, that counting rank by row or column dependence yielcs identical results, is preserved even for incomplete Hankel arrays. It is useful to check a simple case which illustrates some of the technicalities of the proof of the Main Lemma. EXAMPLE. r X p,
where
The dimension of (0, 0, ... , 0, A)
p = rank A r
and
At = A"
= r,
is precisely
-
119 -
R. E. Kalman 10.
GENERAL THEORY OF OBSERVABILITY
In this concluding section, we wish to discuss the problem of observability in a rather general setting: linearity, at least in the beginning .
we will not assume
This is an ambitious program
and leads to many more problems than results.
Still, I think it is
interesting to give some indication of the difficulties which are conceptual as well as mathematical.
This discussion can also
serve as an introduction to very recent research [KALMAN 1969a, 1970a] on the observability problem in certain classes of nonlinear systems. The motivation for this section, as indeed for the whole theory of observability, stems from the writer's discovery [KALMAN 1960a) that the problem of (linear) statistical prediction and filtering can be formulated and r esolved very effectively by consistent use of dynamical concepts and methods, and that this whole theory is a strict dual of the theory of optimal control of linear systems with quadratic Lagrangian.
For those who are familiar with the standard
classical theory of statistical filtering (see, for instance, YAGLOM [1962]), we can summarize the situation very simply by saying that Wiener-Kolmogorov filter
+ theory of finite-dimensional linear dynamical systems Kalman filter. For the latter, the original papers are [KALMAN 1960a, 1963a] and [KAlMAN and BUey 1961).
-
120 -
R. E. Kalman The reade r int ere sted in further details and a
m ~dern
exposition is
referred especially t o the monograph of KALM.t'N [:1.969b). We shall exaoi ne here only one a spect of this theory (which does not involve a ny stochastic elements):
the strict formulation
of the "duality r;rinciple" between reachability and o'.Js er vabi lit y . This principle was f ormally stated f or the first time '.Jy KA~~N [1960c l, but the pertinent dis cu s sion in this paper is limited t o the linear case and is somewhat ad-hoc.
Aided by research progress since 19&1, it is
now possible t o develop a completely general approach to the "duality principle " .
We shall do thi s and, as a by-product, we shall obtain
a new and strictly deductive proof of the principle in the now classical linear case. We shall introduce a ge neral notion of the "dual" system, arid use it to replace the problem of observability by an equivalent problem of reachability.
In keeping with the point of view of the
earlier lectures, we shall view a sys t em i n terms of its input/output map f
and dualize
f
(rather than
L).
The constructibility
problem will not be of direct interest, since its theory is similar to that of the obs er vabi l i t y problem. Let
n, r
from then on. (k
be the sa me set s as defined in Section 4 and used We assume that both
nand
rare K-vector spaces
arbitrary fi eld) and recall the definiti on of the shift
ope rators
n and
Cf
Cf ~
!
both shift op erators by module structure on
n
on
z and
n
and
r
(see
(3.10)).
We denote
but ignore, until later, the r.
K[z)-
-
121 -
R . E. Kal m an
By a constant (not necessarily linear) input/output map f:
n
~r
we shall mean
~
map f
which commutes with the shift
dPerators, that is, z·(f(oo)).
f( z·oo)
Let us now formulate the general problem of this section:
(10.1)
PROB~l OF OBSERVABILITY.
its canoni cal realization after
= o.
t
~,
Given an input/output map f,
and an input sequence
Determine the state
x
the knowledge of the output sequence of
of ~
~
after
This problem cannot be solved in general: that the state set
since
Giving
00'
X f
of
f
at
v E n applied
o
t t
from
O.
To see this, recall
may be viewed as a set of functions
is Nerode-equivalent to
00
iff
v E n and the corresponding output sequence amount s to
giving various values of to the sequences
¢,
vI"
f(ooo.)(l) ZV
r
(namely those corresponding
+ v •• •, v, z v, z2 v, ••• ), r_ l'
and
it may happen that these substitutions do not yield enough values of the function
f(ooo·)(l)
to determine the function itself.
This
situation has been recognized for a long time in automata theory,
-
122 -
R. E. Kalman
where, in an almost self-explanatory terminology, one says that "l:
is initial-state determinable by an infinite multiple experiment
(possibly infinitely many diff=rent single experiment (single
v
VIS)
but not necessarily by a See MOORE (1956].
chosen at will)."
The problem is further complicated by the fact that it may make a difference whether or not we have a free choice of
v.
KALMAN,
FALB, and ARBIB (1969, Section 6.3)] give some related comments. A further difficulty inherent in the preceding discussion is that the problem is posed on a purely set-theoretic level and does not lend itself to the introduction of more refined structural assumptions.
We shall therefore reformulate the problem in such
a way as to focus attention on determining those properties of the initial state which can be
comput~d
from the combined knowledge of
the input and output sequence occurring after For simplicity, we shall fix the value of generality, since
f
resulting from
after
x
x
is not linear). t
0
t
O. V at . 0
(no loss of
Then the output sequence
is given simply as
f(ill) ,
where
(ill]f' We shall use the circumflex to denote certain classes of
functions from a set into the field
K.
class will be the class of all functions.
f An element
(all functions
r
of
r
For the moment, this Thus
I' --> K) •
is simply a "rule" (in practice, a computing
algorithm) which assisns to each possible output seqDtn~e Y
in
r
-
123 -
R. E. Kalman
a number in the field
K.
If y
resulted from the state
x
then
y(y)
Y(f(w))
(rof) (00)
" and, by definition of gives the value of a certain function in n the state, also the value of a certain function in
X.
This suggests
the DEFiNITION.
(10.2)
iff there is a
An element
y"x E?
~ E
X is
an c:se~vable costate
such that we have identically for all
ooEn
In other words, no matter what the initial state the value of ~
y"x
rule
at
x
x
=
[oolf
is,
can always be determined by applying the
to the output sequence
f(oo)
resulting from
x.
Note,
carefully, that this definition subsumes (i) a fixed choice of the class of functions denoted by the circumflex, and (ii) a fixed input sequence after
t
=
0
(here
v
=
0).
For certain purposes, it
may be necessary to generalize the definition in various ways (KALMAN 1970 al, but here we wish to avoid all unessential complications. According to Definition (10.2), we shall see that a system is COmpletely observable iff every costate is observable.
This agrees
with the point of view adopted earlier (see Section 4) in an ad-hoc fashion.
Also, the vague requirement to "determine
x"
used in
-
124 -
R. E . Kalman
(10.1) is now replaced by a precise notion which can be manipulated (via the actual definition of the circumflex) to express limitations on the algorithms that we may apply to the output sequence of the system. The requirement "every costate is observable" can be often replaced by a much simpler one.
For instance, if
X is a vector
space, it is enough to know that "every linear costate is observable" or even just that "every element of some dual basis is an observable costate"; if X is an algebraic variety, it is natural to interpret "complete observability" as "every element of the coordinate ring of X is an observable costate" [KALMAN 1970a]. We can now carry out a straightforward "dualization" of the
n ~r.
setup involved in the definitior. of the input/output map f:
First, we adopt (again with respect to a fixed interpretation of the circumflex) : DEFINITION.
The dual of an input/output map f:
n
~r
is the map
Note that
f
is well-defined, since the circumflex means the class
of all functions. As to the next step, we wish to prove that constancy is inherited under dualization. spift operator on obvious ones:
To do this, wo have to induce a definition of the rand
n.
The only possible definitions are the
-
125 -
R. E. Kalman
'" r
--+ '" r:
Both of these new shi f t operators will be den oted by
z
-1
The reason for this notation will become clear later. Now it is easy to verify:
(10.4)
PROPOSITION. PROOF.
f
is constant, so is
'" f.
We apply the definitions in suitable sequence:
fez -l·r)(w)
and so we see that
If
'" f
(z-l·r)(f(w))
(de t'. of
r),
Y(z.f(w))
(def. of
(ff'),
Y(f(z.w))
(f
f(r)(z. w)
(de r , of
r) ,
(z-l·1'(r))(w)
(def'. of
(fn),
c ommutes with
z
At this stage, we cannot as yet view
wheneve r
f
f
is constant),
does.
0
as the input/ output map
of a dynamical system because concatenation is not yet defined on and therefore
r
'"n "
is not yet a properly defi ned "input set".
In other words, it is necessary t o chec k that the notion of ti me i s also inherited under dualization. to be
po ~sible
In gen eral , this doe s not appe ar
wi t hout some str ong limitation on the cla s s
we shall look only at the simpl e s t
'"P.
Here
-
126 -
R. E. Kalman
HYPOTHESIS. finiteness condi t i on : such that for all
Every function
y
There is an integer
y, 0 E
r
satisfies the
ly"'l
(dependent on
in
y)
the condition
r
I, ••• ,
IrI
implies
Yeo).
r(y)
In other words, we assume that the value of each "y
at
y
is uniquely determined by some finite portion of the output sequence y.
Assuming (10.5), it is immediate that
f admits a concatenation
multiplication which corresponds (at least intuitively) to the usual
n:
one defined on (10.6)
We can now prove the expected theorem, which may be regarded as the precise form of the "duality" principle: THEOREM. map and
f
Let
its dual.
f
be an arbitrary constant input/output
Suppose further that (10. 5 ) holds.
each observable costate of
f
(relative to
may be viewed as a reachable state of '"f, PROOF.
r
induced by
f.
r
Then
satiSfYing (10.5))
and conversely.
First we determine the Nerode equivalence classes on By definition
-
127 -
R. E.Y.:alman
'"€ E P. '"
for all
Now "r is linear
f
the definition of
and
(!);
in fact, direct use of
(10.6) gives (50f)(W), wEn.
So rof
and
are equal as plements ~l
50f
same observable Gostate.
~:
chey define the
Tn fancier language, the assignment
{lo.B) is well defined and constitutes a bijection between the reachable states of '" f
and those costates of
f
which are observable
o
relative to the function class Thus ~o
hold.
(10.5) is a sufficient condition for Ghe
d~lity
principle
However, the fact that the canonical realization 0f '" f
is
completely reachable is not quite the same as saying that the canonical realization of
f
is completely observable because the latter depends
on the choice of
r
Moreover, Theorem
(10.7)
and therefore is not an intrinsic property of does not give any indicati~n how "big"
and it may certainly happen that the observability problem for ~~ch
more difficult than the reachability problem.
f.
X
is
f
f
is
These matters will
be illustrated later by some examples. Now we deduce the original form of the duality principle from Theorem
(10.7).
The essential point is that (10.5) holds automati-
cally as a result of linearity. New definition of the function class: the class of all K-linear ~xnctions.
let the circumflex denote
(All the underlyin~ bets with the
K-vector spaces, so the definition makes sense.)
-
128 -
R.E.Kalman The following facts are well known: PROPOSITION. K-vector spaces .
Let
*
denote duality in the sense of
Then:
r {).
(KP[[z-ll])*
n {).
(JCD'[ z])*
KP[z-ll,
JCD'[ [z l l.
Now we can state the (10.10)
MAIN THEOREM.
dimensional. (i)
Suppose
Suupose further that
f
PROOF.
f
f,
A
~
hence every costate of
The fact that
by Proposition (10.4).
r
Then:
K[z~ll-homomorphism
f,
isomorphic with the X f
is observable.
is K-linear implies, by (10.3),
(Caution:
K[zl-homomorphism
cannot be simplified.
are
f*
is K-linear; the constancy of
dual of the
K-linear duality.
and finite-dimensional.
The reachable states of
K-linear dual of X f;
that
is K-linear, con stant, finite-
is K-linear and constant, that is, a
(and therefore written~ f*) (ii)
f
f
f
always implies that of is not the K[zl-linear
and the construction given here
See Remark (4.26A).)
To prove the second part, we note that by Proposition (10. 9) Hypothesis (10.5) holds and thus map of a dynamical system. of
f*
are isomorphic with
f = f*
is a well-defined input/output
We must prove that the reachable states
X;,
the K-linear dual of X f•
amounts t o proving that the K-vector spac e of functions
This
-
129-
R. E. Kalman is isomorphic with the K-vector space
X;.
It suffices to prove
that the K-vector space generated by the K-linear functions (10.1l) is isomorphic with Then by
x f
0, 1, •••
i
= 0,
X;.
Suppose that, for fixed
and x,
j
1, ••• , m]
'"
every
A(x)
= O.
by definition of the Nerode equivalence relation induced
(recall here the discussion from Section
3).
X is f finite-dimensional by hypothesis, it follows from this property of
the functions
(A)
that they generate
X* f•
Since
Obviously,
din:
x;
=
so that everything is proved.
[J
In other terms, the fact that
f
with the appropriate definition of
A
is a
f
K[z-l]-homomorphism.
= K[z]-homomorphism
t
=-
k
Since (10.5) holds, we can interpret
due to input
Y
In fact, we have
y(y)
f(y)( m), (Yof)(w) , ~(f(Y)(-
k»(Wk).
the output of the dual
is given by the assignment
which is a linear function defined on the sequence.
together
implies that
in a system-theoretic 'iay, as follc~s:
system at
dim Xf'
k-th
term of the input
-
130 -
H.E.Kalman
(10.12) that
"f
REMARK.
It is essentially a consequence of Proposition (10.9)
turns out to be the same kind of algebraic object as
f.
Note,
however, that under duality the input and output terminals are interchanged and ~
t
is replaced by
-t
(hence
z
z -1) •
In terms of the pictorial definition of a system, this statement simply amounts to "reversing the directions of the arrows", which is the "right" way to define duality in the most general mathem~tical
context, namely in category theory.
We would expect
that the duality principles of system theory will eventually become a part of this very general
du~lity
theory.
yet because the correct categories to
b~
This has not happened
considered in the study of
dynamical systems have not yet been determined.
It is likely that
eventually many different categories wi]l have to be looked at in studying dynamical problems. We shall now present an example the previous results.
whi~h
should help to interpret
We emphasize, however, that the theory sketched
here is still in a very rudi.mentary form. (10.13)
EXAMPLE. x(t + 1)
y(t)
Consider the system
L
defined by
2x(t) + u(t), y(t)
=( 1
if
0;;
if
1/2 ~ x(t) < 1,
x(t) < 1/2,
x(t), t E ~;
-
131 -
R. E. Kalman
X = U = Y = ~ mod 1, i.e., the interval [0, 1).
with
be thought of as identified with 0 . ) x
We let
u(t)
o
1.
(1
= O.
is to
We view
through its binary representation or
It is clear from the definition of the sy stem that the output sequence due to any
If it.
x
x
is precisely
is irrational, infinitely many terms are needed to identify
Consequently, the
x's
lence classes induced by Relative to
are isomorphic with the Nerode equiva-
f[.
So [
cannot be reduced.
".... = functions", every co stat e of
f[
is
observable, provided that Hypothesis (10. 5) is not satisfied.
If
it is, then only those c ostates defined on fi xed-length rationals are observable (more precisely, these on a fixed finite subset of the not define a
dyn~ic al
functions which depend only .... gk(x)ls). Thus: either f does ~re
sy st em or not all co st ates are obse r vabl e .
Now let us replace the set
[0, 1)
by its inters ection
with the rationals .
It is clear that there is now a finite algorithm
for dete rmining
we simply apply the re sult s of partial realiza-
x:
tion theory of the previous se ction. problem is to express of polynomials in is rational.)
x
from
~2[2]--which
However,
x
(We take
K
= ~2
(gl(x), ••• , g2(x) 0
and the
as a ratio
i s always pos sible sinc e each
x
i s not "effecti vely computable" in the
-
132 -
R. E. Kalman
strict sense since there is no way of knowing when the algorithm has stopped. no
~
~(x)
rule
for all
In other words, given an arbitrary costate
,.,
y"
x
x.
such that the application of "y" x
to
On the other hand, substituting into
,.,
x
there exists
Yx gives
,.,
x
the
results of the partial-realization algorithm will give an approxi~tion to the value of
~(x)
which always converges in a finite
(but a priori unknown) number of steps as more values of the output sequen~e
are observed.
In short, the costate-determination algorithm
has certain pseudo-random elements in it and therefore cannot be described through the machinery of deterministic dynamical systems. (Is there some relation here to the conceptual difficulties of Quantum Mechanics?)
-
133 -
R. E. Kalman
11.
HISTORICAL COMMENTS
It is not an exaggeration to say that the entire theory of linear, constant (and here, discrete-time) dynamical systems can be viewed as a systematic development of the equivalent algebraic conditions (2.8) and (2.15). Of course, the use of modules (over
K[z])
to study a constant
square matrix (see (4.13)) has been " st andar d" since the 1920's under the influence of E. NOETHER and especially after the publication of the Modern Algebra of VAN DER WAERDEN. must be also quite old.
Condition (2.15), by itself,
For instance, GANTMAKHER [1959, Vol. 1, p. 203]
attributes to KRYLOV [1931] the idea of computing the characteristic polynomial of a square matrix A by choosing a random vector computing successively b, Ab,
A2b, ...
band
until linear dependence is
obtained, which yields the coefficients of det (zI - A). will succeed iff X is cyclic with generator A
g.)
(The method
However, the
merger of (4.13) with (2.15), which is the essential idea in the algebraic theory of linear systems, was done explicitly first in KALMAN [1965b]. We shall direct our remarks here mainly to the history of conditions (2.8) and (2.15) as related to controllability.
See also earlier
comments in KALMAN [1960c, pp , 481, 483, 484] and in KAWAN, HO, and NARENDRA [1963, pp. 210-212].
We will have to bear in mind that the
development of modern control theory cannot be separated from the development of the concept of controllability; moreover, the technological problems of the 1950's and even earlier had a major influence on the genesis of mathematical ideas (just as the latter have led to many new technological applications of control in the 1960's).
-
134 -
R.E. Kalman
The writer developed the mathematical definition of controllability with applications to control theory, during the first part of 1959. (Unpublished course notes at Johns Hopkins University, 1958/59.) first definitions were in the form of (2.17) and (2.3).
These
Formal presenta-
tions of the results were made in Mexico City (September, 1959, see KALMAN [1960b]), University of California at Berkeley (April, 1969, see KALMAN [1960d]), and Moskva (June, 1960, see KALMAN [1960c]), and in scientific lectures on many other concurrent occasions in the U.S.
As
far as the writer is aware, a conscious and explicit definition of controllability which combines a control-theoretic wording
~th
a
precise mathematical criterion was first given in the above references. There are of course many instances of similar ideas arising in related contexts.
Perhaps the comments below can be used as the starting point
of a more detailed examination of the situation in a seminar in the history of ideas. The following is the chain of the writer's own ideas culminating in the publications mentioned above: (1)
In KALMAN [1954] it is pointed out (using transform methods)
that continuous-time linear systems can be controlled by a linear discrete-time (sampled-data) controller in finite time.* *It is sometimes claimed in the mathematical literature of optimal control theory that this cannot be done with a linear system. This is false; the correct statement is "cannot be done with a linear controller producing control functions which are continuous (and not merely piecewise continuousl) in time." Such a restriction is completely 'irrelevant from the technological point of view. As a matter of fact, computer-controlled systems have been proposed and built for many years on the basis of linear, time-optimal control.
-
135-
R. E. Kalman
(2)
Transposing the result of KALMAN [1954] from transfer functions
to state variables, an algorithm was sketched for the solution of the discrete-time time-optimal control of systems with bounded control and linear continuous-time dynamics. (3)
[KALMAN, 1957]
As a popularization of the results of the preceding work, the
same technique was applied to give a general method for the design of linear sampled-data systems by
~~
and BERTRAM [1958].
Some background comments concerning these papers are appropriate: (1)
The ideas and method presented in KALMAN [1954] descend
directly from earlier (and very well known) engineering research on time-optimal control.
(The main references in KALMAN [1954] are:
McDONALD [1950], HOPKIN [1951], BOGNER and KAZDA [1954], as well as a research report included in
~~l
[1955].)
Although the results of
KALMAN [1954] on linear time-optimal control were considered to be new when published, it became clear later that similar ideas were at least implicit in OLDENBOURG and SARTORIUS [1951, §90, p. 219] and in TSYPKIN's work in the early 1950's.
The engineering idea of nonlinear time-optimal
control goes back, at least, to DOLL [1943] and to OLDENBURGER in 1944, although the latter's work was unfortunately not widely known before 1957. During the same time, there was much interest in the same problems in other countries; see, for instance, FELDBAUM [1953] and UTTLEY and HAMMOND [1953].
Mathematical work in these problems probably began with BUSHAW's
dissertation [1952] in which, to quote from
~·WL~
[1955, before equation
(40»), " ••• [it was] rigorously proved that the intuition which led to the formulation of the [engineering] theory [quoted above] was indeed correct."
TSIEN's survey [1 954] contains a lengthy account of this state
-
136 -
R.E.Kalman
of affairs and was read by many. We emphasize:
none of this
extensive literature contains even a hint of the algebraic considerations related to controllability. (2-3)
The critical insight gained and recorded in KAU~ [1957] is
the following:
the solution of the discrete-time time-optimal control
problem is equivalent to expressing the state as a linear combination of a certain vector sequence (related to control and dynamics) with coefficients bounded by 1
in absolute value, the coefficients being
the values of the optimal control sequence. of the first
n
The l inear independence
vectors of the sequence guarantees that every point
in a neighborhood of zero can be moved to the origin in at most
n
steps (hence the terminology of "complete controllability"); and the condition for this is identical with (2 .17) (stated in KALMAN [1 957] and KALMAN and BERTRAM [1958] only for the case
det F of 0
and m = 1).
A thorough discussion of these matters is found in KALMAN [1960c; see especially Theorem I, p. 485].
A serious conceptual error in KALMAN
[1957] occurred, however, in that complete controllability was not assumed, as a hypothesis for the existence of time-optimal control law, but an attempt was made to show that the controllability is almost always com.plete [Lem:na 1].
In fact, this lemma is true, with a small
technical modification in the condition.
Only much later did it become
clear (see the discussion of Theorem D in the Introduction), however, that a dynamical system is always completely controllable (in the nonconstant case, completely reachable) if it is derived from an external description. this difficulty, very
~sterious
in 1957, which led to the development
It was
-
137 -
R.E. Kalman Of a formal machinery for the definition of controllability during the next two years .
The changing point of view is already apparent in
KALMAN and BERTRAM [1958]; the unpublished paper promised there was delayed precisely because the algebraic machinery to prove Theorem D was out of reach in 1957-8.
Consult also the findings of the biblio-
grapher RUDOLF [1969].
IN
S~~Y:
under the stimulation of the engineering problems
of minimal-time optimal control, the researches begun by KALMAN [1954,
1957] and KAlilAN and BERTRAM [1958] eventually evolved intoiwhat has come to be called the mathematical theory of controllability (of linear systems). Beginning about 1955, Ind stimulated by the same engineering problems, FONTRYAGIN .and h i,s school in the USSR developed their mathematical theory of optimal control around the celebrated "Maximum Principle". mentioned
(They were well aware of the survey of TSIEN [1954] acove J and referenced it both in English and in the Russian
translation of 1956.)
We now know that ~ theory of control, regard-
less of its particular mathematical style, must contain ingredients related to controllability.
So it is interesting to examine how
explicitly the controllability condition appears in the work of PONTRYAGIN and related research. GAMKRELIDZE [1957, §2; 195e §lJ §2] calls the time optimal control problem associated with the system
(11.1)
dx/dt
Ax
+ bu(t)
-
138 -
R. E. Kalman "nondegenerate" iff subspace of (11.2)
n R •
b
is not contained in a proper A-invariant
He notes immediately that this is equivalent to
~ ) det ( b, Ab, ••. , An- [)
f.
(i.e., the special case of (2.8) for
0
m = 1).
He then proves:
in
the "degenerate" case the problem either reduces to a simpler one or the motion cannot be influenced by the control function
u(·).
~
this is very close to an explicit definition of controllability. However, in discussing the general case
m > 1,
GAMKRELIDZE [1958,
§3, Section 1] defines "nondegeneracy" of the system
=
dx/dt
Ax + Bu(t)
as the condition (11.4)
det (b., Ab., ••• , An-~.) ~
~
~
f.
0
for every column
b~
.
E B,
but he does not show that this generalized condition of "nondegeneracy" for (11.3) inherits the interesting characterization proved for "nondegeneracy" in the case of (11.1).
In fact, condition (11.4) is much too strong
to prove this; the correct condition is (2.8), that is, complete controllability.
In other w0rds, in GPJ~IDZE's work (11.4) plays
the role of a technical condition for eliminating "degener a cy" (actually, lack of uniqueness) from a particular optimal control problem and is not ; explicitly related to the more baEic notion of complete controllability. Neither GAMKRELIDZE nor PONTRYAGIN [1958] give an interpretation of (11.4) as a property of the dynamical system (11.3) , but employ (11.4) only in relation to the particular problem of time-optimal control.
See
-
139 -
R.E.Kalman also KALMAN [l960c, p. 484].
A siaular point of view is taken by
USALLE [1960]; he calls a dynamical system (11.3) satisfying (2.8) "proper" but then goes on to require (11.4) (to assure the uniqueness of the time-optimal controls) and calls such systems "normal". The assumption of some kind of "nondegeneracy" condition ·...as apparently unavoidable in the early phases of research on the timeoptimal control problem.
For example, ROSE [1953, pp . 39-58] examines
this problem for (11.1); by defining "nondegeneracy" [po 41] by a condition equivalent ot (11.2), he obtains most of GAMKRELIDZE's results in the special case when A has real eigenvalues [Theorem 12].
ROSE
uses determinants closely related to the now familiar lemmas in controllability theory but he, too, fails to formulate controllability as a concept independent of the time-optimal control problem. A similar situation exists in the calculus of variations.
The
so-called Caratheodory classes (after CARATHEODORY [1933]) correspond to a kind of classification of controllability properties of nonconstant systems.
In fact, the standard notion of a normal family of extremals
of the calculus of variations is closely related to condition (11.4), suitably generalized via (2.5) to nonconstant systems.*
Normality is
used in the calculus of variations mainly as a'hondegeneracy'condition. It is importan':. to note that the "nondegeneracy" condit loons employed in opt Ime.l c orrt r o., ",nd the calculus role of eliminating
annoyin~
01
var a.at a.ons play mainly the
;,echnicalities and simplifying proofs.
*The use of the word "normal" by IaSALLE [1960] t'or (.11.4) is only accidentally coincident with the earlier use of the "normal" in the calculus of variations.
-
140-
R. E. Kalman With suitable formulation, however, the basic results of time-optimal control theory continue to hold without the assumption of complete controllability.
The same is not true,
howeve~,
of the four kinds of
theorems mentioned in the Intorduction, and therefore these results are more relevant to the story of controllability than the time-optimal control discussed above. There is a considerable body of literature relevant to controllability theory which is quite independent of control theory.
For instance, the
treatment of a reachability condition in partial differential equations goes back at least to CHOW [1940] but perhaps it is fairer to attribute it to Caratheodory's well-known approach to entropy via the nonintegrability condition.
The current status of these ideas as related to
controllability is reviewed by WEISS [1969, Section 9].
An independent
and very explicit study of reachability is due to ROXIN [1960]; unfortunately, his examples were purely geometric and therefore the paper did r.ot help in clarifying the celebrated condition (2.8).
The
Wronskian determinant of the classical theory of ordinary differential equations with variable coefficients also has intersections with controllability theory, as pointed out recently with considerable success by SILVERMAN [1966].
Vany
problems in control theory were misunderstood
or even incorrectly solved before the advent of controllability theory. Some of these are mentioned in KALMAN [1963b, Section 9].
For relations
with automata theory, see ARBIB [1965]. Let us conclude by stating the writer's own current position as to the significance of controllability as a subject in mathematics:
-
141-
R. E. Kalman
(1)
Controllability is basically an algebraic concept.
(This
claim applies of course also to the nonlinear controllability results obtained via the Pfaffian method.) (2)
The historical development of controllability was heavily
influenced by the interest prevailing in the 1950·s in optimal control theory.
Ultimately, however, controllability is seen as a relatively
minor component of that theory .
(3)
Controllability as a conceptual tool is indispensable in
the discussion of the relationship between transfer functions and differential equations and in questiohs relating to the four theorems of the Introduction.
(4)
The chief current problem in controllability theory is the
extension to more elaborate algebraic structures. For a survey of the historical background of observability, which would take us too far afield here, the reader should consult KALMAN [1969b].
-
142 -
R. E. Kalman
12. Sec~ion
A:
REFERENCES
General References
M. A. ARBIB
A common framework for automata theory and control theory, SIAM J. Contr., 2:206-222. C. W. CURTIS and 1. REINER Representation Theory of Finite Groups and Associative Algebras, Interscience-Wiley. E. M. DAY and A. D. WALIACE [1967]
Multiplication induced in the state space of an act, Math. System Theory; 1:305-314.
C. A. DESOEH and P. VABAlYA [1967]
The minimal realization of a nonanticipative impulse response matrix, SIAM J. Appl. Math., 15:754-764.
E. G. GILBERT Controllability and observability in multivariable control systeffi3, SIAM J. ContrOl, 1:128-151. B. L. HO and R. E. KAIJlAN [1966]
Effective construction of linear state-variable models from input/output functions, Rege1ungstechnik, 14:545-548. The realization of linear, constant input/output maps, I. Complete realizations, SIAM J. Contr., to appear.
S. T. HU [1965 ]
Elements of Modern Algebra, Holden-Day.
R. E. KALMAN
[1960a]
A new approach to linear filtering and prediction problems, J. Basic Engr. (Trans. ASME), 82D:35-45.
[1960b]
Contributions to the theory of optimal control, Bol. Soc. Mat. Mexicana, L:I02-119.
-
[1960c]
143 -
On the general theory of control systems, Proc. 1st IFAC Congress, Moscow; Butterworths, London. Canonical structure of linear dynamical systems, Proc. Nat. Acad. of Sci. (USA), 48:596-600. New methods in Wiener filtering theory, Proc. 1st Symp. on Engineering Applications of Random Function Theory and Probability, Purdue University, November 1960, pp 270-388, Wiley. (Abridged from RIAS Technical Report 61-1.)
[1963b]
Mathematical description of linear dynamical systems, SIAM J. Contr., 1:152-192.
[1965a]
Irreducible realizations and the degree of a rational matrix, SIAM J. Contr., 13:520-544. Algebraic structure of linear dynamical systems. I. The Module of E, Proc. Nat. Acad. Sci. (USA), 54:1503-1508.
[1967]
[1969a]
On multilinear machines, J. Compo and System Sci., to appear.
[1969b]
pynamic Prediction and Filtering Theory, Springer, to appear.
[1969c]
On partial realizations of a linear input/output map, Guillemin Anniversary Volume, Holt, Winston and Rinehart.
[1970a]
Observability in multilinear systems, to appear.
[1970b]
The realization of linear, constant, input/output maps. II. Partial realizations, SIAM J. Control, to appear.
R. E. KAWAN and R. S. BUCY
New results in linear prediction and filtering theory, J. Basic Engr. (Trans. ASME, Sere D), 83D:95-100. R. E. KALMAN, P. L. FALB and M. A. ARBIB Topics in
~~thematical
System Theory, McGraw-Hill.
R. E. KALMAN, Y. C. HO and K. NARENDRA [1963]
Controllability of linear dynamical systems, Contr. to Diff. Equations, 1:189-213.
C. E. LANGENHOP
[1964]
On the stabilization of linear systems, Proc. Am. Soc., 15:735-742.
~~th.
-
S. LANG
[1965]
144
R. E. Kalman
Algebra, Addison-Wesley.
S. MAC LANE
[1963]
Homology, Springer.
L. A. MARKUS
[1965 ]
Controllability of nonlinear processes, SIAM J. Control, J:78-9O.
E. F. MOORE [1956]
Gedanken-experiments on sequential machines, in Automata Studies, C. E. Shannon and J. McCarthy (eds.), pp. 129-153, Princeton University Press.
P. MUTH [1899]
Theorie und Anwendung der Elementarthei1er, Teubner, Leipzig.
A. NERODE [1958]
Linear automaton transformations, Proc. Amer. Math. Soc.,
2:5 41-544 .
L. SILVERMAN [1966]
Representation and realization of time-variable linear systems, Doctoral dissertation, Columbia University.
L. M. SILVERMAN and H. E. MEADOWS Equivalent realizations of linear systems, J. Control, to appear. H.
S~l
WEBER [1898]
Lehrbuch der Algebra, Vol. 1, 2nd Edition, reprinted by Chelsea, New York.
L. WEISS Lectures on Controllability and Observability, C.I.M.E . Seminar. L. WEISS and R. E. KALMAN [1965 ]
Contributions to linear system theory, Intern. J. Engr. ScL, J:141-171.
W. M. WONHAM [1967]
On pole assignment in multi-input controllable linear systems, IEEE Trans. Auto. Contr., AC-12:6oo-665.
-
145 -
A. M. YAGLOM
An Introduction to the Theory of Stationary Random Functions, Prentice-Hall.
D. C. YOUIA
[1966]
The synthesis of linear dynamical systems from prescribed weighting patterns, SIAM J. Appl. Math., 14:527-549.
D. C. YOUIA and P. TISSI
[1966]
n-port synthesis via reactance extraction, Part I, IEEE Intern. Convention Record.
O. ZARISKI and P. SAMUEL [1958]
Commutative Algebra, Vol. 1, Van Nostrand.
-
Section B:
146-
References for Section 11
M. A. ARBIB
A common framework for automata theory and control theory, ~IAM.J. Contr., 1:206-222. I. BOGNER and L. F. KAZDA
[1954 ]
An investigation of the switching criteria for higher order contactor servomechanisms, Trans. AlEE, 11 11:118-127.
D. W. BUSHAW Differential equations with a discontinuous forcing term, doctoral dissertation, Princeton University.
[1952]
C. CARATHEODORY [1933 ]
Uber die Einteilung der Variationsprobleme von Lagrange nach Klassen, Comm, Mat. Relv., 2:1-19.
W. L. CHCM
[1940]
Uber Systeme von linearen partiellen Differentialgleichungen erster Ordnung, Math. Annalen, :98-105.
H. G. DOLL
[ 1943]
Automatic control system for vehicles, US Patent 2,463,362.
A. A. FELDBAUM
[1953 ]
Avtomatika i Telemekhanika, 14 :712-728.
R. V. <W-lKRELIDZE the theory of optimal processes in linear systems (in Russian), Dokl. Akad. Nauk SSSR, 116:9-11.
[1957]
On
[1958]
The theory of optimal processes in linear systems (in Russian), Izvestia Akad. Nauk SSSR, ~:449-474.
F. R. GANTMAKHER
[1959]
The Theory of Matrices, 2 vols., Chelsea.
-
147 -
A. M. HOPKIN [1951]
A phase-plane approach to the compensation of saturating servomechanisms, Trans. AlEE, 70:631-639.
R. E. KALMAN
[1954 ]
D~scussion of a paper by Bergen and Ragazzini, Trans. AIEE, 73 II: 245-246.
[1955]
Analysis and design principles of second and higherorder saturating servomechanisms, Trans. AIEE, 74 II:29h-3l0.
[ 1957]
Optimal nonlinear control of saturating systems by intermittent control, IRE WESCON Convention Record, 1, IV:130-135.
[1960b]
Contributions to the theory of optimal control, Bol. Soc. ~~t. Mexicana, 1:102-119.
[1960c]
On the general theory of control systems, Froc. 1st IFAC Congress, Moscow; Butterworths, London. Lecture notes on control system theory (by M. Athans and G. Lendaris), Univ. of Calif. at Berkeley.
[1963b]
NathemaUcal description of linear dynamical systems, SIAM J. Contr., 1,:152-192.
[1965b]
Algebraic structure of linear dynamical systems. I. The Module of E, Proc. Nat. Acad. Sci. (USA), 54:15 03-1508 .
[1969b]
Dyna~ic
Prediction and Filtering Theory, Springer, to app ear.
R. E. YJ\l1:A1J and J. E BERTRAM
[195 8 ]
R. E. KALllAN, [1963]
General synthesi s procedure f or computer control of single and ~ulti-loop linear systems, Trans, ALEE, TI 11 I: t<'2-6 09 . Y.
C.
HO
and
K.
NA_ttENDRA
Cont ro L'lab i ld ty of linear dynamical systems, Contr. to Diff. Equations, 1,:189-213.
-
148 -
A. N. KRYLOV
On the numerical solution of the equation by which the frequency of small oscillations is determined in technical problems (in Russian), I~v. Akad. Nauk SSSR Ser. ~ix.-Mat., ~:491-539.
[1931]
J. P. LaSALLE
The time-optimal control problem, Contr. Nonlinear Oscillations, Vol. 5, Princeton Univ. Press.
[1960] D. C. McDONALD
Nonlinear techniques for improving servo performance, Proc. Nat. Electronics Conf. (USA), ~:400-421.
[1950]
R. C. OLDENBOURG and H. SARTORIUS Dynamik selbstattiger Regelungen, 2nd edition, Oldenbourg, Munchen.
[1951]
R. OLDENBURGER [1957]
Optimum nonlinear control, Trans. ASME, 12.:527-546.
[1966]
Optimal and Self-Optimizing Control, MIT Press.
L. S. FONTRYAGIN
[1958]
Optimal control processes (in Russian), Uspekhi Mat. Nauk, 14:3-20.
N. J. ROSE Theoretical aspects of limit control, Report 459, Stevens Institute of Tech., Hoboken, N. J.
[1953 ] E. ROXIN
[1960]
Reachable zones in autonomous differential systems, Bol. Soc. Mat. Mexicana, 2:125-135.
K. E. RUDOLF
[1969]
On some unpublished works of R. E. Kalman, not to be unpublished.
-
149 -
H. S. TSIEN
[19541
Engineering Cybernetics, McGraw-Hill.
A. M. UTTLEY and P. H. HAMMON!)
[19531 L.
The stabilization of on-off controlled servomechanisms, in Aut Gmatic and Manua] Control, Ac ~dernic Press.
WEISS
Lectuces on eontrollability and Observability, C:I.M.E. Seminar
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)

R. KULIKOWSKI

CONTROLLABILITY AND OPTIMUM CONTROL

Corso tenuto a Sasso Marconi dal 1 al 9 luglio 1968
CONTROLLABILITY AND OPTIMUM CONTROL
by R. Kulikowski (Polish Academy of Sciences, Warszawa)

1. Introduction and Statement of the Problem

The intuitive notion of controllability of dynamic systems had been used for many years by control engineers. In the case of linear stable systems, described by the equations

(1)   \dot{x} = Ax + Bu,

(2)   y = Cx,

where x = n-dimensional state vector, u = r-dimensional control vector, y = p-dimensional output vector, and A, B, C are matrices of dimensions n x n, n x r, p x n respectively, that notion has been formulated strictly by Kalman (see Refs. [2, 3]). According to Kalman the system S described by (1), (2) is controllable if and only if: given that the system is in state x_0 at time t = 0, then for some finite time T > 0 there is a control u(t), t ∈ [0, T], such that x(T) = 0. He has proved also that S is controllable if and only if the column vectors of the matrix

(3)   Q \triangleq [B, \; AB, \; \ldots, \; A^{n-1}B]

span the state space of S. That means that the rank of Q is n, or that there is a set of n linearly independent vectors among the columns of Q which constitute a basis for R^n.
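As a concrete illustration of criterion (3) — added here, not part of the original lectures — the rank of Q can be checked numerically for a small system. The double-integrator system below is only an illustrative choice.

```python
import numpy as np

# Double integrator: dx/dt = A x + B u, a standard controllable example.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])

n = A.shape[0]
# Controllability matrix Q = [B, AB, ..., A^(n-1) B] as in (3).
Q = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])

print(Q)                                  # [[0. 1.], [1. 0.]]
print(np.linalg.matrix_rank(Q) == n)      # True: the system is controllable
```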
In control theory the notion of optimal (time) control is also being used. The system is optimally (time) controlled if there exists the control function u(t) ∈ U, where U is an admissible control set (usually a convex r-dimensional polyhedron), such that the time T required for the transition of the state variables from the given initial state x_0 to the given final state x(T) (e.g. x(T) = 0) is minimum.

Using the well known "maximum principle" of Pontryagin (see Ref. [8]), or an equivalent optimization technique, it is possible to formulate the necessary condition of optimality for the system S. That condition can be written in the form of linear differential equations and it can be used for the derivation of the optimum control function u(t). As shown by Pontryagin and others (Ref. [8]), the necessary condition of optimality becomes also sufficient when the system is controllable. Besides, the optimum control exists and it is unique. Thus the notion of controllability has an important theoretical and practical value. It determines conditions under which the optimum control of a linear system is generally possible.

Roughly speaking, a controllable system is a system in which the state variables can be driven to any position with a finite value of the performance measure, and an optimally controllable system is a system in which the state variables can be driven to the given position with a minimum value of the performance measure. In the case of linear systems, with the optimization time as the performance measure, a controllable system can be optimally controlled. Is that also true if we take as performance measure the other possible performance criteria? A simple example will show that the answer to that question is negative.

Consider the first-order system described by the equation

      dx/dt = u(t),   x(0) = 0.
Assume that the control force u(t) is generated by a controller having finite energy, which can be written as u ∈ L^2[0, T]. As the performance measure we assume the so-called "square error"

      E(u) = \|y - x\|^2 = \int_0^T \Bigl[ y(t) - \int_0^t u(\tau)\, d\tau \Bigr]^2 dt,

where y(t) is a given square-integrable function. The necessary condition of optimality requires that

      y(t) = \int_0^t u(\tau)\, d\tau, \qquad t \in [0, T].

When that condition holds the functional E(u) attains its lower bound: E(u) = 0.

Assuming that y(t) = 1(t) (a unit step) we find easily that the optimum value of u(t) is Dirac's δ(t) function. However, that function is not square-integrable, and the optimum solution of our optimization problem does not exist despite the fact that finite-energy solutions "approximating" δ(t) exist.
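A short computation, added here for clarity (the particular pulse family is our choice, not the author's), makes the non-existence explicit: approximating δ(t) by rectangular pulses drives the square error to zero while the control energy blows up,

\[
u_\epsilon(t) = \tfrac{1}{\epsilon}\,\mathbf{1}_{[0,\epsilon]}(t), \qquad
E(u_\epsilon) = \int_0^\epsilon \Bigl(1 - \tfrac{t}{\epsilon}\Bigr)^2 dt = \frac{\epsilon}{3} \;\to\; 0,
\qquad
\int_0^T u_\epsilon^2(t)\,dt = \frac{1}{\epsilon} \;\to\; \infty \quad (\epsilon \to 0),
\]

so the infimum E(u) = 0 is not attained by any u ∈ L^2[0, T].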
Obviously the difficulties of the present type arise in the case when some sufficient conditions of optimality (such as the generalized Weierstrass theorem) for nonlinear functionals do not hold. In the present paper we shall deal with a "well defined" class of optimization problems for which the optimum control exists and an optimally controlled system is also a controllable system (at least in certain regions of state space). It will be useful to introduce the following definition. We shall say that a system is optimally controllable if the system state variables can be driven to any (given) position with minimum value of the performance measure and the system constraints are not violated.
We shall strive to find conditions under which the optimum solutions of concrete optimization problems exist, and to determine the regions of controllability in the space of system and constraint parameters. We shall be interested mainly in complex optimization problems including energy and amplitude inequality constraints, constraints of phase coordinates, nonlinear systems etc. Dealing with that class of problems it is convenient to employ a few simple notions of functional analysis.

The main optimization problem which is investigated in the present paper can then be formulated as follows. Let a nonlinear functional F(x), x ∈ X (where X is a Banach space), be given. Find the point \bar{x} ∈ X such that

(4)   F(\bar{x}) = \max_{x \in \Omega} F(x),

where \Omega \subset X is the set of feasible solutions, specified by the condition

(5)   G(x) \ge 0,

where G is a nonlinear operator (G : X \to Z, and Z is another Banach space). It is also assumed that F and G are concave and strongly differentiable.

In most practical applications the assumption that X and Z are Banach spaces (i.e. linear, normed and complete spaces) holds. In particular we shall deal with the space of all continuous functions C[0, T] and with the space of all p-power integrable functions L^p[0, T], p ≥ 1. In these cases the element x is a function of time (x(t)) and, since G is an operator, the element z = G(x), z ∈ Z, is another function of time (z(t)). The inequality (5) means that the values z(t) are nonnegative for all t ∈ [0, T] - in the case Z = C - and for almost all t ∈ [0, T] - in the case Z = L^p. We shall write also z_1 ≥ z_2 if z_1 - z_2 ≥ 0. The inequalities defined in such a way have the known properties of inequalities for numbers
(i. e . o ne can multiply them by non negati ve numbers, sum them up and pass
to th e limits on both sides). c o nv e x, closed cone
(i. e.
Generally, one can say that
Z
the set closed with respect to the multiplication
by real nonnegative numbers). The closed c o ne induces in -order relation denoted by the sign
a(t)
where
a(t),
form (5) if
b(t)
~
Z a partial-
~
One should observe that the c o n s t r a i nt (6)
includes a
of the form
x(t):S b(t) ,
= given functions, can be represented in the general
we introduce the operators: G (x) = x - a , 1
G (x ) = b - x , 2
and writ e
Us ing that notation we assume that for an ordered pair
..-: z 1 lities
J
z2 >
zl 2: 0,
the inequality
z2 :::: O.
z ;;: 0
=
Z
is equivalent to
=
is equivalent to the pair of inequa-
In the similar way for the Z
the inequal ity
z20
Z
n
n
ordered set
>
zi 2: 0, i = I, •.• , n •
The space of all ordered pairs is c a lle d the Cartesian product of
< zl' z2 > , where ZI and
Z2'
zl€ZI
and
It is denoted by
• In t.he abo ve m ention ed e xa m ple the domain and range of operators is
X
(i. e. G
: X ~X, G X~X) and G(x) E Z = XxX . 2: 1 It should be noted th at if for a function x two ine qua lity cons t r a int s h old a nd G (x ) < 0 , o -
-158 -
R. Kul ikow s ki
then the equality constraint
G (x) = 0 is also valid. Then the equality o constraint is a special case of (5), where G(x) = < G (x}, - G (x) > .
o
As can be seen the constraint
0
(5) may also express the equality
or inequality constraints imposed on the state and output coordinates of a dynamic system (1),
(2).
In order to obtain an explicite relation of that type one may integrate eq. (1) and express the state and output coordinates as a Volterra operator of control vector. The functional x
may describe the control
F
cost. If, for example,
denotes a vector-function, with components:
the energy cost of control forces is proportional to
T
r
o
i=1
T
r
S L.
[u.(t)] 2 dt
,
1
and
(7)
F(x)= -
J ., o
[uP)]
2 dt.
i=1
In the present case the optimization time
T is fixed.
seen later it will be also possible to solve problems in
which
As can be T
is m i-
ni m i z ed ,
Before we formulate the sufficient and necessary cond it ion s of optimality it is necessary to introduce the notion of weak and strong differentials. Then a method based on the notion of Lagrange-functional will be presented. Hurwicz
[IJ
General idea of that method is based on the paper of
L.
dealing with nonlinear programming in topological spaces.
As shown in Ref.
[4,
5, 7J
found to be very useful
the method of Lagrange-functionals has been
in the solution of optimization problems with
-
159-
R. Kulikowski
operator inequality constraints.
Let spaces,
G(x) be a nonlinear operator G: X
and x, h be elements of X. The operator
r e nt iaol e
Y, where
X, Yare Banach
G is called weakly diffe-
if the limit 1 ' -\G(x+ fh)-G(x) }
lim
(1)
=dG(x,h) ,
YL
;r-+o where
~
is a number, exists.
~
The differential dG(x, h) is a homogeneous operator with respect to h, i , e. dG(x,
r h)
r. dG(x, h) .
=
When dG(x, h) exists and is continuous at
it is also an additive operator
(see Ref.
x
[9J ).
As an example find the weak differential of the operator

(2)   G(x) = \int_0^T K[t, \tau, x(\tau)]\, d\tau, \qquad t \in [0, T],

where the functions K[t, \tau, x], \; K'_x[t, \tau, x] = \frac{\partial}{\partial x} K[t, \tau, x], \; t, \tau \in [0, T], are continuous with respect to t, \tau and x ∈ X. We get formally

(3)   dG(x, h) = \frac{d}{d\gamma}\, G(x + \gamma h)\Big|_{\gamma = 0} = \int_0^T K'_x[t, \tau, x(\tau)]\, h(\tau)\, d\tau .

Since the function K is uniformly continuous the differentiation under the integral sign is admissible.
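A minimal concrete instance of (2)-(3), supplied here for illustration (the kernel is our choice): take K[t, τ, x] = x². Then

\[
G(x) = \int_0^T x^2(\tau)\,d\tau, \qquad
dG(x,h) = \frac{d}{d\gamma}\int_0^T \bigl(x(\tau)+\gamma h(\tau)\bigr)^2 d\tau\,\Big|_{\gamma=0}
        = 2\int_0^T x(\tau)\,h(\tau)\,d\tau,
\]

which is indeed linear in h.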
Besides the weak differential the strong (or Fréchet) differential is also being used. If at the point x ∈ X we have

(4)   G(x + h) - G(x) = dG(x, h) + \omega(x, h),

where dG(x, h) is a linear operator with respect to h ∈ X and

      \lim_{\|h\| \to 0} \frac{\|\omega(x, h)\|}{\|h\|} = 0,

we call dG(x, h) the strong differential of G(x). When dG(x, h) is a linear operator acting from the space X into Y, it is also an element of the Banach space of all linear operators acting from X into Y, and it is called the derivative of G(x) at the point x. Denoting that operator by G'(x) we can write

(5)   dG(x, h) = G'(x)(h).
In that formula G'(x) is an operator acting on the element h ∈ X.
(9])
that the continuity and existence of weak derivative G'(x) at the vicinity of
x is sufficient for the existence of strong derivative and that these
two derivatives are equal. Now we shall consider relations between the differentials and derivatives
of differentiable funct ional s , Denote the weak linear differential of
the functional
F'[x},
X€
nCx
ctional with respect to
h
dF(x, h).
X~
dF(x, h)
is a linear fun-
dF(x, h) = (y, h) ,where
adjoint to
as a result of an operation i , e. Consider,
Since
one can write
an element of the space also
by
X.
T
f
KG
y = f(x) , f: X
(t-),'t"J d'l',
7
X~
XEL
P
[0,
TJ
o where
K'
x
[x,tJ,
t'E[O,T] ,
are continuous functions. We get by (3)
T
T
dF(x,h) =
f o
K~
[x,1:Jh(t)dl:'
f o
is
That element can be treated
for example, the functional
F(x)
y
Y (1:") h (1") d't"
•
-
161-
R. Kulikowski
+q
-1
1,
J
y('t') € L q
y = f(x)
= K'
x, h € L P [0, T then
Since
[0,
TJ, where
P
-1
+
and the operator
(6)
L P [0, T]
is a cting from
The operator
f
into is
x
[x, 'f:;' ]
Lq[O,T]
being c all e d the gradient
of the functional
F:
f(x) = grad F'[x] In the ly denoted by
case of the finite dimensional space
R
n
the gradient is usual-
V F(x) •
By Lagrange functional we call the expression

(1)   \Phi(x, \lambda) = F(x) + \lambda[G(x)],

where \lambda is a linear functional defined over the space Z (the range of the operator G(x)). The form of the functional \lambda depends on the space Z. In practical applications Z is usually the space L^p[0, T] or C[0, T]. For these spaces one knows the general forms of linear functionals ([6]).
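As an added orientation (this special case is supplied here and is not part of the original text): when Z = R^m, i.e. when there are finitely many constraint components G_1(x) ≥ 0, ..., G_m(x) ≥ 0, every linear functional on Z is given by an m-vector, and (1) reduces to the familiar finite-dimensional Lagrangian

\[
\Phi(x,\lambda) = F(x) + \sum_{i=1}^{m} \lambda_i\, G_i(x), \qquad \lambda_i \ge 0 .
\]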
However we shall be mainly interested in the non-negative functionals. We call a functional \lambda(z) non-negative if for all non-negative z ≥ 0 it is non-negative, i.e. \lambda(z) ≥ 0.

The general form of a linear non-negative functional in the space L^p[0, T] is

(2)   \lambda(z) = \int_0^T z(t)\, \Lambda(t)\, dt,

where \Lambda(t) is an almost everywhere non-negative function (\Lambda(t) ≥ 0 for almost every t ∈ [0, T]) and \Lambda(t) ∈ L^q, q = p/(p-1).

The non-negative functionals over the space C[0, T] assume the following form

(3)   \lambda(z) = \int_0^T z(t)\, d\lambda(t),

where \lambda(t) is a non-decreasing function, continuous from the right side, having bounded variation.
G{x) ~ 0
co n s is t s o f s e veral co m -
i , e. 1, . .. , n,
where
G . : X .... Z., i = 1, • .• , n, 1
and
1
the fun ctional
).
is d efined o v e r
th e Ca r t he s ia n produ ct
co nsis ts
It
wh er e
A
d enot e s
i
c o m p o nents
i , e.
t he line ar fun ctional
ove r
n
of
Z. • 1
Wh e n we w r ite
A~
T he L ag ra nge
fun ctional in th e p res e nt case c a n
0
it mea n s th at
a ll
'\ . a re no n - ne gativ e 1
b e wr itten
n
(x,
Sinc e
A)
L x
= F {x ) +
F(x),
a c o nc ave fun ctional
i= 1
A[G{ x)] of
x
1
[G .(X)] ,
ar e c o ncave fun ctionals of a nd
a
linear fun ctional
F or c o ncave d iffer entia bl e fu nc ti o n s hold s (4)
), 1. c
where
1
F{x)
Z .~ 1
x, of
p(x,).)
is
>.
the foll owing inequality
-
163-
R. Kulidowski
It
follows froml the definition of the differential and the definition
of a concave function: (5)
°<
a<:
<1,
xl' x
- arbitrary points of the domain 2 It can be easily proved that the inequality
of
F'(x)
(4)
holds also
for
conca ve and differentiable functionals.
Theorem 1.
F
Let the functional
and
the operator
concave and strongly differentiable over there exists gative
such
functional
a
dx'ff.x, )J,x) =
(2)
C(x) > 0,
X[C(x)]
l;:
X
and
a
A that for every
(1)
(3)
x
be
C
X.
If
linear non-nex E: X
°
= 0,
(4)
then the functional
F(x)
attains
at
mum value subject to the c o n s t r a i nt
X the C(x) :::'0.
Proof 1) . Since
F,
). C
are
concave functionals we ha ve
F(x ) < F(x) + dF(x, x o -
).. [C(X
o)]::;).
" 0
- x)
,
[C(X)J +ldC(X, X - x) o
1)
The proof
of
a
similar
theorem
is
given
in
Ref.
[1]
.
maxi-
-164 -
R. Kulikowski
x
where
= arbitrary element of
o
X.
Summing up these inequalities one
gets F(x
+
o)
~[G(Xo)J
Taking into account
Since
~ 2:. 0
Since
x
o
dx~[(X').
:::: F(x) + >-[G(X)] + (1)
and
(3)
), X o
xl
one obtains
2: 0 we get F(x) 2:. F(x o) o) arbitrary the theorem has been proved.
and is
G(x
Remark When
the constraint
x
2. 0
is distinghished explicitly i , e.
when
< G 1(x),
G(x)
where
G: X .. Z,
x
> ,
Z = ZI)( X,
one can write
i£ (x,). ) where
A.
=
< ), 1' ) ' 2> .
Then the conditions
(1) - (4)
can be written in the following
equivalent form (5)
d
d
(6)
x
~ 1 [(X, \),
x
x]
=
0
~1 L(X, ~ 1)' xJ 5 0, for
every
PI l(X, ~'1)' 1 1J = 0, [(x, ,\), ), IJ 2. 0 for every
x
2.
0 ,
d).
(7)
d~ ~
(8)
1
Indeed, the condition
(1)
dx~[(3"}.)'
(9)
Since
),2(X) = 0,
for
x]
can
0
be written
= dX~I[(X,
x =x
~1~
II)'
x],.\(X)
=0
one gets (5). According to
(4) for
-
165-
R. Kulikowski
XI [G I (x)]
Since in
.x 2(x) ;:: 0
x ~ 0 the inequality
every
= d;\ 1 ~
hol ds , Then by (9) one gets
[(x, ~ 1)' AI]
for every
0
(3) can
G 1(x) ? 0
the form of (7) • Taking into account that
AIL G 1 (x)] ?
then
\;:: 0 , the relation
(6).
be written
if and only if
(2) can be written in the form
of (8) • Remark 2: It should be noted that when
such
a pair
(x,5.), that
F and
G
x c X the
for every
are convex and there exists conditions
d ~(x,:x.; x) = 0,
(10)
x
G(x)
(11)
:s
0,
~[GC>()]= 0,
( 12) (13 ) hold then F'[x )
attains
The formulae (5) - (8)
at
x
its minimum value; subject to
preserve their form,
except
G'(-x) $ O.
(6) and (8), which c ha n -
ge the i n equality signs, i.e .:
~I[(x,),)(l2:0,
(14)
d
(15)
-r-O:; - I d A<£ 1 L(x , " 1)';' r''5. 0, for every
Re mark
x
for every
x
, II
~O, ~
0,
:~ .
In the c a s e when the expli cit form
Of
the functional ~ (x,), ) is g i v e n
the opt imality conditions i nc l u d i n g differentials c a n be replaced by c o n d iti o n s in the gradient form • Let fo r
example, F'[x} : be the integral operator T
F(x) =
J
K [x(
~),:,J d r
o ha vi ng the
gradient f
[x;1:"]
' KI r , x ( '!:'),::J, x -
-
166 -
R . Ku likowski
(see (6) o f s e c.
2 ). Ass ume t hat
G (x ) be a d ifferentia bl e o pe rato r an d
G: X ~ L P [ 0, TJ T aking into account t he ge ne r a l for m o f line ar fun ctional s in LP[O , T ] (s ee (2)
o f s e c.
3)
o ne can writ e T
J ).Cd
G(x )
G
ex)
;\ ("e ) E: L q [0, T]
di,
o Den ot e th e g ra d ient of t ha t fun ctio nal g ra d and t he g ra d i e nt of For
a
is a f unc tion
h x , \.)
fi xed
of
by
A [ G(X)] by
'f [x,
).(~) g [x,,] , i , e.
)('t) g [x ,
). ,I::
r:J '
J.
x =x
r E. [0, TJ and
d 1! [X,\ ; xJ = (gra d x x
o ne ca n wr it e
~(x, ~),
x]
=
T
_
f~rx,
A:r]
x('C"') d c",
o T he co ndi tio ns
(I), (10) r educ e t o th e r equirement th at a l mos t eve ry -
wh e r e (1 6)
In a simila r wa y th e co nd itio ns
(5),
(6) a re equi val ent t o the follo-
wing: I)
if
g rad
X(T ) = 0
II)
if x(r)
x
~ 1 (x, \.) < 0 fo r a l mos t a lmos t e ve ry- whe re in P
> 0 for a l mos t
a l mo st e ve ry - w he re i n wh ere
all 1: E P -C [0, TJ, then
a ll 't ER C [0, T J ,
then
gr a d
T he co nd it io n s ( 3) ,
(4)
PI (x, ')")
= 0
R,
P, R = a r b it ra ry s et s of po s iti ve m e a sure o f the interval
d ient form:
x
[0,
TJ
(and (7), (8)) c a n b e a lso written i n th e g r a -
-
167-
R. K ulik ow s ki
III)
~ g r a d >. 'f(X, ",\ )
= G(x)
> 0 for a l m o s t all C" E P C [0, T] then
), ('t) = 0 almost every-where in P, IV) if
"5:(-::) > 0
g r a d .l.
1? (x, ~
Co n d it i o n s
:=
for almost all 'e'ER
[O,T] , then
) = 0 almost every-wher e in R.
III-IV h a v e a simple physical interpretation: a t th e p oints
where the c o n s t r a i nt s are not active the L a grange-functi on vanishes and wh en the L a gran g e - function is positive the
co n s t ra i nt s must be a ctive .
It i s also p ossible to write down the g ra d i e n t form of conditions o f
optimality in th e case of minimali zation o f Wh en th e operator c tio n s ,
F'(x ) •
G(x) is a cting into th e space o f c o nt i n uo us fu n -
G: X -,. C [0, T] , one c a n w r ite
i. c ,
T ) G[x] d), ("d
\ " .\ LG(x) '1J
o where
). (~)
d ). (r) ~ 0 small
is a non-in creasing fun ction, what c an b e als o written a s
vi c inity of 1:' is non -ne gative). Wh en
A'( z )
d ~\ (1:") =
de',
;l.(?:")
to
d). (<:)
ned in th e g e ne ralize d s e n s e (e . g . as the ).'(1:") c a n also include
Dira c's
~
A' (1:) 2:
0
0 if the d ifferentiation is defi-
Schwart z's
'b (t)
in the a r b it r a ry
is differ entiable
and the above c o n d iti o n 'can b e written
The l ast inequality is equivalent
ca s e
>. ("l:")
(that notation m e ans that the increas e of
distribu tion). In th a t
- fun ctions a t th e dis c ontinuity-
- p o i nt s o f .:\ (1: ) The co n d itio n
( 3)
in th e pr-e s e nt c a se c a n b e written a s
0, or I) i f = 0
II)
if
It is
grad/.
~ (x'\) = G Lx] > 0
.r o r
()..(1:") = c o n s t ) at the point d\(;;) > 0, "t"l:P
th en
"r
[0, TJ
't" E-P C [0 , T ]
then
d
A( co)
T or v i c i n i t y of ""t' ,
= 0,
tEP .
gr'a d i " n ~
10r m
G [ x]
also po s .si bl e to write th e
€
of 0 ;.'1. i ma l i ' y co n d i -
tions for the remaining formulae (1) - (4), (5) - (8), (10) - (13); we shall leave that as an exercise for the readers.

The conditions (5) - (8) can be called the quasi-saddle-point conditions, and they can be treated as a generalization of the well known Kuhn-Tucker differential conditions of optimality, which were formulated originally for nonlinear functions in finite-dimensional spaces.
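For comparison (added here; this is standard material rather than part of the lectures), in the finite-dimensional case X = R^n, Z = R^m, with \Phi(x, \lambda) = F(x) + \lambda' G(x), the quasi-saddle-point conditions take the familiar Kuhn-Tucker form

\[
\nabla_x F(\bar{x}) + \sum_{i=1}^{m} \bar{\lambda}_i \nabla_x G_i(\bar{x}) = 0, \qquad
G_i(\bar{x}) \ge 0, \qquad
\bar{\lambda}_i \ge 0, \qquad
\bar{\lambda}_i\, G_i(\bar{x}) = 0, \quad i = 1, \ldots, m,
\]

for the maximization of a concave F subject to G(x) ≥ 0.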
Necessary conditions of optimality

When an optimization problem is being solved it is also important to know that the functions x(t) which do not satisfy the conditions (1)-(4) or (5) - (8)
can not be optimal. That problem requires to prove that the
conditions (1)-(4) are also necessary for optimum. In order to prove the necessary conditions we shall impose certain regularity conditions on G but Shall not assume any more that We shall call ble variation
x
x € X o at the point
o
,which
G
if
is defined
for every admiss iby the condition
G [x ] + dG(x , x) ~ 0 , o 0
(I)
there e xi s t s e curve emanating of admissible solutions
n .
generally speaking, a function i , e.
F and G are concave .
a regular point of x
F and
y:
from
x, tangent to x and lying in the set o 0 By a curve in the Banach space we understand,
r of real variable
s
with
the range in
X ,
R't X • According to definition that function should satisfy the follow-
ing conditions: (2)
't (s)E n , We shall
F(x)
(3)
'(0) = x O '
dt(O, 1)
show now, that if
and a re gular point of
=~.
is a maximali zing point for
G then for each admissible variation
G(x) + dG(x, x)
~
0 ,
x, i. e .
-
169-
R. Kulikowski
we get a non-positive increase of F, i , e. -dF(i, x) 2: 0.
(4)
Indeed, the real function f(s)
(5)
attains the maximum value for On the
['f(S)J
F
'1'(0) =
X, i.e. for s
= 0. Then
df(O, I)'::; 0.
other hand, according to the differentiation rule of compound. fun-
ction (5)
and formulae (2) we get df(O, I) = dF ['1-(0), d 1(0, I)] Then
dF(x, x) SO and
(4) has been
= dF(x, x) .
proved.
Introducing the notation II (x)
= -d F(x,
g(x )
xj ,
= G(x) + dG(i,
x)
the obtained result can be written as: (6)
g(x) 2:
if
°
then
II (x) ~
°.
The next step in our reasoning consists in showing that there exists such
a functional
). > 0, which will ensure the relation
where L (x) I
(7)
= dG(x, x] .
The main obstacle in showing that is the nonlinearity of g(x), whi ch consists of linear term In order to (8)
L (x) and the additive term G(x). I overcome that obstacle an auxiliary operator L < s , x>
=
< s, sG(i) + dG(x , x»,
where L: R"X'-' R"Z
s€R,
-170 -
R. Kulikowski
can be introduced. it can be proved that for
real numbers
and
o(l,,x2
Since the fun ctional Rx X
L <
S,
x> is a linear operator,
i,
e,
sl' S2€ R,
11 (x)
can
be also treated as defined over
then the following notation 1 <
x >
S,
-dF(x,
x)
can be introdu ced. Now we should che ck whether the condit ion L <s ,
x> ? 0 impli es
<s , x>
> O.
Observe th at this c o n d iti o n c a n be written as
s ? 0
(9)
and
sG(iC) + dG(i, x ,
Assuming first of all that
s >0
G(x) + dG(x, x is) Taking into ac count
? 0
and divining
?
(4) we obtain
(9) by
s
we get
0 . -dF(x, x is)? 0
and -dF(x, x)?O
or ( 10)
1 <
Then the fun ctional
<
S,
x
x> 2
o.
1 is nonnegative
> which s at i s fly c o nd itio n s Each point
(11 )
S,
s > 0,
on the set
P
of
pai r s
sG(x) + dG(x, x).2 0 .
<0 , x> for which the relation dG(x , x) ;: 0
holds can be treated as a limit of the sequence sequen ce c a n be constructed as follows.
of points from
P,
That
-171-
R. Kulikowski
Take on element
!n
(12)
x
with the property that
o
+ dG(x, x
G(x)
Summing up (11) and (12)
!n
0
In)
>0
-
one gets
G(x) + dG(x, x
In + x)
>0
0 -
Then 1 1 < -, - x n
n
1
lim n .,.
then
(9)
+x >e
0
P
• Since
1
< 0, x>
< -;;-' -;;- X o + x >
00
implies (10) for all
s
2 o.
Introducing the notation
<s, x> = w, Rx X = W the obtained relation
can be written as ( 13)
if
L(w)
2
0
then
l(w)::: 0 ,
where L = linear operator, I
L: W
= linear functional over
W.". , adjoint to
~
V = Rx Z ,
W (an element of the space
W).
We can now return back to the main problem which is the existence of nonnegative functional
l(w)
(14)
To solve that spa ce s
V Jt
satisfying the relation
=
v*[L(w)].
problem we shall need a generalization for Banach
of the well known Farkas lemma. Hirst of all it is necessary to introduce a few additional notions. We shall denote by
Q
a set of all functionals of
be represented in the form (15 )
w*+ (w) = v ....
[L(W)J '
'"
v 2:: 0,
w
which can
-
172-
R. Kulikow ski
where
*
v e- V
v
l<-
= R KZ
The sequen ce of fun ctionals ,..
•
w E. W, if the sequen ce w
e
W
w k(w),
witE W"" is called w eakly converg ent to k
k = 1,2, • . •
converges to
~
w (w)
for each
•
Th e set ther
l<
.
of functionals
Q C W
"!l'
is called w eakly clos ed
with ea ch wea kly c o n ve rge n t sequence
if
to ge-
w: -+ w it c o nt a i ns also
their
limits, i. e. The g e ne r a lized Farkas)
states that
function al
W -It
if
Farkas lemma (called the lemma of Minkowski
Q
is a weakly closed set then
... v~0
In other words there exists such
and
it contains each that (14) holds
r1 J ) .
(see Ref.
Now we are able to formulate a nd prove the following n eces sary cond itions of optimality Theorem 2. Let 1.) at the point
xi murn , 2) 3)
the functi onal
F(x)
subject to the constraint
the point the set
x
2 0,
G(x) ,
is weakly closed.
Then ther e e x is t s such a fun ctional condit ions
G(x)
x is a re gular point
Q
attains the conditional ma-
~ :::: 0
that the quasi-s addle-point
(1) - (4) hold.
Proof
Ii
It was shown already that there exists such a functional (16)
1 <s , x> = v"'(L < s,x » Taking i nt o a cc ount (8) one can write
1) The proof of a similar theorem is given in R ef.
[1
~
.
_..
v
~
? 0,
th at
-
173-
R . Kulikowski
il""<S, x > = ~s +I[sG(x) + dG(i, x)] , where
u
is a real number
and ·~ E Z -Ii-
•
v"~ 0 can be written in the form
The condition
""~O,.\~O
(17)
Hence (4) follows. The relation (16) (18)
can be rewritten as
-dF(i, x) = Setting
s =0
1 [dG(X,
and grad
x
(18)
x)]
=di [(x, ~),
! [i, >: J = 0
xJ = 0
•
the relation (1) follows.
Setting
x = 0,
X[G(i)] = 0
yields
+ ~ (sG(i) + dG(x, x)] •
one gets from
dF(x, x] +
Then
fS
t<- + 1[G(X)] = 0, what by (17)
s = 1 one gets
and (3). The condition (2) must hold by
assumption,
Q.E.D. The theorem ples
2 is valid under two regularity conditions. In the exam-
which are considered below the operator
G(x) = A(x) + a, where element of
X.
G(x) assumes the form
A(x) is a linear operator over
One c a n easily check
each admissible variation
that
X, and
a - given
G(x) i s a regular operator. For
x, defined by
G(X) + dG(x, x) = A(x) + a + A(x) ~ 0 , and the function t(s)
and the c ur ve
r
t(s) = x + sx , one obtains
= x+sxefl,
't(O)=x,
dy-
(0,1)=x
is lying within the set of admissible solutions .
It is possible to check that the same property have the operators
G(x.),
which consist
of several components
(constraints)
Ai(x) t a ' A j i
-174 -
R. Kulikowski
linear,
a € X, i = 1, ••• , n ; i In the case of nonlinear operators
the space of continuous functions measurable,
bounded functions
(Z = C[O, TJ)
(Z = L
x,
such that the function
00
or into
into
the space of
[0, T]) the regularity c ond itio n
[7J )
boils down (see in that respect Ref. tion
G(x) , which are acting
to the existence of adm issible varia -
z(t)
z(t) = G [x] + dG(x,
x),
is positive. In the case of the space of continuous functions it means that min
z (t ) > 0,
0-:; tg L 00 [0, T]
and in the case of the space z(t)
it means that the minimum
of
is positive almost every-where. As an example consider the inequality t
G(x) = a(t) -
(20)
J
d'l:' >
0,
o where
r
(x)
is increasing
= nonne gative,
G(x)
~
0
and having continuous derivative
continuous function. That op erator is regular
'f I(X);
a(t) =
Indeed, since
and t
dG(x, x)
ifl [x
)
('0)]
x('lf)
d z , '("(x)
> 0,
o
one c a n set
x(t')
= - 1 and obta in z(t) > O.
As shown in Ref. dition
3 . of theorem
an element (21 )
[7] in order
to check whether the re gularity c on -
2 holds it is sufficient to show that there exists such
w-Ii E W, that L(w'''') > 0 •
-175 -
R. Kulikowski
For example, in the case of the
L(w)
= < s,
operator
t
J If' [x{~)J x('t')d"t'
(20)
one obtains
t
- Sr [XC~)} d t'} > ,
+ s [a(t)
°st:ST.
o
o t
Since
~ If[xCI:)]
a(t) -
there exists such a parr
d 1;' ~ 0 , then
< s,* x· >,
it can be easily proved that
s'" > 0
x~(t) > 0,
and
t
e [0, T],
that
L(w*) > 0 • It should be noted that the theorem
of variational
2 generalizes certain theorem
calculus in Banach spaces and, in particular,
the following
theorem of Luster-nik, Theorem 3. Let the functionals
If grad
X E.X
and
F'(x},
subject
H(x)
II > O.
If
to the constraint
x
is
a conditional
H(x) = c ,
grad F(x)
(22) and
F, H be strongly differentiable at the point
= p
where grad
extremum point of
c = H(x) , then
H(x) ,
/"- is a number • The proof of that theorem is given in
Consider a controlled input
u(t)
Ref.
[9] .
linear dynamic system, shown in Fig. 1; having one and n+I outputs, which
are described by Volterra
operators: (1)
Yi(t)
J kp,1:") u(1:')
d1:' ,
o where ki (t, 1:' ) - linearly independent transient functions of the system, i = 0, 1, .•• , n •
-
176-
R. Kulikowski
A typical optimization problem can be formulated as follows; Find the function u(t) € L P [0, TJ, which minimalizes T
II
(2)
u 1/
=( J
p
ju(t)1 p
o
,) 1/p
d~
, p Co 1
subject to the constraints T
J
y/T) =
(3)
k.(T,'t' 1
u(-r) d z-
= x .; 1
i = 0,
1, ••• , n
o where
T,
x . - given real numers, 1
In other words, for the given outputs
it is required to minimalize the control cost (2)
x . attained at the time 1
t = T •
In certain cases additional conditions of the form (4) (~,
M ~ u(t) M
M
given numbers) or t J.(t) = k.(t, z ) u(?") d z < x .(t), J 0 J J
f
(5)
S
= 0,1, ... , n
(x .(t) = given time functions) are being imp osed. J The constraint (5) is c a ll e d "restriction of phase coordinates". There exist, of course, many known optimization techniques, such as : va r ia t io na l cal culus,
maximum principle, dynamic programming etc. whi ch
c an be applied for the solution of the optimization problems formulated above. In the present se ction we should like to demonstrate that the m ethod based on theorems
I,
2 of sections
4, 5 , is very c o n ve nie nt for the solution of
problems incl uding restri ction o f ph ase c o o r d i na te s . Instead of de alin g w ith a g e ne r a l n-dim ensional s ystem we shall confine o u r anal ysis to a s e cond order s ystem, whi ch is fr equently e nc o u nt e r e d in th e
en gine ering pra cti ce (e. g. in s ervom e ch ani sms etc .) . The r esult o f that
anal ysis w ill be us eful for th e in vestigation of a class of c ontrollability probl e rn s ,
-
177-
R. Kulikowski
Example I. Consider a system described by the differential equa tio n dYI dt = u(t),
(6)
with zero i nitia l conditions:
y 0(0) = Y1(0) = 0,
shown in Fig. 2 .
It is r equired to find such a control function
u(t)
which min imali ze s
th e " energy cost": I
T
="2 j
E(u)
(7)
[u('()
J
2
dt,
o s ub ject to the constraints T (8)
YO(T) =
J
(T -1:') u(;:-) d r = x ' O
(x
O
> 0)
0 T
YI (T) =
J
u( ·~)
dr
=0 •
0
The c o n s t r a i nts (8), (9) mean th at the deflection of the output c o o r dinat e of the s ystem for of that coordinate
at
t =T
t =T
is equal x 0
a nd the c o r re s po n d i ng velo city
is equal ze r o . The c o ns t r a i nt s of that kind are
typi cal for operatio n of controlled motors
and se r vo m ec ha nis m s .
The Lagrangean of present pr obl em is e q ua l
~ (u, f ) (10)
1 =2
T 2 \ [u ( 1:' )] d t- +
o
T
( T -'t') u(r) d r -
xo~+ /'1-2 J o
wh ere
r-2 = La grange multipliers. Th E' ne cess ary, a nd at th e sa m e t im e s uff icie nt co nditi o n (due t o t he
1~I'
co nvexi ty of
E (u )) , o f o p t ima l it y acco r d in g t o th e or em
3 becomes :
-178 -
R. Kulikowski
(ll ) where
grad f'l'
u
~(u, t<-)"'. u{~)
+ ""l{T - L) + /"'2
can be computed by
«'2
12x = ___0 1 T3
~
and (12)
(a) ,
(9) •
6x
,
~
2
= T
=
0 ,
We get then
0
2
6x =-f(1- ~~), O:S~:s T. T Now we can solve a more complicated problem when in addition u{t')=u{'r)
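A quick numerical check, added here as an illustration (the numbers T and x_0 are arbitrary), confirms that the control u(τ) = (6x₀/T²)(1 − 2τ/T) obtained in (12) meets the terminal constraints (8), (9).

```python
import numpy as np

T, x0 = 2.0, 1.0                        # arbitrary illustrative values
tau = np.linspace(0.0, T, 200001)
u = 6.0 * x0 / T**2 * (1.0 - 2.0 * tau / T)

y0_T = np.trapz((T - tau) * u, tau)     # constraint (8): should equal x0
y1_T = np.trapz(u, tau)                 # constraint (9): should equal 0

print(round(y0_T, 6), round(y1_T, 6))   # ~1.0 and ~0.0
```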
to constraints
(a), (9)
the amplitude-constraints of control force: u{ '1:') - M :s 0
(13) -M - u{'I:')
where
M
= given
number, should be taken into account. These constraints
are typical for the operation Since linear,
<0
of electrical motors in servo-systems.
is a square-integrable function the general form of the 2 non-negative functional L{z) in the space L [0, T (according
to (2) of section
u{r)
J
3) becomes T
L{z)
! ) (~)
=
z{'t') d't'
,
o where the function
A(-e-)
is non-negative and square-integrable.
The Lagrangean of present problem becomes
J (U{t')J 2 d e
1 T
P{u, /"-, ~) = "2
+ ~1
A1 (r ),
~ 2{1:)
=
~ o)'
J
T
T
+ { ). ('t')
o where
(T -1:') u ('I:') di:- x
0
o
ucn d e
(fT
1
[u ('t')
-
Lagrange functions.
M] dt'+
'\2('t') [- M - u('I:')] d r ,
o
-
179-
R. Kulikowsk i
The necessary and sufficient condition of optimality, accordin g to formulae (10) - (13) of sec. 4, grad
u
p(u,
fL, ) )
become
= u("1:')
+
(15)
r-1 (T - ~) + r.2
u( '1:) - M.:::O,
( 16) T (17)
J
A 1('~)
[U(....) -
1"-1'
f-2
(C')
+
0,
T M] d e
J
>
° \
"1
-M - u('/:'):s.0,
A2 ( "t) [-M
°
( 18) where
=
), 2 (~)
+
(r )
~2(L) ~
.2 0,
- u ('t->]dt' =
°
°,
can be determined by (8), (9) •
,\ . (L), 1 become zero. That correspond to the nonactive constraints. When the
As can be seen these condition will hold for all "L
c o n s t r a i nt s become active
the functions
values which satisfy the condition the interval
[0, TJ
1 (-1:'), 1
, when
). 2(~) should take such
(15). When the constraints are not active in
the optimum solution should be the same as
(12).
Then the optimum solut ion becomes
M,
u(t)
(19)
0 < t < T -
- [t"-;1 (T-t) - M,
+
0
r-2] = M l-1 - ~] T-2T
T-T
o
T
< t < T-T 0 -
-
-
T
T
T
(T - t) li(t) dt
°
=
T-T
Ml J (T - t)dt + J
°
0
< t < T
0-
can be derived from the equation obtained by setting o which yields :
where (8),
-
0
(T-t)
[2(t - T ) 1_ 0
T
-vr,
(19)
JJ.t
into
-
180-
R. Kulikowski
..... I
(T _ t) dt} =
M T - 2T
T -T
'= x
o
o
o
Typical form of that solution has been shown in
11 (t),
The correspondig optimum values o f - t"l(T - t) -
{
r, -
~ 2(t) become:
(t-T ) o 2M =-4M T-2T o'
0 st
'5. To
O,To StsT
~ 0,
0:S t :S T-T 0
LI'Ll (T -
t+T -T t)
+
P:
o f the relation
2
- 2M = 4M
o
T-2T
T-T o
< t <- T
0-
[x ,
T
The pl ot
F i g. 3.
_9= f[--2..-] has been shown T MT2
It c a n b e obse r ve d that th e optimum s olution,
in
Fig. 4.
satisfying c o n s t r a i nt s (8), (9),
exists only if |x₀| < M(T/2)². It means that when M is less than 4|x₀|/T² it is impossible to attain the output x₀ in time T using optimum (or non-optimum) control. In other words, when |x₀| < M(T/2)² the system is optimally controllable in the sense formulated in Sec. 1.
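The bound M(T/2)² can be seen directly (a supplementary remark, not in the original): among all controls with |u| ≤ M and zero final velocity, the largest final deflection is produced by the bang-bang control u = M on [0, T/2], u = −M on [T/2, T], for which

\[
y_1(T) = \int_0^T u\,d\tau = 0, \qquad
y_0(T) = \int_0^T (T-\tau)\,u(\tau)\,d\tau
       = M\!\int_0^{T/2}\!(T-\tau)\,d\tau - M\!\int_{T/2}^{T}\!(T-\tau)\,d\tau
       = \tfrac{3}{8}MT^2 - \tfrac{1}{8}MT^2 = M\Bigl(\tfrac{T}{2}\Bigr)^{2}.
\]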
Exar::ple
~
Consider a gain - coris tr -a ints
(1 3)
the problem of example 1 but replace the amplitude-
by the " velocity- constraint"
(20)
G(u) =
j
u('t') d't'
'5.
V
,
o whe r e
V - given p ositi ve number corresponding to the maximum admissible
v e loci ty of the output c o o r di na t e . Sin ce C [O, T ] (i.e.
u",L G : L
2
2
(0, T] and the ~
C),
,
10
whi ch
operator
G(Il) is acting into the space
the g e n e r a l form of a
nonnegative linear
-
181-
R. Ku l ikow ski
fu ncti on al is des c ribed
by (3) of s e c.
3,
blem become s 1 2
T
/'-1 (
j
-~
(T
th e La grang e an of th e pr esent proT
I o
) u( t" ) d ~ - x ) o·
0
J
[u('t")
2
d r' T
+
' 2
T
J
+
d
>. (t)
S
[
.\ (t)
is a
u( c ) d
i
0
t
0
where
J
i.
u(
d~
~)
- V
+
J
dt ,
0
nonde creasing fu nction with bounded va ria ti o n , co nti nuo u s
from th e right s ide. Si.nce T
d
t
S
dl'
S [u('e-)
d -\ (t) {
0
J h( 't") J
o ne g e t s
u
wh e r e th e fu nc tio n T he n fun cl io nal (7)
t
J dArt) [ J o
i
t
d ), (t)
u( 7:")
d~ -
0
d~(t),
t'
v]
T
S
d A.(t)
0
T 1(t;):J
d), (t)
= ;\( T ) - ).('t:)
is non-incre as in g.
~
the ne ce ssary and suffic ient c on dii ion for th e mi nim um of the sub ject to th e c o ns t r aint s (8), (9), (20) is acco r d i ng to (10) -
- ( I:l)
(21 )
g rad
u
! (u,
~, A. )
= u(
~) + t""1 (T - -r ) + t'-2 + 1 ( ~ )
t
( 22)
1h("t') d ~
T
0
T
T 0
0
T
gr a d
- v}=
+ 0 h("t')J de'
S o
ut e ) dl: ~ V ,
0,
-
T
J 7:
J
(23)
182-
d
A(t')
[
°
u(s) ds -
R. Kulikowski
V]
0,
°
(24)
d ). ( ~)
where small
vicinity
2
of
In order
J and
~ (1: ")
denotes the increase of
in an arbitrarily
't' • to satisfy
active, we split [ T 2' T
°
(21) - (24), in the general case when (22) is
[O,T] into three subintervals
[O,T
I],
(T
I,T 2
J,
assume
tilt)
= 0,
as illustrated by Fig. 5. Then
the condition (21) can be written as
(25)
(26)
f'-I(T - t)
(27)
+
/-t- + let) 2
u(t) + 1"-1 (T - t) + 1-'-2 Since
let)
= 0,
= 0,
should be continuous
t
at
e (T2'
T l ' T2
T]
one obtain fr-om
these equations t€[O,T (28)
u(t)
I],
t ,,[,-. ,I... ]
t E"[O, TIl,
(29)
let)
tE [T
I,
•
T
2],
-
183-
R. Kuli kowsk i
In o r d e r
to
(9)
T Then from
be va lid it should b e
-
T
2
for
(26)
T
1
t = T 2 and
l( T ) 2
0
o ne g e ts
f2~
- t'-l T 1 • By
and
(20)
T
I
we obtain
(8)
T
1
\1("1:') dt"
J
=
0
T
J
1
r- 1 (t -
0
I'-
T 1 )dt
2 lT1
T
u('t') dz- =
J
0
T (T - t) f
(t - T ) dt + 1 1
0
T
V
2
J
T- T -T
f-l -i1 1
2
T2 2
J=
(T- t)
t:
1
(t -T+T )dt 1
1
x 0
From th es e e q ua ti o ns o ne fi n d s T1
In o r d e r
=
to get
~
x (T - -9), V
T1
8V x
1"-1 9
> 0 it must
be
h- ;J
~v > 1.
2
In o t he r w o rds
t he s y-
stem is optimally controllable, in the sense of Sec. 1, if the quantity V T is greater than x₀. In Fig. 5 a typical plot of u(t), l(t) and the "velocity" v(t) = ∫₀ᵗ u(τ) dτ has been shown.
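The necessity of this condition is immediate (an added remark): since the constraint (20) keeps the velocity below V at every instant,

\[
x_0 = y_0(T) = \int_0^T v(t)\,dt \le V\,T, \qquad v(t) = \int_0^t u(\tau)\,d\tau \le V,
\]

so no admissible control can reach a deflection x₀ larger than V T.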
I
Example 3 . It i s requir ed to find su ch a cont rol
u(t) o f t he s y st e m
minimiz es the fun ct ional T (30)
IU
F [ u]
o
('t')
I d 't
,
( 6),
wh ich
-184 -
R. Kulikowski
subject to the constraints (13), (9),
(13).
Since the derivative of the function under the integral sign
of (30)
is discontinuous we can not use the formula dF
eu d
+
r
hJ }
5 for the derivation of the gradient u(t),
of
(= 0
t: u J .
F
For that reason we represent
as a difference of two nonne gattve functions, i , e.
wher-e
"i (t) = max [u(t) ,
0]
'
u
2(t)
= max (-u(t), 0
In the pre sent case the Lagrangean takes the following form T
S: Cu,r ,iI] = J o
CUl(~)
+
U2(~)Jd7:+ (T
-~)[u l (c) T
T
(31 )
\+
- u ( t' ) 1 dj- - x 2 ' 0)
J CU 1 ( ::- )
- u2(L)]d~
.I .\ 1( ~,) [u /1:' )
+
o
-
M] de- +
o T
+
\
'2(t') [ u 2 ( t-)
- Ml di:
0 where
\("c),
\2(r)E:L
oo
[0, T].
Before we write down the conditions of optimality we should de r ive the differentials of
(31) T
(32)
S [1 + r- 1 (T-"t") o
+
I<. I 2
+), (1:')J 1
h(T) d1;' ,
-
185-
R. Kuliko ws k i
( 33)
d
.p [u' u 2
t"- ..) '.
T
h
i
\ ~ I -
=
.J
I~ 2 + .\ 2 ( z- ) J h (l:') dr- : ,
f<- I (T - ""t ) -
°
T
( eu I ( ?") -
( 34)
M
J
dt ,
°
T
I [u2 ( 1:')
(35)
Jh (e)
- MJ h( L)d '" •
°
Now w e can show th at the o p ti m um sol ution c o n s is ts of tw o p ulses with th e h e ight equa l
M (s e e Fig. 6)
~
, ..M,
( 36)
\11 (t) = ) ( 37)
( 38)
r u (t ) -= ,
t
2
( 39)
O
I
0,
T
0,
°:s
M, T-T
T -T
t <
I
< t4' I - -
an d _
i - I- r- I (T - t ) -
~ (t)
(40)
I
{--<2
2:
0,
°< t < T -
-
I
=
°
I
O
(41 )
=1°'-I T he va l ue
-
1
+
of
TI
ca n be deter mined by s e t ting
u(t) i nt o (8)
wh i ch y ie lds T
(42)
J
°
T
(T-t)Mdt-
J
(T-t)Mdt= MT
1(
T-T
x I)= o·
-
186-
R. Kulikowski
fA 1'
The value of
A 2 (T
\(T ) = 0 , 1
r2
can be determined using the relations
- T ) = 0, 1
whichyield
°,
~ 1(T - T 1) - /"'-2 +>-<-T +1"-:2 1 1
0.
Then 1"'-1 =
~2
T 2
Using formulae (32) - (41) it is possible to check d
u
1>
[ii,
1
,\ ~ [u,
d
I"-
\
f'- , A
1
and d
u
d
1> ( ii,
r- ,\
l
Al
1?[u,~,-\
d
~J
uJ
d
~O,
A] ::: 0,
u
[u, r ,): ; u] = ° f> Cu, ,..,~ ; :x ] = 0, iJ?
2
x2
u]:::
d):?[U,!,-,); d').
P [li, r
2
; ~ J .:::
-
,~
0,
u ~ 0,
0,
2
which indicate that the solution In Fig. 7
ti]
that
(36)
is optimum. T1 [_IXo1 ]
- (39)
the plot of the relation
MT 2
T
shown.
has been
41 X o I
It can be observed that the optimum solution exists only if 4|x₀|/T² ≤ M. In other words, when M is greater than 4|x₀|/T² the system is optimally controllable in the sense of section 1.
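The same bound follows from a one-line estimate (added for clarity): a pair of pulses of height M and common width T₁, as in (36)-(39), produces the deflection

\[
y_0(T) = \int_0^{T_1}(T-t)\,M\,dt - \int_{T-T_1}^{T}(T-t)\,M\,dt
       = M\,T_1\,(T-T_1) \;\le\; M\Bigl(\tfrac{T}{2}\Bigr)^{2},
\]

with equality at T₁ = T/2, so the target is attainable only if |x₀| ≤ M T²/4.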
2
It should be observed that the problems considered in the examples
and
3
can be easily extended for the case when the constraints of the
control force are given time functions
In a similar way the quantity
V
M (t) 1
,
M
in the example
i , e.
2(t),
2
can
be
treated as a time function. It is also possible to
sist
consider minimum-time problems, which con-
in finding such a control force
to the constraints
(8),
(9)
u(t)
(10) and
which
minimizes
T
subject
-
187-
R. Kulikowski T
J I u( 1:') I p
d't' -:s
u ,
p
> 1.
o
where
U
= given positive number.
If for example
p
=1
from
(36) -
(39) one obtains
T
(43)
U =
J I Ii (1:') I dr
= 2MT 1
o By elimination of
T 1 from (42).
(43)
one obtains
I. x 0'! M
In Fig. 8. given. from
the plot
of that function
For a given numerical values of that plot
for
,,(, and
0(
= const has been
U/ M
a correspondig minimum time value
one can read T=T
which deter-
mines the optimum solution. Then using (42) the corresponding value of T1
can be derived.
-
188 -
Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
References

1. Hurwicz L.: Programming in linear spaces, in K. J. Arrow, L. Hurwicz, H. Uzawa: Studies in Linear and Nonlinear Programming, Stanford 1958.
2. Kalman R. E.: On the general theory of control systems, Proc. IFAC Congress, Moscow 1960, vol. 1, pp. 481-493; London 1961.
3. Kalman R. E., Ho Y. C., Narendra K. S.: Controllability of linear dynamical systems, in Contributions to Differential Equations, Vol. 1, New York 1962.
4. Kulikowski R.: On optimal control with integral and magnitude type of constraints, Prace Instytutu Automatyki PAN, z. 67, Warszawa 1967.
5. Kulikowski R.: On optimum control of nonlinear, dynamic industrial processes, Archiwum Automatyki i Telemechaniki 1967, z. 1.
6. Lusternik L. A., Sobolev W. I.: Elements of Functional Analysis, Moscow 1951 (in Russian, English translation available).
7. Majerczyk-Gomulka J., Makowski K.: Wyznaczanie optymalnego sterowania procesami dynamicznymi metoda funkcjonalow Lagrange'a, Archiwum Automatyki i Telemechaniki 1968, z. 2, 3.
8. Pontryagin L. S., Boltianskii V. G., Gamkrelidze R. V., Mishchenko E. F.: The Mathematical Theory of Optimal Processes, New York 1962 (English translation by K. N. Trirogoff).
9. Weinberg M. M.: Variational Methods for the Investigation of Nonlinear Operators, Moscow 1956 (in Russian, English translation available).
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)

A. STRASZAK

SUPERVISORY CONTROLLABILITY

Corso tenuto a Sasso Marconi dal 1 al 9 luglio 1968

SUPERVISORY CONTROLLABILITY
by
A. Straszak (Institute for Automatic Control - Polish Academy of Sciences, Warsaw)

1. Introduction

Multilevel control systems are of increasing importance in the application of complex automatic control in industrial and non-industrial fields. The multilevel approach to the synthesis or design of the multivariable control system is the only one acceptable for real large-scale systems [1] [4] [5] [6]. A multilevel structure arises rather naturally in practice. However, the theoretical foundation of this concept is still at an early stage of research, since conventional multivariable optimization and control theories cannot be used directly.

II. The Problem

Let us consider a set of optimal control systems
i = 1,2, . .. , k . :
where x.
(x
1
. , Xl i' .... x n .i)
01
1
u.
(u l i ' u 2 i ' . . .
1
,.J\ 01 .
J
u
m. i ) 1
t
o.
1
of performance of the c o n t r ol system u, ~ 1
U.
1
Each of the optimal control systems is op timal due- to s ome f ix ed resources (for example: fue l,
energy) or another factor (for example: a p r ice
-
196-
A. Straszak
of the control given to the systeII]). Suppose that the set of the optimal control systems have common sources of this resources or resource - like factors . Now, we may formulate the following problem. Firstly, can we improve the optimality of the optimal control system which belongs to the given set of the optimal control systems by using the multilevel control procedure. Secondly, can we improve the overall optimality (in the sense of the global index of performance) of the set of the optimal control system by us ing the multilevel control procedure. However, we can not answer these questions before the study of controllability (coordinability) of the optimal control system. Therefore, first of all we must know i f it is possible to control the optimal control system and how to do it.
~~_~~£~~~~~~~~_~~~~~~~~~i~~~_ The optimal
control law
is optimal only for the given dynamic process dx .
1
dt and given constraints u.
1
and /
or
1 . 1
T.
So
1
(u .(?;'» 1
2
d~ <
2
r.
1
and/or
and so on. Therefore the dynam ic process f with the controller i
-
197-
A. Straszak
r., r~) 1 1
u . = c .(x ., 1
1
1
must be controllable in the working domain of the state
space
Di~X
due
j
to the given resources
ri .
Now, we can formulate the supervisory controllability: We say that the dynamic process if for any given number
"2.
in the working domain
Di
f is a supervisory controllable i > 0, the dynamic process f . is controllable 1
or working subdomain
j ", r and the interval (. i
resources
point for different
<
X
.
(11) >
01
7.,
D~ t: D. 1
1
due to given
consists of the more than one
"
Example. Let us consider the dynamic process
~~
=
with the working domain x
2(t=T) and
=x
x
2
g ~1x +[~]u <xl(t=O) = 0,
x
xl(t= T) = 0,
2(t=O)=O,
>
5 T
X
o
=
dt = T - minimum time index of performance.
o
Using the functional analysis methods for this optimal control problem
[ 3] , we obtain that for
lu k r ,r
for
~O
IU(tz;)
=
I,
I d"l:;~
S
T
T =
for
o
[u
r = I,
(~)
J
2
dr~
r =1 .
-
198-
A. Stras zak
Therefore this dynam ic proc ess w ith f
o
I has th e supervisory con -
t r olla b il ity . Contr ol. Assume that we have a set of the supervisory co nt r oll a bl e opt imal co nt r ol s ystems , th erefore we have relations
where min
x . 01
x . 01
u.
1
and ve ctor index of pe rforman ce x
og
=
'" '"x , · · · , "..x (x o l' o Z o k)
By introdu cing the g l o ba l index of performance I
g
"-
max
or
x . 01
k
I
g
r:
A
X .
01
i =I
we c a n fo r m ul a t e the o pt i mi z a tio n problem for a supe rvisory c o nt r olle r : Minimize I
g
/'0
x .
max i
0 1
subje ct t o
or Minimi ze
k I
g
L: i =I
A
X .
01
-
199-
A. Straszak
subject to
which can be solved by the supervisory controller by using the mathematical programming machinery
2
7
References 1.
Mesarovic,
M. D . : Advances in multilevel control Proc . IFAC Tokyo Symposium on system Engine.=ring, Tokyo 1965.
2.
Karlin,
Mathematical Methods and Theory in Games, Programming and Economics . Pergamon Press. London 1959.
3.
Kulikowski,
R .
Optimal Control as a Function of P lant Parameters . Arch . Aut. i Tel. n , 2 1961 . Warsaw (Polish).
4.
Kulikowski,
R .
Optimum control of aggregated multilevel system , Proc. III IFAC Congress, 19 6 6. London 196?
5.
Straszak,
A.
On the structure synthes is problem in multilevel control systems. Proc. IFAC Tokyo Symposium on System Engineering . Tokyo 1965.
6.
Straszak,
A.
Multilayer and Multilevel control Structures. Proc. on Neural Network Springer - Verlag . Heidelberg 1968.
7.
Straszak A .
S:
Suboptimal supervisory control, in Functional Analysis and Optimization, New York, Academic Press, 1966.
CENTRO INTERNAZIONALE MATEMATICO ESTIVO (C.I.M.E.)

LECTURES ON CONTROLLABILITY AND OBSERVABILITY

L. WEISS (University of Maryland)

Corso tenuto a Sasso Marconi (Bologna) dal 1 al 9 luglio 1968
CONTENTS

Section
1.  Introduction  205
2.  Existence and Uniqueness of Solutions to Delay-Differential Equations  207
3.  Representation of Solutions to Linear Delay-Differential Equations  209
4.  Definitions of Controllability  212
5.  Controllability of Linear Delay-Differential Systems  213
6.  Local Controllability of Nonlinear Delay-Differential Systems  215
7.  Controllability and Observability for Ordinary Linear and Nonlinear Differential Systems  219
8.  Algebraic Criteria for R^n-Controllability of Linear Time-Varying Delay-Differential and Ordinary-Differential Systems  232
9.  The Pfaffian System Approach to Controllability  244
10. Structure Theory for Linear Differential Systems  251
11. Weighting Patterns, Impulse Responses, Minimal Realizations, and Controllability Theory  272
-
205-
L. Weiss
1.
INTRODUCTION These lecture notes are devoted to a detailed exami na t i on
of the fundamental system-theoretic concepts of Controllability, Observability, Reachabili ty and Determinability, and of the roles they play in certain specific areas of research in modern system cheory.
The mathematical models employed in this study range in
sophistication from linear, constant coefficient, differential equations to nonlinear, time-varying functional-differential Much of the material, especially that dealing'~ith
equations.
functional-differential systems is of very recent origin, and in many cases enables well known older results in controllability theory to be embedded within newer more general ones.
An overall
objective has been to incorporate a certain amount of self-containment and breadth in the presentation.
Nonetheless, various aspe cts of the
subject have not been covered (it might be well to point out the obvious fact that there are as many types of problems one could consider as there are different types of system models) .
Some notabl e
omissions which a prospective lecturer on the subject may wish to fill in are :
(1) Controllability for part ial differential equations
and differential equations on Banach spaces.
(2) The relationship
of controllability to the problem of system stabilization. A series of seven lectures bases on the material in these notes presented by the author at the Centro Intemazionale Matemat i co
-
206-
L. Weiss Estivo at Pontecchio Marconi, Italy, in July, 1968.
am indebted
to Professors E.Bompiani, R. Conti, and G. Evangelisti for the privilege of participating in the C.I.M.E . course on Controllability and Observability.
-
207-
L . Weiss
2.
EXISTENCE AND UNIQUENESS OF SOLUTIONS TO DELAY-DlrFERENTI AL EQUATJONS Alth ough our concern i s with a r es t r ic ted c l ass o f del ay-
differenti al equations, it i s instructive t o present an exis tenc e theorem for a more general class of equations.
The re s ult s, who3e
p roo fs a r e not reproduced he re, are due to Driver [10].
Notat ion i s
as follows.
D
open connected set in
If
x = co l( x ,· . , , x l n)
If
[a,b]
£
n
R
n
,
R
is an interval in
11~II[a,b]
then and
R
max i
Ilxll = ~
:
Ix I 1
n [a ,b] ... R
IIHt)1I a't :Gb C([a,b],K) = class of continuous function s mappin g
,
t hen
= sup
[a ,b]
i nto
Cons ider the de lay-differential s ystem
dx dt = f(t ,x( '»
(2.1)
where
f(t,H '»
n e R
, t c (to' y) , Y (.
t c (to' y), ~ c C([CI,t] , D)
for
Definition 2 . 1-
(i)
i s a cont i nuous function of
t
( Lf )
fo r every con stant
S
£
[to' y)
f for f
is continuous in t
0
(.
t
(.
Y
t
for any
f or some
if
f(t, ~( · » ~ £
is locally lipsch i tzian i n
and evpry compact s et
KS,uC such that
'0
SeD
C([ CI , y] ,D) . ~
if
th ere c xI s t s a
-
208-
L. Weiss
for all
t
E:
[to'S), all
4>1' 4>2
Definition 2.2. 4>(') to
E:
C([a,t].S) .
(i) Given an initial function
C ([a,to],D) , a solution to (1) is a function
< S ~ y , such that
x(t)
E:
x(t) = <jl(t)
satisfies (1) on
for all
t
E:
[Cl,t
x(t;t , 4» o
•
] O
C ([ Cl, rl) ,D)
E:
and
(to'S) (ii) The solution at time
from initial time
x(·)
to
and initial function
4>
t
generated
is denoted by y(t;t , 4» o
This solution is unique if any other solution
is identical to it as far as both are defined. Theorem 2.2. be continuous in
l
Consider the system (2.1) and let
and locally lipschitzian in
W.
Let
Then there exists a unique solution on
[a,S), to < S
~
y,
then for any compact set numbers
to < t
i.e., as
t
< t
l
+ ~
x(t)
2
t
and
such that
Let
D
ljI, and linear in
W
[a,y) .
S
=
x(t;t , 4» o
cannot be increased,
x(t
k)
E:
comes arbitrarily close to
there exists a unique solution interval
S < Y and
x(t)
G ~ D there exists a sequence of real
< •• + S
Corollary 2.3. in
and if
f(t,ljI(·»
Rn
x(t)
and let
D - G, k = 1,2, •.. ,
D or is unbounded. f(t,W('»
be continuous
Then for every =
x(t;t , 4» o
on the entire
-
209-
L. Weiss
3.
REPRESENTATION OF SOLUTION S FOR LI NEAR DELAY- DIFFERENTI AL SYS TH1 ~
In th is s e cti on, whi ch i s b as e d he avi l y on th e work o f Hale and Meyer [12] , we cons i der t he equati ons o f t he f orm
dx dt = f(t,x('»
(3.1)
with in itial function sp ace is linear in
x(')
t - h ::s::t f or a l l
~ £
It
B , all
The con t ro l funct ion
C([t
+ u(t)
o
- h,t o],R
n)
=
and de pends on ly on values of
u(')
L(')
x (s )
f(t, x( '» for
II f( t, ¢(.» II
i s furthe r a ssumed that t . where
B whe r e
s L( t)
II
is cont inuous an d posi tive .
bel ongs to the c lass o f fun cti ons wh ich
are meas urab l e and b ounded on every f i ni t e t ime interval.
Our
objective i s to out l i ne the de rivation of a " vari ati on of par ameters" formula.
We note first th at (3.1) is equivalent to th e functi on al
integral e qua t ion
(3.2)
{
x( t )
~ (t)
x ( t)
Jt
, r c [to - h,t o]
-t
f(s , x(s»ds + 0
I:
u fs ) ds + .p(0 ) ,
>. t
0
0
to whi ch a unique so l ut ion ex is ts by The orem 2 .2 . on
t
The hy potheses
a l l ow appl icat i on of th e Ri esz Repr es en ta tion The ore m t o
e s tab lis h e xi stence of an
n x n
matri x value d f unc tion
n
defi ned
[t-h. t 1
-
210-
L. Weiss
on ~
(_00,00) x [-h,O] £
f(t,~(·»
n(t,T)]~(T)
= fO
[d -h T is of bounded variation on
Moreover, n Ct ;«)
B
each
such that
for all
[-h,O]
for
n) Ll([to,T) ,R
t
range in
Now let denote the space of functions with n R which are Lebes~ue integrable over [to' T) Then we
have Theorem 3.1. (or (3.2» ~
c B.
Let
with control
for all
x(t;to'~'O) + K (t ,»)
f or each W(t,s)
T >. t
and with
o
Then
(3.3)
where
be the solution of (3.1)
x(·,to'~'u)
is defined for
t , and
f:
K(t,s)u(s)ds , t >. to o
s ~ t - h, K(t,')
K(t,s) -- a~=,s)
2 L«t 0 ,t],Rn)
£
00
1 t everyw h ere, were h amos
is the unique solution of the equations
W(t,s) = Oft
for all
t
£
[s - h,s]
(3.4)
{
s
Proof:
Let
f-ho
{d T1U;,T)} W(T + F;,i;)dF; - (t - s)1 , s
W( t ,s)
'S t
T
u(·) c L ([to,t],Rn ) l
is a continuous linear operator mapping
M(u) =6 x(t;to'O,u )
Then n
Ll([to,t],R)
Theorem, there exists an uefined for all
t
:l;
t
o
into n x n
n
R
and
matrix
such that
.
-
211-
L. Weiss
I
t X(t;t ,T)u(T)dT t
It is let
ea~ily
K(t,T)
shown [12] that
=
X is independent of
X(t;to,T) , t c (-00,00) , T c (-,t]
K( t, T)
0
W( t ,»)
- fK(t,T)dT
Then
0
o
for
T
1:
(t,t + h] for
For any
t >. n
'1
W satisfies (3.4) and
and
~
t
and let
(_, 00)
£
Wet ,~)
Hence
o
,
for
0
let t
£
['1 - h,'1]
K is as stated in the theorem .
The linear system to be discussed in some detail from a controllability standpoint is of the form
(3.5)   \frac{dx}{dt} = A(t)\,x(t) + B(t)\,x(t - h) + C(t)\,u(t),

where x(t) ∈ R^n, u(t) ∈ R^p, and A(·), B(·), C(·) are continuous functions. The solution of (3.5) can be represented as in (3.3), and it is easily checked that the function K(t,s) satisfies the partial differential equations [2]

(3.6)   \frac{\partial K(t,s)}{\partial s} = -K(t,s)A(s) - K(t,s+h)B(s+h), \qquad t_0 \le s \le t-h,

        \frac{\partial K(t,s)}{\partial s} = -K(t,s)A(s), \qquad t-h \le s \le t,

        K(t,t) = I.
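The following small simulation, added as an illustration (the coefficient values, step size and control are arbitrary choices, not taken from the text), shows how a solution of a system of the form (3.5) can be generated numerically by stepping forward in time while keeping a buffer of past values for the delayed term x(t - h).

```python
import numpy as np

# Scalar instance of (3.5): dx/dt = a*x(t) + b*x(t-h) + c*u(t), with
# initial function phi(t) = 1 on [-h, 0].  All numbers are illustrative.
a, b, c, h = -1.0, 0.5, 1.0, 1.0
dt, t_end = 0.001, 5.0
u = lambda t: np.sin(t)

n_hist = int(round(h / dt))
steps = int(round(t_end / dt))
x = np.ones(n_hist + steps + 1)        # x[0 : n_hist+1] holds the initial function

for k in range(n_hist, n_hist + steps):
    t = (k - n_hist) * dt
    x_delayed = x[k - n_hist]          # x(t - h), read from the history buffer
    x[k + 1] = x[k] + dt * (a * x[k] + b * x_delayed + c * u(t))

print(x[-1])                           # state at t = t_end
```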
4.
DEFINITIONS OF CONTROLLABILITY Consider the nonlinear delay-differential system
dx dt
(4.1)
= f(t,x(t) ,x(t - h) .uf t )
n x(t) £ R , u(t) £ RP , and
where
u
, t >- to
is measurable and bounded
on every finite time interval (such controls will be called " admiss'ible").
The delay is represented by a real scalar
and it is assumed that for all
t.
l f £ C
in all its arguments and
The initial function space is the space
h > 0 f(t,O,O,O)
0
B as defined
earlier. Definition 4.1 . if for any
£ B
~
control segment independent of if
t
l
- to
th~re
The system (4.1) is exists
t
l
=
n R - controllable
tl($) £(to'oo)
and an admissible
such that x(tl;to,$,u) = O. If t l i s l] n $ , we speak of fixed-time R - controllability, and u[to,t
can be made arbitrarily small we speak of differential
Rn _ controllability. While this definition turns out to be quite useful, it does not reflect the fact that the state space of (4.1) is a function space and that one can conceive of control problems in which the state of the system is to be transferred to a point (or region) in function space. Hence it makes sense to also consider the following definition.
Definition 4.2. The system (4.1) is controllable to the origin with respect to the space of initial functions B if for any φ ∈ B there exist t₁ = t₁(φ) ∈ (t₀,∞) and an admissible control segment u[t₀,t₁+h] such that x(t;t₀,φ,u) = 0 for all t ∈ [t₁, t₁+h].

Although controllability to the origin does not imply controllability to some other point in function space, it is possible to obtain results for the latter problem using an approach similar to that presented in the sequel (see Weiss [23]).
5. CONTROLLABILITY OF LINEAR DELAY-DIFFERENTIAL SYSTEMS

We begin with the following Lemma.

Lemma 5.1. The system (3.5) is Rⁿ-controllable if there exists t₁ such that

(5.1)   rank ∫_{t₀}^{t₁} K(t₁,η) C(η) C′(η) K′(t₁,η) dη = n,

where the prime (′) indicates transpose.

Proof: Let C̄ be the matrix in (5.1) whose rank is n. The Lemma follows by substituting the control u(s) = −C′(s)K′(t₁,s) C̄⁻¹ x(t₁;t₀,φ,0) in (3.3), for then x(t₁;t₀,φ,u) = 0.

The question of the necessity of (5.1) involves the concept defined below.
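Before turning to that concept, the rank condition (5.1) can also be probed numerically without forming K(t,s) explicitly: discretizing the control and simulating (3.5) from the zero initial function gives the linear map u ↦ x(t₁;t₀,0,u), whose numerical rank should equal n when the system is Rⁿ-controllable on [t₀,t₁]. A hedged sketch (the system data are illustrative assumptions):

```python
# Build the matrix of the map u -> x(t1; t0, 0, u) column by column from impulse-like
# basis controls, using a crude Euler simulation of (3.5), and check its rank.
import numpy as np

def reachability_rank(A, B, C, t0, t1, h, dt=1e-2):
    steps = int(round((t1 - t0) / dt))
    lag = int(round(h / dt))
    n = A(t0).shape[0]
    p = C(t0).shape[1]

    def final_state(u_grid):                      # u_grid: (steps, p) piecewise-constant control
        x = np.zeros((steps + 1, n))
        for k in range(steps):
            t = t0 + k * dt
            xd = x[k - lag] if k >= lag else np.zeros(n)   # zero initial function
            x[k + 1] = x[k] + dt * (A(t) @ x[k] + B(t) @ xd + C(t) @ u_grid[k])
        return x[-1]

    cols = []
    for k in range(steps):                        # impulse-like basis controls
        for j in range(p):
            u = np.zeros((steps, p)); u[k, j] = 1.0 / dt
            cols.append(final_state(u))
    M = np.column_stack(cols)
    return np.linalg.matrix_rank(M, tol=1e-8), n

if __name__ == "__main__":
    A = lambda t: np.array([[0.0, 1.0], [0.0, 0.0]])
    B = lambda t: 0.1 * np.eye(2)
    C = lambda t: np.array([[0.0], [1.0]])
    r, n = reachability_rank(A, B, C, 0.0, 2.0, 0.5)
    print(f"numerical rank {r} of the input-to-state map (n = {n})")
```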
Definition 5.1. A system (3.5) is pointwise complete if for each t₁ there exists a set of initial functions φᵢ ∈ B, i = 1,...,n, such that the set {x(t₁;t₀,φᵢ,0)}, i = 1,...,n, forms a basis for Rⁿ.

It is easy to construct an example to show that not all time-varying systems (3.5) are pointwise complete. We conjecture, however, that all constant-coefficient systems of the form (3.5) are pointwise complete.

Lemma 5.2. If the system (3.5) is pointwise complete, then (5.1) is necessary and sufficient for fixed-time Rⁿ-controllability.

Proof: For any φ ∈ B, suppose there exist a fixed time t₁ > t₀ and an admissible control u such that x(t₁;t₀,φ,u) = 0, but (5.1) does not hold. Then there exists a nonzero vector x₁ ∈ Rⁿ such that x₁′K(t₁,s)C(s) = 0 for all s ∈ [t₀,t₁], and hence, from (3.3), x₁′x(t₁;t₀,φ,0) = 0 for every φ ∈ B. By hypothesis, φ can be chosen so that x(t₁;t₀,φ,0) = x₁. Then x₁′x₁ = 0, which is a contradiction.
Theorem 5.3. A system (3.5) is controllable to the origin with respect to B if and only if it is (i) Rⁿ-controllable, and (ii) for each φ ∈ B and for some corresponding t₁ the equation

(5.2)   C(t)u(t) = −B(t)x(t−h; t₀,φ,u)

has an admissible solution u*(·) defined on (t₁, t₁+h).

Proof: The necessity of (i) is obvious. Now, given φ ∈ B, let u[t₀,t₁] be such that x(t₁;t₀,φ,u) = 0. If (5.2) holds, then on the interval (t₁, t₁+h) the system (3.5) becomes ẋ(t) = A(t)x(t). It then follows that x(t) = 0 for t ∈ [t₁, t₁+h].

Conversely, if (3.5) is controllable to the origin with respect to B, then for each φ ∈ B there exist some t₁ > t₀ and a control u[t₀,t₁+h] such that x(t;t₀,φ,u) = 0 for all t ∈ [t₁, t₁+h]. This implies (i), and the uniqueness theorem for delay equations implies (ii).

Remark: The major element in the controllability problem for (3.5) is the solution of (5.2). Clearly, an admissible solution will exist on (t₁, t₁+h) if and only if the right side of (5.2) is in the range of C(t) almost everywhere on the interval. A sufficient condition for the latter to hold is the existence of a p × n matrix D(t) with bounded measurable elements such that B(t) = C(t)D(t) almost everywhere on (t₁, t₁+h).
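The range condition in this Remark can be checked pointwise by least squares: B(t) = C(t)D(t) is solvable exactly when every column of B(t) lies in the range of C(t). A small hedged sketch (the matrices below are illustrative assumptions):

```python
# Solve the least-squares problem C(t)D = B(t) at sampled times and inspect the residual.
import numpy as np

def range_condition(B, C, times, tol=1e-9):
    """Return True if, at each sampled t, every column of B(t) lies in range C(t)."""
    for t in times:
        D, *_ = np.linalg.lstsq(C(t), B(t), rcond=None)
        if np.linalg.norm(C(t) @ D - B(t)) > tol:
            return False
    return True

if __name__ == "__main__":
    C = lambda t: np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])      # n = 3, p = 2
    B_ok = lambda t: np.array([[1.0, 0.0, 1.0],
                               [0.0, 1.0, 1.0],
                               [1.0, 1.0, 2.0]])                      # columns lie in range C(t)
    B_bad = lambda t: np.eye(3)                                       # the identity does not
    ts = np.linspace(0.0, 1.0, 11)
    print(range_condition(B_ok, C, ts), range_condition(B_bad, C, ts))
```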
6. LOCAL Rⁿ-CONTROLLABILITY OF NONLINEAR DELAY-DIFFERENTIAL SYSTEMS

In this section, we generalize the results of Lee and Markus [19] to the case of delay equations.

Definition 6.1. The system (4.1) is locally Rⁿ-controllable to the origin with respect to B if it is Rⁿ-controllable to the origin with respect to a neighborhood N(0) ⊂ B.

Theorem 6.1. A system (4.1) is locally Rⁿ-controllable to the origin with respect to B if its first variation about the zero solution,

(6.1)   dx/dt = A(t)x(t) + B(t)x_d(t) + C(t)u(t),
        A(t) = ∂f/∂x (t,0,0,0),  B(t) = ∂f/∂x_d (t,0,0,0),  C(t) = ∂f/∂u (t,0,0,0),  x_d(t) = x(t−h),

satisfies the condition that there exists t₁ > t₀ such that (5.1) holds.

Proof: We introduce a parameter ξ into the control u and define

(6.2)   u_ξ(t) ≡ u(t,ξ),

where u(t,0) = u⁰(t) ≡ 0 on [t₀,t₁]. Define the Jacobian matrix J(t) by

(6.3)   J(t) = ∂x(t;t₀,φ,u_ξ)/∂ξ |_{ξ=0}.

Since φ = 0, the solution of (4.1) is written as

x(t,ξ) = ∫_{t₀}^{t} f(τ, x(τ), x_d(τ), u_ξ(τ)) dτ.

Then we have

J(t) = ∂x/∂ξ |_{ξ=0} = ∫_{t₀}^{t} [A(τ)J(τ) + B(τ)J(τ−h) + C(τ) ∂u/∂ξ(τ,0)] dτ,

where A, B, C are as given in (6.1). Differentiating, we obtain

(6.4)   J̇(t) = A(t)J(t) + B(t)J(t−h) + C(t) ∂u/∂ξ(t,0),   t₀ ≤ t ≤ t₁ + h.

But, from (6.2), ∂u/∂ξ(t,0) may be chosen so that C(t)∂u/∂ξ(t,0) = C(t)C′(t)K′(t₁,t) for t₀ ≤ t ≤ t₁ and C(t)∂u/∂ξ(t,0) = −B(t)J(t−h) for t₁ < t ≤ t₁ + h. Therefore,

(6.5)   J̇(t) = A(t)J(t) + B(t)J(t−h) + C(t)C′(t)K′(t₁,t).

The solution of (6.5) over the interval [t₀,t₁] is then

(6.6)   J(t) = ∫_{t₀}^{t} K(t,s)C(s)C′(s)K′(t₁,s) ds.

By hypothesis, det J(t₁) ≠ 0, and so application of the implicit function theorem [8] yields a unique continuous map Π such that if φ ∈ N(0;B), then the equation x(t₁;t₀,φ,ξ) = 0 has an admissible solution ξ = Π(φ). This proves Theorem 6.1.

It is worth remarking that the hypothesis of Theorem 6.1 is necessary as well as sufficient in order to yield a nonsingular Jacobian matrix (6.3) at the time t₁. To see this, we need merely note that, regardless of the choice of u(t,ξ), the function J(t₁) would have the form

J(t₁) = ∫_{t₀}^{t₁} K(t₁,s)C(s)Z(s) ds,

where Z(·) ∈ L₂[t₀,t₁]. The necessity of the hypothesis then follows from the Lemma below, which is due to Hermes [13].
Lemma 6.2. Let H(·) be an n × p matrix of functions in L₂[t₀,t₁] for any finite t₁ > t₀. Then a necessary and sufficient condition that there exist a p × n matrix, V(t), of functions in L₂[t₀,t₁] such that ∫_{t₀}^{t₁} H(η)V(η) dη is nonsingular is that

A = ∫_{t₀}^{t₁} H(η)H′(η) dη

is nonsingular.

Proof. Sufficiency is trivial (let V(t) = H′(t)). For necessity, assume there exists V such that ∫_{t₀}^{t₁} H(η)V(η) dη is nonsingular but A is singular. Then there exists a constant vector c ≠ 0 such that c′Ac = 0, and since H(t)H′(t) is positive semidefinite we have c′H(t) = 0 almost everywhere on [t₀,t₁]. But then c′∫_{t₀}^{t₁} H(η)V(η) dη = 0, which contradicts the nonsingularity of ∫_{t₀}^{t₁} H(η)V(η) dη.
7. CONTROLLABILITY AND OBSERVABILITY FOR ORDINARY LINEAR AND NONLINEAR DIFFERENTIAL SYSTEMS

Much of what has already been presented for delay equations is directly applicable to the controllability problem for ordinary differential equations. The special nature of the latter, however, allows somewhat sharper results to be obtained. Our emphasis is on linear equations, although the nonlinear problem is also discussed in the sequel. Consider the system

(7.1)   dx/dt = A(t)x(t) + C(t)u(t),
        y(t) = H(t)x(t),

where x(t) ∈ Rⁿ, u(t) ∈ Rᵖ, y(t) ∈ Rʳ, and A(·), C(·), H(·) are continuous functions of time. We shall find it convenient to refer to the adjoint system to (7.1), defined as the system

(7.2)   dz/dt = −A′(t)z(t) + H′(t)ū(t),
        ȳ(t) = C′(t)z(t).

It is easily shown that if Φ(t,τ) is the transition matrix (matrizant) for (7.1), i.e., if

(7.3)   dΦ(t,τ)/dt = A(t)Φ(t,τ),   Φ(τ,τ) = I,

then Φ′(τ,t) is the transition matrix for (7.2).
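Numerically, Φ(t,τ) and the adjoint relation are easy to check: integrate (7.3) with a Runge–Kutta scheme and compare the adjoint transition matrix with Φ′(τ,t). A hedged sketch (A(t) below is an illustrative assumption):

```python
# Compute Phi(t, tau) by RK4 integration of dPhi/dt = A(t)Phi, Phi(tau,tau) = I,
# and verify that the transition matrix of dz/dt = -A'(t)z equals Phi'(tau, t).
import numpy as np

def transition_matrix(A, tau, t, steps=2000):
    n = A(tau).shape[0]
    Phi, dt, s = np.eye(n), (t - tau) / steps, tau
    for _ in range(steps):
        k1 = A(s) @ Phi
        k2 = A(s + dt / 2) @ (Phi + dt / 2 * k1)
        k3 = A(s + dt / 2) @ (Phi + dt / 2 * k2)
        k4 = A(s + dt) @ (Phi + dt * k3)
        Phi, s = Phi + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4), s + dt
    return Phi

if __name__ == "__main__":
    A = lambda t: np.array([[0.0, 1.0], [-2.0, -0.3 * np.cos(t)]])
    Aadj = lambda t: -A(t).T
    Phi30 = transition_matrix(A, 0.0, 3.0)          # Phi(3, 0) for (7.1)
    Psi30 = transition_matrix(Aadj, 0.0, 3.0)       # adjoint transition matrix from 0 to 3
    # Psi(3,0) should equal Phi(0,3)' = (Phi(3,0)^{-1})'
    print(np.allclose(Psi30, np.linalg.inv(Phi30).T, atol=1e-6))
```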
We present the following definitions.

Definition 7.1. A state x₀ of the system (7.1) is controllable from time τ if there exist a finite t > τ and an admissible input segment u[τ,t] such that the phase (τ,x₀) is transferred to the phase (t,0). Otherwise x₀ is uncontrollable from τ. If every (no) state is controllable from τ, the system is controllable (uncontrollable) from τ. Controllability (uncontrollability) of the system from all τ is denoted by complete controllability (uncontrollability). If t − τ in these definitions can be made arbitrarily small, we speak of differential controllability from τ.

Definition 7.2. A state x₀ of the system (7.1) is reachable at time τ (older terminology: anticausal controllable) if there exist a finite value of time t < τ and an admissible control segment u[t,τ] such that the phase (t,0) is transferred to the phase (τ,x₀). Otherwise x₀ is unreachable at τ. If every (no) state is reachable at τ, the system is reachable (unreachable) at τ. Reachability (unreachability) of the system at all τ is denoted by complete reachability (unreachability). If |t − τ| can be made arbitrarily small, we speak of differential reachability at τ.

Definition 7.3. A state x₀ of the system (7.1) is observable from time τ if, with respect to the adjoint system, that state is controllable from time τ. Otherwise the state is unobservable from τ. Remaining definitions of system observability, unobservability, complete observability, and differential observability follow analogously from Def. 7.1 above.

Definition 7.4. A state x₀ is determinable at time τ if, with respect to the adjoint system, that state is reachable at time τ. Otherwise the state is undeterminable at τ. Remaining definitions of system determinability, undeterminability, complete determinability, and differential determinability follow analogously from Def. 7.2 above.

Remark: In earlier papers of the writer [20], [21], the concepts of "reachability" and "observability" were denoted respectively by "anticausal controllability" and "anticausal observability", while the present definition of "determinability" corresponds to the older definition of "observability". The rationale behind the terminology will soon become apparent.

Now define the "Controllability" and "Determinability" matrices by the relations

(7.4)   C(t,σ) = ∫_{t}^{σ} Φ(t,η) C(η) C′(η) Φ′(t,η) dη,

(7.5)   V(t,σ) = ∫_{t}^{σ} Φ′(η,t) H′(η) H(η) Φ(η,t) dη.

Let R[·] denote the range of [·] and K[·] the kernel of [·]. Then we have

Theorem 7.1. R[C(t,σ)], σ ≥ t, is monotone nondecreasing with increasing σ.

Proof: C(t,σ) is a Gramian matrix and has the property that for any x ∈ Rⁿ, x′C(t,σ₁)x ≤ x′C(t,σ₂)x for σ₁ < σ₂. Hence x ∈ K[C(t,σ₂)] implies x ∈ K[C(t,σ₁)]. Therefore K[C(t,σ₁)] ⊇ K[C(t,σ₂)], and by orthogonal complementation R[C(t,σ₁)] ⊆ R[C(t,σ₂)].

Corollary 7.2. There exists a positive C¹ function μ(t) such that

⋃_{σ>t} R[C(t,σ)] = R[C(t, t+μ(t))].

Corollary 7.3. R[C(t,σ)], σ ≤ t, is monotone nondecreasing with decreasing σ.

Identical results hold with C(t,σ) replaced by V(t,σ). Hence, if we denote the set of states controllable from (reachable at) time t by P_c(t) (P_r(t)), and the set of states determinable at (observable from) time t by Q_d(t) (Q_o(t)), then there exist positive C¹ functions μ(t), ν(t), ω(t), ρ(t) such that

(7.6)   P_c(t) = R[C(t, t+μ(t))],
        P_r(t) = R[C(t, t−ν(t))],
        Q_d(t) = R[V(t, t−ω(t))],
        Q_o(t) = R[V(t, t+ρ(t))].
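The matrices (7.4) and (7.5) lend themselves to direct quadrature once Φ is available. The following hedged sketch (A, C, H are illustrative assumptions) approximates C(t,σ) and V(t,σ) and reports their ranks, which by (7.6) identify the controllable and determinable states:

```python
# Trapezoidal quadrature of the controllability matrix C(t, sigma) and the
# determinability matrix V(t, sigma), with Phi obtained by RK4 integration.
import numpy as np

def transition(A, tau, t, steps=400):
    Phi, dt, s = np.eye(A(tau).shape[0]), (t - tau) / steps, tau
    for _ in range(steps):
        k1 = A(s) @ Phi; k2 = A(s + dt/2) @ (Phi + dt/2*k1)
        k3 = A(s + dt/2) @ (Phi + dt/2*k2); k4 = A(s + dt) @ (Phi + dt*k3)
        Phi, s = Phi + dt/6*(k1 + 2*k2 + 2*k3 + k4), s + dt
    return Phi

def gramians(A, C, H, t, sigma, m=200):
    etas = np.linspace(t, sigma, m + 1)
    n = A(t).shape[0]
    Cg, Vg = np.zeros((n, n)), np.zeros((n, n))
    step = (sigma - t) / m
    for i, eta in enumerate(etas):
        Phi_t_eta = transition(A, eta, t)         # Phi(t, eta)
        Phi_eta_t = np.linalg.inv(Phi_t_eta)      # Phi(eta, t)
        w = (0.5 if i in (0, m) else 1.0) * step  # trapezoid weights
        Cg += w * Phi_t_eta @ C(eta) @ C(eta).T @ Phi_t_eta.T
        Vg += w * Phi_eta_t.T @ H(eta).T @ H(eta) @ Phi_eta_t
    return Cg, Vg

if __name__ == "__main__":
    A = lambda t: np.array([[0.0, 1.0], [0.0, 0.0]])
    C = lambda t: np.array([[0.0], [1.0]])
    H = lambda t: np.array([[1.0, 0.0]])
    Cg, Vg = gramians(A, C, H, 0.0, 1.0)
    print("rank C(0,1) =", np.linalg.matrix_rank(Cg, tol=1e-10))
    print("rank V(0,1) =", np.linalg.matrix_rank(Vg, tol=1e-10))
```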
We can now characterize the concepts of controllability, reachability, determinability, and observability for the system (7.1).

Theorem 7.4. A state x₀ of (7.1) is controllable from (reachable at) time τ if and only if there exists a finite value of time t₁ > τ (t₁ < τ) such that x₀ ∈ R[C(τ,t₁)].

Proof (for controllability only): (Sufficiency): The solution of (7.1) with initial state x₀ and initial time τ is given by

x(t;τ,x₀) = Φ(t,τ)x₀ + ∫_{τ}^{t} Φ(t,η)C(η)u(η) dη.

By hypothesis, there exists z ∈ Rⁿ such that C(τ,t₁)z = x₀. Setting u(η) = −C′(η)Φ′(τ,η)z, and making use of the fact that Φ(t,τ) satisfies the group property Φ(t,σ)Φ(σ,τ) = Φ(t,τ) for all t, σ, τ, we find that x(t₁;τ,x₀) = 0.

(Necessity): Suppose x₀ (≠ 0) is controllable from τ, but x₀ ∉ R[C(τ,t)] for all t > τ. Then there exist a finite value of time t₁ and an admissible control segment u[τ,t₁] such that x(t₁;τ,x₀) = 0, or

(7.7)   Φ(t₁,τ)x₀ + ∫_{τ}^{t₁} Φ(t₁,η)C(η)u(η) dη = 0.

Since x₀ ∉ R[C(τ,t₁)], it follows that x₀ has a nonzero projection x₁ on K[C(τ,t₁)]. By Theorem 7.1, x₁ ∈ K[C(τ,t)] for all t ∈ [τ,t₁], in which case x₁′Φ(τ,η)C(η) = 0 for all η ∈ [τ,t₁]. But then, from (7.7), x₁′x₀ = 0, which implies x₀ ∈ R[C(τ,t₁)], a contradiction.
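The sufficiency argument above is constructive and can be carried out numerically: solve C(τ,t₁)z = x₀ and apply u(η) = −C′(η)Φ′(τ,η)z. A hedged sketch (A, C, and x₀ are illustrative assumptions):

```python
# Steer (tau, x0) to (t1, 0) with the Gramian-based control of Theorem 7.4 (sufficiency).
import numpy as np

def transition(A, tau, t, steps=400):
    Phi, dt, s = np.eye(A(tau).shape[0]), (t - tau) / steps, tau
    for _ in range(steps):
        k1 = A(s) @ Phi; k2 = A(s + dt/2) @ (Phi + dt/2*k1)
        k3 = A(s + dt/2) @ (Phi + dt/2*k2); k4 = A(s + dt) @ (Phi + dt*k3)
        Phi, s = Phi + dt/6*(k1 + 2*k2 + 2*k3 + k4), s + dt
    return Phi

def steer_to_origin(A, C, x0, tau, t1, m=400):
    etas = np.linspace(tau, t1, m + 1)
    n = A(tau).shape[0]
    G = np.zeros((n, n))
    for i, eta in enumerate(etas):                # controllability matrix C(tau, t1)
        P = transition(A, eta, tau)               # Phi(tau, eta)
        w = (0.5 if i in (0, m) else 1.0) * (t1 - tau) / m
        G += w * P @ C(eta) @ C(eta).T @ P.T
    z = np.linalg.solve(G, x0)
    u = lambda eta: -C(eta).T @ transition(A, eta, tau).T @ z
    # simulate (7.1) with this control and report the terminal state
    x, dt = x0.astype(float).copy(), (t1 - tau) / m
    for i in range(m):
        eta = tau + i * dt
        x = x + dt * (A(eta) @ x + C(eta) @ u(eta))
    return x

if __name__ == "__main__":
    A = lambda t: np.array([[0.0, 1.0], [0.0, 0.0]])
    C = lambda t: np.array([[0.0], [1.0]])
    # terminal state is approximately the origin (up to discretization error)
    print(steer_to_origin(A, C, np.array([1.0, 0.0]), 0.0, 1.0))
```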
From Theorem 7.1 and Theorem 7.4 we deduce

Theorem 7.5. A system (7.1) is controllable from (reachable at) time τ if and only if there exists a finite value of time t₁ > τ (t₁ < τ) such that rank C(τ,t₁) = n.

Another useful result is given by

Theorem 7.6. (i) If x ∈ P_c(t) and τ ≤ t, then Φ(τ,t)x ∈ P_c(τ). (ii) If x ∈ P_r(t) and τ ≥ t, then Φ(τ,t)x ∈ P_r(τ).

Proof of (i): By hypothesis there exist t₁ > t and u*[t,t₁] such that x(t₁;t,x) = 0. Then

(7.8)   Φ(t₁,t)x + ∫_{t}^{t₁} Φ(t₁,η)C(η)u*(η) dη = 0.

Let ū(η) = 0 on [τ,t) and ū(η) = u*(η) on [t,t₁]. Then

Φ(t₁,τ)Φ(τ,t)x + ∫_{τ}^{t₁} Φ(t₁,η)C(η)ū(η) dη = Φ(t₁,t)x + ∫_{t}^{t₁} Φ(t₁,η)C(η)u*(η) dη = 0

by (7.8), so that ū transfers the phase (τ, Φ(τ,t)x) to (t₁, 0). The proof of (ii) follows exactly as above.

Now, from Definitions 7.3, 7.4 and Theorem 7.4, we have

Theorem 7.7. A state x₀ of (7.1) is determinable at (observable from) time τ if and only if there exists a finite t₁ < τ (t₁ > τ) such that x₀ ∈ R[V(τ,t₁)].

Corollary 7.8. A system (7.1) is determinable at (observable from) time τ if and only if there exists a finite t₁ < τ (t₁ > τ) such that rank V(τ,t₁) = n.

The dual of Theorem 7.6 is also clear and is given by Theorem 7.9.

The significance of the phrases "observable from" and "determinable at" should now be clear from the nature of (7.1). For if (7.1) is determinable at (observable from) time t₀ (assuming u(·) = 0), the state of the system at time t₀ can be uniquely determined from knowledge of the output y(t) over a finite time interval ending at (beginning from) time t₀. To see this, consider the solution of (7.1) when u(·) = 0, the initial time is t₀, and the initial state is x₀, and assume (7.1) is determinable at t₀. Then, for some t₁ < t₀, we can write

∫_{t₁}^{t₀} Φ′(t,t₀) H′(t) y(t) dt = V(t₀,t₁) x₀.

Application of Theorem 7.9 indicates that the state x₀ can be determined at time t₀.
0
It is also easy to check the following facts about the system (7.1).

1. Controllability from t implies that any state at time t can be transferred to any other state in a finite interval of time beginning with t.

2. If (7.1) is controllable from t and we reverse the ordering of the time scale, the reversed-time system is reachable at t.

3. In general, complete controllability does not imply complete reachability or vice versa. To see this, consider (7.1) in which n = p = r = 1, A(t) ≡ 0, and

C(t) = 0 for t ≤ 0,   C(t) = 1 − cos t for t > 0.

The above system is completely controllable, but is reachable at t only for t > 0.

One can say the following, however.

Proposition 7.10. (i) If a system (7.1) is controllable from τ, then it is reachable at all t ≥ τ + μ(τ), where μ(·) is as defined in Corollary 7.2. (ii) If a system is determinable at τ, it is observable from all t ≤ τ − ω(τ), where ω(τ) is as defined earlier.

Proof: (i) If (7.1) is controllable from τ, then rank C(τ, τ+μ(τ)) = n. Since rank C(ξ,τ) = rank C(τ,ξ) for all ξ and τ, it follows that rank C(τ+μ(τ), τ) = n and hence, by Corollary 7.3, rank C(ξ,τ) = n for all ξ ≥ τ + μ(τ); that is, the system is reachable at all such ξ. (ii) follows from (i) by duality.

Corollary 7.11. Complete differential controllability (observability) is equivalent to complete differential reachability (determinability). [It is therefore obvious that no distinction need be made between the concepts of controllability (observability) and reachability (determinability) for time-invariant systems (7.1).]

We now consider briefly the problem of controllability and determinability for ordinary nonlinear differential systems. The controllability problem can, in fact, be handled in a manner completely analogous to that presented for systems containing time delays (simply let h → 0). Results of a slightly different nature can be obtained for special types of nonlinear equations, and these are discussed in the section dealing with the application of Pfaffian systems to the controllability problem.

The determinability problem warrants more detailed comment, however, and so we consider the problem of giving sufficient conditions for a class of nonlinear ordinary differential systems to be uniformly determinable (defined below). The discussion is based on the work of Al'brekht and Krasovskii [1]. Consider the system

(7.9)   dx/dt = f(t,x(t)),
        y(t) = g(t,x(t)),

where x(t) ∈ Rⁿ, y(t) ∈ Rʳ, f(t,0) = 0, g(t,0) = 0, f and g are analytic functions of x in a neighborhood of x = 0, and the components of f and g respectively can be expanded in a series as follows:

(7.10)  fᵢ(t,x) = Σ_{m=1}^{∞} φᵢ^(m)(t,x),
        gᵢ(t,x) = Σ_{m=1}^{∞} ψᵢ^(m)(t,x),

where φᵢ^(m) and ψᵢ^(m) are mth-order forms in x
wi th continuous
and bounded coefficients.

Definition 7.6. The system (7.9) is uniformly determinable at all t ≥ γ if there exists a function Y : R¹ × Rʳ → Rⁿ such that any trajectory x(t) can be expressed as

(7.11)  x(t) = Y(t, y(t+θ)),   −γ ≤ θ ≤ 0,

for all t ≥ γ > 0.

Remark: The interest in uniform determinability stems from the desire to develop a method of state-vector determination whose practical implementation would involve the taking of measurements over time intervals of fixed length.

The first result of interest characterizes uniform determinability for the linear system (7.1) (with u(·) = 0), and follows from Corollary 7.8 (see Kalman [17]).

Lemma 7.10. The system (7.1) is uniformly determinable at all t ≥ γ if and only if rank V(t, t−γ) = n for all t ≥ γ, where

(7.12)  V(t, t−γ) = ∫_{t−γ}^{t} Φ′(η,t) H′(η) H(η) Φ(η,t) dη

and Φ is the transition matrix corresponding to (7.3).
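Lemma 7.10 also suggests a computational recipe: with u = 0, x(t) = V(t,t−γ)⁻¹ ∫_{t−γ}^{t} Φ′(η,t)H′(η)y(η) dη whenever the rank condition holds. A hedged numerical sketch (A, H, and the true state below are illustrative assumptions):

```python
# Reconstruct the state at time t from the output on [t - gamma, t] via the
# determinability matrix V(t, t - gamma).
import numpy as np

def transition(A, tau, t, steps=400):
    Phi, dt, s = np.eye(A(tau).shape[0]), (t - tau) / steps, tau
    for _ in range(steps):
        k1 = A(s) @ Phi; k2 = A(s + dt/2) @ (Phi + dt/2*k1)
        k3 = A(s + dt/2) @ (Phi + dt/2*k2); k4 = A(s + dt) @ (Phi + dt*k3)
        Phi, s = Phi + dt/6*(k1 + 2*k2 + 2*k3 + k4), s + dt
    return Phi

def reconstruct_state(A, H, y, t, gamma, m=200):
    n = A(t).shape[0]
    V, b = np.zeros((n, n)), np.zeros(n)
    etas = np.linspace(t - gamma, t, m + 1)
    for i, eta in enumerate(etas):
        P = transition(A, t, eta)                 # Phi(eta, t)
        w = (0.5 if i in (0, m) else 1.0) * gamma / m
        V += w * P.T @ H(eta).T @ H(eta) @ P
        b += w * P.T @ H(eta).T @ y(eta)
    return np.linalg.solve(V, b)

if __name__ == "__main__":
    A = lambda t: np.array([[0.0, 1.0], [-1.0, 0.0]])
    H = lambda t: np.array([[1.0, 0.0]])          # observe the first component only
    x_true = np.array([0.7, -0.3])                # state at t = 1
    y = lambda eta: H(eta) @ transition(A, 1.0, eta) @ x_true   # exact, noise-free output
    print(reconstruct_state(A, H, y, t=1.0, gamma=1.0))          # close to x_true
```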
t
The main result is the theorem below.

Theorem 7.11. The system (7.9) is uniformly determinable at all t ≥ γ if its first variation has that property.

Proof: Consider the solution x(·;0,x₀) of (7.9) and suppose x₀ ∈ N(0), a sufficiently small neighborhood of the origin in Rⁿ. Writing x(t) = x(t;0,x₀), we have the following expansion on the interval [t−γ, t]:

(7.13)  x(t+θ) = Φ(t+θ,t)x(t) + Σ_{m=2}^{∞} S^(m)(t,θ,x(t)),   −γ ≤ θ ≤ 0,

where Φ is the transition matrix associated with the first variation of (7.9). The series (7.13) will converge for x(t) sufficiently small. Since

(7.14)  y(t+θ) = g(t+θ, x(t+θ)),   −γ ≤ θ ≤ 0,

substituting (7.13) into (7.14) and expanding yields

y(t+θ) = H(t+θ)Φ(t+θ,t)x(t) + Σ_{m=2}^{∞} P^(m)(t,θ,x(t)),   −γ ≤ θ ≤ 0,

where H(t) = ∂g/∂x (t,0). Now let

(7.15)  a(t) = x(t) + V(t,t−γ)⁻¹ ∫_{−γ}^{0} Φ′(t+θ,t) H′(t+θ) [Σ_{m=2}^{∞} P^(m)(t,θ,x(t))] dθ.

The above equation has the form a = U(a;x,t), and the derivative of U with respect to x is I when a and x are set to zero. Hence, by the implicit function theorem, for any a(t) in a sufficiently small neighborhood of the origin in Rⁿ there is a unique solution x(t) = Ω(a(t),t) of (7.15). The map Ω is analytic in a and in this neighborhood has a representation

x(t) = a(t) + Σ_{m=2}^{∞} a^(m)(t, a(t)),   a(t) = V(t,t−γ)⁻¹ ∫_{−γ}^{0} Φ′(t+θ,t) H′(t+θ) y(t+θ) dθ,

where a^(m) is an mth-order form in a. This exhibits x(t) as a function of y(t+θ), −γ ≤ θ ≤ 0, of the form (7.11), which proves the theorem.
-
232-
L. Weiss
8.
ALGEBRAIC CRITERIA FOR
n
R
CONTROLLABILITY OF LINEAR
DELAY-DIFFERENTIAL AND ORDINARY-DIFFERENTIAL SYSTEMS

The conditions given thus far for controllability rest implicitly on the availability of the kernel matrix K(t,τ). From a computational point of view, especially in the case of delay equations, it is desirable to have controllability criteria which are solely dependent on the coefficients of the differential equation. Such criteria are developed in this section. The results extend some of those recently obtained by Kirillova and Churakova [18].

The system model for the remaining discussion is (3.5), in which the coefficient matrices are assumed to possess (n−1) continuous derivatives. The first theorem yields sufficient conditions for Rⁿ-controllability. Define the matrix

(8.1)   Q(t₁) = [Qⱼᵏ(t₁ − (j−1)h)],   j = 1,...,k,   k = 1,...,n,

where the blocks Qⱼᵏ(t) are defined recursively from A, B, C and their derivatives by (8.2), with Q₁¹(t) = C(t) and Qⱼᵏ(t) ≡ 0 for j = 0 or j > k.
> k •
Theorem 8.1. then (3.5) is
Proof: Fix
t
l
If there exists t such that rank Q(t ~ n • 1 1) n R - controllable to the origin from some finite to < t l
We shall show that rank
Q(t
~
l)
and suppose (5.1) does not hold for any
there exists a nonzero vector for all
z
£
n R
t
such that
implies (5 .1) <
o
t .
Then
z'K(t1,T)C(t)
0
In particular we have the set of equations
o
(8.3)
for all
T
£
rtl - kh, t k
Differentiating
n
(n - 1)
times we obtain, for
l
- (k - l)h) ,
0, 1, . .. ,n
k
0,
-
234-
L. Weiss
o ,
T
E
[t
1
- 11, t ] , 1
i I , ~ . . ,n
As
T
->
t
As
T
->
(t
Setting
k
(8.4) becomes
1
Z'QiCtl)
h)+, (8.4) becomes
l 1
=
0 , i
z'KCtl,t
l
in C8.3) and differentiating
=
.
l, ... ,n
- h)Qi(t Cn - 1)
l
h)
o ,
I , .. . , n .
times a gain yi e l ds
i l , . . . ,n .
As
T
->
Ct
As
T
->
(t
l l
- h)
z'Q~(tl - h)
, (8.5) becomes
0
z' [K(t l ,tl-2h)QiCtl-2h)+K(tl tl-h)Q~Ct1-h) I
2h)+, (8.5) becomes
Continuing in this manner, for
=
k
n
we obtain
That is
Z'Q~Ctl -
(j -
l)h)
From th is it follows that rank
o , QCt
1)
1, ... ,k , k
~
n
1, . . . ,n
.
which proves the th e ore m.
Il.
-
235-
L. Weiss Remarks: Q becomes (8.6)
1. For time-invariant systems with A = 0, the matrix Q becomes

(8.6)   Q = [C, BC, ..., B^{n−1}C].

2. For time-varying systems with B(·) ≡ 0, the matrix Q(t) becomes

Q(t) = [P₀(t), P₁(t), ..., P_{n−1}(t)],   P₀(t) = C(t),   P_{k+1}(t) = −A(t)Pₖ(t) + dPₖ(t)/dt.
To obtain neces sary c onditions for con t r ol labi l i t y of ( 3.5) i s more difficult. and only partial results are available.
We fir st
present s ome results of Kir illova and Cur akov (18) for the ca se where A,B,C
are constant matrices.
r = 1, ...• n ,
Define the matrices
r
Pj,j=1 ..... 2
by
r+l C , PZj- l
pr+l Zj
APr j
Let
(8.7)
(pr+l) )r=l •...• n-l Zj j=l •. •• ,zr-l
P
an d let
p
(pr+l) )r=l, .. .• m-l 2j j=l •...• Zr-l
m
Lemma 8.Z :
If rank
p
m
<
n • then rank
p
m
m
r-l
-
235-
L. Weiss
Proof: all
but rank
k < j
columns
Assume that for
vl •.•.• v
a
• s
P
VK
that any matrix
p s +2
3.
Then
since
p~;l
range of k P 2j+ 1 a
=
an d
AV
BP~
rank P k
f or
which span the range of
s
p S+l j
j
1 •••.• 2
s
P
s
can be
p s +2
s l s = AP + • j = 1 •. ..• 2 , we have 2j-l j
AVK 2 .
But it follows
s 2 p +
Similarly.
spans the range of
• we can easily show that
zj-r
{vl •...• v } a
spans the
p~;2. It then follows. by induction. that every column of pk+l 2j-l' s < k
~
n - 1 • is depen d ent on t h e
vk ' s
Th ere f are.
m • which proves the Lemma. Theorem 8.::
A necessary condition for the time-invariant
system (3.5) to be completely (differentially) is that rank P
=
n
R
- controllable
n
Proof: nonsingular matrix
Suppose rank
m < n.
P
Then there exists an
T such that
TP
where
P
Let
has a representation
{v ••..• v } l a =
P
can be expressed as
2j-l
by hypothesis that
VK
Since
l
>
j
Then there exists a set of
a (m. of
Then any matrix expressed as
l •...• s. rank P
rank Ps + l
s ~
j
has
m rows and rank
P
that the coordinate t ransformation into the form
=
m. x
=
It follows from Lemma 8. 2 Tx
=
[Xl] x2
transforms (3.5)
n x n
-
L. Weiss
237-
(8.8)
It is obvious that (8.8) is nor completely controllable. It is pointed out in [18] that when in (3.5) and Hence. for
n
3
then rank Q = rank P
A.B.C
Q is given by (8.1).
n
~
~
3 • Theorem 8.3 yields necessary and sufficient n R - controllability of time-invariant
conditions for complete systems (3.5).
When B
where
are constant
=
0
the matrix
P
becomes the well known
controllability matrix. [C.AC•••.• An-lC] • for linear time-invariant systems . It is somewhat unfortunate that what appears to be the natural extension of Lemma 8.2 to the time-varying case does not result in an immediately obvious corresponding extension of Theorem 8.3 except if the delay is zero. in which case. one can appl y the following result due to Dolezal [9]. (See Weiss and Falb [25] for an alternate proof as well as further applications of the result to system theory). Lemma 8.4.
Let
defined on the real line. that rank Set) = r
ck
for all
Set) Let
be an r
t
n x n
matrix of
be a nonnegative integer Then there exists an
functions. T(t). nonsingular for all
t . such that
T(t)S(t)
where
Set)
is
r x n
and rank Set)
Now define the matrices
k C
r
for all
t.
n x n
functions <
n
such
matrix of
-
238-
L. Weiss
P
r+l
p~;l(t) where
r = 1, ... ,n
The functions
a(' )
A(t + a(j)h)P~(t)
- (r.) 2j 1
J
-B(t + 8(j)h)P~(t) J
I, and j
l, . •. ,zr-l
and
are integral valued (both domain
8(')
and range consist of natural numbers) and in an appendix to this section, we give an algorithm which yields the inverse image of each point in the range.
pet)
Now let
[pi(t),(p~.\(t - a (j ) h» , (p~l ( t
_ 8(j)h»]r=I, • . . ,n-1 j=l, ••. ,zr-l
and let
P (t ) m
Lemma 8.5.
Let
I
be any interval on which rank
is a constant function of
t
for each
for all
P
t El
then rank
(Remark:
m
k.
If rank
(t) = m for all
That such an interval
I
t
£
Pk(t)
pet) = m I
exists follows from the
fact that a continuous matrix has unchanging rank on a
unio~
nf int ervals
which form an everywhere dense set on the real line). Proof: for all
Assume that for
r < j ,all
j
t El , but rank
1, .. . ,5 ,
P (t) S
rank rank
p. (t ) > rank J
P + (t) s I
P (r )
for a ll
r
t
r I .
-
239-
L. Weiss Then,
there exis ts on interval
,\(t), .. . ,wo(t) , s Ps(t)
for each
t
a c m, of
~ £
J
J
•
Let
and a set of columns
Ps(t), which span the range of
Wet)
=
maximal-row submatrix of any matrix
[wl(t), •• . ,w,,(t)] s+l P (t - (')h) , j j
W(t)K (t ) for all t ._ J . Since l d s+l . s+l = -d p. (r ) - A(t + a(J)h)P . (t), j t J J s+2 have that any matrix P - a(j)h) can 2j_ l(t
Then any 1,. " ,2
5
has the form s+2 P 2 ' let) J-
1, ... ,2
s
, we
be expressed as
But it follows by hypothesis that W(t)K 3(t). at each
t
Then £
shown that at each rank
t
(wl(t), ••• ,wo(t)} Similarly, since
J
(wl(t), •• • ,wo(t)} £
pet)
Wet) - A(t)W(t)
I].
has a representat ion
spans the range of
p~;l(t) = B(t
spans the range of
Ps+2 - ,. (j ) h ) 2j_ 1(t
+ B(j)h)P;(t) , it can be p;;2(t - r (j ) h )
The Lemma then follows by induction, since
m. Unfortunately, the presence of the delay term in (3.5)
mitigates against an immediate generalization of Theorem (8.3) by means of a cons t ruc t Lon of the the type represented by (8.8). However, with the delay term absent fication of setting
B(t)
=0
(h
0) , and with the simpli-
, we obtain a result which. in con -
junction with Lemma 8.4, yields a straightforward generalizati on o f Theorem 8.3 and its proof for the case of ordinary differenti al eq ua ti ons . That is, let
h
=
0
and
B(t)
=0
in (3.5).
With these assumptions,
-
240L. Weiss
Q and
the matrices
P , in (8.1) and (8.9) respectively, coincide.
And so, we h ave
Theorem 8.6:
A necessary condition for the s yst em (3.5)
to be differentially controllable on an interval [t rank
P(t)
=
n
for all
t
£
[t
is that
except possibly for a set whose
l,t 2]
complement is everywhere dense on
t l, 2]
[t
l,t 2].
The direct proof of this result (See Silverman and Meadows (41)) is left as an exercise in ligh t of Lemma 8.4 and the proof of Lemma 8.2.
An indirect proof is obtained as a by-product of the dis-
cussion in the next chapter. Corollary 8.7:
A necessary and sufficient condition for
a time-invariant system (7.1) to be completely controllable is that
rank
(8.10)
[C,AC, ..• ,An-lC]
n •
Now consider (7.1) again and define the matrix where
So(t)
=
H' (r.) , Sk (r)
S(t)
=
[So(t),Sltt)"",Sn_l(L)]
:t Sk_l (r ) + A' (t)Sk_l (t ) •
Then
by duality we have Corollary 8. 8.
A necessary and sufficient condition for
a system (7.1) to be differentially observable (or determinable) on an interval
[t
l,t 2]
is that rank
S(t)
=
n
for all
[t
l,t 2]
except possibly for a set whose complement is everywhere dense on [t
1,t2].
[In the time invariant case, this reduces to rank
[H' ,A'H' , ••. ,An-I' H']
=
n].
-
241-
L. Weiss
An application of Corollary 8.8 which follows directly from a proof of a theorem i n [22) is that for single-input, single-output systems, differential observability implies existence of an input/output differential equation defined on those intervals where rank
Set)
~
n "
An interesting application of Corollary 8.7 is the foll owing. Corollary 8.9.
The scalar time-invariant system
u( t )
(8. 11)
is completely differentially controllable. Proof: time-invariant
The scalar system (8.ll) is equivalent to the n-dimensional system (7.1) where
is the
and
(n -
1)
x (n -
1)
identity.
Corollary 8. 9 can be genera 1 a" ze d as f 0 llaws •
Then (8 .10) holds.
Consider (8.ll) as
a delay equation with initial data available for
LJ')
t " to
as any (nonlinear, time-varying) functional operator such
that trajectories of (8.ll) are in the domain of
Lt (.) • and consider
the system
( 8. 12)
Define
u( t )
.
-
242-
L . Weiss As an example we might consider
TheorelO 8.10 n
R
The system (8.12) is completely differentially
- cant rollab Ie . Proof:
Consider (8Jl) with arbitrary initial condition
and let the control function which transfers some arbitrarily given time be denoted by such that
x*
X
o
to the origin in
u * with the associated trajectory
Then the system (8.12) with initial function = x
along the trajectory
a
~
can also be steered to the origin in x*
by applying the control
* + Lt(x *)(t) u(t)
This result which generalizes a result of Hermes (13) and of Davison et al (7). can be simply interpreted as differential controllability of 'scalar
n
th
order linear differential systems
is invariant under feedback. We shall return to (8. U) for further discussion in the next section.
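For time-invariant coefficients the algebraic criteria of this section reduce to finite rank tests (Corollary 8.7, Corollary 8.8, and, for delay systems, the matrices Q and P built from A, B, and C). A hedged sketch of the two classical tests (the matrices below are illustrative assumptions):

```python
# Kalman rank tests for a time-invariant system (7.1):
# controllability: rank [C, AC, ..., A^{n-1}C] = n;
# observability:   rank [H', A'H', ..., (A')^{n-1}H'] = n.
import numpy as np

def ctrb(A, C):
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ C for k in range(n)])

def obsv(A, H):
    # duality: observability of (A, H) = controllability of (A', H')
    return ctrb(A.T, H.T)

if __name__ == "__main__":
    A = np.array([[0.0, 1.0], [-2.0, -3.0]])
    C = np.array([[0.0], [1.0]])
    H = np.array([[1.0, 0.0]])
    print("controllable:", np.linalg.matrix_rank(ctrb(A, C)) == A.shape[0])
    print("observable:  ", np.linalg.matrix_rank(obsv(A, H)) == A.shape[0])
```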
x
a
-
243-
L. Weiss APPENDIX ALGORITHM FOR GENERATING THE FUNCTIONS To generate
a(j)
AND
8(j) .
a(j):
Let
£
be a nonnegative integer and let
set of (odd) integers
{xekl
(ordered by
»
X
be the
e
such that
a(j )
2j - 1 = x£k , k = 1, ... , (n - 2) ! (n - (£+ 2) !£!
corresponds to Then
n-(Hl)
X£ =
U
{2]1X£_1,k + 1 , k
(n-(lJ+2»! } 1 , .. • , (n - (£ + lJ + I»! ( £ - I)!
]1=1 where
x_ l,k " 0 •
B(j) :
To generate Let
£
be a positive integer and let
{Y£k 1 (ordered by»
of (even) integers cor reapouds to
2j
=
Y£k ' k = 1, ..• ,
(n (n _
Y£ be the set
such that
B(j)
2)! (£ + I»! (£ - I)!
Then
(n -
(n where
_ 1 .
(]1 + 2» ! } lJ»! (£ - 2)!
( ~+
-
244-
L . Weiss
9.
THE PFAFFIAN SYSTEM APPROACH TO CONTROLLABILITY In this section, we present an alternate approach to
the problem of characterizing differential controllability of linear ordinary differential systems, and local controllability of nonlinear systems with control appearing linearly.
Our
discussion stems from the work of Hennes [131. which was, in turn , motivated by the work of Caratheodory [31 and Chow [41.
A novel
aspect is the connection of Pfaffians to Theorem 8.6. Consider the system (7 .1) in which Suppose rank Let
R(t)
functions defined on
be an
C(t) = s = constant on an open int erval (n - s)
I , with rank
R(t) C(t)
(9.1)
Definition 9.1.
p < nand
o
x
n
matrix of continuous
R(t)
for all
=
n - s
for all
t El, su ch that
t El •
The Pfaffian system associat ed with (9 .1)
is the system
(9.2)
R(t) dx - R(t) A(t) x dt
Let the rows of
R be denoted by
o,
defined on
r~
and Le t
1.
r'
I .
denote
an arbitrary nonzero linear combination of those rows with continuous scalar valued coefficients
ai(t) •
I .
That is
-
245-
L. Weiss (9.3)
r' (c)
Definition 9.Z. at the point
tIEl
The Pfaffian system (9.Z) is integrable
if there exists some
r'(t)
such that the
form
(9.4)
r' (t) dx - r'(t) A(t) x dt
is an exact differential in a neighborhood of
t l•
More precisely, integrability of the Pfaffian at l C
implies existence of a that for some
t E [t
l
~(t,x) such
E > 0 ,
It ax for
scalar-valued function
t
t I' 1
(t,x)
+ E)
Remark :
'
It at
r ' (c)
where.£1
ax
=
(t,x)
-r' (r) A(t) x
row ( .£1 )
aX i
Hermes [13] shows that any integrating factor
of (9.4) can be taken as a function only of
t
and can therefore
be incorporated in (9.3). The main resul t we wish to prove is the following. Theorem 9.1. Consider (7.1) and its associated Pfaffian (9.3).
Let the matrices
A, C possess
(n - Z)
and
(n - 1)
continuous derivatives respectively and define the matrix as in (8.6). [tl,t [tl,t
Z] Z]
Q(t)
Then the Pfaffian is nonintegrable on an interval
if and only if rank
Q(t)
=n
which is everywhere dense on
for all [tl,t
Z]
t •
on a subset of
-
246-
L. Weiss Proof:
(Sufficiency):
Suppose there exists
such that (9.2) is defined and integrable at
T).
on an open interval which includes
+
E
"t
such that
2
r'(t)C(t)
=0
~'(t) = -r' (t)A(t)
r' (t)Q(t)
then follows that a subinterval
J.
on
J
such that rank Q(t)
matrix of
Q(t) < n
=m<
t El , where
E >
for
[T ,T
t E
m,s l C
=
0
l,t 2]
C(t)
0
+
constant
=
r' , a with Since
E).
~'(t)C(t) + r'(t)r.(t)
on
IT,T + E).
0
It
exists such that rank
CIT, T + E)
The contrapositive of this establishes sufficiency. (Necessity):
rank
R, and
on the latter interval, we have
on the interval which implies
Q(t) < n
(so rank
T
[t
E
Then there exists
nonzero linear combination of the rows of T
1
n
on
Suppose there exists Then there exists
J
for all
tEl , and rank
are constants.
functions,
C(t)
J
C [t
I C J
=s
l,t2] such that
"m
for all
By Lemma 8 .4 there exists an
n
x
n
T(t) , defined and nonsingular for all
tEl , such that
rQ (t)]
=l
T(t)Q(t)
and
C (t) l
where
Ql(t)
t El .
Now since rank
~
have
Cr(t)
=s
and
m rows and rank
[': (t )
0 I
_s
where
C2~t:)
be an
(n-s) x n
s
n-
(n-s) x n , has rank
[C:"l]! T,
rows and rank
matrix. n - s
l C
matrix
C (t ) 2
l, and
=m
for all
Tl(t)
such that
"f:"l [C;"l]
Then the matrix on
Q (t) l
t El , then by Lemma 8.4,
for all
m x m nonsingular
there exists an
T(t)C(t)
=s R(t)
on
Let:
L •
= KT 2(t)T(t)
R(t)C(t) = 0
on
I.
K
=
[0
is Let
I
n-s I
-
247-
L. Weiss a'
=
r'(t)
[0, •.• ,0,1] (n-dimensional) and for fixed =
a'T(n)i(n,t)
with (7 .1). and
~(t,x)
=
eI
r'(t)C(t)
=
°
Define the scalar valued
Then, for
T
More0ver,
I
tcl
21. at
which implies existence of T
° on
r' (t)
ax
is the transition matrix associated
for all
r'(t)x.
~ (t,x)
integrable at
I
r' (t)Q(t)
= -r' (t)A(t)
r'(t)
function
Then
where
ncl , let
tcl ,
(t ,x)
-r' (t)A(t)x
such that the Pfaffian is
The contrapositive of what we have proved
•
establishes necessity and proves the theorem. Now, the following result was proved by Hermes [13] us ing the controllability matrix (7 .4). Theorem9.Z. controllable on an
A system (9.1) is differentially
int~rval
is nonintegrable on
[tl,t
[tl,t
Z]
if and only if the Pfaffian
Z]
The proof of Theorem 8.6 now follows trivially from Theorems 9.1 and 9.Z. We discuss briefly the applicability of this method tu nonlinear systems
with control appearing linearly.
The modifications
in this case are relatively minor for ordinary differential equations. That is, if the system is of the form
x
=
f(t,x) + C(t,x)u , (t,x) c
V
(a domain in
Then (9.Z), (9.3), (9.4) becomes, respectively
n) R .
-
o
R( t , x) C( t , x)
for all
(t,x) £
o
R(t,x)dx - R(t,x)f(t,x)dt
r' (t; ,x) =
L. Weiss
248-
V
(t,x) £ V
for all
La. (t , x) r: ( t , x)
i
1.
1.
aqd Definition 9.2 is altered accordingly [we now speak of integra~ility
at the point (tl,xl)£V
and of a neighborhood of
although its basic structure is unchanged. by some functional operator structure is unchanged. functional operator
1)],
f(t,x)
is replaced
Mt(x) (which may introduce delays, etc .),
Now, if
Mt(x)
Now, if
(t1x
f(t,x)
is replaced by some
(which may introduce delays, etc.),
the situation becomes more complicated with respect to the definition of the Pfaffian and its integrability.
However, under some
circumstances, the definition given for ordinary differential equations is formally applicable because of the special nature We now give an example of this, by adapting an example given in [13]. Consider the scalar system is equivalent to the
[]
(9 .5)
it
n
where
I
n_ l
is the
n
th
- order system (8.10).
The
n-dimensional system
[ 0
In-I]
~]
- Lt(xl ••• xn) + u
(n - l)x(n - 1)
identity.
Except for the
-
249-
L. Weiss last equation, (9.5) is of a form for which Definitions 9.1 and 9.2 make sense.
Clearly, if
R can be chosen so that
L t
is excluded
from entering the expression for the Pfaffian, then the results in this section become applicable to (9.5) and therefore to (8.10). turns out that
R can be so chosen and we therefore have
Proposition 9.3. nonintegrable for all Proof:
It
The Pfaffian of the system (9.6) is
t.
Proceeding formally, we choose
R(t)
as the
(0
-
matrix R(t)
[I _
0]
n 1
.
The Pfaffian system associated with (9.5) is
o ,
(9.6)
i
1, ...
,0
-
1 .
For (9.6) to be integrable, there must exist scalar-valued functions al(t) , not all zero, such that
dt
is an exact differential.
But then we would have
1)
x
0
-
250-
L. Weiss
from which it follows that
u.(t) 1
o ,
i
1, ..• ,n - 1.
Hence
the Pfaffian is nonintegrable. What we have essentially shown is that nonintegrability of the Pfaffian for scalar
nth - order system is invariant under
feedback. In his paper, Hermes argued that on the basis of results for linear systems it was reasonable to define controllability for nonlinear systems with control appearing linearly in terms of nonintegrability of the Pfaffian for such systems.
Since we proved
n earlier that the system (8. JO) is completely differentially (R -) controllable, then we can state that every system which is presently n known to be completely differentially (R -) controllable has a nonintegrable Pfaffian associated with it.
At this point it appears,
therefore, that Hermes assertion is correct provided the word "controllability" is modified to "differential controllability".
-
251-
L. Weiss 10.
STRUcrURE THEORY FOR LINEAR DIFFERENTIAL SYSTEMS
For a number of years following the introduction of the concept of "s tate" as a fundamen tal quality in sys tern theory and control theory, there existed a certain amount of confusion in the literature concerning the relationship among different mathematical models or representations of the same system.
For
instance, there was the transfer function representation, the impulse response, the input/output differential equation, and the state vector differential equation.
In the case of linear
time-invariant systems described by ordinary differential equations. this confusion was cleared up primarily by R. E. Kalman through the statement and proof of a fundamental theorem on the structure of linear time-invariant control systems [16] which was motivated by the significant earlier work of E. G. Gilbert [11]. W~
shall develop the structure theory within the more
general context of time-varying systems (See Weiss [14]) and discuss some of its implications. Our efforts in this section are limited to a discussion of linear ordinary differential systems of the form (7.1).
For
convenience, (7.1) is some times referred to as "the sys tern
lA(·).
C(·), H(·)}".
A principal tool in our development of this part of the structure theory is the following Corollary to Lemma 8.4.
-
252-
L. Weiss
Corollary 10.1:
Let
Set)
be as i n Lemma 8.4 with the
additional property that it is symmetric . k C
n x n matrix of
functions
Then there e xis t s an
T(t) , nonsingular for a l l
L
,
such that
(lO.D T(t)S(t)T'(t)
S' (c)
where
r x r
is
00]
matrix all
C(t,t + \l(t»
t
for all
t
•
Consider the system (7.1) with controllabil ity and suppose rank
Then there exists a
t
S' (t) = r
and rank
Theorem 10.2.
for all
c
l
C(t ,t + \l(t» = r
c
< n
invertible coo rdinat e transformati on
of the state space of (7.1) with respect to which
(7.1) takes on th e
form
~l (t)
(10.2)
valid for all time , where the system
xl(t) is an
{All(o), Cl(o), Hl(o)}
Proof:
r
c
- vector.
Moreover,
is completely controllable.
Application of Coroll ary 1001 to
shows existence of a continuously differentiable T(t) , nonsingular for all
for
t , such th at
C(t,t + \let»~ n x n
matri x,
-
(10.3)
t
.
is
r
x r
c
c
L. Weiss
:J
T(t)C(t,t + \J(t»T'(t)
'"C( t )
where
253 -
' symmetric, and rank
"-
C(t)
= r
c
for all
The right side of (l0.3) represents the controllability mat r I x
for (7.1) after the transformation
~(t)
= T( t) x( t )
is made.
Hence, by Theorem 7.4, a controllable state in the transformed system '30: 1
has the form
0
where
~l is an r
c
- vector.
we have that the transformed transition matrix
From Theorem 7.6 "-
~
has the form
(independent of arguments)
(10.4)
r
regardless of
c
x r
c
It then follows by equation (7.3) that
"-
t , A has the form
-v A
The above transformed quantities are related to the original by the equations
-v
A( t)
-
254-
L. Weiss
Also, 0.0.3) implies that
T(t)4>(t,n)C(n)C' (n)' (t,Il)T' (t)
for all
t
Choosing
and all n
=
nc[t,t +
~(t)l
where
K(t,n)
is
t , it is clear that the above equation implies
Using the notation
for all
t
which proves the main part of the theorem.
The remainder
follows by a trivial observation. By completely analogous argument,one can prove Theorem 10 .3 : V(t,t - w(t»
Consider (1) with determinability matr ix
and let rank
Then there exists a
l C
V(t,t - w(t)
= rd
<
n
for all
t •
invertible coordinate transformation of
(7.1) such that under this transformation
-
255-
L. Weiss
(10.5)
valid for all the sys tern
t
where
{AU (.), C l
xl(t)
c-r.
Theorem 10.4.
is an
HI (.)}
r
d
- vector.
Furthermore,
f.s completely determinab Ie ,
Consider (7.1) and let the hypotheses of
Theorem 10.2 hold so that (7.1) can be transformed into (10.2) with the transfurmed transition matrix given by (10.4). determinability matrices
VI
and
be
1 C
V2 by
(10.6)
(10.7)
and let
and let
wi(t) , i
1,2
functions that
Define the
-
256-
L . Weiss
r
d1
< r
for all
c
n - r
l C
Then there exists a
c
t
for all
t
•
invertible coordinate transformation,
x(t) = T(t)x(t), defined for all
t , under which the coef f t ci.en t
matrices of (7.1) have the form
aa( t) A A(t)
....
Aba(t)
0 AbbCt)
AacC t)
Aad( t )
Abc(t)
AbdCt)
0
0
ACCCt)
0
0
Adc( t)
0 Add(t)
(10.8)
C( e)
....
o o
Proof: (7.1) into C10.2).
Let
Tl(t)
be the diffeomorphism which transforms
With the resulting' transition matrix given by
(10.4), the determinability matrix for the transformed system is (with arguments omitted)
-
257-
L. Weiss
V(t,a)
From (10.6) and (10.7) we have V
22
is
system
V
{A
Clearly, VI
2
ll
VII
=
VI
and the last term in
is the determinab ility matrix for the
( · ) ,Cl( o),Hl(o)}
in (10 .2) .
By hypothesis, rank
< r for all t . Hence, by Corollary 10.1 c d1 there exists a continuously different iable r x r nonsingular c c
VI (r , t -wl(t» = r
matrix
where
T 2(t)
Vl(t)
such that
is
r
d
x rd
and rank
Vl(t)
=
r
dl Hence, by Theorem 10.3 the transformation 1
1
transforms (7.1) into a form in which
for all
t
•
-
258-
L. Weiss
0]
.
Thus, (7.1) becomes
(10.9) u(t)
+
= A22(t)x2(t)
~2(t)
= Ha (t)x a (t) +
y(t)
where we have
dim xa(t)
H 2(t)x2(t)
= r d and dim xb(t) = r - r d 1 c 1
Now consider the system
{A 0 H 22('), 2(')}
The determinability matrix for this system is rank an
V2 ( t , t (n - r
matrix
c)
T 3(t)
w
2
( t » = rclz< n - r
x (n - r such
c)
c
for all
t
V 2•
in (10.9). By hypothesis,
Then there exists
continuously differentiable, nonsingular
-
259-
L. Weiss
:J where
x r and rank V ( t) = r d 2 d d 2 2 2 The coordinate transformation defined by
V 2(t)
is
r
f or '"
i.e., the system now becomes
( lOJD)
x'c (t )
·d x (t) y( t )
.
f or il l I t .
transforms (10.9) into the desired canonical form in whi ch
o]
t
-
260-
L. Weiss
valid for all
t
where
dim xC(t) = r
dZ
and
dim xd(t) = n - r
c
- r
dZ
This completes the proof of Theorem 10.3 in which the overall coordinat e transformation
T(t)
is given by
Our final theorem in this section, when taken together with Theorems 10.2, 10.3, .10 . 4 , yields the structural decomposition of a given system (7.1).
It is motivated by the possibility that
certain state variables in
x
d
may be determinable as a resul t of
ad, A i.e., the system associated with
the connection
[::J
may
contain a determinable subsystem. Theorem 10.5:
Consider the system (7.1) and let the
hypotheses of Theorem 10.4 hold so that (7.1) can be transformed into (10.10).
Consider the system
0]
(10.11)
and call the corresponding determinability matrix
w (t ) 3
be a
l
c
function such that
V3( t ,a ) •
Let
-
261-
L. Weiss
and let rank for all
t , r
d3
<
n - r
- r
c
d2
+ r
f'1(t,t - "' 3(t»
Then there p xist s a
d1
r
1 C
invertible coordinate transformation which converts (10.10) into
t lu-
form
·b
~ (t)
= ~ba (t)~ a (t) +
~
bb
b
+
(t)~ (t)
~
bc
c
(t)~ (t)
+
bd
~
d
(t) ~ (L) +
(10.12)
·d
~ (t.)
yet)
valid for all time, where dim ~a( t ) d
cIim ~ (r)
r
d1
; dim ~b(t)
n - r Note:
c
- r
d2
r
c
+ rd 1
- r
dim ~c(t)
dl
The general form differs from (10.10) in that, by
means of a further transformation of coordinates plus a regrouping of state variables
(dim
~
c
c
(t) > dim x (t)
and
dim
~
d
(t) <
the feedback coefficient from the system associated with that associated with
~
a
d
to
becomes identically zero.
Proof of Theorem 10.5: system (10.11) be given by
~
Let the transition matrix for the
1
"l
-
262-
L. We iss
where
~aa
aa A
corresponds to
and has dimension
r
d1
x r
Then omitting arguments in the integrand below, we have
dl
or
va a ( t , o ) (l0013)
[
Vaa(t, a)
where
R(t,a)]
R'(t, o)
Q(t, a)
is the determinability matrix for
{Aaa(o), Ca(o), Ha(o)} , so that rank all
t
0
Let
w4(t)
Vaa(t,t - w (t» 1
=
r
d
f or l
be a continuously differenti able fun ct ion s uch
that
U
a< t and let
w(t)
be likewise continuously differentiable such th at
w( t ) for all
t •
R[Q(t . o ) 1
>
max(w.(t», i i
~
1,2, 3,4
-
263 -
L. Weiss
It is e asy to show th at
R[Vaa (t ,t - w e t »~
Hence there e xi sts a matri x
I
R[ R( t ,t - '<J ( t )
K(t)
su ch th at
By Corollary 10 .1 it f ollows th at
K( ' ) c (:1
Now define the ( n - r matrix
T (t ) 4
c
- r
d
+ r 2
d
)
c
I rd
denotes the
1 1 is C
r
c
-
r
d
2
+ rd
)
1
J
- K( t) I n _ r -r
T4 ( r )
-c ( n -
1
b y th e f ormula
(10.14)
whe re
I
r
d
2
x r identity mat r i x. Clea rly, 1 d1 From ( 10 . 13) an d (10.14) we ob ta i n (omi t ti ng a rgumen t s d
on ri gh t hand side)
where
Q 1
Q - R'K
is n onne gati ve def i n ite a n d i t fo llows b y h y po t h e s I s
-
264-
L. Weiss
that rank
Ql(t,t - w(t»
=
r
d
- r 3
d
Applying Corollary 10 .1 let (n - r
c
- r
d
for all 1 T (t)
s
t
.
be an
(n - r
- r
c
d
)
x
Z
continuously differentiable nonsingular matrix such
)
Z
that
o
o where for all
PI (t) t •
- r and rank Pl(t) - r ) x (r d d) d 3 1 3 1 Then the coordinate transformation defined by is
(r
d
where
(10.15)
has the effect of transforming the determinability matrix
03
for the
system (10.11) into the form
(10.16)
-'i
-
:J
]
-
265-
L. Weiss for all
t.
It is easy to check that
T(t)-l
h as
t
lu: " "'"L' fo rm
dS
T(t) , in fact -KT
(10.17)
for all
5l
J
T, t , where the dimensions of the
"0"
are
n - r
c
- r
d2
" rd
Therefore, the transformed state coefficient matrix of (10.11) is given by
•
+ T(t)T{t)
A d(t) a,
and has the form
a, d(t)
A
and the corresponding form, i.e.,
" 4>
a, d(t .. r)
~ransition
matrix
4>
a,
d(t,T)
a l s o has th e
-I
1
-
266-
L. Weiss for all
t,T
•
Now partition
where
; ad 1
where
~ 12
Add
x n - r
is
is
;ad
(n - r
as f o l l ows :
, and
c
- r + r d ) x (r d - r d) d 2 3 1 3 1 remaining matrices are comformab1e ,.ith thi s. c
- r
; dd
and
d
and the
corresponds to a partition of the ve ct or Then the transpo se o f
(10 .19)
<1> '
a,d
a
r····
Add '
; ad ' 1 ; ad ' 2
~ ll
Add ' ~12
''J <1>2 1 Add ' ~2 2
By (10 .16), states which are determinable at any fi xed time under the new c oo r di n a t e system, have the f orm
Xl
(d]
[ wet)
must,
-
267-
L. Weiss
where w(t)
dim xl(t) =
0
r
=
for all
d
3
t.
, dim w(t) = n - r
- r + r ' and d2 d3 d1 From Theorem 7.6 and (10.19) it f oll ows t hat
Add ' 1>12
for all
t.
0
=
Transposing, and using equation (7.3), we f ind that
r
A
for all
~
(t~
- r
c
a,d
aa A
Aa1 d
0
0
Add All
0
0
Add A 2l
Add A 22
t • Before giving the final regrouping of terms, it rema ins
to find the "output" coefficient matrix of (l0.11) under the new coordinate system .
This is given by
H( t )
and so, from
(10.17) (with arguments omitted)
But from Theorem 10.3 it is clear that
H
H takes on th e fo r m
-
268-
L . We iss where
is
We now define th e f oll owin g
quan ti ties :
~
~
a
x
a
~
[;]
c
~
b d
x x
lC = [A CC
"dd J AU
lc = [A dC
~~d ]
~dd
ac
~~d ]
.ta a
It =
[A
b d 2
~C
[HC
H~]
"dd
An =
Aaa ~bb=Aba
and if we denote
then we define
Finally,
~b b
bb ,
= A
The t he orem
is thus proved . It s hou l d be emphasized that our proce du re fo r s t r uc t u ral de composit ion is "symmetric" from a nunber o f point s of view.
For
example, just as The orem 10.3 is a dual r e sult t o Theorem 10.2 , we
-
269-
L. Weis s
could have given a completely dual procedure f or consonant with (10.12).
obtainin ~
a f orm
That is, one can easily write th e duaJ t o
Theorems 10.4 and 10.5 which would begin with the applicati on of Theorem 10.3 and would replace the matrices matrices
C C C l, Z' 3
VI' V ' 03
Z
wi t h
etc.
In addition to all this "dual" symmetry, the resul ts of Section 7 indicate that the same type of structural decompositi on is obtained if "controllability" is replaced by "reachability" and/or "determinability" is replaced by "observability".
Hence,
Theorems 10 .4 and 10.5 as well as their duals are each represent ati ve s for a set four structural decomposition theorems. To avoid confusion in the sequel, our discuss ion and interpretation of the results of this section are given only with reference to the actual procedure adopted i n Theorems 10.4 and 10.5 t o obtain (lO.lZ) .
On the basis of our comments above, the
reader can easily supply the interpretations for all the r ema in i ng approaches. Remarks:
1.
The overall coordinate transf ormat ion whi ch
produces the general structural decomposition of an arbitrary s yst em (7.1) is represented by the matrix
I
r
d
0
T i(t)
1 T
T( t) 0
Ts(t)
-1
J
4(t)
-1
I
0
I
-1
iT 1(
0
T 3( t
-JI )
.J
-
2.
L. Weiss
270-
For the special case when (1) is time-invariant, all applications
of Corollary 10.1) will involve time-invariant transformations so that the procedure given in the proofs of Theorems 10.4 and 10.5 clearly leads to a time-invariant structural decomposition. 3.
Pictorially, the decomposition (10.12) can be viewed as in
Figure I, which shows four interconnected systems
Sa, Sb, sC, Sd
enclosed in "boxes" labelled with their associated state vectors. If, as is natural, we view the interconnecting lines inside the large "box" as input and output lines for the structural components, then the following result is readily discernible from individual examination of each structural component in (10.12) plus reference to the proofs of Theorems 10.4, 10.5. Corollary 10.6: (i)
Sa
is completely con t rollab Le and completely detorminablp
(Lf.)
Sb
is completely controllable and completely undeterminab I e
(iii)
Sc
is completely uncontrollable and completely determinable
(Lv)
Sd
is completely uncontrollable and completely undeterminable
Note:
I f the matrices
A('), C('), H(')
analytic functions of time, the ranks of
in (7.1) are
e(t,t + ~(t», Vi(t,t - wi»'
i = I, 2, 3 , will be constant everywhere in the t-domain.
Hence,
the system-theoretic interpretation of the structural decomposition of a system with analytic coefficients is given by Corollary 10.6.
This
provides the proof for assertions concerning the structural decomposition of analytic systems which were made by Kalman [16] and Weiss and Kalman [21].
It may be of interest to point out that in this special case
the overall coordinate transformation can be taken to be analytic rather than just
Cl •
This follows directly from the proof of Dolezal's Theorem
(Lemma 8.4) given by Weiss and Falb [26].
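For time-invariant systems the first step of this decomposition can be carried out with ordinary linear algebra: pick an orthonormal basis adapted to the range of [C, AC, ..., A^{n−1}C] and change coordinates, which produces the block-triangular form of Theorem 10.2. A hedged sketch (the matrices below are illustrative assumptions):

```python
# One step of the structural decomposition, time-invariant case: in a basis adapted
# to the controllable subspace, A becomes block upper triangular and the lower block
# of C vanishes.
import numpy as np

def controllable_decomposition(A, C, tol=1e-10):
    n = A.shape[0]
    K = np.hstack([np.linalg.matrix_power(A, k) @ C for k in range(n)])
    U, s, _ = np.linalg.svd(K)
    rc = int(np.sum(s > tol * s[0])) if s.size else 0
    T = U.T                                   # rows: basis of range(K), then its complement
    return T @ A @ T.T, T @ C, rc, T

if __name__ == "__main__":
    A = np.array([[1.0, 0.0, 0.0],
                  [1.0, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])
    C = np.array([[1.0], [0.0], [0.0]])
    Abar, Cbar, rc, T = controllable_decomposition(A, C)
    print("r_c =", rc)
    print(np.round(Abar, 6))                  # zero block in the lower-left corner
    print(np.round(Cbar, 6))                  # zeros in the last n - r_c rows
```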
-
271 -
L, We is s
O H ''':
Sa I XO GO
u Gb
'''F
bo
F
OC
bc Sb xb F JSC XC I
1
I
~Fbd
I
I
HC
Y
~ F dC
SdI xd
F IGURE l :
St r u c t u r al Decomposi tio n of a Li nea r Sy st em
-
11.
272-
WEIGHTING PATTERNS, IMPULSE RESPONSES, MINIMAL REALIZATIONS AND CONTROLLABILITY THEORY
Until recently, input/output relations were the most popular means used in engineering textbo oks to represent systems, with the "State" being on l y impli citly considered .
Since an input/output relation is the natural ou t come
of an attempt to model a system from experimental observations, it is of interest to investigate the relationship between input/output represent ations and state-vector represent ations of a s ystem.
In
keeping with the theme of these notes, our discussion centers on the properties of controllab ility, observability, et c. for such representations .
We consider only ordinary linear differential
systems of the form ( 7.1) . The solution to ( 7.1) can be written as
( 11.1)
where
H(t)~(t,tO)xO
y Ir)
X
o
(11.2)
+
I:
W(t,T)u(T)dT o
is the state of the system at time
W(t,T)
H(t) ~(t, T)C(T)
f or all
t
o
and
t, T
and is denoted as the weighting pattern for ~ 7.1) (See Wei ss [20]). Clearly, if
x
o
o ,
then
W(t,T)
contains all the informat i on
needed to compute all input/output pairs of the system .
On thi s
-
273-
L. Weiss
basis we concentrate our s t udy of i npu t /outp u t the function
r ~l ati on s
so le ly un
W(t,T).
[A historical aside:
En gi ne e r s h ave
t
r a d i t Lona l Ly
concerned themselves with inp ut/ output rel ati ons associ a t ed wi t il t he causal impulse response function
(11. 3)
The distin ction betwe en
W c
Wc(t,T)
def ined b y
W( t, T)
,
t
o
,
t <
and
::: T
T
W from a sy stem th e oreti c puint
of view, first discussed by Weiss and Kalman [21], is bri e fl y cons idered later on] . The following definiti uns are pertinent t o th e di s cu s s i on. Definition 11.1. form on an i nt.e rva L columns o f interval
H( ') ~('
(t
l,t 2)
(t ,11)
l,t 2)
A weighting pattern (11 .2)is in r edu ced i f the rows of
( 1l , ' )C ( ' ) an d th e
are linearly independent f un ction s on th e
independent o f
Definition 11.2.
11
A weighting p att ern o f a sy s t em (7 .1)
is globally reduced if it is in r educed f orm on the entire int er val
«-
00 ,00»
of definition of the system. Definition 11.3.
A weight ing p att ern h as the propert y
DCDO (for Differentially Controllable Di ff erenti all y Obser vabl e) on an interval
(t
subinterval of
(t
l,t2) l,t 2)
if it is i n redu ced form on every of pos itive l ength .
-
274-
L. Weiss
Definition 11.4.
The order of a globally reduced weighting
pattern (11.2) is the number of columns of rows of
H(')~("~)
= number
of
4>(Tl,')C(') . Definition 11.5.
weighting pattern
W(t,T)
pattern can be reduced to
A global realization of a globally reduced is a dynamical system (7.1) whose weighting W(t,T) •
Definition 11.6.
If the dimension of the state space of the
realization equals the order of
W(t,T) , the realization is globally
reduced or is minimal. Perhaps the first thing to notice about - Wet or) the problem of obtaining a global realization is trivial.
is that This
follows from the Lemma below. Lemma 11.1.
An
weighting pattern for an if
r x p
matrix function
W(t,T)
is a
n-dimensional system (7.1) if and only
W can be factored as
'I'(t)tl(T)
(11. 4)
where
'I'(t)
is
r x n , O(t)
is
for all
n x p ,and
t,T
'1'(.)
, 0(')
are
continuous functions. Proof: system
(Sufficiency):
{O, O('),'I'(')}
is
(11.4).
(Necessity): 'I'(t)
H(t)4>(t,~)
and
The weighting pattern for the
OCt )
=
Consider (11.1). 4>(Tl,t)C(t)
Letting
yields the desired factorization.
-
275-
L. Weis s
It is als o quite s i mple t o c on s t r uc t " g l ob a l I v r ed uc e d
weighting pattern from a gi ven one, as i n dicate d b y t he p r oo f of the result below .
Every wei ghtin g p at t ern has
Lemma 11. 2.
g .l oh a l Ly
.J
reduced form .
Pr oof : the row s o f
' (. )
Suppose (11.4 ) is n ot gl oball y reduced. and for the colunms of
Suppose the row s of exists a n
n x n
(' ( . )
Then
'P( · ) are dependent on
a re dependent.
nonsingular con s ta n t matr i x
K
Then th ere
such that
K0 ( t )
where the rows of
W( t,.r)
O( · )
are independent over
0
'l'( t ) K- l
[
(_00, 00)
Then
(t)] 0
'l'(t) 0(t )
If the columns of
'l'l ~t)
are not independent over
we introduce a nonsingular con s t an t matrix
where the columns of
qrl1(t)
L
(_00 ,00) ,
su ch that
a r e independent ov e r
(_00,"' )
•
-
276-
L. Weiss
Then, letting
~( t)
L-lO(t)
we get
Wet,,)
(11 .5)
and (11.5) is globally reduced. We now investigate some of the properties of minim al realizations of globally reduced weighting patterns .
The first
result justifies the terminology "minimal". Lemma 11.3. weighting pattern realizations of Proof. Wet,,)
A minimal realization of a globally r :1ured
Wet,,)
has the lowest dimension of all glob al
W(t,')' Suppose the contrary .
Let
n
and consider a global realization of
dimens ion
< n.
be the order of
Wet,,)
with
Clearly, its weigh ting pattern is of order
which contradicts the assumption that
Wet,,)
< n
is globally reduced.
An interesting fact which relates the material on "structure" to that presently being developed is the f ollowing. (See Kalman [15], [16)). Proposition 11.4.
The subsystem
Sa
in Figure 1 is
a minimal realization of the weighting pattern for the overall system. Proof:
The weighting pattern for the system (10.12) is
-
277-
L. Weiss
W(t,T)
where
aa
corresponds to the coefficient matrix
right side of (11.6) is the weighting pattern for globally reduced.
Faa Sa
'111e and is
It is a trivial matter to check that the order
of this weighting pattern is the dimension of Definition 11.7.
Two linear systems
~
a
S,S
of the form
(7.1) are algebraically eguivalent if there exists a nonsingular continuously differentiable matrix
T(t)
T(t)As(t)T(t)
such that
-1'
+ T(t)T(t)
-1
(11. 7)
for all
t . ~,
obvious result is
Proposition 11.5.
Weighting patternsare invariant under
algebraic equivalence. Lemma 11.6.
Lemma 11.6. Points of time from which a system (7.1) is controllable (or observable) or at which a system is reachable (or determinable) are invariant under algebraic equivalence.

Proof (for controllability only; the rest follows analogously): Under algebraic equivalence we have the correspondence

    C(t, t + ν(t))  →  T(t)C(t, t + ν(t))T'(t)

and so rank C(t, t + ν(t)) = n implies rank T(t)C(t, t + ν(t))T'(t) = n.
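The single step in this proof that may not be immediately obvious is that congruence by a nonsingular T preserves rank; the following tiny Python check, with purely illustrative random data, makes the point.

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((4, 2))
    C = B @ B.T                                  # a symmetric, rank-deficient "gramian"
    T = rng.standard_normal((4, 4))              # nonsingular with probability one
    print(np.linalg.matrix_rank(C),
          np.linalg.matrix_rank(T @ C @ T.T))    # both ranks equal 2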
The following result was first stated but not proved in [16] and [21]. A proof was subsequently published by Youla [26]. The proof given below combines that of Youla with one given by Kalman in unpublished notes.

Theorem 11.7. Any two minimal realizations of a given globally reduced weighting pattern are algebraically equivalent.
Proof: It is clear from the proof of Lemma 11.1 that any minimal realization is algebraically equivalent to one with coefficient matrix A(·) ≡ 0 (take T(t) = Φ(τ₁,t) in (11.7)). Hence it suffices to prove that any two minimal realizations {0, Θ₁(·), Ψ₁(·)} and {0, Θ₂(·), Ψ₂(·)} of a given globally reduced weighting pattern W(t,τ) are algebraically equivalent. To do this, first note that

    W(t,τ) = Ψ₁(t)Θ₁(τ) = Ψ₂(t)Θ₂(τ)                             (11.8)

where the columns of the Ψᵢ(·) and the rows of the Θᵢ(·), i = 1, 2, are linearly independent on (−∞,∞). It then follows that there exist finite intervals J, K on which the aforementioned columns and rows, respectively, are linearly independent. [For if not, then on each interval [−k, k], k = 1, 2, ..., there exists a constant vector ξₖ of unit norm such that Ψ₁(t)ξₖ = 0 almost everywhere on [−k, k]. The sequence {ξₖ} has a convergent subsequence with limit ξ₀ of unit norm, and Ψ₁(t)ξ₀ = 0 a.e. in (−∞,∞), thus contradicting the linear independence of the columns of Ψ₁(·).] Hence the matrices

    Mᵢ = ∫_J Ψᵢ'(t)Ψᵢ(t) dt ,    i = 1, 2,
    Nᵢ = ∫_K Θᵢ(t)Θᵢ'(t) dt ,    i = 1, 2,

are nonsingular. Hence, from (11.8), multiplying on the right by Θ₂'(τ) and integrating over K, and multiplying on the left by Ψ₂'(t) and integrating over J, we can write

    Ψ₂(t) = Ψ₁(t)V ,    V = ( ∫_K Θ₁(τ)Θ₂'(τ) dτ ) N₂⁻¹ ,        (11.9)
    Θ₂(τ) = UΘ₁(τ) ,    U = M₂⁻¹ ∫_J Ψ₂'(t)Ψ₁(t) dt .            (11.10)

Now, substituting (11.9) and (11.10) into (11.8) gives Ψ₁(t)Θ₁(τ) = Ψ₁(t)VUΘ₁(τ); multiplying on the left by Ψ₁'(t) and integrating over J, then multiplying on the right by Θ₁'(τ) and integrating over K, yields M₁N₁ = M₁VUN₁, that is,

    Iₙ = VU .

Since U and V are n × n, this gives U = V⁻¹ and Iₙ = UV as well. Therefore {0, Θ₁(·), Ψ₁(·)} is algebraically equivalent to {0, Θ₂(·), Ψ₂(·)} (take the constant matrix T(t) ≡ U in (11.7)), which
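The matrices U and V of the proof are perfectly computable once the Ψᵢ and Θᵢ are known on the finite intervals J and K. The Python sketch below is illustrative only: the two factorizations are manufactured from a single one via an arbitrary constant matrix R, and the quadrature grid is an assumption of the example.

    import numpy as np

    n = 3
    Psi1   = lambda t: np.array([[1.0, t, np.sin(t)]])                    # 1 x n
    Theta1 = lambda t: np.array([[np.exp(-t)], [1.0], [np.cos(2.0*t)]])   # n x 1
    R = np.array([[1.0, 0.5, 0.0],
                  [0.0, 1.0, 0.5],
                  [0.5, 0.0, 1.0]])                                       # nonsingular
    Psi2   = lambda t: Psi1(t) @ R                                        # second minimal
    Theta2 = lambda t: np.linalg.inv(R) @ Theta1(t)                       # factorization of W

    # Midpoint quadrature on the finite intervals J = K = [0, 3].
    grid = np.linspace(0.0, 3.0, 10001)
    mids, d = (grid[:-1] + grid[1:]) / 2.0, grid[1] - grid[0]

    M2 = sum(d * Psi2(t).T @ Psi2(t) for t in mids)                       # int_J Psi2' Psi2
    N2 = sum(d * Theta2(t) @ Theta2(t).T for t in mids)                   # int_K Theta2 Theta2'
    V = sum(d * Theta1(t) @ Theta2(t).T for t in mids) @ np.linalg.inv(N2)    # as in (11.9)
    U = np.linalg.inv(M2) @ sum(d * Psi2(t).T @ Psi1(t) for t in mids)        # as in (11.10)

    # V recovers R, and U V is the identity, up to quadrature error.
    print(np.allclose(V, R, atol=1e-2), np.allclose(U @ V, np.eye(n), atol=1e-2))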
proves the theorem.

As a direct consequence of Theorem 11.7 and Lemma 11.6 we have the statement that all minimal realizations of a given globally reduced weighting pattern have essentially the same behavior with respect to the properties of controllability, reachability, determinability, and observability. We can, of course, go even further, as indicated below.

Theorem 11.8.
Given a globally reduced weighting pattern (11.1), there exist finite values of time t', t'' such that all minimal realizations of (11.1) are controllable (or observable) from all t < t' and are reachable (or determinable) at all t > t''.

Proof: Let W(t,τ) = Ψ(t)Θ(τ), where Ψ(t) = H(t)Φ(t,τ₁) and Θ(τ) = Φ(τ₁,τ)G(τ). Since the rows of Θ(·) (columns of Ψ(·)) are linearly independent over (−∞,∞), by the argument used in the proof of Theorem 11.7 there exists a finite value of time t' (t'') such that the rows of Θ(·) (columns of Ψ(·)) are linearly independent over [t', ∞) ((−∞, t'']). The rest follows from Lemma 11.6.
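Such a finite time can also be located numerically in concrete cases. In the sketch below, the particular Θ is a hypothetical choice made only for the illustration; for a minimal realization with A ≡ 0, controllability from t hinges on the rows of Θ being linearly independent on [t, ∞), which is tested here through the rank of a Gram matrix over [t, Tmax].

    import numpy as np

    n = 2
    Theta = lambda s: np.array([[1.0], [max(0.0, 1.0 - s)]])   # second row vanishes for s >= 1

    def gram(t, Tmax=5.0, steps=5000):
        s = np.linspace(t, Tmax, steps)
        ds = s[1] - s[0]
        return sum(ds * Theta(v) @ Theta(v).T for v in s)

    for t in [-1.0, 0.0, 0.5, 0.9, 1.5]:
        full = np.linalg.matrix_rank(gram(t), tol=1e-8) == n
        print(t, full)          # full rank (hence controllability) only for t < 1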
Corollary 11.9. All minimal realizations of a weighting pattern with the property DCDO are differentially controllable, reachable, determinable, and observable.

We now relate the concept of impulse response to the material developed thus far.
It used to be standard practice in engineering textbooks (especially those concerned with communication theory) to represent linear systems by an input/output expression of the form

    y(t) = ∫_{−∞}^{∞} W_c(t,τ)u(τ) dτ .                          (11.11)

(The infinite limits were apparently motivated by the heavy use of Fourier transforms in communications problems.) The function W_c(t,τ) was normally referred to as the "impulse response" of the system since, if one replaces u(·) by a vector whose i-th component is a Dirac δ-function δ(τ − ξ), then by the formal properties of δ-functions, y(t) becomes the i-th column of W_c(t,ξ).
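A discretized version of (11.11) makes the δ-function interpretation concrete. The sketch below is purely illustrative: the particular W_c, the quadrature grid, and the pulse width are all assumptions of the example; a narrow pulse of unit area in the i-th input component produces, approximately, the i-th column of W_c(t,ξ).

    import numpy as np

    def Wc(t, tau):                                   # r = 2, p = 2, causal: zero for t < tau
        if t < tau:
            return np.zeros((2, 2))
        return np.array([[np.exp(-(t - tau)), 0.0],
                         [np.sin(t - tau),    1.0]])

    def output(u, t, lo=-5.0, steps=4000):
        taus = np.linspace(lo, t, steps)
        d = taus[1] - taus[0]
        return sum(d * Wc(t, tau) @ u(tau) for tau in taus)      # Riemann sum for (11.11)

    # Narrow pulse of unit area in component i, approximating delta(tau - xi).
    xi, i, width = 0.0, 0, 0.05
    pulse = lambda tau: np.array([1.0 / width if (k == i and xi <= tau < xi + width) else 0.0
                                  for k in range(2)])
    print(output(pulse, 2.0))          # close to the i-th column of Wc(2.0, 0.0)
    print(Wc(2.0, 0.0)[:, i])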
Having thus introduced the viewpoint of the physicist into the problem, it was a simple matter to note that, since "physical" systems do not produce responses prior to the introduction of stimuli, one should assume that for physical, or causal, systems the function in (11.11) has the property that W_c(t,τ) = 0 for t < τ. It was perhaps inevitable that this widely taught example of model building via the incorporation of physical principles into an a priori assumed mathematical form (11.11) would be (incorrectly, as it turned out) associated with the mathematical properties of solutions to ordinary differential equations. Nevertheless, there is a problem involved for, in fact, we can obtain information about the weighting pattern W(t,τ) through experiment only for t ≥ τ. It is therefore of some interest to rephrase the results obtained thus far so as to apply to "impulse responses".

The causal impulse response function was defined by (11.3). Of interest also is the anticausal impulse response function, defined as follows.
Definition 11.8. The anticausal impulse response W_a(t,τ) of a system (7.1) with weighting pattern W(t,τ) is defined by

    W_a(t,τ) = W(t,τ) ,   t ≤ τ
             = 0 ,        t > τ .                                (11.12)

Definition 11.9. A realization of a causal (anticausal) impulse response W_c(t,τ) (W_a(t,τ)) is a system (7.1) whose causal (anticausal) impulse response is W_c(t,τ) (W_a(t,τ)).
The following Corollary of Lemma 11.1 is obvious.

Corollary 11.10. An r × p matrix function W_c(t,τ) is a causal impulse response for an n-dimensional system (7.1) if and only if there exist an r × n matrix Ψ(·) and an n × p matrix Θ(·), both defined and continuous for all time, such that

    W_c(t,τ) = Ψ(t)Θ(τ) ,   t ≥ τ
             = 0 ,          t < τ .                              (11.13)
In similar fashion, the anticausal impulse response must have the form

    W_a(t,τ) = Ψ(t)Θ(τ) ,   t ≤ τ
             = 0 ,          t > τ .                              (11.14)

Definition 11.10. A causal impulse response (11.13) is globally reduced if, for some finite τ̄, the rows of Θ(·) are linearly independent over (−∞, τ̄) while the columns of Ψ(·) are linearly independent over [τ̄, ∞).

Definition 11.11. An anticausal impulse response (11.14) is globally reduced if, for some finite ξ, the rows of Θ(·) are linearly independent over [ξ, ∞) while the columns of Ψ(·) are linearly independent over (−∞, ξ).

Definition 11.12. Given a globally reduced causal (anticausal) impulse response (11.13) ((11.14)), the function W(t,τ) = Ψ(t)Θ(τ), defined for all t, τ, is the naturally induced weighting pattern associated with that impulse response. (Note that this weighting pattern must be globally reduced.)
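Definitions (11.13), (11.14) and Definition 11.12 translate directly into code. The Python sketch below uses an arbitrary illustrative factorization Ψ, Θ (not taken from the text) to show the causal and anticausal impulse responses alongside the naturally induced weighting pattern.

    import numpy as np

    Psi   = lambda t: np.array([[1.0, np.sin(t)]])     # r x n
    Theta = lambda t: np.array([[np.exp(-t)], [1.0]])  # n x p

    W  = lambda t, tau: Psi(t) @ Theta(tau)                           # naturally induced pattern
    Wc = lambda t, tau: W(t, tau) if t >= tau else np.zeros((1, 1))   # causal, as in (11.13)
    Wa = lambda t, tau: W(t, tau) if t <= tau else np.zeros((1, 1))   # anticausal, as in (11.14)

    # Many factorizations give the same causal response: if Theta vanishes for tau <= 0,
    # changing Psi on t < 0 does not alter Wc (cf. the Remark which follows).
    print(Wc(2.0, 1.0), Wa(2.0, 1.0))    # W(2,1) and the zero matrix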
Remark: It should be emphasized that many different weighting patterns may be associated with the graph of a given (causal or anticausal) impulse response. For example, let Θ(τ) = 0 for all τ ≤ 0 in (11.13). Then Ψ(t) can be chosen arbitrarily for t < 0 without affecting W_c(t,τ). Definition 11.12 simply identifies the weighting pattern resulting from a choice of Ψ(·) and Θ(·) from the class of possible choices.
Definition 11.13. A realization (7.1) of a globally reduced impulse response is minimal if its dimension equals the order of the weighting pattern naturally induced by that impulse response.

It is obvious from the previous Remark that two minimal realizations of a given globally reduced impulse response may not be algebraically equivalent. One can say the following, however, as a result of Theorem 11.7.

Theorem 11.11. Two minimal realizations of a globally reduced causal (anticausal) impulse response are algebraically equivalent if and only if they have the same anticausal (causal) impulse response.

Our final result touches on the question of what information regarding control-theoretic properties of a system is carried by the system's impulse response, and follows from Definitions 11.10, 11.11, plus earlier results. Let τ̄, ξ be as in Definitions 11.10, 11.11. Then we have

Theorem 11.12. (a) A minimal realization of a globally reduced causal impulse response is reachable at all t > τ̄ and is observable from all t < τ̄.

(b) A minimal realization of a globally reduced anticausal impulse response is controllable from all t < ξ and is determinable at all t > ξ.
BIBLIOGRAPHY

[1]  E. G. Al'brekht and N. N. Krasovskii, "The Observability of a Nonlinear Controlled System in the Neighborhood of a Given Motion", Automation and Remote Control, 25, pp. 934-944, (1965).

[2]  R. Bellman and K. L. Cooke, Differential-Difference Equations, Academic Press, New York, (1963).

[3]  C. Caratheodory, "Untersuchungen über die Grundlagen der Thermodynamik", Math. Ann., 67, pp. 355-386, (1909).

[4]  W. L. Chow, "Über Systeme von linearen partiellen Differentialgleichungen erster Ordnung", Math. Ann., pp. 95-105, (1940).

[5]  D. H. Chyung and E. B. Lee, "Optimal Systems with Time-Delays", Proc. 3rd Congress of the IFAC, London, (1966).

[6]  E. A. Coddington and N. Levinson, Theory of Ordinary Differential Equations, McGraw-Hill, New York, (1955).

[7]  E. Davison, L. Silverman and P. Varaiya, "Controllability of a Class of Nonlinear Time-Variable Systems", IEEE Trans. Auto. Control, AC-12, pp. 791-792, (1967).

[8]  J. Dieudonné, Foundations of Modern Analysis, Academic Press, New York, (1960).

[9]  V. Doležal, "The Existence of a Continuous Basis of a Certain Linear Subspace of E_r which Depends on a Parameter", Čas. Pro Pěst. Mat., 89, pp. 466-468, (1964).

[10] R. Driver, "Existence and Stability of Solutions of a Delay-Differential System", Arch. Rat. Mech. Anal., 10, pp. 401-426, (1962).

[11] E. G. Gilbert, "Controllability and Observability in Multivariable Control Systems", J. SIAM Control, 1, pp. 128-151, (1963).

[12] J. K. Hale and K. R. Meyer, "A Class of Functional Equations of Neutral Type", Mem. Amer. Math. Soc., No. 76, Providence, (1967).

[13] H. Hermes, "Controllability and the Singular Problem", J. SIAM Control, 3, pp. 241-260, (1965).

[14] R. E. Kalman, "On the General Theory of Control Systems", Proc. First IFAC Congress, Butterworth's, London, pp. 481-492, (1960).

[15] R. E. Kalman, Y. C. Ho and K. S. Narendra, "Controllability of Linear Dynamical Systems", Contr. to Diff. Eqs., 1, pp. 189-213, (1962).

[16] R. E. Kalman, "Mathematical Description of Linear Dynamical Systems", J. SIAM Control, 1, pp. 152-192, (1963).

[17] R. E. Kalman, "Contributions to the Theory of Optimal Control", Bol. Soc. Mat. Mex., pp. 102-119, (1960).

[18] F. M. Kirillova and S. V. Curakov, "On the Problem of Controllability of Linear Systems with Aftereffect" (Russian), Diff. Urav., 3, pp. 436-445, (1967).

[19] E. B. Lee and L. Markus, "Optimal Control for Nonlinear Processes", Arch. Rat. Mech. Anal., 8, pp. 36-58, (1961).

[20] L. Weiss and R. E. Kalman, "Contributions to Linear System Theory", Int'l J. Engrg. Sci., 3, pp. 141-171, (1965).

[21] L. Weiss, "Weighting Patterns and the Controllability and Observability of Linear Systems", Proc. Nat. Acad. Sci. (USA), 51, pp. 1122-1127, (1964).

[22] L. Weiss, "The Concepts of Differential Controllability and Differential Observability", J. Math. Anal. and Appl., 10, pp. 442-449, (1965); "Correction and Addendum", J. Math. Anal. and Appl., 13, pp. 577-578, (1966).

[23] L. Weiss, "On the Controllability of Delay-Differential Systems", J. SIAM Control, 5, pp. 575-587, (1967).

[24] L. Weiss, "On the Structure Theory of Linear Differential Systems", J. SIAM Control, (1968).

[25] L. Weiss and P. L. Falb, "Doležal's Theorem, Linear Algebra with Continuously Parametrized Elements, and Time-Varying Systems", Math. Systems Theory, (1969).

[26] D. C. Youla, "The Synthesis of Linear Dynamical Systems from Prescribed Weighting Patterns", J. SIAM Appl. Math., 14, pp. 527-549, (1966).
OTHER PERTINENT REFERENCES
27.
H. A. An tosiewicz, "Linear Con trol Sys tems", Arch. R3l-.. Mech. Anal., 12, pp. 313-324 (1963).
28.
A. V. Balakrishnan, "On the Controllability of Non li n e a r System", Proc. Nat. Acad. Sci. (USA), 55, pp . 465-468, (1966).
29.
R. Conti, "Contributions to Linear Control Theory", J. Difr. Eqs., I, pp , 427-445, (1965).
30.
C. A. Desoer and P. P. Varaiya, "Minimal Realizations of Nonanticipative Impulse Responses", J. SIAM Appl. Math., 15, pp , 754-764, (1967).
31.
H. O. Fattorini, "On Complete Controllability of Linear Systems", J. DiU. Eqs.,
32.
A. Halanay, Differential Equations: Stability, Oscillations, Time Lags, Academic Press, New York, 1966.
33.
B. L. Ho and R. E. Kalman, "Effect ive Construction of StateVariable Models from Input/Output Functions", ~elungstechnik, 12, pp. 545-548, (1966).
34 .
E. Kreindler and P. Sarachik, "On the Concepts of Controllability and Observability of Linear Systems", ~EEE Trans Auto. Control, AC-9, pp. 129-136, (1964).
35.
N. N. Krasovskii, "On the Theory of Controllabilit y and Ob s e r v ab i Li t v of Linear Dynamic Systems", PMM, 28, pp. 3-14, (964).
36.
C. E. Langenhop, "On the Stabilization of Lin ear Systems", Pr oc . Amer. Math. Soc., IS, pp. 735-742, (1964).
37.
J. P. LaSalle, "The Time-Optimal Control Problem", Contributions to the Theory of Nonlinear Oscillations, Vol. V, Princet on Uni v . Press, pp . 1-24, (1960).
38.
~. Markus, "Controllability of Nonlinear Processes", J. SIAM
Control, 3, pp. 78-90, (1965). 39.
L. s. Pontryagin, V. G. Boltyanskii, R. V. Garnkrelidze and E. F. Mishchenko, The Mathematical Theory of Optimal Pro c esse s, Wiley (Interscience), New York, (1962).
-
289-
L. Weiss
40.
D. L . Russell, "Nonharmonic Four i er Se ries in th e Con t ro l The ory of Distributed Parameter Sy stems ", J. Math. Ana l . ~., 18, pp. 542 -560, (1967).
41.
L. Silverman and H, Meadows, "Controllabili t y a n d Observa bi Ii t y in Time-Variable Linear Systems", J. SIAM Con t ro l , 5, pp. 64- 7 3 . (1967) .
42. W. M. Wonham, "Pole Assignment in Mul ti-Input Con t ro llab I e Linear Systems", IEEE Trans. Au t o . Con t ro l , AC-1 2, p p. 660- 66 5, (1967) .