Lecture Notes in Mathematics 1770
Editors: J.-M. Morel, Cachan; F. Takens, Groningen; B. Teissier, Paris
Springer: Berlin, Heidelberg, New York, Barcelona, Hong Kong, London, Milan, Paris, Singapore, Tokyo
Heide Gluesing-Luerssen
Linear Delay-Differential Systems with Commensurate Delays: An Algebraic Approach
Springer
Author: Heide Gluesing-Luerssen, Department of Mathematics, University of Oldenburg, 26111 Oldenburg, Germany. e-mail: gluesing@mathematik.uni-oldenburg.de
Cataloging-in-Publication Data available. Die Deutsche Bibliothek - CIP-Einheitsaufnahme: Glüsing-Lüerssen, Heide: Linear delay-differential systems with commensurate delays: an algebraic approach / Heide Gluesing-Luerssen. Berlin; Heidelberg; New York; Barcelona; Hong Kong; London; Milan; Paris; Tokyo: Springer, 2002 (Lecture notes in mathematics; 1770) ISBN 3-540-42821-6
Mathematics Subject Classification (2000): 93C05, 93B25, 93C23, 13B99, 39B72

ISSN 0075-8434
ISBN 3-540-42821-6 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

Springer-Verlag Berlin Heidelberg New York, a member of BertelsmannSpringer Science+Business Media GmbH
http://www.springer.de

© Springer-Verlag Berlin Heidelberg 2002
Printed in Germany

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Camera-ready TEX output by the author
SPIN: 10856623
Printed on acid-free paper
Preface
The term delay-differential equation was coined to comprise all types of differential equations in which the unknown function and its derivatives occur with various values of the argument. In these notes we concentrate on (implicit) linear delay-differential equations with constant coefficients and commensurate point delays. We present an investigation of dynamical delay-differential systems with respect to their general system-theoretic properties. To this end, an algebraic setting for the equations under consideration is developed. A thorough, purely algebraic study shows that this setting is well-suited for an examination of delay-differential systems from the behavioral point of view in modern systems theory. The central object is a suitably defined operator algebra which turns out to be an elementary divisor domain and thus provides the main tool for handling matrix equations of delay-differential type. The presentation is introductory and mostly self-contained; no prior knowledge of delay-differential equations or (behavioral) systems theory will be assumed. There are a number of people whom I am pleased to thank for making this work possible. I am grateful to Jan C. Willems for suggesting the topic "delay-differential systems in the behavioral approach" to me. Agreeing with him that algebraic methods and the behavioral approach sound like a promising combination for these systems, I started working on the project and had no idea of what I was heading for. Many interesting problems had to be settled (resulting in Chapter 3 of this book) before the behavioral approach could be started. Special thanks go to Wiland Schmale for the numerous fruitful discussions we had, in particular at the beginning of the project. They finally brought me on the right track for finding the appropriate algebraic setting. But also later on, he kept discussing the subject with me in a very stimulating fashion.
His interest in computer algebra made me think about symbolic computability of the Bezout identity, and Section 3.6 owes a lot to his insight on symbolic computation. I wish to thank him for his helpful feedback and criticisms. These notes grew out of my Habilitationsschrift at the University of Oldenburg, Germany. The readers Uwe Helmke, Joachim Rosenthal, Wiland Schmale, and Jan C. Willems deserve special mention for their generous collaboration. I also want to thank the Springer-Verlag for the pleasant cooperation. Finally, my greatest thanks go
to my partner, Uwe Nagel, not only for many hours carefully proofreading all these pages and making various helpful suggestions, but also, and even more, for being so patient, supportive, and encouraging during the time I was occupied with writing the "Schrift".
Oldenburg, July 2001
Heide Gluesing-Luerssen
Table of Contents

1  Introduction .............................................. 1
2  The Algebraic Framework ................................... 7
3  The Algebraic Structure of $\mathcal{H}_0$ ................ 23
   3.1  Divisibility Properties ............................. 25
   3.2  Matrices over $\mathcal{H}_0$ ....................... 35
   3.3  Systems over Rings: A Brief Survey .................. 43
   3.4  The Nonfinitely Generated Ideals of $\mathcal{H}_0$ . 45
   3.5  The Ring $\mathcal{H}$ as a Convolution Algebra ..... 51
   3.6  Computing the Bezout Identity ....................... 59
4  Behaviors of Delay-Differential Systems ................... 73
   4.1  The Lattice of Behaviors ............................ 76
   4.2  Input/Output Systems ................................ 89
   4.3  Transfer Classes and Controllable Systems ........... 95
   4.4  Subbehaviors and Interconnections ................... 104
   4.5  Assigning the Characteristic Function ............... 115
   4.6  Biduals of Nonfinitely Generated Ideals ............. 129
5  First-Order Representations ............................... 135
   5.1  Multi-Operator Systems .............................. 138
   5.2  The Realization Procedure of Fuhrmann ............... 148
   5.3  First-Order Realizations ............................ 157
   5.4  Some Minimality Issues .............................. 162
References ................................................... 169
Index ........................................................ 175
1 Introduction
Delay-differential equations (DDEs, for short) arise when dynamical systems with time-lags are being modeled. Such lags might for instance occur if some nonnegligible transportation time is involved in the system or if the system needs a certain amount of time to sense information and react to it. The characteristic feature of a system with time-lags is that the dynamics at a certain time does not only depend on the instantaneous state of the system but also on some past values. The dependence on the past can take various shapes. The simplest type is that of a constant retardation, a so-called point delay, describing for instance the reaction time of a system. More generally, the reaction time itself might depend on time (or other effects). Modeling such systems leads to differential-difference equations, also called differential equations with a deviating argument, in which the unknown function and its derivatives occur with their respective values at various time instants $t - \tau_k$. A completely different form of past dependence arises if the process under investigation depends on the full history of the system over a certain time interval. In this case a mathematical formulation leads to general functional-differential equations, for instance integro-differential equations. In control theory the term distributed delay, as opposed to point delay, has been coined for this type of past dependence. We will consistently use the term delay-differential equation for differential equations having any kind of delay involved. All the delay-differential equations described above fall into the category of infinite-dimensional systems. The evolution of these systems can be described in a twofold way. On the one hand, the equations can, in certain circumstances, be formulated as abstract differential equations on an infinite-dimensional space.
The space consists basically of all initial conditions, which in this case are segments of functions over a time interval of appropriate length. This description leads to an operator-theoretic framework, well suited for the investigation of the qualitative behavior of these systems. For a treatment of DDEs based on functional analytic methods we refer to the books Hale and Verduyn Lunel [49] and Diekmann et al. [22] for functional-differential equations and to the introductory book Curtain and Zwart [20] on general infinite-dimensional linear systems in control theory. On the other hand, DDEs deal with one-variable functions and can be treated to a certain extent with "analysis on $\mathbb{R}$" and transform techniques. For an investigation of DDEs in this spirit we refer to the books Bellman and Cooke [3], Driver [23], El'sgol'ts and Norkin [28], and Kolmanovskii and
Nosov [65] and the references therein. All the monographs mentioned so far aim at analyzing the qualitative behavior of their respective equations, most of the time with an emphasis on stability theory.
Our interest in DDEs is of a different nature. Our goal is an investigation of systems governed by DDEs with respect to their general control-theoretic properties. To this end, we will adopt an approach which goes back to Willems (see for instance [118, 119]) and is nowadays called the behavioral approach to systems theory. In this framework, the key notion for specifying a system is the space of all possible trajectories of that system. This space, the behavior, can be regarded as the most intrinsic part of the dynamical system. In case the dynamics can be described by a set of equations, it is simply the corresponding solution space. Behavioral theory now introduces all fundamental system properties and constructions in terms of the behavior, that is, at the level of the trajectories of the system and independently of a chosen representation. In order to develop a mathematical theory, one must be able to deduce these properties from the equations governing the system, maybe even find characterizations in terms of the equations. For systems governed by linear time-invariant ordinary differential equations this has been worked out in great detail and has led to a successful theory, see, e. g., the book Polderman and Willems [87]. Similarly for multidimensional systems, described by partial differential or discrete-time difference equations, much progress has been made in this direction, see for instance Oberst [84], Wood et al. [123], and Wood [122]. The notion of a controller, the most important tool of control theory, can also be incorporated in this framework. A controller forms a system itself, thus a family of trajectories, and the interconnection of a to-be-controlled system with a controller simply leads to the intersection of the two respective behaviors. The aim of this monograph is to develop, and then to apply, a theory which shows that dynamical systems described by DDEs can be successfully studied from the behavioral point of view.
In order to pursue this goal, it is unavoidable to understand the relationship between behaviors and their describing equations in full detail. For instance, we will need to know the (algebraic) relation between two sets of equations which share the same solution space. Restricting to a reasonable class of systems, this can indeed be achieved and leads to an algebraic setting, well suited for further investigations. To be precise, the class of systems we are going to study consists of (implicit) linear DDEs with constant coefficients and commensurate point delays. The solutions being considered are in the space of $C^\infty$-functions. Formulating all this in algebraic terms, one obtains a setting where a polynomial ring in two operators acts on a module of functions. However, it turns out that in order to answer the problem raised above, this setting will not suffice, but rather has to be enlarged. More specifically, certain distributed delay operators (in other words, integro-differential equations) have to be incorporated into our framework. These distributed delays have a very specific feature; just like point-delay-differential operators they are determined by finitely many data, in fact they correspond to certain rational
functions in two variables. In order to get an idea of this larger algebraic setting, only a few basic analytic properties of scalar DDEs are needed. Yet, some careful algebraic investigations are necessary to see that this indeed provides the appropriate framework. In fact, it subsequently allows one to draw far-reaching consequences, even for systems of DDEs, so that finally the behavioral approach can be initiated. As a consequence, the monograph contains a considerable part of algebra which in our opinion is fairly interesting by itself. We want to remark that delay-differential systems have already been studied from an algebraic point of view in the seventies, see, e. g., Kamen [61], Morse [79], and Sontag [105]. These papers have initiated the theory of systems over rings, which developed towards an investigation of dynamical systems where the trajectories evolve in the ring itself. Although this point of view leads away from the actual system, it has been (and still is) fruitful whenever system properties concerning solely the ring of operators are investigated. Furthermore it has led to interesting and difficult purely ring-theoretic problems. Even though our approach is ring-theoretic as well, it is not in the spirit of systems over rings, simply because the trajectories live in a function space. Yet, there exist a few connections between the theory of systems over rings and our approach; we will therefore present some more detailed aspects of systems over rings later in the book. We now proceed to give a brief overview of the organization of the book. Chapter 2 starts with introducing the class of DDEs under consideration along with the algebraic setting mentioned above. A very specific and simple relation between linear ordinary differential equations and DDEs suggests to study a ring of operators consisting of point-delay-differential operators as well as certain distributed delays; it will be denoted by $\mathcal{H}$.
In Chapter 3 we disregard the interpretation as delay-differential operators and investigate the ring $\mathcal{H}$ from a purely algebraic point of view. The main result of this chapter will be that the ring $\mathcal{H}$ forms a so-called elementary divisor domain. Roughly speaking, this says that matrices with entries in that ring behave under unimodular transformations like matrices over Euclidean domains. The fact that all operators in $\mathcal{H}$ are determined by finitely many data raises the question whether these data (that is to say, a desired operator) can be determined exactly. We will address this problem by discussing symbolic computability of the relevant constructions in that ring. Furthermore, we will present a description of $\mathcal{H}$ as a convolution algebra consisting of distributions with compact support. In Chapter 4 we finally turn to systems of DDEs. We start with deriving a Galois-correspondence between behaviors on the one side and the modules of annihilating operators on the other. Among other things, this comprises an algebraic characterization of systems of DDEs sharing the same solution space. The correspondence emerges from a combination of the algebraic structure of $\mathcal{H}$ with the basic analytic properties of scalar DDEs derived in Chapter 2; no further analytic study of
systems of DDEs is needed.* The Galois-correspondence constitutes an efficient machinery for addressing the system-theoretic problems studied in the subsequent sections. Therein, some of the basic concepts of systems theory, defined purely in terms of trajectories, will be characterized by algebraic properties of the associated equations. We will mainly be concerned with the notions of controllability, input/output partitions (including causality), and the investigation of interconnections of systems. The latter touches upon the central concept of control theory, feedback control. The algebraic characterizations generalize the well-known results for systems described by linear time-invariant ordinary differential equations. A new version of the finite-spectrum assignment problem, well studied in the analytic framework of time-delay systems, will be given in the algebraic setting. In the final Chapter 5 we study a problem which is known as state-space realization in the case of systems of ordinary differential equations. If we cast this concept in the behavioral context for DDEs, the problem amounts to finding system descriptions which, upon introducing auxiliary variables, form explicit DDEs of first order (with respect to differentiation) and of retarded type. Hence, among other things, we aim at transforming implicit system descriptions into explicit ones. Explicit first-order DDEs of retarded type form the simplest kind of systems within our framework. Of the various classes of DDEs investigated in the literature, they are the best studied and, with respect to applications, the most important ones. The construction of such a description (if it exists) takes place in a completely polynomial setting; in other words, no distributed delays arise. Therefore, the methods of this chapter are different from what has been used previously.
As a consequence and by-product, the construction even works for a much broader class of systems including for instance certain partial differential equations. A complete characterization of systems allowing such an explicit first-order description, however, will be derived only for DDEs. A more detailed description of the contents of each chapter is given in its respective introduction. We close the introduction with some remarks on applications of DDEs. One of the first applications occurred in population dynamics, beginning with the predator-prey models of Volterra in the 1920s. Since population models are in general nonlinear, we will not discuss this area and refer to the books Kuang [66], MacDonald [70], and Diekmann et al. [22] and the references therein. The work of Volterra remained basically unnoticed for almost two decades, and only in the early forties did DDEs get much attention when Minorsky [77] began to study ship stabilization and automatic steering. He pointed out that for these systems the existing delays in the feedback mechanism can by no means be neglected. Because of the great interest in control theory during that time and

* At this point the reader familiar with the paper [84] of Oberst will notice the structural similarity of systems of DDEs to multidimensional systems. We will point out the similarities and differences between these two types of system classes on several occasions later on.
the decades to follow, the work of Minorsky led to other applications and a rapid development of the theory of DDEs; for more details about that period see for instance the preface of Kolmanovskii and Nosov [65] and the list of applications in Driver [23, pp. 239]. It was Myschkis [81] who first introduced a class of functional-differential equations and laid the foundations of a general theory of these systems. Monographs and textbooks that appeared ever since include Bellman and Cooke [3], El'sgol'ts and Norkin [28], Hale [48], Driver [23], Kolmanovskii and Nosov [65], Hale and Verduyn Lunel [49], and Diekmann et al. [22]. A nice and brief overview of applications of DDEs in engineering can be found in the book Kolmanovskii and Nosov [65], from which we extract the following list. In chemical engineering, reactors and mixing processes are standard examples of systems with delay, because a natural time-lag arises due to the time the process needs to complete its job; see also Ray [89, Sec. 4.5] for an explicit example given in transfer function form. Furthermore, any kind of system where substances, information, or energy (wave propagation in deep space communication) is being transmitted over certain distances experiences a time-lag due to transportation time. An additional time-lag might arise due to the time needed for certain measurements to be taken (ship stabilization) or for the system to sense information and react to it (biological models). A model of a turbojet engine, given by a linear system of five first-order delay equations with three inputs and five to-be-controlled variables, can be found in [65, Sec. 1.5]. Moreover, a system of fifth-order DDEs of neutral type arises as a linear model of a grinding process in [65, Sec. 1.7]. Finally, we would like to mention a linearized model of the Mach number control in a wind tunnel presented in Manitius [75].
The system consists of three explicit equations of first order with a time-delay occurring in only one of the state variables but not in the input channel. In that paper the problem of feedback control for the regulation of the Mach number is studied, and various feedback controllers are derived by transfer function methods. This problem can be regarded as a special case of the finite-spectrum assignment problem and can therefore also be solved within our algebraic approach developed in Section 4.5. Our procedure leads to one of the feedback controllers (in fact, the simplest and most practical one) derived in [75].
2 The Algebraic Framework for Delay-Differential Equations
In this chapter we introduce the specific class of delay-differential equations we are interested in and derive some basic, yet important, properties. In this way we hope to make clear that, and how, the algebraic approach we are heading for depends only on a few elementary analytic properties of the equations under consideration. The fact that we can indeed proceed by mainly algebraic arguments results from the structure of the equations under consideration together with the type of problems we are interested in. To be precise, we will restrict to linear delay-differential equations with constant coefficients and commensurate point delays on the space $C^\infty(\mathbb{R}, \mathbb{C})$. We are not aiming at solving these equations and expressing the solutions in terms of (appropriate) initial data. For our purposes it will suffice to know that the solution space of a DDE (without initial conditions), i. e. the kernel of the associated delay-differential operator, is "sufficiently rich". In essence, we need some knowledge about the exponential polynomials in the solution space; hence about the zeros of a suitably defined characteristic function in the complex plane. Yet, in order to proceed by algebraic means, the appropriate setting has to be found first. The driving force in this direction is our goal to handle also systems of DDEs, in other words, matrix equations. In this chapter we will develop the algebraic context for these considerations. Precisely, a ring of delay-differential operators acting on $C^\infty(\mathbb{R}, \mathbb{C})$ will be defined, comprising not only the point-delay-differential operators induced by the above-mentioned equations but also certain distributed delays which arise from a simple comparison of ordinary differential equations and DDEs. It is by no means clear that the so-defined operator ring will be suitable for studying systems of DDEs. That this is indeed the case will turn out only after a thorough algebraic study in Chapter 3.
In the present chapter we confine ourselves to introducing that ring and providing some standard results about DDEs necessary for the later exposition. In particular, we will show that the delay-differential operators under consideration are surjective on $C^\infty(\mathbb{R}, \mathbb{C})$.
As the starting point of our investigation, let us consider a homogeneous, linear DDE with constant coefficients and commensurate point delays, that is, an equation of the type
$$\sum_{i=0}^{N} \sum_{j=0}^{M} p_{ij}\, f^{(i)}(t - jh) = 0, \qquad t \in \mathbb{R},$$
where $N, M \in \mathbb{N}_0$, $p_{ij} \in \mathbb{R}$, and $h > 0$ is the smallest length of the point delays involved. Hence all delays are integer multiples of the constant $h$, thus commensurate. For our purposes it suffices to assume the smallest delay to be of unit length, which can easily be achieved by rescaling the time axis. Therefore, from now on we will only be concerned with the case $h = 1$ and the equation above reads as

$$\sum_{i=0}^{N} \sum_{j=0}^{M} p_{ij}\, f^{(i)}(t - j) = 0, \qquad t \in \mathbb{R}. \qquad (2.1)$$
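As a concrete illustration (added here, not part of the original text), the simplest nontrivial instance of (2.1) is the pure unit-delay equation:

```latex
% N = M = 1 with p_{10} = 1, p_{01} = -1, and p_{00} = p_{11} = 0:
\dot f(t) - f(t-1) = 0, \qquad t \in \mathbb{R}.
```

This example will reappear below; its characteristic function already exhibits the typical feature of infinitely many zeros.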
It will be important for our setting that the equation is considered on the full time axis $\mathbb{R}$. Moreover, we are not imposing any kind of initial conditions but rather focus on the solution space in $\mathcal{L} := C^\infty(\mathbb{R}, \mathbb{C})$:

$$\mathcal{B} := \{ f \in \mathcal{L} \mid (2.1) \text{ is satisfied} \}.$$
The choice $\mathcal{L} = C^\infty(\mathbb{R}, \mathbb{C})$
(like e. g. delays of length $1$ and $\sqrt{2}$ or $\pi$) leads to serious obstacles preventing an algebraic approach similar to the one to be presented here; see [47, 109, 111, 26]. At this point we only want to remark that in the general case the corresponding operator ring lacks the advantageous algebraic properties which will be derived for our case in the next chapter. These differences will be pointed out in some more detail in later chapters (see 3.1.8, 4.1.15, 4.3.13).

Remark 2.2
In the theory of DDEs one distinguishes equations of retarded, neutral, and advanced type. These notions describe whether or not the highest derivative in, say, (2.1) occurs with a delayed argument. Precisely, Equation (2.1) is called retarded if $p_{N0} \neq 0$ and $p_{Nj} = 0$ for $j = 1, \ldots, M$; it is said to be neutral if $p_{N0} \neq 0$ and $p_{Nj} \neq 0$ for some $j > 0$; and advanced in all other cases, see [28, p. 4]. This classification is relevant when solving initial value problems in forward direction. Roughly speaking, it reflects how much differentiability of the initial condition on $[0, M]$ is required for (2.1) to be solvable in forward direction; see for instance the results [3, Thms. 6.1, 6.2, and the transformation on p. 192]. Since we are dealing with infinitely differentiable functions and, additionally, require forward and backward solvability, these notions are not really relevant for our purposes.
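Two small instances, added here purely for illustration, show the dichotomy of Remark 2.2 in the first-order case $N = 1$:

```latex
% retarded: the highest derivative occurs only without delay
\dot f(t) = f(t-1)
    \quad (p_{10} = 1 \neq 0,\ p_{11} = 0),
% neutral: the highest derivative also occurs with a delayed argument
\dot f(t) + \dot f(t-1) = f(t)
    \quad (p_{10} \neq 0 \text{ and } p_{11} \neq 0).
```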
Let us now rewrite Equation (2.1) in terms of the corresponding operators. Introducing the forward shift $\sigma$ of unit length,

$$(\sigma f)(t) := f(t-1), \qquad t \in \mathbb{R},$$

where $f$ is a function defined on $\mathbb{R}$, and the ordinary differential operator $D$, Equation (2.1) reads as

$$p(D, \sigma) f = 0, \qquad (2.2)$$

where

$$p(D, \sigma) = \sum_{i=0}^{N} \sum_{j=0}^{M} p_{ij}\, D^i \sigma^j$$
is a polynomial in the two commuting operators $D$ and $\sigma$. The solution space is simply

$$\mathcal{B} := \ker p(D, \sigma) \subseteq \mathcal{L}. \qquad (2.3)$$

For notational reasons, which will become clear in a moment, it will be convenient to have an abstract polynomial ring $\mathbb{R}[s, z]$ with algebraically independent elements $s$ and $z$ at our disposal. (The names chosen for the indeterminates should remind of the Laplace transform variable $s$ of the differential operator $D$ and the variable $z$ of the $z$-transform of the shift operator in discrete-time systems.) Since the shift $\sigma$ is a bijection on $\mathcal{L}$, it will be advantageous to introduce even the (partially) Laurent polynomial ring

$$\mathbb{R}[s, z, z^{-1}] = \Big\{ \sum_{i=0}^{N} \sum_{j=m}^{M} p_{ij}\, s^i z^j \;\Big|\; m, M \in \mathbb{Z},\ N \in \mathbb{N}_0,\ p_{ij} \in \mathbb{R} \Big\}.$$
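For example (an illustration added here, not in the original), a Laurent polynomial with a negative power of $z$ acts via the backward shift $\sigma^{-1}$, $(\sigma^{-1} f)(t) = f(t+1)$:

```latex
q = s z^{-1} + 2 \in \mathbb{R}[s, z, z^{-1}], \qquad
\big(q(D,\sigma) f\big)(t) = \dot f(t+1) + 2 f(t).
```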
Associating with each Laurent polynomial the delay-differential operator (including possibly backward shifts), we obtain the ring embedding

$$p \longmapsto p(D, \sigma) \qquad (2.4)$$
(of course, if $p$ is a nonzero polynomial, then the operator $p(D, \sigma)$ is not the zero operator on $\mathcal{L}$). In other words, the operators $D$ and $\sigma$ are algebraically independent over $\mathbb{R}$ in the ring $\mathrm{End}_{\mathbb{C}}(\mathcal{L})$. Put yet another way, $\mathcal{L}$ is a faithful module over the commutative operator ring $\mathbb{R}[D, \sigma, \sigma^{-1}]$. Let us now look for exponential functions $e^{\lambda \cdot}$ in the solution space (2.3). Just like for ODEs one has for $\lambda \in \mathbb{C}$

$$p(D, \sigma)(e^{\lambda \cdot})
= \Big( \sum_{i=0}^{N} \sum_{j=m}^{M} p_{ij}\, D^i \sigma^j \Big)(e^{\lambda \cdot})
= \Big( \sum_{i=0}^{N} \sum_{j=m}^{M} p_{ij}\, \lambda^i e^{-\lambda j} \Big) e^{\lambda \cdot}
= p(\lambda, e^{-\lambda})\, e^{\lambda \cdot}. \qquad (2.5)$$
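As an illustration added here (not in the original), consider again $p = s - z$, i.e. the equation $\dot f(t) = f(t-1)$. Formula (2.5) gives

```latex
p(D,\sigma)(e^{\lambda \cdot}) = \big(\lambda - e^{-\lambda}\big)\, e^{\lambda \cdot},
```

so $e^{\lambda \cdot}$ is a solution exactly when $\lambda = e^{-\lambda}$, a transcendental equation with infinitely many complex solutions.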
Hence the exponential function $e^{\lambda \cdot}$ is a solution if and only if $\lambda$ is a zero of the function $p(s, e^{-s})$, which therefore will be called the characteristic function of the delay-differential equation $p(D, \sigma) f = 0$. Obviously, it is an entire function, known as an exponential polynomial (or quasi-polynomial). Before providing some more details on exponential polynomials, we want to fix some notation.

Definition 2.3
(1) Denote by $H(\mathbb{C})$ the ring of entire functions and by $M(\mathbb{C})$ the field of meromorphic functions on $\mathbb{C}$.
(2) For $f \in H(\mathbb{C})$ let $V(f) := \{ \lambda \in \mathbb{C} \mid f(\lambda) = 0 \}$ denote its variety of zeros.
(3) For $q = \frac{p}{\phi} \in \mathbb{R}(s)[z, z^{-1}]$, where $p = \sum_{i=0}^{N} \sum_{j=m}^{M} p_{ij} s^i z^j \in \mathbb{R}[s, z, z^{-1}]$ and $\phi \in \mathbb{R}[s] \setminus \{0\}$, denote by $q^* \in M(\mathbb{C})$ the function

$$q^*(s) = \frac{\sum_{i=0}^{N} \sum_{j=m}^{M} p_{ij}\, s^i e^{-js}}{\phi(s)} = \frac{p(s, e^{-s})}{\phi(s)} \qquad \text{for } s \in \mathbb{C}.$$

In case $q^*$ is entire, we call the set $V(q^*)$ the characteristic variety and its elements the characteristic zeros of $q$.
(4) For $f \in H(\mathbb{C})$ and $\lambda \in \mathbb{C}$ let

$$\mathrm{ord}_\lambda(f) := \min\{ k \in \mathbb{N}_0 \mid f^{(k)}(\lambda) \neq 0 \}$$

denote the multiplicity of $\lambda$ as a zero of $f$. If $f^{(k)}(\lambda) = 0$ for all $k \in \mathbb{N}_0$, we put $\mathrm{ord}_\lambda(f) = \infty$.
Part (1) of the next proposition is standard in the theory of DDEs. Just like for ODEs, the multiplicities of the characteristic zeros correspond to exponential monomials in the solution space. As a simple consequence we include the fact that delay-differential operators are surjective on the space of exponential polynomials.

Proposition 2.4
Let $p \in \mathbb{R}[s, z, z^{-1}] \setminus \{0\}$.
(1) For $k \in \mathbb{N}_0$ and $\lambda \in \mathbb{C}$ denote by $e_{k,\lambda} \in \mathcal{L}$ the exponential monomial $e_{k,\lambda}(t) = t^k e^{\lambda t}$. Then

$$p(D, \sigma)\, e_{k,\lambda} = \sum_{\kappa=0}^{k} \binom{k}{\kappa} (p^*)^{(\kappa)}(\lambda)\, e_{k-\kappa,\lambda}.$$

In particular, $e_{k,\lambda} \in \ker p(D, \sigma)$ if and only if $\mathrm{ord}_\lambda(p^*) > k$. The function $p^* \in H(\mathbb{C})$ is called the characteristic function of the delay-differential operator $p(D, \sigma)$.
(2) The operator $p(D, \sigma)$ is a surjective endomorphism on the space of exponential polynomials $\mathcal{B} := \mathrm{span}_{\mathbb{C}} \{ e_{k,\lambda} \mid k \in \mathbb{N}_0,\ \lambda \in \mathbb{C} \}$. More precisely, let $a := \mathrm{ord}_\lambda(p^*) \geq 0$. Then, for all $e_{l,\lambda} \in \mathcal{B}$ there exist constants $a_0, \ldots, a_{l+a} \in \mathbb{C}$ with $a_{l+a} \neq 0$ such that

$$p(D, \sigma) \Big( \sum_{\kappa=0}^{l+a} a_\kappa\, e_{\kappa,\lambda} \Big) = e_{l,\lambda}. \qquad (2.6)$$
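A tiny worked instance of (2.6), added here for illustration (not in the original): take $p = s$, so $p(D,\sigma) = D$ and $p^* = s$. For $\lambda = 0$ we have $a = \mathrm{ord}_0(p^*) = 1$ and $c = (p^*)'(0) = 1$, and (2.6) with $l = 0$ reads

```latex
D\big( c^{-1} e_{1,0} \big) = D(t) = 1 = e_{0,0},
```

so one extra power of $t$ compensates the zero of $p^*$ at $\lambda$.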
PROOF: (1) Let $p = \sum_{i,j} p_{ij} s^i z^j \in \mathbb{R}[s, z, z^{-1}]$. A direct computation verifies the asserted identity

$$p(D, \sigma)(e_{k,\lambda})(t) = \sum_{\kappa=0}^{k} \binom{k}{\kappa} (p^*)^{(\kappa)}(\lambda)\, e_{k-\kappa,\lambda}(t).$$

The rest of (1) is clear.
(2) It suffices to establish (2.6). We proceed by induction on $l$. Put $c := (p^*)^{(a)}(\lambda)$. Then $c \neq 0$ by assumption. For $l = 0$ it follows from (1) that $p(D, \sigma)(c^{-1} e_{a,\lambda}) = e_{0,\lambda}$, as desired. For $l > 0$ put $a_{l+a} := \binom{l+a}{a}^{-1} c^{-1}$. Then, by virtue of (1),

$$p(D, \sigma)(a_{l+a}\, e_{l+a,\lambda})
= a_{l+a} \sum_{\kappa=a}^{l+a} \binom{l+a}{\kappa} (p^*)^{(\kappa)}(\lambda)\, e_{l+a-\kappa,\lambda}
= e_{l,\lambda} + \sum_{j=0}^{l-1} b_j\, e_{j,\lambda}$$
for some constants $b_j \in \mathbb{C}$. By induction the functions $b_j e_{j,\lambda}$ have preimages involving solely exponential monomials $e_{i,\lambda}$ with $i \leq l + a - 1$. Combining them suitably with the equation above yields the desired result. □

The foregoing considerations show that characteristic functions play exactly the same role as for ODEs, in the sense that their zeros correspond to the exponential monomials in the solution space. The main difference to ODEs is that the characteristic function has infinitely many zeros in the complex plane unless it degenerates to a polynomial. Since this property will be of central importance for the algebraic setting (in fact, this will be the only information about the solution spaces of DDEs we are going to need), we include a short proof showing how it can be deduced from Hadamard's Factorization Theorem. The estimate in part (1) below will be useful in a later section to embed $\mathbb{R}[s, z, z^{-1}]$ in a Paley-Wiener algebra.

Proposition 2.5
Let $p \in \mathbb{R}[s, z, z^{-1}]$. Then
(1) there exist constants C, a> 0 and N lp*(s)l ::; C(1
E
.No such that
+ lsi)N eaiResl
for all sEC,
(2) the characteristic variety satisfies #V(p*) < oo
<¢:=:=?
p = zk'l/J for some k E Z and '1/J E R(s)\{0}.
In the classical paper [88] many more details about the location of the zeros of $p^*$ can be found; see also [3, Ch. 13]. As we are not dealing with stability issues, the above information (2) suffices for our purposes.

PROOF: (1) Letting $p = \sum_{i=0}^{N}\sum_{j=m}^{M} p_{ij} s^i z^j$, we can estimate straightforwardly
$$|p^*(s)| \leq \sum_{i=0}^{N}\sum_{j=m}^{M} |p_{ij}|\,|s|^i e^{-j\,\mathrm{Re}\, s} \leq C(1+|s|)^N \sum_{j=m}^{M} e^{-j\,\mathrm{Re}\, s} \leq C(M-m+1)(1+|s|)^N e^{a|\mathrm{Re}\, s|},$$
where $C > 0$ is a suitable constant and $a = \max\{|m|, |M|\}$.
(2) It suffices to show "$\Longrightarrow$". Let $p$ be as in the proof of (1) and assume $\#V(p^*) < \infty$. In order to get the desired result from Hadamard's Factorization Theorem, one simply has to make sure that the order (of growth) of $p^*$, defined as
$$\varlimsup_{r\to\infty} \frac{\log\log M(r;p^*)}{\log r}, \quad \text{where } M(r;p^*) := \max_{|s|=r}|p^*(s)|$$
(see [54, Def. 1.11.1]), is bounded from above by one. But this can easily be deduced either from (1) or from simple properties of the order concerning sums and products of entire functions, see [54, Sec. 4.2]. Now Hadamard's Factorization Theorem [54, 4.9] implies that $p^*$ is of the form $p^*(s) = \psi(s)e^{\alpha s+\beta}$, where $\psi \in \mathbb{C}[s]\backslash\{0\}$. Comparing this with the explicit form of $p^*$ yields $p = z^k\psi$ for some $k \in \mathbb{Z}$ and $\psi \in \mathbb{R}[s]\backslash\{0\}$. $\Box$

Corollary 2.6
(a) $\ker p(D,\sigma)$ is finite-dimensional $\iff p = z^k\phi$ for some $k \in \mathbb{Z}$ and $\phi \in \mathbb{R}[s]\backslash\{0\}$.
(b) For $\phi \in \mathbb{R}[s]$ and $p \in \mathbb{R}[s,z,z^{-1}]$ we have
$$\ker\phi(D) \subseteq \ker p(D,\sigma) \iff \frac{p^*}{\phi} \in H(\mathbb{C}).$$
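The substitution $z \mapsto e^{-s}$ behind $p^*$ is mechanical and lends itself to a symbolic sanity check. The following sketch (assuming the sympy library; the sample operator $p = sz - 2$, the order $k = 2$, and $\lambda = 1$ are our choices, not from the text) verifies the identity from the proof of Proposition 2.4(1), with $e_{k,\lambda}(t) = t^k e^{\lambda t}$:

```python
import sympy as sp

t, s = sp.symbols('t s')
lam, k = sp.Integer(1), 2

def e(kk, la):
    # exponential monomial e_{k,lambda}(t) = t^k e^{lambda t}
    return t**kk * sp.exp(la*t)

def p_op(f):
    # sample operator p(D, sigma) for p(s, z) = s*z - 2:  f -> f'(t-1) - 2 f(t)
    return sp.diff(f, t).subs(t, t - 1) - 2*f

p_star = s*sp.exp(-s) - 2   # characteristic function p*(s) = p(s, e^{-s})

lhs = sp.expand(p_op(e(k, lam)))
rhs = sp.expand(sum(sp.binomial(k, kap)
                    * sp.diff(p_star, s, kap).subs(s, lam)
                    * e(k - kap, lam)
                    for kap in range(k + 1)))
ok = sp.simplify(lhs - rhs) == 0
```

The same skeleton works for any Laurent polynomial $p$; only `p_op` and `p_star` have to be adapted in tandem.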
Part (b) can also be interpreted as follows. Each pair $(p,\phi)$ which satisfies the equivalent conditions in (b) gives rise to an operator on $\mathcal{L}$. Precisely, using the inclusion $\ker\phi(D) \subseteq \ker p(D,\sigma)$ and the surjectivity of the differential operator $\phi(D)$, one obtains a unique well-defined map $\hat q: \mathcal{L} \to \mathcal{L}$ making the diagram
$$\begin{array}{ccc}
\mathcal{L} & \xrightarrow{\ \phi(D)\ } & \mathcal{L}\\
 & {\searrow}_{\,p(D,\sigma)} & \Big\downarrow \hat q\\
 & & \mathcal{L}
\end{array}\qquad(2.7)$$
commutative. The collection of all these operators $\hat q$ will constitute the algebraic setting for our approach to DDEs. Let us first give an example.

Example 2.7
Let $\lambda \in \mathbb{R}$, $L \in \mathbb{Z}$ and $p = e^{\lambda L}z^L - 1$, $\phi = s - \lambda$. Since $p^*(\lambda) = 0$, we have $\frac{p^*}{\phi} \in H(\mathbb{C})$, and $\hat q$ maps $f \in \mathcal{L}$ to $p(D,\sigma)g$, where $g$ solves $\dot g - \lambda g = f$. Using the solution $g(t) = \int_0^t e^{\lambda(t-\tau)} f(\tau)\,d\tau$ of this ODE, we then obtain
$$(\hat q f)(t) = \big((e^{\lambda L}\sigma^L - 1)g\big)(t) = -\int_0^L e^{\lambda\tau} f(t-\tau)\,d\tau.$$
In infinite-dimensional control theory, this operator is called a distributed delay, since the value of $\hat q f$ at time $t$ depends on the past of $f$ on the full time segment $[t-L,\,t]$.
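The distributed-delay formula can be checked numerically. A minimal sketch (assuming numpy; the parameter values $\lambda = 1/2$, $L = 2$, $t_0 = 3$ and the input $f = \sin$ are our choices): it compares $(e^{\lambda L}\sigma^L - 1)g$, with $g$ a particular solution of $\dot g - \lambda g = f$, against $-\int_0^L e^{\lambda r} f(t_0-r)\,dr$:

```python
import numpy as np

lam, L, t0 = 0.5, 2.0, 3.0
f = np.sin

def trap(y, x):
    # simple trapezoidal quadrature
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def g(t, n=20001):
    # particular solution of g' - lam*g = f with base point 0
    tau = np.linspace(0.0, t, n)
    return trap(np.exp(lam*(t - tau)) * f(tau), tau)

lhs = np.exp(lam*L) * g(t0 - L) - g(t0)    # ((e^{lam L} sigma^L - 1) g)(t0)
r = np.linspace(0.0, L, 20001)
rhs = -trap(np.exp(lam*r) * f(t0 - r), r)  # distributed delay at t0
```

The two quantities agree up to quadrature error, independently of the base point chosen for the particular solution $g$.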
Remark 2.8
Let us verify that the map $\hat q$ in (2.7) is independent of the particular representation of $\frac{p}{\phi}$ as a quotient in $\mathbb{R}(s,z)$. To this end, let $p, \phi$ be as in Corollary 2.6(b), so that $\frac{p^*}{\phi} \in H(\mathbb{C})$, and let $\tilde p \in \mathbb{R}[s,z,z^{-1}]$, $\tilde\phi \in \mathbb{R}[s]\backslash\{0\}$ be such that $\frac{p}{\phi} = \frac{\tilde p}{\tilde\phi}$ in $\mathbb{R}(s,z)$. Pick $f \in \mathcal{L}$ and choose $g, \tilde g \in \mathcal{L}$ satisfying $\phi(D)g = \tilde\phi(D)\tilde g = f$. We wish to show that $p(D,\sigma)g = \tilde p(D,\sigma)\tilde g$. To do so, we pick $h \in \mathcal{L}$ such that $\tilde\phi(D)h = g$. Then, using $p\tilde\phi = \tilde p\phi$, we obtain
$$p(D,\sigma)g - \tilde p(D,\sigma)\tilde g = \tilde p(D,\sigma)\big(\phi(D)h - \tilde g\big),$$
which is indeed zero, since $\tilde\phi(D)\big(\phi(D)h - \tilde g\big) = \phi(D)g - \tilde\phi(D)\tilde g = f - f = 0$ and $\ker\tilde\phi(D) \subseteq \ker\tilde p(D,\sigma)$. As a consequence, the map $\hat q$ depends only on the quotient $\frac{p}{\phi}$ and not on the particular representation.
Now we are ready to introduce the ring of operators $\hat q$ as they occur in (2.7). We also define the analogue where the backward shift $\sigma^{-1}$ is omitted. This will be quite convenient for causality considerations later on and, occasionally, for normalization purposes.

Definition 2.9
(1) Define
$$\mathcal{H} := \Big\{\frac{p}{\phi}\ \Big|\ p \in \mathbb{R}[s,z,z^{-1}],\ \phi \in \mathbb{R}[s]\backslash\{0\},\ \frac{p^*}{\phi} \in H(\mathbb{C})\Big\} = \big\{q \in \mathbb{R}(s)[z,z^{-1}]\ \big|\ q^* \in H(\mathbb{C})\big\},$$
$$\mathcal{H}_0 := \mathcal{H} \cap \mathbb{R}(s)[z] = \big\{q \in \mathbb{R}(s)[z]\ \big|\ q^* \in H(\mathbb{C})\big\},$$
where $\mathbb{R}(s)[z,z^{-1}]$ denotes the ring of Laurent polynomials in $z$ with coefficients in $\mathbb{R}(s)$.
(2) Let $p \in \mathbb{R}[s,z,z^{-1}]$ and $\phi \in \mathbb{R}[s]\backslash\{0\}$ be polynomials such that $q := \frac{p}{\phi} \in \mathcal{H}$. Define $\hat q$ as the operator
$$\hat q: \mathcal{L} \longrightarrow \mathcal{L},\quad f \longmapsto p(D,\sigma)g, \ \text{ where } g \in \mathcal{L} \text{ is such that } \phi(D)g = f.$$
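Membership in $\mathcal{H}_0$ amounts to checking that the candidate poles of $q^*$ are removable. A sketch (sympy assumed) for a quotient of the shape of Example 2.7, with the hypothetical sample values $L = 1$ and $\lambda = 2$:

```python
import sympy as sp

s = sp.symbols('s')
lam = sp.Integer(2)

num = sp.exp(lam - s) - 1   # p*(s) for p = e^{lam} z - 1  (substitute z -> e^{-s})
den = s - lam               # phi(s) = s - lam
q_star = num / den

# q = p/phi lies in H_0 iff q* is entire; the only candidate pole s = lam
# must be removable, i.e. the limit there must be finite:
val = sp.limit(q_star, s, lam)
```

Here the limit is finite (it equals $-1$, the derivative of $e^{\lambda - s}$ at $s = \lambda$), confirming $q \in \mathcal{H}_0$; a nonremovable pole would make the limit infinite.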
Just like $p(D,\sigma)$, the map $\hat q$ is simply called a delay-differential operator. Henceforth the term DDE refers to any equation of the form $\hat q f = h$. Obviously, $\mathcal{H}$ and $\mathcal{H}_0$ are subrings with unity of $\mathbb{R}(s)[z,z^{-1}]$, inducing the injective ring homomorphism
$$\mathcal{H} \longrightarrow H(\mathbb{C}),\quad q \longmapsto q^*. \qquad (2.8)$$
Furthermore, the operators $\hat q$ give rise to the map
$$\mathcal{H} \longrightarrow \mathrm{End}_{\mathbb{C}}(\mathcal{L}),\quad q \longmapsto \hat q. \qquad (2.9)$$
Using commutativity of the operator ring $\mathbb{R}[D,\sigma,\sigma^{-1}]$ acting on $\mathcal{L}$, it is easily seen that (2.9) is a ring homomorphism, which means in particular that the operators $\hat q$ commute with each other. Notice that the embedding extends (2.4), turning $\mathcal{L}$ into a faithful $\mathcal{H}$-module. In Section 3.5 we will describe the ring $\mathcal{H}$ in terms of distributions, showing that the mappings $\hat q$ are convolution operators on $\mathcal{L}$. Part (b) of Corollary 2.6 can now be translated into
$$\ker\phi(D) \subseteq \ker p(D,\sigma) \iff \phi \text{ divides } p \text{ in the ring } \mathcal{H} \qquad (2.10)$$
for all $\phi \in \mathbb{R}[s]$ and $p \in \mathbb{R}[s,z,z^{-1}]$. Recall from the introduction that it will be one of our objectives to describe the algebraic relation between systems of delay-differential equations which share the same solution space. Characterizing the inclusion of solution spaces is only a slightly more general task, for which now a special, and simple, case has been settled by simply defining the operator ring suitably. The equivalence (2.10) suggests that the operators in $\mathcal{H}$ should be taken into consideration for the algebraic investigation of DDEs. This extension will turn out to be just right in Section 4.1, where we will see that (2.10) holds true for arbitrary delay-differential operators, even in matrix form.

Remark 2.10
The ring $\mathcal{H}$ as given in Definition 2.9 was first introduced in the paper [42]. It has appeared in different shapes in the control-theoretic literature before. In a very different context, the ring of Laplace transforms of $\mathcal{H}$ has been introduced in the paper [85] to show the coincidence of null controllability and spectral controllability for a certain class of systems under consideration. In a completely different way, the ring $\mathcal{H}_0$ was also considered in [63]. Therein, a ring generated by the entire functions $\beta_\lambda(s) = \frac{1 - e^{\lambda - s}}{s - \lambda}$ and their derivatives is introduced in order to achieve Bezout identities
$$\big(sI - A(e^{-s})\big)M(s) + B(e^{-s})N(s) = I$$
with coefficient matrices over the extension $\mathcal{B}[s, e^{-s}]$. One can show by some lengthy computations that $\mathcal{H}_0$ is isomorphic to this ring $\mathcal{B}[s,e^{-s}]$. Notice for instance that $\beta_\lambda(s) = -\frac{p^*}{\phi}(s)$ for $p$ and $\phi$ in Example 2.7 and $L = 1$. In [9] and [8] the approach of [63] has been resumed.
At this point we wish to take a brief excursion and compare the situation for DDEs with that for partial differential equations.

Remark 2.11
In the paper [84] a very comprehensive algebraic study of multidimensional systems has been performed. The common feature of the various kinds of systems covered in [84] is a polynomial ring $K[s_1,\dots,s_m]$ of operators acting on a function space $\mathcal{A}$. This model covers linear partial differential operators with constant coefficients acting on $C^\infty(\mathbb{R}^m,\mathbb{C})$ or on $\mathcal{D}'(\mathbb{R}^m)$ as well as their real counterparts and discrete-time versions of partial shift-operators on sequence
spaces. It has been shown in [84, (54), p. 33] that in all these cases the corresponding module $\mathcal{A}$ constitutes a large injective cogenerator within the category of $K[s_1,\dots,s_m]$-modules. From this a duality between solution spaces and finitely generated submodules of $K[s_1,\dots,s_m]$ (the sets of annihilating equations) is derived, making it feasible to apply the powerful machinery of commutative algebra to problems in multidimensional systems theory (see Example 5.1.3 for a brief overview of the structural properties of multidimensional systems). From our point of view this says that for multidimensional systems it "suffices" to stay in the setting of a polynomial operator ring in order to achieve a translation of relations between solution spaces into algebraic terms. At [84, p. 17] Oberst has observed that his approach does not cover delay-differential equations. We wish to illustrate this fact by giving a simple example which shows that $\mathcal{L}$ is not injective in the category of $\mathbb{R}[s,z]$-modules. Recall that an $\mathbb{R}[s,z]$-module $M$ is said to be injective if the functor $\mathrm{Hom}_{\mathbb{R}[s,z]}(-,M)$ is exact on the category of $\mathbb{R}[s,z]$-modules [67, III, § 8]. For our purposes it suffices to note that $\mathrm{Hom}_{\mathbb{R}[s,z]}(\mathbb{R}[s,z]^n, \mathcal{L}) \cong \mathcal{L}^n$, where the isomorphism is given by $f \mapsto (f(e_1),\dots,f(e_n))^T$. The inverse associates with each $(a_1,\dots,a_n)^T \in \mathcal{L}^n$ the homomorphism that takes $(p_1,\dots,p_n)^T \in \mathbb{R}[s,z]^n$ to the element $\sum_{i=1}^n p_i(D,\sigma)a_i \in \mathcal{L}$. As a consequence, for a matrix $P \in \mathbb{R}[s,z]^{n\times m}$, considered as a map from $\mathbb{R}[s,z]^m$ to $\mathbb{R}[s,z]^n$, its dual with respect to the abovementioned functor is given by $P(D,\sigma)^T: \mathcal{L}^n \to \mathcal{L}^m$. Now we can present the example. Consider the matrices
$$P = \begin{bmatrix} z-1 \\ s \end{bmatrix}, \qquad Q = [\,s,\ 1-z\,].$$
Then $\ker P^T = \mathrm{im}\, Q^T$ in $\mathbb{R}[s,z]^2$, while for the dual maps one only has $\mathrm{im}\, P(D,\sigma) \subsetneq \ker Q(D,\sigma)$ in $\mathcal{L}^2$, as can readily be seen by the constant function $w = (0,1)^T \in \mathcal{L}^2$. Hence $\mathcal{L}$ is not injective. It can be seen straightforwardly from the very definition of $\hat q$ in Definition 2.9 that $\mathrm{im}\, P(D,\sigma)$ is the kernel of the operator associated with the row $\big[1,\ \frac{1-z}{s}\big] \in \mathcal{H}_0^{1\times 2}$, indicating again that it is natural to enlarge the operator ring from $\mathbb{R}[s,z]$ to $\mathcal{H}_0$. We remark that the fact that multidimensional systems theory "takes place in a polynomial setting" by no means implies that it is simpler than our setting for DDEs. Quite the contrary, we will see that every finitely generated submodule of a free $\mathcal{H}$-module is free, which simplifies matters enormously when dealing with matrices. Despite the completely different algebraic setting there will arise a structural similarity of systems of DDEs to multidimensional systems, which will be pointed out on several occasions in Chapter 4. In Chapter 5, multidimensional systems will be part of our investigations on multi-operator systems.

For completeness and later use we want to present the generalization of Proposition 2.4 about exponential monomials.
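The two kernel identities used in the example of Remark 2.11 rest on one-line polynomial computations, which can be confirmed symbolically (sympy assumed):

```python
import sympy as sp

s, z = sp.symbols('s z')

P = sp.Matrix([[z - 1], [s]])   # P from the example
Q = sp.Matrix([[s, 1 - z]])     # Q from the example

# the syzygy behind ker P^T = im Q^T:
rel1 = sp.expand((P.T * Q.T)[0, 0])

# the H_0-row [1, (1-z)/s] annihilates the column of P:
row = sp.Matrix([[1, (1 - z)/s]])
rel2 = sp.simplify((row * P)[0, 0])
```

Both relations vanish identically; the point of the remark is that the second annihilator only exists after enlarging $\mathbb{R}[s,z]$ to $\mathcal{H}_0$.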
Lemma 2.12
Let $p \in \mathbb{R}[s,z,z^{-1}]$ and $\phi \in \mathbb{R}[s]\backslash\{0\}$ be such that $q := \frac{p}{\phi} \in \mathcal{H}$, and let $\lambda \in \mathbb{C}$ with $l := \mathrm{ord}_\lambda(q^*)$. Consider the finite sum $f = \sum_{\nu=0}^{m} f_\nu e_{\nu,\lambda} \in \mathcal{L}$ with coefficients $f_\nu \in \mathbb{C}$. Then
$$\hat q f = \sum_{\nu=0}^{m-l} b_\nu e_{\nu,\lambda} \quad \text{for some } b_\nu \in \mathbb{C}$$
(the sum being empty if $l > m$); in particular, $f \in \ker\hat q$ whenever $\mathrm{ord}_\lambda(q^*) > m$. The function $q^* \in H(\mathbb{C})$ is called the characteristic function of the operator $\hat q$.

PROOF: Let $k := \mathrm{ord}_\lambda(\phi)$, thus $\mathrm{ord}_\lambda(p^*) = l + k$. Proposition 2.4(2) (applied to the ordinary differential operator $\phi(D)$) guarantees the existence of a function $g = \sum_{\nu=0}^{m+k} g_\nu e_{\nu,\lambda}$, where $g_\nu \in \mathbb{C}$, such that $\phi(D)g = f$. Hence $\hat q f = p(D,\sigma)g$, and the desired result follows since $(p^*)^{(\kappa)}(\lambda) = 0$ for $\kappa < l + k$. $\Box$

Remark 2.13
Notice that we did not consider any expansions of solutions as infinite series of exponential polynomials. Such expansions do exist, see [102] and [3, Ch. 6], the latter for solutions of retarded equations on $\mathbb{R}_+$. We will not utilize these facts, since the only case where the full information about the solution space is needed is that of ODEs, see also (2.10). For the general case it will be sufficient for us to know which exponential monomials are contained in the solution space. Series expansions of the type above are important when dealing with stability of DDEs. We will briefly discuss the issue of stability in Section 4.5, where we will simply quote the relevant results from the literature.
We conclude our considerations on scalar DDEs with the surjectivity of delay-differential operators on $\mathcal{L}$. This fact is well-known and can be found in [25, p. 697], where it is stated in a much more general context and proven with rather elaborate methods. However, we would like to prove a version which also shows what kind of initial conditions can be imposed for the DDE (2.1). This also gives us the opportunity to present the method of steps, the standard procedure for solving initial value problems of DDEs. Earlier in this chapter we briefly addressed what kind of initial data should be specified in order for (2.1) to single out a unique solution $f$. Apart from smoothness requirements, we suggested that $f$ has to be specified on an interval of length $M$, the largest delay occurring in (2.1). For instance, a solution of the pure delay equation $\sigma f - f = 0$ is determined completely by the restriction $f_0 := f|_{[0,1)}$. But in order that $f$ be smooth, it is certainly necessary that the initial condition $f_0$ can be extended to a smooth function on $[0,1]$ having equal derivatives $f_0^{(\nu)}(0) = f_0^{(\nu)}(1)$ of all orders $\nu \in \mathbb{N}_0$ at the endpoints of the interval. In other words, $f_0$ and all its derivatives have to satisfy the delay equation for $t = 1$. This idea generalizes to arbitrary DDEs and leads to the restriction given
in (2.11) below, which simply says that the initial condition has to be compatible with the given DDE. As our approach comprises retarded, neutral, and also advanced equations of arbitrary order and, additionally, requires smoothness, we could not find a reference for the result as stated below. However, the procedure is standard, and one should notice the similarity of the proof given below for part (1) with, e.g., those presented in the book [3, Thms. 3.1 and 5.2]. In the sequel the notation $C^\infty[a,b] := C^\infty([a,b],\mathbb{C})$ as well as $f^{(\nu)}$ for $f \in C^\infty[a,b]$ refers, of course, to one-sided derivatives when taken at the endpoints $a$ or $b$.
Proposition 2.14
Let $q = p\phi^{-1} \in \mathcal{H}_0\backslash\{0\}$, where $p = \sum_{j=0}^{M} p_j z^j$, $p_j, \phi \in \mathbb{R}[s]$, $p_0 \neq 0 \neq p_M$, $\phi \neq 0$, and $M \geq 1$. Furthermore, let $g \in \mathcal{L}$.
(1) For every $f_0 \in C^\infty[0,M]$ satisfying
$$\big(p(D,\sigma)f_0^{(\nu)}\big)(M) = g^{(\nu)}(M) \quad \text{for all } \nu \in \mathbb{N}_0 \qquad (2.11)$$
there exists a unique function $f \in \mathcal{L}$ such that $p(D,\sigma)f = g$ and $f|_{[0,M]} = f_0$. As a consequence, the map $\hat q$ is surjective on $\mathcal{L}$.
(2) If $f \in \ker\hat q \subseteq \mathcal{L}$ satisfies $f|_{[k,k+M]} = 0$ for some $k \in \mathbb{R}$, then $f = 0$.

PROOF: (1) To prove the existence of $f$, we show: every $f_0 \in C^\infty[a,b]$, defined on an interval of length $b - a \geq M$, which satisfies the condition
$$\big(p(D,\sigma)f_0^{(\nu)}\big)(t) = \Big(\sum_{j=0}^{M} p_j(D)\sigma^j f_0^{(\nu)}\Big)(t) = g^{(\nu)}(t) \quad \text{for all } \nu \in \mathbb{N}_0 \qquad (2.12)$$
for all $t \in [a+M, b]$ can be extended in a unique way to a solution in $C^\infty[a-1, b+1]$ which satisfies (2.12) on $[a-1+M, b+1]$. (Notice that the initial condition given in the proposition is included as an extreme case where $a = 0$ and $b = M$.) To this end, write $p_0(s) = \sum_{i=0}^{r-1} a_i s^i + s^r$ and consider the inhomogeneous ODE
$$p_0(D)\tilde f(t) = g(t) - \Big(\sum_{j=1}^{M} p_j(D)\sigma^j f_0\Big)(t) \qquad (2.13)$$
for $t \in [b, b+1]$ with initial condition
$$\tilde f^{(\nu)}(b) = f_0^{(\nu)}(b) \quad \text{for } \nu = 0,\dots,r-1. \qquad (2.14)$$
(If $r = 0$, then $p_0 = 1$ and no initial condition is imposed.) In any case, there is a unique solution $\tilde f \in C^\infty[b,b+1]$ to (2.13), (2.14), and $\tilde f$ satisfies
$$\tilde f^{(r)}(b) = g(b) - \Big(\sum_{j=1}^{M} p_j(D)\sigma^j f_0\Big)(b) - \sum_{i=0}^{r-1} a_i \tilde f^{(i)}(b) = f_0^{(r)}(b).$$
Differentiating (2.13) and using (2.12) shows successively $\tilde f^{(\nu)}(b) = f_0^{(\nu)}(b)$ for all $\nu \in \mathbb{N}_0$. Therefore, the function $f_1$ defined by $f_1(t) = f_0(t)$ for $t \in [a,b]$ and $f_1(t) = \tilde f(t)$ for $t \in (b, b+1]$ is in $C^\infty[a,b+1]$ and, by construction, satisfies (2.12) on $[a+M, b+1]$. In the same manner one can extend $f_1$ to a smooth solution on $[a-1, b+1]$; one takes the unique solution $\tilde f$ of the ODE
$$p_M(D)\tilde f(t) = g(t) - \sum_{j=0}^{M-1} p_j(D)f_1(t-j) \quad \text{on } [a+M-1,\, a+M]$$
with initial data $\tilde f^{(\nu)}(a+M) = f_1^{(\nu)}(a)$ for $\nu = 0,\dots,\deg p_M - 1$, and defines
$$f_2(t) := f_1(t) \ \text{ for } a \leq t \leq b+1, \qquad f_2(t) := \tilde f(t+M) \ \text{ for } a-1 \leq t < a.$$
Then $f_2 \in C^\infty[a-1, b+1]$ and satisfies (2.12) on $[a+M-1, b+1]$. Repeating this extension leads to a solution in $\mathcal{L}$. It is clear from the procedure that the solution of the initial value problem is unique. As for the surjectivity of $\hat q$, observe that it suffices to show the surjectivity of $p(D,\sigma)$. The latter can be accomplished by providing a function $f_0 \in C^\infty[0,M]$ satisfying (2.11). A simple choice for $f_0$ is as follows: pick a solution $h_1 \in C^\infty[0,M]$ of the ODE $p_0(D)h_1 = g$ on the interval $[0,M]$ and let $h_2 \in C^\infty[0,M]$ be such that $h_2|_{[0,M-0.75]} \equiv 0$ and $h_2|_{[M-0.5,M]} \equiv 1$. Then one checks that $f_0 := h_1 h_2$ is a desired initial condition.
(2) is a consequence of the uniqueness in (1). $\Box$
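The forward half of the method of steps is easy to implement numerically. A minimal sketch (numpy assumed, forward Euler) for the retarded example $\dot f(t) = f(t-1) + g(t)$, i.e. $p = s - z$ and $M = 1$; this is a special case of the construction above, not the general algorithm:

```python
import numpy as np

def solve_dde(f0, g, T, h=1e-3):
    # method of steps: on [k, k+1] the delayed term f(t-1) is already known,
    # so each unit interval reduces to an (here: Euler-integrated) ODE
    n1 = int(round(1.0/h))          # grid steps per unit delay
    n = int(round(T/h))
    ts = np.linspace(0.0, T, n + 1)
    f = np.empty(n + 1)
    f[:n1 + 1] = f0(ts[:n1 + 1])    # initial data on [0, 1]
    for i in range(n1, n):
        f[i + 1] = f[i] + h*(f[i - n1] + g(ts[i]))
    return ts, f

# constant history f0 = 1 and g = 0: on [1, 2] the exact solution is f(t) = t
ts, f = solve_dde(lambda t: np.ones_like(t), lambda t: 0.0, T=2.0)
```

Each pass through one unit interval also illustrates the smoothing discussed below: the numerical solution inherits one additional degree of differentiability per step.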
Remark 2.15
It is immediate to see that for all $f \in \mathcal{L}$ and $q \in \mathcal{H}$ the inclusion $f \in \ker\hat q$ implies $\mathrm{Re}\,f,\ \mathrm{Im}\,f \in \ker\hat q$, too. As a consequence, Corollary 2.6, Equation (2.10), and Proposition 2.14 remain valid when $\mathcal{L}$ is replaced by its real-valued analogue $C^\infty(\mathbb{R},\mathbb{R})$.

We now close the considerations about scalar DDEs and want to spend the rest of this chapter on discussing a (somewhat extreme) example illustrating some features of systems of delay-differential equations. The general theory has to be postponed until Chapter 4, when the algebraic results concerning matrices with entries in $\mathcal{H}$ are available.

Example 2.16
We consider the homogeneous system of DDEs $R(D,\sigma)w = 0$, where
$$R = \begin{bmatrix} s & -2 & 0 \\ -z & s & 1 \\ 0 & -2z & s \end{bmatrix} \in \mathbb{R}[s,z]^{3\times 3}.$$
This example is taken from [23, pp. 249] (see also the references given therein), where it is presented in the form
$$\dot x(t) = \begin{bmatrix} 0 & 2 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 2 & 0 \end{bmatrix} x(t-1). \qquad (2.15)$$
It was the purpose in [23] to present a system that is nicely solvable in forward direction but lacks backward solutions for most choices of initial conditions. Indeed, prescribing a continuous initial condition $x_0: [t_0, t_0+1) \to \mathbb{C}^3$, one can solve (2.15) uniquely in forward direction by using the (corresponding version of the) method of steps. The solution is continuously differentiable on $(t_0+1, \infty)$ and satisfies (2.15) on that interval, see [3, Thm. 6.2]. (In fact, the solution is of class $C^k$ on $(t_0+k, \infty)$ for each $k \in \mathbb{N}_0$.) If the initial condition $x_0$ is of class $C^\infty$ on $[t_0, t_0+1)$ and satisfies (2.15) for $t = t_0+1$ in all derivatives, the solution is of class $C^\infty$ on $[t_0, \infty)$ and fulfills (2.15) for all $t \geq t_0+1$. In order to discuss backward solvability, it is shown by some elementary calculations in [23] that every differentiable function on $[t_0,\infty)$ which satisfies (2.15) for $t \geq t_0+1$ is, on the interval $[t_0+2, \infty)$, a polynomial of the form
$$p(t) = \begin{pmatrix} a + bt + ct^2 \\[2pt] \frac{b}{2} + ct \\[2pt] a - b + (b-2c)t + ct^2 \end{pmatrix} \in \mathbb{C}[t]^3 \quad \text{for some } a, b, c \in \mathbb{C}. \qquad (2.16)$$
As a consequence, even an initial condition imposed at just one single point might not allow a backward solution. For instance, there is no differentiable solution $x = (x_1,x_2,x_3)^T$ on $(-\infty, t_0]$ satisfying $x_1(t_0) - 2x_2(t_0) - x_3(t_0) \neq 0$. Of course, this is due to the singular coefficient matrix of $x(t-1)$ in (2.15). Let us now study Equation (2.15) from our point of view, where only smooth solutions on the whole of $\mathbb{R}$ are being considered. In this particular example it is possible to achieve a triangular form for $R$ by applying elementary row operations over $\mathbb{R}[s,z]$; indeed we have
$$R_1 := \begin{bmatrix} s^3 & 0 & 0 \\ s & -2 & 0 \\ -z & s & 1 \end{bmatrix} = VR, \quad \text{where } V = \begin{bmatrix} s^2+2z & 2s & -2 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \ \text{ and } \ \det V = -2.$$
Since $V^{-1} \in \mathbb{R}[s,z]^{3\times 3}$, too, the operator $V(D,\sigma)$ is bijective on $\mathcal{L}^3$. As a consequence, the kernel of $R(D,\sigma)$ remains unchanged under this transformation and can be read off explicitly from the triangular form. In fact, one obtains
$$\ker R(D,\sigma) = \ker R_1(D,\sigma) = \{p \in \mathbb{C}[t]^3 \mid p \text{ is of the form (2.16)}\}. \qquad (2.17)$$
We see that the triangular form does not only have the advantage of providing the solution space of (2.15); it also exhibits, via its diagonal elements, where and how initial data can be imposed so that, by virtue of Proposition 2.14, forward
and backward solutions are guaranteed. We can go even one step further. The kernel (2.17), being a finite-dimensional space of polynomials, is in fact a solution space of an ordinary differential operator. This operator can be determined explicitly by exploiting the linear equations governing the coefficients in (2.16). Elementary calculations show
$$\ker R(D,\sigma) = \ker R_2(D), \quad \text{where } R_2 = \begin{bmatrix} s & -2 & 0 \\ 1 & -2 & -1 \\ s^3 & 0 & 0 \end{bmatrix} \in \mathbb{R}[s]^{3\times 3};$$
see also [44, pp. 227] for a general method. Let us determine the row transformations relating the matrices $R$ and $R_2$. To this end, we simply have to calculate $R_2 R^{-1}$ in $\mathbb{R}(s,z)^{3\times 3}$ and obtain $UR = R_2$, where
$$U = \begin{bmatrix} 1 & 0 & 0 \\[4pt] \dfrac{s^2 + 2z - 2zs - 2z^2}{s^3} & \dfrac{2s - 2s^2 - 2zs}{s^3} & \dfrac{2z + 2s - s^2 - 2}{s^3} \\[4pt] s^2 + 2z & 2s & -2 \end{bmatrix}.$$
This matrix is easily seen to have determinant $2$ and entries in $\mathcal{H}_0$. Put another way, $R_2(D)$ is obtained from $R(D,\sigma)$ via a unimodular row transformation over the operator algebra $\mathcal{H}_0$. In Section 4.1 we will see that operator matrices sharing the same kernel are always related like this. Let us briefly draw a link to the forward solutions discussed above. Since the transformation matrices $V$ and $U$ contain the shift operator $z$, they don't preserve the solution space of $R(D,\sigma)x = 0$ in $C^1([t_0,\infty),\mathbb{C}^3)$, but only respect the "tails of the solutions". A few time units have to elapse (the number depending on the degree of $z$ occurring in the matrices and possibly on the amount of differentiability required for applying the transformations) before the solutions of $R(D,\sigma)x = 0$ turn into solutions of, say, $R_1(D,\sigma)x = 0$. This is exactly what we observed in (2.16) and (2.17). Finally, we should mention that in this example a first piece of information about the kernel of $R$ in $\mathcal{L}^3$ could have been obtained by noticing that $\det R = s^3$. The equation $(\mathrm{adj}\,R)(D,\sigma) \circ R(D,\sigma) = D^3 I_3$ shows immediately that the solutions of $R(D,\sigma)f = 0$ on $\mathbb{R}$ are polynomials of degree less than three; hence the solution space is finite-dimensional. In this case one could go even further and determine the full solution space by substituting the general polynomial of degree two in the original equation $R(D,\sigma)f = 0$. This idea, of course, applies whenever the determinant of the system matrix is in $\mathbb{R}[s]$ but fails in the general case $\det R \in \mathbb{R}[s,z]$. Even though the example is extreme in the sense that it actually describes a system of ODEs, it should demonstrate that a triangular form is of advantage for getting some more information about the solution space of a delay-differential operator. But not every polynomial matrix in two variables can be row reduced to a triangular form. Fortunately, this can always be achieved with transformation matrices having entries in $\mathcal{H}$, as we will see in the next chapter.
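The determinant and row-reduction computations in Example 2.16 can be confirmed symbolically. A sketch (sympy assumed; the transformation $V$ used here is one valid choice, with rows possibly ordered differently elsewhere):

```python
import sympy as sp

s, z = sp.symbols('s z')

R = sp.Matrix([[s, -2, 0], [-z, s, 1], [0, -2*z, s]])
V = sp.Matrix([[s**2 + 2*z, 2*s, -2], [1, 0, 0], [0, 1, 0]])

R1 = (V * R).applyfunc(sp.expand)   # lower triangular form of the system
detV = V.det()                      # unit in R[s,z], so V is unimodular
detR = sp.factor(R.det())           # purely differential determinant s^3
```

That $\det R$ contains no $z$ is what makes the kernel a finite-dimensional space of polynomials.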
3 The Algebraic Structure of $\mathcal{H}_0$

In this chapter we will concentrate on the purely algebraic part of the theory and analyze the ring structure of $\mathcal{H}_0$. As the following sections will show, the operator ring $\mathcal{H}_0$ carries a rich algebraic structure, interesting by itself, but also nicely suited for an algebraic study of delay-differential equations and systems thereof later on. The combination of the two embeddings $\mathcal{H}_0 \subseteq \mathbb{R}(s)[z]$ and $\mathcal{H}_0 \subseteq H(\mathbb{C})$ proves to be a powerful tool for the upcoming investigations; the ring $\mathbb{R}(s)[z]$ is a principal ideal domain, while $H(\mathbb{C})$ is a Bezout domain (it is even known to be an elementary divisor domain, but for us the Bezout property will be the main tool). The inclusion $\mathcal{H}_0 \subseteq H(\mathbb{C})$ has been exploited already once in Proposition 2.5(2), where we established that a proper delay-differential operator has infinitely many characteristic zeros. This fact, in combination with an easy handling of the finitely many zeros of each possible denominator in $\mathcal{H}_0 \subseteq \mathbb{R}(s)[z]$, will be a permanent ingredient for the arguments in Section 3.1. In that section we derive the main results about $\mathcal{H}_0$. (The corresponding facts about the ring $\mathcal{H}$, which is simply the localization $(\mathcal{H}_0)_z$, are readily derived.) On the one hand, it will be shown that $\mathcal{H}_0$ is a Bezout domain, that is, each finitely generated ideal is principal. Put another way, each two nonzero elements have a greatest common divisor which, additionally, can be expressed as a linear combination of the given elements with coefficients in $\mathcal{H}_0$. On the other hand, we will also see that $\mathcal{H}_0$ is a so-called adequate ring, meaning that each element can be factored in a certain desired manner concerning its characteristic zeros. In Section 3.2, general ring theory is applied to deduce that $\mathcal{H}_0$ is a so-called elementary divisor domain. This says, essentially, that matrices with entries in $\mathcal{H}_0$ admit triangular and diagonal forms in a way quite similar to matrices over Euclidean domains. This fact will be of fundamental importance in the next chapter when we study systems of delay-differential equations. Furthermore, the Bezout property of $\mathcal{H}_0$ will be utilized to generalize the notions of greatest common divisors and least common multiples to matrices over $\mathcal{H}_0$. As has just been indicated, the reason for focusing on matrix theory rather than general module theory over $\mathcal{H}_0$ is the fact that in the next chapter our main objects of study will be systems of delay-differential equations, hence operators which are matrices over $\mathcal{H}_0$ or $\mathcal{H}$. However, a translation and interpretation of the matrix-theoretic results into the language of general module theory will be given at the end of Section 3.2.
In Section 3.3 we will take a brief excursion into the theory of systems over rings. Even though this area of control theory is not directly related to our approach to delay-differential systems, it has been a helpful source for getting acquainted with various ring structures and their matrix-theoretic implications. In order to further study the algebraic structure of $\mathcal{H}_0$, a description of the non-finitely generated ideals of $\mathcal{H}_0$ follows in Section 3.4. Among other things, it will be shown that $\mathcal{H}_0$ has Krull dimension one, that is, all nonzero prime ideals are maximal. In Section 3.5 we recall the original meaning of $\mathcal{H}$ as an operator algebra acting on $C^\infty(\mathbb{R},\mathbb{C})$ and demonstrate how it can be characterized as an algebra of distributions with compact support. This can also be understood as an embedding of $\mathcal{H}$ into a suitable Paley-Wiener algebra of entire functions. The structure of the distributions associated with the elements of $\mathcal{H}$ will be given explicitly. Finally, in Section 3.6 the question is posed whether the ingredients required for the calculation of the various objects introduced earlier can actually be computed. We will concentrate on symbolic computability; numerical questions will not be addressed. Although the various calculations of the previous sections are in a certain sense constructive, some additional work is necessary with respect to their symbolic computability. This is mainly due to the central role played by the zeros of the (transcendental) functions in $\mathbb{R}[s, e^{-s}]$ for the construction of suitable ring elements. Starting with functions in, say, $\mathbb{Q}[s, e^{-s}]$, one is necessarily led to certain transcendental field extensions of $\mathbb{Q}$. Interestingly enough, it is exactly Schanuel's (still open) conjecture about the transcendence degree of such extensions which, in case it is valid, enables symbolic computability of Bezout equations in $\mathcal{H}_0$ and, consequently, of all relevant constructions presented in this chapter. In this context it is also interesting to note that generically the computational difficulties for the Bezout identity of three or more polynomials in $\mathbb{Q}[s,z]$ do not arise, due to the lack of any common zeros to be taken care of.

We will make use of the following

Notation 3.1
Let $R$ be any commutative domain with unity.
(a) By $R^\times$ we denote the group of units of $R$. We will write $p \mid_R q$ if $p$ divides $q$ in $R$. Any greatest common divisor of $p_1,\dots,p_l \in R$ (if it exists) will be denoted by $\gcd_R(p_1,\dots,p_l)$. Consequently, each expression containing $\gcd_R(p_1,\dots,p_l)$ has to be understood up to units in the ring. In the same way, a least common multiple (if it exists) will be denoted by $\mathrm{lcm}_R(p_1,\dots,p_l)$. In case $R = \mathbb{R}[s]$, we omit the index $R$. This will not cause any confusion due to the specific rings under consideration.
(b) A matrix $U \in R^{n\times n}$ is called unimodular if $\det U \in R^\times$, and $U$ is said to be nonsingular if $\det U \in R\backslash\{0\}$. The group of unimodular matrices over $R$ is denoted by $\mathrm{Gl}_n(R)$, while $E_n(R)$ is the subgroup
$$E_n(R) := \{U \in \mathrm{Gl}_n(R) \mid U \text{ is a finite product of elementary matrices}\},$$
where elementary matrices are understood in the usual sense, see, e.g., [55, p. 338].
(c) The rank of a matrix $M \in R^{p\times q}$ is defined to be the rank of $M$ regarded as a matrix over the quotient field of $R$. Hence it denotes the maximal number of linearly independent rows or columns in the matrix $M$.
(d) We call two matrices $P, Q \in R^{p\times q}$ left equivalent (resp. equivalent) over $R$ if there exists $U \in \mathrm{Gl}_p(R)$ (resp. $U \in \mathrm{Gl}_p(R)$ and $V \in \mathrm{Gl}_q(R)$) such that $UP = Q$ (resp. $UPV = Q$). Right equivalence is defined accordingly.
(e) For a polynomial $p \in R[x_1,\dots,x_n]$ let $\deg_{x_i} p$ denote the degree of $p$ as a polynomial in $x_i$. For $q = \frac{p}{\phi} \in \mathbb{R}(s)[z]$, where $p \in \mathbb{R}[s,z]$ and $\phi \in \mathbb{R}[s]$, define $\deg_s q = \deg_s p - \deg\phi$.
(f) For an indeterminate $x$ over $R$ the notation $R((x))$ stands for the ring of formal Laurent series in $x$ with coefficients in $R$, that is,
$$R((x)) = \Big\{\sum_{i=L}^{\infty} r_i x^i \ \Big|\ L \in \mathbb{Z},\ r_i \in R\Big\}.$$
Likewise, $R[[x]] := \big\{\sum_{i=0}^{\infty} r_i x^i \mid r_i \in R\big\}$ denotes the ring of formal power series over $R$.
3.1 Divisibility Properties

In this section we want to lay down the foundations for our algebraic considerations. To begin with, we present some basic, yet important, rules for calculating in the ring
$$\mathcal{H}_0 = \{q \in \mathbb{R}(s)[z] \mid q^* \in H(\mathbb{C})\}.$$
These rules comprise certain types of factorizations as well as some kind of division with remainder close to that in Euclidean domains. Furthermore, using the fact that two entire functions always have a greatest common divisor, we can establish the same result for $\mathcal{H}_0$. Along with the division with remainder one even obtains that $\mathcal{H}_0$ is a Bezout domain, that is, the greatest common divisor can be expressed as a linear combination of the given elements. On the other hand, it is a simple fact that $\mathcal{H}_0$ is not a principal ideal domain. A combination of the embeddings $\mathcal{H}_0 \subseteq \mathbb{R}(s)[z]$ and $\mathcal{H}_0 \subseteq H(\mathbb{C})$ will finally show that $\mathcal{H}_0$ admits adequate factorizations in a sense to be made precise below.

Remark 3.1.1
In the sequel we will make frequent use of the fact that the ring $H(\mathbb{C})$ of entire functions is a Bezout domain [82, p. 136]. Precisely, each two nonzero functions $f, g \in H(\mathbb{C})$ have a greatest common divisor $d \in H(\mathbb{C})$ which satisfies
$$\mathrm{ord}_\lambda(d) = \min\{\mathrm{ord}_\lambda(f),\ \mathrm{ord}_\lambda(g)\} \quad \text{for all } \lambda \in \mathbb{C}$$
and can be expressed as a linear combination
$$d = af + bg$$
with suitable functions $a, b \in H(\mathbb{C})$.

Let us start with the following list of properties. Comments will be given below. Recall the notation from Definition 2.3.

Proposition 3.1.2
(a) For all $p \in \mathcal{H}_0$ and $\lambda \in \mathbb{C}$ we have $p^*(\bar\lambda) = \overline{p^*(\lambda)}$, where $\bar{\phantom{x}}$ denotes complex conjugation.
(b) The units of $\mathcal{H}_0$ are given by $\mathcal{H}_0^\times = \mathbb{R}\backslash\{0\}$. Moreover, $\{p \in \mathcal{H}_0 \mid p^* \in H(\mathbb{C})^\times\} = \{az^k \mid a \in \mathbb{R}\backslash\{0\},\ k \in \mathbb{N}_0\}$.
(c) Let $p, q \in \mathcal{H}_0$ and let $z \nmid_{\mathcal{H}_0} p$. Then $p^* \mid_{H(\mathbb{C})} q^* \iff p \mid_{\mathcal{H}_0} q$.
(d) For all $p \in \mathcal{H}_0$ the following assertions are equivalent: i) $p$ is irreducible; ii) either $p = \phi$ for some irreducible $\phi \in \mathbb{R}[s]$ or $p = az$ for some nonzero $a \in \mathbb{R}$; iii) $p$ is prime.
(e) The ring $\mathcal{H}_0$ is not factorial and not Noetherian.
(f) For all $k \in \mathbb{N}_0$ and $\phi \in \mathbb{R}[s]\backslash\{0\}$ there exists a polynomial $\delta \in \mathbb{R}[s]$ such that $\frac{z^k - \delta}{\phi} \in \mathcal{H}_0$.
(g) For all $p \in \mathcal{H}_0$ and $q \in \mathcal{H}_0\backslash\{0\}$ there exist elements $f, \tilde p \in \mathcal{H}_0$ such that $p = fq + \tilde p$ and $\deg_z \tilde p \leq \deg_z q$.
(h) If $\phi \in \mathbb{R}[s]$ and $a, b \in \mathcal{H}_0$ are elements such that $\phi \mid_{\mathcal{H}_0} (ab)$, then there exists a factorization $\phi = \phi_1\phi_2$ in such a way that $\frac{a}{\phi_1}, \frac{b}{\phi_2} \in \mathcal{H}_0$. Moreover, one can arrange the factorization such that $V\big(\phi_1, (b\phi_2^{-1})^*\big) = \emptyset$.
(i) Each two elements $p, q \in \mathcal{H}_0$, not both zero, have a greatest common divisor $d = \gcd_{\mathcal{H}_0}(p,q) \in \mathcal{H}_0\backslash\{0\}$. Moreover, $d^* = \gcd_{H(\mathbb{C})}(p^*, q^*)$ and hence $V(d^*) = V(p^*, q^*)$. In particular,
$$\gcd_{H(\mathbb{C})}(p^*, q^*) = 1 \iff \gcd_{\mathcal{H}_0}(p,q) = z^l \ \text{ for some } l \in \mathbb{N}_0 \iff V(p^*, q^*) = \emptyset.$$
(j) Let $p, a, b \in \mathcal{H}_0$ be such that $p \mid_{\mathcal{H}_0} (ab)$ and $\gcd_{\mathcal{H}_0}(p,a) = 1$. Then $p \mid_{\mathcal{H}_0} b$.
(k) Each pair of elements $p, q \in \mathcal{H}_0\backslash\{0\}$ has a least common multiple $\mathrm{lcm}_{\mathcal{H}_0}(p,q) \in \mathcal{H}_0$, given by $\frac{pq}{\gcd_{\mathcal{H}_0}(p,q)}$. In other words, the intersection of finitely many principal ideals is principal.
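Part (f) can be made concrete with a small interpolation. A sketch (sympy assumed) for the sample data $k = 1$ and $\phi = s^2$: the double zero of $\phi$ at $s = 0$ forces $\delta(0) = 1$ and $\delta'(0) = -1$, so $\delta(s) = 1 - s$ works, and $(z - \delta)^*/\phi$ is indeed entire at $0$:

```python
import sympy as sp

s = sp.symbols('s')

phi = s**2
delta = 1 - s                        # interpolates e^{-s} to first order at s = 0
q_star = (sp.exp(-s) - delta) / phi  # (z - delta)^* / phi with z -> e^{-s}

# the singularity at s = 0 is removable: the series has no negative powers
ser = sp.series(q_star, s, 0, 3).removeO()
val = sp.limit(q_star, s, 0)
```

For higher-order zeros or several zeros of $\phi$ the same Hermite-type interpolation conditions determine $\delta$, in accordance with part (a).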
Before turning to the proof we want to briefly comment on some of the assertions above. Remark 3.1.3 (i) Part (c) is a very simple instance of a result given in [4, p. 270], where this property is called stability and proven for an analogous situation including polynomials and exponential functions of several complex variables. In
3.1 Divisibility Properties
27
that generality the proof requires much more sophisticated methods from harmonic analysis, whereas in the case of interest to us, namely complex functions in one variable, the result can easily be obtained from the Theorem of Bezout for algebraic plane curves, as will be shown below.
(ii) The sole reason for $\mathcal{H}_0$ not being factorial, as stated in (e), is that its elements usually have infinitely many characteristic zeros, hence, by the very definition of $\mathcal{H}_0$, infinitely many irreducible (linear) factors. In fact, we will make frequent use of this property. With a suitable adaptation guaranteeing convergence, these linear factors can be arranged in an infinite product according to Weierstrass's factorization theorem. However, we will not utilize this nice result from complex analysis.
(iii) The importance of part (f) is perhaps most clearly seen by concentrating on the polynomial part $\mathbb{R}[s,z]$ of $\mathcal{H}_0$. Regard $q \in \mathbb{R}[s,z]$ as a polynomial in $z$ with coefficients in $\mathbb{R}[s]$. In general its highest coefficient, say $\phi$, is not a unit in $\mathbb{R}[s]$ and therefore prevents division with remainder by $q$ in $\mathbb{R}[s,z]$. While simply dividing by $\phi$ would in general not be admissible in $\mathcal{H}_0$, we can "normalize" $q$ via multiplication with a suitable function $(z-\delta)\phi^{-1} \in \mathcal{H}_0$. With an appropriate handling of possible denominators, this basic idea is easily exploited to establish the division with remainder given in (g). Due to the increase of the degree (with respect to $z$) in the normalization, one cannot in general achieve a strictly smaller remainder.
(iv) Part (i) will be of central importance for the algebraic structure of $\mathcal{H}_0$. In Theorem 3.1.6 we will encounter an alternative proof for the existence of greatest common divisors. Yet we think it is worth presenting the version above, as it shows more directly the connection to the greatest common divisor for entire functions.

PROOF OF PROPOSITION 3.1.2: (a) is obvious.
(b) The first part is clear, while the second one follows from Proposition 2.5(2).
(c) The direction "$\Leftarrow$" is true since $p \mapsto p^*$ is a ring homomorphism. As for "$\Rightarrow$", let $q^*(p^*)^{-1} \in H(\mathbb{C})$ [...]
(d) "i) $\Rightarrow$ ii)": Let $p \in \mathcal{H}_0$ be irreducible and $p \neq az$ for all $a \in \mathbb{R}$. By (b) there exists $\lambda \in$ [...] "ii) $\Rightarrow$ iii)" follows easily, whereas "iii) $\Rightarrow$ i)" is true in every commutative domain.
28
3 The Algebraic Structure of $\mathcal{H}_0$
(e) Consider $z - 1 \in \mathcal{H}_0$. The polynomials $p_\nu = (s - 2\pi i\nu)(s + 2\pi i\nu) \in \mathbb{R}[s]$ for $\nu \in \mathbb{N}$ satisfy $\frac{z-1}{\prod_{\nu=1}^{n}p_\nu} \in \mathcal{H}_0$ for all $n \in \mathbb{N}$. Hence $z - 1$ has infinitely many irreducible factors in $\mathcal{H}_0$ and
$$(z-1) \subsetneq \Big(\frac{z-1}{p_1}\Big) \subsetneq \Big(\frac{z-1}{p_1p_2}\Big) \subsetneq \cdots$$
is an infinite properly ascending chain of ideals in $\mathcal{H}_0$.
(f) is a simple interpolation: one needs $\delta^{(\nu)}(\lambda) = (-k)^{\nu}e^{-k\lambda}$ for each root $\lambda \in V(\phi)$ and $0 \le \nu \le \mathrm{ord}_\lambda(\phi) - 1$. Using (a), such a polynomial $\delta$ exists, even with coefficients in $\mathbb{R}$, cf. [21, p. 37].
(g) Write
$$p = \frac{\sum_{j=0}^{L}p_jz^j}{\phi}, \qquad q = \frac{\sum_{j=0}^{M}q_jz^j}{\phi},$$
where $p_j, q_j, \phi \in \mathbb{R}[s]$ and $p_L \neq 0 \neq q_M$. Only the case $L > M$ needs consideration. Using (f), one may find $\delta \in \mathbb{R}[s]$ such that $\frac{z-\delta}{q_M} \in \mathcal{H}_0$. Then $p' := p - p_Lz^{L-M-1}\frac{z-\delta}{q_M}\,q \in \mathcal{H}_0$ and $\deg_z p' < \deg_z p$. This way, we can proceed until the degree of the remainder is reduced to $M$.
(h) Simply distribute the zeros of $\phi$ in an appropriate way, see also (c).
(i) Only the case $p \neq 0 \neq q$ needs consideration. Let $p = \frac{a}{\phi}$ and $q = \frac{b}{\phi}$, where
$a, b \in \mathbb{R}[s,z]$ and $\phi \in \mathbb{R}[s]$. We proceed in two steps. First, a greatest common divisor of $a, b$ in $\mathbb{R}[s,z]$ is extracted. Thereafter only finitely many common characteristic zeros are left, producing a polynomial gcd in $H(\mathbb{C})$. The details are as follows. Define $g = \gcd_{\mathbb{R}[s,z]}(a, b) \in \mathbb{R}[s,z]$ and let $ga_1 = a$, $gb_1 = b$. Moreover, write $\phi = \phi_1\phi_2$ in $\mathbb{R}[s]$ such that
$$\frac{a_1}{\phi_1},\ \frac{b_1}{\phi_1},\ \frac{g}{\phi_2} \in \mathcal{H}_0,$$
which is possible by applying (h) to the situation $\phi \mid_{\mathcal{H}_0} a_1g$. The coprimeness of $a_1$ and $b_1$ in $\mathbb{R}[s,z]$ implies $\#V\big(\frac{a_1}{\phi_1}, \frac{b_1}{\phi_1}\big) < \infty$. Therefore $\psi := \gcd_{H(\mathbb{C})}\big(\big(\tfrac{a_1}{\phi_1}\big)^*, \big(\tfrac{b_1}{\phi_1}\big)^*\big) \in \mathbb{R}[s]$ and
$$p = \frac{g}{\phi_2}\,\psi\,\frac{a_1}{\phi_1\psi}, \qquad q = \frac{g}{\phi_2}\,\psi\,\frac{b_1}{\phi_1\psi}$$
are factorizations of $p$ and $q$ within $\mathcal{H}_0$. Since $\#V\big(\frac{a_1}{\phi_1\psi}, \frac{b_1}{\phi_1\psi}\big) = 0$, we obtain
$$\gcd_{\mathcal{H}_0}(p, q) = \frac{g}{\phi_2}\,\psi.$$
(j) is a consequence of (c) and (i).
(k) can be shown by standard calculations in $H(\mathbb{C})$. Alternatively, a proof will be provided for a matrix version of the assertion in Theorem 3.2.8. It will make use of the Bezout property (proven for $\mathcal{H}_0$ in Theorem 3.1.6). □
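The interpolation in part (f) can be made concrete numerically. In the sketch below, the data $\phi = s(s-1)$ and $k = 1$ are a made-up choice for illustration, not taken from the text: we build $\delta \in \mathbb{R}[s]$ with $\delta(\lambda) = e^{-k\lambda}$ at the simple roots $\lambda = 0, 1$ of $\phi$, so that $(z^k - \delta)/\phi$ has an entire $*$-image.

```python
import cmath

# Hypothetical data: phi = s*(s-1) with simple roots 0 and 1, delay power k = 1.
# Part (f) asks for a real polynomial delta matching e^{-k*lam} at each root
# lam of phi; then (z^k - delta)/phi lies in H_0.
k = 1
lam0, lam1 = 0.0, 1.0
d0, d1 = 1.0, cmath.exp(-k * lam1).real
# Lagrange interpolation through the two nodes:
delta = lambda s: d0 * (s - lam1) / (lam0 - lam1) + d1 * (s - lam0) / (lam1 - lam0)

f = lambda s: (cmath.exp(-k * s) - delta(s)) / (s * (s - 1))  # the *-image

# the candidate poles s = 0, 1 are removable: f stays bounded nearby
for pole in (0.0, 1.0):
    vals = [abs(f(pole + eps)) for eps in (1e-3, 1e-4, 1e-5)]
    assert max(vals) < 10, vals
```

For roots of higher multiplicity the same idea requires Hermite interpolation, matching derivatives up to order $\mathrm{ord}_\lambda(\phi) - 1$ as stated in the proof.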
Remark 3.1.4
A glance at the proof of (i) shows that the greatest common divisor in $\mathcal{H}_0$ of polynomials $p$ and $q$ is a polynomial, too.

Remark 3.1.5
For the ring $\mathcal{H}$ the situation becomes even smoother. Since the units of $\mathcal{H}$ are given by the set
$$\mathcal{H}^\times = \{az^k \mid a \in \mathbb{R}^\times,\ k \in \mathbb{Z}\} = \{p \in \mathcal{H} \mid V(p^*) = \emptyset\} = \{p \in \mathcal{H} \mid p^* \in H(\mathbb{C})^\times\},$$
and because of the relationship (3.1.1), the results above translate easily into corresponding properties for $\mathcal{H}$. One simply has to adapt the formulations whenever the element $z$ is involved. In particular, $p$ and $q$ are coprime in $\mathcal{H}$ if and only if $p^*$ and $q^*$ are coprime in $H(\mathbb{C})$. Note also that Proposition 3.1.2(c) can be rephrased as saying that $\mathcal{H}$ is the largest ring extension of $\mathbb{R}[s,z]$ within $\mathbb{R}(s,z)$ to which the embedding (2.8) can be extended. Put another way, the ring $\mathcal{H}$ can be written as
$$\mathcal{H} = \{f \in \mathbb{R}(s,z) \mid f^* \in H(\mathbb{C})\}.$$
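Membership in $\mathcal{H}$ amounts to checking that $f^*$ has only removable singularities. A quick numerical probe for the element $f = (z-1)/s$ (a hypothetical example, chosen here only for illustration):

```python
import cmath

# f = (z - 1)/s ;  f*(s) = (exp(-s) - 1)/s must be entire for f to lie in H.
fstar = lambda s: (cmath.exp(-s) - 1) / s

# the only candidate pole is s = 0; the limit there is -1, so it is removable
samples = [1e-2, 1e-3 + 1e-3j, -1e-4j]
assert all(abs(fstar(s) + 1) < 0.05 for s in samples)
```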
The proof of the existence of the greatest common divisor given above is constructive in the sense that it shows exactly which steps lead to the desired result. However, the practical computations involve serious difficulties, as one needs to compute the common zeros of exponential polynomials. Before presenting some examples, we want to establish the main result of this section. Its proof demonstrates an alternative way for the computation of a greatest common divisor. But even more will be obtained. The procedure generates a linear combination for the greatest common divisor, showing that $\mathcal{H}_0$ is a Bezout domain. As a by-product, and as a consequence of the sort of division with remainder given in Proposition 3.1.2(g), one observes that each unimodular matrix is a finite product of elementary matrices. We remark that this is true, for the same reason, over every Euclidean domain, but not, in general, for the ring $\mathbb{R}[s,z]$. A counterexample in the form of a $2\times 2$-matrix over $\mathbb{R}[s,z]$ has been found in [16]. We present this matrix along with a factorization into elementary matrices over $\mathcal{H}_0$ in Example 3.2.3(2) in the next section. It is worth mentioning that for $n > 2$ unimodular $n\times n$-matrices over $\mathbb{R}[s,z]$ are always finite products of elementary matrices. This is a special case of Suslin's stability theorem [106]. Interestingly enough, the unimodular matrices over the ring $H(\mathbb{C})$ of entire functions are also finite products of elementary matrices, see [82, p. 141]. In this case the argument is completely different from that for $\mathcal{H}$ and will be addressed briefly in Remark 3.1.10 below. Part (c) below is a technical fact which will be needed in the next section in
order to prove that $\mathcal{H}$ is an elementary divisor domain. If one translates the adequate factorization stated in (c) into entire functions, one observes that the factor $b^*$ is made up of exactly all common zeros of $p^*$ and $q^*$ with the multiplicity they have in $p^*$. This formulation shows that the ring $H(\mathbb{C})$ itself is adequate, too. In our approach, the adequate factorization will be mainly used to prove that $\mathcal{H}$ is an elementary divisor domain; see the next section. Recall the notation given in 3.1(b).

Theorem 3.1.6
Let $\mathcal{K}$ be any of the rings $\mathcal{H}$ and $\mathcal{H}_0$.
(a) $\mathcal{K}$ is a Bezout domain, that is, each finitely generated ideal is principal. In other words, for all $p_1, \dots, p_n \in \mathcal{K}$ (not all zero) and every $d = \gcd_{\mathcal{K}}(p_1, \dots, p_n)$ there exist $a_1, \dots, a_n \in \mathcal{K}$ such that
$$a_1p_1 + \cdots + a_np_n = d. \tag{3.1.2}$$
Furthermore, there even exists a matrix $U \in E_n(\mathcal{K})$ such that
$$U(p_1, \dots, p_n)^T = (d, 0, \dots, 0)^T.$$
We call (3.1.2) a Bezout identity or a Bezout equation for the elements $p_1, \dots, p_n \in \mathcal{K}$.
(b) $E_n(\mathcal{K}) = Gl_n(\mathcal{K})$.
(c) $\mathcal{K}$ is an adequate ring, that is, for each pair of elements $p, q \in \mathcal{K}\setminus\{0\}$ there exists a factorization $p = ab$ for some $a, b \in \mathcal{K}$ such that $\gcd_{\mathcal{K}}(a, q) = 1$ and $\gcd_{\mathcal{K}}(\bar{b}, q) \notin \mathcal{K}^\times$ for every divisor $\bar{b} \in \mathcal{K}\setminus\mathcal{K}^\times$ of $b$.
In Section 3.4, where the nonfinitely generated ideals are described, an alternative argument for $\mathcal{H}$ being adequate will come along as a by-product.

PROOF: It is easy to see that we can restrict to the ring $\mathcal{H}_0$, cf. (3.1.1).
(a) Using the sort of division with remainder given in Proposition 3.1.2(g), one can proceed as for matrices over Euclidean domains. Without restriction we may assume $p_i \neq 0$ for $i = 1, \dots, n$. Write
$$p_i = \frac{\sum_{j=0}^{M_i}p_{ij}z^j}{\phi}, \quad\text{where } p_{ij}, \phi \in \mathbb{R}[s],\ p_{iM_i} \neq 0.$$
Without restriction let $M_1 \le M_k$ for $k = 1, \dots, n$. We will show that by elementary row transformations applied to the vector $(p_1, \dots, p_n)^T$ the degrees of the elements $p_k$ with respect to $z$ can be reduced. In order to do so, consider the following two cases.
i) If $M_k > M_1$ for some $k$, we use Proposition 3.1.2(g) to accomplish $p_k \mapsto p_k - fp_1$ for some $f \in \mathcal{H}_0$ with $\deg_z(p_k - fp_1) \le \deg_z p_1$. Proceeding this way, we can achieve via elementary operations that the degrees of $p_2, \dots, p_n$ are at most $M_1$.
ii) If $\deg_z p_1 = \cdots = \deg_z p_n = M_1$, we can handle the vector of highest coefficients $(p_{1M_1}, p_{2M_1}, \dots, p_{nM_1})^T$ via elementary transformations in the Euclidean domain $\mathbb{R}[s]$. Let $\delta := \gcd_{\mathbb{R}[s]}(p_{1M_1}, \dots, p_{nM_1}) \in \mathbb{R}[s]$. Then there exists a transformation matrix $V \in E_n(\mathbb{R}[s])$ such that $V(p_{1M_1}, p_{2M_1}, \dots, p_{nM_1})^T = (\delta, 0, \dots, 0)^T$, see [36, pp. 134]. Hence
$$V(p_1, \dots, p_n)^T = (\tilde{p}_1, \dots, \tilde{p}_n)^T \in \mathcal{H}_0^n \quad\text{and}\quad \deg_z \tilde{p}_j < M_1 = \deg_z \tilde{p}_1 \ \text{ for } j = 2, \dots, n.$$
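Step ii) rests on the extended Euclidean algorithm in $\mathbb{R}[s]$. A brief sympy sketch, where the two highest coefficients are made-up data:

```python
from sympy import symbols, gcdex, simplify

s = symbols('s')
p1, p2 = s**2 - 1, s**2 + s          # hypothetical highest coefficients in R[s]
a, b, delta = gcdex(p1, p2, s)       # a*p1 + b*p2 == delta = gcd(p1, p2)
assert simplify(a * p1 + b * p2 - delta) == 0

# The rows (a, b) and (-p2/delta, p1/delta) form a matrix V of determinant 1
# over R[s] with V*(p1, p2)^T = (delta, 0)^T, as used in the proof.
assert simplify((-p2 / delta) * p1 + (p1 / delta) * p2) == 0
```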
Combining these two methods we arrive after finitely many steps at
$$U(p_1, \dots, p_n)^T = (d, 0, \dots, 0)^T$$
for some matrix $U \in E_n(\mathcal{H}_0)$ and some $d \in \mathcal{H}_0$. By the unimodularity of $U$, the resulting element $d$ is a greatest common divisor of $p_1, \dots, p_n$ in $\mathcal{H}_0$.
(b) follows from (a) by induction, using for $(p_1, \dots, p_n)^T$ the first column of a unimodular matrix.
(c) The idea of the proof is as follows: factor $p = ab$ such that $V(b^*) = V(p^*, q^*)$ and $\mathrm{ord}_\lambda(b^*) = \mathrm{ord}_\lambda(p^*)$ for all $\lambda \in V(b^*)$. This can easily be done within the ring $\mathcal{H}_0$ if $\#V(p^*, q^*) < \infty$. In the other case an iterative procedure is needed. First of all, it is easy to see that we may restrict to the case where $z \nmid_{\mathcal{H}_0} q$, which will simplify the use of Proposition 3.1.2(i) later in the proof. As for the iteration, start with $b_1 := \gcd_{\mathcal{H}_0}(p, q) \in \mathcal{H}_0$ and put $a_1 := \frac{p}{b_1}$. Next, define successively for $p = a_ib_i$, $i \in \mathbb{N}$, the following elements:
$$c_i := \gcd_{\mathcal{H}_0}(a_i, b_i), \qquad a_{i+1} := \frac{a_i}{c_i}, \qquad b_{i+1} := c_ib_i. \tag{3.1.3}$$
Hence $p = a_ib_i = a_{i+1}b_{i+1}$. This produces a sequence of elements $a_i \in \mathcal{H}_0$ where $a_{i+1} \mid_{\mathcal{H}_0} a_i$. But then $a_{i+1}$ divides $a_i$ also in the principal ideal ring $\mathbb{R}(s)[z]$, with the consequence that for some $k \in \mathbb{N}$ the element $c_k \in \mathcal{H}_0$ is a unit in $\mathbb{R}(s)[z]$, hence $c_k \in \mathbb{R}[s]\setminus\{0\}$. As a consequence, $V(a_k^*, b_k^*)$ is finite, say $V(a_k^*, b_k^*) = \{\lambda_1, \dots, \lambda_n\}$, and we can define
$$f := \prod_{i=1}^{n}(s - \lambda_i)^{l_i} \in \mathbb{R}[s], \quad\text{where } l_i = \mathrm{ord}_{\lambda_i}(a_k^*). \tag{3.1.4}$$
Defining $a := a_kf^{-1} \in \mathcal{H}_0$ and $b := fb_k \in \mathcal{H}_0$, we get $p = ab$. It remains to show that this factorization satisfies the requirements of the theorem.
1) To establish the coprimeness of $a$ and $q$, suppose $V(a^*, q^*) \neq \emptyset$ and let $\lambda \in V(a^*, q^*) \subseteq V(p^*, q^*) = V(b_1^*)$. Then $\lambda \in V(b_1^*, a_k^*) \subseteq V(a_k^*, b_k^*) = \{\lambda_1, \dots, \lambda_n\}$. But for $\lambda = \lambda_j$ we have $\mathrm{ord}_\lambda(a^*) = \mathrm{ord}_{\lambda_j}(a_k^*) - \mathrm{ord}_{\lambda_j}(f) = 0$. Hence $V(a^*, q^*) = \emptyset$, and from Proposition 3.1.2(i) we conclude the coprimeness of $a$ and $q$.
2) Let $\bar{b} \in \mathcal{H}_0\setminus\mathcal{H}_0^\times$ be a divisor of $b$. Since $z \nmid_{\mathcal{H}_0} q$, we also have $z \nmid_{\mathcal{H}_0} \bar{b}$. As a consequence there is some $\lambda \in V(\bar{b}^*) \subseteq V(b^*)$. The construction (3.1.3) of the sequences $(c_i)$ and $(b_i)$ leads to the following identities of varieties (recall that we count zeros in $V$ without multiplicity):
$$V(b^*) = V(fb_k^*) = V(b_k^*) = V(c_{k-1}^*b_{k-1}^*) = V(b_{k-1}^*) = \cdots = V(b_1^*) = V(p^*, q^*).$$
Thus $\lambda \in V(q^*, \bar{b}^*)$, and therefore $\bar{b}$ and $q$ are not coprime.
Note that in the case where $V(p^*, q^*) = \{\lambda_1, \dots, \lambda_n\}$ is finite, the construction above leads directly to the factorization $p = \frac{p}{b}\,b$, where $b = \prod_{i=1}^{n}(s - \lambda_i)^{l_i}$ and $l_i = \mathrm{ord}_{\lambda_i}(p^*)$. □

The procedure given in (a) for the Bezout identity is, although somehow natural, not very practical, as the examples will show. A better procedure, requiring fewer steps, can be found in [39, Rem. 2.5]. But that one has some shortcomings, too, for it needs a priori knowledge of a greatest common divisor and does not imply part (b) about unimodular matrices. We will demonstrate that procedure in Example 3.1.9(3).

Remark 3.1.7
The result as stated above has first been proven in [42]. The adequateness has been obtained in discussion with Schmale [98]. In special cases, basically if the elements are coprime and one of them is monic in $s$, a Bezout identity has been derived earlier in a fairly different setting, see [85, Sec. 4], [63, (3.2), (4.14)] as well as later [9]. In [5, Prop. 7.8] a Bezout identity $1 = \sum_{i=1}^{n}f_ig_i$ has been obtained for exponential polynomials $f_i$. [...] For delays $\tau_1, \dots, \tau_l > 0$ that are linearly independent over $\mathbb{Q}$, as shown in [47, Thms. 5.4 and 5.9], the algebraic approach leads to the operator algebra
$$\mathcal{H}^{(l)} := \left\{\frac{p}{q} \;\middle|\; p, q \in \mathbb{R}[s, z_1, \dots, z_l],\ \ker q \subseteq \ker p\right\} = \{f \in \mathbb{R}(s, z_1, \dots, z_l) \mid f^* \in H(\mathbb{C})\}$$
$$= \left\{\frac{p}{z^{\nu}q} \;\middle|\; p \in \mathbb{R}[s, z_1, \dots, z_l],\ q \in \mathbb{R}[s]\setminus\{0\},\ \nu \in \mathbb{N}_0^l,\ \frac{p^*}{q^*} \in H(\mathbb{C})\right\},$$
where now $f^*(s) := f(s, e^{-\tau_1s}, \dots, e^{-\tau_ls})$ and $z^{\nu} := z_1^{\nu_1}\cdots z_l^{\nu_l}$. The last identity above is due to [4]. Note that $\mathcal{H} = \mathcal{H}^{(1)}$ if $\tau_1 = 1$. A fairly simple example [47, Exa. 5.13] reveals that $\mathcal{H}^{(l)}$ is not a Bezout domain whenever $l > 1$. As a consequence, serious obstacles arise for an algebraic approach to equations with noncommensurate delays. We will touch upon these issues in Chapter 4.
We illustrate the determination of a greatest common divisor along with a Bezout identity by some simple examples. Part (b) of the theorem above will be addressed in the next section when considering matrices over $\mathcal{H}$, see Example 3.2.3.

Example 3.1.9
For computational issues, which will be addressed in Section 3.6, we will keep track of the coefficients of the indeterminates $s$ and $z$ in the calculations below, starting with coefficients in $\mathbb{Q}$.
(1) Let $p = z + s^2 - 1$, $q = s^2 \in \mathbb{Q}[s, z]$. Then $\gcd_{\mathcal{H}_0}(p, q) = s$, since $p^*(0) = 0$ and $(p^*)'(0) \neq 0$. In this case, a Bezout identity $s = ap + bq$ over $\mathcal{H}_0$ can easily be found by rewriting it as
$$b = \frac{s - ap}{q} = \frac{s - a(z + s^2 - 1)}{s^2}.$$
Now the requirement $b^* \in H(\mathbb{C})$ forces $a \in \mathcal{H}_0$ to be such that the function $f(s) = s - a^*(s)(e^{-s} + s^2 - 1)$ has a zero of multiplicity $2$ at $s = 0$. The simple choice $a = -1 \in \mathcal{H}_0$ suffices and leads to the Bezout identity
$$s = -(z + s^2 - 1) + \frac{s + (z + s^2 - 1)}{s^2}\,s^2$$
over $\mathcal{H}_0$. Notice that $a$ and $b$ are in $\mathbb{Q}(s)[z] \cap \mathcal{H}_0$, that is, all the coefficients of the indeterminates $s$ and $z$ in the equation above are in $\mathbb{Q}$.
(2) Let $p = z$, $q = s + 1 \in \mathbb{Q}[s, z]$. Then $p$ and $q$ are coprime in $\mathcal{H}_0$ and for a Bezout equation $1 = ap + bq$ one needs $b^* = \frac{1 - a^*e^{-s}}{s+1} \in H(\mathbb{C})$, leading to the sole condition $a^*(-1) = e^{-1}$. Hence
$$1 = e^{-1}z + \frac{1 - e^{-1}z}{s+1}\,(s+1)$$
is a Bezout identity as desired. In this case the coefficients of $s$ and $z$ are in the field $\mathbb{Q}(e)$. It is easy to see that no Bezout equation with coefficients in the field $\mathbb{A}$ of algebraic numbers exists.
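Both identities can be checked numerically; the sketch below (sample points chosen arbitrarily) evaluates the starred functions:

```python
import cmath

# Example (1): s = a*p + b*q with a = -1, p*(s) = exp(-s) + s^2 - 1, q = s^2,
# and b* = (s + exp(-s) + s^2 - 1)/s^2.
p1star = lambda s: cmath.exp(-s) + s * s - 1
b1star = lambda s: (s + p1star(s)) / (s * s)
for s in (0.7, -1.3 + 0.5j, 2j):
    assert abs(-p1star(s) + b1star(s) * s * s - s) < 1e-12
# b* is entire: the double pole of 1/s^2 at s = 0 cancels; the limit is 3/2
assert abs(b1star(1e-4) - 1.5) < 1e-3

# Example (2): 1 = e^{-1}*z + b*(s+1) with b* = (1 - e^{-1}*exp(-s))/(s+1).
b2star = lambda s: (1 - cmath.exp(-1) * cmath.exp(-s)) / (s + 1)
for s in (0.4, 1 - 2j):
    assert abs(cmath.exp(-1) * cmath.exp(-s) + b2star(s) * (s + 1) - 1) < 1e-12
# the singularity of b* at s = -1 is removable; the limit there is 1
assert abs(b2star(-1 + 1e-5) - 1.0) < 1e-3
```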
(3) Let $p = \frac{z - e}{s + 1}$, $q = s + z \in \mathcal{H}_0$. The elements are coprime since the equations $e^{-\lambda} = e$ and $\lambda + e^{-\lambda} = 0$ have no common zeros in $\mathbb{C}$. To obtain a Bezout identity we first let $a = 1$ and $b = -(s+1)$ and get
$$aq + bp = s + e. \tag{3.1.5}$$
This is indeed the first step in the procedure given in the proof of Theorem 3.1.6(a) and corresponds to the elementary transformation
$$\begin{bmatrix} -(s+1) & 1 \\ 1 & 0 \end{bmatrix}\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} s + e \\ \frac{z-e}{s+1} \end{pmatrix}.$$
The next step of the procedure would be a transformation of the type
$$\begin{bmatrix} 1 & 0 \\ -\frac{z-\delta}{(s+1)(s+e)} & 1 \end{bmatrix}\begin{pmatrix} s + e \\ \frac{z-e}{s+1} \end{pmatrix} = \begin{pmatrix} s + e \\ \frac{\delta - e}{s+1} \end{pmatrix},$$
where $\delta \in \mathbb{R}[s]$ satisfies $\delta(-1) = e$, $\delta(-e) = e^e$. Instead of going this way, which would require another step thereafter, we proceed as follows. Equation (3.1.5) implies
$$[q^*(-e),\ p^*(-e)]\begin{pmatrix} a^*(-e) \\ b^*(-e) \end{pmatrix} = 0,$$
thus, by coprimeness of $p$ and $q$, it follows that
$$\begin{pmatrix} a^*(-e) \\ b^*(-e) \end{pmatrix} \in \mathrm{Im}_{\mathbb{R}}\begin{pmatrix} -p^*(-e) \\ q^*(-e) \end{pmatrix}.$$
Indeed, with the given $p$, $q$, $a$, and $b$ one has
$$\begin{pmatrix} a^*(-e) \\ b^*(-e) \end{pmatrix} = c\begin{pmatrix} -p^*(-e) \\ q^*(-e) \end{pmatrix},$$
where $c = \frac{e-1}{e^e-e} \in \mathbb{R}$. As a consequence, $\frac{a+cp}{s+e}$ and $\frac{b-cq}{s+e}$ are in $\mathcal{H}_0$, and altering Equation (3.1.5) leads to the Bezout identity
$$1 = \frac{a+cp}{s+e}\,q + \frac{b-cq}{s+e}\,p = \frac{(1-e)z + (e-e^e)s - e^e + e^2}{(e-e^e)(s+1)(s+e)}\,q + \frac{(e-1)z + (e^e-1)s + e^e - e}{(e-e^e)(s+e)}\,p$$
with coefficients in $\mathbb{Q}(e, e^e)$. The examples (2) and (3) should demonstrate how (successive) Bezout identities force one to extend the field of coefficients step by step, in this case from $\mathbb{Q}$ through $\mathbb{Q}(e)$ to $\mathbb{Q}(e, e^e)$. It seems unknown whether the transcendence degree of $\mathbb{Q}(e, e^e)$ is two, which is what one would expect. This is a very specific case of a more
general conjecture of Schanuel in transcendental number theory, which we will present in 3.6.5. However, very little is known about this conjecture (just to give an example, it is only known that at least one of the numbers $e^e$ or $e^{e^2}$ is transcendental, see [1, p. 119]). Handling of the successive field extensions forms an important (and troublesome) issue in symbolic computations of Bezout identities in $\mathcal{H}_0$. We will turn to these questions in Section 3.6.
The results stated so far show a striking resemblance of $\mathcal{H}$ and $H(\mathbb{C})$ with respect to their algebraic structure. But there are also differences; one of them is presented next, another one concerns the dimension of the rings and has to be postponed until Section 3.4.

Remark 3.1.10
For a commutative domain $R$ with unity one says that $1$ is in the stable range of $R$ if for all $n \ge 1$ and $a_1, \dots, a_{n+1} \in R$ satisfying $R = (a_1, \dots, a_{n+1})$ there exist $b_1, \dots, b_n \in R$ such that $R = (a_1 + b_1a_{n+1}, \dots, a_n + b_na_{n+1})$, see e.g. [30, p. 345]. It is easy to see that this is equivalent to the property that for all $a_1, \dots, a_{n+1} \in R$ satisfying $R = (a_1, \dots, a_{n+1})$ there exist $c_2, \dots, c_{n+1} \in R$ such that $a_1 + \sum_{i=2}^{n+1}c_ia_i$ is a unit in $R$. While this is true for the ring $H(\mathbb{C})$, see [82, p. 138], it is not the case for the rings $\mathcal{H}$ and $\mathcal{H}_0$, as the following example shows. Let $a_1 = z - 1$ and $a_2 = (s-1)(s-2) \in \mathcal{H}$. Then $a_1$ and $a_2$ are coprime in $\mathcal{H}$ and a Bezout equation $1 = c_1a_1 + c_2a_2$ implies for the coefficients
$$c_1 = \frac{1 - c_2a_2}{a_1}, \qquad c_2 = \frac{1 - c_1a_1}{a_2} \in \mathcal{H}.$$
Considering the roots of the denominators it can be seen that neither of the coefficients $c_1$ and $c_2$ can be a unit in $\mathcal{H}$. In [82, p. 139] it has been proven that for every Bezout domain with $1$ in the stable range, unimodular matrices are finite products of elementary matrices. This result applies in particular to the ring $H(\mathbb{C})$, and we arrive at Theorem 3.1.6(b) for $\mathcal{K} = H(\mathbb{C})$.
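The Bezout identity obtained in Example 3.1.9(3) above can also be verified numerically. A sketch (sample points arbitrary):

```python
import cmath

e, ee = cmath.e, cmath.exp(cmath.e)               # the constants e and e^e
pstar = lambda s: (cmath.exp(-s) - e) / (s + 1)   # p = (z - e)/(s + 1)
qstar = lambda s: s + cmath.exp(-s)               # q = s + z

num_a = lambda s: (1 - e) * cmath.exp(-s) + (e - ee) * s - ee + e**2
num_b = lambda s: (e - 1) * cmath.exp(-s) + (ee - 1) * s + ee - e
coef_a = lambda s: num_a(s) / ((e - ee) * (s + 1) * (s + e))
coef_b = lambda s: num_b(s) / ((e - ee) * (s + e))

# the identity 1 = coef_a * q + coef_b * p holds at every sample point
for s in (0.3, -2.1 + 1.0j, 0.5j):
    assert abs(coef_a(s) * qstar(s) + coef_b(s) * pstar(s) - 1) < 1e-10

# the numerators vanish at the roots of the denominators, so the
# coefficients have removable singularities there, i.e. they lie in H_0
assert abs(num_a(-1)) < 1e-12 and abs(num_a(-e)) < 1e-12
assert abs(num_b(-e)) < 1e-12
```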
3.2 Matrices over $\mathcal{H}_0$

In this section we turn our attention to matrices over $\mathcal{H}_0$. First of all, it is an easy consequence of the Bezout property that one can always achieve left equivalent triangular forms. From Theorem 3.1.6(b) we know that this can even be done by elementary row transformations. But even more can be accomplished. It is a classical result that an adequate commutative Bezout domain allows diagonal reductions via left and right equivalence for its matrices. In other words, matrices admit a Smith form, just like matrices with entries in a Euclidean domain. This will be dealt with in the first theorem below and some consequences
will be pointed out. Thereafter we present a generalization of the concepts of greatest common divisors and least common multiples for matrices. As our arguments work over arbitrary commutative Bezout domains, the results will be given in that generality. The end of this section is devoted to a summary of the matrix-theoretic results in terms of general module theory. Let us start with triangular and diagonal forms.

Theorem 3.2.1
Let $\mathcal{K}$ be any of the rings $\mathcal{H}$ or $\mathcal{H}_0$. Then
(a) every matrix $P \in \mathcal{K}^{n \times m}$ is left equivalent to an upper triangular matrix, that is, there exists $U \in Gl_n(\mathcal{K})$ such that $UP$ is upper triangular,
(b) $\mathcal{K}$ is an elementary divisor domain, that is, by definition, every matrix $P \in \mathcal{K}^{n \times m}$ is equivalent to a diagonal matrix where each diagonal element divides the next one. Precisely, there exist $V \in Gl_n(\mathcal{K})$ and $W \in Gl_m(\mathcal{K})$ such that
$$VPW = \begin{bmatrix} \Delta & 0 \\ 0 & 0 \end{bmatrix}, \qquad \Delta = \mathrm{diag}(d_1, \dots, d_r), \tag{3.2.1}$$
with $r = \mathrm{rk}\,P$ and $d_i \in \mathcal{K}\setminus\{0\}$ satisfying $d_i \mid_{\mathcal{K}} d_{i+1}$ for $i = 1, \dots, r-1$. The diagonal elements are the invariant factors of $P$ and hence unique up to units of $\mathcal{K}$. (They are also called elementary divisors in [51, 64], explaining the name of rings with this type of diagonal reduction.)

PROOF: Part (a) is a consequence of Theorem 3.1.6(a). The statement in (b) follows from [51], see also [64, p. 473], where it has been proven that adequate Bezout domains are elementary divisor domains; recall Theorem 3.1.6(c) for the adequateness of $\mathcal{H}_0$. The uniqueness of the diagonal elements follows, just like for Euclidean domains, from the invariance of the elementary divisors under left and right equivalence, which in turn is a consequence of the Cauchy-Binet theorem (valid over every commutative domain), see e.g. [83, pp. 25] for principal ideal domains. □
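For classical principal ideal domains, diagonal reductions of this kind are implemented in computer algebra systems. A sketch over the stand-in domain $\mathbb{Z}$ using sympy's Smith normal form (the matrix entries are made up):

```python
from sympy import Matrix, ZZ
from sympy.matrices.normalforms import smith_normal_form

# made-up integer matrix; Z serves here as a stand-in elementary divisor domain
P = Matrix([[2, 4, 4],
            [-6, 6, 12],
            [10, 4, 16]])
S = smith_normal_form(P, domain=ZZ)

# S is diagonal and the invariant factors divide successively, as in (3.2.1)
d = [S[i, i] for i in range(3)]
assert all(S[i, j] == 0 for i in range(3) for j in range(3) if i != j)
assert all(d[i + 1] % d[i] == 0 for i in range(2) if d[i] != 0)
```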
It is worth mentioning that it is still an open conjecture whether every commutative Bezout domain is an elementary divisor domain, see [17, p. 492, ex. 7] and [68].

Remark 3.2.2
It is worthwhile noticing that left equivalent triangular forms can be obtained over an arbitrary commutative Bezout domain $R$. This can easily be seen as
follows. Let $P = (p_{ij}) \in R^{n \times m}$ and let $a_1p_{11} + \cdots + a_np_{n1} = \gcd_R(p_{11}, \dots, p_{n1}) =: d$ be a Bezout equation for the first column of $P$. Then the coefficients $a_1, \dots, a_n$ are coprime in $R$, hence form a unimodular row $(a_1, \dots, a_n)$, which, using again the Bezout property, can always be completed to a unimodular matrix $A \in Gl_n(R)$, see [12, pp. 81]. This way one can transform $P$ via left equivalence to a matrix with first column $(d, 0, \dots, 0)^T$. The rest follows by induction. Our proof of part (a) above is slightly simpler, since we made, implicitly, use of the division with remainder as given in Proposition 3.1.2(g) (see the proof of Theorem 3.1.6(a)).

Example 3.2.3
(1) Consider the matrix
$$P = \begin{bmatrix} s^2 & sz+1 \\ z-1 & sz^2-1 \end{bmatrix} \in \mathcal{H}_0^{2\times 2}.$$
Since the entries of $P$ are coprime in $\mathcal{H}_0$, an elementary divisor form of $P$ is given by
$$\begin{bmatrix} 1 & 0 \\ 0 & \det P \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & z^2(s^3-s) + z(s-1) - s^2 + 1 \end{bmatrix}.$$
In order to obtain also the transformation matrices, let us begin with deriving a triangular form. Notice that $\gcd_{\mathcal{H}_0}(s^2, z-1) = s$. The Bezout equation
$$s = \frac{s+z-1}{s^2}\,s^2 - (z-1)$$
can be derived as in Example 3.1.9(1). Hence we get the left unimodular transformation
$$\begin{bmatrix} \frac{s+z-1}{s^2} & -1 \\ -\frac{z-1}{s} & s \end{bmatrix}\begin{bmatrix} s^2 & sz+1 \\ z-1 & sz^2-1 \end{bmatrix} = \begin{bmatrix} s & \frac{s+z-1}{s^2}(sz+1) - (sz^2-1) \\ 0 & -\frac{z-1}{s}(sz+1) + s(sz^2-1) \end{bmatrix} =: \begin{bmatrix} s & \beta \\ 0 & \alpha \end{bmatrix}.$$
To obtain a diagonal form, notice that $s$ and $\alpha$ are coprime in $\mathcal{H}_0$, hence there exist $x, y \in \mathcal{H}_0$ such that $1 = xs + y\alpha$. The simple choice $y = z^3$ and $x = (1 - z^3\alpha)s^{-1} \in \mathcal{H}_0$ yields now the diagonal form.
(2) The matrix
$$M := \begin{bmatrix} 1+sz & s^2 \\ -z^2 & 1-sz \end{bmatrix}$$
is in $Gl_2(\mathbb{R}[s,z])$ but not in $E_2(\mathbb{R}[s,z])$, see [16] or [97, p. 676]. Over the ring $\mathcal{H}_0$, however, it factors into elementary matrices, for instance as
$$M = \begin{bmatrix} 1 & 0 \\ -\frac{s-1+z}{s} & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ s-1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 1-s & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ \frac{s-1+z}{s} & 1 \end{bmatrix},$$
where $\frac{s-1+z}{s} \in \mathcal{H}_0$, since $s - 1 + e^{-s}$ vanishes at $s = 0$.
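The determinant claims of Example 3.2.3 can be confirmed symbolically:

```python
from sympy import symbols, Matrix, simplify

s, z = symbols('s z')
P = Matrix([[s**2, s*z + 1],
            [z - 1, s*z**2 - 1]])
# in part (1) the nontrivial invariant factor equals det P,
# the entries of P being coprime
assert simplify(P.det() - (z**2*(s**3 - s) + z*(s - 1) - s**2 + 1)) == 0

M = Matrix([[1 + s*z, s**2],
            [-z**2, 1 - s*z]])
assert simplify(M.det()) == 1   # M is unimodular over R[s, z]
```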
Let us now return to the equivalence $p^* \mid_{H(\mathbb{C})} q^* \Leftrightarrow p \mid_{\mathcal{H}} q$ for $p, q \in \mathcal{H}$, given (for the ring $\mathcal{H}_0$) in Proposition 3.1.2(c). Using diagonal forms, this can easily be generalized to matrices. To this end, we extend the embedding $\mathcal{H} \hookrightarrow H(\mathbb{C})$ to matrices in the obvious entrywise way, thus
$$P = (p_{ij}) \longmapsto P^* := (p_{ij}^*). \tag{3.2.2}$$
Clearly, $(PQ)^* = P^*Q^*$ and $(P+Q)^* = P^* + Q^*$, whenever defined.
Proposition 3.2.4
Let $P_i \in \mathcal{H}^{p_i \times q}$, $i = 1, 2$, be two matrices. There exists $F \in H(\mathbb{C})^{p_2 \times p_1}$ such that $FP_1^* = P_2^*$ if and only if there exists $X \in \mathcal{H}^{p_2 \times p_1}$ such that $XP_1 = P_2$. If $P_1$ and $P_2$ have entries in $\mathcal{H}_0$ and $P_1$ satisfies $\mathrm{rk}_{\mathbb{R}(s)}P_1(s, 0) = \mathrm{rk}\,P_1$, then the matrix $X$ can be chosen with entries in $\mathcal{H}_0$, too.

PROOF: The if-part is obvious. As for the other direction, let $UP_1V = \begin{bmatrix} \Delta & 0 \\ 0 & 0 \end{bmatrix}$, where $\Delta$ is as in (3.2.1) and $U, V$ are unimodular. Assume $P_2V = [Q, Q']$ is partitioned accordingly. Then $P_2^*V^* = F(U^{-1})^*\begin{bmatrix} \Delta^* & 0 \\ 0 & 0 \end{bmatrix}$, thus $Q' = 0$ and $d_j^* \mid_{H(\mathbb{C})} Q_{ij}^*$ for the entries of $Q^*$. Proposition 3.1.2(c) yields $Q_{ij} = X_{ij}d_j$ for some $X_{ij} \in \mathcal{H}$. Defining $\tilde{X} = (X_{ij}) \in \mathcal{H}^{p_2 \times r}$, the desired left factor is given by $X = [\tilde{X}, 0]U \in \mathcal{H}^{p_2 \times p_1}$. The additional rank condition in the case of entries in $\mathcal{H}_0$ guarantees that the diagonal elements $d_j \in \mathcal{H}_0$ of $\Delta$ are not divisible by $z$, making Proposition 3.1.2(c) applicable again. □

Another standard consequence of the diagonal reduction is the following characterization of right invertibility for matrices over $\mathcal{H}$.

Corollary 3.2.5
For a matrix $P \in \mathcal{H}^{p \times q}$ the following conditions are equivalent:
(a) $P$ has a right inverse over $\mathcal{H}$, that is, $PM = I_p$ for some matrix $M \in \mathcal{H}^{q \times p}$.
(b) $P^*$ has a right inverse over $H(\mathbb{C})$.
(c) $\mathrm{rk}\,P^*(\lambda) = p$ for all $\lambda \in \mathbb{C}$.
(d) $P$ is right equivalent to $[I_p, 0]$.
(e) $P$ can be completed to a unimodular matrix $\begin{bmatrix} P \\ Q \end{bmatrix} \in Gl_q(\mathcal{H})$.
(f) The greatest common divisor of the full-size minors of $P$ is a unit in $\mathcal{H}$.
Furthermore, each matrix $Q \in \mathcal{H}^{r \times q}$ of rank $p$ can be factored as $Q = AP$, where $A \in \mathcal{H}^{r \times p}$ is of rank $p$ and $P \in \mathcal{H}^{p \times q}$ is right invertible over $\mathcal{H}$. The matrices $A$ and $P$ are unique up to right resp. left equivalence.
The corresponding equivalences are true when $\mathcal{H}$ is replaced by $\mathcal{H}_0$, provided that one adds the condition $\mathrm{rk}_{\mathbb{R}(s)}P(s, 0) = p$ in the parts (b) and (c).
It is worthwhile noticing that the above equivalences (a) $\Leftrightarrow$ (c) $\Leftrightarrow$ (e), if formulated accordingly, are valid for matrices over a polynomial ring $K[x_1, \dots, x_m]$ ($K$ any field), too, see also Theorem 5.1.12 in Chapter 5. This is the celebrated Theorem of Quillen/Suslin on projective modules.
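Condition (c) is easy to test on concrete data. In the hypothetical example below (not taken from the text), $P = [s, z] \in \mathcal{H}^{1\times 2}$ has $\mathrm{rk}\,P^*(\lambda) = 1$ for every $\lambda$, since $e^{-\lambda}$ never vanishes, and $M = \big(\frac{1-z}{s}, 1\big)^T$ is a right inverse:

```python
import cmath

# P* = [s, exp(-s)]; M* = ((1 - exp(-s))/s, 1)^T; P*M* = (1-z) + z = 1,
# and (1 - exp(-s))/s is entire, so M has entries in H.
m1 = lambda s: (1 - cmath.exp(-s)) / s
for s in (0.8, -1.5 + 2j, 3j):
    assert abs(m1(s) * s + cmath.exp(-s) - 1) < 1e-12
# the singularity of m1 at s = 0 is removable; the limit there is 1
assert abs(m1(1e-6) - 1.0) < 1e-4
```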
PROOF: "(a) $\Rightarrow$ (b) $\Rightarrow$ (c)" is obvious, and so is "(c) $\Leftrightarrow$ (f)", recalling the units in $\mathcal{H}$ from Remark 3.1.5. As for "(c) $\Rightarrow$ (d)", let $PU = [\Delta, 0]$ with $\Delta \in \mathcal{H}^{p \times p}$. Then $P^*U^* = [\Delta^*, 0]$ and hence $\det \Delta^*$ is a unit in $H(\mathbb{C})$, so that $\Delta \in Gl_p(\mathcal{H})$. "(d) $\Rightarrow$ (e)" follows by a Laplace expansion of $\det\begin{bmatrix} P \\ Q \end{bmatrix}$ along the block row given by $P$. The implication "(e) $\Rightarrow$ (a)" is trivial.
For the factorization of $Q$ use a diagonal form $Q = U\,\mathrm{diag}_{r \times q}(d_1, \dots, d_p)V$ with unimodular matrices $U$ and $V$. Then $A = U\,\mathrm{diag}_{r \times p}(d_1, \dots, d_p)$ and $P = [I_p, 0]V$ yield the desired result. The uniqueness is straightforward. The additional condition for the ring $\mathcal{H}_0$ guarantees that $z$, which is not a unit in $\mathcal{H}_0$, is not a common divisor of the full-size minors of $P$. □

The second part of this section is devoted to a generalization of the concepts of greatest common divisors and least common multiples from functions in $\mathcal{H}$ to matrices over $\mathcal{H}$. We will formulate the results for matrices over an arbitrary commutative Bezout domain, as this is exactly what is needed for the proof. The greatest common right divisor of two matrices, one of them being square and nonsingular, is standard in matrix theory over principal ideal domains, see [71, pp. 31-36]. A least common left multiple comes as a by-product. The result carries over literally to Bezout domains and even to non-square matrices in the way given below in Theorem 3.2.8. This version looks fairly standard, too, but seems to be less known. We would like to present a proof, since the precise description will be needed later in Chapter 4, where a Galois correspondence between finitely generated submodules of $\mathcal{H}^q$ and solution spaces in $\mathcal{L}^q$ of systems of DDEs will be established. The following notation will be helpful.

Definition 3.2.6
Let $n, q \in \mathbb{N}$ and $n < q$.
(a) Let $J_{n,q} := \{(\rho_1, \dots, \rho_n) \in \mathbb{N}^n \mid 1 \le \rho_1 < \cdots < \rho_n \le q\}$ be the set of all ordered selections of $n$ elements from the set $\{1, \dots, q\}$.
(b) For a selection $\rho = (\rho_1, \dots, \rho_n) \in J_{n,q}$ denote by $\bar{\rho} \in J_{q-n,q}$ the complementary selection, that is, $\bar{\rho} = (\bar{\rho}_1, \dots, \bar{\rho}_{q-n}) \in J_{q-n,q}$ where $\{\rho_1, \dots, \rho_n\} \cup \{\bar{\rho}_1, \dots, \bar{\rho}_{q-n}\} = \{1, \dots, q\}$.
(c) Let $\rho = (\rho_1, \dots, \rho_n) \in J_{n,q}$. For an $n \times q$-matrix $A$ denote by $A_{(\rho)}$ the minor of order $n$ of $A$ obtained after selecting the columns with indices $\rho_1, \dots, \rho_n$. Accordingly, $A^{(\rho)}$ denotes the minor obtained from the row selection in case of a $q \times n$-matrix $A$.
The following technical lemma will be a valuable tool on several occasions throughout the book.

Lemma 3.2.7
Let $R$ be any commutative domain.
(1) Let $M \in R^{n \times q}$, $N \in R^{q \times (q-n)}$ be matrices satisfying $\mathrm{rk}\,M = n$, $\mathrm{rk}\,N = q - n$ and $MN = 0$. Then there exist $a, b \in R$ such that $aM_{(\rho)} = \pm bN^{(\bar{\rho})}$ for all $\rho \in J_{n,q}$.
(2) Let $M = [M_1, M_2] \in R^{(q_1+q_2-r) \times (q_1+q_2)}$ and $N = \begin{bmatrix} N_1 \\ N_2 \end{bmatrix} \in R^{(q_1+q_2) \times l}$ be matrices with $\mathrm{rk}\,M = q_1 + q_2 - r$, $\mathrm{rk}\,N = r \le l$ and $MN = 0$. Then
$$\mathrm{rk}\,M_1 \ge (q_1 + q_2 - r) - (q_2 - \mathrm{rk}\,N_2).$$

PROOF: (1) can be found in [53, p. 294], but can also be derived by some straightforward matrix calculations over the quotient field of $R$. It simply says that the (projective) Plücker coordinates, taken with the correct sign, of a vector space and its orthogonal complement are identical.
(2) is a simple consequence of (1), applied to the equation $M\hat{N} = 0$ for a submatrix $\hat{N} = \begin{bmatrix} \hat{N}_1 \\ \hat{N}_2 \end{bmatrix} \in R^{(q_1+q_2) \times r}$ of $N$ satisfying $\mathrm{rk}\,\hat{N}_2 = \mathrm{rk}\,N_2$ and $\mathrm{rk}\,\hat{N} = r$, which certainly exists. □
Now we can state and prove the following result.

Theorem 3.2.8
Let $R$ be a commutative Bezout domain and let $A \in R^{l \times q}$, $B \in R^{m \times q}$ be two matrices. Put $r := \mathrm{rk}\,[A^T, B^T]^T$ and assume $r > 0$. Let
$$U = \begin{bmatrix} U_1 & U_2 \\ U_3 & U_4 \end{bmatrix} \in Gl_{l+m}(R), \quad\text{partitioned according to } U_1 \in R^{r \times l},$$
be such that
$$U\begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} D \\ 0 \end{bmatrix}, \qquad D \in R^{r \times q}. \tag{3.2.3}$$
Then
(a) $D$ is a greatest common right divisor of $A$ and $B$ of full row rank $r$ and as such is unique up to left equivalence. We write $D = \mathrm{gcrd}(A, B)$. Moreover, there exist $M \in R^{r \times l}$, $N \in R^{r \times m}$ such that $D = MA + NB$ and therefore $\mathrm{im}\,A^T + \mathrm{im}\,B^T = \mathrm{im}\,D^T$.
(b) Suppose $\mathrm{rk}\,A = l$, $\mathrm{rk}\,B = m$. If $r < l + m$, then $M := U_3A = -U_4B \in R^{(l+m-r) \times q}$ is a least common left multiple of $A$ and $B$ of full row rank. Furthermore, $\mathrm{im}\,A^T \cap \mathrm{im}\,B^T = \mathrm{im}\,M^T$.
Every least common left multiple of $A$ and $B$ in $R^{(l+m-r) \times q}$ of full row rank is left equivalent to $M$. We write $M = \mathrm{lclm}(A, B)$.
If $r = l + m$, the only common left multiple of $A$ and $B$ is the zero matrix; in particular, $\mathrm{im}\,A^T \cap \mathrm{im}\,B^T = 0$. It will be convenient to define $\mathrm{lclm}(A, B)$ as the empty matrix in $R^{0 \times q}$. (The image, resp. kernel, of an empty matrix is the zero subspace, resp. the full space.)
If $\mathrm{rk}\,A = l$, $\mathrm{rk}\,B = m$, then $\mathrm{rk}\,A + \mathrm{rk}\,B = \mathrm{rk}\,\mathrm{gcrd}(A, B) + \mathrm{rk}\,\mathrm{lclm}(A, B)$.

PROOF: First of all, recall that a matrix over an arbitrary commutative Bezout domain is left equivalent to an upper triangular form, see Remark 3.2.2. This guarantees the existence of the matrices $U$ and $D$. It is not necessary to have $D$ in triangular form; solely the full row rank is important.
(a) Using (3.2.3) and letting
$$U^{-1} = \begin{bmatrix} V_1 & V_2 \\ V_3 & V_4 \end{bmatrix}, \qquad V_1 \in R^{l \times r},$$
we get the equations $U_1A + U_2B = D$, $V_1D = A$, and $V_3D = B$, from which all assertions of (a) can be derived.
As for (b), consider the case $r < l + m$ first. By construction, $M$ is a common left multiple of $A$ and $B$. Applying Lemma 3.2.7(2) to the equation $[U_3, U_4]\begin{bmatrix} A \\ B \end{bmatrix} = 0$, one gets $\mathrm{rk}\,U_3 = l + m - r$ and thus $\mathrm{rk}\,M = l + m - r$, too. The unimodularity of $U$ implies that for matrices $X$, $Y$ of suitable sizes
$$XA + YB = 0 \iff [X, Y] = W[U_3, U_4] \text{ for some matrix } W. \tag{3.2.4}$$
This shows that every common left multiple $\hat{M}$ of $A$ and $B$ is a left multiple of $M$. Thus, $M$ is a least common left multiple of $A$ and $B$. Equation (3.2.4) also yields $\mathrm{im}\,A^T \cap \mathrm{im}\,B^T = \mathrm{im}\,M^T$. The uniqueness follows immediately from the full row rank.
If $r = l + m$, then a common left multiple $\hat{M} = XA = YB$ satisfies the identity $[X, -Y]\begin{bmatrix} A \\ B \end{bmatrix} = 0$, therefore $[X, -Y] = 0$, which yields $\hat{M} = 0$.
The final identity concerning the ranks is clear. □
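The construction in the proof can be traced in code. The sketch below works over $\mathbb{Z}$ as a stand-in commutative Bezout domain; the function names are ours:

```python
def egcd(a, b):
    """Extended Euclid: returns (g, x, y) with x*a + y*b == g >= 0."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, x, y = egcd(b, a % b)
    return (g, y, x - (a // b) * y)

def gcrd_lclm(A, B):
    """Mimic Theorem 3.2.8 over Z: bring the stacked matrix [A; B] to [D; 0]
    by unimodular row operations, tracking the transformation U."""
    l = len(A)
    S = [row[:] for row in A] + [row[:] for row in B]
    n, q = len(S), len(S[0])
    U = [[int(i == j) for j in range(n)] for i in range(n)]
    r = 0
    for col in range(q):
        if r == n:
            break
        for i in range(r + 1, n):
            if S[i][col] == 0:
                continue
            g, x, y = egcd(S[r][col], S[i][col])
            u, v = -S[i][col] // g, S[r][col] // g   # det [[x, y], [u, v]] == 1
            for Mat in (S, U):
                Mr, Mi = Mat[r][:], Mat[i][:]
                Mat[r] = [x * p + y * t for p, t in zip(Mr, Mi)]
                Mat[i] = [u * p + v * t for p, t in zip(Mr, Mi)]
        if S[r][col] != 0:
            r += 1
    D = S[:r]                          # gcrd(A, B), full row rank r
    U3 = [row[:l] for row in U[r:]]    # lclm(A, B) = U3 * A = -U4 * B
    return U, D, U3

# usage with A = [[2, 0], [0, 3]] and B = [[4, 6]] over Z:
U, D, U3 = gcrd_lclm([[2, 0], [0, 3]], [[4, 6]])
assert D == [[2, 0], [0, 3]] and U3 == [[-2, -2]]
```

Here $\mathrm{lclm}(A, B) = U_3A = [-4, -6]$, whose row indeed spans the intersection of the row modules of $A$ and $B$.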
Remark 3.2.9
Notice that the least common multiple $U_3A = -U_4B$ yields a homomorphism from $R^q$ to $R^{l+m-r}$ factoring through the maps defined by $A$ and $B$. In this formulation one can call $\mathrm{lclm}(A, B)$ the "free part" of the so-called push-out (or fiber sum) of the maps $A: R^q \to R^l$ and $B: R^q \to R^m$, which is given as the quotient space $\mathcal{M} := R^{l+m}/\mathrm{im}\,[A^T, -B^T]^T$ together with the two maps from $R^q$ into $\mathcal{M}$ factoring through $A$ and $B$, see [67, p. 59]. Indeed, with the notation as in the proof above, it is easy to see that the map
$$x \longmapsto \begin{pmatrix} V_2x \\ -V_4x \end{pmatrix} + \mathrm{im}\,[A^T, -B^T]^T$$
is an embedding of $R^{l+m-r}$ into $\mathcal{M}$. Moreover, the finitely generated module $\mathcal{M}$ decomposes into its free part and its torsion submodule as follows:
$$\mathcal{M} \cong R^{l+m-r} \oplus t(\mathcal{M}).$$
(By virtue of the fact that finitely generated torsion-free modules over Bezout domains are free [97, p. 478], the decomposition above can basically be proven the same way as for principal ideal domains, for which a proof can be found in [67, p. 533].)
Remark 3.2.10
We wish to end our matrix-theoretic considerations with an interpretation of the results given above in module-theoretic terms. First of all, the Bezout property of $\mathcal{H}$ can simply be expressed as stating that every finitely generated ideal of $\mathcal{H}$ is a free $\mathcal{H}$-module of rank one. Secondly, the left or right equivalent triangular forms for matrices over $\mathcal{H}$ (Theorem 3.2.1(a)) imply that every finitely generated submodule of a free $\mathcal{H}$-module is free. Indeed, if $\mathcal{M}$ is such a finitely generated module, we can assume without loss of generality that $\mathcal{M} \subseteq \mathcal{H}^r$ for some $r \in \mathbb{N}$ and that $\mathcal{M} = \mathrm{im}\,Q$ for some matrix $Q \in \mathcal{H}^{r \times s}$. Using a right equivalent triangular form of $Q$, one can single out a full column rank matrix representation $\mathcal{M} = \mathrm{im}\,\bar{Q}$, showing that $\mathcal{M}$ is free. Thirdly, the sum of two finitely generated submodules $N_1$ and $N_2$ of an $\mathcal{H}$-module $\mathcal{M}$ is certainly finitely generated again, hence a free module if $\mathcal{M}$ is free. The construction of a greatest common right divisor in Theorem 3.2.8(a) presents a way of how to construct a basis for the sum $N_1 + N_2$, given generating matrices $A^T$ and $B^T$ for $N_1$ and $N_2$, respectively. More interesting from a module-theoretic point of view is the fact that also the intersection $N_1 \cap N_2$ of two finitely generated submodules of a free $\mathcal{H}$-module is finitely generated and free again. A basis for $N_1 \cap N_2$ is given by the least common left multiple of generating matrices for $N_1$ and $N_2$ (see Theorem 3.2.8(b)). Observe that all of the above is true for arbitrary commutative Bezout domains (see Remark 3.2.2). In commutative ring theory the situation above is captured in a more general context by the notion of coherent rings and modules. A module $M$ over a commutative ring $R$ is called coherent if $M$ is finitely generated and every finitely generated submodule $N$ of $M$ is finitely presented; hence there is, by definition, an exact sequence $F_1 \to F_0 \to N \to 0$ with finitely generated free modules $F_0$ and $F_1$.
A commutative ring R is called coherent if it is a coherent R-module, hence if every finitely generated ideal of R is finitely presented [38, Sec. 2]. Since finitely generated free modules are trivially finitely presented, every commutative Bezout domain is coherent. It is known that if R is coherent, then every finitely generated submodule of a free R-module is finitely presented [38, Thm. 2.3.2]. This generalizes the situation for commutative Bezout domains where these modules turn out to be even free as we have seen above. Furthermore, sum and intersection of two coherent submodules of a coherent module are coherent again [38, Cor. 2.2.4] and we arrive at a generalization of the greatest common divisor and the least common left multiple.
3.3 Systems over Rings: A Brief Survey

In this section we want to take a short excursion into the area of systems over rings. We present some of the main ideas and discuss the ring $\mathcal{H}_0$ with respect to some ring-theoretic properties arising in the context of systems over rings. The theory of systems over rings is a well-established part of systems theory, initiated mainly by the papers [79, 105], in which it was observed that in various types of systems, like for instance delay-differential systems, the main underlying structure is that of a ring. As a consequence, the properties of such systems can be studied, to a certain extent, in an algebraic setting. This in turn has led to several notions for rings which, beyond their system-theoretic background, can be studied in purely algebraic terms. The book [12] provides not only an excellent overview of these various concepts, but also introduces a variety of rings to systems theory. Although our algebraic approach to delay-differential systems is not in the spirit of systems over rings, the book [12] has been our main guide through the area of Bezout domains and elementary divisor domains. In the sequel we want to survey one branch of the theory of systems over rings. For the moment it might simply serve as a brief introduction to that area of systems theory. But there is also a (weak) connection to Section 4.5 of our work, where we will come back to this topic. The starting point for the theory of systems over rings is the description of a linear first-order discrete-time dynamical system as an equation
$$x_{k+1} = A x_k + B u_k, \qquad (3.3.1)$$
where $A \in R^{n \times n}$ and $B \in R^{n \times m}$ are matrices over some ring $R$, and $x_k \in R^n$ and $u_k \in R^m$ are the sequences of states and inputs, respectively (at this point there is no need to consider an output equation). From a system-theoretic point of view a lot of natural questions arise.
The most basic one is whether or not it is possible for a given system (3.3.1) to steer it from one state to any other in finite time by a suitable choice of the inputs $u_k$. This is the well-known notion of reachability and can be expressed solely in ring-theoretic terms.

(1) The pair $(A, B)$ is called reachable if $\operatorname{im} [B, AB, \ldots, A^{n-1}B] = R^n$, see [79, p. 529]. If $R$ is a domain, the above is equivalent to $[\lambda I - A, -B]$ being right invertible over the polynomial ring $R[\lambda]$, see [46, Thm. 2.2.3].

It is a purely algebraic result that for reachable systems (3.3.1) over a field the internal modes can be altered arbitrarily by use of static state feedback $u_k = F x_k$. This problem of modifying the system dynamics can equally well be formulated for rings. In this case it falls apart into two subproblems.

(2) [105, p. 20] A pair $(A, B) \in R^{n \times n} \times R^{n \times m}$ is called coefficient assignable if for each monic polynomial $\alpha \in R[\lambda]$ of degree $n$ there exists a feedback
3 The Algebraic Structure of $\mathcal{H}_0$
matrix $F \in R^{m \times n}$ such that the closed-loop system given by the equation $x_{k+1} = (A + BF)x_k$ has characteristic polynomial $\det(\lambda I - A - BF) = \alpha$.

(3) [79, p. 530], [105, p. 20] A pair $(A, B) \in R^{n \times n} \times R^{n \times m}$ is called pole assignable if for all $a_1, \ldots, a_n \in R$ there exists a feedback matrix $F \in R^{m \times n}$ such that $\det(\lambda I - A - BF) = \prod_{j=1}^{n} (\lambda - a_j)$.

It is easy to see that coefficient assignability is stronger than pole assignability, which in turn implies reachability, see [105, p. 21]. Whether or not the converse is true depends strongly on the underlying ring $R$. This has led to the following notions.

(4) [12, p. 67] A ring $R$ is called a CA-ring (resp. PA-ring) if each reachable pair $(A, B)$ is coefficient assignable (resp. pole assignable).

Each field is a CA-ring. In the general case of systems over rings a particularly simple case arises if there is only one input channel, that is, $m = 1$. In this case, reachability of $(A, b)$ simply says that $b$ is a cyclic vector for the matrix $A$, and one straightforwardly verifies that $(A, b)$ is coefficient assignable. As a consequence, one can show that a ring is a CA-ring if it is an FC-ring in the following sense.

(5) [105, p. 21], [12, p. 74] A ring is said to be an FC-ring (feedback cyclization ring) if for each reachable pair $(A, B)$ there exist a matrix $F \in R^{m \times n}$ and a vector $v \in R^m$ such that $(A + BF, Bv)$ is reachable.

Even for simple rings it is surprisingly difficult to see whether they have one of the properties above. We confine ourselves to reporting the following results and open questions.

(i) The polynomial ring $\mathbb{R}[z]$ is a PA-ring [105, p. 23], but not a CA-ring. For instance, the pair
$$(A, B) = \left( \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ z & 1 + z^2 \end{bmatrix} \right) \qquad (3.3.2)$$
is reachable, but does not allow a feedback matrix $F \in \mathbb{R}[z]^{2 \times 2}$ such that $\det(\lambda I - A + BF) = \lambda^2 + \lambda + (z^2 + z + 2)/4 \in \mathbb{R}[z][\lambda]$; see [29, p. 111] and [99].

(ii) The ring $\mathcal{H}_0$ even provides an example of a pair $(A, B) \in \mathcal{H}_0^{2 \times 2} \times \mathcal{H}_0^{2 \times 2}$ which is reachable over $\mathcal{H}_0$ but not coefficient assignable. Hence $\mathcal{H}_0$ is not a CA-ring and, consequently, not an FC-ring either.
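Reachability of a pair such as (3.3.2) can be verified mechanically: over the principal ideal domain $\mathbb{R}[z]$ a two-input, two-state pair is reachable exactly if the $2 \times 2$ minors of $[B, AB]$ have unit gcd, which is criterion (1) in Smith-form terms. The following stdlib-only sketch performs this computation over $\mathbb{Q}[z]$; the matrix entries are those reconstructed in (3.3.2) above and should be checked against [29] before being relied upon:

```python
from fractions import Fraction
from functools import reduce
from itertools import combinations

# Polynomials in z over Q as coefficient lists (index = degree).
def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def padd(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def pmul(p, q):
    if not p or not q:
        return []
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return trim(r)

def pmod(a, b):                     # remainder of a divided by b (b nonzero)
    a = trim(a)
    while a and len(a) >= len(b):
        c = a[-1] / b[-1]
        a = padd(a, pmul([Fraction(0)] * (len(a) - len(b)) + [-c], b))
    return a

def pgcd(a, b):                     # monic gcd via the Euclidean algorithm
    a, b = trim(a), trim(b)
    while b:
        a, b = b, pmod(a, b)
    return [x / a[-1] for x in a] if a else []

one, zero = [Fraction(1)], []
zpoly = [Fraction(0), Fraction(1)]            # z
onepz2 = [Fraction(1), Fraction(0), Fraction(1)]  # 1 + z^2

# Pair over R[z] with entries as reconstructed in (3.3.2) (an assumption here):
A = [[zero, one], [zero, zero]]               # [[0, 1], [0, 0]]
B = [[one, zero], [zpoly, onepz2]]            # [[1, 0], [z, 1+z^2]]

def matmul(X, Y):
    return [[reduce(padd, (pmul(X[i][k], Y[k][j]) for k in range(2)))
             for j in range(2)] for i in range(2)]

AB = matmul(A, B)
cols = [[B[0][j], B[1][j]] for j in range(2)] + \
       [[AB[0][j], AB[1][j]] for j in range(2)]

# 2x2 minors of [B, AB]; the pair is reachable iff their gcd is a unit.
minors = [padd(pmul(cols[i][0], cols[j][1]),
               pmul([Fraction(-1)], pmul(cols[j][0], cols[i][1])))
          for i, j in combinations(range(4), 2)]
g = reduce(pgcd, [m for m in minors if m])

# det B = 1 + z^2 is not a unit, so B alone is not right invertible;
# the column AB is genuinely needed for reachability.
detB = padd(pmul(B[0][0], B[1][1]), pmul([Fraction(-1)], pmul(B[0][1], B[1][0])))
```

The gcd of the minors comes out as the constant polynomial $1$, confirming reachability of this pair over $\mathbb{R}[z]$.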
We will come back to a slightly different notion of coefficient assignability in Section 4.5. The topic of realization theory for systems over rings will be briefly addressed in the introduction to Chapter 5.
3.4 The Nonfinitely Generated Ideals of $\mathcal{H}_0$

The Bezout property of $\mathcal{H}_0$ says that all finitely generated ideals are principal, hence they are completely described by one generator. In this section we focus our attention on the nonfinitely generated ideals. As we will see, each such ideal can be fully described by one "generating" polynomial along with a specified set of zeros (counting multiplicities). As a consequence, it will turn out that all nonzero prime ideals are maximal; in other words, the Krull dimension of $\mathcal{H}_0$ is one. The results of this section are not directly related to our investigation of delay-differential equations in the next chapter. However, we think they are interesting for a further algebraic study of the ring $\mathcal{H}_0$. We restrict to the ring $\mathcal{H}_0$; the results about $\mathcal{H}$ can readily be deduced. Let us first rephrase the characterization of prime elements, given in Proposition 3.1.2(d), in ideal-theoretic language. The following is an immediate consequence of the Bezout property together with Proposition 3.1.2(d) and (i).

Proposition 3.4.1 Let $\{0\} \neq I \subseteq \mathcal{H}_0$ be a finitely generated ideal. Then
$$I \text{ is prime} \iff I \text{ is maximal} \iff I = (\phi) \text{ for some irreducible } \phi \in \mathbb{R}[s] \setminus \{0\} \text{ or } I = (z).$$
We begin our investigation with an important class of nonfinitely generated ideals in $\mathcal{H}_0$. They can be regarded as "generalized" principal ideals, for the information on such an ideal is contained completely in one (generating) polynomial. These ideals will serve as a sort of building block for all nonfinitely generated ideals. In the sequel a polynomial $\phi \in \mathbb{R}[s]$ is called monic if $\phi \neq 0$ and its leading coefficient is $1$.

Definition 3.4.2 Let $p \in \mathbb{R}[s, z]$. Define
$$D_p := \{ \phi \in \mathbb{R}[s] \mid \phi \text{ monic and } \phi \mid_{\mathcal{H}_0} p \}$$
to be the set of all admissible denominators of $p$. Furthermore, let
$$((p)) := \{ h\,p\,\phi^{-1} \mid h \in \mathcal{H}_0,\ \phi \in D_p \} \subseteq \mathcal{H}_0.$$
We call $((p))$ the full ideal generated by $p$, and the polynomial $p$ is said to be a full generator of $((p))$. It is clear that a full ideal is indeed an ideal of $\mathcal{H}_0$. Notice that there is no need to consider full ideals generated by elements $q \in \mathcal{H}_0 \setminus \mathbb{R}[s, z]$, for writing $q = p\psi^{-1}$ with $p \in \mathbb{R}[s, z]$ and $\psi \in \mathbb{R}[s]$, the polynomial $p$ would fully generate $((q))$, too.

Proposition 3.4.3 Let $p \in \mathbb{R}[s, z] \setminus \{0\}$.
(1) The ideal $((p))$ is at most countably generated and has empty variety, that is, $\mathcal{V}\{f^* \mid f \in ((p))\} = \emptyset$.
(2) Let $q \in \mathbb{R}[s, z] \setminus \{0\}$. Then
$$((q)) \subseteq ((p)) \iff p \mid_{\mathbb{R}(s)[z]} q.$$
In particular, $((q)) = ((p))$ if and only if $q = \psi p$ for some $\psi \in \mathbb{R}(s)$.
(3) $p \in \mathbb{R}[s] \setminus \{0\} \iff ((p)) = \mathcal{H}_0 \iff ((p))$ is finitely generated. In case the ideal $((p))$ is not finitely generated, it has a full generator $\hat{p} \in \mathbb{R}[s, z]$, which is primitive as a polynomial in $z$ and as such is unique up to a constant.
PROOF: (1) The first part follows directly from the fact that $p$ has only countably many monic divisors. The second part is clear.
(2) "$\Longrightarrow$" is obvious. As for "$\Longleftarrow$", let $\frac{a}{\psi}\,p = q$, where $a \in \mathbb{R}[s, z]$ and $\psi \in \mathbb{R}[s] \setminus \{0\}$. For all $\phi \in D_q$ one has
$$\frac{q}{\phi} = \frac{ap}{\psi\phi} = \frac{a}{p_1} \cdot \frac{p}{p_2} \in ((p)),$$
where $p_1 p_2 = \psi\phi$ is a factorization of $\psi\phi$ such that $\frac{a}{p_1}, \frac{p}{p_2} \in \mathcal{H}_0$, see the first part of 3.1.2(h). Hence $((q)) \subseteq ((p))$.
(3) Both implications "$\Longrightarrow$" are trivial and the leftmost "$\Longleftarrow$" follows from the very definition of the set $((p))$. Hence there remains to show that a full ideal which is finitely generated is identical to $\mathcal{H}_0$. To do so, let $((p)) \subseteq \mathcal{H}_0$ be a finitely generated, hence principal, ideal, say $((p)) = (a)$. Then $\mathcal{V}(a^*) = \emptyset$ by (1), and Proposition 2.5(2) and the assumption on $p$ imply $a \in \mathbb{R} \setminus \{0\}$, thus $((p)) = \mathcal{H}_0$. $\Box$
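The divisibility $\phi \mid_{\mathcal{H}_0} p$ underlying $D_p$ can be probed numerically: assuming the characterization that the zeros of $\phi^*$ must be cancelled in $p^*/\phi^*$ (as used in Example 3.4.7(3) below), $\phi = s$ is an admissible denominator of $p = z - 1$, while $\phi = s - 1$ is not. A stdlib sketch with ad-hoc names:

```python
import cmath

# phi = s divides p = z - 1 in H_0: the quotient of characteristic functions
# (e^{-s} - 1)/s has a removable singularity at s = 0 (the only zero of phi*),
# with limit value -1 there.
def quotient(s):
    return (cmath.exp(-s) - 1) / s

near_zero = quotient(1e-6)          # close to the limit -1

# phi = s - 1 is not admissible: e^{-1} - 1 != 0, so (e^{-s} - 1)/(s - 1)
# blows up near s = 1.
blow_up = abs((cmath.exp(-(1 + 1e-8)) - 1) / 1e-8)
```

The first value is within $10^{-5}$ of $-1$, the second exceeds $10^{6}$, showing the pole.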
Remark 3.4.4 The last part of the proof also shows that a nonfinitely generated full ideal cannot be contained in a principal ideal other than some $(z^k)$, where $k \in \mathbb{N}_0$.
The full ideals defined above are constructed by allowing all admissible denominators of a given $p \in \mathbb{R}[s, z]$. Many more ideals are obtained by restricting the denominators of $p$ to suitable subsets of $D_p$. Indeed, as it will turn out, each ideal of $\mathcal{H}_0$ can be obtained in this manner. Before giving the details, we want to present a preparatory result showing that each ideal is sandwiched between a finitely generated ideal and a corresponding full ideal. This is the key step toward a description of arbitrary ideals in terms of a certain generating polynomial.

Proposition 3.4.5 Let $I \subseteq \mathcal{H}_0$ be an ideal, $I \neq \{0\}$. Then there exists a polynomial $p \in \mathbb{R}[s, z]$ such that
(a) $(p) \subseteq I \subseteq ((p))$,
(b) $p$ is minimal with respect to (a) in the following sense: for all $q \in \mathbb{R}[s, z]$ satisfying $(q) \subseteq I \subseteq ((q))$ it follows that $qp^{-1} \in \mathbb{R}[s]$.
Each polynomial $p$ satisfying (a) is called a sandwich-polynomial of $I$. A minimal sandwich-polynomial is unique up to a constant. It is easy to see that for a principal ideal $I = (p)$ the generator $p \in \mathcal{H}_0$ is a sandwich-polynomial of $I$.

PROOF: Throughout the proof we use the notation $(\ )_{\mathcal{H}_0}$ resp. $(\ )_{\mathbb{R}[s,z]}$ for ideals generated in the respective rings.
(a) The ideal $I \cap \mathbb{R}[s, z]$ in $\mathbb{R}[s, z]$ is finitely generated, say $I \cap \mathbb{R}[s, z] = (p_1, \ldots, p_l)_{\mathbb{R}[s,z]}$. Then in $\mathcal{H}_0$ we obtain a principal ideal
$$(p_1, \ldots, p_l)_{\mathcal{H}_0} = (p)_{\mathcal{H}_0}, \qquad (3.4.1)$$
where $p = \gcd_{\mathcal{H}_0}(p_1, \ldots, p_l)$ is in $\mathbb{R}[s, z]$, too (see Remark 3.1.4). By construction we have $(p)_{\mathcal{H}_0} \subseteq I$. The inclusion $I \subseteq ((p))$ can be established as follows: for each $h \in I$ there exists $\phi \in \mathbb{R}[s]$ such that $\phi h \in I \cap \mathbb{R}[s, z] \subseteq (p)_{\mathcal{H}_0}$. Thus $h = \phi^{-1} a p$ for some $a \in \mathcal{H}_0$, and by splitting denominators according to Proposition 3.1.2(h) we obtain $h = \frac{a}{\phi_1} \cdot \frac{p}{\phi_2} \in ((p))$.
(b) It suffices to show that the polynomial $p$ found in (3.4.1) is minimal. To this end, let $q \in \mathbb{R}[s, z]$ be such that $(q)_{\mathcal{H}_0} \subseteq I \subseteq ((q))$. Then $q \in I \cap \mathbb{R}[s, z]$, hence $q = hp$ for some $h \in \mathcal{H}_0$. Moreover, $p \in ((q))$ leads to an equation $\psi p = h'q$, where $h' \in \mathcal{H}_0$ and $\psi \in D_q$. But then $hh' = \psi \in \mathbb{R}[s]$ and therefore $qp^{-1} = h \in \mathbb{R}[s]$, establishing the minimality of $p$. $\Box$

Now it should be clear how to describe arbitrary ideals. Starting with a full ideal $((p))$, one restricts the set of denominators $D_p$ to a suitable subset $M$. In order to get an ideal, the set $M$ has to be closed with respect to the operations necessary for denominators. This leads directly to the following definition.
Definition 3.4.6
(a) A set $M \subseteq \mathbb{R}[s] \setminus \{0\}$ is called an admissible set of denominators if it satisfies
 i) $1 \in M$,
 ii) each $\phi \in M$ is monic,
 iii) if $\phi, \psi \in M$, then $\operatorname{lcm}(\phi, \psi) \in M$.
If, in addition, $M \subseteq D_p$ for some $p \in \mathbb{R}[s, z]$, then $M$ is said to be an admissible set of denominators for $p$.
(b) An admissible set $M$ of denominators is called saturated if $M$ is closed with respect to taking divisors, that is, if $\phi \in M$ and $\psi \in \mathbb{R}[s]$ is a monic divisor of $\phi$, then $\psi \in M$.
(c) Let $p \in \mathbb{R}[s, z]$ and let $M$ be an admissible set of denominators for $p$. Define
$$((p))_{(M)} := \{ h\,p\,\phi^{-1} \mid h \in \mathcal{H}_0,\ \phi \in M \} \subseteq \mathcal{H}_0.$$
Hence $((p))_{(M)}$ is the ideal generated by the elements $p\phi^{-1} \in \mathcal{H}_0$, where $\phi \in M$.
Of course, one would like to call $M$ simply a denominator set. As, unfortunately (from our point of view), this term is reserved in algebra in a slightly different way, we decided to avoid that name. Obviously, $D_p$ is a saturated admissible set of denominators for $p$ and $((p))_{(D_p)} = ((p))$. Moreover, $((p))_{(M)}$ satisfies $(p) \subseteq ((p))_{(M)} \subseteq ((p))$. However, the polynomial $p$ is not necessarily a minimal sandwich-polynomial of $((p))_{(M)}$, as illustrated in part (2) of the following example.

Example 3.4.7
(1) If $M = \{1\}$, then $((p))_{(M)} = (p)$.
(2) Let $p = \tilde{p}\phi$, where $\tilde{p} \in \mathbb{R}[s, z]$ and $\phi \in \mathbb{R}[s] \setminus \{0\}$ is monic. Put $M = \{1, \phi\}$. Then $((p))_{(M)} = (\tilde{p})$ and therefore $\tilde{p}$ is a minimal sandwich-polynomial of $((p))_{(M)}$.
(3) Let $p = (z - 1)(z + 1) \in \mathcal{H}_0$. Then
$$D_p = \{ \phi \in \mathbb{R}[s] \mid \phi \text{ monic},\ \mathcal{V}(\phi) \subset \{k\pi i \mid k \in \mathbb{Z}\} \text{ and } \operatorname{ord}_\lambda(\phi) \leq 1 \text{ for all } \lambda \in \mathcal{V}(\phi) \},$$
hence $D_p$ is simply the set of all polynomials consisting of finitely many different linear factors of the form $s - k\pi i$. A saturated admissible set $M$ of denominators for $p$ is, e.g., given by
$$M = \{ \phi \in D_p \mid \mathcal{V}(\phi) \subset \{2k\pi i \mid k \in \mathbb{Z}\} \} = D_{z-1}.$$
The variety of $((p))_{(M)}$ fulfills
$$\mathcal{V}\{f^* \mid f \in ((p))_{(M)}\} = \{k\pi i \mid k \in 2\mathbb{Z}+1\} = \mathcal{V}\big((z+1)^*\big).$$
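The description of $D_p$ in Example 3.4.7(3) rests on where the characteristic functions vanish (with $z^* = e^{-s}$): the zeros of $p^* = (e^{-s} - 1)(e^{-s} + 1)$ are exactly the points $k\pi i$, while the factor $(z-1)^* = e^{-s} - 1$ vanishes only at the even multiples $2k\pi i$. A small stdlib spot check:

```python
import cmath

def zstar(s):                # characteristic function of the delay operator z
    return cmath.exp(-s)

def pstar(s):                # (z - 1)(z + 1) evaluated at z = e^{-s}
    return (zstar(s) - 1) * (zstar(s) + 1)

# p* vanishes at every integer multiple of pi*i ...
vals = [abs(pstar(1j * cmath.pi * k)) for k in range(-5, 6)]
# ... while (z - 1)* vanishes at even multiples and is far from zero at odd ones:
even = [abs(zstar(2j * cmath.pi * k) - 1) for k in range(-5, 6)]
odd = [abs(zstar(1j * cmath.pi * (2 * k + 1)) - 1) for k in range(-5, 6)]
```

At odd multiples $e^{-s} = -1$, so each value in `odd` equals $2$ up to rounding.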
The requirement that the polynomials in an admissible set of denominators be monic is merely added in order to exhibit the close connection between finiteness properties of the ideals and the associated admissible sets of denominators. Indeed,

Proposition 3.4.8 Let $p \in \mathbb{R}[s, z] \setminus \{0\}$ and let $M$ be an admissible set of denominators for $p$. Put $I = ((p))_{(M)}$. Then
$$I \text{ is finitely generated} \iff M \text{ is finite}.$$

PROOF: For "$\Longleftarrow$" let $M = \{\phi_1, \ldots, \phi_t\}$ and observe that $((p))_{(M)} = (p\phi^{-1})$, where $\phi = \operatorname{lcm}(\phi_1, \ldots, \phi_t) \in M$. As for "$\Longrightarrow$", assume $((p))_{(M)}$ is finitely generated, hence principal, say $((p))_{(M)} = (q)$ for some $q \in \mathcal{H}_0$. Then $q = h\,p\,\psi^{-1}$ for some $\psi \in M$ and $h \in \mathcal{H}_0$. Moreover, for all $\phi \in M$ there is some $h_\phi \in \mathcal{H}_0$ such that $p\phi^{-1} = h_\phi q$. Thus $\psi = h_\phi h \phi$, so that each $\phi \in M$ is a monic divisor of $\psi$. Since $\psi$ has only finitely many monic divisors, $M$ is finite. $\Box$
Without changing the ideal $((p))_{(M)}$ one may always pass from $M$ to
$$\overline{M} = \{ \phi \in D_p \mid \phi \text{ monic},\ \exists\, \psi \in M : \phi \mid \psi \},$$
the saturation of $M$.

Now we can completely describe the ideals in $\mathcal{H}_0$. The presentation of nonfinitely generated ideals given in part (3) below will be important later in Section 4.6, where we study the solution spaces of delay-differential equations corresponding to these ideals.

Theorem 3.4.10 Let $\{0\} \neq I \subseteq \mathcal{H}_0$ be an ideal and let $p \in \mathbb{R}[s, z]$ be a sandwich-polynomial of $I$. Put
$$M = \{ \phi \in D_p \mid p\phi^{-1} \in I \}.$$
Then
(1) $M = \{ \phi \in D_p \mid \exists\, h \in \mathcal{H}_0 : \gcd_{\mathcal{H}_0}(h, \phi) = 1 \text{ and } h\,p\,\phi^{-1} \in I \}$.
(2) $M$ is a saturated admissible set of denominators.
(3) $I = ((p))_{(M)}$.

PROOF: (1) The inclusion "$\subseteq$" is trivial. For "$\supseteq$" multiply a Bezout identity $1 = \gcd_{\mathcal{H}_0}(h, \phi) = ah + b\phi$, where $a, b \in \mathcal{H}_0$ are suitable coefficients, by $p\phi^{-1} \in \mathcal{H}_0$.
(2) If $\phi \in M$ and $\psi \in \mathbb{R}[s]$ is a monic divisor of $\phi$, say $\psi\rho = \phi$, then $p\psi^{-1} = \rho\, p\phi^{-1} \in I$. Hence $M$ is saturated, and there remains to show that $M$ is closed with respect to taking least common multiples. To this end, let $\phi_1, \phi_2 \in M$ and $\phi = \operatorname{lcm}(\phi_1, \phi_2)$. Since $\mathbb{R}[s]$ is factorial, we may write $\phi = \hat{\phi}_1\hat{\phi}_2$, where $\hat{\phi}_i \mid \phi_i$ and $\gcd(\hat{\phi}_1, \hat{\phi}_2) = 1$. From the above we know $p\hat{\phi}_i^{-1} \in I$ for $i = 1, 2$. Hence
$$\frac{p}{\hat{\phi}_1} + \frac{p}{\hat{\phi}_2} = (\hat{\phi}_1 + \hat{\phi}_2)\,\frac{p}{\hat{\phi}_1\hat{\phi}_2} \in I.$$
Since $\gcd(\hat{\phi}_1 + \hat{\phi}_2,\ \hat{\phi}_1\hat{\phi}_2) = 1$, we obtain from (1) that $\phi = \hat{\phi}_1\hat{\phi}_2 \in M$.
(3) The inclusion $((p))_{(M)} \subseteq I$ is immediate from the definition of $M$. For the converse let $q \in I$, hence $q \in ((p))$ since $p$ is a sandwich-polynomial of $I$. Then $q = h\,p\,\phi^{-1}$ for some $\phi \in D_p$ and $h \in \mathcal{H}_0$. Using Proposition 3.1.2(h), one can assume $\gcd_{\mathcal{H}_0}(h, \phi) = 1$, which, by (1), yields $\phi \in M$. Thus $q \in ((p))_{(M)}$, as desired. $\Box$

Corollary 3.4.11 If $\{0\} \neq I \subseteq \mathcal{H}_0$ is an ideal having a sandwich-polynomial in $\mathbb{R}[s]$, then $I = (a)$ for some $a \in \mathbb{R}[s]$.
PROOF: By Proposition 3.4.5(b), each other sandwich-polynomial of $I$ is in $\mathbb{R}[s]$, too, and the representation in part (3) of Theorem 3.4.10 completes the proof. $\Box$
The following theorem provides an alternative argument for the adequateness of $\mathcal{H}_0$ (cf. Theorem 3.1.6(c)). It reveals that $\mathcal{H}_0$ is a one-dimensional ring, that is, each ascending chain of prime ideals has maximal length one. It is well known that one-dimensional Bezout domains are adequate [12, p. 95]. We would like to mention that, in contrast, the ring $H(\mathbb{C})$ of entire functions is of infinite Krull dimension, yet still adequate [12, Thms. 3.17, 3.18].

Theorem 3.4.12 Let $\{0\} \neq I \subseteq \mathcal{H}_0$ be an ideal.
(a) If $I$ is a prime ideal that is not finitely generated, then $I = ((p))$ for some irreducible $p \in \mathbb{R}[s, z] \setminus \mathbb{R}[s]$.
(b) $I$ is prime if and only if $I$ is maximal.

PROOF: (a) Let $I$ be a nonfinitely generated prime ideal. Then $I \cap \mathbb{R}[s] = \{0\}$, for otherwise the intersection would contain an irreducible element $a \in \mathbb{R}[s]$ with the consequence $(a) \subseteq I$, contradicting Proposition 3.4.1. Let $p \in \mathbb{R}[s, z] \setminus \mathbb{R}[s]$ be a sandwich-polynomial of $I$, hence $I \subseteq ((p))$. But then even $I = ((p))$ is true, since for each $\phi \in D_p$ we have $p = \phi \cdot p\phi^{-1} \in I$, and the primeness of $I$ together with $I \cap \mathbb{R}[s] = \{0\}$ implies $p\phi^{-1} \in I$. Again by the primeness of $I$ and by virtue of Proposition 3.4.3(2), $p$ can be chosen as an irreducible polynomial in $\mathbb{R}[s, z]$.
(b) In light of Proposition 3.4.1 we are reduced to showing that each $I = ((p))$, where $p \in \mathbb{R}[s, z] \setminus \mathbb{R}[s]$ is irreducible, is maximal. To this end, let $I \subseteq J$ for some ideal $J$ having sandwich-polynomial $q \in \mathbb{R}[s, z]$. The case $q \in \mathbb{R}[s]$ can be handled with Corollary 3.4.11 and Remark 3.4.4. If $q \notin \mathbb{R}[s]$, then Proposition 3.4.3(2) applied to $((p)) \subseteq ((q))$ together with the irreducibility of $p$ yields $((p)) = ((q))$. Hence $I$ is a maximal ideal. $\Box$

We close this section with the following result concerning the uniqueness of the representation of nonfinitely generated ideals. Its proof is lengthy but straightforward and will be omitted.

Proposition 3.4.13
Consider the ideal $((p))_{(M)} \subseteq \mathcal{H}_0$, where $p = \phi\hat{p} \in \mathbb{R}[s, z]$ for some $\phi \in \mathbb{R}[s]$ and some $\hat{p} \in \mathbb{R}[s, z]$ which is primitive as a polynomial in $z$. Furthermore, let $M \subseteq D_p$ be a saturated admissible set of denominators for $p$. Let $a \in \mathbb{R}[s]$ be the unique monic polynomial such that $M \cap D_\phi = D_a$ and put
$$\tau := \phi a^{-1}, \qquad \widetilde{M} := \{ \psi \in \mathbb{R}[s] \mid \psi \text{ monic},\ a\psi \in M \}.$$
Then $\widetilde{M}$ is a saturated admissible set of denominators contained in $D_{\tau\hat{p}}$ satisfying $((p))_{(M)} = ((\tau\hat{p}))_{(\widetilde{M})}$ and $\widetilde{M} \cap D_\tau = \{1\}$. This provides a unique presentation of the ideal $((p))_{(M)}$ in the following sense: let $((\tau_1\hat{p}_1))_{(M_1)} = ((\tau_2\hat{p}_2))_{(M_2)}$, where the $\hat{p}_i \in \mathbb{R}[s, z]$ are primitive as polynomials in $z$, the polynomials $\tau_i \in \mathbb{R}[s]$ are monic, and the $M_i \subseteq D_{\tau_i\hat{p}_i}$ are saturated admissible sets of denominators satisfying $M_i \cap D_{\tau_i} = \{1\}$. Then $\tau_1 = \tau_2$, $\hat{p}_1\hat{p}_2^{-1} \in \mathbb{R}$, and $M_1 = M_2$.
3.5 The Ring $\mathcal{H}$ as a Convolution Algebra

Let us recall that the ring $\mathcal{H}$ has been introduced in Chapter 2 as a ring of delay-differential operators acting on $C^\infty(\mathbb{R}, \mathbb{C})$. The main purpose of this section is now to place this situation in the broader context of convolution operators. More precisely, we will describe $\mathcal{H}$ as an algebra of distributions with compact support. The delay-differential operators $\hat{q}$ introduced in Definition 2.9(2) will turn out to be the associated convolution operators. Using the Laplace transform and a suitable Paley-Wiener Theorem, it will be easy to see that $\mathcal{H}$ is (isomorphic to) the space of distributions which are rational expressions in the Dirac impulses $\delta_0^{(1)}$ and $\delta_1$ and have compact support. The structure of these distributions can be exhibited in more detail by going through some additional explicit calculations. In particular, it will turn out that each such distribution can be written as the sum of a piecewise smooth function and a distribution with finite support, hence a polynomial in Dirac distributions. Algebraically,
this is reflected by the decomposition of the functions in $\mathcal{H}$ into their strictly proper and their polynomial part, in a sense to be made precise below. For the algebraic approach to delay-differential equations this description is important because it allows one to abandon the restriction to $C^\infty$-functions for the solutions. Recall that from an algebraic point of view the space $C^\infty(\mathbb{R}, \mathbb{C})$ is very convenient to begin with, simply because it is a module over $\mathbb{R}[s, z, z^{-1}]$. It turns out (cf. Remark 3.5.7) that over the proper part of $\mathcal{H}$ much more general function spaces, for example $L^1_{loc}$, are modules with respect to convolution, too. We will take this aspect into consideration when discussing input/output systems in the next chapter.
For the main line of our approach, where we restrict to $C^\infty$-functions, the results of this section are not strictly necessary. Yet, we think the description in terms of distributions sheds some new light on our investigations. We begin with fixing some notation. Let $\mathcal{D}'$ be the vector space of complex-valued distributions on the space $\mathcal{D} := \{f \in C^\infty(\mathbb{R}, \mathbb{C}) \mid \operatorname{supp} f \text{ is compact}\}$, endowed with the usual inductive limit topology. Here $\operatorname{supp} f$ denotes the support of a function (or distribution) $f$. Furthermore, let
$$\mathcal{D}'_+ := \{T \in \mathcal{D}' \mid \operatorname{supp} T \text{ bounded on the left}\}$$
and
$$\mathcal{D}'_c := \{T \in \mathcal{D}' \mid \operatorname{supp} T \text{ compact}\}.$$
We will identify the distributions in $\mathcal{D}'_c$ with their extensions to distributions on $\mathcal{E} := C^\infty(\mathbb{R}, \mathbb{C})$. The notation $\mathcal{E}$, instead of $E$ as in Chapter 2, is meant to indicate that the space $\mathcal{E}$ is endowed with the topology of uniform convergence in all derivatives on all compact sets. Let $\mathcal{E}_+ := \mathcal{E} \cap \mathcal{D}'_+$ be the space of functions in $\mathcal{E}$ with support bounded on the left. Finally, denote by $\delta_a^{(k)}$ the $k$-th derivative of the Dirac distribution at $a \in \mathbb{R}$. Recall that the convolution $S * T$ of distributions is well-defined and commutative if either both factors are in $\mathcal{D}'_+$ or if at least one factor is in $\mathcal{D}'_c$. Moreover, convolution is associative if either all three factors are in $\mathcal{D}'_+$ or if at least two of them are in $\mathcal{D}'_c$. Finally, $(\mathcal{D}'_+, +, *)$ is an $\mathbb{R}$-algebra without zero divisors and with $\delta_0$ as identity [104, p. 14, p. 28/29] or [128, p. 124-129]. In this setting, differentiation (resp. forward shift) corresponds to convolution with $\delta_0^{(1)}$ (resp. $\delta_1$). Precisely, for $p = \sum_{j=l}^{N} \sum_{i=0}^{r} p_{ij} s^i z^j \in \mathbb{R}[s, z, z^{-1}]$ and $f \in \mathcal{E}$ we have
$$p f = \Big( \sum_{j=l}^{N} \sum_{i=0}^{r} p_{ij}\, \delta_0^{(i)} * \delta_j \Big) * f = p(\delta_0^{(1)}, \delta_1) * f. \qquad (3.5.1)$$
Notice that $\mathbb{R}[\delta_0^{(1)}, \delta_1, \delta_{-1}]$ is a subring of $\mathcal{D}'_c$ and isomorphic to $\mathbb{R}[s, z, z^{-1}]$. This observation has been made already in [61], where it was utilized for a transfer function approach to delay-differential systems.
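Identity (3.5.1) can be spot-checked numerically: for $p = sz + 2$ the operator acts as $p(D, \sigma)f = f'(\cdot - 1) + 2f$, i.e. convolution with $\delta_1^{(1)} + 2\delta_0$. A minimal sketch with $f = \sin$ (the function names are ad hoc):

```python
import math

# p = s*z + 2 acts as (p f)(t) = f'(t - 1) + 2 f(t); this is convolution
# with the distribution delta_1^{(1)} + 2*delta_0.
def f(t):
    return math.sin(t)

def pf_closed_form(t):              # derivative of sin shifted by 1, plus 2f
    return math.cos(t - 1) + 2 * math.sin(t)

def pf_numeric(t, h=1e-5):          # central difference approximating f'(t-1)
    return (f(t - 1 + h) - f(t - 1 - h)) / (2 * h) + 2 * f(t)

err = max(abs(pf_closed_form(t) - pf_numeric(t)) for t in [-2.0, 0.3, 1.7, 4.0])
```

The central-difference error is of order $h^2$, so the two evaluations agree to high accuracy.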
In the subsequent discussions we will also consider the function space
$$PC^\infty := \Big\{ f : \mathbb{R} \to \mathbb{C} \ \Big|\ \exists\, t_k \in \mathbb{R},\ k \in \mathbb{Z}, \text{ such that } t_k < t_{k+1},\ \lim_{k \to \pm\infty} t_k = \pm\infty,\ f|_{(t_k, t_{k+1}]} \in C^\infty\big((t_k, t_{k+1}], \mathbb{C}\big),\ f^{(i)}(t_k+) \in \mathbb{C} \text{ for all } i \Big\} \qquad (3.5.2)$$
of piecewise smooth functions which are left-smooth everywhere and bounded on every finite interval. Note that $C^\infty \subset PC^\infty \subset L^1_{loc} \subset \mathcal{D}'$. Let
$$PC^\infty_+ := \{ f \in PC^\infty \mid \operatorname{supp} f \text{ bounded on the left} \} \subset \mathcal{D}'_+.$$
By use of left-derivatives we can extend the delay-differential operators $p(D, \sigma)$ for $p \in \mathbb{R}[s, z, z^{-1}]$ from $\mathcal{E}$ to $PC^\infty$. Observe that for $f \in PC^\infty$ Equation (3.5.1) does not hold true anymore, as known from $p = s$ and $f$ the Heaviside function. Instead, for a piecewise smooth function $f$ with data as in (3.5.2) one has the identity
$$\delta_j^{(i)} * f = \sigma^j f^{(i)} + \sum_{\mu=0}^{i-1} \sum_{k \in \mathbb{Z}} \big( f^{(\mu)}(t_k+) - f^{(\mu)}(t_k) \big)\, \delta_{t_k + j}^{(i-1-\mu)}, \qquad (3.5.3)$$
where the sum vanishes if $i = 0$, see [103, p. 37/38]. Note that $\sigma^j f^{(i)} \in PC^\infty$. Observe also that (3.5.3) generalizes the well-known identity $\delta_0^{(1)} * H = \delta_0$, where $H$ is the Heaviside function.

The next theorem gives an embedding of the abstract quotient field $\mathbb{R}(\delta_0^{(1)}, \delta_1)$ of $\mathbb{R}[\delta_0^{(1)}, \delta_1, \delta_{-1}]$ into $\mathcal{D}'_+$. Actually, one even obtains an embedding of the quite larger field $\mathbb{R}(\delta_0^{(1)})((\delta_1))$ of formal Laurent series in $\delta_1$ with coefficients in $\mathbb{R}(\delta_0^{(1)})$ into $\mathcal{D}'_+$. Recall Notation 3.1(f) for Laurent series.

Theorem 3.5.1 The field $\mathbb{R}(\delta_0^{(1)})((\delta_1))$ of formal Laurent series in $\delta_1$ with coefficients in $\mathbb{R}(\delta_0^{(1)})$ is a subfield of $\mathcal{D}'_+$. As a consequence, $\mathbb{R}(\delta_0^{(1)}, \delta_1)$ is a subfield of $\mathcal{D}'_+$, too.
PROOF: We begin with the much smaller field $\mathbb{R}(\delta_0^{(1)})$. The inclusion $\mathbb{R}(\delta_0^{(1)}) \subset \mathcal{D}'_+$ is standard in distribution theory, see e.g. [128, 6.3-1]. Explicitly, the inverse of $\phi(\delta_0^{(1)}) = \sum_{i=0}^{r} \phi_i \delta_0^{(i)}$ in $\mathcal{D}'_+$ for a polynomial $\phi = \sum_{i=0}^{r} \phi_i s^i \in \mathbb{R}[s]$ of degree $r > 0$ is a regular distribution in $PC^\infty_+$ and given by the function $g$ defined as
$$g = Hh, \quad \text{where } h \in \ker \phi \subset \mathcal{E} \text{ and } h^{(i)}(0) = \begin{cases} 0 & \text{for } i = 0, \ldots, r-2 \\ \phi_r^{-1} & \text{for } i = r-1 \end{cases} \qquad (3.5.4)$$
and where $H \in PC^\infty_+$ denotes the Heaviside function. (This can also be checked directly by using (3.5.3).) Since $\mathcal{D}'_+$ is a domain, this provides an embedding
$\mathbb{R}(\delta_0^{(1)}) \hookrightarrow \mathcal{D}'_+$, and we denote the elements as $q(\delta_0^{(1)})$ for $q \in \mathbb{R}(s)$. As for the general case of Laurent series, consider an arbitrary series $p = \sum_{j \geq l} p_j(\delta_0^{(1)}) * \delta_j$, where $p_j(\delta_0^{(1)}) \in \mathbb{R}(\delta_0^{(1)})$. By the first part of this proof, the distributions $p_j(\delta_0^{(1)}) * \delta_j$ exist in $\mathcal{D}'_+$. Since they have support in $(j, \infty)$, the series $\sum_{j \geq l} p_j(\delta_0^{(1)}) * \delta_j$ converges in $\mathcal{D}'$ with respect to the weak topology. Thus completeness of $\mathcal{D}'$ yields $p \in \mathcal{D}'$, too. Since convolution of two such Laurent series is continuous in each factor [57, 41.8], formal multiplication of the series is identical to convolution in $\mathcal{D}'_+$, and consequently the field $\mathbb{R}(\delta_0^{(1)})((\delta_1))$ is a subfield of $\mathcal{D}'_+$. $\Box$

Example 3.5.2 Let us compute $q(\delta_0^{(1)}, \delta_1) \in \mathbb{R}(\delta_0^{(1)}, \delta_1)$ for $q = \frac{e^{\lambda L} z^L - 1}{s - \lambda} \in \mathcal{H}_0$. Defining $g \in PC^\infty_+$ as $g(t) = 0$ for $t \leq 0$ and $g(t) = e^{\lambda t}$ if $t > 0$, we obtain $(\delta_0^{(1)} - \lambda)^{-1} = g$ and hence
$$q(\delta_0^{(1)}, \delta_1) = (e^{\lambda L}\delta_L - 1) * g = e^{\lambda L}\sigma^L g - g =: \tilde{g} \in PC^\infty_+,$$
where $\tilde{g}(t) = -e^{\lambda t}$ for $t \in (0, L]$ and $\tilde{g}(t) = 0$ elsewhere. The function $\tilde{g}$ has compact support and therefore defines a convolution operator
$$(\tilde{g} * f)(t) = \int \tilde{g}(\tau) f(t - \tau)\, d\tau = -\int_0^L e^{\lambda\tau} f(t - \tau)\, d\tau \quad \text{for } f \in \mathcal{E}.$$
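As a sanity check on Example 3.5.2, the Laplace transform of $\tilde{g}$ must reproduce $q^*(s) = (e^{\lambda L} e^{-sL} - 1)/(s - \lambda)$. Below a plain Simpson-rule verification for arbitrarily chosen real data $\lambda = 0.5$, $L = 1$, $s = 2$ (a numerical sketch, not part of the book's argument):

```python
import math

lam, L, s = 0.5, 1.0, 2.0

def gtilde(t):                      # kernel from Example 3.5.2
    return -math.exp(lam * t) if 0 < t <= L else 0.0

def simpson(fun, a, b, n=2000):     # composite Simpson rule, n even
    h = (b - a) / n
    acc = fun(a) + fun(b)
    for i in range(1, n):
        acc += fun(a + i * h) * (4 if i % 2 else 2)
    return acc * h / 3

# start slightly above 0 to avoid sampling the jump of gtilde at t = 0
laplace_gtilde = simpson(lambda t: gtilde(t) * math.exp(-s * t), 1e-12, L)
qstar = (math.exp(lam * L) * math.exp(-s * L) - 1) / (s - lam)
```

Both values come out near $-0.5179$, confirming the computed kernel.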
Notice that this is just the operator $\hat{q}$ that we calculated in Example 2.7. For this particular function $q$, the distribution $q(\delta_0^{(1)}, \delta_1)$ is regular. We will see in Proposition 3.5.8 at the end of this section which regular distributions stem from functions in $\mathcal{H}$.

Remark 3.5.3 Using the embedding $\mathbb{R}(s, z) \subset \mathbb{R}(s)((z^{-1}))$, we obtain in the same way the inclusion $\mathbb{R}(\delta_0^{(1)}, \delta_1) \subset \mathcal{D}'_-$, the space of distributions with support bounded on the right. E.g., the inverse of $\delta_0^{(1)}$ in $\mathcal{D}'_-$ is given by $H - 1$ (with $H$ being again the Heaviside function). This consideration provides an alternative proof for the surjectivity of delay-differential operators (see Proposition 2.14). Indeed, let $p \in \mathbb{R}[s, z]$ and $g \in \mathcal{E}$. For solving $p(\delta_0^{(1)}, \delta_1) * f = g$, decompose $g = g_+ + g_-$, where $g_+ \in \mathcal{E}_+$ and $g_- \in \mathcal{E}_-$ (the minus subscript indicating the support being bounded on the right), which is certainly possible. Since $\mathcal{D}'_+ * \mathcal{E}_+ \subseteq \mathcal{E}_+$ and $\mathcal{D}'_- * \mathcal{E}_- \subseteq \mathcal{E}_-$, we obtain unique solutions
$$f_+ := p(\delta_0^{(1)}, \delta_1)^{-1} * g_+ \in \mathcal{E}_+ \quad \text{and} \quad f_- := p(\delta_0^{(1)}, \delta_1)^{-1} * g_- \in \mathcal{E}_-$$
in the respective spaces (by abuse of notation, here the latter expression $p(\delta_0^{(1)}, \delta_1)^{-1}$ denotes the inverse in $\mathcal{D}'_-$) and thus
$$p(\delta_0^{(1)}, \delta_1) * (f_+ + f_-) = g$$
(the last convolution is well-defined since $p(\delta_0^{(1)}, \delta_1) \in \mathcal{D}'_c$). Observe that the solution $f = f_+ + f_-$ depends on the choice of the decomposition of $g$.
The following result will be of some benefit for treating causality questions later on.

Lemma 3.5.4 Let $q \in \mathbb{R}(s, z) \setminus \{0\}$ be given as the Laurent series $q = \sum_{j \geq l} q_j(s) z^j$ with $q_j \in \mathbb{R}(s)$ and $q_l \neq 0$. Then for each nonzero function $u \in \mathcal{D}$ with $\operatorname{supp} u \subset (0, 1)$ one has $\big(q(\delta_0^{(1)}, \delta_1) * u\big)\big|_{(l, l+1)} \not\equiv 0$.

PROOF: First notice that $q(\delta_0^{(1)}, \delta_1) * u \in \mathcal{E}_+$. Moreover, $q$ has an inverse in $\mathbb{R}(s)((z))$ of the form $q^{-1} = \sum_{j \geq -l} f_j(s) z^j$, where $f_{-l} \neq 0$. Thus $u = q^{-1}(\delta_0^{(1)}, \delta_1) * q(\delta_0^{(1)}, \delta_1) * u$ in the domain $\mathcal{D}'_+$. Now the assertion follows, for otherwise $\operatorname{supp}\big(q(\delta_0^{(1)}, \delta_1) * u\big) \subset (l+1, \infty)$ and this would imply $\operatorname{supp} u \subset (1, \infty)$. $\Box$
Now we can investigate the subring $\mathcal{H} \subseteq \mathbb{R}(s, z)$ with regard to the embedding $\mathbb{R}(\delta_0^{(1)}, \delta_1) \subseteq \mathcal{D}'_+$. Let us first give a brief outline of what follows. It is easy to see that the characteristic function $q^*$ introduced in (2.8) is, in terms of distributions, just the Laplace transform of $q(\delta_0^{(1)}, \delta_1)$. Since $q^*$ is an entire function whenever $q \in \mathcal{H}$, this suggests that $q(\delta_0^{(1)}, \delta_1)$ should have compact support. Indeed, $\mathcal{H}$ can be embedded in the Paley-Wiener algebra of the Laplace transforms of distributions with compact support, so that finally $\mathcal{H}$ is (isomorphic to) the subalgebra of distributions with compact support in $\mathbb{R}(\delta_0^{(1)}, \delta_1)$. All this together is the content of Theorem 3.5.6 below. Before presenting the details, we wish to give an explicit description of the distributions in $\mathcal{H}$, and even those in $\mathbb{R}(s)[z, z^{-1}]$. To this end, let $q = \frac{p}{\phi} \in \mathbb{R}(s)[z, z^{-1}]$, where
$$p = \sum_{j=l}^{L} \sum_{i=0}^{N} p_{ij} s^i z^j \quad \text{and} \quad \phi = \sum_{i=0}^{r} \phi_i s^i. \qquad (3.5.5)$$
Assume $r = \deg \phi > 0$ and let $\phi(\delta_0^{(1)})^{-1} = g = Hh$ be as in (3.5.4). Using (3.5.3) one derives
$$q(\delta_0^{(1)}, \delta_1) = p(g) + \sum_{j=l}^{L} \sum_{i=0}^{N} p_{ij} \sum_{\mu=0}^{i-1} h^{(\mu)}(0)\, \delta_j^{(i-1-\mu)}, \qquad (3.5.6)$$
where $p(g) \in PC^\infty_+$ is understood via the left-derivatives of $g$. Since $p$ and $\phi$ have real coefficients, the function $g$ and consequently $p(g)$ are actually real-valued, and the finite sum in (3.5.6) is a polynomial in $\mathbb{R}[\delta_0^{(1)}, \delta_1, \delta_{-1}]$, which we call the impulsive part of the distribution $q(\delta_0^{(1)}, \delta_1)$. It vanishes if and only if $N < r$ (for $N \geq r$ the coefficient of $\delta_l^{(N-r)}$ is nonzero). As a result, the distribution $q(\delta_0^{(1)}, \delta_1)$ decomposes into a regular distribution $p(g)$ and a finite impulsive part.
Algebraically, this decomposition can be expressed by the decomposition of $q$ into its strictly proper rational and its polynomial part. Indeed, performing division with remainder in the ring $\mathbb{R}[s, z, z^{-1}] = \mathbb{R}[z, z^{-1}][s]$ we derive
$$p = a\phi + b \quad \text{for some } a, b \in \mathbb{R}[s, z, z^{-1}] \text{ with } \deg_s b < \deg \phi, \quad \text{hence} \quad q = \frac{b}{\phi} + a.$$
The foregoing discussion shows that $b\phi^{-1}$ corresponds to the regular part $p(g)$, while $a(\delta_0^{(1)}, \delta_1)$ constitutes the impulsive part. Restricted to $\mathcal{H}_0$, this yields the direct decomposition
$$\mathcal{H}_0 = \mathcal{H}_{0,sp} \oplus \mathbb{R}[s, z], \qquad (3.5.7)$$
where
$$\mathcal{H}_{0,sp} := \{ p\phi^{-1} \in \mathcal{H}_0 \mid \deg_s p < \deg \phi \}, \qquad \mathcal{H}_{0,p} := \{ p\phi^{-1} \in \mathcal{H}_0 \mid \deg_s p \leq \deg \phi \} \qquad (3.5.8)$$
are the subrings of (strictly) proper functions in $\mathcal{H}_0$. Both spaces will be needed in later sections. Now we turn to the characterization of the distributions in $\mathcal{H}_0$. It is obvious that the impulsive part of $q(\delta_0^{(1)}, \delta_1)$ in (3.5.6) always has compact support. As for the regular part, this is true if and only if $q \in \mathcal{H}$, as will be shown in Theorem 3.5.6 below. All the results given there could be derived from (3.5.6). However, we would like to draw also the link to the corresponding Paley-Wiener Theorem. Recall the notation $e_{0,\lambda} \in \mathcal{E}$ for the functions given by $e_{0,\lambda}(t) = e^{\lambda t}$. The following theorem is formulated in terms of the Laplace transform. The version using the Fourier transform $\mathcal{F}T = \langle T, e_{0,-is} \rangle$ is more common, and we refer to [96, Thm. 7.23] for a proof of the theorem below in terms of the Fourier transform. For us the Laplace transform is more convenient, simply because it leads directly to the characteristic functions $q^* \in H(\mathbb{C})$. Recall that we identify distributions with compact support with their extensions to distributions on $\mathcal{E}$.

Theorem 3.5.5
The Laplace transform
$$\mathcal{L} : \mathcal{D}'_c \to H(\mathbb{C}), \qquad T \mapsto \big( s \mapsto \langle T, e_{0,-s} \rangle \big),$$
induces an isomorphism from $\mathcal{D}'_c$ onto the Paley-Wiener algebra
$$PW(\mathbb{C}) := \big\{ f \in H(\mathbb{C}) \;\big|\; \exists\, C, a > 0,\ N \in \mathbb{N}_0\ \forall\, s \in \mathbb{C} : |f(s)| \leq C(1 + |s|)^N e^{a|\operatorname{Re} s|} \big\}.$$
The constant $a > 0$ can be chosen such that $\operatorname{supp} T \subseteq [-a, a]$.
Now we can present the following description of the algebra 1-t (see also [39, Thm. 2.8] where the result appeared first). Part (iv) states that the delaydifferential operators introduced in Definition 2.9(2) are simply convolution operators induced by certain distributions with compact support acting on £. Theorem 3.5.6 1 (i) Each distribution q(8a ), 81) E IR(8a1))[81l 8-1] admits a Laplace transform. The transform is given by q*. (ii) {q* I q E 7-t} = PW(
(iii) The monomorphism q ~ q(8a1), 81) from JR(s, z) into V~ induces the identities 1 1-t = { q E JR(s, z) I q(8a ), 81) E V~}, 1-to = {q E Hlsuppq(8a1),81) c [O,oo)}, 7-to,sp = { q E 7-to Iq(8al), 81) E PCf }.
(iv) q(8a1), 81)
*f
=
q(f) for all q E 1-i and f
E
£.
1 (i) Let q be as in (3.5.5). It has to be shown that eo,-cq(8a ), 81) is a tempered distribution for some c E JR, cf. [57, p. 231]. This can be deduced from the representation (3.5.6) as follows. The impulsive part has compact support and is therefore tempered. The regular part satisfies pg(t) = ph(t) for t > L where h is as in (3.5.4). Since ph is an exponential polynomial, this term can be made tempered, too. The second part of the assertion follows from linearity and multiplicativity of the Laplace transform along with the fact that sie-js is the transform of 8Ji). (ii) "~" For q E IR[s, z, z- 1] the characterizing estimate has been given in Proposition 2.5(1). For q = p¢- 1 E 1-t there exists a compact set K C
in its interior, and hence $|\phi(s)| > M > 0$ for all $s \in \mathbb{C} \setminus K$, which is what we wanted. $\Box$
3 The Algebraic Structure of $\mathcal{H}_0$
Next, we would like to draw some specific conclusions for distributions in $\mathbb{R}(\delta_0^{(1)})[\delta_1, \delta_1^{-1}]$ using the calculations and representations given above.

Remark 3.5.7
As a special case of the decomposition (3.5.6) we remark that for $q \in \mathcal{H}_{0,sp}$ the associated operator is a convolution of the form
$$f \mapsto \int_0^L g(\tau)\, f(\cdot - \tau)\, d\tau$$
with kernel $g \in PC^\infty$ having support in $(0, L]$ for some $L > 0$. As a consequence, this operator can be applied to much more general function spaces. E.g., the spaces $L^1$, $L^1_{loc}$, $C^m$ where $0 \le m \le \infty$, or $PC^\infty$ (all spaces consisting of complex-valued functions defined on $\mathbb{R}$) are $\mathcal{H}_{0,sp}$-modules. The same is true when replacing $\mathcal{H}_{0,sp}$ by $\mathcal{H}_{0,p}$, since the polynomial part of each $q \in \mathcal{H}_{0,p}$ is in $\mathbb{R}[z]$, over which the above-mentioned spaces are modules as well. If we restrict to one-sided functions, we can say even more. Define
$$\mathbb{R}(s)_{sp} := \Big\{ \tfrac{a}{b} \in \mathbb{R}(s) \;\Big|\; \deg a < \deg b \Big\} \subset \mathbb{R}(s)_p := \Big\{ \tfrac{a}{b} \in \mathbb{R}(s) \;\Big|\; \deg a \le \deg b \Big\}$$
to be the rings of (strictly) proper rational functions in $s$. Then $\mathbb{R}(s)_p((z))$ denotes the ring of Laurent series in $z$ with proper rational functions as coefficients (see 3.1(f)), and the discussion following (3.5.6) leads to the embeddings …
As a consequence, the subspaces $(L^1)_+$, $(L^1_{loc})_+$, $(C^m)_+$, and $PC^\infty_+$ consisting of functions with support bounded on the left are modules over $\mathbb{R}(s)_p((z))$. Hence these spaces qualify as underlying function modules for "proper" delay-differential operators, that is, operators having no differentiation involved. The same is true for the real-valued analogues. We will come back to this interpretation in Section 4.2 when investigating input/output operators. We end this section with the following description of the distributions in $\mathcal{H}_{0,sp}$.

Proposition 3.5.8
Let $g \in PC^\infty_+$ be a function such that $\operatorname{supp} g \subseteq (0, L]$ for some $L \in \mathbb{N}$. Then $g = q(\delta_0^{(1)}, \delta_1)$ for some $q \in \mathcal{H}_{0,sp}$ if and only if for every $k \in \{0, \ldots, L-1\}$ the restricted function $g|_{(k,k+1]}$ is a finite linear combination of functions from the set
$$S := \big\{ e_{j,\lambda}\,\big(a \sin(\mu\,\cdot) + b \cos(\mu\,\cdot)\big) \;\big|\; \lambda, \mu, a, b \in \mathbb{R},\ j \in \mathbb{N}_0 \big\}.$$
PROOF: Necessity follows from (3.5.4) and (3.5.6). For sufficiency it is enough to show that the Laplace transform of a function $g$ of the above type is of the form $q^*$ for some $q \in \mathcal{H}_{0,sp}$. To do so, consider the finite Laplace transform
$\mathcal{L}g(s) := \int_k^{k+1} e^{-st} g(t)\, dt$ for an arbitrary integer $k$. For all $j \in \mathbb{N}_0$ and $a \in$ …
$$\mathcal{L}e_{0,a}(s) = \frac{e^{k(a-s)}\big(e^{a-s} - 1\big)}{a-s} = \frac{\big[e^{ka} z^k (e^a z - 1)\big]^*}{a-s}$$
… are entire functions. But this can easily be seen using Proposition 3.1.2(a). $\Box$
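Remark 3.5.7 above says that operators given by $q \in \mathcal{H}_{0,sp}$ act as distributed delays $f \mapsto \int_0^L g(\tau) f(\cdot - \tau)\, d\tau$ with a piecewise-smooth kernel $g$ supported in $(0, L]$. The following numerical sketch (kernel and input chosen here purely for illustration, not taken from the text) approximates such an operator on a grid; for the constant input $f \equiv 1$ the output is the constant $\int_0^L g$:

```python
import numpy as np

def distributed_delay(f, g, L, t, n=4001):
    """Trapezoidal approximation of (q f)(t) = int_0^L g(tau) f(t - tau) dtau."""
    tau = np.linspace(0.0, L, n)
    vals = g(tau) * f(t - tau)
    dt = tau[1] - tau[0]
    return dt * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

# Illustrative choices: kernel g(tau) = e^{-tau} on (0, 1], input f == 1.
# Then the output is constant in t:  int_0^1 e^{-tau} dtau = 1 - 1/e.
g = lambda tau: np.exp(-tau)
f = lambda t: np.ones_like(np.asarray(t, dtype=float))
val = distributed_delay(f, g, L=1.0, t=0.0)
assert abs(val - (1 - np.exp(-1))) < 1e-6
```

Because no differentiation is involved, the same recipe makes sense for merely integrable inputs, which is exactly why the spaces $L^1$, $L^1_{loc}$, $C^m$, $PC^\infty$ are $\mathcal{H}_{0,sp}$-modules.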
3.6 Computing the Bezout Identity

As indicated in the title, we will now get back to the ring structure of $\mathcal{H}$ and discuss it from a computational point of view. More precisely, we will reconsider the construction of greatest common divisors and representing Bezout identities with respect to their exact computability, that is, symbolic (as opposed to numerical) computability. As can be seen by reviewing Sections 3.1 and 3.2, the Bezout identities form the main ingredient for all other constructions given there, like adequate factorizations or unimodular transformations of matrices into triangular forms. As an indispensable prerequisite for symbolic computations one needs, of course, a way to represent the objects on a computer. It will turn out that this part is the main (and only) difficulty for the symbolic computability of Bezout identities. In order to become more specific about these problems and how they can be dealt with, we first introduce the notion of computability (also known as effectiveness or decidability), as it is common in the computer algebra literature, see, e.g., [14, 2]. The outline of this section will be resumed thereafter.

Definition 3.6.1 (see [2, pp. 178])
A ring (field) is called computable if
(a) each element can be represented on a computer in such a way that equality of any two given elements can be tested by means of an algorithm, and
(b) the ring (field) operations can be performed algorithmically.

It is known that $\mathbb{Q}$ is computable and that the field $K(x_1, \ldots, x_n)$ of rational functions is computable whenever $K$ is a computable field. Moreover, $K(\alpha)$ is computable if $\alpha$ is algebraic over the computable field $K$ and its minimal polynomial is known, see [2, pp. 178/179]. We remark that the definition given above does not imply the existence of algorithms which, on any input, calculate the desired objects in a reasonable way. Computability is concerned only with the (theoretical) possibility of symbolic
computations. In fact, the arguments given below will show that, under certain assumptions, Bezout equations are computable in $\mathcal{H}_0$ by means of an algorithm. But even on reasonably small input, the computations might lead already after a few steps to a pretty large output. Definition 3.6.1 can be extended to define computable Bezout domains by adding the requirement that for each set of given elements a Bezout identity (see (3.1.2)) can be computed algorithmically. (Likewise one can define computable Euclidean domains.) It is the purpose of this section to study whether $\mathcal{H}_0$ is a computable Bezout domain. In this generality, however, an affirmative answer would imply $\mathbb{R} \subset \mathcal{H}_0$ to be a computable field. Because of Definition 3.6.1(a), this requires in particular a symbolic representation of real numbers and decidability (in finite time) of equality of any two such numbers, which is impossible in practice, see also [15, p. 6]. Therefore, it is reasonable to reduce the question about computability of Bezout equations to the subclass of objects which may arise if one starts with polynomials in the computable domain $\mathbb{Q}[s,z] \subset \mathcal{H}_0$. In Example 3.1.9(2) and (3) we demonstrated how a Bezout equation for polynomials $p, q \in \mathbb{Q}[s,z]$ inside $\mathcal{H}_0$ might require the field extension $\mathbb{Q}(e)$, while for $p, q \in \mathbb{Q}(e)(s)[z] \cap \mathcal{H}_0$ one might even be led to coefficients in $\mathbb{Q}(e, e^e)$. Thus, in this example, we have to be concerned with the computability of the field $\mathbb{Q}(e, e^e)$. Recall from Example 3.1.9 that the transcendence degree of $\mathbb{Q}(e, e^e)$ seems to be unknown! As a consequence this field is not computable, implying that in general no Bezout equations for functions in $\mathbb{Q}(e, e^e)(s)[z] \cap \mathcal{H}_0$ can be computed symbolically. This example is quite simple, but nevertheless typical of the general situation, see Theorem 3.6.3 below. Successive Bezout equations as in Example 3.1.9 are for instance necessary for transforming matrices into triangular form.
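Definition 3.6.1 can be made concrete in a computer algebra system. The following sketch (illustrative, not part of the original text) shows the two requirements for the computable field $\mathbb{Q}(x, y)$: elements are represented symbolically, and equality is decided algorithmically by cancelling the difference to a normal form:

```python
# Equality testing and ring operations in the computable field Q(x, y),
# in the sense of Definition 3.6.1, using SymPy's normal-form algorithm.
from sympy import symbols, cancel, Rational

x, y = symbols('x y')
a = (x**2 - y**2) / (x - y)          # same element of Q(x, y) ...
b = x + y                            # ... in two different representations
assert cancel(a - b) == 0            # requirement (a): algorithmic equality test

# Requirement (b): field operations are performed algorithmically.
c = cancel(a * b + Rational(1, 2))
assert cancel(c - (x + y)**2 - Rational(1, 2)) == 0
```

For $\mathbb{R}$ no such normal form exists, which is exactly why the text restricts attention to coefficients arising from $\mathbb{Q}[s,z]$.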
We will see that the only obstacle for the computability of Bezout equations (starting in $\mathbb{Q}[s,z]$) is the unknown transcendence degree of the field extensions needed for the coefficients. These extensions arise from adjoining elements $\lambda$ and $e^\lambda$, where $\lambda \in \mathbb{C}$ …
… extension of the coefficient fields. Keeping track of all the successive steps in the procedure, we finally arrive at the computability of the field extensions. All this together leads to the result that, assuming Schanuel's conjecture, a Bezout equation for given $p_1, \ldots, p_n \in \mathcal{H}_0$ is computable provided that $p_1, \ldots, p_n$ have coefficients in a field extension of $\mathbb{Q}$ of the above-mentioned type. In the sequel we will provide the details for this statement. We first review the corresponding proofs in Section 3.1 and determine the field extensions needed for the successive steps. Thereafter, the computability of the desired objects is investigated step by step. At the end of the section we turn to Bezout identities for generic polynomials in $\mathbb{Q}[s,z]$. Whereas the computation of a Bezout equation for two generic polynomials in $\mathbb{Q}[s,z]$ requires field extensions of the above type, the situation is different for three or more generic polynomials. We will show that in the latter case there exists a Bezout identity with coefficients in the polynomial ring $\mathbb{Q}[s,z]$. As a consequence, no computational difficulties arise in that case. In order to avoid confusion we want to emphasize that the term Bezout identity (or Bezout equation), for elements $p, q$, say, will always refer to the ring $\mathcal{H}_0$; that is, it stands for an expression of $\gcd_{\mathcal{H}_0}(p, q)$ as a linear combination of $p$ and $q$ with coefficients in $\mathcal{H}_0$. To keep things a little simpler we will not be concerned with the question of real or complex coefficients; that is, throughout this (and only this) section, let $\mathcal{H}_0$ be the ring $\mathcal{H}_0 = \{ p\phi^{-1} \mid p \in$ …
Lemma 3.6.2
Let $F \subseteq \mathbb{C}$ be a field extension of $\mathbb{Q}$ and $\phi \in F[s]$. Furthermore, let $k \in \mathbb{N}_0$. Then there exist numbers $\lambda_1, \ldots, \lambda_l \in \mathbb{C}$, which are algebraic over $F$, and a polynomial $\delta \in F(\lambda_1, \ldots, \lambda_l, e^{\lambda_1}, \ldots, e^{\lambda_l})[s]$ such that
$$(z^k - \delta)\phi^{-1} \in F(\lambda_1, \ldots, \lambda_l, e^{\lambda_1}, \ldots, e^{\lambda_l})(s)[z] \cap \mathcal{H}_0.$$
As discussed above, all other steps of the procedure for finding a Bezout identity can be performed over the current field of coefficients. Starting with $p_1, \ldots, p_n \in \mathbb{Q}[s,z]$, the procedure therefore leads to the following towers of successive field extensions. Put
(3.6.1)
Without restriction one may assume that $\lambda_1, \ldots, \lambda_{l_N}$ are linearly independent over $\mathbb{Q}$. This will be of importance later when "applying" Schanuel's conjecture to the fields $F_N$. The foregoing arguments show

Theorem 3.6.3
Let $F = F_N$ be a field as in (3.6.1) and let $p_1, \ldots, p_n \in F(s)[z] \cap \mathcal{H}_0$. Then there exist finitely many field extensions $F_{N+1}, \ldots, F_{N+k}$ of the type (3.6.1) and functions $d, a_1, \ldots, a_n \in F_{N+k}(s)[z] \cap \mathcal{H}_0$ such that
$$d = \gcd_{\mathcal{H}_0}(p_1, \ldots, p_n) = a_1 p_1 + \cdots + a_n p_n.$$
Therefore, even an iterative process of several Bezout equations, using the outcome of one step (or certain transformations of it) as the input for the next step, always leads to a field extension of the above type (3.6.1), as long as the process has been initiated with such coefficients. This applies for instance to the transformation of matrices into upper triangular form. Before we turn to the symbolic computability of the ingredients, we would like to comment on the procedure in general. Although the version given in the proof of Theorem 3.1.6(a), based on successive division with remainder, is natural for the computation of a greatest common divisor, it is far from optimal. This applies not only to the number of steps (we touched upon this in Example 3.1.9(3)), but also to the field extensions needed. Indeed, for $p_1, \ldots, p_n \in \mathbb{Q}[s,z]$ an extension of the type $F_1$ (see (3.6.1)) suffices for a Bezout identity, a fact that is not apparent from the above discussion. We will show this for $n = 2$; the general case follows by a straightforward generalization, see also [39, Rem. 2.5].
Proposition 3.6.4
(a) $\mathbb{Q}(s)[z] \cap \mathcal{H}_0 = \big\{ \frac{p}{s^l} \;\big|\; p \in \mathbb{Q}[s,z],\ l \in \mathbb{N}_0 \big\} \cap \mathcal{H}_0$.
(b) Let $p_1, p_2 \in \mathbb{Q}(s)[z] \cap \mathcal{H}_0$. Then $d := \gcd_{\mathcal{H}_0}(p_1, p_2) \in \mathbb{Q}(s)[z] \cap \mathcal{H}_0$, and there exist a field extension $F_1$ as in (3.6.1) and functions $a_1, a_2 \in F_1(s)[z] \cap \mathcal{H}_0$ such that $d = a_1 p_1 + a_2 p_2$.

PROOF: (a) Let $p\phi^{-1} \in$
$\mathbb{Q}(s)[z] \cap \mathcal{H}_0$, where $p \in \mathbb{Q}[s,z]$ and $\phi \in \mathbb{Q}[s]$ are coprime in $\mathbb{Q}[s,z]$ and $\phi$ is monic. If $p \in \mathbb{Q}[s]$, then $\phi = 1$ and the assertion follows. Thus let $p \in \mathbb{Q}[s,z] \setminus \mathbb{Q}[s]$ and pick some $\lambda \in V(\phi) \subset \mathbb{C}$. Then $\lambda$ is algebraic and $e^{-\lambda} \in V(\bar{p})$, where $\bar{p} := p(\lambda, z) \in \mathbb{Q}(\lambda)[z] \setminus \{0\}$. Hence $e^{-\lambda}$ is algebraic, too, which by the Theorem of Lindemann-Weierstrass [56, pp. 277] yields $\lambda = 0$. This shows that $\phi = s^l$ for some $l \in \mathbb{N}_0$, as asserted.
(b) Write $p_i = q_i s^{-r_i}$ where $q_i \in \mathbb{Q}[s,z]$ and $r_i \in \mathbb{N}_0$. Let $b := \gcd_{\mathbb{Q}[s,z]}(q_1, q_2) \in \mathbb{Q}[s,z]$ and put $c_i := q_i b^{-1} \in \mathbb{Q}[s,z]$. Using Proposition 3.1.2(h) we can find factorizations
$$p_i = \frac{c_i}{s^{l_i}} \cdot \frac{b}{s^r}, \qquad i = 1, 2,$$
where both fractions are in $\mathbb{Q}(s)[z] \cap \mathcal{H}_0$ and $l_i + r = r_i$. By construction $c_1$ and $c_2$ are coprime in $\mathbb{Q}[s,z]$, and from [18, Ch. 3.5, Cor. 4] one derives that $c_1 s^{-l_1}$ and $c_2 s^{-l_2}$ are coprime even in the larger ring $\mathbb{Q}(s)[z]$. Using the fact that $\mathbb{Q}(s)[z]$ is a principal ideal domain, we can therefore find $b_1, b_2 \in \mathbb{Q}[s,z]$ and $\phi \in \mathbb{Q}[s]$ such that
$$b_1 \frac{c_1}{s^{l_1}} + b_2 \frac{c_2}{s^{l_2}} = \phi. \qquad (3.6.2)$$
In order to proceed we have to consider the following two cases.
1. Case: $c_1 s^{-l_1}, c_2 s^{-l_2} \in \mathbb{Q}[s]$. In this case we can arrange Equation (3.6.2) with polynomials $b_1, b_2 \in \mathbb{Q}[s]$ and $\phi = \gcd_{\mathbb{Q}[s]}(c_1 s^{-l_1}, c_2 s^{-l_2}) = \gcd_{\mathcal{H}_0}(c_1 s^{-l_1}, c_2 s^{-l_2})$. Coprimeness of $c_1$ and $c_2$ in $\mathbb{Q}[s]$ even yields $\phi = 1$, and it follows that $b s^{-r} = \gcd_{\mathcal{H}_0}(p_1, p_2) \in \mathbb{Q}(s)[z] \cap \mathcal{H}_0$, which proves the first part of (b). Furthermore, $b s^{-r} = b_1 p_1 + b_2 p_2$ is a Bezout identity with all terms in $\mathbb{Q}(s)[z] \cap \mathcal{H}_0$.
2. Case: $\deg_z c_i > 0$ for at least one $i$. Equation (3.6.2) implies $V(c_1 s^{-l_1}, c_2 s^{-l_2}) \subseteq V(\phi)$. From this it follows, as in the proof of (a), that the only possible common root of $c_1 s^{-l_1}$ and $c_2 s^{-l_2}$ is zero. Hence $\gcd_{\mathcal{H}_0}(c_1 s^{-l_1}, c_2 s^{-l_2}) = s^l$ for some $l \in \mathbb{N}_0$, and $b s^{l-r} = \gcd_{\mathcal{H}_0}(p_1, p_2)$ is again in $\mathbb{Q}(s)[z] \cap \mathcal{H}_0$. As for the second statement of (b), consider
$$b_1 \frac{c_1}{s^{l+l_1}} + b_2 \frac{c_2}{s^{l+l_2}} = \frac{\phi}{s^l}, \qquad (3.6.3)$$
which is an equation with all terms on the left-hand side in $\mathbb{Q}(s)[z] \cap \mathcal{H}_0$. Thus $\psi := \phi s^{-l}$ is a polynomial in $\mathbb{Q}[s]$. It remains to eliminate the roots of $\psi$. For each $\lambda \in V(\psi)$ we have
therefore
$$\begin{pmatrix} b_1(\lambda) \\ b_2(\lambda) \end{pmatrix} \in \ker_{\mathbb{C}} \left[ \Big(\frac{c_1}{s^{l+l_1}}\Big)^{\!*}(\lambda),\ \Big(\frac{c_2}{s^{l+l_2}}\Big)^{\!*}(\lambda) \right] = \operatorname{im}_{\mathbb{C}} \begin{bmatrix} \big(\frac{c_2}{s^{l+l_2}}\big)^{\!*}(\lambda) \\[2pt] -\big(\frac{c_1}{s^{l+l_1}}\big)^{\!*}(\lambda) \end{bmatrix},$$
the latter identity being valid since, by coprimeness of $c_1 s^{-l-l_1}$ and $c_2 s^{-l-l_2}$ in $\mathcal{H}_0$, both matrices have rank 1 at every point $\lambda \in \mathbb{C}$. Since all entries involved are in the field $\mathbb{Q}(\lambda, e^\lambda)$, this implies the existence of some $c \in \mathbb{Q}(\lambda, e^\lambda)$ satisfying
Now, we can adjust (3.6.3) to
where all quotients are in $\mathbb{Q}(\lambda, e^\lambda)(s)[z] \cap \mathcal{H}_0$. Since each zero of $\psi(s-\lambda)^{-1}$ is algebraic, we can proceed in this way and finally obtain a field extension $F_1$ as in (3.6.1) and an equation
for some functions $a_1, a_2 \in F_1(s)[z] \cap \mathcal{H}_0$. We also get the desired Bezout identity $a_1 p_1 + a_2 p_2 = b s^{l-r}$. $\Box$
Let us now return to the investigation of the procedure in the proof of Theorem 3.1.6(a) for finding a Bezout identity. Despite its non-optimal character, this procedure is quite convenient with regard to computability. The discussion preceding Lemma 3.6.2 shows that a Bezout equation is computable by means of an algorithm if
(a) all the occurring coefficient fields are computable in the sense of Definition 3.6.1, and
(b) the zeros of univariate polynomials (in $s$) over these coefficient fields can be determined by means of an algorithm.
Indeed, univariate polynomials over a computable field form a computable Euclidean domain, hence greatest common divisors and their Bezout equations within this Euclidean domain can be computed. Besides this, only the interpolating polynomials $\delta$ for $(z^k - \delta)\phi^{-1} \in \mathcal{H}_0$ are needed for the procedure in Theorem 3.1.6. But they can be written down explicitly once the zeros of $\phi$, along with their multiplicities, have been determined exactly, and this will be addressed in (b). Let us begin with part (a). Recall that the relevant fields occurring in the process are of the type $F_N$ as in (3.6.1); that is, they consist of successive adjunctions
of algebraic elements $\lambda$ along with exponentials $e^\lambda$. Computability, as required in (a) above, is questionable without any knowledge about the transcendence degree of the field. But this is indeed an open problem, a special instance of a still open but generally believed conjecture, attributed to Schanuel.

3.6.5 Schanuel's Conjecture (see [67, p. 687])
If $\lambda_1, \ldots, \lambda_l$ are complex numbers, linearly independent over $\mathbb{Q}$, then the transcendence degree of $\mathbb{Q}(\lambda_1, \ldots, \lambda_l, e^{\lambda_1}, \ldots, e^{\lambda_l})$ is at least $l$.

Notice that in the special case where $\lambda_1, \ldots, \lambda_l$ are algebraic numbers, it is known that the transcendence degree of $\mathbb{Q}(\lambda_1, \ldots, \lambda_l, e^{\lambda_1}, \ldots, e^{\lambda_l})$ is equal to $l$. This is the well-known Theorem of Lindemann-Weierstrass [56, pp. 277]. A verification of the conjecture would answer a lot of questions concerning the algebraic independence of given transcendental numbers, like, say, $e$ and $\pi$ (where it is in fact even unknown whether $e + \pi$ is irrational!), or $e$ and $e^e$. In our situation, it would even provide the exact transcendence degree along with a transcendence basis for the fields $F_N$ as in (3.6.1). Indeed, Schanuel's conjecture leads to
$$\operatorname{tr.deg} F_N = \operatorname{tr.deg} \mathbb{Q}(\lambda_1, \ldots, \lambda_{l_N}, e^{\lambda_1}, \ldots, e^{\lambda_{l_N}}) = l_N,$$
since $\lambda_1, \ldots, \lambda_{l_N}$ are algebraic over certain subfields of $F_N$ and taken to be linearly independent over $\mathbb{Q}$. Thus, the fields $F_N$ can be written as
$$F_N = \mathbb{Q}(e^{\lambda_1}, \ldots, e^{\lambda_{l_N}})(\lambda_1, \ldots, \lambda_{l_N}), \qquad (3.6.4)$$
where $\mathbb{Q}(e^{\lambda_1}, \ldots, e^{\lambda_{l_N}})$ is purely transcendental and $\lambda_1, \ldots, \lambda_{l_N}$ are algebraic over that field. Assuming that Schanuel's conjecture is correct, the field $F_N$ is immediately seen to be computable, see [2, pp. 178/179]. One should note at this point that in symbolic computation each algebraic $\lambda_j$ comes as a remainder modulo its minimal polynomial, thus the structure of the algebraic extension is completely given.

Remark 3.6.6
In [90] the issue of exact computations with complex numbers has been studied in a somewhat different context.
Approximations within a given tolerance using interval arithmetic are combined with symbolic descriptions in order to derive that a subfield of the complex numbers, called the elementary numbers, is computable if Schanuel's conjecture is true [90, Thm. 5.1]. One can easily convince oneself that the fields $F_N$ given above consist of elementary numbers.

For the subsequent discussion (up to Corollary 3.6.10) we will assume Schanuel's conjecture. Then part (b) of the list above remains to be studied. Since in symbolic computations zeros of polynomials in $F[s]$ are represented via their minimal polynomials, part (b) above asks for computing the irreducible
factors of univariate polynomials in an algorithmic way. This amounts to the question whether $F_N$ is a computable factorization field in the sense of

Definition 3.6.7
We call a field $F$ a computable factorization field if $F$ is computable and every $p \in F[s]$ can be factored into irreducible polynomials in $F[s]$ by means of an algorithm.
Using the representation (3.6.4) for the fields $F_N$ and Schanuel's conjecture, one can break up the question about the computable factorization property into two pieces. We start with

Proposition 3.6.8
Let $\mathbb{Q}(T) := \mathbb{Q}(t_1, \ldots, t_n) \subset \mathbb{C}$ be a field extension of transcendence degree $n$. Then $\mathbb{Q}(T)$ is a computable factorization field.
PROOF: This can be deduced from the fact that multivariate polynomials with rational coefficients can be factored in an algorithmic way into their irreducible factors, see [112, 60]. Precisely, for $p \in \mathbb{Q}(T)[s]$ there exists $d \in \mathbb{Q}[t_1, \ldots, t_n]$ such that $dp \in \mathbb{Q}[t_1, \ldots, t_n, s]$. A factorization $dp = \prod_{j=1}^k q_j$ into irreducible polynomials $q_j \in \mathbb{Q}[t_1, \ldots, t_n, s]$ leads to $p = d^{-1}\prod_{j=1}^k q_j$, where each factor is either a unit or irreducible in $\mathbb{Q}(T)[s]$. $\Box$

The main step for establishing the computable factorization property of $F_N$ is

Theorem 3.6.9
Let $F \subset \mathbb{C}$ be a computable factorization field. Furthermore, let $\theta \in \mathbb{C}$ be algebraic over $F$ with monic minimal polynomial $M \in F[t]$. Then $F(\theta)$ is a computable factorization field.
PROOF: The above result is standard if $F = \mathbb{Q}$, in which case it can be found, e.g., in [15, Sect. 3.6.2]. But the same proof applies equally well to our situation. We will present a brief sketch of the arguments by repeating the algorithm given in [15, Alg. 3.6.4]. Let $p \in F(\theta)[s]$ be a polynomial. We wish to decompose $p$ into its irreducible factors.
(1) $F(\theta)$ is a computable field, thus $F(\theta)[s]$ is a computable Euclidean domain, allowing us to compute the squarefree part $q := p/\gcd(p, p') \in F(\theta)[s]$, for which steps (2)-(4) yield a factorization into irreducible factors.
(2) Let $q = \sum_{i=0}^n q_i(\theta)s^i$ where $q_i(\theta) \in F[\theta] = F(\theta)$. Without loss of generality we may assume $\deg q_i < \deg M$. Then the representation of $q$ is unique and we can associate with $q$ the bivariate polynomial $Q := \sum_{i=0}^n q_i(t)s^i \in F[t,s]$. The norm of $q$ is defined to be
$$N(q) := \operatorname{Res}_t\big(M(t), Q(t, s)\big),$$
where $\operatorname{Res}_t$ denotes the resultant with respect to $t$. Then $N(q) \in F[s]$ and it can be shown [15, p. 119] that $N(q) = \prod_{j=1}^r Q(\theta_j, s)$, where the minimal polynomial $M$ of $\theta$ is given by $M(t) = \prod_{j=1}^r (t - \theta_j)$. The norm $N(q)$ satisfies exactly the same properties as for $F = \mathbb{Q}$ given in [15, p. 144], and the algorithm proceeds as follows. Try $k = 0, 1, 2, \ldots$ until $N_k(q) := \operatorname{Res}_t(M(t), Q(t, s - kt))$ is squarefree (which can be tested in $F[s]$). This can always be accomplished in finitely many steps.
(3) Factor $N_k(q) = \prod_{j=1}^h N_j$ into irreducible polynomials $N_j \in F[s]$.
(4) Calculate $q_j := \gcd(q(s), N_j(s + k\theta)) \in F(\theta)[s]$, which is feasible. Then $q = \prod_{j=1}^h q_j$ is a factorization of $q$ into irreducible factors.
(5) The multiplicities of the factors $q_j$ in $p$ can be determined by successive division of $p$ by $q_j$. $\Box$
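The norm construction of steps (2)-(4) can be traced on a toy input (chosen here for illustration, not taken from the text): $F = \mathbb{Q}$, $\theta = \sqrt{2}$ with $M(t) = t^2 - 2$, and the squarefree polynomial $q = (s - \sqrt{2})(s + 1)$, lifted to $Q(t,s) = (s - t)(s + 1)$:

```python
# Steps (2)-(4) of the norm-based factorization algorithm over Q(sqrt(2)),
# on an illustrative input (not from the text).
from sympy import symbols, resultant, factor_list, gcd, expand, degree

t, s = symbols('t s')
M = t**2 - 2                 # minimal polynomial of theta = sqrt(2)
Q = (s - t) * (s + 1)        # bivariate lift of q = (s - sqrt(2))(s + 1)

# k = 0: the norm N(q) = Res_t(M, Q) is not squarefree ...
N0 = expand(resultant(M, Q, t))
assert expand(N0 - (s**2 - 2) * (s + 1)**2) == 0
assert degree(gcd(N0, N0.diff(s)), s) > 0      # squarefree test fails

# ... so try k = 1, i.e. substitute s -> s - t before taking the resultant.
N1 = expand(resultant(M, Q.subs(s, s - t), t))
assert expand(N1 - (s**2 - 8) * (s**2 + 2*s - 1)) == 0
assert gcd(N1, N1.diff(s)) == 1                # squarefree, so k = 1 works

# Step (3): factor N1 over Q into two irreducible quadratics; step (4) would
# then recover the factors of q via gcds with N_j(s + k*theta) over Q(theta).
factors = [fac for fac, _ in factor_list(N1)[1]]
assert len(factors) == 2
```

Taking gcds of $q(s)$ with $N_1(s + \sqrt{2})$ and $N_2(s + \sqrt{2})$ over $\mathbb{Q}(\sqrt{2})$ then yields the irreducible factors $s - \sqrt{2}$ and $s + 1$, as the theorem asserts.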
Now we can summarize.

Corollary 3.6.10 (assuming Schanuel's conjecture)
Let $F = F_N$ be a field as in (3.6.1) and let $R$ be a matrix with entries in $F(s)[z] \cap \mathcal{H}_0$. Then
(a) $F$ is a computable factorization field.
(b) A left-equivalent triangular form of $R$ can be computed symbolically by means of an algorithm. In particular, for elements in $F(s)[z] \cap \mathcal{H}_0$ a greatest common divisor in $\mathcal{H}_0$, along with a Bezout identity, can be computed symbolically.
(c) A diagonal reduction of $R$ via left and right equivalence can be computed symbolically by means of an algorithm.

PROOF: (a) is a consequence of Proposition 3.6.8 and Theorem 3.6.9. (b) is an iterative process of computing Bezout equations and hence symbolically feasible by the above discussion. (c) One can see from the proofs in [64, Thms. 5.1, 5.2, 5.3] that the only additional feature necessary for diagonal reduction is the adequate factorization of certain entries of $R$. The proof of Theorem 3.1.6(c) reveals that the computation of such a factorization consists of a finite sequence of greatest common divisors to be computed (see (3.1.3)), together with the determination of the multiplicities $l_i$ in (3.1.4), which can be accomplished by successive division. Thus, all ingredients for the diagonal reduction, including the steps given in [64], can be performed symbolically in finitely many steps. $\Box$
We would like to present the following simple example with coefficients in $\mathbb{Q}(e)$. It might give an idea about the number of terms possibly arising in a Bezout identity in case the polynomials have high degrees or coefficients in large field
extensions of $\mathbb{Q}$, or about the number of terms possibly arising in the entries of a matrix that has been transformed to triangular form.

Example 3.6.11
Let $p = (z-1)(s-1)$, $q = (1-ez)^2 s^2 (sz-2) \in \mathbb{Q}(e)[s,z]$. By inspection, a greatest common divisor of $p$ and $q$ in $\mathcal{H}_0$ is found to be $\phi = s(s-1)$. Using a procedure similar to the one given in the proof of Proposition 3.6.4(b), and getting help from, e.g., MAPLE, one obtains the Bezout equation
$$s(s-1) = f\,(z-1)(s-1) + g\,(1-ez)^2 s^2 (sz-2),$$
where
$$f = \frac{f_1}{2(e^2-1)(e-1)^2(s-1)(s-2)}, \qquad g = \frac{g_1}{2(e^2-1)(e-1)^2 s^2 (s-2)}$$
are both in $\mathbb{Q}(e)(s)[z] \cap \mathcal{H}_0$ and
$$\begin{aligned}
f_1 = {} & (-3e^4 + e^2)z^3s^2 + (-2e^4 + 2e^2)z^2s^3 + (2e^4 - 2e^2)z^3s \\
& + (-2e^4 + 4e^3 + 2e^2 - 4e)zs^3 + (2e^4 + 6e^3 - 2e^2 - 2e)z^2s^2 \\
& + (-2e^4 + 4e^3 - 4e + 2)s^3 + (6e^4 - 4e^3 - 9e^2 + 4e + 1)zs^2 \\
& + (6e^4 - 4e^3 - 2e^2 + 4e)z^2s + (6e^4 - 12e^3 - 4e^2 + 12e - 2)s^2 \\
& + (-4e^4 + 4e^2)z^2 + (-4e^4 - 12e^3 + 6e^2 + 4e - 2)zs \\
& + (-4e^4 + 8e^3 + 10e^2 - 8e - 2)s + (8e^3 - 8e)z - 4e^2 + 4,
\end{aligned}$$
$$g_1 = (3e^2 - 1)sz + (2e^2 - 2)s^2 + (3 - 5e^2)s + (2 - 2e^2)z + 2e^2 - 2.$$
Observe that $\deg_z f = 3 = \deg_z q$ and $\deg_z g = 1 = \deg_z p$. These degrees can be shown to be the minimum possible for the coefficients of any Bezout equation for $p$ and $q$.

We wish to close the discussion on computability with the following result, showing that triangular forms for matrices with rational coefficients can even be obtained over a field $F_1$, see (3.6.1). Notice that for such a field the Theorem of Lindemann-Weierstrass implies that the transcendence degree is $l_1$, so that computability is guaranteed without making use of Schanuel's conjecture.

Proposition 3.6.12
(a) Let $Q \in (\mathbb{Q}(s)[z] \cap \mathcal{H}_0)^n$. Then there exists $T \in (\mathbb{Q}(s)[z] \cap \mathcal{H}_0)^{(n-1)\times n}$ such that $TQ = 0$ and $T$ is right invertible over $\mathcal{H}_0$.
(b) For every $Q \in (\mathbb{Q}(s)[z] \cap \mathcal{H}_0)^{n\times m}$ there exists an extension $F_1$ as in (3.6.1) and a matrix $V \in Gl_n(F_1(s)[z] \cap \mathcal{H}_0)$ such that $VQ$ is upper triangular.

PROOF: (a) First of all, there exists some $T \in \mathbb{Q}[s,z]$ …
(where this issue will be discussed in more detail), we may assume the full-size minors of $T$ to be coprime in $\mathbb{Q}[s,z]$. Then their greatest common divisor in $\mathcal{H}_0$ is of the form $s^m$ for some $m \in \mathbb{N}_0$, see Proposition 3.6.4. If $m = 0$, the matrix $T$ is right invertible over $\mathcal{H}_0$ by virtue of Corollary 3.2.5(f) and we are done. Assume $m > 0$. We seek to factor $T = A\hat{T}$ for some $A \in \mathbb{Q}[s]^{(n-1)\times(n-1)}$ with $\det A = s^m$ and $\hat{T} \in (\mathbb{Q}(s)[z] \cap \mathcal{H}_0)^{(n-1)\times n}$. Then $\hat{T}$ is right invertible over $\mathcal{H}_0$ and satisfies $\hat{T}Q = 0$. The factorization can be accomplished by an iterative procedure as follows. Assume for the general step that we have already a factorization $T = A_1 T_1$ where $A_1 \in \mathbb{Q}[s]^{(n-1)\times(n-1)}$ … (recall Proposition 3.6.4(a)). Since the greatest common divisor of the full-size minors of $T_1$ is a power of $s$, we merely have to be concerned with possible rank deficiencies of $T_1(0)$ in order to achieve right invertibility, see Corollary 3.2.5(c). We have $T_1(0) = \ldots\, T(0,1)/l!$, where $T \in \mathbb{Q}[s,z]$ … and $T_2 := \operatorname{diag}_{(n-1)\times(n-1)}(s^{-1}, 1, \ldots, 1)\, V T_1 \in (\mathbb{Q}(s)[z] \cap \mathcal{H}_0)^{(n-1)\times n}$, with which we can proceed. After $m$ steps the process ends with a factorization $T = A\hat{T}$ satisfying (3.6.5); thus the matrix $\hat{T} \in (\mathbb{Q}(s)[z] \cap \mathcal{H}_0)^{(n-1)\times n}$ is right invertible and yields $\hat{T}Q = 0$.
(b) Let $Q_1 = (q_1, \ldots, q_n)^T$ be the first column of $Q$. From (a) we have $TQ_1 = 0$ for some $T \in (\mathbb{Q}(s)[z] \cap \mathcal{H}_0)^{(n-1)\times n}$ …
$$\begin{bmatrix} * \\ T \end{bmatrix} Q = \begin{bmatrix} * & * \\ 0 & Q' \end{bmatrix}, \qquad \text{where } Q' \in (\mathbb{Q}(s)[z] \cap \mathcal{H}_0)^{(n-1)\times(m-1)}.$$
We can proceed by induction. $\Box$
At the end of this section we want to consider a special case of the Bezout identity in which the computational difficulties do not occur. In fact, a particularly nice situation arises if the given polynomials $p_1, \ldots, p_n \in F[s,z]$ (where $F \subseteq \mathbb{C}$) … is empty. As we will show below, for $n \ge 3$ this situation is generic in the sense that only a set of measure zero in the parameter space for the polynomials $p_1, \ldots, p_n$ leads to cases where no polynomial Bezout identity exists. For $n = 2$ just the opposite is the case: the set of pairs of polynomials with nonempty common variety forms a set of measure one. To make these ideas precise, we introduce the (finite-dimensional) parameter space of all polynomials with total degree bounded by some prescribed number.
Definition 3.6.13
Let $F$ be any subfield of $\mathbb{C}$. For $m \in \mathbb{N}$ define
$$T_m := \{ p \in F[s,z] \mid \operatorname{tdeg} p \le m \}$$
to be the set of all polynomials $p$ with total degree $\operatorname{tdeg} p$ at most $m$. Via $\sum_{i,j} p_{ij} s^i z^j \mapsto (p_{ij})_{i,j} =: \operatorname{coeff}(p)$, the coefficients taken in some fixed order, we identify $T_m$ with the parameter space $F^L$, where $L = (m+1)(m+2)/2$. Moreover, for $n \in \mathbb{N}$ let $Z_n \subseteq T_m^n$ be the set of all lists of polynomials of total degree at most $m$ whose greatest common divisor in $F[s,z]$ is a unit and satisfies a Bezout identity within $F[s,z]$. It should be quite intuitive that two affine plane curves defined by $p_1$ and $p_2$ intersect generically in …
Theorem 3.6.14
(a) Let $n = 2$. Then $Z_2$ is contained in a proper Zariski-closed subset of $F^{2L}$.
(b) If $n \ge 3$, the set $Z_n$ contains a Zariski-open subset of $F^{nL}$.

PROOF: For $p \in F[s,z]$ define $\tilde{p} \in F[s,z,w]$ to be the homogenization of $p$. (a) We will make use of the Theorem of Bezout for projective plane curves [35, p. 112]. If two nonconstant polynomials do not intersect in …
$$\{(p_1, p_2) \in Z_2 \mid p_1, p_2 \text{ not constant}\} \subseteq \{(p_1, p_2) \in T_m^2 \mid V(\tilde{p}_1, \tilde{p}_2, \tilde{q}) \neq \{0\} \text{ in } \mathbb{C}^3\} =: A.$$
The set $A$ describes an algebraic variety defined by the resultant $R \in \mathbb{Z}[X_1, \ldots, X_{2L+3}]$ of the polynomials $\tilde{p}_1, \tilde{p}_2$, and $\tilde{q}$, see [19, Ch. 3, Thm. 2.3]. Since $\tilde{q} = w$ is fixed, the resultant can be regarded as a polynomial $P \in \mathbb{Z}[X_1, \ldots, X_{2L}]$ in the coefficients of $p_1$ and $p_2$, and thus
$$A = V(P) := \{(\operatorname{coeff}(p_1), \operatorname{coeff}(p_2)) \in F^{2L} \mid P(\operatorname{coeff}(p_1), \operatorname{coeff}(p_2)) = 0\} \subseteq F^{2L}.$$
The variety $V(P)$ is proper, because the complement of $A$ is certainly not empty. Since the neglected part of $Z_2$, where at least one polynomial $p_i$ is constant, forms an algebraic variety itself, assertion (a) is proved.
(b) In the case $n \ge 3$ we may argue as follows. If $(p_1, \ldots, p_n) \notin Z_n$, then $(p_1, p_2, p_3) \notin Z_3$ and $V(p_1, p_2, p_3) \neq \emptyset$ in …
… For
$$p_1 = (s-3)z + (s+1)z^2 + 2, \qquad p_2 = (z-1)(z-2) \in \mathbb{Q}[s,z]$$
one has $\gcd_{\mathcal{H}_0}(p_1, p_2) = s \in (p_1, p_2)_{\mathbb{Q}[s,z]}$, as is easily verified using MAPLE. Changing $p_2$ into $\hat{p}_2 = (z-1)(z-2)^2$, however, one obtains $\gcd_{\mathcal{H}_0}(p_1, \hat{p}_2) = s \notin (p_1, \hat{p}_2)_{\mathbb{Q}[s,z]}$. Note that in both cases the algebraic variety is of the form $\{0\} \times \{1, 2\}$.
The first coordinates of its points are exactly the zeros of the associated exponential polynomials $p_1, p_2, \hat{p}_2$. This is certainly a necessary condition for the existence of a Bezout equation in $\mathbb{Q}[s,z]$, but, as just illustrated, not sufficient. We will not dwell upon these considerations but close with the remark that generically a pair $(p_1, p_2) \in T_m^2$ is coprime in $\mathcal{H}_0$, i.e., $V(p_1, p_2) = \emptyset$. This should be intuitively clear, and can formally be established by appropriately parametrizing the set of noncoprime pairs. Together with part (a) of the theorem above, this implies that generically the Bezout equation of two polynomials cannot be solved in the polynomial ring $F[s,z]$.
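The variety in the closing example can be checked mechanically: eliminating $z$ via the resultant leaves a polynomial in $s$ whose zeros are the first coordinates of the common zeros. A short SymPy sketch (independent verification, not part of the original text):

```python
# Common variety of p1 = (s-3)z + (s+1)z^2 + 2 and p2 = (z-1)(z-2)
# from the closing example, computed by elimination.
from sympy import symbols, resultant, solve, expand

s, z = symbols('s z')
p1 = (s - 3)*z + (s + 1)*z**2 + 2
p2 = (z - 1)*(z - 2)

# Eliminating z: the resultant vanishes exactly at the s-coordinates of
# common zeros; here it is 12*s**2, so the only such coordinate is s = 0.
r = expand(resultant(p1, p2, z))
assert r == 12*s**2

# The variety is {0} x {1, 2}, as stated in the text.
assert set(solve([p1, p2], [s, z])) == {(0, 1), (0, 2)}
```

The double zero of the resultant at $s = 0$ reflects the two points of the variety lying over $s = 0$; it is the passage to $\hat{p}_2 = (z-1)(z-2)^2$, which raises an intersection multiplicity, that destroys the polynomial Bezout identity while leaving the variety unchanged.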
4 Behaviors of Delay-Differential Systems
We now resume the investigation of delay-differential equations in the framework of Chapter 2. Thanks to the Bezout property of $\mathcal{H}$ it is possible to turn directly to systems of DDEs. As indicated by the title of this chapter, we now start the system-theoretic study in terms of the so-called behavioral approach. Let us briefly introduce the main ideas of this part of systems theory. In the behavioral framework, a system is specified by the set of all trajectories it declares possible, called the behavior. If the laws governing the system are known, the behavior is simply the set of all trajectories compatible with these laws. This point of view was introduced into systems and control theory by Willems in the eighties, see e.g. [118]. The basic idea of a system as described above is completely different from the "classical" notion. The latter regards a control system as a device transferring input signals into output signals; this results (in most cases) in the concept of a transfer function. Such a system description must also comprise information about the initial conditions of the system, hence the circumstances under which a certain input is transferred into a certain output. In the behavioral theory, a system is "simply" the collection of all feasible input/output pairs, regardless of the specific circumstances leading to any of these pairs. Furthermore, the behavioral viewpoint goes even beyond the notion of inputs and outputs itself. As was pointed out by Willems by means of some standard examples of control theory, there are certain situations in which it might be misleading to distinguish a priori between inputs and outputs. This applies in particular when systems sharing the same external variables are interconnected. In general it depends on the structure of the interconnection which of the variables will act as inputs for one of the components and which will act as outputs.
With the set of all trajectories being the central concept of a system, behavioral theory begins, of course, at this very stage. System properties are defined in terms of the trajectories. This leads immediately to the following tasks. Firstly, one wants to understand, and hopefully characterize, these properties in terms of the chosen representation, say the set of describing equations. This goal applies, for instance, to the notion of controllability, or to the feasibility of certain feedback interconnections, as well as to any cause/effect structures, which, if they exist, lead in a second step to the notion of input/output systems. Consequently, a transfer function, if it exists, arises from certain properties of, and
4 Behaviors of Delay-Differential Systems
relations between, the components of the (vector-valued) trajectories in the behavior. Secondly, a variety of system descriptions might be possible and one might want to switch from one to another. Hence one has to clarify the relationship between the various descriptions. At this point we would like to mention that the idea of describing a control system "simply" as the set of all its trajectories had been around in systems theory before Willems' work. In the book [7, p. 51] a variant of this set is considered, called the input/output relation of the system, even though no specific distinguishing properties are associated with the various components (named inputs and outputs) of the trajectories. However, we think Willems' approach is more convincing because of its consistency in pursuing the idea to explain every notion (say, the properness of a transfer function) in terms of trajectories. Moreover, the behavioral approach has the advantage that by avoiding any prespecified input/output structure the fundamental notions of systems theory (like controllability or composition of systems) often come out in a much simpler, therefore much more transparent, form.* In this chapter we will develop a theory for studying systems described by delay-differential equations from the behavioral point of view. Hence we assume that the laws governing the system have already been determined and were found to be DDEs (at least in the modeled situation). The following definition of a behavior will turn out to be sufficiently rich for our purposes. Recall Definition 2.9 for the operators $r$ on $\mathcal{L} = C^\infty(\mathbb{R}, \mathbb{C})$ where $r \in \mathcal{H}$.
Definition 4.1
Fix $q \in \mathbb{N}$. A set $\mathcal{B} \subseteq \mathcal{L}^q$ is called a behavior (or simply a system) if it is the solution space of a system of DDEs, that is, if there exists a matrix $R = (r_{ij}) \in \mathcal{H}^{p \times q}$ such that
$$\mathcal{B} = \Big\{ (w_1, \ldots, w_q)^{\mathsf T} \in \mathcal{L}^q \;\Big|\; \sum_{j=1}^{q} r_{ij} w_j = 0, \; i = 1, \ldots, p \Big\}.$$
The matrix $R$ is said to be a kernel-representation of $\mathcal{B}$. The coordinates $w_1, \ldots, w_q$ of the trajectories in $\mathcal{B}$ are called the external (or manifest) variables of the system. In the sequel we will use the names behavior and system interchangeably. Notice that the behaviors just defined are in general described by an implicit system of DDEs.
* For the sake of completeness we remark that the term 'behavior' was also used in the seventies by Eilenberg in the context of finite automata and machines (dynamical systems over finite structures). It describes exactly the same object, that is, the set of all trajectories (called successful paths) of an automaton, see [27, p. 12].
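To make Definition 4.1 concrete in the simplest scalar case $p = q = 1$: for the equation $\dot w(t) + w(t-1) = 0$, i.e. the operator $p(s, z) = s + z$ with unit delay, an exponential $e^{\lambda t}$ belongs to the behavior precisely when $p^*(\lambda) = \lambda + e^{-\lambda} = 0$. The following sympy sketch (the equation and the function name are illustrative choices, not taken from the text) verifies this symbolically:

```python
import sympy as sp

t, lam = sp.symbols("t lam")

def apply_op(w):
    """Apply the delay-differential operator given by p(s, z) = s + z
    with unit delay, i.e. w -> w'(t) + w(t - 1)."""
    return sp.diff(w, t) + w.subs(t, t - 1)

w = sp.exp(lam * t)
# the operator acts on e^{lam t} as multiplication by p*(lam)
residual = sp.simplify(apply_op(w) / w)
print(residual)   # the characteristic function lam + exp(-lam)
```

Applying the operator to $e^{\lambda t}$ factors as $(\lambda + e^{-\lambda}) e^{\lambda t}$, so kernel membership of an exponential is governed entirely by the characteristic function, in line with the scalar theory of Chapter 2.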
At first sight, the definition above appears to be rather restrictive, for it requires that the behavior be the kernel of a delay-differential operator. It seemingly excludes systems which are specified with the help of some auxiliary variables, like, for instance, images of (matrices of) delay-differential operators. In that situation only certain variables appearing in the describing equations are regarded as the manifest variables and only their trajectories make up the behavior. These are the variables whose trajectories the model aims to describe. The other variables have been introduced for, or have resulted from, modeling. All such auxiliary variables are called latent variables (see also [87, Def. 1.3.4] for a definition of a dynamical system with latent variables in full generality). For our purposes it suffices to have in mind that images or preimages of behaviors under delay-differential operators are examples of latent variable descriptions. We will see in Section 4.4 that they are behaviors in the sense of Definition 4.1, which therefore is not as restrictive as it appears. Notice that the description of the behavior $\mathcal{B}$ in Definition 4.1 does not only consist of the DDEs causing the relations between the external variables but also includes the smoothness condition $w \in \mathcal{L}^q$. In Section 3.5 we saw that under certain circumstances other function spaces also qualify as solution spaces for DDEs. We will briefly resume this idea in the context of input/output structures and transfer functions in Section 4.2. The chapter is organized as follows. The foundations of our approach are laid in the first section. Therein, it is shown that the family of all behaviors in $\mathcal{L}^q$ constitutes a lattice which is anti-isomorphic to the lattice of all finitely generated submodules of $\mathcal{H}^q$. The anti-isomorphism is given by passing from behaviors to their annihilating modules.
Among other things, we characterize algebraically when two kernel-representations share the same behavior. This is of fundamental importance for our goal of describing system properties in terms of (the highly non-unique) kernel-representations. Due to the fact that $\mathcal{H}$ is an elementary divisor domain, the results of this section are reminiscent of those for systems of ODEs (where the ring of operators is a Euclidean domain). However, we decided to emphasize the lattice structure and, consequently, the close connection between certain constructions for systems on the one side and division properties for representing matrices on the other. This results in a slightly different exposition than in, say, [87]. Yet, the first section provides a machinery that allows one to proceed in a fairly standard way (that is, as for systems of ODEs) when discussing the basic concepts of behavioral theory for systems of DDEs. This will be initiated in Section 4.2. Here we discuss those properties of the behavior which lead to a distinction of the external variables into inputs and outputs, including possible nonanticipating cause/effect structures. The characterizations, given in terms of kernel-representations, generalize those for systems of ODEs in a straightforward way. For input/output systems the (formal) transfer function is introduced in the usual way and investigated with respect to nonanticipation. Autonomous systems arise as an extreme case of systems without any inputs, hence without any possibility to control. In Section 4.3
we classify systems according to their input/output structure. More precisely, we investigate the equivalence relation induced by the transfer function. It turns out that the equivalence classes constitute sublattices of the lattice of all systems and contain a least element. This particular element is shown to be the unique controllable system in its equivalence class. The notion of controllability refers, of course, to behavioral controllability, that is, the ability to drive any system trajectory into any other in finite time. Various (algebraic) characterizations of controllability are derived. Section 4.4 is devoted to the interconnection of systems. Adding some regularity condition, this can be regarded as the behavioral version of the connection of a to-be-controlled system with a controller. The interconnection of systems usually leads to latent variables in the model for the overall system, which one might want to eliminate in order to derive a kernel-representation. We begin with this step by presenting an elimination theorem. Thereafter we turn to the interconnection of systems. Since the interconnection of two systems forms a subsystem of either of its components, it is natural to ask which subsystems of a given system can be achieved as a (regular) interconnection, in other words by connecting a suitable controller. We present various characterizations, one of which is purely in terms of the trajectories; in fact, it can be seen as a generalization of controllability. At the end of the section we turn to a question which can be regarded as the dual of achievability of subsystems, namely direct sum decompositions of behaviors. This problem might not directly be of system-theoretic significance, but from a mathematical point of view it arises quite naturally in this context. As we will show, direct sum decompositions are closely related to the skew-primeness of certain matrices involved.
In Section 4.5 we briefly address the issue of stability for autonomous systems, before we turn to the question of constructing autonomous interconnections with prescribed (say, stable) characteristic polynomial. As a particular case, the finite-spectrum assignment problem via feedback control for first-order systems is studied. We show how the problem can be formulated and solved within our algebraic framework. In the final Section 4.6 we slightly change our point of view and reconsider the non-finitely generated ideals in $\mathcal{H}$. It is investigated whether they are invariant under taking biduals with respect to the action of $\mathcal{H}$ on $\mathcal{L}$. Using the description of these ideals obtained in Chapter 3, a criterion for invariance in terms of the characteristic zeros is derived. In most parts of the chapter the operator ring $\mathcal{H} \subset \mathbb{R}(s)[z, z^{-1}]$, containing both forward and backward shift, is the natural choice for the algebraic description. Only when concerned with cause/effect structures is it more convenient to utilize the smaller ring $\mathcal{H}_0 \subset \mathbb{R}(s)[z]$ in order to avoid backward shifts.
4.1 The Lattice of Behaviors
In this section we analyze the structure of the set of all behaviors in $\mathcal{L}^q$. Associating with each behavior the space of all annihilating equations, we obtain a
one-one correspondence between behaviors in $\mathcal{L}^q$ on the one hand and finitely generated submodules of $\mathcal{H}^q$ on the other. Precisely, two matrices $R_1$ and $R_2$ with $q$ columns determine the same behavior in $\mathcal{L}^q$ if and only if they share the same row space in $\mathcal{H}^{1 \times q}$. But even more can be achieved. The results derived in Chapter 3 provide an easy way to see that this correspondence is actually an anti-isomorphism of lattices. In particular, sum and intersection of behaviors are again behaviors; kernel-representations are given by a least common left multiple and a greatest common right divisor of the given representations, respectively. This Galois correspondence, and particularly the description in terms of representing matrices, will be of fundamental importance for this chapter and the one to follow. A lot of situations arising later on can be subsumed in this correspondence. It is worthwhile remarking that these results about systems of DDEs can (and will) be deduced without further analysis of delay equations. Indeed, thanks to the Bezout property of $\mathcal{H}$, the basic analytical results about scalar DDEs, derived in Chapter 2, are sufficient for the matrix case as well. We will also discuss the question whether or not a given behavior admits a polynomial kernel-representation. This information will be useful in the context of first-order systems to be dealt with in the next chapter. The section will be closed with a short presentation of related results for systems with noncommensurate delays. Let us start with the correspondence between behaviors and submodules of $\mathcal{H}^q$. Each matrix $R = (r_{ij}) \in \mathcal{H}^{p \times q}$ gives rise to two kinds of maps, namely
$$\mathcal{H}^q \longrightarrow \mathcal{H}^p, \quad h \longmapsto Rh$$
and
$$\mathcal{L}^q \longrightarrow \mathcal{L}^p, \quad \begin{pmatrix} w_1 \\ \vdots \\ w_q \end{pmatrix} \longmapsto \Big( \sum_{j=1}^{q} r_{ij} w_j \Big)_{i=1,\ldots,p},$$
where the operators $r_{ij}$ are defined as in Definition 2.9(2). We will denote both maps simply by $R$ and use the notation $\ker_{\mathcal{H}} R$, $\operatorname{im}_{\mathcal{H}} R$ (resp. $\ker_{\mathcal{L}} R$, $\operatorname{im}_{\mathcal{L}} R$) for the kernel and the image of the first (resp. second) map. It would certainly be more consistent with Definition 2.9, and probably less confusing, to give the second operator a symbol of its own. The disadvantage of that choice would be a somewhat cumbersome notation when dealing with block matrices. Furthermore, we believe that the meaning of $R$ is always clear from the context. Since $\mathcal{L}$ is an $\mathcal{H}$-module, we have
$$RS = R \circ S \quad \text{as maps on } \mathcal{H}^q \text{ and on } \mathcal{L}^q \tag{4.1.1}$$
for all matrices $R$ and $S$ over $\mathcal{H}$ of compatible sizes. As a consequence, each unimodular matrix $U \in \mathrm{Gl}_q(\mathcal{H})$ acts bijectively on both $\mathcal{H}^q$ and $\mathcal{L}^q$. The $\mathcal{H}$-module structure on $\mathcal{L}$ induces the $\mathcal{H}$-bilinear map
$$\mathcal{H}^q \times \mathcal{L}^q \longrightarrow \mathcal{L}, \quad (h, w) \longmapsto h^{\mathsf T} w,$$
which in turn gives rise to the spaces
$$\mathcal{M}^\perp = \{ w \in \mathcal{L}^q \mid h^{\mathsf T} w = 0 \text{ for all } h \in \mathcal{M} \} \quad \text{for } \mathcal{M} \subseteq \mathcal{H}^q,$$
$$\mathcal{B}^\perp = \{ h \in \mathcal{H}^q \mid h^{\mathsf T} w = 0 \text{ for all } w \in \mathcal{B} \} \quad \text{for } \mathcal{B} \subseteq \mathcal{L}^q. \tag{4.1.2}$$
Notice that $\mathcal{M}^\perp$ is the solution space of the (possibly infinitely many) equations induced by $\mathcal{M} \subseteq \mathcal{H}^q$, while $\mathcal{B}^\perp$ is the space of all annihilating equations of the functions in $\mathcal{B} \subseteq \mathcal{L}^q$. We call these spaces the duals of $\mathcal{M}$ and $\mathcal{B}$. Furthermore, $\mathcal{M}^{\perp\perp}$ and $\mathcal{B}^{\perp\perp}$ are said to be the biduals of $\mathcal{M}$ and $\mathcal{B}$, respectively. It is clear that $\mathcal{M}^\perp$ and $\mathcal{B}^\perp$ are $\mathcal{H}$-submodules and that $\mathcal{M} \subseteq \mathcal{M}^{\perp\perp}$, $\mathcal{B} \subseteq \mathcal{B}^{\perp\perp}$. Moreover, one easily derives the identities
$$(\mathcal{M}_1 + \mathcal{M}_2)^\perp = \mathcal{M}_1^\perp \cap \mathcal{M}_2^\perp, \qquad (\mathcal{B}_1 + \mathcal{B}_2)^\perp = \mathcal{B}_1^\perp \cap \mathcal{B}_2^\perp \tag{4.1.3}$$
for $\mathcal{H}$-submodules $\mathcal{M}_i \subseteq \mathcal{H}^q$ and $\mathcal{B}_i \subseteq \mathcal{L}^q$, $i = 1, 2$. With this notation, the behaviors introduced in Definition 4.1 appear as the duals
$$\ker_{\mathcal{L}} R = (\operatorname{im}_{\mathcal{H}} R^{\mathsf T})^\perp \quad \text{where } R \in \mathcal{H}^{p \times q}. \tag{4.1.4}$$
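The formal properties $\mathcal{M} \subseteq \mathcal{M}^{\perp\perp}$ and (4.1.3) are shared by any pairing-induced Galois connection. As a plain finite-dimensional analogue (over $\mathbb{Q}$ with the standard pairing, not over the ring $\mathcal{H}$; the helper names below are mine), they can be checked mechanically with sympy:

```python
import sympy as sp

def perp(gens):
    """Annihilator of span(gens) under the pairing (h, w) -> h^T w:
    computed as the nullspace of the matrix whose rows are the generators."""
    return sp.Matrix([list(v) for v in gens]).nullspace()

def same_span(u, v):
    """Do two lists of column vectors span the same subspace?"""
    U = sp.Matrix([list(x) for x in u])
    V = sp.Matrix([list(x) for x in v])
    return U.rank() == V.rank() == sp.Matrix.vstack(U, V).rank()

M = [sp.Matrix([1, 0, 1]), sp.Matrix([0, 1, 1])]   # a submodule (here: subspace) of Q^3
Mp = perp(M)                                       # its dual

# every generator of M annihilates every element of the dual
print(all(h.dot(w) == 0 for h in M for w in Mp))   # True

# over a field the bidual returns the subspace itself; for H-submodules
# of H^q one only has the inclusion M ⊆ M-perp-perp in general
print(same_span(M, perp(Mp)))                      # True
```

The last comment marks the point where the field case is simpler than the module case: over $\mathcal{H}$, biduality is exactly the question taken up again in Section 4.6.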
Remark 4.1.1
It is easy to verify the isomorphism
$$\ker_{\mathcal{L}} R \cong \operatorname{Hom}_{\mathcal{H}}(\mathcal{H}^q / \operatorname{im}_{\mathcal{H}} R^{\mathsf T}, \mathcal{L}),$$
associating with $w \in \ker_{\mathcal{L}} R$ the mapping $\bar h \mapsto h^{\mathsf T} w$, where $\bar h$ is the coset of $h \in \mathcal{H}^q$. Thus, behaviors are duals of finitely presentable modules with respect to the contravariant functor $\operatorname{Hom}_{\mathcal{H}}(-, \mathcal{L})$. This observation has been utilized by Oberst in his paper [84] on multidimensional systems. In that case the operator algebra is a polynomial ring in several variables, and the finitely presented module then stands
for the system variables, which are restricted by the equations (the matrix $R$) governing that system. The actual trajectories of the system, evolving in time, are not incorporated in this model. Let us return to finitely generated submodules of $\mathcal{H}^q$ and their duals as introduced above.
Definition 4.1.2
Fix $q \in \mathbb{N}$. Denote by $\mathbf{M}$ the set of all finitely generated submodules of $\mathcal{H}^q$, partially ordered by inclusion. Moreover, the set of all behaviors in $\mathcal{L}^q$, partially ordered by inclusion, is denoted by $\mathbf{B}$.
Observe that $\mathbf{B}$ is simply the set of duals of the modules in $\mathbf{M}$. Furthermore, the Bezout property of $\mathcal{H}$ implies that each finitely generated submodule of $\mathcal{H}^q$ is free, see also Remark 3.2.10. Thus, $\mathbf{M}$ consists in fact of all free submodules of $\mathcal{H}^q$. As a consequence, the matrix $R$ in (4.1.4) can be chosen with full row rank.
Proposition 4.1.3
$\mathbf{M}$ is a (non-complete) modular lattice. It is distributive if and only if $q = 1$.
PROOF: $\mathbf{M}$ is modular as a sublattice of the lattice of all submodules of $\mathcal{H}^q$. It is obvious that the sum of two finitely generated submodules is finitely generated again, while the closedness of $\mathbf{M}$ with respect to intersection is a consequence of the Bezout property of $\mathcal{H}$, see Theorem 3.2.8(b) and Remark 3.2.10. The non-completeness of $\mathbf{M}$ is immediate from the existence of non-finitely generated submodules, see Section 3.4. For $q > 1$, the lattice is not distributive. This is seen in exactly the same way as the non-distributivity of the lattice of vector spaces, see for instance [56, p. 463]. For $q = 1$, the distributive law follows from Proposition 3.1.2 along with the identity $\operatorname{lcm}_{\mathcal{H}}(a, \gcd_{\mathcal{H}}(b, c)) = \gcd_{\mathcal{H}}(\operatorname{lcm}_{\mathcal{H}}(a, b), \operatorname{lcm}_{\mathcal{H}}(a, c))$, which is true in every commutative Bezout domain. □
It is worth mentioning that even the lattice of all ideals in $\mathcal{H}$ is distributive;
this is shown for arbitrary commutative Bezout domains in [58, Thm. 1]. Via the anti-isomorphism to be derived next, the partially ordered set $\mathbf{B}$ will turn into a modular lattice, too. We need the following preparatory result characterizing surjectivity and injectivity of matrices of delay-differential operators.
Proposition 4.1.4
Let $R \in \mathcal{H}^{p \times q}$. Then
(a) $\operatorname{im}_{\mathcal{L}} R = \mathcal{L}^p$ if and only if $\operatorname{rk} R = p$.
(b) $\ker_{\mathcal{L}} R = \{0\}$ if and only if $\operatorname{rk} R^*(\lambda) = q$ for all $\lambda \in \mathbb{C}$; recall $R^*(\lambda) = R(\lambda, e^{-\lambda})$ from (3.2.2).
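Part (b) can be watched on a toy system without delay terms (the matrix is my own choice): for $R = \begin{bmatrix} s & -1 \\ 0 & s \end{bmatrix}$, i.e. $\dot w_1 = w_2$, $\dot w_2 = 0$, the characteristic matrix $R^*(\lambda)$ drops rank at $\lambda = 0$, and accordingly the kernel contains nonzero (polynomial) trajectories:

```python
import sympy as sp

s, lam, t = sp.symbols("s lam t")

# kernel-representation of  w1' - w2 = 0,  w2' = 0  (no delay terms,
# so the variable z does not occur in this toy example)
R = sp.Matrix([[s, -1], [0, s]])

Rstar = R.subs(s, lam)          # characteristic matrix R*(lam)
char_det = Rstar.det()
print(char_det)                 # lam**2: rank drops exactly at lam = 0

# hence ker R is nontrivial; e.g. w = (t, 1) solves both equations
w1, w2 = t, sp.Integer(1)
residual = [sp.diff(w1, t) - w2, sp.diff(w2, t)]
print(residual)                 # [0, 0]
```

If instead $\det R^*(\lambda)$ had no zeros at all, part (b) would force the kernel to be trivial.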
PROOF: Since unimodular matrices act bijectively on $\mathcal{L}^p$ resp. $\mathcal{L}^q$, we may assume that $R$ is in diagonal form. Then (a) follows from the scalar case given in Proposition 2.14. The only-if part of (b) is a consequence of Lemma 2.12(a), while the if-part follows from the left invertibility over $\mathcal{H}$, as derived in Corollary 3.2.5. □
The next theorem contains the main results of this section. Part (a) can be viewed as the cornerstone of the theory we are going to develop. The characterization of the inclusion of behaviors via right division of the corresponding matrices was, to some extent, the main reason for passing from polynomial to more general delay-differential operators in Chapter 2. Recall that the ring $\mathcal{H}$ was constructed in such a way that the inclusion $\ker_{\mathcal{L}} \phi \subseteq \ker_{\mathcal{L}} p$ for $\phi \in \mathbb{R}[s]$ and $p \in \mathbb{R}[s, z]$ is true if and only if $p\phi^{-1} \in \mathcal{H}$, see (2.10). Thanks to the algebraic structure of $\mathcal{H}$ this generalizes immediately to arbitrary matrices of delay-differential operators. This is possible even without much knowledge about the solutions of such operators, like for instance series expansions into exponential polynomials. Observe that, by virtue of Proposition 3.2.4, part (a) below could just as well be expressed as $\mathcal{B}_1 \subseteq \mathcal{B}_2 \Leftrightarrow FR_1 = R_2$ for some $F \in H(\mathbb{C})^{p_2 \times p_1}$. In this formulation, the implication "$\Rightarrow$" is a special case of [72, Thm. 3], where the result is stated in much more generality for distributions on $\mathbb{R}^n$ having compact support. We will come back to this at the end of the section when discussing the situation for systems with noncommensurate delays.
Theorem 4.1.5
For $i = 1, 2$ let $R_i \in \mathcal{H}^{p_i \times q}$ be two matrices. Put $\mathcal{B}_i := \ker_{\mathcal{L}} R_i \in \mathbf{B}$. Then
(a) $\mathcal{B}_1 \subseteq \mathcal{B}_2 \iff XR_1 = R_2$ for some $X \in \mathcal{H}^{p_2 \times p_1}$. If $\operatorname{rk} R_i = p_i$ for $i = 1, 2$, then
$$\mathcal{B}_1 = \mathcal{B}_2 \iff p_1 = p_2 \text{ and } R_1, R_2 \text{ are left equivalent.}$$
(b) $(\mathcal{B}_1)^\perp = \operatorname{im}_{\mathcal{H}} R_1^{\mathsf T}$.
(c) $\mathcal{B}_1 \cap \mathcal{B}_2 = \ker_{\mathcal{L}} \operatorname{gcrd}(R_1, R_2)$.
(d) Let $\operatorname{rk} R_i = p_i$ for $i = 1, 2$. Then $\mathcal{B}_1 + \mathcal{B}_2 = \ker_{\mathcal{L}} \operatorname{lclm}(R_1, R_2)$.
As a consequence, $\mathbf{B}$ is a sublattice of the lattice of all submodules of $\mathcal{L}^q$. A slightly modified version of this result appeared first in [42, Prop. 4.4].
PROOF: (a) "$\Leftarrow$" follows from (4.1.1). For "$\Rightarrow$" we make use of a diagonal form for $R_1$. Let $r = \operatorname{rk} R_1$ and
$$U R_1 V = \begin{bmatrix} \Delta & 0 \\ 0 & 0 \end{bmatrix}, \quad \text{where } U, V \text{ are unimodular and } \Delta = \operatorname{diag}_{r \times r}(d_1, \ldots, d_r),$$
see Theorem 3.2.1(b). Put $R_2 V =: [P, Q]$, where the matrix is partitioned in such a way that $P = (p_{ij})$ has $r$ columns. Then $\ker_{\mathcal{L}}(U R_1 V) = \ker_{\mathcal{L}} [\Delta, 0] \subseteq \ker_{\mathcal{L}} [P, Q]$. Thus $Q = 0$ and $\ker_{\mathcal{L}} d_j \subseteq \ker_{\mathcal{L}} p_{ij}$ for all $i = 1, \ldots, p_2$ and
$j = 1, \ldots, r$. Using Lemma 2.12, we obtain $d_j \mid_{H(\mathbb{C})} p_{ij}$ for all $i, j$. This implies $FU^*R_1 = R_2$ for some $F \in H(\mathbb{C})^{p_2 \times p_1}$, and the result follows with Proposition 3.2.4. The consequence stated in (a) is standard.
(b) For every $a \in \mathcal{H}^q$ one has $a \in (\ker_{\mathcal{L}} R_1)^\perp$ if and only if $\ker_{\mathcal{L}} R_1 \subseteq \ker_{\mathcal{L}} a^{\mathsf T}$. Hence the result is a consequence of (a).
(c) follows from (a) along with a representation $\operatorname{gcrd}(R_1, R_2) = MR_1 + NR_2$ as derived in Theorem 3.2.8(a).
(d) In order to obtain an $\operatorname{lclm}(R_1, R_2)$, we transform the matrix $[R_1^{\mathsf T}, R_2^{\mathsf T}]^{\mathsf T}$ via left equivalence into a full row rank part. Precisely, let $l = \operatorname{rk} [R_1^{\mathsf T}, R_2^{\mathsf T}]^{\mathsf T}$ and let
$$U = \begin{bmatrix} U_1 & U_2 \\ U_3 & U_4 \end{bmatrix} \in \mathrm{Gl}_{p_1 + p_2}(\mathcal{H})$$
be such that
$$\begin{bmatrix} U_1 & U_2 \\ U_3 & U_4 \end{bmatrix} \begin{bmatrix} R_1 \\ R_2 \end{bmatrix} = \begin{bmatrix} D \\ 0 \end{bmatrix} \quad \text{for some } D \in \mathcal{H}^{l \times q} \text{ of rank } l.$$
Then $D = \operatorname{gcrd}(R_1, R_2)$ and $U_4R_2 = \operatorname{lclm}(R_1, R_2)$ by Theorem 3.2.8. By Proposition 4.1.4(a), the operator $D$ is surjective, and therefore one gets for $w \in \mathcal{L}^q$ that $w \in \ker_{\mathcal{L}} R_1 + \ker_{\mathcal{L}} R_2$ if and only if $w \in \ker_{\mathcal{L}} U_4R_2$. The assertion that $\mathbf{B}$ is a lattice follows now from (c) and (d). □
Remark 4.1.6
(i) The sole reason for the rank condition in part (d) of the theorem is that the least common left multiple is defined only for full row rank matrices, see Theorem 3.2.8(b). The proof shows that in any case the identity $\ker_{\mathcal{L}} R_1 + \ker_{\mathcal{L}} R_2 = \ker_{\mathcal{L}} U_4R_2$ is true.
(ii) The theorem above is true without any modifications if one replaces $\mathcal{H}$ by $\mathbb{R}[s]$, representing ordinary differential operators. This is, of course, a well-known result, see, e.g., [7, pp. 91] for part (a). But one can also recover this special case from Theorem 4.1.5, since it is easy to see that for $R_i \in \mathbb{R}[s]^{p_i \times q}$ the matrix $X$ in (a), if it exists, can be chosen with entries in $\mathbb{R}[s]$, too. The same is true for the gcrd and lclm.
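In the ODE setting of (ii), the content of Theorem 4.1.5(c), (d) in the scalar case is the classical pair of identities $\ker_{\mathcal{L}} \gcd(a, b) = \ker_{\mathcal{L}} a \cap \ker_{\mathcal{L}} b$ and $\ker_{\mathcal{L}} \operatorname{lcm}(a, b) = \ker_{\mathcal{L}} a + \ker_{\mathcal{L}} b$ for $a, b \in \mathbb{R}[s]$. With distinct characteristic roots this is visible from the roots alone; a sympy sketch (the polynomials are my own choice):

```python
import sympy as sp

s = sp.symbols("s")

a = s * (s - 1)     # ker a = span{1, e^t}
b = s * (s + 2)     # ker b = span{1, e^{-2t}}

g = sp.gcd(a, b)    # greatest common divisor: the scalar "gcrd"
l = sp.lcm(a, b)    # least common multiple:   the scalar "lclm"

# roots encode the kernels: common roots give the intersection,
# the union of all roots gives the sum of the two solution spaces
print(sp.solve(g, s))           # [0]        -> ker g = span{1}
print(sorted(sp.solve(l, s)))   # [-2, 0, 1] -> ker l = span{1, e^t, e^{-2t}}
```

The matrix case of Theorem 4.1.5 replaces gcd and lcm by gcrd and lclm over $\mathcal{H}$, where order of factors matters.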
We would like to illustrate part (a) by some examples.
Example 4.1.7
(a) A first example was derived by elementary considerations in Example 2.16 of Chapter 2. Therein two matrices in $\mathbb{R}[s, z]^{3 \times 3}$, having the same kernel in $\mathcal{L}^3$, were presented. The left equivalence over $\mathcal{H}_0$ was directly verified.
(b) Let $R = (r_{ij}) \in \mathbb{R}[s, z]^{2 \times 2}$ where
$$r_{11} = (z - 2)s^3 + (z - 1)s^2, \qquad r_{12} = (z^3 - z^2 + 1)s^5 + (z^3 - 2z^2)s^4 + (z - 1)s^2,$$
$$r_{21} = (z^2 - 3z + 2)s^2 + (z^2 - 2z + 2)s + 1,$$
$$r_{22} = (z^4 - 3z^3 + 2z^2)s^3 + (z^4 - 2z^3 + 2z^2 + z - 1)s^2 + (2z - 1)s - 1.$$
Then $\det R = -s^4$, hence $(\operatorname{adj} R)R = -s^4 I$ and $\ker_{\mathcal{L}} R \subseteq \ker_{\mathcal{L}}(s^4 I)$. Thus the kernel $\ker_{\mathcal{L}} R$ is finite-dimensional and consists of polynomials of degree at most 3. We could calculate a basis by substituting the general form into $Rw = 0$. A little less work is necessary by using the following argument. The entries $r_{11}$ and $r_{21}$ of the first column of $R$ are easily seen to be coprime in $\mathbb{Q}[s, z]$ and therefore, using Proposition 3.6.4(b), they are also coprime in $\mathcal{H}_0$. Thus the matrix $R$ is left equivalent to some matrix
$$A := \begin{bmatrix} 1 & p \\ 0 & s^4 \end{bmatrix} \in \mathcal{H}_0^{2 \times 2}.$$
Using Proposition 3.1.2(g), we can even arrange that $p \in \mathbb{R}[s]$ and has degree less than 4, say $p = p_0 + p_1 s + p_2 s^2 + p_3 s^3$. Hence the behavior $\ker_{\mathcal{L}} R = \ker_{\mathcal{L}} A$ is given by the space
$$\operatorname{span}_{\mathbb{C}} \left\{ \begin{pmatrix} -p_0 \\ 1 \end{pmatrix}, \begin{pmatrix} -p_0 t - p_1 \\ t \end{pmatrix}, \begin{pmatrix} -p_0 t^2 - 2p_1 t - 2p_2 \\ t^2 \end{pmatrix}, \begin{pmatrix} -p_0 t^3 - 3p_1 t^2 - 6p_2 t - 6p_3 \\ t^3 \end{pmatrix} \right\}.$$
Checking these functions successively with the operator $R$, one gets $p = -1 + 2s - 3s^2 + 3s^3$. It can also be verified directly that with the given polynomial $p$ the matrix $RA^{-1}$ is in $\mathrm{Gl}_2(\mathcal{H}_0)$, but not in $\mathbb{R}[s, z]^{2 \times 2}$. This leads to the consequence that no matrix $U \in \mathrm{Gl}_2(\mathbb{R}[s, z])$ exists such that $UR =: B \in \mathbb{R}[s]^{2 \times 2}$ ($B$ would satisfy $\ker_{\mathcal{L}} B = \ker_{\mathcal{L}} A$ and Remark 4.1.6(ii) applies). Hence $R$ is left equivalent to some pure differential operator, where the transformation matrix has entries in $\mathcal{H}_0$, but not in $\mathbb{R}[s, z]$. In both examples we were guided by the argument that a matrix $R \in \mathcal{H}_0^{n \times n}$ with $\det R \in \mathbb{R}[s]$ has a finite-dimensional kernel which, consequently, has to be the kernel of an ordinary differential operator (see also [44, p. 227], where an associated differential operator is calculated explicitly from the prescribed solution
space). Together with Theorem 4.1.5(a) this implies that $R$ is left equivalent over $\mathcal{H}$ to a matrix with entries in $\mathbb{R}[s]$. This can (and will) be established with direct matrix calculations in Lemma 4.1.10 below. The results of Theorem 4.1.5 can be summarized in a Galois correspondence between behaviors and finitely generated submodules.
Corollary 4.1.8
The partially ordered sets $\mathbf{B}$ and $\mathbf{M}$ are anti-isomorphic modular lattices; the anti-isomorphism is given by taking duals, that is, by the maps
$$f : \mathbf{M} \longrightarrow \mathbf{B}, \quad \mathcal{M} \longmapsto \mathcal{M}^\perp, \qquad g : \mathbf{B} \longrightarrow \mathbf{M}, \quad \mathcal{B} \longmapsto \mathcal{B}^\perp,$$
which are inverses of each other.
PROOF: By virtue of Theorem 4.1.5(b), we have $\mathcal{B}^\perp \in \mathbf{M}$ for all $\mathcal{B} \in \mathbf{B}$, so $f$ and $g$ are well-defined maps and even inverses of each other, see also (4.1.4). We have to show that they are anti-homomorphisms. In light of (4.1.3), it suffices to show that they map intersections onto sums. This can be derived from Theorems 3.2.8 and 4.1.5 as follows. For $i = 1, 2$ let $R_i \in \mathcal{H}^{p_i \times q}$ be two matrices with rank $p_i$. Then
$$f(\operatorname{im}_{\mathcal{H}} R_1^{\mathsf T} \cap \operatorname{im}_{\mathcal{H}} R_2^{\mathsf T}) = \big(\operatorname{im}_{\mathcal{H}} \operatorname{lclm}(R_1, R_2)^{\mathsf T}\big)^\perp = \ker_{\mathcal{L}} \operatorname{lclm}(R_1, R_2) = \ker_{\mathcal{L}} R_1 + \ker_{\mathcal{L}} R_2,$$
and likewise, using the gcrd, one obtains $g(\ker_{\mathcal{L}} R_1 \cap \ker_{\mathcal{L}} R_2) = \operatorname{im}_{\mathcal{H}} R_1^{\mathsf T} + \operatorname{im}_{\mathcal{H}} R_2^{\mathsf T}$. Now there only remains to observe that the anti-isomorphic image of a modular lattice is a modular lattice itself. But this is a standard exercise in lattice theory. □
Remark 4.1.9
The identity $(\operatorname{im}_{\mathcal{H}} R^{\mathsf T})^\perp = \ker_{\mathcal{L}} R$ is also valid if we interchange the roles of $\mathcal{H}$ and $\mathcal{L}$. This is not part of the preceding corollary but can be seen directly. Indeed, using a diagonal form for $R \in \mathcal{H}^{p \times q}$, we see that the module $\ker_{\mathcal{H}} R \subseteq \mathcal{H}^q$ is finitely generated and that $\operatorname{im}_{\mathcal{L}} R^{\mathsf T} \subseteq \mathcal{L}^q$ is a behavior. Moreover, both are related by $(\operatorname{im}_{\mathcal{L}} R^{\mathsf T})^\perp = \ker_{\mathcal{H}} R$, which is the identity above with $\mathcal{L}$ and $\mathcal{H}$ interchanged. As a consequence, $\mathcal{L}$ satisfies the fundamental principle in the following sense: for matrices $R \in \mathcal{H}^{p \times q}$ and $S \in \mathcal{H}^{q \times l}$ one has the equivalence
$$\mathcal{H}^l \xrightarrow{\;S\;} \mathcal{H}^q \xrightarrow{\;R\;} \mathcal{H}^p \text{ is exact} \iff \mathcal{L}^p \xrightarrow{\;R^{\mathsf T}\;} \mathcal{L}^q \xrightarrow{\;S^{\mathsf T}\;} \mathcal{L}^l \text{ is exact}.$$
This result might look surprising if combined with the fact that delay-differential operators act continuously on $\mathcal{E}$ (the map $f \mapsto q(\delta^{(1)}, \delta_1) * f$, see Theorem 3.5.6(iv), is continuous on $\mathcal{E}$ by [107, Thm. 27.3]). It tells us in particular that operators in $\mathcal{H}^{p \times q}$ have a closed range. But this follows indeed from the surjectivity in the scalar case (Proposition 2.14) along with a triangular form.
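The fundamental principle stated above has an elementary finite-dimensional shadow: for matrices over a field, exactness of $\mathbb{Q}^l \xrightarrow{S} \mathbb{Q}^q \xrightarrow{R} \mathbb{Q}^p$ at the middle space amounts to $RS = 0$ together with $\operatorname{rk} S + \operatorname{rk} R = q$, a condition invariant under transposition. A small check with toy matrices of my own (this is only the field case, not the module-theoretic statement over $\mathcal{H}$):

```python
import sympy as sp

def exact(S, R):
    """Exactness of  . --S--> . --R--> .  at the middle space over a
    field: R*S = 0 together with rk S + rk R = (middle dimension)."""
    return (R * S).is_zero_matrix and S.rank() + R.rank() == R.cols

S = sp.Matrix([[1], [1], [0]])            # Q^1 -> Q^3
R = sp.Matrix([[1, -1, 0], [0, 0, 1]])    # Q^3 -> Q^2, ker R = im S

print(exact(S, R))        # True
print(exact(R.T, S.T))    # True: the transposed (dualized) sequence
```

Over $\mathcal{H}$ and $\mathcal{L}$ the rank argument is unavailable; the equivalence rests instead on the duality $(\operatorname{im}_{\mathcal{H}} R^{\mathsf T})^\perp = \ker_{\mathcal{L}} R$ and its counterpart derived above.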
Next we will investigate under which conditions a behavior allows a polynomial kernel-representation. We start with the special case of square nonsingular matrices having determinant in $\mathbb{R}[s]$. The following lemma provides the general result that was guiding the examples in 4.1.7.
Lemma 4.1.10
Let $A \in \mathcal{H}^{n \times n}$ be a matrix such that $\det A = z^k \phi$ for some $k \in \mathbb{Z}$ and $\phi \in \mathbb{R}[s] \setminus \{0\}$. Then $A$ is left equivalent (over $\mathcal{H}$) to an upper triangular matrix $B \in \mathbb{R}[s]^{n \times n}$.
PROOF: Let $A$ be left equivalent to an upper triangular matrix $\tilde A$ with diagonal entries $a_1, \ldots, a_n$ (see Theorem 3.2.1(a)). Then $\det \tilde A = \prod_{i=1}^{n} a_i = u\phi$ for some $u \in \mathcal{H}^\times$, hence $\det \tilde A$ is a unit in $\mathbb{R}(s)[z, z^{-1}]$. We may assume without restriction that $a_i \in \mathbb{R}[s]$. Note that the elements above the diagonal may contain negative powers of $z$. Let $p = (\sum_{\nu=l}^{L} p_\nu z^\nu)\psi^{-1} \in \mathcal{H}$ be such an element in the, say, $j$th column of $\tilde A$. By virtue of Proposition 3.1.2(f) we can subtract an appropriate multiple of $a_j$ from $p$ to obtain a polynomial in $\mathbb{R}[s]$. Indeed, for $\nu \in \{l, \ldots, L\} \setminus \{0\}$ choose $\delta_\nu \in \mathbb{R}[s]$ such that $\frac{z^\nu - \delta_\nu}{a_j} \in \mathcal{H}$. (The case where $\nu$ is negative is not contained in 3.1.2(f), but works equally well.) Then $p - \sum_{\nu=l, \nu \neq 0}^{L} \frac{z^\nu - \delta_\nu}{a_j}\, p_\nu \psi^{-1} a_j \in \mathcal{H} \cap \mathbb{R}(s) = \mathbb{R}[s]$. This way we obtain the desired matrix $B \in \mathbb{R}[s]^{n \times n}$. □
The lemma does not generalize to matrices with determinant in $\mathbb{R}[s, z]$, as will be demonstrated next.
Example 4.1.11
Consider the matrix
$$R = \begin{bmatrix} 1 & s^{-1}(z - 1) \\ 0 & z - 1 \end{bmatrix} \in \mathcal{H}^{2 \times 2}.$$
Thus $\det R = z - 1$ is a polynomial, but $R$ is not left equivalent to a matrix with entries in $\mathbb{R}[s, z]$. To see this, suppose to the contrary that there exists
$$U = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in \mathrm{Gl}_2(\mathcal{H})$$
such that
$$UR = \begin{bmatrix} a & as^{-1}(z - 1) + b(z - 1) \\ c & cs^{-1}(z - 1) + d(z - 1) \end{bmatrix} \in \mathbb{R}[s, z]^{2 \times 2}. \tag{4.1.5}$$
Then $a, c \in \mathbb{R}[s, z]$ and it is easy to see that $b$ and $d$ have to be of the form $b = \hat b s^{-1}$, $d = \hat d s^{-1}$ for some $\hat b, \hat d \in \mathbb{R}[s, z]$ satisfying
$$\hat b^*(0) = 0 = \hat d^*(0). \tag{4.1.6}$$
Now, Equation (4.1.5) yields $s \mid_{\mathbb{R}[s,z]} (a + \hat b)$ and $s \mid_{\mathbb{R}[s,z]} (c + \hat d)$. But
$$\det U = a\,\frac{\hat d}{s} - \frac{\hat b}{s}\,c = s^{-1}(a\hat d - \hat b c) = s^{-1}\big((a + \hat b)\hat d - \hat b(\hat d + c)\big) = \frac{a + \hat b}{s}\,\hat d - \frac{\hat d + c}{s}\,\hat b$$
has to be a unit in $\mathcal{H}$, which is not possible because of (4.1.6). As it will turn out, the condition for the existence of polynomial kernel-representations has to be strengthened. For a proof of the corresponding statement in Theorem 4.1.13 below we will make use of a result concerning factorizations of polynomial matrices, which we want to present first.
Theorem 4.1.12
Let $F$ be any field and $R \in F[x, y]^{p \times q}$ be a polynomial matrix in two variables with rank $p$. Put $N := \binom{q}{p}$ and denote by $m_1, \ldots, m_N \in F[x, y]$ the full-size minors of $R$. Let $d \in F[x, y]$ be any common divisor of $m_1, \ldots, m_N$. Then there exists a matrix $D \in F[x, y]^{p \times p}$ with $\det D = d$ and some $\tilde R \in F[x, y]^{p \times q}$ such that $R = D\tilde R$. Consequently, the following conditions are equivalent:
(a) $R$ is minor prime, that is, $\gcd_{F[x,y]}(m_1, \ldots, m_N) = 1$.
(b) $R$ is left-factor prime, that is, whenever $R = D\tilde R$, where $D$ is a square matrix, then $D$ is unimodular.
A proof can be found in [31, Cor. 1, p. 127] for the case where $R$ is a square matrix. From this the non-square case can easily be deduced in the following way.
PROOF: Denote by $A_1, \ldots, A_N \in F[x, y]^{p \times p}$ the $p \times p$-submatrices of $R$ in any chosen order. Then $d \mid_{F[x,y]} \det A_i$ for all $i$. According to [31, p. 117], the matrices $A_i$ have a common left divisor with determinant $d$ if the determinant of every matrix $Q$ in the right ideal generated by $A_1, \ldots, A_N$ (within the ring $F[x, y]^{p \times p}$) is divisible by $d$, too. But the latter can be deduced from the Binet-Cauchy formula as follows. Let $Q = \sum_{i=1}^{N} A_i B_i$, $B_i \in F[x, y]^{p \times p}$, be a matrix in the right ideal. By the Binet-Cauchy formula (in the notation of Definition 3.2.6), $\det Q$ expands into a sum of products of full-size minors of $[A_1, \ldots, A_N]$ with corresponding full-size minors of $[B_1^{\mathsf T}, \ldots, B_N^{\mathsf T}]^{\mathsf T}$; since each full-size minor of $[A_1, \ldots, A_N]$ is either zero or, up to sign, $\det A_i$ for some $i = 1, \ldots, N$, it follows that $d \mid_{F[x,y]} \det Q$ as desired. Hence, applying [31, Cor. 1, p. 127], we obtain a factorization $A_i = D\tilde A_i$, where $D$ is a square matrix having determinant $d$. The nonsingularity of $D$ implies immediately that the matrices $\tilde A_i$ form the $p \times p$-submatrices of some
$\tilde R \in F[x, y]^{p \times q}$ (in the same chosen order as for the $\tilde A_i$), so that finally $R = D\tilde R$. □
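Theorem 4.1.12 can be observed on a small bivariate example (the matrices are my own choosing): building $R = D\tilde R$ with $\tilde R$ minor prime and $\det D = x$, the gcd of the full-size minors of $R$ recovers $\det D$:

```python
import sympy as sp

x, y = sp.symbols("x y")

D = sp.Matrix([[x, 0], [1, 1]])           # det D = x
Rt = sp.Matrix([[1, 0, y], [0, 1, 1]])    # minor prime: 2x2 minors 1, 1, -y
R = D * Rt                                # [[x, 0, x*y], [1, 1, y + 1]]

# the three full-size (2x2) minors of R, indexed by column pairs
minors = [R[:, [i, j]].det() for i in range(3) for j in range(i + 1, 3)]
print([sp.factor(m) for m in minors])     # [x, x, -x*y]

d = sp.gcd(minors)
print(d)                                  # x, i.e. det D, as the theorem predicts
```

Conversely, extracting this gcd as a left factor is exactly the step used twice in the proof of Theorem 4.1.13 below; the two-variable hypothesis is essential, as noted after the theorem.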
An alternative and more constructive proof of the factorization property is given in [45, Thm. 2]. In [78] the result can be found for the case of an algebraically closed coefficient field. It is worth mentioning that the preceding result is not true for polynomial matrices in more than two variables; for an example see [126]. In [117, 3.2.7] it is proven that a polynomial ring $S[y]$ has the factorization property for its matrices in the above sense if and only if $S$ is a principal ideal domain. Now we are in a position to present a sufficient condition for the existence of polynomial kernel-representations. We also show that polynomial kernel-representations can always be reduced to full row rank ones.
Theorem 4.1.13
(1) Let $R \in \mathcal{H}^{p \times q}$ be a right invertible matrix. If all full-size minors of $R$ are in $\mathbb{R}[s, z, z^{-1}]$, then $R$ is left equivalent (over $\mathcal{H}$) to some matrix $R' \in \mathbb{R}[s, z]^{p \times q}$. As a consequence, $\ker_{\mathcal{L}} R = \ker_{\mathcal{L}} R'$.
(2) Let $R \in \mathbb{R}[s, z]^{p \times q}$ be a matrix where $\operatorname{rk} R = r < p$. Then there exists a full row rank matrix $\hat R \in \mathbb{R}[s, z]^{r \times q}$ such that $R$ is left equivalent over $\mathcal{H}$ to $[\hat R^{\mathsf T}, 0]^{\mathsf T}$. As a consequence, $\ker_{\mathcal{L}} R = \ker_{\mathcal{L}} \hat R$.
PROOF: (1) We use a factorization of the "numerator matrix" of $R$ to extract a maximal left factor. The remaining part will be the desired $R'$. Without restriction we may assume that $R \in \mathcal{H}_0^{p \times q}$ (that is, no $z^{-1}$ is involved) and is written as $R = \phi^{-1}\tilde R$ where $\phi \in \mathbb{R}[s]$ and $\tilde R \in \mathbb{R}[s, z]^{p \times q}$. Then the full-size minors satisfy $\tilde R_{(\rho)} = \phi^p R_{(\rho)}$ for all $\rho \in J_{p,q}$, and the assumption on the full-size minors of $R$ implies that $\phi^p$ is a common divisor of the full-size minors $\tilde R_{(\rho)}$ in the ring $\mathbb{R}[s, z]$. Using Theorem 4.1.12, one obtains a factorization $\tilde R = AR'$ with suitable matrices $A$ and $R'$ over $\mathbb{R}[s, z]$ and $\det A = \phi^p$. Hence $R'_{(\rho)} = R_{(\rho)}$ for all $\rho \in J_{p,q}$ and, consequently, the matrix $R'$ is right invertible over $\mathcal{H}$, too (Corollary 3.2.5(f)). The identity $R = \phi^{-1}AR'$ yields $\phi^{-1}A \in \mathrm{Gl}_p(\mathcal{H})$, and so $R$ and $R'$ are left equivalent.
(2) Again, we are going to use various factorizations. We may assume without restriction that
$$R = \begin{bmatrix} R_1 & R_2 \\ R_3 & R_4 \end{bmatrix} \in \mathbb{R}[s, z]^{p \times q}, \quad \text{where } R_1 \in \mathbb{R}[s, z]^{r \times r} \text{ is such that } \operatorname{rk} R_1 = r.$$
Denoting by $d \in \mathbb{R}[s, z]$ a greatest common divisor (within $\mathbb{R}[s, z]$) of the full-size minors of $[R_1, R_2]$, we may factor $[R_1, R_2] = D[Q_1, Q_2]$, where $D \in \mathbb{R}[s, z]^{r \times r}$ satisfies $\det D = d$ and where $[Q_1, Q_2] \in \mathbb{R}[s, z]^{r \times q}$. The rank $r$ of the matrix $R$
yields that $R_3Q_1^{-1}Q_2 = R_4$; thus $R_3Q_1^{-1}Q_2$ is polynomial. Consider the equation $[R_3, R_4] = R_3Q_1^{-1}[Q_1, Q_2]$ and notice that $[Q_1, Q_2]$ is minor prime by construction. Cramer's rule applied to each full-size square submatrix establishes that $R_3Q_1^{-1}$ is polynomial itself. Hence
$$R = A[Q_1, Q_2], \quad \text{where } A := \begin{bmatrix} D \\ R_3Q_1^{-1} \end{bmatrix} \in \mathbb{R}[s, z]^{p \times r}.$$
Again, by Theorem 4.1.12, one can extract a greatest common divisor of the full-size minors of $A$ to the right. Precisely, we may write $A = A_1B$ for some $B \in \mathbb{R}[s, z]^{r \times r}$ such that the matrix $A_1 \in \mathbb{R}[s, z]^{p \times r}$ is minor prime. This yields that, if considered over $\mathcal{H}$, the full-size minors of $A_1$ have a greatest common divisor $a \in \mathcal{H}$ which has only finitely many zeros, and thus $a$ is even in $\mathbb{R}[s]$ by Proposition 2.5(2). As a consequence, $A_1$ is left equivalent over $\mathcal{H}$ to a matrix
$$\begin{bmatrix} A_{11} \\ 0 \end{bmatrix}, \quad \text{where } A_{11} \in \mathcal{H}^{r \times r} \text{ and } \det A_{11} = a \in \mathbb{R}[s] \setminus \{0\}.$$
By virtue of Lemma 4.1.10 we can finally factor A 11 as A 11 = CF where C E Glr(H), F E R[sjTxr and det F =a. Putting R = [F BQ1, F BQ 2 ], the assertion ~~.
0
Remark 4.1.14
It should be noted that all results of this section remain valid when the function space 𝓛 = C^∞(ℝ, ℂ) is replaced by its real-valued analogue, see also Remark 2.15. As a consequence, the same comment applies to the whole of this and the next chapter.
At the end of this section we would like to quote some results from the existing literature about systems with noncommensurate delays. We restrict to results which are concerned with characterizing the inclusion

ker_𝓛 R₁ ⊆ ker_𝓛 R₂.  (4.1.7)

Recall from Theorem 4.1.5(a) that if the R_i are matrices with entries in ℋ, then (4.1.7) is equivalent to R₁ being a right divisor of R₂, that is, XR₁ = R₂ for some matrix X with entries in ℋ. It is natural to ask whether this characterization generalizes to systems with noncommensurate delays. For future reference we formulate the results in a remark.
Remark 4.1.15
In Remark 3.1.8 we presented the operator ring

ℋ_(l) = { a/b | a, b ∈ ℝ[s, z₁, …, z_l], ker_𝓛 b ⊆ ker_𝓛 a } = { f ∈ ℝ(s, z₁, …, z_l) | f* ∈ H(ℂ) },
88
4 Behaviors of Delay-Differential Systems
taken from [47]. The variables z_j represent shifts of positive lengths τ_j, which are ℚ-linearly independent retardations. The notations ker_𝓛 a and f* are straightforward generalizations of the commensurate case. Let l > 1 and R_i ∈ ℋ_(l)^{p_i×q}, i = 1, 2, be two matrices. In light of Theorem 4.1.5(a) one might expect that (4.1.7) can be characterized by right division with respect to the operator ring ℋ_(l). This, however, is in general not true. In order to quote some related results, let us first recall from Section 3.5 the notation ℰ, which refers to the space C^∞(ℝ, ℂ). Functions p, q ∈ ℋ_(l) can be found such that V(p*, q*) = ∅ and the ideal (p*, q*)_{PW(ℂ)} taken in the Paley–Wiener algebra is not principal [111, Prop. 2.6]. As a consequence, not even a matrix X with entries in PW(
is true if rk R₁ = p₁. In [26, Thm. 4.1] this has been generalized in the direction that, for R_i as in (4.1.9), the inclusion (4.1.7) is equivalent to an identity X * R₁ = R₂ for some X ∈ (ℰ')^{p₂×p₁} if and only if the operator R₁ has a closed range in ℰ^{p₁}. In the commensurate case, that is, for R₁ with entries in ℋ, we observed in Remark 4.1.9 that the closed range is a consequence of the existence of triangular forms and the surjectivity of scalar operators. In the noncommensurate case the scalar operators are surjective as well [24, Thm. 5], but the analogous implication onto the range of a matrix operator fails due to the lack of left equivalent triangular forms.
4.2 Input/Output Systems

This section centers around the system-theoretic notions of inputs and outputs. Capturing these concepts in the behavioral language amounts to the task of defining their essential properties in terms of the trajectories. Once this is settled, one wants to understand, preferably in terms of describing equations, whether or not a given system is endowed with an input/output structure. In the same fashion, one wishes to describe and understand causal (that is, nonanticipating) relationships between inputs and outputs. The incorporation of all these notions into the behavioral approach has been elaborated by Willems [118, 119], see also [87]. The concepts are defined for arbitrary dynamical systems in terms of the trajectories. Of all system classes, however, linear systems described by ODEs are those for which these notions are best understood and algebraic characterizations are known, see [87]. We recall the concepts in Definition 4.2.1 for our situation of delay-differential systems. The characterizations in terms of kernel-representations, given in Theorem 4.2.3, are fairly simple and standard, which is due to the fact that we are dealing with C^∞-trajectories only. The results generalize the criteria known for ODEs in a straightforward way. We also discuss the case of (L₁^loc)₊-trajectories for input/output systems and present a sufficient condition for nonanticipation in this more general situation.
Note first that behaviors B ⊆ 𝓛^q are time-invariant, that is, σ_{t₀}(B) = B for all t₀ ∈ ℝ, where (σ_{t₀}w)(t) = w(t − t₀) is the forward shift by t₀ time units, defined for arbitrary functions w on ℝ. Therefore the time instant t₀ = 0 occurring in the definition below is just a matter of choice and has no specific meaning by itself. For the causality considerations we will make use of the notation

w₋ := w|_{(−∞, 0]}  (4.2.1)
for the restriction of the function w, defined on ℝ, to the left half line (−∞, 0]. Occasionally it will be convenient to utilize the interpretation of rational functions in ℝ(s, z) as distributions (cf. Section 3.5). In that context we will pass from 𝓛 to the topological space ℰ.

Definition 4.2.1
Let B ⊆ 𝓛^q be a behavior.
(a) B is called autonomous if for all w ∈ B the condition w₋ = 0 implies w = 0.
Let q = m + p and assume the external variables w = (w₁, …, w_{m+p})ᵀ are partitioned into w = (uᵀ, yᵀ)ᵀ, where u ∈ 𝓛^m and y ∈ 𝓛^p.
(b) The variables in u are called free (simply, u is said to be free) if for all u ∈ 𝓛^m there exists y ∈ 𝓛^p such that (uᵀ, yᵀ)ᵀ ∈ B.
(c) The behavior B is said to be an input/output (i/o-) system with input u and output y if u is maximally free, that is, if u is free and no selection (w_{i₁}, …, w_{i_m̃})ᵀ of external variables exists which is free and satisfies m̃ > m.
(d) Let B be an i/o-system with input u and output y. Then B is called nonanticipating if for all u ∈ 𝓛^m satisfying u₋ = 0 there exists y ∈ 𝓛^p such that y₋ = 0 and (uᵀ, yᵀ)ᵀ ∈ B.
Let us briefly describe the system-theoretic meaning of these notions. In an autonomous system the future of a trajectory is completely determined by its past. As a consequence, no variable can be set freely. On the other hand, in an i/o-system the free variables can be considered as controlling variables (the input), which can be set arbitrarily, while the output consists of the bound variables; it processes the setting chosen for the input. Nonanticipation reflects a causal relationship (causal with respect to time) between input and output: "The past of the output is not restricted by the future of the input." [87, p. 89]. In terms of input/output maps (cf. Remark 4.2.4), it simply says that the effect cannot occur in time prior to the cause.

Remark 4.2.2
It is not quite in the behavioral spirit to assume that the external variables are
a priori in an ordering such that only the first m can play the role of inputs and the last p that of outputs. Instead, it would be more natural to take arbitrary orderings into consideration. Since that would add merely a permutation matrix to the setting, we disregard this additional freedom and assume that a suitable reordering, if possible, has already been carried out.
Clearly, the maximum number of free variables is uniquely determined. It will turn out that this number equals the number of all external variables minus the number of independent equations. Observe that this is simply the classical situation as in linear algebra over fields. Moreover, we will see that every collection
of free variables can be extended to a maximally free one. This is a trivial consequence of the rank criteria given below. As is to be expected, nonanticipation is closely related to the size of the retardations acting on the inputs and outputs.

Theorem 4.2.3
Let B = ker_𝓛 [P, Q] ⊆ 𝓛^{m+p}, where [P, Q] ∈ ℋ^{r×(m+p)} has rank r, and let m, p > 0. Assume that the external variables are partitioned into u and y as in Definition 4.2.1. Then
(1) u is free if and only if rk Q = r.
(2) B defines an i/o-system if and only if rk Q = r = p. In this case, the matrix −Q⁻¹P ∈ ℝ(s, z)^{p×m} exists and is called the formal transfer function of the i/o-system B.
(3) Let B be an i/o-system. Then B is nonanticipating if and only if Q⁻¹P ∈ ℝ(s)[[z]]^{p×m}.
Notice that in the scalar case r = m = p = 1 the first two assertions simply reflect the surjectivity of Q acting on 𝓛.
PROOF: (1) In case rk Q < r = rk [P, Q], a left equivalent form of [P, Q] in which Q is upper triangular shows that u is not free. The converse is immediate by Proposition 4.1.4(a).
(2) "⇐": For every nonsingular p×p-submatrix of [P, Q] one may use a diagonal form to see that no larger collection than the complementary m variables can be free. Together with (1) this proves the assertion. For "⇒" notice that rk Q = r = rk [P, Q] implies the existence of a nonsingular r×r-submatrix of Q, resulting in a collection of m + p − r free variables, so that the maximality of m yields r = p.
(3) First of all, by (2) the formal transfer function −Q⁻¹P ∈ ℝ(s, z)^{p×m} ⊆ ℝ(s)((z))^{p×m} exists. For nonanticipation, dealing with inputs having their support bounded to the left, it is most convenient to utilize the convolution operator given by the distribution (−Q⁻¹P)(δ₀⁽¹⁾, δ₁) ∈ (𝒟'₊)^{p×m} acting on ℰ₊^m, see Theorem 3.5.1. Precisely, for all u ∈ ℰ₊^m satisfying u₋ = 0, there exists a unique output y ∈ ℰ₊^p given by y = (−Q⁻¹P)(δ₀⁽¹⁾, δ₁) * u. If −Q⁻¹P ∈ ℝ(s)[[z]]^{p×m}, then (−Q⁻¹P)(δ₀⁽¹⁾, δ₁) has support in [0, ∞) and thus y₋ = 0, too. Hence B is nonanticipating. The converse follows from Lemma 3.5.4. □
From the above it is immediate that every behavior can be turned into an i/o-system by suitably reordering the external variables. It turns out that the same is true even for nonanticipation. Before proving that assertion, we want to comment on the characterization of nonanticipation given above.
Remark 4.2.4
For an i/o-system B = ker_𝓛 [P, Q] the formal transfer function −Q⁻¹P exists and induces the distribution (−Q⁻¹P)(δ₀⁽¹⁾, δ₁) ∈ (𝒟'₊)^{p×m}, see Theorem 3.5.1. It therefore gives rise to the convolution operator

T : (𝒟'₊)^m → (𝒟'₊)^p,  u ↦ (−Q⁻¹P)(δ₀⁽¹⁾, δ₁) * u.  (4.2.2)

Since 𝒟'₊ * ℰ₊ ⊆ ℰ₊, the operator can be restricted to a map ℰ₊^m → ℰ₊^p. (We utilized this fact already in the proof of part (3) above.) In this way, T may be regarded as an input/output (i/o-) operator associated with the system B. The graph of the restriction to ℰ₊^m is exactly the subspace B ∩ ℰ₊^{m+p} of all one-sided trajectories in B. The distribution (−Q⁻¹P)(δ₀⁽¹⁾, δ₁) is usually called the impulse response, since its columns are the responses to the Dirac inputs u_i = δ₀e_i ∈ (𝒟'₊)^m, where e₁, …, e_m denote the standard basis vectors in ℝ^m. According to Theorem 4.2.3(3), the operator T (or rather its graph in ℰ₊^{m+p}) is nonanticipating if and only if −Q⁻¹P ∈ ℝ(s)[[z]]^{p×m}. As a consequence, each purely differential behavior ker_𝓛 [P, Q] (that is, [P, Q] ∈ ℝ[s]^{p×(m+p)}) is a nonanticipating i/o-system provided that Q is nonsingular. In this context no requirement like −Q⁻¹P being a proper rational matrix arises. This is simply due to the fact that we allow C^∞-functions only, so that differentiation (the polynomial part of a rational matrix) causes no particular difficulties.
The situation is different when taking other function spaces into consideration. In Remark 3.5.7 we discussed the possibility of more general function spaces. Let us consider the case of (L₁^loc)₊-functions being fed into the system. Then, in order to avoid impulsive parts in the output, −Q⁻¹P has to be proper in the sense that −Q⁻¹P ∈ ℝ(s)_p((z))^{p×m} (see Remark 3.5.7 for the notation). In this case the map (4.2.2) specializes to a map (L₁^loc)₊^m → (L₁^loc)₊^p, which, again, is nonanticipating iff −Q⁻¹P is a power series (rather than merely a Laurent series) over the ring ℝ(s)_p. For systems of ODEs this has been described in [120, p. 333]. We will call a system ker_𝓛 [P, Q] satisfying the condition −Q⁻¹P ∈ ℝ(s)_p[[z]]^{p×m} a strongly nonanticipating i/o-system.
At this point a main difference between behaviors defined by DDEs and those given by ODEs arises. The latter can always be turned into strongly nonanticipating i/o-systems by suitably reordering the external variables, see also [87, Thm. 3.3.22]. This is not true for delay-differential systems. For instance, for the behavior B given by [p, q] = [s − s²z, 1 − s³z], neither q⁻¹p nor p⁻¹q is in ℝ(s)_p((z)). Thus, B can neither way be regarded as a strongly nonanticipating i/o-system. But on the other hand, both quotients are in ℝ(s)[[z]], so the behavior B defines a nonanticipating i/o-system (over C^∞) either way.
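These membership tests can be carried out symbolically. The sketch below (using Python's sympy; the helper is_proper is our own, not notation from the text) expands q⁻¹p for the example above as a series in z: no negative powers of z occur, matching nonanticipation over C^∞, while every coefficient is an improper rational function of s, matching the failure of strong nonanticipation.

```python
from sympy import symbols, series, simplify, fraction, cancel, degree

s, z = symbols('s z')

p = s - s**2*z      # P-entry of the example [p, q]
q = 1 - s**3*z      # Q-entry; q(s, 0) = 1 != 0, so q**-1 * p is a power series in z

# expansion of q**-1 * p in powers of z, coefficients in R(s)
expansion = series(p/q, z, 0, 3).removeO()
coeffs = [simplify(expansion.coeff(z, j)) for j in range(3)]
# coefficients are s, s**4 - s**2, s**7 - s**5: no negative powers of z occur

def is_proper(r):
    # proper = numerator degree in s does not exceed denominator degree
    num, den = fraction(cancel(r))
    return degree(num, s) <= degree(den, s)

# every coefficient is improper, so the system is not strongly nonanticipating
print([is_proper(c) for c in coeffs])
```

The nonvanishing of the z-constant term of q is exactly the criterion det Q(s, 0) ≠ 0 of Proposition 4.2.5(b) below.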
The following proposition provides some information on how to read off directly from the matrix [P, Q], without expanding −Q⁻¹P into a series, whether or not the system is (strongly) nonanticipating. The criteria take their best formulation by choosing a normalized form for [P, Q] in the sense that the matrix has no negative powers of z and a constant coefficient (with respect to z) of full row rank. Part (a) below shows that each behavior admits such a normalized kernel-representation. The criterion for Q⁻¹P being a matrix over ℝ(s)[[z]] is then very natural: the constant coefficient of Q has to be nonsingular. The normalization is also implicitly contained in the assumption of part (c), leading to a strongly nonanticipating system. Although we will not dwell on the case of (L₁^loc)₊-trajectories later on, we would like to include this particular criterion. It will be utilized later to demonstrate that the systems arising in Chapter 5, as well as the controller used for spectrum assignment in Section 4.5, are actually strongly nonanticipating systems.
Recall the definition of deg_s q for rational functions q ∈ ℝ(s)[z] given in 3.1. For a matrix M ∈ ℝ(s)[z]^{p×q} we denote by M(s, 0) the matrix in ℝ(s)^{p×q} obtained after substituting z = 0 into M. We call M normalized if rk_{ℝ(s)} M(s, 0) = p.

Proposition 4.2.5
(a) For each matrix [P, Q] ∈ ℋ^{p×(m+p)} with rank p there exists a matrix U ∈ Gl_p(ℝ[s, z, z⁻¹]) such that U[P, Q] is in ℋ₀^{p×(m+p)} and normalized, i.e., rk_{ℝ(s)}(U[P, Q])(s, 0) = p.
(b) Let [P, Q] ∈ ℋ₀^{p×(m+p)} be a normalized matrix and let Q be nonsingular. Then Q⁻¹P ∈ ℝ(s)[[z]]^{p×m} ⟺ det Q(s, 0) ≠ 0.
(c) Let [P, Q] ∈ ℋ₀^{p×(m+p)} and det Q ≠ 0. Write det Q = Σ_{j=0}^{L} q_j(s)z^j with coefficients q_j ∈ ℝ(s) and suppose deg_s(det Q) = deg_s q₀. Moreover, suppose deg_s(det Q) is maximal among all degrees of the full-size minors of [P, Q]. Then Q⁻¹P ∈ ℝ(s)_p[[z]]^{p×m}.
Notice that by (a) and (b) every system can be turned into a nonanticipating i/o-system by reordering the external variables.
PROOF: (a) It is enough to establish a denominator-free version, i.e., [P, Q] ∈ ℝ[s, z]^{p×(m+p)}. Assume rk_{ℝ(s)}[P, Q](s, 0) < p. Then there exists a row transformation U ∈ Gl_p(ℝ[s]) such that the last row of U[P, Q](s, 0) is identically zero. Hence the matrix

[ I_{p−1}  0    ]
[ 0        z⁻¹ ] U[P, Q] =: [P₁, Q₁]

has entries in ℝ[s, z]. If rk [P₁, Q₁](s, 0) = p, we are done. Otherwise we can proceed in the same manner with [P₁, Q₁]. This way we obtain a procedure which keeps running as long as the current matrix [P_l, Q_l] satisfies rk [P_l, Q_l](s, 0) < p. But on the other hand, the procedure must stop after finitely many steps, since
the full row rank of [P, Q] guarantees that the maximal degree in z of the full-size minors constitutes a strictly decreasing sequence of nonnegative numbers. Thus we obtain the desired matrix after finitely many steps, which proves the assertion.
(b) Notice that both P and Q are matrices over the ring ℝ(s)[z] ⊆ ℝ(s)[[z]], and Q is invertible over ℝ(s)[[z]] if and only if det Q is a unit in ℝ(s)[[z]], hence iff det Q(s, 0) ≠ 0. This proves "⇐". For "⇒" observe that P = Q Σ_{j≥0} A_j(s)z^j with coefficients A_j ∈ ℝ(s)^{p×m} implies P(s, 0) = Q(s, 0)A₀(s), which together with the normalization rk [P(s, 0), Q(s, 0)] = p yields rk Q(s, 0) = p.
(c) Let us start with the scalar case m = p = 1. Write P = Σ_{j≥0} p_j(s)z^j where p_j ∈ ℝ(s). Then the assumption on deg_s(det Q) reads as deg_s q₀ ≥ deg_s p_j and deg_s q₀ ≥ deg_s q_j for each j. Using (b), we have Q⁻¹P ∈ ℝ(s)[[z]], say Q⁻¹P = Σ_{j≥0} a_j(s)z^j for some a_j ∈ ℝ(s). Now the result follows by induction, since a₀ = q₀⁻¹p₀ ∈ ℝ(s)_p and a_j = q₀⁻¹p_j − Σ_{i=1}^{j} q₀⁻¹q_i a_{j−i} ∈ ℝ(s)_p. The matrix case is a consequence of the scalar case along with Cramer's rule. Indeed, the entry (Q⁻¹P)_{ij} is of the form (det Q)⁻¹ det Q_{ij}, where Q_{ij} is the matrix obtained by replacing the ith column of Q with the jth column of P. Hence det Q_{ij} is a full-size minor of [P, Q], and the result follows from the assumptions combined with the scalar case. □

Remark 4.2.6
For normalized matrices [P_i, Q_i] ∈ ℋ₀^{p×(m+p)}, hence rk [P_i, Q_i](s, 0) = p for i = 1, 2, the uniqueness result about kernel-representations in Theorem 4.1.5(a) specializes accordingly; this can be verified straightforwardly.

We close this section with an algebraic characterization of autonomy. It is immediate from the definition that autonomous systems have no free variables. The converse is true as well and follows from the identity ker_𝓛 R ⊆ ker_𝓛 (det R · I_q), where R is nonsingular, together with Proposition 2.14(2). For completeness, we also include the special case of finite-dimensional systems, which can easily be derived by use of a diagonal form together with the scalar case in Corollary 2.6(a) and Lemma 4.1.10.

Proposition 4.2.7
Let R ∈ ℋ^{p×q} be a matrix with associated behavior B = ker_𝓛 R ⊆ 𝓛^q. Then
(a) B is autonomous if and only if rk R = q.
(b) B is finite-dimensional (as ℝ-vector space) if and only if B is the kernel of some nonsingular purely differential operator, i.e., B = ker_𝓛 T for some nonsingular T ∈ ℝ[s]^{q×q}.
4.3 Transfer Classes and Controllable Systems

In Section 4.1 we characterized the equality of behaviors via left equivalence of associated kernel-representations over ℋ. Now we will turn to a weaker equivalence relation on the lattice B, which will be called transfer equivalence. This notion refers to the fact that for i/o-systems each equivalence class is going to consist of the systems with the same formal transfer function. However, the equivalence itself can easily be handled without use of any input/output partition, which is merely a reordering of the external variables, anyway. In particular, there is no need for giving an interpretation of −Q⁻¹P as an operator. It will be shown that each equivalence class is a sublattice of B with a (unique) least element. This particular element can be characterized algebraically, but also purely in terms of its trajectories. It turns out to be a controllable system, meaning that every trajectory of the behavior can be steered into every other within finite time without violating the laws governing the system. Finally, a direct decomposition of behaviors into their controllable part and an autonomous subsystem will be derived.

Definition 4.3.1
(a) For B = ker_𝓛 R, where R ∈ ℋ^{p×q}, define the output number of B by o(B) := rk R.
(b) For systems B_i = ker_𝓛 R_i, where R_i ∈ ℋ^{p_i×q} have full row rank, i = 1, 2, define

B₁ ∼ B₂  :⟺  o(B₁) = o(B₂) and R₂ = MR₁ for a nonsingular matrix M ∈ ℝ(s, z)^{p₁×p₁}.
This provides an equivalence relation on the lattice B. We call two systems B₁ and B₂ transfer equivalent if B₁ ∼ B₂. The equivalence class of a behavior B will be denoted by [B] and is called its transfer class.
The output number is well-defined by Theorem 4.1.5(a). It does indeed count the number of output variables of the system, see Theorem 4.2.3(2). Observe that transfer equivalence simply means that the kernel-representations share the same row space as ℝ(s, z)-vector spaces. Since ℝ(s, z) is the quotient field of the operator ring ℋ, transfer equivalence can just as well be expressed as

B₁ ∼ B₂  :⟺  o(B₁) = o(B₂) and AR₂ = BR₁ for nonsingular matrices A, B ∈ ℋ^{p₁×p₁}.
It is easily seen that for i/o-systems transfer equivalence is the same as equality of the formal transfer functions. In the next theorem we describe the structure of the transfer classes. Among other things, we obtain that behaviors with right invertible kernel-representations are exactly the images of delay-differential operators.
Theorem 4.3.2
Let B ∈ B have output number o(B) = p. Then the transfer class [B] of B is a sublattice of B. It contains a least element B_c and can therefore be written as

[B] = {B' ∈ B | o(B') = o(B) and B_c ⊆ B'}.  (4.3.1)
For a system B' ∈ [B] the following are equivalent:
(1) B' = B_c, the least element.
(2) B' = ker_𝓛 R' for some right invertible R' ∈ ℋ^{p×q}.
(3) B' has an image-representation, that is, B' = im_𝓛 Q for some Q ∈ ℋ^{q×(q−p)} of full column rank. The matrix Q can be chosen left invertible.
PROOF: Let B₁, B₂ ∈ [B] be given as B_i = ker_𝓛 R_i for some R_i ∈ ℋ^{p×q} having full row rank. From B₁ ∼ B₂ it follows that rk [R₁ᵀ, R₂ᵀ]ᵀ = rk R₁ = rk R₂ and, by Theorem 3.2.8, rk lclm(R₁, R₂) = rk gcrd(R₁, R₂) = p, too. Using Theorem 4.1.5(c) and (d), we obtain (B₁ + B₂) ∼ B₁ ∼ (B₁ ∩ B₂), which implies the closedness of [B] with respect to taking finite sums and intersections.
As for the existence of a least element, we first show that there exists a behavior in [B] satisfying (2). To this end, let B = ker_𝓛 R where R ∈ ℋ^{p×q} has full row rank. Using Corollary 3.2.5 we may factor R as
R = BR_c,  (4.3.2)

where

B ∈ ℋ^{p×p} is nonsingular and R_c ∈ ℋ^{p×q} is right invertible.  (4.3.3)

Now

B_c := ker_𝓛 R_c ∈ [B]  (4.3.4)
is a system in [B] satisfying (2). To show the implication "(2) ⇒ (3)", let B' = ker_𝓛 R' ∈ [B] for some right invertible matrix R' ∈ ℋ^{p×q}. Completing R' to a unimodular matrix

U = [R'ᵀ, U'ᵀ]ᵀ ∈ Gl_q(ℋ)  (4.3.5)

(see Corollary 3.2.5) and partitioning the inverse as

U⁻¹ = [Q', Q] according to Q ∈ ℋ^{q×(q−p)},  (4.3.6)
one obtains ker_𝓛 R' = im_𝓛 Q. Indeed, for v ∈ ker_𝓛 R' and w := U'v one has

v = U⁻¹Uv = [Q', Q](0ᵀ, wᵀ)ᵀ = Qw ∈ im_𝓛 Q.

Hence ker_𝓛 R' ⊆ im_𝓛 Q, and the converse inclusion follows from the identity R'Q = 0.
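For a toy purely differential instance of this construction (our own illustration, not taken from the text), take the right invertible row R' = [1, s] over ℝ[s]; sympy confirms that the completion U is unimodular and that the last column Q of U⁻¹ satisfies R'Q = 0:

```python
from sympy import Matrix, symbols

s = symbols('s')

Rp = Matrix([[1, s]])            # right invertible: [1, s] * (1, 0)^T = 1

# unimodular completion U = [R'; U'] and its inverse U^{-1} = [Q', Q]
U = Matrix([[1, s],
            [0, 1]])
Uinv = U.inv()                   # [[1, -s], [0, 1]], again polynomial
Q = Uinv[:, 1]                   # Q = (-s, 1)^T

print(U.det())                   # 1, so U is unimodular over R[s]
print(Rp * Q)                    # zero matrix, hence im Q is contained in ker R'
```

In trajectory terms, w = Qv reads w₁ = −v̇ and w₂ = v, and indeed R'w = w₁ + ẇ₂ = −v̇ + v̇ = 0, so im Q parametrizes the kernel.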
For the implication "(3) ⇒ (2)", let B' = im_𝓛 Q for some matrix Q ∈ ℋ^{q×(q−p)} of full column rank and factor Q = Q̃A, where Q̃ is left invertible and A is nonsingular. Using Proposition 4.1.4, we observe im_𝓛 Q = im_𝓛 Q̃. The matrix Q̃ can be completed to a unimodular matrix, say U⁻¹ as in (4.3.6) and U as in (4.3.5), and the argument above leads again to B' = im_𝓛 Q̃ = ker_𝓛 R', where R' is a right invertible matrix.
In order to prove "(2) ⇒ (1)", we first remark that the system B_c defined in (4.3.4) is the unique system in the transfer class [B] with a right invertible kernel-representation. To see this, let MR_c = NR'_c, where R'_c ∈ ℋ^{p×q} is right invertible, too, and M, N ∈ ℋ^{p×p} are nonsingular. Using right inverses, one obtains that N⁻¹M, M⁻¹N ∈ ℋ^{p×p}; thus R'_c = (N⁻¹M)R_c is left equivalent to R_c, showing that ker_𝓛 R'_c = ker_𝓛 R_c by Theorem 4.1.5(a). Now there remains to establish the minimality of B_c = ker_𝓛 R_c in [B]. We know already that ker_𝓛 R_c = im_𝓛 Q for some matrix Q. Let B' = ker_𝓛 R' be any behavior in [B]. Then KR' = LR for some nonsingular matrices K, L ∈ ℋ^{p×p} and hence KR' = LBR_c by (4.3.2) and (4.3.3). This yields R'Q = 0 and thus im_𝓛 Q ⊆ ker_𝓛 R' = B'. Hence B_c is the (unique) least element in the lattice [B]. Together with Theorem 4.1.5(a) we get (4.3.1) as well as the implication "(1) ⇒ (2)", completing the proof. □
Obviously, the autonomous systems in 𝓛^q form a transfer class having the trivial system ker_𝓛 I = {0} as its least element.
The least element B_c of a transfer class is of system-theoretic significance. It is a controllable system in the sense that it is capable of steering every trajectory into every other trajectory within finite time and without leaving the behavior. Put another way, controllability is the possibility to combine any past of the system with any desired (far) future of the system. In order to make this precise we first need a notion for combining functions.
Definition 4.3.3
For w, w' ∈ 𝓛^q and t₀ ∈ ℝ define the concatenation of w and w' at time t₀ as the function w∧_{t₀}w' : ℝ → ℂ^q given by

(w∧_{t₀}w')(t) := { w(t)   for t < t₀,
                    w'(t)  for t ≥ t₀.

Using concatenations, trajectory steering can be expressed as follows.
Definition 4.3.4 (see [87, Def. 5.2.2] and the interpretation given therein)
A time-invariant subspace B of 𝓛^q is called controllable if for all w, w' ∈ B there exist some time instant t₀ ≥ 0 and a function c : [0, t₀) → ℂ^q such that w∧₀c∧_{t₀}σ_{t₀}w' ∈ B.
Note that the requirement w∧₀c∧_{t₀}σ_{t₀}w' ∈ B implies in particular that the concatenation is smooth. Since σ_{t₀}w'(t₀) = w'(0), the concatenation switches
exactly from w(0) to w'(0), but allows for some finite time t₀ ≥ 0 to make the switching smooth and compatible with the laws of the system.
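The concatenation of Definition 4.3.3, and the shift rule behind Lemma 4.3.6 below, can be sketched numerically in plain Python (the helper names concat and shift are ours). For the pure shift operator U = z, so that U(w)(t) = w(t − 1), shifting a concatenation at t₀ yields the concatenation of the shifts at t₀ + 1:

```python
# (w /\_{t0} w')(t): follow w strictly before t0 and w' from t0 on
def concat(w, wp, t0):
    return lambda t: w(t) if t < t0 else wp(t)

# the action of the operator z: forward shift by one time unit
def shift(w):
    return lambda t: w(t - 1.0)

w  = lambda t: 0.0       # past trajectory
wp = lambda t: t * t     # future trajectory
t0 = 2.0

lhs = shift(concat(w, wp, t0))                 # U(w /\_{t0} w')
rhs = concat(shift(w), shift(wp), t0 + 1.0)    # U(w) /\_{t0+1} U(w')

samples = [0.0, 1.5, 2.0, 2.5, 3.0, 4.0]
print(all(lhs(t) == rhs(t) for t in samples))  # True
```

For an operator with several terms, differentiation and shorter shifts mix the two pieces on [t₀, t₀ + L), which is precisely the intermediary piece c of Lemma 4.3.6.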
Remark 4.3.5
The definition of controllability given above appears to be the most intrinsic one possible. It merely refers to the collection of all trajectories of the system and does not make use of any kind of representation, for instance a kernel-representation or a state-space representation. A slightly different version of controllability, yet also based solely on the set of possible trajectories, has been introduced in the algebraic approach to systems theory in [125, p. 153]. In this case, the notion resorts to input/output partitions, which makes the concept of controllability more technical than the definition above.
Of course, the space 𝓛^q is controllable. It is even controllable in arbitrarily short time, that is, for all w, w' ∈ 𝓛^q and all t₀ > 0 there exists a function c such that w∧₀c∧_{t₀}σ_{t₀}w' ∈ 𝓛^q. In the next lemma we verify (straightforwardly) that the image U(w∧_{t₀}w') of a smooth concatenation w∧_{t₀}w' under a delay-differential operator U is a concatenation of U(w) and U(w') and some intermediary piece. Its length is determined by the size of the maximal retardation appearing in the operator U.
Lemma 4.3.6
Let w, w' ∈ 𝓛^q and t₀ ∈ ℝ be such that w∧_{t₀}w' ∈ 𝓛^q. Furthermore, let the matrix U ∈ ℋ₀^{p×q} be written as U = Σ_{j=0}^{L} U_j z^j with coefficients U_j ∈ ℝ(s)^{p×q}. Then there exists a function c : [t₀, t₀ + L) → ℂ^p such that U(w∧_{t₀}w') = U(w)∧_{t₀}c∧_{t₀+L}U(w').
PROOF: First of all, it is clear that U(w∧_{t₀}w') ∈ 𝓛^p. As for the concatenation, we proceed in two steps.
a) Assume first U ∈ ℝ[s, z]^{p×q}, hence U_j ∈ ℝ[s]^{p×q}. Then

U(w∧_{t₀}w')(t) = Σ_{j=0}^{L} U_j(w∧_{t₀}w')(t − j) = Σ_{j=0}^{L} (U_j(w)∧_{t₀}U_j(w'))(t − j)
               = { Σ_{j=0}^{L} U_j(w')(t − j) = U(w')(t)  if t ≥ t₀ + L,
                 { Σ_{j=0}^{L} U_j(w)(t − j)  = U(w)(t)   if t < t₀,

and the desired result follows.
b) For the general case let U_j = V_jφ⁻¹, where V_j ∈ ℝ[s]^{p×q} and φ ∈ ℝ[s]\{0}. Put V = Σ_{j=0}^{L} V_j z^j ∈ ℝ[s, z]^{p×q}. Then U = Vφ⁻¹, and for all w̃ ∈ 𝓛^q we have U(w̃) = V(ṽ), where ṽ ∈ 𝓛^q satisfies φ(ṽ) = w̃ entrywise. Let w̃ = w∧_{t₀}w'. Using the appropriate initial conditions at t₀, one observes that one may find
4.3 Transfer Classes and Controllable Systems
99
ṽ ∈ 𝓛^q such that ṽ = v∧_{t₀}v', where φ(v) = w and φ(v') = w'. But then part a) of the proof yields U(w̃) = V(ṽ) = V(v∧_{t₀}v') = V(v)∧_{t₀}c∧_{t₀+L}V(v') = U(w)∧_{t₀}c∧_{t₀+L}U(w') for some suitable function c defined on the interval [t₀, t₀ + L). □

One obtains immediately

Corollary 4.3.7
Let B be a time-invariant controllable subspace of 𝓛^q. Then for all U ∈ ℋ^{p×q} the space U(B) ⊆ 𝓛^p is controllable, too.
PROOF: Since B is time-invariant, it is enough to consider U ∈ ℋ₀^{p×q}. Let U be as in Lemma 4.3.6. We have to show that for all w, w' ∈ B the images U(w) and U(w') can be concatenated within U(B). By assumption on B we have σ_L w' ∈ B, and there exists t₀ ≥ 0 together with a function c such that w̃ := w∧₀c∧_{t₀}σ_{t₀+L}w' ∈ B. Now Lemma 4.3.6 provides some intermediary function c₁ such that
U(w̃) = U(w∧₀c∧_{t₀}σ_{t₀+L}w') = U(w)∧₀c₁∧_{t₀+L}U(σ_{t₀+L}w') = U(w)∧₀c₁∧_{t₀+L}σ_{t₀+L}U(w'),

completing the proof since U(w̃) ∈ U(B). □
Now we are in a position to establish the following characterization of controllable behaviors.

Theorem 4.3.8
Let B = ker_𝓛 R where R ∈ ℋ^{p×q} is a matrix of rank r. Then B is controllable if and only if rk R*(s) = r for all s ∈ ℂ. As a consequence, B is controllable if and only if B = B_c, where B_c is the least element in its transfer class [B].
Notice that the rank condition does not depend on the choice of the kernel-representation R.
PROOF: Sufficiency follows from Corollary 4.3.7 together with the existence of image-representations as derived in Theorem 4.3.2. For necessity we first prove the assertion for the case B ⊆ 𝓛, hence R ∈ ℋ. Let w ∈ B be any trajectory. By controllability there exist t₀ > 0 and a function c such that v := w∧₀c∧_{t₀}0 ∈ ker_𝓛 R. Using Proposition 2.14(2) twice, we obtain v = 0 and w = 0. Therefore ker_𝓛 R = {0}, and Lemma 2.12 shows R ∈ ℋˣ, as desired (cf. Remark 3.1.5). For the general case use a diagonal form URV = diag_{p×q}(d₁, …, d_r), where U and V are unimodular matrices and d₁, …, d_r ∈ ℋ\{0}. Since ker_𝓛 R is controllable, the same is true for the system V⁻¹(ker_𝓛 R) = ker_𝓛 diag_{p×q}(d₁, …, d_r), see Corollary 4.3.7. This implies the controllability of ker_𝓛 d_i ⊆ 𝓛 for each
i = 1, …, r, and now the rank condition on R follows from the first part of the proof.
The second part of the assertion can be deduced from Theorem 4.3.2(2) by using a full row rank kernel-representation and resorting to the rank criterion in Corollary 3.2.5(c) for right invertibility. □
B is controllable
~
\:1 w E B 3 to 2:: 0, c: [o,·to)
~
such that wl\ocl\t 0 0 E B. In the next remark we want to relate the controllability criterion above to some other results in the literature. Remark 4.3.10 (i) The criterion for controllability in Theorem 4.3.8 appeared first in [42, Thm. 5.5]. In the special case of behaviors having a polynomial kernelrepresentations it has been proven by completely different methods in [91]. The result generalizes the well-known Hautus-criterion for systems of ODEs to delay-differential systems; see [50] for state-space systems and [118, Prop. 4.3] for behavioral controllability of ODEs. For certain time-delay systems of the form x = Ax+ Bu with matrices A, B over JR.[z] or even 'Ho,p, it is also known to characterize spectral controllability [6, 74, 73], a notion referring to the controllability of certain finite-dimensional systems associated with the zeros of det(sl- A*(s)). In [85, Thm. 1] it has been shown that spectral controllability is identical to null controllability. The latter means that for every piecewise continuous initial condition there exists a piecewise continuous control u of bounded support in [0, oo) such that the corresponding solution x is of bounded support. (ii) It is easily seen that the constant rank assumption on R* for controllability R;r being torsion-free. is equivalent to the quotient module M := 1iq f. Im1-£ The connection between the system ker.c R and the module M has been explained in Remark 4.1.1. Recall in particular that for R being polynoR;r is taken as the definition of a mial, the quotient T := lR.[s, z]q f. ImJR[s,z] delay-differential system in [32, 80]. In [80], controllability, depending on an lR.[s, z]-algebra A, is defined algebraically as the torsion-freeness of the module A ®JR[s,z] T. Since M = 1i ®JR[s,z] T, behavioral controllability coincides with the algebraic notion of 'H-torsion-free controllability in [80]. 
(iii) For systems of PDEs, or generally for multidimensional systems, the notion of controllability or concatenability does not come as straightforward as for onedimensional systems (like ODEs and DDEs). Various notions of
controllability have been suggested in [124] (see also [129, Sec. 1.4]) and characterized algebraically and in structural terms similar to our Theorems 4.3.2 and 4.3.8. Some of the structural characterizations appeared first in [84, pp. 139]; controllability of smooth systems of PDEs has been investigated in detail also in [86].
(iv) For systems of DDEs with noncommensurate delays the existing results will be summarized in Remark 4.3.13 below.
It is an immediate consequence of Theorem 4.3.8 that two controllable systems in 𝓛^q are transfer equivalent if and only if they are identical. Put another way, the formal transfer function, taken after a suitable input/output partition, determines the (unique) controllable behavior B_c in the transfer class [B]. The proof of Theorem 4.3.2 shows, see (4.3.2), (4.3.3) and (4.3.4), how this controllable behavior can be obtained from a given system ker_𝓛 R, namely by cancelling the nonsingular left factors (if any) of R (which for R = [P, Q], of course, does not change the formal transfer function −Q⁻¹P). The minimality of B_c in the transfer class can be rephrased as follows: a system B is controllable if and only if it has no proper subsystem with the same number of free variables.
As we will show next, there is another way to characterize B_c. It says that B_c is simply the controllable part of B in the sense that it is the maximal controllable subbehavior contained in B. Recall from Remark 4.1.9 that ker_ℋ R is finitely generated for every matrix R.

Proposition 4.3.11
Let R ∈ ℋ^{p×q} be a matrix and put B = ker_𝓛 R. Let B_c be the (unique) controllable system in the transfer class [B]. Moreover, let ker_ℋ R = im_ℋ T ⊆ ℋ^q for some T ∈ ℋ^{q×t}. Then B_c = im_𝓛 T. Furthermore, one has B' ⊆ B_c for every controllable behavior B' contained in B. We call B_c the controllable part of B.
PROOF: By Theorem 4.3.2(3), each controllable behavior $\mathcal{B}'$ has an image-representation $\mathcal{B}' = \operatorname{im}_{\mathcal{L}} T'$ for some $T' \in \mathcal{H}^{q \times r}$. Hence $\mathcal{B}' \subseteq \mathcal{B}$ implies $RT' = 0$, so that $T' = TX$ for some $X \in \mathcal{H}^{t \times r}$ and $\mathcal{B}' = \operatorname{im}_{\mathcal{L}} T' \subseteq \operatorname{im}_{\mathcal{L}} T$. As a special case, we obtain $\mathcal{B}_c \subseteq \operatorname{im}_{\mathcal{L}} T$. On the other hand, if $R = BR_c$ is factored as in (4.3.2) and (4.3.3), then $\ker_{\mathcal{H}} R = \ker_{\mathcal{H}} R_c = \operatorname{im}_{\mathcal{H}} T$, whence $R_cT = 0$ and $\operatorname{im}_{\mathcal{L}} T \subseteq \ker_{\mathcal{L}} R_c = \mathcal{B}_c$. This concludes the proof. □

Remark 4.3.12
Another characterization of controllable behaviors can be found in [111, Thm. 3.5]. A behavior $\mathcal{B} \subseteq \mathcal{E}^q$ is controllable if and only if $\mathcal{B} = \overline{\mathcal{B} \cap \mathcal{D}^q}^{\,\mathcal{E}}$, where, again, $\mathcal{D} \subseteq \mathcal{E}$ is the space of $C^\infty$-functions having compact support and $\overline{\;\cdot\;}^{\,\mathcal{E}}$ denotes the closure with respect to the topology on $\mathcal{E}$. The only-if part follows in essence from the existence of image-representations and the denseness of $\mathcal{D}$ in $\mathcal{E}$. The proof of the other direction can be reduced via a diagonal form to the scalar case, where then $\ker_{\mathcal{E}} p \cap \mathcal{D} = \{0\}$ for each nonzero $p \in \mathcal{H}$ (Proposition 2.14(2)) is the key argument.

4 Behaviors of Delay-Differential Systems

Remark 4.3.13
In the same paper [111], controllable behaviors have been investigated for systems with noncommensurate delays and even for convolution systems of the type discussed in Remark 4.1.15(4). In this generality it is not known whether the properties (a) controllability, (b) having a kernel-representation with constant rank on $\mathbb{C}$, (c) having an image-representation, and (d) being the closure of its compact-support part, are equivalent. However, it has been shown in [111, Thms. 3.5, 3.6] that for $R \in (\mathcal{E}')^{p \times q}$ each of the following conditions implies the next one:
(i) $\ker_{\mathcal{E}} R = \operatorname{im}_{\mathcal{E}} Q$ for some $Q \in (\mathcal{E}')^{q \times l}$,
(ii) $\ker_{\mathcal{E}} R$ is controllable in the sense of Definition 4.3.4,
(iii) $\ker_{\mathcal{E}} R = \overline{\ker_{\mathcal{E}} R \cap \mathcal{D}^q}^{\,\mathcal{E}}$,
(iv) $\ker_{\mathcal{E}} R = \overline{\operatorname{im}_{\mathcal{E}} Q}^{\,\mathcal{E}}$ for some $Q \in (\mathcal{E}')^{q \times l}$,
(v) $\operatorname{rk} \mathcal{L}R(s)$ is constant on $\mathbb{C}$, where $\mathcal{L}R$ denotes the Laplace transform of $R$ (in this case, $\ker_{\mathcal{E}} R$ is called spectrally controllable).
If $R$ has full row rank, then one also has "(v) ⇒ (iv)". In the special case of a delay-differential operator $R \in \mathcal{H}^{1 \times q}$ (see the Remarks 3.1.8 and 4.1.15) it is proven in [41, Thm. 3.12] that "(iii) ⇔ (iv) ⇔ (v)", regardless of any rank constraint. The implication "(v) ⇒ (ii)", however, does not hold for general operators $R \in \mathcal{H}^{1 \times q}$, see the example in [41, Ch. 4].

Controllable systems are, in a certain sense, the extreme opposite of autonomous systems. Controllability describes the capability to switch from any trajectory to any other; in other words, the past of a trajectory has no lasting implications for the far future. Autonomy, on the other side, prohibits any switching at all, because, by definition, the past of a trajectory completely determines its future. These two extreme points on a scale of flexibility for behaviors can also be expressed in module-theoretic terms. It is easy to see that a system $\mathcal{A} = \ker_{\mathcal{L}} A$ is autonomous if and only if its annihilator in $\mathcal{H}$ is not trivial (indeed, if $A$ is nonsingular, then $\det A \in \operatorname{ann}(\mathcal{A}) \setminus \{0\}$; the other direction follows from Theorem 4.1.5(a)). On the other hand, it is not hard to show that a behavior $\mathcal{B}$ is controllable if and only if it is a divisible $\mathcal{H}$-module, that is, if each $a \in \mathcal{H} \setminus \{0\}$ acts surjectively on $\mathcal{B}$. Next we show that each behavior can be decomposed into a direct sum of its controllable part and an autonomous subsystem.
Theorem 4.3.14
Let $\mathcal{B} \subseteq \mathcal{L}^q$ be a behavior with controllable part $\mathcal{B}_c$. Then there exists an autonomous system $\mathcal{A} \subseteq \mathcal{L}^q$ such that
\[
  \mathcal{B} = \mathcal{B}_c \oplus \mathcal{A}.
  \tag{4.3.7}
\]
Furthermore, let $\mathcal{B} = \ker_{\mathcal{L}} R$ where $R = BR_c \in \mathcal{H}^{p \times q}$ is factored as in (4.3.2), (4.3.3). Then in every direct decomposition $\mathcal{B} = \mathcal{B}_{\mathrm{contr}} \oplus \mathcal{B}_{\mathrm{aut}}$ into a controllable and an autonomous subsystem, the controllable system is given by $\mathcal{B}_{\mathrm{contr}} = \mathcal{B}_c$, while the autonomous part is of the form $\mathcal{B}_{\mathrm{aut}} = \ker_{\mathcal{L}} A$ for some $A \in \mathcal{H}^{q \times q}$ satisfying $\det A = \det B$, up to units in $\mathcal{H}$.

PROOF: Consider the factorization $R = BR_c$ in (4.3.2), (4.3.3). Hence the controllable part of $\mathcal{B}$ is given by $\mathcal{B}_c = \ker_{\mathcal{L}} R_c$ by (4.3.4). Complete $R_c$ to a unimodular matrix
\[
  U := \begin{bmatrix} R_c \\ U' \end{bmatrix} \in \mathrm{Gl}_q(\mathcal{H})
\]
and partition the inverse as $U^{-1} = [Q', Q]$ such that $Q \in \mathcal{H}^{q \times (q-p)}$. Define the nonsingular matrix
\[
  A := U^{-1} \begin{bmatrix} BR_c \\ U' \end{bmatrix} \in \mathcal{H}^{q \times q}
\]
and put $\mathcal{A} := \ker_{\mathcal{L}} A$. Using the identities
\[
  \begin{bmatrix} R_c \\ U' \end{bmatrix} [\,Q', Q\,] = I_q = [\,Q', Q\,] \begin{bmatrix} R_c \\ U' \end{bmatrix}
  \qquad\text{and}\qquad
  A = Q'BR_c + QU',
\]
one immediately verifies $R_cA = R$ as well as
\[
  \begin{bmatrix} Q'(I_p - B) & I_q \\ B & -R_c \end{bmatrix}
  \begin{bmatrix} R_c \\ A \end{bmatrix}
  = \begin{bmatrix} I_q \\ 0 \end{bmatrix},
  \tag{4.3.8}
\]
the leftmost matrix being unimodular.
Thus by Theorem 3.2.8, $I = \operatorname{gcrd}(R_c, A)$ and $BR_c = R = \operatorname{lclm}(R_c, A)$, and (4.3.7) follows from Theorem 4.1.5(c) and (d).
Consider now a given decomposition $\mathcal{B} = \mathcal{B}_{\mathrm{contr}} \oplus \mathcal{B}_{\mathrm{aut}}$. As for the uniqueness of the controllable term, observe that on the one hand $\mathcal{B}_{\mathrm{contr}} \subseteq \mathcal{B}_c$ by Proposition 4.3.11. On the other hand, using once more Theorem 3.2.8 in combination with Theorem 4.1.5(d), one verifies $o(\mathcal{B}_{\mathrm{contr}}) = p$, so that $\mathcal{B}_{\mathrm{contr}} \in [\mathcal{B}]$ and therefore $\mathcal{B}_c \subseteq \mathcal{B}_{\mathrm{contr}}$ by Theorem 4.3.2. Hence $\mathcal{B}_{\mathrm{contr}} = \mathcal{B}_c$ is the controllable part of $\mathcal{B}$.
As for the autonomous part, write $\mathcal{B}_{\mathrm{aut}} =: \ker_{\mathcal{L}} A$ where $A \in \mathcal{H}^{q \times q}$. We have to show that $\det A = \det B$ up to units in $\mathcal{H}$. To this end, let $V, W \in \mathrm{Gl}_q(\mathcal{H})$ be such that $R_cW = [I_p, 0]$ and
\[
  VAW = \begin{bmatrix} A_1 & 0 \\ A_3 & A_4 \end{bmatrix},
\]
where $A_4 \in \mathcal{H}^{(q-p) \times (q-p)}$. Then, firstly, $\ker_{\mathcal{L}} R_cW \cap \ker_{\mathcal{L}} VAW = \{0\}$ gives $\det A_4 \in \mathcal{H}^\times$, see also Proposition 4.1.4. Secondly, one has
\[
  \ker_{\mathcal{L}} BR_cW
  = \ker_{\mathcal{L}} R_cW + \ker_{\mathcal{L}} VAW
  = \ker_{\mathcal{L}} \operatorname{lclm}(R_cW, VAW)
  = \ker_{\mathcal{L}} [\,A_1, 0\,].
\]
Hence $[A_1, 0]$ and $BR_cW$ are left equivalent. Since $\det B$ is the greatest common divisor of the full-size minors of $BR_cW$, this yields $\det A = \det A_1 = \det B$ up to units in $\mathcal{H}$, which is what we wanted. □
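The construction in the first part of the proof can be followed by hand in a small ODE sketch (a hypothetical illustration with $q = 2$, supplied for orientation):

```latex
% R = [s, s] = B R_c with B = s and R_c = [1, 1]. Complete R_c to U and invert:
U = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad
U^{-1} = [\,Q', Q\,] = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix},
\qquad U' = [\,0,\, 1\,].
% The autonomous complement is given by
A \;=\; Q'BR_c + QU'
  \;=\; \begin{bmatrix} 1 \\ 0 \end{bmatrix} s\,[\,1,\, 1\,]
        + \begin{bmatrix} -1 \\ 1 \end{bmatrix} [\,0,\, 1\,]
  \;=\; \begin{bmatrix} s & s - 1 \\ 0 & 1 \end{bmatrix},
\qquad \det A = s = \det B.
% Indeed R_c A = [s, s] = R, and the decomposition (4.3.7) reads
\ker_{\mathcal{L}} [\,s,\, s\,]
  \;=\; \underbrace{\{(w, -w) : w \in \mathcal{L}\}}_{\mathcal{B}_c = \ker_{\mathcal{L}}[1,\,1]}
  \;\oplus\;
  \underbrace{\{(c, 0) : \dot{c} = 0\}}_{\mathcal{A} = \ker_{\mathcal{L}} A}.
```

Any $(w_1, w_2)$ with $\dot{w}_1 + \dot{w}_2 = 0$ satisfies $w_1 + w_2 = c$ for a constant $c$, and splits uniquely as $(w_1 - c, w_2) + (c, 0)$.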
We close the section with

Remark 4.3.15
(a) The decomposition (4.3.7) is quite standard in behavioral systems theory, see [87, Thm. 5.2.14] for systems described by ODEs. The sum can also be derived for multidimensional systems given by PDEs, but in this case the decomposition is not always direct, see [123, Thm. 5.2].
(b) To some extent, one can regard the direct decomposition (4.3.7) as the "behavioral version" of the "classical" decomposition of a system into its forced and free motions, see, e.g., [52, Prop. 3.1] in a slightly different context. Indeed, denoting by $\mathcal{L}_+$ the $\mathcal{H}$-submodule of $\mathcal{L}$ consisting of all functions having support bounded on the left, it is easy to derive from (4.3.7) and (4.3.8) the relation $\ker_{\mathcal{L}} R \cap \mathcal{L}_+^q = \ker_{\mathcal{L}} R_c \cap \mathcal{L}_+^q$. This space can be viewed as the set of all forced motions of the system (including the forcing input, starting at some finite time $t_0 \in \mathbb{R}$), while $\mathcal{A} = \ker_{\mathcal{L}} A$ contains the free motions (including input which has been acting on the system forever). In case $\ker_{\mathcal{L}} R = \ker_{\mathcal{L}} [P, Q]$ is an i/o-system with kernel-representation $[P, Q] \in \mathcal{H}^{p \times (m+p)}$ and $\det Q \neq 0$, we know from Remark 4.2.4 that $\ker_{\mathcal{L}} R \cap \mathcal{L}_+^q$ is the graph of the convolution operator given by $(-Q^{-1}P)(\delta_0', \delta_1)$ restricted to $\mathcal{L}_+^m$. This way, we observe again that the formal transfer function is related merely to the controllable part of the system. Consequently, nonanticipation, as well, is a property related to the controllable part only.
4.4 Subbehaviors and Interconnections

So far we have only been concerned with the analysis of a single system. In this and the next section we will direct our attention to the interconnection of two systems, one of which is regarded as the given plant, the other as the to-be-designed controller. Indeed, a controller constitutes a system itself. It processes (part of) the output of the to-be-controlled system and computes (part of) the inputs for that system, with the purpose of achieving certain desired properties of the overall system, like for instance stability. Thus, the system and the controller are interconnected to form a new system. In the behavioral framework the interconnection can be written as the intersection of two suitably defined behaviors. The underlying idea is simply that the trajectories of the interconnection have to satisfy both sets of equations, those governing the system and those imposed by the controller. Depending on the type of interconnection or on the description of the components, the resulting system might be described with the help of some auxiliary (latent) variables, which hopefully can be eliminated in a second step so that one ends up with a kernel-representation for the external variables of the interconnection. This elimination procedure will be dealt with at the beginning of the section.
Thereafter we turn to the interconnection of systems and investigate the achievability of a given subsystem via regular interconnection from the overall system. The notion of regularity can be understood as requiring, in a certain sense, most efficient controllers. At the end of the section the dual of regular interconnections will be treated; these are direct sum decompositions of behaviors. It will be shown that the existence of direct sum decompositions is closely related to the notion of skew-primeness for matrices.
The following theorem shows that (and how) latent variables can be eliminated in certain situations. The cases considered are exactly those showing up in typical interconnections. The theorem will be particularly important in the next chapter, where we study latent variable systems of a specific type. In that context, a special role will be played by polynomial kernel-representations; therefore we also include the polynomial case in the theorem below. For the term "latent variable" we would like to recall the discussion following Definition 4.1 in the introduction to this chapter.

Theorem 4.4.1
(a) The image of a behavior under a delay-differential operator is again a behavior. Precisely, if $R_i \in \mathcal{H}^{p_i \times q}$ for $i = 1, 2$ are matrices of full row rank, then $R_1(\ker_{\mathcal{L}} R_2) = \ker_{\mathcal{L}} X$, where the matrix $X \in \mathcal{H}^{t \times p_1}$ is such that $XR_1$ is a least common left multiple of $R_1$ and $R_2$. Moreover, if $R_i \in \mathbb{R}[s, z]^{p_i \times q}$ and
\[
  \operatorname{rk}_{\mathbb{R}(s,z)} \begin{bmatrix} R_1 \\ R_2 \end{bmatrix}
  = \operatorname{rk}_{\mathbb{C}} \begin{bmatrix} R_1^* \\ R_2^* \end{bmatrix}(s)
  \quad \text{for all } s \in \mathbb{C},
\]
then the matrix $X$ can be chosen in $\mathbb{R}[s, z]^{t \times p_1}$.
(b) Let $R_i \in \mathcal{H}^{p \times p_i}$, $i = 1, 2$, be two matrices and assume $\operatorname{rk} [R_1, R_2] = p$. Furthermore, assume $[V_1^T, V_2^T]^T \in \mathrm{Gl}_p(\mathcal{H})$ is such that
\[
  \begin{bmatrix} V_1 \\ V_2 \end{bmatrix} R_2 = \begin{bmatrix} M \\ 0 \end{bmatrix}
  \quad \text{for some } M \in \mathcal{H}^{r \times p_2} \text{ with rank } r.
\]
Then
\[
  \mathcal{B} := \{\, w \in \mathcal{L}^{p_1} \mid R_1 w \in \operatorname{im}_{\mathcal{L}} R_2 \,\} = \ker_{\mathcal{L}} (V_2R_1).
\]
If we have additionally $R_i \in \mathbb{R}[s, z]^{p \times p_i}$ and $\operatorname{rk}_{\mathbb{R}(s,z)} R_2 = \operatorname{rk}_{\mathbb{C}} R_2^*(s)$ for all $s \in \mathbb{C}$, then the matrix $V_2$ can be chosen with entries in $\mathbb{R}[s, z]$, too.
As the proof will show, the rank condition on $R_1$ and $R_2$ in part (a), and hence also in (b), is not inherently necessary. It simply allows us to make use of the least common left multiple, which has been defined for this case only. Note that also the (extreme) case where $\operatorname{rk} [R_1^T, R_2^T]^T = p_1 + p_2$ is encompassed in the statement above, as in this situation the least common left multiple is the empty matrix while $R_1(\ker_{\mathcal{L}} R_2)$ is indeed all of $\mathcal{L}^{p_1}$.
The special case $R_1 = I$ in (b) shows again that $\operatorname{im}_{\mathcal{L}} R_2$ is a behavior, a fact indicated already in Remark 4.1.9. In light of Theorem 4.3.2 we see that the systems of this form (that is, having an image-representation) are just the controllable systems.
It is not possible to drop the pointwise rank condition imposed for the polynomial kernel-representations. For instance,
\[
  \operatorname{im}_{\mathcal{L}} \begin{bmatrix} z - 1 \\ s \end{bmatrix}
  = \ker_{\mathcal{L}} \Big[\, 1, \; -\tfrac{z-1}{s} \,\Big]
\]
(by Proposition 4.3.11), and because of Theorem 4.1.5(a) no polynomial kernel-representation can be found in this case.

PROOF OF THEOREM 4.4.1: (a) The first part is fairly standard and can be seen as follows. By the Bezout property of $\mathcal{H}$, we know that there exist matrices $U_i$ such that
\[
  \begin{bmatrix} U_1 & U_2 \\ U_3 & U_4 \end{bmatrix}
  \begin{bmatrix} R_1 \\ R_2 \end{bmatrix}
  = \begin{bmatrix} D \\ 0 \end{bmatrix}
  \quad \text{for some } D \in \mathcal{H}^{r \times q} \text{ with } \operatorname{rk} D = r
  \tag{4.4.1}
\]
and the leftmost matrix is in $\mathrm{Gl}_{p_1+p_2}(\mathcal{H})$. This provides $\operatorname{lclm}(R_1, R_2) = U_3R_1$ by Theorem 3.2.8(b). Using the surjectivity of $D$, see Proposition 4.1.4, we get for $w \in \mathcal{L}^{p_1}$
\[
  w \in R_1(\ker_{\mathcal{L}} R_2)
  \;\Longleftrightarrow\;
  \begin{pmatrix} w \\ 0 \end{pmatrix} \in \operatorname{im}_{\mathcal{L}} \begin{bmatrix} R_1 \\ R_2 \end{bmatrix}
  \;\Longleftrightarrow\;
  \begin{pmatrix} U_1w \\ U_3w \end{pmatrix} \in \operatorname{im}_{\mathcal{L}} \begin{bmatrix} D \\ 0 \end{bmatrix}
  \;\Longleftrightarrow\;
  w \in \ker_{\mathcal{L}} U_3,
\]
which proves the first assertion of (a).
Let us now turn to $R_1$ and $R_2$ being polynomial matrices. The existence of a polynomial kernel-representation for $R_1(\ker_{\mathcal{L}} R_2)$ will be proven once we have established that $[U_3, U_4]$ in (4.4.1) can be chosen polynomial. This can be accomplished as follows. We start with any equation of the type (4.4.1). Notice that $r = \operatorname{rk} [R_1^T, R_2^T]^T$. By virtue of Theorem 4.1.13(2) we know that $[R_1^T, R_2^T]^T$ is right equivalent over $\mathcal{H}$ to a matrix
\[
  \begin{bmatrix} \hat R_1 & 0 \\ \hat R_2 & 0 \end{bmatrix} \in \mathbb{R}[s, z]^{(p_1+p_2) \times q},
\]
where $[\hat R_1^T, \hat R_2^T]^T$ has full column rank $r$. The rank assumption on $[R_1^T, R_2^T]^T$ and the invariance of the invariant factors under equivalence imply the coprimeness of the full-size minors of $[R_1^T, R_2^T]^T$ in $\mathcal{H}$. Applying now Lemma 3.2.7(1) to the equation
\[
  [\,U_3, U_4\,] \begin{bmatrix} R_1 \\ R_2 \end{bmatrix} = 0
\]
shows that the full-size minors of $[U_3, U_4]$ are polynomial, so that by Theorem 4.1.13(1) the matrix $[U_3, U_4]$ is left equivalent to a polynomial matrix $[\tilde U_3, \tilde U_4]$. Now we can replace the unimodular matrix in (4.4.1) by
\[
  \begin{bmatrix} U_1 & U_2 \\ \tilde U_3 & \tilde U_4 \end{bmatrix}
\]
and obtain from the first part of the proof the identity $R_1(\ker_{\mathcal{L}} R_2) = \ker_{\mathcal{L}} \tilde U_3$, hence a polynomial kernel-representation.
(b) follows from (a) by observing that $\mathcal{B} = [\,I_{p_1}, 0\,](\ker_{\mathcal{L}} [R_1, -R_2])$. Note also that the matrix
\[
  \begin{bmatrix} I_{p_1} & 0 \\ R_1 & -R_2 \end{bmatrix}
\]
has constant rank whenever $R_2$ has. □
Let us now start with the investigation of interconnecting systems.

Definition 4.4.2 (see [120, p. 332])
The interconnection of two systems $\mathcal{B}_1, \mathcal{B}_2 \subseteq \mathcal{L}^q$ is defined to be the system $\mathcal{B} := \mathcal{B}_1 \cap \mathcal{B}_2$. The interconnection is called regular if $o(\mathcal{B}) = o(\mathcal{B}_1) + o(\mathcal{B}_2)$.
The concept of a regular interconnection is rather natural in the behavioral setting, as can be seen from Theorem 4.2.3. Indeed, the number $q$ of external variables minus the rank of a kernel-representation represents the number of input variables of a system. If one thinks of one of the interconnecting components as the controller, it is natural to require that each linearly independent equation of the controller should put a restriction onto one additional input channel, for otherwise the controller would be inefficient. Put another way, restrictions are imposed on what is not yet restricted. As a consequence, the resulting interconnection of $\mathcal{B}_1$ and $\mathcal{B}_2$ is left with $q - o(\mathcal{B}_1) - o(\mathcal{B}_2)$ input variables, which is exactly the regularity condition. Using once more Theorems 4.1.5 and 3.2.8, one obtains $o(\mathcal{B}_1 \cap \mathcal{B}_2) + o(\mathcal{B}_1 + \mathcal{B}_2) = o(\mathcal{B}_1) + o(\mathcal{B}_2)$ and therefore
\[
  \mathcal{B}_1 \cap \mathcal{B}_2 \text{ is a regular interconnection}
  \;\Longleftrightarrow\;
  \mathcal{B}_1 + \mathcal{B}_2 = \mathcal{L}^q.
  \tag{4.4.2}
\]
Hence the interconnection is regular if and only if the components add up to the full space $\mathcal{L}^q$.
As an example we want to discuss the classical feedback-configuration of two systems. It also exhibits how "interconnected" variables may turn into latent variables of the interconnection in the sense that they do not describe the external behavior of the new system.

Example 4.4.3
Given the two systems $\mathcal{B}_1 = \ker_{\mathcal{L}} [P_1, Q_1]$ and $\mathcal{B}_2 = \ker_{\mathcal{L}} [P_2, Q_2]$, where $q = p + m$ and $[P_1, Q_1] \in \mathcal{H}^{p \times (m+p)}$ and $[P_2, Q_2] \in \mathcal{H}^{m \times (p+m)}$. The classical feedback-interconnection given by $u := u_1 - y_2$, $y_1 = u_2 =: y$ is described, for the variables $(u, y, u_1, y_2)$, by the system
\[
  \ker_{\mathcal{L}} \begin{bmatrix} I & 0 & -I & I \\ 0 & Q_1 & P_1 & 0 \\ 0 & P_2 & 0 & Q_2 \end{bmatrix}.
\]
If one is interested in the new external variables $u$ and $y$ only, one eliminates the latent variables $u_1$ and $y_2$ by taking the projection
\[
  \mathcal{B} := \begin{bmatrix} I & 0 & 0 & 0 \\ 0 & I & 0 & 0 \end{bmatrix}
  \left( \ker_{\mathcal{L}} \begin{bmatrix} I & 0 & -I & I \\ 0 & Q_1 & P_1 & 0 \\ 0 & P_2 & 0 & Q_2 \end{bmatrix} \right).
\]
Using Theorem 4.4.1(a), one can find the kernel-representation
\[
  \mathcal{B} = \ker_{\mathcal{L}} \big[\, U_3P_1, \;\; U_3Q_1 + U_4P_2 \,\big],
  \qquad\text{where }
  U := \begin{bmatrix} U_1 & U_2 \\ U_3 & U_4 \end{bmatrix} \in \mathrm{Gl}_{m+p}(\mathcal{H})
  \text{ is such that }
  U \begin{bmatrix} P_1 \\ Q_2 \end{bmatrix} = \begin{bmatrix} D \\ 0 \end{bmatrix}
\]
for some full row rank matrix $D$. It describes the laws governing the external variables $(u, y)$ of the new system. It can easily be seen that the external behavior $\mathcal{B}$ is an i/o-system with output $y$ if and only if $\det(I - Q_1^{-1}P_1Q_2^{-1}P_2) \neq 0$. This is the usual well-posedness condition for this type of feedback-configuration in the classical transfer function approach. In the same way one can handle series- and parallel-interconnections. As this is completely analogous to the case of systems described by ODEs in [87, Exa. 6.2.9, Ex. 6.3, Ex. 6.4], the details will be omitted.

Obviously, an interconnection is a subsystem of either of its components. It is fairly simple to characterize algebraically those subsystems of a given system which can be achieved as a regular interconnection from that system. But it is also not hard to give a dynamical characterization purely in terms of the trajectories involved.

Theorem 4.4.4
Let $\hat{\mathcal{B}} \subseteq \mathcal{B} \subseteq \mathcal{L}^q$ be two behaviors and assume $\hat{\mathcal{B}} = \ker_{\mathcal{L}} \hat R$ where $\hat R \in \mathcal{H}^{\hat p \times q}$. Then the following statements are equivalent:
(a) there exists a system $\mathcal{B}' \subseteq \mathcal{L}^q$ such that $\hat{\mathcal{B}} = \mathcal{B} \cap \mathcal{B}'$ is a regular interconnection of $\mathcal{B}$ and $\mathcal{B}'$,
(b) the image $\hat R(\mathcal{B}) \subseteq \mathcal{L}^{\hat p}$ of $\mathcal{B}$ is controllable,
(c) $\mathcal{B} = \mathcal{B}_c + \hat{\mathcal{B}}$, where $\mathcal{B}_c$ denotes the controllable part of $\mathcal{B}$,
(d) $\mathcal{B}$ is $\hat{\mathcal{B}}$-controllable, that is, for each $w \in \mathcal{B}$ there exist $t_0 \geq 0$, $\hat w \in \hat{\mathcal{B}}$, and a function $c : (0, t_0) \to \mathbb{R}^q$ such that $w \wedge_0 c \wedge_{t_0} \hat w \in \mathcal{B}$.
In view of Remark 4.3.9 we see that controllability in the sense of the previous section is the same as $\{0\}$-controllability. The characterization above is close to what has been obtained for multidimensional systems in [92, Thm. 4.2], showing once more the structural analogy between these classes of systems. The equivalence of (a) and (b) can be derived by taking the duals of the behaviors and considering the corresponding problem in terms of finitely generated submodules of $\mathcal{H}^q$. However, we think it is reasonable to stay on the systems side in order to use one and the same language throughout the proof.

PROOF OF THEOREM 4.4.4: Let $\mathcal{B} = \ker_{\mathcal{L}} R$ for some $R \in \mathcal{H}^{p \times q}$ having full row rank. We may also assume without restriction that $\hat R$ has full row rank and is contained in $\mathcal{H}_0^{\hat p \times q}$, thus $\hat R$ does not contain any negative powers of $z$. The latter will simplify the application of $\hat R$ to a concatenation later in the proof. The inclusion $\hat{\mathcal{B}} \subseteq \mathcal{B}$ implies a relation $X\hat R = R$ where $X \in \mathcal{H}^{p \times \hat p}$ is a full row rank matrix. Note that $R = \operatorname{lclm}(\hat R, R)$ and therefore $\hat R(\mathcal{B}) = \ker_{\mathcal{L}} X \subseteq \mathcal{L}^{\hat p}$ by Theorem 4.4.1(a).
"(a) ⇒ (b)": Let $\mathcal{B}' = \ker_{\mathcal{L}} R'$ where $R' \in \mathcal{H}^{p' \times q}$ is a matrix of rank $p'$. Then
\[
  \hat{\mathcal{B}} = \ker_{\mathcal{L}} \hat R = \ker_{\mathcal{L}} \begin{bmatrix} R \\ R' \end{bmatrix}
  \qquad\text{and}\qquad
  \hat p = p + p'
\]
by regularity of the interconnection. Hence Theorem 4.1.5(a) yields that the matrices $\hat R$ and $[R^T, R'^T]^T$ are left equivalent. Thus $X$ is a block row of a unimodular matrix and therefore $\ker_{\mathcal{L}} X = \hat R(\mathcal{B})$ is controllable by virtue of Theorem 4.3.8 and Corollary 3.2.5.
"(b) ⇒ (a)" follows by completing $X$ to a unimodular matrix $[X^T, Y^T]^T$ (see Corollary 3.2.5) and defining $R' := Y\hat R$.
"(b) ⇒ (c)": Let $R = BR_c$ be factored as in (4.3.2) and (4.3.3), thus $\mathcal{B}_c = \ker_{\mathcal{L}} R_c$ is the controllable part of $\mathcal{B}$. Then the condition $\mathcal{B} = \mathcal{B}_c + \hat{\mathcal{B}}$ is equivalent to $R$ being an $\operatorname{lclm}(R_c, \hat R)$ (up to unimodular left factors), see Theorem 4.1.5(d). But the latter follows from the right invertibility of $X$, since every $\operatorname{lclm}(R_c, \hat R)$ is of the form $L = A\hat R \in \mathcal{H}^{p \times q}$ and a right divisor of $R = X\hat R = BR_c$.
"(c) ⇒ (d)": Pick a trajectory $w \in \mathcal{B}$. By assumption there exist $w_c \in \mathcal{B}_c$ and $\hat w \in \hat{\mathcal{B}}$ such that $w = w_c + \hat w$. Controllability of $\mathcal{B}_c$ implies the existence of a trajectory $v := w_c \wedge_0 c \wedge_{t_0} 0 \in \mathcal{B}_c$. As a consequence, $v + \hat w = w \wedge_0 c' \wedge_{t_0} \hat w \in \mathcal{B}$, which proves (d).
"(d) ⇒ (b)": Let $v = \hat Rw \in \hat R(\mathcal{B})$ for some $w \in \mathcal{B}$. By assumption there exists a trajectory $\hat w \in \hat{\mathcal{B}}$ such that $w_1 := w \wedge_0 c \wedge_{t_0} \hat w \in \mathcal{B}$ for some $t_0 > 0$ and a suitable function $c$ defined on $[0, t_0)$. Now we can apply Lemma 4.3.6 and obtain $\hat Rw_1 = \hat Rw \wedge_0 c' \wedge_{t_1} \hat R\hat w \in \hat R(\mathcal{B})$ for some $t_1 \geq 0$ and a function $c'$. (At this point it is convenient, but not necessary, to have the entries of $\hat R$ in $\mathcal{H}_0$ in order to avoid any backward shifts of the concatenating time instants.) Since $\hat R\hat w = 0$, the last part shows that every trajectory in $\hat R(\mathcal{B})$ can be steered to zero, which by Remark 4.3.9 is equivalent to controllability of $\hat R(\mathcal{B})$. □
Remark 4.4.5
Note that the map
\[
  \hat R(\ker_{\mathcal{L}} R) \;\longrightarrow\; \ker_{\mathcal{L}} R / \ker_{\mathcal{L}} \hat R,
  \qquad
  \hat Rw \;\longmapsto\; w + \ker_{\mathcal{L}} \hat R
\]
is an isomorphism of $\mathcal{H}$-modules. Therefore, "quotient behaviors can be identified with real behaviors" (with a different number of external variables) and the controllability condition in part (b) above could be expressed in terms of the quotient behavior.
Since the image of a controllable behavior is controllable again (see Corollary 4.3.7), the following additional characterization is immediate from the theorem above. Notice that by part (b) below the term controllability can now be understood in a twofold way. Firstly, it describes the ability to steer trajectories (Definition 4.3.4), and secondly, it expresses the achievability of all subsystems via regular interconnections. In other words, it guarantees the very existence of controllers.
Corollary 4.4.6
The following conditions on a system $\mathcal{B} \subseteq \mathcal{L}^q$ are equivalent:
(a) $\mathcal{B}$ is controllable,
(b) each subbehavior $\hat{\mathcal{B}} \subseteq \mathcal{B}$ can be achieved via regular interconnection from $\mathcal{B}$,
(c) $\{0\} \subseteq \mathcal{B}$ can be achieved via regular interconnection from $\mathcal{B}$.

Remark 4.4.7
Consider once more the situation of Theorem 4.4.4. In case $\hat{\mathcal{B}} = \mathcal{B} \cap \mathcal{B}'$ is a regular interconnection, the output number of $\hat{\mathcal{B}}$ is, by definition of regularity, the sum of the output numbers of the components $\mathcal{B}$ and $\mathcal{B}'$. This, however, does not guarantee that the outputs of the given subsystem $\hat{\mathcal{B}}$ are made up of the outputs of the two components. But this can always be achieved by a suitable choice of the component $\mathcal{B}'$. Even more can be accomplished. If $\hat{\mathcal{B}} \subseteq \mathcal{B}$ are both nonanticipating i/o-systems, then the controller $\mathcal{B}'$ can be chosen in this form, too (and, of course, such that the outputs match). This can easily be shown in exactly the same way as described for systems of ODEs in [120, Thm. 9]; see also Proposition 4.2.5(b) for the condition of nonanticipation. It is worth mentioning that in general it is not possible to have all components strongly nonanticipating i/o-systems (see Remark 4.2.4) at the same time. This fails even for systems of ODEs, as can be seen from the example
\[
  \hat{\mathcal{B}} := \ker_{\mathcal{L}} \begin{bmatrix} 2 & s^3 + 1 & s^2 \\ 1 & -s^2 & s + 1 \end{bmatrix}
  \;\subseteq\;
  \mathcal{B} := \ker_{\mathcal{L}} [\,2, \; s^3 + 1, \; s^2\,].
\]
In this case strong nonanticipation of $\hat{\mathcal{B}}$ and $\mathcal{B}$ requires by Proposition 4.2.5(c) that the second and third external variables are the outputs of $\hat{\mathcal{B}}$, while the second one is the output of $\mathcal{B}$. But it is not possible to find a strongly nonanticipating interconnecting system $\mathcal{B}'$ having the third variable as output.
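Condition (c) of Corollary 4.4.6 can also be traced in a minimal sketch (again a hypothetical ODE example, supplied for orientation):

```latex
% The controllable system B = ker [s, -1] admits the controller B' = ker [1, 0]:
\mathcal{B} \cap \mathcal{B}'
  \;=\; \ker_{\mathcal{L}} \begin{bmatrix} s & -1 \\ 1 & 0 \end{bmatrix}
  \;=\; \{0\},
% since this kernel-representation has unit determinant. The interconnection is
% regular because
o(\mathcal{B} \cap \mathcal{B}') \;=\; 2 \;=\; 1 + 1 \;=\; o(\mathcal{B}) + o(\mathcal{B}'),
% so {0} is achieved from B via a regular interconnection, as (c) requires.
```

Here $\mathcal{B}$ is controllable since $[s, -1]$ is right invertible, so the corollary guarantees in advance that such a controller exists.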
After these considerations on interconnections we now turn to a problem that can be regarded, in a sense made precise below, as the dual of achievability via regular interconnections. Given a behavior $\mathcal{B}_0$ with subbehavior $\mathcal{B}_1 \subseteq \mathcal{B}_0$, we ask for conditions which guarantee that $\mathcal{B}_1$ is a direct summand of $\mathcal{B}_0$ in the "behavioral sense", that is,
\[
  \mathcal{B}_0 = \mathcal{B}_1 \oplus \mathcal{B}_2
  \quad \text{for some behavior } \mathcal{B}_2 \subseteq \mathcal{B}_0.
  \tag{4.4.3}
\]
In this case we simply call $\mathcal{B}_1$ a direct term of $\mathcal{B}_0$. In terms of the duals $M_i = \mathcal{B}_i^\perp \subseteq \mathcal{H}^q$, the question above can be posed as follows: given finitely generated modules $M_0 \subseteq M_1 \subseteq \mathcal{H}^q$, find a finitely generated submodule $M_2 \subseteq \mathcal{H}^q$ such that $M_1 + M_2 = \mathcal{H}^q$ and $M_1 \cap M_2 = M_0$. This is exactly the condition of achievability via regular interconnections where now behaviors are replaced by modules (see also (4.4.2) for the regularity condition). The problem stated above on direct terms might not be of system-theoretic significance by itself, but nevertheless we believe it is natural to investigate.
Example 4.4.8
(a) For $\mathcal{B}_0 = \mathcal{L}^q$, the class of all direct terms of $\mathcal{B}_0$ is immediately seen to be the class of all controllable systems. Indeed, $\ker_{\mathcal{L}} R_1 \oplus \ker_{\mathcal{L}} R_2 = \mathcal{L}^q$ is equivalent to $\operatorname{gcrd}(R_1, R_2) = I_q$ and $\operatorname{lclm}(R_1, R_2)$ being the empty matrix. But this simply means that $[R_1^T, R_2^T]^T$ is unimodular, so that by Corollary 3.2.5 and Theorem 4.3.8 the behaviors $\ker_{\mathcal{L}} R_1$ and $\ker_{\mathcal{L}} R_2$ are controllable.
(b) In the previous section it has been shown that the controllable part of a system is always a direct term, the complementary term being autonomous, see Theorem 4.3.14. The theorem below will show that even each controllable subsystem is a direct term.
(c) Consider an autonomous system $\mathcal{B}_0 \subseteq \mathcal{L}^q$, given by $\mathcal{B}_0 = \ker_{\mathcal{L}} A$, hence the matrix $A \in \mathcal{H}^{q \times q}$ is nonsingular. Choose a frequency $\lambda \in \mathbb{C}$ with $\operatorname{ord}_\lambda(\det A^*) =: k > 0$. It is intuitively clear that there exists an exponential solution $w(t) = w_0e^{\lambda t}$ in $\mathcal{B}_0$. We will show even more. By some matrix calculations it is possible to derive a direct decomposition of $\ker_{\mathcal{L}} A$ that extracts exactly the solutions having frequency $\lambda$. To this end, let $U, V \in \mathrm{Gl}_q(\mathcal{H})$ be such that $UAV = \operatorname{diag}_{q \times q}(a_1^*, \ldots, a_q^*) = \Delta$ is diagonal. Extracting from each $a_i^*$ the (possible) root $\lambda$ with maximal multiplicity, we obtain a factorization
\[
  a_i^* = (s - \lambda)^{k_i} a_i, \qquad i = 1, \ldots, q,
\]
where $a_i \in \mathcal{H}$ and $a_i(\lambda) \neq 0$. In particular, we have $\sum_{i=1}^{q} k_i = k$. The coprimeness of $a_i$ and $(s - \lambda)^{k_i}$ induces the direct sum decompositions $\ker_{\mathcal{L}} a_i \oplus \ker_{\mathcal{L}} (s - \lambda)^{k_i} = \ker_{\mathcal{L}} a_i^*$ for the components, see Theorem 4.1.5(c) and (d). Putting $\tilde\Delta := \operatorname{diag}(a_1, \ldots, a_q)$ and $\Lambda := \operatorname{diag}((s - \lambda)^{k_1}, \ldots, (s - \lambda)^{k_q})$, this in turn implies $\ker_{\mathcal{L}} \Delta = \ker_{\mathcal{L}} \tilde\Delta \oplus \ker_{\mathcal{L}} \Lambda$, and we finally get the direct sum decomposition
\[
  \ker_{\mathcal{L}} A = \ker_{\mathcal{L}} (\tilde\Delta V^{-1}) \oplus \ker_{\mathcal{L}} (\Lambda V^{-1}).
  \tag{4.4.4}
\]
Since $\det(\Lambda V^{-1}) = (s - \lambda)^k \in \mathbb{R}[s]$ (up to a unit in $\mathcal{H}$), we know by Lemma 4.1.10 that $\ker_{\mathcal{L}} (\Lambda V^{-1}) = \ker_{\mathcal{L}} \bar A$ for some purely differential operator $\bar A \in \mathbb{R}[s]^{q \times q}$. Hence this behavior is a $k$-dimensional vector space consisting solely of functions of the type $w(t) = p(t)e^{\lambda t}$ where $p \in \mathbb{C}[t]^q$. On the other hand, the first component $\ker_{\mathcal{L}} (\tilde\Delta V^{-1})$ in (4.4.4) does not contain any (vector-valued) exponential polynomial of frequency $\lambda$; this follows from the inclusion $\ker_{\mathcal{L}} (\tilde\Delta V^{-1}) \subseteq \ker_{\mathcal{L}} (\det(\tilde\Delta V^{-1})I_q)$. For autonomous systems of ODEs one can in this way successively derive a complete direct sum decomposition according to the finitely many various frequencies of the system. This is, of course, nothing else but the well-known expansion of the solutions into finite sums of exponential polynomials. Remark that the decomposition (4.4.4) implies the identities $A = \operatorname{lclm}(\tilde\Delta V^{-1}, \Lambda V^{-1})$ and $I = \operatorname{gcrd}(\tilde\Delta V^{-1}, \Lambda V^{-1})$ by virtue of Theorem 4.1.5. In this particular case this is also clear from the fact that $\tilde\Delta$ and $\Lambda$ commute.

In order to attack the question posed above, let us first rewrite (4.4.3). Choosing full row rank kernel-representations $\mathcal{B}_i = \ker_{\mathcal{L}} R_i$, we see that, as in the previous example, the decomposition (4.4.3) is equivalent to $\operatorname{gcrd}(R_1, R_2) = I_q$ and $\operatorname{lclm}(R_1, R_2) = R_0$. Let furthermore $R_0 = XR_1$ be the factorization implied by the inclusion $\mathcal{B}_1 \subseteq \mathcal{B}_0$. In the scalar case the existence of $R_2$ satisfying the above requirements is identical to the coprimeness of $X$ and $R_1$. In the matrix case this generalizes to a certain skew-primeness between these two matrices, which then provides a criterion for a direct sum (4.4.3) in terms of the given data $R_1$ and $R_0$. This is the content of Theorem 4.4.9 below. The role played by the quotient $\mathcal{B}_0/\mathcal{B}_1$ will be discussed in Remark 4.4.10 right after the proof. The (straightforward) equivalence (a) ⇔ (b) is the analogue of a corresponding result for two-dimensional discrete-time systems in [108, Thm. 18.3.4].

Theorem 4.4.9
= 0, 1, be two matrices with full row rank. Define the associated behaviors Bi = ker .c Ri ~ £q and assume X R1 = Ro for some X E JtPo xp 1 , thus B1 ~ Bo. Then the following conditions are equivalent: (a) B1 is a direct term of Bo, (b) the matrices X and R 1 are skew-prime, that is, there exist matrices F E JtP 1 xpo and G E 1tqxp1 such that (4.4.5)
(c) there exists a matrix G E 1tq x Pl such that
Bo = B1
$
GR1 (Bo).
Furthermore, every direct term B1 ~ Bo is of the form H E 1tqxq. Moreover, every controllable subbehavior B1 and in case Bo is controllable, every direct term of Bo is
B1 = H(Bo) for some is a direct term of B0, controllable, too.
Remark that the skew-primeness condition does not depend on the choice of $R_1$ and $R_0$, which, being of full row rank, are left equivalent to every other chosen representation.

PROOF: "(a) ⇒ (b)": Let $\mathcal{B}_0 = \mathcal{B}_1 \oplus \mathcal{B}_2$ where $\mathcal{B}_2 = \ker_{\mathcal{L}} R_2$ and $R_2 \in \mathcal{H}^{p_2 \times q}$ has full row rank. Then Theorem 4.1.5 yields $\operatorname{gcrd}(R_1, R_2) = I_q$ and $\operatorname{lclm}(R_1, R_2) = XR_1$. From Theorem 3.2.8 we get that $p_0 = p_1 + p_2 - q$ and an equation of the form
\[
  \begin{bmatrix} G & Z \\ \tilde X & N \end{bmatrix}
  \begin{bmatrix} R_1 \\ R_2 \end{bmatrix}
  = \begin{bmatrix} I_q \\ 0 \end{bmatrix},
\]
where the leftmost matrix is in $\mathrm{Gl}_{p_1+p_2}(\mathcal{H})$ and partitioned according to $G \in \mathcal{H}^{q \times p_1}$. Again Theorem 3.2.8 implies that the matrix $\tilde XR_1$ is an $\operatorname{lclm}(R_1, R_2)$, and hence by the uniqueness of the least common left multiple we can assume without loss of generality that $\tilde X = X$. Completing $[R_1^T, R_2^T]^T$ to a unimodular matrix (which is possible by Corollary 3.2.5), we get, after some elementary column transformations if necessary, a matrix identity of the form
\[
  \begin{bmatrix} R_1 & F \\ R_2 & Y \end{bmatrix}
  \begin{bmatrix} G & Z \\ X & N \end{bmatrix}
  = I_{p_1+p_2}
  = \begin{bmatrix} G & Z \\ X & N \end{bmatrix}
  \begin{bmatrix} R_1 & F \\ R_2 & Y \end{bmatrix}
  \tag{4.4.6}
\]
with matrices $F$ and $N$ of fitting sizes. This shows (b).
"(b) ⇒ (c)": The equation (4.4.5) shows that both matrices $[R_1, F]$ and $[G^T, X^T]^T$ can be completed to unimodular matrices. Choosing the completions appropriately, we arrive again at Equation (4.4.6) with suitable matrices $R_2$, $N$, $Y$, and $Z$. For the verification of the direct sum in (c) we use the identity $R_0 = XR_1$ and calculate for $w_0 \in \mathcal{B}_0$:
(i) $R_1GR_1w_0 = (I - FX)R_1w_0 = R_1w_0$, implying the directness of the sum,
(ii) $R_1(I - GR_1)w_0 = (I - R_1G)R_1w_0 = FXR_1w_0 = 0$, hence $\mathcal{B}_0$ is contained in the sum,
(iii) $R_0GR_1w_0 = R_0(GR_1 - I)w_0 = X(R_1G - I)R_1w_0 = 0$ by (ii), thus $GR_1(\mathcal{B}_0) \subseteq \mathcal{B}_0$.
Since Theorem 4.4.1(a) guarantees that $GR_1(\mathcal{B}_0)$ is a behavior, the implication "(c) ⇒ (a)" is clear.
In order to establish the representation $\mathcal{B}_1 = H(\mathcal{B}_0)$ for a given direct term $\mathcal{B}_1$ of $\mathcal{B}_0$, consider again (4.4.6) and define $H := ZR_2 = I - GR_1$. The inclusion $\mathcal{B}_1 \supseteq H(\mathcal{B}_0)$ is immediate by (ii) above, while the converse follows from $\mathcal{B}_1 \subseteq \ker_{\mathcal{L}} GR_1 \subseteq \ker_{\mathcal{L}} (I - ZR_2)$. The remaining assertions are consequences of the above in combination with Theorem 4.3.8 and Corollary 4.3.7. □

Remark 4.4.10
Unfortunately we are not able to provide an intrinsic characterization of $\mathcal{B}_1$ being a direct term of $\mathcal{B}_0$, that is to say, a criterion purely in terms of the trajectories. However, the skew-primeness of the matrices $X$ and $R_1$ can be given a behavioral interpretation. Note that the existence of a direct decomposition does not only require the splitting of the exact sequence
\[
  0 \longrightarrow \mathcal{B}_1 \longrightarrow \mathcal{B}_0 \longrightarrow \mathcal{B}_0/\mathcal{B}_1 \longrightarrow 0,
\]
but also requires the quotient $\mathcal{B}_0/\mathcal{B}_1$ to be isomorphic to a behavior contained in $\mathcal{B}_0$ that, additionally, intersects trivially with $\mathcal{B}_1$. From Remark 4.4.5 we know that the quotient can be regarded as the behavior $R_1(\mathcal{B}_0)$ contained in $\mathcal{L}^{p_1}$. Thanks to Equation (4.4.5) it is indeed possible to embed this space as a behavior in $\mathcal{B}_0$, complementary to $\mathcal{B}_1$. Precisely, the operator $G$ induces an $\mathcal{H}$-isomorphism from $R_1(\mathcal{B}_0)$ onto the behavior $GR_1(\mathcal{B}_0) \subseteq \mathcal{B}_0 \subseteq \mathcal{L}^q$.
The theorem above tells us how to check whether or not $\mathcal{B}_1$ is a direct term of $\mathcal{B}_0$, and, if so, how to determine a complementary term. One has to check the solvability of the skew-primeness equation and to find a solution, if it exists. Since this equation is linear, this is not a problem (apart from computational issues, see Section 3.6). For matrices over $K[x]$, where $K$ is a field, a nice criterion for solvability has been derived in [94]. Studying the proof in [94], one remarks that it works equally well for the ring $H(\mathbb{C})$ of entire functions and, as a consequence, also for $\mathcal{H}$. The result will be summarized next. We will confine ourselves to sketching the main idea of the proof in [94] along with its adaptation to our situation. For the details the reader is asked to consult [94].

Theorem 4.4.11
Let $A \in \mathcal{H}^{l \times n}$, $B \in \mathcal{H}^{n \times m}$, and $C \in \mathcal{H}^{n \times n}$ be given matrices. Then the matrix equation
\[
  C = FA + BG
  \tag{4.4.7}
\]
is solvable over $\mathcal{H}$ if and only if the matrices
\[
  \begin{bmatrix} B & C \\ 0 & A \end{bmatrix}, \;\;
  \begin{bmatrix} B & 0 \\ 0 & A \end{bmatrix}
  \in \mathcal{H}^{(n+l) \times (m+n)}
  \tag{4.4.8}
\]
SKETCH OF PROOF:
J-F] [BOACl [JOJ-G] [BoA· 0] [OJ =
2) For sufficiency one may assume rk A = a > 0, rk B = f3 > 0 and that the matrices A and B are in diagonal form with invariant factors a 1 , ... , aa and b1, ... , bf3, respectively. Hence a1 1?-£ .•. 1?-£ aa and b1 1?-£ ... 1?-£ bf3. Now, solving (4.4. 7) reduces to finding hi and gii such that
\[
  f_{ij}a_j + b_ig_{ij} = c_{ij},
  \tag{4.4.9}
\]
where C = (cij) and aj = 0 = bi for j >a: and i > {3. The solvability of (4.4.9) is established in [94] for the ring K[x] by showing that the equivalence of the matrices in (4.4.8) implies that for each irreducible polynomial 'Y E K[x] which occurs with maximal power r in aj and bi, the element 'Yr is also a divisor of Cij· Thus, Cij is in the ideal generated by aj and bi. As for the ring 7-l, one can use the same line of arguments to show that
Hence cij is in the ideal (gcdH(q(aj,bi)) generated by aj and bi in H(C) and Proposition 3.1.2(i) together with the Bezout property of 7-l yields Cij E (aj, bi)?-f., thus the solvability of (4.4.9). D We remark that the proof is not suitable as a procedure for solving (4.4. 7) for it requires a diagonal reduction of A and B, which would comprise the main bulk of the computations. For certain square nonsingular matrices over the polynomial ring K[x] alternative procedures for solving the skew-prime equation are given in [121]. These procedures were motivated by the observation that the skewprime equation over K[x] has arisen in several places in systems theory; see the introduction in [121] and the references therein.
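The equivalence test of Theorem 4.4.11 can be tried out concretely over K[x] in place of H. The following sketch (an illustration, not the book's algorithm; sympy and the helper names are my own choices) compares determinantal divisors — the gcds of all k×k minors, which determine the invariant factors — of the two matrices in (4.4.8):

```python
# Hedged sketch of the solvability test of Theorem 4.4.11 over Q[x]:
# the matrices in (4.4.8) are equivalent iff their determinantal
# divisors (gcds of all k x k minors) coincide up to units.
from itertools import combinations
import sympy as sp

x = sp.symbols('x')

def determinantal_divisors(M):
    """Return [d_1, ..., d_r], where d_k = monic gcd of all k x k minors."""
    divs = []
    for k in range(1, min(M.shape) + 1):
        g = sp.Integer(0)
        for rows in combinations(range(M.rows), k):
            for cols in combinations(range(M.cols), k):
                g = sp.gcd(g, M[list(rows), list(cols)].det())
        divs.append(sp.monic(g, x) if g != 0 else g)
    return divs

def skew_prime_solvable(A, B, C):
    """Check solvability of C = F*A + B*G via equivalence of (4.4.8)."""
    Z = sp.zeros(A.rows, B.cols)
    M1 = sp.Matrix(sp.BlockMatrix([[B, C], [Z, A]]).as_explicit())
    M2 = sp.Matrix(sp.BlockMatrix([[B, sp.zeros(B.rows, C.cols)], [Z, A]]).as_explicit())
    return determinantal_divisors(M1) == determinantal_divisors(M2)

# Scalar sanity checks: 1 = F*x + (x+1)*G is solvable (gcd(x, x+1) = 1),
# while 1 = F*x + x*G is not.
assert skew_prime_solvable(sp.Matrix([[x]]), sp.Matrix([[x + 1]]), sp.Matrix([[1]]))
assert not skew_prime_solvable(sp.Matrix([[x]]), sp.Matrix([[x]]), sp.Matrix([[1]]))
```

The determinantal-divisor route avoids computing a full diagonal (Smith) reduction, mirroring the remark that equivalence is decided by the invariant factors alone.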
4.5 Assigning the Characteristic Function

This section is devoted to a special case of regular interconnection. We want to design autonomous interconnections with a prescribed characteristic polynomial. The first requirement, autonomy, simply says that all inputs of the original system are restricted by the controller, i.e. no free variables are left in the interconnection. This implies that the interconnection is a system of the form ker_L A, where A ∈ H^{q×q} is a nonsingular matrix. In this case, the characteristic function det A* ∈ H(ℂ) provides some first structural information about the system; for instance, whether it is finite-dimensional, hence a system of ODEs, see Proposition 4.2.7(b), and if so, whether it is stable, which can be seen from the location of the zeros of det A* in the complex plane. It is natural to ask whether a stability criterion in terms of the characteristic zeros is also true for autonomous delay-differential systems. This will be dealt with in the first part of this section. Thereafter we turn to the problem of assigning characteristic functions via interconnections. More precisely, given a system ker_L R, where R ∈ H^{p×q}, we will ask ourselves which functions a ∈ H are achievable as a = det[R^T, C^T]^T by suitable choice of the controller C ∈ H^{(q-p)×q}. One might also ask for certain additional properties of the controller, like a (nonanticipating) i/o-structure.
4 Behaviors of Delay-Differential Systems
The existence of a controller such that the interconnection is stable turns out to be related to so-called stabilizability. Following [87] we will define stabilizability of a behavior as the possibility to steer its trajectories asymptotically to zero. In contrast to systems of ODEs, however, it is not clear whether this is equivalent to the existence of stabilizing controllers. Only partial results will be given below. In the last part of this section we concentrate on first-order systems of the type ẋ = A(σ)x + B(σ)u, where (A, B) ∈ ℝ[z]^{n×(n+m)}. In this particular case controllers of a specific type are sought such that the interconnection attains a prescribed polynomial a ∈ ℝ[s, z]. In the case a ∈ ℝ[s], this is the well-known problem of finite spectrum assignment, investigated in much detail in the context of infinite-dimensional systems. We show how the problem fits into our algebraic approach and provide a solution that combines the algebraic methods with a type of Heymann Lemma known for controllable delay-differential systems. We start with stability of delay-differential systems.

Definition 4.5.1
(a) A system B ~ £q is called stable iflimt~oo w(t) = 0 for all wE B. (b) {87, 5.2.29} A system B ~ £q is called stabilizable if for all w E B there exists a trajectory w' E B such that
w' (t)
=
w (t) for all t :::; 0 and lim w' (t) = 0. t~oo
We should mention that stability as defined above is usually called asymptotic stability in the literature. Since we are not dealing with stability in the sense that the solutions stay bounded, we will skip the adjective asymptotic. Notice that stabilizability says that every trajectory in 13 can be steered asymptotically to zero and, as a consequence, asymptotically to every other trajectory in the behavior. Clearly, stability implies autonomy. The following necessary condition for stability does not come as a surprise. Recall the notation given in Definition 2.3. Proposition 4.5.2 If A E Hqxq is a matrix such that ker.c A is stable then V(det A*) ~ C_, where c_ := {A E c I Re (A) < 0} denotes the open left half-plane. We call det A* the characteristic function of the autonomous system ker .c A.
We saw in Example 4.4.8(c) that for all A E V(detA*) there exists an exponential monomial with frequency A in ker .c A. Hence stability implies Re(A) < 0. o PROOF:
It is well-known that in general the condition V(p*) ~ C_ for p E H is not sufficient for stability of ker.c p, not even if p is a polynomial. Indeed, there
4.5 Assigning the Characteristic Function
117
exist unstable equations with characteristic variety in the open left half-plane, see [13, Sec. IV). The key point is that the variety stay away from the imaginary axis in order to ensure stability. For polynomial delay-differential operators p E JR[s, z, z- 1] this has been shown in the book [3, Ch. 6). For general delaydifferential operators p E 1i this follows from the next result, proven in [110, Prop. 2). Proposition 4.5.3 Let p E 1i be such that V(p*) ~ Cc := {.A E C I Re (.A) :::; c} for some constant c E JR. Then for all w E ker c p and for all b > c there exists a constant K > 0 such that lw(t)l :::; K ebt for all t > 0. The proof of this result is beyond the scope of this book. It needs some detailed knowledge about the location of the zeros of exponential polynomials as derived in [3, Ch. 12] as well as a result of Hille and Phillips on invertibility in a certain distribution algebra (see also [20, App. A.7.4]). Corollary 4.5.4 Let A E 1iq x q be a matrix satisfying V ( (det A)*) Then kerc A is stable.
~
Cc for some constant c < 0.
This follows at once from the inclusion kerc A~ kerc ((det A)Iq)· It is worth being added that for polynomial retarded equations (see Remark 2.2) the condition V(p*) ~ C_ is equivalent to stability, see [3, Cor. 6.1).
Next we turn to the notion of stabilizability. There are in essence two ways to discuss this property of a behavior. On the one hand, the notion itself suggests that the system can be made stable in a certain sense. This is the issue of the existence of stabilizing controllers and will be addressed in Theorem 4.5. 7. On the other hand, every behavior B decomposes into its controllable part Be and an autonomous part A, see Theorem 4.3.14. Since every trajectory in the controllable part can be steered to zero (even identically, not only asymptotically), it is natural to ask whether stabilizability of B is related to stability of the autonomous part A. Indeed, for systems of ODEs it has been shown in [87, Thm. 5.2.30) that for B = Be EB A one has the equivalence
A is stable
~
B is stabilizable
(this is not quite the statement in that theorem, but it is exactly what has been proven in [87]). We strongly believe that this equivalence is true for delaydifferential systems as well, but unfortunately we cannot provide a complete proof. This is due to two facts, the lack of a characterization of stability in terms of the characteristic zeros and the lack of suitable series expansions of the trajectories along the characteristic zeros. One implication, however, comes easily with the decomposition.
118
4 Behaviors of Delay-Differential Systems .
Proposition 4.5.5 Let B ~ .Cq be a behavior and B = Be EB A be a decomposition of B into its controllable part Be and an autonomous behavior A ~ .Cq. Then
A is stable ===> B is stabilizable. Let w E B be any trajectory and write w = We + Wa with trajectories Be and Wa E A. By controllability there exists a concatenation w~ := Wel\of l\t 0 0 E Be. Hence stability of A implies that w' := w~ + Wa E B satisfies PROOF:
We E
w'(t) = w(t) for all
t::; 0 and t---+0 lim w'(t) = 0,
thus stabilizability of B.
0
This result will suffice to provide a sufficient criterion for the existence of stabilizing controllers. Recall from Section 4.4 that an intersection B = Bt n B2 ~ .Cq is called regular if the sum of the output numbers of B1 and B2 equals the output number of the intersection B. If additionally the intersection is autonomous, this reduces to o(B 1 ) + o(B2 ) = q. In other words, if the system B 1 is given by B1 = ker.c R with a full row rank representation R E 1ipxq, the controller B2 has to have a kernel-representation C E Ji(q-p)xq of full row rank. Let us start with the following simple result. Recall the notation R(p) from Definition 3.2.6 for the full-size minors of the matrix R. Proposition 4.5.6 Let R E JiPXq be a matrix such that rkR = p < q and define b E 1i as b := gcd'H { R(p) I p E Jp,q}. Furthermore, let f E 1i. Then
there exists a controller C E
1i(q-p)xq
such that det
[~] = f
(4.5.1)
if and only if b IH f. The analogous result is true if we replace the ring 1i by 1{0 . PROOF: The only-if-part is obvious. As for the if-part, factor R into R =ERe with a right invertible matrix Rc as in (4.3.2), (4.3.3). Then det B = b and Re can be completed to a unimodular matrix [ReT, 6T]T, see Corollary 3.2.5. Multiplying, for instance, the first row of 6 by fb- 1 E 1i, we obtain the desired D controller C satisfying (4.5.1).
In case R = [P,Q] E 1igx(m+p)' where Q is square and detQ(s,O) =f. 0 (that is, the system ker.c R is a nonanticipating i/o-system), the controller can be chosen in the form C = [F,G] E 1i;;x(m+p) such that F is square and detF(s,O) =f. 0. This can easily be achieved by starting with an arbitrary controller with entries in 1io satisfying (4.5.1) and, in case the first block F happens to be singular, adding a suitable left multiple of [P, Q]; we omit the details which are identical to the case of ODEs presented in (120, Thm. 9]. The nonsingularity ofF
4.5 Assigning the Characteristic Function
119
implies that the interconnection constitutes a closed loop system in the sense that the outputs of the system (resp. controller) make up the inputs of the controller (resp. system). It is, however, in general not possible to construct a strongly nonanticipating controller, where p-la is a Laurent series over IR(s)p (see Remark 4.2.4 and Proposition 4.2.5(c)). This can easily be seen by the trivial example [P, Q] = [1, s 2 + 1] to which the stable polynomials+ 1 (or any other polynomial of degree less than deg Q) is to be assigned. Now we can address the existence of stabilizing controllers. Theorem 4.5.7 Let R E Hpxq and bE 1i both be as in Proposition 4.5.6. Consider the following
conditions. (a) There exists a constant c < 0 such that rkR(A) = p for all A E C\Cc. (b) ker .c b is stable. (c) There exists a controller C E 1-[(q-p)xq such that ker.c
[~] is stable.
(d) B is stabilizable. Then (a)=* (b) =* (c) and (b)=* (d). We believe that the conditions (b), (c), and (d) are equivalent, but unfortunately we are not able to provide a proof for this conjecture. However, we would also like to point out that in case of delay-differential systems with noncommensurate delays, the conditions above are known to be not equivalent. In [110, Sec. 5.1) a system is presented which is even spectrally controllable (see Remark 4.3.13), but not stabilizable. PROOF: Write R = BRc as in Equations (4.3.2), (4.3.3), hence det B = b. From the proof of Theorem 4.3.14 we obtain a matrix A E Hqxq such that R = B Rc = RcA and det A = det B = b. Moreover, ker .c R = ker .c Rc E9 ker .c A. "(a) =* (b)" is clear by Corollary 4.5.4 since V(b*) ~ Cc. "(b) =* (c)" By Proposition 4.5.6 there exists C E 1-[(q-p)xq such that det[RT, cT)T =b. Now the result follows from ker.c [RT, CT]T ~ ker.c (blq). "(b) ==? (d)" is in Proposition 4.5.5 since ker.c A~ ker.c (blq) is stable. 0
Notice that condition (a) above is satisfied, if, for instance, ker.c R is controllable, or if the set of rank deficiencies {A E C I rkR*(A) < p} is finite and contained inc_. We come now to the last and main part of this section. It centers around the special case of retarded time-delay systems of the form
x=
A(CT)x + B(CT)u, where (A, B)
E
IR[z]nxn x IR[z]nxm.
(4.5.2)
Notice that the equation is explicit and of first order with respect to differentiation. Moreover, it is of retarded type since differentiation occurs solely in the
120
4 Behaviors of Delay-Differential Systems
variable x and at time t. These systems are the simplest and best studied class of DDEs. They have also been studied in considerable detail in the controltheoretic literature in the context of infinite-dimensional systems as well as in the context of systems over rings, here the ring JR[z]. Chapter 5 will be devoted to the question which behaviors can be expressed with the help of equations of the form (4.5.2) by introducing the latent variable x. In the terminology of Section 4.2, kerc [sf- A, -B] = { (:) E _cn+m
I x =Ax+ Bu}
constitutes an i/o-system with input u and output x. It is even a strongly nonanticipating system since det(sl- A) is of the form E~o aj(s)zi where a0 = det(sl - A(O)) has degree n which is the maximal degree attained by the full size minors of [sf- A, -B]. Hence strong nonanticipation follows from Proposition 4.2.5( c). An important question which has been investigated in much detail for the system (4.5.2) is that of assigning a desired characteristic function via "static state" feedback. In our terminology this amounts to finding a feedback matrix F E JR[z]mxn such that
-B] =
sf- A det [ -F I
det(sl- A- BF)
(4.5.3)
takes on a prescribed value a E JR[s, z]. Hence the input u to the system (4.5.2) becomes the "delayed state feedback" u = F(o-)x. Observe that this problem depends solely on the matrices A, B, and F. Therefore, it applies equally well to delay-differential systems as in (4.5.2) and to discrete-time systems Xk+l =· Axk + Buk over the ring JR[z] as discussed earlier in Section 3.3. Therein we quoted some results concerning the assignability of the determinant in (4.5.3) over various types of rings. We saw that JR[z] is a PA-ring, but not a CA-ring, meaning that for every reachable pair (A, B) the closed loop polynomial (4.5.3) can be assigned every value of the form IJ~=l (s - ai) with ai E JR(z] but in general not every monic polynomial sn + E~,:-01 bisi with bi E JR[z], see part (i) of Section 3.3. Recall also from (1) of that section that the notion of reachability refers to the interpretation of (A, B) as a discrete-time system. Using the characterization [sf- A, - B] being right-invertible over JR[ s, z], one notices that reachability is (much) stronger than controllability of kerc [sf- A, -B] in the sense of Section 4.3. The equivalence of reachability and pole assignability over the ring JR[z] (part (i) in Section 3.3), however, shows that this is the appropriate notion in this purely matrix-theoretic context. In the sequel we will investigate a modified version of coefficient assignability. A broader class of controllers, more powerful than static feedback, will be employed with the result that even the weaker assumption of controllability suffices for
4.5 Assigning the Characteristic Function
121
arbitrary coefficient assignment. More precisely, we will allow point delays and distributed delays induced by the proper elements from the rings 1io,p and 1io, 5 p, see Equation (3.5.8). As discussed in Remark 3.5.7, the restriction to proper operators enables to apply the controller to larger function spaces than .C. In fact, the controller will even be strongly nonanticipating. Definition 4.5.8 The pair (A, B) E JR[z]nxn x JR[z]nxm is said to be weakly coefficient assignable if for each monic polynomial a E JR[s, z] with deg8 a = n there exists a feedback law (4.5.4) u=Fx+Gu, where F E 1iO,p mxn and G E 1iO,sp mxm ' such that
l
sf-A -B det [ _ F I _ G
(4.5.5)
= a.
Here and in the sequel the requirement a being monic refers to the variables, 1 that is, the polynomial a E JR[s, z] is of the form a = sn + E~:0 aisi with coefficients ai E JR[z].
A few remarks are in order. Remark 4.5.9 (1) Notice that the feedback law u = F(a)x, where FE JR[z]mxn, is included in the class of controllers (4.5.4). While for that situation (F with entries in JR[z] and G = 0), Equation (4.5.5) can be understood as a system over a ring, this is no longer true when passing to the larger ring 1io,p ::> JR[z] for the controller. The variable s, representing differentiation, is of course not contained in the ring 1io,p of proper functions, but it is certainly not algebraically independent over 1io,p· Hence the configuration (4.5.5) does not fit into the context of systems over the ring 1io,p· (2) It is easy to verify that the controller (4.5.4) constitutes a strongly nonanticipating i/o-system with input x and output u in the sense of Remark 4.2.4. Indeed, the strict properness of G implies that det(I- G) E 1io,p is a unit in JR(s)p[z] and therefore (I- G)- 1 F E JR(s)p[z]mxn. Hence the control law u = Fx + Gu, just like the system ± = Ax+ Bu, can process (L~J+ functions without producing Dirac-impulses. In fact, the definition of 1io,p and 1io,sp in (3.5.8) and Theorem 3.5.6 show that the control law (4.5.4) is of the type N
u(t) =
~ Rix(t- j)
L
L
+f. f(r)x(t- r)dr +f. g(r)u(t- r)dr,
(4.5.6)
where N, L ~ 0 and Rj E JRmxn and where the entries of f E (PCf)mxn, g E (PCf)mxm are even piecewise exponential polynomials according to Proposition 3.5.8.
122
4 Behaviors of Delay-Differential Systems
The notion of weak coefficient assignability defined above is closely related to what is called finite spectrum assignability in the context of infinite-dimensional systems and has been studied in much detail in the existing literature. The latter notion refers to the same equation (4.5.5) but with regard to the following situation. On the one hand, only polynomials a E JR[s] are being considered. This results in a prescribed finite spectrum of the interconnection, which in most cases is the desirable property. On the other hand, a fairly broader class of feedback laws is allowed, namely feedbacks as given in (4.5.6) but with arbitrary L 2 -functions f and g defined on [0, L], see e. g. [76], [114, Def. 2.1], [113, p. 546], [115, p. 1378], [116], and [9]. Several results about finite spectrum assignability have been obtained within this context (see again the papers cited above). In particular, in [113] it is shown that the system (4.5.2) is finite spectrum assignable if and only if it is controllable. As we will see next, this equivalence still holds true after replacing finite spectrum assignability by the stronger notion of weak coefficient assignability. We formulate the result as follows. Theorem 4.5.10
The pair (A, B) E JR[z]nxn x JR[z]nxm is weakly coefficient assignable if and only if the behavior ker.c [sf- A, -B] is controllable. Knowing the results from the literature, the theorem is hardly surprising. It simply says that all controllers (4.5.4) for finite spectrum assignment fall in the class 1io,p or can be made to do so. Hence, although an infinite-dimensional system, only finitely many parameters need to be found to determine a controller. In Example 4.5.14 it will be shown for special cases how this can be accomplished. The result above appeared first in [39, Thm. 3.4]. In the singlejnput case and for a E JR[s], it can also be found in [9], the proof being based on the description of 1{0 introduced in [63]. We wish to present a short proof below, showing how the result fits into our algebraic framework for DDEs. It also illustrates that the generalization from finite spectrum to arbitrary monic characteristic polynomials a E JR[s, z] is evident in the algebraic setting. It has to be mentioned that the key step in the multi-input case cannot easily be derived by our method, but will be a reduction to the single-input case thanks to a kind of Heymann-Lemma for (4.5.2), established in [113]. Before turning to the proof of the theorem above we will present this preparatory result. In the sequel we will call a pair (A, B) controllable if the behavior ker.c [sf- A, -B] is controllable, hence if the matrix [sf - A, - B] is right invertible over 1{0 . Theorem 4.5.11 ([113, Thm. 2.1]} Let (A, B) E JR[z]nxn x JR[z]nxm be a controllable pair and assume that the first column b1 of B is nonzero. Then there exists a matrix K E JR[z]mxn such that the pair (A+ BK, b1 ) is controllable.
4.5 Assigning the Characteristic Function
123
The proof is very technical. It requires a detailed study of the rank deficiencies of the matrices [B(e- 8 ), A(e- 8 )B(e- 8 ), ••• , A(e-s)n-l B(e- 8 )] and [sl- A(e- 8 ), -B(e- 8 )]. It is worth being noticed that the assertion is not true when we replace controllability by reachability. Indeed, in the latter version the lemma would state that JR[z) allows feedback cyclization (see part (5) of Section 3.3), which is not true, since JR[z] is not even a CA-ring. Let us illustrate the difference by the example
(A, B)= (A, [b1ob2))
= (
[~ ~], [~1 z2 ~ 1])
(4.5.7)
from (3.3.2), which is reachable, but not coefficient assignable over JR[z]. It is easy to see that no feedback matrix K E R[z]2x 2 exists such that at least one of the pairs (A+ BK, bt) or (A+ BK, b2) is reachable. On the other hand, even without applying any feedback the pair (A, bt) is controllable. PROOF OF THEOREM 4.5.10: Only sufficiency requires proof. Choose a monic polynomial a E R[s, z] with degs a= n. 1. case: m = 1 For j = 1, ... , n + 1 denote by Pj E R[s, z] then x n-minor obtained from the matrix [sl- A, -B] after deleting the jth column, hence Pn+l = det(sJ- A). Controllability of ker.c [sl- A, -B] implies that the elements Pll ... ,Pn+l are coprime in Ho. Thus there exist r1, ... , rn+l E 1{0 such that
(4.5.8)
where q = (rt, ... , rn) E H6xn. According to (3.5.7) we can decompose q into its polynomial and its strictly proper part, say q = q1 + d1 where q1 E 1i 0~:r,n and d1 E R[s, zpxn. Division with remainder applied to the polynomial matrices d1 and sf - A leads to an equation d1
= h(sl- A)+ d where hE R[s, z]lxn
and dE R[zpxn.
Hence
*
sl - A - B ] [sf - A - B] a = det [ ql + d r n+ 1 + hB = det t,j;
*
(4.5.9)
(4.5.10)
where q1 + d =: t,j; E 1i 0~;n and rn+l + hB =: E Ho. In particular, ¢ E R[s]\ {0} and ft is a polynomial vector with entries of degree at most p := deg ¢ and c is a polynomial. We may assume that ¢ is monic. Then ¢a = det
[sl hA-cB]
Therefore
*=
yields that c E R[s, z] is monic and of degree degs c = p, too. 1 - g for some g E 1io,sp and the result follows.
124
4 Behaviors of Delay-Differential Systems
2. case: m > 1 With the aid of Theorem 4.5.11 this part of the proof is standard. Without restriction suppose that the first column b1 of B is nonzero. Then there exists K E lR[z]mxn such that [sf- (A+ BK), -b1] is right invertible over 1-£0 . Hence, the first case guarantees the existence off E 1i0~;n and g E 7-lo,sp satisfying
a
-b1]
BK _ = det [sl- A_f 1 9
.
. Puttmg
(4.5.11)
Equation (4.5.5) is obtained.
0
Remark 4.5.12 The proof shows that in the single-input case the computation of a controller amounts in essence to solving a Bezout equation. In Section 3.6 we have shown that (in case all coefficients are rational numbers or in certain field extensions of Q) a symbolic solution can be found algorithmically if Schanuel's conjecture is true. In the multi-input case the additional feedback matrix K needs to be found. According to (113] this can be achieved in finitely many steps in which certain varieties V(qi, ... , qi), where qi E JR[s, z], have to be determined. As this amounts to the determination of a greatest common divisor, this again can be accomplished symbolically if the initial data have computable coefficients.
Let us revisit the proof above for two special cases. Remark 4.5.13 (1) Firstly, we can recover from the proof above the well-known fact that for single input systems reachability is equivalent to coefficient assignability. In order to do so, let (A, B) be a reachable single-input pair, hence m = 1. Since in this case the matrix [sf- A, -B] is right invertible over lR[s, z], the coefficients ri in (4.5.8) are even in ~[s, z] and, consequently, q = d 1 E ~[s, zpxn and q1 = 0. Thus c = rn+l + hB E ~[s, z] has to be one, since a is monic and has degree n, and we obtain the familiar static feedback
where dE JR[z] 1 xn. a= det [sl-A-B] d 1 Hence reachability implies coefficient assignability while the converse is true for arbitrary systems. Due to the failure of Theorem 4.5.11 with reachability in place of controllability, the above does not generalize to multi-input systems. However, at the end of this section we will show that for reachable multi-input systems one can always achieve coefficient assignment with F E 1i0::,xn and G = 0.
4.5 Assigning the Characteristic Function
125
(2) A particular simple case of the procedure in the proof above arises when (A, B) is in ]Rnxn x JR[z]n, that is, if there is just one input channel and the delays occur only in the input. In this situation, one can achieve a prescribed finite spectrum even with a controller (4.5.4) where F is constant. This can be seen as follows. Since the polynomial Pn+I = det(sJ- A) is in JR[s], one can obtain a Bezout equation
1 = O:IPI
+ · · · + O:nPn + O:n+IPn+I
(4.5.12)
with a:i E JR[s) for i = 1, ... , n. Indeed, the requirement O:n+I = I-a: 1 p1 - ... -a:npn E Ho needs only finitely many zeros of Pn+I (including Pn+l
multiplicities) to be taken care of via appropriate choice of a: I, ... , O:n. This can be formulated as finitely many interpolation problems for a:i, which can then be solved within JR[s]. Multiplying Equation (4.5.12) by the desired characteristic polynomial a E JR[s) shows that the vector q = (r1, ... , rn) in the first case of the proof of Theorem 4.5.10 is actually in JR[spxn. In particular, the strictly proper part QI is zero. Using once more that sl- A is in JR[s]nxn, we see that the remainder din (4.5.9) is a constant vector. Thus we get finally
l
sl- A -B _ g =a E JR[s) d 1
det [
for some dE JRixn and g E Ho,sp· We illustrate the situation by the following examples.
Example 4.5.14 [ ~ (a) Consider the matrix [sf- A, - B] = ~ ~ 1~J The matrix A is unstable 18 1 and we wish to assign the stable characteristic polynomial a = ( s + 1) (s + 2). The minors PI = z(s-1), P2 = -z, P3 = s(s-1) of the matrix [sl -A, -B] are coprime in Ho, showing that the system is controllable. Using the idea of the preceding remark, one easily finds the Bezout equation
.
1 = -PI- esp2
+
1 + (z- ez)s- z
s(s _ 1)
P3·
Hence
a
= det = det
[~-a1 sesa ~1 [
~
~z
]
(l+(z-ez)s-z)a
~
s(s-I)
~
1 s 1 z ] ' 6e _ 2 6e 1 _ (6ez-2z-4)s+2z-2 s(s-I)
126
4 Behaviors of Delay-Differential Systems
where the last expression follows after elementary row transformations which produce constants in the first two entries of the last row. The con6 1 _;z) + <:~~1) volution operator associated with g = (Gez- 2sz(~~i)+ 2 z- 2 = 2< can be obtained from Example 2. 7. This leads finally to the (stabilizing) controller
u(t)
= (2- 6e)x1(t)- 6ex2 (t) + /.
1 (2- 6e7 )u(t- r)dr.
(b) In the very special case n = m = 1 and A E JR, B = b(z) E JR[z], a = s + a0 E JR[s), the procedure of Remark 4.5.13(2) results in the controller u
= -b(e-A)- 1 (A + ao)x + gu
where g
= (A+ ao)
b(e-A)- 1b(z)- 1 s _A
E
1-lo,sp·
E. g. for b(z) = zL the controller equation simply reads as (see again Example 2.7)
u = -eAL(A + ao)x- (A+ ao) J.L eA7 U(·- r)dr, which for L = 1 has been obtained earlier with completely different methods in [76, (2.13),(2.16)]. (c) Finally, we want to consider the following example, taken from [75), where it has been derived as a linearized model of the Mach number control in a wind tunnel. Let [sf-
e,
A, -B] =
s
+a
r
0
0
J
0 0 - ,;,az 0 E -1 s 2 w 2 s + 2ew -w
IR[s, z] 3 x 4
and w E lR are nonzero parameters. Notice that Rewhere a, K, mark 4.5.13(2) does not apply since there occurs a delay in the matrix A. We assume that the model has already been normalized so that the delay has length one. We want to assign an arbitrarily prescribed polynomial a E JR[s] of degree 3. It will be useful to express a in the form
a= (s 2 + b1s + bo)(s +a)+ (3, where b1, bo, (3 E JR. Put b = s 2 +b 1s+bo E IR[s). It is easily checked that (A, B) is a controllable pair. A Bezout equation for the greatest common divisor of the minors of [sf - A, - B] takes the simple form
l
0 0 s + a - Kaz 0 -1 s 0 w~ s + 2ew -w2 0 det [ 0 0 1 KG.~ s+a
=w
2
Q
,;,ae '
4.5 Assigning the Characteristic Function
127
since w2 Kaea is a nonzero constant. From this we can proceed as in the proof of Theorem 4.5.10. Multiplying the last row by the polynomial a and subtracting an appropriate multiple of the first row, one derives
Since b is a polynomial of degree two, we have to perform two steps of row transformations in order to obtain a proper rational last row. This leads finally to
where the constants are given by K1
=
f3
Kaeaw 2
,
K21
= ~2 w
1,
K22
= - /32 , K 3 w
-
-
b1
-
2
w2
~w ·
Hence the controller is of the form 1
u=
- K ! X ! - K21X2
+ K221
e-ar x2(·-
r)dr-
K3X3.
This is the same controller as obtained by different methods in [75, (24)]. Of the various controllers derived in [75], this is the simplest one for the assignment problem since in this case x 2 is the only variable whose integration is required in order to determine the input u. Remark 4.5.15 In the next chapter it will be shown that the controller given in (4.5.4) always admits a so-called first-order representation, i. e. one can find matrices
(A, B, 6, D)
E
IR[zrxr x IR[zrxn x IR[z]mxr x JR[z]mxn
such that ker.c [-F,I- G]
=
{(xT,uT)T l3w E cr:
w= Aw + Bx; u = Cw + Dx}.
Using such a representation, the equations of the interconnection are given by
This system shows the close connection to the classical framework of dynamic feedback for state-space systems over rings, which has been studied extensively in, e. g., (46] with respect to stabilizability, see [46, p. 39].
128
4 Behaviors of Delay-Differential Systems
Notice that in Example 4.5.14(c) we derived a controller of the form u = Fx, hence G = 0. It simply feeds back a segment of the trajectory x, see (4.5.6). As we will show next, this is always possible if the matrix B is constant. Corollary 4.5.16 Let (A, B) E JR[z]nxn x JR[z]nxm be a controllable pair and suppose that the entries of B are coprime in IR[z]. Then for every monic polynomial a E JR(s, z] with deg 8 a= n there exists a feedback matrix FE 1i 0~xn such that
si-
det [ -F
A-B] I =a.
(4.5.13)
In particular, the above conditions are met by reachable pairs (A, B). PROOF: Let U E Gln(lR[z]) and V E Glm(IR[z]) with det U = det V = 1 such that B 1 := U BV is in Smith-form. By the assumption on B, the first row of B1 is of the form ({3, 0, ... , 0) E IR 1 xm where {3 f- 0. As in the proof of Theorem 4.5.10 1 U AU- -B1 £ F 'LI mxn d G E '1J mxm · we get a= det [ sf- -F '"O,sp as m 1 _ G or some E no,p an (4.5.11). The strict properness of g yields (s- p)g E 7-lo,p for all p E IR[z] and hence adding the first row of [sf- U Au- 1 , -B 1 ], multiplied by {3- 1 g, to the first row of [- F, I - G] leads to
l
a
=
det [si- -UFAU1
1
-U BV] = I
de
t [
-B]
si- A - V F1 U I
for some matrix $F_1$ which has entries in $\mathcal{H}_{0,p}$. Consequently, $VF_1U \in \mathcal{H}_{0,p}^{m \times n}$, establishing (4.5.13). The additional assertion on reachable pairs is easily seen by resorting to a Smith form for $B$. $\Box$

We close the section with the following

Example 4.5.17
Let us apply the result above to the pair $(A, B)$ in (4.5.7), which is reachable but not coefficient assignable as a system over the ring $\mathbb{R}[z]$, see (i) in Section 3.3. In this case it is easy to obtain for every prescribed monic polynomial $\alpha = s^2 + \alpha_1 s + \alpha_0 \in \mathbb{R}[s, z]$, $\alpha_1, \alpha_0 \in \mathbb{R}[z]$, the controller
$$F = \begin{bmatrix} \alpha_1 - \dfrac{z-1}{s}\,\alpha_0 & \alpha_0 \\[4pt] 0 & 0 \end{bmatrix} \in \mathcal{H}_{0,p}^{2 \times 2}$$
satisfying (4.5.13). Hence the feedback law is given by
$$u_1 = \alpha_1(\sigma)x_1 + \int_0^1 (\alpha_0(\sigma)x_1)(\cdot - \tau)\, d\tau + \alpha_0(\sigma)x_2, \qquad u_2 = 0.$$
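The determinant condition (4.5.13) is easy to evaluate symbolically. The sketch below uses a hypothetical constant pair in controller form (it is not the pair $(A, B)$ of (4.5.7)) together with a static feedback $u = Fx$, merely to illustrate how the block determinant produces a prescribed polynomial $s^2 + \alpha_1 s + \alpha_0$.

```python
import sympy as sp

s, a0, a1 = sp.symbols('s a0 a1')

# Hypothetical constant pair in controller form (illustration only).
A = sp.Matrix([[0, 1], [0, 0]])
B = sp.Matrix([[0], [1]])
# Static feedback u = F x intended to place the polynomial s**2 + a1*s + a0.
F = sp.Matrix([[-a0, -a1]])

# The block matrix [[sI - A, -B], [-F, I]] appearing in (4.5.13).
M = sp.Matrix(sp.BlockMatrix([[s*sp.eye(2) - A, -B],
                              [-F, sp.eye(1)]]))
alpha = sp.expand(M.det())
print(alpha)  # the prescribed polynomial s**2 + a1*s + a0
```

Since the lower-right block is $I$, the determinant reduces to $\det(sI - A - BF)$, which is exactly the closed-loop characteristic polynomial of the static feedback.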
4.6 Biduals of Nonfinitely Generated Ideals

At the end of this chapter we want to return to the Galois correspondence between submodules and behaviors derived in Section 4.1. We saw in Corollary 4.1.8 that $M^{\perp\perp} = M$ for every finitely generated submodule $M \subseteq \mathcal{H}^q$. In this section we will investigate whether or not the identity $I^{\perp\perp} = I$ is true also for ideals of $\mathcal{H}$ that are not finitely generated. This question is not quite in the spirit of this chapter about behaviors, since
$$I^{\perp} = \{ w \in \mathcal{L} \mid pw = 0 \text{ for all } p \in I \}$$
is not a behavior in the sense of Definition 4.1, where only finitely many defining equations were allowed. But that definition was tailored anyway to our specific context of (linear time-invariant) DDEs with certain types of delays. In this sense, Definition 4.1 is somewhat artificial, yet convenient, from a general behavioral point of view. Using the more general and natural definition of a behavior as simply being a set of trajectories [87, Sec. 1.3/1.4], the space $I^{\perp}$ falls, of course, in the class of linear, time-invariant (autonomous) behaviors. But even without resorting to these quite general ideas, we believe an investigation of the identity $I^{\perp\perp} = I$ fits naturally in our work, because a description of the nonfinitely generated ideals is already available from Section 3.4. In fact, we saw in Theorem 3.4.10 that each ideal $I \subseteq \mathcal{H}$ is of the form
$$I = ((p))_{(M)} = \{ hp\phi^{-1} \mid h \in \mathcal{H},\ \phi \in M \},$$
where $p \in \mathbb{R}[s, z]$ is some polynomial and $M$ is an admissible set of denominators for $p$. As we will show by some simple examples, it depends decisively on the characteristic zeros of the polynomial $p$ and the denominator set $M$ whether or not the identity $I = I^{\perp\perp}$ holds true. In particular, an algebraic characterization (in ideal-theoretic terms, say) appears to be impossible. Instead, the examples give an indication of how to translate the identity $I = I^{\perp\perp}$ into a condition on the characteristic zeros. The general case can then be carried out almost straightforwardly. Due to the infinite character of the situation, one main difference to the preceding sections arises. In order to get further information about the solution space $I^{\perp} \subseteq \mathcal{L}$ we have to make use of some topological argument. More precisely, we will need that $I^{\perp}$ is completely determined by its exponential monomials, or, in other words, by the characteristic variety of $I$. This is what one would certainly expect, but for a formal proof one has to make use of Schwartz's theorem on translation-invariant subspaces. For finitely generated ideals (or modules) it was possible to circumvent these arguments due to the division properties in $\mathcal{H}$. Let us begin with
Definition 4.6.1
Let $I \subseteq \mathcal{H}$ be any subset. Define the characteristic variety of $I$ to be
$$V(I^*) := \bigcap_{p \in I} V(p^*) \subseteq \mathbb{C}.$$
The elements of $V(I^*)$ are called the characteristic zeros of the set $I$. For $\lambda \in \mathbb{C}$ define $\mathrm{ord}_\lambda(I^*) := \min_{p \in I} \mathrm{ord}_\lambda(p^*) \in \mathbb{N}_0$.
Remark 4.6.2
Let $I \subseteq \mathcal{H}$ be an ideal given as $I = ((p))_{(M)}$, where $p \in \mathbb{R}[s, z]$ and $M$ is an admissible set of denominators for $p$. It is easy to see that
$$\mathrm{ord}_\lambda(I^*) = \mathrm{ord}_\lambda(p^*) - \max_{\phi \in M} \mathrm{ord}_\lambda(\phi) \quad \text{for all } \lambda \in \mathbb{C}. \tag{4.6.1}$$
Recall from Proposition 3.4.8 that in the special case where $I$ is finitely generated, the set $M$ is finite, say $M = \{\psi_1, \ldots, \psi_l\}$. It follows that $I = (p\psi^{-1})$ where $\psi = \mathrm{lcm}(\psi_1, \ldots, \psi_l) \in M$ (see the proof of 3.4.8) and $\mathrm{ord}_\lambda(I^*) = \mathrm{ord}_\lambda(p^*\psi^{-1})$ for all $\lambda \in \mathbb{C}$. This coincides with (4.6.1) above.

Now we are prepared to describe precisely the dual $I^{\perp} \subseteq \mathcal{L}$ in terms of the characteristic variety $V(I^*)$. This in turn leads directly to a description of the elements in the bidual $I^{\perp\perp}$. Recall the notation $e_{k,\lambda}(t) = t^k e^{\lambda t}$ for the exponential monomials.

Theorem 4.6.3
Let $\mathcal{L} = C^\infty(\mathbb{R}, \mathbb{C})$, equipped with the topology of uniform convergence on all compacta in all derivatives. Then for every subset $I \subseteq \mathcal{H}$ one has
$$I^{\perp} = \bigcap_{p \in I} \ker_{\mathcal{L}} p = \overline{\mathrm{span}_{\mathbb{C}}\{ e_{k,\lambda} \mid \lambda \in V(I^*),\ 0 \le k < \mathrm{ord}_\lambda(I^*) \}}^{\,\mathcal{L}}. \tag{4.6.2}$$
As a consequence, $q \in \mathcal{H}$ satisfies
$$q \in I^{\perp\perp} \iff \mathrm{ord}_\lambda(I^*) \le \mathrm{ord}_\lambda(q^*) \text{ for all } \lambda \in \mathbb{C}.$$
PROOF: First of all, the operator $p : \mathcal{L} \to \mathcal{L}$ is continuous for every $p \in \mathcal{H}$. This follows from the fact that this map is simply the convolution operator $f \mapsto p(\delta_0', \delta_1) * f$ (see Theorem 3.5.6(iv)), which is continuous on $\mathcal{L}$ by [107, Thm. 27.3]. Therefore, each space $\ker_{\mathcal{L}} p$, and consequently $I^{\perp}$ too, is a closed, linear, and translation-invariant subspace of $\mathcal{L}$. Now, [102, Thm. 5] implies that $I^{\perp}$ is the closure of the vector space of all finite linear combinations of the exponential monomials $e_{k,\lambda}$ contained in $I^{\perp}$. Using Lemma 2.12, this leads directly to (4.6.2). The second part concerning $I^{\perp\perp}$ follows immediately from the same lemma. $\Box$
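As a small numerical sanity check of the role played by the characteristic zeros in (4.6.2): under the convention that $z$ acts as the unit backward shift, $p = z - 1$ has $p^*(s) = e^{-s} - 1$ with zeros $2k\pi i$, and the corresponding exponential monomials are annihilated by $p$. The snippet below only illustrates these two facts and is not part of the formal development.

```python
import cmath

# Convention: z acts as the unit backward shift, so p = z - 1 has the
# characteristic function p*(s) = exp(-s) - 1, vanishing at s = 2*k*pi*i.
def p_star(sv):
    return cmath.exp(-sv) - 1

zeros = [2j * cmath.pi * k for k in range(-3, 4)]
print(max(abs(p_star(sv)) for sv in zeros))  # numerically ~ 0

# The exponential monomial e_{0,lambda}(t) = exp(lambda*t), lambda = 2*pi*i,
# satisfies (p f)(t) = f(t - 1) - f(t) = 0, i.e. it lies in ((z - 1))^perp.
lam = 2j * cmath.pi
f = lambda t: cmath.exp(lam * t)
t0 = 0.37  # arbitrary sample point
print(abs(f(t0 - 1) - f(t0)))  # numerically ~ 0
```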
Thanks to this representation of $I^{\perp\perp}$, a characterization of the identity $I = I^{\perp\perp}$ can be completely accomplished in terms of the variety $V(I^*)$. The solution spaces $I^{\perp} \subseteq \mathcal{L}$ need no longer be considered. We first give a description of the ideal itself in terms of its characteristic zeros. Recall from Theorem 3.4.10 that each ideal in $\mathcal{H}$ is of the form $((p))_{(M)}$ as given below.

Theorem 4.6.4
Let $p \in \mathbb{R}[s, z] \setminus \mathbb{R}[s]$ be a polynomial and $M \subseteq D_p$ be an admissible set of denominators for $p$. Put $I = ((p))_{(M)} \subseteq \mathcal{H}$ and let $q \in \mathcal{H}$. Then one has the equivalence
$$q \in I \iff \begin{cases} \text{(i)} & \mathrm{ord}_\lambda(I^*) \le \mathrm{ord}_\lambda(q^*) \text{ for all } \lambda \in \mathbb{C},\\ \text{(ii)} & \#\{\lambda \in \mathbb{C} \mid \mathrm{ord}_\lambda(q^*) < \mathrm{ord}_\lambda(p^*)\} < \infty. \end{cases}$$
PROOF: "$\Rightarrow$" is true since $q \in I$ is of the form $q = hp\phi^{-1}$ for some $h \in \mathcal{H}$ and $\phi \in M$.
"$\Leftarrow$" We may assume without restriction that $M$ is saturated, see Remark 3.4.9. Let $q \in \mathcal{H}$ and
$$\{\lambda \in \mathbb{C} \mid \mathrm{ord}_\lambda(q^*) < \mathrm{ord}_\lambda(p^*)\} = \{\lambda_1, \ldots, \lambda_r\} \quad\text{and}\quad \psi := \prod_{i=1}^r (s - \lambda_i)^{\rho_i},$$
where $\rho_i = \mathrm{ord}_{\lambda_i}(p^*) - \mathrm{ord}_{\lambda_i}(q^*)$. Then $p^*\psi^{-1} \mid_{H(\mathbb{C})} q^*$ and therefore $hp\psi^{-1} = q$ for some $h \in \mathcal{H}$ by Proposition 3.1.2(c). There remains to show that $\psi \in M$. Using part (i) of the assumption and Remark 4.6.2, one gets
$$\mathrm{ord}_{\lambda_i}(\psi) = \mathrm{ord}_{\lambda_i}(p^*) - \mathrm{ord}_{\lambda_i}(q^*) \le \mathrm{ord}_{\lambda_i}(p^*) - \mathrm{ord}_{\lambda_i}(I^*) = \max_{\phi \in M} \mathrm{ord}_{\lambda_i}(\phi)$$
for $i = 1, \ldots, r$. This shows $\mathrm{ord}_{\lambda_i}(\psi) \le \max_{\xi \in M} \mathrm{ord}_{\lambda_i}(\xi)$ for all $i = 1, \ldots, r$, and the saturation of $M$ yields $\psi \in M$. Hence $q = hp\psi^{-1} \in ((p))_{(M)} = I$, which is what we wanted. $\Box$

Notice the special case where $I = ((p))$ is a full ideal, that is, $M = D_p$ is the set of all admissible denominators for $p$. Then $\mathrm{ord}_\lambda(I^*) = 0$ for all $\lambda \in \mathbb{C}$ (see also Proposition 3.4.3(1)) and, consequently, one obtains for all $q \in \mathcal{H}$ the equivalence
$$q \in ((p)) \iff \#\{\lambda \in \mathbb{C} \mid \mathrm{ord}_\lambda(q^*) < \mathrm{ord}_\lambda(p^*)\} < \infty.$$
Note that this is also clear from the very definition of $((p))$. Comparing now the last two theorems, one gets immediately

Corollary 4.6.5
Let $I = ((p))_{(M)} \subseteq \mathcal{H}$ be as in Theorem 4.6.4. Then $I = I^{\perp\perp} \cap ((p))$. As a consequence, $I = I^{\perp\perp}$ if and only if $I^{\perp\perp} \subseteq ((p))$.
We would like to illustrate the situation by some examples.
Example 4.6.6
(i) Let $I = ((p))$ be the full ideal generated by some $p \in \mathbb{R}[s, z]$. Then $V(I^*) = \emptyset$ and therefore $I^{\perp} = \{0\}$, so that $I^{\perp\perp} = \mathcal{H}$.
(ii) Let $p = (z-1)(z+1)$ and put $I = ((p))_{(D_{z+1})}$. Then the characteristic variety is given by $V(I^*) = \{2k\pi i \mid k \in \mathbb{Z}\} = V((z-1)^*)$ and each characteristic zero of $I$ has multiplicity one. Hence $q = z - 1 \in I^{\perp\perp} \setminus I$.
(iii) Let again $p = (z-1)(z+1)$ and choose the admissible set of denominators
$$M := \{\phi \in \mathbb{R}[s] \mid \phi \text{ monic},\ \gcd(\phi, \phi') = 1,\ V(\phi) \subseteq \{k\pi i \mid k \in \mathbb{N}\}\} \subseteq D_p.$$
Then the ideal $I = ((p))_{(M)}$ has characteristic variety $V(I^*) = \{k\pi i \mid k \le 0\}$ and satisfies the identity $I^{\perp\perp} = I$. For a verification of the last assertion, one may argue as follows. If $q \in I^{\perp\perp}$ and $q = a\psi^{-1}$ for $a \in \mathbb{R}[s, z]$, $\psi \in \mathbb{R}[s]$, then $V(I^*) \subseteq V(a^*)$ by Theorem 4.6.3 and hence $\#V((z-1)^*, a^*) = \infty = \#V((z+1)^*, a^*)$. From the Theorem of Bezout for algebraic curves it follows that $p = (z-1)(z+1)$ divides $a$ in $\mathbb{R}[s, z]$, say $a = \tilde{a}p$ for some $\tilde{a} \in \mathbb{R}[s, z]$. Now one obtains $q = \tilde{a}p\psi^{-1} \in ((p)) \cap I^{\perp\perp}$, and so $q \in I$ by the corollary above.

The examples indicate the general idea. The admissible set $M$ of denominators must leave untouched infinitely many characteristic zeros of each irreducible component of $p$ in order to guarantee $I = I^{\perp\perp}$. The case of multiple zeros of $p^*$, not discussed in the preceding examples, can easily be handled with the following lemma.
Lemma 4.6.7
Let $p \in \mathbb{R}[s, z]$ be an irreducible polynomial. Then $p^*$ has only finitely many multiple zeros. Observe the consequence that for every polynomial $p$ the multiplicities of the zeros in $V(p^*)$ stay bounded.

PROOF: Write $p = \sum_{j=0}^N p_j z^j$ with $p_j \in \mathbb{R}[s]$. Then the derivative of $p^*$ is given by $(p^*)' = q^*$ where $q = \sum_{j=0}^N (p_j' - jp_j)z^j$. Suppose to the contrary that $\#V(p^*, (p^*)') = \infty$. Then the irreducibility of $p$ yields $p \mid_{\mathbb{R}[s,z]} q$, which, along with $\deg_s p = \deg_s q$ and $\deg_z p = \deg_z q$, means $pa = q$ for some nonzero constant $a \in \mathbb{R}$. But this is a contradiction due to the specific form of $q$, and the lemma follows. $\Box$
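The coefficient formula for $q$ in the proof can be checked symbolically for a sample polynomial; the snippet below is an illustration only, for $p = s + z$ and the convention $z^* = e^{-s}$, and verifies $(p^*)' = q^*$.

```python
import sympy as sp

s, z = sp.symbols('s z')

# Sample irreducible polynomial p = p0 + p1*z with p0 = s, p1 = 1.
p0, p1 = s, sp.Integer(1)
p = p0 + p1*z

# q = sum_j (p_j' - j*p_j) z^j, here with j = 0, 1:
q = (sp.diff(p0, s) - 0*p0) + (sp.diff(p1, s) - 1*p1)*z

# Under the convention z* = exp(-s), check the proof's identity (p*)' = q*.
star = lambda f: f.subs(z, sp.exp(-s))
check = sp.simplify(sp.diff(star(p), s) - star(q))
print(check, sp.expand(q))  # check == 0, and q == 1 - z
```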
Now we are prepared for the following characterization.
Theorem 4.6.8
Let $p = a\prod_{j=1}^k p_j^{v_j}$ be a polynomial, where $a \in \mathbb{R}[s] \setminus \{0\}$, $v_j > 0$, and $p_1, \ldots, p_k \in \mathbb{R}[s, z] \setminus \mathbb{R}[s]$ are different irreducible polynomials. Let $M \subseteq D_p$ be an admissible set of denominators for $p$. Define the ideal $I := ((p))_{(M)} \subseteq \mathcal{H}$. Then
$$I = I^{\perp\perp} \iff \forall\, j = 1, \ldots, k:\ \#\{\lambda \in V(p_j^*) \mid \mathrm{ord}_\lambda(I^*) \ge v_j\, \mathrm{ord}_\lambda(p_j^*)\} = \infty.$$

PROOF: "$\Rightarrow$" Suppose one of the sets on the right-hand side is finite, say
$$\{\lambda \in V(p_1^*) \mid \mathrm{ord}_\lambda(I^*) \ge v_1\, \mathrm{ord}_\lambda(p_1^*)\} = \{\lambda_1, \ldots, \lambda_r\}. \tag{4.6.3}$$
We construct an element $q \in I^{\perp\perp} \setminus I$. According to Theorem 4.6.3 we have to find $q \in \mathcal{H} \setminus I$ such that $\mathrm{ord}_\lambda(q^*) \ge \mathrm{ord}_\lambda(I^*)$ for all $\lambda \in \mathbb{C}$. The idea is simply to divide $p$ by $p_1$ and to compensate the then missing characteristic zeros by a polynomial in $\mathbb{R}[s]$, which is possible due to (4.6.3). Also higher multiplicities have to be taken care of. The details are as follows. Let $\{\mu_1, \ldots, \mu_l\} = V(p_1^*, (p_1^*)')$ be the finite set of multiple zeros of $p_1^*$ (see Lemma 4.6.7). Denote by $\{\lambda_1, \ldots, \lambda_r\}$ the finite set in (4.6.3) and define $r_i := \mathrm{ord}_{\lambda_i}(I^*)$ for $i = 1, \ldots, r$ and $\rho_t := \mathrm{ord}_{\mu_t}(p_1^*)$ for $t = 1, \ldots, l$. Put
$$q := a\, p_1^{v_1 - 1} \prod_{j=2}^k p_j^{v_j} \prod_{i=1}^r (s - \lambda_i)^{r_i} \prod_{t=1}^l (s - \mu_t)^{\rho_t} \in \mathbb{R}[s, z].$$
Note that both sets $\{\lambda_1, \ldots, \lambda_r\}$ and $\{\mu_1, \ldots, \mu_l\}$ are contained in $V(p_1^*)$. Observe that $q \notin I = ((p))_{(M)}$ because $p_1 \notin \mathbb{R}[s]$. In order to prove that $q \in I^{\perp\perp}$, we have to show $\mathrm{ord}_\lambda(q^*) \ge \mathrm{ord}_\lambda(I^*)$ for all $\lambda \in \mathbb{C}$. This is obvious for $\lambda \in \{\lambda_1, \ldots, \lambda_r\}$ and the other cases for $\lambda$ remain to be checked. For $\lambda = \mu_t \in \{\mu_1, \ldots, \mu_l\}$ we have
$$\mathrm{ord}_{\mu_t}(q^*) \ge \mathrm{ord}_{\mu_t}(a) + (v_1 - 1)\,\mathrm{ord}_{\mu_t}(p_1^*) + \mathrm{ord}_{\mu_t}\Big(\big(\textstyle\prod_{j=2}^k p_j^{v_j}\big)^{\!*}\Big) + \rho_t = \mathrm{ord}_{\mu_t}(a) + v_1\rho_t + \mathrm{ord}_{\mu_t}\Big(\big(\textstyle\prod_{j=2}^k p_j^{v_j}\big)^{\!*}\Big) = \mathrm{ord}_{\mu_t}(p^*) \ge \mathrm{ord}_{\mu_t}(I^*).$$
In the case $\lambda \in V(p_1^*) \setminus \{\lambda_1, \ldots, \lambda_r, \mu_1, \ldots, \mu_l\}$ we get from (4.6.3) and the definition of the numbers $\mu_t$ the estimate
$$\mathrm{ord}_\lambda(I^*) \le v_1\, \mathrm{ord}_\lambda(p_1^*) - 1 = v_1 - 1 = \mathrm{ord}_\lambda\big((p_1^{v_1 - 1})^*\big) \le \mathrm{ord}_\lambda(q^*).$$
Finally, for $\lambda \notin V(p_1^*)$ one has
$$\mathrm{ord}_\lambda(q^*) = \mathrm{ord}_\lambda(a) + \mathrm{ord}_\lambda\Big(\big(\textstyle\prod_{j=2}^k p_j^{v_j}\big)^{\!*}\Big) = \mathrm{ord}_\lambda(p^*) \ge \mathrm{ord}_\lambda(I^*).$$
Hence $q \in I^{\perp\perp} \setminus I$.
"$\Leftarrow$" Let $q \in I^{\perp\perp}$. We may assume $q \in \mathcal{H}_0$ and write $q = a\phi^{-1}$ where $a \in \mathbb{R}[s, z]$ and $\phi \in \mathbb{R}[s] \setminus \{0\}$. Then $\mathrm{ord}_\lambda(a^*) \ge \mathrm{ord}_\lambda(q^*) \ge \mathrm{ord}_\lambda(I^*)$ for all $\lambda \in \mathbb{C}$ by virtue of Theorem 4.6.3. It remains to establish property (ii) in Theorem 4.6.4. Using induction on $l$ it is possible to show the implication
$$\#\{\lambda \in V(p_j^*) \mid \mathrm{ord}_\lambda(a^*) \ge l\, \mathrm{ord}_\lambda(p_j^*)\} = \infty \implies p_j^l \mid_{\mathbb{R}[s,z]} a$$
for $j = 1, \ldots, k$ (use the fact that the left-hand side implies $\#V(a^*, p_j^*) = \infty$ and recall that $p_j$ is irreducible). Hence the assumption and the coprimeness of the polynomials $p_j$ yield $a = h\prod_{j=1}^k p_j^{v_j}$ for some $h \in \mathbb{R}[s, z]$. It follows $q = \big(h\prod_{j=1}^k p_j^{v_j}\big)\phi^{-1}$ and we get the equivalence
$$\mathrm{ord}_\lambda(p^*) - \mathrm{ord}_\lambda(q^*) > 0 \iff \mathrm{ord}_\lambda(a) + \mathrm{ord}_\lambda(\phi) > \mathrm{ord}_\lambda(h^*),$$
where now $a \in \mathbb{R}[s]$ denotes the factor of $p$ from the statement of the theorem. Since the right-hand side can be true for at most finitely many values of $\lambda$, we obtain property (ii) of Theorem 4.6.4 and deduce $q \in I$. $\Box$

We conclude the section with the following two special cases.
Corollary 4.6.9
Let $a \in \mathbb{R}[s] \setminus \{0\}$ and $p \in \mathbb{R}[s, z] \setminus \mathbb{R}[s]$ and consider the ideal $I = ((ap))_{(M)}$, where $M \subseteq D_{ap}$ is an admissible set of denominators.
(i) If $p = p_1 \cdot \ldots \cdot p_k$ with pairwise different irreducible polynomials $p_i \in \mathbb{R}[s, z] \setminus \mathbb{R}[s]$, then $I = I^{\perp\perp} \iff \#(V(I^*) \cap V(p_i^*)) = \infty$ for all $i = 1, \ldots, k$.
(ii) If $p$ is irreducible, then $I = I^{\perp\perp}$ if and only if $\#V(I^*) = \infty$.

PROOF: For (i) note that
$$\big(V(I^*) \cap V(p_i^*)\big) \setminus V\big(p_i^*, (p_i^*)'\big) \subseteq \{\lambda \in V(p_i^*) \mid \mathrm{ord}_\lambda(I^*) \ge \mathrm{ord}_\lambda(p_i^*)\} \subseteq V(I^*) \cap V(p_i^*),$$
where the subtracted set $V\big(p_i^*, (p_i^*)'\big)$ is finite by Lemma 4.6.7. The result follows from Theorem 4.6.8. (ii) is a consequence of (i) because in this case $V(I^*) \subseteq V(p^*) \cup V(a)$ and $V(a)$ is finite. $\Box$
5 First-Order Representations
In this chapter we will be concerned with the question whether a given system $\mathcal{B} = \ker_{\mathcal{L}} R$, defined by implicit DDEs, can be described by explicit equations upon introducing auxiliary variables. More precisely, we will investigate whether the system $\mathcal{B} \subseteq \mathcal{L}^{m+p}$ can be expressed in the form
$$\left.\begin{aligned} \dot{x} &= A(\sigma)x + B(\sigma)u,\\ y &= C(\sigma)x + E(\sigma)u \end{aligned}\right\} \tag{5.1}$$
where $u \in \mathcal{L}^m$ and $y \in \mathcal{L}^p$ are the external variables of the system, and $x \in \mathcal{L}^n$ is an additional latent variable introduced for the description. Moreover, $A$, $B$, $C$, and $E$ are matrices over $\mathbb{R}[z]$ of fitting sizes. Notice that the first equation of (5.1) is explicit and of first order with respect to differentiation. Furthermore, differentiation occurs solely in the variable $x$ and at time $t$, meaning that (5.1) is a system of DDEs of retarded type. These equations are the simplest and best studied class of DDEs. Results concerning forward solutions of initial value problems (not in $\mathcal{L}^n$, usually) can be found for instance in [3, Sec. 6.4] and [23, Ch. VII]. They form a helpful means for a detailed analysis of the dynamics of the system. If the matrices $A$, $B$, $C$, and $E$ are constant, equations (5.1) form the classical state-space description for systems of ODEs. In that case, the value $x(t) \in \mathbb{R}^n$ constitutes the state at time $t$ (if we disregard the underlying function space) in the sense that it contains all necessary information to determine the future of the system, once an input $u$ is applied. For DDEs, however, system (5.1) is in general infinite-dimensional and therefore $x(t)$ does not represent the state at time $t$ in any reasonable manner. Yet, the trajectory $x$ describes the evolution of the system. Namely, in an infinite-dimensional setting the state at time $t$ is basically the segment of the trajectory $x$ whose length is equal to the maximal lag occurring in (5.1) and which ends at time $t$. Formulated in an appropriate setting, this leads to a state-space description via an abstract differential equation on a suitable Hilbert space, providing another useful tool for a detailed study of the qualitative behavior of such a system; see for instance [20, Sec. 2.4].
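Forward solutions of retarded equations such as (5.1) can be computed by the classical method of steps: on each interval of length equal to the delay, the delayed terms are already known, so the equation is integrated like an ODE. The scalar forward-Euler sketch below is purely illustrative (a hypothetical scalar equation with unit delay, not part of the text's development).

```python
def solve_retarded(a0, a1, u, history, T, h=1e-3):
    """Forward-Euler solution of x'(t) = a0*x(t) + a1*x(t-1) + u(t) on [0, T],
    with x prescribed by `history` on the initial segment [-1, 0] (unit delay)."""
    n_hist = round(1 / h)
    xs = [history(-1 + i*h) for i in range(n_hist + 1)]  # samples on [-1, 0]
    for i in range(round(T / h)):
        t = i * h
        x_now, x_delayed = xs[-1], xs[-1 - n_hist]  # x(t) and x(t - 1)
        xs.append(x_now + h * (a0*x_now + a1*x_delayed + u(t)))
    return xs[-1]

# Sanity check: for a0 = a1 = 0 and u = 1 the equation degenerates to x' = 1
# with x(0) = 0, so x(2) = 2 up to the Euler discretization error.
x2 = solve_retarded(0.0, 0.0, lambda t: 1.0, lambda t: 0.0, T=2.0)
print(x2)
```

Since the right-hand side only looks backward in time, the stored history segment is exactly the "state" segment discussed above.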
We will not make explicit use of these features of the system (5.1), but merely consider it, in the spirit of the behavioral approach, as a latent variable system, where the latent variable x has been introduced for modeling the "external
behavior"
$$\mathcal{B}_{\mathrm{ext}}(A, B, C, E) := \left\{ \begin{pmatrix} u \\ y \end{pmatrix} \in \mathcal{L}^{m+p} \;\middle|\; \text{there exists } x \in \mathcal{L}^n \text{ satisfying (5.1)} \right\} \tag{5.2}$$
of all possible input/output pairs of the system. Indeed, from the elimination result in Theorem 4.4.1 one can easily deduce that $\mathcal{B}_{\mathrm{ext}}(A, B, C, E)$ is a behavior in the sense of Definition 4.1. Motivated by the above-sketched properties of (5.1), we will be concerned with the converse question, that is, which behaviors can be described in the form (5.1), (5.2)? Systems of the type (5.1) have been studied in much detail in the literature. On the one hand, they have been investigated extensively in the context of infinite-dimensional systems, where often even matrices over $\mathcal{H}_0$ or more general convolution operators are taken into consideration, see, e.g., [85, 74, 73, 20] and the references therein. On the other hand, delay-differential systems of the type (5.1) have apparently been the main motivation for initiating the area of systems over rings, see [79, 105, 61], where they have been studied in detail ever since. In Section 3.3 we gave a quick overview of some interesting control problems arising for systems over rings. We did not mention there the area of realization theory, which we will briefly address now as it comes close to what will be done in this chapter. For a discrete-time system $x_{k+1} = Ax_k + Bu_k$, $y_k = Cx_k + Eu_k$, where the entries of all vectors and matrices are in some ring $R$, the transfer function is given by $C(sI - A)^{-1}B + E$, hence it is a proper rational function in $R[[s^{-1}]]^{p \times m}$. The classical problem of realization theory is as follows: given an arbitrary proper rational function $G \in R[[s^{-1}]]^{p \times m}$, find matrices $A$, $B$, $C$, and $E$ with entries in $R$ such that $G = C(sI - A)^{-1}B + E$, preferably with the dimension $n$ of the abstract state space being as small as possible. Put another way, if $G$ is given as $G = \sum_{i \ge 0} G_i s^{-i}$, the matrices have to satisfy $CA^{i-1}B = G_i$ for $i > 0$ and $E = G_0$. In case $R$ is a field, the relationship between rational functions and their realizations is fully understood, including minimality and uniqueness issues.
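The relation between a realization and the coefficients $G_i$ can be checked symbolically over, say, $R = \mathbb{R}[z]$. The sketch below uses hypothetical matrices (chosen only for illustration) and verifies the expansion $G = E + \sum_{i \ge 1} CA^{i-1}B\, s^{-i}$ up to an exact remainder term.

```python
import sympy as sp

s, z = sp.symbols('s z')

# Hypothetical realization data with entries in the ring R[z] (illustration only).
A = sp.Matrix([[0, z], [1, 0]])
B = sp.Matrix([[1], [z]])
C = sp.Matrix([[z, 1]])
E = sp.Matrix([[2]])

# Formal transfer function G = C (sI - A)^{-1} B + E.
G = (C * (s*sp.eye(2) - A).inv() * B + E)[0]

# Truncated expansion G_0 + G_1/s + ... + G_5/s**5 with G_0 = E and
# G_i = C A^{i-1} B, plus the exact tail s**-5 * C A^5 (sI - A)^{-1} B.
trunc = E[0] + sum((C * A**(i-1) * B)[0] * s**(-i) for i in range(1, 6))
tail = s**(-5) * (C * A**5 * (s*sp.eye(2) - A).inv() * B)[0]
print(sp.cancel(G - trunc - tail))  # 0
```

The tail term comes from the geometric-series identity $(sI - A)^{-1} = \sum_{i \ge 1} A^{i-1}s^{-i}$, whose remainder after $N$ terms is $s^{-N}A^N(sI - A)^{-1}$.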
In particular, each proper rational matrix is realizable. For the general case, realizability is always guaranteed, too, but the results concerning minimality and uniqueness depend on the ring. Since we will take a slightly different approach, we will not go into the details but refer the reader to [12, Ch. 4]. For systems over fields, an alternative approach for realizing the transfer function has proven very fruitful, too. It is known as the polynomial model of Fuhrmann or simply the Fuhrmann-realization. Unlike the above-mentioned approach, it does not realize the sequence of coefficients $G_i$ but is rather based on a polynomial factorization $Q^{-1}P$ of $G$, see [33, 34]. We will present this construction in detail in Section 5.2, where it will be utilized for our purposes. Let us now return to DDEs. It is easily seen that (5.1) is a strongly nonanticipating i/o-system with input $u$ and output $y$. Moreover, the formal transfer function is given by $C(sI - A)^{-1}B + E \in \mathbb{R}(s, z)^{p \times m}$, which looks formally just
like the transfer function for discrete-time systems over the ring $\mathbb{R}[z]$. However, we are not interested in realizing the transfer function but rather want to realize, if possible, a given system $\ker_{\mathcal{L}}[P, Q]$ as external behavior
$$\ker_{\mathcal{L}}[P, Q] = \mathcal{B}_{\mathrm{ext}}(A, B, C, E), \tag{5.3}$$
where $\mathcal{B}_{\mathrm{ext}}(A, B, C, E)$ is as in (5.2) and $A$, $B$, $C$, and $E$ are the matrices to be found. In Section 4.3 we saw that the formal transfer function $-Q^{-1}P$ does not contain the full information about the system because it neglects the autonomous part. As a consequence, realizing behaviors in the sense of (5.3) is in general stronger than realizing the transfer function. In the sequel we wish to explain briefly our approach to behavioral realization. Since the representation (5.1) is completely polynomial, the operator ring $\mathcal{H}$ with its nice algebraic properties turns out to be of little help. Instead, we will first treat the problem for systems $\ker_{\mathcal{L}}[P, Q]$ with a polynomial kernel-representation $[P, Q]$. This brings us back to the Fuhrmann-realization. As mentioned above, that procedure, developed for systems over fields, utilizes polynomial factors, $P$ and $Q$ say, for realizing the transfer function $G = -Q^{-1}P$. As we will see in Section 5.2, the very procedure of Fuhrmann also works in the more general context of DDEs and, even more, provides a behavioral realization. The latter is somewhat surprising since the procedure takes place in a completely polynomial setting; only the surjectivity of the delay-differential operators will be needed to establish the transfer function realization as a behavioral one. In order to prove the strength of Fuhrmann's construction, we want to present the realization in an even more general setting. In fact, as we will show, the procedure works for arbitrary systems where a polynomial ring of mutually commuting operators acts surjectively on a module $\mathcal{A}$, representing the underlying function space. It will be crucial that the operators are algebraically independent, for this will allow us to apply the theorem of Quillen/Suslin on projective modules over polynomial rings, so that we get a free module as an abstract state space.

We will introduce this abstract framework in the next section along with various concrete classes of systems, such as differential systems with (possibly) noncommensurate delays as well as certain systems of partial differential equations. In Section 5.2, eventually, the realization procedure will be carried out in this general framework. The reason for passing to this quite general setting instead of sticking to DDEs is twofold. On the one hand, we think that in this situation more generality provides also more clarity, as it exhibits exactly what kind of structure is needed for the procedure to work. On the other hand, the more general context does not require more advanced methods. It is literally the same construction as it would be for systems of DDEs. Having finished our considerations in the general setting of abstract polynomial systems, we will return to delay-differential systems with commensurate delays in Section 5.3. Only little extra work is needed to derive a criterion for
realizability of $\ker_{\mathcal{L}}[P, Q]$, along with a realization procedure, where now $[P, Q]$ is an arbitrary operator with entries in $\mathcal{H}$. For sufficiency we will utilize the Fuhrmann-realization for the "numerator matrix"; necessity will be a consequence of the elimination procedure of Section 4.4. Finally, in the last section the question of minimality will be addressed. Unfortunately, we can only provide partial answers in this direction, one of which is that the Fuhrmann-realization yields, in a certain sense, the best result for systems with a polynomial kernel-representation.
5.1 Multi-Operator Systems

In this section we introduce the abstract model of systems for which a realization procedure will be presented later on. For obvious reasons the classes of systems being described by this model will simply be called multi-operator systems. As will be illustrated throughout this section, they cover not only differential systems with even noncommensurate point-delays but also certain systems of partial differential equations. (The investigation of DDEs in the framework of Chapter 4 will be resumed in Section 5.3.) We will close this section with a first result concerning the formal transfer function of the systems under consideration. Let us now fix the abstract model for the multi-operator systems. All we need is a commutative polynomial ring $K[z_1, \ldots, z_l, s]$ in $l + 1$ indeterminates over an arbitrary field $K$ and a nonzero divisible $K[z_1, \ldots, z_l, s]$-module $\mathcal{A}$. Hence, by definition, every nonzero polynomial $p$ induces a surjective map on $\mathcal{A}$ by left multiplication. The indeterminate $s$ is distinguished merely because, in the next section, we will construct realizations which are explicit and of first order with respect to $s$, analogous to (5.1) for DDEs. For the time being there is no particular meaning to $s$. We will also use the notation $K[z] := K[z_1, \ldots, z_l]$ for the polynomial ring in the first $l$ indeterminates and $K[z, s]$ for $K[z_1, \ldots, z_l, s]$. A matrix $R \in K[z, s]^{p \times q}$ induces the two $K[z, s]$-linear maps
$$K[z, s]^q \longrightarrow K[z, s]^p,\quad p \longmapsto Rp \qquad\text{and}\qquad \mathcal{A}^q \longrightarrow \mathcal{A}^p,\quad a \longmapsto Ra.$$
Just like for delay-differential systems, both maps will simply be denoted by $R$, and the notation $\ker_{K[z,s]} R$ and $\mathrm{im}_{K[z,s]} R$, resp. $\ker_{\mathcal{A}} R$ and $\mathrm{im}_{\mathcal{A}} R$, will be used in the obvious way. The surjectivity of the map $a \mapsto pa$ for each nonzero $p \in K[z, s]$ immediately carries over to matrices.
139
5.1 Multi-Operator Systems
Lemma 5.1.1
Let $R \in K[z, s]^{p \times q}$ be a matrix with full row rank and $\mathcal{A}$ be any divisible $K[z, s]$-module. Then $\mathrm{im}_{\mathcal{A}} R = \mathcal{A}^p$.

For the verification one simply selects a nonsingular full-size submatrix $Q$ of $R$ and utilizes the identity $Q(\mathrm{adj}\, Q) = (\det Q)I_p$. Summarizing, our abstract model consists of a polynomial ring of $l + 1$ algebraically independent operators acting on a divisible module $\mathcal{A}$. The following examples show that this model covers indeed concrete systems, including delay-differential equations with even noncommensurate delays as well as certain partial differential equations or discrete-time partial difference equations. We begin with
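The adjugate identity used in this verification can be checked symbolically; a small sketch with a sample nonsingular matrix over $K[z, s]$:

```python
import sympy as sp

s, z = sp.symbols('s z')

# A sample nonsingular full-size submatrix Q over K[z, s].
Q = sp.Matrix([[s, z], [z**2, s + 1]])

# Q * adj(Q) = det(Q) * I, the identity underlying Lemma 5.1.1.
lhs = (Q * Q.adjugate()).expand()
rhs = (Q.det() * sp.eye(2)).expand()
print(lhs == rhs)  # True
```

Given $b \in \mathcal{A}^p$, divisibility yields $c$ with $(\det Q)c = b$, and then $a = (\mathrm{adj}\, Q)c$ satisfies $Qa = b$, which is the content of the lemma.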
Example 5.1.2 (Delay-Differential Systems)
Let $\mathcal{A} = C^\infty(\mathbb{R}, \mathbb{C})$ and denote by $\sigma_i$ the shift operator of length $\tau_i > 0$, i.e. $(\sigma_i f)(t) = f(t - \tau_i)$. Then $\mathbb{R}[\sigma_1, \ldots, \sigma_l, D]$ is the ring of all linear, time-invariant delay-differential operators of the form
$$p = \sum_{\nu \in \mathbb{N}^l}{}' \sum_{i=0}^N p_{\nu,i}\, \sigma_1^{\nu_1} \cdots \sigma_l^{\nu_l} D^i, \quad p_{\nu,i} \in \mathbb{R}, \tag{5.1.1}$$
where $\sum'$ means this sum being finite. The space $\mathcal{A}$ naturally carries the structure of an $\mathbb{R}[\sigma_1, \ldots, \sigma_l, D]$-module. Precisely, for $p$ as in (5.1.1) and $f \in \mathcal{A}$ one has
$$(pf)(t) = \sum_{\nu \in \mathbb{N}^l}{}' \sum_{i=0}^N p_{\nu,i}\, f^{(i)}(t - \langle\nu, \tau\rangle), \quad t \in \mathbb{R},$$
where $\langle\nu, \tau\rangle = \sum_{j=1}^l \nu_j \tau_j$ denotes the standard scalar product. It is obvious that $\sigma_1, \ldots, \sigma_l$, and $D \in \mathrm{End}_{\mathbb{C}}(\mathcal{A})$ mutually commute. Moreover, if $\tau_1, \ldots, \tau_l \in \mathbb{R}$ are linearly independent over $\mathbb{Q}$, then $\sigma_1, \ldots, \sigma_l, D$ are algebraically independent elements in the ring $\mathrm{End}_{\mathbb{C}}(\mathcal{A})$. To see this, let $p$ be as in (5.1.1). Then $p$ being the zero operator in $\mathrm{End}_{\mathbb{C}}(\mathcal{A})$ implies in particular for the exponential functions $e_{0,\lambda}$ the identity
$$0 = (p\, e_{0,\lambda})(t) = \Big(\sum_{\nu \in \mathbb{N}^l}{}' \sum_{i=0}^N p_{\nu,i}\, \lambda^i e^{-\lambda \langle\nu, \tau\rangle}\Big) e^{\lambda t} \quad \text{for all } t \in \mathbb{R} \text{ and all } \lambda \in \mathbb{C}.$$
Since $\langle\nu, \tau\rangle \neq \langle\mu, \tau\rangle$ whenever $\nu \neq \mu$ in $\mathbb{N}^l$, all coefficients $p_{\nu,i} \in \mathbb{R}$ must be zero. Thus, $\mathbb{R}[\sigma_1, \ldots, \sigma_l, D]$ is a polynomial ring in $l + 1$ indeterminates. Its elements are delay-differential operators with $l$ noncommensurate delays. From [25, p. 697] it is known that the operators are surjective on $\mathcal{A}$.

The following class of systems arises in multidimensional systems theory. They have been studied in a unified manner in [84].
Example 5.1.3 (Multidimensional Systems)
Consider the following situations.
(a) Let $K$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$ and let $K[\partial_1, \ldots, \partial_{l+1}]$ be the ring of partial differential operators acting on $\mathcal{A} = C^\infty(\mathbb{R}^{l+1}, K)$ or on $\mathcal{A} = \mathcal{D}'(\mathbb{R}^{l+1})$, the space of real- or complex-valued distributions on $\mathbb{R}^{l+1}$;
(b) Let $K$ be any (possibly finite) field and let
$$\mathcal{A} := \Big\{ \sum_{n \in \mathbb{N}^{l+1}} a(n)\, t_1^{n_1} \cdots t_{l+1}^{n_{l+1}} \;\Big|\; a(n) \in K \Big\}, \quad \text{where } n = (n_1, \ldots, n_{l+1}),$$
be the $K$-algebra of formal power series in $l + 1$ indeterminates over $K$. Via the backward shifts with truncation
$$z_i\Big( \sum_{n \in \mathbb{N}^{l+1}} a(n_1, \ldots, n_{l+1})\, t_1^{n_1} \cdots t_{l+1}^{n_{l+1}} \Big) = \sum_{n \in \mathbb{N}^{l+1}} a(n_1, \ldots, n_i + 1, \ldots, n_{l+1})\, t_1^{n_1} \cdots t_{l+1}^{n_{l+1}},$$
the space $\mathcal{A}$ can be endowed with the structure of a $K[z_1, \ldots, z_{l+1}]$-module. This is usually the framework for discrete-time multidimensional systems, cf. [123, 122].
In all cases above the operator ring is a polynomial ring in $l + 1$ indeterminates. It is the main result of [84] that these situations have some strong algebraic structure in common: the module $\mathcal{A}$ constitutes a large injective cogenerator in the category of $K[z, s]$-modules, see [84, (54), p. 33]. Part of this result goes back to work of Ehrenpreis and Palamodov in the case of PDEs. The large injective cogenerator property itself is not needed for our purposes and we refer the interested reader to [84] for the details. More important for us are the consequences for the operators acting on $\mathcal{A}$. In essence, the correspondence between kernels in $\mathcal{A}^q$ and operators in $K[z, s]^{p \times q}$ is quite similar to that for delay-differential systems discussed in Section 4.1. We would like to extract the following from [84] for future reference.
(1) [84, (46), p. 30] For matrices $R_1 \in K[z, s]^{p \times q}$ and $R_2 \in K[z, s]^{l \times p}$ one has
R1T = imK(z,s]R2T
¢=:::?
kerA R2 = imARl.
(2) In particular, if R E K[z, s]pxq has rank p, then imAR = AP. (3) (84, (61), p. 36] For matrices Ri E K[z, s]Pixq, i = 1, 2, one has kerA R1 ~ kerA R2 ¢=:::? R2 =X R1 for some X E K[z, s]P 2 xp 1 • Recall the analogous results in Proposition 4.1.4, Theorem 4.1.5(a), and Remark 4.1.9 for the case where 1-l is acting on .C = C00 (lR,
ft]
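In case (b) of Example 5.1.3, the truncating backward shifts are easy to model on finitely supported coefficient families; the following sketch (illustrative only, for $l + 1 = 2$ indeterminates) implements $z_i$ on sparse coefficient dictionaries:

```python
# Formal power series in l+1 = 2 indeterminates, modeled as sparse coefficient
# dictionaries {(n1, n2): a(n1, n2)} with finitely many nonzero entries.
def backward_shift(series, i):
    """The truncating backward shift z_i: the new coefficient at n is a(n + e_i)."""
    out = {}
    for n, a in series.items():
        if n[i] >= 1:            # the part constant in t_i is truncated away
            m = list(n)
            m[i] -= 1
            out[tuple(m)] = a
    return out

# Example: f = 1 + 3*t1 + 5*t1*t2, so z1 f = 3 + 5*t2 and z2 f = 5*t1.
f = {(0, 0): 1, (1, 0): 3, (1, 1): 5}
print(backward_shift(f, 0))  # {(0, 0): 3, (0, 1): 5}
print(backward_shift(f, 1))  # {(1, 0): 5}
```

Note that the shifts commute and annihilate the constants, mirroring the $K[z_1, \ldots, z_{l+1}]$-module structure described above.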
In Theorem 4.2.3 we introduced the formal transfer function $-Q^{-1}P \in \mathbb{R}(s, z)^{p \times m}$ of an i/o-system $\ker_{\mathcal{L}}[P, Q] \subseteq \mathcal{L}^{m+p}$ of DDEs. In the same way the formal transfer function can (and will) be introduced for the general polynomial setting of this section. In this context the following situation will play a crucial role.
Example 5.1.4 (Transfer Functions)
Let $K[z, s]$ be any polynomial ring in $l + 1$ indeterminates. Then the space $\mathcal{A} = K(z, s)$ carries a natural $K[z, s]$-module structure given by multiplication. The same is true for the space
$$K(z)((s^{-1})) = \Big\{ \sum_{i=-\infty}^N f_i s^i \;\Big|\; N \in \mathbb{Z},\ f_i \in K(z) \Big\}$$
of formal Laurent series in $s^{-1}$ with coefficients in the field $K(z)$. Clearly, both spaces are divisible $K[z, s]$-modules, thus our abstract approach applies. For this setting, behavioral theory coincides with the transfer function framework, as we will make precise in Example 5.1.8.
Remark 5.1.5
Throughout this section, it does not play any role having one of the variables distinguished. Even more, if $x_1, \ldots, x_{l+1}$ are algebraically independent elements over $K$, the same is true for $y_1, \ldots, y_{l+1}$, where
$$(y_1, \ldots, y_{l+1})^T = A(x_1, \ldots, x_{l+1})^T + b$$
for some $A \in Gl_{l+1}(K)$ and $b \in K^{l+1}$. In particular, $K[y_1, \ldots, y_{l+1}] = K[x_1, \ldots, x_{l+1}]$. For instance, in Example 5.1.2, the polynomial ring can also be presented as $\mathbb{R}[D, \sigma_1 - 1, \ldots, \sigma_l - 1]$, where we replaced the shift operators by the corresponding difference operators and changed the ordering of the indeterminates. In this case, the list of operators $(z_1, \ldots, z_l, s)$ reads as $(D, \sigma_1 - 1, \ldots, \sigma_l - 1)$, so that $s = \sigma_l - 1$ is the distinguished operator. The procedure of the next section would then result in a first-order realization with respect to the last difference operator $\sigma_l - 1$, provided that certain necessary conditions are satisfied.

Let us return to the general case of a divisible $K[z, s]$-module $\mathcal{A}$. For $R \in K[z, s]^{p \times q}$ the kernel $\ker_{\mathcal{A}} R$ is a submodule of $\mathcal{A}^q$ and can be regarded as an abstract version of a behavior of a dynamical system, generalizing those of Definition 4.1. If $\mathcal{A}$ is a function space, it consists of all trajectories in $\mathcal{A}^q$ that are governed by a system of (higher order) equations, e.g., delay-differential equations, partial differential equations, or partial difference equations in case of the examples above. In the general case, for instance in Example 5.1.4, there is no interpretation of $\ker_{\mathcal{A}} R$ in terms of trajectories. In the following definition we introduce these systems formally along with the desired first-order representations.
Definition 5.1.6
Let $R \in K[z, s]^{t \times (m+p)}$ be any matrix.
(a) The module
$$\ker_{\mathcal{A}} R = \{ a \in \mathcal{A}^{m+p} \mid Ra = 0 \}$$
is called a behavior (or a system) in $\mathcal{A}^{m+p}$.
(b) The behavior $\ker_{\mathcal{A}} R$, or simply the matrix $R$, is said to be realizable if there exist a number $n \in \mathbb{N}$ and matrices
$$(A, B, C, E) \in K[z]^{n \times n} \times K[z]^{n \times m} \times K[z]^{p \times n} \times K[z]^{p \times m}$$
such that
$$\ker_{\mathcal{A}} R = \mathcal{B}_{\mathrm{ext}}^{\mathcal{A}}(A, B, C, E), \tag{5.1.2}$$
where
$$\mathcal{B}_{\mathrm{ext}}^{\mathcal{A}}(A, B, C, E) := \Big\{ \begin{pmatrix} u \\ y \end{pmatrix} \in \mathcal{A}^{m+p} \;\Big|\; \text{there exists } x \in \mathcal{A}^n \text{ such that } sx = Ax + Bu,\ y = Cx + Eu \Big\}. \tag{5.1.3}$$
In case such matrices exist, we call the quadruple $(A, B, C, E)$ a realization of $\ker_{\mathcal{A}} R$. The system
$$\left.\begin{aligned} sx &= Ax + Bu,\\ y &= Cx + Eu \end{aligned}\right\} \tag{5.1.4}$$
is said to be a first-order representation of $\ker_{\mathcal{A}} R$, and the behavior $\mathcal{B}_{\mathrm{ext}}^{\mathcal{A}}(A, B, C, E)$ is called the external behavior of (5.1.4). The length $n$ of the internal vector $x$ is called the dimension of the realization $(A, B, C, E)$. The matrix $C(sI - A)^{-1}B + E \in K(z, s)^{p \times m}$ is said to be the formal transfer function of (5.1.4) or of (5.1.3).

The term first-order representation or first-order system refers, of course, to the fact that the first equation in (5.1.4) is linear with respect to the operator induced by $s$. As has been discussed for DDEs in the introduction to this chapter, it does not make sense to call (5.1.4) a state-space system. Only for certain cases, where the matrices are constant, this might be appropriate. A few remarks are in order.

Remark 5.1.7
(i) It is not clear whether each external behavior of a first-order system does admit a kernel-representation, in other words, whether latent variables can always be eliminated. We will see below that this is indeed the case for the examples above, except possibly for delay systems with noncommensurate delays, where this is unknown.
5.1 Multi-Operator Systems
(ii) Remember the notions of free and maximally free variables of a delay-differential system from Definition 4.2.1. These concepts generalize naturally to the context of operators acting on A and can be applied to the behavior (5.1.3). From the surjectivity of sI − A on A^n it is immediate that the variables u are free, meaning that for each u ∈ A^m there exists y ∈ A^p such that (u^T, y^T)^T ∈ ker_A R. For the Examples 5.1.2-5.1.4, again with the possible exception of systems with noncommensurate delays, the variables u are even maximally free, so that the last p variables constitute the outputs of the system; see the discussion below. We know from the delay-differential systems of Chapter 4 that this implies that R has rank p, see Theorem 4.2.3. That means that the number of outputs equals the number of independent equations. Again, this will be true in more generality. However, the realization procedure in the next section applies only to full row rank kernel-representations, meaning that we are restricted to matrices R ∈ K[z, s]^{p×(m+p)} to start with. Put another way, we will assume in Section 5.2 that the system is governed by exactly p linearly independent equations. Except for the case of transfer functions and systems with commensurate delays, this restriction is indeed crucial: since K[z, s] is not a principal ideal domain, it is in general not possible to eliminate linearly dependent rows of R without changing the associated behavior ker_A R, see Example 5.1.10 below.
(iii) In accordance with our definition of input/output systems (see Definition 4.2.1), we always place the free variables into the first m components of the external variables; see also Remark 4.2.2 for a comment on this restrictive point of view.
Let us discuss the definition for the list of examples above.
Example 5.1.8 (Transfer Functions)
Consider again Example 5.1.4, where A is either K(z, s) or K(z)((s^{−1})). In this case, the external behavior of (5.1.4) is simply

    B_A^ext(A, B, C, E) = {(u^T, y^T)^T ∈ A^{m+p} | y = (C(sI − A)^{−1}B + E)u} = ker_A [P, Q],

where −Q^{−1}P = C(sI − A)^{−1}B + E is any factorization of the formal transfer function into polynomial matrices (which, of course, exists). Thus, the external behavior B_A^ext(A, B, C, E) admits a full row rank kernel-representation [P, Q] ∈ K[z, s]^{p×(m+p)}. Obviously, for this special choice of A, realizing a behavior ker_A [P, Q] is the same as realizing the rational function −Q^{−1}P, that is, as finding matrices (A, B, C, E) satisfying −Q^{−1}P = C(sI − A)^{−1}B + E. Note also that in this case u is maximally free.
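In a computer algebra system such a polynomial factorization is immediate to produce. The sketch below uses a hypothetical quadruple (A, B, C, E) over K[z] with p = 1, so that Q can simply be taken as the 1×1 matrix det(sI − A); it checks that the resulting P is polynomial and that −Q^{−1}P recovers the formal transfer function.

```python
import sympy as sp

z, s = sp.symbols('z s')

# hypothetical quadruple (A, B, C, E) over K[z]; illustration only
A = sp.Matrix([[0, 1], [z, 0]])
B = sp.Matrix([0, 1])
C = sp.Matrix([[1, z]])
E = sp.Matrix([[z]])

G = C * (s*sp.eye(2) - A).inv() * B + E       # formal transfer function
Q = sp.Matrix([[(s*sp.eye(2) - A).det()]])    # p = 1: choose Q = det(sI - A)
P = (-Q * G).applyfunc(sp.cancel)             # then P = -Q*G is polynomial

assert all(e.is_polynomial(z, s) for e in P)
assert sp.simplify((-Q.inv()*P - G)[0, 0]) == 0
```

The factorization is of course far from unique; any nonsingular left factor produces another one.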
Example 5.1.9 (Delay-Differential Systems)
In the situation of Example 5.1.2, where s = D and σ_1, ..., σ_l are shift operators of noncommensurate lengths τ_1, ..., τ_l, the first-order system in (5.1.4) reads as

    ẋ = Σ_{ν∈ℕ^l} A_ν σ^ν x + Σ_{ν∈ℕ^l} B_ν σ^ν u,
    y = Σ_{ν∈ℕ^l} C_ν σ^ν x + Σ_{ν∈ℕ^l} E_ν σ^ν u,

where we use the notation σ^ν := σ_1^{ν_1} ∘ ⋯ ∘ σ_l^{ν_l}, and A_ν, B_ν, C_ν, and E_ν are constant matrices with entries in ℝ, only finitely many of which are nonzero. If l = 1, we know from Theorem 4.4.1(a) that the external behavior

    B_A^ext(A, B, C, E) = [0, I_m; C, E] (ker_A [sI − A, −B])

is in fact a behavior in the sense of Definition 4.1. Moreover, we will see in Proposition 5.3.1 that it always admits a kernel-representation ker_A [P, Q], where [P, Q] ∈ H^{p×(m+p)} and Q is nonsingular. In particular, u is maximally free, see Theorem 4.2.3. It remains an open question whether similar results are true for systems with noncommensurate delays, cf. [127, p. 234] and [41, Sec. 3.1].
Example 5.1.10 (Multidimensional Systems)
Let A be any of the spaces in Example 5.1.3 with the corresponding module structure. Then each external behavior B_A^ext(A, B, C, E) of a system (5.1.4) admits a kernel-representation of rank p, the number of output variables y in the system. This can be seen as follows. Define the matrix

    M := [sI − A, −B; 0, I_m; C, E].    (5.1.5)

Since each submodule of K[z, s]^{n+m+p} is finitely generated, there exists a matrix [Ŷ, P̂, Q̂] ∈ K[z, s]^{l×(n+m+p)}, for some l ∈ ℕ, such that

    ker_{K[z,s]} M^T = im_{K[z,s]} [Ŷ, P̂, Q̂]^T.    (5.1.6)

It follows that rk [Ŷ, P̂, Q̂] = p ≤ l. Lemma 3.2.7(2) shows that we have even rk Q̂ = p. Furthermore, property (1) of Example 5.1.3 yields ker_A [Ŷ, P̂, Q̂] = im_A M and therefore

    B_A^ext(A, B, C, E) = ker_A [P̂, Q̂],    (5.1.7)
see also [84, (34), p. 25]. By property (3) of Example 5.1.3, each other kernel-representation of B_A^ext(A, B, C, E) has rank p, too. It has been shown in [122, Lemma 2] that rank p implies that u is maximally free.
We conclude these considerations of multidimensional systems with a concrete example illustrating that in this case the external behavior of a first-order system does, in general, not admit a full row rank kernel-representation. To this end, we write ∂/∂x_i = ∂_i and let K[z, s] = ℂ[∂_1, ∂_2, ∂_3] act on A = C^∞(ℝ³, ℂ). In particular, s = ∂_3 is the distinguished variable. Let m = n = 1 and p = 2 and consider the first-order system

    ∂_3 x = u,  y = [∂_1; ∂_2] x,

whose external behavior is

    {(u, y_1, y_2)^T ∈ A³ | ∃ x ∈ A : (y_1, y_2, u)^T = (∂_1 x, ∂_2 x, ∂_3 x)^T} = ker_A R,  R := [∂_2, 0, −∂_3; −∂_1, ∂_3, 0; 0, −∂_2, ∂_1],

where the last identity is (up to a permutation of the components) simply the fact that the image of the gradient operator is the kernel of the curl operator in A³. This fact can also be derived from the corresponding identity ker_{ℂ[∂_1,∂_2,∂_3]} [∂_3, ∂_1, ∂_2] = im_{ℂ[∂_1,∂_2,∂_3]} R^T for polynomials by using property (1) of Example 5.1.3. Suppose now that ker_A R had a full row rank kernel-representation, say ker_A R = ker_A R̃ for some R̃ ∈ ℂ[∂_1, ∂_2, ∂_3]^{2×3}. But then property (3) of Example 5.1.3 would imply that im_{ℂ[∂_1,∂_2,∂_3]} R^T = im_{ℂ[∂_1,∂_2,∂_3]} R̃^T is a free module, which is certainly not the case. Hence we see that there exist behaviors which do admit realizations in the sense of Definition 5.1.6, but which do not allow a full row rank kernel-representation. Systems of this type will be excluded from our construction in the next section.
As pointed out in the introduction to this chapter, realization of transfer functions and realization of behaviors are in general not the same thing. However, the following relationship will be proved. The second of the statements below will be crucial in the next section, as it relates polynomial equations to solution spaces over A. It is a generalization of the purely differential (hence univariate) version given in [93, Lemma 2.1]. Notice that we are requiring [P, Q] to have full row rank.

Proposition 5.1.11
Let A be a nonzero divisible K[z, s]-module and let [P, Q] ∈ K[z, s]^{p×(m+p)} be a matrix with full row rank. Furthermore, let (A, B, C, E) ∈ K[z]^{n×n} × K[z]^{n×m} × K[z]^{p×n} × K[z]^{p×m} be a given matrix quadruple.
(a) If condition (5.1.2) is satisfied for R := [P, Q], then Q is nonsingular and

    −Q^{−1}P = C(sI − A)^{−1}B + E.    (5.1.8)

(b) Suppose that (5.1.8) is true. If X := QC(sI − A)^{−1} ∈ K[z, s]^{p×n}, i.e. X is polynomial, and if the polynomial matrix [X, P, Q] ∈ K[z, s]^{p×(n+m+p)} is right invertible over K[z, s], then

    ker_A [P, Q] = B_A^ext(A, B, C, E).
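Both hypotheses of part (b) can be checked mechanically. The following sympy sketch uses hypothetical scalar data (m = p = n = 1, E = 0): here X = QC(sI − A)^{−1} is polynomial, and [X, P, Q] has an explicit right inverse over K[z, s].

```python
import sympy as sp

z, s = sp.symbols('z s')

# hypothetical data, m = p = n = 1, E = 0 (illustration only)
A, B, C = sp.Matrix([[z]]), sp.Matrix([[1]]), sp.Matrix([[1]])
Q = sp.Matrix([[s - z]])
# choose P so that -Q^{-1}P = C(sI - A)^{-1}B, i.e. (5.1.8) holds:
P = (-Q * C * (s*sp.eye(1) - A).inv() * B).applyfunc(sp.cancel)   # = [-1]

X = (Q * C * (s*sp.eye(1) - A).inv()).applyfunc(sp.cancel)        # = [1], polynomial
assert X[0, 0].is_polynomial(z, s)

M = sp.Matrix.hstack(X, P, Q)            # [X, P, Q] = [1, -1, s - z]
R = sp.Matrix([1, 0, 0])                 # explicit right inverse over K[z, s]
assert (M * R)[0, 0] == 1
```

By part (b), the kernel-representation ker_A [P, Q] then coincides with the external behavior of the quadruple.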
Before giving the proof, we would like to present the following version of the Theorem of Quillen/Suslin on projective modules over polynomial rings (previously known as Serre's conjecture). At this point it is crucial that the operators are algebraically independent over K.
Theorem 5.1.12 (Quillen/Suslin)
For a matrix M ∈ K[z, s]^{p×q} the following conditions are equivalent:
(i) M is right invertible over K[z, s],
(ii) M can be completed to a unimodular matrix [M^T, N^T]^T ∈ Gl_q(K[z, s]),
(iii) the ideal generated by the full-size minors of M is the unit ideal in K[z, s].
Alternatively, every finitely generated projective module over K[z, s] is free.

Sketch of the proof: The implication (ii) ⇒ (i) is trivial, and (ii) ⇒ (iii) as well as (i) ⇒ (iii) are simple consequences of the Binet-Cauchy formula for the minors of matrix products. The assertion (iii) ⇒ (i) can be seen as follows: the full-size minors of M are given by M_(ρ), ρ ∈ I_{p,q}, see Definition 3.2.6. Denote the corresponding p×p-submatrices of M by M_ρ, thus det M_ρ = M_(ρ). By assumption there exist polynomials c_ρ ∈ K[z, s] such that Σ_{ρ∈I_{p,q}} c_ρ M_(ρ) = 1. Define the matrix C := Σ_{ρ∈I_{p,q}} c_ρ E_ρ adj(M_ρ) ∈ K[z, s]^{q×p}, where E_ρ ∈ K^{q×p} is the matrix with the identity I_p sitting on the rows with indices ρ = (ρ_1, ..., ρ_p) and zeros elsewhere, hence M E_ρ = M_ρ. Then C constitutes a right inverse of M. The remaining implication (i) ⇒ (ii), as well as the alternative formulation, is the celebrated result of Quillen/Suslin, see [67, pp. 491]; we also want to mention [69] for an algorithm computing a unimodular completion. □

Proof of Proposition 5.1.11: (a) From (5.1.2) we will first derive the identity
    M := QC adj(sI − A)B + det(sI − A) QE + det(sI − A) P = 0.    (5.1.9)

In fact, by divisibility of A it is enough to show that Mu = 0 for all u ∈ A^m. Thus, let u ∈ A^m be an arbitrary element and pick x ∈ A^n such that Bu = (sI − A)x; see Lemma 5.1.1. Put y = Cx + Eu. Then Pu + Qy = 0 and one easily verifies
    Mu = det(sI − A)(QCx + QEu + Pu) = 0,

hence (5.1.9) follows. This in turn implies

    [P, Q] [I_m; C(sI − A)^{−1}B + E] = 0,

considered as an equation over the field K(z, s). Since both matrices have full rank, Lemma 3.2.7 yields det Q ≠ 0, and (5.1.8) is established.
(b) Write again R = [P, Q]. By Theorem 5.1.12, the matrix [−X, R] can be completed to a unimodular matrix

    [−X, R; U_1, U_2] ∈ Gl_{n+m+p}(K[z, s]),

and the assumptions can be rewritten as the matrix identity

    [−X, R; U_1, U_2] [sI − A, −B; 0, I_m; C, E] = [0; T],  where T := U_1 [sI − A, −B] + U_2 [0, I_m; C, E] ∈ K[z, s]^{(n+m)×(n+m)}.

Hence, for every (u^T, y^T)^T ∈ A^{m+p},

    ∃ v ∈ A^{n+m} : [sI − A, −B; 0, I_m; C, E] v = (0, u^T, y^T)^T  ⟺  [P, Q](u^T, y^T)^T = 0 and ∃ v ∈ A^{n+m} : Tv = U_2 (u^T, y^T)^T.

By virtue of Lemma 5.1.1, the operator T is surjective on A^{n+m}. Since the existence of such a v on the left-hand side just expresses the existence of a latent variable x with sx = Ax + Bu and y = Cx + Eu, the above equivalence provides B_A^ext(A, B, C, E) = ker_A R = ker_A [P, Q], as desired. □
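The minor-based construction of a right inverse in the proof sketch of Theorem 5.1.12 can be replayed concretely. In the hypothetical 2×3 example below, the minor on the first two columns already equals 1, so the coefficients c_ρ can be chosen as 1 and 0.

```python
import sympy as sp
from itertools import combinations

z, s = sp.symbols('z s')

M = sp.Matrix([[1 + z*s, s, 0],
               [z,       1, s]])          # hypothetical; det M[:, (0,1)] = 1
p, q = M.shape

cols = list(combinations(range(q), p))
c = {rho: sp.Integer(0) for rho in cols}
c[(0, 1)] = sp.Integer(1)
# the full-size minors generate the unit ideal: sum c_rho * M_(rho) = 1
assert sp.expand(sum(c[r] * M[:, list(r)].det() for r in cols)) == 1

# C := sum c_rho * E_rho * adj(M_rho) is a right inverse of M
Cright = sp.zeros(q, p)
for rho in cols:
    E = sp.zeros(q, p)
    for i, r in enumerate(rho):
        E[r, i] = 1                       # M * E_rho = M_rho
    Cright += c[rho] * E * M[:, list(rho)].adjugate()
assert sp.expand(M * Cright) == sp.eye(p)
```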
Notice the analogy of the last part with the line of arguments in the proof of Theorem 4.4.1(a). Remark in particular that the assumptions of part (b) of the proposition, together with the divisibility of A, allow the elimination procedure in the last step of the proof.
We conclude this section with a brief consideration concerning the formal transfer function. This will show in particular that realizable delay-differential systems are strongly nonanticipating. Recall the notation in 3.1(f) for formal power series over a ring.
Remark 5.1.13
Consider the formal transfer function C(sI − A)^{−1}B + E ∈ K(z, s)^{p×m}. On the one hand, we have

    C(sI − A)^{−1}B + E = Σ_{i=1}^{∞} CA^{i−1}B s^{−i} + E ∈ K[z][[s^{−1}]]^{p×m}.    (5.1.10)

This fact will be of importance for the construction in the next section. On the other hand, for delay-differential systems with commensurate delays, the ring K[z] = ℝ[z] is univariate and we also have

    C(sI − A)^{−1}B + E ∈ ℝ(s)_p[z]^{p×m},    (5.1.11)

where ℝ(s)_p is the ring of all proper rational functions in ℝ(s), see Remark 3.5.7. Equation (5.1.11) can be seen as follows. The determinant of sI − A is of the form Σ_i a_i(s) z^i, where a_0(s) = det(sI − A(0)) is of degree n (the size of A). Since this is also the maximal degree in s attained by the full-size minors of the matrix [sI − A, B], we get from Proposition 4.2.5(c) that (sI − A)^{−1}B ∈ ℝ(s)_p[z]^{n×m}, from which (5.1.11) follows. Hence the system B_A^ext(A, B, C, E) is strongly nonanticipating, see Remark 4.2.4.
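Expansion (5.1.10) can be verified to any truncation order. The sympy sketch below (hypothetical matrices over K[z], E = 0; illustration only) uses the exact tail identity (sI − A)^{−1} = Σ_{i=0}^{k−1} A^i s^{−i−1} + s^{−k} A^k (sI − A)^{−1}.

```python
import sympy as sp

z, s = sp.symbols('z s')

# hypothetical data over K[z] (illustration only), E = 0
A = sp.Matrix([[0, z], [1, 0]])
B = sp.Matrix([1, z])
C = sp.Matrix([[z, 1]])

G = (C * (s*sp.eye(2) - A).inv() * B)[0, 0]   # formal transfer function

k = 4
partial = sum((C * A**(i-1) * B)[0, 0] * s**(-i) for i in range(1, k+1))
tail = (C * A**k * (s*sp.eye(2) - A).inv() * B)[0, 0] * s**(-k)
assert sp.simplify(G - partial - tail) == 0   # G = sum_{i>=1} C A^{i-1} B s^{-i}
```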
5.2 The Realization Procedure of Fuhrmann

Now we will establish Fuhrmann's realization for the systems introduced in the previous section. Thus, as before, K[z, s] = K[z_1, ..., z_l, s] is a polynomial ring in l + 1 indeterminates acting on a nonzero divisible module A. These assumptions will allow us to utilize once more the Theorem of Quillen/Suslin in order to carry out the polynomial part of the construction. The realization will then be completed by applying Proposition 5.1.11, which translates the polynomial identities into identities of solution spaces over A.
The polynomial model of Fuhrmann in [33], see also [34], has been developed for the realization of transfer functions of finite-dimensional systems. Precisely, given a proper rational matrix G ∈ ℝ(s)_p^{p×m}, matrices A, B, C, and E are constructed such that G = C(sI − A)^{−1}B + E; recall Example 5.1.8. The construction makes use of a factorization of G into a quotient Q^{−1}P of polynomial matrices and takes place in a purely formal setting of Laurent series. It is the purpose of this section to show that for the multi-operator systems of the previous section, the procedure of Fuhrmann can be carried out in exactly the same way and provides a realization of the formal transfer function −Q^{−1}P of a system ker_A [P, Q]; but even more, it yields a realization of the behavior itself, hence it leads to (5.1.2). The construction works provided that certain assumptions on the system ker_A [P, Q] are satisfied; in particular, Q has to be nonsingular. At the end of
the section we will discuss how restrictive these requirements are for the various system classes given in the preceding section.
By virtue of Proposition 5.1.11, the key for the realization of the behavior is to realize the transfer function −Q^{−1}P of the system. Indeed, a matrix X satisfying the conditions of part (b) of that proposition will come as a by-product, so that we finally arrive at the desired behavioral realization (5.1.2). Therefore, the module A does not play a role for the construction itself. In fact, the realization of the rational matrix −Q^{−1}P can be best understood by thinking of (5.1.4) as a discrete-time system over the ring K[z] with s denoting the shift; thus x, u, and y are vectors with entries in K[z], as discussed in Section 3.3. The dynamics can also be formalized via Laurent series over K[z], which is then exactly the setting for the Fuhrmann-realization. In fact, we will see that the realization (5.1.2) is even valid for the nondivisible module A := K[z]((s^{−1})) of formal Laurent series in s^{−1} with coefficients in K[z]. We begin by fixing some notation, compatible with the earlier notation for delay systems.
Definition 5.2.1
(a) A matrix F ∈ K[z]((s^{−1}))^{p×q} is called proper if F is a power series in s^{−1}, thus if F ∈ K[z][[s^{−1}]]^{p×q}. The matrix F is called strictly proper if it is proper without a constant term, i.e., if F ∈ s^{−1}K[z][[s^{−1}]]^{p×q}.
(b) Denote by Π_− and Π_+ the projections onto the strictly proper part and the polynomial part, respectively, that is,

    Π_− : K[z]((s^{−1}))^{p×k} → K[z]((s^{−1}))^{p×k},  Σ_{i=−∞}^{N} F_i s^i ↦ Σ_{i=−∞}^{−1} F_i s^i,
    Π_+ : K[z]((s^{−1}))^{p×k} → K[z]((s^{−1}))^{p×k},  Σ_{i=−∞}^{N} F_i s^i ↦ Σ_{i=0}^{N} F_i s^i.

Note that Π_+ = id − Π_−.
(c) An element f = Σ_{i=−∞}^{N} f_i s^i ∈ K[z]((s^{−1})) is called monic if it is nonzero and its highest coefficient is a nonzero constant, i.e., if f_N ∈ K\{0}. The units of the ring K[z]((s^{−1})) are just the monic elements. (Notice that in this chapter we do not require the highest coefficient of a monic element to be 1; this differs from our terminology in Sections 3.4, 4.5, and 4.6.)

The starting point for the construction will be a matrix

    [P, Q] ∈ K[z, s]^{p×(m+p)}, where det Q is monic and −Q^{−1}P is proper.    (5.2.1)

In particular we are requiring that [P, Q] have full row rank. The even stronger assumption that det Q be monic is necessary for the specific construction to work.
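The projections Π_− and Π_+ act coefficientwise on the powers of s. For Laurent polynomials (finitely many terms) they can be sketched as follows (the helper names `pi_minus`/`pi_plus` are hypothetical):

```python
import sympy as sp

z, s = sp.symbols('z s')

def pi_minus(f):
    # strictly proper part: collect the terms with a negative power of s
    return sum((t for t in sp.expand(f).as_ordered_terms()
                if t.as_coeff_exponent(s)[1] < 0), sp.Integer(0))

def pi_plus(f):
    # polynomial part: Pi_+ = id - Pi_-
    return sp.expand(f) - pi_minus(f)

f = z**2*s**2 + s + z + (z + 1)/s + z**3/s**2
assert sp.simplify(pi_minus(f) - ((z + 1)/s + z**3/s**2)) == 0
assert sp.simplify(pi_plus(f) - (z**2*s**2 + s + z)) == 0
```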
Note that under this assumption the matrix −Q^{−1}P is indeed a Laurent series over K[z], just like the formal transfer function of a first-order system, see (5.1.10). That equation together with Proposition 5.1.11(a) now shows that the properness of −Q^{−1}P is necessary for realizability whenever Q has a monic determinant.
In order to keep the notation simple, we will confine ourselves to systems where −Q^{−1}P is strictly proper. This is no restriction at all, as one can see from the equivalence

    ker_A [P, Q] = B_A^ext(A, B, C, E) ⟺ ker_A [P + QE, Q] = B_A^ext(A, B, C, 0),    (5.2.2)

where E = −Q^{−1}P − Π_−(−Q^{−1}P) is the constant part of the proper matrix −Q^{−1}P.
The first step of the Fuhrmann-realization of [P, Q] is the construction of an abstract "state-module". The underlying idea is perhaps most clearly seen by regarding [P, Q] as a system over the space A := K[z]((s^{−1})). Even though A is not a divisible K[z, s]-module, it makes perfect sense to consider the solution space

    ker_A [P, Q] = {(u^T, y^T)^T ∈ A^{m+p} | Pu + Qy = 0} = {(u^T, y^T)^T ∈ A^{m+p} | y = −Q^{−1}Pu},    (5.2.3)

where we simply make use of the fact that −Q^{−1}P has entries in the ring K[z]((s^{−1})), too. With each series f = Σ_{i=−∞}^{L} f_{−i} s^i we can associate the time sequence with value f_{−i} at time −i. Hence the sequence starts at time −L. The dynamics, the multiplication by s, translates into the backward shift for the associated sequences. Thus we can also think of a discrete-time system over the ring K[z] in the sense of Section 3.3. If we take t = 0 as the present time instant, then each sequence has a finite past, the polynomial part of the Laurent series.
The state space for the first-order representation of (5.2.3) is going to be the space of all those past pieces which are capable of giving rise to a (unique) future piece for the system ker_A [P, Q]. Equation (5.2.3) indicates that the vector polynomials f, where Q^{−1}f is strictly proper, are candidates for the state space. This will in fact lead to the correct notion. We will introduce the space in a slightly different way, which will facilitate the introduction of the "state to next-state map" later on. Define the map

    Π_Q : K[z, s]^p → K[z, s]^p,  f ↦ QΠ_−(Q^{−1}f).

Recall again that det Q being monic guarantees Q^{−1} ∈ K[z]((s^{−1}))^{p×p}. As a consequence, Π_Q(f) = f − QΠ_+(Q^{−1}f) is indeed in K[z, s]^p. Notice that if p = 1, then Π_Q(f) is just the remainder of f after division by Q in the ring K[z, s]. In any case, Π_Q is a K[z]-linear map satisfying Π_Q ∘ Π_Q = Π_Q, thus a projection. The image of Π_Q is going to be the abstract state space. With part (d) of the following theorem we will establish that the state space is a finitely generated free K[z]-module, thus allowing matrix representations for the linear maps to be defined later on.

Theorem 5.2.2
Let Q ∈ K[z, s]^{p×p} be a matrix with monic determinant and define the space S_Q := im Π_Q ⊆ K[z, s]^p. Then S_Q is a K[z]-module and satisfies the following properties.
(a) S_Q = {f ∈ K[z, s]^p | Q^{−1}f is strictly proper},
(b) S_Q = span_{K[z]}{Π_Q(e_i s^j) | i = 1, ..., p, j = 0, ..., deg_s(det Q) − 1}, where e_1, ..., e_p are the standard basis vectors of K[z]^p,
(c) K[z, s]^p = ker Π_Q ⊕ S_Q = QK[z, s]^p ⊕ S_Q,
(d) S_Q is a free K[z]-module with rank S_Q = deg_s(det Q).

Proof: (a) is obvious.
(b) Let f ∈ K[z, s]^p. We can carry out division with remainder by det Q and get an expression f = f_1 det Q + g_1, where f_1, g_1 ∈ K[z, s]^p and the degree with respect to s of each entry of g_1 is strictly less than deg_s(det Q). Then

    Π_Q(f) = QΠ_−(adj(Q)f_1 + Q^{−1}g_1) = QΠ_−(Q^{−1}g_1) = Π_Q(g_1),

which is what we wanted.
(c) The first identity holds true for arbitrary projections. Thus there remains to show that ker Π_Q = QK[z, s]^p. The inclusion "⊇" follows directly from the definition of Π_Q. As for "⊆", let f ∈ K[z, s]^p and Π_Q(f) = 0. Then 0 = Q(id − Π_+)(Q^{−1}f) = f − QΠ_+(Q^{−1}f), showing that f = QΠ_+(Q^{−1}f) is contained in QK[z, s]^p.
(d) By (b) and (c) the K[z]-module S_Q is finitely generated and projective. Hence it is a free K[z]-module according to Theorem 5.1.12 of Quillen/Suslin. Let

    {g_1, ..., g_n} ⊆ K[z, s]^p    (5.2.4)

be a basis of S_Q. In order to show that n = deg_s(det Q), we will use the results about the Fuhrmann-realization over fields, in this case over the field K(z). Hence we will pass for a moment to the polynomial ring K(z)[s], where we can perform the same constructions as above. This way we can take advantage of a Smith-form Q̃ of Q, which provides some more information about the corresponding image S̃_Q. Consider the projection
    Π̃_Q : K(z)[s]^p → K(z)[s]^p,  f ↦ QΠ_−(Q^{−1}f),

where, of course, Π_− stands for the projection onto the strictly proper part in the ring K(z)((s^{−1}))^p as well. Put

    S̃_Q := im Π̃_Q ⊆ K(z)[s]^p    (5.2.5)

and let UQV = Q̃ = diag(q_1, ..., q_p) ∈ K(z)[s]^{p×p} be a Smith-form of Q over the polynomial ring K(z)[s], thus U, V ∈ Gl_p(K(z)[s]). As in part (a) one has S̃_Q̃ = {f ∈ K(z)[s]^p | Q̃^{−1}f strictly proper}, thus

    S̃_Q̃ = {(f_1, ..., f_p)^T ∈ K(z)[s]^p | deg_s f_i < deg_s q_i, i = 1, ..., p}.

Now [34, Thm. 4.11] yields that S̃_Q and S̃_Q̃ are isomorphic K(z)-vector spaces (this can also be checked directly; the isomorphism is given by S̃_Q̃ ∋ f ↦ Π̃_Q(Uf) ∈ S̃_Q). Therefore we obtain

    dim_{K(z)} S̃_Q = Σ_{i=1}^{p} deg_s q_i = deg_s(det Q).

There remains to show that the rank of the module S_Q coincides with the dimension of the vector space S̃_Q. But this is a consequence of the K(z)-linearity of Π̃_Q and can be seen as follows. Let g = Π̃_Q(f) ∈ S̃_Q for some f ∈ K(z)[s]^p. Writing f = h^{−1}f̂, where f̂ ∈ K[z, s]^p and h ∈ K[z]\{0}, we get g = h^{−1}Π̃_Q(f̂) = h^{−1}Π_Q(f̂) = h^{−1}ĝ for some ĝ ∈ S_Q = span_{K[z]}{g_1, ..., g_n}. Thus

    S̃_Q = span_{K(z)}{g_1, ..., g_n},    (5.2.6)

and together with the linear independence of g_1, ..., g_n over K(z), this amounts to rank S_Q = n = dim_{K(z)} S̃_Q = deg_s(det Q). □

Now we are ready to establish the Fuhrmann-realization of a system ker_A [P, Q]. To this end, maps Â, B̂, and Ĉ are introduced which translate the dynamics of the system into state-space form. Their matrix representations A, B, and C will then lead to the identity −Q^{−1}P = C(sI − A)^{−1}B for the rational functions. The first-order representation (5.1.2), (5.1.3) will finally be accomplished by use of Proposition 5.1.11. In part (e) below we add the fact that the construction also yields a realization over A = K[z]((s^{−1})).
Let us briefly describe the dynamics of the maps given in the theorem below. First of all, B̂ simply feeds the values u_{−i} ∈ K[z]^m of the input sequence u = Σ_{i=−∞}^{L} u_{−i} s^i through the map P into the system, which is in accordance with the equation Pu + Qy = 0. The maps Â and Ĉ both take a state f and calculate the corresponding future piece Q^{−1}f, which then will be shifted one step backward (in other words, the system moves one step forward in time). The output map Ĉ simply delivers the current value at (the new) time t = 0, hence the constant part of Q^{−1}sf. The "state to next-state map" Â disregards exactly this current value by taking the future piece Π_−(Q^{−1}sf) and associates with it the corresponding past piece QΠ_−(Q^{−1}sf), which then is the new state.
Theorem 5.2.3 (Fuhrmann-Realization)
Let [P, Q] ∈ K[z, s]^{p×(m+p)} be a matrix where det Q is monic and Q^{−1}P is strictly proper. Put n := deg_s(det Q). Then the K[z]-linear maps

    Â : S_Q → S_Q,  f ↦ Π_Q(sf) = QΠ_−(Q^{−1}sf),
    B̂ : K[z]^m → S_Q,  ξ ↦ −Pξ,
    Ĉ : S_Q → K[z]^p,  f ↦ Π_+(Q^{−1}sf)

are well-defined. Fix a basis f_1, ..., f_n ∈ K[z, s]^p of S_Q and let

    A ∈ K[z]^{n×n},  B ∈ K[z]^{n×m},  C ∈ K[z]^{p×n}

be the matrix representations of Â, B̂, and Ĉ with respect to the chosen basis of S_Q and the standard bases of K[z]^m and K[z]^p. Finally, put X := [f_1, ..., f_n]. Then
(a) −Q^{−1}P = C(sI − A)^{−1}B,
(b) the matrix [X, Q] ∈ K[z, s]^{p×(n+p)} is right invertible,
(c) B_A^ext(A, B, C, 0) = ker_A [P, Q] for every nonzero divisible K[z, s]-module A,
(d) the realization (A, B, C, 0) is coreachable, meaning that [(sI − A)^T, C^T]^T is left invertible over K[z, s],
(e) B_A^ext(A, B, C, 0) = ker_A [P, Q] for the module A = K[z]((s^{−1})).

The theorem is a slightly enhanced version of [40, Thm. 3.3]. Notice that coreachability is simply the dual of reachability in the sense of systems over the ring K[z], see (1) of Section 3.3.

Proof: First of all, one should note that by strict properness of Q^{−1}P, the image of the map B̂ is in fact contained in S_Q, see Theorem 5.2.2(a). Furthermore, by definition of S_Q, the vector Q^{−1}sf is proper whenever f ∈ S_Q. Hence Ĉ(f) is simply the constant part of Q^{−1}sf and indeed contained in K[z]^p. Hence the maps are well-defined.
(a) This part follows by passing to the corresponding K(z)-spaces and applying Fuhrmann's result for systems over a field. Indeed, we know from (5.2.4) and (5.2.6) that f_1, ..., f_n is a basis of the K(z)-vector space S̃_Q given in (5.2.5), too. Hence the triple (A, B, C), regarded as matrices over K(z), constitutes also the Fuhrmann-realization of [P, Q] ∈ K(z)[s]^{p×(m+p)} given in [34, p. 40], and [34, Thm. 10.1] implies that C(sI − A)^{−1}B = −Q^{−1}P.
(b) From the definition of X and Theorem 5.2.2(c) we get K[z, s]^p = [X, Q] K[z, s]^{n+p}, thus the right invertibility of [X, Q].
(c) We will show that X = [f_1, ..., f_n] = QC(sI − A)^{−1}, which then will allow us to apply Proposition 5.1.11. Consider the matrices A, B, and C. The choice of the bases for the modules involved implies the relations

    Π_Q(sX) = XA,  −P = XB,  Π_+(Q^{−1}sX) = C,

which in turn yield

    X(sI − A) = sX − QΠ_−(Q^{−1}sX) = Q(id − Π_−)(Q^{−1}sX) = QΠ_+(Q^{−1}sX) = QC.    (5.2.7)

Thus X = QC(sI − A)^{−1}. Now, by virtue of parts (a) and (b) above, Proposition 5.1.11(b) implies the desired result.
(d) To prove coreachability, rewrite Equation (5.2.7) as

    [−X, Q] [sI − A; C] = 0.    (5.2.8)

By virtue of Lemma 3.2.7(1), we get for the full-size minors

    a [−X, Q]_(ρ) = ±b [(sI − A)^T, C^T]^T_(ρ̂)  for all ρ ∈ I_{p,n+p},

where a, b ∈ K[z, s] are taken as coprime polynomials. The coprimeness of the full-size minors of [−X, Q] implies at once a ∈ K\{0}. But then also b ∈ K\{0}, as can be deduced from the equation deg_s(det Q) = n = deg_s det(sI − A) and the fact that det Q is monic. Thus the full-size minors of [(sI − A)^T, C^T]^T coincide (up to signs and a nonzero constant) with those of [−X, Q], and the result follows from Theorem 5.1.12.
(e) The inclusion "⊆" follows from (a), see also (5.2.3). As for "⊇", let (u^T, y^T)^T ∈ K[z]((s^{−1}))^{m+p} satisfy Pu = −Qy. Thus y = −Q^{−1}Pu = C(sI − A)^{−1}Bu. We have to show that x := (sI − A)^{−1}Bu ∈ K[z]((s^{−1}))^n. To this end, we make use of the coreachability established in (d). Picking a left inverse [M, N] ∈ K[z, s]^{n×(n+p)}, the identity

    [M, N] [sI − A; C] = I_n

leads to (sI − A)^{−1} = M + NC(sI − A)^{−1}. Hence

    x = MBu + NC(sI − A)^{−1}Bu = MBu + Ny

is in K[z]((s^{−1}))^n, since the series u and y have their coefficients in K[z] and the matrices are polynomial. □

We illustrate the procedure by the following example.

Example 5.2.4
Let us consider again delay-differential equations with commensurate point delays. Thus, ℝ[σ, D] acts on A = C^∞(ℝ, ℂ), where, as in the previous chapters, σ
denotes the forward shift of unit length and D denotes differentiation. Consider the system ker_A [P, Q] ⊆ A³, where

    [P, Q] = [ σ−1   (σ−1)D²       D   ]
             [ 0     ((σ−1)²+1)D   σ−1 ]  ∈ ℝ[σ, D]^{2×3}.

Then det Q = −D² is monic and

    Q^{−1}P = −D^{−2} [ (σ−1)² ; −((σ−1)³ + (σ−1))D ]

is strictly proper. From Theorem 5.2.2(b) we know that

    S_Q = span_{ℝ[σ]} {Π_Q(e_1), Π_Q(e_2), Π_Q(De_1), Π_Q(De_2)}.

One calculates

    QΠ_−(Q^{−1}[e_1, e_2, De_1, De_2]) = [ 1   (σ−1)D      −(σ−1)²D        0 ]
                                         [ 0   (σ−1)²+1    −(σ−1)³−(σ−1)   0 ]

and therefore X = [f_1, f_2], since f_1 := (1, 0)^T and f_2 := ((σ−1)D, (σ−1)²+1)^T form a basis of S_Q. In order to determine the maps Â, B̂, and Ĉ and their matrix representations we calculate

    Â(X) = Π_Q[Df_1, Df_2] = [−(σ−1)f_2, 0],
    B̂(e_1) = −P = (1−σ)f_1,
    Ĉ(X) = Π_+(Q^{−1}[Df_1, Df_2]) = Π_+ [ −(σ−1)/D   1 ]  =  [ 0          1 ]
                                         [ (σ−1)²+1   0 ]     [ (σ−1)²+1   0 ].

Now, use of Theorem 5.2.3 leads to the first-order system

    Dx = [ 0     0 ] x + [ 1−σ ] u,    y = [ 0          1 ] x
         [ 1−σ   0 ]     [ 0   ]           [ (σ−1)²+1   0 ]

as a realization for ker_A [P, Q]. This can also be verified by some straightforward calculations.
At the end of this section we want to discuss the assumptions imposed on the matrix [P, Q] in (5.2.1). Size and rank of [P, Q] as well as properness have been discussed in the Examples 5.1.8-5.1.10. Thus there only remains to investigate the requirement that det Q be monic.

Example 5.2.5 (Transfer Functions)
Let A = K(z, s) or A = K(z)((s^{−1})) as in Example 5.1.4. As it has been shown in Example 5.1.8, this is simply the classical situation of realization of transfer
functions for systems over rings, which has been studied in much detail in the literature; for an overview see e.g. [12, Ch. 4] and [62] and the references therein. In this case, the classical realization procedures for i/o-operators over rings given in the above-mentioned literature apply more generally than the Fuhrmann construction. In particular, realizability is simply characterized by the property −Q^{−1}P ∈ K[z][[s^{−1}]]^{p×m} (see [12, Theorems 4.13 and 4.14]). Moreover, since nonsingular left factors of [P, Q] do not matter, i.e.,

    ker_A [P, Q] = ker_A U[P, Q] for all nonsingular U ∈ K(z, s)^{p×p},

the requirement that det Q be monic, as needed for the Fuhrmann construction, is not necessary for realizability. Let us illustrate this by the simple example

    ker_A [z, −s, 0; z², 0, −s] = {(u, y_1, y_2)^T ∈ A³ | ∃ x ∈ A : sx = u, y = [z; z²] x},

which can be verified easily. The realization is canonical and absolutely minimal in the sense of systems over rings, see [95]. One can show by some lengthy but straightforward calculations that it is not possible to find a polynomial kernel-representation U[P, Q] ∈ K[z, s]^{2×3}, where U is a nonsingular matrix such that det(UQ) is monic and of degree 1.
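With the data as read here — the one-dimensional realization sx = u, y = (z, z²)^T x, and a kernel-representation with Q = −sI₂ — the identity −Q^{−1}P = C(sI − A)^{−1}B is a one-line check (sympy sketch; the concrete matrices are an assumption of this illustration):

```python
import sympy as sp

z, s = sp.symbols('z s')

# the 1-dimensional realization: sx = u, y = (z, z^2)^T x
A, B = sp.Matrix([[0]]), sp.Matrix([[1]])
C = sp.Matrix([z, z**2])                       # 2x1
G = C * (s*sp.eye(1) - A).inv() * B            # = (z/s, z^2/s)^T

# kernel-representation with det Q = s^2, monic of degree 2 > n = 1
P = sp.Matrix([z, z**2])
Q = -s * sp.eye(2)
assert sp.simplify(-Q.inv()*P - G) == sp.zeros(2, 1)
```

So here the realization dimension 1 is strictly smaller than deg_s(det Q) = 2.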
Example 5.2.6 (Multidimensional Systems)
Let K[z, s] and A be any of the cases in Example 5.1.3. We saw in Example 5.1.10 that realizability of ker_A R does not require the row space of R to be a free module of rank p. If, however, such a full row rank kernel-representation [P, Q] ∈ K[z, s]^{p×(m+p)} of B_A^ext(A, B, C, E) does exist, then det Q is necessarily monic, as we will establish now. To this end, let ker_A [P, Q] = B_A^ext(A, B, C, E) be as in (5.1.2), (5.1.3). We will make use of the notation in Example 5.1.10. By Proposition 5.1.11 we have −Q^{−1}P = C(sI − A)^{−1}B + E. Moreover, (5.1.2) and (5.1.3) show that ker_A(sI − A) ⊆ ker_A QC, thus property (3) of 5.1.3 guarantees the existence of a matrix Y ∈ K[z, s]^{p×n} such that Y(sI − A) = −QC. All this together provides the matrix equation [Y, P, Q]M = 0, where M is as in (5.1.5), hence

    ker_{K[z,s]} M^T ⊇ im_{K[z,s]} [Y, P, Q]^T.    (5.2.9)

As for the converse inclusion, consider Equation (5.1.6) and B_A^ext(A, B, C, E) = ker_A [P̂, Q̂] in (5.1.7). Property (3) of Example 5.1.3 yields a matrix W ∈ K[z, s]^{l×p} such that W[P, Q] = [P̂, Q̂]. Using once more (5.1.6), this leads to WY = Ŷ and hence to

    ker_{K[z,s]} M^T = im_{K[z,s]} [Ŷ, P̂, Q̂]^T = im_{K[z,s]} ([Y, P, Q]^T W^T) ⊆ im_{K[z,s]} [Y, P, Q]^T,

establishing equality in (5.2.9). Hence we have a full row rank matrix [Y, P, Q] whose row space has a kernel-representation over the polynomials. It has been shown in [117, Thm. 3.3.8] that this implies that [Y, P, Q] is minor prime;
cf. Theorem 4.1.12 for the notion of minor primeness. Applying Lemma 3.2.7(1) to the equation [Y, P, Q]M = 0, we obtain that det Q is a divisor of det(sI − A) in K[z, s], and therefore monic, which is what we wanted to prove. Note that the last step also shows that each first-order representation has at least deg_s(det Q) many latent variables, so that the Fuhrmann-realization turns out to be minimal with respect to this parameter. In this sense, it results in an optimal realization.

For delay systems with commensurate delays we will discuss the requirements for the Fuhrmann-realization as well as the minimality of first-order representations in the next two sections, where also the general case [P, Q] with entries in H will be taken into consideration.
5.3 First-Order Realizations for Delay-Differential Systems For the rest of this chapter we will concentrate on DDEs. Hence from now on we will consider exclusively the situation, where the polynomial ring is given by IR[s, z] ~ IR[D, a-] acting on .C = C00 (1R, C) as well as the larger ring 7-l of distributed delay operators. In this section we wish to clarify which systems ker.c [P, Q], where [P, Q] has now entries in 7-l, can be represented in the form
B';'(A,B,C,E) = {
(~) E cm+v 13 x E en:~:~~!~~}
(5.3.1)
for some matrices A, B, C, and E of the usual sizes and with entries in R[z]. A careful study of the elimination procedure derived in Section 4.4 will reveal that (5.3.1) is a behavior in the sense of Definition 4.1, which admits a kernel-representation [P, Q] ∈ H^{p×(m+p)} where Q is nonsingular with monic determinant and Q^{-1}P is proper. This way we will see that the requirements in (5.2.1) are necessary not only for the Fuhrmann-construction but for realizability in general. They are also sufficient, as will be established by applying the Fuhrmann-realization to a suitable "numerator matrix" of [P, Q] ∈ H^{p×(m+p)}. Let us start with the necessity. Of course, in exactly the same way as for polynomials in Definition 5.1.6(b), a matrix [P, Q] ∈ H^{p×(m+p)} is said to be realizable if matrices (A, B, C, E) over R[z] exist such that

    ker_L [P, Q] = B_ext(A, B, C, E).        (5.3.2)
We begin by deriving a kernel-representation having the desired properties (5.2.1). We also give a sufficient condition for the existence of a polynomial
5 First-Order Representations
kernel-representation in part (d) below. Recall that H_0 ⊆ R(s)[z] ⊆ R[z]((s^{-1})) and that q = pφ^{-1} ∈ H_0 being monic in the sense of Definition 5.2.1(c) is the same as p being monic with respect to s. We would also like to recall the notation [P, Q]_(p) for the full-size minors of a given matrix.

Proposition 5.3.1
For every first-order system (5.3.1) with matrices (A, B, C, E) ∈ R[z]^{n×n} × R[z]^{n×m} × R[z]^{p×n} × R[z]^{p×m} there exists a matrix [P, Q] ∈ H_0^{p×(m+p)} satisfying
(a) B_ext(A, B, C, E) = ker_L [P, Q],
(b) Q is nonsingular and -Q^{-1}P = C(sI - A)^{-1}B + E,
(c) det Q is monic and deg_s [P, Q]_(p) ≤ deg_s(det Q) ≤ n for all selections p ∈ J_{p,m+p},
(d) deg_s(det Q) = n if and only if the matrix [(sI - A)^T, C^T]^T is left invertible over H_0. (Notice that by Theorem 4.1.5(a) this assertion does not depend on the specific choice of the kernel-representation [P, Q].)
Moreover, if deg_s(det Q) = n, then the system B_ext(A, B, C, E) admits a polynomial kernel-representation [P, Q] ∈ R[s, z]^{p×(m+p)}.

PROOF: (a) First note that the external behavior (5.3.1) can be written as

    [ sI - A ]        [ B   0 ] ( v )
    [   C    ] x  =   [ -E  I ] ( w ) .
This is a latent variable description to which the elimination result of Theorem 4.4.1 applies. Precisely, there exists a unimodular matrix

    U = [ U1  U2 ]
        [ U3  U4 ]

such that

    [ U1  U2 ] [ sI - A ]   [ M ]
    [ U3  U4 ] [   C    ] = [ 0 ]

for some matrix M ∈ H_0^{n×n}. Now Theorem 4.4.1(b) yields

    B_ext(A, B, C, E) = ker_L [U3, U4] [ B   0 ]  =  ker_L [P, Q],
                                       [ -E  I ]
where [P, Q] = [U3 B - U4 E, U4]. This proves (a).
(b) follows from (a) in exactly the same way as in Proposition 5.1.11(a); this time all matrices have entries in H_0 and act on the divisible H_0-module L.
(c) Consider the equation

    [U3, U4] [ sI - A ]  =  0.        (5.3.3)
             [   C    ]
Denote the full-size minors of [U3, U4] by [U3, U4]_(p) = u_p φ^{-1} ∈ H_0, where u_p ∈ R[s, z] and φ ∈ R[s]. Since [U3, U4] is right invertible over H_0, its full-size minors are coprime in H_0, and we may assume without restriction that the polynomials u_p are coprime in R[s, z]. Then (5.3.3) combined with Lemma 3.2.7(1) implies

    b u_p = ±φ [(sI - A)^T, C^T]^T_(p')  for all p ∈ J_{p,n+p}

for some b ∈ R[s, z], where p' denotes the selection complementary to p, thus

    b [U3, U4]_(p) = ±[(sI - A)^T, C^T]^T_(p')  for all p ∈ J_{p,n+p},        (5.3.4)

and in particular

    b det Q = b det U4 = ±det(sI - A),        (5.3.5)

showing that det Q is monic. Moreover, (5.3.4) implies deg_s [U3, U4]_(p) ≤ deg_s(det U4) for all selections p ∈ J_{p,n+p}, and the Binet-Cauchy formula applied to

    [P, Q] = [U3, U4] [ B   0 ]        (5.3.6)
                      [ -E  I ]

yields the second assertion of (c).
(d) In order to show the equivalence, consider again the proof of part (c). The assertion deg_s(det Q) = n is the same as saying that the polynomial b in (5.3.5) is a constant, hence a unit in H_0. This in turn is equivalent to the coprimeness of the full-size minors of [(sI - A)^T, C^T]^T in H_0, by virtue of (5.3.4) together with the right invertibility of [U3, U4]. From Corollary 3.2.5 we get, equivalently, the left invertibility of [(sI - A)^T, C^T]^T, concluding this part of the proof. The existence of a polynomial kernel-representation is a consequence of Theorem 4.4.1(b). □

Remark 5.3.2
In Remark 5.1.13 we showed that the rational function C(sI - A)^{-1}B + E has its entries in R(s)_p[z]; hence the system is strongly nonanticipating. This can also be seen from part (c) above, which says that det Q is of the form

    det Q = (s^l + Σ_{i=0}^{l-1} p_i s^i) φ^{-1}

for some l ∈ N, p_i ∈ R[z], and φ ∈ R[s]. It follows that deg_s(det Q) = deg det Q(s, 0), and the second assertion in (c) together with Proposition 4.2.5(c) shows that the entries of Q^{-1}P are in R(s)_p[z].
From the Fuhrmann-realization and (5.2.2) we know that the conditions in Proposition 5.3.1(c) are sufficient for realizability of ker_L [P, Q] provided the matrix [P, Q] is polynomial. As we will show now, this remains valid when we pass to distributed delay operators for [P, Q]. Since kernel-representations are unique only up to unimodular left factors over H, it is most convenient for the subsequent formulation to utilize a normalized kernel-representation as established in Proposition 4.2.5(a).
Theorem 5.3.3 ([43, Thm. 3.2])
Let [P, Q] ∈ H_0^{p×(m+p)} be a matrix such that rk [P, Q](s, 0) = p. Then the system ker_L [P, Q] is realizable if and only if det Q is monic and Q^{-1}P is proper.

Unfortunately, we cannot offer an intrinsic characterization of realizability purely in terms of the trajectories of the system.

PROOF: The only-if part follows from Proposition 5.3.1 together with Remark 4.2.6 and Remark 5.1.13. (Notice that the construction in the proof of Proposition 5.3.1 leads to a normalized kernel-representation, since (5.3.5) yields det Q(s, 0) ≠ 0.) As for the if-part, we show that the "numerator" of [P, Q] satisfies the requirements for the Fuhrmann-realization. The latter can then be utilized to derive a realization for ker_L [P, Q] itself. The details are as follows. Write [P, Q] = [P̄, Q̄]φ^{-1}, where [P̄, Q̄] ∈ R[s, z]^{p×(m+p)} and φ = Σ_{i=0}^{r-1} φ_i s^i + s^r ∈ R[s]. Then Q̄^{-1}P̄ = Q^{-1}P is proper and det Q̄ is monic. Thus the Fuhrmann-realization is applicable to [P̄, Q̄], and we obtain from Theorem 5.2.3 together with (5.2.2) the existence of a realization (A, B, C, E) ∈ R[z]^{n×n} × R[z]^{n×m} × R[z]^{p×n} × R[z]^{p×m} of dimension n := deg_s(det Q̄). Precisely,

    ker_L [P̄, Q̄] = B_ext(A, B, C, E) = { (v, w)^T ∈ L^{m+p} | ∃ x ∈ L^n : ẋ = Ax + Bv, w = Cx + Ev }.        (5.3.7)
By the very definition of the operators in H (see Definition 2.9) this yields

    ker_L [P, Q] = { (u, y)^T ∈ L^{m+p} | ∃ (v, w)^T ∈ ker_L [P̄, Q̄] such that u = φv, y = φw }.

In order to obtain a first-order representation for ker_L [P, Q], we have to translate the ordinary differential equations u = φv and y = φw into first-order matrix differential equations. First of all, it is standard to rewrite the (vector-valued) equation u = φv in the following way. Put

    x̃ := (v^T, v̇^T, ..., (v^{(r-1)})^T)^T ∈ L^{rm}

and

    Ã := [    0       I_m              ]                       [  0  ]
         [                  ...        ]   ∈ R^{rm×rm},   B̃ := [ ... ]  ∈ R^{rm×m}.
         [    0                   I_m  ]                       [  0  ]
         [ -φ_0 I_m  ...  -φ_{r-1} I_m ]                       [ I_m ]

Then one has for v ∈ L^m

    u = φv  ⟺  (d/dt) x̃ = Ãx̃ + B̃u.
It remains to rewrite the equation y = φw in terms of x, x̃, and u. By use of v^{(r)} = u - Σ_{i=0}^{r-1} φ_i v^{(i)} and the system equations in (5.3.7), one derives straightforwardly the output equation

    y = H (x^T, x̃^T)^T + Eu,

where, with φ_r := 1,

    H := C [ Σ_{i=0}^{r} φ_i A^i,  Σ_{i=1}^{r} φ_i A^{i-1} B,  Σ_{i=2}^{r} φ_i A^{i-2} B,  ...,  B ]  ∈ R[z]^{p×(n+rm)}.        (5.3.8)
Now we simply have to combine all equations. Defining the matrices

    F := [ A   [B, 0, ..., 0] ]                                     [ 0 ]
         [ 0         Ã        ]  ∈ R[z]^{(n+rm)×(n+rm)},      G :=  [ B̃ ]  ∈ R[z]^{(n+rm)×m},        (5.3.9)

with Ã, B̃ the companion-form matrices rewriting u = φv, the construction shows that ker_L [P, Q] ⊆ B_ext(F, G, H, E). Reversing the line of arguments, we obtain also the converse inclusion, hence the realizability of ker_L [P, Q]. □

As is to be expected, the realization constructed above is (highly) non-minimal with respect to the number of latent variables. This can also be seen from the following example, where a realization is derived by inspection.
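The blow-up construction can be checked symbolically on a toy case. The following sketch uses hypothetical data (n = m = p = 1, A = z, B = C = 1, E = 0, φ = s² + 1, so r = 2); the shapes of F, G, H follow the forms of (5.3.8) and (5.3.9) as reconstructed here. It verifies that the blown-up system (F, G, H, E) reproduces the transfer function C(sI - A)^{-1}B + E:

```python
import sympy as sp

s, z = sp.symbols('s z')

# first-order data over R[z] (hypothetical toy choice): n = m = p = 1
A = sp.Matrix([[z]]); B = sp.Matrix([[1]]); C = sp.Matrix([[1]]); E = sp.Matrix([[0]])
phi = [1, 0]                         # phi = s**2 + 0*s + 1, hence r = 2, phi_r = 1

# blown-up matrices F, G, H as in (5.3.9)/(5.3.8)
F = sp.Matrix([[z, 1, 0],            # x' = A x + B v, v = first block of x~
               [0, 0, 1],            # companion block for u = phi(v)
               [0, -phi[0], -phi[1]]])
G = sp.Matrix([0, 0, 1])
H = sp.Matrix([[phi[0] + phi[1]*z + z**2,   # C * sum_{i=0..r} phi_i A^i
                phi[1] + z,                 # C * sum_{i=1..r} phi_i A^{i-1} B
                1]])                        # C * B

T_big   = sp.simplify((H * (s*sp.eye(3) - F).inv() * G)[0, 0] + E[0, 0])
T_small = sp.simplify((C * (s*sp.eye(1) - A).inv() * B)[0, 0] + E[0, 0])
assert sp.simplify(T_big - T_small) == 0    # both equal 1/(s - z)
```

The check illustrates why the construction works: y = φw = T(s)φv = T(s)u, since the rational transfer matrix T(s) commutes with the scalar differential operator φ.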
Example 5.3.4
Let p = aφ^{-1} ∈ H_{0,p} be a proper nonconstant function, hence deg_s a ≤ deg φ. (Notice that H_{0,p} ⊆ R[z][[s^{-1}]], telling us that the two notions of properness coincide.) The graph of the operator on L induced by p can be realized as a first-order system as follows. First of all, the system

    ker_L [p, -1] = { (u, p(u))^T | u ∈ L }

is realizable according to Theorem 5.3.3. In this specific case, a first-order representation can simply be derived by rewriting the definition of the operator p on L. Let a = Σ_{i=0}^{r} a_i s^i and φ = Σ_{i=0}^{r} φ_i s^i, where φ_i ∈ R, φ_r = 1, and a_i ∈ R[z]. Then

    ker_L [p, -1] = { (u, y)^T ∈ L^2 | ∃ x ∈ L : a(x) = y, φ(x) = u }.

It is standard to transform these equations into a first-order system. Put

    A := [   0      1                ]          [ 0 ]
         [              ...          ]     B := [ ...]
         [   0                   1   ]          [ 0 ]
         [ -φ_0  -φ_1  ...  -φ_{r-1} ],         [ 1 ],

and

    C := [ a_0 - a_r φ_0, ..., a_{r-1} - a_r φ_{r-1} ].

Then one easily checks that ker_L [p, -1] = B_ext(A, B, C, a_r). Notice that this realization requires r = deg φ latent variables, which is less than what is used in the proof above.
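For a concrete instance of this companion realization, take the hypothetical data a = zs² + s + z and φ = s² + s + 1 (so r = 2); one can then verify symbolically that C(sI - A)^{-1}B + a_r equals aφ^{-1}:

```python
import sympy as sp

s, z = sp.symbols('s z')

# p = a/phi with phi = s^2 + s + 1 and a = z*s^2 + s + z (hypothetical data, a_i in R[z])
phi0, phi1 = 1, 1
a0, a1, a2 = z, 1, z

A = sp.Matrix([[0, 1], [-phi0, -phi1]])        # companion matrix of phi
B = sp.Matrix([0, 1])
C = sp.Matrix([[a0 - a2*phi0, a1 - a2*phi1]])  # [a_0 - a_r*phi_0, ..., a_{r-1} - a_r*phi_{r-1}]
E = a2                                         # the direct feedthrough a_r

transfer = sp.simplify((C * (s*sp.eye(2) - A).inv() * B)[0, 0] + E)
expected = sp.simplify((a2*s**2 + a1*s + a0) / (s**2 + phi1*s + phi0))
assert sp.simplify(transfer - expected) == 0
```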
In Proposition 5.4.6 it will be shown that the dimension derived in the proof of Theorem 5.3.3 above can be drastically reduced. It is worthwhile mentioning the following consequence of the criterion in Theorem 5.3.3.

Remark 5.3.5
The controller ker_L [-F, I - G] used for weak coefficient assignment in Section 4.5 is realizable. This follows from the fact that by Definition 4.5.8 the matrix G is in H_{0,sp}^{m×m}, hence in s^{-1}R[z][[s^{-1}]]^{m×m}, and thus det(I - G) is a unit in R[z][[s^{-1}]]. The properness of F ∈ H_{0,p}^{m×n} then implies that (I - G)^{-1}F is proper, too, and Theorem 5.3.3 applies.

We want to end this section with some comments clarifying the relationship between the notions of realizability as a behavior or as a transfer function and strong nonanticipation.

Remark 5.3.6
Firstly, it is clear from Theorem 5.3.3 and Proposition 5.1.11(a) that realizability as a behavior is properly stronger than realizability as a transfer function (simply left multiply a realizable kernel-representation by a nonsingular matrix with nonmonic determinant to get a nonrealizable behavior with realizable transfer function). Secondly, realizability as a transfer function is properly stronger than strong nonanticipation. Indeed, in Remark 5.1.13 we saw that every first-order system is strongly nonanticipating. But on the other hand, the simple system [p, q] = [s - z, 1 + s + sz] is also strongly nonanticipating by virtue of Proposition 4.2.5(c), whereas it does not satisfy q^{-1}p ∈ R[z]((s^{-1})) and is therefore not realizable as a transfer function.
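The failure of q^{-1}p ∈ R[z]((s^{-1})) for [p, q] = [s - z, 1 + s + sz] can be made explicit: since deg_s p = deg_s q = 1, the s^0 coefficient of a formal expansion of q^{-1}p in powers of s^{-1} would have to be the quotient of the leading s-coefficients, which is 1/(1 + z) and hence not a polynomial in z. A small symbolic check:

```python
import sympy as sp

s, z = sp.symbols('s z')
p, q = s - z, 1 + s + s*z

# quotient of the leading s-coefficients of p and q
c0 = sp.cancel(sp.Poly(p, s).LC() / sp.Poly(q, s).LC())
assert c0 == 1/(z + 1)   # pole at z = -1, so q^{-1}p is not in R[z]((s^{-1}))
```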
5.4 Some Minimality Issues

In this last section on realization we will discuss some minimality issues concerning the number of latent variables in a first-order representation. In order to make clear what we are heading for, let us briefly restrict to systems of ODEs. It is well known that an ordinary differential operator [P, Q] ∈ R[s]^{p×(m+p)} has a first-order representation of dimension n := deg(det Q). Moreover, this is the minimal dimension possible, and minimality of a realization is equivalent to coreachability, which in this case is simply observability of the latent variables. These results can be deduced for instance from the papers [37, Sec. 3] and [101], but they will also be contained in Theorem 5.4.3 below (notice that the existence of an n-dimensional coreachable realization follows from the Fuhrmann-construction of Section 5.2). For delay-differential systems we can only provide a partial generalization of the results just mentioned. While for delay systems
having a polynomial kernel-representation these results remain valid, they fail for arbitrary systems ker_L [P, Q]. In that case the minimal number of latent variables as well as an (intrinsic) characterization of minimality are unknown. To begin with, we introduce the following notions. Recall from Definition 5.1.6 the dimension of a realization as the number of latent variables.
Definition 5.4.1
Let [P, Q] ∈ H_0^{p×(m+p)} be a realizable matrix of rank p.
(i) The McMillan degree d_M[P, Q] of the matrix [P, Q] is defined to be the minimal number n ∈ N_0 for which an n-dimensional first-order representation (A, B, C, E) of ker_L [P, Q] exists. A realization (A, B, C, E) of [P, Q] is said to be minimal if its dimension is equal to d_M[P, Q].
(ii) A realization (A, B, C, E) is called observable if [(sI - A)^T, C^T]^T is left invertible over H_0.
(iii) A realization (A, B, C, E) of dimension n is called weakly observable if rk_{R[z]} O(A, C) = n, where

    O(A, C) = [ C; CA; ...; CA^{n-1} ]  ∈ R[z]^{pn×n}.
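The matrix O(A, C) of part (iii) is straightforward to compute symbolically. The following sketch (with hypothetical data over R[z]) builds it and checks weak observability:

```python
import sympy as sp

z = sp.symbols('z')

def obs_matrix(A, C):
    """Stack C, CA, ..., CA^{n-1}: the matrix O(A, C) of Definition 5.4.1(iii)."""
    n = A.shape[0]
    return sp.Matrix.vstack(*[C * A**i for i in range(n)])

# toy data over R[z] (hypothetical)
A = sp.Matrix([[0, 0], [1 - z, 0]])
C = sp.Matrix([[0, 1]])
O = obs_matrix(A, C)       # [[0, 1], [1 - z, 0]]
assert O.rank() == 2       # weakly observable: rk_{R[z]} O(A, C) = n
```

Note that for full rank n the rank over R[z] coincides with the rank over the quotient field R(z), which is what `Matrix.rank()` computes.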
Remark 5.4.2
(1) For systems of ODEs, the McMillan degree is usually defined for proper rational functions -Q^{-1}P ∈ R(s)_p^{p×m}, where it denotes the minimal dimension of a first-order transfer function realization, see [59, Sec. 6.5]. This coincides with the notion above if and only if the matrix [P, Q] is right invertible over R[s], which, of course, reflects again the gap between realizing transfer functions and realizing behaviors.
(2) The definition of observability given above is, of course, not quite in the "behavioral spirit", as it is not a definition in terms of the trajectories of the system but rather based on a specific representation. In [87, Def. 5.3.2], a latent variable system is defined to be observable if the latent variable x is uniquely determined by the external variables (u^T, y^T)^T, in which case one says that x is observable from the external data. But this is exactly what is described by the algebraic condition in the definition above. If we rewrite the system equations in (5.3.1) as

    [ sI - A ]        [ B   0 ] ( u )
    [   C    ] x  =   [ -E  I ] ( y ) ,

we see that x is observable from the external data if and only if the operator on the left-hand side is injective, which in turn is the case if and only if [(sI - A)^T, C^T]^T is left invertible, see Proposition 4.1.4 and Corollary 3.2.5.
(3) The concept of weak observability is identical to the notion of observability as a system over the ring R[z], see [79, p. 529] or [105, p. 16]. It is a consequence of Theorem 5.4.3 below that observability implies weak observability.

In the next theorem we will clarify, at least partially, the relationship between minimality and the two kinds of observability defined above. We know already from Proposition 5.3.1 that each realization has at least deg_s(det Q) latent variables. Thanks to the Fuhrmann-realization, we get the exact minimum number of latent variables if we know, by some means, that the system allows a polynomial kernel-representation. In particular, the Fuhrmann-realization is minimal.

Theorem 5.4.3
Let [P, Q] ∈ H_0^{p×(m+p)} be a realizable matrix with rank p and let

    ker_L [P, Q] = B_ext(A, B, C, E),

where (A, B, C, E) ∈ R[z]^{n×n} × R[z]^{n×m} × R[z]^{p×n} × R[z]^{p×m}. Then
(a) ker_L [P, Q] always admits a realization of dimension equal to rk O(A, C). As a consequence, if (A, B, C, E) is a minimal realization, then it is weakly observable.
(b) The realization (A, B, C, E) is observable if and only if deg_s(det Q) = n. Moreover, if the realization is observable, the external behavior has a polynomial kernel-representation; in other words, there exists W ∈ Gl_p(H) such that W[P, Q] ∈ R[s, z]^{p×(m+p)}.
(c) Suppose there exists W ∈ Gl_p(H) such that W[P, Q] ∈ R[s, z]^{p×(m+p)}. Then the Fuhrmann-realization is minimal. As a consequence, (A, B, C, E) is minimal ⟺ deg_s(det Q) = n, and hence d_M[P, Q] = deg_s(det Q).

Observe that (b) in combination with (c) provides a characterization of minimality in terms of the trajectories of the external and latent variables, provided the system has a polynomial kernel-representation. Notice also that we recover the results on minimal realizations for systems of ODEs mentioned in the introduction to this section.

PROOF: (a) The first part is completely analogous to the case of systems over fields. We start with the given realization (A, B, C, E). Assume that rk O(A, C) = r < n. Then a change of basis in the abstract state space R[z]^n leads to an r-dimensional realization as follows. There exists a matrix V ∈ Gl_n(R[z]) such that O(A, C)V = [0, M_2], where M_2 ∈ R[z]^{pn×r} has full column rank. If V is partitioned accordingly, say V = [V_1, V_2] with V_2 ∈ R[z]^{n×r}, then ker_{R[z]} O(A, C) ⊕ im_{R[z]} V_2 = R[z]^n and hence we get
    (V^{-1}AV, V^{-1}B, CV, E) = ( [ A_1  A_2 ]   [ B_1 ]                )
                                 ( [  0   A_4 ] , [ B_2 ] , [0, C_2], E ) ,        (5.4.1)

where A_4 ∈ R[z]^{r×r}, C_2 ∈ R[z]^{p×r}, B_2 ∈ R[z]^{r×m}, and the remaining matrices are of fitting sizes. Now the surjectivity of the operator sI - A_1 on L^{n-r} (see Proposition 4.1.4) implies

    ker_L [P, Q] = B_ext(A, B, C, E) = B_ext(A_4, B_2, C_2, E),

which is the desired realization. The second assertion is clear.
(b) follows from Proposition 5.3.1(d) together with the uniqueness of kernel-representations up to left equivalence over H.
(c) Taking a normalized polynomial kernel-representation [P̄, Q̄] = W[P, Q], that is, rk [P̄, Q̄](s, 0) = p, realizability implies that det Q̄ is monic, see Theorem 5.3.3. Hence the Fuhrmann-realization exists and has dimension equal to deg_s(det Q̄). Now the desired results follow from Proposition 5.3.1(c). □

Notice that for a given first-order system B_ext(A, B, C, E) which is not observable, the theorem does not provide a test for minimality as long as we don't know whether a polynomial kernel-representation exists. But this is not simple either. Only in the case of controllable systems ker_L [P, Q] do we get from Theorem 4.1.13 an easy test for the existence of polynomial kernel-representations. The following example illustrates this problem. It also shows that weak observability is not sufficient for minimality.
l [(
- ( 0, [11z ' (A, B, C, E)-
l)
2
z -0 1) 0] 1 ' [ (1-0 z)3
'
which is weakly observable, since rk C = 2 = n, but not observable. So far we cannot apply any of the results in Theorem 5.4.3 since we don't know whether the system has a polynomial kernel-representation. Using the elimination procedure as in the proof of Proposition 5.3.1(a) we obtain the kernel-representation
k B.cext(A ' B ' C ' E)_ - er .c
[(z- 1)- +( (z- 1) s 01 0s] 3
z -I) 2 s
(the procedure does not lead to a unique representation, the above is the one which "most likely" arises). This representation is not polynomial. But in this specific case it can easily be made polynomial via left equivalence and we get
B.cext(A ' B ' C ' E)
=k
er.c
[(z- 1) + (z- 1) s0 s l 3
(z - 1) 4
1z- 1 ·
(5.4.2)
Now we can apply Theorem 5.4.3 and see that the realization is not minimal. A minimal one requires only one latent variable and is for instance given by the Fuhrmann-realization of [P, Q], where [P, Q] denotes the matrix on the
166
5 First-Order Representations
right hand side of (5.4.2). Applying Theorem 5.2.3 to the strictly proper part [P + EQ, Q], see also (5.2.2), the one-dimensional realization of (5.4.2) is easily found to be
" ,. ,.
(A,B,C,E)=
(0,1-z, [1 -1 zl , [(1 -z) 0 l) 3
.
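The computations in this example can be verified symbolically. The sketch below assumes the data A = 0, B = (1, 1-z)^T, C = diag((z-1)^2, 1), E = ((1-z)^3, 0)^T for the two-dimensional system and the one-dimensional candidate (0, 1-z, (1-z, 1)^T, ((1-z)^3, 0)^T); these entries are a reconstruction and should be checked against the original source. With them, weak observability holds and both systems share the same transfer function:

```python
import sympy as sp

s, z = sp.symbols('s z')

# assumed (reconstructed) data of the two-dimensional system
A = sp.zeros(2, 2)
B = sp.Matrix([1, 1 - z])
C = sp.Matrix([[(z - 1)**2, 0], [0, 1]])
E = sp.Matrix([(1 - z)**3, 0])

# weak observability: O(A, C) = [C; CA] has rank 2 = n
assert sp.Matrix.vstack(C, C*A).rank() == 2

# assumed one-dimensional candidate realization
Ah, Bh = sp.zeros(1, 1), sp.Matrix([1 - z])
Ch, Eh = sp.Matrix([1 - z, 1]), sp.Matrix([(1 - z)**3, 0])

T2dim = sp.simplify(C * (s*sp.eye(2) - A).inv() * B + E)
T1dim = sp.simplify(Ch * (s*sp.eye(1) - Ah).inv() * Bh + Eh)
assert sp.simplify(T2dim - T1dim) == sp.zeros(2, 1)
```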
Remark 5.4.5
The Fuhrmann-realization was shown to be coreachable, that is, [(sI - A)^T, C^T]^T is left invertible over R[s, z], see Theorem 5.2.3(d). This is properly stronger than the observability defined in 5.4.1. In light of Theorem 5.4.3(b) and (c) we see that not every minimal realization is coreachable.
Let us now turn briefly to systems ker_L [P, Q] without polynomial kernel-representations. Unfortunately, it remains an open question what the minimal dimension d_M[P, Q] of a realization is and how to characterize minimality in terms of the latent variable system itself. We can only provide the following estimate for d_M[P, Q], thereby reducing drastically the dimension of the realization constructed in the proof of Theorem 5.3.3.

Proposition 5.4.6
Let [P, Q] = φ^{-1}[P̄, Q̄] ∈ H_0^{p×(m+p)}, where [P̄, Q̄] ∈ R[s, z]^{p×(m+p)} and φ ∈ R[s]. Suppose [P, Q] is realizable. Then

    d_M[P, Q] ≤ d_M[P̄, Q̄] = deg_s(det Q̄).
PROOF: The rightmost identity has been shown in Theorem 5.4.3(c). As for the inequality, we have to go through the proof of Theorem 5.3.3 again, where a realization of ker_L [P, Q] has been derived via a realization of ker_L [P̄, Q̄]. Let (A, B, C, E) be the Fuhrmann-realization of [P̄, Q̄], which is minimal and therefore satisfies rk O(A, C) = n := deg_s(det Q̄) by Theorem 5.4.3(a). The proof of Theorem 5.3.3 leads to the realization ker_L [P, Q] = B_ext(F, G, H, E), where F, G, and H are as in (5.3.9) and (5.3.8). Put

    H_1 := [ Σ_{i=0}^{r} φ_i A^i,  Σ_{i=1}^{r} φ_i A^{i-1} B,  Σ_{i=2}^{r} φ_i A^{i-2} B,  ...,  B ]  ∈ R[z]^{n×(n+rm)},

thus H = C H_1. Then it is easy to see that H_1 F = A H_1 and therefore H F^l = C A^l H_1 for all l ∈ N_0. Hence O(F, H) = O(A, C) H_1, and the estimate rk_{R[z]} O(F, H) ≤ rk_{R[z]} O(A, C) = n coupled with Theorem 5.4.3(a) completes the proof. □

We want to conclude our considerations on first-order realizations by illustrating yet another open problem. In case of systems of ODEs it is known that
two minimal realizations (A, B, C, E) and (Ā, B̄, C̄, Ē) of dimension n, say, are similar whenever they share the same external behavior; hence there exists a matrix T ∈ Gl_n(R) such that

    (Ā, B̄, C̄, Ē) = (TAT^{-1}, TB, CT^{-1}, E).        (5.4.3)

This can be deduced for instance from the results in [37]. The relationship just described fails for DDEs in any reasonable kind of generalization. Indeed, even coreachable realizations, hence particularly nice minimal realizations of a system of polynomial DDEs, are not similar over any ring whatsoever. This will be illustrated by the following example, where the Fuhrmann-construction is applied to two left equivalent polynomial kernel-representations.

Example 5.4.7
Let b ∈ R[z] be arbitrary. Then

    ker_L [ b(z)   s^2 - s + 2   1 ]  =  ker_L [ b(z)   s^2 - s + 2   z ] ,        (5.4.4)
          [  0          0        s ]           [  0          0        s ]

which follows for instance from the fact that ẏ_2 = 0 implies zy_2 = y_2. Note that the determinants of the 2 × 2-submatrices formed by the last two columns in (5.4.4) are monic and of degree 3 with respect to s. Applying the Fuhrmann-realization to the matrix on the left-hand side of (5.4.4), we obtain the first-order representation
    (A_1, B_1, C_1, 0) = ( [ 0  -2  -1 ]   [ -b ]   [ 0  1  0 ]    )
                         ( [ 1   1   0 ] , [  0 ] , [ 0  0  1 ] , 0) ,
                         ( [ 0   0   0 ]   [  0 ]                  )

while the same procedure for the matrix on the right-hand side leads to

    (A_2, B_2, C_2, 0) = ( [ 0  -2  -z ]   [ -b(z) ]   [ 0  1  0 ]    )
                         ( [ 1   1   0 ] , [   0   ] , [ 0  0  1 ] , 0) .
                         ( [ 0   0   0 ]   [   0   ]                  )
One can check by some straightforward calculations that there does not exist a nonsingular matrix T in, say, R(s, z)^{3×3} such that (A_2, B_2, C_2) = (TA_1T^{-1}, TB_1, C_1T^{-1}). As a consequence, the relation between these two representations, which should exist as they describe the same system and are minimal, has to be captured differently. Indeed, in this specific case one can simply observe the following. Left multiplying the equation sx = A_1x + B_1u by

    [ 1  0  (z-1)s^{-1} ]
    [ 0  1      0       ]  ∈ Gl_3(H_0)
    [ 0  0      1       ]

leads to sx = A_2x + B_2u, and we see that there are transformations other than similarity which also preserve the specific structure of the first-order systems. They should be taken into account for an investigation of the relation between minimal systems sharing the same external behavior. We leave this as an open problem.
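As a symbolic sanity check, assume the two realizations have A_1 and A_2 differing only in the (1,3) entry (-1 versus -z), B = (-b, 0, 0)^T, and C = [0 1 0; 0 0 1]; these entries are a reconstruction and should be checked against the original source. With them, both realizations induce the same transfer function, as they must, since they describe the same external behavior:

```python
import sympy as sp

s, z, b = sp.symbols('s z b')

# assumed (reconstructed) realizations; A2 differs from A1 only in the (1,3) entry
A1 = sp.Matrix([[0, -2, -1], [1, 1, 0], [0, 0, 0]])
A2 = sp.Matrix([[0, -2, -z], [1, 1, 0], [0, 0, 0]])
B  = sp.Matrix([-b, 0, 0])
C  = sp.Matrix([[0, 1, 0], [0, 0, 1]])

T1 = sp.simplify(C * (s*sp.eye(3) - A1).inv() * B)
T2 = sp.simplify(C * (s*sp.eye(3) - A2).inv() * B)

assert sp.simplify(T1 - T2) == sp.zeros(2, 1)
assert sp.simplify(T1 - sp.Matrix([-b/(s**2 - s + 2), 0])) == sp.zeros(2, 1)
```

Equality of the transfer functions is of course only a necessary condition; the point of the example is that no similarity transformation connects the two realizations, only a unimodular transformation over H_0.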
References
1. A. Baker. Transcendental Number Theory. Cambridge University Press, 1979.
2. T. Becker and V. Weispfenning. Gröbner Bases: A Computational Approach to Commutative Algebra. Springer, New York, 1993.
3. R. Bellman and K. L. Cooke. Differential-Difference Equations. Academic Press, New York, 1963.
4. C. A. Berenstein and M. A. Dostal. The Ritt theorem in several variables. Arkiv för Matematik, 12:267-280, 1974.
5. C. A. Berenstein and A. Yger. Ideals generated by exponential-polynomials. Adv. in Math., 60:1-80, 1986.
6. K. P. M. Bhat and H. N. Koivo. Modal characterizations of controllability and observability in time delay systems. IEEE Trans. Aut. Contr., AC-21:292-293, 1976.
7. H. Blomberg and R. Ylinen. Algebraic Theory for Multivariable Linear Systems. Academic Press, London, 1983.
8. D. Brethé and J. J. Loiseau. Stabilization of linear time-delay systems. JESA-RAIRO-APII, 6:1025-1047, 1997.
9. D. Brethé and J. J. Loiseau. An effective algorithm for finite spectrum assignment of single-input systems with delays. Math. and Computers in Simulation, 45:339-348, 1998.
10. J. Brewer, T. Ford, L. Klingler, and W. Schmale. When does the ring K[y] have the coefficient assignment property? J. Pure Appl. Algebra, 112:239-246, 1996.
11. J. Brewer, L. Klingler, and W. Schmale. C[y] is a CA-ring and coefficient assignment is properly weaker than feedback cyclization over a PID. J. Pure Appl. Algebra, 97:265-273, 1994.
12. J. W. Brewer, J. W. Bunce, and F. S. Van Vleck. Linear Systems over Commutative Rings, volume 104 of Lecture Notes in Pure and Applied Mathematics. Marcel Dekker, New York, 1986.
13. W. E. Brumley. On the asymptotic behavior of solutions of differential-difference equations of neutral type. J. Diff. Eqs., 7:175-188, 1970.
14. A. M. Cohen, H. Cuypers, and H. Sterk (Eds.). Some Tapas of Computer Algebra. Springer, Berlin, 1999.
15. H. Cohen. A Course in Computational Algebraic Number Theory. Springer, Berlin, 3. printing, 1996.
16. P. M. Cohn. On the structure of the GL2 of a ring. Inst. Hautes Études Sci. Publ. Math., 30:365-413, 1966.
17. P. M. Cohn. Free Rings and Their Relations. Academic Press, London, 2. edition, 1985.
18. D. Cox, J. Little, and D. O'Shea. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra. Springer, New York, 1992.
19. D. Cox, J. Little, and D. O'Shea. Using Algebraic Geometry. Springer, New York, 1998.
20. R. F. Curtain and H. J. Zwart. An Introduction to Infinite-Dimensional Linear Systems Theory. Springer, New York, 1995.
21. P. J. Davis. Interpolation and Approximation. Dover Publications, New York, 2. edition, 1975.
22. O. Diekmann, S. A. van Gils, S. M. Verduyn Lunel, and H.-O. Walther. Delay Equations: Functional-, Complex-, and Nonlinear Analysis. Springer, New York, 1995.
23. R. D. Driver. Ordinary and Delay Differential Equations. Springer, New York, 1977.
24. L. Ehrenpreis. Solutions of some problems of division. Part II. Division by a punctual distribution. Amer. J. Mathematics, 77:286-292, 1955.
25. L. Ehrenpreis. Solutions of some problems of division. Part III. Division in the spaces D', H, Q_A, O. Amer. J. Mathematics, 78:685-715, 1956.
26. S. J. L. van Eijndhoven and L. C. G. J. M. Habets. Equivalence of convolution systems in a behavioral framework. Report RANA 99-25, Eindhoven University of Technology, 1999.
27. S. Eilenberg. Automata, Languages, and Machines, volume A. Academic Press, New York, 1974.
28. L. E. El'sgol'ts and S. B. Norkin. Introduction to the Theory and Application of Differential Equations with Deviating Arguments. Academic Press, New York, 1973.
29. E. Emre and P. P. Khargonekar. Regulation of split linear systems over rings: Coefficient-assignment and observers. IEEE Trans. Aut. Contr., AC-27:104-113, 1982.
30. D. Estes and J. Ohm. Stable range in commutative rings. J. Algebra, 7:343-362, 1967.
31. D. R. Estes and J. R. Matijevic. Matrix factorizations, exterior powers, and complete intersections. J. Algebra, 58:117-135, 1979.
32. M. Fliess and H. Mounier. Interpretation and comparison of various types of delay system controllabilities. In Proceedings of the IFAC Conference on System, Structure and Control (Nantes), pages 330-335, 1995.
33. P. A. Fuhrmann. Algebraic systems theory: An analyst's point of view. J. Franklin Inst., 301:521-540, 1976.
34. P. A. Fuhrmann. Linear Systems and Operators in Hilbert Space. McGraw-Hill, New York, 1981.
35. W. Fulton. Algebraic Curves: An Introduction to Algebraic Geometry. Addison-Wesley, 3. printing, 1989.
36. F. R. Gantmacher. The Theory of Matrices, volume 1. Chelsea, New York, 1977.
37. A. H. W. Geerts and J. M. Schumacher. Impulsive-smooth behavior in multimode systems, Part II: Minimality and equivalence. Automatica, 32:819-832, 1996.
38. S. Glaz. Commutative Coherent Rings. Springer LN 1371, New York, 1989.
39. H. Gluesing-Luerssen. A convolution algebra of delay-differential operators and a related problem of finite spectrum assignability. Math. Contr., Sign., Syst., 13:22-40, 2000.
40. H. Gluesing-Luerssen. The Fuhrmann-realization for multi-operator systems in the behavioral context. Mult. Systems Signal Processing, 11:193-211, 2000.
41. H. Gluesing-Luerssen, P. Vettori, and S. Zampieri. The algebraic structure of delay-differential systems: A behavioral perspective. Kybernetika, 37:397-426, 2001.
42. H. Glüsing-Lüerßen. A behavioral approach to delay-differential systems. SIAM J. Contr. & Opt., 35:480-499, 1997.
43. H. Glüsing-Lüerßen. First-order representations of delay-differential systems in a behavioural setting. European Journal of Control, 3:137-149, 1997.
44. I. Gohberg, P. Lancaster, and L. Rodman. Matrix Polynomials. Academic Press, New York, 1982.
45. J. P. Guiver and N. K. Bose. Polynomial matrix primitive factorization over arbitrary coefficient field and related results. IEEE Trans. Circuits Syst., CAS-29:649-657, 1982.
46. L. C. G. J. M. Habets. Algebraic and computational aspects of time-delay systems. PhD thesis, Eindhoven University of Technology, 1994.
47. L. C. G. J. M. Habets. System equivalence for AR-systems over rings - With an application to delay-differential systems. Math. Contr., Sign., Syst., 12:219-244, 1999.
48. J. Hale. Theory of Functional Differential Equations. Springer, New York, 1977.
49. J. Hale and S. M. Verduyn Lunel. Introduction to Functional Differential Equations. Springer, New York, 1993.
50. M. L. J. Hautus. Controllability and observability conditions for linear autonomous systems. Nederl. Akad. Wet. Proc., A 72:443-448, 1969.
51. O. Helmer. The elementary divisor theorem for certain rings without chain condition. Bull. Amer. Math. Soc., 49:225-236, 1943.
52. D. Hinrichsen and D. Prätzel-Wolters. Solution modules and system equivalence. Int. J. Contr., 32:777-802, 1980.
53. W. V. D. Hodge and D. Pedoe. Methods of Algebraic Geometry, Vol. 1. Cambridge University Press, 1968 (Reprint 1994).
54. A. S. B. Holland. Introduction to the Theory of Entire Functions. Academic Press, New York, 1973.
55. T. W. Hungerford. Algebra. Springer, New York, 1974.
56. N. Jacobson. Basic Algebra I. W. H. Freeman, New York, 2. edition, 1985.
57. L. Jantscher. Distributionen. Walter de Gruyter, Berlin, 1971.
58. C. U. Jensen. On characterizations of Prüfer rings. Math. Scand., 13:90-98, 1963.
59. T. Kailath. Linear Systems. Prentice-Hall, 1980.
60. E. Kaltofen. Polynomial factorization 1982-1986. In D. V. Chudnovsky and R. D. Jenks (Eds.), Computers in Mathematics, Lecture Notes in Pure and Applied Mathematics 125, pages 285-309. Marcel Dekker, New York, 1990.
61. E. W. Kamen. On an algebraic theory of systems defined by convolution operators. Math. Systems Theory, 9:57-74, 1975.
62. E. W. Kamen. Linear systems over rings: From R. E. Kalman to the present. In A. C. Antoulas (Ed.), Mathematical System Theory, The Influence of R. E. Kalman, pages 311-324. Springer, Berlin, 1991.
63. E. W. Kamen, P. P. Khargonekar, and A. Tannenbaum. Proper stable Bezout factorizations and feedback control of linear time-delay systems. Int. J. Contr., 43:837-857, 1986.
64. I. Kaplansky. Elementary divisors and modules. Trans. Amer. Math. Soc., 66:464-491, 1949.
65. V. B. Kolmanovskii and V. R. Nosov. Stability of Functional Differential Equations. Academic Press, London, 1986.
66. Y. Kuang. Delay Differential Equations with Applications in Population Dynamics. Academic Press, Boston, 1993.
67. S. Lang. Algebra. Addison-Wesley, 2. edition, 1984.
68. O. Lezama and O. Vasquez. On the simultaneous basis property in Prüfer domains. Acta Math. Hungar., 80:169-176, 1998.
69. A. Logar and B. Sturmfels. Algorithms for the Quillen-Suslin Theorem. J. Algebra, 145:231-239, 1992.
70. N. MacDonald. Biological Delay Systems: Linear Stability Theory. Cambridge University Press, 1989.
71. C. C. MacDuffee. The Theory of Matrices. Chelsea, New York, 1946.
72. B. Malgrange. Existence et approximation des solutions des équations aux dérivées partielles et des équations de convolution. Annales de l'Institut Fourier, 6:271-355, 1955/1956.
73. A. Manitius. Necessary and sufficient conditions of approximate controllability for general linear retarded systems. SIAM J. Contr. & Opt., 19:516-532, 1981.
74. A. Manitius and R. Triggiani. Function space controllability of linear retarded systems: A derivation from abstract operator conditions. SIAM J. Contr. & Opt., 16:599-645, 1978.
75. A. Z. Manitius. Feedback controllers for a wind tunnel model involving a delay: Analytical design and numerical simulation. IEEE Trans. Aut. Contr., AC-29:1058-1068, 1984.
76. A. Z. Manitius and A. W. Olbrot. Finite spectrum assignment problem for systems with delays. IEEE Trans. Aut. Contr., AC-24:541-553, 1979.
77. N. Minorsky. Self-excited oscillations in dynamical systems possessing retarded actions. J. Appl. Mech., 9:65-71, 1942.
78. M. Morf, B. C. Levy, and S.-Y. Kung. New results in 2-D-systems theory, part I: 2-D polynomial matrices, factorizations, and coprimeness. Proc. of the IEEE, 65:861-872, 1977.
79. A. S. Morse. Ring models for delay-differential systems. Automatica, 12:529-531, 1976.
80. H. Mounier. Algebraic interpretations of the spectral controllability of a linear delay system. Forum Math., 10:39-58, 1998.
81. A. D. Myschkis. Lineare Differentialgleichungen mit nacheilendem Argument. Deutscher Verlag der Wissenschaften, Berlin, 1955. [Russian original 1951].
82. R. Narasimhan. Complex Analysis in One Variable. Birkhäuser, Boston, 1985.
83. M. Newman. Integral Matrices. Academic Press, New York, 1972.
84. U. Oberst. Multidimensional constant linear systems. Acta Appl. Mathematicae, 20:1-175, 1990.
85. A. W. Olbrot and L. Pandolfi. Null controllability of a class of functional differential systems. Int. J. Contr., 47:193-208, 1988.
86. H. K. Pillai and S. Shankar. A behavioral approach to control of distributed systems. SIAM J. Contr. & Opt., 37:388-408, 1998.
87. J. W. Polderman and J. C. Willems. Introduction to Mathematical Systems Theory: A Behavioral Approach. Springer, New York, 1998.
88. L. S. Pontryagin. On the zeros of some elementary transcendental functions. Amer. Math. Soc. Transl., Ser. 2, 1:95-110, 1955.
89. W. H. Ray. Advanced Process Control. Chemical Engineering Series. McGraw-Hill, New York, 1981.
90. D. Richardson. How to recognize zero? J. Symb. Comput., 24:627-645, 1997.
91. P. Rocha and J. C. Willems. Behavioral controllability of delay-differential systems. SIAM J. Contr. & Opt., 35:254-264, 1997.
92. P. Rocha and J. Wood. Trajectory control and interconnection of 1D and nD systems. SIAM J. Contr. & Opt., 40:107-134, 2001.
93. J. Rosenthal and J. M. Schumacher. Realization by inspection. IEEE Trans. Aut. Contr., AC-42:1257-1263, 1997.
94. W. E. Roth. The equations AX - YB = C and AX - XB = C in matrices. Proc. Amer. Math. Soc., 3:392-396, 1952.
95. Y. Rouchaleau and E. D. Sontag. On the existence of minimal realizations of linear dynamical systems over Noetherian integral domains. J. Comp. Syst. Sci., 18:65-75, 1979.
96. W. Rudin. Functional Analysis. Tata McGraw-Hill, New Delhi, 1973.
97. G. Scheja and U. Storch. Lehrbuch der Algebra, Teil 1. B. G. Teubner, Stuttgart, 2nd edition, 1994.
98. W. Schmale. Unpublished notes. 1995.
99. W. Schmale. Private communications. 1996.
100. W. Schmale. Matrix cyclization over complex polynomials. Lin. Algebra and its Appl., 275/276:551-562, 1998.
101. J. M. Schumacher. Transformations of linear systems under external equivalence. Lin. Algebra and its Appl., 102:1-33, 1988.
102. L. Schwartz. Théorie générale des fonctions moyenne-périodiques. Annals of Mathematics, 48:857-929, 1947.
103. L. Schwartz. Théorie des Distributions. Vol. 1. Hermann, Paris, 1950.
104. L. Schwartz. Théorie des Distributions. Vol. 2. Hermann, Paris, 1951.
105. E. D. Sontag. Linear systems over commutative rings: A survey. Ricerche di Automatica, 7:1-34, 1976.
106. A. A. Suslin. On the structure of the special linear group over polynomial rings. Math. USSR-Izv., 11:221-238, 1977.
107. F. Treves. Topological Vector Spaces, Distributions and Kernels. Academic Press, New York, 1967.
108. M. E. Valcher. Some results about the decomposition of two-dimensional behaviors. In J. W. Polderman and H. L. Trentelman (Eds.),
The Mathematics of Systems and Control: From Intelligent Control to Behavioral Systems, pages 191-201. University of Groningen, The Netherlands, 1999.
109. P. Vettori. Delay differential systems in the behavioral approach. PhD thesis, Università di Padova, 1999.
110. P. Vettori and S. Zampieri. Stability and stabilizability of delay-differential systems. Preprint, 2001.
111. P. Vettori and S. Zampieri. Controllability of systems described by convolutional or delay-differential equations. SIAM J. Contr. & Opt., 39:728-756, 2000.
112. P. S. Wang. An improved multivariate polynomial factorization algorithm. Math. Comp., 32:1215-1231, 1978.
113. K. Watanabe. Finite spectrum assignment and observer for multivariable systems with commensurate delays. IEEE Trans. Aut. Contr., AC-31:543-550, 1986.
114. K. Watanabe, M. Ito, and M. Kaneko. Finite spectrum assignment problem of systems with multiple commensurate delays in states and control. Int. J. Contr., 39:1073-1082, 1984.
115. K. Watanabe, E. Nobuyama, T. Kitamori, and M. Ito. A new algorithm for finite spectrum assignment of single-input systems with time delay. IEEE Trans. Aut. Contr., AC-37:1377-1383, 1992.
116. K. Watanabe, E. Nobuyama, and A. Kojima. Recent advances in control of time delay systems - a tutorial review. In Proceedings of the 35th Conference on Decision and Control (Kobe, Japan), volume 2, pages 2083-2089, 1996.
117. P. A. Weiner. Multidimensional convolutional codes. PhD thesis, University of Notre Dame, 1998. Available at http://www.nd.edu/~rosen/preprints.html.
118. J. C. Willems. Models for dynamics. Dynamics Reported, 2:171-269, 1989.
119. J. C. Willems. Paradigms and puzzles in the theory of dynamical systems. IEEE Trans. Aut. Contr., AC-36:259-294, 1991.
120. J. C. Willems. On interconnection, control, and feedback. IEEE Trans. Aut. Contr., AC-42:326-339, 1997.
121. W. A. Wolovich. Skew prime polynomial matrices. IEEE Trans. Aut. Contr., AC-23:880-887, 1978.
122. J. Wood. Modules and behaviours in nD systems theory. Mult. Systems and Signal Processing, 11:11-48, 2000.
123. J. Wood, E. Rogers, and D. H. Owens. Controllable and autonomous nD linear systems. Mult. Systems and Signal Processing, 10:33-69, 1999.
124. J. Wood and E. Zerz. Notes on the definition of behavioural controllability. Syst. Contr. Lett., 37:31-37, 1999.
125. R. Ylinen, S. Ruuth, and J. Sinervo. Set theoretical and algebraic systems theory for linear differential and difference systems. Acta Polytech. Scand., 31:150-167, 1979.
126. D. C. Youla and G. Gnavi. Notes on n-dimensional system theory. IEEE Trans. Circuits Syst., CAS-26:105-111, 1979.
127. S. Zampieri. Some results and open problems on delay differential systems. In J. W. Polderman and H. L. Trentelman (Eds.), The Mathematics of Systems and Control: From Intelligent Control to Behavioral Systems, pages 231-238. University of Groningen, The Netherlands, 1999.
128. A. H. Zemanian. Distribution Theory and Transform Analysis. McGraw-Hill, New York, 1965.
129. E. Zerz. Topics in Multidimensional Linear Systems Theory. Lecture Notes in Control and Information Sciences 256. Springer, London, 2000.
Index
· | ·, 24, 101
( ... ), the ideal generated by ...
A(p), A(P), 39
B, 79
[B], 95
B^c, 96, 99
B^⊥, 78
B^Σ(A, B, C, E), 142
C_-, 116
C_c, 117
D = d/dt, 9
D_p, 45
D, D', D'_-, D'_+, 52
d_M[P, Q], 163
deg_x p, 25
δ^(k), 52
diag_{n×m}(d_1, ..., d_r), 36
E, E_+, 52
e_{k,λ}, 11
E_n(R), 24
F_N, 62
gcd_n(p, q), 24
gcrd(A, B), 40
Gl_n(R), 24
H, the Heaviside function, 53
H(C), 10, 25
H, H_0, 14
H^(l), 33
H_{0,p}, H_{0,sp}, 56
I_n, the n × n identity matrix
im_H R, im_L R, 77
im_{K[z,s]} R, im_A R, 138
J_{n,q}, 39
ker_H R, ker_L R, 77
ker_{K[z,s]} R, ker_A R, 138
K[z]((s^{-1})), K[z][s^{-1}], 149
L, 8
lcm(φ, ψ), 48
lcm_n(p, q), 24
lclm(A, B), 41
(L_c)_+, 58
M, 79
M^⊥, 78
M(C), 10
o(B), 95
O(A, C), 163
ord_λ(f*), 130
ord_λ(f), 10
((p)), 46
((p))^(M), 48
PC^∞, PC_+, 53
PW(C), 56
Π_-, Π_+, 149
Π_Q, 150
q, 14
q*, 10
Q*, 38
R_x, 24
R((x)), R[x], 25
R[s, z, z^{-1}], 9
R(s)[z, z^{-1}], 14
R(s)_p, R(s)_{sp}, 58
ρ, 39
σ, 9
supp f, 52
S_Q, 151
tdeg, 70
T_m, 70
V(f*), 130
V(S), 10
W^-, 89
w ∧_{t_0} w̃, 97
Z_n, 70
adequate ring, 30
admissible set, 48
advanced DDE, 9
autonomous, 90
behavior, 74, 142
Bezout domain, 30
Bezout identity, 30
bidual, 78
CA-ring, 44
characteristic function, 11, 17, 116
characteristic variety, 130
characteristic zero, 10, 130
coefficient assignable, 43
commensurate delays, 8
computable, 59
computable factorization field, 66
concatenation, 97
controllable, 97
controllable pair, 122
controllable part, 101
coreachable, 153
DDE, 1
dimension of a realization, 142
direct term, 111
duals, 78
elementary divisor domain, 36
empty matrix, 41
equivalence, 25
external behavior, 142
external variables, 74
FC-ring, 44
first-order representation, 142
first-order system, 142
formal transfer function, 91, 142
free variable, 90
full ideal, 46
i/o operator, 92
image-representation, 96
impulse response, 92
input, 90
input/output system, 90
interconnection, 107
kernel-representation, 74
Laplace transform, 56
latent variables, 75
left equivalence, 25
maximally free, 90
McMillan degree, 163
minimal realization, 163
minor prime, 85
monic, 45, 121, 149
neutral DDE, 9
nonanticipating, 90
nonsingular matrix, 24
normalized matrix, 93
observable, 163
ODE, 8
output, 90
output number, 95
PA-ring, 44
Paley-Wiener algebra, 56
pole assignable, 44
proper, 56, 58, 149
rank, 25
reachable, 43
realizable, 142, 157
realization, 142
regular interconnection, 107
retarded DDE, 9
sandwich-polynomial, 47
saturated, 48
Schanuel's conjecture, 65
skew-prime, 112
spectrally controllable, 100, 102
stabilizable, 116
stable, 116
stable range, 35
strictly proper, 56, 58, 149
strongly nonanticipating i/o-system, 92
time-invariant, 89
transfer class, 95
transfer equivalent, 95
unimodular matrix, 24
weakly coefficient assignable, 121
weakly observable, 163