Operator Theory: Advances and Applications Vol. 161 Editor: I. Gohberg
Editorial Office: School of Mathematical Sciences Tel Aviv University Ramat Aviv, Israel Editorial Board: D. Alpay (Beer-Sheva) J. Arazy (Haifa) A. Atzmon (Tel Aviv) J. A. Ball (Blacksburg) A. Ben-Artzi (Tel Aviv) H. Bercovici (Bloomington) A. Böttcher (Chemnitz) K. Clancey (Athens, USA) L. A. Coburn (Buffalo) K. R. Davidson (Waterloo, Ontario) R. G. Douglas (College Station) A. Dijksma (Groningen) H. Dym (Rehovot) P. A. Fuhrmann (Beer Sheva) B. Gramsch (Mainz) J. A. Helton (La Jolla) M. A. Kaashoek (Amsterdam) H. G. Kaper (Argonne) S. T. Kuroda (Tokyo)
Subseries Linear Operators and Linear Systems Subseries editors: Daniel Alpay Department of Mathematics Ben Gurion University of the Negev Beer Sheva 84105 Israel
P. Lancaster (Calgary) L. E. Lerer (Haifa) B. Mityagin (Columbus) V. V. Peller (Manhattan, Kansas) L. Rodman (Williamsburg) J. Rovnyak (Charlottesville) D. E. Sarason (Berkeley) I. M. Spitkovsky (Williamsburg) S. Treil (Providence) H. Upmeier (Marburg) S. M. Verduyn Lunel (Leiden) D. Voiculescu (Berkeley) H. Widom (Santa Cruz) D. Xia (Nashville) D. Yafaev (Rennes) Honorary and Advisory Editorial Board: C. Foias (Bloomington) P. R. Halmos (Santa Clara) T. Kailath (Stanford) P. D. Lax (New York) M. S. Livsic (Beer Sheva)
Joseph A. Ball Department of Mathematics Virginia Tech Blacksburg, VA 24061 USA André M.C. Ran Division of Mathematics and Computer Science Faculty of Sciences Vrije Universiteit NL-1081 HV Amsterdam The Netherlands
The State Space Method Generalizations and Applications
Daniel Alpay Israel Gohberg Editors
Birkhäuser Verlag Basel . Boston . Berlin
Editors: Daniel Alpay Department of Mathematics Ben-Gurion University of the Negev P.O. Box 653 Beer Sheva 84105 Israel e-mail:
[email protected]
Israel Gohberg School of Mathematical Sciences Raymond and Beverly Sackler Faculty of Exact Sciences Tel Aviv University Ramat Aviv 69978 Israel e-mail:
[email protected]
2000 Mathematics Subject Classification 47Axx, 93Bxx
A CIP catalogue record for this book is available from the Library of Congress, Washington D.C., USA Bibliographic information published by Die Deutsche Bibliothek Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at
.
ISBN 3-7643-7370-9 Birkhäuser Verlag, Basel – Boston – Berlin This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. For any kind of use permission of the copyright owner must be obtained. © 2006 Birkhäuser Verlag, P.O. Box 133, CH-4010 Basel, Switzerland Part of Springer Science+Business Media Printed on acid-free paper produced from chlorine-free pulp. TCF∞ Cover design: Heinz Hiltbrunner, Basel Printed in Germany ISBN-10: 3-7643-7370-9 e-ISBN: 3-7643-7431-4 ISBN-13: 978-3-7643-7370-2 987654321
www.birkhauser.ch
Contents Editorial Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
ix
D. Alpay and I. Gohberg Discrete Analogs of Canonical Systems with Pseudo-exponential Potential. Definitions and Formulas for the Spectral Matrix Functions . . . . . . . . . . 1 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Review of the continuous case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1 The asymptotic equivalence matrix function . . . . . . . . . . . . . . . . . . . . 2.2 The other characteristic spectral functions . . . . . . . . . . . . . . . . . . . . . . 2.3 The continuous orthogonal polynomials . . . . . . . . . . . . . . . . . . . . . . . . . 2.4 Perturbations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 The discrete case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1 First-order discrete system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 The asymptotic equivalence matrix function . . . . . . . . . . . . . . . . . . . . 3.3 The reflection coefficient function and the Schur algorithm . . . . . . 3.4 The scattering function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5 The Weyl function and the spectral function . . . . . . . . . . . . . . . . . . . . 3.6 The orthogonal polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.7 The spectral function and isometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Two-sided systems and an example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1 Two-sided discrete first-order systems . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2 An illustrative example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 4 4 8 14 16 19 19 22 27 29 31 33 37 39 39 41 44
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘ Matrix-J-unitary Non-commutative Rational Formal Power Series . . .
49
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 More on observability, controllability, and minimality in the non-commutative setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Matrix-J-unitary formal power series: A multivariable non-commutative analogue of the line case . . . . . . . . . . . 4.1 Minimal Givone–Roesser realizations and the Lyapunov equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
51 54 60 67 68
vi
Contents
5
6
7
8
4.2 The associated structured Hermitian matrix . . . . . . . . . . . . . . . . . . . . 4.3 Minimal matrix-J-unitary factorizations . . . . . . . . . . . . . . . . . . . . . . . . 4.4 Matrix-unitary rational formal power series . . . . . . . . . . . . . . . . . . . . . Matrix-J-unitary formal power series: A multivariable non-commutative analogue of the circle case . . . . . . . . . 5.1 Minimal Givone–Roesser realizations and the Stein equation . . . . 5.2 The associated structured Hermitian matrix . . . . . . . . . . . . . . . . . . . . 5.3 Minimal matrix-J-unitary factorizations . . . . . . . . . . . . . . . . . . . . . . . . 5.4 Matrix-unitary rational formal power series . . . . . . . . . . . . . . . . . . . . . Matrix-J-inner rational formal power series . . . . . . . . . . . . . . . . . . . . . . . . . . 6.1 A multivariable non-commutative analogue of the half-plane case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2 A multivariable non-commutative analogue of the disk case . . . . . Matrix-selfadjoint rational formal power series . . . . . . . . . . . . . . . . . . . . . . . 7.1 A multivariable non-commutative analogue of the line case . . . . . . 7.2 A multivariable non-commutative analogue of the circle case . . . . Finite-dimensional de Branges–Rovnyak spaces and backward shift realizations: The multivariable non-commutative setting . . . . . . . . 8.1 Non-commutative formal reproducing kernel Pontryagin spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.2 Minimal realizations in non-commutative de Branges–Rovnyak spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
72 74 75 77 77 83 84 85 87 87 91 96 96 100 102 102 106 110 111
D.Z. Arov and O.J. Staffans State/Signal Linear Time-Invariant Systems Theory, Part I: Discrete Time Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 1 2 3 4 5 6 7 8 9 10
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . State/signal nodes and trajectories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The driving variable representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The output nulling representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The input/state/output representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Transfer functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Signal behaviors, external equivalence, and similarity . . . . . . . . . . . . . . . . Dilations of state/signal systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Acknowlegment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
116 120 123 128 132 138 146 153 167 176 176 176
Contents
vii
J.A. Ball, G. Groenewald and T. Malakorn Conservative Structured Noncommutative Multidimensional Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Structured noncommutative multidimensional linear systems: basic definitions and properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Adjoint systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Dissipative and conservative structured multidimensional linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Conservative SNMLS-realization of formal power series in the class SAG (U, Y) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer The Bezout Integral Operator: Main Property and Underlying Abstract Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Spectral theory of entire matrix functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1 A review of the spectral data of an analytic matrix function . . . . 2.2 Eigenvalues and Jordan chains in terms of realizations . . . . . . . . . . 2.3 Common eigenvalues and common Jordan chains in terms of realizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4 Common spectral data of entire matrix functions . . . . . . . . . . . . . . . 3 The null space of the Bezout integral operator . . . . . . . . . . . . . . . . . . . . . . . 3.1 Preliminaries on convolution integral operators . . . . . . . . . . . . . . . . . 3.2 Co-realizations for the functions A, B, C, D . . . . . . . . . . . . . . . . . . . . . 3.3 Quasi commutativity in operator form . . . . . . . . . . . . . . . . . . . . . . . . . . 3.4 Intertwining properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5 Proof of the first main theorem on the Bezout integral operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 A general scheme for defining Bezout operators . . . . . . . . . . . . . . . . . . . . . . 4.1 A preliminary proposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2 Definition of an abstract Bezout operator . . . . . . . . . . . . . . . . . . . . . . . 4.3 The Haimovici-Lerer scheme for defining an abstract Bezout operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4 The Bezout integral operator revisited . . . . . . . . . . . . . . . . . . . . . . . . . . 4.5 The null space of the Bezout integral operator . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
180 183 191 193 199 220
225 226 228 229 232 234 237 241 242 244 248 251 254 256 257 260 262 264 266 268
Editorial Introduction This volume of the Operator Theory: Advances and Applications series (OTAA) is the first volume of a new subseries. This subseries is dedicated to connections between the theory of linear operators and the mathematical theory of linear systems and is named Linear Operators and Linear Systems (LOLS). As the existing subseries Advances in Partial Differential Equations (ADPE), the new subseries will continue the traditions of the OTAA series and keep the high quality of the volumes. The editors of the new subseries are: Daniel Alpay (Beer–Sheva, Israel), Joseph Ball (Blacksburg, Virginia, USA) and Andr´ ´e Ran (Amsterdam, The Netherlands). In the last 25–30 years, Mathematical System Theory developed in an essential way. A large part of this development was connected with the use of the state space method. Let us mention for instance the “theory of H∞ control”. The state space method allowed to introduce in system theory the modern tools of matrix and operator theory. On the other hand the state space approach had an important impact on Algebra, Analysis and Operator Theory. In particular it allowed to solve explicitly some problems from interpolation theory, theory of convolution equations, inverse problems for canonical differential equations and their discrete analogs. All these directions are planned to be present in the subseries LOLS. The editors and the publisher are inviting authors to submit their manuscripts for publication in this subseries. This volume contains five essays. The essay of D. Arov and O. Staffans, State/signal linear time-invariant systems theory, part I: discrete time systems, contains new results in classical system theory. The essays of D. Alpay and D.S. Kalyuzhny˘ ˘ı-Verbovetzki˘ı, Matrix-J-unitary non-commutative rational formal power series, and of J.A. Ball, G. Groenewald and T. Malakorn, Conservative structured noncommutative multidimensional linear systems are dedicated to a new branch in Mathematical system theory where discrete time is replaced by the free semigroup with N generators. The essay of I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer, The Bezout integral operator: main property and underlying abstract scheme contains new applications of the state space method to the theory of Bezoutiants and convolution equations. The essay of D. Alpay and I. Gohberg Discrete analogs of canonical systems with pseudo-exponential potential. Definitions and formulas for the spectral matrix functions is concerned with new results and formulas for the discrete analogs of canonical systems. Daniel Alpay, Israel Gohberg
Operator Theory: Advances and Applications, Vol. 161, 1–47 c 2005 Birkhauser ¨ Verlag Basel/Switzerland
Discrete Analogs of Canonical Systems with Pseudo-exponential Potential. Definitions and Formulas for the Spectral Matrix Functions Daniel Alpay and Israel Gohberg Abstract. We first review the theory of canonical differential expressions in the rational case. Then, we define and study the discrete analogue of canonical differential expressions. We focus on the rational case. Two kinds of discrete systems are to be distinguished: one-sided and two-sided. In both cases the analogue of the potential is a sequence of numbers in the open unit disk (Schur coefficients). We define the characteristic spectral functions of the discrete systems and provide exact realization formulas for them when the Schur coefficients are of a special form called strictly pseudo-exponential. Mathematics Subject Classification (2000). 34L25, 81U40, 47A56.
Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 2 Review of the continuous case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.1 The asymptotic equivalence matrix function . . . . . . . . . . . . . . . . . . . . . . . 4 2.2 The other characteristic spectral functions . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.3 The continuous orthogonal polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.4 Perturbations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3 The discrete case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.1 First-order discrete system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.2 The asymptotic equivalence matrix function . . . . . . . . . . . . . . . . . . . . . . 22 3.3 The reflection coefficient function and the Schur algorithm . . . . . . . . 27 3.4 The scattering function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.5 The Weyl function and the spectral function . . . . . . . . . . . . . . . . . . . . . . 31 3.6 The orthogonal polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 3.7 The spectral function and isometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 4 Two-sided systems and an example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 4.1 Two-sided discrete first-order systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 4.2 An illustrative example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2
D. Alpay and I. Gohberg
1. Introduction Canonical differential expressions are differential equations of the form −iJ
∂Θ (x, λ) = λΘ(x, λ) + v(x)Θ(x, λ), ∂x
where
v(x) =
0 k(x)∗
k(x) 0
,
J=
x ≥ 0, λ ∈ C,
In 0
0 −IIn
(1.1)
,
(R+ ) is called the potential. Such systems were introduced by and where k ∈ Ln×n 1 M.G. Kre˘ ˘ın (see, e.g., [37], [38]). Associated to (1.1) are a number of functions of λ, which we called in [10] the characteristic spectral functions of the canonical system. These are: 1. 2. 3. 4. 5.
The The The The The
asymptotic equivalence matrix function V (λ). scattering function S(λ). spectral function W (λ). Weyl function N (λ). reflection coefficient function R(λ).
Direct problems consist in computing these functions from the potential k(x) while inverse problems consist in recovering the potential from one of these functions. In the present paper we study discrete counterparts of canonical differential expressions. To present our approach, we first review various facts on the telegraphers’ equations. By the term telegraphers’ equations, one means a system of differential equations connecting the voltage and the current in a transmission line. The case of lossy lines can be found for instance in [45] and [18]. We here consider the case of lossless lines and follow the arguments and notations in [16, Section 2], [19, p. 110–111] and [46]. The telegraphers’ equations which describe the evolution of the voltage v(x, t) and current i(x, t) in a lossless transmission line can be given as: ∂v ∂i (x, t) + Z(x) (x, t) = 0 ∂x ∂t ∂i ∂v (x, t) + Z(x)−1 (x, t) = 0. ∂x ∂t
(1.2)
In these equations, Z(x) represents the local impedance at the point x. A priori there may be points where Z(x) is not continuous, but it is important to bear in mind that voltage and current will be continuous at these points. Let us assume that Z(x) > 0 and is continuously differentiable on an interval (a, b), and introduce the new variables V (x, t) = Z(x)−1/2 v(x, t), I(x, t) = Z(x)1/2 i(x, t),
Analogs of Canonical Systems with Pseudo-exponential Potential
3
and V (x, t) + I(x, t) , 2 V (x, t) − I(x, t) . WL (x, t) = 2
WR (x, t) =
Then the function W (x, t) =
1 Z(x)−1/2 WR (x, t) = WL (x, t) 2 Z(x)−1/2
Z(x)1/2 −Z(x)1/2
v(x, t) i(x, t)
(1.3)
satisfies the differential equation, also called symmetric two components wave equation (see [16, equation (2.6) p. 362], [46, p. 256], [19, equation (3.3) p. 111]) ∂W (x, t) ∂W (x, t) 0 −κ(x) = −J + W (x, t), −κ(x) 0 ∂x ∂t where Z (x) 1 0 . (1.4) J= and κ(x) = 0 −1 2Z(x) We distinguish two cases: (a) The case where Z(x) > 0 and is continuously differentiable on R+ . Taking the (inverse) Fourier transform f → f(λ) = R eiλt f (t)dt on both sides we get to a canonical differential expressions (also called Dirac type system), with (x, λ). The theory of canonical differential k(x) = iκ(x) and Θ(x, λ) = W expressions is reviewed in the next section. (b) The case where Z(x) is constant on intervals [nh, (n + 1)h) for some preassigned h > 0. We are then lead to discrete systems. The paper consists of three sections besides the introduction. In Section 2 we review the main features of the continuous case. The third section presents the discrete systems to be studied. These are of two kinds, one-sided and two-sided. Section 3 also contains a study of one-sided systems and of their associated characteristic spectral functions. In Section 4 we focus on two-sided systems and we also present an illustrative example. In the parallel between the continuous and discrete cases a number of problems remains to be considered to obtain a complete picture. In the sequel to the present paper we study inverse problems associated to these first-order systems. To conclude this introduction we set some definitions and notation. The open unit disk will be denoted by D, the unit circle by T, and the open upper half-plane by C+ . The open lower half-plane is denoted by C− and its closure by C− . We will make use of the Wiener algebras of the real line and of the unit circle. These are defined as follows. The Wiener algebra of the real line W n×n (R) = W n×n consists of the functions of the form ∞ eiλt u(t)dt (1.5) f (λ) = D + −∞
4
D. Alpay and I. Gohberg
where D ∈ Cn×n and where u ∈ Ln×n (R). Usually we will not stress the depen1 n×n n×n (resp. W− ) consists of the functions of the dence on R. The sub-algebra W+ form (1.5) for which the support of u is in R+ (resp. in R− ). The Wiener algebra W(T) (we will usually write W rather than W(T)) of the unit circle consists of complex-valued functions f (z) of the form f z f (z) = Z
for which def.
f W =
|ff | < ∞.
Z
2. Review of the continuous case 2.1. The asymptotic equivalence matrix function We first review the continuous case, and in particular the definitions and main properties of the characteristic spectral functions. See, e.g., [7], [11], [10] for more information. We restrict ourselves to the case where the potential is of the form −1 ∗ k(x) = −2ceixa Ip + Ω Y − e−2ixa Y e2ixa (b + iΩc∗ ) , (2.1) where (a, b, c) ∈ Cp×p × Cp×n × Cn×p is a triple of matrices with the properties that p and ∪m ∩m =0 ker ca = {0} =0 Im a b = C for m large enough. In system theory, see for instance [30], the first condition means that the pair (c, a) is observable while the second means that the pair (a, b) is controllable. When both conditions are in force, the triple is called minimal. See also [14] for more information on these notions. We assume moreover that the spectra of a and of a× = a − bc are in the open upper half-plane. Furthermore Ω and Y in (2.1) belong to Cp×p and are the unique solutions of the Lyapunov equations i(Ωa×∗ − a× Ω) = −i(Y a − a∗ Y ) =
bb∗ , c∗ c.
(2.2) (2.3)
This class of potentials was introduced in [7] and called in [26] strictly pseudoexponential potentials. Note that both Ω and Y are strictly positive since the triple (a, b, c) is minimal, and that Ip + ΩY and Ip + Y Ω are invertible since √ √ det(IIp + ΩY ) = det(IIp + Y Ω) = det(IIp + Y Ω Y ). Note also that asymptotically, k(x) ∼ −2ceixa (IIp + ΩY )−1 (b + iΩc∗ )
(2.4)
as x → +∞. Potentials of the form (2.1) can also be represented in a different form; see (2.22).
Analogs of Canonical Systems with Pseudo-exponential Potential
5
We first define the asymptotic equivalence matrix function. To that purpose (and here we follow closely our paper [12]) let F, G and T be the matrices given by ia 0 0 f1∗ −c 0 , T = , G = F =i , (2.5) c∗ 0 0 f1 0 −ia∗ where f1 = (b∗ − icΩ)(IIp + Y Ω)−1 . Theorem 2.1. Let Q(x, y) be defined by Q(x, y) = F exT (II2p − exT ZexT )−1 eyT G where (F, G, T ) are defined by (2.5) and where Z is the unique solution of the matrix equation T Z + ZT = −GF. Then the matrix function ∞ U (x, λ) = eiλJx + Q(x, u)eiλJu du x
is the unique solution of (1.1) with the potential as in (2.1), subject to the condition −ixλ In 0 e (2.6) lim U (x, λ) = I2n , λ ∈ R. 0 eixλ In x→∞ Furthermore, the Cn×n -valued blocks in the decomposition of the matrix function U (0, λ) = (U Uij (0, λ)) are given by U11 (0, λ)
= In + icΩ(λIIp − a∗ )−1 c∗ ,
U21 (0, λ)
= (−b∗ + icΩ)(λIIp − a∗ )−1 c∗ ,
U12 (0, λ)
= −c(IIp + ΩY )(λIIp − a)−1 (IIp + ΩY )−1 (b + iΩc∗ ),
U22 (0, λ)
= In − (ib∗ Y + cΩY )(λIIp − a)−1 (IIp + ΩY )−1 (b + iΩc∗ ).
See [9, Theorem 2.1]. Definition 2.2. The function V (λ) = U (0, λ) is called the asymptotic equivalence matrix function. The terminology asymptotic equivalence matrix function is explained in the following theorem: Theorem 2.3. The asymptotic equivalence matrix function has the following property: let x ∈ R and ξ0 and ξ1 in C2n . Let f0 (x, λ) = eiλxJ ξ0 be the C2n -valued solution to (1.1) corresponding to k(x) = 0 and f0 (0, λ) = ξ0 and let f1 (x, λ) corresponding to an arbitrary potential k of the form (2.1), with f1 (0, λ) = ξ1 . The two solutions are asymptotic in the sense that lim f1 (x, λ) − f0 (x, λ) = 0
x→∞
if and only if ξ1 = U (0, λ)ξ0 . For a proof, see [10, Section 2.2].
6
D. Alpay and I. Gohberg
The asymptotic equivalence matrix function takes J-unitary values on the real line: V (λ)JV (λ)∗ = J, λ ∈ R. We recall the following: if R be a C2n×2n -valued rational functions analytic at infinity, it can be written as R(λ) = D + C(λIIm − A)−1 B, where A, B, C and D are matrices of appropriate sizes. Such a representation of R is called a realization. The realization is said to be minimal if the size of A is minimal (equivalently, the triple (A, B, C) is minimal, in the sense recalled above). The McMillan degree of R is the size of the matrix A in any minimal realization. Minimal realizations of rational matrix-valued functions taking J-unitary values on the real line were characterized in [5, Theorem 2.8 p. 192]: R takes J-unitary values on the real line if and only if there exists an Hermitian invertible matrix H ∈ Cm×m solution of the system of equations i(A∗ H − HA) = C =
C ∗ JC iJB ∗ H.
(2.7) (2.8)
The matrix H is uniquely defined by the minimal realization of R and is called the associated Hermitian matrix to the minimal realization matrix function. The matrix function R is moreover J-inner, that is J-contractive in the open upper half-plane: R(λ)JR(λ) ≤ J
for all points of analyticity in the open upper half-plane,
if and only if H > 0. The asymptotic equivalence matrix function V (λ) has no pole on the real line, but an arbitrary rational function which takes J-unitary values on the real line may have poles on the real line. See [5] and [4] for examples. The next theorem presents a minimal realization of the asymptotic equivalence matrix function and its associated Hermitian matrix. Theorem 2.4. Let k(x) be given in the form (2.1). Then, a minimal realization of the asymptotic equivalence matrix function associated to the corresponding canonical differential system is given by V (λ) = I2n + C(λII2p − A)−1 B, where ∗ ∗ a 0 c 0 A= , B= 0 a 0 (IIp + ΩY )−1 (b + iΩc∗ ) and
C=
icΩ −b∗ + icΩ
−c(IIp + ΩY ) −ib∗ Y − cΩY
,
and the associated Hermitian matrix is given by Ω i(IIp + ΩY ) H= . −i(IIp + Y Ω) (IIp + Y Ω)Y We now prove a factorization result for V (λ). We first recall the following: let as above R be a rational matrix-valued function analytic at infinity. The factorization
Analogs of Canonical Systems with Pseudo-exponential Potential
7
R = R1 R2 of R into two other rational matrix-valued functions analytic at infinity (all the functions are assumed to have the same size) is said to be minimal if deg R = deg R1 + deg R2 . Minimal factorizations of rational matrix-valued functions have been characterized in [14, Theorem 1.1 p. 7]. Assume now that R takes J-unitary values on the real line. Minimal factorizations of R into two factors which are J-unitary on the real line were characterized in [5]. Such factorizations are called J-unitary factorizations. To recall the result (see [5, Theorem 2.6 p. 187]), we introduce first some more notations and definitions: let H ∈ Cp×p be an invertible Hermitian matrix. The formula [x, y]H = y ∗ Hx,
x, y ∈ Cp
defines a non-degenerate and in general indefinite inner product. Two vectors are orthogonal with respect to this inner product if [x, y]H = 0. The orthogonal complement of a subspace M ⊂ Cp is: M[⊥] = {x ∈ Cp ; [x, m]H = 0 ∀m ∈ M} . We refer to [29] for more information on finite-dimensional indefinite inner product spaces. Theorem 2.5. Let R be a rational matrix-valued function analytic at infinity and J-unitary on the real line, and let R(λ) = D + C(zIIp − A)−1 B be a minimal realization of R, with associated matrix H. Let M be a A-invariant subspace of Cp non-degenerate with respect to the inner product [·, ·]H . Let π denote the orthogonal (with respect to [·, ·]H ) projection such that ker π = M,
Im π = M[⊥]
and let D = D1 D2 be a factorization of D into two J-unitary constants. Then R = R1 R2 with R1 (z) = D1 + C(zIIp − A)−1 (IIp − π)BD2−1 R2 (z) = D2 + D1−1 Cπ(zIIp − A)−1 BD2 is a minimal J-unitary factorization of R. Conversely, every J-unitary factorization of R is obtained in such a way. As a consequence we have: Theorem 2.6. Let V (λ) be the asymptotic equivalence matrix function of a canonical differential expression (1.1) with potential of the form (2.1). Then it admits a minimal factorization V (λ) = V1 (λ)V V2 (λ)−1 where V1 and V2 are J-inner and of same degree.
8
D. Alpay and I. Gohberg
To prove this result we consider the realization of V (λ) given in Theorem 2.4 p and note that the space C0 is A-invariant and H-non-degenerate (in fact, Hpositive). The factorization follows from Theorem 2.5. The fact that V2 is inner follows from ∗ Ip 0 0 Ω 0 Ip H= . −i(IIp + Y Ω)Ω−1 Ip −i(IIp + Y Ω)Ω−1 Ip 0 −Ω−1 − Y To prove this last formula we have used the formula for Schur complements: A11 I 0 0 A11 A12 I A−1 11 A12 = A21 A22 A21 A−1 0 A22 − A21 A−1 I 0 I 11 11 A12 for matrices of appropriate sizes and A11 being invertible. See [20, formula (0.3) p. 3].
One could have started with the space C0p , which is also A-invariant and Hpositive. In particular, the above factorization is not unique. 2.2. The other characteristic spectral functions In this section we review the definitions and main properties of the characteristic spectral functions associated to a canonical differential expression. It follows from Theorem 2.4 that U (0, λ) has no pole on the real line and that, furthermore: U11 (0, λ)U11 (0, λ)∗ − U12 (0, λ)U12 (0, λ)∗ = In U22 (0, λ)U U22 (0, λ)∗ − U21 (0, λ)U U21 (0, λ)∗ = In and
U11 (0, λ)∗ U12 (0, λ) = U21 (0, λ)∗ U22 (0, λ)
for real λ. In particular, U11 (0, λ) is invertible on the real line and U21 (0, λ)U11 (0, λ)−1 is well defined and takes contractive values on the real line. Definition 2.7. The function R(λ) = (U U21 (0, λ)U11 (0, λ)−1 )∗ = U12 (0, λ)U U22 (0, λ)−1 ,
λ ∈ R,
is called the reflection coefficient function. To present an equivalent definition of the reflection coefficient function, we need some notation: if A B Θ= ∈ C(p+q)×(p+q) , A ∈ Cp×p , and X ∈ Cp×q C D we set
TΘ (X) = (AX + B)(CX + D)−1 .
Note that TΘ1 Θ2 (X) = TΘ1 (T TΘ2 (X)) when all expressions are well defined.
(2.9)
Analogs of Canonical Systems with Pseudo-exponential Potential
9
Theorem 2.8. Let Θ(x, λ) = U (x, λ)U (0, λ)−1 . Then, Θ(x, λ) is also a solution of (1.1). It is an entire function of λ. It is J-expansive in C+ ,
λ∈R ∗ = 0, J − Θ(x, λ)JΘ(x, λ) ≤ 0, λ ∈ C+ , and satisfies the initial condition Θ(0, λ) = I2n . Moreover R(λ) = lim TΘ(x,λ)−1 (0), x→∞
λ ∈ R.
(2.10)
The matrix function Θ(x, λ) is called the matrizant, or fundamental solution of the canonical differential expression. Its properties may be found in [22, p. 150]. For real λ the matrix function U (0, λ) is J-unitary. Hence we have: Θ(x, λ)−1 = U (0, λ)U (x, λ)−1 . The result follows using (2.9) and the asymptotic property (2.6). In fact, the function R is analytic and takes contractive values in the closed lower half-plane. For a proof and references, see [10] and [13, Theorem 3.1 p 6]. Theorem 2.9. A minimal realization of R(λ) is given by R(λ) = −c(λIIp − (a + iΩc∗ c))−1 (b + iΩc∗ ).
(2.11) ∗
See [10]. It follows in particular that the spectrum of the matrix a + iΩc c is in the open upper half-plane. Note that Ω is not arbitrary but is related to a, b and c via the Lyapunov equation (2.2). A direct proof that R is analytic and contractive in C− can be given using the results in [33], as we now explain. Definition 2.10. A Cn×n -valued rational function R is called a proper contraction if it takes contractive values on the real line and if moreover it is analytic at infinity and such that R(∞)R(∞)∗ < In . The following results are respectively [33, Theorem 3.2 p. 231, Theorem 3.4 p. 235]. Theorem 2.11. Let R be a Cn×n -valued rational function analytic at infinity and let R(z) = D + C(zI − A)−1 B be a minimal realization of W . Let α β B(IIn − D∗ D)−1 B ∗ A + BD∗ (IIn − DD∗ )−1 C A= = . γ α∗ C ∗ (IIn − DD∗ )−1 C A∗ + C ∗ (IIn − DD∗ )−1 DB ∗ Then the 1) The 2) The 3) The
following are equivalent: matrix function R is a proper contraction. real eigenvalues of A have even partial multiplicities. Riccati equation XγX − iXα∗ + iαX + β = 0.
has an Hermitian solution.
(2.12)
10
D. Alpay and I. Gohberg
The matrix A is called the state characteristic matrix of W and the Riccati equation (2.12) is called its state characteristic equation. Theorem 2.12. Let R be a Cn×n -valued proper contraction, with minimal realization R(z) = D + C(zI − A)−1 B and let (2.12) be its state characteristic equation. Then, any Hermitian solution of (2.12) is invertible and the number of negative eigenvalues of X is equal to the number of poles of R in C− . Consider now the minimal realization (2.11). The corresponding state characteristic equation is Xc∗ cX − iX(a∗ − icc∗ Ω) + i(a + iΩcc∗ )X + (b + iΩc∗ )(b∗ − icΩ) = 0. To show that X = Ω is a solution of this equation is equivalent to prove that Ω solves the Lyapunov equation (2.3). Indeed, 0 = Ωc∗ cΩ − iΩ(a∗ − icc∗ Ω) + i(a + iΩcc∗ )Ω + (b + iΩc∗ )(b∗ − icΩ) ⇐⇒ 0 = −iΩa∗ + iaΩ + bb∗ − iΩ(a − c∗ b∗ ) + i(a − bc)Ω + bb∗ ⇐⇒ 0 = i(a× Ω − Ωa×∗ ) + bb∗ , which is (2.3). The scattering matrix function is defined as follows: Theorem 2.13. The differential equation (1.1) has a uniquely defined C2n×n -valued solution such that for λ ∈ R, In −IIn X(0, λ) = 0, lim 0 eixλ In X(x, λ) = In . x→∞
The limit
lim e−ixλ In
x→∞
0 X(x, λ) = S(λ)
exists for all real λ and is called the scattering matrix function of the canonical system. The scattering matrix function takes unitary values on the real line, belongs to the Wiener algebra W and admits a factorization S = S+ S− where S+ and its inverse are analytic in the closed upper half-plane while S− and its inverse are analytic in the closed lower half-plane. We note that the general factorization of a function in the Wiener algebra and unitary on the real line involves in general a diagonal term taking into account quantities called partial indices; see [31], [32], [34], [17]. We also note that conversely, functions with the properties as in the theorem are scattering matrix functions of a more general class of differential equations; see [41] and the discussion in [7, Appendix].
Analogs of Canonical Systems with Pseudo-exponential Potential
11
Theorem 2.14. The scattering matrix function of a canonical system (1.1) with potential (2.1) is given by: = (IIn + b∗ (λIIp − a∗ )−1 c∗ )−1
S(λ)
×(IIn − (ib∗ Y − c)(λIIp − a)−1 (IIp + ΩY )−1 (b + iΩc∗ )). A minimal realization of the scattering matrix function is given by S(λ) = In + C(λII2p − A)−1 B, where a b(icΩ − b∗ ) , A= 0 a×∗ b , B= (IIp + Y Ω)−1 (c∗ + iY b) C = (c
Set G=
icΩ − b∗ ). −Ω −iIIp
iIIp −Y (IIp + ΩY )−1
.
Then it holds that i(AG − GA∗ ) = CG =
−BB ∗ , iB ∗ ,
and thus S takes unitary values on the real line. For a proof, see [8, p. 7]. The last statement follows from [5, Theorem 2.1 p. 179], that is from equations (2.7) and (2.8) with H = X −1 and J = Ip . Since ∗ Ip −Ω 0 Ip 0 0 X= 0 (Ω + ΩY Ω)−1 iΩ−1 Ip iΩ−1 Ip
the space leads to:
Cp 0
is A invariant and H-negative. Thus Theorem 2.5 on factorizations
Theorem 2.15. The scattering matrix function of a canonical system (1.1) with potential (2.1) admits a minimal factorization of the form S(z) = U1 (z)−1 U2 (z) where both U1 and U2 are inner (that is, are contractive in C+ and take unitary values on the real line). The fact that U2 is inner (and not merely unitary) stems from the fact that the Schur complement of −Ω in H is equal to −Y (IIp + ΩY )−1 − iIIp (−Ω)−1 (−iIIp ) = (Ω + ΩY Ω)−1 and in particular is strictly positive. Such a factorization result was also proved in [12, Theorem 7.1] using different methods. It is a particular case of a factorization result of M.G. Kre˘n ˘ and H. Langer for functions having a finite number of negative squares; see [39].
12
D. Alpay and I. Gohberg
We now turn to the spectral function. We first recall that the operator df (x) − v(x)f (x) dx restricted to the space of C2n -valued absolutely continuous functions with entries in L2 and such that (IIn − In )f (0) = 0 Hf (x) = −iJ
is self-adjoint. Definition 2.16. A positive function W : R → Cn×n is called a spectral function if there is a unitary map U from Ln2 onto Ln2 (W ) mapping H onto the operator of multiplication by the variable in Ln2 (W ). Theorem 2.17. The function V22 (λ) − V12 (λ))−1 W (λ) = (V V22 (λ) − V12 (λ))−∗ (V is a spectral function, the map U being given by ∞ 1 In In Θ(x, λ)∗ f (x)dx. F (λ) = √ 2π 0
(2.13)
A direct proof in the rational case can be found in [26]. When k(x) ≡ 0, we have that W (λ) = In dλ, and the unitary map (2.13) is readily identified with the Fourier transform. Definition 2.18. The Weyl coefficient function N (λ) is defined in the open upper half plane; it is the unique Cn×n -valued function such that ∞ In In In In −iN (λ) ∗ ∗ iN (λ) In Θ(x, λ) Θ(x, λ) dx In −IIn In −IIn In 0 is finite for −i(λ − λ∗ ) > 0. In the setting of differential expressions (1.1), the function N was introduced in [27]. The motivation comes from the theory of the Sturm-Liouville equation. The Weyl coefficient function is analytic in the open upper half-plane and has a nonnegative imaginary part there. Such functions are called Nevanlinna functions. Theorem 2.19. The Weyl coefficient function is given by the formula N (λ) = i(U12 (0, λ) + U22 (0, λ))(U12 (0, λ) − U22 (0, λ))−1 = i(IIn − 2c(λIIp − a× )−1 (b + iΩc∗ )).
(2.14)
Proof. We first look for a Cn×2n -valued function P (λ) such that x → P (λ)Θ(x, λ)∗ has square summable entries for λ ∈ C+ . Let U (λ, x) be the solution of the differential system (1.1) subject to the asymptotic condition (2.6). Then, U (x, λ) = Θ(x, λ)U (0, λ). We thus require the entries of the function x → P (λ)U (0, λ)−∗ U (x, λ)
(2.15)
Analogs of Canonical Systems with Pseudo-exponential Potential
13
to be square summable. By definition of U , it is necessary for P (λ)U (0, λ)−∗ to be of the form (0, p(λ)) where p(λ) is Cn×n -valued. It follows from the definition of U (0, λ) that one can take P (λ) = 0 In U (0, λ)∗ = U12 (0, λ)∗ U22 (0, λ)∗ and hence the necessity condition. Conversely, we have to show that the function (2.15) has indeed summable entries. But this is just doing the above argument backwards. The realization formula follows then from the realization formulas for the block entries of the asymptotic equivalence matrix function. Any of the functions in the spectral domain determines all the others, as follows from the next theorem: Theorem 2.20. Assume given a differential system of the form (1.1) with potential k(x) of the form (2.1). Assume W (λ), V (λ), R(λ), S(λ) and N (λ) are the characteristic spectral functions of (1.1), and let S = S− S+ be the spectral factorization of the scattering matrix function S, where S− and its inverse are invertible in the closed lower half-plane and S+ and its inverse are invertible in the closed upper half-plane. Then, the connections between these functions are: W (λ) W (λ)
= S− (λ)−1 S− (λ)−∗ = S+ (λ)S+ (λ)∗ , = Im N (λ),
S(λ)
= S− (λ)S+ (λ),
R(λ)
= (iN (λ)∗ − In )(iN (λ)∗ + In )−1 ,
N (λ)
= i(IIn + R(λ)∗ )(IIn − R(λ)∗ )−1 , 1 (iN (λ)∗ + In )S− (λ)∗ (−iN (λ) − In )S+ (λ)−∗ = (iN (λ)∗ − In )S− (λ)∗ (−iN (λ) + In )S+ (λ)−∗ 2
V (λ) for λ ∈ R.
See [10, Theorem 3.1]. We note that R∗ = TV (0). We now wish to relate V to a unitary completion of the reflection coefficient function. It is easier to look at 0 In 0 In
V (λ) = V (λ) . In 0 In 0 We set P =
I2n + J = 2
In 0
0 0
and
Q=
I2n − J = 2
0 0
0 In
.
Theorem 2.21. Let Θ ∈ C2n×2n be such that det(P +QΘ) = 0. Then det(P −ΘQ) = 0 and def def. Θ× = (P Θ + Q)(P + QΘ)−1 = (P − ΘQ)−1 (ΘP − Q) (2.16)
14
D. Alpay and I. Gohberg
Finally I2n − Θ× Θ×
∗
∗
I2n − Θ× Θ×
=
(P − ΘQ)−1 (J − ΘJΘ∗ ) (P − ΘQ)−∗
(2.17)
=
(P + QΘ)−∗ (J − Θ∗ JΘ) (P + QΘ)−1 .
(2.18)
where A ∈ Cn×n . We have: In 0 In P + QΘ = , P − ΘQ = C D 0
Proof. We set Θ =
A C
B C
−B −D
.
Thus either of these matrices is invertible if and only if D is invertible. Thus both equalities in (2.16) make sense. To prove that they define the same object is equivalent to prove that (P − ΘQ)(P Θ + Q) = (ΘP − Q)(P + QΘ), i.e., since P Q = QP = 0, P Θ − ΘQ = ΘP − QΘ. This in turn clearly holds since P + Q = I2n . We now prove (2.17). The proof of (2.18) is similar and will be omitted. We have I2n − Θ× Θ×
∗
=
I2n − (P − ΘQ)−1 (ΘP − Q)(ΘP − Q)∗ (P − ΘQ)−∗
=
(P − ΘQ)−1{(P − ΘQ)(P − ΘQ)∗−(ΘP − Q)(ΘP − Q)∗ } ×(P − ΘQ)−∗
=
(P − ΘQ)−1 {P − Q + ΘQΘ∗ − ΘP Θ∗ } (P − ΘQ)−∗
and hence the result since J = P − Q.
The function defined by (2.16) is called the Potapov–Ginzburg transform of Θ. We have A − BD−1 C BD−1 × Θ = . (2.19) −D−1 C D−1 Theorem 2.22. The Potapov–Ginzburg transform of V is a unitary completion of the reflection coefficient function. Indeed, from (2.19) the 22 block of the Potapov–Ginzburg transform of V is exactly R. It is not a minimal completion (in particular it has n poles in C− ). See [20] for more information on this transform. Minimal unitary completions of a proper contraction are studied in [33, Theorem 4.1 p. 236]. 2.3. The continuous orthogonal polynomials As already mentioned, for every x ≥ 0 the function λ → Θ(x,λ) = U (x,λ)U (0,λ)−1 is entire. Albeit their name, the continuous orthogonal polynomials are entire functions, first introduced by M.G. Kre˘ın (see [37]) and in terms of which one can
Analogs of Canonical Systems with Pseudo-exponential Potential
15
compute the matrix function Θ(x, λ). To define these functions we start with a function W of the form (2.20) W (λ) = In − eitλ ω(t)dt, λ ∈ R, R
with ω ∈ Ln×n (R) and such that W (λ) > 0 for all λ ∈ R. This last condition 1 insures that the integral equation T ΓT (t, s) − ω(t − u)ΓT (u, s)du = ω(t − s), t, s ∈ [0, T ] 0
has a unique solution for every T > 0. Definition 2.23. The continuous orthogonal polynomial is given by: 2t Γ2t (u, 0)e−iλu du . P (t, λ) = eitλ In + 0
Theorem 2.24. It holds that In In Θ(x, λ) = P (t, −λ) R(t, λ) 2t where R(t, λ) = eitλ In + 0 Γ2t (2t − u, 2t)e−iλu du . In view of Theorem 2.20, note that every rational function analytic at infinity, such that W (∞) = In , with no poles and strictly positive on the real line, is the spectral function of a canonical differential expression of the form (1.1) with potential of the form (2.1). Furthermore, let W (λ) = In + C(λIIp − A)−1 B be a minimal realization of W . Then, W is of the form (2.20) with
iCe−iuA (IIp − P )B, u > 0, ω(u) = −iCe−iuA P B, u < 0, where P is the Riesz projection of A in C+ . We recall that P = (ζIIp − A)−1 dζ γ
where γ is a positively oriented contour which encloses only the eigenvalues of A in C+ . Theorem 2.25. Let W be a rational Cn×n -valued function analytic and invertible on R and at infinity. Assume moreover that W (λ) > 0 for real λ and that W (∞) = In . Let W (λ) = In + C(λIIp − A)−1 B be a minimal realization of W . Let P (resp. P × ) denote the Riesz projection corresponding to the eigenvalues of A (resp. of A× = A − BC) in C+ . Then, the continuous orthogonal polynomials P (t, λ) are given by the formula × P (t, λ) = eiλt In + C(λIIp + A× )−1 (IIp − e−2iλt e−2itA )π2t B where
×
πt = (IIp − P + P e−itA )−1 (IIp − P ).
16
D. Alpay and I. Gohberg
Furthermore, lim e−itλ P (t, λ) = S− (−λ)∗ .
t→∞
(2.21)
See [7, Theorem 3.3 p 10]. The computations in [7] use exact formulas for the function ΓT (t, s) in terms of the realization of W which have been developed in [15]. We note that the potential k(x) can be written as −1 × PB (2.22) k(x) = 2C P e−2ixA |Im P in terms of the realization of the spectral function W . 2.4. Perturbations In this subsection we address the following question: assume that k(x) is a strictly pseudo-exponential potential. Is −k(x) also such a potential? This is not quite clear from formulas (2.1) or (2.22). One could attack this problem using the results in [11], where we studied a trace formula for a pair of self-adjoint operators corresponding to the potentials k(x) and −k(x). Here we present a direct argument in the rational case. More precisely, if N is a Nevanlinna function so are the three functions λ
→ −N −1 (λ),
λ λ
→ −N −1 (−λ∗ )∗ , → N (−λ∗ )∗ ,
and we have three associated weight functions W− (λ) W1 (λ)
= =
Im − N (λ)−1 , Im − N (−λ∗ )−∗ ,
W2 (λ)
=
Im N (−λ∗ )∗ .
The relationships between these three weight functions and the original weight function W and the associated potential have been reviewed in the thesis [36] and we recall the results in form of a table: The potential The weight function 0 k(x) v(x) = W (λ) = Im N (λ) k(x)∗ 0 0 k(x) −v(x) = − W− (λ) = Im − N (λ)−1 k(x)∗ 0 0 k(x)∗ − W1 (λ) = Im N (−λ∗ )∗ k(x) 0 0 k(x)∗ W2 (λ) = Im − N (−λ∗ )−∗ k(x) 0
Analogs of Canonical Systems with Pseudo-exponential Potential
17
Let N (λ) = i(I + c(λI − a)−1 b) be a minimal realization of N . Then, W (λ) = I + C(λI − A)−1 B is a minimal realization of the weight function W , where 1 a 0 b c A= , B= , C= 0 a∗ c∗ 2
b∗ ,
(2.23)
and the Riesz projection corresponding to the spectrum of A in the open upper half-plane C+ is I 0 P = . (2.24) 0 0 Furthermore, the potential associated to the weight function W is given by (2.22) where A, B, C and P are given by (2.23) and (2.24), and ∗ a − bc − bb2 2 ∗ . A× = A − BC = ∗ − c2c (a − bc 2) Consider now the weight function W− . A minimal realization of −N (λ)−1 is given by −N (λ)−1 = i(I − c(λI − a× )−1 b), a× = a − bc, and a minimal realization of W− is given by W− (λ) = I + C− (λI − A− )−1 B− , where A− =
a× 0
0
a
×∗
,
b c∗
B− = B =
,
C− = −C = −
1 c 2
b∗ ,
and the Riesz projection corresponding to the spectrum of A− in the open upper half-plane C+ is P− = P given by (2.24). The potential associated to the weight function W− is given by −1 × k− (x) = −2C P e−2itA− |Im P P B, where A× − = A− − B− C− = Setting
D=
a − bc 2 0
0 ∗ (a − bc 2)
a−
bc 2 c c 2 ∗
(a
bb∗ 2 ∗ − bc 2)
,
Z=
0
∗
cc 2
we have A× = D − Z
and A× − = D + Z.
. b∗ b 2
0
,
18
D. Alpay and I. Gohberg
We are now in a position to prove the following result: Theorem 2.26. Let k(x) be a strictly pseudo-exponential potential with associated Weyl function N (λ). The potential associated to Im − N −1 is equal to k− (x) = −k(x). Proof. To prove that k− (x) = −k(x), it is enough to prove that ×
P e−itA |Im
P
= P e−it(A− −B− C− ) |Im
P.
To prove this equality, it is enough in turn to prove that for all positive integers , it holds that P A× |Im P = P (A− − B− C− ) |Im P , i.e., that I 0 I 0 I 0 I 0 = (D − Z) (D + Z) 0 0 0 0 0 0 0 0 for all positive integers . Let = ±1. The expression (D + Z) consists of a sum of terms of the form Dα1 ( Z)β1 Dα2 ( Z)β2 · · · , where the αi and the βi are equal to 1 or 0 and i (αi + βi ) = . Each factor diagonal. We consider two cases, namely Dαi Z βi for which βi = 0 is anti block β being odd or even. When β is odd, we have the product of an odd i i i i number of anti block diagonal matrices, and the result is antiblock diagonal, and so, premultiplying and postmultiplying this product by I0 00 we obtain the zero matrix. When i βi is even, the product is an even function of and have the same value at = 1 and at = −1. The case of the other two weight functions is treated in much the same way. We focus on W1 (λ) = Im N (−λ∗ )∗ . A minimal realization of N (−λ∗ )∗ is given by N (−λ∗ )∗ = i(I − b∗ (λI + a∗ )−1 c∗ ), and a minimal realization of the weight function W1 is therefore given by W1 (λ) = I + C1 (λI − A1 )−1 B1 ,
where
−a∗ 0 and the Riesz projection half-plane C+ is P1 = P function W1 is given by A1 =
∗ 1 0 c , B1 = , C1 = − b∗ c , −a b 2 corresponding to the spectrum of A1 in the open upper given by (2.24). The potential associated to the weight
× k1 (x) = 2C1 P1 e−2itA1 |Im
−1 P1
We claim that k1 (x) = −k(x)∗ . Indeed, ∗× k1 (x)∗ = 2B1∗ P1∗ P1 e2itA1 |Im
P1 B1 .
−1 P1
P1∗ C1∗ .
Analogs of Canonical Systems with Pseudo-exponential Potential But we have that B1∗ P1∗
= 2CP = c
0 ,
P1 C1∗
1 = −P B = − 2
b 0
,
19
× A∗× 1 = −A ,
which allows to conclude.
3. The discrete case 3.1. First-order discrete system In our previous work [6] we studied inverse problems for difference operators associated to Jacobi matrices. Such operators are the discrete counterparts of Sturm– Liouville differential operators, and one can associate to them a number of functions analytic in the open unit disk similar to the characteristic spectral functions of a canonical differential expression. In the present paper we chose a different avenue to define discrete systems, which has more analogy to the continuous case and is more natural. The analogies between the two cases are gathered in form of two tables at the end of the paper. We note that another type of discrete systems has been considered by L. Sakhnovich in [42, Section 2 p. 389]. Our starting point is the telegraphers’ equations (1.2). We now assume that the local impedance function Z(x) defined in (1.2) is equal to a constant, say Zn , on the interval [nh, (n + 1)h) for n = 0, 1, . . . In particular, Z(x) may have discontinuities at the points nh. On the open interval (nh, (n + 1)h), we have k(x) = 0 and equation (1.3) becomes ∂ ∂ ) 0 ( ∂x + ∂t W (x, t) = 0. ∂ ∂ 0 ( ∂x − ∂t ) v1n (x − t) v2n (x + t) on the interval (nh, (n + 1)h). Voltage and current are continuous at the points nh. Let us set α(n, t) = lim W (x, t).
Hence one can write
W (x, t) =
x→nh x>nh
Taking into account (1.3) one gets to: 1 Zn−1/2 α(n, t) = 2 Zn−1/2 1 Zn−1/2 −1 lim W (x, t) = x→nh 2 Zn−1/2 −1 x
v(nh, t) i(nh, t) 1/2 Zn−1 v(nh, t) . 1/2 i(nh, t) −Z Zn−1 1/2
Zn 1/2 −Z Zn
We define the backward shift operator on functions of the variable t ∆f (t) = f (t − h).
20
D. Alpay and I. Gohberg
We have:
v1,n−1 (nh − t) v2,n−1 (nh + t) v ((n − 1)h − (t − h)) = 1,n−1 v2,n−1 ((n − 1)h + t + h) ∆ 0 v1,n−1 ((n − 1)h − t) = 0 ∆−1 v2,n−1 ((n − 1)h + t) ∆ 0 = α(n − 1, t). 0 ∆−1
lim W (x, t) =
x→nh x
Thus,
∆ 0
and we have:
1 Zn−1/2 0 −1 α(n − 1, t) = ∆−1 2 Zn−1/2 −1
1/2
Zn−1 1/2 −Z Zn−1
v(nh, t) , i(nh, t)
−1 −1/2 1/2 −1/2 1/2 Zn+1 Zn Zn+1 Zn ∆ 0 α(n, t) = α(n − 1, t) −1/2 1/2 −1/2 1/2 0 ∆−1 Zn+1 −Z Zn+1 Zn −Z Zn ∆ 0 = H(ρn ) α(n − 1, t) 0 ∆−1
where ρn =
Zn+1 − Zn Zn+1 + Zn
1 and H(ρ) = 1 − |ρ|2
1 −ρ∗
−ρ 1
for |ρ| < 1. See [19, p. 111]. Replacing ∆ by the complex variable and removing the scalar constant factor √ 1 2 we see that the discretization of the telegraphers’ equations leads to sys1−|ρ|
tems of the form
Yn+1 (z) =
1 −ρ∗n
−ρn 1
z 0 Yn (z), 0 z −1
(3.1)
which we will call two-sided first-order discrete systems. The solution corresponding to ρn ≡ 0 is n z Yn (z) = 0
0
z −n
Y0 (z),
that is, we are in a two-sided situation (the negative powers of z corresponding to signals coming from −∞). Recursions of the related forms Xn+1 (z) =
1 −ρ∗n
−ρn 1
z 0
0 Xn (z) 1
(3.2)
Analogs of Canonical Systems with Pseudo-exponential Potential and
−ρn 1
1 Zn+1 (z) = Zn (z) −ρ∗n
z 0 0 1
21
are one-sided (in the sense that solutions corresponding to ρn ≡ 0 involve only positive powers of z) and appear in the covariance extension problem. We here consider equations of the form (3.2). These are sometimes called a first-order discrete system. See [1]. Here we will call them one-sided first-order discrete system. Connections between the systems (3.1) and (3.2) are studied in the sequel. √ Sometimes appears a factor 1/ 1−|ρn |2 on the right side of these equations. For the situation considered here, where the ρn are of a special form (and in particular the sequence ρn belongs to 1 ) this factor can be ignored. As in the case of canonical differential systems a number of functions of z are associated to such systems: we mention in particular the spectral function, the scattering function and the Weyl function. As in [9] we focus on the scalar case and postpone the matrix-valued case to a later publication. We refer to [21] for more information on discrete systems. The potential k(x) in (1.1) is now replaced by a sequence of numbers ρn , n = 0, 1, 2 . . . in the open unit disk D. We will call such sequences Schur sequences. Strictly pseudo-exponential potentials are now replaced by sequences of the form ρn = −can (IIp − ∆a∗(n+1) Ωan+1 )−1 b.
(3.3)
In this equation (a, b, c) ∈ Cp×p × Cp×1 × C1×p is a minimal triple of matrices, the spectrum of a is in the open unit disk and ∆ and Ω are the solutions of the Stein equations ∆ − a∆a∗ = Ω − a∗ Ωa =
bb∗ c∗ c.
(3.4) (3.5)
Furthermore, one requires that a is invertible and that it holds that Ω−1 > ∆.
(3.6)
One recognizes in (3.3) the counterpart of (2.1). Moreover, as n → ∞, ρn ∼ −can b,
(3.7)
which is the analogue of (2.4). These sequences were introduced in our previous work [9] and called strictly pseudo-exponential sequences. The form of the ρn and the condition (3.6) call for some explanations, which we now give. In [9] the starting point was the Nehari extension problem associated to a sequence γj , j = 0, −1, . . .: find all elements f ∈ W such that fj = γj , sup |f (z)| < 1.
|z|=1
j = 0, −1, . . .
22
D. Alpay and I. Gohberg
In this problem an important role is played by the Hankel operator ⎛ ⎞ γ0 γ−1 · · · ⎜ γ−1 γ−2 · · · ⎟ ⎜ ⎟ ⎟ : . .. Γ=⎜ 2 → 2 . ⎜ .. ⎟ . ⎝ ⎠ .. .. . . A necessary and sufficient condition for the problem to have a solution is that Γ < 1. In [9] we took γ−j = caj b. With this choice of γj the norm condition on Γ is equivalent to (3.6). This follows for instance from the formula (II2 − Γ∗ Γ)−1 = I2 + B ∗ (IIp − Ω∆)−1 ΩB = I2 + B ∗ Ω1/2 (IIp − Ω1/2 ∆Ω1/2 )−1 Ω1/2 B where B = b ab a2 b · · · . As a direct consequence of the results in [9] one can get a formula for the solution of the systems (3.2) and (3.1) with various boundary conditions in terms of the matrices a, b and c. The analogue of Theorem 2.26 is: Remark 3.1. If ρn is a strictly pseudo-exponential sequence so is −ρn . Indeed, it suffices to replace c by −c. This does not affect the matrices Y and Ω. In the remaining of this section we study the spectral functions associated to a one-sided discrete first-order system. The two-sided case is considered in the next section. 3.2. The asymptotic equivalence matrix function As for the continuous case there are two distinguished solutions to the systems (3.1) and (3.2); the first, related to the inverse spectral problem, fixes the value of the solution for n = 0 while the second, related to the inverse scattering problem, fixes the asymptotic value as n → ∞. For (3.1) the asymptotic behavior at ∞ is −n 0 z lim Yn (z) = I2 0 zn n→∞ while for (3.2) it is
−n z lim 0 n→∞
0 Xn (z) = I2 . 1
We begin with the analogue of Theorem 2.1. Theorem 3.2. Let ρ0 , ρ1 , . . . be a strictly pseudo-exponential sequence of Schur coefficients. Every solution of the first-order discrete system (3.2) is of the form n n−1 1 0 1 0 0 z Xn (z) = Hn (z)−1 (1 − |ρ |2 ) H0 (z) X0 (z), (3.8) 0 z 0 z −1 0 1 =0
Analogs of Canonical Systems with Pseudo-exponential Potential where
23
αn (z) βn (z) Hn (z) = γn (z) δn (z)
and αn (z) =
1 + can z(zIIp − a)−1 (IIp − ∆Ωn )−1 ∆a∗n c∗
βn (z) =
ca z(zIIp − a) ∗
∗
(IIp − ∗ −1
1 + b (IIp − za )
δn (z) =
−1
(IIp − ∆Ωn )
∗ −1
b (IIp − za )
γn (z) = ∗n
−1
n
b
(3.10)
−1 ∗n ∗
Ωn ∆)
(3.9)
a c
(3.11)
−1
(3.12)
(IIp − Ωn ∆)
Ωn b,
n
where Ωn = a Ωa . The solution Kn with the asymptotic −n 0 z lim Xn (z) = I2 0 1 n→∞ corresponds to 1 (1 − |ρ |2 ) =0
X0 (z) = ∞
1 0
1 0 H0 (z)−1 0 z
0 z −1
,
(3.13)
that is,
n−1 n (1 − |ρ |2 ) 1 0 0 −1 z Kn (z) = =0 (z) , (3.14) H n ∞ 2 0 z −1 0 z =0 (1 − |ρ | ) while the solution for which the initial value is identity at n = 0 corresponds to X0 (z) = I2 . In particular we have =n−1 =0
1 −ρ∗ =
−ρ 1
where we denote
0 1
n 1 0 2 −1 z Hn (z) (1 − |ρ | ) 0 z 0
n−1 =0
z 0
=n−1
0 1 0 H0 (z) , 0 z −1 1
(3.15)
A = An−1 · · · A0 .
=0
Proof. We first recall the following results, proved in [9]. It holds that δn (z)∗ = αn (1/z ∗ ),
βn (z)∗ = γn (1/z ∗ ),
(3.16)
and
1 . 2 =0 (1 − |ρ | ) Furthermore, the matrix functions Hn satisfy the recurrence equation 1 0 1 ρn 1 0 Hn+1 (z) = (z) H , n = 0, 1, 2, . . . n 0 1z ρ∗n 1 0 z det H0 (z) = ∞
(3.17)
(3.18)
24
D. Alpay and I. Gohberg
We rewrite (3.18) as Hn+1 (z)
1 0 1 1 0 (z) = H n 0 z −1 ρ∗n 0 z −1
and we multiply this equation and equation (3.2) 1 0 1 2 X Hn+1 (z) (z) = (1 − |ρ | ) n+1 n 0 z −1 0 z = (1 − |ρn |2 ) 0
ρn , 1
side by side. We obtain: 0 z 0 H (z) Xn (z) n z −1 0 1 0 1 0 Hn (z) Xn (z). 1 0 z −1
Reiterating, we obtain that n 1 0 z n+1 2 1 − |ρ | Hn+1 (z) −1 Xn+1 (z) = 0 z 0
1 0 0 X0 (z) H0 (z) 0 z −1 1
=0
and hence we obtain formula (3.8) for Xn (z). The other claims are easily verified. We note that the solution Xn to (3.2) corresponding to X0 = I2 is a polynomial for every n (in the continuous case, the solution is an entire function). X canbe nn z expressed in terms of the orthogonal polynomials. We also remark that 0 10 is the solution of (3.2) corresponding to ρn ≡ 0. Definition 3.3. The function V (z) =
δ0 (z) − β0z(z) −zγ0 (z) α0 (z)
(3.19)
is called the asymptotic equivalence matrix function of the one-sided first-order discrete system (3.2). The terminology is explained in the next theorem: (1)
(2)
Theorem 3.4. Let c1 and c2 be in C2 , and let Xn and Xn be the C2 -valued solutions of (3.2), corresponding to the case of zero potential and to a potential ρn (1) (2) respectively and with initial conditions X0 (z) = c1 and X0 (z) = c2 . Then, for every z on the unit circle, ⇐⇒ c2 = V (z)c1 . lim Xn(1) (z) − Xn(2) (z) = 0 n z 0 (1) Proof. By definition, Xn (z) = c . On the other hand, 0 1 1 n n−1 1 0 0 1 0 (2) 2 −1 z Xn (z) = Hn (z) (1 − |ρ | ) H0 (z) c . 0 z 0 z −1 2 0 1 n→∞
=0
Hn (z)−1 = I2 for z on the unit circle and since The result follows ∞ since limn→∞ 2 det H0 (z) = 1/ =0 (1 − |ρ | ). We note that the function X0 (z) given by (3.13) is equal to V (z).
Analogs of Canonical Systems with Pseudo-exponential Potential
25
The asymptotic equivalence matrix function takes (up to a constant) J-unitary values on the unit circle, with 1 0 J= . 0 −1 Minimal realizations of rational functions which take J-unitary values on the unit circle T were studied in [5], where the following theorem is proved (in the more general setting of matrix-valued functions, i.e., where J is an arbitrary m × m matrix both unitary and self-adjoint; see [5, Theorem 3.1 p. 197]). Theorem 3.5. Let R be a C2×2 -valued rational function analytic and invertible both at infinity and at the origin. Let R(z) = D + C(zI − A)−1 B be a minimal realization of H. Then, R takes J-unitary values on T if and only if there is an Hermitian invertible matrix such that ∗ A B H 0 H 0 A B = . (3.20) C D 0 −J 0 −J C D Note that (3.20) can be rewritten as H − A∗ HA = C ∗ JD = J − D∗ JD
=
−C ∗ JC, A∗ HB, −B ∗ HB.
The matrix H is uniquely defined from the given minimal realization and is called the associated Hermitian matrix to the given realization. Theorem 3.6. A minimal realization of the matrix function H0 is given by H0 (z) = D + C(zI − A)−1 B where a 0 A= , 0 a−∗ ∗ ∗ (IIp − ∆Ω)−1 , a c 0 a 0 (IIp − ∆Ω)−1 ∆ , B= −(IIp − Ω∆)−1 −(IIp − Ω∆)−1 Ω 0 b 0 a−∗ c 0 C= , 0 b∗ 1 + c(IIp − ∆Ω)−1 ∆c∗ c(IIp − ∆Ω)−1 b . D= 0 1 Let t = 1 + c(IIp − ∆Ω)−1 ∆c∗ . Then, t > 0 and the function √1t H0 (z) is J-unitary on the unit circle, with minimal realization 1 1 1 √ H0 (z) = √ D + C(zI − A)−1 √ B. t t t
26
D. Alpay and I. Gohberg
The associated Hermitian matrix to this realization is given by −Ω −IIp . X= −IIp −a∆a∗ We now recall the analogue of Theorem 2.5 for minimal J-unitary factorizations on the unit circle (see [5, Theorem 3.7 p. 205]): Theorem 3.7. Let R be a rational function J-unitary on the unit circle and analytic and invertible at ∞. Let R(z) = D + C(zI − A)−1 B be a minimal realization of R, with associated Hermitian matrix H. Let M be a A-invariant subspace nondegenerate in the metric [·, ·]H induced by H. Finally, let π denote the orthogonal projection defined by ker π = M, Im π = M[⊥] . Then R = R1 R2 with R1 (z) = (I + C(zI − A)−1 (I − π)BD−1 )D1−1 R2 (z) = D2 (I + D−1 Cπ(zI − A)−1 B)D with D1 = I + C1 H1−1 (I − αA∗1 )−1 C1∗ J,
D2 = DD1−1
where |α| = 1 and C1 = C|M ,
A1 = A|M ,
H1 = πH|M
is a minimal J-unitary factorization of R, and every minimal J-unitary factorization of R is obtained in such a way. Using this result we obtain: Theorem 3.8. The matrix function H0 admits a minimal J-unitary factorization H0 (z) = U1 (z)−1 U2 (z) where U1 and U2 are J-inner. The asymptotic equivalence matrix function admits a minimal J-unitary factorization 1 V (z) = V1 (z)−1 V2 (z) det H0 (z) where V1 and V2 are J-inner.
Indeed, the space C0 is A invariant and H-negative. Furthermore, ∗ Ip −Ω 0 Ip 0 0 −Ω −IIp , = −IIp −a∆a∗ 0 Ω−1 − a∆a∗ Ω−1 Ip Ω−1 Ip p
and by (3.6) and (3.4), Ω−1 − a∆a∗ > 0. This insures that U2 is J-inner.
To prove the second claim, we remark that the function
set V1 (z) = U2 (z)
1 0
0 z −1
and V2 (z) = U1 (z)
1 0
0 z −1
.
1 0
0 z −1
is J-inner and
Analogs of Canonical Systems with Pseudo-exponential Potential
27
3.3. The reflection coefficient function and the Schur algorithm We now associate to a one-sided first-order discrete system a function analytic and contractive in the open unit disk. We first set 1 −ρ C(ρ) = −ρ∗ 1 and Mn (z) = C(ρ0 )
z 0
0 z C(ρ1 ) 1 0
0 z · · · C(ρn ) 1 0
0 . 1
(3.21)
Theorem 3.9. Let ρn , n = 1, 2, . . . be a strictly pseudo-exponential sequence and let Mn (z) be defined by (3.21). The limit R(z) = lim TMn (z) (0)
(3.22)
n→∞
exists and is equal to β0 (1/z). α0 It is a function analytic and contractive in the open unit disk, called the reflection coefficient function. It takes strictly contractive values on the unit circle. R(z) =
Proof. From (3.15) we have that: n n+1 z 2 Mn (z) = (1 − |ρ | ) H0 (z ∗ )∗ 0 =0
0 Hn+1 (z ∗ )∗ . 1
The result follows then from the definition of the linear fractional transformation and from the equality (see (3.16)) γ0 (z ∗ )∗ β0 = (1/z). δ0 (z ∗ )∗ α0 For every n the matrix function
n
=0
√1
1−|ρ |2
Mn is J-inner and thus the function
TMn (z) (0) is analytic and contractive in the open unit disk. It follows that R(z) is analytic and contractive in the open unit disk. The fact that R(z) is strictly contractive on T is proved as follows. One first notes that α0 and β0 have no pole H0 (z) (recall on the unit circle. From the J-unitarity on the unit circle of √ 1 det H0 (z)
that det H0 (z) is a strictly positive constant; see (3.17)) stems the equality 1 , 2 =0 (1 − |ρ | )
|α0 (z)|2 − |β0 (z)|2 = det H0 (z) = ∞ and hence | αβ00 (z)| < 1 for z ∈ T.
z ∈ T,
We note the complete analogy between the characterizations (2.10) and (3.22) of the reflection coefficient functions for the continuous and discrete cases respectively.
28
D. Alpay and I. Gohberg
We now present a realization for R: Theorem 3.10. Let ρn , n = 0, 1, . . . be a strictly pseudo-exponential sequence of the form (3.3). The reflection coefficient function of the associated discrete system (3.2) is given by the formula: R(z) = c {(I − ∆a∗ Ωa) − z(I − ∆Ω)a}
−1
b.
(3.23)
In particular R(0) = c(I − ∆a∗ Ωa)−1 b = −ρ0 . Proof. We first compute α0 (z)−1 using the formula (1 + AB)−1 = 1 − A(I + BA)−1 B with A = cz(zI − a)−1 and B = (I − ∆Ω)−1 ∆c∗ . We obtain α0 (z)−1 = 1 − cz(zI − a)−1 (I + (I − ∆Ω)−1 ∆c∗ cz(zI − a)−1 )−1 (I − ∆Ω)−1 ∆c∗ −1
= 1 − cz {(I − ∆Ω)(zI − a) + ∆c∗ cz} Therefore
∆c∗ .
α0 (z)−1 β0 (z) = 1 − cz {(I − ∆Ω)(zI − a) + ∆c∗ cz}−1 ∆c∗ × (cz(zI − a)−1 (I − ∆Ω)−1 b) = cz(zI − a)−1 (I − ∆Ω)−1 b −1
− cz {(I − ∆Ω)(zI − a) + ∆c∗ cz} × ∆c∗ cz(zI − a)−1 (I − ∆Ω)−1 b. Writing
∆c∗ cz = (I − ∆Ω)(zI − a) + ∆c∗ cz − (I − ∆Ω)(zI − a), we have that −1
α0 (z)−1 β0 (z) = cz {(I − ∆Ω)(zI − a) + ∆c∗ cz}
(I − ∆Ω)(zI − a)
× (zI − a)−1 (I − ∆Ω)−1 b, and hence the result since (I − ∆Ω)(zI − a) + ∆c∗ cz = z(I − ∆a∗ Ωa) − (I − ∆Ω)a.
The Schur algorithm starts from a function R(z) analytic and contractive in the open unit disk (a Schur function), and associates to it recursively a sequence of functions Rn with R0 (z) = R(z) and, for n ≥ 1: Rn+1 (z) =
Rn (z) − Rn (0) . z(1 − Rn (0)∗ Rn (z))
The recursion continues as long as |Rn (0)| < 1. By the maximum modulus principle, all the functions in the (finite or infinite) sequence are Schur functions; see [43], [23].
Analogs of Canonical Systems with Pseudo-exponential Potential
29
The numbers ρn = Rn (0) bear various names: Schur coefficients, reflection coefficients,. . . . They give a complete characterization of Schur functions. In various places (see, e.g., [44]), they are also called Verblunsky coefficients. Theorem 3.11. Let ρn be a strictly pseudo-exponential sequence. The functions −1 βn (1/z) = can (I − ∆a∗(n+1) Ωan+1 ) − z(I − ∆a∗n Ωan )a b Rn (z) = αn are Schur functions. Furthermore, the Schur coefficients of Rn are −ρm , m ≥ n. Proof. The first claim follows from the previous theorem, replacing c by can and Ω by a∗n Ωan . To prove the second fact, we rewrite (3.18) (with m instead of n) as: αm+1 (z) = βm+1 (z) = zγm+1 (z) = δm+1 (z) =
αm (z) + ρ∗m βm (z),
(3.24)
z(ρm αm (z) + βm (z)), γm (z) + ρ∗m δm (z),
(3.25)
δm (z) + ρm γm (z)
Dividing (3.25) by (3.24) side by side we obtain: βm (z) + ρm βm+1 (z) = z αm αm+1 1 + ρ∗m αβm (z) m
and hence the result. Corollary 3.12. For every n ≥ 0 there exists a Schur function Sn such that R = TMn (Sn ).
(3.26)
3.4. The scattering function We now turn to the scattering function. We first look for the C2 -valued solution of the system (3.2), with the boundary conditions 1 −1 Y0 (z) = 0, 0 1 Yn (z) = 1 + o(n). The first condition implies that the solution is of n n−1 1 0 z Yn (z) = (1 − |ρ |2 ) Hn (z)−1 0 z 0 =0
the form 1 0 H0 (z) 0 1
x(z) z −1 x(z) 0
where x(z) is to be determined via the second boundary condition. We compute n n−1 x(z) 0 z 0 1 Yn (z) = (1 − |ρ |2 ) 0 z Hn (z)−1 H0 (z) x(z) . 0 1 z =0
Taking into account that limn→∞ Hn (z) = I2 we get that ∞ lim 0 1 Yn (z) = (1 − |ρ |2 ) 0
n→∞
=0
x(z) z H0 (z) x(z) z
30
D. Alpay and I. Gohberg
∞ and hence 1 = ( =0 (1 − |ρ |2 ))(zγ0 (z) + δ0 (z))x(z), that is 1 x(z) = ∞ . 2 ( =0 (1 − |ρ | ))zγ0 (z) + δ0 (z) Furthermore, lim 1
n→∞
1 0 1 0 x(z) 0 Yn (z)z −n = 1 0 H0 (z) 0 z −1 x(z) 0 z ∞ α0 (z) + β0 (z) 2 z (1 − |ρ | ) 1 0 x(z) = γ0 (z) + δ0z(z) =0 α0 (z) + β0z(z) . zγ0 (z) + δ0 (z)
= Definition 3.13. The function
S(z) =
α0 (z) + β0z(z) zγ0 (z) + δ0 (z)
is called the scattering function associated to the discrete system (3.2). Theorem 3.14. The scattering function admits the factorizations S(z) = S+ (z)S− (z) =
B1 (z) B2 (z)
where S+ and its inverse are invertible in the closed unit disk, S− and its inverse are invertible in the outside of the open unit disk, and where B1 and B2 are two finite Blaschke products. Proof. Using (3.16) we see that β0 (z) z and so S takes unitary values on the unit circle. It follows from Theorem 3.9 and from [24, Theorem 3.1, p. 918] that (zγ0 ) (1/z ∗ )∗ =
zγ0 (z) + δ0 (z) = δ0 (z)(1 + zR(z ∗ )∗ ) is analytic and invertible in |z| < 1. This gives the first factorization with 1 , zγ0 (z) + δ0 (z) 1 β0 (z) S− (z) = . = α0 (z) + S+ (1/z ∗ )∗ z S+ (z) =
The second factorization is a direct consequence of the fact that S is rational and takes unitary values on T.
Analogs of Canonical Systems with Pseudo-exponential Potential
31
3.5. The Weyl function and the spectral function To introduce the Weyl coefficient function we consider the matrix function 1 Un (z) = √ 2
1 1
=n−1 1 1 −ρ∗ −1 =0
−ρ 1
z 0
0 1 1 1 √ . 1 2 1 −1
Definition 3.15. The Weyl coefficient function N (z) is defined for z ∈ D by the iN (z ∗ )∗ following property: The sequence n → Un (z) belongs to 22 . 1 A similar definition appears in [40, Theorem 1, p. 231]. Theorem 3.16. It holds that 1 − zR(z) . (3.27) 1 + zR(z) n−1 Proof. Indeed, by (3.15) and with cn−1 = =0 (1 − |ρ |2 ), we have that: cn−1 1 1 iN (z ∗ )∗ 1 0 Un (z) = Hn (z)−1 1 0 z 1 −1 2 n z 0 1 0 1 + iN (z ∗ )∗ × H0 (z) 0 z −1 0 1 −1 + iN (z ∗ )∗ n cn−1 1 1 0 1 0 z = Hn (z)−1 0 1 0 z 1 −1 2 β0 (z) ∗ ∗ α0 (z)(1 + iN (z ) − z (1 − iN (z ∗ )∗ ) × , zγ0 (z)(1 + iN (z ∗ )∗ ) − δ0 (z)(1 − iN (z ∗ )∗ ) iN (z ∗ )∗ and so the sequence n → Un (z) belongs to 22 if and only if it holds 1 that N (z) = i
zγ0 (z)(1 + iN (z ∗ )∗ ) = δ0 (z)(1 − iN (z ∗ )∗ ).
(3.28)
This equation in turns is equivalent to iN (z) =
zβ0 (1/z) − α0 (1/z) zγ0 (z ∗ )∗ − δ0 (z ∗ )∗ zR(z) − 1 = = . zγ0 (z ∗ )∗ + δ0 (z ∗ )∗ zβ0 (1/z) + α0 (1/z) zR(z) + 1
where we took into account (3.16).
(3.29)
For similar results, see [44, Theorem 5.2 p. 520]. Theorem 3.17. The Weyl coefficient function associated to a one-sided first-order discrete system with strictly pseudo-exponential sequence is given by: −1 N (z) = i 1 + 2zc {I − ∆a∗ Ωa + zbc − z(I − ∆Ω)a} b . (3.30)
32
D. Alpay and I. Gohberg 1 − 2(1 + zR(z))−1 . On the other hand, −1 −1 = 1 + zc {(I − ∆a∗ Ωa) − z(I − ∆Ω)a} b
Proof. We have N (z) = (1 + zR(z))−1
1 zR(z)−1 i zR(z)+1
=
1 i
−1
= 1 − zc {(I − ∆a∗ Ωa) − z(I − ∆Ω)a} −1 −1 × 1 + zbc {(I − ∆a∗ Ωa) − z(I − ∆Ω)a} b −1
= 1 + zc {I − ∆a∗ Ωa + zbc − z(I − ∆Ω)a}
b,
and hence the result.
Remark 3.18. Let N be the Weyl function associated to the sequence ρn , n = 0, 1, 2, . . .. Then −N −1 is the Weyl function associated to the sequence −ρn , n = 0, 1, 2, . . .. The spectral function W (z) =
c , |α0 (1/z) + zβ0 (1/z)|2
1 , (1 − |ρ |2 ) =0
c = ∞
|z| = 1.
(3.31)
will play an important role in the sequel. Theorem 3.19. The Weyl coefficient function N (z) is such that Im N (z) = W (z) on the unit circle. Proof. From (3.16) we have that |α0 (z)|2 − |β0 (z)|2 is a constant for |z| = 1. Therefore: 1 1 zR(z) − 1 1 z ∗ R(z)∗ − 1 Im N (z) = + 2i i zR(z) + 1 i z ∗ R(z)∗ + 1 2 1 − |R(z)| = |1 + zR(z)|2 |α0 (1/z)|2 − |β0 (1/z)|2 = = W (z). |α0 (1/z) + zβ0 (1/z)|2 Theorem 3.20. The characteristic spectral functions of a one-sided first-order discrete system are related by the formulas 1 c , z ∈ T, c = ∞ W (z) = , |S− (1/z)|2 (1 − |ρ |2 ) =0 W (z) = Im N (z), z ∈ T, 1 − zR(z) , 1 + zR(z) 1 1 + iN (z) R(z) = , z 1 − iN (z) 1 (1 + iN (z ∗ )∗ )S+ (z)−1 V (z) = 2 −(1 − iN (z ∗ )∗ )S+ (z)−1
N (z) = i
−(1 + iN (1/z))S−(1/z) . (1 − iN (1/z))S−(1/z)
Analogs of Canonical Systems with Pseudo-exponential Potential
33
We will prove only the last identity. From (3.19) and (3.28) we have that 1 + iN (z ∗ )∗ δ0 (z) = 2 zγ0 (z) + δ0 (z)
1 + iN (z ∗ )∗ zγ0 (z) = . 2 zγ0 (z) + δ0 (z)
and
Thus, 1 + iN (z ∗ )∗ S+ (z)−1 2 Similarly, from (3.29) we obtain δ0 (z) =
and zγ0 (z) =
1 + iN (z) zβ0 (1/z) = 2 zβ0 (1/z) + α0 (1/z)
1 − iN (z ∗ )∗ S+ (z)−1 . 2
1 − iN (z) α0 (1/z) = , 2 zβ0 (1/z) + α0 (1/z)
and
and hence the result. 3.6. The orthogonal polynomials The solution Mn (given by (3.21)) to the system (3.2) with the initial condition M0 (z) = I2 is polynomial. It can be expressed in terms of the orthogonal polynomials associated to the weights Im N (z) and Im − N −1 (z) (where |z| = 1), and we recall now the definition of the orthogonal polynomials. We start with a function W (eit ) = Z w eit such that Z |w | < ∞ (that is, W belongs to the Wiener algebra of the unit circle). We assume moreover that W (eit ) > 0 for all real t. Set ⎛ ⎞ ∗ w1∗ ··· wm w0 ∗ ⎟ ⎜ w1 w0 . . . wm−1 ⎜ ⎟ Tm = ⎜ . (3.32) ⎟. . . .. .. ⎝ .. ⎠ w0 wm wm−1 · · · Then Tm is invertible, and we define: ⎛ (m) (m) γ00 γ01 ⎜ (m) (m) ⎜ γ10 γ11 ⎜ . T−1 = .. m ⎜ . ⎝ . . (m) (m) γm0 γm1 Definition 3.21. The family
⎛ 1
γ0m (m) γ1m .. .
···
γmm
m
⎝ pm (z) = (m) j=0 γ00
⎞
(m)
··· ···
⎟ ⎟ ⎟. ⎟ ⎠
(m)
⎞ (m) γ0j z m−j ⎠
is called the family of orthonormal polynomials associated to the sequence wj . The term orthonormal is explained in the next theorem: Theorem 3.22. We have 2π 1 pk (eit )W (eit )pm (eit )∗ dt = δk,m . 2π 0
34
D. Alpay and I. Gohberg
We now consider a rational function W , analytic on T and at the origin. Then, W admits a minimal realization of the form W (z) = D + zC(IIp − zA)−1 B. The function W is in the Wiener algebra of the unit circle. Indeed, the matrix A has no spectrum on T and the Fourier coefficients of W are given by ⎧ ⎨ CA−1 (I − P )B if = 1, 2, . . . w = D − CP B if = 0 ⎩ −CA−1 P B if = −1, −2, . . . where P is the Riesz projection defined by 1 P =I− (ζI − A)−1 dζ. 2πi T Indeed, we have for |z| = 1: W (z) = D + zC(I − zA)−1 B = D + zC(I − zA)−1 (P + I − P )B ∞ z (A(I − P )) )B = D + zC( =0
− C(AP )−1 (I − z −1 (AP )−1 )−1 B, and hence the result. Furthermore, for every m, the matrix Vm = (I − P + P A)−m (I − P + P A×m ) is invertible (with A× = A − BD−1 C). Moreover, a) for 0 ≤ j < i ≤ m. (m)
γij
−(m+1) = (D−1 C(A× )i Vm−1 (A× )m−j B − D−1 C(A× )i−j−1 BD−1 ). +1 P A
b) for 0 ≤ i ≤ j ≤ m (m)
γij
−(m+1) = δij D−1 + D−1 C(A× )i Vm−1 (A× )m−j BD−1 . +1 P A
These results are proved in [28, pp. 35–37] when D = I. They allow to prove: Theorem 3.23. Let W be a rational matrix-valued function analytic and invertible at the origin and infinity, and analytic on the unit circle. Let W (z) = D + zC(I − zA)−1 B be a minimal realization of W . Suppose that W (eit ) > 0, t ∈ [0, 2π]. Then, (1)
−(m+1) ×m pm (z) = (D−1 + D−1 CV Vm−1 A B)−1/2 +1 P A ⎧ ⎫ m ⎨ ⎬ −(m+1) × z m D−1 + D−1 CV Vm−1 ( A×(m−j) z m−j ) B. +1 P A ⎩ ⎭ j=0
Analogs of Canonical Systems with Pseudo-exponential Potential (2) For |z| <
35
1 ρ(A× )
lim z m (γ00 )−1/2 pm (1/z) = D−1 + D−1 Cπ(A× | (m)
ImP ×
m→∞
− zI)−1 B
(3.33)
where π is the projection onto Im P along ker P × . Other type of realizations (and accordingly formulas for pm ) are possible. In particular, it is of interest to remove the hypothesis of analyticity at the origin or at infinity. We first recall the following results (see [25, (3.10) p. 398 and Theorem 8.2 p. 422]). Theorem 3.24. Let W be a Cn×n rational function analytic on the unit circle T. Then W belongs to the Wiener algebra W n×n and it can be written as W (z) = In + C(zG − A)−1 B where C ∈ Cn×p , B ∈ Cp×n and G and A are p × p matrices for some p ∈ N. Furthermore, these matrices may be chosen such that det (zG− A) does not vanish on T. The Fourier coefficients of W are given by the formulas ⎧ ⎨ −CEΩ (I − P )B if = 1, . . . In − CE(I − P )B if = 0 w = ⎩ CEΩ−−1 P B if = −1, . . . where the matrices E, Ω and P are defined by 1 1 1 1 −1 E= (1 − )(ζG − A) dζ, Ω = (ζ − )G(ζG − A)−1 dζ, 2πi T ζ 2πi T ζ and 1 P = 2πi
T
G(ζG − A)−1 dζ.
The matrices E, Ω and P are respectively called the right equivalence operator, the associated operator and the separating projection. The operator Ω commutes with P and has all its eigenvalues inside the open unit disk. We will also need the matrix 1 Q= (ζG − A)−1 Gdζ (3.34) 2πi T Theorem 3.25. Let W be as in the previous theorem and suppose that W −1 is analytic on T. Then one can choose G and A such that det (zG − A× ) does not vanish on T, with A× = A − BC. If the matrix Tm is invertible, the entries of its inverse are given by (m)
γij where
× = wi−j + Kij
⎧ ⎨ CE × (Ω× ) (I − P × )B × In + CE × (I − P × )B w = ⎩ −CE × (Ω× )−−1 P × B
(m)
if = 1, 2, . . . , m if = 0 if = −1, −2, . . . , −m
(3.35)
36
D. Alpay and I. Gohberg
and = CE × (Ω× )i+1 (I − P × )V Vm−1 (I − Q)E × (Ω× )j P × B
(m)
Kij
−CE × (Ω× )m−i P × Vm−1 QE × (Ω× )m−j (I − P × )B.
(3.36)
In these expressions, (I − Q)E × (I − P × ) + (I − Q)E × (Ω× )m+1 P ×
=
Vm
+QE × (Ω× )m+1 (I − P × ) + QE × P × , where Q was defined in (3.34) and where the matrices P × , E × and Ω× are the separating projection, the right equivalence operator and the associated operator corresponding to zG − A× . With these formulas we obtain Theorem 3.26. Let W be a rational weight function with realization W (z) = In + C (zG − A)−1 B and suppose that both det (zG − A) and det (zG − A× ) do not vanish on T. Suppose moreover that the Toeplitz matrix Tm (defined in (3.32)) is non singular. Then: ⎧ ⎨
⎫ ⎞ ⎛ ⎞ m m ⎬ = b0m + CE × b1m ⎝ (zΩ× )j ⎠ P × + b2m ⎝ (zΩ×−1 )j ⎠ (I − P × ) B ⎩ ⎭
z m pm (1/z)
⎛
j=0
j=0
where we have defined b0m
= I + CE × (I − P × )B + CE × Ω×−1 B,
b1m
= −Ω×−1 + Ω× (I − P × )V Vm−1 (I − Q)E × ,
b2m
= −Ω×m P × Vm−1 QE × Ω×m .
Proof. From formulas (3.35) and (3.36) we obtain: (m)
γ00
= w0× + K00
(m)
= In + CE × (I − P × )B + Vm−1 (I − Q)E × P × B +CE × Ω× (I − P × )V −CE × Ω×m P × Vm−1 QE × Ω×m (I − P × )B and for j > 0, (m)
γ0j
× = w−j + K0j
(m)
= −CE × Ω×(j−1) P × B +CE × Ω× (I − P × )V Vm−1 (I − Q)E × Ω×j P × B −CE × Ω×m P × Vm−1 QE × Ω×(m−j) (I − P × )B.
Analogs of Canonical Systems with Pseudo-exponential Potential
37
Thus (m)
(m)
(m)
z m pm (1/z) =
γ00 + zγ01 + · · · + z m γ0m
=
In + CE × (I − P × )B m −CE × Ω×−1 z j Ω×j P × B j=1
⎛
+CE × Ω× (I − P × )V Vm−1 (I − Q)E × ⎝ ⎛ −CE × Ω×m P × Vm−1 QE × Ω×m ⎝
m j=0
m
⎞ z j Ω×j ⎠ P × B ⎞
z j Ω×−j ⎠ (I − P × )B
j=0
from which the claim follows. One can also consider representations of the form W (z) = D + (1 − z)C(zG − A)−1 B. (m)
See [35]. One needs to develop formulas for the γij . Such formulas and the corresponding formulas for the orthogonal polynomials will be given elsewhere. 3.7. The spectral function and isometries Let 1 1 1 0 1 U= √ . and J1 = 1 0 2 1 −1 We note that J = U J1 U. Furthermore, let Θn (z) = U Mn (z)U where Mn (z) is given by (3.21). The matrix function Θn is J1 -inner. We denote by H(Θn ) the associated reproducing kernel (z)J1 Θn (w)∗ Hilbert space, with reproducing kernel J1 −Θn1−zw . We denote by L(N ) the ∗ reproducing kernel Hilbert space with reproducing kernel Theorem 3.27. The map
N (z)−N (w)∗ i(1−zw ∗ ) .
F → −iN (z) 1 F (z)
is an isometry from H(Θn ) into L(N ). Furthermore, elements of H(Θn ) are of the form f (z) F (z) = , i(pN ∗ f )(z) where f runs through the set of polynomials of degree less or equal to n and where p denotes the orthogonal projection from L2 onto H2 , and F 2H(Θn) = 2f 2L2 (Im
N ).
(3.37)
38
D. Alpay and I. Gohberg
Proof. Let us denote by H(R) the reproducing kernel Hilbert space with repro∗ R(z)R(w)∗ ducing kernel 1−zw1−zw . Then, by e.g., [2, Propositions 6.1 and 6.4] (but ∗ the result is well known and is related to the Carath´´eodory–Toeplitz extension problem), equation (3.26) implies that the map which to F associates the function z → 1 −zR(z) F (z) is an isometry from H(M Mn ) into H(R). Since J1 − Θn (z)J J1 Θn (w)∗ Mn (w)∗ ∗ J − Mn (z)JM =M M , ∗ 1 − zw 1 − zw∗ 2 1 − zw∗ R(z)R(w)∗ 1 N (z) − N (w)∗ = , ∗ ∗ ∗ i(1 − zw ) 1 + zR(z) 1 − zw 1 + w R(w)∗ the maps F → M F √ 2 f → f (1 + zR) are isometries from H(Θn ) onto H(M Mn ) and from H(R) onto L(N ). The first claim follows since √ 2 −iN (z) 1 = 1 −zR(z) M. 1 + zR(z) The last claim can be obtained from [3, Section 7]. We note that a similar result for the continuous case was proved in [11]. The arguments are easier here because of the finite dimensionality. Using Theorem 3.27 we can relate the orthogonal polynomials and the entries of the matrix function Θn . Corollary 3.28. Let Θn be as in Theorem 3.27. Then for , k < n % & 1 1 = 2δ,k . Θ , Θk 1 1 H(Θ ) n
In particular, for every n ≥ 0, pn (z) = 1
1 0 Θn (z) . 1
Proof. Denote by H2,J the Kre˘ ˘ın space of C2 -valued functions with entries in the Hardy space H2 of the open unit disk, and with inner product: [F, G]H2,J = F, JGH22 . Then (see [4]), the space H(M Mn ) is isometrically included inside H2,J . Assume now that < k. The function k z 0 −1 (Θ Θk )(z) = U U C(ρi ) 0 1 i=+1
Analogs of Canonical Systems with Pseudo-exponential Potential belongs to H2,J and is such that (Θ−1 Θk )(0) Thus,
% Θ
& 1 1 , Θk 1 1 H(Θ
= n)
39
1 0 . = 1 0
% & 1 1 , Θ−1 Θ k 1 1 H(Θ
=0 n)
The proof that the inner product is equal to 2 when = k is proved in the same way. The last claim follows from (3.37).
4. Two-sided systems and an example 4.1. Two-sided discrete first-order systems We now turn to the systems of the form (3.1), that is, 1 −ρn z 0 Yn+1 (z) = Yn (z), −ρ∗n 1 0 z −1 and begin with the definition of the asymptotic equivalence matrix function. Theorem 4.1. Let ρn be a strictly pseudo-exponential sequence. Every solution of the system (3.1) is of the form n n−1 1 0 0 1 0 2 2 −1 z 2 (1 − |ρ | ) Hn (z ) H0 (z ) Y0 (z). Yn (z) = 0 z2 0 z12 0 z −n =0
The solution such that
−n z lim 0 n→∞
corresponds to 1 Y0 (z) = ∞ (1 − |ρ |2 ) =0
0 Yn (z) = I2 zn 1 0 2 −1 1 H0 (z ) 0 0 z2
0
z −2
,
while the solution with value I2 at n = 0 corresponds to Y0 (z) = I2 . Proof. Replacing z by z 2 in the recursion (3.18) we obtain: 1 0 1 0 1 ρn 2 = H (z ) . Hn+1 (z 2 ) n 0 z12 0 z12 ρ∗n 1 Note that
1 −ρ∗n
−ρn 1
1 ρ∗n
ρn 1
= (1 − |ρn |2 )II2 .
(4.1)
40
D. Alpay and I. Gohberg
Thus, multiplying side by side (4.1) and (3.1) we obtain: 1 0 1 0 z 0 2 2 2 Hn+1 (z ) Yn+1 (z) = (1 − |ρn | ) Hn (z ) Yn (z) 0 z12 0 z12 0 z −1 z 0 1 0 = (1 − |ρn |2 ) Hn (z 2 ) Yn (z) 0 z −1 0 z12 from which we obtain: 1 0 Hn+1 (z 2 ) Yn+1 (z) = 0 z12 n+1 z = 0
0 z −(n+1)
1 H0 (z ) 0
0
2
1 z2
Y0 (z)
n
1 − |ρ |
2
=0
and hence the formula for Yn (z). Definition 4.2. The function 1 V (z) = n−1 2 =0 (1 − |ρ | )
1 0
0 2 −1 1 (z ) H 0 0 z2
0
z −2
is called the asymptotic equivalence matrix of the two-sided first-order discrete system (3.1). We note that it is related to the asymptotic equivalence matrix (3.19) of the discrete system (3.2) by the transformation z → z 2 . The proof of the following result is similar to the proof of Theorem 3.4. Theorem 4.3. Let c1 and c2 be in C2 , and let Y (1) and Y (2) be the C2 -valued solutions of (3.1), corresponding to the case of ρn ≡ 0 and to the strictly pseudo(1) exponential sequence ρn respectively and with initial conditions Y0 (z) = c1 and (2) Y0 (z) = c2 . Then, for every z on the unit circle it holds that Yn(1) (z)c1 − Yn(2) (z)c2 = 0 lim Y
n→∞
Proof. By definition,
Yn(2) (z) =
n−1
(1) Yn (z)
(1 − |ρ |2 )
=0
n z = 0
1 0
0
z −n
c2 = V (z)c1 .
⇐⇒
c1 . On the other hand,
n 0 2 −1 z (z ) H n z2 0
0 z −n
H0 (z 2 )
1 0 c . 0 z −2 2
The result follows since limn→∞ Hn (z 2 )−1 = I2 for z on the unit circle.
Analogs of Canonical Systems with Pseudo-exponential Potential
41
The other spectral functions of the systems (3.2) and (3.1) are also related by the transformation z → z 2 . The definitions and results are identical to the one-sided case. Theorem 4.4. Let ρn , n = 0, 1, . . . be a strictly pseudo-exponential sequence of the form (3.3). The reflection coefficient function of the associated discrete system (3.1) is given by the formula: −1 b. (4.2) R(z) = c (I − ∆a∗ Ωa) − z 2 (I − ∆Ω)a The scattering function is defined as follows. We look for the C2 -valued solution of the system (3.2), with the boundary conditions 1 −1 Y0 (z) = 0, 0 1 Yn (z) = z −n + o(n). Then the limit
lim 1
n→∞
0 Yn (z)z −n
exists and is called the scattering function of the system (3.1). It is related to the scattering function of the system (3.2) by the map z → z 2 . We also mention that J-inner polynomials are now replaced by J-unitary functions with possibly poles at the origin and at infinity, but with constant determinant. 4.2. An illustrative example As a simple example we take a = α ∈ (0, 1), b = 1 and c = c∗ . Then ∆=
1 , 1 − α2
Ω=
and ρn = −αn
c2 , 1 − α2
c 1−
c2 α2n+2 (1−α2 )2
.
(4.3)
The numbers c and α need to satisfy (3.6), that is (1 − α2 )2 > c2 . Note that this condition implies that c c < < 1, |ρ0 | = c2 1 − α2 2 1 − α (1−α2 )2 and more generally, |ρn | =
αn c 1−
c2 α2n+2 (1−α2 )2 n
α c 1 − α2n+2 αn (1 − α2 ) c αn = < < 1, 2n+2 2 2 1−α 1−α 1 + α + · · · + α2n ≤
as it should be.
42
D. Alpay and I. Gohberg Continuous case iJf − V f = zf
The system Special solutions
Entire J-inner functions 0 k(x) v(x) = 0 k(x)∗ −1 ita −2ixa∗ k(x) = −2ce Y e2ixa Ip + Ω Y − e
Potential
Solution asymptotic to the solution with k ≡ 0
Theorem 2.1
−k is also a potential
Theorem 2.26
Asymptotic property
Formula (2.4)
Reflection coefficient
Formulas (2.11) and (2.10)
Weyl function
Formula (2.14)
Weyl function for −k(x)
Theorem 2.26
Factorization of the asymptotic equivalence matrix
Theorem 2.6
Asymptotic behavior of the orthogonal polynomial
Equation (2.21) Table 1
The reflection coefficient is equal to: R(z) =
1−
α2 c2 (1−α2 )2
c − zα(1 −
c2 (1−α2 )2 )
.
We check directly that it is indeed a Schur function as follows: we have for |z| ≤ 1 c . |R(z)| ≤ α2 c2 c2 1 − (1−α2 )2 − α(1 − (1−α 2 )2 ) We thus need to check that c≤1− that is, with T =
α2 c2 c2 − α(1 − ), (1 − α2 )2 (1 − α2 )2
c (1−α2 ) ,
c ≤ 1 − α2 T 2 − α(1 − T 2 ) = (1 − α)(1 + T 2 α),
Analogs of Canonical Systems with Pseudo-exponential Potential
43
Discrete case (one-sided case) z −ρn Yn+1 (z) = Yn (z) −zρ∗n 1
The system Special solutions
J-inner polynomials
Potential: the Schur coefficients ρn
ρn = −can (I − ∆a∗(n+1) Ωan+1 )−1 b
Solution asymptotic to the solution with ρn ≡ 0
Formula (3.14)
−ρn is also pseudo-exponential
Remark 3.1
Asymptotic property
Formula (3.7)
Reflection coefficient
Formulas (3.23) and (3.22)
Weyl function
Formula (3.30)
Weyl function for −ρn
Remark 3.18
Factorization of the asymptotic equivalence matrix
Theorem 3.8
Asymptotic behavior of the orthogonal polynomial
Equation (3.33) Table 2
1 that is, T ≤ 1+α (1 + T 2 α). This last inequality in turn holds since T and α are in (0, 1). Finally, from (3.27) we obtain the expression for the Weyl function:
N (z) = i
1− 1−
α2 c2 (1−α2 )2 α2 c2 (1−α2 )2
− zα(1 − − zα(1 −
c2 (1−α2 )2 ) c2 (1−α2 )2 )
− zc + zc
.
We summarize the parallels between the continuous case and the one-sided discrete case in Tables 1 and 2.
44
D. Alpay and I. Gohberg
References [1] V.M. Adamyan and S.E. Nechayev. Nuclear Hankel matrices and orthogonal trigonometric polynomials. Contemporary Mathematics, 189:1–15, 1995. [2] D. Alpay, T. Azizov, A. Dijksma, and H. Langer. The Schur algorithm for generalized Schur functions. III. J-unitary matrix polynomials on the circle. Linear Algebra Appl., 369:113–144, 2003. [3] D. Alpay and H. Dym. Hilbert spaces of analytic functions, inverse scattering and operator models, I. Integral Equation and Operator Theory, 7:589–641, 1984. [4] D. Alpay and H. Dym. On applications of reproducing kernel spaces to the Schur algorithm and rational J-unitary factorization. In I. Gohberg, editor, I. Schur methods in operator theory and signal processing, volume 18 of Operator Theory: Advances and Applications, pages 89–159. Birkh¨ auser Verlag, Basel, 1986. [5] D. Alpay and I. Gohberg. Unitary rational matrix functions. In I. Gohberg, editor, Topics in interpolation theory of rational matrix-valued functions, volume 33 of Operator Theory: Advances and Applications, pages 175–222. Birkh¨ a ¨user Verlag, Basel, 1988. [6] D. Alpay and I. Gohberg. Inverse spectral problems for difference operators with rational scattering matrix function. Integral Equations Operator Theory, 20(2):125– 170, 1994. [7] D. Alpay and I. Gohberg. Inverse spectral problem for differential operators with rational scattering matrix functions. Journal of differential equations, 118:1–19, 1995. [8] D. Alpay and I. Gohberg. Inverse scattering problem for differential operators with rational scattering matrix functions. In I. B¨ ¨ ottcher and I. Gohberg, editors, Singular integral operators and related topics (Tel Aviv, 1995), volume 90 of Operator Theory: Advances and Applications, pages 1–18. Birkh¨ ¨ auser Verlag, Basel, 1996. [9] D. Alpay and I. Gohberg. Connections between the Carath´ ´eodory-Toeplitz and the Nehari extension problems: the discrete scalar case. Integral Equations Operator Theory, 37(2):125–142, 2000. [10] D. Alpay and I. Gohberg. Inverse problems associated to a canonical differential system. In L. Kerchy, ´ C. Foias, I. Gohberg, and H. Langer, editors, Recent advances in operator theory and related topics (Szeged, 1999), Operator theory: Advances and Applications, pages 1–27. Birkh¨ auser, Basel, 2001. [11] D. Alpay and I. Gohberg. A trace formula for canonical differential expressions. J. Funct. Anal., 197(2):489–525, 2003. [12] D. Alpay, I. Gohberg, M.A. Kaashoek, and A.L. Sakhnovich. Direct and inverse scattering problem for canonical systems with a strictly pseudo-exponential potential. Math. Nachr., 215:5–31, 2000. [13] D. Alpay, I. Gohberg, and L. Sakhnovich. Inverse scattering for continuous transmission lines with rational reflection coefficient function. In I. Gohberg, P. Lancaster, and P.N. Shivakumar, editors, Proceedings of the International Conference on Applications of Operator Theory held in Winnipeg, Manitoba, October 2–6, 1994, volume 87 of Operator theory: Advances and Applications, pages 1–16. Birkh¨ auser Verlag, Basel, 1996.
Analogs of Canonical Systems with Pseudo-exponential Potential
45
[14] H. Bart, I. Gohberg, and M.A. Kaashoek. Minimal factorization of matrix and operator functions, volume 1 of Operator Theory: Advances and Applications. Birkh¨ auser Verlag, Basel, 1979. [15] H. Bart, I. Gohberg, and M.A. Kaashoek. Convolution equations and linear systems. Integral Equations Operator Theory, 5:283–340, 1982. [16] A.M. Bruckstein and T. Kailath. Inverse scattering for discrete transmission-line models. SIAM Rev., 29(3):359–389, 1987. [17] K. Clancey and I. Gohberg. Factorization of matrix functions and singular integral operators, volume 3 of Operator Theory: Advances and Applications. Birkh¨ auser Verlag, Basel, 1981. [18] D de Cogan. Transmission line matrix (LTM) techniques for diffusion applications. Gordon and Breach Science Publishers, 1998. [19] T. Constantinescu. Schur parameters, factorization and dilation problems, volume 82 of Operator Theory: Advances and Applications. Birkhauser ¨ Verlag, Basel, 1996. [20] H. Dym. J-contractive matrix functions, reproducing kernel Hilbert spaces and interpolation. Published for the Conference Board of the Mathematical Sciences, Washington, DC, 1989. [21] H. Dym and A. Iacob. Applications of factorization and Toeplitz operators to inverse problems. In I. Gohberg, editor, Toeplitz centennial (Tel Aviv, 1981), volume 4 of Operator Theory: Adv. Appl., pages 233–260. Birkh¨ a ¨user, Basel, 1982. [22] H. Dym and A. Iacob. Positive definite extensions, canonical equations and inverse problems. In H. Dym and I. Gohberg, editors, Proceedings of the workshop on applications of linear operator theory to systems and networks held at Rehovot, June 13–16, 1983, volume 12 of Operator Theory: Advances and Applications, pages 141– 240. Birkhauser ¨ Verlag, Basel, 1984. [23] B. Fritzsche and B. Kirstein, editors. Ausgew¨ ¨ ahlte Arbeiten zu den Urspr¨ ungen ¨ der Schur-Analysis, volume 16 of Teubner-Archiv zur Mathematik. B.G. Teubner Verlagsgesellschaft, Stuttgart–Leipzig, 1991. [24] I. Gohberg, S. Goldberg, and M.A. Kaashoek. Classes of linear operators. Vol. II, I volume 63 of Operator Theory: Advances and Applications. Birkhauser ¨ Verlag, Basel, 1993. [25] I. Gohberg and M.A. Kaashoek. Block Toeplitz operators with rational symbols. In I. Gohberg, J.W. Helton, and L. Rodman, editors, Contributions to operator theory and its applications (Mesa, AZ, 1987), volume 35 of Oper. Theory Adv. Appl., pages 385–440. Birkhauser, ¨ Basel, 1988. [26] I. Gohberg, M.A. Kaashoek, and A.L. Sakhnovich. Canonical systems with rational spectral densities: explicit formulas and applications. Math. Nachr., 194:93–125, 1998. [27] I. Gohberg, M.A. Kaashoek, and A.L. Sakhnovich. Pseudo-canonical systems with rational Weyl functions: explicit formulas and applications. Journal of differential equations, 146:375–398, 1998. [28] I. Gohberg, M.A. Kaashoek, and F. van Schagen. Szeg¨ ¨ o–Kac–Achiezer formulas in terms of realizations of the symbol. J. Funct. Anal., 74:24–51, 1987.
46
D. Alpay and I. Gohberg
[29] I. Gohberg, P. Lancaster, and L. Rodman. Matrices and indefinite scalar products, volume 8 of Operator Theory: Advances and Applications. Birkhauser ¨ Verlag, Basel, 1983. [30] I. Gohberg, P. Lancaster, and L. Rodman. Invariant subspaces of matrices with applications. Canadian Mathematical Society Series of Monographs and Advanced Texts. John Wiley & Sons Inc., New York, 1986. A Wiley-Interscience Publication. [31] I. Gohberg and Ju. Leiterer. General theorems on the factorization of operatorvalued functions with respect to a contour. I. Holomorphic functions. Acta Sci. Math. (Szeged), 34:103–120, 1973. [32] I. Gohberg and Ju. Leiterer. General theorems on the factorization of operator-valued functions with respect to a contour. II. Generalizations. Acta Sci. Math. (Szeged), 35:39–59, 1973. [33] I. Gohberg and S. Rubinstein. Proper contractions and their unitary minimal completions. In I. Gohberg, editor, Topics in interpolation theory of rational matrix-valued functions, volume 33 of Operator Theory: Advances and Applications, pages 223–247. Birkhauser ¨ Verlag, Basel, 1988. [34] I.C. Gohberg and I.A. Fel dman. Convolution equations and projection methods for their solution. American Mathematical Society, Providence, R.I., 1974. Translated from the Russian by F.M. Goldware, Translations of Mathematical Monographs, Vol. 41. [35] G.J. Groenewald. Toeplitz operators with rational symbols and realizations: an alternative version. Technical Report WS:–362, Vrije Universiteit Amsterdam, 1990. [36] A. Iacob. On the spectral theory of a class of canonical systems of differential equations. PhD thesis, The Weizmann Institute of Sciences, 1986. [37] M.G. Kre˘n. ˘ Continuous analogues of propositions for polynomials orthogonal on the unit circle. Dokl. Akad. Nauk. SSSR, 105:637–640, 1955. [38] M.G. Kre˘n. Topics in differential and integral equations and operator theory, volume 7 of Operator theory: Advances and Applications. Birkhauser ¨ Verlag, Basel, 1983. Edited by I. Gohberg, Translated from the Russian by A. Iacob. ¨ [39] M.G. Kre˘n ˘ and H. Langer. Uber die verallgemeinerten Resolventen und die charakteristische Funktion eines isometrischen Operators im Raume Πk . In Hilbert space operators and operator algebras (Proc. Int. Conf. Tihany, 1970), pages 353–399. North-Holland, Amsterdam, 1972. Colloquia Math. Soc. J´ a ´nos Bolyai. [40] L.Golinskii and P. Nevai. Szeg˝ ˝ o difference equations, transfer matrices and orthogonal polynomials on the unit circle. Comm. Math. Phys., 223(2):223–259, 2001. [41] F.E. Melik-Adamyan. On a class of canonical differential operators. Izvestya Akademii Nauk. Armyanskoi SSR Matematica, 24:570–592, 1989. English translation in: Soviet Journal of Contemporary Mathematics, vol. 24, pages 48–69 (1989). [42] L. Sakhnovich. Dual discrete canonical systems and dual orthogonal polynomials. In D. Alpay, I. Gohberg, and V. Vinnikov, editors, Interpolation theory, systems theory and related topics (Tel Aviv/Rehovot, 1999), volume 134 of Oper. Theory Adv. Appl., pages 385–401. Birkh¨ a ¨user, Basel, 2002. ¨ [43] I. Schur. Uber die Potenzreihen, die im Innern des Einheitkreises beschr¨ ¨ ankten sind, I. Journal f¨ fur die Reine und Angewandte Mathematik, 147:205–232, 1917. English
Analogs of Canonical Systems with Pseudo-exponential Potential
47
translation in: I. Schur methods in operator theory and signal processing. (Operator theory: Advances and Applications OT 18 (1986), Birkh¨ ¨ auser Verlag), Basel. [44] B. Simon. Analogs of the m-function in the theory of orthogonal polynomials on the unit circle. J. Comput. Appl. Math., 171(1-2):411–424, 2004. [45] F. Wenger, T. Gustafsson, and L. Svensson. Perturbation theory for inhomogeneous transmission lines. IEEE Trans. Circuits Systems I Fund. Theory Appl., 49(3):289– 297, 2002. [46] A. Yagle and B. Levy. The Schur algorithm and its applications. Acta Applicandae Mathematicae, 3:255–284, 1985. Daniel Alpay Department of Mathematics Ben–Gurion University of the Negev Beer-Sheva 84105 Israel e-mail: [email protected] Israel Gohberg School of Mathematical Sciences The Raymond and Beverly Sackler Faculty of Exact Sciences Tel–Aviv University Tel–Aviv, Ramat–Aviv 69989 Israel e-mail: [email protected]
Operator Theory: Advances and Applications, Vol. 161, 49–113 c 2005 Birkhauser ¨ Verlag Basel/Switzerland
Matrix-J-unitary Non-commutative Rational Formal Power Series D. Alpay and D.S. Kalyuzhny˘ı-Verbovetzki˘ Abstract. Formal power series in N non-commuting indeterminates can be considered as a counterpart of functions of one variable holomorphic at 0, and some of their properties are described in terms of coefficients. However, really fruitful analysis begins when one considers for them evaluations on N -tuples of n × n matrices (with n = 1, 2, . . .) or operators on an infinite-dimensional separable Hilbert space. Moreover, such evaluations appear in control, optimization and stabilization problems of modern system engineering. In this paper, a theory of realization and minimal factorization of rational matrix-valued functions which are J-unitary on the imaginary line or on the unit circle is extended to the setting of non-commutative rational formal power series. The property of J-unitarity holds on N -tuples of n × n skew-Hermitian versus unitary matrices (n = 1, 2, . . .), and a rational formal power series is called matrix-J-unitary in this case. The close relationship between minimal realizations and structured Hermitian solutions H of the Lyapunov or Stein equations is established. The results are specialized for the case of matrix-J-inner rational formal power series. In this case H > 0, however the proof of that is more elaborated than in the one-variable case and involves a new technique. For the rational matrix-inner case, i.e., when J = I, the theorem of Ball, Groenewald and Malakorn on unitary realization of a formal power series from the non-commutative Schur–Agler class admits an improvement: the existence of a minimal (thus, finite-dimensional) such unitary realization and its uniqueness up to a unitary similarity is proved. A version of the theory for matrix-selfadjoint rational formal power series is also presented. The concept of non-commutative formal reproducing kernel Pontryagin spaces is introduced, and in this framework the backward shift realization of a matrix-J-unitary rational formal power series in a finite-dimensional non-commutative de Branges–Rovnyak space is described. Mathematics Subject Classification (2000). Primary 47A48; Secondary 13F25, 46C20, 46E22, 93B20, 93D05.
The second author was supported by the Center for Advanced Studies in Mathematics, BenGurion University of the Negev.
50 Keywords. J-unitary matrix functions, non-commutative, rational, formal power series, minimal realizations, Lyapunov equation, Stein equation, minimal factorizations, Schur–Agler class, reproducing kernel Pontryagin spaces, backward shift, de Branges–Rovnyak space.
Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 3 More on observability, controllability, and minimality in the non-commutative setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 4 Matrix-J-unitary formal power series: A multivariable non-commutative analogue of the line case . . . . . . . . . . . . . 67 4.1 Minimal Givone–Roesser realizations and the Lyapunov equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 4.2 The associated structured Hermitian matrix . . . . . . . . . . . . . . . . . . . . . . 72 4.3 Minimal matrix-J-unitary factorizations . . . . . . . . . . . . . . . . . . . . . . . . . . 74 4.4 Matrix-unitary rational formal power series . . . . . . . . . . . . . . . . . . . . . . . 75 5 Matrix-J-unitary formal power series: A multivariable non-commutative analogue of the circle case . . . . . . . . . . . 77 5.1 Minimal Givone–Roesser realizations and the Stein equation . . . . . . 77 5.2 The associated structured Hermitian matrix . . . . . . . . . . . . . . . . . . . . . . 83 5.3 Minimal matrix-J-unitary factorizations . . . . . . . . . . . . . . . . . . . . . . . . . . 84 5.4 Matrix-unitary rational formal power series . . . . . . . . . . . . . . . . . . . . . . . 85 6 Matrix-J-inner rational formal power series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 6.1 A multivariable non-commutative analogue of the half-plane case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 6.2 A multivariable non-commutative analogue of the disk case . . . . . . . 91 7 Matrix-selfadjoint rational formal power series . . . . . . . . . . . . . . . . . . . . . . . . . 96 7.1 A multivariable non-commutative analogue of the line case . . . . . . . . 96 7.2 A multivariable non-commutative analogue of the circle case . . . . . 100 8 Finite-dimensional de Branges–Rovnyak spaces and backward shift realizations: The multivariable non-commutative setting . . . . . . . . . 102 8.1 Non-commutative formal reproducing kernel Pontryagin spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 8.2 Minimal realizations in non-commutative de Branges–Rovnyak spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 8.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Matrix-J-unitary Rational Formal Power Series
51
1. Introduction In the present paper we study a non-commutative analogue of rational matrixvalued functions which are J-unitary on the imaginary line or on the unit circle and, as a special case, J-inner ones. Let J ∈ Cq×q be a signature matrix, i.e., a matrix which is both self-adjoint and unitary. A Cq×q -valued rational function F is J-unitary on the imaginary line if F (z)JF (z)∗ = J
(1.1)
at every point of holomorphy of F on the imaginary line. It is called J-inner if moreover F (z)JF (z)∗ ≤ J (1.2) at every point of holomorphy of F in the open right half-plane Π. Replacing the imaginary line by the unit circle T in (1.1) and the open right half-plane Π by the open unit disk D in (1.2), one defines J-unitary functions on the unit circle (resp., J-inner functions in the open unit disk). These classes of rational functions were studied in [7] and [6] using the theory of realizations of rational matrix-valued functions, and in [4] using the theory of reproducing kernel Pontryagin spaces. The circle and line cases were studied in a unified way in [5]. We mention also the earlier papers [36, 23] that inspired much of investigation of these and other classes of rational matrix-valued functions with symmetries. We now recall some of the arguments in [7], then explain the difficulties appearing in the several complex variables setting, and why the arguments of [7] extend to the non-commutative framework. So let F be a rational function which is J-unitary on the imaginary line, and assume that F is holomorphic in a neighborhood of the origin. It then admits a minimal realization F (z) = D + C(IIγ − zA)−1 zB where D = F (0), and A, B, C are matrices of appropriate sizes (the size γ × γ of the square matrix A is minimal possible for such a realization). Rewrite (1.1) as F (z) = JF (−z)−∗ J,
(1.3)
where z is in the domain of holomorphy of both F (z) and F (−z)−∗ . We can rewrite (1.3) as D + C(IIγ − zA)−1 zB = J D−∗ + D−∗ B ∗ (IIγ + z(A − BD−1 C)∗ )−1 zC ∗ D−∗ J. The above equality gives two minimal realizations of a given rational matrix-valued function. These realizations are therefore similar, and there is a uniquely defined matrix (which, for convenience, we denote by −H) such that −H 0 A B −(A∗ − C ∗ D−∗ B ∗ ) C ∗ D−∗ J −H 0 = . (1.4) JD−∗ B ∗ JD−∗ J 0 Iq C D 0 Iq The matrix −H ∗ in the place of −H also satisfies (1.4), and by uniqueness of the similarity matrix we have H = H ∗ , which leads to the following theorem.
52
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Theorem 1.1. Let F be a rational matrix-valued function holomorphic in a neighborhood of the origin and let F (z) = D + C(IIγ − zA)−1 zB be a minimal realization of F . Then F is J-unitary on the imaginary line if and only if the following conditions hold: (1) D is J-unitary, that is, DJD∗ = J; (2) there exists an Hermitian invertible matrix H such that A∗ H + HA = B
=
−C ∗ JC, −H
−1
∗
C JD.
(1.5) (1.6)
The matrix H is uniquely determined by a given minimal realization (it is called the associated Hermitian matrix to this realization). It holds that J − F (z)JF (z )∗ = C(IIγ − zA)−1 H −1 (IIγ − z A)−∗ C ∗ . z + z In particular, F is J-inner if and only if H > 0.
(1.7)
The finite-dimensional reproducing kernel Pontryagin space K(F ) with reproducing kernel J − F (z)JF (z )∗ K F (z, z ) = (z + z ) provides a minimal state space realization for F : more precisely (see [4]), F (z) = D + C(IIγ − zA)−1 zB, where
A C
B D
K(F ) K(F ) : → Cq Cq
is defined by F (z) − F (0) f (z) − f (0) u, Cf = f (0), Dx = F (0)x. , Bu = z z Another topic considered in [7] and [4] is J-unitary factorization. Given a matrix-valued function F which is J-unitary on the imaginary line one looks for all minimal factorizations of F (see [15]) into factors which are themselves Junitary on the imaginary line. There are two equivalent characterizations of these factorizations: the first one uses the theory of realization and the second one uses the theory of reproducing kernel Pontryagin spaces. (Af )(z) = (R0 f )(z) :=
Theorem 1.2. Let F be a rational matrix-valued function which is J-unitary on the imaginary line and holomorphic in a neighborhood of the origin, and let F (z) = D + C(IIγ − zA)−1 zB be a minimal realization of F , with the associated Hermitian matrix H. There is a one-to-one correspondence between minimal J-unitary factorizations of F (up to a multiplicative J-unitary constant) and Ainvariant subspaces which are non-degenerate in the (possibly, indefinite) metric induced by H. In general, F may fail to have non-trivial J-unitary factorizations.
Matrix-J-unitary Rational Formal Power Series
53
Theorem 1.3. Let F be a rational matrix-valued function which is J-unitary on the imaginary line and holomorphic in a neighborhood of the origin. There is a one-to-one correspondence between minimal J-unitary factorizations of F (up to a multiplicative J-unitary constant) and R0 -invariant non-degenerate subspaces of K(F ). The arguments in the proof of Theorem 1.1 do not go through in the several complex variables context. Indeed, uniqueness, up to a similarity, of minimal realizations doesn’t hold anymore (see, e.g., [27, 25, 33]). On the other hand, the notion of realization still makes sense in the non-commutative setting, namely for non-commutative rational formal power series (FPSs in short), and there is a uniqueness result for minimal realizations in this case (see [16, 39, 11]). The latter allows us to extend the notion and study of J-unitary matrix-valued functions to the non-commutative case. We introduce the notion of a matrix-J-unitary rational FPS as a formal power series in N non-commuting indeterminates which is J ⊗ In -unitary on N -tuples of n × n skew-Hermitian versus unitary matrices for n = 1, 2, . . .. We extend to this case the theory of minimal realizations, minimal J-unitary factorizations, and backward shift models in finite-dimensional de Branges–Rovnyak spaces. We also introduce, in a similar way, the notion of matrixselfadjoint rational formal power series, and show how to deduce the related theory for them from the theory of matrix-J-unitary ones. We now turn to the outline of this paper. It consists of eight sections. Section 1 is this introduction. In Section 2 we review various results in the theory of FPSs. Let us note that the theorem on null spaces for matrix substitutions and its corollary, from our paper [8], which are recollected in the end of Section 2, become an important tool in our present work on FPSs. In Section 3 we study the properties of observability, controllability and minimality of Givone-Roesser nodes in the non-commutative setting and give the corresponding criteria in terms of matrix evaluations for their “formal transfer functions”. We also formulate a theorem on minimal factorizations of a rational FPS. In Section 4 we define the non-commutative analogue of the imaginary line and study matrix-J-unitary FPSs for this case. We in particular obtain a non-commutative version of Theorem 1.1. We obtain a counterpart of the Lyapunov equation (1.5) and of Theorem 1.2 on minimal J-unitary factorizations. The unique solution of the Lyapunov equation has in this case a block diagonal structure: H = diag(H1 , . . . , HN ), and is said to be the associated structured Hermitian matrix (associated with a given minimal realization of a matrix-J-unitary FPS). Section 5 contains the analogue of the previous section for the case of a non-commutative counterpart of the unit circle. These two sections do not take into account a counterpart of condition (1.2), which is considered in Section 6 where we study matrix-J-inner rational FPSs. In particular, we show that the associated structured Hermitian matrix H = diag(H1 , . . . , HN ) is strictly positive in this case, which generalizes the statement in Theorem 1.1 on J-inner functions. We define non-commutative counterparts of the right half-plane and the unit disk, and formulate our results for both of these domains. The second
54
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
one is the disjoint union of the products of N copies of n × n matrix unit disks, n = 1, 2, . . ., and plays a role of a “non-commutative polydisk”. In Theorem 6.6 we show that any (not necessarily rational) FPS with operator coefficients, which takes contractive values in this domain, belongs to the non-commutative Schur– Agler class, defined by J.A. Ball, G. Groenewald and T. Malakorn in [12]. (The opposite is trivial: any function from this class has the above-mentioned property.) In other words, the contractivity of values of a FPS on N -tuples of strictly contractive n × n matrices, n = 1, 2, . . ., is sufficient for the contractivity of its values on N -tuples of strictly contractive operators in an infinite-dimensional separable Hilbert space. Thus, matrix-inner rational FPSs (i.e., matrix-J-inner ones for the case J = Iq ) belong to the non-commutative Schur–Agler class. For this case, we recover the theorem on unitary realizations for FPSs from the latter class which was obtain in [12]. Moreover, our Theorem 6.4 establishes the existence of a minimal, thus finite-dimensional, unitary Givone–Roesser realization of a rational matrix-inner FPS and the uniqueness of such a realization up to a unitary similarity. This implies, in particular, non-commutative Lossless Bounded Real Lemma (see [41, 7] for its one-variable counterpart). A non-commutative version of standard Bounded Real Lemma (see [47]) has been presented recently in [13]. In Section 7 we study matrix-selfadjoint rational FPSs. In Section 8 we introduce non-commutative formal reproducing kernel Pontryagin spaces in a way which extends one that J.A. Ball and V. Vinnikov have introduced in [14] non-commutative formal reproducing kernel Hilbert spaces. We describe minimal backward shift realizations in non-commutative formal reproducing kernel Pontryagin spaces which serve as a counterpart of finite-dimensional de Branges–Rovnyak spaces. Let us note that we derive an explicit formula (8.12) for the corresponding reproducing kernels. In the last subsection of Section 8 we present examples of matrix-inner rational FPSs with scalar coefficients, in two non-commuting indeterminates, and the corresponding reproducing kernels computed by formula (8.12).
2. Preliminaries In this section we introduce the notations which will be used throughout this paper and review some definitions from the theory of formal power series. The symbol p×q is the Cp×q denotes the set of p × q matrices with complex entries, and (Cr×s ) space of p × q block matrices with block entries in Cr×s . The tensor product A ⊗ B p×q with (i, j)th of matrices A ∈ Cr×s and B ∈ Cp×q is the element of (Cr×s ) r×s p×q block entry equal to Abij . The tensor product C ⊗ C is the linear span of n finite sums of the form C = k=1 Ak ⊗ Bk where Ak ∈ Cr×s and Bk ∈ Cp×q . One p×q identifies Cr×s ⊗ Cp×q with (Cr×s ) . Different representations for an element C ∈ Cr×s ⊗ Cp×q can be reduced to a unique one: C=
p q r s µ=1 ν=1 τ =1 σ=1
cµντ σ Eµν ⊗ Eτσ ,
Matrix-J-unitary Rational Formal Power Series
55
where the matrices Eµν ∈ Cr×s and Eτσ ∈ Cp×q are given by
1 if (i, j) = (µ, ν) Eµν ij = , µ, i = 1, . . . , r and ν, j = 1, . . . s, 0 if (i, j) = (µ, ν)
1 if (k, ) = (τ, σ) , τ, k = 1, . . . , p and σ, = 1, . . . q. (Eτ σ )k = 0 if (k, ) = (τ, σ)
We denote by FN the free semigroup with N generators g1 , . . . , gN and the identity element ∅ with respect to the concatenation product. This means that the generic element of FN is a word w = gi1 · · · gin , where iν ∈ {1, . . . , N } for ν = 1, . . . , n, the identity element ∅ corresponds to the empty word, and for another word w = gj1 · · · gjm , one defines the product as ww = gi1 · · · gin gj1 · · · gjm ,
w∅ = ∅w = w.
We denote by w = gin · · · gi1 ∈ FN the transpose of w = gi1 · · · gin ∈ FN and by |w| = n the length of the word w. Correspondingly, ∅T = ∅, and |∅| = 0. A formal power series (FPS in short) in non-commuting indeterminates z1 , . . . , zN with coefficients in a linear space E is given by f (z) = fw z w , fw ∈ E, (2.1) T
w∈F FN
where for w = gi1 · · · gin and z = (z1 , . . . , zN ) we set z w = zi1 · · · zin , and z ∅ = 1. We denote by E z1 , . . . , zN the linear space of FPSs in non-commuting indeterminates z1 , . . . , zN with coefficients in E. A series f ∈ Cp×q z1 , . . . , zN of the form (2.1) can also be viewed as a p × q matrix whose entries are formal power series with coefficients in C, i.e., belong to the space C z1 , . . . , zN , which has an additional structure of non-commutative ring (we assume that the indeterminates zj formally commute with the coefficients fw ). The support of a FPS f given by (2.1) is the set supp f = {w ∈ FN : fw = 0} . Non-commutative polynomials are formal power series with finite support. We denote by E z1 , . . . , zN the subspace in the space E z1 , . . . , zN consisting of non-commutative polynomials. Clearly, a FPS is determined by its coefficients fw . Sums and products of two FPSs f and g with matrix coefficients of compatible sizes (or with operator coefficients) are given by (f + g)w = fw + gw , (f g)w = fw gw . (2.2) w w =w
A FPS f with coefficients in C is invertible if and only if f∅ = 0. Indeed, assume that f is invertible. From the definition of the product of two FPSs in (2.2) we get f∅ (f −1 )∅ = 1, and hence f∅ = 0. On the other hand, if f∅ = 0 then f −1 is given by ∞ k f −1 (z) = 1 − f∅−1 f (z) f∅−1 . k=0
56
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
The formal power series in the right-hand side is well defined since the expansion k of 1 − f∅−1 f contains words of length at least k, and thus the coefficients (f −1 )w are finite sums. A FPS with coefficients in C is called rational if it can be expressed as a finite number of sums, products and inversions of non-commutative polynomials. A formal power series with coefficients in Cp×q is called rational if it is a p × q matrix whose all entries are rational FPSs with coefficients in C. We will denote by Cp×q z1 , . . . , zN rat the linear space of rational FPSs with coefficients in Cp×q . Define the product of f ∈ Cp×q z1 , . . . , zN rat and p ∈ C z1 , . . . , zN as follows: 1. f · 1 = f for every f ∈ Cp×q z1 , . . . , zN rat ; 2. For every word w ∈ FN and every f ∈ Cp×q z1 , . . . , zN rat , f · zw = fw z ww = fv z w w∈F FN
w
where the last sum is taken over all w which can be written as w = vw for some v ∈ FN ; 3. For every f ∈ Cp×q z1 , . . . , zN rat , p1 , p2 ∈ C z1 , . . . , zN and α1 , α2 ∈ C, f · (α1 p1 + α2 p2 ) = α1 (f · p1 ) + α2 (f · p2 ). The space C z1 , . . . , zN rat is a right module over the ring C z1 , . . . , zN with respect to this product. A structure of left C z1 , . . . , zN -module can be defined in a similar way since the indeterminates commute with coefficients. Formal power series are used in various branches of mathematics, e.g., in abstract algebra, enumeration problems and combinatorics; rational formal power series have been extensively used in theoretical computer science, mostly in automata u ¨ tzenberger theorem [35, 44] theory and language theory (see [18]). The Kleene–Sch¨ (see also [24]) says that a FPS f with coefficients in Cp×q is rational if and only if it is recognizable, i.e., there exist r ∈ N and matrices C ∈ Cp×r , A1 , . . . , AN ∈ Cr×r and B ∈ Cr×q such that for every word w = gi1 · · · gin ∈ FN one has p×q
fw = CAw B,
where Aw = Ai1 . . . Ain .
(2.3)
Let Hf be the Hankel matrix whose rows and columns are indexed by the words of FN and defined by (Hf )w,w = fwwT ,
w, w ∈ FN .
It follows from (2.3) that if the FPS f is recognizable then (Hf )w,w = T
CAww B for all w, w ∈ FN . M. Fliess has shown in [24] that a FPS f is rational (that is, recognizable) if and only if γ := rank Hf < ∞. In this case the number γ is the smallest possible r for a representation (2.3). In control theory, rational FPSs appear as the input/output mappings of linear systems with structured uncertainties. For instance, in [17] a system matrix
Matrix-J-unitary Rational Formal Power Series is given by
57
A B ∈ C(r+p)×(r+q) , C D and the uncertainty operator is given by M=
∆(δ) = diag(δ1 Ir1 , . . . , δN IrN ), where r1 + · · · + rN = r. The uncertainties δk are linear operators on 2 representing disturbances or small perturbation parameters which enter the system at different locations. Mathematically, they can be interpreted as non-commuting indeterminates. The input/output map is a linear fractional transformation LF T (M, ∆(δ)) = D + C(IIr − ∆(δ)A)−1 ∆(δ)B,
(2.4) Tαnc
of a linear which can be interpreted as a non-commutative transfer function system α with evolution on FN :
xj (gj w) = Aj1 x1 (w) + · · · + AjN xN (w) + Bj u(w), j = 1, . . . , N, α: (2.5) y(w) = C1 x1 (w) + · · · + CN xN (w) + Du(w), where xj (w) ∈ Crj (j = 1, . . . , N ), u(w) ∈ Cq , y(w) ∈ Cp , and the matrices Ajk , B and C are of appropriate sizes along the decomposition Cr = Cr1 ⊕ · · · ⊕ CrN . Such a system appears in [39, 11, 12, 13] and is known as the non-commutative Givone–Roesser model of multidimensional linear system; see [26, 27, 42] for its commutative counterpart. In this paper we do not consider system evolutions (i.e., equations (2.5)). We will use the terminology N -dimensional Givone–Roesser operator node (for brevity, GR-node) for the collection of data α = (N ; A, B, C, D; Cr =
N '
Crj , Cq , Cp ).
(2.6)
j=1
Sometimes instead of spaces Cr , Crj (j = 1, . . . , N ), Cq and Cp we shall consider abstract finite-dimensional linear spaces X (the state space), Xj (j = 1, . . . , N ), U (the input space) and Y (the output space), respectively, and a node α = (N ; A, B, C, D; X =
N '
Xj , U, Y),
j=1
where A, B, C, D are linear operators in the corresponding pairs of spaces. The non-commutative transfer function of a GR-node α is a rational FPS Tαnc(z) = D + C(IIr − ∆(z)A)−1 ∆(z)B.
(2.7)
Minimal GR-realizations (2.6) of non-commutative rational FPSs, that is, representations of them in the form (2.7), with minimal possible rk for k = 1, . . . , N were studied in [17, 16, 39, 11]. For k = 1, . . . , N , the kth observability matrix is Ok = col(Ck , C1 A1k , . . . , CN AN k , C1 A11 A1k , . . . C1 A1N AN k , . . .)
58
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
and the kth controllability matrix is Ck = row(Bk , Ak1 B1 , . . . , AkN BN , Ak1 A11 B1 , . . . AkN AN 1 B1 , . . .) (note that these are infinite block matrices). A GR-node α is called observable (resp., controllable) if rank Ok = rk (resp., rank Ck = rk ) for k = 1, . . . , N . A GR( rj q p node α = (N ; A, B, C, D; Cr = N j=1 C , C , C ) is observable if and only if its (N adjoint GR-node α∗ = (N ; A∗ , C ∗ , B ∗ , D∗ ; Cr = j=1 Crj , Cp , Cq ) is controllable. (Clearly, (α∗ )∗ = α.) In view of the sequel, we introduce some notations. We set: Awgν = Aj1 j2 Aj2 j3 · · · Ajk−1 jk Ajk ν , (CA)gν w = Cν Aνj1 Aj1 j2 · · · Ajk−1 jk , (AB)wgν = Aj1 j2 · · · Ajk−1 jk Ajk ν Bν , (CAB)gµ wgν = Cµ Aµj1 Aj1 j2 · · · Ajk−1 jk Ajk ν Bν , where w = gj1 · · · gjk ∈ FN and µ, ν ∈ {1, . . . , N }. We also define: Agν = A∅ = Iγ (CA)gν = Cν , (AB)gν = Bν , (CAB)gν = Cν Bν , (CAB)gµ gν = Cµ Aµν Bν , and hence, with the lexicographic order of words in FN , wgk Ok = colw∈F FN (CA)
T
gk w and Ck = roww∈F , FN (AB)
and the coefficients of the FPS Tαnc (defined by (2.7)) are given by (T Tαnc )∅ = D,
(T Tαnc )w = (CAB)w
for
w = gj1 · · · gjn ∈ FN .
The kth Hankel matrix associated with a FPS f is defined in [39] (see also [11]) as (Hf,k )w,w gk = fwgk wT
with
w, w ∈ FN ,
that is, the rows of Hf,k are indexed by all the words of FN and the columns of Hf,k are indexed by all the words of FN ending by gk , provided the lexicographic order is used. If a GR-node α defines a realization of f , that is, f = Tαnc, then (Hf,k )w,w gk = (CAB)wgk w
T
T
= (CA)wgk (AB)gk w ,
i.e., Hf,k = Ok Ck . Hence, the node α is minimal if and only if α is both observable and controllable, i.e., γk := rank Hf,k = rk
for all k ∈ {1, . . . , N } .
This last set of conditions is an analogue of the above mentioned result of Fliess on minimal recognizable representations of rational formal power series. Every non-commutative rational FPS has a minimal GR-realization.
Matrix-J-unitary Rational Formal Power Series
59
Finally, we note (see [17, 39]) that two minimal GR-realizations of a given (N rational FPS are similar : if α(i) = (N ; A(i) , B (i) , C (i) , D; Cγ = k=1 Cγk , Cq , Cp ) (i=1,2) are minimal GR-nodes such that Tαnc(1) = Tαnc(2) then there exists a block diagonal invertible matrix T = diag(T T1 , . . . , TN ) (with Tk ∈ Cγk ×γk ) such that A(1) = T −1 A(2) T,
B (1) = T −1 B (2) ,
C (1) = C (2) T.
(2.8)
Of course, the converse is also true, moreover, any two similar (not necessarily minimal) GR-nodes have the same transfer functions. Now we turn to the discussion on substitutions of matrices for indeterminates in formal power series. Many properties of non-commutative FPSs or noncommutative polynomials are described in terms of matrix substitutions, e.g., matrix-positivity of non-commutative polynomials (non-commutative Positivstellensatz) [29, 40, 31, 32], matrix-positivity of FPS kernels [34], matrix-convexity [21, 30]. The non-commutative Schur–Agler class, i.e., the class of FPSs with operator coefficients, which take contractive values on all N -tuples of strictly contractive operators on 2 , was studied in [12] 1 ; we will show in Section 6 that in order that a FPS belongs to this class it suffices to check its contractivity on N -tuples of strictly contractive n × n matrices, for all n ∈ N. The notions of matrix-Junitary (in particular, matrix-J-inner) and matrix-selfadjoint rational FPS, which will be introduced and studied in the present paper, are also defined in terms of substitutions of matrices (of a certain class) for indeterminates. w ∈ C z1 , . . . , zN . For n ∈ N and an N -tuple of Let p(z) = |w|≤m pw z N
matrices Z = (Z1 , . . . , ZN ) ∈ (Cn×n ) , set p(Z) = pw Z w , |w|≤m
where Z w = Zi1 · · · Zi|w| for w = gi1 · · · gi|w| ∈ FN , and Z ∅ = In . Then for any N
rational expression for a FPS f ∈ C z1 , . . . , zN rat its value at Z ∈ (Cn×n ) is well defined provided all of the inversions of polynomials p(j) ∈ C z1 , . . . , zN in this expression are well defined at Z. The latter is the case at least in some (j) neighborhood of Z = 0, since p∅ = 0. N
Now, if f ∈ Cp×q z1 , . . . , zN rat then the value f (Z) at some Z ∈ (Cn×n ) is well defined whenever the values of matrix entries (ffij (Z)) (i = 1, . . . , p; j = 1, . . . , q) are well defined at Z. As a function of matrix entries (Zk )ij (k = 1, . . . , N ; i, j = 1, . . . , n), f (Z) is rational Cp×q ⊗ Cn×n -valued function, which is holomorphic on an open and dense set in Cn×n . The latter set contains some neighborhood N : Zk < ε, k = 1, . . . , N } (2.9) Γn (ε) := {Z ∈ Cn×n 1 In
fact, a more general class was studied in [12], however for our purposes it is enough to consider here only the case mentioned above.
60
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
of Z = 0, where f (Z) is given by f (Z) =
fw ⊗ Z w .
w∈F FN
The following results from [8] on matrix substitutions are used in the sequel. Theorem 2.1. Let f ∈ Cp×q z1 , . . . , zN rat , and m ∈ Z+ be such that ) ) ker fw = ker fw . w∈F FN
w∈F FN :|w|≤m
Then there exists ε > 0 such that for every n ∈ N : n ≥ mm (in the case m = 0, for every n ∈ N), ⎛ ⎞ ) ) ker f (Z) = ⎝ ker fw ⎠ ⊗ Cn , (2.10) Z∈Γn (ε)
w∈F FN : |w|≤m
and moreover, there exist l ∈ N : l ≤ qn, and N -tuples of matrices Z (1) , . . . , Z (l) from Γn (ε) such that ⎞ ⎛ l ) ) (j) ker f (Z ) = ⎝ ker fw ⎠ ⊗ Cn . j=1
w∈F FN : |w|≤m
Corollary 2.2. In conditions of Theorem 2.1, if for some n ∈ N : n ≥ mm (in the case m = 0, for some n ∈ N) one has f (Z) = 0, ∀Z ∈ Γn (ε), then f = 0.
3. More on observability, controllability, and minimality in the non-commutative setting In this section we prove a number of results on observable, controllable and minimal GR-nodes in the multivariable non-commutative setting, which generalize some well-known statements for one-variable nodes (see [15]).
k and the kth trunLet us introduce the kth truncated observability matrix O
cated controllability matrix Ck of a GR-node (2.6) by *k = col|w|<pr (CA)wgk , O
*k = row|w|
with the lexicographic order of words in FN . *k = rank Ok and rank * Theorem 3.1. For each k ∈ {1, . . . , N }: rank O Ck = rank Ck . Proof. Let us show that for every fixed k ∈ {1, . . . , N } matrices of the form (CA)wgk with |w| ≥ pr are representable as linear combinations of matrices
k (CA)wg with |w|
< pr. First we remark that if for each fixed k ∈ {1, . . . , N } and j ∈ N all matrices of the form (CA)wgk with |w| = j are representable as linear combinations of matrices of the form (CA)w gk with |w | < j then the same holds for matrices of the form (CA)wgk with |w| = j+1. Indeed, if w = i1 · · · ij ij+1
Matrix-J-unitary Rational Formal Power Series
61
then there exist words w1 , . . . , ws with |w1 | < j, . . . , |ws | < j and a1 , . . . , as ∈ C such that s w aν (CA)wν gij+1 . (CA) = ν=1
Then for every k ∈ {1, . . . , N }, (CA)wgk = (CA)w Aij+1 ,k = =
s
aν (CA)wν gij+1 Aij+1 ,k
ν=1 wν gij+1
aν (CA)
ν: |wν |<j−1
=
Aij+1 ,k +
aν (CA)wν gij+1 Aij+1 ,k
ν: |wν |=j−1
wν gij+1 gk
aν (CA)
+
ν: |wν |<j−1
aν (CA)wν gij+1 gk .
ν: |wν |=j−1
Consider these two sums separately. All the terms in the first sum are of the form aν (CA)(wν gij+1 )gk with |wν gij+1 | < j. In the second sum, by the assumption, for , . . . , wtν of length strictly less each matrix (CA)wν gij+1 gk there exist words w1ν than j and complex numbers b1ν , . . . , btν such that
(CA)wν gij+1 gk =
t
bµν (CA)wµν gk .
µ=1
k Hence (CA)wgk is a linear combination of matrices of the form (CA)wg with |w|
< j. Reiterating this argument we obtain that any matrix of the form (CA)wgk with |w| ≥ j and fixed k ∈ {1, . . . , N } can be represented as a linear combination
k of matrices of the form (CA)wg with |w|
< j. In particular,
rank col|w|<j (CA)wgk = rank Ok ,
k = 1, . . . , N.
(3.1)
Since for any k ∈ {1, . . . , N } one has (CA)wgk ∈ Cp×rk and dim Cp×rk = prk , we obtain that for some j ≤ pr, and moreover for j = pr (3.1) is true, i.e., *k = rank Ok . rank O The second equality is proved analogously. *k and C *k depend only on the Remark 3.2. The sizes of the truncated matrices O sizes of matrices A, B and C, and do not depend on these matrices themselves. *k is rough, and one could probably improve it. For Our estimate for the size of O *k is important, *k and C our present purposes, only the finiteness of the matrices O and not their actual sizes. Corollary 3.3. A GR-node (2.6) is observable (resp., controllable) if and only if for every k ∈ {1, . . . , N }: *k = rk rank O
*k = rk ), (resp, rank C
or equivalently, the matrix Ok (resp., Ck ) is left (resp., right ) invertible.
62
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Remark 3.4. Corollary 3.3 is comparable with Theorems 7.4 and 7.7 in [39], how*k and C *k here are finite. ever we note again that the matrices O γk q p Theorem 3.5. Let α(i) = (N ; A(i) , B (i) , C (i) , D, Cγ = ⊕N k=1 C , C , C ), i = 1, 2, be minimal GR-nodes with the same transfer function. Then they are similar, the similarity transform is unique and given by T = diag(T T1 , . . . , TN ) where
+ †
(1) = C (2) C (1)
(2) O Tk = O k k k k
(3.2)
(here “+ “ ” denotes a left inverse, while “† “ ” denotes a right inverse). Proof. We already mentioned in Section 2 that two minimal nodes with the T1 , . . . , TN ) and T = same transfer function are similar. Let T = diag (T diag (T T1 , . . . , TN ) be two similarity transforms. Let x ∈ Cγk . Then, for every w ∈ FN , (C (2) A(2) )wgk (T Tk − Tk ) x = (C (1) A(1) )wgk x − (C (1) A(1) )wgk x = 0. Since x is arbitrary, from the observability of α(2) we get Tk = Tk for k = 1, . . . , N , hence the similarity transform is unique. Comparing the coefficients in the two FPS representations of the transfer function, we obtain (C (1) A(1) B (1) )w = (C (2) A(2) B (2) )w for all of w ∈ FN \ {∅}, and therefore + + (1) + (1) (2) + (2) Ok Ck = Ok Ck ,
k = 1, . . . , N.
Thus we obtain + † + + + (2) (1) (2) + (1) Ok Ok = Ck , Ck
k = 1, . . . , N.
Denote the operators which appear in these equalities by Tk , k = 1, . . . , N . A direct computation shows that Tk are invertible with + † + + + + (1) (2) (1) (2) Tk−1 = Ok Ok = Ck . Ck Let us verify that T = diag(T T1 , . . . , TN ) ∈ Cγ×γ is a similarity transform between (1) (2) and α . It follows from the controllability of α(1) that for arbitrary k ∈ α {1, . . . , N } and x ∈ Cγk there exist words wj ∈ FN , with |wj | < γq, scalars aj ∈ C and vectors uj ∈ Cq , j = 1, . . . , s, such that x=
s ν=1
T
aν (A(1) B (1) )gk wν uν .
Matrix-J-unitary Rational Formal Power Series Then Tk x =
63
+ + s T + + + + (2) (1) (2) (1) Ok x = Ok (A(1) B (1) )gk wν uν Ok aν Ok ν=1
=
s
+ s T + + (2) (2) (2) (2) gk wνT aν Ok uν = aν (A(2) B (2) )gk wν uν . Ok (A B )
ν=1
ν=1
This explicit formula implies the set of equalities (1)
(2)
Tk Bk = Bk ,
(1)
(2)
Tk Akj = Akj Tj ,
(1)
Ck
(2)
= Ck Tk ,
k, j = 1, . . . , N,
which is equivalent to (2.8).
Remark 3.6. Theorem 3.5 is comparable with Theorem 7.9 in [39]. However, we establish in Theorem 3.5 the uniqueness and an explicit formula for the similarity transform T . Using Theorem 2.1, we will prove now the following criteria of observability, controllability, and minimality for GR-nodes analogous to the ones proven in [8, Theorem 3.3] for recognizable FPS representations. Theorem 3.7. A GR node α of the form (2.6) is observable (resp., controllable) if and only if for every k ∈ {1, . . . , N } and n ∈ N : n ≥ (pr − 1)pr−1 (resp, n ≥ (rq − 1)rq−1 ), which means in the case of pr = 1 (resp., rq = 1): “for every n ∈ N”, ) ker ϕk (Z) = 0 (3.3) Z∈Γn (ε)
(resp.,
,
ran ψk (Z)
= Crk ⊗ Cn ),
(3.4)
Z∈Γn (ε)
where the rational FPSs ϕk and ψk are defined by
ϕk (z) = C(IIr − ∆(z)A)−1 -Crk ,
(3.5)
ψk (z) = Pk (IIr − A∆(z))
(3.6)
−1
B,
with Pk standing for the orthogonal projection onto . C (which is naturally identified here with the subspace in Cr ), the symbol “ ” means linear span, ε = A−1 (ε > 0 is arbitrary in the case A = 0), and Γn (ε) is defined by (2.9). This GR-node is minimal if both of conditions (3.3) and (3.4) are fulfilled. rk
Proof. First, let us remark that for all k = 1, . . . , N the functions ϕk and ψk are well defined in Γn (ε), and holomorphic as functions of matrix entries (Z Zj )µν , j = 1, . . . , N, µ, ν = 1, . . . , n. Second, Theorem 3.1 implies that in Theorem 2.1 applied to ϕk one can choose m = pr−1, and then from (2.10) obtain that observability for a GR-node α is equivalent to condition (3.3). Since α is controllable if and only if α∗ is observable, controllability for α is equivalent to condition (3.4). Since minimality for a GR-node α is equivalent to controllability and observability together, it is in turn equivalent to conditions (3.3) and (3.4) together.
64
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
(N Let α = (N ; A , B , C , D ; Cr = j=1 Crj , Cs , Cp ) and α = (N ; A , B , (N rj q s C , D ; Cr = j=1 C , C , C ) be GR-nodes. For k, j = 1, . . . , N set rj = rj + rj , and Akj Bk Cj Bk D rk ×rj Akj = , Bk = ∈C ∈ Crk ×q , 0 A B (3.7) kj k p×rj p×q C D C Cj = j , D=DD ∈C . j ∈C (N r rj q p Then α = (N ; A, B, C, D; C = j=1 C , C , C ) will be called the product of GR-nodes α and α and denoted by α = α α . A straightforward calculation shows that Tαnc = Tαnc Tαnc . Consider a GR-node N N ' ' Crj , Cq ) := (N ; A, B, C, D; Cr = Cr j , Cq , Cq ) α = (N ; A, B, C, D; Cr = j=1
j=1
(3.8) with invertible operator D. Then α× = (N ; A× , B × , C × , D× ; Cr =
N '
Crj , Cq ),
j=1
with A× = A − BD−1 C,
B × = BD−1 ,
C × = −D−1 C,
D× = D−1 ,
(3.9)
×
will be called the associated GR-node, and A the associated main operator, of α. It is easy to see that, as well as in the one-variable case, (T Tαnc )−1 = Tαnc× . Moreover, × × (α× ) = α (in particular, (A× ) = A), and (α α )× = α× α× up to the natural rj rj identification of C ⊕ C with Crj ⊕ Crj , j = 1, . . . , N , which is a similarity transform. Theorem 3.8. A GR-node (3.8) with invertible operator D is minimal if and only if its associated GR-node α× is minimal. Proof. Let a GR-node α of the form (3.8) with invertible operator D be minimal, and x ∈ ker Ok× for some k ∈ {1, . . . , N }, where Ok× is the kth observability matrix × . Then x ∈ ker(C × A× )wgk for every w ∈ FN . Let us show for the GR-node α/ wgk that x ∈ ker Ok = w∈F , i.e, x = 0. FN ker(CA) × For w = ∅, Ck x = 0 means −D−1 Ck x = 0 (see (3.9)), which is equivalent to Ck x = 0. For |w| > 0, w = gi1 · · · gi|w| , (CA)wgk
=
Ci1 Ai1 i2 · · · Ai|w| k
=
−1 −1 −DC Ci×1 (A× Ci2 ) · · · (A× Ck ) i1 i2 + Bi1 D i|w| k + Bi|w| D
=
L0 Ck× +
|w| j=1
× Lj Ci×j A× ij ij+1 · · · Ai|w| k ,
Matrix-J-unitary Rational Formal Power Series
65
with some matrices Lj ∈ Cq×q , j = 0, 1, . . . , |w|. Thus, x ∈ ker(CA)wgk for every w ∈ FN , i.e., x = 0, which means that α× is observable. Since α is controllable if and only if α∗ is observable (see Section 2), and ∗ D is invertible whenever D is invertible, the same is true for α× and (α× )∗ = (α∗ )× . Thus, the controllability of α× follows from the controllability of α. Finally, the minimality of α× follows from the minimality of α. Since (α× )× = α, the minimality of α follows from the minimality of α× . Suppose that for a GR-node (3.8), projections Πk on Crk are defined such that Akj ker Πj ⊂ ker Πk ,
(A× )kj ran Πj ⊂ ran Πk ,
k, j = 1, . . . , N.
We do not assume that Πk are orthogonal. We shall call Πk a kth supporting projection for α. Clearly, the map Π = diag(Π1 , . . . , ΠN ) : Cr → Cr satisfies A ker Π ⊂ ker Π,
A× ran Π ⊂ ran Π,
i.e., it is a supporting projection for the one-variable node (1; A, B, C, D; Cr , Cq ) in the sense of [15]. If Π is a supporting projection for α, then Ir − Π is a supporting projection for α× . The following theorem and corollary are analogous to, and are proved in the same way as Theorem 1.1 and its corollary in [15, pp. 7–9] (see also [43, Theorem 2.1]). Theorem 3.9. Let (3.8) be a GR-node with invertible operator D. Let Πk be a projection on Crk , and let (11) (12) (1) Akjj Akjj Bj A= Bj = Ck = Ck(1) Ck(2) (21) (22) , (2) , Akj Akj Bj be the block matrix representations of the operators Akj , Bj and Ck with respect ˙ to the decompositions Crk = ker Πk +ran Πk , for k, j ∈ {1, . . . , N }. Assume that D = D D , where D and D are invertible operators on Cq , and set α = (N ; A(11) , B (1) (D )−1 , C (1) , D ; ker Π =
N '
ker Πk , Cq ),
k=1
α = (N ; A(22) , B (2) , (D )−1 C (2) , D ; ran Π =
N '
ran Πk , Cq ).
k=1
Then α = α α
(up to a similarity which maps C
rk
˙ = ker Πk +ran Πk onto ·
Cdim(ker Πk ) ⊕ Cdim(ranΠk ) (k = 1, . . . , N ) such that ker Πk + {0} is mapped onto ·
Cdim(ker Πk ) ⊕ {0} and {0} + ranΠk is mapped onto {0} ⊕ Cdim(ranΠk ) ) if and only if Π is a supporting projection for α. Corollary 3.10. In the assumptions of Theorem 3.9, Tαnc = F F ,
66
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
where F (z) = D + C(IIr − ∆(z)A)−1 (IIr − Π)∆(z)B(D )−1 , F (z) = D + (D )−1 CΠ(IIr − ∆(z)A)−1 ∆(z)B. We assume now that the external operator of the GR-node (3.8) is equal to D = Iq and that we also take D = D = Iq . Then, the GR-nodes α and α of Theorem 3.9 are called projections of α with respect to the supporting projections Ir − Π and Π, respectively, and we use the notations N ' ker Πk , Cq , α = prIr −Π (α) = N ; A(11) , B (1) , C (1) , D ; ker Π = k=1
α = prΠ (α) =
(22)
N; A
,B
(2)
,C
(2)
, D ; ran Π =
N '
ran Πk , C
q
.
k=1
Let F , F and F be rational FPSs with coefficients in Cq×q such that F = F F .
(3.10)
The factorization (3.10) will be said to be minimal if whenever α and α are minimal GR-realizations of F and F , respectively, α α is a minimal GR-realization of F . In the sequel, we will use the notation N ' γ γk ×γk q α = N ; A, B, C, D; C = C ,C (3.11) k=1
for a minimal GR-realization (i.e., rk = γk for k = 1, . . . , N ) of a rational FPS F in the case when p = q. The following theorem is the multivariable non-commutative version of [15, Theorem 4.8]. It gives a complete description of all minimal factorizations in terms of supporting projections. Theorem 3.11. Let F be a rational FPS with a minimal GR-realization (3.11). Then the following statements hold: (i) if Π = diag(Π1 , . . . , ΠN ) is a supporting projection for α, then F is the transfer function of prIγ −Π (α), F is the transfer function of prΠ (α), and F = F F is a minimal factorization of F ; (ii) if F = F F is a minimal factorization of F , then there exists a uniquely defined supporting projection Π = diag(Π1 , . . . , ΠN ) for the GR-node α such that F and F are the transfer functions of prIγ −Π (α) and prΠ (α), respectively. Proof. (i). Let Π be a supporting projection for α. Then, by Theorem 3.9, α = prIγ −Π (α)prΠ (α).
Matrix-J-unitary Rational Formal Power Series
67
By the assumption, α is minimal. We now show that the GR-nodes α = prIγ −Π (α) and α = prΠ (α) are also minimal. To this end, let x ∈ ran Πk . Then wgk wg wg C (2) A(22) x = (CA) k Πk x = (CA) k x. Thus, if Ok denotes the kth observability matrix of α , then x ∈ ker Ok implies x ∈ ker Ok , and the observability of α implies that α is also observable. Since gk wT g wT A(22) B (2) = Πk (AB) k , one has Ck = Πk Ck , where Ck is the kth controllability matrix of α . Thus, the controllability of α implies the controllability of α . Hence, we have proved the minimality of α . Note that we have used that ker Π = ran (IIγ − Π) is A-invariant. Since ran Π = ker(IIγ − Π) is A× -invariant, by Theorem 3.8 α× is minimal. Using α× = (α α )× = (α )× (α )× , we prove the minimality of (α )× in the same way as that of α . Applying once again Theorem 3.8, we obtain the minimality of α . The dimensions of the state spaces of the minimal GR-nodes α , α and α are related by γk = γk + γk ,
k = 1, . . . , N.
Therefore, given any minimal GR-realizations β and β of F and F , respectively, the same equalities hold for the state space dimensions of β , β and β. Thus, β β is a minimal GR-node, and the factorization F = F F is minimal. (ii). Assume that the factorization F = F F is minimal. Let β and β be minimal GR-realizations of F and F with k-th state space dimensions equal to γk and γk , respectively (k = 1, . . . , N ). Then β β is a minimal GR-realization of F and its kth state space dimension is equal to γk = γk + γk (k = 1, . . . , N ). Hence β β is similar to α. We denote the corresponding GR-node similarity by T = diag(T T1 , . . . , TN ), where
T k : Cγ ⊕ Cγ → Cγ ,
k = 1, . . . N,
is the canonical isomorphism. Let Πk be the projection of Cγk along Tk Cγk onto Tk Cγk , k = 1, . . . , N , and set Π = diag(Π1 , . . . , Πk ). Then Π is a supporting projection for α. Moreover prIγ −Π (α) is similar to β , and prΠ (α) is similar to β . The uniqueness of Π is proved in the same way as in [15, Theorem 4.8]. The uniqueness of the GR-node similarity follows from Theorem 3.5.
4. Matrix-J-unitary formal power series: A multivariable non-commutative analogue of the line case In this section we study a multivariable non-commutative analogue of rational q × q matrix-valued functions which are J-unitary on the imaginary line iR of the complex plane C.
68
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
4.1. Minimal Givone–Roesser realizations and the Lyapunov equation N
Denote by Hn×n the set of Hermitian n × n matrices. Then (iHn×n ) will denote the set of N -tuples of skew-Hermitian matrices. In our paper, the set 0 N iHn×n , JN = n∈N
1 where “ ” stands for a disjoint union, will be a counterpart of the imaginary line iR. Let J ∈ Cq×q be a signature matrix. We will call a rational FPS F ∈ Cq×q z1 , . . . , zN rat matrix-J-unitary on JN if for every n ∈ N, F (Z)(J ⊗ In )F (Z)∗ = J ⊗ In
(4.1)
n×n N
at all points Z ∈ (iH ) where it is defined. For a fixed n ∈ N, F (Z) as a function of matrix entries is rational and holomorphic on some open neighborhood N Γn (ε) of Z = 0, e.g., of the form (2.9), and Γn (ε) ∩ (iHn×n ) is a uniqueness set in n×n N (C ) (see [45] for the uniqueness theorem in several complex variables). Thus, (4.1) implies that (4.2) F (Z)(J ⊗ In )F (−Z ∗ )∗ = J ⊗ In at all points Z ∈ (Cn×n )N where F (Z) is holomorphic and invertible (the set of such points is open and dense, since det F (Z) ≡ 0). The following theorem is a counterpart of Theorem 2.1 in [7]. Theorem 4.1. Let F be a rational FPS with a minimal GR-realization (3.11). Then F is matrix-J-unitary on JN if and only if the following conditions are fulfilled: a) D is J-unitary, i.e., DJD∗ = J; b) there exists an invertible Hermitian solution H = diag(H1 , . . . , HN ), with Hk ∈ Cγk ×γk , k = 1, . . . , N , of the Lyapunov equation A∗ H + HA = −C ∗ JC,
(4.3)
B = −H −1 C ∗ JD.
(4.4)
and The property b) is equivalent to b ) there exists an invertible Hermitian matrix H = diag(H1 , . . . , HN ), with Hk ∈ Cγk ×γk , k = 1, . . . , N , such that H −1 A∗ + AH −1 = −BJB ∗ ,
(4.5)
C = −DJB ∗ H.
(4.6)
and Proof. Let F be matrix-J-unitary. Then F is holomorphic at the point Z = 0 in CN , hence D = F (0) is J-unitary (in particular, invertible). Equality (4.2) may be rewritten as (4.7) F (Z)−1 = (J ⊗ In )F (−Z ∗ )∗ (J ⊗ In ).
Matrix-J-unitary Rational Formal Power Series
69
Since (4.7) holds for all n ∈ N, it follows from Corollary 2.2 that the FPSs corresponding to the left and the right sides of equality (4.7) coincide. Due to The(N orem 3.8, α× = (N ; A× , B × , C × , D× ; Cγ = k=1 Cγk , Cq ) with A× , B × , C × , D× given by (3.9) is a minimal GR-realization of F −1 . Due to (4.7), another minimal ˜ B, ˜ C, ˜ D; ˜ Cγ = (N Cγk , Cq ), where ˜ = (N ; A, GR-realization of F −1 is α k=1
∗
A˜ = −A ,
∗
˜ = C J, B
∗
C˜ = −JB ,
˜ = JD∗ J. D
By Theorem 3.5, there exists unique similarity transform T = diag(T T1 , . . . , TN ) which relates α× and α ˜ , where Tk ∈ Cγk ×γk are invertible for k = 1, . . . , N , and T (A − BD−1 C) = −A∗ T,
T BD−1 = C ∗ J,
D−1 C = JB ∗ T.
(4.8)
Note that the relation D−1 = JD∗ J, which means J-unitarity of D, has been already established above. It is easy to check that relations (4.8) are also valid for T ∗ in the place of T . Hence, by the uniqueness of similarity matrix, T = T ∗ . Setting H = −T , we obtain from (4.8) the equalities (4.3) and (4.4), as well as (4.5) and (4.6), by a straightforward calculation. Let us prove now a slightly more general statement than the converse. Let α be a (not necessarily minimal) GR-realization of F of the form (3.8), where D is J-unitary, and let H = diag(H1 , . . . , HN ) with Hk ∈ Crk ×rk , k = 1, . . . , N , be a Hermitian invertible matrix satisfying (4.3) and (4.4). Then in the same way as in [7, Theorem 2.1] for the one-variable case, we obtain for Z, Z ∈ Cn×n : −1
F (Z)(J ⊗ In )F (Z )∗ = J ⊗ In − (C ⊗ In ) (IIr ⊗ In − ∆(Z)(A ⊗ In )) ×∆(Z + Z ∗ )(H −1 ⊗ In ) (IIr ⊗ In − (A∗ ⊗ In )∆(Z ∗ ))
−1
(C ∗ ⊗ In )
(4.9)
−1
(note that ∆(Z) commutes with H ⊗ In ). It follows from (4.9) that F (Z) is (J ⊗ In )-unitary on (iHn×n )N at all points Z where it is defined. Since n ∈ N is arbitrary, F is matrix-J-unitary on JN . Clearly, conditions a) and b’) also imply the matrix-J-unitarity of F on JN . Let us make some remarks. First, it follows from the proof of Theorem 4.1 that the structured solution H = diag(H1 , . . . , HN ) of the Lyapunov equation (4.3) is uniquely determined by a given minimal GR-realization of F . The matrix H = diag(H1 , . . . , HN ) is called the associated structured Hermitian matrix (associated with this minimal GR-realization of F ). The matrix Hk will be called the kth component of the associated Hermitian matrix (k = 1, . . . , N ). The explicit formulas for Hk follow from (3.2): wgk 2 wg 3+ Hk = − col|w|≤qr−1 ((JB ∗ )(−A∗ )) k col|w|≤qr−1 (D−1 C)A× 4 T g wT 5† = −row|w|≤qr−1 ((−A∗ )(C ∗ J))gk w row|w|≤qr−1 A× (BD−1 ) k . Second, let α be a (not necessarily minimal) GR-realization of F of the form (3.8), where D is J-unitary, and let H = diag(H1 , . . . , HN ) with Hk ∈ Crk ×rk , k = 1, . . . , N , be an Hermitian, not necessarily invertible, matrix satisfying (4.3) and
70
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
(4.6). Then in the same way as in [7, Theorem 2.1] for the one-variable case, we obtain for Z, Z ∈ Cn×n : F (Z )∗ (J ⊗ In )F (Z) = J ⊗ In − (B ∗ ⊗ In ) (IIr ⊗ In − ∆(Z ∗ )(A∗ ⊗ In )) ×(H ⊗ In )∆(Z ∗ + Z) (IIr ⊗ In − (A ⊗ In )∆(Z))
−1
(B ⊗ In )
−1
(4.10)
(note that ∆(Z) commutes with H ⊗ In ). It follows from (4.10) that F (Z) is (J ⊗ In )-unitary on (iHn×n )N at all points Z where it is defined. Since n ∈ N is arbitrary, F is matrix-J-unitary on JN . Third, if α is a (not necessarily minimal) GR-realization of F of the form (3.8), where D is J-unitary, and equalities (4.5) and (4.6) are valid with H −1 replaced by some, possibly not invertible, Hermitian matrix Y = diag(Y Y1 , . . . , YN ) with Yk ∈ Crk ×rk , k = 1, . . . , N , then F is matrix-J-unitary on JN . This follows from the fact that (4.9) is valid with H −1 replaced by Y . Theorem 4.2. Let (C, A) be an observable pair of matrices C ∈ Cq×r , A ∈ (N rk and Ok has full column rank for each Cr×r in the sense that Cr = k=1 C k ∈ {1, . . . , N }, and let J ∈ Cq×q be a signature matrix. Then there exists a matrix-J-unitary on JN rational FPS F with a minimal GR-realization (N rk q α = (N ; A, B, C, D; Cr = k=1 C , C ) if and only if the Lyapunov equation (4.3) has a structured solution H = diag(H1 , . . . , HN ) which is both Hermitian and invertible. If such a solution H exists, possible choices of D and B are D0 = Iq ,
B0 = −H −1 C ∗ J.
(4.11)
Finally, for a given such H, all other choices of D and B differ from D0 and B0 by a right multiplicative J-unitary constant matrix. Proof. Let H = diag(H1 , . . . , HN ) be a structured solution of the Lyapunov equation (4.3) which is both Hermitian and invertible. We first check that the pair (A, −H −1 C ∗ J) is controllable, or equivalently, that the pair (−JCH −1 , A∗ ) is observable. Using the Lyapunov equation (4.3), one can see that for any k ∈ {1, . . . , N } and w = gi1 · · · gi|w| ∈ FN there exist matrices K0 , . . . , K|w|−1 such that (CA)wgk
= (−1)|w|−1 J((−JCH −1 )A∗ )wgk Hk + K0 J(−JC Ci2 Hi−1 (A∗ )i2 i3 · · · (A∗ )i|w| k )Hk + · · · 2 + K|w|−2 J(−JC Ci|w| (A∗ )i|w| k )Hk + K|w|−1 J(−JCk Hk−1 )Hk .
Thus, if x ∈ ker((−JCH −1 )A∗ )wgk for all of w ∈ FN then Hk−1 x ∈ ker Ok , and the observability of the pair (C, A) implies that x = 0. Therefore, the pair (−JCH −1 , A∗ ) is observable, and the pair (A, −H −1 C ∗ J) is controllable. By Theorem 4.1 we obtain that F0 (z) = Iq − C(IIr − ∆(z)A)−1 ∆(z)H −1 C ∗ J
(4.12)
is a matrix-J-unitary on JN rational FPS, which has a minimal GR-realization (N α0 = (N : A, −H −1 C ∗ J, C, Iq ; Cr = k=1 Crk , Cq ) with the associated structured Hermitian matrix H.
Matrix-J-unitary Rational Formal Power Series
71
(N Conversely, let α = (N ; A, B, C, D; Cr = k=1 Crk , Cq ) be a minimal GRnode. Then by Theorem 4.1 there exists an Hermitian and invertible matrix H = diag(H1 , . . . , HN ) which solves (4.3). Given H = diag(H1 , . . . , HN ), let B, D be any solution of the inverse problem, (N rk q i.e., α = (N ; A, B, C, D; Cr = k=1 C , C ) is a minimal GR-node with the associated structured Hermitian matrix H. Then for F0 = Tαnc0 and F = Tαnc we obtain from (4.9) that F (Z)(J ⊗ In )F (Z )∗ = F0 (Z)(J ⊗ In )F F0 (Z )∗ for any n ∈ N, at all points Z, Z ∈ (Cn×n )N where both F and F0 are defined. By the uniqueness theorem in several complex variables (matrix entries for Zk ’s and Z ∗k ’s, k = 1, . . . , N ), we obtain that F (Z) and F0 (Z) differ by a right multiplicative (J ⊗ In )-unitary constant, which clearly has to be D ⊗ In , i.e., F (Z) = F0 (Z)(D ⊗ In ). Since n ∈ N is arbitrary, by Corollary 2.2 we obtain F (z) = F0 (z)D. Equating the coefficients of these two FPSs, we easily deduce using the observability of the pair (C, A) that B = −H −1 C ∗ JD. The following dual theorem is proved analogously. Theorem 4.3. Let (A, B) be a controllable pair of matrices A ∈ Cr×r , B ∈ Cr×q in (N the sense that Cr = k=1 Crk and Ck has full row rank for each k ∈ {1, . . . , N }, q×q and let J ∈ C be a signature matrix. Then there exists a matrix-J-unitary on JN rational FPS F with a minimal GR-realization α = (N ; A, B, C, D; Cr = (N rk q k=1 C , C ) if and only if the Lyapunov equation GA∗ + AG = −BJB ∗ has a structured solution G = diag(G1 , . . . , GN ) which is both Hermitian and invertible. If such a solution G exists, possible choices of D and C are D0 = Iq ,
C0 = −JB ∗ G−1 .
(4.13)
Finally, for a given such G, all other choices of D and C differ from D0 and C0 by a left multiplicative J-unitary constant matrix. Theorem 4.4. Let F be a matrix-J-unitary on JN rational FPS, and α be its GRrealization. Let H = diag(H1 , . . . , HN ) with Hk ∈ Crk ×rk , k = 1, . . . , N , be an Hermitian invertible matrix satisfying (4.3) and (4.4), or equivalently, (4.5) and (4.6). Then α is observable if and only if α is controllable. Proof. Suppose that α is observable. Since by Theorem 4.1 D = F∅ is J-unitary, by Theorem 4.2 α is a minimal GR-node. In particular, α is controllable. Suppose that α is controllable. Then by Theorem 4.3 α is minimal, and in particular, observable.
72
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
4.2. The associated structured Hermitian matrix Lemma 4.5. Let F be a matrix-J-unitary on JN rational FPS, and let α(i) = (N (N ; A(i) , B (i) , C (i) , D; Cγ = k=1 Cγk , Cq ) be minimal GR-realizations of F , with (i) (i) the associated structured Hermitian matrices H (i) = diag(H1 , . . . , HN ), i = 1, 2. (1) (2) Then α and α are similar, i.e., (2.8) holds with a uniquely defined invertible matrix T = diag(T T1 , . . . , TN ), and (1)
Hk
In particular, the matrices
= Tk∗ Hk Tk , (2)
(1) Hk
and
(2) Hk
k = 1, . . . , N.
(4.14)
have the same signature.
The proof is easy and analogous to the proof of Lemma 2.1 in [7]. Remark 4.6. The similarity matrix T = diag(T T1 , . . . , TN ) is a unitary map(N γk ping from Cγ = C endowed with the inner product [ · , · ]H (1) onto k=1 (N γ γk C = k=1 C endowed with the inner product [ · , · ]H (2) , where [x, y]H (i) = H (i) x, yCγ ,
x, y ∈ Cγ , i = 1, 2,
that is, [x, y]H (i) =
N
[xk , yk ]H (i) ,
k=1
i = 1, 2,
k
where xk , yk ∈ Cγk , x = colk=1,...,N (xk ), y = colk=1,...,N (yk ), and (i)
[xk , yk ]H (i) = Hk xk , yk Cγk ,
k = 1, . . . , N, i = 1, 2.
k
Recall the following definition [37]. Let Kw,w be a Cq×q -valued function deKw,w )∗ = Kw ,w . Then Kw,w is fined for w and w in some set E and such that (K called a kernel with κ negative squares if for any m ∈ N, any points w1 , . . . , wm in E, and any vectors c1 , . . . , cm in Cq the matrix (c∗j Kwj ,wi ci )i,j=1,...,m ∈ Hm×m has at most κ negative eigenvalues, and has exactly κ negative eigenvalues for some choice of m, w1 , . . . , wm , c1 , . . . , cm . We will use this definition to give a characterization of the number of negative eigenvalues of the kth component Hk , k = 1, . . . , N , of the associated structured Hermitian matrix H. Theorem 4.7. Let F be a matrix-J-unitary on JN rational FPS, and let α be its minimal GR-realization of the form (3.11), with the associated structured Hermitian matrix H = diag(H1 , . . . , HN ). Then for k = 1, . . . , N the number of negative eigenvalues of the matrix Hk is equal to the number of negative squares of each of the kernels F,k Kw,w ∗
F ,k Kw,w
T
= (CA)wgk Hk−1 (A∗ C ∗ )gk w , T
= (B ∗ A∗ )wgk Hk (AB)gk w ,
w, w ∈ FN , w, w ∈ FN ,
∗
(4.15) (4.16)
For k = 1, . . . , N , denote by Kk (F ) (resp., Kk (F )) the linear span of the functions F,k F ∗ ,k q w → Kw,w c (resp., w → Kw,w c) where w ∈ FN and c ∈ C . Then dim Kk (F ) = dim Kk (F ∗ ) = γk .
Matrix-J-unitary Rational Formal Power Series
73
Proof. Let m ∈ N, w1 , . . . , wm ∈ FN , and c1 , . . . , cm ∈ Cq . Then the matrix equality F,k c) = X ∗ Hk−1 X, (c∗j Kw j ,wi i i,j=1,...,m
with
T X = row1≤i≤m (A∗ C ∗ )gk wi ci ,
F,k implies that the kernel Kw,w has at most κk negative squares, where κk denotes the number of negative eigenvalues of Hk . The pair (C, A) is observable, hence we T can choose a basis of Cq of the form xi = (A∗ C ∗ )gk wi ci , i = 1, . . . , q. Since the matrix X = rowi=1,...,q (xi ) is non-degenerate, and therefore the matrix X ∗ Hk−1 X F,k has exactly κk negative eigenvalues, the kernel Kw,w has κk negative squares. Analogously, from the controllability of the pair (A, B) one can obtain that the kernel Kk (F ∗ ) has κk negative squares. Since Kk (F ) is the span of functions (of variable w ∈ FN ) of the form (CA)wgk y, y ∈ Cγk , it follows that dim Kk (F ) ≤ γk . From the observability of the pair (C, A) we obtain that (CA)wgk y ≡ 0 implies y = 0, thus dim Kk (F ) = γk . In the same way we obtain that the controllability of the pair (A, B) implies that dim Kk (F ∗ ) = γk .
We will denote by νk (F ) the number of negative squares of either the kernel F ∗ ,k or the kernel Kw,w defined by (4.15) and (4.16), respectively.
F,k Kw,w
Theorem 4.8. Let F (i) be matrix-J-unitary on JN rational FPSs, with minimal (N (i) (i) γk GR-realizations α(i) = (N ; A(i) , B (i) , C (i) , D(i) ; Cγ = , Cq ) and the k=1 C (i) (i) associated structured Hermitian matrices H (i) = diag(H1 , . . . , HN ), respectively, (1) (2) i = 1, 2. Suppose that the product α = α α is a minimal GR-node. Then the matrix H = diag(H1 , . . . , HN ), with (1) (1) (2) (1) (2) Hk 0 Hk = (4.17) ∈ C(γk +γk )×(γk +γk ) , k = 1, . . . , N, (2) 0 Hk is the associated structured Hermitian matrix for α = α(1) α(2) . Proof. It suffices to check that (4.3) and (4.4) hold for the matrices A, B, C, D defined as in (3.7), and H = diag(H1 , . . . , HN ) where Hk , k = 1, . . . , N , are defined in (4.17). This is an easy computation which is omitted. Corollary 4.9. Let F1 and F2 be matrix-J-unitary on JN rational FPSs, and suppose that the factorization F = F1 F2 is minimal. Then νk (F F1 F2 ) = νk (F F1 ) + νk (F F2 ),
k = 1, . . . , N.
74
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
4.3. Minimal matrix-J-unitary factorizations In this subsection we consider minimal factorizations of rational formal power series which are matrix-J-unitary on JN into factors both of which are also matrix-Junitary on JN . Such factorizations will be called minimal matrix-J-unitary factorizations. Let H ∈ Cr×r be an invertible Hermitian matrix. We denote by [ · , · ]H the Hermitian sesquilinear form [x, y]H = Hx, y where · , · denotes the standard inner product of Cr . Two vectors x and y in Cr are called H-orthogonal if [x, y]H = 0. For any subspace M ⊂ Cr denote M [⊥] = {y ∈ Cr : y, mH = 0 ∀m ∈ M } . The subspace M ⊂ Cr is called non-degenerate if M ∩ M [⊥] = {0}. In this case, ·
M [+]M [⊥] = Cr ·
where [+] denotes the H-orthogonal direct sum. In the case when H = diag(H1 , . . . , HN ) is the structured Hermitian matrix associated with a given minimal GR-realization of a matrix-J-unitary on JN rational FPS F , we will call [ · , · ]H the associated inner product (associated with the given minimal GR-realization of F ). In more details, [x, y]H =
N
[xk , yk ]Hk ,
k=1
where xk , yk ∈ Cγk and x = colk=1,...,N (xk ), y = colk=1,...,N (yk ), and [xk , yk ]Hk = Hk xk , yk Cγk ,
k = 1, . . . , N.
The following theorem (as well as its proof) is analogous to its one-variable counterpart, Theorem 2.6 from [7] (see also [43, Chapter II]). Theorem 4.10. Let F be a matrix-J-unitary on JN rational FPS, and let α be its minimal GR-realization of the form (3.11), with the associated structured Her(N mitian matrix H = diag(H1 , . . . , HN ). Let M = k=1 Mk be an A-invariant subspace such that Mk ⊂ Cγk , k = 1, . . . , N , and M is non-degenerate in the associated inner product [ · , · ]H . Let Π = diag(Π1 , . . . , ΠN ) be the projection defined by ker Π = M,
ran Π = M[⊥] ,
or in more details, ker Πk = Mk ,
[⊥]
ran Πk = Mk ,
k = 1, . . . , N.
Matrix-J-unitary Rational Formal Power Series
75
Let D = D1 D2 be a factorization of D into two J-unitary factors. Then the factorization F = F1 F2 where F1 (z) =
D1 + C(IIγ − ∆(z)A)−1 ∆(z)(IIγ − Π)BD2−1 ,
F2 (z) =
D2 + D1−1 CΠ(IIγ − ∆(z)A)−1 ∆(z)B,
is a minimal matrix-J-unitary factorization of F . Conversely, any minimal matrix-J-unitary factorization of F can be obtained in such a way. For a fixed J-unitary decomposition D = D1 D2 , the correspondence between minimal matrix-J-unitary factorizations of F and non(N degenerate A-invariant subspaces of the form M = k=1 Mk , where Mk ⊂ Cγk for k = 1, . . . , N , is one-to-one. Remark 4.11. We omit here the proof, which can be easily restored, with making use of Theorem 3.9 and Corollary 3.10. Remark 4.12. Minimal matrix-J-unitary factorizations do not always exist, even for N = 1. Examples of J-unitary on iR rational functions which have non-trivial minimal factorizations but lack minimal J-unitary factorizations can be found in [4] and [7]. 4.4. Matrix-unitary rational formal power series In this subsection we specialize some of the preceding results to the case J = Iq . We call the corresponding rational formal power series matrix-unitary on JN . Theorem 4.13. Let F be a rational FPS and α be its minimal GR-realization of the form (3.11). Then F is matrix-unitary on JN if and only if the following conditions are fulfilled: a) D is a unitary matrix, i.e., DD∗ = Iq ; b) there exists an Hermitian solution H = diag(H1 , . . . , HN ), with Hk ∈ Cγk ×γk , k = 1, . . . , N , of the Lyapunov equation A∗ H + HA = −C ∗ C, and
(4.18)
C = −D−1 B ∗ H.
(4.19) The property b) is equivalent to b ) there exists an Hermitian solution G = diag(G1 , . . . , GN ), with Gk ∈ Cγk ×γk , k = 1, . . . , N , of the Lyapunov equation and
GA∗ + AG = −BB ∗ ,
(4.20)
B = −GC ∗ D−1 .
(4.21)
Proof. To obtain Theorem 4.13 from Theorem 4.1 it suffices to show that any structured Hermitian solution to the Lyapunov equation (4.18) (resp., (4.20)) is invertible. Let H = diag(H1 , . . . , HN ) be a structured Hermitian solution to (4.18), and x ∈ ker H, i.e., x = col1≤k≤N (xk ) and xk ∈ ker Hk , k = 1, . . . , N . Then
76
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
HAx, x = Ax, Hx = 0, and equation (4.18) implies Cx = 0. In particular, ˜ = col(0, . . . , 0, xk , 0, . . . , 0) where xk ∈ for every k ∈ {1, . . . , N } one can define x ker Hk is on the kth block entry of x ˜, and from C x ˜ = 0 get Ck xk = 0. Thus, ker Hk ⊂ ker Ck , k = 1, . . . , N . Consider the following block representations with respect to the decompositions Cγk = ker Hk ⊕ ran Hk : (11) (12) 0 0 Aijj Aijj (2) , H Aij = , C = = 0 C k k (22) , (21) (22) k 0 Hk Aij Aij where i, j, k = 1, . . . , N . Then (4.18) implies (A∗ H + HA)ij
(12)
and
(21) Aji
(21) ∗
= (A∗ji Hj + Hi Aij )(12) = (Aji ) Hj
(22)
= 0,
= 0, i, j = 1, . . . , N . Therefore, for any w ∈ FN we have (CA)wgk = 0 (C (2) A(22) )wgk , k = 1, . . . , N, (2)
(22)
where C (2) = row1≤k≤N (Ck ), A(22) = (Aij )i,j=1,...,N . If there exists k ∈ {1, . . . , N } such that ker Hk = {0}, then the pair (C, A) is not observable, which contradicts to the assumption on α. Thus, H is invertible. In a similar way one can show that any structured Hermitian solution G = diag(G1 , . . . , GN ) of the Lyapunov equation (4.20) is invertible. A counterpart of Theorem 4.2 in the present case is the following theorem. Theorem 4.14. Let (C, A) be an observable pair of matrices C ∈ Cq×r , A ∈ Cr×r (N rk in the sense that Cr = and Ok has full column rank for each k ∈ k=1 C {1, . . . , N }. Then there exists a matrix-unitary on JN rational FPS F with a mini(N mal GR-realization α = (N ; A, B, C, D; Cr = k=1 Crk , Cq ) if and only if the Lyapunov equation (4.18) has a structured Hermitian solution H = diag(H1 , . . . , HN ). If such a solution H exists, it is invertible, and possible choices of D and B are D0 = Iq ,
B0 = −H −1 C ∗ .
(4.22)
Finally, for a given such H, all other choices of D and B differ from D0 and B0 by a right multiplicative unitary constant matrix. The proof of Theorem 4.14 is a direct application of Theorem 4.2 and Theorem 4.13. One can prove analogously the following theorem which is a counterpart of Theorem 4.3. Theorem 4.15. Let (A, B) be a controllable pair of matrices A ∈ Cr×r , B ∈ Cr×q (N in the sense that Cr = k=1 Crk and Ck has full row rank for each k ∈ {1, . . . , N }. Then there exists a matrix-unitary on JN rational FPS F with a minimal GR(N rk q realization α = (N ; A, B, C, D; Cr = k=1 C , C ) if and only if the Lyapunov equation (4.20) has a structured Hermitian solution G = diag(G1 , . . . , GN ). If such a solution G exists, it is invertible, and possible choices of D and C are D0 = Iq ,
C0 = −B ∗ G−1 .
(4.23)
Matrix-J-unitary Rational Formal Power Series
77
Finally, for a given such G, all other choices of D and C differ from D0 and C0 by a left multiplicative unitary constant matrix. Let A = (A1 , . . . , AN ) be an N -tuple of r × r matrices. A non-zero vector x ∈ Cr is called a common eigenvector for A if there exists λ = (λ1 , . . . , λN ) ∈ CN (which is called a common eigenvalue for A) such that Ak x = λk x,
k = 1, . . . , N.
The following theorem, which is a multivariable non-commutative counterpart of statements a) and b) of Theorem 2.10 in [7], gives a necessary condition on a minimal GR-realization of a matrix-unitary on JN rational FPS. Theorem 4.16. Let F be a matrix-unitary on JN rational FPS and α be its minimal GR-realization, with the associated structured Hermitian matrix H = diag(H1 , . . . , HN ) and the associated inner products [ · , · ]Hk , k = 1, . . . , N . Let Pk denote the orthogonal projection in Cγ onto the subspace {0} ⊕ · · · ⊕ {0} ⊕ Cγk ⊕ {0} ⊕ · · · ⊕ {0}, and Ak = AP Pk , k = 1, . . . , N . If x ∈ Cγ is a common eigenvector for A corresponding to a common eigenvalue λ ∈ CN then there exists Pj x, Pj x]Hj = 0. In particular, A has no j ∈ {1, . . . , N } such that Re λj = 0 and [P common eigenvalues on (iR)N . Proof. By (4.18), we have for every k ∈ {1, . . . , N }, Pk x, Pk x]Hk = − CP Pk x, CP Pk x . (λk + λk )[P Suppose that for all k ∈ {1, . . . , N } the left-hand side of this equality is zero, then CP Pk x = 0. Since for ∅ = w = gi1 · · · gi|w| ∈ FN , Pi1 Ai2 · · · Ai|w| · Ak x = λi2 · · · λi|w| λk CP Pi1 x = 0, (CA)wgk Pk x = CP the observability of the pair (C, A) implies Pk x = 0, k = 1, . . . , N , i.e., x = 0 which contradicts to the assumption that x is a common eigenvector for A. Thus, there exists j ∈ {1, . . . , N } such that (λj + λj )[P Pj x, Pj x]Hj = 0, as desired.
5. Matrix-J-unitary formal power series: A multivariable non-commutative analogue of the circle case In this section we study a multivariable non-commutative analogue of rational Cq×q -valued functions which are J-unitary on the unit circle T. 5.1. Minimal Givone–Roesser realizations and the Stein equation Let n ∈ N. We denote by Tn×n the matrix unit circle Tn×n = W ∈ Cn×n : W W ∗ = In , i.e., the family of unitary n × n complex matrices. We will call the set (Tn×n ) the matrix unit torus. The set 0 N TN = Tn×n n∈N
N
78
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
serves as a multivariable non-commutative counterpart of the unit circle. Let J = J −1 = J ∗ ∈ Cq×q . We will say that a rational FPS f is matrix-J-unitary on TN if for every n ∈ N, f (W )(J ⊗ In )f (W )∗ = J ⊗ In N
at all points W = (W W1 , . . . , WN ) ∈ (Tn×n ) where it is defined. In the following theorem we establish the relationship between matrix-J-unitary rational FPSs on JN and on TN , their minimal GR-realizations, and the structured Hermitian solutions of the corresponding Lyapunov and Stein equations. Theorem 5.1. Let f be a matrix-J-unitary on TN rational FPS, with a minimal GR-realization α of the form (3.11), and let a ∈ T be such that −¯ a ∈ σ(A). Then F (z) = f (a(z1 − 1)(z1 + 1)−1 , . . . , a(zN − 1)(zN + 1)−1 )
(5.1)
is well defined as a rational FPS which is matrix-J-unitary on JN , and F = Tβnc, (N where β = (N ; Aa , Ba , Ca , Da ; Cγ = k=1 Cγk , Cq ), with √ −1 Aa = (aA , Ba = 2(aA + Iγ )−1 aB, γ) √− Iγ )(aA + I−1 (5.2) Ca = 2C(aA + Iγ ) , Da = D − C(aA + Iγ )−1 aB. A GR-node β is minimal, and its associated structured Hermitian matrix H = diag(H1 , . . . , HN ) is the unique invertible structured Hermitian solution of ∗ A B H 0 A B H 0 = . (5.3) C D 0 J C D 0 J Proof. For any a ∈ T and n ∈ N the Cayley transform Z0 −→ W0 = a(Z0 − In )(Z0 + In )−1 maps iHn×n onto Tn×n , thus its simultaneous application to each matrix variable maps (iHn×n )N onto (Tn×n )N . Since the simultaneous application of the Cayley transform to each formal variable in a rational FPS gives a rational FPS, (5.1) defines a rational FPS F. Since f is matrix-J-unitary on TN , F is matrix-J-unitary on JN . Moreover, −1 F (z) = D + C Iγ − a(∆(z) − Iγ )(∆(z) + Iγ )−1 A ×a(∆(z) − Iγ )(∆(z) + Iγ )−1 B = D + C (∆(z) + Iγ − a(∆(z) − Iγ )A)−1 a(∆(z) − Iγ )B = D + C (aA + Iγ − ∆(z)(aA − Iγ ))−1 a(∆(z) − Iγ )B −1 ∆(z)aB = D + C(aA + Iγ )−1 Iγ − ∆(z)(aA − Iγ )(aA + Iγ )−1 −1 −C(aA + Iγ )−1 Iγ − ∆(z)(aA − Iγ )(aA + Iγ )−1 aB = D − C(aA + Iγ )−1 aB + C(aA + Iγ )−1 −1 × Iγ − ∆(z)(aA − Iγ )(aA + Iγ )−1 ×∆(z) Iγ − (aA − Iγ )(aA + Iγ )−1 aB = Da + Ca (IIγ − ∆(z)Aa )−1 ∆(z)Ba .
Matrix-J-unitary Rational Formal Power Series
79
Thus, F = Tβnc. Let us remark that the FPS
ϕak (z) = Ca (IIγ − ∆(z)Aa )−1 -Cγk
(c.f. (3.5)) has the coefficients Ca Aa )wgk , (ϕak )w = (C
w ∈ FN .
Remark also that ϕ˜k (z) : = ϕk a(z1 − 1)(z1 + 1)−1 , . . . , a(zN − 1)(zN + 1)−1 −1 - γ = C Iγ − a(∆(z) − Iγ )(∆(z) + Iγ )−1 A C k −1 = C ((∆(z) + Iγ ) − a(∆(z) − Iγ )A) (∆(z) + Iγ )-Cγk −1 = C ((aA + Iγ ) − ∆(z)(aA − Iγ )) (∆(z) + Iγ )-Cγk −1 = C(aA + Iγ )−1 Iγ − ∆(z)(aA − Iγ )(aA + Iγ )−1 (∆(z) + Iγ )-Cγk - 1 = √ Ca (IIγ − ∆(z)Aa )−1 -Cγk (zk + 1) 2 1 = √ (ϕak (z) · zk + ϕak (z)) . 2 qγ−1
Let k ∈ {1, . . . , N } be fixed. Suppose that/n ∈ N, n ≥ (qγ − 1) (for qγ − 1 = 0 choose arbitrary n ∈ N), and x ∈ Z∈Γn (ε) ker ϕak (Z), where Γn (ε) is a neighborhood of the origin of Cn×n where ϕak (Z) is well defined, e.g., of the form (2.9) with ε = Aa −1 . Then, by Theorem 3.1 and Theorem 2.1, one has ⎞ ⎛ ) ) ker ϕak (Z) = ⎝ ker (ϕak )w ⎠ ⊗ Cn ⎛ =⎝
w∈F FN : |w|≤qγ−1
Z∈Γn (ε)
)
⎞
˜k (β) ⊗ Cn . ker (C Ca Aa )wgk ⎠ ⊗ Cn = ker O
w∈F FN : |w|≤qγ−1
˜k (β), {y (µ) }lµ=1 ⊂ Cn such that Thus, there exist l ∈ N, {u(µ) }lµ=1 ⊂ ker O x=
l
u(µ) ⊗ y (µ) .
(5.4)
µ=1
Since (ϕak (z) · zk )wgk = (C Ca Aa )wgk for w ∈ FN , and (ϕak (z) · zk )w = 0 for w = wgk with any w ∈ FN , (5.4) implies that ϕak (Z)(IIγk ⊗ Zk )x ≡ 0. Thus, 1 ϕ˜k (Z)x = √ (ϕak (Z)(IIγk ⊗ Zk ) + ϕak (Z)) x ≡ 0. 2 Since the Cayley transform a(∆(z)−IIγ )(∆(z)+IIγ )−1 maps an open and dense subset of the set of matrices of the form ∆(Z) = diag (Z1 , . . . , ZN ), Zj ∈ Cγj ×γj , j = 1, . . . , N , onto an open and dense subset of the same set, ϕk (Z)x = (C ⊗ In )(IIγ − ∆(Z)(A ⊗ In ))−1 x ≡ 0.
80
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Since the GR-node α is observable, by Theorem 3.7 we get x = 0. Therefore, ) ker ϕak (Z) = 0, k = 1, . . . , N. Z∈Γn (ε)
Applying Theorem 3.7 once again, we obtain the observability of the GR-node β. In the same way one can prove the controllability of β. Thus, β is minimal. Note that ∗ H 0 A B H 0 A B − = 0 J C D 0 J C D ∗ A∗ HB + C ∗ JD A HA + C ∗ JC − H . (5.5) = B ∗ HA + D∗ JC B ∗ HB + D∗ JD − J Since −a ¯∈ / σ(A), the matrix (aA + Iγ )−1 is well defined, as well as Aa = (aA − Iγ )(aA + Iγ )−1 , and Iγ − Aa = 2(aA + Iγ )−1 is invertible. Having this in mind, one can deduce from (5.2) the following relations: A∗ HA + C ∗ JC − H = 2(IIγ − A∗a )−1 (A∗a H + HAa + Ca∗ JC Ca )(IIγ − Aa )−1 B ∗ HA + D∗ JC
√ 2(Ba∗ H + Da∗ JC Ca )(IIγ − Aa )−1 √ ∗ + 2Ba (IIγ − A∗a )−1 (A∗a H + HAa + Ca∗ JC Ca )(IIγ − Aa )−1 =
B ∗ HB + D∗ JD − J =
Ba∗ (IIγ − A∗a )−1 (A∗a H + HAa + Ca∗ JC Ca )(IIγ − Aa )−1 Ba
+
(Ba∗ H + Da∗ JC Ca )(IIγ − Aa )−1 Ba + Ba∗ (IIγ − A∗a )−1 (C Ca∗ JDa + HBa ).
Thus, A, B, C, D, H satisfy (5.3) if and only if Aa , Ba , Ca , Da , H satisfy (4.3) and (4.4) (in the place of A, B, C, D, H therein), which completes the proof. We will call the invertible Hermitian solution H = diag(H1 , . . . , HN ) of (5.3), which is determined uniquely by a minimal GR-realization α of a matrix-J-unitary on TN rational FPS f , the associated structured Hermitian matrix (associated with a minimal GR-realization α of f ). Let us note also that since for the GR-node β from Theorem 5.1 a pair of the equalities (4.3) and (4.4) is equivalent to a pair of the equalities (4.5) and (4.6), the equality (5.3) is equivalent to ∗ −1 −1 H H 0 A B 0 A B = . (5.6) C D 0 J C D 0 J Remark 5.2. Equality (5.3) can be replaced by the following three equalities: H − A∗ HA D∗ JC J − D∗ JD
= C ∗ JC, = −B ∗ HA,
(5.7) (5.8)
= B ∗ HB,
(5.9)
Matrix-J-unitary Rational Formal Power Series
81
and equality (5.6) can be replaced by H −1 − AH −1 A∗ DJB
∗
J − DJD
∗
= = =
BJB ∗ , −CH CH
−1
−1
(5.10) ∗
A ,
(5.11)
∗
(5.12)
C .
Theorem 5.1 allows to obtain a counterpart of the results from Section 4 in the setting of rational FPSs which are matrix-J-unitary on TN . We will skip the proofs when it is clear how to get them. Theorem 5.3. Let f be a rational FPS and α be its minimal GR-realization of the form (3.11). Then f is matrix-J-unitary on TN if and only if there exists an invertible Hermitian matrix H = diag(H1 , . . . , HN ), with Hk ∈ Cγk ×γk , k = 1, . . . , N , which satisfies (5.3), or equivalently, (5.6). Remark 5.4. In the same way as in [7, Theorem 3.1] one can show that if a rational FPS f has a (not necessarily minimal) GR-realization (3.8) which satisfies (5.3) (resp., (5.6)), with an Hermitian invertible matrix H = diag(H1 , . . . , HN ), then for any n ∈ N, f (Z )∗ (J ⊗ In )f (Z) =
−1
J ⊗ In − (B ∗ ⊗ In ) (IIγ ⊗ In − ∆(Z ∗ )(A∗ ⊗ In ))
×
(H ⊗ In )(IIγ ⊗ In − ∆(Z )∗ ∆(Z))
×
(IIγ ⊗ In − (A ⊗ In )∆(Z))
−1
(B ⊗ In )
(5.13)
and respectively, f (Z)(J ⊗ In )f (Z )∗
= J ⊗ In − (C ⊗ In ) (IIγ ⊗ In − ∆(Z)(A ⊗ In ))−1 × (IIγ ⊗ In − ∆(Z)∆(Z )∗ )(H −1 ⊗ In ) −1
× (IIγ ⊗ In − (A∗ ⊗ In )∆(Z )∗ )
(C ∗ ⊗ In ),
(5.14)
N
at all the points Z, Z ∈ (Cn×n ) where it is defined, which implies that f is matrix-J-unitary on TN . Moreover, the same statement holds true if H = diag(H1 , . . . , HN ) in (5.3) and (5.13) is not supposed to be invertible, and if −1 ) in (5.6) and (5.14) is replaced by any Hermitian, H −1 = diag(H1−1 , . . . , HN Y1 , . . . , YN ). not necessarily invertible matrix Y = diag(Y Theorem 5.5. Let f be a matrix-J-unitary on TN rational FPS, and α be its GRrealization. Let H = diag(H1 , . . . , HN ) with Hk ∈ Crk ×rk , k = 1, . . . , N , be an Hermitian invertible matrix satisfying (5.3) or, equivalently, (5.6). Then α is observable if and only if α is controllable. Proof. Let a ∈ T, −a ¯ ∈ / σ(A). Then F defined by (5.1) is a matrix-J-unitary on JN rational FPS, and (5.2) is its GR-realization. As shown in the proof of Theorem 5.1, α is observable (resp., controllable) if and only if so is β. Since by Theorem 5.1 the GR-node β satisfies (4.3) and (4.4) (equivalently, (4.5) and (4.6)), Theorem 4.4 implies the statement of the present theorem.
82
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Theorem 5.6. Let f be a matrix-J-unitary on TN rational FPS and α be its minimal GR-realization of the form (3.11), with the associated structured Hermitian matrix H. If D = f∅ is invertible then so is A, and A−1 = H −1 (A× )∗ H.
(5.15)
Proof. It follows from (5.8) that C = −JD−∗ B ∗ HA. Then (5.7) turns into H − A∗ HA = C ∗ J(−JD−∗ B ∗ HA) = −C ∗ D−∗ B ∗ HA, which implies that H = (A× )∗ HA, and (5.15) follows.
The following two lemmas, which are used in the sequel, can be found in [7]. Lemma 5.7. Let A ∈ Cr×r , C ∈ Cq×r , where A is invertible. Let H be an invertible Hermitian matrix and J be a signature matrix such that H − A∗ HA = C ∗ JC. Let a ∈ T, a ∈ / σ(A). Define Da
=
Ba Then
A C
=
Ba Da
Iq − CH −1 (IIr − aA∗ )−1 C ∗ J, −H
∗ H 0
−1
−∗
A
0 J
(5.16)
∗
C JDa .
A C
Ba Da
(5.17) H = 0
0 . J
Lemma 5.8. Let A ∈ Cr×r , B ∈ Cr×q , where A is invertible. Let H be an invertible Hermitian matrix and J be a signature matrix such that H −1 − AH −1 A∗ = BJB ∗ . Let a ∈ T, a ∈ / σ(A). Define Da Ca Then
A Ca
B Da
=
Iq − JB ∗ (IIr − aA∗ )−1 HB,
(5.18)
=
−Da JB ∗ A−∗ H.
(5.19)
−1 H 0
0 J
A Ca
B Da
∗
−1 H = 0
0 . J
Theorem 5.9. Let (C, A) be an observable pair of matrices C ∈ Cq×r , A ∈ Cr×r (N rk in the sense that Cr = and Ok has full column rank for each k ∈ k=1 C {1, . . . , N }. Let A be invertible and J ∈ Cq×q be a signature matrix. Then there exists a matrix-J-unitary on TN rational FPS f with a minimal GR-realization (N rk q α = (N ; A, B, C, D; Cr = k=1 C , C ) if and only if the Stein equation (5.7) has a structured solution H = diag(H1 , . . . , HN ) which is both Hermitian and invertible. If such a solution H exists, possible choices of D and B are Da and Ba defined in (5.16) and (5.17), respectively. For a given such H, all other choices of D and B differ from Da and Ba by a right multiplicative J-unitary constant matrix.
Matrix-J-unitary Rational Formal Power Series
83
Proof. Let H = diag(H1 , . . . , HN ) be a structured solution of the Stein equation (5.7) which is both Hermitian and invertible, Da and Ba are defined as in (5.16) and (5.17), respectively, where a ∈ T, a ∈ / σ(A). Set αa = (N ; A, Ba , C, Da ; Cr = (N rk q nc k=1 C , C ). By Lemma 5.7 and due to Remark 5.4, the transfer function Tα of αa is a matrix-J-unitary on TN rational FPS. Since αa is observable, by Theorem 5.5 αa is controllable, and thus, minimal. (N rk q Conversely, if α = (N ; A, B, C, D; Cr = k=1 C , C ) is a minimal GRnode whose transfer function is matrix-J-unitary on TN then by Theorem 5.3 there exists a solution H = diag(H1 , . . . , HN ) of the Stein equation (5.7) which is both Hermitian and invertible. The rest of the proof is analogous to the one of Theorem 4.2. Analogously, one can obtain the following. Theorem 5.10. Let (A, B) be a controllable pair of matrices A ∈ Cr×r , B ∈ Cr×q in (N the sense that Cr = k=1 Crk and Ck has full row rank for each k ∈ {1, . . . , N }. Let A be invertible and J ∈ Cq×q be a signature matrix. Then there exists a matrix-J-unitary on TN rational FPS f with a minimal GR-realization α = (N (N ; A, B, C, D; Cr = k=1 Crk , Cq ) if and only if the Stein equation G − AGA∗ = BJB ∗
(5.20)
has a structured solution G = diag(G1 , . . . , GN ) which is both Hermitian and invertible. If such a solution G exists, possible choices of D and C are Da and Ca defined in (5.16) and (5.17), respectively, where H = G−1 . For a given such G, all other choices of D and C differ from Da and Ca by a left multiplicative J-unitary constant matrix. 5.2. The associated structured Hermitian matrix In this subsection we give the analogue of the results of Section 4.2. The proofs are similar and will be omitted. Lemma 5.11. Let f be a matrix-J-unitary on TN rational FPS and α(i) = (N γk q (N ; A(i) , B (i) , C (i) , D; Cγ = k=1 C , C ) be its minimal GR-realizations, with (i) (i) the associated structured Hermitian matrices H (i) = diag(H1 , . . . , HN ), i = 1, 2. (1) (2) Then α and α are similar, that is C (1) = C (2) T,
T A(1) = A(2) T,
and
T B (1) = B (2) ,
for a uniquely defined invertible matrix T = diag (T T1 , . . . , TN ) ∈ Cγ×γ and (1)
Hk
In particular, the matrices
= Tk∗ Hk Tk , (2)
(1) Hk
and
(2) Hk
k = 1, . . . , N. have the same signature.
Theorem 5.12. Let f be a matrix-J-unitary on TN rational FPS, and let α be its minimal GR-realization of the form (3.11), with the associated structured Hermitian matrix H = diag(H1 , . . . , HN ). Then for each k ∈ {1, . . . , N } the number of
84
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
negative eigenvalues of the matrix Hk is equal to the number of negative squares of each of the kernels (on FN ): T
f,k wgk −1 Kw,w Hk (A∗ C ∗ )gk w , = (CA) ∗
(5.21)
T
f ,k ∗ ∗ wgk Kw,w Hk (AB)gk w . = (B A )
Finally, for k ∈ {1, . . . , N } let Kk (f ) (resp., Kk (f ∗ )) be the span of the functions f,k f ∗ ,k q w → Kw,w c (resp., w → Kw,w c) where w ∈ FN and c ∈ C . Then dim Kk (f ) = dim Kk (f ∗ ) = γk . We will denote by νk (f ) the number of negative squares of either of the functions defined in (5.21). Theorem 5.13. Let fi , i = 1, 2, be two matrix-J-unitary on TN rational FPSs, with minimal GR-realizations N ' (i) (i) (i) (i) (i) γ (i) γk q = C ,C α = N ; A , B , C , D; C k=1 (i)
(i)
and the associated structured Hermitian matrices H (i) = diag(H1 , . . . , HN ). Assume that the product α = α(1) α(2) is a minimal GR-node. Then, for each k ∈ {1, . . . , N } the matrix (1) (1) (2) (1) (2) Hk 0 Hk = ∈ C(γk +γk )×(γk +γk ) (5.22) (2) 0 Hk is the associated kth Hermitian matrix for α = α(1) α(2) . Corollary 5.14. Let f1 and f2 be two matrix-J-unitary on TN rational FPSs, and assume that the factorization f = f1 f2 is minimal. Then, ν(f1 f2 ) = ν(f1 ) + ν(ff2 ). 5.3. Minimal matrix-J-unitary factorizations In this subsection we consider minimal factorizations of matrix-J-unitary on TN rational FPSs into two factors, both of which are also matrix-J-unitary on TN rational FPSs. Such factorizations will be called minimal matrix-J-unitary factorizations. The following theorem is analogous to its one-variable counterpart [7, Theorem 3.7] and proved in the same way. Theorem 5.15. Let f be a matrix-J-unitary on TN rational FPS and α be its minimal GR-realization of the form (3.11), with the associated structured Hermitian matrix H = diag(H1 , . . . , HN ), and assume that D is invertible. Let (N γ M = k=1 Mk be an A-invariant subspace of C , which is non-degenerate in the associated inner product [ · , · ]H and such that Mk ⊂ Cγk , k = 1, . . . , N . Let Π = diag(Π1 , . . . , ΠN ) be a projection defined by ker Π = M,
and
ran Π = M [⊥] ,
Matrix-J-unitary Rational Formal Power Series
85
that is [⊥]
f or k = 1, . . . , N. ker Πk = Mk , and ran Πk = Mk Then f (z) = f1 (z)ff2 (z), where 2 3 f1 (z) = Iq + C(IIγ − ∆(z)A)−1 ∆(z)(IIγ − Π)BD−1 D1 , 3 2 f2 (z) = D2 Iq + D−1 CΠ(IIγ − ∆(z)A)−1 ∆(z)B ,
(5.23) (5.24)
with D1 = Iq − CH −1 (IIγ − aA∗ )−1 C ∗ J, D = D1 D2 , where a ∈ T belongs to the resolvent set of A1 , and where C1 = C - , A1 = A- , H1 = PM H M
M
M
(with PM being the orthogonal projection onto M in the standard metric of Cγ ), is a minimal matrix-J-unitary factorization of f . Conversely, any minimal matrix-J-unitary factorization of f can be obtained in such a way, and the correspondence between minimal matrix-J-unitary factorizations of f with f1 (a, . . . , a) = Iq and non-degenerate subspaces of A of the form (N M = k=1 Mk , with Mk ⊂ Cγk , k = 1, . . . , N , is one-to-one. Remark 5.16. In the proof of Theorem 5.15, as well as of Theorem 4.10, we make use of Theorem 3.9 and Corollary 3.10. Remark 5.17. Minimal matrix-J-unitary factorizations do not always exist, even in the case N = 1. See [7] for examples in that case. 5.4. Matrix-unitary rational formal power series In this subsection we specialize some of the results in the present section to the case J = Iq . We shall call corresponding rational FPSs matrix-unitary on TN . Theorem 5.18. Let f be a rational FPS and α be its minimal GR-realization of the form (3.11). Then f is matrix-unitary on TN if and only if: (a) There exists an Hermitian matrix H = diag(H1 , . . . , HN ) (with Hk ∈ Cγk ×γk , k = 1, . . . , N ) such that ∗ A B H 0 H 0 A B = . (5.25) C D 0 Iq C D 0 Iq Condition (a) is equivalent to: (a ) There exists an Hermitian matrix G = diag (G1 , . . . , GN ) (with Gk ∈ γk ×γk C , k = 1, . . . , N ) such that ∗ G 0 A B G 0 A B = . (5.26) 0 Iq C D 0 Iq C D Proof. The necessity follows from Theorem 5.1. To prove the sufficiency, suppose that the Hermitian matrix H = diag(H1 , . . . , HN ) satisfies (5.25) and let a ∈ T be such that −a ∈ σ(A). Then, H satisfies conditions (4.18) and (4.19) for the GR(N node β = (N ; Aa , Ba , Ca , Da ; Cγ = k=1 Cγk , Cq ) defined by (5.2) (this follows
86
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
from the proof of Theorem 5.1). Thus, from Theorem 4.13 and Theorem 5.1 we obtain that f is matrix-unitary on TN . Analogously, condition (a ) implies that the FPS f is matrix-unitary on TN . A counterpart of Theorem 4.14 in the present case is the following theorem: Theorem 5.19. Let (C, A) be an observable pair of matrices in the sense that Ok has full column rank for each k = 1, . . . , N . Assume that A ∈ Cr×r is invertible. Then there exists a matrix-unitary on TN rational FPS f with a minimal GR-realization (N α = (N ; A, B, C, D; Cr = k=1 Crk , Cq ) if and only if the Stein equation H − A∗ HA = C ∗ C
(5.27) rk ×rk
, k = has an Hermitian solution H = diag(H1 , . . . , HN ), with Hk ∈ C 1, . . . , N . If such a matrix H exists, it is invertible, and possible choices of D and B are Da and Ba given by (5.16) and (5.17) with J = Iq . Finally, for a given H = diag(H1 , . . . , HN ), all other choices of D and B differ from Da and Ba by a right multiplicative unitary constant. A counterpart of Theorem 4.15 is the following theorem: Theorem 5.20. Let (A, B) be a controllable pair of matrices, in the sense that Ck has full row rank for each k = 1, . . . , N . Assume that A ∈ Cr×r is invertible. Then there exists a matrix-unitary on TN rational FPS f with a minimal GR-realization (N α = (N ; A, B, C, D; Cr = k=1 Crk , Cq ) if and only if the Stein equation G − AGA∗ = BB ∗
(5.28) rk ×rk
has an Hermitian solution G = diag(G1 , . . . , GN ) with Gk ∈ G , k = 1, . . . , N . If such a matrix G exists, it is invertible, and possible choices of D and C are Da and Ca given by (5.18) and (5.19) with H = G−1 and J = Iq . Finally, for a given G = diag(G1 , . . . , GN ), all other choices of D and C differ from Da and Ca by a left multiplicative unitary constant. A counterpart of Theorem 4.16 in the present case is the following: Theorem 5.21. Let f be a matrix-unitary on TN rational FPS and α be its minimal GR-realization of the form (3.11), with the associated structured Hermitian matrix H = diag(H1 , . . . , HN ) and the associated kth inner products [·, ·]Hk , k = 1, . . . , N . Let Pk denote the orthogonal projection in Cγ onto the subspace {0} ⊕ · · · ⊕ {0} ⊕ γ Cγk ⊕ {0} ⊕ · · · ⊕ {0}, and set Ak = AP Pk for k = 1, . . . , N . If x ∈ C is a common eigenvector for A = A1 , . . . , AN corresponding to a common eigenvalue λ = (λ1 , . . . , λN ) ∈ CN , then there exists j ∈ {1, . . . , N } such that |λj | = 1 and [P Pj x, Pj x]Hj = 0. In particular A has no common eigenvalues on TN . The proof of this theorem relies on the equality (1 − |λk |2 )[P Pk x, Pk x]Hk = CP Pk x, CP Pk x,
k = 1, . . . , N,
and follows the same argument as the proof of Theorem 4.16.
Matrix-J-unitary Rational Formal Power Series
87
6. Matrix-J-inner rational formal power series 6.1. A multivariable non-commutative analogue of the half-plane case Let n ∈ N. We define the matrix open right poly-half-plane as the set 7 N n×n N 6 = Z = (Z1 , . . . , ZN ) ∈ Cn×n : Zk + Zk∗ > 0, k = 1, . . . , N , Π and the matrix closed right poly-half-plane as the set N N = clos Πn×n clos Πn×n 6 7 N = Z = (Z1 , . . . , ZN ) ∈ Cn×n : Zk + Zk∗ ≥ 0, k = 1, . . . , N . We also introduce PN =
0 N Πn×n
and clos PN =
n∈N
It is clear that
0
N clos Πn×n .
n∈N
n×n N N iH ⊂ clos Πn×n N
is the essential (or Shilov ) boundary of the matrix poly-half-plane (Πn×n ) (see 1 N [45]) and that JN ⊂ clos PN (recall that JN = n∈N (iHn×n ) ). Let J = J −1 = J ∗ ∈ Cq×q . A matrix-J-unitary on JN rational FPS F is called matrix-J-inner (in PN ) if for each n ∈ N: F (Z)(J ⊗ In )F (Z)∗ ≤ J ⊗ In
(6.1)
N
at those points Z ∈ clos (Πn×n ) where it is defined (the set of such points is N open and dense, in the relative topology, in clos (Πn×n ) since F (Z) is a rational matrix-valued function of the complex variables (Zk )ij , k = 1, . . . , N, i, j = 1, . . . , n). The following theorem is a counterpart of part a) of Theorem 2.16 of [7]. Theorem 6.1. Let F be a matrix-J-unitary on JN rational FPS and α be its minimal GR-realization of the form (3.11). Then F is matrix-J-inner in PN if and only if the associated structured Hermitian matrix H = diag(H1 , . . . , HN ) is strictly positive. Proof. Let n ∈ N. Equality (4.9) can be rewritten as ∗
∗
J ⊗ In − F (Z)(J ⊗ In )F (Z ) = ϕ(Z)∆(Z + Z )(H −1 ⊗ In )ϕ(Z ) where ϕ is a FPS defined by ϕ(z) := C(IIγ − ∆(z)A)−1 ∈ Cq×γ z1 , . . . , zN rat , and (6.2) is well defined at all points Z, Z ∈ (Cn×n )N for which 1 ∈ σ (∆(Z)(A ⊗ In )) ,
1 ∈ σ (∆(Z )(A ⊗ In )) .
∗
(6.2)
88
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Set ϕk (z) := C(IIγ − ∆(z)A)−1 -Cγk ∈ Cq×γk z1 , . . . , zN rat , k = 1, . . . , N. Then (6.2) becomes: ∗
J ⊗ In − F (Z)(J ⊗ In )F (Z ) =
N
∗
∗
ϕk (Z)(Hk−1 ⊗ (Zk + Zk ))ϕk (Z ) .
(6.3)
k=1
Let X ∈ Hn×n be some positive semidefinite matrix, let Y ∈ (Hn×n )N be such that 1 ∈ σ(∆(iY )(A ⊗ In )), and set for k = 1, . . . , N : ek := (0, 0, . . . , 0, 1, 0, . . . , 0) ∈ CN with 1 at the kth place. Then for λ ∈ C set (k)
Y1 , . . . , iY Yk−1 , λX + iY Yk , iY Yk+1 , . . . , iY YN ). ZX,Y (λ) := λX ⊗ ek + iY = (iY Now, (6.3) implies that ∗
ZX,Y (λ))(J ⊗ In )F (Z ZX,Y (λ )) J ⊗ In − F (Z (k)
(k)
∗
= (λ + λ )ϕk (Z ZX,Y (λ))(Hk−1 ⊗ X)ϕk (Z ZX,Y (λ )) . (k)
(k)
(6.4)
(k)
The function h(λ) = F (Z ZX,Y (λ)) is a rational function of λ ∈ C. It is easily seen from (6.4) that h is (J ⊗ In )-inner in the open right half-plane. In particular, it is (J ⊗ In )-contractive in the closed right half-plane (this also follows directly from (6.1)). Therefore (see, e.g., [22]) the function Ψ(λ, λ ) =
∗
ZX,Y (λ))(J ⊗ In )F (Z ZX,Y )(λ ) J ⊗ In − F (Z (k)
(k)
(6.5) λ + λ is a positive semidefinite kernel on C: for every choice of r ∈ N, of points λ1 , . . . , λr ∈ C for which the matrices Ψ(λj , λi ) are well defined, and vectors c1 , . . . , cr ∈ Cq ⊗ Cn one has r
c∗j Ψ(λj , λi )ci ≥ 0,
i,j=1 (k)
ZX,Y (0)) = i.e., the matrix (Ψ(λj , λi ))i,j=1,...,r is positive semidefinite. Since ϕk (Z ϕk (iY ) is well defined, we obtain from (6.4) that Ψ(0, 0) is also well defined and Ψ(0, 0) = ϕk (iY )(Hk−1 ⊗ X)ϕk (iY )∗ ≥ 0. This inequality holds for every n ∈ N, every positive semidefinite X ∈ Hn×n and every Y ∈ (Hn×n )N . Thus, for an arbitrary r ∈ N we can define n
= nr, Y = (1) (r) (j) n
× n N (Y 1 , . . . , Y N ) ∈ (H ) , where Y k = diag(Y Yk , . . . , Yk ) and Yk ∈ Hn×n , k = 1, . . . , N, j = 1, . . . , r, such that ϕk (iY ) is well defined, ⎛ ⎞ In · · · In . .. ⎟ ∈ Cn×n ⊗ Cr×r ∼ Cn × n
=⎜ X = ⎝ .. .⎠ In
···
In
Matrix-J-unitary Rational Formal Power Series
89
and get
k (iY )∗ 0 ≤ ϕk (iY )(Hk−1 ⊗ X)ϕ = diag(ϕk (iY (1) ), . . . , ϕk (iY (r) ))× ⎛ ⎞ ⎛ ⎞ In ⎟ ⎜ ⎜ ⎟ × ⎝Hk−1 ⊗ ⎝ ... ⎠ In · · · In ⎠ diag(ϕk (iY (1) )∗ , . . . , ϕk (iY (r) )∗ ) In ⎛ ⎞ ϕk (iY (1) ) ⎜ ⎟ −1 .. =⎝ ⎠ (Hk ⊗ In ) ϕk (iY (1) )∗ . ϕk (iY (r) ) = ϕk (iY (µ) )(Hk−1 ⊗ In )ϕk (iY (ν) )∗
ϕk (iY (r) )∗
···
.
µ,ν=1,...,r
Therefore, the function Kk (iY, iY ) = ϕk (iY )(Hk−1 ⊗ In )ϕk (iY )∗ is a positive semidefinite kernel on any subset of (iHn×n )N where it is defined, and in particular in some neighborhood of the origin. One can extend this function to Kk (Z, Z ) = ϕk (Z)(Hk−1 ⊗ In )ϕk (Z )∗
(6.6)
at those points Z, Z ∈ (C ) × (C ) where ϕk is defined. Thus, on some neighborhood Γ of the origin in (Cn×n )N × (Cn×n )N , the function Kk (Z, Z ) is holomorphic in Z and anti-holomorphic in Z . On the other hand, it is well known (see, e.g., [9]) that one can construct a reproducing kernel Hilbert space (which we will denote by H(Kk )) with reproducing kernel Kk (iY, iY ), which is obtained as the completion of H0 = span Kk (·, iY )x ; iY ∈ (iHn×n )N ∩ Γ, x ∈ Cq ⊗ Cn n×n N
n×n N
with respect to the inner product 8 r 9 (µ) (ν) Kk (·, iY )xµ , Kk (·, iY )xν µ=1
=
ν=0 r :
Kk (iY (ν) , iY (µ) )xµ , xν
µ=1 ν=1
H0
; Cq ⊗Cn
.
The reproducing kernel property reads: f (·), Kk (·, iY )xH(Kk ) = f (iY ), xCq ⊗Cn , ∗
and thus Kk (iY, iY ) = Φ(iY )Φ(iY ) where Φ(iY ) : f (·) → f (iY ) is the evaluation map. In view of (6.6), the kernel Kk (·, ·) is extendable on Γ × Γ to the function K(Z, Z ) which is holomorphic in Z and antiholomorphic in Z ,
90
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
all the elements of H(Kk ) have holomorphic continuations to Γ, and so has the function Φ(·). Thus, Kk (Z, Z ) = Φ(Z)Φ(Z )∗ and so Kk (Z, Z ) is a positive semidefinite kernel on Γ. (We could also use [3, Theorem 1.1.4, p.10] to obtain this conclusion.) Therefore, for any choice of ∈ N and Z (1) , . . . , Z () ∈ Γ the matrix ϕk (Z (µ) )(Hk−1 ⊗ In )ϕk (Z (ν) )∗ µ,ν=1,..., ⎞ ⎛ ϕk (Z (1) ) (6.7) ⎟ ⎜ .. −1 (1) ∗ () ∗ =⎝ ⊗ I ) · · (H ϕ (Z ) · · · ϕ (Z ) ⎠ n k k k . ϕk (Z () ) is positive semidefinite. Since the coefficients of the FPS ϕk are (ϕk )w = (CA)wgk , w ∈ FN , and since α is an observable GR-node, we have ) ker(CA)wgk = {0} . w∈F FN
Hence, by Theorem 2.1 we can chose n, ∈ N and Z (1) , . . . , Z () ∈ Γ such that )
ker ϕk (Z (j) ) = {0} .
j=1
Thus the matrix colj=1,..., ϕk (Z (j) ) has full column rank. (We could also use Theorem 3.7.) From (6.7) it then follows that Hk−1 > 0. Since this holds for all k ∈ {1, . . . , N }, we get H > 0. Conversely, if H > 0 then it follows from (6.2) that for every n ∈ N and N Z ∈ (Πn×n ) for which 1 ∈ σ(∆(Z)(A ⊗ In )), one has J ⊗ In − F (Z)(J ⊗ In )F (Z)∗ ≥ 0. Therefore F is matrix-J-inner in PN , and the proof is complete.
Theorem 6.2. Let F ∈ C z1 , . . . , zN rat be matrix-J-inner in PN . Then F has a minimal GR-realization of the form (3.11) with the associated structured Hermitian matrix H = Iγ . This realization is unique up to a unitary similarity. q×q
Proof. Let α◦ = (N ; A◦ , B ◦ , C ◦ , D; Cγ =
N '
Cγ k , Cq )
k=1
be a minimal GR-realization of F , with the associated structured Hermitian ma◦ trix H ◦ = diag(H1◦ , . . . , HN ). By Theorem 6.1 the matrix H ◦ is strictly positive. ◦ 1/2 ◦ 1/2 Therefore, (H ) = diag((H1◦ )1/2 , . . . , (H HN ) ) is well defined and strictly positive, and N ' Cγk , Cq ), α = (N ; A, B, C, D; Cγ = k=1
Matrix-J-unitary Rational Formal Power Series
91
where A = (H ◦ )1/2 A◦ (H ◦ )−1/2 ,
B = (H ◦ )1/2 B ◦ ,
C = C ◦ (H ◦ )−1/2 ,
(6.8)
is a minimal GR-realization of F satisfying A∗ + A =
−C ∗ JC,
(6.9)
=
∗
−C JD,
(6.10)
A∗ + A = C =
−BJB ∗ , −DJB ∗ ,
(6.11) (6.12)
B or equivalently,
and thus having the associated structured Hermitian matrix H = Iγ . Since in this case the inner product [ · , · ]H coincides with the standard inner product · , · of Cγ , by Remark 4.6 this minimal GR-realization with the property H = Iγ is unique up to unitary similarity. We remark that a one-variable counterpart of the latter result is essentially contained in [20], [38] (see also [10, Section 4.2]). 6.2. A multivariable non-commutative analogue of the disk case Let n ∈ N. We define the matrix open unit polydisk as 7 N n×n N 6 D = W = (W W1 , . . . , WN ) ∈ Cn×n : Wk Wk∗ < In , k = 1, . . . , N , and the matrix closed unit polydisk as N N clos Dn×n = clos Dn×n 6 7 N = W = (W W1 , . . . , WN ) ∈ Cn×n : Wk Wk∗ ≤ In , k = 1, . . . , N . N
The matrix unit torus (Tn×n ) is the essential (or Shilov) boundary of (Dn×n ) (see [45]). In our setting, the set 0 0 n×n N n×n N DN = D resp., clos DN = clos D n∈N
N
n∈N
is a multivariable non-commutative counterpart of the open (resp., closed) unit disk. Let J = J −1 = J ∗ ∈ Cq×q . A rational FPS f which is matrix-J-unitary on TN is called matrix-J-inner in DN if for every n ∈ N: f (W )(J ⊗ In )f (W )∗ ≤ J ⊗ In N
(6.13)
at those points W ∈ clos (Dn×n ) where it is defined. We note that the set of N such points is open and dense (in the relative topology) in clos (Dn×n ) since f (W ) is a rational matrix-valued function of the complex variables (W Wk )ij , k = 1, . . . , N, i, j = 1, . . . , n.
92
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Theorem 6.3. Let f be a rational FPS which is matrix-J-unitary on TN , and let α be its minimal GR-realization of the form (3.11). Then f is matrix-J-inner in DN if and only if the associated structured Hermitian matrix H = diag(H1 , . . . , HN ) is strictly positive. Proof. The statement of this theorem follows from Theorem 6.1 and Theorem 5.1, since the Cayley transform defined in Theorem 5.1 maps each open matrix unit polydisk (Dn×n )N onto the open right matrix poly-half-plane (Πn×n )N , and the inequality (6.13) turns into (6.1) for the function F defined in (5.1). The following theorem is an analogue of Theorem 6.2. Theorem 6.4. Let f be a rational FPS which is matrix-J-inner in DN . Then there exists its minimal GR-realization α of the form (3.11), with the associated structured Hermitian matrix H = Iγ . Such a realization is unique up to a unitary similarity. In the special case of Theorem 6.4 where J = Iq the FPS f is called matrixinner, and the GR-node α satisfies ∗ A B A B = Iγ +q , C D C D i.e., α is a unitary GR-node, which has been considered first by J. Agler in [1]. In what follows we will show that Theorem 6.4 for J = Iq is a special case of the theorem of J. A. Ball, G. Groenewald and T. Malakorn on unitary GR-realizations of FPSs from the non-commutative Schur–Agler class [12], which becomes in several aspects stronger in this special case. Let U and Y be Hilbert spaces. Denote by L(U, Y) the Banach space of bounded linear operators from U into Y. A GR-node in the general setting of Hilbert spaces is α = (N ; A, B, C, D; X =
N '
Xk , U, Y),
k=1
i.e., a collection of Hilbert spaces X , X1 , . . . , XN , U, Y and operators A ∈ L(X ) = L(X , X ), B ∈ L(U, X ), C ∈ L(X , Y), and D ∈ L(U, Y). Such a GR-node α is called unitary if ∗ ∗ A B A B A B A B = IX ⊕Y , = IX ⊕U , C D C D C D C D
A B i.e., C is a unitary operator from X ⊕ U onto X ⊕ Y. The non-commutative D transfer function of α is
Tαnc (z) = D + C(I − ∆(z)A)−1 ∆(z)B,
(6.14)
Matrix-J-unitary Rational Formal Power Series
93
where the expression (6.14) is understood as a FPS from L(U, Y) z1 , . . . , zN given by ∞ w k (CAB) z w = D + C (∆(z)A) ∆(z)B. (6.15) Tαnc (z) = D + w∈F FN \{∅}
k=0
The non-commutative Schur–Agler class SAnc N (U, Y) consists of all FPSs f ∈ L(U, Y) z1 , . . . , zN such that for any separable Hilbert space K and any N tuple δ = (δ1 , . . . , δN ) of strict contractions in K the limit in the operator norm topology fw ⊗ δ w f (δ) = lim m→∞
w∈F FN : |w|≤m
exists and defines a contractive operator f (δ) ∈ L(U ⊗ K, Y ⊗ K). We note that the non-commutative Schur–Agler class was defined in [12] also for a more general class of operator N -tuples δ. ). Consider another set of non-commuting indeterminates z = (z1 , . . . , zN For f (z) ∈ L(V, Y) z1 , . . . , zN and f (z ) ∈ L(V, U) z1 , . . . , zN we define a FPS ∗ f (z)f (z ) ∈ L(U, Y) z1 , . . . , zN , z1 , . . . , zN by w T ∗ ∗ f (z)f (z ) = fw (ffw ) z w z . (6.16) w,w ∈F FN
In [12] the class
SAnc N (U, Y)
was characterized as follows:
Theorem 6.5. Let f ∈ L(U, Y) z1 , . . . , zN . The following statements are equivalent: (1) f ∈ SAnc N (U, Y); (2) there exist auxiliary Hilbert spaces H, H1 , . . . , HN which are related by H = (N k=1 Hk , and a FPS ϕ ∈ L(H, Y) z1 , . . . , zN such that ∗
∗
IY − f (z)f (z ) = ϕ(z)(IIH − ∆(z)∆(z )∗ )ϕ(z ) ; (6.17) (N (3) there exists a unitary GR-node α = (N ; A, B, C, D; X = k=1 Xk , U, Y) such that f = Tαnc. We now give another characterization of the Schur–Agler class SAnc N (U, Y). Theorem 6.6. A FPS f belongs to SAnc N (U, Y) if and only if for every n ∈ N and W ∈ (Dn×n )N the limit in the operator norm topology f (W ) = lim fw ⊗ W w (6.18) m→∞
w∈F FN : |w|≤m
exists and f (W ) ≤ 1. Proof. The necessity is clear. We prove the sufficiency. We set fk (z) = fw z w , k = 0, 1, . . . . w∈F FN : |w|=k
94
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Then for every n ∈ N and W ∈ (Dn×n )N , (6.18) becomes f (W ) = lim
m→∞
m
fk (W ),
(6.19)
k=0
where the limit is taken in the operator norm topology. Let r ∈ (0, 1) and choose τ > 0 such that r + τ < 1. Let W ∈ (Dn×n )N be such that W Wj ≤ r, j = 1, . . . , N . Then, for every x ∈ U ⊗ Cn the series ∞ r+τ r+τ k W x λW x = λ fk f r r k=0
converges uniformly in λ ∈ clos D to a Y ⊗ Cn -valued function holomorphic on clos D. Furthermore, < < < < < < < < r+τ −k−1
< k < < r k < r r+τ < < ffk (W ) =
(6.20)
Thus we have < < k m ∞ ∞ < < r < < f (W ) − f (W ) f f (W ) ≤ < ∞. ≤ < < k k < < r+τ k=0
k=m+1
k=m+1
We observe that the limit in (6.19) is uniform in n ∈ N and W ∈ (Dn×n )N such that W Wj ≤ r, j = 1, . . . , N . Without loss of generality we may assume that in the definition of the Schur–Agler class the space K is taken to be the space 2 of square summable sequences s = (sj )∞ j=1 of complex numbers indexed by N: ∞ 2 |s | < ∞. We denote by P the orthogonal projection from 2 onto the j n j=1 subspace of sequences for which sj = 0 for j > n. This subspace is isomorphic to Cn , and thus for every δ = (δ1 , . . . , δN ) ∈ L(2 )N such that δδj ≤ r, j = 1, . . . , N , we may use (6.20) and write k r ffk (P Pn δ1 Pn , . . . , Pn δN Pn ) ≤ . (6.21) r+τ Since the sequence Pn converges to I2 in the strong operator topology (see, e.g., [2]), and since strong limits of finite sums and products of operator sequences are equal to the corresponding sums and products of strong limits of these sequences, we obtain that Pn δ1 Pn , . . . , Pn δN Pn ) = fk (δ). s − lim fk (P n→∞
Thus from (6.21) we obtain ffk (δ) ≤
r r+τ
k .
Matrix-J-unitary Rational Formal Power Series
95
Therefore, the limit in the operator norm topology f (δ) = lim
m→∞
m
fk (δ)
k=0
does exist, and f (δ) ≤
∞
ffk (δ) ≤
k=0
∞ k=0
r r+τ
k < ∞.
Moreover, since the limit in (6.19) is uniform in n ∈ N and W ∈ (Dn×n )N such that W Wj ≤ r < 1, j = 1, . . . , N, the rearrangement of limits in the following chain of equalities is justified: lim f (P Pn δ1 Pn , . . . , Pn δN Pn )h = lim lim
n→∞
= lim lim
m→∞ n→∞
n→∞ m→∞
m
m
fk (P Pn δ1 Pn , . . . , Pn δN Pn )h
k=0
fk (P Pn δ1 Pn , . . . , Pn δN Pn )h = lim
m→∞
k=0
m
fk (δ)h = f (δ)h
k=0
(here h is an arbitrary vector in U ⊗ 2 and δ ∈ L(2 )N such that δδj ≤ r, j = 1, . . . , N ). Thus for every δ ∈ L(2 )N such that δδj < 1, j = 1, . . . , N , we obtain f (δ) ≤ 1, i.e., f ∈ SAnc N (U, Y). Remark 6.7. One can see from the proof of Theorem 6.6 that for arbitrary f ∈ SAnc N (U, Y) and r : 0 < r < 1, the series f (δ) =
∞
fk (δ)
k=0
converges uniformly and absolutely in δ ∈ L(K)N such that δδj ≤ r, j = 1, . . . , N , where K is any separable Hilbert space. Corollary 6.8. A matrix-inner in DN rational FPS f belongs to the class nc q q q SAnc N (C ) = SAN (C , C ). Thus, for the case J = Iq , Theorem 6.4 establishes the existence of a unitary GR-realization for an arbitrary matrix-inner rational FPS, i.e., recovers Theorem 6.5 for the case of a matrix-inner rational FPS. However, it says even more than Theorem 6.5 in this case, namely that such a unitary realization can be found minimal, thus finite-dimensional, and that this minimal unitary realization is unique up to a unitary similarity. The representation (6.17) with the rational FPS ϕ ∈ Cq×γ z1 , . . . , zN rat given by ϕ(z) = C(IIγ − ∆(z)A)−1 is obtained from (5.14) by making use of Corollary 2.2.
96
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
7. Matrix-selfadjoint rational formal power series 7.1. A multivariable non-commutative analogue of the line case A rational FPS Φ ∈ Cq×q z1 , . . . , zN rat will be called matrix-selfadjoint on JN if for every n ∈ N: Φ(Z) = Φ(Z)∗ N
at all points Z ∈ (iHn×n ) where it is defined. The following theorem is a multivariable non-commutative counterpart of Theorem 4.1 from [7] which was originally proved in [28]. Theorem 7.1. Let Φ ∈ Cq×q z1 , . . . , zN rat , and let α be a minimal GRrealization of Φ of the form (3.11). Then Φ is matrix-selfadjoint on JN if and only if the following conditions hold: (a) the matrix D is Hermitian, that is, D = D∗ ; (b) there exists an invertible Hermitian matrix H = diag(H1 , . . . , HN ) with Hk ∈ Cγk ×γk , k = 1, . . . , N, and such that A∗ H + HA = 0, C = iB ∗ H.
(7.1) (7.2)
Proof. We first observe that Φ is matrix-selfadjoint on JN if and only if the FPS F ∈ C2q×2q z1 , . . . , zN rat given by Iq iΦ(z) F (z) = (7.3) 0 Iq is matrix-J J1 -unitary on JN , where J1 =
0 Iq
Moreover, F admits the GR-realization iC I , q β = (N ; A, 0 B , 0 0
Iq . 0
(7.4)
N ' iD Cγk , C2q ). ; Cγ = Iq k=1
This realization is minimal. Indeed, the kth truncated observability (resp., controllability) matrix of β is equal to * *k (β) = iOk (α) O (7.5) 0 and, resp.,
*k (β) = 0 C
* Ck (α) ,
(7.6)
and therefore has full column (resp., row) rank. Using Theorem 4.1 of the present paper we see that Φ is matrix-selfadjoint on JN if and only if:
Matrix-J-unitary Rational Formal Power Series
(1) the matrix
Iq 0
iD Iq
97
is J1 -unitary;
(2) there exists an invertible Hermitian matrix H = diag(H1 , . . . , HN ), with Hk ∈ Cγk ×γk , k = 1, . . . , N , such that ∗ iC iC ∗ A H + HA = − J1 , 0 0 ∗ Iq iD −1 iC 0 B J1 . = −H 0 Iq 0 These conditions are in turn readily seen to be equivalent to conditions (a) and (b) in the statement of the theorem. From Theorem 4.1 it follows that the matrix H = diag(H1 , . . . , HN ) is uniquely determined by the given minimal GR-realization of Φ. In a similar way as in Section 4, it can be shown that Hk , k = 1, . . . , N , are given by the formulas wg + wg colw∈FN : |w|≤qγ−1 (CA) k Hk = − colw∈FN : |w|≤qγ−1 (B ∗ (−A∗ )) k † T gk w T ∗ ∗ gk w roww∈F = roww∈F . FN : |w|≤qγ−1 ((−A )C ) FN : |w|≤qγ−1 (AB) The matrix H = diag(H1 , . . . , HN ) is called in this case the associated structured Hermitian matrix (associated with a minimal GR-realization of the FPS Φ). N It follows from (7.1) and (7.2) that for n ∈ N and Z, Z ∈ (iHn×n ) we have: −1
(7.7) Φ(Z) − Φ(Z )∗ = i(C ⊗ In ) (IIγ ⊗ In − ∆(Z)(A ⊗ In )) −1 ∗ ∗ ∗ −1 ∗ ×∆(Z + Z ) H ⊗ In Iγ ⊗ In − (A ⊗ In )∆(Z ) (C ⊗ In ), −1 ∗ Φ(Z) − Φ(Z )∗ = i(B ∗ ⊗ In ) Iγ ⊗ In − ∆(Z )(A∗ ⊗ In ) (7.8) ∗
×∆(Z + Z ) (H ⊗ In ) (IIγ ⊗ In − (A ⊗ In )∆(Z))−1 (B ⊗ In ). Note that if A, B and C are matrices which satisfy (7.1) and (7.2) for some (not necessarily invertible) Hermitian matrix H, and if D is Hermitian, then Φ(z) = D + C(I − ∆(z)A)−1 ∆(z)B is a rational FPS which is matrix-selfadjoint on JN . This follows from the fact that (7.8) is still valid in this case (the corresponding GR-realization of Φ is, in general, not minimal). If A, B and C satisfy the equalities GA∗ + AG B
= 0, = iGC ∗
(7.9) (7.10)
for some (not necessarily invertible) Hermitian matrix G = diag(G1 , . . . , GN ) then (7.7) is valid with H −1 replaced by G (the diagonal structures of G, ∆(Z) and ∆(Z ) are compatible), and hence Φ is matrix-selfadjoint on JN . As in Section 4, we can solve inverse problems using Theorem 7.1. The proofs are easy and omitted.
98
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Theorem 7.2. Let (C, A) be an observable pair of matrices, in the sense that Ok has a full column rank for all k ∈ {1, . . . , N }. Then there exists a rational FPS which is matrix-selfadjoint on JN with a minimal GR-realization α of the form (3.11) if and only if the equation A∗ H + HA = 0 has a solution H = diag(H1 , . . . , HN ) (with Hk ∈ Cγk ×γk , k = 1, . . . , N ) which is both Hermitian and invertible. When such a solution exists, D can be any Hermitian matrix and B = iH −1 C ∗ . Theorem 7.3. Let (A, B) be a controllable pair of matrices, in the sense that Ck has a full row rank for all k ∈ {1, . . . , N }. Then there exists a rational FPS which is matrix-selfadjoint on JN with a minimal GR-realization α of the form (3.11) if and only if the equation GA∗ + AG = 0 has a solution G = diag(G1 , . . . , GN ) (with Gk ∈ Cγk ×γk , k = 1, . . . , N ) which is both Hermitian and invertible. When such a solution exists, D can be any Hermitian matrix and C = iB ∗ G−1 . From (7.5) and (7.6) obtained in Theorem 7.1, and from Theorem 4.4 we obtain the following result: Theorem 7.4. Let Φ be a matrix-selfadjoint on JN rational FPS with a GRrealization α of the form (3.8). Let H = diag(H1 , . . . , HN ) (with Hk ∈ Crk ×rk , k = 1, . . . , N ) be both Hermitian and invertible and satisfy (7.1) and (7.2). Then the GR-node α is observable if and only if it is controllable. The following Lemma is an analogue of Lemma 4.5. It is easily proved by J1 -unitary on JN function F defined in (7.3). applying Lemma 4.5 to the matrix-J Lemma 7.5. Let Φ ∈ Cq×q z1 , . . . , zN rat be matrix-selfadjoint on JN , and (N γk q let α(i) = (N ; A(i) , B (i) , C (i) , D; Cγ = k=1 C , C ) be two minimal GRrealizations of Φ, with the associated structured Hermitian matrices H (i) = (i) (i) diag(H1 , . . . , HN ), i = 1, 2. Then these two realizations and associated matrices H (i) are linked by (2.8) and (4.14). In particular, for each k ∈ {1, . . . , N } the (1) (2) matrices Hk and Hk have the same signature. For n ∈ N, points Z, Z ∈ (Cn×n )N where Φ(Z) and Φ(Z ) are well defined, F given by (7.3), and J1 defined by (7.4) we have: Φ(Z)−Φ(Z )∗ ∗ 0 i J1 ⊗ In − F (Z)(J J1 ⊗ In )F (Z ) = (7.11) 0 0 and
0 0 ∗ . (7.12) ) 0 Φ(Z)−Φ(Z i Combining these equalities with (7.7) and (7.8) and using Corollary 2.2 we obtain the following analogue of Theorem 4.7. ∗
J1 ⊗ In )F (Z) = J1 ⊗ In − F (Z ) (J
Matrix-J-unitary Rational Formal Power Series
99
Theorem 7.6. Let Φ be a matrix-selfadjoint on JN rational FPS, and let α be its minimal GR-realization of the form (3.11), with the associated structured Hermitian matrix H = diag(H1 , . . . , HN ). Then for each k ∈ {1, . . . , N } the number of negative eigenvalues of the matrix Hk is equal to the number of negative squares of the kernels Φ,k wgk −1 Kw,w Hk (A∗ C ∗ )gk w = (CA) ∗
T
T
Φ ,k ∗ ∗ wgk Kw,w Hk (AB)gk w , = (B A )
w, w ∈ FN .
(7.13)
Finally, for k ∈ {1, . . . , N }, let Kk (Φ) (resp., Kk (Φ∗ )) denote the span of the Φ,k Φ∗ ,k q functions w → Kw,w (resp., w → Kw,w ) where w ∈ FN and c ∈ C . Then, dim Kk (Φ) = dim Kk (Φ∗ ) = γk . Let Φ1 and Φ2 be two FPSs from Cq×q z1 , . . . , zN rat . The additive decomposition Φ = Φ 1 + Φ2 is called minimal if γk (Φ) = γk (Φ1 ) + γk (Φ2 ),
k = 1, . . . , N,
where γk (Φ), γk (Φ1 ) and γk (Φ2 ) denote the dimensions of the kth component of the state space of a minimal GR-realization of Φ, Φ1 and Φ2 , respectively. The following theorem is an analogue of Theorem 4.8. Theorem 7.7. Let Φi , i = 1, 2, be matrix-selfadjoint on JN rational FPSs, with (N (i) (i) γk minimal GR-realizations α(i) = (N ; A(i) , B (i) , C (i) , D(i) ; Cγ = , Cq ) k=1 C (i) (i) and the associated structured Hermitian matrices H (i) = diag(H1 , . . . , HN ). Assume that the additive decomposition Φ = Φ1 + Φ2 is minimal. Then the GR-node (N α = (N ; A, B, C, D; Cγ = k=1 Cγk , Cq ) defined by D = D(1) + D(2) ,
(1)
(2)
γk = γk + γk , (1)
k = 1, . . . , N, (2)
and with respect to the decomposition Cγ = Cγ ⊕ Cγ , (1) (1) A 0 B A= , B = C = C (1) (2) (2) , B 0 A
C (2) ,
(7.14)
is a minimal GR-realization of Φ, with the associated structured Hermitian matrix H = diag(H1 , . . . , HN ) such that for each k ∈ {1, . . . , N }: (1) Hk 0 Hk = (2) . 0 Hk Let νk (Φ) denote the number of negative squares of either of the functions defined in (7.13). In view of Theorem 7.6 and Theorem 7.1 these numbers are uniquely determined by Φ.
100
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Corollary 7.8. Let Φ1 and Φ2 be matrix-selfadjoint on JN rational FPSs, and assume that the additive decomposition Φ = Φ1 + Φ2 is minimal. Then νk (Φ) = νk (Φ1 ) + νk (Φ2 ),
k = 1, 2, . . . , N.
An additive decomposition of a matrix-selfadjoint on JN rational FPS Φ is called a minimal matrix-selfadjoint decomposition if it is minimal and both Φ1 and Φ2 are matrix-selfadjoint on JN rational FPSs. The set of all minimal matrixselfadjoint decompositions of a matrix-selfadjoint on JN rational FPS is given by the following theorem, which is a multivariable non-commutative counterpart of [7, Theorem 4.6]. The proof uses Theorem 4.10 applied to the FPS F defined by (7.3), and follows the same argument as one in the proof of Theorem 4.6 in [7]. Theorem 7.9. Let Φ be a matrix-selfadjoint on JN rational FPS, with a minimal GR-realization α of the form (3.11) and the associated structured Hermitian matrix (N H = diag(H1 , . . . , HN ). Let M = k=1 Mk be an A-invariant subspace, with Mk ⊂ Cγk , k = 1, . . . , N , and assume that M is non-degenerate in the associated inner product [ · , · ]H . Let Π = diag(Π1 , . . . , ΠN ) be the projection defined by ker Π = M,
ran Π = M[⊥] ,
that is, [⊥]
ker Πk = Mk , ran Πk = Mk , k = 1, . . . , N. Let D = D1 + D2 be a decomposition of D into two Hermitian matrices. Then the decomposition Φ = Φ1 + Φ2 , where Φ1 (z) = D1 + C(IIγ − ∆(z)A)−1 ∆(z)(IIγ − Π)B, Φ2 (z) = D2 + CΠ(IIγ − ∆(z)A)−1 ∆(z)B, is a minimal matrix-selfadjoint decomposition of Φ. Conversely, any minimal matrix-selfadjoint decomposition of Φ can be obtained in such a way, and with a fixed decomposition D = D1 + D2 , the correspondence between minimal matrix-selfadjoint decompositions of Φ and non(N degenerate A-invariant subspaces of the form M = k=1 Mk , where Mk ⊂ Cγk , k = 1, . . . , N , is one-to-one. Remark 7.10. Minimal matrix-selfadjoint decompositions do not always exist, even in the case N = 1. For counterexamples see [7]. 7.2. A multivariable non-commutative analogue of the circle case In this subsection we briefly review some analogues of the theorems presented in Section 7.1. Theorem 7.11. Let Ψ be a rational FPS and α be its minimal GR-realization of the form (3.11). Then Ψ is matrix-selfadjoint on TN (that is, for all n ∈ N one has Ψ(Z) = Ψ(Z)∗ at all points Z ∈ (Tn×n )N where Ψ is defined) if and only if there exists an invertible Hermitian matrix H = diag(H1 , . . . , HN ), with Hk ∈ Cγk ×γk , k = 1, . . . , N , such that A∗ HA = H,
D − D∗ = iB ∗ HB,
C = iB ∗ HA.
(7.15)
Matrix-J-unitary Rational Formal Power Series Proof. Consider the FPS f ∈ C2q×2q z1 , . . . , zN rat defined by Iq iΨ(z) f (z) = . 0 Iq Using Theorem 5.3, we see that f is matrix-J J1 -unitary on TN , with 0 Iq J1 = , Iq 0 if and only if its GR-realization iC I , q β = (N ; A, 0 B , 0 0
101
(7.16)
(7.17)
iD γj 2q ; Cγ = ⊕ N j=1 C , C ) Iq
(which turns out to be minimal, as can be shown in the same way as in Theorem 7.1) satisfies the following condition: there exists an Hermitian invertible matrix H = diag(H1 , . . . , HN ), with Hk ∈ Cγk ×γk , k = 1, . . . , N , such that ⎞∗ ⎛ ⎞ ⎛ ⎞⎛ ⎞ ⎛ H 0 0 H 0 0 A 0 B A 0 B ⎝iC Iq iD⎠ ⎝ 0 0 Iq ⎠ ⎝iC Iq iD⎠ = ⎝ 0 0 Iq ⎠ , 0 0 Iq 0 Iq 0 0 0 Iq 0 Iq 0 which is equivalent to the condition stated in the theorem.
For a given minimal GR-realization of Ψ the matrix H is unique, as follows from Theorem 5.1. It is called the associated structured Hermitian matrix of Ψ. The set of all minimal matrix-selfadjoint additive decompositions of a given matrix-selfadjoint on TN rational FPS is described by the following theorem, which is a multivariable non-commutative counterpart of [7, Theorem 5.2], and is proved J1 -unitary on TN FPS f defined by (7.16), by applying Theorem 5.15 to the matrix-J where J1 is defined by (7.17). (We omit the proof.) Theorem 7.12. Let Ψ be a matrix-selfadjoint on TN rational FPS and α be its minimal GR-realization of the form (3.11), with the associated structured Hermitian (N matrix H = diag(H1 , . . . , HN ). Let M = k=1 Mk be an A-invariant subspace, with Mk ⊂ Cγk , k = 1, . . . , N , and assume that M is non-degenerate in the associated inner product [·, ·]H . Let Π = diag(Π1 , . . . , ΠN ) be the projection defined by ker Π = M, ran Π = M[⊥] , that is, [⊥] ker Πk = Mk , ran Πk = Mk , k = 1, . . . , N. Then the decomposition Ψ = Ψ1 + Ψ2 , where Ψ1 (z) = D1 + C(IIγ − ∆(z)A)−1 ∆(z)(IIγ − Π)B, Ψ2 (z) = D2 + CΠ(IIγ − ∆(z)A)−1 ∆(z)B, with D1 = and
i ∗ (1) B1 2 B1 H
+ S, the matrix S being an arbitrary Hermitian matrix, B1 = PM B,
H (1) = PM H -M ,
102
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
is a minimal matrix-selfadjoint additive decomposition of Ψ (here PM denotes the orthogonal projection onto M in the standard metric of Cγ ). Conversely, any minimal matrix-selfadjoint additive decomposition of Ψ is obtained in such a way, and for a fixed S, the correspondence between minimal matrix-selfadjoint additive decompositions of Ψ and non-degenerate A-invariant (N subspaces of the form M = k=1 Mk , where Mk ⊂ Cγk , k = 1, . . . , N , is one-toone.
8. Finite-dimensional de Branges–Rovnyak spaces and backward shift realizations: The multivariable non-commutative setting In this section we describe certain model realizations of matrix-J-unitary rational FPSs. We restrict ourselves to the case of FPSs which are matrix-J-unitary on JN . Analogous realizations can be constructed for rational FPSs which are matrix-Junitary on TN or matrix-selfadjoint either on JN or TN . 8.1. Non-commutative formal reproducing kernel Pontryagin spaces Let F be a matrix-J-unitary on JN rational FPS and α be its minimal GRrealization of the form (3.11), with the associated structured Hermitian matrix H = diag(H1 , . . . , HN ). Then by Theorem 4.7, for each k ∈ {1, . . . , N } the kernel (4.15) has the number νk (F ) of negative eigenvalues equal to the number of negaF,k tive squares of Hk . Lemma 4.5 implies that the kernel Kw,w from (4.15) does not depend on the choice of a minimal realization of F . Theorem 4.7 also asserts that the span of the functions where w ∈ FN
F,k w → Kw,w c,
and c ∈ Cq ,
is the space Kk (F ) with dim Kk (F ) = γk , k = 1, . . . , N . One can introduce a new metric on each of the spaces Kk (F ) as follows. First, define an Hermitian form [ · , · ]F,k by: F,k F,k ∗ F,k [K·,w c , K·,w c]F,k = c Kw,w c . This form is easily seen to be well defined on the whole space Kk (F ), that is, if f and h belong to Kk (F ) and F,k F,k fw = Kw,w c = Kw,w c j j j
and hw =
F,k Kw,v d = s s
s
s
F,k Kw,v dt , t
t
where all the sums are finite, then ⎤ ⎡ F,k F,k ⎦ [f, h]F,k = ⎣ K·,w c , K·,v d j j s s j
F,k
A =
F,k K·,w c ,
t
B F,k K·,v dt t
. F,k
Matrix-J-unitary Rational Formal Power Series
103
Thus, the space Kk (F ) endowed with this new (indefinite) metric is a finitedimensional reproducing kernel Pontryagin space (RKPS) of functions on FN F,k with the reproducing kernel Kw,w . We refer to [46, 4, 3] for more information on the theory of reproducing kernel Pontryagin spaces. In a similar way, the space (N K(F ) = k=1 Kk (F ) endowed with the indefinite inner product [f, h]F =
N
[ffk , hk ]F,k .
k=1
where f = col (f1 , . . . , fN ) and h = col (h1 , . . . , hN ), becomes a reproducing kernel Pontryagin space with the reproducing kernel F,1 F,N F Kw,w Kw,w = diag(K , . . . , Kw,w ),
w, w ∈ F N .
F,k F Rather than the kernels Kw,w , k = 1, . . . N , and Kw,w we prefer to use the FPS kernels T F,k w w Kw,w , k = 1, . . . , N, (8.1) K F,k (z, z ) = z z w,w ∈F N
K F (z, z )
=
F w Kw,w z z
w T
,
(8.2)
w,w ∈F N
and instead of the reproducing kernel Pontryagin spaces Kk (F ) and K(F ) we will use the notion of non-commutative formal reproducing kernel Pontryagin spaces (NFRKPS for short; we will use the same notations for these spaces) which we introduce below in a way analogous to the way J.A. Ball and V. Vinnikov introduce non-commutative formal reproducing kernel Hilbert spaces (NFRKHS for short) in [14]. Consider a FPS w T Kw,w z w z ∈ L(C) z1 , . . . , zN , z1 , . . . , zN rat , K(z, z ) = w,w ∈F FN
where C is a Hilbert space. Suppose that ∗
K(z , z) = K(z, z ) =
T
∗ w w Kw,w z . z
w,w ∈F FN ∗ Then Kw,w = Kw ,w for all w, w ∈ FN . Let κ ∈ N. We will say that the FPS K(z, z ) is a kernel with κ negative squares if Kw,w is a kernel on FN with κ negative squares, i.e., for every integer and every choice of w1 , . . . , w ∈ FN and c1 , . . . , c ∈ C the × Hermitian matrix with (i, j)th entry equal to c∗i Kwi ,wj cj has at most κ strictly negative eigenvalues, and exactly κ such eigenvalues for some choice of , w1 , . . . , w , c1 , . . . , c . Define on the space G of finite sums of FPSs of the form Kw (z)c = Kw,w z w c, w∈F FN
104
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
where w ∈ FN and c ∈ C, the inner product as follows: ⎤ ⎡ ⎣ Kwi (z)ci , Kwj (z)cj ⎦ = K Kwj ,wi ci , cj C . i
j
i,j
G
It is easily seen to be well defined. The space G endowed with this inner product can be completed in a unique way to a Pontryagin space P(K) of FPSs, and in P(K) the reproducing kernel property is [f, Kw (·)c]P(K) = ffw , cC .
(8.3)
See [4, Theorem 6.4] for more details on such completions. Define the pairings [·, ·]P(K)×P(K)
z1 ,...,zN and ·, ·C
z1 ,...,zN ×C as mappings P(K) × P(K) z1 , . . . , zN → C z1 , . . . , zN and C z1 , . . . , zN × C → C z1 , . . . , zN by A B T f, gw z w = [f, gw ]P(K) z w , w∈F FN
8
w∈F FN
P(K)×P(K)
z1 ,...,zN
9
fw z w , c
w∈F FN
=
w∈F FN
C
z1 ,...,zN ×C
ffw , cC z w .
Then the reproducing kernel property (8.3) can be rewritten as [f, K(·, z)c]P(K)×P(K)
z1 ,...,zN = f (z), cC
z1 ,...,zN ×C .
(8.4)
The space P(K) endowed with the metric [·, ·]P(K) will be said to be a NFRKPS associated with the FPS kernel K(z, z ). It is clear that this space is isomorphic to the RKPS associated with the kernel Kw,w on FN , and this isomorphism is well defined by Kw (·)c → K·,w c, w ∈ FN , c ∈ C. Let us now come back to the kernels (8.1) and (8.2) (see also (4.15)). Clearly, they can be rewritten as K F,k (z, z ) F
K (z, z )
= ϕk (z)Hk−1 ϕk (z )∗ , = ϕ(z)H
−1
k = 1, . . . , N,
∗
ϕ(z ) ,
(8.5) (8.6)
where rational FPSs ϕk , k = 1, . . . , N, and ϕ are determined by a given minimal GR-realization α of the FPS F as = C(IIγ − ∆(z)A)−1 , ϕk (z) = ϕ(z)-Cγk , k = 1, . . . , N. ϕ(z)
For a model minimal GR-realization of F , we will start, conversely, with establishing an explicit formula for the kernels (8.1) and (8.2) in terms of F and then define a minimal GR-realization via these kernels.
Matrix-J-unitary Rational Formal Power Series
105
Suppose that for a fixed k ∈ {1, . . . , N }, (8.5) holds with some rational FPS ϕk . Recall that J − F (z)JF (z )∗ =
N
ϕk (z)Hk−1 (zk + (zk )∗ )ϕk (z )∗
(8.7)
k=1
(note that (zk )∗ = zk ). Then for any n ∈ N and Z, Z ∈ Cn×n : J ⊗ In − F (Z)(J ⊗ In )F (Z )∗ =
N
ϕk (Z)(Hk−1 ⊗ (Zk + (Zk )∗ ))ϕk (Z )∗ . (8.8)
k=1
Therefore, for λ ∈ C: J ⊗ I2n − F (ΛZ,Z (λ))(J ⊗ I2n )F (diag(−Z ∗ , Z ))∗ I I = λϕk (ΛZ,Z (λ)) Hk−1 ⊗ n n ϕk (diag(−Z ∗ , Z ))∗ , In In
(8.9)
where In In Z 0 ΛZ,Z (λ) := λ ⊗ ek + ∗ In In 0 −Z Z1 0 0 λIIn Zk−1 λIIn + Zk = ∗ ,..., ∗ , ∗ , 0 −(Z1 ) 0 −(Zk−1 ) λIIn λIIn − (Zk ) 0 0 ZN Zk+1 , ∗ ,..., ∗ 0 −(Zk+1 0 −(ZN ) ) −Z1∗ diag(−Z , Z ) := 0 ∗
∗ 0 −ZN ,..., Z1 0
0 ZN
,
and, in particular, ∗
ΛZ,Z (0) = diag(Z, −Z ). For Z and Z where both F and ϕk are holomorphic, ϕk (ΛZ,Z (λ)) is continuous in λ, and F (ΛZ,Z (λ)) is holomorphic in λ at λ = 0. Thus, dividing by λ the expressions in both sides of (8.9) and passing to the limit as λ → 0, we get −
d {F (ΛZ,Z (λ))} -λ=0 (J ⊗ I2n )F (diag(−Z ∗ , Z ))∗ dλ I I ∗ = ϕk diag(Z, −Z ) Hk−1 ⊗ n n ϕk (diag(−Z ∗ , Z ))∗ In In ϕk (Z) ∗ = (Hk−1 ⊗ In ) ϕk (−Z ∗ ) ϕk (Z )∗ . ∗ ϕk (−Z )
Taking the (1, 2)th entry of the 2 × 2 block matrices in this equality, we get: K F,k (Z, Z ) = −
d F (ΛZ,Z (λ))12 -λ=0 (J ⊗ In )F (Z )∗ . dλ
(8.10)
106
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Using the FPS representation for F we obtain from (8.10) the representation ⎛ ⎞ w T ⎝ K F,k (Z, Z ) = (−1)|v |+1 Fwgk vT JF Fv ⎠ ⊗ Z w (Z ∗ ) . w,w ∈F FN
v,v ∈F FN : vv =w
From Corollary 2.2 we get the expression for a FPS K F,k (z, z ), namely: ⎛ ⎞ w T ⎝ K F,k (z, z ) = (−1)|v |+1 Fwgk vT JF Fv ⎠ z w z . w,w ∈F FN
(8.11)
v,v ∈F FN : vv =w
Using formal differentiation with respect to λ we can also represent this kernel as K F,k (z, z ) = −
d F (Λz,z (λ))12 -λ=0 JF (z )∗ . dλ
(8.12)
We note that one gets (8.11) and (8.12) from (8.7) using the same argument applied to FPSs. Let us now consider the NFRKPSs Kk (F ), k = 1, . . . , N , and K(F ) = (N k=1 Kk (F ). They are finite-dimensional and isomorphic to the reproducing kernel Pontryagin spaces on FN which were denoted above with the same notation. Thus dim Kk (F ) = γk ,
k = 1, . . . , N,
dim K(F ) = γ.
(8.13)
The space K(F ) is a multivariable non-commutative analogue of a certain de Branges–Rovnyak space (see [19, p. 24], [4, Section 6.3], and [7, p. 217]). 8.2. Minimal realizations in non-commutative de Branges–Rovnyak spaces Let us define for every k ∈ {1, . . . , N } the backward shift operator Rk : Cq z1 , . . . , zN rat −→ Cq z1 , . . . , zN rat by Rk :
w∈F FN
fw z w −→
fwgk z w .
w∈F FN
(Compare with the one-variable backward shift operator R0 considered in Section 1.) Lemma 8.1. Let F be a matrix-J-unitary on JN rational FPS. Then for every k ∈ {1, . . . , N } the following is true: 1. Rk F (z)c ∈ Kk (F ) for every c ∈ Cq ; 2. Rk Kj (F ) ⊂ Kk (F ) for every j ∈ {1, . . . , N }.
Matrix-J-unitary Rational Formal Power Series
107
Proof. From (8.7) and the J-unitarity of F∅ we get J − F (z)JF F∅∗
=
(F F∅ − F (z))JF F∅∗ = −
N
Rk F (z)zk JF F∅∗
k=1
=
N
ϕk (z)Hk−1 zk (ϕk )∗∅ ,
k=1
and therefore for every k ∈ {1, . . . , N } and every c ∈ Cq we get F∅ c = K∅F,k (z) (−JF F∅ c) ∈ Kk (F ). Rk F (z)c = −ϕk (z)Hk−1 (ϕk )∗∅ JF Thus, the first statement of this Lemma is true. To prove the second statement we start again from (8.7) and get for a fixed j ∈ {1, . . . , N } and w ∈ FN : ∗ −F (z)JF Fwg j
=
∗ ϕj (z)H Hj−1 (ϕj )w
+
N
∗
ϕk (z)Hk−1 zk (ϕk )wgj ,
k=1
and therefore for any c ∈ Cq : N N N ∗ F,j F,k Rk F (z)JF Kwg R − Fwg c z = K (z)c z + (z)c zk . k k w k j j k=1
k=1
k=1
Hence, one has for every k ∈ {1, . . . , N }: F,j ∗ F,k Rk Kw (z)c = −Rk F (z)JF Fwg c − Kwg (z)c, j j
(8.14)
and from the first statement of this Lemma we obtain that the right-hand side of this equality belongs to Kk (F ). Thus, the second statement is true, too. We now define operators Akj : Kj (F ) → Kk (F ), A : K(F ) → K(F ), B : Cq → K(F ), C : K(F ) → Cq , D : Cq → Cq by (8.15) Akj = Rk -Kj (F ) , k, j = 1, . . . , N, A B
= (Akj )k,j=1,...,N , ⎞ ⎛ R1 F (z)c ⎟ ⎜ .. : c −→ ⎝ ⎠, . RN F (z)c ⎞ f1 (z) N ⎜ .. ⎟ − → (ffk )∅ , ⎝ . ⎠ k=1 fN (z)
(8.16) (8.17)
⎛ C
:
D
= F∅ .
(8.18) (8.19)
These definitions make sense in view of Lemma 8.1. Theorem 8.2. Let F be a matrix-J-unitary on JN rational FPS. Then the GR-node ( q α = (N ; A, B, C, D; K(F ) = N k=1 Kk (F ), C ), with operators defined by (8.15)– (8.19), is a minimal GR-realization of F .
108
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Proof. We first check that for every w ∈ FN : w = ∅ we have w
Fw = (CAB) .
(8.20)
Let w = gk for some k ∈ {1, . . . , N }. Then for c ∈ Cq : w
(CAB) c = Ck Bk c = (Rk F (z)c)∅ =
Fwgk z w c
w∈F FN
= Fgk c. ∅
Assume now that |w| > 1, w = gj1 . . . gj|w| . Then for c ∈ Cq : (CAB)w c = Cj1 Aj1 ,j2 · · · Aj|w|−1 ,j|w| Bj|w| c = Rj1 · · · Rj|w| F (z)c ∅ w = Fw gj1 ···gj|w| z c w ∈F FN
∅
= Fgj1 ···gj|w| c = Fw c. Since F∅ = D, we obtain that F (z) = D + C(I − ∆(z)A)−1 ∆(z)B, that is, α is a GR-realization of F . The minimality of α follows from (8.13).
Let us now show how the associated structured Hermitian matrix H = diag(H1 , . . . , HN ) arises from this special realization. Let F,j F,j h = col1≤j≤N (K Kw (·)cj ) and h = col1≤j≤N (Kw (·)cj ). j j
Using (8.14), we obtain [Akj hj , hk ]F,k + [hj , Ajk hk ]F,j = =
F,k F,k F,j F,j [Rk Kw (·)cj , Kw Kw (·)cj , Rj Kw (·)ck ]F,k + [K (·)ck ]F,j j j k k F,j F,k cj . (ck )∗ Kw g ,w + Kw ,w g j j j k k
◦
◦
◦
◦
◦
Let α= (N ; A, B , C , D; Cγ =
(8.21)
k
(N k=1
Cγk , Cq ) be any minimal GR-realization of F , ◦
◦
◦
with the associated structured Hermitian matrix H = diag(H1 , . . . , HN ). Then the
Matrix-J-unitary Rational Formal Power Series
109
right-hand side of (8.21) can be rewritten as F,j F,k cj (ck )∗ Kw g ,w + Kw ,w g j j j k k k gj wjT ◦ ◦ wk gk gj ◦ −1 ◦ ◦ ∗ ∗ ∗ = (ck ) Hj A C C A gk gj wjT ◦ ◦ wk gk ◦ −1 ◦ ◦ ∗ ∗ + CA Hk A C ◦ ◦ wk gk CA
◦ Akj
cj
−1 −1 ∗ ◦ gj wjT ◦ ◦ ◦ ◦ Hj + Hk Akj A∗ C ∗ cj
=
∗ (ck )
=
∗ −(ck )
∗ ◦ gj wjT ◦ ◦ wk gk ◦ ◦ ◦ ∗ ∗ Bk J Bj A C cj CA
=
∗ −(ck )
−1 ◦ gj wjT ◦ ◦ wk gk ◦ −1 ◦ ∗ ◦ ◦ ◦ ∗ ∗ Hk Ck J Cj Hj A C cj CA
∗
F,k F,j = −(ck ) Kw JK∅,w cj j k ,∅ ∗ ∗ F,k F,j = −(ck ) K∅,w JK∅,w cj j k
= −(hk )∗∅ J(hj )∅ . ◦
◦
◦
◦
In this chain of equalities we have exploited the relationship between A, B , C , D, J ◦ ◦ and H from Theorem 4.1 applied to a GR-node α. Thus we have for all k, j ∈ {1, . . . , N }: [Akj hj , hk ]F,k + [hj , Ajk hk ]F,j = −(hk )∗ Ck∗ JC Cj hj .
(8.22)
Since this equality holds for generating elements of the spaces Kk (F ), k = 1, . . . , N ) it extends by linearity to arbitrary elements h = col(h1 , . . . , hN ) and h = col(h1 , . . . , hN ) in K(F ). For k = 1, . . . , N, let ·, · F,k be any inner product for which Kk (F ) is a Hilbert space. Thus, K(F ) is a Hilbert space with respect to the inner product h, h F :=
N
hk , hk F,k .
k=1
Then there exist uniquely defined linear operators Hk : Kk (F ) → Kk (F ) such that: [hk , hk ]F,k = Hk hk , hk F,k ,
k = 1, . . . N,
and so with H := diag(H1 , . . . , HN ) : K(F ) → K(F ) we have: [h, h ]F = Hh, h F .
110
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
Since the spaces Kk (F ) are non-degenerate (see [4]), the operators Hk are invertible and (8.22) can be rewritten as: Cj , (A∗ )kj Hj + Hk Akj = −Ck∗ JC
k, j = 1, . . . N,
which is equivalent to (4.3). Now, for arbitrary c, c ∈ Cq and w ∈ FN we have: ∗
F,k F,k Hk Bk c, Kw (·)c F,k = [Rk F (·)c, Kw (·)c ]F,k = c Fw gk c.
On the other hand, F,k F,k F,k − Ck∗ JDc, Kw F∅ c, Ck Kw F∅ c, K∅,w (·)c F,k = −JF (·)c F,k = −JF c Cq −1 ∗ ◦ ◦ ◦ ◦ ∗ F,k ∗ ◦ = −c Kw F∅ c = −c (C A)w gk Hk Ck J D c ,∅ JF
= c
∗
◦ ◦ w gk ◦ ◦ ◦ w gk ∗ ◦ ∗ Bk c = c C A B c = c Fw gk c. CA
Here we have used the relation (4.4) for an arbitrary minimal GR-realization ◦ ◦ ◦ ◦ (N ◦ γk q k=1 C , C ) of F , with the associated structured Her-
α= (N ; A, B , C , D; Cγ = ◦
◦
◦
mitian matrix H = diag(H1 , . . . , HN ). Thus, Hk Bk = −Ck∗ JD, k = 1, . . . , N , that is, B = −H −1 C ∗ JD, and (4.4) holds for the GR-node α. Finally, by Theorem 4.1, we may conclude that H = diag(H1 , . . . , HN ) is the associated structured Hermitian matrix of the special GR-realization α. 8.3. Examples In this subsection we give certain examples of matrix-inner rational FPSs on J2 with scalar coefficients (i.e., N = 2, q = 1, and J = 1). We also present the corresponding non-commutative positive kernels K F,1 (z, z ) and K F,2 (z, z ) computed using formula (8.12). Example 1. F (z) = (z1 + 1)−1 (z1 − 1)(z2 + 1)−1 (z2 − 1). K F,1 (z, z ) = 2(z1 + 1)−1 (z1 + 1)−1 , K F,2 (z, z ) = 2(z1 + 1)−1 (z1 − 1)(z2 + 1)−1 (z2 + 1)−1 (z1 − 1)(z1 + 1)−1 . Example 2. F (z) = (z1 + z2 + 1)−1 (z1 + z2 − 1). K F,1 (z, z ) = K F,2 (z, z ) = 2(z1 + z2 + 1)−1 (z1 + z2 + 1)−1 . Example 3.
−1 z1 + (z2 + i)−1 − 1 F (z) = z1 + (z2 + i)−1 + 1 −1
= ((z2 + i)(z1 + 1) + 1)
K
−1
−1
(z2 + i)(z2 − i) ((z1 + 1)(z2 − i) + 1)
−1
((z1
K F,1 (z, z ) = 2 ((z2 + i)(z1 + 1) + 1) F,2
((z2 + i)(z1 − 1) + 1) .
(z, z ) = 2 ((z2 + i)(z1 + 1) + 1)
+
1)(z2
−1
− i) + 1)
.
,
Matrix-J-unitary Rational Formal Power Series
111
References [1] J. Agler, On the representation of certain holomorphic functions defined on a polydisk, Oper. Theory Adv. Appl., vol. 48, pp. 47–66, Birkh¨ ¨ auser Verlag, Basel, 1990. [2] N.I. Akhiezer and I.M. Glazman, Theory of linear operators in Hilbert space, Dover Publications Inc., New York, 1993, Translated from the Russian and with a preface by Merlynd Nestell, Reprint of the 1961 and 1963 translations. [3] D. Alpay, A. Dijksma, J. Rovnyak, and H. de Snoo, Schur functions, operator colligations, and reproducing kernel Pontryagin spaces, Oper. Theory Adv. Appl., vol. 96, ¨ Verlag, Basel, 1997. Birkhauser [4] D. Alpay and H. Dym, On applications of reproducing kernel spaces to the Schur algorithm and rational J-unitary factorization, I. Schur methods in operator theory and signal processing, Oper. Theory Adv. Appl., vol. 18, Birkh¨ ¨ auser, Basel, 1986, pp. 89–159. [5] D. Alpay and H. Dym, On a new class of realization formulas and their application, Proceedings of the Fourth Conference of the International Linear Algebra Society (Rotterdam, 1994), vol. 241/243, 1996, pp. 3–84. [6] D. Alpay and I. Gohberg, On orthogonal matrix polynomials, Orthogonal matrixvalued polynomials and applications (Tel Aviv, 1987–88), Oper. Theory Adv. Appl., vol. 34, Birkhauser, ¨ Basel, 1988, pp. 25–46. [7] D. Alpay and I. Gohberg, Unitary rational matrix functions, Topics in interpolation theory of rational matrix-valued functions, Oper. Theory Adv. Appl., vol. 33, Birkhauser, ¨ Basel, 1988, pp. 175–222. [8] D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘ ı, On the intersection of null spaces for matrix substitutions in a non-commutative rational formal power series, C. R. Math. Acad. Sci. Paris 339 (2004), no. 8, 533–538. [9] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc. 68 (1950), 337–404. [10] D.Z. Arov, Passive linear steady-state dynamical systems, Sibirsk. Mat. Zh. 20 (1979), no. 2, 211–228, 457, (Russian). [11] J.A. Ball, G. Groenewald, and T. Malakorn, Structured noncommutative multidimensional linear systems, Preprint. [12] J.A. Ball, G. Groenewald, and T. Malakorn, Conservative structured noncommutative multidimensional linear systems, In this volume. [13] J.A. Ball, G. Groenewald, and T. Malakorn, Bounded Real Lemma for structured noncommutative multidimensional linear systems and robust control, Preprint. [14] J.A. Ball and V. Vinnikov, Formal reproducing kernel Hilbert spaces: The commutative and noncommutative settings, Reproducing kernel spaces and applications, Oper. Theory Adv. Appl., vol. 143, Birkh¨ ¨ auser, Basel, 2003, pp. 77–134. [15] H. Bart, I. Gohberg, and M.A. Kaashoek, Minimal factorization of matrix and operator functions, Oper. Theory Adv. Appl., vol. 1, Birkh¨ ¨ auser Verlag, Basel, 1979. [16] C. Beck, On formal power series representations for uncertain systems, IEEE Trans. Automat. Control 46 (2001), no. 2, 314–319. [17] C.L. Beck and J. Doyle, A necessary and sufficient minimality condition for uncertain systems, IEEE Trans. Automat. Control 44 (1999), no. 10, 1802–1813.
112
D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘
[18] J. Berstel and C. Reutenauer, Rational series and their languages, EATCS Monographs on Theoretical Computer Science, vol. 12, Springer-Verlag, Berlin, 1988. [19] L. de Branges and J. Rovnyak, Square summable power series, Holt, Rinehart and Winston, New York, 1966. [20] M.S. Brodski˘ı, Triangular and Jordan representations of linear operators, American Mathematical Society, Providence, R.I., 1971, Translated from the Russian by J.M. Danskin, Translations of Mathematical Monographs, Vol. 32. [21] J.F. Camino, J.W. Helton, R.E. Skelton, and J. Ye, Matrix inequalities: a symbolic procedure to determine convexity automatically, Integral Equations Operator Theory 46 (2003), no. 4, 399–454. [22] H. Dym, J contractive matrix functions, reproducing kernel Hilbert spaces and interpolation, CBMS Regional Conference Series in Mathematics, vol. 71, Published for the Conference Board of the Mathematical Sciences, Washington, DC, 1989. [23] A.V. Efimov and V.P. Potapov, J-expanding matrix-valued functions, and their role in the analytic theory of electrical circuits, Uspehi Mat. Nauk 28 (1973), no. 1(169), 65–130, (Russian). [24] M. Fliess, Matrices de Hankel, J. Math. Pures Appl. (9) 53 (1974), 197–222. [25] E. Fornasini and G. Marchesini, On the problems of constructing minimal realizations for two-dimensional filters, IEEE Trans. Pattern Analysis and Machine Intelligence PAMI-2 (1980), no. 2, 172–176. [26] D.D. Givone and R.P. Roesser, Multidimensional linear iterative circuits-general properties, IEEE Trans. Computers C-21 (1972), 1067–1073. [27] D.D. Givone and R.P. Roesser, Minimization of multidimensional linear iterative circuits, IEEE Trans. Computers C-22 (1973), 673–678. [28] I. Gohberg, P. Lancaster, and L. Rodman, Matrices and indefinite scalar products, Oper. Theory Adv. Appl., vol. 8, Birkh¨ ¨ auser Verlag, Basel, 1983. [29] J.W. Helton, “Positive” noncommutative polynomials are sums of squares, Ann. of Math. (2) 156 (2002), no. 2, 675–694. [30] J.W. Helton, Manipulating matrix inequalities automatically, Mathematical systems theory in biology, communications, computation, and finance (Notre Dame, IN, 2002), IMA Vol. Math. Appl., vol. 134, Springer, New York, 2003, pp. 237–256. [31] J.W. Helton and S.A. McCullough, A Positivstellensatz for non-commutative polynomials, Trans. Amer. Math. Soc. 356 (2004), no. 9, 3721–3737 (electronic). [32] J.W. Helton, S.A. McCullough, and M. Putinar, A non-commutative Positivstellensatz on isometries, J. Reine Angew. Math. 568 (2004), 71–80. [33] D.S. Kalyuzhniy, On the notions of dilation, controllability, observability, and minimality in the theory of dissipative scattering linear nD systems, Proceedings of the International Symposium MTNS-2000 (A. El Jai and M. Fliess, Eds.), CD-ROM (Perpignan, France), 2000, http://www.univ-perp.fr/mtns2000/articles/I13 3.pdf. [34] D.S. Kalyuzhny˘-Verbovetzki˘ ˘ ˘ı and V. Vinnikov, Non-commutative positive kernels and their matrix evaluations, Proc. Amer. Math. Soc., to appear. [35] S.C. Kleene, Representation of events in nerve nets and finite automata, Automata studies, Annals of mathematics studies, no. 34, Princeton University Press, Princeton, N. J., 1956, pp. 3–41.
Matrix-J-unitary Rational Formal Power Series
113
[36] I.V. Kovaliˇ ˇsina, and V.P. Potapov, Multiplicative structure of analytic real J-dilative matrix-functions, Izv. Akad. Nauk Armjan. SSR Ser. Fiz.-Mat. Nau 18 (1965), no. 6, 3–10, (Russian). ¨ [37] M.G. Kre˘n ˘ and H. Langer, Uber die verallgemeinerten Resolventen und die charakteristische Funktion eines isometrischen Operators im Raume Πκ , Hilbert space operators and operator algebras (Proc. Internat. Conf., Tihany, 1970), North-Holland, Amsterdam, 1972, pp. 353–399. Colloq. Math. Soc. J´ ´ anos Bolyai, 5. [38] M.S. Livˇ ˇsic, Operators, oscillations, waves (open systems), American Mathematical Society, Providence, R.I., 1973, Translated from the Russian by Scripta Technica, Ltd. English translation edited by R. Herden, Translations of Mathematical Monographs, Vol. 34. [39] T. Malakorn, Multidimensional linear systems and robust control, Ph.D. thesis, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, 2003. [40] S. McCullough, Factorization of operator-valued polynomials in several non-commuting variables, Linear Algebra Appl. 326 (2001), no. 1-3, 193–203. [41] A.C.M. Ran, Minimal factorization of selfadjoint rational matrix functions, Integral Equations Operator Theory 5 (1982), no. 6, 850–869. [42] R.P. Roesser, A discrete state-space model for linear image processing, IEEE Trans. Automatic Control AC–20 (1975), 1–10. [43] L.A. Sakhnovich, Factorization problems and operator identities, Russian Mathematical Surveys 41 (1986), no. 1, 1–64. [44] M.P. Sch¨ u ¨tzenberger, On the definition of a family of automata, Information and Control 4 (1961), 245–270. [45] B.V. Shabat, Introduction to complex analysis. Part II, I Translations of Mathematical Monographs, vol. 110, American Mathematical Society, Providence, RI, 1992, Functions of several variables, Translated from the third (1985) Russian edition by J. S. Joel. [46] P. Sorjonen, Pontrjaginr¨ ¨ aume mit einem reproduzierenden Kern, Ann. Acad. Sci. Fenn. Ser. A I Math. 594 (1975), 30. [47] K. Zhou, J.C. Doyle, and K. Glover, Robust and optimal control, Prentice-Hall, Upper Saddle River, NJ, 1996. D. Alpay Department of Mathematics Ben-Gurion University of the Negev Beer-Sheva 84105, Israel e-mail: [email protected] D.S. Kalyuzhny˘-Verbovetzki˘ ˘ Department of Mathematics Ben-Gurion University of the Negev Beer-Sheva 84105, Israel e-mail: [email protected]
Operator Theory: Advances and Applications, Vol. 161, 115–177 c 2005 Birkhauser ¨ Verlag Basel/Switzerland
State/Signal Linear Time-Invariant Systems Theory, Part I: Discrete Time Systems Damir Z. Arov and Olof J. Staffans Abstract. This is the first paper in a series of several papers in which we develop a state/signal linear time-invariant systems theory. In this first part we shall present the general state/signal setting in discrete time. Our following papers will deal with conservative and passive state/signal systems in discrete time, the general state/signal setting in continuous time, and conservative and passive state/signal systems in continuous time, respectively. The state/signal theory that we develop differs from the standard input/state/output theory in the sense that we do not distinguish between input signals and output signals, only between the “internal” states x and the “external” signals w. In the development of the general state/signal systems theory we take both the state space X and the signal space W to be Hilbert spaces. In later papers where we discuss conservative and passive systems we assume that the signal space W has an additional Kre˘ ˘ın space structure. The definition of a state/signal system has been designed in such a way that to any state/signal system there exists at least one decomposition of the signal space W as the direct sum W = Y U such that the evolution of the system can be described by the standard input/state/output system of equations with input space U and output space Y. (In a passive state/signal system we may take U and Y to be the positive and negative parts, respectively, of a fundamental decomposition of the Kre˘ın space W.) Thus, to each state/signal system corresponds infinitely many input/state/output systems constructed in the way described above. A state/signal system consists of a state/signal node and the set of trajectories generated by this node. A state/signal node is a triple Σ = (V ; X , W), where V is a subspace with appropriate properties of the product space X × X × W. In this first paper we extend standard input/state/output notions, such as existence and uniqueness of solutions, continuous dependence on initial data, observability, controllability, stabilizability, detectability, and minimality to the state/signal setting. Three classes of representations of state/signal systems are presented (one of which is the class of input/state/output representations), and the families of all the transfer functions of these representations are studied. We also discuss realizations of signal behaviors by state/signal systems, as well as dilations and compressions of these systems. (Duality will be discussed later in connection with passivity and conservativity.)
116 Mathematics Subject Classification (2000). Primary 47A48, 93A05; Secondary 94C05. Keywords. State/signal node, driving variable, output nulling, input/state/ output, linear fractional transformation, transfer function, behavior, external equivalence, realization, dilation, compression, outgoing invariant, strongly invariant, controllability, observability, minimality, stabilizability, detectability.
Contents 1 2 3 4 5 6 7 8 9 10
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . State/signal nodes and trajectories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The driving variable representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The output nulling representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The input/state/output representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Transfer functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Signal behaviors, external equivalence, and similarity . . . . . . . . . . . . . . . . . Dilations of state/signal systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Acknowlegment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
116 120 123 128 132 138 146 153 167 176 176 176
1. Introduction The main motivation for this work comes from the notion of a multi-port network. Such a network consists of internal branches, where the evolution of the data is described by, e.g., systems of ordinary or partial differential equations involving state variables (lumped or distributed), and external branches (ports), where the evolution of the port variables is only partially restricted by the network equations. Typically one part of the port variables can be prescribed in an arbitrary way (this is the “input” part), after which the remaining “output” part of the port variables can be computed from the network equations. However, the splitting of the port variables into an input part and output part is not specified, and many different choices are possible. To be a little more concrete, let us consider a two-port Kirchhoff network, i.e., a Kirchhoff network with two external branches. To each of these branches we associate at each time instant t a normalized voltage/current pair (v1 (t), i1 (t)),√respecmeans that we divide each voltage by R and tively, (v2 (t), i2 (t)) (normalization √ multiply each current by R, where R is a fixed resistance). Thus, the complete set of port variables is the four-dimensional vector w(t) = (v1 (t), i1 (t), v2 (t), i2 (t)).
State/Signal Systems
117
Sometimes we may use u(t) = (v1 (t), i1 (t)) as the input data, and regard (v2 (t), i2 (t)) as the output data (or the other way around). This case is called the transmission case, and it is used, e.g., in the cascade synthesis of two-ports. However, this choice of input and output data is not always possible or reasonable. Another possibility is to choose u(t) = (i1 (t), i2 (t)) as the input data and y(t) = (v1 (t), v2 (t)) as the output data (or the other way around). These cases are referred to as the impedance and admittance cases, and they are used, e.g., in series and parallel connections of networks. Neither is this choice of input and output data always possible or reasonable. In his development of the theory of passive Kirchhoff networks V. Belevitch [Bel68] proposed the use of the incoming wave data u(t) = ( √12 (v1 (t) + i1 (t)), √12 (v2 (t) + i2 (t))) as input data and the outgoing wave data u(t) = ( √12 (v1 (t)−i1 (t)), √12 (v2 (t)−i2 (t))) as output. This case is called the scattering case, and this particular decomposition is always possible and meaningful for passive Kirchhoff networks. In all these cases the physical network is the same, but depending on the decomposition of w(t) = (v1 (t), i1 (t), v2 (t), i2 (t)) into an input part and an output part we get very different input/state/output characteristics. The idea of considering the evolution of external signals w(t) without an explicit decomposition into an input part u(t) and an output part y(t) is the most fundamental ingredient in the behavioral theory initiated by J. Willems (see, e.g., [PW98] for a recent presentation of behavioral theory). Our approach differs from the standard behavioral approach in the sense that we always include a state variable in the equations describing the evolution of the system, and we more or less ignore polynomial descriptions as well as dynamics generated by ordinary differential equations. It is genuinely infinite-dimensional, and it appears to be applicable to a large class of infinite-dimensional problems. A first step in this direction was taken by J. Ball and O. Staffans [BS05], where the main notion of a state/signal node and its trajectories are found in an implicit way. A state/signal system consists of a state/signal node and the set of trajectories generated by this node. A state/signal node is a triple Σ = (V ; X , W), where X (the state space) and W4 (the 5 signal space) are Hilbert spaces, and V is a subX space of the product space X with appropriate properties. In this paper we shall W only discuss systems with discrete time. The list of properties that the subspace V should satisfy in this case is given in Definition 2.1. By a trajectory (x(·), w(·)) ∞ of Σ on Z+ = {0, 1, 2, . . .} we mean a pair of sequences {x(n)}∞ n=0 and {w(n)}n=0 satisfying C x(n+1) D x(n) (1.1) ∈ V, n ∈ Z+ . w(n)
The properties of the subspace V have been chosen in such a way that there exists at least one admissible decomposition (actually infinitely many decompositions) of the signal space W as the direct sum W = Y U of an input space U and an output space Y such that trajectories are defined by a usual input/state/output
118
D.Z. Arov and O.J. Staffans
system of equations x(n + 1) = Ax(n) + Bu(n), y(n) = Cx(n) + Du(n),
n ∈ Z+ ,
(1.2)
x(0) = x0 , where the coefficients A, B, C, and D are bounded 2 3linear operators between A B ] ∈ B([ X ] ; X ). The set of all trajecthe respective Hilbert spaces, i.e., [ C Y D U tories (x(·), w(·)) of the state/signal system (1.1) can be obtained from the set of trajectories of (1.2) by taking the state sequence x(·) to be the same and taking w(·)4 = 5y(·) + u(·). The latter equation we write alternatively in the form 2 3 y(·) w(·) = u(·) , and likewise, instead of W = Y U we write alternatively W = Y U . In addition to these input/state/output representations, there are two other useful types of representations, namely driving variable and output nulling representations. In a driving variable representation we parameterize the trajectories by using an extra driving variable with values in an auxiliary driving variable Hilbert space L. The trajectories of the system are described by a system of equations x(n + 1) = A x(n) + B (n), w(n) = C x(n) + D (n),
n ∈ Z+ ,
(1.3)
x(0) = x0 , where the coefficients (A , B , 2C , D )3 are bounded linear operators between the A B ∈ B([ X ] ; [ X ]), and D is injective and has respective Hilbert spaces, i.e., C L W D closed range. The set of all trajectories (x(·), w(·)) of the state/signal system Σ can be obtained from the set of trajectories (x(·), (·), w(·)) of (1.3) by simply dropping the driving variable (·). In an output nulling representation we formally consider the signal component w as an input which is restricted by an additional equation posed in an auxiliary error space K. The trajectories of this new input/ state/output system are described by a system of equations x(n + 1) = A x(n) + B w(n), e(n) = C x(n) + D w(n),
n ∈ Z+ ,
(1.4)
x(0) = x0 , where the coefficients (A , B ,2C , D )3 are bounded linear operators between the A B ∈ B([ X ] ; [ X ]), and D is surjective. The respective Hilbert spaces, i.e., C W K D reason for the name “output nulling” for this representation is that (x(·), w(·)) is a trajectory of Σ if and only if (x(·), w(·), e(·)) with e(n) = 0 for all n is a trajectory of the input/state/output system described by (1.4). To each state/signal system there corresponds infinitely many representations of each of the three types described above. We prove the existence of these three types of representations, discuss their properties, and also discuss the relationships between different representations of the same type or of different types.
State/Signal Systems
119
Each input/state/output representation (1.2) of a given state/signal system has a B(U; Y)-valued transfer function given by D(z) = D + zC(1X − zA)−1 B,
z ∈ ΛA ,
(1.5)
where ΛA is the set of points z ∈ C for which (1X − zA) has a bounded inverse, plus the point at infinity if A is boundedly invertible. Thus, each state/signal system has infinitely many such transfer functions, one corresponding to each input/ state/output representation. All of these transfer functions can be obtained from one fixed input/state/output representation through the use of a linear fractional transformation. More precisely, let W = Y U and W = Y1 U1 be two admissible input/output decompositions of the signal space W of a given state/signal system Σ, and denote the corresponding transfer functions by D and D1 , respectively. Let C D C U1 D PYU11 |U P | Θ11 Θ12 Θ= = YY11 Y , (1.6) Θ21 Θ22 PU1 |Y PUY11 |U where PYU11 |Y is the restriction to Y of the projection of W onto Y1 along U1 , etc. (Note that Θ can be interpreted as a decomposition of the identity in W with respect to the two sum decompositions W = Y U = Y1 U1 .) Then D1 is the value of the linear fractional transform of D with coefficient matrix Θ, i.e., D1 (z) = [Θ11 D(z) + Θ12 ][Θ21 D(z) + Θ22 ]−1 ,
z ∈ Λ A ∩ Λ A1 .
(1.7)
We also introduce notions of controllability, observability, and minimality of state/signal systems. These notions are defined in terms of the properties of its trajectories, without any reference to the various representations described above, but it is possible to give equivalent conditions for controllability and observability in terms of the different types of representations described above. In particular, we prove that a state/signal system is controllable (or observable, or minimal) if and only if at least one corresponding input/state/output system (1.2) (hence all of them) has the same property. In Section 2 we discuss the main notions of the theory: state/signal nodes, the corresponding trajectories, and their basic properties. In Sections 3 and 4 we study driving variable and output nulling representations, respectively. Here we also define the notions of controllability and observability and develop tests for controllability and observability in terms of driving variable and output nulling representations. Input/state/output representations are studied in Section 5. Here we also give criteria for the admissibility of a decomposition of the signal space W into an input space U and an output space Y and describe the connections between different representations. Different kinds of transfer functions related to different representations of state/signal systems and their connections are studied in Section 6. In Section 7 we introduce and study signal behaviors and their realizations by means of state/signal systems. Dilations of state/signal systems are studied in depth in Section 8. In particular, we show that a dilation of a state/ signal system has the same signal behavior and also the same set of input/output transfer functions (restricted to a neighborhood of zero) as the original system.
120
D.Z. Arov and O.J. Staffans
The main result of this section characterizes dilations in terms of the existence of a decomposition of the state space into parts with certain invariance properties. All the proofs are given in the state/signal setting, and we obtain standard input/ state/output results as corollaries of our main results. Finally, Section 9 is devoted to a study of different stabilizability properties of state/signal systems in terms of the existence of stable representations of driving variable, output nulling, or input/ state/output type. Not only power stability, but also strong stability is studied. Notation. The space of bounded linear operators from one normed space X to another normed space Y is denoted by B(X ; Y), and we abbreviate B(X ; X ) to B(X ). The domain of a linear operator A is denoted by D(A), its range by R (A), and its kernel by N (A). The restriction of A to some subspace Z ⊂ D(A) is denoted by A|Z . The identity operator on X is denoted by 1X . For each A ∈ B(X ) we let ΛA be the set of points z ∈ C for which (1X − zA) has a bounded inverse, plus the point at infinity if A is boundedly invertible. C is the complex plane, D is the open unit disk in C, Z = {0, ±1, ±2, . . .}, Z+ = {0, 1, 2, . . .}, and Z− = {−1, −2, . . .}. The space H 2 (D; U), where U is a Hilbert space, consists U-valued functions φ on D which satisfy E of all analytic 1 2 φ(z) |dz| < ∞. The space H ∞ (D; U, Y), where U and φ2 := sup0≤r<1 2π |z|=r Y are Hilbert spaces, consists of all bounded analytic B(U; Y)-valued functions on (Z+ ; U) and 2 (Z+ ; U) contain those D. The sequence spaces 1 U-valued sequences + u(·) on Z which satisfy n∈Z+ u(n) < ∞, respectively, n∈Z+ u(n)2 < ∞, and ∞ (Z+ ; U) consists of all bounded U-valued sequences on Z+ . We denote the projection onto a closed subspace Y of a space X along some complementary subspace U by PYU . The closed linear span or linear span of a sequence of subsets Rn ⊂ X where n runs over some index set Λ is denoted by ∨n∈Λ Rn and spann∈Λ Rn , respectively. We denote 2 3 the product of the two locally convex topological vector spaces X X and Y may be Hilbert spaces (in which and Y by X Y . In particular, although 2X 3 case the product topology 2 3 in2 Y3 is induced by an inner product), we shall not require that [ X0 ] ⊥ Y0 in X Y . Furthermore, 2 3 2 3 in this case we identify a vector [ x0 ] ∈ [ X0 ] with x ∈ X and a vector2 y03 ∈ Y0 with y ∈ Y. (Thus, we also denote the ordered direct sum X Y by X Y .)
2. State/signal nodes and trajectories In this section we shall study time-invariant linear systems in discrete time induced by something that we call a state/signal node. Definition 2.1. A triple Σ = (V ; X , W), where the (internal ) state space X and the (external ) signal space W are Hilbert spaces and V is a subspace of the product
State/Signal Systems space K :=
4X 5 X W
121
is called a state/signal node if it has the following properties:1
(i) V is closed in K; 4z5 X ] such that x ∈ V ; (ii) For every x ∈ X there is some [ wz ] ∈ [ W w 4z5 (iii) If 0 ∈ V , then z = 0; 0 6 7 -4z5 x X ] - x ∈ V for some z ∈ X X ]. (iv) The set [ w ] ∈ [W is closed in [ W w We call K the node space and V the generating subspace. As we shall see in a moment (in Proposition 2.2, Lemmas 2.3–2.4 and Theorem 2.5), all of these conditions have a clear meaning related to the fact that we shall use the generating subspace V as the main tool in our definition of a trajectory. To define such a trajectory it is not important that (i)–(iv) hold. We define a trajectory (x(·), w(·)) along an arbitrary subspace V of K on the time interval [n1 , n2 ], where n1 , n2 ∈ Z, n1 ≤ n2 , to be a pair of sequences 2 +1 2 {x(k)}nk=n and {w(k)}nk=n satisfying 1 1 C x(k+1) D x(k) (2.1) ∈ V, n1 ≤ k ≤ n2 . w(k)
We shall also allow n1 = −∞ or n2 = ∞, in which case we replace ≤ by < in the formula above. Most of our trajectories will be considered on Z+ . We shall refer to the sequence x(·) as the state component and to the sequence w(·) as the signal component of the trajectory (x(·), w(·)). In the case where n1 is finite we shall call x(n1 ) the initial state of this trajectory. It follows immediately from Definition 2.1 that the set of trajectories along a given subspace V of K has the following two properties: 1) if (x(·), w(·)) is a trajectory along V on [n1 , n2 ], then for each k ∈ Z, the shifted pair of sequences (x(· + k), w(· + k)) is a trajectory along V on [n1 − k, n2 − k]. 2) if (x1 (·), w1 (·)) is a trajectory along V on [n1 , n2 ], if (x2 (·), w2 (·)) is a trajectory along V on [n2 + 1, n3 ], and if x1 (n2 + 1) = x2 (n2 + 1), then the concatenation (x(·), w(·)) defined by (x(k), w(k)) = (x1 (k), w1 (k)) for k ∈ [n1 , n2 ], (x(k), w(k)) = (x2 (k), w2 (k)) for k ∈ [n2 + 1, n3 ], and x(n3 + 1) = x2 (n3 + 1), is a trajectory along V on [n1 , n3 ]. Property 1) means that the set of trajectories along V is time-invariant, and property 2) says that x has the state property; cf. [PW98, p. 119]. 1 Recall
that we denote the direct product X × X × W by
4
X X W
5
. Later when we introduce passive
˘ın space, and equip K with a nodes we shall require X to be a Hilbert space, W to be a Kre˘ particular Kre˘ ˘ın space structure rather than the Hilbert space structure that it inherits from X and W. This is the reason why we throughout ignore the Hilbert space inner product in K induced by the inner products in X and W. The only way in which we use the fact that X and W are Hilbert spaces is in the assertion that every closed subspace of K has a complementary subspace. The same comments applies to all other Hilbert spaces and their products that appear in this paper.
122
D.Z. Arov and O.J. Staffans
Properties (ii) and (iii) in Definition 2.1 are reflected in the properties of the set of all trajectories along V as follows: 4X 5 Proposition 2.2. Let V be a subspace of the product space K := X . W
1) The following three statements are equivalent: (a) V has property (ii) in Definition 2.1; (b) for every x0 ∈ X there is a trajectory (x(·), w(·)) along V on Z+ with x(0) = x0 ; (c) every trajectory (x(·), w(·)) along V defined on some interval [0, n2 ] can be extended to a trajectory on Z+ . 2) The following four statements are equivalent: (a) V has property (iii) in Definition 2.1; k ∈ (b) if (x(·), w(·)) is a trajectory on [n1 , n2 ] along V , then 4for every 5 x(k)
[n1 , n2 ], the value of x(k + 1) is determined uniquely by w(k) ; (c) if (x(·), w(·)) is a trajectory on [n1 , n2 ] along V , then the value of x(n2 + 1) is determined uniquely by x(n1 ) and w(k), n1 ≤ k ≤ n2 . (d) if (x(·), w(·)) is a trajectory on [n1 , n2 ] along V with x(n1 ) = 0, then the value of x(n2 + 1) is determined uniquely by w(k), n1 ≤ k ≤ n2 . Proof. Proof of 1): The implications (b) ⇒ (a) and (c) ⇒ (a) are obvious. We next prove that (a) ⇒ (b). Suppose that (a) holds. Let x0 ∈ X , and define from property (ii) in Definition 2.1 that there exist x(1) and x(0) = x0 . It follows D C x(1) x(0) w(0)
∈ V . By the same argument with x(0) replaced by x(1), C x(2) D there exist x(2) and w(1) such that x(1) ∈ V . By induction, we will obtain (b). w(0) such that
w(1)
The proof of the fact that (a) ⇒ (c) is the same as the proof of the implication (a) ⇒ (b) given above, except that we start from time n + 1 and the initial value x(n + 1) (instead of time zero and initial value x0 ). The proof of 2) is left to the reader. By the state/signal system generated by the state/signal node Σ = (V ; X , W) we mean this node itself together with the set of all trajectories along V . For simplicity we use the same notation Σ for the system as we used for the original node. We shall also refer to the trajectories along V as the trajectories of Σ. We shall next develop certain representations of the subspace V in Definition 2.1, and begin with the following lemmas. 4X 5 Lemma 2.3. Let V be a subspace of the product space K := X . Let G2,3 : V → 4z5 W X ] be the bounded linear operator that maps the vector x ∈ V into [ x ] ∈ [ X ]. [W w W w Then the following conditions are equivalent: 1) V has property (iii); 2) G2,3 is injective;
State/Signal Systems
123
X ] of K, i.e., 3) V has a graph representation over the last two components [ W X 5 exists a linear operator F , mapping D(F ) ⊂ [ W ] into X such that 4there z x w
x x ∈ V if and only if [ w ] ∈ D(F ) and z = F [ w ].
Assuming 1), with G2,3 and F defined as in 2) and 3), the operator F is uniquely determined by V (hence so is D(F )), R (G2,3 ) = D(F ), G−1 2,3 : D(F ) → V is given 4 F 5 −1 by G2,3 = 1X 0 , and 0 1W F
⎡ z ⎤ C D C D x x −1 ∈ D(F ) (2.2) , V = G2,3 D(F ) = ⎣ x ⎦ - z = F w w w 4X 5 Lemma 2.4. Let V be a subspace of the product space K := X . Assume that V W has property (iii), and let F be the operator defined in Lemma 2.3. Then 1) V has property (i) if and only if F is closed, 2) V has property (ii) if and only if the linear operator D(F ) → X that maps x ] ∈ D(F ) into x ∈ X is surjective, [w 3) V has property (iv) if and only if D(F ) is closed, 4) V has properties (i) and (iv) if and only if F is bounded and D(F ) is closed. We leave the straightforward proofs of Lemmas 2.3 and 2.4 to the reader. By combining Lemmas 2.3 and 2.4 we get the following theorem: 4X 5 Theorem 2.5. Let V be a subspace of the product space K := X . Then V has W properties (i)–(iv) listed in Definition 2.1, i.e., Σ = (V ; X , W) is a state/signal node, if and only if V has a graph representation over the last two components X ] of K with a bounded linear operator F : D(F ) ⊂ [ X ] → X with closed domain, [W W i.e., F
⎡ z ⎤ C D C D x x ⎦ ⎣ x -z=F , ∈ D(F ) , (2.3) V = w w w x with the additional property that the linear operator D(F ) → X that maps [ w ]∈ D(F ) into x ∈ X is surjective.
In the next three sections we shall develop three different types of representations of a state/signal system Σ: driving variable representations, output nulling representations, and input/state/output representations. They complement each other, and all of them are important in slightly different connections.
3. The driving variable representation In our first representation of the generating subspace V we write V as the image of a bounded linear injective operator of the following type.
124
D.Z. Arov and O.J. Staffans
4X 5 Lemma 3.1. Let V be a subspace of the product space K := X , where X and W W are Hilbert spaces. If there exists a Hilbert space L and four operators A ∈ B(X ), B ∈ B(L; X ), C ∈ B(X , W), and D ∈ B(L; W),
(3.1)
D is injective and has a closed range
(3.2)
where such that
⎛⎡
A V = R ⎝⎣1X C
⎤⎞ ⎡ ⎤F A x + B B ⎦ - x ∈ X, ∈ L , x 0 ⎦⎠ = ⎣ C x + D D
(3.3)
then V has properties (i)–(iv) listed in Definition 2.1, i.e., (V ; X , W) is a state/ signal node. Conversely, if V has properties (i)–(iv) listed in Definition 2.1 then V is given by (3.3) for some Hilbert space L and some operators A , B , C , and D satisfying (3.1) and (3.2). Proof. We begin by proving that the representation (3.1)–(3.3) implies that V has properties (i)–(iv) in Definition 2.1. Trivially, (3.1) and (3.3) (ii). It is 2 1X imply 3 0 also clear that the injectivity of D implies that the operator C D is injective. 3 2 32 3−1 2 we get the Thus, by defining D(F ) = R 1CX D0 and F = A B 1CX D0 graph representation (2.3) of V . According to Lemma 2.3, this2implies3 that V has property (iii). The closedness of R (D ) implies that also R 1CX D0 is closed, 5 4 2 1X 0 3 4 1X 0 5 4 X 5 1X 0 because R C D = C 1W R(D ) , where C 1W is boundedly invertible. 2 3−1 Finally, the closed graph theorem implies that 1CX D0 is bounded on D(F ), hence so is F , and by part 4) of Lemma 2.4, V has properties (i) and (iv). We have now showed that V has all the properties (i)–(iv). Conversely, suppose that V has properties (i)–(iv) 4 5in Definition 2.1. Let G2 ∈ z x w
∈ V into x ∈ X . We take 4z5 L = N (G2 ), and define B ∈ B(L; X ) and D ∈ B(X ; W) by B w0 = z and 4z5 4z5 4 B 5 2 B 3 D 0 = w for each 0 ∈ L. Clearly = 0 for all ∈ L, D is injective w w D 2 B 3 X on L, and the range of D is closed in [ W ]. By property (ii) in Definition 2.1, G2 maps V onto X . Let G−1 2,right ∈ B(X ; V ) be an arbitrary right-inverse of G2 (such a bounded right-inverse C D exists since V is closed). This right-inverse must be of the B(V ; X ) be the bounded linear operator that maps
A
1X form G−1 (the middle component must be the identity operator since 2,right = C 4 5 4z5 z x = x for all x ∈ V ). By property (i), V = R G−1 L, hence G2 w 2,right w C D C D 4 B 5 A A V = 1X X L = 1X X 0 L. C
This implies (3.3).
C
D
State/Signal Systems
125
We still have to show that D is injective and has closed range, and for this we need properties C (iii) Dand (iv) (which we have not used up to now). By construction A B
the operator 1X 0 is injective. It then follows from Lemma 2.3 that the operator C D C D 3 2 A B X G2,3 1X 0 = 1CX D0 : [ X L ] → [ W ] also must be injective (since we now assume C D
(iii)). This implies that D is injective. That 2 the3 range is closed follows from (iv), i.e., from the closedness of D(F ) = R 1CX D0 , since (as we observed above) 54 5 5 4 2 3 4 X 1X 0 , where is boundedly invertible. R 1CX D0 = 1CX 10W R(D ) C 1W 2 A B 3 We shall call a colligation Σdv/s/s := C ; X , L, W , where L is a Hilbert D space and A , B , C , and D satisfy (3.1)–(3.3) a driving variable representation of the state/signal node Σ = (V ; X , W). We shall also refer to Σdv/s/s as a drivingvariable/state/signal node. By the driving-variable/state/signal system Σdv/s/s we mean the node Σdv/s/s itself together with the set of all trajectories (x(·), (·), w(·)) generated by this node through the equations x(k + 1) = A x(k) + B (k), w(k) = C x(k) + D (k),
n1 ≤ k ≤ n2 .
(3.4)
The space L considered above is called a driving variable space, and the vector ∈ L in (3.3) is called a driving variable. (The notion of a driving variable is known in the finite-dimensional setting from the theory of behaviors; see, e.g., [WT02].) From each trajectory (x(·), (·), w(·)) of the driving-variable/state/signal system Σdv/s/s we get a trajectory (x(·), w(·)) of the state/signal system Σ by simply deleting the driving variable component . It follows from part 3) of Proposition 3.2 below that this correspondence between the trajectories of the two types of systems is one-to-one. Let us next point out some important properties of driving variable representations. ; X , W) node with the driving variable Proposition 3.2. Let Σ = (V 2 A 3 be a state/signal B ; X , L, W , and let F : D(F ) → X be the linear representation Σdv/s/s = C D operator defined in Lemma 2.3. Then the following assertions are true. 3 2 1) R 1CX D0 = D(F ), R (B ) = R0 , R (D ) = U0 , and the preimage of R (D ) under C is given by U0 , where R0 = F [ w0 ] - [ w0 ] ∈ D(F ) -4z5 6 7 = z ∈ X - 0 ∈ V for some w ∈ W , (3.5) w U0 = w ∈ W - [ w0 ] ∈ D(F ) -4z5 7 6 (3.6) = w ∈ W - w0 ∈ V for some z ∈ X ,
126
D.Z. Arov and O.J. Staffans
- x - [ 0 ] ∈ D(F ) - 4z5 7 - x (3.7) - 0 ∈ V for some z ∈ X . 2 3 Consequently, the ranges of B , D , and 1CX D0 do not depend on the particular choice of Σdv/s/s . 2) The space L is2 isomorphic to the space U0 defined in (3.6). 3 3) The operator 1CX D0 has a bounded inverse mapping D(F ) one-to-one onto [X L ], and the vector in the representation (3.3) is uniquely determined by x [w ] via C D D−1 C D C D C x x x 1X 0 , ∈ D(F ). (3.8) = C D w w 2 3 4) The operator A B is given by 3 2 3 2 A B = F 1CX D0 . (3.9) U0 = x ∈ X 6 = x∈X
Consequently, A is determined uniquely by C and B is determined uniquely by D . Proof. Assertion 1) follows from (3.3) and the definition of F . To see that assertion 2) holds it suffices to note that the operator D maps L one-to-one onto U0 , and by the closed graph theorem, then inverse of this operator is also bounded. Assertions 3) and 4) were established as a part of the proof of Lemma 3.1. 2 A B 3 Theorem 3.3. Let Σdv/s/s = C D ; X , L, W be a driving variable representation of a state signal system Σ, and let D C DC D C A B 1X 0 A1 B1 = (3.10) C1 D1 C D K M where (3.11) K ∈ B(X ; L), M ∈ B(L1 ; L), and M has a bounded inverse, 4 5 A B for some Hilbert space L1 . Then Σ1dv/s/s = C1 D1 ; X , L1 , W is a driving vari1
1
able representation of Σ. Conversely, every driving variable representation Σ1dv/s/s of Σ may be obtained from formula (3.10) for some operators K and M satisfying (3.11). The operators K and M are uniquely defined by Σdv/s/s and Σ1dv/s/s via D K = C1 − C and D M = D1 . (3.12) 5 4 A B Proof. Suppose that Σ1dv/s/s = C1 D1 ; X , L1 , W given by (3.10) for some op1 1 2 3 erators K and M satisfying (3.11). It follows from (3.11) that 1KX M0 maps [ X L] 2X 3 one-to-one onto L1 . By (3.3) and (3.10), ⎡ ⎤ ⎤ ⎤ ⎡ ⎡ DC D A1 B1 C D A B C A B C D X 1 X 0 X X ⎣1X 0⎦ = V. = ⎣1X 0 ⎦ = ⎣1X 0 ⎦ L L M K L 1 1 C1 D1 C D C D
State/Signal Systems
127
Furthermore, D1 = D M is injective and has closed range. Thus Σ1dv/s/s is a driving variable representation of Σ. We next turn to the converse part. By statements 1) and 3) of Proposition 3.2, 2 G H 3 2 1X 0 3−1 4 1X 0 5 is a bounded linear operator mapping the operator K M := C D C1 D1 5 2 4 2X 3 3 2 G H 3 1X 0 X = 1CX D0 K L1 one-to-one onto [ L ]. It follows from the identity C1 D1 2 1X M0 3 2 G H 3 that G = 1X and that H = 0, and the invertibility of K M = K M implies that M is invertible. Thus, (3.11) and (3.12) hold. By statement 4) of Proposition 3.2, 2 32 3−1 2 3 4 1 0 5−1 = A1 B1 CX1 D1 , F = A B 1CX D0 3 2 3 2 3 2 hence A1 B1 = A B 1KX M0 . Thus equation (3.10) holds. Finally, we remark that (3.12) determines K and M uniquely since D is injective. Definition 3.4. Let Σ = (V ; X , W) be a state/signal system. 1) By an externally generated trajectory of Σ on [0, n] or on Z+ we mean a trajectory (x(·), w(·)) satisfying x(0) = 0. 2) The reachable subspace Rn of Σ in time n is the subspace of all the final states x(n + 1) of all externally generated trajectories (x(·), w(·)) of the system Σ on the interval [0, n]. 3) The (approximately) reachable subspace R of Σ (in infinite time) is the closure in X of all the possible values of the state components x(·) of all externally generated trajectories (x(·), w(·)) of the system Σ on Z+ . 4) The system is (approximately) controllable if the reachable subspace is all of X . Thus, Rn ⊂ Rn+1 , R = ∨n∈Z+ Rn (we get the first inclusion by taking x(0) = 0 and w(0) = 0, so that also x(1) = 0; for the second inclusion we use part 1) of Proposition 2.2). Observe, in particular, that the subspace R0 defined above coincides with the subspace R0 defined in (3.5). The subspaces Rn and R in Definition 3.4 have the following simple characterizations in terms of an arbitrary driving variable representation of Σ. Proposition 3.5. Let Σ = (V ; X2, W) be3 a state/signal system, with a driving vari A B ; X , L, W . Then the subspaces R defined able representation Σdv/s/s = C n D above and the reachable subspace R are given by (3.13) Rn = span{R (A )k B | 0 ≤ k ≤ n}, n ∈ Z+ , k (3.14) R = ∨k∈Z+ R (A ) B . In particular, Σ is controllable if and only if X = ∨k∈Z+ R (A )k B .
(3.15)
128
D.Z. Arov and O.J. Staffans
Proof. Let (x(·), w(·)) be an externally generated trajectory of Σ on [0, n]. It follows from the representation (3.3) (by induction) that x(n + 1) can be written in the form n (A )k B (n − k) x(n + 1) = k=0
for some sequence {(k)}nk=0 . Thus, x(n + 1) belongs to the linear span of k n {R (A ) B }k=0 . Conversely, to each such sequence {(k)}nk=0 corresponds a trajectory on [0, n] for which x(n+1) is given by the formula above. This proves (3.13). Letting n → ∞ in (3.13) we get (3.14). The final statement follows from (3.14) and the definition of controllability.
4. The output nulling representation In our second representation of the generating subspace V we write V as the kernel of a surjective operator of the following type. 4X 5 Lemma 4.1. Let V be a subspace of the product space K := X , where X and W W are Hilbert spaces. If there exists a Hilbert space K and four operators A ∈ B(X ), B ∈ B(W; X ), C ∈ B(X , K), and D ∈ B(W; K)
(4.1)
where D is surjective such that C V =N
−1X 0
A C
B D
D
F
⎡ z ⎤ - z = A x + B w ⎣ ⎦ x ∈K, = - 0 = C x + D w w
(4.2)
(4.3)
then V has properties (i)–(iv) listed in Definition 2.1, i.e., (V ; X , W) is a state/ signal node. Conversely, if V has properties (i)–(iv) listed in Definition 2.1 then V is given by (4.3) for some Hilbert space K and some operators A , B , C , and D satisfying (4.1) and (4.2). Proof. Trivially, if V is given by (4.3), then V has property (iii). That4 (i) holds fol5 A B . lows from the fact that V is the kernel of the bounded linear operator −10X C D
Define F as in Lemma 2.3. That (iv) holds follows 2 3 from the fact that D(F ) is the kernel of the bounded linear operator C D . Finally, (ii) holds since the surjectivity of D guarantees that for every x ∈ X it is possible to find some w ∈ W x such that C x + D w = 0, i.e., [ w ] ∈ D(F ). Conversely, suppose that V has properties 2 (i)–(iv). 3 Then Xthe operator F in C D Lemma 2.3 is bounded and D(F )2 is closed. Let ∈ B([ W ] ; K) be an arbi3 trary surjective operator with N C D = D(F ) (e.g., let K be a complemen-
State/Signal Systems
129
2 3 2 3 D(F ) X ] and let C D = PK ). Let A B be an artary subspace to D(F ) in [ W 3 2 X ] ; X ) (e.g., take A B = F PDK(F ) bitrary extension of F to an operator in B([ W 3 2 D is surjective and (4.1) and (4.3) hold. with K chosen as above). Then C It remains to show that D is surjective, and for this we need property (ii) (which has not yet been used). It follows from (4.3) 3 holds if and only 2 that (ii) if R (C ) ⊂ R (D ). Because of the surjectivity of C D , this is equivalent to (4.2). 2 A B 3 We shall call a colligation Σs/s/on := ; X , W, K , where K is a C D Hilbert space and A , B , C , and D satisfy (4.1)–(4.3) an output nulling representation of the state/signal node Σ = (V ; X , W). (Output nulling representations are known in the finite-dimensional case from the theory of behaviors; see, e.g., [WT02].) We shall also refer to Σs/s/on as a signal/state/output nulling node. By the signal/state/output nulling system Σs/s/on we mean the node Σs/s/on itself together with the set of all trajectories generated by this node. However, the notion of a trajectory of such a node differs slightly from the corresponding notions for a state/signal node or a driving-variable/state/signal node. By a trajectory of Σs/s/on on [n1 , n2 ] we mean a triple of sequences (x(·), w(·), e(·)) which satisfy x(k + 1) = A x(k) + B w(k), e(k) = C x(k) + D w(k),
(4.4)
n1 ≤ k ≤ n2 .
Here we interpret w as input data and e as output data. Thus, not every trajectory of (4.4) corresponds to a trajectory of the corresponding state/signal system Σ; this is true exactly for those trajectories whose output e(·) is null (i.e., it vanishes identically). We shall refer to e as the error variable, and to the space K as the error space. Output nulling representations have a number of important properties listed below. Proposition 4.2. Let Σ = 2 (V ; X , W) node with the output nulling 3 be a state/signal A B ; X , W, K , and let F : D(F ) → X be the linear representation Σs/s/on = C D operator defined in Lemma 2.3. Then the following assertions are true. 1) The operator F is given by 2 2 3 F = A B |D(F ) with D(F ) = N C
3 D .
(4.5)
2) We have N (D ) = U0 , N (C ) = U0 , R (B |U0 ) = R0 ,
(4.6)
where R0 , U0 , and U0 are defined in (3.5)–(3.7). Consequently, the range and kernels listed above do not depend on the particular choice of Σs/s/on . 3) Let Y0 be a direct complement in W to the space U0 defined 5 i.e., 4 in (3.6), W = Y0 U0 . Then D |Y0 maps Y0 one-to-one onto K and
1X B |Y0 0 D |Y0
maps
130
D.Z. Arov and O.J. Staffans 2X 3
one-to-one onto [ X K ], and consequently, these operators are boundedly invertible. Moreover, C DC D C DC D B 1X F A 1X = , (4.7) 0 HY0 C D HY0 Y0
or equivalently, C D C 1 A = X C 0
B |Y0 D |Y0
D C D C D C D F 1X 0 − , 0 HY0 HY0
(4.8)
where HY0 : X → W is the operator defined by HY0 x = w, where w is the x unique element in Y0 such that [ w ] ∈ D(F ). Consequently, A is determined uniquely by B and C is determined uniquely by D . 4) The space K is isomorphic to every direct complement in W to the space U0 defined in (3.6). Proof. We leave the straightforward proofs of 1) and 2) to the reader. That the restriction of D to any complement Y0 of U0 is invertible with a bounded infrom the fact that N (D ) = U0 . This implies that the restriction of 5 4verse follows 2 3 1X B to YX0 is invertible with a bounded inverse. Formula (4.7) follows from 0 D (4.3) and (4.5). Clearly (4.8) is equivalent to (4.7). Finally, 4) follows from the invertibility of D |Y0 established in 3). 2 A B 3 ; X , W, K) be an output nulling representaTheorem 4.3. Let Σs/s/on = ( C D tion of a state/signal system Σ, and let C D C DC D A1 B1 1X K A B = , (4.9) C1 D1 0 M C D where K ∈ B(K, X ), M ∈ B(K, K1 ), and M has a bounded inverse, for some Hilbert space K1 . Then Σ1s/s/on =
4
A 1 B1 C1 D1
5
; X , W, K1
(4.10)
is an output nulling representation of Σ. Conversely, every output nulling representation Σ1s/s/on of Σ may be obtained from the formula (4.9) for some operators M and K satisfying (4.10). The operators M and K are uniquely defined by Σs/s/on and Σ1s/s/on via (4.11) M D = D1 and K D = B1 − B . 5 4 A1 B1 ; X , W, K1 is given by (4.9) for some Proof. Suppose that Σ1s/s/on = C D 1
1
operators K and M satisfying (4.10). It follows from (4.9) and (4.10) that
State/Signal Systems C D1
= M D is surjective, that C N
−1X 0
A1 C1
B1 D1
D
1X 0 0 0 1X K 0 0 M
131
D is invertible, and that
⎤ ⎛⎡ 0 0 C 1X −1X A = N ⎝⎣ 0 1X K ⎦ 0 C 0 0 M C D −1X A B =N = V. 0 C D
⎞ D B ⎠ D
Thus Σ1s/s/on is an output nulling representation of Σ. We next turn to the converse part. Let Y be an arbitrary complement to D(F ). By part 3) of Proposition 4.2, the operator D C DC D−1 C 1X B1 |Y0 1X B |Y0 K G := H M 0 D1 |Y0 0 D |Y0 2 3 is a bounded linear operator mapping [ X onto KX1 . It follows from K ] one-to-one 5 5 4 4 2 G K 3 1X B |Y0 1 B | that G = 1X and that H = 0, the identity 0X D1 |Y0 = H M 0 D |Y0 Y 1 0 5 4 2 G K 3 K implies that M is invertible. Thus, and the invertibility of H = 10X M M 5 4 5 4 −1 4 5 1 B | 1X B |Y0 −1 2 A 3 A1 = , hence (4.10) and (4.11) hold. By (4.8), 0X D1 |Y0 C1 0 D |Y 0 C 1 Y0 4 5 4 5 2 3 A1 K A = 10X M . Thus equation (4.9) holds. C C 1
Finally, we remark that (4.11) determines K and M uniquely since D is surjective. Definition 4.4. Let Σ = (V ; X , W) be a state/signal system.
1) By an unobservable trajectory of Σ on [0, n] or on Z+ we mean a trajectory (x(·), 0) (i.e., the signal component of this trajectory is identically zero on [0, n] or on Z+ ). 2) The unobservable subspace Un of Σ in time n is the subspace of the initial states x(0) of all unobservable trajectories (x(·), 0) of Σ on [0, n]. 3) The unobservable subspace U of Σ (in infinite time) is the subspace of the initial states x(0) of all unobservable trajectories (x(·), 0) of Σ on Z+ . 4) The system is (approximately) observable if the unobservable subspace is {0}. Thus, Un+1 ⊂ Un ,
U = ∩n∈Z+ Un .
Observe, in particular, that the subspace U0 defined above coincides with the subspace U0 defined in (3.7). The subspaces Un and U in Definition 4.4 have the following simple characterizations in terms of an arbitrary output nulling representation of Σ.
132
D.Z. Arov and O.J. Staffans
Proposition 4.5. Let Σ = (V ; X , W) be a state/signal system and let Σs/s/on = 2 A B 3 ( C ; X , W, K) be an output nulling representation of this system. Then D Un = ∩0≤k≤n N C (A )k , (4.12) k U = ∩k∈Z+ N C (A ) . (4.13) In particular, Σ is observable if and only if ∩k∈Z+ N C (A )k = {0}. (4.14) k Proof. If x0 ∈ ∩0≤k≤n N C (A ) , i.e., if C (A )k x0 = 0 for 0 ≤ k ≤ n, then it follows from (4.3) that (x(·), w(·)), where x(k) = (A )k x0 and w(k) = 0, 0 ≤ k ≤ n, is a trajectory of Σ on the interval [0, n]. Thus, x0 ∈ Un in this case. Conversely, if (x(·), w(·)) is a trajectory of Σ on [0, n] with x(0) = x0 and w(k) = 0, 0 ≤ k ≤ n, then by (4.3) x(k + 1) = A x(k) 0 = C x(k), 0 ≤ k ≤ n, k which gives x0 ∈ N C (A ) for all k, 0 ≤ k ≤ n. Thus (4.12) holds. Letting n → ∞ in (4.12) we get (4.13). The final statement follows from (4.13) and the definition of observability.
5. The input/state/output representation In this section we shall discuss a third type of representation of a state/signal system Σ = (V ; X , W) in which trajectories (x(·), w(·)) on Z+ of Σ are described by the usual system of equations (1.2) in the traditional input/state/output theory. 4X 5 Theorem 5.1. Let V be a subspace of the product space K := X , where X W and W are Hilbert spaces, and suppose that W = Y Uis the direct sum of two complementary closed subspaces Y and U. If there exists four operators A ∈ B(X ), B ∈ B(U; X ), C ∈ B(X , Y), and D ∈ B(U; Y), ⎛⎡
(5.1)
⎤⎞ A B D C ⎜⎢1X 0 ⎥⎟ 0 B −1X A ⎜ ⎢ ⎥ ⎟ V = R ⎝⎣ =N 0 C −1Y D C D ⎦⎠ 0 1U (5.2) ⎤F
⎡ Ax + Bu ⎦ -- x ∈ X , u ∈ U , x = ⎣ Cx + Du + u then V has properties (i)–(iv) listed in Definition 2.1, i.e., (V ; X , W) is a state/ signal node. Conversely, if V has properties (i)–(iv) listed in Definition 2.1 then V is given by (5.2) for some operators A, B, C, and D satisfying (5.1) for some decomposition W = Y U. These operators are uniquely defined by V and by the decomposition W = Y U. such that
State/Signal Systems
133
Proof. The representation (5.2) has an obvious interpretation as a driving variable 2 3 representation of V (take C = [ C0 ] and D = 1DU ). Thus, by Lemma 3.1, if V is given by (5.2) for some operators A, B, C, and D satisfying (5.1), then V has properties (i)–(iv). To prove the converse part we start from an arbitrary driving variable representation of V (e.g., from the one constructed in the proof of the converse part of Lemma 3.1), i.e., we 4let 5L be a Hilbert space, and let A , B , C , and D satisfy z (3.1)–(3.3). Then each x ∈ V can be written in the form w
⎡ ⎤ ⎡ z A ⎣ x ⎦ = ⎣1X C w
⎤ B C D x 0⎦ , D
for a unique ∈ L. Let W = Y U be an arbitrary decomposition of W with the property that PUY D maps L one-to-one onto U (for example, we can take U = U0 , with U0 defined as in (3.6), and take Y to be an arbitrary4 direct complement to 5 z U0 ). With respect to this decomposition of W the vector x can be written in the form (where we denote u = PUY w and y = PYU w) ⎡ ⎤ ⎡ A z ⎢x⎥ ⎢ 1X ⎢ ⎥=⎢ U ⎣ y ⎦ ⎣PY C u PUY C
w
⎤ B C D 0 ⎥ x ⎥ . PYU D ⎦ Y PU D
Since PUY D is boundedly invertible, we can solve for to get the equivalent representation ⎤ ⎡ ⎤ ⎡ B A z D−1 C D C ⎢x⎥ ⎢ 1X 0 ⎥ 0 1X x ⎥ ⎢ ⎥=⎢ U ⎣y ⎦ ⎣PY C PYU D ⎦ P Y C P Y D u U U u PUY C PUY D ⎡ ⎤ PUY D )−1 PUY C B (P PUY D )−1 A − B (P C D ⎢ ⎥ x 1X 0 ⎥ =⎢ . ⎣P U C − P U D (P PUY D )−1 PUY C PYU D (P PUY D )−1 ⎦ u Y Y 0 1U This representation is of the type (5.2) with C
A C
DC D−1 D C 0 1X B A B = PYU C PYU D PUY C PUY D D C D A − B (P PUY D )−1 PUY C B (P PUY D )−1 = . PYU C − PYU D (P PUY D )−1 PUY C PYU D (P PUY D )−1
(5.3)
134
D.Z. Arov and O.J. Staffans
A B ] follows from the fact that (5.2) is a graph represenThe uniqueness of [ C D CX D C 0 D tation of V with respect to the decomposition of K into K = Y0 X0 , and the 0
U
operator appearing in this graph representation is unique. AB We shall call a colligation Σi/s/o := [ C D ] ; X , U, Y , where W = Y U and A, B, C, and D satisfy (5.1) and (5.2) an input/state/output representation of the state/signal node Σ = (V ; X , W). We shall also refer to Σi/s/o as an input/state/ output node. By the input/state/output system Σi/s/o we mean the node Σi/s/o itself together with the set of all trajectories (x(·), u(·), y(·)) generated by this node through the equations x(k + 1) = Ax(k) + Bu(k), y(k) = Cx(k) + Du(k),
n1 ≤ k ≤ n2 .
(5.4)
The subspace U considered above is called an input space, and the vector u ∈ U in (5.2) is called an input variable. Analogously, the subspace Y considered above is called an output space, and the vector y ∈ Y in (5.2) is called an output variable. From each trajectory (x(·), u(·), y(·)) of the input/state/output system Σi/s/o we get a trajectory (x(·), w(·)) of the state/signal system Σ by taking w = u + y, and conversely, from each trajectory (x(·), w(·)) of the state/signal system Σ we get a trajectory (x(·), u(·), y(·)) of the input/state/output system Σi/s/o by taking u = PUY w and y = PYU w. Remark 5.2. Every input/state/output representation can be interpreted both as a driving variable representation and as an output nulling representation. In both cases we combined u and y into the signal vector w = [ uy ]. We get a driving variable representation by writing (5.2) in the form z = Ax + Bu, C D C D C D y C D = x+ u, 0 1U u 2 3 with driving variable space U (the operator D = 1DU is injective and has closed range), and we get an output nulling representation by writing it in the form C D 2 3 y z = Ax + 0 B , u C D 3 y 2 , 0 = Cx + −1Y D u 3 2 with error space Y (the operator D = −1Y D is surjective). Remark 5.3. In the standard input/state/output systems theory one considers trajectories (x(·), u(·), y(·)) generated by (5.4), but the input space U and the output space Y are not required to be complementary subspaces of a given signal space W. Nevertheless, also in this situation it is possible to introduce the product 2 3 space W = Y with an appropriate inner product, to identify Y with the subspace U
State/Signal Systems
135
2Y 3
of W, and to identify U with the subspace [ U0 ] of W. Then W = Y U, the triple Σ = (V ; X , W) with V defined by (5.2) is a state/signal node, and the original input/state/output system is an input/state/output representation of this node. 0
Remark 5.4. Each driving variable representation Σdv/s/s of a state/signal system may be interpreted as an input/state/output system, with the driving variable as input data and the original signal as output data. We can and will therefore apply all notions, notations, and results that we will define or obtain for input/ state/output systems to such driving variable representations. In this connection we throughout replace the word “input” by “driving” and the word “output” by “signal”. An analogous remark is valid for output nulling representations of state signal systems. When we interpret such representations as input/state/output systems we throughout replace the word “input” by “signal” and the word “output” by “error”. Proposition 5.5. Let Σ = (V ; X , W) be a state/signal system, with an input/state/ A B ] ; X , U, Y . output representation Σi/s/o = [ C D 1) The reachable subspaces Rn in time n and the reachable subspace R are given by Rn = span{R Ak B | 0 ≤ k ≤ n}, n ∈ Z+ , (5.5) k R = ∨k∈Z+ R A B . (5.6) In particular, Σ is controllable if and only if X = ∨k∈Z+ R Ak B .
(5.7)
2) The unobservable subspaces Un in time n and the unobservable subspace U are given by Un = ∩0≤k≤n N CAk , (5.8) k (5.9) U = ∩k∈Z+ N CA . In particular, Σ is observable if and only if ∩k∈Z+ N CAk = {0}. Proof. This follows from Propositions 3.5 and 4.5 and Remark 5.2.
(5.10)
Definition 5.6. Let Σ = (V ; X , W) be a state/signal system. We 2 3call the ordered direct sum decomposition W = Y U (also denoted by W = Y U ) an admissible (input/output) decomposition for Σ if Σ has an input/state/output representation with input space U and output space Y. Our following theorem characterizes the set of all admissible input/output decompositions. Lemma 5.7. Let Σ = (V ; X , W) be a state/signal node, and let W = Y U be a direct sum decomposition of W. Define U0 as in (3.6). Then the following statements are equivalent:
136
D.Z. Arov and O.J. Staffans
1) W = Y U is an admissible input/output decomposition for Σ. 2) PUY |U0 maps U0 one-to-one onto U, i.e., (P PUY |U0 )−1 ∈ B(U; U0 ). 3) The space U0 has the graph representation 2 3 U0 = w = 1DU u | u ∈ U ,
(5.11)
for some D ∈ B(U; Y). If the decomposition W = Y U is admissible for Σ, then the operator D in (5.11) coincides with the operator D in (5.2). Proof. Proof of 1) ⇒ 3): If 1) holds, then the representation (5.2) of V gives us a graph space representation of U0 (with the same operator D as in (5.2)). Proof of 3) ⇒ 2): If 3) holds, then PUY maps U0 one-to-one onto U, and D = PYU (P PUY |U0 )−1 . 2 A B 3 ; X , L, W be an arbitrary driving Proof of 2) ⇒ 1): Let Σdv/s/s = C D variable representation of Σ. Then PUY maps U0 one-to-one onto U and PUY D maps L one-to-one onto U. The proof of Theorem 5.1 provides us with an input/state/ output representation of Σ with input space U and output space Y. Remark 5.8. According to Lemma 5.7, if Y is an arbitrary direct complement to the subspace U0 in (3.6), then W = Y U0 is an admissible decomposition for Σ. For this reason we shall refer to U0 as the canonical input space. The admissibility of a given decomposition of the signal space of a given state/signal system Σ can also be studied by means of a given driving variable, or output nulling, or input/state/output representation of the given system Σ. Lemma 5.9. Let Σ = (V ;2X , W) 3be a state/signal node with the driving variable A B ; X , L, W . representation Σdv/s/s = C D 1) W = Y U is an admissible input/output decomposition for Σ if and only if PUY D )−1 ∈ B(U; L). PUY D maps L one-to-one onto U, i.e., (P
(5.12)
2) If the decomposition W = Y U is admissible for Σ, then the corresponding operators A, B, C, and D in (5.2) are given by (5.3). Proof. In the proof of Theorem 5.1 we constructed an input/state/output representation of Σ under the assumption that (5.12) holds. Thus, (5.12) is sufficient for admissibility. Conversely, suppose that the decomposition is admissible for Σ. Then by Lemma 5.7, PUY maps the canonical input space U0 = R (D ) one-to-one onto U, and D is injective. Thus, (5.12) is also necessary for admissibility. Lemma 5.10. Let Σ = (V2; X , W) 3be a state/signal node with the output nulling A B ; X , W, K , and let W = Y U be a direct sum representation Σs/s/on = C D decomposition of W. 1) W = Y U is an admissible input/output decomposition for Σ if and only if D |Y maps Y one-to-one onto K, i.e., (D |Y )−1 ∈ B(K; Y).
(5.13)
State/Signal Systems
137
2) If the decomposition W = Y U is admissible for Σ, then the corresponding operators A, B, C, and D in (5.2) are given by D−1 C D C D C A B |U A B 1X −B |Y = 0 −D |Y C D |U C D (5.14) D C A − B |Y (D |Y )−1 C B |U − B |Y (D |Y )−1 D |U . = −(D |Y )−1 C −(D |Y )−1 D |U 4z5 4z5 Proof. Take an arbitrary x ∈ K. By (4.3), x ∈ V if and only if w w C D C DC D z A B x = . 0 C D w With u = PUY w and y = PYU w this can be written in the equivalent form ⎡ ⎤ C D C D x z A B |Y B |U ⎣ ⎦ y . = 0 C D |Y D |U u If the decomposition W = Y U is admissible for Σ, then the condition
(5.15) 4z5 x w
∈V
determines y uniquely as a continuous function of x and u (by (5.2), y = Cx + Du), and therefore the operator D |Y in (5.15) must map Y one-to-one onto K (recall that the range of D is all of K). Thus (5.13) is a necessary condition for admissibility. Conversely, suppose that (5.13) holds. Then (5.15) can be written in the equivalent form D−1 C DC D C D C B |U x 1X −B |Y A z = 0 −D |Y C D |U u y C DC D A − B |Y (D |Y )−1 C B |U − B |Y (D |Y )−1 D |U x = . −(D |Y )−1 C −(D |Y )−1 D |U u This is an input/state/output representation with A , B , C , and D given by (5.14). Thus, (5.13) is also sufficient for the admissibility of the decomposition W = Y U. Theorem 5.11. Let Σ = (V ; X, W) be a state/signal node with the input/state/ A B ] ; X , U, Y . Let W = Y U be a direct sum output representation Σi/s/o = [ C 1 1 D 2 3 2 Y1 3 ; ) by (1.6). decomposition of W, and define Θ ∈ B( Y U1 U 1) W = Y1 U1 is an admissible input/output decomposition for Σ if and only if Θ21 D + Θ22 maps U one-to-one onto U1 , i.e., (5.16) (Θ21 D + Θ22 )−1 ∈ B(U1 ; U). 2) If the decomposition W = Y1 U1 is admissible for Σ, then the corresponding operators A1 , B1 , C1 , and D1 are given by C D C DC D−1 0 A1 B1 A B 1X , (5.17) = C1 D1 Θ11 C Θ11 D + Θ12 Θ21 C Θ21 D + Θ22
138
D.Z. Arov and O.J. Staffans or equivalently, A1 = A − B(Θ21 D + Θ22 )−1 Θ21 C, B1 = B(Θ21 D + Θ22 )−1 , C1 = Θ11 C − (Θ11 D + Θ12 )(Θ21 D + Θ22 )−1 Θ21 C,
(5.18)
D1 = (Θ11 D + Θ12 )(Θ21 D + Θ22 )−1 . Proof. This follows from Remark 5.2 and Lemma 5.9.
Theorem 5.12. Let Σ = (V ; X , W) be a state/signal node with the input/state/ A B ] ; X , U, Y , and let W = Y U be a direct output representation Σi/s/o = [ C 1 1 D 2 3 2 3
∈ B( Y1 ; Y ) by sum decomposition of W. Define Θ U1 U C D C U D
11 Θ
12 PY |Y1 PYU |U1 Θ
Θ= (5.19)
22 = P Y |Y P Y |U . Θ21 Θ 1 1 U U 1) W = Y1 U1 is an admissible input/output decomposition for Σ if and only if
11 − DΘ
21 maps Y1 one-to-one onto Y. Θ (5.20) 2) If the decomposition W = Y1 U1 is admissible for Σ, then the corresponding operators A1 , B1 , C1 , and D1 are given by D C D−1 C D C
22
21 1X A BΘ −B Θ A1 B1 = (5.21)
11 − DΘ
21
12 + DΘ
22 , C1 D1 0 Θ C −Θ or equivalently,
21 (Θ
11 − DΘ
21 )−1 C, A1 = A + B Θ
22 + B Θ
21 (Θ
11 − DΘ
21 )−1 (−Θ
12 + DΘ
22 ), B1 = B Θ
11 − DΘ
21 )−1 C, C1 = (Θ
11 − DΘ
21 )−1 (−Θ
12 + DΘ
22 ). D1 = (Θ Proof. This follows from Remark 5.2 and Lemma 5.10.
(5.22)
6. Transfer functions The (input-output) transfer AB function of discrete time input/state/output system Σi/s/o = [ C ] ; X , U, Y is defined by the formula D D(z) = D + zC(1X − zA)−1 B,
z ∈ ΛA ,
(6.1)
where ΛA is the set of points z ∈ C for which (1X − zA) has a bounded inverse, plus the point at infinity if A is boundedly invertible. The set ΛA is the maximal domain of analyticity of the function zA(z), where A is the (Fredholm) resolvent of A, i.e., (6.2) A(z) = (1X − zA)−1 , z ∈ ΛA .
State/Signal Systems
139
Thus, both D and A will be defined on the same subset ΛA of the extended complex plane. The resolvent A may have an analytic extension to the point at infinity even if A does not have a bounded inverse, and the transfer function D may have an analytic extension to a larger domain, but in this paper we shall not make any use of such extensions. Note that D(z) = D + zCA(z)B, that D(0) = D and that D(∞) = D − CA−1 B (if A is boundedly invertible). The function D arises in a natural way when one studies the Z-transform of a trajectory (x(·), u(·), y(·)) of Σi/s/o on Z+ . Let us denote the formal power series ∞ ∞ 2 induced by the sequences {x(n)}∞ n=0 , {y(n)}n=0 , and {u(n)}n=0 by x ˆ(z) =
∞
n
x(n)z ,
yˆ(z) =
n=0
∞
n
y(n)z ,
u ˆ(z) =
n=0
∞
u(n)z n .
n=0
The system of equations (1.2) is then equivalent to the following system of equations for formal power series: ˆ(z) = x(0) + zAˆ x x(z) + zB u ˆ(z),
(6.3)
yˆ(z) = C x ˆ(z) + Dˆ u(z).
Solving these equations for xˆ and yˆ in terms of x(0) and u ˆ we get the more explicit formula C D C D C D B(z) x ˆ(z) A(z) u ˆ(z), (6.4) = x(0) + yˆ(z) C(z) D(z) where the right-hand side should be interpreted as sums and products of (formal) power series of the following type: x(0) is just a constant, u ˆ(z) is the formal power , and the multipliers A(z), B(z), C(z), series induced by the sequence {u(n)}∞ n=0 and D(z), represent the MacLaurin series of the corresponding functions defined by (6.1), (6.2), and by B(z) = z(1X − zA)−1 B = zA(z)B,
z ∈ ΛA ,
C(z) = C(1X − zA)−1 = CA(z),
z ∈ ΛA ,
(6.5)
that is, A(z) = C(z) =
∞ n=0 ∞ n=0
An z n ,
B(z) =
∞
An Bz n+1 ,
n=0 n n
CA z ,
D(z) = D +
∞
(6.6) n
CA Bz
n+1
.
n=0
2 The alternative transform where z is replaced by 1/z is also frequently used. The corresponding transfer function is then given by D + C(z − A)−1 B, defined on the resolvent set of A, including the point at infinity.
140
D.Z. Arov and O.J. Staffans
The corresponding time-domain formulas are x(n) = An x(0) +
n−1
Ak Bu(n − k − 1),
k=0
y(n) = CAn x(0) + Du(n) +
n−1
(6.7) CAk Bu(n − k − 1),
n ∈ Z+
k=0
(where we interpret an empty sum as zero). From time to time we shall need to refer to the different maps in (6.7), and therefore introduce the following termiˇ : X → X Z+ , the input-to-state map nology. We define the state-to-state map A + + ˇ : U Z → X Z , the state-to-output map C ˇ : X → Y Z+ , and the input-to-output B + + ˇ : U Z → U Z by map D ˇ (Ax)(n) = An x, ˇ (Bu)(n) =
n−1
n ∈ Z+ ,
Ak Bu(n − k − 1),
n ∈ Z+ ,
k=0
ˇ (Cx)(n) = CAn x, ˇ (Du)(n) =D+
n−1
n ∈ Z+ ,
(6.8)
CAk Bu(n − k − 1), n ∈ Z+ .
k=0
It is frequently possible to interpret the above equations as equations between analytic functions defined in a neighborhood of zero rather than formal power series. It suffices to assume that the (formal) power series defining u ˆ has a strictly positive radius of convergence. This implies that also the series defining x ˆ and yˆ have a positive radius of convergence, that u ˆ, zˆ, and yˆ are analytic functions defined in a neighborhood of zero, and that (6.4) holds with A(z), B(z), C(z), and D(z) defined by (6.1), (6.2), and (6.5). In particular, if x(0) = 0, then yˆ(z) = D(z)ˆ u(z) in a neighborhood of zero, and this explains why the function D is called the input-output transfer function. Similar interpretations are valid for the transfer functions A (state to state), B (input to state), and C (state to output). A more compact way of writing (6.1), (6.2), and (6.5) is D C D C (1/z − A)−1 B zA(z) B(z) (1/z − A)−1 = C(1/z − A)−1 D + C(1/z − A)−1 B zC(z) D(z) C D−1 DC 1X 0 1/z − A −B (6.9) = C D 0 1U C D D−1 C 1/z − A 0 1X B = , z ∈ ΛA , z = 0 −C 1Y 0 D (the value at infinity is obtained by taking limits as z → ∞, and the corresponding formula for z = 0 is trivial).
State/Signal Systems
141
D C A(z) B(z) V(z) := C(z) D(z) the four block input/state/output transfer function of the system Σi/s/o . 2 A B 3 A driving-variable/state/signal system Σdv/s/s = ; X , L, W may C D be interpreted as an input/state/output system with L as input space, X as ˆ w) state space, and W as output space. The Z-transform (ˆ x, , ˆ of a trajectory + (x(·), (·), w(·)) of this system on Z therefore satisfies D C D C DC C D x(0) A (z) B (z) x(0) x ˆ(z) := , (6.10) = V (z) ˆ ˆ C (z) D (z) (z) w(z) ˆ (z) We shall call
where A , B , C , and D are given by (6.9) with A, B, C, and D replaced by A , B , C , and D . We shall call V the four block driving-variable/state/signal transfer function of the system Σdv/s/s . Analogously, the Z-transform (ˆ x, w, ˆ eˆ) of a trajectory (x(·), w(·), e(·)) of a signal/state/output nulling system Σs/s/on = 2 A B 3 ; X , W, K on Z+ therefore satisfies C D C C D D C DC D xˆ(z) x(0) A (z) B (z) x(0) = V (z) , (6.11) := eˆ(z) w(z) ˆ ˆ C (z) D (z) w(z) where A , B , C , and D are given by (6.9) with A, B, C, and D replaced by A , B , C , and D . We shall call V the four block signal/state/error transfer function of the system Σs/s/on . Below we shall study relations between the four block transfer functions V, V , and V that correspond to the three types of representations (input/state/ output, driving variable, or output nulling, respectively) of a given state/signal system Σ = (V ; X , W). First we will consider the relationships between the four block driving variable transfer function of two driving-variable representations of a state/signal system. Theorem 6.1. Let 2 Σdv/s/s = A C
B D
3
; X , L, W
and
Σ1dv/s/s =
4 A1 C1
B1 D1
5
; X , L1 , W
be two driving variable representations of the state/signal system Σ =4 (V ; X , W).5 A (z) B (z) Denote the four block transfer functions of Σdv/s/s and Σ1dv/s/s by C (z) D (z) 4 5 A (z) B (z) and C1(z) D1(z) , respectively, and let K ∈ B(X ; L) and M ∈ B(L1 ; L) be the 1 1 operators in Theorem 3.3, uniquely determined by (3.12). 1) The operator 1L − K B (z) (defined on ΛA ) has a bounded inverse if and only if z ∈ ΛA ∩ ΛA1 . 2) For all z ∈ ΛA ∩ ΛA1 , C D C DC D−1 C D A1 (z) B1 (z) A (z) B (z) 1X 0 1X 0 = , C1 (z) D1 (z) C (z) D (z) −K A (z) 1L − K B (z) 0 M (6.12)
142
D.Z. Arov and O.J. Staffans or equivalently,3 A1 (z) = (1X − B (z)K )−1 A (z), B1 (z) = (1X − B (z)K )−1 B (z)M , C1 (z) = C (z) + D (z)K (1X − B (z)K )−1 A (z),
(6.13)
D1 (z) = D (z)(1L − K B (z))−1 M . Proof. The case where z = 0 is trivial, so in the sequel we assume that z = 0. Assume first that z ∈ ΛA ∩ΛA1 , with z = 0. Since z ∈ ΛA1 , we get from (6.9), D C DC D−1 C 1X 1/z − A1 −B1 0 zA1 (z) B1 (z) = 0 1L1 zC1 (z) D1 (z) C1 D1 C DC D−1 C DC D−1 −1 1X 0 0 0 1X 1/z − A1 −B1 1X = C1 D1 K M K M 0 1L1 C DC D −1 1 1/z − A 0 −B = X . −1 −(M ) K (M )−1 C D Observe, in particular, that the last block matrix above is boundedly invertible. Since also z ∈ ΛA , we can factor D C D C 1X −B 0 1/z − A = −(M )−1 K zA (z) (M )−1 (1L − K B (z)) −(M )−1 K (M )−1 C D (6.14) 1/z − A −B × . 0 1L As we noticed above, the left-hand side in boundedly invertible, and hence also the operator 1L − K B (z) must be boundedly invertible. Substituting this factorization into the formula above we get C D C DC D−1 zA1 (z) B1 (z) 1X 0 1/z − A −B = zC1 (z) D1 (z) C D 0 1L C D−1 1X 0 × −(M )−1 K zA (z) (M )−1 (1L − K B (z)) D DC D−1 C C 1X 1X 0 0 zA (z) B (z) . = zC (z) D (z) −K zA (z) 1L − K B (z) 0 M 2 3 0 Multiplying this identity to the right by 1/z we get (6.12). We have now proved 0 1 assertion 2) and one half of assertion 1). To prove the other half of assertion 1) we assume that z ∈ ΛA , z = 0, and that 1L − K B (z) is boundedly invertible. Then the block operator matrix on the left-hand side of (6.14) is also boundedly invertible. As we noticed above, 3 Note
that, by Lemma 10.1, 1L − K B (z) has a bounded inverse if and only if 1X − B (z)K has a bounded inverse.
State/Signal Systems 4 52 1X 1 −B1 this matrix factors into 1/z−A 0 1L1 K boundedly invertible, i.e., z ∈ ΛA1 . Theorem 6.2. Let 2 Σs/s/on = A C
B D
3
; X , W, K
and
3 0 −1 , M
143 and hence 1/z − A1 must be
Σ1s/s/on =
4 A1
C1
B1 D1
5
; X , W, K1
be two output nulling representations of the state/signal system Σ 4= (V ; X , W).5 A (z) B (z) Denote the four block transfer functions of Σs/s/on and Σ1s/s/on by C (z) D (z) 4 5 A (z) B (z) and C1 (z) D1 (z) , respectively, and let K and M be the operators in Theorem 1
1
4.3, uniquely determined by (4.11). 1) The operator 1K − zC (z)K (defined on ΛA ) has a bounded inverse if and only if z ∈ ΛA ∩ ΛA1 . 2) For all z ∈ ΛA ∩ ΛA1 , C
D C 1 A1 (z) B1 (z) = X C1 (z) D1 (z) 0
0 M
DC
1X 0
−zA (z)K 1K − zC (z)K
D D−1 C A (z) B (z) , C (z) D (z) (6.15)
or equivalently,4 A1 (z) = A (z)(1X − zK C (z))−1 C (z), B1 (z) = B (z) + zA (z)(1X − zK C (z))−1 K D (z), C1 (z) = M C (z)(1X − zK C (z))−1 ,
(6.16)
D1 (z) = M (1K − zC (z)K )−1 D (z). The proof of this theorem is similar to the proof of Theorem 6.1, and we leave it to the reader. AB 2 A B 3 Lemma 6.3. Let Σi/s/o = [ C ; X , L, W be D ] ; X , U, Y and Σdv/s/s = C D an input/state/output and a driving variable representation, respectively, of the state/signal system Σ 4= (V ; X , W). Denote four 5 4 the 5 block transfer functions of B(z) A (z) B (z) and , respectively. Σi/s/o and Σdv/s/s by A(z) C(z) D(z) C (z) D (z) 1) The operator PUY D (z) (defined on ΛA ) has a bounded inverse if and only if z ∈ Λ A ∩ Λ A . 2) For all z ∈ ΛA ∩ ΛA , D−1 C D C DC 1X 0 A(z) B(z) A (z) B (z) , (6.17) = C(z) D(z) PYU C (z) PYU D (z) PUY C (z) PUY D (z) 4 Note
that, by Lemma 10.1, 1K − zC C (z)K has a bounded inverse if and only if 1X − zK C (z) has a bounded inverse.
144
D.Z. Arov and O.J. Staffans or equivalently, PUY D (z))−1 PUY C (z) A(z) = A (z) − B (z)(P B(z) = B (z)(P PUY D (z))−1 C(z) = PYU C (z) − PYU D (z)(P PUY D (z))−1 PUY C (z)
(6.18)
D(z) = PYU D (z)(P PUY D (z))−1 . Proof. We interpret Σi/s/o as a driving variable representation 4 A B 5 Σ1dv/s/s = C1 D1 ; X , L1 , W 1
with L1 = U and
A
A1
B1
C1
D1
B
1
⎡
A
B
⎤
⎥ ⎢ =⎣ C D ⎦; 1U 0 see Remark 5.2. The corresponding block decomposition of Σdv/s/s is given by ⎤ ⎡ A B B A A B ⎥ ⎢ = ⎣ P U C PYU D ⎦ . Y C D PUY C PUY D To these two driving variable representations we apply Theorem 6.1. By comparing the two representations to each other we find that the operators K ∈ B(X ; L) and M ∈ B(U; L) are given by M = [P PUY D ]−1 ,
K = −[P PUY D ]−1 PUY C .
The operator 1L − K B (z) in part 1) Theorem 6.1 is given by 1L − K B (z) = 1L + [P PUY D ]−1 PUY C B (z) = [P PUY D ]−1 (P PUY D + PUY C B (z)) = [P PUY D ]−1 PUY D(z), and it is boundedly invertible if and only if PUY D(z) is boundedly invertible. Substituting the above values into (6.12) we get (6.17). AB 2 A B 3 Lemma 6.4. Let Σi/s/o = [ C D ] ; X , U, Y and Σs/s/on = ; X , W, K C D be an input/state/output and a output nulling representation, respectively, of the state/signal system Σ 4= (V ; X , W). Denote the four 5 4 5 block transfer functions of Σi/s/o and Σs/s/on by
A(z) B(z) C(z) D(z)
and
A (z) B (z) C (z) D (z)
, respectively.
1) The operator D (z)|Y (defined on ΛA ) has a bounded inverse if and only if z ∈ ΛA ∩ ΛA . 2) For all z ∈ ΛA ∩ ΛA , D C D−1 C D C 1 −B (z)|Y A(z) B(z) A (z) B (z)|U = X , (6.19) 0 −D (z)|Y C (z) D (z)|U C(z) D(z)
State/Signal Systems
145
or equivalently A(z) = A (z) − B (z)|Y (D (z)|Y )−1 C (z), B(z) = B (z)|U − B (z)|Y (D (z)|Y )−1 D (z)|U , C(z) = −(D (z)|Y )−1 C (z),
(6.20)
D(z) = −(D (z)|Y )−1 D (z)|U . Proof. This lemma is proved in the same way as Lemma 6.3, but this time we interpret Σi/s/o as an output nulling representation of Σ (as in Remark 5.2) and use Theorem 6.2 instead of Theorem 6.1. 3 AB 2 1 B1 Theorem 6.5. Let Σi/s/o = [ C D ] ; X , U, Y and Σ1i/s/o = A C1 D1 ; X , U1 , Y1 be two input/state/output representations of the state/signal system 4Σ = (V ; X5, W). B(z) Denote the four block transfer functions of Σi/s/o and Σ1i/s/o by A(z) C(z) D(z) and 4 5 2 Y 3 2 Y1 3 2 Y1 3 2 Y 3 A1 (z) B1 (z)
C1 (z) D1 (z) , respectively. Define Θ ∈ B( U ; U1 ) and Θ ∈ B( U1 ; U ) by (1.6) and (5.19), respectively. 1) For each z ∈ ΛA the following conditions are equivalent: (a) z ∈ ΛA1 . (b) The operator Θ21 D(z) + Θ22 has a bounded inverse.
21 has a bounded inverse.
11 − D(z)Θ (c) The operator Θ 2) For all z ∈ ΛA ∩ ΛA1 , C D C DC D−1 A1 (z) B1 (z) A(z) B(z) 1X 0 , = Θ11 C(z) Θ11 D(z) + Θ12 Θ21 C(z) Θ21 D(z) + Θ22 C1 (z) D1 (z) (6.21) or equivalently, A1 (z) = A(z) − B(z)(Θ21 D(z) + Θ22 )−1 Θ21 C(z), B1 (z) = B(z)(Θ21 D(z) + Θ22 )−1 , C1 (z) = Θ11 C(z) − (Θ11 D(z) + Θ12 )(Θ21 D(z) + Θ22 )−1 Θ21 C(z),
(6.22)
D1 (z) = (Θ11 D(z) + Θ12 )(Θ21 D(z) + Θ22 )−1 . 3) For all z ∈ ΛA ∩ ΛA1 , C D C D−1 C D
22
21 A1 (z) B1 (z) 1X A(z) B(z)Θ −B(z)Θ =
11 − D(z)Θ
21
12 + D(z)Θ
22 , (6.23) 0 Θ C(z) −Θ C1 (z) D1 (z) or equivalently,
21 (Θ
11 − D(z)Θ
21 )−1 C(z), A1 (z) = A(z) + B(z)Θ
22 + B(z)Θ
21 (Θ
11 − D(z)Θ
21 )−1 (−Θ
12 + D(z)Θ
22 ), B1 (z) = B(z)Θ
11 − D(z)Θ
21 )−1 C(z), C1 (z) = (Θ
11 − D(z)Θ
21 )−1 (−Θ
12 + D(z)Θ
22 ). D1 (z) = (Θ
(6.24)
146
D.Z. Arov and O.J. Staffans
Proof. Assertion 2) follows from Lemma 6.3, assertion 3) from Lemma 6.4, and for assertion 1) we need both of these lemmas. For the proof of 2) we interpret Σ1i/s/o as a driving variable representation, and for the proof of 3) we interpret Σ1i/s/o as an output nulling representation, as explained in Remark 5.2.
7. Signal behaviors, external equivalence, and similarity The behavioral approach to systems theory was introduced by Willems, and has been developed extensively by him and others (see, e.g., [PW98] for a recent presentation of behavioral theory). The vast majority of the literature on behaviors deals with finite-dimensional systems, and the existing extensions to the infinitedimensional case seem to ignore state space representations of the type that we have introduced above. Below we shall consider the problem of realization of a given behavior on a Hilbert space W by a state/signal system Σ = (V ; X , W). In order to motivate out definition of a signal behavior we first take a closer look at the signal parts of all externally generated trajectories of a state/signal system Σ = (V ; X , W). Let W be the set of all the signal sequences w(·), defined on Z+ with values in W, that are the signal components of externally generated trajectories (x(·), w(·)) of Σ on Z+ . It is easy to see that this set W is a closed + right-shift invariant subspace of the Fr´ ´echet space W Z of all W-valued sequences on Z+ . We now turn the above property into a definition. Definition 7.1. Let W be a Hilbert space.5 By a (causal signal) behavior on the + signal space W we mean a closed right-shift invariant subspace of W Z . This is a special case of a “manifest behavior”, as described, e.g., in [PW98, Definition 1.2.9], but our choice of this particular subclass of behaviors is not a standard one. A similar definition was used by Ball and Staffans [BS05] in continuous time (with an extra growth restriction at infinity that was appropriate in their setting). A behavior that is induced by a state/signal system Σ = (V ; X , W) as explained above is called realizable, and the state/signal system Σ that induces this behavior is called a realization of the behavior W. Definition 7.2. Two state/signal systems with the same signal space are called externally equivalent if they induce the same behavior. A behavior induced by a state/signal system has both an image representation and a kernel representation of the following type: Lemma 7.3. Let W be the behavior induced by a state/signal system Σ = (V ; X ; W). Then 5 We
make only indirect use of the fact that W is a Hilbert space. See the footnote to Definition 2.1.
State/Signal Systems
147
ˇ of every driving variable rep1) W is the range of the driving-to-signal map D resentation of Σ, and ˇ of every output nulling repre2) W is the kernel of the signal-to-error map D sentation of Σ. We leave the easy proof to the reader. After introducing the above notions we face the following tasks: 1) find criteria of realizability of a given behavior on W; 2) find criteria of external equivalence between two state/signal systems with the same signal space. The solutions of these problems will be given in this section. These solutions involve some additional notation. If W is a behavior on W, then the set W(0) = {w(0) | w ∈ W}.
(7.1)
is a closed subspace of W. We call this subspace the zero section of W. Observe that, if W is induced by a state/signal system, then W(0) coincides with the canonical input space U0 in (3.6). Definition 7.4. Let W be a behavior on 3 An ordered direct sum decomposition 2 W. W = Y U (also denoted by W = Y U ) is called an admissible (input/output) decomposition for W if it has the following two properties: +
1) For any sequence u(·) ∈ U Z there exists at least one sequence w(·) ∈ W such that u(n) = PUY w(n) for all n ∈ Z+ (that is, the projection of W onto + + U Z along Y Z is surjective). 2) There exists positive constants M and r such that T
rn w(n)2 ≤ M 2
n=0
T
rn PUY w(n)2
(7.2)
n=0
for all w(·) ∈ W and all T ∈ Z+ . Theorem 7.5. Let W be a behavior on W. 1) The following conditions are equivalent: (a) The behavior W is realizable by a state/signal system. (b) There exists at least one admissible input/output decomposition W = Y U for W. (c) For some direct complement Y0 to the zero section W(0) the decomposition W = Y0 W(0) is admissible for W. (d) For every direct complement Y0 to the zero section W(0) the decomposition W = Y0 W(0) is admissible for W. 2) Assume that W is realizable by the state/signal system Σ = (V ; X , W). Then a direct sum decomposition W = Y U is admissible for W if and only if it is admissible for Σ.
148
D.Z. Arov and O.J. Staffans
Proof. We begin by proving one half of assertion 2). Suppose first that the behavior W is realized by the state/signal system Σ = (V ; X , W). Consider some admissible input/output decomposition W = YU for the state/signal system Σ. Let Σi/s/o = AB [ C D ] ; X , U, Y be the input/state/output representation of Σ corresponding to this decomposition. Then, for every externally generated trajectory (x(·), w(·)) of Σ on Z+ we have w(n) = y(n) + u(n), where u(n) = PUY w(n) and y(n) = PYU w(n). + Clearly, the projection of W onto U Z is surjective (this is the first requirement of an admissible input/output decomposition for W). To prove that also (7.2) holds we choose some r > 0 and rewrite (1.2) in the form xr (n + 1) = rAxr (n) + rBur (n), yr (n) = Cxr (n) + Dur (n),
n ∈ Z+ ,
(7.3)
x(0) = 0, where xr (n) = rn x(n), ur (n) = rn u(n), and yr (n) = rn y(n). Choose r so small that rA < 1. By (6.7) and by the standard fact that the convolution of an 1 -sequence and an 2 -sequence belongs to 2 , T
yr (n)2 ≤ M12
n=0
T
ur (n)2 ,
n=0 −1
where M1 = D + C(1 − rA) B. Clearly this implies (7.2) with a larger constant M (which depends, among others, on the norms of PYU ). Thus, the decomposition W = Y U is admissible for W, and we have proved one direction of assertion 2). In addition, we have proved the implication (a) ⇒ (d), since the decomposition in (d) is admissible for Σ (see Lemma 5.7). Trivially (d) ⇒ (c) and (c) ⇒ (b). Thus, it remains to prove the other half of assertion 2) and the implication (b) ⇒ (a). Suppose now that W = Y U is an admissible decomposition for the behavior W. Let r and M be the constants in (7.2). For each w(·) ∈ W we define wr (n) = rn w(n), ur (n) = rn PUY w(n), and yr (n) = rn PYU w, n ∈ Z+ . Then (7.2) implies that the mapping from ur to yr is a continuous right-shift invariant mapping from 2 (Z+ ; U) to 2 (Z+ ; Y). As is well known, this implies that this mapping has a multiplier representation given in terms of Z-transforms by ur (z) yˆr (z) = Dr (z)ˆ for some bounded holomorphic B(U; Y)-valued function in the unit disk D, satisfying supz∈D Dr (z) ≤ M . This function Dr can be realized the 2 as 3 input/output r Br transfer function of an input/state/output system Σr = A Cr Dr ; X , U, Y ; see [Aro74, Theorem 3], [Fuh74], or [Hel74, Theorem 3c.1]. We then define 5 4 −1 −1 Σi/s/o = r CrAr r DrBr ; X , U, Y . This system is an input/state/output representation of a state/signal system Σ = (V ; X , W), and the decomposition W = Y U is admissible for this system. The
State/Signal Systems
149
system Σ is a state/signal realization of the given behavior W. This proves the implication (b) ⇒ (a), and completes the proof of assertion 1). It only remains to prove the second half of the assertion 2), namely that every decomposition W = Y U that is admissible for the behavior W is also admissible for its realization Σ. To do this we use the characterization given in + Lemma 5.7. Let u0 ∈ U, and take some arbitrary u(·) ∈ U Z with u(0) = u0 . Then there is a corresponding signal w(·) ∈ W such that PUY w(·) = u(·). In particular, u0 = PUY w(0), where w(0) ∈ W(0) = U0 . Thus PUY maps U0 onto U. That PUY |U0 is injective follows from (7.2). By Lemma 5.7, the decomposition W = Y U is admissible for Σ. Proposition 7.6. Let W be a realizable behavior on W, let W = Y U be a direct sum decomposition of W. Then the following conditions are equivalent. 1) W = Y U is an admissible input/output decomposition for W. PUY )−1 ∈ B(U; W(0)). 2) PUY maps W(0) one-to-one onto U, i.e., (P 3) The space W(0) has the graph representation 2 3 W(0) = w = 1DU u | u ∈ U , (7.4) for some D ∈ B(U; Y). If the decomposition is admissible, then the operator D in (7.4) is the feedthrough operator of every input/state/output realization of W with W = Y U. This follows from Lemma 5.7 and part 2) of Theorem 7.5 (recall that W(0) = U0 ). Theorem 7.7. Let Σ and Σ1 be two state/signal systems with the common signal space W. 1) If Σ and Σ1 have a common admissible input/output decomposition W = Y U and the corresponding input/output transfer functions coincide in a neighborhood of zero, then the two systems are externally equivalent. 2) Conversely, if Σ and Σ1 are externally equivalent, then any direct sum decomposition W = Y U is admissible for Σ if and only if it is admissible for Σ1 , and the corresponding input/output transfer functions coincide in the (connected) component of ΛA ∩ ΛA1 which contains zero. In particular, the feedthrough operators also coincide. Proof. Proof of 1): denote the of Σ and A We input/state/output representations 2 A1 B1 3 B ] ; X , U, Y , respectively, Σ1 Σ1 by Σi/s/o = [ C = ; X , U, Y , and C1 D1 D i/s/o the behaviors induced by Σ and Σ1 by W, respectively, W1 . Let w(·) ∈ W. Then there exists a sequence x(·) with x(0) = 0 such that (x(·), w(·)) is a trajectory of Σ on Z+ . Equivalently, (x(·), u(·), y(·)), with u(·) = PUY w(·) and y(·) = PYU w(·) is a trajectory of Σi/s/o on Z+ with x(0) = 0. Let (x1 (·), u(·), y1 (·)) be the trajectory of Σ1i/s/o on Z+ which has x1 (0) = 0 and the same input sequence u as above. We claim that y1 (·) = y(·). To prove this is suffices to show that the two inputˇ in (6.8)) are the same for the two systems Σi/s/o and to-output map (the map D 1 Σi/s/o , i.e., that D = D1 and that CAk B = C1 Ak1 B for all k ∈ Z+ . However,
150
D.Z. Arov and O.J. Staffans
these are the Taylor coefficients of the corresponding transfer functions D and D1 at the origin, and since we assume that the two transfer functions coincide in a neighborhood of the origin, these Taylor coefficients are the same, too. Thus, y(·) = y1 (·), as claimed. This means that (x1 (·), w(·)) is an externally generated trajectory of Σ1 on Z+ . The above argument shows that W ⊂ W1 . By interchanging the roles of the two systems Σ and Σ1 we conclude by the same argument that W1 ⊂ W. Thus, the two systems Σ and Σ1 are externally equivalent. Proof of 2). Suppose that Σ and Σ1 are externally equivalent. Then they induce the same behavior W. By part 2) of Theorem 7.5, the decomposition W = Y U is admissible for Σ if and only if it is admissible for W, and this is true if and only if it is admissible for Σ1 . Assume that the decomposition is admissible (for both systems), and denote the corresponding transfer functions by D, respectively, + D1 . Let u(·) ∈ U Z , and suppose that the Z-transform of u(·) has a nonzero radius of convergence. Choose some w(·) ∈ W such that PUY w(·) = u(·). Define y(·) = PYU w(·). Then we have in some (possibly smaller) neighborhood of zero, yˆ(z) = D(z)ˆ u(z) = D1 u ˆ(z). +
This being true for all u(·) ∈ U Z whose Z-transform of u(·) has a nonzero radius of convergence, this implies that D(z) = D1 (z) in some neighborhood of zero. By analytic extension, these two transfer functions must coincide in the connected component of ΛA ∩ ΛA1 which contains zero. That the feedthrough operators coincide follows from the fact that they are the values of the transfer functions at zero. Instead of testing the external equivalence of two state/signal systems by using input/state/output representations of these systems it is also possible to use driving variable or output nulling representations. Proposition 7.8. Let Σ and Σ1 be two state/signal systems with the common signal space W. Let Σi/s/o and Σ1i/s/o be two input/state/output representations of Σ, respectively, Σ1 corresponding to the same admissible decomposition W = Y U, let Σdv/s/s and Σ1dv/s/s be two driving variable representations of Σ, respectively, Σ1 , and let Σs/s/on and Σ1s/s/on be two output nulling variable representations of Σ, respectively, Σ1 . Then the following conditions are equivalent: 1) Σ and Σ1 are externally equivalent. ˇ and D ˇ 1 of Σi/s/o , respectively, Σ1 2) The input-to-output maps D i/s/o coincide. ˇ ˇ 3) The driving-to-signal maps D and D1 of Σdv/s/s , respectively, Σ1dv/s/s have the same ranges. ˇ and D ˇ of Σs/s/on , respectively, Σ1 4) The signal-to-error maps D 1 s/s/on have the same kernels. Proof. This follows from Lemma 7.3, Theorem 7.7, and the fact that the input/output transfer function determines the input-to-output map uniquely.
State/Signal Systems
151
The rest of this section is devoted to a study of similarity and pseudosimilarity of state/signal systems. Definition 7.9. Two state/signal systems Σ = (V ; X , W) and Σ1 = (V V1 ; X1 , W) with the same signal space W are similar if there exists a boundedly invertible operator R ∈ B(X ; X1 ), called the similarity operator, such that (x(·), w(·)) is a trajectory of Σ if and only if (x1 (·), w(·)) = (Rx(·), w(·)) is a trajectory of Σ1 . From this definition follows that two similar state/signal systems are externally equivalent. The corresponding similarity notion is well known A B for input/state/output 1 systems. Two input/state/output systems Σi/s/o = [ C D ] ; X , U, Y and Σi/s/o = 2 A1 B1 3 C1 D1 ; X1 , U, Y with the same input and output spaces are similar if there exists a boundedly invertible operator R ∈ B(X ; X1 ) such that D D C C RAR−1 RB A1 B1 . = CR−1 C1 D1 D We shall apply the same similarity notion to driving variable and output nulling representations, too, interpreting them as input/state/output systems (as explained in Remark 5.4). Proposition 7.10. Let Σ = (V ; X , W) and Σ1 = (V V1 ; X1 , W) be two state/signal systems with the same signal space W, and let R be a boundedly invertible operator in B(X X1 ; X ). Then the following conditions are equivalent. 1) Σ and4 Σ1 are 5similar with similarity operator R. 2) V1 =
R 0 0 0 R 0 0 0 1W
V.
3) Σ and Σ1 have driving variable representations Σdv/s/s and Σ1dv/s/s , respectively, which are similar with similarity operator R. 4) To each driving variable representation Σdv/s/s of Σ there is a (unique) driving variable representation Σ1dv/s/s of Σ1 such that these representations are similar with similarity operator R. 5) Σ and Σ1 have output nulling representations Σs/s/on and Σ1s/s/on , respectively, which are similar with similarity operator R. 6) To each output nulling representation Σs/s/on of Σ there is a (unique) output nulling representation Σ1s/s/on of Σ1 such that these representations are similar with similarity operator R. 7) There exists some decomposition W = Y U of W which is admissible both for Σ and for Σ1 , and the corresponding input/state/output representations Σi/s/o and Σ1i/s/o are similar with similarity operator R. 8) The systems Σ and Σ1 have the same set of admissible decompositions W = Y U of W, and for every such decomposition the corresponding input/ state/output representations Σi/s/o and Σ1i/s/o are similar with similarity operator R. We leave the easy proof to the reader.
152
D.Z. Arov and O.J. Staffans
Various partial converses to the statement that two similar systems are externally equivalent is also valid. Some additional conditions are always needed. One such condition is that both the systems are controllable and observable. In this case they need not actually be similar but only pseudo-similar. Two state/signal V1 ; X1 , W) are called pseudo-similar if there systems Σ = (V ; X , W) and Σ1 = (V exists an injective densely defined closed linear operator R : X → X1 with dense range such that the following conditions hold: If (x(·), w(·)) is a trajectory of Σ on Z+ with x(0) ∈ D(R), then x(n) ∈ D(R) for all n ∈ Z+ and (Rx(·), w(·)) is a trajectory of Σ1 on Z+ , and conversely, if (x1 (·), w(·)) is a trajectory of Σ1 on Z+ with x1 (0) ∈ R (R), then x1 (n) ∈ R (R) for all n ∈ Z+ and (R−1 x1 (·), w(·)) is a trajectory of Σ on Z+ . Proposition 7.11. Two controllable and observable state/signal systems Σ = (V ; X , W) and Σ1 = (V V1 ; X1 , W) with the same signal space W are externally equivalent if and only if they are pseudo-similar. Proof. In one direction the assertion is obvious: if Σ and Σ1 are pseudo-similar, then they induce the same behavior (take x(0) = 0 and x1 (0) = 0). Conversely, suppose that Σ and Σ and are controllable and observable state/ signal systems which are externally equivalent. Then they have the same set of admissible input/output decompositions of the signal space W. Let W = Y U be such a decomposition, and denote corresponding input/state/output repre Athe 2 A1 B1 3 B ];X ,U,Y and Σ1 sentations of Σ and Σ1 by Σi/s/o = [ C = ,U,Y , ;X X 1 C1 D1 D i/s/o 1 respectively. Then both Σi/s/o and Σi/s/o are controllable and observable, and also externally equivalent. This means that their input/output transfer functions coincide a neighborhood of zero. By [Aro79, Proposition 6], these two systems are pseudo-similar in the following sense: there exists an injective densely defined closed linear operator R : X → X1 with dense range such that R (B) ⊂ D(R), AD(R) ⊂ D(R), A1 R (R) ⊂ R (R) , A1 R = RA|D(R) , B1 = RB, C1 R = C|D(R) , D1 = D.
(7.5)
If (x(·), w(·)) and (x1 (·), w(·)) are externally generated trajectories of Σ and Σ1 , respectively, with x(0) ∈ D(R), x1 (0) ∈ R (R), and x1 (0) = Rx(0), then for all n ∈ Z+ , x(n) = An x(0) +
n−1
Ak Bu(n − k − 1),
k=0
x1 (n) =
An1 Rx(0)(0)
+
n−1
(7.6) Ak1 B1 u(n
− k − 1),
k=0
where u(n) = PUY w(n). This combined with (7.5) gives x1 (n) = Rx(n) for all n ∈ Z+ . Thus, Σ and Σ1 are pseudo-similar.
State/Signal Systems
153
8. Dilations of state/signal systems In the classical finite-dimensional input/state/output systems theory a system is called minimal if the dimension of its state space is minimal among all systems with the same transfer function. By a classical result due to Kalman, such a finitedimensional input/state/output system is minimal if and only if it is controllable and observable. We can reformulate this result is the state/signal setting as follows: a state/signal system with a finite-dimensional state space has a state space with minimal dimension among all externally equivalent systems if and only if it is controllable and observable. In the case where the state space is infinite-dimensional the requirement that its state space should have minimal dimension becomes obscure (all infinitedimensional separable Hilbert spaces has the same dimension). It is therefore necessary to define minimality in terms of some other property. One natural solution is to study dilations and compressions of systems. In the finite-dimensional case the minimality of the dimension of the state space is equivalent to the statement that the system cannot be compressed into a “smaller” system, and this characterization has a natural infinite-dimensional analogue. The notions of dilations and compressions of operators and of input/state/output systems have attracted a great deal of attention and it plays an important role in many works, see, e.g., [Aro79], [SF70], and [LP67] for Hilbert space versions, and [BGK79] and [Sta05] for Banach space versions.
= (V ; X , W) is a dilation along Z of Definition 8.1. The state/signal system Σ the state/signal system Σ = (V ; X , W), or equivalently, the state/signal system
if the following Σ is a compression along Z onto X of the state/signal system Σ, conditions hold: 1) X = X Z,
on Z+ with x ˜(0) ∈ X , then (P PXZ x ˜(·), w(·)) 2) If (˜ x(·), w(·)) is a trajectory of Σ + is a trajectory of Σ on Z . 3) There is at least one decomposition W = Y U of W which is admissible for
and Σ. both Σ Note that, whereas the compressed system is determined uniquely by the dilated system and by the decomposition X = X Z, the converse is clearly not true.
= (V ; X , W) be a dilation along Z of Lemma 8.2. Let the state/signal system Σ Σ = (V ; X , W). Then the following claims hold. 1) To each trajectory (x(·), w(·)) of Σ on Z+ there is a unique trajectory
on Z+ satisfying x (˜ x(·), w(·)) ˜ of Σ ˜(0) = x(0) and w(·) ˜ = w(·). This tra˜(·). jectory has the additional property that x(·) = PXZ x
and Σ are externally equivalent. In particular, they have the same admis2) Σ sible input/output decompositions of the signal space, and the input/output transfer functions and the input-to-output maps of the corresponding input/
and Σ coincide. state/output representations of Σ
154
D.Z. Arov and O.J. Staffans
and for Proof. Let W = Y U be a decomposition which is admissible both for Σ
Σ, and denote the corresponding input/state/output representations of Σ and Σ by
i/s/o and Σi/s/o , respectively. Let (x(·), w(·)) be a trajectory of Σ on Z+ . Define Σ u(·) = PUY w(·) and y(·) = PYU w(·). Then (x(·), u(·), y(·)) is a trajectory of Σi/s/o ,
i/s/o has a unique trajectory (˜ x(·), u(·), y˜(·)) on Z+ satisfying x ˜(0) = x(0). and Σ
on Z+ . According to Define w(·) ˜ = y˜(·)+ u(·). Then (˜ x(·), w(·)) ˜ is a trajectory of Σ property 2) in Definition 8.1, (P PXZ x ˜(·), w(·)) ˜ must be a trajectory of Σ, and hence, if we define y˜(·) = PYU w(·), ˜ then (P PXZ x˜(·), u(·), y˜(·)) is a trajectory of Σi/s/o . But a trajectory of Σi/s/o is determined uniquely by its initial state and input data, and therefore we must have x(·) = PXZ x ˜(·) and y˜(·) = y(·). This proves assertion 1). Assertion 2) follows immediately from property 2) in Definition 8.1 together with assertion 1) . Observability and controllability are preserved under compressions (but not under dilations).
= (V ; X , W) be a dilation along Z Lemma 8.3. Let the state/signal system Σ
and R be the reachable subspaces and let U
and U be of Σ = (V ; X , W). Let R
∩ X and R is
and Σ, respectively. Then U = U the unobservable subspaces of Σ
In particular, if Σ
is controllable or observable, then Σ is the closure of PXZ R. controllable or observable, respectively. We leave the easy proof to the reader.
and In order to be able to study the relationship between the two systems Σ Σ in Definition 8.1 in more detail we need the following two invariance notions.6 Definition 8.4. Let Σ = (V ; X , W) be a state/signal system. 1) A closed subspace Z of X is outgoing invariant for Σ if to each x0 ∈ Z there is a (unique) trajectory (x(·), 0) of Σ on Z+ with x(0) = x0 satisfying x(n) ∈ Z for all n ∈ Z+ . 2) A closed subspace Z of X is strongly invariant for Σ if every trajectory (x(·), w(·)) of Σ on Z+ with x(0) ∈ Z satisfies x(n) ∈ Z for all n ∈ Z+ . These invariance properties can also be described in terms of the generating subspace V as follows. Lemma 8.5. Let Σ = (V ; X , W) be a state/signal system, and let Z be a closed subspace of X . 1) Z is outgoing invariant for Σ if and only if the following condition holds: 4z5 (8.1) To each x ∈ Z there is a (unique) z ∈ Z such that x ∈ V . 0
2) Z is strongly invariant for Σ if and only if it the following implication is true: 4z5 (8.2) If x ∈ V and x ∈ Z, then z ∈ Z. w
6 The connections between these notions and the unobservable and reachable subspaces are explained in Lemma 8.6 below.
State/Signal Systems
155
Proof. Proof of 1): The necessity of (8.1) for outgoing invariance Cis immediate (the D solution (x(·), 0) mentioned in part 1) of Definition 8.4 satisfies
x(1) x(0) 0
∈ V .)
Then Conversely, suppose that (8.1) holds. Let x0 ∈ Z. 5 (8.1) with x replaced 4 x(1) ∈ V . Applying (8.1) by x0 gives the existence of x(1) ∈ Z such that x0 0
once D more with x replaced by x(1) we get the existence of x(2) ∈ Z such that C x(2) x(1) ∈ V . Continuing in the same way we get a sequence x(·) such that x(0) = x0 0
and (x(·), 0) is a trajectory of Σ on Z+ . According to Definition 8.4, Z is outgoing invariant. Proof of 2): To see that (8.2) is necessary for Z to be strongly 4 z0 5 invariant we x0 ∈ V implies argue as follows. By part 1) of Proposition 2.2, the condition w 0
that there exists a trajectory (x(·), w(·)) of Σ on Z+ with x(0) = x0 , w(0) = w0 , and x(1) = z0 . If, furthermore, x0 ∈ Z, then the strong invariance of Z implies that x(n) ∈ Z for all n ∈ Z+ . In particular, z0 = x(1) ∈ Z. The proof of the converse part is similar to the proof of the converse part of assertion 1), and it is left to the reader. The two main examples of outgoing invariant and strongly invariant subspaces are the following: Lemma 8.6. Let Σ = (V ; X , W) be a state/signal system. 1) The unobservable subspace is the maximal outgoing invariant subspace for Σ, i.e., it is outgoing invariant, and it contains every other outgoing invariant subspace. 2) The reachable subspace is the minimal closed strongly invariant subspace for Σ, i.e., it is strongly invariant, and it is contained in every other closed strongly invariant subspace. We leave the easy proof to the reader. The following theorem is the main result of this section.
= V ; X , W and Σ = V ; X , W be two state/signal systems Theorem 8.7. Let Σ
is a dilation along Z with X = X Z (and with the same signal space). Then Σ of Σ if and only if the following conditions hold: 1) V is given by ⎡ ⎤ F
⎡ Z ⎤ z˜ PX z˜ V = ⎣ x ⎦ - x ∈ X and ⎣ x ⎦ ∈ V . w w
(8.3)
2) Z has a decomposition Z = Zo Zi where Zo is outgoing invariant for Σ
and Zo X is strongly invariant for Σ.
156
D.Z. Arov and O.J. Staffans
One possible choice of the subspaces Zo and Zi in 2) is to take Zo = Zomax and to take Zi to be an arbitrary direct complement of Zomax in Z, where
on Z+ with x(·), 0) of Σ - there exists a trajectory (˜
˜0 ∈ X Zomax = x . (8.4) - x ˜(0) = x ˜0 satisfying PXZ x˜(n) = 0 for all n ∈ Z+ The subspace Zomax is maximal in the sense that it contains every other space Zo that can be used in the decomposition in 2).
7 We shall call Zo an outgoing subspace and Zi an incoming subspace of Σ.
Proof. We begin by proving necessity 4 z˜of 51) and 2), assuming that Σ is a dilation of 0
Σ, and begin with condition 1). Let x0 ∈ V with x0 ∈ X . By Proposition 2.2, Σ w0
has a trajectory (˜ x(·), w(·)) ˜ on Z+ with x ˜(1) = ˜0 , x ˜(0) = x0 , and w(0) = w0 . By condition 2) inC Definition 8.1, (x(·), w(·)) ˜ with x(·) = PXZ x˜(·) is a trajectory of Σ. D 4 5 Z x(1) P z˜0 In particular, x(0) = Xx0 ∈ V . This shows that the right-hand side of (8.3) w0
w0
is contained in V . The opposite inclusion follows from a similar argument which replaces condition 2) in Definition 8.1 by part 1) of Lemma 8.2. To prove the existence of a decomposition of the type described in part 2) we define Zo = Zomax by (8.4). It is easy to see that Zomax is a closed subspace of X , and it is contained in Z since PXZ Zomax = 0. Let Zi be an arbitrary direct complement of Zomax in Z. We claim that this decomposition of Z has the two properties mentioned in 2).
It is easy to see from Definition 8.4 that Zomax is outgoing invariant for Σ, max
x(·), w(·)) so it remains to show that Zo X is strongly invariant for Σ. Let (˜
on Z+ with x be a trajectory of Σ ˜(0) = z0 + x0 , where z0 ∈ Zomax and x0 ∈ X .
on Z+ with Since Zomax is outgoing invariant, there is a trajectory (˜ x1 (·), 0) of Σ max + ˜1 (n) ∈ Zo for all n ∈ Z . Define x ˜2 (·) = x ˜(·) − x˜1 (·). x ˜1 (0) = z0 satisfying x
on Z+ with x Then (˜ x2 (·), w(·)) is a trajectory of Σ ˜2 (0) = x0 ∈ X . Define x(·) = PXZ x ˜2 (·). By Condition 2) in Definition 8.1, (x(·), w(·)) is a trajectory of Σ on Z+ . In particular, it is also a trajectory on [1, ∞). By assertion 2) of Lemma
on 8.2, applied to the time interval [1, ∞), there is a trajectory (˜ x3 (·), w(·)) of Σ Z [1, ∞) satisfying x ˜3 (1) = x(1) and PX x˜3 (n) = x(n) for all n ∈ [1, ∞). Define
on [1, ∞), and it satisfies x ˜4 (·) = x ˜2 (·) − x ˜3 (·). Then (˜ x4 (·), 0) is a trajectory of Σ Y Y Y PX x˜4 (n) = PX x ˜2 (n) − PX x ˜3 (n) = x(n) − x(n) = 0 for all n ∈ [1, ∞). It follows from (8.4) (after we have shifted the trajectory (˜ x(·), 0) one step to the left) that ˜(1) = x˜1 (1) + x ˜3 (1) + x ˜4 (1) where x ˜1 (1) ∈ Zomax , x ˜3 (1) = x ˜4 (0) ∈ Zomax . Thus, x max max ˜4 (1) ∈ Zo , so x x(1) ∈ X , and x ˜(1) ∈ Zo X . This proves that the implication (8.2) holds with Z replaced by Zomax X . By Lemma 8.5, Zomax X is strongly invariant. To prove the maximality of Zomax it suffices to observe that if Zo is outgoing
on Z+ with x(·), 0) of Σ invariant, then for each z0 ∈ Zo there is a trajectory (˜ 7 The
reason for these names will be explained elsewhere.
State/Signal Systems
157
x ˜(0) = z0 satisfying x ˜(n) ∈ Zo ⊂ Z for all n ∈ Z+ , and hence PXZ x˜(n) = 0 for all + n ∈ Z . This implies that z0 ∈ Zomax . For the converse proof we assume that 1) and 2) hold. It follows from (8.3)
and Σ have the same canonical input space U0 , so condition that the two systems Σ 3) of Definition 8.1 is satisfied. Our proof of the fact that also condition 2) of Definition 8.1 holds is based on the following implication: C Z D 4 z˜ 5 PX z˜ ˜ ∈ Zo X , then PXZ x˜ ∈ V . If x˜ ∈ V and x (8.5) w
4 z˜ 5
w
The proof of (8.5) goes as follows. Let x˜ ∈ V with x ˜ = z0 + x0 , where z0 ∈ Zo w and x ∈ X . Since Z is outgoing invariant, there is some z1 5∈ Zo such that o 4 z˜−z 4 z1 5 0 1
z0 x0 ∈ V (see Lemma 8.5). Since V is a subspace also ∈ V . We can 0 w 4 Z 5 PX (˜−z1 ) now apply (8.3) to conclude that ∈ V . But PXZ (˜ − z1 ) = PXZ z˜ since x0 w
z1 ∈ ZDo ⊂ Z and x0 = PXZ x ˜ since x ˜ − x0 = z0 ∈ Zo ⊂ Z. Thus, we conclude that C Z PX z˜ Z PX x ˜ ∈ V . This proves (8.5). w
on Z+ with x ˜(0) ∈ X . Because of the Let (˜ x(·), w(·)) be a trajectory of Σ strong invariance of Zo X , this implies that x ˜(n) ∈ Zo X for all n ∈ Z+ . Define ˜(·). Then it follows from (8.5) that (x(·), w(·)) is a trajectory of Σ on x(·) = PXZ x
is a dilation Z+ . Thus, condition 2) in Definition 8.1 holds, and we conclude that Σ of Σ. Let us record the following fact which we observed in the preceding proof.
= (V ; X , W) be a dilation along Z Corollary 8.8. Let the state/signal / system Σ
of Σ = (V ; X , W), and let X = Zo X Zi be the decomposition of X given in Theorem 8.7. Denote Zo X by Xo . Then V is given by ⎡ ⎤
⎡ P Z z˜⎤ F z˜ X ˜ ∈ Xo and ⎣ x˜ ⎦ ∈ V . V = ⎣PXZ x˜⎦ - x (8.6) w w This follows from (8.3) and (8.5).
= V ; X , W be a state/signal system. Assume that X = Corollary 8.9. Let Σ X Z, and define V by (8.3). Then Σ = V ; X , W is a state/signal node. It
if and only if Z can be decomposed into is a compression along Z onto X of Σ
and Zo X is Z = Zo Zi in such a way that Zo is outgoing invariant for Σ
strongly invariant for Σ. Proof. If V is given by (8.3), then V clearly has properties (i) and (iii) in Definition 2.1. That it also has properties (i) and (iv) follows from Lemma 2.4, because if we denote the operator in part 3) of Lemma 2.3 corresponding to V and V by F and F , respectively, then F = PXZ F with D(F ) = D(F ). Thus Σ is a state/signal node. The remaining claims follow from Theorem 8.7.
158
D.Z. Arov and O.J. Staffans
Remark 8.10. It is possible to reformulate condition 2) in Theorem 8.7 by focusing on the subspace Xo := Zo X instead of focusing on Zo . We claim that condition 2) in Theorem 8.7 is equivalent to the following condition: 2 ) X has a decomposition X = Xo Zi , where Zi ⊂ Z, X ⊂ Xo , Xo is strongly
and Xo ∩ Z is outgoing invariant for Σ.
invariant for Σ, Clearly, 2 ) follows from from 2) if we take Xo = Zo X . It is almost as easy to derive 2) from 2 ), with Zo = Xo ∩ Z; the only slightly nontrivial part is to show that X = Zo X Zi , or equivalently, that Xo = (X Xo ∩ Z) X . However, this follows from the assumptions that X = X Z = Xo Zi where Zi ⊂ Z and X ⊂ Xo , which implies that PZX − PZXio is a projection with kernel X Zi and range Xo ∩ Z (we leave the proof of this to the reader). The same replacement of 2) by 2 ) can be carried out in Corollary 8.9, too. The final conclusion of Theorem 8.7 says that if Zo is an arbitrary subspace of Z satisfying the properties listed in 2), then Zo ⊂ Zomax . This result implies that the subspace Xomax := Zomax X has an analogous maximality property: if Xo is an arbitrary subspace of X satisfying the properties listed in 2 ), then Xo ⊂ Xomax . A similar argument shows that all the subspaces Zo in 2) and all the subspaces Xo in 2 ) satisfy Zomin ⊂ Zo and Xomin ⊂ Xo , where Zomin and Xomin are defined in Theorem 8.11 below. Theorem 8.11. Among all the decompositions X = Zo X Z Zi in Theorem 8.7 there is one for which the outgoing subspace Zo is the smallest possible, i.e., there is an outgoing invariant subspace Zo = Zomin which can be used in this decomposition, and which is contained in every outgoing subspace Zo for every other choice of decomposition. The subspace Zomin can be constructed as follows: Let Xomin be the closure in X of all the possible values of the state components x
(·) of all trajectories
on Z+ satisfying x ˜(0) ∈ X , and define Zomin = Xomin ∩ Z. (˜ x(·), w(·)) of Σ Proof. Define Xomin and Zomin as described in Theorem 8.11, and let Zi be an arbitrary direct complement to Zomin in Z. Then X = X Z = X Zomin Zi . We have both X ⊂ Xomin and Zomin ⊂ Xomin , so Zomin X ⊂ Xomin . To see that we actually have Zomin X = Xomin it suffices to show that Xomin ∩ Zi = {0} (since Xomin ⊂ X = (Z Zomin X ) Zi ). But this is true because Xomin ∩ Z) ∩ Zi = Zomin ∩ Zi = {0}. Xomin ∩ Zi = (X Thus Xomin = Zomin X and X = Xomin Zi = Zomin X Zi . It is easy to see that Xomin is the smallest (closed) strongly invariant subspace
of X which contains X . In particular, for each decomposition X = Xo Z Zi satisfying condition 2 ) in Remark 8.10 we must have Xomin ⊂ Xo . As we saw in Remark 8.10, this implies that if Zo is an arbitrary subspace which satisfies the conditions listed in 2) of Theorem 8.7, then Zomin ⊂ Zo . This proves the claim about the minimality of Zomin (and of Xomin ). It only remains to show that Zomin is outgoing invariant
To do this we argue as follows. for Σ. Choose some arbitrary decomposition X = Zo X Zi of the type given in Theorem 8.7, and define Xo := Zo X . Since Xomin is the smallest (closed) strongly invariant subspace of X which contains X we must have X ⊂ Xomin ⊂ Xo . It follows
State/Signal Systems
159
from (8.3) and (8.6) that (8.6) also holds if we replace Xo by Xomin . Take some arbitrary z0 ∈ Z0min = Xomin ∩ Z ⊂ Xo ∩ Z = Zo . Then there exists some trajectory
on Z+ with x (˜ x(·), w(·)) of Σ ˜(0) = z0 . By (8.6) with Xo replaced by Xomin , if we Z ˜(·), then (x(·), w(·)) is a trajectory of Σ. Observe that x(0) ( = 0. define x(·) = PX x
on Z+ By part 1) Lemma 8.2, there exists a (unique) trajectory (˜ x1 (·), w(·)) of Σ with PXZ x ˜1 (·) = x(·) (in particular, x˜1 (0) = 0). Define x ˜2 (·) = x ˜(·) − x ˜1 (·). Then
with x (˜ x2 (·), 0) is a trajectory of Σ ˜2 (0) = z0 and PXZ x˜2 (·) = x(·) − x(·) = 0. Thus x ˜2 (n) ⊂ Z for all n ∈ Z+ . But on the other hand, by the strong invariance of Xomin , x ˜2 (n) ⊂ Xomin for all n ∈ Z+ . Thus, x ˜2 (n) ⊂ Z ∩ Xomin = Zomin for all n ∈ Z+ and, as we recall, x ˜2 (0) = z0 . This proves that Zomin is outgoing invariant. It is often useful to split a compression or dilation into the product of two successive dilations or compression. = (V ; X, W) be a compression of Σ
= (V ; X , W) along Z Lemma 8.12. Let Σ onto X , and let Σ = (V ; X , W) be a compression of Σ along Z onto X . Then
along Z Z onto X , and P ZZ Σ = (V ; X , W) is a compression of Σ = PXZ PXZ . X The easy proof is left to the reader. Two particularly simple types of dilations are those where one of the two subspaces Zo and Zi in Theorem 8.7 can be taken to be zero.
= (V ; X , W) is an outgoing dilation Definition 8.13. The state/signal system Σ along Z of the state/signal system Σ = (V ; X , W), or equivalently, the state/signal system Σ is an outgoing compression along Z onto X of the state/signal system
if the following conditions hold: Σ, 1) X = X Z,
on Z+ , then (P 2) If (˜ x(·), w(·)) is a trajectory of Σ PXZ x˜(·), w(·)) is a trajectory + of Σ on Z . 3) There is at least one decomposition W = Y U of W which is admissible for
and Σ. both Σ Clearly, every outgoing dilation is also a dilation.
= V ; X , W and Σ = V ; X , W be two state/signal systems Lemma 8.14. Let Σ with X = X Z (and with the same signal space). Then the following conditions are equivalent.
is an outgoing dilation along Z of Σ, 1) Σ 2) V is given by ⎤ ⎡ Z
⎡ P Z z˜⎤ - ⎡ z˜ ⎤ F 0 0 PX X Z Z ⎦ -⎣ ⎦
⎣ ⎦ ⎣ 0 V = PX x ˜ - x ˜ ∈V . V = 0 PX (8.7) 0 0 1X w w
3) (8.3) holds and Z is outgoing invariant for Σ.
160
D.Z. Arov and O.J. Staffans
Proof. The proof of the fact that 1) implies 2) is essentially the same as the proof of the necessity of (8.3) in Theorem 8.7, and the proof of the converse implication is a simplified version of the sufficiency part of the proof of the same theorem. That 1) and 2) together imply 3) is a simplified version of the final paragraph of the proof of Theorem 8.11 (replace Zomin by Z, replace Xomin by X , and use the
is a dilation of Σ and that (8.6) now holds with Xo replaced by X ). facts that Σ Finally, that 3) implies 2) follows from Corollary 8.8.
= (V ; X , W) is an incoming dilation Definition 8.15. The state/signal system Σ along Z of the state/signal system Σ = (V ; X , W), or equivalently, the state/signal system Σ is an incoming compression along Z onto X of the state/signal system
if the following conditions hold: Σ, 1) X = X Z,
on Z+ with x ˜(0) ∈ X , then x ˜(n) ∈ X for 2) If (˜ x(·), w(·)) is a trajectory of Σ + all n ∈ Z and (x(·), w(·)) is a trajectory of Σ on Z+ . 3) There is at least one decomposition W = Y U of W which is admissible for
and Σ. both Σ
= V ; X , W and Σ = V ; X , W be two state/signal systems Lemma 8.16. Let Σ with X = X Z (and with the same signal space). Then the following conditions are equivalent.
is an incoming dilation along Z of Σ, 1) Σ 2) V is given by
⎡ ⎤ F z˜ V = ⎣ x ⎦ ∈ V - x ∈ X . (8.8) w
3) (8.3) holds and X is strongly invariant for Σ. This proof is similar to the proof of Lemma 8.14 and it is left to the reader. Definition 8.17. A state/signal system is minimal if it is not a (nontrivial) dilation of any other state/signal system (along any direction). Theorem 8.18. An state/signal system is minimal if and only if it is controllable and observable.
be state/signal system, and let Σ be a compression of Σ.
If Σ
is Proof. Let Σ observable, then the outgoing subspace Zo in the decomposition in Theorem 8.7
is controllable, is trivial (since it is part of the unobservable subspace), and if Σ then the incoming subspace Zi in the decomposition in Theorem 8.7 is trivial
is both controllable (since Zo X contains the reachable subspace). Thus, if Σ and observable, then it does not have any nontrivial dilation. The converse claim follows from Theorem 8.19 below (which shows that every non-observable or non-controllable system has a nontrivial compression). Theorem 8.19. Σ = (V ; X , W) be a state/signal system. Denote the reachable subspace of Σ by R and the unobservable subspace of Σ by U.
State/Signal Systems
161
U R, and let O be a 1) Let O be a direct complement to U in X , define X◦ := PO i direct complement to X◦ in O. Define V◦ by C UO D 4z5 PX◦ i z -x ∈V . (8.9) V◦ := x - x ∈ X◦ , w
w
Then Σ◦ = (V V◦ ; X◦ , W) is a minimal state/signal systems which is a compression of Σ along U Oi . Here U is outgoing invariant for Σ and U X◦ is strongly invariant for Σ, so that we can take Zo = U and Zi = Oi in the decomposition in Theorem 8.7. 2) Let Q be a direct complement to R in X , define Ro = R ∩ U, and let X• be a direct complement to Ro in R. Define C R Q D 4z5 PX•o z x x ∈ X ∈ V . (8.10) , V• := • x w w
Then Σ• = (V V• ; X• , W) is a minimal state/signal systems which is a compression of Σ along Ro Q onto X• . Here Ro is outgoing invariant for Σ and Ro X• is strongly invariant for Σ, so that we can take Zo = Ro and Zi = Q in the decomposition in Theorem 8.7. Proof. Proof of 1). We begin by performing an outgoing compression of Σ along U onto O, i.e., we define 4z5 7 64 U 5 PO z V◦1 := - x ∈ O, x ∈ V . x w
w
According to Lemma 8.6, U is outgoing invariant for Σ, so by Corollary 8.9, Σ1◦ := (V V◦1 ; O, W) is a compression of Σ along U. Moreover, it follows from Lemma 8.3 that Σ1◦ is observable. We continue by performing an incoming compression of Σ1◦ along Oi onto its reachable subspace, which according to Lemma 8.3 is equal to X◦ . Thus, we define C O D 4z5 PX◦i z -1 x x ∈ X ∈ V , V◦ := . ◦ x ◦ w w The subspace X◦ is strongly invariant for Σ1◦ (see Lemma 8.6), so by Corollary 8.9, Σ◦ := (V V◦ ; X◦ , W) is a compression of Σ1◦ along Oi . By Lemma 8.12, this system is the same one which we defined in Part 1), and by Lemma 8.3, Σ◦ is both controllable and observable. It remains to show that U X◦ is strongly invariant for Σ. However, this follows from the fact that the maximal outgoing subspace Zomax defined in (8.4) always is contained in the unobservable subspace U, and in this particular case it coincides with U. Thus, U X◦ coincides with the space Zomax X◦ , and it must therefore be strongly invariant. Proof of 2). We begin by performing an incoming compression of Σ along Q onto R, i.e., we define 7 64 Q 5 4z5 PR z x ∈V . V•1 := - x ∈ R, w x w
162
D.Z. Arov and O.J. Staffans
According to Lemma 8.6, R is strongly invariant for Σ, so by Corollary 8.9, Σ1• := (V V•1 ; R, W) is a compression of Σ along Q. Moreover, it follows from Lemma 8.3 that Σ1• is controllable. We continue by performing an outgoing compression of Σ1• along its unobservable subspace, which according to Lemma 8.3 is equal to Ro . That is, we define C R D 4z5 PX•o z 1 V• := - x ∈ X• , x ∈ V• . x w
w
(see Lemma 8.6), so by Corollary The subspace Ro is outgoing invariant for 8.9, Σ• := (V V• ; X• , W) is a compression of Σ1• along Ro . By Lemma 8.12, this system is the same one which we defined in Part 1), and by Lemma 8.3, Σ• is both controllable and observable. We already observed above that Ro is outgoing invariant and that Ro X• = R is strongly invariant for Σ. Σ1•
Theorem 8.20. Every realizable signal behavior has a minimal state/signal realization (i.e., the behavior has a state/signal realization which is minimal). This follows from Theorem 8.19 (since a compressed system is externally equivalent to the original system). Up to now we have not used any specific representation of a state/signal system in our study of dilations and compressions. For completeness we interpret some of our results in terms of driving variable, output nulling, and input/state/ output representations. We begin with the following description of the crucial formula (8.3) in Theorem 8.7.
= V ; X , W and Σ = V ; X , W be two state/signal systems Lemma 8.21. Let Σ with X = X Z (and with the same signal space). 1) The following conditions are equivalent: (a) V 4 is given 5by (8.3).
B
A
W is a driving variable representation of Σ,
then ; X , L, (b) If
D
C 4 Z 5 Z PX A |X PX B
W is a driving variable representation of Σ. ; X , L,
|X C D 4 5
B
A
, W, K
is an output nulling representation of Σ,
then ; X (c) If
4 ZC D Z 5 PX A |X PX B
is an output nulling representation of Σ. ; X , W, K
4C |X 5 D
B
A
coris an input/state/output representation of Σ (d) If
D
; X , U, Y C responding to some 5admissible input/output decomposition W = Y U, 4
P Z A|
PZB
X X X then ; X , U, Y is an input/state/output representation
X
C| D of Σ corresponding to the same admissible decomposition of W. 2) Assume that the equivalent conditions (a)–(d) above hold. Then every driving variable representation of Σ is of the form described in (b), every output nulling representation of Σ is of the form described in (c), and every input/output representation of Σ is of the form described in (d).
State/Signal Systems
163
Proof. The equivalence of (a)–(d) follows from (3.3), (4.3), (5.2), and (8.3). That every input/state/output representations of V must be of the type given in (d) follows from the uniqueness of such a representation (see Theorem 5.1). The proof of the claim that all possible output nulling representations of V are of the type (c) is similar to the proof of the claim that all possible driving variable representations of V are of the type (b), so let us only prove the latter claim. 2 3 A B ; X , L, W be an arbitrary driving variable representation of Σ, Let C 5 4 D
B
A
W be the driving variable representation of Σ
mentioned ; X , L, and let
D
C
and M ∈ in part (b). Then by Theorem 6.1, there exist operators K ∈ B(X ; L)
B(L; L), with M boundedly invertible, such that C D C Z D
|X + P Z B
K P Z B
M A B PX A 1 X X =
|X + D
M .
K C D C D 1
= K P Z . Then Define K X C D C Z D
)|X P Z B
M
+B
K A B PX (A X =
)|X
M .
+ D
K D C D (C 4 5
+B
B
M
K A
; X , L, W is a driving variable representation By Theorem 6.1, 2 AC B+ D 3 K D M
of Σ, and hence ; X , L, W is of the type (b). C D
Definition 8.1 is very closely related to the following definition of a dilation of a input/state/output system. Definition 8.22. We say that the input/state/output system 5 4
B
A
, U, Y
i/s/o = ; X Σ
C D
is a dilation along Z of the input/state/output system AB Σi/s/o = [ C D ] ; X , U, Y ,
i/s/o , if X = X Z or equivalently, that Σi/s/o is a compression along Z onto X of Σ and the following condition holds: For each x0 ∈ X and each input sequence u(·) ∈ +
i/s/o , U Z the corresponding trajectories (˜ x(·), u(·), y˜(·)) and (x(·), u(·), y(·)) of Σ Z ˜(0) = x(0) = x0 , satisfy x(·) = PX x ˜(·) and respectively, Σi/s/o , with initial state x y˜(·) = y(·). As usual, we shall call an input/state/output system minimal if it is not a (nontrivial) dilation of any other input/state/output system (along any direction).
= (V ; X , W) and Σ = (V ; X , W) be two state/signal systems Lemma 8.23. Let Σ
with X = X Z (and with the same signal space W).
and Σ have a common admissible input/output decomposition 1) Suppose that Σ W = Y U. Denote the corresponding input/state/output representations by
i/s/o is a dilation along Z of Σi/s/o , then Σ
i/s/o , respectively, Σi/s/o . If Σ Σ is a dilation along Z of Σ.
164
D.Z. Arov and O.J. Staffans
is a dilation along Z of Σ, then the two systems have the 2) Conversely, if Σ same admissible input/output decompositions W = Y U, and if we denote
i/s/o and Σi/s/o , the corresponding input/state/output representations by Σ
respectively, then Σi/s/o is a dilation along Z of Σi/s/o .
on Z+ with x Proof. Proof of 1). Let ( x(·), w(·)) be a trajectory of Σ
(0) ∈ X . Then Y U
i/s/o . ( x(·), u(·), y(·)) with u(·) = PU w(·) and y(·) = PY w(·) is a trajectory of Σ Z
(·) is a trajectory of Σi/s/o , By Definition 8.22, (x(·), u(·), y(·)) with x(·) = PX x
is a dilation along Z of Σ.
(·), w(·)) is a trajectory of Σ. Thus, Σ and hence (P PXZ x Proof of 2). That the two systems have the same admissible input/output decompositions follows from Lemmas 5.7 and 8.2. Let W = Y U be a decomposi and for Σ. Let ( tion which is admissible both for Σ x(·), u(·), y(·)) be a trajectory
i/s/o on Z+ with x
(0) = x0 ∈ X . Then ( x(·), w(·)) with w(·) = y(·) + u(·) of Σ
and by Definition 8.1, (P is a trajectory of Σ, PXY x
(·), w(·)) is a trajectory of Σ. Y Hence (P PX x
(·), u(·), y(·)) is a trajectory of Σi/s/o . More precisely, it is the unique
i/s/o is a trajectory of Σ with the initial state x0 and the input data u(·). Thus, Σ dilation along Z of Σi/s/o . 5 4 B
B
A
i/s/o = and Σi/s/o = [ A Theorem 8.24. Let Σ C D ] ; X , U, Y be
D
; X , U, Y C two input/state/output systems with X = X Z (and with the same input and
i/s/o if and only output spaces). Then Σi/s/o is a compression along Z onto X of Σ 5 4
if Z can be decomposed into Z = Zo Zi such that the decomposition of A B C D with respect to the decomposition X = Zo X Zi has the following form (where ∗ stands for an irrelevant block) ⎡ ⎤ ∗ ∗ ∗ ∗ D ⎢ C ⎥
B
A ⎢ 0 A ∗ B ⎥ =⎢ 0 0 ∗ (8.11) ⎥. 0
C D ⎣ ⎦ 0 C ∗ D This is a non-orthogonal version of [Aro79, Proposition 4]. For completeness we include a short proof based on Theorem 8.7.
i/s/o
and Σ be the state/signal systems induced by Σ Proof of Theorem 8.24. Let Σ and Σi/s/o , respectively. 5 4
If A B is of the form (8.11), then it is easy to see that Zo is outgoing C D
Moreover, it follows from Lemma invariant and Zo X is strongly invariant for Σ.
is a dilation along Z of Σ, and 8.21 that (8.3) holds. Thus, by Theorem 8.7, Σ
i/s/o is a dilation along Z of Σi/s/o . consequently, by Lemma 8.23, Σ
i/s/o is a dilation along Z of Σi/s/o . Then, by Conversely, suppose that Σ
is a dilation along Z of Σ. Let X = Zo X Z Zi be the decomposition Lemma 8.23, Σ in Theorem 8.7. Then it is easy to see that the fact that Zo is outgoing 5 invariant 4
B
A and Zo X is strongly invariant imposes the structure (8.11) on . That the C D
State/Signal Systems
165
entries in positions (2, 2), (2, 4), (4, 2), and (4, 4) are A, B, C, and D follows from (8.3) and Lemma 8.21. 5 4
It is not difficult to see that the decomposition (8.11) of A B with respect C D to the decomposition X = Zo X Zi is valid if and only if (we denote Zo Zi by Z)
,
⊂ Zo X , Zo ⊂ N (C) R (B)
R (A|Zo ) ⊂ Zo , R (A|Z X ) ⊂ Zo X , (8.12) o
X , B = P Z B,
C = C|
X , D = D.
A = PXZ A| X
∈ B(X ) is an dilation of A ∈ B(X ), i.e., Thus, in particular, A
nX , An = PXZ A|
n ∈ Z+ .
(8.13)
Orthogonal dilations (i.e., dilations where X and Z are orthogonal) play an essential role in the Nagy–Foia¸s theory of harmonic analysis for operators in Hilbert space (see [SF70]) which is intimately connected with the Lax–Phillips scattering theory (see [LP67] and [AA70]). Theorem 8.25. An input/state/output system is minimal if and only if it is controllable and observable. Moreover, an input/state/output system Σ which is not minimal can be compressed into a minimal system (i.e., there is a minimal input/ state/output system which is an compression of Σ). This is a non-orthogonal version of [Aro79, Propositions 3 and 4, p. 151]. It is easy to deduce this theorem from Theorems 8.18 and 8.19 in the same way as we derived Theorem 8.24 from Theorem 8.7. We leave the details to the reader. Theorem 8.26. Let Σ be a state/signal system. Then the following conditions are equivalent: 1) Σ is minimal. 2) Σ is controllable and observable. 3) Σ has a minimal input/state/output representation. 4) Σ has a controllable driving variable representation and an observable output nulling representation. 5) Every input/state/output representation of Σ is minimal. 6) Every driving variable representation of Σ is controllable, and every output nulling representation of Σ is observable. Proof. This follows from Propositions 3.5, 4.5, and 5.5, and Theorems 8.18 and 8.25. 5 4
B
A A B ] ; X , U, Y be
i/s/o = and Σi/s/o = [ C Lemma 8.27. Let Σ D
D
; X , U, Y C
= X Z. Denote the four block transfer two input/state/output systems with X 5 4 5 4
B(z) A(z) B(z)
i/s/o and Σi/s/o by A(z) and functions of Σ
C(z) D(z) , respectively. Then C(z) D(z) the following conditions are equivalent:
166
D.Z. Arov and O.J. Staffans
i/s/o is a dilation along Z of Σi/s/o . 1) Σ 2) For all n ∈ Z+ ,
n |X , An = PXZ A
A
n |X , CAn = C
n B,
An B = PXZ A n n
A
B,
CA B = C
3) For all z in some neighborhood at zero, D C Z C
P A(z)| A(z) B(z) X = X C(z)|X C(z) D(z)
D = D. D
PXZ B(z) .
D(z)
(8.14)
(8.15)
Proof. The equivalence of 1) and 2) follows from (6.7), and the equivalence of 2) and 3) follows from (6.6).
= (V ; X , W) and Σ = (V ; X , W) be two state/signal systems Theorem 8.28. Let Σ
with X = X Z (and with the same signal space W).
is a dilation of Σ if and only if there exist driving variable representations 1) Σ
dv/s/s and Σdv/s/s of Σ
and Σ, respectively, with the property that Σ
dv/s/s is Σ a dilation along Z of Σdv/s/s (in the input/state/output sense; in particular they have the same driving variable space).
is a dilation of Σ, then to every driving variable representation Σdv/s/s 2) If Σ /
dv/s/s of Σ
of Σ there exists at least one driving variable representation Σ
dv/s/s is a dilation along Z of Σdv/s/s (in the input/state/output such that Σ sense). Proof. Assertion 1) follows from Remark 5.2 and Lemma 8.23. To prove assertion 2) we take an arbitrary driving-variable representation 4 5 2 A B 3
A B
, L,
W be the
dv/s/s = Σdv/s/s = ; X ; X , L, W of Σ. Let Σ
D
C D C
mentioned in part 1). Then by Theorem 6.1, driving variable representation of Σ
and M ∈ B(L; L),
with M boundedly invertthere exist operators K ∈ B(X ; L) ible, such that D C Z D C
(z) P Z B (z) A (z) B (z) PX A X =
(z)
(z) C (z) D (z) C D C D−1 C D 1X 1X 0 0 × ,
(z) 1L − K P Z B
(z) −K PXZ A 0 M X
= K P Z . Then the right-hand side is the compression along Z of the Define K X function DC D−1 C D C
(z)
(z) B 1X 1X 0 0 A ,
(z) −K
(z) 1L − K
(z)
(z) D
B
A C 0 M which according to Theorem 5 4 1is the 5four-block transfer function of the driving 4 6.1 0
B
* A
By Lemma 8.27, Σ
dv/s/s is a dilation X of Σ. variable representation
M C D K along Z of Σdv/s/s .
State/Signal Systems
167
= (V ; X , W) and Σ = (V ; X , W) be two state/signal systems Theorem 8.29. Let Σ
with X = X Z (and with the same signal space W).
is a dilation of Σ if and only if there exist output nulling representations 1) Σ
and Σ, respectively, with the property that Σ
s/s/on is
s/s/on and Σs/s/on of Σ Σ a dilation along Z of Σs/s/on (in the input/state/output sense; in particular they have the same error space).
is a dilation of Σ, then to every output nulling representation Σs/s/on of 2) If Σ
such that
s/s/on of Σ Σ there exists at least one output nulling representation Σ
s/s/on is a dilation along Z of Σs/s/on (in the input/state/output sense). Σ The proof of this theorem is similar to the proof of Theorem 8.28, and we leave it to the reader.
9. Stability Below we shall introduce and study different stability notions for state/signal systems. These are related to the stability of different representations of the system. In this connection we interpret each representation as an input/state/output system, and apply the following notion of stability. Definition 9.1. A input/state/output system is 1) stable, if the following implication holds for all its trajectories (x(·), u(·), y(·)): u(·) ∈ 2 (Z+ ; U) ⇒ x(·) ∈ ∞ (Z+ ; X ) and y(·) ∈ 2 (Z+ ; Y).
(9.1)
2) strongly stable, if the following implication holds for all its trajectories (x(·), u(·), y(·)): u(·) ∈ 2 (Z+ ; U) ⇒ lim x(n) = 0 and y(·) ∈ 2 (Z+ ; Y). n→∞
(9.2)
3) power stable, if there exists a constant r > 1 such that the following implication holds for all its trajectories (x(·), u(·), y(·)): u(·) = 0 ⇒ lim rn x(n) = 0. n→∞
(9.3)
It is clear that (9.2) implies (9.1).
AB Lemma 9.2. An input/state/output system Σi/s/o = [ C D ] ; X , U, Y with the four 2A B3 block transfer function C D is stable if and only if the following four conditions hold: 1) There is a constant C > 0 such that An ≤ C for all n ∈ Z+ . 2) B(z)∗ x ∈ H 2 (D; U) for all x ∈ X . 3) C(z)x ∈ H 2 (D; Y) for all x ∈ X . 4) D ∈ H ∞ (D; U, Y). This lemma is undoubtedly known, but we have not been able to find an explicit statement in the literature. (A continuous time version of this lemma can easily be derived from [Sta05].) For completeness we therefore include a short proof.
168
D.Z. Arov and O.J. Staffans
Proof. Clearly, Σi/s/o is stable if and only if the four input-state-output maps listed in (6.8) have the following properties: ˇ maps X into ∞ (Z+ ; X ); 1 ) A ˇ maps 2 (Z+ ; U) into ∞ (Z+ ; X ); 2) B ˇ 3 ) C maps X into 2 (Z+ ; X ); ˇ maps 2 (Z+ ; U) into 2 (Z+ ; Y). 4 ) D We claim that each one of these conditions is equivalent to the corresponding condition listed in the statement Lemma 9.2. It is easy to see that all of these operators are always closed as operators between the indicated spaces, so by the closed graph theorem, 1 )–4 ) are equivalent to the corresponding statements where we require each of these maps to be bounded, i.e., ˇ ∈ B(X ; ∞ (Z+ ; X )); 1 ) A ˇ ∈ B(2 (Z+ ; U); ∞ (Z+ ; X )); 2 ) B ˇ ∈ B(X ; 2 (Z+ ; X )); 3 ) C ˇ ∈ B(2 (Z+ ; U); 2 (Z+ ; Y)). 4 ) D Obviously, 1) is equivalent to 1 ). Condition 1) implies that D ⊂ ρ(A), and hence all the transfer functions listed in 2)–4) are defined and analytic on D. That 3) is equivalent to 3 ) follows from the fact that the Z-transform is a bounded linear map from 2 (Z+ ; U) onto H 2 (D; Y) with a bounded inverse. The equivalence of ˇ maps 2 (Z+ ; U) into 4) and 4 ) is well known: a causal convolution operator D 2 + ∞ (Z ; Y) if and only if its symbol D belongs to H (D; U, Y). The equivalence of 2) and 2 ) remains to be established. It is easy to see that 2 ) is equivalent to the following condition:
n }n∈Z+ of operators defined by B
n u = n Ak Bu(−k − 1) 2 ) the sequence {B k=0 is uniformly bounded in B(2 (Z− ; U); X ).
n u is a Assume that 2 ) holds. Then, for each u ∈ 2 (Z− ; U), the sequence B 2 − {u(k)}k<m Cauchy sequence in X (since the norm in (Z ; U) of the sequence
Then Bu
= ∞ Ak Bu(−k − tends to zero as m → −∞). Denote the limit by B. k=0
∈ B(2 (Z− ; U); X ). By duality, B
∗ ∈ B(X ; 2 (Z− ; U)). This is equivalent 1) and B to the statement that the operator x → B ∗ (A∗ )n x, n ∈ Z+ , maps X into 2 (Z+ ; Y), which equivalent to 2) (in the same way as 3) is equivalent to 3 )). Thus, 2 ) ⇒
∗ above is 2). Conversely, if 2) holds, then the operator that we denoted by B
bounded, hence so is B, and this implies 2 ). AB Lemma 9.3. An input/state/output system Σi/s/o = [ C D ] ; X , U, Y is strongly stable if and only if it is stable and A is strongly stable, i.e., limn→∞ An x = 0 for all x ∈ X . Also this lemma must be known, but we have not found an explicit proof in the literature (a proof of the well-posed continuous time version of this lemma is given in [Sta05], and the discrete time proof is the same). For the convenience of the reader we therefore again include a short proof.
State/Signal Systems
169
Proof. It is easy to see that if Σi/s/o is strongly stable then it is stable, and limn→∞ An x = 0 for all x ∈ X . Let us therefore only prove the converse part. Let (x(·), u(·), y(·)) be a trajectory of Σi/s/o on Z+ with u ∈ 2 (Z+ ; U). Fix ∞ > 0. Choose m large enough so that k=m u(k)2 ≤ 2 . Then we have for all n ≥ m, n−m−1 x(n) = An−m x(m) + An−k−1 Bu(m + k) k=0
Here An−m x(m) → 0 as n → ∞ (because of the strong stability of A), and the norm of the second term is at most C , where C is the norm of the mapping ˇ ∈ B(2 (Z+ ; U); ∞ (Z+ ; X )). Since was arbitrary, this implies that x(k) → 0 as B k → ∞. Remark 9.4. As is well known, conditions 2) and 3) in Lemma 9.2 imply that the sums C := An BB ∗ (A∗ )n , (9.4) n∈Z+
O :=
(A∗ )n C ∗ CAn ,
(9.5)
n∈Z+
converge monotonically in the strong sense to nonnegative operators O ∈ B(X ) and C ∈ B(X ), respectively. These are called the infinite time controllability, respectively, observability Gramians of the system. They are the minimal nonnegative solutions of the Stein equations H − AHA∗ = BB ∗ , ∗
∗
G − A GA = C C,
(9.6) (9.7)
respectively. If A is strongly stable, then the nonnegative solution H of (9.6) is unique (hence H = C), and if A∗ is strongly stable (i.e., (A∗ )n x → 0 for all x ∈ X ), then the nonnegative solution G of (9.7) is unique (hence G = O). B Lemma 9.5. Let Σi/s/o = [ A C D ] ; X , U, Y be an input/state/output system. Then the following conditions are equivalent: 1) Σi/s/o is power stable; 2) D := {z ∈ C | |z| ≤ 1} ⊂ ΛA ; 3) There exists constants q < 1 and C > 0 such that An ≤ Cq n . Proof. Clearly 2) and 3) are equivalent. It is also clear that 3) implies 1). For the converse implication we observe that condition 1) says that there is some r > 1 such that limn→∞ rn An x = 0 for all x ∈ X . By the uniform boundedness principle, supn∈Z+ rn An < ∞. This implies 3) with γ = 1/r. Lemma 9.6. Every power stable input/state/output system is strongly stable. Proof. This follows from Lemmas 9.2, 9.3, and 9.5.
Thus, power stability implies strong stability, which further implies stability.
170
D.Z. Arov and O.J. Staffans
We call a driving variable or output nulling representation of a state/signal system stable, or strongly stable, or power stable, if it has this property when it is interpreted as an input/state/output system, as explained in Remark 5.4. Definition 9.7. A state/signal system is 1) stabilizable (or strongly stabilizable, or power stabilizable) if it has a stable (or strongly stable, or power stable, respectively) driving variable representation. 2) detectable (or strongly detectable, or power detectable) if it has a stable (or strongly stable, or power stable, respectively) output nulling representation. 3) LFT-stabilizable 8 (or strongly LFT-stabilizable, or power LFT-stabilizable), if it has a stable (or strongly stable, or power stable, respectively) input/ state/output representation. Next we shall show that the above notions are closely connected to the corresponding (better known) notions for input/state/output systems.9 B Definition 9.8. An input/state/output system Σi/s/o = [ A C D ] ; X , U, Y is 1) stabilizable (or strongly stabilizable, or power stabilizable) if there exists an operator L ∈ B(X ; U), called a state feedback operator, such 4that 5the new iny(·) u(·)
put/state/output system with input (·) and output w(·) = by the system of equations
, described
x(n + 1) = Ax(n) + Bu(n), y(n) = Cx(n) + Du(n), u(n) = Lx(n) + (n),
(9.8) z∈Z , +
is stable (or strongly stable, or power stable, respectively). 2) detectable (or strongly detectable, or power detectable) if there exists an operator H ∈ B(Y; X ), called an output injection operator, such that the 5 4 new input/state/output system with input w(·) = described by the system of equations
e(·) u(·)
and output y(·),
x(n + 1) = Ax(n) + Hy(n) + Bu(n), y(n) = Cx(n) + e(n) + Du(n),
z ∈ Z+ ,
(9.9)
is stable (or strongly stable, or power stable, respectively). 3) output feedback stabilizable (or strongly output feedback stabilizable, or power output feedback stabilizable) if there exists an operator K ∈ B(Y; U), called a output feedback operator, such that 1U − KD has a bounded inverse and the 8 LFT
stands for Linear Fractional Transformation. number of slightly different ways of presenting these notions do exist. We have chosen to present a version which makes the connection to the state/signal theory as simple as possible. This is a discrete time analogue of the approach used in [Sta05, Chapter 7]. 9A
State/Signal Systems
171
new input/state/output system with input (·) and output y(·), described by the (implicit) system of equations (where u(n) should be eliminated) x(n + 1) = Ax(n) + Bu(n), y(n) = Cx(n) + Du(n), u(n) = Ky(n) + (n),
(9.10) z∈Z , +
is stable (or strongly stable, or power stable, respectively). 4) LFT-stabilizable (or strongly LFT-stabilizable, or power LFT-stabilizable), 2 11 Ψ12 3 ∈ if there exists Hilbert spaces Y and U and an operator Ψ = Ψ Ψ21 Ψ22 5 4 2 Y 3 Y B( U ; ), called an LFT-feedback operator, such that both Ψ itself and U
Ψ21 D+Ψ22 have bounded inverses, and such that the new input/state/output system with input u1 (·) and output y1 (·), described by the (implicit) system of equations (where u(n) and y(n) should be eliminated) x(n + 1) = Ax(n) + Bu(n), y(n) = Cx(n) + Du(n), (9.11)
y1 (n) = Ψ11 y(n) + Ψ12 u(n), u1 (n) = Ψ21 y(n) + Ψ22 u(n),
z ∈ Z+ ,
is stable (or strongly stable, or power stable, respectively). More explicitly, the resulting input/state/output 4systems5have the following AL B L 2 3 structure. If we denote the system in part 1) by ΣL = C ; X , U, Y , then L U DL ⎤ ⎡ B A + BL D C L BL A ⎥ ⎢ = (9.12) ⎣ C + DL D ⎦. C L DL L 1U 4 H BH 5 2 3 ;X, Y If we denote the system in part 2) by ΣH = A U , Y , then C H DH B A D C H H B + HD A + HC BH A . (9.13) = C H DH C 1Y D 4 K B K 5 ; X , U, Y , then If we denote the system in part 3) by ΣK = A K K C D C
AK CK
D C −1 −1 D BK A + BK (1Y − DK) C B (1U − KD) = −1 −1 DK (1Y − DK) C D (1U − KD) DC C D−1 1K A B 0 = C D −KC 1U − KD C D−1 C D 1 −BK A B = Y . 0 1Y − DK C D
(9.14)
172
D.Z. Arov and O.J. Staffans
If we denote the system in part 4) by ΣΨ = Lemma 10.1) C Ψ D C A BΨ A = C Ψ DΨ Ψ11 C
B Ψ11 D + Ψ12
DC
4 AΨ
CΨ
1X Ψ21 C
BΨ DΨ
5
Y , then (see also ; X , U,
0 Ψ21 D + Ψ22
D−1 ,
(9.15)
or equivalently, AΨ = A − B(Ψ21 D + Ψ22 )−1 Ψ21 C, B Ψ = B(Ψ21 D + Ψ22 )−1 , C Ψ = Ψ11 C − (Ψ11 D + Ψ12 )(Ψ21 D + Ψ22 )−1 Ψ21 C,
(9.16)
DΨ = (Ψ11 D + Ψ12 )(Ψ21 D + Ψ22 )−1 . When we apply Definition 9.8 to various systems it is often more convenient to use the following equivalent characterization: AB Lemma 9.9. Let Σi/s/o = [ C D ] ; X , U, Y be an input/state/output system. 2 3 4 L B L 5 1) The system ΣL = A ; X , U, Y whose coefficient matrix is given by U C L DL (9.12) is Dstable (or strongly stable, or power stable) if and only if the system C AL B 2 Y 3 has the same property. C 0 ; X , U, U L 0 4 H H5 2Y 3 A B 2) The system ΣH = C ; X , H H U , Y whose coefficient matrix is given D by (9.13) is stable (or strongly stable, or power stable) if and only if the system 2 3 4 AH H B 5 ;X, Y U , Y has the same property. C 0 0 4 AK B K 5 ; X , U, Y whose coefficient matrix is given by 3) The system ΣK = C K DK (9.14) 2 K is3 stable (or strongly stable, or power stable) if and only if the system A B ; X , U, Y has the same property. C 0 4 AΨ B Ψ 5
Y whose coefficient matrix is given by ; X , U, 4) The system ΣΨ = C Ψ Ψ D (9.15) is3 stable (or strongly stable, or power stable) if and only if the system 2 AΨ B ; X , U, Y has the same property. C 0 Proof. Proof of 1): The latter system differs from ΣL only in the sense that we have subtracted a multiple of the first input from the second input and modified the feedthrough term, and this does not affect stability. Proof of 2): The latter system differs from ΣH only in the sense that we have subtracted a multiple of the second output from the first output and modified the feedthrough term, and this does not affect stability. Proof of 3): The latter system differs from ΣK only in the sense that we have multiplied both the input and the output by bounded invertible operators and modified the feedthrough term, and this does not affect stability. Proof of 4): The latter system differs from ΣΨ only in the sense that we have multiplied both the input and the output by bounded invertible operators and modified the feedthrough term, and this does not affect stability. Indeed, the
State/Signal Systems
173
operator multiplying C to the left is invertible, because of the invertibility of Ψ and the following Schur factorization: C DC DC D Ψ11 Ψ12 1Y D 0 1Y Ψ21 Ψ22 0 1U −(Ψ21 D + Ψ22 )−1 Ψ21 1U C D Ψ11 − (Ψ11 D + Ψ12 )(Ψ21 D + Ψ22 )−1 Ψ21 Ψ11 D + Ψ12 = . 0 Ψ21 D + Ψ22 AB Lemma 9.10. Let Σi/s/o = [ C D ] ; X , U, Y be an input/state/output system. 1) If Σi/s/o is output feedback stabilizable (or strongly output feedback stabilizable, or power output feedback stabilizable), then it is also LFT-stabilizable (or strongly LFT-stabilizable, or power LFT-stabilizable, respectively). 2) If Σi/s/o is LFT-stabilizable (or strongly LFT-stabilizable) with an LFT2 11 Ψ12 3 feedback operator Ψ = Ψ where Ψ22 has a bounded inverse, then it Ψ21 Ψ22 is also output feedback stabilizable (or strongly output feedback stabilizable, respectively). 3) If Σi/s/o is LFT-stabilizable (or strongly LFT-stabilizable) and D = 0, then it is also output feedback stabilizable (or strongly output feedback stabilizable, respectively). 4) If Σi/s/o is power LFT-stabilizable then it is also power output feedback stabilizable. 5) If Σi/s/o is LFT-stabilizable, then it is both stabilizable and detectable. 2 1Y 0 3 Proof. Proof of 1) : Take Y = Y, U = U, and Ψ = −K 1U . Proof of 2): Use parts 3)–4) of Lemma 9.9, and take K = −Ψ−1 22 Ψ21 . Proof of 3): This follows from 2), since the assumption that D = 0 implies that Ψ22 = Ψ21 D + Ψ22 has a bounded inverse. Proof of 4): The claim 2) remains valid also in the power stabilizable case, with the same proof. However, in the power stabilizable case the spectral radius of the operator AΨ lies strictly inside the unit disk. This implies that the set of all LFT-feedbacks Ψ which power stabilize Σi/s/o is open. Therefore, if it is nonempty, it must contain some element Ψ for which Ψ22 is invertible. Thus, by the power stable version of part 2), Σi/s/o is power output feedback stabilizable. Proof of 5): Use parts 1), 2) and 4) of Lemma 9.9, and take L = −(Ψ21 D + Ψ22 )−1 Ψ21 C and H = −B(Ψ21 D + Ψ22 )−1 Ψ21 . Also note that L is a left multiple of C and that H is a4right multiple of B, which implies that in this case ΣL is AL B 5 stable if and only if ; X , U, Y is stable, and ΣH is stable if and only if C 0 4 AH B 5 ; X , U, Y is stable. C
0
Theorem 9.11. Let Σ = (V ; X , W) be a state/signal node. 1) The following conditions are equivalent. (a) Σ is stabilizable (or strongly stabilizable, or power stabilizable); (b) Σ has a stabilizable (or strongly stabilizable, or power stabilizable) input/ state/output representation;
174
D.Z. Arov and O.J. Staffans
(c) every input/state/output representation of Σ is stabilizable (or strongly stabilizable, or power stabilizable). 2) The following conditions are equivalent. (a) Σ is detectable (or strongly detectable, or power detectable); (b) Σ has a detectable (or strongly detectable, or power detectable) input/ state/output representation; (c) every input/state/output representation of Σ is detectable (or strongly detectable, or power detectable). 3) The following conditions are equivalent. (a) Σ is LFT-stabilizable (or strongly LFT-stabilizable, or power LFT-stabilizable); (b) Σ has a LFT-stabilizable (or strongly LFT-stabilizable, or power LFTstabilizable) input/state/output representation; (c) every input/state/output representation of Σ is LFT-stabilizable (or strongly LFT-stabilizable, or power LFT-stabilizable). Proof. The proofs of the strongly stable and power stable versions of this theorem are identical to the proofs of the basic version, so below we shall only prove the basic “stable” version. Proof of 1): We prove this by showing that (b) ⇒ (a) ⇒ (c) (the implication (c) ⇒ (b) is trivial). AB Let Σi/s/o = [ C D ] ; X , U, Y be a stabilizable input/state/output representation of Σ, and let L ∈ B(Y; U) be a stabilizing state feedback operator. Then the system ΣL whose coefficient matrix is given by (9.12) is stable. This system has an obvious interpretation as a driving variable representation of Σ (with driving variable space U). Thus, according to Definition 9.7, Σ is stabilizable. Conversely, that 3 Σ is stabilizable (in the sense of Definition 9.7). Let 2 A B suppose Σdv/s/s = C D ; X , L, W be a stable driving variable representation of Σ, and AB let Σi/s/o = [ C D ] ; X , U, Y be an arbitrary input/state/output representation of Σ. We can alternatively interpret this representation, too, as a driving 5 4 variable C representation as explained in Remark 5.2. Split C and D into C = C1 and 2 4 5 D D = D1 in accordance with the splitting W = Y U. Then, by Theorem 2
3.3, there exist operators L ∈ B(X ; U) and M ∈ B(L; U), with M boundedly invertible, such that ⎡
A ⎣C1 C2
⎤ ⎡ A B D1 ⎦ = ⎣C D2 0
⎤ ⎡ D B C A + BL 1 0 ⎣ C + DL D⎦ = L M L 1U
⎤ BM DM ⎦ . M
(9.17)
This coefficient matrix is identical to the one in (9.12) apart from the fact that the input variable has been multiplied by the invertible operator M . This means that L is a stabilizing state feedback operator for Σi/s/o .
State/Signal Systems
175
The proof of Part 2) is similar to the proof of Part 1), and it is left to the reader (this time we interpret the input/state/output representation as an output nulling representation as explained in Remark 5.2). Proof of 3): The implication (a) ⇒ (c) follows from Theorem 5.11 (take Ψ to be the operator Θ defined in (1.6)), and the implication (c) ⇒ (b) is trivial. Thus, it remains to prove the implication (b) ⇒ (a). AB Let Σi/s/o = [ C representation of D ] ; X , U, Y be an input/state/output 2 Y 3 4 Y 5 Σ with a LFT-stabilizing feedback operator Ψ ∈ B( U ; ), and let ΣΨ = U 4 AΨ B Ψ 5
, Y be the stable input/state/output system whose coefficient ; X , U Ψ Ψ C D matrix is given by (9.15). We claim that there exists an admissible input/output decomposition W = Y1 U1 of W such that the corresponding input/state/output representation is stable. The proof of this claim is by direct construction. 4 5 2 13
Y We begin by interpreting Ψ as an operator Ψ = Ψ Ψ2 ∈ B(W; ), where U
Ψ1 = Ψ11 PYU + Ψ12 PUY 4and5 Ψ2 = Ψ21 PYU + Ψ22 PUY . The bounded inverse of this
:= Ψ−1 = operator belongs to B( Y ; W), and it can be decomposed into Ψ U 2 3
1 Ψ
2 . Define Ψ 2 3 2 3 Y1 = N Ψ2 , U1 = N Ψ1 .
Define P ∈ B(W) and Q ∈ B(W) by C D Ψ
P := Ψ 1 , 0
Q := Ψ
C
D 0 . Ψ2
Clearly P + Q = 1W . For all w ∈ Y1 we have Qw = 0, hence P w = w, and for all w ∈ U1 we have P w = 0, hence Qw = w. This implies that P and Q are complementary projections in W, with R (P ) = N (Q) = Y1 and N (P ) = R (Q) = U1 , i.e., P = PYU11 and Q = PUY11 . In particular, this implies that W = Y1 U1 .
1 , and Furthermore, Ψ1 maps Y1 one-to-one onto Y with the bounded inverse Ψ
onto DU with the bounded inverse Ψ2 . Ψ1 maps U1 one-to-one C U
Let Φ :=
U
PY 1 |Y PY 1 |U
1 Y PU 1 |Y 1
1 Y
PU 1 |U
. This is the same operator that we find in (1.6),
1
corresponding to the two decompositions W = Y U = Y1 U1 , and it is explicitly given by C D
Ψ
Ψ Ψ Ψ Φ = 1 11 1 12 . Ψ2 Ψ21 Ψ2 Ψ22
2 (Ψ12 D+Ψ22 ) is invertible, and by Theorem 5.11, the In particular, Φ12 D+Φ22 = Ψ decomposition W = Y1 U1 is admissible. Let us denote the corresponding input/ 2 3 A1 B1 1 state/output system by Σi/s/o = C1 D1 ; X , U1 , Y1 . This system is obtained Ψ
−1 and the output by Ψ
1 . Thus, Σ1 from Σ by multiplying the input by Ψ is 2
stable.
i/s/o
176
D.Z. Arov and O.J. Staffans
10. Appendix Lemma 10.1. Let A ∈ B(X ; Z) and B ∈ B(Z; X ). 1) 1X − BA has a bounded inverse if and only if 1Z − AB has a bounded inverse. 2) If 1X − BA has a bounded inverse, then (1Z − AB)−1 = 1X + A(1X − BA)−1 B, B(1Z − AB)−1 = (1X − BA)−1 B.
(10.1)
For a proof see, e.g., [Sta05, Appendix A4].
Acknowlegment Damir Z. Arov thanks ˚ Abo Akademi for its hospitality and the Academy of Finland for its financial support during his visits to ˚ Abo in 2003–2005. He also gratefully acknowledges the partial financial support by the joint grant UM1-2567-OD-03 from the U.S. Civilian Research and Development Foundation (CRDF) and the Ukrainian Government. Olof J. Staffans gratefully acknowledges the financial support by grant 203991 from the Academy of Finland.
References [AA70]
Vadim M. Adamyan and Damir Z. Arov, On unitary couplings of semiunitary operators, Eleven Papers in Analysis (Providence, R.I.), American Mathematical Society Translations, vol. 95, American Mathematical Society, 1970, pp. 75– 129. [Aro74] Damir Z. Arov, Scattering theory with dissipation of energy, Dokl. Akad. Nauk SSSR. 216 (1974), 713–716, Translated in Soviet Math. Dokl. 15 (1974), 848– 854. , Passive linear stationary dynamic systems, Sibir. Mat. Zh. 20 (1979), [Aro79] 211–228, Translated in Sib. Math. J. 20 (1979), 149-162. [Bel68] Vitold Belevitch, Classical network theory, Holden-Day, San Francisco, Calif.Cambridge-Amsterdam, 1968. [BGK79] Harm Bart, Israel Gohberg, and Marinus A. Kaashoek, Minimal factorization of matrix and operator functions, Operator Theory: Advances and Applications, vol. 1, Birkhauser-Verlag, ¨ Basel Boston Berlin, 1979. [BS05] Joseph A. Ball and Olof J. Staffans, Conservative state-space realizations of dissipative system behaviors, To appear in Integral Equations Operator Theory (2005), 63 pp. [Fuh74] Paul A. Fuhrmann, On realization of linear systems and applications to some questions of stability, Math. Systems Theory 8 (1974), 132–140. [Hel74] J. William Helton, Discrete time systems, operator models, and scattering theory, J. Funct. Anal. 16 (1974), 15–38. [LP67] Peter D. Lax and Ralph S. Phillips, Scattering theory, Academic Press, New York, 1967.
State/Signal Systems [PW98]
177
Jan Willem Polderman and Jan C. Willems, Introduction to mathematical systems theory: A behavioral approach, Springer-Verlag, New York, 1998. [SF70] B´ela Sz.-Nagy and Ciprian Foia¸¸s, Harmonic analysis of operators on Hilbert space, North-Holland, Amsterdam London, 1970. [Sta05] Olof J. Staffans, Well-posed linear systems, Cambridge University Press, Cambridge and New York, 2005. [WT02] Jan C. Willems and Harry L. Trentelman, Synthesis of dissipative systems using quadratic differential forms: Part II, IEEE Trans. Autom. Control 47 (2002), 53– 69. Damir Z. Arov Division of Mathematical Analysis Institute of Physics and Mathematics South-Ukrainian Pedagogical University 65020 Odessa, Ukraine Olof J. Staffans ˚ Abo Akademi University Department of Mathematics FIN-20500 ˚ Abo, Finland URL: http://www.abo.fi/~staffans/
Operator Theory: Advances and Applications, Vol. 161, 179–223 c 2005 Birkhauser ¨ Verlag Basel/Switzerland
Conservative Structured Noncommutative Multidimensional Linear Systems Joseph A. Ball, Gilbert Groenewald and Tanit Malakorn Abstract. We introduce a class of conservative structured multidimensional linear systems with evolution along a free semigroup. The system matrix for such a system is unitary and the associated transfer function is a formal power series in noncommuting indeterminates. A formal power series T (z1 , . . . , zd ) in the noncommuting indeterminates z1 , . . . , zd arising in this way satisfies a noncommutative von Neumann inequality, i.e., substitution of a d-tuple of noncommuting operators δ = (δ1 , . . . , δd ) on a fixed separable Hilbert space which is contractive in the appropriate sense yields a contraction operator T (δ) = T (δ1 , . . . , δd ). We also obtain the converse realization theorem: any formal power series satisfying such a von Neumann inequality can be realized as the transfer function of such a conservative structured multidimensional linear system. Mathematics Subject Classification (2000). Primary 47A56; Secondary 13F25, 47A60, 93B28. Keywords. Formal power series, noncommuting indeterminates, energy balance, Hahn-Banach separation argument, noncommutative Schur-Agler class.
Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Structured noncommutative multidimensional linear systems: basic definitions and properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Adjoint systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Dissipative and conservative structured multidimensional linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Conservative SNMLS-realization of formal power series in the class SAG (U, Y) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
180 183 191 193 199 220
The first author was partially supported by US National Science Foundation under Grant Number DMS-9987636; The second author is supported by the National Research Foundation of South Africa under Grant Number 2053733; The third author was supported by a grant from Naresuan University, Thailand.
180
J.A. Ball, G. Groenewald and T. Malakorn
1. Introduction This paper concerns extensions of the classical theory of conservative discrete-time linear systems to the setting of conservative structured multidimensional linear systems with evolution along a finitely generated free semigroup (words in a finite set of letters). By way of introduction we first review the relevant points of the classical theory. By a (classical) conservative discrete-time input/state/output (i/s/o) linear system, we mean a system of equations of the form x(n + 1) = Ax(n) + Bu(n) Σ = Σ(U ) : (1.1) y(n) = Cx(n) + Du(n) such that the so-called connection matrix or colligation C D C D C D A B H H U= : → C D U Y
(1.2)
is unitary. Here we assume that x(n) takes values in the state space H, u(n) takes values in the input space U and y(n) takes values in the output space Y where H, U and Y are all assumed to be Hilbert spaces. The unitary property of the colligation U leads to the energy balance relation x(n + 1)2 − x(n)2 = u(n)2 − y(n)2 .
(1.3)
Summing over all n with 0 ≤ n ≤ N leads to x(N + 1)2 − x(0)2 =
N 2
3 u(n)2 − y(n)2 .
n=0
In particular, if we assume that x(0) = 0 and let N → ∞ we get ∞ ∞ y(n)2 ≤ u(n)2 . n=0
(1.4)
n=0
Application of the Z-transform {x(n)}n∈Z+ → x (z) :=
x(n)z n
n∈Z+
to the system equations (1.1) leads to the frequency-domain formulas x (z) = (I − zA)−1 x(0) + z(I − zA)−1 B u(z)
(1.5)
(z) y(z) = C(I − zA)−1 x(0) + TΣ (z) · u
(1.6)
where
TΣ (z) = D + zC(I − zA)−1 B. In particular, if we assume x(0) = 0 we get the input-output relation y(z) = TΣ (z) · u (z).
From (1.4) and the Plancherel theorem we then see that H 2 (D,Y) ≤ uH 2 (D,U ) T TΣ · u
(1.7)
Conservative Noncommutative Systems
181
∞ for all u ∈ H 2 (D, U) (the Hardy space of U-valued functions u (z) = n=0 u(n)z n on the unit disk D with norm-square summable Taylor coefficients (u2H 2 (D,U ) = ∞ 2 TΣ is in the operator-valued n=0 u(n) < ∞). As a result it follows that n Schur class S(U, Y) consisting of functions S(z) = ∞ n=0 Sn z analytic on D with values equal to contraction operators from ∞U into Y. Conversely, it is well known that any Schur class function S(z) = n=0 Sn z n ∈ S(U, Y) can be realized as the transfer function of a conservative linear system, i.e., any S ∈ S(U, Y) can be written in the form S(z) = D + zC(I − zA)−1 B for some unitary colligation 2 3 H A B ] : [H] → U = [C Y . Moreover any such S satisfies a von Neumann inequality: D U if K is another Hilbert space and T ∈ L(K)has T < 1, then S(T ) ≤ 1 where ∞ S(T ) ∈ L(U ⊗K, Y ⊗K) is given by S(T ) = n=0 Sn ⊗T n . The following theorem is a convenient summary of the various equivalent characterizations of the operatorvalued Schur class S(U, Y). Theorem 1.1. Let z → S(z) be a L(U, Y)-valued function defined on the unit disk D. Then the following conditions are equivalent: 1. S ∈ S(U, Y), i.e., S is analytic on D and S(z) ≤ 1 for all z ∈ D. 2. S is analytic on D and S(T ) ≤ 1 for any operator T on some Hilbert space K with T < 1. 3. There exists a Hilbert space H and an operator-valued function z → H(z) ∈ L(H , Y) so that I − S(z)S(w)∗ = H(z)H(w)∗ for all z, w ∈ D. 1 − zw 4. S(z) can be realized as the transfer function of a conservative discrete-time i/s/o linear system, i.e., there is a unitary colligation U of the form (1.2) so that S(z) = D + zC(IIH − zA)−1 B. For more information on the Schur class and its applications in both operator theory and engineering, we refer the reader to [42, 25, 26, 46, 38, 47]. Recent work has generalized these ideas to multivariable settings in several ways. We mention [1, 2, 17, 15, 16] for extensions to the polydisk Dn ⊂ Cn setting, [33, 52, 10, 30, 3, 8, 18, 45, 37, 4] for extensions to the unit ball Bn ⊂ Cn (where additional refinements concerning Nevanlinna-Pick-type interpolation and lifting theorems are also explored), and the recent work [56, 7, 6, 12] which suggests how a unification of these two settings can be achieved. In the present paper we generalize these ideas to other types of conservative structured multidimensional linear systems. This paper can be considered as a sequel to our paper [13] where we introduced and studied a general class of systems called structured noncommutative multidimensional linear systems (SNMLSs). These systems have evolution along a free semigroup rather than along an integer lattice as is usually taken in work in multidimensional linear system theory, and the transfer function is a formal power series in noncommuting indeterminates rather than an analytic function of several complex variables. In [13] it is assumed
182
J.A. Ball, G. Groenewald and T. Malakorn
that the input space, state space and output space were all finite-dimensional linear spaces, and analogues of the standard results in finite-dimensional linear system theory (such as controllability, observability, Kalman decomposition, state space similarity theorem, Hankel operators and realization theory) were developed. Here we use the same notion of SNMLS as introduced in [13] but take the input space, state space and output space all to be Hilbert spaces and introduce a notion of conservative SNMLS for which the system and its adjoint satisfy an energy balance relation. The main result is Theorem 5.3 which can be viewed as a far-reaching generalization of Theorem 1.1. In this generalization, the unit disk is replaced by a tuple of (not necessarily commuting) operators δ = (δ1 , . . . , δd ) on some Hilbert space K in an appropriate noncommutative Cartan domain ( di=1 Ii ⊗ δi < 1 for an appropriate collection of n∞× m matrices I1 , . . . , Id ), and analytic operatorvalued functions z → T (z) = n=0 Tn z n on the unit disk are replaced by formal power series Tw z w (1.8) T (z) = w∈F Fd
in a set of noncommuting formal indeterminates z = (z1 , . . . , zd ), where the coefficients Tw are operators from U to Y. Here Fd is the free semigroup generated by the set of letters {1, . . . , d}; thus elements of Fd are words w of the form w = iN · · · i1 where ik ∈ {1, . . . , d} for each k = 1, . . . , N . We also consider the empty word ∅ as an element of Fd which serves as the unit element for Fd : ∅ · w = w · ∅ = w for all w ∈ Fd . Given a formal power series T (z) as in (1.8) and an operator-tuple δ = (δ1 , . . . , δd ) we may define T (δ) ∈ L(U ⊗ K, Y ⊗ K) by Tw ⊗ δ w (1.9) T (δ) = w∈F Fd
whenever the series converges, where δ w = δiN · · · δi1 ∈ L(K) if w = iN · · · i1 and δik ∈ L(K) for k = 1, . . . , N. Theorem 5.3 characterizes formal power series T (z) for which a noncommutative von Neumann inequality T (δ) ≤ 1 holds for all operator tuples δ = (δ1 , . . . , δd ) in d a suitable noncommutative Cartan domain i=1 Ii ⊗ δi < 1 in terms analogous to those in Theorem 1.1. One can view the result as a noncommutative analogue of the recent work of Ambrozie-Timotin [7], Ball-Bolotnikov [12] and AmbrozieEschmeier [6] on extensions of the so-called Schur-Agler class to more general domains in Cd . In this more general setting there is no analogue of condition (1) in Theorem 1.1. In the classical case, the implication (2) = =⇒ (3) follows in a rather straightforward way as a consequence of the fact that the Schur class can be identified with the space of contractive multipliers on the Hardy space over the unit disk. This type of argument applies in our setting only in special cases (see Remark 5.11); the general case requires a rather involved separation argument of HahnBanach type first used in this context by Agler for the (commutative) polydisk setting (see [1]). The analogue of implication (3) = =⇒ (4) in Theorem 1.1 follows the now standard “lurking isometry” argument which now has appeared in many
Conservative Noncommutative Systems
183
contexts (see [11] for a survey), while the implication (4) =⇒ = (1) is elementary but in our setting requires some care (see Theorem 4.2). This functional calculus of formal power series considered as functions of noncommuting operator-tuples has been used in the context of robust control and the theory of structured singular values (µ-analysis) – see [22, 23, 24, 44, 57]; we explore these connections further in our paper [14]. We mention that results on formal power series (including polynomials in noncommuting indeterminates) closely related to our Theorem 5.3 below have appeared in the recent work of Helton, McCullough and Putinar [39, 40, 41] on representations of polynomials in noncommuting indeterminates as sums of squares as well as in related work of Kalyuzhny˘-Verbovetzki˘ ˘ ˘ı and Vinnikov [43]. This work has motivation from somewhat different connections with system theory. We indicate more precise connections between this work and our Theorem 5.3 in Remark 5.15 below. In a different direction, the paper of Alpay and Kalyuzhny˘-Verbovetzki˘ ˘ ˘ı [5] introduces the notion of a rational, inner formal power series and develops a realization theory for these (see Remarks 5.2 and 5.5 below). The paper is organized as follows. Following the present Introduction, in Section 2 we review the needed material from [13] on structured noncommutative multidimensional linear systems (SNMLSs). In Section 3 we define the adjoint of a SNMLS (having all signal spaces equal to Hilbert spaces). This gives the natural setting for the definition of a conservative SNMLS in Section 4. Section 5 contains the main Theorem 5.3 on the identification of the structured noncommutative Schur-Agler class with the set of formal power series capable of being realized as the transfer function of a conservative SNMLS.
2. Structured noncommutative multidimensional linear systems: basic definitions and properties We present an infinite-dimensional Hilbert-space version of the structured noncommutative multidimensional linear systems (SNMLS) introduced in [13]. As in graph theory, a graph G consists of a set of vertices V = V (G) and edges E = E(G) connecting vertices. We assume throughout that the sets V and E are both finite, i.e., that G is a finite graph. We are interested only in what we call admissible graphs, i.e., a bipartite graph such that each connected component is a complete bipartite graph. This means simply that: ˙ into the set of 1. the set of vertices V has a disjoint partitioning V = S ∪R source vertices S and range vertices R, K K 2. S and R in turn have disjoint partitionings S = ∪˙ k=1 Sk and R = ∪˙ k=1 Rk into nonempty subsets S1 , . . . , SK and R1 , . . . , RK such that, for each sk ∈ Sk and rk ∈ Rk (with the same value of k) there is a unique edge e = esk ,rk connecting sk to rk (s(e) = sk , r(e) = rk ), and 3. every edge of G is of this form.
184
J.A. Ball, G. Groenewald and T. Malakorn
If v is a vertex of G (so either v ∈ S or v ∈ R) we denote by [v] the path-connected component p (i.e., the complete bipartite graph p = Gk with set of source vertices equal to Sk and set of range vertices equal to Rk for some k = 1, . . . , K) containing v. Thus, given two distinct vertices v1 , v2 ∈ S ∪ R, there is a path of G connecting v1 to v2 if and only if [v1 ] = [v2 ] and this path has length 2 if both v1 and v2 are either in S or in R and has length 1 otherwise. In case s ∈ S and r ∈ R are such that [s] = [r], we shall use the notation es,r for the unique edge having s as source vertex and r as range vertex: es,r ∈ E determined by s(es,r ) = s, r(es,r ) = r.
(2.1)
Note that es,r is well defined only for s ∈ S and r ∈ R with [s] = [r]. We define a structured noncommutative multidimensional linear system (SNMLS) to be a collection Σ = (G, H, U ) where G is an admissible graph, H = {Hp : p ∈ P } is a collection of (separable) Hilbert spaces (called auxiliary state spaces) indexed by the path-connected components P of the graph G, and where U is a connection matrix (sometimes also called colligation) of the form C D C D C D C D A B [Ar,s ] [Br ] ⊕s∈S H[s] ⊕r∈R H[r] U= = : → (2.2) C D [Cs ] D U Y where U and Y are additional (separable) Hilbert spaces (to be interpreted as the input space and the output space respectively). This definition differs from that in [13] in that here we take the auxiliary state spaces Hp , the input space U and the output space Y to be separable (possibly infinite-dimensional) Hilbert spaces rather than finite-dimensional linear spaces. With any SNMLS we associate an input/state/output linear system with evolution along a free semigroup as follows. We denote by FE the free semigroup generated by the edge set E. An element of FE is then a word w of the form w = eN · · · e1 where each ek is an edge of G for k = 1, . . . , N . We denote the empty word (consisting of no letters) by ∅. The semigroup operation is concatenation: if w = eN · · · e1 and w = eN · · · e1 , then ww is defined to be ww = eN · · · e1 eN · · · e1 . Note that the empty word ∅ acts as the identity element for this semigroup. On occasion we shall have use of the notation we−1 for a word w ∈ FE and an edge e ∈ E; by this notation we mean
if w = w e, w −1 we = (2.3) undefined otherwise. with a similar convention for e−1 w. If Σ = (G, H, U ) is an SNMLS, we associate the system equations (with evolution along FE ) ⎧ ⎨ xs(e) (ew) = Σs∈S Ar(e),s xs (w) + Br(e) u(w) xs (ew) = 0 if s = s(e) (2.4) Σ: ⎩ y(w) = Σs∈S Cs xs (w) + Du(w).
Conservative Noncommutative Systems
185
Here the state vector x(w) at position w (for w ∈ FE ) has the form of a column vector x(w) = cols∈S xs (w) with column entries indexed by the source vertices s ∈ S and with column entry xs (w) taking values in the auxiliary state space H[s] (and thus x(w) takes values in the state space ⊕s∈S H[s] ), while u(w) ∈ U denotes the input at position w and y(w) ∈ Y denotes the output at position w. Just as in the classical case, if we specify an initial condition x(∅) ∈ ⊕s∈S H[s] and feed in an input string {u(w)}w∈F FE , then equations (2.4) enables us to recursively compute x(w) for all w ∈ FE \ {∅} and y(w) for all w ∈ FE . The solution of these recursions can be made more explicit as follows. Note first of all that a consequence of the system equations is that x(ew) ∈ Hs(e) := cols∈S [δ s,s(e) H[s(e)] ] for all e ∈ E and w ∈ FE (where δ s,s is the Kronecker delta function). Given x(∅) and {u(w)}w∈F (E), we can solve the system equations (2.4) or (2.7) uniquely for {x(w)}w∈F FE \{∅} and {y(w)}w∈F as follows: FE xs(eN ) (eN · · · e1 ) = Ar(eN ),s(eN −1 ) Ar(eN −1 ),s(eN −2 ) · · · Ar(e1 ),s xs (∅) s∈S
+
N
Ar(eN ),s(eN −1 ) · · · Ar(er+1 ),s(er ) Br(er ) u(er−1 · · · e1 )
(2.5)
r=1
where we interpret u(er−1 · · · e1 ) to be u(∅) where r = 1, and xs (eN eN −1 · · · e1 ) = 0 if s = s(eN ). Also, Cs(eN ) Ar(eN ),s(eN −1 ) Ar(eN −1 ),s(eN −2 ) · · · Ar(e1 ),s xs (∅) y(eN · · · e1 ) = s∈S
+
N
Cs(eN ) Ar(eN ),s(eN −1 ) · · · Ar(er+1 ),s(er ) Br(er ) u(er−1 · · · e1 )
r=1
+ Du(eN · · · e1 ).
(2.6)
This formula must be interpreted appropriately for special cases. As examples, for the particular cases r = 1 and r = N we have the interpretations Ar(eN ),s(eN −1 ) · · · Ar(er+1 ),s(er ) Br(er ) u(er−1 · · · e1 )|r=1 = Ar(eN ),s(eN −1 ) · · · Ar(e2 ),s(e1 ) Br(e1 ) u(∅), Ar(eN ),s(eN −1 ) · · · Ar(er+1 ),s(er ) Br(er ) u(er−1 · · · e1 )|r=N = Br(eN ) u(eN −1 · · · e1 ). The system equations (2.4) can be written more compactly in operatortheoretic form as x(ew) = IΣ;e Ax(w) + IΣ;e Bu(w) (2.7) Σ: y(w) = Cx(w) + Du(w)
186
J.A. Ball, G. Groenewald and T. Malakorn
where IΣ;e : ⊕r∈R H[r] → ⊕s∈S H[s] is given via matrix entries
IH[s(e)] = IH[r(e)] if s = s(e)and r = r(e), [IIΣ;e ]s,r = 0 otherwise. A consequence of the system equations (2.8) is the identity D D C C colr∈R xs[r] (es[r] ,r w) cols∈S xs (w) =U u(w) y(w)
(2.8)
(2.9)
for any choice of source-vertex cross-section p → sp . Here we say that a map p → sp from the set of path-connected components P of G into the set of source vertices S of G is a source-vertex cross-section if, for each path-connected component p ∈ P , the path-connected component of G containing sp ∈ S is equal to p: sp ∈ S for each p ∈ P and [sp ] = p.
(2.10)
More precisely, the system of equations (2.9) is equivalent to (2.7) in the sense that the function w → (u(w), x(w), y(w)) satisfies (2.7) if and only if the function w → (u(w), x(w), y(w)) satisfies (2.9) for every choice of source-vertex crosssection map p → sp ∈ S (see (2.10)). From the fact that (2.9) holds for any choice of source-vertex cross-section p → sp we deduce that the state vector w → x(w) of any system trajectory w → (u(w), x(w), y(w)) satisfies the compatibility condition xs (es,r w) is independent of s ∈ [r] for each fixed r ∈ R and w ∈ FE ,
(2.11)
as can also be seen directly from the system equations (2.4). Note that IΣ;e is already determined by the first two pieces G and H of Σ = (G, H, U ). On occasion we shall need these objects in situations where we have a graph G and a collection of Hilbert spaces H = {Hp : p ∈ P } without the presence of any particular connection matrix U . In this situation we shall use the in place of IΣ;e . We shall also have occasion notation IG,H;e to need the operator pencil ZΣ (z) = e∈E IΣ,e ze , also written as ZG,H (z) = e∈E IG,H;e ze when U is absent or suppressed. Also just as in the classical case, it is convenient to introduce “frequencydomain” notation for explicit representation of system trajectories. For any linear space H, we define the formal noncommutative Z-transform of a sequence of Hvalued functions as a formal power series in several noncommuting indeterminates z = (ze : e ∈ E) as follows: h(w)z w , (2.12) {h(w)}w∈F FE → h(z) = w∈F FE
where z ∅ = 1, z w = zeN zeN −1 · · · ze1 if w = eN eN −1 · · · e1 . Thus
z w · z w = z ww ,
z w · ze = z we for w, w ∈ FE and e ∈ E.
Conservative Noncommutative Systems
187
On occasion we shall have need of multiplication on the right or left by ze−1 ; we use the convention
−1 if we−1 ∈ FE is defined; z we w −1 (2.13) z ze = 0 if we−1 is undefined. where we use the convention (2.3) for the meaning of we−1 . We use the obvious analogous convention to define ze−1 z w . As derived in [13], application of the formal noncommutative Z-transform to the system equations (2.4) and solving gives a frequency-domain formula for the state and output trajectory: x (z) = (I − ZΣ (z)A)−1 x(∅) + (I − ZΣ (z)A)−1 ZΣ (z)B u(z) u(z) y(z) = C(I − ZΣ (z)A)−1 x(∅) + TΣ (z) where we have set ZΣ (z) =
IΣ;e ze
(2.14) (2.15)
e∈E
and where the formal power series given by TΣ (z) = D + C(I − ZΣ (z)A)−1 ZΣ (z)B = T∅ +
∞
(2.16)
Cs(eN ) Ar(eN ),s(eN −1 ) · · · Ar(e2 ),s(e1 ) Br(e1 ) zeN zeN −1 · · · ze2 ze1
N =1 e1 ,...,eN ∈E
(2.17) is the transfer function of the SNMLS Σ. As explained in [13], there are three particular examples worth special mention; we refer to these as (1) noncommutative Fornasini-Marchesini systems, (2) noncommutative Givone-Roesser systems, and (3) noncommutative full-structured multidimensional linear systems. These special cases are defined as follows. Example 2.1. Noncommutative Fornasini-Marchesini systems. We let GF M be the admissible graph with source-vertex set S F M consisting of a single element S F M = {1} and with range-vertex set RF M and set of edges E F M both equal to the finite set {1, . . . , d}, with edge j having source vertex 1 and range vertex j: sF M (j) = 1 and rF M (j) = j for j = 1, . . . , d. Suppose now that Σ = (GF M , H, U F M ) is a SNMLS with structure graph GF M . As GF M has only one path-connected component (P F M = {p1 }, the collection of Hilbert spaces H = {Hp : p ∈ P } collapses to a single Hilbert space H. The connection matrix U F M then has the form ⎡ ⎤ A1 B1 D D ⎢ . C C d .. ⎥ CHD ⊕j=1 H A B ⎢ . . ⎥ =⎢ . UFM = → ⎥: C D U Y ⎣Ad Bd ⎦ C
D
188
J.A. Ball, G. Groenewald and T. Malakorn
and the system equations (2.4) have ⎧ x(1w) ⎪ ⎪ ⎪ ⎨ .. . ΣF M : ⎪ ⎪ x(dw) ⎪ ⎩ y(w) or in more compact form x(jw) ΣF M : y(w) ···
A1 x(w) + B1 u(w) .. .
= =
Ad x(w) + Bd u(w) Cx(w) + Du(w).
(2.18)
(2.19)
3 0 where IH occurs in the jth column.
···
IH
=
= IΣF M ,j Ax(w) + IΣF M ,j Bu(w) = Cx(w) + Du(w)
where we set
2 IΣF M ,j = 0
the form
The transfer function TΣF M (z) then has the form TΣF M (z) = D + C(I − ZΣF M (z)A)−1 ZΣF M (z)B = D + C(I − z1 A1 − · · · − zd Ad )−1 (z1 B1 + · · · + zd Bd ) =D+
d
CAv Bj z v zj
(2.20)
v∈F Fd j=1
where the structure matrix ZΣF M (z) is given by ZΣF M (z) =
d
2 IΣF M ,j zj = z1 IH
···
3 zd IH .
j=1
We consider these as noncommutative Fornasini-Marchesini systems (with evolution along the free semigroup generated by {1, . . . , d}) to be a noncommutative analogue of the commutative multidimensional systems (with evolution along an integer lattice rather than a tree or free semigroup) introduced and studied by Fornasini and Marchesini (see, e.g., [34]). Example 2.2. Noncommutative Givone-Roesser systems. We let GGR be the graph with source-vertex set S GR , range vertex set RGR and set of edges E GR all equal to the finite set {1, . . . , d} with edge j having source vertex j and range vertex j: sF M (j) = j and rF M (j) = j for j = 1, . . . , d. Suppose now that Σ = (GGR , H, U GR ) is a SNMLS with structure graph GGR . As GGR has d path-connected components (P GR = {p1 , . . . , pd }), the collection of Hilbert spaces H can be labeled as H = {Hj : j = 1, . . . , d}. The connection matrix U GR then has the form ⎡ ⎤ A11 · · · A1d B1 D ⎢ . D C C d .. .. ⎥ C⊕d H D A B ⎢ . i=1 i → ⊕j=1 Hj . . ⎥ U GR = =⎢ . ⎥: C D U Y ⎣Ad1 · · · Add Bd ⎦ C1
···
Cd
D
Conservative Noncommutative Systems and the system ⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ GR Σ : ⎪ ⎪ ⎪ ⎪ ⎪ ⎩
189
equations (2.4) have the form x1 (1w) = A11 x1 (w) + · · · + A1d xd (w) + B1 u(w) .. .. . . xd (dw) = Ad1 x1 (w) + · · · + Add xd (w) + Bd u(w) xi (iw) = 0 if i = i, y(w) = C1 x1 (w) + · · · + Cd xd (w) + Du(w).
or in more compact form x(jw) GR Σ : y(w)
= IΣGR ,j Ax(w) + IΣGR ,j Bu(w) = Cx(w) + Du(w) ⎡ 0 ⎢ ⎢ ⎢ =⎢ ⎢ ⎢ ⎣
where we set
(2.21)
(2.22)
⎤
⎥ ⎥ ⎥ ⎥ IHj IΣGR ,j ⎥ ⎥ .. ⎦ . 0 where the nonzero entry occurs in the jth diagonal slot. The transfer function TΣGR (z) then has the form ..
.
(2.23) TΣGR (z) = D + C(I − ZΣGR (z)A)−1 ZΣGR (z)B ⎛⎡ ⎤ ⎡ ⎤⎞−1 ⎡ ⎤ IH1 z1 B1 z1 A11 · · · z1 A1d 2 3 ⎜⎢ ⎥ ⎢ .. .. ⎥⎟ ⎢ .. ⎥ .. = D + C1 · · · Cd ⎝⎣ ⎦−⎣ . . . ⎦⎠ ⎣ . ⎦ IHd zd Bd zd Ad1 · · · zd Add ∞ CiN AiN ,iN −1 AiN −1 ,iN −2 · · · Ai2 ,i1 Bi1 ziN ziN −1 · · · zi2 zi1 =D+ N =1 i1 ,...,iN ∈{1,...,d}
(2.24) where the structure matrix ZΣGR (z) is given by ⎡ z1 IH1 d ⎢ IΣGR ,j zj = ⎣ ZΣGR (z) =
⎤ ..
⎥ ⎦.
.
j=1
zd IHd
We consider these noncommutative Givone-Roesser systems to be a noncommutative analogue of the commutative multidimensional systems introduced and studied by Givone and Roesser (see, e.g., [35, 36]). Example 2.3. Noncommutative full-structured multidimensional systems. We take Gfull to be the complete bipartite graph on source-vertex set S full = {1, . . . , n} and range-vertex set Rfull = {1, . . . , m}. Thus we may label the edge set E full as E full = {(i, j) : i = 1, . . . , n; j = 1, . . . , m} with sfull(i, j) = i,
rfull (i, j) = j.
190
J.A. Ball, G. Groenewald and T. Malakorn
We let Fn,m denote the free semigroup generated by the set E full = {1, . . . , n} × {1, . . . , m}. Thus elements of Fn,m are words w of the form (iN , jN )(iN −1 , jN −1 ) · · · (i1 , j1 ) where ik ∈ {1, . . . , n} for all k = 1, . . . , N and jk ∈ {1, . . . , m} for all k = 1, . . . , N . Suppose that Σfull = (Gfull , H, U full) is a SNMLS with structure graph equal to Gfull . As Gfull has only one connected component in this case, the collection of Hilbert spaces H = {Hp : p ∈ P full } collapses to a single Hilbert space denoted as H. The connection matrix U full then has the form ⎡ ⎤ A11 · · · A1n B1 D ⎢ . C C m D .. .. ⎥ C⊕n HD ⊕j=1 H A B ⎢ .. ⎥ full i=1 . . =⎢ = → U ⎥: C D U Y ⎣Am1 · · · Amn Bm ⎦ C1 · · · Cn D and the associated system equations have the form ⎧ ⎪x1 ((1, j) · w) = Aj1 x1 (w) + · · · + Ajn xn (w) + Bj u(w) for j = 1, . . . , m, ⎪ ⎪ ⎪ .. ⎪ ⎪ . ⎨ Σfull :
⎪xn ((n, j) · w) ⎪ ⎪ ⎪ ⎪xi ((i, j) · w) ⎪ ⎩ y(w)
= Aj1 x1 (w) + · · · + Ajn xn (w) + Bj u(w) for j = 1, . . . , m, = 0 if i = i, = C1 x1 (w) + · · · + Cn xn (w) + Du(w). (2.25) Note that, as is consistent with (2.11), xi ((i, j) · w) is independent of i for each fixed j ∈ {1, . . . , m} and w ∈ Fn,m . The transfer function TΣfull then has the form TΣfull (z) = D + C(I − ZΣfull (z)A)−1 ZΣfull (z)B (2.26) 2 3 = D + C1 · · · Cn m ⎛⎡ ⎤ ⎡ m ⎤⎞−1 ⎡ m ⎤ ··· IH j=1 z1j Aj1 j=1 z1j Ajn j=1 z1j Bj ⎜⎢ ⎥ ⎢ ⎥⎟ ⎢ ⎥ .. .. .. .. · ⎝⎣ ⎦−⎣ ⎦⎠ ⎣ ⎦ . m . m . m . IH ··· j=1 znj Aj1 j=1 znj Ajn j=1 znj Bj =D+
∞
CiN AjN ,iN −1 AjN −1 ,iN −2 · · · Aj2 ,i1 Bi1
N =1 i1 ,...,iN ∈{1,...,n} j1 ,...,jN ∈{1,...,m}
· ziN ,jN ziN −1 ,jN −1 · · · zi2 ,j2 zi1 ,j1 and where ZΣfull (z) is given by
(2.27)
z1,1 IH ⎢ .. ZΣfull (z) = ⎣ .
···
⎤ z1,m IH .. ⎥ . . ⎦
zn,1 IH
···
zn,m IH
⎡
Conservative Noncommutative Systems
191
3. Adjoint systems It turns out that the adjoint system for a SNMLS Σ has a somewhat different form. Let us say that the collection Σ∗ = (G, H∗ , U∗ ) is a SNMLS of adjoint form if 1. G is an admissible finite graph, 2. H∗ = {H∗p : p ∈ P } is a collection of Hilbert spaces (the auxiliary state spaces for Σ∗ ) indexed by the set P of path-connected components of G, and 3. the connection matrix U∗ for Σ∗ has the form C D C D C D C D A∗ B∗ [A∗s,r ] [B∗s ] ⊕r∈R H∗[r] ⊕H∗[s] U∗ = = : → (3.1) [C∗r ] C∗ D∗ D∗ U∗ Y∗ where U∗ (the input space for Σ∗ ) and Y∗ (the output space for Σ∗ ) are Hilbert spaces. The system equations associated with an SNMLS of adjoint form Σ∗ involve also a choice of source-vertex cross-section p → sp as in (2.10) and are given by x∗s (w) = A x (e w) + B∗s u∗ (w) r∈R ∗s,r s[r] s[r] ,r (3.2) Σ∗ : y∗ (w) = r∈R C∗r x∗s[r] (es[r] ,r (w) + D∗ u∗ (w). The state vector x∗ (w) = cols∈S x∗s (w) takes values in the state space ⊕s∈S H∗[s] with components x∗s (w) in the auxiliary state space H∗[s] for each s ∈ S and is required to satisfy the compatibility condition x∗s (es,r w) = x∗s (es ,r w) for all s, s ∈ S with [s] = [s ] and for all r ∈ R and w ∈ FE .
(3.3) (3.4)
The adjoint input signal u∗ (w) takes values in U∗ and the adjoint output signal y∗ (w) takes values in Y∗ . Given a positive integer N , suppose that we are given an input signal {u∗ (w)}w : |w|≤N on the finite horizon {w ∈ Fd : |w| ≤ N } along with a finalization of the state {x∗ (w)}w : |w|=N +1 . We can then apply the recursions in (3.2) to compute x∗ (w) and y∗ (w) for all w ∈ Fd with |w| ≤ N . The compatibility condition (3.4) implies that the resulting solution x∗ (w) and y∗ (w) is independent of the choice of source-vertex cross-section p → sp . In general we say that a triple of functions w → (u∗ (w), x∗ (w), y∗ (w)) is a trajectory of the system of adjoint form Σ∗ if x∗ satisfies the compatibility condition (3.4) and (u∗ , x∗ , y∗ ) satisfy the adjoint system equations (3.2) for some (and hence for any) choice of source-vertex cross-section p → sp . Given a SNMLS Σ = (G, H, U ), we define the adjoint system Σ∗ of Σ to be the SNMLS of adjoint form given by Σ∗ = (G, H, U ∗ ).
(3.5)
From the definition (3.2) we see that the system equations associated with Σ∗ therefore have the form x∗s (w) = A∗ x∗s[r] (es[r] ,r w) + Cs∗ u∗ (w) ∗ r∈R r,s Σ : (3.6) ∗ ∗ y∗ (w) = r∈R Br x∗s[r] (es[r] ,r w) + D u∗ (w).
192
J.A. Ball, G. Groenewald and T. Malakorn
where the adjoint state vector x∗ (w) = cols∈S x∗s (w) taking values in ⊕s∈S H[s] , adjoint input signal u∗ (w) taking values in Y and adjoint output signal y∗ (w) taking values in U. The defining condition of the adjoint system is given by the following Proposition. In the following statement, by a local trajectory of the system Σ at the word w we mean a function w → (u(w ), x(w ) = ⊕s∈S xs (w ), y(w )) defined at least for w = w and w = ew for each e ∈ E which satisfies the system equations (2.4) at position w. Similarly, by a local trajectory of Σ∗ at w we mean a function w → (u∗ (w ), x∗ (w ) = ⊕s∈S x∗s (w ), y∗ (w )) defined at least for w = w and w = ew for each e ∈ E which satisfies the compatibility condition (3.4) and the adjoint system equations (3.6) at w. With these notions we avoid the issue of whether a local trajectory (of Σ or Σ∗ ) necessarily extends to a global trajectory. Proposition 3.1. Suppose that we are given a SNMLS Σ = (G, H, U ) with adjoint system Σ∗ = (G, H, U ∗ ). 1. The adjoint pairing relation xs[r] (es[r] ,r w), x∗s[r] (es[r] ,r w)H[r] + y(w), u∗ (w)Y r∈R
=
xs (w), x∗s (w)H[s] + u(w), y∗ (w)U
(3.7)
s∈S
holds for any trajectory (u, x, y) of Σ and any trajectory (u∗ , x∗ , y∗ ) of Σ∗ . 2. Conversely, if a given function w → (u(w), x(w), y(w)) ∈ U × (⊕s∈S H[s] ) × Y satisfies the adjoint pairing relation (3.7) with respect to every local trajectory (u∗ (w), x∗ (w), y∗ (w)) of Σ∗ at each w ∈ FE , then (u, x, y) is a trajectory of Σ. 3. Conversely, if a given function w → (u∗ (w), x∗ (w), y∗ (w)) ∈ Y × (⊕s∈S H[s] ) × U satisfies the adjoint pairing relation (3.7) with respect to every local trajectory (u(w), x(w), y(w)) of Σ at w for each w ∈ FE , then (u∗ , x∗ , y∗ ) is a trajectory of Σ∗ . Proof. Note that the system equations (2.9) for Σ can be written in vector form as D C D C colr∈R xs[r] (es[r] ,r w) cols∈S xs (w) . (3.8) =U u(w) y(w) Similarly, in vector form, the adjoint system equations (3.6) are C D D C cols∈S x∗s (w) ∗ colr∈R x∗s[r] (es[r] ,r w) =U y∗ (w) u∗ (w)
(3.9)
Conservative Noncommutative Systems and the adjoint pairing relation is %C D& D C colr∈R xs[r] (es[r] ,r w) colr∈R x∗s[r] (es[r] ,r w) , y(w) u∗ (w) (⊕r∈R H[r] )⊕Y %C D C D& cols∈S xs (w) cols∈S x∗s (w) = . , u(w) y∗ (w) (⊕ H )⊕U s∈S
193
(3.10)
[s]
If (u, x, y) is a trajectory of Σ and (u∗ , x∗ , y∗ ) is a trajectory of Σ∗ , then substitution of (3.8) and (3.9) into (3.10) shows that (3.10) holds for (u, x, y) and (u∗ , x∗ , y∗ ) by definition of the adjoint U ∗ of U . More precisely, if (u, x, y) is a trajectory such that (3.7) holds for any local trajectory (u∗ , x∗ , y∗ ) of Σ∗ at w, then we see that D C D& %C colr∈R x∗s[r] (es[r] ,r w) colr∈R xs[r] (es[r] ,r w) , y(w) u∗ (w) (⊕r∈R H[r] )⊕Y D& D %C C x (e w) col cols∈S xs (w) r∈R ∗s[r] s[r] ,r ∗ ,U = . u(w) u∗ (w) (⊕s∈S H[s] )⊕U 4 col 5 r∈R xs[r] (es[r] ,r w) As can be taken to be an arbitrary element of (⊕r∈R H[r] ) ⊕ U y(w) and the source-vertex cross-section p → sp is also arbitrary, it follows that (u, x, y) satisfies (3.8) at w. As the choice of w ∈ FE is arbitrary, we conclude that (u, x, y) is a trajectory of Σ. A similar argument shows that (u∗ , x∗ , y∗ ) is a trajectory of Σ∗ if (u∗ , x∗ , y∗ ) satisfies (3.7) against every local trajectory (u, x, y) of Σ at each w ∈ FE , and Proposition 3.1 now follows.
4. Dissipative and conservative structured multidimensional linear systems In case U is contractive (U ≤ 1), we say that Σ is a dissipative SNMLS. In this case the trajectories of Σ have the following energy dissipation property: xs[r] (es[r] ,r w)2 − x(w)2 ≤ u(w)2 − y(w)2 (4.1) r∈R
for every choice of source-vertex cross-section p → sp . We say that the SNMLS is isometric in case the connection matrix U is isometric. In this case the dissipation inequality (4.1) is replaced with the energy balance relation: xs[r] (es[r] ,r w)2 − x(w)2 = u(w)2 − y(w)2 (4.2) r∈R
for every choice of source-vertex cross-section p → sp . An interesting special case is the case where there is a unique source-vertex cross-section. This happens exactly when each path-connected component p of the admissible graph G contains exactly one source vertex sp ; this occurs, e.g., for the case of noncommutative Fornasini-Marchesini systems (see Example 2.1) and for
194
J.A. Ball, G. Groenewald and T. Malakorn
noncommutative Givone-Roesser systems (see Example 2.2). In this case, each edge e has the form es[r] ,r and hence can be indexed more simply by r ∈ R: es[r] ,r → er . Then the property that xs (ew) = 0 if s = s(e) translates to xs (er w) = 0 if s = s[r] . With the use of this fact we see that, when we sum (4.1) over all words w of length at most some N , the left side of the inequality telescopes and we arrive at 2 3 x(w)2 − x(∅)2 ≤ u(w)2 − y(w)2 . (4.3) w : |w|=N +1
w : |w|≤N
This can be rearranged as y(w)2 ≤
w : |w|≤N
w : |w|≤N
≤
y(w)2 +
x(w)2
w : |w|=N +1
u(w) + x(∅)2 , 2
(4.4)
w : |w|≤N
and hence, letting N → ∞ gives y(w)2 ≤ u(w)2 + x(∅)2 . w∈F FE
(4.5)
w∈F FE
In particular, if we impose zero initial condition x(∅) = 0 and take formal Ztransform, from the fact that y(z) = TΣ (z) · u (z) (see (2.14)) we arrive at u(z)2L2 (F u(z)2L2 (F (z) ∈ L2 (F FE , U), T TΣ (z) FE ,Y) ≤ FE ,U ) for all u
(4.6)
FE , U) into L2 (F FE , Y) i.e., multiplication by TΣ is a contraction operator from L2 (F in case there is a unique source-vertex cross-section p → sp for G. We shall have further discussion of this point in Remark 5.14 below. Given a SNMLS Σ = (G, H, U ), we say that Σ is a conservative SNMLS if the connection matrix D C D C D C D C [Ar,s ] [Br ] ⊕s∈S H[s] ⊕r∈R H[r] A B = U= : → C D [Cs ] D U Y is unitary. In particular U is isometric, so system trajectories satisfy the energy balance relation (4.2). Just as in the classical case, for a system-theoretic interpretation of the meaning of the adjoint U ∗ of U also being isometric, we need to introduce the adjoint system Σ∗ . Recall the definition of the adjoint Σ∗ of a SNMLS Σ = (G, H, U ) given by (3.5). Theorem 4.1. Suppose that Σ = (G, H, U ) is a SNMLS. Then Σ is conservative (i.e., U is unitary) if and only if either one of the following conditions holds: 1. The function (u, x, y) : FE → U × ⊕s∈S H[s] × Y is a local trajectory of Σ at w if and only if the function (y, x, u) : FE → Y × ⊕s∈S H[s] × U is a local trajectory of Σ∗ at w.
Conservative Noncommutative Systems
195
2. A local trajectory (u, x, y) of Σ at w satisfies the energy balance relation xs[r] (es[r] ,r w)2 − x(w)2 = u(w)2 − y(w)2 (4.7) r∈R
and any local trajectory (u∗ , x∗ , y∗ ) of Σ∗ at w satisfies the adjoint energy balance relation x∗s[r] (es[r] ,r w)2 + u∗ (w)2 = x∗s (w)2 + y∗ (w)2 . (4.8) r∈R
s∈S
for all source-vertex cross-sections p → sp . In particular, if Σ = (G, H, U ) is a conservative SNMLS, then 1. (u, x, y) is a trajectory of Σ if and only if (y, x, u) is a trajectory of Σ∗ , 2. any trajectory (u, x, y) of Σ satisfies (4.7), and 3. any trajectory (u∗ , x∗ , y∗ ) of Σ∗ satisfies (4.8). Proof. From the block forms (3.8) and (3.9) of the system equations for Σ and Σ∗ , we see that the equivalence between (u, x, y) being a local trajectory for Σ and (y, x, u) being a local trajectory for Σ∗ is in turn equivalent to U ∗ = U −1 , i.e., to U being unitary. Again from the system equations (3.8) and (3.9), we see that (4.7) holding for all local trajectories just means that U is isometric while (4.8) holding for all local trajectories of Σ∗ just means that U ∗ is isometric. This essentially completes the proof of Theorem 4.1. A useful property of dissipative (and hence in particular of conservative) SNMLSs is the possibility of interpreting the transfer function as a function acting on tuples of noncommuting contraction operators as we now explain. In general, suppose that G is an admissible graph and that we are given a formal power v series T (z) = v∈F FE Tv z in noncommuting variables z = (ze : e ∈ E) indexed by the edge set E of the graph G, with coefficients Tv equal to bounded operators acting between Hilbert spaces U and Y. Suppose that we are also given a collection δ = (δe : e ∈ E) of bounded, linear operators (not necessarily commuting) on some separable infinite-dimensional Hilbert space K also indexed by the edge set E of G. We define an operator T (δ) : U ⊗ K → Y ⊗ K by T (δ) := lim Tv ⊗ δ v N →∞
v∈F FE : |v|≤N
∅
where δ = IK and δ v = δeN · · · δe1 if v = eN · · · e1 .
(4.9)
whenever the limit (say, in the norm or the strong operator topology) exists. In general there is no reason for the limit in (4.9) to exist. However, for the case that T (z) = TΣ (z) is the transfer function of a conservative SNMLS Σ = (G, H, U ), T (δ) always makes sense for a natural class of operator-tuples δ = (δe : e ∈ E). To state the result we first need to agree on some notation. Suppose that Σ is a conservative SNMLS with system structure matrix ZΣ (z) = I e∈E Σ,e ze as in (2.15). For δ = (δe : e ∈ E) a finite collection of (not necessarily
196
J.A. Ball, G. Groenewald and T. Malakorn
commuting) bounded linear operators on some Hilbert space K indexed by the edge set E, define an operator ZΣ (δ) : ⊕r∈R H[r] ⊗ K → ⊕s∈S H[s] ⊗ K by ZΣ (δ) = e∈E IΣ,e · δe where IΣ,e · δe is given in terms of its matrix entries
IH[s] ⊗ δe = IH[r] ⊗ δe if s = s(e) and r = r(e), [IIΣ,e · δe ]s,r = (4.10) 0 otherwise. Note that the definition of IΣ,e and of ZΣ (z) uses only the first two pieces G and H of the SNMLS Σ = (G, H, U ). In case Hp is taken to be the complex numbers C for each path-connected component p ∈ P , we denote the associated coefficient matrices IΣ,e and the structure matrix ZΣ (z) simply as IG,e and ZG (z). Thus IG,e : ⊕r∈R C → ⊕s∈S C with matrix entries
1 if s = s(e) and r = r(e), [IIG,e ]s,r = 0 otherwise and ZG (z) = e∈E IG,e ze . We then define a class BG L(K) of tuples δ = (δe : e ∈ E) of bounded, linear operators on the Hilbert space K (the G-unit ball of L(K)nE ) (where nE denotes the number of edges in the graph G) by BG L(K) = {δ = (δe : e ∈ E) : δe ∈ L(K) for e ∈ E and ZG (δ) < 1}.
(4.11)
It is easy to see that ZG (δ) = ZΣ (δ) whenever Σ = (G, H, U ) is a SNMLS with structure graph G; thus ZΣ (δ) < 1 for all δ ∈ BG L(K). Theorem 4.2. Suppose that T (z) = D + C(I − ZΣ (z)A)−1 ZΣ (z)B is the transfer function of a dissipative SNMLS Σ = (G, H, U ) and that K is some other separable Hilbert space. Then for any collection δ = (δe : e ∈ E) of operators in BG L(K), T (δ) as defined in (4.9) is a well-defined contraction operator (with the limit of the partial sums in (4.9) existing in the operator-norm topology) from U ⊗ K to Y ⊗ K (T (δ) ≤ 1), and can alternatively be expressed as T (δ) = (D ⊗ IK ) + (C ⊗ IK ) (I − ZΣ (δ)(A ⊗ IK ))−1 ZΣ (δ)(B ⊗ IK ). Proof. A general fact is that, if C A U = C
(4.12)
D C D C D B H H : → D U Y
is contractive and ∆ : H → H is a strict contraction, then the upper feedback connection Fu [U , ∆] : U → Y defined implicitly by Fu [U , ∆] : u → y if there exist h ∈ H and h ∈ H C D C D C D A B h h so that = and ∆h = h C D u y
Conservative Noncommutative Systems
197
is well defined, moreover is contractive (F Fu [U , ∆] ≤ 1), and is given explicitly by the linear-fractional formula Fu [U , ∆] = D + C (I − ∆A )−1 ∆B .
(4.13)
This fact can be found in any of a number of places where linear-fractional transformations are discussed, e.g., in [57] where there is a comprehensive treatment for the control-theory context, or in Section 3 of [7] where there is a concise summary of what we are using here. Now suppose that Σ = (G, H, U ) is a dissipative SNMLS and suppose that δ = (δe : e ∈ E) is an operator-tuple in BG L(K). We shall use a different font δ s,s for the Kronecker delta function
1 if s = s , δ s,s = (4.14) 0 otherwise for which we shall have use on occasion in the sequel. We apply the linear-fractional construction (4.13) to the case C
D C D C D A ⊗ IK B ⊗ IK (⊕s∈S H[s] ) ⊗ K (⊕r∈R H[r] ) ⊗ K U = : → , C ⊗ IK D ⊗ IK U ⊗K Y ⊗K ∆ = ZΣ (δ) : ⊕r∈R H[r] ⊗ K → ⊕s∈S H[s] ⊗ K.
Note that U is then contractive since by assumption U is contractive and that ∆ < 1 since δ ∈ BG L(K). Hence it follows that Fu [U , ZΣ (δ)] = (D ⊗ IK ) + (C ⊗ IK ) (I − ZΣ (δ)(A ⊗ IK ))
−1
ZΣ (δ)(B ⊗ IK )
is a well-defined contraction operator from U ⊗ K into Y ⊗ K. It remains to show that Fu [U , ZΣ (δ)] = TΣ (δ). Verification of this identity draws upon repeated use of the product rule for tensor products (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD) as we now show. Since ZΣ (δ)(A ⊗ IK ) ≤ ZΣ (δ)A < 1, it follows that the inverse of I − ZΣ (δ)(A ⊗ IK ) is given by the Neumann expansion (I − ZΣ (δ)(A ⊗ IK ))
−1
=
∞
N
[ZΣ (δ)(A ⊗ IK )] .
N =0
From this we see that the (s, s ) matrix entry of (I − ZΣ (δ)(A ⊗ IK ))−1 : ⊕s∈S H[s] → ⊕s∈S H[s]
198
J.A. Ball, G. Groenewald and T. Malakorn
is given by [(I − ZΣ (δ)(A ⊗ IK ))−1 ]s,s = δ s,s IH[s] +
∞
N =1 eN ,...,e1 ∈E : s(eN )=s
(Ar(eN ),s(eN −1 ) ⊗ δeN ) · · · (Ar(e2 ),s(e1 ) ⊗ δe2 )(Ar(e1 ),s ⊗ δe1 ) = δ s,s IH[s] +
∞
(Ar(eN ),s(eN −1 ) · · · Ar(e2 ),s(e1 ) Ar(e1 ),s )⊗
N =1 eN ,...,e1 ∈E : s(eN )=s
⊗ (δeN · · · δe2 δe1 )
(4.15)
Note next that C ⊗ IK : ⊕s∈S H[s] ⊗ K → Y ⊗ K has row matrix representation (4.16) C ⊗ IK = rows∈S [Cs ⊗ IK ] while ZΣ (δ)(B ⊗ IK ) : U ⊗ K → ⊕s ∈S H[s ] ⊗ K has column matrix representation ⎡
ZΣ (δ)(B ⊗ IK ) = cols ∈S ⎣
⎤ Br(e) ⊗ δe ⎦ .
(4.17)
e : s(e)=s
Using (4.15), (4.16) and (4.17), we then compute −1
(C ⊗ IK ) (I − ZΣ (δ)(A ⊗ IK )) ZΣ (δ)(B ⊗ IK ) 3 2 = (Cs ⊗ IK ) (I − ZΣ (δ)(A ⊗ IK ))−1 s,s (Br(e) ⊗ δe ) s,s ∈S e∈E : s(e)=s
= X1 + X2 . where we have set (Cs ⊗ IK )(Br(e) ⊗ δe ) X1 =
(4.18)
s∈S e : s(e)=s
X2 =
∞
(Cs ⊗ IK )·
s,s ∈S e : s(e)=s N =1 eN ,...,e1 ∈E : s(eN )=s
· (Ar(eN ),s(eN −1 ) · · · Ar(e1 ),s ⊗ δeN · · · δe2 δe1 )(Br(e) ⊗ δe ). The first term X1 simplifies to X1 =
(4.19)
(Cs Br(e) ) ⊗ δe
s∈S e∈E : s(e)=s
=
e∈E
T e ⊗ δe
(4.20)
Conservative Noncommutative Systems
199
while the second term X2 can be simplified to X2 =
∞
(Cs Ar(eN ),s(eN −1 ) · · ·
s,s ∈S e : s(e)=s N =1 eN ,...,e1 ∈E : s(eN )=s
=
· · · Ar(e2 ),s(e1 ) Ar(e1 ),s Br(e) ) ⊗ (δeN · · · δe2 δe1 δe ) Tv ⊗ δ v .
(4.21)
v∈E : |v|≥2
Combining (4.20) and (4.21) along with the identity T∅ = D immediately gives us the identity (4.12) as wanted. This completes the proof of Theorem 4.2.
5. Conservative SNMLS-realization of formal power series in the class SAG (U, Y) Let G be a fixed admissible graph with source-vertex set S, range-vertex set R and 4.2 suggests that we consider the class of all formal power edge set E. Theorem v series T (z) = v∈F T FE v z having the property in the conclusion of Theorem 4.2. We view this class as a noncommutative analogue of the Schur-Agler class studied in a series of papers (see, e.g., [1, 17, 15, 6, 7, 12]). Definition 5.1. We say that T (z) is in the noncommutative Schur-Agler class Hilbert space K and SAG (U, Y) (for a given admissible graph G) if, for each v each δ = (δe : e ∈ E) ∈ BG L(K), the limit T (δ) = limN →∞ v∈F Fd : |v|≤N Tv ⊗ δ exists (in the operator-norm topology) and defines an operator T (δ) : U ⊗ K → Y ⊗ K which is contractive T (δ) ≤ 1.
(5.1)
Remark 5.2. Alpay and Kalyuzhny˘-Verbovetzki˘ ˘ ˘ı in [5] have shown that a given formal power series T (z) = Tv z v ∈ L(U, Y)z v∈F FE
belongs to the noncommutative Schur-Agler class SAG (U, Y) if and only if (5.1) holds for each δ = (δe : e ∈ E) ∈ BG L(CN ) for each finite N = 1, 2, 3, . . . . The proof there is done explicitly only for the case where each component of G consists of a single source vertex and a single range vertex (the Givone-Roesser case); we expect that this result continues to hold for the case of a general admissible graph G. Our next goal is a converse to Theorem 4.2 (see Theorem 5.3 below). For the statement we shall need some additional notation and terminology. We let z = (ze : e ∈ E) be a second system of noncommuting indeterminates; while ze ze = ze ze and ze ze = ze ze unless e = e , we will use the convention that
200
J.A. Ball, G. Groenewald and T. Malakorn
ze ze = ze ze for all e, e ∈ E. We also shall need the convention (2.13) to give meaning to expressions of the form
For H(z) =
−1
−1
ze−1 z v z v ze−1 = (z v ze−1 ) · (ze−1 z v ) = z ve z e
v
.
Hv z v , we will use the convention that ∗ ∗ v H(z) = Hv z := Hv∗ z v = Hv∗ z v . v∈F FE
v∈F FE
v∈F FE
v∈F FE
v v In general let us say that a formal power series K(z, z ) = v,v ∈F FE [K]v,v z z with coefficients [K]v,v equal to operators on a Hilbert space X (so K(z, z ) ∈ L(X )z, z ) is positive-definite provided that [K]v,v yv , yv X ≥ 0 (5.2) v,v ∈F FE
for all choices of yv ∈ X with yv = 0 for all but finitely many v ∈ FE . By the standard results concerning reproducing kernel Hilbert spaces ([9]), it is known that condition (5.2) is equivalent to the existence of an auxiliary Hilbert space H and operators Hv ∈ L(H , X ) for each v ∈ FE so that [K]v,v = Hv Hv∗ . Equivalently we therefore have: K(z, z ) ∈ L(X )z, z is positive-definite if and only if there exists an auxiliary Hilbert space H and a formal power series H(z) ∈ L(H , X )z so that K(z, z ) = H(z)H(z )∗ . We shall be particularly interested in this concept for the case where X = ⊕s∈S Y. We therefore consider a formal power series K(z, z ) of the form K(z, z ) = [Ks,s (z, z )]s,s ∈S ∈ L(⊕s∈S Y)z, z . Such a K(z, z ) therefore is positive-definite if [Ks,s ]v,v ys ,v , ys,v Y ≥ 0
(5.3)
s,s ∈S v,v ∈F FE
for all choices of ys,v ∈ Y for s ∈ S and v ∈ FE with ys,v = 0 for all but finitely many such s, v, or equivalently, if and only if there exist an auxiliary Hilbert space H and formal power series Hs (z) ∈ L(H , Y) for each s ∈ S so that Ks,s (z, z ) = Hs (z)Hs (z )∗ . v Theorem 5.3. Let T (z) = v∈F FE Tv z be a formal power series in noncommuting indeterminates z = (ze : e ∈ E) indexed by the edge set E of the admissible graph G with coefficients Tv ∈ L(U, Y) for two Hilbert spaces U and Y. Then the following conditions are equivalent: 1. T (z) is in the noncommutative Schur-Agler class SAG (U, Y). 2. There exists a positive-definite formal power series K(z, z ) = [Ks,s (z, z )]s,s ∈S ∈ L(⊕s∈S Y)z, z
Conservative Noncommutative Systems
201
so that IY − T (z)T (z )∗ = Ks,s (z, z ) − s∈S
r∈R
s,s ∈S :
[s]=[s ]=[r]
ze s ,r Ks,s (z, z )zes,r .
(5.4)
3. There exists a collection of Hilbert spaces H = {Hp : p ∈ P } (where P is the set of path-connected components of the admissible graph G) and a formal v power series H(z) = v∈F FE Hv z with coefficients Hv ∈ L(⊕s∈S H[s] , Y) so that we have the noncommutative Agler decomposition I − T (z)T (z )∗ = H(z) (I − ZG,H (z)ZG,H (z )∗ ) H(z )∗ (5.5) where we have set ZG,H (z) = e∈E IG,H ;e ze with coefficients IG,H ;e equal to operators acting from ⊕r∈R H[r] to ⊕s∈S H[s] determined from matrix entries [IIG,H ;e ]s,r given by
IH[s] = IH[r] if s = s(e) and r = r(e), (5.6) [IIG,H ;e ]s,r = 0 otherwise. 4. There is a conservative SNMLS Σ = (G, H, U ) with structure graph equal to the given admissible graph G so that T (z) = TΣ (z), i.e., so that T (z) = D + C(I − ZΣ (z)A)−1 ZΣ (z)B 5 4 5 4 ⊕r∈R H[r] ⊕s∈S H[s] B]: → is unitary and where ZΣ (z) = D Y U
A where U = [ C e∈E IΣ,e ze with IΣ,e as in (2.8).
Remark 5.4. We give the name Agler decomposition to an identity of the form (5.4) or (5.5) since representations of this type to our knowledge originate in the work of Agler (see [1]) in the context of the commutative polydisk. Remark 5.5. We note that the paper [5] of Alpay and Kalyuzhny˘-Verbovetzki˘ ˘ gives a uniqueness result for conservative realizations of rational inner formal power series in the Givone-Roesser case. We leave a systematic development of the uniqueness theory for realizations as in part (4) of Theorem 5.3 to another occasion. Proof. The proof breaks up into several implications which need to be shown: (2) ⇐⇒ (3): Suppose that K(z, z ) = [Ks,s (z, z )]s,s ∈S ∈ L(⊕s∈S Y)z, z is positive definite. Then, by the remarks preceding the statement of the Theorem, Ks,s (z, z ) has a factorization Ks,s (z, z ) = Hs (z)Hs (z )∗ for a formal power series Hs (z) ∈ L(H[s] , Y) for s ∈ S.
202
J.A. Ball, G. Groenewald and T. Malakorn
Then, if we set H(z) = rows∈S Hs (z) ∈ L(⊕s∈S H[s] , Y)z, then (5.4) assumes the form
I − T (z)T (z )∗ = Hs (z)Hs (z )∗ − s∈S
r∈R s,s ∈S : [s]=[s ]=[r]
Hs (z) · (1 − zes,r ze s ,r )IIH · Hs (z )∗
= H(z) I⊕s∈S H[s] − ZG,H (z)ZG,H (z )∗ H(z )∗ . from which (5.5) follows. Conversely, if H(z) = rows∈S Hs (z) ∈ L(⊕s∈S H[s] , Y)z
is as in (5.5), we may embed each Hp into a common Hilbert space H and without loss of generality assume that Hp = H for each p ∈ P . We then set Ks,s (z, z ) = Hs (z)Hs (z )∗ ∈ L(Y)z, z and K(z, z ) = [Ks,s (z, z )]s,s ∈S ∈ L(⊕s∈S Y)z, z . Then by the factored form of K(z, z ) = (cols∈S Hs (z)) (cols ∈S Hs (z ))
∗
we see that K(z, z ) is positive-definite. Reversal of the steps above then shows that K(z, z ) satisfies (5.4). In this way we see that (2) is equivalent to (3) in Theorem 5.3. (4) =⇒ = (1): Since any conservative system is also dissipative, this follows from Theorem 4.2. (1) =⇒ = (2) or (3): As a first case we assume that dim Y < ∞. Let X denote the v v linear space L(Y)z, z of all formal power series ϕ(z, z ) = v,v ∈F FE ϕv,v z z in the sets of noncommuting indeterminates z = (ze : e ∈ E) and z = (ze : e ∈ E) (but where ze ze = ze ze for each e, e ∈ E) with coefficients ϕv,v in the space of bounded linear operators L(Y) on the Hilbert space Y. We define a sequence of increasing seminorms · N on L(Y)z, z according to the rule ϕ(z, z )N =
sup v,v ∈F FE : |v|,|v |≤N
ϕv,v .
(5.7)
Then X is a locally convex topological vector space in the topology induced by these seminorms and this Let C be the set of all formal topology is vmetrizable. v power series ϕ(z, z ) = v,v ∈F ϕ z z in L(Y)z, z such that FE v,v ϕ(z, z ) = H(z)(I − ZG,H (z)ZG,H (z )∗ )H(z )∗
(5.8)
for some collection of Hilbert spaces H = {Hp : p ∈ P } indexed by the pathconnected components P of G and for some formal power series Hv z v H(z) = v∈F FE
with coefficients Hv ∈ where ZG,H (z) = e∈E IG,H ;e ze is defined as in (5.6). From the equivalence (2) ⇐⇒ (3), we see that an equivalent L(⊕s∈S H[s] , Y),
Conservative Noncommutative Systems
203
condition for membership of ϕ in C is the existence of a positive-definite formal power series K(z, z ) = [Ks,s (z, z )]s,s ∈S ∈ L(⊕s∈S Y)z, z so that ϕ(z, z ) = Ks,s (z, z ) − ze s ,r Ks,s (z, z )zes,r . (5.9) r∈R s,s ∈S : [s]=[s ]=[r]
s∈S
When working with the decomposition (5.8), we may assume without loss of generality that Hp is a fixed separable infinite-dimensional Hilbert space independent of the choice of ϕ ∈ C and of the particular representation (5.8) for a given ϕ ∈ C. It is easily checked that C is closed under sums and multiplication by nonnegative scalars, i.e., that C is a cone in X . We need to establish a few preliminary facts concerning C. Lemma 5.6. Any positive-definite formal power series ϕ(z, z ) in L(Y)z, z is also in C. Proof. As ϕ is a positive kernel, we know that we can factor ϕ as ϕ(z, z ) = H(z)H(z )∗ for some H(z) ∈ L(K, Y)z for some auxiliary Hilbert space K. We must produce , Y)z so that a formal power series H (z) ∈ L(⊕s∈S H[s] H(z)H(z )∗ = H (z)[II⊕s∈S H[s] − ZG,H (z)ZG,H (z )∗ ]H (z )∗ . Let s0 ∈ S be any fixed choice of particular source vertex. Without loss of generality we may assume H is presented in the form H = 2 (F FE0 , K)
We take H (z) ∈
where
L(⊕s∈S H[s] , Y)z
E0 = {e ∈ E : s(e) = s0 }.
to be of the form
, K)z H (z) = H(z)K (z) where K (z) = rows∈S Ks (z) with Ks (z) ∈ L(H[s]
where Ks (z) is given by
Ks (z)
=
v rowv∈F FE0 z IK 0
if s = s0 , otherwise.
Then we check
4 5 H (z) I⊕s∈S H[s] − ZG,H (z)ZG,H (z )∗ H (z )∗ A B = H(z)Ks 0 (z) 1− ze ze IH Ks 0 (z )∗ H(z )∗ ⎡⎛ = H(z) ⎣⎝
e∈E0
z v z v IK −
v∈F F E0
⎞
⎤
z v z v ⎠ IK ⎦ H(z )∗
v∈F FE0 \{∅}
∗
= H(z)H(z )
as wanted, and Lemma 5.6 follows.
204
J.A. Ball, G. Groenewald and T. Malakorn
We shall need to approximate the cone C by the cone Cε (where ε > 0) defined as the set of all ϕ ∈ L(Y)z, z having a representation ϕ(z, z ) = H(z) I − (1 + ε)2 ZG,H (z)ZG,H (z )∗ H(z )∗ + γe (z)(1 − ε2 ze ze )γe (z )∗ (5.10) e∈E , Y)z and some γe (z) ∈ L(H , Y)z for e ∈ E. for some H(z) ∈ L(⊕s∈S H[s] Equivalently, just as in the proof of (2) ⇐⇒ (3) (Step 1 above), we see that, in terms of positive-definite formal power series, Cε can be defined as the set of all ϕ ∈ L(Y)z, z having a representation ϕ(z, z ) = Ks,s (z, z ) − (1 + ε)2 ze s ,r Ks,s (z, z )zes,r s∈S
+
Γe (z, z ) − ε
2
e∈E
r∈R s,s ∈S : [s]=[s ]=[r] ze Γe (z, z )ze e∈E
(5.11)
for some positive-definite formal power series K(z, z ) = [Ks,s (z, z )]s,s ∈S ∈ L(⊕s∈S Y)z, z and some positive-definite formal power series Γe (z, z ) in L(Y)z, z for each e ∈ E. Lemma 5.7. Assume that ϕ ∈ L(Y)z, z is in the cone Cε for all ε > 0 sufficiently small. Then ϕ ∈ C, i.e., ϕ has a representation (5.8) or equivalently (5.9). Proof. The assumption is that, for all ε > 0 sufficiently small, there is a positiveKε,s,s (z, z )]s,s ∈S in L(⊕s∈S Y)z, z definite formal power series Kε (z, z ) = [K and a positive-definite formal power series Γε,e (z, z ) in L(Y)z, z so that (5.11) holds (with K (z, z ) in place of K(z, z ) and Γε,e in place of Γe ) for each e ∈ E. In particular for the (∅, ∅)-coefficient we get [K Kε,s,s ]∅,∅ + [Γε,e ]∅,∅ . ϕ∅,∅ = s∈S
e∈E
Hence [K Kε,s,s ]∅,∅ and [Γε,e ]∅,∅ are all uniformly bounded as tends to 0. By using Kε,s,s ]∅,∅ is uniformly bounded the positive-definiteness of Kε (z, z ), we see that [K as ε tends to zero for all s, s ∈ S as well. More generally, computation of the (v, v ) coefficient of ϕ from (5.11) yields −1 ϕv,v = [K Kε,s,s ]v,v − (1 + ε)2 [K Kε,s,s ]ve−1 s,r ,e v s∈S
+
e∈E
r∈R s,s ∈S : [s]=[s ]=[r]
[Γε,e ]v,v − ε2
[Γε,e ]ve−1 ,e−1 v .
s ,r
(5.12)
e∈E
Here we are using (2.3) to define words of the form ve−1 or e−1 v with the convention that the coefficient is taken to be equal to zero if any of its indices is an undefined word. Inductively assume that there is a uniform bound on [K Kε,s,s ]w,w for all words w, w ∈ FE having length at most N . From (5.12) we can then see that [K Kε,s,s ]v,v is uniformly bounded for all v ∈ FE with |v| = N + 1. Using the
Conservative Noncommutative Systems
205
positive-definiteness of K(z, z ), we then see that this leads to a uniform bound for [K Kε,s,s ]v,v for all v, v ∈ FE of length at most N + 1 as ε tends to zero. A similar inductive argument gives that [Γε,e ]v,v is uniformly bounded as ε tends to zero for all words v, v with length |v|, |v | at most some N < ∞. Since we are assuming that Y is finite-dimensional, it follows that bounded subsets of L(Y) are precompact in the operator-norm topology. By this fact combined with a Cantor diagonalization procedure, there exists a sequence of numbers εn > 0 tending to zero such that the limits lim [K Kεn ,s,s ]v,v = [Ks,s ]v,v ,
n→∞
lim [Γεn ,e ]v,v = [Γe ]v,v
n→∞
all exist in the operator-norm topology of L(Y). We then take limits in (5.12) to deduce that −1 ϕv,v = [Ks,s ]v,v − [Ks,s ]ve−1 [Γe ]v,v (5.13) + s,r ,e v and hence ϕ(z, z ) =
s ,r
r∈R s,s ∈S : [s]=[s ]=[r]
s∈S
Ks,s (z, z ) −
r∈R s,s ∈S : [s]=[s ]=[r]
s∈S
with Ks,s (z, z ) and Γe (z, z ) given by Ks,s (z, z ) = [Ks,s ]v,v z v z v ,
e∈E
ze s ,r Ks,s (z, z )zes,r +
Γe (z, z )
e∈E
(5.14) Γe (z, z ) =
v,v ∈F FE
[Γe ]v,v z v z v .
v,v ∈F FE
Kεn ,s,s (z, z )]s,s ∈S and Γεn ,e (z, z ) are positive-definite for each As Kεn (z, z ) = [K fixed n, we know that [K Kεn ,s,s ]v,v ys ,v , ys,v Y ≥ 0, [Γεn ,e ]v,v gv , gv Y ≥ 0 s,s ∈S v,v ∈F FE
v,v ∈F FE
(5.15) for all finitely supported Y-valued functions (s, v) → ys,v and s → gv . We may then take the limits as n → ∞ in (5.15) to get [Ks,s ]v,v ys ,v , ys,v Y ≥ 0, [Γe ]v,v gv , gv Y ≥ 0 (5.16) s,s ∈S v,v ∈F FE
v,v ∈F FE
from which we see that K(z, z ) = [Ks,s (z, z )]s,s ∈S and Γe (z, z ) for e ∈ E are positive-definite formal power series as well. By Lemma 5.6, for each e ∈ E the formal power series Γe (z, z ) is therefore in the cone C. As the difference in the first two terms on the right-hand side of (5.14) is clearly in C by the characterization (5.9) for C and C is closed under addition, it follows that ϕ ∈ C as asserted. Lemma 5.7 now follows. Lemma 5.8. If ϕ(z, z ) ∈ L(Y)z, z is a positive-definite formal power series and if ε > 0, then: 1. ϕ ∈ Cε , and 2. for each e ∈ E, the kernel ϕ(z,
z ) := ϕ(z, z ) − ε2 ze ϕ(z, z )ze is also in Cε .
206
J.A. Ball, G. Groenewald and T. Malakorn
Proof. As ϕ(z, z ) is positive-definite, we have a factorization ϕ(z, z ) = H(z)H(z )∗ for some H(z) ∈ L(K, Y)z. To show that ϕ(z, z ) ∈ Cε it suffices to produce a representation (5.10) for ϕ with γε,e (z) = 0 for each e ∈ E. As in the proof of Lemma 5.6, to produce a representation of this latter form it suffices to produce , Y)z with a formal power series H (z) ∈ L(⊕s∈S H[s] ϕ(z, z ) = H(z)H(z )∗ = H (z)(1 − (1 + ε)2 ZG,H (z)ZG,H (z )∗ )H (z )∗ . (5.17) FE0 , K) where E0 = For this purpose we assume that H is presented as H = 2 (F {e ∈ E : s(e) = s0 } where s0 is some fixed source vertex s0 ∈ S. We then take H (z) to be of the form H (z) = H(z)K (z) where , K)z K (z) = rows∈S Ks (z) ∈ L(⊕s∈S H[s]
is given by
Ks (z)
=
|v| v rowv∈F FE0 (1 + ε) z IK 0
if s = s0 , otherwise.
Then a direct computation as in the proof of Lemma 5.6 gives H (z)(II⊕s∈S H[s] − (1 + ε)2 ZG,H (z)ZG,H (z )∗ )H (z )∗ A B 2 = H(z)Ks0 (z) 1 − (1 + ε) ze ze IH Ks 0 (z )∗ H(z )∗ ⎡⎛ = H(z) ⎣⎝
e∈E0
(1 + ε)2|v| z v z v IK −
v∈F F E0 ∗
⎞
⎤
(1 + ε)2|v| z v z v ⎠ IK ⎦ H(z )∗
v∈F FE0 \{∅}
= H(z)H(z ) = ϕ(z, z ), as wanted, and part (1) of Lemma 5.8 follows. For the second assertion, use the characterization (5.11) for membership in Cε with Ks,s (z, z ) = 0 for all s, s ∈ S and with Γe (z, z ) = 0 for e = e and Γe (z, z ) = ϕ(z, z ). Lemma 5.9. For each ε > 0, the cone Cε is closed as a subspace of X = L(Y)z, z with the locally convex topology induced by the sequence of seminorms · N given by (5.7) for N = 1, 2, . . . . Proof. Suppose that {ϕn }n=1,2,... is a sequence of elements of Cε converging to ϕ ∈ X in the locally convex topology of X . By the characterization (5.11) we have the existence of positive-definite formal power series Kn;s,s (z, z )]s,s ∈S ∈ L(⊕s∈S Y)z, z , Γn,e (z, z ) ∈ L(Y)z, z Kn (z, z ) = [K
Conservative Noncommutative Systems so that the representation ϕn (z, z ) = Kn;s,s (z, z ) − (1 + ε)2 s∈S
+
r∈R s,s ∈S : [s]=[s ]=[r]
Γn,e (z, z ) − ε2
e∈E
207
ze s ,r Kn;s,s (z, z )zes,r
ze Γn,e (z, z )ze
(5.18)
e∈E
holds for each n = 1, 2, . . . . In terms of coefficients we then have −1 [ϕn ]v,v = [K Kn;s,s ]v,v − (1 + ε)2 [K Kn;s,s ]ve−1 s,r ,e s∈S
+
r∈R s,s ∈S : [s]=[s ]=[r]
[Γn,e ]v,v − ε2
e∈E
s ,r
[Γn,e ]ve−1 ,e−1 v .
v
(5.19)
e∈E
By assumption [ϕn ]v,v converges in the operator-norm of L(Y) to [ϕ]v,v as n → ∞. An inductive argument on the length of words combined with the positivedefiniteness of Kn (z, z ) and Γn,e (z, z ) as in the proof of Lemma 5.7 can now be used to show that [K Kn;s,s ]v,v and [Γn,e ]v,v remain uniformly bounded as n → ∞ for each v, v ∈ FE and e ∈ E. Since we are assuming that dim Y < ∞, a compactness argument together with a Cantor diagonalization argument (as in the proof of Lemma 5.7) can be used to show that there exists a subsequence n1 < n2 < n3 < . . . such that the limits lim [K Knk ;s,s ]v,v = [Ks,s ]v,v ,
k→∞
lim [Γnk ,e ]v,v = [Γe ]v,v
k→∞
all exist in L(Y)-norm for each s, s ∈ S, e ∈ E, and v, v ∈ FE . We may then take limits in (5.19) to conclude that −1 [Ks,s ]v,v − (1 + ε)2 [Ks,s ]ve−1 [ϕ]v,v = s,r ,e v s∈S
+
r∈R s,s ∈S : [s]=[s ]=[r]
[Γe ]v,v − ε2
e∈E
[Γe ]ve−1 ,e−1 v .
K(z, z ) = [Ks,s (z, z )]s,s ∈S with Ks,s (z, z ) = Γe (z, z ) =
(5.20)
e∈E
If we then set
s ,r
[Ks,s (z, z )]v,v z v z v ,
v,v ∈F FE v v
[Γe ]v,v z z
,
v,v ∈F FE
we conclude that ϕ(z, z ) = Ks,s (z, z ) − (1 + ε)2 s∈S
+
e∈E
r∈R s,s ∈S : [s]=[s ]=[r]
Γe (z, z ) − ε2
ze Γe (z, z )ze .
ze s ,r Ks,s (z, z )zes,r (5.21)
e∈E
Furthermore, as Ks,s (z, z ) is the coefficientwise limit of Kn,s,s (z, z ) where each [K Kn;s,s ]s,s ∈S is positive definite and Γe (z, z ) is the coefficientwise limit of
208
J.A. Ball, G. Groenewald and T. Malakorn
Γn,e (z, z ) which is positive definite, it follows as in the proof of Lemma 5.7 that K(z, z ) and Γe (z, z ) for each e ∈ E are positive definite. The identity (5.21) then shows that ϕ(z, z ) satisfies the criterion (5.11) for membership in Cε as wanted, and Lemma 5.9 follows. We are now ready to commence the proof of (1) =⇒ = (2) in Theorem 5.3 for the case where dim Y < ∞. Suppose that we are given a formal power series T (z) which is in the Schur-Agler class SAG (U, Y). The issue is to show that ϕT (z, z ) := IY − T (z)T (z )∗ is in C. By Lemma 5.7, it suffices to show that ϕT is in Cε for all ε > 0 small enough. Recall the notation X for the topological linear space L(Y)z, z with the locally convex topology of norm-convergence of power-series coefficients. By the Hahn-Banach separation principle (apply the contrapositive version of part (b) of Theorem 3.4 in [54] with X = X , A = {ϕT }, and B = Cε ), it suffices to show: for fixed ε > 0 and for any continuous linear functional L : X → C such that L(ϕ) ≥ 0 for all ϕ ∈ Cε , it follows that L(ϕT ) ≥ 0. Here denotes “real part”. Fix ε > 0 and let L be any continuous linear functional on X with L|Cε ≥ 0. Define L1 : X → C by 1 L(ϕ) + L(ϕ) ˘ (5.22) L1 (ϕ) = 2 where we have set ϕ(z, ˘ w) = ϕ(w, z)∗ . Note that L1 (ϕ) = L(ϕ) in case ϕ ˘ = ϕ. We define a sesquilinear form ·, ·L on the space H0 := L(Y, C)z according to the formula f, gL = L1 (g(z)∗ f (z )).
(5.23)
∗
Note that any formal power series ϕ of the form ϕ(z, z ) = f (z)f (z ) has the property that ϕ˘ = ϕ; by part (1) of Lemma 5.8, any such ϕ is in Cε . We conclude that f, f L = L(f (z)∗ f (z )) ≥ 0 for all f ∈ H0 . We may thus identify elements of zero norm and then take a completion in the L-norm to get a Hilbert space HL . We next seek to define operators δe for each e ∈ E on HL so that δe∗ is given by δe∗ : f (z) → ze f (z) for f ∈ H0 . (5.24) ∗ 2 By part (2) of Lemma 5.8 we know that the kernel f (z) (1 − ε ze ze )f (z ) belongs to Cε , and hence f 2HL − ε2 δe∗ f 2HL = L f (z)∗ (1 − ε2 ze ze )f (z ) ≥ 0 for f ∈ H0 . Hence δe extends to a bounded operator on all of HL with δe = δe∗ ≤ 1/ε for each e ∈ E. It is then easy to see that the operator ZG,HL (δ)∗ : (⊕s∈S HL ) → (⊕r∈R HL ) is given by multiplication by ZG,HL (z )∗ on the left: ZG,HL (δ)∗ : f (z ) → ZG,HL (z )∗ f (z ) for f ∈ ⊕s∈S H0 .
Conservative Noncommutative Systems
209
Note that an element f ∈ ⊕s∈S H0 can be viewed as an element of the space L(Y, ⊕s∈S C)z . The (⊕s∈S HL )-norm of an element f = ⊕s∈S fs ∈ ⊕s∈S H0 can be computed as follows: f 2⊕s∈S HL = ffs 2HL = L (ffs (z)∗ fs (z )) = L (f (z)∗ f (z )) . s∈S
s∈S
Similarly, ZG,HL (δ)∗ f 2⊕r∈R HL = L (f (z)∗ ZG,HL (z)ZG,HL (z )∗ f (z )) . We may then compute f 2⊕s∈S HL − (1 + )2 ZG,HL (δ)∗ f 2⊕r∈R HL = L f (z)∗ (II⊕s∈S C − (1 + ε)2 ZG,HL (z)ZG,HL (z )∗ )f (z ) .
(5.25)
Clearly, ϕ(z, z ) given by ϕ(z, z ) := f (z)∗ (II⊕s∈S C − (1 + ε)2 ZG,HL (z)ZG,HL (z )∗ )f (z ) is in the cone Cε : simply take γe (z) = 0 for all e ∈ E in the defining representation (5.10) for elements of Cε . From (5.25) and the assumption that L is nonnegative on Cε , we therefore deduce that ZG,HL (δ) = ZG,HL (δ)∗ ≤
1 < 1. 1+ε
From our assumption that T (z) ∈ SAG (U, Y), we deduce that T (δ) ≤ 1. If we are in the scalar-valued case U = Y = C, then we see from the form (5.24) for the action of δe∗ and from the continuity of L that T (δ)∗ is given by T (δ)∗ : f (z ) → T (z )∗ f (z ) with T (δ)∗ f 2 = L (f (z)∗ T (z)T (z )∗ f (z )) . As T (δ)∗ ≤ 1 we therefore have 0 ≤ 12HL − T (δ)∗ (1)2HL = L (IIY − T (z)T (z )∗ ) = L(ϕT (z, z )) as wanted. The general case is a little more intricate. For Φ ∈ L(U, Y) and v ∈ FE , the on an element f (z ) ⊗ y of HL ⊗ Y. We tensor product operator δ ∗v ⊗ Φ∗ acts v assume that the formal power series f = v∈F FE fv z consists only of its constant term (so fv = 0 for v = ∅ and f (z ) = where ∈ L(Y, C) is a linear functional on Y). We compute the (HL ⊗ U)-inner product of (δ ∗v ⊗ Φ∗ )( ⊗ y) against another
210
J.A. Ball, G. Groenewald and T. Malakorn
such object (δ ∗v
⊗ Φ∗ )( ⊗ y ) as follows:
(δ ∗v
⊗ Φ∗ )( ⊗ y ),
(δ ∗v ⊗ Φ∗ )( ⊗ y)HL ⊗U
= z v ⊗ Φ∗ y , z v ⊗ Φ∗ yHL ⊗U
= z v , z v HL · Φ∗ y , Φ∗ yU = L1 z v z v ∗ · ΦΦ∗ y , yY · = L1 ∗ y ∗ (Φz v )(Φ∗ z v )y .
(5.26)
Here we have viewed the vector y ∈ Y as the operator y : α → αy from C to Y with adjoint operator y ∗ : Y → C given by y ∗ : y → y , yY ∈ C. In this way, the inner product ΦΦ∗ y , yY , when viewed as an operator on C, can be written as the operator composition ΦΦ∗ y , yY = y ∗ ΦΦ∗ y : C → C. By linearity we can generalize (5.26) to G (δ)∗ ( ⊗ y ), G(δ)∗ ( ⊗ y)HL ⊗U = L1 (∗ y ∗ G(z)G (z )∗ y )
(5.27)
for any polynomials G(z ), G (z ) in the noncommuting indeterminates z with coefficients in L(U, Y) (G, G ∈ L(U, Y)z ). More generally, by the assumed continuity of L on X , (5.27) continues to hold if G and G are formal power series in L(U, Y)z for which G(δ) and G (δ) are defined. We now apply (5.27) to the case where G = G = T ∈ SAG (U, Y) and where ∗ ( ) = y = yj and ∗ = y = yi , where y1 , y2 , . . . , yM is an orthonormal basis for Y, to get T (δ)∗ (yj∗ ⊗ yj ), T (δ)∗ (yi∗ ⊗ yi )HL ⊗U = L1 yi yi∗ T (z)T (z )∗ yj yj∗ . Summing over i, j = 1, . . . , M then gives < ⎛ ⎞<2 < < M < < ∗
= L (T (z)T (z )∗ ) .
(5.28)
Moreover, we compute yj∗ ⊗ yj , yi∗ ⊗ yi HL ⊗Y = yj∗ , yi∗ HL · yj , yi Y = δi,j L1 yi yj∗ . Summing this over i, j = 1, . . . , M then gives < <2 <M < M < ∗ < < < y ⊗ y = L1 yj yj∗ = L (IIY ) . j< j < < j=1 < j=1 HL ⊗Y
(5.29)
Conservative Noncommutative Systems
211
Using that T (δ ≤ 1 and combining (5.28) and (5.29) then gives <2 < < ⎛ ⎞<2 < < < <M M < < < < ∗ ∗⎝ ∗ < < < ⎠ yj ⊗ yj < −
= L(IIY − T (z)T (z )∗ ) = L(ϕT (z, z ))
HL ⊗U
(5.30)
as wanted. This completes the proof of (1) =⇒ = (2) or (3) for the case that dim Y < ∞. We now consider the case of a general separable Hilbert output space Y. Let y1 , y2 , . . . , yM , . . . be an orthonormal basis for Y and let PM : Y → Y be the orthogonal projection onto the closed span YM of {y1 , . . . , yM }. Suppose that the formal power series T (z) ∈ L(U, Y)z is in the noncommutative Schur-Agler class SAG (U, Y). Then clearly PM T (z) ∈ L(U, YM )z is in the noncommutative = (2) or (3) Schur-Agler class SAG (U, YM ). Hence, by the special case of (1) =⇒ already proved, ϕT,M = PM (IIY − T (z)T (z )∗ )P PM has a representation of the form ϕT,M (z, z ) = KM;s,s (z, z ) − s∈S
r∈R
s,s ∈S :
[s]=[s ]=[r]
ze s ,r KM;s,s (z, z )zes,r (5.31)
for a positive-definite formal power series KM (z, z ) = [KM;s,s (z, z )]s,s ∈S ∈ L(⊕s∈S YM )z, z . In terms of power-series coefficients, we therefore have −1 [ϕT,M ]v,v = [KM;s,s ]v,v − [KM;s,s ]ve−1 s,r ,e s∈S
r∈R s,s ∈S : [s]=[s ]=[r]
s ,r
v .
(5.32)
By construction [ϕT,M ]v,v = PM [ϕT ]v,v PM
(5.33)
and hence [ϕT,M ]v,v ≤ [ϕT ]v,v for all v, v ∈ FE and M = 1, 2, . . . .
(5.34)
The uniform estimate (5.34) combined with an inductive argument on the length of words (as in the proof of Lemma 5.7) implies that [KM;s,s ]v,v is uniformly bounded in the operator norm of L(Y) as M → ∞ for each v, v ∈ FE . Furthermore, L(Y) carries a weak-∗ topology as the dual space of the trace-class operators L1 (Y) under the duality pairing induced by the trace (see [28, Theorem 19.2 page 94]). By Alaoglu’s Theorem (see [55, Theorem 10.3 page 174]), norm-bounded sets in L(Y) are precompact in the weak-∗ topology. Moreover (see [32, Theorem 1 page 426]), since the predual L1 (Y) of L(Y) is separable, it follows that the weak-∗ topology on bounded subsets of L(Y) is metrizable. These observations combined
212
J.A. Ball, G. Groenewald and T. Malakorn
with another Cantor diagonalization procedure allow us to conclude that there exists a subsequence Mnk → ∞ so that weak-∗ limk→∞ KMk ;v,v = Kv,v
(5.35)
exists for each v, v ∈ FE . Furthermore, a consequence of (5.33) is that weak-∗ limM→∞ [ϕT,M ]v,v = [ϕT ]v,v .
(5.36)
Using (5.35) and (5.36) to take weak-∗ limits in (5.32), we get −1 [ϕT ]v,v = [Ks,s ]v,v − [Ks,s ]ve−1 s,r ,e
s ,r
r∈R s,s ∈S : [s]=[s ]=[r]
s∈S
or, in terms of formal power series, Ks,s (z, z ) − ϕT (z, z ) = s∈S
r∈R
s,s ∈S :
[s]=[s ]=[r]
v
ze s ,r Ks,s (z, z )zes,r ,
(5.37)
v v where we set Ks,s (z, z ) = v,v ∈F FE [Ks,s ]v,v z z . Furthermore, using the characterization (5.3) it is easy to see from (5.35) that K(z, z ) is a positive-definite formal power series since each KMk (z, z ) is positive-definite. We have thus verified that ϕT has a representation (5.4) as wanted. This completes the proof of (1) = =⇒ (2) or (3). (3) =⇒ = (4): Suppose that T (z) ∈ L(U, Y)z is such that I − T (z)T (z )∗ = H(z)(I − ZG,H (z)ZG,H (z )∗ )H(z )∗ for some H(z) ∈
, Y)z. L(⊕s∈S H[s]
(5.38)
Thus H(z) has a row matrix representation
H(z) = rows∈S Hs (z) where Hs (z) ∈
L(H[s] , Y)z.
Write the coefficient of Hs (z) for the word v ∈ FE
as [H Hs ]v . Given two words v, v ∈ FE , equating coefficients of z v z v gives δ v,∅ δ v ,∅ IY − Tv Tv∗ =
[Hs ]v ([Hs ]v )∗ −
s∈S
r∈R
⎛
⎝
s : [s]=[r]
=
⎠·⎝ [Hs ]ve−1 s,r
s : [s]=[r]
Rewrite this identity in the form ⎛ ⎞ ⎛ ⎝ ⎠·⎝ [Hs ]ve−1 s,r r∈R
⎞ ⎛
in (5.38)
⎞ ([Hs ]v e−1 )∗ ⎠ .
s : [s ]=[r]
s ,r
⎞ ([Hs ]v e−1 )∗ ⎠ + δ v,∅ δ v ,∅ IY
s : [s ]=[r]
[Hs ]v ([Hs ]v )∗ + Tv Tv∗ .
s ,r
(5.39)
s∈S
As a consequence of (5.39) we see that the map V : DV → RV defined by C D C D colr∈R s : [s]=[r] ([Hs ]ve−1 )∗ cols∈S ([H Hs ]v )∗ s,r V : y → y (5.40) Tv∗ δ v,∅ IY
Conservative Noncommutative Systems
213
extends by linearity and limits to define a unitary transformation from C D colr∈R s : [s]=[r] ([H Hs ]ve−1 )∗ s,r DV := closed span y : v ∈ FE , y ∈ Y δ v,∅ IY onto
C RV := closed span
D cols∈S ([Hs ]v )∗ y : v ∈ FE , Tv∗
y∈Y .
Extend V to a unitary transformation V of the form D C D D C C ∗ C∗ colr∈R H[r] cols∈S H[s] A : → V = B ∗ D∗ Y U where, for p ∈ P , Hp ⊃ Hp . Set H equal to the collection {Hp : p ∈ P }. Putting the pieces together, we have that Σ = (G, H, U = V ∗ ) is a conservative SNMLS. We next verify that T (z) is the transfer function T (z) = TΣ (z) of the conservative SNMLS Σ constructed as above. Indeed, since V extends V we see from (5.40) that C ∗ D C D C D colr∈R s : [s]=[r] ([Hs ]ve−1 )∗ C∗ Hs ]v )∗ A cols∈S ([H : y → y (5.41) B ∗ D∗ Tv∗ δv,∅ IY
for all v ∈ FE and y ∈ Y. If we multiply both sides by z v and sum over all v ∈ FE and cancel off the common factor y we get the formal power series identity D C DC D C D C ∗ ZΣ (z )∗ 0 H(z )∗ H(z )∗ A C∗ : → . (5.42) B ∗ D∗ IY T (z )∗ 0 IY In particular, the top block component of (5.42) gives A∗ ZΣ (z )∗ H(z )∗ + C ∗ = H(z )∗ from which we get H(z )∗ = (I − A∗ ZΣ (z )∗ )−1 C ∗ . Substituting this into the equality of the bottom block components of (5.42) then gives T (z )∗ = B ∗ ZΣ (z )∗ (I − A∗ ZΣ (z )∗ )−1 C ∗ + D∗ and hence T (z) = D + C(I − ZΣ (z)A)−1 ZΣ (z)B and we have verified that T (z) = TΣ (z) as wanted. This completes the proof of (3) = =⇒ (4) in Theorem 5.3. As we have now verified (2) ⇐⇒ (3), (4) = =⇒ (1), (1) = =⇒ (2) or (3) and (3) = (4), the proof of all of Theorem 5.3 is now complete. =⇒
214
J.A. Ball, G. Groenewald and T. Malakorn
Remark 5.10. It is possible to give an elementary direct proof of (4) =⇒ = (3) in Theorem 5.3. Assume that the formal power series T (z) is realized as T (z) = TΣ (z) for a conservative SNMLS Σ = (G, H, U ). In particular D C C D C D ⊕s∈S H[s] A B ⊕r∈R H[r] : U= → C D U Y is unitary, and hence we have the relations BB ∗ = I − AA∗ ,
BD∗ = −AC ∗ ,
DB ∗ = −CA∗ ,
DD∗ = I − CC ∗ . (5.43)
Then we compute
I − T (z)T (z )∗ = I − D + C(I − ZΣ (z)A)−1 ZΣ (z)B · · D∗ + B ∗ ZΣ (z )∗ (I − A∗ ZΣ (z )∗ )−1 C ∗ = I − DD∗ − C(I − ZΣ (z)A)−1 ZΣ (z)BD∗ − DB ∗ ZΣ (z )∗ (I − A∗ ZΣ (z )∗ )−1 C ∗ − C(I − ZΣ (z)A)−1 ZΣ (z)BB ∗ ZΣ (z )∗ (I − A∗ ZΣ (z )∗ )−1 C ∗ = CC ∗ + C(I − ZΣ (z)A)−1 ZΣ (z)AC ∗ + CA∗ ZΣ (z )∗ (I − A∗ ZΣ (z )∗ )−1 C ∗ − C(I − ZΣ (z)A)−1 ZΣ (z)(I − AA∗ )ZΣ (z )∗ (I − A∗ ZΣ (z )∗ )−1 C ∗ (where we used (5.43)) = C(I − ZΣ (z)A)−1 [I − ZΣ (z)ZΣ (z )∗ ] (I − A∗ ZΣ (z )∗ )−1 C ∗ (by algebra)
and (3) follows with H(z) = C(I − ZΣ (z)A)−1 ∈ L(⊕s∈S H[s] , Y)z. Note that this computation uses only that U is coisometric; we conclude that if T (z) has a realization as the transfer function of a SNMLS Σ = (G, H, U ) with coisometric connection matrix U , then T also has such a realization with unitary connection matrix U . By a completely parallel computation, one can verify that, whenever U is isometric, I − T (z)∗ T (z ) = G(z)∗ (I − ZG,H (z)∗ ZG,H (z )) G(z )
(5.44)
for some formal power series G(z ) = colr∈R Gr (z ) ∈ L(U, ⊕r∈R H[r] )z
and some collection of Hilbert spaces H = {Hp : p ∈ P }. Under the correspondence
z ) = (colr∈R Gr (z)∗ ) · (rowr∈R Gr (z )) K(z, it is easy to see that (5.44) can be equivalently expressed as
r,r (z, z )−
r,r (z, z )zes,r (5.45) K I −T (z)∗T (z ) = ze s,r K r∈R
s∈S r,r ∈R : [r]=[r ]=[s]
Conservative Noncommutative Systems
215
for some positive-definite formal power series
z ) = [K
r,r (z, z )]r,r ∈R ∈ L(⊕r∈R H )z, z . K(z, [r] Hence if U is unitary then we have both (5.5) and (5.44) for some H(z) ∈ , Y)z and some G(z ) ∈ L(U, ⊕r∈R H[r] )z , or equivalently, both L(⊕s∈S H[s] (5.4) and (5.45) for some positive-definite formal power series
z ) ∈ L(⊕r∈R H )z, z . K(z, z ) ∈ L(⊕s∈S H )z, z and K(z, [s]
[r]
Moreover, since Theorem 4.2 is valid for dissipative SNMLSs, we see that TΣ (z) satisfies (1) in Theorem 5.3 if Σ = (G, H, U ) with U merely contractive. We conclude that a given power series T (z) can be realized as the transfer function of a dissipative SNMLS Σ (i.e., Σ = (G, H, U ) with U contractive) if and only if it has a (possibly different) realization as the transfer function of a conservative SNMLS (with U unitary). Moreover, any of these characterizations is equivalent to the 2 × 2-block kernel decomposition D C D C T (z) − T (z ) H(z) 0 IY − T (z)T (z )∗ (5.46) = T (z)∗ − T (z )∗ IU − T (z)∗T (z ) 0 G(z)∗ B C A D I⊕s∈S H[s] − ZG,H (z)ZG,H (z )∗ ZG,H (z) − ZG,H (z ) H(z )∗ 0 · · 0 G(z ) ZG,H (z)∗ − ZG,H (z )∗ I⊕r∈R H[r] − ZG,H (z)∗ ZG,H (z ) for some H(z) ∈ L(⊕s∈S H[s] , Y)z, G(z ) ∈ L(U, ⊕r∈R H[r] )z and a common collection of Hilbert spaces H = {Hp : p ∈ P }. Equivalently, under the correspondence C D KSS (z, z ) KSR (z, z ) K(z, z ) = KRS (z, z ) KRR (z, z ) C D 3 cols∈S Hs (z) 2 rows∈S Hs (z )∗ rowr∈R Gr (z ) , = ∗ colr∈R Gr (z)
(5.46) is equivalent to C IY − T (z)T (z )∗ T (z)∗ − T (z )∗
D C D MSS (z, z ) MSR (z, z ) T (z) − T (z ) = IU − T (z)∗ T (z ) MRS (z, z ) MSS (z, z )
where we have set KSS;s,s (z, z ) − MSS (z, z ) = s∈S
MSR (z, z ) =
r∈R s∈S : [s]=[r]
MRS (z, z ) =
r∈R
MRR (z, z ) =
r∈R
s,s ∈S :
[s]=[s ]=[r]
ze s ,r KSS;s,s (z, z )zes,r
KSR;s,r (z, z )zes,r − ze s,r KSR;s,r (z, z ) KRS;r,s (z, z )zes,r − ze s,r KRS;r,s (z, z )
s∈S r∈R : [r]=[s]
(5.47)
KRR;r,r (z, z ) −
s∈S r,r ∈R : [r]=[r ]=[s]
ze s,r KRR;r,r (z, z )zes,r (5.48)
216
J.A. Ball, G. Groenewald and T. Malakorn
for some positive-definite formal power series C D KSS (z, z ) KSR (z, z ) K(z, z ) = KRS (z, z ) KRR (z, z ) C D 4 ⊕ H 5 [KSS;s,s (z, z )]s,s ∈S [KSR;s,r (z, z )]s∈S;r∈R s∈S z, z . = ∈ L ⊕r∈R H[s] [KRS;r,s (z, z )]r∈R;s∈S [KRR;r,r (z, z )]r,r ∈R [r] Complete details for the commutative case appear in [7, 12]. Remark 5.11. Fornasini-Marchesini conservative SNMLS. In this extended remark we lay out how Theorem 5.3 specializes for the case where G = GF M is as in the setting of the noncommutative Fornasini-Marchesini systems explored in Example 2.1. In this case the Agler decompositions (5.4) or (5.5) for the formal power series T (z) ∈ L(U, Y)z assume the forms ∗
I − T (z)T (z ) = K(z, z ) −
d
zk−1 K(z, z )zk−1
k=1
= H(z) · (1 − z1 z1 − · · · − zd zd )IIH · H(z )∗ .
(5.49)
By Theorem 5.3 applied to the Fornasini-Marchesini case, we see that a given formal power series T (z) ∈ L(U, Y)z satisfies (5.49) if and only if T is in the Fornasini-Marchesini Schur-Agler class SAGF M (U, Y) given in this case by SAGF M (U, Y) ={T (z) ∈ L(U, Y)z : T (δ1 , . . . , δd ) ≤ 1 for all δ1 , . . . , δd ∈ L(K) with δ1 δ1∗ + · · · + δd δd∗ ≤ 1}.
(5.50)
There has been much work of late from a number of different points of view on a noncommutative analogue of the algebra of Toeplitz operators on the unit disk (the “Cuntz-Toeplitz algebra” – see [49, 51, 52, 29, 31, 30, 20, 21]. The Cuntz-Toeplitz algebra (expressed in our notation) algebra Mnc,d consisting of is the multiplier v the formal power series T (z) = v∈F T z with scalar coefficients Tv ∈ C such Fd v that the left multiplication operator MT : f (z) → T (z) · f (z)
(5.51)
defines a bounded operator on the Fock space L2 (F Fd ) defined by
F 2 v 2 Fd ) = f (z) = fv z ∈ Cz : |ffv | < ∞. L (F v∈F Fd
v∈F Fd
The tensor product of this space with L(U, Y) is the space Mnc,d(U, Y) consisting of formal power series Tv z v ∈ L(U, Y)z T (z) = v∈F Fd
for which the associated left multiplication operator as in (5.51) defines a bounded Fd , U) := L2 (F Fd ) ⊗ U into L2 (F Fd , Y) := L2 (F Fd ) ⊗ Y. It is operator from L2 (F
Conservative Noncommutative Systems
217
then natural to define a d-variable, noncommutative analogue of the Schur class Snc,d (U, Y) by Snc,d (U, Y) = T (z) ∈ L(U, Y)z : M MT L(L2 (F (5.52) Fd ,U ),L2 (F Fd ,Y)) ≤ 1 where MT : L2 (F Fd , U) → L2 (F Fd , Y) is as in (5.51). As pointed out in [21], the condition that T ∈ Snc,d (U, Y) can also be expressed as the positivity of a certain kernel kFd (z, z )IIY − T (z)(kFd (z, z )IIU )T (z )∗ = H(z)H(z )∗ (5.53) for some H(z) ∈ L(H , Y)z, where we have set kFd equal to the noncommutative Szeg¨ o¨ kernel z v z v . (5.54) kFd (z, z ) := v∈F Fd
It turns out that the Schur-Agler class SAGF M (U, Y) and the d-variable noncommutative Schur class Snc,d (U, Y) are identical, as explained in the following Proposition. v Proposition 5.12. Let T (z) = v∈F Fd Tv z ∈ L(U,Y)z be a formal power series with coefficients in L(U, Y). Then T (z) satisfies (5.49) for some H(z) in L(H, Y)z if and only if T (z) satisfies (5.53) for the same H(z) in L(H, Y)z. Thus the Fornasini-Marchesini Schur-Agler class SAGF M (U, Y) defined as in (5.50) is identical to the noncommutative Schur class Snc,d (U, Y) defined as in (5.52). Proof. By Theorem 5.3 we know that (5.5) characterizes the Fornasini-Marchesini Schur-Agler class SAGF M (U, Y) and by the result from [21] mentioned above we know that (5.53) characterizes the noncommutative Schur class Snc,d (U, Y); hence the second assertion in Proposition 5.12 is an immediate consequence of the first. Assume now that T (z) satisfies (5.49) for some H(z) ∈ L(H, Y)z. Multi plication of (5.49) on the left by z v and then on the right by z v followed by the summation over all v ∈ Fd leads to (5.53). Conversely, multiplication of (5.53) on the left by ze and on the right by ze followed by the sum over all e ∈ E leads to (5.49). This completes the proof of Proposition 5.12. For the Fornasini-Marchesini special case (where G = GF M ), it turns out that the content of Theorem 5.3 can be gleaned from various pieces already existing in the literature. Specifically, we have: (4) =⇒ = (1): (This amounts to Theorem 4.2 specialized to the Fornasini-Marchesini case.) As in Remark 5.10, we see that (4) =⇒ = (3) with H(z) = C(I −ZΣ (z)A)−1 , so we may assume that T (z) has a Fornasini-Marchesini Agler decomposition (5.49). By Proposition 5.12, an equivalent condition is that T (z) is in the noncommutative MT ≤ 1. Note that MT amounts to Schur class Snc,d (U, Y) (see (5.52)), i.e., M T (S) = limr↑1 T (rS) where S = (S1 , . . . , Sd ) is the d-tuple of creation operators on L2 (F Fd , C): Sk : f (z) → zk f (z) for k = 1, . . . , d. (5.55)
218
J.A. Ball, G. Groenewald and T. Malakorn
Note next that, given a d-tuple δ = (δ1 , . . . , δd ) on a Hilbert space K, then ZGF M (δ) amounts to the operator-block row matrix 2 3 ZGF M (δ) = δ1 · · · δd : ⊕dj=1 K → K, and hence the class BGF M L(K) consists of strict row contractions, i.e., δ = (δ1 , 3 2 . . . , δd ) with δ1 · · · δd < 1. It is known (see [27, 48, 53]) that any strict row contraction δ = (δ1 , . . . , δd ) dilates to a row shift of some multiplicity, i.e., one can embed K as a subspace of L2 (F Fd , E) for some auxiliary Hilbert space E in such a way that δ v = PK (S ⊗ IE )v |K for v ∈ Fd . But then we have T (δ) = PY⊗K T (S ⊗ IE )|U ⊗K . From the fact that T (S ⊗ IE ) = T (S) ≤ 1, we conclude that T (δ) ≤ 1. Alternatively, once we have established that M MT ≤ 1, we may apply von Neumann’s inequality for the noncommutative ball setting (see [50, 53]) to conclude that T (δ) ≤ 1; in fact, a natural way to prove von Neumann’s inequality is as an application of dilation theory as sketched above. Via either way, we have verified (4) =⇒ = (3) = =⇒ (1) in Theorem 5.3 for the Fornasini-Marchesini case. (1) =⇒ = (3): Suppose now that T (z) ∈ L(U, Y)z satisfies condition (1) in Theorem 5.3 (specialized to the Fornasini-Marchesini case), i.e., T (z) is in the Fornasini-Marchesini Schur-Agler class SAGF M (U, Y) given by (5.50). In particular, we have that T (rS) ≤ 1 for each r < 1 where S = (S1 , . . . , Sd ) is the Fd , C) as in (5.55). By letting r → 1 we see that d-tuple of row shifts on L2 (F T (S) ≤ 1. As observed in the previous paragraph, T (S) = MT and we see that T (z) ∈ Snc,d (U, Y). Again by Proposition 5.12, equivalently T (z) satisfies (5.49) for some H(z) ∈ L(U, Y)z. In this way we have verified (1) = =⇒ (3) in Theorem 5.3 for the Fornasini-Marchesini case. (3) =⇒ = (4): Assume now that T (z) satisfies condition (3) in Theorem 5.3 specialized to the Fornasini-Marchesini case (5.49), i.e., that T (z) satisfies (5.49) for , Y)z. By Proposition 5.12 an equivalent assumption some H(z) ∈ L(⊕s∈S H[s] is that T (z) is in the noncommutative Schur class Snc,d (U, Y) given by (5.52), i.e., M MT ≤ 1. In case I − T (z)∗ T (z ) has 0 maximal factorable minorant, the fact that T (z) has a realization T (z) = TΣ (z) for a Fornasini-Marchesini conservative SNMLS Σ = (GF M , H, U ) follows from the work of Popescu (see [49]), where a Sz.-Nagy-Foia¸¸s model theory for row contractions is developed. By later results obtained in [20], it follows that in fact any noncommutative Schur class formal power series T (z) ∈ Snc,d (U, Y) can be realized as T (z) = TΣ (z) for a FornasiniMarchesini conservative SNMLS Σ, i.e., the restriction that I − T (z)∗ T (z ) have 0 maximal factorable minorant in the Popescu result can be removed. The result from [20] used functional models for representations of the Cuntz algebra (see [19]) to extend the model theory from [49] to the case of a general completelynonunitary row contraction. In this way we have an alternate verification of (3) = (4) in Theorem 5.3 for the Fornasini-Marchesini case. =⇒
Conservative Noncommutative Systems
219
Finally we mention that the proof of (3) = =⇒ (4) presented here, but specialized to the Fornasini-Marchesini case, is presented in [21]. Remark 5.13. When specialized to the special case of Givone-Roesser conservative noncommutative systems (see Example 2.2), Theorem 5.3 can be viewed as a noncommutative analogue of the realization result of Agler [1] (see also [17]). Similarly, the specialization of Theorem 5.3 to the case of full-structured conservative noncommutative multidimensional systems (Example 2.3) can be viewed as a noncommutative extension of the realization result of [12] (see also [7, 6]) for the special case of Cartan domains of Type I. Remark 5.14. Let us say that the admissible graph GRS is a row-sum graph if each path-connected component p of GRS contains exactly one source vertex sp . As the name suggests, the associated structure matrix ZGRS (z) is the direct sum of Fornasini-Marchesini structure graphs: ⎤ ⎡ z1,1 · · · z1,d1 ⎥ ⎢ z2,1 · · · z2,d2 ⎥ ⎢ ZGRS (z) = ⎢ ⎥. . .. ⎦ ⎣ zK,1
···
zK,dK
Equivalently, row-sum graphs are exactly the admissible graphs for which there is a unique source-vertex cross-section p → sp (see (2.10)). The result Theorem 4.2 combined with the observation (4.6) gives the following: if GRS is a row-sum graph and Σ = (GRS , H, U ) is a conservative (or dissipative) system with structure graph GRS , then the associated transfer function TΣ (z) is in the noncommutative Schur class Snc,d (U, Y). On the other hand, from the discussion above we have seen that Snc,d (U, Y) coincides with the class SAGF M (U, Y) of formal power series T (z) ∈ L(U, Y)z having realization as the transfer function of a conservative SNMLS Σ = (GF M , H, U ) with Fornasini-Marchesini structure graph GF M . By Theorem 5.3, the first class is characterized by T (δ) ≤ 1 for any δ ∈ BGRS L(K) while the second class is characterized by T (δ) ≤ 1 for any δ ∈ BGF M L(K). Thus it must be the case that: given a formal power series T (z) ∈ L(U, Y)z in d noncommuting indeterminates z = (z1 , . . . , zd ) and a row-sum graph GRS with edge set E = {(1, 1) . . . , (1, d1 ), (2, 1) . . . , (2, d2 ), . . . , (K, 1), . . . , (K, dK )}, if T (δ) ≤ 1 for all δ ∈ BGRS L(K), then also T (δ) ≤ 1 for all δ ∈ BGF M L(K) (where GF M is the Fornasini-Marchesini graph with edge set E). In fact one can see this result directly from the fact that BGF M L(K) ⊂ BGRS L(K) if GRS is a row-sum graph.
(5.56)
Indeed, if δ = (δ1,1 , . . . , δ1,d1 , δ2,1 , . . . , δ2,d2 , . . . , δK,1 , . . . , δK,dK ) ∈ BGF M L(K), then the row matrix 2 3 ZGF M (δ) = δ1,1 . . . δ1,d1 δ2,1 . . . δ2,d2 . . . , δK,1 . . . δK,dK
220
J.A. Ball, G. Groenewald and T. Malakorn
2 is contractive. In particular for each k = 1, . . . , K the shorter row δk,1 is contractive from which we see that the row-sum matrix ⎡ δ1,1 · · · δ1,d1 ⎢ δ2,1 · · · δ2,d2 ⎢ ZGRS (δ) = ⎢ .. ⎣ . δK,1
···
...
δk,dk
3
⎤ ⎥ ⎥ ⎥ ⎦ δK,dK
is contractive, i.e., δ ∈ BGRS L(K). In this way the containment (5.56) follows in a simple direct way. Remark 5.15. One can view Theorem 5.3 as really concerning the formal power series K(z, z ) = I − T (z)T (z )∗ (5.57) in two sets of noncommuting indeterminates z = (z1 , . . . , zd ) and z = (z1 , . . . , zd ) rather than T (z) itself. Expressed in this way, Theorem 5.3 says that a formal v v power series K(z, z ) = v,v ∈F ∈ L(Y)z, z of the special form Fd Kv,v z z (5.57) has the representation K(z, z ) = H(z) (I − ZG,H (z)ZG,H (z )∗ ) H(z )∗ for some H(z) ∈ L(H, Y)z if and only if Kv,v ⊗ δ v (δ ∗ )v ≥ 0 K(δ, δ) =
(5.58)
(5.59)
v,v ∈F Fd
for all operator d-tuples δ = (δ1 , . . . , δd ) ∈ BG L(K). One can pose the question of obtaining results along this line without the restriction that K(z, z ) a priori has the special form (5.57). In case one takes K(z, z ) to be a general hereditary kernel, sets ZG,H (z) formally equal to zero, and replaces BG L(K) by the set N of nilpotent d-tuples δ of matrices of arbitrary finite size (δ with δ v = 0 for |v| sufficiently large), such a result appears in the recent paper of Kalyuzhny˘ı-Verbovetzki˘˘ı and Vinnikov (see [43]). For the special case where K(z, z ) is a polynomial and sets BG L(K) equal to all of L(K)d (where K is taken to be any finite-dimensional Hilbert space), the Positivstellensatz of [39] gives a similar type result. For BG L(K) set equal to other types of algebraic varieties or semivarieties, see [40] and [41].
References [1] J. Agler, On the representation of certain holomorphic functions defined on a polydisk, in Topics in Operator Theory: Ernst D. Hellinger memorial Volume (Ed. L. de Branges, I. Gohberg and J. Rovnyak), pp. 47-66, OT48 Birkhauser ¨ Verlag, Basel, 1990. [2] J. Agler and J.E. McCarthy, Nevanlinna-Pick interpolation on the bidisk, J. Reine Angew. Math. 506 (1999), 191–204. [3] J. Agler and J.E. McCarthy, Complete Nevanlinna-Pick kernels, J. Functional Analysis, 175 (2000), 111–124.
Conservative Noncommutative Systems
221
[4] D. Alpay, V. Bolotnikov and T. Kaptano˘ ˘ glu, The Schur algorithm and reproducing kernel Hilbert spaces in the ball, Linear Algebra Appl. 342 (2002), 163–186. [5] D. Alpay and D.S. Kalyuzhny˘-Verbovetzki˘ ˘ ˘ı, Matrix-J-unitary non-commutative rational formal power series, in this volume. [6] C.-G. Ambrozie and J. Eschmeier, A commutant lifting theorem on analytic polyhedra, Proceedings of Operator Theory Conference Dedicated to Prof. Wieslaw Zelazko, Banach Center publ., Warszawa, to appear. [7] C.-G. Ambrozie and D. Timotin. A von Neumann type inequality for certain domains in Cn , Proc. Amer. Math. Soc., 131 (2003), 859–869. [8] A. Arias and G. Popescu, Noncommutative interpolation and Poisson transforms, Israel J. Math. 115 (2000), 205–234. [9] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc., 68 (1950), 337–404. [10] W. Arveson, Subalgebras of C ∗ -algebras III: multivariable operator theory, Acta Math., 181 (1998), 159–228. [11] J.A. Ball, Linear systems, operator model theory and scattering: Multivariable generalizations, in Operator Theory and its Applications (Ed. A.G. Ramm, P.N. Shivakumar and A.V. Strauss), FIC25, Amer. Math. Soc., Providence, 2000. [12] J.A. Ball and V. Bolotnikov, Realization and interpolation for Schur-Agler class functions on domains with matrix polynomial defining function in Cn , J. Functional Analysis 213 (2004), 45–87. [13] J.A. Ball, G. Groenewald and T. Malakorn, Structured noncommutative multidimensional linear systems, SIAM J. Control and Optimization, to appear. [14] J.A. Ball, G. Groenewald and T. Malakorn, Bounded real lemma for structured noncommutative multidimensional linear systems and robust control, preprint (2005). [15] J.A. Ball, W.S. Li, D. Timotin and T.T. Trent, A commutant lifting theorem on the polydisc, Indiana Univ. Math. J. 48 (1999), 653–675. [16] J.A. Ball and T. Malakorn, Multidimensional linear feedback control systems and interpolation problems for multivariable holomorphic functions, Multidimensional Systems and Signal Processing 15 (2004), 7–36. [17] J.A. Ball and T. Trent, Unitary colligations, reproducing kernel Hilbert spaces and Nevanlinna–Pick interpolation in several variables, J. Functional Analysis, 157 (1998), no.1, 1–61. [18] J.A. Ball, T.T. Trent and V. Vinnikov, Interpolation and commutant lifting for multipliers on reproducing kernel Hilbert spaces, in: Operator Theory and Analysis: The M.A. Kaashoek Anniversary Volume (Workshop in Amsterdam, Nov. 1997), pages 89–138, OT 122, Birkhauser ¨ Verlag, Basel, 2001. [19] J.A. Ball and V. Vinnikov, Functional models for representations of the Cuntz algebra, in Operator Theory, System Theory and Scattering Theory: Multidimensional Generalizations (Ed. D. Alpay and V. Vinnikov), Birkh¨ ¨ auser Verlag OT volume, to appear. [20] J.A. Ball and V. Vinnikov, Lax-Phillips scattering and conservative linear systems: a Cuntz-algebra multidimensional setting, Memoir of the AMS, to appear. [21] J.A. Ball and V. Vinnikov, Formal reproducing kernel Hilbert spaces: the commutative and noncommutative settings, in Reproducing Kernel Hilbert Spaces (Ed. D. Alpay), pages 77–134, OT 143, Birkhauser ¨ Verlag, Basel, 2003.
222
J.A. Ball, G. Groenewald and T. Malakorn
[22] C.L. Beck, On formal power series representations for uncertain systems, IEEE Trans. Auto. Contr. 46 No. 2 (2001), 314–319. [23] C.L. Beck and J.C. Doyle, A necessary and sufficient minimality condition for uncertain systems, IEEE Trans. Auto. Contr. 44 No. 10 (1999), 1802–1813. [24] C.L. Beck, J.C. Doyle and K. Glover, Model reduction of multidimensional and uncertain systems, IEEE Trans. Auto. Contr. 41 No. 10 (1996), 1466–1477. [25] L. de Branges and J. Rovnyak, Canonical models in quantum scattering theory, in Perturbation Theory and its Applications in Quantum Mechanics (Ed. C.H. Wilcox), Wiley, New York, 1966, pp. 295–392. [26] M.S. Brodski˘ı, Triangular and Jordan Representations of Linear Operators, Volume Thirty-Two, Translations of Mathematical Monographs, American Mathematical Society, Providence, 1971. [27] J.W. Bunce, Models for n-tuples of noncommuting operators, J. Functional Analysis 57 (1984), 21–30. [28] J.B. Conway, A Course in Operator Theory, Graduate Studies in Mathematics Vol. 21, American Mathematical Society (Providence), 2000. [29] K.R. Davidson and D.R. Pitts, The algebraic structure of non-commutative analytic Toeplitz algebras, Math. Ann. 311 (1998), 275–303. [30] K.R. Davidson and D.R. Pitts, Nevanlinna–Pick interpolation for non-commutative analytic Toeplitz algebras, Integral Equations Operator Theory 31 (1998), no. 3, 321–337. [31] K.R. Davidson and D.R. Pitts, Invariant subspaces and hyper-reflexivity for free semigroup algebras, Proc. London Math. Soc. 78 (1999), 401–430. [32] N. Dunford and L.T. Schwartz, Linear Operators Part I: General Theory, Interscience Publishers, New York, 1958. [33] S.W. Drury, A generalization of von Neumann’s inequality to the complex ball, Proc. Amer. Math. Soc., 68 (1978), 300–304. [34] E. Fornasini and G. Marchesini, Doubly-indexed dynamical systems: state space models and structural properties, Math. System Theory 12 (1978), 59–72. [35] D.D. Givone and R.P. Roesser, Multidimensional linear iterative circuits – general properties, IEEE Trans. Comp. C-21 no. 10 (1972),1067–1073. [36] D.D. Givone and R.P. Roesser, Minimization of multidimensional linear iterative circuits, IEEE Trans. Comp. C-22 no. 7 (1973), 673–678. [37] D. Greene, S. Richter and C. Sundberg, The structure of inner multipliers on spaces with complete Nevanlinna Pick kernels, J. Functional Analysis 194 no. 2 (2002), 311–331. [38] J.W. Helton, The characteristic functions of operator theory and electrical network realization, Indiana Univ. Math. J. 22 (1972/73), 403–414. [39] J.W. Helton, “Positive” noncommutative polynomials are sums of squares, Ann. Math. 56 (2002), 675–694. [40] J.W. Helton and S.A. McCullough, A Positivstellensatz for noncommutative polynomials, Trans. Amer. Math. Soc. 356 No. 9 (2004), 3721–3737. [41] J.W. Helton, S.A. McCullough and M. Putinar, A non-commutative Positivstellensatz on isometries, J. Reine Angew. Math. 568 (2004), 71–80. [42] I. Gohberg (ed.), I. Schur Methods in Operator Theory and Signal Processing, OT18 Birkhauser ¨ Verlag, Basel-Boston, 1986.
Conservative Noncommutative Systems
223
[43] D.S. Kalyuzhny˘-Verbovetzki˘ ˘ ˘ı and V. Vinnikov, Non-commutative positive kernels and their matrix evaluations, Proc. Amer. Math. Soc., to appear. [44] W.-M. Lu, K. Zhou and J.C. Doyle, Stabilization of uncertain linear systems: an LFT approach, IEEE Trans. Auto. Contr. 41 No. 1 (1996), 50–65. [45] S. McCullough and T.T. Trent, Invariant subspaces and Nevanlinna-Pick kernels, J. Functional Analysis 178 (2000), 226–249. [46] B. Sz.-Nagy and C. Foia¸¸s, Harmonic Analysis of Operators on Hilbert Space, North Holland/American Elsevier, 1970. [47] N.K. Nikol’ski˘, Treatise on the Shift Operator: Spectral Function Theory, SpringerVerlag, Berlin, 1986. [48] G. Popescu, Models for infinite sequences of noncommuting operators, Acta Sci. Math. 53 (1989), 355–368. [49] G. Popescu, Characteristic functions for infinite sequences of noncommuting operators, J. Operator Theory 22 (1989), 51–71. [50] G. Popescu, Von Neumann inequality for (B(H)n )1 , Math. Scand. 68 (1991), 292– 304. [51] G. Popescu, Multi-analytic operators on Fock spaces, Math. Ann. 303 (1995), 31–46. [52] G. Popescu, Interpolation problems in several variables, J. Math. Anal. Appl. 227 (1998), 227–250. [53] G. Popescu, Poisson transforms on some C ∗ -algebras generated by isometries, J. Functional Analysis 161 (1999), 27–61. [54] W. Rudin, Functional Analysis, McGraw-Hill, New York, 1973. [55] A.E. Taylor and D.C. Lay, Introduction to Functional Analysis, Second Edition, Wiley, 1980. [56] A.T. Tomerlin, Products of Nevanlinna-Pick kernels and operator colligations, Integral Equations Operator Theory 38 (2000), no. 3, 350–356. [57] K. Zhou, J.C. Doyle and K. Glover, Robust and Optimal Control, Prentice Hall, Upper Saddle River, New Jersey, 1996. Joseph A. Ball Department of Mathematics Virginia Tech Blacksburg, Virginia 24061-0123 e-mail: [email protected] Gilbert Groenewald Department of Mathematics North West University Potchefstroom 2520, South Africa e-mail: [email protected] Tanit Malakorn Department of Electrical and Computer Engineering Naresuan University Phitsanulok, 65000, Thailand e-mail: [email protected]
Operator Theory: Advances and Applications, Vol. 161, 225–270 c 2005 Birkhauser ¨ Verlag Basel/Switzerland
The Bezout Integral Operator: Main Property and Underlying Abstract Scheme I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer Abstract. For a class of entire matrix-functions a continuous analogue of the classical Bezout matrix for scalar polynomials is introduced and studied. This analogue is an integral operator with a matrix-valued kernel. The null space of this operator is explicitly expressed in terms of the common eigenvectors and common Jordan chains of the two underlying entire matrix functions. Also a refinement of the abstract scheme from [17] for defining Bezout operators is presented and analyzed. The approach of the paper is based, to a large extent, on the state space method from mathematical system theory. In particular, an important role is played by the fact that the functions involved can be represented as transfer functions of certain infinite-dimensional input output systems. Mathematics Subject Classification (2000). Primary 47B35, 47B99, 45E10, 30D20; Secondary 33C47, 42C05, 93B15. Keywords. Bezout operator, continuous analogue of the Bezout matrix, convolution integral operators on a finite interval, state space method.
Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226 2 Spectral theory of entire matrix functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1 A review of the spectral data of an analytic matrix function . . . . . 2.2 Eigenvalues and Jordan chains in terms of realizations . . . . . . . . . . . 2.3 Common eigenvalues and common Jordan chains in terms of realizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4 Common spectral data of entire matrix functions . . . . . . . . . . . . . . . .
228 229 232 234 237
The research of the fourth author was partially supported by a visitor fellowship of the Netherlands Organization for Scientific Research (NWO) and by the Fund for Promotion of Research at the Technion, Haifa.
226
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
3 The 3.1 3.2 3.3 3.4 3.5
null space of the Bezout integral operator . . . . . . . . . . . . . . . . . . . . . . . . Preliminaries on convolution integral operators . . . . . . . . . . . . . . . . . . Co-realizations for the functions A, B, C, D . . . . . . . . . . . . . . . . . . . . . . Quasi commutativity in operator form . . . . . . . . . . . . . . . . . . . . . . . . . . . Intertwining properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Proof of the first main theorem on the Bezout integral operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 A general scheme for defining Bezout operators . . . . . . . . . . . . . . . . . . . . . . . 4.1 A preliminary proposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2 Definition of an abstract Bezout operator . . . . . . . . . . . . . . . . . . . . . . . . 4.3 The Haimovici-Lerer scheme for defining an abstract Bezout operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4 The Bezout integral operator revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.5 The null space of the Bezout integral operator . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
241 242 244 248 251 254 256 257 260 262 264 266 268
1. Introduction Let a, b, c, and d be n × n matrix functions, a and d belong to Ln×n [0, ω], while 1 [−ω, 0]. We shall assume that the four functions a, b, c, d b and c belong to Ln×n 1 satisfy the following additional condition A(λ)B(λ) = C(λ)D(λ),
λ ∈ C,
(1.1)
where A, B, C, D are the entire n × n matrix functions given by ω 0 eiλs a(s) ds, B(λ) = In + eiλs b(s) ds, A(λ) = In + C(λ) = In +
(1.2)
−ω ω
0 0
eiλs c(s) ds, −ω
D(λ) = In +
eiλs d(s) ds.
(1.3)
0
Here In denotes the n × n identity matrix. Notice that in the scalar case (n = 1) the additional condition (1.1) is automatically fulfilled with a = d and b = c. Given four functions as above, we let T be the integral operator on Ln1 [0, ω] defined by ω
γ(t, s)ϕ(s) ds,
(T ϕ)(t) = ϕ(t) +
0 ≤ t ≤ ω,
(1.4)
0
with the kernel function γ being given by γ(t, s) = +
a(t − s) + b(t − s) + min{t,s} a(t − r)b(r − s) − c(t − ω − r)d(r + ω − s) dr. (1.5) 0
We can now state the first main result of this paper, which shows that the operator T preserves the main property of the classical Bezout matrix.
Bezout Integral Operator and Abstract Scheme
227
Theorem 1.1. Assume that condition (1.1) is satisfied. Then the dimension of the null space of the operator T defined by (1.4), (1.5) is equal to the total multiplicity of the common eigenvalues of the entire matrix functions B and D. The definition of the total multiplicity of the common eigenvalues of the entire matrix functions B and D, which involves the notion of common Jordan chains of B and D, will be given at the end of Section 2.1 below. We shall also present a basis of the null space of T in terms of these common Jordan chains (Theorem 4.6). For the scalar case and with a = d and b = c, the above theorem, together with the description of its null space, has been proved in Section 6 of [9]. In [9] it has also been shown that for the scalar case and with a = d and b = c the operator T is the natural continuous analogue of the classical Bezout matrix for polynomials. For the matrix case (n > 1) it is proved in [17] that this analogy with the classical Bezout matrix remains true provided condition (1.1) is satisfied. For this reason we shall refer to the operator T as the Bezout integral operator associated with {A, C; B, D}, and we simply write T = T {A, C; B, D}. We call (1.1) the quasi commutativity property of the quadruple {A, C; B, D}. Theorem 1.1 does not remain true, not even in the scalar case, when the quasi commutativity property is not satisfied (see the example at the end of Chapter 3). Theorem 1.1 has been proved in the dissertation [16] using the general scheme for Bezout operators developed in [17]. In this paper we give a self-contained and independent proof of Theorem 1.1. Theorem 4.6, which gives explicit formulas for a basis of the null space of T , is our second main result and seems to be new. We also present and analyse a more refined version of the scheme from [17], and use this to prove Theorem 4.6. This paper consists of four chapters including the present introduction. In the second chapter we recall the notion of total common multiplicity (that is, the total multiplicity of the common eigenvalues) of two entire matrix functions and study common spectral data, including common Jordan chains, of such functions. The latter is done by representing the functions involved as transfer functions of certain infinite-dimensional input output systems. The main result is Theorem 2.6 which identifies the total common multiplicity of two entire matrix functions in terms of certain invariant subspaces. Theorem 1.1 is proved in Chapter 3 using Theorem 2.6. In the final chapter we return to the definition of a Bezout integral operator T . We present a refinement of the scheme from [17] for defining Bezout operators, and we show how our operator T fits into this scheme. In the final section we prove our second main theorem (Theorem 4.6). We conclude this introduction with a few remarks about the literature on the Bezout matrix and its generalizations. For the definition of the classical Bezout matrix for two scalar polynomials and a comprehensive survey of its properties and its use in various applications, we refer the reader to [20]. Getting the main
228
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
property of the Bezout matrix for matrix (non-commutative) functions presented serious difficulties. Attempts to generalize the notion of the Bezout matrix to matrix polynomials by replacing scalar multiplication by the usual matrix multiplication or by the tensor (Kronecker) product did not yield a natural analogue with the same main property (see [24] for details). The idea of involving supplementary functions A and C along with the given functions B and D, such that (1.1) holds true, originates from some problems in system theory (see [1], also [6], [18]), and turned out to be successful; see, for the case of matrix and operator polynomials, [23], [24], and also Chapter 9 in the book [25]. Notice that in many problems the supplementary functions appear in a natural way or can be constructed from the given functions B and D (see, e.g., Section 5 in [10]) and Section 3 in [14]). Of course, in the commutative case one can just set A = D and C = B. When passing to non-polynomial matrix functions, a significant role is played by the idea of representing the matrix functions involved as transfer functions of certain input output systems. For entire scalar functions it originates from the paper [26]; see also Chapter 16 in [27]. This idea and the one of the previous paragraph were important in constructing proper analogues of the classical Bezout matrix for rational and analytic matrix functions (see, e.g., [10], [16], [17], [21], [22] and the references therein). The two ideas play also an important role in the present paper.
2. Spectral theory of entire matrix functions In this chapter we deal with entire n×n matrix functions that are equal to the n×n identity matrix In at the point zero. Such a function F admits a representation of the form F (λ) = In + λC(I − λA)−1 B,
λ ∈ C.
(2.1)
Here A is a quasi-nilpotent operator on a Banach space X , that is, A is a bounded linear operator of which the spectrum σ(A) consists of the point zero only. Furthermore, B : Cn → X and C : X → Cn are bounded linear operators, and I is the identity operator on X . Since σ(A) = {0}, the operator I − λA is invertible for each λ ∈ C. Hence, both sides of (2.1) are well defined for each λ ∈ C. To get a representation of F as in (2.1), it is convenient to first consider the n × n matrix function W (λ) = F (λ−1 ) which is defined and analytic on the set Ω = (C ∪ {∞})\{0}. In particular, 0 ∈ Ω. Hence we can apply Theorem 2.5 in [5] to show that there exists a Banach space X , a bounded linear operator A on X such that σ(A) = {0}, and bounded linear operators B : Cn → X and C : X → Cn such that W (λ) = In + C(λI − A)−1 B,
0 = λ ∈ C.
(2.2)
Bezout Integral Operator and Abstract Scheme
229
Here, as before, I is the identity operator on X . Since F (λ) = W (λ−1 ), from (2.2) we get (2.1) for each 0 = λ ∈ C. But both the left and right side of (2.1) are analytic at zero. Thus (2.1) holds for each λ ∈ C. One refers to the right-hand side of (2.2) as a realization of W . This terminology is taken from mathematical system theory, where functions of the form (2.2) appear as transfer functions of time-invariant input output systems (cf., [5], [7]). Following the system theory terminology we call the space X the state space of the realization, and the operator A in (2.2) is called the main operator or state operator. The operators B and C are called the input operator and output operator, respectively. We shall use these terms also for the operators in (2.1). In the sequel we refer to the right-hand side of (2.1) as a co-realization. The terms realization and co-realization will also be used when in (2.1) and (2.2) the identity matrix In is replaced by an arbitrary square matrix D. This chapter, which consists of four sections, deals with the spectral properties of the functions F and W in terms of the representations (2.1) and (2.2). The first section, which has a preliminary character, reviews for analytic matrix functions the concepts of eigenvalues, corresponding eigenvectors and Jordan chains, and canonical systems of Jordan chains. In Section 2.2 we show that the representation (2.2) of W allows one to describe the eigenvalues, the corresponding eigenvectors, and canonical systems of Jordan chains corresponding to an eigenvalue of W in terms of the spectral properties of the operator A× = A−BC. Notice that the latter operator appears in a natural when one invert W (λ). Indeed, W (λ)−1 = In − C(λI − A× )−1 B,
λ ∈ σ(A× ),
where σ(A× ) denotes the spectrum of A× . In Section 2.3 we use realizations to describe the common eigenvalues and common Jordan chains of two functions of the form (2.2). In the final section the results of the third section are applied to two entire matrix functions, and we use co-realizations to describe the common zero data of two such functions in operator terms. 2.1. A review of the spectral data of an analytic matrix function Let H be an n × n matrix function, which is analytic on an open set Ω of C. We assume H to be regular on Ω. The latter means that det H(λ) ≡ 0 on each connected component of Ω. As usual the values of H are identified with their canonical action on Cn . In what follows λ0 is a point in Ω. The point λ0 is called an eigenvalue of H whenever there exists a vector x0 = 0 in Cn such that H(λ0 )x0 = 0. In that case the non-zero vector x0 is called an eigenvector of H at λ0 . Note that λ0 is an eigenvalue of H if and only if det H(λ0 ) = 0. In particular, in the scalar case, i.e., when n=1, the point λ0 is an eigenvalue of H if and only if λ0 is a zero of H. The multiplicity ν(λ0 ) of the eigenvalue λ0 of H is defined as the multiplicity of λ0 as a zero of det H(λ). The set of eigenvectors of H at λ0 together with the zero vector is equal to Ker H(λ0 ).
230
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
An ordered sequence of vectors x0 , x1 , . . . , xr−1 in Cn is called a Jordan chain of length r of H at λ0 if x0 = 0 and k 1 (j) H (λ0 )xk−j = 0, j! j=0
k = 0, . . . , r − 1.
(2.3)
Here H (j) (λ0 ) is the jth derivative of H at λ0 . From x0 = 0 and (2.3) it follows that λ0 is an eigenvalue of H and x0 a corresponding eigenvector. The converse is also true, that is, x0 is an eigenvector of H at λ0 if and only if x0 is the first vector in a Jordan chain for H at λ0 . Given an eigenvector x0 of H at λ0 there are in general many Jordan chains for H at λ0 which have x0 as their first vector. However, the fact that H is regular implies that the lengths of these Jordan chains have a finite supremum which we shall call the rank of the eigenvector x0 . To organize the Jordan chains corresponding to the eigenvalue λ0 we proceed as follows. Choose an eigenvector x1, 0 in Ker H(λ0 ) such that the rank r1 of x1, 0 is maximal, and let x1, 0 , . . . , x1, r1 −1 be a corresponding Jordan chain. Next we choose among all vectors x in Ker H(λ0 ), with x not a multiple of x1, 0 , a vector x2, 0 of maximal rank, r2 say, and we choose a corresponding Jordan chain x2, 0 , . . . , x2, r2 −1 . We proceed by induction. Assume x1, 0 , . . . , x1, r1 −1 , . . . , xk, 0 , . . . , xk, rk −1 have been chosen. Then we choose xk+1, 0 to be a vector in Ker H(λ0 ) that does not belong to span{x1, 0 , . . . , xk, 0 } such that xk+1, 0 is of maximal rank among all vectors in Ker H(λ0 ) not belonging to span{x1, 0 , . . . , xk, 0 }. In this way, in a finite number of steps, we obtain a basis x1, 0 , x2, 0 , . . . , xp, 0 of Ker H(λ0 ) and corresponding Jordan chains x1, 0 , . . . , x1, r1 −1 , x2, 0 , . . . , x2, r2 −1 , . . . , xp, 0 , . . . , xp, rp −1 .
(2.4)
The system (2.4) is called a canonical system of Jordan chains for H at λ0 . From the construction it follows that p = dim Ker H(λ0 ). Furthermore, the numbers r1 ≥ r2 ≥ · · · ≥ rp are uniquely determined by H and do not depend on the particular choices made above. They are called the partial multiplicities of H at λ0 . Their sum r1 + · · · + rp is equal to the multiplicity ν(λ0 ). The above definitions of eigenvalue, eigenvector and Jordan chain for H at λ0 also make sense when H is non-regular or when H is a non-square analytic matrix function on Ω. However, in that case it may happen that the supremum of the lengths of the Jordan chains with a given first vector is not finite. On the other hand, if for each non-zero vector x0 in Ker H(λ0 ) the supremum of the lengths of the Jordan chains with x0 as first vector is finite, then we can define a canonical set of Jordan chains for H at λ0 in the same way as it was done above for regular analytic matrix functions. More details on the above notions, including proofs, can be found in [15]; see also the book [13] or the appendix of [11].
Bezout Integral Operator and Abstract Scheme
231
Common spectral data. Next we consider two n × n matrix functions H1 and H2 which are analytic on an open subset Ω of C. We also assume that either H1 or H2 is regular on Ω. Let λ0 be a point in Ω. We say that λ0 is a common eigenvalue of H1 and H2 if there exists a vector x0 = 0 such that H1 (λ0 )x0 = H2 (λ0 )x0 = 0. In this case we refer to x0 as a common eigenvector of H1 and H2 at λ0 . Note that x0 is a common eigenvector of H1 and H2 at λ0 if and only if x0 is a non-zero vector in A B H1 (λ0 ) Ker H1 (λ0 ) ∩ Ker H2 (λ0 ) = Ker . H2 (λ0 ) If an ordered sequence of vectors x0 , x1 , . . . , xr−1 is a Jordan chain for both H1 and H2 at λ0 , then we say that x0 , x1 , . . . , xr−1 is a common Jordan chain for H1 and H2 at λ0 . In other words, x0 , x1 , . . . , xr−1 is a common Jordan chain for H1 and H2 at λ0 if and only if x0 , x1 , . . . , xr−1 is a Jordan chain for H at λ0 , where H is the non-square matrix function given by A B H1 (λ) H(λ) = , λ ∈ Ω. (2.5) H2 (λ) Let x0 be a common eigenvector of H1 and H2 at λ0 . Since H1 or H2 is regular, the lengths of the common Jordan chains of H1 and H2 at λ0 with initial vector x0 have a finite supremum. In other words, if x0 is a non-zero vector in Ker H(λ0 ), where H is the non-square analytic matrix function defined by (2.5), then the lengths of the Jordan chains of H at λ0 with initial vector x0 have a finite supremum. Hence for H in (2.5) a canonical set of Jordan chains of H at λ0 is well defined. We say that x1, 0 , . . . , x1, r1 −1 , x2, 0 , . . . , x2, r2 −1 , . . . , xp, 0 , . . . , xp, rp −1
(2.6)
is a canonical set of common Jordan chains of H1 and H2 at λ0 if the chains in (2.6) form a canonical set of Jordan chains for H at λ0 , where H is defined by (2.5). Furthermore, in that case the number ν(H1 , H2 ; λ0 ) :=
p
rj
j=1
is called the common multiplicity of λ0 as a common eigenvalue of the analytic matrix functions H1 and H2 . If the analytic matrix functions H1 and H2 have a finite number of common eigenvalues in Ω, then we define the total common multiplicity of H1 and H2 in Ω to be the number ν(H1 , H2 ; Ω) given by ν(H1 , H2 ; Ω) = ν(H1 , H2 ; λ). λ∈Ω
When Ω = C, we simply write ν(H1 , H2 ) = ν(H1 , H2 ; C).
232
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
The total common multiplicity ν(B, D). Let B and D be the n × n entire matrix functions defined by (1.2) and (1.3), respectively. From the definitions of these functions it follows that lim
λ≤0, |λ|→∞
B(λ) = In ,
lim
λ≥0, |λ|→∞
D(λ) = In .
Thus B has only a finite number of eigenvalues in the closed lower half-plane, and the same is true for D with respect to the closed upper half-plane. We conclude that the number of common eigenvalues of B and D in C is finite. This allows us to define the total common multiplicity ν(B, D) of B and D, namely: ν(B, D; λ), ν(B, D) = λ
where the sum is taken over the common eigenvalues, and ν(B, D; λ) is the common multiplicity of λ as a common eigenvalue of B and D. 2.2. Eigenvalues and Jordan chains in terms of realizations Throughout this section W is an n× n matrix function which is analytic on C\{0}, and we assume that W is given in realized form: W (λ) = In + C(λI − A)−1 B,
0 = λ ∈ C.
(2.7)
Here A, B, C, and I are as in the previous section. With the realization (2.7) we associate the operator A× = A − BC. Since A× = A−BC and BC is of finite rank, A× is a finite rank perturbation of a quasi-nilpotent operator. It follows that a non-zero point λ0 in the spectrum of A× is an eigenvalue of finite type. Thus, if 0 = λ0 ∈ σ(A× ), then λ0 is an isolated point in σ(A× ), and the corresponding Riesz projection P (λ0 ; A× ) is of finite dimension (see Section II.1 in [8] for further details). In particular, the nonzero part of σ(A× ) consists of eigenvalues only. Recall (see Section II.2 of [8]) that x0 , x1 , . . . , xr−1 in X is called a Jordan chain of A× at λ0 if x0 = 0 and A× x0 = λ0 x0 ,
A× xj = λ0 xj + xj−1
(j = 1, . . . , r − 1).
(2.8)
In other words, in the terminology of Section 2.1, the vectors x0 , x1 , . . . , xr−1 form a Jordan chain of the operator A× at λ0 if and only if x0 , x1 , . . . , xr−1 is a Jordan chain of the analytic operator-valued function λI − A at λ0 . The following proposition is the main result of this section. Proposition 2.1. Let W be given by (2.7), and put A× = A − BC. Fix 0 = λ ∈ C. Then C maps Ker (λ0 I − A× ) in a one to one way onto Ker W (λ0 ), and the action of the corresponding inverse map is given by (A − λ0 I)−1 B. Furthermore, if x0 , . . . , xr−1 is a Jordan chain of A× at λ0 , then Cx0 , . . . , Cxr−1 is a Jordan chain of W at λ0 , and each Jordan chain of W at λ0 is obtained in this way. Proof. We shall use the fact (see [5], page 58) that the operator functions D C C D λI − A× 0 W (λ) 0 , 0 I 0 In
(2.9)
Bezout Integral Operator and Abstract Scheme are analytically equivalent the following identity C C(λI − A)−1 −(λI − A)−1 C =
233
on C\{0}. More precisely, for each 0 = λ ∈ C we have In 0
DC
W (λ) 0
−I 0
B In DC 0 0 I I
DC
λI − A× 0
0 In
In −(λI − A)−1 B
D (2.10) DC
I −C
0 In
D .
Notice that the first two factors in the left-hand side of (2.10) are invertible, and these factors and their inverses depend analytically on λ ∈ C\{0}. A similar statement holds true for the second and third factor in the right-hand side of (2.10). Thus (2.10) shows that the operator functions in (2.9) are analytically equivalent on C\{0}. We first prove the statement about the Jordan chains. So, let x0 , . . . , xr−1 be a Jordan chain for A× at λ0 . Put x(λ) = x0 + (λ − λ0 )x1 + · · · + (λ − λ0 )r−1 xr−1 . Then x(λ0 ) = x0 = 0, and (λ − A× )x(λ) = (λ − λ0 )r ϕ(λ), where ϕ is analytic at λ0 . By applying the left-hand side (2.10) to the vector function D C x(λ) , 0 we see that the function C W (λ) 0
0 I
DC
−Cx(λ) x(λ) + (λI − A)−1 BCx(λ)
D
must have a zero at λ0 of order at least r. For the second component this result means that the vectors x0 , . . . , xr−1 are precisely equal to the first r Taylor coefficients of −(λI − A)−1 BCx(λ) at λ0 . In particular x0 = −(λ0 I − A)−1 BCx0 .
(2.11)
Since x0 = 0, formula (2.11) yields Cx0 = 0. For the first component we have that W (λ)Cx(λ) has a zero of order at least r at λ0 . Since Cx0 = 0, this is equivalent to the statement that Cx0 , . . . , Cxr−1 is a Jordan chain of W at λ0 . To prove that all Jordan chains of W at λ0 are obtained in this way, let y0 , . . . , yr−1 be a Jordan chain of W at λ0 . Put y(λ) = y0 + (λ − λ0 )y1 + · · · + (λ − λ0 )r−1 yr−1 . Then y(λ0 ) = y0 = 0 and W (λ)y(λ) = (λ − λ0 )r ψ(λ), where ψ is analytic at λ0 . Using the experience of the previous part of the proof, put x(λ) = −(λI − A)−1 By(λ),
(2.12)
and let x0 , . . . , xr−1 be the first r Taylor coefficients of x(λ) at λ0 , that is, xk =
k
(A − λ0 I)−(α+1) Byk−α ,
α=0
k = 0, . . . , r − 1.
(2.13)
234
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
From (2.10) it follows that the vector function C DC DC λI − A× 0 I 0 (λI − A)−1 B 0 In In C In
I 0
DC
y(λ) 0
D
has a zero at λ0 of order at least r. Using (2.12) we conclude that the same holds true for (λI − A× )x(λ), that is, at λ0 the function (λI − A× )x(λ) has a zero of order at least r too. Since W (λ)y(λ) = y(λ) − Cx(λ), the first r Taylor coefficients at λ0 of y(λ) and Cx(λ) coincide. Thus y0 = Cx0 , . . . , yr−1 = Cxr−1 .
(2.14)
From y0 = 0 we obtain x0 = 0. But then the fact that (λI − A× )x(λ) has a zero at λ0 of order at least r is equivalent to the fact that x0 , . . . , xr−1 is a Jordan chain of λI − A× at λ0 . Formula (2.14) shows that C maps this chain onto the chain we started with. Finally notice that the result about the Jordan chains specified for r = 1 implies the fact that C maps Ker (λ0 I − A× ) in a one to one way onto Ker W (λ0 ). Furthermore, according to (2.11), the action of the corresponding inverse map is given by the operator (A − λ0 I)−1 B. Corollary 2.2. Let W be given by (2.7), and put A× = A−BC. Fix 0 = λ0 ∈ σ(A× ). If x1, 0 , . . . , x1, r1 −1 , . . . , xp, 0 , . . . , xp, rp −1 is a canonical system of Jordan chains of A× at λ0 , then the chains Cx1, 0 , . . . , Cx1, r1 −1 , . . . , Cxp, 0 , . . . , Cxp, rp −1 form a canonical system of Jordan chains for W at λ0 . Moreover, any canonical system of Jordan chains for W at λ0 is obtained in this way. In particular, the multiplicity of λ0 as an eigenvalue of W is equal to rank P (λ0 ; A× ), where P (λ0 ; A× ) is the Riesz projection of A× corresponding to λ0 . Proof. The result follows immediately from Proposition 2.1. Indeed, notice that C maps Ker (λ0 I − A× ) in a one to one way onto Ker W (λ0 ). Since xj+1, 0 is a vector in Ker (λ0 I −A× ) which does not belong to span{x1, 0 , . . . , xj, 0 }, it follows that Cxj+1, 0 is a vector in Ker W (λ0 ) which does not belong to span{Cx1, 0 , . . . , Cxj, 0 }. This, together with the definition of a canonical system of Jordan chains, yields the desired result. 2.3. Common eigenvalues and common Jordan chains in terms of realizations Throughout this section W1 and W2 are n × n matrix functions which are analytic on C\{0} and at infinity. We assume that W1 (∞) and W2 (∞) are equal to the n × n identity matrix. The functions W1 and W2 can be realized simultaneously in the following way: W1 (λ) = In + C1 (λI − A)−1 B,
W2 (λ) = In + C2 (λI − A)−1 B.
(2.15)
Bezout Integral Operator and Abstract Scheme
235
Here A is a quasi-nilpotent operator acting on a Banach space X , the operators C1 , C2 act from X into Cn , and B is an operator from Cn into X . To get the realizations in (2.15) we apply Theorem 2.5 in [5] to the 2n × 2n matrix function C D W1 (λ) 0 W (λ) = . W2 (λ) 0 The zeros in the second column of W (λ) stand for the zero n × n matrix. Since W is analytic on C\{0} and at infinity, Theorem 2.5 in [5] tells us that W admits a representation ˆ ˆ + C(λI ˆ W (λ) = D − A)−1 B, ˆ maps where A is a quasi-nilpotent operator on a Banach space X , the operator B 2n 2n C into X , and Cˆ maps X into C . Furthermore, C D In 0 ˆ D = W (∞) = . In 0 ˆ and Cˆ can be partitioned as follows: Identifying C2n with Cn ⊕Cn , the operators B C D C n D C n D 2 3 ˆ = B1 B2 : Cn → X , Cˆ = C1 : X → Cn . B C C2 C It follows that with this choice of A, C1 , C2 and with B = B1 the identities in (2.15) are satisfied. Our aim is to describe the common eigenvalues and common Jordan chains of W1 and W2 given by the realizations in (2.15). For this purpose, put A× 1 = A − BC1 ,
A× C2 , 2 = A − BC
(2.16)
and let M be the largest subspace of Ker (C1 − C2 ) that is invariant under A× 1 . Since M ⊂ Ker (C1 − C2 ), the operators C1 and C2 coincide on M, and hence A× 1 × and A× 2 also coincide on M. In particular, A2 leaves M invariant too. It follows that M is also the largest A× 2 -invariant subspace contained in Ker (C1 − C2 ). In and C the sequel we let A× M be the operators defined by M × × A× M = A1 |M = A2 |M : M → M,
(2.17)
CM = C1 |M = C2 |M : M → C . (2.18) By IM we denote the identity operator on M. We shall need the following lemma. n
Lemma 2.3. The non-zero part of σ(A× M ) consists of eigenvalues of finite type only. × Proof. Let λ0 = 0 be a point in the boundary ∂σ(A× M ) of σ(AM ). Then λ0 is an approximate eigenvalue of A× M , that is, there exists a sequence m1 , m2 , . . . in M such that mj = 1 for each j and (λ0 IM − A× M )mj → 0 for j → ∞. Since , it follows that A× |M = A× M
(λ0 I − A× )mj = (λ0 IM − A× M )mj → 0,
j → ∞.
×
Hence λ0 is also an approximate eigenvalue of A . We conclude that × ∂σ(A× M )\{0} ⊂ σ(A ).
(2.19)
236
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
But the non-zero part of σ(A× ) consists of isolated eigenvalues only. This together × with (2.19) implies that the non-zero part of σ(A× M ) is contained in σ(A ). × Take 0 = λ0 ∈ σ(AM ). The result of the previous paragraph shows that × λ0 is an isolated point in σ(A× M ), and hence its Riesz projection P (λ0 ; AM ) is × × well defined. Since the resolvent sets of A and AM are connected, it follows that × × P (λ0 ; A× M ) = P (λ0 ; A )|M . But P (λ0 ; A ) has finite rank, and thus the same is × true for P (λ0 ; AM ). This proves that λ0 is an eigenvalue of finite type for A× M. The following proposition is the main result of this section. Proposition 2.4. Let W1 and W2 be given by (2.15), and let A× M , CM be the operators defined by (2.17) and (2.18), respectively. Fix 0 = λ0 ∈ C. Then λ0 is a common eigenvalue of W1 and W2 if and only if λ0 is an eigenvalue of × A× M . More precisely, CM maps Ker (λ0 IM − AM ) in a one to one way onto Ker W1 (λ0 )∩Ker W2 (λ0 ), and the action of the corresponding inverse map is given by (A − λ0 I)−1 B. Furthermore, if x0 , . . . , xr−1 is a Jordan chain of A× M at λ0 , then CM x0 , . . . , CM xr−1 is a common Jordan chain of W1 and W2 at λ0 , and each common Jordan chain of W1 and W2 at λ0 is obtained in this way. Proof. We first prove the statements about the Jordan chains. Let the vectors × × x0 , . . . , xr−1 form a Jordan chain of A× M at λ0 . Fix i = 1, 2. Since Ai |M = AM , the × vectors x0 , . . . , xr−1 also form a Jordan chain for Ai at λ0 . But then Proposition 2.1 implies that Ci x0 , . . . , Ci xr−1 is a Jordan chain for Wi at λ0 . Recall that CM = Ci |M and the vectors x0 , . . . , xr−1 are in M ⊂ Ker (C1 − C2 ). Thus CM xj = Ci xj for i = 1, 2 and j = 0, . . . , r−1. We conclude that CM x0 , . . . , CM xr−1 is a common Jordan chain of W1 and W2 at λ0 . Next, let y0 , . . . , yr−1 be a common Jordan chain of W1 and W2 at λ0 . Put xk =
k
(A − λ0 I)−(α+1) Byk−α ,
k = 0, . . . , r − 1.
(2.20)
α=0
From the proof of Proposition 2.1 (cf., formula (2.13)) we know that the vectors × x0 , . . . , xr−1 form a Jordan chain at λ0 for both A× 1 and A2 , and according to formula (2.14) we have yj = C1 xj
and yj = C2 xj
Since x0 , . . . , xr−1 is a Jordan chain of
A× 1 ,
(j = 0, . . . , r − 1).
(2.21)
the space
N = span{xj | j = 0, . . . , r − 1} A× 1.
is invariant under From (2.21) we see that the vectors x0 , . . . , xr−1 belong to Ker (C1 − C2 ). Thus N is an A× 1 -invariant subspace contained in Ker (C1 − C2 ). It follows that N ⊂ M. We conclude that x0 , . . . , xr−1 is a Jordan chain of A× M at λ0 and yj = CM xj for j = 0, . . . , r − 1, as desired. When specified for r = 1, the results proved in the preceding two paragraphs imply that CM maps Ker (λ0 IM − A× M ) onto Ker W1 (λ0 ) ∩ Ker W2 (λ0 ). This map is also one to one because CM x0 = 0 whenever x0 is a non-zero vector in the null
Bezout Integral Operator and Abstract Scheme
237
space Ker (λ0 IM − A× M ). By taking k = 0 in (2.20) we see that the action of the corresponding inverse map is given by the operator (A − λ0 I)−1 B. Corollary 2.5. Let W1 and W2 be given by (2.15), and let A× M , CM be the operators defined by (2.17) and (2.18), respectively. Fix 0 = λ0 ∈ σ(A× M ). If x1, 0 , . . . , x1, r1 −1 , . . . , xp, 0 , . . . , xp, rp −1 is a canonical system of Jordan chains of A× M at λ0 , then the chains CM x1, 0 , . . . , CM x1, r1 −1 , . . . , CM xp, 0 , . . . , CM xp, rp −1 form a canonical system of common Jordan chains of W1 and W2 at λ0 . Moreover, any canonical system of common Jordan chains of W1 and W2 at λ0 is obtained in this way. In particular, the total common multiplicity of W1 and W2 at λ0 is given by ν(W W1 , W2 ; λ0 ) = rank P (λ0 ; A× (2.22) M ), × where P (λ0 ; A× M ) is the Riesz projection of AM corresponding to λ0 .
Proof. The proof follows the same line of reasoning as that of Corollary 2.2. One only has to replace the reference to Proposition 2.1 by a reference to Proposition 2.4. Let W1 and W2 be given by (2.15), and let A× M be the operator defined by (2.17). Proposition 2.4 shows that W1 and W2 have a finite number of common eigenvalues in C\{0} if and only if the non-zero part of the spectrum of A× M is finite. Moreover, using (2.22), we see that in that case the total common multiplicity of W1 and W2 in C\{0} is equal to the rank of the Riesz projection (see [8]) corresponding to the non-zero part of the spectrum of A× M. 2.4. Common spectral data of entire matrix functions In this section F1 and F2 are two entire n × n matrix functions which are assumed to have the value In at zero. The functions F1 and F2 can be represented simultaneously in the form F1 (λ) = In + λC1 (I − λA)−1 B,
F2 (λ) = In + λC C2 (I − λA)−1 B
(2.23)
Here A is a quasi-nilpotent operator on a Banach space X , the operators C1 , C2 act from X into Cn , and B is an operator from Cn into X . To get the co-realizations of F1 and F2 in (2.23) one applies the result of the second paragraph of the previous section to the matrix functions W1 (λ) = F1 (λ−1 ) and W2 (λ) = F2 (λ−1 ). Theorem 2.6. Let F1 and F2 be given by (2.23), and let M be the largest subspace contained in Ker (C1 − C2 ) that is invariant under A× C2 . Assume A× 2 = A − BC 2 is injective and dim M < ∞. Then F1 and F2 have a finite number of common F1 , F2 ) is given by eigenvalues and their total common multiplicity ν(F ν(F F1 , F2 ) = dim M.
(2.24)
238
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
Furthermore, in terms of the common Jordan chains of F1 and F2 a basis of M can be obtained as follows. Let z1 , . . . , z be the set of distinct common eigenvalues of F1 and F2 in C, and for each common eigenvalue zν let y1,ν 0 , . . . , y1,ν r(ν) −1 , y2,ν 0 , . . . , y2,ν r(ν) −1 , . . . , ypνν , 0 , . . . , ypν , r(ν) −1 1
ν
2
(2.25)
pν
stand for a canonical set of common Jordan chains of F1 and F2 at zν . Then the vectors k (ν) uj,ν k = (I − zν A)−(α+1) Aα Byj,ν k−α , k = 0, . . . , rj − 1, (2.26) α=0
j = 1, . . . , pν ,
ν = 1, . . . , ,
form a basis of M. × The above theorem also holds for A× C2 . 1 = A − BC1 in place of A2 = A − BC × × Also, notice that the operators A1 and A2 coincide on the space M defined in Theorem 2.6. As before (see (2.17)), we put × × A× M = A1 |M = A2 |M : M → M.
In order to prove Theorem 2.6 it is convenient first to prove the following lemma. Lemma 2.7. Put W1 (λ) = F1 (λ−1 ) and W2 (λ) = F2 (λ−1 ), and let z0 be a common eigenvalue of F1 and F2 . Then z0 = 0, and λ0 = z0−1 is a common eigenvalue of W1 and W2 . Moreover, any non-zero common eigenvalue of W1 and W2 is obtained in this way, and W1 , W2 ; λ0 ). (2.27) ν(F F1 , F2 ; z0 ) = ν(W Proof. Since F1 and F2 are both non-singular at zero, we have z0 = 0. Furthermore, Ker F1 (z0 ) ∩ Ker F2 (z0 ) = Ker W1 (λ0 ) ∩ Ker W2 (λ0 ).
(2.28)
Thus z0 is a common eigenvalue of F1 and F2 if and only if λ0 is a common eigenvalue of W1 and W2 . It remains to prove (2.27). Let y0 , . . . , yr−1 be any common Jordan chain of F1 and F2 at z0 , and let y(z) = y0 + (z − z0 )y1 + · · · + (z − z0 )r−1 yr−1 . In what follows we define y 0 , . . . , y r−1 to be the first r Taylor coefficients of y (λ) = y(z) at λ0 , where λ = z −1 . Notice that y0 = y 0 . We claim that y 0 , . . . , y r−1 is a common Jordan chain of W1 and W2 at λ0 . To see this, let i = 1, 2, and consider Fi (z)y(z). Since y0 , . . . , yr−1 is a Jordan chain of Fi at z0 , we have Fi (z)y(z) = (z − z0 )r ψi (z), with ψi being analytic at z0 . It follows that −1 r 1 . y (λ) = Fi (z)y(z) = (z − z0 )r ψi (z) = (λ − λ0 )r ψi Wi (λ) λλ0 λ The function (−λλ0 )−r ψi (λ−1 ) is analytic at λ0 . Thus y 0 , . . . , y r−1 is a Jordan chain of Wi at λ0 . Reversing the arguments used in the preceding paragraph, one proves that each common Jordan chain of W1 and W2 at λ0 is of the form y 0 , . . . , y r−1 , where
Bezout Integral Operator and Abstract Scheme
239
y0 , . . . , yr−1 is some common Jordan chain of F1 and F2 at z0 and y0 = y 0 . We can then use (2.28) to show that the map y0 , . . . , y r−1 ) (y0 , . . . , yr−1 ) → (
(2.29)
transforms a canonical system of common Jordan chains of F1 and F2 at z0 into a canonical system of common Jordan chains of W1 and W2 at λ0 , which proves (2.27). From Lemma 2.7 and the remark made in the last paragraph of the previous section we have the following result. Corollary 2.8. Let F1 and F2 be given by (2.23), and let M be the largest subspace contained in Ker (C1 − C2 ) that is invariant under A× C2 . Put A× 2 = A − BC M = × A2 |M . Then F1 and F2 have a finite number of common eigenvalues if and only if the non-zero part of the spectrum of A× F1 , F2 ) M is finite. Moreover, in that case ν(F is equal to the rank of the Riesz projection corresponding to the non-zero part of the spectrum of A× M. Proof. Let W1 (λ) = F1 (λ−1 ) and W2 (λ) = F2 (λ−1 ). Using (2.23) we see that W1 and W2 are given by the realizations in (2.15), and hence we can apply the results of the previous section. Since the matrices F1 (0) and F2 (0) are non-singular, the common eigenvalues of F1 and F2 are all non-zero. Hence we can use Lemma 2.7 to show that F1 and F2 have a finite number of common eigenvalues if and only if W1 and W2 have a finite number of common eigenvalues in C\{0}. But then the remark made in the last paragraph of the previous section yields the first part of the corollary. Assume now that F1 and F2 have a finite number of common eigenvalues, z1 , . . . , z , say. For j = 1, . . . , put λj = zj−1 . Then λ1 , . . . , λ are the common eigenvalues of W1 and W2 in C\{0}. Using (2.27), this yields ν(F F1 , F2 ) =
j=1
ν(F F1 , F2 ; zj ) =
ν(W W1 , W2 ; λj ) = ν(W W1 , W2 ; C\{0}).
j=1
By the remark made in the last paragraph of the previous section the quantity ν(W W1 , W2 ; C\{0}) is equal to the rank of the Riesz projection corresponding to the non-zero part of the spectrum of A× M , which completes the proof. Proof of Theorem 2.6. The injectivity of A× 2 and the fact that M is invariant × × under A× 2 imply that AM = A2 |M is injective too. By assumption, M is finitedimensional. Hence the spectrum of A× M is finite and consists of eigenvalues only. is injective, it follows that the point zero is not in the spectrum of A× Since A× M M. Summarizing we see that the spectrum of A× is equal to the non-zero part of M and is finite. In particular, M is equal to the range of the the spectrum of A× M Riesz projection corresponding to the non-zero part of the spectrum of A× M . An application of Corollary 2.8 then yields (2.24).
240
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
Next we prove that the vectors in (2.26) form a basis of M. Let W1 (λ) = F1 (λ−1 ) and W2 (λ) = F2 (λ−1 ). For ν = 1, . . . , put λν = zν−1 . Using the map (2.29) with zν in place of z0 , we transform the canonical system (2.25) into y 1,ν 0 , . . . , y 1,ν r(ν) −1 , y 2,ν 0 , . . . , y 2,ν r(ν) −1 , . . . , y pνν , 0 , . . . , y pν , r(ν) −1 . 1
ν
2
(2.30)
pν
From the proof of Lemma 2.7 we know that (2.30) forms a canonical system of common Jordan chains of W1 and W2 at λν = zν−1 . But then we can use Corollary 2.5 to show that ν ν y j,k = CM x
j,k ,
(ν)
k = 0, . . . , rj
− 1, j = 1, . . . , pν ,
where ν ν ν ν x
1,
1, ,x
2,
2, ,...,x
pνν , 0 , . . . , x
pν (ν) (ν) 0, . . . , x 0, . . . , x r −1 r −1 1
(ν) ν , rpν −1
2
× is a canonical system of Jordan chains of A× M = A2 |M at λν . It follows that the set of vectors ν (ν) (2.31) x
j, k | k = 0, . . . , rj − 1, j = 1, . . . , pν , ν = 1, . . . ,
forms a basis for M. We proceed by relating the vectors in the set (2.31) to the vectors uj,ν k in (2.26). From (2.13) we know that ν = x
j,k
k
ν (A − λν I)−(α+1) B yj,k−α ,
(ν)
k = 0, . . . , rj
− 1.
α=0
Now put x
jν (λ) y jν (λ)
ν−1
= x
j,ν 0 + (λ − λν ) xj,ν 1 + · · · + (λ − λν )rj =
y j,ν 0
+ (λ −
λν ) yj,ν 1
j
rjν−1
+ · · · + (λ − λν )
x
j,ν rν−1 ,
y j,ν rν−1 . j
Then at λν the function x
jν (λ) + (λI − A)−1 B yjν (λ) has a zero of order at least rj . Next, for z = λ−1 put (ν)
jν (λ), xjν (z) = x
yjν (z) = y jν (λ).
Then we see that at zν the function xjν (z) + z(I − zA)−1 Byjν (z) has a zero of (ν)
order at least rj
(ν)
too. Let xj,ν 0 , . . . , xj, r(ν) −1 be the first rj xjν (z)
Taylor coefficients in
j
the Taylor expansion of at zν . By comparing the Taylor expansions of the functions xjν (z) and −z(I − zA)−1 Byjν (z) at zν we obtain xj,ν 0 = −zν (I − zν A)−1 Byj,ν 0 ,
(2.32)
xj,ν k = −zν (I − zν A)−1 Byj,ν k
(2.33)
−
k
ν (I − zν A)−(α+1) Aα−1 Byj,k−α
α=1
(ν)
(k = 1, . . . , rj
− 1).
Bezout Integral Operator and Abstract Scheme
241
To see this note that
−1 z(I − zA)−1 = z (I − zν A) − (z − zν )A =z
∞
(z − zν )α (I − zν A)−(α+1) Aα
α=0
=
∞
(z − zν )(α+1) (I − zν A)−(α+1) Aα + zν (I − zν A)−1
α=0 ∞
+ =
(z − zν )α (I − zν A)−(α+1) (zν A − I + I)A(α−1)
α=1 ∞
(z − zν )(α+1) (I − zν A)−(α+1) Aα + zν (I − zν A)−1
α=0 ∞
− +
(z − zν )α (I − zν A)−α A(α−1)
α=1 ∞
(z − zν )α (I − zν A)−(α+1) A(α−1)
α=1
= zν (I − zν A)−1 +
∞
(z − zν )α (I − zν A)−(α+1) A(α−1) .
α=1
Thus z(I − zA)−1 = zν (I − zν A)−1 +
∞
(z − zν )α (I − zν A)−(α+1) A(α−1) .
α=1
From the latter identity the formulas (2.32) and (2.33) are clear. Finally, to complete the proof notice that for α ≥ 1 we have zν (I − zν A)−(α+1) Aα = (I − zν A)−(α+1) (zν A − I + I)A(α−1) = −(I − zν A)−α A(α−1) + (I − zν A)−(α+1) A(α−1) . Using this in (2.26) we obtain zν uj,ν 0 = −xνj, 0 ,
zν uj,ν k = −xj,ν k − uj,ν k−1
(ν)
(k = 1, . . . , rj
− 1).
Since the set (2.31) is a basis for M and zν = 0, we conclude that vectors in (2.26) form a basis for M too. In the next chapter we shall apply the results of this section to the entire matrix functions B and D appearing in (1.2) and (1.3).
3. The null space of the Bezout integral operator In this chapter we prove Theorem 1.1. The proof will be based on Theorem 2.6. This requires to have appropriate co-realizations for the entire matrix functions
242
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
A, B, C, D. These co-realizations will be constructed in Section 3.2, using some preliminaries on convolution integral operators from Section 3.1. In Section 3.3 we use a result from [10] to restate the quasi commutativity property (1.1) in terms of convolution integral operators on Ln1 [0, ω]. In Section 3.4 we establish intertwining relations between the operator T = T {A, C; B, D} and the main operators of the inverses of the co-realizations in Section 3.2. We are then ready to give the proof in Section 3.5. 3.1. Preliminaries on convolution integral operators Throughout this section V and W are the linear transformations defined by t ω (V f )(t) = −i f (s) ds, (W f )(t) = i f (s) ds (0 ≤ t ≤ ω). (3.1) 0
t
We view V and W as bounded linear operators on Ln1 [0, ω]. We also need the following projection and embedding operators: ω n n πf = f (s) ds, (3.2) π : L1 [0, ω] → C , 0
τ :C → n
Ln1 [0, ω],
(τ x)(t) = x,
0 ≤ t ≤ ω.
(3.3)
Notice that W − V = iL where L = τ π. The operators V and W are Volterra operators, that is, the operators V and W are compact and their spectra consist of the number zero only. Proposition 3.1. Let k ∈ Ln×n [−ω, ω], and consider on Ln1 [0, ω] the integral op1 erators ω (Kϕ)(t) = k(t − s)ϕ(s) ds, 0 ≤ t ≤ ω, 0 t k(t − s)ϕ(s) ds, 0 ≤ t ≤ ω, (K+ ϕ)(t) = 0 ω (K− ϕ)(t) = k(t − s)ϕ(s) ds, 0 ≤ t ≤ ω. t
Put L = τ π, where π and τ are defined by (3.2) and (3.3). Then V K − KV = iK− L − iLK− ,
W K − KW = iLK+ − iK+ L .
(3.4)
Moreover, K commutes with V if and only if k has its support on the positive half-line, and K commutes with W if and only if k has its support on the negative half-line. Proof. We split the proof into three parts. Part 1. In this part R is an arbitrary integral operator on Ln1 [0, ω], ω (Rf )(t) = ρ(t, s)f (s) ds, 0 ≤ t ≤ ω. 0
Bezout Integral Operator and Abstract Scheme
243
We assume that the function |ρ(t, s)f (s)| is integrable on [0, ω] × [0, ω]. We claim that W R − RV is an integral operator of which the kernel function γR is given by ω ω ρ(t, s) ds + i ρ(s, r) ds, 0 ≤ t, r ≤ ω. (3.5) γR (t, r) = i r
t
Indeed, using Fubini’s theorem, we have ω ω ω (Rf )(s) ds = i ρ(s, r)f (r) dr ds (W Rf )(t) = i t 0 t ω ω = i ρ(s, r) ds f (r) dr, 0
and
(RV f )(t)
= =
t
ω s ρ(t, s)(V f )(s) ds = −i ρ(t, s) f (r) dr ds 0 0 0 ω ω −i ρ(t, s) ds f (r) dr. ω
0
r
This shows that W R − RV has (3.5) as its kernel function. Part 2. In this part we apply the result of the previous part to R = K, and we show that (3.6) W K − KV = iLK+ + iK− L. Indeed, when R = K, we have ρ(t, s) = k(t − s), and the kernel function γK (t, r) of W K − KV is given by ω ω k(t − s) ds + i k(s − r) ds γK (t, r) = i r t ω−r ω−r t−r k(s) ds + i k(s) ds = i k(s) ds, 0 ≤ t, r ≤ ω. =i t−ω
t−r
It follows that (W K − KV )f (t) =
ω
i
0
ω
ω
=i 0
t−ω ω−r
k(s) ds f (r) dr k(s) ds f (r) dr + i
0
=i 0
ω−r
t−ω
0
ω
k(s − r) ds f (r) dr + i
r
ω
s
ω
t
= i(LK+ f )(t) + i(K− Lf )(t),
k(s) ds f (r) dr
t−ω 0
k(s) ds Lf
t−ω ω−t
k(s − r)f (r) dr ds + i 0 0 0 ω k(t − s)(Lf )(s) ds = i(LK+ f )(t) + i =i
0
k(−s) ds Lf
244
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
which proves (3.6). Using that W − V = iL, it is straightforward to derive from (3.6) the two identities in (3.4). Part 3. In this part we prove the final statements of the proposition. First note that the identities in (3.4) yield V K+ − K+ V = 0,
W K− − K− W = 0.
(3.7)
If k has its support on the positive half-line, then K = K+ , and hence the first identity in (3.7) shows that K commutes with V . Since K = K+ + K− , to prove the reverse implication, it suffices to show that V K− − K− V = 0 implies K− = 0. To do this, assume V K− − K− V = 0. Then the first identities in (3.7) and (3.4) yield that K− L = LK− . Put k− = k|[−ω,0] . The identity K− L = LK− implies that for each x ∈ Cn we have ω k− (t − s)x ds = (K− Lτ x)(t) = (LK− τ x)(t), and hence
ω t
t
k− (t − s)x ds does not depend on t. It follows that 0 ω k− (t − s)x ds = k− (s)x ds, 0 ≤ t ≤ ω, t−ω
t
does not depend on t, which implies that k− = 0, and hence K− = 0. In a similar way one proves that K commutes with W if and only if k has its support on the negative half-line. Let K, K+ and K− be as in the above proposition. We say that K ∈ P if K = K+ , and K ∈ N if K = K− . In other words, K ∈ P if and only if k has its support on the positive half-line, and K ∈ N if and only if k has its support on the negative half-line. Using this terminology, the final part of Proposition 3.1 can be summarized as follows: K ∈ P if and only if K commutes with V , and K ∈ N if and only if K commutes with W . 3.2. Co-realizations for the functions A, B, C, D In this section we show that the entire matrix functions A, B, C, D defined by (1.2), (1.3) admit the following co-realizations: A(0) + iλπ(I − λW )−1 YA ,
(3.8)
B(λ) =
B(0) + iλZB (I − λV )−1 τ,
(3.9)
C(λ) =
e−iλω {C(0) + iλπ(I − λW )−1 YC },
(3.10)
D(λ)
eiλω {D(0) + iλZ ZD (I − λV )−1 τ }.
(3.11)
A(λ)
=
=
Here V and W are the operators on Ln1 [0, ω] defined by (3.1), the operators π and τ are given by (3.2) and (3.3), respectively, and the operators YA , YC from Cn into
Bezout Integral Operator and Abstract Scheme Ln1 [0, ω], and ZB , ZD from Ln1 [0, ω] into Cn are given by ω YA = A1 τ, (A1 f )(t) = a(t + ω − s)f (s) ds, 0 ≤ t ≤ ω, t ω C0 f )(t) = c(t − s)f (s) ds, 0 ≤ t ≤ ω, YC = (I + C0 )τ, (C t t b(t − ω − s)f (s) ds, 0 ≤ t ≤ ω, ZB = −πB−1 , (B−1 f )(t) = 0 t d(t − s)f (s) ds, 0 ≤ t ≤ ω. ZD = −π(I + D0 ), (D0 f )(t) =
245
(3.12) (3.13) (3.14) (3.15)
0
Here π and τ are defined by (3.2) and (3.3), respectively. To derive formulas (3.8)– (3.11) we need some auxiliary results. Recall that the spectra of the operators V and W consist of the point zero only. Hence (I − λV )−1 and (I − λW )−1 are well defined for each λ ∈ C. In fact, elementary calculations show that for each λ ∈ C we have t (I − λV )−1 f (t) = f (t) − iλ eiλ(r−t) f (r) dr, 0 ≤ t ≤ ω, (3.16) 0 ω (I − λW )−1 f (t) = f (t) + iλ eiλ(r−t) f (r) dr, 0 ≤ t ≤ ω. (3.17) t
From (3.16) and (3.17) it is straightforward to derive the following equalities which will be useful later: (3.18) (I − λV )−1 τ x (t) = e−iλt x, 0 ≤ t ≤ ω, x ∈ Cn , (3.19) (I − λW )−1 τ x (t) = eiλ(ω−t) x, 0 ≤ t ≤ ω, x ∈ Cn , iλπ(I − λV )−1 τ
=
(1 − e−iλω )IIn ,
iλπ(I − λW )−1 τ
=
(eiλω − 1)IIn ,
λ ∈ C, λ ∈ C.
(3.20) (3.21)
Here In denotes the n × n identity matrix. To derive the co-realizations (3.8)–(3.11) the following two propositions will be useful. [0, ω], and let M0 and M1 be the operators on Proposition 3.2. Let m ∈ Ln×n 1 Ln1 [0, ω] defined by
t
m(t − s)f (s) ds,
(M M0 f )(t) =
0 ≤ t ≤ ω,
0
ω
m(t + ω − s)f (s) ds,
(M1 f )(t) = t
0 ≤ t ≤ ω.
246
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
Then the Fourier transform m ˆ of m admits the following representations: m(λ) ˆ
=
ˆ − iλπM M0 (I − λV )−1 τ }, eiλω {m(0)
(3.22)
m(λ) ˆ
=
m(0) ˆ + iλπM1 (I − λW )−1 τ.
(3.23)
Proof. First notice that m(λ) ˆ =
ω
e
iλr
0
ω
eiλ(ω−r) m(ω − r) dr.
m(r) dr =
(3.24)
0
Using (3.18), we see that
2 3 m(λ) ˆ = eiλω M (I − λV )−1 τ = eiλω M τ + λM V (I − λV )−1 τ ,
where M is the operator from Ln1 [0, ω] into Cn given by ω Mf = m(ω − r)f (r) dr.
(3.25)
0
Obviously, M τ = m(0). ˆ Notice that M f = (M M0 f )(ω). The latter identity, together with the fact M0 and V commute (see the final paragraph of the previous section), yields ω M V f = (M M0 V f )(ω) = (V M0 f )(ω) = −i (M M0 f )(t) dt = −iπM M0 f, 0
which proves (3.22). Formula (3.23) is proved in a similar way. Indeed, using (3.24) and (3.19), we have m(λ) ˆ = M (I − λW )−1 τ = M τ + λM W (I − λW )−1 τ. Since M f = (M1 f )(0) and M1 and W commute (see the final paragraph of the previous section), we have ω M W f = (M1 W f )(0) = (W M1 f )(0) = i M1 f (t) dt = iπM1 f, 0
which proves (3.23).
Proposition 3.3. Let ∈ Ln×n [−ω, 0], and let L0 and L−1 be the operators on 1 Ln1 [0, ω] defined by ω (L0 f )(t) = (t − s)f (s) ds, 0 ≤ t ≤ ω, t t (t − ω − s)f (s) ds, 0 ≤ t ≤ ω. (L−1 f )(t) = 0
Then the Fourier transform ˆ of admits the following representations: ˆ (λ)
=
ˆ − iλπL−1 (I − λV )−1 τ, (0)
(3.26)
ˆ (λ)
=
ˆ + iλπL0 (I − λW )−1 τ }. e−iλω {(0)
(3.27)
Bezout Integral Operator and Abstract Scheme
247
Proof. We obtain this proposition as a corollary of the previous one. Indeed, define [0, ω], and m(t) = (t − ω) for 0 ≤ t ≤ ω. Then m ∈ Ln×n 1 ω ω eiλt m(t) dt = eiλt (t − ω) dt m(λ) ˆ = 0
= eiλω
0 ω
eiλ(t−ω) (t − ω) dt
0
= eiλω
0
ˆ eiλt (t) dt = eiλω (λ).
−ω
Now apply Proposition 3.2 to this m. Notice that M0 = L−1 and M1 = L0 . Since ˆ ˆ m(0) ˆ = (0), and m(λ) ˆ = eiλω (λ), we see that formula (3.22) yields (3.26), and (3.23) yields (3.27). Let us now derive formulas (3.8)–(3.11) by applying the above two propositions. Proof of (3.8). We apply Proposition 3.2 with m = a. In this case the operator M1 = A1 , where A1 is given by (3.12). Hence (3.23), together with the fact that A1 and W commute, yields a ˆ(0) + iλπ(I − λW )−1 A1 τ. ˆ(λ) = a
(3.28)
ˆ(λ), we have A(0) = In + a ˆ(0). Thus Recall that YA = A1 τ . Since A(λ) = In + a (3.28) yields (3.8). Proof of (3.9). We apply Proposition 3.3 with = b. In this case L−1 = B−1 and ˆ = ˆb(λ). Thus (3.26) yields (λ) ˆb(λ) = ˆb(0) − iλπB−1 (I − λV )−1 τ. Since ZB = −πB−1 and B(λ) = In + ˆb(λ), we see that (3.9) holds.
Proof of (3.10). We apply Proposition 3.3 with = c. In this case L0 = C0 and ˆ = cˆ(λ). The operator C0 commutes with W . Thus (3.27) yields (λ) eiλω cˆ(λ)
=
cˆ(0) − iλπ(I − λW )−1 C0 τ
=
cˆ(0) − iλπ(I − λW )−1 (I + C0 )τ − iλπ(I − λW )−1 τ.
C0 )τ , According to (3.21) the last term is equal to eiλω In −IIn . Recall that YC = (I +C and C(λ) = In + cˆ(λ). It follows that eiλω C(λ)
=
eiλω In + eiλω cˆ(λ)
=
eiλω In + cˆ(0) + iλπ(I − λW )−1 YC − eiλω In + In
=
C(0) + iλπ(I − λW )−1 YC ,
and (3.10) is proved.
248
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
Proof of (3.11). We apply Proposition 3.2 with m = d. In this case M0 = D0 and ˆ m(λ) ˆ = d(λ). Thus (3.22) yields ˆ = d(0) ˆ − iλπD0 (I − λV )−1 τ. e−iλω d(λ) Recall that ZD = −π(I + D0 ). Thus ˆ = d(0) ˆ + iλZ ZD (I − λV )−1 τ + iλπ(I − λV )−1 τ. e−iλω d(λ) ˆ According to (3.20) the last term is equal to In − e−iλω In . Since D(λ) = In + d(λ), we conclude that ˆ e−iλω D(λ) = e−iλω In + e−iλω d(λ) =
ˆ + iλZ In + d(0) ZD (I − λV )−1 τ
=
D(0) + iλZ ZD (I − λV )−1 τ,
which completes the proof of (3.11).
3.3. Quasi commutativity in operator form In this section we recall Proposition 3.2 from [10]. This proposition restates the quasi commutativity property (1.1) in operator form. To state the precise result (see Proposition 3.4 below) we need some preliminaries. First some additional notation and terminology. If a lower case letter f denotes a function in Ln×n (R), then we let the calligraphic letter F denote the 1 function defined by F (λ) = In +
∞
eiλs f (s) ds.
(3.29)
−∞
Furthermore, for each ν ∈ Z we let capital Fν be the convolution operator on Ln1 [0, ω] given by ω (F Fν ϕ)(t) = f (t − s + νω)ϕ(s) ds, 0 ≤ t ≤ ω. (3.30) 0
We call F the entire matrix function defined by f , and we shall refer to the operators Fν , ν ∈ Z, as the convolution operators corresponding to F . Now, let a, b, c, and d be the functions appearing in (1.2) and (1.3). We shall view a, b, c, d as functions in Ln×n (R), with a and d having their support in [0, ω], 1 while the support of b and c is in [−ω, 0]. Using the terminology introduced in the previous paragraph, the functions A, B, C, and D in (1.2), (1.3) are the entire matrix functions defined by the functions a, b, c, and d, respectively. Next, we consider the convolution operators corresponding to A, B, C, and D. Since a has its support in [0, ω], the convolution operators Aν , ν ∈ Z, corresponding to A have the following properties: t (i) (A0 ϕ)(t) = a(t − s)ϕ(s) ds, 0 ≤ t ≤ ω, 0 ω (ii) (A1 ϕ)(t) = a(t + ω − s)ϕ(s) ds, 0 ≤ t ≤ ω, t
(iii) Aν = 0 for ν = 0, ν = 1.
Bezout Integral Operator and Abstract Scheme
249
Similarly, since b has its support in [−ω, 0], the convolution operators Bν , ν ∈ Z, corresponding to B have the following properties: ω b(t − s)ϕ(s) ds, 0 ≤ t ≤ ω, (j) (B0 ϕ)(t) = t t (jj) (B−1 ϕ)(t) = b(t − s − ω)ϕ(s) ds, 0 ≤ t ≤ ω, 0
(jjj) Bν = 0 for ν = 0, ν = −1. Analogous results hold for the convolution operators Cν and Dν , ν ∈ Z, corresponding to C and D, respectively. Notice that the notations introduced in the previous paragraph are consistent with the notations used in (3.12)–(3.15). We are now ready to restate the quasi commutativity property in operator form. For the sake of completeness we repeat the proof given in [10]. Proposition 3.4. The quasi commutativity property (1.1) is equivalent to the following two conditions: (I + A0 )B−1 = C−1 (I + D0 ),
(I + C0 )D1 = A1 (I + B0 ).
(3.31)
Moreover, the identities in (3.31) imply A0 + B0 + A0 B0 + A1 B−1 = C0 + D0 + C0 D0 + C−1 D1 .
(3.32)
(R), we denote in Ln×n 1 on Ln1 (R) defined by
Proof. We begin with some additional notation. Given f by the bold face capital letter F the convolution operator ∞ (Fϕ)(t) = f (t − s)ϕ(s) ds, −∞ < t < ∞.
(3.33)
−∞
Notice that the function F given by (3.29) is the symbol of the operator I + F. With the convolution operator F we associate the block Laurent operator ⎤ ⎡ .. . ⎥ ⎢ ⎥ ⎢ F0 F−1 F−2 ⎥ ⎢ ⎥. ⎢ LF = ⎢ F1 F0 F−1 ⎥ ⎥ ⎢ F2 F1 F0 ⎦ ⎣ .. . Here Fν is the νth convolution operators corresponding to F , see (3.30). We consider LF as a bounded linear operator on the space 1, Z Ln1 [0, ω] . The latter space consists of all doubly infinite sequences ϕ = (ϕj )j∈ Z with ϕj ∈ Ln1 [0, ω] such that ϕ1, Z (Ln1 [0, ω]) :=
∞ j=−∞
ϕj Ln1 [0, ω] < ∞.
The spaces Ln1 (R) and 1, Z Ln1 [0, ω] are isometrically equivalent, and for f and g in Ln×n (R) we have 1 LFG = LF LG . (3.34)
250
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
Now let us consider the functions A, B, C, and D given by (1.2) and (1.3). Notice that A, B, C, and D are the symbols of the convolution operators I + A, I + B, I + C, and I + D, respectively. It follows that condition (1.1) is equivalent to (I + A)(I + B) = (I + C)(I + D),
(3.35)
which according to (3.34) can be rewritten as LA + LB + LA LB = LC + LD + LC LD .
(3.36)
Now recall the properties (i)–(iii) for the operator Aν , the properties (j)–(jjj) for the operators Bν , and the analogous properties for the operators Cν and Dν (ν ∈ Z). By comparing the entries in the infinite operator matrices determined by the leftand right-hand sides of (3.36) we see that (1.1) is equivalent to (α) B−1 + A0 B−1 = C−1 + C−1 D0 , (β) A0 + B0 + A0 B0 + A1 B−1 = C0 + D0 + C0 D0 + C−1 D1 (γ) A1 + A1 B0 = D1 + C0 D1 . Obviously, (α) is the same as the first part of (3.31), and (γ) is the same as the second part of (3.31). Thus to complete the proof we have to show that (3.31) implies condition (β). Consider the functions f = a + b + a ∗ b and g = c + d + c ∗ d, where ∗ (R). Then LF is equal to the left-hand denotes the convolution product in Ln×n 1 side of (3.36), and LG to the right-hand side of (3.36). The first part of (3.31) yields F−1 = G−1 , and the second part of (3.31) implies F1 = G1 . Now notice that F−1 = G−1 is equivalent to f (t) = g(t) for each −2ω ≤ t ≤ 0, and F1 = G1 is equivalent to f (t) = g(t) for each 0 ≤ t ≤ 2ω, i.e., F−1 = G−1
⇐⇒
f |[−2ω, 0] = g|[−2ω, 0] ,
F1 = G1
⇐⇒
f |[0, 2ω] = g|[0, 2ω] .
In particular, if (3.31) holds, then f |[−ω, ω] = g|[−ω, ω] , which is equivalent to F0 = G0 . But F0 = G0 is equivalent to (β). This completes the proof. For latter purposes we present the following lemma. Lemma 3.5. Let F be the entire matrix function defined by f ∈ Ln×n (R). Then 1 πϕ +
∞
πF Fν ϕ = F (0)πϕ,
ϕ ∈ Ln1 [0, ω],
(3.37)
ν=−∞
τx +
∞
Fν τ x = τ F (0)x,
x ∈ Cn .
ν=−∞
Here π and τ are the operators defined by (3.2) and (3.3), respectively.
(3.38)
Bezout Integral Operator and Abstract Scheme Proof. For ϕ ∈ Ln1 [0, ω] we have ∞
∞
πF Fν ϕ =
ν=−∞
=
ω
ν=−∞ νω ∞ ω −∞ 0 ω ∞
=
0
=
ω
0 ν=−∞ 0 ∞ (ν+1)ω
=
∞
f (t − s + νω)ϕ(s) ds dt
ω
f (t − s)ϕ(s) ds dt
0
f (t − s)ϕ(s) ds dt
f (t − s) dt ϕ(s) ds −∞ ω f (t) dt ϕ(s) ds
−∞
=
251
0
F (0)πϕ − πϕ,
which proves (3.37). The proof of (3.38) is similar.
3.4. Intertwining properties In this section we prove two propositions about intertwining relations between T and the data from the co-realizations (3.8)–(3.11). Proposition 3.6. Assume the quadruple {A, C; B, D} satisfies the quasi commutativity property. Then the Bezout integral operator T associated with {A, C; B, D} satisfies the equation YC ZD , (3.39) W T − T V = iY YA ZB − iY where YA , YC , ZB , ZD are the operators on Ln1 [0, ω] defined by (3.12)–(3.15). Furthermore, πT
= A(0)ZB − C(0)Z ZD ,
(3.40)
Tτ
= YC D(0) − YA B(0).
(3.41)
Proof. Recall that the Bezout integral operator T is given by (1.4) and (1.5). Using these formulas and the notations introduced in the previous section we see that T = (I + A0 )(I + B0 ) − C−1 D1 .
(3.42)
On the other hand, the quasi commutativity property implies that A0 + B0 + A0 B0 + A1 B−1 = C0 + D0 + C0 D0 + C−1 D1 . It follows that the Bezout integral operator is also given by T = (I + C0 )(I + D0 ) − A1 B−1 .
(3.43)
Next, notice that the operator A1 and C0 belong to the class N , and the operators B−1 and D0 to the class P. Thus (see the final paragraph of Section 3.1)
252
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
the operators A−1 and C0 commute with W and the operators B−1 and D0 with V . Using (3.43) this yields WT
= (I + C0 )W (I + D0 ) − A1 W B−1 ,
TV
= (I + C0 )V (I + D0 ) − A1 V B−1 .
Since W − V = iτ π we see that W T − V T = i(I + C0 )τ π(I + D0 ) − iA1 τ πB−1 .
(3.44)
Now, recall that YA = A1 τ,
YC = (I + C0 )τ,
ZB = −πB−1 ,
ZD = −π(I + D0 ).
(3.45)
By using these identities in (3.44) we obtain (3.39). To prove (3.40) we first note that (3.37) applied to the matrix functions a and c yields the following two identities: π(I + A0 ) + πA1 = A(0)π,
π(I + C0 ) + πC−1 = C(0)π.
Thus πT
= π(I + C0 )(I + D0 ) − πA1 B−1 = C(0)π(I + D0 ) − πC−1 (I + D0 ) − A(0)πB−1 + π(I + A0 )B−1 .
Since the quadruple {A, C; B, D} satisfies the quasi commutativity property, the first identity in (3.31) holds true. This yields that πT = C(0)π(I + D0 ) − A(0)πB−1 . But then we can use the third and fourth identity in (3.45) to show that (3.40) holds. To prove (3.41) we apply (3.38) to the functions b and d. This yields (I + B0 )τ + B−1 τ = τ B(0),
(I + D0 )τ + D1 τ = τ D(0).
By using this together with the second identity in (3.31) we obtain Tτ
= (I + C0 )(I + D0 )τ − A1 B−1 τ = (I + C0 )τ D(0) − (I + C0 )D1 τ − A1 τ B(0) + A1 (I + B0 )τ = (I + C0 )τ D(0) − A1 τ B(0).
But then we can use the first two identities in (3.45) to show that (3.41) holds. Next, we assume that the matrices A(0) and D(0) are non-singular. This allows us to introduce the operators WA× = W − iY YA A(0)−1 π,
VD× = V − iτ D(0)−1 ZD .
Notice that WA× and VD× are finite rank perturbations of Thus both WA× and VD× are compact operators. Using the
(3.46)
W and V , respectively. operators WA× and VD× , we can now give a first description of the kernel of the Bezout integral operator T .
Bezout Integral Operator and Abstract Scheme
253
Proposition 3.7. Assume that the quadruple {A, C; B, D} satisfies the quasi commutativity property, and let the matrices A(0) and D(0) be non-singular. Then the Bezout integral operator T associated with {A, C; B, D} satisfies the intertwining relation (3.47) WA× T = T VD× , and the null space of T is equal to the maximal VD× -invariant subspace contained in Ker πT . Here WA× and VD× are the compact operators defined by (3.46). Proof. From (3.40) it follows WA× T = W T − iY YA (0)−1 πT = W T − iY YA ZB + iY YA A(0)−1 C(0)Z ZD . Similarly, (3.41) yields T VD× = T V − iT τ D(0)−1 ZD = T V − iY YC ZD + iY YA B(0)D(0)−1 ZD . Since {A, C; B, D} satisfies the quasi commutativity property, we have A(0)B(0) = C(0)D(0), and hence A(0)−1 C(0) = B(0)D(0)−1 . Thus WA× T − T VD× = W T − T V − iY YA ZB + iY YC ZD = 0, because of (3.39). Thus (3.47) holds. Next, we prove the statement about the null space of T . From (3.47) it follows that Ker T is invariant under VD× . Obviously, Ker T ⊂ Ker πT . Now, let M be the maximal VD× -invariant subspace in Ker πT . Since M is maximal, it suffices to show that M ⊂ Ker T . Take f ∈ M. Then πT f = 0, and hence WA× T f = (W − iY YA A(0)−1 π)T f = W T f. Since M is invariant under
VD× ,
we have
(V VD× )k f
T (V VD× )k f = W k T f,
(3.48)
∈ M for each k. It follows
k = 0, 1, 2, . . . .
(3.49)
Indeed, using (3.48) with (V VD× )k−1 f in place of f , we obtain × k−1 × k−1 × k−1 VD ) f = WA× T (V VD ) f = W T (V VD ) f . T (V VD× )k f = T VD× (V Since T VD× f = WA× T f = W T f by (3.48), the above calculation shows that we can prove (3.49) by induction. From (3.49) we see that × k VD ) f = 0, k = 0, 1, 2, . . . . πW k T f = πT (V But then π(I − λW )−1 T f = 0 for each λ ∈ C. Using (3.17) it is straightforward to show that π(I − λW )−1 g = gˆ(λ), λ ∈ C, where gˆ is the Fourier transform of g ∈ Ln1 [0, ω]. Thus π(I − λW )−1 T f = 0 implies that Tf (λ) = 0 for each λ ∈ C. Hence T f = 0. This proves that M ⊂ Ker T .
254
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
3.5. Proof of the first main theorem on the Bezout integral operator In this section we shall prove Theorem 1.1. We split the proof into two parts. In the first part we show that without loss of generality we can assume that the matrices A(0), B(0), C(0), D(0) are non-singular. In the second part we assume that this non-singularity condition is satisfied, and we apply Theorem 2.6 and Proposition 3.7 to complete the proof. Part 1. For α ∈ R let Aα , Bα , Cα , Dα be the n × n matrix functions which one obtains when in formulas (1.2)–(1.3) the functions a(s), b(s), c(s), d(s) are replaced by eiαs a(s), eiαs b(s), eiαs c(s), eiαs d(s), respectively. In other words Aα (λ) = A(α + λ), Cα (λ) = C(α + λ),
Bα (λ) = B(α + λ), Dα (λ) = D(α + λ).
(3.50) (3.51)
Let Tα be the Bezout integral operator associated with {Aα , Cα ; Bα , Dα }. Then formulas (1.4), (1.5) show that Tα = Mα T M−α , where Mα is the operator on Ln1 [0, ω] defined by (M Mα f )(t) = eiαt f (t),
0 ≤ t ≤ ω.
(3.52)
Since α ∈ R, we have |e | = 1 for each t. It follows that Mα is an invertible bounded linear operator on Ln1 [0, ω] and Mα−1 = M−α . Thus T and Tα are similar operators, and hence dim Ker T = dim Ker Tα . From the identities in the righthand sides of (3.50) and (3.51) it is clear that the total common multiplicity of B and D is equal to the total common multiplicity of Bα and Dα . In other words, ν(B, D) = ν(Bα , Dα ). Thus iαt
dim Ker T = ν(B, D) ⇐⇒ dim Ker Tα = ν(Bα , Dα ). From the identities in (3.50) and (3.51) it is also clear that {A, C; B, D} has the quasi commutativity property if and only if this property is satisfied for the quadruple {Aα , Cα ; Bα , Dα }. The above results show that it suffices to prove Theorem 1.1 for some quadruple {Aα , Cα ; Bα , Dα } in place of {A, C; B, D}. We claim that we can choose α in such a way that the values of Aα , Bα , Cα , Dα at zero are non-singular matrices. To see this, we first note that Aα (0) = A(α),
Bα (0) = B(α),
Cα (0) = C(α),
Dα (0) = D(α).
Next, since the functions a, b, c, d have their support in a finite interval, the Riemann-Lebesgue lemma shows that for α ∈ R, α → ∞, the values Aα , Bα , Cα , Dα tend to the n × n identity matrix. Thus Aα (0), Bα (0), Cα (0), Dα (0) are all non-singular for α ∈ R and α sufficiently large. Part 2. In this part we assume that the values of A, B, C, D at zero are nonsingular. This allows us to introduce the functions FB (λ) = B(0)−1 B(λ) and FD (λ) = e−iλω D(0)−1 D(λ). In other words, using the representations (3.9) and
Bezout Integral Operator and Abstract Scheme
255
(3.11) we have FB (λ)
=
In + iλB(0)−1 ZB (I − λV )−1 τ,
FD (λ)
=
In + iλD(0)−1 ZD (I − λV )−1 τ.
Fα (λ), the common eigenvalues of Since B(λ) = B(0)F FB (λ) and D(λ) = eiλω D(0)F B and D are the same as those of FB and FD . Furthermore, if x0 , . . . , xr−1 is a common Jordan chain of B and D, then it is a common Jordan chain of FB and FD , and conversely. It follows that ν(B, D) = ν(F FB , FD ). To compute ν(F FB , FD ) we apply Theorem 2.6. the This requires to determine largest VD× -invariant subspace contained in Ker iB(0)−1ZB − iD(0)−1 ZD . To do this, we first show that Ker iB(0)−1 ZB − iD(0)−1 ZD = Ker πT. (3.53) Indeed, using the quasi commutativity property and (3.40) we have iB(0)−1 ZB − iD(0)−1 ZD = iD(0)−1 D(0)B(0)−1 ZB − ZD = iD(0)−1 A(0)−1 C(0)ZB − ZD = iD(0)−1 A(0)−1 C(0)ZB − A(0)Z ZD =
−iD(0)−1 A(0)−1 πT,
which proves (3.53) Let M be the largest VD× -invariant subspace contained in the null space of iB(0)−1 ZB −iD(0)−1 ZD . Using (3.53) and Proposition 3.7, we see that M = Ker T . Since T is of the form I + Γ, with Γ a compact operator, dim Ker T < ∞, and hence dim M < ∞. Thus by Theorem 2.6, dim Ker T = dim M = ν(F FB , FD ) = ν(B, D), provided VD× is injective. Thus to complete the proof it remains to show that VD× f = 0 implies f = 0. To do this, recall that VD× = V − iτ D(0)−1 ZD . Hence, using (3.15), the hypotheses VD× f = 0 implies that t f (s) ds = D(0)−1 π(I + D0 )f, 0 ≤ t ≤ ω. 0
The right-hand side in the previous identity does not depend on t. Hence f (s) = 0 a.e. on [0, ω], and therefore f = 0. Thus VD× is injective. This completes the proof of Theorem 1.1. At this stage, using the second part of Theorem 2.6 and the arguments used in the above proof, we could also prove Theorem 4.6. However, we prefer first to clarify the general scheme underlying the definition of the Bezout integral operator.
256
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
Example. We conclude with an example showing that Theorem 1.1 does not remain true when the quasi commutativity property is not fulfilled. For this purpose, take n = 1, and let a, d ∈ L1 [0, 1], and b, c ∈ L1 [−1, 0] be given by a(t) = 0, b(−t) = −1, c(−t) = 0, d(t) = −1,
0 ≤ t ≤ 1.
(3.54)
With this choice of a, b, c, d, and ω = 1, we let T be the operator on L1 [0, 1] defined by (1.4) and (1.5). One computes that the action of T is given by 1 (T ϕ)(t) = ϕ(t) − ϕ(s) ds, 0 ≤ t ≤ 1. t
Hence T is an invertible operator on L1 [0, 1]. Next consider the functions A, B, C, D defined by (1.2) and (1.3), where a, b, c, d are as above (and ω = 1). It follows that for each λ ∈ C we have eiλ − 1 1 − e−iλ , C(λ) = 1, D(λ) = 1 − . iλ iλ Obviously, in this case the quasi commutativity property (1.1) is not satisfied. Now let us show that in this case the conclusion of Theorem 1.1 does not hold. Since the derivative of ez at zero is equal to one, we have D(0) = 1 − 1 = 0. Similarly, B(0) = 1 − 1 = 0. Thus 0 is a common zero of B and D. Hence the total multiplicity ν(B, D) is positive, and therefore strictly larger than dim Ker T , which is equal to zero. We conclude that in this case the result of Theorem 1.1 does not hold. Note that in this scalar case we can use the result of [9] to express ν(B, D) as the dimension of the null space of a suitable Bezout integral operator. In fact, this would mean to keep b and d as in (3.54), and to take a = d and c = b. With this choice of a, b, c, d the quasi commutativity property is trivially satisfied. The corresponding Bezout integral operator T is now given by 1 ϕ(s) ds, 0 ≤ t ≤ 1. (T ϕ)(t) = ϕ(t) − A(λ) = 1, B(λ) = 1 −
0
For this operator T we have dim Ker T = ν(B, D), and this number is equal to one.
4. A general scheme for defining Bezout operators In this chapter we present a refinement of the general scheme from [17] for defining Bezout operators. The first section has a preliminary character. The main result of this chapter is Theorem 4.3 in Section 4.2. This theorem includes a description of the null space of an abstract Bezout operator in terms of a certain invariant subspace. In the third section we show that the abstract Bezout operator defined in Theorem 4.3 is the closure of a Bezout operator in the sense of [17]. In the final section we show that the Bezout integral operator defined in Chapter 1 is also a Bezout operator according to the general scheme.
Bezout Integral Operator and Abstract Scheme
257
4.1. A preliminary proposition In this section we prove a proposition which can be viewed as a generalization of the classical state space similarity theorem from mathematical system theory. In the next section the proposition will be used to define an abstract Bezout operator. Consider two operator-valued functions H1 and H2 given in the following form: H1 (λ) = IU + λC1 (IIX1 − λA1 )−1 B1 , H2 (λ) = IU + λC C2 (IIX2 − λA2 )−1 B2 . (4.1) Here U, X1 , X2 are complex Banach spaces, IU , IX1 , IX2 are the identity operators on the corresponding spaces, and Aj : Xj → Xj ,
Bj : U → Xj ,
Cj : Xj → U
(j = 1, 2)
are bounded linear operators. We refer to the representations in (4.1) of H1 and H2 as co-realizations. If in (4.1) the variable λ is replaced by λ−1 , then the formulas in (4.1) yield the usual realizations which are known from mathematical system theory (see, e.g., the books [5, 7]). Let T (X X1 → X2 ) be a closed linear operator with domain D(T ) in X1 and range in X2 , and let L be a linear submanifold of X1 . We say L is a core for T if L is contained in D(T ) and T is equal to the closure of the restriction T |L. In what follows, for j = 1, 2, Cj |Aj ) = Im (Aj |Bj ) = span{Anj Bj U | n ≥ 0}, Ker (C
∞ )
Ker Cj Anj .
n=0
We are now ready to state the proposition. C2 |A2 ) = {0}. Then H1 Proposition 4.1. Put R = Im (A1 |B1 ), and assume Ker (C and H2 coincide in a neighborhood of zero if and only if there exists a closed linear operator T (X X1 → X2 ) such that R is a core for T , T B1 = B2 ,
T A1 x = A2 T x,
C2 T x = C1 x
(x ∈ D(T )).
(4.2)
Moreover, in that case T is uniquely determined, and Ker T is the maximal A1 invariant subspace of X1 contained in D(T ) ∩ Ker C1 . Finally, the second identity in (4.2) includes the statement that D(T ) is invariant under the operator A1 . If we assume that the co-realization for H1 and H2 are minimal, that is, if Cj |Aj ) = {0}, then for j = 1, 2 the space Im (Aj |Bj ) is dense in Xj and Ker (C Proposition 4.1 reduces to the state space similarity theorem for (possibly infinitedimensional) systems with bounded coefficients, and in this case the operator T is known as a pseudo-similarity (see [19], Theorem 3b.1, [4], Theorem 3.2, and [2], Proposition 6). The proof of Proposition 4.1 given below follows that of the state space similarity theorem. In general, without the core condition, the operator T in Proposition 4.1 is not unique; this follows from Section 3.3 in [3].
258
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
Proof of Proposition 4.1. Let T (X X1 → X2 ) be a closed linear operator such that R is contained in D(T ), and assume (4.2) holds. We show that H1 and H2 coincide in a neighborhood of zero. Since R is invariant under A1 and R ⊂ D(T ), the second identity in (4.2) implies that T Aj1 x = Aj2 T x (x ∈ R).
(4.3)
Now, take u ∈ U. Then, using (4.2) and (4.3), we have C2 An2 B2 u = C2 An2 T B1 u = C2 T An1 B1 u = C1 An1 B1 u,
n = 0, 1, 2, . . . .
Thus at zero the functions H1 and H2 have the same Taylor coefficients, and therefore H1 and H2 coincide in a neighborhood of zero. Next, let T (X X1 → X2 ) be another closed linear operator such that R is contained in D(T ) and (4.2) holds with T in place of T . Then (4.3) holds with T in place of T , and hence T An1 B1 = An2 T B1 = An2 B2 = An2 T B1 = T An1 B1 ,
n = 0, 1, 2, . . . .
It follows that T and T coincide on R. But then the closures of T |R and T |R are equal too. This proves the uniqueness statement. In the remaining part we assume that H1 and H2 coincide in a neighborhood of zero. This assumption is equivalent to the requirement that C1 An1 B1 = C2 An2 B2 ,
n = 0, 1, 2, . . . .
(4.4)
We shall use (4.4) to construct T and to describe its kernel. x1 Let X1 ⊕ X2 be the Banach space consisting of all pairs , x1 ∈ X1 and x2 x2 ∈ X2 , with the norm being given by x1 = x1 + x2 . x2 Consider the operators A : X1 ⊕ X2 → X1 ⊕ X2 , B : U → X1 ⊕ X2 , C : X1 ⊕ X2 → U,
x1 A1 x1 = , x2 A2 x2 B1 u Bu = , B2 u x1 C = C1 x1 − C2 x2 . x2
A
The operators A, B, and C are bounded linear operators. Introduce the space n =/ G n≥0 Ker CA . Obviously, G is a closed linear submanifold of X1 ⊕ X2 . The is a graph space, that is, fact that Ker (C C2 |A2 ) = {0} implies that G 0 =⇒ (4.5) ∈G = x2 = 0. x2 Next, consider the space G0 = Im (A|B) = span{An Bu | n = 0, 1, 2, . . . , u ∈ U}.
Bezout Integral Operator and Abstract Scheme
259
Condition (4.4) is equivalent to the statement that CAn B = 0 for n = 0, 1, 2, . . .. Let G be the closure of G0 in X1 ⊕ X2 . Then Hence (4.4) implies that G0 ⊂ G. Hence G is a closed linear submanifold of X1 ⊕ X2 and a graph space by G ⊂ G. (4.5). Thus there exists a closed linear operator T with domain D(T ) in X1 and range in X2 such that 6 x 7 G = G(T ) = | x ∈ D(T ) . Tx We claim that T has the desired properties. Notice that Im B ⊂ G0 ⊂ G. Hence B1 = T B2 . From the definition of G0 it follows that G0 is invariant under A. But A is bounded, and thus, by continuity, the space G is also invariant under A. This shows that A1 D(T ) ⊂ D(T ) and T A1 x = A2 T x for each x ∈ D(T ). Since we have G ⊂ Ker C, and thus C1 x = C2 T x for each x ∈ D(T ). Finally, G ⊂ G, note that 6 m 7 G0 = | m ∈ Im (A1 |B1 ) . Tm Since G = G(T ) is the closure of G0 , this shows that R is a core for T . We conclude with the description of Ker T . Obviously, Ker T is closed and contained in D(T ). The third identity in (4.2) implies that Ker T ⊂ Ker C1 , and from the second identity in (4.2) we conclude that Ker T is invariant under A1 . Thus Ker T is an A1 -invariant subspace contained in D(T ) ∩ Ker C1 . Let N be an arbitrary A1 -invariant subspace of X , contained in the intersection D(T ) ∩ Ker C1 . To finish the proof it suffices to show that N ⊂ Ker T . Take x ∈ N . Then x ∈ D(T ), and we see from (4.2) and (4.3) that C2 Ak2 T x = C2 T Ak1 x = C1 Ak1 x,
k = 0, 1, 2, . . . .
On the other hand, N is ⊂ Ker C1 for each k, that 0 for k = 0, 1, 2, . . .. But is, C1 Ak1 x = 0 for k = Ker (C C2 |A2 ) = {0}, and hence T x = 0. We proved that x ∈ Ker T , and therefore N ⊂ Ker T . A1 -invariant. Thus Ak1 x ∈ N 0, 1, 2, . . .. Thus C2 Ak2 T x =
Corollary 4.2. Put R = Im (A1 |B1 ), and assume that Ker (C C2 |A2 ) = {0}. Let T (X X1 → X2 ) be a closed linear operator such that R ⊂ D(T ) and (4.2) holds. Then for |λ| and |µ| sufficiently small we have H2 (λ) − H1 (µ) = (λ − µ)C C2 (IIX2 − λA2 )−1 T (IIX1 − µA1 )−1 B1 .
(4.6)
The previous identity includes the statement that for |µ| sufficiently small the set (IIX1 − µA1 )−1 B1 U is contained in D(T ). Proof. First we use the second identity in (4.2) to prove that for |µ| sufficiently small we have (IIX1 − µA1 )−1 D(T ) ⊂ D(T ), −1
T (IIX1 − µA1 )
and
(4.7) −1
x = (IIX2 − µA2 )
T x (x ∈ D(T )).
260
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
To see this, let x ∈ D(T ), and take |µ| < (A1 + A2 )−1 . Then N
lim
N →∞
µn Anj x = (IIXj − µAj )−1 x,
j = 1, 2.
n=0
Since x ∈ D(T ), the second identity in (4.2) yields lim T
N →∞
N
N µn An1 x = lim µn An2 T x = (IIX2 − µA2 )−1 T x. N →∞
n=0
n=0
But T is closed. Thus the above formulas prove (4.7). Recall that B1 U ⊂ R ⊂ D(T ). Thus, as a corollary of (4.7), we have (IIX1 − µA1 )−1 B1 U ⊂ D(T ),
T (IIX1 − µA1 )−1 B1 = (IIX2 − µA2 )−1 B2 .
(4.8)
for |µ| sufficiently small. Next, we consider H1 (λ) − H2 (µ). By Proposition 4.1 the functions H1 (λ) and H2 (λ) coincide in a neighborhood of zero. Hence C1 (IIX1 − λA1 )−1 B1 is equal to C2 (IIX2 − λA2 )−1 B2 for |λ| sufficiently small. This, together with the resolvent formula, yields H2 (λ) − H1 (µ)
= λC C2 (IIX2 − λA2 )−1 B2 − µC1 (IIX1 − µA1 )−1 B1 = λC C2 (IIX2 − λA2 )−1 B2 − µC C2 (IIX2 − µA2 )−1 B2 = (λ − µ)C C2 (IIX2 − λA2 )−1 (IIX2 − µA2 )−1 B2
for |λ| and |µ| sufficiently small. Using the second equality in (4.8), we obtain (4.6). For later purposes we note that the second identity in (4.2) also shows that (IIX1 − µA1 )−k Aj1 B1 U ⊂ D(T ),
and
(4.9)
T (IIX1 − µA1 )−k Aj1 B1 = (IIX2 − µA2 )−k Aj2 B2 ,
j, k = 0, 1, 2, . . . .
Here |µ| is assumed to be sufficiently small. Indeed, this follows directly from (4.7) using the fact that Aj1 B1 U is contained in D(T ) and T Aj1 B1 u = Aj2 B2 u for each u ∈ U and each j. 4.2. Definition of an abstract Bezout operator In this section we state and prove the theorem that we shall use to define an abstract Bezout operator. Consider the following four operator-valued functions: Fj (λ) = IU + λC Cj (IIX − λA)−1 B,
j = 1, 2,
(4.10)
−1 Bj , Gj (λ) = IU + λC(IIX − λA)
j = 1, 2.
(4.11)
Bezout Integral Operator and Abstract Scheme
261
Here U, X and X are Banach spaces, IU , IX and IX are the identity operators on U, X and X , respectively, and A : X → X,
: X → X , A
B : U → X, B1 , B2 : U → X ,
C1 , C2 : X → U, C : X → U
are bounded linear operators. We shall prove the following theorem.
= {0}. Then the funcTheorem 4.3. Put R = Im (A|B), and assume Ker (C|A) F1 (·) and G2 (·)F F2 (·) coincide in a neighborhood of zero if and only if tions G1 (·)F there exists a closed linear operator T (X → X ) such that R is a core for T and
x = B2 C2 x − B1 C1 x, x ∈ D(T ), (i) T Ax − AT (ii) T B = B2 − B1 (iii) CT x = C1 x − C2 x, x ∈ D(T ). Moreover, in that case the operator T is uniquely determined and Ker T is the C2 ). Finally, maximal (A−BC1 )-invariant subspace contained in D(T )∩Ker (C1 −C item (i) includes the statement that D(T ) is invariant under the operator A. Notice that Theorem 4.3 contains Proposition 4.1 as a special case. Indeed, Proposition 4.1 appears when in Theorem 4.3 we take F1 = H1 , G2 = H2 , C2 = 0, and B1 = 0. Then G1 (λ)F F1 (λ) = H1 (λ) and G2 (λ)F F2 (λ) = H2 (λ), and the statements (i)–(iii) reduce to (4.2). On the other hand, as we shall see below, Theorem 4.3 is an immediate corollary of Proposition 4.1. We shall refer to the operator T defined by Theorem 4.3 as the abstract Bezout operator associated to the operator-valued functions {G1 , G2 ; F1 , F2 } and the co-realizations (4.10) and (4.11). The use of this terminology will be justified in the next section. First we prove Theorem 4.3. Proof of Theorem 4.3. For λ in an appropriate neighborhood of zero, put F2 (λ)−1 , H1 (λ) = F1 (λ)F
H2 (λ) = G1 (λ)−1 G2 (λ).
Then G1 (·)F F1 (·) and G2 (·)F F2 (·) coincide in a neighborhood of zero if and only if H1 and H2 coincide in a neighborhood of zero. Using the co-realizations for F2 and G1 , we have F2 (λ)−1 G1 (λ)−1
−1 = IU − λC C2 (IIX − λA× B, A× C2 , 2 ) 2 = A − BC × ×
)B1 , A
=A
− B1 C. = IU − λC(II − λA X
1
1
A straightforward calculation then yields H1 (λ) H2 (λ)
−1 = IU + λ(C1 − C2 )(IIX − λA× B, 2 ) × −1
) (B2 − B1 ). = IU + λC(II − λA X
1
Next, using a standard feedback argument, we have Im (A× 2 |B) = Im (A|B),
× ) = Ker (C|A).
Ker (C|A 1
(4.12) (4.13)
262
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
× It follows that R = Im (A× 2 |B) and Ker (C|A1 ) = {0}. Thus we can apply PropoF1 (·) and G2 (·)F F2 (·) coincide in a neighborhood of sition 4.1 to show that G1 (·)F zero if and only if there exists a closed linear operator T (X → X ) such that R is a core for T and
× (a) T A× x ∈ D(T ), 2 x = A1 T x, (b) T B = B2 − B1 , (c) CT x = C1 x − C2 x, x ∈ D(T ). Moreover there is only one such T , and its null space is the maximal A× 2 -invariant × C2 . subspace of X contained in D(T ) ∩ Ker (C1 − C2 ). Here A2 = A − BC Note that (b) and (c) coincide with (ii) and (iii), respectively. Hence to complete the proof it remains to show that (a) and (i) are equivalent whenever (b),(c) or (ii), (iii) are satisfied. To do this, notice that Im B ⊂ R ⊂ D(T ), and hence the C2 implies that identity A× 2 = A − BC A× 2 D(T ) ⊂ D(T )
⇐⇒
AD(T ) ⊂ D(T )
T A× 2x
In particular, is well defined for each x ∈ D(T ) if and only if T Ax is well C2 x for each defined for each x ∈ D(T ), and in that case T A× 2 x = T Ax − T BC × ×
x ∈ D(T ). From A1 = A − B1 C it follows that A1 T x = AT x − B1 CT x for each x ∈ D(T ). Finally, using (b),(c) or, equivalently (ii), (iii), we obtain B2 C2 x − B1 C1 x
= (B2 − B1 )C C2 x − B1 (C1 − C2 )x = T BC C2 x − B1 CT x, x ∈ D(T ).
With these remarks the equivalence of (a) and (i) is clear.
4.3. The Haimovici-Lerer scheme for defining an abstract Bezout operator In this section we show that the operator T appearing in Theorem 4.3 is the closure of a Bezout operator in the sense of [17]. To do this we first briefly describe the general set-up from the latter paper. In [17] the construction of a Bezout operator requires that two basic assumptions are satisfied. The first is that along with a pair of given analytic operatorvalued functions F1 and F2 one has two other analytic operator-valued functions G1 and G2 such that F2 (λ) − G1 (λ)F F1 (λ) = 0 (4.14) G2 (λ)F in some open domain Ω. In our setting Ω is a sufficiently small neighborhood of zero, and (4.14) is the condition that G1 (·)F F1 (·) and G2 (·)F F2 (·) coincide in a neighborhood of zero. The second assumption is that we have co-realizations of F1 and F2 as in (4.10), and co-realizations of G1 and G2 as in (4.11). These two basic assumptions are not artificial and, in general, it is quite straightforward to satisfy these assumptions. For instance, as has been shown in Section 1.2 in [17], given a pair of analytic operator-valued functions F1 and F2 one can always construct another pair G1 and G2 such that (4.14) holds. Furthermore, as we mentioned in the last but one paragraph of the introduction, in many concrete cases the supplementary functions appear in a natural way or can be
Bezout Integral Operator and Abstract Scheme
263
constructed from the first pair of functions. Also, as follows from the realization result in Section 2.3 (see also Section 1.3 in [17]), co-realizations as in (4.10) and (4.11) always exist for matrix functions that are analytic in a neighborhood of the origin. Now let the two basic assumptions be satisfied with co-realizations as in (4.10) and (4.11). Assuming additionally that
= {0}, Im (A|B) is dense in X , and Ker (C|A) it is shown in [17] that there exists a unique (possibly unbounded) operator THL with domain D(T THL ) = span{(IIX − λA)−k Aj BU | λ ∈ Ω, j, k = 0, 1, 2, . . .},
(4.15)
such that
−1 THL (IIX − µA)−1 B, G1 (λ)F F1 (µ) − G2 (λ)F F2 (µ) = (µ − λ)C(IIX − λA)
(4.16)
for |λ| and |µ| sufficiently small. In [17] this operator is called the Bezout operator associated with the realizations (4.10), (4.11), and the equality (4.16). In [17], it is also shown that all known concrete Bezout operators can be derived from this general scheme as particular cases. The next proposition clarifies the connection between the operator THL and the abstract Bezout operator introduced in the previous section. Proposition 4.4. Let F1 , F2 , G1 , G2 be the operator-valued functions given by (4.10), (4.11), and let G1 (·)F F1 (·) and G2 (·)F F2 (·) coincide in a neighborhood of zero.
= {0}. Then the abstract Assume that Im (A|B) is dense in X and Ker (C|A) Bezout operator T associated to {G1 , G2 ; F1 , F2 } is equal to the closure of the operator THL . Proof. First we apply Corollary 4.2 to H1 (λ) = F1 (λ)F F2 (λ)−1 and H2 (λ) = −1 G1 (λ) G2 (λ). Using the identities (4.12) and (4.13) and assuming |λ| and |µ| to be sufficiently small, this yields
× )−1 T (IIX − µA× )−1 B. H2 (λ) − H1 (µ) = (λ − µ)C(II − λA X
1
2
It follows that G1 (λ)F F1 (µ) − G2 (λ)F F2 (µ) = G1 (λ)[H1 (µ) − H2 (λ)]F F2 (µ)
× )−1 T (IIX − µA× )−1 BF F2 (µ). = (µ − λ)G1 (λ)C(IIX − λA 1 2
× = A
− B1 C and A× = A − BC A straightforward calculation, using A C2 , shows 1 2 that
× )−1 = C(II − λA)
−1 , G1 (λ)C(II − λA X
(IIX −
1
−1 µA× BF F2 (µ) 2)
X
= (IIX − µA)−1 B.
We conclude that for |λ| and |µ| sufficiently small we have
−1 T (IIX − µA)−1 B. G1 (λ)F F1 (µ) − G2 (λ)F F2 (µ) = (µ − λ)C(IIX − λA)
264
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
In particular, the set (IIX − µA)−1 BU is contained in D(T ). From (4.9) it follows that D(T THL ) is also a subset of D(T ). Thus the above equality implies that T |D(TTHL ) = THL . On the other hand, in [17] it is proved that Im (A|B) is contained in D(T THL ). Since Im (A|B) is a core for T , this implies that T is the closure of THL . In spite of the remarks made in the third paragraph of this section, the implementation of the abstract scheme in a specific case may not be an easy task. For instance, getting a suitable Bezout operator for a given pair of concrete matrix functions is not always straightforward. Also, to see that a given operator is actually a Bezout operator requires to identify the corresponding matrix functions and to get appropriate co-realizations yielding the given operator as a Bezout operator in the sense of the abstract scheme. The next section illustrates the latter point for the Bezout integral operator studied in this paper. 4.4. The Bezout integral operator revisited In this section we show that the Bezout integral operator introduced in the first chapter is a Bezout operator according the general scheme presented in Section 4.2. Let A, B, C, D be the entire n × n matrix functions given by (1.2) and (1.3). Assume that the quasi commutativity property (1.1) is satisfied, and let the matrices A(0), B(0), C(0), D(0) be non-singular. Put E = A(0)B(0) = C(0)D(0). Notice that E is invertible, and that C(0)−1 ED(0)−1 and A(0)−1 EB(0)−1 are both equal to the n × n identity matrix In . Consider the following matrix functions FB (λ) = In + iλB(0)−1 ZB (I − λV )−1 τ, −1
FD (λ) = In + iλD(0)
−1
ZD (I − λV )
(4.17)
τ,
(4.18)
GA (λ) = In + iλE −1 π(I − λW )−1 YA A(0)−1 E, GC (λ) = In + iλE
−1
−1
π(I − λW )
−1
YC C(0)
E.
(4.19) (4.20)
By comparing (4.17)–(4.20) with (3.8)–(3.11), and using the properties of E, we see that FB (λ) = B(0)−1 B(λ),
FD (λ) = e−iλω D(0)−1 D(λ),
GA (λ) = E −1 A(λ)A(0)−1 E,
GC (λ) = E −1 eiλω C(λ)C(0)−1 E.
Since C(0)−1 ED(0)−1 = A(0)−1 EB(0)−1 = In , the above identities can be used to show that FB (λ) = E −1 A(λ)B(λ), GA (λ)F
GC (λ)F FD (λ) = E −1 C(λ)D(λ).
Thus quasi commutativity property (1.1) implies that FB (λ) = GC (λ)F FD (λ), GA (λ)F
λ ∈ C.
In particular, the functions GA (·)F FB (·) and GC (·)F FD (·) coincide in a neighborhood of zero.
Bezout Integral Operator and Abstract Scheme
265
Proposition 4.5. Let A, B, C, D be the entire n × n matrix functions given by (1.2) and (1.3). Assume that the quasi commutativity property (1.1) is satisfied, and let the matrices A(0), B(0), C(0), D(0) be non-singular. Then the Bezout integral operator T defined by (1.4), (1.5) is equal to the abstract Bezout operator associated to the matrix functions {GA , GC ; FB , FD } and the co-realizations (4.17)–(4.20). Proof. If in (4.10) and (4.11), we take
= W, A = V, A B1 = YA A(0)−1 E,
B = τ, C = iE −1 π,
C1 = iB(0)−1 ZB ,
B2 = YC C(0)−1 E, C2 = iD(0)−1 ZD ,
then F1 = FB , F2 = FD , G1 = GA , G2 = GC , and G1 (·)F F1 (·) and G2 (·)F F2 (·) coincide in a neighborhood of zero.
= {0}. We claim that Im (A|B) is dense in X = X = Ln1 [0, ω], and Ker (C|A) First notice that for j = 0, 1, 2, . . . we have (−it)j x (x ∈ Cn ). (4.21) j! Since the polynomials are dense in L1 [0, ω], this implies that Im (A|B) is dense
if and only if πW n f = 0 for in X = Ln1 [0, ω]. Next, observe that f ∈ Ker (C|A)
implies n = 0, 1, 2, . . .. Thus f ∈ Ker (C|A) fˆ(λ) = π(I − λW )−1 f = 0, λ ∈ C, Aj B = V j τ,
and (V j τ x)(t) =
= {0}. which yields f = 0. Thus Ker (C|A) Since the Bezout integral operator T is a bounded operator on Ln1 [0, ω], the fact that R = Im (A|B) is dense in Ln1 [0, ω] implies that R is a core for T . Thus in order to complete the proof it remains to show that the Bezout integral operator T satisfies the identities (i)–(iii) in Theorem 4.3. To do this we use Proposition 3.6. Indeed, (3.39) yields
T A − AT = T V − W T = iY YC ZD − iY YA ZB = =
B2 E −1 C(0)D(0)C C2 − B1 E −1 A(0)B(0)C1 B2 C2 − B1 C1 ,
which proves (i). To get (ii) we use (3.41) to show that TB
= T τ = YC D(0) − YA B(0) = YC C(0)−1 E −1 − YA A(0)−1 E −1 = B2 − B1 .
Finally, (3.40) yields CT Thus (i)–(iii) hold.
=
ZB − iE −1 C(0)Z ZD iE −1 πT = iE −1 A(0)Z
=
iB(0)E −1 ZB − iD(0)E −1 ZD = C1 − C2 .
266
I. Gohberg, I. Haimovici, M.A. Kaashoek and L. Lerer
4.5. The null space of the Bezout integral operator

As an addition to Theorem 1.1 we prove the following result.

Theorem 4.6. Let A, B, C, D be the entire n × n matrix functions given by (1.2) and (1.3), and assume that the quasi commutativity property (1.1) is satisfied. Then the null space of the Bezout integral operator T associated with {A, C; B, D} is finite-dimensional, and a basis for this null space can be obtained in the following way. Let λ_1, …, λ_ℓ be the distinct common eigenvalues of B and D in C, and for each common eigenvalue λ_ν let

    x^ν_{1,0}, …, x^ν_{1, r_1^{(ν)}−1},  x^ν_{2,0}, …, x^ν_{2, r_2^{(ν)}−1},  …,  x^ν_{p_ν,0}, …, x^ν_{p_ν, r_{p_ν}^{(ν)}−1}

stand for a canonical set of common Jordan chains of B and D at λ_ν. Then the functions

    ψ^ν_{j,k}(t) = e^{−iλ_ν t} Σ_{µ=0}^{k} ((−it)^{k−µ} / (k − µ)!) x^ν_{j,µ},
        k = 0, …, r_j^{(ν)} − 1,    j = 1, …, p_ν,    ν = 1, …, ℓ,              (4.22)

form a basis for the null space of T. In particular, dim Ker T = ν(B, D).
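For orientation, consider the simplest instance of (4.22): if every canonical chain at λ_ν has length one, i.e., r_j^{(ν)} = 1 for all j, then only k = 0 occurs, and each common eigenvector x^ν_{j,0} of B and D contributes the pure exponential

    ψ^ν_{j,0}(t) = e^{−iλ_ν t} x^ν_{j,0},    0 ≤ t ≤ ω,

to the null space of T; the polynomial factors in (4.22) enter only for Jordan chains of length greater than one.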
Proof. We split the proof into two parts. In the first part we assume additionally that the matrices A(0), B(0), C(0), D(0) are non-singular. The general case is treated in the second part.

Part 1. Let M be the null space of the Bezout integral operator T associated with {A, C; B, D}. Since T is of the form I + Γ, with Γ a compact operator, the space M is finite-dimensional. Assume that A(0), B(0), C(0), D(0) are non-singular, and consider the functions F_B, F_D, G_A, G_C defined by (4.17)–(4.20). As we have seen in the previous section, the operator T is equal to the abstract Bezout operator associated to the matrix functions {G_A, G_C; F_B, F_D} and the co-realizations (4.17)–(4.20). Thus we know from Theorem 4.3 that M = Ker T is the maximal subspace that is invariant under the operator V − iτB(0)^{-1}Z_B and that is contained in the null space of the operator iB(0)^{-1}Z_B − iD(0)^{-1}Z_D. This allows us to apply Theorem 2.6 with F_1 = F_B and F_2 = F_D.

We already know that M is finite-dimensional. The argument used in the final paragraph of Chapter 3 shows that the operator V − iτB(0)^{-1}Z_B is injective. Thus Theorem 2.6 (together with the remark made immediately after Theorem 2.6) tells us that the functions

    u^ν_{j,k} = Σ_{α=0}^{k} (I − λ_ν V)^{−(α+1)} V^α τ x^ν_{j,k−α},
        k = 0, …, r_j^{(ν)} − 1,    j = 1, …, p_ν,    ν = 1, …, ℓ,

form a basis for M = Ker T. To complete the proof of the first part we shall show that

    ψ^ν_{j,k} = u^ν_{j,k},    k = 0, …, r_j^{(ν)} − 1,    j = 1, …, p_ν,    ν = 1, …, ℓ.    (4.23)
To do this, we first note that for α = 0, 1, 2, … we have

    d^α/dλ^α (I − λV)^{-1} = d^α/dλ^α Σ_{m=0}^{∞} λ^m V^m
        = Σ_{m=α}^{∞} m(m − 1) ⋯ (m − α + 1) λ^{m−α} V^m
        = Σ_{m=0}^{∞} (m + α)(m + α − 1) ⋯ (m + 1) λ^m V^{m+α}
        = Σ_{m=0}^{∞} ((m + α)!/m!) λ^m V^{m+α} = α! Σ_{m=0}^{∞} \binom{m+α}{m} λ^m V^{m+α}.

On the other hand,

    d^α/dλ^α (I − λV)^{-1} = α! (I − λV)^{−(α+1)} V^α.

It follows that

    (I − λV)^{−(α+1)} V^α = Σ_{m=0}^{∞} \binom{m+α}{m} λ^m V^{m+α},    α = 0, 1, 2, ….
Next, using the second identity in (4.21), we see that for each x ∈ C^n and α = 0, 1, 2, … we have

    ((−it)^α/α!) e^{−iλt} x = ((−it)^α/α!) Σ_{m=0}^{∞} λ^m ((−it)^m/m!) x = Σ_{m=0}^{∞} λ^m ((−it)^{m+α}/(α! m!)) x
        = Σ_{m=0}^{∞} λ^m \binom{m+α}{m} ((−it)^{m+α}/(m + α)!) x
        = (Σ_{m=0}^{∞} λ^m \binom{m+α}{m} V^{m+α} τ x)(t).
We conclude that for each x ∈ C^n and α = 0, 1, 2, … we have

    ((I − λV)^{−(α+1)} V^α τ x)(t) = ((−it)^α/α!) e^{−iλt} x,    0 ≤ t ≤ ω.    (4.24)

Replacing the summation index µ in (4.22) by α = k − µ and using the identity in (4.24), we see that

    ψ^ν_{j,k}(t) = Σ_{α=0}^{k} e^{−iλ_ν t} ((−it)^α/α!) x^ν_{j,k−α}
                 = Σ_{α=0}^{k} ((I − λ_ν V)^{−(α+1)} V^α τ x^ν_{j,k−α})(t) = u^ν_{j,k}(t),    0 ≤ t ≤ ω.

This proves (4.23), and we are done.
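As a quick consistency check on (4.24), not needed for the proof, take α = 0: the left-hand side is then the Neumann series

    ((I − λV)^{-1} τ x)(t) = Σ_{m=0}^{∞} λ^m (V^m τ x)(t) = Σ_{m=0}^{∞} λ^m ((−it)^m/m!) x = e^{−iλt} x,

which is precisely the right-hand side of (4.24) for α = 0, by the second identity in (4.21).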
Part 2. For an arbitrary α ∈ R let A_α, B_α, C_α, D_α be the entire n × n matrix functions defined by (3.50) and (3.51). An elementary calculation, using (2.3), shows that λ_1 − α, …, λ_ℓ − α are the distinct common eigenvalues of B_α and D_α in C. Furthermore, the vectors

    x^ν_{1,0}, …, x^ν_{1, r_1^{(ν)}−1},  x^ν_{2,0}, …, x^ν_{2, r_2^{(ν)}−1},  …,  x^ν_{p_ν,0}, …, x^ν_{p_ν, r_{p_ν}^{(ν)}−1}

form a canonical set of common Jordan chains of B_α and D_α at λ_ν − α.

Let T_α be the Bezout integral operator associated with {A_α, C_α; B_α, D_α}, where α ∈ R has been chosen in such a way that the matrices A_α(0), B_α(0), C_α(0), D_α(0) are non-singular. As we have seen in Part 1 of Section 3.5, such an α always exists. The result of Part 1 of the present proof, together with the remarks in the previous paragraph, shows that the functions

    ψ̃^ν_{j,k}(t) = e^{−i(λ_ν − α)t} Σ_{µ=0}^{k} ((−it)^{k−µ}/(k − µ)!) x^ν_{j,µ},
        k = 0, …, r_j^{(ν)} − 1,    j = 1, …, p_ν,    ν = 1, …, ℓ,

form a basis for the null space of T_α. Recall that T_α = M_α T M_{−α}, where M_α is the operator on L_1^n[0, ω] defined by (3.52). It follows that g ∈ Ker T if and only if for some g̃ ∈ Ker T_α we have

    g(t) = e^{−iαt} g̃(t),    0 ≤ t ≤ ω.

Since ψ^ν_{j,k}(t) = e^{−iαt} ψ̃^ν_{j,k}(t) for 0 ≤ t ≤ ω, we see that the functions in (4.22) form a basis of Ker T.
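The similarity step used here can be made explicit; assuming, as the relation g(t) = e^{−iαt} g̃(t) above suggests, that (3.52) defines M_α as multiplication by e^{iαt} (so that M_α is invertible with M_α^{-1} = M_{−α}), the identity T_α = M_α T M_{−α} gives

    Tg = 0  ⟺  M_α T M_{−α}(M_α g) = 0  ⟺  T_α(M_α g) = 0,

that is, Ker T = M_{−α} Ker T_α, which is exactly the correspondence g = M_{−α} g̃ used above.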
From the proof of Theorem 4.6 it is clear that, with slight modifications in the arguments, this theorem could also have been proved at the end of the previous chapter. The fact that the functions in (4.22) are contained in the null space of the Bezout integral operator T can also be derived from Theorem 1.1 in [10], using the equivalence between T and the resultant operator proved in Section 3 of [10].
References

[1] B.D.O. Anderson and E.I. Jury, Generalized Bezoutian and Sylvester matrices in multivariable linear control, IEEE Trans. Automatic Control AC-21 (1976), 551–556.

[2] D.Z. Arov, Scattering theory with dissipation of energy, Dokl. Akad. Nauk SSSR 216 (4) (1974), 713–716 [in Russian]; English translation with addenda: Sov. Math. Dokl. 15 (1974), 848–854.

[3] D.Z. Arov, M.A. Kaashoek, and D.R. Pik, The Kalman-Yakubovich-Popov inequality for discrete time systems of infinite dimension, J. Oper. Theory, to appear.

[4] J.A. Ball and N. Cohen, De Branges-Rovnyak operator models and systems theory: a survey, in: Topics in Matrix and Operator Theory (eds. H. Bart, I. Gohberg, M.A. Kaashoek), OT 50, Birkhäuser Verlag, Basel, 1991, pp. 93–136.
[5] H. Bart, I. Gohberg, and M.A. Kaashoek, Minimal Factorization of Matrix and Operator Functions, OT 1, Birkhäuser Verlag, Basel, 1979.

[6] R.R. Bitmead, S.Y. Kung, B.D.O. Anderson, and T. Kailath, Greatest common divisors via generalized Sylvester and Bezout matrices, IEEE Trans. Autom. Control AC-23 (1978), 1043–1047.

[7] M.J. Corless and A.E. Frazho, Linear systems and control, Marcel Dekker, Inc., New York, NY, 2003.

[8] I. Gohberg, S. Goldberg, and M.A. Kaashoek, Classes of Linear Operators Vol. 1, OT 49, Birkhäuser Verlag, Basel, 1990.

[9] I. Gohberg and G. Heinig, The resultant matrix and its generalizations, II. Continual analog of resultant matrix, Acta Math. Acad. Sci. Hungar 28 (1976), 198–209 [in Russian].

[10] I. Gohberg, M.A. Kaashoek, and L. Lerer, The continuous analogue of the resultant and related convolution operators, to appear.

[11] I. Gohberg, M.A. Kaashoek, and F. van Schagen, Partially specified matrices and operators: classification, completion, applications, OT 79, Birkhäuser Verlag, Basel, 1995.

[12] I. Gohberg, M.A. Kaashoek, and F. van Schagen, On inversion of convolution integral operators on a finite interval, in: Operator Theoretical Methods and Applications to Mathematical Physics. The Erhard Meister Memorial Volume, OT 147, Birkhäuser Verlag, Basel, 2004, pp. 277–285.

[13] I. Gohberg, P. Lancaster, and L. Rodman, Matrix Polynomials, Academic Press, New York, 1982.

[14] I. Gohberg and L. Lerer, Matrix generalizations of M.G. Krein theorems on matrix polynomials, in: Orthogonal Matrix-Valued Polynomials and Applications (I. Gohberg, ed.), OT 34, Birkhäuser Verlag, Basel, 1988, pp. 137–202.

[15] I.C. Gohberg and E.I. Sigal, An operator generalization of the logarithmic residue theorem and the theorem of Rouché, Mat. Sbornik 84 (126) (1971), 607–629 [in Russian]; English transl. Math. USSR, Sbornik 13 (1971), 603–625.

[16] I. Haimovici, Operator equations and Bezout operators for analytic operator functions, Ph.D. thesis, Technion, Haifa, Israel, 1991 [in Hebrew].

[17] I. Haimovici and L. Lerer, Bezout operators for analytic operator functions, I. A general concept of Bezout operator, Integral Equations Oper. Theory 21 (1995), 33–70.

[18] G. Heinig and K. Rost, Algebraic Methods for Toeplitz-like Matrices and Operators, OT 13, Birkhäuser Verlag, Basel, 1984.

[19] J.W. Helton, Discrete time systems, operator models, and scattering theory, J. Funct. Anal. 16 (1974), 15–38.

[20] M.G. Krein and M.A. Naimark, The method of symmetric and hermitian forms in theory of separation of the roots of algebraic equations, GNTI, Kharkov, 1936 [in Russian]; English transl. Linear and Multilinear Algebra 10 (1981), 265–308.

[21] L. Lerer and L. Rodman, Bezoutians of rational matrix functions, J. Funct. Anal. 141 (1996), 1–38.
[22] L. Lerer and L. Rodman, Bezoutians of rational matrix functions, matrix equations and factorization, Lin. Alg. Appl. 302/303 (1999), 105–135.

[23] L. Lerer, L. Rodman, and M. Tismenetsky, Bezoutian and Schur-Cohn problem for operator polynomials, J. Math. Analysis & Appl. 103 (1984), 83–102.

[24] L. Lerer and M. Tismenetsky, The eigenvalue separation problem for matrix polynomials, Integral Equations Operator Theory 5 (1982), 386–445.

[25] L. Rodman, An introduction to operator polynomials, OT 38, Birkhäuser Verlag, Basel, 1989.

[26] L.A. Sakhnovich, Operatorial Bezoutiant in the theory of separation of roots of entire functions, Functional Anal. Appl. 10 (1976), 45–51 [in Russian].

[27] L.A. Sakhnovich, Integral equations with difference kernels on finite intervals, OT 84, Birkhäuser Verlag, Basel, 1996.

I. Gohberg
School of Mathematical Sciences
Raymond and Beverly Sackler Faculty of Exact Sciences
Tel-Aviv University
Ramat Aviv 69978, Israel
e-mail: [email protected]

I. Haimovici
Steimatzky Str. 9/9
Ramat Aviv Hahadasha 69639
Tel Aviv, Israel
e-mail: [email protected]

M.A. Kaashoek
Afdeling Wiskunde
Faculteit der Exacte Wetenschappen
Vrije Universiteit
De Boelelaan 1081a
1081 HV Amsterdam, The Netherlands
e-mail: [email protected]

L. Lerer
Department of Mathematics
Technion – Israel Institute of Technology
Haifa 32000, Israel
e-mail: [email protected]