A panorama of modern operator theory and related topics


Author: Harry Dym | Marinus A. Kaashoek | Peter Lancaster | Heinz Langer | Leonid Lerer (eds.)


Operator Theory: Advances and Applications Volume 218 Founded in 1979 by Israel Gohberg

Editors: Joseph A. Ball (Blacksburg, VA, USA) Harry Dym (Rehovot, Israel) Marinus A. Kaashoek (Amsterdam, The Netherlands) Heinz Langer (Vienna, Austria) Christiane Tretter (Bern, Switzerland) Associate Editors: Vadim Adamyan (Odessa, Ukraine) Albrecht Böttcher (Chemnitz, Germany) B. Malcolm Brown (Cardiff, UK) Raul Curto (Iowa, IA, USA) Fritz Gesztesy (Columbia, MO, USA) Pavel Kurasov (Lund, Sweden) Leonid E. Lerer (Haifa, Israel) Vern Paulsen (Houston, TX, USA) Mihai Putinar (Santa Barbara, CA, USA) Leiba Rodman (Williamsburg, VA, USA) Ilya M. Spitkovsky (Williamsburg, VA, USA)

Honorary and Advisory Editorial Board: Lewis A. Coburn (Buffalo, NY, USA) Ciprian Foias (College Station, TX, USA) J.William Helton (San Diego, CA, USA) Thomas Kailath (Stanford, CA, USA) Peter Lancaster (Calgary, Canada) Peter D. Lax (New York, NY, USA) Donald Sarason (Berkeley, CA, USA) Bernd Silbermann (Chemnitz, Germany) Harold Widom (Santa Cruz, CA, USA)

Harry Dym Marinus A. Kaashoek Peter Lancaster Heinz Langer Leonid Lerer Editors

A Panorama of Modern Operator Theory and Related Topics The Israel Gohberg Memorial Volume

Editors Harry Dym Department of Mathematics Weizmann Institute of Science Rehovot, Israel

Marinus A. Kaashoek Department of Mathematics VU University Amsterdam Amsterdam, The Netherlands

Peter Lancaster Department of Mathematics & Statistics University of Calgary Calgary, Alberta, Canada

Heinz Langer Institute of Analysis and Scientific Computing Vienna University of Technology Vienna, Austria

Leonid Lerer Department of Mathematics Technion Israel Institute of Technology Haifa, Israel

ISBN 978-3-0348-0220-8 e-ISBN 978-3-0348-0221-5 DOI 10.1007/978-3-0348-0221-5 Springer Basel Dordrecht Heidelberg London New York Library of Congress Control Number: 2012930973 Mathematics Subject Classification (2010): 47-XX, 46-XX, 32-XX, 15-XX, 93-XX © Springer Basel AG 2012 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. For any kind of use, permission of the copyright owner must be obtained. Printed on acid-free paper

Springer Basel AG is part of Springer Science + Business Media (www.birkhauser-science.com)

Contents

Preface

D. Alpay and H. Attia: An Interpolation Problem for Functions with Values in a Commutative Ring
J. Arazy and H. Upmeier: Minimal and Maximal Invariant Spaces of Holomorphic Functions on Bounded Symmetric Domains
D.Z. Arov and H. Dym: B-regular J-inner Matrix-valued Functions
J.A. Ball and V. Bolotnikov: Canonical Transfer-function Realization for Schur-Agler-class Functions of the Polydisk
H. Bart, T. Ehrhardt and B. Silbermann: Spectral Regularity of Banach Algebras and Non-commutative Gelfand Theory
W. Bauer and N. Vasilevski: Banach Algebras of Commuting Toeplitz Operators on the Unit Ball via the Quasi-hyperbolic Group
H. Bercovici, R.G. Douglas and C. Foias: Canonical Models for Bi-isometries
A. Böttcher, S. Grudsky, D. Huybrechs and A. Iserles: First-order Trace Formulae for the Iterates of the Fox–Li Operator
A. Brudnyi, L. Rodman and I.M. Spitkovsky: Factorization Versus Invertibility of Matrix Functions on Compact Abelian Groups
P. Dewilde: Banded Matrices, Banded Inverses and Polynomial Representations for Semi-separable Operators
V.K. Dubovoy, B. Fritzsche and B. Kirstein: Description of Helson-Szegő Measures in Terms of the Schur Parameter Sequences of Associated Schur Functions
Y. Eidelman and I. Haimovici: Divide and Conquer Method for Eigenstructure of Quasiseparable Matrices Using Zeroes of Rational Matrix Functions
R.L. Ellis: An Identity Satisfied by Certain Orthogonal Vector-valued Functions
I. Feldman and N. Krupnik: Invertibility of Certain Fredholm Operators
F.L. Hernández, Y. Raynaud and E.M. Semenov: Bernstein Widths and Super Strictly Singular Inclusions
M.A. Kaashoek and F. van Schagen: On Inversion of Certain Structured Linear Transformations Related to Block Toeplitz Matrices
S. Koyuncu and H.J. Woerdeman: The Inverse of a Two-level Positive Definite Toeplitz Operator Matrix
P. Lancaster and I. Zaballa: Parametrizing Structure Preserving Transformations of Matrix Polynomials
P. Lancaster and I. Zaballa: A Review of Canonical Forms for Selfadjoint Matrix Polynomials
H. Langer, A. Markus and V. Matsaev: Linearization, Factorization, and the Spectral Compression of a Self-adjoint Analytic Operator Function Under the Condition (VM)
J. Leiterer: An Estimate for the Splitting of Holomorphic Cocycles. One Variable
L. Lerer and A.C.M. Ran: The Discrete Algebraic Riccati Equation and Hermitian Block Toeplitz Matrices
F. Oggier and A. Bruckstein: On Cyclic and Nearly Cyclic Multiagent Interactions in the Plane
J. Östensson and D.R. Yafaev: A Trace Formula for Differential Operators of Arbitrary Order
L. Rodman: Jordan Structures and Lattices of Invariant Subspaces of Real Matrices
J. Rovnyak and L.A. Sakhnovich: Pseudospectral Functions for Canonical Differential Systems. II
D. Xia: Operator Identities for Subnormal Tuples of Operators

Israel Gohberg 1928–2009

Preface

Israel Gohberg, the founder of the Birkhäuser OT series (and the journal Integral Equations and Operator Theory), passed away on October 12, 2009, a few months after his eighty-first birthday. This brought to a close more than sixty years of intense mathematical activity, some 25 in the former Soviet Union, where he was born, and the remaining 35 or so while living in Israel, but with many extended visits to collaborators in Europe (primarily Germany and The Netherlands), the US and Canada.

A recent Birkhäuser volume, Israel Gohberg and Friends, provides extensive documentation of Israel’s life, activities and interests. It includes biographical material, a list of his papers (458), books (25) and students (40) up to March 2008, as well as testimonials from his many collaborators, students and friends, some of which are reprinted from the proceedings of the conferences that celebrated his sixtieth, seventieth and eightieth birthdays, in Calgary, Groningen and Williamsburg, respectively. The journal Linear Algebra and its Applications also printed six short articles on Israel Gohberg by six of his collaborators in volume 433 (2010), 877–892. Obituaries appeared in various other journals, including the IEEE Control Systems Magazine (volume 30, December 2010).

In spite of deteriorating health in his later years, Israel maintained a full schedule of activities, continued to generate a steady stream of ideas, plans for the future, articles and books, and continued to exhibit a positive, optimistic outlook. Even when he was hospitalized in the Intensive Care Unit of Meir Hospital in Kfar Saba, Israel, and forced to cancel the first of a planned sequence of meetings in Germany, he expressed the hope of being able to participate in the second. Unfortunately, this was not to be.

This volume is a collection of articles written to honor his memory by a number of his former students, collaborators, colleagues and friends on subjects that intersect with his many interests. A list of the key words that appear in the titles gives a partial indication of their scope: interpolation, transfer function, realization theory, Banach algebras, Toeplitz operators, factorization, (numerical) algorithms, semi-separable matrices and operators, Fredholm operators, block Toeplitz matrices, inversion of structured matrices, Riccati equations, trace formulas, matrix polynomials, linearization, analytic operator functions, Jordan structures, canonical differential systems, multivariable operator theory.

December 2011

Harry Dym, Marinus A. Kaashoek, Peter Lancaster, Heinz Langer, Leonid Lerer

Operator Theory: Advances and Applications, Vol. 218, 1–17
© 2012 Springer Basel AG

An Interpolation Problem for Functions with Values in a Commutative Ring

Daniel Alpay and Haim Attia

Dedicated to the memory of Israel Gohberg

Abstract. It was recently shown that the theory of linear stochastic systems can be viewed as a particular case of the theory of linear systems on a certain commutative ring of power series in a countable number of variables. In the present work we study an interpolation problem in this setting. A key tool is the principle of permanence of algebraic identities.

Mathematics Subject Classification (2000). 60H40, 93C05.

Keywords. White noise space, stochastic distributions, linear systems on rings.

D. Alpay thanks the Earl Katz family for endowing the chair which supported his research.

1. Introduction

There are numerous connections between classical interpolation problems and optimal control and the theory of linear systems; see for instance [10, 1]. In these settings, the coefficient space is the complex field $\mathbb{C}$, or in the case of real systems, the real numbers $\mathbb{R}$. Furthermore, already from its inception, linear system theory was considered when the coefficient space is a general (commutative) field, or more generally a commutative ring; see [22, 25]. In [8, 6] a new approach to the theory of linear stochastic systems was developed, in which the coefficient space is now a certain commutative ring $\mathfrak{R}$ (see Section 3 below). The results from [22, 25] do not seem to be directly applicable to this theory, and the specific properties of $\mathfrak{R}$ played a key role in the arguments in [8, 6]. We set
\[
\mathbb{N}_0 = \{0, 1, 2, 3, \ldots\}.
\]
The purpose of this work is to discuss the counterparts of classical interpolation problems in this new setting. To set the problems into perspective, we begin this introduction with a short discussion of the deterministic case. In the classical theory of linear systems, input-output relations of the form
\[
y_n = \sum_{k=0}^{n} h_{n-k} u_k, \qquad n = 0, 1, \ldots, \tag{1.1}
\]

where $(u_n)_{n\in\mathbb{N}_0}$ is called the input sequence, $(y_n)_{n\in\mathbb{N}_0}$ is the output sequence, and $(h_n)_{n\in\mathbb{N}_0}$ is the impulse response, play an important role. The sequence $(h_n)_{n\in\mathbb{N}_0}$ may consist of matrices (of common dimensions), and then the input and output sequences consist of vectors of appropriate dimensions. Similarly, state space equations
\[
x_{n+1} = A x_n + B u_n, \qquad y_n = C x_n + D u_n, \qquad n = 0, 1, \ldots
\]
play an important role. Here $x_n$ denotes the state at time $n$, and $A$, $B$, $C$ and $D$ are matrices with complex entries. The transfer function of the system is
\[
h(\lambda) = \sum_{n=0}^{\infty} h_n \lambda^n
\]
in the case (1.1), and
\[
h(\lambda) = D + \lambda C (I - \lambda A)^{-1} B
\]
in the case of state space equations, when assuming the state at $n = 0$ to be equal to $0$.

Classical interpolation problems have various applications to the corresponding linear systems. See for instance [10, Part VI], [21]. To fix ideas, we consider the bitangential interpolation problem for matrix-valued functions analytic and contractive in the open unit disk (Schur functions). In the sequel we will even restrict attention to the Nevanlinna-Pick interpolation problem to keep the notation simple, but it will be clear that the discussion extends to more general cases. Recall (see [10, §18.5 p. 409]) that the bitangential interpolation problem may be defined in terms of a septuple of matrices $\omega = (C_+, C_-, A_\pi, A_\zeta, B_+, B_-, \Gamma)$ by the conditions
\[
\sum_{\lambda_0 \in \mathbb{D}} \operatorname{Res}_{\lambda=\lambda_0} (\lambda I - A_\zeta)^{-1} B_+ S(\lambda) = -B_-,
\]
\[
\sum_{\lambda_0 \in \mathbb{D}} \operatorname{Res}_{\lambda=\lambda_0} S(\lambda) C_- (\lambda I - A_\pi)^{-1} = C_+,
\]
\[
\sum_{\lambda_0 \in \mathbb{D}} \operatorname{Res}_{\lambda=\lambda_0} (\lambda I - A_\zeta)^{-1} B_+ S(\lambda) C_- (\lambda I - A_\pi)^{-1} = \Gamma,
\]
where $A_\zeta$ and $A_\pi$ have their spectra in the open unit disk, where $(A_\zeta, B_+)$ is a full range pair (that is, controllable) and where $(C_-, A_\pi)$ is a null kernel pair (that is, observable). We refer the reader to [10] for the definitions. Moreover, $\Gamma$ satisfies the compatibility condition
\[
\Gamma A_\pi - A_\zeta \Gamma = B_+ C_+ + B_- C_-.
\]

Let $P$ be the matrix (see [10, p. 458])
\[
P = \begin{pmatrix} P_1 & \Gamma^* \\ \Gamma & P_2 \end{pmatrix}, \tag{1.2}
\]
where $P_1$ and $P_2$ are the solutions of the Stein equations
\[
P_1 - A_\pi^* P_1 A_\pi = C_-^* C_- - C_+^* C_+, \qquad
P_2 - A_\zeta P_2 A_\zeta^* = B_+ B_+^* - B_- B_-^*.
\]
Furthermore, and assuming the unknown function $S$ to be $\mathbb{C}^{p\times q}$-valued, we set
\[
J = \begin{pmatrix} I_p & 0 \\ 0 & -I_q \end{pmatrix}.
\]
When $P$ is strictly positive, the solutions of the interpolation problem are given in terms of a linear fractional transformation based on a $J$-inner rational function $\Theta$ built from the septuple $\omega$ via the formula (see [10, (18.5.6) p. 410])
\[
\begin{aligned}
\Theta(\lambda) = I + (\lambda - \lambda_0)
&\begin{pmatrix} C_+ & -B_+^* \\ C_- & B_-^* \end{pmatrix}
\begin{pmatrix} (\lambda I - A_\pi)^{-1} & 0 \\ 0 & (I - \lambda A_\zeta^*)^{-1} \end{pmatrix} \\
&\times P^{-1}
\begin{pmatrix} (I - \lambda_0 A_\pi^*)^{-1} C_+^* & -(I - \lambda_0 A_\pi^*)^{-1} C_-^* \\ (A_\zeta - \lambda_0 I)^{-1} B_+ & (A_\zeta - \lambda_0 I)^{-1} B_- \end{pmatrix},
\end{aligned} \tag{1.3}
\]
where $\lambda_0$ is fixed on the unit circle and such that the various inverses exist in the above formula. An important fact is that the entries of $P_1$ and $P_2$ are rational functions of the entries of the matrices of $\omega$. The same holds for the entries of $P$ since $\Gamma$ belongs to the septuple $\omega$. As a consequence, there exists a rational function $f(\lambda)$, built from $\omega$ and such that the entries of $\Theta$ are polynomials in $\lambda$, with coefficients which are themselves polynomials in the entries of the matrices of $\omega$ with coefficients in the set of integers $\mathbb{Z}$. This fact will allow us in the sequel to use the principle of permanence of identities (see [9, p. 456]), to extend interpolation problems to a more general setting.

Allowing in (1.1) the input sequence $(u_n)_{n\in\mathbb{N}_0}$ to consist of random variables has been considered for a long time. On the other hand, allowing also the impulse response of the system to carry some randomness seems much more difficult to tackle. Recently a new approach to the theory of linear stochastic systems was developed using Hida’s white noise space theory [18], [19], [23], and Kondratiev’s spaces of stochastic test functions and distributions [20]. In this approach, see [3], [5], [6], the complex numbers are replaced by random variables in the white noise space, or more generally, by stochastic distributions in the Kondratiev space, and the product of complex numbers is replaced by the Wick product. For instance, (1.1) now becomes
\[
y_n = \sum_{k=0}^{n} h_{n-k} \diamond u_k, \qquad n = 0, 1, \ldots \tag{1.4}
\]
where the various quantities are in the white noise space, or more generally in the Kondratiev space of stochastic distributions, and $\diamond$ denotes the Wick product.
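To make the Wick convolution (1.4) concrete, here is a minimal sketch in which a finitely supported element $\sum_\alpha f_\alpha H_\alpha$ is stored as a Python dictionary from multi-indices (tuples) to coefficients, and the Wick product follows the rule $H_\alpha \diamond H_\beta = H_{\alpha+\beta}$ recalled in Section 2. The function names, the restriction to two variables and the sample data are illustrative assumptions and do not come from the paper.

```python
def wick(f, g):
    """Wick product of finitely supported elements, using H_a <> H_b = H_{a+b}."""
    out = {}
    for a, fa in f.items():
        for b, gb in g.items():
            c = tuple(x + y for x, y in zip(a, b))
            out[c] = out.get(c, 0) + fa * gb
    return out

def add(f, g):
    """Coefficientwise sum of two elements."""
    out = dict(f)
    for a, c in g.items():
        out[a] = out.get(a, 0) + c
    return out

def wick_convolution(h, u):
    """y_n = sum_{k=0}^{n} h_{n-k} <> u_k, as in (1.4), for the available indices n."""
    y = []
    for n in range(min(len(h), len(u))):
        acc = {}
        for k in range(n + 1):
            acc = add(acc, wick(h[n - k], u[k]))
        y.append(acc)
    return y

# Hypothetical data in two variables: the key (i, j) stands for H_(i, j, 0, 0, ...)
h = [{(0, 0): 1.0}, {(0, 0): 0.5, (1, 0): 1.0}]   # impulse response carrying randomness
u = [{(0, 0): 2.0, (0, 1): 1.0}, {(0, 0): 1.0}]   # random input
y = wick_convolution(h, u)
```

When every dictionary has the single key `(0, 0)`, the computation reduces to the deterministic convolution (1.1).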

An important role in this theory is played by a ring $\mathfrak{R}$ of power series in countably many variables with coefficients in $\mathbb{C}$. This ring is endowed with a topology, which is that of the dual of a countably normed nuclear space. See Sections 2 and 3. Let us denote by
\[
\mathbf{r}(z) = \sum_{\alpha\in\ell} r_\alpha z^\alpha \tag{1.5}
\]
an element of $\mathfrak{R}$, where $\ell$ denotes the set of sequences $(\alpha_1, \alpha_2, \ldots)$ whose entries are in $\mathbb{N}_0$ and for which $\alpha_k \neq 0$ for only a finite number of indices $k$, and where we have used the multi-index notation
\[
z^\alpha = z_1^{\alpha_1} z_2^{\alpha_2} \cdots, \qquad \alpha \in \ell.
\]
The ring $\mathfrak{R}$ has the following properties:

$(P_1)$ If $\mathbf{r} \in \mathfrak{R}$ and $\mathbf{r}(0, 0, 0, \ldots) \neq 0$, then $\mathbf{r}$ has an inverse in $\mathfrak{R}$.

$(P_2)$ If $\mathbf{r} \in \mathfrak{R}^{n\times n}$ is such that $\mathbf{r}(0, 0, 0, \ldots) = 0_{n\times n}$ and if $f$ is a function of one complex variable, analytic in a neighborhood of the origin, with Taylor expansion
\[
f(\lambda) = \sum_{p=0}^{\infty} f_p \lambda^p,
\]
then the series
\[
f(\mathbf{r}) \stackrel{\text{def.}}{=} \sum_{p=0}^{\infty} f_p \mathbf{r}^p
\]
converges in $\mathfrak{R}^{n\times n}$. Furthermore, if $g$ is another function of one complex variable, analytic in a neighborhood of the origin, we have
\[
(fg)(\mathbf{r}) = f(\mathbf{r})\, g(\mathbf{r}). \tag{1.6}
\]

$(P_3)$ If $\mathbf{r}(z) = \sum_{\alpha\in\ell} r_\alpha z^\alpha \in \mathfrak{R}$, then $\mathbf{r}^*(z) \stackrel{\text{def.}}{=} \sum_{\alpha\in\ell} r_\alpha^* z^\alpha \in \mathfrak{R}$, where $r_\alpha^*$ denotes the conjugate of the complex number $r_\alpha$.
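Property $(P_1)$ can be illustrated on truncated power series. In the sketch below an element is stored as a dictionary of coefficients, products are cut off at a chosen total degree, and the inverse is obtained from the Neumann series $\mathbf{r}^{-1} = r_0^{-1} \sum_n \big(-(\mathbf{r} - r_0)/r_0\big)^n$ with $r_0 = \mathbf{r}(0, 0, \ldots)$, which terminates after truncation. The truncation degree, the function names and the sample element are assumptions made for illustration only; the convergence in $\mathfrak{R}$ itself is not tested here.

```python
def mul(f, g, deg):
    """Product of power series {multi-index: coeff}, dropping total degree > deg."""
    out = {}
    for a, fa in f.items():
        for b, gb in g.items():
            c = tuple(x + y for x, y in zip(a, b))
            if sum(c) <= deg:
                out[c] = out.get(c, 0) + fa * gb
    return out

def inverse(r, deg):
    """Inverse of r modulo terms of total degree > deg; needs r(0) != 0 as in (P1)."""
    nvars = len(next(iter(r)))
    zero = (0,) * nvars
    r0 = r.get(zero, 0)
    if r0 == 0:
        raise ValueError("r(0) must be nonzero")
    # e = -(r - r0)/r0 has no constant term, so e^n vanishes modulo degree > deg for n > deg
    e = {a: -c / r0 for a, c in r.items() if a != zero}
    term = {zero: 1.0}
    total = {zero: 1.0}
    for _ in range(deg):
        term = mul(term, e, deg)
        for a, c in term.items():
            total[a] = total.get(a, 0) + c
    return {a: c / r0 for a, c in total.items()}

r = {(0, 0): 2.0, (1, 0): 1.0, (0, 1): -3.0}   # hypothetical element r(z) = 2 + z1 - 3 z2
r_inv = inverse(r, deg=4)
check = mul(r, r_inv, 4)                        # equals 1 modulo terms of degree > 4
```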

Property $(P_1)$ implies in particular that a matrix $\mathbf{A} \in \mathfrak{R}^{n\times n}$ is invertible in $\mathfrak{R}^{n\times n}$ if and only if $\det \mathbf{A}(0) \neq 0$. This fact, together with $(P_2)$, allows us to define expressions such as
\[
\mathscr{H}(\lambda) = \mathbf{D} + \lambda \mathbf{C}(I_n - \lambda \mathbf{A})^{-1}\mathbf{B} = \mathbf{D} + \sum_{k=1}^{\infty} \lambda^k \mathbf{C}\mathbf{A}^{k-1}\mathbf{B}, \tag{1.7}
\]
where $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, and $\mathbf{D}$ are matrices of appropriate dimensions with entries in $\mathfrak{R}$, and where $\lambda$ is an independent complex variable. As explained in [6] this is the transfer function of some underlying linear system, and is a rational function with coefficients in $\mathfrak{R}$.

The purpose of this paper is to explain how to tackle in the present setting counterparts of some classical interpolation problems which appear in the theory of linear systems. To illustrate our strategy, we focus on the Nevanlinna-Pick interpolation problem, but our method works the same for the general bitangential interpolation problem. The computations done in the classical theory (that is, when the coefficient space consists of the complex numbers) extend to the case where $\mathbb{C}$ is replaced by the ring $\mathfrak{R}$. In some cases, such as Nevanlinna-Pick interpolation, this can be shown by direct computations. In the general case, one needs to use the principle of permanence of identities, see [9, p. 456]. We note that there are other commutative rings with properties $(P_1)$, $(P_2)$ and $(P_3)$ for which the above analysis is applicable. See [7].

The paper consists of five sections besides the present introduction. In the second section we review Hida’s white noise space setting and the Kondratiev spaces of stochastic distributions. The definition and main properties of the ring $\mathfrak{R}$ are given in Section 3. In Section 4 we define and study analytic functions from an open set of $\mathbb{C}$ with values in $\mathfrak{R}$. In Section 5 we consider the Nevanlinna-Pick interpolation problem. In the last section we discuss the bitangential interpolation problem.
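In the classical case of real or complex coefficients, the link between the convolution (1.1), the state space equations and the expansion in (1.7) can be checked numerically. The following sketch does so with NumPy for randomly chosen matrices; the dimensions, the random seed and the evaluation point are arbitrary assumptions, and the ring $\mathfrak{R}$ itself is not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, N, K = 3, 12, 60
A = 0.3 * rng.standard_normal((n_state, n_state))
B = rng.standard_normal((n_state, 1))
C = rng.standard_normal((1, n_state))
D = rng.standard_normal((1, 1))

# Impulse response read off from the expansion: h_0 = D, h_k = C A^{k-1} B for k >= 1
h = [D] + [C @ np.linalg.matrix_power(A, k - 1) @ B for k in range(1, K)]

# Output computed from the convolution (1.1)
u = rng.standard_normal((N, 1, 1))
y_conv = [sum(h[n - k] @ u[k] for k in range(n + 1)) for n in range(N)]

# Output computed from the state space equations with x_0 = 0
x = np.zeros((n_state, 1))
y_state = []
for n in range(N):
    y_state.append(C @ x + D @ u[n])
    x = A @ x + B @ u[n]

# Closed form of the transfer function against its truncated power series
lam = 0.2 + 0.1j
closed = D + lam * C @ np.linalg.inv(np.eye(n_state) - lam * A) @ B
series = sum(lam**k * h[k] for k in range(K))
```

Both comparisons (`y_conv` against `y_state`, and `closed` against `series`) agree up to rounding, the second one because $|\lambda|$ is small enough for the series to converge quickly.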

2. The white noise space

We here review Hida’s white noise space theory and the associated spaces of stochastic distributions introduced by Kondratiev. See [18], [19], [20], [23]. Let $\mathcal{S}(\mathbb{R})$ denote the Schwartz space of smooth real-valued rapidly decreasing functions. It is a nuclear space, and by the Bochner-Minlos theorem (see [15, Théorème 2, p. 342]), there exists a probability measure $P$ on the Borel sets $\mathcal{B}$ of the dual space $\mathcal{S}(\mathbb{R})' \stackrel{\text{def.}}{=} \Omega$ such that
\[
e^{-\frac{\|s\|^2_{\mathbf{L}_2(\mathbb{R})}}{2}} = \int_\Omega e^{i\langle \omega, s\rangle}\, dP(\omega), \qquad \forall s \in \mathcal{S}(\mathbb{R}), \tag{2.8}
\]

where the brackets $\langle \cdot, \cdot \rangle$ denote the duality between $\mathcal{S}(\mathbb{R})$ and $\mathcal{S}(\mathbb{R})'$. The probability space
\[
\mathcal{W} = (\Omega, \mathcal{B}, dP)
\]
is called the white noise probability space. We will be interested in particular in $\mathbf{L}_2(\mathcal{W})$, called the white noise space. For $s \in \mathcal{S}(\mathbb{R})$, let $Q_s$ denote the random variable
\[
Q_s(\omega) = \langle \omega, s \rangle.
\]
It follows from (2.8) that
\[
\|s\|_{\mathbf{L}_2(\mathbb{R})} = \|Q_s\|_{\mathbf{L}_2(\mathcal{W})}.
\]
Therefore, $Q_s$ extends continuously to an isometry from $\mathbf{L}_2(\mathbb{R})$ into $\mathbf{L}_2(\mathcal{W})$.

In the presentation of the Gelfand triple associated to the white noise space which we will use, we follow [20]. The white noise space $\mathbf{L}_2(\mathcal{W})$ admits a special orthogonal basis $(H_\alpha)_{\alpha\in\ell}$, indexed by the set $\ell$ and built in terms of the Hermite functions $\tilde{h}_k$ and of the Hermite polynomials $h_k$, defined by
\[
H_\alpha(\omega) = \prod_{k=1}^{\infty} h_{\alpha_k}\big(Q_{\tilde{h}_k}(\omega)\big).
\]
We refer to [20, Definition 2.2.1 p. 19] for more information. In terms of this basis, any element of $\mathbf{L}_2(\mathcal{W})$ can be written as
\[
F = \sum_{\alpha\in\ell} f_\alpha H_\alpha, \qquad f_\alpha \in \mathbb{C}, \tag{2.9}
\]
with
\[
\|F\|^2_{\mathbf{L}_2(\mathcal{W})} = \sum_{\alpha\in\ell} |f_\alpha|^2\, \alpha! < \infty.
\]
There are quite a number of Gelfand triples associated to $\mathbf{L}_2(\mathcal{W})$. In our previous works [2], [5], and in the present one, we focus on the one consisting of the Kondratiev space $S_1$ of stochastic test functions, of $\mathbf{L}_2(\mathcal{W})$, and of the Kondratiev space $S_{-1}$ of stochastic distributions. To define these spaces we first introduce for $k \in \mathbb{N}$ the Hilbert space $\mathcal{H}_k$ which consists of series of the form (1.5) such that
\[
\|F\|_k \stackrel{\text{def.}}{=} \Big( \sum_{\alpha\in\ell} (\alpha!)^2 |f_\alpha|^2 (2\mathbb{N})^{k\alpha} \Big)^{1/2} < \infty, \tag{2.10}
\]
and the Hilbert spaces $\mathcal{H}_k'$ consisting of sequences $(f_\alpha)_{\alpha\in\ell}$ such that
\[
\|F\|_k' \stackrel{\text{def.}}{=} \Big( \sum_{\alpha\in\ell} |f_\alpha|^2 (2\mathbb{N})^{-k\alpha} \Big)^{1/2} < \infty.
\]
We note that, for $F \in \mathcal{H}_k'$ we have
\[
\lim_{\substack{p\to\infty \\ p \ge k}} \|F\|_p' = |f_{(0,0,0,\ldots)}|, \tag{2.11}
\]
as can be seen, for instance, by applying the dominated convergence theorem to an appropriate discrete measure. Following the usage in the literature, we will also write the elements of $\mathcal{H}_k'$ as formal power series $\sum_{\alpha\in\ell} f_\alpha H_\alpha$. Note that $(\mathcal{H}_k)_{k\in\mathbb{N}}$ forms a decreasing sequence of Hilbert spaces, with increasing norms, while $(\mathcal{H}_k')_{k\in\mathbb{N}}$ forms an increasing sequence of Hilbert spaces, with decreasing norms. The spaces $S_1$ and $S_{-1}$ are defined by the corresponding projective and inductive limits
\[
S_1 = \bigcap_{k=1}^{\infty} \mathcal{H}_k \qquad \text{and} \qquad S_{-1} = \bigcup_{k=1}^{\infty} \mathcal{H}_k'.
\]

The Wick product is defined on the basis $(H_\alpha)_{\alpha\in\ell}$ by
\[
H_\alpha \diamond H_\beta = H_{\alpha+\beta}.
\]
It extends to an everywhere defined and continuous map from $S_1 \times S_1$ into $S_1$ and from $S_{-1} \times S_{-1}$ into $S_{-1}$. (The continuity properties are proved in [7] for a more general family of rings.) Let $l > 0$, and let $k > l + 1$. Consider $h \in \mathcal{H}_l'$ and $u \in \mathcal{H}_k'$. Then, Våge’s inequality holds:
\[
\|h \diamond u\|_k' \le A(k-l)\, \|h\|_l'\, \|u\|_k', \tag{2.12}
\]
where
\[
A(k-l) = \Big( \sum_{\alpha\in\ell} (2\mathbb{N})^{(l-k)\alpha} \Big)^{1/2} < \infty. \tag{2.13}
\]
See [20, Proposition 3.3.2 p. 118]. The following result is a direct consequence of (2.12) and will be useful in the sequel.

Lemma 2.1. Let $F \in \mathcal{H}_p'$. Then, $F^{\diamond n} \stackrel{\text{def.}}{=} \underbrace{F \diamond \cdots \diamond F}_{n \text{ times}} \in \mathcal{H}_{p+2}'$ and
\[
\|F^{\diamond n}\|_{p+2}' \le \frac{1}{A(2)} \big( A(2) \|F\|_p' \big)^n, \qquad n = 1, 2, 3, \ldots \tag{2.14}
\]

Proof. We proceed by induction. The case $n = 1$ holds since
\[
\|F\|_{p+2}' \le \|F\|_p', \qquad \text{for } F \in \mathcal{H}_p'.
\]
Assume now that (2.14) holds at rank $n$. Then, from (2.12) we have
\[
\begin{aligned}
\|F^{\diamond (n+1)}\|_{p+2}' &\le A(2)\, \|F\|_p'\, \|F^{\diamond n}\|_{p+2}' \\
&\le A(2)\, \|F\|_p'\, \frac{1}{A(2)} \big( A(2) \|F\|_p' \big)^n \\
&= \frac{1}{A(2)} \big( A(2) \|F\|_p' \big)^{n+1}.
\end{aligned}
\]
□
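Inequality (2.12) and the bound (2.14) can be tested numerically on finitely supported elements. In the sketch below the constant $A(2)$ is evaluated in closed form: since the sum in (2.13) factorizes over the variables, $A(2)^2 = \prod_{j\ge 1} \big(1 - (2j)^{-2}\big)^{-1} = \pi/2$ by the Wallis product. This closed form, the choice of three variables, the random coefficients and the function names are our own illustrative assumptions and are not taken from [20].

```python
import itertools
import math
import random

def weight(alpha, q):
    """(2N)^{q*alpha} = prod_j (2j)^{q*alpha_j}, with j = 1, 2, ..."""
    return math.prod((2.0 * (j + 1)) ** (q * a) for j, a in enumerate(alpha))

def norm_prime(f, k):
    """The norm ||f||'_k = (sum |f_alpha|^2 (2N)^{-k*alpha})^{1/2}."""
    return math.sqrt(sum(abs(c) ** 2 * weight(a, -k) for a, c in f.items()))

def wick(f, g):
    """Wick product, using H_a <> H_b = H_{a+b}."""
    out = {}
    for a, fa in f.items():
        for b, gb in g.items():
            c = tuple(x + y for x, y in zip(a, b))
            out[c] = out.get(c, 0.0) + fa * gb
    return out

A2 = math.sqrt(math.pi / 2)   # A(2), via the Wallis product

random.seed(1)
support = list(itertools.product(range(3), repeat=3))   # 3 variables, exponents 0..2
h = {a: random.uniform(-1, 1) for a in support}
u = {a: random.uniform(-1, 1) for a in support}

# Inequality (2.12) with l = 1 and k = 3, so that k - l = 2
l, k = 1, 3
lhs = norm_prime(wick(h, u), k)
rhs = A2 * norm_prime(h, l) * norm_prime(u, k)

# Bound (2.14) for the Wick powers of F = h, with p = 1 and n = 1, ..., 4
p = 1
powers_ok = []
Fn = dict(h)
for n in range(1, 5):
    bound = (A2 * norm_prime(h, p)) ** n / A2
    powers_ok.append(norm_prime(Fn, p + 2) <= bound)
    Fn = wick(Fn, h)
```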

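3. The ring $\mathfrak{R}$

The Kondratiev space $S_{-1}$ endowed with the Wick product is a commutative ring of sequences $(c_\alpha)_{\alpha\in\ell}$, with properties $(P_1)$, $(P_2)$ and $(P_3)$, where in $(P_1)$ one understands by evaluation at the origin the first coefficient of the sequence. Using the Hermite transform (defined below), we view $S_{-1}$ as a ring of power series in infinitely many variables. We point out that there are other commutative rings of sequences with properties $(P_1)$, $(P_2)$ and $(P_3)$, and for which a counterpart of inequality (2.12) holds. See [7]. The Hermite transform is defined by
\[
I(H_\alpha) = z^\alpha, \qquad \text{with } \alpha \in \ell \text{ and } z = (z_1, z_2, \ldots) \in \mathbb{C}^{\mathbb{N}}.
\]
Then
\[
I(H_\alpha \diamond H_\beta) = I(H_\alpha)\, I(H_\beta).
\]
It extends for $F = \sum_{\alpha\in\ell} a_\alpha H_\alpha \in S_{-1}$ by the formula $I(F)(z) = \sum_{\alpha\in\ell} a_\alpha z^\alpha$, and converges in sets of the form
\[
K_p(R) = \Big\{ z \in \mathbb{C}^{\mathbb{N}} : \sum_{\alpha \neq 0} |z|^\alpha (2\mathbb{N})^{p\alpha} < R^2 \Big\},
\]
where $p$ is such that $F \in \mathcal{H}_p'$. The Kondratiev space $S_{-1}$ is closed under the Wick product, and we have
\[
I(F \diamond G)(z) = I(F)(z)\, I(G)(z) \qquad \text{and} \qquad I(F + G)(z) = I(F)(z) + I(G)(z)
\]
for any $F, G \in S_{-1}$. Therefore the image of the Kondratiev space $S_{-1}$ under the Hermite transform is a commutative ring, denoted by
\[
\mathfrak{R} \stackrel{\text{def}}{=} \operatorname{Im}(I(S_{-1})).
\]
This ring was introduced in [6]. We transpose to it via the Hermite transform the properties of $S_{-1}$. We have
\[
\mathfrak{R} = \bigcup_{k=1}^{\infty} I(\mathcal{H}_k').
\]
We define the adjoint $\mathbf{G}^* = (\mathbf{h}_{st}) \in \mathfrak{R}^{m\times n}$ of $\mathbf{G} = (\mathbf{g}_{ts}) \in \mathfrak{R}^{n\times m}$ by $\mathbf{h}_{st}(z) = \mathbf{g}_{ts}^*(z)$ ($t \in \{1, \ldots, n\}$ and $s \in \{1, \ldots, m\}$). Then for $\mathbf{A} \in \mathfrak{R}^{n\times m}$ and $\mathbf{B} \in \mathfrak{R}^{m\times u}$ we have
\[
(\mathbf{A}\mathbf{B})^* = \mathbf{B}^* \mathbf{A}^*. \tag{3.15}
\]
Note that $\mathbf{G}^*(0) = \mathbf{G}(0)^*$, where $\mathbf{G}(0)^*$ is the usual adjoint matrix.

Definition 3.1. An element $\mathbf{A} \in \mathfrak{R}^{n\times n}$ is said to be strictly positive, $\mathbf{A} > 0$, if it can be written as $\mathbf{A} = \mathbf{G}\mathbf{G}^*$, where $\mathbf{G} \in \mathfrak{R}^{n\times n}$ is invertible. It is said to be positive if $\mathbf{G}$ is not assumed to be invertible.

Lemma 3.2. Let $\mathbf{A} \in \mathfrak{R}^{n\times n}$. Then, $\mathbf{A}$ is strictly positive if and only if $\mathbf{A}(0) \in \mathbb{C}^{n\times n}$ is a strictly positive matrix (in the usual sense).

Proof. If $\mathbf{A} = \mathbf{G}\mathbf{G}^*$ with $\det \mathbf{G}(0) \neq 0$, then $\mathbf{A}(0) = \mathbf{G}(0)\mathbf{G}(0)^*$ is a strictly positive matrix. Conversely, assume that $\mathbf{A} \in \mathfrak{R}^{n\times n}$ is such that $\mathbf{A}(0) > 0$. We write
\[
\begin{aligned}
\mathbf{A}(z) &= \mathbf{A}(0) + (\mathbf{A}(z) - \mathbf{A}(0)) \\
&= \sqrt{\mathbf{A}(0)} \Big\{ I_n + \big(\sqrt{\mathbf{A}(0)}\big)^{-1} (\mathbf{A}(z) - \mathbf{A}(0)) \big(\sqrt{\mathbf{A}(0)}\big)^{-1} \Big\} \sqrt{\mathbf{A}(0)}.
\end{aligned}
\]
Note that $\mathbf{E}(z) = \big(\sqrt{\mathbf{A}(0)}\big)^{-1} (\mathbf{A}(z) - \mathbf{A}(0)) \big(\sqrt{\mathbf{A}(0)}\big)^{-1}$ vanishes at $z = (0, 0, \ldots)$. Property $(P_2)$ with
\[
f(\zeta) = 1 + \frac{1}{2}\zeta - \frac{1}{2\cdot 4}\zeta^2 + \frac{1\cdot 3}{2\cdot 4\cdot 6}\zeta^3 + \cdots = (1 + \zeta)^{1/2}, \qquad |\zeta| < 1,
\]
implies that $\mathbf{A} = \mathbf{C}\mathbf{C}^*$, where $\mathbf{C} = \sqrt{\mathbf{A}(0)}\, f(\mathbf{E})$. □

Similarly, if $\mathbf{A}$ is positive, then $\mathbf{A}(0)$ is also positive, but the converse statement need not hold. Take for instance $n = 1$ and $\mathbf{A}(z) = z_1$. Then it is readily seen that one cannot find $\mathbf{r} \in \mathfrak{R}$ such that $z_1 = \mathbf{r}^*(z)\mathbf{r}(z)$.

We denote the ring of polynomials with coefficients in $\mathfrak{R}$ by $\mathfrak{R}[\lambda]$. To avoid confusion between the variable $\lambda$ and the variables $z$ we introduce the notation
\[
\mathcal{I}(\mathbf{r}) = \mathbf{r}(0), \qquad \mathbf{r} \in \mathfrak{R}.
\]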
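The construction in the proof of Lemma 3.2 can be followed step by step in the scalar case $n = 1$ on truncated power series. The sketch below forms $\mathbf{E} = (\mathbf{A} - \mathbf{A}(0))/\mathbf{A}(0)$, sums the binomial series of $(1+\zeta)^{1/2}$ at $\mathbf{E}$ up to a chosen total degree, and multiplies by $\sqrt{\mathbf{A}(0)}$. The sample element, the truncation degree and the helper names are hypothetical; the coefficients are real, so that $\mathbf{C}^* = \mathbf{C}$ and the check reduces to $\mathbf{C}\mathbf{C} = \mathbf{A}$ modulo higher-order terms.

```python
import math

def mul(f, g, deg):
    """Product of power series {multi-index: coeff}, dropping total degree > deg."""
    out = {}
    for a, fa in f.items():
        for b, gb in g.items():
            c = tuple(x + y for x, y in zip(a, b))
            if sum(c) <= deg:
                out[c] = out.get(c, 0.0) + fa * gb
    return out

def sqrt_positive(a, deg):
    """C = sqrt(a(0)) * f(E) with f(zeta) = (1 + zeta)^{1/2}; assumes a(0) > 0."""
    nvars = len(next(iter(a)))
    zero = (0,) * nvars
    a0 = a[zero]
    e = {k: v / a0 for k, v in a.items() if k != zero}   # E = (A - A(0)) / A(0)
    coeff, power = 1.0, {zero: 1.0}
    f_e = {zero: 1.0}
    for p in range(deg):
        coeff *= (0.5 - p) / (p + 1)     # next Taylor coefficient of (1 + zeta)^{1/2}
        power = mul(power, e, deg)       # E^{p+1}
        for k, v in power.items():
            f_e[k] = f_e.get(k, 0.0) + coeff * v
    root = math.sqrt(a0)
    return {k: root * v for k, v in f_e.items()}

a = {(0, 0): 4.0, (1, 0): 1.0, (0, 1): 2.0}   # hypothetical A(z) = 4 + z1 + 2 z2
c = sqrt_positive(a, 5)
square = mul(c, c, 5)                          # agrees with A up to total degree 5
```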

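Definition 3.3. A rational function with values in $\mathfrak{R}^{n\times m}$ is an expression of the form
\[
\mathbf{R}(\lambda) = \mathbf{p}(\lambda)(\mathbf{q}(\lambda))^{-1}, \tag{3.16}
\]
where $\mathbf{p} \in (\mathfrak{R}[\lambda])^{n\times m}$, and $\mathbf{q} \in \mathfrak{R}[\lambda]$ is such that $\mathcal{I}(\mathbf{q}(\lambda)) \not\equiv 0$.

Let $\mathbf{R} \in \mathfrak{R}^{n\times m}(\lambda)$. Then, $\mathcal{I}(\mathbf{R}) \in \mathbb{C}^{n\times m}(\lambda)$, and it is readily seen that
\[
(\mathcal{I}(\mathbf{R}))(\lambda) = \mathcal{I}(\mathbf{R}(\lambda)). \tag{3.17}
\]
It is proved in [6] that every rational function with values in the matrices with entries in $\mathfrak{R}$ and for which $\mathcal{I}(\mathbf{q}(0)) \neq 0$ can be written as (1.7).

Example 3.4. Let $\mathbf{r} \in \mathfrak{R}$. The function
\[
F_{\mathbf{r}}(\lambda) = (\lambda - \mathbf{r})(1 - \lambda \mathbf{r}^*)^{-1} \in \mathfrak{R}(\lambda)
\]
is rational. It is defined for $\lambda \in \mathbb{C}$ such that $1 \neq \lambda (\mathcal{I}(\mathbf{r}))^*$.

The next example of a rational function need not be defined for $\lambda = 0$.

Example 3.5. Let $\mathbf{r} \in \mathfrak{R}$. The function
\[
F_{\mathbf{r}}(\lambda) = (\lambda - \mathbf{r})(\lambda - \mathbf{r}^*)^{-1} \in \mathfrak{R}(\lambda)
\]
is rational. It is defined for $\lambda \in \mathbb{C}$ such that $\lambda \neq (\mathcal{I}(\mathbf{r}))^*$.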

4. Analytic functions with values in $\mathfrak{R}$

It is possible to define analytic functions with values in a locally convex topological vector space (see for instance the discussion in [13, 14, 17, 16]). Here the structure of $\mathfrak{R}$ allows us to focus, locally, on the classical definition of Hilbert space-valued functions, as we now explain.

Proposition 4.1. Let $\Omega \subset \mathbb{C}$ be an open set and let $\mathbf{f} : \Omega \to \mathfrak{R}$ be a continuous function. Then, $\mathbf{f}$ is locally Hilbert space valued, that is, for every $\zeta_0 \in \Omega$, there is a compact neighborhood $K$ of $\zeta_0$ and a number $p_0$ such that $\mathbf{f}(K) \subset I(\mathcal{H}_{p_0}')$.

Proof. Every $\zeta_0 \in \Omega$ has a neighborhood $K$ of the form $B_\delta = \{\zeta \in \Omega \,;\, |\zeta_0 - \zeta| \le \delta\}$ for some $\delta > 0$. Since $B_\delta$ is a compact set and $\mathbf{f}$ is continuous, $\mathbf{f}(B_\delta)$ is compact in $\mathfrak{R}$, and therefore strongly bounded. See [12, p. 54]. Thus there exists $p_0 \in \mathbb{N}$ such that $\mathbf{f}(B_\delta) \subset I(\mathcal{H}_{p_0}')$ and is bounded in the norm of $I(\mathcal{H}_{p_0}')$. See [12, Section 5.3 p. 45]. □

Therefore we can define an analytic function from $\Omega$ to $\mathfrak{R}$ as a continuous function which locally admits a power series expansion with coefficients in one of the spaces $I(\mathcal{H}_p')$. The following example shows that we cannot expect to have a fixed $p$ in general.

Example 4.2. Let $\mathbf{f}(\lambda, z) = \sum_{n=1}^{\infty} n^{\frac{\lambda}{2}} z_n$. Then $\mathbf{f}$ is continuous (as a function of $\lambda$) from $\mathbb{C}$ into $\mathfrak{R}$, but there is no $p$ such that $\mathbf{f}(\lambda, z)$ (viewed now as a function of $z$) belongs to $I(\mathcal{H}_p')$ for all $\lambda \in \mathbb{C}$.

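Indeed, let $\lambda_0 \in \mathbb{C}$. We have
\[
\big(\|\mathbf{f}(\lambda_0)\|_p'\big)^2 = \sum_{n=1}^{\infty} |n^{\lambda_0}| (2n)^{-p} = 2^{-p} \sum_{n=1}^{\infty} n^{\operatorname{Re} \lambda_0 - p} < \infty
\]
for $p > \operatorname{Re} \lambda_0 + 1$. To show continuity at a point $\lambda_0 \in \mathbb{C}$, we take $p > |\lambda_0| + 2$, and restrict $\lambda$ to be such that $|\lambda_0 - \lambda| < 1$. Using the elementary estimate
\[
|e^{z_1} - e^{z_2}| \le |z_1 - z_2| \cdot \max_{z \in [z_1, z_2]} |e^z| \tag{4.1}
\]
for $z_1, z_2 \in \mathbb{C}$, we have for $n = 2, 3, \ldots$
\[
\big| n^{\frac{\lambda}{2}} - n^{\frac{\lambda_0}{2}} \big| \le (\ln n)\, \frac{|\lambda - \lambda_0|}{2}\, e^{\frac{|\lambda_0| + 1}{2} \ln n}
\]
and so
\[
\begin{aligned}
\big(\|\mathbf{f}(\lambda) - \mathbf{f}(\lambda_0)\|_{p+2}'\big)^2 &= 2^{-p-2} \sum_{n=2}^{\infty} \big| n^{\frac{\lambda}{2}} - n^{\frac{\lambda_0}{2}} \big|^2 n^{-p-2} \\
&\le 2^{-p-2}\, \frac{|\lambda - \lambda_0|^2}{4} \sum_{n=2}^{\infty} \frac{(\ln n)^2}{n^2}\, n^{|\lambda_0| + 1 - p},
\end{aligned}
\]
and hence the continuity at the point $\lambda_0$ in the norm $\|\cdot\|_{p+2}'$, and hence in $\mathfrak{R}$. See in particular [12, p. 57] for the latter.

Recall that, in the case of Hilbert space, weak and strong analyticity are equivalent, and can be expressed in terms of power series expansions. The argument uses the uniform boundedness theorem. See [24, Theorem VI.4, p. 189]. We define the evaluation of an $\mathfrak{R}$-valued analytic function at a point $\mathbf{r} \in \mathfrak{R}$. We first introduce
\[
\mathfrak{R}_\Omega = \{\mathbf{r} \in \mathfrak{R} \,;\, \mathcal{I}(\mathbf{r}) \in \Omega\},
\]
where $\Omega \subset \mathbb{C}$ is open.

Theorem 4.3. Let $\Omega$ be an open subset of $\mathbb{C}$, and let $\mathbf{f} : \Omega \to \mathfrak{R}$ be an analytic function. Let $\mathbf{r} \in \mathfrak{R}_\Omega$, and let
\[
\mathbf{f}(\zeta) = \sum_{n=0}^{\infty} \mathbf{f}_n (\zeta - \mathcal{I}(\mathbf{r}))^n \tag{4.2}
\]
be the Taylor expansion around $\mathcal{I}(\mathbf{r}) \in \Omega$, where the $\mathbf{f}_n \in \mathcal{H}_{p_0}'$ for some $p_0 \in \mathbb{N}$, and where the convergence is in $\mathcal{H}_{p_0}'$. The series
\[
\mathbf{f}(\mathbf{r}) = \sum_{n=0}^{\infty} \mathbf{f}_n (\mathbf{r} - \mathcal{I}(\mathbf{r}))^n \tag{4.3}
\]
converges in $\mathcal{H}_q'$ for some $q > p_0$.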

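Proof. Let $K$ be a compact neighborhood of $\mathcal{I}(\mathbf{r})$, and let $p_0 \in \mathbb{N}$ be such that $\mathbf{f}(K) \subset \mathcal{H}_{p_0}'$. Let furthermore $R$ be the radius of convergence of the $\mathcal{H}_{p_0}'$-valued power series (4.2). In view of (2.11), there exists $p$, which we can assume strictly larger than $p_0$, such that
\[
A(2)\, \|\mathbf{r} - \mathcal{I}(\mathbf{r})\|_p' < R. \tag{4.4}
\]
On the other hand, using (2.12) and (2.14), we obtain
\[
\begin{aligned}
\|\mathbf{f}_n (\mathbf{r} - \mathcal{I}(\mathbf{r}))^n\|_{p+2}' &\le A(2)\, \|\mathbf{f}_n\|_{p_0}'\, \|(\mathbf{r} - \mathcal{I}(\mathbf{r}))^n\|_{p+2}' \\
&\le \|\mathbf{f}_n\|_{p_0}' \big( A(2)\, \|\mathbf{r} - \mathcal{I}(\mathbf{r})\|_p' \big)^n.
\end{aligned}
\]
In view of (4.4), the series
\[
\sum_{n=0}^{\infty} \|\mathbf{f}_n\|_{p_0}' \big( A(2)\, \|\mathbf{r} - \mathcal{I}(\mathbf{r})\|_p' \big)^n
\]
converges and so the series
\[
\sum_{n=0}^{\infty} \mathbf{f}_n (\mathbf{r} - \mathcal{I}(\mathbf{r}))^n
\]
converges absolutely in $I(\mathcal{H}_{p+2}')$.
□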
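As an illustration of the evaluation series (4.3), the following sketch treats the special case where the Taylor coefficients are complex numbers, here $f(\zeta) = e^\zeta$ with $f_n = e^{\zeta_0}/n!$ at the center $\zeta_0 = \mathbf{r}(0, 0, \ldots)$, and where $\mathbf{r}$ is a polynomial in two variables, so that truncating at a fixed total degree makes the sum finite. These choices, the truncation degree and the helper names are assumptions; the check that $e^{\mathbf{r}} e^{-\mathbf{r}} = 1$ modulo higher-order terms is a sanity test, not a statement from the paper.

```python
import cmath
import math

def mul(f, g, deg):
    """Product of power series {multi-index: coeff}, dropping total degree > deg."""
    out = {}
    for a, fa in f.items():
        for b, gb in g.items():
            c = tuple(x + y for x, y in zip(a, b))
            if sum(c) <= deg:
                out[c] = out.get(c, 0.0) + fa * gb
    return out

def evaluate(taylor, r, deg):
    """f(r) = sum_n f_n (r - r0)^n modulo total degree > deg, where r0 = r(0)
    and taylor(n, r0) returns the n-th Taylor coefficient of f at r0."""
    nvars = len(next(iter(r)))
    zero = (0,) * nvars
    r0 = r.get(zero, 0.0)
    shifted = {a: c for a, c in r.items() if a != zero}   # r - r(0)
    power = {zero: 1.0}
    out = {zero: taylor(0, r0)}
    for n in range(1, deg + 1):
        power = mul(power, shifted, deg)
        fn = taylor(n, r0)
        for a, c in power.items():
            out[a] = out.get(a, 0.0) + fn * c
    return out

def exp_taylor(n, center):
    """n-th Taylor coefficient of the exponential function at the given center."""
    return cmath.exp(center) / math.factorial(n)

r = {(0, 0): 0.3 + 0.2j, (1, 0): 1.0, (0, 1): -2.0}   # hypothetical element of the ring
minus_r = {a: -c for a, c in r.items()}
product = mul(evaluate(exp_taylor, r, 5), evaluate(exp_taylor, minus_r, 5), 5)
```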

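The evaluation of $\mathbf{f}$ at $\mathbf{r}$ is defined to be $\mathbf{f}(\mathbf{r})$ given by (4.3).

Proposition 4.4. We can rewrite the evaluation at $\mathbf{r}$ as a Cauchy integral
\[
\mathbf{f}(\mathbf{r}) = \frac{1}{2\pi i} \oint \frac{\mathbf{f}(\zeta)}{\zeta - \mathbf{r}}\, d\zeta,
\]
where the integration is along a circle centered at $\mathcal{I}(\mathbf{r})$, of radius $r < R$, and contained in $\Omega$.

Proof. As in Theorem 4.3 we consider a compact neighborhood $K$ of $\mathcal{I}(\mathbf{r})$, and let $p_0$ be such that $\mathbf{f}(K) \subset \mathcal{H}_{p_0}'$. We consider a circle centered at $\mathcal{I}(\mathbf{r})$ which lies inside $K$. We have
\[
\begin{aligned}
\frac{1}{2\pi i} \oint \frac{\mathbf{f}(\zeta)}{\zeta - \mathbf{r}}\, d\zeta
&= \frac{1}{2\pi i} \oint \frac{\mathbf{f}(\zeta)}{\zeta - \mathcal{I}(\mathbf{r}) + \mathcal{I}(\mathbf{r}) - \mathbf{r}}\, d\zeta \\
&= \frac{1}{2\pi i} \oint \frac{\mathbf{f}(\zeta)}{\zeta - \mathcal{I}(\mathbf{r})} \Big\{ \sum_{n=0}^{\infty} \Big( \frac{\mathbf{r} - \mathcal{I}(\mathbf{r})}{\zeta - \mathcal{I}(\mathbf{r})} \Big)^n \Big\}\, d\zeta \\
&= \frac{1}{2\pi i} \sum_{n=0}^{\infty} (\mathbf{r} - \mathcal{I}(\mathbf{r}))^n \Big\{ \oint \frac{\mathbf{f}(\zeta)}{(\zeta - \mathcal{I}(\mathbf{r}))^{n+1}}\, d\zeta \Big\},
\end{aligned}
\]
where we have used the estimates as in the proof of Theorem 4.3 and the dominated convergence theorem to justify the interchange of integration and summation. By Cauchy’s formula for the Taylor coefficients of (4.2), the last expression is equal to the series (4.3). □

Recall that a function $f$ analytic and contractive in the open unit disk is called a Schur function. Furthermore, by the maximum modulus principle, $f$ is in fact strictly contractive in $\mathbb{D}$, unless it is identically equal to a unitary constant.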

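We will call a function $\mathbf{f}$ analytic from the open unit disk $\mathbb{D}$ into $\mathfrak{R}$ a Schur function (notation: $\mathbf{f} \in S_{\mathfrak{R}}$) if the function $\lambda \mapsto \mathcal{I}(\mathbf{f}(\lambda))$ is a Schur function. For instance, both $1 + z_1 z_3$ and $0.5 + 10 z_1 - 3 z_5$ are Schur functions. We now define the analog of the open unit disk by
\[
\mathfrak{R}_{\mathbb{D}} = \{\mathbf{r} \in \mathfrak{R} \,;\, \mathcal{I}(\mathbf{r}) \in \mathbb{D}\},
\]
and the analog of strictly contractive Schur functions as the set of analytic functions $\mathbf{f} : \mathbb{D} \to \mathfrak{R}$ such that the function $\lambda \mapsto \mathcal{I}(\mathbf{f}(\lambda))$ is a strictly contractive Schur function.

Theorem 4.5. $\mathbf{f} \in S_{\mathfrak{R}}$ is a strictly contractive Schur function if and only if $\mathbf{f} : \mathbb{D} \to \mathfrak{R}_{\mathbb{D}}$ is analytic.

Proof. If $\mathbf{f}$ is analytic from $\mathbb{D}$ into $\mathfrak{R}$, and such that $\lambda \mapsto \mathcal{I}(\mathbf{f}(\lambda))$ is a strictly contractive Schur function, it means by definition that the range of $\mathbf{f}$ lies inside $\mathfrak{R}_{\mathbb{D}}$. Conversely, let $\mathbf{f} : \mathbb{D} \to \mathfrak{R}$ be analytic and such that $\mathcal{I}(\mathbf{f})$ is a strictly contractive Schur function. Then for every $0 < r < 1$, there exists $k \in \mathbb{N}$ (which may depend on $r$) such that $\mathbf{f}(|\lambda| \le r) \subset I(\mathcal{H}_k')$. We can write $\mathbf{f}$ as
\[
\mathbf{f}(\lambda) = \sum_{n=0}^{\infty} \lambda^n \mathbf{f}_n,
\]
where $|\lambda| < r$ and $\mathbf{f}_n \in I(\mathcal{H}_k')$. Now $\mathcal{I}(\mathbf{f})(\lambda) = \sum_{n=0}^{\infty} \lambda^n \mathcal{I}(\mathbf{f}_n)$ for $|\lambda| < r$. Since this holds for all $r \in (0, 1)$ the function $\lambda \mapsto \mathbf{f}(\lambda)$ has range inside $\mathfrak{R}_{\mathbb{D}}$. □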

5. Nevanlinna-Pick Interpolation

In this section we solve the following interpolation problem $(IP)$.

Problem 5.1. Given $n\in\mathbb{N}$ and points $\mathbf{a}_1,\dots,\mathbf{a}_n,\mathbf{b}_1,\dots,\mathbf{b}_n\in\mathfrak{R}_{\mathbb{D}}$, find all Schur functions $\mathbf{f}$ with coefficients in $\mathfrak{R}$ such that $\mathbf{f}(\mathbf{a}_i)=\mathbf{b}_i$ for $i=1,2,\dots,n$.

The solution of this problem, under the assumption that some matrix is strictly positive, is presented in Theorem 5.3 below. We first give some preliminary arguments, and note that if $\mathbf{f}$ is a solution of the interpolation problem 5.1, then $f=I(\mathbf{f})$ is a solution of the classical interpolation problem
\[
f(a_i)=b_i,\qquad i=1,\dots,n, \tag{5.5}
\]
where we have set
\[
a_i=I(\mathbf{a}_i)\quad\text{and}\quad b_i=I(\mathbf{b}_i),\qquad i=1,\dots,n.
\]
This last problem is solved as follows: let $P$ denote the $n\times n$ Hermitian matrix with $ij$ entry equal to
\[
\frac{1-b_ib_j^{*}}{1-a_ia_j^{*}}. \tag{5.6}
\]
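For concrete scalar data the Pick matrix (5.6) is easy to form and test for strict positivity. The sketch below is only an illustration: the helper names and the sample nodes are assumptions, and the positivity test is a plain Cholesky attempt rather than anything taken from the paper.

```python
def pick_matrix(a, b):
    """Pick matrix with entries (1 - b_i conj(b_j)) / (1 - a_i conj(a_j)), as in (5.6)."""
    n = len(a)
    return [[(1 - b[i] * b[j].conjugate()) / (1 - a[i] * a[j].conjugate())
             for j in range(n)] for i in range(n)]

def is_positive_definite(p, tol=1e-12):
    """Attempt a Cholesky factorization P = L L^*; it succeeds exactly when P > 0."""
    n = len(p)
    low = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k].conjugate() for k in range(j))
            if i == j:
                d = (p[i][i] - s).real
                if d <= tol:
                    return False
                low[i][i] = complex(d ** 0.5)
            else:
                low[i][j] = (p[i][j] - s) / low[j][j]
    return True

# Data sampled from the Schur function f(z) = z/2: the Pick matrix is strictly positive.
nodes = [0.1 + 0j, 0.5j, -0.3 + 0.2j]
good = is_positive_definite(pick_matrix(nodes, [z / 2 for z in nodes]))

# Data violating the Schwarz lemma (f(0) = 0 forces |f(0.1)| <= 0.1): no Schur solution.
bad = is_positive_definite(pick_matrix([0j, 0.1 + 0j], [0j, 0.9 + 0j]))
```

The first data set passes the test and the second fails it, in line with the classical solvability criterion recalled in the text.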

A necessary and sufficient condition for (5.5) to have a solution in the family of Schur functions is that $P\ge 0$. We will assume $P>0$. Set, in the notation of the introduction,
\[
A_\zeta^{*}=A=\operatorname{diag}(a_1^{*},a_2^{*},\dots,a_n^{*}),\qquad
-\begin{pmatrix}B_+\\ B_-\end{pmatrix}=\begin{pmatrix}1&1&\cdots&1\\ b_1^{*}&b_2^{*}&\cdots&b_n^{*}\end{pmatrix}\stackrel{\mathrm{def}}{=}C,\qquad
J=\begin{pmatrix}1&0\\ 0&-1\end{pmatrix}. \tag{5.7}
\]
Furthermore, specializing the formula for $\Theta$ given in the introduction with $z_0=1$, or using the formula arising from the theory of reproducing kernel Hilbert spaces (see [11], [1]), set
\[
\Theta(\lambda)=I_2-(1-\lambda)\,C(I_n-\lambda A)^{-1}P^{-1}(I-A)^{-*}C^{*}J\stackrel{\mathrm{def}}{=}\begin{pmatrix}a(\lambda)&b(\lambda)\\ c(\lambda)&d(\lambda)\end{pmatrix}.
\]
We now gather the main properties of the matrix-valued function $\Theta$ relevant to the present work. For proofs, we refer to [1], [10], [11].

Proposition 5.2. The following hold:
(a) The matrix-valued function $\Theta$ is $J$-inner with respect to the open unit disk.
(b) $\Theta$ has no poles in $\mathbb{D}$ and $c(\lambda)\sigma+d(\lambda)\neq 0$ for all $\lambda\in\mathbb{D}$ and all $\sigma$ in the closed unit disk.
(c) The identity
\[
\begin{pmatrix}1&-b_i\end{pmatrix}\Theta(a_i)=0,\qquad i=1,\dots,n, \tag{5.8}
\]
is valid.
(d) The linear fractional transformation
\[
T_{\Theta(\lambda)}(\sigma(\lambda))\stackrel{\mathrm{def}}{=}\frac{a(\lambda)\sigma(\lambda)+b(\lambda)}{c(\lambda)\sigma(\lambda)+d(\lambda)}
\]

describes the set of all solutions of the problem (5.5) in the family of Schur functions when $\sigma$ varies in the family of Schur functions.

To solve the interpolation problem 5.1 we introduce the matrices $\mathbf{A}$, $\mathbf{C}$ and $\mathbf{P}$, with entries in $\mathfrak{R}$, built by formulas (5.6) and (5.7), but with $\mathbf{a}_1,\dots,\mathbf{a}_n,\mathbf{b}_1,\dots,\mathbf{b}_n$ instead of $a_1,\dots,a_n,b_1,\dots,b_n$. Note that $\mathbf{P}>0$ since $P>0$, and we can define the $\mathfrak{R}^{2\times 2}$-valued function $\boldsymbol{\Theta}$ as $\Theta$ but with $\mathbf{A}$, $\mathbf{C}$ and $\mathbf{P}$ instead of $A$, $C$ and $P$. We have
\[
I(\mathbf{A})=A,\qquad I(\mathbf{C})=C,\qquad I(\mathbf{P})=P.
\]
Furthermore,
\[
I(\boldsymbol{\Theta}(\lambda))=\Theta(\lambda). \tag{5.9}
\]
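For a single interpolation node ($n=1$) the scalar function $\Theta$ can be written out directly, and items (c) and (d) of Proposition 5.2 can be checked numerically. The following is a minimal sketch under that restriction; the function names and the sample values of $a$, $b$ and $\sigma$ are illustrative assumptions.

```python
def theta(lam, a, b):
    """Scalar Theta(lam) for n = 1: A = conj(a), C = (1, conj(b))^T, P = (1-|b|^2)/(1-|a|^2)."""
    p = (1 - abs(b) ** 2) / (1 - abs(a) ** 2)
    # scalar factor (1 - lam) (1 - lam*A)^(-1) P^(-1) (I - A)^(-*)
    k = (1 - lam) / ((1 - lam * a.conjugate()) * p * (1 - a))
    c_col = (1, b.conjugate())   # the column C
    cj_row = (1, -b)             # the row C^* J
    return [[(1 if i == j else 0) - k * c_col[i] * cj_row[j] for j in range(2)]
            for i in range(2)]

def lft(th, sigma):
    """Linear fractional transformation T_Theta(sigma)."""
    return (th[0][0] * sigma + th[0][1]) / (th[1][0] * sigma + th[1][1])

a, b = 0.3 + 0.2j, 0.1 - 0.4j
th_a = theta(a, a, b)

# Item (c): the row vector (1, -b) annihilates Theta(a).
residual = [th_a[0][j] - b * th_a[1][j] for j in range(2)]

# Item (d): every parameter sigma in the closed unit disk yields f(a) = b.
values_at_a = [lft(th_a, sigma) for sigma in (0.5 + 0j, -0.7j, 1 + 0j)]
```

Whatever parameter is chosen, the value at the node is forced to be $b$, which is the scalar mechanism that Theorem 5.3 lifts to the ring $\mathfrak{R}$.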

Theorem 5.3. Assume P > 0. Then, there is a one-to-one correspondence between the solutions f of the problem 5.1 in 𝑆ℜ and the elements g ∈ 𝑆ℜ via the linear fractional transformation f = 𝑇Θ (g).

Proof. We first claim that the matrix-valued function $\boldsymbol{\Theta}$ satisfies the counterparts of (5.8), that is,
\[
\begin{pmatrix}1&-\mathbf{b}_i\end{pmatrix}\boldsymbol{\Theta}(\mathbf{a}_i)=0,\qquad i=1,\dots,n. \tag{5.10}
\]
This is done using the permanence of algebraic identities; see [9, p. 456] for the latter. Indeed, the matrix-valued function
\[
\det(I_n-\lambda A)\,\det(I_n-A^{*})\,(\det P)\Big(\prod_{\ell,j=1,\dots,n}(1-a_\ell a_j^{*})\Big)\Theta(\lambda)
\]
is a polynomial in $\lambda$ with coefficients which are themselves polynomials in the $a_i$ and the $b_j$, with integer coefficients. Therefore, multiplying both sides of (5.8) by the polynomial function
\[
\det(I_n-\lambda A)\,\det(I_n-A^{*})\,(\det P)\Big(\prod_{\ell,j=1,\dots,n}(1-a_\ell a_j^{*})\Big)
\]
evaluated at $\lambda=a_i$ ($i=1,2,\dots,n$), and taking the real and imaginary parts of the equalities (5.8), we obtain for each $i$ four polynomial identities in the $4n$ real variables $\operatorname{Re}a_j$, $\operatorname{Re}b_j$, $\operatorname{Im}a_j$, $\operatorname{Im}b_j$, with $j=1,\dots,n$, with integer coefficients, namely
\[
\begin{aligned}
\operatorname{Re}\Big\{\begin{pmatrix}1&-b_i\end{pmatrix}\det(I-a_iA)\,\det(I-A^{*})\,\det P\prod_{\ell,j=1,\dots,n}(1-a_\ell a_j^{*})\,\Theta(a_i)\Big\}&=\begin{pmatrix}0&0\end{pmatrix},\\
\operatorname{Im}\Big\{\begin{pmatrix}1&-b_i\end{pmatrix}\det(I-a_iA)\,\det(I-A^{*})\,\det P\prod_{\ell,j=1,\dots,n}(1-a_\ell a_j^{*})\,\Theta(a_i)\Big\}&=\begin{pmatrix}0&0\end{pmatrix}.
\end{aligned}
\]

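It follows (see [9, p. 456]) that these identities hold in any commutative ring, and in particular in $\mathfrak{R}$:
\[
\begin{aligned}
\operatorname{Re}\Big\{\begin{pmatrix}1&-\mathbf{b}_i\end{pmatrix}\det(I-\mathbf{a}_i\mathbf{A})\,\det(I-\mathbf{A}^{*})\,\det\mathbf{P}\prod_{\ell,j=1,\dots,n}(1-\mathbf{a}_\ell\mathbf{a}_j^{*})\,\boldsymbol{\Theta}(\mathbf{a}_i)\Big\}&=\begin{pmatrix}0&0\end{pmatrix},\\
\operatorname{Im}\Big\{\begin{pmatrix}1&-\mathbf{b}_i\end{pmatrix}\det(I-\mathbf{a}_i\mathbf{A})\,\det(I-\mathbf{A}^{*})\,\det\mathbf{P}\prod_{\ell,j=1,\dots,n}(1-\mathbf{a}_\ell\mathbf{a}_j^{*})\,\boldsymbol{\Theta}(\mathbf{a}_i)\Big\}&=\begin{pmatrix}0&0\end{pmatrix}.
\end{aligned}
\]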

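We now use the fact that we are in the ring $\mathfrak{R}$. Because of the choice of the $\mathbf{a}_j$, the element
\[
\det(I-\mathbf{a}_i\mathbf{A})\,\det(I-\mathbf{A}^{*})\prod_{\ell,j=1,\dots,n}(1-\mathbf{a}_\ell\mathbf{a}_j^{*})
\]
is invertible in $\mathfrak{R}$. When furthermore $\mathbf{P}>0$ we can divide both sides of the above equalities by
\[
\det(I-\mathbf{a}_i\mathbf{A})\,\det(I-\mathbf{A}^{*})\prod_{\ell,j=1,\dots,n}(1-\mathbf{a}_\ell\mathbf{a}_j^{*})\,\det\mathbf{P}
\]
and obtain (5.10). Let now $\mathbf{r}\in S_{\mathfrak{R}}$, and let $\mathbf{u}$, $\mathbf{v}$ be the analytic $\mathfrak{R}$-valued functions defined by
\[
\begin{pmatrix}\mathbf{u}(\lambda)\\ \mathbf{v}(\lambda)\end{pmatrix}
=\boldsymbol{\Theta}(\lambda)\begin{pmatrix}\mathbf{r}(\lambda)\\ 1\end{pmatrix}
=\begin{pmatrix}\mathbf{a}(\lambda)\mathbf{r}(\lambda)+\mathbf{b}(\lambda)\\ \mathbf{c}(\lambda)\mathbf{r}(\lambda)+\mathbf{d}(\lambda)\end{pmatrix}.
\]
Using (5.10) we have that
\[
\mathbf{u}(\mathbf{a}_i)=\mathbf{b}_i\,\mathbf{v}(\mathbf{a}_i),\qquad i=1,\dots,n.
\]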

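To conclude, we need to show that $\mathbf{v}(\mathbf{a}_i)$ is invertible in $\mathfrak{R}$ for $i=1,\dots,n$. But we have
\[
I(\mathbf{v}(\mathbf{a}_i))=c(a_i)\,I(\mathbf{r})(a_i)+d(a_i),\qquad i=1,\dots,n.
\]
The function $\Theta(\lambda)=I(\boldsymbol{\Theta}(\lambda))$ is $J$-unitary on the unit circle and has no poles there. Therefore, we have $c(a_i)\,I(\mathbf{r})(a_i)+d(a_i)\neq 0$ (see item (b) in Proposition 5.2), and hence $\mathbf{v}(\mathbf{a}_i)$ is invertible in $\mathfrak{R}$. Therefore $\mathbf{u}\mathbf{v}^{-1}=T_{\boldsymbol{\Theta}}(\mathbf{r})$ is a solution of the interpolation problem.

Assume now that $\mathbf{f}$ is a solution. Then, we know from the discussion before the theorem that there exists a Schur function $\sigma(\lambda)$ such that
\[
I(\mathbf{f}(\lambda))=T_{I(\boldsymbol{\Theta}(\lambda))}(\sigma(\lambda)). \tag{5.11}
\]
Define an $\mathfrak{R}$-valued function $\mathbf{r}$ by
\[
\mathbf{f}(\lambda)=T_{\boldsymbol{\Theta}(\lambda)}(\mathbf{r}(\lambda)).
\]
Taking $I$ on both sides of this expression we obtain
\[
I(\mathbf{f}(\lambda))=T_{I(\boldsymbol{\Theta}(\lambda))}(I(\mathbf{r}(\lambda))).
\]
Comparing with (5.11), we obtain $I(\mathbf{r}(\lambda))=\sigma(\lambda)$, and hence $\mathbf{r}\in S_{\mathfrak{R}}$. □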
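The permanence-of-identities step can be made tangible in a toy commutative ring. The sketch below uses dual numbers with complex coefficients ($\varepsilon^2=0$, with $I(x+y\varepsilon)=x$ and conjugation acting coefficientwise) as a stand-in for $\mathfrak{R}$; this stand-in, the class `Dual`, and the purely algebraic evaluation of $\boldsymbol{\Theta}$ at a ring element are assumptions made for illustration, again for a single node $n=1$.

```python
class Dual:
    """Element x + y*eps of the toy ring C[eps]/(eps^2); I(x + y*eps) = x."""
    def __init__(self, re, eps=0j):
        self.re, self.eps = complex(re), complex(eps)
    def __add__(self, o):
        return Dual(self.re + o.re, self.eps + o.eps)
    def __neg__(self):
        return Dual(-self.re, -self.eps)
    def __sub__(self, o):
        return Dual(self.re - o.re, self.eps - o.eps)
    def __mul__(self, o):
        return Dual(self.re * o.re, self.re * o.eps + self.eps * o.re)
    def inv(self):
        # invertible exactly when the scalar part I(x) is nonzero
        return Dual(1 / self.re, -self.eps / (self.re * self.re))
    def conj(self):
        return Dual(self.re.conjugate(), self.eps.conjugate())
    def size(self):
        return max(abs(self.re), abs(self.eps))

ONE, ZERO = Dual(1), Dual(0)

def theta_ring(lam, a, b):
    """Ring-valued Theta for n = 1, built from the same formulas as in the scalar case."""
    p = (ONE - b * b.conj()) * (ONE - a * a.conj()).inv()
    k = (ONE - lam) * ((ONE - lam * a.conj()) * p * (ONE - a)).inv()
    c_col = (ONE, b.conj())
    cj_row = (ONE, -b)
    ident = ((ONE, ZERO), (ZERO, ONE))
    return [[ident[i][j] - k * c_col[i] * cj_row[j] for j in range(2)]
            for i in range(2)]

a = Dual(0.3 + 0.2j, 1 - 2j)
b = Dual(0.1 - 0.4j, 0.5j)
th = theta_ring(a, a, b)

# Counterpart of (5.10): (1, -b) Theta(a) = 0 holds in the ring, eps part included.
residual = [th[0][j] - b * th[1][j] for j in range(2)]

# With a ring-valued parameter r, the quotient u v^(-1) takes the value b at a.
r = Dual(0.5, 0.2 - 0.1j)
u = th[0][0] * r + th[0][1]
v = th[1][0] * r + th[1][1]
f_at_a = u * v.inv()
```

The identity that was verified over the complex numbers continues to hold componentwise, and `v` is invertible because its scalar part is nonzero, which is the same invertibility argument as in the proof.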

6. More general interpolation problem

The matrix-valued function $\Theta$ defined by (1.3) and describing the set of solutions of the bitangential problem satisfies the conditions
\[
\begin{aligned}
\sum_{\lambda_0\in\mathbb{D}}\operatorname{Res}_{\lambda=\lambda_0}(\lambda I-A_\zeta)^{-1}\begin{pmatrix}B_+&B_-\end{pmatrix}\Theta(\lambda)&=0,\\
\sum_{\lambda_0\in\mathbb{D}}\operatorname{Res}_{\lambda=\lambda_0}\Theta(1/\lambda^{*})^{*}\begin{pmatrix}C_-\\ C_+\end{pmatrix}(\lambda I-A_\pi)^{-1}&=0.
\end{aligned}
\]
See also [4]. As for the Nevanlinna-Pick case, these conditions can be translated into a finite number of polynomial equations with coefficients in $\mathbb{Z}$, and the principle of permanence of identities allows one to extend these properties to the case of a commutative ring. On the other hand, we do not know how to extend the third interpolation property, and so the method is not applicable to the most general bitangential interpolation problem. If we restrict the parameter to be a constant contractive matrix, the third condition also translates into a polynomial identity with integer coefficients, and the same method can still be used. The case of functions with poles inside the open unit disk, and the degenerate cases, are more difficult to treat and will be considered elsewhere.

References

[1] D. Alpay. The Schur algorithm, reproducing kernel spaces and system theory. American Mathematical Society, Providence, RI, 2001. Translated from the 1998 French original by Stephen S. Wilson, Panoramas et Synthèses [Panoramas and Syntheses].
[2] D. Alpay, H. Attia, and D. Levanony. Une généralisation de l’intégrale stochastique de Wick-Itô. C. R. Math. Acad. Sci. Paris, 346(5-6):261–265, 2008.
[3] D. Alpay, H. Attia, and D. Levanony. On the characteristics of a class of gaussian processes within the white noise space setting. Stochastic processes and applications, 120:1074–1104, 2010.
[4] D. Alpay, P. Bruinsma, A. Dijksma, and H.S.V. de Snoo. Interpolation problems, extensions of symmetric operators and reproducing kernel spaces II. Integral Equations Operator Theory, 14:465–500, 1991.
[5] D. Alpay and D. Levanony. Linear stochastic systems: a white noise approach. Acta Applicandae Mathematicae, 110:545–572, 2010.
[6] D. Alpay, D. Levanony, and A. Pinhas. Linear stochastic state space theory in the white noise space setting. SIAM Journal of Control and Optimization, 48:5009–5027, 2010.
[7] D. Alpay and Guy Salomon. A family of commutative rings with a Våge’s inequality. Arxiv manuscript number http://arxiv.org/abs/1106.5746.
[8] Daniel Alpay and David Levanony. Linear stochastic systems: a white noise approach. Acta Appl. Math., 110(2):545–572, 2010.
[9] Michael Artin. Algebra. Prentice Hall Inc., Englewood Cliffs, NJ, 1991.
[10] J. Ball, I. Gohberg, and L. Rodman. Interpolation of rational matrix functions, volume 45 of Operator Theory: Advances and Applications. Birkhäuser Verlag, Basel, 1990.
[11] H. Dym. J-contractive matrix functions, reproducing kernel Hilbert spaces and interpolation. Published for the Conference Board of the Mathematical Sciences, Washington, DC, 1989.
[12] I.M. Gelfand and G.E. Shilov. Generalized functions. Volume 2. Academic Press.
[13] A. Grothendieck. Sur certains espaces de fonctions holomorphes. I. J. Reine Angew. Math., 192:35–64, 1953.
[14] A. Grothendieck. Sur certains espaces de fonctions holomorphes. II. J. Reine Angew. Math., 192:78–95, 1953.
[15] I.M. Guelfand and N.Y. Vilenkin. Les distributions. Tome 4: Applications de l’analyse harmonique. Collection Universitaire de Mathématiques, No. 23. Dunod, Paris, 1967.
[16] M. Hervé. Analytic and plurisubharmonic functions in finite and infinite-dimensional spaces. Number 198 in Lecture Notes in Mathematics. Springer-Verlag, 1971.
[17] M. Hervé. Analyticity in infinite-dimensional spaces, volume 10 of de Gruyter Studies in Mathematics. Walter de Gruyter & Co., Berlin, 1989.
[18] T. Hida, H. Kuo, J. Potthoff, and L. Streit. White noise. An infinite-dimensional calculus, volume 253 of Mathematics and its Applications. Kluwer Academic Publishers Group, Dordrecht, 1993.
[19] T. Hida and Si Si. Lectures on white noise functionals. World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2008.
[20] H. Holden, B. Øksendal, J. Ubøe, and T. Zhang. Stochastic partial differential equations. Probability and its Applications. Birkhäuser Boston Inc., Boston, MA, 1996.
[21] M. Kaashoek. State space theory of rational matrix functions and applications. In P. Lancaster, editor, Lectures on operator theory and its applications, volume 3 of Fields Institute Monographs, pages 235–333. American Mathematical Society, 1996.
[22] R.E. Kalman, P.L. Falb, and M.A. Arbib. Topics in mathematical system theory. McGraw-Hill Book Co., New York, 1969.
[23] Hui-Hsiung Kuo. White noise distribution theory. Probability and Stochastics Series. CRC Press, Boca Raton, FL, 1996.
[24] M. Reed and B. Simon. Methods of modern mathematical physics. I. Functional analysis. Academic Press, New York, 1972.
[25] E.D. Sontag. Linear systems over commutative rings: A survey. Ricerche di Automatica, 7:1–34, 1976.

Daniel Alpay
Department of Mathematics
Ben Gurion University of the Negev
P.O.B. 653, Be’er Sheva 84105, Israel
e-mail: [email protected]

Haim Attia
Department of Mathematics
Sami Shamoon College of Engineering
Be’er Sheva 84100, Israel
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 19–49
© 2012 Springer Basel AG

Minimal and Maximal Invariant Spaces of Holomorphic Functions on Bounded Symmetric Domains

Jonathan Arazy and Harald Upmeier

Dedicated to the memory of Israel Gohberg

Abstract. Let $D$ be a Cartan domain in $\mathbb{C}^d$ and let $G=\mathrm{Aut}(D)$ be the group of all biholomorphic automorphisms of $D$. Consider the projective representation of $G$ on spaces of holomorphic functions on $D$
\[
(U_\nu(g)f)(z) := \{J(g^{-1})(z)\}^{\nu/p}\,f(g^{-1}(z)),\qquad g\in G,\ z\in D,
\]
where $p$ is the genus of $D$ and $\nu$ is in the Wallach set $W(D)$. We identify the minimal and the maximal $U_\nu(G)$-invariant Banach spaces of holomorphic functions on $D$ in a very explicit way: the minimal space $\mathfrak{M}_\nu$ is a Besov-1 space, and the maximal space $\mathcal{M}_\nu$ is a weighted $H^\infty$-space. Moreover, with respect to the pairing under the (unique) $U_\nu(G)$-invariant inner product we have $\mathfrak{M}_\nu^{*}=\mathcal{M}_\nu$. In the second part of the paper we consider invariant Banach spaces of vector-valued holomorphic functions and obtain analogous descriptions of the unique maximal and minimal space, in particular for the important special case of “constant” partitions which arises naturally in connection with non-tube type domains.

Mathematics Subject Classification (2000). 46E22, 32M15.

Keywords. Banach spaces, holomorphic functions, symmetric domains.

1. Bounded symmetric domains and Jordan triples

Let $D$ be a Cartan domain in $\mathbb{C}^d$, i.e., an irreducible bounded symmetric domain in its Harish-Chandra realization. Then $Z=\mathbb{C}^d$ is a hermitian Jordan triple. The main example is the matrix ball
\[
D=D(I_{r,n})=\{z\in M_{r,n}(\mathbb{C});\ I_r-zz^{*}>0\},\qquad 1\le r\le n,
\]
with triple product
\[
\{x,y,z\}=\tfrac12\,(xy^{*}z+zy^{*}x).
\]
In this paper we only sketch the necessary background on Cartan domains and hermitian Jordan triples; for more details cf. [U2], [L2], [FK2]. Let $G=\mathrm{Aut}(D)$ be the group of holomorphic automorphisms, and let
\[
K=\{g\in G;\ g(0)=0\}
\]
be the maximal compact subgroup. Using Cartan’s linearity theorem, one proves that $K$ consists of linear maps. Then $D\equiv G/K$ via the evaluation map $g\mapsto g(0)$. The symmetries of $D$ have the form $s_0(z)=-z$ and, more generally, $s_z=g\,s_0\,g^{-1}$, where $g\in G$ satisfies $g(0)=z$. For each $a\in D$ there exists a unique midpoint symmetry $\phi_a$ fixing the geodesic midpoint between $0$ and $a$, and satisfying $\phi_a(0)=a$.

Example 1.1. For $D=D(I_{r,n})$ we have
\[
G=SU(r,n)=\Big\{g=\begin{pmatrix}\alpha&\beta\\ \gamma&\delta\end{pmatrix}\in SL(\mathbb{C},r+n);\ gJg^{*}=J\Big\},
\qquad\text{where } J=\begin{pmatrix}I_r&0\\ 0&-I_n\end{pmatrix}.
\]
The action is given by Möbius transformations
\[
g\cdot z=(\alpha z+\beta)(\gamma z+\delta)^{-1}
\]
and the midpoint symmetry is
\[
\phi_a(z)=(I_r-aa^{*})^{-1/2}(a-z)(I-a^{*}z)^{-1}(I-a^{*}a)^{1/2}.
\]
In the 1-dimensional case, this reduces to
\[
\phi_a(z)=\frac{a-z}{1-a^{*}z}.
\]
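In the 1-dimensional case the stated properties of the midpoint symmetry can be verified directly. The sketch below is an illustration only; the explicit formula used for the geodesic midpoint is the fixed point of $\phi_a$ inside the disk, obtained by solving $\phi_a(m)=m$, and is an assumption in the sense that it is not written out in the text.

```python
def phi(a, z):
    """Midpoint symmetry of the unit disk: phi_a(z) = (a - z) / (1 - conj(a) z)."""
    return (a - z) / (1 - a.conjugate() * z)

a = 0.6 - 0.3j
z = -0.2 + 0.5j

value_at_zero = phi(a, 0j)       # phi_a(0) = a
round_trip = phi(a, phi(a, z))   # phi_a is an involution

# Fixed point in the disk: root of conj(a) m^2 - 2 m + a = 0 with |m| < 1.
midpoint = (1 - (1 - abs(a) ** 2) ** 0.5) / a.conjugate()
fixed = phi(a, midpoint)
```

The map exchanges $0$ and $a$, squares to the identity, and fixes a single point of the disk.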

The group $K\equiv S(U(r)\times U(n))$ acts via $k(z)=uzv$, where $u\in U(r)$, $v\in U(n)$ and $\det(u)\det(v)=1$.

In general, the domain $D$ is characterized by the dimension $d$, the genus $p$, and the rank $r$. Moreover we have characteristic multiplicities $a$, $b$ defined via
\[
d=r(r-1)\frac{a}{2}+r+rb,\qquad p=(r-1)\,a+2+b.
\]
In the matrix case $D=D(I_{r,n})$ for $1\le r\le n$, we have
\[
d=r\cdot n,\qquad p=r+n,\qquad a=2,\qquad b=n-r.
\]
For any hermitian Jordan triple $Z$ and $u,v\in Z$, the Bergman operator $B(u,v)$ acting on $Z$ is defined by
\[
B(u,v)\,z=z-2\{u\,v^{*}z\}+Q_uQ_vz,
\]
where $Q_vz=\{v\,z^{*}v\}$. It is known that
\[
\det B(z,w)=h(z,w)^{p},
\]
where $h(z,w)$ is a $K$-invariant sesqui-holomorphic polynomial determined by
\[
h(z,z)=\prod_{j=1}^{r}\big(1-s_j^{2}(z)\big),
\]
where $s_j(z)$ are the singular values of $z$. For matrices, we have
\[
h(z,w)=\det(I-zw^{*}).
\]
If $z,w\in Z$ and $B(z,w)$ is invertible, we define the quasi-inverse [L1], [L2]
\[
z^{w}:=B(z,w)^{-1}(z-Q_zw).
\]
One can show [L2, p. 25, JP35] that $B(z,w)^{-1}=B(z^{w},-w)$. The “transvection” $g_a\in G$ [L2, Proposition 9.8], defined by
\[
g_a(z)=a+B(a,a)^{1/2}z^{-a}=\phi_a(-z)
\]
for all $a,z\in D$, satisfies $g_a^{-1}=g_{-a}$ and
\[
g_a'(z)=B(a,a)^{1/2}B(z,-a)^{-1}=B(a,a)^{1/2}B(z^{-a},a).
\]

2. Hilbert spaces of holomorphic functions

Let $dm(z)$ be the Lebesgue measure. The unique (up to a constant multiple) $G$-invariant measure on $D$ has the form $h(z,z)^{-p}\,dm(z)$. Given a parameter $\nu>p-1$ we define a probability measure
\[
d\mu_\nu(z)=c_\nu\cdot h(z,z)^{\nu-p}\,dm(z)
\]
on $D$, which has the quasi-invariance property
\[
d\mu_\nu(g(z))=\big|J(g,z)^{2\nu/p}\big|\,d\mu_\nu(z),\qquad\forall\,g\in G. \tag{2.1}
\]
Here $J(g,z)=\det g'(z)$ is the Jacobian of $g$ at $z$. (2.1) follows from
\[
B(g(z),g(w))=g'(z)\,B(z,w)\,g'(w)^{*},\qquad\forall\,g\in G,\ \forall\,z,w\in D, \tag{2.2}
\]
which yields the quasi-invariance
\[
h(g(z),g(w))=J(g,z)^{1/p}\,h(z,w)\,\overline{J(g,w)}^{\,1/p},\qquad\forall\,g\in G, \tag{2.3}
\]
of $h$.

Proposition 2.1. Each $g\in G$ has a unique “polar decomposition” $g=g_a\cdot k$ with $a=g(0)$, $k\in K$.

Proof. Define $a=g(0)$ and consider $k=g_a^{-1}\circ g$. Then $k\in G$ and $k(0)=0$. Therefore $k\in K$ and $g=g_a\circ k$. □
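For the unit disk ($d=r=1$, genus $p=2$, $h(z,w)=1-z\bar w$) the quasi-invariance of $h$ can be checked numerically. To avoid choosing a branch of the $p$-th root, the sketch below tests the $p$-th power of the identity, that is, the determinant form of (2.2), with the second Jacobian factor conjugated. The choice of $g=\phi_a$ and all names are illustrative assumptions.

```python
GENUS = 2  # genus p of the unit disk

def h(z, w):
    """h(z, w) = 1 - z * conj(w) on the unit disk."""
    return 1 - z * w.conjugate()

def phi(a, z):
    """Automorphism phi_a(z) = (a - z) / (1 - conj(a) z)."""
    return (a - z) / (1 - a.conjugate() * z)

def jacobian(a, z):
    """Derivative of phi_a at z."""
    return -(1 - abs(a) ** 2) / (1 - a.conjugate() * z) ** 2

a, z, w = 0.4 + 0.3j, -0.2 + 0.6j, 0.5 - 0.1j

# h(g(z), g(w))^p = J(g, z) * h(z, w)^p * conj(J(g, w))
lhs = h(phi(a, z), phi(a, w)) ** GENUS
rhs = jacobian(a, z) * h(z, w) ** GENUS * jacobian(a, w).conjugate()
```

Taking $w=z$ recovers the familiar identity $1-|\phi_a(z)|^2=|\phi_a'(z)|\,(1-|z|^2)$.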

Using Proposition 2.1, we define a cocycle $J_\nu:G\times D\to\mathbb{C}$ by putting
\[
J_\nu(g_ak,z):=h(a,a)^{\nu/2}\,h(kz,-a)^{-\nu}, \tag{2.4}
\]
using the sesqui-holomorphic branch of $h(z,w)^{-\nu}$ on $D\times D$ normalized by $h(0,0)^{-\nu}=1$. Then
\[
|J_\nu(g,z)|=|J(g,z)|^{\nu/p}.
\]
The Jacobian of $g_a$ has the form
\[
J(g_a^{-1},z)=h(a,a)^{p/2}\,h(z,a)^{-p}.
\]
Since $g_a^{-1}=g_{-a}$, (2.4) implies
\[
J_\nu(g_a^{-1},z)=h(a,a)^{\nu/2}\,h(z,a)^{-\nu}.
\]
Now consider the so-called Wallach set
\[
W(D):=\{\nu;\ (z,w)\mapsto h(z,w)^{-\nu}\ \text{positive definite}\} \tag{2.5}
\]
and, for $\nu\in W(D)$, define the reproducing kernel Hilbert space
\[
\mathcal{H}_\nu=\operatorname{span}\{h(\cdot,w)^{-\nu};\ w\in D\}
\]
with inner product determined by
\[
\langle h(\cdot,w)^{-\nu},h(\cdot,z)^{-\nu}\rangle_\nu=h(z,w)^{-\nu}
\]
for the reproducing kernel of $\mathcal{H}_\nu$. The corresponding group action
\[
(U_\nu(g)f)(z):=J_\nu(g^{-1},z)\,f(g^{-1}(z)) \tag{2.6}
\]
on $\mathcal{H}_\nu$ acts projectively:
\[
U_\nu(g_1\circ g_2)=c(g_1,g_2)\,U_\nu(g_1)\,U_\nu(g_2)
\]
for a unimodular cocycle. Then $U_\nu(g):\mathcal{H}_\nu\to\mathcal{H}_\nu$ acts isometrically for all $g\in G$, because (2.3) implies
\[
J_\nu(g,z)\,h(g(z),g(w))^{-\nu}\,\overline{J_\nu(g,w)}=h(z,w)^{-\nu}.
\]
One can show that $\mathcal{H}_\nu$ is irreducible for the action $U_\nu$ of $G$. The primary examples are the weighted Bergman space $\mathcal{H}_\nu=L^2_a(D,\mu_\nu)$ for $\nu>p-1$, and the Hardy space $\mathcal{H}_{d/r}=H^2(S,\sigma)$, for $\nu=\frac{d}{r}$. Here $S$ is the Shilov boundary of $D$ and $\sigma$ is the unique $K$-invariant probability measure on $S$.

For a deeper analysis of $\mathcal{H}_\nu$, we need the fine structure of the polynomial algebra $\mathcal{P}$ of $Z$. For $1\le j\le r$ there exist Jordan theoretic minors $N_j(z)$ generalizing the principal $j\times j$-minors for matrices. In particular, $N_r=N$ is the Jordan determinant. The conical polynomials, for any signature $\mathbf{m}=(m_1,\dots,m_r)\in\mathbb{N}^r$ satisfying $m_1\ge m_2\ge\dots\ge m_r\ge 0$, are given by
\[
N_{\mathbf{m}}(z)=\prod_{j=1}^{r}N_j(z)^{m_j-m_{j+1}},\qquad z\in Z,
\]
where $m_{r+1}:=0$. For diagonal matrices (including the rectangular case), we have
\[
N_{\mathbf{m}}\left(\begin{array}{cccc|c}t_1&&&0&\\ &t_2&&&0\\ &&\ddots&&\\ 0&&&t_r&\end{array}\right)=\prod_{j=1}^{r}t_j^{m_j}=t^{\mathbf{m}}.
\]

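Denote by $\mathcal{P}_{\mathbf{m}}$ the span of $\{N_{\mathbf{m}}\circ k;\ k\in K\}$. It is well known [S], [U1], [FK1] that the $\{\mathcal{P}_{\mathbf{m}}\}_{\mathbf{m}\ge 0}$ are $K$-irreducible and $K$-inequivalent, and there is a direct sum decomposition
\[
\mathcal{P}=\sum_{\mathbf{m}\ge 0}^{\oplus}\mathcal{P}_{\mathbf{m}}. \tag{2.7}
\]
It follows that the $\{\mathcal{P}_{\mathbf{m}}\}_{\mathbf{m}\ge 0}$ are pairwise orthogonal in any $K$-invariant inner product on $\mathcal{P}$. Consider the Fischer inner product
\[
\langle f,F\rangle_{\mathcal{F}}=\frac{1}{\pi^{d}}\int_{\mathbb{C}^d}f(z)\,\overline{F(z)}\,e^{-|z|^{2}}\,dm(z)=f\Big(\frac{\partial}{\partial z}\Big)(F^{*})(0) \tag{2.8}
\]
on $\mathcal{P}$, where $F^{*}(z):=\overline{F(\bar z)}$. Define $K^{\mathbf{m}}(z,w)$ as the reproducing kernel for $\mathcal{P}_{\mathbf{m}}$ in the Fischer inner product. Then
\[
e^{\langle z,w\rangle}=\sum_{\mathbf{m}\ge 0}K^{\mathbf{m}}(z,w). \tag{2.9}
\]
For $\nu\in\mathbb{C}$ and $z,w\in D$ there is a binomial expansion
\[
h(z,w)^{-\nu}=\sum_{\mathbf{m}\ge 0}(\nu)_{\mathbf{m}}\,K^{\mathbf{m}}(z,w), \tag{2.10}
\]
where
\[
(\nu)_{\mathbf{m}}=\prod_{j=1}^{r}\prod_{\ell=0}^{m_j-1}\Big(\nu+\ell-(j-1)\frac{a}{2}\Big)=\prod_{j=1}^{r}\Big(\nu-(j-1)\frac{a}{2}\Big)_{m_j}
\]
is the multi-variable “Pochhammer symbol”. As a consequence, one obtains a determination of the Wallach set
\[
W(D)=\{\nu\in\mathbb{C};\ (\nu)_{\mathbf{m}}\ge 0\ \ \forall\,\mathbf{m}\}=\Big\{\frac{\ell a}{2}\Big\}_{\ell=0}^{r-1}\cup\Big((r-1)\frac{a}{2},\infty\Big)
\]
as a union of a discrete and a continuous part [RV], [W], [LA], [FK1]. The multi-variable hypergeometric functions are defined as
\[
{}_pF_q\binom{\alpha_1,\dots,\alpha_p}{\beta_1,\dots,\beta_q}(z,w)=\sum_{\mathbf{m}\ge 0}\frac{\prod_{j=1}^{p}(\alpha_j)_{\mathbf{m}}}{\prod_{j=1}^{q}(\beta_j)_{\mathbf{m}}}\,K^{\mathbf{m}}(z,w).
\]
For example, we have ${}_0F_0(z,w)=\exp\langle z,w\rangle$ by (2.9), and (2.10) yields
\[
{}_1F_0(\nu)(z,w)=h(z,w)^{-\nu}.
\]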
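The multi-variable Pochhammer symbol and its role in the Wallach set are easy to experiment with. The following sketch uses exact rational arithmetic; the function names and the sample values ($r=3$, $a=2$) are assumptions for illustration. It also checks the rank-one instance of (2.10), where $K^{m}(z,w)=(z\bar w)^m/m!$ on the unit disk.

```python
from fractions import Fraction

def rising(x, n):
    """Classical Pochhammer symbol (x)_n = x (x+1) ... (x+n-1)."""
    out = 1
    for l in range(n):
        out *= x + l
    return out

def pochhammer(nu, m, a):
    """(nu)_m = prod_{j=1}^{r} (nu - (j-1) a/2)_{m_j} for a signature m = (m_1, ..., m_r)."""
    out = Fraction(1)
    for j, mj in enumerate(m):          # enumerate index equals j - 1
        out *= rising(Fraction(nu) - Fraction(j * a, 2), mj)
    return out

# Rank r = 3, multiplicity a = 2: discrete Wallach points are 0, 1, 2.
at_discrete_point = pochhammer(1, (2, 1, 0), 2)           # vanishes since m_2 > 0
on_allowed_signature = pochhammer(1, (3, 0, 0), 2)        # equals (1)_3 = 6
outside_wallach_set = pochhammer(Fraction(1, 2), (1, 1, 0), 2)  # negative value

def binomial_series(nu, t, n_terms=200):
    """Partial sum of sum_m (nu)_m t^m / m!, the rank-one case of (2.10)."""
    term, total = 1 + 0j, 0j
    for m in range(n_terms):
        total += term
        term *= (nu + m) * t / (m + 1)
    return total

z, w, nu = 0.4 + 0.2j, -0.3 + 0.5j, 1.5
t = z * w.conjugate()
series = binomial_series(nu, t)
closed_form = (1 - t) ** (-nu)
```

At a discrete Wallach point $\nu=\ell a/2$ the symbol vanishes on every signature with $m_{\ell+1}>0$, which is why only the partial sum $\mathcal{P}_\ell$ of Peter-Weyl components survives there.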

Let $\alpha_0,\alpha_1,\dots,\alpha_q;\ \beta_1,\dots,\beta_q>(r-1)\frac{a}{2}$. Put
\[
\gamma=\sum_{j=0}^{q}\alpha_j-\sum_{j=1}^{q}\beta_j.
\]
By [FK1], the hypergeometric functions have the following asymptotic behaviour, uniformly for $z\in D$:
\[
\gamma>(r-1)\frac{a}{2}\ \Longrightarrow\ {}_{q+1}F_q\binom{\alpha}{\beta}(z,z)\approx h(z,z)^{-\gamma}, \tag{2.11}
\]
\[
\gamma<-(r-1)\frac{a}{2}\ \Longrightarrow\ {}_{q+1}F_q\binom{\alpha}{\beta}(z,z)\approx 1. \tag{2.12}
\]
Remark 2.1. For the unit ball ($r=1$) and $\gamma=0$, we have
\[
{}_{q+1}F_q\binom{\alpha}{\beta}(z,z)\approx\log\Big(\frac{1}{1-|z|}\Big).
\]
For the exact asymptotics if $z$ is scalar, see [Y]. For $r=2$, exact asymptotics are given in [EZ].

In the following we consider Banach spaces of holomorphic functions on $D$ which are “invariant” under the group action (2.6), with the aim to characterize the (unique) maximal and minimal invariant Banach spaces and describe them via explicit formulas. In later sections this study is extended to the case of vector-valued holomorphic functions associated with the holomorphic discrete series of $G$. In this context our main result concerns symmetric domains which are “not of tube type”.

In this paper we only consider parameters $\nu$ belonging to the Wallach set (2.5). In a separate paper [AU4] we consider the so-called “pole set” arising from analytic continuation, and show that our results concerning the maximal and minimal invariant space can be generalized to this situation via suitable intertwining operators.

3. Invariant Banach spaces of holomorphic functions

In this section we assume that $\nu\in W(D)$ is a Wallach parameter and consider the weighted group action $U_\nu$ defined in (2.6). For the unweighted action ($\nu=0$) and the unit disk, the results of this section have been obtained in [AF], [AFP].

Definition 3.1. Let $X$ be a non-trivial Banach space of holomorphic functions on $D$. We say that $X$ is $U_\nu(G)$-invariant if
(i) $f\in X$, $g\in G$ $\Longrightarrow$ $U_\nu(g)f\in X$ and $\|U_\nu(g)f\|_X=\|f\|_X$.
(ii) For any finite (complex) Borel measure $\mu$ on $K$, the linear operator (convolution by $\mu$)
\[
(T_\mu f)(z)=\int_K f(kz)\,d\mu(k)
\]
maps $X$ continuously into itself.
(iii) For every $z\in D$, the evaluation functional $f\mapsto\delta_z(f):=f(z)$ is bounded on $X$ (it suffices to require the continuity of $\delta_0$).

Note that condition (ii) holds if $K$ acts on $X$ strongly continuously via $\pi(k)f=f\circ k^{-1}$.

Proposition 3.1. $X$ contains the constant function $1$ and, normalizing $\|1\|_X=1$, we have for $f\in X$
\[
|f(0)|\le\|f\|_X/\|1\|_X=\|f\|_X.
\]
Proof. Since $D$ is circular, we have by (ii) and (iii) for all $z\in D$
\[
f(0)\,1=\frac{1}{2\pi}\int_0^{2\pi}f(e^{i\theta}z)\,d\theta. \qquad\Box
\]

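Corollary 3.1. For $f\in X$ and $a\in D$, we have
\[
|f(a)|\le h(a,a)^{-\nu/2}\,\|f\|_X.
\]
Hence convergence in $X$ implies uniform convergence on compact subsets of $D$.

Proof. Use the formula
\[
|(U_\nu(g_a^{-1})f)(0)|=h(a,a)^{\nu/2}\,|f(a)|\le\|U_\nu(g_a^{-1})f\|_X=\|f\|_X. \qquad\Box
\]
Corollary 3.2. If $f=\sum_{\mathbf{m}\ge 0}f_{\mathbf{m}}\in X$, then $f_{\mathbf{m}}\in X$ for all $\mathbf{m}$, and the projections $f\mapsto f_{\mathbf{m}}$ are continuous.

Proof. In terms of the character $\chi_{\mathbf{m}}$ of $K$ on $\mathcal{P}_{\mathbf{m}}$, we have
\[
f_{\mathbf{m}}(z)=\int_K f(k^{-1}z)\,\chi_{\mathbf{m}}(k)\,dk. \qquad\Box
\]
Corollary 3.3. If $\nu>(r-1)\frac{a}{2}$, then $\mathcal{P}$ is dense in $X$ in the topology of uniform convergence on compact subsets. If $\nu=\frac{\ell a}{2}$, $0\le\ell\le r-1$, the same holds for
\[
\mathcal{P}_\ell=\sum_{\substack{\mathbf{m}\ge 0\\ m_{\ell+1}=0}}^{\oplus}\mathcal{P}_{\mathbf{m}}. \tag{3.1}
\]
Proof. From $1\in X$ (Proposition 3.1) it follows by (i) that
\[
U_\nu(g_a)\,1=\mathrm{const}\;h(-,a)^{-\nu}\in X\qquad\text{for all } a\in D.
\]
Applying Corollary 3.2, we obtain $(\nu)_{\mathbf{m}}K^{\mathbf{m}}(-,a)\in X$, hence either $(\nu)_{\mathbf{m}}=0$ or else $\mathcal{P}_{\mathbf{m}}=\operatorname{span}\{K^{\mathbf{m}}(-,a):a\in D\}\subset X$. □

Our main goal is to characterize the maximal and minimal invariant spaces.

Definition 3.2. Let $\mathcal{M}_\nu=\{f\in\mathcal{H}(D);\ \|f\|_{\mathcal{M}_\nu}<\infty\}$, where
\[
\|f\|_{\mathcal{M}_\nu}=\sup_{z\in D}h(z,z)^{\nu/2}\,|f(z)|=\sup_{g\in G}|(U_\nu(g)f)(0)|.
\]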


It is easy to see that ℳ𝜈 satisﬁes (i), (ii) and (iii) of Deﬁnition 3.1. Hence using the second expression for the norm, it follows that ℳ𝜈 is 𝑈𝜈 (𝐺)-invariant. We remark that taking another base point 𝑎 ∈ 𝐷 instead of 0 yields the same space with a norm proportional to ∥ ⋅ ∥ℳ𝜈 . Proposition 3.2. If 𝑋 is 𝑈𝜈 (𝐺)-invariant, then 𝑋 ⊆ ℳ𝜈 and ∥𝑓 ∥ℳ𝜈 ≤ ∥𝑓 ∥𝑋 , ∀ 𝑓 ∈ 𝑋. Proof. In view of Proposition 3.1, we have ∣(𝑈𝜈 (𝑔) 𝑓 )(0)∣ ≤ ∥𝑓 ∥𝑋 , ∀ 𝑓 ∈ 𝑋.

□
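On the unit disk, where $h(z,w)=1-z\bar w$, the weighted supremum norm of $\mathcal{M}_\nu$ can be probed on the functions $h(b,b)^{\nu/2}h(\cdot,b)^{-\nu}$, which are the translates of the constant function $1$. The sketch below samples the weighted modulus on a grid; the grid, the parameter values, and the function names are illustrative assumptions.

```python
import cmath
import math

def atom(b, nu):
    """The function z -> (1 - |b|^2)^(nu/2) * (1 - z conj(b))^(-nu) on the unit disk."""
    scale = (1 - abs(b) ** 2) ** (nu / 2)
    return lambda z: scale * (1 - z * b.conjugate()) ** (-nu)

def weighted_modulus(f, z, nu):
    """h(z, z)^(nu/2) |f(z)|, the quantity whose supremum defines the M_nu norm."""
    return (1 - abs(z) ** 2) ** (nu / 2) * abs(f(z))

nu, b = 1.5, 0.5 + 0.3j
f = atom(b, nu)

grid = [0.05 * k * cmath.exp(2j * math.pi * j / 36)
        for k in range(20) for j in range(36)]
sampled_sup = max(weighted_modulus(f, z, nu) for z in grid)
value_at_b = weighted_modulus(f, b, nu)
```

The sampled values never exceed $1$ and the value at $z=b$ equals $1$, consistent with these functions having norm $1$ in every invariant space normalized by $\|1\|_X=1$.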

Corollary 3.4. $\mathcal{M}_\nu$ is the unique maximal $U_\nu(G)$-invariant space, and it is a weighted $H^\infty$-space, with weight $h(z,z)^{\nu/2}$.

We remark that there exist spaces of holomorphic functions on $D$ satisfying (i), (ii) of Definition 3.1, but not (iii). For instance, let $f$ be any holomorphic function on $D$ (possibly not in $\mathcal{M}_\nu$). Define $\mathfrak{M}_{\nu,d}(f)$ to be the space of all functions of the form
\[
F(z)=\sum_{j=1}^{\infty}c_j\,\big(U_\nu(g_j)f\big)(z),
\]
where $g_j\in G$ and $\sum_{j=1}^{\infty}|c_j|<\infty$. For $F\in\mathfrak{M}_{\nu,d}(f)$ we define
\[
\|F\|_{\mathfrak{M}_{\nu,d}(f)}=\inf\sum_{j=1}^{\infty}|c_j|,
\]
where the infimum is taken over all admissible representations of $F$. Then it is easy to check that $\mathfrak{M}_{\nu,d}(f)$ is the smallest Banach space of holomorphic functions on $D$ which contains $f$ and satisfies (i) and (ii) of Definition 3.1.

Proposition 3.3. The Banach space $\mathfrak{M}_{\nu,d}(f)$ satisfies condition (iii) if and only if $f\in\mathcal{M}_\nu$. More generally, let $X$ be a Banach space of holomorphic functions on $D$ satisfying (i) and (ii). Then $X$ satisfies (iii) if and only if $X\subset\mathcal{M}_\nu$ continuously.

Proof. If (iii) holds, then $\mathfrak{M}_{\nu,d}(f)$ (resp., $X$) is a $U_\nu(G)$-invariant Banach space and Proposition 3.2 implies $f\in\mathcal{M}_\nu$ (resp., $X\subset\mathcal{M}_\nu$ continuously). Conversely, if $f\in\mathcal{M}_\nu$, then
\[
\sup_{g\in G}|U_\nu(g)f(0)|<\infty
\]

and hence $\delta_0$ is continuous on $\mathfrak{M}_{\nu,d}(f)$. Similarly, $X\subset\mathcal{M}_\nu$ continuously implies for all $f\in X$
\[
|f(0)|\le\|f\|_{\mathcal{M}_\nu}\le c\,\|f\|_X.
\]
Hence $\delta_0$ is continuous on $X$. By (i), the continuity of $\delta_z$, $z\in D$, follows. □

Definition 3.3. Let $\mathfrak{M}_\nu$ consist of all $f\in\mathcal{H}(D)$ such that
\[
f(z)=\int_D d\mu(a)\,h(a,a)^{\nu/2}\,h(z,a)^{-\nu} \tag{3.2}
\]
for some finite (complex) Borel measure $\mu$ on $D$. Define the norm
\[
\|f\|_{\mathfrak{M}_\nu}=\inf\{\|\mu\|;\ \mu\ \text{satisfies (3.2)}\}.
\]
Proposition 3.4. We have $f\in\mathfrak{M}_\nu$ if and only if
\[
f(z)=\int_G d\mu(g)\,(U_\nu(g)\,1)(z),\qquad\forall\,z\in D, \tag{3.3}
\]
for some finite Borel measure $\mu$ on $G$. Moreover
\[
\|f\|_{\mathfrak{M}_\nu}=\inf\{\|\mu\|;\ \mu\ \text{satisfies (3.3)}\}.
\]

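Hence $\mathfrak{M}_\nu$ is $U_\nu(G)$-invariant. The straightforward proof is omitted. Also, the condition
\[
\|1\|_{\mathfrak{M}_\nu}=1 \tag{3.4}
\]
is satisfied. Indeed, if
\[
1=\int_D d\mu(a)\,h(a,a)^{\nu/2}\,h(z,a)^{-\nu},\qquad\forall\,z\in D,
\]
then for $z=0$ we have
\[
1=\int_D d\mu(a)\,h(a,a)^{\nu/2}\le\int_D d|\mu|(a)\,h(a,a)^{\nu/2}\le\int_D d|\mu|(a)=\|\mu\|
\]
and therefore
\[
1\le\|1\|_{\mathfrak{M}_\nu}=\inf\{\|\mu\|:\ \mu\ \text{representing measure}\}.
\]
On the other hand, for $\mu=\delta_0$ we have
\[
\int_D d\delta_0(a)\,h(a,a)^{\nu/2}\,h(z,a)^{-\nu}=1,
\]
so $\|1\|_{\mathfrak{M}_\nu}\le\|\delta_0\|=1$. Hence (3.4) holds.

Proposition 3.5. There is a canonical duality $\mathfrak{M}_\nu^{*}\equiv\mathcal{M}_\nu$ with respect to the pairing $\langle f,F\rangle_\nu$ of $\mathcal{H}_\nu$.

Proof. Let $F\in\mathcal{M}_\nu$ and $f\in\mathfrak{M}_\nu$, with representation (3.3). Since
\[
(U_\nu(g)F)(0)=\langle U_\nu(g)F,1\rangle_\nu=\langle F,U_\nu(g^{-1})\,1\rangle_\nu,
\]
it follows that
\[
\langle f,F\rangle_\nu=\int_G d\mu(g)\,\langle U_\nu(g)\,1,F\rangle_\nu
=\int_G d\mu(g)\,\langle 1,U_\nu(g^{-1})F\rangle_\nu
=\int_G d\mu(g)\,\overline{(U_\nu(g^{-1})F)(0)}.
\]
Hence
\[
|\langle f,F\rangle_\nu|\le\int_G d|\mu|(g)\,|(U_\nu(g^{-1})F)(0)|\le\|\mu\|\,\|F\|_{\mathcal{M}_\nu}.
\]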

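This holds for every representing measure $\mu$ for $f$, hence
\[
|\langle f,F\rangle_\nu|\le\|f\|_{\mathfrak{M}_\nu}\,\|F\|_{\mathcal{M}_\nu}. \tag{3.5}
\]
Thus
\[
\sup_{\|f\|_{\mathfrak{M}_\nu}\le 1}|\langle f,F\rangle_\nu|\le\|F\|_{\mathcal{M}_\nu}.
\]
The converse inequality follows from
\[
\|F\|_{\mathcal{M}_\nu}=\sup_{g\in G}|\langle F,U_\nu(g)\,1\rangle_\nu|\le\sup_{\|f\|_{\mathfrak{M}_\nu}\le 1}|\langle f,F\rangle_\nu|.
\]
This means that the operator $V:\mathcal{M}_\nu\to\mathfrak{M}_\nu^{*}$ defined by $(VF)(f)=\langle f,F\rangle_\nu$ is an isometry. We claim that $V$ is surjective. Indeed, let $\Phi\in\mathfrak{M}_\nu^{*}$ and define
\[
F(z)=\overline{\Phi(h(\cdot,z)^{-\nu})}.
\]
Then $F$ is holomorphic and
\[
h(z,z)^{\nu/2}\,|F(z)|=|\Phi(h(z,z)^{\nu/2}\,h(\cdot,z)^{-\nu})|=|\Phi(U_\nu(g_z^{-1})\,1)|\le\|\Phi\|_{\mathfrak{M}_\nu^{*}}.
\]
So $F\in\mathcal{M}_\nu$ and $\|F\|_{\mathcal{M}_\nu}\le\|\Phi\|_{\mathfrak{M}_\nu^{*}}$. Also, if $f\in\mathfrak{M}_\nu$ is represented as in (3.2), then
\[
\begin{aligned}
\Phi(f)&=\int_D d\mu(a)\,h(a,a)^{\nu/2}\,\Phi(h(\cdot,a)^{-\nu})=\int_D d\mu(a)\,h(a,a)^{\nu/2}\,\overline{F(a)}\\
&=\int_D d\mu(a)\,h(a,a)^{\nu/2}\,\langle h(\cdot,a)^{-\nu},F\rangle_\nu=\langle f,F\rangle_\nu.
\end{aligned}
\]
It follows that $V(F)=\Phi$, and so $V$ is a surjective isometry. □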

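Definition 3.4. Let $\mathfrak{M}_{\nu,d}$ be the space of all $f\in\mathfrak{M}_\nu$ which are represented with respect to a discrete measure, i.e.,
\[
f(z)=\sum_{j=1}^{\infty}c_j\,(U_\nu(g_j)\,1)(z) \tag{3.6}
\]
with $g_j\in G$ and $c_j\in\mathbb{C}$ such that $\sum_j|c_j|<\infty$, with the norm
\[
\|f\|_{\mathfrak{M}_{\nu,d}}=\inf\sum_{j=1}^{\infty}|c_j|
\]
over all representations (3.6).

Clearly, $\mathfrak{M}_{\nu,d}$ is a closed subspace of $\mathfrak{M}_\nu$ and $\|f\|_{\mathfrak{M}_\nu}\le\|f\|_{\mathfrak{M}_{\nu,d}}$ for all $f\in\mathfrak{M}_{\nu,d}$.

Proposition 3.6. The dual space of $\mathfrak{M}_{\nu,d}$ is identified isometrically with $\mathcal{M}_\nu$, with respect to the pairing $\langle f,F\rangle_\nu$, $f\in\mathfrak{M}_{\nu,d}$, $F\in\mathcal{M}_\nu$. In particular, $\mathfrak{M}_{\nu,d}=\mathfrak{M}_\nu$ with equal norms.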

Proof. The fact that $\mathfrak{M}_{\nu,d}^{*}=\mathcal{M}_\nu$ isometrically is proved as in the proof of Proposition 3.5. This also yields that $\|f\|_{\mathfrak{M}_\nu}=\|f\|_{\mathfrak{M}_{\nu,d}}$ for all $f\in\mathfrak{M}_{\nu,d}$. To prove that $\mathfrak{M}_\nu=\mathfrak{M}_{\nu,d}$ it suffices (by the Hahn-Banach theorem) to prove that if $\Phi\in\mathfrak{M}_\nu^{*}$ vanishes on $\mathfrak{M}_{\nu,d}$ then it is zero. But this follows from the identification of $\mathfrak{M}_\nu^{*}$ with $\mathcal{M}_\nu$. □

Proposition 3.7. If $X\neq 0$ is $U_\nu(G)$-invariant, then $\mathfrak{M}_\nu\subseteq X$ and
\[
\|f\|_X\le\|f\|_{\mathfrak{M}_\nu},\qquad\forall\,f\in\mathfrak{M}_\nu.
\]
Hence $\mathfrak{M}_\nu$ is the unique minimal $U_\nu(G)$-invariant Banach space.

Proof. Since $1\in X$ and $\|1\|_X=1$ we have $\|U_\nu(g)\,1\|_X=1$ for all $g\in G$. Let $f\in\mathfrak{M}_\nu=\mathfrak{M}_{\nu,d}$, and let $f=\sum_{j=1}^{\infty}c_j\,U_\nu(g_j)\,1$ be an admissible representation. Then the series converges absolutely,
\[
\sum_{j=1}^{\infty}\|c_j\,U_\nu(g_j)\,1\|_X=\sum_{j=1}^{\infty}|c_j|<\infty,
\]
and the completeness of $X$ guarantees that the convergence is also in the norm of $X$. Therefore $f\in X$ and
\[
\|f\|_X\le\sum_{j}\|c_j\,U_\nu(g_j)\,1\|_X=\sum_{j=1}^{\infty}|c_j|.
\]
This holds for all discrete representations of $f$, hence $\|f\|_X\le\|f\|_{\mathfrak{M}_\nu}$. □

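We remark that there exist functions $f\in\mathcal{M}_\nu$ for which the group action $g\mapsto U_\nu(g)f$ is not continuous in the norm of $\mathcal{M}_\nu$. This leads to the following

Definition 3.5. Let $\mathcal{M}_\nu^{(0)}=\{f\in\mathcal{M}_\nu;\ g\mapsto U_\nu(g)f\ \text{is continuous in the}\ \mathcal{M}_\nu\ \text{norm}\}$.

Proposition 3.8.
(i) $\mathcal{M}_\nu^{(0)}$ is the maximal $U_\nu(G)$-invariant space $X$ for which $g\mapsto U_\nu(g)f$ is continuous in norm for all $f\in X$;
(ii) $\mathcal{M}_\nu^{(0)*}=\mathfrak{M}_\nu$ with respect to $\langle\cdot,\cdot\rangle_\nu$;
(iii) The canonical embedding of $\mathcal{M}_\nu^{(0)}$ in $\mathcal{M}_\nu^{(0)**}=\mathcal{M}_\nu$ is the inclusion map.

These statements will not be proved here, since they are not needed for our main problem: to identify $\mathfrak{M}_\nu$ via concrete integral formulas (not as a quotient space of the finite Borel measures on $D$ or $G$).

Definition 3.6. The shift operator $S_\alpha^\gamma$ on $\mathcal{P}$ (“differentiation of order $\gamma-\alpha$”) is defined by
\[
S_\alpha^\gamma\Big(\sum_{\mathbf{m}\ge 0}f_{\mathbf{m}}\Big)=\sum_{\mathbf{m}\ge 0}\frac{(\gamma)_{\mathbf{m}}}{(\alpha)_{\mathbf{m}}}\,f_{\mathbf{m}}.
\]
In view of the Faraut-Korányi formula (2.10), we have
\[
(S_\alpha^\gamma f)(z)=\langle f,h(\cdot,z)^{-\gamma}\rangle_\alpha,
\]
and the reproducing kernel identity yields
\[
S_\alpha^\gamma\big(h(\cdot,z)^{-\alpha}\big)=h(\cdot,z)^{-\gamma}.
\]
It follows that $S_\alpha^\gamma(\mathcal{H}_\alpha)=\mathcal{H}_\gamma$.

Remark 3.1. If $\alpha>(r-1)\frac{a}{2}$, then $S_\alpha^\gamma$ is defined on all of $\mathcal{P}$. If $\alpha=\frac{\ell a}{2}$, $0\le\ell\le r-1$, then $S_\alpha^\gamma$ is defined only on $\mathcal{P}_\ell$ (cf. (3.1)).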
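In rank one (the unit disk) the shift operator acts on Taylor coefficients by multiplying the coefficient of $z^m$ with $(\gamma)_m/(\alpha)_m$, and the identity $S_\alpha^\gamma(h(\cdot,w)^{-\alpha})=h(\cdot,w)^{-\gamma}$ can be checked on truncated series. The following is a sketch under these assumptions; the truncation length and all names are illustrative.

```python
import math

def rising(x, n):
    """Classical Pochhammer symbol (x)_n."""
    out = 1.0
    for l in range(n):
        out *= x + l
    return out

def kernel_coeffs(nu, w, n_terms):
    """Taylor coefficients in z of h(z, w)^(-nu) = sum_m (nu)_m (z conj(w))^m / m!."""
    return [rising(nu, m) * w.conjugate() ** m / math.factorial(m)
            for m in range(n_terms)]

def shift(coeffs, alpha, gamma):
    """S_alpha^gamma on a coefficient list: multiply the m-th entry by (gamma)_m / (alpha)_m."""
    return [c * rising(gamma, m) / rising(alpha, m) for m, c in enumerate(coeffs)]

def evaluate(coeffs, z):
    return sum(c * z ** m for m, c in enumerate(coeffs))

alpha, gamma = 1.5, 3.5
w, z = 0.5 - 0.2j, -0.3 + 0.4j

shifted = shift(kernel_coeffs(alpha, w, 80), alpha, gamma)
shifted_value = evaluate(shifted, z)
expected_value = (1 - z * w.conjugate()) ** (-gamma)
```

For $\gamma-\alpha$ a positive integer the multiplier $(\gamma)_m/(\alpha)_m$ is a polynomial in $m$, which is the sense in which $S_\alpha^\gamma$ behaves like a differential operator of order $\gamma-\alpha$ in this setting.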

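Our first main result is

Theorem 3.1. Let $\nu\in W(D)$, $\nu>(r-1)\,a$. Choose $\beta\in\mathbb{R}$ such that $\beta+\frac{\nu}{2}>p-1$. Then there is a continuous embedding
\[
S_\nu^{\nu+\beta}(\mathfrak{M}_\nu)\subseteq L^1_a(D,\mu_{\beta+\frac{\nu}{2}}).
\]
Here $L^1_a$ denotes the subspace of holomorphic functions in $L^1$.

Proof. It is enough to consider the “atoms” $f=h(a,a)^{\nu/2}\,h(\cdot,a)^{-\nu}$ for $a\in D$. We have
\[
(S_\nu^{\nu+\beta}f)(z)=h(a,a)^{\frac{\nu}{2}}\,\langle h(\cdot,a)^{-\nu},h(\cdot,z)^{-(\nu+\beta)}\rangle_\nu=h(a,a)^{\frac{\nu}{2}}\,h(z,a)^{-(\nu+\beta)}.
\]
Using the asymptotic behaviour of ${}_2F_1$, following from the assumption $\frac{\nu}{2}>(r-1)\frac{a}{2}$, we obtain
\[
\begin{aligned}
\|S_\nu^{\nu+\beta}f\|_{L^1(\mu_{\beta+\frac{\nu}{2}})}
&=c_{\beta+\nu/2}\,h(a,a)^{\frac{\nu}{2}}\int_D\Big|h(z,a)^{-\frac{\nu+\beta}{2}}\Big|^{2}\,h(z,z)^{\beta+\nu/2-p}\,dz\\
&=h(a,a)^{\frac{\nu}{2}}\;{}_2F_1\binom{\frac{\nu+\beta}{2},\ \frac{\nu+\beta}{2}}{\beta+\frac{\nu}{2}}(a,a)\approx h(a,a)^{\frac{\nu}{2}}\,h(a,a)^{-\frac{\nu}{2}}=1. \qquad\Box
\end{aligned}
\]
Theorem 3.1 has the following converse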

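Theorem 3.2. Let $\nu\in W(D)$ be arbitrary. Choose $\beta\in\mathbb{R}$ such that $\beta+\frac{\nu}{2}>p-1$. Let $f$ be analytic on $D$ such that $S_\nu^{\nu+\beta}f\in L^1_a(D,\mu_{\beta+\frac{\nu}{2}})$. Then $f\in\mathfrak{M}_\nu$ and
\[
\|f\|_{\mathfrak{M}_\nu}\le\frac{c_{\nu+\beta}}{c_{\beta+\nu/2}}\,\|S_\nu^{\nu+\beta}f\|_{L^1(\mu_{\beta+\frac{\nu}{2}})}.
\]
Proof. Consider the finite Borel measure
\[
d\mu(a)=(S_\nu^{\nu+\beta}f)(a)\,h(a,a)^{\beta+\nu/2-p}\,da.
\]
Using the self-adjointness of $S_\nu^{\nu+\beta}$ with respect to $\mu_{\nu+\beta}$ and the reproducing property, we obtain
\[
\begin{aligned}
\int_D d\mu(a)\,h(a,a)^{\nu/2}\,h(z,a)^{-\nu}
&=\int_D da\,h(a,a)^{\nu+\beta-p}\,(S_\nu^{\nu+\beta}f)(a)\,\overline{h(a,z)^{-\nu}}\\
&=\int_D da\,h(a,a)^{\nu+\beta-p}\,f(a)\,\overline{S_\nu^{\nu+\beta}\big(h(\cdot,z)^{-\nu}\big)(a)}\\
&=\int_D da\,h(a,a)^{\nu+\beta-p}\,f(a)\,\overline{h(a,z)^{-(\nu+\beta)}}=\frac{1}{c_{\nu+\beta}}\,f(z).
\end{aligned}
\]
Hence $f\in\mathfrak{M}_\nu$ and
\[
\|f\|_{\mathfrak{M}_\nu}\le c_{\nu+\beta}\,\|\mu\|=\frac{c_{\nu+\beta}}{c_{\beta+\nu/2}}\,\|S_\nu^{\nu+\beta}f\|_{L^1(\mu_{\beta+\frac{\nu}{2}})}. \qquad\Box
\]
Corollary 3.5. If $\frac{\nu}{2}>p-1$ we can choose $\beta=0$. Hence $\mathfrak{M}_\nu=L^1_a(D,\mu_{\frac{\nu}{2}})$.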

Corollary 3.6. For each $f\in\mathfrak{M}_\nu$, the map $G\ni g\mapsto U_\nu(g)f\in\mathfrak{M}_\nu$ is continuous in the norm of $\mathfrak{M}_\nu$.

Proof. This follows by realizing $\mathfrak{M}_\nu$ as $S_{\nu+\beta}^{\nu}\big(L^1_a(D,\mu_{\beta+\frac{\nu}{2}})\big)$ with $\beta+\frac{\nu}{2}>p-1$. □

Corollary 3.7. Let $\nu>(r-1)\,a$ and choose $\beta\in\mathbb{R}$ such that $\beta+\frac{\nu}{2}>p-1$. Then
\[
f\in\mathfrak{M}_\nu\iff S_\nu^{\nu+\beta}f\in L^1_a(D,\mu_{\beta+\frac{\nu}{2}}). \tag{3.7}
\]
Specializing to rank $r=1$, we obtain

Corollary 3.8. Let $D$ be the open unit ball of $\mathbb{C}^d$. Let $f$ be a holomorphic function on $D$ and choose $\beta$ such that $\beta+\frac{\nu}{2}>d$. Then (3.7) holds.

4. Invariant Banach spaces of vector-valued holomorphic functions

We now turn to vector-valued holomorphic function spaces related to the holomorphic discrete series. In this section we describe the unique maximal space, and obtain a sufficient condition for membership in the unique minimal space. For any fixed partition $\mathbf{m}=(m_1,\dots,m_r)$ consider the $\mathbf{m}$-th Peter-Weyl component $\mathcal{P}_{\mathbf{m}}$ (cf. (2.7)) and parameters $\nu\in\mathbb{R}$ such that the integral
\[
c_{\nu,\mathbf{m}}^{-1}=\int_D da\,h(a,a)^{\nu-p}\,\frac{K^{\mathbf{m}}(B(a,a)\,e,e)}{K^{\mathbf{m}}(e,e)} \tag{4.1}
\]
is finite. Here $e\in Z$ is a maximal tripotent. It is well known that
\[
K^{\mathbf{m}}(e,e)=\frac{d_{\mathbf{m}}}{(d/r)_{\mathbf{m}}},
\]
where $d_{\mathbf{m}}=\dim\mathcal{P}_{\mathbf{m}}$. For example, in the rank 1 case (unit ball) we have
\[
K^{m}(z,w)=\frac{(z|w)^{m}}{m!}
\]
and
\[
(d)_m=d(d+1)\cdots(d+m-1)=\frac{(d+m-1)!}{(d-1)!}.
\]
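The rank-one instance of the formula $K^{\mathbf{m}}(e,e)=d_{\mathbf{m}}/(d/r)_{\mathbf{m}}$ can be confirmed with exact arithmetic. The sketch below assumes the standard count $\binom{m+d-1}{m}$ for the dimension of the space of homogeneous polynomials of degree $m$ in $d$ variables; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def dim_homogeneous(d, m):
    """Number of monomials of degree m in d variables (stars and bars)."""
    return comb(m + d - 1, m)

def rising(d, m):
    """(d)_m = d (d+1) ... (d+m-1)."""
    out = 1
    for l in range(m):
        out *= d + l
    return out

def kernel_at_tripotent(d, m):
    """d_m / (d)_m, the rank-one value of K^m(e, e)."""
    return Fraction(dim_homogeneous(d, m), rising(d, m))

table = {(d, m): kernel_at_tripotent(d, m) for d in range(1, 6) for m in range(0, 7)}
```

Every entry equals $1/m!$ independently of $d$, matching $K^{m}(e,e)=(e|e)^m/m!$ for $e=(1,0,\dots,0)$.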

On the other hand, the space 𝒫𝑚 ) ( , the number of of homogeneous polynomials on 𝑍 = ℂ has dimension 𝑚+𝑑−1 𝑚 solutions of 𝑘1 + ⋅ ⋅ ⋅ + 𝑘𝑑 = 𝑚 in integers 𝑘𝑖 ≥ 0. Thus, for 𝑒 = (1, 0, . . . , 0) we obtain 𝑑𝑚 (𝑚 + 𝑑 − 1)! 1 1 = 𝐾 𝑚 (𝑒, 𝑒). = = (𝑑)𝑚 𝑚!(𝑑 − 1)! (𝑑)𝑚 𝑚! Since 𝐾 acts irreducibly on 𝒫m it follows that ∫ $ 𝑑𝑎 ℎ(𝑎, 𝑎)𝜈−𝑝 ⟨𝑝 ∘ 𝐵(𝑎, 𝑎)1/2 $ 𝑞 ∘ 𝐵(𝑎, 𝑎)1/2 ⟩ℱ ⟨𝑝∣𝑞⟩ℱ = 𝑐𝜈,m 𝐷

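for all $p, q \in \mathcal{P}_{\mathbf{m}}$. Here $\langle p | q \rangle_{\mathcal{F}}$ is the Fischer-Fock inner product (2.8). Equivalently,
\[
p(\zeta) = c_{\nu,\mathbf{m}} \int_D da\, h(a,a)^{\nu-p}\, p(B(a,a)\,\zeta) \tag{4.2}
\]
for all $p \in \mathcal{P}_{\mathbf{m}}$ and $\zeta \in Z$. Let $\mathcal{H}_{\nu,\mathbf{m}}$ denote the Hilbert space of all holomorphic functions $\Phi : D \to \mathcal{P}_{\mathbf{m}}$, $z \mapsto \Phi_z(\zeta) = \Phi(z,\zeta)$, such that
\[
\|\Phi\|^2_{\nu,\mathbf{m}} = c_{\nu,\mathbf{m}} \int_D dz\, h(z,z)^{\nu-p}\, \|\Phi_z \circ B(z,z)^{1/2}\|^2_{\mathcal{F}} < +\infty.
\]

Here we write $\Phi_z(\zeta) = \Phi(z,\zeta)$ for $z \in D$, $\zeta \in Z$, noting that $\Phi(z, -)$ is a polynomial of type $\mathbf{m}$ in the $\zeta$-variable. In this notation, $\big(\Phi_z \circ B(z,z)^{1/2}\big)(\zeta) = \Phi(z, B(z,z)^{1/2}\zeta)$. Moreover the scalar parameter $\nu$ is chosen large enough so that $c_{\nu,\mathbf{m}} > 0$, and so $\mathcal{H}_{\nu,\mathbf{m}}$ contains all the "constant" functions $(1 \otimes p)(z,\zeta) = p(\zeta)$ for $p \in \mathcal{P}_{\mathbf{m}}$. It is easily shown that
\[
(U_{\nu,\mathbf{m}}(g^{-1})\Phi)(z,\zeta) = J_\nu(g,z)\, \Phi(g(z),\, g'(z)\zeta),
\]
with $g \in G$, $\Phi \in \mathcal{H}_{\nu,\mathbf{m}}$, $z \in D$ and $\zeta \in Z$, defines a unitary (projective) representation of $G$ on $\mathcal{H}_{\nu,\mathbf{m}}$ belonging to the so-called holomorphic discrete series of $G$ [AU3].

Proposition 4.1. For $\Phi \in \mathcal{H}_{\nu,\mathbf{m}}$ we have the reproducing property
\[
\Phi_z(\zeta) = c_{\nu,\mathbf{m}} \int_D da\, h(a,a)^{\nu-p}\, h(z,a)^{-\nu}\, \Phi_a\big(B(a,a)\, B(z,a)^{-1} \zeta\big). \tag{4.3}
\]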

Proof. The reproducing formula, for a suitable constant, is proved in [AU3]. Applying the formula to $z = 0$, we obtain
\[
\Phi_0(\zeta) = c_{\nu,\mathbf{m}} \int_D da\, h(a,a)^{\nu-p}\, \Phi_a(B(a,a)\,\zeta), \tag{4.4}
\]
which reduces to (4.2) for $\Phi = 1 \otimes p$, and thus specifies the constant. □
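The rank-one identity $d_m/(d)_m = 1/m! = K^m(e,e)$ quoted at the start of this section can be checked mechanically. The following is only a small sketch with exact rational arithmetic; the function names are illustrative and do not come from the paper.

```python
from fractions import Fraction
from math import comb, factorial


def pochhammer(d: int, m: int) -> int:
    """Rising factorial (d)_m = d (d+1) ... (d+m-1), with (d)_0 = 1."""
    result = 1
    for j in range(m):
        result *= d + j
    return result


def dim_homogeneous(d: int, m: int) -> int:
    """Dimension d_m of the degree-m homogeneous polynomials on C^d."""
    return comb(m + d - 1, m)


def kernel_at_e(d: int, m: int) -> Fraction:
    """The quotient d_m / (d)_m, expected to equal K^m(e, e) = 1/m!."""
    return Fraction(dim_homogeneous(d, m), pochhammer(d, m))


if __name__ == "__main__":
    # Compare d_m / (d)_m with 1/m! for a range of dimensions and degrees.
    for d in range(1, 7):
        for m in range(0, 9):
            assert kernel_at_e(d, m) == Fraction(1, factorial(m))
```

The check is independent of $d$, as the formula predicts: the factor $(d+m-1)!/(d-1)!$ cancels between $d_m$ and $(d)_m$.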

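Definition 4.1. Let $X \subset \mathcal{O}(D, \mathcal{P}_{\mathbf{m}})$ be a non-trivial Banach space of $\mathcal{P}_{\mathbf{m}}$-valued holomorphic functions on $D$. We say that $X$ is $U_{\nu,\mathbf{m}}(G)$-invariant if

(i) $\Phi \in X$, $g \in G \implies U_{\nu,\mathbf{m}}(g)\,\Phi \in X$ and $\|U_{\nu,\mathbf{m}}(g)\,\Phi\|_X = \|\Phi\|_X$.

(ii) For any finite (complex) Borel measure $\mu$ on $K$, the linear operator (convolution by $\mu$)
\[
(T_\mu \Phi)(z,\zeta) = \int_K d\mu(k)\, \Phi(kz, k\zeta)
\]
maps $X$ continuously into itself.

(iii) For every $z \in D$, the evaluation map $\Phi \mapsto (\delta_z \otimes I)\,\Phi \in \mathcal{P}_{\mathbf{m}}$, defined by $\big((\delta_z \otimes I)\,\Phi\big)(\zeta) := \Phi(z,\zeta)$, is bounded on $X$.

As before, condition (ii) is satisfied if the unweighted representation of $K$ on $X$ is strongly continuous.

Proposition 4.2. Let $X \ne (0)$ be an invariant Banach space in $\mathcal{O}(D, \mathcal{P}_{\mathbf{m}})$. Then

(i) $1 \otimes \mathcal{P}_{\mathbf{m}} \subset X$, and

(ii) there exists a constant $c_X$ such that $\|\Phi_0\|_{\mathcal{F}} \le c_X\, \|\Phi\|_X$ for all $\Phi \in X$.

Proof. Put $m := m_1 + \dots + m_r$, and consider the finite Borel measure $e^{imt}\, dt/2\pi$. Since the polynomials in $\mathcal{P}_{\mathbf{m}}$ have total degree $m$, we have
\[
\int_0^{2\pi} \frac{dt}{2\pi}\, e^{imt}\, \Phi(e^{-it}z, e^{-it}\zeta)
= \int_0^{2\pi} \frac{dt}{2\pi}\, e^{imt}\, e^{-imt}\, \Phi(e^{-it}z, \zeta)
= \int_0^{2\pi} \frac{dt}{2\pi}\, \Phi(e^{-it}z, \zeta) = \Phi(0,\zeta) = \Phi_0(\zeta).
\]
Since the action $U_{\nu,\mathbf{m}}$ on $X$ is isometric and $dt/2\pi$ is a probability measure, it follows that
\[
1 \otimes \Phi_0 = \int_0^{2\pi} \frac{dt}{2\pi}\, e^{imt}\, U_{\nu,\mathbf{m}}(e^{it})\, \Phi \tag{4.5}
\]
belongs to $X$, and $\|1 \otimes \Phi_0\|_X \le \|\Phi\|_X$. Choosing $\Phi \ne 0$, there exists $z \in D$ such that $\Phi_z(\zeta) = \Phi(z,\zeta) \not\equiv 0$. Applying a suitable $U_{\nu,\mathbf{m}}(g)$-transformation, we may assume $z = 0$, i.e., $\Phi_0(\zeta) = \Phi(0,\zeta) \not\equiv 0$. Since $K$ acts irreducibly on $\mathcal{P}_{\mathbf{m}}$, it follows from (4.5) that $1 \otimes p \in X$ for all $p \in \mathcal{P}_{\mathbf{m}}$, i.e., $1 \otimes \mathcal{P}_{\mathbf{m}} \subset X$, and there exists $c_X > 0$ such that $\|p\|_{\mathcal{F}} \le c_X\, \|1 \otimes p\|_X$. □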

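Definition 4.2. Let $\mathcal{M}_{\nu,\mathbf{m}} \subset \mathcal{O}(D, \mathcal{P}_{\mathbf{m}})$ be the Banach space of all holomorphic functions $\Phi : D \to \mathcal{P}_{\mathbf{m}}$ such that $\|\Phi\|_{\mathcal{M}_{\nu,\mathbf{m}}} < +\infty$, where
\[
\|\Phi\|_{\mathcal{M}_{\nu,\mathbf{m}}} = \sup_{z \in D}\, h(z,z)^{\nu/2}\, \|\Phi_z \circ B(z,z)^{1/2}\|_{\mathcal{F}} = \sup_{g \in G}\, \|(U_{\nu,\mathbf{m}}(g)\,\Phi)_0\|_{\mathcal{F}}.
\]

The requirements (ii) and (iii) in Definition 4.1 are easily checked, and hence, with the second expression for the norm, it follows that $\mathcal{M}_{\nu,\mathbf{m}}$ is $U_{\nu,\mathbf{m}}(G)$-invariant. Changing the $K$-invariant inner product on $\mathcal{P}_{\mathbf{m}}$, or taking another "base point" $a \in D$ instead of $0$, changes the norm only by a proportionality constant.

Theorem 4.1. Let $X \subset \mathcal{O}(D, \mathcal{P}_{\mathbf{m}})$ be a $U_{\nu,\mathbf{m}}$-invariant Banach space. Then $X \subset \mathcal{M}_{\nu,\mathbf{m}}$ continuously, i.e., $\mathcal{M}_{\nu,\mathbf{m}}$ is the unique maximal invariant space.

Proof. Let $\Phi \in X$. Then Proposition 4.2 implies
\[
\|(U_{\nu,\mathbf{m}}(g)\,\Phi)_0\|_{\mathcal{F}} \le c_X \cdot \|U_{\nu,\mathbf{m}}(g)\,\Phi\|_X = c_X\, \|\Phi\|_X
\]
and hence
\[
\sup_{g \in G}\, \|(U_{\nu,\mathbf{m}}(g)\,\Phi)_0\|_{\mathcal{F}} \le c_X \cdot \|\Phi\|_X.
\]
The assertion follows. □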

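For $p \in \mathcal{P}_{\mathbf{m}}$ and $g \in G$, define $p^g := U_{\nu,\mathbf{m}}(g)\,(1 \otimes p) \in \mathcal{O}(D, \mathcal{P}_{\mathbf{m}})$. For $g = g_a$, we put $p^a := p^{g_a}$ and obtain
\[
(p^a_z)(\zeta) = h(a,a)^{\nu/2}\, h(z,a)^{-\nu}\, p\big(B(a,a)^{1/2}\, B(z,a)^{-1} \zeta\big). \tag{4.6}
\]
More generally,
\[
p^{g_a k}_z(\zeta) = h(a,a)^{\nu/2}\, h(z,a)^{-\nu}\, p\big(k^{-1} B(a,a)^{1/2}\, B(z,a)^{-1} \zeta\big).
\]

Lemma 4.1. For large enough parameters $\alpha, \beta, \gamma$ we have the change of variables formula
\[
\begin{aligned}
&\int_D dw\, h(w,w)^{\alpha-p}\, h(g_a(x), w)^{-\beta}\, h(w, g_a(y))^{-\gamma}\, f(g_a^{-1}(w)) \\
&\quad = h(a,a)^{\alpha-\beta-\gamma}\, h(x,a)^{\beta}\, h(a,y)^{\gamma}
\int_D dw\, h(w,w)^{\alpha-p}\, h(x,w)^{-\beta}\, h(w,y)^{-\gamma}\, h(w,a)^{\gamma-\alpha}\, h(a,w)^{\beta-\alpha}\, f(w)
\end{aligned}
\]
for all $a, x, y \in D$ and all $f \in L^1(D, \mu_\alpha)$.

Proof. Since $dw\, h(w,w)^{-p}$ is $G$-invariant, it follows that
\[
\begin{aligned}
&\int_D dw\, h(w,w)^{\alpha-p}\, h(g_a(x), w)^{-\beta}\, h(w, g_a(y))^{-\gamma}\, f(g_a^{-1}(w)) \\
&\quad = \int_D dw\, h(w,w)^{-p}\, h(g_a(w), g_a(w))^{\alpha}\, h(g_a(x), g_a(w))^{-\beta}\, h(g_a(w), g_a(y))^{-\gamma}\, f(w).
\end{aligned}
\]
Now the assertion follows from
\[
\begin{aligned}
&h(g_a(w), g_a(w))^{\alpha}\, h(g_a(x), g_a(w))^{-\beta}\, h(g_a(w), g_a(y))^{-\gamma} \\
&\quad = \big[h(a,a)\, h(w,a)^{-1}\, h(w,w)\, h(a,w)^{-1}\big]^{\alpha}
\cdot \big[h(a,a)\, h(x,a)^{-1}\, h(x,w)\, h(a,w)^{-1}\big]^{-\beta}
\cdot \big[h(a,a)\, h(w,a)^{-1}\, h(w,y)\, h(a,y)^{-1}\big]^{-\gamma} \\
&\quad = h(a,a)^{\alpha-\beta-\gamma}\, h(x,a)^{\beta}\, h(a,y)^{\gamma}\, h(w,w)^{\alpha}\, h(x,w)^{-\beta}\, h(w,y)^{-\gamma}\, h(w,a)^{\gamma-\alpha}\, h(a,w)^{\beta-\alpha}.
\end{aligned}
\]
□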
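The transformation rule for $h$ that drives the proof of Lemma 4.1 can be tested numerically in the simplest rank-one case, the unit disc, where $h(z,w) = 1 - z\bar{w}$. The sketch below is only an illustration: the Möbius map `moebius` and its sign convention are assumptions made here for the disc, and are not claimed to coincide with the paper's $g_a$.

```python
import cmath
import random


def h(z: complex, w: complex) -> complex:
    """Rank-one kernel function on the unit disc: h(z, w) = 1 - z * conj(w)."""
    return 1 - z * w.conjugate()


def moebius(a: complex, z: complex) -> complex:
    """Disc automorphism z -> (z - a) / (1 - conj(a) z) (assumed convention)."""
    return (z - a) / (1 - a.conjugate() * z)


def random_disc_point(rng: random.Random, radius: float = 0.9) -> complex:
    """Random point in the disc of the given radius (kept away from the boundary)."""
    r = radius * rng.random() ** 0.5
    return cmath.rect(r, 2 * cmath.pi * rng.random())


def transformation_defect(a: complex, z: complex, w: complex) -> float:
    """|h(g z, g w) - h(a,a) h(z,w) / (h(z,a) h(a,w))| for g = moebius(a, .)."""
    lhs = h(moebius(a, z), moebius(a, w))
    rhs = h(a, a) * h(z, w) / (h(z, a) * h(a, w))
    return abs(lhs - rhs)


if __name__ == "__main__":
    rng = random.Random(0)
    worst = max(
        transformation_defect(*(random_disc_point(rng) for _ in range(3)))
        for _ in range(1000)
    )
    print("largest defect over 1000 random triples:", worst)
```

Each of the three bracketed factors in the proof is an instance of this rule with $(z,w)$ replaced by $(w,w)$, $(x,w)$ and $(w,y)$ respectively.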

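Generalizing Definition 3.6, we define the shift operator $S_\nu^{\nu+\beta}$ acting on $\mathcal{O}(D, \mathcal{P}_{\mathbf{m}})$ by
\[
(S_\nu^{\nu+\beta}\Phi)_z(\zeta) = c_{\nu,\mathbf{m}} \int_D dw\, h(w,w)^{\nu-p}\, h(z,w)^{-(\nu+\beta)}\, \Phi_w\big(B(w,w)\, B(z,w)^{-1} \zeta\big)
\]
for all $z \in D$ and $\zeta \in Z$. The normalization is chosen so that $\beta = 0$ yields the identity. It is easily shown that $S_\nu^{\nu+\beta}$ commutes with the (unweighted) action of $K$ on $\mathcal{O}(D, \mathcal{P}_{\mathbf{m}})$.

Proposition 4.3. Let $p \in \mathcal{P}_{\mathbf{m}}$ and $a, z \in D$. Then, using the notation (4.6), we have
\[
(S_\nu^{\nu+\beta} p^a)_z = h(z,a)^{-\beta}\, p^a_z.
\]

Proof. Using a $\mathbb{T}$-rotation in the anti-holomorphic variable $w$ yields
\[
\begin{aligned}
&\int_D dw\, h(w,w)^{\nu-p}\, h(g_a(z), w)^{-(\nu+\beta)}\, h(a,w)^{\beta}\, p\big(B(w,w)\, B(g_a(z), w)^{-1} g_a'(z)\, \zeta\big) \\
&\quad = \int_D dw\, h(w,w)^{\nu-p} \int_{\mathbb{T}} \frac{d\vartheta}{2\pi}\, h(g_a(z), \vartheta w)^{-(\nu+\beta)}\, h(a, \vartheta w)^{\beta}\, p\big(B(w,w)\, B(g_a(z), \vartheta w)^{-1} g_a'(z)\, \zeta\big) \\
&\quad = \int_D dw\, h(w,w)^{\nu-p}\, h(g_a(z), 0)^{-(\nu+\beta)}\, h(a,0)^{\beta}\, p\big(B(w,w)\, B(g_a(z), 0)^{-1} g_a'(z)\, \zeta\big) \\
&\quad = \int_D dw\, h(w,w)^{\nu-p}\, p\big(B(w,w)\, g_a'(z)\, \zeta\big) = c_{\nu,\mathbf{m}}^{-1}\, p(g_a'(z)\, \zeta).
\end{aligned}
\]
Applying Lemma 4.1 to $x = g_a(z)$, $y = 0$ we obtain
\[
\begin{aligned}
(S_\nu^{\nu+\beta} p^a)_z(\zeta)
&= c_{\nu,\mathbf{m}} \int_D dw\, h(w,w)^{\nu-p}\, h(z,w)^{-(\nu+\beta)}\, p^a_w\big(B(w,w)\, B(z,w)^{-1} \zeta\big) \\
&= c_{\nu,\mathbf{m}} \int_D dw\, h(w,w)^{\nu-p}\, h(z,w)^{-(\nu+\beta)}\, h(a,a)^{\nu/2}\, h(w,a)^{-\nu}\, p\big(g_a'(w)\, B(w,w)\, B(z,w)^{-1} \zeta\big) \\
&= c_{\nu,\mathbf{m}}\, h(a,a)^{\nu/2} \int_D dw\, h(w,w)^{\nu-p}\, h(z,w)^{-(\nu+\beta)}\, h(w,a)^{-\nu}\, p\big(B(g_a(w), g_a(w))\, B(g_a(z), g_a(w))^{-1} g_a'(z)\, \zeta\big).
\end{aligned}
\]
The general transformation formula (2.2) specializes to
\[
B(g_a(z), g_a(w)) = g_a'(z)\, B(z,w)\, g_a'(w)^* = B(a,a)^{1/2}\, B(z,a)^{-1}\, B(z,w)\, B(a,w)^{-1}\, B(a,a)^{1/2}.
\]
As a consequence,
\[
B(g_a(w), g_a(w))\, B(g_a(z), g_a(w))^{-1}\, g_a'(z) = g_a'(w)\, B(w,w)\, B(z,w)^{-1}.
\]
Hence
\[
\begin{aligned}
(S_\nu^{\nu+\beta} p^a)_z(\zeta)
&= c_{\nu,\mathbf{m}}\, h(a,a)^{\nu/2}\, h(a,a)^{-(\nu+\beta)}\, h(g_a(z), a)^{\nu+\beta} \\
&\qquad \cdot \int_D dw\, h(w,w)^{\nu-p}\, h(g_a(z), w)^{-(\nu+\beta)}\, h(a,w)^{\beta}\, p\big(B(w,w)\, B(g_a(z), w)^{-1} g_a'(z)\, \zeta\big) \\
&= h(a,a)^{\nu/2}\, h(z,a)^{-(\nu+\beta)}\, p(g_a'(z)\, \zeta) = h(z,a)^{-\beta}\, p^a_z(\zeta).
\end{aligned}
\]
□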

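Proposition 4.4. The operators $S_\nu^{\gamma}$ are symmetric with respect to $\langle \cdot, \cdot \rangle_{\nu,\mathbf{m}}$, namely
\[
\langle S_\nu^{\gamma} \Phi, \Psi \rangle_{\nu,\mathbf{m}} = \langle \Phi, S_\nu^{\gamma} \Psi \rangle_{\nu,\mathbf{m}} \tag{4.7}
\]
for all $\Phi, \Psi \in \mathcal{H}_{\nu,\mathbf{m}}$ for which $S_\nu^{\gamma}\Phi, S_\nu^{\gamma}\Psi \in \mathcal{H}_{\nu,\mathbf{m}}$.

Proof. For convenience we denote $d\sigma_{\nu,\mathbf{m}}(z) = c_{\nu,\mathbf{m}}\, dz\, h(z,z)^{\nu-p}$. Then we have
\[
\begin{aligned}
\langle S_\nu^{\gamma} \Phi, \Psi \rangle_{\nu,\mathbf{m}}
&= \int_D d\sigma_{\nu,\mathbf{m}}(z)\, \Big\langle (S_\nu^{\gamma}\Phi)(z, B(z,z)^{1/2}\,\cdot),\ \Psi(z, B(z,z)^{1/2}\,\cdot) \Big\rangle_{\mathcal{F}} \\
&= \int_D d\sigma_{\nu,\mathbf{m}}(z)\, \Big\langle \int_D d\sigma_{\nu,\mathbf{m}}(w)\, h(z,w)^{-\gamma}\, \Phi(w, B(w,w)\, B(z,w)^{-1} B(z,z)^{1/2}\,\cdot),\ \Psi(z, B(z,z)^{1/2}\,\cdot) \Big\rangle_{\mathcal{F}} \\
&= \int_D d\sigma_{\nu,\mathbf{m}}(w)\, \Big\langle \Phi(w, B(w,w)\, B(z,w)^{-1} B(z,z)^{1/2}\,\cdot),\ \int_D d\sigma_{\nu,\mathbf{m}}(z)\, h(w,z)^{-\gamma}\, \Psi(z, B(z,z)^{1/2}\,\cdot) \Big\rangle_{\mathcal{F}}.
\end{aligned}
\]
Using the fact that for all $p, q \in \mathcal{P}_{\mathbf{m}}$ and $T \in K^{\mathbb{C}}$
\[
\langle p \circ T, q \rangle_{\mathcal{F}} = \langle p, q \circ T^* \rangle_{\mathcal{F}},
\]
we obtain (with $T = B(w,w)^{1/2}\, B(z,w)^{-1}\, B(z,z)^{1/2}$) that the last integral is equal to
\[
\begin{aligned}
&\int_D d\sigma_{\nu,\mathbf{m}}(w)\, \Big\langle \Phi(w, B(w,w)^{1/2}\,\cdot),\ \int_D d\sigma_{\nu,\mathbf{m}}(z)\, h(w,z)^{-\gamma}\, \Psi(z, B(z,z)\, B(w,z)^{-1} B(w,w)^{1/2}\,\cdot) \Big\rangle_{\mathcal{F}} \\
&\quad = \int_D d\sigma_{\nu,\mathbf{m}}(w)\, \Big\langle \Phi(w, B(w,w)^{1/2}\,\cdot),\ (S_\nu^{\gamma}\Psi)(w, B(w,w)^{1/2}\,\cdot) \Big\rangle_{\mathcal{F}}
= \langle \Phi, S_\nu^{\gamma} \Psi \rangle_{\nu,\mathbf{m}}.
\end{aligned}
\]
□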

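The same arguments yield the following result.

Proposition 4.5. For $\nu, \gamma \in \mathbb{R}$ let $\Phi \in \mathcal{H}_{\nu,\mathbf{m}} \cap \mathcal{H}_{\gamma,\mathbf{m}}$ and $\Psi \in \mathcal{H}_{\nu,\mathbf{m}}$ with $S_\nu^{\gamma}\Psi \in \mathcal{H}_{\gamma,\mathbf{m}}$. Then
\[
\langle \Phi, \Psi \rangle_{\nu,\mathbf{m}} = \langle \Phi, S_\nu^{\gamma} \Psi \rangle_{\gamma,\mathbf{m}}.
\]

Proof.
\[
\begin{aligned}
\langle \Phi, S_\nu^{\gamma} \Psi \rangle_{\gamma,\mathbf{m}}
&= \int_D d\sigma_{\gamma,\mathbf{m}}(z)\, \Big\langle \Phi(z, B(z,z)^{1/2}\,\cdot),\ (S_\nu^{\gamma}\Psi)(z, B(z,z)^{1/2}\,\cdot) \Big\rangle_{\mathcal{F}} \\
&= \int_D d\sigma_{\gamma,\mathbf{m}}(z)\, \Big\langle \Phi(z, B(z,z)^{1/2}\,\cdot),\ \int_D d\sigma_{\nu,\mathbf{m}}(w)\, h(w,z)^{-\gamma}\, \Psi(w, B(w,w)\, B(z,w)^{-1} B(z,z)^{1/2}\,\cdot) \Big\rangle_{\mathcal{F}} \\
&= \int_D d\sigma_{\nu,\mathbf{m}}(w)\, \Big\langle \int_D d\sigma_{\gamma,\mathbf{m}}(z)\, \Phi(z, B(z,z)\, B(w,z)^{-1} B(w,w)^{1/2}\,\cdot)\, h(z,w)^{-\gamma},\ \Psi(w, B(w,w)^{1/2}\,\cdot) \Big\rangle_{\mathcal{F}} \\
&= \int_D d\sigma_{\nu,\mathbf{m}}(w)\, \Big\langle \Phi(w, B(w,w)^{1/2}\,\cdot),\ \Psi(w, B(w,w)^{1/2}\,\cdot) \Big\rangle_{\mathcal{F}} = \langle \Phi, \Psi \rangle_{\nu,\mathbf{m}},
\end{aligned}
\]
where we have used the reproducing property. □

Corollary 4.1. Let $\Psi, \Phi \in \mathcal{H}_{\nu,\mathbf{m}} \cap \mathcal{H}_{\gamma,\mathbf{m}}$ satisfy $S_\nu^{\gamma}\Psi, S_\nu^{\gamma}\Phi \in \mathcal{H}_{\gamma,\mathbf{m}}$. Then
\[
\langle S_\nu^{\gamma}\Phi, \Psi \rangle_{\gamma,\mathbf{m}} = \langle \Phi, \Psi \rangle_{\nu,\mathbf{m}} = \langle \Phi, S_\nu^{\gamma}\Psi \rangle_{\gamma,\mathbf{m}}.
\]

Proof. The second equality follows from Proposition 4.5. For the first,
\[
\langle S_\nu^{\gamma}\Phi, \Psi \rangle_{\gamma,\mathbf{m}} = \overline{\langle \Psi, S_\nu^{\gamma}\Phi \rangle_{\gamma,\mathbf{m}}} = \overline{\langle \Psi, \Phi \rangle_{\nu,\mathbf{m}}} = \langle \Phi, \Psi \rangle_{\nu,\mathbf{m}}.
\]
□

Proposition 4.6. We have
\[
\Phi_z(\zeta) = c_{\nu+\beta,\mathbf{m}} \int_D da\, h(a,a)^{\nu+\beta-p}\, h(z,a)^{-\nu}\, (S_\nu^{\nu+\beta}\Phi)_a\big(B(a,a)\, B(z,a)^{-1} \zeta\big).
\]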

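Proof. Let $z \in D$ and $p \in \mathcal{P}_{\mathbf{m}}$ be fixed. The reproducing formula (4.3) applied to $\nu + \beta$ yields
\[
\begin{aligned}
&c_{\nu+\beta,\mathbf{m}}^{-1} \cdot h(z,z)^{\nu/2}\, \big(\Phi_z \,\big|\, p \circ B(z,z)^{1/2}\big)_{\mathcal{F}} \\
&\quad = \int_D da\, h(a,a)^{\nu+\beta-p}\, h(z,a)^{-(\nu+\beta)}\, h(z,z)^{\nu/2}\, \big(\Phi_a \circ B(a,a)\, B(z,a)^{-1} \,\big|\, p \circ B(z,z)^{1/2}\big)_{\mathcal{F}} \\
&\quad = \int_D da\, h(a,a)^{\nu+\beta-p}\, h(z,a)^{-(\nu+\beta)}\, h(z,z)^{\nu/2}\, \big(\Phi_a \circ B(a,a) \,\big|\, p \circ B(z,z)^{1/2} B(a,z)^{-1}\big)_{\mathcal{F}} \\
&\quad = \int_D da\, h(a,a)^{\nu+\beta-p}\, \big(\Phi_a \circ B(a,a) \,\big|\, h(a,z)^{-\beta} \cdot h(a,z)^{-\nu}\, h(z,z)^{\nu/2}\, p \circ B(z,z)^{1/2} B(a,z)^{-1}\big)_{\mathcal{F}} \\
&\quad = \int_D da\, h(a,a)^{\nu+\beta-p}\, \big(\Phi_a \circ B(a,a) \,\big|\, h(a,z)^{-\beta} \cdot p^z_a\big)_{\mathcal{F}} \\
&\quad = \int_D da\, h(a,a)^{\nu+\beta-p}\, \big(\Phi_a \circ B(a,a) \,\big|\, (S_\nu^{\nu+\beta} p^z)_a\big)_{\mathcal{F}}.
\end{aligned}
\]
Using Proposition 4.4 for the parameter $\nu + \beta$, we obtain
\[
\begin{aligned}
&c_{\nu+\beta,\mathbf{m}}^{-1} \cdot h(z,z)^{\nu/2}\, \big(\Phi_z \,\big|\, p \circ B(z,z)^{1/2}\big)_{\mathcal{F}} \\
&\quad = \int_D da\, h(a,a)^{\nu+\beta-p}\, \big((S_\nu^{\nu+\beta}\Phi)_a \circ B(a,a) \,\big|\, p^z_a\big)_{\mathcal{F}} \\
&\quad = \int_D da\, h(a,a)^{\nu+\beta-p}\, \big((S_\nu^{\nu+\beta}\Phi)_a \circ B(a,a) \,\big|\, h(a,z)^{-\nu}\, h(z,z)^{\nu/2}\, p \circ B(z,z)^{1/2} B(a,z)^{-1}\big)_{\mathcal{F}} \\
&\quad = h(z,z)^{\nu/2} \int_D da\, h(a,a)^{\nu+\beta-p}\, h(z,a)^{-\nu}\, \big((S_\nu^{\nu+\beta}\Phi)_a \circ B(a,a) \,\big|\, p \circ B(z,z)^{1/2} B(a,z)^{-1}\big)_{\mathcal{F}} \\
&\quad = h(z,z)^{\nu/2} \int_D da\, h(a,a)^{\nu+\beta-p}\, h(z,a)^{-\nu}\, \big((S_\nu^{\nu+\beta}\Phi)_a \circ B(a,a)\, B(z,a)^{-1} \,\big|\, p \circ B(z,z)^{1/2}\big)_{\mathcal{F}}.
\end{aligned}
\]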

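Since any polynomial in $\mathcal{P}_{\mathbf{m}}$ has the form $h(z,z)^{\nu/2}\, p \circ B(z,z)^{1/2}$, the assertion follows. □

Remark 4.1. Proposition 4.6 can be written as
\[
S^{\nu}_{\nu+\beta}\, S_\nu^{\nu+\beta}\, \Phi = \Phi
\]
for $\Phi$ in a dense subspace of $\mathcal{H}_{\nu,\mathbf{m}}$. Thus, formally, $S^{\nu}_{\gamma}\, S_\nu^{\gamma} = I$ for all $\nu, \gamma \in \mathbb{R}$ large enough.

Up to now, the polynomial $p \in \mathcal{P}_{\mathbf{m}}$ was arbitrary. We now specialize to
\[
A(\zeta) = K^{\mathbf{m}}_e(\zeta) = K^{\mathbf{m}}(\zeta, e),
\]
where $e \in Z$ is a maximal tripotent. Then we have
\[
\begin{aligned}
A^{g_a k}_z(\zeta) &= h(a,a)^{\nu/2}\, h(z,a)^{-\nu}\, K^{\mathbf{m}}\big(k^{-1} B(a,a)^{1/2}\, B(z,a)^{-1} \zeta,\ e\big) \\
&= h(a,a)^{\nu/2}\, h(z,a)^{-\nu}\, K^{\mathbf{m}}\big(B(a,a)^{1/2}\, B(z,a)^{-1} \zeta,\ ke\big).
\end{aligned} \tag{4.8}
\]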

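Definition 4.3. (i) Let $\mathfrak{M}_{\nu,\mathbf{m}}$ denote the Banach space of all holomorphic functions $\Phi : D \to \mathcal{P}_{\mathbf{m}}$ which have a representation
\[
\Phi_z(\zeta) = \int_G d\mu(g)\, A^g_z(\zeta)
\]
for some finite $\mathbb{C}$-valued Borel measure $\mu$ on $G$. The norm is defined as the infimum
\[
\|\Phi\|_{\mathfrak{M}_{\nu,\mathbf{m}}} = \inf_\mu \|\mu\|
\]
taken over all such representations.

(ii) Define a vector-valued $L^1$-space $\mathcal{L}^1_\gamma$ to consist of all $\Phi \in \mathcal{O}(D, \mathcal{P}_{\mathbf{m}})$ such that
\[
\|\Phi\|_{\mathcal{L}^1_\gamma} := c_{\gamma,\mathbf{m}} \int_D dz\, h(z,z)^{\gamma-p}\, \|\Phi_z \circ B(z,z)^{1/2}\|_{\mathcal{F}} < \infty.
\]
Here $\|\cdot\|_{\mathcal{F}}$ is the Fischer norm on $\mathcal{P}_{\mathbf{m}}$.

Our main theorem in this section is

Theorem 4.2. Let $\Phi \in \mathcal{O}(D, \mathcal{P}_{\mathbf{m}})$ and suppose that $S_\nu^{\nu+\beta}\Phi \in \mathcal{L}^1_{\beta+\nu/2}$. Then $\Phi \in \mathfrak{M}_{\nu,\mathbf{m}}$ and
\[
\|\Phi\|_{\mathfrak{M}_{\nu,\mathbf{m}}} \le (d/r)_{\mathbf{m}}^{1/2}\, d_{\mathbf{m}}^{1/2}\, \frac{c_{\nu+\beta,\mathbf{m}}}{c_{\beta+\nu/2}}\, \|S_\nu^{\nu+\beta}\Phi\|_{\mathcal{L}^1_{\beta+\nu/2}}.
\]

Proof. Define a complex measure $\mu$ on $G$ by
\[
d\mu(g_a k) = dk\, da\, h(a,a)^{\beta+\nu/2-p}\, (S_\nu^{\nu+\beta}\Phi)_a\big(B(a,a)^{1/2} ke\big).
\]
For each $k \in K$ the Cauchy-Schwarz inequality yields
\[
\begin{aligned}
\big|(S_\nu^{\nu+\beta}\Phi)_a(B(a,a)^{1/2} ke)\big|
&= \big|\big((S_\nu^{\nu+\beta}\Phi)_a \circ B(a,a)^{1/2} \,\big|\, K^{\mathbf{m}}_{ke}\big)_{\mathcal{F}}\big| \\
&\le \|(S_\nu^{\nu+\beta}\Phi)_a \circ B(a,a)^{1/2}\|_{\mathcal{F}} \cdot K^{\mathbf{m}}(e,e)^{1/2} \\
&= \frac{d_{\mathbf{m}}^{1/2}}{(d/r)_{\mathbf{m}}^{1/2}}\, \|(S_\nu^{\nu+\beta}\Phi)_a \circ B(a,a)^{1/2}\|_{\mathcal{F}}
= d_{\mathbf{m}}^{1/2}\, \|(S_\nu^{\nu+\beta}\Phi)_a \circ B(a,a)^{1/2}\|_{d/r}.
\end{aligned}
\]
Hence
\[
\begin{aligned}
\|\mu\| &= \int_K dk \int_D da\, h(a,a)^{\beta+\nu/2-p}\, \big|(S_\nu^{\nu+\beta}\Phi)_a(B(a,a)^{1/2} ke)\big| \\
&\le \frac{d_{\mathbf{m}}^{1/2}}{(d/r)_{\mathbf{m}}^{1/2}} \int_D da\, h(a,a)^{\beta+\nu/2-p}\, \big\|(S_\nu^{\nu+\beta}\Phi)_a \circ B(a,a)^{1/2}\big\|_{\mathcal{F}}
= \frac{d_{\mathbf{m}}^{1/2}}{c_{\beta+\nu/2}\, (d/r)_{\mathbf{m}}^{1/2}}\, \big\|S_\nu^{\nu+\beta}\Phi\big\|_{\mathcal{L}^1_{\beta+\nu/2}}.
\end{aligned}
\]
Hence $\mu$ is a finite measure on $G$. Moreover, (4.8) implies
\[
\begin{aligned}
\int_G d\mu(g)\, A^g_z(\zeta)
&= \int_D da\, h(a,a)^{\beta+\nu/2-p}\, h(a,a)^{\nu/2}\, h(z,a)^{-\nu} \int_K dk\, (S_\nu^{\nu+\beta}\Phi)_a(B(a,a)^{1/2} ke)\, K^{\mathbf{m}}\big(B(a,a)^{1/2} B(z,a)^{-1}\zeta,\ ke\big) \\
&= \int_D da\, h(a,a)^{\nu+\beta-p}\, h(z,a)^{-\nu} \int_K dk\, (S_\nu^{\nu+\beta}\Phi)_a(B(a,a)^{1/2} ke)\, \overline{K^{\mathbf{m}}\big(ke,\ B(a,a)^{1/2} B(z,a)^{-1}\zeta\big)} \\
&= \int_D da\, h(a,a)^{\nu+\beta-p}\, h(z,a)^{-\nu}\, \big((S_\nu^{\nu+\beta}\Phi)_a \circ B(a,a)^{1/2} \,\big|\, K^{\mathbf{m}}_{B(a,a)^{1/2} B(z,a)^{-1}\zeta}\big)_{d/r} \\
&= (d/r)_{\mathbf{m}}^{-1} \int_D da\, h(a,a)^{\nu+\beta-p}\, h(z,a)^{-\nu}\, \big((S_\nu^{\nu+\beta}\Phi)_a \circ B(a,a)^{1/2} \,\big|\, K^{\mathbf{m}}_{B(a,a)^{1/2} B(z,a)^{-1}\zeta}\big)_{\mathcal{F}} \\
&= (d/r)_{\mathbf{m}}^{-1} \int_D da\, h(a,a)^{\nu+\beta-p}\, h(z,a)^{-\nu}\, (S_\nu^{\nu+\beta}\Phi)_a\big(B(a,a)^{1/2} B(a,a)^{1/2} B(z,a)^{-1}\zeta\big) \\
&= (d/r)_{\mathbf{m}}^{-1} \int_D da\, h(a,a)^{\nu+\beta-p}\, h(z,a)^{-\nu}\, (S_\nu^{\nu+\beta}\Phi)_a\big(B(a,a)\, B(z,a)^{-1}\zeta\big) \\
&= (d/r)_{\mathbf{m}}^{-1}\, c_{\nu+\beta,\mathbf{m}}^{-1}\, \Phi_z(\zeta),
\end{aligned}
\]
using Proposition 4.6. Thus $\Phi$ is represented by $\mu$, up to a constant. □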

5. Minimal spaces for non-tube type domains

In this section we obtain a "converse" of Theorem 4.2, and thus a complete characterization of the minimal space, for the special partitions $\mathbf{s} = (s, \dots, s)$, where $s \in \mathbb{N}$. These "constant" partitions arise naturally in the study of highest quotients (Dirichlet spaces) for domains which are not of tube type (cf. [AU3]). The integration formulas developed here may be of independent interest. We consider the Peirce decomposition
\[
Z = Z_1 \oplus Z_{1/2} = \begin{pmatrix} Z_1 \\ Z_{1/2} \end{pmatrix} \tag{5.1}
\]
of $Z$ for a maximal tripotent $e$, and write $z \in Z$ as $z = z_1 + z_{1/2}$, with $z_1 \in Z_1$ and $z_{1/2} \in Z_{1/2}$.

Lemma 5.1. For $u \in Z_1$, $v \in Z_{1/2}$ the Bergman operator $B(u,v)$ has a block-matrix decomposition
\[
B(u,v) = \begin{pmatrix} I_1 & -2\, u \,\square\, v^* \\ 0 & I_{1/2} \end{pmatrix} \tag{5.2}
\]
with respect to (5.1). Here $I_\nu$ denotes the identity operator on $Z_\nu$.

Proof. For $z \in Z$, we have $\{u\, v^*\, z_1\} \in Z_{3/2} = (0)$ and $Q_v z_1 \in Z_0 = (0)$, since $e$ is maximal. Moreover, $Q_v z_{1/2} \in Z_{1/2}$ and hence $Q_u Q_v z_{1/2} \in Z_{3/2} = (0)$. Thus
\[
\begin{aligned}
B(u,v)\, z &= z - 2\{u\, v^*\, z\} + Q_u Q_v z \\
&= z_1 + z_{1/2} - 2\{u\, v^*\, (z_1 + z_{1/2})\} + Q_u Q_v (z_1 + z_{1/2}) \\
&= z_1 + z_{1/2} - 2\{u\, v^*\, z_{1/2}\},
\end{aligned}
\]
with $z_1 - 2\{u\, v^*\, z_{1/2}\} \in Z_1$. The assertion follows. □

Corollary 5.1. For $u \in Z_1$, $v \in Z_{1/2}$, we have $\det_Z B(u,v) = 1$. In particular, $B(u,v)$ is invertible, with inverse given by
\[
B(u,v)^{-1} = B(u,-v) = \begin{pmatrix} I_1 & 2\, u \,\square\, v^* \\ 0 & I_{1/2} \end{pmatrix}.
\]

Lemma 5.2. If $B(z,w)$ is invertible and $Q_z w = Q_w z = 0$, then $z^w = z$.

Proof. By assumption, we have
\[
B(z,w)\, z = z - 2\{z\, w^*\, z\} + Q_z Q_w z = z - 2 Q_z w + Q_z Q_w z = z = z - Q_z w = B(z,w)\, z^w.
\]
Since $B(z,w)$ is invertible, we conclude that $z = z^w$. □

Proposition 5.1. Suppose 𝑣, 𝑢 ∈ 𝐷 and 𝑄𝑢 𝑣 = 𝑄𝑣 𝑢 = 0. Then we have

) ( 𝐵 𝑢 + 𝐵(𝑢, 𝑢)1/2 𝑣, 𝑢 + 𝐵(𝑢, 𝑢)1/2 𝑣 = 𝐵(𝑢, 𝑢)1/2 𝐵(𝑣, 𝑢) 𝐵(𝑣, 𝑣) 𝐵(𝑢, 𝑣) 𝐵(𝑢, 𝑢)1/2 .

(5.3)

(5.4)

Proof. Since $v^{-u} = v$ by Lemma 5.2, we have $g_u(v) = u + B(u,u)^{1/2}\, v^{-u} = u + B(u,u)^{1/2}\, v$ and $g_u'(v) = B(u,u)^{1/2}\, B(v^{-u}, u) = B(u,u)^{1/2}\, B(v,u)$. Now apply (2.2). □

For any tripotent, the Peirce spaces are hermitian Jordan subtriples of $Z$, and $Z_1$ and $Z_0$ are always irreducible if $Z$ is irreducible. One can show that in our case of a maximal tripotent (i.e., $Z_0 = (0)$) the Peirce $\tfrac12$-space $Z_{1/2}$ is also irreducible. Let $D_1 = D \cap Z_1$ and $D_{1/2} = D \cap Z_{1/2}$ denote the respective open unit balls.

Corollary 5.2. Let $u \in D_1$ and $v \in D_{1/2}$. Then (5.3) holds and, in addition, we have
\[
h\big(u + B(u,u)^{1/2} v,\ u + B(u,u)^{1/2} v\big) = h(u,u)\, h(v,v). \tag{5.5}
\]

Proof. By Lemma 5.1 and Lemma 5.2, the assumption of Proposition 5.1 is satisfied, showing that (5.4) holds. Moreover, $h(u,v) = 1 = h(v,u)$ by Lemma 5.1. Therefore (5.5) follows from (5.4) by taking determinants. □

Proposition 5.2. For $u \in Z_1$ and $v \in Z_{1/2}$, we have $u + B(u,u)^{1/2} v \in D$ if and only if $u \in D_1$ and $v \in D_{1/2}$.

Proof. As a consequence of the spectral theorem for Jordan triples, we have $h(z,z) > 0$ for $z \in D$ and $h(z,z) = 0$ for all $z \in \partial D$. Hence $D$ is a connected component of $M := \{z \in Z : h(z,z) > 0\}$. Define $\pi : D \to Z_{1/2}$ by
\[
\pi(w) := B(w_1, w_1)^{-1/2}\, w_{1/2}
\]
for all $w = w_1 + w_{1/2} \in D$ with $w_\nu \in Z_\nu$. Since Peirce projections are contractive, we have $\|w_1\| \le \|w\| < 1$. Therefore $w_1 \in D_1$ and $B(w_1, w_1)$ is invertible. By Corollary 5.2, we have
\[
h(w_1, w_1)\, h(\pi(w), \pi(w)) = h(w,w) \ne 0.
\]
It follows that $h(\pi(w), \pi(w)) \ne 0$ and therefore $\pi(w) \in Z_{1/2} \cap M$. Since $\pi$ is continuous and $D$ is connected, it follows that $\pi(D)$ is contained in the connected component of $M \cap Z_{1/2}$ containing $0$, which coincides with $D_{1/2}$. This shows that $w = u + B(u,u)^{1/2} v \in D$ implies $u \in D_1$ and $v = \pi(w) \in D_{1/2}$.

Conversely, let $u \in D_1$. Define $F_u : Z_{1/2} \to Z$ by $F_u(v) := u + B(u,u)^{1/2} v$. Then Corollary 5.2 implies $h(F_u(v), F_u(v)) = h(u,u)\, h(v,v)$. If $v \in D_{1/2}$, then $h(v,v) \ne 0$ and hence $F_u(v) \in M$. Since $F_u(0) = u \in D_1 \subset D$, the connected set $F_u(D_{1/2})$ is contained in the connected component of $M$ containing $u$, which coincides with $D$. Therefore $w = F_u(v) \in D$. □
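For orientation, the factorization (5.5) and Proposition 5.2 can be made completely explicit in the rank-one case of the unit ball of $\mathbb{C}^d$ with $e = (1, 0, \dots, 0)$. The sketch below rests on an assumption stated here and not taken from the text: with the triple product $\{x\,y^*\,z\} = \tfrac12\big((x|y)z + (z|y)x\big)$ one has $h(z,z) = 1 - |z|^2$, and $B(u,u)^{1/2}$ acts on the Peirce $\tfrac12$-space (the vectors orthogonal to $e$) as multiplication by $(1-|u|^2)^{1/2}$. All function names are illustrative.

```python
import math
from typing import List, Tuple


def h(z: List[complex]) -> float:
    """h(z, z) = 1 - |z|^2 on C^d (rank-one case, assumed normalization)."""
    return 1.0 - sum(abs(c) ** 2 for c in z)


def F(u: complex, v: List[complex]) -> List[complex]:
    """F(u, v) = u e + B(u,u)^{1/2} v, with v in the Peirce 1/2-space.

    Assumption: B(u,u)^{1/2} multiplies v by sqrt(1 - |u|^2); requires |u| < 1.
    """
    scale = math.sqrt(1.0 - abs(u) ** 2)
    return [u] + [scale * c for c in v]


def F_inverse(w: List[complex]) -> Tuple[complex, List[complex]]:
    """Inverse map w -> (w_1, B(w_1,w_1)^{-1/2} w_{1/2}); requires |w_1| < 1."""
    scale = math.sqrt(1.0 - abs(w[0]) ** 2)
    return w[0], [c / scale for c in w[1:]]


if __name__ == "__main__":
    u, v = 0.3 + 0.4j, [0.5j, -0.2 + 0.0j]
    w = F(u, v)
    # Both sides of (5.5) in this model: h(w,w) and h(u,u) h(v,v).
    print(h(w), h([u]) * h(v))
```

In this model $h(F(u,v), F(u,v)) = (1-|u|^2)(1-|v|^2)$, so $F(u,v)$ lies in the ball exactly when $|u| < 1$ and $|v| < 1$, which is the rank-one content of Proposition 5.2.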

According to Proposition 5.2 the map
\[
F(u,v) := u + B(u,u)^{1/2}\, v
\]
defines a real-analytic isomorphism from $D_1 \times D_{1/2}$ onto $D$, with inverse
\[
F^{-1}(w_1 + w_{1/2}) = w_1 + B(w_1, w_1)^{-1/2}\, w_{1/2}.
\]
Put $\beta(u) := B(u,u)^{1/2} \in \operatorname{End}(Z)$. Then $F$ has the derivative
\[
F'(u,v)(x,y) = x + \beta(u)\, y + (\beta'(u)\, x)\, v
\]
for $x \in Z_1$, $y \in Z_{1/2}$. Since $\beta(u)$ preserves both Peirce spaces, the same is true for $\beta'(u)x \in \operatorname{End}(Z)$. Thus we have a block-matrix decomposition
\[
F'(u,v) = \begin{pmatrix} I_1 & T \\ 0 & B(u,u)^{1/2} \end{pmatrix}
\]
with respect to (5.1), where
\[
T x := (\beta'(u)\, x)\, v = \frac{\partial}{\partial t}\Big|_{t=0} \beta(u + tx)\, v.
\]
It follows that
\[
\det\nolimits_Z F'(u,v) = \det\nolimits_{Z_{1/2}} B(u,u)^{1/2} = h(u,u)^{b/2}.
\]
Hence $F'(u,v)$ has the "real" determinant
\[
\det F'(u,v) = \big|\det\nolimits_{Z_{1/2}} B(u,u)^{1/2}\big|^2 = h(u,u)^{b}. \tag{5.6}
\]
Making the change of variables
\[
w = u + B(u,u)^{1/2}\, v \qquad (u \in D_1,\ v \in D_{1/2}), \tag{5.7}
\]
(5.6) yields
\[
dw = h(u,u)^{b}\, du\, dv. \tag{5.8}
\]

Proposition 5.3. Let $u \in Z_1$, $v \in Z_{1/2}$ and $a = a_1 + a_{1/2} \in Z$ with $a_\nu \in Z_\nu$. Suppose that $B(a_1, u)$ is invertible. Then
\[
h\big(u + B(u,u)^{1/2} v,\ a\big) = h(u, a_1) \cdot h\big(v,\ B(u,u)^{1/2}\, B(a_1,u)^{-1}\, a_{1/2}\big). \tag{5.9}
\]

Proof. Polarizing the identity (5.5) yields
\[
h\big(u + B(u,a_1)^{1/2} v_1,\ a_1 + B(a_1,u)^{1/2} v_2\big) = h(u, a_1)\, h(v_1, v_2) \tag{5.10}
\]
whenever $v_1, v_2 \in Z_{1/2}$. Putting
\[
v_1 = B(u,a_1)^{-1/2}\, B(u,u)^{1/2}\, v \quad\text{and}\quad v_2 = B(a_1,u)^{-1/2}\, a_{1/2},
\]
the left-hand sides of (5.9) and (5.10) agree, whereas
\[
\begin{aligned}
h(v_1, v_2) &= h\big(B(u,a_1)^{-1/2}\, B(u,u)^{1/2}\, v,\ B(a_1,u)^{-1/2}\, a_{1/2}\big) \\
&= h\big(B(u,u)^{1/2}\, v,\ B(a_1,u)^{-1}\, a_{1/2}\big) \\
&= h\big(v,\ B(u,u)^{1/2}\, B(a_1,u)^{-1}\, a_{1/2}\big).
\end{aligned}
\]
□

Lemma 5.3. Let $u \in D_1$ and $a = a_1 + a_{1/2} \in D$ with $a_\nu \in Z_\nu$. Then $B(a_1, u)$ is invertible and
\[
B(u,u)^{1/2}\, B(a_1,u)^{-1}\, a_{1/2} \in D_{1/2}.
\]

Proof. Since $a_1 \in D_1$, it follows that $B(a_1, u)$ is invertible. Therefore the addition formula [L2, p. 26] yields
\[
a^u = (a_1 + a_{1/2})^u = a_1^u + B(a_1,u)^{-1}\, a_{1/2}^{(u^{a_1})} = a_1^u + B(a_1,u)^{-1}\, a_{1/2},
\]
since $u^{a_1} \in Z_1$ and hence $a_{1/2}^{(u^{a_1})} = a_{1/2}$ by Lemma 5.2. It follows that
\[
g_{-u}(a) = -u + B(u,u)^{1/2}\, a^u = -u + B(u,u)^{1/2}\, a_1^u + B(u,u)^{1/2}\, B(a_1,u)^{-1}\, a_{1/2}.
\]
Since $a \in D$, we have $g_{-u}(a) \in D$. Therefore the Peirce $\tfrac12$-component $B(u,u)^{1/2}\, B(a_1,u)^{-1}\, a_{1/2} \in D_{1/2}$. □

Let $P_1 : Z \to Z_1$ denote the Peirce 1-projection.

Lemma 5.4. For $u \in Z_1$ and $v \in Z_{1/2}$, we have $P_1\, B(v,u) = P_1$.

Proof. Using Lemma 5.1 and $B(v,u) = B(u,v)^*$ we write
\[
P_1\, B(v,u) = \begin{pmatrix} I_1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} I_1 & 0 \\ -2\, v \,\square\, u^* & I_{1/2} \end{pmatrix} = \begin{pmatrix} I_1 & 0 \\ 0 & 0 \end{pmatrix} = P_1.
\]
Here $I_\nu$ is the identity map on $Z_\nu$. □

Lemma 5.5. Let $\mathbf{s} = (s, \dots, s)$ and $w = w_1 + w_{1/2} \in D$ with $w_\nu \in Z_\nu$. Then
\[
K^{\mathbf{s}}_e(B(w,w)\, e) = \frac{d_{\mathbf{s}}}{(d/r)_{\mathbf{s}}}\, h(w,w)^s\, h(w_1,w_1)^s. \tag{5.11}
\]

Proof. Let $N$ be the Jordan algebra determinant of $Z_1$, normalized by $N(e) = 1$. Then
\[
K^{\mathbf{s}}_e(z) = K^{\mathbf{s}}(e,e)\, N(P_1 z)^s = \frac{d_{\mathbf{s}}}{(d/r)_{\mathbf{s}}}\, N(P_1 z)^s.
\]
Writing $w = u + B(u,u)^{1/2} v$ with $u \in D_1$ and $v \in D_{1/2}$, Proposition 5.1 and Lemma 5.4 imply
\[
\begin{aligned}
P_1\, B(w,w)\, e &= P_1\, B(u,u)^{1/2}\, B(v,u)\, B(v,v)\, B(u,v)\, B(u,u)^{1/2}\, e \\
&= P_1\, B(u,u)^{1/2}\, P_1\, B(v,u)\, B(v,v)\, B(u,v)\, P_1\, B(u,u)^{1/2}\, e \\
&= P_1\, B(u,u)^{1/2}\, P_1\, B(v,v)\, P_1\, B(u,u)^{1/2}\, e \\
&= B(u,u)^{1/2}\, B(v,v)\, B(u,u)^{1/2}\, e.
\end{aligned}
\]
The invertible transformations $P_1\, B(u,u)^{1/2}\, P_1$ and $P_1\, B(v,v)\, P_1$ on $Z_1$ belong to the "structure group" $K_1^{\mathbb{C}}$ of $Z_1$, and $N$ has the semi-invariance property
\[
N(\gamma z) = N(\gamma e)\, N(z) = (\operatorname{Det} \gamma)^{r/d_1}\, N(z)
\]
for all $\gamma \in K_1^{\mathbb{C}}$ and $z \in Z_1$. It follows that
\[
\begin{aligned}
N(P_1\, B(w,w)\, e) &= N\big((P_1\, B(u,u)^{1/2}\, P_1)\, (P_1\, B(v,v)\, P_1)\, (P_1\, B(u,u)^{1/2})\, e\big) \\
&= N(B(u,u)^{1/2} e)^2\, N(B(v,v)\, e) = h(u,u)^2\, h(v,v).
\end{aligned}
\]
Since $h(w,w) = h(u,u)\, h(v,v)$ by (5.5), the assertion follows. □

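Let $d_1, r_1, a_1, p_1$ and $d_{1/2}, r_{1/2}, a_{1/2}, p_{1/2}$ denote the respective invariants for the (irreducible) Jordan triples $Z_1$ and $Z_{1/2}$.

Theorem 5.1. The integral defining $c_{\nu,\mathbf{s}}^{-1}$ is finite (i.e., $c_{\nu,\mathbf{s}} > 0$) if and only if $s + \nu > p - 1$. In this case we have
\[
c_{\nu,\mathbf{s}} = \frac{\Gamma_\Omega(2s + \nu)\ \Gamma_{\Omega_{1/2}}(s + \nu - p + p_{1/2})}{\pi^d\ \Gamma_\Omega\big(2s + \nu - \tfrac{d_1}{r}\big)\ \Gamma_{\Omega_{1/2}}\big(s + \nu - p + p_{1/2} - \tfrac{d_{1/2}}{r_{1/2}}\big)}.
\]

Proof. Combining (5.11), (5.5) and (5.7) we see that
\[
c_{\nu,\mathbf{s}}^{-1} = \int_D dw\, h(w,w)^{s+\nu-p}\, h(w_1,w_1)^{s} = \int_{D_1} du\, h(u,u)^{2s+\nu+b-p} \int_{D_{1/2}} dv\, h(v,v)^{s+\nu-p}.
\]
Since $p - b = p_1$ (the genus of $Z_1$), we have
\[
\int_{D_1} du\, h(u,u)^{2s+\nu+b-p} = \pi^{d_1}\, \Gamma_\Omega\big(2s + \nu - \tfrac{d_1}{r}\big) \big/ \Gamma_\Omega(2s + \nu),
\]
which is finite if and only if $2s + \nu > (r-1)\,a + 1 = p_1 - 1$. Also,
\[
\int_{D_{1/2}} dv\, h(v,v)^{s+\nu-p} = \pi^{d_{1/2}}\, \Gamma_{\Omega_{1/2}}\big(s + \nu - p + p_{1/2} - \tfrac{d_{1/2}}{r_{1/2}}\big) \big/ \Gamma_{\Omega_{1/2}}(s + \nu - p + p_{1/2}),
\]
which is finite if and only if
\[
s + \nu - p + p_{1/2} - \frac{d_{1/2}}{r_{1/2}} > (r_{1/2} - 1)\, \frac{a_{1/2}}{2}.
\]
Since $p_{1/2} - \frac{d_{1/2}}{r_{1/2}} = (r_{1/2} - 1)\, \frac{a_{1/2}}{2} + 1$, this is equivalent to $s + \nu > p - 1$. □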

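Proposition 5.4. Let $a \in D$ and $\zeta \in Z$. Then
\[
\begin{aligned}
\frac{1}{c_{\beta+\nu/2}}\, \big\|S_\nu^{\nu+\beta} (K^{\mathbf{m}}_\zeta)^a\big\|_{\mathcal{L}^1_{\beta+\nu/2}}
&= \int_D dz\, h(z,z)^{\beta+\nu/2-p}\, \big|h(z,a)^{-\beta}\big| \cdot K^{\mathbf{m}}_\zeta(B(z,z)\,\zeta)^{1/2} \\
&= \int_D dz\, h(z,z)^{\beta+\nu/2-p} \cdot \big|h(z,a)^{-\beta}\big| \cdot \big\|K^{\mathbf{m}}_\zeta \circ B(z,z)^{1/2}\big\|_{\mathcal{F}}.
\end{aligned} \tag{5.12}
\]

Proof. Proposition 4.3 and (4.8) imply
\[
\begin{aligned}
(S_\nu^{\nu+\beta} (K^{\mathbf{m}}_\zeta)^a)_z \circ B(z,z)^{1/2}
&= h(z,a)^{-\beta}\, (K^{\mathbf{m}}_\zeta)^a_z \circ B(z,z)^{1/2} \\
&= h(a,a)^{\nu/2}\, h(z,a)^{-(\beta+\nu)}\, K^{\mathbf{m}}_\zeta \circ B(a,a)^{1/2}\, B(z,a)^{-1}\, B(z,z)^{1/2}.
\end{aligned}
\]
Since
\[
\big\|K^{\mathbf{m}}_\zeta \circ B(a,a)^{1/2}\, B(z,a)^{-1}\, B(z,z)^{1/2}\big\|^2_{\mathcal{F}}
= K^{\mathbf{m}}_\zeta\big(g_{-a}'(z)\, B(z,z)\, g_{-a}'(z)^*\, \zeta\big) = K^{\mathbf{m}}_\zeta\big(B(g_a^{-1}(z), g_a^{-1}(z))\,\zeta\big),
\]
it follows that
\[
\big\|(S_\nu^{\nu+\beta} (K^{\mathbf{m}}_\zeta)^a)_z \circ B(z,z)^{1/2}\big\|_{\mathcal{F}}
= h(a,a)^{\nu/2}\, \big|h(z,a)^{-(\beta+\nu)}\big|\, K^{\mathbf{m}}_\zeta\big(B(g_a^{-1}(z), g_a^{-1}(z))\,\zeta\big)^{1/2}.
\]
Applying Lemma 4.1 to $x = y = 0$ yields
\[
\begin{aligned}
\frac{1}{c_{\beta+\nu/2}}\, \big\|S_\nu^{\nu+\beta} (K^{\mathbf{m}}_\zeta)^a\big\|_{\mathcal{L}^1_{\beta+\nu/2}}
&= \int_D dz\, h(z,z)^{\beta+\nu/2-p}\, \big\|(S_\nu^{\nu+\beta} (K^{\mathbf{m}}_\zeta)^a)_z \circ B(z,z)^{1/2}\big\|_{\mathcal{F}} \\
&= h(a,a)^{\nu/2} \int_D dz\, h(z,z)^{\beta+\nu/2-p} \cdot \big|h(z,a)^{-(\beta+\nu)}\big| \cdot K^{\mathbf{m}}_\zeta\big(B(g_a^{-1}(z), g_a^{-1}(z))\,\zeta\big)^{1/2} \\
&= h(a,a)^{\nu/2}\, h(a,a)^{-\nu/2} \int_D dz\, h(z,z)^{\beta+\nu/2-p}\, h(z,a)^{-\beta/2}\, h(a,z)^{-\beta/2}\, K^{\mathbf{m}}_\zeta(B(z,z)\,\zeta)^{1/2} \\
&= \int_D dz\, h(z,z)^{\beta+\nu/2-p}\, \big|h(z,a)^{-\beta}\big|\, K^{\mathbf{m}}_\zeta(B(z,z)\,\zeta)^{1/2} \\
&= \int_D dz\, h(z,z)^{\beta+\nu/2-p}\, \big|h(z,a)^{-\beta}\big| \cdot \big\|K^{\mathbf{m}}_\zeta \circ B(z,z)^{1/2}\big\|_{\mathcal{F}}.
\end{aligned}
\]
□

Our main result in this section is

Theorem 5.2. Let $s \in \mathbb{N}$ and $\nu$ satisfy
\[
\frac{\nu}{2} + s > (r - 1)\, \frac{a}{2}
\quad\text{and}\quad
\frac{\nu + s}{2} > (r_{1/2} - 1)\, \frac{a_{1/2}}{2} + p - p_{1/2}.
\]
Let $\beta \in \mathbb{R}$ satisfy $\beta + \frac{\nu+s}{2} > p - 1$. Then we have for $\Phi \in \mathcal{O}(D, \mathcal{P}_{\mathbf{s}})$
\[
\Phi \in \mathfrak{M}_{\nu,\mathbf{s}} \iff S_\nu^{\nu+\beta}\Phi \in \mathcal{L}^1_{\beta+\nu/2}.
\]

Proof. Let $p_{1/2}$ be the genus of $Z_{1/2}$, and put
\[
\alpha = \beta + \frac{\nu + s}{2} + p_{1/2} - p.
\]
Then
\[
\beta - \alpha = p - p_{1/2} - \frac{\nu + s}{2} < -(r_{1/2} - 1)\, \frac{a_{1/2}}{2}
\]
by assumption. This implies
\[
C^{(1/2)} := \sup_{y \in D_{1/2}}\ {}_2F_1^{(1/2)}\!\begin{pmatrix} \beta/2 & \beta/2 \\ & \alpha \end{pmatrix}\!(y,y) < +\infty.
\]
By Lemma 5.3, $B(u,u)^{1/2}\, B(a_1,u)^{-1}\, a_{1/2} \in D_{1/2}$ and hence
\[
{}_2F_1^{(1/2)}\!\begin{pmatrix} \beta/2 & \beta/2 \\ & \alpha \end{pmatrix}\!\big(B(u,u)^{1/2} B(a_1,u)^{-1} a_{1/2},\ B(u,u)^{1/2} B(a_1,u)^{-1} a_{1/2}\big) \le C^{(1/2)}
\]
for all $u \in D_1$ and $a = a_1 + a_{1/2} \in D$. Now consider $\Phi = A^{g_a} = (K^{\mathbf{s}}_e)^a$. Specializing Proposition 5.4 to the constant partition $\mathbf{s} = (s, \dots, s)$ and making the change of variables $w = u + B(u,u)^{1/2} v$ as in (5.7), we obtain with Proposition 5.3 and Lemma 5.5
\[
\begin{aligned}
&\frac{(d/r)_{\mathbf{s}}^{1/2}}{d_{\mathbf{s}}^{1/2}}\, \frac{1}{c_{\beta+\nu/2}}\, \big\|S_\nu^{\nu+\beta}\Phi\big\|_{\mathcal{L}^1_{\beta+\nu/2}} \\
&\quad = \frac{(d/r)_{\mathbf{s}}^{1/2}}{d_{\mathbf{s}}^{1/2}} \int_D dw\, h(w,w)^{\beta+\nu/2-p}\, \big|h(w,a)^{-\beta}\big|\, K^{\mathbf{s}}_e(B(w,w)\, e)^{1/2} \\
&\quad = \int_D dw\, h(w,w)^{\beta+\nu/2-p}\, \big|h(w,a)^{-\beta}\big|\, h(w,w)^{s/2}\, h(w_1,w_1)^{s/2} \\
&\quad = \int_D dw\, h(w,w)^{\beta+\frac{\nu+s}{2}-p}\, h(w_1,w_1)^{s/2}\, \big|h(w,a)^{-\beta}\big| \\
&\quad = \int_{D_1} du\, h(u,u)^{\beta+\frac{\nu}{2}+s+b-p}\, \big|h(u,a_1)^{-\beta}\big| \cdot \int_{D_{1/2}} dv\, h(v,v)^{\beta+\frac{\nu+s}{2}-p}\, \big|h\big(v,\ B(u,u)^{1/2} B(a_1,u)^{-1} a_{1/2}\big)^{-\beta}\big| \\
&\quad = \frac{1}{c^{(1/2)}_{\alpha}} \int_{D_1} du\, h(u,u)^{\beta+s+\nu/2-p_1}\, \big|h(u,a_1)^{-\beta}\big|\ {}_2F_1^{(1/2)}\!\begin{pmatrix} \beta/2 & \beta/2 \\ & \alpha \end{pmatrix}\!\big(B(u,u)^{1/2} B(a_1,u)^{-1} a_{1/2},\ B(u,u)^{1/2} B(a_1,u)^{-1} a_{1/2}\big) \\
&\quad \le \frac{C^{(1/2)}}{c^{(1/2)}_{\alpha}} \int_{D_1} du\, h(u,u)^{\beta+s+\nu/2-p_1} \cdot \big|h(u,a_1)^{-\beta}\big| \\
&\quad = \frac{C^{(1/2)}}{c^{(1)}_{\beta+s+\nu/2}\, c^{(1/2)}_{\alpha}}\ {}_2F_1^{(1)}\!\begin{pmatrix} \beta/2 & \beta/2 \\ & \beta+s+\nu/2 \end{pmatrix}\!(a_1,a_1)
\le \frac{C^{(1/2)} \cdot C^{(1)}}{c^{(1)}_{\beta+s+\nu/2}\, c^{(1/2)}_{\alpha}},
\end{aligned}
\]
where
\[
C^{(1)} := \sup_{a_1 \in D_1}\ {}_2F_1^{(1)}\!\begin{pmatrix} \beta/2 & \beta/2 \\ & \beta+s+\nu/2 \end{pmatrix}\!(a_1,a_1) < +\infty,
\]
since our assumption on the parameters implies
\[
\frac{\beta}{2} + \frac{\beta}{2} - \Big(\beta + \frac{\nu}{2} + s\Big) = -s - \frac{\nu}{2} < -(r-1)\, \frac{a}{2}.
\]
Every $\Phi \in \mathfrak{M}_{\nu,\mathbf{s}}$ has a representation
\[
\Phi = \int_G d\mu(g)\, A^g
\]
for a finite complex measure $\mu$ on $G$. Then
\[
S_\nu^{\nu+\beta}\Phi = \int_G d\mu(g)\, S_\nu^{\nu+\beta} A^g
\]
and the previous calculation shows
\[
\big\|S_\nu^{\nu+\beta}\Phi\big\|_{\mathcal{L}^1_{\beta+\nu/2}} \le \|\mu\| \cdot \sup_{g \in G}\, \big\|S_\nu^{\nu+\beta} A^g\big\|_{\mathcal{L}^1_{\beta+\nu/2}}
\le \frac{d_{\mathbf{s}}^{1/2}}{(d/r)_{\mathbf{s}}^{1/2}}\, \frac{c_{\beta+\nu/2}}{c^{(1)}_{\beta+s+\nu/2}\, c^{(1/2)}_{\alpha}}\, C^{(1/2)}\, C^{(1)} \cdot \|\mu\|.
\]
It follows that $S_\nu^{\nu+\beta}\Phi \in \mathcal{L}^1_{\beta+\nu/2}$, as asserted. Thus we obtain the implication
\[
\Phi \in \mathfrak{M}_{\nu,\mathbf{s}} \implies S_\nu^{\nu+\beta}\Phi \in \mathcal{L}^1_{\beta+\nu/2}.
\]
The converse implication follows from Theorem 4.2, applied to the partition $\mathbf{s} = (s, \dots, s)$. □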

References

[A1] J. Arazy, A survey of invariant Hilbert spaces of analytic functions on bounded symmetric domains, Contemp. Math. 185 (1995), 7–65.
[A2] J. Arazy, Boundedness and compactness of generalized Hankel operators on bounded symmetric domains, J. Funct. Anal. 137 (1996), 97–151.
[AF] J. Arazy, S. Fisher, Some aspects of the minimal Möbius invariant space of analytic functions on the unit disk, Springer Lect. Notes in Math. 1070 (1984), 24–44.
[AFP] J. Arazy, S. Fisher and J. Peetre, Möbius invariant function spaces, J. reine angew. Math. 363 (1985), 110–145.
[AU1] J. Arazy and H. Upmeier, Invariant inner products in spaces of holomorphic functions on bounded symmetric domains, Documenta Math. 2 (1997), 213–261.
[AU2] J. Arazy and H. Upmeier, Boundary measures for symmetric domains and integral formulas for the discrete Wallach points, Int. Equ. Op. Th. 47 (2003), 375–434.
[AU3] J. Arazy and H. Upmeier, Jordan Grassmann manifolds and intertwining operators for weighted Bergman spaces, Proceedings Cluj-Napoca (2007), 25–53, Cluj University Press 2008.
[AU4] J. Arazy and H. Upmeier, Intertwining operators and invariant function spaces for pole set parameters (in preparation).
[EZ] M. Englis and G. Zhang, On the Faraut-Korányi hypergeometric function in rank two, Ann. Inst. Fourier 54 (2004), 1855–1875.
[FK1] J. Faraut and A. Koranyi, Function spaces and reproducing kernels on bounded symmetric domains, J. Funct. Anal. 88 (1990), 64–89.
[FK2] J. Faraut and A. Koranyi, Analysis on Symmetric Cones, Clarendon Press, Oxford (1994).
[LA] M. Lassalle, Algèbres de Jordan et ensemble de Wallach, Invent. Math. 89 (1987), 375–393.
[L1] O. Loos, Jordan Pairs, Springer Lect. Notes in Math. 460 (1975).
[L2] O. Loos, Bounded Symmetric Domains and Jordan Pairs, Univ. of California, Irvine (1977).
[RT] L. Rubel and R. Timoney, An extremal property of the Bloch space, Proc. Amer. Math. Soc. 75 (1979), 45–49.
[RV] H. Rossi, M. Vergne, Analytic continuation of holomorphic discrete series of a semi-simple Lie group, Acta Math. 136 (1975), 1–59.
[S] W. Schmid, Die Randwerte holomorpher Funktionen auf hermitesch symmetrischen Räumen, Invent. Math. 9 (1969), 61–80.
[U1] H. Upmeier, Jordan algebras and harmonic analysis on symmetric spaces, Amer. J. Math. 108 (1986), 1–25.
[U2] H. Upmeier, Jordan Algebras in Analysis, Operator Theory and Quantum Mechanics, CBMS Series in Math. 67, Amer. Math. Soc. (1987).
[W] N. Wallach, The analytic continuation of the discrete series I, II, Trans. Amer. Math. Soc. 251 (1979), 1–17 and 19–37.
[Y] Z. Yan, A class of generalized hypergeometric functions in several variables, Can. J. Math. 44 (1992), 1317–1338.

Jonathan Arazy
Department of Mathematics
University of Haifa
Haifa 31905, Israel

Harald Upmeier
Fachbereich Mathematik
Universität Marburg
D-35032 Marburg, Germany

Operator Theory: Advances and Applications, Vol. 218, 51–73
© 2012 Springer Basel AG

B-regular $J$-inner Matrix-valued Functions

Damir Z. Arov and Harry Dym

Dedicated to the memory of our valued teacher, colleague and friend, Israel Gohberg Z"L.

Abstract. In the study of the class $\mathcal{U}(J)$ of mvf's (matrix-valued functions) that are $J$-inner with respect to the open upper half-plane $\mathbb{C}_+$ and a given signature matrix $J$, special roles are played by the classes $\mathcal{U}_{\ell R}(J)$, $\mathcal{U}_{rR}(J)$, $\mathcal{U}_{\ell sR}(J)$ and $\mathcal{U}_{rsR}(J)$ of left regular, right regular, left strongly regular and right strongly regular $J$-inner mvf's. These are discussed at length in [ArD08] and the references cited therein. Shorter introductions may be found in the survey articles [ArD05] and [ArD07]. In particular, these classes are characterized in terms of the RKHS's (reproducing kernel Hilbert spaces) $\mathcal{H}(U)$ that are associated with each mvf $U \in \mathcal{U}(J)$. If $U = U_1 U_2$ with $U_1, U_2 \in \mathcal{U}(J)$, then, by a theorem of L. de Branges, the RKHS $\mathcal{H}(U_1)$ is contractively included in $\mathcal{H}(U)$; necessary and sufficient conditions for isometric inclusion are also given. In this paper we introduce the class $\mathcal{U}_{BR}(J)$ of $B$-regular $J$-inner mvf's. It is characterized by the fact that if $U = U_1 U_2$ with $U_1, U_2 \in \mathcal{U}(J)$, then $\mathcal{H}(U_1)$ is isometrically included in $\mathcal{H}(U)$. If $U \in \mathcal{U}(J)$ is the characteristic mvf of a Livsic-Brodskii operator node, i.e., if $U$ is holomorphic at the point $\lambda = 0$ and normalized by $U(0) = I_m$, then, thanks to another theorem of L. de Branges, $U \in \mathcal{U}_{BR}(J)$ if and only if every normalized left divisor $U_1$ of $U \in \mathcal{U}(J)$ is left regular in the Brodskii sense. We shall show that $\mathcal{U}_{\ell sR}(J) \cup \mathcal{U}_{rsR}(J) \subseteq \mathcal{U}_{BR}(J)$. We shall also discuss the inverse monodromy problem for canonical differential systems for monodromy matrices $U \in \mathcal{U}_{BR}(J)$ and shall present an example of a $2 \times 2$ canonical differential system for which the matrizant (fundamental solution) $U_x(\lambda)$ belongs to the class $\mathcal{U}_{\ell sR}(J)$ for every $x > 0$, but does not belong to the class $\mathcal{U}_{rsR}(J)$.

Mathematics Subject Classification (2000). Primary 47B32, 46E22, 47A48; Secondary 93C15, 45xx.

Keywords. Canonical systems, de Branges spaces, $J$-inner matrix-valued functions, reproducing kernel Hilbert spaces, Livsic-Brodskii nodes.

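1. Introduction

The class $\mathcal{U}(J)$ of $J$-inner mvf's (matrix-valued functions) with respect to the open upper half-plane $\mathbb{C}_+ = \{\lambda \in \mathbb{C} : \Im\lambda > 0\}$ is the set of meromorphic $m \times m$ mvf's $U(\lambda)$ in $\mathbb{C}_+$ that are $J$-contractive on the set
\[
\mathfrak{h}_U^+ = \{\lambda \in \mathbb{C}_+ : U \text{ is holomorphic at } \lambda\}
\]
and have nontangential limits a.e. on the real axis $\mathbb{R}$ that are $J$-unitary, i.e.,
\[
U(\lambda)\, J\, U(\lambda)^* \le J \quad\text{for } \lambda \in \mathfrak{h}_U^+ \tag{1.1}
\]
and
\[
U(\lambda)\, J\, U(\lambda)^* = J \quad\text{a.e. on } \mathbb{R}, \tag{1.2}
\]
respectively. Here $J$ denotes an $m \times m$ signature matrix, i.e., $J = J^*$ and $J^* J = I_m$. The signature matrices $\pm I_m$, $\pm J_p$ and $\pm \mathcal{J}_p$, where
\[
J_p = \begin{bmatrix} 0 & -I_p \\ -I_p & 0 \end{bmatrix}
\quad\text{and}\quad
\mathcal{J}_p = \begin{bmatrix} 0 & -iI_p \\ iI_p & 0 \end{bmatrix},
\]
will be of particular interest. The existence of nontangential boundary values a.e. on $\mathbb{R}$ is a consequence of the fact that (1.1) guarantees that $U \in \mathcal{U}(J)$ belongs to the class $\mathcal{N}^{m \times m}$ of $m \times m$ mvf's that are meromorphic in $\mathbb{C}_+$ with bounded Nevanlinna characteristic there. Moreover, every such $U$ has a meromorphic pseudocontinuation with bounded Nevanlinna characteristic in the open lower half-plane $\mathbb{C}_-$ that may be defined on the set
\[
\Omega_U^- = \{\lambda \in \mathbb{C}_- : \overline{\lambda} \in \mathfrak{h}_U^+ \text{ and } \det U(\overline{\lambda}) \ne 0\}
\]
by the formula
\[
U_-(\lambda) = J\, (U^{\#}(\lambda))^{-1}\, J \quad\text{for } \lambda \in \Omega_U^-, \tag{1.3}
\]
where
\[
f^{\#}(\lambda) = f(\overline{\lambda})^* \quad\text{and, as will be needed later,}\quad f^{\sim}(\lambda) = f(-\overline{\lambda})^*. \tag{1.4}
\]
Formulas (1.2) and (1.3) serve to guarantee that the nontangential boundary values of $U$ and $U_-$ coincide a.e. on $\mathbb{R}$, i.e.,
\[
U(\mu) = \lim_{\nu \downarrow 0} U(\mu + i\nu) = \lim_{\nu \downarrow 0} U_-(\mu - i\nu), \tag{1.5}
\]
and hence that $U_-$ is a pseudocontinuation of $U$. From now on mvf's $U \in \mathcal{U}(J)$ will be considered in the set
\[
\mathfrak{h}_U = \mathfrak{h}_U^+ \cup \mathfrak{h}_U^- \cup \mathfrak{h}_U^0,
\]
where
\[
\mathfrak{h}_U^- = \{\lambda \in \mathbb{C}_- : U \text{ is holomorphic at } \lambda\}
\quad\text{and}\quad
\mathfrak{h}_U^0 = \{\lambda \in \mathbb{R} : U \text{ is holomorphic at } \lambda\}.
\]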

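If $U \in \mathcal{U}(J)$ then the formula
\[
K_\omega^U(\lambda) =
\begin{cases}
\dfrac{J - U(\lambda)\, J\, U(\omega)^*}{-2\pi i (\lambda - \overline{\omega})} & \text{if } \lambda \ne \overline{\omega}, \\[2ex]
\dfrac{U'(\overline{\omega})\, J\, U(\omega)^*}{2\pi i} & \text{if } \lambda = \overline{\omega}
\end{cases} \tag{1.6}
\]
defines a positive kernel on $\mathfrak{h}_U \times \mathfrak{h}_U$. Therefore, by the matrix version of a theorem of Aronszajn, there is an RKHS (reproducing kernel Hilbert space) $\mathcal{H}(U)$ with $K_\omega^U(\lambda)$ as its RK (reproducing kernel). This means that the following two conditions are met:

(1) $K_\omega^U v \in \mathcal{H}(U)$ for every choice of $\omega \in \mathfrak{h}_U$ and $v \in \mathbb{C}^m$.

(2) If $f \in \mathcal{H}(U)$, then
\[
v^* f(\omega) = \langle f, K_\omega^U v \rangle_{\mathcal{H}(U)} \tag{1.7}
\]
for every $f \in \mathcal{H}(U)$, $\omega \in \mathfrak{h}_U$ and $v \in \mathbb{C}^m$.

The restrictions $f_+$ and $f_-$ of $f \in \mathcal{H}(U)$ to $\mathfrak{h}_U^+$ and $\mathfrak{h}_U^-$ are holomorphic with bounded Nevanlinna characteristic in $\mathbb{C}_+$ and $\mathbb{C}_-$, respectively. Moreover (as shown in Theorem 5.49 of [ArD08]), $f_-$ is the pseudocontinuation of $f_+$. Thus, if $f \in \mathcal{H}(U)$, then
\[
f(\mu) = f_+(\mu) = \lim_{\nu \downarrow 0} f(\mu + i\nu) = \lim_{\nu \downarrow 0} f_-(\mu - i\nu) = f_-(\mu) \quad\text{a.e. on } \mathbb{R}.
\]

A mvf $U \in \mathcal{U}(J)$ belongs to the class $\mathcal{U}_{rsR}(J)$ of right strongly regular $J$-inner mvf's if the nontangential boundary value $f(\mu)$ belongs to $L_2^m(\mathbb{R})$ for every $f \in \mathcal{H}(U)$. Thus, upon identifying $f \in \mathcal{H}(U)$ with its boundary values, this can be reexpressed as
\[
\mathcal{U}_{rsR}(J) = \{U \in \mathcal{U}(J) : \mathcal{H}(U) \subseteq L_2^m(\mathbb{R})\}. \tag{1.8}
\]
The class $\mathcal{U}_{\ell sR}(J)$ of left strongly regular $J$-inner mvf's may be defined as
\[
\mathcal{U}_{\ell sR}(J) = \{U \in \mathcal{U}(J) : U^{\sim} \in \mathcal{U}_{rsR}(J)\}. \tag{1.9}
\]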

A mvf 𝑈 ∈ 𝒰(𝐽) belongs to the class 𝒰𝐵𝑅 (𝐽) of B-regular 𝐽-inner mvf’s if for every factorization 𝑈 = 𝑈1 𝑈2 with 𝑈1 , 𝑈2 ∈ 𝒰(𝐽) (1.10) the equality ℋ(𝑈1 ) ∩ 𝑈1 ℋ(𝑈2 ) = {0}

(1.11)

is in force. The importance of the class $\mathcal{U}_{BR}(J)$ of B-regular $J$-inner mvf's is exhibited by the following two theorems of L. de Branges that correspond to Theorems 5.52 and 5.50 in [ArD08]. The formulations there were influenced by the discussion in Section 5 of [AlD84], which in turn is based on [Br63] and [Br65].

Theorem 1.1. (L. de Branges) Let $U = U_1 U_2$, where $U_1, U_2 \in \mathcal{U}(J)$. Then $\mathcal{H}(U_1)$ is contained contractively in $\mathcal{H}(U)$, i.e.,
\[
\mathcal{H}(U_1) \subseteq \mathcal{H}(U) \quad\text{(as vector spaces)}
\]


D.Z. Arov and H. Dym

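and
\[
\|f\|_{\mathcal{H}(U)} \le \|f\|_{\mathcal{H}(U_1)} \quad\text{for } f \in \mathcal{H}(U_1).
\]
Moreover, this inclusion is isometric if and only if (1.11) holds. Furthermore,
\[
\mathcal{H}(U_1) \cap U_1\mathcal{H}(U_2) = \{0\} \iff \mathcal{H}(U) = \mathcal{H}(U_1) \oplus U_1\mathcal{H}(U_2). \tag{1.12}
\]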

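Theorem 1.2. (L. de Branges) Let $U \in \mathcal{U}(J)$ and let $\mathcal{L}$ be a closed subspace of $\mathcal{H}(U)$ that is $R_\alpha$ invariant for every point $\alpha \in \mathfrak{h}_U$. Then there exists a mvf $U_1 \in \mathcal{U}(J)$ such that $\mathfrak{h}_{U_1} \supseteq \mathfrak{h}_U$, $\mathcal{L} = \mathcal{H}(U_1)$ and $U_1^{-1}U \in \mathcal{U}(J)$. Moreover, the space $\mathcal{H}(U_1)$ is isometrically included in $\mathcal{H}(U)$, and
\[
\mathcal{H}(U) = \mathcal{H}(U_1) \oplus U_1\mathcal{H}(U_2), \quad\text{where } U_2 = U_1^{-1}U. \tag{1.13}
\]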

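Remark 1.3. If $U \in \mathcal{E} \cap \mathcal{U}^\circ(J)$, then $\|R_0^n\|^{1/n}$ tends to zero as $n \uparrow \infty$ since $R_0$ is a Volterra operator, and therefore the identity $R_\alpha - R_0 = \alpha R_0 R_\alpha$ implies that
\[
R_\alpha = \sum_{n=1}^{\infty} \alpha^{n-1} R_0^n .
\]
Thus, for such mvf's $U$ a closed subspace $\mathcal{L}$ of $\mathcal{H}(U)$ is invariant under $R_\alpha$ for every point $\alpha \in \mathbb{C}$ if and only if it is invariant under $R_0$.

A simple example of a continuous family of mvf's $U_s \notin \mathcal{U}_{BR}(J)$ may be constructed by fixing a matrix $V \in \mathbb{C}^{m\times k}$ such that $V^*V = I_k$ and $V^*JV = 0$ and setting
\[
U_s(\lambda) = \exp\{i\lambda s VV^*J\} = I_m + i\lambda s VV^*J \quad\text{for } s \ge 0.
\]
(The exponential series terminates because $(VV^*J)^2 = V(V^*JV)V^*J = 0$.) Then $U_s \in \mathcal{E} \cap \mathcal{U}(J)$ for every $s \ge 0$, $U_sU_t = U_{s+t}$ and, as follows readily from formula (1.6), the RK of the RKHS $\mathcal{H}(U_s)$ is given by the formula
\[
K_\omega^{U_s}(\lambda) = \frac{s}{2\pi} VV^*,
\]
which serves to exhibit $\mathcal{H}(U_s)$ as the $k$-dimensional subspace of $\mathbb{C}^m$ spanned by the columns of $V$ for every $s > 0$. Thus, the spaces $\mathcal{H}(U_s)$ are all the same as vector spaces for $s > 0$. However, the norms depend upon $s$ and
\[
\begin{aligned}
\|Vx\|^2_{\mathcal{H}(U_t)}
&= \Bigl(\frac{2\pi}{t}\Bigr)^2 \Bigl\|\frac{t}{2\pi} VV^*Vx\Bigr\|^2_{\mathcal{H}(U_t)}
 = \Bigl(\frac{2\pi}{t}\Bigr)^2 \|K_\omega^{U_t}(Vx)\|^2_{\mathcal{H}(U_t)} \\
&= \Bigl(\frac{2\pi}{t}\Bigr)^2 x^*V^*K_\omega^{U_t}(\omega)Vx
 = \frac{2\pi}{t}\, x^*x
 < \frac{2\pi}{s}\, x^*x = \|Vx\|^2_{\mathcal{H}(U_s)} \quad\text{for } 0 < s < t.
\end{aligned}
\]
An example of a canonical differential system with matrizant $U_x(\lambda)$ that belongs to $\mathcal{U}_{\ell sR}(J)$ but does not belong to $\mathcal{U}_{rsR}(J)$ will be furnished in Section 8.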
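The kernel formula for the family $U_s$ can be checked numerically. The following Python sketch is only an illustration under assumed data that do not come from the text: $m = 2$, $k = 1$, $J = J_1$ and $V$ equal to the first unit vector (so that $V^*V = 1$ and $V^*JV = 0$). It evaluates formula (1.6) at one pair of points and compares it with $(s/2\pi)VV^*$; it also checks the semigroup property $U_sU_t = U_{s+t}$.

```python
import cmath

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def lincomb(A, c, B):
    """Entrywise A + c*B."""
    return [[A[i][j] + c * B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j])
               for i in range(len(A)) for j in range(len(A[0])))

# Assumed concrete data (hypothetical choice): J = J_1, V = e_1.
I2 = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
J = [[0, -1], [-1, 0]]
V = [[1], [0]]
P = matmul(V, adjoint(V))      # V V^*
PJ = matmul(P, J)              # V V^* J, which squares to zero

def U(s, lam):
    """U_s(lambda) = I_m + i*lambda*s*V V^* J."""
    return lincomb(I2, 1j * lam * s, PJ)

def kernel(s, lam, om):
    """Kernel (1.6) for U_s, valid when lam != conj(om)."""
    UJUstar = matmul(matmul(U(s, lam), J), adjoint(U(s, om)))
    numerator = lincomb(J, -1, UJUstar)
    rho = -2j * cmath.pi * (lam - om.conjugate())
    return lincomb(ZERO, 1 / rho, numerator)

s, t = 0.5, 1.25
lam, om = 0.4 + 0.2j, -0.3 + 0.7j
constraint = matmul(matmul(adjoint(V), J), V)[0][0]          # V^* J V
kernel_error = max_abs_diff(kernel(t, lam, om),
                            lincomb(ZERO, t / (2 * cmath.pi), P))
semigroup_error = max_abs_diff(matmul(U(s, lam), U(t, lam)), U(s + t, lam))
print(abs(constraint), kernel_error, semigroup_error)
```

All three printed numbers should be at the level of rounding error, which reflects the fact that the kernel does not depend on $\lambda$ or $\omega$.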


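2. A unitary operator from $\mathcal{H}(U)$ onto $\mathcal{H}(U^\sim)$

It is important to keep in mind that $U \in \mathcal{U}(J) \iff U^\sim \in \mathcal{U}(J)$.

Lemma 2.1. Let $U \in \mathcal{U}(J)$ and let $T$ be the operator defined on $\mathcal{H}(U)$ by the formula
\[
(Tf)(\lambda) = U^\sim(\lambda) J f(-\lambda) \quad\text{for } \lambda \in \mathfrak{h}_U \cap \mathfrak{h}_{U^\sim}. \tag{2.1}
\]
Then $T$ is a unitary operator from $\mathcal{H}(U)$ onto $\mathcal{H}(U^\sim)$.

Proof. Let $\lambda, \omega \in \mathfrak{h}_{U^\sim}$, $-\lambda, -\omega \in \mathfrak{h}_U$ and suppose that $\lambda \ne \bar\omega$, $\det U^\sim(\lambda) \ne 0$ and $\det U^\sim(\omega) \ne 0$. Then
\[
K_\omega^{U^\sim}(\lambda) = U^\sim(\lambda) J K_{-\omega}^{U}(-\lambda) J U^\sim(\omega)^*. \tag{2.2}
\]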

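Therefore, the operator $T$ maps the dense subspace $\mathcal{L}_1$ of vvf's $f \in \mathcal{H}(U)$ of the form
\[
f(\lambda) = \sum_{j=1}^{n} K_{-\omega_j}^{U}(\lambda) J U^\sim(\omega_j)^* \xi_j \quad\text{with } \xi_j \in \mathbb{C}^m \text{ and } n \ge 1 \tag{2.3}
\]
into the dense subspace $\mathcal{L}_2$ of vvf's
\[
g(\lambda) = (Tf)(\lambda) = U^\sim(\lambda) J \sum_{j=1}^{n} K_{-\omega_j}^{U}(-\lambda) J U^\sim(\omega_j)^* \xi_j
= \sum_{j=1}^{n} K_{\omega_j}^{U^\sim}(\lambda)\xi_j \quad\text{with } \xi_j \in \mathbb{C}^m \text{ and } n \ge 1. \tag{2.4}
\]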

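Moreover, if $f$ and $g$ are defined by the above formulas, then
\[
\begin{aligned}
\langle f, f\rangle_{\mathcal{H}(U)}
&= \sum_{j=1}^{n}\sum_{k=1}^{n} \xi_j^* U^\sim(\omega_j) J K_{-\omega_k}^{U}(-\omega_j) J U^\sim(\omega_k)^* \xi_k \\
&= \sum_{j=1}^{n}\sum_{k=1}^{n} \xi_j^* K_{\omega_k}^{U^\sim}(\omega_j)\xi_k
 = \langle g, g\rangle_{\mathcal{H}(U^\sim)}.
\end{aligned}
\tag{2.5}
\]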

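Thus, $T$ maps $\mathcal{L}_1$ isometrically onto $\mathcal{L}_2$. Moreover, if $f \in \mathcal{H}(U)$, then there exists a sequence of vvf's $f_k \in \mathcal{L}_1$ such that $\|f - f_k\|_{\mathcal{H}(U)} \to 0$ as $k \uparrow \infty$. But, as $\mathcal{H}(U)$ is a RKHS, this implies that $f_k(\lambda) \to f(\lambda)$ at each point $\lambda \in \mathfrak{h}_U$ as $k \uparrow \infty$. Thus, if $g_k = Tf_k$ for $k = 1, 2, \ldots$, then
\[
g_k(\lambda) = (Tf_k)(\lambda) = U^\sim(\lambda) J f_k(-\lambda) \to U^\sim(\lambda) J f(-\lambda)
\]
for each point $\lambda \in \mathfrak{h}_{U^\sim}$ such that $-\lambda \in \mathfrak{h}_U$. Since
\[
\|g_k\|_{\mathcal{H}(U^\sim)} = \|f_k\|_{\mathcal{H}(U)} \to \|f\|_{\mathcal{H}(U)} \quad\text{as } k \uparrow \infty
\]
and
\[
\|g_k - g_j\|_{\mathcal{H}(U^\sim)} = \|f_k - f_j\|_{\mathcal{H}(U)},
\]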


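there exists a $g \in \mathcal{H}(U^\sim)$ such that $\|g_k - g\|_{\mathcal{H}(U^\sim)} \to 0$ as $k \uparrow \infty$. Therefore, since $\mathcal{H}(U^\sim)$ is a RKHS and
\[
\mathfrak{h}_{U^\sim} = \bigcap_{g \in \mathcal{H}(U^\sim)} \mathfrak{h}_g ,
\]
$g_k(\lambda) \to g(\lambda)$ at each point $\lambda \in \mathfrak{h}_{U^\sim}$ as $k \uparrow \infty$. Consequently, $g(\lambda) = U^\sim(\lambda) J f(-\lambda) = (Tf)(\lambda)$ for $f \in \mathcal{H}(U)$, i.e., $T$ maps $\mathcal{H}(U)$ into $\mathcal{H}(U^\sim)$. Therefore, since $T$ is an isometry on the full space $\mathcal{H}(U)$ and $\mathcal{L}_2$ is dense in $\mathcal{H}(U^\sim)$, $T$ maps $\mathcal{H}(U)$ onto $\mathcal{H}(U^\sim)$. □

Theorem 2.2.
\[
\mathcal{U}_{\ell sR}(J) = \{U \in \mathcal{U}(J) : Tf \in L_2^m(\mathbb{R}) \ \text{for every } f \in \mathcal{H}(U)\}. \tag{2.6}
\]

Proof. This follows from Lemma 2.1 and formulas (1.8) and (1.9). □

Remark 2.3. Since $g(\mu)$ belongs to $L_2^m(\mathbb{R})$ if and only if $g(-\mu)$ belongs to $L_2^m(\mathbb{R})$, the equality (2.6) is equivalent to the following equality:
\[
\begin{aligned}
\mathcal{U}_{\ell sR}(J) &= \{U \in \mathcal{U}(J) : U^\# J f \in L_2^m(\mathbb{R}) \ \text{for every } f \in \mathcal{H}(U)\} \\
&= \{U \in \mathcal{U}(J) : U^{-1} f \in L_2^m(\mathbb{R}) \ \text{for every } f \in \mathcal{H}(U)\}.
\end{aligned}
\tag{2.7}
\]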
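The kernel identity (2.2), on which the proof of Lemma 2.1 rests, can be tested numerically. The sketch below is an illustration under assumptions that are not part of the text: $J = J_1$, and the test function $U$ is the product of an elementary factor $I + i\lambda s VV^*J_1$ (with the hypothetical choices $s = 0.7$, $V = e_1$) and the trigonometric matrix with entries $\cos\lambda$ and $-i\sin\lambda$. Both factors satisfy $U^\#JU = J$, which is all that the identity uses.

```python
import cmath

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def lincomb(A, c, B):
    """Entrywise A + c*B."""
    return [[A[i][j] + c * B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j])
               for i in range(len(A)) for j in range(len(A[0])))

ZERO = [[0, 0], [0, 0]]
J = [[0, -1], [-1, 0]]          # assumed signature matrix J_1

def U(lam):
    """Assumed test function with U^# J U = J (product of two such factors)."""
    c, s = cmath.cos(lam), cmath.sin(lam)
    elementary = [[1, -0.7j * lam], [0, 1]]
    trigonometric = [[c, -1j * s], [-1j * s, c]]
    return matmul(elementary, trigonometric)

def U_tilde(lam):
    """U^~(lambda) = U(-conj(lambda))^*, as in (1.4)."""
    return adjoint(U(-lam.conjugate()))

def kernel(W, lam, om):
    """K_omega^W(lambda) from (1.6), valid when lam != conj(om)."""
    numerator = lincomb(J, -1, matmul(matmul(W(lam), J), adjoint(W(om))))
    rho = -2j * cmath.pi * (lam - om.conjugate())
    return lincomb(ZERO, 1 / rho, numerator)

lam, om = 0.3 + 0.5j, -0.8 + 0.2j
lhs = kernel(U_tilde, lam, om)
rhs = matmul(matmul(matmul(matmul(U_tilde(lam), J), kernel(U, -lam, -om)), J),
             adjoint(U_tilde(om)))
identity_error = max_abs_diff(lhs, rhs)
print(identity_error)
```

The printed difference between the two sides of (2.2) should be at the level of rounding error.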

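3. Some properties of the class $\mathcal{U}_{BR}(J)$

Theorem 3.1. $U \in \mathcal{U}_{BR}(J) \iff U^\sim \in \mathcal{U}_{BR}(J)$.

Proof. If $U \in \mathcal{U}_{BR}(J)$ and $U = U_1U_2$ is a factorization of $U$ with factors $U_1, U_2 \in \mathcal{U}(J)$, then $U^\sim = U_2^\sim U_1^\sim$. Let $f \in \mathcal{H}(U_2^\sim) \cap U_2^\sim\mathcal{H}(U_1^\sim)$. Then, by Lemma 2.1,
\[
f(\lambda) = U_2^\sim(\lambda) J f_2(-\lambda) = U_2^\sim(\lambda) U_1^\sim(\lambda) J f_1(-\lambda),
\]
where $f_j \in \mathcal{H}(U_j)$ for $j = 1, 2$. Therefore, $J f_2(-\lambda) = U_1^\sim(\lambda) J f_1(-\lambda)$, i.e.,
\[
f_2(\lambda) = J U_1^\#(\lambda) J f_1(\lambda) = U_1(\lambda)^{-1} f_1(\lambda).
\]
Thus, $f_1 = U_1 f_2$, and hence
\[
f_1 \in \mathcal{H}(U_1) \cap U_1\mathcal{H}(U_2) = \{0\}.
\]
Consequently, $f = 0$, i.e.,
\[
\mathcal{H}(U_1) \cap U_1\mathcal{H}(U_2) = \{0\} \implies \mathcal{H}(U_2^\sim) \cap U_2^\sim\mathcal{H}(U_1^\sim) = \{0\}.
\]
The converse implication then follows from the fact that $(f^\sim)^\sim = f$. □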


Theorem 3.2. The following two inclusions are in force:
\[
\mathcal{U}_{rsR}(J) \subseteq \mathcal{U}_{BR}(J) \tag{3.1}
\]
and
\[
\mathcal{U}_{\ell sR}(J) \subseteq \mathcal{U}_{BR}(J). \tag{3.2}
\]

Proof. The inclusion (3.1) follows from Theorem 1.1 and Theorems 5.50 and 5.92 in [ArD08]. The inclusion (3.2) follows from the characterization (1.9), the inclusion (3.1) and Theorem 3.1. □

Theorem 3.3. If $U_1, \ldots, U_n \in \mathcal{U}(J)$ and $U = U_1 \cdots U_n$, then
\[
U \in \mathcal{U}_{BR}(J) \implies U_k \in \mathcal{U}_{BR}(J) \quad\text{for } k = 1, \ldots, n.
\]

Proof. It suffices to consider the case $n = 2$. Then if $U = U_1U_2$, $U_1 = U_aU_b$, with $U_a, U_b, U_2 \in \mathcal{U}(J)$ and $U \in \mathcal{U}_{BR}(J)$, the two factorizations $U = U_1U_2$ and $U = U_a(U_bU_2)$ imply that
\[
\|f\|_{\mathcal{H}(U_1)} = \|f\|_{\mathcal{H}(U)} \quad\text{for every } f \in \mathcal{H}(U_1)
\]
and
\[
\|f\|_{\mathcal{H}(U_a)} = \|f\|_{\mathcal{H}(U)} \quad\text{for every } f \in \mathcal{H}(U_a),
\]
respectively. Therefore,
\[
\|f\|_{\mathcal{H}(U_a)} = \|f\|_{\mathcal{H}(U_1)} \quad\text{for every } f \in \mathcal{H}(U_a),
\]
which proves that $U_1 \in \mathcal{U}_{BR}(J)$. The proof that $U_2 \in \mathcal{U}_{BR}(J)$ follows from the formula $U^\sim = U_2^\sim U_1^\sim$ and Theorem 3.1. □

4. Canonical systems with B-regular matrizants

Let
\[
\mathcal{U}^\circ(J) = \{U \in \mathcal{U}(J) : 0 \in \mathfrak{h}_U \ \text{and}\ U(0) = I_m\}
\]
and
\[
\mathcal{E} \cap \mathcal{U}(J) = \{U \in \mathcal{U}(J) : U \text{ is an entire mvf}\}.
\]
A family of $m \times m$ mvf's $U_x(\lambda)$, $0 \le x < \ell$, that is continuous with respect to $x$ on $[0, \ell)$ for each $\lambda \in \mathbb{C}$ and meets the conditions
\[
U_{x_1}^{-1} U_{x_2} \in \mathcal{E} \cap \mathcal{U}^\circ(J) \ \text{when } 0 \le x_1 \le x_2 < \ell \quad\text{and}\quad U_0(\lambda) \equiv I_m \tag{4.1}
\]
will be called a normalized monotonic continuous chain of entire $J$-inner mvf's. It is well known that if $M(x)$ is a continuous nondecreasing $m \times m$ mvf on $[0, \ell)$ with $M(0) = 0$, then the matrizant (fundamental solution) of the canonical integral system
\[
U_x(\lambda) = I_m + i\lambda \int_0^x U_s(\lambda)\, dM(s)\, J, \quad 0 \le x < \ell, \tag{4.2}
\]


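is a normalized monotonic continuous chain of entire $J$-inner mvf's. There is a converse statement in the class
\[
\mathcal{E} \cap \mathcal{U}_{BR}^\circ(J) = \mathcal{E} \cap \mathcal{U}^\circ(J) \cap \mathcal{U}_{BR}(J)
\]
that will be presented below in Theorem 4.1. If
\[
M(x) = \int_0^x H(s)\, ds \quad\text{for } x \in [0, \ell)
\]
and some $m \times m$ mvf $H$ that meets the conditions
\[
H \in L_{1,\mathrm{loc}}^{m\times m}([0, \ell)) \quad\text{and}\quad H(x) \ge 0 \ \text{a.e. on } [0, \ell), \tag{4.3}
\]
then the matrizant $U_x(\lambda)$ is a solution of the canonical differential system
\[
\frac{\partial U_x}{\partial x}(\lambda) = i\lambda U_x(\lambda) H(x) J \quad\text{for } 0 \le x < \ell, \ \text{with } U_0(\lambda) = I_m, \tag{4.4}
\]
wherein the Hermitian $H(x)$ is subject to (4.3). From time to time we shall also impose the normalization
\[
\operatorname{trace} H(x) = 1 \quad\text{a.e. on } [0, \ell]. \tag{4.5}
\]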

Theorem 4.1. Each normalized monotonic continuous chain $U_x(\lambda)$, $0 \le x < \ell$, of entire B-regular $J$-inner mvf's is the matrizant of exactly one canonical integral system (4.2) with a continuous nondecreasing mass function $M(x)$, $0 \le x < \ell$, with $M(0) = 0$ that may be obtained from $U_x(\lambda)$ by the formula
\[
M(x) = -i \Bigl(\frac{\partial U_x}{\partial \lambda}\Bigr)(0)\, J. \tag{4.6}
\]

Proof. This follows from the definition of the class $\mathcal{U}_{BR}(J)$ and Theorem 4.6 in [ArD97]. □

Theorem 4.2. Each normalized monotonic continuous chain $U_x(\lambda)$, $0 \le x < \ell$, of entire right or left strongly regular $J$-inner mvf's is the matrizant of exactly one canonical integral system (4.2) with a continuous nondecreasing mass function $M(x)$, $0 \le x < \ell$, with $M(0) = 0$ that may be obtained from $U_x(\lambda)$ by the formula (4.6).

Proof. This is an immediate consequence of Theorems 3.2 and 4.1. □
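As a worked illustration of formula (4.6), consider the simplest case, chosen here only for concreteness and not taken from the text: $J = J_1$ and $H(x) \equiv I_2$, so that $M(x) = xI_2$. One checks directly that the trigonometric matrix below solves (4.4), and then (4.6) can be verified by hand:

```latex
% Assumed data: J = J_1 = [0 -1; -1 0],  H(x) = I_2,  M(x) = x I_2.
\[
U_x(\lambda)=\begin{bmatrix}\cos(\lambda x) & -i\sin(\lambda x)\\ -i\sin(\lambda x) & \cos(\lambda x)\end{bmatrix},
\qquad
\frac{\partial U_x}{\partial x}(\lambda)
 = \lambda\begin{bmatrix}-\sin(\lambda x) & -i\cos(\lambda x)\\ -i\cos(\lambda x) & -\sin(\lambda x)\end{bmatrix}
 = i\lambda\,U_x(\lambda)\,J_1 ,
\]
\[
\frac{\partial U_x}{\partial\lambda}(0)=\begin{bmatrix}0 & -ix\\ -ix & 0\end{bmatrix}=ix\,J_1,
\qquad
-i\,\frac{\partial U_x}{\partial\lambda}(0)\,J_1 = x\,J_1^2 = x\,I_2 = M(x).
\]
```

In this example $\operatorname{trace} H(x) = 2$, so the normalization (4.5) would require rescaling to $H(x) \equiv \tfrac12 I_2$.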

5. Direct and inverse monodromy problems

A canonical integral system (4.2) is said to be a regular integral system if $\ell < \infty$ and the mass function $M(x)$ is a continuous nondecreasing $m \times m$ mvf on the closed interval $[0, \ell]$ with $M(0) = 0$. In this case the matrizant $U_x(\lambda)$, $0 \le x \le \ell$, is a normalized monotonic continuous chain of entire $J$-inner mvf's on the interval $[0, \ell]$ and the value $U(\lambda) = U_\ell(\lambda)$ of the matrizant at the right-hand end point $\ell$ of the interval is called the monodromy matrix.


Similarly, a canonical differential system (4.4) is said to be a regular differential system if $\ell < \infty$ and the Hermitian $H(x)$ meets the conditions
\[
H \in L_1^{m\times m}([0, \ell]) \quad\text{and}\quad H(x) \ge 0 \ \text{a.e. on } [0, \ell]. \tag{5.1}
\]

The matrizant $U_x(\lambda)$, $0 \le x \le \ell$, of such a system is a normalized monotonic continuous chain of entire $J$-inner mvf's on the interval $[0, \ell]$ that is in fact absolutely continuous with respect to $x$ on $[0, \ell]$. The value $U_\ell(\lambda)$ of the matrizant at the right-hand end point of the interval is called the monodromy matrix of the system.

It is clear that the monodromy matrices of regular canonical integral and differential systems belong to the class $\mathcal{E} \cap \mathcal{U}^\circ(J)$. Moreover, by elementary estimates it may also be shown that they are of exponential type. A converse to these results was obtained by V.P. Potapov [Po60] as an application of his work on the multiplicative representation of meromorphic $J$-contractive mvf's $U(\lambda)$ in $\mathbb{C}_+$ with $\det U(\lambda) \ne 0$ for some $\lambda \in \mathfrak{h}_U^+$.

Theorem 5.1. (V.P. Potapov) If $U \in \mathcal{E} \cap \mathcal{U}^\circ(J)$, then $U(\lambda)$ is the monodromy matrix of a regular canonical differential system on the interval $[0, \ell]$ with a Hermitian $H(x)$ that meets the conditions (4.5) and (5.1). Moreover, the length of this interval is uniquely specified by the formula
\[
\ell = \ell_U = \operatorname{trace}\Bigl[-i\Bigl(\frac{\partial U}{\partial\lambda}\Bigr)(0)\, J\Bigr]. \tag{5.2}
\]

Remark 5.2. A mvf $U \in \mathcal{E} \cap \mathcal{U}^\circ(J)$ is automatically of exponential type. This follows from the fact that such a mvf $U$ has bounded Nevanlinna characteristic in both $\mathbb{C}_+$ and $\mathbb{C}_-$ and a theorem of M.G. Krein; see, e.g., Theorem 3.108 in [ArD08] for the latter.

In general, a mvf $U \in \mathcal{E} \cap \mathcal{U}^\circ(J)$ is the monodromy matrix of more than one canonical differential system, i.e., it is not possible to recover $H(x)$ uniquely from $U(\lambda)$. However, if $J = \pm I_m$, then $U$ is the monodromy matrix of exactly one canonical differential system subject to the normalization conditions (5.1) and (4.5) if and only if $U(\lambda)$ and $\det U(\lambda)$ have the same exponential type. This criterion is due to Brodskii-Kisilevskii; see, e.g., [Bro72]. Let
\[
\tau_f^\pm = \limsup_{\nu\uparrow\infty} \nu^{-1} \ln \|f(\pm i\nu)\|
\]
for entire mvf's $f$.

Theorem 5.3. If $U \in \mathcal{E} \cap (\mathcal{U}_{rsR}^\circ(J) \cup \mathcal{U}_{\ell sR}^\circ(J))$, then $U$ is the monodromy matrix of exactly one canonical differential system (4.4) with Hermitian $H(x)$ subject to the constraints (5.1) and (4.5) if and only if either
\[
\tau_U^+ \le 0 \quad\text{and}\quad \tau_U^- = \tau_{\det U}^-
\]
or
\[
\tau_U^- \le 0 \quad\text{and}\quad \tau_U^+ = \tau_{\det U}^+ .
\]

Proof. A proof will be presented in Section 8.5 in [ArD12]. □


Remark 5.4. Theorem 5.3 is a generalization of the Brodskii-Kisilevskii criterion, since by a theorem of M.G. Krein (see, e.g., Theorem 3.108 in [ArD08]) the exponential type of a mvf $U \in \mathcal{E} \cap \mathcal{U}^\circ(J)$ is equal to $\max\{\tau_U^+, \tau_U^-\}$.

A fundamental result of L. de Branges serves to establish uniqueness for the inverse monodromy problem for $2 \times 2$ monodromy matrices in $\mathcal{E} \cap \mathcal{U}(\mathcal{J}_1)$.

Theorem 5.5. (L. de Branges) If $U \in \mathcal{E} \cap \mathcal{U}^\circ(\mathcal{J}_1)$ and $\det U(\lambda) \equiv 1$, then it is the monodromy matrix of exactly one canonical differential system with a real Hermitian $H$ that is subject to the constraints (5.1) and (4.5).

Proof. See [Br68a] and, for additional information, [DMc76]. □

A mvf $U \in \mathcal{U}(J)$ is said to be symplectic if
\[
U(\lambda)^\tau \mathcal{J}_p U(\lambda) = \mathcal{J}_p \quad\text{when } \lambda \in \mathfrak{h}_U .
\]
Here $U(\lambda)^\tau$ denotes the transpose of $U(\lambda)$. If $U \in \mathcal{E} \cap \mathcal{U}(J)$ is symplectic, then $\tau_U^+ = \tau_U^-$. If a $2 \times 2$ mvf $U \in \mathcal{E} \cap \mathcal{U}(J)$, then
\[
U \text{ is symplectic} \iff \det U(\lambda) \equiv 1 \iff \tau_U^+ = \tau_U^- .
\]
A mvf $U \in \mathcal{E} \cap \mathcal{U}^\circ(J)$ is said to be unicellular if every pair of left divisors $U_1, U_2 \in \mathcal{E} \cap \mathcal{U}^\circ(J)$ of $U$ is ordered in the sense that either
\[
U_1^{-1}U_2 \in \mathcal{E} \cap \mathcal{U}^\circ(J) \quad\text{or}\quad U_2^{-1}U_1 \in \mathcal{E} \cap \mathcal{U}^\circ(J).
\]
It is readily seen that a mvf $U \in \mathcal{E} \cap \mathcal{U}^\circ(J)$ is unicellular if and only if $U^\sim$ is unicellular. The matrizant of a regular canonical integral system (4.2) with monodromy matrix $U$ is a family of ordered left divisors of $U$ that is maximal in a natural way.

Theorem 5.6. If $U \in \mathcal{E} \cap \mathcal{U}_{BR}^\circ(J)$, then any maximal family of ordered normalized left divisors of $U$ may be parametrized in such a way that it is the matrizant $U_x(\lambda)$, $0 \le x \le \ell_U$, of a canonical differential system (4.4) with Hermitian $H(x)$ that meets the constraints (5.1) and (4.5).

Proof. Let $\mathcal{V}$ be a maximal family of ordered normalized left divisors of $U$ and for each $V \in \mathcal{V}$, let $U_x(\lambda) = V(\lambda)$, where $x = \ell_V = \operatorname{trace}\{-iV'(0)J\}$. Then $0 \le x \le \ell_U$ and the mvf's $U_x(\lambda)$ that are obtained by this parametrization satisfy the conditions of Theorem 4.1. Therefore, the conclusions of this theorem are valid. □

Theorem 5.7. If $U \in \mathcal{E} \cap \mathcal{U}_{BR}^\circ(J)$, then the following assertions are equivalent:

(1) $U$ is unicellular.

(2) $U$ is the monodromy matrix of exactly one canonical differential system (4.4) with Hermitian $H(x)$ that meets the constraints (5.1) and (4.5).

(3) $U^\sim$ is the monodromy matrix of exactly one canonical differential system (4.4) with Hermitian $H(x)$ that meets the constraints (5.1) and (4.5).


Proof. This follows from Theorem 4.1 and the fact that any maximal monotone family of left divisors of $U$ in the class $\mathcal{U}^\circ(J)$ is a normalized monotonic continuous chain of entire $J$-inner mvf's. In view of this, (2) holds if and only if (1) holds. Finally, (3) is immediate from (2) and Theorem 3.1. □

6. Connections with the theory of characteristic functions

A mvf $U \in \mathcal{E} \cap \mathcal{U}^\circ(J)$ may be identified as the characteristic mvf of an LB (Livsic-Brodskii) Volterra $J$-node $\Sigma = (K, F; X, \mathbb{C}^m; J)$:
\[
U(\lambda) = U_\Sigma(\lambda) = I_m + i\lambda F(I - \lambda K)^{-1} F^* J, \tag{6.1}
\]
where $K$, the main operator of the node, is a Volterra operator in the Hilbert space $X$, $F$ is a bounded linear operator from $X$ into $\mathbb{C}^m$, $J$ is an $m \times m$ signature matrix and
\[
K - K^* = iF^*JF. \tag{6.2}
\]
An LB Volterra $J$-node may be chosen to be simple, which means that
\[
\bigcap_{n\ge 0} \ker FK^n = \{0\}. \tag{6.3}
\]
It is known that
\[
\bigcap_{n\ge 0} \ker FK^n = \{0\} \iff \ker K \cap \ker F = \{0\}; \tag{6.4}
\]
and that if $\Sigma_j = (K_j, F_j; X_j, \mathbb{C}^m; J)$, $j = 1, 2$, is a pair of simple LB Volterra $J$-nodes with the same characteristic mvf $U$, then they are unitarily equivalent, i.e., there exists a unitary operator $T$ from $X_1$ onto $X_2$ such that
\[
K_2 = TK_1T^{-1} \quad\text{and}\quad F_2 = F_1T^{-1}; \tag{6.5}
\]

see, e.g., [Bro72] for details and additional information on the connections between the characteristic mvf's of LB Volterra $J$-nodes and entire $J$-inner mvf's.

Remark 6.1. In this paper we focus on the class $\mathcal{E} \cap \mathcal{U}^\circ(J)$ because the matrizants of the canonical systems that we study belong to this class. However, it is also possible to characterize the larger class $\mathcal{U}^\circ(J)$: A mvf $U \in \mathcal{U}^\circ(J)$ if and only if it is the characteristic mvf of a simple Livsic-Brodskii node $\Sigma = (K, F; X, \mathbb{C}^m; J)$ for which the real part $K_R = (K + K^*)/2$ of the main operator $K$ is a bounded selfadjoint operator with singular spectrum, i.e.,
\[
K_R = \int_a^b \mu\, dE_\mu \quad\text{and}\quad \sigma_x(\mu) = \langle E_\mu x, x\rangle \ \text{are singular functions of } \mu \tag{6.6}
\]
for every $x \in X$, where $[a, b]$ is a finite interval in $\mathbb{R}$. (Since
\[
\bigvee_{n\ge 0} K_R^n F^*\mathbb{C}^m = X,
\]
it is actually enough to require that (6.6) holds for every $x \in F^*\mathbb{C}^m$.) An equivalent requirement is that the mvf
\[
c(\lambda) = -iF(K_R - \lambda I)^{-1}F^* = -iF \int_a^b \frac{dE_\mu}{\mu - \lambda}\, F^*
\]
(which belongs to the Carathéodory class) is purely singular, i.e., the nontangential limit $c(\mu) + c(\mu)^* = 0$ a.e. on $\mathbb{R}$; see, e.g., Lemma 6.3 and Theorem 6.4 in [ArD08] and pp. 28–30 of [Bro72] for the last identification.

One basic model of a simple LB Volterra $J$-node with characteristic mvf $U \in \mathcal{E} \cap \mathcal{U}^\circ(J)$ is due to L. de Branges: $\Sigma_{dbr}(U) = (R_0, F_0; \mathcal{H}(U), \mathbb{C}^m; J)$, in which
\[
(R_0 f)(\lambda) = \frac{f(\lambda) - f(0)}{\lambda} \quad\text{and}\quad F_0 f = \sqrt{2\pi}\, f(0) \quad\text{for } f \in \mathcal{H}(U). \tag{6.7}
\]
The verification of simplicity is easily carried out with the help of the identities
\[
F_0 R_0^n f = \sqrt{2\pi}\,(R_0^n f)(0) = \sqrt{2\pi}\, \frac{f^{(n)}(0)}{n!} \quad\text{for } f \in \mathcal{H}(U) \text{ and } n = 0, 1, \ldots. \tag{6.8}
\]
Let $\Sigma_1 = (K_1, F_1; X_1, \mathbb{C}^m; J)$ and $\Sigma_2 = (K_2, F_2; X_2, \mathbb{C}^m; J)$ be a pair of LB Volterra $J$-nodes and let
\[
K = \begin{bmatrix} K_1 & iF_1^*JF_2 \\ 0 & K_2 \end{bmatrix}, \quad F = [F_1 \ \ F_2] \quad\text{and}\quad X = X_1 \oplus X_2 .
\]

Then $\Sigma = (K, F; X, \mathbb{C}^m; J)$ is an LB Volterra $J$-node that is called the product of the nodes $\Sigma_1$ and $\Sigma_2$ and is denoted $\Sigma = \Sigma_1 \times \Sigma_2$. It is easy to see that in this product $X_1$ is a closed subspace of $X$ that is invariant under $K$. An essential feature of this definition is that if $f = \operatorname{col}(f_1, f_2)$ with $f_j \in X_j$ for $j = 1, 2$, then
\[
\|f\|_X^2 = \|f_1\|_{X_1}^2 + \|f_2\|_{X_2}^2 . \tag{6.9}
\]
Moreover,
\[
K_1 = K|_{X_1}, \quad F_1 = F|_{X_1}, \quad K_2 = P_{X_2}K|_{X_2} \quad\text{and}\quad F_2 = F|_{X_2}. \tag{6.10}
\]
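The product construction can be tried out in finite dimensions, where it is easy to see that the characteristic function of $\Sigma_1 \times \Sigma_2$ is the product of the two characteristic functions. The Python sketch below is an illustration under assumed data not taken from the text: $J = J_1$, $X_1 = X_2 = \mathbb{C}$, $K_1 = K_2 = 0$, and $F_j$ is multiplication by a vector $v_j \in \mathbb{C}^2$ with $v_j^*Jv_j = 0$ (the vectors `v1`, `v2` are hypothetical choices). It checks the node relation (6.2) for the product node and compares $U_\Sigma$, computed from (6.1), with $U_{\Sigma_1}U_{\Sigma_2}$.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def lincomb(A, c, B):
    """Entrywise A + c*B."""
    return [[A[i][j] + c * B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j])
               for i in range(len(A)) for j in range(len(A[0])))

def inv2(A):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

I2 = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
J = [[0, -1], [-1, 0]]      # assumed signature matrix J_1
v1 = [[1], [1j]]            # hypothetical F_1, satisfies v1^* J v1 = 0
v2 = [[2], [-1j]]           # hypothetical F_2, satisfies v2^* J v2 = 0

def char_fn_1d(v, lam):
    """(6.1) for the one-dimensional node with K = 0 and F = v."""
    return lincomb(I2, 1j * lam, matmul(matmul(v, adjoint(v)), J))

# Product node: K = [[K1, i F1^* J F2], [0, K2]],  F = [F1  F2].
k12 = 1j * matmul(matmul(adjoint(v1), J), v2)[0][0]
K = [[0, k12], [0, 0]]
F = [[v1[0][0], v2[0][0]], [v1[1][0], v2[1][0]]]

def char_fn_product(lam):
    resolvent = inv2(lincomb(I2, -lam, K))          # (I - lam K)^{-1}
    return lincomb(I2, 1j * lam,
                   matmul(matmul(matmul(F, resolvent), adjoint(F)), J))

lam = 0.6 - 0.9j
node_error = max_abs_diff(lincomb(K, -1, adjoint(K)),
                          lincomb(ZERO, 1j, matmul(matmul(adjoint(F), J), F)))
product_error = max_abs_diff(
    char_fn_product(lam),
    matmul(char_fn_1d(v1, lam), char_fn_1d(v2, lam)))
print(node_error, product_error)
```

Both printed numbers should be at the level of rounding error. The off-diagonal entry $iF_1^*JF_2$ of $K$ is exactly what makes the cross term in the product of the two characteristic functions appear.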

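Conversely, if $\Sigma = (K, F; X, \mathbb{C}^m; J)$ is an LB Volterra $J$-node and $X_1$ is a closed subspace of $X$ that is invariant under $K$ and $X_2 = X \ominus X_1$, then
\[
\Sigma = \Sigma_1 \times \Sigma_2, \quad\text{where } \Sigma_j = (K_j, F_j; X_j, \mathbb{C}^m; J) \ \text{for } j = 1, 2,
\]
$K_j$ and $F_j$ are defined as in (6.10) and
\[
U_\Sigma = I_m + i\lambda\,[F_1 \ \ F_2] \left( I - \lambda \begin{bmatrix} K_1 & iF_1^*JF_2 \\ 0 & K_2 \end{bmatrix} \right)^{-1} \begin{bmatrix} F_1^* \\ F_2^* \end{bmatrix} J = U_{\Sigma_1} U_{\Sigma_2} .
\]
It is known that the formula
\[
(T_\Sigma x)(\lambda) = \frac{1}{\sqrt{2\pi}}\, F(I - \lambda K)^{-1} x, \quad x \in X, \tag{6.11}
\]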


defines a unitary similarity from a simple LB Volterra $J$-node $\Sigma = (K, F; X, \mathbb{C}^m; J)$ with characteristic mvf $U$ onto $\Sigma_{dbr}(U)$. Thus, if $U \in \mathcal{E} \cap \mathcal{U}^\circ(J)$, then there exists a simple LB Volterra $J$-node $\Sigma = (K, F; X, \mathbb{C}^m; J)$ such that $U_\Sigma = U$; it is defined up to unitary equivalence by $U$. Moreover, every closed subspace $X_1$ of $X$ that is invariant under $K$ defines an LB Volterra $J$-node $\Sigma_1$, as above, such that its characteristic mvf is a left divisor of $U$. If a left divisor $U_1$ of $U$ may be obtained in this way, i.e., as the characteristic mvf of a node $\Sigma_1 = (K_1, F_1; X_1, \mathbb{C}^m; J)$ that is related to the node $\Sigma = (K, F; X, \mathbb{C}^m; J)$ with characteristic mvf $U$ as in (6.10), where $X_1$ is a closed subspace of $X$ that is invariant under $K$, then it is called left regular in the Brodskii sense (or the Livsic-Brodskii sense). Equivalently, if $U = U_1U_2$, then $U_1$ is a regular left divisor in this sense if and only if the product $\Sigma_1 \times \Sigma_2$ of two simple LB Volterra $J$-nodes with characteristic mvf's $U_1$ and $U_2$ is a simple node.

Theorem 6.2. Let $U = U_1U_2$, where $U_1, U_2 \in \mathcal{E} \cap \mathcal{U}^\circ(J)$. Then $U_1$ is a left regular divisor of $U$ in the Brodskii sense if and only if the L. de Branges condition (1.11) holds.

Proof. Suppose first that (1.11) holds. Then the conclusions of Theorem 1.1 are in force. Let $\Sigma_1 = \Sigma_{dbr}(U_1)$ and $\Sigma_2 = \Sigma_{dbr}(U_2)$. Then both of these nodes are simple. The equivalence (1.12) and the formulas for the operators in these nodes imply that $\Sigma = \Sigma_1 \times \Sigma_2$ is unitarily equivalent to the simple node $\Sigma_{dbr}(U)$. Thus, $U_1$ is a left regular divisor of $U$ in the Brodskii sense.

Conversely, if $U_1$ is a left regular divisor of $U$ in the Brodskii sense, then it is the characteristic mvf of a node $\Sigma_1$ that is related to the simple LB Volterra $J$-node $\Sigma_{dbr}(U)$ by (6.10). Therefore, the Hilbert space $X_1$ in the node $\Sigma_1$ is a closed subspace of the Hilbert space $\mathcal{H}(U) = X$ in the node $\Sigma_{dbr}(U)$ that is invariant under $R_0$. Thus, in view of Theorem 1.2 and Remark 1.3, $X_1 = \mathcal{H}(\widetilde{U}_1)$ for some mvf $\widetilde{U}_1 \in \mathcal{U}^\circ(J)$ for which $\widetilde{U}_1^{-1}U \in \mathcal{U}^\circ(J)$ and $\mathcal{H}(U) = \mathcal{H}(\widetilde{U}_1) \oplus \widetilde{U}_1\mathcal{H}(U_2)$. Consequently, as the characteristic mvf $U_1$ of $\Sigma_{dbr}(\widetilde{U}_1)$ coincides with $\widetilde{U}_1$, (1.11) holds by Theorem 1.1. □

Theorem 6.3. If $U_1, U_2 \in \mathcal{E} \cap \mathcal{U}^\circ(J)$, then $\Sigma_{dbr}(U_1) \times \Sigma_{dbr}(U_2)$ is simple if and only if (1.11) holds.

Proof. This theorem follows from Theorem 6.2. However, we shall give an independent proof for the sake of added perspective. Let $f_j \in \mathcal{H}(U_j)$ for $j = 1, 2$ and let $f = \operatorname{col}(f_1, f_2)$. Then $f \in \ker F \cap \ker K$ if and only if
\[
F_1f_1 + F_2f_2 = 0, \quad K_1f_1 + iF_1^*JF_2f_2 = 0 \quad\text{and}\quad K_2f_2 = 0,
\]
i.e., if and only if
\[
f_1(0) + f_2(0) = 0, \quad (R_0f_1)(\lambda) + i\sqrt{2\pi}\,F_1^*Jf_2(0) = 0 \quad\text{and}\quad (R_0f_2)(\lambda) = 0. \tag{6.12}
\]


Therefore, since
\[
F_1^* v = \sqrt{2\pi}\, K_0^{U_1} v = \frac{J - U_1(\lambda)J}{-i\sqrt{2\pi}\,\lambda}\, v \quad\text{for } v \in \mathbb{C}^m,
\]
\[
(R_0f_1)(\lambda) + iF_1^*JF_2f_2 = \frac{f_1(\lambda) - f_1(0)}{\lambda} - i\,\frac{J - U_1(\lambda)J}{-i\lambda}\, Jf_1(0) = \frac{f_1(\lambda) - U_1(\lambda)f_1(0)}{\lambda} = 0.
\]
Thus, the three constraints in (6.12) imply that $f_2$ is constant and that
\[
f_1(\lambda) = U_1(\lambda)f_1(0) = -U_1(\lambda)f_2(0) = -U_1(\lambda)f_2(\lambda).
\]
But this implies that
\[
f_1 \in \mathcal{H}(U_1) \cap U_1\mathcal{H}(U_2),
\]
which is equal to $\{0\}$ if (1.11) is in force. Therefore, $f_1 = 0$, $f_2 = 0$ and hence $f = 0$. Thus, condition (1.11) implies that the product node $\Sigma_{dbr}(U_1) \times \Sigma_{dbr}(U_2)$ is simple.

Conversely, if the product node $\Sigma_{dbr}(U_1) \times \Sigma_{dbr}(U_2)$ is simple, then $\mathcal{H}(U_1)$ must sit isometrically inside $\mathcal{H}(U_1U_2)$, which implies that (1.11) holds, by another application of Theorem 1.1. □

A Volterra operator $K$ in a Hilbert space $X$ is called unicellular if the set of all closed subspaces of $X$ that are invariant under $K$ is ordered by inclusion.

Theorem 6.4. If $U \in \mathcal{E} \cap \mathcal{U}_{BR}^\circ(J)$ is the characteristic mvf of a simple LB Volterra $J$-node with main operator $K$, then
\[
U \text{ is unicellular} \iff K \text{ is unicellular}.
\]

Proof. It suffices to verify this for the de Branges model, i.e., to show that $U$ is unicellular if and only if $R_0$ is unicellular on $\mathcal{H}(U)$. But, by Theorems 1.1 and 1.2, a closed subspace $\mathcal{H}_1$ of $\mathcal{H}(U)$ is invariant under $R_0$ if and only if $\mathcal{H}_1 = \mathcal{H}(U_1)$ for some $U_1 \in \mathcal{E} \cap \mathcal{U}^\circ(J)$ that is a left divisor of $U$ and the factors in the corresponding factorization (1.10) meet the condition (1.11). □

There is another LB Volterra $J$-node associated with each mvf $U \in \mathcal{E} \cap \mathcal{U}^\circ(J)$. It is obtained after identifying $U$ as the monodromy matrix of a canonical differential system (4.4) with Hermitian $H(x)$, $0 \le x \le \ell$, that meets the constraints (5.1) and (4.5), and is defined in terms of $H$ as follows: $\Sigma_H = (K_H, F_H; X_H, \mathbb{C}^m; J)$, where
\[
X_H = \Bigl\{ \text{measurable } m \times 1 \text{ vvf's } f \text{ on } [0, \ell] : \int_0^\ell f(x)^* H(x) f(x)\, dx < \infty \Bigr\},
\]
\[
(K_H f)(x) = iJ \int_x^\ell H(s) f(s)\, ds \quad\text{and}\quad F_H f = \int_0^\ell H(s) f(s)\, ds \quad\text{for } f \in X_H .
\]


For this node, formula (6.11) may be expressed in terms of the matrizant of the underlying canonical system:
\[
(T_{\Sigma_H} f)(\lambda) = \frac{1}{\sqrt{2\pi}} \int_0^\ell U(s, \lambda) H(s) f(s)\, ds. \tag{6.13}
\]

Theorem 6.5. If $U_x$, $0 \le x \le \ell$, is the matrizant of the canonical system (4.4), then the node $\Sigma_H$ is simple if and only if the inclusions $\mathcal{H}(U_x) \subseteq \mathcal{H}(U_\ell)$ are isometric for every $x \le \ell$.

Proof. See, e.g., (5) in Theorem 8.26 in [ArD12]. □

Theorem 6.6. If $U \in \mathcal{E} \cap \mathcal{U}_{BR}^\circ(J)$, then there exists at least one simple LB Volterra $J$-node $\Sigma_H$ with characteristic mvf $U(\lambda)$ and a normalized Hermitian $H(x)$ that meets the constraints (5.1) and (4.5). There is exactly one such node if and only if $U$ is unicellular.

Proof. Let $\Sigma = (K, F; X, \mathbb{C}^m; J)$ be any simple LB Volterra $J$-node with characteristic mvf $U$. Then there exists a maximal chain of closed subspaces of $X$ that are invariant under $K$ and are ordered by inclusion. The characteristic mvf's of the projections of $\Sigma$ onto these subspaces form a maximal ordered chain of normalized left divisors of $U$. By Theorem 5.6 this chain is the matrizant $U_x(\lambda)$, $0 \le x \le \ell$, of a canonical differential system (4.4) with monodromy matrix $U$ and Hermitian $H$ that meets the constraints (5.1) and (4.5). The node $\Sigma_H$ meets the claimed properties of the theorem, as follows with the help of Theorem 6.4. □
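The passage from (6.11) to the integral formula (6.13) for the node $\Sigma_H$ can be made explicit by a short computation. The sketch below is offered under stated assumptions rather than as the authors' argument: $K_H$ is taken to act by integration over $[x, \ell]$, the weight $H(s)$ is kept inside the integral, $U_x(\lambda) = U(x, \lambda)$ denotes the matrizant of (4.4), and the auxiliary functions $g$ and $W$ are introduced here only for the derivation.

```latex
% Assumptions: (K_H f)(x) = iJ \int_x^\ell H f\,ds,   F_H f = \int_0^\ell H f\,ds,
%              \partial_x U_x = i\lambda U_x H(x) J,  U_0 = I_m.
\[
g=(I-\lambda K_H)^{-1}f
\iff g(x)=f(x)+i\lambda J\,W(x),\qquad W(x)=\int_x^\ell H(s)g(s)\,ds .
\]
\[
W'(x)=-H(x)f(x)-i\lambda H(x)JW(x)
\ \Longrightarrow\
\frac{d}{dx}\bigl\{U_x(\lambda)W(x)\bigr\}
 =i\lambda U_xHJW-U_xHf-i\lambda U_xHJW=-U_x(\lambda)H(x)f(x).
\]
\[
\text{Since } W(\ell)=0 \text{ and } U_0=I_m:\qquad
F_H g=W(0)=\int_0^\ell U_s(\lambda)H(s)f(s)\,ds ,
\qquad
(T_{\Sigma_H}f)(\lambda)=\frac{1}{\sqrt{2\pi}}\int_0^\ell U_s(\lambda)H(s)f(s)\,ds .
\]
```

As a consistency check, $F_H^*$ maps $v \in \mathbb{C}^m$ to the constant function $v$ (with respect to the inner product $\int_0^\ell g^*Hf$), so taking $f = Jv$ gives $U_{\Sigma_H}(\lambda)v = v + i\lambda \int_0^\ell U_s(\lambda)H(s)Jv\,ds = U_\ell(\lambda)v$, in agreement with (6.1) and (4.4).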

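7. de Branges spaces

Let the $p \times 2p$ mvf $\mathfrak{E}(\lambda) = [E_-(\lambda) \ \ E_+(\lambda)]$ with $p \times p$ blocks $E_\pm$ be defined in terms of the bottom block row of $A \in \mathcal{U}(J_p)$ by the formula
\[
\mathfrak{E}(\lambda) = \sqrt{2}\,[0 \ \ I_p]\, A(\lambda)\mathfrak{V}, \quad\text{where } \mathfrak{V} = \frac{1}{\sqrt{2}} \begin{bmatrix} -I_p & I_p \\ I_p & I_p \end{bmatrix}.
\]
It is then readily checked that
\[
\sqrt{2}\,[0 \ \ I_p]\,\{J_p - A(\lambda)J_pA(\omega)^*\}\,\sqrt{2} \begin{bmatrix} 0 \\ I_p \end{bmatrix} = E_+(\lambda)E_+(\omega)^* - E_-(\lambda)E_-(\omega)^*
\]
and hence that the kernel
\[
K_\omega^{\mathfrak{E}}(\lambda) = 2\,[0 \ \ I_p]\, K_\omega^{A}(\lambda) \begin{bmatrix} 0 \\ I_p \end{bmatrix} = \frac{E_+(\lambda)E_+(\omega)^* - E_-(\lambda)E_-(\omega)^*}{\rho_\omega(\lambda)}
\]
is positive on $\mathfrak{h}_{\mathfrak{E}} \times \mathfrak{h}_{\mathfrak{E}}$. Therefore, this kernel defines a RKHS (reproducing kernel Hilbert space), which we shall refer to as the de Branges space $\mathcal{B}(\mathfrak{E})$ based on the de Branges matrix $\mathfrak{E}(\lambda)$; for additional information (including an intrinsic definition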


of $\mathcal{B}(\mathfrak{E})$) see, e.g., Sections 5.10 and 5.11 in [ArD08] (especially the first and last equivalences in (5.115)). Moreover, it turns out that if
\[
f = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} \in \mathcal{H}(A) \quad\text{and}\quad V_2 f = \sqrt{2}\,[0 \ \ I_p]\, f = \sqrt{2}\, f_2 ,
\]
then
\[
\|f\|_{\mathcal{H}(A)}^2 = 2\|f_2\|_{\mathcal{B}(\mathfrak{E})}^2 \quad\text{for } f \in \mathcal{H}(A) \ominus \ker V_2 ,
\]
i.e., the operator $V_2$ is an isometry from $\mathcal{H}(A) \ominus \ker V_2$ onto $\mathcal{B}(\mathfrak{E})$. If $p = 1$ and $A \in \mathcal{E} \cap \mathcal{U}(J_1)$, then $\ker V_2 = \{0\}$ if and only if
\[
\lim_{\nu\uparrow\infty} \nu^{-1}\, \frac{a_{11}(i\nu) + a_{12}(i\nu)}{a_{21}(i\nu) + a_{22}(i\nu)} = 0;
\]
see, e.g., Section 5.12 in [ArD08] for additional information. Alternate characterizations of scalar de Branges spaces $\mathcal{B}(E)$ of entire functions based on an entire function $E$ that meets the condition $|E(\lambda)| > |E(\bar\lambda)|$ for $\lambda \in \mathbb{C}_+$ are given in Sections 19–23 of [Br68a]. (Theorem 4.1 in [Dy70] exhibits the consistency of de Branges' definitions with the characterization in terms of Hardy spaces in (5.115) of [ArD08].) Spaces of matrix- and operator-valued functions are considered in [Br68b].
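The identity that underlies the kernel $K_\omega^{\mathfrak{E}}$ at the start of this section can be verified in a few lines. The sketch below introduces the auxiliary signature matrix $j_p = \operatorname{diag}(I_p, -I_p)$, which is notation assumed here for the computation; otherwise it relies only on the definitions of $\mathfrak{E}$, $\mathfrak{V}$ and $J_p$.

```latex
% Auxiliary notation (assumed): j_p = diag(I_p, -I_p); note that \mathfrak{V}^* = \mathfrak{V}.
\[
\mathfrak{V}\,j_p\,\mathfrak{V}^*
=\tfrac12\begin{bmatrix}-I_p & I_p\\ I_p & I_p\end{bmatrix}
\begin{bmatrix}I_p & 0\\ 0 & -I_p\end{bmatrix}
\begin{bmatrix}-I_p & I_p\\ I_p & I_p\end{bmatrix}
=\begin{bmatrix}0 & -I_p\\ -I_p & 0\end{bmatrix}=J_p ,
\]
\[
2\,[0\ \ I_p]\,A(\lambda)J_pA(\omega)^*\begin{bmatrix}0\\ I_p\end{bmatrix}
=\mathfrak{E}(\lambda)\,j_p\,\mathfrak{E}(\omega)^*
=E_-(\lambda)E_-(\omega)^*-E_+(\lambda)E_+(\omega)^* ,
\qquad
[0\ \ I_p]\,J_p\begin{bmatrix}0\\ I_p\end{bmatrix}=0 ,
\]
\[
\text{so that}\quad
2\,[0\ \ I_p]\,\{J_p-A(\lambda)J_pA(\omega)^*\}\begin{bmatrix}0\\ I_p\end{bmatrix}
=E_+(\lambda)E_+(\omega)^*-E_-(\lambda)E_-(\omega)^* .
\]
```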

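8. An example

In this section we shall consider the canonical differential system
\[
u'(x, \lambda) = i\lambda\, u(x, \lambda) \begin{bmatrix} x^r & 0 \\ 0 & x^{-r} \end{bmatrix} J_1 \quad\text{for } 0 \le x < \infty \ \text{and} \ -1 < r < 1, \tag{8.1}
\]
with matrizant
\[
A_x(\lambda) = A(x, \lambda) = \begin{bmatrix} a_{11}(x, \lambda) & a_{12}(x, \lambda) \\ a_{21}(x, \lambda) & a_{22}(x, \lambda) \end{bmatrix} \quad\text{for } 0 \le x < \infty, \tag{8.2}
\]
and shall establish the following facts:

(1) The matrizant can be expressed in terms of the Gamma function and Bessel functions $J_p(x)$ of the first kind as
\[
A(x, \lambda) = \begin{bmatrix} \bigl(\frac{x\lambda}{2}\bigr)^{(1-r)/2}\, \Gamma\bigl(\frac{1+r}{2}\bigr) & 0 \\ 0 & \bigl(\frac{x\lambda}{2}\bigr)^{(1+r)/2}\, \Gamma\bigl(\frac{1-r}{2}\bigr) \end{bmatrix}
\times \begin{bmatrix} J_{\frac{r-1}{2}}(x\lambda) & -ix^r J_{\frac{1+r}{2}}(x\lambda) \\ -ix^{-r} J_{\frac{1-r}{2}}(x\lambda) & J_{-\frac{1+r}{2}}(x\lambda) \end{bmatrix}, \tag{8.3}
\]
or, equivalently, as
\[
A(x, \lambda) = \begin{bmatrix} \Gamma\bigl(\frac{1+r}{2}\bigr) & 0 \\ 0 & \Gamma\bigl(\frac{1-r}{2}\bigr) \end{bmatrix}
\times \begin{bmatrix} \mathrm{F}_{\frac{r-1}{2}}(x\lambda) & -i\,\frac{x\lambda}{2}\, x^r\, \mathrm{F}_{\frac{1+r}{2}}(x\lambda) \\ -i\,\frac{x\lambda}{2}\, x^{-r}\, \mathrm{F}_{\frac{1-r}{2}}(x\lambda) & \mathrm{F}_{-\frac{1+r}{2}}(x\lambda) \end{bmatrix}, \tag{8.4}
\]
where
\[
\mathrm{F}_p(\lambda) = (\lambda/2)^{-p} J_p(\lambda) = \sum_{k=0}^{\infty} \frac{(-1)^k (\lambda/2)^{2k}}{\Gamma(k+1)\Gamma(k+1+p)}. \tag{8.5}
\]
Moreover, $A_x(\lambda)$ is real (i.e., $\overline{A_x(-\bar\lambda)} = A_x(\lambda)$) and symplectic; $a_{11}(x, \lambda)$ and $a_{22}(x, \lambda)$ are even functions of $\lambda$, $a_{12}(x, \lambda)$ and $a_{21}(x, \lambda)$ are odd functions of $\lambda$ and
\[
|a_{jk}(x, \lambda)| \le a_{jk}(x, i|\lambda|) \quad\text{for } \lambda \in \mathbb{C}, \ x > 0 \ \text{and} \ j, k = 1, 2. \tag{8.6}
\]

(2) The de Branges matrices $\mathfrak{E}(x, \lambda) = [E_-(x, \lambda) \ \ E_+(x, \lambda)]$ with components
\[
E_\pm(x, \lambda) = a_{22}(x, \lambda) \pm a_{21}(x, \lambda)
= \Bigl(\frac{x\lambda}{2}\Bigr)^{(1+r)/2} \Gamma\Bigl(\frac{1-r}{2}\Bigr) \Bigl\{ J_{-\frac{1+r}{2}}(x\lambda) \mp i x^{-r} J_{\frac{1-r}{2}}(x\lambda) \Bigr\} \tag{8.7}
\]
are entire functions of exponential type $x$,
\[
|E_+(x, \mu)| = |E_+(x, -\mu)| \quad\text{for every point } \mu \in \mathbb{R}, \tag{8.8}
\]
\[
E_-(x, \lambda) = E_+^\#(x, \lambda) \quad\text{for every point } \lambda \in \mathbb{C} \tag{8.9}
\]
and $E_\pm(x, \lambda) \ne 0$ for every point $\lambda \in \mathbb{C}_\pm$.

(3) The function
\[
c_x(\lambda) = T_{A_x}[1] = \frac{a_{11}(x, \lambda) + a_{12}(x, \lambda)}{a_{21}(x, \lambda) + a_{22}(x, \lambda)} \tag{8.10}
\]
belongs to the subclass of the Carathéodory class $\mathcal{C}$ that is denoted by $\mathcal{C}_a$ in [ArD08] and is characterized by the fact that it meets the condition
\[
\lim_{\nu\uparrow\infty} \nu^{-1} c_x(i\nu) = 0 \tag{8.11}
\]
and its spectral function $\sigma_x(\mu)$ is locally absolutely continuous; in fact
\[
c_x(\lambda) = i\alpha + \frac{1}{\pi i} \int_{-\infty}^{\infty} \Bigl\{ \frac{1}{\mu - \lambda} - \frac{\mu}{1 + \mu^2} \Bigr\} |E_+(x, \mu)|^{-2}\, d\mu \tag{8.12}
\]
for $\lambda \in \mathbb{C}_+$ and appropriate $\alpha \in \mathbb{R}$.

(4) The function
\[
\Delta_x(\mu) = |E_+(x, \mu)|^{-2} = \sigma_x'(\mu)
\]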


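satisfies the Muckenhoupt $(A_2)$ condition
\[
\sup \Bigl\{ \frac{1}{b-a} \int_a^b \Delta_x(\mu)\, d\mu \ \cdot\ \frac{1}{b-a} \int_a^b \Delta_x(\mu)^{-1}\, d\mu \ :\ a < b \Bigr\} < \infty,
\]
where the supremum is taken over every choice of $a < b$.

(5) $A_x \in \mathcal{E} \cap \mathcal{U}_{\ell sR}^\circ(J_1)$ for $0 \le x < \infty$.

(6) The de Branges space $\mathcal{B}(\mathfrak{E}_x)$ is the space of entire functions $f$ of exponential type less than or equal to $x$ for which
\[
\int_{-\infty}^{\infty} \frac{|f(\mu)|^2}{|\mu|^r}\, d\mu < \infty. \tag{8.13}
\]

(7) If $r = 0$, then
\[
A_x(\lambda) = \begin{bmatrix} \cos(\lambda x) & -i\sin(\lambda x) \\ -i\sin(\lambda x) & \cos(\lambda x) \end{bmatrix}, \quad \mathfrak{E}_x(\lambda) = [e^{i\lambda x} \ \ e^{-i\lambda x}]
\]
and the de Branges space $\mathcal{B}(\mathfrak{E}_x)$ is equal to the Paley-Wiener space
\[
\mathcal{H}(e_x) \oplus \mathcal{H}_*(e_x) = \Bigl\{ \int_{-x}^{x} e^{i\lambda t} f(t)\, dt : f \in L_2([-x, x]) \Bigr\}.
\]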

(8) If $-1 < r < 0$, then $\mathcal{B}(\mathfrak{E}_x)$ is a proper subspace of $\mathcal{H}(e_x) \oplus \mathcal{H}_*(e_x)$. If $0 < r < 1$, then $\mathcal{H}(e_x) \oplus \mathcal{H}_*(e_x)$ is a proper subspace of $\mathcal{B}(\mathfrak{E}_x)$.

(9) If $r \in (-1, 0) \cup (0, 1)$ and $x > 0$, then $A_x \notin \mathcal{U}_{rsR}(J_1)$.

Proof. The proof is divided into steps.

1. Verification of (1). The top row of the matrizant is a solution of the system
\[
[a_{11}'(x, \lambda) \ \ a_{12}'(x, \lambda)] = -i\lambda\, [x^{-r} a_{12}(x, \lambda) \ \ x^r a_{11}(x, \lambda)]
\]
with initial condition
\[
[a_{11}(0, \lambda) \ \ a_{12}(0, \lambda)] = [1 \ \ 0].
\]
Thus, $a_{11}(x, \lambda)$ is a solution of the Bessel equation
\[
a_{11}''(x, \lambda) + \frac{r}{x}\, a_{11}'(x, \lambda) + \lambda^2 a_{11}(x, \lambda) = 0, \quad 0 \le x < \infty,
\]
with initial conditions $a_{11}(0, \lambda) = 1$ and $a_{11}'(0, \lambda) = 0$, i.e.,
\[
a_{11}(x, \lambda) = \Bigl(\frac{x\lambda}{2}\Bigr)^{(1-r)/2} \Gamma\Bigl(\frac{1+r}{2}\Bigr) J_{(r-1)/2}(x\lambda) \quad\text{for } 0 \le x < \infty. \tag{8.14}
\]
This justifies the formula for $a_{11}$. The formula for $a_{12}$ follows from the fact that
\[
\frac{d}{dx}\, x^p J_p(x\lambda) = \lambda x^p J_{p-1}(x\lambda) \quad\text{and}\quad \frac{d}{dx}\, x^{-p} J_p(x\lambda) = -\lambda x^{-p} J_{p+1}(x\lambda).
\]
The remaining entries in (8.3) can be verified in much the same way, since
\[
[a_{21}'(x, \lambda) \ \ a_{22}'(x, \lambda)] = -i\lambda\, [x^{-r} a_{22}(x, \lambda) \ \ x^r a_{21}(x, \lambda)]
\]
with initial condition
\[
[a_{21}(0, \lambda) \ \ a_{22}(0, \lambda)] = [0 \ \ 1].
\]
Furthermore, $A_x(\lambda)$ is real and symplectic, since
\[
\overline{A_x(-\bar\lambda)} = A_x(\lambda) \quad\text{and}\quad \det A_x(\lambda) = 1.
\]

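Finally, the inequalities in (8.6) are immediate from (8.4) and (8.5).

2. Formulas (8.8) and (8.9) hold and $|E_\pm(x, \lambda)| > 0$ for every point $\lambda \in \mathbb{C}_\pm$. Formulas (8.8) and (8.9) follow easily from (8.2), (8.4), (8.5) and the first formula in (8.7). Next, since
\[
[0 \ \ 1]\, A_x(\mu) J_1 A_x(\mu)^* \begin{bmatrix} 0 \\ 1 \end{bmatrix} = [0 \ \ 1]\, J_1 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = 0 \quad\text{for } \mu \in \mathbb{R},
\]
the identity
\[
[E_-(x, \mu) \ \ E_+(x, \mu)] = \sqrt{2}\,[0 \ \ 1]\, A_x(\mu)\mathfrak{V}
\]
implies that $|E_+(x, \mu)|^2 = |E_-(x, \mu)|^2$. Thus, if $\mu \in \mathbb{R}$,
\[
E_+(x, \mu) = 0 \implies E_-(x, \mu) = 0 \implies \operatorname{rank} A_x(\mu) \le 1,
\]
which contradicts the invertibility of $A_x(\lambda)$ on $\mathbb{R}$. Similarly, the inequality
\[
|E_+(x, \lambda)| \ge |E_-(x, \lambda)| \quad\text{for } \lambda \in \mathbb{C}_+ \tag{8.15}
\]
implies that if $E_+(x, \omega) = 0$ for some point $\omega \in \mathbb{C}_+$, then $E_-(x, \omega) = 0$ and hence $\det A_x(\omega) = 0$, which contradicts the already established fact that $A_x(\lambda)$ is invertible for every point $\lambda \in \mathbb{C}$. Therefore,
\[
|E_+(x, \lambda)| > |E_-(x, \lambda)| \quad\text{for } \lambda \in \mathbb{C}_+ \tag{8.16}
\]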

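and the proof that $|E_+(x, \lambda)| > 0$ for $\lambda \in \mathbb{C}_+$ is complete. In view of (8.9), this also justifies the inequality $|E_-(x, \lambda)| > 0$ for $\lambda \in \mathbb{C}_-$.

3. If $x > 0$ and $\mu > 0$, then there exist a pair of positive constants $\gamma_1$ and $\gamma_2$ that depend on $x$ and $r$ such that
\[
\gamma_1 (1 + |\mu|^r) \le |E_+(x, \mu)|^2 \le \gamma_2 (1 + |\mu|^r) \quad\text{for } \mu \in \mathbb{R}. \tag{8.17}
\]
Since
\[
J_p(x) \sim \sqrt{\frac{2}{\pi x}}\, \cos\Bigl(x - \frac{\pi}{4} - p\,\frac{\pi}{2}\Bigr) \quad\text{as } x \uparrow \infty,
\]
it is readily checked that
\[
|E_+(x, \mu)|^2 \sim \frac{1}{\pi} \Bigl(\frac{x\mu}{2}\Bigr)^r \Gamma\Bigl(\frac{1-r}{2}\Bigr)^2 \bigl\{ \cos^2(x\mu + r(\pi/4)) + x^{-2r} \sin^2(x\mu + r(\pi/4)) \bigr\} \quad\text{as } \mu \uparrow \infty. \tag{8.18}
\]
Thus, if
\[
a_x = \frac{1}{\pi}\, \Gamma\Bigl(\frac{1-r}{2}\Bigr)^2 \min\{1, x^{-2r}\} \quad\text{and}\quad b_x = \frac{1}{\pi}\, \Gamma\Bigl(\frac{1-r}{2}\Bigr)^2 \max\{1, x^{-2r}\},
\]
then
\[
a_x \Bigl(\frac{x\mu}{2}\Bigr)^r (1 + O(1/\mu)) \le |E_+(x, \mu)|^2 \le b_x \Bigl(\frac{x\mu}{2}\Bigr)^r (1 + O(1/\mu))
\]
as $\mu \uparrow \infty$. This serves to establish (8.17), since $E_+(x, \lambda)$ is an entire function of $\lambda$ with no real zeros.

4. $E_+(x, i\nu) \sim \bigl(\frac{x\nu}{2}\bigr)^{(1+r)/2}\, \Gamma\bigl(\frac{1-r}{2}\bigr)(1 + x^{-r})\, \dfrac{e^{x\nu}}{\sqrt{2\pi x\nu}}$ as $\nu \uparrow \infty$. This follows from formula (8.7), and the relations
\[
J_p(ix\nu) = i^p I_p(x\nu) \quad\text{and}\quad I_p(x\nu) \sim \frac{e^{x\nu}}{\sqrt{2\pi x\nu}} \quad\text{as } \nu \uparrow \infty. \tag{8.19}
\]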
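Steps 1, 2 and 4 lend themselves to a numerical sanity check. The Python sketch below is an illustration and not part of the proof: it sums the series (8.5) directly, assembles the matrizant from (8.4), and then tests $\det A_x(\lambda) = 1$, the reduction to $\cos$ and $\sin$ when $r = 0$, the symmetry (8.9), and the asymptotic formula of Step 4. The function names, the sample values of $x$, $\lambda$, $r$, $\nu$, and the number of series terms are assumptions made for the example.

```python
import cmath
import math

def bessel_F(p, z, terms=200):
    """F_p(z) = (z/2)^(-p) J_p(z), summed from the series (8.5); needs p > -1."""
    term = 1.0 / math.gamma(1.0 + p)          # k = 0 term
    total = term
    q = -(z / 2) ** 2
    for k in range(terms):
        # ratio of consecutive terms: -(z/2)^2 / ((k+1)(k+1+p))
        term = term * q / ((k + 1) * (k + 1 + p))
        total += term
    return total

def matrizant(x, lam, r):
    """Entries a11, a12, a21, a22 of A(x, lambda) according to (8.4)."""
    z = x * lam
    g_plus = math.gamma((1 + r) / 2)
    g_minus = math.gamma((1 - r) / 2)
    a11 = g_plus * bessel_F((r - 1) / 2, z)
    a12 = g_plus * (-1j) * (z / 2) * x ** r * bessel_F((1 + r) / 2, z)
    a21 = g_minus * (-1j) * (z / 2) * x ** (-r) * bessel_F((1 - r) / 2, z)
    a22 = g_minus * bessel_F(-(1 + r) / 2, z)
    return a11, a12, a21, a22

def E_plus(x, lam, r):
    _, _, a21, a22 = matrizant(x, lam, r)
    return a22 + a21

def E_minus(x, lam, r):
    _, _, a21, a22 = matrizant(x, lam, r)
    return a22 - a21

x, lam, r = 1.3, 0.7 + 0.4j, 0.5              # sample values (assumed)

a11, a12, a21, a22 = matrizant(x, lam, r)
det_error = abs(a11 * a22 - a12 * a21 - 1)    # symplectic: det A = 1

b11, b12, b21, b22 = matrizant(x, lam, 0.0)   # r = 0 gives cos and -i sin
trig_error = max(abs(b11 - cmath.cos(x * lam)), abs(b12 + 1j * cmath.sin(x * lam)),
                 abs(b21 + 1j * cmath.sin(x * lam)), abs(b22 - cmath.cos(x * lam)))

# (8.9): E_-(x, lambda) = conj(E_+(x, conj(lambda)))
sharp_error = abs(E_minus(x, lam, r) - E_plus(x, lam.conjugate(), r).conjugate())

# Step 4: compare E_+(x, i nu) with its claimed asymptotic form at a large nu
nu = 30.0
y = x * nu
predicted = ((y / 2) ** ((1 + r) / 2) * math.gamma((1 - r) / 2)
             * (1 + x ** (-r)) * math.exp(y) / math.sqrt(2 * math.pi * y))
asymptotic_ratio = abs(E_plus(x, 1j * nu, r) / predicted)

print(det_error, trig_error, sharp_error, asymptotic_ratio)
```

The first three numbers should be at the level of rounding error, and the last should be close to 1, with a deviation of order $1/(x\nu)$. Summing the series term by term through the ratio of consecutive terms avoids overflow in the powers $(z/2)^{2k}$ for large arguments.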

5. Verification of (2). Assertion (2) follows from (1) and Step 2. Detailed type estimates are also furnished in [Dy70].

6. Verification of (3). In view of (8.3) and (8.19),
\[
a_{11}(x, i\nu) + a_{12}(x, i\nu) = \Bigl(\frac{x\nu}{2}\Bigr)^{(1-r)/2} \Gamma\Bigl(\frac{1+r}{2}\Bigr) \bigl\{ I_{\frac{r-1}{2}}(x\nu) + x^r I_{\frac{1+r}{2}}(x\nu) \bigr\}.
\]
Therefore, by (8.10),
\[
\begin{aligned}
c_x(i\nu) &= \Bigl(\frac{x\nu}{2}\Bigr)^{-r}\, \frac{\Gamma\bigl(\frac{1+r}{2}\bigr) \bigl\{ I_{\frac{r-1}{2}}(x\nu) + x^r I_{\frac{1+r}{2}}(x\nu) \bigr\}}{\Gamma\bigl(\frac{1-r}{2}\bigr) \bigl\{ x^{-r} I_{\frac{1-r}{2}}(x\nu) + I_{-\frac{1+r}{2}}(x\nu) \bigr\}} \\
&\sim \Bigl(\frac{x\nu}{2}\Bigr)^{-r}\, \frac{\Gamma\bigl(\frac{1+r}{2}\bigr)(1 + x^r)}{\Gamma\bigl(\frac{1-r}{2}\bigr)(1 + x^{-r})}
= \Bigl(\frac{\nu}{2}\Bigr)^{-r}\, \frac{\Gamma\bigl(\frac{1+r}{2}\bigr)}{\Gamma\bigl(\frac{1-r}{2}\bigr)} \quad\text{as } \nu \uparrow \infty.
\end{aligned}
\]
Thus, (8.11) holds, since $r > -1$. The rest follows from the fact that $c_x(\lambda)$ is holomorphic on $\mathbb{R}$ and $(\Re c_x)(\mu) = |E_+(x, \mu)|^{-2}$ for $\mu \in \mathbb{R}$.

7. Verification of (4) and (5). (4) follows from the bounds in (8.17); (5) follows from (3), (4) and Theorem 10.9 in [ArD08].

8. Verification of (6). If $f \in \mathcal{B}(\mathfrak{E}_x)$, then it is an entire function of exponential type at most $x$ by Theorem 5.65 in [ArD08] and the bound
\[
|v^* f(\omega)| = |\langle f, K_\omega^{\mathfrak{E}} v\rangle_{\mathcal{B}(\mathfrak{E})}| \le \|f\|_{\mathcal{B}(\mathfrak{E})} \sqrt{v^* K_\omega^{\mathfrak{E}}(\omega) v}\,;
\]
and
\[
\|f\|^2_{\mathcal{B}(\mathfrak{E}_x)} = \int_{-\infty}^{\infty} \frac{|f(\mu)|^2}{|E_+(x, \mu)|^2}\, d\mu < \infty. \tag{8.20}
\]
Thus, in view of the bounds in (8.17), (8.13) holds. In fact, since
\[
\int_{-\infty}^{\infty} \frac{1}{|E_+(x, \mu)|^2}\, \frac{1}{1 + \mu^2}\, d\mu < \infty \tag{8.21}
\]
and (8.9), (8.16) are in force, $\mathcal{B}(\mathfrak{E}_x)$ can be characterized as the set of entire functions $f$ of exponential type $\le x$ for which the integral in (8.20) is finite; see, e.g., Lemma 3.5 in [Dy70]. However, in view of (8.17), this is equivalent to (6).

9. Verification of (7), (8) and (9). The asserted inclusions follow from the characterizations of $\mathcal{B}(\mathfrak{E}_x)$ given in (6), the Paley-Wiener theorem (which serves to characterize the Paley-Wiener space by (6) with $r = 0$) and the fact that: If $f$ is an entire function of exponential type at most $x$ and $-1 < r \le 0$, then
\[
\int_{-\infty}^{\infty} \frac{|f(\mu)|^2}{|\mu|^r}\, d\mu < \infty \implies \int_{-\infty}^{\infty} |f(\mu)|^2\, d\mu < \infty,
\]
i.e.,

−1 < 𝑟 ≤ 0 =⇒ ℬ(𝔈𝑥 ) ⊆ ℬ([𝑒𝑥 𝑒−𝑥 ]). (8.22) On the other hand, if 𝑓 is an entire function of exponential type at most 𝑥 and 0 ≤ 𝑟 < 1, then ∫ ∞ ∫ ∞ ∣𝑓 (𝜇)∣2 ∣𝑓 (𝜇)∣2 𝑑𝜇 < ∞ =⇒ 𝑟 𝑑𝜇 < ∞, −∞ −∞ ∣𝜇∣

i.e.,

0 ≤ 𝑟 < 1 =⇒ ℬ([𝑒𝑥 𝑒−𝑥 ]) ⊆ ℬ(𝔈𝑥 ). (8.23) √ √ Moreover, if 0 < 𝑟 < 1, then the function 𝑓 (𝜆) = sin 𝜆/ 𝜆 is an entire function of minimal exponential type that meets the condition (8.13). Therefore, 𝑓 ∈ ℬ(𝔈𝑥 ) for every 𝑥 > 0. However, 𝑓 ∕∈ 𝐿2 (ℝ). Consequently the inclusion in (8.23) is proper when 0 < 𝑟 < 1. Next, to establish that the inclusion in (8.23) is proper when −1 < 𝑟 < 0, it suﬃces to exhibit an entire function 𝑓 of exponential type at most 𝑥 in 𝐿2 (ℝ) for which (8.13) fails. The formula ∫ 𝜋/2 𝜋Γ(𝑎 − 1) ) (1 ) (𝑎 > 1) (8.24) (cos 𝑡)𝑎−2 𝑒𝑖𝜆𝑡 𝑑𝑡 = 𝑎−2 ( 1 𝑓 (𝜆) = 1 1 2 Γ 𝑎 −𝜋/2 2 + 2𝜆 Γ 2𝑎 − 2𝜆 (which is taken from p. 186 of Titchmarsh [Ti62], who credits S. Ramanujan) exhibits the right-hand side as an entire function of exponential type 𝜋/2 when ∫ 𝜋/2 (cos 𝑡)2(𝑎−2) 𝑑𝑡 < ∞, 𝜋/2

i.e., when 𝑎 > 3/2. Moreover, since ) ( )}−1 { (1 = 𝑂(∣𝜇∣1−𝑎 ), Γ 2 𝑎 + 12 𝜇 Γ 12 𝑎 − 12 𝜇

(8.25)


(8.13) fails for −1 < 𝑟 < 0 when 𝑎 ≤ (3 − 𝑟)/2. Therefore, since the function specified in (8.24) is of exponential type 𝜋/2, the inclusion (8.22) is proper for such choices of 𝑟 and 𝑥 ≥ 𝜋/2. The same conclusion is obtained for 0 < 𝑥 < 𝜋/2 by considering 𝑓(𝜌𝜆) with 0 < 𝜌 < 1 in place of 𝑓(𝜆). Finally (9) is immediate from Theorem 5.98 in [ArD08]. □

Additional features of this example that are connected with the interpretation of the mvf's 𝐴𝑥 as resolvent matrices of a class of related extension problems will be discussed in [ArD12].

Remark 8.1. It turns out that the function 𝑐𝑥(𝜆) defined by (8.10) tends to a limit 𝑐∞(𝜆) as 𝑥 ↑ ∞ and that this limit admits an integral representation
\[ c_\infty(\lambda) = \lambda^2 \int_0^\infty e^{i\lambda t} g_\infty(t)\, dt \quad \text{for } \lambda \in \mathbb{C}_+, \tag{8.26} \]
where $g_\infty(t) = k_r t^{r+1}$ for 𝑡 ≥ 0 and a constant 𝑘𝑟. Thus, if −1 < 𝑟 < 1/2, then
\[ \int_0^a |g_\infty(t)|^2\, dt = \infty \quad \text{for every } a > 0, \tag{8.27} \]
which, in view of Theorem 8.39 in [ArD08], is a stronger conclusion than (9). We remark that variants of the differential system (8.1) considered in this section have been used for assorted purposes in [Dy70], [LLS], [Sak97] and presumably in many other places.
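The integral formula (8.24) used in Step 9 can be sanity-checked numerically. The following is only a sketch: the function names, the Simpson quadrature and the sample values of 𝑎 and 𝜆 are illustrative assumptions and are not taken from the paper.

```python
import math

def ramanujan_lhs(a, lam, n=20000):
    """Composite Simpson approximation of the integral of
    cos(t)**(a-2) * exp(i*lam*t) over [-pi/2, pi/2].
    The imaginary part vanishes by symmetry, so only cos(lam*t) is integrated.
    n is assumed to be even."""
    lo, hi = -math.pi / 2, math.pi / 2
    h = (hi - lo) / n

    def g(t):
        c = max(math.cos(t), 0.0)  # guard against tiny negative rounding at the endpoints
        return c ** (a - 2) * math.cos(lam * t)

    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3

def ramanujan_rhs(a, lam):
    """Right-hand side of (8.24) for real lam with a/2 - lam/2 not a pole of Gamma."""
    return math.pi * math.gamma(a - 1) / (
        2 ** (a - 2) * math.gamma(a / 2 + lam / 2) * math.gamma(a / 2 - lam / 2))

# Example comparison for the smooth case a = 3, lam = 0.5.
print(ramanujan_lhs(3, 0.5), ramanujan_rhs(3, 0.5))
```

For 𝑎 = 2 the integrand reduces to $e^{i\lambda t}$ and both sides equal $2\sin(\pi\lambda/2)/\lambda$, which gives an independent closed-form check of the Gamma-function side.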

References

[AlD84] D. Alpay and H. Dym, Hilbert spaces of analytic functions, inverse scattering and operator models, I. Integ. Equat. Oper. Th. 7 (1984), 589–741.
[ArD97] D.Z. Arov and H. Dym, 𝐽-inner matrix functions, interpolation and inverse problems for canonical systems, I: Foundations. Integ. Equat. Oper. Th. 29 (1997), 373–454.
[ArD05] D.Z. Arov and H. Dym, Strongly regular 𝐽-inner matrix-valued functions and inverse problems for canonical systems, in: Recent Advances in Operator Theory and its Applications (M.A. Kaashoek, S. Seatzu and C. van der Mee, eds.), Oper. Theor. Adv. Appl., 160, Birkhäuser, Basel, 2005, pp. 101–160.
[ArD07] D.Z. Arov and H. Dym, Bitangential direct and inverse problems for systems of differential equations, in: Probability, Geometry and Integrable Systems (M. Pinsky and B. Birnir, eds.), MSRI Publications, 55, Cambridge University Press, Cambridge, 2007, pp. 1–28.
[ArD08] D.Z. Arov and H. Dym, 𝐽-Contractive Matrix Valued Functions and Related Topics, Encyclopedia of Mathematics and its Applications, 116, Cambridge University Press, Cambridge, 2008.


[ArD12] D.Z. Arov and H. Dym, Bitangential Direct and Inverse Problems for Systems of Differential Equations, Cambridge University Press, in press.
[Br63] L. de Branges, Some Hilbert spaces of analytic functions I. Trans. Amer. Math. Soc. 106 (1963), 445–668.
[Br65] L. de Branges, Some Hilbert spaces of analytic functions II. J. Math. Anal. Appl. 11 (1965), 44–72.
[Br68a] L. de Branges, Hilbert Spaces of Entire Functions. Prentice-Hall, Englewood Cliffs, 1968.
[Br68b] L. de Branges, The expansion theorem for Hilbert spaces of entire functions, in: Entire Functions and Related Parts of Analysis, Amer. Math. Soc., Providence, 1968, pp. 79–148.
[Bro72] M.S. Brodskii, Triangular and Jordan Representations of Linear Operators. Transl. Math. Monographs, 32, Amer. Math. Soc., Providence, R.I., 1972.
[Dy70] H. Dym, An introduction to de Branges spaces of entire functions with applications to differential equations of the Sturm-Liouville type. Adv. Math. 5 (1970), 395–471.
[DMc76] H. Dym and H.P. McKean, Gaussian Processes, Function Theory, and the Inverse Spectral Problem, Academic Press, New York, 1976; reprinted by Dover, New York, 2008.
[LLS] H. Langer, M. Langer and Z. Sasvari, Continuation of Hermitian indefinite functions and corresponding canonical systems: An example. Methods of Funct. Anal. Top. 10 (2004), no. 1, 39–53.
[Po60] V.P. Potapov, The multiplicative structure of 𝐽-contractive matrix functions. Amer. Math. Soc. Transl. (2) 15 (1960), 131–243.
[Sak97] L.A. Sakhnovich, Interpolation Theory and its Applications. Mathematics and its Applications, 428, Kluwer Academic Publishers, Dordrecht, 1997.
[Ti62] E.C. Titchmarsh, Introduction to the Theory of Fourier Integrals (Second Edition), Oxford University Press, Oxford, 1962.

Damir Z. Arov
Division of Informatics and Applied Mathematics
South-Ukrainian National Pedagogical University
65020 Odessa, Ukraine
e-mail: [email protected]

Harry Dym
Department of Mathematics
The Weizmann Institute of Science
Rehovot 76100, Israel
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 75–122
© 2012 Springer Basel AG

Canonical Transfer-function Realization for Schur-Agler-class Functions of the Polydisk

Joseph A. Ball and Vladimir Bolotnikov

In memory of Israel Gohberg, a fine teacher and colleague who will be missed

Abstract. Associated with any Schur-class function 𝑆(𝑧) (i.e., a contractive operator-valued holomorphic function on the unit disk) is the de Branges-Rovnyak kernel $K_S(z,\zeta) = [I - S(z)S(\zeta)^*]/(1 - z\bar{\zeta})$ and the reproducing kernel Hilbert space ℋ(𝐾𝑆) which serves as the canonical functional-model state space for a coisometric transfer-function realization $S(z) = D + zC(I - zA)^{-1}B$ of 𝑆. To obtain a canonical functional-model unitary transfer-function realization, it is now well understood that one must work with a certain (2 × 2)-block matrix kernel and associated two-component reproducing kernel Hilbert space. In this paper we indicate how these ideas extend to the multivariable setting where the unit disk is replaced by the unit polydisk in 𝑑 complex variables. For the case 𝑑 > 2, one must replace the Schur class by the more restrictive Schur-Agler class (defined in terms of the validity of a certain von Neumann inequality) in order to get a good realization theory paralleling the single-variable case. This work represents one contribution to the recent extension of the state-space method to multivariable settings, an area of research where Israel Gohberg was a prominent and leading practitioner.

Mathematics Subject Classification (2000). 47A57.

Keywords. Operator-valued Schur-Agler functions, Agler decomposition, unitary realization.

1. Introduction

1.1. The classical setting and the legacy of Israel Gohberg

For 𝒰 and 𝒴 Hilbert spaces, we let ℒ(𝒰, 𝒴) denote the space of bounded linear operators mapping 𝒰 into 𝒴, abbreviated to ℒ(𝒰) in case 𝒰 = 𝒴. We then define the operator-valued version of the classical Schur class 𝒮(𝒰, 𝒴) to consist of holomorphic functions 𝑆 on the unit disk 𝔻 with values equal to contraction operators


between 𝒰 and 𝒴. There is a close connection between Schur-class functions and dissipative discrete-time linear systems which we now explain. By a discrete-time linear system we mean a system of equations of the form
\[ \Sigma_d : \quad \begin{cases} x(t+1) = A x(t) + B u(t) \\ y(t) = C x(t) + D u(t) \end{cases} \tag{1.1} \]
where the evolution is along the nonnegative integers 𝑡 ∈ ℤ+ := {0, 1, 2, . . . }. Here we view 𝒰 as the input space, 𝒳 as the state space and 𝒴 as the output space. Application of the Z-transform in the form
\[ \hat{x}(z) = \sum_{n=0}^{\infty} x(n) z^n \]
then results in an input-output relation of the form
\[ \hat{y}(z) = C(I - zA)^{-1} x(0) + T_{\Sigma_d}(z)\, \hat{u}(z) \]
where
\[ T_{\Sigma_d}(z) = D + zC(I - zA)^{-1}B \tag{1.2} \]

is a rational matrix function with no pole at the origin (in case 𝒰, 𝒳, and 𝒴 are all finite-dimensional), or, more generally, an operator-valued function, holomorphic on a neighborhood of the origin (in the infinite-dimensional setting). A key discovery is that this procedure is reversible: any rational matrix-valued (or, more generally operator-valued) function holomorphic in a neighborhood of the origin can be represented in the form of a transfer function of a linear system (1.2); this is the starting point for the so-called state-space method in the analysis of all sorts of problems (e.g., Fredholm theory of Wiener-Hopf and singular integral operators, pole-zero structure and interpolation problems of tangential Lagrange-Sylvester and Nevanlinna-Pick type for rational matrix functions) of which Israel Gohberg was a leading figure (see [34, 47, 35, 36, 23, 45]).

There is a special case of the discrete-time linear system (1.1) which is of special interest, namely the case where the system is conservative or dissipative, corresponding to the case where the linear spaces 𝒳, 𝒰, 𝒴 are all taken to be Hilbert spaces and the colligation matrix $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is taken to be unitary (conservative case) or contractive (dissipative case). The contractivity of the colligation matrix U then results in the energy-balance relation
\[ \|x(t+1)\|^2 - \|x(t)\|^2 \le \|u(t)\|^2 - \|y(t)\|^2, \tag{1.3} \]
i.e., the net change in the energy stored in the system from time 𝑡 to time 𝑡 + 1 cannot be more than the net energy supplied to the system through the absorption of the input signal 𝑢(𝑡) and the loss of the output signal 𝑦(𝑡). If we take 𝑥(0) = 0 and sum up over 0 ≤ 𝑡 ≤ 𝑇, we get
\[ 0 \le \|x(T+1)\|^2 \le \sum_{t=0}^{T} \bigl[ \|u(t)\|^2 - \|y(t)\|^2 \bigr] \]
for all 𝑇 ≥ 0, which immediately implies that $\|\{y(t)\}_{t \in \mathbb{Z}_+}\|_{\ell^2} \le \|\{u(t)\}_{t \in \mathbb{Z}_+}\|_{\ell^2}$. Via the Plancherel theorem, it follows that $\|\hat{y}\|^2_{H^2_{\mathcal{Y}}} \le \|\hat{u}\|^2_{H^2_{\mathcal{U}}}$. Here, for any coefficient Hilbert space ℰ we use the notation $H^2_{\mathcal{E}}$ to denote the Hardy space of ℰ-valued functions on the unit disk 𝔻. As $\hat{y} = T_{\Sigma_d} \cdot \hat{u}$, we then see that necessarily $T_{\Sigma_d}$ is in the Schur class 𝒮(𝒰, 𝒴).

Such functions come up as scattering functions for conservative or dissipative linear circuits (see [37, 65]), as characteristic operator functions in the theory of canonical models for Hilbert space contraction operators (see [54, 58, 41]) as well as scattering functions for Lax-Phillips scattering systems (see [53, 1]). We mention that the Livšic theory of characteristic functions (see [54, 55]) is really about modeling operators close to unitary (contractive or not), or after a Cayley transform change of variable, operators which are close to selfadjoint, where the state space 𝒳 is allowed to carry an indefinite inner product rather than a positive-definite inner product; it was in this latter context that Israel Gohberg was also a participant in the development of the theory (see [43]). Finer considerations can be used to characterize when 𝑆 is inner, i.e., in addition has unitary boundary values on the unit circle, but we do not go into the details of this point here. Suffice it to say that it is such energy-balance considerations which distinguish the earliest versions of the Livšic theory of characteristic functions from the Kalman theory where the various spaces 𝒰, 𝒳, 𝒴 are just linear spaces and there is no consideration of any energy-balance relation as in (1.3) (see [51, 52]); it should be mentioned that energy-balance considerations were introduced into the control theory literature by Willems (see [63, 64]) with earlier foreshadowing in the circuit theory literature [37, 65].
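The passage from the system (1.1) to its transfer function (1.2), and the energy-balance relation (1.3), can be illustrated on the smallest possible example. The sketch below assumes one-dimensional spaces 𝒰, 𝒳, 𝒴 and a rotation matrix as the unitary colligation; the angle `THETA` and all function names are hypothetical choices made for illustration only.

```python
import math

THETA = 0.7  # hypothetical angle; any real value gives a unitary 2x2 colligation
A, B = math.cos(THETA), -math.sin(THETA)
C, D = math.sin(THETA), math.cos(THETA)

def simulate(u, x0=0.0):
    """Run x(t+1) = A x(t) + B u(t), y(t) = C x(t) + D u(t).
    Returns the list of states x(0..T+1) and the list of outputs y(0..T)."""
    xs, ys = [x0], []
    for ut in u:
        x = xs[-1]
        ys.append(C * x + D * ut)
        xs.append(A * x + B * ut)
    return xs, ys

def transfer(z):
    """T(z) = D + z C (1 - z A)^{-1} B for the scalar colligation above."""
    return D + z * C * B / (1 - z * A)

def taylor_coefficients(n):
    """First n Taylor coefficients of T at the origin: D, CB, CAB, CA^2B, ..."""
    return [D] + [C * A ** (k - 1) * B for k in range(1, n)]

# The impulse response with x(0) = 0 reproduces the Taylor coefficients of T.
_, impulse_response = simulate([1.0] + [0.0] * 7)
print(impulse_response)
print(taylor_coefficients(8))
```

Because the colligation here is unitary rather than merely contractive, (1.3) holds with equality at every time step, and the resulting transfer function works out to the Blaschke factor (cos θ − 𝑧)/(1 − 𝑧 cos θ), which is bounded by 1 in modulus on 𝔻.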
The realization problem for Schur-class functions can be stated quite simply: given a function 𝑆 in the Schur class 𝒮(𝒰, 𝒴), can one find a contractive (or even unitary) colligation matrix $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ so that 𝑆(𝑧) is realized as the transfer function $T_{\Sigma_d}$ of the associated dissipative (or even conservative) discrete-time linear system (1.1)? For future reference, we state the following well-known but state-of-the-art refined version of the solution of this problem; a good reference for this result and the single-variable results to follow in this section is the book [7] where the more general Pontryagin-space setting is handled.

Theorem 1.1. Let 𝑆 : 𝔻 → ℒ(𝒰, 𝒴) be given. Then the following are equivalent:
(1a) 𝑆 ∈ 𝒮(𝒰, 𝒴), i.e., 𝑆 is holomorphic on 𝔻 with ∥𝑆(𝑧)∥ ≤ 1 for all 𝑧 ∈ 𝔻.
(1b) 𝑆 satisfies the von Neumann inequality: ∥𝑆(𝑇)∥ ≤ 1 for any strictly contractive operator 𝑇 on a Hilbert space ℋ, where 𝑆(𝑇) is defined by
\[ S(T) = \sum_{n=0}^{\infty} S_n \otimes T^n \in \mathcal{L}(\mathcal{U} \otimes \mathcal{H}, \mathcal{Y} \otimes \mathcal{H}) \quad \text{if} \quad S(z) = \sum_{n=0}^{\infty} S_n z^n. \]
(2) The associated kernel function
\[ K_S(z,\zeta) = \frac{I_{\mathcal{Y}} - S(z)S(\zeta)^*}{1 - z\bar{\zeta}} \tag{1.4} \]
is a positive kernel on 𝔻 × 𝔻.


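(3) There is an auxiliary Hilbert space 𝒳 and a unitary connecting operator (or colligation) $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} : \begin{bmatrix} \mathcal{X} \\ \mathcal{U} \end{bmatrix} \to \begin{bmatrix} \mathcal{X} \\ \mathcal{Y} \end{bmatrix}$ so that 𝑆(𝑧) can be expressed as
\[ S(z) = D + zC(I - zA)^{-1}B. \tag{1.5} \]
(4) 𝑆(𝑧) has a realization as in (1.5) where the connecting operator U is any one of (i) isometric, (ii) coisometric, or (iii) contractive.

We shall be interested in uniqueness issues in such a transfer-function realization. Let us say that two colligations $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} : \begin{bmatrix} \mathcal{X} \\ \mathcal{U} \end{bmatrix} \to \begin{bmatrix} \mathcal{X} \\ \mathcal{Y} \end{bmatrix}$ and $\mathbf{U}' = \begin{bmatrix} A' & B' \\ C' & D' \end{bmatrix} : \begin{bmatrix} \mathcal{X}' \\ \mathcal{U} \end{bmatrix} \to \begin{bmatrix} \mathcal{X}' \\ \mathcal{Y} \end{bmatrix}$ are unitarily equivalent if there is a unitary operator 𝑈 : 𝒳 → 𝒳′ so that
\[ \begin{bmatrix} U & 0 \\ 0 & I_{\mathcal{Y}} \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} A' & B' \\ C' & D' \end{bmatrix} \begin{bmatrix} U & 0 \\ 0 & I_{\mathcal{U}} \end{bmatrix}. \]
It is readily seen that if two colligations are unitarily equivalent, then their transfer functions are identical: 𝑆(𝑧) = 𝑆′(𝑧). The converse is true under appropriate minimality conditions which we now recall. In what follows, the symbol ⋁ stands for the closed linear span.

Definition 1.2. The colligation $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} : \begin{bmatrix} \mathcal{X} \\ \mathcal{U} \end{bmatrix} \to \begin{bmatrix} \mathcal{X} \\ \mathcal{Y} \end{bmatrix}$ is called
1. observable (or closely outer-connected) if the pair (𝐶, 𝐴) is observable, i.e., if $\bigvee_{n \ge 0} \operatorname{Ran} A^{*n} C^* = \mathcal{X}$;
2. controllable (or closely inner-connected) if the pair (𝐵, 𝐴) is controllable, i.e., if $\bigvee_{n \ge 0} \operatorname{Ran} A^{n} B = \mathcal{X}$;
3. closely connected if $\bigvee_{n \ge 0} \{ \operatorname{Ran} A^{n} B,\ \operatorname{Ran} A^{*n} C^* \} = \mathcal{X}$.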

The kernel function 𝐾𝑆(𝑧, 𝜁) given by (1.4) and the associated reproducing kernel Hilbert space ℋ(𝐾𝑆) are the classical de Branges-Rovnyak kernel and de Branges-Rovnyak reproducing kernel Hilbert space associated with the Schur-class function 𝑆; they have been much studied over the years, both as objects in themselves and as tools for other types of applications. The special role of the de Branges-Rovnyak space in connection with the transfer-function realization for Schur-class functions is to provide a canonical functional-model realization for 𝑆, as illustrated in the following theorem.

Theorem 1.3. Suppose that the function 𝑆 is in the Schur class 𝒮(𝒰, 𝒴) and let ℋ(𝐾𝑆) be the associated de Branges-Rovnyak space. Define operators 𝐴, 𝐵, 𝐶, 𝐷 by
\[ A : f(z) \mapsto \frac{f(z) - f(0)}{z}, \quad B : u \mapsto \frac{S(z) - S(0)}{z}\, u, \quad C : f(z) \mapsto f(0), \quad D : u \mapsto S(0) u. \tag{1.6} \]
Then the operator colligation $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ defines a coisometry from ℋ(𝐾𝑆) ⊕ 𝒰 to ℋ(𝐾𝑆) ⊕ 𝒴. Moreover, U is observable and its transfer function $D + zC(I - zA)^{-1}B$ equals 𝑆(𝑧). Finally, any observable coisometric colligation $\begin{bmatrix} A' & B' \\ C' & D' \end{bmatrix} : \begin{bmatrix} \mathcal{X} \\ \mathcal{U} \end{bmatrix} \to \begin{bmatrix} \mathcal{X} \\ \mathcal{Y} \end{bmatrix}$ with transfer function equal to 𝑆 is unitarily equivalent to U.
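To make the construction of Theorem 1.3 concrete, consider the scalar Schur function $S(z) = z^2$ (an example chosen here for illustration, not taken from the paper). Its de Branges-Rovnyak kernel is $(1 - z^2\bar{\zeta}^2)/(1 - z\bar{\zeta}) = 1 + z\bar{\zeta}$, so ℋ(𝐾𝑆) is two-dimensional with orthonormal basis {1, 𝑧}. The sketch below writes the operators (1.6) as matrices in that basis and checks the two claims of the theorem; the helper names are hypothetical.

```python
# Coordinates are taken with respect to the orthonormal basis {1, z} of H(K_S)
# for S(z) = z**2, where K_S(z, zeta) = 1 + z * conj(zeta).
A = [[0, 1], [0, 0]]   # difference quotient: 1 -> 0, z -> 1
B = [0, 1]             # u -> ((S(z) - S(0)) / z) u = z u
C = [1, 0]             # f -> f(0)
D = 0                  # S(0)

def transfer(z):
    """D + z C (I - zA)^{-1} B, with the 2x2 inverse written out explicitly."""
    m = [[1 - z * A[0][0], -z * A[0][1]],
         [-z * A[1][0], 1 - z * A[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    v = [inv[0][0] * B[0] + inv[0][1] * B[1],
         inv[1][0] * B[0] + inv[1][1] * B[1]]
    return D + z * (C[0] * v[0] + C[1] * v[1])

def colligation():
    """The 3x3 block matrix U = [[A, B], [C, D]]."""
    return [A[0] + [B[0]], A[1] + [B[1]], C + [D]]

def is_coisometry(u):
    """Check U U^T = I (all entries are real here, so the adjoint is the transpose)."""
    n = len(u)
    return all(
        abs(sum(u[i][k] * u[j][k] for k in range(n)) - (1 if i == j else 0)) < 1e-12
        for i in range(n) for j in range(n))

z0 = 0.3 + 0.2j
print(transfer(z0), z0 ** 2, is_coisometry(colligation()))
```

In this example U is a permutation matrix, hence even unitary; for a general Schur function the canonical colligation of Theorem 1.3 is only guaranteed to be coisometric.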

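It is easily seen from characterization (1a) in Theorem 1.1 that
\[ S \in \mathcal{S}(\mathcal{U}, \mathcal{Y}) \iff S^{\sharp} \in \mathcal{S}(\mathcal{Y}, \mathcal{U}) \quad \text{where } S^{\sharp}(z) := S(\bar{z})^*. \]
Hence for a given Schur-class function 𝑆 there is also associated a dual de Branges-Rovnyak space $\mathcal{H}(K_{S^\sharp})$ with reproducing kernel
\[ K_{S^\sharp}(z,\zeta) = \frac{I - S(\bar{z})^* S(\bar{\zeta})}{1 - z\bar{\zeta}} \]
which plays the same role of providing a canonical functional-model realization for isometric realizations of 𝑆 as ℋ(𝐾𝑆) plays for coisometric realizations, as illustrated in the next theorem.

Theorem 1.4. Suppose that the function 𝑆 is in the Schur class 𝒮(𝒰, 𝒴) and let $\mathcal{H}(K_{S^\sharp})$ be the associated dual de Branges-Rovnyak space. Define
\[ \tilde{A} : g(z) \mapsto z g(z) - S(\bar{z})^* \tilde{g}(0), \quad \tilde{B} : u \mapsto (I - S(\bar{z})^* S(0)) u, \quad \tilde{C} : g(z) \mapsto \tilde{g}(0), \quad \tilde{D} : u \mapsto S(0) u, \tag{1.7} \]
where $\tilde{g}(0)$ is the unique vector in 𝒴 such that
\[ \langle \tilde{g}(0), y \rangle_{\mathcal{Y}} = \Bigl\langle g(z),\ \frac{S(\bar{z})^* - S(0)^*}{z}\, y \Bigr\rangle_{\mathcal{H}(K_{S^\sharp})} \quad \text{for all } y \in \mathcal{Y}. \]
Then the operator colligation $\tilde{\mathbf{U}} = \begin{bmatrix} \tilde{A} & \tilde{B} \\ \tilde{C} & \tilde{D} \end{bmatrix}$ defines an isometry from $\mathcal{H}(K_{S^\sharp}) \oplus \mathcal{U}$ to $\mathcal{H}(K_{S^\sharp}) \oplus \mathcal{Y}$. Moreover, $\tilde{\mathbf{U}}$ is controllable and its transfer function $\tilde{D} + z\tilde{C}(I - z\tilde{A})^{-1}\tilde{B}$ equals 𝑆(𝑧). Finally, any controllable isometric colligation $\begin{bmatrix} A' & B' \\ C' & D' \end{bmatrix} : \begin{bmatrix} \mathcal{X} \\ \mathcal{U} \end{bmatrix} \to \begin{bmatrix} \mathcal{X} \\ \mathcal{Y} \end{bmatrix}$ with transfer function equal to 𝑆 is unitarily equivalent to $\tilde{\mathbf{U}}$.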
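The duality between coisometric and isometric realizations rests on the elementary fact that passing from a colligation U to its adjoint U* replaces the transfer function 𝑆 by $S^\sharp(z) = S(\bar{z})^*$. A minimal numerical sketch for scalar blocks follows; the sample values and function names are assumptions made for illustration, and the identity being checked is algebraic, so the colligation need not be contractive for it to hold.

```python
def transfer(a, b, c, d, z):
    """Scalar transfer function d + z c (1 - z a)^{-1} b."""
    return d + z * c * b / (1 - z * a)

def transfer_of_adjoint(a, b, c, d, z):
    """Transfer function of the adjoint colligation [[conj a, conj c], [conj b, conj d]]:
    the roles of b and c are swapped and every entry is conjugated."""
    return transfer(a.conjugate(), c.conjugate(), b.conjugate(), d.conjugate(), z)

def s_sharp(a, b, c, d, z):
    """S#(z) = conj(S(conj z)), the scalar case of S#(z) = S(conj z)^*."""
    return transfer(a, b, c, d, z.conjugate()).conjugate()

# Hypothetical sample data.
a, b, c, d = 0.3 + 0.2j, 0.5 - 0.1j, -0.4 + 0.6j, 0.2 + 0.1j
z = 0.6 - 0.3j
print(transfer_of_adjoint(a, b, c, d, z), s_sharp(a, b, c, d, z))
```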

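Remark 1.5. We note that Theorem 1.4 can be derived directly from Theorem 1.3 as follows: use formulas (1.6) with $S^\sharp$ in place of 𝑆 and with 𝒰 and 𝒴 switched to define the observable coisometric realization U for the function $S^\sharp$. Then the colligation $\tilde{\mathbf{U}} := \mathbf{U}^*$ will be exactly as described in Theorem 1.4. Explicit formulas (1.7) for operators adjoint to those in (1.6) are obtained via standard routine calculations.

In addition to the kernels $K_S$ and $K_{S^\sharp}$, there is a positive kernel $\hat{K}_S$ which combines these two and is defined as follows:
\[ \hat{K}(z,\zeta) = \begin{bmatrix} K_S(z,\zeta) & \dfrac{S(z) - S(\bar{\zeta})}{z - \bar{\zeta}} \\[2mm] \dfrac{S^\sharp(z) - S^\sharp(\bar{\zeta})}{z - \bar{\zeta}} & K_{S^\sharp}(z,\zeta) \end{bmatrix} = \begin{bmatrix} \dfrac{I - S(z)S(\zeta)^*}{1 - z\bar{\zeta}} & \dfrac{S(z) - S(\bar{\zeta})}{z - \bar{\zeta}} \\[2mm] \dfrac{S(\bar{z})^* - S(\zeta)^*}{z - \bar{\zeta}} & \dfrac{I - S(\bar{z})^* S(\bar{\zeta})}{1 - z\bar{\zeta}} \end{bmatrix}. \tag{1.8} \]
It turns out that $\hat{K}$ is also a positive kernel on 𝔻 × 𝔻 and the associated reproducing kernel Hilbert space $\mathcal{H}(\hat{K}_S)$ serves as the canonical functional-model state space for unitary realizations of 𝑆, as summarized in the following theorem.

Theorem 1.6. Suppose that the function 𝑆 is in the Schur class 𝒮(𝒰, 𝒴) and let $\hat{K}(z,\zeta)$ be the positive kernel on 𝔻 given by (1.8). Define operators $\hat{A}, \hat{B}, \hat{C}, \hat{D}$ by
\[ \hat{A} : \begin{bmatrix} f(z) \\ g(z) \end{bmatrix} \mapsto \begin{bmatrix} [f(z) - f(0)]/z \\ z g(z) - S(\bar{z})^* f(0) \end{bmatrix}, \quad \hat{B} : u \mapsto \begin{bmatrix} \dfrac{S(z) - S(0)}{z}\, u \\ (I - S(\bar{z})^* S(0)) u \end{bmatrix}, \quad \hat{C} : \begin{bmatrix} f(z) \\ g(z) \end{bmatrix} \mapsto f(0), \quad \hat{D} : u \mapsto S(0) u. \]
Then the operator colligation $\hat{\mathbf{U}} = \begin{bmatrix} \hat{A} & \hat{B} \\ \hat{C} & \hat{D} \end{bmatrix}$ defines a unitary operator from $\mathcal{H}(\hat{K}_S) \oplus \mathcal{U}$ onto $\mathcal{H}(\hat{K}_S) \oplus \mathcal{Y}$. Moreover, $\hat{\mathbf{U}}$ is closely connected and its transfer function $\hat{D} + z\hat{C}(I - z\hat{A})^{-1}\hat{B}$ equals 𝑆(𝑧). Finally, any closely connected unitary colligation $\begin{bmatrix} A' & B' \\ C' & D' \end{bmatrix} : (\mathcal{X} \oplus \mathcal{U}) \to (\mathcal{X} \oplus \mathcal{Y})$ with transfer function equal to 𝑆 is unitarily equivalent to $\hat{\mathbf{U}}$.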

1.2. Extensions to several variables

One of the further developments inspired by the success of the state-space method in single-variable function theory, as developed by Israel Gohberg and his collaborators, was the introduction of state-space methods to several-variable function theory. One can start with the seminal work of Agler [2], which introduced both what is now called the Schur-Agler class on the unit polydisk $\mathbb{D}^d$ and the realization of a multivariable function as the transfer function of a certain type of multidimensional linear system (with evolution along a multidimensional lattice $\mathbb{Z}^d_+$ rather than along the nonnegative integers ℤ+). This polydisk setting was later developed in finer detail in [3, 28]. Parallel but different results were then developed for the setting of the unit ball in $\mathbb{C}^d$ in [4, 29, 44, 20], general domains [10, 16], algebraic curves and Riemann surfaces [30, 31, 32, 22], and domains in $\mathbb{C}^n$ with matrix-polynomial defining function [10, 16] (see [14] for a survey up to the year 2000), and now one can see transfer-function realizations appear for functions of noncommuting variables as well [33, 8, 24, 25, 50, 59, 60, 61, 56, 57, 15, 21].

We focus here on the case where the unit disk 𝔻 is replaced by the unit polydisk $\mathbb{D}^d = \{z = (z_1, \dots, z_d) : |z_i| < 1\}$. We then recall the Schur-Agler class $\mathcal{SA}_d(\mathcal{U}, \mathcal{Y})$ consisting of ℒ(𝒰, 𝒴)-valued functions analytic on $\mathbb{D}^d$ and such that ∥𝑆(𝑇)∥ ≤ 1 for any collection of 𝑑 commuting strict contractions $T = (T_1, \dots, T_d)$ on a Hilbert space 𝒦. The operator 𝑆(𝑇) is defined by
\[ S(T) = \sum_{n \in \mathbb{Z}^d_+} S_n \otimes T^n \in \mathcal{L}(\mathcal{U} \otimes \mathcal{K}, \mathcal{Y} \otimes \mathcal{K}) \quad \text{if} \quad S(z) = \sum_{n \in \mathbb{Z}^d_+} S_n z^n \]
with the standard multivariable notation
\[ z^n = z_1^{n_1} \cdots z_d^{n_d} \quad \text{and} \quad T^n = T_1^{n_1} \cdots T_d^{n_d} \quad \text{if} \quad n = (n_1, \dots, n_d) \in \mathbb{Z}^d_+. \]
The following result appears in [2, 3, 28] and is a multivariable analog of Theorem 1.1.


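Theorem 1.7. Let 𝑆 be a ℒ(𝒰, 𝒴)-valued function defined on $\mathbb{D}^d$. The following statements are equivalent:
(1) 𝑆 belongs to the class $\mathcal{SA}_d(\mathcal{U}, \mathcal{Y})$.
(2) There exist positive kernels $K_1^L, \dots, K_d^L : \mathbb{D}^d \times \mathbb{D}^d \to \mathcal{L}(\mathcal{Y})$ such that for every $z = (z_1, \dots, z_d)$ and $\zeta = (\zeta_1, \dots, \zeta_d)$ in $\mathbb{D}^d$,
\[ I_{\mathcal{Y}} - S(z)S(\zeta)^* = \sum_{i=1}^{d} (1 - z_i \bar{\zeta}_i) K_i^L(z,\zeta). \tag{1.9} \]
(3) There exist positive kernels $K_1^R, \dots, K_d^R : \mathbb{D}^d \times \mathbb{D}^d \to \mathcal{L}(\mathcal{U})$ such that for every $z, \zeta \in \mathbb{D}^d$,
\[ I_{\mathcal{U}} - S(\bar{z})^* S(\bar{\zeta}) = \sum_{i=1}^{d} (1 - z_i \bar{\zeta}_i) K_i^R(z,\zeta). \tag{1.10} \]
(4) There exist positive kernels
\[ K_i = \begin{bmatrix} K_i^L & K_i^{LR} \\ K_i^{RL} & K_i^R \end{bmatrix} : \mathbb{D}^d \times \mathbb{D}^d \to \mathcal{L}(\mathcal{Y} \oplus \mathcal{U}) \quad \text{for } i = 1, \dots, d \tag{1.11} \]
such that for every $z, \zeta \in \mathbb{D}^d$,
\[ \begin{bmatrix} I_{\mathcal{Y}} - S(z)S(\zeta)^* & S(z) - S(\bar{\zeta}) \\ S(\bar{z})^* - S(\zeta)^* & I_{\mathcal{U}} - S(\bar{z})^* S(\bar{\zeta}) \end{bmatrix} = \sum_{i=1}^{d} \begin{bmatrix} (1 - z_i \bar{\zeta}_i) K_i^L(z,\zeta) & (z_i - \bar{\zeta}_i) K_i^{LR}(z,\zeta) \\ (z_i - \bar{\zeta}_i) K_i^{RL}(z,\zeta) & (1 - z_i \bar{\zeta}_i) K_i^R(z,\zeta) \end{bmatrix}. \tag{1.12} \]
(5) There exist Hilbert spaces $\mathcal{X}_1, \dots, \mathcal{X}_d$ and a unitary connecting operator (or colligation) U of the structured form
\[ \mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} A_{11} & \cdots & A_{1d} & B_1 \\ \vdots & & \vdots & \vdots \\ A_{d1} & \cdots & A_{dd} & B_d \\ C_1 & \cdots & C_d & D \end{bmatrix} : \begin{bmatrix} \mathcal{X}_1 \\ \vdots \\ \mathcal{X}_d \\ \mathcal{U} \end{bmatrix} \to \begin{bmatrix} \mathcal{X}_1 \\ \vdots \\ \mathcal{X}_d \\ \mathcal{Y} \end{bmatrix} \tag{1.13} \]
so that 𝑆(𝑧) can be realized in the form
\[ S(z) = D + C (I - Z(z)A)^{-1} Z(z) B \quad \text{for all } z \in \mathbb{D}^d, \tag{1.14} \]
where we have set
\[ Z(z) = \begin{bmatrix} z_1 I_{\mathcal{X}_1} & & \\ & \ddots & \\ & & z_d I_{\mathcal{X}_d} \end{bmatrix}. \tag{1.15} \]
(6) 𝑆(𝑧) has a realization as in (1.14) where the connecting operator U is any one of (i) isometric, (ii) coisometric, or (iii) contractive.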
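The implication from a unitary structured realization to a left Agler decomposition can be checked numerically on a toy example with 𝑑 = 2 and one-dimensional spaces. The sketch below is an illustration under stated assumptions: the 3 × 3 orthogonal matrix is a hypothetical choice, and the kernels are taken to be $K_i^L(z,\zeta) = H_i(z)\overline{H_i(\zeta)}$, where $H(z) = C(I - Z(z)A)^{-1}$ is a row vector with two entries.

```python
# Hypothetical example data: a real symmetric orthogonal (hence unitary) 3x3 matrix
# (1/3) * [[1, -2, -2], [-2, 1, -2], [-2, -2, 1]], split as U = [[A, B], [C, D]]
# with all of X1, X2, the input space and the output space one-dimensional.
A = [[1 / 3, -2 / 3], [-2 / 3, 1 / 3]]
B = [-2 / 3, -2 / 3]
C = [-2 / 3, -2 / 3]
D = 1 / 3

def H(z):
    """Row vector C (I - Z(z) A)^{-1} with Z(z) = diag(z1, z2)."""
    z1, z2 = z
    m = [[1 - z1 * A[0][0], -z1 * A[0][1]],
         [-z2 * A[1][0], 1 - z2 * A[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    return [C[0] * inv[0][0] + C[1] * inv[1][0],
            C[0] * inv[0][1] + C[1] * inv[1][1]]

def S(z):
    """Transfer function S(z) = D + C (I - Z(z) A)^{-1} Z(z) B."""
    h = H(z)
    return D + h[0] * z[0] * B[0] + h[1] * z[1] * B[1]

def agler_gap(z, zeta):
    """|LHS - RHS| for 1 - S(z) conj(S(zeta)) = sum_i (1 - z_i conj(zeta_i)) K_i(z, zeta),
    where K_i(z, zeta) = H_i(z) * conj(H_i(zeta))."""
    hz, hw = H(z), H(zeta)
    lhs = 1 - S(z) * S(zeta).conjugate()
    rhs = sum((1 - z[i] * zeta[i].conjugate()) * hz[i] * hw[i].conjugate()
              for i in range(2))
    return abs(lhs - rhs)

print(agler_gap((0.3 + 0.1j, -0.2 + 0.5j), (0.1 - 0.4j, 0.6 + 0.2j)))
```

Each $K_i^L$ chosen this way has rank one and is therefore a positive kernel; the gap printed above should vanish up to rounding because the colligation is coisometric.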


In the sequel we shall use the terminology left Agler decomposition for 𝑆, right Agler decomposition for 𝑆, or simply Agler decomposition for 𝑆, for any collection of kernels $\{K_1^L, \dots, K_d^L\}$, $\{K_1^R, \dots, K_d^R\}$, or $\{K_1, \dots, K_d\}$ of the form (1.11), such that the respective decomposition (1.9), (1.10), or (1.12) holds. A straightforward calculation (see, e.g., [17]) shows that for an operator colligation U of the form (1.13) and its transfer function 𝑆 given by (1.14), we have
\[ I - S(z)S(\zeta)^* = C(I - Z(z)A)^{-1} (I - Z(z)Z(\zeta)^*) (I - A^* Z(\zeta)^*)^{-1} C^* + \begin{bmatrix} C(I - Z(z)A)^{-1} Z(z) & I \end{bmatrix} (I - \mathbf{U}\mathbf{U}^*) \begin{bmatrix} Z(\zeta)^* (I - A^* Z(\zeta)^*)^{-1} C^* \\ I \end{bmatrix} \tag{1.16} \]
and
\[ I - S(z)^* S(\zeta) = B^* (I - Z(z)^* A^*)^{-1} (I - Z(z)^* Z(\zeta)) (I - A Z(\zeta))^{-1} B + \begin{bmatrix} B^* (I - Z(z)^* A^*)^{-1} Z(z)^* & I \end{bmatrix} (I - \mathbf{U}^*\mathbf{U}) \begin{bmatrix} Z(\zeta) (I - A Z(\zeta))^{-1} B \\ I \end{bmatrix}, \tag{1.17} \]
from which it is clear that for a contractive U and any analytic $\mathcal{L}(\oplus_{i=1}^d \mathcal{X}_i)$-valued function 𝑍 in 𝑑 complex variables (i.e., not necessarily of the form (1.15)), formula (1.14) defines an ℒ(𝒰, 𝒴)-valued function 𝑆 analytic and contractive-valued on the domain $\{z \in \mathbb{C}^d : \|Z(z)\|_{op} < 1\}$. The relevance of formula (1.14) to the polydisk setting is mostly determined by the very special structure (1.15) of 𝑍. When extending univariate notions to the polydisk setting, we will often refer to these extensions as "structured" ones, meaning that for other choices of 𝑍 these notions will be substantially different.

Our purpose here is to describe the analogs of Theorems 1.3, 1.4 and 1.6, i.e., canonical model realizations for a Schur-Agler class function as in Theorem 1.7 with the specification of a uniqueness criterion, for the polydisk setting. It turns out that coisometric, isometric and unitary realizations are not so useful for a viable theory in the multivariable setting, and it makes sense to work with weakly coisometric, weakly isometric and weakly unitary realizations instead; see Section 2 below for precise definitions. Also one no longer has strict uniqueness, but the amount of freedom in the choice of canonical functional-model realization can be well described. Once the analog of Theorem 1.3 (weakly coisometric canonical functional-model realizations) is understood, the analog of Theorem 1.4 (canonical functional-model isometric realization) follows by a simple duality argument just as in the univariate case. We note that the polydisk analog of Theorem 1.3 already appears in [17]; here we also obtain converse characterizations of which families of kernels can arise as left Agler decompositions for some Schur-Agler class function (Theorems 3.7 and 3.9 below). The main new point of the present paper then is to develop the analog of Theorem 1.6 (weakly unitary canonical functional-model realizations; see Theorem 5.9 below) for the polydisk setting. We also construct weakly coisometric realizations for a simple example ($S(z) = z_1 z_2$) to illustrate the main ideas; see Example 3.6 below.


We note that the paper [17] also develops the analogs of canonical functional-model isometric and unitary realizations (Theorem 1.4 and Theorem 1.6) for the ball setting (more precisely, the contractive-multiplier class on the Drury-Arveson space as in [4, 29]), and for the general domain setting (more precisely, the Schur-Agler class associated with a domain with matrix-polynomial defining function as in [10, 16]). In addition, in our companion paper [19] we work out explicitly the analogs of Theorem 1.4 and 1.6 (canonical functional-model isometric and coisometric realizations) for the ball setting and discuss applications to the canonical model theory for commuting row contractions as initiated in the work of Bhattacharyya, Eschmeier and Sarkar ([38, 39]). Finally, the paper [18] extends the isometric/unitary canonical functional-model theory to the general-domain setting, thereby providing a general setting containing the polydisk and ball versions as special cases. The main point of the present paper is to obtain weakly unitary realizations for a Schur-Agler class function on the polydisk with a given Agler decomposition in a canonical functional-model form.

The paper [26] also obtains canonical functional-model realizations but by a quite different approach involving the construction of a multievolution Lax-Phillips scattering system having the given Schur-class function 𝑆 as its scattering function. In [26] the goal is to obtain realizations for 𝑆 which are actually unitary (not just weakly unitary); the construction is rather complicated unless the scattering system satisfies some additional minimality conditions. Nevertheless this scattering approach is used in [49] to characterize Schur-class functions on the polydisk in terms of a generalized Agler decomposition.

We mention that there is recent work of Arov-Staffans (see [12, 13]) connected with single-variable de Branges-Rovnyak spaces which uses the term "canonical model" in a somewhat different sense from how we are using it here. In [12, 13] the authors take a behavioral approach whereby one does not distinguish between the input space 𝒰 and the output space 𝒴 but rather focuses on the "signal space" 𝒴 ⊕ 𝒰. The authors then obtain "coordinate-free" versions of the de Branges-Rovnyak model spaces $\mathcal{H}(K_S)$, $\mathcal{H}(K_{S^\sharp})$ and $\mathcal{H}(\hat{K}_S)$ which amount to geometric Kreĭn-space descriptions of the graph of the Schur-class function 𝑆. In the present paper, we do distinguish between the input space 𝒰 and the output space 𝒴 and the point is to pick out a unique (to the extent possible in the multivariable setting) choice of state space and realization (among all that are equivalent in an appropriate sense) which gives rise to a transfer-function realization for 𝑆 in the desired class of realizations.

1.3. The plan of the paper

The present paper is organized as follows. Following this Introduction, in Section 2 we introduce the modifications (weakly coisometric, weakly isometric, weakly unitary realization) of the notions of coisometric, isometric and unitary realizations appearing in Theorems 1.3, 1.4, 1.6 which have proved useful for a viable theory in


the polydisk setting. Section 3 then recalls the analog of Theorem 1.3 for the polydisk setting from [17], namely, the characterization of a weakly coisometric canonical functional-model realization for a Schur-Agler class function on the polydisk. Section 4 applies the duality argument to arrive at the parallel result concerning weakly isometric canonical functional-model realizations, and the ﬁnal Section 5 gives the complete results concerning weakly unitary canonical functional-model realizations for a Schur-Agler-class function on the polydisk.

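2. Preliminaries

Here we review a few preliminaries from [28] concerning colligation matrices realizing Schur-Agler-class functions over the polydisk. We say that a colligation U of the form (1.13) is unitarily equivalent to a colligation
\[ \mathbf{U}' = \begin{bmatrix} A' & B' \\ C' & D' \end{bmatrix} : \begin{bmatrix} \oplus_{i=1}^d \mathcal{X}_i' \\ \mathcal{U} \end{bmatrix} \to \begin{bmatrix} \oplus_{i=1}^d \mathcal{X}_i' \\ \mathcal{Y} \end{bmatrix} \tag{2.1} \]
if there are unitary operators $U_i : \mathcal{X}_i \to \mathcal{X}_i'$ such that
\[ \begin{bmatrix} \oplus_{i=1}^d U_i & 0 \\ 0 & I_{\mathcal{Y}} \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} A' & B' \\ C' & D' \end{bmatrix} \begin{bmatrix} \oplus_{i=1}^d U_i & 0 \\ 0 & I_{\mathcal{U}} \end{bmatrix}. \tag{2.2} \]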

The definition of unitary equivalence is certainly structured: the block-diagonal structure of the unitary operator $\oplus_{i=1}^d U_i$ is predetermined by the structure (1.15) of 𝑍. Equality (2.2) is what we need to guarantee (as in the univariate case) that the transfer functions of two unitarily equivalent colligations coincide. We next extend Definition 1.2 to the polydisk setting.

Definition 2.1. Let $P_{\mathcal{X}_i}$ denote the orthogonal projection of $\mathcal{X} = \oplus_{i=1}^d \mathcal{X}_i$ onto $\mathcal{X}_i$. The structured colligation (1.13) is called observable if
\[ \mathcal{X}_i = \bigvee_{\zeta \in \mathbb{D}^d,\ y \in \mathcal{Y}} P_{\mathcal{X}_i} (I - A^* Z(\zeta)^*)^{-1} C^* y \quad \text{for } i = 1, \dots, d. \]
It is called controllable if
\[ \mathcal{X}_i = \bigvee_{\zeta \in \mathbb{D}^d,\ u \in \mathcal{U}} P_{\mathcal{X}_i} (I - A Z(\zeta))^{-1} B u \quad \text{for } i = 1, \dots, d, \]
and it is called closely connected if
\[ \mathcal{X}_i = \bigvee \bigl\{ P_{\mathcal{X}_i} (I - A^* Z(\zeta)^*)^{-1} C^* y,\ P_{\mathcal{X}_i} (I - A Z(\zeta))^{-1} B u : \zeta \in \mathbb{D}^d,\ y \in \mathcal{Y},\ u \in \mathcal{U} \bigr\} \quad \text{for } i = 1, \dots, d. \]

In analogy with the univariate case, a realization of the form (1.14) is called coisometric, isometric, unitary or contractive if the operator U is respectively, coisometric, isometric, unitary or just contractive. Note that Statement (6) in Theorem 1.7 concerning contractive realizations does not appear in [2, 3, 28]; however its equivalence to statements (1)–(5) is quite simple (see [17]). It turns out that a more useful analog of "isometric" and "coisometric" realizations appearing


in the classical univariate case is not that the whole connecting operator U (or U∗) be isometric, but rather that U (respectively, U∗) be isometric on a certain canonical subspace of 𝒳 ⊕ 𝒰 (of 𝒳 ⊕ 𝒴, respectively). As it is natural to restrict to contractive connecting operators U to realize contractive functions 𝑆, we maintain the restriction that connecting operators be contractive.

Definition 2.2. The colligation U of the form (1.13) is called
1. weakly coisometric if the adjoint U∗ is contractive as an operator from 𝒳 ⊕ 𝒴 to 𝒳 ⊕ 𝒰 and is isometric when restricted to the subspace 𝒟 ⊕ 𝒴; here 𝒟 ⊂ 𝒳 is given by
\[ \mathcal{D} := \bigvee_{\zeta \in \mathbb{D}^d,\ y \in \mathcal{Y}} Z(\zeta)^* (I - A^* Z(\zeta)^*)^{-1} C^* y \subset \mathcal{X}; \tag{2.3} \]
2. weakly isometric if U is contractive as an operator from 𝒳 ⊕ 𝒰 to 𝒳 ⊕ 𝒴 and is isometric when restricted to the subspace $\tilde{\mathcal{D}} \oplus \mathcal{U}$; here the subspace $\tilde{\mathcal{D}} \subset \mathcal{X}$ is given by
\[ \tilde{\mathcal{D}} := \bigvee_{\zeta \in \mathbb{D}^d,\ u \in \mathcal{U}} Z(\zeta) (I - A Z(\zeta))^{-1} B u \subset \mathcal{X}; \tag{2.4} \]
3. weakly unitary if it is simultaneously weakly isometric and weakly coisometric.

The notions of weakly coisometric/isometric/unitary colligations do not appear in the single-variable context for the simple reason that if U is observable (controllable, closely connected), then a weakly coisometric (weakly isometric, weakly unitary) colligation is automatically coisometric (isometric, unitary).

Remark 2.3. From the identity (1.16) we see that what is needed for the validity of the identity
\[ I - S(z)S(\zeta)^* = C(I - Z(z)A)^{-1} (I - Z(z)Z(\zeta)^*) (I - A^* Z(\zeta)^*)^{-1} C^* \tag{2.5} \]
is that U∗ be isometric when restricted to the subspace
\[ \mathcal{D}_0 := \bigvee_{\zeta \in \mathbb{C}^d,\ y \in \mathcal{Y}} \begin{bmatrix} Z(\zeta)^* (I - A^* Z(\zeta)^*)^{-1} C^* \\ I \end{bmatrix} y \subset \begin{bmatrix} \mathcal{X} \\ \mathcal{Y} \end{bmatrix}; \]
by specializing 𝜁 to 𝜁 = 0 in $\mathbb{C}^d$, we see that $\mathcal{D}_0$ contains $\begin{bmatrix} \{0\} \\ \mathcal{Y} \end{bmatrix}$ as a subspace and hence $\mathcal{D}_0$ coincides with 𝒟 ⊕ 𝒴 where 𝒟 is as in (2.3). We conclude that the weakly coisometric property is exactly what is needed for the identity (2.5) to hold. By a similar analysis working with the identity (1.17), we see that the weak isometric property is what we need for the validity of the dual decomposition
\[ I - S(z)^* S(\zeta) = B^* (I - Z(z)^* A^*)^{-1} (I - Z(z)^* Z(\zeta)) (I - A Z(\zeta))^{-1} B. \tag{2.6} \]

Remark 2.4. A perusal of the formulas for the operators $A$ and $B$ in Theorem 1.3 (see formulas (1.6)) reveals the key role of the difference-quotient transformation
\[
R \colon w(z) \mapsto \frac{w(z) - w(0)}{z}
\]
acting on a function analytic at 0. A key property of the space $\mathcal{H}(K_S)$ in Theorem 1.3 is that it is invariant under the difference-quotient transformation $R$. It has been recognized for some time now that the multivariable analog of the difference-quotient transformation is any solution of the Gleason problem (see [46, 5, 6, 9]). Given a space $\mathcal{H}$ of functions $h$ which are holomorphic in a neighborhood of the origin in $d$-dimensional complex Euclidean space $\mathbb{C}^d$, we say that the operators $R_1,\dots,R_d$ mapping $\mathcal{H}$ into itself solve the Gleason problem for $\mathcal{H}$ if every function $h\in\mathcal{H}$ has a decomposition of the form
\[
h(z) - h(0) = \sum_{k=1}^{d} z_k\,[R_k h](z).
\]

We shall see that solutions of more structured Gleason problems enter into the deﬁnition of the colligation matrices for canonical functional models in the polydisk setting discussed below.
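As a concrete illustration of the (unstructured) Gleason problem of Remark 2.4, the following Python sketch builds one solution $(R_1, R_2)$ on polynomials in two variables and checks the decomposition numerically. The dictionary representation of polynomials and the names `gleason_split` and `evaluate` are assumptions introduced only for this illustration. The rule used here (every monomial divisible by $z_1$ is assigned to $R_1$) is just one possible choice; assigning mixed monomials to $R_2$ instead gives a different solution.

```python
def gleason_split(h):
    """Return (R1 h, R2 h) for a polynomial h in two variables.

    h is a dict mapping exponent pairs (i, j) to coefficients, i.e.
    h(z) = sum of c * z1**i * z2**j.  The constant term is dropped,
    since the Gleason identity concerns h(z) - h(0).
    """
    r1, r2 = {}, {}
    for (i, j), c in h.items():
        if i >= 1:
            # Monomial divisible by z1: factor out one power of z1.
            r1[(i - 1, j)] = r1.get((i - 1, j), 0) + c
        elif j >= 1:
            # Pure power of z2: factor out one power of z2.
            r2[(0, j - 1)] = r2.get((0, j - 1), 0) + c
    return r1, r2


def evaluate(p, z1, z2):
    """Evaluate a polynomial given in the dict representation."""
    return sum(c * z1**i * z2**j for (i, j), c in p.items())


# Hypothetical test polynomial h(z) = 3 + 2 z1 - z1 z2 + 5 z2^2 + 4 z1^2 z2.
h = {(0, 0): 3, (1, 0): 2, (1, 1): -1, (0, 2): 5, (2, 1): 4}
r1, r2 = gleason_split(h)

z1, z2 = 0.3 + 0.2j, -0.1 + 0.4j
lhs = evaluate(h, z1, z2) - evaluate(h, 0, 0)
rhs = z1 * evaluate(r1, z1, z2) + z2 * evaluate(r2, z1, z2)
print(abs(lhs - rhs))  # difference between the two sides of the Gleason identity
```

The printed quantity should vanish up to rounding error, reflecting $h(z) - h(0) = z_1[R_1h](z) + z_2[R_2h](z)$.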

3. Weakly coisometric realizations

For every function $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ with a fixed left Agler decomposition (1.9), one can construct a weakly coisometric realization in a certain canonical way; we now recall the construction from [17]. Let $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ be given along with the kernel collection $\{K_1^L,\dots,K_d^L\}$ providing the left Agler decomposition (1.9) for $S$. We say that an operator-block matrix $A = [A_{ij}]_{i,j=1}^d$ acting on $\bigoplus_{i=1}^d\mathcal{H}(K_i^L)$ solves the structured Gleason problem for the kernel collection $\{K_1^L,\dots,K_d^L\}$ if the identity
\[
f_1(z) + \cdots + f_d(z) - [f_1(0) + \cdots + f_d(0)] = \sum_{i=1}^{d} z_i\,[Af]_i(z) \tag{3.1}
\]
holds for all $f = \bigoplus_{i=1}^d f_i \in \bigoplus_{i=1}^d\mathcal{H}(K_i^L)$, where we write
\[
Af(z) = \begin{bmatrix} [Af]_1(z) \\ \vdots \\ [Af]_d(z) \end{bmatrix} \in \bigoplus_{i=1}^d\mathcal{H}(K_i^L).
\]
(We refer back to Remark 2.4 for a discussion of the “unstructured” Gleason problem.) We say that the operator $B\colon\mathcal{U}\to\bigoplus_{i=1}^d\mathcal{H}(K_i^L)$ solves the structured Gleason problem for $S$ if the identity
\[
S(z)u - S(0)u = \sum_{i=1}^{d} z_i\,[Bu]_i(z) \quad\text{holds for all } u\in\mathcal{U}. \tag{3.2}
\]

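Definition 3.1. Given $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$, we shall say that the block-operator matrix
\[
\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \colon \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i^L) \\ \mathcal{U} \end{bmatrix} \to \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i^L) \\ \mathcal{Y} \end{bmatrix} \tag{3.3}
\]
is a canonical functional-model (abbreviated to c.f.m. in what follows) colligation associated with left Agler decomposition (1.9) for $S$ if
1. $\mathbf{U}$ is contractive.
2. $A\colon \bigoplus_{i=1}^d\mathcal{H}(K_i^L)\to\bigoplus_{i=1}^d\mathcal{H}(K_i^L)$ solves the structured Gleason problem (3.1).
3. $B\colon\mathcal{U}\to\bigoplus_{i=1}^d\mathcal{H}(K_i^L)$ solves the structured Gleason problem (3.2) for $S$.
4. The operators $C\colon\bigoplus_{i=1}^d\mathcal{H}(K_i^L)\to\mathcal{Y}$ and $D\colon\mathcal{U}\to\mathcal{Y}$ are given by
\[
C\colon f(z)\mapsto f_1(0) + \cdots + f_d(0), \qquad D\colon u\mapsto S(0)u. \tag{3.4}
\]

Equalities (3.1), (3.2) and (3.4) can be equivalently reformulated in terms of adjoint operators $A^*$, $B^*$, $C^*$ and $D^*$ as follows. With a given left Agler decomposition $\{K_1^L,\dots,K_d^L\}$ of a function $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$, we associate the kernel
\[
\mathbb{T}^L(z,\zeta) := \begin{bmatrix} K_1^L(z,\zeta) \\ \vdots \\ K_d^L(z,\zeta) \end{bmatrix} \colon \mathbb{D}^d\times\mathbb{D}^d\to\mathcal{L}(\mathcal{Y},\mathcal{Y}^d) \tag{3.5}
\]
and the subspace
\[
\mathcal{D} = \bigvee_{\zeta\in\mathbb{D}^d,\ y\in\mathcal{Y}} Z(\zeta)^*\mathbb{T}^L(\cdot,\zeta)y = \bigvee_{\zeta\in\mathbb{D}^d,\ y\in\mathcal{Y}} \begin{bmatrix} \bar\zeta_1 K_1^L(\cdot,\zeta)y \\ \vdots \\ \bar\zeta_d K_d^L(\cdot,\zeta)y \end{bmatrix} \subset \bigoplus_{i=1}^d\mathcal{H}(K_i^L). \tag{3.6}
\]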

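Proposition 3.2. Given $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$, the block-operator matrix $\mathbf{U}$ of the form (3.3) is a c.f.m. colligation associated with left Agler decomposition $\{K_1^L,\dots,K_d^L\}$ for $S$ if and only if $\mathbf{U}$ is contractive and
\[
\mathbf{U}^* = \begin{bmatrix} A^* & C^* \\ B^* & D^* \end{bmatrix} \colon \begin{bmatrix} Z(\zeta)^*\mathbb{T}^L(\cdot,\zeta)y \\ y \end{bmatrix} \mapsto \begin{bmatrix} \mathbb{T}^L(\cdot,\zeta)y \\ S(\zeta)^*y \end{bmatrix} \tag{3.7}
\]
where $\mathbb{T}^L$ is defined in (3.5).

Proof. It follows by straightforward inner-product calculations (see Proposition 3.4 and Remark 3.6 in [17] for details) that identities (3.1), (3.2) and equalities (3.4) are equivalent respectively to
\[
A^*\colon Z(\zeta)^*\mathbb{T}^L(\cdot,\zeta)y \mapsto \mathbb{T}^L(\cdot,\zeta)y - \mathbb{T}^L(\cdot,0)y, \tag{3.8}
\]
\[
B^*\colon Z(\zeta)^*\mathbb{T}^L(\cdot,\zeta)y \mapsto S(\zeta)^*y - S(0)^*y, \tag{3.9}
\]
\[
C^*\colon y\mapsto \mathbb{T}^L(\cdot,0)y, \qquad D^*\colon y\mapsto S(0)^*y. \tag{3.10}
\]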

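It remains to show that the last four equalities are equivalent to (3.7). Indeed, substituting the first and the second equality from (3.10) into (3.8) and (3.9) respectively, we get
\[
A^*Z(\zeta)^*\mathbb{T}^L(\cdot,\zeta)y + C^*y = \mathbb{T}^L(\cdot,\zeta)y, \qquad B^*Z(\zeta)^*\mathbb{T}^L(\cdot,\zeta)y + D^*y = S(\zeta)^*y,
\]
which express equalities of the top and of the bottom components in (3.7). Conversely, upon setting $\zeta = 0$ in (3.7) and taking into account that $Z(0) = 0$ we get equalities (3.10). Substituting (3.10) back into (3.7) and comparing the top and the bottom components we arrive at (3.8) and (3.9). □

On the other hand, given an $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ with a fixed left Agler decomposition $\{K_1^L,\dots,K_d^L\}$, one can rearrange identity (1.9) as
\[
\sum_{i=1}^{d} z_i\bar\zeta_i K_i^L(z,\zeta) + I_{\mathcal{Y}} = \sum_{i=1}^{d} K_i^L(z,\zeta) + S(z)S(\zeta)^*
\]
and write the latter in the inner product form as
\[
\sum_{i=1}^{d}\bigl\langle \bar\zeta_i K_i^L(\cdot,\zeta)y,\ \bar z_i K_i^L(\cdot,z)y'\bigr\rangle_{\mathcal{H}(K_i^L)} + \langle y, y'\rangle_{\mathcal{Y}} = \sum_{i=1}^{d}\bigl\langle K_i^L(\cdot,\zeta)y,\ K_i^L(\cdot,z)y'\bigr\rangle_{\mathcal{H}(K_i^L)} + \langle S(\zeta)^*y,\ S(z)^*y'\rangle_{\mathcal{U}},
\]
or equivalently, as
\[
\left\langle \begin{bmatrix} Z(\zeta)^*\mathbb{T}^L(\cdot,\zeta)y \\ y \end{bmatrix},\ \begin{bmatrix} Z(z)^*\mathbb{T}^L(\cdot,z)y' \\ y' \end{bmatrix} \right\rangle_{\bigoplus_{i=1}^d\mathcal{H}(K_i^L)\oplus\mathcal{Y}} = \left\langle \begin{bmatrix} \mathbb{T}^L(\cdot,\zeta)y \\ S(\zeta)^*y \end{bmatrix},\ \begin{bmatrix} \mathbb{T}^L(\cdot,z)y' \\ S(z)^*y' \end{bmatrix} \right\rangle_{\bigoplus_{i=1}^d\mathcal{H}(K_i^L)\oplus\mathcal{U}}
\]
where $\mathbb{T}^L$ is given in (3.5). It follows from the last identity that the linear map
\[
V\colon \begin{bmatrix} Z(\zeta)^*\mathbb{T}^L(\cdot,\zeta)y \\ y \end{bmatrix} \mapsto \begin{bmatrix} \mathbb{T}^L(\cdot,\zeta)y \\ S(\zeta)^*y \end{bmatrix} \tag{3.11}
\]
extends to the isometry from
\[
\mathcal{D}_V = \bigvee_{\zeta\in\mathbb{D}^d,\ y\in\mathcal{Y}} \begin{bmatrix} Z(\zeta)^*\mathbb{T}^L(\cdot,\zeta)y \\ y \end{bmatrix} \subset \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i^L) \\ \mathcal{Y} \end{bmatrix}
\]
onto
\[
\mathcal{R}_V = \bigvee_{\zeta\in\mathbb{D}^d,\ y\in\mathcal{Y}} \begin{bmatrix} \mathbb{T}^L(\cdot,\zeta)y \\ S(\zeta)^*y \end{bmatrix} \subset \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i^L) \\ \mathcal{U} \end{bmatrix}.
\]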

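By the same argument as used in Remark 2.3 for the splitting of the subspace $\mathcal{D}_0$ there, we see that the subspace $\mathcal{D}_V$ splits as a direct sum $\mathcal{D}_V = \mathcal{D}\oplus\mathcal{Y}$ where $\mathcal{D}$ is given in (3.6) and that the defect spaces
\[
\mathcal{D}_V^\perp := \Bigl(\bigoplus_{i=1}^d\mathcal{H}(K_i^L)\oplus\mathcal{Y}\Bigr)\ominus\mathcal{D}_V \cong \mathcal{D}^\perp \quad\text{and}\quad \mathcal{R}_V^\perp := \Bigl(\bigoplus_{i=1}^d\mathcal{H}(K_i^L)\oplus\mathcal{U}\Bigr)\ominus\mathcal{R}_V
\]
can be characterized as
\[
\mathcal{D}^\perp = \Bigl\{ f = \bigoplus_{i=1}^d f_i \in \bigoplus_{i=1}^d\mathcal{H}(K_i^L) \colon \sum_{i=1}^{d} z_i f_i(z) \equiv 0 \Bigr\}, \tag{3.12}
\]
\[
\mathcal{R}_V^\perp = \Bigl\{ \begin{bmatrix} f \\ u \end{bmatrix} \in \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i^L) \\ \mathcal{U} \end{bmatrix} \colon \sum_{i=1}^{d} f_i(z) + S(z)u \equiv 0 \Bigr\}. \tag{3.13}
\]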

Combining (3.11) with Proposition 3.2 we arrive at the following.

Lemma 3.3. Given a left Agler decomposition $\{K_1^L,\dots,K_d^L\}$ for a function $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$, let $V$ be the isometric operator associated with this decomposition as in (3.11). A block-operator matrix $\mathbf{U}$ of the form (3.3) is a c.f.m. colligation associated with $\{K_1^L,\dots,K_d^L\}$ if and only if $\mathbf{U}^*$ is a contractive extension of $V$ to all of $\bigl(\bigoplus_{i=1}^d\mathcal{H}(K_i^L)\bigr)\oplus\mathcal{Y}$, i.e.,
\[
\mathbf{U}^*|_{\mathcal{D}\oplus\mathcal{Y}} = V \quad\text{and}\quad \|\mathbf{U}^*\|\le 1. \tag{3.14}
\]

The following theorem is the multivariable counterpart of Theorem 1.3 for the polydisk setting. The first statement is an immediate consequence of Lemma 3.3 and we refer to [17, Theorem 3.9] for the rest.

Theorem 3.4. Let $S$ be a function in the Schur-Agler class $\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ with given left Agler decomposition $\{K_1^L,\dots,K_d^L\}$. Then
1. There exists a c.f.m. colligation $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ associated with $\{K_1^L,\dots,K_d^L\}$.
2. Every c.f.m. colligation $\mathbf{U}$ associated with $\{K_1^L,\dots,K_d^L\}$ is weakly coisometric and observable and furthermore, $S(z) = D + C(I - Z(z)A)^{-1}Z(z)B$.
3. Any observable weakly coisometric colligation $\mathbf{U}'$ of the form (2.1) with its transfer function equal to $S$ is unitarily equivalent to some c.f.m. colligation $\mathbf{U}$ for $S$.

We next describe all c.f.m. colligations associated with a given left Agler decomposition (1.9) of a function $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$. By Lemma 3.3, it suffices to describe all solutions $\mathbf{U}^*$ of the contractive completion problem (3.14). Identifying $\begin{bmatrix} \mathcal{D}^\perp \\ \mathcal{D}\oplus\mathcal{Y} \end{bmatrix}$ with $\begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i^L) \\ \mathcal{Y} \end{bmatrix}$ we then can represent $\mathbf{U}^*$ from (3.14) as
\[
\mathbf{U}^* = \begin{bmatrix} X & V \end{bmatrix} \colon \begin{bmatrix} \mathcal{D}^\perp \\ \mathcal{D}\oplus\mathcal{Y} \end{bmatrix} \to \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i^L) \\ \mathcal{U} \end{bmatrix} \tag{3.15}
\]
where $X = \begin{bmatrix} A^*|_{\mathcal{D}^\perp} \\ B^*|_{\mathcal{D}^\perp} \end{bmatrix} \colon \mathcal{D}^\perp \to \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i^L) \\ \mathcal{U} \end{bmatrix}$ is unknown.

Theorem 3.5. Let $\{K_1^L,\dots,K_d^L\}$ be a fixed Agler decomposition of a given function $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$. Let $V$ be the associated isometry defined in (3.11) with the defect spaces $\mathcal{D}^\perp$ and $\mathcal{R}_V^\perp$ defined in (3.12), (3.13). Then all c.f.m. colligations associated with $\{K_1^L,\dots,K_d^L\}$ are described by formula (3.15) where $X$ is an arbitrary contraction from $\mathcal{D}^\perp$ into $\mathcal{R}_V^\perp$. Moreover,

1. There exists an isometric c.f.m. colligation $\mathbf{U}$ associated with $\{K_1^L,\dots,K_d^L\}$ if and only if $\dim\mathcal{D}^\perp \ge \dim\mathcal{R}_V^\perp$. All such colligations are of the form (3.15) where $X\colon\mathcal{D}^\perp\to\mathcal{R}_V^\perp$ is a coisometry.
2. There exists a coisometric c.f.m. colligation $\mathbf{U}$ associated with $\{K_1^L,\dots,K_d^L\}$ if and only if $\dim\mathcal{D}^\perp \le \dim\mathcal{R}_V^\perp$. All such colligations are of the form (3.15) where $X\colon\mathcal{D}^\perp\to\mathcal{R}_V^\perp$ is an isometry.
3. There exists a unitary c.f.m. colligation $\mathbf{U}$ associated with $\{K_1^L,\dots,K_d^L\}$ if and only if $\dim\mathcal{D}^\perp = \dim\mathcal{R}_V^\perp$. All such colligations are of the form (3.15) where $X\colon\mathcal{D}^\perp\to\mathcal{R}_V^\perp$ is unitary.

Proof. The operator $\mathbf{U}^*$ of the form (3.15) is a contraction if and only if $XX^* \le I - VV^* = P_{\mathcal{R}_V^\perp}$ which in turn is equivalent to $X$ being a contraction from $\mathcal{D}^\perp$ into $\mathcal{R}_V^\perp$. The operator $\mathbf{U}^*$ is coisometric (that is, $\mathbf{U}$ is isometric) if and only if $XX^* = I - VV^* = P_{\mathcal{R}_V^\perp}$, and statement (1) follows. Furthermore, $\mathbf{U}^*$ of the form (3.15) is an isometry (that is, $\mathbf{U}$ is a coisometry) if and only if
\[
V^*V = I_{\mathcal{D}\oplus\mathcal{Y}}, \qquad V^*X = 0 \quad\text{and}\quad X^*X = I_{\mathcal{D}^\perp}. \tag{3.16}
\]
The first equality in (3.16) holds since $V$ is isometric and the second equality holds since the ranges of $V$ and of $X$ are orthogonal. Thus, $\mathbf{U}^*$ of the form (3.15) is isometric if and only if $X$ is isometric, which completes the proof of (2). Statement (3) follows from (1) and (2). □

Example 3.6. If $S_1, S_2\colon\mathbb{D}\to\mathbb{D}$ are Schur-class functions, then the function $S(z_1,z_2) = S_1(z_1)\cdot S_2(z_2)$ belongs to the class $\mathcal{SA}_2(\mathbb{C},\mathbb{C})$. Making use of coisometric functional-model realizations
\[
S_i(z_i) = \widetilde D_i + z_i\widetilde C_i\bigl(I_{\mathcal{H}(K_{S_i})} - z_i\widetilde A_i\bigr)^{-1}\widetilde B_i \qquad (i = 1, 2)
\]
for $S_1$ and $S_2$ provided by Theorem 1.3, one can easily get a coisometric realization (1.14) for $S$ with the state space equal to $\mathcal{H}(K_{S_1})\oplus\mathcal{H}(K_{S_2})$ and with operators $A$, $B$, $C$, $D$ given by
\[
A = \begin{bmatrix} \widetilde A_1 & \widetilde B_1\widetilde C_2 \\ 0 & \widetilde A_2 \end{bmatrix}, \quad B = \begin{bmatrix} \widetilde B_1\widetilde D_2 \\ \widetilde B_2 \end{bmatrix}, \quad C = \begin{bmatrix} \widetilde C_1 & \widetilde D_1\widetilde C_2 \end{bmatrix}, \quad D = \widetilde D_1\widetilde D_2.
\]
However, this realization is not a c.f.m. in the sense of Definition 3.1 since the operator $C$ is not of the requested form (3.4) (unless $\widetilde D_1 = S_1(0) = 0$).

To construct a c.f.m. realization, we should choose an Agler decomposition for $S$. One possible decomposition is the following:
\[
1 - S(z)S(\zeta)^* = (1 - z_1\bar\zeta_1)K_1(z,\zeta) + (1 - z_2\bar\zeta_2)K_2(z,\zeta), \tag{3.17}
\]
where
\[
K_1(z,\zeta) = K_{S_1}(z_1,\zeta_1), \qquad K_2(z,\zeta) = S_1(z_1)\overline{S_1(\zeta_1)}\,K_{S_2}(z_2,\zeta_2) \tag{3.18}
\]

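and the kernels $K_{S_i}$ are defined as in (1.4). Thus, $\mathcal{H}(K_1) = \mathcal{H}(K_{S_1})$ and $\mathcal{H}(K_2) = S_1\mathcal{H}(K_{S_2})$. Since the identity $z_1h_1(z_1) + z_2S_1(z_1)h_2(z_2) \equiv 0$ implies $h_1\equiv h_2\equiv 0$, the subspace $\mathcal{D}^\perp$ (3.12) is trivial and therefore, there exists a unique c.f.m. colligation associated with decomposition (3.17). Explicit formulas for the entries $A$ and $B$ in terms of their adjoints follow (upon some straightforward manipulations) from (3.7):
\[
\begin{aligned}
A_{11}^* &\colon K_{S_1}(\cdot,\zeta_1) \mapsto \frac{K_{S_1}(\cdot,\zeta_1) - K_{S_1}(\cdot,0)}{\bar\zeta_1}, \qquad A_{21}^* = 0,\\
A_{12}^* &\colon K_{S_1}(\cdot,\zeta_1) \mapsto S_1\cdot K_{S_2}(\cdot,0)\cdot\frac{\overline{S_1(\zeta_1)} - \overline{S_1(0)}}{\bar\zeta_1},\\
A_{22}^* &\colon S_1K_{S_2}(\cdot,\zeta_2) \mapsto S_1\cdot\frac{K_{S_2}(\cdot,\zeta_2) - K_{S_2}(\cdot,0)}{\bar\zeta_2},\\
B_1^* &\colon K_{S_1}(\cdot,\zeta_1) \mapsto \frac{\overline{S_1(\zeta_1)} - \overline{S_1(0)}}{\bar\zeta_1}\cdot\overline{S_2(0)},\\
B_2^* &\colon S_1K_{S_2}(\cdot,\zeta_2) \mapsto \frac{\overline{S_2(\zeta_2)} - \overline{S_2(0)}}{\bar\zeta_2}
\end{aligned} \tag{3.19}
\]
where the unspecified argument is either $z_1$ or $z_2$ depending on the context. Note that this colligation is coisometric and that $A_{11}$ and $A_{22}$ are backward shift operators on $\mathcal{H}(K_1)$ and $\mathcal{H}(K_2)$, respectively.

On the other hand, one can consider a different Agler decomposition (3.17) for $S$ where
\[
K_1(z,\zeta) = \tfrac12 K_{S_1}(z_1,\zeta_1)\cdot\bigl(1 + S_2(z_2)\overline{S_2(\zeta_2)}\bigr), \qquad K_2(z,\zeta) = \tfrac12 K_{S_2}(z_2,\zeta_2)\cdot\bigl(1 + S_1(z_1)\overline{S_1(\zeta_1)}\bigr). \tag{3.20}
\]
With respect to this decomposition, the subspace $\mathcal{D}^\perp$ can be nontrivial, in which case there will exist a family of non-equivalent c.f.m. realizations.

Here is a simple illustrative example. Let $S(z_1,z_2) = z_1z_2$. For the Agler representation (3.17), (3.18) we now have
\[
1 - S(z)S(\zeta)^* = (1 - z_1\bar\zeta_1)\cdot 1 + (1 - z_2\bar\zeta_2)\cdot z_1\bar\zeta_1. \tag{3.21}
\]
The formulas (3.19) now take the form
\[
A_{11}^* = 0, \quad A_{21}^* = 0, \quad A_{22}^* = 0, \quad A_{12}^*\colon 1\mapsto z_1, \quad B_1^* = 0, \quad B_2^*\colon z_1\mapsto 1.
\]
With respect to the basis $\left\{\begin{bmatrix}1\\0\\0\end{bmatrix}, \begin{bmatrix}0\\z_1\\0\end{bmatrix}, \begin{bmatrix}0\\0\\1\end{bmatrix}\right\}$ of $\mathcal{H}(K_1)\oplus\mathcal{H}(K_2)\oplus\mathbb{C}$, the matrix of the c.f.m. colligation $\mathbf{U}$ has the form
\[
\mathbf{U} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}
\]
and it is not hard to see that indeed
\[
\begin{bmatrix}1 & 0\end{bmatrix}\left(I_2 - \begin{bmatrix} z_1 & 0 \\ 0 & z_2\end{bmatrix}\begin{bmatrix} 0 & 1 \\ 0 & 0\end{bmatrix}\right)^{-1}\begin{bmatrix} z_1 & 0 \\ 0 & z_2\end{bmatrix}\begin{bmatrix} 0 \\ 1\end{bmatrix} = \begin{bmatrix}1 & 0\end{bmatrix}\begin{bmatrix} 1 & z_1 \\ 0 & 1\end{bmatrix}\begin{bmatrix} 0 \\ z_2\end{bmatrix} = z_1z_2.
\]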
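The transfer-function identity just verified by hand for $S(z) = z_1z_2$ can also be checked numerically. The sketch below is only an illustration: the helper names `solve` and `transfer`, the convention that the last row and column of the matrix carry the output and input, and the sample point are assumptions made here, not part of the construction above.

```python
def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [[complex(v) for v in row] + [complex(rhs)] for row, rhs in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def transfer(U, dims, z):
    """Evaluate S(z) = D + C (I - Z(z) A)^{-1} Z(z) B for scalar input/output.

    U is the (n+1) x (n+1) colligation matrix [[A, B], [C, D]];
    dims[i] is the dimension of the i-th state-space block, so that
    Z(z) = diag(z_1 I, ..., z_d I) with blocks of those sizes.
    """
    n = len(U) - 1
    zdiag = [zi for zi, m in zip(z, dims) for _ in range(m)]
    A = [row[:n] for row in U[:n]]
    B = [row[n] for row in U[:n]]
    C = U[n][:n]
    D = U[n][n]
    M = [[(1 if r == c else 0) - zdiag[r] * A[r][c] for c in range(n)]
         for r in range(n)]
    x = solve(M, [zdiag[r] * B[r] for r in range(n)])
    return D + sum(C[c] * x[c] for c in range(n))


# Colligation matrix of Example 3.6 for the decomposition (3.21);
# both state-space blocks H(K_1) and H(K_2) are one-dimensional.
U = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]

z = (0.3 + 0.4j, -0.2 + 0.5j)   # arbitrary sample point in the bidisk
value = transfer(U, (1, 1), z)
print(abs(value - z[0] * z[1]))
```

The printed quantity is the deviation of the computed transfer function from $z_1z_2$ at the sample point and should be zero up to rounding.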

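For the alternative Agler decomposition of $S(z) = z_1z_2$ in accordance with (3.20) we use the kernels
\[
K_1(z,\zeta) = \tfrac12\cdot(1 + z_2\bar\zeta_2) \quad\text{and}\quad K_2(z,\zeta) = \tfrac12\cdot(1 + z_1\bar\zeta_1).
\]
We fix orthonormal bases $\{\tfrac{1}{\sqrt2}, \tfrac{z_2}{\sqrt2}\}$, $\{\tfrac{1}{\sqrt2}, \tfrac{z_1}{\sqrt2}\}$ and $\{1\}$ for $\mathcal{H}(K_1)$, $\mathcal{H}(K_2)$ and $\mathbb{C}$, respectively. Note that the subspaces $\mathcal{D}^\perp$ (3.12) and $\mathcal{R}_V^\perp$ are given by
\[
\mathcal{D}^\perp = \operatorname{span}\left\{\begin{bmatrix} z_2 \\ -z_1\end{bmatrix}\right\}, \qquad \mathcal{R}_V^\perp = \operatorname{span}\left\{\begin{bmatrix} 1 \\ -1 \\ 0\end{bmatrix}\right\}. \tag{3.22}
\]
In particular both $\mathcal{D}^\perp$ and $\mathcal{R}_V^\perp$ are nontrivial and of the same dimension 1.

Formulas (3.10) give $D^* = 0$ and $C^*\colon 1\mapsto\begin{bmatrix} 1/2 \\ 1/2\end{bmatrix}\in\begin{bmatrix}\mathcal{H}(K_1)\\ \mathcal{H}(K_2)\end{bmatrix}$. Therefore the matrix representations of $D$ and $C$ with respect to the fixed choice of bases are
\[
D = 0, \qquad C = \begin{bmatrix} \tfrac{1}{\sqrt2} & 0 & \tfrac{1}{\sqrt2} & 0\end{bmatrix}. \tag{3.23}
\]
Formulas (3.8), (3.9) amount to
\[
\begin{bmatrix} A_{11}^* & A_{21}^* \\ A_{12}^* & A_{22}^* \\ B_1^* & B_2^*\end{bmatrix} \colon \begin{bmatrix} \bar\zeta_1(1 + z_2\bar\zeta_2) \\ \bar\zeta_2(1 + z_1\bar\zeta_1)\end{bmatrix} \mapsto \begin{bmatrix} z_2\bar\zeta_2 \\ z_1\bar\zeta_1 \\ 2\bar\zeta_1\bar\zeta_2\end{bmatrix}. \tag{3.24}
\]
Letting $\zeta_1 = 0$, $\zeta_2\ne 0$ and then $\zeta_2 = 0$, $\zeta_1\ne 0$ we get the action of all operators in (3.24) on constant functions:
\[
\begin{aligned}
A_{11}^* &\colon 1\mapsto 0, & A_{12}^* &\colon 1\mapsto z_1, & B_1^* &\colon 1\mapsto 0,\\
A_{21}^* &\colon 1\mapsto z_2, & A_{22}^* &\colon 1\mapsto 0, & B_2^* &\colon 1\mapsto 0.
\end{aligned} \tag{3.25}
\]
Substituting these actions back into (3.24) we get the additional relation
\[
\begin{bmatrix} A_{11}^* & A_{21}^* \\ A_{12}^* & A_{22}^* \\ B_1^* & B_2^*\end{bmatrix} \colon \begin{bmatrix} z_2/\sqrt2 \\ z_1/\sqrt2\end{bmatrix} \mapsto \begin{bmatrix} 0 \\ 0 \\ \sqrt2\end{bmatrix}. \tag{3.26}
\]
From the characterization of the spaces $\mathcal{D}^\perp$ and $\mathcal{R}_V^\perp$ in (3.22), we see that the only freedom in the c.f.m. colligation $\mathbf{U}$ associated with this Agler decomposition is given by
\[
\begin{bmatrix} A_{11}^* & A_{21}^* \\ A_{12}^* & A_{22}^* \\ B_1^* & B_2^*\end{bmatrix} \colon \begin{bmatrix} z_2/\sqrt2 \\ -z_1/\sqrt2\end{bmatrix} \mapsto \omega\begin{bmatrix} 1/\sqrt2 \\ -1/\sqrt2 \\ 0\end{bmatrix} \tag{3.27}
\]
where $\omega\in\mathbb{C}$ has $|\omega|\le 1$. When we combine (3.26) and (3.27) we see that
\[
\begin{bmatrix} A_{11}^* \\ A_{12}^* \\ B_1^*\end{bmatrix}\frac{z_2}{\sqrt2} = \begin{bmatrix} A_{11}^* & A_{21}^* \\ A_{12}^* & A_{22}^* \\ B_1^* & B_2^*\end{bmatrix}\left(\frac12\begin{bmatrix} z_2/\sqrt2 \\ z_1/\sqrt2\end{bmatrix} + \frac12\begin{bmatrix} z_2/\sqrt2 \\ -z_1/\sqrt2\end{bmatrix}\right) = \begin{bmatrix} \omega/(2\sqrt2) \\ -\omega/(2\sqrt2) \\ 1/\sqrt2\end{bmatrix}
\]
and, similarly,
\[
\begin{bmatrix} A_{21}^* \\ A_{22}^* \\ B_2^*\end{bmatrix}\frac{z_1}{\sqrt2} = \begin{bmatrix} A_{11}^* & A_{21}^* \\ A_{12}^* & A_{22}^* \\ B_1^* & B_2^*\end{bmatrix}\left(\frac12\begin{bmatrix} z_2/\sqrt2 \\ z_1/\sqrt2\end{bmatrix} - \frac12\begin{bmatrix} z_2/\sqrt2 \\ -z_1/\sqrt2\end{bmatrix}\right) = \begin{bmatrix} -\omega/(2\sqrt2) \\ \omega/(2\sqrt2) \\ 1/\sqrt2\end{bmatrix}.
\]
Combining all these with the formulas (3.25) and (3.23), we now conclude that, with respect to the bases chosen as above, the matrix of the c.f.m. colligation $\mathbf{U}$ has the form
\[
\mathbf{U} = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 \\ \omega/2 & 0 & -\omega/2 & 0 & 1/\sqrt2 \\ 0 & 1 & 0 & 0 & 0 \\ -\omega/2 & 0 & \omega/2 & 0 & 1/\sqrt2 \\ 1/\sqrt2 & 0 & 1/\sqrt2 & 0 & 0\end{bmatrix}. \tag{3.28}
\]
For every choice of $\omega$ with $|\omega|\le 1$ we have
\[
\begin{aligned}
&D + C\left(I_4 - \begin{bmatrix} z_1I_2 & 0 \\ 0 & z_2I_2\end{bmatrix}A\right)^{-1}\begin{bmatrix} z_1I_2 & 0 \\ 0 & z_2I_2\end{bmatrix}B\\
&\quad= \frac12\begin{bmatrix}1 & 0 & 1 & 0\end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & -z_1 \\ -\frac{\omega}{2}z_1 & 1 & \frac{\omega}{2}z_1 & 0 \\ 0 & -z_2 & 1 & 0 \\ \frac{\omega}{2}z_2 & 0 & -\frac{\omega}{2}z_2 & 1\end{bmatrix}^{-1}\begin{bmatrix} 0 \\ z_1 \\ 0 \\ z_2\end{bmatrix}\\
&\quad= \frac{1}{2(1 + \omega z_1z_2)}\begin{bmatrix}1 & 0 & 1 & 0\end{bmatrix}\begin{bmatrix} * & \frac{\omega}{2}z_1z_2^2 & * & z_1(1 + \frac{\omega}{2}z_1z_2) \\ * & * & * & * \\ * & z_2(1 + \frac{\omega}{2}z_1z_2) & * & \frac{\omega}{2}z_1^2z_2 \\ * & * & * & *\end{bmatrix}\begin{bmatrix} 0 \\ z_1 \\ 0 \\ z_2\end{bmatrix}\\
&\quad= \frac{1}{2(1 + \omega z_1z_2)}\begin{bmatrix}1 & 1\end{bmatrix}\begin{bmatrix} \frac{\omega}{2}z_1z_2^2 & z_1(1 + \frac{\omega}{2}z_1z_2) \\ z_2(1 + \frac{\omega}{2}z_1z_2) & \frac{\omega}{2}z_1^2z_2\end{bmatrix}\begin{bmatrix} z_1 \\ z_2\end{bmatrix} = z_1z_2
\end{aligned}
\]
as expected. Note that the realization is weakly coisometric for any choice of $\omega$ with $|\omega|\le 1$ and is unitary when $|\omega| = 1$. Finally, we note that the more general example $S(z) = z_1^mz_2^n$ can be handled in much the same way; we invite the reader to work out the details.

We close this section with some discussion of various loose ends suggested by the results of this section.

3.1. Characterization of left Agler decompositions

We have seen that construction of a c.f.m. for a Schur-Agler-class function $S$ requires knowledge of a left Agler decomposition for $S$. A natural question then is: which collections of kernels $\{K_1^L,\dots,K_d^L\}$ arise as a left Agler decomposition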

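for some Schur-Agler class function $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$? The following result gives an intrinsic, although arguably not particularly easily checkable, characterization of such kernel collections.

Theorem 3.7. Let $\{K_1^L,\dots,K_d^L\}$ be a collection of $d$ $\mathcal{L}(\mathcal{Y})$-valued positive kernels on $\mathbb{D}^d$. Then $\{K_1^L,\dots,K_d^L\}$ is a left Agler decomposition for some Schur-Agler-class function $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ (for some appropriate input space $\mathcal{U}$) if and only if there exists a solution $A = [A_{ij}]_{i,j=1}^d$ of the structured Gleason problem (3.1) which is contractive in the sense that
\[
\sum_{i=1}^{d}\|[Af]_i\|^2_{\mathcal{H}(K_i^L)} \le \sum_{i=1}^{d}\|f_i\|^2_{\mathcal{H}(K_i^L)} - \sum_{i=1}^{d}\|f_i(0)\|^2_{\mathcal{Y}} \tag{3.29}
\]
for all $f = \begin{bmatrix} f_1 \\ \vdots \\ f_d\end{bmatrix} \in \bigoplus_{i=1}^d\mathcal{H}(K_i^L)$. Moreover, if this is the case and if we define $C\colon\bigoplus_{i=1}^d\mathcal{H}(K_i^L)\to\mathcal{Y}$ by
\[
C\colon \begin{bmatrix} f_1 \\ \vdots \\ f_d\end{bmatrix} \mapsto f_1(0) + \cdots + f_d(0), \tag{3.30}
\]
then there exists a choice of operator $\begin{bmatrix} B \\ D\end{bmatrix}$ from $\mathcal{U}$ to $\bigoplus_{i=1}^d\mathcal{H}(K_i^L)\oplus\mathcal{Y}$ so that $\begin{bmatrix} A & B \\ C & D\end{bmatrix}$ is a c.f.m. for $S$.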

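Proof. Necessity is immediate from Theorem 3.4 and the definition of c.f.m. Conversely suppose that we are given a collection $\{K_1^L,\dots,K_d^L\}$ of $\mathcal{L}(\mathcal{Y})$-valued positive kernels over $\mathbb{D}^d$ for which there exists a contractive solution $A = [A_{ij}]_{i,j=1}^d$ of the Gleason problem (3.1). Define the operator $C\colon\bigoplus_{i=1}^d\mathcal{H}(K_i^L)\to\mathcal{Y}$ as in (3.30). By the assumption that $A$ is a contractive solution of the Gleason problem, it follows that the block column matrix $\begin{bmatrix} A \\ C\end{bmatrix}$ is contractive from $\bigoplus_{i=1}^d\mathcal{H}(K_i^L)$ to $\bigl(\bigoplus_{i=1}^d\mathcal{H}(K_i^L)\bigr)\oplus\mathcal{Y}$. We may then construct an operator $\begin{bmatrix} B \\ D\end{bmatrix}$ from an input space $\mathcal{U}$ into $\bigl(\bigoplus_{i=1}^d\mathcal{H}(K_i^L)\bigr)\oplus\mathcal{Y}$ as a solution of the Cholesky factorization problem:
\[
\begin{bmatrix} B \\ D\end{bmatrix}\begin{bmatrix} B^* & D^*\end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I\end{bmatrix} - \begin{bmatrix} A \\ C\end{bmatrix}\begin{bmatrix} A^* & C^*\end{bmatrix}.
\]
It then follows that the colligation matrix $\begin{bmatrix} A & B \\ C & D\end{bmatrix}$ is coisometric. If we define $S(z) = D + C(I - Z(z)A)^{-1}Z(z)B$, then $S$ is in the Schur-Agler class $\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ and the identity (1.16) leads to the representation (2.5) for $I - S(z)S(\zeta)^*$.

It is convenient to introduce the notation $\mathcal{I}_k\colon\mathcal{H}(K_k^L)\to\bigoplus_{i=1}^d\mathcal{H}(K_i^L)$ for the inclusion map of $\mathcal{H}(K_k^L)$ into the direct-sum space $\bigoplus_{i=1}^d\mathcal{H}(K_i^L)$ as the $k$th coordinate with the other coordinates equal to zero. We then have that the adjoint $\mathcal{I}_k^*$ of $\mathcal{I}_k$ is given by
\[
\mathcal{I}_k^*\colon f = \begin{bmatrix} f_1 \\ \vdots \\ f_d\end{bmatrix} \mapsto [f]_k := f_k.
\]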

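In addition we let $P_k$ be the projection operator on $\bigoplus_{i=1}^d\mathcal{H}(K_i^L)$ given by $P_k = \mathcal{I}_k\mathcal{I}_k^*$. We next argue that we recover the kernel $K_k^L(z,\zeta)$ as
\[
K_k^L(z,\zeta) = C(I - Z(z)A)^{-1}P_k(I - A^*Z(\zeta)^*)^{-1}C^*. \tag{3.31}
\]
By the definition of $A$ solving the Gleason problem, we see that, for any $f_k\in\mathcal{H}(K_k^L)$,
\[
f_k(z) = f_k(0) + \sum_{j_1=1}^{d} z_{j_1}[A\mathcal{I}_kf_k]_{j_1}(z).
\]
We then apply the Gleason-problem identity (3.1) to each $\mathcal{I}_{j_1}[A\mathcal{I}_kf_k]_{j_1}$ and iterate to get
\[
\begin{aligned}
f_k(z) &= f_k(0) + \sum_{j_1=1}^{d} z_{j_1}\Biggl([A\mathcal{I}_kf_k]_{j_1}(0) + \sum_{j_2=1}^{d} z_{j_2}\Biggl(\bigl[A\mathcal{I}_{j_1}[A\mathcal{I}_kf_k]_{j_1}\bigr]_{j_2}(0)\\
&\qquad\qquad + \sum_{j_3=1}^{d} z_{j_3}\Biggl(\Bigl[A\mathcal{I}_{j_2}\bigl[A\mathcal{I}_{j_1}[A\mathcal{I}_kf_k]_{j_1}\bigr]_{j_2}\Bigr]_{j_3}(0) + \sum_{j_4=1}^{d} z_{j_4}\cdots\Biggr)\Biggr)\Biggr)\\
&= C\mathcal{I}_kf_k + \sum_{j_1=1}^{d} z_{j_1}\Biggl(CP_{j_1}A\mathcal{I}_kf_k + \sum_{j_2=1}^{d} z_{j_2}\Biggl(CP_{j_2}AP_{j_1}A\mathcal{I}_kf_k\\
&\qquad\qquad + \sum_{j_3=1}^{d} z_{j_3}\Biggl(CP_{j_3}AP_{j_2}AP_{j_1}A\mathcal{I}_kf_k + \sum_{j_4=1}^{d} z_{j_4}\cdots\Biggr)\Biggr)\Biggr)\\
&= \sum_{n=0}^{\infty} C\Bigl(\sum_{i=1}^{d} z_iP_iA\Bigr)^n\mathcal{I}_kf_k = C(I - Z(z)A)^{-1}\mathcal{I}_kf_k
\end{aligned}
\]
which we summarize as
\[
f_k(z) = C(I - Z(z)A)^{-1}\mathcal{I}_kf_k \quad\text{for } f_k\in\mathcal{H}(K_k^L). \tag{3.32}
\]
By the reproducing-kernel property of $K_k^L(\cdot,\zeta)$ for $\mathcal{H}(K_k^L)$, we also know that
\[
\langle f_k, K_k^L(\cdot,\zeta)y\rangle_{\mathcal{H}(K_k^L)} = \langle f_k(\zeta), y\rangle_{\mathcal{Y}} = \langle C(I - Z(\zeta)A)^{-1}\mathcal{I}_kf_k, y\rangle_{\mathcal{Y}} = \langle f_k, \mathcal{I}_k^*(I - A^*Z(\zeta)^*)^{-1}C^*y\rangle_{\mathcal{H}(K_k^L)}
\]
from which we conclude that
\[
K_k^L(\cdot,\zeta)y = \mathcal{I}_k^*(I - A^*Z(\zeta)^*)^{-1}C^*y. \tag{3.33}
\]
Combining (3.33) with the general principle (3.32) applied to the case where $f_k = K_k^L(\cdot,\zeta)y$ then gives us
\[
K_k^L(z,\zeta)y = C(I - Z(z)A)^{-1}\mathcal{I}_k\mathcal{I}_k^*(I - A^*Z(\zeta)^*)^{-1}C^*y \tag{3.34}
\]
and (3.31) follows as wanted.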

Since the colligation matrix $\mathbf{U} = \begin{bmatrix} A & B \\ C & D\end{bmatrix}$ is coisometric by construction, we see from the identity (2.5) that
\[
\begin{aligned}
I - S(z)S(\zeta)^* &= C(I - Z(z)A)^{-1}\Bigl(\sum_{k=1}^{d}(1 - z_k\bar\zeta_k)P_k\Bigr)(I - A^*Z(\zeta)^*)^{-1}C^*\\
&= \sum_{k=1}^{d}(1 - z_k\bar\zeta_k)K_k^L(z,\zeta) \quad\text{(by (3.34))}
\end{aligned}
\]
and we see that we recover $\{K_1^L,\dots,K_d^L\}$ as a left Agler decomposition for $S$.

It remains to verify the final assertion in the statement of the theorem. By construction we see that the colligation matrix $\begin{bmatrix} A & B \\ C & D\end{bmatrix}$ constructed above satisfies properties (1), (2), and (4) in Definition 3.1 for a c.f.m. of $S$. As for property (3), observe that
\[
(S(z) - S(0))u = C(I - Z(z)A)^{-1}Z(z)Bu = \sum_{k=1}^{d} z_kC(I - Z(z)A)^{-1}P_kBu = \sum_{k=1}^{d} z_k[Bu]_k(z)
\]
where we used (3.32) for the last step. This completes the verification that $\begin{bmatrix} A & B \\ C & D\end{bmatrix}$ is a c.f.m. for $S$. □
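The two identities at the heart of this proof, the evaluation formula (3.32) and the kernel recovery formula (3.31), can be illustrated on the simplest case treated in Example 3.6, namely $S(z) = z_1z_2$ with $K_1 = 1$ and $K_2 = z_1\bar\zeta_1$. The sketch below assumes the matrices $A$ and $C$ read off from the $3\times3$ colligation of that example; the helper names `solve` and `row_resolvent` and the sample points are hypothetical choices made only for this check.

```python
def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [[complex(v) for v in row] + [complex(rhs)] for row, rhs in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


# State operator and output operator of the c.f.m. colligation for z1*z2,
# written in the basis {1} of H(K_1) and {z1} of H(K_2) (both one-dimensional).
A = [[0, 1], [0, 0]]
C = [1, 0]


def row_resolvent(z):
    """Return the row vector g(z) = C (I - Z(z) A)^{-1}, Z(z) = diag(z1, z2)."""
    n = len(A)
    # Solve the transposed system (I - Z(z) A)^T g = C^T.
    Mt = [[(1 if r == c else 0) - z[c] * A[c][r] for c in range(n)]
          for r in range(n)]
    return solve(Mt, C)


def kernel(k, z, zeta):
    """K_k(z, zeta) = C (I - Z(z)A)^{-1} P_k (I - A^* Z(zeta)^*)^{-1} C^*.

    With one-dimensional blocks, P_k picks out the k-th coordinate, so the
    kernel is g_k(z) times the complex conjugate of g_k(zeta).
    """
    return row_resolvent(z)[k] * row_resolvent(zeta)[k].conjugate()


z = (0.3 + 0.4j, -0.2 + 0.5j)
zeta = (0.1 - 0.6j, 0.7 + 0.1j)

g = row_resolvent(z)
# (3.32): the basis functions 1 in H(K_1) and z1 in H(K_2) are recovered.
print(abs(g[0] - 1), abs(g[1] - z[0]))

# (3.31) and the Agler decomposition: the sum over k of (1 - z_k conj(zeta_k)) K_k
# should equal 1 - S(z) conj(S(zeta)) with S(z) = z1*z2.
lhs = sum((1 - z[k] * zeta[k].conjugate()) * kernel(k, z, zeta) for k in range(2))
rhs = 1 - (z[0] * z[1]) * (zeta[0] * zeta[1]).conjugate()
print(abs(lhs - rhs))
```

All three printed deviations should vanish up to rounding error.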

Remark 3.8. A variant of Theorem 3.7 is Theorem 3.10 in [17] where it is assumed that $\{K_1^L,\dots,K_d^L\}$ is a left Agler decomposition for a known $S$ in $\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ and then it is shown that any contractive solution $A$ of the Gleason problem (3.1) can be embedded into a c.f.m. for $S$. A very similar argument as in the proof of Theorem 3.7 also occurs in the proofs of Theorem 2.2 and Theorem 3.1 in [20] where closely related results are proved but for contractive multipliers on the Drury-Arveson space rather than Schur-Agler-class functions on the polydisk. The univariate ($d = 1$) special case of Theorem 3.7 amounts essentially to Theorem 11 in [40] and can be viewed as a version of the Beurling-Lax theorem for backward-shift-invariant subspaces.

We next offer a second characterization of left Agler decompositions which may be easier to apply in some cases.

Theorem 3.9. Suppose that we are given a collection $\{K_1^L,\dots,K_d^L\}$ of $\mathcal{L}(\mathcal{Y})$-valued positive kernels. Then $\{K_1^L,\dots,K_d^L\}$ is a left Agler decomposition for some Schur-Agler-class function $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ (for some appropriate input space $\mathcal{U}$) if and only if the kernel $I - \sum_{k=1}^{d}(1 - z_k\bar\zeta_k)K_k^L(z,\zeta)$ is a positive kernel.

Proof. If $\{K_1^L,\dots,K_d^L\}$ is a left Agler decomposition for a Schur-Agler-class function $S$, it follows immediately from the defining property (1.9) that
\[
I - \sum_{k=1}^{d}(1 - z_k\bar\zeta_k)K_k^L(z,\zeta) = S(z)S(\zeta)^*
\]
is a positive kernel with Kolmogorov decomposition given by $S(z)S(\zeta)^*$. Conversely, if $I - \sum_{k=1}^{d}(1 - z_k\bar\zeta_k)K_k^L(z,\zeta)$ is a positive kernel, it has a Kolmogorov decomposition $S(z)S(\zeta)^*$ and then $S$ is a Schur-Agler-class function having $\{K_1^L,\dots,K_d^L\}$ as a left Agler decomposition. □

3.2. Weakly coisometric versus coisometric c.f.m.’s

From the definitions we see that in case the subspace $\mathcal{D}$ given by (3.6) is equal to the whole space $\bigoplus_{i=1}^d\mathcal{H}(K_i^L)$, then the weakly coisometric c.f.m. determined by the Agler decomposition $\{K_1^L,\dots,K_d^L\}$ is unique and is automatically coisometric. In the univariate case ($d = 1$), we see $K_1^L = K^L$ and the domain $\mathcal{D}$ collapses to
\[
\mathcal{D} = \bigvee_{\zeta\in\mathbb{D},\ y\in\mathcal{Y}} \bar\zeta K^L(\cdot,\zeta)y = \bigvee_{\zeta\in\mathbb{D},\ y\in\mathcal{Y}} K^L(\cdot,\zeta)y = \mathcal{H}(K^L)
\]

from which we see that weakly coisometric and coisometric c.f.m.’s coincide in the univariate case. As illustrated in Example 3.6, in the multivariate case it can happen that the containment in (3.6) holds with equality or that it is strict. In general not much is known about the actual construction and structure of Agler decompositions beyond ad hoc constructions as in Example 3.6; in particular, we do not know if, given a Schur-Agler-class function 𝑆, there exists a left Agler decomposition {𝐾1𝐿 , . . . , 𝐾𝑑𝐿 } which gives rise to a coisometric c.f.m. for 𝑆. We return to this topic in the context of two-component canonical functional models (the polydisk analog of Theorem 1.6) in Remark 5.12 below.
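The one-parameter family (3.28) from Example 3.6 gives a quick numerical illustration of the gap between weakly coisometric and unitary c.f.m.'s: the transfer function does not depend on $\omega$, while the colligation matrix is unitary only for $|\omega| = 1$. The sketch below is a hedged check under stated assumptions: the helper names `solve`, `transfer`, `colligation` and `isometry_defect`, the sample point, and the sample values of $\omega$ are all introduced here for illustration only.

```python
from math import sqrt


def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [[complex(v) for v in row] + [complex(rhs)] for row, rhs in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def transfer(U, dims, z):
    """S(z) = D + C (I - Z(z) A)^{-1} Z(z) B for scalar input and output."""
    n = len(U) - 1
    zdiag = [zi for zi, m in zip(z, dims) for _ in range(m)]
    M = [[(1 if r == c else 0) - zdiag[r] * U[r][c] for c in range(n)]
         for r in range(n)]
    x = solve(M, [zdiag[r] * U[r][n] for r in range(n)])
    return U[n][n] + sum(U[n][c] * x[c] for c in range(n))


def colligation(omega):
    """Matrix (3.28) of the c.f.m. colligation for S(z) = z1*z2."""
    s = 1 / sqrt(2)
    h = omega / 2
    return [[0, 0, 0, 1, 0],
            [h, 0, -h, 0, s],
            [0, 1, 0, 0, 0],
            [-h, 0, h, 0, s],
            [s, 0, s, 0, 0]]


def isometry_defect(U):
    """Largest entry (in modulus) of U^* U - I; zero iff the square matrix U is unitary."""
    n = len(U)
    return max(
        abs(sum(complex(U[k][i]).conjugate() * U[k][j] for k in range(n))
            - (1 if i == j else 0))
        for i in range(n) for j in range(n))


z = (0.3 + 0.4j, -0.2 + 0.5j)
for omega in (0, 0.5, 1j, -1):
    U = colligation(omega)
    # Deviation of the transfer function from z1*z2, and the unitarity defect.
    print(omega, abs(transfer(U, (2, 2), z) - z[0] * z[1]), isometry_defect(U))
```

For each sampled $\omega$ the first printed number should be zero up to rounding, while the second should vanish only for the unimodular values of $\omega$.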

4. Weakly isometric realizations

Using the strategy described in Remark 1.5, all the results concerning weakly isometric colligations/realizations associated with a fixed right Agler decomposition (1.10) of a function $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ can be obtained from their “coisometric” counterparts. Indeed, it follows from Theorem 1.7 that $S$ belongs to the class $\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ if and only if the associated function $S^\sharp(z) := S(\bar z)^*$ belongs to $\mathcal{SA}_d(\mathcal{Y},\mathcal{U})$ (we use the standard notation $\bar z = (\bar z_1,\dots,\bar z_d)$ for $z = (z_1,\dots,z_d)\in\mathbb{C}^d$). It is also clear from Theorem 1.7 that a right decomposition $\{K_1^R,\dots,K_d^R\}$ for $S$ is at the same time a left decomposition for $S^\sharp$. Furthermore, $S$ is the transfer function of the colligation $\mathbf{U}$ of the form (1.13) if and only if $S^\sharp$ is the transfer function of $\mathbf{U}^*$ which is readily seen upon taking adjoints in (1.14):
\[
\begin{aligned}
S^\sharp(z) = S(\bar z)^* &= D^* + B^*Z(\bar z)^*(I - A^*Z(\bar z)^*)^{-1}C^*\\
&= D^* + B^*Z(z)(I - A^*Z(z))^{-1}C^*\\
&= D^* + B^*(I - Z(z)A^*)^{-1}Z(z)C^*.
\end{aligned}
\]
Assume that we are given a function $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ with a fixed right Agler decomposition $\{K_1^R,\dots,K_d^R\}$. Let
\[
\mathbb{T}^R(z,\zeta) := \begin{bmatrix} K_1^R(z,\zeta) \\ \vdots \\ K_d^R(z,\zeta)\end{bmatrix} \colon \mathbb{D}^d\times\mathbb{D}^d\to\mathcal{L}(\mathcal{U},\mathcal{U}^d) \tag{4.1}
\]
and let
\[
\widetilde{\mathcal{D}} = \bigvee_{\zeta\in\mathbb{D}^d,\ u\in\mathcal{U}} Z(\zeta)^*\mathbb{T}^R(\cdot,\zeta)u = \bigvee_{\zeta\in\mathbb{D}^d,\ u\in\mathcal{U}} \begin{bmatrix} \bar\zeta_1K_1^R(\cdot,\zeta)u \\ \vdots \\ \bar\zeta_dK_d^R(\cdot,\zeta)u\end{bmatrix} \subset \bigoplus_{i=1}^d\mathcal{H}(K_i^R). \tag{4.2}
\]

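Definition 4.1. Given $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$, we shall say that the block-operator matrix
\[
\widetilde{\mathbf{U}} = \begin{bmatrix} \widetilde A & \widetilde B \\ \widetilde C & \widetilde D\end{bmatrix} \colon \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i^R) \\ \mathcal{U}\end{bmatrix} \to \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i^R) \\ \mathcal{Y}\end{bmatrix} \tag{4.3}
\]
is a dual canonical functional-model (abbreviated to d.c.f.m. in what follows) colligation associated with right Agler decomposition (1.10) for $S$ if
1. $\widetilde{\mathbf{U}}$ is contractive.
2. The restrictions of operators $\widetilde A$ and $\widetilde C$ to the subspace $\widetilde{\mathcal{D}}\subset\bigoplus_{i=1}^d\mathcal{H}(K_i^R)$ defined in (4.2) have the following action on special kernel functions:
\[
\widetilde A|_{\widetilde{\mathcal{D}}}\colon Z(\zeta)^*\mathbb{T}^R(\cdot,\zeta)u \mapsto \mathbb{T}^R(\cdot,\zeta)u - \mathbb{T}^R(\cdot,0)u, \tag{4.4}
\]
\[
\widetilde C|_{\widetilde{\mathcal{D}}}\colon Z(\zeta)^*\mathbb{T}^R(\cdot,\zeta)u \mapsto S(\zeta)u - S(0)u. \tag{4.5}
\]
3. The operators $\widetilde B\colon\mathcal{U}\to\bigoplus_{i=1}^d\mathcal{H}(K_i^R)$ and $\widetilde D\colon\mathcal{U}\to\mathcal{Y}$ are given by
\[
\widetilde B\colon u\mapsto\mathbb{T}^R(\cdot,0)u, \qquad \widetilde D\colon u\mapsto S(0)u. \tag{4.6}
\]

The formulas (4.4)–(4.6) look very much the same as formulas (3.8)–(3.10) and reproducing the arguments from the proof of Proposition 3.2 we arrive at the following.

Proposition 4.2. Given $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$, the block-operator matrix $\widetilde{\mathbf{U}}$ of the form (4.3) is a d.c.f.m. colligation associated with right Agler decomposition $\{K_1^R,\dots,K_d^R\}$ for $S$ if and only if $\widetilde{\mathbf{U}}$ is contractive and
\[
\widetilde{\mathbf{U}} = \begin{bmatrix} \widetilde A & \widetilde B \\ \widetilde C & \widetilde D\end{bmatrix} \colon \begin{bmatrix} Z(\zeta)^*\mathbb{T}^R(\cdot,\zeta)u \\ u\end{bmatrix} \mapsto \begin{bmatrix} \mathbb{T}^R(\cdot,\zeta)u \\ S(\zeta)u\end{bmatrix}. \tag{4.7}
\]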

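On the other hand, as a consequence of identity (1.10) we get (as in the previous section) that the formula
\[
\widetilde V = \begin{bmatrix} \widetilde A|_{\widetilde{\mathcal{D}}} & \widetilde B \\ \widetilde C|_{\widetilde{\mathcal{D}}} & \widetilde D\end{bmatrix} \colon \begin{bmatrix} Z(\zeta)^*\mathbb{T}^R(\cdot,\zeta)u \\ u\end{bmatrix} \mapsto \begin{bmatrix} \mathbb{T}^R(\cdot,\zeta)u \\ S(\zeta)u\end{bmatrix} \tag{4.8}
\]
extends by continuity to define the isometry $\widetilde V\colon\mathcal{D}_{\widetilde V}\to\mathcal{R}_{\widetilde V}$ where
\[
\mathcal{D}_{\widetilde V} = \widetilde{\mathcal{D}}\oplus\mathcal{U} \quad\text{and}\quad \mathcal{R}_{\widetilde V} = \bigvee_{\zeta\in\mathbb{D}^d,\ u\in\mathcal{U}} \begin{bmatrix} \mathbb{T}^R(\cdot,\zeta)u \\ S(\zeta)u\end{bmatrix} \subset \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i^R) \\ \mathcal{Y}\end{bmatrix}.
\]
The operator $\widetilde V$ is completely determined by the kernels $\{K_1^R,\dots,K_d^R\}$ and it follows from (4.7) that a block-operator matrix $\widetilde{\mathbf{U}}$ of the form (4.3) is a d.c.f.m. colligation associated with $\{K_1^R,\dots,K_d^R\}$ if and only if $\widetilde{\mathbf{U}}$ is a contractive extension of $\widetilde V$ to all of $\bigl(\bigoplus_{i=1}^d\mathcal{H}(K_i^R)\bigr)\oplus\mathcal{U}$. This observation proves the first statement in the following theorem which is the multivariable analog of Theorem 1.4.

Theorem 4.3. Let $S$ be a function in the Schur-Agler class $\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ with given right Agler decomposition $\{K_1^R,\dots,K_d^R\}$. Then
1. There exists a d.c.f.m. colligation $\mathbf{U} = \begin{bmatrix} A & B \\ C & D\end{bmatrix}$ associated with $\{K_1^R,\dots,K_d^R\}$.
2. Every d.c.f.m. colligation $\mathbf{U}$ associated with $\{K_1^R,\dots,K_d^R\}$ is weakly isometric and controllable and furthermore, $S(z) = D + C(I - Z(z)A)^{-1}Z(z)B$.
3. Any controllable weakly isometric colligation $\mathbf{U}'$ of the form (2.1) with the transfer function equal to $S$ is unitarily equivalent to some d.c.f.m. colligation $\widetilde{\mathbf{U}}$ for $S$.

The latter theorem is a consequence of Theorem 3.4 so the proof will be omitted as well as the formulation of the theorem parallel to Theorem 3.5.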

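5. Weakly unitary realizations

In this section we study unitary realizations of an $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ associated with a fixed Agler decomposition (1.12). Following the streamlines of Section 2, we let $\mathcal{H}(K_i)$ be the reproducing kernel Hilbert spaces associated with the kernels $K_i$ from decomposition (1.12). For functions $f\in\bigoplus_{i=1}^d\mathcal{H}(K_i)$, we will use representations and notation
\[
f = \bigoplus_{i=1}^d f_i := \begin{bmatrix} f_1 \\ \vdots \\ f_d\end{bmatrix} \in \bigoplus_{i=1}^d\mathcal{H}(K_i) \quad\text{where}\quad f_i = \begin{bmatrix} f_{i,+} \\ f_{i,-}\end{bmatrix} \colon \mathbb{D}^d\to\begin{bmatrix}\mathcal{Y}\\ \mathcal{U}\end{bmatrix}. \tag{5.1}
\]
We furthermore introduce the kernel
\[
\mathbb{T}(z,\zeta) := \begin{bmatrix} K_1(z,\zeta) \\ \vdots \\ K_d(z,\zeta)\end{bmatrix} \colon \mathbb{D}^d\times\mathbb{D}^d\to\mathcal{L}(\mathcal{Y}\oplus\mathcal{U}, (\mathcal{Y}\oplus\mathcal{U})^d) \tag{5.2}
\]
and the subspaces
\[
\mathcal{D} = \bigvee\left\{ Z(\zeta)^*\mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0\end{bmatrix},\ \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u\end{bmatrix} \colon \zeta\in\mathbb{D}^d,\ y\in\mathcal{Y},\ u\in\mathcal{U}\right\} \tag{5.3}
\]
and
\[
\mathcal{R} = \bigvee\left\{ \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0\end{bmatrix},\ Z(\zeta)^*\mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u\end{bmatrix} \colon \zeta\in\mathbb{D}^d,\ y\in\mathcal{Y},\ u\in\mathcal{U}\right\} \tag{5.4}
\]
of $\bigoplus_{i=1}^d\mathcal{H}(K_i)$ whose orthogonal complements can be described as
\[
\mathcal{D}^\perp = \left\{ f = \bigoplus_{i=1}^d\begin{bmatrix} f_{i,+} \\ f_{i,-}\end{bmatrix} \in \bigoplus_{i=1}^d\mathcal{H}(K_i) \colon \sum_{i=1}^{d} z_if_{i,+}(z)\equiv 0\ \ \&\ \ \sum_{i=1}^{d} f_{i,-}(z)\equiv 0\right\} \tag{5.5}
\]
and
\[
\mathcal{R}^\perp = \left\{ f = \bigoplus_{i=1}^d\begin{bmatrix} f_{i,+} \\ f_{i,-}\end{bmatrix} \in \bigoplus_{i=1}^d\mathcal{H}(K_i) \colon \sum_{i=1}^{d} f_{i,+}(z)\equiv 0\ \ \&\ \ \sum_{i=1}^{d} z_if_{i,-}(z)\equiv 0\right\}, \tag{5.6}
\]
respectively. By the reproducing kernel property, we have
\[
\left\langle f_i,\ K_i(\cdot,\zeta)\begin{bmatrix} y \\ 0\end{bmatrix}\right\rangle_{\mathcal{H}(K_i)} = \langle f_{i,+}(\zeta), y\rangle_{\mathcal{Y}}, \tag{5.7}
\]
\[
\left\langle f_i,\ K_i(\cdot,\zeta)\begin{bmatrix} 0 \\ u\end{bmatrix}\right\rangle_{\mathcal{H}(K_i)} = \langle f_{i,-}(\zeta), u\rangle_{\mathcal{U}}. \tag{5.8}
\]
We define the coisometric map $\mathbf{s}\colon\bigoplus_{i=1}^d\mathcal{H}(K_i)\to\mathcal{H}(K_1 + \cdots + K_d)$ by formula
\[
\mathbf{s}f = f_1 + \cdots + f_d \quad\text{where}\quad f = \bigoplus_{i=1}^d f_i \in \bigoplus_{i=1}^d\mathcal{H}(K_i) \tag{5.9}
\]
and observe that in view of (5.2), (5.7) and (5.8),
\[
\left\langle f,\ \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0\end{bmatrix}\right\rangle_{\bigoplus_{i=1}^d\mathcal{H}(K_i)} = \langle(\mathbf{s}f)_+(\zeta), y\rangle_{\mathcal{Y}}, \tag{5.10}
\]
\[
\left\langle f,\ \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u\end{bmatrix}\right\rangle_{\bigoplus_{i=1}^d\mathcal{H}(K_i)} = \langle(\mathbf{s}f)_-(\zeta), u\rangle_{\mathcal{U}}. \tag{5.11}
\]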

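Definition 5.1. A contractive colligation
\[
\mathbf{U} = \begin{bmatrix} A & B \\ C & D\end{bmatrix} \colon \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i) \\ \mathcal{U}\end{bmatrix} \to \begin{bmatrix} \bigoplus_{i=1}^d\mathcal{H}(K_i) \\ \mathcal{Y}\end{bmatrix} \tag{5.12}
\]
will be called a two-component canonical functional-model (abbreviated to t.c.f.m. in what follows) realization associated with a fixed Agler decomposition (1.12) of a given $S\in\mathcal{SA}_d(\mathcal{U},\mathcal{Y})$ if
1. The state space operator $A$ solves the structured Gleason problem
\[
(\mathbf{s}f)_+(z) - (\mathbf{s}f)_+(0) = \sum_{i=1}^{d} z_i[Af]_{i,+}(z), \tag{5.13}
\]
whereas the adjoint operator $A^*$ solves the dual structured Gleason problem
\[
(\mathbf{s}f)_-(z) - (\mathbf{s}f)_-(0) = \sum_{i=1}^{d} z_i[A^*f]_{i,-}(z). \tag{5.14}
\]
2. The operators $C\colon\bigoplus_{i=1}^d\mathcal{H}(K_i)\to\mathcal{Y}$, $B^*\colon\bigoplus_{i=1}^d\mathcal{H}(K_i)\to\mathcal{U}$ and $D\colon\mathcal{U}\to\mathcal{Y}$ are of the form
\[
C\colon f\mapsto(\mathbf{s}f)_+(0), \qquad B^*\colon f\mapsto(\mathbf{s}f)_-(0), \qquad D\colon u\mapsto S(0)u. \tag{5.15}
\]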

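Proposition 5.2. Relations (5.13), (5.14) and (5.15) are equivalent respectively to equalities
\[
A^*Z(\zeta)^*\mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0\end{bmatrix} = \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0\end{bmatrix} - \mathbb{T}(\cdot,0)\begin{bmatrix} y \\ 0\end{bmatrix}, \tag{5.16}
\]
\[
AZ(\zeta)^*\mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u\end{bmatrix} = \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u\end{bmatrix} - \mathbb{T}(\cdot,0)\begin{bmatrix} 0 \\ u\end{bmatrix}, \tag{5.17}
\]
\[
C^*y = \mathbb{T}(\cdot,0)\begin{bmatrix} y \\ 0\end{bmatrix}, \qquad Bu = \mathbb{T}(\cdot,0)\begin{bmatrix} 0 \\ u\end{bmatrix}, \quad\text{and}\quad D^*y = S(0)^*y \tag{5.18}
\]
holding for every $\zeta\in\mathbb{D}^d$, $y\in\mathcal{Y}$ and $u\in\mathcal{U}$.

Proof. It follows from (5.10) that
\[
\langle(\mathbf{s}f)_+(z) - (\mathbf{s}f)_+(0),\ y\rangle_{\mathcal{Y}} = \left\langle f,\ \mathbb{T}(\cdot,z)\begin{bmatrix} y \\ 0\end{bmatrix} - \mathbb{T}(\cdot,0)\begin{bmatrix} y \\ 0\end{bmatrix}\right\rangle_{\bigoplus_{i=1}^d\mathcal{H}(K_i)}
\]
and on the other hand, due to the diagonal structure (1.15) of $Z(z)$,
\[
\left\langle\sum_{i=1}^{d} z_i[Af]_{i,+}(z),\ y\right\rangle_{\mathcal{Y}} = \left\langle Z(z)Af,\ \mathbb{T}(\cdot,z)\begin{bmatrix} y \\ 0\end{bmatrix}\right\rangle_{\bigoplus_{i=1}^d\mathcal{H}(K_i)} = \left\langle f,\ A^*Z(z)^*\mathbb{T}(\cdot,z)\begin{bmatrix} y \\ 0\end{bmatrix}\right\rangle_{\bigoplus_{i=1}^d\mathcal{H}(K_i)}.
\]
Since the two latter equalities hold for every $f\in\bigoplus_{i=1}^d\mathcal{H}(K_i)$ and $y\in\mathcal{Y}$, the equivalence (5.13) ⇔ (5.16) follows. The equivalence (5.14) ⇔ (5.17) follows from (5.11) in much the same way; the formula for $C^*$ in (5.18) follows from
\[
\langle f, C^*y\rangle = \langle Cf, y\rangle = \langle(\mathbf{s}f)_+(0), y\rangle = \left\langle f,\ \mathbb{T}(\cdot,0)\begin{bmatrix} y \\ 0\end{bmatrix}\right\rangle
\]
and the formula for $B$ is a consequence of a similar computation. The formula for $D^*$ is self-evident. □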

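Proposition 5.3. Let $B$, $C$ and $D$ be the operators defined in (5.15). Then
\[
CC^* + DD^* = I_{\mathcal{Y}} \quad\text{and}\quad B^*B + D^*D = I_{\mathcal{U}}. \tag{5.19}
\]
Furthermore $B^*$ has the following action on kernel elements of the subspace $\mathcal{D}$ defined in (5.3):
\[
B^*\colon Z(\zeta)^*\mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0\end{bmatrix} \mapsto S(\zeta)^*y - S(0)^*y, \tag{5.20}
\]
\[
B^*\colon \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u\end{bmatrix} \mapsto u - S(0)^*S(\zeta)u, \tag{5.21}
\]
for all $\zeta\in\mathbb{D}^d$, $y\in\mathcal{Y}$ and $u\in\mathcal{U}$, where $\mathbb{T}$ is defined in (5.2).

Proof. We first observe that
\[
\|C^*y\|^2 = \left\|\mathbb{T}(\cdot,0)\begin{bmatrix} y \\ 0\end{bmatrix}\right\|^2 = \left\langle\sum_{i=1}^{d}K_i^L(0,0)y,\ y\right\rangle = \langle(I - S(0)S(0)^*)y,\ y\rangle,
\]
\[
\|Bu\|^2 = \left\|\mathbb{T}(\cdot,0)\begin{bmatrix} 0 \\ u\end{bmatrix}\right\|^2 = \left\langle\sum_{i=1}^{d}K_i^R(0,0)u,\ u\right\rangle = \langle(I - S(0)^*S(0))u,\ u\rangle,
\]
where the first equalities follow from formulas (5.18) for $B$ and $C^*$, the second equalities follow by reproducing kernel formulas (5.10), (5.11) along with definitions (5.9), (5.2) and (1.11) of $\mathbf{s}$, $\mathbb{T}$ and $K_i$, and finally, the third equalities follow from the decomposition formula (1.12) evaluated at $z = \zeta = 0$. Taking into account the formulas (5.15) and (5.18) for $D$ and $D^*$, we then have equalities
\[
\|C^*y\|^2 = \|y\|^2 - \|S(0)^*y\|^2 = \|y\|^2 - \|D^*y\|^2, \qquad \|Bu\|^2 = \|u\|^2 - \|S(0)u\|^2 = \|u\|^2 - \|Du\|^2 \tag{5.22}
\]
holding for all $y\in\mathcal{Y}$ and $u\in\mathcal{U}$ which are equivalent to operator equalities (5.19). To verify (5.20) and (5.21) we proceed as follows. By definitions (5.9), (1.11), (1.15) and (5.2) of $\mathbf{s}$, $K_i$, $Z(z)$ and $\mathbb{T}$,
\[
\left[\mathbf{s}\left(Z(\zeta)^*\mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0\end{bmatrix}\right)\right]_- = \sum_{i=1}^{d}\bar\zeta_i\left[K_i(\cdot,\zeta)\begin{bmatrix} y \\ 0\end{bmatrix}\right]_- = \sum_{i=1}^{d}\bar\zeta_iK_i^{RL}(\cdot,\zeta)y,
\]
\[
\left[\mathbf{s}\left(\mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u\end{bmatrix}\right)\right]_- = \sum_{i=1}^{d}\left[K_i(\cdot,\zeta)\begin{bmatrix} 0 \\ u\end{bmatrix}\right]_- = \sum_{i=1}^{d}K_i^R(\cdot,\zeta)u.
\]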


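Combining the definition (5.15) of $B^*$ with the two last formulas evaluated at zero gives

$$B^* Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} = \left[\mathbf{s}\left(Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix}\right)\right]_-(0) = \sum_{i=1}^d \bar\zeta_i K_i^{RL}(0,\zeta) y, \tag{5.23}$$

$$B^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} = \left[\mathbf{s}\left(\mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix}\right)\right]_-(0) = \sum_{i=1}^d K_i^R(0,\zeta) u. \tag{5.24}$$

Upon letting $z = 0$ in (1.12) and equating the block entries in the bottom row we see that

$$S(\zeta)^* - S(0)^* = \sum_{i=1}^d \bar\zeta_i K_i^{RL}(0,\zeta), \qquad I_{\mathcal{U}} - S(0)^* S(\bar\zeta) = \sum_{i=1}^d K_i^R(0,\zeta) \tag{5.25}$$

and combining the two latter equalities with (5.23) and (5.24) gives (5.20), (5.21). □

Formulas (5.20), (5.21) describing the action of the operator $B^*$ on elementary kernels of $\mathcal{D}$ were easily obtained from the general formula (5.15) for $B^*$. Although the operator $A^*$ is not defined in Definition 5.1 on the whole state space $\oplus_{i=1}^d \mathcal{H}(K_i)$, it turns out that its action on elementary kernels of $\mathcal{D}$ is completely determined by conditions (5.13) and (5.14). One half of the job is handled by formula (5.16) (which is equivalent to (5.13)). The other half is covered in the next proposition.

Proposition 5.4. Let $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ be a t.c.f.m. colligation associated with the Agler decomposition (1.12) of a given $S \in \mathcal{SA}_d(\mathcal{U}, \mathcal{Y})$ and let $\mathbb{T}$ be given by (5.2). Then

$$A^*: \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \to Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} - \mathbb{T}(\cdot,0)\begin{bmatrix} S(\bar\zeta) u \\ 0 \end{bmatrix} \tag{5.26}$$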

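for all $\zeta \in \mathbb{D}^d$, $y \in \mathcal{Y}$ and $u \in \mathcal{U}$.

Proof. We have to show that formula (5.26) follows from conditions in Definition 5.1. To this end, we first verify the equality

$$\|Z(\zeta)^* h_{\zeta,u}\|^2 - \|A Z(\zeta)^* h_{\zeta,u}\|^2 = \|C Z(\zeta)^* h_{\zeta,u}\|^2_{\mathcal{Y}} \tag{5.27}$$

where the norms on the left-hand side are taken in $\oplus_{i=1}^d \mathcal{H}(K_i)$ and where we have set for short

$$h_{\zeta,u} := \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \in \oplus_{i=1}^d \mathcal{H}(K_i). \tag{5.28}$$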


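By the reproducing kernel property (5.11) and on account of (5.2) and (1.11),

$$\langle h_{\zeta,u}, h_{z,u}\rangle_{\oplus_{i=1}^d \mathcal{H}(K_i)} = \sum_{i=1}^d \langle K_i^R(z,\zeta) u, u\rangle_{\mathcal{U}}, \tag{5.29}$$

$$\|Z(\zeta)^* h_{\zeta,u}\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} = \sum_{i=1}^d |\zeta_i|^2 \cdot \langle K_i^R(\zeta,\zeta) u, u\rangle_{\mathcal{U}}. \tag{5.30}$$

Equality (5.17) holds by Proposition 5.2 and can be written as

$$A Z(\zeta)^* h_{\zeta,u} = h_{\zeta,u} - h_{0,u} \tag{5.31}$$

in notation (5.28). This formula together with (5.29) leads us to

$$\|A Z(\zeta)^* h_{\zeta,u}\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} = \|h_{\zeta,u} - h_{0,u}\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} = \sum_{i=1}^d \left\langle \left(K_i^R(\zeta,\zeta) - K_i^R(\zeta,0) - K_i^R(0,\zeta) + K_i^R(0,0)\right) u,\, u\right\rangle_{\mathcal{U}}. \tag{5.32}$$

Upon letting $z = \zeta$ in (1.12) we get the identity

$$I_{\mathcal{U}} - S(\bar\zeta)^* S(\bar\zeta) = \sum_{i=1}^d (1 - |\zeta_i|^2) K_i^R(\zeta,\zeta) \tag{5.33}$$

which together with the second relation in (5.25) implies

$$\begin{aligned}
&\sum_{i=1}^d \left((1 - |\zeta_i|^2) K_i^R(\zeta,\zeta) - K_i^R(\zeta,0) - K_i^R(0,\zeta) + K_i^R(0,0)\right) \\
&\quad = I_{\mathcal{U}} - S(\bar\zeta)^* S(\bar\zeta) - (I_{\mathcal{U}} - S(\bar\zeta)^* S(0)) - (I_{\mathcal{U}} - S(0)^* S(\bar\zeta)) + I_{\mathcal{U}} - S(0)^* S(0) \\
&\quad = -(S(\bar\zeta)^* - S(0)^*)(S(\bar\zeta) - S(0)).
\end{aligned}$$

Subtracting (5.32) from (5.30) and making use of the last identity gives us

$$\|Z(\zeta)^* h_{\zeta,u}\|^2 - \|A Z(\zeta)^* h_{\zeta,u}\|^2 = \|S(\bar\zeta) u - S(0) u\|^2_{\mathcal{Y}}. \tag{5.34}$$

On the other hand, it follows from the identity

$$S(\bar\zeta) - S(0) = \sum_{i=1}^d \bar\zeta_i K_i^{LR}(0,\zeta)$$

(which is yet another consequence of the decomposition formula (1.12)), the explicit formula (5.15) for $C$ and definitions (5.9), (5.2) and (1.11), that

$$C Z(\zeta)^* h_{\zeta,u} = [\mathbf{s}(Z(\zeta)^* h_{\zeta,u})]_+(0) = \left[\sum_{i=1}^d \bar\zeta_i K_i(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix}\right]_+(0) = \sum_{i=1}^d \bar\zeta_i K_i^{LR}(0,\zeta) u = S(\bar\zeta) u - S(0) u. \tag{5.35}$$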


Substituting the latter equality into (5.34) completes the proof of (5.27). Writing (5.27) in the form

$$\langle (I - A^*A - C^*C) Z(\zeta)^* h_{\zeta,u},\ Z(\zeta)^* h_{\zeta,u}\rangle_{\oplus_{i=1}^d \mathcal{H}(K_i)} = 0$$

and observing that the operator $I - A^*A - C^*C$ is positive semidefinite (since $\mathbf{U}$ is contractive by Definition 5.1), we conclude that

$$(I - A^*A - C^*C) Z(\zeta)^* h_{\zeta,u} \equiv 0 \quad\text{for all } \zeta \in \mathbb{D}^d,\ u \in \mathcal{U}. \tag{5.36}$$

Applying the operator $C^*$ to both parts of (5.35) we get

$$C^* C Z(\zeta)^* h_{\zeta,u} = \mathbb{T}(\cdot,0)\begin{bmatrix} S(\bar\zeta) u - S(0) u \\ 0 \end{bmatrix}, \tag{5.37}$$

by the explicit formula (5.18) for $C^*$. From the same formula and the formula (5.15) for $D$ we get

$$C^* D u = C^* S(0) u = \mathbb{T}(\cdot,0)\begin{bmatrix} S(0) u \\ 0 \end{bmatrix}. \tag{5.38}$$

We next apply the operator $A^*$ to both parts of equality (5.31):

$$A^* A Z(\zeta)^* h_{\zeta,u} = A^* h_{\zeta,u} - A^* h_{0,u}. \tag{5.39}$$

Comparing (5.28) and the second formula in (5.18) (which holds by Proposition 5.2) convinces us that

$$h_{0,u} = Bu \tag{5.40}$$

so that (5.39) can be written as

$$A^* h_{\zeta,u} = A^* A Z(\zeta)^* h_{\zeta,u} + A^* B u. \tag{5.41}$$

Since $\mathbf{U}$ is contractive (by Definition 5.1) and since $B$ and $D$ satisfy the second equality in (5.19), it then follows that $A^*B + C^*D = 0$. Thus,

$$A^* B u = -C^* D u = -C^* S(0) u = -\mathbb{T}(\cdot,0)\begin{bmatrix} S(0) u \\ 0 \end{bmatrix}.$$

Taking the latter equality into account and making subsequent use of (5.36), (5.37) and (5.38) we then get from (5.41)

$$\begin{aligned}
A^* h_{\zeta,u} &= (I - C^*C) Z(\zeta)^* h_{\zeta,u} - C^* D u \\
&= Z(\zeta)^* h_{\zeta,u} - \mathbb{T}(\cdot,0)\begin{bmatrix} S(\bar\zeta) u - S(0) u \\ 0 \end{bmatrix} - \mathbb{T}(\cdot,0)\begin{bmatrix} S(0) u \\ 0 \end{bmatrix} \\
&= Z(\zeta)^* h_{\zeta,u} - \mathbb{T}(\cdot,0)\begin{bmatrix} S(\bar\zeta) u \\ 0 \end{bmatrix}.
\end{aligned}$$

Substituting (5.28) into the last identity we get (5.26), which completes the proof. □
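The orthogonality relation $A^*B + C^*D = 0$ invoked in the proof is a standard consequence of contractivity combined with the second equality in (5.19). The following is a brief sketch of that step; the symbol $Q$ and the real parameter $t$ are introduced only for this illustration and do not appear in the original argument.

```latex
% Sketch: U contractive and B^*B + D^*D = I_U force A^*B + C^*D = 0.
\[
I - \mathbf{U}^*\mathbf{U}
= \begin{bmatrix}
I - A^*A - C^*C & -(A^*B + C^*D) \\
-(B^*A + D^*C) & I_{\mathcal U} - B^*B - D^*D
\end{bmatrix}
= \begin{bmatrix} I - A^*A - C^*C & -Q \\ -Q^* & 0 \end{bmatrix} \ge 0,
\qquad Q := A^*B + C^*D .
\]
% Test positivity on vectors (x, t u) with x in the state space, u in \mathcal U, t real:
\[
0 \le \left\langle (I-\mathbf{U}^*\mathbf{U})
\begin{bmatrix} x \\ tu \end{bmatrix},
\begin{bmatrix} x \\ tu \end{bmatrix}\right\rangle
= \langle (I - A^*A - C^*C)x, x\rangle - 2t\,\operatorname{Re}\langle Qu, x\rangle .
\]
% Letting t -> +infinity and t -> -infinity gives Re <Qu, x> = 0 for all x and u;
% replacing x by ix shows that the imaginary part vanishes as well, hence Q = 0.
```

The same argument applied to $I - \mathbf{U}\mathbf{U}^*$ and the first equality in (5.19) yields $AC^* + BD^* = 0$.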


Remark 5.5. Since any t.c.f.m. colligation is contractive, we have in particular that $AA^* + BB^* \le I$. Therefore, formulas (5.20), (5.21) and (5.26), (5.16) defining the action of operators $B^*$ and $A^*$ on elementary kernels of the space $\mathcal{D}$ (see (5.3)) can be extended by continuity to define these operators on the whole $\mathcal{D}$.

Proposition 5.6. Any t.c.f.m. colligation $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ associated with a fixed Agler decomposition (1.12) of a given $S \in \mathcal{SA}_d(\mathcal{U}, \mathcal{Y})$ is weakly unitary and closely connected. Furthermore,

$$S(z) = D + C(I - Z(z)A)^{-1} Z(z) B. \tag{5.42}$$

Proof. Let $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ be a t.c.f.m. colligation of $S$ associated with a fixed Agler decomposition (1.12). Then equalities (5.16)–(5.18) and (5.26) hold (by Propositions 5.2 and 5.4) and can be solved for $\mathbb{T}(\cdot,\zeta)$ as follows:

$$\mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} = (I - A^* Z(\zeta)^*)^{-1} \mathbb{T}(\cdot,0)\begin{bmatrix} y \\ 0 \end{bmatrix} = (I - A^* Z(\zeta)^*)^{-1} C^* y, \tag{5.43}$$

$$\mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} = (I - A Z(\zeta)^*)^{-1} \mathbb{T}(\cdot,0)\begin{bmatrix} 0 \\ u \end{bmatrix} = (I - A Z(\bar\zeta))^{-1} B u. \tag{5.44}$$
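The passage from (5.16)–(5.18) to the resolvent formulas (5.43), (5.44) is a one-line rearrangement. A sketch of the first case is given below, assuming only that $A$ is a contraction (which holds for any t.c.f.m. colligation) so that the inverse exists on the open polydisk.

```latex
% Move the A^* Z(\zeta)^* term in (5.16) to the left and use the formula for C^* in (5.18):
\[
\bigl(I - A^* Z(\zeta)^*\bigr)\,\mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix}
= \mathbb{T}(\cdot,0)\begin{bmatrix} y \\ 0 \end{bmatrix} = C^* y .
\]
% Invertibility: \|A^* Z(\zeta)^*\| \le \|A\| \max_i |\zeta_i| < 1 for \zeta in \mathbb{D}^d,
% so the Neumann series converges in operator norm:
\[
\bigl(I - A^* Z(\zeta)^*\bigr)^{-1} = \sum_{n \ge 0} \bigl(A^* Z(\zeta)^*\bigr)^n .
\]
% The same rearrangement applied to (5.17), together with Bu = T(.,0)[0;u] from (5.18),
% gives (5.44); there Z(\zeta)^* = Z(\bar\zeta) because Z is diagonal in the coordinates.
```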

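From (5.43) and (5.20) we conclude that equalities

$$\begin{aligned}
\left(D^* + B^* Z(\zeta)^* (I - A^* Z(\zeta)^*)^{-1} C^*\right) y &= S(0)^* y + B^* Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} \\
&= S(0)^* y + S(\zeta)^* y - S(0)^* y = S(\zeta)^* y
\end{aligned} \tag{5.45}$$

hold for every $\zeta \in \mathbb{D}^d$ and $y \in \mathcal{Y}$, which proves representation (5.42). Furthermore, in view of (5.2),

$$\begin{aligned}
&\bigvee\left\{ P_{\mathcal{H}(K_i)} (I - A^* Z(\zeta)^*)^{-1} C^* y,\ P_{\mathcal{H}(K_i)} (I - A Z(\bar\zeta))^{-1} B u :\ \zeta \in \mathbb{D}^d,\ y \in \mathcal{Y},\ u \in \mathcal{U}\right\} \\
&\quad = \bigvee\left\{ P_{\mathcal{H}(K_i)} \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix},\ P_{\mathcal{H}(K_i)} \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} :\ \zeta \in \mathbb{D}^d,\ y \in \mathcal{Y},\ u \in \mathcal{U}\right\} \\
&\quad = \bigvee\left\{ K_i(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix},\ K_i(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} :\ \zeta \in \mathbb{D}^d,\ y \in \mathcal{Y},\ u \in \mathcal{U}\right\} \\
&\quad = \bigvee\left\{ K_i(\cdot,\zeta)\begin{bmatrix} y \\ u \end{bmatrix} :\ \zeta \in \mathbb{D}^d,\ \begin{bmatrix} y \\ u \end{bmatrix} \in \mathcal{Y} \oplus \mathcal{U}\right\} = \mathcal{H}(K_i)
\end{aligned}$$

and the colligation $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is closely connected by Definition 2.1. To show that $\mathbf{U}$ is weakly unitary, let us rearrange the Agler decomposition (1.12) for $S$ as

$$\begin{aligned}
&\begin{bmatrix} I_{\mathcal{Y}} \\ S(\bar z)^* \end{bmatrix}\begin{bmatrix} I_{\mathcal{Y}} & S(\bar\zeta) \end{bmatrix} + \sum_{i=1}^d \begin{bmatrix} z_i I_{\mathcal{Y}} & 0 \\ 0 & I_{\mathcal{U}} \end{bmatrix} K_i(z,\zeta) \begin{bmatrix} \bar\zeta_i I_{\mathcal{Y}} & 0 \\ 0 & I_{\mathcal{U}} \end{bmatrix} \\
&\quad = \begin{bmatrix} S(z) \\ I_{\mathcal{U}} \end{bmatrix}\begin{bmatrix} S(\zeta)^* & I_{\mathcal{U}} \end{bmatrix} + \sum_{i=1}^d \begin{bmatrix} I_{\mathcal{Y}} & 0 \\ 0 & z_i I_{\mathcal{U}} \end{bmatrix} K_i(z,\zeta) \begin{bmatrix} I_{\mathcal{Y}} & 0 \\ 0 & \bar\zeta_i I_{\mathcal{U}} \end{bmatrix},
\end{aligned}$$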


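which in turn can be written in the inner product form

$$\begin{aligned}
&\langle y + S(\bar\zeta) u,\ y' + S(\bar z) u'\rangle_{\mathcal{Y}} + \sum_{i=1}^d \left\langle K_i(\cdot,\zeta)\begin{bmatrix} \bar\zeta_i y \\ u \end{bmatrix},\ K_i(\cdot,z)\begin{bmatrix} \bar z_i y' \\ u' \end{bmatrix}\right\rangle_{\mathcal{H}(K_i)} \\
&\quad = \langle S(\zeta)^* y + u,\ S(z)^* y' + u'\rangle_{\mathcal{U}} + \sum_{i=1}^d \left\langle K_i(\cdot,\zeta)\begin{bmatrix} y \\ \bar\zeta_i u \end{bmatrix},\ K_i(\cdot,z)\begin{bmatrix} y' \\ \bar z_i u' \end{bmatrix}\right\rangle_{\mathcal{H}(K_i)}
\end{aligned}$$

which is the same as

$$\begin{aligned}
&\left\langle \begin{bmatrix} Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} + \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \\ y + S(\bar\zeta) u \end{bmatrix},\ \begin{bmatrix} Z(z)^* \mathbb{T}(\cdot,z)\begin{bmatrix} y' \\ 0 \end{bmatrix} + \mathbb{T}(\cdot,z)\begin{bmatrix} 0 \\ u' \end{bmatrix} \\ y' + S(\bar z) u' \end{bmatrix}\right\rangle \\
&\quad = \left\langle \begin{bmatrix} \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} + Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \\ S(\zeta)^* y + u \end{bmatrix},\ \begin{bmatrix} \mathbb{T}(\cdot,z)\begin{bmatrix} y' \\ 0 \end{bmatrix} + Z(z)^* \mathbb{T}(\cdot,z)\begin{bmatrix} 0 \\ u' \end{bmatrix} \\ S(z)^* y' + u' \end{bmatrix}\right\rangle
\end{aligned} \tag{5.46}$$

where the inner products are taken in $(\oplus_{i=1}^d \mathcal{H}(K_i)) \oplus \mathcal{Y}$ and $(\oplus_{i=1}^d \mathcal{H}(K_i)) \oplus \mathcal{U}$. Letting $z = \zeta$, $u = u' = 0$ and $y = y'$ in the latter equality gives

$$\left\| \begin{bmatrix} Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} \\ y \end{bmatrix} \right\| = \left\| \begin{bmatrix} \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} \\ S(\zeta)^* y \end{bmatrix} \right\|$$

which on account of (5.43) can be written as

$$\left\| \begin{bmatrix} Z(\zeta)^* (I - A^* Z(\zeta)^*)^{-1} C^* y \\ y \end{bmatrix} \right\| = \left\| \begin{bmatrix} (I - A^* Z(\zeta)^*)^{-1} C^* y \\ S(\zeta)^* y \end{bmatrix} \right\|.$$

Since

$$\begin{bmatrix} A^* & C^* \\ B^* & D^* \end{bmatrix} \begin{bmatrix} Z(\zeta)^* (I - A^* Z(\zeta)^*)^{-1} C^* y \\ y \end{bmatrix} = \begin{bmatrix} (I - A^* Z(\zeta)^*)^{-1} C^* y \\ S(\zeta)^* y \end{bmatrix} \tag{5.47}$$

(the top components in the latter formula are equal automatically whereas the bottom components are equal due to (5.45)), equality (5.47) tells us that $\mathbf{U}$ is weakly coisometric by Definition 2.2. Similarly, letting $z = \zeta$, $u = u'$ and $y = y' = 0$ in (5.46) we get

$$\left\| \begin{bmatrix} \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \\ S(\bar\zeta) u \end{bmatrix} \right\| = \left\| \begin{bmatrix} Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \\ u \end{bmatrix} \right\|$$

which in view of (5.44) can be written as

$$\left\| \begin{bmatrix} (I - A Z(\bar\zeta))^{-1} B u \\ S(\bar\zeta) u \end{bmatrix} \right\| = \left\| \begin{bmatrix} Z(\bar\zeta)(I - A Z(\bar\zeta))^{-1} B u \\ u \end{bmatrix} \right\|$$

and since

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} Z(\bar\zeta)(I - A Z(\bar\zeta))^{-1} B u \\ u \end{bmatrix} = \begin{bmatrix} (I - A Z(\bar\zeta))^{-1} B u \\ S(\bar\zeta) u \end{bmatrix}$$

(again, the top components are equal automatically and the bottom components are equal due to (5.42)), the colligation $\mathbf{U}$ is weakly isometric by Definition 2.2. □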
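For the reader who wants the step from (5.45) to (5.42) spelled out: (5.45) is a realization formula for $S(\zeta)^*$, and (5.42) is its operator adjoint. A short sketch, using only that $Z(\zeta)$ is a bounded diagonal operator and that the inverses exist on the polydisk:

```latex
% (5.45) states, for every \zeta in \mathbb{D}^d,
\[
S(\zeta)^* = D^* + B^* Z(\zeta)^* \bigl(I - A^* Z(\zeta)^*\bigr)^{-1} C^* .
\]
% Taking adjoints reverses the order of the factors, and
% \bigl((I - A^* Z(\zeta)^*)^{-1}\bigr)^* = (I - Z(\zeta) A)^{-1}, so
\[
S(\zeta) = D + C \bigl(I - Z(\zeta) A\bigr)^{-1} Z(\zeta) B ,
\]
% which is (5.42) with z = \zeta.
```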


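Proposition 5.6 establishes common features of t.c.f.m. colligations, leaving open the question of whether at least one such colligation exists. As was shown in the proof of Proposition 5.6, the Agler decomposition (1.12) can be written in the inner product form (5.46), from which we conclude that the linear map

$$V = \begin{bmatrix} A_V & B_V \\ C_V & D_V \end{bmatrix} : \begin{bmatrix} Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} + \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \\ y + S(\bar\zeta) u \end{bmatrix} \to \begin{bmatrix} \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} + Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \\ S(\zeta)^* y + u \end{bmatrix} \tag{5.48}$$

defined completely in terms of a given Agler decomposition $\{K_1, \ldots, K_d\}$ of $S$, extends to the isometry from

$$\mathcal{D}_V = \bigvee\left\{ \begin{bmatrix} Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} \\ y \end{bmatrix},\ \begin{bmatrix} \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \\ S(\bar\zeta) u \end{bmatrix} :\ \zeta \in \mathbb{D}^d,\ y \in \mathcal{Y},\ u \in \mathcal{U}\right\}$$

onto

$$\mathcal{R}_V = \bigvee\left\{ \begin{bmatrix} \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} \\ S(\zeta)^* y \end{bmatrix},\ \begin{bmatrix} Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \\ u \end{bmatrix} :\ \zeta \in \mathbb{D}^d,\ y \in \mathcal{Y},\ u \in \mathcal{U}\right\}.$$

It is readily seen (take $\zeta = 0$) that $\mathcal{D}_V$ and $\mathcal{R}_V$ contain respectively all vectors of the form $\begin{bmatrix} 0 \\ y \end{bmatrix}$ and $\begin{bmatrix} 0 \\ u \end{bmatrix}$ and therefore they split into direct sums

$$\mathcal{D}_V = \mathcal{D} \oplus \mathcal{Y} \quad\text{and}\quad \mathcal{R}_V = \mathcal{R} \oplus \mathcal{U}$$

where the subspaces $\mathcal{D}$ and $\mathcal{R}$ of $\oplus_{i=1}^d \mathcal{H}(K_i)$ are defined in (5.3), (5.4). For the operators

$$A_V: \mathcal{D} \to \mathcal{R}, \quad B_V: \mathcal{Y} \to \mathcal{R}, \quad C_V: \mathcal{D} \to \mathcal{U}, \quad D_V: \mathcal{Y} \to \mathcal{U}$$

we have from (5.48) the following relations:

$$A_V Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} + B_V y = \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix}, \tag{5.49}$$

$$A_V \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} + B_V S(\bar\zeta) u = Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix}, \tag{5.50}$$

$$C_V Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} + D_V y = S(\zeta)^* y, \tag{5.51}$$

$$C_V \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} + D_V S(\bar\zeta) u = u. \tag{5.52}$$

Equalities (5.49) and (5.50) are obtained upon comparing the top components in (5.48) with, respectively, $u = 0$ and $y = 0$. Equalities (5.51) and (5.52) are obtained similarly upon comparing the bottom components in (5.48). Letting $\zeta = 0$ in (5.49) and (5.51) gives

$$B_V y = \mathbb{T}(\cdot,0)\begin{bmatrix} y \\ 0 \end{bmatrix} \quad\text{and}\quad D_V y = S(0)^* y. \tag{5.53}$$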


Substituting the first and the second formula in (5.53) respectively into (5.49), (5.50) and into (5.51) and (5.52) results in equalities

$$A_V: Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} \to \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} - \mathbb{T}(\cdot,0)\begin{bmatrix} y \\ 0 \end{bmatrix}, \tag{5.54}$$

$$A_V: \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \to Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} - \mathbb{T}(\cdot,0)\begin{bmatrix} S(\bar\zeta) u \\ 0 \end{bmatrix}, \tag{5.55}$$

$$C_V: Z(\zeta)^* \mathbb{T}(\cdot,\zeta)\begin{bmatrix} y \\ 0 \end{bmatrix} \to S(\zeta)^* y - S(0)^* y, \tag{5.56}$$

$$C_V: \mathbb{T}(\cdot,\zeta)\begin{bmatrix} 0 \\ u \end{bmatrix} \to u - S(0)^* S(\bar\zeta) u \tag{5.57}$$

holding for all $\zeta \in \mathbb{D}^d$, $u \in \mathcal{U}$ and $y \in \mathcal{Y}$ and completely defining the operators $A_V$ and $C_V$ on the whole space $\mathcal{D}$.

Lemma 5.7. Given the Agler decomposition $\{K_1, \ldots, K_d\}$ for a function $S \in \mathcal{SA}_d(\mathcal{U}, \mathcal{Y})$, let $V$ be the isometric operator associated with this decomposition as in (5.48). A block-operator matrix $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ of the form (5.12) is a t.c.f.m. colligation associated with $\{K_1, \ldots, K_d\}$ if and only if

$$\|\mathbf{U}^*\| \le 1, \qquad \mathbf{U}^*|_{\mathcal{D} \oplus \mathcal{Y}} = V \qquad\text{and}\qquad B^*|_{\mathcal{D}^\perp} = 0, \tag{5.58}$$

that is, $\mathbf{U}^*$ is a contractive extension of $V$ from $\mathcal{D} \oplus \mathcal{Y}$ to all of $(\oplus_{i=1}^d \mathcal{H}(K_i)) \oplus \mathcal{Y}$ subject to condition $B^*|_{\mathcal{D}^\perp} = 0$.

Proof. Let $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ be a t.c.f.m. colligation associated with $\{K_1, \ldots, K_d\}$. Then $\mathbf{U}$ is contractive by definition and relations (5.16) and (5.18)–(5.26) hold by Propositions 5.2 and 5.4. Comparing (5.16) and (5.26) with (5.54), (5.55) we see that $A^*|_{\mathcal{D}} = A_V$. Comparing (5.20), (5.21) with (5.56), (5.57) we conclude that $B^*|_{\mathcal{D}} = C_V$. Also, it follows from (5.18) and (5.53) that $C^* = B_V$ and $D^* = D_V$. Finally, it is seen from formula (5.5) that for every $f = \oplus_{i=1}^d f_i \in \mathcal{D}^\perp$,

$$(\mathbf{s}f)_-(z) = \sum_{i=1}^d f_{i,-}(z) \equiv 0$$

so that in particular, $B^* f = (\mathbf{s}f)_-(0) = 0$, which proves the last equality in (5.58).

Conversely, let us assume that a colligation $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ meets all the conditions in (5.58). From the second relation in (5.58) we conclude that equalities (5.53)–(5.57) hold with operators $A_V$, $B_V$, $C_V$ and $D_V$ replaced by $A^*$, $C^*$, $B^*$ and $D^*$ respectively. In other words, we conclude from (5.53) that $C^*$ and $D^*$ are defined exactly as in (5.18), which means (by Proposition 5.3) that they are already of the requisite form. Equalities (5.56), (5.57) tell us that the operator $B^*$ satisfies formulas (5.20), (5.21). As we have seen in the proof of Proposition 5.4, these formulas agree with the second formula in (5.15) defining $B^*$ on the whole $\oplus_{i=1}^d \mathcal{H}(K_i)$. From the third condition in (5.58) we now conclude that $B^*$ is defined by formula (5.15) on the whole $\oplus_{i=1}^d \mathcal{H}(K_i)$ and therefore, $B$ is also of the requisite


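form. The formula (5.54) (with $A^*$ instead of $A_V$) leads us to (5.16), which means that $A$ solves the Gleason problem (5.13). To complete the proof, it remains to show that $A^*$ solves the dual Gleason problem (5.14) or equivalently, that (5.17) holds. Rather than (5.17), we have equality (5.50) (with $A^*$ and $C^*$ instead of $A_V$ and $B_V$ respectively) which can be written in terms of notation (5.28) as

$$A^* h_{\zeta,u} = Z(\zeta)^* h_{\zeta,u} - C^* S(\bar\zeta) u. \tag{5.59}$$

We use (5.59) to show that equality

$$\|h_{\zeta,u}\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} - \|A^* h_{\zeta,u}\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} = \|B^* h_{\zeta,u}\|^2_{\mathcal{U}} \tag{5.60}$$

holds for every $\zeta \in \mathbb{D}^d$ and $u \in \mathcal{U}$. Indeed,

$$\begin{aligned}
\|h_{\zeta,u}\|^2 - \|A^* h_{\zeta,u}\|^2 &= \|h_{\zeta,u}\|^2 - \|Z(\zeta)^* h_{\zeta,u} - C^* S(\bar\zeta) u\|^2 \\
&= \|h_{\zeta,u}\|^2 - \|Z(\zeta)^* h_{\zeta,u}\|^2 + \langle C Z(\zeta)^* h_{\zeta,u},\ S(\bar\zeta) u\rangle \\
&\quad + \langle S(\bar\zeta) u,\ C Z(\zeta)^* h_{\zeta,u}\rangle - \|C^* S(\bar\zeta) u\|^2.
\end{aligned} \tag{5.61}$$

We next express all the terms on the right of (5.61) in terms of the function $S$:

$$\|h_{\zeta,u}\|^2 - \|Z(\zeta)^* h_{\zeta,u}\|^2 = \langle (I_{\mathcal{U}} - S(\bar\zeta)^* S(\bar\zeta)) u,\ u\rangle, \tag{5.62}$$

$$\langle C Z(\zeta)^* h_{\zeta,u},\ S(\bar\zeta) u\rangle = \langle S(\bar\zeta)^* (S(\bar\zeta) - S(0)) u,\ u\rangle, \tag{5.63}$$

$$\langle S(\bar\zeta) u,\ C Z(\zeta)^* h_{\zeta,u}\rangle = \langle (S(\bar\zeta)^* - S(0)^*) S(\bar\zeta) u,\ u\rangle, \tag{5.64}$$

$$\|C^* S(\bar\zeta) u\|^2 = \|S(\bar\zeta) u\|^2 - \|S(0)^* S(\bar\zeta) u\|^2. \tag{5.65}$$

We mention that (5.62) follows from (5.29), (5.30) and (5.33); equality (5.63) is a consequence of (5.35). Taking adjoints in (5.63) gives (5.64), and equality (5.65) is obtained upon letting $y = S(\bar\zeta) u$ in (5.22). We now substitute the four last equalities into (5.61) to get

$$\|h_{\zeta,u}\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} - \|A^* h_{\zeta,u}\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} = \langle R(\zeta) u, u\rangle_{\mathcal{U}} \tag{5.66}$$

where

$$\begin{aligned}
R(\zeta) &= I_{\mathcal{U}} - S(\bar\zeta)^* S(\bar\zeta) + S(\bar\zeta)^* (S(\bar\zeta) - S(0)) \\
&\quad + (S(\bar\zeta)^* - S(0)^*) S(\bar\zeta) - S(\bar\zeta)^* S(\bar\zeta) + S(\bar\zeta)^* S(0) S(0)^* S(\bar\zeta) \\
&= I_{\mathcal{U}} - S(\bar\zeta)^* S(0) - S(0)^* S(\bar\zeta) + S(\bar\zeta)^* S(0) S(0)^* S(\bar\zeta) \\
&= (I_{\mathcal{U}} - S(\bar\zeta)^* S(0))(I_{\mathcal{U}} - S(0)^* S(\bar\zeta)).
\end{aligned}$$

It is readily seen from (5.21) that $B^* h_{\zeta,u} = u - S(0)^* S(\bar\zeta) u$ and therefore

$$\|B^* h_{\zeta,u}\|^2_{\mathcal{U}} = \|u - S(0)^* S(\bar\zeta) u\|^2_{\mathcal{U}} = \langle R(\zeta) u, u\rangle_{\mathcal{U}}, \tag{5.67}$$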


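which together with (5.66) completes the proof of (5.60). Writing (5.60) as

$$\langle (I - AA^* - BB^*) h_{\zeta,u},\ h_{\zeta,u}\rangle = 0$$

and observing that the operator $I - AA^* - BB^*$ is positive semidefinite (since $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is a contraction), we conclude that

$$(I - AA^* - BB^*) h_{\zeta,u} = 0 \quad\text{for all } \zeta \in \mathbb{D}^d,\ u \in \mathcal{U}. \tag{5.68}$$

Since the operators $C$ and $D$ satisfy the first equality in (5.19) and since $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is a contraction, we have $AC^* + BD^* = 0$. We now combine this latter equality with (5.40), (5.67) and formula (5.18) for $D^*$ to get

$$h_{0,u} = Bu = B(B^* h_{\zeta,u} + S(0)^* S(\bar\zeta) u) = BB^* h_{\zeta,u} + BD^* S(\bar\zeta) u = BB^* h_{\zeta,u} - AC^* S(\bar\zeta) u. \tag{5.69}$$

We now apply the operator $A$ to both parts of (5.59):

$$AA^* h_{\zeta,u} = A Z(\zeta)^* h_{\zeta,u} - AC^* S(\bar\zeta) u$$

and solve the obtained identity for $A Z(\zeta)^* h_{\zeta,u}$ with further simplifications based on (5.68) and (5.69):

$$A Z(\zeta)^* h_{\zeta,u} = AA^* h_{\zeta,u} + AC^* S(\bar\zeta) u = h_{\zeta,u} - BB^* h_{\zeta,u} - BD^* S(\bar\zeta) u = h_{\zeta,u} - h_{0,u}.$$

Substituting (5.28) into the latter equality we get (5.17), which completes the proof. □

As a consequence of Lemma 5.7 we get a description of all t.c.f.m. colligations associated with a given Agler decomposition of a Schur-Agler function.

Lemma 5.8. Let $\{K_1, \ldots, K_d\}$ be a fixed Agler decomposition of a function $S \in \mathcal{SA}_d(\mathcal{U}, \mathcal{Y})$. Let $V$ be the associated isometry defined in (5.48) with the defect spaces $\mathcal{D}^\perp$ and $\mathcal{R}^\perp$ defined in (5.5), (5.6). Then all t.c.f.m. colligations associated with $\{K_1, \ldots, K_d\}$ are of the form

$$\mathbf{U}^* = \begin{bmatrix} X & 0 \\ 0 & V \end{bmatrix} : \begin{bmatrix} \mathcal{D}^\perp \\ \mathcal{D} \oplus \mathcal{Y} \end{bmatrix} \to \begin{bmatrix} \mathcal{R}^\perp \\ \mathcal{R} \oplus \mathcal{U} \end{bmatrix} \tag{5.70}$$

where we have identified $\begin{bmatrix} \oplus_{i=1}^d \mathcal{H}(K_i) \\ \mathcal{Y} \end{bmatrix}$ with $\begin{bmatrix} \mathcal{D}^\perp \\ \mathcal{D} \oplus \mathcal{Y} \end{bmatrix}$ and $\begin{bmatrix} \oplus_{i=1}^d \mathcal{H}(K_i) \\ \mathcal{U} \end{bmatrix}$ with $\begin{bmatrix} \mathcal{R}^\perp \\ \mathcal{R} \oplus \mathcal{U} \end{bmatrix}$ and where $X$ is an arbitrary contraction from $\mathcal{D}^\perp$ into $\mathcal{R}^\perp$. The colligation $\mathbf{U}$ is isometric (coisometric, unitary) if and only if $X$ is coisometric (isometric, unitary).

For the proof, it is enough to recall that $V$ is unitary as an operator from $\mathcal{D}_V = \mathcal{D} \oplus \mathcal{Y}$ onto $\mathcal{R}_V = \mathcal{R} \oplus \mathcal{U}$ and then to refer to Lemma 5.7. The meaning of description (5.70) is clear: the operators $B^*$, $C^*$, $D^*$ and the restriction of $A^*$ to the subspace $\mathcal{D}$ in the operator colligation $\mathbf{U}^*$ are prescribed. The objective is to guarantee that $\mathbf{U}^*$ is contractive by suitably defining $A^*$ on $\mathcal{D}^\perp$. Lemma 3.5 states that $X = A^*|_{\mathcal{D}^\perp}$ must be a contraction with range contained in $\mathcal{R}^\perp$.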
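The block-diagonal shape of (5.70) rests on a general fact about contractive extensions of an isometry that is onto its target subspace. The sketch below records that fact in the present setting; the letters $T$, $M$, $N$ and $w$ are shorthand introduced only for this illustration.

```latex
% Setting: T := U^* is a contraction whose restriction to M := \mathcal D \oplus \mathcal Y
% equals V, and V maps M isometrically onto N := \mathcal R \oplus \mathcal U.
%
% Step 1: for x in M, \|Tx\| = \|x\| together with \|T\| \le 1 gives T^*T x = x, because
\[
\bigl\|(I - T^*T)^{1/2} x\bigr\|^2 = \|x\|^2 - \|Tx\|^2 = 0 .
\]
% Step 2: for w in M^\perp = \mathcal D^\perp and x in M,
\[
\langle Tw, Tx \rangle = \langle w, T^*T x \rangle = \langle w, x \rangle = 0 ,
\]
% so Tw is orthogonal to TM = N, i.e. T maps \mathcal D^\perp into N^\perp = \mathcal R^\perp.
%
% Consequently T = diag(X, V) with X := T|_{\mathcal D^\perp}, and since V is unitary
% from M onto N, \|T\| \le 1 holds if and only if \|X\| \le 1.
```

Because the lower-right block $V$ is already unitary, the isometric, coisometric or unitary character of $\mathbf{U}^*$ is decided by $X$ alone, which is the content of the last sentence of Lemma 5.8.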

We are now ready to formulate the multivariable counterpart of Theorem 1.6.

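Theorem 5.9. Let $S$ be a function in the Schur-Agler class $\mathcal{SA}_d(\mathcal{U}, \mathcal{Y})$ with given Agler decomposition $\{K_1, \ldots, K_d\}$. Then
1. There exists a t.c.f.m. colligation $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ associated with $\{K_1, \ldots, K_d\}$.
2. Every t.c.f.m. colligation $\mathbf{U}$ associated with $\{K_1, \ldots, K_d\}$ is weakly unitary and closely connected and furthermore, $S(z) = D + C(I - Z(z)A)^{-1} Z(z) B$.
3. Any weakly unitary closely connected colligation $\widetilde{\mathbf{U}}$ of the form (2.1) with the transfer function equal to $S$ is unitarily equivalent to some t.c.f.m. colligation $\mathbf{U}$ for $S$.

Proof. Part (1) is contained in Lemma 5.8. Part (2) was proved in Proposition 5.6. To prove part (3) we assume that

$$\widetilde{\mathbf{U}} = \begin{bmatrix} \widetilde{A} & \widetilde{B} \\ \widetilde{C} & D \end{bmatrix} : \begin{bmatrix} \oplus_{i=1}^d \widetilde{\mathcal{X}}_i \\ \mathcal{U} \end{bmatrix} \to \begin{bmatrix} \oplus_{i=1}^d \widetilde{\mathcal{X}}_i \\ \mathcal{Y} \end{bmatrix}$$

is a closely connected weakly unitary colligation with the state space $\oplus_{i=1}^d \widetilde{\mathcal{X}}_i$ and such that $S(z) = D + \widetilde{C}(I - Z(z)\widetilde{A})^{-1} Z(z) \widetilde{B}$. Then $S$ admits Agler decomposition (1.12) with kernels $K_i$ defined by

$$K_i(z,\zeta) = \begin{bmatrix} K_i^L(z,\zeta) & K_i^{LR}(z,\zeta) \\ K_i^{RL}(z,\zeta) & K_i^R(z,\zeta) \end{bmatrix} = \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} P_{\widetilde{\mathcal{X}}_i} \begin{bmatrix} (I - \widetilde{A}^* Z(\zeta)^*)^{-1} \widetilde{C}^* & (I - \widetilde{A} Z(\zeta)^*)^{-1} \widetilde{B} \end{bmatrix}$$

for $i = 1, \ldots, d$. Let $\mathcal{H}(K_i)$ be the associated reproducing kernel Hilbert spaces and let $\mathcal{I}_i: \widetilde{\mathcal{X}}_i \to \widetilde{\mathcal{X}} = \oplus_{j=1}^d \widetilde{\mathcal{X}}_j$ be the inclusion maps

$$\mathcal{I}_i: x_i \to 0 \oplus \cdots \oplus 0 \oplus x_i \oplus 0 \oplus \cdots \oplus 0.$$

Since the realization is closely connected, the operators $U_i: \widetilde{\mathcal{X}}_i \to \mathcal{H}(K_i)$ given by

$$U_i: x_i \to \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \mathcal{I}_i x_i \tag{5.71}$$

are unitary. Let us define $A \in \mathcal{L}(\oplus_{i=1}^d \mathcal{H}(K_i))$ by

$$A \left(\oplus_{i=1}^d U_i\right) = \left(\oplus_{i=1}^d U_i\right) \widetilde{A}. \tag{5.72}$$

In more detail: $A = [A_{ij}]_{i,j=1}^d$ where

$$A_{ij}: \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \mathcal{I}_j x_j \to \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \mathcal{I}_i \widetilde{A}_{ij} x_j. \tag{5.73}$$

Since the operators (5.71) are unitary, we have from (5.72)

$$A^* \left(\oplus_{i=1}^d U_i\right) = \left(\oplus_{i=1}^d U_i\right) \widetilde{A}^*$$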


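and therefore,

$$A_{ji}^*: \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \mathcal{I}_j x_j \to \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \mathcal{I}_i \widetilde{A}_{ji}^* x_j. \tag{5.74}$$

Take the generic element $f$ of $\oplus_{i=1}^d \mathcal{H}(K_i)$ in the form

$$f(z) = \bigoplus_{j=1}^d \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \mathcal{I}_j x_j \quad\text{and let}\quad x := \bigoplus_{j=1}^d x_j \in \widetilde{\mathcal{X}}. \tag{5.75}$$

By (5.73) and (5.75), we have

$$[Af]_i(z) = \sum_{j=1}^d A_{ij}\left(\begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \mathcal{I}_j x_j\right) = \sum_{j=1}^d \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \mathcal{I}_i \widetilde{A}_{ij} x_j = \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \mathcal{I}_i [\widetilde{A} x]_i. \tag{5.76}$$

Similarly, we get from (5.74) and (5.75)

$$[A^* f]_j(z) = \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \mathcal{I}_j [\widetilde{A}^* x]_j. \tag{5.77}$$

For $f$ and $x$ as in (5.75), we have

$$(\mathbf{s}f)(z) = \sum_{j=1}^d \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \mathcal{I}_j x_j = \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} \sum_{j=1}^d \mathcal{I}_j x_j = \begin{bmatrix} \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \\ \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \end{bmatrix} x$$

which together with (5.76) and (5.77) gives

$$\begin{aligned}
(\mathbf{s}f)_+(z) - (\mathbf{s}f)_+(0) &= \widetilde{C}(I - Z(z)\widetilde{A})^{-1} x - \widetilde{C} x = \widetilde{C}(I - Z(z)\widetilde{A})^{-1} Z(z) \widetilde{A} x \\
&= \sum_{j=1}^d z_j \cdot \widetilde{C}(I - Z(z)\widetilde{A})^{-1} \mathcal{I}_j [\widetilde{A} x]_j = \sum_{j=1}^d z_j \cdot [Af]_{j,+}(z),
\end{aligned}$$

$$\begin{aligned}
(\mathbf{s}f)_-(z) - (\mathbf{s}f)_-(0) &= \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} x - \widetilde{B}^* x = \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} Z(z) \widetilde{A}^* x \\
&= \sum_{j=1}^d z_j \cdot \widetilde{B}^*(I - Z(z)\widetilde{A}^*)^{-1} \mathcal{I}_j [\widetilde{A}^* x]_j = \sum_{j=1}^d z_j \cdot [A^* f]_{j,-}(z).
\end{aligned}$$

Since $f$ is the generic element of $\oplus_{i=1}^d \mathcal{H}(K_i)$, the two latter equalities mean that the operators $A$ and $A^*$ solve Gleason problems (5.13) and (5.14), respectively. On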


the other hand, for an $x$ of the form (5.75), for operators $U_i$ defined in (5.71) and for the operators $C$ and $B^*$ defined on $\oplus_{i=1}^d \mathcal{H}(K_i)$ by formulas (5.15), we have

$$C\left(\oplus_{i=1}^d U_i\right) x = \sum_{i=1}^d (U_i x_i)_+(0) = \sum_{i=1}^d \widetilde{C}(I - Z(0)\widetilde{A})^{-1} \mathcal{I}_i x_i = \widetilde{C} \sum_{i=1}^d \mathcal{I}_i x_i = \widetilde{C} x$$

and quite similarly,

$$B^*\left(\oplus_{i=1}^d U_i\right) x = \sum_{i=1}^d (U_i x_i)_-(0) = \sum_{i=1}^d \widetilde{B}^*(I - Z(0)\widetilde{A}^*)^{-1} \mathcal{I}_i x_i = \widetilde{B}^* \sum_{i=1}^d \mathcal{I}_i x_i = \widetilde{B}^* x.$$

Thus, $C(\oplus_{i=1}^d U_i) = \widetilde{C}$ and $B^*(\oplus_{i=1}^d U_i) = \widetilde{B}^*$ (which is equivalent to $B = (\oplus_{i=1}^d U_i)\widetilde{B}$ as the operator $\oplus_{i=1}^d U_i$ is unitary). The two last equalities along with (5.72) mean that the realization $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is unitarily equivalent to the original realization $\widetilde{\mathbf{U}} = \begin{bmatrix} \widetilde{A} & \widetilde{B} \\ \widetilde{C} & D \end{bmatrix}$ via the unitary operator $\oplus_{i=1}^d U_i$. Therefore this realization $\mathbf{U}$ is also weakly unitary. Also it is a functional-model realization since the state space $\mathcal{X}$ is the functional-model state space $\oplus_{i=1}^d \mathcal{H}(K_i)$, the operators $B$ and $C$ are given by (5.15) and the state space operator $A$ solves the Gleason problems in (5.13), (5.14). □

We next present the analog of Theorem 3.7 for the two-component setting for the case of unitary colligation matrices; the single-variable special case ($d = 1$) of this result amounts to Theorem 4 in [40]. Here we use notation as in (5.1) and (5.9).

Theorem 5.10. Suppose that we are given a collection of $\mathcal{L}(\mathcal{Y} \oplus \mathcal{U})$-valued positive kernels

$$\{K_1, \ldots, K_d\} = \left\{ \begin{bmatrix} K_1^L & K_1^{LR} \\ K_1^{RL} & K_1^R \end{bmatrix}, \ldots, \begin{bmatrix} K_d^L & K_d^{LR} \\ K_d^{RL} & K_d^R \end{bmatrix} \right\}.$$

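Then $\{K_1, \ldots, K_d\}$ is the Agler decomposition for some unitary t.c.f.m. for some Schur-Agler-class function $S \in \mathcal{SA}_d(\mathcal{U}, \mathcal{Y})$ if and only if the following conditions hold:
1. The structured Gleason problem (5.13)–(5.14) has an isometric solution $A: \oplus_{i=1}^d \mathcal{H}(K_i) \to \oplus_{i=1}^d \mathcal{H}(K_i)$ in the sense that

$$\|Af\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} = \|f\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} - \|(\mathbf{s}f)_+(0)\|^2_{\mathcal{Y}}, \qquad \|A^* f\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} = \|f\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} - \|(\mathbf{s}f)_-(0)\|^2_{\mathcal{U}} \tag{5.78}$$

for all $f \in \oplus_{i=1}^d \mathcal{H}(K_i)$.
2. The equality of range-defect dimensions

$$\dim(\operatorname{Ran} E_0 \mathbf{s}_+)^\perp = \dim(\operatorname{Ran} E_0 \mathbf{s}_-)^\perp \tag{5.79}$$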


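holds, where $E_0$ is the operator of evaluation at zero and where the maps

$$\mathbf{s}_+: \bigoplus_{i=1}^d \mathcal{H}(K_i) \to \mathcal{H}\left(\sum_{i=1}^d K_i^L\right) \quad\text{and}\quad \mathbf{s}_-: \bigoplus_{i=1}^d \mathcal{H}(K_i) \to \mathcal{H}\left(\sum_{i=1}^d K_i^R\right)$$

are given by

$$\mathbf{s}_+: f \to \sum_{k=1}^d f_{k,+}, \qquad \mathbf{s}_-: f \to \sum_{k=1}^d f_{k,-}.$$

Moreover, if this is the case and if we define operators $C: \oplus_{i=1}^d \mathcal{H}(K_i) \to \mathcal{Y}$ and $B: \mathcal{U} \to \oplus_{i=1}^d \mathcal{H}(K_i)$ by

$$C: f \to (\mathbf{s}f)_+(0), \qquad B^*: f \to (\mathbf{s}f)_-(0), \tag{5.80}$$

then there exists an operator $D: \mathcal{U} \to \mathcal{Y}$ so that $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is a unitary t.c.f.m. for $S$ associated with the Agler decomposition $\{K_1, \ldots, K_d\}$ for $S$.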

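Proof. Necessity of the existence of a solution of the structured Gleason problem (5.13)–(5.14) is immediate from the existence result, part (1) of Theorem 5.9, together with the definition of t.c.f.m. associated with a given Agler decomposition $\{K_1, \ldots, K_d\}$. The additional conditions (5.78), (5.79) are a consequence of the assumption that the t.c.f.m. colligation matrix $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is unitary.

We next suppose that we are given a collection of kernels $\{K_1, \ldots, K_d\}$ as in the statement of the Theorem. Define operators $B$ and $C$ as in (5.80). The hypothesis (5.78) tells us that the block operators

$$\begin{bmatrix} A \\ C \end{bmatrix}: \bigoplus_{i=1}^d \mathcal{H}(K_i) \to \bigoplus_{i=1}^d \mathcal{H}(K_i) \oplus \mathcal{Y} \quad\text{and}\quad \begin{bmatrix} A^* \\ B^* \end{bmatrix}: \bigoplus_{i=1}^d \mathcal{H}(K_i) \to \bigoplus_{i=1}^d \mathcal{H}(K_i) \oplus \mathcal{U}$$

are isometric. We seek to define an operator $D: \mathcal{U} \to \mathcal{Y}$ in such a way that the resulting colligation matrix $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is unitary. The isometric properties of $\begin{bmatrix} A \\ C \end{bmatrix}$ and of $\begin{bmatrix} A^* \\ B^* \end{bmatrix}$ tell us that there exist isometries $\alpha: \mathcal{D}_A \to \mathcal{Y}$ and $\beta: \mathcal{D}_{A^*} \to \mathcal{U}$ (where we have set $\mathcal{D}_A$ equal to the closure of the range of the operator $D_A = (I - A^*A)^{1/2}$ and $\mathcal{D}_{A^*}$ equal to the closure of the range of the operator $D_{A^*} := (I - AA^*)^{1/2}$) so that $C = \alpha D_A$ and $B^* = \beta D_{A^*}$. Note that

$$\operatorname{Ran} \alpha = \operatorname{Ran} C = \operatorname{Ran} E_0 \mathbf{s}_+, \qquad \operatorname{Ran} \beta = \operatorname{Ran} B^* = \operatorname{Ran} E_0 \mathbf{s}_-.$$

The dimension assumption (5.79) assures us that we can construct an isometry $\gamma$ from $(\operatorname{Ran} \beta)^\perp$ onto $(\operatorname{Ran} \alpha)^\perp$. Let us now define an operator $D: \mathcal{U} \to \mathcal{Y}$ by

$$Du = \begin{cases} -\alpha A^* \beta^* u & \text{if } u \in \operatorname{Ran} \beta, \\ \gamma u & \text{if } u \in (\operatorname{Ran} \beta)^\perp \end{cases}$$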


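and extend it by linearity to all of $\mathcal{U}$. Then it is easily checked that the colligation matrix $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is unitary. We are now ready to define the Schur-Agler class function $S(z)$ by

$$S(z) = D + C(I - Z(z)A)^{-1} Z(z) B.$$

We let $\mathcal{I}_k$ be the injection of $\mathcal{H}(K_k)$ into $\oplus_{i=1}^d \mathcal{H}(K_i)$ as in the proof of Theorem 3.7 (but where now $K_k = \begin{bmatrix} K_k^L & K_k^{LR} \\ K_k^{RL} & K_k^R \end{bmatrix}$ and hence elements $f_k = \begin{bmatrix} f_{k,+} \\ f_{k,-} \end{bmatrix}$ of $\mathcal{H}(K_k)$ consist of two components). We next argue that

$$f_k(z) = \begin{bmatrix} C(I - Z(z)A)^{-1} \\ B^*(I - Z(z)A^*)^{-1} \end{bmatrix} \mathcal{I}_k f_k \tag{5.81}$$

or, equivalently, that

$$f_{k,+}(z) = C(I - Z(z)A)^{-1} \mathcal{I}_k f_k, \tag{5.82}$$

$$f_{k,-}(z) = B^*(I - Z(z)A^*)^{-1} \mathcal{I}_k f_k \tag{5.83}$$

for all $f_k = \begin{bmatrix} f_{k,+} \\ f_{k,-} \end{bmatrix}$ in $\mathcal{H}(K_k)$. It suffices to note that (5.82) follows in the same way as (3.32) in the proof of Theorem 3.7 based on the first two-component Gleason-problem identity (5.13). Similarly the second identity (5.83) follows in the same way by making use of the second two-component Gleason-problem identity (5.14), and hence (5.81) follows. We next make use of the reproducing-kernel property of $K_k$ to get

$$\begin{aligned}
\left\langle f_k,\ K_k(\cdot,\zeta)\begin{bmatrix} y \\ u \end{bmatrix}\right\rangle_{\mathcal{H}(K_k)} &= \left\langle \begin{bmatrix} f_{k,+}(\zeta) \\ f_{k,-}(\zeta) \end{bmatrix},\ \begin{bmatrix} y \\ u \end{bmatrix}\right\rangle_{\mathcal{Y} \oplus \mathcal{U}} \\
&= \left\langle \begin{bmatrix} C(I - Z(\zeta)A)^{-1} \\ B^*(I - Z(\zeta)A^*)^{-1} \end{bmatrix} \mathcal{I}_k f_k,\ \begin{bmatrix} y \\ u \end{bmatrix}\right\rangle_{\mathcal{Y} \oplus \mathcal{U}} \quad\text{(by (5.81))} \\
&= \left\langle f_k,\ \mathcal{I}_k^* \begin{bmatrix} (I - A^* Z(\zeta)^*)^{-1} C^* & (I - A Z(\zeta)^*)^{-1} B \end{bmatrix} \begin{bmatrix} y \\ u \end{bmatrix}\right\rangle_{\mathcal{H}(K_k)}
\end{aligned}$$

from which we conclude that

$$K_k(\cdot,\zeta)\begin{bmatrix} y \\ u \end{bmatrix} = \mathcal{I}_k^* \begin{bmatrix} (I - A^* Z(\zeta)^*)^{-1} C^* & (I - A Z(\zeta)^*)^{-1} B \end{bmatrix} \begin{bmatrix} y \\ u \end{bmatrix}.$$

From the general identity (5.81) we conclude that

$$K_k(z,\zeta) = \begin{bmatrix} C(I - Z(z)A)^{-1} \\ B^*(I - Z(z)A^*)^{-1} \end{bmatrix} P_k \begin{bmatrix} (I - A^* Z(\zeta)^*)^{-1} C^* & (I - A Z(\zeta)^*)^{-1} B \end{bmatrix} \tag{5.84}$$

(where we have set $P_k = \mathcal{I}_k \mathcal{I}_k^*$).

On the other hand, since $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is unitary (and hence in particular weakly unitary), it follows that $S$ admits an Agler decomposition (1.12) with kernels given by

$$\begin{aligned}
\widetilde{K}_k(z,\zeta) &= \begin{bmatrix} \widetilde{K}_k^L(z,\zeta) & \widetilde{K}_k^{LR}(z,\zeta) \\ \widetilde{K}_k^{RL}(z,\zeta) & \widetilde{K}_k^R(z,\zeta) \end{bmatrix} \\
&= \begin{bmatrix} C(I - Z(z)A)^{-1} \\ B^*(I - Z(z)A^*)^{-1} \end{bmatrix} P_k \begin{bmatrix} (I - A^* Z(\zeta)^*)^{-1} C^* & (I - A Z(\zeta)^*)^{-1} B \end{bmatrix} \\
&= K_k(z,\zeta) \quad\text{(by (5.84))}
\end{aligned}$$

and it follows that $\{K_1(z,\zeta), \ldots, K_d(z,\zeta)\}$ is an Agler decomposition for $S$. □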

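It is also possible to give a "weakly unitary" version of Theorem 5.10.

Theorem 5.11. Given a collection of $\mathcal{L}(\mathcal{Y} \oplus \mathcal{U})$-valued positive kernels

$$\{K_1, \ldots, K_d\} = \left\{ \begin{bmatrix} K_1^L & K_1^{LR} \\ K_1^{RL} & K_1^R \end{bmatrix}, \ldots, \begin{bmatrix} K_d^L & K_d^{LR} \\ K_d^{RL} & K_d^R \end{bmatrix} \right\},$$

let $\mathcal{D}$ and $\mathcal{R}$ be the subspaces of $\oplus_{i=1}^d \mathcal{H}(K_i)$ defined in (5.3) and (5.4). Then $\{K_1, \ldots, K_d\}$ is the Agler decomposition for some t.c.f.m. for some function $S \in \mathcal{SA}_d(\mathcal{U}, \mathcal{Y})$ if and only if
1. The structured Gleason problem (5.13)–(5.14) has a solution $A: \oplus_{i=1}^d \mathcal{H}(K_i) \to \oplus_{i=1}^d \mathcal{H}(K_i)$ which is weakly unitary in the sense that the equalities in (5.78) hold with $\le$ in place of $=$ for all $f \in \oplus_{i=1}^d \mathcal{H}(K_i)$ and in addition the equalities

$$\begin{aligned}
\|P_{\mathcal{D}} A f\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} &= \|f\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} - \|(\mathbf{s}f)_+(0)\|^2_{\mathcal{Y}} \quad\text{for all } f \in \mathcal{R}, \\
\|P_{\mathcal{R}} A^* g\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} &= \|g\|^2_{\oplus_{i=1}^d \mathcal{H}(K_i)} - \|(\mathbf{s}g)_-(0)\|^2_{\mathcal{U}} \quad\text{for all } g \in \mathcal{D}
\end{aligned} \tag{5.85}$$

hold, where $P_{\mathcal{D}}$ and $P_{\mathcal{R}}$ denote the orthogonal projections onto $\mathcal{D}$ and $\mathcal{R}$.
2. The equality (5.79) of range-defect dimensions holds.

Moreover, if this is the case and if we define operators $C$ and $B$ as in (5.80), then there exists an operator $D: \mathcal{U} \to \mathcal{Y}$ so that $\mathbf{U} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ is a t.c.f.m. for $S$ associated with the Agler decomposition $\{K_1, \ldots, K_d\}$ for $S$.

Proof. As the proof is mostly the same as that of Theorem 5.10, we just sketch the main ideas. The one difference from the proof of Theorem 5.10 is that in the sufficiency part we start with isometric operators

$$\begin{bmatrix} \widetilde{A} \\ C \end{bmatrix}: \mathcal{R} \to \mathcal{D} \oplus \mathcal{Y} \quad\text{and}\quad \begin{bmatrix} \widetilde{A}^* \\ B^* \end{bmatrix}: \mathcal{D} \to \mathcal{R} \oplus \mathcal{U}$$

where we have set $\widetilde{A} = P_{\mathcal{D}} A|_{\mathcal{R}}$. We use the same formulas as in the previous theorem (with $\widetilde{A}$ instead of $A$) to construct $D$ and then invoke Lemma 5.7 to show that the operator

$$\mathbf{U} = \begin{bmatrix} X & 0 & 0 \\ 0 & \widetilde{A} & B \\ 0 & C & D \end{bmatrix}$$

is a unitary t.c.f.m. for $S(z) = D + C(I - Z(z)A)^{-1} Z(z) B$ associated with the Agler decomposition $\{K_1, \ldots, K_d\}$. □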


Remark 5.12. As was the case for left Agler decompositions associated with c.f.m.'s (see Subsection 3.2), not much is known about the construction and structure of Agler decompositions associated with t.c.f.m.'s. However, in [26] there appears an example of an Agler decomposition (arising from an explicit closely connected unitary structured colligation matrix $\mathbf{U}$) for which both $\mathcal{D}$ and $\mathcal{R}$ are proper subspaces of $\oplus_{k=1}^d \mathcal{H}(K_k)$ of codimension 1. Left open there (and here) is whether there exists an example where $\mathcal{D} \ne \oplus_{k=1}^d \mathcal{H}(K_k)$ but $\mathcal{R} = \oplus_{k=1}^d \mathcal{H}(K_k)$ (or vice versa). More generally, we are lacking an example where $\mathcal{D}$ and $\mathcal{R}$ have unequal codimensions in $\oplus_{i=1}^d \mathcal{H}(K_i)$, i.e., an example of an Agler decomposition for which no associated t.c.f.m. can be unitary.

References

[1] V.M. Adamjan and D.Z. Arov, On unitary coupling of semi-unitary operators, Dokl. Akad. Nauk. Arm. SSR 43 (1966), no. 5, 257–263.
[2] J. Agler, On the representation of certain holomorphic functions defined on a polydisk, in Topics in Operator Theory: Ernst D. Hellinger Memorial Volume (eds. L. de Branges, I. Gohberg and J. Rovnyak), Oper. Theory Adv. Appl. OT 48, pp. 47–66, Birkhäuser Verlag, Basel, 1990.
[3] J. Agler and J.E. McCarthy, Nevanlinna-Pick interpolation on the bidisk, J. Reine Angew. Math. 506 (1999), 191–204.
[4] J. Agler and J.E. McCarthy, Complete Nevanlinna-Pick kernels, J. Funct. Anal. 175 (2000), no. 1, 111–124.
[5] D. Alpay and C. Dubi, Backward shift operator and finite-dimensional de Branges-Rovnyak spaces in the ball, Linear Algebra Appl. 371 (2003), 277–285.
[6] D. Alpay and C. Dubi, A realization theorem for rational functions of several complex variables, Systems Control Lett. 49 (2003), no. 3, 225–229.
[7] D. Alpay, A. Dijksma, J. Rovnyak, and H. de Snoo, Schur functions, operator colligations, and reproducing kernel Pontryagin spaces, Oper. Theory Adv. Appl. OT 96, Birkhäuser Verlag, Basel, 1997.
[8] D. Alpay and D.S. Kalyuzhnyi-Verbovetzkyi, Matrix-J-unitary non-commutative rational formal power series, in The State Space Method: Generalizations and Applications (eds. D. Alpay and I. Gohberg), pp. 49–113, Oper. Theory Adv. Appl. OT 161, Birkhäuser Verlag, Basel-Boston-Berlin, 2006.
[9] D. Alpay and H.T. Kaptanoğlu, Gleason's problem and homogeneous interpolation in Hardy and Dirichlet-type spaces of the ball, J. Math. Anal. Appl. 276 (2002), no. 2, 654–672.
[10] C.-G. Ambrozie and D. Timotin, A von Neumann type inequality for certain domains in ℂ^n, Proc. Amer. Math. Soc. 131 (2003), no. 3, 859–869.
[11] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc. 68 (1950), 337–404.
[12] D.Z. Arov and O.J. Staffans, A Kreĭn-space coordinate-free version of the de Branges-Rovnyak complementary space, J. Funct. Anal. 256 (2009), no. 12, 3892–3915.


[13] D.Z. Arov and O.J. Staffans, Two canonical passive state/signal shift realizations of passive discrete-time behaviors, J. Funct. Anal. 257 (2009), no. 8, 2573–2634.
[14] J.A. Ball, Linear systems, operator model theory and scattering: multivariable generalizations, in Operator theory and its applications, 151–178, Fields Inst. Commun. 25, Amer. Math. Soc., Providence, RI, 2000.
[15] J.A. Ball, A. Biswas, Q. Fang and S. ter Horst, Multivariable generalizations of the Schur class: positive kernel characterization and transfer function realization, in Recent advances in operator theory and applications (eds. T. Ando, R.E. Curto, I.B. Jung, and W.Y. Lee), pp. 17–79, Oper. Theory Adv. Appl. OT 187, Birkhäuser Verlag, 2009.
[16] J.A. Ball and V. Bolotnikov, Realization and interpolation for Schur-Agler-class functions on domains with matrix polynomial defining function in ℂ^n, J. Funct. Anal. 213 (2004), no. 1, 45–87.
[17] J.A. Ball and V. Bolotnikov, Canonical de Branges-Rovnyak model transfer-function realization for multivariable Schur-class functions, in Hilbert Spaces of Analytic Functions (eds. J. Mashreghi, T. Ransford, and K. Seip), CRM Proceedings & Lecture Notes 51, Amer. Math. Soc., Providence, 2010.
[18] J.A. Ball and V. Bolotnikov, Canonical transfer-function realization for Schur-Agler-class functions on domains with matrix polynomial defining functions in ℂ^n, in Recent Progress in Operator Theory and Its Applications (eds. J.A. Ball, R. Curto, S. Grudsky, J.W. Helton, R. Quiroga-Barranco, and N. Vasilevski), Proceedings of the International Workshop on Operator Theory and Applications (IWOTA), Guanajuato, Mexico, Oper. Theory Adv. Appl. OT 220, Birkhäuser, Springer Basel AG, 2012, 23–55.
[19] J.A. Ball and V. Bolotnikov, Canonical transfer-function realization for Schur multipliers on the Drury-Arveson space and models for commuting row contractions, Ind. U. Math. J., to appear.
[20] J.A. Ball, V. Bolotnikov and Q. Fang, Transfer-function realization for multipliers of the Arveson space, J. Math. Anal. Appl. 333 (2007), no. 1, 68–92.
[21] J.A. Ball, V. Bolotnikov and Q. Fang, Schur-class multipliers on the Fock space: de Branges-Rovnyak reproducing kernel spaces and transfer-function realizations, in Operator Theory, Structured Matrices, and Dilations: Tiberiu Constantinescu Memorial Volume (eds. M. Bakonyi, A. Gheondea, M. Putinar and J. Rovnyak), pp. 85–114, Theta Press, Bucharest, 2007.
[22] J.A. Ball, K.F. Clancey, and V. Vinnikov, Concrete interpolation of meromorphic matrix functions on Riemann surfaces, in Reproducing kernel spaces and applications, pp. 77–134, Oper. Theory Adv. Appl. OT 143, Birkhäuser Verlag, 2003.
[23] J.A. Ball, I. Gohberg, and L. Rodman, Interpolation of rational matrix functions, Oper. Theory Adv. Appl. OT 45, Birkhäuser Verlag, Basel, 1990.
[24] J.A. Ball, G. Groenewald and T. Malakorn, Structured noncommutative multidimensional linear systems, SIAM J. Control Optim. 44 (2005), no. 4, 1474–1528.
[25] J.A. Ball, G. Groenewald and T. Malakorn, Conservative structured noncommutative multidimensional linear systems, in The State Space Method: Generalizations and Applications (eds. D. Alpay and I. Gohberg), pp. 179–223, Oper. Theory Adv. Appl. OT 161, Birkhäuser Verlag, Basel, 2006.

120

J.A. Ball and V. Bolotnikov

[26] J.A. Ball, D.S. Kaliuzhnyi-Verbovetskyi, C. Sadosky, V. Vinnikov, Scattering systems with several evolutions and formal reproducing kernel Hilbert spaces, in preparation. [27] J.A. Ball, C. Sadosky, V. Vinnikov, Scattering systems with several evolutions and multidimensional input/.state/output systems, Integral Equations Operator Theory 52 (2005), no. 3, 323–393. [28] J.A. Ball and T.T. Trent, Unitary colligations, reproducing kernel Hilbert spaces and Nevanlinna–Pick interpolation in several variables, J. Funct. Anal., 157 (1998), no. 1, 1–61. [29] J.A. Ball, T.T. Trent and V. Vinnikov, Interpolation and commutant lifting for multipliers on reproducing kernel Hilbert spaces, in Operator Theory and Analysis: The M.A. Kaashoek Anniversary Volume (eds. H. Bart, I. Gohberg and A.C.M. Ran), pp. 89–138, Oper. Theory Adv. Appl. OT 122, Birkh¨ auser Verlag, Basel, 2001. [30] J.A. Ball and V. Vinnikov, Zero-pole interpolation for meromorphic matrix functions on an algebraic curve and transfer functions for 2D systems, Acta Appl. Math. 45 (1996) no. 3, 239–316. [31] J.A. Ball and V. Vinnikov, Zero-pole interpolation for matrix meromorphic functions on a compact Riemann surface and a matrix Fay trisecant identity, Amer. J. Math. 121 (1999) no. 4, 841–888. [32] J.A. Ball and V. Vinnikov, Overdetermined multidimensional systems: state space and frequency domain methods, in Mathematical systems theory in biology, communications, computation, and ﬁnance (eds. J. Rosenthal and D.S. Gilliam), pp. 63–119, IMA Vol. Math. Appl. 134, Springer, New York, 2003. [33] J.A. Ball and V. Vinnikov, Lax-Phillips scattering and conservative linear systems: A Cuntz-algebra multidimensional setting, Memoirs of the American Mathematical Society, 178 no. 837, American Mathematical Society, Providence, 2005. [34] H. Bart, I. Gohberg, and M.A. Kaashoek, Minimal factorization of matrix and Operator functions, Oper. Theory Adv. Appl. OT 1, Birkh¨ auser Verlag, 1979. [35] H. Bart, I. Gohberg, M.A. 
Kaashoek, and A.C.M. Ran, Factorization of matrix and operator functions: the state space method, Oper. Theory Adv. Appl. OT 178, Birkh¨ auser Verlag, Basel, 2008. [36] H. Bart, I. Gohberg, M.A. Kaashoek, and A.C.M. Ran, A state space approach to canonical factorization with applications, Oper. Theory Adv. Appl. OT 200, Birkh¨ auser Verlag, Basel, 2010. [37] V. Belevitch, Classical Network Theory, Holden-Day, San Francisco, 1968. [38] T. Bhattacharyya, J. Eschmeier and J. Sarkar, Characteristic function of a pure commuting contractive tuple, Integral Equations Operator Theory 53 (2005), no. 1, 23–32. [39] T. Bhattacharyya, J. Eschmeier and J. Sarkar, On c.n.c. commuting contractive tuples, Proc. Indian Acad. Sci. Math. Sci. 116 (2006), no. 3, 299–316. [40] L. de Branges, Factorization and invariant subspaces, J. Math. Anal. Appl. 29 (1970), 163–200. [41] L. de Branges and J. Rovnyak, Canonical models in quantum scattering theory, in: Perturbation Theory and its Applications in Quantum Mechanics (C. Wilcox, ed.) pp. 295–392, Holt, Rinehart and Winston, New York, 1966.

Canonical Realization

121

[42] L. de Branges and J. Rovnyak, Square summable power series, Holt, Rinehart and Winston, New York, 1966. [43] V. Brodski˘ı, I. Gohberg, and M.G. Kre˘ın, The characteristic function of an invertible operator, Acta Sci. Math. (Szeged) 32 (1971), 141–164. [44] J. Eschmeier and M. Putinar, Spherical contractions and interpolation problems on the unit ball, J. Reine Angew. Math. 542 (2002), 219–236. [45] C. Foia¸s, A. Frazho, I. Gohberg and M.A. Kaashoek, Metric Constrained Interpolation, Commutant Lifting and Systems, Oper. Theory Adv. Appl. OT 100, Birkh¨ auser Verlag, Boston-Basel, 1998. [46] A.M. Gleason, Finitely generated ideals in Banach algebras, J. Math. Mech., 13 (1964), 125–132. [47] I. Gohberg and M.A. Kaashoek (eds.), Constructive Methods of Wiener-Hopf Factorization, Oper. Theory Adv. Appl. OT 21, Birkh¨ auser Verlag, Basel, 1986. [48] I. Gohberg, P. Lancaster, and L. Rodman, Matrix polynomials, Academic Press, New York, 1982. [49] A Grinshpan, D.S. Kaliuzhnyi-Verbovetskyi, V. Vinnikov, and H.J. Woerdeman, Classes of tuples of commuting contractions satisfying the multivariable von Neumann inequality, J. Funct. Anal. 256 (2009), no. 9, 3035–3054. [50] J.W. Helton, S. McCullough and V. Vinnikov, Noncommutative convexity arises from linear matrix inequalities, J. Functional Analysis 240 (2006) no. 1, 105–191. [51] R.E. Kalman, Mathematical description of linear dynamical systems, J. SIAM Control Ser. A 1 (1963), 152–192. [52] R.E. Kalman, P.L. Falb, and M.A. Arbib, Topics in mathematical system theory, McGraw-Hill, New York, 1969. [53] P.D. Lax and R.S. Phillips, Scattering Theory, Pure and Applied Math. 26, Academic Press, Boston, 1989. [54] M.S. Livˇsic, On a class of linear operators in Hilbert space, Mat. Sbornik N.S. 19(61) (1946), 239–262; English translation: Amer. Math. Soc. Transl. (2) 13 (1960), 61–83. [55] M.S. Livˇsic, Operators, oscillations, waves (Open Systems), Translations of Mathematical Monographs 34, Amer. Math. 
Soc., Providence, 1973. [56] P.S. Muhly and B. Solel, Hardy algebras, 𝑊 ∗ correspondences and interpolation theory, Math. Ann. 330 (2004), no. 2, 353–415. [57] P.S. Muhly and B. Solel, Canonical models for representations of Hardy algebras, Integral Equations Operator Theory, 53 (2005), no. 3, 411–452. [58] B. Sz.-Nagy and C. Foias, Harmonic analysis of operators on Hilbert space, North-Holland/American Elsevier, 1970; revised edition: B. Sz.-Nagy, C. Foias, H. Bercovici, and L. Kerchy, Harmonic analysis of operators on Hilbert space. Second edition. Revised and enlarged edition., Universitext, Springer, New York, 2010. [59] G. Popescu, Characteristic functions for inﬁnite sequences of noncommuting operators, J. Operator Theory 22 (1989), 51–71. [60] G. Popescu, von Neumann inequality for (𝐵(ℋ)𝑛 )1 , Math. Scand. 68 (1991), 292– 304. [61] G. Popescu, Multi-analytic operators on Fock spaces, Math. Ann. 303 (1995), 31–46.

122

J.A. Ball and V. Bolotnikov

[62] D. Sarason, Sub-Hardy Hilbert Spaces in the Unit Disk, John Wiley and Sons Inc., New York, 1994. [63] J.C. Willems, Dissipative dynamical systems I: general theory, Arch. Rational Mech. Anal. 45 (1972), 321–351. [64] J.C. Willems, Dissipative dynamical systems II: Linear systems with quadratic supply rates, Arch. Rational Mech. Anal. 45 (1972), 352–393. [65] M.R. Wohlers, Lumped and distributed passive networks: a generalized and advanced viewpoint, Academic Press, New York, 1969. Joseph A. Ball Department of Mathematics Virginia Tech Blacksburg, VA 24061-0123, USA e-mail: [email protected] Vladimir Bolotnikov Department of Mathematics The College of William and Mary Williamsburg, VA 23187-8795, USA e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 123–153
© 2012 Springer Basel AG

Spectral Regularity of Banach Algebras and Non-commutative Gelfand Theory

Harm Bart, Torsten Ehrhardt and Bernd Silbermann

Dedicated to Israel Gohberg, in grateful recognition of his wonderful contributions to mathematics

Abstract. A new non-commutative Gelfand type criterion for spectrally regular behavior of vector-valued analytic functions is developed. Applications are given in situations that could not be handled with earlier methods. Some open problems are identified.

Mathematics Subject Classification (2000). Primary: 30G30, 46H99; Secondary: 47A56, 47L10.

Keywords. Analytic vector-valued function, logarithmic residue, spectral regularity, polynomial identity algebra, radical, family of homomorphisms, family of matrix representations.

1. Introduction

Let $\Delta$ be a bounded Cauchy domain in the complex plane $\mathbb{C}$, let $f$ be a complex function defined and analytic on an open neighborhood of the closure of $\Delta$, and suppose $f$ does not vanish on the boundary $\partial\Delta$ of $\Delta$. From complex function theory we know that the contour integral
\[ \frac{1}{2\pi i}\int_{\partial\Delta}\frac{f'(\lambda)}{f(\lambda)}\,d\lambda \]
is equal to the number of zeros of $f$ in $\Delta$. Hence it vanishes if and only if $f(\lambda)\neq 0$ for each $\lambda\in\Delta$. The issue studied in the present paper is this: to what extent does the state of affairs in the scalar case carry over to the more general Banach algebra setting?

So the problem we investigate is the following. Let $\mathcal{B}$ be a (nontrivial) unital (complex) Banach algebra, let $F$ be a $\mathcal{B}$-valued function defined and analytic on an open neighborhood of the closure of a bounded Cauchy domain $\Delta$, and suppose $F$ takes invertible values on the boundary $\partial\Delta$ of $\Delta$. Does it follow (or under what extra conditions can one conclude) that $F$ takes invertible values on $\Delta$ provided it is given that the contour integral
\[ \frac{1}{2\pi i}\int_{\partial\Delta}F'(\lambda)F(\lambda)^{-1}\,d\lambda \tag{1} \]

vanishes? If, for the Banach algebra ℬ under consideration, the answer is always positive, then ℬ is called spectrally regular. Clearly, the archetypical example of such a Banach algebra is ℂ. A necessary condition for a Banach algebra to be spectrally regular is that it does not feature any nontrivial zero sum of idempotents (see [BES2]). In [BES1], a nontrivial zero sum of ﬁve idempotents is constructed for ℬ(ℓ2 ), the Banach algebra of bounded linear operators on the Hilbert space ℓ2 (cf. [E] and [PT]). Thus ℬ(ℓ2 ) is not spectrally regular. On the other hand, in the papers [Bar], [BES2], [BES3], [BES4] and [BES6], spectral regularity has been established for large classes of Banach algebras. Before we proceed, let us mention that in the present article, actually a somewhat stronger form of spectral regularity is adopted than described above. It is one that takes into account the phenomenon of quasinilpotency. Indeed, in the sequel we will call a unital Banach algebra ℬ spectrally regular if the following holds true: in the situation indicated above in which (1) is (well) deﬁned, the function 𝐹 has invertible values on Δ provided (1) is quasinilpotent, i.e., has the singleton set {0} as its spectrum. The Banach algebras for which spectral regularity in the weaker sense was established in the papers [Bar], [BES2], [BES3], [BES4] and [BES6] are spectrally regular in the stronger sense too (see Section 2 below). The methods that have been used in the mentioned articles can be divided into two categories: those using trace arguments (in cases where Fredholm operators enter the picture), and those employing Gelfand type considerations (in situations where commutativity properties play a role). The approach via trace arguments has been systematically pursued in [BES5] and [BES6]. The present paper is devoted to a further exploration along the other line, where suﬃcient conditions for spectral regularity are established with the help of Gelfand type considerations. 
The ﬁrst step in this direction was taken in [Bar], dealing with the commutative case and using classical Gelfand theory; a second in [BES2], where (among others) polynomial identity algebras were considered and it was necessary to take recourse to non-commutative Gelfand theory, with matrix representations taking the place of the multiplicative linear functionals from classical Gelfand theory (see [Kr] or [P], Section 7.1). Here is a brief description of the contents of the present paper. Apart from the introduction (Section 1) and the list of references, the paper consists of four sections. Section 2 contains preliminaries on notation and terminology, as well as a review of earlier results serving as the proper context in which to position the material presented in the rest of the article. In Section 3, a new Gelfand type criterion for spectral regularity is derived, and, with an eye on applications later on in the paper, two corollaries are obtained. The results involve families of homomorphisms that are more general than the so-called suﬃcient families of


matrix representations that have been employed before in [BES2]. One of the two corollaries has a strong algebraic aspect in that it is formulated in terms of the radical of the underlying Banach algebra. The other corollary is concerned with Banach algebras that, in a certain sense, can be embedded in Banach algebras of bounded linear operators on a Banach space. Here the semigroup of Fredholm operators features as an important ingredient, and so in the background the ideal of the compact operators on a Banach space and the Calkin algebra play a role. The new criterion and (especially) its corollaries turn out to be eﬀective tools, enabling us to deal with a variety of situations which we could not handle earlier. This is illustrated by the material presented in Section 4. Here are three examples: 1) a unital Banach algebra ℬ is spectrally regular if and only if ℬ factored by its radical is; 2) the 𝐶 ∗ -algebra generated by the block Toeplitz operators having a piecewise continuous deﬁning (also called generating) function is spectrally regular; and 3) the same is true for the Banach subalgebra of ℬ(ℓ2 ) consisting of the bounded linear operators on ℓ2 having a block upper triangular matrix representation with respect to an orthonormal basis in ℓ2 . The analysis presented in this paper hinges on the use of certain families of Banach algebra homomorphisms having properties pertinent to the study of spectral regularity. Section 5 contains a few remarks about how the diﬀerent properties in question compare. In particular it is brieﬂy pointed out that the conceptual framework developed in Section 3 provides a genuine extension of the non-commutative Gelfand theory employed before in [BES2]. For a detailed analysis, see the forthcoming paper [BES7]. The expression (1) deﬁnes the left logarithmic residue of the function 𝐹 with respect to the Cauchy domain Δ. 
There is also a right version obtained by replacing the left logarithmic derivative $F'(\lambda)F(\lambda)^{-1}$ by the right logarithmic derivative $F(\lambda)^{-1}F'(\lambda)$. Accordingly one can make a distinction between left spectral regularity and right spectral regularity. For all results obtained in this paper, the left and right versions are analogous to one another. Therefore we will only consider the left version of the logarithmic residue and drop the qualifier 'left' altogether. Note, however, that it is not known whether a Banach algebra can be left spectrally regular while failing to be right spectrally regular.

One final remark. The Banach algebras considered in this paper are unital. They are nontrivial too, so their unit elements differ from their zero elements. It is not assumed, however, that the unit elements have norm one. For an individual unital Banach algebra one can always renorm such that the unit element does have norm one. In working with families of Banach algebra homomorphisms the way we do here, a fixation on unit elements with norm one would introduce an unnecessary and undesirable rigidity.
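As an informal numerical illustration of the scalar statement recalled at the beginning of this introduction (it is not part of the original argument), the following Python sketch approximates the contour integral of $f'/f$ over the unit circle by the trapezoidal rule. The function name `log_residue_scalar`, the two test polynomials and the choice of the unit disc as Cauchy domain are assumptions made here for illustration only.

```python
import numpy as np

def log_residue_scalar(f, df, center=0.0, radius=1.0, n=2048):
    """Approximate (1/(2*pi*i)) * contour integral of f'(z)/f(z) dz over the
    positively oriented circle |z - center| = radius (trapezoidal rule)."""
    theta = 2.0 * np.pi * np.arange(n) / n
    w = np.exp(1j * theta)
    z = center + radius * w
    dz_dtheta = 1j * radius * w
    # (1/(2*pi*i)) * (2*pi/n) * sum(values) equals mean(values) / i
    return np.mean(df(z) / f(z) * dz_dtheta) / 1j

# Illustrative polynomial with zeros at 0.2 (double), -0.5 and 3:
# three zeros (counted with multiplicity) lie in the open unit disc.
f = np.poly1d([0.2, 0.2, -0.5, 3.0], r=True)
zeros_in_unit_disc = log_residue_scalar(f, f.deriv())

# Illustrative polynomial z - 2 without zeros in the closed unit disc.
g = np.poly1d([2.0], r=True)
no_zeros = log_residue_scalar(g, g.deriv())

print(zeros_in_unit_disc, no_zeros)
```

Up to discretization and rounding errors, the first value should be close to the number of zeros inside the disc and the second close to zero, in line with the scalar case described above.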

2. Preliminaries and review of earlier results

In this section we review some earlier results. We also use the opportunity to fix notation and to introduce terminology.


A spectral configuration is a triple $(\mathcal{B},\Delta,F)$ where $\mathcal{B}$ is a unital complex Banach algebra, $\Delta$ is a bounded Cauchy domain in $\mathbb{C}$ (see [TL] or [GGK1]) and $F$ is a $\mathcal{B}$-valued analytic function on an open neighborhood of the closure of $\Delta$ which has invertible values on all of the boundary $\partial\Delta$ of $\Delta$. With such a spectral configuration, taking $\partial\Delta$ to be positively oriented, one can associate the contour integral
\[ LR(F;\Delta)=\frac{1}{2\pi i}\int_{\partial\Delta}F'(\lambda)F(\lambda)^{-1}\,d\lambda. \]
We call it the logarithmic residue associated with $(\mathcal{B},\Delta,F)$; sometimes the term logarithmic residue of $F$ with respect to $\Delta$ is used as well.

In the scalar case $\mathcal{B}=\mathbb{C}$, the logarithmic residue
\[ \frac{1}{2\pi i}\int_{\partial\Delta}\frac{f'(\lambda)}{f(\lambda)}\,d\lambda \tag{2} \]
associated with a spectral configuration $(\mathbb{C},\Delta,f)$ is equal to the number of zeros of $f$ in $\Delta$ (multiplicities counted). This can be rephrased by saying that (2) is the winding number with respect to the origin of the curve $\{f(\lambda)\}_{\lambda\in\partial\Delta}$, taken with the orientation induced by the one on $\partial\Delta$. Thus $LR(f;\Delta)$ is a nonnegative integer which is zero if and only if $f$ does not vanish on $\Delta$.

Motivated by these facts, and taking into account that in the general Banach algebra situation one can have nonzero quasinilpotent elements, we introduce the following terminology. The spectral configuration $(\mathcal{B},\Delta,F)$ is said to be winding free when $LR(F;\Delta)=0$, spectrally winding free if $LR(F;\Delta)$ is quasinilpotent, and spectrally trivial in case $F$ takes invertible values on $\Delta$. By Cauchy's theorem a spectral configuration is winding free (and a fortiori spectrally winding free) provided it is spectrally trivial. As mentioned in the introduction, the converse of this is not generally true in the vector-valued situation.

Under certain finite dimensionality conditions, positive results can be obtained. A bounded linear operator $T$ on a Banach space $X$ is called a Fredholm operator if its null space $\operatorname{Ker} T$ is finite dimensional and its range $\operatorname{Im} T$ has finite codimension in $X$ (and is therefore closed).
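For a finite-dimensional feel of these notions, here is a small Python sketch (an illustration added under stated assumptions, not taken from the paper). It approximates $LR(F;\Delta)$ for the linear matrix function $F(\lambda)=\lambda I-A$ with $\Delta$ the open unit disc; in this special case $F'(\lambda)F(\lambda)^{-1}=(\lambda I-A)^{-1}$, so the logarithmic residue is the Riesz spectral projection of $A$ associated with the eigenvalues inside $\Delta$. The matrices `A` and `B` and the helper name `log_residue_matrix` are hypothetical choices.

```python
import numpy as np

def log_residue_matrix(F, dF, center=0.0, radius=1.0, n=2048):
    """Approximate LR(F; Delta) = (1/(2*pi*i)) * integral of F'(z) F(z)^{-1} dz
    over the positively oriented circle |z - center| = radius."""
    total = 0.0
    for k in range(n):
        w = np.exp(2j * np.pi * k / n)
        z = center + radius * w
        # integrand times dz/dtheta at the k-th node
        total = total + dF(z) @ np.linalg.inv(F(z)) * (1j * radius * w)
    return total / (1j * n)

I2 = np.eye(2)
A = np.array([[0.5, 1.0], [0.0, 3.0]])  # eigenvalues 0.5 (inside) and 3 (outside)
B = np.array([[2.0, 1.0], [0.0, 3.0]])  # both eigenvalues outside the unit disc

LR_A = log_residue_matrix(lambda z: z * I2 - A, lambda z: I2)
LR_B = log_residue_matrix(lambda z: z * I2 - B, lambda z: I2)

rank_A = np.linalg.matrix_rank(LR_A, tol=1e-8)
trace_A = np.trace(LR_A).real
print(np.round(LR_A.real, 6), rank_A, trace_A)
```

For `A` the computed residue is a nonzero idempotent whose rank and trace agree (one eigenvalue inside the disc), so the configuration is not winding free and $F$ is indeed singular at $\lambda=0.5$. For `B` the residue is numerically zero and $F$ is invertible on the whole disc.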
The following theorem, extending Corollary 3.3 in [BES3], will serve as a key tool later on. Without going into details, we mention that the result allows for an extension to an abstract $C^*$-algebra setting; see the forthcoming paper [BES8].

Theorem 2.1. Let $X$ be a Banach space, let $(\mathcal{B}(X),\Delta,F)$ be a spectral configuration, and suppose $F$ is Fredholm operator valued on $\Delta$. The following statements are equivalent:
(1) $(\mathcal{B}(X),\Delta,F)$ is spectrally trivial;
(2) $(\mathcal{B}(X),\Delta,F)$ is winding free;
(3) $(\mathcal{B}(X),\Delta,F)$ is spectrally winding free.

Proof. Statements (1) and (2) are equivalent by Corollary 3.3 in [BES3]. Obviously (2) ⇒ (3), and it remains to prove the implication (3) ⇒ (2). Assume $LR(F;\Delta)$ is quasinilpotent. The Fredholmness of $F$ implies that $LR(F;\Delta)$ is a finite rank operator on $X$ (see [GS]). Hence $LR(F;\Delta)$ is nilpotent. In particular its trace vanishes. By Proposition 3.2 in [BES3], the rank of the logarithmic residue $LR(F;\Delta)$ does not exceed its trace, and it follows that $LR(F;\Delta)=0$. □

Results such as Theorem 2.1 are concerned with spectral configurations in which a given individual function has special properties. Without these additional properties, (spectrally) winding free spectral configurations with the same underlying Banach algebra might fail to be spectrally trivial. Lifting our conceptual framework to that of the underlying algebras, we call a unital Banach algebra $\mathcal{B}$ spectrally regular if each spectrally winding free spectral configuration $(\mathcal{B},\Delta,F)$ is spectrally trivial.

Not every Banach algebra is spectrally regular. Indeed, from what was said in the introduction, it is clear that $\mathcal{B}(\ell_2)$ is not. In [Bar], [BES2] and [BES4] positive results have been obtained, but these concern the (possibly) somewhat weaker type of spectral regularity featuring in those papers. That version of spectral regularity requires the triviality of a spectral configuration $(\mathcal{B},\Delta,F)$ to follow from the configuration being winding free instead of it being spectrally winding free. Nevertheless, all the Banach algebras that have been identified in [Bar], [BES2] and [BES4] as spectrally regular in this weaker sense are actually spectrally regular in the stronger sense considered here. This can be seen by looking at the proofs given in [Bar], [BES2] and [BES4], but it will also become clear from the material to be presented below. We do not know whether the weak and the strong version of spectral regularity really differ from each other or actually amount to the same.

The matrix algebras $\mathbb{C}^{m\times m}$ are spectrally regular.
For the form of spectral regularity employed here (stronger than in our earlier publications), this conclusion can be obtained from Theorem 2.1 since matrices can be viewed as Fredholm operators. More generally, when $X$ is a Banach space, $\mathcal{K}(X)$ stands for the ideal of the compact operators on $X$, and $I_X$ denotes the identity operator on $X$, the Banach subalgebra
\[ \mathcal{B}_{\mathcal{K}}(X)=\{\lambda I_X+T \mid \lambda\in\mathbb{C},\ T\in\mathcal{K}(X)\} \]
of $\mathcal{B}(X)$ is spectrally regular. In case $X$ is finite dimensional, $\mathcal{B}_{\mathcal{K}}(X)$ can be identified with the matrix algebra $\mathbb{C}^{n\times n}$ where $n$ is the dimension of $X$. In case $\dim X=\infty$, the result follows by combining Proposition 4.1 in [BES4] and Theorem 2.1.

Commutative unital Banach algebras are spectrally regular too (see [Bar]). Such algebras belong to the wider class of polynomial identity algebras. A Banach algebra $\mathcal{B}$ is called a polynomial identity (Banach) algebra, PI-algebra for short, if there exist a positive integer $k$ and a nontrivial polynomial $p(x_1,\ldots,x_k)$ in $k$ noncommuting variables $x_1,\ldots,x_k$ such that $p(b_1,\ldots,b_k)=0$ for every choice of elements $b_1,\ldots,b_k$ in $\mathcal{B}$. Clearly commutativity implies the property of being PI. Also, according to a celebrated result of Amitsur and Levitzki [AL], all algebras of the form $\mathbb{C}^{m\times m}$ are PI-algebras. PI-algebras are spectrally regular (see [BES2] and below).

PI-algebras have been investigated in [Kr], and we will now discuss material from there which is highly pertinent to the topic of the present paper (cf. Section 7.1 in [P]). For this, two more concepts are needed.
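Before turning to these concepts, the PI property of matrix algebras can be checked in a small case. The sketch below (an added illustration, assuming the standard formulation of the result cited as [AL]: the standard polynomial $s_{2m}(x_1,\ldots,x_{2m})=\sum_{\sigma}\operatorname{sgn}(\sigma)\,x_{\sigma(1)}\cdots x_{\sigma(2m)}$ vanishes on $m\times m$ matrices) evaluates $s_4$ on four integer $2\times 2$ matrices and $s_2$, the commutator, on two matrix units. The helper names and the sample matrices are hypothetical.

```python
import itertools
import numpy as np

def permutation_sign(perm):
    """Sign of a permutation of 0..k-1, computed by counting inversions."""
    k = len(perm)
    inversions = sum(1 for i in range(k) for j in range(i + 1, k) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def standard_polynomial(mats):
    """Evaluate s_k(mats[0], ..., mats[k-1]) = sum over permutations sigma of
    sign(sigma) * mats[sigma(0)] @ ... @ mats[sigma(k-1)]."""
    size = mats[0].shape[0]
    total = np.zeros_like(mats[0])
    for perm in itertools.permutations(range(len(mats))):
        prod = np.eye(size, dtype=mats[0].dtype)
        for idx in perm:
            prod = prod @ mats[idx]
        total = total + permutation_sign(perm) * prod
    return total

rng = np.random.default_rng(0)
samples = [rng.integers(-3, 4, size=(2, 2)) for _ in range(4)]
s4_value = standard_polynomial(samples)        # exact integer arithmetic

E12 = np.array([[0, 1], [0, 0]])
E21 = np.array([[0, 0], [1, 0]])
s2_value = standard_polynomial([E12, E21])     # the commutator E12 E21 - E21 E12

print(s4_value, s2_value)
```

The commutator is nonzero, so $\mathbb{C}^{2\times 2}$ is not commutative, while $s_4$ evaluates to the zero matrix on the sampled arguments, as expected for a polynomial identity of $\mathbb{C}^{2\times 2}$.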


The ﬁrst is that of the radical. The deﬁnition and basic properties of this fundamental notion can be found, for instance, in [N], Section II.7.5 and [Kr], Section 13. For our purpose it is important to know that the radical ℛ(ℬ) of a unital Banach algebra ℬ is a closed two-sided ideal in ℬ which can be characterized as follows: an element 𝑏 in ℬ belongs to ℛ(ℬ) if and only if for each 𝑥 ∈ ℬ both 𝑒 + 𝑥𝑏 and 𝑒 + 𝑏𝑥 are invertible in ℬ. Here 𝑒 is the unit element in ℬ. We also need the concept of a suﬃcient family. A family {𝜙𝜔 : ℬ → ℬ𝜔 }𝜔∈Ω of continuous unital Banach algebra homomorphisms is said to be suﬃcient when an element 𝑏 ∈ ℬ is invertible in ℬ if and only if 𝜙𝜔 (𝑏) is invertible in ℬ𝜔 for all 𝜔 ∈ Ω. Note that the ‘only if part’ in this deﬁnition is automatically fulﬁlled. A continuous Banach algebra homomorphism into a matrix algebra ℂ𝑚×𝑚 , 𝑚 a positive integer, is called a matrix representation. A family of matrix representations {𝜙𝜔 : ℬ → ℂ𝑚𝜔 ×𝑚𝜔 }𝜔∈Ω is said to be of ﬁnite order if the sizes of the matrices involved have a ﬁnite upper bound, i.e., if sup𝜔∈Ω 𝑚𝜔 < ∞. With this terminology, the following basic result holds: a unital Banach algebra ℬ possesses a suﬃcient family of matrix representations of ﬁnite order if and only if the quotient algebra ℬ/ℛ(ℬ) is a PI-algebra. The latter condition is obviously satisﬁed when ℬ itself is a PI-algebra. Hence, if ℬ is a PI-algebra, then ℬ possesses a suﬃcient family of matrix representations of ﬁnite order. For later reference (see Subsection 4.1), we also mention that, as a consequence, ℬ possesses a suﬃcient family of ﬁnite order if and only if so does the quotient algebra ℬ/ℛ(ℬ). We complete the exposition of this material by pointing out that the existence of a suﬃcient family of matrix representations (not necessarily of ﬁnite order) implies the spectral regularity of the underlying algebra. 
For the weaker form of spectral regularity used in our earlier papers, this result is contained in [BES2], Theorem 4.1. For the stronger form under consideration here, it is immediate from the spectral regularity of the matrix algebras and Corollary 3.5 below. At this point, we can make a connection with Problem 12 in [Kr], Section 29: characterize those Banach algebras which possess a suﬃcient family of matrix representations not necessarily of ﬁnite order. Spectral regularity is a necessary requirement for this; it is however not a suﬃcient condition (see the last paragraph in Section 5). There is one more class of spectrally regular Banach algebras that we want to mention: that of the Banach algebras covered by Theorem 4.2 in [BES2] and the remark made after that theorem. It is a subclass of a class of Banach algebras appearing in (numerically oriented) work by Hagen, Roch and the third author (see [Si] and [HRS]). The description of the class is somewhat involved, and we refrain from giving further details here. Theorems 4.1 and 4.2 in [BES2] referred to above are Gelfand type criteria in the sense that they are stated in terms of families of Banach algebra homomorphisms. In the next section we shall develop a new criterion of this type which turns out to be eﬀective for establishing spectral regularity in a variety of cases which we were not able to handle with the old tools.
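To make the notions of radical and sufficient family from this section concrete, consider the algebra of upper triangular $n\times n$ complex matrices (an example chosen here for illustration; it is not discussed in the text above). The $n$ maps $b\mapsto b_{ii}$ are unital homomorphisms into $\mathbb{C}=\mathbb{C}^{1\times 1}$, an upper triangular matrix is invertible within this algebra exactly when all its diagonal entries are nonzero, and the radical consists of the strictly upper triangular matrices, which is also the intersection of the kernels of these maps. The Python sketch below checks these facts on hypothetical sample matrices.

```python
import numpy as np

def diagonal_values(b):
    """Values phi_i(b) = b[i, i], i = 0..n-1, for an upper triangular matrix b."""
    return np.diag(b).copy()

def invertible_by_family(b):
    """Invertibility test through the family {phi_i}: every phi_i(b) is nonzero."""
    return bool(np.all(diagonal_values(b) != 0))

b = np.array([[1.0, 2.0, 3.0], [0.0, 4.0, 5.0], [0.0, 0.0, 6.0]])
c = np.array([[1.0, 2.0, 3.0], [0.0, 0.0, 5.0], [0.0, 0.0, 6.0]])
e = np.eye(3)
r = np.triu(b, k=1)  # strictly upper triangular: lies in every Ker phi_i

# Each phi_i is multiplicative on upper triangular matrices.
multiplicative = np.allclose(diagonal_values(b @ c),
                             diagonal_values(b) * diagonal_values(c))

b_invertible = invertible_by_family(b)   # det(b) = 24
c_invertible = invertible_by_family(c)   # det(c) = 0

# Radical characterization: e + x r and e + r x have unit diagonal, hence
# are invertible, for every upper triangular x (here x = c).
radical_check = (np.allclose(diagonal_values(e + c @ r), 1.0)
                 and np.allclose(diagonal_values(e + r @ c), 1.0))

print(multiplicative, b_invertible, c_invertible, radical_check)
```

Here the family has order one, and the quotient by the radical is the commutative algebra of diagonal matrices, which is consistent with the characterization via PI-algebras recalled above.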


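3. A new Gelfand type criterion for spectral regularity

In this section, we will extensively work with families of Banach algebra homomorphisms. These homomorphisms need not be unital. If $X$ is a Banach space, the identity operator on $X$ is denoted by $I_X$ and the set of Fredholm operators on $X$ by $\mathcal{F}(X)$.

Theorem 3.1. Let $\mathcal{B}$ be a unital Banach algebra. For $\omega$ in an index set $\Omega$, let $\mathcal{B}_\omega$ be a spectrally regular Banach algebra and let $\phi_\omega:\mathcal{B}\to\mathcal{B}_\omega$ be a continuous homomorphism. Further, for $t$ in an index set $T$, let $X_t$ be a nontrivial Banach space and let $\Phi_t:\mathcal{B}\to\mathcal{B}(X_t)$ be a continuous homomorphism. Assume the following two inclusions hold:
\[ \text{(a)}\quad \bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega\subset\bigcap_{t\in T}\Phi_t^{-1}\bigl[\mathcal{F}(X_t)-\{I_{X_t}\}\bigr], \]
\[ \text{(b)}\quad \bigcap_{t\in T}\operatorname{Ker}\Phi_t\subset\mathcal{R}(\mathcal{B}). \]
Then $\mathcal{B}$ is spectrally regular.

Proof. Let $(\mathcal{B},\Delta,F)$ be a spectral configuration, and suppose it is spectrally winding free, i.e., $LR(F;\Delta)$ is quasinilpotent. We need to show that $(\mathcal{B},\Delta,F)$ is spectrally trivial, i.e., $F$ takes invertible values in $\mathcal{B}$ on all of $\Delta$. The unit element in $\mathcal{B}$ will be denoted by $e$, that in $\mathcal{B}_\omega$ by $e_\omega$.

Take $\omega\in\Omega$, and put $p_\omega=\phi_\omega(e)$. Then $p_\omega$ is an idempotent in $\mathcal{B}_\omega$ and $\phi_\omega(b)=p_\omega\phi_\omega(b)=\phi_\omega(b)p_\omega$ for all $b\in\mathcal{B}$. If $b\in\mathcal{B}$ is invertible in $\mathcal{B}$, then $\phi_\omega(b)+e_\omega-p_\omega$ is invertible in $\mathcal{B}_\omega$, with inverse $\phi_\omega(b^{-1})+e_\omega-p_\omega$. Also, if $\phi_\omega(b)+e_\omega-p_\omega$ is invertible in $\mathcal{B}_\omega$ with inverse $\phi_\omega(b_1)+e_\omega-p_\omega$ for some $b_1\in\mathcal{B}$, then $bb_1-e$ and $b_1b-e$ belong to $\operatorname{Ker}\phi_\omega$. In other words, $b_1$ is an inverse of $b$ modulo the ideal $\operatorname{Ker}\phi_\omega$.

Again let $\omega\in\Omega$, and define the $\mathcal{B}_\omega$-valued function $F_\omega$ by stipulating that $F_\omega(\lambda)=\phi_\omega\bigl(F(\lambda)\bigr)+e_\omega-p_\omega$. Along with $F$, the function $F_\omega$ is analytic on an open neighborhood of the closure of $\Delta$. As the function $F$ comes from the spectral configuration $(\mathcal{B},\Delta,F)$, it takes invertible values on an open neighborhood $U$ of $\partial\Delta$. Take $\lambda\in U$. Then $F_\omega(\lambda)$ is invertible in the Banach algebra $\mathcal{B}_\omega$ with inverse $F_\omega(\lambda)^{-1}=\phi_\omega\bigl(F(\lambda)^{-1}\bigr)+e_\omega-p_\omega$. It follows, in particular, that $(\mathcal{B}_\omega,\Delta,F_\omega)$ is a spectral configuration.

Next we compute $LR(F_\omega;\Delta)$. Using that $F_\omega'=\phi_\omega\circ F'$, we get
\[
\begin{aligned}
LR(F_\omega;\Delta) &= \frac{1}{2\pi i}\int_{\partial\Delta}F_\omega'(\mu)F_\omega(\mu)^{-1}\,d\mu \\
&= \frac{1}{2\pi i}\int_{\partial\Delta}\phi_\omega\bigl(F'(\mu)\bigr)\Bigl(\phi_\omega\bigl(F(\mu)^{-1}\bigr)+e_\omega-p_\omega\Bigr)\,d\mu \\
&= \frac{1}{2\pi i}\int_{\partial\Delta}\phi_\omega\bigl(F'(\mu)\bigr)\phi_\omega\bigl(F(\mu)^{-1}\bigr)\,d\mu+\frac{1}{2\pi i}\int_{\partial\Delta}\phi_\omega\bigl(F'(\mu)\bigr)(e_\omega-p_\omega)\,d\mu.
\end{aligned}
\]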


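The last term vanishes because $\phi_\omega\bigl(F'(\mu)\bigr)=\phi_\omega\bigl(F'(\mu)\bigr)p_\omega$, and we conclude that
\[
\begin{aligned}
LR(F_\omega;\Delta) &= \frac{1}{2\pi i}\int_{\partial\Delta}\phi_\omega\bigl(F'(\mu)\bigr)\phi_\omega\bigl(F(\mu)^{-1}\bigr)\,d\mu \\
&= \phi_\omega\Bigl(\frac{1}{2\pi i}\int_{\partial\Delta}F'(\mu)F(\mu)^{-1}\,d\mu\Bigr) \\
&= \phi_\omega\bigl(LR(F;\Delta)\bigr).
\end{aligned}
\]
We proceed by proving that $LR(F_\omega;\Delta)$ is quasinilpotent. Take $\mu\in\mathbb{C}\setminus\{0\}$. As $LR(F;\Delta)$ is quasinilpotent, $\mu e-LR(F;\Delta)$ is invertible in $\mathcal{B}$. Now
\[ \mu e_\omega-LR(F_\omega;\Delta)=\mu(e_\omega-p_\omega)+\phi_\omega\bigl(\mu e-LR(F;\Delta)\bigr), \]
and the right-hand side of this identity is obviously invertible in $\mathcal{B}_\omega$ with inverse $\mu^{-1}(e_\omega-p_\omega)+\phi_\omega\bigl((\mu e-LR(F;\Delta))^{-1}\bigr)$. Thus $\mu$ is in the resolvent set of $LR(F_\omega;\Delta)$, as desired.

By hypothesis, the Banach algebra $\mathcal{B}_\omega$ is spectrally regular, and we have just proved that $LR(F_\omega;\Delta)$ is quasinilpotent. Thus we may conclude that the spectral configuration $(\mathcal{B}_\omega,\Delta,F_\omega)$ is spectrally trivial, i.e., the function $F_\omega$ takes invertible values on $\Delta$. Put $V=\Delta\cup U$. Then $V$ is an open neighborhood of the closure $\Delta\cup\partial\Delta$ of the Cauchy domain $\Delta$ and $F_\omega$ takes invertible values on $V$. Hence, by Cauchy's integral formula,
\[ F_\omega(\lambda)^{-1}=\frac{1}{2\pi i}\int_{\partial\Delta}\frac{1}{\mu-\lambda}F_\omega(\mu)^{-1}\,d\mu,\qquad \lambda\in\Delta. \tag{3} \]
Let $\lambda\in\Delta$, and introduce
\[ G(\lambda)=\frac{1}{2\pi i}\int_{\partial\Delta}\frac{1}{\mu-\lambda}F(\mu)^{-1}\,d\mu. \tag{4} \]
Then $G(\lambda)\in\mathcal{B}$ and, using the identity $F_\omega(\mu)^{-1}=\phi_\omega\bigl(F(\mu)^{-1}\bigr)+e_\omega-p_\omega$ already obtained above,
\[
\begin{aligned}
\phi_\omega\bigl(G(\lambda)\bigr) &= \frac{1}{2\pi i}\int_{\partial\Delta}\frac{1}{\mu-\lambda}\phi_\omega\bigl(F(\mu)^{-1}\bigr)\,d\mu \\
&= \frac{1}{2\pi i}\int_{\partial\Delta}\frac{1}{\mu-\lambda}\bigl(F_\omega(\mu)^{-1}-(e_\omega-p_\omega)\bigr)\,d\mu \\
&= \frac{1}{2\pi i}\int_{\partial\Delta}\frac{1}{\mu-\lambda}F_\omega(\mu)^{-1}\,d\mu-\frac{1}{2\pi i}\int_{\partial\Delta}\frac{1}{\mu-\lambda}(e_\omega-p_\omega)\,d\mu \\
&= F_\omega(\lambda)^{-1}-(e_\omega-p_\omega).
\end{aligned}
\]
Thus $F_\omega(\lambda)^{-1}=\phi_\omega\bigl(G(\lambda)\bigr)+e_\omega-p_\omega$. As $F_\omega(\lambda)=\phi_\omega\bigl(F(\lambda)\bigr)+e_\omega-p_\omega$ (by definition), it follows that $G(\lambda)F(\lambda)-e$ and $F(\lambda)G(\lambda)-e$ belong to $\operatorname{Ker}\phi_\omega$.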


Since $\omega\in\Omega$ was taken arbitrarily, we may conclude that, for $\lambda\in\Delta$ as above, $G(\lambda)F(\lambda)-e$ and $F(\lambda)G(\lambda)-e$ are in the left-hand side of (a). Thus, taking into account the inclusion (a),
\[ \Phi_t\bigl(G(\lambda)F(\lambda)\bigr)-\Phi_t(e)\in\mathcal{F}(X_t)-\{I_{X_t}\},\qquad t\in T, \tag{5} \]
and, likewise,
\[ \Phi_t\bigl(F(\lambda)G(\lambda)\bigr)-\Phi_t(e)\in\mathcal{F}(X_t)-\{I_{X_t}\},\qquad t\in T. \tag{6} \]

Take $t\in T$, and put $P_t=\Phi_t(e)$. Then $P_t$ is an idempotent in $\mathcal{B}(X_t)$, in other words $P_t$ is a projection of $X_t$, and $\Phi_t(b)=P_t\Phi_t(b)=\Phi_t(b)P_t$ for all $b\in\mathcal{B}$. If $b\in\mathcal{B}$ is invertible in $\mathcal{B}$, then $\Phi_t(b)+I_{X_t}-P_t$ is invertible in $\mathcal{B}(X_t)$, with inverse $\Phi_t(b^{-1})+I_{X_t}-P_t$. Also, if $\Phi_t(b)+I_{X_t}-P_t$ is invertible in $\mathcal{B}(X_t)$ with inverse $\Phi_t(b_1)+I_{X_t}-P_t$ for some $b_1\in\mathcal{B}$, then $bb_1-e$ and $b_1b-e$ belong to $\operatorname{Ker}\Phi_t$. In other words, $b_1$ is an inverse of $b$ modulo the ideal $\operatorname{Ker}\Phi_t$.

Again, let $t\in T$, and introduce the $\mathcal{B}(X_t)$-valued function $\hat F_t$ by putting $\hat F_t(\lambda)=\Phi_t\bigl(F(\lambda)\bigr)+I_{X_t}-P_t$. Arguing as above, we see that $\bigl(\mathcal{B}(X_t),\Delta,\hat F_t\bigr)$ is a spectral configuration. Also $LR(\hat F_t;\Delta)=\Phi_t\bigl(LR(F;\Delta)\bigr)$, and it follows that $LR(\hat F_t;\Delta)$ is quasinilpotent. Next observe that
\[
\begin{aligned}
\bigl(\Phi_t(G(\lambda))+I_{X_t}-P_t\bigr)\hat F_t(\lambda) &= \bigl(\Phi_t(G(\lambda))+I_{X_t}-P_t\bigr)\bigl(\Phi_t(F(\lambda))+I_{X_t}-P_t\bigr) \\
&= \Phi_t\bigl(G(\lambda)F(\lambda)\bigr)-\Phi_t(e)+I_{X_t},
\end{aligned}
\]
and so $\bigl(\Phi_t(G(\lambda))+I_{X_t}-P_t\bigr)\hat F_t(\lambda)\in\mathcal{F}(X_t)$ by (5). Similarly, by taking into account (6), we get $\hat F_t(\lambda)\bigl(\Phi_t(G(\lambda))+I_{X_t}-P_t\bigr)\in\mathcal{F}(X_t)$. But then $\hat F_t(\lambda)$ is a Fredholm operator, and we can apply Theorem 2.1 to see that the spectral configuration $\bigl(\mathcal{B}(X_t),\Delta,\hat F_t\bigr)$ is spectrally trivial. Analogous to (3), we have
\[ \hat F_t(\lambda)^{-1}=\frac{1}{2\pi i}\int_{\partial\Delta}\frac{1}{\mu-\lambda}\hat F_t(\mu)^{-1}\,d\mu,\qquad \lambda\in\Delta, \]
and, by the same (sort of) reasoning as used before, $\hat F_t(\lambda)^{-1}=\Phi_t\bigl(G(\lambda)\bigr)+I_{X_t}-P_t$. Since $\hat F_t(\lambda)=\Phi_t\bigl(F(\lambda)\bigr)+I_{X_t}-P_t$ (by definition), it follows that $G(\lambda)F(\lambda)-e$ and $F(\lambda)G(\lambda)-e$ belong to $\operatorname{Ker}\Phi_t$.

As $t\in T$ was taken arbitrarily, we may conclude that $G(\lambda)F(\lambda)-e$ and $F(\lambda)G(\lambda)-e$ are in the left-hand side of (b) which, by assumption, is a subset of the radical of $\mathcal{B}$. So $G(\lambda)F(\lambda)$ and $F(\lambda)G(\lambda)$ are invertible. But then $F(\lambda)$ is both left and right invertible, hence invertible, as desired. The inverse of $F(\lambda)$ is $G(\lambda)$ given by (4). □

Before drawing consequences from Theorem 3.1, we present some remarks on the conditions (a) and (b) in the theorem.
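First we note that (a) in Theorem 3.1 is fulfilled when $\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega \subset \mathcal{R}(\mathcal{B})$. To see this, it is sufficient to prove that, with $\{\Phi_t : \mathcal{B} \to \mathcal{B}(X_t)\}_{t\in T}$ a family of continuous homomorphisms as in Theorem 3.1, we have $\mathcal{R}(\mathcal{B}) \subset \bigcap_{t\in T}\Phi_t^{-1}\bigl[\mathcal{F}(X_t) - \{I_{X_t}\}\bigr]$. The argument is as follows. Write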


$P_t = \Phi_t(e)$ with $e$ the unit element in $\mathcal{B}$. Then $P_t$ is an idempotent in $\mathcal{B}(X_t)$. Clearly $\Phi_t(b)P_t = P_t\Phi_t(b) = \Phi_t(b)$ for all $b \in \mathcal{B}$. Take $r \in \mathcal{R}(\mathcal{B})$. Then $r + e$ is invertible in $\mathcal{B}$, say with inverse $s$. A straightforward computation now yields

$$\bigl(\Phi_t(r + e) + I_{X_t} - P_t\bigr)\bigl(\Phi_t(s) + I_{X_t} - P_t\bigr) = I_{X_t}, \qquad \bigl(\Phi_t(s) + I_{X_t} - P_t\bigr)\bigl(\Phi_t(r + e) + I_{X_t} - P_t\bigr) = I_{X_t}.$$

Thus, $\Phi_t(r) + I_{X_t} = \Phi_t(r + e) + (I_{X_t} - P_t)$ is invertible in $\mathcal{B}(X_t)$. Hence $\Phi_t(r) \in \mathcal{F}(X_t) - \{I_{X_t}\}$, as desired.

Next we observe that (b) in Theorem 3.1 cannot be satisfied by the empty index family $T$. Indeed, if so, the Banach algebra $\mathcal{B}$ would coincide with its radical, and this can only happen in the (excluded) case when $\mathcal{B}$ is trivial.

Finally, in contrast to what we have for (b), it is possible to have (a) satisfied by the empty index family $\Omega$. The underlying fact (not difficult to establish) is that the inclusion

$$\mathcal{B} \subset \bigcap_{t\in T}\Phi_t^{-1}\bigl[\mathcal{F}(X_t) - \{I_{X_t}\}\bigr] \tag{7}$$

is satisfied if and only if for all $t \in T$, the projection $P_t = \Phi_t(e) : X_t \to X_t$ has finite rank. So (7), which is trivially fulfilled when the index set $\Omega$ is empty, basically means that the homomorphisms $\Phi_t$ are (or rather can be identified with) matrix representations.

Later on we will use two specific forms of Theorem 3.1. We give them as corollaries. In the first $\mathcal{B}$ is a closed subalgebra of a Banach algebra of the type $\mathcal{B}(X)$, unital but with unit element not necessarily equal to $I_X$.

Corollary 3.2. Let $X$ be a nontrivial Banach space, and let $\mathcal{B}$ be a closed subalgebra of $\mathcal{B}(X)$. For $\omega$ in an index set $\Omega$, let $\mathcal{B}_\omega$ be a spectrally regular Banach algebra, and let $\phi_\omega : \mathcal{B} \to \mathcal{B}_\omega$ be a continuous homomorphism. Suppose

$$\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega \subset \mathcal{F}(X) - \{I_X\}. \tag{8}$$

Then $\mathcal{B}$ is spectrally regular.

Proof. Take $T = \{0\}$, put $X_0 = X$, and let $\Phi_0 : \mathcal{B} \to \mathcal{B}(X_0)$ be the identical embedding of $\mathcal{B}$ into $\mathcal{B}(X_0)$. Then $\operatorname{Ker}\Phi_0 = \{0\}$, hence (b) in Theorem 3.1 is trivially fulfilled. From (8) it is obvious that (a) in Theorem 3.1 is satisfied too. □

Corollary 3.3. Let $\mathcal{B}$ be a unital Banach algebra. For $\omega$ in an index set $\Omega$, let $\mathcal{B}_\omega$ be a spectrally regular Banach algebra, and let $\phi_\omega : \mathcal{B} \to \mathcal{B}_\omega$ be a continuous homomorphism. Suppose

$$\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega \subset \mathcal{R}(\mathcal{B}). \tag{9}$$

Then $\mathcal{B}$ is spectrally regular.

Proof. Let 𝑋 be ℬ considered as a Banach space only. Then ℬ can be identiﬁed with a Banach subalgebra of ℬ(𝑋). The standard argument for this uses the left regular representation Ψ of ℬ into ℬ(𝑋) deﬁned by Ψ(𝑏)(𝑥) = 𝑏𝑥, 𝑥 ∈ 𝑋, 𝑏 ∈ ℬ.


Having this identification in mind, we need to prove that (9) implies (8). Take $b$ in the left-hand side of (9). Then $b \in \mathcal{R}(\mathcal{B})$, and so $b + e$ is invertible in $\mathcal{B}$. Here $e$ is the unit element in $\mathcal{B}$. Now under the left regular representation $\Psi$, this unit element is identified with $I_X$. So $b + I_X$ is invertible in $\mathcal{B}$, hence invertible in $\mathcal{B}(X)$. It follows that $b \in \mathcal{F}(X) - \{I_X\}$ as desired. □

A family $\{\phi_\omega : \mathcal{B} \to \mathcal{B}_\omega\}_{\omega\in\Omega}$ of Banach algebra homomorphisms for which (9) holds will be called radical-separating. This terminology is justified by the fact that the inclusion (9) holds if and only if the family $\{\phi_\omega : \mathcal{B} \to \mathcal{B}_\omega\}_{\omega\in\Omega}$ separates the points of $\mathcal{B}$ modulo the radical of $\mathcal{B}$. If the stronger condition is satisfied that the family separates the points of $\mathcal{B}$ or, equivalently, $\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega = \{0\}$, we call the family separating. When the underlying Banach algebra $\mathcal{B}$ is semisimple (i.e., its radical is trivial), the two concepts obviously amount to the same.

The special situation where the 'test algebras' $\mathcal{B}_\omega$ are semisimple and the Banach algebra homomorphisms $\phi_\omega$ are surjective is of interest too (see Subsection 4.1, Lemma 4.5 and below). Indeed, in that case the family $\{\phi_\omega : \mathcal{B} \to \mathcal{B}_\omega\}_{\omega\in\Omega}$ is radical-separating if and only if the inclusion (9) is in fact an equality, i.e., $\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega = \mathcal{R}(\mathcal{B})$. This is immediate from the following straightforward observation. If $\psi : \mathcal{B} \to \mathcal{A}$ is a surjective unital Banach algebra homomorphism, then $\psi$ maps $\mathcal{R}(\mathcal{B})$ into $\mathcal{R}(\mathcal{A})$; so when $\mathcal{A}$ is semisimple, it ensues that $\mathcal{R}(\mathcal{B}) \subset \operatorname{Ker}\psi$.

We now make a connection with material presented earlier in Section 2. Recall that a family $\{\phi_\omega : \mathcal{B} \to \mathcal{B}_\omega\}_{\omega\in\Omega}$ of continuous unital Banach algebra homomorphisms is said to be sufficient when an element $b \in \mathcal{B}$ is invertible in $\mathcal{B}$ if (and only if) $\phi_\omega(b)$ is invertible in $\mathcal{B}_\omega$ for all $\omega \in \Omega$. Besides sufficient families, the books [RRS] and [RSS] also feature so-called weakly sufficient families. Inspired by the definitions given there, we introduce the notion of a partially weakly sufficient family of homomorphisms.
Write ∥.∥𝜔 for the norm in ℬ𝜔 and 𝑒𝜔 for the unit element in ℬ𝜔 . The family {𝜙𝜔 : ℬ → ℬ𝜔 }𝜔∈Ω of continuous unital Banach algebra homomorphisms is called partially weakly suﬃcient, or p.w. suﬃcient for short, provided that (a) sup𝜔∈Ω ∥𝑒𝜔 ∥𝜔 < ∞ (recall from the last paragraph of the introduction that ∥𝑒𝜔 ∥𝜔 need not be equal to one), and (b) an element 𝑏 ∈ ℬ is invertible in ℬ if 𝜙𝜔 (𝑏) is invertible in ℬ𝜔 for all 𝜔 ∈ Ω and sup𝜔∈Ω ∥𝜙𝜔 (𝑏)−1 ∥𝜔 < ∞. In deﬁnitions of this type, conditions such as (b) are usually of the ‘if and only if’ type. The fact that we do not impose this more restrictive requirement here is the reason for the use of the term ‘partially’ in our terminology. A suﬃcient family of Banach algebra homomorphisms {𝜙𝜔 : ℬ → ℬ𝜔 }𝜔∈Ω is p.w. suﬃcient in the sense that it can be turned into a p.w. suﬃcient family by renorming the Banach algebras ℬ𝜔 with an appropriate equivalent norm. Indeed, for 𝜔 ∈ Ω, just choose an equivalent Banach algebra norm for which the unit element 𝑒𝜔 in ℬ𝜔 has norm one. It is a standard fact from Banach algebra theory that this can be done.
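The renorming step mentioned at the end of the preceding paragraph can be made explicit. The following is only a sketch of one standard construction, assuming the equivalent norm is taken to be the operator norm of left multiplication; the symbol $\|\cdot\|'$ and the generic algebra $\mathcal{A}$ are notation introduced here for illustration.

```latex
% Sketch (assumption: A is a nontrivial unital Banach algebra with unit e and norm ||.||).
% Define a new norm via left multiplication on the closed unit ball:
\[
  \|a\|' \;=\; \sup\bigl\{\, \|ax\| \;:\; x \in \mathcal{A},\ \|x\| \le 1 \,\bigr\},
  \qquad a \in \mathcal{A}.
\]
% Submultiplicativity of ||.|| gives the upper bound; the choice x = e/||e|| gives the lower bound:
\[
  \frac{\|a\|}{\|e\|} \;\le\; \|a\|' \;\le\; \|a\|,
  \qquad
  \|ab\|' \;\le\; \|a\|'\,\|b\|',
  \qquad
  \|e\|' \;=\; 1 .
\]
```

Applied to each $\mathcal{B}_\omega$ with unit $e_\omega$, this yields equivalent Banach algebra norms for which condition (a) holds trivially.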


In Section 2.2.5 of [RSS], a deﬁnition of weak suﬃciency is given which in its context, namely that of 𝐶 ∗ -algebras and ∗ -homomorphisms, amounts to the same as p.w. suﬃciency. Theorem 2.2.10 of [RSS] shows that the families in question are separating. In the general (non 𝐶 ∗ ) Banach algebra setting one has to be content with a weaker conclusion. Proposition 3.4. Let ℬ be a unital Banach algebra, and let {𝜙𝜔 : ℬ → ℬ𝜔 }𝜔∈Ω be a family of unital Banach algebra homomorphisms. If the family {𝜙𝜔 }𝜔∈Ω is p.w. suﬃcient or suﬃcient, then it is radical-separating. Proof. Write 𝑒 for the unit element in ℬ, and let 𝑒𝜔 and ∥.∥𝜔 stand for the unit element and norm in ℬ𝜔 , respectively. First, suppose that the family {𝜙𝜔 }𝜔∈Ω is p.w. suﬃcient, thus, in particular, sup𝜔∈Ω ∥𝑒𝜔 ∥𝜔 < ∞. Take 𝑥 in the left-hand side of (9). Then, for 𝜔 ∈ Ω and 𝑏 ∈ ℬ, we have 𝜙𝜔 (𝑥) = 0 and 𝜙𝜔 (𝑏𝑥 + 𝑒) = 𝜙𝜔 (𝑏)𝜙𝜔 (𝑥) + 𝜙𝜔 (𝑒) = 𝜙𝜔 (𝑒) = 𝑒𝜔 . So 𝜙𝜔 (𝑏𝑥+𝑒) is invertible in ℬ𝜔 and sup𝜔∈Ω ∥𝜙𝜔 (𝑏𝑥+𝑒)−1 ∥𝜔 = sup𝜔∈Ω ∥𝑒𝜔 ∥𝜔 < ∞. It follows that 𝑏𝑥 + 𝑒 is invertible in ℬ. Similarly 𝑥𝑏 + 𝑒 is invertible in ℬ, and we conclude that 𝑥 ∈ ℛ(ℬ). When the family {𝜙𝜔 }𝜔∈Ω is suﬃcient instead of p.w. suﬃcient, the argument is even simpler (and left to the reader). One can also argue that, in the sense explained above, suﬃciency implies p.w. suﬃciency. □ Corollary 3.5. Let ℬ be a unital Banach algebra. For 𝜔 in an index set Ω, let ℬ𝜔 be a spectrally regular Banach algebra and let 𝜙𝜔 : ℬ → ℬ𝜔 be a continuous unital homomorphism. If the family {𝜙𝜔 }𝜔∈Ω is suﬃcient or p.w. suﬃcient, then ℬ is spectrally regular. Proof. Combine Corollary 3.3 and Proposition 3.4.

□
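To make the difference between sufficiency and p.w. sufficiency concrete, here is a short worked sketch for the scalar algebra $\ell_\infty$ of bounded sequences with the point evaluations. The notation $\phi_n$ for the evaluation at $n$ is introduced here for illustration only.

```latex
% Point evaluations on \ell_\infty (assumed indexed by n = 1, 2, 3, ...):
\[
  \phi_n : \ell_\infty \to \mathbb{C}, \qquad \phi_n(b) = b(n).
\]
% p.w. sufficiency: if every b(n) is nonzero and the inverses are uniformly bounded,
% then the pointwise inverse is again a bounded sequence, so b is invertible:
\[
  \sup_{n} \bigl|\phi_n(b)^{-1}\bigr| < \infty
  \;\Longrightarrow\;
  b^{-1} = \bigl(b(n)^{-1}\bigr)_{n} \in \ell_\infty .
\]
% Failure of sufficiency: for b(n) = 1/n every \phi_n(b) is invertible in \mathbb{C}, but
\[
  \sup_{n} \bigl|\phi_n(b)^{-1}\bigr| = \sup_{n} n = \infty ,
\]
% so b has no inverse in \ell_\infty.
```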

In connection with Corollary 3.5 note that the existence of a p.w. sufficient family can often be much more easily established than that of a sufficient family. For an example, consider the Banach algebra $\ell_\infty$.

Comparing Corollaries 3.3 and 3.2, one is confronted with a striking difference between the conditions (9) and (8). In (9), both terms in the inclusion are ideals, and in fact ideals in one and the same given Banach algebra $\mathcal{B}$. In (8), however, the left-hand side is an ideal in a Banach subalgebra $\mathcal{B}$ of the underlying Banach algebra $\mathcal{B}(X)$, whereas the right-hand side is a shifted semigroup of elements in $\mathcal{B}(X)$. Here are some comments meant to elucidate the situation.

First let us look at (9). Evidently it follows from (9) that

$$\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega \subset \mathcal{G}(\mathcal{B}) - \{e\}, \tag{10}$$

where (again) 𝑒 denotes the unit element in ℬ, and 𝒢(ℬ) stands for the group of invertible elements in ℬ. However, as the left-hand side of (10) is an ideal, (10) in turn implies (9). Thus (9) and (10) amount to the same.
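The equivalence of (9) and (10) can be spelled out as follows. This is a sketch that relies on the same characterization of the radical as used in the proof of Proposition 3.4; the letter $J$ for the intersection of the kernels is introduced here for convenience.

```latex
% Notation (introduced here): J is the two-sided ideal appearing on the left of (9) and (10).
\[
  J \;=\; \bigcap_{\omega \in \Omega} \operatorname{Ker} \phi_\omega .
\]
% (9) => (10): every element of the radical becomes invertible after adding e.
\[
  x \in J \subset \mathcal{R}(\mathcal{B})
  \;\Longrightarrow\; x + e \in \mathcal{G}(\mathcal{B})
  \;\Longrightarrow\; x \in \mathcal{G}(\mathcal{B}) - \{e\}.
\]
% (10) => (9): J is an ideal, so bx and xb stay in J for every b in B.
\[
  x \in J,\ b \in \mathcal{B}
  \;\Longrightarrow\; bx,\ xb \in J \subset \mathcal{G}(\mathcal{B}) - \{e\}
  \;\Longrightarrow\; bx + e,\ xb + e \in \mathcal{G}(\mathcal{B})
  \;\Longrightarrow\; x \in \mathcal{R}(\mathcal{B}).
\]
```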


Turning now to (8), observe that there is a certain analogy between the right-hand sides of (8) and (10). Indeed, the set $\mathcal{F}(X)$ consists of the elements of $\mathcal{B}(X)$ that are invertible modulo the closed two-sided ideal $\mathcal{K}(X)$ of the compact operators on $X$. On the other hand, the right-hand side of (8) need not be contained in $\mathcal{B}$. In fact, the circumstance that $\mathcal{B}$ is embedded in the (generally) larger algebra $\mathcal{B}(X)$ of all bounded linear operators on $X$ is a key element in the proof of Corollary 3.2.

Given the role the ideal $\mathcal{K}(X)$ is playing in the background, one may wonder whether (8) can be reformulated in a form resembling (9), so with an ideal in the right-hand side of the inclusion. This is possible to the extent that Corollary 3.2 remains true when in (8) the right-hand side of the inclusion is replaced by $\kappa^{-1}\bigl[\mathcal{R}\bigl(\mathcal{B}(X)/\mathcal{K}(X)\bigr)\bigr]$ with $\kappa$ being the canonical mapping of $\mathcal{B}(X)$ onto the Calkin algebra $\mathcal{B}(X)/\mathcal{K}(X)$. In fact $\mathcal{K}(X) \subset \kappa^{-1}\bigl[\mathcal{R}\bigl(\mathcal{B}(X)/\mathcal{K}(X)\bigr)\bigr] \subset \mathcal{F}(X) - \{I_X\}$, and therefore each of the two inclusions

$$\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega \subset \kappa^{-1}\bigl[\mathcal{R}\bigl(\mathcal{B}(X)/\mathcal{K}(X)\bigr)\bigr], \tag{11}$$

$$\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega \subset \mathcal{K}(X), \tag{12}$$

is sufficient for (8) to hold. Now (11) and (12) bear some resemblance to (9). However, unlike (10) and (9) which simply amount to the same, the relationship between the conditions (8) and (11), and the relationship between (8) and (12), are not so clear. The reason is that the set featuring in the left-hand sides of (8), (11) and (12), although it is an ideal in $\mathcal{B}$, need not be an ideal in $\mathcal{B}(X)$. The upshot of this discussion is that, although several modifications of Corollary 3.2 are possible, the formulation given above seems to be the optimal one.

For completeness we add that in the previous two paragraphs, the ideal of the compact operators may be replaced by that of the strictly singular operators. For material on strictly singular operators, see Section III.2 in [Go] or Section 4.5 in [AA].

The new criterion for spectral regularity (Theorem 3.1) and its corollaries (Corollary 3.3 and Corollary 3.2) can be employed effectively in so far as there is an adequate supply of spectrally regular test algebras $\mathcal{B}_\omega$. Some classes of spectrally regular Banach algebras are described in Section 2. In this paper, the test algebras mostly employed are matrix algebras, while occasional use is made of an algebra which is not of that type.

One of the classes of spectrally regular Banach algebras mentioned in Section 2 is that of the PI-algebras. However, the property of being PI is often difficult or even impossible to check. On the other hand, there are many Banach algebras which become PI, hence spectrally regular, after factoring out the radical. In this connection it is fortunate that a Banach algebra $\mathcal{B}$ is spectrally regular if and only if $\mathcal{B}/\mathcal{R}(\mathcal{B})$ is (see Theorem 4.2 below). The Banach algebras in question are therefore suitable as test algebras too.


4. Applications

In this section we present applications of Corollaries 3.3 and 3.2. In particular we establish the spectral regularity of certain Banach algebras for which this was hitherto impossible. Along the way some new results based on the older methods are obtained too. The material is divided into five subsections.

4.1. Subalgebras and quotients

We begin with a special case of Corollary 3.3, worth stating in its own right.

Corollary 4.1. Let $\mathcal{B}$ and $\mathcal{A}$ be unital Banach algebras, and let $\Phi : \mathcal{B} \to \mathcal{A}$ be a continuous Banach algebra homomorphism. Assume $\operatorname{Ker}\Phi \subset \mathcal{R}(\mathcal{B})$ and $\mathcal{A}$ is spectrally regular. Then $\mathcal{B}$ is spectrally regular too.

Proof. In Corollary 3.3, take for $\Omega$ the singleton set $\{0\}$, for $\mathcal{B}_0$ the Banach algebra $\mathcal{A}$ (spectrally regular by assumption), and for $\phi_0$ the homomorphism $\Phi$. □

The situation where the Banach algebra homomorphism $\Phi$ in Corollary 4.1 happens to be injective is of particular interest. One can then view $\mathcal{B}$ as a continuously embedded subalgebra of $\mathcal{A}$. Thus, in particular, Corollary 4.1 implies that each closed unital subalgebra $\mathcal{A}$ of a spectrally regular Banach algebra $\mathcal{B}$ (where $\mathcal{A}$ need not have the same unit element as $\mathcal{B}$) is spectrally regular again. Another immediate consequence is that a unital Banach algebra $\mathcal{B}$ is spectrally regular provided it is finite dimensional. Indeed, if $n = \dim\mathcal{B}$, then $\mathcal{B}$ can be identified with a Banach subalgebra of $\mathbb{C}^{n\times n}$. For this, use the left regular representation of $\mathcal{B}$ into $\mathcal{B}(X)$, where $X$ is the $n$-dimensional Banach space obtained by considering $\mathcal{B}$ as a Banach space only (cf. the proof of Corollary 3.3).

Next we turn to quotient algebras. Here the situation is more involved. In fact there are two issues. First, is a quotient of a spectrally regular Banach algebra spectrally regular again? Second, if a quotient is spectrally regular, does it follow that the underlying algebra is spectrally regular too?
As concerns the first issue, in sharp contrast to what has just been observed for subalgebras, a quotient algebra of a spectrally regular Banach algebra need not be spectrally regular. The counterexample that we have uses elements developed in Subsection 4.2 below. For that reason it will be given there. Note that with this we also have an example of a surjective Banach algebra homomorphism $\Psi : \mathcal{A} \to \mathcal{B}$ such that $\mathcal{A}$ is spectrally regular while $\mathcal{B}$ is not. Thus what might be called the dual of Corollary 4.1, taken with $\Phi$ injective, does not hold.

For the second issue, as might be expected, the answer is generally negative too. Here is a counterexample. Let $\mathbb{N}$ stand for the set of positive integers, and consider the Banach space $\ell_2(\{0\} \cup \mathbb{N})$. Write it as a direct sum $\mathbb{C} \dotplus \ell_2$, where $\ell_2 = \ell_2(\mathbb{N})$, and take for $\mathcal{B}$ the Banach subalgebra of $\mathcal{B}(\mathbb{C} \dotplus \ell_2)$ consisting of all bounded linear operators from $\mathbb{C} \dotplus \ell_2$ into $\mathbb{C} \dotplus \ell_2$ having the diagonal form

$$\begin{bmatrix} \alpha & 0 \\ 0 & T \end{bmatrix}, \qquad \alpha \in \mathbb{C},\ T \in \mathcal{B}(\ell_2).$$


Now let $\mathcal{J}$ be the set of all operators in $\mathcal{B}$ of the type

$$\begin{bmatrix} 0 & 0 \\ 0 & T \end{bmatrix}, \qquad T \in \mathcal{B}(\ell_2).$$

Then $\mathcal{J}$ is a closed two-sided ideal in $\mathcal{B}$. As $\mathcal{B}/\mathcal{J}$ is isomorphic to $\mathbb{C}$, the quotient algebra $\mathcal{B}/\mathcal{J}$ is spectrally regular. However, $\mathcal{B}$ is not. In fact, along with $\mathcal{B}(\ell_2)$, the Banach algebra $\mathcal{B}$ features a nontrivial zero sum of idempotents, and this rules out the property of being spectrally regular (cf. the third paragraph of the introduction). Another way to see that $\mathcal{B}$ is not spectrally regular is by first noting that $\mathcal{B}(\ell_2)$ can be viewed as a closed unital subalgebra of $\mathcal{B}$, and then taking into account the remark made after the proof of Corollary 4.1. In view of Corollary 4.3 below, we emphasize that in this counterexample the quotient algebra $\mathcal{B}/\mathcal{J}$ is finite (in fact one) dimensional but that $\mathcal{J}$ is not contained in the radical of $\mathcal{B}$.

In the remainder of this subsection we focus on the special situation where the ideal $\mathcal{J}$ which is factored out is contained in the radical of the underlying Banach algebra $\mathcal{B}$. Observe that in this situation invertibility modulo $\mathcal{J}$ and invertibility in $\mathcal{B}$ amount to the same. This will be used several times later on.

Theorem 4.2. Let $\mathcal{B}$ be a unital Banach algebra, and let $\mathcal{J}$ be a closed two-sided ideal in $\mathcal{B}$ which is contained in the radical of $\mathcal{B}$. Then $\mathcal{B}$ is spectrally regular if and only if the quotient algebra $\mathcal{B}/\mathcal{J}$ has this property. In particular, $\mathcal{B}$ is spectrally regular if and only if so is $\mathcal{B}/\mathcal{R}(\mathcal{B})$.

Proof. First suppose $\mathcal{B}/\mathcal{J}$ is spectrally regular. Take for $\Omega$ the singleton set $\{0\}$, for $\mathcal{B}_0$ the Banach algebra $\mathcal{B}/\mathcal{J}$, and for $\phi_0$ the canonical mapping from $\mathcal{B}$ onto $\mathcal{B}_0$. Then $\operatorname{Ker}\phi_0 = \mathcal{J}$. By assumption $\mathcal{J} \subset \mathcal{R}(\mathcal{B})$. Hence the singleton family $\{\phi_0\}$ is radical-separating, and the desired result follows from Corollary 3.3. (Alternatively, one can use Corollary 3.5 after noting that the family $\{\phi_0\}$ is sufficient.) This proves the 'if part' of the theorem.

Next we turn to the 'only if part' and assume that $\mathcal{B}$ is spectrally regular.
Let $(\mathcal{B}/\mathcal{J}, \Delta, \hat F)$ be a spectral configuration and suppose it is spectrally winding free. It must be shown that $(\mathcal{B}/\mathcal{J}, \Delta, \hat F)$ is spectrally trivial. For this we shall use that the $\mathcal{B}/\mathcal{J}$-valued analytic functions can be lifted to $\mathcal{B}$. In other words, they can be written as the composition of an analytic $\mathcal{B}$-valued function with $\kappa$, the canonical mapping of $\mathcal{B}$ onto the quotient space $\mathcal{B}/\mathcal{J}$. That this is indeed possible can be seen from the proof of Theorem 1a in [Gra] which is based on Grothendieck's work on topological tensor products [Gro]; see also Section 3.0 in [ZKKP], [Ka], and Section 6.4 in [GL]. In the concrete situation that we have here, one can also proceed as follows, employing only lifting of continuous functions.

Denote the domain of the function $\hat F$ by $U$. Then $U$ is an open subset of the complex plane containing the closure $\overline{\Delta}$ of $\Delta$. Now let $\Delta_1$ be another bounded Cauchy domain such that $\overline{\Delta} \subset \Delta_1 \subset \overline{\Delta_1} \subset U$. Write $\hat f_1$ for the restriction of the function $\hat F$ to $\partial\Delta_1$. Then $\hat f_1 : \partial\Delta_1 \to \mathcal{B}/\mathcal{J}$ is a continuous function. There exists a continuous lifting of $\hat f_1$, that is, a function


$f_1 : \partial\Delta_1 \to \mathcal{B}$ such that $\hat f_1 = \kappa \circ f_1$ (see, e.g., [ZKKP], Section 1.0). Define the function $F : \Delta_1 \to \mathcal{B}$ by

$$F(\lambda) = \frac{1}{2\pi i}\int_{\partial\Delta_1}\frac{1}{\mu - \lambda}\,f_1(\mu)\,d\mu, \qquad \lambda \in \Delta_1.$$

Then $F$ is analytic on $\Delta_1$. Also, for $\lambda \in \Delta_1$,

$$\begin{aligned}
(\kappa \circ F)(\lambda) &= \frac{1}{2\pi i}\int_{\partial\Delta_1}\frac{1}{\mu - \lambda}\,(\kappa \circ f_1)(\mu)\,d\mu \\
&= \frac{1}{2\pi i}\int_{\partial\Delta_1}\frac{1}{\mu - \lambda}\,\hat f_1(\mu)\,d\mu \\
&= \frac{1}{2\pi i}\int_{\partial\Delta_1}\frac{1}{\mu - \lambda}\,\hat F(\mu)\,d\mu,
\end{aligned}$$

and the latter expression is equal to $\hat F(\lambda)$ by the Cauchy integral formula.

With $F$ we form a new spectral configuration $(\mathcal{B}, \Delta, F)$. That this is indeed a spectral configuration can be seen as follows. For $\lambda$ in $\partial\Delta$ we have that $\kappa\bigl(F(\lambda)\bigr)$ is invertible in $\hat{\mathcal{B}} = \mathcal{B}/\mathcal{J}$, so $F(\lambda)$ is invertible modulo the ideal $\mathcal{J}$. But then, making use of an observation made earlier, $F(\lambda)$ is invertible in $\mathcal{B}$.

Clearly $\kappa\bigl(LR(F; \Delta)\bigr) = LR(\hat F; \Delta)$, and the latter is quasinilpotent. Take $\mu$ in $\mathbb{C} \setminus \{0\}$. Then $\mu\kappa(e) - LR(\hat F; \Delta)$ is invertible in $\mathcal{B}/\mathcal{J}$. Here $e$ stands for the unit element in $\mathcal{B}$. Now $\kappa\bigl(\mu e - LR(F; \Delta)\bigr) = \mu\kappa(e) - LR(\hat F; \Delta)$. Thus $\mu e - LR(F; \Delta)$ is invertible modulo $\mathcal{J}$, hence invertible in $\mathcal{B}$. Thus we have proved that $LR(F; \Delta)$ is quasinilpotent, i.e., the spectral configuration $(\mathcal{B}, \Delta, F)$ is spectrally winding free.

As $\mathcal{B}$ is assumed to be spectrally regular, we may conclude that $(\mathcal{B}, \Delta, F)$ is spectrally trivial. But then so is the spectral configuration $(\hat{\mathcal{B}}, \Delta, \hat F)$. □

The following result is a simple consequence of Theorem 4.2 and the remark made in the (second part of the) paragraph after the proof of Corollary 4.1 (see also the counterexample presented above).

Corollary 4.3. Let $\mathcal{B}$ be a unital Banach algebra, and let $\mathcal{J}$ be a closed two-sided ideal in $\mathcal{B}$ which is contained in the radical of $\mathcal{B}$. Suppose the quotient algebra $\mathcal{B}/\mathcal{J}$ is finite dimensional. Then $\mathcal{B}$ is spectrally regular. In particular, $\mathcal{B}$ is spectrally regular whenever $\mathcal{R}(\mathcal{B})$ has finite codimension in $\mathcal{B}$.

In Section 2, the paragraph directly following the proof of Theorem 2.1, it was indicated that it is possible to work with a somewhat weaker form of spectral regularity than the one adopted here (cf.
the first four paragraphs of the introduction). For this weaker version (involving vanishing logarithmic residues instead of quasinilpotent ones), we have not been able to prove the 'only if part' of Theorem 4.2; neither do we have a counterexample showing that it need not hold.

Now, instead of looking at spectral regularity, we consider the (stronger) property of possessing a sufficient family of matrix representations. We have the following analogue of Theorem 4.2.


Proposition 4.4. Let $\mathcal{B}$ be a unital Banach algebra, and let $\mathcal{J}$ be a closed two-sided ideal in $\mathcal{B}$ which is contained in the radical of $\mathcal{B}$. Then $\mathcal{B}$ possesses a sufficient family of matrix representations if and only if so does the quotient algebra $\mathcal{B}/\mathcal{J}$. In particular, $\mathcal{B}$ possesses a sufficient family of matrix representations if and only if this is the case for $\mathcal{B}/\mathcal{R}(\mathcal{B})$.

In our review of known material presented in Section 2, we mentioned that the last statement in Proposition 4.4 is true when one works with sufficient families of matrix representations having the additional property of being of finite order. This additional property is not required here. To prove Proposition 4.4, we need the following lemma.

Lemma 4.5. Let $\mathcal{B}$ be a unital Banach algebra. Then $\mathcal{B}$ possesses a sufficient family of matrix representations if and only if $\mathcal{B}$ possesses a sufficient family of surjective matrix representations.

Proof. The 'if part' of the lemma is trivial. So we concentrate on the 'only if part'. Let $\phi : \mathcal{B} \to \mathbb{C}^{n\times n}$ be a unital matrix representation of $\mathcal{B}$. It suffices to show that there exist a positive integer $m$, positive integers $n_1, \ldots, n_m$, and surjective unital matrix representations

$$\phi_k : \mathcal{B} \to \mathbb{C}^{n_k\times n_k}, \qquad k = 1, \ldots, m,$$

with the following properties: for $b \in \mathcal{B}$, the matrix $\phi(b)$ is invertible in $\mathbb{C}^{n\times n}$ if and only if $\phi_k(b)$ is invertible in $\mathbb{C}^{n_k\times n_k}$, $k = 1, \ldots, m$. The argument runs as follows. If the matrix representation $\phi$ itself is surjective, there is nothing to prove (case $m = 1$). Assume it is not, so $\phi[\mathcal{B}]$ is a proper subalgebra of $\mathbb{C}^{n\times n}$. Applying Burnside's Theorem (cf. [LR]), we see that $\phi[\mathcal{B}]$ has a nontrivial invariant subspace, i.e., there is a nontrivial subspace $V$ of $\mathbb{C}^{n}$ such that $\phi(b)[V]$ is contained in $V$ for all $b$ in $\mathcal{B}$. But then there exist an invertible $n \times n$ matrix $S$, positive integers $n_-$ and $n_+$, a unital matrix representation $\phi_- : \mathcal{B} \to \mathbb{C}^{n_-\times n_-}$ and a unital matrix representation $\phi_+ : \mathcal{B} \to \mathbb{C}^{n_+\times n_+}$ such that $\phi$ has the form

$$\phi(b) = S^{-1}\begin{bmatrix} \phi_-(b) & * \\ 0 & \phi_+(b) \end{bmatrix} S, \qquad b \in \mathcal{B}.$$

Clearly $\phi(b)$ is invertible in $\mathbb{C}^{n\times n}$ if and only if $\phi_-(b)$ is invertible in $\mathbb{C}^{n_-\times n_-}$ and $\phi_+(b)$ is invertible in $\mathbb{C}^{n_+\times n_+}$. If $\phi_-$ and $\phi_+$ are both surjective we are done (case $m = 2$); if not we can again apply Burnside's Theorem and decompose further. This process terminates after at most $n$ steps. A completely rigorous argument can be given using induction. □

Proof of Proposition 4.4. To establish the 'only if part' of the proposition, we may assume that $\mathcal{B}$ possesses a sufficient family $\{\phi_\omega : \mathcal{B} \to \mathbb{C}^{m_\omega\times m_\omega}\}_{\omega\in\Omega}$ of surjective matrix representations (see Lemma 4.5). Take $\omega \in \Omega$. As is well known, $\mathbb{C}^{m_\omega\times m_\omega}$ is (semi)simple. Thus the remark made in the second paragraph after the proof of Corollary 3.3 applies. It gives $\mathcal{R}(\mathcal{B}) \subset \operatorname{Ker}\phi_\omega$. But then $\mathcal{J} \subset \operatorname{Ker}\phi_\omega$ and $\phi_\omega$


induces a continuous unital Banach algebra homomorphism $\Phi_\omega$ from $\mathcal{B}/\mathcal{J}$ into $\mathbb{C}^{m_\omega\times m_\omega}$ which satisfies $\phi_\omega = \Phi_\omega \circ \kappa$. Here $\kappa$ is the canonical homomorphism of $\mathcal{B}$ onto $\mathcal{B}/\mathcal{J}$. The family $\{\Phi_\omega : \mathcal{B}/\mathcal{J} \to \mathbb{C}^{m_\omega\times m_\omega}\}_{\omega\in\Omega}$ is sufficient. Indeed, if $\Phi_\omega\bigl(\kappa(b)\bigr) = \phi_\omega(b)$ is invertible for each $\omega \in \Omega$, then $b$ is invertible in $\mathcal{B}$, hence $\kappa(b)$ is invertible in $\mathcal{B}/\mathcal{J}$.

Next suppose that $\mathcal{B}/\mathcal{J}$ possesses a sufficient family of matrix representations, say $\{\Phi_\omega : \mathcal{B}/\mathcal{J} \to \mathbb{C}^{m_\omega\times m_\omega}\}_{\omega\in\Omega}$. With $\kappa$ as above, put $\phi_\omega = \Phi_\omega \circ \kappa$. Take $b \in \mathcal{B}$, and assume $\phi_\omega(b)$ is invertible for each $\omega \in \Omega$. Then $\Phi_\omega\bigl(\kappa(b)\bigr)$ is invertible for each $\omega \in \Omega$, and we may conclude that $\kappa(b)$ is invertible in $\mathcal{B}/\mathcal{J}$. In other words, $b$ is invertible modulo the ideal $\mathcal{J}$. As this ideal is contained in the radical of $\mathcal{B}$, it follows that $b$ is invertible in $\mathcal{B}$. Thus $\{\phi_\omega : \mathcal{B} \to \mathbb{C}^{m_\omega\times m_\omega}\}_{\omega\in\Omega}$ is a sufficient family of matrix representations, and the 'if part' of Proposition 4.4 has been proved. □

One may ask whether in Proposition 4.4 sufficient families can be replaced by (radical-)separating families. If $\mathcal{B}/\mathcal{J}$ has a radical-separating family of matrix representations, then so has $\mathcal{B}$. The proof is a slight modification of the argument given above to prove the 'if part' of Proposition 4.4 and employs the fact that $\mathcal{R}(\mathcal{B}/\mathcal{J}) = \kappa[\mathcal{R}(\mathcal{B})]$, where $\kappa$ is the canonical homomorphism from $\mathcal{B}$ onto $\mathcal{B}/\mathcal{J}$. How about the converse? Here the situation is less clear. If $\mathcal{B}$ possesses a (radical-)separating family of surjective matrix representations, then this is also the case for $\mathcal{B}/\mathcal{J}$. The proof is analogous to the reasoning presented above to prove the 'only if part' of Proposition 4.4 and again employs the fact that $\mathcal{R}(\mathcal{B}/\mathcal{J}) = \kappa[\mathcal{R}(\mathcal{B})]$. However, we do not know whether the existence of a (radical-)separating family of matrix representations for $\mathcal{B}$ implies the existence of such a family consisting of surjective homomorphisms. In other words, we do not know whether there is an analogue of Lemma 4.5 for families that are radical-separating (or even separating) instead of sufficient.
Our conjecture is: there is not. Thus the question whether the existence of a (radical-)separating family of matrix representations for $\mathcal{B}$ generally implies the existence of such a family for $\mathcal{B}/\mathcal{J}$ is open.

4.2. Algebras of $\ell_\infty$-type

Let $T$ be a nonempty set, and let $\mathsf{B} = \{\mathcal{B}_t\}_{t\in T}$ be a family of unital Banach algebras for which it is assumed that $\sup_{t\in T}\|e_t\|_t < \infty$. Here $e_t$ stands for the unit element in $\mathcal{B}_t$ and $\|\cdot\|_t$ denotes the norm on $\mathcal{B}_t$. Write $\ell_\infty^{\mathsf{B}}$ for the $\ell_\infty$-direct product of the family $\mathsf{B}$ (cf. [P], Subsection 1.3.1). Thus $\ell_\infty^{\mathsf{B}}$ consists of all $\boldsymbol{f}$ in the Cartesian product $\prod_{t\in T}\mathcal{B}_t$ such that

$$|||\boldsymbol{f}||| = \sup_{t\in T}\|\boldsymbol{f}(t)\|_t < \infty.$$

With the operations of addition, scalar multiplication and multiplication deﬁned pointwise, and with ∣∣∣.∣∣∣ as norm, ℓB∞ is a unital Banach algebra.
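The role of the assumption $\sup_{t\in T}\|e_t\|_t < \infty$ can be seen from a short check. The following is a sketch only; the symbol $\boldsymbol{e}$ for the candidate unit element is notation introduced here.

```latex
% Candidate unit element (notation introduced here): the family of the units e_t.
\[
  \boldsymbol{e} = (e_t)_{t \in T},
  \qquad
  |||\boldsymbol{e}||| = \sup_{t \in T} \|e_t\|_t < \infty ,
\]
% so e belongs to the l_infty-direct product exactly because of the boundedness assumption.
% Submultiplicativity of |||.||| follows pointwise from that of the norms ||.||_t:
\[
  |||\boldsymbol{f}\boldsymbol{g}|||
  = \sup_{t \in T} \|\boldsymbol{f}(t)\boldsymbol{g}(t)\|_t
  \le \sup_{t \in T} \|\boldsymbol{f}(t)\|_t \, \|\boldsymbol{g}(t)\|_t
  \le |||\boldsymbol{f}||| \; |||\boldsymbol{g}||| .
\]
```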


From Theorem 4.1 in [BES7] we know that (even when the constituting algebras ℬ𝑡 are matrix algebras) ℓB∞ need not possess a suﬃcient family of matrix representations. So, in general, the road to establishing spectral regularity for Banach algebras of the type ℓB∞ via Theorem 4.1 in [BES2] is blocked, and Theorem 4.2 in [BES2], the other Gelfand type criterion in [BES2], does not seem to work either. Corollary 3.3 helps out in a surprisingly simple way. Theorem 4.6. Let 𝑇 be a nonempty set, and let B = {ℬ𝑡 }𝑡∈𝑇 be a family of unital Banach algebras. Then ℓB ∞ is spectrally regular if and only if so are all the Banach algebras ℬ𝑡 , 𝑡 ∈ 𝑇 . Proof. The family of point evaluations on ℓB∞ is obviously separating the points of ℓB∞ , so Corollary 3.3 gives the ‘if part’ of the theorem. The ‘only if part’ is immediate from the remark made after the proof of Corollary 4.1. □ The ‘if part’ of Theorem 4.6 can also be obtained from Corollary 3.5. Indeed, the family of point evaluations on ℓB∞ is easily seen to be p.w. suﬃcient. In general it is not suﬃcient, as can be seen by looking at ℓ∞ . Specializing to the case where the Banach algebras ℬ𝑡 all coincide with a single Banach algebra ℬ, we write ℓ∞ (𝑇 ; ℬ) for the Banach algebra of all bounded functions from 𝑇 into ℬ, provided with the pointwise algebraic operations and the supremum norm. Corollary 4.7. Let 𝑇 be a nonempty set, and let ℬ be a unital Banach algebra. Then ℓ∞ (𝑇 ; ℬ) is spectrally regular if and only if so is ℬ. Combining Corollaries 4.7 and 4.1, one readily gets a variety of results. For instance, if 𝑇 is a compact topological space and ℬ is a spectrally regular Banach algebra, then the Banach algebra 𝒞(𝑇 ; ℬ) of all continuous functions from 𝑇 into ℬ (provided with the pointwise algebraic operations and the supremum norm) is spectrally regular. 
Another example is $\mathcal{AP}(\mathbb{R}; \mathcal{B})$, the Banach algebra of continuous almost periodic functions from $\mathbb{R}$ into $\mathcal{B}$ (again provided with the pointwise algebraic operations and the supremum norm): if $\mathcal{B}$ is spectrally regular, then so is $\mathcal{AP}(\mathbb{R}; \mathcal{B})$. Finally, if $\mathcal{B}$ is a spectrally regular Banach algebra, then the Wiener algebra $\mathcal{W}(\mathbb{T}; \mathcal{B})$ of $\mathcal{B}$-valued functions on the unit circle $\mathbb{T}$ is spectrally regular. This follows by noting that $\mathcal{W}(\mathbb{T}; \mathcal{B})$ is continuously embedded in $\mathcal{C}(\mathbb{T}; \mathcal{B})$.

Taking advantage of Theorem 4.6, we close this subsection with an example of a spectrally regular $C^*$-algebra $\mathcal{A}$ having a closed two-sided ideal $\mathcal{J}$ (closed under the $*$-operation) such that the quotient Banach algebra $\mathcal{A}/\mathcal{J}$ is not spectrally regular. As noted in the discussion after Corollary 4.1, the existence of such an example is in sharp contrast with the fact that each Banach subalgebra of a spectrally regular Banach algebra is spectrally regular again.

To obtain the example we start with $\ell_\infty^{\mathsf{M}}$ with $\mathsf{M} = \{\mathbb{C}^{n\times n}\}_{n\in\mathbb{N}}$. This Banach algebra is spectrally regular by Theorem 4.6. We now pass to a $C^*$-subalgebra of


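$\ell_\infty^{\mathsf{M}}$. For $n = 1, 2, 3, \ldots$, define $P_n : \ell_2 \to \mathbb{C}^n$ and $Q_n : \mathbb{C}^n \to \ell_2$ by

$$P_n\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad Q_n\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \\ 0 \\ 0 \\ \vdots \end{pmatrix}.$$

Let $\ell_{\infty,*}^{\mathsf{M}}$ consist of all $\boldsymbol{f} \in \ell_\infty^{\mathsf{M}}$ such that the strong limits $\operatorname{s-lim}_{n\to\infty} Q_n\boldsymbol{f}(n)P_n$ and $\operatorname{s-lim}_{n\to\infty} Q_n\boldsymbol{f}(n)^*P_n$ exist in $\mathcal{B}(\ell_2)$. Then $\ell_{\infty,*}^{\mathsf{M}}$ is a $C^*$-subalgebra of $\ell_\infty^{\mathsf{M}}$. Since $\ell_\infty^{\mathsf{M}}$ is spectrally regular, so is $\ell_{\infty,*}^{\mathsf{M}}$. Introduce the continuous $C^*$-homomorphism $\Psi : \ell_{\infty,*}^{\mathsf{M}} \to \mathcal{B}(\ell_2)$ by $\Psi(\boldsymbol{f}) = \operatorname{s-lim}_{n\to\infty} Q_n\boldsymbol{f}(n)P_n$. Take $T \in \mathcal{B}(\ell_2)$, and let $\boldsymbol{g} = (\boldsymbol{g}(1), \boldsymbol{g}(2), \boldsymbol{g}(3), \ldots)$ be given by $\boldsymbol{g}(n) = P_nTQ_n$ (so $\boldsymbol{g}$ is built from the finite sections of $T$). Then

$$\operatorname{s-lim}_{n\to\infty} Q_n\boldsymbol{g}(n)P_n = T \qquad\text{and}\qquad \operatorname{s-lim}_{n\to\infty} Q_n\boldsymbol{g}(n)^*P_n = T^*.$$

Hence $\boldsymbol{g} \in \ell_{\infty,*}^{\mathsf{M}}$ and $\Psi(\boldsymbol{g}) = T$. We conclude that $\Psi : \ell_{\infty,*}^{\mathsf{M}} \to \mathcal{B}(\ell_2)$ is surjective. Put $\mathcal{J} = \operatorname{Ker}\Psi$. Then $\mathcal{J}$ is a closed two-sided ideal in $\ell_{\infty,*}^{\mathsf{M}}$ (closed under the $*$-operation) and the quotient space $\ell_{\infty,*}^{\mathsf{M}}/\mathcal{J}$ is $C^*$-isomorphic to $\mathcal{B}(\ell_2)$. As $\mathcal{B}(\ell_2)$ lacks the property of being spectrally regular, so does $\ell_{\infty,*}^{\mathsf{M}}/\mathcal{J}$.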
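The strong convergence of the finite sections used in this example can be verified directly. The following is a sketch; the abbreviation $\Pi_n = Q_nP_n$ is introduced here and denotes the orthogonal projection of $\ell_2$ onto the span of the first $n$ coordinates.

```latex
% Notation (introduced here): \Pi_n = Q_n P_n. Note that P_n Q_n is the identity on C^n.
% With g(n) = P_n T Q_n one gets Q_n g(n) P_n = \Pi_n T \Pi_n, and for every x in l_2:
\[
  \|\Pi_n T \Pi_n x - T x\|
  \;\le\; \|\Pi_n T(\Pi_n x - x)\| + \|(\Pi_n - I)\,T x\|
  \;\le\; \|T\|\,\|\Pi_n x - x\| + \|(\Pi_n - I)\,T x\|
  \;\longrightarrow\; 0 ,
\]
% because \Pi_n y -> y for each fixed y in l_2 and ||\Pi_n|| <= 1.
% Since g(n)^* = P_n T^* Q_n, the same estimate with T^* in place of T gives
\[
  \operatorname{s-lim}_{n \to \infty} Q_n\, g(n)^{*} P_n = T^{*} .
\]
```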

4.3. Abstract matrix algebras

Let $\mathcal{B}$ be a unital Banach algebra, let $n$ be a positive integer, and let $\mathcal{B}^{n\times n}$ stand for the set of $n \times n$ matrices with entries from $\mathcal{B}$. With the standard algebraic operations, and one of the usual norms (see, for instance, [P], Subsection 1.6.9), $\mathcal{B}^{n\times n}$ is again a unital Banach algebra. Clearly $\mathcal{B}$ can be identified with the Banach subalgebra of $\mathcal{B}^{n\times n}$ consisting of all $n \times n$ diagonal matrices in $\mathcal{B}^{n\times n}$ with constant diagonal. Thus $\mathcal{B}$ is spectrally regular whenever $\mathcal{B}^{n\times n}$ has this property (see Corollary 4.1). What about the converse? Formulated in a more flexible way: if $\mathcal{B}$ is spectrally regular, under what additional conditions can one conclude that $\mathcal{B}^{n\times n}$ is spectrally regular too? The complete answer to this question is not known; two positive results that we have been able to obtain are presented below. To give the proper context for the first, we recall that a Banach algebra is spectrally regular provided it possesses a radical-separating family of matrix representations (special case of Corollary 3.3).

Proposition 4.8. Let $\mathcal{B}$ be a unital Banach algebra, and let $n$ be a positive integer. Suppose $\mathcal{B}$ possesses a radical-separating family of matrix representations (so $\mathcal{B}$ is spectrally regular). Then the matrix algebra $\mathcal{B}^{n\times n}$ has a radical-separating family of matrix representations too, hence it is spectrally regular. Conversely, if $\mathcal{B}^{n\times n}$ has a radical-separating family of matrix representations, then so has $\mathcal{B}$.

As will be clear from the proof, the proposition remains true when radical-separating is replaced by separating. The modification of the proposition involving unital matrix representations is correct also.


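Proof. Let $\{\phi_\omega : \mathcal{B} \to \mathbb{C}^{m_\omega\times m_\omega}\}_{\omega\in\Omega}$ be a family of matrix representations. For $\omega \in \Omega$, define $\Phi_\omega : \mathcal{B}^{n\times n} \to \mathbb{C}^{nm_\omega\times nm_\omega}$ by

$$\Phi_\omega\bigl([b_{jk}]_{j,k=1}^n\bigr) = [\phi_\omega(b_{jk})]_{j,k=1}^n.$$

Then $\Phi_\omega$ is a matrix representation (unital when $\phi_\omega$ is). Clearly

$$[b_{jk}]_{j,k=1}^n \in \bigcap_{\omega\in\Omega}\operatorname{Ker}\Phi_\omega \iff b_{jk} \in \bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega, \quad j, k = 1, \ldots, n.$$

If $\{\phi_\omega\}_{\omega\in\Omega}$ is separating, then $\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega = \{0\}$, hence $\bigcap_{\omega\in\Omega}\operatorname{Ker}\Phi_\omega = \{0\}$, so $\{\Phi_\omega\}$ is separating too. Next suppose $\{\phi_\omega\}_{\omega\in\Omega}$ is radical-separating. Then $\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega \subset \mathcal{R}(\mathcal{B})$, and we see that

$$[b_{jk}]_{j,k=1}^n \in \bigcap_{\omega\in\Omega}\operatorname{Ker}\Phi_\omega \implies b_{jk} \in \mathcal{R}(\mathcal{B}), \quad j, k = 1, \ldots, n.$$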

Now the radical of $\mathcal{B}^{n\times n}$ consists of all matrices in $\mathcal{B}^{n\times n}$ with entries in $\mathcal{R}(\mathcal{B})$. This well-known result can be found, for instance, as Proposition 5.14 in [CR]; cf. also Proposition 1.1.15 in [RSS] for a more general observation on ideals in matrix algebras. It follows that $\bigcap_{\omega\in\Omega}\operatorname{Ker}\Phi_\omega \subset \mathcal{R}(\mathcal{B}^{n\times n})$, i.e., the family $\{\Phi_\omega\}$ is radical-separating.

To start the argument for the second part, recall that $\mathcal{B}$ can be identified with the inverse closed Banach subalgebra $\mathcal{D}$ of $\mathcal{B}^{n\times n}$ consisting of all $n \times n$ diagonal matrices in $\mathcal{B}^{n\times n}$ with constant diagonal. Let $\{\Phi_\omega : \mathcal{B}^{n\times n} \to \mathbb{C}^{k_\omega\times k_\omega}\}_{\omega\in\Omega}$ be a family of matrix representations, and, for $\omega \in \Omega$, let $\phi_\omega$ be the restriction of $\Phi_\omega$ to $\mathcal{D}$. Then $\phi_\omega : \mathcal{D} \to \mathbb{C}^{k_\omega\times k_\omega}$ is a matrix representation (unital when $\Phi_\omega$ is). Clearly $\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega = \mathcal{D} \cap \bigcap_{\omega\in\Omega}\operatorname{Ker}\Phi_\omega$. If $\{\Phi_\omega\}_{\omega\in\Omega}$ is separating, then $\bigcap_{\omega\in\Omega}\operatorname{Ker}\Phi_\omega = \{0\}$, hence $\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega = \{0\}$, so $\{\phi_\omega\}$ is separating too. Next assume $\{\Phi_\omega\}_{\omega\in\Omega}$ is radical-separating. Thus $\bigcap_{\omega\in\Omega}\operatorname{Ker}\Phi_\omega \subset \mathcal{R}(\mathcal{B}^{n\times n})$, and it follows that $\bigcap_{\omega\in\Omega}\operatorname{Ker}\phi_\omega \subset \mathcal{D} \cap \mathcal{R}(\mathcal{B}^{n\times n})$. The right-hand side of this inclusion is contained in the radical of $\mathcal{D}$ because $\mathcal{D}$ is inverse closed in $\mathcal{B}^{n\times n}$. Hence $\{\phi_\omega\}_{\omega\in\Omega}$ is radical-separating, as desired. □

Our next result is concerned with a special case of the situation covered by the first part of Proposition 4.8. However, the stronger condition that is imposed (cf. Proposition 3.4) allows for a correspondingly stronger conclusion. Anticipating the proof to be given, we mention an important result due to Procesi and Small [PS] which will serve as an essential tool in the argument: if $\mathcal{B}$ is a PI-algebra, then so is the matrix algebra $\mathcal{B}^{n\times n}$ ($n$ a positive integer). For material on PI-algebras, see Section 2.

Proposition 4.9. Let $\mathcal{B}$ be a unital Banach algebra, let $n$ be a positive integer, and suppose $\mathcal{B}$ possesses a sufficient family of matrix representations of finite order (so $\mathcal{B}$ is spectrally regular). Then the matrix algebra $\mathcal{B}^{n\times n}$ has a sufficient family of matrix representations of finite order too, hence it is spectrally regular.

Proof. The hypothesis on $\mathcal{B}$ amounts to the requirement that the quotient algebra $\mathcal{B}/\mathcal{R}(\mathcal{B})$ is PI (see Section 2, and the references given there). Now apply


H. Bart, T. Ehrhardt and B. Silbermann

the result of Procesi and Small quoted above. It follows that the matrix algebra $(\mathcal{B}/\mathcal{R}(\mathcal{B}))^{n\times n}$ is PI. Write $\phi$ for the canonical mapping of $\mathcal{B}$ onto $\mathcal{B}/\mathcal{R}(\mathcal{B})$, and define $\Phi:\mathcal{B}^{n\times n}\to(\mathcal{B}/\mathcal{R}(\mathcal{B}))^{n\times n}$ by $\Phi([b_{jk}]_{j,k=1}^{n})=[\phi(b_{jk})]_{j,k=1}^{n}$. Then $\Phi$ is a surjective algebra homomorphism and, using Proposition 5.14 in [CR] again, its null space is $\mathcal{R}(\mathcal{B}^{n\times n})$. Thus $\mathcal{B}^{n\times n}/\mathcal{R}(\mathcal{B}^{n\times n})$, being algebraically isomorphic to $(\mathcal{B}/\mathcal{R}(\mathcal{B}))^{n\times n}$, is a PI-algebra. But then, as we wanted to prove, $\mathcal{B}^{n\times n}$ has a sufficient family of matrix representations of finite order. □

As was noted before, $\mathcal{B}$ can be identified with the inverse closed Banach subalgebra of $\mathcal{B}^{n\times n}$ consisting of all $n\times n$ diagonal matrices in $\mathcal{B}^{n\times n}$ with constant diagonal. Hence, if $\mathcal{B}^{n\times n}$ has a sufficient family of matrix representations of finite order, then so does $\mathcal{B}$. In combination with Proposition 4.9 this gives: the matrix algebra $\mathcal{B}^{n\times n}$ has a sufficient family of matrix representations of finite order if and only if $\mathcal{B}$ does. This bears a certain analogy to Proposition 4.4. The latter has no finite order condition on the sufficient family of matrix representations, however. We do not know whether one can do without this restriction here too.

Let us finish this subsection with a simple observation concerning the Banach subalgebra $\mathcal{B}^{n\times n}_{\mathrm{upper}}$ of $\mathcal{B}^{n\times n}$ consisting of the upper triangular $n\times n$ matrices with entries in $\mathcal{B}$.

Proposition 4.10. If the unital Banach algebra $\mathcal{B}$ is spectrally regular, then $\mathcal{B}^{n\times n}_{\mathrm{upper}}$ is spectrally regular too.

Proof. The homomorphisms $\phi_1,\dots,\phi_n$, with $\phi_k$ mapping a matrix from $\mathcal{B}^{n\times n}_{\mathrm{upper}}$ into its $k$th diagonal element, form a sufficient family of Banach algebra homomorphisms mapping $\mathcal{B}^{n\times n}_{\mathrm{upper}}$ into the Banach algebra $\mathcal{B}$, and the latter is spectrally regular by assumption. □
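The proof of Proposition 4.10 rests on the fact that taking the $k$th diagonal entry is multiplicative on upper triangular matrices. The following is a minimal numerical sketch of that fact, under the simplifying assumption that the entries are integers (so the role of $\mathcal{B}$ is played by a commutative algebra); the helper names `matmul`, `random_upper` and `diagonal` are introduced here for illustration only and do not come from the paper. It also checks that a matrix annihilated by all $\phi_k$ is strictly upper triangular and therefore nilpotent.

```python
import random

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def random_upper(n, rng):
    """Random upper triangular n x n matrix with small integer entries."""
    return [[rng.randint(-5, 5) if j >= i else 0 for j in range(n)]
            for i in range(n)]

def diagonal(a):
    """The list (phi_1(a), ..., phi_n(a)) of diagonal entries."""
    return [a[k][k] for k in range(len(a))]

rng = random.Random(0)
n = 4
a, b = random_upper(n, rng), random_upper(n, rng)

# Each phi_k is multiplicative on upper triangular matrices.
diag_of_product = diagonal(matmul(a, b))
product_of_diags = [x * y for x, y in zip(diagonal(a), diagonal(b))]
is_multiplicative = diag_of_product == product_of_diags

# A matrix in the common kernel of phi_1, ..., phi_n is strictly upper
# triangular, hence nilpotent: its n-th power vanishes.
strict = [[a[i][j] if j > i else 0 for j in range(n)] for i in range(n)]
power = strict
for _ in range(n - 1):
    power = matmul(power, strict)
kernel_is_nilpotent = all(entry == 0 for row in power for entry in row)

print(is_multiplicative, kernel_is_nilpotent)
```

The sketch only illustrates the algebra behind the diagonal homomorphisms; it does not verify sufficiency of the family for a general Banach algebra $\mathcal{B}$.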

In Proposition 4.10, upper triangularity can of course be replaced by lower triangularity. For Banach algebras of operators, triangularity can be brought into connection with families of invariant subspaces. This line of thought is pursued in the next subsection.

4.4. Algebras of operators with prescribed invariant subspaces

Let $X$ be a complex Banach space and let $\mathcal{M}$ be a family of closed nontrivial subspaces of $X$. By $\mathcal{B}(X;\mathcal{M})$ we denote the set of all operators $T\in\mathcal{B}(X)$ such that $T[M]\subset M$ for all $M\in\mathcal{M}$. Clearly $\mathcal{B}(X;\mathcal{M})$ is a Banach subalgebra of $\mathcal{B}(X)$. It is our aim to give sufficient conditions in order that $\mathcal{B}(X;\mathcal{M})$ is spectrally regular. An obvious condition of this type is that $X$ is finite dimensional so that $\mathcal{B}(X;\mathcal{M})$ can be identified with a subalgebra of $\mathbb{C}^{m\times m}$ where $m$ is the dimension of $X$. Hence, from now on, we assume that $X$ is infinite dimensional (so that $\mathcal{B}(X)$ and its Banach subalgebras need not be spectrally regular).

Prominent instances of algebras of the type $\mathcal{B}(X;\mathcal{M})$ are the Banach subalgebra of $\mathcal{B}(\ell_2(\mathbb{N}))$ consisting of block upper triangular operators (with respect to a given orthonormal basis), the Banach subalgebra of $\mathcal{B}(\ell_2(\mathbb{N}))$ consisting of

Spectral Regularity and Non-commutative Gelfand Theory


block lower triangular operators, and the Banach subalgebra of $\mathcal{B}(\ell_2(\mathbb{Z}))$ consisting of block upper (or, alternatively, lower) triangular operators (all the time with finite but possibly variable block size). For these, spectral regularity can be established with the help of Corollary 3.2. However, basically the same argument as the one employed for these cases gives a more general result which shows that it makes sense to have $\mathcal{F}(X)-\{I_X\}$ in the right-hand side of (8); see the discussion involving the expressions (11) and (12) in Section 3.

To facilitate the further exposition, we need some preparations. As before $X$ will be an infinite-dimensional Banach space. Given subspaces $M$ and $N$ of $X$, we say that $M$ is almost included in $N$, written $M\prec N$, if $\dim M/(M\cap N)$ is finite. It is a well-known fact that $\dim M/(M\cap N)=\dim (M+N)/N$. Hence $M\prec N$ if and only if $\dim (M+N)/N<\infty$. If $M\supset N$, then $M\prec N$ if and only if $N$ has finite codimension in $M$. Also $M\prec N$ whenever $M\subset N$. In particular $M\prec M$, so the relation $\prec$ is reflexive. As is easily verified, it is also transitive. If $T$ is a linear operator on $X$ and $M\prec N$, then $T[M]\prec T[N]$ too.

The subspaces $M$ and $N$ are said to be almost equal, written $M\asymp N$, if both $M\prec N$ and $N\prec M$. This is equivalent to requiring that the quotient space $(M+N)/(M\cap N)$ has finite dimension. Note that $\asymp$ is an equivalence relation. Hence the collection of all closed subspaces of $X$ is the disjoint union of the equivalence classes modulo $\asymp$. A nonempty subset of such an equivalence class will be called a cluster. An example of a cluster is a nonempty family of finite-dimensional subspaces of $X$. A nonempty family of closed finite codimensional subspaces of $X$ is a cluster as well.

We are now ready to present our next theorem. Its proof will illustrate that in Corollary 3.2 it is important to have condition (8) instead of one of the possibly more restrictive requirements (11) or (12).

Theorem 4.11. Let $\mathcal{M}_1,\dots,\mathcal{M}_n$ be an $n$-tuple of clusters of closed subspaces of the infinite-dimensional Banach space $X$. Suppose the $n$-tuple is almost nested in the sense that
\[ \bigcap_{M\in\mathcal{M}_k}M\ \prec\ \bigvee_{M\in\mathcal{M}_{k+1}}M,\qquad k=1,\dots,n-1. \tag{13} \]
Further assume that $\operatorname{codim}\bigvee_{M\in\mathcal{M}_1}M<\infty$ and $\dim\bigcap_{M\in\mathcal{M}_n}M<\infty$. Then, with $\mathcal{M}$ being the union of the clusters $\mathcal{M}_1\cup\dots\cup\mathcal{M}_n$, the Banach algebra $\mathcal{B}(X;\mathcal{M})$ is spectrally regular.

The Banach algebras of block triangular operators mentioned earlier all correspond to situations where the requirements in the theorem are trivially fulfilled. For details, see Theorem 4.12 and the comments concerning it at the end of this subsection.

Proof. Let $M$ and $N$ be closed subspaces of $X$, let $T$ be a bounded linear operator on $X$, and suppose $T[M]\subset M$ and $T[N]\subset N$. Then $T[M+N]\subset M+N$ and $T[M\cap N]\subset M\cap N$. Hence $T$ induces a bounded linear operator on the quotient


space $(M+N)/(M\cap N)$ which we will denote by $T_{M,N}$. Clearly $T_{M,N}$ is the zero operator on $(M+N)/(M\cap N)$ if and only if $T[M+N]\subset M\cap N$. Given the (assumed) inclusions $T[M]\subset M$ and $T[N]\subset N$, this comes down to $T[M]\subset N$ and $T[N]\subset M$. If $S$ is another bounded linear operator on $X$ leaving invariant $M$ and $N$, then $T+S$ leaves $M$ and $N$ invariant too and $(T+S)_{M,N}=T_{M,N}+S_{M,N}$. Similarly $(TS)_{M,N}=T_{M,N}S_{M,N}$ and $(\alpha T)_{M,N}=\alpha T_{M,N}$, $\alpha\in\mathbb{C}$.

Take $k\in\{1,\dots,n\}$ and $M,N\in\mathcal{M}_k$ with $M\neq N$. Then the quotient space $(M+N)/(M\cap N)$ has positive finite dimension. For $T\in\mathcal{B}(X;\mathcal{M})$, we have that $T$ leaves invariant $M$ and $N$, and we can put $\Phi_{k;M,N}(T)=T_{M,N}$. In this way we get a continuous (unital) homomorphism
\[ \Phi_{k;M,N}:\mathcal{B}(X;\mathcal{M})\to\mathcal{B}\Bigl(\frac{M+N}{M\cap N}\Bigr). \]
In the sequel it will be identified with a matrix representation on $\mathcal{B}(X;\mathcal{M})$.

Fix $k$ among the integers $1,\dots,n$, and consider $\{\Phi_{k;M,N}\}_{M,N\in\mathcal{M}_k,\,M\neq N}$. This is a family of matrix representations on $\mathcal{B}(X;\mathcal{M})$. We claim that
\[ T\in\bigcap_{M,N\in\mathcal{M}_k,\,M\neq N}\operatorname{Ker}\Phi_{k;M,N}\ \Rightarrow\ T\Bigl[\bigvee_{M\in\mathcal{M}_k}M\Bigr]\subset\bigcap_{M\in\mathcal{M}_k}M, \tag{14} \]
where the symbol $\bigvee$ signals the operation of taking the closed linear span. The argument is as follows. To obtain the inclusion in the right-hand side of (14), we need to show that $T[M]\subset N$ for all $M,N\in\mathcal{M}_k$. Take $T$ in the left-hand side of (14) and $M,N\in\mathcal{M}_k$. If $M=N$ we have $T[M]\subset M=N$ because $T\in\mathcal{B}(X;\mathcal{M})$. If $M\neq N$, we have $T_{M,N}=\Phi_{k;M,N}(T)=0$, and so $T[M+N]\subset M\cap N$, in particular $T[M]\subset N$.

For convenience, write $D_0=X$, $V_{n+1}=\{0\}$ and
\[ D_k=\bigcap_{M\in\mathcal{M}_k}M,\qquad V_k=\bigvee_{M\in\mathcal{M}_k}M,\qquad k=1,\dots,n. \]
Then, by the hypotheses in the theorem, $D_k\prec V_{k+1}$ for $k=0,\dots,n$. Hence
\[ T[D_k]\prec T[V_{k+1}],\qquad k=0,\dots,n, \tag{15} \]
where for $T$ one can take any linear operator on $X$.

Next consider $\{\Phi_{k;M,N}\}_{M,N\in\mathcal{M}_k,\,M\neq N;\ k=1,\dots,n}$. This again is a family of matrix representations on $\mathcal{B}(X;\mathcal{M})$. Take $T$ in $\bigcap_{M,N\in\mathcal{M}_k,\,M\neq N;\ k=1,\dots,n}\operatorname{Ker}\Phi_{k;M,N}$. Then we have from (14) that
\[ T[V_k]\subset D_k,\qquad k=1,\dots,n. \tag{16} \]
Combining (15) and (16), we get $T[D_k]\prec D_{k+1}$, $k=0,\dots,n-1$. But then, via (finite) induction, $T^k[D_0]\prec D_k$, $k=0,\dots,n$. In particular $T^n[D_0]\prec D_n$. As $D_0=X$ and $D_n\prec V_{n+1}=\{0\}$, it follows that $\operatorname{Im}T^n\prec\{0\}$. Thus $\operatorname{Im}T^n$ is finite dimensional, i.e., $T^n$ is a finite rank operator (hence compact). By standard Fredholm theory, we may conclude that $I_X-(-T)^n$ is a Fredholm operator, i.e., $\operatorname{Ker}(I_X-(-T)^n)$ is finite dimensional and $\operatorname{Im}(I_X-(-T)^n)$


has finite codimension in $X$. Now
\[ I_X-(-T)^n=(I_X+T)\Bigl(\sum_{k=0}^{n-1}(-T)^k\Bigr)=\Bigl(\sum_{k=0}^{n-1}(-T)^k\Bigr)(I_X+T), \]
therefore $\operatorname{Im}(I_X-(-T)^n)\subset\operatorname{Im}(I_X+T)$ and $\operatorname{Ker}(I_X+T)\subset\operatorname{Ker}(I_X-(-T)^n)$. So, along with $I_X-(-T)^n$, the operator $I_X+T$ is Fredholm. We conclude that
\[ \bigcap_{M,N\in\mathcal{M}_k,\,M\neq N;\ k=1,\dots,n}\operatorname{Ker}\Phi_{k;M,N}\subset\mathcal{F}(X)-\{I_X\}. \]
Corollary 3.2 now gives that $\mathcal{B}(X;\mathcal{M})$ is spectrally regular. □

Elaborating on the proof, we note that the family of matrix representations $\{\Phi_{k;M,N}\}_{M,N\in\mathcal{M}_k,\,M\neq N;\ k=1,\dots,n}$ is nonempty. Suppose it is not. Then all the clusters $\mathcal{M}_1,\dots,\mathcal{M}_n$ are singletons and we get
\[ X=D_0\prec V_1=D_1\prec V_2=D_2\prec\dots\prec V_n=D_n\prec V_{n+1}=\{0\}. \]
By transitivity this gives $X\prec\{0\}$, contradicting the infinite dimensionality of $X$.

There is another elucidating observation to make. Suppose the $n$-tuple of clusters in Theorem 4.11 is nested (instead of only almost nested) in the sense that the almost inclusions in (13) are in fact genuine inclusions. Then all the almost inclusions in the above proof are genuine inclusions too. This leads to the stronger conclusion that $T$ is nilpotent; in fact $T^n=0$.

We conclude this subsection by coming back to Theorem 4.11 for the case $n=1$. For that situation, the theorem reads as follows.

Theorem 4.12. Let $\mathcal{M}$ be a cluster of closed subspaces of the infinite-dimensional Banach space $X$. Assume
\[ \operatorname{codim}\bigvee_{M\in\mathcal{M}}M<\infty,\qquad \dim\bigcap_{M\in\mathcal{M}}M<\infty. \tag{17} \]
Then $\mathcal{B}(X;\mathcal{M})$ is spectrally regular.

Theorem 4.12 can be used to deal with the Banach algebras of triangular operators mentioned in the third paragraph of this subsection. Here are the details.

(a) Let $\mathcal{M}$ be a nonempty family of finite-dimensional subspaces of the infinite-dimensional Banach space $X$. Then $\mathcal{M}$ is a cluster and it is clear that the second part of (17) is satisfied. If the first part of (17) is fulfilled too, we may conclude that $\mathcal{B}(X;\mathcal{M})$ is spectrally regular. This covers the Banach subalgebra of $\mathcal{B}(\ell_2(\mathbb{N}))$ consisting of block upper triangular operators where, for the appropriate choice of $\mathcal{M}$, the first part of (17) even amounts to $\bigvee_{M\in\mathcal{M}}M=\ell_2(\mathbb{N})$.

(b) Let $\mathcal{M}$ be a nonempty family of finite codimensional subspaces of the infinite-dimensional Banach space $X$. Then $\mathcal{M}$ is a cluster and it is clear that the first part of (17) is satisfied. If the second part of (17) is fulfilled too, we may conclude that $\mathcal{B}(X;\mathcal{M})$ is spectrally regular. This covers the Banach subalgebra of $\mathcal{B}(\ell_2(\mathbb{N}))$ consisting of block lower triangular operators where,


for the appropriate choice of $\mathcal{M}$, the second part of (17) even boils down to $\bigcap_{M\in\mathcal{M}}M=\{0\}$.

(c) Theorem 4.12 also covers the Banach subalgebra of $\mathcal{B}(\ell_2(\mathbb{Z}))$ consisting of block upper (or, alternatively, lower) triangular operators. For the appropriate choice of $\mathcal{M}$, the first part of (17) boils down to $\bigvee_{M\in\mathcal{M}}M=\ell_2(\mathbb{Z})$ and the second to $\bigcap_{M\in\mathcal{M}}M=\{0\}$.

In (a), (b) and (c), block triangularity is taken with respect to a given orthonormal basis in, respectively, $\ell_2(\mathbb{N})$, $\ell_2(\mathbb{N})$, and $\ell_2(\mathbb{Z})$. The blocks are allowed to be of variable (but finite) size.

4.5. Algebras of Toeplitz and singular integral operators

We start with the following immediate consequence of Corollary 3.2.

Corollary 4.13. Let $X$ be an infinite-dimensional Banach space, and let $\mathcal{B}$ be a Banach subalgebra of $\mathcal{B}(X)$. If the quotient Banach algebra $\mathcal{B}/(\mathcal{K}(X)\cap\mathcal{B})$ is spectrally regular, then so is $\mathcal{B}$.

One may replace $\mathcal{K}(X)$ by the generally larger ideal of the strictly singular operators on $X$; see the corresponding remark in Section 3.

Proof. Consider the singleton family $\{\kappa\}$, where $\kappa:\mathcal{B}\to\mathcal{B}/(\mathcal{K}(X)\cap\mathcal{B})$ is the canonical mapping, and apply Corollary 3.2. □

As a special case of Corollary 4.13, we have the following result. Let $X$ be an infinite-dimensional Banach space, and let $\mathcal{B}$ be a Banach subalgebra of $\mathcal{B}(X)$. Suppose the ideal $\mathcal{K}(X)$ of the compact operators on $X$ is contained in $\mathcal{B}$. Then $\mathcal{B}$ is spectrally regular provided the quotient $\mathcal{B}/\mathcal{K}(X)$ is.

This means that in cases where $\mathcal{K}(X)\subset\mathcal{B}$ and $\mathcal{B}/\mathcal{K}(X)$ is a polynomial identity algebra or, more generally, $\mathcal{B}/\mathcal{K}(X)$ possesses a sufficient family of matrix representations, one can conclude that $\mathcal{B}$ is spectrally regular. There is an abundance of such situations, especially in the theory of singular integral operators and Toeplitz operators: see, for instance, the books [BK], [BS], [Cor], [GGK2], [GK1], [GK2], [Kr], and the paper [GK3].

As a characteristic illustration, we consider the unital $C^*$-algebras generated by block Toeplitz operators appearing in [GGK2], Sections XXXII.2 and XXXII.4. Depending on the continuity requirements imposed on the so-called defining (or generating) function, the algebras in question are denoted there by $\mathcal{T}_m(C)$ and $\mathcal{T}_m(PC)$. In fact, $\mathcal{T}_m(C)$ and $\mathcal{T}_m(PC)$ are, respectively, the smallest closed subalgebra of $\mathcal{B}(\ell_2^m)$ containing all block Toeplitz operators for which the defining function is a continuous, respectively, a piecewise continuous, $\mathbb{C}^{m\times m}$-valued function.

Theorem 4.14. The $C^*$-algebras $\mathcal{T}_m(C)$ and $\mathcal{T}_m(PC)$ are spectrally regular.

Proof. Let $\mathcal{T}$ be one of the Banach algebras mentioned above. Then $\mathcal{T}$ is a Banach subalgebra of $\mathcal{B}(\ell_2^m)$ where $\ell_2^m$ stands for the Hilbert space of square summable sequences with entries in $\mathbb{C}^m$. We now make use of the material presented in [GGK2], Chapter XXXII, in particular Theorems 2.1 and 4.2. The first thing


to mention is that $\mathcal{T}$ contains the ideal $\mathcal{K}=\mathcal{K}(\ell_2^m)$ of the compact operators on $\ell_2^m$. The second is that $\mathcal{T}/\mathcal{K}$ can be identified with a Banach algebra of the type $\mathcal{C}(T,\mathbb{C}^{m\times m})$ where $T$ is an appropriately chosen compact topological space. This Banach algebra is spectrally regular, a conclusion which has been drawn in Section 4.2 from Corollaries 4.1 and 4.7. Along with $\mathcal{C}(T,\mathbb{C}^{m\times m})$, the quotient algebra $\mathcal{T}/\mathcal{K}$ is spectrally regular too. The spectral regularity of $\mathcal{T}$ now follows by applying Corollary 4.13. □

We add to the above argument that it is also easy to see that the algebra $\mathcal{C}(T,\mathbb{C}^{m\times m})$ is PI. Indeed, as the operations in $\mathcal{C}(T,\mathbb{C}^{m\times m})$ are defined pointwise, an annihilating polynomial for $\mathbb{C}^{m\times m}$ is one for $\mathcal{C}(T,\mathbb{C}^{m\times m})$ too. The property of being PI carries over to $\mathcal{T}/\mathcal{K}$.

Now let us specialize to the case $m=1$ and consider $\mathcal{T}(C)=\mathcal{T}_1(C)$, the algebra generated by the Toeplitz operators on $\ell_2(\mathbb{N})$ with continuous generating function. By a result of Coburn [Cob], the $C^*$-algebra $\mathcal{T}(C)$ is $*$-isomorphic to the so-called universal algebra generated by one nonunitary isometry. Hence this universal algebra, which can occur in many different appearances, is spectrally regular (see [RR] and [GF]). For a further analysis, see the forthcoming paper [BES8] where related algebras are considered too.

Toeplitz algebras can also be considered in the context of the spaces $\ell_p(\mathbb{Z}_+)$, $L_p([0,\infty))$ and $H_p(\mathbb{T})$; see [BS]. Corollary 4.13 is then applicable too. Indeed, factoring out the compacts gives again a spectrally regular Banach algebra, in fact one that has a sufficient family of matrix representations of finite order. Recall that this does not automatically give that the quotient algebra is PI; it does when the quotient algebra is semisimple.
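The remark that an annihilating polynomial for $\mathbb{C}^{m\times m}$ also annihilates $\mathcal{C}(T,\mathbb{C}^{m\times m})$ can be made concrete for $m=2$. The sketch below assumes the Amitsur–Levitzki theorem (cf. [AL]), by which the standard polynomial of degree $2m$ vanishes on $m\times m$ matrices, and models an element of $\mathcal{C}(T,\mathbb{C}^{2\times 2})$ by its values at finitely many sample points with integer entries. The function names and the sampling are illustrative assumptions, not constructions from the paper.

```python
from itertools import permutations
import random

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sign(perm):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def standard_polynomial(mats):
    """Sum over all permutations s of sign(s) * x_s(1) * ... * x_s(k)."""
    size = len(mats[0])
    total = [[0] * size for _ in range(size)]
    for perm in permutations(range(len(mats))):
        prod = mats[perm[0]]
        for idx in perm[1:]:
            prod = matmul(prod, mats[idx])
        s = sign(perm)
        for i in range(size):
            for j in range(size):
                total[i][j] += s * prod[i][j]
    return total

rng = random.Random(1)

def random_matrix(size):
    return [[rng.randint(-9, 9) for _ in range(size)] for _ in range(size)]

# Four "functions" T -> C^{2x2}, each sampled at five points of T.
num_points = 5
functions = [[random_matrix(2) for _ in range(num_points)] for _ in range(4)]

# Operations are pointwise, so the polynomial is evaluated point by point.
values = [standard_polynomial([f[t] for f in functions])
          for t in range(num_points)]
vanishes_pointwise = all(v == [[0, 0], [0, 0]] for v in values)

# The degree 2 standard polynomial (the commutator) is not an identity.
commutator = standard_polynomial([[[0, 1], [0, 0]], [[0, 0], [1, 0]]])
print(vanishes_pointwise, commutator)
```

Because the entries are integers, the vanishing is checked exactly rather than up to rounding. The commutator example indicates that a lower degree standard polynomial need not annihilate $\mathbb{C}^{2\times 2}$.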

5. Concluding remarks

In the above, we encountered families of Banach algebra homomorphisms having certain properties pertinent to the topic of this paper. Certain relationships between these properties are obvious; others, somewhat less trivial, have been established in Section 3. Restricting ourselves (in order to keep things tractable) to considering matrix representations only, the situation is as depicted in the following scheme
\[
\begin{array}{ccccccccc}
\text{PI} & \Rightarrow & \text{sufficient, finite order} & \Rightarrow & \text{sufficient} & \Rightarrow & \text{p.w. sufficient} & \Rightarrow & \text{radical-separating} \\
 & & & & & & & & \Uparrow \\
 & & & & & & & & \text{separating}
\end{array}
\]
(where the third implication from the left has to be understood as being true modulo an appropriate renorming of the test algebras). Clearly, the overarching notion is that of a radical-separating family. Now the question arises, is it overarching in the strict sense? Or, in more precise terms, what about the converses of the implications in the above scheme?


This issue is addressed in [BES7]. Here we only mention that one of the main results presented there is that the converse of the implication
\[ \text{sufficient}\ \Rightarrow\ \text{radical-separating} \]
(implicit in the above scheme) is not valid. In fact, not even the implication
\[ \text{separating}\ \Rightarrow\ \text{sufficient} \]
holds. On the level of individual families, this has already been noted before: see the paragraph directly following the proof of Theorem 4.6. However, it is even true in the much stronger sense that a Banach algebra may possess a family of unital matrix representations which is separating and p.w. sufficient while it fails to have any sufficient one. An example is the $C^*$-algebra $\ell_\infty^{\mathbf{M}}$, with $\mathbf{M}=\{\mathbb{C}^{n\times n}\}_{n\in\mathbb{N}}$, featuring in the last paragraph of Subsection 4.2. For the proof of the fact that $\ell_\infty^{\mathbf{M}}$ does not possess any sufficient family of matrix representations, one needs some 'grasp' on the collection of all unital matrix representations of the Banach algebra $\ell_\infty^{\mathbf{M}}$, a highly nontrivial matter (cf. the situation for the relatively simple Banach algebra $\ell_\infty$).

One final remark. As was mentioned in Section 2, spectral regularity is a necessary condition for a Banach algebra to have a sufficient family of matrix representations. The $C^*$-algebra $\ell_\infty^{\mathbf{M}}$ illustrates that it is not a sufficient condition. So additional requirements are needed to characterize the Banach algebras possessing a sufficient family of matrix representations (not necessarily of finite order), an issue posed as Problem 12 in Section 29 of [Kr].

Acknowledgement

The second author (T.E.) was supported in part by NSF grant DMS-0901434.

References

[AA] Y.A. Abramovich, C.D. Aliprantis, An Invitation to Operator Theory, Graduate Studies in Mathematics, Vol. 50, American Mathematical Society, Providence, Rhode Island 2002.

[AL] S.A. Amitsur, J. Levitzky, Minimal identities for algebras, Proc. Amer. Math. Soc. 1 (1950), 449–463.

[Bar] H. Bart, Spectral properties of locally holomorphic vector-valued functions, Pacific J. Math. 52 (1974), 321–329.

[BES1] H. Bart, T. Ehrhardt, B. Silbermann, Zero sums of idempotents in Banach algebras, Integral Equations and Operator Theory 19 (1994), 125–134.

[BES2] H. Bart, T. Ehrhardt, B. Silbermann, Logarithmic residues in Banach algebras, Integral Equations and Operator Theory 19 (1994), 135–152.

[BES3] H. Bart, T. Ehrhardt, B. Silbermann, Logarithmic residues of Fredholm operator-valued functions and sums of finite rank projections, In: Operator Theory: Advances and Applications, Vol. 130, Birkhäuser Verlag, Basel 2001, pp. 83–106.

[BES4] H. Bart, T. Ehrhardt, B. Silbermann, Logarithmic residues in the Banach algebra generated by the compact operators and the identity, Mathematische Nachrichten 268 (2004), 3–30.

[BES5] H. Bart, T. Ehrhardt, B. Silbermann, Vector-Valued Logarithmic Residues and the Extraction of Elementary Factors, Econometric Institute Erasmus University Rotterdam, Report nr. EI 2007-31, 2007.

[BES6] H. Bart, T. Ehrhardt, B. Silbermann, Trace conditions for regular spectral behavior of vector-valued analytic functions, Linear Algebra Appl. 430 (2009), 1945–1965.

[BES7] H. Bart, T. Ehrhardt, B. Silbermann, Families of homomorphisms in non-commutative Gelfand theory: comparisons and counterexamples, accepted for publication in the IWOTA 2010 Proceedings. In: W. Arendt, J.A. Ball, J. Behrndt, K.-H. Förster, V. Mehrmann, C. Trunk (eds.): Recent Advances in Operator Theory, Oper. Theory Adv. Appl. OT 221, Birkhäuser, Springer Basel AG, 2012.

[BES8] H. Bart, T. Ehrhardt, B. Silbermann, Logarithmic Residues, Rouché's Theorem, Spectral Regularity, and Zero Sums of Idempotents: the $C^*$-algebra Case, forthcoming.

[BK] A. Böttcher, Yu. Karlovich, Carleson Curves, Muckenhaupt Weights, and Toeplitz Operators, Progress in Mathematics, Vol. 154, Birkhäuser Verlag, Basel 1997.

[BS] A. Böttcher, B. Silbermann, Analysis of Toeplitz Operators, Springer Verlag, Berlin 1990.

[Cob] L.A. Coburn, The $C^*$-algebra generated by an isometry, Bull. Amer. Math. Soc. 73 (1967), 722–726.

[Cor] H.O. Cordes, Elliptic Pseudodifferential Operators – An Abstract Theory, Lecture Notes in Mathematics, Springer Verlag, Berlin 1995.

[CR] C.W. Curtis, I. Reiner, Methods of representation theory, Vol. I. With applications to finite groups and orders, Wiley Classics Library, John Wiley and Sons, New York 1990.

[E] T. Ehrhardt, Finite sums of idempotents and logarithmic residues on connected domains, Integral Equations and Operator Theory 21 (1995), 238–242.

[Go] S. Goldberg, Unbounded Linear Operators, McGraw-Hill, New York 1966.

[GF] I. Gohberg, I.A. Feldman, Convolution Operators and Projection Methods for Their Solution, Translations of Mathematical Monographs, Vol. 41, Amer. Math. Soc., Providence, Rhode Island 1974.

[GGK1] I. Gohberg, S. Goldberg, M.A. Kaashoek, Classes of Linear Operators, Vol. I, Operator Theory: Advances and Applications, Vol. 49, Birkhäuser Verlag, Basel 1990.

[GGK2] I. Gohberg, S. Goldberg, M.A. Kaashoek, Classes of Linear Operators, Vol. II, Operator Theory: Advances and Applications, Vol. 63, Birkhäuser Verlag, Basel 1993.

[GK1] I.C. Gohberg, N.Ya. Krupnik, One-Dimensional Linear Singular Integral Equations, Vol. 1, Operator Theory: Advances and Applications, Vol. 53, Birkhäuser Verlag, Basel 1992.

[GK2] I.C. Gohberg, N.Ya. Krupnik, One-Dimensional Linear Singular Integral Equations, Vol. 2, Operator Theory: Advances and Applications, Vol. 54, Birkhäuser Verlag, Basel 1992.

[GK3] I.C. Gohberg, N.Ya. Krupnik, On an algebra generated by Toeplitz matrices, Funk. Anal. i Priloz 3 (1969), 46–56 (Russian); English Transl., Funct. Anal. Appl. 3 (1969), 119–127.

[GL] I. Gohberg, J. Leiterer, Holomorphic Operator Functions of One Variable and Applications, Operator Theory: Advances and Applications, Vol. 192, Birkhäuser Verlag, Basel 2009.

[GS] I.C. Gohberg, E.I. Sigal, An operator generalization of the logarithmic residue theorem and the theorem of Rouché, Mat. Sbornik 84 (126) (1971), 607–629 (Russian); English Transl., Math. USSR Sbornik 13 (1971), 603–625.

[Gra] B. Gramsch, Meromorphie in der Theorie der Fredholmoperatoren mit Anwendungen auf elliptische Differentialoperatoren, Math. Ann. 188 (1970), 97–112.

[Gro] A. Grothendieck, Produits tensoriels topologiques et espaces nucléaires, Mem. Amer. Math. Soc., No. 16, American Mathematical Society, Providence, Rhode Island 1955 [French].

[HRS] R. Hagen, S. Roch, B. Silbermann, Spectral Theory of Approximation Methods for Convolution Equations, Operator Theory: Advances and Applications, Vol. 74, Birkhäuser Verlag, Basel 1995.

[Ka] W. Kaballo, Lifting-Sätze für Vektorfunktionen und das $\varepsilon$-Tensorprodukt, Habilitationsschrift, Kaiserslautern 1976.

[Kr] N.Ya. Krupnik, Banach Algebras with Symbol and Singular Integral Operators, Operator Theory: Advances and Applications, Vol. 26, Birkhäuser Verlag, Basel 1987.

[LR] V. Lomonosov, P. Rosenthal, The simplest proof of Burnside's Theorem on matrix algebras, Linear Algebra Appl. 383 (2004), 45–47.

[N] M.A. Naimark, Normed Rings, Wolters-Noordhoff, Groningen 1970.

[RRS] R. Rabinovich, S. Roch, B. Silbermann, Limit Operators and their Applications in Operator Theory, Operator Theory: Advances and Applications, Vol. 150, Birkhäuser Verlag, Basel 2004.

[P] T.W. Palmer, Banach Algebras and The General Theory of *-Algebras, Volume I: Algebras and Banach Algebras, Cambridge University Press, Cambridge 1994.

[PS] C. Procesi, L. Small, Endomorphism rings of modules over PI-algebras, Math. Z. 106 (1968), 178–180.

[PT] C. Pearcy, D. Topping, Sums of small numbers of idempotents, Michigan Math. J. 14 (1967), 453–465.

[RR] M. Rosenblum, J. Rovnyak, Hardy Classes and Operator Theory, The Clarendon Press, Oxford University Press, New York 1985.

[RSS] S. Roch, P.A. Santos, B. Silbermann, Non-commutative Gelfand Theories, Springer Verlag, London, Dordrecht, Heidelberg, New York 2011.

[Si] B. Silbermann, Symbol constructions and numerical analysis, In: Integral Equations and Inverse Problems (R. Lazarov, V. Petkov, eds.), Pitman Research Notes in Mathematics, Vol. 235, 1991, 241–252.

[TL] A.E. Taylor, D.C. Lay, Introduction to Functional Analysis, Second Edition, John Wiley and Sons, New York 1980.

[ZKKP] M.G. Zaĭdenberg, S.G. Kreĭn, P.A. Kučment, A.A. Pankov, Banach bundles and linear operators, Uspehi Mat. Nauk 30 no. 5(185) (1975), 101–157 [Russian]; English Transl., Russian Math. Surveys 30 (1975), no. 5, 115–175.

Harm Bart
Econometric Institute
Erasmus University Rotterdam
P.O. Box 1738
NL-3000 DR Rotterdam, The Netherlands

Torsten Ehrhardt
Mathematics Department
University of California
Santa Cruz, CA 95064, USA

Bernd Silbermann
Fakultät für Mathematik
Technische Universität Chemnitz
D-09107 Chemnitz, Germany

Operator Theory: Advances and Applications, Vol. 218, 155–175
© 2012 Springer Basel AG

Banach Algebras of Commuting Toeplitz Operators on the Unit Ball via the Quasi-hyperbolic Group

Wolfram Bauer and Nikolai Vasilevski

To the memory of Professor I. Gohberg, a great mathematician and personality

Abstract. We continue the study of commutative algebras generated by Toeplitz operators acting on the weighted Bergman spaces over the unit ball $\mathbb{B}^n$ in $\mathbb{C}^n$. As was observed recently, apart from the already known commutative Toeplitz $C^*$-algebras, quite unexpectedly, there exist many other, not geometrically defined, classes of symbols which generate commutative Toeplitz operator algebras on each weighted Bergman space. These classes of symbols were in a sense subordinated to the quasi-elliptic and quasi-parabolic groups of biholomorphisms of the unit ball. The corresponding commutative operator algebras were Banach, and being extended to $C^*$-algebras they became non-commutative. We consider here the case of symbols subordinated to the quasi-hyperbolic group and show that such classes of symbols are likewise sources for commutative Banach algebras generated by Toeplitz operators. That is, together with the results of [11, 12], we cover the multidimensional extensions of all three model cases on the unit disk.

Mathematics Subject Classification (2000). Primary 47B35; Secondary 47L80, 32A36.

Keywords. Toeplitz operator, weighted Bergman space, unit ball, commutative Banach algebra, quasi-hyperbolic group.

The first named author has been supported by an "Emmy-Noether scholarship" of DFG (Deutsche Forschungsgemeinschaft). The second named author has been partially supported by CONACYT Project 102800, México.


1. Introduction

In this paper we continue the study of commutative algebras generated by Toeplitz operators acting on the weighted Bergman spaces over the unit ball $\mathbb{B}^n$ in $\mathbb{C}^n$. The case of commutative $C^*$-algebras was considered in [8], whose main result states that if the symbols of the generating Toeplitz operators are invariant under the action of a maximal commutative subgroup of biholomorphisms of the unit ball, then the corresponding $C^*$ operator algebra is commutative on each commonly considered weighted Bergman space. There are five different pairwise non-conjugate model classes of such subgroups: quasi-elliptic, quasi-parabolic, quasi-hyperbolic, nilpotent, and quasi-nilpotent (the last one depends on a parameter, giving in total $n+2$ model classes for the $n$-dimensional unit ball).

In the case of the unit disk ($n=1$), the above result is exact in the sense that (see [5] for details), under some technical assumption on the "richness" of the symbol classes, a $C^*$-algebra generated by Toeplitz operators is commutative on each weighted Bergman space if and only if the symbols of the generating Toeplitz operators are invariant under the action of a maximal commutative subgroup of the Möbius transformations of the unit disk.

It was firmly expected that the multidimensional case preserves the regularities of the one-dimensional situation. That is, the invariance under the action of a maximal commutative subgroup of biholomorphisms for generating symbols is the only reason for the appearance of Toeplitz operator algebras which are commutative on each weighted Bergman space.

At the same time, quite unexpectedly it was observed in [12] that for $n>1$ there are many other, not geometrically defined, classes of symbols which generate commutative Toeplitz operator algebras on each weighted Bergman space. These classes of symbols were in a sense originated from, or subordinated to, the quasi-elliptic group; the corresponding commutative operator algebras were Banach, and being extended to $C^*$-algebras they became non-commutative. Moreover, for $n=1$ all of them collapsed to the commutative $C^*$-algebra generated by Toeplitz operators with radial symbols (one-dimensional quasi-elliptic case).

It was then shown in [11] that the classes of symbols subordinated to the quasi-parabolic group likewise generate, via the corresponding Toeplitz operators, Banach algebras which are commutative on each weighted Bergman space. Again, being extended to $C^*$-algebras they became non-commutative, and for $n=2$ such algebras collapse to the single $C^*$-algebra generated by Toeplitz operators with quasi-parabolic symbols.

In this paper we consider the case of symbols subordinated to the quasi-hyperbolic group and show that such classes of symbols are the sources for Banach algebras generated by Toeplitz operators which again are commutative on each weighted Bergman space. That is, together with [11, 12], we cover the multidimensional extensions of the (only) three model cases on the unit disk.

The study of the last two model cases of maximal commutative subgroups of biholomorphisms of the unit ball, the nilpotent and quasi-nilpotent groups (which appear only for $n>1$ and $n>2$, respectively), still remains an important and interesting open question.


We mention as well that the commutativity properties of Toeplitz operators were studied in different settings, for example, in [1, 2, 3, 4, 7].

The paper is organized as follows. In Sections 2 and 3 we recall the notion of weighted Bergman spaces over the unit ball $\mathbb{B}^n$ in $\mathbb{C}^n$ and its unbounded realization as the Siegel domain $D_n$. Via an explicitly given diffeomorphism from $D_n$ onto a half-space $\mathcal{D}$ we identify the weighted Bergman space with a closed subspace of $L_2(\mathcal{D},\eta_\lambda)$ where $\eta_\lambda$ is an induced measure depending on the weight parameter $\lambda>-1$. In Sections 4, 5 and 6 we introduce polar type coordinates on $\mathcal{D}$ and we explain the notion of quasi-hyperbolic symbols. Then an important result in [8] (cf. Theorem 6.1 of the present paper) establishes a unitary equivalence between Toeplitz operators acting on the weighted Bergman space over $D_n$ and certain explicitly given multiplication operators.

Sections 7 and 8 provide the notion of hyperbolic $k$-quasi-radial and hyperbolic $k$-quasi-homogeneous symbols. Theorem 8.2, roughly speaking, states that conjugation with the unitary operator of Theorem 6.1 transforms a Toeplitz operator having a hyperbolic $k$-quasi-homogeneous symbol into the product of a shift and a multiplication operator.

In Section 9 we extend the results in [11, 12] to the case of Toeplitz operators with hyperbolic $k$-quasi-homogeneous symbols. In particular, we show that the Banach algebras generated by Toeplitz operators with the above hyperbolic $k$-quasi-homogeneous symbols are commutative on each weighted Bergman space. A short appendix complements the text.

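2. The domains $\mathbb{B}^n$, $D_n$ and $\mathcal{D}$

Let
\[ \mathbb{B}^n := \{ z = (z_1, \dots, z_n) \in \mathbb{C}^n : |z|^2 = |z_1|^2 + \dots + |z_n|^2 < 1 \} \]
be the unit ball in $\mathbb{C}^n$. For points of $\mathbb{C}^n = \mathbb{C}^{n-1} \times \mathbb{C}$ we use the notation:
\[ z = (z', z_n), \quad \text{where } z' = (z_1, \dots, z_{n-1}) \in \mathbb{C}^{n-1}, \ z_n \in \mathbb{C}. \]
By $D_n$ we denote the Siegel domain in $\mathbb{C}^n$:
\[ D_n := \{ z = (z', z_n) \in \mathbb{C}^{n-1} \times \mathbb{C} : \operatorname{Im} z_n - |z'|^2 > 0 \}. \]
Recall that the Cayley transform $\omega : \mathbb{B}^n \to D_n$ is given by:
\[ \omega(z) = i \Big( \frac{z_1}{1 + z_n}, \dots, \frac{z_{n-1}}{1 + z_n}, \frac{1 - z_n}{1 + z_n} \Big) = (\zeta_1, \dots, \zeta_{n-1}, \zeta_n) = \zeta. \]
The following result is well known:

Lemma 2.1. The Cayley transform biholomorphically maps the unit ball $\mathbb{B}^n$ onto the Siegel domain $D_n$. The inverse transform $\omega^{-1} : D_n \to \mathbb{B}^n$ is given by:
\[ \omega^{-1}(\zeta) = \Big( -\frac{2i\zeta_1}{1 - i\zeta_n}, \dots, -\frac{2i\zeta_{n-1}}{1 - i\zeta_n}, \frac{1 + i\zeta_n}{1 - i\zeta_n} \Big). \]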
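The two formulas of Lemma 2.1 can be checked numerically. The following is only a sanity-check sketch, not part of the paper: the function names and the sampling scheme are illustrative assumptions, and the Cayley transform is taken with the common denominator $1 + z_n$ in every component, which is what the stated inverse requires.

```python
import random

def cayley(z):
    """Cayley transform omega: unit ball B^n -> Siegel domain D_n."""
    zn = z[-1]
    head = [1j * zk / (1 + zn) for zk in z[:-1]]
    return head + [1j * (1 - zn) / (1 + zn)]

def cayley_inverse(zeta):
    """Inverse transform omega^{-1}: D_n -> B^n, as stated in Lemma 2.1."""
    zn = zeta[-1]
    head = [-2j * zk / (1 - 1j * zn) for zk in zeta[:-1]]
    return head + [(1 + 1j * zn) / (1 - 1j * zn)]

def in_siegel_domain(zeta):
    """Membership test: Im(zeta_n) - |zeta'|^2 > 0."""
    return zeta[-1].imag - sum(abs(w) ** 2 for w in zeta[:-1]) > 0

def random_ball_point(n, rng):
    """Sample a point of B^n (hypothetical helper): rescale a random vector."""
    v = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    norm = sum(abs(w) ** 2 for w in v) ** 0.5
    radius = rng.uniform(0.0, 0.95)
    return [radius * w / norm for w in v]

rng = random.Random(0)
points = [random_ball_point(3, rng) for _ in range(100)]

# Largest componentwise deviation of omega^{-1}(omega(z)) from z.
max_roundtrip_error = max(
    abs(a - b)
    for z in points
    for a, b in zip(z, cayley_inverse(cayley(z)))
)
# Every image point should satisfy the defining inequality of D_n.
all_in_domain = all(in_siegel_domain(cayley(z)) for z in points)
print(max_roundtrip_error, all_in_domain)
```

The round trip is tested only on sampled points with $|z| \le 0.95$, so it illustrates the lemma rather than proving it.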


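W. Bauer and N. Vasilevski

Consider the domain $\mathcal{D} := \mathbb{C}^{n-1} \times \mathbb{R} \times \mathbb{R}_+$. Then the mapping:
\[ \kappa : (z', u, v) \in \mathcal{D} \longmapsto (z', u + iv + i|z'|^2) \in D_n \tag{2.1} \]
defines a diffeomorphism between $\mathcal{D}$ and $D_n$. Note that $\operatorname{Im}(u + iv + i|z'|^2) = v + |z'|^2 > |z'|^2$ in the case of $v \in \mathbb{R}_+$. The inverse map $\kappa^{-1} : D_n \to \mathcal{D}$ is given by:
\[ \kappa^{-1}(z', z_n) = \big( z', \operatorname{Re} z_n, \operatorname{Im} z_n - |z'|^2 \big). \]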

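3. Weighted Bergman spaces over $\mathbb{B}^n$, $D_n$, and $\mathcal{D}$

Let $v$ be the standard Lebesgue measure on $\mathbb{C}^n \cong \mathbb{R}^{2n}$. We write $z_k = x_k + i y_k$ for $k = 1, \dots, n$. On the ball $\mathbb{B}^n$ and for $\lambda > -1$ we consider the normalized weighted measure:
\[ d\mu_\lambda := c_\lambda (1 - |z|^2)^\lambda \, dv, \qquad c_\lambda := \frac{\Gamma(n + \lambda + 1)}{\pi^n \, \Gamma(\lambda + 1)}. \]
Let $f$ be a function on $D_n$, then integrals transform as follows:
\[ \int_{\mathbb{B}^n} f \circ \omega(z) \, dv(z) = 2^{2n} \int_{D_n} \frac{f(\zeta)}{|1 - i\zeta_n|^{2n+2}} \, dv(\zeta). \]
In particular, with $f \in L_2(\mathbb{B}^n, \mu_\lambda)$ we have:
\begin{align*}
\|f\|^2 &= c_\lambda \int_{\mathbb{B}^n} |f(z)|^2 (1 - |z|^2)^\lambda \, dv(z) \\
&= 2^{2n} c_\lambda \int_{D_n} |f \circ \omega^{-1}(\zeta)|^2 \, \frac{(1 - |\omega^{-1}(\zeta)|^2)^\lambda}{|1 - i\zeta_n|^{2n+2}} \, dv(\zeta) \\
&= 2^{2n+2\lambda} c_\lambda \int_{D_n} |f \circ \omega^{-1}(\zeta)|^2 \, \frac{\big( \operatorname{Im} \zeta_n - |\zeta'|^2 \big)^\lambda}{|1 - i\zeta_n|^{2n+2\lambda+2}} \, dv(\zeta). \tag{3.1}
\end{align*}
The last equality uses the identity $1 - |\omega^{-1}(\zeta)|^2 = 4 \big( \operatorname{Im} \zeta_n - |\zeta'|^2 \big) / |1 - i\zeta_n|^2$.

We introduce the space $L_2(D_n, \widetilde{\mu}_\lambda)$, where the weight with respect to the Lebesgue measure is given by
\[ \widetilde{\mu}_\lambda(\zeta) = \frac{c_\lambda}{4} \big( \operatorname{Im} \zeta_n - |\zeta'|^2 \big)^\lambda. \]
From (3.1) we conclude:

Corollary 3.1. The operator $\mathcal{U}_\lambda : L_2(\mathbb{B}^n, \mu_\lambda) \to L_2(D_n, \widetilde{\mu}_\lambda)$ defined by:
\[ \big( \mathcal{U}_\lambda f \big)(\zeta) := \Big( \frac{2}{1 - i\zeta_n} \Big)^{n+\lambda+1} f \circ \omega^{-1}(\zeta) \]
gives a unitary transformation of Hilbert spaces. Its inverse has the form:
\[ \big( \mathcal{U}_\lambda^{-1} f \big)(z) = \frac{1}{(1 + z_n)^{n+\lambda+1}} \, f \circ \omega(z). \]
Proof. It is clear that $\mathcal{U}_\lambda$ is an isometry. The second assertion follows from:
\[ \frac{1}{1 + [\omega^{-1}(\zeta)]_n} = \frac{1}{1 + \frac{1 + i\zeta_n}{1 - i\zeta_n}} = \frac{1}{2} (1 - i\zeta_n). \]
□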
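The passage from the second to the third line of (3.1) rests on the identity $1 - |\omega^{-1}(\zeta)|^2 = 4(\operatorname{Im}\zeta_n - |\zeta'|^2)/|1 - i\zeta_n|^2$, which also explains the factor $2^{2\lambda}$. A minimal numerical sketch of that identity follows; the helper names and the way points of $D_n$ are sampled are assumptions made for illustration only.

```python
import random

def cayley_inverse(zeta):
    """Inverse Cayley transform omega^{-1}: D_n -> B^n from Lemma 2.1."""
    zn = zeta[-1]
    head = [-2j * zk / (1 - 1j * zn) for zk in zeta[:-1]]
    return head + [(1 + 1j * zn) / (1 - 1j * zn)]

def random_siegel_point(n, rng):
    """Sample a point of D_n (hypothetical helper): pick zeta', u and v > 0."""
    head = [complex(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(n - 1)]
    u = rng.uniform(-2, 2)
    v = rng.uniform(0.1, 3.0)
    return head + [complex(u, v + sum(abs(w) ** 2 for w in head))]

def weight_identity_gap(zeta):
    """|lhs - rhs| for 1 - |omega^{-1}(zeta)|^2 = 4 (Im zeta_n - |zeta'|^2) / |1 - i zeta_n|^2."""
    z = cayley_inverse(zeta)
    lhs = 1 - sum(abs(w) ** 2 for w in z)
    zn = zeta[-1]
    rhs = 4 * (zn.imag - sum(abs(w) ** 2 for w in zeta[:-1])) / abs(1 - 1j * zn) ** 2
    return abs(lhs - rhs)

rng = random.Random(1)
max_gap = max(weight_identity_gap(random_siegel_point(3, rng)) for _ in range(200))
print(max_gap)
```

Raising both sides to the power $\lambda$ gives the factor $4^\lambda = 2^{2\lambda}$ and the extra $|1 - i\zeta_n|^{-2\lambda}$ in the last line of (3.1).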


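Definition 3.2. With $\lambda > -1$ we write $\mathcal{A}^2_\lambda(\mathbb{B}^n)$ and $\mathcal{A}^2_\lambda(D_n)$ for the weighted Bergman spaces of all analytic functions in $L_2(\mathbb{B}^n, \mu_\lambda)$ and $L_2(D_n, \widetilde{\mu}_\lambda)$, respectively.

The restriction of $\mathcal{U}_\lambda$ to $\mathcal{A}^2_\lambda(\mathbb{B}^n)$ defines a unitary transformation onto $\mathcal{A}^2_\lambda(D_n)$; furthermore, the operator $\mathcal{U}_\lambda$ conjugates the corresponding weighted Bergman projections.

Consider again the domain $\mathcal{D} = \mathbb{C}^{n-1} \times \mathbb{R} \times \mathbb{R}_+$. We write points $w \in \mathcal{D}$ in the form $w = (z', u, v)$, where $u \in \mathbb{R}$ and $v \in \mathbb{R}_+$. Let $f$ be a function on $D_n$ and $\kappa : \mathcal{D} \to D_n$ be the diffeomorphism (2.1). The Jacobian determinant of the transformation $\kappa$ is identically one, and therefore:
\[ \int_{\mathcal{D}} f \circ \kappa(z', u, v) \, dv(w) = \int_{D_n} f(z) \, dv(z). \tag{3.2} \]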

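Definition 3.3. Let $\lambda > -1$, then we consider the weighted space $L_2(\mathcal{D}, \eta_\lambda)$, where the weight function $\eta_\lambda$ is defined by:
\[ \eta_\lambda(z', u, v) = \frac{c_\lambda}{4} \, v^\lambda. \]
Moreover, let $U_0 : L_2(D_n, \widetilde{\mu}_\lambda) \to L_2(\mathcal{D}, \eta_\lambda)$ be the operator defined by $U_0 f := f \circ \kappa$.

Let $f \in L_2(D_n, \widetilde{\mu}_\lambda)$, then by (3.2):
\begin{align*}
\|U_0 f\|^2_{L_2(\mathcal{D}, \eta_\lambda)} &= \frac{c_\lambda}{4} \int_{\mathcal{D}} |f \circ \kappa(z', u, v)|^2 \, v^\lambda \, dv(w) \\
&= \frac{c_\lambda}{4} \int_{D_n} |f(z)|^2 \big( \operatorname{Im} z_n - |z'|^2 \big)^\lambda \, dv(z) = \|f\|^2_{L_2(D_n, \widetilde{\mu}_\lambda)}.
\end{align*}
It immediately follows:

Lemma 3.4. The operator $U_0$ is unitary with inverse $U_0^{-1} = U_0^*$ given by $U_0^* f = f \circ \kappa^{-1}$.

Consider the space $\mathcal{A}_0(\mathcal{D}) := U_0(\mathcal{A}^2_\lambda(D_n))$. It has been shown in [8] that $\mathcal{A}_0(\mathcal{D})$ consists of all differentiable functions in $L_2(\mathcal{D}, \eta_\lambda)$ which satisfy the equations:
\[ \frac{1}{2} \Big( \frac{\partial}{\partial u} + i \frac{\partial}{\partial v} \Big) \varphi = 0 \quad \text{and} \quad \Big( \frac{\partial}{\partial \bar{z}_k} - z_k \frac{\partial}{\partial v} \Big) \varphi = 0, \quad k = 1, \dots, n - 1, \tag{3.3} \]
or the equations
\[ \frac{1}{2} \Big( \frac{\partial}{\partial u} + i \frac{\partial}{\partial v} \Big) \varphi = 0 \quad \text{and} \quad \Big( \frac{\partial}{\partial \bar{z}_k} - i z_k \frac{\partial}{\partial u} \Big) \varphi = 0, \quad k = 1, \dots, n - 1. \tag{3.4} \]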


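4. Polar-type coordinates in $\mathcal{D}$

Represent $\mathcal{D} = \mathbb{C}^{n-1} \times \mathbb{R} \times \mathbb{R}_+$ in the form $\mathbb{C}^{n-1} \times \Pi$, where $\Pi \subset \mathbb{C}$ denotes the upper half-plane. We introduce in $\mathcal{D}$ the non-isotropic upper semi-sphere:
\[ \Omega := \{ (z', \zeta) \in \mathbb{C}^{n-1} \times \Pi : |z'|^2 + |\zeta| = 1 \}. \]
We use the following natural parametrization for points $(z', \zeta) \in \Omega \subset \mathcal{D}$:
\[ \Omega = \Big\{ \big( s_1 t_1, \dots, s_{n-1} t_{n-1}, \rho e^{i\theta} \big) : s_k \in [0, 1), \ t_k \in \mathbb{S}^1, \ \rho \in (0, 1], \ \theta \in (0, \pi), \ \sum_{k=1}^{n-1} s_k^2 + \rho = 1 \Big\}. \]
This induces a representation of the points $(z', \zeta) \in \mathcal{D}$ of the form:
\[ \mathcal{D} = \big\{ (r^{1/2} z', r\zeta) : (z', \zeta) \in \Omega, \ r \in \mathbb{R}_+ \big\}, \]
and we can write $\mathcal{D} = \tau(\mathbb{B}^{n-1}) \times \mathbb{T}^{n-1} \times \mathbb{R}_+ \times (0, \pi)$, where $\mathbb{T} = \mathbb{S}^1$ denotes the unit circle in $\mathbb{C}$ and $\tau(\mathbb{B}^{n-1})$ is the base of $\mathbb{B}^{n-1}$ in the sense of a Reinhardt domain:
\[ \tau(\mathbb{B}^{n-1}) := \Big\{ s = (s_1, \dots, s_{n-1}) \in \mathbb{R}_+^{n-1} : \sum_{k=1}^{n-1} s_k^2 < 1 \Big\}. \]
Hence we can express points $(z', \zeta) \in \mathcal{D}$ in the new coordinates $(s, t, r, \theta) \in \tau(\mathbb{B}^{n-1}) \times \mathbb{T}^{n-1} \times \mathbb{R}_+ \times (0, \pi)$, which are connected with the previous coordinates $(z', \zeta = \rho e^{i\theta})$ by the formulas:
\[ s_k = \frac{|z_k|}{\sqrt{|z'|^2 + |\zeta|}}, \qquad t_k = \frac{z_k}{|z_k|}, \qquad r = |z'|^2 + |\zeta|, \qquad \theta = \arg \zeta, \]
or $z_k = r^{1/2} s_k t_k$ and $\zeta = r(1 - |s|^2) e^{i\theta}$, where $k = 1, \dots, n - 1$. In these new coordinates we have:

Theorem 4.1 ([8], Lemma 9.1). The equations (3.4) take the following form: for $k = 1, \dots, n - 1$:
\begin{align*}
0 &= s_k \frac{\partial f}{\partial s_k} - t_k \frac{\partial f}{\partial t_k} + i \, \frac{2 s_k^2}{1 - |s|^2} \, (\sin\theta + i\cos\theta - 1) \, \frac{\partial f}{\partial \theta}, \\
0 &= r \frac{\partial f}{\partial r} - \frac{1}{2} \sum_{\ell=1}^{n-1} t_\ell \frac{\partial f}{\partial t_\ell} + i \Big[ 1 + \frac{|s|^2}{1 - |s|^2} (\sin\theta + i\cos\theta) \Big] \frac{\partial f}{\partial \theta}.
\end{align*}
The space $\mathcal{A}_0(\mathcal{D}) = U_0(\mathcal{A}^2_\lambda(D_n))$ consists of all functions $f = f(s, t, r, \theta)$ which satisfy the above equations and belong to:
\[ L_2(\mathcal{D}, \eta_\lambda) = L_2\big( \tau(\mathbb{B}^{n-1}), (1 - |s|^2)^{\lambda+1} s\,ds \big) \otimes L_2(\mathbb{T}^{n-1}) \otimes L_2(\mathbb{R}_+, r^{\lambda+n} dr) \otimes L_2\Big( (0, \pi), \frac{c_\lambda}{4} \sin^\lambda\theta \, d\theta \Big), \]
where we write $s\,ds := s_1 ds_1 \cdots s_{n-1} ds_{n-1}$.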
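The change of coordinates of this section can be sketched in code as a round trip between $(z', \zeta)$ and $(s, t, r, \theta)$. The function names below are illustrative assumptions, and the sketch additionally assumes $z_k \neq 0$ for every $k$ so that $t_k = z_k/|z_k|$ is defined.

```python
import cmath
import math
import random

def to_polar(z_prime, zeta):
    """(z', zeta) in C^{n-1} x Pi  ->  (s, t, r, theta); assumes every z_k != 0."""
    r = sum(abs(z) ** 2 for z in z_prime) + abs(zeta)
    s = [abs(z) / math.sqrt(r) for z in z_prime]
    t = [z / abs(z) for z in z_prime]
    theta = cmath.phase(zeta)
    return s, t, r, theta

def from_polar(s, t, r, theta):
    """Inverse map: z_k = r^{1/2} s_k t_k and zeta = r (1 - |s|^2) e^{i theta}."""
    z_prime = [math.sqrt(r) * sk * tk for sk, tk in zip(s, t)]
    zeta = r * (1 - sum(sk ** 2 for sk in s)) * cmath.exp(1j * theta)
    return z_prime, zeta

rng = random.Random(2)
max_error = 0.0
coordinates_in_range = True
for _ in range(200):
    # Random z' with nonzero entries and zeta in the upper half-plane.
    z_prime = [cmath.rect(rng.uniform(0.1, 2.0), rng.uniform(-math.pi, math.pi))
               for _ in range(3)]
    zeta = cmath.rect(rng.uniform(0.1, 2.0), rng.uniform(0.1, math.pi - 0.1))
    s, t, r, theta = to_polar(z_prime, zeta)
    coordinates_in_range &= sum(sk ** 2 for sk in s) < 1 and 0 < theta < math.pi
    back_z, back_zeta = from_polar(s, t, r, theta)
    errors = [abs(a - b) for a, b in zip(z_prime, back_z)] + [abs(zeta - back_zeta)]
    max_error = max(max_error, max(errors))
print(max_error, coordinates_in_range)
```

The check that $|s|^2 < 1$ reflects $1 - |s|^2 = |\zeta|/r > 0$, which is why $s$ ranges over $\tau(\mathbb{B}^{n-1})$.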


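5. The operators $R_0$ and $R$

Let $M : L_2(\mathbb{R}_+, r^{\lambda+n} dr) \to L_2(\mathbb{R})$ be the Mellin transform and by $\mathcal{F}_{(n-1)} = \mathcal{F} \otimes \cdots \otimes \mathcal{F}$ we denote the $(n - 1)$-dimensional discrete Fourier transform, where $\mathcal{F} : L_2(\mathbb{S}^1) \to \ell_2(\mathbb{Z})$. More precisely,
\begin{align*}
[M\psi](\xi) &:= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}_+} r^{-i\xi + \frac{\lambda+n-1}{2}} \, \psi(r) \, dr, \\
[\mathcal{F} f](n) &:= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{S}^1} f(t) \, t^{-n} \, \frac{dt}{it}.
\end{align*}
Introduce the unitary operator $U_1 := I \otimes \mathcal{F}_{(n-1)} \otimes M \otimes I$:
\begin{align*}
U_1 : \ & L_2\big( \tau(\mathbb{B}^{n-1}), (1 - |s|^2)^{\lambda+1} s\,ds \big) \otimes L_2(\mathbb{T}^{n-1}) \otimes L_2(\mathbb{R}_+, r^{\lambda+n} dr) \otimes L_2\Big( (0, \pi), \frac{c_\lambda}{4} \sin^\lambda\theta \, d\theta \Big) \\
& \longrightarrow L_2\big( \tau(\mathbb{B}^{n-1}), (1 - |s|^2)^{\lambda+1} s\,ds \big) \otimes \ell_2(\mathbb{Z}^{n-1}) \otimes L_2(\mathbb{R}) \otimes L_2\Big( (0, \pi), \frac{c_\lambda}{4} \sin^\lambda\theta \, d\theta \Big). \tag{5.1}
\end{align*}
We identify the space on the right-hand side with:
\[ \mathcal{B} := \ell_2\big( \mathbb{Z}^{n-1}, \mathcal{L} \big), \quad \text{where} \quad \mathcal{L} := L_2(\mathbb{R}) \otimes L_2\big( \tau(\mathbb{B}^{n-1}), (1 - |s|^2)^{\lambda+1} s\,ds \big) \otimes L_2\Big( (0, \pi), \frac{c_\lambda}{4} \sin^\lambda\theta \, d\theta \Big). \]
Put $X := \mathbb{Z}^{n-1} \times \mathbb{R}$ and $Y := \tau(\mathbb{B}^{n-1}) \times (0, \pi)$ and consider the spaces:
\begin{align*}
L_2(X, \mu) &:= \ell_2(\mathbb{Z}^{n-1}) \otimes L_2(\mathbb{R}), \\
L_2(Y, \eta) &:= L_2\big( \tau(\mathbb{B}^{n-1}), (1 - |s|^2)^{\lambda+1} s\,ds \big) \otimes L_2\Big( (0, \pi), \frac{c_\lambda}{4} \sin^\lambda\theta \, d\theta \Big).
\end{align*}
Definition 5.1. With $U_0 : L_2(D_n, \widetilde{\mu}_\lambda) \to L_2(\mathcal{D}, \eta_\lambda)$ this construction induces the unitary operator:
\[ U := U_1 U_0 : L_2(D_n, \widetilde{\mu}_\lambda) \to \mathcal{B}. \]
Let $X_1 := \mathbb{Z}_+^{n-1} \times \mathbb{R} \subset X$ and consider a function $g_0 = g_0(x, y)$ on $X_1 \times Y$ defined by
\[ g_0(\xi, s, \theta) = \big\{ g_0(p, \xi, s, \theta) \big\}_{p \in \mathbb{Z}_+^{n-1}}, \]
where for $p \in \mathbb{Z}_+^{n-1}$ and $(\xi, s, \theta) \in \mathbb{R} \times \tau(\mathbb{B}^{n-1}) \times (0, \pi)$ we put
\[ \beta_p(\xi, s, \theta) := s^p \big[ 1 - (1 + i)|s|^2 \big]^{-\frac{\lambda+n+|p|+1}{2} + i\xi} \times \exp\Big\{ -2 \Big( \xi + i \, \frac{\lambda+n+|p|+1}{2} \Big) \arctan\Big[ \Big( 1 - i \, \frac{|s|^2}{1 - |s|^2} \Big) \tan\frac{\theta}{2} + \frac{|s|^2}{1 - |s|^2} \Big] \Big\}, \tag{5.2} \]
and we write:
\[ g_0(p, \xi, s, \theta) = \alpha_p(\xi) \beta_p(\xi, s, \theta) \quad \text{and} \quad \alpha_p(\xi) := \| \beta_p(\xi, \cdot, \cdot) \|^{-1}_{L_2(Y, \eta)}. \]
The following is shown in [8]: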
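The $\theta$-dependence of $\beta_p$ in (5.2) can be cross-checked against the second equation of Theorem 4.1. The sketch below is illustrative and rests on assumptions that are not stated in the text: that the Mellin transform replaces $r\,\partial/\partial r$ by multiplication with $i\xi - \frac{\lambda+n+1}{2}$ (integration by parts), and that the Fourier transform replaces $t_\ell\,\partial/\partial t_\ell$ by multiplication with $p_\ell$. Under these assumptions the exponential factor $g(\theta)$ of $\beta_p$ should solve $i\,[1 + c(\sin\theta + i\cos\theta)]\,g' + (i\xi - A)\,g = 0$, with the shorthand $A = \frac{\lambda+n+|p|+1}{2}$ and $c = \frac{|s|^2}{1-|s|^2}$ introduced here only for the check.

```python
import cmath
import math

def theta_factor(theta, xi, big_a, c):
    """Exponential factor of beta_p in (5.2) as a function of theta.

    big_a stands for (lambda + n + |p| + 1) / 2 and c for |s|^2 / (1 - |s|^2);
    both names are shorthand introduced for this sketch.
    """
    w = (1 - 1j * c) * math.tan(theta / 2) + c
    return cmath.exp(-2 * (xi + 1j * big_a) * cmath.atan(w))

def ode_residual(theta, xi, big_a, c, h=1e-6):
    """Relative residual of i[1 + c(sin + i cos)] g' + (i xi - A) g = 0.

    The derivative g' is approximated by a central difference with step h.
    """
    g = theta_factor(theta, xi, big_a, c)
    dg = (theta_factor(theta + h, xi, big_a, c)
          - theta_factor(theta - h, xi, big_a, c)) / (2 * h)
    coeff = 1j * (1 + c * (math.sin(theta) + 1j * math.cos(theta)))
    return abs(coeff * dg + (1j * xi - big_a) * g) / abs(g)

# Sample parameters: lambda = 0.5, n = 3, |p| = 2, |s|^2 = 0.3, xi = 0.7.
big_a = (0.5 + 3 + 2 + 1) / 2
c = 0.3 / (1 - 0.3)
worst_residual = max(ode_residual(theta, 0.7, big_a, c)
                     for theta in (0.3, 1.0, 2.0, 2.8))
print(worst_residual)
```

Only the $\theta$-dependence is tested; the $s$-dependent prefactor of $\beta_p$ is not covered by this check.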


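Proposition 5.2 ([8]). The function $g_0$ has the properties (a)–(c):
(a) For each $(p, \xi) \in \mathbb{Z}_+^{n-1} \times \mathbb{R} = X_1$ it holds $g_0(p, \xi, \cdot, \cdot) \in L_2(Y, \eta)$ and $\| g_0(p, \xi, \cdot, \cdot) \|_{L_2(Y, \eta)} = 1$.
(b) $U$ maps the Bergman space $\mathcal{A}^2_\lambda(D_n)$ onto $g_0 L_2(X_1, \mu) \subset L_2(X, \mu) \otimes L_2(Y, \eta)$:
\[ U : \mathcal{A}^2_\lambda(D_n) \longrightarrow g_0 L_2(X_1, \mu) = g_0 \, \ell_2\big( \mathbb{Z}_+^{n-1}, L_2(\mathbb{R}) \big) \overset{\text{closed}}{\subset} \mathcal{B}. \]
(c) For all $f \in L_2(X_1, \mu)$ one has
\[ \| g_0 f \|_{L_2(X, \mu) \otimes L_2(Y, \eta)} = \| f \|_{L_2(X_1, \mu)}. \tag{5.3} \]
Now we introduce an isometric embedding:
\[ R_0 : \ell_2\big( \mathbb{Z}_+^{n-1}, L_2(\mathbb{R}) \big) \to \mathcal{B} = \ell_2\big( \mathbb{Z}^{n-1}, \mathcal{L} \big) \]
by the rule:
\[ R_0 : \big\{ c_p(\xi) \big\}_{p \in \mathbb{Z}_+^{n-1}} \longmapsto \big\{ c_p(\xi) g_0(p, \xi, s, \theta) \big\}_{p \in \mathbb{Z}^{n-1}}, \tag{5.4} \]
where we put $c_p(\xi) g_0(p, \xi, s, \theta) = 0$ if $p \in \mathbb{Z}^{n-1} \setminus \mathbb{Z}_+^{n-1}$. The adjoint operator $R_0^* : \mathcal{B} \to \ell_2(\mathbb{Z}_+^{n-1}, L_2(\mathbb{R}))$ has the form:
\[ R_0^* : \big\{ d_p(\xi, s, \theta) \big\}_{p \in \mathbb{Z}^{n-1}} \longmapsto \Big\{ \alpha_p(\xi) \int_{\tau(\mathbb{B}^{n-1}) \times (0, \pi)} \overline{\beta_p(\xi, s, \theta)} \, d_p(\xi, s, \theta) \, \frac{c_\lambda}{4} (1 - |s|^2)^{\lambda+1} \sin^\lambda\theta \, s\,ds\,d\theta \Big\}_{p \in \mathbb{Z}_+^{n-1}}. \tag{5.5} \]
One easily checks that
\begin{align*}
R_0^* R_0 &= I : L_2(X_1, \mu) \longrightarrow L_2(X_1, \mu), \\
R_0 R_0^* &= Q : L_2(X, \mu) \otimes L_2(Y, \eta) \longrightarrow U(\mathcal{A}^2_\lambda(D_n)) = g_0 L_2(X_1, \mu),
\end{align*}
where $Q$ is the orthogonal projection onto the right-hand side.

Theorem 5.3 ([8]). The operator $R := R_0^* U$ maps the Hilbert space $H := L_2(D_n, \widetilde{\mu}_\lambda)$ onto $L_2(X_1, \mu)$. The restriction and the adjoint operator:
\begin{align*}
R|_{\mathcal{A}} &: \mathcal{A} := \mathcal{A}^2_\lambda(D_n) \longrightarrow L_2(X_1, \mu) = \ell_2\big( \mathbb{Z}_+^{n-1}, L_2(\mathbb{R}) \big), \\
R^* = U^* R_0 &: L_2(X_1, \mu) \longrightarrow \mathcal{A} \subset H
\end{align*}
are isometric isomorphisms. Furthermore,
\begin{align*}
R R^* &= I : L_2(X_1, \mu) \longrightarrow L_2(X_1, \mu), \\
R^* R &= P : H \longrightarrow \mathcal{A},
\end{align*}
where $P$ is the orthogonal projection of $H$ onto $\mathcal{A}$.

Proof. Proposition 9.3 in [8]. □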


6. Toeplitz operators with quasi-hyperbolic symbols

Recall that the Toeplitz operator $T_a$ with symbol $a \in L_\infty(D_n)$ acts on the weighted Bergman space $\mathcal{A}^2_\lambda(D_n)$ by the rule $T_a \varphi = B_{D_n, \lambda}(a\varphi)$, where $B_{D_n, \lambda}$ is the Bergman orthogonal projection of the space $L_2(D_n, \widetilde{\mu}_\lambda)$ onto the Bergman space $\mathcal{A}^2_\lambda(D_n)$.

A bounded measurable symbol $a : D_n \to \mathbb{C}$ is called quasi-hyperbolic if $a$ is invariant under the action of the quasi-hyperbolic group $\mathbb{T}^{n-1} \times \mathbb{R}_+$ acting on $D_n$ by:
\[ \mathbb{T}^{n-1} \times \mathbb{R}_+ \ni (t, r) : (z', z_n) \longmapsto (r^{1/2} t z', r z_n). \]
Consider the group of non-isotropic dilations $\{\delta_r\}_{r \in \mathbb{R}_+}$ acting on $\mathbb{R}_+^{n-1} \times \Pi$ by the rule
\[ \delta_r : (q_1, \dots, q_{n-1}, \zeta) \longmapsto \big( r^{1/2} q_1, \dots, r^{1/2} q_{n-1}, r\zeta \big). \]
A function $\tilde{a} = \tilde{a}(q_1, \dots, q_{n-1}, \zeta)$ is non-isotropic homogeneous of zero order on $\mathbb{R}_+^{n-1} \times \Pi$ if it is invariant under $\delta_r$, i.e., it can be recovered from its restriction to the non-isotropic half-sphere
\[ \Omega_+ := \Big\{ (q_1, \dots, q_{n-1}, \zeta) \in \mathbb{R}_+^{n-1} \times \Pi : \sum_{k=1}^{n-1} q_k^2 + |\zeta| = 1 \Big\}. \]
On the one hand, note that a function $a$ on $D_n$ is quasi-hyperbolic if and only if it has the form:
\[ a(z', z_n) = \tilde{a} \circ \kappa^{-1}\big( |z_1|, \dots, |z_{n-1}|, z_n \big) \]
with a function $\tilde{a}$ which is non-isotropic homogeneous of zero order on $\mathbb{R}_+^{n-1} \times \Pi$. On the other hand, the non-isotropic homogeneous functions of zero order on $\mathbb{R}_+^{n-1} \times \Pi$ are of the type
\begin{align*}
\tilde{a}(q_1, \dots, q_{n-1}, \zeta) &= a_0\Big( \frac{q_1}{\sqrt{|q|^2 + |\zeta|}}, \dots, \frac{q_{n-1}}{\sqrt{|q|^2 + |\zeta|}}, \frac{|\zeta|}{|q|^2 + |\zeta|} e^{i\theta} \Big) \\
&= \tilde{a}_0(s_1, \dots, s_{n-1}, \theta)
\end{align*}
in our former coordinates $(s_1, s_2, \dots, s_{n-1})$ and $\theta$ and with a function $\tilde{a}_0$ on $\tau(\mathbb{B}^{n-1}) \times (0, \pi)$. According to Theorem 10.5 in [8]:

Theorem 6.1. Let $a \in L_\infty(D_n)$ be a quasi-hyperbolic function. Then the Toeplitz operator $T_a$ acting on $\mathcal{A}^2_\lambda(D_n)$ is unitarily equivalent to the multiplication operator:
\[ \gamma_a I = R T_a R^* : \ell_2\big( \mathbb{Z}_+^{n-1}, L_2(\mathbb{R}) \big) \longrightarrow \ell_2\big( \mathbb{Z}_+^{n-1}, L_2(\mathbb{R}) \big). \]
The sequence $\gamma_a = \{\gamma_a(p, \xi)\}_{p \in \mathbb{Z}_+^{n-1}}$ with $\xi \in \mathbb{R}$ is given by:
\[ \gamma_a(p, \xi) = \alpha_p^2(\xi) \int_{\tau(\mathbb{B}^{n-1}) \times (0, \pi)} a(s, \theta) \, |\beta_p(\xi, s, \theta)|^2 \, \frac{c_\lambda}{4} (1 - |s|^2)^{\lambda+1} \sin^\lambda\theta \, s\,ds\,d\theta, \]
where $\beta_p$ was defined in (5.2).


7. Hyperbolic $k$-quasi-radial symbols

Let $k = (k_1, \dots, k_m)$ be a tuple of positive integers such that $k_1 \le k_2 \le \dots \le k_m$ and $|k| = k_1 + \dots + k_m = n - 1$. We arrange the coordinates of $\mathbb{C}^{n-1}$ in $m$ groups:
\[ z_{(1)} = (z_{1,1}, \dots, z_{1,k_1}), \quad z_{(2)} = (z_{2,1}, \dots, z_{2,k_2}), \quad \dots, \quad z_{(m)} = (z_{m,1}, \dots, z_{m,k_m}). \]
Represent each $z_{(j)} = (z_{j,1}, \dots, z_{j,k_j}) \in \mathbb{C}^{k_j}$ in the form $z_{(j)} = r_j \zeta_{(j)}$, where $\zeta_{(j)} \in \mathbb{S}^{2k_j - 1}$ and
\[ r_j = \sqrt{|z_{j,1}|^2 + \dots + |z_{j,k_j}|^2}. \]
Definition 7.1. A function $a = a(z', z_n) : D_n \to \mathbb{C}$ is called hyperbolic $k$-quasi-radial if
\[ a(z', z_n) = \tilde{a}(r_1, \dots, r_m, z_n - i|z'|^2) \tag{7.1} \]
with a function $\tilde{a}$ which is non-isotropic homogeneous of order zero on $\mathbb{R}_+^m \times \Pi$.

In that case $a$ is, in particular, quasi-hyperbolic and $\tilde{a}$ can be represented in the form:
\begin{align*}
\tilde{a}(r_1, \dots, r_m, \zeta) &= a_0\Big( \frac{r_1}{\sqrt{|r|^2 + |\zeta|}}, \dots, \frac{r_m}{\sqrt{|r|^2 + |\zeta|}}, \frac{|\zeta|}{|r|^2 + |\zeta|} e^{i\theta} \Big) \tag{7.2} \\
&= \tilde{a}_0(s_1, \dots, s_m, \theta),
\end{align*}
where $r = (r_1, \dots, r_m) \in \mathbb{R}_+^m$ and $\tilde{a}_0$ is a function on $\tau(\mathbb{B}^m) \times (0, \pi)$.

By varying the tuple $k$ we have a collection of sets $\mathcal{R}_k$ of hyperbolic $k$-quasi-radial functions. This collection is partially ordered by inclusion and we have:
\[ \mathcal{R}_{(n-1)} \subset \mathcal{R}_k \subset \mathcal{R}_{(1, \dots, 1)}. \]
With a given multi-index $\alpha = (\alpha_1, \dots, \alpha_{n-1}) \in \mathbb{N}_0^{n-1}$ we write:
\[ \alpha_{(1)} = (\alpha_1, \dots, \alpha_{k_1}), \quad \alpha_{(2)} = (\alpha_{k_1+1}, \dots, \alpha_{k_1+k_2}), \quad \dots, \quad \alpha_{(m)} = (\alpha_{n-k_m}, \dots, \alpha_{n-1}). \]
For a hyperbolic $k$-quasi-radial function $a$ we can further reduce the order of integration in the expression $\gamma_a(p, \xi)$, where $p \in \mathbb{Z}_+^{n-1}$, in Theorem 6.1. With $s = (s_1, \dots, s_{n-1}) \in \mathbb{B}^{n-1}$ put $\hat{s} := (|s_1|, \dots, |s_{n-1}|) \in \tau(\mathbb{B}^{n-1})$ and $e := (1, 1, \dots, 1)$. With a suitable function $H_{|p|} : \mathbb{R} \times \mathbb{R}_+ \times (0, \pi) \to \mathbb{C}$ and with $p \in \mathbb{Z}_+^{n-1}$ we can write
\[ \beta_p(\xi, \hat{s}, \theta) (1 - |s|^2)^{\frac{\lambda+1}{2}} = s^p \cdot H_{|p|}\big( \xi, |s|, \theta \big). \tag{7.3} \]


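Hence one has:
\begin{align*}
\int_{\tau(\mathbb{B}^{n-1})} \tilde{a}_0(s, \theta) \, \big| \beta_p(\xi, s, \theta) \big|^2 (1 - |s|^2)^{\lambda+1} \, s\,ds
= \frac{1}{2^{n-1}} \int_{\mathbb{B}^{n-1}} \tilde{a}_0(\hat{s}, \theta) \, |s^{2p+e}| \, \big| H_{|p|}\big( \xi, |s|, \theta \big) \big|^2 \, ds = (*).
\end{align*}
If $a$ is hyperbolic $k$-quasi-radial, we obtain:
\[ (*) = \frac{1}{2^{n-1}} \int_{\tau(\mathbb{B}^m)} \int_{\mathbb{S}^{k_1-1} \times \dots \times \mathbb{S}^{k_m-1}} \tilde{a}_0(r, \theta) \, |s^{2p+e}| \, \big| H_{|p|}(\xi, |r|, \theta) \big|^2 \prod_{j=1}^m r_j^{2|p_{(j)}| + k_j - 1} \, d\sigma(s_{(1)}) \cdots d\sigma(s_{(m)}) \, dr. \]
Here and in what follows $d\sigma$ means the usual surface measure on the sphere. From Lemma A.1 we have:
\[ \Theta_p := \prod_{j=1}^m \int_{\mathbb{S}^{k_j-1}} \big| s^{2p_{(j)} + e_{(j)}} \big| \, d\sigma(s) = 2^m \, p! \prod_{j=1}^m \Gamma\Big( |p_{(j)}| + \frac{k_j + 1}{2} \Big)^{-1}, \]
and it follows:

Lemma 7.2. Let $a$ be hyperbolic $k$-quasi-radial and $p \in \mathbb{Z}_+^{n-1}$, then:
\[ \gamma_a(p, \xi) = \frac{F_a\big( |p_{(1)}|, \dots, |p_{(m)}|, \xi \big)}{F_e\big( |p_{(1)}|, \dots, |p_{(m)}|, \xi \big)}, \tag{7.4} \]
where we have used the notation in (7.1) and (7.2) and we put $e \equiv 1$. The function $F_a$ is defined by:
\[ F_a\big( |p_{(1)}|, \dots, |p_{(m)}|, \xi \big) := \int_{\tau(\mathbb{B}^m) \times (0, \pi)} \tilde{a}_0(r, \theta) \, \big| H_{|p|}(\xi, |r|, \theta) \big|^2 \prod_{j=1}^m r_j^{2|p_{(j)}| + k_j - 1} \, \frac{c_\lambda}{4} \sin^\lambda\theta \, dr\,d\theta. \tag{7.5} \]
Proof. From our calculation before and with the notation (7.5) we have:
\[ \gamma_a(p, \xi) = \alpha_p(\xi)^2 \, 2^{m-n+1} \, p! \prod_{j=1}^m \Gamma\Big( |p_{(j)}| + \frac{k_j + 1}{2} \Big)^{-1} F_a\big( |p_{(1)}|, \dots, |p_{(m)}|, \xi \big). \]
Moreover, it holds
\begin{align*}
\alpha_p^{-2}(\xi) &= \| \beta_p(\xi, \cdot, \cdot) \|^2_{L_2(Y, \eta)} \\
&= \int_{\tau(\mathbb{B}^{n-1}) \times (0, \pi)} |\beta_p(\xi, s, \theta)|^2 \, \frac{c_\lambda}{4} (1 - |s|^2)^{\lambda+1} \sin^\lambda\theta \, s\,ds\,d\theta \\
&= 2^{m-n+1} \, p! \prod_{j=1}^m \Gamma\Big( |p_{(j)}| + \frac{k_j + 1}{2} \Big)^{-1} F_e\big( |p_{(1)}|, \dots, |p_{(m)}|, \xi \big),
\end{align*}
which proves (7.4). □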


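8. Hyperbolic $k$-quasi-homogeneous functions

Let $k = (k_1, \dots, k_m) \in \mathbb{Z}_+^m$. With a point $z' = (z_1, \dots, z_{n-1}) \in \mathbb{C}^{n-1}$ we write $z_{(j)} = r_j \zeta_{(j)}$, $j = 1, \dots, m$, to define the vectors $(r_1, \dots, r_m) \in \mathbb{R}_+^m$ and
\[ \zeta := (\zeta_1, \dots, \zeta_{n-1}) = \big( \zeta_{(1)}, \zeta_{(2)}, \dots, \zeta_{(m)} \big) \in \mathbb{S}^{2k_1 - 1} \times \mathbb{S}^{2k_2 - 1} \times \dots \times \mathbb{S}^{2k_m - 1}. \]
A second representation of $z'$ has been given earlier:
\[ z' = \big( r^{1/2} s_1 t_1, \dots, r^{1/2} s_{n-1} t_{n-1} \big), \qquad r \in \mathbb{R}_+, \]
where $t = (t_1, \dots, t_{n-1}) \in \mathbb{S}^1 \times \dots \times \mathbb{S}^1$ and $s = (s_1, \dots, s_{n-1}) \in \tau(\mathbb{B}^{n-1})$. Hence it follows:
\[ z_{j,\ell} = r_j \zeta_{j,\ell} = r^{1/2} s_{j,\ell} t_{j,\ell}, \quad \text{for } \ell \in \{1, \dots, k_j\}, \]
and therefore
\[ \zeta_{j,\ell} = \frac{r^{1/2} s_{j,\ell} t_{j,\ell}}{r_j} \quad \text{and} \quad |\zeta_{j,\ell}| = \frac{r^{1/2} s_{j,\ell}}{r_j}. \]
Moreover, we have $r_j = |z_{(j)}| = r^{1/2} \sqrt{s_{j,1}^2 + \dots + s_{j,k_j}^2} = r^{1/2} |s_{(j)}|$, and in case of $s_{(j)} \neq 0$:
\[ \zeta_{j,\ell} = \frac{s_{j,\ell} t_{j,\ell}}{|s_{(j)}|} \quad \text{and} \quad |\zeta_{j,\ell}| = \frac{s_{j,\ell}}{|s_{(j)}|}. \tag{8.1} \]
Definition 8.1. A function $\varphi \in L_\infty(D_n)$ is called hyperbolic $k$-quasi-homogeneous if it has the form
\[ \varphi(z', z_n) = \tilde{a}\big( |z_{(1)}|, \dots, |z_{(m)}|, z_n - i|z'|^2 \big) \, \zeta^p \bar{\zeta}^q, \]
where $\tilde{a}$ is non-isotropic homogeneous of order zero on $\mathbb{R}_+^m \times \Pi$. We call the pair $(p, q) \in \mathbb{Z}_+^{n-1} \times \mathbb{Z}_+^{n-1}$ the corresponding quasi-homogeneous degree.

According to (8.1) we write:
\[ \varphi(z', z_n) = \tilde{a}_0\big( |s_{(1)}|, \dots, |s_{(m)}|, \underbrace{\arg(z_n - i|z'|^2)}_{= \theta} \big) \, t^p \bar{t}^q s^{p+q} \prod_{j=1}^m |s_{(j)}|^{-|p_{(j)}| - |q_{(j)}|}, \]
where $\tilde{a}_0$ is a function on $\tau(\mathbb{B}^m) \times (0, \pi)$ and for $j = 1, \dots, m$ we have
\[ |s_{(j)}| = \frac{|z_{(j)}|}{\sqrt{|z'|^2 + \big| z_n - i|z'|^2 \big|}}. \]
For a multi-index $\rho \in \mathbb{Z}_+^{n-1}$ we denote by $\hat{e}_\rho = \{\delta_{\rho,\beta}\}_{\beta \in \mathbb{Z}_+^{n-1}}$ the $\rho$-th element of the standard orthonormal basis in $\ell_2(\mathbb{Z}_+^{n-1})$. Given $c(\xi) \in L_2(\mathbb{R})$, let
\[ \hat{e}_\rho\big( c(\xi) \big) = \hat{e}_\rho \otimes c(\xi) = \big\{ \delta_{\rho,\beta} \, c(\xi) \big\}_{\beta \in \mathbb{Z}_+^{n-1}} \]
be the corresponding one-component element of $\ell_2(\mathbb{Z}_+^{n-1}, L_2(\mathbb{R}))$.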

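Theorem 8.2. Given a hyperbolic $k$-quasi-homogeneous symbol $\varphi = a \zeta^p \bar{\zeta}^q$ we have:
\[ R T_\varphi R^* : \hat{e}_\rho(c(\xi)) \longmapsto
\begin{cases}
0, & \text{if there is an } \ell \text{ such that } \rho_\ell + p_\ell - q_\ell < 0, \\
\tilde{\gamma}^a_{\rho,k,p,q}(\xi) \, \hat{e}_{\rho+p-q}(c(\xi)), & \text{if for all } \ell \text{ one has } \rho_\ell + p_\ell - q_\ell \ge 0,
\end{cases} \]
where
\[ \tilde{\gamma}^a_{\rho,k,p,q}(\xi) = \frac{\Theta_{\rho+p} \, \alpha_{\rho+p-q}(\xi) \, \alpha_\rho(\xi)}{2^{n-1}} \int_{\tau(\mathbb{B}^m) \times (0, \pi)} \tilde{a}_0(r_1, \dots, r_m, \theta) \prod_{j=1}^m r_j^{2|\rho_{(j)}| + |p_{(j)}| - |q_{(j)}| + k_j - 1} \times \big[ \overline{H_{|\rho+p-q|}} \cdot H_{|\rho|} \big](\xi, |r|, \theta) \, \frac{c_\lambda}{4} \sin^\lambda\theta \, dr\,d\theta. \]
Here we use the notation (7.3) and write as before:
\[ \Theta_{\rho+p} = \prod_{j=1}^m \int_{\mathbb{S}^{k_j-1}} \big| \gamma_{(j)}^{2\rho_{(j)} + 2p_{(j)} + e_{(j)}} \big| \, d\sigma(\gamma_{(j)}) = 2^m (\rho + p)! \prod_{j=1}^m \Gamma\Big( |\rho_{(j)}| + |p_{(j)}| + \frac{k_j + 1}{2} \Big)^{-1}. \]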

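Proof. Similar to the proof of Theorem 10.5 in [8] we have:
\begin{align*}
R T_\varphi R^* \, \hat{e}_\rho\big( c(\xi) \big)
&= R_0^* U_1 \, \tilde{a}_0\big( |s_{(1)}|, \dots, |s_{(m)}|, \theta \big) \, t^p \bar{t}^q s^{p+q} \prod_{j=1}^m |s_{(j)}|^{-|p_{(j)} + q_{(j)}|} \, U_1^{-1} R_0 \, \hat{e}_\rho\big( c(\xi) \big) \\
&= R_0^* U_1 \, \tilde{a}_0\big( |s_{(1)}|, \dots, |s_{(m)}|, \theta \big) \, t^p \bar{t}^q s^{p+q} \prod_{j=1}^m |s_{(j)}|^{-|p_{(j)} + q_{(j)}|} \, U_1^{-1} \big\{ \hat{e}_\rho \, c(\xi) \alpha_\rho(\xi) \beta_\rho(\xi, s, \theta) \big\} \\
&= R_0^* \, \hat{e}_{\rho+p-q} \Big\{ \tilde{a}_0\big( |s_{(1)}|, \dots, |s_{(m)}|, \theta \big) \, s^{p+q} \prod_{j=1}^m |s_{(j)}|^{-|p_{(j)} + q_{(j)}|} \, c(\xi) \alpha_\rho(\xi) \beta_\rho(\xi, s, \theta) \Big\},
\end{align*}
where $R_0 : \ell_2(\mathbb{Z}_+^{n-1}, L_2(\mathbb{R})) \to \mathcal{B}$ and $U_1 = I \otimes \mathcal{F}_{(n-1)} \otimes M \otimes I$ have been defined in (5.4) and (5.1), respectively. Hence we obtain from (5.5) that
\begin{align*}
R T_{a \zeta^p \bar{\zeta}^q} R^* \, \hat{e}_\rho\big( c(\xi) \big) = \hat{e}_{\rho+p-q} \Big\{ & c(\xi) \alpha_{\rho+p-q}(\xi) \alpha_\rho(\xi) \int_{\tau(\mathbb{B}^{n-1}) \times (0, \pi)} \tilde{a}_0\big( |s_{(1)}|, \dots, |s_{(m)}|, \theta \big) \, \overline{\beta_{\rho+p-q}(\xi, s, \theta)} \\
& \times \beta_\rho(\xi, s, \theta) \, s^{p+q} \prod_{j=1}^m |s_{(j)}|^{-|p_{(j)} + q_{(j)}|} (1 - |s|^2)^{\lambda+1} \, \frac{c_\lambda}{4} \sin^\lambda\theta \, s\,ds\,d\theta \Big\}.
\end{align*}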

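Again we define $H_{|\rho|}$ by the relation
\[ \beta_\rho(\xi, s, \theta) (1 - |s|^2)^{\frac{\lambda+1}{2}} = s^\rho H_{|\rho|}(\xi, |s|, \theta) \]
and we put $e = (1, \dots, 1) \in \mathbb{Z}_+^{n-1}$ such that:
\begin{align*}
& \int_{\tau(\mathbb{B}^{n-1})} \tilde{a}_0\big( |s_{(1)}|, \dots, |s_{(m)}|, \theta \big) \, \overline{\beta_{\rho+p-q}(\xi, s, \theta)} \, \beta_\rho(\xi, s, \theta) \, s^{p+q} \prod_{j=1}^m |s_{(j)}|^{-|p_{(j)} + q_{(j)}|} (1 - |s|^2)^{\lambda+1} \, s\,ds \\
& \quad = \int_{\tau(\mathbb{B}^{n-1})} \tilde{a}_0\big( |s_{(1)}|, \dots, |s_{(m)}|, \theta \big) \, s^{2\rho+2p+e} \, \big[ \overline{H_{|\rho+p-q|}} \cdot H_{|\rho|} \big](\xi, |s|, \theta) \prod_{j=1}^m |s_{(j)}|^{-|p_{(j)} + q_{(j)}|} \, ds = (*).
\end{align*}
It follows that:
\begin{align*}
(*) &= \frac{1}{2^{n-1}} \int_{\mathbb{B}^{n-1}} \tilde{a}_0\big( |s_{(1)}|, \dots, |s_{(m)}|, \theta \big) \, \big| s^{2\rho+2p+e} \big| \, \big[ \overline{H_{|\rho+p-q|}} \cdot H_{|\rho|} \big](\xi, |s|, \theta) \prod_{j=1}^m |s_{(j)}|^{-|p_{(j)} + q_{(j)}|} \, ds \\
&= \frac{1}{2^{n-1}} \int_{\tau(\mathbb{B}^m)} \int_{\mathbb{S}^{k_1-1} \times \dots \times \mathbb{S}^{k_m-1}} \tilde{a}_0(r_1, \dots, r_m, \theta) \prod_{j=1}^m r_j^{2|\rho_{(j)}| + |p_{(j)}| - |q_{(j)}| + k_j - 1} \, \big| \gamma^{2\rho+2p+e} \big| \\
& \qquad \times \big[ \overline{H_{|\rho+p-q|}} \cdot H_{|\rho|} \big](\xi, |r|, \theta) \, d\sigma(\gamma_1) \cdots d\sigma(\gamma_m) \, dr \\
&= \frac{\Theta_{\rho+p}}{2^{n-1}} \int_{\tau(\mathbb{B}^m)} \tilde{a}_0(r_1, \dots, r_m, \theta) \prod_{j=1}^m r_j^{2|\rho_{(j)}| + |p_{(j)}| - |q_{(j)}| + k_j - 1} \, \big[ \overline{H_{|\rho+p-q|}} \cdot H_{|\rho|} \big](\xi, |r|, \theta) \, dr,
\end{align*}
which proves the assertion. □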

9. Commutativity results

Now we have collected all the tools to extend the results in [11, 12] to the case of Toeplitz operators with hyperbolic $k$-quasi-homogeneous symbols.

Proposition 9.1. Let $k = (k_1, k_2, \dots, k_m) \in \mathbb{Z}_+^m$ and let $p, q$ be a pair of orthogonal multi-indices. Then (a) and (b) below are equivalent:
(a) For each pair of non identically zero hyperbolic $k$-quasi-radial functions $a_1$ and $a_2$ the Toeplitz operators $T_{a_1}$ and $T_{a_2 \zeta^p \bar{\zeta}^q}$ commute on each weighted Bergman space $\mathcal{A}^2_\lambda(D_n)$.
(b) It holds $|p_{(j)}| = |q_{(j)}|$ for each $j = 1, 2, \dots, m$.

Proof. We calculate the operator products in both orders using Theorems 6.1 and 8.2. On the one hand and according to Theorem 5.3 we have
\begin{align*}
\big( R T_{a_2 \zeta^p \bar{\zeta}^q} T_{a_1} R^* \big) \hat{e}_\rho(c(\xi)) &= \big( R T_{a_2 \zeta^p \bar{\zeta}^q} R^* \big) \big( R T_{a_1} R^* \big) \hat{e}_\rho(c(\xi)) \\
&= \big( R T_{a_2 \zeta^p \bar{\zeta}^q} R^* \big) \hat{e}_\rho\big( \gamma_{a_1}(\rho, \xi) c(\xi) \big) \\
&= \tilde{\gamma}^{a_2}_{\rho,k,p,q}(\xi) \, \gamma_{a_1}(\rho, \xi) \, \hat{e}_{\rho+p-q}\big( c(\xi) \big)
\end{align*}
(provided that $\rho_\ell + p_\ell - q_\ell \ge 0$ for all $\ell$). On the other hand:
\begin{align*}
\big( R T_{a_1} T_{a_2 \zeta^p \bar{\zeta}^q} R^* \big) \hat{e}_\rho\big( c(\xi) \big) &= \big( R T_{a_1} R^* \big) \big( R T_{a_2 \zeta^p \bar{\zeta}^q} R^* \big) \hat{e}_\rho\big( c(\xi) \big) \\
&= \big( R T_{a_1} R^* \big) \hat{e}_{\rho+p-q}\big( \tilde{\gamma}^{a_2}_{\rho,k,p,q}(\xi) c(\xi) \big) \\
&= \gamma_{a_1}(\rho + p - q, \xi) \, \tilde{\gamma}^{a_2}_{\rho,k,p,q}(\xi) \, \hat{e}_{\rho+p-q}\big( c(\xi) \big).
\end{align*}
Hence both operators commute if and only if:
\[ \gamma_{a_1}(\rho, \xi) = \gamma_{a_1}(\rho + p - q, \xi). \]
According to Lemma 7.2 and with the notation (7.5) this means:
\[ \frac{F_{a_1}\big( |\rho_{(1)}|, \dots, |\rho_{(m)}|, \xi \big)}{F_e\big( |\rho_{(1)}|, \dots, |\rho_{(m)}|, \xi \big)} = \frac{F_{a_1}\big( |\rho_{(1)} + p_{(1)} - q_{(1)}|, \dots, |\rho_{(m)} + p_{(m)} - q_{(m)}|, \xi \big)}{F_e\big( |\rho_{(1)} + p_{(1)} - q_{(1)}|, \dots, |\rho_{(m)} + p_{(m)} - q_{(m)}|, \xi \big)}. \]
This relation is fulfilled for all possible symbols $a_1$ if and only if (b) holds. □

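Note that under the condition $|p_{(j)}| = |q_{(j)}|$, for each $j = 1, \dots, m$, and with a hyperbolic $k$-quasi-radial symbol $a$ we have:
\begin{align*}
\tilde{\gamma}^a_{\rho,k,p,q}(\xi) &= \frac{\Theta_{\rho+p} \, \alpha_{\rho+p-q}(\xi)}{\Theta_\rho \, \alpha_\rho(\xi)} \cdot \frac{\Theta_\rho \, \alpha_\rho^2(\xi)}{2^{n-1}} \int_{\tau(\mathbb{B}^m) \times (0, \pi)} \tilde{a}_0(r, \theta) \prod_{j=1}^m r_j^{2|\rho_{(j)}| + k_j - 1} \, \big| H_{|\rho|}(\xi, |r|, \theta) \big|^2 \, \frac{c_\lambda}{4} \sin^\lambda\theta \, dr\,d\theta \\
&= \frac{\Theta_{\rho+p} \, \alpha_{\rho+p-q}(\xi)}{\Theta_\rho \, \alpha_\rho(\xi)} \, \gamma_a(\rho, \xi).
\end{align*}
Moreover,
\[ \frac{\Theta_{\rho+p} \, \alpha_{\rho+p-q}(\xi)}{\Theta_\rho \, \alpha_\rho(\xi)} = \frac{\Theta_{\rho+p}}{\Theta_\rho} \cdot \frac{\sqrt{\rho!}}{\sqrt{(\rho + p - q)!}} = \frac{(\rho + p)!}{\sqrt{\rho! \, (\rho + p - q)!}} \prod_{j=1}^m \frac{\Gamma\big( |\rho_{(j)}| + \frac{k_j + 1}{2} \big)}{\Gamma\big( |\rho_{(j)} + p_{(j)}| + \frac{k_j + 1}{2} \big)}. \]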


Thus we have:

Lemma 9.2. Let $k = (k_1, \dots, k_m)$ and let $p, q \in \mathbb{Z}_+^{n-1}$ be orthogonal multi-indices such that $|p_{(j)}| = |q_{(j)}|$ for $j = 1, \dots, m$. With $\rho \in \mathbb{Z}_+^{n-1}$ and a hyperbolic $k$-quasi-radial symbol $a$ one has:
\[ \tilde{\gamma}^a_{\rho,k,p,q}(\xi) = \gamma_a(\rho, \xi) \cdot \frac{(\rho + p)!}{\sqrt{\rho! \, (\rho + p - q)!}} \prod_{j=1}^m \frac{\Gamma\big( |\rho_{(j)}| + \frac{k_j + 1}{2} \big)}{\Gamma\big( |\rho_{(j)} + p_{(j)}| + \frac{k_j + 1}{2} \big)}. \tag{9.1} \]
From this result we conclude:

Corollary 9.3. Let $k = (k_1, k_2, \dots, k_m) \in \mathbb{Z}_+^m$ be given. For each pair of orthogonal multi-indices $p$ and $q$ with $|p_{(j)}| = |q_{(j)}|$ for all $j = 1, 2, \dots, m$ and a hyperbolic $k$-quasi-radial function $a$ we have:
\[ T_a T_{\zeta^p \bar{\zeta}^q} = T_{\zeta^p \bar{\zeta}^q} T_a = T_{a \zeta^p \bar{\zeta}^q}. \]
Proof. The first equality directly follows from Proposition 9.1. Moreover, with the symbol $e \equiv 1$ we have $\gamma_e(\rho, \xi) \equiv 1$ for all multi-indices $\rho \in \mathbb{Z}_+^{n-1}$. Thus by (9.1)
\[ \tilde{\gamma}^e_{\rho,k,p,q}(\xi) = \frac{(\rho + p)!}{\sqrt{\rho! \, (\rho + p - q)!}} \prod_{j=1}^m \frac{\Gamma\big( |\rho_{(j)}| + \frac{k_j + 1}{2} \big)}{\Gamma\big( |\rho_{(j)} + p_{(j)}| + \frac{k_j + 1}{2} \big)}. \tag{9.2} \]
In other words, one has
\[ \tilde{\gamma}^a_{\rho,k,p,q}(\xi) = \gamma_a(\rho, \xi) \cdot \tilde{\gamma}^e_{\rho,k,p,q}(\xi), \tag{9.3} \]
which together with the calculations in the proof of Proposition 9.1 implies the assertion. □

Given $k = (k_1, k_2, \dots, k_m)$ and a pair of orthogonal multi-indices $p$ and $q$ with $|p_{(j)}| = |q_{(j)}|$, for all $j = 1, 2, \dots, m$, put
\[ \tilde{p}_{(j)} := (0, \dots, 0, p_{(j)}, 0, \dots, 0) \quad \text{and} \quad \tilde{q}_{(j)} := (0, \dots, 0, q_{(j)}, 0, \dots, 0). \]
Then of course $p = \tilde{p}_{(1)} + \tilde{p}_{(2)} + \dots + \tilde{p}_{(m)}$ and $q = \tilde{q}_{(1)} + \tilde{q}_{(2)} + \dots + \tilde{q}_{(m)}$. For each $j = 1, 2, \dots, m$ we introduce the Toeplitz operator $T_j := T_{\zeta^{\tilde{p}_{(j)}} \bar{\zeta}^{\tilde{q}_{(j)}}}$.

Corollary 9.4. The operators $T_j$ for $j = 1, 2, \dots, m$ mutually commute. Given an $h$-tuple of indices $(j_1, j_2, \dots, j_h)$, where $2 \le h \le m$, let
\[ \tilde{p}_h = \tilde{p}_{(j_1)} + \tilde{p}_{(j_2)} + \dots + \tilde{p}_{(j_h)} \quad \text{and} \quad \tilde{q}_h = \tilde{q}_{(j_1)} + \tilde{q}_{(j_2)} + \dots + \tilde{q}_{(j_h)}. \]
Under the condition $|p_{(j)}| = |q_{(j)}|$, for all $j = 1, 2, \dots, m$, it holds
\[ \prod_{g=1}^h T_{j_g} = T_{\zeta^{\tilde{p}_h} \bar{\zeta}^{\tilde{q}_h}}. \]


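Proof. Let $e \equiv 1$, then it is sufficient to show that for $j \neq \ell$:
\begin{align*}
\tilde{\gamma}^e_{\rho, k, \tilde{p}_{(j)}, \tilde{q}_{(j)}}(\xi) \cdot \tilde{\gamma}^e_{\rho + \tilde{p}_{(j)} - \tilde{q}_{(j)}, k, \tilde{p}_{(\ell)}, \tilde{q}_{(\ell)}}(\xi)
&= \tilde{\gamma}^e_{\rho, k, \tilde{p}_{(\ell)}, \tilde{q}_{(\ell)}}(\xi) \cdot \tilde{\gamma}^e_{\rho + \tilde{p}_{(\ell)} - \tilde{q}_{(\ell)}, k, \tilde{p}_{(j)}, \tilde{q}_{(j)}}(\xi) \\
&= \tilde{\gamma}^e_{\rho, k, \tilde{p}_{(j)} + \tilde{p}_{(\ell)}, \tilde{q}_{(j)} + \tilde{q}_{(\ell)}}(\xi).
\end{align*}
We calculate the first product by using (9.2):
\begin{align*}
& \tilde{\gamma}^e_{\rho, k, \tilde{p}_{(j)}, \tilde{q}_{(j)}}(\xi) \, \tilde{\gamma}^e_{\rho + \tilde{p}_{(j)} - \tilde{q}_{(j)}, k, \tilde{p}_{(\ell)}, \tilde{q}_{(\ell)}}(\xi)
= \frac{\Gamma\big( |\rho_{(j)}| + \frac{k_j + 1}{2} \big) \, \Gamma\big( |\rho_{(\ell)}| + \frac{k_\ell + 1}{2} \big)}{\Gamma\big( |\rho_{(j)} + p_{(j)}| + \frac{k_j + 1}{2} \big) \, \Gamma\big( |\rho_{(\ell)} + p_{(\ell)}| + \frac{k_\ell + 1}{2} \big)} \\
& \qquad \times \underbrace{\frac{(\rho + \tilde{p}_{(j)})!}{\sqrt{\rho! \, (\rho + \tilde{p}_{(j)} - \tilde{q}_{(j)})!}} \cdot \frac{(\rho + \tilde{p}_{(j)} - \tilde{q}_{(j)} + \tilde{p}_{(\ell)})!}{\sqrt{(\rho + \tilde{p}_{(j)} - \tilde{q}_{(j)})! \, (\rho + \tilde{p}_{(j)} - \tilde{q}_{(j)} + \tilde{p}_{(\ell)} - \tilde{q}_{(\ell)})!}}}_{=: A_{j,\ell}}.
\end{align*}
Note that
\begin{align*}
A_{j,\ell} &= C \, \frac{(\rho_{(j)} + p_{(j)})! \, \rho_{(\ell)}!}{\sqrt{\rho_{(j)}! \, (\rho_{(j)} + p_{(j)} - q_{(j)})! \, \rho_{(\ell)}! \, \rho_{(\ell)}!}} \times \frac{(\rho_{(j)} + p_{(j)} - q_{(j)})! \, (\rho_{(\ell)} + p_{(\ell)})!}{\sqrt{(\rho_{(j)} + p_{(j)} - q_{(j)})! \, \rho_{(\ell)}! \, (\rho_{(j)} + p_{(j)} - q_{(j)})! \, (\rho_{(\ell)} + p_{(\ell)} - q_{(\ell)})!}} \\
&= C \, \frac{(\rho_{(j)} + p_{(j)})!}{\sqrt{\rho_{(j)}! \, (\rho_{(j)} + p_{(j)} - q_{(j)})!}} \cdot \frac{(\rho_{(\ell)} + p_{(\ell)})!}{\sqrt{(\rho_{(\ell)} + p_{(\ell)} - q_{(\ell)})! \, \rho_{(\ell)}!}}.
\end{align*}
Here $C$ denotes a constant which is independent of $\tilde{p}_{(r)}$ and $\tilde{q}_{(r)}$ for an index $r \in \{\ell, j\}$. Finally, note that:
\[ \tilde{\gamma}^e_{\rho, k, \tilde{p}_{(j)} + \tilde{p}_{(\ell)}, \tilde{q}_{(j)} + \tilde{q}_{(\ell)}}(\xi) = \frac{(\rho + \tilde{p}_{(j)} + \tilde{p}_{(\ell)})!}{\sqrt{\rho! \, (\rho + \tilde{p}_{(j)} + \tilde{p}_{(\ell)} - \tilde{q}_{(j)} - \tilde{q}_{(\ell)})!}} \times \frac{\Gamma\big( |\rho_{(j)}| + \frac{k_j + 1}{2} \big) \, \Gamma\big( |\rho_{(\ell)}| + \frac{k_\ell + 1}{2} \big)}{\Gamma\big( |\rho_{(j)} + p_{(j)}| + \frac{k_j + 1}{2} \big) \, \Gamma\big( |\rho_{(\ell)} + p_{(\ell)}| + \frac{k_\ell + 1}{2} \big)}, \]
and the first factor coincides with $A_{j,\ell}$. The assertion is proven. □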

Fix a tuple $k = (k_1, k_2, \dots, k_m) \in \mathbb{Z}_+^m$ and let $\varphi_1 = a_1 \zeta^p \bar{\zeta}^q$ and $\varphi_2 = a_2 \zeta^u \bar{\zeta}^v$ be bounded measurable hyperbolic $k$-quasi-homogeneous symbols with $p \perp q$ and $u \perp v$. Moreover, assume that $|p_{(j)}| = |q_{(j)}|$ and $|u_{(j)}| = |v_{(j)}|$ for all $j = 1, 2, \dots, m$.

Theorem 9.5. The Toeplitz operators $T_{\varphi_1}$ and $T_{\varphi_2}$ commute on each weighted Bergman space $\mathcal{A}^2_\lambda(D_n)$ if and only if for each $\ell = 1, 2, \dots, n - 1$ one of the conditions (a)–(d) is fulfilled:
(a) $p_\ell = q_\ell = 0$,
(b) $u_\ell = v_\ell = 0$,
(c) $p_\ell = u_\ell = 0$,
(d) $q_\ell = v_\ell = 0$.

Proof. Let $\rho \in \mathbb{N}_0^{n-1}$ be such that the following expressions are non-zero:
\begin{align*}
(R T_{\varphi_1} R^*) (R T_{\varphi_2} R^*) \, \hat{e}_\rho(c(\xi)) &= \tilde{\gamma}^{a_1}_{\rho+u-v,k,p,q}(\xi) \, \tilde{\gamma}^{a_2}_{\rho,k,u,v}(\xi) \, \hat{e}_{\rho+u+p-v-q}(c(\xi)), \\
(R T_{\varphi_2} R^*) (R T_{\varphi_1} R^*) \, \hat{e}_\rho(c(\xi)) &= \tilde{\gamma}^{a_2}_{\rho+p-q,k,u,v}(\xi) \, \tilde{\gamma}^{a_1}_{\rho,k,p,q}(\xi) \, \hat{e}_{\rho+u+p-v-q}(c(\xi)).
\end{align*}

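Hence, $T_{\varphi_1}$ and $T_{\varphi_2}$ commute if and only if for all $\rho$ (such that the expressions below are non-zero) we have:
\[ \tilde{\gamma}^{a_1}_{\rho+u-v,k,p,q}(\xi) \, \tilde{\gamma}^{a_2}_{\rho,k,u,v}(\xi) = \tilde{\gamma}^{a_2}_{\rho+p-q,k,u,v}(\xi) \, \tilde{\gamma}^{a_1}_{\rho,k,p,q}(\xi). \]
By (9.3) this is equivalent to
\[ \gamma_{a_1}(\rho + u - v, \xi) \, \tilde{\gamma}^e_{\rho+u-v,k,p,q}(\xi) \, \gamma_{a_2}(\rho, \xi) \, \tilde{\gamma}^e_{\rho,k,u,v}(\xi) = \gamma_{a_2}(\rho + p - q, \xi) \, \tilde{\gamma}^e_{\rho+p-q,k,u,v}(\xi) \, \gamma_{a_1}(\rho, \xi) \, \tilde{\gamma}^e_{\rho,k,p,q}(\xi). \tag{9.4} \]
From $|p_{(j)}| = |q_{(j)}|$ and $|u_{(j)}| = |v_{(j)}|$, for all $j = 1, \dots, m$, together with Lemma 7.2 it follows that
\[ \gamma_{a_1}(\rho + u - v, \xi) = \gamma_{a_1}(\rho, \xi) \quad \text{and} \quad \gamma_{a_2}(\rho + p - q, \xi) = \gamma_{a_2}(\rho, \xi). \]
Hence, the relation (9.4) is equivalent to:
\[ \tilde{\gamma}^e_{\rho+u-v,k,p,q}(\xi) \, \tilde{\gamma}^e_{\rho,k,u,v}(\xi) = \tilde{\gamma}^e_{\rho+p-q,k,u,v}(\xi) \, \tilde{\gamma}^e_{\rho,k,p,q}(\xi). \]
We can write this equation more explicitly by using (9.2):
\begin{align*}
& \frac{(\rho + u - v + p)!}{\sqrt{(\rho + u - v)! \, (\rho + u - v + p - q)!}} \cdot \frac{(\rho + u)!}{\sqrt{\rho! \, (\rho + u - v)!}} \prod_{j=1}^m \frac{\Gamma\big( |\rho_{(j)} + u_{(j)} - v_{(j)}| + \frac{k_j + 1}{2} \big) \, \Gamma\big( |\rho_{(j)}| + \frac{k_j + 1}{2} \big)}{\Gamma\big( |\rho_{(j)} + u_{(j)} - v_{(j)} + p_{(j)}| + \frac{k_j + 1}{2} \big) \, \Gamma\big( |\rho_{(j)} + u_{(j)}| + \frac{k_j + 1}{2} \big)} \\
& \quad = \frac{(\rho + p - q + u)!}{\sqrt{(\rho + p - q)! \, (\rho + p - q + u - v)!}} \cdot \frac{(\rho + p)!}{\sqrt{\rho! \, (\rho + p - q)!}} \prod_{j=1}^m \frac{\Gamma\big( |\rho_{(j)} + p_{(j)} - q_{(j)}| + \frac{k_j + 1}{2} \big) \, \Gamma\big( |\rho_{(j)}| + \frac{k_j + 1}{2} \big)}{\Gamma\big( |\rho_{(j)} + p_{(j)} - q_{(j)} + u_{(j)}| + \frac{k_j + 1}{2} \big) \, \Gamma\big( |\rho_{(j)} + p_{(j)}| + \frac{k_j + 1}{2} \big)}.
\end{align*}
Since by assumption we have $|p_{(j)}| = |q_{(j)}|$ and $|u_{(j)}| = |v_{(j)}|$ for all $j = 1, \dots, m$, this is equivalent to:
\[ (\rho + u - v + p)! \, \frac{(\rho + u)!}{(\rho + u - v)!} = (\rho + p - q + u)! \, \frac{(\rho + p)!}{(\rho + p - q)!}. \]
Varying $\rho$ one can check that this equality holds if and only if for each $\ell = 1, 2, \dots, n - 1$ one of the conditions (a)–(d) is fulfilled. □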
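The final step of the proof ("varying $\rho$ one can check") can be explored by brute force. Multi-index factorials are products over the coordinates, so the sketch below tests the last identity for a single coordinate $\ell$, with scalars $p, q, u, v$ satisfying $pq = 0$ and $uv = 0$ (the orthogonality assumptions). The function names, the search ranges and the restriction to one coordinate are assumptions made for illustration; the search is not a proof.

```python
from itertools import product
from math import factorial

def identity_holds(rho, p, q, u, v):
    """Scalar form of the last identity in the proof of Theorem 9.5.

    Returns None when a factorial argument would be negative (rho is then
    not admissible), otherwise True or False. Cross-multiplying keeps the
    comparison in exact integer arithmetic.
    """
    if min(rho + u - v, rho + p - q, rho + u - v + p - q) < 0:
        return None
    lhs = factorial(rho + u - v + p) * factorial(rho + u) * factorial(rho + p - q)
    rhs = factorial(rho + p - q + u) * factorial(rho + p) * factorial(rho + u - v)
    return lhs == rhs

def one_of_conditions(p, q, u, v):
    """Conditions (a)-(d) of Theorem 9.5 for a single coordinate."""
    return (p == q == 0) or (u == v == 0) or (p == u == 0) or (q == v == 0)

def holds_for_all_rho(p, q, u, v, rho_max=12):
    """True if the identity holds for every admissible rho in 0..rho_max."""
    results = [identity_holds(rho, p, q, u, v) for rho in range(rho_max + 1)]
    return all(r for r in results if r is not None)

# Collect every orthogonal quadruple where the identity and (a)-(d) disagree.
mismatches = [
    (p, q, u, v)
    for p, q, u, v in product(range(4), repeat=4)
    if p * q == 0 and u * v == 0
    and holds_for_all_rho(p, q, u, v) != one_of_conditions(p, q, u, v)
]
print(mismatches)
```

An empty list means that, within the searched range, the scalar identity holds for all admissible $\rho$ exactly when one of (a)–(d) is satisfied.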


In the following we assume (i) and (ii):
(i) For each $j$ with $k_j > 1$ we have:
\[ p_{(j)} = (p_{j,1}, \dots, p_{j,h_j}, 0, \dots, 0), \qquad q_{(j)} = (0, \dots, 0, q_{j,h_j+1}, \dots, q_{j,k_j}); \tag{9.5} \]
(ii) if $k_{j'} = k_{j''}$ with $j' < j''$, then $h_{j'} \le h_{j''}$.

Let $k = (k_1, \dots, k_m)$ be a tuple as before and $h = (h_1, \dots, h_m)$, where
\[ h_j = 0 \ \text{ if } k_j = 1, \qquad 1 \le h_j \le k_j - 1 \ \text{ if } k_j > 1. \]
Definition 9.6. We denote by $\mathcal{R}_k(h)$ the linear space generated by all hyperbolic $k$-quasi-homogeneous functions $a \zeta^p \bar{\zeta}^q$, where the components $p_{(j)}$ and $q_{(j)}$, $j = 1, \dots, m$, of the multi-indices $p$ and $q$ are of the form (9.5) with:
\[ p_{j,1} + \dots + p_{j,h_j} = q_{j,h_j+1} + \dots + q_{j,k_j} \]
and $p_{j,1}, \dots, p_{j,h_j}, q_{j,h_j+1}, \dots, q_{j,k_j} \in \mathbb{Z}_+$.

Note that $\mathcal{R}_k \subset \mathcal{R}_k(h)$ and that the constant function $e(z) \equiv 1$ belongs to $\mathcal{R}_k(h)$. As an application of Theorem 9.5 we have:

Corollary 9.7. The Banach algebra generated by Toeplitz operators with symbols from $\mathcal{R}_k(h)$ is commutative.

Finally we would like to note that:
(a) For $n > 2$ and $k \neq (1, 1, \dots, 1)$ these algebras are just Banach algebras, while the $C^*$-algebras generated by them are non-commutative.
(b) These Banach algebras are commutative for each weighted Bergman space $\mathcal{A}^2_\lambda(D_n)$ with $\lambda > -1$.
(c) For $n = 2$ all these algebras collapse to the single $C^*$-algebra generated by Toeplitz operators with quasi-hyperbolic symbols.

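Appendix

The following well-known relation is essentially used throughout the text. For the convenience of the reader we add its short proof here.

Lemma A.1. Let $d\sigma$ denote the usual surface measure on the $(n - 1)$-dimensional sphere $\mathbb{S}^{n-1}$ and let $\alpha \in \mathbb{N}_0^n$. Then:
\[ \int_{\mathbb{S}^{n-1}} x^\alpha \, d\sigma =
\begin{cases}
0, & \text{if some } \alpha_j \text{ is odd}, \\
\dfrac{2 \, \Gamma(\beta_1) \Gamma(\beta_2) \cdots \Gamma(\beta_n)}{\Gamma(\beta_1 + \dots + \beta_n)}, & \text{if all } \alpha_j \text{ are even},
\end{cases} \]
where $\beta_j := \frac{1}{2}(\alpha_j + 1)$. Moreover, if $\alpha \in \mathbb{N}_0^n$ then we have:
\[ \int_{\mathbb{S}^{n-1}} |y^\alpha| \, d\sigma = \frac{2 \, \Gamma\big( \frac{\alpha_1 + 1}{2} \big) \cdots \Gamma\big( \frac{\alpha_n + 1}{2} \big)}{\Gamma\big( \frac{n + |\alpha|}{2} \big)}. \]
Proof. We only prove the second assertion which in particular implies the first one. Consider:
\begin{align*}
I_\alpha := \int_{\mathbb{R}^n} |x^\alpha| \, e^{-|x|^2} \, dx
&= \prod_{j=1}^n \int_{\mathbb{R}} |x_j^{\alpha_j}| \, e^{-x_j^2} \, dx_j \\
&= \prod_{j=1}^n 2 \int_0^\infty x_j^{\alpha_j} e^{-x_j^2} \, dx_j \\
&= \prod_{j=1}^n \Gamma\Big( \frac{\alpha_j + 1}{2} \Big).
\end{align*}
By changing to polar coordinates we have
\begin{align*}
I_\alpha &= \int_{\mathbb{S}^{n-1}} \int_0^\infty |(ry)^\alpha| \, e^{-r^2} r^{n-1} \, dr \, d\sigma(y) \\
&= \int_0^\infty r^{|\alpha| + n - 1} e^{-r^2} \, dr \int_{\mathbb{S}^{n-1}} |y^\alpha| \, d\sigma(y) \\
&= \frac{1}{2} \, \Gamma\Big( \frac{n + |\alpha|}{2} \Big) \int_{\mathbb{S}^{n-1}} |y^\alpha| \, d\sigma(y),
\end{align*}
and the assertion follows. □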

References

[1] W. Bauer, Y.L. Lee, Commuting Toeplitz operators on the Segal-Bargmann space, J. Funct. Anal. 260(2) (2011), 460–489.
[2] B.R. Choe, H. Koo and Y.J. Lee, Commuting Toeplitz operators on the polydisk, Trans. Amer. Math. Soc. 356 (2004), 1727–1749.
[3] B.R. Choe and Y.J. Lee, Pluriharmonic symbols of commuting Toeplitz operators, Illinois J. Math. 37 (1993), 424–436.
[4] Ž. Čučković and N.V. Rao, Mellin transform, monomial symbols and commuting Toeplitz operators, J. Funct. Anal. 154 (1998), 195–214.
[5] S. Grudsky, R. Quiroga-Barranco and N. Vasilevski, Commutative $C^*$-algebras of Toeplitz operators and quantization on the unit disc, J. Funct. Anal. 234 (2006), 1–44.
[6] T. Le, The commutants of certain Toeplitz operators on weighted Bergman spaces, J. Math. Anal. Appl. 348(1) (2008), 1–11.
[7] Y.J. Lee, Commuting Toeplitz operators on the Hardy space of the polydisc, Proc. Amer. Math. Soc. 138(1) (2010), 189–197.
[8] R. Quiroga-Barranco and N. Vasilevski, Commutative $C^*$-algebras of Toeplitz operators on the unit ball, I. Bargmann-type transforms and spectral representations of Toeplitz operators, Integr. Equ. Oper. Theory 59(3) (2007), 379–419.
[9] N. Vasilevski, Bergman space structure, commutative algebras of Toeplitz operators and hyperbolic geometry, Integr. Equ. Oper. Theory 46 (2003), 235–251.
[10] N. Vasilevski, Commutative algebras of Toeplitz operators on the Bergman space, Birkhäuser, Operator Theory: Advances and Applications, (2008).
[11] N. Vasilevski, Parabolic quasi-radial quasi-homogeneous symbols and commutative algebras of Toeplitz operators, Operator Theory: Advances and Applications, v. 202 (2010), 553–568.
[12] N. Vasilevski, Quasi-radial quasi-homogeneous symbols and commutative Banach algebras of Toeplitz operators, Integr. Equ. Oper. Theory 66 (2010), 141–152.

Wolfram Bauer
Mathematisches Institut
Georg-August-Universität
Bunsenstr. 3–5
D-37073 Göttingen, Germany
e-mail: [email protected]

Nikolai Vasilevski
Departamento de Matemáticas
CINVESTAV del I.P.N.
Av. IPN 2508, Col. San Pedro Zacatenco
México D.F. 07360, México
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 177–205
© 2012 Springer Basel AG

Canonical Models for Bi-isometries

H. Bercovici, R.G. Douglas and C. Foias

We dedicate this paper to the memory of Israel Gohberg, great mathematician, wonderful human being, friend and teacher to us all

Abstract. A canonical model, analogous to the one for contraction operators, is introduced for bi-isometries, two commuting isometries on a Hilbert space. This model involves a contractive analytic operator-valued function on the unit disk. Various complete nonunitarity conditions are considered as well as bi-isometries for which both isometries are shifts. Several families of examples are introduced and classified.

Mathematics Subject Classification (2000). Primary: 47A45. Secondary: 47A15, 47B37.

Keywords. Bi-isometry, characteristic function, functional model, pivotal operator, similarity.

HB and RGD were supported in part by grants from the National Science Foundation.

1. Introduction

It is difficult to overestimate the importance of the von Neumann-Wold theorem on the structure of isometric operators on Hilbert space. Originally introduced in the study of symmetric operators by von Neumann, it became the foundation for Wold’s study of stationary stochastic processes. Later, it was the starting point for the study of contraction operators by Sz.-Nagy and the third author as well as a key ingredient in engineering systems theory. Thus it has had an important role in both pure mathematics and its applications.

For nearly fifty years, researchers have sought a similar structure theory for $n$-tuples of commuting isometries [4,11,12,15,16,17,19] with varying success. In [2] the authors rediscovered an earlier fundamental result of Berger, Coburn and Lebow [4] on a model for an $n$-tuple of commuting isometries and carried the analysis beyond what the latter researchers had done. In the course of this study, a very concrete canonical model for bi-isometries emerged; that is, for pairs of commuting isometries. This new model is related to the canonical functional model of a contraction, but it displays subtle differences and a new set of challenges. In this paper we take up the systematic presentation and development of this model.

After some preliminaries, we begin in Section 3 by examining the passage from an $n$-isometry to an $(n+1)$-isometry, showing that essentially the main ingredient needed is a contraction in the commutant of a completely nonunitary $n$-isometry. In the case of a bi-isometry, this additional operator can be viewed as a contractive operator-valued analytic function in the unit disk. It is this function that is the heart of our canonical model. We relate the reducing subspaces of an $n$-isometry to this construction and investigate a variety of notions of complete nonunitarity which generalize the notion of completely nonunitary contractions and the results of several earlier researchers. (See Section 3 for the details.)

In Section 4 we specialize to the case $n = 1$, that is, to the case of bi-isometries, and study the extension from the first isometry to the pair. The analytic operator function mentioned above then is the characteristic function for the pair. Various relations between the bi-isometry and the characteristic function are investigated. In Section 5, this model is re-examined in the context of a functional model; that is, one in which the abstract Hilbert spaces are realized as Hardy spaces of vector-valued functions on the unit disk. This representation allows one to apply techniques from harmonic analysis in their study. In Section 6, we specialize to bi-shifts or bi-isometries for which both isometries are shift operators. (Note that this use of the term is not the same as that used by earlier authors.) In Section 7, we return to the functional model for bi-isometries, obtaining unitary invariants for them.

Finally, in Section 8, several families of bi-isometries are introduced and studied. The results here are not exhaustive but intended to illustrate various aspects of the earlier theory as well as the variety of possibilities presented by bi-isometries. At the ends of Sections 3 and 4, the connection between intertwining maps and common invariant subspaces for bi-isometries is discussed. This topic has already been considered in [3] and further results will be presented in another paper.

The paper benefitted from a thorough review by the referee who helped eliminate one serious error, along with numerous misprints in our original manuscript. The authors gratefully acknowledge his help in improving this work.

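2. Preliminaries about commuting isometries

We will study families $\mathbb{V} = (V_i)_{i\in I}$ of commuting isometric operators on a complex Hilbert space $\mathfrak{H}$. A (closed) subspace $\mathfrak{M} \subset \mathfrak{H}$ is invariant for $\mathbb{V}$ if $V_i\mathfrak{M} \subset \mathfrak{M}$ for $i \in I$; we write $\mathbb{V}|\mathfrak{M} = (V_i|\mathfrak{M})_{i\in I}$ if $\mathfrak{M}$ is invariant. The invariant subspace $\mathfrak{M}$ is reducing if $\mathfrak{M}^\perp$ is invariant for $\mathbb{V}$ as well. If $\mathfrak{M}$ is a reducing subspace, we have a decomposition
\[ \mathbb{V} = (\mathbb{V}|\mathfrak{M}) \oplus (\mathbb{V}|\mathfrak{M}^\perp), \]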

and $\mathbb{V}|\mathfrak{M}$ is called a direct summand of $\mathbb{V}$. The family $\mathbb{V}$ is said to be unitary if each $V_i$, $i \in I$, is a unitary operator. We say that $\mathbb{V}$ is completely nonunitary or cnu if it has no unitary direct summand acting on a space $\mathfrak{M} \ne \{0\}$. The family $\mathbb{V}$ is irreducible if it has no reducing subspaces other than $\{0\}$ and $\mathfrak{H}$. The following extension of the von Neumann-Wold decomposition was proved by I. Suciu [20].

Theorem 2.1. Let $\mathbb{V}$ be a family of commuting isometries on $\mathfrak{H}$. There exists a unique reducing subspace $\mathfrak{M}$ for $\mathbb{V}$ with the following properties.
(1) $\mathbb{V}|\mathfrak{M}$ is unitary.
(2) $\mathbb{V}|\mathfrak{M}^\perp$ is completely nonunitary.

We recall, for the reader’s convenience, the construction of $\mathfrak{M}$. We simply set
\[ \mathfrak{M} = \bigcap_{N=1}^{\infty} \Bigl[ \bigcap_{k_1,k_2,\dots,k_N \in I} V_{k_1}V_{k_2}\cdots V_{k_N}\mathfrak{H} \Bigr]. \]
Obviously, $V_k\mathfrak{M} \supset \mathfrak{M}$ for each $k$, and the commutativity of $\mathbb{V}$ implies that $V_k\mathfrak{M} \subset \mathfrak{M}$ as well. Thus $\mathfrak{M}$ reduces each $V_k$ to a unitary operator. It is then easily seen that $\mathfrak{M}$ is the largest invariant subspace for $\mathbb{V}$ such that $\mathbb{V}|\mathfrak{M}$ is unitary, and this immediately implies properties (1) and (2), as well as the uniqueness of $\mathfrak{M}$.

Corollary 2.2. Consider a finite family $\mathbb{V} = (V_0, V_1, \dots, V_n)$ of commuting isometries on $\mathfrak{H}$. Then $\mathbb{V}$ is completely nonunitary if and only if the product $V_0V_1\cdots V_n$ is completely nonunitary.

Proof. Indeed, the space $\mathfrak{M}$ in the preceding theorem can alternatively be described as
\[ \mathfrak{M} = \bigcap_{k=1}^{\infty} V^k\mathfrak{H}, \]
where $V = V_0V_1\cdots V_n$. □
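The alternative description of $\mathfrak{M}$ used in this proof can be unpacked as follows. This is only a sketch; the auxiliary operator $R$ (a product of some of the $V_i$) is notation introduced here for illustration and does not appear in the original argument.

```latex
% Sketch: why the two descriptions of \mathfrak{M} agree for a finite family.
% Notation: V = V_0 V_1 \cdots V_n; R is an auxiliary product of the V_i
% (hypothetical notation, used only in this sketch).
\begin{align*}
&\text{(a) Fix } N \ge 1 \text{ and } k_1,\dots,k_N \in \{0,\dots,n\}.
  \text{ Each index occurs at most } N \text{ times, so by commutativity}\\
&\qquad V^N = V_{k_1}V_{k_2}\cdots V_{k_N}\,R
  \quad\Longrightarrow\quad
  V^N\mathfrak{H} \subset V_{k_1}V_{k_2}\cdots V_{k_N}\mathfrak{H}.\\
&\text{(b) Conversely, } V^k = (V_0V_1\cdots V_n)^k
  \text{ is itself a product of } k(n+1) \text{ of the } V_i,
  \text{ so it occurs among the products defining } \mathfrak{M}:\\
&\qquad \mathfrak{M} \subset V^k\mathfrak{H} \quad \text{for every } k \ge 1.\\
&\text{(c) Combining (a) and (b):}\quad
  \bigcap_{N\ge 1}\ \bigcap_{k_1,\dots,k_N} V_{k_1}\cdots V_{k_N}\mathfrak{H}
  \;=\; \bigcap_{k\ge 1} V^k\mathfrak{H}.
\end{align*}
```

Since $\bigcap_{k\ge 1} V^k\mathfrak{H}$ is the space on which the unitary part of the single isometry $V$ acts, $\mathfrak{M} = \{0\}$ exactly when $V$ is completely nonunitary.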

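More generally, given a subset $J \subset I$, we will say that $\mathbb{V}$ is $J$-unitary if $V_j$ is a unitary operator for each $j \in J$. The family $\mathbb{V}$ is said to be $J$-cnu if it has no $J$-unitary direct summand acting on a nonzero space. Theorem 2.1 extends as follows.

Theorem 2.3. Let $\mathbb{V} = (V_i)_{i\in I}$ be a family of commuting isometries on a Hilbert space $\mathfrak{H}$, and let $J$ be a subset of $I$. There exists a unique reducing subspace $\mathfrak{M}_J$ for $\mathbb{V}$ with the following properties.
(1) $\mathbb{V}|\mathfrak{M}_J$ is $J$-unitary.
(2) $\mathbb{V}|\mathfrak{M}_J^\perp$ is $J$-cnu.

Proof. Let us set $\mathbb{V}_J = (V_j)_{j\in J}$ and apply Theorem 2.1 to this family. Thus we can write $\mathfrak{H} = \mathfrak{M} \oplus \mathfrak{N}$, where $\mathfrak{M}$ is reducing for $\mathbb{V}_J$, $\mathbb{V}_J|\mathfrak{M}$ is unitary, and $\mathbb{V}_J|\mathfrak{N}$ is cnu. Denote by $\mathfrak{N}_J$ the smallest reducing subspace for $\mathbb{V}$ containing $\mathfrak{N}$, and set $\mathfrak{M}_J = \mathfrak{H} \ominus \mathfrak{N}_J$. Since $\mathfrak{M}_J$ reduces $\mathbb{V}_J|\mathfrak{M}$, it follows immediately that (1) is satisfied. Moreover, if $\mathfrak{R}$ is any reducing subspace for $\mathbb{V}$ such that $\mathbb{V}_J|\mathfrak{R}$ is unitary, then $\mathfrak{R} \subset \mathfrak{M}$ so that $\mathfrak{R} \perp \mathfrak{N}$ and consequently $\mathfrak{R} \perp \mathfrak{N}_J$ as well. We conclude that $\mathfrak{M}_J$ is the largest reducing subspace for $\mathbb{V}$ satisfying condition (1). Property (2), as well as the uniqueness of $\mathfrak{M}_J$, follow from this observation. □

Observe that $\mathfrak{M}_I$ is precisely the space $\mathfrak{M}$ in Theorem 2.1, and it is convenient to extend our notation so that $\mathfrak{M}_\emptyset = \mathfrak{H}$. We have then
\[ \mathfrak{M}_{J_1\cup J_2} = \mathfrak{M}_{J_1} \cap \mathfrak{M}_{J_2}, \qquad J_1, J_2 \subset I. \]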
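The last identity is stated without proof; one way to see it, sketched here under the sole assumption that $\mathfrak{M}_J$ is the largest reducing subspace for $\mathbb{V}$ on which every $V_j$, $j \in J$, is unitary (as shown in the proof of Theorem 2.3), is the following. The letter $\mathfrak{R}$ is reused below purely as a local abbreviation.

```latex
% Sketch of \mathfrak{M}_{J_1 \cup J_2} = \mathfrak{M}_{J_1} \cap \mathfrak{M}_{J_2}.
\begin{align*}
(\subset)\quad & V_j|\mathfrak{M}_{J_1\cup J_2} \text{ is unitary for every } j \in J_1
   \ \Longrightarrow\ \mathfrak{M}_{J_1\cup J_2} \subset \mathfrak{M}_{J_1},
   \text{ and likewise for } J_2.\\
(\supset)\quad & \mathfrak{R} = \mathfrak{M}_{J_1}\cap\mathfrak{M}_{J_2}
   \text{ reduces } \mathbb{V}, \text{ and } V_j|\mathfrak{R}
   \text{ is unitary for } j \in J_1 \cup J_2\\
 & \text{(a unitary operator restricted to a reducing subspace is unitary)}
   \ \Longrightarrow\ \mathfrak{R} \subset \mathfrak{M}_{J_1\cup J_2}.
\end{align*}
```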

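Given two families $\mathbb{V}^{(1)} = (V_i^{(1)})_{i\in I}$ and $\mathbb{V}^{(2)} = (V_i^{(2)})_{i\in I}$ of commuting isometries on $\mathfrak{H}^{(1)}$ and $\mathfrak{H}^{(2)}$, respectively, we denote by $\mathcal{I}(\mathbb{V}^{(1)}, \mathbb{V}^{(2)})$ the collection of all bounded linear operators $X : \mathfrak{H}^{(1)} \to \mathfrak{H}^{(2)}$ satisfying the intertwining relations $XV_i^{(1)} = V_i^{(2)}X$ for every $i \in I$. In the special case $\mathbb{V}^{(1)} = \mathbb{V}^{(2)} = \mathbb{V}$, we use the notation $(\mathbb{V})' = \mathcal{I}(\mathbb{V}, \mathbb{V})$ for the commutant of $\mathbb{V}$. Also, given $T_j \in \mathcal{L}(\mathfrak{H}^{(j)})$ for $j = 1, 2$, we denote by $\mathcal{I}(T_1, T_2)$ the collection of all bounded linear operators $X : \mathfrak{H}^{(1)} \to \mathfrak{H}^{(2)}$ satisfying $XT_1 = T_2X$.

Proposition 2.4. Consider two families $\mathbb{V}^{(1)} = (V_i^{(1)})_{i\in I}$ and $\mathbb{V}^{(2)} = (V_i^{(2)})_{i\in I}$ of commuting isometries on $\mathfrak{H}^{(1)}$ and $\mathfrak{H}^{(2)}$. Denote by $\mathfrak{M}^{(p)}$ the largest reducing subspace for $\mathbb{V}^{(p)}$ such that $\mathbb{V}^{(p)}|\mathfrak{M}^{(p)}$ is unitary for $p = 1, 2$. Then for every $X \in \mathcal{I}(\mathbb{V}^{(1)}, \mathbb{V}^{(2)})$ we have $X\mathfrak{M}^{(1)} \subset \mathfrak{M}^{(2)}$.

Proof. This follows immediately from the formulas defining the spaces $\mathfrak{M}^{(p)}$. □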
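For the reader who wants the one-line computation behind this proof, here is a sketch. It uses only the intertwining relations and the intersection formula for the unitary part recalled after Theorem 2.1.

```latex
% Sketch: an intertwining operator X maps the unitary part of V^{(1)}
% into the unitary part of V^{(2)}.
\begin{align*}
X\,V^{(1)}_{k_1}V^{(1)}_{k_2}\cdots V^{(1)}_{k_N}\mathfrak{H}^{(1)}
 &= V^{(2)}_{k_1}V^{(2)}_{k_2}\cdots V^{(2)}_{k_N}\,X\mathfrak{H}^{(1)}
 \subset V^{(2)}_{k_1}V^{(2)}_{k_2}\cdots V^{(2)}_{k_N}\mathfrak{H}^{(2)},\\
X\mathfrak{M}^{(1)}
 &\subset \bigcap_{N=1}^{\infty}\ \bigcap_{k_1,\dots,k_N\in I}
   X\,V^{(1)}_{k_1}\cdots V^{(1)}_{k_N}\mathfrak{H}^{(1)}
 \subset \bigcap_{N=1}^{\infty}\ \bigcap_{k_1,\dots,k_N\in I}
   V^{(2)}_{k_1}\cdots V^{(2)}_{k_N}\mathfrak{H}^{(2)}
 = \mathfrak{M}^{(2)}.
\end{align*}
```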

The preceding result does not extend to the spaces $\mathfrak{M}_J$ for $J \ne I$. We illustrate this by a simple example. Denote by $U \in \mathcal{L}(L^2)$ the usual bilateral shift, and set $U_+ = U|H^2$. We consider the Hilbert space $\mathfrak{H} = H^2 \oplus L^2 \oplus L^2 \oplus \cdots$, and the family $\mathbb{V} = (V_0, V_1)$ defined on $\mathfrak{H}$ by the formulas
\[ V_0(v \oplus w_0 \oplus w_1 \oplus \cdots) = U_+v \oplus Uw_0 \oplus Uw_1 \oplus \cdots \]
and
\[ V_1(v \oplus w_0 \oplus w_1 \oplus \cdots) = 0 \oplus v \oplus w_0 \oplus w_1 \oplus \cdots. \]
It is easy to verify that $\mathbb{V}$ is $\{0\}$-cnu, but its restriction to the invariant subspace
\[ \mathfrak{K} = 0 \oplus L^2 \oplus L^2 \oplus L^2 \oplus \cdots \]
is $\{0\}$-unitary. Thus the inclusion operator from $\mathfrak{K}$ to $\mathfrak{H}$ does not map the $\{0\}$-unitary part of $\mathbb{V}|\mathfrak{K}$ to the $\{0\}$-unitary part of $\mathbb{V}$.

Another useful result is the existence of a unique minimal unitary extension for every family of commuting isometries [25, Chapter I] (see also [7] for a Banach space version). We review the result briefly.


Theorem 2.5. Let $\mathbb{V} = (V_i)_{i\in I}$ be a family of commuting isometries on $\mathfrak{H}$. There exists a family $\mathbb{U} = (U_i)_{i\in I}$ of commuting unitary operators on a Hilbert space $\mathfrak{K} \supset \mathfrak{H}$ with the following properties.
(1) $\mathfrak{H}$ is invariant for $\mathbb{U}$ and $\mathbb{U}|\mathfrak{H} = \mathbb{V}$.
(2) $\mathfrak{K} = \bigvee_{N=0}^{\infty} \Bigl[ \bigvee_{k_1,k_2,\dots,k_N \in I} U_{k_1}^*U_{k_2}^*\cdots U_{k_N}^*\mathfrak{H} \Bigr]$.
If $\mathbb{U}'$ is another family of commuting unitary operators on a space $\mathfrak{K}' \supset \mathfrak{H}$ satisfying the analogues of conditions (1) and (2), then there exists a surjective isometry $W : \mathfrak{K} \to \mathfrak{K}'$ such that $Wh = h$ for $h \in \mathfrak{H}$, and $WU_k = U_k'W$ for $k \in I$.

In equation (2) above, we use the convention that $U_{k_1}^*U_{k_2}^*\cdots U_{k_N}^*\mathfrak{H} = \mathfrak{H}$ when $N = 0$. The family $\mathbb{U}$ is called the minimal unitary extension of $\mathbb{V}$. In the sequel, we denote by $\tilde{\mathbb{V}}$ the minimal unitary extension of $\mathbb{V}$, and by $\tilde{\mathfrak{H}}$ the space on which it acts. It is easy to verify the following commutant extension result. This can be deduced from the results in [25, Chapter I], and it is proved in [7] for isometric operators acting on a Banach space.

Theorem 2.6. Let $\mathbb{V}^{(1)} = (V_i^{(1)})_{i\in I}$ and $\mathbb{V}^{(2)} = (V_i^{(2)})_{i\in I}$ be two families of commuting isometries on $\mathfrak{H}^{(1)}$ and $\mathfrak{H}^{(2)}$, respectively, and denote by $\tilde{\mathbb{V}}^{(1)}$ and $\tilde{\mathbb{V}}^{(2)}$ their minimal unitary extensions. The map $Y \mapsto X = Y|\mathfrak{H}^{(1)}$ establishes an isometric bijection between the collection of operators $Y \in \mathcal{I}(\tilde{\mathbb{V}}^{(1)}, \tilde{\mathbb{V}}^{(2)})$ such that $Y\mathfrak{H}^{(1)} \subset \mathfrak{H}^{(2)}$ and $\mathcal{I}(\mathbb{V}^{(1)}, \mathbb{V}^{(2)})$.

Indeed, given $k_1, k_2, \dots, k_N \in I$, a given operator $X \in \mathcal{I}(\mathbb{V}^{(1)}, \mathbb{V}^{(2)})$ easily extends to the space $\tilde{V}_{k_1}^{(1)*}\tilde{V}_{k_2}^{(1)*}\cdots\tilde{V}_{k_N}^{(1)*}\mathfrak{H}^{(1)}$ by setting
\[ Y\tilde{V}_{k_1}^{(1)*}\tilde{V}_{k_2}^{(1)*}\cdots\tilde{V}_{k_N}^{(1)*}h = \tilde{V}_{k_1}^{(2)*}\tilde{V}_{k_2}^{(2)*}\cdots\tilde{V}_{k_N}^{(2)*}Xh, \qquad h \in \mathfrak{H}^{(1)}, \]
and the corresponding operator $Y \in \mathcal{I}(\tilde{\mathbb{V}}^{(1)}, \tilde{\mathbb{V}}^{(2)})$ is obtained by taking the closure of this extension. This unique extension of $X$ will be denoted $\tilde{X}$. If $X$ is isometric or unitary then so is $\tilde{X}$. In the particular case $\mathbb{V}^{(1)} = \mathbb{V}^{(2)} = \mathbb{V}$, the operator $X$ belongs to the commutant of $\mathbb{V}$, and its canonical extension $\tilde{X} \in (\tilde{\mathbb{V}})'$ satisfies $\tilde{X}\mathfrak{H} \subset \mathfrak{H}$.

Irreducible families of commuting isometries have special properties. Theorem 2.1 shows that they are either unitary or cnu. More precisely, we have the following result.

Proposition 2.7. Let $\mathbb{V} = (V_i)_{i\in I}$ be an irreducible family of commuting isometries on a nonzero Hilbert space $\mathfrak{H}$. For every $i_0 \in I$, one of the following alternatives occurs.
(1) $V_{i_0}$ is a scalar multiple of the identity.
(2) $\mathbb{V}$ is $\{i_0\}$-cnu.


Proof. Assume that (2) does not occur. Theorem 2.3 implies then that $V_{i_0}$ is unitary. Since the spectral projections of $V_{i_0}$ reduce $\mathbb{V}$, it follows that the spectrum of $V_{i_0}$ is a singleton, and therefore (1) is true. □

The following result is a consequence of elementary facts about representations of $C^*$-algebras. All the families of isometries in this statement are indexed by the same set $I$.

Proposition 2.8. Let $\mathbb{V}$ be a family of commuting isometries on $\mathfrak{H}$, and denote by $\mathcal{F}$ a collection of mutually inequivalent irreducible families of commuting isometries such that every irreducible direct summand of $\mathbb{V}$ is equivalent to an element of $\mathcal{F}$.
(1) Fix $\mathbb{W} \in \mathcal{F}$, and let $(\mathfrak{M}_\alpha)_{\alpha\in A}$ be a maximal family of mutually orthogonal reducing subspaces for $\mathbb{V}$ such that $\mathbb{V}|\mathfrak{M}_\alpha$ is unitarily equivalent to $\mathbb{W}$ for all $\alpha \in A$. Then the reducing space
\[ \mathfrak{H}_{\mathbb{W}} = \bigoplus_{\alpha\in A} \mathfrak{M}_\alpha \]
depends only on $\mathbb{W}$.
(2) If $\mathbb{W}_1, \mathbb{W}_2 \in \mathcal{F}$ are different, then the spaces $\mathfrak{H}_{\mathbb{W}_1}$ and $\mathfrak{H}_{\mathbb{W}_2}$ are mutually orthogonal.
(3) We have
\[ \mathfrak{H} = \mathfrak{H}_0 \oplus \bigoplus_{\mathbb{W}\in\mathcal{F}} \mathfrak{H}_{\mathbb{W}}, \]
where $\mathfrak{H}_0$ is a reducing subspace for $\mathbb{V}$ such that $\mathbb{V}|\mathfrak{H}_0$ has no irreducible direct summand.

When $\dim \mathfrak{H}_0 > 1$, the family $\mathbb{V}|\mathfrak{H}_0$ is obviously reducible; it just cannot be decomposed into a direct sum of irreducible families. However, it can be decomposed into a continuous direct integral of irreducibles if $\mathfrak{H}$ is separable. A concrete example of such a decomposition will be given in Section 8. Direct integrals are also useful in the proof of the following result, an early variant of which was proved in [20] when $I$ consists of two elements. We refer to [26] for the theory of direct integrals.

Proposition 2.9. Let $\mathbb{V} = (V_i)_{i\in I}$ be a finite family of commuting isometries on a Hilbert space $\mathfrak{H}$. We can associate to each subset $J \subset I$ a reducing space $\mathfrak{L}_J$ for $\mathbb{V}$ with the following properties.
(1) $\mathfrak{H} = \bigoplus_{J\subset I} \mathfrak{L}_J$.
(2) $V_j|\mathfrak{L}_J$ is unitary for each $j \in J$.
(3) $\mathbb{V}|\mathfrak{L}_J$ is $\{j\}$-cnu for each $j \notin J$.

Proof. Since $I$ is finite, $\mathfrak{H}$ can be written as an orthogonal sum of separable reducing subspaces for $\mathbb{V}$. Thus it is sufficient to consider the case of separable spaces $\mathfrak{H}$. There exist a standard measurable space $\Omega$, a probability measure $\mu$ on $\Omega$, a measurable family $(\mathfrak{H}_t)_{t\in\Omega}$ of Hilbert spaces, and a measurable collection


$(\mathbb{V}_t)_{t\in\Omega} = ((V_{ti})_{i\in I})_{t\in\Omega}$ of irreducible families of commuting isometries on $\mathfrak{H}_t$ such that, up to unitary equivalence,
\[ \mathfrak{H} = \int_\Omega^\oplus \mathfrak{H}_t\,d\mu(t), \qquad V_i = \int_\Omega^\oplus V_{ti}\,d\mu(t), \quad i \in I. \]
The reducing subspaces of $\mathbb{V}$ are precisely the spaces of the form
\[ \mathfrak{K} = \int_\sigma^\oplus \mathfrak{H}_t\,d\mu(t), \]
where $\sigma \subset \Omega$ is measurable. Proposition 2.7 shows that for each $t \in \Omega$ there exists a subset $J(t) \subset I$ such that $V_{tj}$ is a scalar multiple of the identity if $j \in J(t)$, while $\mathbb{V}_t$ is $\{j\}$-cnu for $j \notin J(t)$. We claim that the set
\[ \sigma_i = \{t \in \Omega : V_{ti} \text{ is a scalar multiple of the identity}\} \]
is measurable for each $i \in I$. Indeed, consider measurable families of vectors $t \mapsto e_{kt} \in \mathfrak{H}_t$, $k = 1, 2, \dots$, such that the nonzero vectors in the set $\{e_{kt} : k \ge 1\}$ form an orthonormal basis for $\mathfrak{H}_t$ for each $t \in \Omega$. Then the set $\sigma_i$ is defined by the countable family of equations
\[ \langle V_{ti}e_{kt}, e_{\ell t}\rangle = 0, \qquad \langle V_{ti}e_{kt}, e_{kt}\rangle = \langle V_{ti}e_{\ell t}, e_{\ell t}\rangle, \]
which must be satisfied when $k \ne \ell$ and $e_{kt} \ne 0 \ne e_{\ell t}$. It follows that the set $\sigma_J = \{t \in \Omega : J(t) = J\}$ is measurable for each $J \subset I$. The spaces
\[ \mathfrak{L}_J = \int_{\sigma_J}^\oplus \mathfrak{H}_t\,d\mu(t), \]
viewed as subspaces of $\mathfrak{H}$, satisfy the conclusion of the proposition. This follows from the above description of the reducing subspaces of $\mathbb{V}$. □

Some of the spaces $\mathfrak{L}_J$ in the preceding proposition can equal $\{0\}$.
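As a simple illustration of how several of the spaces $\mathfrak{L}_J$ can vanish, consider the following sketch. The pair $(U_+, U_+)$ is chosen here only for illustration and is not an example taken from the paper; $U_+$ denotes the unilateral shift on $H^2$ introduced earlier in this section.

```latex
% Illustration of Proposition 2.9 for I = \{0, 1\}
% (assumed example: \mathbb{V} = (U_+, U_+) on \mathfrak{H} = H^2).
\begin{align*}
&\bigcap_{k\ge 1} U_+^{k}H^2 = \{0\}
  \ \Longrightarrow\ U_+ \text{ restricts to a unitary on no nonzero reducing subspace},\\
&\text{hence } \mathbb{V} \text{ is both } \{0\}\text{-cnu and } \{1\}\text{-cnu, and}\\
&\mathfrak{L}_{\emptyset} = H^2, \qquad
 \mathfrak{L}_{\{0\}} = \mathfrak{L}_{\{1\}} = \mathfrak{L}_{\{0,1\}} = \{0\}.
\end{align*}
```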

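3. Inductive construction of commuting isometries

In this section it will be convenient to index families of commuting isometries by ordinal numbers. Thus, given an ordinal number $n$, an $n$-isometry is simply a family $\mathbb{V} = (V_i)_{0\le i<n}$ of commuting isometries on a Hilbert space. We consider a special construction which produces an $(n+1)$-isometry starting from an $n$-isometry $\mathbb{V}$ on $\mathfrak{H}$ and a contraction $A \in (\mathbb{V})'$; that is, $\|A\| \le 1$. Observe that the canonical extension $\tilde{A} \in (\tilde{\mathbb{V}})'$ on $\tilde{\mathfrak{H}}$ is then a contraction as well, and therefore we can form the defect operator
\[ D_{\tilde{A}} = (I - \tilde{A}^*\tilde{A})^{1/2} \]
and the space $\mathfrak{D} = (D_{\tilde{A}}\tilde{\mathfrak{H}})^-$. The space $\mathfrak{D}$ is reducing for $\tilde{\mathbb{V}}$ because $D_{\tilde{A}}$ commutes with $\tilde{\mathbb{V}}$. We form the space
\[ \mathfrak{K} = \mathfrak{H} \oplus \mathfrak{D} \oplus \mathfrak{D} \oplus \cdots, \]
and define an $(n+1)$-isometry $\mathbb{W}_A = (W_k)_{0\le k\le n}$ on $\mathfrak{K}$ as follows. For $0 \le k < n$ we define
\[ W_k = V_k \oplus (\tilde{V}_k|\mathfrak{D}) \oplus (\tilde{V}_k|\mathfrak{D}) \oplus \cdots, \]
while
\[ W_n(h \oplus d_0 \oplus d_1 \oplus \cdots) = Ah \oplus D_{\tilde{A}}h \oplus d_0 \oplus d_1 \oplus \cdots \]
if $h \in \mathfrak{H}$ and $d_j \in \mathfrak{D}$ for $j \in \mathbb{N}$. It is easy to verify that $\mathbb{W}_A$ is in fact an $(n+1)$-isometry. When the operator $A$ is already isometric, we have $\mathfrak{K} = \mathfrak{H}$ and $\mathbb{W}_A = (\mathbb{V}, A)$. In this trivial sense, every $(n+1)$-isometry is of the form $\mathbb{W}_A$ for some contraction $A$ commuting with an $n$-isometry $\mathbb{V}$. We give now a characterization of $(n+1)$-isometries which are $\{0 \le k < n\}$-cnu.

Theorem 3.1. Let $\mathbb{W} = (W_k)_{0\le k\le n}$ be an $(n+1)$-isometry on $\mathfrak{K}$, where $n \ge 1$. The following conditions are equivalent.
(1) $\mathbb{W}$ is $\{0 \le k < n\}$-cnu.
(2) There exist a cnu $n$-isometry $\mathbb{V}$, and a contraction $A \in (\mathbb{V})'$, such that $\mathbb{W}$ is unitarily equivalent to $\mathbb{W}_A$.

Proof. Assume first that $\mathbb{W} = \mathbb{W}_A$, where $A$ is a contraction in the commutant of the cnu $n$-isometry $\mathbb{V}$ on $\mathfrak{H}$. Let $\mathfrak{M}$ be a reducing subspace for $\mathbb{W}_A$ with the property that $W_k|\mathfrak{M}$ is unitary for all $k < n$. Since the cnu direct summand of the $n$-isometry $(W_k)_{0\le k<n}$ is precisely $\mathbb{V}$ viewed as acting on $\mathfrak{H} \oplus \{0\} \oplus \{0\} \oplus \cdots$, we conclude that $\mathfrak{M} \subset \{0\} \oplus \mathfrak{D} \oplus \mathfrak{D} \oplus \cdots$ and therefore $W_n^{*N}h \in \{0\} \oplus \mathfrak{D} \oplus \mathfrak{D} \oplus \cdots$ for every $h \in \mathfrak{M}$ and $N \ge 1$. This is not possible if $h \ne 0$. Indeed, if $h = 0 \oplus 0 \oplus \cdots \oplus 0 \oplus d_N \oplus \cdots$, and the $N$th component $d_N$ is the first nonzero component of $h$, then $W_n^{*N}h = D_{\tilde{A}}d_N \oplus \cdots \notin \mathfrak{M}$ because $D_{\tilde{A}}d_N \ne 0$.

Conversely, assume that condition (1) is satisfied. Consider the $n$-isometry $\mathbb{W}' = (W_k)_{0\le k<n}$, and the decomposition $\mathfrak{K} = \mathfrak{H} \oplus \mathfrak{H}^\perp$ into reducing subspaces for $\mathbb{W}'$ such that $\mathbb{W}'|\mathfrak{H}$ is cnu and $\mathbb{W}'|\mathfrak{H}^\perp$ is unitary. We denote by $\mathbb{V} = \mathbb{W}'|\mathfrak{H}$ the cnu direct summand of $\mathbb{W}'$, and define an operator $A$ on $\mathfrak{H}$ by setting $A = P_{\mathfrak{H}}W_n|\mathfrak{H}$. Clearly $A$ is a contraction, and the fact that $A$ commutes with $\mathbb{V}$ follows from the fact that the unitary component $\mathfrak{H}^\perp$ is obviously invariant for $W_n$, and therefore $A^* = W_n^*|\mathfrak{H}$.

Consider next the minimal unitary extension $\tilde{\mathbb{W}}'$ which can be written as
\[ \tilde{\mathbb{W}}' = \tilde{\mathbb{V}} \oplus (\mathbb{W}'|\mathfrak{H}^\perp) \]
on the space $\tilde{\mathfrak{H}} \oplus \mathfrak{H}^\perp$, and the unique isometric extension $\tilde{W}_n$ of $W_n$ in the commutant of $\tilde{\mathbb{W}}'$. Clearly,
\[ \tilde{W}_n|\mathfrak{H}^\perp = W_n|\mathfrak{H}^\perp, \]
and the compression $P_{\tilde{\mathfrak{H}}}\tilde{W}_n|\tilde{\mathfrak{H}}$ is precisely the contractive extension $\tilde{A}$ of $A$ in the commutant of $\tilde{\mathbb{V}}$. We show next that $\tilde{W}_n$ is in fact the minimal isometric dilation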

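of $\tilde{A}$. In other words, the smallest invariant subspace $\mathfrak{M}$ for $\tilde{W}_n$ containing $\tilde{\mathfrak{H}}$ is $\tilde{\mathfrak{H}} \oplus \mathfrak{H}^\perp$. To prove this, observe first that, since
\[ \mathfrak{M} = \bigvee_{N\ge 0} \tilde{W}_n^N\tilde{\mathfrak{H}} \]
and $\tilde{\mathfrak{H}}$ is invariant for $\tilde{W}_n^*$, the space $\mathfrak{M}$ is actually reducing for $\tilde{W}_n$. Moreover, $\tilde{W}_i$ is unitary for $i < n$, and hence the operators $\tilde{W}_i^*$ and $\tilde{W}_n$ also commute. Thus $\mathfrak{M}$ is also a reducing space for each $\tilde{W}_i$ if $i < n$. We conclude that the space $\mathfrak{M}^\perp \subset \mathfrak{H}^\perp$ reduces $\mathbb{W}$, and $\mathbb{W}'|\mathfrak{M}^\perp$ is unitary. Hypothesis (1) implies that $\mathfrak{M}^\perp = \{0\}$.

With this preparation out of the way, we find ourselves in the familiar territory of minimal isometric dilations [25, Chapter II]. We recall that, up to unitary equivalence, the minimal isometric dilation of the contraction $\tilde{A}$ is the operator $W$ defined by
\[ W(h \oplus d_0 \oplus d_1 \oplus \cdots) = \tilde{A}h \oplus D_{\tilde{A}}h \oplus d_0 \oplus d_1 \oplus \cdots \]
on the space $\tilde{\mathfrak{H}} \oplus \mathfrak{D} \oplus \mathfrak{D} \oplus \cdots$, where $\mathfrak{D} = (D_{\tilde{A}}\tilde{\mathfrak{H}})^-$. We conclude that there exists a unitary operator $U : \mathfrak{D} \oplus \mathfrak{D} \oplus \cdots \to \mathfrak{H}^\perp$ such that $(I_{\tilde{\mathfrak{H}}} \oplus U)W = \tilde{W}_n(I_{\tilde{\mathfrak{H}}} \oplus U)$. The reader will verify now without difficulty that the operator $I_{\mathfrak{H}} \oplus U$ provides a unitary equivalence between $\mathbb{W}_A$ and $\mathbb{W}$. □

General $(n+1)$-isometries are described using Theorem 2.3 with $J = \{0 \le k < n\}$. We record the result below.

Theorem 3.2. Let $\mathbb{W} = (W_k)_{0\le k\le n}$ be an $(n+1)$-isometry on $\mathfrak{K}$, where $n \ge 1$. There exist reducing subspaces $\mathfrak{K}_0$ and $\mathfrak{K}_1$ for $\mathbb{W}$ with the following properties.
(1) $\mathfrak{K}_0 \oplus \mathfrak{K}_1 = \mathfrak{K}$.
(2) $W_k|\mathfrak{K}_1$ is unitary for every $k < n$.
(3) $\mathbb{W}|\mathfrak{K}_0$ is unitarily equivalent to $\mathbb{W}_A$, where $A$ is a contraction in the commutant of a cnu $n$-isometry $\mathbb{V}$.
In fact, the $n$-isometry $\mathbb{V}$ on $\mathfrak{H} \subset \mathfrak{K}$ is the cnu part of $\mathbb{W}' = (W_k)_{0\le k<n}$, and the operator $A$ is defined by the equivalent relations
\[ A = P_{\mathfrak{H}}W_n|\mathfrak{H}, \qquad A^* = W_n^*|\mathfrak{H}. \]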
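In particular, $W_n$ is an isometric dilation of $A$, and $\tilde{W}_n$ is an isometric dilation of $\tilde{A}$, where the extension $\tilde{W}_n$ belongs to $(\tilde{\mathbb{W}}')'$ and $\tilde{A} \in (\tilde{\mathbb{V}})'$. Thus, the space $\mathfrak{K}_0$ is simply the $\{0 \le k < n\}$-cnu summand of $\mathbb{W}$.

The operators which intertwine two $(n+1)$-isometries can also be analyzed in the context of this inductive construction. Indeed, consider $(n+1)$-isometries $\mathbb{W}^{(p)}$ acting on $\mathfrak{K}^{(p)}$ for $p = 1, 2$. Denote by $\mathfrak{H}^{(p)}$ the subspace of $\mathfrak{K}^{(p)}$ on which the cnu part of the $n$-isometry $\mathbb{W}^{(p)\prime} = (W_k^{(p)})_{0\le k<n}$ acts. The preceding results allow us to write the decompositions
\[ \mathfrak{K}^{(p)} = \mathfrak{K}_0^{(p)} \oplus \mathfrak{K}_1^{(p)}, \qquad p = 1, 2, \]
where $\mathbb{W}^{(p)}|\mathfrak{K}_0^{(p)}$ is $\{0 \le k < n\}$-cnu, and $W_k^{(p)}|\mathfrak{K}_1^{(p)}$ is unitary for $0 \le k < n$. Moreover, $\mathfrak{H}^{(p)}$ is contained in $\mathfrak{K}_0^{(p)}$, and we set
\[ \mathbb{V}^{(p)} = \mathbb{W}^{(p)\prime}|\mathfrak{H}^{(p)}, \qquad A^{(p)} = P_{\mathfrak{H}^{(p)}}W_n^{(p)}|\mathfrak{H}^{(p)}, \qquad p = 1, 2. \]
The minimal unitary extension $\tilde{\mathbb{W}}^{(p)\prime}$ of the $n$-isometry $\mathbb{W}^{(p)\prime}$ acts on the space
\[ \tilde{\mathfrak{K}}^{(p)} = \tilde{\mathfrak{K}}_0^{(p)} \oplus \mathfrak{K}_1^{(p)}, \]
and we denote by $\tilde{W}_n^{(p)}$ the canonical extension of $W_n^{(p)}$ to this larger space. We have
\[ \tilde{W}_n^{(p)} = (\tilde{W}_n^{(p)}|\tilde{\mathfrak{K}}_0^{(p)}) \oplus (W_n^{(p)}|\mathfrak{K}_1^{(p)}) \]
and, as seen above, $\tilde{W}_n^{(p)}|\tilde{\mathfrak{K}}_0^{(p)}$ is the minimal isometric dilation of the operator $\tilde{A}^{(p)}$. Observe that the space $\tilde{\mathfrak{K}}_0^{(p)}$ contains the subspace $\tilde{\mathfrak{H}}^{(p)}$ of the minimal unitary extension $\tilde{\mathbb{V}}^{(p)}$ of $\mathbb{V}^{(p)}$.

Any operator $X \in \mathcal{I}(\mathbb{W}^{(1)}, \mathbb{W}^{(2)})$ can be represented as a matrix
\[ X = \begin{bmatrix} X_{00} & X_{01} \\ X_{10} & X_{11} \end{bmatrix}, \]
where $X_{ij} \in \mathcal{I}(\mathbb{W}^{(1)}|\mathfrak{K}_j^{(1)}, \mathbb{W}^{(2)}|\mathfrak{K}_i^{(2)})$ for $i, j \in \{0, 1\}$. Theorem 2.6 (applied to the entries in the first column of $X$) implies the existence of an extension $\tilde{X} \in \mathcal{I}(\tilde{\mathbb{W}}^{(1)}, \tilde{\mathbb{W}}^{(2)})$. This extension will be represented by a matrix of the form
\[ \tilde{X} = \begin{bmatrix} \tilde{X}_{00} & X_{01} \\ \tilde{X}_{10} & X_{11} \end{bmatrix} \]
relative to the decompositions $\tilde{\mathfrak{K}}^{(p)} = \tilde{\mathfrak{K}}_0^{(p)} \oplus \mathfrak{K}_1^{(p)}$.

Proposition 3.3. With the above notation, the following statements are true.
(1) The operator $Z = P_{\mathfrak{H}^{(2)}}X_{00}|\mathfrak{H}^{(1)}$ belongs to $\mathcal{I}(\mathbb{V}^{(1)}, \mathbb{V}^{(2)})$ and $ZA^{(1)} = A^{(2)}Z$.
(2) The operator $B = P_{\tilde{\mathfrak{H}}^{(2)}}\tilde{X}|\tilde{\mathfrak{H}}^{(1)}$ belongs to $\mathcal{I}(\tilde{\mathbb{V}}^{(1)}, \tilde{\mathbb{V}}^{(2)}) \cap \mathcal{I}(\tilde{A}^{(1)}, \tilde{A}^{(2)})$, and $B\mathfrak{H}^{(1)\perp} = \{0\}$.

Proof. The intertwining properties of $Z$ in part (1) follow from the fact that the space $\mathfrak{H}^{(p)}$ is reducing for $\mathbb{W}^{(p)\prime}$, invariant for $W_n^*$ and invariant for $X^*$ by Proposition 2.4. In other words, we can use the fact that, relative to the decompositions $\mathfrak{K}^{(p)} = \mathfrak{H}^{(p)} \oplus \mathfrak{H}^{(p)\perp}$, the relevant operators have matrices of the form
\[ X = \begin{bmatrix} Z & 0 \\ * & * \end{bmatrix}, \qquad W_n^{(p)} = \begin{bmatrix} A^{(p)} & 0 \\ * & * \end{bmatrix}, \qquad W_k^{(p)} = \begin{bmatrix} V_k^{(p)} & 0 \\ 0 & * \end{bmatrix}, \quad 0 \le k < n. \]
For part (2) we may assume that $\mathbb{W}^{(p)}$, $p = 1, 2$, are $\{0 \le k < n\}$-cnu. Then we use again the fact that $X\mathfrak{H}^{(1)\perp} \subset \mathfrak{H}^{(2)\perp}$ and proceed as before. □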
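The relation $ZA^{(1)} = A^{(2)}Z$ in part (1) can be read off by comparing upper left entries in the intertwining relation $XW_n^{(1)} = W_n^{(2)}X$. The following is a sketch of that computation, using only the lower triangular block forms displayed in the proof; the starred entries are unspecified.

```latex
% Sketch: comparing the (1,1) entries of X W_n^{(1)} and W_n^{(2)} X.
\begin{align*}
X W_n^{(1)} &=
\begin{bmatrix} Z & 0 \\ * & * \end{bmatrix}
\begin{bmatrix} A^{(1)} & 0 \\ * & * \end{bmatrix}
= \begin{bmatrix} Z A^{(1)} & 0 \\ * & * \end{bmatrix},\\
W_n^{(2)} X &=
\begin{bmatrix} A^{(2)} & 0 \\ * & * \end{bmatrix}
\begin{bmatrix} Z & 0 \\ * & * \end{bmatrix}
= \begin{bmatrix} A^{(2)} Z & 0 \\ * & * \end{bmatrix},\\
&\Longrightarrow\quad Z A^{(1)} = A^{(2)} Z .
\end{align*}
```

The same comparison applied to $XW_k^{(1)} = W_k^{(2)}X$ for $0 \le k < n$ gives $ZV_k^{(1)} = V_k^{(2)}Z$.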

Observe that we have the equality $\tilde{X}|\tilde{\mathfrak{H}}^{(1)} = B$. In the framework of [10, Sec. II.1], the operator $\tilde{X}$ is said to be a lifting of $B$, and this lifting is contractive if $\|\tilde{X}\| \le 1$. A natural question arises: given a contraction $B$ satisfying the requirements of Proposition 3.3(2), can one construct a contractive lifting $\tilde{X} \in \mathcal{I}(\tilde{\mathbb{W}}^{(1)}, \tilde{\mathbb{W}}^{(2)})$? If one pursues the more modest goal of finding a contractive lifting $\tilde{X} \in \mathcal{I}(\tilde{W}_n^{(1)}, \tilde{W}_n^{(2)})$, the answer is in the affirmative, and a parametrization of all such contractive liftings can be extracted from [10, Chapter VI]. We describe the result below, under the additional assumption that $\mathbb{W}^{(2)}$ is $\{0 \le k < n\}$-cnu. In the notation adopted in this section, this amounts to the requirement that $\mathfrak{K}_1^{(2)} = \{0\}$.

Proposition 3.4. With the preceding notation, assume that $B \in \mathcal{I}(\tilde{A}^{(1)}, \tilde{A}^{(2)})$ is an operator of norm $\le 1$. The set of contractive liftings $\tilde{X} \in \mathcal{I}(\tilde{W}_n^{(1)}, \tilde{W}_n^{(2)})$ of $B$ is parameterized by (that is, it is in a canonical bijection with) the set of all contractive analytic functions $R : \mathbb{D} \to \mathcal{L}(\mathfrak{G}, \mathfrak{G}')$, where the spaces $\mathfrak{G}$ and $\mathfrak{G}'$ are given by the formulas
\[ \mathfrak{G} = D_B\tilde{\mathfrak{H}}^{(1)} \ominus D_B\tilde{W}_n^{(1)}\tilde{\mathfrak{H}}^{(1)}, \]
\[ \mathfrak{G}' = \bigl[ (\tilde{W}_n^{(2)} - \tilde{A}^{(2)})B\tilde{\mathfrak{H}}^{(1)} \oplus D_B\tilde{\mathfrak{H}}^{(1)} \bigr] \ominus \{ (\tilde{W}_n^{(2)} - \tilde{A}^{(2)})B\tilde{h}^{(1)} \oplus D_B\tilde{h}^{(1)} : \tilde{h}^{(1)} \in \tilde{\mathfrak{H}}^{(1)} \}, \]
and where $D_B = (I - B^*B)^{1/2}$.

One of the liftings considered above will yield an operator $X \in \mathcal{I}(\mathbb{W}^{(1)}, \mathbb{W}^{(2)})$ only when it also satisfies the conditions $\tilde{X}\tilde{W}_k^{(1)} = \tilde{W}_k^{(2)}\tilde{X}$ for $0 \le k < n$, and $B$ itself is subject to the supplementary conditions
\[ B\mathfrak{H}^{(1)\perp} = \{0\}, \qquad B\mathfrak{H}^{(1)} \subset \mathfrak{H}^{(2)}, \qquad B \in \mathcal{I}(\tilde{\mathbb{V}}^{(1)}, \tilde{\mathbb{V}}^{(2)}). \]
(See Proposition 3.3 and its proof.)

We continue the discussion now under the assumption that the operator $B$ does satisfy these additional conditions. The fact that $B \in \mathcal{I}(\tilde{\mathbb{V}}^{(1)}, \tilde{\mathbb{V}}^{(2)})$ is easily seen to imply that $D_B \in (\tilde{\mathbb{V}}^{(1)})'$. Using the notation in Proposition 3.4, these intertwining conditions imply
\[ \tilde{V}_k^{(1)}\mathfrak{G} \subset \mathfrak{G} \quad\text{and}\quad (\tilde{V}_k^{(2)} \oplus \tilde{V}_k^{(1)})\mathfrak{G}' \subset \mathfrak{G}' \quad\text{for } 0 \le k < n. \tag{3.1} \]
A routine application of techniques from [10, Chapter VI] yields the following result.

Proposition 3.5. With the above notation, assume that $\mathbb{W}^{(2)}$ is $\{0 \le k < n\}$-cnu. The set of contractions in $\mathcal{I}(\mathbb{W}^{(1)}, \mathbb{W}^{(2)})$ can be parameterized by pairs $(B, R)$, where $B \in \mathcal{I}(\tilde{A}^{(1)}, \tilde{A}^{(2)})$ is a contraction satisfying the conditions in Proposition 3.3(2) and $R$ is a parameter as in Proposition 3.4 satisfying the additional conditions
\[ (\tilde{V}_k^{(2)} \oplus \tilde{V}_k^{(1)})R(z) = R(z)\tilde{V}_k^{(1)}|\mathfrak{G}, \qquad 0 \le k < n, \quad z \in \mathbb{D}. \]

These results enable one to begin a systematic study of the invariant subspaces of bi-isometries. This study was already started in [3] and it will be continued in a forthcoming paper.

Remark 3.6. We emphasize again that the preceding result does not require that $\mathbb{W}^{(1)}$ is $\{0 \le k < n\}$-cnu.

4. The structure of bi-isometries

For the remainder of this paper, we focus on bi-isometries $\mathbb{W} = (W_0, W_1)$ on a Hilbert space $\mathfrak{K}$. Theorem 2.3 and its proof immediately yield the following result when $J = \{0\}$.

Proposition 4.1. Consider a bi-isometry $\mathbb{W} = (W_0, W_1)$ on $\mathfrak{K}$, let $\mathfrak{K} = \mathfrak{H} \oplus \mathfrak{H}^\perp$ be the von Neumann-Wold decomposition relative to $W_0$, so that $V_0 = W_0|\mathfrak{H}$ is a unilateral shift and $W_0|\mathfrak{H}^\perp$ is unitary. Denote by $\tilde{W}_0 = \tilde{V}_0 \oplus (W_0|\mathfrak{H}^\perp) \in \mathcal{L}(\tilde{\mathfrak{K}}) = \mathcal{L}(\tilde{\mathfrak{H}} \oplus \mathfrak{H}^\perp)$ the minimal unitary extension of $W_0$, and denote by $\tilde{W}_1 \in \mathcal{L}(\tilde{\mathfrak{K}})$ the unique isometric extension of $W_1$ which commutes with $\tilde{W}_0$. Define
\[ \mathfrak{M} = \bigvee_{k=0}^{\infty} \tilde{W}_1^k\tilde{\mathfrak{H}}, \qquad \mathfrak{N} = \tilde{\mathfrak{K}} \ominus \mathfrak{M}. \]
Then the subspace $\mathfrak{N} \subset \mathfrak{H}^\perp$ is reducing for $\mathbb{W}$, and $W_0|\mathfrak{N}$ is unitary. Moreover, $\mathfrak{N}$ is the largest reducing subspace for $\mathbb{W}$ with the property that $W_0|\mathfrak{N}$ is unitary.

Corollary 4.2. With the notation of the preceding result, the following assertions are equivalent.
(1) $\mathfrak{N} = \{0\}$.
(2) $\mathbb{W}$ is $\{0\}$-cnu.
(3) The operator $\tilde{W}_1^*$ is the minimal co-isometric extension of $\tilde{W}_1^*|\tilde{\mathfrak{H}}$.

The particular case of Proposition 2.9 for bi-isometries can be proved by repeated application of Proposition 4.1. This result was obtained first in the case of doubly commuting isometries in [19]; the general case appears in [11] (see also [17] for another proof).

Corollary 4.3. Consider a bi-isometry $\mathbb{W} = (W_0, W_1)$ on $\mathfrak{K}$. There exist unique reducing subspaces $\mathfrak{K}_{00}$, $\mathfrak{K}_{01}$, $\mathfrak{K}_{10}$, $\mathfrak{K}_{11}$ for $\mathbb{W}$ with the following properties.
(1) $W_0|\mathfrak{K}_{01}$ is a shift and $W_1|\mathfrak{K}_{01}$ is unitary.
(2) $W_0|\mathfrak{K}_{10}$ is unitary and $W_1|\mathfrak{K}_{10}$ is a shift.
(3) $W_0|\mathfrak{K}_{11}$ and $W_1|\mathfrak{K}_{11}$ are unitary.
(4) There is no nonzero reducing subspace $\mathfrak{N} \subset \mathfrak{K}_{00}$ for $\mathbb{W}$ such that either $W_0|\mathfrak{N}$ or $W_1|\mathfrak{N}$ is unitary.
(5) $\mathfrak{K} = \mathfrak{K}_{00} \oplus \mathfrak{K}_{01} \oplus \mathfrak{K}_{10} \oplus \mathfrak{K}_{11}$.

Proof. Proposition 4.1 yields a decomposition $\mathfrak{K} = \mathfrak{N}^\perp \oplus \mathfrak{N}$ into reducing subspaces for $\mathbb{W}$ such that $W_0|\mathfrak{N}$ is unitary and there is no reducing subspace $\mathfrak{N}' \subset \mathfrak{N}^\perp$ for $\mathbb{W}$ such that $W_0|\mathfrak{N}'$ is unitary. Apply this result with the pair $\mathbb{W}$ replaced by $(W_1|\mathfrak{N}, W_0|\mathfrak{N})$ and $(W_1|\mathfrak{N}^\perp, W_0|\mathfrak{N}^\perp)$, respectively, to obtain decompositions $\mathfrak{N} = \mathfrak{K}_{10} \oplus \mathfrak{K}_{11}$ and $\mathfrak{N}^\perp = \mathfrak{K}_{00} \oplus \mathfrak{K}_{01}$, respectively, into sums of reducing subspaces such that $W_1|\mathfrak{K}_{11}$ and $W_1|\mathfrak{K}_{01}$ are unitary. Moreover, there is no nontrivial reducing subspace $\mathfrak{M}$ for $\mathbb{W}$ contained in either $\mathfrak{K}_{10}$ or $\mathfrak{K}_{00}$ such that $W_1|\mathfrak{M}$ is unitary. We leave the remaining verifications to the interested reader. □

Consider a bi-isometry $\mathbb{W} = (W_0, W_1)$ on the Hilbert space $\mathfrak{K}$. As in Proposition 4.1, we consider the Wold decomposition $\mathfrak{K} = \mathfrak{H} \oplus \mathfrak{H}^\perp$ for $W_0$, with
\[ \mathfrak{H} = \bigoplus_{k=0}^{\infty} W_0^k\mathfrak{E}, \qquad \mathfrak{E} = \ker W_0^* = \mathfrak{K} \ominus W_0\mathfrak{K}, \]
and we set $V_0 = W_0|\mathfrak{H}$ and $A = P_{\mathfrak{H}}W_1|\mathfrak{H}$. Thus, $V_0$ is a unilateral shift and, as observed earlier, $A$ is a contraction in the commutant of $V_0$. We will call $(V_0, A)$ the characteristic pair associated to the bi-isometry $\mathbb{W}$. Thus, the characteristic pair is simply formed by a unilateral shift and a contraction in its commutant. The concept of unitary equivalence for these objects is the natural one: two such pairs are said to be unitarily equivalent if they are conjugated by a unitary operator (the same for the two operators of the pair). The pair $(W_1, W_0)$ is also a bi-isometry, and the above procedure associates to it a characteristic pair. The characteristic pairs of $(W_0, W_1)$ and $(W_1, W_0)$ are not unitarily equivalent in general.

For future reference, we restate Theorem 3.1 for the special case $n = 1$; that is, the case of bi-isometries.

Proposition 4.4. Let $V_0 \in \mathcal{L}(\mathfrak{H})$ be a unilateral shift, and $A \in \{V_0\}'$ a contraction. Denote by $\tilde{V}_0 \in \mathcal{L}(\tilde{\mathfrak{H}}_0)$ the minimal unitary extension of $V_0$, let $\tilde{A} \in \{\tilde{V}_0\}'$ be the extension of $A$, and set $D = (I - \tilde{A}^*\tilde{A})^{1/2}$, $\mathfrak{D} = (D\tilde{\mathfrak{H}}_0)^-$.
(1) The space $\mathfrak{D}$ is reducing for $\tilde{V}_0$.
(2) Define the Hilbert space
\[ \mathfrak{K} = \mathfrak{H} \oplus \mathfrak{D} \oplus \mathfrak{D} \oplus \cdots, \]
and the operators $W_0, W_1 \in \mathcal{L}(\mathfrak{K})$ by
\[ W_0(h \oplus d_0 \oplus d_1 \oplus \cdots) = V_0h \oplus \tilde{V}_0d_0 \oplus \tilde{V}_0d_1 \oplus \cdots, \]
\[ W_1(h \oplus d_0 \oplus d_1 \oplus \cdots) = Ah \oplus Dh \oplus d_0 \oplus d_1 \oplus \cdots. \]
Then $(W_0, W_1)$ is a $\{0\}$-cnu bi-isometry whose characteristic pair is unitarily equivalent to $(V_0, A)$.
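That the pair $(W_0, W_1)$ of Proposition 4.4 consists of commuting isometries can be checked directly. The following sketch assumes only the relations $AV_0 = V_0A$, $D\tilde{V}_0 = \tilde{V}_0D$, and $\tilde{A}h = Ah$, $\tilde{V}_0h = V_0h$ for $h \in \mathfrak{H}$.

```latex
% Sketch: (W_0, W_1) from Proposition 4.4 is a pair of commuting isometries.
\begin{align*}
W_0W_1(h\oplus d_0\oplus d_1\oplus\cdots)
 &= V_0Ah \oplus \tilde V_0Dh \oplus \tilde V_0d_0 \oplus \tilde V_0d_1\oplus\cdots,\\
W_1W_0(h\oplus d_0\oplus d_1\oplus\cdots)
 &= AV_0h \oplus DV_0h \oplus \tilde V_0d_0 \oplus \tilde V_0d_1\oplus\cdots,\\
\|W_1(h\oplus d_0\oplus d_1\oplus\cdots)\|^2
 &= \|\tilde Ah\|^2 + \langle (I-\tilde A^*\tilde A)h,h\rangle + \sum_{j\ge 0}\|d_j\|^2
  = \|h\|^2 + \sum_{j\ge 0}\|d_j\|^2 .
\end{align*}
```

The first two lines agree because $V_0A = AV_0$ and $DV_0h = D\tilde{V}_0h = \tilde{V}_0Dh$. The operator $W_0$ is isometric as a direct sum of isometries, since $\mathfrak{D}$ reduces the unitary $\tilde{V}_0$.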


We collect in the following statement some basic properties of the characteristic pair. These follow immediately from the results in Section 3.

Proposition 4.5. Let $\mathbb{W} = (W_0, W_1)$ and $\mathbb{W}' = (W_0', W_1')$ be two bi-isometries with characteristic pairs $(V_0, A)$ and $(V_0', A')$, respectively.
(1) The characteristic pair of $\mathbb{W} \oplus \mathbb{W}'$ is $(V_0 \oplus V_0', A \oplus A')$.
(2) If $\mathbb{W}$ is unitarily equivalent to $\mathbb{W}'$, then $(V_0, A)$ is unitarily equivalent to $(V_0', A')$.
(3) Assume in addition that $\mathbb{W}$ and $\mathbb{W}'$ are $\{0\}$-cnu. If $(V_0, A)$ is unitarily equivalent to $(V_0', A')$, then $\mathbb{W}$ is unitarily equivalent to $\mathbb{W}'$.
(4) For every pair $(V_0, A)$, where $V_0$ is a unilateral shift and $A \in \{V_0\}'$ is a contraction, there exists a bi-isometry $\mathbb{W}$ such that $(V_0, A)$ is the characteristic pair associated to $\mathbb{W}$. This bi-isometry can be chosen to be $\{0\}$-cnu.

The preceding proposition characterizes the reducing subspaces of a $\{0\}$-cnu bi-isometry in terms of its characteristic pair. General invariant subspaces of a bi-isometry are not characterized as easily. One difficulty is the fact that the restriction of a $\{0\}$-cnu bi-isometry to an invariant subspace is not always $\{0\}$-cnu. Assume then that we start with a $\{0\}$-cnu bi-isometry $\mathbb{W}$ on $\mathfrak{K}$, $\mathfrak{K}' \subset \mathfrak{K}$ is an invariant subspace for $\mathbb{W}$, and $\mathbb{W}' = \mathbb{W}|\mathfrak{K}'$. The inclusion operator $X \in \mathcal{L}(\mathfrak{K}', \mathfrak{K})$ is obviously an isometry in $\mathcal{I}(\mathbb{W}', \mathbb{W})$. Conversely, given an isometric intertwining between bi-isometries $X \in \mathcal{I}(\mathbb{W}^{(1)}, \mathbb{W})$, the range of $X$ is an invariant subspace for $\mathbb{W}$. Thus the description of invariant subspaces for bi-isometries can be achieved by understanding the structure of isometric operators intertwining two bi-isometries. In the terminology of Proposition 3.5, one needs to find the parameters $R$ which give rise to isometric liftings $\tilde{X}$ of a given contraction $B$. We presented in [3] some general results concerning this problem, and further results will appear in a forthcoming paper.

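5. Functional representation

The data in a characteristic pair $(V_0, A)$ on $\mathfrak{H}$ can alternately be encoded in a contractive analytic operator-valued function on the unit disk $\mathbb{D}$. Set $\mathfrak{E} = \mathfrak{H} \ominus V_0\mathfrak{H}$, and define operators $\Theta_k \in \mathcal{L}(\mathfrak{E})$ as follows:
\[ \Theta_k = P_{\mathfrak{E}}V_0^{*k}A|\mathfrak{E}, \qquad k \ge 0. \]
We can then associate to the pair $(V_0, A)$ the operator-valued analytic function
\[ \Theta(z) = \sum_{k=0}^{\infty} z^k\Theta_k = P_{\mathfrak{E}}(I - zV_0^*)^{-1}A|\mathfrak{E}, \qquad |z| < 1. \]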
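Two small computations may help in reading this definition. The first is the Neumann series behind the closed form; the second is a worked instance under the assumption, made here purely for illustration, that $A = V_0$ (which is a contraction in $\{V_0\}'$).

```latex
% (i) Neumann series: \|zV_0^*\| \le |z| < 1, hence
\[
\sum_{k=0}^{\infty} z^k V_0^{*k} = (I - zV_0^*)^{-1}, \qquad |z| < 1 .
\]
% (ii) Illustration with the assumed choice A = V_0.
% Facts used: V_0\mathfrak{E} \perp \mathfrak{E}, V_0^*V_0 = I, V_0^*\mathfrak{E} = \{0\}.
\begin{align*}
\Theta_0 &= P_{\mathfrak E}V_0|\mathfrak E = 0, &
\Theta_1 &= P_{\mathfrak E}V_0^{*}V_0|\mathfrak E = I_{\mathfrak E}, &
\Theta_k &= P_{\mathfrak E}V_0^{*(k-1)}|\mathfrak E = 0 \quad (k \ge 2),
\end{align*}
\[
\Theta(z) = zI_{\mathfrak E}, \qquad |z| < 1 .
\]
```

In this instance $\sup_{|z|<1}\|\Theta(z)\| = 1 = \|V_0\|$ whenever $\mathfrak{E} \ne \{0\}$.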

When (𝑉0 , 𝐴) is the characteristic pair of a bi-isometry 𝕎, Θ will be called the characteristic function of 𝕎; we will use the notation Θ = Θ𝕎 when it is necessary. If Θ is the characteristic function of 𝕎 = (𝑊0 , 𝑊1 ), then its coeﬃcients satisfy Θ𝑘 = 𝑃𝔈 𝑉0∗𝑘 𝐴∣𝔈 = 𝑃𝔈 𝑊0∗𝑘 𝑃ℌ 𝑊1 ∣𝔈 = 𝑃𝔈 𝑊0∗𝑘 𝑊1 ∣𝔈,

𝑘 ≥ 0,

Canonical Models for Bi-isometries

191

since 𝑃𝔈 𝑊0∗𝑘 𝑃ℌ = 𝑃𝔈 𝑊0∗𝑘 . In particular, the constant coeﬃcient Θ0 = Θ(0) = 𝑃𝔈 𝑊1 ∣𝔈 = (𝑊1∗ ∣𝔈)∗ is precisely the pivotal operator associated with the pair (𝑊1 , 𝑊0 ), as deﬁned in [2]. For the convenience of the reader, we recall that the adjoint of the pivotal operator associated with a bi-isometry (𝑊0 , 𝑊1 ) is deﬁned as 𝑊0∗ ∣ ker(𝑊1∗ ). The operator Θ(𝑧) is a contraction for ∣𝑧∣ < 1; in fact sup ∥Θ(𝑧)∥ = ∥𝐴∥,

∣𝑧∣<1

where 𝐴∗ = 𝑊1∗ ∣ℌ. Indeed, this equality follows easily from the fact that 𝐴 is unitarily equivalent to the Toeplitz operator with symbol Θ; see the discussion following Corollary 5.1 for a deﬁnition of Toeplitz operators. Unitary equivalence of characteristic functions is deﬁned in the natural way: Θ is unitarily equivalent to Θ′ if 𝑈 Θ(𝑧) = Θ′ (𝑧)𝑈, 𝑧 ∈ 𝔻, for some unitary operator 𝑈 . It may be useful to contrast this notion of unitary equivalence with the weaker notion of coincidence, which is the appropriate concept in the study of functional models for contractions [25]. Two operator-valued analytic functions Θ and Θ′ are said to coincide if there exist unitary operators 𝑈 and 𝑉 such that 𝑈 Θ(𝑧) = Θ′ (𝑧)𝑉 for all 𝑧 ∈ 𝔻. Proposition 4.5 can now be reformulated as follows. Corollary 5.1. Let 𝕎 and 𝕎′ be two bi-isometries with characteristic functions Θ and Θ′ , respectively. (1) The characteristic function of 𝕎 ⊕ 𝕎′ is given by Θ(𝑧) ⊕ Θ′ (𝑧) for 𝑧 ∈ 𝔻. (2) If 𝕎 is unitarily equivalent to 𝕎′ , then Θ is unitarily equivalent to Θ′ . (3) Assume in addition that 𝕎 and 𝕎′ are {0}-cnu. If Θ is unitarily equivalent to Θ′ then 𝕎 is unitarily equivalent to 𝕎′ . (4) For every contractive analytic function Θ : 𝔻 → ℒ(𝔈), there exists a {0}-cnu bi-isometry 𝕎 such that Θ𝕎 is unitarily equivalent to Θ. In order to translate the result of Proposition 4.4 into function theoretical terms we need some notation. First, given a separable, complex Hilbert space 𝔈, we denote as usual by 𝐻 2 (𝔈) the Hilbert space of all square summable power series with coeﬃcients in 𝔈. Given a contractive analytic function Θ : 𝔻 → ℒ(𝔈), the analytic Toeplitz operator 𝑇Θ ∈ ℒ(𝐻 2 (𝔈)) is deﬁned simply as pointwise multiplication by Θ. The particular case Θ(𝑧) = 𝑧𝐼𝔈 yields the unilateral shift 𝑆𝔈 . The minimal unitary extension of 𝑆𝔈 is the bilateral shift 𝑈𝔈 on the Hilbert space 𝐿2 (𝔈) of all square summable Laurent series with coeﬃcients in 𝔈. 
The extension of $T_\Theta$ which commutes with $U_{\mathfrak{E}}$ is the Laurent operator $L_\Theta$ with symbol $\Theta$. Now, the space $L^2(\mathfrak{E})$ can also be viewed as the space of square integrable $\mathfrak{E}$-valued functions $f : \mathbb{T} = \partial\mathbb{D} \to \mathfrak{E}$. When viewed in this manner, the operator $L_\Theta$ is given by $(L_\Theta f)(\zeta) = \Theta(\zeta) f(\zeta)$ for almost every $\zeta \in \mathbb{T}$, where the strong operator limit
\[ \Theta(\zeta) = \lim_{r \uparrow 1} \Theta(r\zeta) \]
exists almost everywhere. Similarly, the operator $D = (I - L_\Theta^* L_\Theta)^{1/2}$ is given as a multiplication operator by the strongly measurable operator-valued function
\[ \Delta(\zeta) = (I - \Theta(\zeta)^* \Theta(\zeta))^{1/2}, \quad \zeta \in \mathbb{T}. \]

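The infinite sum
\[ (D\widetilde{\mathfrak{H}})^- \oplus (D\widetilde{\mathfrak{H}})^- \oplus (D\widetilde{\mathfrak{H}})^- \oplus \cdots \]
appearing in Proposition 4.4 can then be identified with $H^2((L_\Delta L^2(\mathfrak{E}))^-)$. The elements in this space can be viewed as functions of two variables $(w, \zeta) \in \mathbb{D} \times \mathbb{T}$, analytic in $w$ and measurable in $\zeta$. We are now ready to reformulate Proposition 4.4.

Proposition 5.2. Let $\Theta : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$ be a contractive analytic function, and set $\Delta(\zeta) = (I - \Theta(\zeta)^* \Theta(\zeta))^{1/2}$, $\zeta \in \mathbb{T}$.
(1) The space $(L_\Delta L^2(\mathfrak{E}))^-$ is reducing for $U_{\mathfrak{E}}$.
(2) Define the Hilbert space $\mathfrak{K} = H^2(\mathfrak{E}) \oplus H^2((L_\Delta L^2(\mathfrak{E}))^-)$, and the operators $W_0, W_1 \in \mathcal{L}(\mathfrak{K})$ by
\[ W_0(f \oplus g) = a \oplus b, \qquad W_1(f \oplus g) = c \oplus d, \]
where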

\[ a(z) = z f(z), \quad b(w, \zeta) = \zeta g(w, \zeta), \quad c(z) = \Theta(z) f(z), \quad d(w, \zeta) = \Delta(\zeta) f(\zeta) + w g(w, \zeta) \]
for $z, w \in \mathbb{D}$ and $\zeta \in \mathbb{T}$. Then $(W_0, W_1)$ is a $\{0\}$-cnu bi-isometry whose characteristic function is unitarily equivalent to $\Theta$.

We will use the notation $\mathbb{W}(\Theta) = (W_0, W_1)$ for the bi-isometry described in the preceding statement. The mapping $\Theta \mapsto \mathbb{W}(\Theta)$ establishes a bijection between unitary equivalence classes of contractive analytic functions $\Theta : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$ and unitary equivalence classes of $\{0\}$-cnu bi-isometries $\mathbb{W}$. The formulas given for $\mathbb{W}(\Theta)$ allow, in principle, explicit calculations. A first instance is the following result.

Proposition 5.3. Let $\Theta : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$ be a contractive analytic function, and denote $(W_0, W_1) = \mathbb{W}(\Theta)$.
(1) The operator $W_1$ is unitary if and only if $\Theta$ is a constant unitary operator, that is, $\Theta(z) \equiv \Theta(0)$, and $\Theta(0)$ is a unitary operator in $\mathcal{L}(\mathfrak{E})$.
(2) The following conditions are equivalent:
(a) $\mathbb{W}(\Theta)$ is $\{1\}$-cnu.
(b) The contraction $\Theta(0)$ is completely nonunitary.

Proof. If $W_1$ is unitary, $W_0$ must be a unilateral shift, and therefore $V = S_{\mathfrak{E}}$, and $W = T_\Theta$. It is well known that $T_\Theta$ is unitary if and only if $\Theta$ is a constant unitary operator. To prove (2), assume first that $W_1|\mathfrak{N}$ is unitary for some nonzero reducing subspace $\mathfrak{N}$ of $\mathbb{W}(\Theta)$. Applying part (1) of Corollary 5.1 and part (1) of this


proposition, which has already been proved, shows that we can write $\Theta = \Theta' \oplus \Theta''$, with $\Theta''$ a constant unitary operator acting on a nonzero space. In particular $\Theta(0)$ has a nontrivial unitary direct summand. Conversely, assume that $\Theta(0)$ is not completely nonunitary, so that its restriction to some nonzero invariant subspace $\mathfrak{E}_0$ is a unitary operator. The contractive analytic function $\Theta_0 : \mathbb{D} \to \mathcal{L}(\mathfrak{E}_0)$ defined by $\Theta_0(z) = P_{\mathfrak{E}_0} \Theta(z)|\mathfrak{E}_0$ is such that $\Theta_0(0)$ is unitary. The maximum principle implies that $\Theta_0$ is constant, and $\mathfrak{E}_0$ reduces each $\Theta(z)$ to $\Theta_0$. A second application of part (1) of Corollary 5.1, as well as the already proved part (1) of this proposition, shows that $W_1|\mathfrak{N}$ is unitary for some nonzero reducing subspace $\mathfrak{N}$ of $\mathbb{W}(\Theta)$. □

If $\mathbb{W} = \mathbb{W}(\Theta)$, one can also calculate the characteristic function of $\check{\mathbb{W}} = (W_1, W_0)$, whose coefficients are
\[ (I - W_1 W_1^*) W_1^{*k} W_0 | \operatorname{ran}(I - W_1 W_1^*), \quad k \ge 0. \]
Thus this function is given by
\[ (I - W_1 W_1^*)(I - z W_1^*)^{-1} W_0 | \operatorname{ran}(I - W_1 W_1^*), \quad z \in \mathbb{D}. \]

In these formulas we use the abbreviation ‘ran’ for the range of an operator.

6. The structure of bi-shifts

Consider a bi-isometry $\mathbb{W} = (W_0, W_1)$. As seen earlier, the operators $W_0$ and $W_1$ do not need to be cnu, even if $\mathbb{W}$ is $\{0\}$-cnu and $\{1\}$-cnu. In this section we study bi-isometries for which both $W_0$ and $W_1$ are cnu, and such bi-isometries will be called bi-shifts. Clearly bi-shifts are both $\{0\}$-cnu and $\{1\}$-cnu. Note that the bi-shifts described in [11] are, in our terminology, doubly commuting bi-shifts; see Proposition 6.4 below.

Proposition 6.1. Assume that the bi-isometry $\mathbb{W}$ is both $\{0\}$-cnu and $\{1\}$-cnu. The following conditions are equivalent.
(1) $\mathbb{W}$ is a bi-shift.
(2) $W_0^{*n} \to 0$ and $W_1^{*n} \to 0$ as $n \to \infty$ in the strong operator topology.
(3) The characteristic function $\Theta_{\mathbb{W}}$ is inner (that is, $\Theta_{\mathbb{W}}(\zeta) \in \mathcal{L}(\mathfrak{E})$ is an isometry for almost every $\zeta \in \mathbb{T}$) and it enjoys the following property:
(∗) There exists no inner function $\Omega : \mathbb{D} \to \mathcal{L}(\mathfrak{F}, \mathfrak{E})$ such that $\mathfrak{F} \ne \{0\}$ and
\[ \Theta_{\mathbb{W}}(z) \Omega(z) = \Omega(z) U, \quad z \in \mathbb{D}, \]
with a unitary operator $U \in \mathcal{L}(\mathfrak{F})$.

Proof. The proposition is almost immediate, but we provide the brief argument below in order to illustrate the use of the results in the preceding section. The equivalence between (1) and (2) follows from the fact that an isometry is cnu if and only if it is a unilateral shift. Assume next that (2) holds so that, in particular, $W_0$ has no unitary part. With the notation of the preceding sections, $V_0 = W_0$, and $A = W_1$, so that $\mathbb{W}$ serves as its own characteristic pair. Passing to the functional model, we identify $W_0$ with the unilateral shift $S_{\mathfrak{E}}$, in which case $W_1 = T_\Theta$ for


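some operator-valued function $\Theta$. The function $\Theta$ must then be inner because $T_\Theta$ is an isometry. Assume now that a function $\Omega$ exists with the properties in (∗). Then it follows that $T_\Theta|\Omega H^2(\mathfrak{F})$ is a unitary operator, unitarily equivalent to $T_U \in \mathcal{L}(H^2(\mathfrak{F}))$. This contradicts the assumption that (2) holds, and we conclude that (3) is true. Finally, assume that (3) holds, but (2) does not. Since $S_{\mathfrak{E}}$ is completely nonunitary, the operator $T_\Theta$ must have a unitary part. The nonzero space
\[ \mathfrak{M} = \bigcap_{n=0}^{\infty} W_1^n \mathfrak{H} = \bigcap_{n=0}^{\infty} T_\Theta^n H^2(\mathfrak{E}) \]
on which this unitary part acts is obviously invariant for $S_{\mathfrak{E}}$, and the Beurling–Lax–Halmos theorem implies that $\mathfrak{M} = \Omega H^2(\mathfrak{F})$ for some inner function $\Omega : \mathbb{D} \to \mathcal{L}(\mathfrak{F}, \mathfrak{E})$ with $\mathfrak{F} \ne \{0\}$. The operator $T_\Omega^{-1} T_\Theta T_\Omega$ is then a unitary operator in the commutant of $S_{\mathfrak{F}}$, and such operators are of the form $T_U$ for some unitary operator $U \in \mathcal{L}(\mathfrak{F})$. We conclude that $T_\Theta T_\Omega = T_\Omega T_U$, contrary to (3). □

Remark 6.2. The example $\Theta(z) \equiv I_{\mathfrak{E}}$, $z \in \mathbb{D}$, shows that condition (∗) is needed in the third statement of the preceding proposition.

Proposition 6.1 shows that the construction of bi-shifts requires the construction of appropriate inner functions $\Theta : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$. We start with some simple examples. Fix a nonzero Hilbert space $\mathfrak{E}$ and an inner function $\vartheta \in H^\infty$. We can then form the bi-isometry $\mathbb{W}(\vartheta \otimes I_{\mathfrak{E}}) = (S_{\mathfrak{E}}, T_{\vartheta \otimes I_{\mathfrak{E}}})$. This is easily seen to be a bi-shift provided that $\vartheta$ is not constant.

Proposition 6.3. Given two nonconstant inner functions $\vartheta_1, \vartheta_2 \in H^\infty$, the bi-shifts $\mathbb{W}(\vartheta_1 \otimes I_{\mathfrak{E}})$ and $\mathbb{W}(\vartheta_2 \otimes I_{\mathfrak{E}})$ are quasi-similar if and only if $\vartheta_1 = \vartheta_2$.

Proof. Let $X \in \mathcal{I}(\mathbb{W}(\vartheta_1 \otimes I_{\mathfrak{E}}), \mathbb{W}(\vartheta_2 \otimes I_{\mathfrak{E}}))$ be a quasi-affinity. We have $X \in (S_{\mathfrak{E}})'$, and therefore we have $X = T_\Xi$ for some outer function $\Xi : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$. The relation $X T_{\vartheta_1 \otimes I_{\mathfrak{E}}} = T_{\vartheta_2 \otimes I_{\mathfrak{E}}} X$ implies that
\[ (\vartheta_1(z) - \vartheta_2(z)) \Xi(z) = \Xi(z)(\vartheta_1(z) \otimes I_{\mathfrak{E}}) - (\vartheta_2(z) \otimes I_{\mathfrak{E}}) \Xi(z) = 0, \quad z \in \mathbb{D}. \]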

The operator $\Xi(z)$ has dense range for every $z \in \mathbb{D}$, and we conclude that $\vartheta_1 = \vartheta_2$. The converse is immediate. □

As pointed out earlier, the bi-isometries $\mathbb{W}(\Theta_1)$ and $\mathbb{W}(\Theta_2)$, $\Theta_1, \Theta_2 : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$, are unitarily equivalent if and only if the functions $\Theta_1$ and $\Theta_2$ are unitarily equivalent, that is, $U\Theta_1(z) = \Theta_2(z)U$ for a unitary operator $U$ independent of $z \in \mathbb{D}$. Similarity of the two bi-isometries requires the existence of an invertible outer function $\Psi : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$ such that $\Psi(z)\Theta_1(z) = \Theta_2(z)\Psi(z)$ for all $z \in \mathbb{D}$.

Another important family of bi-shifts is defined on the Hardy space $H^2(\mathbb{D}^2) \otimes \mathfrak{F}$ by the formula $\mathbb{W}_{\mathfrak{F}} = (W_0, W_1)$, where
\[ (W_j f)(z_0, z_1) = z_j f(z_0, z_1), \quad f \in H^2(\mathbb{D}^2) \otimes \mathfrak{F},\ (z_0, z_1) \in \mathbb{D}^2. \]
This class of bi-isometries has a simple characterization. Parts of the following proposition are known. We include a brief argument for the reader's convenience.


Proposition 6.4. Assume that the bi-isometry $\mathbb{W}$ is both $\{0\}$-cnu and $\{1\}$-cnu. The following conditions are equivalent.
(1) $\mathbb{W}$ is unitarily equivalent to $\mathbb{W}_{\mathfrak{F}}$ for some Hilbert space $\mathfrak{F}$.
(2) $\mathbb{W}$ is doubly commuting, that is, $W_0 W_1^* = W_1^* W_0$.
(3) The characteristic function $\Theta_{\mathbb{W}}$ is a constant isometry.
(4) The pivotal operator of $(W_1, W_0)$ is an isometry.
(5) The pivotal operator of $\mathbb{W}$ is an isometry.

Proof. It is immediate that (1) implies (2). For the remainder of the argument we identify $\mathbb{W}$ with $\mathbb{W}(\Theta)$, where $\Theta : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$ is a contractive analytic function. Thus $\mathbb{W}$ acts on the space $\mathfrak{K}$ described in Proposition 5.2. Assume now that (2) holds. In this case the kernel of $W_0^*$ must be a reducing subspace for $W_1$. This kernel consists of functions in $\mathfrak{K}$ of the form $e \oplus 0 \oplus 0 \oplus \cdots$, with $e \in \mathfrak{E}$ a constant. Since $W_1(e \oplus 0 \oplus 0 \oplus \cdots) = \Theta e \oplus \Delta e \oplus 0 \oplus \cdots$, we deduce immediately that $\Theta$ is constant and $\Delta = 0$, so that (3) is true. Assume now that (3) holds, so that $\Theta$ is a constant isometry. It follows that $\Theta(0)$ is in particular an isometry. Condition (4) follows because $\Theta(0)$ is the pivotal operator of the pair $(W_1, W_0)$. Assume that (4) holds, so that $\Theta(0)$ is an isometry. Then it follows from the maximum principle that $\Theta(z) = \Theta(0)$ for all $z$. In particular, the function $\Theta$ is inner, and hence $W_0 = S_{\mathfrak{E}}$ and $W_1 = T_\Theta$. Note that any orthogonal decomposition $\Theta(0) = \Theta_1 \oplus \Theta_2$ yields a decomposition $T_\Theta = T_{\Theta_1} \oplus T_{\Theta_2}$. If $\Theta_1$ is unitary, the operator $T_{\Theta_1}$ is unitary as well, and therefore $\Theta_1$ must act on the space $\{0\}$ because $\mathbb{W}$ was assumed to be $\{1\}$-cnu. We deduce that $\Theta(0)$ is cnu, and thus it is unitarily equivalent to $S_{\mathfrak{F}}$ for some Hilbert space $\mathfrak{F}$, and in this case $\mathbb{W}(\Theta)$ is unitarily equivalent to $\mathbb{W}_{\mathfrak{F}}$. So far we have proved that conditions (1)–(4) are equivalent. The equivalence of (5) with these conditions follows from the symmetry of (2). □

The example of the constant function $\Theta(z) \equiv I$, $z \in \mathbb{D}$, shows why the assumption that $\mathbb{W}$ is both $\{0\}$-cnu and $\{1\}$-cnu is needed in the preceding proposition.
If two isometries are quasi-similar and one of them is a shift, then the other one is a shift as well. It follows that a bi-isometry quasi-similar to a bi-shift must also be a bi-shift. We conclude this section with some simple properties of those bi-shifts which are similar to $\mathbb{W}_{\mathfrak{F}}$ for some $\mathfrak{F}$.

Proposition 6.5. Let $\mathbb{W} = \mathbb{W}(\Theta)$ be a bi-shift, where $\Theta : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$ is an inner analytic function. Assume further that $\mathbb{W}$ is similar to $\mathbb{W}_{\mathfrak{F}}$ for some Hilbert space $\mathfrak{F}$. Then the following assertions are true.
(1) The pivotal operator is similar to a unilateral shift.
(2) There exists a bounded analytic function $\Omega : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$ such that
\[ \Omega(z)\Theta(z) = I, \quad z \in \mathbb{D}. \]
(3) The operator $\Theta(z)$ is similar to a unilateral shift for every $z \in \mathbb{D}$.


Proof. We argue first that two similar bi-isometries have similar pivotal operators. Indeed, assume that $X \in \mathcal{I}(\mathbb{W}^{(1)}, \mathbb{W}^{(2)})$ is an invertible operator. We have then $X \operatorname{ran} W_0^{(1)} = \operatorname{ran} W_0^{(2)}$, and this implies that $P_{\ker W_0^{(2)*}} X | \ker W_0^{(1)*}$ is an invertible operator intertwining the two pivotal operators. Now, the pivotal operator of $\mathbb{W}_{\mathfrak{F}}$ is a shift, and the preceding observation implies (1). By symmetry, we also deduce that $\Theta(0)$ is similar to a shift, and then (2) follows from the main result of [24]. To verify (3), we observe that the bi-shift $\mathbb{W}_{\mathfrak{F}}$ is unitarily equivalent to $\mathbb{W}(\Theta_1)$, where $\Theta_1(z) \equiv S$ for $z \in \mathbb{D}$, with $S \in \mathcal{L}(\mathfrak{E})$ a unilateral shift. Let $X \in \mathcal{I}(\mathbb{W}(\Theta), \mathbb{W}(\Theta_1))$ be an invertible operator. We have $X \in (W_0)'$, and therefore the operator $X$ is of the form $X = T_\Xi$ for some bounded analytic function $\Xi : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$. The fact that $X$ is invertible implies that $\Xi(z)$ is invertible for every $z \in \mathbb{D}$, and the relation $X T_\Theta = T_{\Theta_1} X$ shows that $\Theta(z)$ is similar to $S = \Theta_1(z)$. The proposition is proved. □

A different approach to the similarity between a contraction and an isometry is described in [14]. This approach may also be useful in the study of similarities between bi-shifts. In contrast to certain results from the model theory of contractions, condition (2) in the above proposition is not sufficient to imply the similarity of $\mathbb{W}(\Theta)$ to a bi-shift of the form $\mathbb{W}_{\mathfrak{F}}$. This is illustrated by the following example.

Example 6.6. Define $\Theta(z) \in \mathcal{L}(\ell^2)$ using the infinite matrix
\[ \Theta(z) = \begin{bmatrix} \frac{3}{5}\varphi(z) & 0 & 0 & 0 & \cdots \\ \frac{4}{5}z & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots & \end{bmatrix}, \quad z \in \mathbb{D}, \]
where $\varphi \in H^\infty$ is an inner function such that $\varphi(0) \ne 0$. The operator $\Theta(0)$ has the eigenvalue $3\varphi(0)/5$ and therefore it is not similar to a shift. However $\Theta$ satisfies condition (2) in the preceding proposition. One left inverse is given by
\[ \Omega(z) = \begin{bmatrix} \frac{5}{3\varphi(0)} & \eta(z) & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}, \quad z \in \mathbb{D}, \]
with
\[ \eta(z) = \frac{5}{4z}\left[ 1 - \frac{\varphi(z)}{\varphi(0)} \right], \quad z \in \mathbb{D}. \]

The reader will verify without diﬃculty that 𝕎(Θ) is indeed a bi-shift.
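The left-inverse identity $\Omega(z)\Theta(z) = I$ of Example 6.6 can be checked numerically on finite truncations. The sketch below is an illustration only, under stated assumptions: the inner function is taken to be the single Blaschke factor $\varphi(z) = (z - a)/(1 - \bar{a}z)$ with $a = 1/2$ (a hypothetical choice, any inner $\varphi$ with $\varphi(0) \ne 0$ would serve), and $\eta$ is computed from the requirement that the $(1,1)$ entry of $\Omega(z)\Theta(z)$ equal 1, which gives $\eta(z) = \frac{5}{4z}[1 - \varphi(z)/\varphi(0)]$. The function names are ours. Because each column of $\Theta(z)$ has at most two nonzero entries, truncating $\Theta$ to $n$ columns and $n+1$ rows (and $\Omega$ to $n$ rows and $n+1$ columns) loses nothing in the product.

```python
import numpy as np

def blaschke(z, a=0.5):
    """Assumed inner function: a single Blaschke factor with phi(0) = -a != 0."""
    return (z - a) / (1 - np.conj(a) * z)

def eta(z, phi=blaschke):
    """Entry forced by (Omega Theta)_{11} = 1; valid for z != 0."""
    return 5 / (4 * z) * (1 - phi(z) / phi(0))

def theta(z, n, phi=blaschke):
    """Truncation of Theta(z) to its first n columns and n + 1 rows."""
    t = np.zeros((n + 1, n), dtype=complex)
    t[0, 0] = 3 * phi(z) / 5
    t[1, 0] = 4 * z / 5
    for k in range(1, n):
        t[k + 1, k] = 1  # 0-based column k has its single 1 in row k + 1
    return t

def omega(z, n, phi=blaschke):
    """Truncation of the left inverse Omega(z) to n rows and n + 1 columns."""
    w = np.zeros((n, n + 1), dtype=complex)
    w[0, 0] = 5 / (3 * phi(0))
    w[0, 1] = eta(z, phi)
    for k in range(1, n):
        w[k, k + 1] = 1
    return w

# Left inverse at an interior point, and the isometry property on the circle.
z = 0.3 + 0.4j
print(np.allclose(omega(z, 6) @ theta(z, 6), np.eye(6)))
zeta = np.exp(0.7j)
t = theta(zeta, 6)
print(np.allclose(t.conj().T @ t, np.eye(6)))
```

The second check reflects the fact that the first column of $\Theta(\zeta)$ has squared norm $\frac{9}{25}|\varphi(\zeta)|^2 + \frac{16}{25} = 1$ on the unit circle, so $\Theta$ is inner.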


7. The unitary invariants of a functional model

Bi-isometries $\mathbb{W} = (W_0, W_1)$ with the property that the product $W_0 W_1$ is a shift were classified, up to unitary equivalence, in [4]; see also [2]. The parameters in that classification are pairs $(U, P)$, where $U$ is a unitary operator on a Hilbert space $\mathfrak{D}$, and $P$ is an orthogonal projection on $\mathfrak{D}$. In this section we consider the characteristic functions of such bi-isometries. The bi-isometry $\mathbb{W} = (W_0, W_1)$ associated to the pair $(U, P)$ acts on $H^2(\mathfrak{D})$ and is defined by
\[ (W_0 f)(z) = U(zP + P^\perp) f(z), \quad (W_1 f)(z) = (P + zP^\perp) U^* f(z), \quad f \in H^2(\mathfrak{D}),\ z \in \mathbb{D}. \]
The space $\mathfrak{D}$ is identified with the space $\ker(W_0 W_1)^*$ of constant functions in $H^2(\mathfrak{D})$, while the range of $P^\perp$ is identified with $\ker W_1^*$. For a constant function $f_0 \in \mathfrak{D}$ we have
\[ U f_0 = W_0 f_0, \quad f_0 \in P^\perp \mathfrak{D}, \tag{7.1} \]
while for $f_0 \in P\mathfrak{D}$ we have $W_0 f_0 = zUf_0 = W_0 W_1 U f_0$. Therefore the vector $f_0 = W_1 U f_0$ is in the range of $W_1$, and we find that
\[ U f_0 = W_1^* f_0, \quad f_0 \in P\mathfrak{D}. \tag{7.2} \]
From this we easily conclude that $\ker W_0^* = UP\mathfrak{D} = W_1^* P\mathfrak{D}$. By reversing the order of these observations we easily deduce the following result.

Proposition 7.1. Let $\mathbb{W} = (W_0, W_1)$ be a bi-isometry on $\mathfrak{H}$. Define spaces
\[ \mathfrak{D} = \ker(W_0 W_1)^*, \quad \mathfrak{E} = \ker W_0^*, \quad \mathfrak{F} = \ker W_1^*. \]
(1) We have $\mathfrak{D} = \mathfrak{E} \oplus W_0 \mathfrak{F} = W_1 \mathfrak{E} \oplus \mathfrak{F}$.
(2) The operator $U : \mathfrak{D} \to \mathfrak{D}$ defined by
\[ U(W_1 e + f) = e + W_0 f, \quad e \in \mathfrak{E},\ f \in \mathfrak{F}, \]
is unitary.
(3) The bi-isometry associated with the pair $(U, P_{W_1 \mathfrak{E}})$ on $\mathfrak{D}$ is unitarily equivalent to the cnu part of $\mathbb{W}$.

For further calculation, it is convenient to replace the space $\mathfrak{D}$ by the external direct sum $\mathfrak{E} \oplus \mathfrak{F}$ via the identification $\Phi : e \oplus f \mapsto W_1 e + f$. With this identification we obviously have
\[ \Phi^* P \Phi = \begin{bmatrix} I_{\mathfrak{E}} & 0 \\ 0 & 0 \end{bmatrix}. \]

Corollary 7.2. With the notation of Proposition 7.1, we have
\[ \Phi^* U \Phi = \begin{bmatrix} W_1^*|\mathfrak{E} & W_1^* W_0|\mathfrak{F} \\ (I - W_1 W_1^*)|\mathfrak{E} & (I - W_1 W_1^*) W_0|\mathfrak{F} \end{bmatrix}. \]


Proof. For a vector $e \in \mathfrak{E}$ we have
\[ U\Phi(e \oplus 0) = U W_1 e = e = W_1 W_1^* e + (I - W_1 W_1^*) e, \]
and this is precisely the decomposition of this vector as an element of the space $W_1 \mathfrak{E} \oplus \mathfrak{F}$. Therefore $\Phi^* U \Phi(e \oplus 0) = W_1^* e \oplus (I - W_1 W_1^*) e$. To verify the identity involving the second column, we use a similar calculation:
\[ U\Phi(0 \oplus f) = Uf = W_0 f = W_1 W_1^* W_0 f + (I - W_1 W_1^*) W_0 f, \quad f \in \mathfrak{F}. \]
In these calculations we made use of (7.1) and (7.2). □

Let us consider now a contractive analytic function $\Theta : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$ and the functional model $\mathbb{W}(\Theta) = (W_0, W_1)$. In order to identify the space $\mathfrak{F}$, it will be useful to recall a few facts from the theory of functional models of contraction operators. Let us introduce the auxiliary space
\[ \mathfrak{K} = H^2(\mathfrak{E}) \oplus (L_\Delta L^2(\mathfrak{E}))^-, \]
which can be viewed as a subspace of $\mathfrak{H} = H^2(\mathfrak{E}) \oplus H^2((L_\Delta L^2(\mathfrak{E}))^-)$. Obviously, the space $\mathfrak{K}$ is reducing for $W_0$. The space $\mathfrak{G} = \{\Theta u \oplus \Delta u : u \in H^2(\mathfrak{E})\}$ is invariant for $W_0$, and therefore $\mathfrak{H}(\Theta) = \mathfrak{K} \ominus \mathfrak{G}$ is invariant for $W_0^*$. The compression of $W_0$ to this space is denoted $S(\Theta)$, and it is called the functional model associated with $\Theta$. It is known that $S(\Theta)$ is a completely nonunitary contraction, and the characteristic function of $S(\Theta)$ coincides (in the sense defined in [25]) with the purely contractive part of the function $\Theta$. A vector $u \oplus v \in \mathfrak{K}$ belongs to $\mathfrak{H}(\Theta)$ if and only if the measurable function $\Theta^* u + \Delta v$ is orthogonal to $H^2(\mathfrak{E})$. In other words, we have a Fourier expansion
\[ \Theta^* u + \Delta v = \sum_{n=-\infty}^{-1} \zeta^n e_n, \]
with $e_n \in \mathfrak{E}$. We will use the notation $(\Theta^* u + \Delta v)_{-1}$ for $e_{-1}$.

Lemma 7.3. Viewed as a subspace of $\mathfrak{H}$, we have $\mathfrak{H}(\Theta) = \mathfrak{F}$. Moreover, $S(\Theta)$ is precisely the pivotal operator associated with the bi-isometry $\mathbb{W}(\Theta)$: $S(\Theta)^* = W_0^*|\mathfrak{F}$.

Proof. In order to identify $\mathfrak{F}$, we consider its orthogonal complement, which is easily calculated as
\[ \mathfrak{F}^\perp = W_1 \mathfrak{H} = \mathfrak{G} \oplus W_1 H^2((L_\Delta L^2(\mathfrak{E}))^-). \]
The conclusion $\mathfrak{H}(\Theta) = \mathfrak{F}$ then follows because
\[ \mathfrak{H} = \mathfrak{K} \oplus W_1 H^2((L_\Delta L^2(\mathfrak{E}))^-). \]


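The identification of the pivotal operator follows now from the fact that $\mathfrak{H}(\Theta) = \mathfrak{F}$ is invariant for $W_0^*$. □

Proposition 7.4. Let $\Theta : \mathbb{D} \to \mathcal{L}(\mathfrak{E})$ be a contractive analytic function, and $\mathbb{W}(\Theta) = (W_0, W_1)$ the corresponding model bi-isometry. Then $\mathbb{W}(\Theta)$ is unitarily equivalent to the bi-isometry associated with the pair $(U, P)$ of operators on $\mathfrak{E} \oplus \mathfrak{H}(\Theta)$ defined as follows:
\[ \begin{aligned} U(e \oplus 0) &= \Theta(0)^* e \oplus [(e - \Theta\Theta(0)^* e) \oplus (-\Delta\Theta(0)^* e)], & e &\in \mathfrak{E}, \\ U(0 \oplus (u \oplus v)) &= (\Theta^* u + \Delta v)_{-1} \oplus S(\Theta)(u \oplus v), & u \oplus v &\in \mathfrak{H}(\Theta), \end{aligned} \tag{7.3} \]
and
\[ P = \begin{bmatrix} I_{\mathfrak{E}} & 0 \\ 0 & 0 \end{bmatrix}. \]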

Proof. This proof amounts to an identification of the matrix entries in Corollary 7.2. It is convenient to regard $\mathfrak{H}$ as an infinite orthogonal sum
\[ \mathfrak{H} = H^2(\mathfrak{E}) \oplus (L_\Delta L^2(\mathfrak{E}))^- \oplus (L_\Delta L^2(\mathfrak{E}))^- \oplus \cdots, \]
relative to which the operator $W_1$ has the matrix
\[ W_1 = \begin{bmatrix} T_\Theta & 0 & 0 & \cdots \\ L_\Delta|H^2(\mathfrak{E}) & 0 & 0 & \cdots \\ 0 & I_{(L_\Delta L^2(\mathfrak{E}))^-} & 0 & \cdots \\ \vdots & \vdots & \ddots & \end{bmatrix}. \]
We now apply the formulas in Corollary 7.2 to calculate the entries of the matrix $U$ explicitly. Thus, for $e \in \mathfrak{E}$, which is viewed now as a subspace of $\mathfrak{H}$, we obtain by applying the matrix above
\[ W_1^* e = T_\Theta^* e = P_{H^2(\mathfrak{E})} \Theta^* e = \Theta(0)^* e, \]
and
\[ (I - W_1 W_1^*) e = e - W_1 \Theta(0)^* e. \]
If $u \oplus v \in \mathfrak{H}(\Theta)$ then clearly
\[ (I - W_1 W_1^*) W_0 (u \oplus v) = P_{\mathfrak{H}(\Theta)} W_0 (u \oplus v) = S(\Theta)(u \oplus v). \]
For the first direct summand in the right-hand side of (7.3), let us write $W_0(u \oplus v) = u' \oplus v'$ and note that
\[ W_1^* W_0 (u \oplus v) = W_1^*(u' \oplus v') = P_{H^2(\mathfrak{E})}(\Theta^* u' + \Delta v'). \]
If we write the Fourier expansion
\[ \Theta^* u + \Delta v = \sum_{n=-\infty}^{-1} \zeta^n e_n, \]
then
\[ \Theta^* u' + \Delta v' = \sum_{n=-\infty}^{-1} \zeta^{n+1} e_n, \]
and the projection of this function onto $H^2(\mathfrak{E})$ is precisely $e_{-1} = (\Theta^* u + \Delta v)_{-1}$, as stated. □


Remark 7.5. The above results show us how to calculate the unitary invariants given the characteristic function $\Theta$. Conversely, one can write explicit formulas for the characteristic function of a cnu pair $\mathbb{W} = (W_0, W_1)$ with given unitary invariants. We begin by noting a general formula for this characteristic function $\Theta$. We have
\[ \Theta(z) = P_{\mathfrak{E}} (I - zW_0^*)^{-1} W_1|\mathfrak{E}, \quad z \in \mathbb{D}, \]
where $\mathfrak{E} = \ker W_0^*$. Assume now that $\mathfrak{D}$ is a Hilbert space, $U \in \mathcal{L}(\mathfrak{D})$ is unitary, $P \in \mathcal{L}(\mathfrak{D})$ is an orthogonal projection, and $\mathbb{W}$ is the bi-isometry on $H^2(\mathfrak{D})$ defined using the pair $(U, P)$. Then the space $\mathfrak{E} = \ker W_0^*$ is identified with the range of $UPU^*$, provided that we view the elements of $\mathfrak{D}$ as constant functions in $H^2(\mathfrak{D})$. It is easy then to see that $W_1 e = U^* e$, $e \in \mathfrak{E}$, so that $W_1 \mathfrak{E}$ also consists of constant functions in $H^2(\mathfrak{D})$. Next we observe that the operator $W_0^*$ leaves invariant the space $\mathfrak{D}$ of constant functions, and in fact $W_0^*|\mathfrak{D} = P^\perp U^*$. We conclude that
\[ (I - zW_0^*)^{-1}|\mathfrak{D} = (I - zP^\perp U^*)^{-1}, \quad z \in \mathbb{D}. \]
Note finally that $P_{\mathfrak{E}}|\mathfrak{D} = UPU^*$. These observations allow us to write the formula
\[ \Theta(z) = UPU^* (I - zP^\perp U^*)^{-1} U^*|\mathfrak{E}, \quad z \in \mathbb{D}, \]
which involves only the unitary invariants of $\mathbb{W}$. Note further that the unitary invariants of the pair $(W_1, W_0)$ are $(U^*, UP^\perp U^*)$. Thus, starting with a pair $(W_0, W_1)$ whose characteristic function is known, one can use Proposition 3.4 to calculate the unitary invariants of $(W_0, W_1)$, and then an application of the above formulas yields an explicit expression for the characteristic function of $(W_1, W_0)$.
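Since the last formula of Remark 7.5 involves only $U$ and $P$, it can be evaluated with ordinary matrices whenever $\mathfrak{D}$ is finite-dimensional. The following is a minimal sketch under that assumption; the helper name `theta_from_invariants` is hypothetical. As a test case it uses the $2 \times 2$ pair $U_0(\zeta) = \begin{bmatrix} 0 & \zeta \\ 1 & 0 \end{bmatrix}$, $P_0 = \operatorname{diag}(1, 0)$ that appears in Section 8; a direct hand calculation for this pair gives $\mathfrak{E} = \operatorname{span}\{e_2\}$ and $\Theta(z) = \bar{\zeta} z$ on $\mathfrak{E}$.

```python
import numpy as np

def theta_from_invariants(U, P, z):
    """Return the matrix U P U^* (I - z P^perp U^*)^{-1} U^* acting on D.

    Its restriction to E = ran(U P U^*) is the value Theta(z) of the
    characteristic function described in Remark 7.5.
    """
    n = U.shape[0]
    identity = np.eye(n)
    U_star = U.conj().T
    proj_E = U @ P @ U_star  # orthogonal projection onto E = ker W_0^*
    # Solve (I - z P^perp U^*) Y = U^* rather than forming the inverse.
    Y = np.linalg.solve(identity - z * (identity - P) @ U_star, U_star)
    return proj_E @ Y

zeta = np.exp(0.3j)
U0 = np.array([[0, zeta], [1, 0]])
P0 = np.diag([1.0, 0.0])
z = 0.2 + 0.1j
M = theta_from_invariants(U0, P0, z)
# E is spanned by the second basis vector, so Theta(z) is the (2,2) entry.
print(M[1, 1], np.conj(zeta) * z)
```

In this example the characteristic function is the scalar inner function $z \mapsto \bar{\zeta} z$, in agreement with a bi-isometry of the form $(\zeta S, S)$ with $S$ a unilateral shift of multiplicity one.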

8. Examples of irreducible bi-isometries and direct integral decompositions

For a single isometry, that is, when $I$ has only one element, it follows from the von Neumann–Wold theorem that there is, up to unitary equivalence, only one nonunitary irreducible isometry. However, when $I$ has two or more elements there are many irreducible families of commuting isometries which do not consist of unitary operators. We will illustrate this in the case of bi-isometries $\mathbb{V} = (V_0, V_1)$. These examples also illustrate the fact that the characteristic function is not always the best approach to the study of bi-isometries. Particular bi-isometries may be better understood using special methods.

We recall that a complete unitary invariant of a completely nonunitary bi-isometry $\mathbb{V} = (V_0, V_1)$ is given by a pair $(U, P)$, where $U$ is a unitary operator on some Hilbert space $\mathfrak{D}$, and $P$ is an orthogonal projection on $\mathfrak{D}$. The bi-isometry determined by $(U, P)$ acts on $H^2(\mathfrak{D})$ as follows:
\[ (V_0 f)(z) = U(zP + P^\perp) f(z), \quad (V_1 f)(z) = (P + zP^\perp) U^* f(z), \quad f \in H^2(\mathfrak{D}),\ z \in \mathbb{D}, \]
where $P^\perp = I_{\mathfrak{D}} - P$. The bi-isometry $\mathbb{V}$ is irreducible if and only if the pair $(U, P)$ is irreducible. Note for further use that the product $V_0 V_1$ is precisely multiplication by the variable $z$. (These unitary invariants classify more general bi-isometries than the completely nonunitary ones; see [4, 2].)

For our illustration we will let $U$ be the bilateral shift on the space $L^2$ of all square integrable functions on the unit circle $\mathbb{T}$; thus
\[ (Uf)(\zeta) = \zeta f(\zeta), \quad f \in L^2,\ \zeta \in \mathbb{T}. \]

We will denote by $e_j(\zeta) = \zeta^j$ the standard orthonormal basis in $L^2$, and for every set $A \subset \mathbb{Z}$ of integers we denote by $Q_A$ the orthogonal projection onto the space generated by $\{e_j : j \in A\}$. In this case $V_0$ and $V_1$ are uniquely determined by the relations $V_1 e_{n+1} = e_n$ if $n \in A$ and $V_0 e_n = e_{n+1}$ if $n \notin A$.

Proposition 8.1. Two pairs $(U, Q_A)$, $(U, Q_B)$ are unitarily equivalent if and only if there exists $n \in \mathbb{Z}$ such that $B = \{i + n : i \in A\}$.

Proof. Sufficiency is obvious: if $B = A + n$ then the operator $U^n$ implements the unitary equivalence of the two pairs. Conversely, assume that there is a unitary operator $\Phi$ on $L^2$ such that $\Phi U = U \Phi$ and $\Phi Q_A = Q_B \Phi$. There exists then a function $\varphi \in L^\infty$ such that $|\varphi| = 1$ almost everywhere and $\Phi f = \varphi f$ for every $f \in L^2$. The fact that $\varphi e_i$ is in the range of $Q_B$ for $i \in A$ means that
\[ (\varphi, e_{j-i}) = (\varphi e_i, e_j) = 0, \quad i \in A,\ j \notin B. \]
Similarly, $\varphi e_i$ is in the range of $Q_B^\perp$ if $i \notin A$, so that
\[ (\varphi, e_{j-i}) = 0, \quad i \notin A,\ j \in B. \]
We deduce that there exists at least one integer $n$ not in the set
\[ \{ j - i : (i, j) \in (A \times (\mathbb{Z} \setminus B)) \cup ((\mathbb{Z} \setminus A) \times B) \}. \]
The function $e_n$ will then have the property that $e_{n+i} = e_n e_i$ is in the range of $Q_B$ if $i \in A$, and it is in the range of $Q_B^\perp$ if $i \notin A$. Therefore $B = A + n$. □

Corollary 8.2. The pair $(U, Q_A)$ is reducible if and only if $A$ is a periodic set; that is, $A = A + n$ for some nonzero integer $n$.

Proof. The pair $(U, Q_A)$ is reducible if and only if it commutes with a unitary which is not a scalar multiple of the identity. The argument in the proof of the preceding proposition shows that such a unitary can be chosen to be multiplication by $e_n$ for some $n \in \mathbb{Z} \setminus \{0\}$. □

We see therefore that there is a continuum of mutually inequivalent irreducible bi-isometries. Indeed, there is a continuum of subsets of $\mathbb{Z}$, and only countably many of them are periodic.
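Proposition 8.1 and Corollary 8.2 reduce equivalence and reducibility of the pairs $(U, Q_A)$ to combinatorial statements about subsets of $\mathbb{Z}$. For periodic sets, which by Corollary 8.2 are exactly the reducible case, both checks are finite. The sketch below assumes a representation of our own choosing: an $N$-periodic set is described by its residues modulo a known period $N$. The function names are hypothetical.

```python
def translate(residues, period, n):
    """Residues of A + n, for a period-periodic set A given by its residues."""
    return frozenset((r + n) % period for r in residues)

def equivalent_shifts(a, b, period):
    """All n in [0, period) with B = A + n.

    By Proposition 8.1 the pairs (U, Q_A) and (U, Q_B) are unitarily
    equivalent exactly when this list is nonempty.
    """
    a = frozenset(r % period for r in a)
    b = frozenset(r % period for r in b)
    return [n for n in range(period) if translate(a, period, n) == b]

def smallest_period(residues, period):
    """Smallest positive n with A = A + n; it always divides the given period."""
    a = frozenset(r % period for r in residues)
    for n in range(1, period + 1):
        if period % n == 0 and translate(a, period, n) == a:
            return n

print(equivalent_shifts({0}, {1}, 2))     # even integers versus odd integers
print(equivalent_shifts({0, 1}, {0, 2}, 4))
print(smallest_period({0, 2}, 4))
```

Only divisors of the given period need to be tested in `smallest_period`, because the periods of a set form a subgroup of $\mathbb{Z}$.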


Quite interestingly, the bi-isometry associated with $(U, Q_A)$ can be described very explicitly. Consider the space $L^2(\mathbb{T}^2) = L^2 \otimes L^2$ and its standard orthonormal basis $e_{ij}(\zeta_0, \zeta_1) = \zeta_0^i \zeta_1^j$, $i, j \in \mathbb{Z}$, $\zeta_0, \zeta_1 \in \mathbb{T}$. Multiplication by the two variables defines a bi-isometry $\mathbb{V} = (V_0, V_1)$ on $L^2(\mathbb{T}^2)$; actually $V_0$ and $V_1$ are unitary. We will look at proper nonempty subsets $\Gamma \subset \mathbb{Z}^2$ with the property that the space $\mathfrak{H}_\Gamma$ generated by $\{e_{ij} : (i, j) \in \Gamma\}$ is invariant for $\mathbb{V}$. In other words, $(i + n, j + m) \in \Gamma$ if $(i, j) \in \Gamma$ and $n, m \ge 0$ or, equivalently, $\Gamma + \mathbb{N}^2 \subset \Gamma$. We define the boundary $\partial\Gamma$ of $\Gamma$ to consist of those pairs $(i, j) \in \Gamma$ such that $(i - 1, j - 1)$ does not belong to $\Gamma$. For each integer $n$, there exists a unique point $\gamma_n = (i_n, j_n) \in \partial\Gamma$ such that $i_n - j_n = n$. Uniqueness is obvious by the definition of $\partial\Gamma$; existence follows from the fact that $\emptyset \ne \Gamma \ne \mathbb{Z}^2$. The difference $\gamma_{n+1} - \gamma_n = (i_{n+1} - i_n, j_{n+1} - j_n)$ is either $(1, 0)$ or $(0, -1)$. We can then define the set $A_\Gamma \subset \mathbb{Z}$ by
\[ A_\Gamma = \{ n \in \mathbb{Z} : \gamma_{n+1} - \gamma_n = (0, -1) \}. \]
Geometrically, $A_\Gamma$ is the union of the vertical segments in $\partial\Gamma$, omitting the lower endpoint of each one. The following result is an easy exercise.

Proposition 8.3. For every subset $A \subset \mathbb{Z}$ there exists a nonempty subset $\Gamma \subset \mathbb{Z}^2$ such that $\Gamma + \mathbb{N}^2 \subset \Gamma$ and $A_\Gamma = A$. We have $A_{\Gamma + (p, q)} = A_\Gamma + p - q$ for all $(p, q) \in \mathbb{Z}^2$.

Proposition 8.4. Let $\Gamma$ be a nonempty proper subset of $\mathbb{Z}^2$ such that $\mathfrak{H}_\Gamma$ is invariant for $\mathbb{V}$. The bi-isometry associated with the invariants $(U, Q_{A_\Gamma})$ is unitarily equivalent to $\mathbb{V}|\mathfrak{H}_\Gamma$.

Proof. The space $\mathfrak{H}_{\partial\Gamma} = \mathfrak{H}_\Gamma \ominus V_0 V_1 \mathfrak{H}_\Gamma$ can be identified with $L^2$ by mapping $e_{\gamma_n}$ to $e_n$. Denote by $U_0$ the unitary operator on $\mathfrak{H}_{\partial\Gamma}$ which corresponds to the shift on $L^2$; in other words, $U_0 e_{\gamma_n} = e_{\gamma_{n+1}}$. Since $V_0 V_1$ corresponds with multiplication by $z$, it is clear that $\mathfrak{H}_\Gamma$ can be identified with $H^2(\mathfrak{H}_{\partial\Gamma})$. Therefore, we only need to show that $V_0 e_{\gamma_n} = V_0 V_1 e_{\gamma_{n+1}}$ if $n \in A_\Gamma$ and $V_0 e_{\gamma_n} = e_{\gamma_{n+1}}$ if $n \notin A_\Gamma$. This however is immediate from the definition of $A_\Gamma$ and the remark preceding Proposition 8.1. □

A direct consequence of this proposition is the following:

Corollary 8.5. Let $\Gamma$ and $\Gamma'$ be two nonempty proper subsets of $\mathbb{Z}^2$ such that $\mathfrak{H}_\Gamma$ and $\mathfrak{H}_{\Gamma'}$ are invariant for $\mathbb{V}$.
(1) The bi-isometries $\mathbb{V}|\mathfrak{H}_\Gamma$ and $\mathbb{V}|\mathfrak{H}_{\Gamma'}$ are unitarily equivalent if and only if $\Gamma' = \Gamma + \gamma$ for some $\gamma \in \mathbb{Z}^2$.
(2) The bi-isometry $\mathbb{V}|\mathfrak{H}_\Gamma$ is reducible if and only if $\Gamma = \Gamma + \gamma$ for some $\gamma \in \mathbb{Z}^2 \setminus \{(0, 0)\}$.

Two particular sets $\Gamma$ yielding irreducible bi-isometries were considered in [11, 17]. The first is $\Gamma = \mathbb{N}^2$, for which $A_\Gamma = \{n : n < 0\}$. The restriction $\mathbb{V}|\mathfrak{H}_\Gamma$ is a doubly commuting bi-shift. The second is $\Gamma = (\mathbb{Z} \times \mathbb{N}) \cup (\mathbb{N} \times \mathbb{Z})$, for which


$A_\Gamma = \mathbb{N}$. The corresponding restriction of $\mathbb{V}$ was called a modified bi-shift in these works. The modified bi-shift can be seen to be the dual of the doubly commuting bi-shift in the sense of [6]. The bi-isometries of the form $\mathbb{V}|\mathfrak{H}_\Gamma$ were considered earlier in [18]. They have the special property that the range projections of the isometries in the multiplicative semigroup they generate commute with each other. The case $\Gamma \subset \mathbb{N}^2$ was also considered in [8] from the point of view of Hilbert modules over the bidisk algebra.

We now illustrate the decomposition of a bi-isometry into a direct integral of irreducibles with the particular case provided by the set $A = 2\mathbb{Z}$. In this case, the commutant of the pair $(U, Q_A)$ is the algebra generated by $U^2$, and this operator is a unitary operator with uniform multiplicity 2 relative to the usual arclength measure on $\mathbb{T}$. This is realized upon using the identification $\Phi : L^2 \oplus L^2 \to L^2$ defined by
\[ (\Phi(f \oplus g))(\zeta) = f(\zeta^2) + \zeta g(\zeta^2), \quad \zeta \in \mathbb{T}. \]
The operator $\Phi^* U \Phi$ is simply multiplication by the matrix-valued function
\[ U_0(\zeta) = \begin{bmatrix} 0 & \zeta \\ 1 & 0 \end{bmatrix}, \quad \zeta \in \mathbb{T}, \]
while $\Phi^* Q_A \Phi$ is multiplication by the constant matrix
\[ P_0 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}. \]
In other words, we have the decomposition
\[ (U, P) = \int_{\mathbb{T}}^{\oplus} (U_0(\zeta), P_0)\, |d\zeta|, \]
and it is clear that the pairs $(U_0(\zeta), P_0)$ are irreducible and mutually inequivalent. This corresponds with a direct integral decomposition of the corresponding bi-isometry. The reader will have no difficulty verifying that the bi-isometry associated with $(U_0(\zeta), P_0)$ is of the form $(\zeta S, S)$, where $S$ is a unilateral shift of multiplicity one. The general case of a set $A$ such that $A = A + n$, with $n > 2$, lends itself to a similar analysis, with
\[ U_0(\zeta) = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & \zeta^{n-1} \\ 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & \zeta & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \zeta^{n-2} & 0 \end{bmatrix}, \quad \zeta \in \mathbb{T}, \]
and $P_0$ a diagonal projection. The diagonal elements $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ of this projection are defined by setting $\alpha_i = 1$ if $i \in A$ and $\alpha_i = 0$ otherwise. The pair $(U_0(\zeta), P_0)$ is irreducible provided that $n$ is the smallest positive period of $A$.
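For $A = 2\mathbb{Z}$ the passage from $(U, Q_A)$ to the matrix pair $(U_0(\zeta), P_0)$ is pure bookkeeping on Fourier coefficients, and it can be traced in a few lines. The sketch below is an illustration under our own conventions: a trigonometric polynomial is stored as a dictionary mapping exponents to coefficients, and all function names are hypothetical. The map $\Phi$ sends the coefficient of $\zeta^k$ in $f$ to the exponent $2k$ and the coefficient of $\zeta^k$ in $g$ to the exponent $2k + 1$.

```python
def phi(f, g):
    """Phi(f + g)(zeta) = f(zeta^2) + zeta * g(zeta^2), on coefficient dictionaries."""
    h = {}
    for k, c in f.items():
        h[2 * k] = h.get(2 * k, 0) + c
    for k, c in g.items():
        h[2 * k + 1] = h.get(2 * k + 1, 0) + c
    return h

def phi_inverse(h):
    """Split a function on the circle into its even part f and odd part g."""
    f = {k // 2: c for k, c in h.items() if k % 2 == 0}
    g = {(k - 1) // 2: c for k, c in h.items() if k % 2 != 0}
    return f, g

def shift(h, n=1):
    """Multiplication by zeta^n: the bilateral shift U when n = 1."""
    return {k + n: c for k, c in h.items()}

def project_even(h):
    """The projection Q_A for A = 2Z."""
    return {k: c for k, c in h.items() if k % 2 == 0}

def u0_action(f, g):
    """The matrix [[0, zeta], [1, 0]] applied to the column (f, g)."""
    return shift(g), dict(f)

f = {-1: 2, 0: 1, 3: -1}
g = {-2: 5, 1: 4}
print(phi_inverse(shift(phi(f, g))) == u0_action(f, g))
print(phi_inverse(project_even(phi(f, g))) == (f, {}))
```

Both comparisons express the identities $\Phi^* U \Phi = U_0$ and $\Phi^* Q_A \Phi = P_0$ on the sample pair $(f, g)$; they are a consistency check on examples, not a proof.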


References

[1] O.P. Agrawal, D.N. Clark and R.G. Douglas, Invariant subspaces in the polydisk, Pacific J. Math. 121 (1986), 1–11.
[2] H. Bercovici, R.G. Douglas and C. Foias, On the classification of multi-isometries, Acta Sci. Math. (Szeged) 72 (2006) no. 3-4, 639–661.
[3] H. Bercovici, R.G. Douglas and C. Foias, Bi-isometries and commutant lifting, Oper. Theory Adv. Appl. 197 (2010), 51–76.
[4] C.A. Berger, L. Coburn, and A. Lebow, Representation and index theory for $C^*$-algebras generated by commuting isometries, J. Funct. Anal. 27 (1978), 51–99.
[5] A. Beurling, On two problems concerning linear transformations in Hilbert space, Acta Math. 81 (1949), 239–255.
[6] J.B. Conway, The dual of a subnormal operator, J. Operator Theory 5 (1981), 195–211.
[7] R.G. Douglas, On extending commutative semigroups of isometries, Bull. London Math. Soc. 1 (1969), 157–159.
[8] R.G. Douglas, T. Nakazi, and M. Seto, Shift operators on the $\mathbb{C}^2$-valued Hardy space, Acta Sci. Math. (Szeged) 73 (2007), 729–744.
[9] P.L. Duren, Theory of $H^p$ spaces, Academic Press, New York-London, 1970.
[10] C. Foias, A.E. Frazho, I. Gohberg, and M.A. Kaashoek, Metric constrained interpolation, commutant lifting and systems, Birkhäuser Verlag, Basel, 1998.
[11] D. Gaşpar and P. Gaşpar, Wold decompositions and the unitary model for bi-isometries, Integral Equations Operator Theory 49 (2004), 419–433.
[12] D. Gaşpar and N. Suciu, Wold decompositions for commutative families of isometries, An. Univ. Timisoara Ser. Stint. Mat. 27 (1989), 31–38.
[13] P. Ghatage and V. Mandrekar, On Beurling type invariant subspaces of $L^2(T^2)$ and their equivalence, J. Operator Theory 20 (1988), 83–89.
[14] H.-K. Kwon and S. Treil, Similarity of operators and geometry of eigenvector bundles, Publ. Mat. 53 (2009), 417–438.
[15] D. Popovici, On the structure of c.n.u. bi-isometries, Acta Sci. Math. (Szeged) 66 (2000), 719–729.
[16] D. Popovici, On the structure of c.n.u. bi-isometries. II, Acta Sci. Math. (Szeged) 68 (2002), 329–347.
[17] D. Popovici, A Wold-type decomposition for commuting isometric pairs, Proc. Amer. Math. Soc. 132 (2004), 2303–2314.
[18] H.N. Salas, Semigroups of isometries with commuting range properties, J. Operator Theory 14 (1985), 311–346.
[19] M. Słociński, On Wold type decomposition of a pair of commuting isometries, Ann. Pol. Math. 37 (1980), 255–262.
[20] I. Suciu, On the semigroups of isometries, Studia Math. 30 (1968), 101–110.
[21] B. Sz.-Nagy, Unitary dilations of Hilbert space operators and related topics, Conference Board of Mathematical Sciences Regional Conference Series in Mathematics, No. 19, American Mathematical Society, Providence, R.I., 1974.
[22] B. Sz.-Nagy, Sur les contractions de l'espace de Hilbert, Acta Sci. Math. (Szeged) 15 (1953), 87–92.
[23] B. Sz.-Nagy, Sur les contractions de l'espace de Hilbert. II, Acta Sci. Math. (Szeged) 18 (1957), 1–14.
[24] B. Sz.-Nagy and C. Foias, On contractions similar to isometries and Toeplitz operators, Ann. Acad. Sci. Fenn. Ser. A I Math. 2 (1976), 553–564.
[25] B. Sz.-Nagy, C. Foias, H. Bercovici, and L. Kérchy, Harmonic Analysis of Operators on Hilbert Spaces, Second Edition, Springer Verlag, New York, 2010.
[26] M. Takesaki, Theory of operator algebras. I, Springer-Verlag, Berlin, 2002.
[27] J. von Neumann, Allgemeine Eigenwerttheorie Hermitescher Funktionaloperatoren, Math. Ann. 102 (1929), 49–131.
[28] H. Wold, A study in the analysis of stationary time series, Stockholm, 1954.

H. Bercovici
Department of Mathematics
Indiana University
Bloomington, IN 47405, USA
e-mail: [email protected]

R.G. Douglas and C. Foias
Department of Mathematics
Texas A&M University
College Station, TX 77843, USA
e-mail: [email protected] [email protected]

Operator Theory: Advances and Applications, Vol. 218, 207–224
© 2012 Springer Basel AG

First-order Trace Formulae for the Iterates of the Fox–Li Operator

Albrecht Böttcher, Sergei Grudsky, Daan Huybrechs and Arieh Iserles

To the memory of Israel Gohberg, one of the pioneers of contemporary Wiener–Hopf theory

Abstract. The paper is devoted to first-order trace formulas for the iterates of the Fox–Li and related Wiener–Hopf integral operators. Such formulas provide first insight into the asymptotic behaviour of the eigenvalues and can be used to test whether a specific guess for the eigenvalue distribution is acceptable or not. The main technical problem consists in obtaining the asymptotics of a multivariate oscillatory integral whose stationary points constitute a line.

Mathematics Subject Classification (2000). Primary 47B35; Secondary 45C05, 47B10, 78A60.

Keywords. Fox–Li operator, Wiener–Hopf operator, eigenvalue, trace formula.

1. Introduction and main results

The Fox–Li operator is the integral operator on $L^2(-1,1)$ given by

$$(F_\omega f)(x) := \sqrt{\frac{\omega}{\pi i}} \int_{-1}^{1} e^{i\omega(x-y)^2} f(y)\,dy, \quad x \in (-1,1),$$

where $\omega > 0$ is a large parameter. Here and in the following, $\sqrt{i}$ stands for $e^{i\pi/4}$. The spectrum of this operator is of great importance in laser engineering [12], [13], [15], [18], [19]. Physical aspects of the Fox–Li spectrum are also studied in the recent papers [2], [3], [4]. A very recent paper devoted to numerical methods for the approximation of the spectrum is [11]. These works indicate that the spectrum of $F_\omega$ is composed of points on a spiral commencing at 1 and rotating clockwise to the origin.

Sergei Grudsky acknowledges support of this work by a grant of the DAAD.


However, rigorous results are still very sparse. Landau and Widom [16], [20] established a second-order result for the asymptotic distribution of the singular values of $F_\omega$, that is, for the square roots of the eigenvalues of $F_\omega F_\omega^*$. In [7], the regularized operator given by

$$(F_{\omega,\varepsilon} f)(x) := \sqrt{\frac{\omega}{\pi i}} \int_{-1}^{1} e^{(i-\varepsilon)\omega(x-y)^2} f(y)\,dy, \quad x \in (-1,1),$$

was considered, and it was proved that, for each fixed $\varepsilon > 0$, the eigenvalues of $F_{\omega,\varepsilon}$ converge to the logarithmic spiral

$$\left\{ \frac{1}{\sqrt{1+\varepsilon i}} \exp\left( -\frac{\varepsilon x^2}{4(1+\varepsilon^2)} - i\,\frac{x^2}{4(1+\varepsilon^2)} \right) : x \in (0,\infty) \right\}$$

in the Hausdorff metric as $\omega \to \infty$. For $\varepsilon = 0$, this logarithmic spiral becomes the unit circle, and it is conjectured that the eigenvalues of $F_\omega$ lie on spirals which eventually wind up closer and closer to the unit circle as $\omega$ goes to infinity. We here prove the following first-order trace formula for the eigenvalues of the Fox–Li operator $F_\omega$ itself.

Theorem 1.1. The operator $F_\omega$ is a trace class operator, all eigenvalues are contained in the open unit disk, and, for each fixed natural number $k \ge 1$,

$$\operatorname{tr} F_\omega^k = \frac{2\sqrt{\omega}}{\sqrt{\pi i k}} + o(\sqrt{\omega}) \quad \text{as } \omega \to \infty. \tag{1}$$

We do not know a rigorous argument that shows that $F_\omega$ has infinitely many eigenvalues. However, Theorem 1.1 for $k = 1$ implies the following. Since it tells us that $F_\omega$ is of trace class, we may compute the trace by integrating the kernel along the diagonal [14, Corollary III.10.2]. On the other hand, the trace is the sum of the eigenvalues $\lambda_1, \lambda_2, \ldots$ of $F_\omega$, repeated according to algebraic multiplicity. Consequently,

$$\sum_n \lambda_n = \operatorname{tr} F_\omega = \sqrt{\frac{\omega}{\pi i}} \int_{-1}^{1} e^{i\omega \cdot 0}\,dx = \frac{2\sqrt{\omega}}{\sqrt{\pi i}},$$

and since $|\lambda_n| < 1$ for all $n$, it follows that the number $N$ of eigenvalues satisfies

$$\frac{2\sqrt{\omega}}{\sqrt{\pi}} = \left| \sum_{n=1}^{N} \lambda_n \right| \le \sum_{n=1}^{N} |\lambda_n| < N,$$

which at least reveals that $F_\omega$ has more than $2\sqrt{\omega/\pi}$ eigenvalues.

Given a family of functions $b_\omega : (0,\infty) \to \mathbb{C}$, we say that the eigenvalues of $F_\omega$ are asymptotically distributed as the values of $b_\omega$ in the weak sense if, for each natural number $k \ge 1$,

$$\operatorname{tr} F_\omega^k = \int_0^\infty b_\omega^k(x)\,dx + o(\sqrt{\omega}) \quad \text{as } \omega \to \infty. \tag{2}$$
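As a quick sanity check of the counting argument above, the following Python sketch evaluates the leading term of (1) and the resulting lower bound for the number of eigenvalues. The function names `trace_leading_term` and `min_eigenvalue_count` are illustrative choices and do not come from the paper; only the standard library is assumed.

```python
import cmath
import math

# The paper's convention: sqrt(i) stands for e^{i*pi/4}.
SQRT_I = cmath.exp(1j * math.pi / 4)


def trace_leading_term(omega: float, k: int = 1) -> complex:
    """Leading term 2*sqrt(omega)/sqrt(pi*i*k) of tr F_omega^k in formula (1)."""
    return 2.0 * math.sqrt(omega) / (math.sqrt(math.pi * k) * SQRT_I)


def min_eigenvalue_count(omega: float) -> int:
    """Smallest integer N satisfying the strict inequality N > 2*sqrt(omega/pi)."""
    return math.floor(2.0 * math.sqrt(omega / math.pi)) + 1


if __name__ == "__main__":
    for omega in (10.0, 100.0, 1000.0):
        print(omega, trace_leading_term(omega), min_eigenvalue_count(omega))
```

The bound grows only like $\sqrt{\omega}$, so it says nothing about whether the number of eigenvalues is infinite.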


Of course, saying so is motivated by the formulae

$$\operatorname{tr} F_\omega^k = \sum_n \lambda_n^k, \qquad \int_0^\infty b_\omega^k(x)\,dx \approx \sum_n b_\omega^k(n).$$

We remark that equal asymptotic distribution in the strong sense would mean something like

$$\operatorname{tr} \varphi(F_\omega) = \int_0^\infty \varphi(b_\omega(x))\,dx + o(\sqrt{\omega}) \quad \text{as } \omega \to \infty$$

for every function $\varphi \in C^\infty(\mathbb{C})$ satisfying $\varphi(0) = 0$ or for every $\varphi$ of the form $\varphi = \chi_\Omega$, where $\Omega \subset \mathbb{C} \setminus \{0\}$ is a nice set and $\chi_\Omega$ stands for the characteristic function of $\Omega$. In the latter case, it would follow that

$$\#\{n : \lambda_n \in \Omega\} = |\{x \in (0,\infty) : b_\omega(x) \in \Omega\}| + o(\sqrt{\omega}) \quad \text{as } \omega \to \infty,$$

where $\#S$ and $|S|$ denote the cardinality and the Lebesgue measure of $S$, respectively. By Theorem 1.1, formula (2) is equivalent to saying that

$$\int_0^\infty b_\omega^k(x)\,dx = \frac{2\sqrt{\omega}}{\sqrt{\pi i k}} + o(\sqrt{\omega}) \quad \text{as } \omega \to \infty. \tag{3}$$

Using solely (3) we will show the following.

Theorem 1.2. Let $b_\omega(x) = \exp(-\alpha(\omega)x^\nu - i\beta(\omega)x^\nu)$ with positive real numbers $\alpha(\omega)$, $\beta(\omega)$, $\nu$. Then the eigenvalues of $F_\omega$ are asymptotically distributed as the values of $b_\omega$ in the weak sense if and only if

$$\nu = 2, \quad \alpha(\omega) = o\left(\frac{1}{\omega}\right), \quad \beta(\omega) = \frac{\pi^2}{16\omega} + o\left(\frac{1}{\omega}\right).$$

This result may be viewed as a first step toward establishing with mathematical rigour Vainshtein's formula

$$\nu = 2, \quad \alpha(\omega) \approx \frac{\zeta(1/2)\,\pi^{3/2}}{16\sqrt{2}\,\omega^{3/2}}, \quad \beta(\omega) \approx \frac{\pi^2}{16\omega},$$

which, to quote Cochran and Hinds [12], was obtained by Vainshtein [19] "using a distinctly physical approach, based upon wave-guide theory." Here $\zeta(1/2)$ is Riemann's zeta function at the point $1/2$.

The Fox–Li operator $F_\omega$ is easily seen to be unitarily similar to the operator on $L^2(0, 2\sqrt{\omega})$ that acts by the rule

$$(W f)(x) := \frac{1}{\sqrt{\pi i}} \int_0^{2\sqrt{\omega}} e^{i(x-y)^2} f(y)\,dy, \quad x \in (0, 2\sqrt{\omega}). \tag{4}$$

In this way we are entering the realm of Wiener–Hopf operators. Given a function $a \in L^\infty(\mathbb{R})$, the so-called symbol, the convolution operator $C(a)$ on $L^2(\mathbb{R})$ is defined by

$$(C(a)f)(x) := \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i\xi x} a(\xi) \int_{-\infty}^{\infty} e^{i\xi y} f(y)\,dy\,d\xi, \quad x \in \mathbb{R}.$$


Thus, $C(a)$ takes the Fourier transform, multiplies the result by $a$, and then applies the inverse Fourier transform. The boundedness of $a$ guarantees (and is in fact equivalent to) the boundedness of $C(a)$. The Wiener–Hopf operator $W(a)$ is the compression of $C(a)$ to $L^2(0,\infty)$, that is, $W(a) = P C(a)|L^2(0,\infty)$, where $(Pf)(x)$ is zero for $x < 0$ and $f(x)$ for $x > 0$. Finally, for a real number $\tau > 0$, the truncated Wiener–Hopf operator $W_\tau(a)$ is the compression of $W(a)$ to $L^2(0,\tau)$, i.e., $W_\tau(a) = P_\tau W(a)|L^2(0,\tau)$, where $(P_\tau f)(x) = f(x)$ for $0 < x < \tau$ and $(P_\tau f)(x) = 0$ for $x > \tau$. If

$$a(\xi) = \hat{\ell}(\xi) := \int_{-\infty}^{\infty} \ell(t) e^{i\xi t}\,dt, \quad \xi \in \mathbb{R},$$

the Fourier transform being understood in the usual sense for $\ell \in L^1(\mathbb{R}) \cup L^2(\mathbb{R})$ and in the sense of distributions in more general situations, the convolution $C(a)$ can be written as

$$(C(a)f)(x) = \int_{-\infty}^{\infty} \ell(x-y) f(y)\,dy, \quad x \in \mathbb{R},$$

while the operators $W(a)$ and $W_\tau(a)$ are given by the same formula with integration over $(-\infty,\infty)$ replaced by integration over $(0,\infty)$ and $(0,\tau)$, respectively. Because

$$\frac{1}{\sqrt{\pi i}} \int_{-\infty}^{\infty} e^{it^2} e^{i\xi t}\,dt = e^{-i\xi^2/4}, \quad \xi \in \mathbb{R},$$

in the sense of distributions, we may identify the operator (4) as $W_{2\sqrt{\omega}}(\sigma)$ with $\sigma(\xi) := e^{-i\xi^2/4}$. We remark that $\sigma(\xi)$ has oscillating discontinuities as $\xi \to \pm\infty$ and that this function does not belong to the classes of symbols with a well-developed theory of their Wiener–Hopf operators, such as $C(\dot{\mathbb{R}}) + H^\infty(\mathbb{R})$, $PC(\mathbb{R})$, $SO(\mathbb{R})$, $SAP(\mathbb{R})$; see [8] and [10]. In terms of Wiener–Hopf operators, Theorem 1.1 becomes formula (5) in the following result.

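Theorem 1.3. Let $\sigma(\xi) := e^{-i\xi^2/4}$. The spectra of the operators $C(\sigma)$ and $W(\sigma)$ are the unit circle $\mathbb{T}$ and the closed unit disc $\overline{\mathbb{D}}$, respectively. The spectrum of $W_\tau(\sigma)$ is contained in the open unit disc $\mathbb{D}$, and for every natural number $k \ge 1$, the operators $W_\tau^k(\sigma) := [W_\tau(\sigma)]^k$ and $W_\tau(\sigma^k)$ are trace class operators and

$$\operatorname{tr} W_\tau^k(\sigma) = \operatorname{tr} W_\tau(\sigma^k) + o(\tau) \quad \text{as } \tau \to \infty. \tag{5}$$

Denoting by $\ell_k(t)$ the kernel of the convolution integral operator $W_\tau(\sigma^k)$, we have

$$\operatorname{tr} W_\tau(\sigma^k) = \int_0^\tau \ell_k(x-x)\,dx = \tau\,\ell_k(0) = \frac{\tau}{2\pi} \int_{-\infty}^{\infty} \sigma^k(\xi)\,d\xi = \frac{\tau}{2\pi} \int_{-\infty}^{\infty} e^{-ik\xi^2/4}\,d\xi = \frac{\tau}{2\pi} \sqrt{\frac{4\pi}{ik}} = \frac{\tau}{\sqrt{\pi i k}},$$

and taking into account that $F_\omega$ is unitarily similar to $W_{2\sqrt{\omega}}(\sigma)$, we see that (5) is indeed the same as (1).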
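The identification of (5) with (1) amounts to a one-line computation, which the following Python sketch reproduces. It takes the Fresnel-type integral in closed form rather than evaluating it numerically, and the names `trace_w_sigma_power` and `fox_li_leading_term` are assumptions made for this illustration only.

```python
import cmath
import math


def trace_w_sigma_power(tau: float, k: int) -> complex:
    """tr W_tau(sigma^k) = (tau / 2 pi) * int exp(-i k xi^2 / 4) d xi.

    The integral is replaced by its closed form sqrt(4 pi / (i k)), with the
    principal square root (argument -pi/4), matching sqrt(i) = e^{i pi/4}.
    """
    fresnel_value = cmath.sqrt(4.0 * math.pi / (1j * k))
    return tau / (2.0 * math.pi) * fresnel_value


def fox_li_leading_term(omega: float, k: int) -> complex:
    """Leading term of formula (1): 2 sqrt(omega) / sqrt(pi i k)."""
    return 2.0 * math.sqrt(omega) / cmath.sqrt(math.pi * 1j * k)


if __name__ == "__main__":
    omega = 400.0
    for k in (1, 2, 5):
        tau = 2.0 * math.sqrt(omega)  # F_omega is similar to W_{2 sqrt(omega)}(sigma)
        print(k, trace_w_sigma_power(tau, k), fox_li_leading_term(omega, k))
```

With $\tau = 2\sqrt{\omega}$ the two functions agree for every $k$, which is the statement that (5) and (1) have the same leading term.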


The discrete analogues of Wiener–Hopf operators are Toeplitz matrices. Given $a \in L^\infty(\mathbb{T})$, the $n \times n$ Toeplitz matrix $T_n(a)$ is the matrix $(a_{j-k})_{j,k=1}^{n}$, where $a_j$ is the $j$th Fourier coefficient of $a$,

$$a_j := \frac{1}{2\pi} \int_0^{2\pi} a(e^{i\theta}) e^{-ij\theta}\,d\theta, \quad j \in \mathbb{Z}.$$

It is well known and not difficult to prove (see, e.g., [9, Lemma 5.16 and Theorem 5.17]) that if $a$ is an arbitrary function in $L^\infty(\mathbb{T})$, then

$$\operatorname{tr} T_n^k(a) = \operatorname{tr} T_n(a^k) + o(n) = n\,(a^k)_0 + o(n) \quad \text{as } n \to \infty, \tag{6}$$

which is the discrete counterpart of (5). A finite Toeplitz matrix is automatically a trace class operator, but a truncated Wiener–Hopf operator need not be of trace class. Therefore the continuous analogue of (6) does not make sense for arbitrary $a \in L^\infty(\mathbb{R})$. What is known is the following, and we will include a proof for the reader's convenience.

Theorem 1.4. If $a \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$, then the operators $W_\tau^k(a)$ and $W_\tau(a^k)$ are of trace class for every natural number $k \ge 1$ and every real number $\tau > 0$, and

$$\operatorname{tr} W_\tau^k(a) = \operatorname{tr} W_\tau(a^k) + o(\tau) = \frac{\tau}{2\pi} \int_{-\infty}^{\infty} a^k(\xi)\,d\xi + o(\tau) \quad \text{as } \tau \to \infty. \tag{7}$$
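The discrete trace formula (6) is easy to observe on a small example. The sketch below is only an illustration under stated assumptions: it uses the trigonometric polynomial symbol $a(e^{i\theta}) = e^{i\theta} + e^{-i\theta}$, chosen here for convenience and not taken from the paper, together with hypothetical helper names, and it compares $\operatorname{tr} T_n^k(a)$ with $n\,(a^k)_0$.

```python
def toeplitz(coeffs: dict, n: int) -> list:
    """n x n Toeplitz matrix (a_{j-k}) from a dict {index: Fourier coefficient}."""
    return [[coeffs.get(j - k, 0) for k in range(n)] for j in range(n)]


def matmul(A: list, B: list) -> list:
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def trace_of_power(A: list, k: int):
    """Trace of A^k for k >= 1."""
    P = A
    for _ in range(k - 1):
        P = matmul(P, A)
    return sum(P[i][i] for i in range(len(A)))


def zeroth_coeff_of_power(coeffs: dict, k: int):
    """Zeroth Fourier coefficient of a^k, computed by repeated convolution."""
    power = {0: 1}
    for _ in range(k):
        new = {}
        for p, cp in power.items():
            for q, cq in coeffs.items():
                new[p + q] = new.get(p + q, 0) + cp * cq
        power = new
    return power.get(0, 0)


if __name__ == "__main__":
    symbol = {1: 1, -1: 1}  # assumed example symbol: e^{i theta} + e^{-i theta}
    n = 50
    T = toeplitz(symbol, n)
    for k in (2, 4):
        print(k, trace_of_power(T, k), n * zeroth_coeff_of_power(symbol, k))
```

For this symbol the difference between the two quantities stays bounded as $n$ grows, which is much stronger than the $o(n)$ guaranteed by (6); for general bounded symbols only $o(n)$ can be expected.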

The function $\sigma(\xi) = e^{-i\xi^2/4}$ in Theorem 1.3 is not in $L^1(\mathbb{R})$ and hence Theorem 1.3 cannot be deduced from Theorem 1.4. The actual value of Theorems 1.1 and 1.3 is that they show that (7) nevertheless remains true for $a(\xi) = \sigma(\xi) = e^{-i\xi^2/4}$. The following theorem unites (5) and (7).

Theorem 1.5. Let $a(\xi) = c(\xi)\sigma(\xi)$ where $\sigma(\xi) := e^{-i\xi^2/4}$ and $c \in C^3(\mathbb{R})$ is a function having finite limits $c(-\infty) = c(+\infty) =: c(\infty)$. Set $u(\xi) := c(\xi) - c(\infty)$ and suppose that the functions $\xi^4 u(\xi)$, $\xi^3 u'(\xi)$, $\xi^2 u''(\xi)$, $u'''(\xi)$ belong to $L^1(\mathbb{R})$ and have zero limits as $\xi \to \pm\infty$. Then, for every natural number $k \ge 1$ and every real number $\tau > 0$, the operators $W_\tau^k(a)$ and $W_\tau(a^k)$ are of trace class and (7) holds.

The remaining sections of the paper are devoted to the proofs of the theorems. In Section 2, we prove Theorem 1.4 and the portion of Theorem 1.3 concerning spectra. Proposition 2.4 addresses the pseudospectra of $F_\omega$ and shows that, for each $\varepsilon > 0$, the $\varepsilon$-pseudospectrum of $F_\omega$ contains the closed unit disk $\overline{\mathbb{D}}$ whenever $\omega$ is sufficiently large. Theorem 1.1 and the (equivalent) trace formula of Theorem 1.3 are proved in Section 3 by determining the first-order asymptotics of the oscillatory multivariate integral $\int_{-1}^{1} m_k(x,x)\,dx$, where $m_k(x,y)$ is the kernel of the integral operator $F_\omega^k$; note that $m_k(x,y)$ is a $(k-1)$-fold integral. Sections 4 and 5 contain the proofs of Theorems 1.2 and 1.5, respectively.


2. Wiener–Hopf operators

We begin with the proof of Theorem 1.4. Let $\mathcal{C}_p$ denote the $p$th Schatten–von Neumann class and $\|\cdot\|_p$ the norm in $\mathcal{C}_p$, that is, the $\ell^p$ norm of the singular values of the operator. In particular, $\|\cdot\|_1$ is the trace norm, $\|\cdot\|_2$ is the Hilbert–Schmidt norm (= Frobenius norm), and $\|\cdot\|_\infty$ coincides with the usual operator norm on $L^2$. It is well known that $W_\tau(a) \in \mathcal{C}_1$ whenever $a \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$; see, e.g., [10, Section 10.83]. This implies that $W_\tau^k(a)$ and $W_\tau(a^k)$ are also in $\mathcal{C}_1$ for $k \ge 1$.

Lemma 2.1. If $b, c \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$, then

$$\|W_\tau(b)W_\tau(c) - W_\tau(bc)\|_1 = o(\tau) \quad \text{as } \tau \to \infty.$$

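Proof. If $a \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$, then $a \in L^2(\mathbb{R})$ and $a = \hat{\ell}$ with $\ell \in C(\dot{\mathbb{R}}) \cap L^2(\mathbb{R})$, where $\dot{\mathbb{R}}$ is the one-point compactification of $\mathbb{R}$. We denote by $H(a)$ the Hankel operator generated by $a$. This is the operator that acts on $L^2(0,\infty)$ by the rule

$$(H(a)f)(x) := \int_0^\infty \ell(x+y) f(y)\,dy, \quad x \in (0,\infty).$$

Letting $\tilde{a}(\xi) := a(-\xi)$, we have

$$(H(\tilde{a})f)(x) = \int_0^\infty \ell(-x-y) f(y)\,dy, \quad x \in (0,\infty).$$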

A formula by Widom says that

$$W_\tau(bc) - W_\tau(b)W_\tau(c) = P_\tau H(b)H(\tilde{c})P_\tau + R_\tau H(\tilde{b})H(c)R_\tau, \tag{8}$$

where $P_\tau$ is as in Section 1 and $R_\tau : L^2(0,\infty) \to L^2(0,\tau)$ is the operator that is given by $(R_\tau f)(x) := f(\tau - x)$ for $0 < x < \tau$ and $(R_\tau f)(x) := 0$ for $x > \tau$; see, for example, [10, Section 9.7(d)]. Since $\|BC\|_1 \le \|B\|_2 \|C\|_2$, it suffices to prove that

$$\frac{\|P_\tau H(a)\|_2^2}{\tau} \to 0 \quad \text{as } \tau \to \infty$$

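for $a = \hat{\ell} \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$. We have

$$\frac{\|P_\tau H(a)\|_2^2}{\tau} = \frac{1}{\tau} \int_0^\tau \int_0^\infty |\ell(x+y)|^2\,dy\,dx = \frac{1}{\tau} \int_0^\tau \int_x^\infty |\ell(t)|^2\,dt\,dx = \frac{1}{\tau} \int_0^\tau \int_x^\tau |\ell(t)|^2\,dt\,dx + \frac{1}{\tau} \int_0^\tau \int_\tau^\infty |\ell(t)|^2\,dt\,dx,$$

and the second term on the right is

$$\int_\tau^\infty |\ell(t)|^2\,dt = o(1).$$

The first term equals

$$\frac{1}{\tau} \int_0^\tau \int_0^t |\ell(t)|^2\,dx\,dt = \frac{1}{\tau} \int_0^\tau t\,|\ell(t)|^2\,dt,$$

and we write this as

$$\frac{1}{\tau} \int_0^{\tau_0} t\,|\ell(t)|^2\,dt + \frac{1}{\tau} \int_{\tau_0}^{\tau} t\,|\ell(t)|^2\,dt, \tag{9}$$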

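where $\tau_0 = \tau_0(\varepsilon)$ is chosen so that

$$\frac{1}{\tau} \int_{\tau_0}^{\tau} t\,|\ell(t)|^2\,dt \le \int_{\tau_0}^{\tau} |\ell(t)|^2\,dt \le \int_{\tau_0}^{\infty} |\ell(t)|^2\,dt < \frac{\varepsilon}{2}.$$

Since then

$$\frac{1}{\tau} \int_0^{\tau_0} t\,|\ell(t)|^2\,dt \le \frac{\tau_0}{\tau} \int_0^{\tau_0} |\ell(t)|^2\,dt \le \frac{\tau_0}{\tau} \int_0^{\infty} |\ell(t)|^2\,dt < \frac{\varepsilon}{2}$$

if only $\tau$ is large enough, we see that (9) is smaller than any prescribed $\varepsilon > 0$ whenever $\tau$ is sufficiently large. □

Lemma 2.2. If $a \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$ and $k \ge 1$ is a natural number, then

$$\|W_\tau^k(a) - W_\tau(a^k)\|_1 = o(\tau) \quad \text{as } \tau \to \infty.$$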

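Proof. This is trivial for $k = 1$. Assume that the assertion is true for some $k \ge 1$. We write $W_\tau^{k+1}(a) - W_\tau(a^{k+1})$ as

$$\left( W_\tau^k(a) - W_\tau(a^k) \right) W_\tau(a) + W_\tau(a^k)W_\tau(a) - W_\tau(a^{k+1})$$

and have

$$\left\| \left( W_\tau^k(a) - W_\tau(a^k) \right) W_\tau(a) \right\|_1 \le \|W_\tau^k(a) - W_\tau(a^k)\|_1 \, \|W_\tau(a)\|_\infty.$$

Clearly, $\|W_\tau(a)\|_\infty \le \|a\|_\infty$. Furthermore, $\|W_\tau^k(a) - W_\tau(a^k)\|_1 = o(\tau)$ by assumption, and $\|W_\tau(a^k)W_\tau(a) - W_\tau(a^{k+1})\|_1 = o(\tau)$ due to Lemma 2.1. Thus, the assertion is valid for $k + 1$. □

As $|\operatorname{tr} A| \le \|A\|_1$ for every trace class operator $A$, Theorem 1.4 is an obvious consequence of Lemma 2.2.

The following result proves part of Theorem 1.3. We denote the spectrum of an operator $A$ by $\operatorname{sp} A$. The essential spectrum $\operatorname{sp}_{\mathrm{ess}} A$ is the set of all $\lambda \in \mathbb{C}$ for which $A - \lambda I$ is not Fredholm, that is, not invertible modulo compact operators. Clearly, $\operatorname{sp}_{\mathrm{ess}} A \subset \operatorname{sp} A$.

Proposition 2.3. If $\sigma(\xi) := e^{-i\xi^2/4}$, then

$$\operatorname{sp} C(\sigma) = \mathbb{T}, \quad \operatorname{sp}_{\mathrm{ess}} W(\sigma) = \operatorname{sp} W(\sigma) = \overline{\mathbb{D}}, \quad \operatorname{sp} W_\tau(\sigma) \subset \mathbb{D}.$$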

Proof. Throughout this proof, $a$ denotes an arbitrary function in $L^\infty(\mathbb{R})$. The spectrum of $C(a)$ is the essential range $\mathcal{R}(a)$ of $a$. Hence $\operatorname{sp} C(\sigma) = \mathbb{T}$. To prove the assertion for the spectra of $W(\sigma)$, we have recourse to known results on Toeplitz operators. The passage from Wiener–Hopf operators on $L^2(0,\infty)$ to Toeplitz operators on the Hardy space $H^2(\mathbb{T})$ and back can be performed by a standard unitary similarity; see, for example, Section 9.5(e) of [10]. The Hartman–Wintner and Brown–Halmos theorems, which can be found, for instance, as Theorems 2.30 and 2.33 in [10], yield the spectral inclusions $\mathcal{R}(a) \subset \operatorname{sp} W(a) \subset \operatorname{conv} \mathcal{R}(a)$, where conv denotes the convex hull. Consequently, $\mathbb{T} \subset \operatorname{sp} W(\sigma) \subset \overline{\mathbb{D}}$.

To show that $\operatorname{sp}_{\mathrm{ess}} W(\sigma)$ is all of $\overline{\mathbb{D}}$, fix some $\lambda \in \mathbb{D}$. We have $\sigma(\xi) - \lambda = |\sigma(\xi) - \lambda|\,e^{-i\varphi(\xi)}$ with a function $\varphi$ that can be written as $\varphi = \psi + \delta$, where $\psi \in C(\mathbb{R}) \cap L^\infty(\mathbb{R})$ and $\delta \in C(\mathbb{R})$ is monotone on $(-\infty, 0)$ and $(0, \infty)$ with $\delta(\pm\infty) = +\infty$. Now we can employ a result of [5], which is also cited and proved as Theorem 6.4 of [8] and, reduced to a necessary invertibility criterion, entered [10] as Proposition 2.26(d). This result says that if $a - \lambda$ has an argument as just described, then for $W(a - \lambda)$ to be Fredholm it is necessary that $|\varphi(\xi)| = O(\log|\xi|)$ as $|\xi| \to \infty$. Because in our case $|\varphi(\xi)|$ increases as $|\xi|^2$, it follows that $W(\sigma) - \lambda I = W(\sigma - \lambda)$ cannot be Fredholm. Thus, $\lambda \in \operatorname{sp}_{\mathrm{ess}} W(\sigma)$. Finally, in [7, Theorem 1.1], it is shown that $\|W_\tau(\sigma)\|_\infty < 1$. We therefore arrive at the conclusion that $\operatorname{sp} W_\tau(\sigma) \subset \mathbb{D}$. □

As $W_\tau(\sigma)$ is not a normal operator, one could ask whether we should rather study the $\varepsilon$-pseudospectrum

$$\operatorname{sp}_\varepsilon W_\tau(\sigma) := \{\lambda \in \mathbb{C} : 1/\varepsilon \le \|(W_\tau(\sigma) - \lambda I)^{-1}\|_\infty \le \infty\}$$

than the spectrum $\operatorname{sp} W_\tau(\sigma)$. See [18]. It is known that, for each $\varepsilon > 0$, the sets $\operatorname{sp}_\varepsilon W_\tau(a)$ converge to $\operatorname{sp}_\varepsilon W(a)$ as $\tau \to \infty$ in the Hausdorff metric if $a$ is piecewise continuous [6]. The symbol $\sigma(\xi) = e^{-i\xi^2/4}$ is not piecewise continuous, but fortunately things are simple. Here is the result.

Proposition 2.4. Given $\varepsilon > 0$, there is a $\tau_0 = \tau_0(\varepsilon)$ such that $\overline{\mathbb{D}} \subset \operatorname{sp}_\varepsilon W_\tau(\sigma)$ for all $\tau > \tau_0$.

Proof. Pick $\lambda \in \overline{\mathbb{D}}$. The operator $W_\tau(\sigma - \lambda)$ and its adjoint $W_\tau(\overline{\sigma - \lambda})$ converge strongly to $W(\sigma - \lambda)$ and this operator's adjoint $W(\overline{\sigma - \lambda})$. Thus, were the norms $\|W_{\tau_n}^{-1}(\sigma - \lambda)\|_\infty$ uniformly bounded for some sequence $\tau_n \to \infty$, $W(\sigma - \lambda)$ would be invertible. As the latter is not the case due to Proposition 2.3, we conclude that $\|W_\tau^{-1}(\sigma - \lambda)\|_\infty \to \infty$ for each $\lambda \in \overline{\mathbb{D}}$. This together with the compactness of $\overline{\mathbb{D}}$ implies that for every $\varepsilon > 0$ there is a $\tau_0(\varepsilon)$ such that $\|W_\tau^{-1}(\sigma - \lambda)\|_\infty \ge 1/\varepsilon$ for all $\tau > \tau_0(\varepsilon)$ and all $\lambda \in \overline{\mathbb{D}}$. □

Proposition 2.4 is equivalent to saying that given $\varepsilon > 0$ and $\lambda \in \overline{\mathbb{D}}$, there exists a number $\tau_0 = \tau_0(\varepsilon)$ such that for every $\tau > \tau_0$ we can find $f \in L^2(0,\tau)$ satisfying $\|f\| = 1$ and $\|W_\tau(\sigma)f - \lambda f\| \le \varepsilon$. This is in the spirit of Landau's result [15]. He took $\lambda$ from $\mathbb{T}$ only but showed much more, namely that for $\tau > \tau_0$ there are at least $1000\tau$ orthonormal functions $f$ in $L^2(0,\tau)$ such that $\|W_\tau(\sigma)f - \lambda f\| \le \varepsilon$.


3. An oscillatory multivariate integral

In this section we prove Theorem 1.1.

Lemma 3.1. For every natural number $k \ge 1$, the operators $F_\omega^k$ as well as the operators $W_\tau^k(\sigma)$ and $W_\tau(\sigma^k)$ generated by $\sigma(\xi) := e^{-i\xi^2/4}$ are of trace class.

Proof. Since $F_\omega$ is unitarily similar to $W_{2\sqrt{\omega}}(\sigma)$, it suffices to prove that $W_\tau(\sigma^k)$ is in the trace class. The operator $W_\tau(\sigma^k)$ acts by the rule

$$(W_\tau(\sigma^k)f)(x) = \int_0^\tau \ell_k(x-y) f(y)\,dy, \quad x \in (0,\tau), \tag{10}$$

where

$$\ell_k(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \sigma^k(\xi) e^{-i\xi t}\,d\xi = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ik\xi^2/4 - i\xi t}\,d\xi = \frac{1}{\sqrt{\pi i k}}\, e^{it^2/k}.$$

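Thus, $W_\tau(\sigma^k)$ is an integral operator over a finite interval with a smooth kernel. From [14, III.10.3] we therefore deduce that $W_\tau(\sigma^k) \in \mathcal{C}_1$.

An alternative proof is as follows. Let $\ell_{k,\tau}$ be a $C^2$ function on $\mathbb{R}$ which coincides with $\ell_k$ on $(-\tau,\tau)$ and is identically zero outside $(-2\tau, 2\tau)$. As (10) does not depend on the values of $\ell_k$ outside $(-\tau,\tau)$, we have $W_\tau(\sigma^k) = W_\tau(\hat{\ell}_{k,\tau})$. The function $\hat{\ell}_{k,\tau}$ is in $L^\infty(\mathbb{R})$ because $\ell_{k,\tau} \in L^1(\mathbb{R})$, and twice integrating the integral

$$\hat{\ell}_{k,\tau}(\xi) = \int_{-2\tau}^{2\tau} \ell_{k,\tau}(t) e^{i\xi t}\,dt$$

by parts, we obtain

$$\hat{\ell}_{k,\tau}(\xi) = \frac{1}{(i\xi)^2} \int_{-2\tau}^{2\tau} \ell''_{k,\tau}(t) e^{i\xi t}\,dt$$

for $\xi \ne 0$, which shows that $\hat{\ell}_{k,\tau} \in L^1(\mathbb{R})$. In the beginning of Section 2 we noticed that symbols in $L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$ generate truncated Wiener–Hopf operators in the trace class. Hence $W_\tau(\hat{\ell}_{k,\tau}) \in \mathcal{C}_1$. □

We have

$$(F_\omega^k f)(x) = \int_{-1}^{1} m_k(x,y) f(y)\,dy, \quad x \in (-1,1),$$

where

$$m_1(x,y) = \sqrt{\frac{\omega}{\pi i}}\, e^{i\omega(x-y)^2}, \qquad m_2(x,y) = \left( \sqrt{\frac{\omega}{\pi i}} \right)^{2} \int_{-1}^{1} e^{i\omega(x-z)^2} e^{i\omega(z-y)^2}\,dz,$$

$$m_3(x,y) = \left( \sqrt{\frac{\omega}{\pi i}} \right)^{3} \int_{-1}^{1} \int_{-1}^{1} e^{i\omega(x-z)^2} e^{i\omega(z-w)^2} e^{i\omega(w-y)^2}\,dz\,dw,$$

and so on. Since $F_\omega^k$ is of trace class by Lemma 3.1 and $m_k$ is continuous on $[-1,1]^2$, it follows that

$$\operatorname{tr} F_\omega^k = \int_{-1}^{1} m_k(x,x)\,dx;$$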


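see [14, Corollary III.10.2]. Consequently,

$$\operatorname{tr} F_\omega^k = \left( \sqrt{\frac{\omega}{\pi i}} \right)^{k} I_k, \tag{11}$$

where

$$I_k := \int_{-1}^{1} \cdots \int_{-1}^{1} \exp\left( i\omega \sum_{j=1}^{k} (x_j - x_{j+1})^2 \right) dx_1 \ldots dx_k \tag{12}$$

with $x_{k+1} := x_1$. By virtue of some lucky circumstances, it is not difficult to compute $I_k$ straightforwardly for $k \le 4$. Trivially, $I_1 = 2$. Letting

$$\operatorname{erf}(z) := \frac{2}{\sqrt{\pi}} \int_0^z e^{-\zeta^2}\,d\zeta, \quad z \in \mathbb{C},$$

one almost immediately gets

$$I_2 = \frac{2}{\sqrt{2}} \sqrt{\frac{\pi i}{\omega}}\, \operatorname{erf}\left( \sqrt{\frac{8\omega}{i}} \right) + \frac{i\,e^{8i\omega}}{2\omega} - \frac{i}{2\omega} = \frac{2}{\sqrt{2}} \sqrt{\frac{\pi i}{\omega}} - \frac{i}{2\omega} + O\left( \frac{1}{\omega^2} \right) = \frac{2}{\sqrt{2}} \sqrt{\frac{\pi i}{\omega}} - \frac{1}{2\pi} \left( \sqrt{\frac{\pi i}{\omega}} \right)^{2} + O\left( \frac{1}{\omega^2} \right),$$

while with a little more labour, one obtains

$$I_3 = \frac{1}{\omega^{3/2}} \frac{\pi i}{\sqrt{3}} \int_0^{\sqrt{\omega}} \operatorname{erf}\left( \sqrt{\frac{6}{i}}\, y \right) \left[ \operatorname{erf}\left( \sqrt{\frac{2}{i}}\, y \right) + \operatorname{erf}\left( \sqrt{\frac{2}{i}}\, (2\sqrt{\omega} - y) \right) \right] dy = \frac{2}{\sqrt{3}} \left( \sqrt{\frac{\pi i}{\omega}} \right)^{2} - \frac{1}{\pi\sqrt{2}} \left( \sqrt{\frac{\pi i}{\omega}} \right)^{3} + o\left( \frac{1}{\omega^{3/2}} \right)$$

and

$$I_4 = \frac{1}{\omega^{2}} \frac{(\pi i)^{3/2}}{4} \int_0^{\sqrt{\omega}} \operatorname{erf}\left( \sqrt{\frac{4}{i}}\, y \right) \left[ \operatorname{erf}\left( \sqrt{\frac{2}{i}}\, y \right) + \operatorname{erf}\left( \sqrt{\frac{2}{i}}\, (2\sqrt{\omega} - y) \right) \right]^{2} dy = \frac{2}{\sqrt{4}} \left( \sqrt{\frac{\pi i}{\omega}} \right)^{3} + O\left( \frac{1}{\omega^{2}} \right).$$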
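The two-term expansion of $I_2$ can be checked numerically without any special functions. The sketch below rests on the elementary reduction $I_2 = 2\int_0^2 (2-t)\,e^{2i\omega t^2}\,dt$ (substitute $t = x_1 - x_2$ in (12) for $k = 2$), which is an observation added for this illustration, and it uses a plain composite trapezoidal rule; the function names and the step count are assumptions, not part of the paper.

```python
import cmath
import math

# The paper's convention: sqrt(i) stands for e^{i*pi/4}.
SQRT_I = cmath.exp(1j * math.pi / 4)


def i2_numeric(omega: float, n: int = 200_000) -> complex:
    """Trapezoidal value of I_2 = 2 * int_0^2 (2 - t) exp(2 i omega t^2) dt."""
    h = 2.0 / n
    # Half-weighted endpoints: the integrand is 2 at t = 0 and 0 at t = 2.
    total = 0.5 * 2.0 + 0.0j
    for j in range(1, n):
        t = j * h
        total += (2.0 - t) * cmath.exp(2j * omega * t * t)
    return 2.0 * h * total


def i2_asymptotic(omega: float) -> complex:
    """Two-term expansion (2/sqrt 2) r - r^2 / (2 pi) with r = sqrt(pi i / omega)."""
    r = math.sqrt(math.pi / omega) * SQRT_I
    return (2.0 / math.sqrt(2.0)) * r - r * r / (2.0 * math.pi)


if __name__ == "__main__":
    for omega in (10.0, 50.0, 200.0):
        print(omega, abs(i2_numeric(omega) - i2_asymptotic(omega)))
```

The printed differences should shrink roughly like $1/\omega^2$, in line with the stated error term; the step count must be increased for much larger $\omega$ so that the oscillations stay resolved.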

However, to tackle the general case we have to proceed differently.

Theorem 3.2. As $\omega \to \infty$,

$$I_k = \frac{2}{\sqrt{k}} \left( \sqrt{\frac{\pi i}{\omega}} \right)^{k-1} (1 + o(1)).$$

Proof. To establish the pattern for general $k$, we first consider the case $k = 3$. The oscillator function in (12) is

$$g(x_1, x_2, x_3) := (x_1 - x_2)^2 + (x_2 - x_3)^2 + (x_3 - x_1)^2,$$

and its stationary points are on the straight line $x_1 = x_2 = x_3$. We make the change of variables

$$t = x_1 - x_2, \quad u = x_2 - x_3, \quad v = x_1 + x_2 + x_3$$

in (12). The determinant of the Jacobian is $1/3$, hence

$$I_3 = \frac{1}{3} \int_\Delta \exp[i\omega(t^2 + u^2 + (t+u)^2)]\,dt\,du\,dv,$$

where $\Delta$ is some polytope containing the origin in its interior. The new oscillator function $h(t,u,v) = t^2 + u^2 + (t+u)^2$ is independent of $v$, and as a function of $t$ and $u$ only, it has the single stationary point $t = u = 0$. The Hessian for $h$, again thought of as a function of solely $t$ and $u$, is

$$\begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix}.$$

This is a positive definite matrix, and therefore $h$ can be written as $r^2 + s^2$ in suitable coordinates $r$ and $s$. To find the new coordinates, we try the ansatz

$$r = at + bu, \quad s = cu. \tag{13}$$

The equation

$$(at + bu)^2 + (cu)^2 = t^2 + u^2 + (t+u)^2$$

is satisfied for

$$a = \sqrt{2}, \quad b = \frac{1}{\sqrt{2}}, \quad c = \frac{\sqrt{3}}{\sqrt{2}}.$$

The Jacobi determinant of the substitution (13) with these coefficients equals $1/(ac) = 1/\sqrt{3}$. Consequently,

$$I_3 = \frac{1}{3} \frac{1}{\sqrt{3}} \int_\Omega \exp[i\omega(r^2 + s^2)]\,dv\,dr\,ds,$$

where $\Omega$ is again a polytope with the origin in its interior. Integrating over $v$ we get

$$I_3 = \frac{1}{3} \frac{1}{\sqrt{3}} \int_{\Omega_1} \left( \int_{v_1(r,s)}^{v_2(r,s)} \exp[i\omega(r^2 + s^2)]\,dv \right) dr\,ds = \frac{1}{3} \frac{1}{\sqrt{3}} \int_{\Omega_1} V(r,s) \exp[i\omega(r^2 + s^2)]\,dr\,ds$$

with $V(r,s) := v_2(r,s) - v_1(r,s)$ and some (planar) polytope $\Omega_1$ with the origin in its interior. For $r = s = 0$, the variable $v$ ranges from $-3$ to $3$. Hence $V(0,0) = 6$. The stationary phase formula

$$\int_{-\alpha}^{\beta} f(x) e^{i\omega x^2}\,dx = f(0) \sqrt{\frac{\pi i}{\omega}}\,(1 + o(1))$$


can now be applied independently for $r$ and $s$. The outcome is

$$I_3 = \frac{1}{3} \frac{1}{\sqrt{3}}\, 6 \left( \sqrt{\frac{\pi i}{\omega}} \right)^{2} (1 + o(1)) = \frac{2}{\sqrt{3}} \left( \sqrt{\frac{\pi i}{\omega}} \right)^{2} (1 + o(1)).$$

The pattern in the general case is now obvious. Substituting

$$t_j = x_j - x_{j+1} \quad (1 \le j \le k-1), \qquad t_k = x_1 + x_2 + \cdots + x_k$$

in (12), we get

$$I_k = \frac{1}{k} \int_\Delta \exp[i\omega h(t_1, \ldots, t_{k-1})]\,dt_1 \ldots dt_k$$

with

$$h(t_1, \ldots, t_{k-1}) = \sum_{j=1}^{k-1} t_j^2 + \left( \sum_{j=1}^{k-1} t_j \right)^{2}.$$

The Hessian of this function is the $(k-1) \times (k-1)$ matrix

$$H := \begin{pmatrix} 4 & 2 & 2 & \ldots \\ 2 & 4 & 2 & \ldots \\ 2 & 2 & 4 & \ldots \\ \ldots & \ldots & \ldots & \ldots \end{pmatrix}.$$

The determinant of the $m \times m$ matrix constituted by the first $m$ rows and columns of $H$ is $2^m(m+1)$. Thus, by Sylvester's theorem, $H$ is positive definite. We look for a change of variables

$$\begin{pmatrix} s_1 \\ s_2 \\ \ldots \\ s_{k-1} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1,k-1} \\ 0 & a_{22} & \ldots & a_{2,k-1} \\ \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & \ldots & a_{k-1,k-1} \end{pmatrix} \begin{pmatrix} t_1 \\ t_2 \\ \ldots \\ t_{k-1} \end{pmatrix}$$

such that $s_1^2 + s_2^2 + \cdots + s_{k-1}^2 = h(t_1, t_2, \ldots, t_{k-1})$. It is easily seen that such a change of variables can be found with

$$a_{11} = \sqrt{2}, \quad a_{22} = \sqrt{\frac{3}{2}}, \quad \ldots, \quad a_{k-1,k-1} = \sqrt{\frac{k}{k-1}}.$$

The Jacobi determinant equals $1/(a_{11} a_{22} \ldots a_{k-1,k-1}) = 1/\sqrt{k}$. We so arrive at the representation

$$I_k = \frac{1}{k} \frac{1}{\sqrt{k}} \int_\Omega \exp[i\omega(s_1^2 + \cdots + s_{k-1}^2)]\,dt_k\,ds_1 \ldots ds_{k-1} = \frac{1}{k} \frac{1}{\sqrt{k}} \int_{\Omega_1} V(s_1, \ldots, s_{k-1}) \exp[i\omega(s_1^2 + \cdots + s_{k-1}^2)]\,ds_1 \ldots ds_{k-1}.$$
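The upper triangular change of variables is a Cholesky factorization in disguise: since $h(t) = t^{\top} M t$ with $M = H/2$ (entries 2 on the diagonal and 1 elsewhere), one needs an upper triangular $A$ with $A^{\top} A = M$. The following Python sketch, with hypothetical helper names and a plain unpivoted Cholesky routine written for this illustration, computes $A$ and lets one compare its diagonal with the values $\sqrt{(j+1)/j}$ stated above.

```python
import math


def gram_matrix(k: int) -> list:
    """(k-1) x (k-1) matrix M = H/2: entries 2 on the diagonal, 1 elsewhere."""
    m = k - 1
    return [[2.0 if i == j else 1.0 for j in range(m)] for i in range(m)]


def cholesky_upper(M: list) -> list:
    """Upper triangular A with A^T A = M (no pivoting; M positive definite)."""
    m = len(M)
    A = [[0.0] * m for _ in range(m)]
    for i in range(m):
        # Diagonal entry from the i-th leading block.
        pivot = M[i][i] - sum(A[p][i] ** 2 for p in range(i))
        A[i][i] = math.sqrt(pivot)
        # Remaining entries of row i.
        for j in range(i + 1, m):
            A[i][j] = (M[i][j] - sum(A[p][i] * A[p][j] for p in range(i))) / A[i][i]
    return A


if __name__ == "__main__":
    k = 6
    A = cholesky_upper(gram_matrix(k))
    diagonal = [A[j][j] for j in range(k - 1)]
    print(diagonal)                               # compare with sqrt((j+1)/j), j = 1..k-1
    print(math.prod(diagonal), math.sqrt(k))      # product of the diagonal versus sqrt(k)
```

For $k = 3$ the routine returns the coefficients $a = \sqrt{2}$, $b = 1/\sqrt{2}$, $c = \sqrt{3}/\sqrt{2}$ of the ansatz (13).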

As $V(0, \ldots, 0) = 2k$, the usual stationary phase formula argument yields

$$I_k = \frac{1}{k} \frac{1}{\sqrt{k}}\, 2k \left( \sqrt{\frac{\pi i}{\omega}} \right)^{k-1} (1 + o(1)) = \frac{2}{\sqrt{k}} \left( \sqrt{\frac{\pi i}{\omega}} \right)^{k-1} (1 + o(1)),$$

as desired. □
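The one-dimensional stationary phase formula used in the proof can be observed numerically. The sketch below is a rough illustration only: it assumes the test amplitude $f = \cos$ on $[-1, 1]$, a composite trapezoidal rule with a step count chosen to resolve the oscillations, and hypothetical function names; none of these choices come from the paper.

```python
import cmath
import math

# The paper's convention: sqrt(i) stands for e^{i*pi/4}.
SQRT_I = cmath.exp(1j * math.pi / 4)


def oscillatory_integral(f, omega: float, alpha: float, beta: float,
                         n: int = 400_000) -> complex:
    """Trapezoidal approximation of int_{-alpha}^{beta} f(x) exp(i omega x^2) dx."""
    h = (alpha + beta) / n
    total = 0.5 * (f(-alpha) * cmath.exp(1j * omega * alpha * alpha)
                   + f(beta) * cmath.exp(1j * omega * beta * beta))
    for j in range(1, n):
        x = -alpha + j * h
        total += f(x) * cmath.exp(1j * omega * x * x)
    return h * total


def stationary_phase_leading(f, omega: float) -> complex:
    """Leading term f(0) * sqrt(pi i / omega)."""
    return f(0.0) * math.sqrt(math.pi / omega) * SQRT_I


if __name__ == "__main__":
    for omega in (200.0, 2000.0):
        exact = oscillatory_integral(math.cos, omega, 1.0, 1.0)
        lead = stationary_phase_leading(math.cos, omega)
        print(omega, abs(exact - lead) / abs(lead))
```

The relative error is dominated by the endpoint contributions, which are of order $1/\omega$ against a leading term of order $1/\sqrt{\omega}$, so it should decay like $1/\sqrt{\omega}$.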

Lemma 3.1 in conjunction with (11) and Theorem 3.2 proves Theorem 1.1. As already said, (5) is equivalent to (1). Thus, the proof of Theorem 1.3 is also complete at this point.

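4. The logarithmic spiral ansatz

We now prove Theorem 1.2. Letting $b_\omega$ be as in that theorem, we have

$$\int_0^\infty b_\omega^k(x)\,dx = \int_0^\infty e^{-[\alpha(\omega) + i\beta(\omega)] k x^\nu}\,dx,$$

and hence (3) is true for some $k$ if and only if

$$\int_0^\infty e^{-[\alpha(\omega) + i\beta(\omega)] k x^\nu}\,dx = 2k^{-1/2} \sqrt{\frac{\omega}{\pi i}}\,(1 + o(1)), \tag{14}$$

or equivalently, after substituting $k x^\nu \to x^\nu$,

$$k^{-1/\nu} \int_0^\infty e^{-[\alpha(\omega) + i\beta(\omega)] x^\nu}\,dx = 2k^{-1/2} \sqrt{\frac{\omega}{\pi i}}\,(1 + o(1)). \tag{15}$$

Taking (14) for $k = 1$, we obtain

$$\int_0^\infty e^{-[\alpha(\omega) + i\beta(\omega)] x^\nu}\,dx = 2 \sqrt{\frac{\omega}{\pi i}}\,(1 + o(1)),$$

whereas (15) for $k = 2$ states that

$$\int_0^\infty e^{-[\alpha(\omega) + i\beta(\omega)] x^\nu}\,dx = 2^{1 + 1/\nu - 1/2} \sqrt{\frac{\omega}{\pi i}}\,(1 + o(1)).$$

Comparing the last two formulas, we arrive at the conclusion that if (14) holds for $k = 1$ and $k = 2$, then necessarily $\nu = 2$. Now consider (14) with $\nu = 2$. Computing the integral, we obtain that (14) is equivalent to the statement that

$$\frac{1}{2} \sqrt{\frac{\pi}{\alpha(\omega) + i\beta(\omega)}}\, \frac{1}{\sqrt{k}} = \frac{2}{\sqrt{k}} \sqrt{\frac{\omega}{\pi i}}\,(1 + o(1)),$$

which holds if and only if

$$\alpha(\omega) + i\beta(\omega) = \frac{\pi^2 i}{16\omega}\,(1 + o(1)). \tag{16}$$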


Writing $o(1) = o(1) + i\,o(1)$ with two real $o(1)$ on the right, we arrive at the conclusion that (16) is valid if and only if

$$\alpha(\omega) = o\left( \frac{1}{\omega} \right), \quad \beta(\omega) = \frac{\pi^2}{16\omega} + o\left( \frac{1}{\omega} \right).$$

This completes the proof of Theorem 1.2.
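The Gaussian integral behind (16) can be evaluated in closed form, which makes the conclusion easy to test. In the Python sketch below the function names are hypothetical, the integral $\int_0^\infty e^{-(\alpha + i\beta)kx^2}\,dx$ is replaced by its closed form $\tfrac{1}{2}\sqrt{\pi/((\alpha + i\beta)k)}$, and the sample choice $\alpha(\omega) = \omega^{-3/2}$ is merely one admissible example of a positive $\alpha(\omega) = o(1/\omega)$.

```python
import cmath
import math


def gaussian_side(alpha: float, beta: float, k: int) -> complex:
    """Closed form of int_0^inf exp(-(alpha + i beta) k x^2) dx (principal root)."""
    return 0.5 * cmath.sqrt(math.pi / ((alpha + 1j * beta) * k))


def trace_side(omega: float, k: int) -> complex:
    """Leading term 2 sqrt(omega) / sqrt(pi i k) on the right of (3)."""
    return 2.0 * math.sqrt(omega) / cmath.sqrt(math.pi * 1j * k)


if __name__ == "__main__":
    for omega in (1e2, 1e4, 1e6):
        beta = math.pi ** 2 / (16.0 * omega)
        alpha = omega ** -1.5  # assumed sample with alpha = o(1/omega)
        for k in (1, 2, 3):
            ratio = gaussian_side(alpha, beta, k) / trace_side(omega, k)
            print(omega, k, abs(ratio - 1.0))
```

The ratio tends to 1 as $\omega$ grows, uniformly in $k$, while any $\alpha(\omega)$ comparable to $1/\omega$ would leave a fixed discrepancy.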

5. Symbols with a Fox–Li discontinuity

This section is devoted to the proof of Theorem 1.5.

Lemma 5.1.
(a) Let $A_\tau$ and $B_\tau$ be operators on $L^2(0,\tau)$. If $\|A_\tau\|_1 = o(\tau)$ and $\|B_\tau\|_\infty = O(1)$, then $\|A_\tau B_\tau\|_1 = o(\tau)$ and $\|B_\tau A_\tau\|_1 = o(\tau)$.
(b) If $b, d \in L^\infty(\mathbb{R})$ and $H(b), H(\tilde{b}) \in \mathcal{C}_1$, then

$$\|W_\tau(b)W_\tau(d) - W_\tau(bd)\|_1 = o(\tau), \qquad \|W_\tau(d)W_\tau(b) - W_\tau(bd)\|_1 = o(\tau).$$

(c) If $b \in L^\infty(\mathbb{R})$ and $H(b) \in \mathcal{C}_1$, then $H(b^k) \in \mathcal{C}_1$ for every natural number $k \ge 1$.

Proof. Part (a) follows from the inequalities

$$\|A_\tau B_\tau\|_1 \le \|A_\tau\|_1 \|B_\tau\|_\infty, \qquad \|B_\tau A_\tau\|_1 \le \|B_\tau\|_\infty \|A_\tau\|_1.$$

To prove (b) note that, by (8),

$$W_\tau(b)W_\tau(d) - W_\tau(bd) = -P_\tau H(b)H(\tilde{d})P_\tau - R_\tau H(\tilde{b})H(d)R_\tau$$

and that

$$\|P_\tau H(b)H(\tilde{d})P_\tau\|_1 \le \|P_\tau\|_\infty \|H(b)\|_1 \|H(\tilde{d})P_\tau\|_\infty = O(1) = o(\tau),$$

$$\|R_\tau H(\tilde{b})H(d)R_\tau\|_1 \le \|R_\tau\|_\infty \|H(\tilde{b})\|_1 \|H(d)R_\tau\|_\infty = O(1) = o(\tau).$$

Finally, part (c) is obviously true for $k = 1$. So suppose that $H(b^k) \in \mathcal{C}_1$ for some $k \ge 1$. The identity $H(b^{k+1}) = H(b^k)W(\tilde{b}) + W(b^k)H(b)$, which is the continuous analogue of formula (2.19) in [10], shows that then $H(b^{k+1})$ is also in $\mathcal{C}_1$. □

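Proposition 5.2. Let $a(\xi) = c(\xi)\sigma(\xi)$ where $\sigma(\xi) := e^{-i\xi^2/4}$ and $c$ is a function in $C(\dot{\mathbb{R}})$ such that $H((c - c(\infty))\sigma^\nu) \in \mathcal{C}_1$ and $H((\tilde{c} - c(\infty))\sigma^\nu) \in \mathcal{C}_1$ for every integer $\nu \ge 0$. Then the operators $W_\tau^k(a)$ and $W_\tau(a^k)$ are of trace class for every natural number $k \ge 1$ and

$$\operatorname{tr} W_\tau^k(a) = \operatorname{tr} W_\tau(a^k) + o(\tau) = \frac{\tau}{2\pi} \int_{-\infty}^{\infty} a^k(\xi)\,d\xi + o(\tau). \tag{17}$$

Proof. Again by Widom's formula (8),

$$W_\tau(a^k) = W_\tau(c^k)W_\tau(\sigma^k) + P_\tau H(c^k)H(\sigma^k)P_\tau + R_\tau H(\tilde{c}^k)H(\sigma^k)R_\tau;$$

notice that $\tilde{\sigma}(\xi) := \sigma(-\xi) = \sigma(\xi)$. Lemma 3.1 tells us that $W_\tau(\sigma^k)$ is in $\mathcal{C}_1$. Let $u := c - c(\infty)$. The Hankel operator induced by a constant function is the zero operator. Hence $H(c) = H(u)$ and $H(\tilde{c}) = H(\tilde{u})$. By our assumption, $H(u)$ and $H(\tilde{u})$ are in $\mathcal{C}_1$. From Lemma 5.1(c) we therefore deduce that the operators $H(c^k)$ and $H(\tilde{c}^k)$ are also in $\mathcal{C}_1$. This shows that $W_\tau(a^k) \in \mathcal{C}_1$ and thus also that $W_\tau^k(a) = [W_\tau(a^1)]^k \in \mathcal{C}_1$.

In what follows we write $A_\tau \equiv B_\tau$ if $\|A_\tau - B_\tau\|_1 = o(\tau)$. Recall that $u(\xi)$ is defined as $c(\xi) - c(\infty)$. Thus, $a = u\sigma + c(\infty)\sigma$. We claim that for each natural number $k \ge 1$ it is true that

$$W_\tau^k(a) \equiv W_\tau[(u\sigma + c(\infty)\sigma)^k - c(\infty)^k \sigma^k] + c(\infty)^k W_\tau^k(\sigma). \tag{18}$$

This is trivial for $k = 1$. So assume the claim is true for some $k \ge 1$. Then

$$W_\tau^{k+1}(a) \equiv \left[ W_\tau(u\sigma) + c(\infty)W_\tau(\sigma) \right] \times \left[ \sum_{j=1}^{k} \binom{k}{j} c(\infty)^{k-j}\, W_\tau((u\sigma)^j \sigma^{k-j}) + c(\infty)^k W_\tau^k(\sigma) \right].$$

The operators $H(u\sigma^\nu)$ and $H(\tilde{u}\sigma^\nu)$ are in $\mathcal{C}_1$ for all natural numbers $\nu \ge 1$ by our assumption. From Lemma 5.1(b) we therefore obtain that

$$W_\tau(u\sigma)W_\tau((u\sigma)^j \sigma^{k-j}) \equiv W_\tau((u\sigma)(u\sigma)^j \sigma^{k-j}),$$

and using parts (a) and (b) of Lemma 5.1 we get

$$W_\tau(\sigma)W_\tau((u\sigma)^j \sigma^{k-j}) \equiv W_\tau(\sigma)W_\tau(u\sigma)W_\tau((u\sigma)^{j-1} \sigma^{k-j}) \equiv W_\tau(u\sigma^2)W_\tau((u\sigma)^{j-1} \sigma^{k-j}) \equiv W_\tau(\sigma(u\sigma)^j \sigma^{k-j})$$

and

$$W_\tau(u\sigma)W_\tau^k(\sigma) \equiv W_\tau(u\sigma^2)W_\tau^{k-1}(\sigma) \equiv W_\tau(u\sigma^3)W_\tau^{k-2}(\sigma) \equiv \cdots \equiv W_\tau(u\sigma^{k+1}).$$

Consequently,

$$W_\tau^{k+1}(a) \equiv W_\tau\left[ (u\sigma + c(\infty)\sigma) \sum_{j=1}^{k} \binom{k}{j} c(\infty)^{k-j} (u\sigma)^j \sigma^{k-j} \right] + W_\tau\left[ c(\infty)^k u\sigma^{k+1} \right] + c(\infty)^{k+1} W_\tau^{k+1}(\sigma),$$

and the sum of the symbols in the brackets on the right is

$$(u\sigma + c(\infty)\sigma)\left[ (u\sigma + c(\infty)\sigma)^k - c(\infty)^k \sigma^k \right] + c(\infty)^k u\sigma^{k+1} = (u\sigma + c(\infty)\sigma)^{k+1} - c(\infty)^{k+1}\sigma^{k+1}.$$

This proves our claim (18) for $k + 1$.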


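If $A_\tau \equiv B_\tau$, then $\operatorname{tr} A_\tau = \operatorname{tr} B_\tau + o(\tau)$. Since $u\sigma + c(\infty)\sigma = a$, the trace of the first term on the right of (18) equals

$$\frac{\tau}{2\pi} \int_{-\infty}^{\infty} \left( a^k(\xi) - c(\infty)^k \sigma^k(\xi) \right) d\xi,$$

and from Theorem 1.3 we know that the trace of the second term on the right of (18) is

$$\operatorname{tr}\left( c(\infty)^k W_\tau^k(\sigma) \right) = c(\infty)^k \operatorname{tr} W_\tau(\sigma^k) + o(\tau) = \frac{c(\infty)^k\,\tau}{2\pi} \int_{-\infty}^{\infty} \sigma^k(\xi)\,d\xi + o(\tau).$$

Adding the two results we arrive at (17). □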

The hypothesis of Proposition 5.2 stipulates that the Hankel operators $H(u\sigma^\nu)$ and $H(\tilde{u}\sigma^\nu)$ are in $\mathcal{C}_1$ for every integer $\nu \ge 0$. Peller showed that the two Hankel operators $H(b)$ and $H(\tilde{b})$ are of trace class if and only if $b$ is in the Besov space $B_1^1(\mathbb{R})$; see [17, p. 277]. Here is a simple sufficient condition for $H(b)$ and $H(\tilde{b})$ to be in the trace class.

Lemma 5.3. If $b \in C^3(\mathbb{R})$ and the functions $\xi^2 b(\xi)$, $\xi^2 b'(\xi)$, $\xi^2 b''(\xi)$, $b'''(\xi)$ belong to $L^1(\mathbb{R})$ and have zero limits as $\xi \to \pm\infty$, then $H(b)$ and $H(\tilde{b})$ are trace class operators.

Proof. Let

$$\ell(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} b(\xi) e^{-i\xi t}\,d\xi, \quad t \in \mathbb{R}.$$

Since $\xi b(\xi)$ and $\xi^2 b(\xi)$ are in $L^1(\mathbb{R})$, we may twice differentiate the integral to see that $\ell$ is in $C^2(\mathbb{R})$ and

$$\ell''(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} (-i\xi)^2 b(\xi) e^{-i\xi t}\,d\xi.$$

Using that $(\xi^2 b(\xi))' = 2\xi b(\xi) + \xi^2 b'(\xi)$ and $(\xi^2 b(\xi))'' = 2b(\xi) + 4\xi b'(\xi) + \xi^2 b''(\xi)$ are also in $L^1(\mathbb{R})$ and have zero limits at infinity, we may twice integrate by parts to obtain that

$$\ell''(t) = \frac{1}{2\pi} \frac{(-i)^2}{(it)^2} \int_{-\infty}^{\infty} (\xi^2 b(\xi))'' e^{-i\xi t}\,d\xi,$$

which shows that

$$\int_{-\infty}^{\infty} |t|\,|\ell''(t)|^2\,dt < \infty. \tag{19}$$

As $b'$, $b''$, $b'''$ are in $L^1$ and have zero limits at infinity, we have

$$\ell(t) = \frac{1}{2\pi} \frac{1}{(it)^3} \int_{-\infty}^{\infty} b'''(\xi) e^{-i\xi t}\,d\xi$$

and hence

$$\int_{-\infty}^{\infty} |t|^4\,|\ell(t)|^2\,dt < \infty. \tag{20}$$


Basor and Widom [1, p. 398] showed that $H(b)$ and $H(\tilde{b})$ are of trace class if (19) and (20) hold. □

Corollary 5.4. If $c$ is as in Theorem 1.5, then the Hankel operators $H(u\sigma^\nu)$ and $H(\tilde{u}\sigma^\nu)$ are in $\mathcal{C}_1$ for every real number $\nu$.

Proof. The function $b := u\sigma^\nu$ satisfies the hypothesis of Lemma 5.3. □

Combining Corollary 5.4 and Proposition 5.2, we arrive at Theorem 1.5.

References

[1] E. Basor and H. Widom, Toeplitz and Wiener–Hopf determinants with piecewise continuous symbols. J. Funct. Analysis 50 (1983), 387–413.
[2] M. Berry, Fractal modes of unstable lasers with polygonal and circular mirrors. Optics Comm. 200 (2001), 321–330.
[3] M. Berry, Mode degeneracies and the Petermann excess-noise factor for unstable lasers. J. Modern Optics 50 (2003), 63–81.
[4] M. Berry, C. Storm, and W. van Saarloos, Theory of unstable laser modes: edge waves and fractality. Optics Comm. 197 (2001), 393–402.
[5] A. Böttcher, On Toeplitz operators generated by symbols with three essential cluster points. Preprint P-Math-04/86, Karl-Weierstrass-Institut, Berlin 1986.
[6] A. Böttcher, Pseudospectra and singular values of large convolution operators. J. Integral Equations Appl. 6 (1994), 267–301.
[7] A. Böttcher, H. Brunner, A. Iserles, and S. Nørsett, On the singular values and eigenvalues of the Fox–Li and related operators. New York J. Math. 16 (2010), 539–561.
[8] A. Böttcher and S. Grudsky, Toeplitz operators with discontinuous symbols: phenomena beyond piecewise continuity. Operator Theory: Adv. Appl. 90 (1996), 55–118.
[9] A. Böttcher and B. Silbermann, Introduction to Large Truncated Toeplitz Matrices. Springer-Verlag, New York, 1999.
[10] A. Böttcher and B. Silbermann, Analysis of Toeplitz Operators. 2nd edition, Springer-Verlag, Berlin, Heidelberg, New York, 2006.
[11] H. Brunner, A. Iserles, and S.P. Nørsett, The computation of the spectra of highly oscillatory Fredholm integral operators. J. Integral Equations Appl. To appear.
[12] J.A. Cochran and E.W. Hinds, Eigensystems associated with the complex-symmetric kernels of laser theory. SIAM J. Appl. Math. 26 (1974), 776–786.
[13] A.G. Fox and T. Li, Resonance modes in a maser interferometer. Bell Systems Tech. J. 40 (1961), 453–488.
[14] I. Gohberg and M.G. Krein, Introduction to the Theory of Linear Nonselfadjoint Operators. Transl. Math. Monographs, Vol. 18, Amer. Math. Soc., Providence, RI, 1969.
[15] H. Landau, The notion of approximate eigenvalues applied to an integral equation of laser theory. Quart. Appl. Math. 35 (1977/78), 165–172.
[16] H. Landau and H. Widom, Eigenvalue distribution of time and frequency limiting. J. Math. Analysis Appl. 77 (1980), 469–481.
[17] V.V. Peller, Hankel Operators and Their Applications. Springer-Verlag, New York, Berlin, Heidelberg, 2003.
[18] L.N. Trefethen and M. Embree, Spectra and Pseudospectra: The Behavior of Nonnormal Matrices and Operators. Princeton University Press, Princeton, NJ, 2005.
[19] L.A. Vainshtein, Open resonators for lasers. Soviet Physics JETP 40 (1963), 709–719.
[20] H. Widom, On a class of integral operators with discontinuous symbol. Operator Theory: Adv. Appl. 4 (1982), 477–500.

Albrecht Böttcher
Fakultät für Mathematik
Technische Universität Chemnitz
D-09107 Chemnitz, Germany
e-mail: [email protected]

Sergei Grudsky
CINVESTAV del I.P.N.
Departamento de Matemáticas
Apartado Postal 14-740
07000 Ciudad de México, México
e-mail: [email protected]

Daan Huybrechs
Departement Computerwetenschappen
Katholieke Universiteit Leuven
Celestijnenlaan 200A
B-3001 Leuven, Belgium
e-mail: [email protected]

Arieh Iserles
Department of Applied Mathematics and Theoretical Physics
Centre for Mathematical Sciences
University of Cambridge
Cambridge CB3 0WA, United Kingdom
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 225–239
© 2012 Springer Basel AG

Factorization Versus Invertibility of Matrix Functions on Compact Abelian Groups

Alex Brudnyi, Leiba Rodman and Ilya M. Spitkovsky

Dedicated to the memory of Israel Gohberg

Abstract. Open problems are stated and some new results are proved concerning the relationship between invertibility and factorization in various Banach algebras of matrix-valued functions on connected compact abelian groups.

Mathematics Subject Classification (2000). 47A56, 47A68.

Keywords. Compact abelian groups, function algebras, factorization of Wiener-Hopf type.

1. Introduction

Factorizations of Wiener-Hopf type have been widely recognized and studied as an important mathematical tool. The concept of factorization formed originally within the theory of systems of singular integral equations and boundary value problems, see the monographs [14, 7, 34, 28], for example. In the early development, the influential paper [27] played a major role. Since then, factorizations of Wiener-Hopf type have been studied in various contexts, in particular, in the state space method [4, 3]. Another direction is factorization of matrix functions on connected compact abelian groups, a topic that has been studied in [35, 36, 20, 8, 9, 42]. Besides providing a unified framework for Wiener-Hopf factorizations of various types, such as relative to the unit circle or torus [19], and relative to the real line or to $\mathbb{R}^k$ (almost periodic factorization, which has been extensively studied in recent years), the abstract setting of connected compact abelian groups leads to new points of view, problems, and results. In this paper, we focus on some outstanding problems in this area.

Let $G$ be a (multiplicative) connected compact abelian group and let $\Gamma$ be its (additive) character group. Recall that $\Gamma$ consists of continuous homomorphisms of $G$ into the group $\mathbb{T}$ of unimodular complex numbers. Since $G$ is compact, $\Gamma$ is discrete (in the natural topology as a dual locally compact abelian group) [43, Theorem 1.2.5], and since $G$ is connected, $\Gamma$ is torsion free [43, Theorem 2.5.6]. By duality, $G$ is the character group of $\Gamma$. Note that the character group of every torsion free abelian group with the discrete topology is connected and compact [43, Theorems 1.2.5, 2.5.6].

It is well known [43] that, because $G$ is connected, $\Gamma$ can be made into a linearly ordered group. So let $\preceq$ be a fixed linear order such that $(\Gamma, \preceq)$ is an ordered group. Let $\Gamma_+ = \{x \in \Gamma : x \succeq 0\}$, $\Gamma_- = \{x \in \Gamma : x \preceq 0\}$. Standard widely used examples of $\Gamma$ are $\mathbb{Z}$ (the group of integers), $\mathbb{Q}$ (the group of rationals with the discrete topology), $\mathbb{R}$ (the group of reals with the discrete topology), and $\mathbb{Z}^k$, $\mathbb{R}^k$ with lexicographic or other ordering (where $k$ is a positive integer).

If $U$ is a unital ring, we denote by $U^{n\times n}$ the $n \times n$ matrix ring over $U$, and by $GL(U^{n\times n})$ the group of invertible elements of $U^{n\times n}$.

Let $C(G)$ be the unital Banach algebra of (complex-valued) continuous functions on $G$ (in the uniform topology), and let $P(G)$ be the (non-closed) subalgebra of $C(G)$ of all finite linear combinations of functions $\langle j, \cdot\rangle$, $j \in \Gamma$, where $\langle j, g\rangle$ stands for the action of the character $j \in \Gamma$ on the group element $g \in G$ (thus, $\langle j, g\rangle \in \mathbb{T}$). Note that $P(G)$ is dense in $C(G)$ (this fact is a corollary of the Stone-Weierstrass theorem). For
\[
a = \sum_{k=1}^{m} a_{j_k} \langle j_k, \cdot\rangle \in P(G), \qquad j_1, \dots, j_m \in \Gamma \text{ are distinct}; \quad a_{j_k} \neq 0, \ k = 1, 2, \dots, m,
\]
the Bohr-Fourier spectrum is defined as the finite set $\sigma(a) := \{j_1, \dots, j_m\}$. The notion of Bohr-Fourier spectrum is extended from functions in $P(G)$ to $C(G)$ by continuity; indeed, since the Bohr-Fourier coefficients are continuous in the uniform topology, we can use approximations of a given element in $C(G)$ by elements of $P(G)$. The Bohr-Fourier spectrum of $A = [a_{ij}]_{i,j=1}^{n} \in C(G)^{n\times n}$ is, by definition, the union of the Bohr-Fourier spectra of the $a_{ij}$'s. Note that the Bohr-Fourier spectra of elements of $C(G)$ are at most countable; a proof for the case $\Gamma = \mathbb{R}$ is found, for example, in [16, Theorem 1.15]; it can be easily extended to general connected compact abelian groups $G$.

We say that a unital Banach algebra $\mathcal{B} \subseteq C(G)$ is admissible if the following properties are satisfied:
(1) $P(G)$ is dense in $\mathcal{B}$;
(2) $\mathcal{B}$ is inverse closed (i.e., $X \in \mathcal{B} \cap GL(C(G))$ implies $X \in GL(\mathcal{B})$).

Important examples of admissible algebras are $C(G)$ itself and the Wiener algebra $W(G)$ that consists of all functions $a$ on $G$ of the form
\[
a(g) = \sum_{j\in\Gamma} a_j \langle j, g\rangle, \qquad g \in G, \tag{1.1}
\]
where $a_j \in \mathbb{C}$ and $\sum_{j\in\Gamma} |a_j| < \infty$. The norm in $W(G)$ is defined by
\[
\|a\|_1 = \sum_{j\in\Gamma} |a_j|.
\]
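As an illustration of the definitions above, consider the classical case $\Gamma = \mathbb{Z}$, $G = \mathbb{T}$, where $\langle j, g\rangle = g^j$. The two functions $b$ and $a$ below are chosen here purely as examples (they do not appear in the paper); the sketch only evaluates the Bohr-Fourier spectrum and the norm $\|\cdot\|_1$ for them.

```latex
% Illustrative example (assumed, not from the paper): \Gamma = \mathbb{Z}, G = \mathbb{T}.
% A trigonometric polynomial b \in P(\mathbb{T}):
\[
b(g) = 2 - g^{-1} + 3g^{2}, \qquad
\sigma(b) = \{-1,\, 0,\, 2\}, \qquad
\|b\|_1 = |2| + |-1| + |3| = 6 .
\]
% A function a \in W(\mathbb{T}) that is not a polynomial:
\[
a(g) = \sum_{j\in\mathbb{Z}} 2^{-|j|}\, g^{j}, \qquad
\sigma(a) = \mathbb{Z}, \qquad
\|a\|_1 = 1 + 2\sum_{j=1}^{\infty} 2^{-j} = 3 .
\]
```

In both cases the spectrum is at most countable, in line with the general remark on Bohr-Fourier spectra of elements of $C(G)$.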

The inverse closed property of $W(G)$ follows from the Bochner-Phillips theorem [6] (a generalization of the classical Wiener's theorem for the case when $G = \mathbb{T}$). Other examples of admissible algebras are weighted Wiener algebras. A function $\nu : \Gamma \to [1, \infty)$ is called a weight if $\nu(\gamma_1 + \gamma_2) \le \nu(\gamma_1)\nu(\gamma_2)$ for all $\gamma_1, \gamma_2 \in \Gamma$ and $\lim_{m\to\infty} m^{-1} \log(\nu(m\gamma)) = 0$ for every $\gamma \in \Gamma$. The weighted Wiener algebra $W_\nu(G)$ consists of all functions $a$ on $G$ of the form (1.1) where $\sum_{j\in\Gamma} \nu(j)|a_j| < \infty$, with the norm
\[
\|a\|_\nu = \sum_{j\in\Gamma} \nu(j)|a_j|.
\]
One verifies that $W_\nu(G)$ is indeed an inverse closed unital Banach algebra, see [2] for the inverse closedness property.

For an admissible algebra $\mathcal{B}$, we denote by $\mathcal{B}_\pm$ the closed unital subalgebra of $\mathcal{B}$ formed by elements of $\mathcal{B}$ with the Bohr-Fourier spectrum in $\Gamma_\pm$. Thus, $\mathcal{B}_\pm = \mathcal{B} \cap C(G)_\pm$. Also,
\[
GL(C(G)_\pm^{n\times n}) \cap \mathcal{B}^{n\times n} = GL(\mathcal{B}_\pm^{n\times n}). \tag{1.2}
\]

Next, we recall the concept of factorization in the connected compact abelian group setting, see, e.g., [36, 35, 20]. Let $\mathcal{B}$ be an admissible algebra, and let $A \in \mathcal{B}^{n\times n}$. A representation of the form
\[
A(g) = A_+(g)\, \operatorname{diag}(\langle j_1, g\rangle, \dots, \langle j_n, g\rangle)\, A_-(g), \qquad g \in G, \tag{1.3}
\]
where $A_\pm, A_\pm^{-1} \in \mathcal{B}_\pm^{n\times n}$ and $j_1, \dots, j_n \in \Gamma$, is called a (left) $\mathcal{B}$-factorization of $A$ (with respect to the order $\preceq$). It follows that the elements $j_1, \dots, j_n$ in (1.3) are uniquely determined by $A$, up to a permutation. Borrowing the terminology from the classical ($\Gamma = \mathbb{Z}$, $G = \mathbb{T}$) setting, we call them the partial indices of $A$. The sum $j_1 + \dots + j_n$ is the total index of $A$. For $n = 1$, the only partial index of $A$ (therefore coinciding with its total index) is called simply the index of $A$. We say that $A \in \mathcal{B}^{n\times n}$ is $\mathcal{B}$-factorable if a $\mathcal{B}$-factorization of $A$ exists. Denote by $GLF(\mathcal{B}^{n\times n})$ the set of all $\mathcal{B}$-factorable $n \times n$ matrix functions. Clearly, it is necessary that $A \in GL(\mathcal{B}^{n\times n})$ for $A$ to be $\mathcal{B}$-factorable.

In this paper, we overview some available results and state open problems concerning the opposite direction: What can be said about the structure of the set of factorable matrix functions as a subset of invertible matrix functions?

If $\Gamma = \mathbb{Z}$, then $G = \mathbb{T}$, and $W(\mathbb{T})$-factorization is the classical Wiener-Hopf factorization on the unit circle. As it happens, in this case the above-mentioned necessary invertibility condition is sufficient as well. This result is due to Gohberg-Krein [27], and can also be found in many monographs, e.g., [14, 34], and a more recent survey [24]. On the other hand, it is well known that the condition $A \in GL(C(\mathbb{T})^{n\times n})$ is not sufficient for $C(\mathbb{T})$-factorization even when $n = 1$; an example can be found, e.g., in [30]. However, the set $GLF(C(\mathbb{T})^{n\times n})$ is dense in $GL(C(\mathbb{T})^{n\times n})$.
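To make the definition (1.3) concrete, here is a small scalar example in the classical setting $\Gamma = \mathbb{Z}$ (with its usual order), $G = \mathbb{T}$, $\mathcal{B} = W(\mathbb{T})$. The function $a$ and its factors are chosen here for illustration only and are not taken from the paper.

```latex
% Illustrative scalar W(\mathbb{T})-factorization (assumed example), with <j,g> = g^j.
\[
a(g) = -g^{2} + \tfrac{5}{2}\, g - 1
     = \underbrace{(2 - g)}_{a_+(g)} \cdot \; g \; \cdot \underbrace{\bigl(1 - \tfrac{1}{2} g^{-1}\bigr)}_{a_-(g)},
\qquad g \in \mathbb{T}.
\]
% Both factors are invertible inside the corresponding one-sided algebras:
\[
a_+(g)^{-1} = \tfrac{1}{2} \sum_{k=0}^{\infty} 2^{-k} g^{k} \in W(\mathbb{T})_+ ,
\qquad
a_-(g)^{-1} = \sum_{k=0}^{\infty} 2^{-k} g^{-k} \in W(\mathbb{T})_- .
\]
% Hence (1.3) holds with n = 1 and j_1 = 1: the index of a equals 1.
```

The index agrees with the winding number of $a$ around the origin: the polynomial $-g^2 + \tfrac52 g - 1$ has the zeros $g = 2$ and $g = \tfrac12$, exactly one of which lies inside the unit disk.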


For $\Gamma = \mathbb{R}$ the dual group $G$ is the Bohr compactification $\dot{\mathbb{R}}$ of $\mathbb{R}$, so that $C(G)$ is nothing but the algebra $AP$ of Bohr almost periodic functions while $W(G)$ is its (non-closed) subalgebra $APW$ of $AP$ functions with absolutely convergent Bohr-Fourier series. The $\mathcal{B}$-factorization corresponding to these cases, called $AP$ and $APW$ factorization, respectively, in the scalar case was considered in [15] and [29]. The matrix setting was first treated in [31, 32]. It was then observed (see also [33] and [7, Section 15.1] for the full proofs) that already for $n = 2$ there exist triangular matrix functions in $GL(APW^{n\times n})$ which are not even $AP$-factorable. These matrix functions have the form
\[
A(g) = \begin{bmatrix} \langle \nu + \delta, g\rangle & 0 \\ c_1 \langle -\nu, g\rangle + c_2 + c_3 \langle \delta, g\rangle & \langle -(\nu + \delta), g\rangle \end{bmatrix}, \qquad g \in \dot{\mathbb{R}}, \tag{1.4}
\]
where $\nu, \delta > 0$, $\nu$ and $\delta$ are not commensurable, and $c_1, c_2, c_3$ are non-zero complex numbers such that
\[
(\log |c_3|)\nu + (\log |c_1|)\delta = (\log |c_2|)(\nu + \delta). \tag{1.5}
\]
In other words, the necessary invertibility condition in general is not sufficient, in striking contrast with the scalar setting. The details can be found in [7], while more recent new classes are discussed in [13, 40].
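The following sketch shows what condition (1.5) amounts to for specific numbers. The values $\nu = 1$, $\delta = \sqrt{2}$ and the two choices of $c_1, c_2, c_3$ are assumptions made here for illustration; they are not taken from the paper.

```latex
% Illustrative values (assumed): \nu = 1, \delta = \sqrt{2}, which are not commensurable.
% For any non-zero c_1, c_2, c_3 the matrix (1.4) is lower triangular with
\[
\det A(g) = \langle \nu + \delta, g\rangle \, \langle -(\nu + \delta), g\rangle = 1 ,
\]
% so A is invertible whether or not (1.5) holds.
%
% Case 1: c_1 = c_2 = c_3 = 1. All logarithms vanish, and (1.5) reads
\[
0 \cdot 1 + 0 \cdot \sqrt{2} = 0 \cdot (1 + \sqrt{2}) ,
\]
% which is true.
%
% Case 2: c_1 = c_2 = 1, c_3 = 2. Then (1.5) would require
\[
(\log 2) \cdot 1 + 0 \cdot \sqrt{2} = 0 \cdot (1 + \sqrt{2}), \quad \text{i.e. } \log 2 = 0 ,
\]
% which is false.
```

So the first choice belongs to the family of invertible but not $AP$-factorable matrix functions described by (1.4) and (1.5), while the second choice falls outside that family.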

2. Denseness of $GLF(\mathcal{B})$ in $GL(\mathcal{B})$

We start with the scalar case. An admissible algebra is said to be decomposing if $\mathcal{B}_+ + \mathcal{B}_- = \mathcal{B}$. For example, the weighted Wiener algebras are decomposing, but $C(G)$ is not.

Theorem 2.1. ([8]) Let $\mathcal{B} \subseteq C(G)$ be an admissible algebra, where $G$ is a connected compact abelian group. Then:
(a) The set $GLF(\mathcal{B})$ of $\mathcal{B}$-factorable scalar functions is dense in $GL(\mathcal{B})$;
(b) The equality $GLF(\mathcal{B}) = GL(\mathcal{B})$ holds if and only if $\mathcal{B}$ is decomposing.

In the classical case $G = \mathbb{T}$ part (b) is well known, see [28, Theorem 3.1], for example. Moreover, in this setting the results extend verbatim to the matrix case.

Theorem 2.2. Let $\mathcal{B} \subseteq C(\mathbb{T})$ be an admissible algebra. Then:
(a) The set $GLF(\mathcal{B}^{n\times n})$ is dense in $GL(\mathcal{B}^{n\times n})$; moreover, $GLF(\mathcal{B}^{n\times n})$ is dense in $\mathcal{B}^{n\times n}$;
(b) The equality $GLF(\mathcal{B}^{n\times n}) = GL(\mathcal{B}^{n\times n})$ holds if and only if $\mathcal{B}$ is decomposing.

Indeed, all trigonometric $n \times n$ matrix polynomials which are invertible on the unit circle are $\mathcal{B}$-factorable (see, for example, the proof of Lemma VIII.2.1 in [30] or Section 2.4 in [34]); on the other hand, the set of trigonometric $n \times n$ matrix polynomials invertible on $\mathbb{T}$ is easily seen to be dense in $P(\mathbb{T})^{n\times n}$, hence it is also dense in $\mathcal{B}^{n\times n}$. Part (b) was proved in [22] (see also [10, 11]).

We do not know any other group $G$ for which $GLF(\mathcal{B}^{n\times n}) = GL(\mathcal{B}^{n\times n})$ holds for every decomposing algebra $\mathcal{B}$. Thus:

Open Problem 2.3. Identify those connected compact abelian groups $G$ and their character groups $\Gamma$ with a linear order for which
\[
GLF(\mathcal{B}^{n\times n}) = GL(\mathcal{B}^{n\times n}) \tag{2.1}
\]
holds for every decomposing algebra $\mathcal{B}$.

It was conjectured in [36] that (2.1) holds for $\mathcal{B} = W(G)$ if and only if $\Gamma$ is isomorphic to a subgroup of the (additive) group of rational numbers $\mathbb{Q}$. On the other hand, part (a) of Theorem 2.2 extends to some other groups:

Theorem 2.4. The following statements are equivalent:
(1) $GLF(\mathcal{B}^{n\times n})$ is dense in $\mathcal{B}^{n\times n}$, for every admissible algebra $\mathcal{B}$;
(2) $GLF(C(G)^{n\times n})$ is dense in $C(G)^{n\times n}$;
(3) $\Gamma$ is (isomorphic to) a subgroup of $\mathbb{Q}$.

Proof. (1) $\Longrightarrow$ (2) is obvious, while (3) $\Longrightarrow$ (1) is proved in [8]. Suppose (2) holds; in particular, $GL(C(G)^{n\times n})$ is dense in $C(G)^{n\times n}$. We now use the well-known fact (see [18, 41, 39, 17]) that if $X$ is a compact Hausdorff topological space, then $C(X)$, the $C^*$-algebra of continuous complex-valued functions on $X$, has dense group of invertible elements if and only if the covering dimension of $X$ is at most one; moreover, if $C(X)$ has dense group of invertible elements, then so does $C(X)^{n\times n}$ for every integer $n \ge 1$. Thus, $G$ has covering dimension one. Since the two-dimensional torus $\mathbb{T}^2$ has covering dimension two, it follows that $\Gamma$ does not contain a subgroup isomorphic to $\mathbb{Z}^2$. It is easy to see that any such $\Gamma$ is isomorphic to a subgroup of $\mathbb{Q}$. □
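The last step of the proof of Theorem 2.4, that a torsion free group $\Gamma$ without a subgroup isomorphic to $\mathbb{Z}^2$ embeds into $\mathbb{Q}$, is stated as easy. One possible way to fill it in is sketched below; the base element $x_0$ and the map $\varphi$ are notation introduced here and are not taken from the paper.

```latex
% Sketch under assumed notation (x_0, \varphi). Let \Gamma \neq \{0\} be torsion free
% with no subgroup isomorphic to \mathbb{Z}^2, and fix x_0 \in \Gamma, x_0 \neq 0.
\begin{enumerate}
\item For every $y \in \Gamma$ the homomorphism $\mathbb{Z}^2 \to \Gamma$,
      $(m,n) \mapsto m x_0 - n y$, is not injective, so there are integers
      $(m,n) \neq (0,0)$ with $n y = m x_0$. Necessarily $n \neq 0$: otherwise
      $m x_0 = 0$ with $m \neq 0$, contradicting torsion freeness.
\item Put $\varphi(y) := m/n \in \mathbb{Q}$. If also $n' y = m' x_0$ with $n' \neq 0$, then
      \[
      (n' m - n m')\, x_0 = n'(n y) - n (n' y) = 0 ,
      \]
      hence $n' m = n m'$ and $m/n = m'/n'$; so $\varphi$ is well defined.
\item If $n y = m x_0$ and $n' y' = m' x_0$, then
      $n n' (y + y') = (n' m + n m')\, x_0$, so
      $\varphi(y + y') = m/n + m'/n' = \varphi(y) + \varphi(y')$.
\item If $\varphi(y) = 0$, then $m = 0$, so $n y = 0$ with $n \neq 0$, and therefore $y = 0$.
\end{enumerate}
% Thus \varphi : \Gamma \to \mathbb{Q} is an injective homomorphism.
```

The case $\Gamma = \{0\}$ is trivial, so in all cases $\Gamma$ is isomorphic to a subgroup of $\mathbb{Q}$.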

3. Nondenseness

Let us return to example (1.4). As also was shown in [31, 33], the matrix (1.4) is $APW$ factorable when the equality (1.5) does not hold. Therefore, the non-$AP$-factorable matrices delivered by (1.4), (1.5) are limits of $W(\dot{\mathbb{R}})$-factorable ones. In all other concrete examples of non-factorability (the more recent of which can be found in [5, 1, 12]) the non-factorable matrix function always is a limit of factorable ones. In view of this situation, many researchers considered the following conjecture plausible: The set of $W(G)$-factorable matrix functions is dense in the group $GL(W(G)^{n\times n})$. It turns out, however, that for $\Gamma = \mathbb{R}$, as well as in many other cases, this conjecture fails for any admissible algebra $\mathcal{B}$.

We describe the situation in a more general setting of triangularizable matrix functions. Let $\mathcal{B}$ be an admissible algebra. An element $A \in \mathcal{B}^{n\times n}$ is said to be (left) $\mathcal{B}$-triangularizable if $A$ admits a representation (1.3), where the middle term $\operatorname{diag}(\langle j_1, g\rangle, \dots, \langle j_n, g\rangle)$ is replaced by a triangular matrix $T = [t_{ij}]_{i,j=1}^{n}$ with $t_{ij} \in \mathcal{B}$ for $i, j = 1, \dots, n$, $t_{ij} = 0$ if $i > j$, and the diagonal elements $t_{11}, \dots, t_{nn}$ belonging to $GL(\mathcal{B})$. Denote by $GLT(\mathcal{B}^{n\times n})$ the set of $n \times n$ $\mathcal{B}$-triangularizable matrix functions. Clearly,
\[
GLF(\mathcal{B}^{n\times n}) \subseteq GLT(\mathcal{B}^{n\times n}) \subseteq GL(\mathcal{B}^{n\times n}).
\]


The following question arises naturally: Does $GL(\mathcal{B}^{n\times n}) = GLT(\mathcal{B}^{n\times n})$ hold for admissible algebras? The next result shows that generally the answer is no. Denote by $\mathcal{T}(\mathcal{B}^{n\times n})$ the minimal closed subgroup of $GL(\mathcal{B}^{n\times n})$ that contains $GLT(\mathcal{B}^{n\times n})$.

Theorem 3.1. ([8]) Let $\Gamma$ be a torsion free abelian group (in discrete topology) that contains a subgroup isomorphic to $\mathbb{Z}^3$, and let $\mathcal{B}$ be an admissible algebra of continuous functions on $G$, the dual of $\Gamma$. Then, for every natural $n \ge 2$ there exist infinitely many pathwise connected components of $GL(\mathcal{B}^{n\times n})$ with the property that each one of these components does not intersect $\mathcal{T}(\mathcal{B}^{n\times n})$. In particular, $\mathcal{T}(\mathcal{B}^{n\times n})$, and a fortiori $GLF(\mathcal{B}^{n\times n})$, is not dense in $GL(\mathcal{B}^{n\times n})$.

Theorem 3.1 naturally leads to

Open Problem 3.2.
(i) Describe all connected compact groups $G$ (or their duals $\Gamma$) such that $GLF(\mathcal{B}^{n\times n})$ is dense in $GL(\mathcal{B}^{n\times n})$, for any admissible algebra $\mathcal{B}$.
(ii) Describe all connected compact groups $G$ such that $\mathcal{T}(\mathcal{B}^{n\times n})$ is dense in $GL(\mathcal{B}^{n\times n})$, for any admissible algebra $\mathcal{B}$.

Theorem 3.1 does not address the situation when the given $A \in GL(\mathcal{B}^{n\times n})$ is already triangular. It is still a possibility that any such matrix can be approximated by $\mathcal{B}$-factorable ones. Thus,

Open Problem 3.3. Prove or disprove that $GLF(\mathcal{B}^{n\times n})$ is dense in $GLT(\mathcal{B}^{n\times n})$.

In relation with Open Problem 3.3 note that factorization of triangular matrix functions arises naturally in the consideration of convolution type equations on (unions of) intervals, see [21, 37, 38, 44, 7].

4. Topological properties of $GLF(\mathcal{B}^{n\times n})$

In this section we discuss briefly some topological properties of the set $GLF(\mathcal{B}^{n\times n})$. It will be assumed throughout the section that the admissible algebra $\mathcal{B}$ is decomposing.

A standard argument (see, for example, [23, Theorem XXIX.9.1]) shows that there exists an open neighborhood $\Lambda$ of identity in $GL(\mathcal{B}^{n\times n})$ such that every element of $\Lambda$ admits a canonical $\mathcal{B}$-factorization, i.e., a $\mathcal{B}$-factorization with all partial indices equal to zero. As a consequence, we obtain:

Proposition 4.1. The set of those $A \in GLF(\mathcal{B}^{n\times n})$ that admit a canonical factorization is open in $GL(\mathcal{B}^{n\times n})$.

It is not known for a general $\Gamma$ whether or not the set $GLF(\mathcal{B}^{n\times n})$ is open. Thus:

Open Problem 4.2. Identify those connected compact abelian groups $G$ for which $GLF(\mathcal{B}^{n\times n})$ is open.


For example, $G = \mathbb{T}$ is such since then $GLF(\mathcal{B}^{n\times n}) = GL(\mathcal{B}^{n\times n})$.

Proposition 4.1 leads to the following stability property of indices of scalar functions. Observe that in view of Theorem 2.1, every invertible element of $\mathcal{B}$ is $\mathcal{B}$-factorable.

Proposition 4.3. Let $A \in GL(\mathcal{B})$. Then the index of every nearby (in the topology of $\mathcal{B}$) function $B \in GL(\mathcal{B})$ is identical to that of $A$.

Indeed, replacing $A$ with $\langle -j, \cdot\rangle A(\cdot)$, where $j$ is the index of $A$, we may assume that the latter equals zero, i.e., the $\mathcal{B}$-factorization of $A$ is canonical. Now the result is immediate from Proposition 4.1.

Next, consider the (pathwise) connected components of $GLF(\mathcal{B}^{n\times n})$.

Theorem 4.4. Every connected component of $GLF(\mathcal{B}^{n\times n})$ has the form
\[
CGLF_j(\mathcal{B}^{n\times n}) := \{A \in GLF(\mathcal{B}^{n\times n}) : \text{the total index of } A \text{ equals } j\},
\]
where $j \in \Gamma$ is fixed. Thus, the connected components of $GLF(\mathcal{B}^{n\times n})$ are parametrized by $j \in \Gamma$.

For the Wiener algebra, Theorem 4.4 was proved in [9].

Proof. The proof of [8, Theorem 6.2] (see also [9, Section 6]) shows that every $A \in CGLF_j(\mathcal{B}^{n\times n})$ can be connected within $GLF(\mathcal{B}^{n\times n})$ to $\operatorname{diag}(1, 1, \dots, 1, \langle j, \cdot\rangle)$. Conversely, assume there exists a continuous path from $\operatorname{diag}(1, 1, \dots, 1, \langle j_1, \cdot\rangle)$ to $\operatorname{diag}(1, 1, \dots, 1, \langle j_2, \cdot\rangle)$ within $GLF(\mathcal{B}^{n\times n})$, where $j_1, j_2 \in \Gamma$. Passing to determinants, we obtain a path from $\langle j_1, \cdot\rangle$ to $\langle j_2, \cdot\rangle$ within $GL(\mathcal{B})$. By Proposition 4.3 we must have $j_1 = j_2$. □

In particular, the set of canonically $\mathcal{B}$-factorable $n \times n$ matrix functions is connected.

Canonically $\mathcal{B}$-factorable scalar functions can be described in several ways. Denote by $GL_0(\mathcal{B})$ the connected component of $GL(\mathcal{B})$ that contains the constant function 1.

Proposition 4.5. The following statements are equivalent for $A \in GL(\mathcal{B})$:
(1) $A$ admits canonical $\mathcal{B}$-factorization;
(2) $A$ has a logarithm in $\mathcal{B}$, i.e., $A = e^{B}$ for some $B \in \mathcal{B}$;
(3) $A \in GL_0(\mathcal{B})$.

Proof. The equivalence of (2) and (3) is well known for commutative unital Banach algebras. If (1) holds, then (3) holds in view of the connectivity of the set of canonically $\mathcal{B}$-factorable scalar functions. Finally, if (2) holds, then write $B = B_+ + B_-$, where $B_\pm \in \mathcal{B}_\pm$ (the decomposing property of $\mathcal{B}$ is used here). It follows that $A = e^{B_+} e^{B_-}$ is a canonical $\mathcal{B}$-factorization. □

We conclude the section with some observations concerning triangular matrix functions. Since $\mathcal{B}$ is decomposing, according to Theorem 2.1 factorability of such matrices implies factorability of all their diagonal entries. The converse is also true, whenever $G$ and $\Gamma$ are such that (2.1) holds. The next statement shows that condition (2.1) is irrelevant, provided that the factorization of diagonal elements is canonical.

Proposition 4.6. Let $A \in GL(\mathcal{B}^{n\times n})$ be upper (or lower) triangular such that the diagonal elements of $A$ belong to $GL_0(\mathcal{B})$. Then $A$ admits a canonical $\mathcal{B}$-factorization.

Proof. The proof is by induction on $n$, the case $n = 1$ being trivial. Let $A \in GL(\mathcal{B}^{n\times n})$ be upper triangular with diagonal elements in $GL_0(\mathcal{B})$. Write
\[
A = \begin{bmatrix} B & C \\ 0 & D \end{bmatrix},
\]
where $B \in GL(\mathcal{B})$, $D \in GL(\mathcal{B}^{(n-1)\times(n-1)})$. By the induction hypothesis, $B$ and $D$ admit canonical $\mathcal{B}$-factorizations $B = B_+ B_-$, $D = D_+ D_-$. Writing $B_+^{-1} C D_-^{-1} = X_+ + X_-$, where $X_\pm \in \mathcal{B}_\pm^{1\times(n-1)}$, we have a $\mathcal{B}$-canonical factorization
\[
A = \begin{bmatrix} B_+ & 0 \\ 0 & D_+ \end{bmatrix}
    \begin{bmatrix} 1 & X_+ \\ 0 & I_{n-1} \end{bmatrix}
    \begin{bmatrix} 1 & X_- \\ 0 & I_{n-1} \end{bmatrix}
    \begin{bmatrix} B_- & 0 \\ 0 & D_- \end{bmatrix}. \qquad \Box
\]

If the admissible algebra $\mathcal{B}$ is not decomposing, then we can only assert that the group of upper triangular matrices in $GL(\mathcal{B}^{n\times n})$ with diagonal elements in $GL_0(\mathcal{B})$ is dense in the set of canonically $\mathcal{B}$-factorable upper triangular matrices in $GL(\mathcal{B}^{n\times n})$.

Note also that a triangular matrix $A$ may admit a canonical $\mathcal{B}$-factorization while the factorization of its diagonal entries is non-canonical. The classical example of this phenomenon for $\Gamma = \mathbb{Z}$ can be found in [27], while for $\Gamma = \mathbb{R}$, e.g., matrices (1.4) with (1.5) not satisfied will do the job. Other examples of this nature are scattered throughout Chapters 14, 15 of [7].
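The four-factor product in the proof of Proposition 4.6 can be checked by direct multiplication. The sketch below uses the notation of that proof ($A$ block upper triangular with blocks $B$, $C$, $D$; $B = B_+ B_-$, $D = D_+ D_-$; $X_+ + X_- = B_+^{-1} C D_-^{-1}$); the abbreviation $X := X_+ + X_-$ is introduced here for convenience.

```latex
% Verification sketch; X := X_+ + X_- = B_+^{-1} C D_-^{-1} is shorthand introduced here.
\[
\begin{bmatrix} 1 & X_+ \\ 0 & I_{n-1} \end{bmatrix}
\begin{bmatrix} 1 & X_- \\ 0 & I_{n-1} \end{bmatrix}
=
\begin{bmatrix} 1 & X \\ 0 & I_{n-1} \end{bmatrix},
\]
\[
\begin{bmatrix} B_+ & 0 \\ 0 & D_+ \end{bmatrix}
\begin{bmatrix} 1 & X \\ 0 & I_{n-1} \end{bmatrix}
\begin{bmatrix} B_- & 0 \\ 0 & D_- \end{bmatrix}
=
\begin{bmatrix} B_+ B_- & B_+ X D_- \\ 0 & D_+ D_- \end{bmatrix}
=
\begin{bmatrix} B & C \\ 0 & D \end{bmatrix}.
\]
% The left factor and its inverse have entries in \mathcal{B}_+:
\[
A_+ = \begin{bmatrix} B_+ & 0 \\ 0 & D_+ \end{bmatrix}
      \begin{bmatrix} 1 & X_+ \\ 0 & I_{n-1} \end{bmatrix},
\qquad
A_+^{-1} = \begin{bmatrix} 1 & -X_+ \\ 0 & I_{n-1} \end{bmatrix}
           \begin{bmatrix} B_+^{-1} & 0 \\ 0 & D_+^{-1} \end{bmatrix},
\]
% and symmetrically for A_- with X_-, B_-, D_-.
```

Since the middle diagonal factor is the identity, all partial indices are zero, which is exactly what "canonical" means.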

5. Small Bohr-Fourier spectra

Let $A \in GL(\mathcal{B}^{n\times n})$. In view of Theorem 3.1 it is unlikely (if $\Gamma$ contains $\mathbb{Z}^3$ and $n \ge 2$) that $A$ admits a $\mathcal{B}$-factorization. So one may consider imposing additional conditions on $A$ to ensure $\mathcal{B}$-factorability. In this section, we consider small Bohr-Fourier spectra. We denote by $\#\sigma(A)$ the number of elements in the Bohr-Fourier spectrum of $A$.

To start with an easy case, note that if $\#\sigma(A) \le 2$ then $A$ is $\mathcal{B}$-factorable, for any admissible algebra $\mathcal{B}$. This can be proved without difficulty using the Kronecker form for two complex matrices.

In the following, we need to distinguish archimedean and non-archimedean groups. The group $\Gamma$ (with the fixed linear order $\preceq$) is said to be archimedean if for every $a, b \succ 0$ there exists an integer $m$ such that $ma \succ b$. A well-known theorem of Hölder states that a linearly ordered abelian group is archimedean if and only if it is order isomorphic to a subgroup of $\mathbb{R}$.

We have a non-factorability result, proved in [36, 35]:


Theorem 5.1. Assume $\Gamma$ is non-archimedean (for example, $\Gamma = \mathbb{Z}^k$, $k > 1$, with the lexicographic order). Then for every $n \ge 2$ there is a $W(G)$-nonfactorable triangular $A \in GL(\mathcal{B}^{n\times n})$ with $\#\sigma(A) = 4$.

A concrete example is given in [35]: Assume $0 \prec \mu \prec \lambda$, $\lambda, \mu \in \Gamma$ are such that $n\mu \prec \lambda$ for all positive integers $n$. Let
\[
A(g) = \begin{bmatrix} \langle \lambda, g\rangle I_{n-1} & 0 \\ C_1 - \langle \mu, g\rangle C_2 & \langle -\lambda, g\rangle \end{bmatrix}, \qquad g \in G, \tag{5.1}
\]
where $C_1 = C_2 = [1 \ 0 \ \dots \ 0] \in \mathbb{R}^{1\times(n-1)}$. Then $A$ is not $W(G)$-factorable. This example is a particular case of a more general result:

Theorem 5.2. Let $A$ have the form
\[
A(g) = \begin{bmatrix} \langle \lambda, g\rangle I_p & 0 \\ C_1 \langle \sigma, g\rangle - C_2 \langle \mu, g\rangle & \langle -\lambda, g\rangle I_q \end{bmatrix}, \qquad g \in G, \tag{5.2}
\]
where $C_1, C_2 \in \mathbb{C}^{q\times p}$. Assume that $\lambda \succ 0$, $\mu \succ \sigma$, and
\[
n\mu \prec \lambda, \qquad n\sigma \prec \lambda \qquad \text{for all integers } n. \tag{5.3}
\]
Then for every admissible algebra $\mathcal{B}$, $A$ admits a $\mathcal{B}$-factorization if
\[
\operatorname{rank}(\lambda_1 C_1 - \lambda_2 C_2) = \max\{\operatorname{rank}(z_1 C_1 - z_2 C_2) : z_1, z_2 \in \mathbb{C}\} \quad \text{for every } \lambda_1, \lambda_2 \in \mathbb{C} \text{ satisfying } |\lambda_1| = |\lambda_2| = 1. \tag{5.4}
\]
Moreover, in this case the factorization indices of $A$ belong to the set
\[
\{\pm\sigma, \ \pm\mu, \ \pm\lambda, \ \pm(\lambda - (\mu - \sigma)), \ \dots, \ \pm(\lambda - \min\{p, q\}(\mu - \sigma))\}, \tag{5.5}
\]
and if $\lambda - k_j(\mu - \sigma)$, $j = 1, 2, \dots, s$, and $-(\lambda - \ell_i(\mu - \sigma))$, $i = 1, 2, \dots, t$, are the factorization indices of $A$ other than $\pm\sigma, \pm\mu, \pm\lambda$, then
\[
k_1 + \dots + k_s + \ell_1 + \dots + \ell_t + s \le p, \qquad k_1 + \dots + k_s + \ell_1 + \dots + \ell_t + t \le q,
\]
and $t - s = p - q$. Conversely, if $A$ admits a $W(G)$-factorization, then (5.4) holds.

Theorem 5.2 is proved in [35, 36] for the case when $\mathcal{B} = W(G)$, with less explicit description of the factorization indices.¹ We do not know whether the converse statement holds for any admissible algebra different from $W(G)$. We provide some details of the proof of Theorem 5.2 in a separate Section 6.

In view of Theorem 5.1 we have the following open problem:

Open Problem 5.3. Assume that the subgroup generated by $\gamma_1, \gamma_2, \gamma_3 \in \Gamma$ is not archimedean. Prove or disprove that every $A \in GL(\mathcal{B}^{n\times n})$ with $\sigma(A) = \{\gamma_1, \gamma_2, \gamma_3\}$ is $\mathcal{B}$-factorable, for any admissible algebra $\mathcal{B}$.

¹ Note that two $\pm$ signs are inadvertently omitted in the statement of [35, Theorem 3].


For archimedean groups which are not isomorphic to any subgroup of $\mathbb{Q}$, an example exists of an invertible non-$W(G)$-factorable $n \times n$ matrix function $A \in P(G)$ with $\#\sigma(A) = 5$ (see [35]); for $2 \times 2$ matrices this example stems from (1.4). Thus:

Open Problem 5.4. Assume that $\Gamma$ is archimedean and not isomorphic to a subgroup of $\mathbb{Q}$. Prove or disprove that every $A \in GL(\mathcal{B}^{n\times n})$ with $\#\sigma(A) = 3$ or $\#\sigma(A) = 4$ is $\mathcal{B}$-factorable.
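To see the hypotheses of Theorems 5.1 and 5.2 in a concrete case, the sketch below takes $\Gamma = \mathbb{Z}^2$ with the lexicographic order, so that $G = \mathbb{T}^2$ and $\langle (a,b), (g_1,g_2)\rangle = g_1^a g_2^b$. The specific elements $\lambda = (1,0)$, $\mu = (0,1)$ and the size $n = 2$ are choices made here for illustration; the unimodular numbers of condition (5.4) are written $z_1, z_2$ to avoid a clash with the group element $\lambda$.

```latex
% Illustrative instance (assumed): \Gamma = \mathbb{Z}^2, lexicographic order,
% (a,b) \prec (c,d) iff a < c, or a = c and b < d.
\[
\lambda = (1,0), \qquad \mu = (0,1), \qquad
n\mu = (0,n) \prec (1,0) = \lambda \quad \text{for all integers } n ,
\]
% so 0 \prec \mu \prec \lambda and no multiple of \mu exceeds \lambda: \Gamma is non-archimedean.
% Example (5.1) with n = 2 (so C_1 = C_2 = [1]) becomes
\[
A(g_1,g_2) = \begin{bmatrix} g_1 & 0 \\ 1 - g_2 & g_1^{-1} \end{bmatrix},
\qquad \det A \equiv 1, \qquad
\sigma(A) = \{\lambda,\; -\lambda,\; 0,\; \mu\}, \quad \#\sigma(A) = 4 .
\]
% This is (5.2) with p = q = 1, \sigma = 0, C_1 = C_2 = 1, and (5.3) holds.
% Rank condition (5.4):
\[
\max_{z_1,z_2\in\mathbb{C}} \operatorname{rank}(z_1 C_1 - z_2 C_2)
 = \max_{z_1,z_2} \operatorname{rank}(z_1 - z_2) = 1,
\qquad
\operatorname{rank}(z_1 - z_2)\big|_{z_1 = z_2 = 1} = 0 .
\]
```

Condition (5.4) therefore fails at the unimodular pair $z_1 = z_2 = 1$, and the converse part of Theorem 5.2 gives that this $A$ is not $W(G)$-factorable, consistent with Theorem 5.1.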

6. Proof of Theorem 5.2

We follow the same approach as in [36, 35]. For the readers' convenience, we provide some details.

Assume that (5.4) holds. We have to prove that $A$ given by (5.2) is $\mathcal{B}$-factorable. Applying the transformation
\[
C_1 \longrightarrow S C_1 T, \qquad C_2 \longrightarrow S C_2 T,
\]
for suitable invertible matrices $S$ and $T$, we may assume that the pair $(C_1, C_2)$ is in the Kronecker normal form; in other words, $C_1$ and $C_2$ are direct sums of blocks of the following types:
(a) $C_1$ and $C_2$ are of size $k \times (k+1)$ of the form $C_1 = \begin{bmatrix} I_k & 0_{k\times 1} \end{bmatrix}$, $C_2 = \begin{bmatrix} 0_{k\times 1} & I_k \end{bmatrix}$.
(b) $C_1$ and $C_2$ are of size $(k+1) \times k$ of the form $C_1 = \begin{bmatrix} I_k \\ 0_{1\times k} \end{bmatrix}$, $C_2 = \begin{bmatrix} 0_{1\times k} \\ I_k \end{bmatrix}$.
(c) $C_1$ is the $k \times k$ upper triangular nilpotent Jordan block, denoted by $V_k$, and $C_2 = I_k$.
(d) $C_1 = I_k$, and $C_2 = V_k$.
(e) $C_1$ and $C_2$ are both invertible of the same size, say $k \times k$.
(f) $C_1$ and $C_2$ are both zero matrices of the same size.

The proof thereby is reduced to the cases (a)–(f). The case (f) is trivial. The case (a) is dealt with in full detail in [35, proof of Theorem 3] (using arguments similar to those in [31]); the indices in this case are $-\sigma$ ($k$ times), $\lambda - k(\mu - \sigma)$, and $\mu$ ($k$ times), and the factors $A_\pm^{-1}$ actually belong to $P(G) \cap \mathcal{B}_\pm$ (such factorization was termed finite factorization in [36]). The case (b) is reduced to (a) as in [35], using the transformation
\[
A \longrightarrow \begin{bmatrix} 0 & J_{k+1} \\ J_k & 0 \end{bmatrix} A^{*} \begin{bmatrix} 0 & J_k \\ J_{k+1} & 0 \end{bmatrix},
\]
where $J_k$ is the $k \times k$ matrix with 1's along the top-right to the left-bottom diagonal and zeros elsewhere. The cases (c), (d), and (e) follow from the fact (proved in [36]) that, under the hypotheses of Theorem 5.2, if $C_2$ is invertible and the spectral radius of $C_2^{-1} C_1$ is less than one, then a $\mathcal{B}$-factorization of $A$ is given by formulas
\[
A_+ = \begin{bmatrix} -\sum_{j=0}^{\infty} (C_2^{-1} C_1)^j C_2^{-1} \langle \lambda - \mu - j(\mu - \sigma), \cdot\rangle & I \\ I & 0 \end{bmatrix},
\]
\[
\Lambda = \begin{bmatrix} \langle \mu, \cdot\rangle I & 0 \\ 0 & \langle -\mu, \cdot\rangle I \end{bmatrix},
\]
\[
A_- = \begin{bmatrix} C_1 \langle \sigma - \mu, \cdot\rangle - C_2 & \langle -\mu - \lambda, \cdot\rangle I \\ 0 & \sum_{j=0}^{\infty} (C_2^{-1} C_1)^j C_2^{-1} \langle -j(\mu - \sigma), \cdot\rangle \end{bmatrix},
\]
and if $C_1$ is invertible and the spectral radius of $C_2 C_1^{-1}$ is less than one, then a $\mathcal{B}$-factorization of $A$ is given by formulas
\[
A_+ = \begin{bmatrix} \langle \lambda - \sigma, \cdot\rangle I & -C_1^{-1} \sum_{j=0}^{\infty} (C_2 C_1^{-1})^j \langle j(\mu - \sigma), \cdot\rangle \\ -C_2 \langle \mu - \sigma, \cdot\rangle + C_1 & 0 \end{bmatrix},
\]
\[
\Lambda = \begin{bmatrix} \langle \sigma, \cdot\rangle I & 0 \\ 0 & \langle -\sigma, \cdot\rangle I \end{bmatrix},
\]
\[
A_- = \begin{bmatrix} I & C_1^{-1} \sum_{j=0}^{\infty} (C_2 C_1^{-1})^j \langle j(\mu - \sigma) - \sigma - \lambda, \cdot\rangle \\ 0 & I \end{bmatrix}.
\]
Note that using the Jordan form of $C_2^{-1} C_1$ or of $C_2 C_1^{-1}$, as the case may be, and using the inverse closed property of $\mathcal{B}$, one easily verifies that the matrices in these formulas indeed belong to $\mathcal{B}^{k\times k}$, for any admissible algebra $\mathcal{B}$.

Assume now that $A$ is $W(G)$-factorable. By [42, Theorem 1], there is a $W(G)$-factorization of $A$ with all factors having Bohr-Fourier coefficients in the subgroup of $\Gamma$ generated by $\sigma, \mu, \lambda$. Thus, we may assume without loss of generality that $\Gamma = \mathbb{Z}^q$, where $q = 2$ or $q = 3$. Now argue as in the "only if" part of [35, Section 3]. □
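The first set of formulas (the case where $C_2$ is invertible and the spectral radius of $C_2^{-1} C_1$ is less than one) can be checked by direct multiplication. The sketch below is a formal verification of $A_+ \Lambda A_- = A$ only; the abbreviations $e_x$, $M$ and $S$ are introduced here and are not the paper's notation, and convergence of the series is taken for granted as in the text.

```latex
% Formal check (assumed shorthand): e_x := \langle x, \cdot \rangle, so e_x e_y = e_{x+y},
%   M := C_1 e_{\sigma-\mu} - C_2 ,
%   S := \sum_{j \ge 0} (C_2^{-1} C_1)^j C_2^{-1} e_{-j(\mu-\sigma)} .
% Neumann series: M = -C_2 (I - C_2^{-1} C_1 e_{\sigma-\mu}), hence
\[
M^{-1} = -\bigl(I - C_2^{-1} C_1 e_{\sigma-\mu}\bigr)^{-1} C_2^{-1} = -S ,
\qquad\text{i.e. } S M = -I .
\]
% In this shorthand the factors read
\[
A_+ = \begin{bmatrix} -e_{\lambda-\mu} S & I \\ I & 0 \end{bmatrix},
\qquad
\Lambda A_- = \begin{bmatrix} e_{\mu} M & e_{-\lambda} I \\ 0 & e_{-\mu} S \end{bmatrix}.
\]
% Multiplying out:
\[
A_+ \Lambda A_- =
\begin{bmatrix}
 -e_{\lambda} S M & -e_{-\mu} S + e_{-\mu} S \\
 e_{\mu} M & e_{-\lambda} I
\end{bmatrix}
=
\begin{bmatrix}
 e_{\lambda} I & 0 \\
 C_1 e_{\sigma} - C_2 e_{\mu} & e_{-\lambda} I
\end{bmatrix}
= A .
\]
```

The exponents appearing in $A_+$ are $\lambda - \mu - j(\mu - \sigma) \succ 0$ and those in $A_-$ are $\sigma - \mu$, $-\mu - \lambda$ and $-j(\mu - \sigma)$, all $\preceq 0$, which is where the assumptions $\mu \succ \sigma$ and (5.3) enter.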

7. Wiener-Hopf equivalence

Let $A_1, A_2 \in \mathcal{B}^{n\times n}$. We call $A_1$ and $A_2$ (left) Wiener-Hopf equivalent if there exist $A_\pm \in \mathcal{B}_\pm^{n\times n}$ such that $A_\pm^{-1} \in \mathcal{B}_\pm^{n\times n}$ and
\[
A_2(g) = A_+(g) A_1(g) A_-(g), \qquad g \in G.
\]
Clearly, this is indeed an equivalence relation. In the setting of operator polynomials and, more generally, analytic operator-valued functions, it was introduced in [25] and then investigated further in [3] and [26, Chapters XIII and XIV]. Of course, an invertible $A \in \mathcal{B}^{n\times n}$ is (left) $\mathcal{B}$-factorable if and only if it is (left) Wiener-Hopf equivalent to a diagonal matrix function, and two $\mathcal{B}$-factorable matrix functions are Wiener-Hopf equivalent if and only if the sets of their partial indices coincide. In the case of $\mathcal{B}$ being a weighted Wiener algebra, the notion of Wiener-Hopf equivalence, along with the latter observation, are in [20].


If $\mathcal{B}$ is such that invertibility in $\mathcal{B}^{n\times n}$ implies factorability (for example, if the conditions of Theorem 2.2(b) hold), then the Wiener-Hopf equivalence classes are characterized completely by the sets of partial indices. In general, however, we arrive at

Open Problem 7.1. For a given admissible algebra $\mathcal{B}$, describe the Wiener-Hopf equivalence classes (and their canonical representatives) of $GL(\mathcal{B}^{n\times n})$.

This problem is still open even for $\mathcal{B} = AP$ or $APW$. Moreover, it is not even clear for which values of $n$ there exist "Wiener-Hopf irreducible" $A \in GL(\mathcal{B}^{n\times n})$, that is, $A$ that are not Wiener-Hopf equivalent to block diagonal matrices with at least two diagonal blocks. In all the constructions we are aware of (including [8]) only $1 \times 1$ and $2 \times 2$ blocks occur as Wiener-Hopf irreducibles, but there is no obvious reason why this should always be the case.

Acknowledgment

The research of Alex Brudnyi is supported in part by NSERC.

References

[1] S. Avdonin, A. Bulanova, and W. Moran, Construction of sampling and interpolating sequences for multi-band signals. The two-band case, Int. J. Appl. Math. Comput. Sci. 17 (2007), no. 2, 143–156.
[2] R. Balan and I. Krishtal, An almost periodic noncommutative Wiener's lemma, J. Math. Anal. Appl. 370 (2010), 339–349.
[3] H. Bart, I. Gohberg, and M.A. Kaashoek, Invariants for Wiener-Hopf equivalence of analytic operator functions, Constructive methods of Wiener-Hopf factorization, Operator Theory: Advances and Applications, vol. 21, Birkhäuser, Basel, 1986, pp. 317–355.
[4] H. Bart, I. Gohberg, M.A. Kaashoek, and A.C.M. Ran, A state space approach to canonical factorization with applications, OT vol. 200, Birkhäuser Verlag, Basel and Boston, 2010.
[5] M.A. Bastos, Yu.I. Karlovich, I.M. Spitkovsky, and P.M. Tishin, On a new algorithm for almost periodic factorization, Recent Progress in Operator Theory (Regensburg, 1995) (I. Gohberg, R. Mennicken, and C. Tretter, eds.), Operator Theory: Advances and Applications, vol. 103, Birkhäuser Verlag, Basel and Boston, 1998, pp. 53–74.
[6] S. Bochner and R.S. Phillips, Absolutely convergent Fourier expansion for noncommutative normed rings, Ann. of Math. 43 (1942), 409–418.
[7] A. Böttcher, Yu.I. Karlovich, and I.M. Spitkovsky, Convolution operators and factorization of almost periodic matrix functions, OT vol. 131, Birkhäuser Verlag, Basel and Boston, 2002.
[8] A. Brudnyi, L. Rodman, and I.M. Spitkovsky, Non-denseness of factorable matrix functions, J. Functional Analysis 261 (2011), 1969–1991.
[9] A. Brudnyi, L. Rodman, and I.M. Spitkovsky, Projective free algebras of continuous functions on compact abelian groups, J. Functional Analysis 259 (2010), 918–932.


[10] M.S. Budjanu and I.C. Gohberg, General theorems on the factorization of matrix-valued functions, I. Fundamental theorems, Amer. Math. Soc. Transl. 102 (1973), 1–14.
[11] M.S. Budjanu and I.C. Gohberg, General theorems on the factorization of matrix-valued functions, II. Some tests and their consequences, Amer. Math. Soc. Transl. 102 (1973), 15–26.
[12] M.C. Câmara, C. Diogo, Yu.I. Karlovich, and I.M. Spitkovsky, Factorizations, Riemann-Hilbert problems and the corona theorem, arXiv:1103.1935v1 [math.FA] (2011), 1–32.
[13] M.C. Câmara, Yu.I. Karlovich, and I.M. Spitkovsky, Almost periodic factorization of some triangular matrix functions, Modern Analysis and Applications. The Mark Krein Centenary Conference (V. Adamyan, Y. Berezansky, I. Gohberg, M. Gorbachuk, A. Kochubei, H. Langer, and G. Popov, eds.), Operator Theory: Advances and Applications, vol. 190, Birkhäuser Verlag, Basel and Boston, 2009, pp. 171–190.
[14] K.F. Clancey and I. Gohberg, Factorization of matrix functions and singular integral operators, OT vol. 3, Birkhäuser, Basel and Boston, 1981.
[15] L. Coburn and R.G. Douglas, Translation operators on the half-line, Proc. Nat. Acad. Sci. USA 62 (1969), 1010–1013.
[16] C. Corduneanu, Almost periodic functions, J. Wiley & Sons, 1968.
[17] H.G. Dales, Banach algebras and automatic continuity, London Mathematical Society Monographs. New Series, vol. 24, The Clarendon Press Oxford University Press, New York, 2000, Oxford Science Publications.
[18] T.W. Dawson and J.F. Feinstein, On the denseness of the invertible group in Banach algebras, Proc. Amer. Math. Soc. 131 (2003), no. 9, 2831–2839.
[19] T. Ehrhardt and C.V.M. van der Mee, Canonical factorization of continuous functions on the $d$-torus, Proc. Amer. Math. Soc. 131 (2003), no. 3, 801–813.
[20] T. Ehrhardt, C.V.M. van der Mee, L. Rodman, and I.M. Spitkovsky, Factorizations in weighted Wiener algebras on ordered abelian groups, Integral Equations and Operator Theory 58 (2007), 65–86.
[21] M.P. Ganin, On a Fredholm integral equation whose kernel depends on the difference of the arguments, Izv. Vys. Uchebn. Zaved. Matematika (in Russian) (1963), no. 2 (33), 31–43.
[22] I. Gohberg, The factorization problem in normed rings, functions of isometric and symmetric operators, and singular integral equations, Uspehi Mat. Nauk 19 (1964), 71–124 (in Russian).
[23] I. Gohberg, S. Goldberg, and M.A. Kaashoek, Classes of linear operators. Vol. II, Birkhäuser Verlag, Basel and Boston, 1993.
[24] I. Gohberg, M.A. Kaashoek, and I.M. Spitkovsky, An overview of matrix factorization theory and operator applications, Operator Theory: Advances and Applications 141 (2003), 1–102.
[25] I. Gohberg, M.A. Kaashoek, and F. van Schagen, Similarity of operator blocks and canonical forms. II. Infinite-dimensional case and Wiener-Hopf factorization, Topics in modern operator theory (Timişoara/Herculane, 1980), Operator Theory: Advances and Applications, vol. 2, Birkhäuser, Basel, 1981, pp. 121–170.
[26] I. Gohberg, M.A. Kaashoek, and F. van Schagen, Partially specified matrices and operators: classification, completion, applications, OT vol. 79, Birkhäuser Verlag, Basel, 1995.


[27] I. Gohberg and M.G. Krein, Systems of integral equations on a half-line with kernel depending upon the difference of the arguments, Uspekhi Mat. Nauk 13 (1958), no. 2, 3–72 (in Russian), English translation: Amer. Math. Soc. Transl. 14 (1960), no. 2, 217–287.
[28] I. Gohberg and N. Krupnik, One-dimensional linear singular integral equations. Introduction, OT 53, 54, vol. 1 and 2, Birkhäuser Verlag, Basel and Boston, 1992.
[29] I.C. Gohberg and I.A. Feldman, Integro-difference Wiener-Hopf equations, Acta Sci. Math. Szeged 30 (1969), no. 3–4, 199–224 (in Russian).
[30] I.C. Gohberg and I.A. Feldman, Convolution equations and projection methods for their solution, Nauka, Moscow, 1971 (in Russian), English translation Amer. Math. Soc. Transl. of Math. Monographs 41, Providence, R.I. 1974.
[31] Yu.I. Karlovich and I.M. Spitkovsky, Factorization of almost periodic matrix functions and (semi) Fredholmness of some convolution type equations, No. 4421-85 dep., VINITI, Moscow, 1985, in Russian.
[32] Yu.I. Karlovich and I.M. Spitkovsky, On the theory of systems of equations of convolution type with semi-almost-periodic symbols in spaces of Bessel potentials, Soviet Math. Dokl. 33 (1986), 145–149.
[33] Yu.I. Karlovich and I.M. Spitkovsky, Factorization of almost periodic matrix functions, J. Math. Anal. Appl. 193 (1995), 209–232.
[34] G.S. Litvinchuk and I.M. Spitkovsky, Factorization of measurable matrix functions, OT vol. 25, Birkhäuser Verlag, Basel and Boston, 1987.
[35] C.V.M. van der Mee, L. Rodman, and I.M. Spitkovsky, Factorization of block triangular matrix functions with off diagonal binomials, Operator Theory: Advances and Applications 160 (2005), 423–437.
[36] C.V.M. van der Mee, L. Rodman, I.M. Spitkovsky, and H.J. Woerdeman, Factorization of block triangular matrix functions in Wiener algebras on ordered abelian groups, Operator Theory: Advances and Applications 149 (2004), 441–465.
[37] B.V. Pal'cev, Convolution equations on a finite interval for a class of symbols having power asymptotics at infinity, Izv. Akad. Nauk SSSR. Mat. 44 (1980), 322–394 (in Russian), English translation: Math. USSR Izv. 16 (1981).
[38] B.V. Pal'cev, A generalization of the Wiener-Hopf method for convolution equations on a finite interval with symbols having power asymptotics at infinity, Mat. Sb. 113 (155) (1980), 355–399 (in Russian), English translation: Math. USSR Sb. 41 (1982).
[39] A.R. Pears, Dimension theory of general spaces, Cambridge University Press, Cambridge, England, 1975.
[40] A. Rastogi, L. Rodman, and I.M. Spitkovsky, Almost periodic factorization of 2 × 2 matrix functions: New cases of off diagonal spectrum, Recent Advances and New Directions in Applied and Pure Operator Theory (Williamsburg, 2008) (J.A. Ball, V. Bolotnikov, J.W. Helton, L. Rodman, and I.M. Spitkovsky, eds.), Operator Theory: Advances and Applications, vol. 202, Birkhäuser, Basel, 2010, pp. 469–487.
[41] G. Robertson, On the density of the invertible group in $C^*$-algebras, Proc. Edinburgh Math. Soc. (2) 20 (1976), no. 2, 153–157.
[42] L. Rodman and I.M. Spitkovsky, Factorization of matrix functions with subgroup supported Fourier coefficients, J. Math. Anal. Appl. 323 (2006), 604–613.
[43] W. Rudin, Fourier analysis on groups, John Wiley & Sons Inc., New York, 1990, Reprint of the 1962 original, a Wiley-Interscience Publication.

Factorization Versus Invertibility

239

[44] I.M. Spitkovsky, Factorization of several classes of semi-almost periodic matrix functions and applications to systems of convolution equations, Izvestiya VUZ., Mat. (1983), no. 4, 88–94 (in Russian), English translation in Soviet Math. – Iz. VUZ 27 (1983), 383–388. Alex Brudnyi Department of Mathematics University of Calgary 2500 University Dr. NW Calgary, Alberta, Canada T2N 1N4 e-mail: [email protected] Leiba Rodman and Ilya M. Spitkovsky Department of Mathematics College of William and Mary Williamsburg, VA 23187-8795, USA e-mail: [email protected] [email protected]

Operator Theory: Advances and Applications, Vol. 218, 241–268
© 2012 Springer Basel AG

Banded Matrices, Banded Inverses and Polynomial Representations for Semi-separable Operators

Patrick Dewilde

In fond memory of Israel Gohberg, towering mathematician and engaging friend

Abstract. The paper starts out by exploring properties of the URV factorization in the case of banded matrices or operators with banded inverse, showing that they result in factors with the same properties. Then it gives a derivation of representations for general semi-separable operators (matrices) as ratios of minimally banded matrices. It shows that under fairly general technical conditions (uniform reachability and/or controllability in finite time), left and right polynomial factorizations exist that are unique (canonical) when the factors are properly restrained. Next, it provides Bezout relations for these factors, explicit formulas for all the terms in these relations and an introduction to potential new applications such as Löwner type interpolation theory for (general) matrices.

Mathematics Subject Classification (2000). 15A09, 15A21, 15A23, 15A60, 65F05, 65F20, 93B10, 93B20, 93B28, 93B50, 93B55.

Keywords. Semi-separable systems, quasi-separable systems, URV decomposition, canonical polynomial forms, Bezout equations, Loewner interpolation, time varying dynamical systems.

1. Introduction: Semi-separable systems and the ‘one pass’ URV method

Semi-separable matrices were introduced in a famous paper by Israel Gohberg and two co-authors, Thomas Kailath and Israel Koltracht [15]. My contribution to the present “Gohberg Memorial Issue” is in honor, not only of Israel Gohberg, who has been a formidable leader in the development of mathematics in general and of applied operator theory and linear algebra in particular, but also of Israel Koltracht, who passed away in 2008, provided major ideas and had a strong influence on the field. The idea of semi-separable systems, on which the present paper is based, has proved to be extremely valuable as a frame that provides the right kind of generality to treat problems in dynamical system theory, estimation theory and even just matrix algebra. (The original work goes back to papers on Fredholm resolvents and the analysis of Gaussian processes by Kailath [16] and Anderson and Kailath [17]; in recent times, this type of system has sometimes been called “quasi-separable”, but I prefer to use the original terminology and see no need to introduce a new, confusing notion.) Although the historical origin of semi-separability is in integral kernel theory, I restrict myself to the matrix algebra case, a case that was already contemplated in the paper just cited, be it in a restricted setting that does not allow one to obtain the strongest possible results.

Semi-separable theory kept interesting Israel Gohberg, in particular after the connections with time-varying system theory were fully and independently developed, see [9]. In particular, the inversion of linear systems of equations with low numerical complexity (actually linear in the size of the matrix) got a new impetus when this connection was established, leading to a flurry of new algorithms and papers in which various aspects of the theory were explored [13, 8, 14]. Recently, increased interest in banded operators with banded inverses was generated by Gilbert Strang [21]. The present paper is intended to show the connection between system theory and the theory of banded matrices, very much in the spirit originally set by Israel Gohberg and his co-workers.

The approach I follow in this paper is what could be called “structural”.
The connection between computational or time-discrete systems and their linear algebra stands central; operator theoretic arguments are relegated to the background, not because they are unimportant, but because the results presented are of a computational or system theoretical nature. There is a general, operator theoretic framework in which generalizations would properly fit, namely the theory of nest algebras, originally proposed by Arveson [6]. I shall use a more limited framework, described in the next paragraph, that allows for a comfortable handling of block matrices of various sizes and the connection with discrete time system theory, actually the same framework as that of the book [9], with some variation to accommodate common practice in matrix algebra. In particular, in contrast to the book, causal matrices are (block) lower triangular in this paper and numerical vectors are usually column vectors.

To work comfortably with semi-separable systems, we need sequences of indices and then indexed sequences of vectors. When $\mathcal{M} = [m_k]_{k=-\infty}^{\infty}$ is a sequence of indices, then each $m_k$ is either a positive integer or zero, and a corresponding indexed sequence $[u_k] \in \ell_2^{\mathcal{M}}$ will be a sequence of vectors such that each $u_k$ has dimension $m_k$ and the overall sum
\[
\sum_{k=-\infty}^{\infty} \|u_k\|^2 \tag{1}
\]
is finite, the square root of which is then the quadratic norm of the sequence. When $m_k = 0$, the corresponding entry just disappears (it is indicated as a mere ‘place holder’). A regular $n$-dimensional finite vector can thus be considered as embedded in an infinite sequence, whereby the entries from $-\infty$ to zero and from $n+1$ to $\infty$ disappear, leaving just $n$ entries indexed by $1, \dots, n$, corresponding, e.g., to the time points where they are being input into the system.

On such sequences we may define a generic shift operator $Z$, which does nothing else than shifting the position of the data in a column vector (the index) one notch forward, corresponding to the operation of a matrix whose first subdiagonal is a block diagonal of unit matrices ($Z_{i,i-1} = I$, all other $Z_{i,j} = 0$). It is also convenient to underline the zeroth element of a vector or the $\{0,0\}$th element of a block matrix for orientation purposes. The shift $Z$ has a transpose, indicated as $Z'$, which is actually also its inverse (we write $Z^{-\prime} = Z$). We use the prime to indicate transposition in general: in real arithmetic it corresponds to the usual transpose, in complex arithmetic to the Hermitian conjugate transpose. Hence (underlining the zeroth term in the series):
\[
[\dots, u_{-2}', u_{-1}', \underline{u_0'}, u_1', u_2', \dots]\, Z' = [\dots, u_{-2}', \underline{u_{-1}'}, u_0', u_1', \dots]. \tag{2}
\]

$Z'$ is hence a unitary shift represented as a strictly block upper unit matrix. Typically, a numerical analyst would handle only finite sequences of vectors, but the embedding in infinite ones allows one to apply delays as desired and not worry about the precise index points. Similarly, we handle in this paper matrices in which the entries are matrices themselves. For example, $T_{i,j}$ is a block of dimensions $m_i \times n_j$ with $[m_i] = \mathcal{M}$ and $[n_j] = \mathcal{N}$, and, again, indices with no entry are just placeholders, with the corresponding block entries disappearing and also consisting just of place holders. (Interestingly, MATLAB now allows for such matrices, the lack of which was a major annoyance in previous versions. Place holders are very common in computer science; here they prove useful also in linear algebra.) To complete the matrix algebra for this extension, only one extra rule is needed, namely that the product of an $m \times 0$ matrix with a $0 \times n$ matrix is a zero matrix of dimensions $m \times n$.

Block matrices usually represent maps from an indexed input sequence $[u_i]$ to an indexed output sequence $[y_i]$. To define a semi-separable system, we need a more refined structure, which we now introduce. We define a causal system (of computation) by a set of equations
\[
\begin{cases}
x_{i+1} = A_i x_i + B_i u_i \\
y_i = C_i x_i + D_i u_i
\end{cases} \tag{3}
\]
in which we have introduced an intermediate (hidden) state sequence $[x_i]$, which is recursively computed (and acts as the memory of the computation), and matrices $A_i$, $B_i$, $C_i$, $D_i$ at each index point $i$ representing the local linear computation. $\begin{bmatrix} A_i & B_i \\ C_i & D_i \end{bmatrix}$ is called the system transition matrix at time point $i$ ($A_i$ being called the state transition matrix). What is the corresponding input/output matrix $T$? To obtain it, I follow the tradition in classical system theory, replace the local equations above with global equations on the (embedded) sequences $u = [u_i]$, $y = [y_i]$ and $x = [x_i]$, define ‘global’ block diagonal matrices $A = \mathrm{diag}(A_i)$, $B = \mathrm{diag}(B_i)$, etc. and obtain
\[
\begin{cases}
Z'x = Ax + Bu \\
y = Cx + Du
\end{cases} \tag{4}
\]
and, after eliminating the state, the input-output matrix
\[
T = D + CZ(I - AZ)^{-1}B \tag{5}
\]
where I have assumed the inverse to exist. Hence, it must be given precise meaning. One way to do this is to assume that the spectral radius $\sigma(AZ)$ of $AZ$ satisfies $\sigma(AZ) < 1$, which is consistent with the boundedness of the operator. $T$ then represents a bounded, block lower matrix in semi-separable form. Another way would be to assume “one-sided expansions”, but this is a method that I do not pursue further in this paper, although it may have merit on its own, as I am mainly interested in stable numerics. A block upper matrix would have a similar representation, now with $Z'$ replacing $Z$:
\[
T = D + CZ'(I - AZ')^{-1}B. \tag{6}
\]
For ease of reference, I indicate the transition matrix of an operator $T$ (whether causal or anti-causal) with the symbol “$\approx$” as in
\[
T \approx \begin{bmatrix} A & B \\ C & D \end{bmatrix}. \tag{7}
\]
Such representations, often called realizations, produce in a nutshell the special structure of a semi-separable system. When, e.g., $T$ is block banded lower with two bands, then $A = 0$ and $B = I$ will do; the central band is represented by $D$ and the first off band by $C$. With a block three band, one can choose
\[
A = \begin{bmatrix} 0 & 0 \\ I & 0 \end{bmatrix}, \quad C = \begin{bmatrix} C_1 & C_2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} I \\ 0 \end{bmatrix}, \quad \text{with } Z := \begin{bmatrix} Z & 0 \\ 0 & Z \end{bmatrix}
\]
because the state splits in two components. We find, indeed,
\[
Z(I - AZ)^{-1} = \begin{bmatrix} Z & 0 \\ Z^2 & Z \end{bmatrix},
\]
and hence $T = D + C_1 Z + C_2 Z^2$. This principle can easily be extended to yield representations for multi-banded matrices or matrix polynomials in $Z$.

State space realizations are not unique. The dimension chosen for $x_i$ at time point $i$ may be larger than necessary, in which case one calls the representation ‘non minimal’; I shall not consider this case further. Assuming a minimal representation, one could also introduce a non singular state transformation $R_i$ on the state at each time point, defining a new state representation $\hat{x}_i = R_i^{-1} x_i$. The transformed system transition matrix now becomes
\[
\begin{bmatrix} \hat{A}_i & \hat{B}_i \\ \hat{C}_i & D_i \end{bmatrix} := \begin{bmatrix} R_{i+1}^{-1} A_i R_i & R_{i+1}^{-1} B_i \\ C_i R_i & D_i \end{bmatrix} \tag{8}
\]
for a lower system, and a similar, dual representation for an upper.
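To make the relation between the recursion (3) and the input-output matrix (5) concrete, the following is a minimal sketch in plain Python for a finite matrix. It assumes the block formula $T_{i,j} = C_i A_{i-1} \cdots A_{j+1} B_j$ for $i > j$, which is what expanding (5) gives; the helper names (`mm`, `madd`, `block_entry`, `simulate`) and the numerical values are illustrative assumptions, not part of the paper. The example uses the three-band realization described above with scalar inputs and outputs.

```python
def mm(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def madd(X, Y):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def block_entry(A, B, C, D, i, j):
    """Block T[i][j] of the causal operator T = D + C Z (I - A Z)^{-1} B."""
    if i == j:
        return D[i]
    if i < j:  # the strictly upper part of a causal operator is zero
        return [[0] * len(D[j][0]) for _ in D[i]]
    M = B[j]  # i > j: T[i][j] = C_i A_{i-1} ... A_{j+1} B_j
    for k in range(j + 1, i):
        M = mm(A[k], M)
    return mm(C[i], M)


def simulate(A, B, C, D, u):
    """Run x_{i+1} = A_i x_i + B_i u_i, y_i = C_i x_i + D_i u_i from a zero state."""
    x = [[0] for _ in A[0]]
    y = []
    for i, ui in enumerate(u):
        y.append(madd(mm(C[i], x), mm(D[i], ui)))
        x = madd(mm(A[i], x), mm(B[i], ui))
    return y


# Hypothetical three-band example with scalar inputs and outputs.
n = 5
A = [[[0, 0], [1, 0]] for _ in range(n)]      # state splits in two components
B = [[[1], [0]] for _ in range(n)]
C = [[[10 + k, 20 + k]] for k in range(n)]    # C_k = [C1_k  C2_k]
D = [[[k + 1]] for k in range(n)]

T = [[block_entry(A, B, C, D, i, j)[0][0] for j in range(n)] for i in range(n)]
u = [[[k - 2]] for k in range(n)]
y_recursive = [yi[0][0] for yi in simulate(A, B, C, D, u)]
y_matrix = [sum(T[i][j] * u[j][0][0] for j in range(n)) for i in range(n)]
```

In this sketch `T` comes out lower banded with three bands (diagonal `D`, first subdiagonal `C1`, second subdiagonal `C2`), and the output of the recursion agrees with the matrix-vector product. Starting from a zero state at index 0 plays the role of the empty state at the border of a finite matrix.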
Given a block lower matrix $T$, what is a minimal semi-separable representation for it? This problem is known as the system realization problem, and was solved originally by Kronecker [18] in the context of rational functions, and then later by various authors in various circumstances; for the semi-separable case, see [9] for a complete treatment. An essential role in realization theory is played by the so-called $i$th Hankel operator $H_i$ defined as
\[
H_i = \begin{bmatrix}
\cdots & T_{i,i-2} & T_{i,i-1} \\
\cdots & T_{i+1,i-2} & T_{i+1,i-1} \\
 & \vdots & \vdots
\end{bmatrix}, \tag{9}
\]
i.e., a left lower corner matrix just West of the diagonal element $T_{i,i}$. It turns out that any minimal factorization of each $H_i$ yields a minimal realization [9]; we have indeed
\[
H_i = \begin{bmatrix} C_i \\ C_{i+1} A_i \\ C_{i+2} A_{i+1} A_i \\ \vdots \end{bmatrix}
\begin{bmatrix} \cdots & A_{i-1} A_{i-2} B_{i-3} & A_{i-1} B_{i-2} & B_{i-1} \end{bmatrix} \tag{10}
\]
where, as I explained before, entries may disappear when they reach the border of the matrix. This decomposition has an attractive physical meaning. We recognize
\[
\mathcal{O}_i = \begin{bmatrix} C_i \\ C_{i+1} A_i \\ C_{i+2} A_{i+1} A_i \\ \vdots \end{bmatrix} \tag{11}
\]
as the $i$th observability operator, and
\[
\mathcal{R}_i = \begin{bmatrix} \cdots & A_{i-1} A_{i-2} B_{i-3} & A_{i-1} B_{i-2} & B_{i-1} \end{bmatrix} \tag{12}
\]
as the $i$th reachability operator, all these related to the (causal) lower operator we assumed. At any index point $i$, $\mathcal{R}_i$ maps inputs strictly before the time point $i$ to the state $x_i$, while $\mathcal{O}_i$ maps the state $x_i$ to the output at the present index point $i$ and outputs after that, giving its linear contribution to them. The rows of $\mathcal{R}_i$ form a basis for the rows of $H_i$, while the columns of $\mathcal{O}_i$ form a basis for the columns of $H_i$ in a minimal representation. When, e.g., the rows are chosen as an orthonormal basis for all the $H_i$, then a realization will result for which $A_i A_i' + B_i B_i' = I$ for all $i$. We call a realization in which $\begin{bmatrix} A_i & B_i \end{bmatrix}$ has this property of being part of an orthogonal or unitary matrix, in input normal form. Dually, a realization is said to be in output normal form if for each index $i$, $\mathcal{O}_i' \mathcal{O}_i = I$.

A general matrix $T$ is in semi-separable form when both the lower and upper parts have (in general different) system realizations (all matrices shown are block diagonal and consist typically of blocks of low dimensions):
\[
T = C_\ell Z(I - A_\ell Z)^{-1} B_\ell + D + C_u Z'(I - A_u Z')^{-1} B_u. \tag{13}
\]

In typical applications, all these matrices have low dimensions. Their value is that systems with semi-separable realizations can be inverted with a much lower order of numerical complexity than for the classical case of matrix inversion. I shall illustrate this principle soon.
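Since the ranks of the Hankel operators $H_i$ of (9) give the minimal state dimensions, a quick numerical look at them can be instructive. The sketch below, in plain Python with exact rational arithmetic, extracts $H_i$ from a finite lower matrix with scalar entries and computes its rank. The function names `rank` and `hankel` and both test matrices are assumptions made for illustration only.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in M]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for c in range(n_cols):
        pivot = next((k for k in range(r, len(rows)) if rows[k][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(r + 1, len(rows)):
            f = rows[k][c] / rows[r][c]
            rows[k] = [a - f * b for a, b in zip(rows[k], rows[r])]
        r += 1
    return r


def hankel(T, i):
    """H_i: rows i, i+1, ... and columns ..., i-2, i-1 of T (west of T[i][i])."""
    return [row[:i] for row in T[i:]]


n = 6
# Example 1: a full lower matrix whose strictly lower part is p_i * q_j.
p = [1, 2, 3, 4, 5, 6]
q = [1, -1, 2, -2, 3, -3]
T_ss = [[p[i] * q[j] if i > j else (7 if i == j else 0) for j in range(n)]
        for i in range(n)]
# Example 2: a lower banded matrix with three bands.
T_band = [[1 if 0 <= i - j <= 2 else 0 for j in range(n)] for i in range(n)]

ranks_ss = [rank(hankel(T_ss, i)) for i in range(1, n)]
ranks_band = [rank(hankel(T_band, i)) for i in range(1, n)]
```

In this sketch every Hankel block of the first matrix has rank 1, so a one-dimensional state suffices although the lower part is full. The three-band matrix needs a two-dimensional state in the interior, consistent with the three-band realization given earlier, and a smaller one near the borders.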

246

P. Dewilde

It may seem laborious to find realizations for common systems of equations like discretized partial differential equations or integral equations. Luckily, this is not the case. In many instances, realizations come with the physics of the problem. Very common are, besides block banded matrices, so-called smooth matrices [20], in which the Hankel matrices have natural low-rank approximations, and ratios of block banded matrices (which are in general full matrices), and, of course, systems derived from linearly coupled subsystems.

The URV factorization

The goal of a URV factorization is to represent (the block matrix) $T$ as a product of three (block) matrices, $U$, $R$, and $V$, $U$ being isometric, $V$ co-isometric and $R$ upper and upper invertible. My goal in this section is to give the details of a method that computes the factorization in a numerically stable way directly on the semi-separable representation and in a ‘one pass’ way, recursively computing the result for increasing indices. When $T = URV$ and $T$ is invertible, then $U$ and $V$ will be plainly unitary, and $T^{-1} = V'R^{-1}U'$. However, when $T$ is general, then $U$ and $V$ are merely isometric, resp. co-isometric, and the least squares solution of $y = Tu$ is given by $u = T^\dagger y$ with $T^\dagger = V'R^{-1}U'$ (the same would be true for $y = uT$, now with $u = yV'R^{-1}U'$!). $T^\dagger$ is called the ‘Moore-Penrose inverse’ of $T$. The solution to the URV factorization problem in terms of system representations was originally given in [22], and was further elaborated in [9]. In [20] the factorization as a one pass recursive method was given. Remarkably, each of the factors has itself a simple semi-separable representation in terms of the original representation and of a complexity (as measured in the dimension of the intermediate state) that is at most equal to the original.
The URV recursion starts with orthogonal operations on (block) columns, transforming first the mixed lower-upper matrix $T$ to the upper form and then proceeding on an upper matrix. In practice, one actually alternates (block) column operations that make the matrix upper with (block) row operations that reduce the upper form, to achieve the one pass solution. However, the block column operations turn out to be fully independent from the row operations, hence we can treat them first and then complete with row operations (although in numerical practice [20] the operations are staggered).

The (first) column phase of the URV factorization consists in getting rid of the lower or causal part in $T$ by post-multiplication with a unitary matrix, working on the semi-separable representation instead of on the original data. If one takes the lower part in input normal form, i.e., $\hat{C}_\ell Z(I - \hat{A}_\ell Z)^{-1} \hat{B}_\ell = C_\ell Z(I - A_\ell Z)^{-1} B_\ell$ such that $\hat{A}_\ell \hat{A}_\ell' + \hat{B}_\ell \hat{B}_\ell' = I$, then the realization for (upper) $V$ is given by
\[
V \approx \begin{bmatrix} \hat{A}_\ell & \hat{B}_\ell \\ C_V & D_V \end{bmatrix} \tag{14}
\]
where $C_V$ and $D_V$ are formed by unitary completion of the co-isometric $\begin{bmatrix} \hat{A}_\ell & \hat{B}_\ell \end{bmatrix}$ (for an approach familiar to numerical analysts see [20]). $V'$ is a minimal anti causal unitary operator, which pushes $T$ to upper from the right: $\begin{bmatrix} T_u & 0 \end{bmatrix} := TV'$ can be checked to be upper and a realization for $T_u$ follows from the preceding as
\[
T_u \approx \begin{bmatrix}
\hat{A}_\ell' & 0 & C_V' \\
B_u \hat{B}_\ell' & A_u & B_u D_V' \\
\hat{C}_\ell \hat{A}_\ell' + D \hat{B}_\ell' & C_u & \hat{C}_\ell C_V' + D D_V'
\end{bmatrix}. \tag{15}
\]
As expected, the new transition matrix combines lower and upper parts and has become larger, but $T_u$ is now (block) upper. Numerically, this step is executed as an LQ factorization as follows (for an introduction to QR and LQ factorizations, see the appendix). Let $x_k = R_k \hat{x}_k$ and let us assume we know $R_k$ at step $k$, then
\[
\begin{bmatrix} A_{\ell,k} R_k & B_{\ell,k} \\ C_{\ell,k} R_k & D_k \end{bmatrix}
=
\begin{bmatrix}
R_{k+1} & 0 & 0 \\
\hat{C}_{\ell,k} \hat{A}_{\ell,k}' + D_k \hat{B}_{\ell,k}' & \hat{C}_{\ell,k} C_{V,k}' + D_k D_{V,k}' & 0
\end{bmatrix}
\begin{bmatrix} \hat{A}_{\ell,k} & \hat{B}_{\ell,k} \\ C_{V,k} & D_{V,k} \end{bmatrix}. \tag{16}
\]
The LQ factorization of the left-hand matrix computes all the data of the right-hand side, namely the transformation matrix, the data for the upper factor $T_u$ and the new state transformation matrix $R_{k+1}$, allowing the recursion to move on to the next index point. Because we have not assumed $T$ to be invertible, we have to allow for an LQ factorization that produces an echelon form rather than a strictly square lower triangular form, and allows for a kernel as well, represented by a block column of zeros.

The next step is what is called an inner/outer factorization on the upper operator $T_u$ to reduce it to an upper and upper invertible operator $T_o$ and an upper orthogonal operator $U$ such that $T_u = UT_o$. The idea is to find an as large as possible upper and orthogonal operator $U$ such that $U'T_u$ is still upper: $U'$ tries to push $T_u$ back to lower, without destroying its “upperness”. When it does so maximally, an upper and upper invertible factor $T_o$ should result. There is a difficulty here that $T_u$ might not be invertible to start with. This difficulty is not hard to surmount for the factorization to go through, but in order to avoid a too technical discussion, I start out by assuming invertibility and then remark that the procedure automatically produces the general formula needed. If the entries of $T_u$ were scalar, then I would already have reached the goal.
Indeed, the inverse of $T_u$ might have a lower part, which is to be captured by the inner operator $U$ that we shall now determine. When $T_u = UT_o$ with $U$ upper and orthogonal, then also $T_o = U'T_u$. Writing out the factorization in terms of the realization, and redefining for brevity $T_u := D + CZ'(I - AZ')^{-1}B$ we obtain
\[
\begin{aligned}
T_o &= \left[ D_U' + B_U'(I - ZA_U')^{-1} Z C_U' \right] \left[ D + CZ'(I - AZ')^{-1}B \right] \\
&= D_U' D + B_U'(I - ZA_U')^{-1} Z C_U' D + D_U' C Z'(I - AZ')^{-1} B \\
&\quad + B_U' \left\{ (I - ZA_U')^{-1} Z C_U' C Z' (I - AZ')^{-1} \right\} B.
\end{aligned} \tag{17}
\]
This expression has the form: ‘diagonal term’ + ‘strictly lower term’ + ‘strictly upper term’ + ‘mixed product’. The last term has what is called ‘dichotomy’: what stands between $\{\cdot\}$ can again be split in three terms:
\[
(I - ZA_U')^{-1} Z C_U' C Z' (I - AZ')^{-1} = (I - ZA_U')^{-1} Z A_U' Y + Y + Y A Z'(I - AZ')^{-1} \tag{18}
\]
with $Y$ satisfying the “Lyapunov-Stein equation”
\[
Z' Y Z = C_U' C + A_U' Y A \tag{19}
\]
or, with indices: $Y_{k+1} = C_{U,k}' C_k + A_{U,k}' Y_k A_k$. The resulting strictly lower term has to be annihilated, hence we should require $C_U' D + A_U' Y B = 0$; in fact $U$ should be chosen maximal with respect to this property (beware: $Y$ depends on $U$!). Once these two equations are satisfied, the realization for $T_o$ results as $T_o = (D_U' D + B_U' Y B) + (D_U' C + B_U' Y A) Z'(I - AZ')^{-1} B$; we see that $T_o$ inherits $A$ and $B$ from $T_u$ and gets new values for the other constituents $C_o$ and $D_o$. Putting these operations together in one matrix equation and in a somewhat special order, we obtain
\[
\begin{bmatrix} YB & YA \\ D & C \end{bmatrix}
=
\begin{bmatrix} B_U & A_U \\ D_U & C_U \end{bmatrix}
\begin{bmatrix} D_o & C_o \\ 0 & Z'YZ \end{bmatrix}. \tag{20}
\]
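The splitting (18) and the Lyapunov-Stein equation (19) can be checked against each other by a short computation. The following is only a sketch of such a check; the abbreviations $L$ and $M$ are introduced here for convenience and do not appear in the paper.

```latex
\begin{aligned}
&\text{Let } L := (I - ZA_U')^{-1}, \qquad M := (I - AZ')^{-1},
  \text{ so that (18) reads } L\,ZC_U'CZ'\,M = L\,ZA_U'Y + Y + YAZ'\,M. \\[4pt]
&\text{Multiplying by } L^{-1} = I - ZA_U' \text{ on the left and by } M^{-1} = I - AZ' \text{ on the right gives} \\
&\qquad ZC_U'CZ' = ZA_U'Y(I - AZ') + (I - ZA_U')\,Y\,(I - AZ') + (I - ZA_U')\,YAZ' \\
&\qquad\phantom{ZC_U'CZ'} = Y - ZA_U'\,Y\,AZ'. \\[4pt]
&\text{Multiplying by } Z' \text{ on the left and by } Z \text{ on the right, with } Z'Z = ZZ' = I: \\
&\qquad Z'YZ = C_U'C + A_U'YA,
\end{aligned}
```

which is (19). Read backwards, the same computation shows that any $Y$ solving (19) produces the three-term splitting (18), provided the two inverses exist.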

Let us interpret this result without going into motivating theory (as is done in [9, 20]). We have a (block) QR factorization of the left-hand side. At stage $k$ one must assume knowledge of $Y_k$, and then perform a normal QR factorization of $\begin{bmatrix} Y_k B_k & Y_k A_k \\ D_k & C_k \end{bmatrix}$. $D_{o,k}$ will be an invertible, upper triangular matrix, so its dimensions are fixed by the row dimension of $Y_k$. The remainder of the factorization produces $C_{o,k}$ and $Y_{k+1}$, and, of course, the “Q factor” that gives a complete realization of $U_k$. What if $T$ is actually singular? It turns out that then the QR factorization will produce just an upper staircase form with a number of zero rows. The precise result is
\[
\begin{bmatrix} Y_k B_k & Y_k A_k \\ D_k & C_k \end{bmatrix}
=
\begin{bmatrix} B_{U,k} & A_{U,k} & B_{W,k} \\ D_{U,k} & C_{U,k} & D_{W,k} \end{bmatrix}
\begin{bmatrix} D_{o,k} & C_{o,k} \\ 0 & Y_{k+1} \\ 0 & 0 \end{bmatrix}, \tag{21}
\]
in which the extra columns represented by $B_W$ and $D_W$ define an isometric operator $W = D_W + C_W Z'(I - A_U Z')^{-1} B_U$ so that
\[
T_u = \begin{bmatrix} U & W \end{bmatrix} \begin{bmatrix} T_o \\ 0 \end{bmatrix} \tag{22}
\]
and $W$ characterizes the row kernel of $T$. Another situation (of importance for the Löwner interpolation theory treated in the last section of this paper) is when $T$ is right-outer (i.e., causal with causal right inverse). In that case $Y$ should be empty for all index points and at each such point one then has the simplified QR factorization
\[
\begin{bmatrix} D & C \end{bmatrix} = D_U \begin{bmatrix} D_o & C_o \end{bmatrix}. \tag{23}
\]
Actually, one can then just choose $D_U = I$ and nothing changes, but the occurrence has to be tested of course. Whether this happens is dependent on the past of $T$, as we have to know $Y_k$ at each step $k$. If the support for $T$ is only half infinite (say with indices running from 1 on), it will be necessary and sufficient that all subsequent $\begin{bmatrix} D_k & C_k \end{bmatrix}$ have full row rank.

Remarkably, the operations work on the rows of $T_u$ in ascending index order, just as the earlier factorization worked in ascending index order on the columns. That means that the URV algorithm can be executed completely in ascending index order. The reader may wonder at this point (1) how to start the recursion and (2) whether the proposed recursive algorithm is numerically stable. On the first point and with our convention of empty matrices, there is no problem starting out at the upper left corner of the matrix: both $A_1$ and $Y_0$ are just empty, and the first QR is done on $\begin{bmatrix} D_1 & C_1 \end{bmatrix}$. In case the original system does not start at some finite index, but has a system part that runs from $-\infty$ onwards, one must introduce knowledge of some initial condition on $Y$. This is provided, e.g., by an analysis of the LTI system running from $-\infty$ to 0 if that is indeed the case, see [10] for more details. On the matter of numerical stability, I offer two remarks. First, propagating $Y_k$ is numerically stable: one can show that a perturbation on any $Y_k$ will die out exponentially if the propagating system is assumed exponentially stable. Second, one can show that the transition matrix $\Delta$ of the inverse of the outer part will be exponentially stable as well, when certain conditions on the original system are satisfied [9], p. 367.

Banded matrices with banded inverse

A banded lower matrix will have a minimal semi-separable realization for which the transition operator $A$ is such that $AZ$ is nilpotent (the “degree” of nilpotency, which may be variable, determines the size of the band). Clearly, when $AZ$ is nilpotent, then so is $ZA$. Dually, an upper matrix with transition operator $A$ shall be banded when $AZ'$ or equivalently, $Z'A$ is nilpotent.
Suppose $T$ is upper and upper invertible ($T$ is outer) and banded, then an interesting question arises whether $T^{-1}$ can be banded as well. If $T = D + CZ'(I - AZ')^{-1}B$ is a minimal upper realization for $T$, then a minimal upper realization for $T^{-1}$ is given by $T^{-1} = D^{-1} - D^{-1}CZ'(I - \Delta Z')^{-1}BD^{-1}$, in which the transition matrix $\Delta = A - BD^{-1}C$. Typically, $\Delta Z'$ will not be nilpotent when $AZ'$ is, but it can actually be, notably when $BD^{-1}C = 0$. I call this case, in which the inverse has the same band as the original, a “strictly banded inverse”. It appears in major applications such as “lapped transforms”, “Haar transforms” and “wavelet representations” [21]. It is of course possible that the inverse is banded with a larger band than the original, but I do not know how to treat this more general case. In fact, all finite matrices belong to that more general class, so it does not really make sense for them.

Theorem 1. Let $T$ be a double-sided, banded matrix with strictly banded inverse, then the URV factorization is such that the two inner factors $U$ and $V$ and the outer factor $T_o$ are all banded with strictly banded inverse.
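As a small illustration of the strictly banded inverse discussed before Theorem 1, the sketch below builds a block upper bidiagonal matrix from a realization with $A = 0$ and $B_i D_i^{-1} C_i = 0$, so that $\Delta = A - BD^{-1}C = 0$, and checks that the inverse given by the realization formula is block upper bidiagonal as well. The concrete blocks, the block size 2 and the helper names are assumptions chosen for the example; they are not taken from the paper.

```python
def mm(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def assemble(n, diag, sup):
    """Dense 2n x 2n matrix from 2x2 diagonal and 2x2 superdiagonal blocks."""
    M = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for r in range(2):
            for c in range(2):
                M[2 * i + r][2 * i + c] = diag[i][r][c]
                if i + 1 < n:
                    M[2 * i + r][2 * i + 2 + c] = sup[i][r][c]
    return M


n = 4
D = [[[1, i], [0, 1]] for i in range(n)]        # invertible diagonal blocks
Dinv = [[[1, -i], [0, 1]] for i in range(n)]    # their exact inverses
B = [[[1, 0]] for _ in range(n)]                # 1 x 2
C = [[[i], [1]] for i in range(n)]              # 2 x 1

# With A = 0, Delta_i = -B_i D_i^{-1} C_i; it vanishes by construction.
BDinvC = [mm(mm(B[i], Dinv[i]), C[i])[0][0] for i in range(n)]

# T = D + C Z' B has superdiagonal blocks C_i B_{i+1}.
T_sup = [mm(C[i], B[i + 1]) for i in range(n - 1)]
# T^{-1} = D^{-1} - D^{-1} C Z' B D^{-1} when Delta = 0.
Tinv_sup = [
    [[-v for v in row] for row in mm(mm(mm(Dinv[i], C[i]), B[i + 1]), Dinv[i + 1])]
    for i in range(n - 1)
]

T = assemble(n, D, T_sup)
Tinv = assemble(n, Dinv, Tinv_sup)
product = mm(T, Tinv)
identity = [[int(r == c) for c in range(2 * n)] for r in range(2 * n)]
```

In this sketch `product` equals `identity`, so the inverse occupies exactly the same two block bands as `T`. Replacing `C` by a choice with $B_i D_i^{-1} C_i \neq 0$ would in general make the inverse fill up the whole upper triangle.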

Remarks.
∙ The factors can be obtained by the one-pass recursive algorithm described earlier and each have system realizations whose state complexity is at most equal to the state realization of the original.
∙ The notion of ‘banded outer with strictly banded inverse’ is a generalization of the classical LTI notion of ‘unimodular’.

Proof. The theorem follows directly from the construction of the URV factorization given earlier. The factor $V$ is causal and inherits the transition matrix $A_\ell$ of the lower part for which $A_\ell Z$ is nilpotent. $V'$ is automatically nilpotent also, as it has the same transition matrix conjugated. The transition matrix of the resulting upper $T_u$ is from before
\[
A = \begin{bmatrix} \hat{A}_\ell' & 0 \\ B_u \hat{B}_\ell' & A_u \end{bmatrix} \tag{24}
\]
and is nilpotent, since both $\hat{A}_\ell' Z'$ and $A_u Z'$ are. The inverse $T_u^{-1} = VT^{-1}$ exists by hypothesis and remains banded as a product of two banded matrices. The extraction of $U$ can now again be interpreted as an external factorization $T_u^{-1} = T_o^{-1} U'$, which is such that the upper, and necessarily banded $U$ (since $T_u^{-1}$ is banded) annihilates the lower part of $T_u^{-1}$ resulting in an upper $T_o^{-1}$, which again has to be banded, and the same will be true for $T_o = U'T_u$. More work is needed to show that the total size of the band does not increase by the procedure, but given the band structure of $V'$ and $U'$, and the fact that $V'$ is upper and $U'$ lower, the stability of the band follows naturally. □

We are now ready to tackle the main topic of this paper: representations of semi-separable operators as ratios of polynomial matrices in the shift operator $Z$.

2. Matrix polynomial representations

Although there are complete theories for external and inner/outer factorizations (as somewhat described in the introduction), the polynomial representation theory for general matrices or operators (viewed as maps $\ell_2^{\mathcal{M}} \to \ell_2^{\mathcal{N}}$) generalizing the complex matrix function theory to semi-separable matrices or time-varying systems has been elusive (a first attempt can be found in [5], limited by the special problem treated in that paper). I tried to generalize the famous Popov construction to the semi-separable setting, but was unable to do so. When I encountered the paper of Paul Van Dooren on dead beat control [12], I stumbled on a feasible and attractive technique, which I am now presenting. Let $T$ be a causal (lower) operator (I shall assume $T$ to be bounded, although generalizations can be constructed). The goal is to find minimal representations for $T$ of the type $T = \Delta_\ell P^{-1}$ or $T = Q^{-\prime} \Delta_r'$, in which $\Delta_\ell$, $\Delta_r$, $P$, $Q$ are all polynomials in $Z$ of minimal degree. I shall show that under very mild conditions such representations do indeed exist and how they can be computed.

Preliminaries

The gist of the method that I shall present is the (recursive) calculation of pre-images. To do this comfortably (as I shall have to modify bases recursively), I make a distinction between matrices and “abstract” operators, the latter being basis free. I write abstract vectors in boldface or in Greek characters, while their concrete representation in a given basis is in normal font. So, when $\mathbf{x}$ is an abstract vector in a space with basis $[\xi_i]_{i=1 \cdots n}$, we have $\mathbf{x} = \sum_{i=1}^{n} \xi_i x_i$ with $x_i$ the components of $\mathbf{x}$ in the given basis. Following the tradition of Differential Geometry (or Quantum Mechanics), we can just as well assemble the $\xi_i$ in a (row) vector stack $\begin{bmatrix} \xi_1 & \cdots & \xi_n \end{bmatrix}$ and write $\mathbf{x} = \xi x$, in which now $x$ is a column vector assembling the components of $\mathbf{x}$ in the basis $\xi$; such a notation can accommodate any indexing scheme, of course. Suppose $a : \mathcal{X} \to \mathcal{Y} : \mathbf{x} \mapsto \mathbf{y} = a\mathbf{x}$. If we have a basis stack $\eta$ in $\mathcal{Y}$, $\mathbf{y} = \eta y$, and another $\xi$ in $\mathcal{X}$, then there is a matrix $A$ so that $y = Ax$, because then $\mathbf{y} = \eta y = a\mathbf{x} = a\xi x$, $a\xi$ has the formal matrix calculus interpretation $a\xi = \begin{bmatrix} a\xi_1 & \cdots & a\xi_m \end{bmatrix}$ (assuming there are $m$ base vectors numbered $1 \cdots m$) and each of these entries evaluates as $a\xi_j = \sum_i \eta_i A_{ij}$ so that (again using matrix notation)
\[
\eta y = \eta A x \tag{25}
\]

and as 𝜂 forms a basis, necessarily 𝑦 = 𝐴𝑥, a purely numerical expression. In the sequel I shall use spaces spanned by vectors that do ⋁ not necessarily have to form a basis, in particular if 𝜉 is a stack of vectors I write 𝜉 for the space spanned by the vectors. ⋁Suppose now that 𝑎 : ℬ → 𝒴 and 𝑏 : 𝒰 → 𝒴 are operators to a same space 𝒴, stack of vectors in 𝒰 (e.g., natural let 𝜉0 deﬁne a subspace of ℬ and u a (row) ⋁ basis vectors), how do we know that the (𝑎𝜉 ) ⋁0 lies in the image of u⋁under 𝐵 or, more generally, what is the full pre-image of (𝑏u) under 𝑎? I claim: (𝑎𝜉0 ) lies in ⋁ (𝑏u) iﬀ there exists a matrix 𝐹 such that 𝑎𝜉0 = 𝑏u𝐹 . The signiﬁcance of this is that the “input” −u𝐹 𝑥 is capable of annihilating 𝑎𝜉0 𝑥 (in control applications this is called a feedback loop). The existence deﬁnition ⋁ of 𝐹 follows from the following ∑ of its entries: since each entry 𝑎𝜉0,𝑗 ∈ 𝑏u we can express 𝑎𝜉0,𝑗 = 𝑖 𝑏u𝑖 𝐹𝑖,𝑗 ⋁. The next question is: ﬁnd a basis for the full subspace 𝒮 ∈ ℬ that maps to (𝑏u), 0 ⋁ ⋁ i.e., for the pre-image of (u𝑏) under 𝑎, sometimes denoted 𝑎−1 ( 𝑏u). I present algorithms to compute 𝐹 in the appendix. Dead beat control and the construction of a polynomial matrix Let us now assume that we are given the 𝐴 and 𝐵 operators of a causal semiseparable system, and that the pair {𝐴, 𝐵} is such that any state 𝑥𝑖 can be brought to zero in less than some ﬁxed ﬁnite time 𝑘 – i.e., there exist inputs 𝑢𝑖 ⋅ ⋅ ⋅ 𝑢𝑖+ℓ , ℓ ≤ 𝑘 that bring the 𝑥𝑖 to zero. Suﬃcient for that is that there exists a ﬁxed index 𝑘 such that the partial reachability operator [ℛ𝑖 ][𝑖−𝑘:𝑖−1] has full range for all 𝑖 (this means that every state at any time can be reached from zero or controlled to zero by an input sequence of length at most 𝑘. A weaker necessary and suﬃcient condition can


P. Dewilde

be formulated but is considerably more involved and hard to check. The sufficient condition is the common case, and easily satisfied in the finite-dimensional case). Our goal shall be to find a diagonal operator $F$ such that $(A - BF)Z$ is a minimally nilpotent operator. If that is the case, then the system defined by the transition matrix $\begin{bmatrix} A & B \\ F & I \end{bmatrix}$ will be such that it has a causal, polynomial inverse given by the transition matrix $\begin{bmatrix} A - BF & B \\ -F & I \end{bmatrix}$. It is easy to check that this is indeed the inverse system, just by solving the direct system for the input, given the output, and given the assumption that $(A - BF)Z$ is nilpotent, the inverse system becomes automatically polynomial in $Z$ and hence causal. $F$ is found by the dead beat construction, which attempts to find a feedback control that brings any state to zero in a minimum number of steps, and which I now introduce.

Let us assume that we are standing at point $i$ in the state space recursion, $\mathcal{B}_i$ being the state space at point $i$. We assume (1) that the system is uniformly controllable in at most a fixed finite time $k$ and (2) that we already know how to dead beat the state at point $(i+1)$. We materialize the latter assumption by assuming that we have at our disposal a basis $\eta$ for $\mathcal{B}_{i+1}$ which has a decomposition in subspaces $\mathcal{S}_{i+1,0} \subset \mathcal{S}_{i+1,1} \subset \cdots \subset \mathcal{S}_{i+1,k_{i+1}} = \mathcal{B}_{i+1}$, where $\mathcal{S}_{i+1,j}$ is defined as the subspace of $\mathcal{B}_{i+1}$ that can be dead beat controlled in at most $j$ steps, the 0th step being the control in step $i+1$. $\eta$ is a stack of bases $\eta_0, \eta_1, \ldots, \eta_{k_{i+1}}$ such that $\mathcal{S}_{i+1,j} = \bigvee_{\ell=0}^{j} \eta_\ell$. For $\mathcal{B}_i$ we have possibly been given an original basis $\xi^p$, the goal being to find a dead beat decomposition for it, similar to the one we already have for $\mathcal{B}_{i+1}$. Let $A_i$ and $B_i$ be the matrices in the current bases realizing $a_i : \mathcal{B}_i \to \mathcal{B}_{i+1}$, respectively $b_i : \mathcal{U}_i \to \mathcal{B}_{i+1}$, $\mathcal{U}_i$ having the basis $\mathbf{u}_i$.
It should be clear that the state at stage $i$ can be dead beat controlled in at most $k_{i+1} + 1$ steps, but it might be in less (certainly less than $k$); we denote the maximum number at stage $i$ by $k_i$ (of course). Dropping indices wherever clear, we define $\mathcal{S}_0 \subset \mathcal{B}_i$ as the pre-image of $\bigvee b_i\mathbf{u}_i$, and then recursively, $\mathcal{S}_j \subset \mathcal{B}_i$ as the pre-image under $a_i$ of $\bigvee(b_i\mathbf{u}_i, \mathcal{S}_{i+1,j-1})$ with $j \le k_{i+1} + 1$. As $[\,\xi_0 \ \xi_1 \ \cdots \ \xi_{k_{i+1}}\,]$ is the stack of basis vectors conformal to the decomposition $[\mathcal{S}_k]$ of $\mathcal{B}_i$ and because of the pre-image relations just described we shall have
\[
a_i \begin{bmatrix} \xi_0 & \xi_1 & \xi_2 & \cdots & \xi_{k_{i+1}} \end{bmatrix}
= \begin{bmatrix} b_i\mathbf{u}_i & \eta_0 & \eta_1 & \cdots & \eta_{k_i} \end{bmatrix}
\begin{bmatrix}
F_{i,0} & F_{i,1} & F_{i,2} & \cdots & F_{i,k_{i+1}} \\
0 & G_{1,0} & G_{1,1} & \cdots & G_{1,k_{i+1}} \\
\vdots & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & G_{k_i,k_{i+1}}
\end{bmatrix}
\tag{26}
\]
for some matrices $F_{i,j}$ and $G_{\ell,j}$ (for algorithms to compute these matrices, see the appendix). $F_i = [\,F_{i,0} \ \cdots \ F_{i,k_{i+1}}\,]$ is the feedback matrix desired, at step $i$, in the bases just defined (the $G$'s produce a realization of the operator $a_i$ at step $i$; see also the appendix for more detail).
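The claim made above, that the system with transition matrix $\begin{bmatrix} A & B \\ F & I \end{bmatrix}$ has a causal polynomial inverse once $A - BF$ is nilpotent, can be checked numerically in the simplest possible setting. The following is only a sketch under stated assumptions: it restricts to the time-invariant, single-input case (so that $Z$ acts as an ordinary delay variable $z$), it borrows the data $A$, $B$ and $F = [\,1\ 3\ 3\,]$ from the LTI example in the appendix, and the helper names (`mat_vec`, `markov`, `convolve`) are illustrative choices, not notation from the paper.

```python
def mat_vec(m, v):
    """Product of a square matrix (list of rows) with a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def markov(c, a, b, n):
    """First n Markov parameters c a^(k-1) b, k = 1..n (scalar input/output)."""
    out, v = [], list(b)
    for _ in range(n):
        out.append(sum(ci * vi for ci, vi in zip(c, v)))
        v = mat_vec(a, v)
    return out


def convolve(p, q, n):
    """First n coefficients of the product of two power series."""
    return [sum(p[i] * q[k - i] for i in range(k + 1)) for k in range(n)]


A = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
B = [0, 0, 1]
F = [1, 3, 3]  # assumed dead beat feedback, taken from the appendix example
Af = [[A[i][j] - B[i] * F[j] for j in range(3)] for i in range(3)]  # A - B F

N = 8  # truncation order of the power series
# P^{-1}(z) = 1 + F z (I - A z)^{-1} B: an infinite, non-polynomial series
p_inv = [1] + markov(F, A, B, N - 1)
# P(z) = 1 - F z (I - Af z)^{-1} B: terminates because Af is nilpotent
p = [1] + [-h for h in markov(F, Af, B, N - 1)]

product = convolve(p, p_inv, N)
print("P      :", p)
print("P^{-1} :", p_inv)
print("product:", product)
```

The coefficients of `p` vanish beyond degree three, so the inverse is polynomial, and the truncated product reduces to the unit series. In the time-varying case the same bookkeeping would have to be carried out with diagonal operators and shifts, which this sketch does not attempt.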

Banded Matrices, Banded Inverses, Polynomial Representations


2.1. Polynomial representations

We start out with a causal (lower) matrix in output normal form $T = D + CZ(I - AZ)^{-1}B$ : $A'A + C'C = I$, and we assume the system to be uniformly strictly stable ($\sigma(AZ) < 1$). We then know that the inner $V := D_V + CZ(I - AZ)^{-1}B_V$ with observability pair $\{A, C\}$ will be such that $T'V$ is causal (see Thm. 6.8 in [9]). The diagonal operators $B_V$ and $D_V$ are such that $\begin{bmatrix} A & B_V \\ C & D_V \end{bmatrix}$ is unitary. Using the dead beat control construction of the previous paragraphs based on the matrices $\{A, B_V\}$ we find a feedback operator $F$ such that $\begin{bmatrix} A & B_V \\ F & I \end{bmatrix}$ has a polynomial inverse. This leads to the following theorem:

Theorem 2. There exist minimal polynomial operators $P$ and $Q$ such that $V = QP^{-1} = Q^{-\prime}P'$ and realizations given by
\[
\begin{aligned}
P^{-1} &\approx \begin{bmatrix} A & B_V \\ F & I \end{bmatrix}, &
P &\approx \begin{bmatrix} A - B_V F & B_V \\ -F & I \end{bmatrix}, \\
Q &\approx \begin{bmatrix} A - B_V F & B_V \\ C - D_V F & D_V \end{bmatrix}, &
Q^{-1} &\approx \begin{bmatrix} A' & C' \\ B_V' + FA' & D_V' + FC' \end{bmatrix},
\end{aligned}
\tag{27}
\]
in which $P$ is polynomial with causal inverse, while $Q$ is polynomial with anticausal inverse.

Remark: $Q$ characterizes what would be considered the ‘poles’ of $T$ in a linear time invariant setting!

Proof. We define $P^{-1}$ by the dead beat construction based on $\{A, B_V\}$. As indicated there, the inverse $P$ then becomes automatically polynomial in $Z$, as seen by direct evaluation of the output in terms of the input. Next, we obtain $Q^{-1}$ from $Q^{-1} = P^{-1}V'$, which, using the property that $\{A, B_V\}$ is in input normal form, evaluates to $(D_V' + FC') + (B_V' + FA')(I - Z'A')^{-1}Z'C'$ and hence the (anticausal) realization given. A realization for $Q$ is obtained directly from $Q = VP$ by introducing the realizations for $V$ and $P$. One verifies that
\[ (I - Z'A')^{-1}Z'C'(C - D_V F)Z(I - A_f Z)^{-1} = (I - Z'A')^{-1}Z'A' + I + A_f Z(I - A_f Z)^{-1}, \]
so that, indeed, $P = V'Q$ and $Q^{-1}Q = I$, with the given realization for $Q$ as a causal polynomial. □

At this point I wish to introduce the notion of minimal lengths (causal) polynomial inverse based on a reachability pair $\{A, B\}$ (and a dual notion for the observability pair).
For ease of discussion and without impairing generality, we normalize the instantaneous term to $I$ as before. Any minimal degree inverse will have a realization of the form $P^{-1} \approx \begin{bmatrix} A & B \\ F_n & I \end{bmatrix}$ for some suitable $F_n$, i.e., one for



which $A - BF_n$ is nilpotent. One such is when $F_n$ is chosen equal to $F$. I shall say the polynomial inverse has minimal lengths if $F_n$ is chosen so that the rank of the nilpotent operator $[Z(A - BF_n)]^k$ is minimal for each $k$. The following theorem is valid; I give only a sketch of the proof because the recursive argument is very technical.

Theorem 3. All minimal lengths causal polynomials for which $V = Q_n P_n^{-1}$ and for which $P_{n,0} = I$ are of the form
\[ P_n^{-1} \approx \begin{bmatrix} A & B_V \\ F_n & I \end{bmatrix} \tag{28} \]
with $F_n = F + G$ for some commensurable $G$ in the kernel of $B_V$, i.e., for which $B_V G = 0$, and $F$ is determined by the dead beat construction. Moreover, $P_n = P_o P$ in which $P_o = I - GZ(I - AZ)^{-1}B_V$ is outer with outer inverse $P_o^{-1} = I + GZ(I - AZ)^{-1}B_V$.

Proof (sketch). It is easily verified directly that, given $F_n$, $V = Q_n P_n^{-1}$ with exactly the same construction for the polynomials $P_n$ and $Q_n$ shown earlier in the case of $F$. Conversely, to show the claimed unicity, let a factorization $V = Q_n P_n^{-1}$ be given; then $P_n^{-1}$ as the right factor necessarily defines the controllability space of $V$ and since it is supposed to be minimal, its AB-pair can be chosen to be $\{A, B_V\}$ (see the realization theory in, e.g., [9]). Hence, $P_n^{-1}$ must have a realization as given in (28), with some new $F_n$. To show then that $F_n = F + G$, with $G$ such that $B_V G = 0$, one shows that this follows from the fact that the minimal dimensions of the ranges of $Z(A - BF_n)$ are precisely the dimensions of the spaces $\mathcal{S}_{i,k}$ of the dead beat construction, and the feedback operators realizing these dimensions all have to be of the form $F_{i,k} + G_{i,k}$ in which $G_{i,k}$ belongs to the kernel of $B_V$. This is the technical (recursive) part of the proof, which I have merely sketched. □

Next we have the minimal representations for $T$ as the ratio of two polynomials:

Theorem 4. Let $T$ be a uniformly exponentially stable causal semi-separable operator, whose minimal realization is uniformly controllable and observable in finite time.
Then 𝑇 has a minimal representation as a ratio of two polynomial operators 𝑇 = 𝑄−′ Δ′ = 𝑃𝑐−1 Δ𝑐 , in which 𝑄 is polynomial in 𝑍 with anticausal inverse, Δ and Δ𝑐 are polynomial in 𝑍 and 𝑃𝑐 is polynomial in 𝑍 with causal inverse. Moreover, 𝑃𝑐 will be a unique polynomial of minimal length with this property except for an invertible diagonal right factor. Remark. System representations for these polynomial matrices can be found through dead beat constructions based on reachability or observability pairs – see the proof. Proof. Let 𝑉 be deﬁned as before. 𝑉 will also be uniformly reachable in ﬁnite time when it is uniformly observable in ﬁnite time, due to the fact that it has a uniformly


stable unitary realization (the proof is a simple exercise in realization theory, as at each time point the rank of the reachability matrix equals that of the observability matrix). Let $P$ and $Q$ be as derived above from $V$ and $T$. The property ‘$T'V$ is causal’ translates into ‘$\Delta = T'Q$ is causal’ because of the causality properties of $P$. This we verify directly. Since $Q$ is polynomial, $\Delta$ has to be polynomial as well, yielding the state space realization for $\Delta$ (with $A_f := A - B_V F$):
\[
\begin{aligned}
\Delta = T'Q &= D'D_V + D'(C - D_V F)(I - ZA_f)^{-1}ZB_V + B'Z'(I - A'Z')^{-1}C'D_V \\
&\quad + B'Z'(I - A'Z')^{-1}C'(C - D_V F)(I - ZA_f)^{-1}ZB_V \\
&= (D'D_V + B'B_V) + [D'(C - D_V F) + B'(A - B_V F)]\,Z(I - A_f Z)^{-1}B_V
\end{aligned}
\tag{29}
\]
or, with
\[
\begin{bmatrix} A' & C' \\ B' & D' \end{bmatrix}
\begin{bmatrix} A & B_V \\ C & D_V \end{bmatrix}
= \begin{bmatrix} I & 0 \\ C_c & D_c \end{bmatrix},
\tag{30}
\]
\[ \Delta = D_c + [C_c - D_c F]\,Z(I - A_f Z)^{-1}B_V \tag{31} \]

which indeed exhibits $\Delta$ as polynomial since $A_f Z$ is nilpotent. This proves the theorem for $Q$ and $\Delta$. A further factorization is obtained with the same machinery. Let us define a kind of dual operator
\[ T_c' = V'T = (D_V'D + B_V'B) + B_V'(I - Z'A')^{-1}Z'(C'D + A'B), \]
or, taking conjugates (the realization given may not be minimal; it will actually only be minimal when $T$ has no intrinsic inner left factor, i.e., an inner, degree reducing left factor!),
\[ T_c := (B'B_V + D'D_V) + (B'A + D'C)\,Z(I - AZ)^{-1}B_V; \tag{32} \]
then we have the factorization $T_c' = V'T = P^{-\prime}Q'Q^{-\prime}\Delta' = P^{-\prime}\Delta'$, or
\[ T_c = \Delta P^{-1} \tag{33} \]
and $P$ is now seen as the dead beat polynomial based on the input $\{A, B\}$-pair of $T_c$. Essential uniqueness for $P_c$ in case $AZ$ is completely non-unitary follows directly from the fact that in that case $B_V$ cannot have a non-zero kernel. □

The connection between $T$ and $T_c$ is in a sense ‘symmetrical’: with $T_c = D_c + C_c Z(I - AZ)^{-1}B_V$, we have
\[
\begin{aligned}
\begin{bmatrix} A & B_V \\ C_c & D_c \end{bmatrix}
&= \begin{bmatrix} I & 0 \\ B' & D' \end{bmatrix}
\begin{bmatrix} A & B_V \\ C & D_V \end{bmatrix}, \\
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
&= \begin{bmatrix} A & B_V \\ C & D_V \end{bmatrix}
\begin{bmatrix} I & C_c' \\ 0 & D_c' \end{bmatrix}
\end{aligned}
\tag{34}
\]

showing that not only 𝑇 can be re-derived from (the non-minimal realization of) 𝑇𝑐 , but also that the relation is actually of the duality kind, reachability exchanged for observability. We have dual relations for 𝑇𝑐 . In particular, 𝑇 = 𝑉 𝑇𝑐′ is causal,



and we have for the realizations, with $G$ the dual of $F$ now based on the pair $\{A, C\}$, and $A_g = A - GC$ nilpotent,
\[
\begin{aligned}
P_c^{-1} &\approx \begin{bmatrix} A & G \\ C & I \end{bmatrix}, &
P_c &\approx \begin{bmatrix} A_g & -G \\ C & I \end{bmatrix}, \\
Q_c^{-1} &\approx \begin{bmatrix} A' & C' + A'G \\ B_V' & D_V' + B_V'G \end{bmatrix}, &
Q_c &\approx \begin{bmatrix} A_g & B_V - GD_V \\ C & D_V \end{bmatrix}, \\
\Delta_c &\approx \begin{bmatrix} A_g & B - GD \\ C & D \end{bmatrix}
\end{aligned}
\tag{35}
\]
such that
\[ T_c = \Delta_c' Q_c^{-\prime} \ (= \Delta P^{-1}), \tag{36} \]
and there is of course a whole collection of similar conjugate relations. It is easily verified, by direct computation, that $T = P_c^{-1}\Delta_c$.

Hence, we have obtained a ‘right’ polynomial representation of a uniformly exponentially stable causal matrix $T = Q^{-\prime}\Delta'$. $\Delta$ need not be invertible, but remark that $Q^{-\prime}$ borrows the original observability pair as expected. Such a factorization is of course not unique: $\Delta$ and $Q$ could be replaced by any $\Delta U$, $QU$ when $U$ is polynomial, invertible and $U^{-1}$ is polynomial as well, so that both $Q$ and $QU$ define the same observability kernel. $U$ is then a time varying or matrix version of a unimodular operator.

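Bezout relations

The previous development allows for the explicit determination of Bezout identities as well. We observe that
\[ \begin{bmatrix} P_c & \Delta_c \end{bmatrix} = \begin{bmatrix} I & D \end{bmatrix} + CZ(I - A_g Z)^{-1}\begin{bmatrix} -G & B - GD \end{bmatrix}. \tag{37} \]
Let now a minimal $H$ be such that $A - BH$ is nilpotent. $H$ exists because we assumed the original system to be controllable in finite time and is obtained through the dead beat construction. Then
\[ A_g - \begin{bmatrix} -G & B - GD \end{bmatrix}\begin{bmatrix} I & -D \\ 0 & I \end{bmatrix}\begin{bmatrix} C \\ H \end{bmatrix} = A - BH := A_h \tag{38} \]
and let now, for some new operators $-\Delta_r$ and $P_r$,
\[ \begin{bmatrix} M & -\Delta_r \\ N & P_r \end{bmatrix} = \begin{bmatrix} I & -D \\ 0 & I \end{bmatrix} - \begin{bmatrix} C - DH \\ H \end{bmatrix} Z(I - A_h Z)^{-1}\begin{bmatrix} -G & B \end{bmatrix}. \tag{39} \]
Then
\[ \begin{bmatrix} P_c & \Delta_c \end{bmatrix}\begin{bmatrix} M & -\Delta_r \\ N & P_r \end{bmatrix} = \begin{bmatrix} I & 0 \end{bmatrix} \tag{40} \]
and we have reached the Bezout identity
\[ P_c M + \Delta_c N = I, \tag{41} \]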


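with $M$ and $N$ polynomial in $Z$, as well as a conjugate factorization $P_c^{-1}\Delta_c = \Delta_r P_r^{-1} = T$. From this construction we have further
\[
P_r \approx \begin{bmatrix} A_h & B \\ -H & I \end{bmatrix}, \qquad
P_r^{-1} \approx \begin{bmatrix} A & B \\ H & I \end{bmatrix}, \qquad
\Delta_r \approx \begin{bmatrix} A_h & B \\ C - DH & D \end{bmatrix}
\tag{42}
\]
dual to the construction of $P$ and $\Delta$, where, again, $P_r$ is causal with causal inverse, $H$ is the corresponding dead beat operator based on the pair $\{A, B\}$ (not necessarily in input normal form!), and $\Delta_r$ is polynomial. Finally, we also have two new polynomial operators $R$ and $S$ such that
\[ \begin{bmatrix} P_c & \Delta_c \\ R & S \end{bmatrix} = \begin{bmatrix} I & D \\ 0 & I \end{bmatrix} + \begin{bmatrix} C \\ H \end{bmatrix} Z(I - A_g Z)^{-1}\begin{bmatrix} -G & B - GD \end{bmatrix}, \tag{43} \]
the dual Bezout identity
\[ (-R)\Delta_r + SP_r = I, \tag{44} \]
and the connection between the two completed, invertible polynomial matrices
\[ \begin{bmatrix} M & -\Delta_r \\ N & P_r \end{bmatrix} = \begin{bmatrix} P_c & \Delta_c \\ R & S \end{bmatrix}^{-1}. \tag{45} \]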
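The state space formulas (37), (39) and the Bezout identity (41) can be exercised on a small example. What follows is a hedged sketch, not the paper's algorithm: it assumes the scalar time-invariant case (so $Z$ becomes a delay variable $z$ and all operators commute), it reuses $A$ and $B$ from the LTI example of the appendix with $H = [\,1\ 3\ 3\,]$, and it introduces $C = [\,1\ 0\ 0\,]$, $D = 2$ and $G = [\,3\ 3\ 1\,]'$ purely for illustration (these three are assumptions chosen so that $A - GC$ is nilpotent). Polynomials are stored as coefficient lists in ascending powers of $z$.

```python
def mat_vec(m, v):
    """Product of a square matrix (list of rows) with a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def series(d, c, a, b, n=3):
    """Coefficients of d + c z (I - a z)^{-1} b up to z^n (scalar input/output).

    Exact for n = 3 here because the 3 x 3 matrices used below are nilpotent.
    """
    out, v = [d], list(b)
    for _ in range(n):
        out.append(sum(ci * vi for ci, vi in zip(c, v)))
        v = mat_vec(a, v)
    return out


def poly_mul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r


def poly_add(p, q):
    return [x + y for x, y in zip(p, q)]


A = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
B, C, D = [0, 0, 1], [1, 0, 0], 2   # C and D are illustrative assumptions
H = [1, 3, 3]                       # makes A - B H nilpotent
G = [3, 3, 1]                       # makes A - G C nilpotent (assumed choice)
Ag = [[A[i][j] - G[i] * C[j] for j in range(3)] for i in range(3)]
Ah = [[A[i][j] - B[i] * H[j] for j in range(3)] for i in range(3)]

neg_G = [-g for g in G]
B_GD = [b - g * D for b, g in zip(B, G)]
C_DH = [c - D * h for c, h in zip(C, H)]

# (37): [P_c  Delta_c] = [1  D] + C z (I - Ag z)^{-1} [-G  B - G D]
Pc = series(1, C, Ag, neg_G)
Dc = series(D, C, Ag, B_GD)
# (39): first column, M and N
M = series(1, C_DH, Ah, G)
Nn = series(0, H, Ah, G)
# (39): second column, -Delta_r and P_r
minus_Dr = [-x for x in series(D, C_DH, Ah, B)]
Pr = [-x for x in series(-1, H, Ah, B)]   # 1 - H z (I - Ah z)^{-1} B

bezout = poly_add(poly_mul(Pc, M), poly_mul(Dc, Nn))        # (41)
cross = poly_add(poly_mul(Pc, minus_Dr), poly_mul(Dc, Pr))  # second entry of (40)
print("P_c M + Delta_c N     =", bezout)
print("-P_c Delta_r + Delta_c P_r =", cross)
```

For these data the first combination reduces to the constant polynomial 1 and the second to zero, which is what (40) and (41) assert. The check says nothing about minimality of $H$ or $G$; it only confirms the algebra of the realizations.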

3. Polynomial interpolation theory for matrices: An approach

Interpolation theory in the matrix context necessitates the notion of a ‘valuation’, introduced in [1] and further worked out in [2, 11, 7]. I quickly summarize the concepts in the present notation. Let $T = T_0 + ZT_1 + Z^2T_2 + \cdots$ be a causal (lower) and bounded operator with the given diagonal expansion, and let $W$ be a (compatible) block diagonal operator such that $\sigma(WZ') < 1$. We define the value of $T$ at $W$ to be a diagonal operator, denoted $T_W$ (in the notation of the original paper it was denoted in a somewhat cumbersome way by $T^{\wedge}(W)$), which is such that $T = T_W + (Z - W)T_r$ for some bounded, causal (lower) $T_r$. This is the so-called W-transform of $T$, so called because of the resulting reproducing kernel, see [2], where it is also shown that $T_W$ is defined by the strongly convergent series
\[ T_W = T_0 + WT_1 + WW^{(-1)}T_2 + WW^{(-1)}W^{(-2)}T_3 + \cdots. \tag{46} \]
The notion clearly generalizes the valuation of a complex-valued matrix function $T(z)$ at a point $a \in \mathbb{C}$ as $T(a)$. Because of the non-commutativity of the shift operator $Z$, it does not have all the properties of the valuation in the complex plane. We do have the following properties.

1. $T_W$ is the first anti-causal term in the expansion of $(Z - W)^{-1}T$:
\[ (Z - W)^{-1}T = Z'^2(\cdots) + Z'T_W + T_r \tag{47} \]
in which the $\cdots$ is purely anticausal.



Proof. Expand $Z'(I - WZ')^{-1}T$!

2. (Chain rule) For $P$ and $Q$ anticausal we have $(PQ)_W = [P_W Q]_W$. If $P_W$ is invertible, we have in addition $(PQ)_W = P_W Q_{W_1}$ where $W_1 = P_W^{-1} W P_W^{(-1)}$.

Proof. We have
\[ (Z - W)^{-1}P_W = (P_W^{-1}Z - P_W^{-1}W)^{-1} = (ZP_W^{-(-1)} - P_W^{-1}W)^{-1} = P_W^{(-1)}(Z - W_1)^{-1}, \]
and hence
\[ (Z - W)^{-1}PQ = (Z - W)^{-1}P_W Q + P_r Q = P_W^{(-1)}(Z - W_1)^{-1}Q_{W_1} + P_W^{(-1)}Q_r + P_r Q, \]
the last being equal again to $(Z - W)^{-1}P_W Q_{W_1}$ + causal.

3. (Constants) Let $D$ be a compatible diagonal operator, then $(TD)_W = T_W D$. If $D$ is invertible and compatible, then $(DT)_W = D^{(-1)}T_{W_1}$, in which $W_1 = D^{-1}WD^{(-1)}$. For addition we simply have $(T + D)_W = T_W + D$.

4. (State space formulas) Let $T = D + CZ(I - AZ)^{-1}B$ be a realization for $T$, assumed to be causal and such that $\sigma(AZ) < 1$. Then $T_W = D + WMB$ where $M$ solves the Lyapunov-Stein equation
\[ M^{(1)} = C + WMA. \tag{48} \]
In fact,
\[ M = [C(I - ZA)^{-1}]_W^{(-1)} = [C + WC^{(-1)}A + WW^{(-1)}C^{(-2)}A^{(-1)}A + \cdots]^{(-1)} \tag{49} \]
and hence also
\[ T_W = D + [CZ(I - AZ)^{-1}]_W B \tag{50} \]
in accordance with the previous rules.

In the sequel we shall need still another property, given by the next Lemma, which follows by direct evaluation:

Lemma 1. Suppose that for $i = 1, 2$, $T_i = D_i + C_i Z(I - AZ)^{-1}B$, and that $T_2$ is causally invertible, i.e., $T_2^{-1} = D_2^{-1} - D_2^{-1}C_2 Z(I - \alpha_2 Z)^{-1}BD_2^{-1}$ with $\alpha_2 := A - BD_2^{-1}C_2$, then
\[ T_1 T_2^{-1} = D_1 D_2^{-1} + (C_1 - D_1 D_2^{-1}C_2)\,Z(I - \alpha_2 Z)^{-1}BD_2^{-1}. \tag{51} \]
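The defining relation $T = T_W + (Z - W)T_r$ and the series (46) become very concrete in the scalar time-invariant case, where $W$ is a number $w$, all diagonal shifts are trivial and everything commutes: the value is then ordinary evaluation of a polynomial at $w$, and $T_r$ is the quotient of the division by $(z - w)$. The sketch below illustrates only this special case, under the assumption that $T$ is a polynomial; the function name `w_value` is hypothetical and does not come from the paper.

```python
def w_value(coeffs, w):
    """Split T(z) = sum_k coeffs[k] z^k as T(w) + (z - w) T_r(z).

    Returns (T(w), coefficients of T_r), both in ascending powers of z.
    """
    acc, partial = 0, []
    for c in reversed(coeffs):   # Horner recursion, starting at the top coefficient
        acc = c + w * acc
        partial.append(acc)
    value = partial.pop()        # the last accumulator is T(w)
    return value, partial[::-1]  # the earlier accumulators are the coefficients of T_r


T = [1, 2, 3]                    # T(z) = 1 + 2 z + 3 z^2
w = 2
value, T_r = w_value(T, w)

# Recompose T_W + (z - w) T_r and compare with T.
shifted = [0] + T_r                              # z * T_r
scaled = [w * c for c in T_r] + [0]              # w * T_r
recomposed = [s - t for s, t in zip(shifted, scaled)]
recomposed[0] += value
print(value, T_r, recomposed)
```

In the genuinely time-varying setting the products $WW^{(-1)}\cdots$ in (46) involve shifted diagonals and do not commute with $Z$, so this sketch should be read as an orientation aid rather than as an implementation of the W-transform.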

The straight ‘Löwner type’ directional interpolation problem for matrices can now be defined as follows: given a block diagonal matrix $W$ and directional data $\xi$ and $\eta$, find a causal operator $S$ such that $(\xi S)_W = \eta$, or, to put it differently, such that $\xi S$ interpolates $\eta$ at the (block diagonal) value $W$. Note that $\xi$ cannot be taken out of the bracket! To somehow restrict the discussion to a “well-posed” case, we assume that the interpolation data satisfies the property that the reachability pair $[\,W \ \ \xi \ \ -\eta\,]$


is uniformly reachable in finite time. As a consequence we have that the interpolation data can be assumed to be input normalized, $WW' + \xi\xi' + \eta\eta' = I$, as well. Due to our previous theory, solutions can then be generated on the basis of polynomial representations of the operator
\[ T' = (Z - W)^{-1}\begin{bmatrix} \xi & -\eta \end{bmatrix} = \Delta Q^{-1} \tag{52} \]
in which $\Delta$ and $Q$ are polynomial in $Z$. From the previous theory of polynomial representations applied to $T$, we have in sequence
\[
\begin{aligned}
T &\approx \begin{bmatrix} W' & I \\ \xi' & 0 \\ -\eta' & 0 \end{bmatrix}, &
V &:\approx \begin{bmatrix} W' & s & t \\ \xi' & d_{11} & d_{12} \\ -\eta' & 0 & d_{22} \end{bmatrix}, \\
P^{-1} &:\approx \begin{bmatrix} W' & s & t \\ F_1 & I & 0 \\ F_2 & 0 & I \end{bmatrix}, &
Q &:\approx \begin{bmatrix} W' - (sF_1 + tF_2) & s & t \\ \xi' - (d_{11}F_1 + d_{12}F_2) & d_{11} & d_{12} \\ -\eta' - d_{22}F_2 & 0 & d_{22} \end{bmatrix},
\end{aligned}
\tag{53}
\]
in which the unitary realization for $V$ is completed by $B_V :\approx [\,s \ \ t\,]$ and $D_V :\approx \begin{bmatrix} d_{11} & d_{12} \\ 0 & d_{22} \end{bmatrix}$, and $F$ is the feedback matrix belonging to the input reachability pair $[\,W' \ \ s \ \ t\,]$ (which by the way is in input normal form). Notice that $Q$ is polynomial in $Z$ with anti-causal inverse as before. Interpolations are now obtained by pulling $Q$ to the left-hand side:
\[ (Z - W)^{-1}\begin{bmatrix} \xi & -\eta \end{bmatrix}\begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix} = \Delta. \tag{54} \]
Let now $a$ and $b$ be any causal, compatible operators (in particular diagonal constants), and $Q^{(1)} := Q_{11}a + Q_{12}b$ as well as $Q^{(2)} := Q_{21}a + Q_{22}b$, then we find
\[ (Z - W)^{-1}\left(\xi Q^{(1)} - \eta Q^{(2)}\right) \in \text{causal} \tag{55} \]
and hence
\[ (\xi Q^{(1)})_W = (\eta Q^{(2)})_W. \tag{56} \]
If, in addition, $Q^{(2)}$ is causally invertible, we shall have, with $S = Q^{(1)}(Q^{(2)})^{-1}$,
\[ (\xi S)_W = \eta, \tag{57} \]
a solution of the stated Löwner interpolation problem. This will be the case when finite matrices are concerned, because in that case, the invertibility of $d_{22}$ is necessary and sufficient for the causal invertibility of $Q_{22}$. However, the general case is much more involved and beyond the scope of the present paper. Actually, we



can prove the converse (we call $S$ regular when it has a polynomial representation $S = qp^{-1}$ with $p$ causally invertible):

Theorem 5. Under the regularity conditions stated and $Q$ as defined in (53) we have that any causal and regular $S$ for which $(\xi S)_W = \eta$ can be written as $S = Q^{(1)}(Q^{(2)})^{-1}$ whereby $Q^{(1)} := Q_{11}a + Q_{12}b$ and $Q^{(2)} := Q_{21}a + Q_{22}b$ for causal operators $a$ and $b$, and $Q^{(2)}$ is causally invertible.

Proof. When $(Z - W)^{-1}(\xi S - \eta) \in$ causal, then a fortiori $V'\begin{bmatrix} S \\ I \end{bmatrix} \in$ causal, and since $V' = PQ^{-1}$ we must have
\[ \begin{bmatrix} S \\ I \end{bmatrix} = Q\begin{bmatrix} a_1 \\ b_1 \end{bmatrix} \tag{58} \]
for some causal operators $a_1$ and $b_1$. Let now $S = qp^{-1}$ be a polynomial representation for $S$ with $p$ causally invertible (as defined in the previous section), then, by multiplying right with $p$, we conclude that there exist $a$ and $b$ such that
\[ \begin{bmatrix} q \\ p \end{bmatrix} = Q\begin{bmatrix} a \\ b \end{bmatrix} := \begin{bmatrix} Q^{(1)} \\ Q^{(2)} \end{bmatrix} \tag{59} \]
which makes $Q^{(2)}$ causally invertible. □

The problem is hence “reduced” to finding adequate $a$ and $b$, at least when one wants the interpolating function $S$ to be causally bounded. This is a different problem from the one considered in the classical Löwner theory, where boundedness (or stability) does not play a role. Although I do not claim to have solved this part of the problem (at least not algorithmically), it is possible to test whether a given causal $T$ has a causal inverse, by computing an inner-outer decomposition, as explained in the first section of this paper. If the inner factor turns out to be trivially constant (i.e., all $Y_k$ are empty), then $T$ will have a causal inverse. Be that as it may, if one wants to proceed as in the LTI theory, then one can either work with an unstable (or formally causal) inverse, or assume that the factor to be inverted is indeed causally invertible. Lemma 1 then shows that the resulting interpolating operator is indeed of at most the same degree as $Q$, given that the chosen $a$ and $b$ are mere diagonal operators.

4. Some further remarks

Finding complete representations for semi-separable matrices as ratios of (minimal) banded matrices is new, to the best of my knowledge. A partial solution to the problem for the case of unitary matrices was given in [2] and involved quite a complex argument. I hope that the method presented in this paper greatly simplifies the issue and provides for a complete set of representations. The classical, rational approach to Löwner interpolation as initiated in [3] and very extensively treated in [4, 19] follows different approaches that do not seem to


generalize to the semi-separable case. However, the paper of Antoulas, Ball, Kang and Willems does clarify the role played by co-prime polynomial factorizations, which is also used in the theory presented here, although the factorizations are diﬀerent. The role played by controllability indices in the classical theory, is here taken over by the dead beat indices, which are closely related to them.

5. Appendix: Methods to compute pre-images and the numerical calculation of the $F$ matrix

The most elementary operation needed to compute pre-images is the so-called QR factorization (and its duals) on a general matrix. Let $A$ be an $m \times n$ matrix of rank $\delta$, then a QR factorization compresses the rows of $A$ into a new matrix $R$ whose first $\delta$ rows form a basis for the rows of $A$, and whose further rows are zero. The first rows even have a special form (which often is immaterial but numerically practical), namely $[\,0 \ \cdots \ 0 \ \ r \ \cdots\,]$, where $r > 0$ and all data is crowded to the North-East corner of the matrix. This is called an echelon form. It is achieved, e.g., through a sequence of elementary rotations acting on the rows of the matrix, compressing first the data on the first column to the top, as shown in the following schema:
\[
\begin{bmatrix} \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot \end{bmatrix}
\xrightarrow{Q'_{1,2}}
\begin{bmatrix} \star & \star & \star \\ 0 & \star & \star \\ \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot \end{bmatrix}
\xrightarrow{Q'_{1,3}}
\begin{bmatrix} \star & \star & \star \\ 0 & \cdot & \cdot \\ 0 & \star & \star \\ \cdot & \cdot & \cdot \end{bmatrix}
\]
\[
\xrightarrow{Q'_{1,4}}
\begin{bmatrix} \star & \star & \star \\ 0 & \cdot & \cdot \\ 0 & \cdot & \cdot \\ 0 & \star & \star \end{bmatrix}
\xrightarrow{Q'_{2,3}}
\begin{bmatrix} \cdot & \cdot & \cdot \\ 0 & \star & \star \\ 0 & 0 & \star \\ 0 & \cdot & \cdot \end{bmatrix}
\xrightarrow{Q'_{2,4}}
\begin{bmatrix} \cdot & \cdot & \cdot \\ 0 & \star & \star \\ 0 & 0 & \cdot \\ 0 & 0 & \star \end{bmatrix}
\tag{60}
\]
to be followed by a final rotation $Q'_{3,4}$ on the third and fourth row. Here each $Q'_{i,j}$ is the (transpose of) a Jacobi rotation matrix acting on elements of the $i$th and $j$th rows. Putting all these rotations together in a single matrix $Q$, we obtain $Q'A = R$ or $A = QR$. When a zero column is encountered, it is skipped to the next, yielding not an upper triangular form with positive elements on the main diagonal, but a staircase form. The important issue here is that all the data in $Q$ and $R$ are completely generated from the data in $A$, although there is no general formula known that expresses these elements in closed form. In numerical engineering this is known as ‘array processing’, converting one array into others, and is maybe the most powerful numerical technique available in matrix calculus.
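The row compression by elementary rotations described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the numerically hardened procedure the appendix has in mind: the function name `givens_qr` and the tolerance `tol` are hypothetical, the matrix is stored as a list of rows, and the rank decision is a plain threshold test. Zero columns are skipped, so the result is a staircase rather than a triangle.

```python
import math


def givens_qr(a, tol=1e-12):
    """Row-compress `a` with plane rotations; returns (q, r) with a = q r."""
    m, n = len(a), len(a[0])
    r = [[float(x) for x in row] for row in a]
    qt = [[float(i == j) for j in range(m)] for i in range(m)]  # accumulates Q'
    pivot = 0
    for col in range(n):
        if pivot >= m:
            break
        for i in range(pivot + 1, m):
            x, y = r[pivot][col], r[i][col]
            h = math.hypot(x, y)
            if h <= tol:
                continue                      # nothing to rotate in this pair
            c, s = x / h, y / h
            for mat in (r, qt):               # apply the same rotation to R and Q'
                for j in range(len(mat[0])):
                    u, v = mat[pivot][j], mat[i][j]
                    mat[pivot][j] = c * u + s * v
                    mat[i][j] = -s * u + c * v
        if abs(r[pivot][col]) > tol:          # a zero column leaves the pivot row in place
            pivot += 1
    q = [[qt[j][i] for j in range(m)] for i in range(m)]
    return q, r


a = [[0.0, 2.0, 1.0],
     [0.0, 2.0, 1.0],
     [0.0, 0.0, 3.0],
     [0.0, 0.0, 4.0]]
q, r = givens_qr(a)
for row in r:
    print([round(x, 6) for x in row])
```

In this example the first column is zero and is skipped, the matrix has rank two, and the last two rows of `r` are compressed to zero. A production code would rather rely on a rank-revealing factorization or on SVDs, as the appendix itself suggests further on.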
A similar operation on the columns, often accomplished by compressing the columns of $A$ in the South-East corner, and starting on the bottom row, produces a stack of basis vectors in echelon form, crowded in the right-hand side of the matrix.

Let us now move to the situation in the paper. Suppose bases $\xi$ for $\mathcal{B}$, $\eta$ for $\mathcal{Y}$ and $\mathbf{u}$ for $\mathcal{U}$ have been chosen, and assume the realizations of the



operators $a : \mathcal{B} \to \mathcal{Y}$ and $b : \mathcal{U} \to \mathcal{Y}$ in these bases to be the matrices $A$ and $B$ of dimensions $\gamma \times \delta$ and $\gamma \times m$ respectively. We perform a QR factorization on $B = UR$ that determines $U$ and $R$, and then an LQ factorization on $U'A = SQ$. $U$ and $Q$ are orthogonal (unitary) matrices, $R$ is in top row-echelon form and $S$ in right column echelon form. [Note: the first QR factorization compresses the rows of $B$ to the top, while the next LQ factorization compresses columns to the right starting operations on the last row.] In block notation this produces
\[ R = \begin{bmatrix} R_u \\ 0 \end{bmatrix} \tag{61} \]
where the $m$ rows of $R_u$ are linearly independent, and
\[ S = \begin{bmatrix} 0 & S_{11} & S_{12} \\ 0 & 0 & S_{22} \end{bmatrix} \tag{62} \]
where the first set of rows in $S$ is taken to have $m$ rows, equal to the number of rows in $R_u$, and the columns of $S$ (and in particular of $S_{11}$ and $S_{22}$) are linearly independent (defining the dimensions of these matrices). Entries may disappear, depending on the rank of the matrices involved (actually any entry may disappear). It follows from the respective staircase structures that the columns of $S_{11}$ lie in the range of the columns of $R_u$ and also define the maximal (column) subspace with that property, for which they provide a basis thanks to the echelon form. Hence, there exists a matrix $\hat{F}$ such that $S_{11} = R_u\hat{F}$. Let $x' = [\,0 \ \ x_2' \ \ 0\,]$ define a vector $x$ conformal to $S$, then we have
\[ U'AQ'\begin{bmatrix} 0 \\ x_2 \\ 0 \end{bmatrix} = R\begin{bmatrix} 0 & \hat{F} & 0 \end{bmatrix}\begin{bmatrix} 0 \\ x_2 \\ 0 \end{bmatrix} \tag{63} \]
for any vector $x_2$ of appropriate dimensions (and, again, some entries may not be present), from which it follows that
\[ AQ'\begin{bmatrix} 0 \\ x_2 \\ 0 \end{bmatrix} = B\begin{bmatrix} 0 & \hat{F} & 0 \end{bmatrix}QQ'\begin{bmatrix} 0 \\ x_2 \\ 0 \end{bmatrix}. \tag{64} \]
Suppose the dimensions of $x$ are $\delta_1 + \delta_2 + \delta_3$, then we can conclude that the columns of $\xi Q'$ from $\delta_1 + 1$ to $\delta_1 + \delta_2$ span the pre-image $\bigvee \xi_0$ of $B$ under $A$ in the basis $\xi$. Let $q_2'$ be that collection of columns (in MATLAB notation $q_2' = Q'_{\delta_1+1:\delta_1+\delta_2,:}$), then $\xi_0 = \xi q_2'$ is a choice of basis for $\mathcal{S}_0$ and we have, for any $x = q_2'x_2$ and $F = [\,0 \ \ \hat{F} \ \ 0\,]\,Q$,
\[ a\xi x = b\mathbf{u}Fx. \tag{65} \]
(Proof. As $Aq_2' = BFq_2'$, $a\xi = \eta A$ and $b\mathbf{u} = \eta B$, by pre-multiplication with $\eta$ and post-multiplication with $x$. Note that this does not assume orthogonality of the bases.) The algorithm can be enhanced numerically by using SVD's, see the discussion further.


Application to the computation of the feedback operator

To compute the crucial feedback operator $F$ following the principle set out in the previous paragraph, we stack the vectors and perform a QR factorization on them, with $E_i$ a stack of unit vectors corresponding to the $\{\eta_i\}$:
\[ \begin{bmatrix} B_i & E_0 & E_1 & \cdots & E_{k_{i+1}} \end{bmatrix} = UR. \tag{66} \]
$R$ is in row echelon form:
\[
R = \begin{bmatrix}
R_B & R_{B,0} & \cdots & R_{B,k_{i+1}} \\
0 & R_{00} & \cdots & R_{1,k_{i+1}} \\
\vdots & \vdots & \cdots & \vdots \\
0 & 0 & \cdots & R_{k_i,k_{i+1}} \\
0 & 0 & \cdots & 0
\end{bmatrix}
\tag{67}
\]
in which the rows of $R_B$ are in column echelon form themselves (hence also linearly independent), $R_{00}$ is possibly empty (when $\eta B$ accidentally spans the space $\bigvee \eta_0$), and then a further staircase in row echelon form arises, ending in the block $R_{k_i,k_{i+1}}$, which will only be on the main block diagonal when $k_i = k_{i+1} + 1$. Once $U$ is obtained, we perform an LQ factorization $U'A = SQ$, producing
\[
S = \begin{bmatrix}
0 & S_{00} & \cdots & S_{0,k_i} \\
0 & 0 & \cdots & S_{1,k_i} \\
\vdots & \vdots & \cdots & \vdots \\
0 & 0 & \cdots & S_{k_{i+1},k_i}
\end{bmatrix}
\tag{68}
\]
in which $S_{k_{i+1},k_i}$ is either empty or in column echelon form, with the staircase mounting up leftwards till $S_{00}$, which actually may also be accidentally empty. It does not really matter that some or many of these main entries are empty; if so, they are ignored. The dimensions of the rows are chosen conformal to $\{\eta_0, \eta_1, \ldots, \eta_{k_{i+1}}\}$, while the columns now follow a new indexing schema as a result from the factorization, and shall correspond to the requested basis (in an ultimate case it may be that $A = 0$, then $S = 0$ and the whole matrix echelon structure disappears). Because the rows of $R$ are linearly independent, the first block row of $S$ can be ‘killed’ by $R_B$ (they are conformal), yielding the existence of a matrix $\hat{F}$ such that
\[
\begin{bmatrix}
0 & S_{00} & \cdots & S_{0,k_i} \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & \cdots & \vdots \\
0 & 0 & \cdots & 0
\end{bmatrix}
= \begin{bmatrix} R_B \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\begin{bmatrix} 0 & \hat{F}_0 & \cdots & \hat{F}_{k_i} \end{bmatrix}.
\tag{69}
\]
Also, the basis for $\mathcal{S}_k$ follows from $S$. If $S_{00}$ is not empty, then the number of its columns plus the number of columns preceding it determine the dimension of the space $\mathcal{S}_0$ (we made the kernel of $a$ explicit here). If $S_{11}$ is not empty, then its number of columns determines the dimension of $\mathcal{S}_1$, which again may be empty if the staircase does not make a jump in the rows corresponding to $\eta_1$, etc. The



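remainder of the matrix $S$, in column echelon form, is the feedback matrix $\hat{A}_f$ in the current bases $\xi Q$ and $\eta U'$. Choosing the ‘dead beat basis’ for $\mathcal{B}_i$ and keeping the basis $\eta$ for $\mathcal{B}_{i+1}$ changes the matrix $A$ to $AQ'$, while $F = \hat{F}Q'$, and $B$ just remains what it is.

Example 1. We take $B_i = [\,1 \ 0 \ 0 \ 1\,]'$, and
\[
A_i = \begin{bmatrix}
1 & \frac{1}{\sqrt{2}} & 0 \\
0 & 1 & 0 \\
-\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\
0 & \frac{1}{\sqrt{2}} & 1
\end{bmatrix}
\tag{70}
\]
and let us assume that $\eta_0$ and $\eta_1$ both have dimension 2 and have been used as bases for $\mathcal{B}_{i+1}$. We then have
\[
\begin{bmatrix} B_i & E_0 & E_1 \end{bmatrix} =
\begin{bmatrix}
1 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 1
\end{bmatrix}.
\tag{71}
\]
The first QR factorization produces
\[
\begin{bmatrix}
1 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 1
\end{bmatrix}
=
\begin{bmatrix}
\frac{1}{\sqrt{2}} & 0 & 0 & -\frac{1}{\sqrt{2}} \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
\frac{1}{\sqrt{2}} & 0 & 0 & \frac{1}{\sqrt{2}}
\end{bmatrix}
\begin{bmatrix}
\sqrt{2} & \frac{1}{\sqrt{2}} & 0 & 0 & \frac{1}{\sqrt{2}} \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & -\frac{1}{\sqrt{2}} & 0 & 0 & \frac{1}{\sqrt{2}}
\end{bmatrix}
\tag{72}
\]
where we see the row echelon form appearing. Next,
\[
U'A_i = \begin{bmatrix}
\frac{1}{\sqrt{2}} & 1 & \frac{1}{\sqrt{2}} \\
0 & 1 & 0 \\
-\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\
-\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}}
\end{bmatrix}
\tag{73}
\]
and a LQ factorization produces the right column echelon form (starting operations from the last row and compressing the columns to the right):
\[
\begin{bmatrix}
\frac{1}{\sqrt{2}} & 1 & \frac{1}{\sqrt{2}} \\
0 & 1 & 0 \\
-\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\
-\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}}
\end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\
0 & 1 & 0 \\
-\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}}
\end{bmatrix}
\tag{74}
\]
the first factor on the right being the transition matrix in the bases $\hat{\xi} = \xi Q'$ and $\eta U$, and it is in column echelon form. Comparing the row echelon form for the bases and $\hat{A}_i$ we see that $\hat{\xi}_1$ generates $\mathcal{S}_0$, $\mathcal{S}_1 = \bigvee(\hat{\xi}_1, \hat{\xi}_2)$ and everything is $\mathcal{S}_2$ in $\mathcal{B}_i$. The first column of $\hat{A}_i$ can be annihilated by $U'B_i$, hence
\[ \hat{F} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \end{bmatrix}. \tag{75} \]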
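The factorizations of Example 1 can be checked in floating point. The sketch below is a plausibility check only: the matrices are entered as they are read in this text (an assumption wherever the printed layout is ambiguous), the helper names `mat_mul` and `transpose` are illustrative, and the code merely confirms that $U'A_i = SQ$ and that the first row of $S$ is reproduced by $U'B_i\hat{F}$.

```python
import math


def mat_mul(x, y):
    """Plain matrix product for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


h = 1 / math.sqrt(2)
B = [[1], [0], [0], [1]]
A = [[1, h, 0], [0, 1, 0], [-h, 0, h], [0, h, 1]]                    # (70), as read here
U = [[h, 0, 0, -h], [0, 1, 0, 0], [0, 0, 1, 0], [h, 0, 0, h]]        # from (72)
S = [[1, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]                     # from (74)
Q = [[h, 0, h], [0, 1, 0], [-h, 0, h]]                               # from (74)
F_hat = [[h, h, 0]]                                                  # (75)

UtA = mat_mul(transpose(U), A)      # left-hand side of (74)
SQ = mat_mul(S, Q)                  # right-hand side of (74)
UtB = mat_mul(transpose(U), B)      # B_i in the basis eta U
# What is left of S once the feedback term U'B_i F_hat has been removed.
residual = [[s - f for s, f in zip(srow, frow)]
            for srow, frow in zip(S, mat_mul(UtB, F_hat))]

print(max(abs(p - q) for rp, rq in zip(UtA, SQ) for p, q in zip(rp, rq)))
print([[round(x, 12) for x in row] for row in residual])
```

The first printed number measures the mismatch between the two sides of (74); the second output shows that the first block row of $S$ is annihilated by the feedback while the remaining rows are untouched.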


If we change bases in $\mathcal{B}_i$ to $\hat{\xi}$, while keeping the basis $\eta$ in $\mathcal{B}_{i+1}$, we can denote $\hat{A}_i = A_iQ'$ as the current transition matrix, $B_i$ stays what it is, $\hat{F}$ is the correct feedback matrix, and
\[
\hat{A}_f = AQ' - B_i\hat{F} = \begin{bmatrix}
0 & 0 & -\frac{1}{\sqrt{2}} \\
0 & 1 & 0 \\
0 & 0 & 1 \\
0 & 0 & \frac{1}{\sqrt{2}}
\end{bmatrix}.
\tag{76}
\]

More advanced methods to determine pre-images. Consider again the situation with operators $a : \mathcal{B} \to \mathcal{Y}$ and $b : \mathcal{U} \to \mathcal{Y}$ and realization matrices $A$ and $B$ respectively. Let $A = U\Sigma V'$ be the SVD of the matrix $A$, with
\[ \Sigma = \begin{bmatrix} \Sigma_{11} & 0 \\ 0 & 0 \end{bmatrix} \tag{77} \]
in which $U$ and $V$ are orthogonal (or unitary in the complex case), $\Sigma$ contains the singular values in the classical canonical sense ($\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_k$) with $\sigma_k > 0$ as the last significant singular value, $\Sigma_{11} = \mathrm{diag}(\sigma_1, \ldots, \sigma_k)$. Partitioning $U = [\,U_1 \ \ U_2\,]$ and $V = [\,V_1 \ \ V_2\,]$ conformally to $\Sigma$, we have that $\bigvee V_2$ (stack of columns) is the kernel of $A$, which shall always belong trivially to any pre-image. We also see that any image shall always belong to the range of $A$, namely $\bigvee U_1$ (in case one works directly on matrices, one assumes that the bases are just the natural ones, otherwise one just post multiplies with the actual bases, as done before). Let be given a (row) stack of (column) vectors $\mathbf{y}$, each of dimension conformal to $A$. Then only $\bigvee U_1 \cap \bigvee \mathbf{y}$ can contain an image with pre-image (if this space is zero, then there is no pre-image except for the trivial $\bigvee V_2$). The problem hence reduces to finding this intersection in a stable numerical way (the problem is that $\bigvee \mathbf{y}$ may not be numerically well defined, and there is also a problem with the intersection, which may only be approximate).

One way to proceed is to remark that the intersection must be orthogonal to $\bigvee U_2$. It is characterized by the kernel of $U_2'\mathbf{y}$; we have more precisely, $\mathbf{y}u \in \bigvee U_1 \cap \bigvee \mathbf{y}$ iff $U_2'\mathbf{y}u = 0$. The image can then be described as $\mathbf{y}u$ with $u$ in the kernel of $U_2'\mathbf{y}$, and the pre-image is then
\[ A^{-1}\Big(\bigvee \mathbf{y}\Big) = \bigvee V_1\Sigma_{11}^{-1}U_1'\,\mathbf{y}\ker(U_2'\mathbf{y}) + \bigvee V_2. \tag{78} \]
This expression shows the potential indeterminacy in a nutshell (one recognizes the Moore-Penrose inverse): there is the blow up of small singular values by $\Sigma_{11}^{-1}$, and also the lack of precision in the dimension of the kernel of $U_2'\mathbf{y}$, which can be taken minimal (strictly zero) or maximal (within some $\epsilon$). This can be done through another SVD of $U_2'\mathbf{y}$. Alternatively, one can look for algorithms to determine the angle between $\bigvee \mathbf{y}$ and $\bigvee U_1$ and take that part with angle zero. This amounts more or less to the same as before, see the literature on computing angles between subspaces!

Example 2. This example is only intended to make a quick connection with the LTI case. Let us take
\[
A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
B = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.
\tag{79}
\]
It is easily verified that the pair $\{A, B\}$ is reachable. We make use of the fact that $A$ is invertible to determine the pre-images directly. In the current natural basis, a basis of $\mathcal{S}_0$ is the pre-image of $B$, namely $A^{-1}B$; for $\mathcal{S}_1$ one has to add the pre-image of $A^{-1}B$, namely $A^{-2}B$, and for $\mathcal{S}_2$ one adds its pre-image, namely $A^{-3}B$. Hence the sought after dead beat basis is given by the columns of
\[
T = \begin{bmatrix} A^{-1}B & A^{-2}B & A^{-3}B \end{bmatrix}
= \begin{bmatrix} 1 & 3 & 6 \\ -1 & -2 & -3 \\ 1 & 1 & 1 \end{bmatrix}.
\tag{80}
\]
Transforming to the new basis we get (as $\eta x = \hat{\eta}\hat{x}$ with $\hat{\eta} = \eta T$ the new basis)
\[
\hat{A} = T^{-1}AT = \begin{bmatrix} 3 & 1 & 0 \\ -3 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}, \qquad
\hat{B} = T^{-1}B = \begin{bmatrix} 3 \\ -3 \\ 1 \end{bmatrix}.
\tag{81}
\]
We see immediately that $\hat{F} = [\,1 \ 0 \ 0\,]$. Transforming back we find
\[
A_f = A - BF = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ -1 & -3 & -2 \end{bmatrix}, \qquad
F = \begin{bmatrix} 1 & 3 & 3 \end{bmatrix},
\tag{82}
\]
and $A_f$ is indeed nilpotent as one should expect (to check, just calculate $\det(zI - A_f)$!). The more general LTI algorithms are extensions of this mechanism to the case where $A$ is not invertible and the reachability base more complicated (Kronecker indices). In particular, in the MIMO case, one can determine stacks of pre-images based on the columns of $B$, so as to realize the polynomial inverse in a column-degree canonical form.
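The numbers of Example 2 are easy to reproduce with exact integer arithmetic. The following is a small sketch, assuming only the data $A$, $B$ and $F$ stated in the example; the helper names (`solve_upper_unit`, `mat_mul`) are illustrative. It builds the dead beat basis $T$ by three successive pre-images, confirms that $FT$ is the first unit row vector (so that $F = \hat{F}T^{-1}$), and checks that $A_f^3 = 0$.

```python
def solve_upper_unit(a, b):
    """Solve a x = b for an upper triangular matrix a with unit diagonal."""
    n = len(b)
    x = [0] * n
    for i in reversed(range(n)):
        x[i] = b[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))
    return x


def mat_mul(x, y):
    """Plain matrix product for matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


A = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
B = [0, 0, 1]

# Successive pre-images A^{-1}B, A^{-2}B, A^{-3}B form the columns of T.
cols, v = [], B
for _ in range(3):
    v = solve_upper_unit(A, v)
    cols.append(v)
T = [list(row) for row in zip(*cols)]

F = [1, 3, 3]
F_hat = [sum(F[i] * T[i][j] for i in range(3)) for j in range(3)]   # F T
Af = [[A[i][j] - B[i] * F[j] for j in range(3)] for i in range(3)]  # A - B F
Af3 = mat_mul(mat_mul(Af, Af), Af)

print("T     =", T)
print("F T   =", F_hat)
print("A_f   =", Af)
print("A_f^3 =", Af3)
```

Because $A$ is unit upper triangular here, back substitution suffices to form the pre-images; for a general (possibly singular) $A$ one would fall back on the QR or SVD based procedures described earlier in this appendix.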


References

[1] D. Alpay and P. Dewilde. Time-varying signal approximation and estimation. In M.A. Kaashoek, J.H. van Schuppen, and A.C.M. Ran, editors, Signal Processing, Scattering and Operator Theory, and Numerical Methods, volume III of Proc. Int. Symp. MTNS-89, pages 1–22. Birkhäuser Verlag, 1990.
[2] D. Alpay, P. Dewilde, and H. Dym. Lossless Inverse Scattering and reproducing kernels for upper triangular operators. In I. Gohberg, editor, Extension and Interpolation of Linear Operators and Matrix Functions, volume 47 of Operator Theory, Advances and Applications, pages 61–135. Birkhäuser Verlag, 1990.
[3] A.C. Antoulas and B.D.O. Anderson. On the scalar rational interpolation problem. IMA J. Math. Control Inform., 3:61–88, 1986.
[4] A.C. Antoulas, J.A. Ball, J. Kang, and J.C. Willems. On the solution of the minimal rational interpolation problem. Linear Algebra and its Applications, 137:511–573, 1990.
[5] D. Alpay, P. Dewilde, and D. Volok. Interpolation and approximation of quasiseparable systems: the Schur-Takagi case. Calcolo, 42:139–156, 2005.
[6] W. Arveson. Interpolation problems in nest algebras. J. Functional Anal., 20:208–233, 1975.
[7] J.A. Ball, I. Gohberg, and M.A. Kaashoek. Nevanlinna-Pick interpolation for time-varying input-output maps: the discrete case. In I. Gohberg, editor, Time-Variant Systems and Interpolation, volume 56 of Operator Theory: Advances and Applications, pages 1–51. Birkhäuser Verlag, 1992.
[8] S. Chandrasekaran, M. Gu, and T. Pals. A fast and stable solver for smooth recursively semi-separable systems. In SIAM Annual Conference, San Diego and SIAM Conference of Linear Algebra in Controls, Signals and Systems, Boston, 2001.
[9] P. Dewilde and A.-J. van der Veen. Time-varying Systems and Computations. Kluwer, out of print but freely available at ens.ewi.tudelft.nl, 1998.
[10] P. Dewilde and A.-J. van der Veen. Inner-outer factorization and the inversion of locally finite systems of equations. Linear Algebra and its Applications, 313:53–100, 2000.
[11] P.M. Dewilde. A course on the algebraic Schur and Nevanlinna-Pick interpolation problems. In Ed. F. Deprettere and A.J. van der Veen, editors, Algorithms and Parallel VLSI Architectures. Elsevier, 1991.
[12] P. Van Dooren. A unitary method for deadbeat control. Proceedings MTNS, 1983.
[13] Y. Eidelman and I. Gohberg. On a new class of structured matrices. Notes distributed at the 1999 AMS-IMS-SIAM Summer Research Conference, Structured Matrices in Operator Theory, Numerical Analysis, Control, Signal and Image Processing, 1999.
[14] Y. Eidelman and I. Gohberg. A modification of the Dewilde-van der Veen method for inversion of finite structured matrices. Linear Algebra and its Applications, 343-344, 2002.
[15] I. Gohberg, T. Kailath, and I. Koltracht. Linear complexity algorithms for semiseparable matrices. Integral Equations and Operator Theory, 8:780–804, 1985.
[16] T. Kailath. Fredholm resolvents, Wiener-Hopf equations and Riccati differential equations. IEEE Trans. Information Theory, 15(6), November 1969.
[17] T. Kailath and B.D.O. Anderson. Some integral equations with nonsymmetric separable kernels. SIAM J. of Applied Math., 20(4):659–669, June 1971.
[18] L. Kronecker. Algebraische Reduktion der Scharen bilinearer Formen. S.B. Akad. Berlin, pages 663–776, 1890.
[19] A.J. Mayo and A.C. Antoulas. A framework for the solution of the generalized realization problem. Linear Algebra and its Applications, 425:634–662, 2007.
[20] S. Chandrasekaran, P. Dewilde, M. Gu, T. Pals, A.-J. van der Veen, and J. Xia. A fast backward stable solver for sequentially semi-separable matrices, volume HiPC202 of Lecture Notes in Computer Science, pages 545–554. Springer Verlag, Berlin, 2002.
[21] G. Strang. Banded matrices with banded inverses and $A = LPU$. Linear Algebra and its Applications, to appear, 2011.
[22] A.J. van der Veen. Time-Varying System Theory and Computational Modeling: Realization, Approximation, and Factorization. PhD thesis, Delft University of Technology, Delft, The Netherlands, June 1993.

Patrick Dewilde
Institute for Advanced Study, TU München
and
Faculty of EEMCS, TU Delft

Operator Theory: Advances and Applications, Vol. 218, 269–297
© 2012 Springer Basel AG

Description of Helson-Szegő Measures in Terms of the Schur Parameter Sequences of Associated Schur Functions

Vladimir K. Dubovoy, Bernd Fritzsche and Bernd Kirstein

Dedicated to the memory of Israel Gohberg

Abstract. Let $\mu$ be a probability measure on the Borelian $\sigma$-algebra of the unit circle. Then we associate a Schur function $\theta$ in the unit disk with $\mu$ and give characterizations of the case that $\mu$ is a Helson-Szegő measure in terms of the sequence of Schur parameters of $\theta$. Furthermore, we state some connections of these characterizations with the backward shift.

Mathematics Subject Classification (2000). Primary 30E05, 47A57.

Keywords. Helson-Szegő measures, Riesz projection, Schur functions, Schur parameters, unitary colligations.

1. Interrelated quadruples consisting of a probability measure, a normalized Carathéodory function, a Schur function and a sequence of contractive complex numbers

Let $\mathbb{D} := \{\zeta \in \mathbb{C} : |\zeta| < 1\}$ and $\mathbb{T} := \{t \in \mathbb{C} : |t| = 1\}$ be the unit disk and the unit circle in the complex plane $\mathbb{C}$, respectively. The central object in this paper is the class $\mathcal{M}_+(\mathbb{T})$ of all finite nonnegative measures on the Borelian $\sigma$-algebra $\mathfrak{B}$ on $\mathbb{T}$. A measure $\mu \in \mathcal{M}_+(\mathbb{T})$ is called a probability measure if $\mu(\mathbb{T}) = 1$. We denote by $\mathcal{M}^1_+(\mathbb{T})$ the subset of all probability measures which belong to $\mathcal{M}_+(\mathbb{T})$.

Now we are going to introduce the subset of Helson-Szegő measures on $\mathbb{T}$. For this reason, we denote by $\mathcal{P}ol$ the set of all trigonometric polynomials, i.e., the set of all functions $f : \mathbb{T} \to \mathbb{C}$ for which there exist a finite subset $I$ of the set $\mathbb{Z}$ of all integers and a sequence $(a_k)_{k \in I}$ from $\mathbb{C}$ such that
\[
f(t) = \sum_{k \in I} a_k t^k, \qquad t \in \mathbb{T}. \tag{1.1}
\]


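If $f \in \mathcal{P}ol$ is given via (1.1), then the conjugation $\tilde{f}$ of $f$ is defined via
\[
\tilde{f}(t) := -i \sum_{k \in I} (\operatorname{sgn} k)\, a_k t^k, \qquad t \in \mathbb{T}, \tag{1.2}
\]
where $\operatorname{sgn} 0 := 0$ and where $\operatorname{sgn} k := \frac{k}{|k|}$ for each $k \in \mathbb{Z} \setminus \{0\}$.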
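A minimal sketch of the conjugation (1.2) on coefficient level may be useful. It assumes a trigonometric polynomial is stored as a dictionary mapping the exponent $k$ to the coefficient $a_k$; the helper names `conjugation` and `evaluate` are hypothetical. For $f(t) = (t + t^{-1})/2$, i.e. $\cos\vartheta$ at $t = e^{i\vartheta}$, the conjugation is $\sin\vartheta$.

```python
import cmath
import math

def sgn(k):
    """sgn 0 = 0 and sgn k = k/|k| otherwise."""
    return (k > 0) - (k < 0)

def conjugation(coeffs):
    """Coefficients of f~ from those of f, following (1.2)."""
    return {k: -1j * sgn(k) * a for k, a in coeffs.items()}

def evaluate(coeffs, t):
    """Value of sum_k a_k t^k at a point t on the unit circle."""
    return sum(a * t ** k for k, a in coeffs.items())

f = {1: 0.5, -1: 0.5}            # f(t) = (t + 1/t)/2, i.e. cos(theta)
f_tilde = conjugation(f)
t = cmath.exp(0.7j)
value = evaluate(f_tilde, t)     # agrees with sin(0.7) up to rounding
```

Since each coefficient is multiplied by a number of modulus at most one, the sum of squared moduli of the coefficients does not increase under conjugation.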

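Definition 1.1. A non-zero measure $\mu$ which belongs to $\mathcal{M}_+(\mathbb{T})$ is called a Helson-Szegő measure if there exists a positive real constant $C$ such that for all $f \in \mathcal{P}ol$ the inequality
\[
\int_{\mathbb{T}} |\tilde{f}(t)|^2 \, \mu(dt) \le C \int_{\mathbb{T}} |f(t)|^2 \, \mu(dt) \tag{1.3}
\]
is satisfied.

If $\mu \in \mathcal{M}_+(\mathbb{T})$, then $\mu$ is a Helson-Szegő measure if and only if $\alpha\mu$ is a Helson-Szegő measure for each $\alpha \in (0, +\infty)$. Thus, the investigation of Helson-Szegő measures can be restricted to the class $\mathcal{M}^1_+(\mathbb{T})$. The main goal of this paper is to describe all Helson-Szegő measures $\mu$ belonging to $\mathcal{M}^1_+(\mathbb{T})$ in terms of the Schur parameter sequence of some Schur function $\theta$ which will be associated with $\mu$.

Let $\mathcal{C}(\mathbb{D})$ be the Carathéodory class of all functions $\Phi : \mathbb{D} \to \mathbb{C}$ which are holomorphic in $\mathbb{D}$ and which satisfy $\operatorname{Re} \Phi(\zeta) \ge 0$ for each $\zeta \in \mathbb{D}$. Furthermore, let $\mathcal{C}^0(\mathbb{D}) := \{\Phi \in \mathcal{C}(\mathbb{D}) : \Phi(0) = 1\}$. The class $\mathcal{C}(\mathbb{D})$ is intimately related with the class $\mathcal{M}_+(\mathbb{T})$. According to the Riesz-Herglotz theorem (see, e.g., [14, Chapter 1]), for each function $\Phi \in \mathcal{C}(\mathbb{D})$ there exist a unique measure $\mu \in \mathcal{M}_+(\mathbb{T})$ and a unique number $\beta \in \mathbb{R}$ such that
\[
\Phi(\zeta) = \int_{\mathbb{T}} \frac{t + \zeta}{t - \zeta} \, \mu(dt) + i\beta, \qquad \zeta \in \mathbb{D}. \tag{1.4}
\]
Obviously, $\beta = \operatorname{Im}[\Phi(0)]$. On the other hand, it can be easily checked that, for arbitrary $\mu \in \mathcal{M}_+(\mathbb{T})$ and $\beta \in \mathbb{R}$, the function $\Phi$ which is defined by the right-hand side of (1.4) belongs to $\mathcal{C}(\mathbb{D})$. If we consider the Riesz-Herglotz representation (1.4) for a function $\Phi \in \mathcal{C}^0(\mathbb{D})$, then $\beta = 0$ and $\mu$ belongs to the set $\mathcal{M}^1_+(\mathbb{T})$. Actually, in this way we obtain a bijective correspondence between the classes $\mathcal{C}^0(\mathbb{D})$ and $\mathcal{M}^1_+(\mathbb{T})$.

Let us now consider the Schur class $\mathcal{S}(\mathbb{D})$ of all functions $\Theta : \mathbb{D} \to \mathbb{C}$ which are holomorphic in $\mathbb{D}$ and which satisfy $\Theta(\mathbb{D}) \subseteq \mathbb{D} \cup \mathbb{T}$. If $\Theta \in \mathcal{S}(\mathbb{D})$, then the function $\Phi : \mathbb{D} \to \mathbb{C}$ defined by
\[
\Phi(\zeta) := \frac{1 + \zeta\Theta(\zeta)}{1 - \zeta\Theta(\zeta)} \tag{1.5}
\]
belongs to the class $\mathcal{C}^0(\mathbb{D})$. Note that from (1.5) it follows
\[
\zeta\Theta(\zeta) = \frac{\Phi(\zeta) - 1}{\Phi(\zeta) + 1}, \qquad \zeta \in \mathbb{D}. \tag{1.6}
\]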


Consequently, it can be easily verified that via (1.5) a bijective correspondence between the classes $\mathcal{S}(\mathbb{D})$ and $\mathcal{C}^0(\mathbb{D})$ is established.

Let $\theta \in \mathcal{S}$. Following I. Schur [15], we set $\theta_0 := \theta$ and $\gamma_0 := \theta_0(0)$. Obviously, $|\gamma_0| \le 1$. If $|\gamma_0| < 1$, then we consider the function $\theta_1 : \mathbb{D} \to \mathbb{C}$ defined by
\[
\theta_1(\zeta) := \frac{1}{\zeta} \cdot \frac{\theta_0(\zeta) - \gamma_0}{1 - \overline{\gamma_0}\,\theta_0(\zeta)}.
\]
In view of the lemma of H.A. Schwarz, we have $\theta_1 \in \mathcal{S}$. As above we set $\gamma_1 := \theta_1(0)$ and if $|\gamma_1| < 1$, we consider the function $\theta_2 : \mathbb{D} \to \mathbb{C}$ defined by
\[
\theta_2(\zeta) := \frac{1}{\zeta} \cdot \frac{\theta_1(\zeta) - \gamma_1}{1 - \overline{\gamma_1}\,\theta_1(\zeta)}.
\]
Further, we continue this procedure inductively. Namely, if in the $j$th step a function $\theta_j$ occurs for which the complex number $\gamma_j := \theta_j(0)$ fulfills $|\gamma_j| < 1$, we define $\theta_{j+1} : \mathbb{D} \to \mathbb{C}$ by
\[
\theta_{j+1}(\zeta) := \frac{1}{\zeta} \cdot \frac{\theta_j(\zeta) - \gamma_j}{1 - \overline{\gamma_j}\,\theta_j(\zeta)}
\]
and continue this procedure in the prescribed way. Let $\mathbb{N}_0$ be the set of all nonnegative integers, and, for each $\alpha \in \mathbb{R}$ and $\beta \in \mathbb{R} \cup \{+\infty\}$, let $\mathbb{N}_{\alpha,\beta} := \{k \in \mathbb{N}_0 : \alpha \le k \le \beta\}$. Then two cases are possible:
(1) The procedure can be carried out without end, i.e., $|\gamma_j| < 1$ for each $j \in \mathbb{N}_0$.
(2) There exists an $m \in \mathbb{N}_0$ such that $|\gamma_m| = 1$ and, if $m > 0$, then $|\gamma_j| < 1$ for each $j \in \mathbb{N}_{0,m-1}$.

Thus, for each function $\theta \in \mathcal{S}$, a sequence $(\gamma_j)_{j=0}^{\omega}$ is associated with $\theta$. Here we have $\omega = \infty$ (resp. $\omega = m$) in the first (resp. second) case. From I. Schur's paper [15] it is known that the second case occurs if and only if $\theta$ is a finite Blaschke product of degree $\omega$.

The above procedure is called the Schur algorithm and the sequence $(\gamma_j)_{j=0}^{\omega}$ obtained here is called the sequence of Schur parameters associated with the function $\theta$, whereas for each $j \in \mathbb{N}_{0,\omega}$ the function $\theta_j$ is called the $j$th Schur transform of $\theta$. The symbol $\Gamma$ stands for the set of all sequences of Schur parameters associated with functions belonging to $\mathcal{S}$. The following two properties established by I. Schur in [15] determine the particular role which Schur parameters play in the study of functions of class $\mathcal{S}$.
(a) Each sequence $(\gamma_j)_{j=0}^{\infty}$ of complex numbers with $|\gamma_j| < 1$ for each $j \in \mathbb{N}_0$ belongs to $\Gamma$. Furthermore, for each $n \in \mathbb{N}_0$, a sequence $(\gamma_j)_{j=0}^{n}$ of complex numbers with $|\gamma_n| = 1$ and $|\gamma_j| < 1$ for each $j \in \mathbb{N}_{0,n-1}$ belongs to $\Gamma$.
(b) There is a one-to-one correspondence between the sets $\mathcal{S}$ and $\Gamma$.

Thus, the Schur parameters are independent parameters which completely determine the functions of the class $\mathcal{S}$.
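The Schur algorithm described above can be sketched on truncated Taylor series. The following is an illustration under assumptions, not the authors' procedure: the function name `schur_parameters` and the tolerance `tol` are hypothetical, the input is a finite list of Taylor coefficients of $\theta$ at $0$, and the quotient of the two power series is formed term by term, the conjugate of $\gamma_j$ appearing in the denominator. With $N$ input coefficients at most $N$ parameters are produced.

```python
def schur_parameters(c, tol=1e-12):
    """Schur parameters from Taylor coefficients c[0], c[1], ... of a Schur function.

    Stops early if a parameter of modulus 1 is met (finite Blaschke product case).
    """
    theta = [complex(x) for x in c]
    gammas = []
    while theta:
        g = theta[0]                       # gamma_j = theta_j(0)
        gammas.append(g)
        if abs(g) >= 1 - tol:
            break
        gc = g.conjugate()
        num = [theta[0] - g] + theta[1:]             # theta_j - gamma_j (constant term 0)
        den = [1 - gc * theta[0]] + [-gc * x for x in theta[1:]]
        q = []                                       # power series quotient num / den
        for m in range(len(theta)):
            acc = num[m] - sum(den[i] * q[m - i] for i in range(1, m + 1))
            q.append(acc / den[0])
        theta = q[1:]                                # division by zeta
    return gammas

# theta(z) = z is a Blaschke product of degree 1: parameters (0, 1).
blaschke = schur_parameters([0, 1, 0, 0])
# theta(z) = (0.5 + z) / (1 + 0.5 z) = 0.5 + 0.75 z - 0.375 z^2 + 0.1875 z^3 - ...
moebius = schur_parameters([0.5, 0.75, -0.375, 0.1875])
# A constant function with modulus < 1: first parameter c, all others 0.
constant = schur_parameters([0.5, 0, 0, 0])
```

The first two inputs illustrate case (2), where the recursion terminates with a parameter of modulus one; the third illustrates case (1), as far as a truncated series can.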
In the result of the above considerations we obtain special ordered quadruples $[\mu, \Phi, \Theta, \gamma]$ consisting of a measure $\mu \in \mathcal{M}^1_+(\mathbb{T})$, a function $\Phi \in \mathcal{C}^0(\mathbb{D})$, a function $\Theta \in \mathcal{S}(\mathbb{D})$, and Schur parameters $\gamma = (\gamma_j)_{j=0}^{\omega} \in \Gamma$, which are interrelated in such a way that each of these four objects uniquely determines the other three. For that reason, if one of the four objects is given, we will call the three others associated with it.

The main goal of this paper is to derive a criterion which gives an answer to the question when a measure $\mu \in \mathcal{M}^1_+(\mathbb{T})$ is a Helson-Szegő measure (see Section 6). For this reason, we will need the properties of Helson-Szegő measures listed below (see Theorem 1.2). For more information about Helson-Szegő measures, we refer the reader, e.g., to [10, Chapter 7], [11, Chapter 5].

Let $f \in \mathcal{P}ol$ be given by (1.1). Then we consider the Riesz projection $P_+ f$ which is defined by
\[
(P_+ f)(t) := \sum_{k \in I \cap \mathbb{N}_0} a_k t^k, \qquad t \in \mathbb{T}.
\]
Let $\mathcal{P}ol_+ := \mathcal{L}in\{t^k : k \in \mathbb{N}_0\}$ and $\mathcal{P}ol_- := \mathcal{L}in\{t^{-k} : k \in \mathbb{N}\}$ where $\mathbb{N}$ is the set of all positive integers. Then, clearly, $P_+$ is the projection which projects the linear space $\mathcal{P}ol$ onto $\mathcal{P}ol_+$ parallel to $\mathcal{P}ol_-$.

In view of a result due to Fatou (see, e.g., [14, Theorem 1.18]), we will use the following notation: If $h \in H^2(\mathbb{D})$, then the symbol $h$ also stands for the radial boundary values of $h$, which exist for $m$-a.e. $t \in \mathbb{T}$ where $m$ is the normalized Lebesgue measure on $\mathbb{T}$. If $z \in \mathbb{C}$, then the symbol $z^*$ stands for the complex conjugate of $z$.

Theorem 1.2. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$. Then the following statements are equivalent:
(i) $\mu$ is a Helson-Szegő measure.
(ii) The Riesz projection $P_+$ is bounded in $L^2_\mu$.
(iii) The sequence $(t^n)_{n \in \mathbb{Z}}$ is a (symmetric or nonsymmetric) basis of $L^2_\mu$.
(iv) $\mu$ is absolutely continuous with respect to $m$ and there is an outer function $h \in H^2(\mathbb{D})$ such that $\frac{d\mu}{dm} = |h|^2$ and
\[
\operatorname{dist}\big(h^*/h, H^\infty(\mathbb{T})\big) < 1.
\]

Corollary 1.3. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Helson-Szegő measure. Then the Schur parameter sequence $\gamma = (\gamma_j)_{j=0}^{\omega}$ associated with $\mu$ is infinite, i.e., $\omega = \infty$ holds, and the series $\sum_{j=0}^{\infty} |\gamma_j|^2$ converges, i.e., $\gamma \in l^2$.

Proof. Let $\theta \in \mathcal{S}(\mathbb{D})$ be the Schur function associated with $\mu$. Then it is known (see, e.g., [1, Chapter 3]) that
\[
\prod_{j=0}^{\omega} (1 - |\gamma_j|^2) = \exp\left\{ \int_{\mathbb{T}} \ln(1 - |\theta(t)|^2) \, m(dt) \right\}. \tag{1.7}
\]
We denote by $\Phi$ the function from $\mathcal{C}^0(\mathbb{D})$ associated with $\mu$. Using (1.4), assumption (iv) in Theorem 1.2, and Fatou's theorem we obtain
\[
1 - |\theta(t)|^2 = \frac{4 \operatorname{Re} \Phi(t)}{|\Phi(t) + 1|^2} = \frac{4 |h(t)|^2}{|\Phi(t) + 1|^2} \tag{1.8}
\]


for $m$-a.e. $t \in \mathbb{T}$. Thus,
\[
\ln(1 - |\theta(t)|^2) = \ln 4 + 2 \ln |h(t)| - 2 \ln |\Phi(t) + 1|. \tag{1.9}
\]
In view of $\operatorname{Re} \Phi(\zeta) > 0$ for each $\zeta \in \mathbb{D}$, the function $\Phi$ is outer. Hence, $\ln |\Phi + 1| \in L^1_m$ (see, e.g., [14, Theorem 4.29 and Theorem 4.10]). Taking into account condition (iv) in Theorem 1.2, we obtain $h \in H^2(\mathbb{D})$. Thus, we infer from (1.9) that $\ln(1 - |\theta(t)|^2) \in L^1_m$. Now the assertion follows from (1.7). □

Remark 1.4. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$, and let the Lebesgue decomposition of $\mu$ with respect to $m$ be given by
\[
\mu(dt) = v(t)\, m(dt) + \mu_s(dt), \tag{1.10}
\]
where $\mu_s$ stands for the singular part of $\mu$ with respect to $m$. Then the relation $\operatorname{Re} \Phi = v$ holds $m$-a.e. on $\mathbb{T}$. The identity (1.9) has now the form
\[
\ln(1 - |\theta(t)|^2) = \ln 4 + \ln v(t) - 2 \ln |\Phi(t) + 1|
\]
for $m$-a.e. $t \in \mathbb{T}$. From this and (1.7) a well-known result now follows, namely, that $\ln v \in L^1_m$ (i.e., $\mu$ is a Szegő measure) if and only if $\omega = \infty$ and $\gamma \in l^2$. In particular, a Helson-Szegő measure is also a Szegő measure.

We first wish to mention that our interest in describing the class of Helson-Szegő measures in terms of Schur parameters was initiated by conversations with L.B. Golinskii and A.Ya. Kheifets who studied related questions in joint research with F. Peherstorfer and P.M. Yuditskii (see [7], [9], [12]). In Section 6 we will comment on some results in [7] which are similar to our own. The above-mentioned problem is of particular interest, even on its own. Solutions to this problem promise important applications and new results in scattering theory for CMV matrices (see [7], [9], [11]) and in nonlinear Fourier analysis (see [17]). Our approach to the description of Helson-Szegő measures differs from the one in [7] in that we investigate this question for CMV matrices in another basis (see [3, Definition 2.2, Theorem 2.13]), namely the one for which CMV matrices have the full GGT representation (see Simon [16, pp. 261–262, Remarks and Historical Notes]).
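As a worked illustration of the interrelated quadruple and of formula (1.7), consider the simplest case of a constant Schur function. This example is not taken from the text; the constant $c$ with $|c| < 1$ is a hypothetical choice. For $\theta \equiv c$ the Schur algorithm gives
\[
\gamma_0 = c, \qquad \theta_1(\zeta) = \frac{1}{\zeta} \cdot \frac{c - c}{1 - |c|^2} = 0, \qquad \gamma_j = 0 \ (j \ge 1), \qquad \omega = \infty, \qquad \sum_{j=0}^{\infty} |\gamma_j|^2 = |c|^2 < \infty .
\]
By (1.5), the associated Carathéodory function is $\Phi(\zeta) = \frac{1 + c\zeta}{1 - c\zeta}$, which is holomorphic in a neighbourhood of the closed unit disk, so that $\mu$ has no singular part and
\[
v(t) = \operatorname{Re} \Phi(t) = \frac{1 - |c|^2}{|1 - ct|^2}, \qquad \frac{1 - |c|}{1 + |c|} \le v(t) \le \frac{1 + |c|}{1 - |c|}, \qquad t \in \mathbb{T}.
\]
Both sides of (1.7) equal $1 - |c|^2$. Since $v$ is bounded and bounded away from zero, and since $\int_{\mathbb{T}} |\tilde{f}|^2 \, dm \le \int_{\mathbb{T}} |f|^2 \, dm$ for every $f \in \mathcal{P}ol$ by Parseval's identity, inequality (1.3) holds with $C = \big(\frac{1 + |c|}{1 - |c|}\big)^2$. Hence this $\mu$ is a Helson-Szegő measure, in accordance with $\gamma \in l^2$.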

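2. A unitary colligation associated with a Borelian probability measure on the unit circle

The starting point of this section is the observation that a given Schur function $\Theta \in \mathcal{S}(\mathbb{D})$ can be represented as characteristic function of some contraction in a Hilbert space. That means that there exists a separable complex Hilbert space $\mathfrak{H}$ and bounded linear operators $T : \mathfrak{H} \to \mathfrak{H}$, $F : \mathbb{C} \to \mathfrak{H}$, $G : \mathfrak{H} \to \mathbb{C}$, and $S : \mathbb{C} \to \mathbb{C}$ such that the block operator
\[
U := \begin{pmatrix} T & F \\ G & S \end{pmatrix} : \mathfrak{H} \oplus \mathbb{C} \to \mathfrak{H} \oplus \mathbb{C} \tag{2.1}
\]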


is unitary and, moreover, that for each $\zeta \in \mathbb{D}$ the equality
\[
\Theta(\zeta) = S + \zeta G (I - \zeta T)^{-1} F \tag{2.2}
\]
is fulfilled. Note that in (2.1) the complex plane $\mathbb{C}$ is considered as the one-dimensional complex Hilbert space with the usual inner product
\[
(z, w)_{\mathbb{C}} = z^* w, \qquad z, w \in \mathbb{C}.
\]
The unitarity of the operator $U$ implies that the operator $T$ is contractive (i.e., $\|T\| \le 1$). Thus, for all $\zeta \in \mathbb{D}$ the operator $I - \zeta T$ is boundedly invertible. The unitarity of the operator $U$ means that the ordered tuple
\[
\triangle := (\mathfrak{H}, \mathbb{C}, \mathbb{C}; T, F, G, S) \tag{2.3}
\]
is a unitary colligation. In view of (2.2), the function $\Theta$ is the characteristic operator function of the unitary colligation $\triangle$. For a detailed treatment of unitary colligations and their characteristic functions we refer the reader to the landmark paper [2].

The following subspaces of $\mathfrak{H}$ will play an important role in the sequel:
\[
\mathfrak{H}_{\mathfrak{F}} := \bigvee_{n=0}^{\infty} T^n F(\mathbb{C}), \qquad \mathfrak{H}_{\mathfrak{G}} := \bigvee_{n=0}^{\infty} (T^*)^n G^*(\mathbb{C}). \tag{2.4}
\]

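By the symbol $\bigvee_{n=0}^{\infty} A_n$ we mean the smallest closed subspace generated by the subsets $A_n$ of the corresponding spaces. The subspaces $\mathfrak{H}_{\mathfrak{F}}$ and $\mathfrak{H}_{\mathfrak{G}}$ are called the subspaces of controllability and observability, respectively. We note that the unitary operator $U$ can be chosen such that
\[
\mathfrak{H} = \mathfrak{H}_{\mathfrak{F}} \vee \mathfrak{H}_{\mathfrak{G}} \tag{2.5}
\]
holds. In this case the unitary colligation $\triangle$ is called simple. The simplicity of a unitary colligation means that there does not exist a nontrivial invariant subspace of $\mathfrak{H}$ on which the operator $T$ induces a unitary operator. Contractions $T$ of this kind are called completely nonunitary.

Proposition 2.1. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Szegő measure. Let $\Theta$ be the function belonging to $\mathcal{S}(\mathbb{D})$ which is associated with $\mu$ and let $\triangle$ be the simple unitary colligation the characteristic operator function of which coincides with $\Theta$. Then the spaces
\[
\mathfrak{H}^{\perp}_{\mathfrak{F}} := \mathfrak{H} \ominus \mathfrak{H}_{\mathfrak{F}}, \qquad \mathfrak{H}^{\perp}_{\mathfrak{G}} := \mathfrak{H} \ominus \mathfrak{H}_{\mathfrak{G}} \tag{2.6}
\]
are nontrivial.

Proof. Let $\gamma = (\gamma_j)_{j=0}^{\omega} \in \Gamma$ be the Schur parameter sequence of $\Theta$. Since $\mu$ is a Szegő measure, from Remark 1.4 we infer that $\omega = \infty$ and that $\gamma \in l^2$. In this case it was proved in [3, Chapter 2] that both spaces (2.6) are nontrivial. □

Because of (2.4) and (2.6) it follows that the subspace $\mathfrak{H}^{\perp}_{\mathfrak{G}}$ (resp. $\mathfrak{H}^{\perp}_{\mathfrak{F}}$) is invariant with respect to $T$ (resp. $T^*$). It can be shown (see [3, Theorem 1.6]) that
\[
V_T := \operatorname{Rstr.}_{\mathfrak{H}^{\perp}_{\mathfrak{G}}} T \qquad \text{and} \qquad V_{T^*} := \operatorname{Rstr.}_{\mathfrak{H}^{\perp}_{\mathfrak{F}}} T^*
\]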


are both unilateral shifts. More precisely, $V_T$ (resp. $V_{T^*}$) is exactly the maximal unilateral shift contained in $T$ (resp. $T^*$). This means that an arbitrary invariant subspace with respect to $T$ (resp. $T^*$) on which $T$ (resp. $T^*$) induces a unilateral shift is contained in $\mathfrak{H}^{\perp}_{\mathfrak{G}}$ (resp. $\mathfrak{H}^{\perp}_{\mathfrak{F}}$).

Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$. Then our subsequent considerations are concerned with the investigation of the unitary operator $U^{\times}_{\mu} : L^2_\mu \to L^2_\mu$ which is defined for each $f \in L^2_\mu$ by
\[
(U^{\times}_{\mu} f)(t) := t^* \cdot f(t), \qquad t \in \mathbb{T}. \tag{2.7}
\]
Denote by $\tau$ the embedding operator of $\mathbb{C}$ into $L^2_\mu$, i.e., $\tau : \mathbb{C} \to L^2_\mu$ is such that for each $c \in \mathbb{C}$ the image $\tau(c)$ of $c$ is the constant function on $\mathbb{T}$ with value $c$. Denote by $\mathbb{C}_{\mathbb{T}}$ the subspace of $L^2_\mu$ which is generated by the constant functions and denote by $\mathbf{1}$ the constant function on $\mathbb{T}$ with value 1. Then obviously $\tau(\mathbb{C}) = \mathbb{C}_{\mathbb{T}}$ and $\tau(1) = \mathbf{1}$. We consider the subspace $\mathfrak{H}_\mu := L^2_\mu \ominus \mathbb{C}_{\mathbb{T}}$. Denote by
\[
U^{\times}_{\mu} = \begin{pmatrix} T^{\times} & F^{\times} \\ G^{\times} & S^{\times} \end{pmatrix}
\]
the block representation of the operator $U^{\times}_{\mu}$ with respect to the orthogonal decomposition $L^2_\mu = \mathfrak{H}_\mu \oplus \mathbb{C}_{\mathbb{T}}$. Then (see [3, Section 2.8]) the following result holds.

Theorem 2.2. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$. Define $T_\mu := T^{\times}$, $F_\mu := F^{\times}\tau$, $G_\mu := \tau^* G^{\times}$, and $S_\mu := \tau^* S^{\times} \tau$. Then
\[
\triangle_\mu := (\mathfrak{H}_\mu, \mathbb{C}, \mathbb{C}; T_\mu, F_\mu, G_\mu, S_\mu) \tag{2.8}
\]
is a simple unitary colligation the characteristic function $\Theta_{\triangle_\mu}$ of which coincides with the Schur function $\Theta$ associated with $\mu$.

In view of Theorem 2.2, the operator $T_\mu$ is a completely nonunitary contraction and if the function $\Phi$ is given by (1.4) with $\beta = 0$, then from (1.6) it follows
\[
\zeta \Theta_{\triangle_\mu}(\zeta) = \frac{\Phi(\zeta) - 1}{\Phi(\zeta) + 1}, \qquad \zeta \in \mathbb{D}.
\]

Definition 2.3. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$. Then the simple unitary colligation given by (2.8) is called the unitary colligation associated with $\mu$.

Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Szegő measure and let $\gamma = (\gamma_j)_{j=0}^{\omega} \in \Gamma$ be the Schur parameter sequence associated with $\mu$. Then Remark 1.4 shows that $\omega = \infty$ and $\gamma \in l^2$. Furthermore, we use for all integers $n$ the setting $e_n : \mathbb{T} \to \mathbb{C}$ defined by
\[
e_n(t) := t^n. \tag{2.9}
\]


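Thus, we have $e_{-n} = (U^{\times}_{\mu})^n \mathbf{1}$, where $U^{\times}_{\mu}$ is the operator defined by (2.7). We consider then the system $\{e_0, e_{-1}, e_{-2}, \ldots\}$. By the Gram-Schmidt orthogonalization method in the space $L^2_\mu$ we get a unique sequence $(\varphi_n)_{n=0}^{\infty}$ of polynomials, where
\[
\varphi_n(t) = \alpha_{n,n} t^{-n} + \alpha_{n,n-1} t^{-(n-1)} + \cdots + \alpha_{n,0}, \qquad t \in \mathbb{T},
\]
such that the conditions
\[
\bigvee_{k=0}^{n} \varphi_k = \bigvee_{k=0}^{n} (U^{\times}_{\mu})^k \mathbf{1}, \qquad n \in \mathbb{N}_0, \tag{2.10}
\]
\[
\big( (U^{\times}_{\mu})^n \mathbf{1}, \varphi_n \big)_{L^2_\mu} > 0, \qquad n \in \mathbb{N}_0, \tag{2.11}
\]
are satisfied. We note that the condition (2.11) is equivalent to $(\mathbf{1}, \varphi_0)_{L^2_\mu} > 0$ and
\[
\big( U^{\times}_{\mu} \varphi_{n-1}, \varphi_n \big)_{L^2_\mu} > 0, \qquad n \in \mathbb{N}. \tag{2.12}
\]
In particular, since $\mu(\mathbb{T}) = 1$ holds, from the construction of $\varphi_0$ we see that
\[
\varphi_0 = \mathbf{1}. \tag{2.13}
\]
We consider a simple unitary colligation $\triangle_\mu$ of the type (2.8) associated with the measure $\mu$. The controllability and observability spaces (2.4) associated with the unitary colligation $\triangle_\mu$ have the forms
\[
\mathfrak{H}_{\mu,\mathfrak{F}} = \bigvee_{n=0}^{\infty} T_\mu^n F_\mu(\mathbb{C}) \qquad \text{and} \qquad \mathfrak{H}_{\mu,\mathfrak{G}} = \bigvee_{n=0}^{\infty} (T_\mu^*)^n G_\mu^*(\mathbb{C}), \tag{2.14}
\]
respectively. Let the sequence of functions $(\varphi'_k)_{k=1}^{\infty}$ be defined by
\[
\varphi'_k := T_\mu^{k-1} F_\mu(1), \qquad k \in \mathbb{N}. \tag{2.15}
\]
In view of the formulas
\[
\bigvee_{k=0}^{n} (U^{\times}_{\mu})^k \mathbf{1} = \left( \bigvee_{k=0}^{n-1} (T_\mu)^k F_\mu(1) \right) \oplus \mathbb{C}_{\mathbb{T}}, \qquad n \in \mathbb{N}, \tag{2.16}
\]
it can be seen that the sequence $(\varphi_k)_{k=1}^{\infty}$ can be obtained by applying the Gram-Schmidt orthonormalization procedure to $(\varphi'_k)_{k=1}^{\infty}$ with additional consideration of the normalization condition (2.12). Thus, we obtain the following result:

Theorem 2.4. The system $(\varphi_k)_{k=1}^{\infty}$ of orthonormal polynomials is a basis in the space $\mathfrak{H}_{\mu,\mathfrak{F}}$, and
\[
\mathfrak{H}_{\mu,\mathfrak{F}} = \left( \bigvee_{k=0}^{\infty} (t^*)^k \right) \ominus \mathbb{C}_{\mathbb{T}}. \tag{2.17}
\]
This system can be obtained in the result of the application of the Gram-Schmidt orthogonalization procedure to the sequence (2.15) taking into account the normalization condition (2.12).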


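Remark 2.5. Analogously to (2.17) we have the equation
\[
\mathfrak{H}_{\mu,\mathfrak{G}} = \left( \bigvee_{k=0}^{\infty} t^k \right) \ominus \mathbb{C}_{\mathbb{T}}. \tag{2.18}
\]

If $T$ is a contraction acting on some Hilbert space $\mathfrak{H}$, then we use the setting
\[
\delta_T := \dim \mathfrak{D}_T \qquad \big(\text{resp. } \delta_{T^*} := \dim \mathfrak{D}_{T^*}\big),
\]
where $\mathfrak{D}_T := \overline{D_T(\mathfrak{H})}$ (resp. $\mathfrak{D}_{T^*} := \overline{D_{T^*}(\mathfrak{H})}$) is the closure of the range of the defect operator $D_T := \sqrt{I_{\mathfrak{H}} - T^* T}$ (resp. $D_{T^*} := \sqrt{I_{\mathfrak{H}} - T T^*}$). In view of (2.6), let
\[
\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}} := \mathfrak{H}_\mu \ominus \mathfrak{H}_{\mu,\mathfrak{F}}, \qquad \mathfrak{H}^{\perp}_{\mu,\mathfrak{G}} := \mathfrak{H}_\mu \ominus \mathfrak{H}_{\mu,\mathfrak{G}}.
\]
If $\mu$ is a Szegő measure, then we have
\[
L^2_\mu \ominus \bigvee_{k=0}^{\infty} e_{-k} = \big( \mathfrak{H}_\mu \oplus \mathbb{C}_{\mathbb{T}} \big) \ominus \big( \mathfrak{H}_{\mu,\mathfrak{F}} \oplus \mathbb{C}_{\mathbb{T}} \big) = \mathfrak{H}_\mu \ominus \mathfrak{H}_{\mu,\mathfrak{F}} = \mathfrak{H}^{\perp}_{\mu,\mathfrak{F}} \ne \{0\}. \tag{2.19}
\]
So from Proposition 2.1 we obtain the known result that in this case the system $(\varphi_n)_{n=0}^{\infty}$ is not complete in the space $L^2_\mu$. In our case
\[
V_{T_\mu} := \operatorname{Rstr.}_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{G}}} T_\mu \qquad \big(\text{resp. } V_{T_\mu^*} := \operatorname{Rstr.}_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}} T_\mu^* \big)
\]
is the maximal unilateral shift contained in $T_\mu$ (resp. $T_\mu^*$) (see [3, Theorem 1.6]). In view of $\delta_{T_\mu} = \delta_{T_\mu^*} = 1$ the multiplicity of the unilateral shift $V_{T_\mu}$ (resp. $V_{T_\mu^*}$) is equal to 1.

Proposition 2.6. The orthonormal system of the polynomials $(\varphi_k)_{k=0}^{\infty}$ is noncomplete in $L^2_\mu$ if and only if the contraction $T_\mu$ (resp. $T_\mu^*$) contains a maximal unilateral shift $V_{T_\mu}$ (resp. $V_{T_\mu^*}$) of multiplicity 1.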
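The representation (2.2) of a Schur function through a unitary block operator (2.1) can be illustrated in finite dimensions. The sketch below is only an assumption-laden toy model: the state space is $\mathbb{C}^n$ with a hypothetical $n = 4$, the unitary matrix is generated at random by a QR factorization, and nothing here is tied to a particular measure $\mu$. It checks that the resulting $\Theta$ is bounded by one at a point of the disk, using the identity $1 - |\Theta(\zeta)|^2 = (1 - |\zeta|^2)\,\|(I - \zeta T)^{-1}F\|^2$, which follows from applying the unitary $U$ to the vector $(\zeta x, 1)$ with $x = (I - \zeta T)^{-1} F$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                                   # hypothetical state space dimension

# Random unitary (n+1) x (n+1) matrix, partitioned as in (2.1).
Z = rng.standard_normal((n + 1, n + 1)) + 1j * rng.standard_normal((n + 1, n + 1))
U, _ = np.linalg.qr(Z)
T, F = U[:n, :n], U[:n, n:]
G, S = U[n:, :n], U[n:, n:]

def theta(z):
    """Characteristic function (2.2): S + z G (I - z T)^{-1} F (a scalar here)."""
    x = np.linalg.solve(np.eye(n) - z * T, F)
    return (S + z * (G @ x))[0, 0]

z0 = 0.3 + 0.4j
x0 = np.linalg.solve(np.eye(n) - z0 * T, F)
lhs = 1 - abs(theta(z0)) ** 2
rhs = (1 - abs(z0) ** 2) * np.linalg.norm(x0) ** 2
print(abs(theta(z0)), lhs, rhs)         # modulus at most 1; lhs and rhs agree
```

In this finite-dimensional model $\Theta$ is a rational function, so it cannot reproduce the infinite-dimensional phenomena (nontrivial $\mathfrak{H}^{\perp}_{\mathfrak{F}}$, unilateral shifts) discussed above; it only makes the algebra of (2.1) and (2.2) concrete.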

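3. On the connection between the Riesz projection $P_+$ and the projection $\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}}$ which projects $\mathfrak{H}_\mu$ onto $\mathfrak{H}_{\mu,\mathfrak{G}}$ parallel to $\mathfrak{H}_{\mu,\mathfrak{F}}$

Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$. We consider the unitary colligation $\triangle_\mu$ of type (2.8) which is associated with the measure $\mu$. Then the following statement is true.

Theorem 3.1. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Szegő measure. Then the Riesz projection $P_+$ is bounded in $L^2_\mu$ if and only if the projection $\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}}$ which projects $\mathfrak{H}_\mu$ onto $\mathfrak{H}_{\mu,\mathfrak{G}}$ parallel to $\mathfrak{H}_{\mu,\mathfrak{F}}$ is bounded.

Proof. For each $n \in \mathbb{N}_0$ we consider particular subspaces of the space $\mathfrak{H}_\mu$, namely
\[
\mathfrak{H}^{(n)}_{\mu,\mathfrak{F}} := \bigvee_{k=0}^{n} T_\mu^k F_\mu(\mathbb{C}), \qquad \mathfrak{H}^{(n)}_{\mu,\mathfrak{G}} := \bigvee_{k=0}^{n} (T_\mu^*)^k G_\mu^*(\mathbb{C}), \tag{3.1}
\]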


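and
\[
\mathfrak{H}^{(n)}_{\mu} := \mathfrak{H}^{(n)}_{\mu,\mathfrak{F}} \vee \mathfrak{H}^{(n)}_{\mu,\mathfrak{G}}, \qquad L_{\mu,n} := \mathfrak{H}^{(n)}_{\mu} \oplus \mathbb{C}_{\mathbb{T}}. \tag{3.2}
\]
Then from (2.15), (2.17), and (2.18) we obtain the relations
\[
\mathfrak{H}_{\mu,\mathfrak{F}} = \bigvee_{n=0}^{\infty} \mathfrak{H}^{(n)}_{\mu,\mathfrak{F}}, \qquad \mathfrak{H}_{\mu,\mathfrak{G}} = \bigvee_{n=0}^{\infty} \mathfrak{H}^{(n)}_{\mu,\mathfrak{G}}, \tag{3.3}
\]
\[
\mathfrak{H}_{\mu} = \bigvee_{n=0}^{\infty} \mathfrak{H}^{(n)}_{\mu}, \qquad L^2_\mu = \bigvee_{n=0}^{\infty} L_{\mu,n}, \tag{3.4}
\]
\[
\mathfrak{H}^{(n)}_{\mu,\mathfrak{F}} = \left( \bigvee_{k=0}^{n} (t^*)^k \right) \ominus \mathbb{C}_{\mathbb{T}}, \qquad \mathfrak{H}^{(n)}_{\mu,\mathfrak{G}} = \left( \bigvee_{k=0}^{n} t^k \right) \ominus \mathbb{C}_{\mathbb{T}}, \tag{3.5}
\]
and
\[
\mathfrak{H}^{(n)}_{\mu} = \left( \bigvee_{k=-n}^{n} t^k \right) \ominus \mathbb{C}_{\mathbb{T}}, \qquad L_{\mu,n} = \bigvee_{k=-n}^{n} t^k. \tag{3.6}
\]
Since $\mu$ is a Szegő measure, for each $n \in \mathbb{N}_0$ we obtain
\[
\mathfrak{H}^{(n)}_{\mu} = \mathfrak{H}^{(n)}_{\mu,\mathfrak{F}} \dotplus \mathfrak{H}^{(n)}_{\mu,\mathfrak{G}}. \tag{3.7}
\]
Suppose now that the Riesz projection $P_+$ is bounded in $L^2_\mu$. Let $h \in \mathfrak{H}^{(n)}_{\mu}$. Then, because of (3.6), the function $h$ has the form
\[
h(t) = \sum_{k=-n}^{n} a_k t^k, \qquad t \in \mathbb{T}. \tag{3.8}
\]
From (3.5) and (3.7) we obtain
\[
h = h_{\mathfrak{F}} + h_{\mathfrak{G}}, \tag{3.9}
\]
where
\[
h_{\mathfrak{F}} \in \mathfrak{H}^{(n)}_{\mu,\mathfrak{F}}, \quad h_{\mathfrak{G}} \in \mathfrak{H}^{(n)}_{\mu,\mathfrak{G}}, \quad h_{\mathfrak{F}}(t) = a_{0,\mathfrak{F}} + \sum_{k=1}^{n} a_{-k} t^{-k}, \quad h_{\mathfrak{G}}(t) = a_{0,\mathfrak{G}} + \sum_{k=1}^{n} a_k t^k, \tag{3.10}
\]
and $a_0 = a_{0,\mathfrak{F}} + a_{0,\mathfrak{G}}$. Observe that
\[
\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} h = h_{\mathfrak{G}}. \tag{3.11}
\]
On the other hand, we have
\[
h = h_+ + h_-, \tag{3.12}
\]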


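where, for each $t \in \mathbb{T}$,
\[
h_+(t) = (P_+ h)(t) = \sum_{k=0}^{n} a_k t^k, \qquad h_-(t) = \sum_{k=1}^{n} a_{-k} t^{-k}. \tag{3.13}
\]
For a polynomial $h_{\mathfrak{F}}$ of the type (3.10) we set
\[
h_{\mathfrak{F}}(0) := a_{0,\mathfrak{F}}. \tag{3.14}
\]
Then from (3.10)–(3.14) we infer
\[
P_+ h = h_+ = h_{\mathfrak{G}} + h_{\mathfrak{F}}(0) \cdot \mathbf{1} = \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} h + h_{\mathfrak{F}}(0) \cdot \mathbf{1}. \tag{3.15}
\]
Observe that
\[
h_- = h_{\mathfrak{F}} - h_{\mathfrak{F}}(0) \cdot \mathbf{1}, \tag{3.16}
\]
where in view of (3.5) we see that $h_{\mathfrak{F}} \perp \mathbf{1}$. Let $P_{\mathbb{C}_{\mathbb{T}}}$ be the orthoprojection from $L^2_\mu$ onto $\mathbb{C}_{\mathbb{T}}$. Since $P_{\mathbb{C}_{\mathbb{T}}} h_{\mathfrak{F}} = 0$, from (3.16) it follows that
\[
h_{\mathfrak{F}}(0) \cdot \mathbf{1} = -P_{\mathbb{C}_{\mathbb{T}}} h_- = -P_{\mathbb{C}_{\mathbb{T}}} (I - P_+) h.
\]
Inserting this expression into (3.15) we get
\[
\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} h = P_+ h + P_{\mathbb{C}_{\mathbb{T}}} (I - P_+) h.
\]
From this and (3.4) it follows that the boundedness of the projection $P_+$ in $L^2_\mu$ implies the boundedness of the projection $\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}}$ in $\mathfrak{H}_\mu$.

Conversely, suppose that the projection $\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}}$ is bounded in $\mathfrak{H}_\mu$. Let $f \in L_{\mu,n}$. We denote by $P_{\mathfrak{H}^{(n)}_{\mu}}$ the orthogonal projection from $L^2_\mu$ onto $\mathfrak{H}^{(n)}_{\mu}$. We set
\[
h := P_{\mathfrak{H}^{(n)}_{\mu}} f
\]
and use for $h$ the notations introduced in (3.8)–(3.10). Let $f = f_+ + f_-$, where $f_+ := P_+ f$. Then
\[
f = P_{\mathbb{C}_{\mathbb{T}}} f + h = P_{\mathbb{C}_{\mathbb{T}}} f + h_{\mathfrak{G}} + h_{\mathfrak{F}}.
\]
This implies
\[
P_+ f = P_{\mathbb{C}_{\mathbb{T}}} f + h_{\mathfrak{G}} + h_{\mathfrak{F}}(0) \cdot \mathbf{1}.
\]
This means
\[
P_+ f = P_{\mathbb{C}_{\mathbb{T}}} f + \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} P_{\mathfrak{H}^{(n)}_{\mu}} f + h_{\mathfrak{F}}(0) \cdot \mathbf{1}. \tag{3.17}
\]
The mapping $h_{\mathfrak{F}} \mapsto h_{\mathfrak{F}}(0)$ is a linear functional on the set
\[
\mathcal{P}ol_{\le 0} := \mathcal{L}in\{t^{-k} : k \in \mathbb{N}_0\}.
\]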


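Since $\mu$ is a Szegő measure, the Szegő-Kolmogorov-Krein Theorem (see, e.g., [14, Theorem 4.31]) implies that this functional is bounded in $L^2_\mu$ on the set $\mathcal{P}ol_{\le 0}$. Thus, there exists a constant $C \in (0, +\infty)$ such that
\[
\| h_{\mathfrak{F}}(0) \cdot \mathbf{1} \|_{L^2_\mu} = |h_{\mathfrak{F}}(0)| \le C \cdot \| h_{\mathfrak{F}} \| = C \cdot \big\| \big( I - \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} \big) h \big\| \le C \cdot \big\| I - \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} \big\| \cdot \| h \| \le C \cdot \big\| I - \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} \big\| \cdot \| f \|. \tag{3.18}
\]
From (3.17) and (3.18) we get
\[
\| P_+ f \| \le \| P_{\mathbb{C}_{\mathbb{T}}} f \| + \big\| \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} P_{\mathfrak{H}^{(n)}_{\mu}} f \big\| + C \cdot \big\| I - \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} \big\| \cdot \| f \| \le \| f \| + \big\| \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} \big\| \cdot \| f \| + C \cdot \big\| I - \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} \big\| \cdot \| f \|.
\]
Now considering the limit as $n \to \infty$ and taking into account (3.4), we see that the boundedness of the projection $\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}}$ implies the boundedness of the Riesz projection $P_+$ in $L^2_\mu$. □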
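The coefficient bookkeeping in (3.8)–(3.16) can be checked numerically for a concrete measure. The sketch below is an illustration under assumptions: the measure is taken with the hypothetical density $v(t) = (1 - r^2)/|1 - rt|^2$, $r = 0.5$, with respect to $m$ (a Poisson kernel, so that $\int_{\mathbb{T}} t^k \, \mu(dt) = r^{|k|}$), and polynomials of degree at most $n = 3$ are stored as coefficient arrays indexed by $k = -n, \ldots, n$. It verifies (3.9), (3.15), (3.16) and the relation $P_{\mathbb{C}_{\mathbb{T}}} h_- = -h_{\mathfrak{F}}(0) \cdot \mathbf{1}$, which follows from (3.16) together with $h_{\mathfrak{F}} \perp \mathbf{1}$.

```python
import numpy as np

r, n = 0.5, 3                          # hypothetical density parameter and degree
k = np.arange(-n, n + 1)               # exponents -n..n; index n is the constant term
moments = r ** np.abs(k)               # integral of t^k with respect to mu

def integral(a):
    """Integral of sum_k a_k t^k with respect to mu (equals the inner product with 1)."""
    return np.sum(a * moments)

rng = np.random.default_rng(1)
a = rng.standard_normal(2 * n + 1) + 1j * rng.standard_normal(2 * n + 1)
a[n] = 0.0
a[n] = -integral(a)                    # choose a_0 so that h is orthogonal to constants

one = np.zeros(2 * n + 1, dtype=complex)
one[n] = 1.0

# Decomposition (3.9)-(3.10): each part is orthogonal to constants.
h_G = np.zeros_like(a); h_G[n + 1:] = a[n + 1:]; h_G[n] = -integral(h_G)
h_F = np.zeros_like(a); h_F[:n] = a[:n];         h_F[n] = -integral(h_F)

# Riesz projection and its complement, as in (3.13).
P_plus_h = a.copy(); P_plus_h[:n] = 0.0
h_minus = a - P_plus_h

print(np.allclose(h_F + h_G, a))                        # (3.9)
print(np.allclose(P_plus_h, h_G + h_F[n] * one))        # (3.15)
print(np.allclose(h_minus, h_F - h_F[n] * one))         # (3.16)
print(np.isclose(integral(h_minus), -h_F[n]))           # P_C h_- = -h_F(0) * 1
```

Since $\mu$ is a probability measure, the orthogonal projection onto the constants is $f \mapsto \big(\int f \, d\mu\big) \cdot \mathbf{1}$, which is why `integral` suffices to compute $P_{\mathbb{C}_{\mathbb{T}}}$ here.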

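4. On the connection of the Riesz projection $P_+$ with the orthogonal projections from $\mathfrak{H}_\mu$ onto the subspaces $\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}$ and $\mathfrak{H}^{\perp}_{\mu,\mathfrak{G}}$

Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Szegő measure. As we did earlier, we consider the simple unitary colligation $\triangle_\mu$ of type (2.8) which is associated with the measure $\mu$. As was previously mentioned, we then have
\[
\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}} \ne \{0\} \qquad \text{and} \qquad \mathfrak{H}^{\perp}_{\mu,\mathfrak{G}} \ne \{0\}.
\]
We denote by $P_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}}$ and $P_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{G}}}$ the orthogonal projections from $\mathfrak{H}_\mu$ onto $\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}$ and $\mathfrak{H}^{\perp}_{\mu,\mathfrak{G}}$, respectively.

Let $h \in \mathfrak{H}^{(n)}_{\mu}$. Along with the decomposition (3.9) we consider the decomposition
\[
h = \tilde{h}_{\mathfrak{F}} + \tilde{h}^{\perp}_{\mathfrak{F}},
\]
where
\[
\tilde{h}_{\mathfrak{F}} \in \mathfrak{H}^{(n)}_{\mu,\mathfrak{F}} \qquad \text{and} \qquad \tilde{h}^{\perp}_{\mathfrak{F}} \in \mathfrak{H}^{(n)}_{\mu} \ominus \mathfrak{H}^{(n)}_{\mu,\mathfrak{F}}. \tag{4.1}
\]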


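From the shape (3.5) of the subspace $\mathfrak{H}^{(n)}_{\mu,\mathfrak{F}}$ and the polynomial structure of the orthonormal basis $(\varphi_n)_{n=0}^{\infty}$ of the subspace $\mathfrak{H}_{\mu,\mathfrak{F}}$, it follows that
\[
\tilde{h}^{\perp}_{\mathfrak{F}} = P_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}} h.
\]
Since $h_{\mathfrak{F}}$ (see (3.9)) and $\tilde{h}_{\mathfrak{F}}$ belong to $\mathfrak{H}^{(n)}_{\mu,\mathfrak{F}}$, we get
\[
\tilde{h}^{\perp}_{\mathfrak{F}} = P_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}} h = P_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}} h_{\mathfrak{G}} = P_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}} \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} h = B_{\mu,\mathfrak{F}} \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} h, \tag{4.2}
\]
where
\[
B_{\mu,\mathfrak{F}} := \operatorname{Rstr.}_{\mathfrak{H}_{\mu,\mathfrak{G}}} P_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}} : \mathfrak{H}_{\mu,\mathfrak{G}} \to \mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}, \tag{4.3}
\]
i.e., we consider $B_{\mu,\mathfrak{F}}$ as an operator acting between the spaces $\mathfrak{H}_{\mu,\mathfrak{G}}$ and $\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}$.

Theorem 4.1. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Szegő measure. Then the projection $\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}}$ is bounded in $\mathfrak{H}_\mu$ if and only if the operator $B_{\mu,\mathfrak{F}}$ defined in (4.3) is boundedly invertible.

Proof. Suppose first that $B_{\mu,\mathfrak{F}}$ has a bounded inverse $B^{-1}_{\mu,\mathfrak{F}}$. Then for $h \in \mathfrak{H}^{(n)}_{\mu}$, in view of (4.1) and (4.2), we have
\[
\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} h = B^{-1}_{\mu,\mathfrak{F}} P_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}} h.
\]
If $n \to \infty$, this gives us the boundedness of the projection $\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}}$ in $\mathfrak{H}_\mu$.

Conversely, suppose that the projection $\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}}$ is bounded in $\mathfrak{H}_\mu$. If $h \in \mathfrak{H}^{(n)}_{\mu} \ominus \mathfrak{H}^{(n)}_{\mu,\mathfrak{F}}$, then the decomposition (4.1) provides us $h = \tilde{h}^{\perp}_{\mathfrak{F}}$ and identity (4.2) yields
\[
h = B_{\mu,\mathfrak{F}} \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} h. \tag{4.4}
\]
Since $\mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}}$ is bounded in $\mathfrak{H}_\mu$, Theorem 3.1 implies that the Riesz projection $P_+$ is bounded in $L^2_\mu$. Then it follows from condition (iii) in Theorem 1.2 that
\[
\mathfrak{H}_{\mu,\mathfrak{F}} \cap \mathfrak{H}_{\mu,\mathfrak{G}} = \{0\}.
\]
Thus, from the shape (4.3) of the operator $B_{\mu,\mathfrak{F}}$, we infer that
\[
\ker B_{\mu,\mathfrak{F}} = \{0\} \qquad \text{and} \qquad \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} \big( \mathfrak{H}^{\perp}_{\mu,\mathfrak{F}} \big) = \mathfrak{H}_{\mu,\mathfrak{G}}.
\]
Now equation (4.4) can be rewritten in the form
\[
B^{-1}_{\mu,\mathfrak{F}} h = \mathcal{P}^{\mathfrak{F}}_{\mu,\mathfrak{G}} h, \qquad h \in \mathfrak{H}^{(n)}_{\mu} \ominus \mathfrak{H}^{(n)}_{\mu,\mathfrak{F}}.
\]
The limit $n \to \infty$, (3.2) and (3.4) give us the desired result. □

The combination of Theorem 4.1 with Theorem 3.1 leads us to the following result.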


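Theorem 4.2. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Szegő measure. Then the Riesz projection $P_+$ is bounded in $L^2_\mu$ if and only if the operator $B_{\mu,\mathfrak{F}}$ defined in (4.3) is boundedly invertible.

Let $f \in \mathcal{P}ol$ be given by (1.1). Along with the Riesz projection $P_+$, we consider the projection $P_-$, which is defined by
\[
(P_- f)(t) := \sum_{k \in I,\; -k \in \mathbb{N}_0} a_k t^k, \qquad t \in \mathbb{T}.
\]
Obviously,
\[
P_- f = \overline{P_+ \overline{f}} \qquad \text{and} \qquad P_+ f = \overline{P_- \overline{f}}.
\]
Thus, the boundedness of one of the projections $P_+$ and $P_-$ in $L^2_\mu$ implies the boundedness of the other one. It is readily checked that the change from the projection $P_+$ to $P_-$ is connected with changing the roles of the spaces $\mathfrak{H}_{\mu,\mathfrak{G}}$ and $\mathfrak{H}_{\mu,\mathfrak{F}}$. Thus we obtain the following result, which is dual to Theorem 4.2.

Theorem 4.3. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Szegő measure. Then the Riesz projection $P_+$ is bounded in $L^2_\mu$ if and only if the operator $B_{\mu,\mathfrak{G}} : \mathfrak{H}_{\mu,\mathfrak{F}} \to \mathfrak{H}^{\perp}_{\mu,\mathfrak{G}}$ defined by
\[
B_{\mu,\mathfrak{G}} h := P_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{G}}} h \tag{4.5}
\]
is boundedly invertible. Here the symbol $P_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{G}}}$ stands for the orthogonal projection from $\mathfrak{H}_\mu$ onto $\mathfrak{H}^{\perp}_{\mu,\mathfrak{G}}$.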

5. Matrix representation of the operator $B_{\mu,\mathfrak{G}}$ in terms of the Schur parameters associated with the measure $\mu$

Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Szegő measure. We consider the simple unitary colligation $\triangle_\mu$ of the type (2.8) which is associated with the measure $\mu$. In this case we have (see Section 2)
\[
\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}} \ne \{0\} \qquad \text{and} \qquad \mathfrak{H}^{\perp}_{\mu,\mathfrak{G}} \ne \{0\}.
\]
The operator $B_{\mu,\mathfrak{G}}$ acts between the subspaces $\mathfrak{H}_{\mu,\mathfrak{F}}$ and $\mathfrak{H}^{\perp}_{\mu,\mathfrak{G}}$. According to the matrix description of the operator $B_{\mu,\mathfrak{G}}$ we consider particular orthogonal bases in these subspaces. In the subspace $\mathfrak{H}_{\mu,\mathfrak{F}}$ we have already considered one such basis, namely the basis consisting of the trigonometric polynomials $(\varphi_n)_{n=1}^{\infty}$ (see Theorem 2.4). Regarding the construction of an orthonormal basis in $\mathfrak{H}^{\perp}_{\mu,\mathfrak{G}}$, we first complete the system $(\varphi_n)_{n=1}^{\infty}$ to an orthonormal basis in $\mathfrak{H}_\mu$. This procedure is described in more detail in [3]. We consider the orthogonal decomposition
\[
\mathfrak{H}_\mu = \mathfrak{H}_{\mu,\mathfrak{F}} \oplus \mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}. \tag{5.1}
\]


Denote by $\tilde{\mathfrak{L}}_0$ the wandering subspace which generates the subspace associated with the unilateral shift $V_{T_\mu^*}$. Then (see Proposition 2.6) $\dim \tilde{\mathfrak{L}}_0 = 1$ and, since $V_{T_\mu^*}$ is an isometric operator, we have
\[
V_{T_\mu^*} = \operatorname{Rstr.}_{\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}} (U^{\times}_{\mu})^*. \tag{5.2}
\]
Consequently,
\[
\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}} = \bigoplus_{n=0}^{\infty} V^n_{T_\mu^*}(\tilde{\mathfrak{L}}_0) = \bigvee_{n=0}^{\infty} (T_\mu^*)^n(\tilde{\mathfrak{L}}_0) = \bigvee_{n=0}^{\infty} [(U^{\times}_{\mu})^*]^n(\tilde{\mathfrak{L}}_0). \tag{5.3}
\]
There exists (see [3, Corollary 1.10]) a unique unit function $\psi_1 \in \tilde{\mathfrak{L}}_0$ which fulfills
\[
\big( G_\mu^*(1), \psi_1 \big)_{L^2_\mu} > 0. \tag{5.4}
\]
Because of (5.2), (5.3), and (5.4) it follows that the sequence $(\psi_k)_{k=1}^{\infty}$, where
\[
\psi_k := [(U^{\times}_{\mu})^*]^{k-1} \psi_1, \qquad k \in \mathbb{N}, \tag{5.5}
\]
is the unique orthonormal basis of the space $\mathfrak{H}^{\perp}_{\mu,\mathfrak{F}}$ which satisfies the conditions
\[
\big( G_\mu^*(1), \psi_1 \big)_{L^2_\mu} > 0, \qquad \psi_{k+1} = (U^{\times}_{\mu})^* \psi_k, \qquad k \in \mathbb{N}, \tag{5.6}
\]
or equivalently
\[
\big( G_\mu^*(1), \psi_1 \big)_{L^2_\mu} > 0, \qquad \psi_{k+1}(t) = t^k \cdot \psi_1(t), \qquad t \in \mathbb{T}, \quad k \in \mathbb{N}. \tag{5.7}
\]
According to the considerations in [3] we introduce the following notion.

Definition 5.1. The constructed orthonormal basis
\[
\varphi_0, \varphi_1, \varphi_2, \ldots; \psi_1, \psi_2, \ldots \tag{5.8}
\]
in the space $L^2_\mu$ which satisfies the conditions (2.11) and (5.6) is called the canonical orthonormal basis in $L^2_\mu$.

Note that the analytic structure of the system $(\psi_k)_{k=1}^{\infty}$ is described in the paper [5]. Obviously, the canonical orthonormal basis (5.8) in $L^2_\mu$ is uniquely determined by the conditions (2.11) and (5.6). Here the sequence $(\varphi_k)_{k=0}^{\infty}$ is an orthonormal system of polynomials (depending on $t^*$). The orthonormal system $(\psi_k)_{k=1}^{\infty}$ is built with the aid of the operator $U^{\times}_{\mu}$ from the function $\psi_1$ (see (5.5)) in a similar way as the system $(\varphi_k)_{k=0}^{\infty}$ was built from the function $\varphi_0$ (see (2.10) and (2.11)). The only difference is that the system $\big( [(U^{\times}_{\mu})^*]^k \psi_1 \big)_{k=0}^{\infty}$ is orthonormal, whereas in the general case the system $\big( (U^{\times}_{\mu})^k \varphi_0 \big)_{k=0}^{\infty}$ is not orthonormal. In this respect, the sequence $(\psi_k)_{k=1}^{\infty}$ can be considered as a natural completion of the system of orthonormal polynomials $(\varphi_k)_{k=0}^{\infty}$ to an orthonormal basis in $L^2_\mu$.


V.K. Dubovoy, B. Fritzsche and B. Kirstein

Remark 5.2. The orthonormal system
\[ \varphi_1, \varphi_2, \dots ;\ \psi_1, \psi_2, \dots \tag{5.9} \]
is an orthonormal basis in the space $\mathfrak{H}_\mu$. We will call it the canonical orthonormal basis in $\mathfrak{H}_\mu$.

It is well known (see, e.g., Brodskii [2]) that, together with the simple unitary colligation (2.8), one can consider the adjoint unitary colligation
\[ \tilde{\triangle}_\mu := (\mathfrak{H}_\mu, \mathbb{C}, \mathbb{C};\ T_\mu^*, G_\mu^*, F_\mu^*, S_\mu^*), \tag{5.10} \]
which is also simple. Its characteristic function $\Theta_{\tilde{\triangle}_\mu}$ is for each $z \in \mathbb{D}$ given by
\[ \Theta_{\tilde{\triangle}_\mu}(z) = \Theta^*_{\triangle_\mu}(z^*). \]
We note that the unitary colligation (5.10) is associated with the operator $(U_\mu^\times)^*$. It can be easily checked that the action of $(U_\mu^\times)^*$ is given for each $f \in L^2_\mu$ by
\[ [(U_\mu^\times)^* f](t) = t \cdot f(t), \quad t \in \mathbb{T}. \]

If we replace the operator $U_\mu^\times$ by $(U_\mu^\times)^*$ in the preceding considerations, which have led to the canonical orthonormal basis (5.8), then we obtain an orthonormal basis of the space $L^2_\mu$ which consists of two sequences
\[ (\tilde{\varphi}_j)_{j=0}^{\infty} \quad \text{and} \quad (\tilde{\psi}_j)_{j=1}^{\infty} \tag{5.11} \]
of functions. From our treatment above it follows that the orthonormal basis (5.11) is uniquely determined by the following conditions:

(a) The sequence $(\tilde{\varphi}_j)_{j=0}^{\infty}$ arises as the result of the Gram-Schmidt orthogonalization procedure applied to the sequence $\bigl([(U_\mu^\times)^*]^{n} 1\bigr)_{n=0}^{\infty}$, additionally taking into account the normalization conditions
\[ \bigl( [(U_\mu^\times)^*]^{n} 1, \tilde{\varphi}_n \bigr)_{L^2_\mu} > 0, \quad n \in \mathbb{N}_0. \]

(b) The relations
\[ \bigl( F_\mu(1), \tilde{\psi}_1 \bigr)_{L^2_\mu} > 0 \quad \text{and} \quad \tilde{\psi}_{k+1} = U_\mu^\times \tilde{\psi}_k, \quad k \in \mathbb{N}, \]
hold.

It can be easily checked that $\tilde{\varphi}_k = \varphi_k^*$, $k \in \mathbb{N}_0$, and $\tilde{\psi}_k = \psi_k^*$, $k \in \mathbb{N}$. According to the paper [3] we introduce the following notion.

Definition 5.3. The orthonormal basis
\[ \varphi_0^*, \varphi_1^*, \varphi_2^*, \dots ;\ \psi_1^*, \psi_2^*, \dots \tag{5.12} \]
is called the conjugate canonical orthonormal basis with respect to the canonical orthonormal basis (5.8).


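We note that $\varphi_0 = \varphi_0^* = 1$. Similarly to (2.16), the identity
\[ \bigvee_{k=0}^{n} [(U_\mu^\times)^*]^{k} 1 = \Bigl( \bigvee_{k=0}^{n-1} (T_\mu^*)^{k} G_\mu^*(1) \Bigr) \oplus \mathbb{C}_{\mathbb{T}} \tag{5.13} \]
can be verified. Thus,
\[ \mathfrak{H}_{\mu,\mathfrak{F}} = \bigvee_{k=1}^{\infty} \varphi_k, \qquad \mathfrak{H}_{\mu,\mathfrak{G}} = \bigvee_{k=1}^{\infty} \varphi_k^*, \tag{5.14} \]
\[ \mathfrak{H}^\perp_{\mu,\mathfrak{F}} = \bigvee_{k=1}^{\infty} \psi_k, \qquad \mathfrak{H}^\perp_{\mu,\mathfrak{G}} = \bigvee_{k=1}^{\infty} \psi_k^*. \tag{5.15} \]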

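In [3, Chapter 3] the unitary operator $\mathcal{U}_\mu$ was introduced which maps the elements of the canonical basis (5.8) onto the corresponding elements of the conjugate canonical basis (5.12). More precisely, we consider the operator
\[ \mathcal{U}_\mu \varphi_n = \varphi_n^*, \quad n \in \mathbb{N}_0, \qquad \text{and} \qquad \mathcal{U}_\mu \psi_n = \psi_n^*, \quad n \in \mathbb{N}. \tag{5.16} \]
The operator $\mathcal{U}_\mu$ is related to the conjugation operator in $L^2_\mu$. Namely, if $f \in L^2_\mu$ and if
\[ f = \sum_{k=0}^{\infty} \alpha_k \varphi_k + \sum_{k=1}^{\infty} \beta_k \psi_k, \]
then
\[ f^* = \sum_{k=0}^{\infty} \alpha_k^* \varphi_k^* + \sum_{k=1}^{\infty} \beta_k^* \psi_k^* = \sum_{k=0}^{\infty} \alpha_k^* \mathcal{U}_\mu \varphi_k + \sum_{k=1}^{\infty} \beta_k^* \mathcal{U}_\mu \psi_k. \]
From (5.16) it follows that
\[ \mathcal{U}_\mu : \mathfrak{H}_\mu \longrightarrow \mathfrak{H}_\mu, \qquad \mathcal{U}_\mu(1) = 1. \]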

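Let
\[ \mathcal{U}_{\mathfrak{H}_\mu} := \operatorname{Rstr.}_{\mathfrak{H}_\mu} \mathcal{U}_\mu. \tag{5.17} \]
Then, obviously,
\[ \mathcal{U}_{\mathfrak{H}_\mu} \varphi_n = \varphi_n^* \quad \text{and} \quad \mathcal{U}_{\mathfrak{H}_\mu} \psi_n = \psi_n^*, \quad n \in \mathbb{N}. \tag{5.18} \]
Clearly, the system $(\psi_n^*)_{n=1}^{\infty}$ is an orthonormal basis in the space $\mathfrak{H}^\perp_{\mu,\mathfrak{G}}$. This system will turn out to be the special orthonormal basis of the space $\mathfrak{H}^\perp_{\mu,\mathfrak{G}}$ mentioned at the beginning of this section. Thus, the matrix representation of the operator $B_{\mu,\mathfrak{G}} : \mathfrak{H}_{\mu,\mathfrak{F}} \longrightarrow \mathfrak{H}^\perp_{\mu,\mathfrak{G}}$ will be considered with respect to the orthonormal bases
\[ (\varphi_n)_{n=1}^{\infty} \quad \text{and} \quad (\psi_n^*)_{n=1}^{\infty} \tag{5.19} \]
of the spaces $\mathfrak{H}_{\mu,\mathfrak{F}}$ and $\mathfrak{H}^\perp_{\mu,\mathfrak{G}}$, respectively. Let
\[ \begin{pmatrix} \mathcal{R} & \mathcal{L} \\ \mathcal{P} & \mathcal{Q} \end{pmatrix} \tag{5.20} \]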


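be the matrix representation of the operator $\mathcal{U}_{\mathfrak{H}_\mu}$ with respect to the canonical basis (5.9) of the space $\mathfrak{H}_\mu$. Then, from (5.18) we infer that the columns
\[ \begin{pmatrix} \mathcal{R} \\ \mathcal{P} \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} \mathcal{L} \\ \mathcal{Q} \end{pmatrix} \]
of the block matrix (5.20) are the coefficients in the series developments of $\varphi_n^*$ and $\psi_n^*$ with respect to the canonical basis (5.9). If $h \in \mathfrak{H}_\mu$ then clearly
\[ P_{\mathfrak{H}^\perp_{\mu,\mathfrak{G}}} h = \sum_{k=1}^{\infty} (h, \psi_k^*)\, \psi_k^*. \tag{5.21} \]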

Thus, the matrix representation of the operator $P_{\mathfrak{H}^\perp_{\mu,\mathfrak{G}}}$, considered as an operator acting between $\mathfrak{H}_\mu$ and $\mathfrak{H}^\perp_{\mu,\mathfrak{G}}$ equipped with the orthonormal bases (5.9) and $(\psi_n^*)_{n=1}^{\infty}$, has the form $(\mathcal{L}^*, \mathcal{Q}^*)$. From this and the shape (4.5) of the operator $B_{\mu,\mathfrak{G}}$, we obtain the following result.

Theorem 5.4. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Szegő measure. Then the matrix of the operator
\[ B_{\mu,\mathfrak{G}} : \mathfrak{H}_{\mu,\mathfrak{F}} \longrightarrow \mathfrak{H}^\perp_{\mu,\mathfrak{G}} \]
with respect to the orthonormal bases $(\varphi_k)_{k=1}^{\infty}$ and $(\psi_n^*)_{n=1}^{\infty}$ of the spaces $\mathfrak{H}_{\mu,\mathfrak{F}}$ and $\mathfrak{H}^\perp_{\mu,\mathfrak{G}}$, respectively, is given by $\mathcal{L}^*$, where $\mathcal{L}$ is the block of the matrix given in (5.20).

Now Theorem 4.3 can be reformulated in the following way.

Corollary 5.5. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Szegő measure. Then the Riesz projection $P_+$ is bounded in $L^2_\mu$ if and only if $\mathcal{L}^*$ is boundedly invertible in $l_2$, where $\mathcal{L}$ is the block of the matrix given in (5.20).

In [3, Corollary 3.7] the matrix $\mathcal{L}$ was expressed in terms of the Schur parameters associated with the measure $\mu$. In order to write down this matrix we introduce the necessary notions and terminology used in [3]. The matrix $\mathcal{L}$ expressed in terms of the corresponding Schur parameter sequence will be denoted by $\mathcal{L}(\gamma)$. Let
\[ \Gamma l_2 := \bigl\{ \gamma = (\gamma_j)_{j=0}^{\infty} \in l_2 : \gamma_j \in \mathbb{D},\ j \in \mathbb{N}_0 \bigr\}. \]
Thus, $\Gamma l_2$ is the subset of all $\gamma = (\gamma_j)_{j=0}^{\infty} \in \Gamma$ for which the product
\[ \prod_{j=0}^{\infty} \bigl( 1 - |\gamma_j|^2 \bigr) \]
converges. Let us mention the following well-known fact (see, for example, Remark 1.4).


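Proposition 5.6. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ and let $\gamma$ be the associated Schur parameter sequence. Then $\mu$ is a Szegő measure if and only if $\gamma$ belongs to $\Gamma l_2$.

For a Schur parameter sequence $\gamma$ belonging to $\Gamma l_2$, we note that the sequence $(L_n(\gamma))_{n=0}^{\infty}$ introduced in formula (3.12) of [3] via $L_0(\gamma) := 1$ and, for each positive integer $n$, via
\[ L_n(\gamma) := \sum_{r=1}^{n} (-1)^{r} \sum_{s_1 + s_2 + \dots + s_r = n}\ \sum_{j_1 = n - s_1}^{\infty}\ \sum_{j_2 = j_1 - s_2}^{\infty} \cdots \sum_{j_r = j_{r-1} - s_r}^{\infty} \gamma_{j_1} \overline{\gamma_{j_1 + s_1}} \cdots \gamma_{j_r} \overline{\gamma_{j_r + s_r}} \tag{5.22} \]
plays a key role. Here the summation runs over all ordered $r$-tuples $(s_1, \dots, s_r)$ of positive integers which satisfy $s_1 + \dots + s_r = n$. For example,
\[ L_1(\gamma) = - \sum_{j=0}^{\infty} \gamma_j \overline{\gamma_{j+1}} \]
and
\[ L_2(\gamma) = - \sum_{j=0}^{\infty} \gamma_j \overline{\gamma_{j+2}} + \sum_{j_1=1}^{\infty} \sum_{j_2 = j_1 - 1}^{\infty} \gamma_{j_1} \overline{\gamma_{j_1+1}}\, \gamma_{j_2} \overline{\gamma_{j_2+1}}. \]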
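The nested sums in (5.22) can be evaluated directly when only finitely many Schur parameters are nonzero. The following is a minimal sketch under that assumption (the function names `compositions` and `L` are illustrative and do not come from [3]); it also assumes that the second factor of each pair in (5.22) is complex conjugated. It compares the general formula with the two special cases $L_1$ and $L_2$ written out above.

```python
def compositions(n):
    """Yield every ordered tuple of positive integers that sums to n."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def L(n, gamma):
    """Evaluate L_n(gamma) from (5.22) for a finitely supported sequence.

    gamma[j] holds gamma_j for j = 0, ..., len(gamma) - 1; all later
    parameters are taken to be zero (an assumption of this sketch).
    """
    if n == 0:
        return 1
    N = len(gamma)

    def nested(s, i, lower):
        # Sum over j_i >= lower of gamma_{j_i} * conj(gamma_{j_i + s_i})
        # times the remaining inner sums; terms with j_i + s_i >= N vanish.
        if i == len(s):
            return 1
        total = 0
        for j in range(max(lower, 0), N - s[i]):
            weight = gamma[j] * complex(gamma[j + s[i]]).conjugate()
            next_lower = j - s[i + 1] if i + 1 < len(s) else 0
            total += weight * nested(s, i + 1, next_lower)
        return total

    return sum((-1) ** len(s) * nested(s, 0, n - s[0]) for s in compositions(n))


def conj(z):
    return complex(z).conjugate()


# Hypothetical test sequence gamma_0, ..., gamma_4 (all of modulus < 1).
gamma = [0.5, 0.3 - 0.2j, -0.1j, 0.2, 0.1 + 0.1j]
N = len(gamma)

# The two special cases written out in the text.
L1_closed = -sum(gamma[j] * conj(gamma[j + 1]) for j in range(N - 1))
L2_closed = (
    -sum(gamma[j] * conj(gamma[j + 2]) for j in range(N - 2))
    + sum(
        gamma[j1] * conj(gamma[j1 + 1]) * gamma[j2] * conj(gamma[j2 + 1])
        for j1 in range(1, N - 1)
        for j2 in range(j1 - 1, N - 1)
    )
)

print(abs(L(1, gamma) - L1_closed), abs(L(2, gamma) - L2_closed))
```

The number of ordered tuples $(s_1, \dots, s_r)$ grows like $2^{n-1}$, so this brute-force evaluation is only meant for small $n$.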

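Obviously, if $\gamma \in \Gamma l_2$, then the series (5.22) converges absolutely. For each $\gamma = (\gamma_j)_{j=0}^{\infty} \in \Gamma l_2$, we set
\[ \Pi_k := \prod_{j=k}^{\infty} D_{\gamma_j}, \quad k \in \mathbb{N}_0, \tag{5.23} \]
where
\[ D_{\gamma_j} := \sqrt{1 - |\gamma_j|^2}, \quad j \in \mathbb{N}_0. \tag{5.24} \]
In the space $l_2$ we define the coshift mapping $W : l_2 \to l_2$ via
\[ (z_j)_{j=0}^{\infty} \mapsto (z_{j+1})_{j=0}^{\infty}. \tag{5.25} \]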

The following result is contained in [3, Theorem 3.6, Corollary 3.7].

Theorem 5.7. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ be a Szegő measure and let $\gamma \in \Gamma$ be the Schur parameter sequence associated with $\mu$. Then $\gamma \in \Gamma l_2$ and the block $\mathcal{L}$ of the matrix (5.20) has the form
\[ \mathcal{L}(\gamma) = \begin{pmatrix}
\Pi_1 & 0 & 0 & \cdots \\
\Pi_2 L_1(W\gamma) & \Pi_2 & 0 & \cdots \\
\Pi_3 L_2(W\gamma) & \Pi_3 L_1(W^2\gamma) & \Pi_3 & \cdots \\
\vdots & \vdots & \vdots & \ddots \\
\Pi_n L_{n-1}(W\gamma) & \Pi_n L_{n-2}(W^2\gamma) & \Pi_n L_{n-3}(W^3\gamma) & \cdots \\
\vdots & \vdots & \vdots &
\end{pmatrix}, \tag{5.26} \]
where $\Pi_j$, $L_j(\gamma)$ and $W$ are given via the formulas (5.23), (5.22), and (5.25), respectively.


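Remark 5.8. It follows from Theorems 5.4 and 5.7 that the matrix representation of the operator
\[ B_{\mu,\mathfrak{G}} : \mathfrak{H}_{\mu,\mathfrak{F}} \longrightarrow \mathfrak{H}^\perp_{\mu,\mathfrak{G}} \]
with respect to the orthonormal bases $(\varphi_k)_{k=1}^{\infty}$ and $(\psi_n^*)_{n=1}^{\infty}$ of the spaces $\mathfrak{H}_{\mu,\mathfrak{F}}$ and $\mathfrak{H}^\perp_{\mu,\mathfrak{G}}$, respectively, is given by the matrix $\mathcal{L}^*(\gamma)$, where $\mathcal{L}(\gamma)$ has the form (5.26).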

6. Characterization of Helson-Szegő measures in terms of the Schur parameters of the associated Schur function

The first criterion which characterizes Helson-Szegő measures in terms of the associated Schur parameter sequence was already obtained. It follows by combining Theorem 1.2, Theorem 4.3, Proposition 5.6, Theorem 5.7, and Remark 5.8. This leads us to the following theorem, which is one of the main results of this paper.

Theorem 6.1. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ and let $\gamma \in \Gamma$ be the sequence of Schur parameters associated with $\mu$. Then $\mu$ is a Helson-Szegő measure if and only if $\gamma \in \Gamma l_2$ and the operator $\mathcal{L}^*(\gamma)$, which is defined in $l_2$ by the matrix (5.26), is boundedly invertible.

Corollary 6.2. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ and let $\gamma \in \Gamma$ be the sequence of Schur parameters associated with $\mu$. Then $\mu$ is a Helson-Szegő measure if and only if $\gamma \in \Gamma l_2$ and there exists some positive constant $C$ such that for each $h \in l_2$ the inequality
\[ \| \mathcal{L}^*(\gamma) h \| \ge C \| h \| \tag{6.1} \]
is satisfied.

Proof. First suppose that $\gamma \in \Gamma l_2$ and that there exists some positive constant $C$ such that for each $h \in l_2$ the inequality (6.1) is satisfied. From the shape (5.26) of the operator $\mathcal{L}(\gamma)$ it follows immediately that $\ker \mathcal{L}(\gamma) = \{0\}$. Thus, $\overline{\operatorname{Ran} \mathcal{L}^*(\gamma)} = l_2$. From (6.1) it follows that the operator $\mathcal{L}^*(\gamma)$ is invertible and that the corresponding inverse operator $(\mathcal{L}^*(\gamma))^{-1}$ is bounded and satisfies
\[ \bigl\| (\mathcal{L}^*(\gamma))^{-1} \bigr\| \le \frac{1}{C}, \]
where $C$ is taken from (6.1). Since $\mathcal{L}^*(\gamma)$ is a bounded linear operator, the operator $[\mathcal{L}^*(\gamma)]^{-1}$ is closed. Thus $\operatorname{Ran} \mathcal{L}^*(\gamma) = l_2$ and, consequently, the operator $\mathcal{L}^*(\gamma)$ is boundedly invertible. Hence, Theorem 6.1 yields that $\mu$ is a Helson-Szegő measure.

If $\mu$ is a Helson-Szegő measure, then Theorem 6.1 yields that $\mathcal{L}^*(\gamma)$ is boundedly invertible. Hence, condition (6.1) is trivially satisfied. □

It should be mentioned that a result similar to Theorem 6.1 was proved earlier using a different method in [7, Definition 4.6, Proposition 4.7 and Theorem 4.8]. More specifically, it was shown that a measure $\mu$ is a Helson-Szegő measure if and only if some infinite matrix $\mathcal{M}$ (which is defined in [7, formulas (4.1) and (4.2)]) generates a bounded operator in $\ell_2$. It was also shown that the boundedness of $\mathcal{M}$


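is equivalent to the boundedness of another operator matrix $\mathcal{L}$ defined in formula (6.4) of [7].

In order to derive criteria in another way we need some statements on the operator $\mathcal{L}(\gamma)$ which were obtained in [3]. The following result, which originates from [3, Theorem 3.12 and Corollary 3.13], plays an important role in the study of the matrix $\mathcal{L}(\gamma)$. Namely, it describes the multiplicative structure of $\mathcal{L}(\gamma)$ and indicates connections to the backward shift.

Theorem 6.3. It holds that
\[ \mathcal{L}(\gamma) = \mathfrak{M}(\gamma) \cdot \mathcal{L}(W\gamma), \tag{6.2} \]
where
\[ \mathfrak{M}(\gamma) := \begin{pmatrix}
D_{\gamma_1} & 0 & 0 & \cdots & 0 & \cdots \\
-\gamma_1 \overline{\gamma_2} & D_{\gamma_2} & 0 & \cdots & 0 & \cdots \\
-\gamma_1 D_{\gamma_2} \overline{\gamma_3} & -\gamma_2 \overline{\gamma_3} & D_{\gamma_3} & \cdots & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots & \vdots & \\
-\gamma_1 \bigl( \prod_{j=2}^{n-1} D_{\gamma_j} \bigr) \overline{\gamma_n} & -\gamma_2 \bigl( \prod_{j=3}^{n-1} D_{\gamma_j} \bigr) \overline{\gamma_n} & -\gamma_3 \bigl( \prod_{j=4}^{n-1} D_{\gamma_j} \bigr) \overline{\gamma_n} & \cdots & D_{\gamma_n} & \cdots \\
\vdots & \vdots & \vdots & & \vdots & \ddots
\end{pmatrix} \tag{6.3} \]
and $D_{\gamma_j} := \sqrt{1 - |\gamma_j|^2}$, $j \in \mathbb{N}_0$. The matrix $\mathfrak{M}(\gamma)$ satisfies
\[ I - \mathfrak{M}(\gamma) \mathfrak{M}^*(\gamma) = \eta(\gamma) \eta^*(\gamma), \tag{6.4} \]
where
\[ \eta(\gamma) := \operatorname{col} \Bigl( \overline{\gamma_1},\ \overline{\gamma_2} D_{\gamma_1},\ \dots,\ \overline{\gamma_n} \prod_{j=1}^{n-1} D_{\gamma_j},\ \dots \Bigr). \tag{6.5} \]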

The multiplicative structure of $\mathcal{L}(\gamma)$ obtained in Theorem 6.3 gives us some hope that the boundedness of the operator $\mathcal{L}^*(\gamma)$ can be reduced to a constructive condition on the Schur parameters via convergence of some infinite products (series). This is a promising direction for future work on this problem.

Let $\gamma \in \Gamma l_2$. For each $n \in \mathbb{N}$ we set (see formula (5.3) in [3])
\[ \mathfrak{L}_n(\gamma) := \begin{pmatrix}
\Pi_1 & 0 & 0 & \cdots & 0 \\
\Pi_2 L_1(W\gamma) & \Pi_2 & 0 & \cdots & 0 \\
\Pi_3 L_2(W\gamma) & \Pi_3 L_1(W^2\gamma) & \Pi_3 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\Pi_n L_{n-1}(W\gamma) & \Pi_n L_{n-2}(W^2\gamma) & \Pi_n L_{n-3}(W^3\gamma) & \cdots & \Pi_n
\end{pmatrix}. \tag{6.6} \]
The matrices introduced in (6.6) will play an important role in our investigations. Now we turn our attention to some properties of the matrices $\mathfrak{L}_n(\gamma)$, $n \in \mathbb{N}$, which will later be of use. From Corollary 5.2 in [3] we get the following result.

Lemma 6.4. Let $\gamma = (\gamma_j)_{j=0}^{\infty} \in \Gamma l_2$ and let $n \in \mathbb{N}$. Then the matrix $\mathfrak{L}_n(\gamma)$ defined by (6.6) is contractive.

We continue with some asymptotical considerations.


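Lemma 6.5. Let $\gamma = (\gamma_j)_{j=0}^{\infty} \in \Gamma l_2$. Then:
(a) $\lim_{k\to\infty} \Pi_k = 1$.
(b) For each $j \in \mathbb{N}$, $\lim_{m\to\infty} L_j(W^m\gamma) = 0$.
(c) For each $n \in \mathbb{N}$, $\lim_{m\to\infty} \mathfrak{L}_n(W^m\gamma) = I_n$.

Proof. The choice of $\gamma$ implies the convergence of the infinite product $\prod_{k=0}^{\infty} D_{\gamma_k}$. This yields (a). Assertion (b) is an immediate consequence of the definition of the sequence $(L_j(W^m\gamma))_{m=1}^{\infty}$ (see (5.22) and (5.25)). By inspection of the sequence $(\mathfrak{L}_n(W^m\gamma))_{m=1}^{\infty}$ one can immediately see that the combination of (a) and (b) yields the assertion of (c). □

The following result is given in [3, Lemma 5.3].

Lemma 6.6. Let $\gamma = (\gamma_j)_{j=0}^{\infty} \in \Gamma l_2$ and let $n \in \mathbb{N}$. Then
\[ \mathfrak{L}_n(\gamma) = \mathfrak{M}_n(\gamma) \cdot \mathfrak{L}_n(W\gamma), \tag{6.7} \]
where
\[ \mathfrak{M}_n(\gamma) := \begin{pmatrix}
D_{\gamma_1} & 0 & 0 & \cdots & 0 \\
-\gamma_1 \overline{\gamma_2} & D_{\gamma_2} & 0 & \cdots & 0 \\
-\gamma_1 D_{\gamma_2} \overline{\gamma_3} & -\gamma_2 \overline{\gamma_3} & D_{\gamma_3} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
-\gamma_1 \bigl( \prod_{j=2}^{n-1} D_{\gamma_j} \bigr) \overline{\gamma_n} & -\gamma_2 \bigl( \prod_{j=3}^{n-1} D_{\gamma_j} \bigr) \overline{\gamma_n} & -\gamma_3 \bigl( \prod_{j=4}^{n-1} D_{\gamma_j} \bigr) \overline{\gamma_n} & \cdots & D_{\gamma_n}
\end{pmatrix}. \tag{6.8} \]
Moreover, $\mathfrak{M}_n(\gamma)$ is a nonsingular matrix which fulfills
\[ I_n - \mathfrak{M}_n(\gamma) \mathfrak{M}_n^*(\gamma) = \eta_n(\gamma) \eta_n^*(\gamma), \tag{6.9} \]
where
\[ \eta_n(\gamma) := \Bigl( \overline{\gamma_1},\ \overline{\gamma_2} D_{\gamma_1},\ \dots,\ \overline{\gamma_n} \Bigl( \prod_{j=1}^{n-1} D_{\gamma_j} \Bigr) \Bigr)^{T}. \tag{6.10} \]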
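The defect identity (6.9) lends itself to a quick numerical check. The sketch below builds the matrix from (6.8) and the vector from (6.10) for a hypothetical finite list of parameters and reports the largest entry of $I_n - \mathfrak{M}_n\mathfrak{M}_n^* - \eta_n\eta_n^*$. The helper names are illustrative, and the entries of $\eta_n$ are taken with complex conjugates, which is the convention under which (6.9) is consistent with the conjugated factors in (6.8); for real parameters the distinction disappears.

```python
import math


def D(g):
    """Defect D_gamma = sqrt(1 - |gamma|^2) from (5.24)."""
    return math.sqrt(1 - abs(g) ** 2)


def M_n(gamma, n):
    """n x n matrix from (6.8); gamma[j] holds gamma_j, and gamma_1..gamma_n are used."""
    M = [[0j] * n for _ in range(n)]
    for i in range(1, n + 1):
        M[i - 1][i - 1] = complex(D(gamma[i]))
        for j in range(1, i):
            # Product of the defects strictly between column j and row i.
            between = math.prod(D(gamma[k]) for k in range(j + 1, i))
            M[i - 1][j - 1] = -gamma[j] * between * complex(gamma[i]).conjugate()
    return M


def eta_n(gamma, n):
    """Vector from (6.10), with conjugated parameters (assumed convention)."""
    return [
        complex(gamma[k]).conjugate() * math.prod(D(gamma[j]) for j in range(1, k))
        for k in range(1, n + 1)
    ]


def residual_6_9(gamma, n):
    """Largest absolute entry of I_n - M M^* - eta eta^*."""
    M, eta = M_n(gamma, n), eta_n(gamma, n)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            mm = sum(M[i][k] * M[j][k].conjugate() for k in range(n))
            target = eta[i] * eta[j].conjugate()
            worst = max(worst, abs((1 if i == j else 0) - mm - target))
    return worst


# Hypothetical parameters gamma_0, ..., gamma_4, all of modulus < 1.
gamma = [0.5, 0.3 - 0.2j, -0.4j, 0.2, 0.1 + 0.6j]
print([residual_6_9(gamma, n) for n in range(1, 5)])
```

Since $\mathfrak{M}_n(\gamma)$ is lower triangular with diagonal entries $D_{\gamma_1}, \dots, D_{\gamma_n} > 0$, its determinant is the product of these defects, which is one way to see the nonsingularity stated in Lemma 6.6.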

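Corollary 6.7. Let $\gamma = (\gamma_j)_{j=0}^{\infty} \in \Gamma l_2$ and let $n \in \mathbb{N}$. Then the multiplicative decomposition
\[ \mathfrak{L}_n(\gamma) = \overset{\longrightarrow}{\prod_{k=0}^{\infty}} \mathfrak{M}_n(W^k\gamma) \tag{6.11} \]
holds true.

Proof. Combine part (c) of Lemma 6.5 and (6.7). □

Now we state the next main result of this paper. For $h = (z_j)_{j=1}^{\infty} \in l_2$ and $n \in \mathbb{N}$ we set $h_n := (z_1, \dots, z_n)^{\top} \in \mathbb{C}^n$.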


Theorem 6.8. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ and let $\gamma \in \Gamma$ be the sequence of Schur parameters associated with $\mu$. Then $\mu$ is a Helson-Szegő measure if and only if $\gamma \in \Gamma l_2$ and there exists some positive constant $C$ such that for all $h \in l_2$ the inequality
\[ \lim_{n\to\infty} \lim_{m\to\infty} \Bigl\| \Bigl( \overset{\longleftarrow}{\prod_{k=0}^{m}} \mathfrak{M}_n^*(W^k\gamma) \Bigr) h_n \Bigr\| \ge C \| h \| \tag{6.12} \]
is satisfied.

Proof. In view of (6.11) and condition (c) in Lemma 6.5 the condition (6.12) is equivalent to the fact that for all $h \in l_2$ the inequality
\[ \lim_{n\to\infty} \| \mathcal{L}_n^*(\gamma) h_n \| \ge C \| h \| \tag{6.13} \]
is satisfied. This inequality is equivalent to the inequality (6.1). □

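Theorem 6.8 leads to an alternate proof of an interesting sufficient condition for a Szegő measure to be a Helson-Szegő measure (see Theorem 6.12). To prove this result we still need some preparations.

Lemma 6.9. Let $n \in \mathbb{N}$. Furthermore, let the nonsingular complex $n \times n$ matrix $\mathfrak{M}$ and the vector $\eta \in \mathbb{C}^n$ be chosen such that
\[ I_n - \mathfrak{M}\mathfrak{M}^* = \eta\eta^* \tag{6.14} \]
holds. Then $1 - \|\eta\|^2_{\mathbb{C}^n} > 0$ and the vector
\[ \tilde{\eta} := \frac{1}{\sqrt{1 - \|\eta\|^2_{\mathbb{C}^n}}}\, \mathfrak{M}^* \eta \tag{6.15} \]
satisfies
\[ I_n - \mathfrak{M}^*\mathfrak{M} = \tilde{\eta}\tilde{\eta}^*. \tag{6.16} \]

Proof. The case $\eta = 0_{n\times 1}$ is trivial. Now suppose that $\eta \in \mathbb{C}^n \setminus \{0_{n\times 1}\}$. From (6.14) we get
\[ (I_n - \mathfrak{M}\mathfrak{M}^*)\eta = \eta\eta^*\eta = \|\eta\|^2_{\mathbb{C}^n} \cdot \eta \tag{6.17} \]
and consequently
\[ \mathfrak{M}\mathfrak{M}^*\eta = (1 - \|\eta\|^2_{\mathbb{C}^n}) \cdot \eta. \tag{6.18} \]
Hence $1 - \|\eta\|^2_{\mathbb{C}^n}$ is an eigenvalue of $\mathfrak{M}\mathfrak{M}^*$ with corresponding eigenvector $\eta$. Since $\mathfrak{M}$ is nonsingular, the matrix $\mathfrak{M}\mathfrak{M}^*$ is positive Hermitian. Thus, we have $1 - \|\eta\|^2_{\mathbb{C}^n} > 0$. Using (6.17) we infer
\[ (I_n - \mathfrak{M}^*\mathfrak{M})\mathfrak{M}^*\eta = \mathfrak{M}^*(I_n - \mathfrak{M}\mathfrak{M}^*)\eta = \|\eta\|^2_{\mathbb{C}^n} \cdot \mathfrak{M}^*\eta. \tag{6.19} \]
Taking into account (6.18) we can conclude
\[ \|\mathfrak{M}^*\eta\|^2_{\mathbb{C}^n} = \eta^*\mathfrak{M}\mathfrak{M}^*\eta = \eta^*\bigl[(1 - \|\eta\|^2_{\mathbb{C}^n}) \cdot \eta\bigr] = (1 - \|\eta\|^2_{\mathbb{C}^n}) \cdot \|\eta\|^2_{\mathbb{C}^n} \tag{6.20} \]
and therefore from (6.15) we have
\[ \|\tilde{\eta}\|_{\mathbb{C}^n} = \|\eta\|_{\mathbb{C}^n} > 0. \tag{6.21} \]
Formulas (6.19), (6.15) and (6.21) show that $\|\tilde{\eta}\|^2_{\mathbb{C}^n}$ is an eigenvalue of $I_n - \mathfrak{M}^*\mathfrak{M}$ with corresponding eigenvector $\tilde{\eta}$. From (6.14) and $\eta \ne 0_{n\times 1}$ we get
\[ \operatorname{rank}(I_n - \mathfrak{M}^*\mathfrak{M}) = \operatorname{rank}(I_n - \mathfrak{M}\mathfrak{M}^*) = 1. \]
So for each vector $h$ we can conclude
\[ (I_n - \mathfrak{M}^*\mathfrak{M})h = (I_n - \mathfrak{M}^*\mathfrak{M}) \Bigl( h, \frac{\tilde{\eta}}{\|\tilde{\eta}\|_{\mathbb{C}^n}} \Bigr)_{\mathbb{C}^n} \frac{\tilde{\eta}}{\|\tilde{\eta}\|_{\mathbb{C}^n}} = (h, \tilde{\eta})_{\mathbb{C}^n}\, \tilde{\eta} = \tilde{\eta}\tilde{\eta}^* \cdot h. \]
□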

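Corollary 6.10. Let the assumptions of Lemma 6.9 be satisfied. Then for each $h \in \mathbb{C}^n$ the inequalities
\[ \|\mathfrak{M}h\| \ge (1 - \|\eta\|^2)^{\frac{1}{2}} \|h\| \tag{6.22} \]
and
\[ \|\mathfrak{M}^*h\| \ge (1 - \|\eta\|^2)^{\frac{1}{2}} \|h\| \tag{6.23} \]
are satisfied.

Proof. Applying (6.16) and (6.21) we get for $h \in \mathbb{C}^n$ the relation
\[ \|h\|^2 - \|\mathfrak{M}h\|^2 = \bigl( (I - \mathfrak{M}^*\mathfrak{M})h, h \bigr) = |(h, \tilde{\eta})|^2 \le \|\tilde{\eta}\|^2 \|h\|^2 = \|\eta\|^2 \|h\|^2. \]
This implies (6.22). Analogously, (6.23) can be verified. □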
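Lemma 6.9 and Corollary 6.10 can be illustrated on a small example. The sketch below uses a hypothetical $2 \times 2$ pair $(\mathfrak{M}, \eta)$, obtained from the pattern (6.8) and (6.10) with $\gamma_1 = 0.6$ and $\gamma_2 = 0.5i$ (values chosen only for illustration). It checks the hypothesis (6.14), forms $\tilde{\eta}$ as in (6.15), and then tests (6.16) and the lower bounds (6.22) and (6.23) on a few sample vectors.

```python
import math

# Hypothetical data: gamma_1 = 0.6 (so D_1 = 0.8) and gamma_2 = 0.5j (so D_2 = sqrt(0.75)).
D2 = math.sqrt(0.75)
M = [[0.8 + 0j, 0j], [0.3j, D2 + 0j]]   # pattern of (6.8) for n = 2
eta = [0.6 + 0j, -0.4j]                 # pattern of (6.10), conjugated entries


def adjoint(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def norm(v):
    return math.sqrt(sum(abs(x) ** 2 for x in v))


def outer(u, v):
    """Rank-one matrix u v^*."""
    return [[a * b.conjugate() for b in v] for a in u]


def identity_minus(A):
    return [[(1 if i == j else 0) - A[i][j] for j in range(len(A))] for i in range(len(A))]


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


Mh = adjoint(M)
# Hypothesis (6.14): I - M M^* = eta eta^*.
hypothesis_gap = max_abs_diff(identity_minus(matmul(M, Mh)), outer(eta, eta))

# Vector from (6.15) and conclusion (6.16): I - M^* M = eta~ eta~^*.
scale = math.sqrt(1 - norm(eta) ** 2)
eta_tilde = [x / scale for x in matvec(Mh, eta)]
conclusion_gap = max_abs_diff(identity_minus(matmul(Mh, M)), outer(eta_tilde, eta_tilde))

# Lower bounds (6.22) and (6.23) on a few sample vectors.
samples = [[1 + 0j, 0j], [0j, 1 + 0j], [1 + 2j, -3 + 0.5j], eta, eta_tilde]
bounds_hold = all(
    norm(matvec(A, h)) >= scale * norm(h) - 1e-12
    for A in (M, Mh)
    for h in samples
)
print(hypothesis_gap, conclusion_gap, bounds_hold)
```

For this example $1 - \|\eta\|^2 = 0.48 = (1 - |\gamma_1|^2)(1 - |\gamma_2|^2)$, so the constant in the bounds is $\sqrt{0.48}$.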

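Corollary 6.11. Let $\gamma \in \Gamma l_2$, and let the matrix $\mathfrak{M}_n(\gamma)$ be defined via (6.8). Then for all $h \in \mathbb{C}^n$ the inequalities
\[ \|\mathfrak{M}_n(\gamma)h\| \ge \Bigl( \prod_{j=1}^{n} D_{\gamma_j} \Bigr) \|h\| \tag{6.24} \]
and
\[ \|\mathfrak{M}_n^*(\gamma)h\| \ge \Bigl( \prod_{j=1}^{n} D_{\gamma_j} \Bigr) \|h\| \tag{6.25} \]
are satisfied.

Proof. The matrix $\mathfrak{M}_n(\gamma)$ satisfies the conditions of Lemma 6.9. Here the vector $\eta$ has the form (6.10). It remains only to mention that in this case we have
\[ 1 - \|\eta\|^2 = 1 - |\gamma_1|^2 - |\gamma_2|^2 \bigl( 1 - |\gamma_1|^2 \bigr) - \dots - |\gamma_n|^2 \Bigl[ \prod_{j=1}^{n-1} \bigl( 1 - |\gamma_j|^2 \bigr) \Bigr] = \prod_{j=1}^{n} \bigl( 1 - |\gamma_j|^2 \bigr). \tag{6.26} \]
□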
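The last equality in (6.26) is a telescoping sum. A short derivation, with the auxiliary abbreviation $p_k$ introduced here only for illustration:

```latex
% Auxiliary notation (not from the paper): partial products of the defects squared.
\[
  p_0 := 1, \qquad p_k := \prod_{j=1}^{k} \bigl(1 - |\gamma_j|^2\bigr), \quad k = 1, \dots, n .
\]
% Each summand of \|\eta\|^2 is a difference of consecutive partial products:
\[
  |\gamma_k|^2 \, p_{k-1} = p_{k-1} - \bigl(1 - |\gamma_k|^2\bigr) p_{k-1} = p_{k-1} - p_k .
\]
% Summing over k = 1, ..., n telescopes:
\[
  \|\eta\|^2 = \sum_{k=1}^{n} |\gamma_k|^2 \, p_{k-1}
             = \sum_{k=1}^{n} \bigl(p_{k-1} - p_k\bigr)
             = 1 - p_n ,
  \qquad\text{hence}\qquad
  1 - \|\eta\|^2 = \prod_{j=1}^{n} \bigl(1 - |\gamma_j|^2\bigr).
\]
```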

The above considerations lead us to an alternate proof of a nice sufficient criterion for the Helson-Szegő property of a measure $\mu \in \mathcal{M}^1_+(\mathbb{T})$ which is expressed in terms of the moduli of the associated Schur parameter sequence.


Regarding the history of Theorem 6.12, it should be mentioned that, in view of a theorem by B.L. Golinskii and I.A. Ibragimov [6], the convergence of the infinite product in (6.27) is equivalent to the property that $\mu$ is absolutely continuous with respect to the Lebesgue measure. The corresponding density is then of the form $\exp g$, where $g$ is a real Besov-class function. A theorem of V.V. Peller [13] states that every function of this form is a density of a Helson-Szegő measure. This topic was also discussed in detail in [7].

Theorem 6.12. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ and let $\gamma \in \Gamma$ be the Schur parameter sequence associated with $\mu$. If $\gamma \in \Gamma l_2$ and the infinite product
\[ \prod_{k=1}^{\infty} \prod_{j=k}^{\infty} \bigl( 1 - |\gamma_j|^2 \bigr) \tag{6.27} \]
converges, then $\mu$ is a Helson-Szegő measure.

Proof. Applying successively the estimate (6.25), we get for all $m, n \in \mathbb{N}$ and all vectors $h \in \mathbb{C}^n$ the chain of inequalities
\begin{align*}
\Bigl\| \Bigl[ \overset{\longleftarrow}{\prod_{k=0}^{m}} \mathfrak{M}_n^*(W^k\gamma) \Bigr] h \Bigr\|
&= \Bigl\| \mathfrak{M}_n^*(W^m\gamma) \Bigl[ \overset{\longleftarrow}{\prod_{k=0}^{m-1}} \mathfrak{M}_n^*(W^k\gamma) \Bigr] h \Bigr\| \\
&\ge \Bigl( \prod_{j=m+1}^{m+n} D_{\gamma_j} \Bigr) \Bigl\| \Bigl[ \overset{\longleftarrow}{\prod_{k=0}^{m-1}} \mathfrak{M}_n^*(W^k\gamma) \Bigr] h \Bigr\| \ge \cdots \\
&\ge \Bigl( \prod_{j=m+1}^{m+n} D_{\gamma_j} \Bigr) \cdot \Bigl( \prod_{j=m}^{m+n-1} D_{\gamma_j} \Bigr) \cdots \Bigl( \prod_{j=1}^{n} D_{\gamma_j} \Bigr) \|h\| \\
&\ge \Bigl( \prod_{j=m+1}^{\infty} D_{\gamma_j} \Bigr) \cdot \Bigl( \prod_{j=m}^{\infty} D_{\gamma_j} \Bigr) \cdots \Bigl( \prod_{j=1}^{\infty} D_{\gamma_j} \Bigr) \|h\| \\
&= \Bigl( \prod_{k=1}^{m+1} \prod_{j=k}^{\infty} D_{\gamma_j} \Bigr) \|h\| \\
&\ge \Bigl( \prod_{k=1}^{\infty} \prod_{j=k}^{\infty} D_{\gamma_j} \Bigr) \|h\|. \tag{6.28}
\end{align*}
From this inequality, (6.12) follows with
\[ C = \prod_{k=1}^{\infty} \prod_{j=k}^{\infty} D_{\gamma_j}. \]
Thus, the proof is complete. □
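The double product in (6.27) can be rewritten as a single product, which makes its relation to a weighted $\ell_2$ condition on the Schur parameters transparent. The following is a sketch of the standard argument, not taken from the paper; it uses only that all factors lie in $(0, 1]$, so the rearrangement is justified by passing to logarithms of constant sign.

```latex
% The factor with index j occurs once for each k = 1, ..., j:
\[
  \prod_{k=1}^{\infty} \prod_{j=k}^{\infty} \bigl(1 - |\gamma_j|^2\bigr)
  = \prod_{j=1}^{\infty} \bigl(1 - |\gamma_j|^2\bigr)^{j} .
\]
% Taking logarithms, the product has a positive limit if and only if
\[
  \sum_{j=1}^{\infty} j \cdot \Bigl( -\log\bigl(1 - |\gamma_j|^2\bigr) \Bigr) < \infty .
\]
% Elementary bounds for 0 <= x < 1:
\[
  x \le -\log(1 - x) \le \frac{x}{1 - x} .
\]
% Since gamma_j -> 0 in either case, the series above converges if and only if
\[
  \sum_{j=1}^{\infty} j \, |\gamma_j|^2 < \infty .
\]
```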


Taking into account that the convergence of the infinite product (6.27) is equivalent to the strong Szegő condition
\[ \sum_{k=1}^{\infty} k \cdot |\gamma_k|^2 < \infty, \]
Theorem 6.12 is an immediate consequence of [7, Theorem 5.3]. The proof of [7, Theorem 5.3] is completely different from the above proof of Theorem 6.12. It is based on a scattering formalism using CMV matrices. (For a comprehensive exposition on CMV matrices, we refer the reader to Chapter 4 in the monograph Simon [16].)

The aim of our next considerations is to characterize the Helson-Szegő property of a measure $\mu \in \mathcal{M}^1_+(\mathbb{T})$ in terms of some infinite series formed from its Schur parameter sequence. The following result provides the key information for the desired characterization.

Theorem 6.13. Let $\gamma = (\gamma_j)_{j=0}^{\infty} \in \Gamma l_2$ and let
\[ \mathcal{A}(\gamma) := I - \mathcal{L}(\gamma) \mathcal{L}^*(\gamma), \tag{6.29} \]
where $\mathcal{L}(\gamma)$ is given by (5.26). Then $\mathcal{A}(\gamma)$ satisfies the inequalities
\[ 0 \le \mathcal{A}(\gamma) \le I \tag{6.30} \]
and admits the strongly convergent series decomposition
\[ \mathcal{A}(\gamma) = \sum_{j=0}^{\infty} \xi_j(\gamma) \xi_j^*(\gamma), \tag{6.31} \]
where
\[ \xi_0(\gamma) := \eta(\gamma), \qquad \xi_j(\gamma) := \Bigl[ \overset{\longrightarrow}{\prod_{k=0}^{j-1}} \mathfrak{M}(W^k\gamma) \Bigr] \eta(W^j\gamma), \quad j \in \mathbb{N}, \tag{6.32} \]
and $\mathfrak{M}(\gamma)$, $\eta(\gamma)$ and $W$ are given by (6.3), (6.5) and (5.25), respectively.

Proof. Since the matrix $\mathcal{L}(\gamma)$ is a block of the unitary operator matrix given by (5.20) we have $\|\mathcal{L}(\gamma)\| \le 1$. This implies the inequalities (6.30). Using (6.2) and (6.4), we obtain
\begin{align*}
\mathcal{A}(\gamma) &= I - \mathcal{L}(\gamma) \mathcal{L}^*(\gamma) = I - \mathfrak{M}(\gamma) \mathcal{L}(W\gamma) \mathcal{L}^*(W\gamma) \mathfrak{M}^*(\gamma) \\
&= I - \mathfrak{M}(\gamma) \mathfrak{M}^*(\gamma) + \mathfrak{M}(\gamma) \mathcal{A}(W\gamma) \mathfrak{M}^*(\gamma) \\
&= \eta(\gamma) \eta^*(\gamma) + \mathfrak{M}(\gamma) \mathcal{A}(W\gamma) \mathfrak{M}^*(\gamma).
\end{align*}


Repeating this procedure $m$ times, we get
\begin{align*}
\mathcal{A}(\gamma) &= \eta(\gamma) \eta^*(\gamma) + \mathfrak{M}(\gamma) \eta(W\gamma) \eta^*(W\gamma) \mathfrak{M}^*(\gamma) \\
&\quad + \dots + \Bigl[ \overset{\longrightarrow}{\prod_{k=0}^{m-1}} \mathfrak{M}(W^k\gamma) \Bigr] \eta(W^m\gamma) \eta^*(W^m\gamma) \Bigl[ \overset{\longleftarrow}{\prod_{k=0}^{m-1}} \mathfrak{M}^*(W^k\gamma) \Bigr] \\
&\quad + \Bigl[ \overset{\longrightarrow}{\prod_{k=0}^{m}} \mathfrak{M}(W^k\gamma) \Bigr] \mathcal{A}(W^{m+1}\gamma) \Bigl[ \overset{\longleftarrow}{\prod_{k=0}^{m}} \mathfrak{M}^*(W^k\gamma) \Bigr] \\
&= \sum_{j=0}^{m} \xi_j(\gamma) \xi_j^*(\gamma) + \Bigl[ \overset{\longrightarrow}{\prod_{k=0}^{m}} \mathfrak{M}(W^k\gamma) \Bigr] \mathcal{A}(W^{m+1}\gamma) \Bigl[ \overset{\longleftarrow}{\prod_{k=0}^{m}} \mathfrak{M}^*(W^k\gamma) \Bigr].
\end{align*}
In view of part (c) of Lemma 6.5 and the shape (6.3) of the matrix $\mathfrak{M}(\gamma)$, for finite vectors $h \in \ell_2$ (i.e., $h$ has the form $h = \operatorname{col}(z_1, z_2, \dots, z_n, 0, 0, \dots)$ for some $n \in \mathbb{N}$) we obtain
\[ \lim_{m\to\infty} \Bigl[ \overset{\longrightarrow}{\prod_{k=0}^{m}} \mathfrak{M}(W^k\gamma) \Bigr] \mathcal{A}(W^{m+1}\gamma) \Bigl[ \overset{\longleftarrow}{\prod_{k=0}^{m}} \mathfrak{M}^*(W^k\gamma) \Bigr] h = 0. \]
This implies that the series given by the right-hand side of the formula (6.31) converges weakly to $\mathcal{A}(\gamma)$. From the concrete form of this series, its strong convergence follows. Thus, the proof is complete. □

The last main result of this paper is the following statement, which is an immediate consequence of Theorem 6.1 and Theorem 6.13.

Theorem 6.14. Let $\mu \in \mathcal{M}^1_+(\mathbb{T})$ and let $\gamma \in \Gamma$ be the sequence of Schur parameters associated with $\mu$. Then $\mu$ is a Helson-Szegő measure if and only if $\gamma \in \Gamma l_2$ and there exists some constant $\varepsilon \in (0, 1)$ such that the inequality
\[ \sum_{j=0}^{\infty} \xi_j(\gamma) \xi_j^*(\gamma) \le (1 - \varepsilon) I \tag{6.33} \]
is satisfied, where the vectors $\xi_j(\gamma)$, $j \in \mathbb{N}_0$, are given by (6.32).

We note that the inequality (6.33) can be considered as a rewriting of condition (6.12) in an additive form.

Remark 6.15. Finally, we would like to add that many important properties of Schur functions can be characterized in terms of the matrix $\mathcal{L}(\gamma)$ given by (5.26). It was shown in [3, Section 5] that the pseudocontinuability of a Schur function is determined by the properties of the matrix $\mathcal{L}(\gamma)$. In [4, Section 2], it was proved that the $S$-recurrence property of Schur parameter sequences of non-inner rational Schur functions is also expressed with the aid of the matrix $\mathcal{L}(\gamma)$. Furthermore, the structure of the matrix $\mathcal{L}(\gamma)$ allows one to determine whether a non-inner Schur function is rational or not (see [3, Section 5]).


References

[1] M.J. Bertin, A. Guilloux, J.P. Schreiber: Pisot and Salem Numbers, Birkhäuser, Basel–Boston–Berlin, 1992.
[2] M.S. Brodskii: Unitary operator colligations and their characteristic functions (in Russian), Uspekhi Mat. Nauk 33 (1978), Issue 4, 141–168. English transl. in: Russian Math. Surveys 33 (1978), Issue 4, 159–191.
[3] V.K. Dubovoy: Shift operators contained in contractions, Schur parameters and pseudocontinuable Schur functions, in: Interpolation, Schur Functions and Moment Problems (eds.: D. Alpay, I. Gohberg), Oper. Theory Adv. Appl., Vol. 165, Birkhäuser, Basel, 2006, pp. 175–250.
[4] V.K. Dubovoy, B. Fritzsche, B. Kirstein: The 𝒮-recurrence of Schur parameters of non-inner rational Schur functions, in: Topics in Operator Theory, Volume 1: Operators, Matrices and Analytic Functions (eds.: J.A. Ball, V. Bolotnikov, J.W. Helton, L. Rodman, I.M. Spitkovsky), Oper. Theory Adv. Appl., Vol. 202, Birkhäuser, Basel, 2010, pp. 151–194.
[5] V.K. Dubovoy, B.F. Fritzsche, B. Kirstein: Shift operators contained in contractions, pseudocontinuable Schur functions and orthogonal systems on the unit circle, Complex Analysis and Operator Theory 5 (2011), 579–610.
[6] B.L. Golinskii, I.A. Ibragimov: On Szegő's limit theorem (in Russian), Izv. Akad. Nauk. SSSR, Ser. Mat. 35 (1971), 408–429. English transl. in: Math. USSR Izv. 5 (1971), 421–444.
[7] L.B. Golinskii, A.Ya. Kheifets, F. Peherstorfer, P.M. Yuditskii: Faddeev-Marchenko scattering for CMV matrices and the strong Szegő theorem, arXiv: 0807.4017v1 [math.SP] 25 July 2008.
[8] H. Helson, G. Szegő: A problem in prediction theory, Annali di Mat. Pura ed Applicata 4 (1960), 51, 107–138.
[9] A.Ya. Kheifets, F. Peherstorfer, P.M. Yuditskii: On scattering for CMV matrices, arXiv: 0706.2970v1 [math.SP] 20 June 2007.
[10] P. Koosis: Introduction to $H^p$ Spaces, Cambridge Univ. Press, Cambridge etc. 1998.
[11] N.K. Nikolski: Operators, Functions and Systems: An Easy Reading, Math. Surveys and Monographs, V. 92, Contents: V. 1, Hardy, Hankel and Toeplitz (2002).
[12] F. Peherstorfer, A.L. Volberg, P.M. Yuditskii: CMV matrices with asymptotically constant coefficients. Szegő-Blaschke class, scattering theory, Journal of Functional Analysis 256 (2009), 2157–2210.
[13] V.V. Peller: Hankel operators of class $S_p$ and their applications (rational approximation, Gaussian processes, the problem of majorization of operators) (in Russian), Mat. Sb. 113 (1980), 538–581. English transl. in: Math. USSR Sbornik 41 (1982), 443–479.
[14] M. Rosenblum, J. Rovnyak: Topics in Hardy Classes and Univalent Functions, Birkhäuser, Basel 1994.
[15] I. Schur: Über Potenzreihen, die im Inneren des Einheitskreises beschränkt sind, J. reine u. angew. Math., Part I: 147 (1917), 205–232, Part II: 148 (1918), 122–145.
[16] B. Simon: Orthogonal Polynomials on the Unit Circle. Part 1: Classical Theory, Amer. Math. Soc. Colloq. Publ., Providence, RI, v. 54 (2004).
[17] T. Tao, C. Thiele: Nonlinear Fourier Analysis, IAS Lectures at Park City, Mathematics Series 2003.

Vladimir K. Dubovoy
Department of Mathematics and Mechanics
Kharkov State University
Svobody Square 4
UA-61077 Kharkov, Ukraine
e-mail: [email protected]

Bernd Fritzsche and Bernd Kirstein
Mathematisches Institut
Universität Leipzig
Augustusplatz 10/11
D-04109 Leipzig, Germany
e-mail: [email protected] [email protected]

Operator Theory: Advances and Applications, Vol. 218, 299–328
© 2012 Springer Basel AG

Divide and Conquer Method for Eigenstructure of Quasiseparable Matrices Using Zeroes of Rational Matrix Functions

Y. Eidelman and I. Haimovici

Dedicated to the memory of Israel Gohberg, our friend and teacher

Abstract. We study a divide and conquer method for computing the eigenstructure of matrices with quasiseparable representation. In order to find the eigenstructure of a large matrix 𝐴 we divide the problem into two problems for smaller sized matrices 𝐵 and 𝐶 by using the quasiseparable representation of 𝐴. In the conquer step we show that reconstructing the eigenstructure of 𝐴 from those of 𝐵 and 𝐶 amounts to the study of the eigenstructure of a rational matrix function. For a Hermitian matrix 𝐴 which is order one quasiseparable we completely solve the eigenproblem.

Mathematics Subject Classification (2000). Primary 15A18; Secondary 26C15.

Keywords. Quasiseparable, divide and conquer, rational matrix function, Hermitian matrix.

1. Introduction

In order to solve the eigenproblem for a large matrix $A$ which is given in quasiseparable representation we represent $A$ in the form
\[ A = \begin{pmatrix} B & 0 \\ 0 & C \end{pmatrix} + GH \]
with smaller sized matrices $B$ and $C$ and a perturbation matrix $GH$ of small rank that depends on the order of quasiseparability. The matrices $B$ and $C$ have in turn at most the same order of quasiseparability and can therefore be divided further in the same way, until we reach matrices small enough that the eigenproblem can be solved conveniently. In most cases the two smaller matrices obtained by using an appropriate quasiseparable representation also both belong to that class. After the division step of the algorithm is completed and the eigenstructure of the smallest matrices has been found, we perform the conquer step in which the


division tree is climbed back and we obtain the eigenstructure of a larger matrix 𝐴 from the eigenstructure of two smaller matrices 𝐵 and 𝐶. To do this we must compute the eigenstructure of a small matrix function whose size equals the order of the perturbation. We study in detail the eigenstructure of such matrix functions. Therefore the paper restates the definition of eigenvalues and Jordan chains for rational matrix functions.

We find in exact arithmetic a correspondence which is one-to-one and onto between the eigenvalues and Jordan chains of the matrix 𝐴 and those of a rational matrix function which is built using only the spectral data of the smaller matrices 𝐵 and 𝐶 and the perturbation matrix 𝐺𝐻. Although this correspondence is of theoretical importance, in practice, when only approximations of the eigenvalues are determined, one would not choose to compute the Jordan canonical form of the matrices. As the eigenvalue multiplicities are not continuous functions of the matrix entries, computation of the Jordan canonical form is an ill-posed problem.

While performing the conquer step we impose more and more restrictive conditions in order to obtain more results. The complete algorithm is obtained for Hermitian matrices with quasiseparable representations of order one. While in theory most of our results apply to general matrices, which can always be represented as quasiseparable of a certain order, in practice the case of non-Hermitian matrices, or of matrices which are not order one quasiseparable, raises numerous difficulties. Among the obstacles in the non-symmetric case, which are analyzed in detail in [11], we mention that the complex roots of the equation occur in conjugate pairs, but after finding one such pair we are left to work further with a complex matrix, and that the roots do not interlace with the poles as in the symmetric case, but can scatter anywhere in the complex plane, as [4] puts it. Also, if the rational matrix function is not in fact a scalar one, as it is in the case of an (order one) quasiseparable matrix 𝐴, the position of the roots is again quite random.

The present algorithm has complexity 𝑂(𝑁²), in contrast to the 𝑂(𝑁³) operations which are required to compute the eigenvalues of a non-structured matrix. The detailed analysis of the complexity of this algorithm will be done in [5]; the computer experiments are planned to be performed elsewhere.

This paper is a continuation of the results presented in [1]. Our results on the divide and conquer method generalize the corresponding results for tridiagonal matrices and for diagonal plus semiseparable matrices; concerning algorithms for tridiagonal matrices see [3, 8, 15, 7] and the literature cited therein, concerning diagonal plus semiseparable matrices see [12]. An algorithm for unitary Hessenberg matrices, which also have quasiseparable order one, different from the ones presented in this paper, was developed in [9]. For an important, interesting, complete and up-to-date account of the state of the art in the field of divide and conquer algorithms for eigendecomposition see [15]. Following the exposition there, we will show that our approach covers all the cases in a unified manner, but there are still other alternative algorithms that solve the problem, for instance the use of arrowhead matrices, which seems to be close to our method.

Divide and Conquer for Quasiseparable Matrices


2. Notation and definitions

For an $N \times N$ matrix $A$ we denote by $A_{ij}$ or by $A(i, j)$ its element on row $1 \le i \le N$ and on column $1 \le j \le N$, and by $A(i : j, p : q)$ the submatrix containing the entries on rows $i$ through $j$ ($1 \le i \le j \le N$) and columns $p$ through $q$ ($1 \le p \le q \le N$), inclusively. In particular, if $i = j$ then we write $A(i, p : q)$ and if $i < j$, $p = q$ we write $A(i : j, p)$.

Let $A = \{A_{ij}\}_{i,j=1}^{N}$ be a matrix with block entries $A_{ij}$ of sizes $m_i \times m_j$. Assume that the entries of this matrix are represented in the form
\[ A_{ij} = \begin{cases} p(i)\, a^{>}_{ij}\, q(j), & 1 \le j < i \le N, \\ d(i), & 1 \le i = j \le N, \\ g(i)\, b^{<}_{ij}\, h(j), & 1 \le i < j \le N. \end{cases} \tag{2.1} \]
Here $p(i)$ $(i = 2, \dots, N)$, $q(j)$ $(j = 1, \dots, N-1)$, $a(k)$ $(k = 2, \dots, N-1)$ are matrices of sizes $m_i \times r^{L}_{i-1}$, $r^{L}_{j} \times m_j$, $r^{L}_{k} \times r^{L}_{k-1}$ respectively, $g(i)$ $(i = 1, \dots, N-1)$, $h(j)$ $(j = 2, \dots, N)$, $b(k)$ $(k = 2, \dots, N-1)$ are matrices of sizes $m_i \times r^{U}_{i}$, $r^{U}_{j-1} \times m_j$, $r^{U}_{k-1} \times r^{U}_{k}$ respectively, and $d(i)$ $(i = 1, \dots, N)$ are $m_i \times m_i$ matrices. Also, the operations $a^{>}_{ij}$ and $b^{<}_{ji}$ are defined for positive integers $i, j$, $i > j$, as $a^{>}_{ij} = a(i-1) \cdots a(j+1)$ for $i > j+1$, $a^{>}_{j+1,j} = I_{r_j}$ and $b^{<}_{ji} = b(j+1) \cdots b(i-1)$ for $i > j+1$, $b^{<}_{j,j+1} = I_{r_j}$.

The representation of a matrix $A$ in the form (2.1) is called a quasiseparable representation. The elements $p(i)$ $(i = 2, \dots, N)$, $q(j)$ $(j = 1, \dots, N-1)$, $a(k)$ $(k = 2, \dots, N-1)$; $g(i)$ $(i = 1, \dots, N-1)$, $h(j)$ $(j = 2, \dots, N)$, $b(k)$ $(k = 2, \dots, N-1)$; $d(i)$ $(i = 1, \dots, N)$ are called quasiseparable generators of the matrix $A$. The numbers $r^{L}_{k}$, $r^{U}_{k}$ $(k = 1, \dots, N-1)$ are called the orders of these generators. The elements $p(i)$, $q(j)$, $a(k)$ and $g(i)$, $h(j)$, $b(k)$ are also called lower quasiseparable generators and upper quasiseparable generators of the matrix $A$, respectively. For matrices with scalar entries the elements $d(i)$ are numbers and the generators $p(i)$, $g(i)$ and $q(j)$, $h(j)$ are rows and columns of the corresponding sizes.

We can suppose that for an $N \times N$ matrix the orders of the lower and of the upper quasiseparable generators are the same, $r^{L}_{k} = r^{U}_{k}$ $(k = 1, \dots, N-1)$, since otherwise one can pad the smaller ones with zeroes. It follows that we can ask this as a condition for Theorem 3.1 below, without loss of generality. Denote
\[ P_{m+1} = \operatorname{col}\bigl(p(k)\, a^{>}_{km}\bigr)_{k=m+1}^{N} = \begin{pmatrix} p(m+1) \\ p(m+2)\,a(m+1) \\ p(m+3)\,a(m+2)\,a(m+1) \\ \vdots \\ p(N)\,a(N-1) \cdots a(m+2)\,a(m+1) \end{pmatrix}, \tag{2.2} \]

$$
Q_m=\operatorname{row}\bigl(a^{>}_{m+1,k}\,q(k)\bigr)_{k=1}^{m}
=\bigl(\,a(m)\cdots a(3)a(2)q(1)\ \big|\ a(m)\cdots a(3)q(2)\ \big|\ \cdots\ \big|\ a(m)q(m-1)\ \big|\ q(m)\,\bigr),\tag{2.3}
$$
$$
G_m=\operatorname{col}\bigl(g(k)\,b^{<}_{k,m+1}\bigr)_{k=1}^{m}=
\begin{pmatrix}
g(1)b(2)b(3)\cdots b(m-1)b(m)\\
g(2)b(3)\cdots b(m-1)b(m)\\
\vdots\\
g(m-2)b(m-1)b(m)\\
g(m-1)b(m)\\
g(m)
\end{pmatrix},\tag{2.4}
$$
$$
H_{m+1}=\operatorname{row}\bigl(b^{<}_{mk}\,h(k)\bigr)_{k=m+1}^{N}
=\bigl(\,h(m+1)\ \big|\ b(m+1)h(m+2)\ \big|\ b(m+1)b(m+2)h(m+3)\ \big|\ \cdots\ \big|\ b(m+1)\cdots b(N-1)h(N)\,\bigr).\tag{2.5}
$$

A direct computation shows that
$$
A(m+1:N,\,1:m)=P_{m+1}Q_m,\qquad m=1,\dots,N-1.\tag{2.6}
$$

We point out that the number of columns of $P_{m+1}$ as well as the number of rows of $Q_m$ is $r_m$, so that one can multiply these matrices and obtain a matrix whose rank is at most $r_m$. Another direct computation shows that
$$
A(1:m,\,m+1:N)=G_mH_{m+1},\qquad m=1,\dots,N-1.\tag{2.7}
$$
Likewise, the number of columns of $G_m$ as well as the number of rows of $H_{m+1}$ is $r_m$, so that the product is a matrix whose rank is at most $r_m$.

Next we recall some definitions which can be found in [14] and the references therein.

The complex number $\lambda_0$ is called a zero (or an eigenvalue) of the rational matrix function $F(\lambda)$ if $\det F(\lambda_0)=0$, and $\phi\ne 0$ is called an eigenvector of $F(\lambda)$ corresponding to $\lambda_0$ if $F(\lambda_0)\phi=0$. If $\phi_0$ is an eigenvector for the zero (eigenvalue) $\lambda_0$ and
$$
\sum_{j=0}^{k}\frac{1}{j!}F^{(j)}(\lambda_0)\phi_{k-j}=0,\qquad k=0,1,\dots,p,
$$
then $\phi_0,\phi_1,\dots,\phi_p$ is called a Jordan chain corresponding to $\lambda_0$. A system
$$
\phi_{10},\phi_{11},\dots,\phi_{1k_1},\ \phi_{20},\phi_{21},\dots,\phi_{2k_2},\ \dots,\ \phi_{r0},\phi_{r1},\dots,\phi_{rk_r}
$$
of Jordan chains corresponding to $\lambda_0$ is a canonical system of Jordan chains if each Jordan chain is of maximal length among those Jordan chains corresponding to $\lambda_0$ which start with an eigenvector that is independent of all the eigenvectors already chosen in the system. In particular, the first chain in the system is of maximal length among all the Jordan chains corresponding to $\lambda_0$.


The numbers $k_1\ge k_2\ge\dots\ge k_r$ are independent of the particular Jordan chains chosen and are called the partial multiplicities of $\lambda_0$, and $k_1+k_2+\dots+k_r$ (the sum of the lengths of all the independent Jordan chains) is called the multiplicity of $\lambda_0$ as a zero of $F(\lambda)$.

The Jordan chains chosen for different eigenvalues can contain the same vectors. For instance, in the particular case when the rational matrix function is in fact a $1\times 1$ (scalar) function, all the Jordan chains for different zeros start with the same eigenvector $\phi=1$. However, if the same Jordan chain corresponds to different eigenvalues $\lambda_1$ and $\lambda_2$ and it has been chosen in both canonical systems of Jordan chains, then its length $k$ is counted twice when determining the total multiplicity of these eigenvalues: as a partial multiplicity of $\lambda_1$ and as a partial multiplicity of $\lambda_2$. For instance, for the function $(\lambda-1)(\lambda-2)(\lambda-3)$ the same Jordan chain $\phi=1$ gives each of the three eigenvalues a total multiplicity of 1.
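Returning to the quasiseparable representation (2.1), the following is a minimal sketch, assuming scalar entries and order-one generators so that every generator is a number. It assembles the dense matrix from its generators and checks the rank factorizations (2.6) and (2.7) exactly. The helper names `prod` and `dense`, the dictionaries keyed by the paper's 1-based indices, and the sample generator values are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction as Fr

def prod(c, lo, hi):
    """c(lo) * ... * c(hi); the empty product (lo > hi) is 1."""
    out = Fr(1)
    for k in range(lo, hi + 1):
        out *= c[k]
    return out

def dense(N, p, q, a, g, h, b, d):
    """Assemble A from (2.1); returns a dict keyed by 1-based (i, j)."""
    A = {}
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            if i > j:      # p(i) a^>_{ij} q(j), scalars commute
                A[i, j] = p[i] * prod(a, j + 1, i - 1) * q[j]
            elif i < j:    # g(i) b^<_{ij} h(j)
                A[i, j] = g[i] * prod(b, i + 1, j - 1) * h[j]
            else:
                A[i, j] = d[i]
    return A

# Hypothetical sample generators (exact rationals).
N, m = 5, 2
p = {i: Fr(i, 2) for i in range(2, N + 1)}
q = {j: Fr(1, j) for j in range(1, N)}
a = {k: Fr(k, 3) for k in range(2, N)}
g = {i: Fr(i + 1) for i in range(1, N)}
h = {j: Fr(2, j) for j in range(2, N + 1)}
b = {k: Fr(1, k) for k in range(2, N)}
d = {k: Fr(k) for k in range(1, N + 1)}
A = dense(N, p, q, a, g, h, b, d)

# Rows of P_{m+1}, G_m and columns of Q_m, H_{m+1}, see (2.2)-(2.5).
P = {i: p[i] * prod(a, m + 1, i - 1) for i in range(m + 1, N + 1)}
Q = {j: prod(a, j + 1, m) * q[j] for j in range(1, m + 1)}
G = {i: g[i] * prod(b, i + 1, m) for i in range(1, m + 1)}
H = {j: prod(b, m + 1, j - 1) * h[j] for j in range(m + 1, N + 1)}

lower_ok = all(A[i, j] == P[i] * Q[j] for i in P for j in Q)  # (2.6)
upper_ok = all(A[i, j] == G[i] * H[j] for i in G for j in H)  # (2.7)
```

In the scalar order-one case both off-diagonal blocks are outer products, which is why each has rank at most one here.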

3. Divide step

3.1. The main theorem

The divide step consists in splitting a single problem into two smaller independent problems, each of roughly half the size of the original problem. This is done recursively until the problems obtained are small enough to be solved by standard techniques.

In order to ensure that the next recursion step starts under the same conditions as the current one, one must show that the two smaller matrices obtained in the divide step have quasiseparable representations of at most the same orders as the larger initial matrix and that, where applicable, they belong to the same class.

Theorem 3.1. Let $m$, $N$ be two positive integers such that $m<N$ and let $A=\{A_{ij}\}_{i,j=1}^{N}$ be a block matrix with entries of sizes $m_i\times m_j$, with lower quasiseparable generators $p(i)$ $(i=2,\dots,N)$, $q(j)$ $(j=1,\dots,N-1)$, $a(k)$ $(k=2,\dots,N-1)$ of orders $r_k$ $(k=1,\dots,N-1)$, upper quasiseparable generators $g(i)$ $(i=1,\dots,N-1)$, $h(j)$ $(j=2,\dots,N)$, $b(k)$ $(k=2,\dots,N-1)$ of the same orders $r_k$ $(k=1,\dots,N-1)$ and diagonal entries $d(k)$ $(k=1,\dots,N)$.

Then the matrix $A$ is a perturbation of rank at most $r_m$ of a $2\times 2$ block diagonal matrix
$$
\begin{pmatrix} B & 0\\ 0 & C\end{pmatrix}\tag{3.1}
$$
with submatrices $B$ of size $m\times m$ and $C$ of size $(N-m)\times(N-m)$ which have quasiseparable generators of orders $r_k$, $k=1,\dots,m-1$, and of orders $r_k$, $k=m+1,\dots,N-1$, respectively.

In fact, using the notations (2.2)–(2.5) one can represent the matrix $A$ in the form
$$
A=\begin{pmatrix} B & 0\\ 0 & C\end{pmatrix}+V_1V_2,\tag{3.2}
$$
where
$$
V_1=\begin{pmatrix} G_m\\ P_{m+1}\end{pmatrix},\qquad V_2=\begin{pmatrix} Q_m & H_{m+1}\end{pmatrix},\tag{3.3}
$$
while
$$
B=B_m=A(1:m,\,1:m)-G_mQ_m,\qquad C=C_m=A(m+1:N,\,m+1:N)-P_{m+1}H_{m+1}.\tag{3.4}
$$
Moreover, the matrix $B$ has quasiseparable generators
$$
p_B(i)=p(i)-g(i)\,b^{<}_{i,m+1}\,a^{>}_{m+1,i-1}\quad (i=2,\dots,m),
$$
$$
q_B(j)=q(j)\quad (j=1,\dots,m-1),\qquad a_B(k)=a(k)\quad (k=2,\dots,m-1);
$$
$$
g_B(i)=g(i)\quad (i=1,\dots,m-1),\qquad h_B(j)=h(j)-b^{<}_{j-1,m+1}\,a^{>}_{m+1,j}\,q(j)\quad (j=2,\dots,m),
$$
$$
b_B(k)=b(k)\quad (k=2,\dots,m-1);
$$
$$
d_B(k)=d(k)-g(k)\,b^{<}_{k,m+1}\,a^{>}_{m+1,k}\,q(k)\quad (k=1,\dots,m)
$$
of orders $r_k$, $k=1,\dots,m-1$, and the matrix $C$ has the quasiseparable generators
$$
p_C(i-m)=p(i)\quad (i=m+2,\dots,N),\qquad q_C(j-m)=q(j)-a^{>}_{j+1,m}\,b^{<}_{mj}\,h(j)\quad (j=m+1,\dots,N-1),
$$
$$
a_C(k-m)=a(k)\quad (k=m+2,\dots,N-1);
$$
$$
g_C(i-m)=g(i)-p(i)\,a^{>}_{im}\,b^{<}_{m,i+1}\quad (i=m+1,\dots,N-1),\qquad h_C(j-m)=h(j)\quad (j=m+2,\dots,N),
$$
$$
b_C(k-m)=b(k)\quad (k=m+2,\dots,N-1);
$$
$$
d_C(k-m)=d(k)-p(k)\,a^{>}_{km}\,b^{<}_{mk}\,h(k)\quad (k=m+1,\dots,N)
$$
of orders $r_k$, $k=m+1,\dots,N-1$.

Proof. It follows from (2.6), (2.7) that the matrix $A$ may be partitioned in the form
$$
A=\begin{pmatrix} A(1:m,\,1:m) & G_mH_{m+1}\\ P_{m+1}Q_m & A(m+1:N,\,m+1:N)\end{pmatrix}.\tag{3.5}
$$
Using (3.5) one can represent the matrix $A$ in the form
$$
A=\begin{pmatrix} B & 0\\ 0 & C\end{pmatrix}+\begin{pmatrix} G_m\\ P_{m+1}\end{pmatrix}\begin{pmatrix} Q_m & H_{m+1}\end{pmatrix},\tag{3.6}
$$
where $B$ and $C$ satisfy (3.4). Thus we have represented the matrix $A$ as the sum of a $2\times 2$ block diagonal matrix and a matrix of rank at most $r_m$.

It remains to show that the matrix $B$ has quasiseparable generators of orders $r_k$, $k=1,\dots,m-1$, and the matrix $C$ has quasiseparable generators of orders $r_k$, $k=m+1,\dots,N-1$, and to obtain the formulas for these generators. We proceed first with the matrix $B$.


For $1\le j<i\le m$ we have
$$
B(i,j)=A(i,j)-G_m(i,\,1:r_m)\,Q_m(1:r_m,\,j)=p(i)\,a^{>}_{ij}\,q(j)-g(i)\,b^{<}_{i,m+1}\,a^{>}_{m+1,j}\,q(j).
$$
Using the equality
$$
a^{>}_{m+1,j}=a(m)\cdots a(i)\,a(i-1)\cdots a(j+1)=a^{>}_{m+1,i-1}\,a^{>}_{ij}\tag{3.7}
$$
we conclude that for $1\le j<i\le m$ we have
$$
B(i,j)=\bigl(p(i)-g(i)\,b^{<}_{i,m+1}\,a^{>}_{m+1,i-1}\bigr)\,a^{>}_{ij}\,q(j).
$$
Thus the matrix $B$ has lower quasiseparable generators
$$
p(i)-g(i)\,b^{<}_{i,m+1}\,a^{>}_{m+1,i-1}\ (i=2,\dots,m),\qquad q(j)\ (j=1,\dots,m-1),\qquad a(k)\ (k=2,\dots,m-1)
$$
of orders $r_k$, $k=1,\dots,m-1$. Similarly we obtain for $i\le m$ the following diagonal entries of $B$:
$$
B(i,i)=A(i,i)-G_m(i,\,1:r_m)\,Q_m(1:r_m,\,i)=d(i)-g(i)\,b^{<}_{i,m+1}\,a^{>}_{m+1,i}\,q(i),\qquad i=1,\dots,m,
$$
and also the following upper quasiseparable generators of orders $r_k$, $k=1,\dots,m-1$:
$$
B(i,j)=g(i)\,b^{<}_{ij}\bigl(h(j)-b^{<}_{j-1,m+1}\,a^{>}_{m+1,j}\,q(j)\bigr),\qquad 1\le i<j\le m.
$$
Here we used the formula
$$
b^{<}_{im}=b(i+1)b(i+2)\cdots b(m-1)=\bigl(b(i+1)\cdots b(j-1)\bigr)\bigl(b(j)\cdots b(m-1)\bigr)=b^{<}_{ij}\,b^{<}_{j-1,m}.\tag{3.8}
$$

In the same way we find that the matrix $C$ has quasiseparable generators of orders $r_k$, $k=m+1,\dots,N-1$, and we obtain the formulas for these generators. For $m+1\le j<i\le N$ we have
$$
C(i,j)=A(i,j)-P_{m+1}(i,\,1:r_m)\,H_{m+1}(1:r_m,\,j)=p(i)\,a^{>}_{ij}\,q(j)-p(i)\,a^{>}_{im}\,b^{<}_{mj}\,h(j).
$$
Using again the equality (3.7), namely
$$
a^{>}_{im}=a(i-1)\cdots a(j+1)\,a(j)\cdots a(m+1)=a^{>}_{ij}\,a^{>}_{j+1,m},
$$
we conclude that
$$
C(i,j)=p(i)\,a^{>}_{ij}\bigl(q(j)-a^{>}_{j+1,m}\,b^{<}_{mj}\,h(j)\bigr),\qquad m+1\le j<i\le N.
$$
Thus the matrix $C$ has lower quasiseparable generators $p(i)$ $(i=m+2,\dots,N)$, $q(j)-a^{>}_{j+1,m}\,b^{<}_{mj}\,h(j)$ $(j=m+1,\dots,N-1)$ and $a(k)$ $(k=m+2,\dots,N-1)$ of orders $r_k$, $k=m+1,\dots,N-1$. Similarly we obtain for $i\ge m+1$ the following diagonal entries of $C$:
$$
C(i,i)=A(i,i)-P_{m+1}(i,\,1:r_m)\,H_{m+1}(1:r_m,\,i)=d(i)-p(i)\,a^{>}_{im}\,b^{<}_{mi}\,h(i)
$$


and the following upper quasiseparable generators of orders $r_k$, $k=m+1,\dots,N-1$:
$$
\begin{aligned}
C(i,j)&=A(i,j)-P_{m+1}(i,\,1:r_m)\,H_{m+1}(1:r_m,\,j)\\
&=g(i)\,b^{<}_{ij}\,h(j)-p(i)\,a^{>}_{im}\,b^{<}_{mj}\,h(j)\\
&=\bigl(g(i)-p(i)\,a^{>}_{im}\,b^{<}_{m,i+1}\bigr)\,b^{<}_{ij}\,h(j),\qquad m+1\le i<j\le N.
\end{aligned}
$$
Here we used again formula (3.8) to show that
$$
b^{<}_{mj}=b(m+1)b(m+2)\cdots b(j-1)=\bigl(b(m+1)\cdots b(i)\bigr)\bigl(b(i+1)\cdots b(j-1)\bigr)=b^{<}_{m,i+1}\,b^{<}_{ij}.
$$

□
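The statement of Theorem 3.1 can be checked numerically. The sketch below assumes scalar entries and order-one generators and uses exact rational arithmetic: it forms $B$ and $C$ entry by entry from (3.4) and compares them with the matrices rebuilt from the generator formulas of the theorem. The helper names `prod` and `entry`, the sample generator values, and the choice to keep global indices for $C$ (rather than shifting by $m$) are illustrative assumptions.

```python
from fractions import Fraction as Fr

def prod(c, lo, hi):
    """c(lo) * ... * c(hi); the empty product (lo > hi) is 1."""
    out = Fr(1)
    for k in range(lo, hi + 1):
        out *= c[k]
    return out

def entry(i, j, p, q, a, g, h, b, d):
    """Entry (i, j) of the matrix defined by scalar generators, as in (2.1)."""
    if i > j:
        return p[i] * prod(a, j + 1, i - 1) * q[j]
    if i < j:
        return g[i] * prod(b, i + 1, j - 1) * h[j]
    return d[i]

# Hypothetical sample generators.
N, m = 6, 3
p = {i: Fr(i, 2) for i in range(2, N + 1)}
q = {j: Fr(1, j) for j in range(1, N)}
a = {k: Fr(k, 3) for k in range(2, N)}
g = {i: Fr(i + 1) for i in range(1, N)}
h = {j: Fr(2, j) for j in range(2, N + 1)}
b = {k: Fr(1, k) for k in range(2, N)}
d = {k: Fr(k) for k in range(1, N + 1)}
gens = (p, q, a, g, h, b, d)

# Rows/columns of the factors (2.2)-(2.5); scalars commute, so order is free.
G = {i: g[i] * prod(b, i + 1, m) for i in range(1, m + 1)}
Q = {j: prod(a, j + 1, m) * q[j] for j in range(1, m + 1)}
P = {i: p[i] * prod(a, m + 1, i - 1) for i in range(m + 1, N + 1)}
H = {j: prod(b, m + 1, j - 1) * h[j] for j in range(m + 1, N + 1)}

# B and C entry by entry from (3.4).
B = {(i, j): entry(i, j, *gens) - G[i] * Q[j]
     for i in range(1, m + 1) for j in range(1, m + 1)}
C = {(i, j): entry(i, j, *gens) - P[i] * H[j]
     for i in range(m + 1, N + 1) for j in range(m + 1, N + 1)}

# Modified generators from Theorem 3.1.
pB = {i: p[i] - G[i] * prod(a, i, m) for i in range(2, m + 1)}
hB = {j: h[j] - prod(b, j, m) * Q[j] for j in range(2, m + 1)}
dB = {k: d[k] - G[k] * Q[k] for k in range(1, m + 1)}
qC = {j: q[j] - prod(a, m + 1, j) * H[j] for j in range(m + 1, N)}
gC = {i: g[i] - P[i] * prod(b, m + 1, i) for i in range(m + 1, N)}
dC = {k: d[k] - P[k] * H[k] for k in range(m + 1, N + 1)}

# Rebuild B and C from the new generators and compare exactly.
B2 = {(i, j): entry(i, j, pB, q, a, g, hB, b, dB) for (i, j) in B}
C2 = {(i, j): entry(i, j, p, qC, a, gC, h, b, dC) for (i, j) in C}
```

Only $p$, $h$, $d$ change for $B$ and only $q$, $g$, $d$ change for $C$; the generators $a$ and $b$ are reused unchanged, which is what keeps the orders from growing.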

This theorem generalizes a result obtained in [1, Section 7] for Hermitian matrices. This result is presented below as Corollary 3.2.

3.2. Hermitian and/or tridiagonal matrices

It is clear that if the matrix $A$ is Hermitian, then this property is preserved for the matrices $B$ and $C$. In this case only the lower quasiseparable generators need to be computed, so that the complexity is lower. Indeed, for a Hermitian block matrix one can build from the given lower quasiseparable generators the following upper quasiseparable generators of the same orders:
$$
g(j)=(q(j))^{*},\ j=1,\dots,N-1,\qquad h(i)=(p(i))^{*},\ i=2,\dots,N,\qquad b(k)=(a(k))^{*},\ k=2,\dots,N-1.\tag{3.9}
$$

Corollary 3.2. Let $m$, $N$ be two positive integers such that $m<N$ and let $A=\{A_{ij}\}_{i,j=1}^{N}$ be a block Hermitian matrix with entries of sizes $m_i\times m_j$, with lower quasiseparable generators $p(i)$ $(i=2,\dots,N)$, $q(j)$ $(j=1,\dots,N-1)$, $a(k)$ $(k=2,\dots,N-1)$ of orders $r_k$ $(k=1,\dots,N-1)$ and diagonal entries $d(k)$ $(k=1,\dots,N)$.

Then the matrix $A$ is a perturbation of rank at most $r_m$ of a $2\times 2$ block diagonal matrix
$$
\begin{pmatrix} B & 0\\ 0 & C\end{pmatrix}\tag{3.10}
$$
with Hermitian submatrices $B$ of size $m\times m$ and $C$ of size $(N-m)\times(N-m)$ which have lower quasiseparable generators of orders $r_k$, $k=1,\dots,m-1$, and of orders $r_k$, $k=m+1,\dots,N-1$, respectively.

In fact, one can represent the matrix $A$ in the form (3.2), (3.3) with
$$
V_1=\begin{pmatrix} Q_m^{*}\\ P_{m+1}\end{pmatrix},\qquad V_2=\begin{pmatrix} Q_m & P_{m+1}^{*}\end{pmatrix},\tag{3.11}
$$
while
$$
B=B_m=A(1:m,\,1:m)-Q_m^{*}Q_m,\qquad C=C_m=A(m+1:N,\,m+1:N)-P_{m+1}P_{m+1}^{*}.\tag{3.12}
$$
Moreover, the matrix $B$ has (lower) quasiseparable generators
$$
p_B(i)=p(i)-(q(i))^{*}\,(a^{>}_{m+1,i})^{*}\,a^{>}_{m+1,i-1}\quad (i=2,\dots,m),
$$
$$
q_B(j)=q(j)\quad (j=1,\dots,m-1),\qquad a_B(k)=a(k)\quad (k=2,\dots,m-1)
$$
of orders $r_k$, $k=1,\dots,m-1$, and the matrix $C$ has (lower) quasiseparable generators
$$
p_C(i-m)=p(i)\quad (i=m+2,\dots,N),
$$
$$
q_C(j-m)=q(j)-a^{>}_{j+1,m}\,(a^{>}_{jm})^{*}\,(p(j))^{*}\quad (j=m+1,\dots,N-1),
$$
$$
a_C(k-m)=a(k)\quad (k=m+2,\dots,N-1)
$$
of orders $r_k$, $k=m+1,\dots,N-1$, as in Theorem 3.1. (The adjoint factors $(a^{>}_{m+1,i})^{*}=b^{<}_{i,m+1}$ and $(a^{>}_{jm})^{*}=b^{<}_{mj}$ are obtained by substituting (3.9) into the formulas of Theorem 3.1.) The diagonal entries of the matrices $B$ and $C$ become in the Hermitian case
$$
d_B(k)=d(k)-(q(k))^{*}\,(a^{>}_{m+1,k})^{*}\,a^{>}_{m+1,k}\,q(k)\quad (k=1,\dots,m),
$$
$$
d_C(k-m)=d(k)-p(k)\,a^{>}_{km}\,(a^{>}_{km})^{*}\,(p(k))^{*}\quad (k=m+1,\dots,N).
$$

In order to show that the present paper covers the case of a tridiagonal matrix, which has been treated extensively in the literature (see [3, 8, 7] and the literature cited therein), we still have to prove that our quasiseparable approach to dividing a large matrix also preserves the tridiagonal structure.

Corollary 3.3. Let $m$, $N$ be two positive integers such that $m<N$ and let $A=\{A_{ij}\}_{i,j=1}^{N}$ be a block tridiagonal matrix
$$
\begin{pmatrix}
\gamma_1 & \beta_1 & 0 & \cdots & 0 & 0 & 0\\
\alpha_1 & \gamma_2 & \beta_2 & \cdots & 0 & 0 & 0\\
0 & \alpha_2 & \gamma_3 & \cdots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & \alpha_{N-2} & \gamma_{N-1} & \beta_{N-1}\\
0 & 0 & 0 & \cdots & 0 & \alpha_{N-1} & \gamma_N
\end{pmatrix},\tag{3.13}
$$
where $\gamma_k$ are $n_k\times n_k$ matrices, $k=1,\dots,N$, and $\alpha_i$, $\beta_i$ are $n_{i+1}\times n_i$ and $n_i\times n_{i+1}$ matrices respectively, $i=1,\dots,N-1$. Suppose that the matrix $A$ has the following block quasiseparable generators
$$
\begin{aligned}
&p(i)=\alpha_{i-1}\ (i=2,\dots,N),\quad q(j)=I_{n_j}\ (j=1,\dots,N-1),\quad a(k)=0_{n_k\times n_{k+1}}\ (k=2,\dots,N-1),\\
&g(j)=I_{n_j}\ (j=1,\dots,N-1),\quad h(i)=\beta_{i-1}\ (i=2,\dots,N),\quad b(k)=0_{n_k\times n_{k+1}}\ (k=2,\dots,N-1)
\end{aligned}\tag{3.14}
$$
and the diagonal entries $d(k)=\gamma_k$, $k=1,\dots,N$.

Then the matrix $A$ is a perturbation of block rank one of a $2\times 2$ block diagonal matrix
$$
\begin{pmatrix} B & 0\\ 0 & C\end{pmatrix}\tag{3.15}
$$


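with block tridiagonal submatrices $B$ of block size $m\times m$ and $C$ of block size $(N-m)\times(N-m)$ which preserve the quasiseparable generators of order one of the matrix $A$ and differ from it only in diagonal entries.

Proof. It follows by (2.2) that
$$
P_{m+1}=\operatorname{col}\bigl(p(k)\,a^{>}_{km}\bigr)_{k=m+1}^{N}=\begin{pmatrix}\alpha_m\\ 0_{\eta_m\times n_m}\end{pmatrix},
$$
by (2.3) that
$$
Q_m=\operatorname{row}\bigl(a^{>}_{m+1,k}\,q(k)\bigr)_{k=1}^{m}=\begin{pmatrix}0_{n_m\times\chi_m} & I_{n_m}\end{pmatrix},
$$
by (2.4) that
$$
G_m=\operatorname{col}\bigl(g(k)\,b^{<}_{k,m+1}\bigr)_{k=1}^{m}=\begin{pmatrix}0_{\chi_m\times n_m}\\ I_{n_m}\end{pmatrix},
$$
and by (2.5) that
$$
H_{m+1}=\operatorname{row}\bigl(b^{<}_{mk}\,h(k)\bigr)_{k=m+1}^{N}=\begin{pmatrix}\beta_m & 0_{n_m\times\eta_m}\end{pmatrix},
$$
where $\chi_m=\sum_{i=1}^{m-1}n_i$ and $\eta_m=\sum_{i=m+2}^{N}n_i$ (the zero blocks fill the block rows, respectively block columns, $m+2,\dots,N$). Using (3.4) it follows that the desired $B$ and $C$ satisfy
$$
B=B_m=A(1:m,\,1:m)-G_mQ_m=
\begin{pmatrix}
\gamma_1 & \beta_1 & 0 & \cdots & 0 & 0 & 0\\
\alpha_1 & \gamma_2 & \beta_2 & \cdots & 0 & 0 & 0\\
0 & \alpha_2 & \gamma_3 & \cdots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & \alpha_{m-2} & \gamma_{m-1} & \beta_{m-1}\\
0 & 0 & 0 & \cdots & 0 & \alpha_{m-1} & \gamma_m-I_{n_m}
\end{pmatrix},
$$
$$
C=C_m=A(m+1:N,\,m+1:N)-P_{m+1}H_{m+1}=
\begin{pmatrix}
\gamma_{m+1}-\alpha_m\beta_m & \beta_{m+1} & 0 & \cdots & 0 & 0 & 0\\
\alpha_{m+1} & \gamma_{m+2} & \beta_{m+2} & \cdots & 0 & 0 & 0\\
0 & \alpha_{m+2} & \gamma_{m+3} & \cdots & 0 & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & \alpha_{N-2} & \gamma_{N-1} & \beta_{N-1}\\
0 & 0 & 0 & \cdots & 0 & \alpha_{N-1} & \gamma_N
\end{pmatrix}.
$$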

Therefore $B$ differs from the submatrix $A(1:m,\,1:m)$ only in the entry $d(m)$, while $C$ differs from the submatrix $A(m+1:N,\,m+1:N)$ only in the entry $d(m+1)$. It follows that the new matrices are again tridiagonal and that they preserve the quasiseparable generators of $A$ given in (3.14), namely their generators are
$$
\begin{aligned}
&p_B(i)=\alpha_{i-1}\ (i=2,\dots,m),\quad q_B(j)=I_{n_j}\ (j=1,\dots,m-1),\quad a_B(k)=0_{n_k\times n_{k+1}}\ (k=2,\dots,m-1),\\
&g_B(j)=I_{n_j}\ (j=1,\dots,m-1),\quad h_B(i)=\beta_{i-1}\ (i=2,\dots,m),\quad b_B(k)=0_{n_k\times n_{k+1}}\ (k=2,\dots,m-1),\\
&p_C(i-m)=\alpha_{i-1}\ (i=m+2,\dots,N),\quad q_C(j-m)=I_{n_j}\ (j=m+1,\dots,N-1),\\
&a_C(k-m)=0_{n_k\times n_{k+1}}\ (k=m+2,\dots,N-1),\quad g_C(j-m)=I_{n_j}\ (j=m+1,\dots,N-1),\\
&h_C(i-m)=\beta_{i-1}\ (i=m+2,\dots,N),\quad b_C(k-m)=0_{n_k\times n_{k+1}}\ (k=m+2,\dots,N-1),
\end{aligned}
$$
and their diagonal entries are
$$
d_B(k)=\gamma_k,\ k=1,\dots,m-1,\qquad d_B(m)=\gamma_m-I_{n_m},
$$
$$
d_C(1)=\gamma_{m+1}-\alpha_m\beta_m,\qquad d_C(k-m)=\gamma_k,\ k=m+2,\dots,N.
$$
Moreover, the perturbations given in (3.3) are of block rank one:
$$
V_1=\begin{pmatrix}G_m\\ P_{m+1}\end{pmatrix}=\begin{pmatrix}0_{\chi_m\times n_m}\\ I_{n_m}\\ \alpha_m\\ 0_{\eta_m\times n_m}\end{pmatrix},\qquad
V_2=\begin{pmatrix}Q_m & H_{m+1}\end{pmatrix}=\begin{pmatrix}0_{n_m\times\chi_m} & I_{n_m} & \beta_m & 0_{n_m\times\eta_m}\end{pmatrix}.
$$

□
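For scalar tridiagonal matrices the splitting of Corollary 3.3 is easy to reproduce. The following sketch, under the assumption of scalar entries ($n_k=1$) and 0-based Python lists (so `gamma[k-1]` stores $\gamma_k$, `alpha[i-1]` stores $\alpha_i$, `beta[i-1]` stores $\beta_i$), builds $B$, $C$, $V_1$, $V_2$ and checks (3.2). The function names `tridiag` and `split` and the sample data are illustrative assumptions.

```python
def tridiag(gamma, alpha, beta):
    """Dense tridiagonal matrix: gamma on the diagonal, alpha below, beta above."""
    n = len(gamma)
    T = [[0] * n for _ in range(n)]
    for i in range(n):
        T[i][i] = gamma[i]
        if i + 1 < n:
            T[i + 1][i] = alpha[i]
            T[i][i + 1] = beta[i]
    return T

def split(gamma, alpha, beta, m):
    """Divide step of Corollary 3.3; m is the 1-based split index, 1 <= m < N."""
    N = len(gamma)
    gB = gamma[:m]                      # copies, the inputs are not modified
    gB[-1] -= 1                         # d_B(m) = gamma_m - 1
    gC = gamma[m:]
    gC[0] -= alpha[m - 1] * beta[m - 1] # d_C(1) = gamma_{m+1} - alpha_m beta_m
    B = tridiag(gB, alpha[:m - 1], beta[:m - 1])
    C = tridiag(gC, alpha[m:], beta[m:])
    V1 = [0] * N
    V1[m - 1], V1[m] = 1, alpha[m - 1]  # column (0, ..., 1, alpha_m, 0, ...)^T
    V2 = [0] * N
    V2[m - 1], V2[m] = 1, beta[m - 1]   # row (0, ..., 1, beta_m, 0, ...)
    return B, C, V1, V2

# Hypothetical sample data.
gamma = [4, 5, 6, 7, 8]
alpha = [1, 2, 3, 4]
beta = [9, 8, 7, 6]
m = 2
N = len(gamma)

A = tridiag(gamma, alpha, beta)
B, C, V1, V2 = split(gamma, alpha, beta, m)

# Reassemble diag(B, C) + V1 V2 and compare with A.
R = [[V1[i] * V2[j] for j in range(N)] for i in range(N)]
for i in range(m):
    for j in range(m):
        R[i][j] += B[i][j]
for i in range(N - m):
    for j in range(N - m):
        R[m + i][m + j] += C[i][j]
split_ok = (R == A)
```

The rank-one term touches only the $2\times 2$ window around the split, so both halves stay tridiagonal and only two diagonal entries change.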

It follows from Corollaries 3.2 and 3.3 that if the matrix $A$ is both tridiagonal and Hermitian, then the matrices $B$ and $C$ obtained also belong to the same class.

3.3. Algorithms to obtain suitable quasiseparable generators for the divided matrices and the entries of the perturbation matrices

The following algorithm efficiently obtains lower and upper quasiseparable generators for the matrices $B$ and $C$.

Algorithm 3.4. Let $A=\{A_{ij}\}_{i,j=1}^{N}$ be a block matrix with entries of sizes $m_i\times m_j$, with lower quasiseparable generators $p(i)$ $(i=2,\dots,N)$, $q(j)$ $(j=1,\dots,N-1)$, $a(k)$ $(k=2,\dots,N-1)$ of orders $r_k$ $(k=1,\dots,N-1)$, upper quasiseparable generators $g(i)$ $(i=1,\dots,N-1)$, $h(j)$ $(j=2,\dots,N)$, $b(k)$ $(k=2,\dots,N-1)$ of the same orders $r_k$ $(k=1,\dots,N-1)$ and diagonal entries $d(k)$ $(k=1,\dots,N)$. Let the matrices $B$ and $C$ be given by (3.2).

Then a set of quasiseparable generators $p_B(i)$ $(i=2,\dots,m)$, $q_B(j)$ $(j=1,\dots,m-1)$, $a_B(k)$ $(k=2,\dots,m-1)$; $g_B(i)$ $(i=1,\dots,m-1)$, $h_B(j)$ $(j=2,\dots,m)$, $b_B(k)$ $(k=2,\dots,m-1)$; $d_B(k)$ $(k=1,\dots,m)$ of the matrix $B$ and a set of quasiseparable generators $p_C(i)$ $(i=2,\dots,N-m)$, $q_C(j)$ $(j=1,\dots,N-m-1)$, $a_C(k)$ $(k=2,\dots,N-m-1)$; $g_C(i)$ $(i=1,\dots,N-m-1)$, $h_C(j)$ $(j=2,\dots,N-m)$, $b_C(k)$ $(k=2,\dots,N-m-1)$; $d_C(k)$ $(k=1,\dots,N-m)$ of the matrix $C$, which have the same orders as the generators of the matrix $A$, are obtained with the following algorithm.

In each step the assignments are performed in the order listed.

1. Find the quasiseparable generators of $B$.

1.1. Set
$$
u=a(m),\qquad d_B(m)=d(m)-g(m)q(m),\qquad p_B(m)=p(m)-g(m)u,\tag{3.16}
$$
$$
v=b(m),\qquad h_B(m)=h(m)-vq(m),\qquad V_2(m)=q(m),\qquad V_1(m)=g(m).\tag{3.17}
$$

1.2. For $k=m-1,\dots,2$ perform the following:
$$
w=uq(k),\qquad u=ua(k),\qquad z=g(k)v,\qquad d_B(k)=d(k)-zw,\qquad p_B(k)=p(k)-zu,\tag{3.18}
$$
$$
v=b(k)v,\qquad h_B(k)=h(k)-vw,\qquad q_B(k)=q(k),\qquad a_B(k)=a(k),\tag{3.19}
$$
$$
b_B(k)=b(k),\qquad g_B(k)=g(k),\qquad V_2(k)=w,\qquad V_1(k)=z.\tag{3.20}
$$

1.3. Set
$$
V_1(1)=g(1)v,\qquad V_2(1)=uq(1),\tag{3.21}
$$
$$
d_B(1)=d(1)-V_1(1)V_2(1),\qquad q_B(1)=q(1),\qquad g_B(1)=g(1).\tag{3.22}
$$

2. Find the quasiseparable generators of $C$.

2.1. Set
$$
s=m+1,\qquad u=a(s),\qquad d_C(1)=d(s)-p(s)h(s),\qquad q_C(1)=q(s)-uh(s),\tag{3.23}
$$
$$
v=b(s),\qquad g_C(1)=g(s)-p(s)v,\qquad V_1(s)=p(s),\qquad V_2(s)=h(s).\tag{3.24}
$$

2.2. For $k=m+2,\dots,N-1$ perform the following:
$$
s=k-m,\qquad w=p(k)u,\qquad z=vh(k),\qquad d_C(s)=d(k)-wz,\tag{3.25}
$$
$$
u=a(k)u,\qquad q_C(s)=q(k)-uz,\qquad v=vb(k),\qquad g_C(s)=g(k)-wv,\tag{3.26}
$$
$$
a_C(s)=a(k),\qquad p_C(s)=p(k),\qquad b_C(s)=b(k),\qquad h_C(s)=h(k),\qquad V_1(k)=w,\qquad V_2(k)=z.\tag{3.27}
$$

2.3. Set
$$
s=N-m,\qquad V_1(N)=p(N)u,\qquad V_2(N)=vh(N),\tag{3.28}
$$
$$
d_C(s)=d(N)-V_1(N)V_2(N),\qquad p_C(s)=p(N),\qquad h_C(s)=h(N).\tag{3.29}
$$
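The steps of Algorithm 3.4 can be sketched in code as follows. This is a minimal illustration, assuming scalar entries and order-one generators stored in dictionaries keyed by the paper's 1-based indices, and assuming $2\le m\le N-2$ so that every generator referenced in steps 1.1 and 2.1 exists. The function names `divide` and `dense` and the sample values are hypothetical. The final check confirms (3.2) in exact rational arithmetic.

```python
from fractions import Fraction as Fr

def divide(N, m, p, q, a, g, h, b, d):
    """Algorithm 3.4 for scalar generators; assumes 2 <= m <= N - 2."""
    pB, qB, aB, gB, hB, bB, dB = {}, {}, {}, {}, {}, {}, {}
    pC, qC, aC, gC, hC, bC, dC = {}, {}, {}, {}, {}, {}, {}
    V1, V2 = {}, {}
    # Step 1.1
    u, v = a[m], b[m]
    dB[m] = d[m] - g[m] * q[m]
    pB[m] = p[m] - g[m] * u
    hB[m] = h[m] - v * q[m]
    V1[m], V2[m] = g[m], q[m]
    # Step 1.2: k = m-1, ..., 2
    for k in range(m - 1, 1, -1):
        w = u * q[k]
        u = u * a[k]
        z = g[k] * v
        dB[k] = d[k] - z * w
        pB[k] = p[k] - z * u
        v = b[k] * v
        hB[k] = h[k] - v * w
        qB[k], aB[k], gB[k], bB[k] = q[k], a[k], g[k], b[k]
        V1[k], V2[k] = z, w
    # Step 1.3
    V1[1], V2[1] = g[1] * v, u * q[1]
    dB[1] = d[1] - V1[1] * V2[1]
    qB[1], gB[1] = q[1], g[1]
    # Step 2.1
    s = m + 1
    u, v = a[s], b[s]
    dC[1] = d[s] - p[s] * h[s]
    qC[1] = q[s] - u * h[s]
    gC[1] = g[s] - p[s] * v
    V1[s], V2[s] = p[s], h[s]
    # Step 2.2: k = m+2, ..., N-1
    for k in range(m + 2, N):
        s = k - m
        w = p[k] * u
        z = v * h[k]
        dC[s] = d[k] - w * z
        u = a[k] * u
        qC[s] = q[k] - u * z
        v = v * b[k]
        gC[s] = g[k] - w * v
        aC[s], pC[s], bC[s], hC[s] = a[k], p[k], b[k], h[k]
        V1[k], V2[k] = w, z
    # Step 2.3
    s = N - m
    V1[N], V2[N] = p[N] * u, v * h[N]
    dC[s] = d[N] - V1[N] * V2[N]
    pC[s], hC[s] = p[N], h[N]
    return ((pB, qB, aB, gB, hB, bB, dB),
            (pC, qC, aC, gC, hC, bC, dC), V1, V2)

def dense(n, gens):
    """n x n matrix from scalar generators (2.1), as a dict keyed by (i, j)."""
    p, q, a, g, h, b, d = gens
    def prod(c, lo, hi):
        out = Fr(1)
        for k in range(lo, hi + 1):
            out *= c[k]
        return out
    return {(i, j): (p[i] * prod(a, j + 1, i - 1) * q[j] if i > j else
                     g[i] * prod(b, i + 1, j - 1) * h[j] if i < j else d[i])
            for i in range(1, n + 1) for j in range(1, n + 1)}

# Hypothetical sample generators.
N, m = 7, 3
gens = ({i: Fr(i, 2) for i in range(2, N + 1)},      # p
        {j: Fr(1, j) for j in range(1, N)},          # q
        {k: Fr(k, 3) for k in range(2, N)},          # a
        {i: Fr(i + 1) for i in range(1, N)},         # g
        {j: Fr(2, j) for j in range(2, N + 1)},      # h
        {k: Fr(1, k) for k in range(2, N)},          # b
        {k: Fr(k) for k in range(1, N + 1)})         # d
A = dense(N, gens)
gensB, gensC, V1, V2 = divide(N, m, *gens)
B, C = dense(m, gensB), dense(N - m, gensC)

def block(i, j):
    """Entry (i, j) of diag(B, C)."""
    if i <= m and j <= m:
        return B[i, j]
    if i > m and j > m:
        return C[i - m, j - m]
    return 0

algo_ok = all(A[i, j] == block(i, j) + V1[i] * V2[j]
              for i in range(1, N + 1) for j in range(1, N + 1))
```

Each loop iteration performs a fixed number of generator products, so in this scalar setting the cost is linear in $N$; the running products `u` and `v` are what avoid recomputing $a^{>}$ and $b^{<}$ from scratch.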

The following algorithm computes suitable lower quasiseparable generators for $B$ and $C$ and the entries of the perturbation matrix in the Hermitian case.

Algorithm 3.5. Algorithm for Hermitian matrices

Let $A=\{A_{ij}\}_{i,j=1}^{N}$ be a block Hermitian matrix with entries of sizes $m_i\times m_j$, with lower quasiseparable generators $p(i)$ $(i=2,\dots,N)$, $q(j)$ $(j=1,\dots,N-1)$, $a(k)$ $(k=2,\dots,N-1)$ of orders $r_k$ $(k=1,\dots,N-1)$ and diagonal entries $d(k)$ $(k=1,\dots,N)$. Let the matrices $B$ and $C$ be given by (3.2).

Then sets of lower quasiseparable generators and diagonal entries $p_B(i)$ $(i=2,\dots,m)$, $q_B(j)$ $(j=1,\dots,m-1)$, $a_B(k)$ $(k=2,\dots,m-1)$; $d_B(k)$ $(k=1,\dots,m)$ for the matrix $B$ and $p_C(i)$ $(i=2,\dots,N-m)$, $q_C(j)$ $(j=1,\dots,N-m-1)$, $a_C(k)$ $(k=2,\dots,N-m-1)$; $d_C(k)$ $(k=1,\dots,N-m)$ for the matrix $C$, which have the same orders as the generators of the matrix $A$, are obtained with the following algorithm. In each step the assignments are performed in the order listed.

1. Find the lower quasiseparable generators and the diagonal entries of the matrix $B$.

1.1. Set
$$
u=a(m),\qquad d_B(m)=d(m)-(q(m))^{*}q(m),\qquad p_B(m)=p(m)-(q(m))^{*}u,\qquad V_2(m)=q(m).\tag{3.30}
$$

1.2. For $k=m-1,\dots,2$ perform the following:
$$
w=uq(k),\qquad u=ua(k),\qquad d_B(k)=d(k)-w^{*}w,\qquad p_B(k)=p(k)-w^{*}u,\tag{3.31}
$$
$$
q_B(k)=q(k),\qquad a_B(k)=a(k),\qquad V_2(k)=w.\tag{3.32}
$$

1.3. Set
$$
w=uq(1),\qquad V_2(1)=w,\qquad q_B(1)=q(1),\qquad d_B(1)=d(1)-w^{*}w.\tag{3.33}
$$

2. Find the lower quasiseparable generators and the diagonal entries of the matrix $C$.

2.1. Set
$$
s=m+1,\qquad u=a(s),\qquad d_C(1)=d(s)-p(s)(p(s))^{*},\qquad q_C(1)=q(s)-u(p(s))^{*},\qquad V_2(s)=(p(s))^{*}.\tag{3.34}
$$

2.2. For $k=m+2,\dots,N-1$ perform the following:
$$
s=k-m,\qquad w=p(k)u,\qquad u=a(k)u,\qquad q_C(s)=q(k)-uw^{*},\tag{3.35}
$$
$$
d_C(s)=d(k)-ww^{*},\qquad a_C(s)=a(k),\qquad p_C(s)=p(k),\qquad V_2(k)=w^{*}.\tag{3.36}
$$

2.3. Set
$$
s=N-m,\qquad p_C(s)=p(N),\qquad w=p(N)u,\qquad d_C(s)=d(N)-ww^{*},\qquad V_2(N)=w^{*}.\tag{3.37}
$$
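A compact sketch of Algorithm 3.5 for scalar entries follows, under the assumptions that the generators are complex numbers with small integer real and imaginary parts (so that floating-point arithmetic is exact in this example), that $2\le m\le N-2$, and that the adjoint of a scalar is its complex conjugate. The names `divide_hermitian` and `dense_h` and the sample values are illustrative. The check uses $V_1=V_2^{*}$, which is how (3.11) reads in the scalar case.

```python
def divide_hermitian(N, m, p, q, a, d):
    """Algorithm 3.5 for scalar generators; assumes 2 <= m <= N - 2."""
    pB, qB, aB, dB = {}, {}, {}, {}
    pC, qC, aC, dC = {}, {}, {}, {}
    V2 = {}
    # Step 1.1
    u = a[m]
    dB[m] = d[m] - q[m].conjugate() * q[m]
    pB[m] = p[m] - q[m].conjugate() * u
    V2[m] = q[m]
    # Step 1.2: k = m-1, ..., 2
    for k in range(m - 1, 1, -1):
        w = u * q[k]
        u = u * a[k]
        dB[k] = d[k] - w.conjugate() * w
        pB[k] = p[k] - w.conjugate() * u
        qB[k], aB[k], V2[k] = q[k], a[k], w
    # Step 1.3
    w = u * q[1]
    V2[1], qB[1] = w, q[1]
    dB[1] = d[1] - w.conjugate() * w
    # Step 2.1
    s = m + 1
    u = a[s]
    dC[1] = d[s] - p[s] * p[s].conjugate()
    qC[1] = q[s] - u * p[s].conjugate()
    V2[s] = p[s].conjugate()
    # Step 2.2: k = m+2, ..., N-1
    for k in range(m + 2, N):
        s = k - m
        w = p[k] * u
        u = a[k] * u
        qC[s] = q[k] - u * w.conjugate()
        dC[s] = d[k] - w * w.conjugate()
        aC[s], pC[s], V2[k] = a[k], p[k], w.conjugate()
    # Step 2.3
    s = N - m
    pC[s] = p[N]
    w = p[N] * u
    dC[s] = d[N] - w * w.conjugate()
    V2[N] = w.conjugate()
    return (pB, qB, aB, dB), (pC, qC, aC, dC), V2

def dense_h(n, p, q, a, d):
    """Hermitian n x n matrix from lower generators and diagonal entries."""
    def low(i, j):  # p(i) a(i-1) ... a(j+1) q(j) for i > j
        out = p[i]
        for k in range(i - 1, j, -1):
            out *= a[k]
        return out * q[j]
    return {(i, j): (low(i, j) if i > j else
                     low(j, i).conjugate() if i < j else d[i])
            for i in range(1, n + 1) for j in range(1, n + 1)}

# Hypothetical sample generators with integer real and imaginary parts.
N, m = 6, 3
p = {i: complex(i, 1) for i in range(2, N + 1)}
q = {j: complex(1, j) for j in range(1, N)}
a = {k: complex(1, k - 3) for k in range(2, N)}
d = {k: 10 * k for k in range(1, N + 1)}

A = dense_h(N, p, q, a, d)
gensB, gensC, V2 = divide_hermitian(N, m, p, q, a, d)
B, C = dense_h(m, *gensB), dense_h(N - m, *gensC)

def block(i, j):
    """Entry (i, j) of diag(B, C)."""
    if i <= m and j <= m:
        return B[i, j]
    if i > m and j > m:
        return C[i - m, j - m]
    return 0

# Scalar form of (3.2) with (3.11): V1(i) = conj(V2(i)).
herm_ok = all(A[i, j] == block(i, j) + V2[i].conjugate() * V2[j]
              for i in range(1, N + 1) for j in range(1, N + 1))
```

Compared with the general algorithm, the running product `v` and the vectors `z`, $V_1$ disappear, since they are the adjoints of `u`, `w`, $V_2$.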

4. Conquer step and eigenproblem of rational matrix functions

4.1. The link between the eigenproblem of $A$ and an eigenproblem for a rational matrix function

In the conquer step, the solutions of the smaller problems into which a larger problem has been torn are successively combined, two by two, into solutions of the next larger problem.

Suppose that for the smaller divided matrices $B$ and $C$ of sizes $m\times m$ and $(N-m)\times(N-m)$ respectively we already have their spectral data, i.e., we have invertible matrices $P_B$ of size $m\times m$ and $P_C$ of size $(N-m)\times(N-m)$ such that $P_B^{-1}BP_B=J_B$ and $P_C^{-1}CP_C=J_C$, where the matrices $J_B$ and $J_C$ are in canonical Jordan form.

We must compute the spectral data of the twice larger matrix $A$ which satisfies (3.2) with the known $N\times r_m$ and $r_m\times N$ matrices $V_1$, $V_2$ given by (3.3). Denote
$$
U=\begin{pmatrix}P_B & 0\\ 0 & P_C\end{pmatrix}.
$$
Then $U$ is invertible and
$$
U^{-1}AU=U^{-1}\left(\begin{pmatrix}B & 0\\ 0 & C\end{pmatrix}+V_1V_2\right)U=J+z_1z_2,
$$
where
$$
J=\begin{pmatrix}J_B & 0\\ 0 & J_C\end{pmatrix},
$$
while
$$
z_1=U^{-1}V_1,\qquad z_2=V_2U\tag{4.1}
$$
are $N\times r_m$ and $r_m\times N$ matrices of small rank, respectively. We must now find an invertible $V$ which brings the matrix
$$
K=J+z_1z_2\tag{4.2}
$$
to its canonical Jordan form, i.e., such that $V^{-1}(J+z_1z_2)V=J_A$, where $J_A$ is the canonical Jordan form of the original matrix $A$. We then set $P=UV$ to obtain $P^{-1}AP=J_A$.

We therefore have to study the eigensystem of the matrix $K$ defined in (4.2). Consider the $r_m\times r_m$ matrix function
$$
F(\lambda)=I_{r_m}-z_2(\lambda I_N-J)^{-1}z_1.\tag{4.3}
$$

We will show that the eigenproblem of the potentially large $N\times N$ matrix $K$ can be reduced to the eigenproblem of the small $r_m\times r_m$ matrix function $F(\lambda)$. Finding the zeros of $\det F(\lambda)$, the eigenvectors of the small matrix obtained by substituting a zero into $F(\lambda)$, and possible Jordan chains for those eigenvectors is all that we need, as shown by the following theorem, which specializes a result that first appeared in [10].

Theorem 4.1. Suppose that $J$ is an $N\times N$ matrix, $z_1$ is an $N\times r_m$ matrix and $z_2$ is an $r_m\times N$ matrix, and that the matrices $J$ and $K=J+z_1z_2$ have no common eigenvalues.

Then $\lambda_0$ is an eigenvalue of the $N\times N$ matrix $K$ and $x_0,x_1,\dots,x_p$ is a Jordan chain of $K$ corresponding to $\lambda_0$ if and only if $\lambda_0$ is a zero of
$$
F(\lambda)=I_{r_m}-z_2(\lambda I_N-J)^{-1}z_1
$$
and $z_2x_0,z_2x_1,z_2x_2,\dots,z_2x_p$ is a Jordan chain of $F(\lambda)$ corresponding to its zero $\lambda_0$.

Moreover, if $\phi_0,\phi_1,\dots,\phi_p$ is a Jordan chain of the rational matrix function $F(\lambda)$ for its eigenvalue $\lambda_0$, then the corresponding Jordan chain of $K$ is given by
$$
y_k=\sum_{j=0}^{k}(-1)^{j}(\lambda_0I_N-J)^{-(j+1)}z_1\phi_{k-j},\qquad k=0,1,\dots,p.\tag{4.4}
$$
In particular
$$
y_0=(\lambda_0I_N-J)^{-1}z_1\phi_0\tag{4.5}
$$
is an eigenvector of $K$ for its eigenvalue $\lambda_0$.

The correspondence between the Jordan chains of $K$ and $F(\lambda)$ is one-to-one and onto. In particular, the algebraic multiplicity of an eigenvalue $\lambda_0$ of $K$ coincides with the multiplicity of $\lambda_0$ as an eigenvalue of $F(\lambda)$.

Proof. Let $\lambda_0$ be a zero of $F(\lambda)$ and let $\phi_0\ne 0$ be an eigenvector corresponding to $\lambda_0$. Then
$$
\begin{aligned}
(\lambda_0I_N-K)(\lambda_0I_N-J)^{-1}z_1\phi_0&=\bigl((\lambda_0I_N-J)-z_1z_2\bigr)(\lambda_0I_N-J)^{-1}z_1\phi_0\\
&=\bigl(I-z_1z_2(\lambda_0I_N-J)^{-1}\bigr)z_1\phi_0=z_1F(\lambda_0)\phi_0=0.
\end{aligned}
$$
In order to prove that $(\lambda_0I_N-J)^{-1}z_1\phi_0$ is an eigenvector of $K$ and $\lambda_0$ is one of its eigenvalues, it remains only to prove that $(\lambda_0I_N-J)^{-1}z_1\phi_0\ne 0$. Indeed, since $\phi_0\ne 0$ it follows that
$$
z_2(\lambda_0I_N-J)^{-1}z_1\phi_0=-F(\lambda_0)\phi_0+I_{r_m}\phi_0=0+\phi_0\ne 0
$$
and therefore
$$
(\lambda_0I_N-J)^{-1}z_1\phi_0\ne 0.
$$
So we have proved that $\lambda_0$ is an eigenvalue of $K$ and $(\lambda_0I_N-J)^{-1}z_1\phi_0$ is one of its eigenvectors.

Consider now a Jordan chain $\phi_0,\phi_1,\dots,\phi_p$ of $F(\lambda)$ corresponding to $\lambda_0$, i.e.,
$$
\sum_{j=0}^{k}\frac{1}{j!}F^{(j)}(\lambda_0)\phi_{k-j}=0,\qquad k=0,1,\dots,p.
$$
If we write the term for $j=0$ separately and carry out the differentiation, then it follows that
$$
I_{r_m}\phi_k-z_2(\lambda_0I_N-J)^{-1}z_1\phi_k+\sum_{j=1}^{k}(-1)^{j+1}z_2(\lambda_0I_N-J)^{-(j+1)}z_1\phi_{k-j}=0,
$$
so that
$$
\phi_k=z_2\left(\sum_{j=0}^{k}(-1)^{j}(\lambda_0I_N-J)^{-(j+1)}z_1\phi_{k-j}\right),\qquad k=0,1,\dots,p.
$$
Denote
$$
y_k=\sum_{j=0}^{k}(-1)^{j}(\lambda_0I_N-J)^{-(j+1)}z_1\phi_{k-j}.
$$
Then $z_2y_k=\phi_k$, $k=0,1,\dots,p$, and in particular $y_0=(\lambda_0I_N-J)^{-1}z_1\phi_0$ is the eigenvector that we previously found for $K$. It remains to prove that $y_0,y_1,\dots,y_p$ is a Jordan chain for $K$. We have
$$
\begin{aligned}
(\lambda_0I_N-K)y_{k+1}&=\bigl((\lambda_0I_N-J)-z_1z_2\bigr)y_{k+1}\\
&=(\lambda_0I_N-J)\sum_{j=0}^{k+1}(-1)^{j}(\lambda_0I_N-J)^{-(j+1)}z_1\phi_{k+1-j}-z_1z_2y_{k+1},
\end{aligned}
$$
and since $z_2y_{k+1}=\phi_{k+1}$ it follows that
$$
(\lambda_0I_N-K)y_{k+1}=\sum_{j=0}^{k+1}(-1)^{j}(\lambda_0I_N-J)^{-j}z_1\phi_{k+1-j}-z_1\phi_{k+1}.
$$
Now the term for $j=0$ cancels with $-z_1\phi_{k+1}$, so that the right-hand side is in fact equal to
$$
\sum_{j=1}^{k+1}(-1)^{j}(\lambda_0I_N-J)^{-j}z_1\phi_{k+1-j},
$$
and if we denote $q=j-1$, then the sum becomes
$$
\sum_{q=0}^{k}(-1)^{q+1}(\lambda_0I_N-J)^{-(q+1)}z_1\phi_{k-q}=-y_k
$$
for $k=0,\dots,p-1$. Altogether, we have proved that $(K-\lambda_0I_N)y_{k+1}=y_k$, so that $y_0,y_1,\dots,y_p$ is a Jordan chain for $K$.

Conversely, let now $\lambda_0$ be an eigenvalue of $K$. Since $K$ and $J$ have no common eigenvalues it follows that $\lambda_0I_N-J$ is invertible. Let $x_0$ be an eigenvector of $K$ corresponding to its eigenvalue $\lambda_0$. Then $(K-\lambda_0I_N)x_0=0$, so that
$$
(J-\lambda_0I_N)x_0=-z_1z_2x_0\tag{4.6}
$$
and it follows that
$$
F(\lambda_0)z_2x_0=I_{r_m}z_2x_0-z_2(\lambda_0I_N-J)^{-1}z_1z_2x_0,
$$
and using (4.6) this is equal to
$$
z_2x_0+z_2(\lambda_0I_N-J)^{-1}(J-\lambda_0I_N)x_0=0.
$$
In order to show that $\lambda_0$ is an eigenvalue of $F(\lambda)$ with eigenvector $z_2x_0$, it is sufficient to prove that this vector is not zero. Indeed, since $\lambda_0$ is not an eigenvalue of $J$ and $x_0\ne 0$, we have $(\lambda_0I_N-J)x_0\ne 0$, and it follows from (4.6) that $z_1z_2x_0\ne 0$, which implies $z_2x_0\ne 0$.


Consider now a Jordan chain $x_0,x_1,\dots,x_p$ of the matrix $K$ corresponding to its eigenvalue $\lambda_0$. Setting $x_{-1}=0$, the definition of Jordan chains for a matrix $K$ gives
$$
(K-\lambda_0I_N)x_k=(J+z_1z_2-\lambda_0I_N)x_k=x_{k-1},\qquad k=0,1,\dots,p,
$$
so that
$$
z_1z_2x_{k-j}=-(J-\lambda_0I_N)x_{k-j}+x_{k-j-1}.\tag{4.7}
$$
It follows that
$$
\sum_{j=0}^{k}\frac{1}{j!}F^{(j)}(\lambda_0)z_2x_{k-j}=I_{r_m}z_2x_k-z_2(\lambda_0I_N-J)^{-1}z_1z_2x_k-\sum_{j=1}^{k}(-1)^{j}z_2(\lambda_0I_N-J)^{-(j+1)}z_1z_2x_{k-j},
$$
and using (4.7) this is equal to
$$
\begin{aligned}
&z_2x_k+\sum_{j=0}^{k}(-1)^{j}z_2(\lambda_0I_N-J)^{-(j+1)}\bigl((J-\lambda_0I_N)x_{k-j}-x_{k-j-1}\bigr)\\
&\quad=z_2x_k-\sum_{j=0}^{k}(-1)^{j}z_2(\lambda_0I_N-J)^{-j}x_{k-j}+\sum_{j=0}^{k}(-1)^{j+1}z_2(\lambda_0I_N-J)^{-(j+1)}x_{k-j-1}.
\end{aligned}
$$
The term for $j=0$ in the first sum cancels with $z_2x_k$, while the term for $j=k$ in the second sum vanishes because it contains the auxiliary vector $x_{-1}=0$. Therefore
$$
\sum_{j=0}^{k}\frac{1}{j!}F^{(j)}(\lambda_0)z_2x_{k-j}=-\sum_{j=1}^{k}(-1)^{j}z_2(\lambda_0I_N-J)^{-j}x_{k-j}+\sum_{j=0}^{k-1}(-1)^{j+1}z_2(\lambda_0I_N-J)^{-(j+1)}x_{k-j-1}=0,
$$
since shifting the index $j\to j-1$ in the second sum turns it into the negative of the first. But this is precisely the definition of the fact that $z_2x_0,z_2x_1,\dots,z_2x_p$ is a Jordan chain for $F(\lambda)$.

We now prove that the correspondence established between the Jordan chains of $K$ and the Jordan chains of $F(\lambda)$ is onto. Indeed, if $\phi_0,\phi_1,\dots,\phi_p$ is a Jordan chain for $F(\lambda)$, then we build using (4.4) the Jordan chain $y_0,y_1,\dots,y_p$ of $K$, and we already know that $\phi_k=z_2y_k$, $k=0,1,\dots,p$. Hence the original chain $\phi_0,\phi_1,\dots,\phi_p$ of $F(\lambda)$ is the image of the Jordan chain $y_0,y_1,\dots,y_p$ of $K$, so that the correspondence is onto.

It remains only to prove that the correspondence established between the Jordan chains of $K$ and the Jordan chains of $F(\lambda)$ is one-to-one. To this end note first that two Jordan chains of $F(\lambda)$ of different lengths, each the image of a Jordan chain of $K$, come from different Jordan chains of $K$, since corresponding chains have the same length as the original chains for $K$. Note also


that Jordan chains which correspond to different eigenvalues are different and as such are counted twice (see the end of Section 2 for an explanation). It therefore remains to prove the claim for Jordan chains of the same length for the same eigenvalue.

Let $x_{j,0},x_{j,1},\dots,x_{j,p}$, $j=1,2$, be two Jordan chains of $K$ for the eigenvalue $\lambda_0$ and suppose that
$$
z_2x_{1,k}=z_2x_{2,k},\qquad k=0,1,\dots,p.
$$
We must prove that
$$
x_{1,k}=x_{2,k},\qquad k=0,1,\dots,p.
$$
We prove this by induction. For $k=0$ we have $Kx_{1,0}=\lambda_0x_{1,0}$ and also $Kx_{2,0}=\lambda_0x_{2,0}$, therefore
$$
(J-\lambda_0I_N)x_{1,0}=-z_1z_2x_{1,0}=-z_1z_2x_{2,0}=(J-\lambda_0I_N)x_{2,0},
$$
and since $\lambda_0$ is not an eigenvalue of $J$ it follows that $x_{1,0}=x_{2,0}$.

Suppose now that for a certain $k<p$ we know that $x_{1,k}=x_{2,k}$. Then
$$
(J-\lambda_0I_N)x_{1,k+1}=x_{1,k}-z_1z_2x_{1,k+1}=x_{2,k}-z_1z_2x_{2,k+1}=(J-\lambda_0I_N)x_{2,k+1},
$$
therefore $x_{1,k+1}=x_{2,k+1}$.

□
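Formula (4.5) can be illustrated numerically in the simplest setting. The sketch below assumes $r_m=1$, a real diagonal $J$ with distinct entries, and $z_2=z_1^{T}$ with nonzero components (the situation in which, by Lemma 4.3 below, $J$ and $K$ share no eigenvalue). It locates one zero $\lambda_0$ of the scalar function $F$ by bisection between two consecutive diagonal entries and checks that $y_0=(\lambda_0I-J)^{-1}z_1$ satisfies $Ky_0\approx\lambda_0y_0$. The bisection routine, the sample data and the tolerance are illustrative assumptions, not the method of the paper.

```python
def F(lam, dvals, z1, z2):
    """Scalar case of (4.3) for diagonal J = diag(dvals)."""
    return 1.0 - sum(s * t / (lam - dk) for dk, s, t in zip(dvals, z1, z2))

def bisect(f, lo, hi, iters=200):
    """Zero of an increasing function f on (lo, hi) with f < 0 near lo."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data: J = diag(1, 2, 3), z2 = z1^T.
dvals = [1.0, 2.0, 3.0]
z1 = [0.5, 1.0, 0.25]
z2 = [0.5, 1.0, 0.25]

# With z1_i z2_i > 0, F increases from -inf to +inf on (1, 2).
lam0 = bisect(lambda t: F(t, dvals, z1, z2), 1.0, 2.0)

# Eigenvector from (4.5) with phi_0 = 1.
y0 = [s / (lam0 - dk) for dk, s in zip(dvals, z1)]

# K y0 = J y0 + z1 (z2 y0), computed without forming K explicitly.
z2y0 = sum(t * y for t, y in zip(z2, y0))
Ky0 = [dk * y + s * z2y0 for dk, y, s in zip(dvals, y0, z1)]
residual = max(abs(u - lam0 * y) for u, y in zip(Ky0, y0))
```

Here `residual` should be at the level of rounding error; the point of the example is that only the scalar function $F$ is solved, never an $N\times N$ eigenproblem.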

If the matrices $J$ and $K$ in the above theorem have common eigenvalues, then all the other eigenvalues of $K$ still correspond to eigenvalues of $F(\lambda)$ and have the same multiplicity, while the eigenvalues of $K$ which were not found by solving the eigenproblem for $F(\lambda)$ are readily found among the eigenvalues of $J$.

4.2. Order one quasiseparable matrices with scalar entries

If $A$ is an order one quasiseparable matrix, or at least $r_m=1$, then the perturbations $V_1$ and $V_2$ from (3.3) are vectors, hence $z_1$, $z_2$ defined in (4.1) are vectors too, and the rational function $F(\lambda)$ from (4.3) is a scalar function. If $A$ is a matrix with scalar entries and with an order one quasiseparable representation, then its generators are complex numbers.

Proposition 4.2. Suppose that $J$ is an $N\times N$ matrix in Jordan canonical form, $z_1$ is a column vector and $z_2$ is a row vector, both of length $N$, and that the matrix $J$ has at least one eigenvalue of geometric multiplicity greater than one. Then $J$ and $K=J+z_1z_2$ have common eigenvalues.

Proof. We first build the function $F(\lambda)$ using (4.3). Since $r_m=1$, in the present case $F(\lambda)$ is a scalar function and $I_{r_m}$ is equal to 1, i.e.,
$$
F(\lambda)=1-z_2(\lambda I_N-J)^{-1}z_1.\tag{4.8}
$$
Suppose that the Jordan canonical matrix $J$ has in total $p$ Jordan chains which start with independent eigenvectors, over all the eigenvalues of $J$. Denote by


$k_1,k_2,\dots,k_p$ the lengths of these $p$ Jordan chains. Then it follows that
$$
k_1+k_2+\dots+k_p=N,\tag{4.9}
$$
which is the size of the square matrices $J$ and $K$. Denote by $\lambda_1,\lambda_2,\dots,\lambda_p$ the eigenvalues which correspond to the $p$ Jordan chains, respectively. Then we can write the matrix $J$, which is in Jordan canonical form, as a block diagonal matrix with the blocks
$$
\begin{pmatrix}
\lambda_j & 1 & 0 & \cdots & 0 & 0\\
0 & \lambda_j & 1 & \cdots & 0 & 0\\
0 & 0 & \lambda_j & \cdots & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & \lambda_j & 1\\
0 & 0 & 0 & \cdots & 0 & \lambda_j
\end{pmatrix}
$$
of size $k_j\times k_j$ for $j=1,2,\dots,p$. It follows that $(\lambda I_N-J)^{-1}$ is a block diagonal matrix with the upper triangular Toeplitz blocks
$$
\begin{pmatrix}
(\Lambda_j)^{-1} & -(\Lambda_j)^{-2} & \cdots & (-1)^{k_j-2}(\Lambda_j)^{-(k_j-1)} & (-1)^{k_j-1}(\Lambda_j)^{-k_j}\\
0 & (\Lambda_j)^{-1} & \cdots & (-1)^{k_j-3}(\Lambda_j)^{-(k_j-2)} & (-1)^{k_j-2}(\Lambda_j)^{-(k_j-1)}\\
0 & 0 & \cdots & (-1)^{k_j-4}(\Lambda_j)^{-(k_j-3)} & (-1)^{k_j-3}(\Lambda_j)^{-(k_j-2)}\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & (\Lambda_j)^{-1} & -(\Lambda_j)^{-2}\\
0 & 0 & \cdots & 0 & (\Lambda_j)^{-1}
\end{pmatrix}\tag{4.10}
$$
of size $k_j\times k_j$, for each $j=1,2,\dots,p$. Here in (4.10) $\Lambda_j$ denotes $\lambda-\lambda_j$.

The vector $(\lambda I_N-J)^{-1}z_1$, which appears in the definition (4.8) of the scalar function $F(\lambda)$, is now a column vector of length $N$, which we denote by
$$
w=(\lambda I_N-J)^{-1}z_1=\begin{pmatrix}w_{k_0+1} & w_2 & \cdots & w_{k_1} & w_{k_1+1} & \cdots & w_{k_1+k_2} & \cdots & w_{k_1+\dots+k_p}\end{pmatrix}^{T},\tag{4.11}
$$
where $k_0=0$. Its entries with indices $k_0+k_1+\dots+k_{j-1}+1$, $k_0+k_1+\dots+k_{j-1}+2$, $\dots$, $k_0+k_1+\dots+k_j$, for $j=1,2,\dots,p$, are equal to
$$
\Bigl(\ \sum_{i=1}^{k_j}(-1)^{i+1}(\lambda-\lambda_j)^{-i}\zeta_i\ \Big|\ \sum_{i=1}^{k_j-1}(-1)^{i+1}(\lambda-\lambda_j)^{-i}\zeta_{i+1}\ \Big|\ \cdots\ \Big|\ (\lambda-\lambda_j)^{-1}\zeta_{k_j-1}-(\lambda-\lambda_j)^{-2}\zeta_{k_j}\ \Big|\ (\lambda-\lambda_j)^{-1}\zeta_{k_j}\ \Bigr),
$$
where $\zeta_i$, $i=1,2,\dots,k_j$, denote the entries of the column vector $z_1$ as follows:
$$
\zeta_i=(z_1)_{k_1+k_2+\dots+k_{j-1}+i}.\tag{4.12}
$$

Now, from (4.11) and (4.12) it follows that
$$
z_2(\lambda I_N-J)^{-1}z_1=z_2w=\sum_{j=1}^{p}\sum_{i=1}^{k_j}(z_2)_{k_1+\dots+k_{j-1}+i}\sum_{l=1}^{k_j-i+1}(-1)^{l+1}(\lambda-\lambda_j)^{-l}(z_1)_{k_1+k_2+\dots+k_{j-1}+i+l-1}.
$$
By rearranging the order of the terms in the innermost sum we obtain that
$$
z_2(\lambda I_N-J)^{-1}z_1=\sum_{j=1}^{p}\sum_{q=1}^{k_j}\frac{c_{q,j}}{(\lambda-\lambda_j)^{q}},\tag{4.13}
$$
where $c_{q,j}$ (with $j=1,2,\dots,p$ and $q=1,2,\dots,k_j$) denote suitable complex numbers.

If the eigenvalues $\lambda_1,\lambda_2,\dots,\lambda_p$ corresponding to the $p$ Jordan chains of $J$ are all distinct, then the common denominator $p(\lambda)$ of all the fractions in (4.13) is
$$
p(\lambda)=(\lambda-\lambda_1)^{k_1}(\lambda-\lambda_2)^{k_2}\cdots(\lambda-\lambda_p)^{k_p},
$$
which by (4.9) is a polynomial of degree $k_1+k_2+\dots+k_p=N$. But if at least two of the eigenvalues, say $\lambda_{j_1}$ and $\lambda_{j_2}$, $j_1\ne j_2$, are equal, then in (4.13) at least one of the denominators appears twice, in our case $\lambda-\lambda_{j_1}=\lambda-\lambda_{j_2}$. Therefore the degree of the common denominator is less than $N$, namely at most $N-\min\{k_{j_1},k_{j_2}\}$.

Hence if one of the eigenvalues corresponds to more than one Jordan chain, then $z_2(\lambda I_N-J)^{-1}z_1$ from (4.13) is in fact equal to the ratio of two polynomials
$$
z_2(\lambda I_N-J)^{-1}z_1=\frac{r(\lambda)}{p(\lambda)},\tag{4.14}
$$
where $p(\lambda)$ is the common denominator, $\deg p(\lambda)\le N-1$ and $\deg r(\lambda)<\deg p(\lambda)$. By (4.8) and (4.14) it follows that
$$
F(\lambda)=1-\frac{r(\lambda)}{p(\lambda)}=\frac{p(\lambda)-r(\lambda)}{p(\lambda)}
$$
with $\deg(p(\lambda)-r(\lambda))=\max\{\deg p(\lambda),\deg r(\lambda)\}\le N-1$, since $\deg p(\lambda)\le N-1$. The number of eigenvalues (zeros) of the function $F(\lambda)$, counted with multiplicities, is therefore less than $N$.

On the other hand, the total multiplicity of the eigenvalues of the $N\times N$ matrix $K$ is $N$, since the characteristic polynomial of $K$ has degree $N$; so it is not equal to the total multiplicity of the zeros of $F(\lambda)$ but strictly larger. This contradicts the conclusion of Theorem 4.1, which means that the theorem cannot be applied to the matrices $K$ and $J$, and so $J$ and $K$ must have common eigenvalues. □


4.3. Order one quasiseparable matrix 𝑨 and diagonalizable matrices 𝑩 and 𝑪 with distinct eigenvalues

Suppose further that the Jordan matrix $J$ is in fact diagonal, and denote it by $D$. Suppose also that the geometric multiplicity of its eigenvalues is one. This condition requires that the smaller matrices $B$ and $C$ are diagonalizable and that all their eigenvalues are distinct. In this case

$$F(\lambda) = 1 + \frac{(z_2)_1 (z_1)_1}{d_1 - \lambda} + \frac{(z_2)_2 (z_1)_2}{d_2 - \lambda} + \dots + \frac{(z_2)_N (z_1)_N}{d_N - \lambda} \tag{4.15}$$

where $(z_1)_i$, $(z_2)_i$, $i = 1, \dots, N$, are the components of the vectors $z_1$, $z_2$ and $d_N < d_{N-1} < d_{N-2} < \dots < d_2 < d_1$ are the distinct diagonal entries of the diagonal matrix $D$. The next Lemma 4.3 gives a sufficient condition under which Theorem 4.1 applies.

Lemma 4.3. Let $D$ be a diagonal $N \times N$ complex matrix and let $z_1$, $z_2$ be vectors with $N$ complex components each. Suppose that $D$ has no equal diagonal entries and that $z_1$, $z_2$ have no zero components. Then $D$ and the matrix $K = D + z_1 z_2$ given by (4.2) have no common eigenvalues.

Proof. Suppose on the contrary that $D$ and $K$ have a common eigenvalue $\lambda$ and that $v$ is an eigenvector of $K$ corresponding to this eigenvalue. Then

$$Kv = (D + z_1 z_2)v = \lambda v \tag{4.16}$$

and $v \neq 0$. Since $D$ is a diagonal matrix, $D = \operatorname{diag}(d_1, d_2, \dots, d_N)$, it follows that its eigenvalue $\lambda$ is one of the entries $d_i$, $1 \le i \le N$. Then, if $e_i$ is the corresponding vector in the standard basis of $\mathbb{C}^N$, we have that

$$e_i^* (D - d_i) w = 0 \tag{4.17}$$

for any vector $w$. By (4.16),

$$0 = Kv - \lambda v = (D - \lambda)v + z_1 (z_2 v) = (D - d_i)v + z_1 (z_2 v) \tag{4.18}$$

and by (4.17)

$$0 = e_i^* \big( (D - d_i)v + z_1 (z_2 v) \big) = 0 + e_i^* z_1 (z_2 v) = (z_1)_i\, z_2 v, \tag{4.19}$$

where (𝑧1 )𝑖 is the component 𝑖 of the vector 𝑧1 , which cannot be zero by the assumptions of the lemma. It follows from (4.19) that 𝑧2 𝑣 = 0.

(4.20)

But then (4.18) shows that (𝐷 − 𝑑𝑖 )𝑣 = 0, therefore 𝑣 is also an eigenvector of 𝐷 for the same eigenvalue 𝜆. Hence 𝑣 = 𝛼𝑒𝑖 for a complex scalar 𝛼 ∕= 0. Therefore 𝑧2 𝑣 = 𝛼(𝑧2 )𝑖 , where (𝑧2 )𝑖 is the component 𝑖 of the vector 𝑧2 , which must be non-zero. Therefore 𝑧2 𝑣 ∕= 0 which is in contradiction with (4.20). □
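The role of the hypothesis on the components can be seen in a small worked example. The matrices below are illustrative choices, not taken from the text: $D$ has distinct diagonal entries, but $z_1$ has a zero component, and the conclusion of Lemma 4.3 fails.

```latex
% Illustrative 2x2 example (hypothetical data): z_1 has a zero second component.
\[
D = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \qquad
z_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad
z_2 = \begin{pmatrix} 1 & 1 \end{pmatrix},
\]
\[
K = D + z_1 z_2
  = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}
  + \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}
  = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}.
\]
% K is upper triangular, so its only eigenvalue is 2, which is also the
% eigenvalue d_2 = 2 of D: the two matrices share an eigenvalue.
```

Here the index $i = 2$ is exactly the one with $(z_1)_2 = 0$, which is the case excluded in the step from (4.19) to (4.20).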


5. Complete algorithm for Hermitian matrices

5.1. Hermitian order one quasiseparable matrix 𝑨

If the initial matrix $A$ is also Hermitian and its quasiseparable generators satisfy (3.9), then by Corollary 3.2 the smaller matrices $B$ and $C$ are Hermitian too. In this case the results proved in [7] and references therein for the special case of a tridiagonal symmetric matrix $A$ can be generalized to the larger context of order one quasiseparable Hermitian matrices.

Suppose that for the divided matrices $B$ and $C$, of sizes $m \times m$ and $(N - m) \times (N - m)$ respectively, we already have their Schur decompositions, i.e., we have $m \times m$ and $(N - m) \times (N - m)$ unitary matrices $Q_B$ and $Q_C$, respectively, so that

$$Q_B^* B Q_B = D_B \quad \text{and} \quad Q_C^* C Q_C = D_C,$$

where the matrices $D_B$ and $D_C$ are diagonal. We must compute the spectral data of the twice larger matrix $A$ which satisfies (3.2) with the known column vector $V_1$ and row vector $V_2 = V_1^*$ given by (3.11). If we denote

$$U = \begin{pmatrix} Q_B & 0 \\ 0 & Q_C \end{pmatrix}$$

then $U$ is unitary and

$$U^* A U = U^* \left( \begin{pmatrix} B & 0 \\ 0 & C \end{pmatrix} + V_1 V_2 \right) U = D + z_1 z_2,$$

where

$$D = \begin{pmatrix} D_B & 0 \\ 0 & D_C \end{pmatrix},$$

while $z_1 = U^* V_1$ and $z_2 = z_1^* = V_2 U$ are, respectively, a column vector, which we will also denote by $z$, and a row vector, which is in fact $z^*$. We must now find a unitary $V$ which brings the matrix $K$ from (4.2), which now becomes $K = D + z z^*$, to its diagonal form, i.e., such that

$$V^* (D + z z^*) V = D_A,$$

where $D_A$ is the diagonal matrix in the Schur decomposition of the original matrix $A$. We then set $P = UV$ to obtain $P^* A P = D_A$.

In the case when the conditions of Lemma 4.3 are fulfilled and $A$ is also a Hermitian matrix, it follows that the vector $z$ has no zero components. In this case, the rational scalar function $F(\lambda)$ becomes

$$F(\lambda) = 1 + \frac{|z_1|^2}{d_1 - \lambda} + \frac{|z_2|^2}{d_2 - \lambda} + \dots + \frac{|z_N|^2}{d_N - \lambda} \tag{5.1}$$

where $z_i$ are the components of the vector $z$ and

$$d_N > d_{N-1} > d_{N-2} > \dots > d_2 > d_1$$

are the distinct diagonal entries of the diagonal matrix $D$. Because $A$ is a Hermitian matrix and $z \neq 0$, the derivative of $F(\lambda)$ is positive between the poles $d_i$, $i = 1, \dots, N$:

$$F'(\lambda) = \frac{|z_1|^2}{(d_1 - \lambda)^2} + \frac{|z_2|^2}{(d_2 - \lambda)^2} + \dots + \frac{|z_N|^2}{(d_N - \lambda)^2} \tag{5.2}$$

so that $F(\lambda)$ is monotone between its poles. Moreover, $F(\lambda)$ takes all the real values between each two consecutive poles, including the value zero. It follows that $F(\lambda)$ has exactly $N$ roots $\lambda_i$, $i = 1, \dots, N$, and they satisfy

$$d_N + z^* z > \lambda_N > d_N > \lambda_{N-1} > d_{N-1} > \dots > d_2 > \lambda_1 > d_1. \tag{5.3}$$
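The interlacing of roots and poles suggests a simple, if slow, way to locate the zeroes of $F$: bisection on each interval between consecutive poles. The sketch below is only an illustration under stated assumptions (real, strictly increasing `d`, no zero entry in `z`); the function names `secular` and `secular_roots` are hypothetical and this is not one of the methods of [2] or [13].

```python
def secular(lam, d, z):
    """F(lam) = 1 + sum_j |z_j|^2 / (d_j - lam), the rational function (5.1)."""
    return 1.0 + sum(abs(zj) ** 2 / (dj - lam) for dj, zj in zip(d, z))


def secular_roots(d, z, iters=200):
    """Roots of F by bisection.

    Assumes d is strictly increasing and no z_j is zero, so that each
    interval (d_i, d_{i+1}) and (d_N, d_N + z*z] contains exactly one root
    and F runs from -infinity up to a non-negative value on it.
    """
    n = len(d)
    norm2 = sum(abs(zj) ** 2 for zj in z)
    roots = []
    for i in range(n):
        lo = d[i]
        hi = d[i + 1] if i + 1 < n else d[-1] + norm2
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:   # interval cannot shrink any further
                break
            if secular(mid, d, z) < 0.0:
                lo = mid                 # root lies to the right of mid
            else:
                hi = mid                 # root lies to the left of (or at) mid
        roots.append(0.5 * (lo + hi))
    return roots


if __name__ == "__main__":
    d = [1.0, 2.0, 4.0]
    z = [0.5, 1.0, 0.5]
    print(secular_roots(d, z))
```

A convenient sanity check is the trace: the roots are the eigenvalues of $D + zz^*$, so their sum should equal $\sum_i d_i + z^*z$.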

Moreover, from (4.5) we have that the eigenvectors corresponding to the eigenvalues $\lambda_i$, $i = 1, \dots, N$, are

$$v_1 = \begin{pmatrix} \dfrac{z_1}{d_1 - \lambda_1} \\ \dfrac{z_2}{d_2 - \lambda_1} \\ \vdots \\ \dfrac{z_N}{d_N - \lambda_1} \end{pmatrix}, \quad
v_2 = \begin{pmatrix} \dfrac{z_1}{d_1 - \lambda_2} \\ \dfrac{z_2}{d_2 - \lambda_2} \\ \vdots \\ \dfrac{z_N}{d_N - \lambda_2} \end{pmatrix}, \quad \dots, \quad
v_N = \begin{pmatrix} \dfrac{z_1}{d_1 - \lambda_N} \\ \dfrac{z_2}{d_2 - \lambda_N} \\ \vdots \\ \dfrac{z_N}{d_N - \lambda_N} \end{pmatrix} \tag{5.4}$$

which must be normalized to obtain the desired orthogonal matrix $V$.

5.2. The rational function approximation method and the convexifying method for finding zeroes in (5.1)

In order to find the zeroes of the function $F(\lambda)$ which appeared in (5.1) we will now summarize, for completeness, two known methods: the local approximation in the region of a root by simple rational functions whose zeroes are easy to compute and, as a main method, the improved Newton method, i.e., the use of convexifying transformations which precede the search for a root. These methods, which are due to Bunch, Nielsen and Sorensen [2] and to Melman [13], respectively, have been especially conceived for rational functions of this type. In both methods, for finding the $i$th root of

$$F(\lambda) = 1 + \frac{|z_1|^2}{d_1 - \lambda} + \frac{|z_2|^2}{d_2 - \lambda} + \dots + \frac{|z_N|^2}{d_N - \lambda}$$

where $i = 1, 2, \dots, N - 1$, a linear change of variables

$$\mu = \lambda - d_i \tag{5.5}$$

is performed first. (Note that the case $i = N$ needs a different treatment, as (5.3) suggests.) This change of variables has numerical advantages for the accurate determination of the updated eigenvectors. After (5.5) the problem becomes to find the zero $\mu_i$ of the function

$$F_i(\mu) = 1 + \frac{|z_1|^2}{\delta_1 - \mu} + \frac{|z_2|^2}{\delta_2 - \mu} + \dots + \frac{|z_N|^2}{\delta_N - \mu} \tag{5.6}$$

where $\delta_j = d_j - d_i$, $j = 1, 2, \dots, N$, and the root we look for must lie in the interval

$$0 < \mu_i < \min\Big( \delta_{i+1},\ 1 - \sum_{j=1}^{i-1} \mu_j \Big). \tag{5.7}$$

The method in [2], which is also recommended in [7], is the following. Denote

$$\Psi(t) = \sum_{j=1}^{i} \frac{|z_j|^2}{\delta_j - t}, \qquad \Phi(t) = \sum_{j=i+1}^{N} \frac{|z_j|^2}{\delta_j - t}. \tag{5.8}$$

Then (5.6) becomes

$$-\Psi(\mu_i) = \Phi(\mu_i) + 1 \tag{5.9}$$

and both sides are convex, but the left side is decreasing and the right side is increasing on (5.7). In order to find the root, suppose that at a certain stage we already have the approximation $t_k$ between $0$ and $\mu_i$. The problem is to find a $t_{k+1} \in (t_k, \mu_i)$, i.e., a better approximation. To this end, the two functions in (5.8) are approximated by the interpolating simpler rational functions $\frac{p}{q - t}$ and $r + \frac{s}{\delta_{i+1} - t}$ such that

$$\frac{p}{q - t_k} = \Psi(t_k), \qquad r + \frac{s}{\delta_{i+1} - t_k} = \Phi(t_k),$$

$$\frac{p}{(q - t_k)^2} = \Psi'(t_k), \qquad \frac{s}{(\delta_{i+1} - t_k)^2} = \Phi'(t_k).$$

It is easy to compute $p, q, r, s$ and then to solve the quadratic equation

$$-\frac{p}{q - t_{k+1}} = 1 + r + \frac{s}{\delta_{i+1} - t_{k+1}},$$

which is an approximation of (5.9). In fact,

$$t_{k+1} = t_k + \frac{2b}{a + \sqrt{a^2 - 4b}}, \tag{5.10}$$

where

$$a = \frac{\Delta(1 + \Phi_k) + \Psi_k^2 / \Psi_k'}{c} + \frac{\Psi_k}{\Psi_k'}, \qquad b = \frac{\Delta\, w\, \Psi_k}{\Psi_k'\, c},$$

$$c = 1 + \Phi_k - \Delta \Phi_k', \qquad w = 1 + \Phi_k + \Psi_k, \qquad \Delta = \delta_{i+1} - t_k,$$

$$\Phi_k = \Phi(t_k), \quad \Psi_k = \Psi(t_k), \quad \Phi_k' = \Phi'(t_k), \quad \Psi_k' = \Psi'(t_k).$$

The reasons for arranging the calculations in this way are: $w$ must be computed anyway for a convergence test, cancellation is minimized, and $t_{k+1}$ has an unambiguous sign. In [2] it is proved that, starting with any $0 < t_0 < \mu_i$, the sequence obtained recursively by (5.10) increases to $\mu_i$ and converges quadratically (namely $|t_{k+1} - \mu_i| = O(|t_k - \mu_i|^2)$).

Finally, the case when $i = N$ and we look for the last root is treated. In this case equation (5.9) becomes $-\Psi(t) = 1$ and accordingly the iterations for obtaining $t_k$ are simpler:

$$t_{k+1} = t_k + \frac{(1 + \Psi_k)\Psi_k}{\Psi_k'}.$$
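The simplified iteration for the last root is short enough to try directly. The following is a minimal sketch, assuming real, strictly increasing poles, a nonzero last component of `z`, and the shift $\lambda = d_N + t$ so that all shifted poles $\delta_j = d_j - d_N$ are non-positive; the name `last_root`, the starting point and the stopping rule are illustrative choices, not taken from [2].

```python
def last_root(d, z, tol=1e-14, max_iter=100):
    """Largest zero of F(lam) = 1 + sum_j |z_j|^2 / (d_j - lam).

    Works in the shifted variable t = lam - d_N and iterates
    t_{k+1} = t_k + (1 + Psi_k) * Psi_k / Psi'_k,
    where Psi(t) = sum_j |z_j|^2 / (delta_j - t) and delta_j = d_j - d_N.
    """
    delta = [dj - d[-1] for dj in d]          # delta_j <= 0, delta_N = 0
    w2 = [abs(zj) ** 2 for zj in z]

    def psi(t):
        return sum(w / (dl - t) for w, dl in zip(w2, delta))

    def dpsi(t):
        return sum(w / (dl - t) ** 2 for w, dl in zip(w2, delta))

    # Every term of Psi is negative for t > 0, so Psi(t) <= -|z_N|^2 / t.
    # Hence 1 + Psi(t0) < 0 at t0 = |z_N|^2 / 2, i.e. t0 lies below the root.
    t = 0.5 * w2[-1]
    for _ in range(max_iter):
        p = psi(t)
        step = (1.0 + p) * p / dpsi(t)
        t += step
        if abs(step) <= tol * abs(t):
            break
    return d[-1] + t


if __name__ == "__main__":
    print(last_root([1.0, 2.0, 4.0], [0.5, 1.0, 0.5]))
```

The update is what one gets by interpolating $\Psi$ at $t_k$ with $p/(q - t)$ and solving $-p/(q - t) = 1$ exactly, which is why no quadratic equation appears in this case.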

We will now describe a method proposed by Melman in [13]. This method is faster, which is important since it has to be used numerous times. It first performs a further transformation of variables, besides (5.5), after which the function becomes one for which both the Newton method and the secant method converge from any suitably chosen initial point, and do so faster. More specifically, a class of transformations of variables is considered which change the function into a convex one. These transformations must be twice continuously differentiable and also proper, i.e., they are one-to-one and their range (possibly including $\infty$) is sufficient to cover the values of the original variable. Such a transformation is for instance $w(\gamma) = \gamma^p$ for $0 < p \le 1$.

It is shown that if $w''(\gamma) \le 0$ for all $\gamma$ such that $w(\gamma) > \frac{1}{\delta_{i+1}}$, then the function $F_i(\mu)$ from (5.6) becomes a convex function $F_i\big(\frac{1}{w(\gamma)}\big)$. It is also shown that if $F(x)$ is convex and decreasing (respectively increasing) on a closed interval $[a, b]$ and $F(a)F(b) < 0$, then Newton's method converges monotonically to the unique solution $x^*$ of $F(x) = 0$ from any initial point in $[a, x^*]$ (respectively $[x^*, b]$). Moreover, denote the unique solution of $F_i\big(\frac{1}{w(\gamma)}\big) = 0$ by $\gamma^*$ and suppose that $w'(\gamma)$ also has a constant sign for each $\gamma$ such that $w(\gamma) > \frac{1}{\delta_{i+1}} > 0$. Then Newton's method applied to the function $F_i\big(\frac{1}{w(\gamma)}\big)$ in this interval converges monotonically from any point

$$\zeta_0 \in \Big( w^{-1}\Big(\frac{1}{\delta_{i+1}}\Big),\ \gamma^* \Big) \tag{5.11}$$

or from any $\zeta_0 \in \big[\gamma^*, w^{-1}\big(\frac{1}{\delta_{i+1}}\big)\big)$, depending on whether $w$ is increasing or decreasing. Suppose that $w$ is increasing and that we start from a point $\zeta_0$ as in (5.11). Denote

$$R_i(\gamma) = 1 + \sum_{j=1,\, j \neq i}^{N} \frac{|z_j|^2}{\delta_j} - |z_i|^2 w(\gamma) + \sum_{j=1,\, j \neq i, i+1}^{N} \frac{\big( |z_j| / \delta_j \big)^2}{w(\gamma) - \frac{1}{\delta_j}} \tag{5.12}$$

which is what remains of $F_i\big(\frac{1}{w(\gamma)}\big)$ after its dominant, most troublesome part

$$D_i(\gamma) = \frac{\big( |z_{i+1}| / \delta_{i+1} \big)^2}{w(\gamma) - \frac{1}{\delta_{i+1}}} \tag{5.13}$$

is deleted. Then a sequence $\zeta_k$ which converges to the root $\gamma^*$ faster than the Newton method starting from the same $\zeta_0$ satisfies

$$R_i(\zeta_k) + R_i'(\zeta_k)(\zeta_{k+1} - \zeta_k) + D_i(\zeta_{k+1}) = 0,$$

where $R_i(\gamma)$, $D_i(\gamma)$ have been defined in (5.12), (5.13) and $\zeta_{k+1}$ stays in the interval defined in (5.11).

For $i = N$, the function $F_N\big(\frac{1}{w(\gamma)}\big)$ is almost the same but, since $\mu = \frac{1}{w(\gamma)}$, it is defined on $[w^{-1}(0), \infty)$ and $w^{-1}(0)$ can be a starting point (in which the function equals 1). Its dominant part is

$$D_N(\gamma) = \frac{\big( |z_1| / \delta_1 \big)^2}{w(\gamma) - \frac{1}{\delta_1}}$$

instead of the formula in (5.13).

5.3. Repeated diagonal entries and zero components for 𝒛

In applying Lemma 5.1 below we need to determine, up to machine precision, when two diagonal entries of $D$ are distinct and when an entry of $z$ is not zero. To this end, suppose that $tol$ is a small multiple of the machine precision, for instance $tol = u(\|D\|_2 + \|z\|_2)$. By Lemma 5.1 we can determine an orthogonal matrix $Q_1$ and an integer $1 \le n \le N$ such that $Q_1^T D Q_1 = \operatorname{diag}(\mu_1, \dots, \mu_N)$ (zeroes are up to $tol$) and a vector $w = Q_1^T z$ such that $\mu_{i+1} - \mu_i \ge tol$ for $i = 1, \dots, n - 1$, $|w_i| \ge tol$ for $1 \le i \le n$ and $|w_i| < tol$ otherwise.

The next Lemma 5.1 shows that one can relax the conditions that $z$ has only non-zero components and that the diagonal elements of $D$ are all distinct from one another.

Lemma 5.1. Let $D = \operatorname{diag}(d_1, d_2, \dots, d_N)$ be a diagonal real matrix and let $z$ be a vector with $N$ components. Then there exists a unitary matrix $Q_1$ such that if $Q_1^* D Q_1 = \operatorname{diag}(\mu_1, \mu_2, \dots, \mu_N)$ and $w = Q_1^* z$ then $\mu_1 < \mu_2 < \dots < \mu_n \le \mu_{n+1} \le \dots \le \mu_N$, $w_i \neq 0$, $i = 1, \dots, n$, and $w_i = 0$, $i = n + 1, \dots, N$.

The proof is the same as the proof in [7], p. 463, which is made there for the tridiagonal case.

5.4. Diagonalizing 𝑫 + 𝒛𝒛 ∗

In order to find the orthogonal matrix $V$ in the Schur decomposition of a Hermitian order one quasiseparable matrix $A$ we compute $V = Q_1 Q_2$. The orthogonal matrix $Q_1$ is given by Lemma 5.1. We then take as a new $D$ the matrix $Q_1^* D Q_1$ and we take $w$ as a new $z$. It follows that the first $n$ entries of the diagonal matrix $D$ are in strictly increasing order and that the first $n$ entries of $z$ are nonzero. We proceed with finding the $n \times n$ matrix $\tilde{Q}_2$ such that

$$\tilde{Q}_2^* \big( D(1:n, 1:n) + z(1:n)\, z^*(1:n) \big) \tilde{Q}_2 = \operatorname{diag}(\lambda_1, \dots, \lambda_n). \tag{5.14}$$

We can therefore apply Theorem 4.1 to an $n \times n$ problem, so that we must first determine the $n$ distinct zeroes of the rational function $F(\lambda)$ in (5.1), but with only $n$ poles. If $\lambda_i$, $i = 1, \dots, n$, are these zeroes then the $i$th column of $\tilde{Q}_2$ is found by normalizing $v_i$, $i = 1, \dots, n$, from (5.4). Finally, we consider $Q_2 = \operatorname{diag}(\tilde{Q}_2, I_{N-n})$. Thus we obtain the following algorithm.

Algorithm 5.2. Let $D = \operatorname{diag}(d_1, d_2, \dots, d_N)$ be a diagonal real matrix and let $z = (z_i)_{i=1}^N$ be a column vector. Then the unitary matrix $V$ and the real diagonal matrix $\Lambda$ such that $D + zz^* = V \Lambda V^*$ are obtained by the following algorithm.

1. Determine the number $n$ of distinct diagonal entries for $D$, the matrix $Q_1$ such that
$$\tilde{D} = Q_1^* D Q_1 = \operatorname{diag}(\mu_1, \mu_2, \dots, \mu_N)$$
with $\mu_1 < \mu_2 < \dots < \mu_n \le \mu_{n+1} \le \dots \le \mu_N$, and the vector $Q_1^* z = w = (w_i)_{i=1}^N$ with the first $n$ entries different from zero and $w_{n+1} = \dots = w_N = 0$, as in Lemma 5.1.

2.1. Set $\hat{D} = \operatorname{diag}(\mu_1, \mu_2, \dots, \mu_n)$, $\hat{w} = (w_i)_{i=1}^n$ and, using one of the iteration methods which have been described in Subsection 5.2, compute the $n$ eigenvalues $\lambda_1, \dots, \lambda_n$ of the matrix $\hat{D} + \hat{w}(\hat{w})^*$, with $n$ instead of $N$.

2.2. Find $n$ eigenvectors $v_1, v_2, \dots, v_n$ with formula (5.4).

2.3. Compute the normalized eigenvectors $v_1^{(0)}, v_2^{(0)}, \dots, v_n^{(0)}$ by dividing each of $v_1, v_2, \dots, v_n$ by its Euclidean norm, which (with $\hat{D}$, $\hat{w}$ in place of $D$, $z$) is given by
$$\|v_i\|_2 = \left( \frac{|z_1|^2}{(d_1 - \lambda_i)^2} + \frac{|z_2|^2}{(d_2 - \lambda_i)^2} + \dots + \frac{|z_n|^2}{(d_n - \lambda_i)^2} \right)^{1/2}, \qquad i = 1, \dots, n. \tag{5.15}$$

3. Set
$$\hat{Q}_2 = \begin{bmatrix} v_1^{(0)} & v_2^{(0)} & \dots & v_n^{(0)} \end{bmatrix}, \qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n, \mu_{n+1}, \dots, \mu_N)$$
and compute $V = Q_1 \hat{Q}_2$.

5.5. The complete algorithm

Now we are in a position to present the complete divide and conquer algorithm to compute the eigendecomposition of a Hermitian matrix with a quasiseparable representation of order one.

Algorithm 5.3. Let $A = \{A_{ij}\}_{i,j=1}^N$ be an $N \times N$ Hermitian matrix, where $N = 2^M$, with lower quasiseparable generators $p(i)$ $(i = 2, \dots, N)$, $q(j)$ $(j = 1, \dots, N - 1)$, $a(k)$ $(k = 2, \dots, N - 1)$ of order one and diagonal entries $d(k)$ $(k = 1, \dots, N)$. Then the $N$ eigenvalues $\lambda_N < \lambda_{N-1} < \dots < \lambda_1$ of $A$ and a unitary matrix $P$ such that

$$P^* A P = \operatorname{diag}(\lambda_N, \lambda_{N-1}, \dots, \lambda_1)$$

are obtained by the following algorithm.

1. For performing the divide step. Set
$$p^{(0,1)}(i) = p(i),\ i = 2, \dots, N, \qquad q^{(0,1)}(j) = q(j),\ j = 1, \dots, N - 1,$$
$$a^{(0,1)}(k) = a(k),\ k = 2, \dots, N - 1, \qquad d^{(0,1)}(k) = d(k),\ k = 1, \dots, N.$$
For $n = 1, \dots, M$ perform the following. Set $s = 2^{n-1}$, $m = 2^{M-n}$. For $j = 1, 2, \dots, 2s$, using lower quasiseparable generators $p^{(n-1,j)}(i)$ $(i = 2, \dots, 2m)$, $q^{(n-1,j)}(i)$ $(i = 1, \dots, 2m - 1)$, $a^{(n-1,j)}(i)$ $(i = 2, \dots, 2m - 1)$ and diagonal entries $d^{(n-1,j)}(i)$ $(i = 1, \dots, 2m)$ of the matrix $A^{(n-1,j)}$, compute via Algorithm 3.5 lower quasiseparable generators $p^{(n,2j-1)}(i)$, $p^{(n,2j)}(i)$ $(i = 2, \dots, m)$, $q^{(n,2j-1)}(i)$, $q^{(n,2j)}(i)$ $(i = 1, \dots, m - 1)$, $a^{(n,2j-1)}(i)$, $a^{(n,2j)}(i)$ $(i = 2, \dots, m - 1)$ and diagonal entries $d^{(n,2j-1)}(i)$, $d^{(n,2j)}(i)$ $(i = 1, \dots, m)$ of the matrices $A^{(n,2j-1)}$, $A^{(n,2j)}$ and the vectors $y^{(n-1,j)}$ such that
$$A^{(n-1,j)} = \begin{pmatrix} A^{(n,2j-1)} & 0 \\ 0 & A^{(n,2j)} \end{pmatrix} + y^{(n-1,j)} \big( y^{(n-1,j)} \big)^*.$$

2. For performing the conquer step. Set
$$\Lambda^{(0,t)} = d^{(M,t)}(t), \qquad P^{(0,t)} = 1, \qquad t = 1, \dots, N.$$
For $n = 1, \dots, M$ perform the following. For $j = 1, 2, \dots, 2^n$ perform the following.

2.1. Compute
$$z^{(n,j)} = \begin{pmatrix} P^{(n-1,2j-1)} & 0 \\ 0 & P^{(n-1,2j)} \end{pmatrix} y^{(n,j)}$$
and set $D^{(n,j)} = \Lambda^{(n-1,2j-1)} \oplus \Lambda^{(n-1,2j)}$.

2.2. Using Algorithm 5.2 determine the eigendecomposition
$$D^{(n,j)} + z^{(n,j)} \big( z^{(n,j)} \big)^* = V^{(n,j)} \Lambda^{(n,j)} \big( V^{(n,j)} \big)^*$$
with a unitary matrix $V^{(n,j)}$ and a real diagonal matrix $\Lambda^{(n,j)}$.

2.3. Compute
$$P^{(n,j)} = \begin{pmatrix} P^{(n-1,2j-1)} & 0 \\ 0 & P^{(n-1,2j)} \end{pmatrix} V^{(n,j)}.$$

3. Set $P = P^{(M,1)}$, $\Lambda = \Lambda^{(M,1)}$.
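The kernel of the conquer step is the eigendecomposition of a diagonal matrix plus a rank-one Hermitian term. The sketch below illustrates that kernel only, under simplifying assumptions that are not part of the algorithm as stated: no deflation (all `d` distinct and increasing, no zero entry in `z`), and plain bisection in place of the root finders of Subsection 5.2. The name `rank_one_eig` is hypothetical.

```python
import math


def rank_one_eig(d, z, iters=200):
    """Eigenpairs of diag(d) + z z^* for strictly increasing d and nonzero z_j.

    Returns (lams, vecs): the eigenvalues in increasing order and the
    corresponding unit-norm eigenvectors as lists of components.
    """
    n = len(d)
    w2 = [abs(zj) ** 2 for zj in z]

    def F(lam):
        # Secular function 1 + sum_j |z_j|^2 / (d_j - lam).
        return 1.0 + sum(w / (dj - lam) for w, dj in zip(w2, d))

    lams, vecs = [], []
    for i in range(n):
        lo = d[i]
        hi = d[i + 1] if i + 1 < n else d[-1] + sum(w2)
        for _ in range(iters):            # bisection on the i-th interval
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:
                break
            if F(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        lam = 0.5 * (lo + hi)
        # Unnormalised eigenvector with entries z_j / (d_j - lam).
        v = [zj / (dj - lam) for zj, dj in zip(z, d)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in v))
        lams.append(lam)
        vecs.append([x / norm for x in v])
    return lams, vecs


if __name__ == "__main__":
    lams, vecs = rank_one_eig([1.0, 2.0, 4.0], [0.5, 1.0, 0.5])
    print(lams)
```

Eigenvectors formed this way from computed roots can lose orthogonality when a root lies very close to a pole; that issue is outside the scope of this sketch.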


6. Conclusions

We studied the divide and conquer method used for solving the eigenproblem of large matrices with quasiseparable representations. We analyzed the divide step and the conquer step for matrices with arbitrary quasiseparable order. In the conquer step we proved that, in order to reconstruct the eigendata of a larger matrix from that of the two smaller matrices, we have to solve the eigenproblem for an $r \times r$ rational matrix function, where $r$ is the quasiseparable order of the matrix. We gave the complete algorithm for our method in the case of Hermitian matrices of quasiseparable order one. In a future work we will show that the results known in the literature for the eigenproblem of unitary Hessenberg matrices can also be obtained as a particular case of our method, and we will perform numerical tests.

References

[1] T. Bella, Y. Eidelman, I. Gohberg and V. Olshevsky, Computations with quasiseparable polynomials and matrices, Theoretical Computer Science 409: 158–179 (2008).
[2] J.R. Bunch, C.P. Nielsen and D.C. Sorensen, Rank-one modification of the symmetric eigenproblem, Numer. Math. 31: 31–48 (1978).
[3] J. Cuppen, A divide and conquer method for symmetric tridiagonal eigenproblem, Numerische Mathematik 36: 177–195 (1981).
[4] J.J. Dongarra and M. Sidani, A parallel algorithm for the non-symmetric eigenvalue problem, Report CS-91-137, University of Tennessee, Knoxville (1991); SIAM J. Sci. Comput. 14: 542–569 (1993).
[5] Y. Eidelman, I. Gohberg and I. Haimovici, Separable type representations of matrices and fast algorithms, to appear.
[6] Y. Eidelman, I. Gohberg and V. Olshevsky, Eigenstructure of Order-One-Quasiseparable Matrices. Three-term and Two-term Recurrence Relations, Linear Algebra and its Applications 405: 1–40 (2005).
[7] G.H. Golub and C.F. Van Loan, Matrix Computations, Johns Hopkins, Baltimore 1989.
[8] M. Gu and S. Eisenstat, A divide-and-conquer algorithm for the symmetric tridiagonal eigenproblem, SIAM Journal on Matrix Analysis and Applications 16: 172–191 (1995).
[9] M. Gu, R. Guzzo, X.-B. Chi and X.-Q. Cao, A divide-and-conquer algorithm for the symmetric tridiagonal eigenproblem, SIAM Journal on Matrix Analysis and Applications 25: 385–404 (2003).
[10] I. Haimovici, Operator Equations and Bezout Operators for Analytic Operator Functions, Ph.D. Thesis, Technion, Haifa, 1991.
[11] E.R. Jessup, A case against a divide and conquer approach to the non-symmetric eigenvalue problem, Applied Numerical Mathematics 12: 403–420 (1993).
[12] N. Mastronardi, E. Van Camp and M. Van Barel, Divide and conquer algorithms for computing the eigendecomposition of symmetric diagonal-plus-semiseparable matrices, Numerical Algorithms, 9: 379–398 (2005).
[13] A. Melman, Numerical solution of a secular equation, Numer. Math. 69: 483–493 (1995).
[14] L. Rodman and M. Schaps, On the partial multiplicities of a product of two matrix polynomials, Integral Equations and Operator Theory, Volume 2, Number 4, 565–599 (1979).
[15] R. Vandebril, M. Van Barel and N. Mastronardi, Matrix computations and semiseparable matrices: Eigenvalue and singular value methods, The Johns Hopkins University Press (2008).

Y. Eidelman and I. Haimovici
School of Mathematical Sciences
Raymond and Beverly Sackler Faculty of Exact Sciences
Tel-Aviv University
Ramat-Aviv 69978, Israel
e-mail: [email protected] [email protected]

Operator Theory: Advances and Applications, Vol. 218, 329–343
© 2012 Springer Basel AG

An Identity Satisfied by Certain Orthogonal Vector-valued Functions

Robert L. Ellis

In memory of Israel Gohberg and his mathematical prowess

Abstract. In this paper we ﬁrst deﬁne a class of scalar products on 𝑊2𝑚 , the product of an even number of copies of the Wiener algebra 𝑊 . Then we obtain a sequence of orthogonal elements of 𝑊2𝑚 for such a scalar product and derive an identity that they satisfy. Mathematics Subject Classiﬁcation (2000). 47B35, 42C05. Keywords. Orthogonal vector-valued functions, indeﬁnite scalar product, inﬁnite Toeplitz matrix, Wiener algebra, Nehari problem, four block problem.

Introduction

In [1], a class of vector-valued functions was investigated that are orthogonal for a scalar product on $W_2 = W \times W$, where $W$ is the Wiener algebra of absolutely convergent Fourier series on the unit circle. In the simplest case, a scalar product is defined as follows by a function $g$ in $W$, i.e., $g(z) = \sum_{k=-\infty}^{\infty} g_k z^k$ for $|z| = 1$, where the $g_k$ are complex numbers with $\sum_{k=-\infty}^{\infty} |g_k| < \infty$. Denote any element $\phi$ of $W_2$ as a vector

$$\phi = \begin{pmatrix} \phi^{(1)} \\ \phi^{(2)} \end{pmatrix}.$$

Then a possibly indefinite scalar product is defined on $W_2$ by

$$\langle \phi, \psi \rangle = \frac{1}{2\pi} \int_0^{2\pi} \psi(e^{i\theta})^* \begin{pmatrix} 1 & g(e^{i\theta}) \\ \overline{g(e^{i\theta})} & 1 \end{pmatrix} \phi(e^{i\theta})\, d\theta \tag{1}$$

where $*$ denotes the conjugate transpose of a matrix. This scalar product can be expressed in a different way. For this, let $G = (g_{r-s})_{r,s=-\infty}^{\infty}$ be the infinite Toeplitz matrix defined by the Fourier coefficients of $g$:

$$G = \begin{pmatrix}
\ddots & \ddots & \ddots & & \\
\ddots & g_0 & g_{-1} & g_{-2} & \\
\ddots & g_1 & g_0 & g_{-1} & \ddots \\
 & g_2 & g_1 & g_0 & \ddots \\
 & & \ddots & \ddots & \ddots
\end{pmatrix} \tag{2}$$

and let $T$ be the $2 \times 2$ block matrix

$$T = \begin{pmatrix} I & G \\ G^* & I \end{pmatrix}$$

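viewed as an operator on $\ell_2(-\infty, \infty) \times \ell_2(-\infty, \infty)$, whose elements will also be considered as vectors of the form

$$\begin{pmatrix} a \\ b \end{pmatrix}$$

where $a = (\dots, \alpha_{-1}, \alpha_0, \alpha_1, \dots)^T$ and $b = (\dots, \beta_{-1}, \beta_0, \beta_1, \dots)^T$. Here the superscript $T$ denotes the transpose of a matrix. For any $\phi$ and $\psi$ in $W_2$ let

$$\phi^{(1)}(z) = \sum_{k=-\infty}^{\infty} \alpha_k z^k, \qquad \phi^{(2)}(z) = \sum_{k=-\infty}^{\infty} \beta_k z^k,$$

$$\psi^{(1)}(z) = \sum_{k=-\infty}^{\infty} \gamma_k z^k, \qquad \psi^{(2)}(z) = \sum_{k=-\infty}^{\infty} \delta_k z^k$$

and

$$\xi_\phi = \begin{pmatrix} a \\ b \end{pmatrix}, \qquad \xi_\psi = \begin{pmatrix} c \\ d \end{pmatrix}$$

where

$$a = (\dots, \alpha_{-1}, \alpha_0, \alpha_1, \dots)^T, \qquad b = (\dots, \beta_{-1}, \beta_0, \beta_1, \dots)^T,$$

$$c = (\dots, \gamma_{-1}, \gamma_0, \gamma_1, \dots)^T, \qquad d = (\dots, \delta_{-1}, \delta_0, \delta_1, \dots)^T.$$

Then

$$\langle \phi, \psi \rangle = \xi_\psi^* T \xi_\phi. \tag{3}$$

An orthogonal family $\{\phi_n \mid n = 0, \pm 1, \pm 2, \dots\}$ can be obtained as follows, provided the indicated solutions exist. Suppose that for any integer $n$ there are $\ell_1$ vectors

$$a_n = (\alpha_n^{(n)}, \alpha_{n+1}^{(n)}, \dots)^T, \qquad b_n = (\dots, \beta_{-n-1}^{(n)}, \beta_{-n}^{(n)})^T$$

such that

$$\begin{pmatrix} I & G_n \\ G_n^* & I \end{pmatrix} \begin{pmatrix} a_n \\ b_n \end{pmatrix} = \begin{pmatrix} e_1 \\ 0 \end{pmatrix} \tag{4}$$

where

$$G_n = \begin{pmatrix} \cdots & g_{n+1} & g_n \\ \cdots & g_{n+2} & g_{n+1} \\ \cdots & \cdots & \cdots \end{pmatrix} \tag{5}$$

and

$$e_1 = (1, 0, 0, \dots)^T$$

and where $I$ denotes variously the identity matrix of the appropriate size. For any integer $n$, let

$$\phi_n = \begin{pmatrix} \alpha_n \\ \beta_n \end{pmatrix}$$

where

$$\alpha_n(z) = \sum_{k=n}^{\infty} \alpha_k^{(n)} z^k \quad \text{and} \quad \beta_n(z) = \sum_{k=-\infty}^{-n} \beta_k^{(n)} z^k.$$

Then $\{\phi_n \mid n = 0, \pm 1, \pm 2, \dots\}$ is an orthogonal family of vectors in $W_2$ for the scalar product (1). Furthermore, $\alpha_n$ and $\beta_n$ satisfy the identity

$$|\alpha_n(z)|^2 - |\beta_n(z)|^2 = \alpha_n^{(n)} \quad \text{for } |z| = 1. \tag{6}$$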

Solutions of (4) will exist, and hence an orthogonal family will exist, for example, when $\|G_n\| < 1$ for every integer $n$. The functions $\{\phi_n\}_{n=-\infty}^{\infty}$ appear in a linear fractional description of all solutions of the Nehari problem. See [4, 5]. Identities similar to (6) also appear in [2, 3].

In this paper the preceding results will be generalized. For any given positive integer $m$, a scalar product will be defined on $W_{2m} = W \times W \times \dots \times W$, the product of $2m$ copies of $W$, by means of a function

$$g(z) = \begin{pmatrix}
g_{11}(z) & g_{12}(z) & \cdots & g_{1m}(z) \\
\cdots & \cdots & \cdots & \cdots \\
\cdots & \cdots & \cdots & \cdots \\
g_{m1}(z) & g_{m2}(z) & \cdots & g_{mm}(z)
\end{pmatrix} \tag{7}$$

where $g_{11}, g_{12}, \dots, g_{mm}$ are in $W$. Then an orthogonal system of vectors in $W_{2m}$ will be found by solving an equation analogous to (4), and an identity analogous to (6) will be proved.

1. A scalar product

Let $W$ denote the Wiener algebra of absolutely convergent Fourier series on the unit circle, and let $m$ be a fixed positive integer. Denote by $W_{2m}$ the product $W \times W \times \dots \times W$ of $2m$ copies of $W$. The elements of $W_{2m}$ will be represented as column vectors of the form $(\phi^{(1)}, \phi^{(2)}, \dots, \phi^{(2m)})^T$. Let $g$ be a matrix-valued function as in (7), with $g_{11}, g_{12}, \dots, g_{mm}$ in $W$. Then $g$ defines a weight

$$\Omega(z) = \begin{pmatrix} I & g(z) \\ g(z)^* & I \end{pmatrix}$$

for the corresponding possibly indefinite scalar product on $W_{2m}$ given by

$$\langle \phi, \psi \rangle = \frac{1}{2\pi} \int_0^{2\pi} \psi(e^{i\theta})^*\, \Omega(e^{i\theta})\, \phi(e^{i\theta})\, d\theta. \tag{8}$$

First we prove that this scalar product can be re-expressed in a manner similar to (3). For $1 \le j, k \le m$ let

$$g_{jk}(z) = \sum_{r=-\infty}^{\infty} g_r^{(j,k)} z^r$$

and let $G_{jk}$ be the corresponding infinite Toeplitz matrix $G_{jk} = (g_{r-s}^{(j,k)})_{r,s=-\infty}^{\infty}$. (See (2).) Let

$$T = \begin{pmatrix} I & G \\ G^* & I \end{pmatrix} \tag{9}$$

where $G$ is the $m \times m$ block matrix $(G_{jk})_{j,k=1}^{m}$ and $I$ denotes the appropriate identity matrix. For any $\phi = (\phi^{(1)}, \phi^{(2)}, \dots, \phi^{(2m)})$ in $W_{2m}$ and for $1 \le k \le 2m$, let

$$\phi^{(k)}(z) = \sum_{r=-\infty}^{\infty} \alpha_r^{(k)} z^r \tag{10}$$

and

$$\xi_{\phi^{(k)}} = (\dots, \alpha_{-1}^{(k)}, \alpha_0^{(k)}, \alpha_1^{(k)}, \dots)^T$$

and let

$$\xi_\phi = (\xi_{\phi^{(1)}}, \xi_{\phi^{(2)}}, \dots, \xi_{\phi^{(2m)}})^T.$$

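Proposition 1.1. For any $\phi$ and $\psi$ in $W_{2m}$,

$$\langle \phi, \psi \rangle = \xi_\psi^* T \xi_\phi. \tag{11}$$

Proof. Denote each $\phi^{(k)}$ as in (10), and denote each $\psi^{(k)}$ as in (10) with $\beta_r^{(k)}$ replacing $\alpha_r^{(k)}$. Then

$$\langle \phi, \psi \rangle = \frac{1}{2\pi} \int_0^{2\pi} \Big( \overline{\psi^{(1)}(e^{i\theta})}, \dots, \overline{\psi^{(2m)}(e^{i\theta})} \Big) \begin{pmatrix} I & g(e^{i\theta}) \\ g(e^{i\theta})^* & I \end{pmatrix} \Big( \phi^{(1)}(e^{i\theta}), \dots, \phi^{(2m)}(e^{i\theta}) \Big)^T d\theta$$

$$= \frac{1}{2\pi} \int_0^{2\pi} \Big( \overline{\psi^{(1)}(e^{i\theta})}, \dots, \overline{\psi^{(2m)}(e^{i\theta})} \Big) \Big( \eta_1(e^{i\theta}), \dots, \eta_{2m}(e^{i\theta}) \Big)^T d\theta$$

where for $1 \le j \le m$,

$$\eta_j(e^{i\theta}) = \phi^{(j)}(e^{i\theta}) + \sum_{k=1}^{m} g_{jk}(e^{i\theta})\, \phi^{(m+k)}(e^{i\theta})$$

and for $m + 1 \le j \le 2m$,

$$\eta_j(e^{i\theta}) = \sum_{k=1}^{m} \overline{g_{k,j-m}(e^{i\theta})}\, \phi^{(k)}(e^{i\theta}) + \phi^{(j)}(e^{i\theta}).$$

Therefore

$$\langle \phi, \psi \rangle = \frac{1}{2\pi} \int_0^{2\pi} \Bigg\{ \sum_{j=1}^{m} \overline{\psi^{(j)}(e^{i\theta})}\, \phi^{(j)}(e^{i\theta}) + \sum_{j=1}^{m} \sum_{k=1}^{m} \overline{\psi^{(j)}(e^{i\theta})}\, g_{jk}(e^{i\theta})\, \phi^{(m+k)}(e^{i\theta})$$
$$\qquad + \sum_{j=m+1}^{2m} \sum_{k=1}^{m} \overline{\psi^{(j)}(e^{i\theta})}\, \overline{g_{k,j-m}(e^{i\theta})}\, \phi^{(k)}(e^{i\theta}) + \sum_{j=m+1}^{2m} \overline{\psi^{(j)}(e^{i\theta})}\, \phi^{(j)}(e^{i\theta}) \Bigg\}\, d\theta. \tag{12}$$

Combining the first and last sums on the right side of (12), we find that

$$\langle \phi, \psi \rangle = \frac{1}{2\pi} \int_0^{2\pi} \Bigg\{ \sum_{j=1}^{2m} \sum_{r=-\infty}^{\infty} \sum_{s=-\infty}^{\infty} \overline{\beta_r^{(j)}} e^{-ir\theta} \alpha_s^{(j)} e^{is\theta} + \sum_{j,k=1}^{m} \sum_{r,s,t=-\infty}^{\infty} \overline{\beta_r^{(j)}} e^{-ir\theta} g_s^{(j,k)} e^{is\theta} \alpha_t^{(m+k)} e^{it\theta}$$
$$\qquad + \sum_{j,k=1}^{m} \sum_{r,s,t=-\infty}^{\infty} \overline{\beta_r^{(j+m)}} e^{-ir\theta}\, \overline{g_s^{(k,j)}} e^{-is\theta} \alpha_t^{(k)} e^{it\theta} \Bigg\}\, d\theta$$

$$= \sum_{j=1}^{2m} \sum_{r=-\infty}^{\infty} \overline{\beta_r^{(j)}} \alpha_r^{(j)} + \sum_{j,k=1}^{m} \sum_{r,t=-\infty}^{\infty} \overline{\beta_r^{(j)}}\, g_{r-t}^{(j,k)}\, \alpha_t^{(m+k)} + \sum_{j,k=1}^{m} \sum_{r,t=-\infty}^{\infty} \overline{\beta_r^{(j+m)}}\, \overline{g_{t-r}^{(k,j)}}\, \alpha_t^{(k)}. \tag{13}$$

We also have

$$\xi_\psi^* T \xi_\phi = (\xi_{\psi^{(1)}}, \dots, \xi_{\psi^{(2m)}})^* \begin{pmatrix} I & G \\ G^* & I \end{pmatrix} (\xi_{\phi^{(1)}}, \dots, \xi_{\phi^{(2m)}})^T = (\xi_{\psi^{(1)}}, \dots, \xi_{\psi^{(2m)}})^* (\zeta_1, \dots, \zeta_{2m})^T$$

where for $1 \le j \le m$,

$$\zeta_j = \xi_{\phi^{(j)}} + \sum_{k=1}^{m} G_{jk}\, \xi_{\phi^{(k+m)}}$$

and for $m + 1 \le j \le 2m$,

$$\zeta_j = \sum_{k=1}^{m} G_{k,j-m}^*\, \xi_{\phi^{(k)}} + \xi_{\phi^{(j)}}.$$

Therefore

$$\xi_\psi^* T \xi_\phi = \sum_{j=1}^{m} \xi_{\psi^{(j)}}^* \xi_{\phi^{(j)}} + \sum_{j,k=1}^{m} \xi_{\psi^{(j)}}^* G_{j,k}\, \xi_{\phi^{(k+m)}} + \sum_{j=m+1}^{2m} \xi_{\psi^{(j)}}^* \sum_{k=1}^{m} G_{k,j-m}^*\, \xi_{\phi^{(k)}} + \sum_{j=m+1}^{2m} \xi_{\psi^{(j)}}^* \xi_{\phi^{(j)}}.$$

Combining the first and last sums, we have

$$\xi_\psi^* T \xi_\phi = \sum_{j=1}^{2m} \sum_{r=-\infty}^{\infty} \overline{\beta_r^{(j)}} \alpha_r^{(j)} + \sum_{j,k=1}^{m} \sum_{r,t=-\infty}^{\infty} \overline{\beta_r^{(j)}}\, g_{r-t}^{(j,k)}\, \alpha_t^{(k+m)} + \sum_{j,k=1}^{m} \sum_{r,t=-\infty}^{\infty} \overline{\beta_r^{(j+m)}}\, \overline{g_{t-r}^{(k,j)}}\, \alpha_t^{(k)}. \tag{14}$$

From (13) and (14) we conclude that (11) holds. □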
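The equality of the integral form and the coefficient form can be checked numerically in the simplest case $m = 1$ when all functions are trigonometric polynomials. The sketch below is only an illustration: Laurent polynomials are stored as dictionaries mapping exponents to coefficients, the integral is replaced by an average over equally spaced points (exact here because all exponents are small compared with the number of points), and the function names and the test data are hypothetical.

```python
import cmath


def evaluate(coeffs, theta):
    """Value of sum_k c_k e^{i k theta} for a dict {k: c_k}."""
    return sum(c * cmath.exp(1j * k * theta) for k, c in coeffs.items())


def product_by_integral(phi, psi, g, samples=64):
    """(1/2pi) * integral of psi^* Omega phi for m = 1, by an equispaced average."""
    total = 0j
    for n in range(samples):
        th = 2 * cmath.pi * n / samples
        f1, f2 = evaluate(phi[0], th), evaluate(phi[1], th)
        p1, p2 = evaluate(psi[0], th), evaluate(psi[1], th)
        gv = evaluate(g, th)
        total += p1.conjugate() * (f1 + gv * f2)
        total += p2.conjugate() * (gv.conjugate() * f1 + f2)
    return total / samples


def product_by_coefficients(phi, psi, g):
    """Coefficient form of the scalar product for m = 1."""
    total = 0j
    for j in range(2):                       # diagonal (identity) blocks
        for r, b in psi[j].items():
            total += complex(b).conjugate() * phi[j].get(r, 0)
    for r, b in psi[0].items():              # block G: entries g_{r-t}
        for t, a in phi[1].items():
            total += complex(b).conjugate() * g.get(r - t, 0) * a
    for r, b in psi[1].items():              # block G^*: entries conj(g_{t-r})
        for t, a in phi[0].items():
            total += complex(b).conjugate() * complex(g.get(t - r, 0)).conjugate() * a
    return total


if __name__ == "__main__":
    phi = ({0: 1 + 2j, 1: 0.5}, {-1: 1j, 2: -1})
    psi = ({0: 2, -2: 1 - 1j}, {1: 3j, -1: 0.25})
    g = {-1: 0.3, 0: 0.1j, 2: -0.2 + 0.4j}
    print(product_by_integral(phi, psi, g), product_by_coefficients(phi, psi, g))
```

The two printed values should agree up to rounding.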

2. An orthogonal system

We will generate an orthogonal system for the scalar product in the preceding section by solving equations that are appropriate analogs of (4). We continue to let $G = (G_{jk})_{j,k=1}^{m}$ be the $m \times m$ block matrix in (9). For any integer $n$, let

$$G_n = \begin{pmatrix}
G_{11}^{[n]} & \cdots & G_{1,m-1}^{[n]} & H_n \\
G_{21} & \cdots & G_{2,m-1} & G_{2m}[n] \\
\vdots & & \vdots & \vdots \\
G_{m1} & \cdots & G_{m,m-1} & G_{mm}[n]
\end{pmatrix} \tag{15}$$

where $G_{11}^{[n]}, \dots, G_{1,m-1}^{[n]}$ are formed from $G_{11}, \dots, G_{1,m-1}$ by deleting all rows above the $n$th row; $G_{2m}[n], \dots, G_{mm}[n]$ result from $G_{2m}, \dots, G_{mm}$ by deleting all columns to the right of the $n$th column; and

$$H_n = \begin{pmatrix}
\cdots & g_{n+1}^{(1,m)} & g_n^{(1,m)} \\
\cdots & g_{n+2}^{(1,m)} & g_{n+1}^{(1,m)} \\
\cdots & \cdots & \cdots
\end{pmatrix} \tag{16}$$

is the infinite Hankel matrix obtained from $G_{1m}$ by deleting all rows above the $n$th row and all columns to the right of the $n$th column. We regard $G_n$ as a "section" of $G$ analogous to the matrix in (5). In generating an orthogonal system we will solve equations of the form

$$\begin{pmatrix} I & G_n \\ G_n^* & I \end{pmatrix} \begin{pmatrix} a_n \\ b_n \end{pmatrix} = \begin{pmatrix} e_1 \\ 0 \end{pmatrix}.$$

Here $I$ represents two different identity matrices of the appropriate sizes, and

$$e_1 = (1, 0, 0, \dots)^T, \qquad a_n = (a_1^{(n)}, \dots, a_m^{(n)})^T, \qquad b_n = (a_{m+1}^{(n)}, \dots, a_{2m}^{(n)})^T$$

where

$$a_1^{(n)} = (\alpha_n^{(1,n)}, \alpha_{n+1}^{(1,n)}, \dots)^T \quad \text{is in } \ell_1(n, \infty),$$

$$a_{2m}^{(n)} = (\dots, \alpha_{-n-1}^{(2m,n)}, \alpha_{-n}^{(2m,n)})^T \quad \text{is in } \ell_1(-\infty, -n),$$

and for $2 \le k \le 2m - 1$,

$$a_k^{(n)} = (\dots, \alpha_{-1}^{(k,n)}, \alpha_0^{(k,n)}, \alpha_1^{(k,n)}, \dots)^T \quad \text{is in } \ell_1(-\infty, \infty).$$

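Theorem 2.1. Let $g_{jk}$ $(1 \le j, k \le m)$ be in $W$. Suppose that for any integer $n$ there are $\ell_1$-vectors $a_n$ and $b_n$ such that

$$\begin{pmatrix} I & G_n \\ G_n^* & I \end{pmatrix} \begin{pmatrix} a_n \\ b_n \end{pmatrix} = \begin{pmatrix} e_1 \\ 0 \end{pmatrix}. \tag{17}$$

Let $\phi_n = (\phi_n^{(1)}, \dots, \phi_n^{(2m)})^T$, where

$$\phi_n^{(1)}(z) = \sum_{j=n}^{\infty} \alpha_j^{(1,n)} z^j,$$

$$\phi_n^{(k)}(z) = \sum_{j=-\infty}^{\infty} \alpha_j^{(k,n)} z^j \quad \text{for } 2 \le k \le 2m - 1,$$

$$\phi_n^{(2m)}(z) = \sum_{j=-\infty}^{-n} \alpha_j^{(2m,n)} z^j,$$

where the $\alpha$'s are as described before the theorem. Then $\{\phi_n\}_{n=-\infty}^{\infty}$ is an orthogonal system in $W_{2m}$ for the scalar product in (8).

Proof. For any integer $n$ let

$$a_1^{(n)\prime} = (\dots, 0, 0, \alpha_n^{(1,n)}, \alpha_{n+1}^{(1,n)}, \dots)^T$$

and

$$a_{2m}^{(n)\prime} = (\dots, \alpha_{-n-1}^{(2m,n)}, \alpha_{-n}^{(2m,n)}, 0, 0, \dots)^T.$$

Then

$$\xi_{\phi_n^{(1)}} = a_1^{(n)\prime} \quad \text{and} \quad \xi_{\phi_n^{(2m)}} = a_{2m}^{(n)\prime}.$$

For any two integers $r$ and $s$ with $r > s$, (11) implies that

$$\langle \phi_r, \phi_s \rangle = \xi_{\phi_s}^* T \xi_{\phi_r}.$$

But because of the leading zeros in $\xi_{\phi_r^{(1)}}$ and $\xi_{\phi_s^{(1)}}$ and the trailing zeros in $\xi_{\phi_r^{(2m)}}$ and $\xi_{\phi_s^{(2m)}}$, it follows that

$$\langle \phi_r, \phi_s \rangle = (c_r\ d_r)^* \begin{pmatrix} I & G_r \\ G_r^* & I \end{pmatrix} \begin{pmatrix} a_r \\ b_r \end{pmatrix}$$

where $c_r$ has at least $r - s$ leading zeros. Thus (17) implies that

$$\langle \phi_r, \phi_s \rangle = (c_r\ d_r)^* \begin{pmatrix} e_1 \\ 0 \end{pmatrix} = 0.$$

This proves the theorem. □

Just as the functions $\{\phi_n\}_{n=-\infty}^{\infty}$ in the Introduction are related to the Nehari problem, the functions $\{\phi_n\}_{n=-\infty}^{\infty}$ in Theorem 2.1 are related to the Four Block problem. See Section II.4 in [4].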

3. An identity

In this section we will derive an identity similar to (6) associated with the orthogonal functions in Section 2. We will fix an integer $n$ and, for simplicity, suppress $n$ in some of the notation. Thus we will write (17) as

$$\begin{pmatrix} I & G_n \\ G_n^* & I \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} e_1 \\ 0 \end{pmatrix} \tag{18}$$

where

$$a = (a_1, a_2, \dots, a_m)^T \quad \text{and} \quad b = (a_{m+1}, a_{m+2}, \dots, a_{2m})^T$$

with

$$a_1 = (\alpha_n^{(1)}, \alpha_{n+1}^{(1)}, \dots)^T,$$

$$a_k = (\dots, \alpha_{-1}^{(k)}, \alpha_0^{(k)}, \alpha_1^{(k)}, \dots)^T \quad \text{for } 2 \le k \le 2m - 1,$$

$$a_{2m} = (\dots, \alpha_{-n-1}^{(2m)}, \alpha_{-n}^{(2m)})^T.$$

To emphasize the analogy with (6), we will also use $\alpha$ in place of the function $\phi_n$ obtained in Theorem 2.1. Thus $\alpha = (\alpha_1, \dots, \alpha_{2m})^T$, where

$$\alpha_1(z) = \sum_{j=n}^{\infty} \alpha_j^{(1)} z^j,$$

$$\alpha_k(z) = \sum_{j=-\infty}^{\infty} \alpha_j^{(k)} z^j \quad \text{for } 2 \le k \le 2m - 1,$$

$$\alpha_{2m}(z) = \sum_{j=-\infty}^{-n} \alpha_j^{(2m)} z^j.$$

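For any $a = (\dots, a_{-1}, a_0, a_1, \dots)^T$ in $\ell_1(-\infty, \infty)$, any $b = (b_n, b_{n+1}, b_{n+2}, \dots)^T$ in $\ell_1(n, \infty)$, and any $c = (\dots, c_{-n-1}, c_{-n})$ in $\ell_1(-\infty, -n)$, we define infinite Toeplitz matrices by

$$T(a) = \begin{pmatrix}
\ddots & \ddots & \ddots & & \\
\ddots & a_0 & a_{-1} & a_{-2} & \\
\ddots & a_1 & a_0 & a_{-1} & \ddots \\
 & a_2 & a_1 & a_0 & \ddots \\
 & & \ddots & \ddots & \ddots
\end{pmatrix},$$

$$U(b) = \begin{pmatrix}
b_n & b_{n+1} & b_{n+2} & \cdots \\
0 & b_n & b_{n+1} & \cdots \\
0 & 0 & b_n & \cdots \\
\cdots & \cdots & \cdots & \cdots
\end{pmatrix}, \qquad
U(c) = \begin{pmatrix}
c_{-n} & c_{-n-1} & c_{-n-2} & \cdots \\
0 & c_{-n} & c_{-n-1} & \cdots \\
0 & 0 & c_{-n} & \cdots \\
\cdots & \cdots & \cdots & \cdots
\end{pmatrix}$$

and we let

$$R = \begin{pmatrix}
\cdots & \cdots & \cdots & \cdots & \cdots \\
\cdots & 0 & 0 & 1 & \cdots \\
\cdots & 0 & 1 & 0 & \cdots \\
\cdots & 1 & 0 & 0 & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots
\end{pmatrix}
\quad \text{and} \quad
R_+ = \begin{pmatrix}
\cdots & 0 & 0 & 1 \\
\cdots & 0 & 1 & 0 \\
\cdots & 1 & 0 & 0 \\
\cdots & \cdots & \cdots & \cdots
\end{pmatrix}.$$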

Left multiplication by either $R$ or $R_+$ reverses the rows of a matrix, provided the multiplication is possible.

Theorem 3.1. Let $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_{2m})^T$ be the $n$th orthogonal function obtained from (18) as in Theorem 2.1. Then the identity

$$\sum_{k=1}^{m} |\alpha_k(z)|^2 - \sum_{k=m+1}^{2m} |\alpha_k(z)|^2 = \alpha^{(1)}_n \quad \text{for } |z| = 1$$

holds, where $\alpha^{(1)}_n$ denotes the coefficient of $z^n$ in $\alpha_1$.

Proof. The matrix $G_n$ in (18) is given by (15). The first and last rows in (18) imply that

$$a_1 + \sum_{\ell=1}^{m-1} G^{[n]}_{1\ell}\, a_{m+\ell} + H_n a_{2m} = e_1 \tag{19}$$

and

$$H_n^{*} a_1 + \sum_{k=2}^{m} G_{km}[n]^{*}\, a_k + a_{2m} = 0. \tag{20}$$


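Observe from (16) that

$$H_n a_{2m} = \begin{pmatrix} \cdots & g^{(1,m)}_{n+1} & g^{(1,m)}_{n} \\ \cdots & g^{(1,m)}_{n+2} & g^{(1,m)}_{n+1} \\ \cdots & \cdots & \cdots \end{pmatrix} \begin{pmatrix} \vdots \\ \alpha^{(2m)}_{-n-1} \\ \alpha^{(2m)}_{-n} \end{pmatrix} = \begin{pmatrix} \alpha^{(2m)}_{-n} & \alpha^{(2m)}_{-n-1} & \alpha^{(2m)}_{-n-2} & \cdots \\ 0 & \alpha^{(2m)}_{-n} & \alpha^{(2m)}_{-n-1} & \cdots \\ 0 & 0 & \alpha^{(2m)}_{-n} & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{pmatrix} \begin{pmatrix} g^{(1,m)}_{n} \\ g^{(1,m)}_{n+1} \\ \vdots \end{pmatrix}.$$

Thus

$$H_n a_{2m} = U(a_{2m})\, \gamma_n \tag{21}$$

where $\gamma_n = (g^{(1,m)}_{n}, g^{(1,m)}_{n+1}, \ldots)^T$. Also

$$H_n^{*} a_1 = \begin{pmatrix} \cdots & \cdots & \cdots \\ g^{(1,m)}_{n+1} & g^{(1,m)}_{n+2} & \cdots \\ g^{(1,m)}_{n} & g^{(1,m)}_{n+1} & \cdots \end{pmatrix} \begin{pmatrix} \alpha^{(1)}_{n} \\ \alpha^{(1)}_{n+1} \\ \vdots \end{pmatrix} = \begin{pmatrix} \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & \alpha^{(1)}_{n} & \cdots \\ 0 & \alpha^{(1)}_{n} & \alpha^{(1)}_{n+1} & \cdots \\ \alpha^{(1)}_{n} & \alpha^{(1)}_{n+1} & \alpha^{(1)}_{n+2} & \cdots \end{pmatrix} \begin{pmatrix} g^{(1,m)}_{n} \\ g^{(1,m)}_{n+1} \\ g^{(1,m)}_{n+2} \\ \vdots \end{pmatrix}$$

so that

$$R_+ H_n^{*} a_1 = U(a_1)\, \gamma_n. \tag{22}$$

Since upper triangular Toeplitz matrices commute, it follows from (21) and (22) that

$$U(a_1)\, H_n a_{2m} = U(a_{2m})\, R_+ H_n^{*} a_1. \tag{23}$$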


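Solving (19) for $H_n a_{2m}$ and (20) for $H_n^{*} a_1$, and substituting in (23) leads to

$$U(a_1) e_1 - U(a_1) a_1 - \sum_{\ell=1}^{m-1} U(a_1)\, G^{[n]}_{1\ell}\, a_{m+\ell} = -U(a_{2m}) R_+ a_{2m} - \sum_{k=2}^{m} U(a_{2m}) R_+ G_{km}[n]^{*}\, a_k$$

which we rewrite as

$$U(a_1) a_1 + \sum_{\ell=1}^{m-1} U(a_1)\, G^{[n]}_{1\ell}\, a_{m+\ell} - \sum_{k=2}^{m} U(a_{2m}) R_+ G_{km}[n]^{*}\, a_k - U(a_{2m}) R_+ a_{2m} = U(a_1) e_1. \tag{24}$$

From rows 2 through $m$ in (18) we have

$$a_k + \sum_{\ell=1}^{m-1} G_{k\ell}\, a_{\ell+m} + G_{km}[n]\, a_{2m} = 0 \quad \text{for } 2 \le k \le m \tag{25}$$

and from rows $m+1$ through $2m-1$ in (18) we have

$$(G^{[n]}_{1\ell})^{*} a_1 + \sum_{k=2}^{m} G_{k\ell}^{*}\, a_k + a_{m+\ell} = 0 \quad \text{for } 1 \le \ell \le m-1. \tag{26}$$

From (25) it follows that

$$\sum_{\ell=1}^{m-1} G_{k\ell}\, a_{\ell+m} = -a_k - G_{km}[n]\, a_{2m} \quad \text{for } 2 \le k \le m$$

and hence that

$$\sum_{k=2}^{m} \sum_{\ell=1}^{m-1} T(a_k)\, R\, G_{k\ell}\, a_{\ell+m} = -\sum_{k=2}^{m} T(a_k)\, R\, a_k - \sum_{k=2}^{m} T(a_k)\, R\, G_{km}[n]\, a_{2m}. \tag{27}$$

Similarly it follows from (26) that

$$\sum_{\ell=1}^{m-1} \sum_{k=2}^{m} T(a_{m+\ell})^{*}\, G_{k\ell}^{*}\, a_k = -\sum_{\ell=1}^{m-1} T(a_{m+\ell})^{*}\, a_{m+\ell} - \sum_{\ell=1}^{m-1} T(a_{m+\ell})^{*}\, (G^{[n]}_{1\ell})^{*}\, a_1. \tag{28}$$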


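Next we observe that for $2 \le k \le m$ and $1 \le \ell \le m-1$,

$$T(a_k)\, R\, G_{k\ell}\, a_{\ell+m} = \begin{pmatrix} \cdots & \cdots & \cdots & \cdots & \cdots \\ \cdots & \alpha^{(k)}_{0} & \alpha^{(k)}_{-1} & \alpha^{(k)}_{-2} & \cdots \\ \cdots & \alpha^{(k)}_{1} & \alpha^{(k)}_{0} & \alpha^{(k)}_{-1} & \cdots \\ \cdots & \alpha^{(k)}_{2} & \alpha^{(k)}_{1} & \alpha^{(k)}_{0} & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots \end{pmatrix} \times \begin{pmatrix} \cdots & \cdots & \cdots & \cdots & \cdots \\ \cdots & g^{(k\ell)}_{2} & g^{(k\ell)}_{1} & g^{(k\ell)}_{0} & \cdots \\ \cdots & g^{(k\ell)}_{1} & g^{(k\ell)}_{0} & g^{(k\ell)}_{-1} & \cdots \\ \cdots & g^{(k\ell)}_{0} & g^{(k\ell)}_{-1} & g^{(k\ell)}_{-2} & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots \end{pmatrix} \begin{pmatrix} \vdots \\ \alpha^{(\ell+m)}_{-1} \\ \alpha^{(\ell+m)}_{0} \\ \alpha^{(\ell+m)}_{1} \\ \vdots \end{pmatrix}.$$

Therefore for any integer $j$, and for $2 \le k \le m$ and $1 \le \ell \le m-1$,

$$\text{the } j\text{th entry of } T(a_k)\, R\, G_{k\ell}\, a_{\ell+m} = \sum_{r,s=-\infty}^{\infty} \alpha^{(k)}_{j-s}\, g^{(k\ell)}_{-r-s}\, \alpha^{(\ell+m)}_{r}. \tag{29}$$

In the same way we find that for any integer $j$, and for $2 \le k \le m$ and $1 \le \ell \le m-1$,

$$\text{the } j\text{th entry of } T(a_{m+\ell})^{*}\, G_{k\ell}^{*}\, a_k = \sum_{r,s=-\infty}^{\infty} \alpha^{(\ell+m)}_{-j+s}\, g^{(k\ell)}_{-s+r}\, \alpha^{(k)}_{r}. \tag{30}$$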

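The two sums in (29) and (30) are easily seen to be equal, so it follows from (27)–(30) that

$$\sum_{k=2}^{m} T(a_k)\, R\, a_k + \sum_{k=2}^{m} T(a_k)\, R\, G_{km}[n]\, a_{2m} = \sum_{\ell=1}^{m-1} T(a_{m+\ell})^{*}\, a_{m+\ell} + \sum_{\ell=1}^{m-1} T(a_{m+\ell})^{*}\, (G^{[n]}_{1\ell})^{*}\, a_1. \tag{31}$$

We can carry out a similar analysis of the sums in (24) and (31). We find that for $1 \le \ell \le m-1$,

$$\text{the } j\text{th entry of } U(a_1)\, G^{[n]}_{1\ell}\, a_{m+\ell} = \sum_{s=0}^{\infty} \sum_{r=-\infty}^{\infty} \alpha^{(1)}_{n+s}\, g^{(1,\ell)}_{n+s+j-r}\, \alpha^{(m+\ell)}_{r} \quad \text{for } j \ge 0 \tag{32}$$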

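and

$$\text{the } j\text{th entry of } T(a_{m+\ell})^{*}\, (G^{[n]}_{1\ell})^{*}\, a_1 = \sum_{r=-\infty}^{\infty} \sum_{s=0}^{\infty} \alpha^{(m+\ell)}_{r}\, g^{(1,\ell)}_{n-r-j+s}\, \alpha^{(1)}_{n+s} \quad \text{for } -\infty < j < \infty. \tag{33}$$

Therefore

$$\text{the } j\text{th entry of } U(a_1)\, G^{[n]}_{1\ell}\, a_{m+\ell} = \text{the } (-j)\text{th entry of } T(a_{m+\ell})^{*}\, (G^{[n]}_{1\ell})^{*}\, a_1 \quad \text{for } j \ge 0$$

and hence

$$\text{the } j\text{th entry of } U(a_1)\, G^{[n]}_{1\ell}\, a_{m+\ell} = \text{the } j\text{th entry of } R\, T(a_{m+\ell})^{*}\, (G^{[n]}_{1\ell})^{*}\, a_1 \quad \text{for } j \ge 0.$$

Thus if we let

$$P = \begin{pmatrix} \cdots & 0 & 0 & 1 & 0 & 0 & \cdots & \cdots & \cdots \\ \cdots & \cdots & 0 & 0 & 1 & 0 & 0 & \cdots & \cdots \\ \cdots & \cdots & \cdots & 0 & 0 & 1 & 0 & 0 & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \end{pmatrix}$$

be the matrix that projects $\ell_1(-\infty, \infty)$ onto $\ell_1(0, \infty)$, we can conclude that

$$U(a_1)\, G^{[n]}_{1\ell}\, a_{m+\ell} = P\, R\, T(a_{m+\ell})^{*}\, (G^{[n]}_{1\ell})^{*}\, a_1. \tag{34}$$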

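Similarly, for $2 \le k \le m$,

$$\text{the } j\text{th entry of } U(a_{2m})\, R_+\, (G_{km}[n])^{*}\, a_k = \sum_{s=0}^{\infty} \sum_{r=-\infty}^{\infty} \alpha^{(2m)}_{-n-s}\, g^{(km)}_{n+s+j+r}\, \alpha^{(k)}_{r} \quad \text{for } j \ge 0 \tag{35}$$

and

$$\text{the } j\text{th entry of } T(a_k)\, R\, G_{km}[n]\, a_{2m} = \sum_{s=0}^{\infty} \sum_{r=-\infty}^{\infty} \alpha^{(k)}_{-r}\, g^{(km)}_{n-r-j+s}\, \alpha^{(2m)}_{-n-s} = \sum_{s=0}^{\infty} \sum_{r=-\infty}^{\infty} \alpha^{(k)}_{r}\, g^{(km)}_{n+r-j+s}\, \alpha^{(2m)}_{-n-s} \quad \text{for } -\infty < j < \infty$$

so that

$$\text{the } j\text{th entry of } R\, T(a_k)\, R\, G_{km}[n]\, a_{2m} = \sum_{s=0}^{\infty} \sum_{r=-\infty}^{\infty} \alpha^{(k)}_{r}\, g^{(km)}_{n+r+j+s}\, \alpha^{(2m)}_{-n-s} \quad \text{for } -\infty < j < \infty. \tag{36}$$

For $j \ge 0$ the sums in (35) and (36) are complex conjugates of each other, so

$$U(a_{2m})\, R_+\, G_{km}[n]^{*}\, a_k = P\, R\, T(a_k)\, R\, G_{km}[n]\, a_{2m}. \tag{37}$$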


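Multiplying both sides of (31) by $P R$ and substituting from (34) and (37), we have

$$P \sum_{k=2}^{m} R\, T(a_k)\, R\, a_k + \sum_{k=2}^{m} U(a_{2m})\, R_+\, G_{km}[n]^{*}\, a_k = P \sum_{\ell=1}^{m-1} R\, T(a_{m+\ell})^{*}\, a_{m+\ell} + \sum_{\ell=1}^{m-1} U(a_1)\, G^{[n]}_{1\ell}\, a_{m+\ell}$$

so that

$$\sum_{\ell=1}^{m-1} U(a_1)\, G^{[n]}_{1\ell}\, a_{m+\ell} - \sum_{k=2}^{m} U(a_{2m})\, R_+\, G_{km}[n]^{*}\, a_k = P \sum_{k=2}^{m} R\, T(a_k)\, R\, a_k - P \sum_{\ell=1}^{m-1} R\, T(a_{m+\ell})^{*}\, a_{m+\ell}.$$

Substituting this into (24), we have

$$U(a_1)\, a_1 + P \sum_{k=2}^{m} R\, T(a_k)\, R\, a_k - P \sum_{\ell=1}^{m-1} R\, T(a_{m+\ell})^{*}\, a_{m+\ell} - U(a_{2m})\, R_+\, a_{2m} = U(a_1)\, e_1 = \alpha^{(1)}_{n} e_1. \tag{38}$$

For any function $w(z) = \sum_{k=-\infty}^{\infty} w_k z^k$ in $W$, we let $w^{\sharp}(z) = \sum_{k=-\infty}^{\infty} \overline{w_{-k}}\, z^k$. Then

$$\alpha_1(z)\, \alpha_1^{\sharp}(z) = \left(\alpha^{(1)}_{n} z^{n} + \alpha^{(1)}_{n+1} z^{n+1} + \cdots\right) \left(\overline{\alpha^{(1)}_{n}}\, z^{-n} + \overline{\alpha^{(1)}_{n+1}}\, z^{-n-1} + \cdots\right) = \sum_{j=-\infty}^{\infty} \beta_j z^j$$

where

$$\beta_j = \sum_{s=0}^{\infty} \alpha^{(1)}_{n+s+j}\, \overline{\alpha^{(1)}_{n+s}} \quad \text{for } j \ge 0$$

and

$$\beta_j = \sum_{s=0}^{\infty} \alpha^{(1)}_{n+s}\, \overline{\alpha^{(1)}_{n+s-j}} \quad \text{for } j < 0.$$

In particular $\beta_{-j} = \overline{\beta_j}$ for $-\infty < j < \infty$. But for $j \ge 0$,

$$\text{the } j\text{th entry of } U(a_1)\, a_1 = \sum_{s=0}^{\infty} \overline{\alpha^{(1)}_{n+s}}\, \alpha^{(1)}_{n+s+j} = \beta_j.$$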

Thus for 𝑗 ≥ 0, the 𝑗th entry in 𝑈 (𝑎1 ) 𝑎1 equals the coeﬃcient of 𝑧 𝑗 in 𝛼1 (𝑧) 𝛼♯1 (𝑧). Similar calculations show that for 𝑗 ≥ 0, the 𝑗th entry in 𝑈 (𝑎2𝑚 ) 𝑅+ 𝑎2𝑚 equals the coeﬃcient of 𝑧 𝑗 in 𝛼2𝑚 (𝑧) 𝛼♯2𝑚 (𝑧), and for any 𝑗 and for 2 ≤ 𝑘 ≤ 𝑚, the 𝑗th entry in 𝑅 𝑇 (𝑎𝑘 ) 𝑅 𝑎𝑘 equals the coeﬃcient of 𝑧 𝑗 in 𝛼𝑘 (𝑧)𝛼♯𝑘 (𝑧), and for any 𝑗 and for 1 ≤ ℓ ≤ 𝑚 − 1, the 𝑗th entry in 𝑅 𝑇 (𝑎𝑚+ℓ )∗ 𝑎𝑚+ℓ equals the coeﬃcient of 𝑧 𝑗 in 𝛼𝑚+ℓ (𝑧)𝛼♯𝑚+ℓ (𝑧).


From these results and (38) we can conclude that all the coefficients of the positive powers of $z$ in

$$\sum_{k=1}^{m} \alpha_k(z)\, \alpha_k^{\sharp}(z) - \sum_{k=m+1}^{2m} \alpha_k(z)\, \alpha_k^{\sharp}(z) \tag{39}$$

are zero and the constant term is $\alpha^{(1)}_{n}$. Since the coefficients of the negative powers of $z$ in (39) are the complex conjugates of the coefficients of the positive powers, it follows that

$$\sum_{k=1}^{m} \alpha_k(z)\, \alpha_k^{\sharp}(z) - \sum_{k=m+1}^{2m} \alpha_k(z)\, \alpha_k^{\sharp}(z) = \alpha^{(1)}_{n}.$$

For $z$ on the unit circle, $z^{-1} = \bar{z}$, so it follows that $\alpha_k^{\sharp}(z) = \overline{\alpha_k(z)}$, that $\alpha^{(1)}_{n}$ is real, and hence that

$$\sum_{k=1}^{m} |\alpha_k(z)|^2 - \sum_{k=m+1}^{2m} |\alpha_k(z)|^2 = \alpha^{(1)}_{n}.$$

This proves Theorem 3.1. □
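The proof of Theorem 3.1 relies on two elementary facts: upper triangular Toeplitz matrices commute, and left multiplication by a reversal matrix reverses the rows. The sketch below checks both on finite truncations. It is only an illustration under stated assumptions: the helper names `upper_toeplitz` and `reversal`, the truncation size `N`, and the random test data are not part of the paper, and a finite check says nothing about convergence questions for the infinite matrices.

```python
import numpy as np


def upper_toeplitz(first_row):
    """Return the N x N upper triangular Toeplitz matrix with the given first row."""
    n = len(first_row)
    u = np.zeros((n, n), dtype=complex)
    for i in range(n):
        u[i, i:] = first_row[: n - i]  # row i is the first row shifted right by i
    return u


def reversal(n):
    """Return the N x N matrix with ones on the anti-diagonal."""
    return np.fliplr(np.eye(n))


rng = np.random.default_rng(0)
N = 6  # truncation size (an arbitrary choice for this sketch)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
y = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Finite analogues of U(a_1) and U(a_2m): both are polynomials in one shift matrix.
Ux, Uy = upper_toeplitz(x), upper_toeplitz(y)
toeplitz_commute = np.allclose(Ux @ Uy, Uy @ Ux)

# Finite analogue of R or R_+: left multiplication reverses the rows.
M = rng.standard_normal((N, N))
rows_reversed = np.allclose(reversal(N) @ M, M[::-1, :])
```

Both flags should come out true; commutation holds exactly for the truncations because each truncated matrix is a polynomial in the same nilpotent shift.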

It is to be expected that the inversion formula in Section 2 of [1] and the inverse problem in Section 4 of [1] can be generalized to the present situation.∗

References
[1] R.L. Ellis and I. Gohberg, "Orthogonal systems related to infinite Hankel matrices," J. Funct. Anal. 109: 155–198 (1992)
[2] R.L. Ellis, I. Gohberg, and D.C. Lay, "Infinite analogues of block Toeplitz matrices and related orthogonal functions," Integral Equations and Operator Theory 22: 375–419 (1995)
[3] R.L. Ellis, I. Gohberg, and D.C. Lay, "On a class of block Toeplitz matrices," Linear Algebra Appl. 241: 225–245 (1996)
[4] I. Gohberg, M.A. Kaashoek, and H.J. Woerdeman, "The band method for positive and contractive extension problems," J. Operator Theory 22: 109–155 (1989)
[5] I. Gohberg, M.A. Kaashoek, and H.J. Woerdeman, "The band method for positive and strictly contractive extension problems: An alternative version and new applications," Integral Equations and Operator Theory 12: 343–382 (1989)

Robert L. Ellis
Department of Mathematics
University of Maryland
College Park, Maryland 20742, USA
e-mail: [email protected]

∗ The author would like to thank the reviewer for several useful suggestions.

Operator Theory: Advances and Applications, Vol. 218, 345–357
© 2012 Springer Basel AG

Invertibility of Certain Fredholm Operators
Israel Feldman and Nahum Krupnik

To the blessed memory of our dear teacher Israel Gohberg

Abstract. Some new classes of algebras in which each Fredholm operator is invertible are described.
Mathematics Subject Classification (2000). Primary 47A53, Secondary 45E10.
Keywords. Fredholm operators, spectrum of linear operators, generalized Gelfand transform.

1. Introduction

Let $\Omega$ be a unital subalgebra of the Banach algebra $L(\mathcal{B})$, where $L = L(\mathcal{B})$ is the algebra of all bounded linear operators on a Banach space $\mathcal{B}$. We denote by $\mathcal{K}(\Omega)$ the ideal of all compact operators $K \in \Omega$, and write $\mathcal{K}(\mathcal{B}) := \mathcal{K}(L(\mathcal{B}))$ for short; by $\mathcal{K}_0(\Omega)$ the ideal of all finite-dimensional operators $K \in \Omega$; by $F = F(\mathcal{B})$ the set of all $F$-operators (Fredholm operators) on $\mathcal{B}$; and by $GL$ the group of all invertible operators in $L$. Also, $\operatorname{spec}(A)$ denotes the spectrum of an operator $A$ in the algebra $L(\mathcal{B})$ and $\rho(A)$ $(= \mathbb{C} \setminus \operatorname{spec}(A))$ the regular set of $A$.

Recall that an algebra $\Omega$ is inverse closed in $L(\mathcal{B})$ if $A \in \Omega \cap GL \Longrightarrow A^{-1} \in \Omega$. We say that $\Omega$ is $F$-closed if for each operator $A \in \Omega \cap F$ there exists an operator $R \in \Omega$ such that at least one of the operators $RA - I$ or $AR - I$ is compact. Note that in this case both operators $RA - I$ and $AR - I$ are compact. In the sequel we say that $\Omega$ is an $FF$-algebra (Fredholm free algebra) if every Fredholm operator in $\Omega$ is invertible in $L(\mathcal{B})$. The following characterization of Fredholm free $C^*$-subalgebras is well known:

Theorem 1.1. Let $H$ be a Hilbert space and let $\mathcal{C}$ be a $C^*$-subalgebra of $L(H)$. Then the following two statements are equivalent:
(i) The algebra $\mathcal{C}$ does not contain non-zero compact operators.
(ii) The algebra $\mathcal{C}$ is an $FF$-algebra.

The research of the second author was partially supported by Retalon Inc., Toronto, ON, Canada.


See, for example, [CL, Theorem 3.5], or (more conveniently) Corollaries 2.5 and 2.7 below. Theorem 1.1 is no longer true if we replace $\mathcal{C} \subset L(H)$ by an arbitrary subalgebra $\Omega \subset L(\mathcal{B})$. In Section 2 we study the connections between the invertibility of $F$-operators and the structure of compact operators in some subalgebras $\Omega \subset L(\mathcal{B})$.

Let $H$ be a Hilbert space. A subalgebra $\Omega \subset L(H)$ is called selfadjoint if $A \in \Omega \Rightarrow A^* \in \Omega$. The closure of a selfadjoint $FF$-subalgebra has the following hereditary property:

Theorem 1.2. Let $H$ be a Hilbert space and let $\Omega$ be a selfadjoint subalgebra of $L(H)$. If $\Omega$ is an $FF$-algebra, then $\mathcal{C} := \operatorname{clos}(\Omega)$ is an $FF$-algebra, too.

See, for example, [KF, Theorem 1]; the result also follows from Theorem 3.3, Statements $1^\circ$ and $3^\circ$ below. Some classes of non-selfadjoint subalgebras $\Omega \subset L(\mathcal{B})$ with a hereditary property like the one in Theorem 1.2 were studied in [KF], [KMF], [MF]. In Section 3 we continue these studies. We obtain some general properties of Banach subalgebras $\mathcal{A}$ which have dense $FF$-subalgebras and, in particular, obtain some sufficient conditions under which an algebra $\mathcal{A}$ with a dense $FF$-subalgebra $\Omega$ is an $FF$-algebra, too. The following was stated by A. Markus (see [KF, pp. 11–12]):

Proposition 1.3. Let $\mathcal{A} \subset L(\mathcal{B})$ be a commutative algebra and $\Omega$ its dense subalgebra. If $\Omega$ is an $FF$-algebra, then $\mathcal{A}$ is an $FF$-algebra, too.

In Section 3 some generalizations of this statement are obtained for algebras $\mathcal{A}$ which admit a so-called generalized Gelfand transform, as well as for algebras $\mathcal{A}$ satisfying the standard Amitsur-Levitzki polynomial identity (of some order $m = 2n$):

$$\sum_{\sigma \in S_m} \operatorname{sgn}(\sigma)\, a_{\sigma(1)} a_{\sigma(2)} \cdots a_{\sigma(m)} = 0 \quad (a_j \in \mathcal{A}), \tag{1.1}$$

where $\sigma$ runs through the symmetric group $S_m$. In Section 4 some illustrative examples and open questions are presented.

In the sequel, we suppose that all Banach spaces $\mathcal{B}$ are infinite-dimensional and that all subalgebras of $L(\mathcal{B})$ (except the ideals!) are unital. Sometimes we mention this in the text, but sometimes it is not mentioned.

It is our pleasure to thank our friend A. Markus for useful remarks and comments.
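Identity (1.1) can be made concrete on the matrix algebra $M_n(\mathbb{C})$, which by the Amitsur-Levitzki theorem satisfies the standard identity of order $m = 2n$ and no identity of lower degree. The following sketch evaluates the left-hand side of (1.1) for $n = 2$; the function names `perm_sign` and `standard_polynomial` and the test matrices are illustrative assumptions, not notation from the paper.

```python
import itertools

import numpy as np


def perm_sign(perm):
    """Sign of a permutation of 0..m-1, computed from its inversion count."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def standard_polynomial(mats):
    """Sum over sigma in S_m of sgn(sigma) * a_sigma(1) * ... * a_sigma(m)."""
    m = len(mats)
    size = mats[0].shape[0]
    total = np.zeros((size, size), dtype=complex)
    for perm in itertools.permutations(range(m)):
        prod = np.eye(size, dtype=complex)
        for idx in perm:
            prod = prod @ mats[idx]
        total += perm_sign(perm) * prod
    return total


rng = np.random.default_rng(1)
n = 2
# Four random complex 2 x 2 matrices: the order m = 2n = 4 identity should vanish.
mats4 = [rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)) for _ in range(2 * n)]
s4 = standard_polynomial(mats4)

# Order 3 is too low for 2 x 2 matrices: matrix units give a non-zero value.
E11 = np.array([[1.0, 0.0], [0.0, 0.0]])
E12 = np.array([[0.0, 1.0], [0.0, 0.0]])
E21 = np.array([[0.0, 0.0], [1.0, 0.0]])
s3 = standard_polynomial([E11, E12, E21])
```

Here `s4` vanishes up to rounding, while `s3` works out by hand to $2E_{11} + E_{22}$, which shows that the order in (1.1) cannot be lowered for $2 \times 2$ matrices.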

2. The structure of compact operators in $FF$-subalgebras of $L(\mathcal{B})$

Let $\mathcal{B}$ be a Banach space. In this section we denote by $\Omega$ an arbitrary (closed or non-closed) unital subalgebra of $L(\mathcal{B})$ and study the connections between the statements (i) and (ii) of Theorem 1.1 for the subalgebra $\Omega$. We start with the following two examples:


Example 2.1. Let $T$ $(\ne 0)$ be a finite-dimensional operator in a Hilbert space $H$ (or in any infinite-dimensional Banach space $\mathcal{B}$) and let $T^2 = 0$. Denote $\Omega = \{aI + bT\}$, where $a, b \in \mathbb{C}$. Let $A = aI + bT$. If $a = 0$ then $A$ is not a Fredholm operator; if $a \ne 0$ then $A^{-1} = \frac{1}{a} I - \frac{b}{a^2} T$. Thus $\Omega$ is an $FF$-algebra, but it contains a finite-dimensional operator $T$.

Example 2.2. Let $\Omega$ denote the algebra of all lower-triangular Toeplitz operators on $\ell_2$ (or on any $\ell_p$, $p \in (1, \infty)$). The algebra $\Omega$ does not contain non-zero compact operators¹, but it contains the non-invertible Fredholm operator $Vx = (0, x_1, x_2, \ldots)$, i.e., $\Omega$ is not an $FF$-algebra.

Conclusion 2.3. Examples 2.1 and 2.2 show that for a general subalgebra $\Omega$ (even in Hilbert spaces), the statements (i) and (ii) from Theorem 1.1 are independent. Thus (in contrast with Theorem 1.1) an $FF$-algebra $\Omega$ may have non-zero compact operators.

In the remainder of this section we study the structure of the ideals of compact operators in $FF$-algebras. Recall that a two-sided ideal $J$ of an algebra $\Omega$ is called a nil-ideal (a quasinilpotent ideal) if all its elements are nilpotent (quasinilpotent).

Proposition 2.4. Let $\Omega$ $(\subset L(\mathcal{B}))$ be an $FF$-algebra. Then $\mathcal{K}(\Omega)$ is a quasinilpotent ideal in $\Omega$. In particular, $\mathcal{K}_0(\Omega)$ is a nil-ideal in $\Omega$, and it is not necessarily true that $\mathcal{K}(\Omega) = \{0\}$ or $\mathcal{K}_0(\Omega) = \{0\}$.

Proof. It is clear that $\mathcal{K}(\Omega)$ is a two-sided ideal in $\Omega$. Let $K \in \mathcal{K}(\Omega)$ and $A = K - \lambda I$. If $\lambda \ne 0$, then $A$ is an $F$-operator and by the condition of the proposition it is invertible. Thus $\operatorname{spec}(K) = \{0\}$, i.e., $\mathcal{K}(\Omega)$ is a quasinilpotent ideal. In addition, Example 2.1 illustrates that this ideal is not necessarily trivial. In that example $\mathcal{K}(\Omega) = \mathcal{K}_0(\Omega) = \{\lambda T\}$ is a nil-ideal. To complete the proof we give an example of an $FF$-algebra which contains infinite-dimensional compact operators. Let

$$\Omega = \left\{ \sum_{p=0}^{m} c_p T^p : m \in \mathbb{N} \right\}, \quad \text{where for } x \in \ell_2, \quad Tx := \left( \frac{x_2}{2}, \frac{x_3}{3}, \ldots, \frac{x_{n+1}}{n+1}, \ldots \right) \tag{2.1}$$

and $\mathcal{A} := \operatorname{clos}(\Omega) \subset L(\ell_2)$. Here $T$ is an infinite-dimensional quasinilpotent compact (Hilbert-Schmidt) operator; $\Omega = \{\lambda I\} \oplus \mathcal{K}(\Omega)$, where $\lambda \in \mathbb{C}$. It is clear that $\Omega$ is an $FF$-algebra. Note that $\operatorname{clos}(\Omega) \subset L(\ell_2)$ is an $FF$-algebra (with infinite-dimensional compact operators), too. □

Corollary 2.5. Let $\Omega$ $(\subset L(H))$ be a selfadjoint $FF$-algebra. Then $\Omega$ does not have non-zero compact operators.

Proof. Let $K \in \mathcal{K}(\Omega)$; then $KK^* \in \mathcal{K}(\Omega)$, too. By Proposition 2.4, $KK^*$ is quasinilpotent. Thus $\|K\|^2 = \|KK^*\| = \max\{\lambda : \lambda \in \operatorname{spec}(KK^*)\} = 0$. □

¹ See, for example, the proof of Statement $4^\circ$ in Theorem 2.11 and compare with its Statement $2^\circ$.
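The inverse formula in Example 2.1 is easy to check numerically on a finite-dimensional model. The sketch below uses a rank-one nilpotent matrix in place of $T$; the dimension `N` and the scalars `a`, `b` are arbitrary illustrative choices, and a finite matrix model of course cannot reproduce the Fredholm aspects of the example.

```python
import numpy as np

N = 5
T = np.zeros((N, N), dtype=complex)
T[0, N - 1] = 1.0  # rank-one matrix with T @ T == 0

a, b = 2.0 - 1.0j, 3.0 + 0.5j  # any a != 0 works
A = a * np.eye(N) + b * T

# Candidate inverse from Example 2.1: (1/a) I - (b/a^2) T.
A_inv_formula = (1 / a) * np.eye(N) - (b / a**2) * T

square_is_zero = np.allclose(T @ T, 0)
is_two_sided_inverse = np.allclose(A @ A_inv_formula, np.eye(N)) and np.allclose(
    A_inv_formula @ A, np.eye(N)
)
```

The check works because $(aI + bT)\left(\frac{1}{a}I - \frac{b}{a^2}T\right) = I - \frac{b^2}{a^2}T^2 = I$ whenever $T^2 = 0$.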


An inverse question. Let $\mathcal{K}(\Omega)$ be a quasinilpotent ideal in $\Omega$. Is $\Omega$ an $FF$-algebra? The answer is negative even when $\mathcal{K}(\Omega)$ is a nil-ideal and (moreover) even if $\mathcal{K}(\Omega) = \{0\}$. This can be confirmed by Example 2.2, where $\mathcal{K}(\Omega) = \{0\}$, but $\Omega$ is not an $FF$-algebra.

Now we are going to impose conditions on the algebra $\Omega \subset L(\mathcal{B})$ under which the implication (i) $\Rightarrow$ (ii) holds in $\Omega$. We start with

Proposition 2.6. Let $\Omega$ be an $F$-closed subalgebra of $L(\mathcal{B})$. If $\Omega$ does not contain non-zero compact operators, then $\Omega$ is an $FF$-algebra.

Proof. Let $A \in F \cap \Omega$. Since $\Omega$ is $F$-closed there exists an operator $B \in \Omega$ such that $BA - I = K_1$ and $AB - I = K_2$ are compact operators in $\Omega$. By the conditions of the proposition $K_1 = K_2 = 0$, i.e., $A \in GL$. □

Corollary 2.7. Let $\Omega$ be a $C^*$-subalgebra of $L(H)$. If $\Omega$ does not have non-zero compact operators, then $\Omega$ is an $FF$-algebra.

This statement follows from Proposition 2.6 and the following

Lemma 2.8. Each $C^*$-subalgebra $\Omega \subset L(H)$ is $F$-closed.

Proof. Let $\pi$ be the canonical homomorphism $L(H) \to \hat{L} := L(H)/\mathcal{K}(H)$. It is well known that $\hat{L}$ is a $C^*$-algebra. Denote

$$\hat{\Omega} = \{\hat{X} \in \hat{L} \text{ such that } \pi^{-1}(\hat{X}) \cap \Omega \ne \emptyset\}.$$

It is not difficult to check that $\hat{\Omega}$ is a $C^*$-subalgebra of $\hat{L}$. Let $A \in F \cap \Omega$; then $\hat{A}$ is invertible in $\hat{\Omega}$ (because $C^*$-subalgebras are inverse closed). Thus, there exists $B \in \Omega$ such that $\hat{A}\hat{B} - \hat{I} = \hat{B}\hat{A} - \hat{I} = 0$, and hence the operators $AB - I$ and $BA - I$ are compact. □

To give another condition which provides the implication (i) $\Rightarrow$ (ii), we need the following definition. Let $X$ be a subset of $L(\mathcal{B})$. We say that $X$ is symmetric if for any $A \in X$ there exists an operator $A' \in X$ such that $\operatorname{spec}(A'A) \subset \mathbb{R}$.

Theorem 2.9. Let $\Omega$ be a Banach subalgebra of $L(\mathcal{B})$. Assume that the set $X$ of all Fredholm operators $A \in \Omega$ is symmetric. If the algebra $\Omega$ does not contain nilpotent finite-dimensional operators, then it is an $FF$-algebra.

Proof. Assume that there exists a non-invertible $F$-operator $A \in \Omega$. Then there exists $A' \in \Omega \cap F$ such that $\operatorname{spec}(A'A) \subset \mathbb{R}$. Since $A$ is not invertible, it follows that at least one of the operators $A'A$ or $AA'$ (we denote it by $B$) is not invertible. Thus $B$ $(\in \Omega)$ is a non-invertible $F$-operator and $\operatorname{spec}(B) \subset \mathbb{R}$. Recall that $\operatorname{spec}(B)$ denotes the spectrum of the operator $B$ in the algebra $L(\mathcal{B})$. Let $\mathcal{F}(B) := \{\lambda : B - \lambda I \in F\}$ denote the set of $F$-points of the operator $B$, and let $\mathcal{F}_0$ denote the unbounded component of $\mathcal{F}(B)$. Since $\operatorname{spec}(B) \subset \mathbb{R}$ and $B$ is a non-invertible $F$-operator, it follows that $\lambda_0 = 0$ belongs to the unbounded component of the $F$-points of $B$ and hence ([GoKre, Theorem 3.6]) it is an isolated $F$-point of $\operatorname{spec}(B)$. Let $\Gamma$ denote a circle $|\lambda| = r$ such that $B - \lambda I$ $(0 < |\lambda| \le r)$ is invertible. Since the spectrum of the operator $B - \lambda I$ has a connected complement in $\mathbb{C}$, it follows that $(B - \lambda I)^{-1} \in \Omega$ $(0 < |\lambda| \le r)$, and since $\Omega$ is a closed algebra it follows that the Riesz projection

$$\mathcal{R}(B, \Gamma) := -\frac{1}{2\pi i} \int_{\Gamma} (B - \lambda I)^{-1}\, d\lambda \tag{2.2}$$

belongs to $\Omega$ and is a non-zero finite-dimensional operator. By Proposition 2.4 this operator is nilpotent, and this contradicts the condition of the theorem. □

Remark 2.10. The condition that $\Omega \cap F$ is symmetric in Theorem 2.9 is essential. Namely, the algebra $\Omega$ in Example 2.2 satisfies all conditions of Theorem 2.9 except this one, but the implication (i) $\Longrightarrow$ (ii) fails.

We conclude this section by considering the following class of subalgebras without compact operators. Let $\{U_n\} \subset L(\mathcal{B})$ denote a sequence of isometries which tends weakly to zero, and let $C(U_n, \mathcal{B}) \subset L(\mathcal{B})$ be the commutant of the set $\{U_n\}$. Denote

$$|A| = \inf_{K \in \mathcal{K}(\mathcal{B})} \|A + K\|. \tag{2.3}$$
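The Riesz projection (2.2) used in the proof of Theorem 2.9 can be approximated numerically when $B$ is a matrix with an isolated eigenvalue at $0$. The sketch below is a finite-dimensional illustration only: the matrix `B`, the radius, and the helper name `riesz_projection` are assumptions made for the example. Parametrizing $\Gamma$ by $\lambda = r e^{i\theta}$ gives $d\lambda = i\lambda\, d\theta$, so (2.2) becomes $-\frac{1}{2\pi}\int_0^{2\pi} \lambda (B - \lambda I)^{-1}\, d\theta$, which the code discretizes with the trapezoid rule.

```python
import numpy as np

# B has eigenvalues 0, 1, 3; only the eigenvalue 0 lies inside the circle |lambda| = 0.5.
V = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
B = V @ np.diag([0.0, 1.0, 3.0]) @ np.linalg.inv(V)


def riesz_projection(B, radius, num_points=256):
    """Approximate -(1/(2*pi*i)) * integral over |lambda| = radius of (B - lambda I)^{-1}."""
    n = B.shape[0]
    identity = np.eye(n)
    total = np.zeros((n, n), dtype=complex)
    for k in range(num_points):
        lam = radius * np.exp(2j * np.pi * k / num_points)
        # d(lambda) = i * lambda * d(theta), so the factor 1/(2*pi*i) reduces to 1/num_points.
        total += lam * np.linalg.inv(B - lam * identity)
    return -total / num_points


P = riesz_projection(B, radius=0.5)

# Exact spectral projection onto the kernel of B, for comparison.
P_exact = V @ np.diag([1.0, 0.0, 0.0]) @ np.linalg.inv(V)
```

In this model `P` is a rank-one idempotent with `B @ P` equal to zero, matching the role of $\mathcal{R}(B, \Gamma)$ as a non-zero finite-dimensional operator associated with the isolated point $\lambda_0 = 0$.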

Theorem 2.11. Let $\Omega$ be any unital subalgebra of $C(U_n, \mathcal{B})$. Then
$1^\circ$. The equality $\|A\| = |A|$ holds for all $A \in C(U_n, \mathcal{B})$.
$2^\circ$. The algebra $\Omega$ does not have non-zero compact operators.
$3^\circ$. If $\Omega \subset C(U_n, H)$ is a $C^*$-algebra, then it is an $FF$-algebra.
$4^\circ$. In general, the algebra $\Omega$ is not necessarily an $FF$-algebra.
$5^\circ$. If the set of Fredholm operators $A \in \Omega$ is symmetric, then $\Omega$ is an $FF$-algebra.
$6^\circ$. If $\Omega$ is an $F$-closed algebra, then it is an $FF$-algebra.
$7^\circ$. Let $Y$ be a subset of $C(U_n, \mathcal{B})$ and let the set of all invertible operators $A \in C(U_n, \mathcal{B})$ be dense in $Y \cap F$; then each Fredholm operator from $Y$ is invertible.

The following known lemma will be used in the proof of this theorem.

Lemma 2.12. ([KF, Lemma 1]) Let $A_n \in GL(\mathcal{B})$ and $\|A_n - A\| \to 0$, where $A$ is a non-invertible $F$-operator. Then there exists a subsequence $A_{k_n}$ such that $\|A_{k_n}^{-1}\|^{-1} A_{k_n}^{-1} \to S$, where $S$ is a finite-dimensional operator.

Proof. Statement $1^\circ$ (of Theorem 2.11) follows from [K, Theorem 4.3]. Statement $2^\circ$ follows from Statement $1^\circ$. Indeed, if $T \in \mathcal{K}(\Omega)$ then $\|T\| = |T| = \inf_{K \in \mathcal{K}(\mathcal{B})} \|K + T\| = 0$. Statement $3^\circ$ follows from Statement $2^\circ$ and Theorem 1.1. To prove Statement $4^\circ$ consider in $\ell_p$, $p \in (1, \infty)$, the sequence $\{U^n\}$, $n \in \mathbb{N}$, of isometries, where $Ux = (0, x_1, x_2, \ldots)$. It can be easily checked that $\{U^n\}$ tends weakly to zero. It is well known (and can be easily checked) that the commutant of the operator $U$ (as well as of the set $\{U^n\}$) coincides with the algebra of all lower triangular Toeplitz matrices. This algebra satisfies the condition of the theorem, but it contains non-invertible Fredholm operators, for example $A = U$. This proves Statement $4^\circ$. Statement $5^\circ$ follows from Statement $2^\circ$ and Theorem 2.9. Statement $6^\circ$ follows from Statement $2^\circ$ and Proposition 2.6. Let us prove Statement $7^\circ$. Assume that $T \in Y$ is a non-invertible Fredholm operator. Then there exists a sequence $T_n \in C(U_n, \mathcal{B}) \cap GL(\mathcal{B})$ such that $\|T - T_n\| \to 0$. The algebra $C(U_n, \mathcal{B})$ (as a commutant) is inverse closed and hence $T_n^{-1} \in C(U_n, \mathcal{B})$. By Lemma 2.12 there exists a subsequence $T_{k_n}$ such that $\|T_{k_n}^{-1}\|^{-1} T_{k_n}^{-1}$ tends to a non-zero finite-dimensional operator $K$. Since $C(U_n, \mathcal{B})$ is closed, it follows that $K \in C(U_n, \mathcal{B})$. This contradicts Statement $2^\circ$. □
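The normalization in Lemma 2.12 can be seen in a small matrix model. The sketch below takes a singular diagonal matrix `A` and the invertible perturbations $A_n = A + \frac{1}{n} I$; both the matrices and the helper name `normalized_inverse` are illustrative assumptions. In finite dimensions every operator has finite rank, so the example only shows where the limit $S$ concentrates (on the kernel of $A$) and that $\|S\| = 1$.

```python
import numpy as np

A = np.diag([0.0, 1.0, 2.0])  # non-invertible, one-dimensional kernel


def normalized_inverse(n):
    """Return ||A_n^{-1}||^{-1} A_n^{-1} for A_n = A + I/n, using the spectral norm."""
    A_n = A + np.eye(3) / n
    inv = np.linalg.inv(A_n)
    return inv / np.linalg.norm(inv, 2)


S_approx = normalized_inverse(10**6)

# Expected limit in this model: the rank-one projection onto the kernel of A.
S = np.diag([1.0, 0.0, 0.0])
```

Here $A_n^{-1}$ has diagonal entries $n$, $(1 + 1/n)^{-1}$, $(2 + 1/n)^{-1}$; dividing by the norm $n$ sends the last two entries to zero, which is the mechanism behind the non-zero finite-dimensional limit used in the proofs of Statement $7^\circ$ and of Theorem 3.3.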

3. The closure of $FF$-subalgebras

In this section $\Omega$ denotes a (generally non-closed) $FF$-subalgebra of $L(\mathcal{B})$. We study some properties of the algebra $\mathcal{A} := \operatorname{clos}(\Omega) \subset L(\mathcal{B})$. These properties can be considered as "some approximations" to the answer to the general

Question 3.1. Let $\Omega \subset L(\mathcal{B})$ be an $FF$-algebra. Is the closure $\mathcal{A} = \operatorname{clos}(\Omega)$ an $FF$-algebra, too?

Or, to the weaker

Question 3.2. Let $\Omega$ be an $FF$-algebra and let $\mathcal{A}$ be inverse closed in $L(\mathcal{B})$. Is $\mathcal{A}$ an $FF$-algebra, too?

Questions 3.1 and 3.2 were formulated more than 15 years ago in the Lecture Notes [KMF]. As far as we know, the answers to these questions are still unknown. We start with

Theorem 3.3. Let $\Omega \subset L(\mathcal{B})$ be an $FF$-algebra and $\mathcal{A} := \operatorname{clos}(\Omega)$.
$1^\circ$. If $\mathcal{K}(\mathcal{A}) \ne \{0\}$ $(\mathcal{K}_0(\mathcal{A}) \ne \{0\})$, then it is a quasinilpotent ideal (a nil-ideal) in $\mathcal{A}$. If, in particular, $\Omega$ is a selfadjoint subalgebra of $L(H)$, then $\mathcal{K}(\mathcal{A}) = \{0\}$.
$2^\circ$. If the algebra $\mathcal{A}$ is $F$-closed, then it is an $FF$-algebra. In addition, the algebra $\mathcal{A}$ is inverse closed in $L(\mathcal{B})$.
$3^\circ$. If the algebra $\mathcal{A}$ is inverse closed and $\mathcal{K}_0(\mathcal{A}) = \{0\}$, then $\mathcal{A}$ is an $FF$-algebra.²
$4^\circ$. The algebra $\mathcal{A}$ does not contain non-invertible $F$-operators $A$ for which $\lambda_0 = 0$ is an isolated point of the spectrum of $A$.
$5^\circ$. If the algebra $\mathcal{A}$ is a subalgebra of a commutant $C(U_n, \mathcal{B})$, defined in Section 2, then $\mathcal{A}$ is an $FF$-algebra.

² In fact, Statement $3^\circ$ was proved in [KF, Theorem 2], but for completeness we give here a short proof of this statement.

The following known statement will be used in the proof of this theorem.

Lemma 3.4. Let $A \in L(\mathcal{B})$ be a non-invertible $F$-operator and let there exist $r > 0$ such that $\{\lambda : 0 < |\lambda| \le r\} \subset \rho(A)$. Then there exists a number $\delta > 0$ such that for each operator $B \in L(\mathcal{B})$ with $\|B - A\| < \delta$ the set $\{\lambda : 0 \le |\lambda| \le r\} \cap \operatorname{spec}(B)$ consists of a finite number of points $\lambda_j$ $(|\lambda_j| < r)$ such that the operators $B - \lambda_j I$ are $F$-operators and

$$\nu_0(A) = \sum_j \nu_{\lambda_j}(B), \tag{3.1}$$

where $\nu_\lambda(B)$ denotes the algebraic multiplicity of the number $\lambda$.

This lemma follows from [GoKre, Theorem 4.3]. Now we are ready to prove Theorem 3.3.

Proof. In order to prove the first statement in $1^\circ$ it is enough to show that each compact operator from the algebra $\mathcal{A}$ is quasinilpotent. Let $K \in \mathcal{A}$ be a compact operator. If it is not quasinilpotent, then for some $0 \ne \lambda_1 \in \operatorname{spec}(K)$ the point $\lambda = 0$ is an isolated point of the spectrum of the non-invertible Fredholm operator $B = K - \lambda_1 I \in \mathcal{A}$. By Lemma 3.4 there exists $r > 0$ such that for each operator $T \in L(\mathcal{B})$ with $\|B - T\| < r$, there exists $\lambda_0 \in \mathbb{C}$ such that the operator $T - \lambda_0 I$ is a non-invertible Fredholm operator. Taking such an operator $T$ from the dense algebra $\Omega$, we come to a contradiction. This proves the first statement in $1^\circ$. The second statement in $1^\circ$ is evident because $C^*$-algebras (and, in particular, the algebra $\mathcal{A}$) do not contain non-zero quasinilpotent ideals or nil-ideals.

$2^\circ$. Let the algebra $\mathcal{A}$ be $F$-closed and $A \in F \cap \mathcal{A}$. Then there exists $B \in \mathcal{A}$ such that $BA = I + K$, where $K$ $(\in \mathcal{A})$ is a compact operator. It follows from $1^\circ$ that $K$ is quasinilpotent. Since $\operatorname{spec}(K)$ is nowhere dense in $\mathbb{C}$, it follows that the spectra of $K$ in the algebras $L$ and $\mathcal{A}$ coincide and hence $(I + K)^{-1} \in \mathcal{A}$. Thus $(I + K)^{-1} BA = I$ and the operator $A$ is left invertible in $\mathcal{A}$. Since also $\operatorname{ind} A = 0$ (because $A$ is a limit of a sequence of invertible operators) it follows that $A$ is invertible. This proves the first statement of $2^\circ$. Moreover, this proves the second statement of $2^\circ$ because $A^{-1} = (I + K)^{-1} B \in \mathcal{A}$.

$3^\circ$. Suppose that $\mathcal{A}$ is not an $FF$-algebra. Then there exists a non-invertible $F$-operator $A \in \mathcal{A}$ and we can take a sequence $A_n \in \Omega \cap F$ such that $\|A_n - A\| \to 0$. Since $\Omega$ is an $FF$-algebra, it follows that $A_n \in \Omega \cap GL$. The algebra $\mathcal{A}$ is inverse closed and hence $A_n^{-1} \in \mathcal{A}$. By Lemma 2.12 (passing to a subsequence if necessary) $\|A_n^{-1}\|^{-1} A_n^{-1} \to S \in \mathcal{K}_0(\mathcal{A})$. Moreover, $S \ne 0$ because $\|S\| = 1$. This contradicts the conditions of Statement $3^\circ$, and this statement is proved.

$4^\circ$. Suppose that $\mathcal{A}$ contains a non-invertible $F$-operator $A$. If $\lambda = 0$ is an isolated point of $\operatorname{spec}(A)$, then, by Lemma 3.4, there exists $\delta > 0$ such that for each operator $B \in L(\mathcal{B})$ with $\|B - A\| < \delta$, there exists $\lambda_0$ such that $B - \lambda_0 I$ is a non-invertible $F$-operator. As in the proof of $1^\circ$ we take $B \in \Omega$ and come to a contradiction, which proves Statement $4^\circ$.

$5^\circ$. The algebra $\mathcal{A}$ is a subset of $C(U_n, \mathcal{B})$ and the set $\Omega \cap GL$ $(\subset C(U_n, \mathcal{B}) \cap GL)$ is dense in $\mathcal{A} \cap F$. Thus $C(U_n, \mathcal{B}) \cap GL$ is dense in $\mathcal{A} \cap F$, and we are in the situation of Statement $7^\circ$ of Theorem 2.11 (where the set $Y$ is replaced by the algebra $\mathcal{A}$). This proves that $\mathcal{A}$ is an $FF$-algebra. □


Remark 3.5. Example 2.1 and the example used in the proof of Proposition 2.4 (see equalities (2.1)) show that the algebra $\mathcal{A}$ in Statement $1^\circ$ of Theorem 3.3 may contain finite-dimensional as well as infinite-dimensional compact operators.

Remark 3.6. The sufficient condition $\mathcal{K}_0(\mathcal{A}) = \{0\}$ in Statement $3^\circ$ of Theorem 3.3 is not necessary for $\mathcal{A}$ to be an $FF$-algebra. This can be confirmed by Example 2.1.

We conclude this section by considering certain classes of algebras which admit generalized Gelfand transforms. Let $M_n(\mathbb{C})$ denote the algebra of all $n \times n$ matrices with entries from $\mathbb{C}$.

Definition 3.7. We say that the algebra $\mathcal{A} \subset L(\mathcal{B})$ admits a generalized Gelfand transform of order $n$ in $\mathcal{B}$ if there exists a family of continuous homomorphisms

$$\nu_s : \mathcal{A} \to M_k(\mathbb{C}), \quad s \in \mathcal{S}, \quad k = k(s) \le n \tag{3.2}$$

such that for each $A \in \mathcal{A}$ the following equivalence holds:

$$A \in \mathcal{A} \cap GL(\mathcal{B}) \iff \det \nu_s(A) \ne 0 \quad \forall s \in \mathcal{S}. \tag{3.3}$$

If this is the case, then we write $\mathcal{A} \in GGT(\mathcal{B})$ and say that the system of homomorphisms $\{\nu_s\}$ generates a GGT of order $n$ for the algebra $\mathcal{A}$ in the algebra $L(\mathcal{B})$.

Example 3.8. Let $\Omega \subset L(\mathcal{B})$ be a commutative subalgebra; then $\mathcal{A} := \operatorname{clos}(\Omega) \in GGT(\mathcal{B})$. Indeed, if $\mathcal{A}$ is inverse closed, then the Gelfand transform on $\mathcal{A}$ (which is responsible for the invertibility of the elements from $\mathcal{A}$ in the algebra $\mathcal{A}$) also generates a $GGT(\mathcal{B})$ for $\mathcal{A}$ and, in particular, for $\Omega$. Assume that $\mathcal{A}$ is not inverse closed. Denote by $\tilde{\mathcal{A}}$ $(\mathcal{A} \subset \tilde{\mathcal{A}} \subset L(\mathcal{B}))$ some closed inverse closed commutative subalgebra of $L(\mathcal{B})$. For example, we can take the maximal commutative subalgebra of $L(\mathcal{B})$ which contains $\mathcal{A}$. The Gelfand transform in the algebra $\tilde{\mathcal{A}}$ generates a $GGT(\mathcal{B})$ for the algebra $\tilde{\mathcal{A}}$ and, in particular, for the algebra $\mathcal{A}$.

Theorem 3.9. Let $\mathcal{A} := \operatorname{clos}(\Omega) \in GGT(\mathcal{B})$. Then
$1^\circ$. $\mathcal{A}$ is not necessarily an $FF$-algebra.
$2^\circ$. But, if $\Omega$ is an $FF$-algebra then $\mathcal{A}$ is an $FF$-algebra, too.

Proof. $1^\circ$. Let $T \in L(\mathcal{B})$ be an arbitrary non-invertible Fredholm operator and $\mathcal{A} \subset L(\mathcal{B})$ an arbitrary closed commutative algebra which contains the operator $T$. Then $\mathcal{A}$ is not an $FF$-algebra, but (as was shown in Example 3.8) $\mathcal{A} \in GGT(\mathcal{B})$.

$2^\circ$. Assume that $A \in F(\mathcal{B}) \cap \mathcal{A}$ but $A \notin GL(\mathcal{B})$; then there exists a homomorphism $\nu \in \{\nu_s\}$ such that $\det \nu(A) = 0$. Let $A_n \in F(\mathcal{B}) \cap \Omega$ and $\|A - A_n\| \to 0$. Then $\det \nu(A_n) \to \det \nu(A) = 0$, and hence there exist $\lambda_n \in \operatorname{spec}(\nu(A_n))$ such that $\lambda_n \to 0$. Denote $B_n := A_n - \lambda_n I$. It is clear that $B_n \to A$, $B_n \in \Omega \cap F(\mathcal{B})$ and $B_n \notin GL(\mathcal{B})$. This is a contradiction and the theorem is proved. □


Theorem 3.10. Let $\Omega \subset L(\mathcal{B})$ be a subalgebra satisfying the Amitsur-Levitzki polynomial identity (1.1) of some order $m = 2n$, $n \in \mathbb{N}$, and let $\mathcal{A} := \operatorname{clos}(\Omega)$ be inverse closed in $L(\mathcal{B})$. If $\Omega$ is an $FF$-algebra, then $\mathcal{A}$ is an $FF$-algebra, too.

Proof. Since $\mathcal{A} := \operatorname{clos}(\Omega)$ is a Banach algebra with polynomial identity (1.1), it follows from [K, Theorem 21.1] that it admits a GGT for $\mathcal{A}$ in $\mathcal{A}$, i.e., there exists a set of homomorphisms $f_M : \mathcal{A} \to M_k(\mathbb{C})$, where $k = k(M) \le n$, such that for any operator $A \in \mathcal{A}$ the following equivalence holds:

$$A \in G\mathcal{A} \iff \det f_M(A) \ne 0 \quad \forall M \in \mathcal{M}. \tag{3.4}$$

Here $\mathcal{M}$ denotes the set of all maximal ideals of the algebra $\mathcal{A}$. Since $\mathcal{A}$ is inverse closed it follows that $A \in GL(\mathcal{B}) \iff A \in G\mathcal{A}$. Thus the set $\{f_M : M \in \mathcal{M}\}$ generates a $GGT(\mathcal{B})$ for the algebra $\mathcal{A}$. It remains to use Theorem 3.9. □

Theorem 3.11. Let $Z$ be a subset of the center of an algebra $\Omega \subset L(\mathcal{B})$ and let $\Omega$ be a finite-dimensional module over $Z$. If $\Omega$ is an $FF$-algebra, then $\mathcal{A} := \operatorname{clos}(\Omega)$ is an $FF$-algebra, too.

Proof. If the algebra $\mathcal{A}$ has a dense subalgebra $\Omega$ which is a finite-dimensional module over its center, then (see [GK, Corollary 1.2]) it admits a GGT for $\mathcal{A}$ in $L(\mathcal{B})$, and we can use Theorem 3.9. □

Corollary 3.12. Let $\Omega \subset L(\mathcal{B})$ be the smallest (generally non-closed) unital subalgebra generated by two arbitrary idempotent operators $P, R$ or by $2n$ idempotents $P_1, P_2, \ldots, P_{2n-1}, R$ with some special relations.³ If $\Omega$ is an $FF$-algebra, then $\mathcal{A} = \operatorname{clos}(\Omega)$ is an $FF$-algebra, too.

Proof. If an algebra is generated by two idempotents or by $2n$ idempotents with relations (1–4) from [BGKKRSS, Section 4], then it admits a GGT for $\mathcal{A}$ in $L(\mathcal{B})$ (see [GK] for two idempotents and [BGKKRSS] for $2n$ idempotents). Thus, again we can use Theorem 3.9. □

4. Some illustrative examples and open questions

We start with the following illustrative example.

Example 4.1. Let $\mathcal{B} = L_p(0, \infty)$, $p \in (1, \infty)$. Denote by $\{U_n\}$ the sequence of isometries defined by the equalities $U_n f(x) = n^{1/p} f(nx)$. It is not difficult to check that $U_n \to 0$ weakly. Denote by $\mathcal{A}_p$ the commutant of the set $\{U_n\}$. It follows from Theorem 2.11 that the algebra $\mathcal{A}_p$ does not contain non-zero compact operators, and $|A| = \|A\|$ for all $A \in \mathcal{A}_p$. The algebra $\mathcal{A}_p$ contains (for example) the singular integral operator $S$ and the Cesàro operators $C$, $\tilde{C}$ defined by the equalities

$$Sf(x) = \frac{1}{\pi i} \int_0^{\infty} \frac{f(y)\, dy}{y - x}; \qquad Cf(x) = \frac{1}{x} \int_0^{x} f(y)\, dy; \qquad \tilde{C}f(x) = \int_x^{\infty} \frac{f(y)\, dy}{y}; \tag{4.1}$$

the integral operators

$$Mf(x) = a f(x) + \int_0^{\infty} k(x, y) f(y)\, dy \quad (a \in \mathbb{C}), \tag{4.2}$$

where $k(x, y)$ is measurable on $[0, \infty) \times [0, \infty)$ and satisfies the following two conditions:

$$k(tx, ty) = \frac{k(x, y)}{t} \quad (t \in (0, \infty)) \quad \text{and} \quad \gamma_p(k) := \int_0^{\infty} |k(u, 1)|\, u^{1/p - 1}\, du < \infty; \tag{4.3}$$

and the shift operators

$$Wf(x) = \sum_{k=1}^{m} c_k f(a_k x), \quad \text{where } c_k \in \mathbb{C},\ a_k > 0. \tag{4.4}$$

³ See the relations (1–4) in [BGKKRSS, Section 4].

Consider a few subalgebras of the algebra $\mathcal{A}_p$, generated by the operators (4.1)–(4.4).

Alg1. Denote by $\mathcal{S}_p$ $(\subset \mathcal{A}_p)$ the unital Banach algebra generated by the operator $S$. This algebra is symmetric for each $p \in (1, \infty)$ (see, for example, [K, Theorem 13.6]). For the operator $S$ one can take

$$S' = [\cos\theta_p\, S - i \sin\theta_p\, I]\, [\cos\theta_p\, I - i \sin\theta_p\, S]^{-1}, \tag{4.5}$$

where $\theta_p = 2\pi/p$. If, in particular, $p = 2$, then $S' = S^* = S$. It follows from Statement $5^\circ$ of Theorem 2.11 that $\mathcal{S}_p$ is an $FF$-algebra for all $p \in (1, \infty)$. The algebra $\mathcal{S}_p$ is fairly large. It contains, for example, the operators

$$N_w f(x) = \frac{1}{\pi i} \int_0^{\infty} \frac{f(y)\, dy}{y + wx} \quad (|w| = 1) \quad \text{and} \quad Nf(x) = \int_0^{\infty} \ln\frac{y}{x}\, \frac{f(y)\, dy}{y - x},$$

see [K, Section 13]; and the operators

$$S^{-1} f(x) = \frac{1}{\pi i} \int_0^{\infty} \sqrt{\frac{y}{x}}\, \frac{f(y)\, dy}{y - x} \quad (p \in (1, 2))$$

and

$$S^{-1} f(x) = \frac{1}{\pi i} \int_0^{\infty} \sqrt{\frac{x}{y}}\, \frac{f(y)\, dy}{y - x} \quad (p \in (2, \infty)),$$

see [GK1, V. II, p. 98].

Alg2. By $\mathcal{M}_p$ $(p \in (1, \infty))$ we denote the set of all operators (4.2) whose kernels satisfy the conditions (4.3). It is not difficult to check that $\mathcal{M}_p$ is an algebra. Indeed, let $k_1, k_2, k$ correspond to the integral operators $K_1, K_2, K = K_1 K_2$, where

$$K_j f(x) := \int_0^{\infty} k_j(x, y) f(y)\, dy \quad (j = 1, 2); \qquad k(x, y) = \int_0^{\infty} k_1(x, z)\, k_2(z, y)\, dz.$$

Then

$$k(ax, ay) = \int_0^{\infty} k_1(ax, z)\, k_2(z, ay)\, dz = \frac{1}{a} \int_0^{\infty} k_1\!\left(x, \frac{z}{a}\right) k_2\!\left(\frac{z}{a}, y\right) d\!\left(\frac{z}{a}\right) = \frac{1}{a}\, k(x, y). \tag{4.6}$$

Invertibility of Certain Fredholm Operators Next we denote (for short) 1/𝑝 − 1 = 𝑟 and check: $∫ ∞ $ ∫ ∞ ∫ ∞ $ $ ∣𝑘(𝑢, 1)∣𝑢𝑟 𝑑𝑢 = 𝑢𝑟 𝑑𝑢 $$ 𝑘1 (𝑢, 𝑧)𝑘2 (𝑧, 1)𝑑𝑧 $$ 0 0 ∫0 ∞ ∫ ∞$ ( 𝑢 )$$ ( 𝑢 )𝑟 ( 𝑢 ) $ ≤ ,1 $ ∣𝑘2 (𝑧, 1)∣𝑧 𝑟 𝑑𝑧 𝑑 $𝑘1 𝑧 𝑧 𝑧 0 0 ≤ 𝛾𝑝 (𝑘2 )𝛾𝑝 (𝑘1 ).

355

(4.7)

Equalities (4.6) and (4.7) show that $\mathcal{M}_p$ is an algebra.

Theorem 4.2. The algebra clos$(\mathcal{M}_p)$ is an $FF$-algebra.

Proof. It is known (see, for example, [K-G, Theorem 2]) that the spectrum of the operator $M$ defined by (4.2) coincides with the curve
$$\lambda = a + \int_{-\infty}^{\infty} k(e^t,1)\,e^{(\frac{1}{p}+ix)t}\,dt \quad (x \in \mathbb{R}) \tag{4.8}$$
and for each point $\lambda$ of this curve, the operator $M - \lambda I$ is not a Fredholm operator. It follows that $\mathcal{M}_p$ is an $FF$-algebra. Using Statement 5° from Theorem 3.3 we obtain that clos$(\mathcal{M}_p)$ is an $FF$-algebra, too. □

Alg3. Denote by $\mathcal{C}_p$ the unital Banach algebra generated by the operators $C$ and $\tilde{C}$. This is a commutative algebra because $C\tilde{C} = \tilde{C}C = C + \tilde{C}$. It can be directly checked that $C$ and $\tilde{C}$ belong to the algebra $\mathcal{M}_p$. It follows from Theorem 4.2 that $\mathcal{C}_p$ is an $FF$-algebra.

Alg4. Denote by $\mathcal{W}_p$ the unital Banach algebra generated by the operators (4.4). It is well known that each Fredholm operator from $\mathcal{W}_p$ is invertible. See, for example, the book [A], where the absence of non-invertible Fredholm operators is shown for more general classes of algebras. Since the algebra $\mathcal{W}_p$ is commutative it follows from Proposition 1.3 (see also Example 3.8 & Theorem 3.9) that clos$(\mathcal{W}_p)$ is an $FF$-algebra.

Consider another illustrative example:

Example 4.3. Let $\mathcal{B}_p := L_p(\Gamma)$, $p \in (1,\infty)$, where $\Gamma$ is the unit circle, and let $A \in L(\mathcal{B}_p)$ be a singular integral operator
$$Af(t) = a(t)f(t) + b(t)\int_\Gamma \frac{f(\tau)\,d\tau}{\tau - t}, \quad t \in \Gamma, \tag{4.9}$$
where $a$ and $b$ are piecewise constant functions continuous on $\Gamma \setminus \{-1,1\}$. Denote by $\Omega_p$ the set of operators $\sum_{k=1}^m A_{k1}A_{k2}\cdots A_{k,\ell(k)}$, where $m \in \mathbb{N}$ and $A_{kj}$ are operators of the form (4.9).

Theorem 4.4. Let $\mathcal{A}_p := \mathrm{clos}(\Omega_p)$. Then
1°. The algebra $\mathcal{A}_p$ is an $FF$-algebra if and only if $\Omega_p$ is.
2°. The algebra $\Omega_p$ is an $FF$-algebra if and only if $p = 2$.


Proof. The algebra $\Omega_p$ is generated by the following two idempotents: the analytic projection
$$P\Bigl(\sum a_k t^k\Bigr) = \sum_{k \ge 0} a_k t^k, \quad t \in \Gamma,\ a_k \in \mathbb{C},$$
and the operator $R$ of multiplication by the characteristic function of the upper semi-circle. If $\Omega_p$ is an $FF$-algebra then, by Corollary 3.12, the algebra $\mathcal{A}_p := \mathrm{clos}(\Omega_p)$ is an $FF$-algebra too. This proves Statement 1°.

Consider an operator $B = aP + Q$ ($\in \Omega_p$), where $a(t)$ takes only two values, $\pm 1$, and $Q = I - P$. It follows from [GK1, Ch. 9, Theorem 3.1] that $B$ is a Fredholm operator if and only if $p \ne 2$. If $p > 2$ then $\mathrm{ind}\,B = 1$; if $p < 2$ then $\mathrm{ind}\,B = -1$. Thus the only candidate to be an $FF$-algebra is the algebra $\Omega_2$. Here is one of the ways to confirm that the algebra $\Omega_2$ is an $FF$-algebra. Consider the following sequence of operators
$$U_n f(t) = \frac{2\sqrt{n}}{n+1+t(n-1)}\, f\Bigl(\frac{(n+1)t+n-1}{n+1+t(n-1)}\Bigr). \tag{4.10}$$
It is not difficult to check that $\{U_n\}$ ($\subset L(\mathcal{B}_2)$) is a sequence of isometries which tends weakly to zero and that the selfadjoint operators $P$, $R$ commute with all $U_n$ (we omit the details). The algebra $\mathcal{A}_2$ is a $C^*$-subalgebra of the commutant $\tilde{\mathcal{A}}$ of the set $\{U_n\}$, and it follows from Statement 3° of Theorem 2.11 that $\mathcal{A}_2$ is an $FF$-algebra. □

Remark 4.5. The algebra $\mathcal{A}$ in Theorem 3.10 satisfies the following two conditions: it is inverse closed and it satisfies the Amitsur-Levitski polynomial identity (1.1).
∙ If we omit both of these conditions, then we come to the open Question 3.1.
∙ If we omit only the second condition, then we come to the open Question 3.2.
∙ Finally, if we omit only the first condition, then we come to

Question 4.6. Let $\Omega \subset L(\mathcal{B})$ be an $FF$-subalgebra with Amitsur-Levitski polynomial identity (1.1) of some order $2n$, $n \in \mathbb{N}$. Is $\mathcal{A} := \mathrm{clos}(\Omega)$ an $FF$-algebra, too?

If $n = 1$ then the Amitsur-Levitski identity $x_1x_2 - x_2x_1 = 0$ means that $\Omega$ is a commutative algebra and it follows from Proposition 1.3 that the answer to Question 4.6 is positive. As far as we know, Question 4.6 for $n > 1$ is still open.


References

[A] A. Antonevich, Linear Functional Equations. Operator Approach, OT 83, Birkhäuser Verlag, 1996.
[BGKKRSS] A. Böttcher, I. Gohberg, Yu. Karlovich, N. Krupnik, S. Roch, B. Silbermann, I. Spitkovsky, Banach algebras generated by idempotents and applications, Operator Theory, V. 90 (1996), 19–54.
[CL] L.A. Coburn and A. Lebov, Algebraic Theory of Fredholm Operators, J. Math. Mech. 15 (1966), 577–584.
[GoKre] I. Gohberg and M.G. Krein, The basic propositions on defect numbers, root numbers and indices of linear operators, Uspehi Mat. Nauk 12, no. 2(74) (1957), 43–118 (Russian). English transl. Amer. Math. Soc. Transl. (2) 13 (1960), 185–264.
[GK] I. Gohberg and N. Krupnik, Extension theorems for Fredholm and invertibility symbols, IEOT 16 (1993), 515–529.
[GK1] I. Gohberg and N. Krupnik, One-Dimensional Linear Singular Integral Equation, Vol. I–II, Birkhäuser Verlag, Basel – Boston, 1992.
[K] N. Krupnik, Banach Algebras with Symbols and Singular Integral Operators, Birkhäuser Verlag, Basel – Boston, 1987.
[KF] N. Krupnik and I. Feldman, On the invertibility of certain Fredholm operators, Izv. Akad. Nauk MSSR, ser. fiz. i mat. nauk 2 (1982), 8–14. (Russian)
[KMF] N. Krupnik, A. Markus, I. Feldman, Operator algebras in which all Fredholm operators are invertible, Lecture Notes in Mathematics, Linear and Complex Analysis, 1533 (1994), 124–125.
[K-G] M. Kozhokar-Gonchar, The spectrum of Cesàro operators, Mat. Issled. 7, no. 4(26) (1972), 94–103. (Russian)
[MF] A. Markus and I. Feldman, On the algebras generated by operators with one-side inverses, Research of Dif. Equat., Shtiintsa, Kishinev (1983), 42–46. (Russian)

Israel Feldman
Department of Mathematics
Bar-Ilan University
Ramat-Gan 52900, Israel
e-mail: [email protected]

Nahum Krupnik
208–7460 Bathurst Str.
Vaughan, L4J 7K9
Ontario, Canada
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 359–376
© 2012 Springer Basel AG

Bernstein Widths and Super Strictly Singular Inclusions

F.L. Hernández, Y. Raynaud and E.M. Semenov

To the memory of Professor Israel Gohberg

Abstract. The super strict singularity of inclusions between rearrangement invariant function spaces on [0, 1] is studied. Estimates of the Bernstein widths $\gamma_n$ of the inclusions $L_\infty \subset E$ are given. It is shown that if the inclusion $E \subset F$ is strong and the order continuous part of $\exp L_2$ is not included in $E$ then the inclusion $E \subset F$ is super strictly singular. Applications to the classes of Lorentz and Orlicz spaces are given.

Mathematics Subject Classification (2000). 41A46, 46E30.

Keywords. Strictly singular operator, rearrangement invariant spaces, Rademacher system, widths.

0. Introduction

A linear operator $A$ between two Banach spaces $E$ and $F$ is called strictly singular (SS in short) if $A$ fails to be an isomorphism on any infinite-dimensional subspace of $E$. This concept was introduced by Tosio Kato in [K]. A stronger notion is the following. An operator $A$ from $E$ to $F$ is called super strictly singular (SSS in short) if the sequence of Bernstein widths $b_n(A)$ tends to 0 when $n \to \infty$, where
$$b_n(A) = \sup_{Q \subset E,\ \dim Q = n}\ \inf_{x \in Q,\ \|x\| = 1} \|Ax\|_F.$$
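Two immediate consequences of this definition are worth writing out. The sketch below uses nothing beyond the formula for $b_n(A)$; the subspace names $Q'$ and $Z$ and the constant $c$ are notation introduced here for illustration.

```latex
% Monotonicity: every (n+1)-dimensional Q contains an n-dimensional Q' \subset Q, and
\inf_{x\in Q,\ \|x\|=1}\|Ax\|_F \;\le\; \inf_{x\in Q',\ \|x\|=1}\|Ax\|_F \;\le\; b_n(A),
\qquad\text{hence}\qquad b_{n+1}(A)\le b_n(A).

% SSS implies SS: suppose \|Ax\|_F \ge c\,\|x\|_E for some c > 0 and all x in an
% infinite-dimensional subspace Z \subset E. Any n-dimensional Q \subset Z then gives
b_n(A) \;\ge\; \inf_{x\in Q,\ \|x\|=1}\|Ax\|_F \;\ge\; c \qquad\text{for every } n\in\mathbb{N},
% so b_n(A) does not tend to 0 and A is not SSS.
```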

This notion was first introduced by B. Mityagin and A. Pelczynski in [MP]. Concerning widths we refer to [PI]. It is clear that $K \subset SSS \subset SS$, where $K$ denotes the class of compact operators. Properties of SSS operators have been given in [M], [P], [CCT], [FHR], [SSTT] and [S]. In the literature this operator ideal has also been called the ideal of finite strictly singular operators ([SSTT], [S]). In the context of Banach lattices a weaker notion is the following one ([HR]): An operator $A$ from a Banach lattice $E$ to a Banach space $F$ is said to be disjointly strictly singular (DSS in short) if there is no disjoint sequence of non-null vectors $(x_n)$ in $E$ s.t. the restriction of $A$ to the subspace $[(x_n)]$ spanned by the vectors $(x_n)$ is an isomorphism. Clearly SS ⊂ DSS. In general these operator classes $K \subset$ SSS ⊂ SS ⊂ DSS are different. However any SS operator in an $l_p$-space ($1 \le p < \infty$) is compact. This was proved by I. Gohberg, A. Markus and I. Feldman in [GMF] (for $p = 2$ it was done before by J. Calkin [C]).

The authors gratefully acknowledge the support of MTM-grant 2008–02652, RFBR-grant 08–01–00226a and Complutense University grant.

It easily follows from results of Grothendieck that on probability measure spaces the canonical inclusions $L_\infty \subset L_p$ are SS for any $p < \infty$. More generally, the inclusion $L_\infty \subset E$ is always SS for any rearrangement invariant space $E \ne L_\infty$ on [0, 1] (S. Novikov [N]). In fact it turns out that this inclusion is SSS ([FHR]).

This paper is devoted to the study of SSS inclusions between arbitrary rearrangement invariant function spaces. First, in Section 1 we generalize Grothendieck's result by estimating the Bernstein widths of the inclusions $L_\infty \subset E$ for r.i. function spaces $E$ on [0, 1] (this leads to a new proof of the fact that these inclusions are always SSS). Afterwards we study the SSS property for general inclusions $E \subset F$ of r.i. spaces on [0, 1]. The main results are given in Section 3 (see Theorem 17 and its Corollaries). The notion of strong inclusion studied in Section 2 plays an important role for that. If $E$, $F$ are r.i. and $E \subset F$, then this inclusion is called strong if the topology of the norm of $F$ and that of convergence in measure coincide on the unit ball of $E$. Theorem 17 states that if the inclusion $E \subset F$ is strong and moreover the order-continuous part $G$ of the Orlicz space $\exp L_2$ is not included in $E$ then the inclusion $E \subset F$ is SSS.

Recall that a Banach space $E$ of measurable functions on [0, 1] is said to be rearrangement invariant (r.i.)
if the following conditions hold:
1) if $y \in E$ and $|x(t)| \le |y(t)|$ a.e., then $x \in E$ and $\|x\|_E \le \|y\|_E$;
2) if $y \in E$ and $x$ and $y$ are equimeasurable, then $x \in E$ and $\|x\|_E = \|y\|_E$.

As usual (cf. [LT2] and [KPS]) we shall assume that r.i. spaces $E$ are separable or maximal (i.e., $E = E''$), where $E''$ denotes the space of measurable functions $x$ for which
$$\|x\|_{E''} = \lim_{n\to\infty} \|\min(|x|, n)\|_E < \infty.$$
The space $E'$ endowed with the norm
$$\|x\|_{E'} = \sup_{\|y\|_E \le 1} \int_0^1 x(t)y(t)\,dt$$

is an r.i. space. Denote by $\chi_e$ the characteristic function of a measurable set $e$. The function $\varphi_E(s) = \|\chi_e\|_E$, where $e \subset [0,1]$ is any measurable set of measure $s$, is called the fundamental function of the r.i. space $E$. We will assume, w.l.o.g., that $\varphi_E$ is concave and $\varphi_E(1) = 1$. In this case $L_\infty \subset E \subset L_1$, and $\|x\|_{L_1} \le \|x\|_E \le \|x\|_{L_\infty}$ for any $x \in L_\infty$. It is known that $\varphi_{E'}(t) = t/\varphi_E(t)$. Given $x, y \in L_1$, we shall write $x \prec y$ if $\int_0^\tau x^*(t)\,dt \le \int_0^\tau y^*(t)\,dt$ for every $\tau \in [0,1]$. It is well known that $\|x\|_E \le \|y\|_E$ provided $x \prec y$ ([LT2], 2.a.8).

Important examples of r.i. spaces are the Orlicz, Lorentz and Marcinkiewicz spaces. If $M$ is a positive convex function on $[0,\infty)$ with $M(0) = 0$, the Orlicz space $L_M$ consists of all measurable functions $x(t)$ on [0, 1] for which
$$\|x\|_{L_M} = \inf\Bigl\{\lambda > 0 : \int_0^1 M\Bigl(\frac{|x(t)|}{\lambda}\Bigr)dt \le 1\Bigr\} < \infty.$$

If $M_p(u) = e^{u^p} - 1$, $0 < p < \infty$, then $M_p(u)$ is convex for $p \ge 1$ and is convex up to equivalence for $p < 1$. The space $L_{M_p}$ is denoted by $\exp L_p$. The Orlicz space $L_{M_2}$ is not separable and its separable part (i.e., the closure of $L_\infty$ in $L_{M_2}$) is denoted by $G$. The space $G$ plays an important role in the theory of r.i. spaces.

Let us denote by $\Omega$ the set of all increasing concave functions $\varphi(t)$ on [0, 1] with $\varphi(0) = 0$ and $\varphi(1) = 1$. The Lorentz spaces $\Lambda(\varphi)$ and $L_{p,q}$ consist of all measurable functions on [0, 1] s.t.
$$\|x\|_{\Lambda(\varphi)} = \int_0^1 x^*(t)\,d\varphi(t) < \infty,$$
resp.
$$\|x\|_{L_{p,q}} = \begin{cases} \Bigl(\dfrac{q}{p}\displaystyle\int_0^1 \bigl(x^*(t)\,t^{1/p}\bigr)^q\,\dfrac{dt}{t}\Bigr)^{1/q}, & 1 \le q < \infty, \\[2mm] \sup\limits_{0<t\le 1} x^*(t)\,t^{1/p}, & q = \infty \end{cases}$$
for $1 < p < \infty$, where $x^*$ denotes the decreasing rearrangement of $|x(t)|$. The Marcinkiewicz space $M(\varphi)$ consists of all measurable functions on [0, 1] s.t.
$$\|x\|_{M(\varphi)} = \sup_{0<s\le 1} \frac{1}{\varphi(s)}\int_0^s x^*(t)\,dt < \infty.$$

It is well known that in an r.i. space $E$ the Rademacher system $r_k(t) = \operatorname{sign}\sin 2^k\pi t$, $k \in \mathbb{N}$, generates a subspace isomorphic to $l_2$ iff $G \subset E$ (cf. [LT2], 2.b.4). Hence for a couple of r.i. spaces $E$ and $F$ with $E \subset F$, if the inclusion $E \subset F$ is SS then $G \not\subset E$. In the extreme case where $F = L_1$ it holds that $E \subset L_1$ is SS iff $G \not\subset E$ ([HNS]).

Let $1 \le q < \infty$. An r.i. space $E$ is called $q$-concave if there exists a constant $C > 0$ s.t.
$$\Bigl\|\Bigl(\sum_{k=1}^n |x_k|^q\Bigr)^{1/q}\Bigr\|_E \ge C\Bigl(\sum_{k=1}^n \|x_k\|_E^q\Bigr)^{1/q}$$
for any $n \in \mathbb{N}$ and $x_1, x_2, \dots, x_n \in E$ ([LT2], 1.d.3). Given functionals $f$, $g$, we shall write $f \approx g$ if $\frac{1}{C}f(x) \le g(x) \le Cf(x)$ for some constant $C > 0$ and every $x$ from the domain of definition.

Some results of this article were announced in [S] and [RSH].


1. Inclusion of $L_\infty$ into r.i. spaces: Generalization of Grothendieck's result

Let $E$, $F$ be a pair of r.i. spaces and $E \subset F$. Given $n \in \mathbb{N}$, denote
$$\gamma_n(E,F) = \sup_{Q \subset E,\ \dim Q = n}\ \inf_{\|x\|_E = 1,\ x \in Q} \|x\|_F.$$

Clearly $\gamma_n(E,F)$ are the Bernstein widths of the inclusion operator $I : E \subset F$. The next statement is simple (there are many similar results).

Lemma 1. Let $n \in \mathbb{N}$ and $Q$ be an $n$-dimensional subspace of $L_\infty$. There exists an element $z \in Q$ s.t. $\|z\|_{L_\infty} = 1$ and $z^2 \prec \chi_{(0,\frac{1}{n})}$.

Proof. Using the Gram-Schmidt method of orthogonalization we can find an orthonormal system $x_1, x_2, \dots, x_n$ in $Q$. Then
$$\sup_{\{a_k\}\ne 0} \frac{\bigl\|\sum_{k=1}^n a_kx_k\bigr\|_{L_\infty}}{\bigl\|\sum_{k=1}^n a_kx_k\bigr\|_{L_2}} = \sup_{\sum_{k=1}^n a_k^2 = 1,\ 0\le t\le 1} \Bigl|\sum_{k=1}^n a_kx_k(t)\Bigr| = \sup_{0\le t\le 1}\Bigl(\sum_{k=1}^n x_k^2(t)\Bigr)^{1/2} \ge \Bigl\|\Bigl(\sum_{k=1}^n x_k^2\Bigr)^{1/2}\Bigr\|_{L_2} = \sqrt{n}.$$
Hence $Q$ contains an element $z$ s.t. $\|z\|_{L_\infty} = 1$ and $\|z\|_{L_2} \le \frac{1}{\sqrt{n}}$. Then $\|z^2\|_{L_\infty} = 1$ and
$$\int_0^1 z^2(t)\,dt \le \frac{1}{n}.$$
It is easy to see that these two estimates imply $z^2 \prec \chi_{(0,\frac{1}{n})}$. □
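The last step of this proof can be made explicit. The sketch below uses only the two estimates just obtained, $\|z^2\|_{L_\infty} = 1$ and $\int_0^1 z^2\,dt \le 1/n$, together with the definition of the relation $\prec$.

```latex
% Case 0 \le \tau \le 1/n: the rearrangement (z^2)^* is bounded by \|z^2\|_{L_\infty} = 1, so
\int_0^\tau (z^2)^*(t)\,dt \;\le\; \tau \;=\; \int_0^\tau \chi_{(0,1/n)}(t)\,dt .

% Case 1/n < \tau \le 1: the partial integral is bounded by the full integral, so
\int_0^\tau (z^2)^*(t)\,dt \;\le\; \int_0^1 z^2(t)\,dt \;\le\; \frac{1}{n}
  \;=\; \int_0^\tau \chi_{(0,1/n)}(t)\,dt .

% The two cases together are exactly the statement z^2 \prec \chi_{(0,1/n)}.
```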

Theorem 2. Let $E$ be an r.i. space and $E \ne L_\infty$. The inclusion $L_\infty \subset E$ is SSS and
$$\varphi_E(1/n) \le \gamma_n(L_\infty, E) \le (\varphi_E(1/n))^{1/2} \tag{1}$$
for any $n \in \mathbb{N}$.

Proof. If $x_k(t) = \chi_{(\frac{k-1}{n},\frac{k}{n})}(t)$, $1 \le k \le n$, and $Q = \mathrm{span}\{x_k,\ 1 \le k \le n\}$, then
$$\inf_{x\in Q,\ \|x\|_{L_\infty} = 1} \|x\|_E = \|\chi_{(0,\frac{1}{n})}\|_E = \varphi_E(1/n),$$
and we get the left inequality in (1). By Lemma 1 any $n$-dimensional subspace $Q \subset L_\infty$ contains an element $z \in Q$ s.t. $\|z\|_{L_\infty} = 1$ and $z^2 \prec \chi_{(0,\frac{1}{n})}$. Applying ([LT2], 2.a.8) it follows that
$$\|z^2\|_E \le \|\chi_{(0,\frac{1}{n})}\|_E = \varphi_E(1/n). \tag{2}$$
The space $E(2)$, endowed with the norm $\|x\|_{E(2)} = \|x^2\|_E^{1/2}$, is included in $E$ and $\|x\|_E \le \|x\|_{E(2)}$ for any $x \in E(2)$ ([LT2], 1.d). The space $E(2)$ is called the 2-convexification of $E$. Hence
$$\|z\|_E \le \|z\|_{E(2)} \le (\varphi_E(1/n))^{1/2}$$
and
$$\inf_{x\in Q,\ \|x\|_{L_\infty} = 1} \|x\|_E \le \|z\|_E \le (\varphi_E(1/n))^{1/2}.$$
Hence $\gamma_n(L_\infty, E) \le (\varphi_E(1/n))^{1/2}$, which is the right inequality in (1). Now, since $\lim_{t\to 0}\varphi_E(t) = 0$ for any r.i. space $E \ne L_\infty$, we have $\lim_{n\to\infty}\gamma_n(L_\infty, E) = 0$. □
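The inequality $\|x\|_E \le \|x\|_{E(2)}$ used in the proof of Theorem 2 can be seen from a short computation. The sketch below assumes the standard Hölder-type inequality for Banach lattices (of the kind treated in [LT2], Section 1.d) and the normalization $\varphi_E(1) = 1$ fixed in the Introduction; the auxiliary symbols $u$, $v$ are introduced here for illustration.

```latex
% Assumed lattice inequality (valid in any Banach lattice of functions):
\bigl\|\,|u|^{1/2}\,|v|^{1/2}\bigr\|_E \;\le\; \|u\|_E^{1/2}\,\|v\|_E^{1/2}.

% Take u = x^2 and v = \chi_{(0,1)}, and use \|\chi_{(0,1)}\|_E = \varphi_E(1) = 1:
\|x\|_E = \bigl\|\,|x^2|^{1/2}\,\chi_{(0,1)}^{1/2}\bigr\|_E
        \;\le\; \|x^2\|_E^{1/2}\cdot 1 \;=\; \|x\|_{E(2)} .

% Applied to the element z of Lemma 1 and combined with (2):
\|z\|_E \;\le\; \|z^2\|_E^{1/2} \;\le\; \bigl(\varphi_E(1/n)\bigr)^{1/2}.
```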

The SSS property of the inclusion of $L_\infty$ into any r.i. space $E \ne L_\infty$ is also proved by another method in [FHR, Prop. 5.7]. Theorem 2 may be strengthened in the class of 2-convex spaces:

Theorem 3. Let $E$ be a 2-convex r.i. space. Then
$$\varphi_E(1/n) \le \gamma_n(L_\infty, E) \le C\,\varphi_E(1/n) \tag{3}$$
for some $C > 0$ and any $n \in \mathbb{N}$.

Proof. It is well known that for any 2-convex r.i. space $E$ there exists an r.i. space $F$ s.t. $E$ and $F(2)$ coincide up to equivalence of norms ([LT2], 1.d). Therefore it is sufficient to prove our statement for $E = F(2)$. Let $Q$ be an $n$-dimensional subspace of $L_\infty$. By Lemma 1 there exists $z \in Q$ s.t. $\|z\|_{L_\infty} = 1$ and
$$\|z\|_E = \|z^2\|_F^{1/2} \le \|\chi_{(0,\frac{1}{n})}\|_F^{1/2} = \|\chi^2_{(0,\frac{1}{n})}\|_F^{1/2} = \|\chi_{(0,\frac{1}{n})}\|_E = \varphi_E(1/n). \tag{4}$$
Now, if $\|\cdot\|_E \le \|\cdot\|_{F(2)} \le C\|\cdot\|_E$ then the constant $C$ in (3) coincides with the constant in this inequality. The left part of (3) was proved in Theorem 2. □

Note that if the norms of $E$ and $F(2)$ coincide, then (4) shows that the constant $C$ in (3) equals 1, i.e., $\gamma_n(L_\infty, E) = \varphi_E(1/n)$ for any $n \in \mathbb{N}$. This condition is satisfied for $E = L_p$, $2 \le p < \infty$. Hence
$$\gamma_n(L_\infty, L_p) = \varphi_{L_p}(1/n) = (1/n)^{1/p} \tag{5}$$
for any $n \in \mathbb{N}$ and $p \in [2,\infty)$. This statement was proved in [PS]. If $1 \le p < 2$, then
$$\gamma_n(L_\infty, L_p) \le \gamma_n(L_\infty, L_2) = \frac{1}{\sqrt{n}}.$$
Let $Q = \mathrm{span}\{r_k,\ 1 \le k \le n\}$ where $r_k(t) = \operatorname{sign}\sin 2^k\pi t$ is the Rademacher system. By the Khintchine inequality ([LT1], 2.b.3) we have
$$\gamma_n(L_\infty, L_1) \ge \inf_{\|\sum_{k=1}^n a_kr_k\|_{L_\infty} = 1} \Bigl\|\sum_{k=1}^n a_kr_k\Bigr\|_{L_1} \ge \frac{1}{\sqrt{2}} \inf_{\sum_{k=1}^n |a_k| = 1} \Bigl(\sum_{k=1}^n a_k^2\Bigr)^{1/2} = \frac{1}{\sqrt{2n}}.$$


So, for any $p \in [1,2)$ and $n \in \mathbb{N}$ we have
$$\frac{1}{\sqrt{2n}} \le \gamma_n(L_\infty, L_p) \le \frac{1}{\sqrt{n}}. \tag{6}$$
Inequalities (5) and (6) show that the estimates (1) are precise. Now we want to generalize the obtained results to the Lorentz $L_{p,q}$-spaces.

Proposition 4. Let $n$ be an integer.
1. $\gamma_n(L_\infty, L_{p,q}) \approx (1/n)^{1/p}$ if $2 < p < \infty$.
2. $\gamma_n(L_\infty, L_{p,q}) \approx (1/n)^{1/2}$ if $1 < p < 2$.
3. $\gamma_n(L_\infty, L_{2,q}) = (1/n)^{1/2}$ if $2 \le q \le \infty$.

Proof. 1. Since $\varphi_{L_{p,q}}(t) = t^{1/p}$ then, by Theorem 2, $(1/n)^{1/p} \le \gamma_n(L_\infty, L_{p,q})$. To obtain an upper estimate we use Lemma 1 once more. Let $Q$ be an $n$-dimensional subspace of $L_\infty$. There exists $z \in Q$ s.t. $\|z\|_{L_\infty} = 1$ and $\|z\|_{L_2} \le \frac{1}{\sqrt{n}}$. By ([BL], 5.3.1, 3.5.3)
$$\|z\|_{L_{p,q}} \le C_p\,\|z\|_{L_\infty}^{1-\frac{2}{p}}\,\|z\|_{L_2}^{\frac{2}{p}} \le C_p\bigl(n^{-\frac{1}{2}}\bigr)^{\frac{2}{p}} = C_p\,n^{-\frac{1}{p}}$$
for some constant $C_p > 0$ and any $n \in \mathbb{N}$. Hence $\gamma_n(L_\infty, L_{p,q}) \le C_p\,n^{-1/p}$.
2. Since $L_2 \subset L_{p,q} \subset L_1$ for any $p \in (1,2)$ ([LT2], 2.b.8) then the needed estimate follows from (6).
3. The lower estimate $n^{-1/2} \le \gamma_n(L_\infty, L_{2,q})$ follows from Theorem 2 and the equality $\varphi_{L_{2,q}}(t) = t^{1/2}$. Since $L_2 \subset L_{2,q}$ and $\|x\|_{L_{2,q}} \le \|x\|_{L_2}$ ([LT2], 2.b.9) we have, by (5), that $\gamma_n(L_\infty, L_{2,q}) \le \gamma_n(L_\infty, L_2) = n^{-1/2}$. Therefore $\gamma_n(L_\infty, L_{2,q}) = n^{-1/2}$ for any $q \in [2,\infty]$ and $n \in \mathbb{N}$. □
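The fundamental function $\varphi_{L_{p,q}}(t) = t^{1/p}$ used in this proof follows directly from the definition of the $L_{p,q}$ norm given in the Introduction. A minimal check, with $e$ denoting any measurable set of measure $s$:

```latex
% For x = \chi_e with mes e = s the decreasing rearrangement is x^* = \chi_{(0,s)}.
% Case 1 \le q < \infty:
\|\chi_e\|_{L_{p,q}} = \Bigl(\frac{q}{p}\int_0^s t^{q/p}\,\frac{dt}{t}\Bigr)^{1/q}
  = \Bigl(\frac{q}{p}\cdot\frac{p}{q}\,s^{q/p}\Bigr)^{1/q} = s^{1/p}.

% Case q = \infty:
\|\chi_e\|_{L_{p,\infty}} = \sup_{0<t<s} t^{1/p} = s^{1/p}.

% Exponent bookkeeping in part 1: with \|z\|_{L_\infty} = 1 and \|z\|_{L_2} \le n^{-1/2},
\|z\|_{L_\infty}^{1-2/p}\,\|z\|_{L_2}^{2/p} \;\le\; \bigl(n^{-1/2}\bigr)^{2/p} = n^{-1/p}.
```

The factor $q/p$ in the norm is what makes the fundamental function independent of $q$.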

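To find $\gamma_n(L_\infty, L_{2,q})$ for $q \in [1,2)$ is a more delicate problem. It has been partially solved.

Lemma 5. Let $m \in \mathbb{N}$ and $1 \le b \le m$. Then
$$\max_{|x_k|\le 1,\ \sum_{k=1}^m x_k^2 \le b}\ \sum_{k=1}^m \frac{x_k}{\sqrt{k}}$$
is attained on the sequence $x_k = \min\bigl(1, \varepsilon k^{-1/2}\bigr)$ where $\varepsilon$ is defined by the equation
$$\sum_{k=1}^m \min\bigl(1, \varepsilon^2/k\bigr) = b.$$

Lemma 6. Let $0 < \varepsilon, \lambda < 1$. If $\varepsilon^2\ln\dfrac{e}{\varepsilon^2} = \lambda$, then
$$\varepsilon\ln\frac{e}{\varepsilon} \le \sqrt{2\lambda}\,\ln^{1/2}\frac{e}{\sqrt{\lambda}}.$$

The proofs of Lemmas 5 and 6 are simple (so they are omitted).

Theorem 7. Given $n \in \mathbb{N}$,
$$n^{-1/2} \le \gamma_n(L_\infty, L_{2,1}) \le \Bigl(\frac{2+\ln n}{n}\Bigr)^{1/2}.$$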


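Proof. The left inequality follows from Theorem 2. By Lemma 1 any $n$-dimensional subspace $Q$ of $L_\infty$ contains an element $z \in Q$ s.t. $\|z\|_{L_\infty} = 1$ and $\|z\|_{L_2} \le 1/\sqrt{n}$. Therefore
$$\gamma_n(L_\infty, L_{2,1}) \le \sup_{\|x\|_{L_\infty}\le 1,\ \|x\|_{L_2}\le 1/\sqrt{n}} \|x\|_{L_{2,1}}$$
and
$$\gamma_n(L_\infty, L_{2,1}) \le \sup_m\ \max_{|x_k|\le 1,\ \sum_{k=1}^m x_k^2 \le \frac{m}{n}}\ \frac{1}{\sqrt{m}}\sum_{k=1}^m x_k\,k^{-\frac{1}{2}}.$$
Now, applying Lemma 5 we get
$$\gamma_n(L_\infty, L_{2,1}) \le \frac{1}{2}\int_0^1 \min\bigl(1, \varepsilon t^{-\frac{1}{2}}\bigr)\,t^{-\frac{1}{2}}\,dt = \varepsilon + \varepsilon\ln\frac{1}{\varepsilon} = \varepsilon\ln\frac{e}{\varepsilon}$$
where $\varepsilon$ is such that
$$\int_0^1 \bigl(\min\bigl(1, \varepsilon t^{-\frac{1}{2}}\bigr)\bigr)^2\,dt = \varepsilon^2\ln\frac{e}{\varepsilon^2} = \frac{1}{n}.$$
By Lemma 6 we deduce
$$\gamma_n(L_\infty, L_{2,1}) \le \sqrt{2}\,\frac{1}{\sqrt{n}}\,\ln^{\frac{1}{2}}\bigl(e\sqrt{n}\bigr) = \Bigl(\frac{2\bigl(1+\frac{1}{2}\ln n\bigr)}{n}\Bigr)^{\frac{1}{2}} = \Bigl(\frac{2+\ln n}{n}\Bigr)^{\frac{1}{2}}. \qquad\Box$$

Now for $1 \le q \le 2$, using that the function $s \mapsto \ln\|x\|_{L_{2,\frac{1}{s}}}$ is convex on [0, 1] we get the estimate
$$\gamma_n(L_\infty, L_{2,q}) \le n^{-\frac{1}{2}}(2+\ln n)^{1/q-1/2}$$
for every $n \in \mathbb{N}$. The lower estimate $n^{-\frac{1}{2}} \le \gamma_n(L_\infty, L_{2,q})$ follows from Theorem 2. Thus we have

Corollary 8. If $1 \le q \le 2$ and $n \in \mathbb{N}$, then
$$\frac{1}{n^{\frac{1}{2}}} \le \gamma_n(L_\infty, L_{2,q}) \le \frac{(2+\ln n)^{\frac{1}{q}-\frac{1}{2}}}{n^{\frac{1}{2}}}.$$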
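The integrals and exponents in the proof of Theorem 7 and in Corollary 8 can be verified by hand. The sketch below only evaluates the quantities stated there; the interpolation parameter $\theta$ is notation introduced here, and the last line assumes that both endpoint bounds are applied to the same element $z$ supplied by Lemma 1.

```latex
% Split [0,1] at t = \varepsilon^2, where \varepsilon t^{-1/2} = 1 (0 < \varepsilon < 1).
\frac12\int_0^1 \min\bigl(1,\varepsilon t^{-1/2}\bigr)\,t^{-1/2}\,dt
  = \frac12\int_0^{\varepsilon^2} t^{-1/2}\,dt + \frac{\varepsilon}{2}\int_{\varepsilon^2}^1 \frac{dt}{t}
  = \varepsilon + \varepsilon\ln\frac{1}{\varepsilon} = \varepsilon\ln\frac{e}{\varepsilon},

\int_0^1 \min\bigl(1,\varepsilon t^{-1/2}\bigr)^2\,dt
  = \varepsilon^2 + \varepsilon^2\int_{\varepsilon^2}^1\frac{dt}{t}
  = \varepsilon^2 + \varepsilon^2\ln\frac{1}{\varepsilon^2} = \varepsilon^2\ln\frac{e}{\varepsilon^2}.

% Lemma 6 with \lambda = 1/n:
\sqrt{2\lambda}\,\ln^{1/2}\frac{e}{\sqrt{\lambda}}
  = \sqrt{\frac{2}{n}}\,\Bigl(1+\tfrac12\ln n\Bigr)^{1/2} = \Bigl(\frac{2+\ln n}{n}\Bigr)^{1/2}.

% Corollary 8: write 1/q = \theta\cdot 1 + (1-\theta)\cdot\tfrac12, i.e. \theta = 2/q - 1.
% Log-convexity between the endpoints q = 1 (Theorem 7) and q = 2 (L_{2,2} = L_2) gives
\bigl(n^{-1/2}(2+\ln n)^{1/2}\bigr)^{\theta}\,\bigl(n^{-1/2}\bigr)^{1-\theta}
  = n^{-1/2}\,(2+\ln n)^{1/q-1/2}.
```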

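2. Strong inclusions

Let $E$, $F$ be a pair of r.i. spaces and $E \subset F$. The inclusion $E \subset F$ is called strong if
$$\lim_{\varepsilon\to 0}\ \sup_{\|x\|_E\le 1,\ \mathrm{mes}(\mathrm{supp}\,x)\le\varepsilon} \|x\|_F = 0.$$
Clearly any strong inclusion is DSS. If an inclusion $E \subset F$ is strong, then $\lim_{t\to 0}\dfrac{\varphi_F(t)}{\varphi_E(t)} = 0$. S.V. Astashkin ([A]) proved that the converse statement is false. More precisely, he constructed a pair of r.i. spaces $E$ and $F$ with $E \subset F$ s.t. $\lim_{t\to 0}\dfrac{\varphi_F(t)}{\varphi_E(t)} = 0$ and the inclusion $E \subset F$ is not DSS.

Proposition 9. Let $E$ be an r.i. space with $E \ne L_\infty, L_1$. Then the inclusions $L_\infty \subset E \subset L_1$ are strong.

Proof. Since $\lim_{t\to 0}\varphi_E(t) = 0$ for any r.i. space $E \ne L_\infty$, we have
$$\lim_{\varepsilon\to 0}\ \sup_{\|x\|_{L_\infty}\le 1,\ \mathrm{mes}(\mathrm{supp}\,x)\le\varepsilon} \|x\|_E \le \lim_{\varepsilon\to 0}\varphi_E(\varepsilon) = 0.$$
And, since $E \ne L_1$, so that $\lim_{t\to 0}\varphi_{E'}(t) = 0$, we have
$$\lim_{\varepsilon\to 0}\ \sup_{\|x\|_E\le 1,\ \mathrm{mes}(\mathrm{supp}\,x)\le\varepsilon} \|x\|_{L_1} \le \lim_{\varepsilon\to 0}\ \sup_{\|x\|_E\le 1,\ \mathrm{mes}(\mathrm{supp}\,x)\le\varepsilon} \|x\|_E\,\|\chi_{\mathrm{supp}\,x}\|_{E'} \le \lim_{\varepsilon\to 0}\varphi_{E'}(\varepsilon) = \lim_{\varepsilon\to 0}\frac{\varepsilon}{\varphi_E(\varepsilon)} = 0. \qquad\Box$$

Proposition 10. Let $E$, $F$ be a pair of r.i. spaces and assume that
$$\int_0^1 \Bigl(\frac{t}{\varphi_E(t)}\Bigr)'\varphi_F'(t)\,dt < \infty. \tag{7}$$
Then $E \subset F$ and this inclusion is strong.

Proof. It is known ([KPS], 2.5.5, 2.5.7) that $E \subset M(\bar{\varphi}_E)$ and $\Lambda(\varphi_F) \subset F$ where $\bar{\varphi}_E(t) = t/\varphi_E(t)$. Assumption (7) implies the inclusion $M(\bar{\varphi}_E) \subset \Lambda(\varphi_F)$ from which follows $E \subset F$ ([GHSS]). Let us show that (7) implies that this last inclusion is strong. Indeed, if $\|x\|_{M(\bar{\varphi}_E)} \le 1$ and $\mathrm{mes}(\mathrm{supp}\,x) \le \varepsilon$, then, by ([KPS], 2.2.36),
$$\|x\|_{\Lambda(\varphi_F)} = \int_0^1 x^*(t)\varphi_F'(t)\,dt \le \int_0^\varepsilon \Bigl(\frac{t}{\varphi_E(t)}\Bigr)'\varphi_F'(t)\,dt.$$
Clearly, (7) implies
$$\lim_{\varepsilon\to 0}\int_0^\varepsilon \Bigl(\frac{t}{\varphi_E(t)}\Bigr)'\varphi_F'(t)\,dt = 0.$$
Thus the inclusion $M(\bar{\varphi}_E) \subset \Lambda(\varphi_F)$ is strong, and hence so is $E \subset F$. □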
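As an illustration of condition (7), consider the hypothetical choice $E = L_p$, $F = L_r$ with $1 \le r < p < \infty$, which is not treated explicitly in the text. With $\varphi_{L_p}(t) = t^{1/p}$ the integral can be evaluated in closed form:

```latex
% t/\varphi_E(t) = t^{1-1/p} and \varphi_F(t) = t^{1/r}, so
\int_0^1\Bigl(\frac{t}{\varphi_E(t)}\Bigr)'\varphi_F'(t)\,dt
 = \Bigl(1-\frac1p\Bigr)\frac1r\int_0^1 t^{\,1/r-1/p-1}\,dt
 = \Bigl(1-\frac1p\Bigr)\frac1r\cdot\frac{1}{1/r-1/p}
 = \frac{p-1}{p-r} \;<\; \infty .
```

The integral converges precisely because $1/r - 1/p > 0$, so Proposition 10 yields that the inclusion $L_p \subset L_r$ is strong for $r < p$. For $p = r$ the exponent becomes $-1$ and the integral diverges.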

Denote by $\mathfrak{M}$ the set of all convex increasing functions $M$ on $[0,\infty)$ s.t. $M(0) = 0$, $\lim_{u\to\infty} M(u)/u = \infty$. Given an r.i. space $F$ and a function $M \in \mathfrak{M}$, denote by $F(M)$ the r.i. space endowed with the norm:
$$\|x\|_{F(M)} = \inf\Bigl\{\lambda > 0 : \Bigl\|M\Bigl(\frac{|x|}{\lambda}\Bigr)\Bigr\|_F \le 1\Bigr\}.$$
It is clear that $F(M)$ is an r.i. space. Note that $L_1(M)$ coincides with the Orlicz space $L_M$. We need some auxiliary results to give a characterization of strong inclusions.


Lemma 11. Let $u_n, v_n \ge 0$ for every $n \in \mathbb{N}$ and $\lim_{n\to\infty} u_n = \lim_{n\to\infty} v_n = \infty$. There exists a function $M \in \mathfrak{M}$ s.t. $M(u_n) \le u_nv_n$ for every $n \in \mathbb{N}$.

Proof. Without loss of generality we may assume that the sequences $\{u_n\}$, $\{v_n\}$ are strictly monotone and $u_1 = v_1 = 0$. The set $S = \{(u_n, u_nv_n),\ n \in \mathbb{N}\}$ uniquely defines a function $M$ on $[0,\infty)$ by: $M(x) = \inf\{y : (x,y) \in \mathrm{conv}\,S\}$. Clearly, $M(0) = 0$ and $M$ is convex. Note that $M$ is a piecewise linear function, and that the angular points of its graph form an infinite subset $\{(u_{n_k}, u_{n_k}v_{n_k}) : k \ge 1\}$ of $S$. Indeed let us indicate an algorithm defining the $n_k$'s. Assume that $1 = n_1 < n_2 < \cdots < n_k$ have been determined. Then since
$$\frac{u_nv_n - u_{n_k}v_{n_k}}{u_n - u_{n_k}} \ge v_n - \frac{u_{n_k}v_{n_k}}{u_n} \to +\infty$$
the infimum
$$\inf\Bigl\{\frac{u_nv_n - u_{n_k}v_{n_k}}{u_n - u_{n_k}} : n > n_k\Bigr\}$$
is attained, and the set $A_k$ of minimizers is finite. Then $n_{k+1} = \max A_k$. Let us show that $\lim_{u\to\infty} M(u)/u = \infty$. Since the function $M$ is convex and $M(0) = 0$, the function $M(x)/x$ is nondecreasing on $[0,+\infty)$. But
$$\sup_{x>0}\frac{M(x)}{x} \ge \sup_{k\ge 1}\frac{M(u_{n_k})}{u_{n_k}} = \sup_{k\ge 1} v_{n_k} = +\infty. \qquad\Box$$

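Lemma 12. Let $E$ be an r.i. space and $M \in \mathfrak{M}$. The inclusion $E(M) \subset E$ is strong iff $E \ne L_\infty$.

Proof. It is evident that $L_\infty(M) = L_\infty$ for any $M \in \mathfrak{M}$. This proves the first part of our statement. Suppose that $E \ne L_\infty$. Then $\lim_{t\to 0}\varphi_E(t) = 0$. If $\|x\|_{E(M)} < 1$, then $\|M(|x|)\|_E \le 1$. Given $n \ge 1$, consider the sets $p = \{t : t \in \mathrm{supp}\,x,\ |x(t)| \le n\}$ and $q = \{t : |x(t)| > n\}$. Since $M(|x(t)|) \ge n|x(t)|\chi_q(t)$ for every $t \in [0,1]$, then $\|x\chi_q\|_E \le \frac{1}{n}\|M(|x|)\|_E \le \frac{1}{n}$. If $\mathrm{mes}(\mathrm{supp}\,x) \le \varepsilon$, then
$$\|x\|_E \le \|x\chi_p\|_E + \|x\chi_q\|_E \le n\varphi_E(\varepsilon) + \frac{1}{n}.$$
If we take $n = (\varphi_E(\varepsilon))^{-\frac{1}{2}}$, then
$$\|x\|_E \le (\varphi_E(\varepsilon))^{-\frac{1}{2}}\varphi_E(\varepsilon) + (\varphi_E(\varepsilon))^{\frac{1}{2}} = 2(\varphi_E(\varepsilon))^{\frac{1}{2}}.$$
Hence
$$\lim_{\varepsilon\to 0}\ \sup_{\|x\|_{E(M)}\le 1,\ \mathrm{mes}(\mathrm{supp}\,x)\le\varepsilon} \|x\|_E = 0. \qquad\Box$$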

Theorem 13. Let $E$, $F$ be a pair of r.i. spaces with $E \subset F$. The inclusion $E \subset F$ is strong iff $E \subset F(M)$ for some $M \in \mathfrak{M}$.

Proof. The sufficiency follows immediately from Lemma 12. Let us prove the necessity. If the inclusion $E \subset F$ is strong, then there exists a sequence $\varepsilon_n \searrow 0$ s.t. $\|x\|_F \le 2^{-n}\|x\|_E$ for any $x \in E$ with $\mathrm{mes}(\mathrm{supp}\,x) < \varepsilon_n$. We can assume that $E \ne L_\infty$, so $\varphi_E(\varepsilon_n) \to 0$. Now, by Lemma 11 there exists a function $M \in \mathfrak{M}$ s.t.
$$M\Bigl(\frac{1}{\varphi_E(\varepsilon_n)}\Bigr) \le \frac{n}{\varphi_E(\varepsilon_n)}$$
for any $n \in \mathbb{N}$. Let us show that $E \subset F(M)$ where $M$ is the above constructed function. Let $x \in E$, $\|x\|_E = 1$. Consider the following sequence of functions
$$x_n(t) = \begin{cases} x^*(t), & \varepsilon_n < t \le \varepsilon_{n-1}, \\ 0, & \text{for other } t \in [0,1], \end{cases}$$
where $\varepsilon_0 = 1$. We have $1 = \|x^*\|_E \ge x^*(\varepsilon_n)\varphi_E(\varepsilon_n)$ for each $n \in \mathbb{N}$. Therefore $x^*(t) \le x^*(\varepsilon_n) \le (\varphi_E(\varepsilon_n))^{-1}$ for $t \in [\varepsilon_n, 1]$. The function $M(u)/u$ is monotone increasing. Hence
$$M(x_n(t)) = x_n(t)\,\frac{M(x_n(t))}{x_n(t)} \le x_n(t)\,\varphi_E(\varepsilon_n)\,M\Bigl(\frac{1}{\varphi_E(\varepsilon_n)}\Bigr)$$
for $t \in (\varepsilon_n, \varepsilon_{n-1}]$. By the construction of $M$, we have $\varphi_E(\varepsilon_n)\,M\Bigl(\dfrac{1}{\varphi_E(\varepsilon_n)}\Bigr) \le n$. Consequently $M(x_n(t)) \le nx_n(t)$ and
$$\sum_{n=1}^\infty \|M(x_n)\|_F \le \sum_{n=1}^\infty 2^{-n}\|M(x_n)\|_E \le \sum_{n=1}^\infty 2^{-n}n\|x_n\|_E \le \sum_{n=1}^\infty 2^{-n}n < \infty.$$
Hence the series $\sum_{n=1}^\infty M(x_n)$ converges in $F$. On the other hand, by the monotone convergence theorem it clearly converges in $L_1$ to $M(x^*)$, which thus has to be also its limit in $F$. Thus $(M(|x|))^* = M(x^*)$ belongs to $F$, and so does $M(|x|)$. Thus the inclusion $E \subset F(M)$ has been proved. □

3. SSS inclusions

The criterion for SS inclusion of an r.i. space into $L_1$ that was mentioned in the Introduction may be strengthened as follows.

Theorem 14. Let $E$ be an r.i. space that does not contain an isomorphic copy of $c_0$. If there exist $C > 0$ and a sequence of subspaces $Q_n \subset E$ with $\dim Q_n = n$, $n \in \mathbb{N}$, such that $\|x\|_E \le C\|x\|_{L_1}$ for any $x \in Q_n$, then $G \subset E$.

Proof. Since $E$ does not contain a copy of $c_0$ one can use a smooth ultraproduct argument. Indeed we may assume that $E'$ is not $L_\infty$, so $\varphi_{E'}(0+) = 0$. If $U$ is a free ultrafilter on $\mathbb{N}$, then in the ultrapower $E_U$ the band $B$ generated by $E$ consists


of elements $[x_n]_U$ defined by $E$-equi-integrable sequences $(x_n)$, while the complementary band $B^\perp$ consists of elements represented by bounded sequences $(x_n)$ with $\mathrm{mes}(\mathrm{supp}(x_n)) \to 0$. By the Hölder inequality, in the second case the sequence $(x_n)$ goes to zero in $L_1$. Call $i$ the natural inclusion $E \to L_1$. Let, as usual, its ultrapower map $i_U : E_U \to (L_1)_U$ be defined by $i_U([x_n]_U) = [i(x_n)]_U$. Then by the preceding $i_U$ vanishes on the complementary band $B^\perp$ of $B$ in $E_U$. It is clear that $i_U$ maps $B$ into the band generated by $L_1$ in its ultrapower (indeed $E$-equi-integrable sequences are a fortiori $L_1$-equi-integrable). This band $B_1$ can be identified with an $L_1$ of a big probability space $(S, \Sigma, \mu)$ and (since $E$ is order continuous) $B$ is identified with $E(S, \Sigma, \mu)$ (i.e., $f \in E(S, \Sigma, \mu)$ iff $f^* \in E$), and $i_U$ when restricted to $B$ is simply the inclusion $E(S, \Sigma, \mu) \to L_1(S, \Sigma, \mu)$. In particular if $E$ does not contain $c_0$, neither does the band $B$, and it results that $B$ is a projection band (this remark goes back to [W]).

If $(Q_n)$ is a sequence of subspaces of $E$ with $\dim Q_n = n$ and $\|x\|_E \le C\|x\|_1$ for any $n \ge 1$ and $x \in Q_n$, consider the ultraproduct $Q := \Pi_U Q_n$ which is an infinite-dimensional subspace of $E_U$ with $\|x\|_{E_U} \le C\|i_Ux\|_{(L_1)_U}$ for every $x \in Q$. Let $\pi$ be the band projection from $E_U$ onto $B$. Since $i_U = i_U\pi$ we have
$$\|x\|_{E_U} \le C\|i_U\pi x\|_{(L_1)_U} \le C\|x\|_{E_U}$$
for every $x \in Q$. Hence $\pi$ restricts to an isomorphism on $Q$, and in particular its range $\pi(Q)$ is an infinite-dimensional closed space. Moreover on this subspace the norms of $E(S, \Sigma, \mu)$ and that of $L_1(S, \Sigma, \mu)$ are $C$-equivalent since for $y = \pi x \in \pi(Q)$ we have
$$\|y\|_{E_U} \le \|x\|_{E_U} \le C\|i_U\pi x\|_{(L_1)_U} = C\|i_Uy\|_{(L_1)_U} \le C\|y\|_{E_U}.$$
Hence $G(S, \Sigma, \mu) \subset E(S, \Sigma, \mu)$ by Theorem 1 in [AHS]. Coming back to the measure space [0, 1] this means that $G \subset E$. □

Lemma 15. Let $E$ be an r.i. space and $z \in L_1 \setminus E''$. There exists a reflexive r.i. space $E_1$ s.t. $E \subset E_1$ and $z \notin E_1$.

Proof. We may suppose that $z = z^*$. We can find a function $u \in E'$ s.t. $u = u^*$ and
$$\int_0^1 z(t)u(t)\,dt = \infty. \tag{8}$$
And moreover we can find a function $v \in E'$ s.t. $v = v^*$, $\lim_{t\to 0} v(t)/u(t) = 0$ and
$$\int_0^1 z(t)v(t)\,dt = \infty.$$
Denote
$$\varphi(s) = \int_0^s u(t)\,dt, \qquad \psi(s) = \int_0^s v(t)\,dt.$$
Then $\varphi$ and $\psi$ are concave increasing functions, $\lim_{s\to 0}\psi(s)/\varphi(s) = 0$ and
$$M(\psi) \subset M(\varphi) \subset E'. \tag{9}$$


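Indeed, if $x \in M(\varphi)$, $\|x\|_{M(\varphi)} \le 1$, then
$$\int_0^s x^*(t)\,dt \le \varphi(s) = \int_0^s u(t)\,dt \tag{10}$$
for every $s \in [0,1]$. By [LT2], 2.a.8, $x \in E'$ and $\|x\|_{E'} \le \|u\|_{E'}$. It follows from (9) and the well-known formula $(M(\varphi))' = \Lambda(\varphi)$ that
$$E \subset \Lambda(\varphi) \subset \Lambda(\psi). \tag{11}$$
Let $x \in \Lambda(\varphi)$ and $\mathrm{mes}(\mathrm{supp}\,x) \le \varepsilon$ for some $\varepsilon > 0$. Then
$$\|x\|_{\Lambda(\psi)} = \int_0^1 x^*(t)\,d\psi(t) = \int_0^\varepsilon x^*(t)\varphi'(t)\,\frac{\psi'(t)}{\varphi'(t)}\,dt.$$
Therefore
$$\|x\|_{\Lambda(\psi)} \le \sup_{0<t\le\varepsilon}\frac{\psi'(t)}{\varphi'(t)}\,\|x\|_{\Lambda(\varphi)}.$$
Since
$$\lim_{t\to 0}\frac{\psi'(t)}{\varphi'(t)} = \lim_{t\to 0}\frac{v(t)}{u(t)} = 0$$
we have
$$\lim_{\varepsilon\to 0}\ \sup_{\|x\|_{\Lambda(\varphi)}\le 1,\ \mathrm{mes}(\mathrm{supp}\,x)\le\varepsilon} \|x\|_{\Lambda(\psi)} = 0$$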

and by [BVL] the embedding $\Lambda(\varphi) \subset \Lambda(\psi)$ is weakly compact. Using [B, Chap. II, §3], we conclude that the space of the real interpolation method ([BL, 3.1]) $E_1 := (\Lambda(\varphi), \Lambda(\psi))_{\theta,p}$ is reflexive for $0 < \theta < 1 < p < \infty$. By (8), (11) $z \notin E_1$ and $E \subset E_1$. □

Denote by $\mathfrak{R}$ the class of all r.i. spaces $E$ on [0, 1] for which there exists a sequence $\varepsilon_n \to 0$ s.t. for any subspace $Q \subset E$ with $\dim Q = n$ we can select $x \in Q$ s.t. $x^*(\varepsilon_n) < \varepsilon_n$ and $\|x\|_E \ge 1$. Clearly if $E_1$, $E_2$ are r.i. spaces with $E_1 \in \mathfrak{R}$ and $E_2 \subset E_1$ then $E_2 \in \mathfrak{R}$.

Theorem 16. Let $E$ be an r.i. space. The following conditions are equivalent:
i) $E \in \mathfrak{R}$.
ii) $G \not\subset E$.
iii) $\lim_{n\to\infty}\Bigl\|\min\Bigl(n, \ln^{1/2}\dfrac{1}{t}\Bigr)\Bigr\|_E = \infty$.

Proof. (i)⇒(ii). Suppose that $G \subset E$. By ([LT2], 2.b.4) there exists a constant $C > 0$ s.t.
$$\frac{1}{\sqrt{2}}\|c\|_{l_2} \le \Bigl\|\sum_{k=1}^n c_kr_k\Bigr\|_E \le C\|c\|_{l_2}$$
for any $n \in \mathbb{N}$ and $c \in \mathbb{R}^n$, where $\{r_k\}$ is the Rademacher system. Theorem 7 in [KS, ch. 2] states that
$$\mathrm{mes}\Bigl\{t : \Bigl|\sum_{k=1}^n c_kr_k(t)\Bigr| \ge \frac{1}{2}\|c\|_{l_2}\Bigr\} \ge \frac{1}{32}.$$
Therefore $E \notin \mathfrak{R}$.

(ii)⇒(iii). If
$$\lim_{n\to\infty}\Bigl\|\min\Bigl(n, \ln^{1/2}\frac{1}{t}\Bigr)\Bigr\|_E < \infty,$$
then the function $z(t) = \ln^{1/2}\dfrac{1}{t}$ belongs to $E''$. It is well known [L] that the Orlicz space $\exp L_2$ coincides up to equivalence with the Marcinkiewicz space $M(\varphi)$ where $\varphi(s) = \int_0^s \ln^{1/2}\dfrac{1}{t}\,dt$. If $x \in M(\varphi)$, then
$$\int_0^\tau x^*(t)\,dt \le \|x\|_{M(\varphi)}\int_0^\tau \ln^{1/2}\frac{1}{t}\,dt$$
for every $\tau \in [0,1]$. Hence $x \in E''$ and
$$\|x\|_{E''} \le \Bigl\|\ln^{1/2}\frac{1}{t}\Bigr\|_{E''}\|x\|_{M(\varphi)}.$$
Hence $\exp L_2 = M(\varphi) \subset E''$ and $G$ is contained in the closure of $L_\infty$ in $E''$, i.e., $G \subset E$.

(iii)⇒(i). By (iii) we have $\ln^{1/2}\dfrac{1}{t} \notin E''$. If $G \subset E$, then $\exp L_2 = G'' \subset E''$ and $\ln^{1/2}\dfrac{1}{t} \in E''$. The obtained contradiction shows that $G \not\subset E$ and $G \not\subset E''$. By Lemma 15 there exists a reflexive r.i. space $E_1$ s.t. $\ln^{1/2}\dfrac{1}{t} \notin E_1$ and $E \subset E_1$. Then $G \not\subset E_1$. Indeed, if $G \subset E_1$, then $\ln^{1/2}\dfrac{1}{t} \in G'' \subset E_1'' = E_1$. So, $G \not\subset E_1$. Since $E_1$ is reflexive, $E_1$ does not contain a subspace isomorphic to $c_0$. Suppose that $E_1 \notin \mathfrak{R}$. Then for some $\varepsilon > 0$ and any $n \in \mathbb{N}$ there exists a subspace $Q_n \subset E_1$, $\dim Q_n = n$, s.t. $x^*(\varepsilon) \ge \varepsilon$ for any $x \in Q_n$ with $\|x\|_{E_1} = 1$. For such $x \in Q_n$ we have
$$\|x\|_{L_1} \ge \int_0^\varepsilon x^*(t)\,dt \ge \varepsilon^2.$$
This means that
$$\|x\|_{E_1} \le \varepsilon^{-2}\|x\|_{L_1}$$
for any $x \in Q_n$. Since $E_1$ does not contain a subspace isomorphic to $c_0$, we can apply Theorem 14 and state that $G \subset E_1$. The obtained contradiction proves that $E_1 \in \mathfrak{R}$ and a fortiori $E \in \mathfrak{R}$. □

Theorem 17. Let $E$, $F$ be a pair of r.i. spaces. If $G \not\subset E$ and $E$ is strongly included into $F$, then the inclusion $E \subset F$ is SSS.


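Proof. Since $G \not\subset E$ then, by Theorem 16, $E \in \mathfrak{R}$, i.e., there exists a sequence $\varepsilon_n \to 0$ s.t. for any subspace $Q_n \subset E$, $\dim Q_n = n$, there exists $x_n \in Q_n$ for which $x_n^*(\varepsilon_n) < \varepsilon_n$ and $\|x_n\|_E = 1$. We have
$$\|x_n\|_F = \|x_n^*\|_F \le \|x_n^*\chi_{(0,\varepsilon_n)}\|_F + \|x_n^*\chi_{(\varepsilon_n,1)}\|_E \le \sup_{\|x\|_E = 1,\ \mathrm{mes}\,e\le\varepsilon_n} \|x\chi_e\|_F + \varepsilon_n.$$
Since $E$ is strongly included into $F$, the first term above tends to 0. Hence
$$\lim_{n\to\infty}\|x_n\|_F = 0. \qquad\Box$$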

As a direct consequence of Theorem 17 and Proposition 9 we have

Corollary 18. Let $E$ be an r.i. space. The inclusion $E \subset L_1$ is SSS iff $G \not\subset E$.

Corollary 19. Let $E$, $F$ be a pair of r.i. spaces, such that $G \not\subset E$ and the integral condition (7) in Prop. 10 is satisfied. Then the inclusion $E \subset F$ is SSS.

This statement immediately follows from Theorem 17 and Proposition 10. Now we apply Corollary 19 to exponential Orlicz spaces.

Corollary 20. Let $0 < p < q < \infty$. The following conditions are equivalent:
(i) the inclusion $\exp L_q \subset \exp L_p$ is SS;
(ii) the inclusion $\exp L_q \subset \exp L_p$ is SSS;
(iii) $q > 2$.

Proof. The equivalence (i)⇐⇒(iii) was proved in [HNS]. The implication (ii)⇒(i) is obvious. Therefore we must prove that the inclusion $\exp L_q \subset \exp L_r$ is SSS for $q > r > 2$. It is well known that up to equivalence
$$\varphi_{\exp L_p}(t) = \ln^{-\frac{1}{p}}\frac{e}{t}$$
for any $p > 0$. We have
$$0 < \int_0^1 \Bigl(\frac{t}{\ln^{-\frac{1}{q}}\frac{e}{t}}\Bigr)'\Bigl(\ln^{-\frac{1}{r}}\frac{e}{t}\Bigr)'dt = \int_0^1 \Bigl(\ln^{\frac{1}{q}}\frac{e}{t} - \frac{1}{q}\ln^{\frac{1}{q}-1}\frac{e}{t}\Bigr)\frac{1}{r}\ln^{-\frac{1}{r}-1}\frac{e}{t}\,\frac{dt}{t} < \frac{1}{r}\int_0^1 \ln^{\frac{1}{q}-\frac{1}{r}-1}\frac{e}{t}\,\frac{dt}{t} = \frac{1}{r}\int_1^\infty s^{\frac{1}{q}-\frac{1}{r}-1}\,ds = \frac{q}{q-r} < \infty.$$
By Corollary 19 the inclusion $\exp L_q \subset \exp L_r$ is SSS; so is also the inclusion $\exp L_q \subset \exp L_p$ by composition with the bounded inclusion $\exp L_r \subset \exp L_p$. □

Corollary 21. Let $\Lambda(\varphi)$, $\Lambda(\psi)$ be Lorentz spaces, $\varphi \le \psi$ and $G \not\subset \Lambda(\varphi)$. The following conditions are equivalent.
1) The inclusion $\Lambda(\psi) \subset \Lambda(\varphi)$ is DSS;
2) the inclusion $\Lambda(\psi) \subset \Lambda(\varphi)$ is SS;
3) the inclusion $\Lambda(\psi) \subset \Lambda(\varphi)$ is SSS;
4) the inclusion $\Lambda(\psi) \subset \Lambda(\varphi)$ is strong;
5) $\lim_{t\to 0}\dfrac{\varphi(t)}{\psi(t)} = 0$.


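Proof. (5)⇒(4). By Lemma 2.5.2 in [KPS]
\[
\sup_{\|x\|_{\Lambda(\psi)} \leqslant 1,\ \operatorname{mes}(\operatorname{supp} x) \leqslant \varepsilon} \|x\|_{\Lambda(\varphi)}
= \sup_{0 < \operatorname{mes} e \leqslant \varepsilon} \frac{\|\chi_e\|_{\Lambda(\varphi)}}{\|\chi_e\|_{\Lambda(\psi)}}
= \sup_{0 < t \leqslant \varepsilon} \frac{\varphi(t)}{\psi(t)} .
\]
Hence
\[
\lim_{\varepsilon\to 0}\ \sup_{\|x\|_{\Lambda(\psi)} \leqslant 1,\ \operatorname{mes}(\operatorname{supp} x) \leqslant \varepsilon} \|x\|_{\Lambda(\varphi)}
= \lim_{\varepsilon\to 0}\ \sup_{0 < t \leqslant \varepsilon} \frac{\varphi(t)}{\psi(t)} = 0 .
\]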

The implication (4)⇒(3) follows from Theorem 17. The implications (3)⇒(2)⇒(1) are obvious. The equivalence (1)⇔(5) was proved in [A]. □

Corollary 21 cannot be extended to the class of Orlicz spaces. Given $1 < p < \infty$, there exists an Orlicz space $L_M$ such that the inclusion $L_p \subset L_M$ is DSS but $\limsup_{t\to 0} \frac{\varphi_{L_M}(t)}{t^{1/p}} > 0$ (cf. [GHSS]).

Applying Theorems 13 and 17 we get

Corollary 22. Let $E$, $F$ be a pair of r.i. spaces with $E \subset F$. If $G \not\subset E$ and $E \subset F(M)$ for some $M \in \mathfrak{M}$, then the inclusion $E \subset F$ is SSS.

Using Corollary 22 we get another proof of Corollary 20. Now we present a simple necessary condition for SSS inclusions.

Proposition 23. Let $E$, $F$ be a pair of r.i. spaces with $E \subset F$ and $\varphi_E = \varphi_F$. Then the inclusion $E \subset F$ is not SSS.

Proof. Consider the following two cases:
\[
1)\ \liminf_{t\to 0} \frac{\varphi_E(2t)}{\varphi_E(t)} > 1, \qquad 2)\ \liminf_{t\to 0} \frac{\varphi_E(2t)}{\varphi_E(t)} = 1 .
\]
Let $\liminf_{t\to 0} \frac{\varphi_E(2t)}{\varphi_E(t)} > 2^{\gamma}$ for some $\gamma > 0$. Then $\varphi_E(t) \leqslant C t^{\gamma}$ for some $C > 0$ and sufficiently small $t > 0$. By Proposition 10, $E \supset L_p$ for $p \in (1/\gamma, \infty)$. The Khintchine inequality implies that the norms of $E$ and $F$ are equivalent on $[(r_n)]$, where $\{r_n\}$ is the Rademacher system. This means that the inclusion $E \subset F$ is not SS (and hence not SSS).

If $\liminf_{t\to 0} \frac{\varphi_E(2t)}{\varphi_E(t)} = 1$, then $\liminf_{t\to 0} \frac{\varphi_E(nt)}{\varphi_E(t)} = 1$ for any $n \in \mathbb{N}$ ([KPS], 1.1.3). Therefore there exists a sequence $t_n \downarrow 0$ such that $\varphi_E(n t_n) \leqslant 2\varphi_E(t_n)$ for any $n \in \mathbb{N}$. Let $x_k(t) = \chi_{((k-1)t_n,\, k t_n)}(t)$, $1 \leqslant k \leqslant n$. Then
\[
\max_{1\leqslant k\leqslant n} |c_k|\, \varphi_E(t_n) \leqslant \Bigl\|\sum_{k=1}^n c_k x_k\Bigr\|_E \leqslant \max_{1\leqslant k\leqslant n} |c_k|\, \Bigl\|\sum_{k=1}^n x_k\Bigr\|_E
= \max_{1\leqslant k\leqslant n} |c_k|\, \varphi_E(n t_n) \leqslant 2 \max_{1\leqslant k\leqslant n} |c_k|\, \varphi_E(t_n)
\]


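and analogously
\[
\max_{1\leqslant k\leqslant n} |c_k|\, \varphi_F(t_n) \leqslant \Bigl\|\sum_{k=1}^n c_k x_k\Bigr\|_F \leqslant 2 \max_{1\leqslant k\leqslant n} |c_k|\, \varphi_F(t_n).
\]
Since $\varphi_E = \varphi_F$ we have
\[
\gamma_n(E, F) \geqslant \frac{\max_{1\leqslant k\leqslant n} |c_k|\, \varphi_F(t_n)}{2 \max_{1\leqslant k\leqslant n} |c_k|\, \varphi_E(t_n)} = \frac{1}{2}
\]
for any $n \in \mathbb{N}$. This means that the inclusion is not SSS. □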

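In particular, the canonical inclusion $\Lambda(\varphi) \subset M(\tilde\varphi)$ is not SSS for any $\varphi \in \Omega$, where $\tilde\varphi(t) = \frac{t}{\varphi(t)}$. And, by Theorem 11 in [AHS], the inclusion $\Lambda(\varphi) \subset M(\tilde\varphi)$ is SS provided $G \not\subset \Lambda(\varphi)$ and $\varphi(+0) = 0$. So we have:

Corollary 24. Let $\varphi \in \Omega$ with $\varphi(+0) = 0$ and $\int_0^1 \ln^{1/2}(1/t)\, d\varphi(t) = \infty$. Then the inclusion $\Lambda(\varphi) \subset M(\tilde\varphi)$ is SS but not SSS.

For example, the functions $\varphi(t) = \ln^{\alpha}\frac{e}{t}$ satisfy the conditions of Corollary 24 if $-1/2 \leqslant \alpha < 0$ (the integral above diverges exactly when $\alpha \geqslant -1/2$, and $\varphi(+0) = 0$ requires $\alpha < 0$).

Concerning Theorem 17 it is clear, as we mentioned in the Introduction, that the assumption $G \not\subset E$ is necessary for its validity. For the class of Lorentz spaces Corollary 21 shows that being a strong inclusion is also a necessary condition. We do not know what happens in general.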

References

[A] Astashkin S.V., Disjointly strictly singular inclusions of symmetric spaces. Math. Notes 65(1) (1999), 3–12.
[AHS] Astashkin S.V., Hernández F.L. and Semenov E.M., Strictly singular inclusions of rearrangement invariant spaces and Rademacher spaces. Studia Math. 193(3) (2009), 269–283.
[B] Beauzamy B., Espaces d'interpolation réels: topologie et géométrie. LNM 666, Springer Verlag, 1978.
[BL] Bergh J., Löfström J., Interpolation spaces, an introduction. Springer Verlag, 1976.
[BVL] Bukhvalov A.V., Veksler A.I., Lozanovsky G.Ya., Banach lattices – some Banach aspects of their theory. Russian Math. Surveys 34 (1979), 159–213.
[C] Calkin J.W., Abstract symmetric boundary conditions. Trans. Amer. Math. Soc. 45(3) (1939), 369–442.
[CCT] Castejón A., Corbacho E. and Tarieladze V., AMD-numbers, compactness, strict singularity and the essential spectrum of operators. Georgian Math. J. 9(2) (2002), 227–270.


[FHR] Flores J., Hernández F.L., Raynaud Y., Super strictly singular and cosingular operators and related classes. J. Operator Theory 67 (2012) (to appear).
[GHSS] García del Amo A., Hernández F.L., Sánchez V.M., Semenov E.M., Disjointly strictly-singular inclusions between rearrangement invariant spaces. J. London Math. Soc. 62 (2000), 239–252.
[GMF] Gohberg I.C., Markus A.S., Feldman I.A., On normally solved operators and ideals related with them. Amer. Math. Soc. Transl. 61(2) (1967), 63–84.
[HNS] Hernández F.L., Novikov S.Y., Semenov E.M., Strictly singular embeddings between rearrangement invariant spaces. Positivity 7 (2003), 119–124.
[HR] Hernández F.L. and Rodríguez-Salinas B., On $l_p$-complemented copies in Orlicz spaces II. Israel J. of Math. 66 (1989), 27–55.
[K] Kato T., Perturbation theory for nullity, deficiency and other quantities of linear operators. J. Analyse Math. 6 (1958), 261–322.
[KPS] Krein S.G., Petunin Yu.I., Semenov E.M., Interpolation of linear operators. AMS, RI, 1982.
[KS] Kashin B.S., Saakyan A.A., Orthogonal series. Translations of Mathematical Monographs 75, American Mathematical Society, Providence, RI, 1989.
[L] Lorentz G.G., Relations between function spaces. Proc. AMS 12 (1961), 127–132.
[LT1] Lindenstrauss J., Tzafriri L., Classical Banach Spaces. I. Springer Verlag, 1977.
[LT2] Lindenstrauss J., Tzafriri L., Classical Banach Spaces. II. Springer Verlag, 1979.
[M] Milman V.D., Operators of class $C_0$ and $C_0^*$. Theory of Functions, Functional Analysis and Appl. 10, Kharkov (1970), 15–26 (Russian).
[MP] Mityagin B.S. and Pełczyński A., Nuclear operators and approximate dimension. Proc. Inter. Congr. Math. Moscow (1966), 366–372.
[N] Novikov S.Ya., Boundary spaces for inclusion maps between rearrangement invariant spaces. Collect. Math. 44 (1997), 211–215.
[P] Plichko A., Super strictly singular and super strictly cosingular operators, in: Functional Analysis and its Applications, North-Holland Math. St. 197, Elsevier, Amsterdam, 2004, 239–255.
[PI] Pinkus A., n-Widths in Approximation Theory. Springer Verlag, Berlin, 1985.
[PS] Parfenov O.G. and Slupko M.V., Bernstein widths of embeddings of Lebesgue spaces. J. of Math. Sciences 101(2) (2000), 3146–3148.
[RSH] Raynaud Y., Semenov E.M. and Hernández F.L., Super strictly singular inclusions between rearrangement invariant spaces. Doklady Mathematics 83 (2011), 216–218.
[S] Semenov E.M., Finitely strictly singular embeddings. Doklady Mathematics 81 (2010), 383–385.
[SSTT] Sari B., Schlumprecht T., Tomczak-Jaegermann N. and Troitsky V., On norm closed ideals in $L(l_p \oplus l_q)$. Studia Math. 179 (2007), 239–262.
[W] Weis L., Banach lattices with the subsequence splitting property. Proc. AMS 105(1) (1989), 87–96.


F.L. Hernández
Departamento de Análisis Matemático
Universidad Complutense de Madrid, E-28040 Madrid, Spain
e-mail: [email protected]

Y. Raynaud
Institut de Mathématiques de Jussieu
Site Jussieu (Case 247), UPMC-Univ. Paris06 and CNRS
F-75252 PARIS cedex 05, France
e-mail: [email protected]

E.M. Semenov
Department of Mathematics
Voronezh State University
Voronezh 394693, Russia
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 377–386
© 2012 Springer Basel AG

On Inversion of Certain Structured Linear Transformations Related to Block Toeplitz Matrices

M.A. Kaashoek and F. van Schagen

Dedicated to the memory of Israel Gohberg. We remember him as an outstanding mathematician, an inspiring teacher and a wonderful friend.

Abstract. This paper presents an explicit inversion formula for certain structured linear transformations that are closely related to ﬁnite block Toeplitz matrices. The conditions of invertibility are illustrated by an example. State space techniques from mathematical system theory play an important role. Mathematics Subject Classiﬁcation (2000). Primary 47B35; secondary 15A09, 93B99. Keywords. Structured operators, inversion, state space realization, ﬁnite block Toeplitz matrices, Gohberg-Heinig inversion formula.

1. Introduction

This paper is an addition to Section 2 of [4], where the Gohberg-Heinig formula (see [2]) for the inverse of a finite block Toeplitz matrix is derived using state space techniques from mathematical systems theory. The starting point in Section 2 of [4] is the fact that any finite $(n+1)\times(n+1)$ block Toeplitz matrix $T$ can be represented as
\[
T = \begin{bmatrix}
I - CA^{n}B & -CA^{n-1}B & \cdots & -CB \\
-CA^{n+1}B & I - CA^{n}B & \cdots & -CAB \\
\vdots & \vdots & \ddots & \vdots \\
-CA^{2n}B & -CA^{2n-1}B & \cdots & I - CA^{n}B
\end{bmatrix}, \tag{1.1}
\]
where $A : \mathcal{X} \to \mathcal{X}$, $B : \mathcal{U} \to \mathcal{X}$, and $C : \mathcal{X} \to \mathcal{U}$ are operators (linear transformations) acting between complex linear spaces and $I$ is the identity operator


on $\mathcal{U}$. The representation (1.1) allows one to study inversion of $T$ in terms of the operator
\[
M = E - \sum_{k=0}^{n} A^{n-k} B C A^{k} : \mathcal{X} \to \mathcal{X}, \tag{1.2}
\]
where $E$ is the identity operator on $\mathcal{X}$. To see this note that $T = D - FG$ and $M = E - GF$, where $D$ is the $(n+1)\times(n+1)$ block diagonal matrix with $I$ as diagonal entries and
\[
F = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n} \end{bmatrix}, \qquad
G = \begin{bmatrix} A^{n}B & A^{n-1}B & \cdots & AB & B \end{bmatrix}. \tag{1.3}
\]
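The relation between $T = D - FG$ and $M = E - GF$ can be made concrete in the simplest setting. The following is a minimal sketch, assuming $\mathcal{U} = \mathcal{X} = \mathbb{C}$ so that $A$, $B$, $C$ are scalars; the helper names `det` and `toeplitz_and_m` and the sample values are illustrative choices, not taken from the paper. In finite dimensions $\det(D - FG) = \det(E - GF)$ (Sylvester's determinant identity), which the sketch checks with exact rational arithmetic.

```python
from fractions import Fraction

def det(matrix):
    """Determinant via exact Gaussian elimination on a list of rows."""
    m = [row[:] for row in matrix]
    size, d = len(m), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            d = -d
        d *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return d

def toeplitz_and_m(a, b, c, n):
    """Scalar instance of (1.1)-(1.3): returns the (n+1)x(n+1) matrix T and the number M."""
    f_col = [c * a**i for i in range(n + 1)]        # F = col(C A^i), i = 0..n
    g_row = [a**(n - j) * b for j in range(n + 1)]  # G = row(A^(n-j) B), j = 0..n
    t = [[(1 if i == j else 0) - f_col[i] * g_row[j] for j in range(n + 1)]
         for i in range(n + 1)]                     # T = D - F G
    m = 1 - sum(g * f for g, f in zip(g_row, f_col))  # M = E - G F
    return t, m

# Illustrative values only.
a, b, c, n = Fraction(1, 2), Fraction(3), Fraction(-2, 5), 3
T, M = toeplitz_and_m(a, b, c, n)
print(det(T), M)
```

Each entry of `T` depends only on the difference of its indices, so `T` is Toeplitz, and in this scalar case it is invertible exactly when the number `M` is nonzero.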

Assuming $T$ to be invertible, this connection between $T$ and $M$ is used in [4] to give a new proof of the Gohberg-Heinig formula for the inverse of $T$. In the present paper we present necessary and sufficient conditions for $M$ to be invertible and we derive a formula for the inverse of $M$ (which was not done in [4]). To do this the four equations in the Gohberg-Heinig theorem are replaced by the equations
\[
MK = A^{n}B, \qquad ML = B, \qquad RM = CA^{n}, \qquad QM = C, \tag{1.4}
\]

where the operators $K$ and $L$ from $\mathcal{U}$ into $\mathcal{X}$ and the operators $R$ and $Q$ from $\mathcal{X}$ into $\mathcal{U}$ are the unknowns. The following theorem is our main result.

Theorem 1.1. Assume there exist operators $K$ and $L$ from $\mathcal{U}$ into $\mathcal{X}$ and operators $R$ and $Q$ from $\mathcal{X}$ into $\mathcal{U}$ satisfying the equations in (1.4). If, in addition, one of the following conditions is satisfied
1. $I + CK$ is invertible,
2. $I + RB$ is invertible,
3. $I + CK$ is surjective and $I + RB$ is injective,
4. $I + RB$ is surjective and $I + CK$ is injective,
then $M$ is invertible and both $I + CK$ and $I + RB$ are invertible. Moreover, in that case
\[
M^{-1} = E + \sum_{k=0}^{n} A^{n-k} B \,(Q A^{k} + C H_{k}), \tag{1.5}
\]
where the linear transformations $H_k$ are defined recursively by
\[
H_0 = 0, \qquad H_1 = AK(I + CK)^{-1}Q - L(I + RB)^{-1}RA, \qquad H_j = AH_{j-1} + (H_1 A^{j-2})A \quad (j = 2, \ldots, n).
\]

We shall give a self-contained proof of the above theorem, not using the connection between $M$ and $T$. For other recent developments related to the Gohberg-Heinig inversion formula we refer to the extended introduction of [6] and the references given therein.
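Formula (1.5) and the recursion for $H_k$ can be tried out on small matrices. The sketch below assumes $\mathcal{X} = \mathbb{C}^2$, $\mathcal{U} = \mathbb{C}$ and $n = 2$; the sample operators and all helper names are hypothetical. The solutions of (1.4) are obtained here from a direct inverse of $M$, purely so that there is something to compare the formula against, so this illustrates the statement rather than providing an independent inversion method.

```python
from fractions import Fraction

def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]

def power(a, k):
    out = eye(len(a))
    for _ in range(k):
        out = mul(out, a)
    return out

def inv(a):
    """Gauss-Jordan inverse; raises StopIteration if the matrix is singular."""
    n = len(a)
    aug = [row[:] + unit for row, unit in zip(a, eye(n))]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def inverse_via_theorem(A, B, C, n):
    E, I = eye(len(A)), eye(len(C))
    M = E
    for k in range(n + 1):                                   # (1.2)
        M = sub(M, mul(mul(power(A, n - k), B), mul(C, power(A, k))))
    Minv = inv(M)                                            # used only to solve (1.4)
    K, L = mul(Minv, mul(power(A, n), B)), mul(Minv, B)
    R, Q = mul(mul(C, power(A, n)), Minv), mul(C, Minv)
    H1 = sub(mul(mul(mul(A, K), inv(add(I, mul(C, K)))), Q),
             mul(mul(mul(L, inv(add(I, mul(R, B)))), R), A))
    H = [[[Fraction(0)] * len(A) for _ in A], H1]            # H_0 = 0, H_1
    for j in range(2, n + 1):                                # H_j = A H_{j-1} + H_1 A^{j-1}
        H.append(add(mul(A, H[j - 1]), mul(H1, power(A, j - 1))))
    result = E
    for k in range(n + 1):                                   # (1.5)
        inner = add(mul(Q, power(A, k)), mul(C, H[k]))
        result = add(result, mul(mul(power(A, n - k), B), inner))
    return M, result

# Illustrative data: A swaps the two basis vectors, so A^2 = E.
A = [[Fraction(0), Fraction(1)], [Fraction(1), Fraction(0)]]
B = [[Fraction(1)], [Fraction(0)]]
C = [[Fraction(1, 4), Fraction(0)]]
M, M_inverse = inverse_via_theorem(A, B, C, 2)
print(M_inverse)
```

For these sample values $M$ is diagonal with entries $1/2$ and $3/4$, and the right-hand side of (1.5) reproduces its inverse.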


The paper consists of two sections not counting the present introduction. The proof of Theorem 1.1 is given in Section 2. When 𝒰 or 𝒳 is ﬁnite dimensional, then injectivity of 𝑀 implies surjectivity of 𝑀 and vice versa. As one may expect this property does not hold when both 𝒰 and 𝒳 are inﬁnite dimensional, not even when the four equations in (1.4) are solvable. In Section 3 we present an example to illustrate this fact. In this ﬁnal section we also present a corollary to Theorem 1.1 and discuss a few special cases.

2. Proof of the main result

It will be convenient first to state and prove a lemma that covers part of Theorem 1.1.

Lemma 2.1. Assume there exist operators $K$ and $L$ from $\mathcal{U}$ into $\mathcal{X}$ and operators $R$ and $Q$ from $\mathcal{X}$ into $\mathcal{U}$ satisfying the equations in (1.4). Then the following two statements hold true:
1. if $I + RB$ or $I + CK$ is injective, then $M$ is injective,
2. if $I + RB$ or $I + CK$ is surjective, then $M$ is surjective.
Moreover, if $I + CK$ or $I + RB$ is invertible, then $I + CK$, $I + RB$ and $M$ are invertible.

Proof. The proof of the lemma will be divided into four parts. In the first two parts we prove the first statement. The second statement is proved in the third part. The proof of the final statement is given in the last part. Throughout $\Omega$ is the operator on $\mathcal{X}$ defined by $\Omega = \sum_{\nu=0}^{n-1} A^{n-1-\nu} B C A^{\nu}$. Note that
\[
M + BCA^{n} = E - A\Omega, \qquad M + A^{n}BC = E - \Omega A.
\]
Hence the following intertwining relations hold true:
\[
(M + BCA^{n})A = A(M + A^{n}BC), \qquad \Omega(M + BCA^{n}) = (M + A^{n}BC)\Omega. \tag{2.1}
\]
Furthermore, we shall use that $M + BCA^{n}$ is invertible if and only if $M + A^{n}BC$ is invertible.

Part 1. We assume $I + RB$ is injective and prove that $M$ is injective. Note that $I + RB$ is injective if and only if $E + BR$ is injective. Take $x \in \operatorname{Ker} M$, that is, $Mx = 0$. Then $Cx = QMx = 0$, and we see that $(E - \Omega A)x = (M + A^{n}BC)x = 0$. So $x = \Omega A x$ and $(E - A\Omega)Ax = A(E - \Omega A)x = 0$. Since $M + BCA^{n} = (E + BR)M$, we have $E - A\Omega = (E + BR)M$. Thus $(E + BR)MAx = (E - A\Omega)Ax = 0$. Now use that $E + BR$ is injective. It follows that $MAx = 0$. We conclude that $Mx = 0$ implies that $MAx = 0$. By induction we obtain that $MA^{k}x = 0$ for $k = 0, 1, 2, \ldots$. In particular, using that the fourth equation in (1.4) is solvable, we get $CA^{k}x = 0$ for $k = 0, 1, 2, \ldots$. Since $x = \Omega A x$, we get that $x = \sum_{k=1}^{n} A^{n-k} B C A^{k} x = 0$, and hence that $M$ is injective.

Part 2. Next we assume $I + CK$ is injective, and we prove that $M$ is injective. Note that $I + CK = I + QMK = I + QA^{n}B$ is injective if and only if $E + A^{n}BQ$ is injective. As in the previous part, we assume that $x \in \operatorname{Ker} M$. Then we have


that $CA^{n}x = RMx = 0$. Hence, $(E - A\Omega)x = (M + BCA^{n})x = 0$. We see that $x = A\Omega x$. Next we show that $\Omega x \in \operatorname{Ker} M$. From $M + A^{n}BC = (E + A^{n}BQ)M$ we see
\[
(E + A^{n}BQ)M\Omega x = (E - \Omega A)\Omega x = \Omega(E - A\Omega)x = 0.
\]
Since $E + A^{n}BQ$ is injective, we indeed have $M\Omega x = 0$. It follows that
\[
CA^{n-1}x = CA^{n-1}A\Omega x = CA^{n}\Omega x = RM\Omega x = 0.
\]
Replacing $x$ by $\Omega x$, we conclude that $CA^{n-1}\Omega x = 0$. Again use that $x = A\Omega x$ to conclude that $CA^{n-2}x = 0$. Proceeding in this way we get $CA^{k}x = 0$ for $k = 0, 1, 2, \ldots$. As we have seen in the previous part, this yields $x = 0$. Thus $M$ is injective.

Part 3. To prove the second statement, we assume $I + CK$ or $I + RB$ is surjective, and we prove that $M$ is surjective. To do this we apply the results of the first statement to the algebraic dual $M^{\#}$ of $M$. From (1.2) we see that
\[
M^{\#} = E - \sum_{\nu=0}^{n} (A^{\#})^{\nu} C^{\#} B^{\#} (A^{\#})^{n-\nu} = E - \sum_{\nu=0}^{n} (A^{\#})^{n-\nu} C^{\#} B^{\#} (A^{\#})^{\nu}.
\]
Furthermore, the equations in (1.4) yield
\[
K^{\#}M^{\#} = B^{\#}(A^{\#})^{n}, \qquad L^{\#}M^{\#} = B^{\#}, \qquad M^{\#}R^{\#} = (A^{\#})^{n}C^{\#}, \qquad M^{\#}Q^{\#} = C^{\#}.
\]

Our hypotheses imply that $I + K^{\#}C^{\#}$ or $I + B^{\#}R^{\#}$ is injective. But then we can apply the first statement of this lemma with $M^{\#}$ in place of $M$, with $A^{\#}$ in place of $A$, with $B^{\#}$ in place of $C$, with $K^{\#}$ in place of $R$, and with $L^{\#}$ in place of $Q$. It follows that $M^{\#}$ is injective, which is equivalent to $M$ being surjective.

Part 4. Assume $I + CK$ or $I + RB$ is invertible. Then we know from the first and second statement that $M$ is invertible. The identity $M + A^{n}BC = M(E + KC)$ shows that the invertibility of $M$ and $I + CK$ yield that $M + A^{n}BC$ is invertible. Similarly, using $M + BCA^{n} = (E + BR)M$, we see that if $M$ and $I + RB$ are invertible, then also $M + BCA^{n}$ is invertible. Here we use that $I + CK$ (or $I + RB$) is invertible if and only if $E + KC$ (or $E + BR$) is invertible. Recall that $M + A^{n}BC$ is invertible if and only if $M + BCA^{n}$ is invertible. Thus our hypotheses imply that $M$, $M + A^{n}BC$ and $M + BCA^{n}$ are all invertible. But then we see from $M + BCA^{n} = (E + BR)M$ and $M + A^{n}BC = M(E + KC)$ that both $E + BR$ and $E + KC$ are invertible. The latter is equivalent to $I + CK$ and $I + RB$ being invertible. □

Completing the proof of Theorem 1.1. Given Lemma 2.1 it remains to prove the final statement of the theorem, that is, assuming $M$, $I + CK$ and $I + RB$ are invertible, we have to derive the formula for $M^{-1}$ in (1.5). From (1.2) it is clear that
\[
M^{-1} = E + \sum_{k=0}^{n} A^{n-k} B C A^{k} M^{-1}. \tag{2.2}
\]
In this formula we want to replace $CA^{k}M^{-1}$ for $k = 1, \ldots, n$.


From the first and third identity in (1.4) we see that
\[
M + A^{n}BC = M(E + KC), \qquad M + BCA^{n} = (E + BR)M.
\]
Using the first identity in (2.1) the two previous formulas yield
\[
(E + BR)MA = (M + BCA^{n})A = A(M + A^{n}BC) = AM(E + KC). \tag{2.3}
\]
Since $I + CK$ and $I + RB$ are invertible, the same holds true for $E + KC$ and $E + BR$. Thus we can multiply (2.3) from the left by $M^{-1}(E + BR)^{-1}$ and from the right by $(E + KC)^{-1}M^{-1}$. We obtain
\[
A(E + KC)^{-1}M^{-1} = M^{-1}(E + BR)^{-1}A.
\]
Thus $0 = -A(E + KC)^{-1}M^{-1} + M^{-1}(E + BR)^{-1}A$. By adding $AM^{-1} - M^{-1}A$ to both sides of this equality we get
\[
AM^{-1} - M^{-1}A = A\bigl(E - (E + KC)^{-1}\bigr)M^{-1} - M^{-1}\bigl(E - (E + BR)^{-1}\bigr)A,
\]
and therefore, since $E - (E + KC)^{-1} = K(I + CK)^{-1}C$ and $E - (E + BR)^{-1} = B(I + RB)^{-1}R$,
\[
AM^{-1} - M^{-1}A = AK(I + CK)^{-1}CM^{-1} - M^{-1}B(I + RB)^{-1}RA.
\]
Now use the definitions of $L$, $Q$ and $H_1$ to get the identity
\[
AM^{-1} - M^{-1}A = AK(I + CK)^{-1}Q - L(I + RB)^{-1}RA = H_1.
\]
We will generalize this by induction to $A^{k}M^{-1} - M^{-1}A^{k} = H_k$, $k = 1, \ldots, n$, as follows:
\[
\begin{aligned}
A^{k}M^{-1} - M^{-1}A^{k} &= A(A^{k-1}M^{-1}) - M^{-1}A^{k} \\
&= A(M^{-1}A^{k-1} + H_{k-1}) - M^{-1}A^{k} \\
&= (AM^{-1})A^{k-1} + AH_{k-1} - M^{-1}A^{k} \\
&= (M^{-1}A + H_1)A^{k-1} + AH_{k-1} - M^{-1}A^{k} = H_k.
\end{aligned}
\]
Since $CM^{-1} = Q$, we proved that $CA^{k}M^{-1} = QA^{k} + CH_k$. Inserting this in (2.2) completes the proof of Theorem 1.1. □

3. Comments and an example

Theorem 1.1 has the following corollary.

Corollary 3.1. Assume there exist operators $K$ and $L$ from $\mathcal{U}$ into $\mathcal{X}$ and operators $R$ and $Q$ from $\mathcal{X}$ into $\mathcal{U}$ satisfying the equations in (1.4), and let $I + CK$ or $I + RB$ be invertible. Then $M$, $I + CK$ and $I + RB$ are invertible, and $M^{-1}$ is given by
\[
\begin{aligned}
M^{-1} = E &+ \sum_{k=0}^{n} A^{n-k} B Q A^{k}
+ \sum_{k=1}^{n} A^{n-k} B \Bigl( \sum_{j=0}^{k-1} C A^{k-j} K (I + CK)^{-1} Q A^{j} \Bigr) \\
&- \sum_{k=1}^{n} A^{n-k} B \Bigl( \sum_{j=0}^{k-1} C A^{k-1-j} L (I + RB)^{-1} R A^{j+1} \Bigr).
\end{aligned}
\]

Proof. By induction one shows that $H_k = \sum_{j=0}^{k-1} A^{k-1-j} H_1 A^{j}$ for $k = 1, \ldots, n$, where $H_1 = AK(I + CK)^{-1}Q - L(I + RB)^{-1}RA$. Using this in (1.5) yields the desired formula for $M^{-1}$. □

By applying the above corollary to the algebraic dual of $M$ one sees that the inverse of $M$ is also given by
\[
\begin{aligned}
M^{-1} = E &+ \sum_{k=0}^{n} A^{k} L C A^{n-k}
+ \sum_{k=1}^{n} \Bigl( \sum_{j=0}^{k-1} A^{j} L (I + RB)^{-1} R A^{k-j} B \Bigr) C A^{n-k} \\
&- \sum_{k=1}^{n} \Bigl( \sum_{j=0}^{k-1} A^{j+1} K (I + CK)^{-1} Q A^{k-1-j} B \Bigr) C A^{n-k}.
\end{aligned}
\]

For 𝑛 = 1, solvability of the four equations in (1.4) directly implies that 𝑀 is invertible without any further conditions on 𝐼 + 𝐶𝐾 or 𝐼 + 𝑅𝐵. Indeed, when 𝑛 = 1 we have 𝑀 = 𝐸 − 𝐴𝐵𝐶 − 𝐵𝐶𝐴 = 𝐸 − 𝑀 𝐾𝑄𝑀 − 𝑀 𝐿𝑅𝑀. Hence 𝑀 (𝐸 + 𝐾𝑄𝑀 + 𝐿𝑅𝑀 ) = 𝐸 and (𝐸 + 𝑀 𝐾𝑄 + 𝑀 𝐿𝑅)𝑀 = 𝐸, which proves that 𝑀 is invertible. In the proof of Theorem 1.1 injectivity and surjectivity of 𝑀 are established separately. If 𝒳 or 𝒰 is ﬁnite dimensional, then the operator 𝑀 is a ﬁnite rank perturbation of the identity operator on 𝒳 . For such an operator 𝑀 one has that dim Ker 𝑀 = codim Im 𝑀 , and hence 𝑀 is injective if and only if 𝑀 is surjective. However, in general, even when the four equations in (1.4) are solvable, injectivity of 𝑀 is not equivalent to surjectivity of 𝑀 . In fact, this already happens for 𝑛 = 2, as the following example shows. Note that the case 𝑛 = 1 has to be excluded because of the result mentioned in the preceding paragraph. Example. Take 𝑛 = 2, and put 𝒰 = ℓ2+ and 𝒳 = ℂ2 ⊕ ℓ2+ . As before the identity operators on 𝒰 and 𝒳 are denoted by 𝐼 and 𝐸, respectively. Thus 𝐼 denotes the identity on ℓ2+ and 𝐸 stands for the identity on ℂ2 ⊕ ℓ2+ . The identity operator


on $\mathbb{C}^2$ will be denoted by $I_2$. In general, 0 denotes a zero operator. The set $\{e_1, e_2\}$ denotes the standard basis of $\mathbb{C}^2$, and $\{f_1, f_2, \ldots\}$ is the standard basis of $\ell^2_+$. The forward shift on $\ell^2_+$ is denoted by $S$; thus $Sf_k = f_{k+1}$ for $k = 1, 2, \ldots$. Note that the adjoint operator $S^*$ of $S$ is the backward shift on $\ell^2_+$, that is, $S^*f_1 = 0$ and $S^*f_{k+1} = f_k$ for $k = 1, 2, \ldots$. We define operators $A$, $B$, and $C$ as follows:
\[
A = \begin{bmatrix} A_{11} & 0 \\ 0 & I \end{bmatrix} : \mathbb{C}^2 \oplus \ell^2_+ \to \mathbb{C}^2 \oplus \ell^2_+, \qquad
B = \begin{bmatrix} B_1 \\ S^* \end{bmatrix} : \ell^2_+ \to \mathbb{C}^2 \oplus \ell^2_+, \qquad
C = \begin{bmatrix} C_1 & S^2 \end{bmatrix} : \mathbb{C}^2 \oplus \ell^2_+ \to \ell^2_+ .
\]
Here $A_{11}$ is the operator on $\mathbb{C}^2$ defined by $A_{11}e_1 = e_2$ and $A_{11}e_2 = e_1$, and $B_1$ is the operator from $\ell^2_+$ to $\mathbb{C}^2$ given by $B_1f_1 = e_2$ and $B_1f_k = 0$ for $k = 2, 3, \ldots$. Furthermore, $C_1$ is the operator from $\mathbb{C}^2$ to $\ell^2_+$ defined by $C_1e_1 = f_2$ and $C_1e_2 = f_1$. Finally, we set
\[
M = E - A^2BC - ABCA - BCA^2. \tag{3.1}
\]
Since $A_{11}^2 = I_2$, we have $A^2 = E$, and hence $M = E - 2BC - ABCA$. Next we write $M$ as a $2 \times 2$ operator matrix relative to the direct sum decomposition $\mathbb{C}^2 \oplus \ell^2_+$:
\[
M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}
= \begin{bmatrix} I_2 - 2B_1C_1 - A_{11}B_1C_1A_{11} & -2B_1S^2 - A_{11}B_1S^2 \\ -2S^*C_1 - S^*C_1A_{11} & I - 2S^*S^2 - S^*S^2 \end{bmatrix}.
\]
One computes that $M_{11}e_1 = 0$ and $M_{11}e_2 = -e_2$. Since $B_1S^2 = 0$, we have $M_{12} = 0$. The action of $M_{21}$ is given by $M_{21}e_1 = -2f_1$ and $M_{21}e_2 = -f_1$. Finally, since $S^*S$ is the identity on $\ell^2_+$, we see that $M_{22} = I - 3S^*S^2 = I - 3S$. Now remark that $\operatorname{Im} M \subset \operatorname{span}\{e_2\} \oplus \ell^2_+$. Thus $e_1 \notin \operatorname{Im} M$, and hence $M$ is not surjective.

We shall show that $M$ is injective and that the four equations (1.4) do have solutions. Note that the vectors $e_1, e_2, f_1, f_2, f_3, \ldots$ form an orthogonal basis of the Hilbert space $\mathbb{C}^2 \oplus \ell^2_+$. We define $V$ to be the forward shift operator on $\mathbb{C}^2 \oplus \ell^2_+$ with respect to this basis. Thus the action of $V$ is given by
\[
Ve_1 = e_2, \qquad Ve_2 = f_1, \qquad Vf_j = f_{j+1} \quad (j = 1, 2, \ldots).
\]
Note that $\operatorname{Im} V$ is equal to $\operatorname{span}\{e_2\} \oplus \ell^2_+$, and hence $\operatorname{Im} M$ is contained in $\operatorname{Im} V$. The adjoint of $V$ is the backward shift on $\mathbb{C}^2 \oplus \ell^2_+$ relative to the basis $e_1, e_2, f_1, f_2, f_3, \ldots$. Thus
\[
V^*e_1 = 0, \qquad V^*e_2 = e_1, \qquad V^*f_1 = e_2, \qquad V^*f_j = f_{j-1} \quad (j = 2, 3, \ldots).
\]


Put $N = V^*M$. We claim that $N$ is invertible. To see this we first note that
\[
\begin{aligned}
V^*Me_1 &= V^*M_{11}e_1 + V^*M_{21}e_1 = V^*M_{21}e_1 = -2V^*f_1 = -2e_2, \\
V^*Me_2 &= V^*M_{11}e_2 + V^*M_{21}e_2 = -V^*e_2 - V^*f_1 = -e_1 - e_2, \\
V^*Mf_1 &= V^*M_{12}f_1 + V^*M_{22}f_1 = V^*M_{22}f_1 = V^*(f_1 - 3f_2) = e_2 - 3f_1, \\
V^*Mf_j &= V^*M_{22}f_j = V^*(f_j - 3f_{j+1}) = f_{j-1} - 3f_j \quad (j = 2, 3, \ldots).
\end{aligned}
\]
Summarizing we have
\[
Ne_1 = -2e_2, \qquad Ne_2 = -e_1 - e_2, \tag{3.2}
\]
\[
Nf_1 = e_2 - 3f_1, \qquad Nf_{j+1} = f_j - 3f_{j+1} \quad (j = 1, 2, 3, \ldots). \tag{3.3}
\]

Now consider the $2 \times 2$ operator matrix representation of $N$ relative to the direct sum decomposition $\mathbb{C}^2 \oplus \ell^2_+$:
\[
N = \begin{bmatrix} N_{11} & N_{12} \\ N_{21} & N_{22} \end{bmatrix} : \mathbb{C}^2 \oplus \ell^2_+ \to \mathbb{C}^2 \oplus \ell^2_+ .
\]
From (3.2) we see that $N$ maps $\mathbb{C}^2 \oplus \{0\}$ in a one-to-one way onto $\mathbb{C}^2 \oplus \{0\}$. Hence $N_{11}$ is invertible and $N_{21} = 0$. The equalities in (3.3) show that $N_{22} = S^* - 3I$. As $S^*$ is a contraction, it follows that $N_{22}$ is also invertible. Thus $N$ is block upper triangular and its diagonal blocks are invertible. So $N$ is invertible. Since $N = V^*M$ is invertible, $M$ is injective.

It remains to prove that for our $M$ the four equations in (1.4) are solvable. Note that $CN^{-1}V^*M = C$ and hence $Q = R = CN^{-1}V^*$ gives that $QM = C$ and $RM = CA^2$, where for the last equality we used the fact that $A^2 = E$. From the definition of $V$ we see that $V^*V$ is the identity operator on $\mathbb{C}^2 \oplus \ell^2_+$, and $VV^*$ is the orthogonal projection of $\mathbb{C}^2 \oplus \ell^2_+$ onto $\operatorname{span}\{e_2\} \oplus \ell^2_+$. Note that $\operatorname{Im} B$ is contained in $\operatorname{span}\{e_2\} \oplus \ell^2_+$. We already know that the same holds true for $\operatorname{Im} M$. Thus $VV^*B = B$ and $VV^*M = M$. Now put $K = L = N^{-1}V^*B$. Then
\[
MK = VV^*MK = VNK = VNN^{-1}V^*B = VV^*B = B, \qquad ML = B = A^2B.
\]
For the final equality we again use that $A^2 = E$. Thus the four equations in (1.4) have solutions. Summarizing we see that $M$ is injective, that the four equations (1.4) have solutions, but that $M$ is not surjective. □

The block Toeplitz matrix $T$ associated to the operator $M$ defined by (3.1) is the $3 \times 3$ block operator matrix given by
\[
T = \begin{bmatrix} I - W & -S & -W \\ -S & I - W & -S \\ -W & -S & I - W \end{bmatrix}.
\]


Here, as before, $S$ is the forward shift on $\ell^2_+$ and $W$ is the operator on $\ell^2_+$ given by
\[
W = \begin{bmatrix}
1 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & \cdots \\
0 & 1 & 0 & 0 & \cdots \\
0 & 0 & 1 & 0 & \\
\vdots & \vdots & \vdots & & \ddots
\end{bmatrix}.
\]
Note that $\operatorname{Im}(I - W)$ and $\operatorname{Im} S$ are contained in $\operatorname{span}\{f_2, f_3, \ldots\}$. Hence $T$ is not surjective, as one expects because $T$ is surjective if and only if $M$ is.

Assume that $\mathcal{X}$ or $\mathcal{U}$ is finite dimensional, and let the four equations in (1.4) be solvable. Then $M$, $I + RB$, $I + CK$, $E + BR$ and $E + KC$ are all the sum of an identity operator and an operator of finite rank. For such an operator there exists a well-defined determinant that has the usual properties (cf. [3], Sections VII.1 and VII.3). We claim that
\[
\det(I + CK) = \det(I + RB). \tag{3.4}
\]
To see this we first note that
\[
\det(I + CK) = \det(E + KC), \qquad \det(I + RB) = \det(E + BR). \tag{3.5}
\]
Next, observe that
\[
\begin{aligned}
\det(E + BR)\det M &= \det(M + BRM) = \det(M + BCA^{n}) = \det(E - A\Omega) \\
&= \det(E - \Omega A) = \det(M + A^{n}BC) = \det(M + MKC) = \det M \det(E + KC).
\end{aligned}
\]
If $\det M \neq 0$, then the above calculation shows that $\det(E + KC) = \det(E + BR)$, and hence, by (3.5), the identity (3.4) holds. On the other hand, if $\det M = 0$, then $M$ is not invertible, and we know from Theorem 1.1 that neither $I + CK$ nor $I + RB$ is invertible. In other words, both $\det(I + CK)$ and $\det(I + RB)$ are zero, and (3.4) is trivially satisfied. In the case when $\dim \mathcal{U} = 1$, the identity (3.4) recovers the fact that the left upper element and the right lower element of the inverse of a scalar Toeplitz matrix are equal (cf. [5] or Section III.6 in [1]).
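The identity (3.4) is easy to observe numerically when $\dim \mathcal{U} = 1$, since both determinants are then scalars. The following sketch assumes $n = 1$, $\mathcal{X} = \mathbb{C}^2$ and hypothetical sample operators; `mul` and `inv2` are illustrative helpers, and $K$ and $R$ are computed from a direct inverse of $M$ only to produce solutions of (1.4).

```python
from fractions import Fraction as Fr

def mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inv2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (p, q), (r, s) = m
    d = p * s - q * r
    return [[s / d, -q / d], [-r / d, p / d]]

# Illustrative data with n = 1, so M = E - ABC - BCA.
A = [[Fr(1), Fr(1)], [Fr(0), Fr(1)]]
B = [[Fr(0)], [Fr(1)]]
C = [[Fr(1, 2), Fr(0)]]
ABC = mul(mul(A, B), C)
BCA = mul(mul(B, C), A)
M = [[Fr(int(i == j)) - ABC[i][j] - BCA[i][j] for j in range(2)] for i in range(2)]
Minv = inv2(M)

K = mul(Minv, mul(A, B))   # solves M K = A^n B
R = mul(mul(C, A), Minv)   # solves R M = C A^n
det_I_CK = 1 + mul(C, K)[0][0]
det_I_RB = 1 + mul(R, B)[0][0]
print(det_I_CK, det_I_RB)
```

Although $K$ and $R$ are built from different one-sided equations, the two scalars coincide, as (3.4) predicts.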

References

[1] I.C. Gohberg, I.A. Fel'dman, Convolution equations and projection methods for their solution, Transl. Math. Monographs Vol. 41, Amer. Math. Soc., Providence, R.I., 1974.
[2] I. Gohberg, G. Heinig, The inversion of finite Toeplitz matrices consisting of elements of a non-commutative algebra, Rev. Roum. Math. Pures et Appl. 20 (1974), 623–663 (in Russian); English transl. in: Convolution Equations and Singular Integral Operators, (eds. L. Lerer, V. Olshevsky, I.M. Spitkovsky), OT 206, Birkhäuser Verlag, Basel, 2010, pp. 7–46.
[3] I. Gohberg, S. Goldberg, M.A. Kaashoek, Classes of Linear Operators, Volume I, Birkhäuser Verlag, Basel, 1990.
[4] I. Gohberg, M.A. Kaashoek, F. van Schagen, On inversion of Toeplitz matrices with elements in an algebraic ring, Lin. Alg. Appl. 385 (2004), 381–389.
[5] I. Gohberg, A.A. Semencul, On the invertibility of finite Toeplitz matrices and their continuous analogues, Matem. Issled 7(2), Kishinev (1972), (in Russian).
[6] L. Lerer, V. Olshevsky, I.M. Spitkovsky (Eds), Convolution Equations and Singular Integral Operators, OT 206, Birkhäuser Verlag, Basel, 2010.

M.A. Kaashoek and F. van Schagen
Afdeling Wiskunde, Faculteit der Exacte Wetenschappen
VU Universiteit Amsterdam
De Boelelaan 1081a, NL-1081 HV Amsterdam, The Netherlands
e-mail: [email protected] [email protected]

Operator Theory: Advances and Applications, Vol. 218, 387–401
© 2012 Springer Basel AG

The Inverse of a Two-level Positive Definite Toeplitz Operator Matrix

Selcuk Koyuncu and Hugo J. Woerdeman

To the memory of Israel Gohberg, an excellent mathematician and an inspiring teacher

Abstract. The Gohberg-Semencul formula allows one to express the entries of the inverse of a Toeplitz matrix using only a few entries (the first row and the first column) of the matrix, under some nonsingularity condition. In this paper we will provide a two variable generalization of the Gohberg-Semencul formula in the case of a positive definite two-level Toeplitz matrix with a symbol of the form $\frac{1}{|p|^2}$, where $p$ is a stable polynomial of two variables. We also consider the case of operator-valued two-level Toeplitz matrices. In addition, we propose an approximation of the inverse of a multilevel Toeplitz matrix with a positive symbol, and use it as the initial value for a Hotelling iteration to compute the inverse. Numerical results are included.

Mathematics Subject Classification (2000). 15A09 (47B35, 65F30).

Keywords. Two-level Toeplitz matrices, stable polynomial, inverse formula, Gohberg-Semencul expressions, Discrete Algebraic Riccati Equation.

This research is supported by NSF grant DMS-0901628.

1. Introduction

Important in the development of computational and theoretical results involving Toeplitz matrices was the Gohberg-Semencul formula, which expresses the inverse of a Toeplitz matrix $T$ in terms of the first column and row of $T^{-1}$. The impact of this formula on the field of structured matrices and numerical algorithms was systematically presented in a book by G. Heinig and K. Rost [4]. A nontrivial generalization to block Toeplitz matrices is the Gohberg-Heinig formula [2]. For the classical one-variable positive definite case the Gohberg-Semencul formula [3] is the following:


the inverse of $(t_{k-l})_{k,l=0}^{n-1}$ equals
\[
\begin{bmatrix} p_0 & & \\ \vdots & \ddots & \\ p_{n-1} & \cdots & p_0 \end{bmatrix}
\begin{bmatrix} \bar p_0 & \cdots & \bar p_{n-1} \\ & \ddots & \vdots \\ & & \bar p_0 \end{bmatrix}
-
\begin{bmatrix} \bar p_n & & \\ \vdots & \ddots & \\ \bar p_1 & \cdots & \bar p_n \end{bmatrix}
\begin{bmatrix} p_n & \cdots & p_1 \\ & \ddots & \vdots \\ & & p_n \end{bmatrix},
\]
where $p(z) = \sum_{k=0}^{n} p_k z^k$ satisfies $(t_{k-l})_{k,l=0}^{n}\,(p_k)_{k=0}^{n} = \frac{1}{\bar p_0} e_1$, where $e_1 = (1, 0, 0, \ldots, 0)^T$.

In this paper we consider two-level Toeplitz matrices, which in special cases are block Toeplitz matrices with Toeplitz blocks. We will provide a two variable generalization of the Gohberg-Semencul formula in the case of a positive definite two-level Toeplitz matrix with a symbol of the form $f(z_1, z_2) = \frac{1}{|P(z_1, z_2)|^2}$, where $P(z_1, z_2) = \sum_{k=0}^{n_1} \sum_{l=0}^{n_2} P_{kl} z_1^k z_2^l$ is a stable polynomial of two variables, i.e., $P(z_1, z_2) \neq 0$ for $|z_1| \le 1$, $|z_2| \le 1$. We define a two-level Toeplitz matrix to be a matrix of the form $T = (t_{k-l})_{k,l \in \Lambda}$, where $\Lambda$ is a finite subset of $\mathbb{N}_0^2$. For instance, when $\Lambda = \{0, 1\} \times \{0, 1\}$, which we will order lexicographically, $\Lambda = \{(0,0), (0,1), (1,0), (1,1)\}$, we get
\[
T = \begin{bmatrix}
t_{0,0} & t_{0,-1} & t_{-1,0} & t_{-1,-1} \\
t_{0,1} & t_{0,0} & t_{-1,1} & t_{-1,0} \\
t_{1,0} & t_{1,-1} & t_{0,0} & t_{0,-1} \\
t_{1,1} & t_{1,0} & t_{0,1} & t_{0,0}
\end{bmatrix}. \tag{1.1}
\]
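The classical one-variable formula recalled at the start of this section can be checked on a small case. The sketch below is only an illustration under stated assumptions: it takes the hypothetical real polynomial $p(z) = 2 + z$ (padded with zeros to degree $n = 3$), for which the Fourier coefficients of $1/|p|^2$ on the unit circle are $t_k = (-1/2)^{|k|}/3$, and it uses transposes in place of adjoints because all coefficients are real. The helper names are not from the paper.

```python
from fractions import Fraction

def mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def gohberg_semencul(p):
    """Return A A^T - B^T B for real coefficients p = [p_0, ..., p_n]."""
    n = len(p) - 1
    zero = Fraction(0)
    # A: lower triangular Toeplitz with first column p_0, ..., p_{n-1}.
    A = [[p[k - l] if k >= l else zero for l in range(n)] for k in range(n)]
    # B: upper triangular Toeplitz with first row p_n, ..., p_1.
    B = [[p[n + k - l] if l >= k else zero for l in range(n)] for k in range(n)]
    aat, btb = mul(A, transpose(A)), mul(transpose(B), B)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(aat, btb)]

# Assumed test data: p(z) = 2 + z, regarded as a polynomial of degree n = 3.
p = [Fraction(2), Fraction(1), Fraction(0), Fraction(0)]
n = len(p) - 1

def t(k):
    """Fourier coefficient of 1/|2 + z|^2 on the unit circle."""
    return Fraction(-1, 2) ** abs(k) / 3

T = [[t(k - l) for l in range(n)] for k in range(n)]
candidate_inverse = gohberg_semencul(p)
product = mul(T, candidate_inverse)
print(candidate_inverse)
```

For this choice the candidate inverse is tridiagonal, and multiplying it by `T` returns the identity matrix exactly.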

In this paper we obtain the following two-variable generalization of the classical Gohberg-Semencul formula. We first need to introduce some notation. For $k = (k_1, k_2)$ and $z = (z_1, z_2)$ we let $z^k = z_1^{k_1} z_2^{k_2}$. If $n = (n_1, n_2)$, we let $\underline{n}$ denote the set $\underline{n} = \underline{n_1} \times \underline{n_2}$, where $\underline{n_i} = \{0, \ldots, n_i\}$. Note that $T = (t_{k-l})_{k,l \in \underline{n}}$ is a block Toeplitz matrix where each of the blocks is Toeplitz, as for instance in (1.1). Finally, we denote $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$ and $\mathbb{D} = \{z \in \mathbb{C} : |z| < 1\}$. Recall that the Loewner order on Hermitian matrices is defined via $M \le N \iff N - M \ge 0$, i.e., $N - M$ is positive semidefinite.

Theorem 1.1. Let
\[
P(z_1, z_2) = \sum_{k=0}^{n_1} \sum_{l=0}^{n_2} P_{kl} z_1^k z_2^l \qquad \text{and} \qquad R(z_1, z_2) = \sum_{k=0}^{n_1} \sum_{l=0}^{n_2} R_{kl} z_1^k z_2^l
\]
be stable operator-valued polynomials, and suppose that
\[
P(z_1, z_2) P(z_1, z_2)^* = R(z_1, z_2)^* R(z_1, z_2).
\]


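Put
\[
f(z_1, z_2) = P(z_1, z_2)^{*-1} P(z_1, z_2)^{-1} = R(z_1, z_2)^{-1} R(z_1, z_2)^{*-1}
\]
for $z_1, z_2 \in \mathbb{T}$. Put $\Lambda = \underline{n} \setminus \{n\}$, where $n = (n_1, n_2)$, and write the Fourier coefficients of $f(z_1, z_2)$ as $\hat f(k, l)$, $(k, l) \in \mathbb{Z}^2$. Consider $T = (\hat f_{k_1 - k_2,\, l_1 - l_2})_{(k_1, l_1), (k_2, l_2) \in \Lambda}$. Then
\[
T^{-1} = AA^* - B^*B - C_1^* D_1^{-1} C_1 - C_2^* D_2^{-1} C_2, \tag{1.2}
\]
where
\[
A = (P_{k-l})_{k,l \in \Lambda}, \qquad B = (R_{k-l})_{k \in n + \Lambda,\ l \in \Lambda}, \tag{1.3}
\]
and $C_1$, $D_1$, $C_2$ and $D_2$ are defined via
\[
(C_1)_{ij} = \sum_{k_1 = i_1 - n_1}^{j_1} \ \sum_{k_2 = 0}^{\min\{i_2, j_2\}} P_{i-k} P_{j-k}^*
\ - \sum_{l_1 = i_1}^{j_1 + n_1} \ \sum_{l_2 = n_2}^{\min\{i_2 + n_2,\, j_2 + n_2\}} R_{l-i}^* R_{l-j}, \tag{1.4}
\]
where $i \in \Theta_1 = \{n_1 + 1, n_1 + 2, \ldots\} \times \{0, 1, \ldots, n_2 - 1\}$, $j \in \underline{n_1} \times \underline{n_2} \setminus \{(n_1, n_2)\}$,
\[
(C_2)_{ij} = \sum_{k_1 = 0}^{\min\{i_1, j_1\}} \ \sum_{k_2 = i_2 - n_2}^{j_2} P_{i-k} P_{j-k}^*
\ - \sum_{l_1 = n_1}^{\min\{i_1 + n_1,\, j_1 + n_1\}} \ \sum_{l_2 = i_2}^{j_2 + n_2} R_{l-i}^* R_{l-j}, \tag{1.5}
\]
where $i \in \Theta_2 = \{0, 1, \ldots, n_1 - 1\} \times \{n_2 + 1, n_2 + 2, \ldots\}$ and $j \in \underline{n_1} \times \underline{n_2} \setminus \{(n_1, n_2)\}$,
\[
(D_1)_{k, \tilde k} = \sum_{l_1 = \max\{k_1, \tilde k_1\} - n_1}^{\min\{k_1, \tilde k_1\}} \ \sum_{l_2 = 0}^{\min\{k_2, \tilde k_2\}} P_{k-l} P_{\tilde k - l}^*
\ - \sum_{s_1 = \max\{k_1, \tilde k_1\}}^{\min\{k_1, \tilde k_1\} + n_1} \ \sum_{s_2 = n_2}^{\min\{k_2, \tilde k_2\} + n_2} R_{s-k}^* R_{s - \tilde k}, \tag{1.6}
\]
where $k, \tilde k \in \Theta_1 = \{n_1 + 1, n_1 + 2, \ldots\} \times \{0, 1, \ldots, n_2 - 1\}$, and
\[
(D_2)_{k, \tilde k} = \sum_{l_1 = 0}^{\min\{k_1, \tilde k_1\}} \ \sum_{l_2 = \max\{k_2, \tilde k_2\} - n_2}^{\min\{k_2, \tilde k_2\}} P_{k-l} P_{\tilde k - l}^*
\ - \sum_{s_1 = n_1}^{\min\{k_1, \tilde k_1\} + n_1} \ \sum_{s_2 = \max\{k_2, \tilde k_2\}}^{\min\{k_2, \tilde k_2\} + n_2} R_{s-k}^* R_{s - \tilde k}, \tag{1.7}
\]
where $k, \tilde k \in \Theta_2 = \{0, 1, \ldots, n_1 - 1\} \times \{n_2 + 1, n_2 + 2, \ldots\}$ and $P_k = R_k = 0$ whenever $k \notin \underline{n}$.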


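Thus to compute $T^{-1}$, we have reduced it to computing the inverses of $D_1$ and $D_2$, where $D_1$ and $D_2$ are traditional one-level Toeplitz matrices. Typically, we would like to use it when $P(z_1, z_2) = \sum_{k=0}^{n_1} \sum_{l=0}^{n_2} P_{kl} z_1^k z_2^l$ is in fact a polynomial of degree $(k_1, k_2)$ where $k_1 \ll n_1$ and $k_2 \ll n_2$. In that case, $A$, $B$, $C_1$, $C_2$, $D_1$ and $D_2$ are sparse. Let us start illustrating Theorem 1.1 by giving the following example.

Example. Let $n_1 = n_2 = 2$. Given $P(z_1, z_2) = R(z_1, z_2) = p_{00} + p_{01} z_2 + p_{10} z_1 + p_{02} z_2^2 + p_{20} z_1^2$, where $p_{00} = \frac32$, $p_{01} = \frac13$, $p_{02} = \frac12$, $p_{20} = \frac12$, $p_{10} = \frac13$, and $\Lambda = \{0, 1, 2\} \times \{0, 1, 2\} \setminus \{(2, 2)\}$. In this case the matrices $A$, $B$, $C_1$, $C_2$, $D_1$ and $D_2$ are the following:
\[
A = \begin{bmatrix}
\frac32 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\frac13 & \frac32 & 0 & 0 & 0 & 0 & 0 & 0 \\
\frac12 & \frac13 & \frac32 & 0 & 0 & 0 & 0 & 0 \\
\frac13 & 0 & 0 & \frac32 & 0 & 0 & 0 & 0 \\
0 & \frac13 & 0 & \frac13 & \frac32 & 0 & 0 & 0 \\
0 & 0 & \frac13 & \frac12 & \frac13 & \frac32 & 0 & 0 \\
\frac12 & 0 & 0 & \frac13 & 0 & 0 & \frac32 & 0 \\
0 & \frac12 & 0 & 0 & \frac13 & 0 & \frac13 & \frac32
\end{bmatrix}, \qquad
B = \begin{bmatrix}
0 & 0 & \frac12 & 0 & 0 & \frac13 & \frac12 & \frac13 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac12 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \frac12 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix},
\]
\[
C_1 = \begin{bmatrix}
0 & 0 & 0 & \frac34 & \frac16 & 0 & \frac23 & \frac19 \\
0 & 0 & 0 & 0 & \frac34 & 0 & 0 & \frac23 \\
0 & 0 & 0 & 0 & 0 & 0 & \frac34 & \frac16 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac34 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots
\end{bmatrix}, \qquad
C_2 = \begin{bmatrix}
0 & \frac34 & \frac23 & 0 & \frac16 & \frac19 & 0 & 0 \\
0 & 0 & 0 & 0 & \frac34 & \frac23 & 0 & 0 \\
0 & 0 & \frac34 & 0 & 0 & \frac16 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \frac34 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots
\end{bmatrix},
\]
and the top left $8 \times 8$ block of the infinite block Toeplitz matrices $D_1$ and $D_2$ equals
\[
\begin{bmatrix}
2\frac{13}{36} & \frac13 & \frac23 & 0 & \frac34 & 0 & 0 & 0 \\
\frac13 & 2\frac{13}{36} & \frac19 & \frac23 & \frac16 & \frac34 & 0 & 0 \\
\frac23 & \frac19 & 2\frac{13}{36} & \frac13 & \frac23 & 0 & \frac34 & 0 \\
0 & \frac23 & \frac13 & 2\frac{13}{36} & \frac19 & \frac23 & \frac16 & \frac34 \\
\frac34 & \frac16 & \frac23 & \frac19 & 2\frac{13}{36} & \frac13 & \frac23 & 0 \\
0 & \frac34 & 0 & \frac23 & \frac13 & 2\frac{13}{36} & \frac19 & \frac23 \\
0 & 0 & \frac34 & \frac16 & \frac23 & \frac19 & 2\frac{13}{36} & \frac13 \\
0 & 0 & 0 & \frac34 & 0 & \frac23 & \frac13 & 2\frac{13}{36}
\end{bmatrix}.
\]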

The two-level Toeplitz matrix 𝑇 of size 8 × 8 is following. ⎡

0.6453 ⎢−0.1158 ⎢ ⎢−0.2241 ⎢ ⎢−0.1158 𝑇 =⎢ ⎢ 0.0490 ⎢ ⎢ 0.0674 ⎢ ⎣−0.2241 0.0674

⎤ −0.1158 −0.2241 −0.1158 0.0490 0.0674 −0.2241 0.0674 0.6453 −0.1158 0.0037 −0.1158 0.0490 0.0304 −0.2241⎥ ⎥ −0.1158 0.6453 0.0304 0.0037 −0.1158 0.0839 0.0304 ⎥ ⎥ 0.0037 0.0304 0.6453 −0.1158 −0.2241 −0.1158 0.0490 ⎥ ⎥. −0.1158 0.0037 −0.1158 0.6453 −0.1158 0.0037 −0.1158⎥ ⎥ 0.0490 −0.1158 −0.2241 −0.1158 0.6453 0.0304 0.0037 ⎥ ⎥ 0.0304 0.0839 −0.1158 0.0037 0.0304 0.6453 −0.1158⎦ −0.2241 0.0304 0.0490 −0.1158 0.0037 −0.1158 0.6453


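2. Proof of the main result

To prove Theorem 1.1 we first recall the following auxiliary results from [8].

Lemma 2.1. Assume that the operator matrix $(A_{ij})_{i,j=1}^{2} : H_1 \oplus H_2 \to H_1 \oplus H_2$ and the operator $A_{22}$ are invertible. Then $S = A_{11} - A_{12} A_{22}^{-1} A_{21}$ is invertible and

\[ \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}^{-1} = \begin{bmatrix} S^{-1} & * \\ * & * \end{bmatrix}. \tag{2.1} \]

Proof. Follows directly from the factorization

\[ \begin{bmatrix} A_{11} - A_{12} A_{22}^{-1} A_{21} & 0 \\ 0 & A_{22} \end{bmatrix} = \begin{bmatrix} I & -A_{12} A_{22}^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} I & 0 \\ -A_{22}^{-1} A_{21} & I \end{bmatrix}. \tag{2.2} \]
□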
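Lemma 2.1 can be checked on a small numerical case. The following is a minimal sketch, assuming a hypothetical $3 \times 3$ rational matrix split so that $H_1$ has dimension 1 and $H_2$ has dimension 2; the helper `inverse` and the matrix `M` are illustrative and not part of the original text. Exact fractions are used so that the comparison involves no rounding.

```python
from fractions import Fraction

def inverse(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    # Augment m with the identity matrix.
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical example: A11 is 1x1, A22 is 2x2.
M = [[Fraction(v) for v in row] for row in [[4, 1, 2], [1, 3, 0], [2, 0, 5]]]
A11 = M[0][0]
A12 = [M[0][1], M[0][2]]
A21 = [M[1][0], M[2][0]]
A22 = [[M[1][1], M[1][2]], [M[2][1], M[2][2]]]

# Schur complement S = A11 - A12 * A22^{-1} * A21 (a scalar in this example).
A22_inv = inverse(A22)
tmp = [sum(A22_inv[i][j] * A21[j] for j in range(2)) for i in range(2)]
S = A11 - sum(A12[i] * tmp[i] for i in range(2))

# The (1,1) block of M^{-1} should equal S^{-1}, as in (2.1).
top_left_of_inverse = inverse(M)[0][0]
print(S, top_left_of_inverse)
```

In this example the Schur complement and the top-left entry of the inverse are reciprocals of each other, which is the scalar form of (2.1).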

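Lemma 2.2. Let lower/upper and upper/lower factorizations of the inverse of a block matrix be given, as follows:

\[ \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}^{-1} = \begin{bmatrix} P_{11} & 0 \\ P_{21} & P_{22} \end{bmatrix} \begin{bmatrix} Q_{11} & Q_{12} \\ 0 & Q_{22} \end{bmatrix} \tag{2.3} \]

\[ \phantom{\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}^{-1}} = \begin{bmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{bmatrix} \begin{bmatrix} T_{11} & 0 \\ T_{21} & T_{22} \end{bmatrix}, \tag{2.4} \]

and suppose that $R_{22}$ and $T_{22}$ are invertible. Then

\[ B_{11}^{-1} = P_{11} Q_{11} - R_{12} T_{21}. \tag{2.5} \]

Proof. Apply Lemma 2.1 with $A_{11} = P_{11} Q_{11}$, $A_{12} = R_{12} T_{22}$, $A_{21} = R_{22} T_{21}$, $A_{22} = R_{22} T_{22}$ to the equality

\[ \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}^{-1} = \begin{bmatrix} P_{11} Q_{11} & R_{12} T_{22} \\ R_{22} T_{21} & R_{22} T_{22} \end{bmatrix}. \]
□

Corollary 2.3. Consider a positive definite operator matrix $(B_{ij})_{i,j=1}^{3}$ of which the lower/upper and upper/lower block Cholesky factorizations of its inverse are given, as follows:

\[ \left[ (B_{ij})_{i,j=1}^{3} \right]^{-1} = \begin{bmatrix} P_{11} & 0 & 0 \\ P_{21} & P_{22} & 0 \\ P_{31} & P_{32} & P_{33} \end{bmatrix} \begin{bmatrix} P_{11}^* & P_{21}^* & P_{31}^* \\ 0 & P_{22}^* & P_{32}^* \\ 0 & 0 & P_{33}^* \end{bmatrix} \tag{2.6} \]

\[ \phantom{\left[ (B_{ij})_{i,j=1}^{3} \right]^{-1}} = \begin{bmatrix} R_{11}^* & R_{21}^* & R_{31}^* \\ 0 & R_{22}^* & R_{32}^* \\ 0 & 0 & R_{33}^* \end{bmatrix} \begin{bmatrix} R_{11} & 0 & 0 \\ R_{21} & R_{22} & 0 \\ R_{31} & R_{32} & R_{33} \end{bmatrix}, \tag{2.7} \]

with $R_{22}$, $P_{22}$, $P_{33}$ and $R_{33}$ invertible. Then

\[ B_{11}^{-1} = P_{11} P_{11}^* - R_{21}^* R_{21} - R_{31}^* R_{31}. \tag{2.8} \]

Proof. By Lemma 2.2 we have that

\[ B_{11}^{-1} = P_{11} P_{11}^* - \begin{bmatrix} R_{21}^* & R_{31}^* \end{bmatrix} \begin{bmatrix} R_{21} \\ R_{31} \end{bmatrix}, \]

which gives (2.8). □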

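Before we prove Theorem 1.1, we need to introduce some notation. Let $\mathcal{H}$ be a Hilbert space and let $\mathcal{B}(\mathcal{H})$ denote the Banach space of bounded linear operators on $\mathcal{H}$. We let $L_\infty = L_\infty(\mathbb{T}^2; \mathcal{B}(\mathcal{H}))$ denote the Lebesgue space of essentially bounded $\mathcal{B}(\mathcal{H})$-valued measurable functions on $\mathbb{T}^2$, and we let $L_2 = L_2(\mathbb{T}^2; \mathcal{H})$ and $H_2 = H_2(\mathbb{T}^2; \mathcal{H})$ denote the Lebesgue and Hardy spaces of square integrable $\mathcal{H}$-valued functions on $\mathbb{T}^2$, respectively. As usual we view $H_2$ as a subspace of $L_2$. For $L(z) = \sum_{i \in \mathbb{Z}^2} L_i z^i \in L_\infty$ we will consider its multiplication operator $M_L : L_2 \to L_2$ given by $(M_L(f))(z) = L(z) f(z)$. The Toeplitz operator $T_L : H_2 \to H_2$ is defined as the compression of $M_L$ to $H_2$. For $\Lambda \subset \mathbb{Z}^2$ we let $S_\Lambda$ denote the subspace $\{F \in L_2 : F(z) = \sum_{k \in \Lambda} F_k z^k\}$ of $L_2$ consisting of those functions with Fourier support in $\Lambda$. In addition, we let $P_\Lambda$ denote the orthogonal projection onto $S_\Lambda$. So, for instance, $P_{\mathbb{N}_0^2}$ is the orthogonal projection onto $H_2$ and $T_L = P_{\mathbb{N}_0^2} M_L P_{\mathbb{N}_0^2}^*$.

Proof of Theorem 1.1. Clearly we have that $M_{f^{-1}} = M_P M_{P^*} = M_{R^*} M_R$. With respect to the decomposition $L_2 = H_2^{\perp} \oplus H_2$ we get that

\[ M_f = \begin{bmatrix} * & * \\ * & T_f \end{bmatrix}, \qquad M_P = \begin{bmatrix} * & 0 \\ * & T_P \end{bmatrix}, \qquad M_{P^{-1}} = \begin{bmatrix} * & 0 \\ * & T_{P^{-1}} \end{bmatrix}, \tag{2.9} \]

\[ M_R = \begin{bmatrix} * & 0 \\ * & T_R \end{bmatrix}, \qquad M_{R^{-1}} = \begin{bmatrix} * & 0 \\ * & T_{R^{-1}} \end{bmatrix}, \tag{2.10} \]

where we used that $M_{P^{\pm 1}}[H_2] \subset H_2$ and $M_{R^{\pm 1}}[H_2] \subset H_2$, which follows as $P^{\pm 1}$ and $R^{\pm 1}$ are analytic in $\mathbb{D}^2$. It now follows that $T_f = (T_P^*)^{-1} (T_P)^{-1}$ and thus

\[ T_f^{-1} = T_P T_P^*. \tag{2.11} \]

Next, decompose $H_2 = S_\Lambda \oplus S_\Theta \oplus S_{n + \mathbb{N}_0^2}$, where $\Lambda = \underline{n_1} \times \underline{n_2} \setminus \{(n_1, n_2)\}$ and $\Theta = \mathbb{N}_0^2 \setminus (\Lambda \cup (n + \mathbb{N}_0^2))$, and write $T_P$ and $T_R$ with respect to this decomposition:

\[ T_P = \begin{bmatrix} P_{11} & & \\ P_{21} & P_{22} & \\ P_{31} & P_{32} & P_{33} \end{bmatrix}, \qquad T_R = \begin{bmatrix} R_{11} & & \\ R_{21} & R_{22} & \\ R_{31} & R_{32} & R_{33} \end{bmatrix}. \tag{2.12} \]

As the Fourier support of $P$ and $R$ lies in $\underline{n}$, and as $P(z) P(z)^* = R(z)^* R(z)$ on $\mathbb{T}^2$, it is not hard to show that

\[ T_P T_P^* P_{n + \mathbb{N}_0^2}^* = T_R^* T_R P_{n + \mathbb{N}_0^2}^*, \tag{2.13} \]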


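which yields that

\[ P_{31} P_{31}^* + P_{32} P_{32}^* + P_{33} P_{33}^* = R_{33}^* R_{33}, \qquad P_{21} P_{31}^* + P_{22} P_{32}^* = R_{32}^* R_{33}, \qquad P_{11} P_{31}^* = R_{31}^* R_{33}. \]

Thus we can factor $T_P T_P^*$ as

\[ T_P T_P^* = \begin{bmatrix} \tilde{R}_{11}^* & \tilde{R}_{21}^* & R_{31}^* \\ & \tilde{R}_{22}^* & R_{32}^* \\ & & R_{33}^* \end{bmatrix} \begin{bmatrix} \tilde{R}_{11} & & \\ \tilde{R}_{21} & \tilde{R}_{22} & \\ R_{31} & R_{32} & R_{33} \end{bmatrix}, \tag{2.14} \]

for some $\tilde{R}_{11}$, $\tilde{R}_{21}$ and $\tilde{R}_{22}$. Combining now (2.14) with the factorization of $T_P T_P^*$ given via (2.11), we get $\tilde{R}_{22}^* \tilde{R}_{21} = P_{21} P_{11}^* - R_{32}^* R_{31}$ and $\tilde{R}_{22}^* \tilde{R}_{22} = P_{21} P_{21}^* + P_{22} P_{22}^* - R_{32}^* R_{32}$. Now, we write

\[ \tilde{R}_{22}^* \tilde{R}_{21} = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix}, \]

where $C_1 = P_{\Theta_1} \tilde{R}_{22}^* \tilde{R}_{21}$ and $C_2 = P_{\Theta_2} \tilde{R}_{22}^* \tilde{R}_{21}$. We prove only (1.4); the proof of (1.5) is similar. To prove (1.4), let $i \in \Theta_1 = \{n_1 + 1, n_1 + 2, \dots\} \times \{0, 1, \dots, n_2 - 1\}$, $j \in \Lambda$. Since $P_k = R_k = 0$ when $k \notin \underline{n} = \underline{n_1} \times \underline{n_2}$, we get from $C_1 = P_{\Theta_1} (P_{21} P_{11}^* - R_{32}^* R_{31})$ that

\[ (C_1)_{ij} = \sum_{\substack{k \in \Lambda \\ i - k \in \underline{n_1} \times \underline{n_2} \\ j - k \in \underline{n_1} \times \underline{n_2}}} P_{i-k} P_{j-k}^* \ - \sum_{\substack{l \in n + \mathbb{N}_0^2 \\ l - i \in \underline{n_1} \times \underline{n_2} \\ l - j \in \underline{n_1} \times \underline{n_2}}} R_{l-i}^* R_{l-j}. \]

Note that $i - k \in \underline{n_1} \times \underline{n_2}$ and $j - k \in \underline{n_1} \times \underline{n_2}$ imply $0 \le i_1 - k_1 \le n_1$, $0 \le j_1 - k_1 \le n_1$, $0 \le i_2 - k_2 \le n_2$ and $0 \le j_2 - k_2 \le n_2$. Combining these inequalities we get $i_1 - n_1 \le k_1 \le j_1$ and $0 \le k_2 \le \min\{i_2, j_2\}$. Similarly, since $l - i \in \underline{n_1} \times \underline{n_2}$ and $l - j \in \underline{n_1} \times \underline{n_2}$ we get $i_1 \le l_1 \le j_1 + n_1$ and $n_2 \le l_2 \le \min\{i_2 + n_2, j_2 + n_2\}$. Thus the $(i, j)$th entry of $C_1$ equals

\[ (C_1)_{ij} = \sum_{k_1 = i_1 - n_1}^{j_1} \ \sum_{k_2 = 0}^{\min\{i_2, j_2\}} P_{i-k} P_{j-k}^* \ - \sum_{l_1 = i_1}^{j_1 + n_1} \ \sum_{l_2 = n_2}^{\min\{i_2 + n_2,\, j_2 + n_2\}} R_{l-i}^* R_{l-j}. \]

This proves (1.4).

Next, we need to compute $\tilde{R}_{22}^* \tilde{R}_{22} = P_{21} P_{21}^* + P_{22} P_{22}^* - R_{32}^* R_{32}$. Write

\[ \tilde{R}_{22}^* \tilde{R}_{22} = \begin{bmatrix} D_1 & E \\ E^* & D_2 \end{bmatrix}, \]

where $D_i = P_{\Theta_i} \tilde{R}_{22}^* \tilde{R}_{22} P_{\Theta_i}^*$, $i = 1, 2$, and $E = P_{\Theta_1} \tilde{R}_{22}^* \tilde{R}_{22} P_{\Theta_2}^*$. We first show that $E = 0$. Let $i \in \Theta_1 = \{n_1 + 1, \dots\} \times \{0, \dots, n_2 - 1\}$, $j \in \Theta_2 = \{0, \dots, n_1 - 1\} \times \{n_2 + 1, \dots\}$. Note that

\[ (P_{21} P_{21}^* + P_{22} P_{22}^*)_{ij} = \sum_{\substack{k \in \Lambda \cup \Theta \\ i - k \in \underline{n} \\ j - k \in \underline{n}}} P_{i-k} P_{j-k}^*. \tag{2.15} \]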


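As $k \in (i - \underline{n}) \cap (j - \underline{n}) \cap (\Lambda \cup \Theta)$ is equivalent to $i_1 - n_1 \le k_1 \le j_1$ and $j_2 - n_2 \le k_2 \le i_2$, we obtain

\[ (P_{21} P_{21}^* + P_{22} P_{22}^*)_{ij} = \sum_{k_1 = i_1 - n_1}^{j_1} \ \sum_{k_2 = j_2 - n_2}^{i_2} P_{i-k} P_{j-k}^*. \]

Next,

\[ (R_{32}^* R_{32})_{ij} = \sum_{\substack{l \in n + \mathbb{N}_0^2 \\ l - i \in \underline{n} \\ l - j \in \underline{n}}} R_{l-i}^* R_{l-j}. \tag{2.16} \]

As $l \in (j + \underline{n}) \cap (i + \underline{n}) \cap (n + \mathbb{N}_0^2)$ is equivalent to $i_1 \le l_1 \le j_1 + n_1$ and $j_2 \le l_2 \le i_2 + n_2$, we obtain

\[ (R_{32}^* R_{32})_{ij} = \sum_{l_1 = i_1}^{j_1 + n_1} \ \sum_{l_2 = j_2}^{i_2 + n_2} R_{l-i}^* R_{l-j}. \]

Finally, we need to show that $(P_{21} P_{21}^* + P_{22} P_{22}^*)_{ij} = (R_{32}^* R_{32})_{ij}$. It is clear that if $i_1 - n_1 > j_1$ or $j_2 - n_2 > i_2$ then equality holds as both sides equal 0. Now let us consider the case when $i_1 - n_1 \le j_1$ and $j_2 - n_2 \le i_2$. Let $i = (n_1 + r, s) \in \Theta_1$ and $j = (\tilde{r}, n_2 + \tilde{s}) \in \Theta_2$ where $r, \tilde{s} \ge 1$, $s \in \{0, \dots, n_2 - 1\}$ and $\tilde{r} \in \{0, \dots, n_1 - 1\}$. Using the fact that $P(z) P(z)^* = R(z)^* R(z)$ we have

\[ \sum_{k_1 = r}^{\tilde{r}} \ \sum_{k_2 = \tilde{s}}^{s} P_{n_1 + r - k_1,\, s - k_2} \, P_{\tilde{r} - k_1,\, n_2 + \tilde{s} - k_2}^* = \sum_{k_1 = r}^{\tilde{r}} \ \sum_{k_2 = \tilde{s}}^{s} R_{\tilde{r} - k_1,\, n_2 + \tilde{s} - k_2}^* \, R_{n_1 + r - k_1,\, s - k_2}. \tag{2.17} \]

Substituting $r = i_1 - n_1$, $\tilde{r} = j_1$, $\tilde{s} = j_2 - n_2$ and $s = i_2$ into (2.17) we obtain

\[ \sum_{k_1 = i_1 - n_1}^{j_1} \ \sum_{k_2 = j_2 - n_2}^{i_2} P_{i_1 - k_1,\, i_2 - k_2} \, P_{j_1 - k_1,\, j_2 - k_2}^* = \sum_{k_1 = i_1 - n_1}^{j_1} \ \sum_{k_2 = j_2 - n_2}^{i_2} R_{j_1 - k_1,\, j_2 - k_2}^* \, R_{i_1 - k_1,\, i_2 - k_2}. \tag{2.18} \]

Replacing $k_1 + n_1$ by $l_1$ and $k_2 + n_2$ by $l_2$ in the right-hand side of (2.18) we obtain

\[ \sum_{l_1 = i_1}^{j_1 + n_1} \ \sum_{l_2 = j_2}^{i_2 + n_2} R_{j_1 - l_1 + n_1,\, j_2 - l_2 + n_2}^* \, R_{i_1 - l_1 + n_1,\, i_2 - l_2 + n_2}. \tag{2.19} \]

Replacing $j_1 + i_1 - l_1 + n_1$ by $\tilde{l}_1$ and $j_2 - l_2 + n_2 + i_2$ by $\tilde{l}_2$ in (2.19) we obtain

\[ \sum_{\tilde{l}_1 = i_1}^{j_1 + n_1} \ \sum_{\tilde{l}_2 = j_2}^{i_2 + n_2} R_{\tilde{l}_1 - i_1,\, \tilde{l}_2 - i_2}^* \, R_{\tilde{l}_1 - j_1,\, \tilde{l}_2 - j_2}. \tag{2.20} \]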


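Thus (2.18) and (2.20) yield that

\[ \sum_{k_1 = i_1 - n_1}^{j_1} \ \sum_{k_2 = j_2 - n_2}^{i_2} P_{i-k} P_{j-k}^* = \sum_{\tilde{l}_1 = i_1}^{j_1 + n_1} \ \sum_{\tilde{l}_2 = j_2}^{i_2 + n_2} R_{\tilde{l} - i}^* R_{\tilde{l} - j}. \]

This proves that $E = 0$.

Now let us prove (1.6). The proof of (1.7) is similar and will be omitted. Let $k, \tilde{k} \in \Theta_1$. Since $P_k = R_k = 0$ when $k \notin \underline{n} = \underline{n_1} \times \underline{n_2}$, we get from $D_1 = P_{\Theta_1} \tilde{R}_{22}^* \tilde{R}_{22} P_{\Theta_1}^*$ that

\[ (D_1)_{k, \tilde{k}} = \sum_{\substack{l \in \Lambda \cup \Theta_1 \\ k - l \in \underline{n_1} \times \underline{n_2} \\ \tilde{k} - l \in \underline{n_1} \times \underline{n_2}}} P_{k-l} P_{\tilde{k}-l}^* \ - \sum_{\substack{s \in n + \mathbb{N}_0^2 \\ s - k \in \underline{n_1} \times \underline{n_2} \\ s - \tilde{k} \in \underline{n_1} \times \underline{n_2}}} R_{s-k}^* R_{s-\tilde{k}}. \]

Note that $k - l \in \underline{n_1} \times \underline{n_2}$ and $\tilde{k} - l \in \underline{n_1} \times \underline{n_2}$ imply $k_1 - n_1 \le l_1 \le k_1$, $\tilde{k}_1 - n_1 \le l_1 \le \tilde{k}_1$, $0 \le l_2 \le k_2$ and $0 \le l_2 \le \tilde{k}_2$. Combining these inequalities we get $\max\{k_1, \tilde{k}_1\} - n_1 \le l_1 \le \min\{k_1, \tilde{k}_1\}$ and $0 \le l_2 \le \min\{k_2, \tilde{k}_2\}$. Similarly, $s - k \in \underline{n_1} \times \underline{n_2}$ and $s - \tilde{k} \in \underline{n_1} \times \underline{n_2}$ imply that $k_1 \le s_1 \le k_1 + n_1$, $\tilde{k}_1 \le s_1 \le \tilde{k}_1 + n_1$, $n_2 \le s_2 \le k_2 + n_2$ and $n_2 \le s_2 \le \tilde{k}_2 + n_2$. Thus the $(k, \tilde{k})$th entry of $D_1$ is given by (1.6). □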

3. Implementation of the formula in Matlab

Suppose we are given a two variable scalar-valued stable polynomial

\[ P(z_1, z_2) = \sum_{k=0}^{n_1} \sum_{l=0}^{n_2} p_{kl} z_1^k z_2^l, \]

and that the symbol of $T$ is of the form $\frac{1}{|P|^2}$. We can build the matrices $A$, $B$, $C_1$, $D_1$, $C_2$, $D_2$ according to Theorem 1.1. The matrices $D_1$ and $D_2$ are generated by matrix-valued symbols of one variable. One way to compute $C_1^* D_1^{-1} C_1$ is to factorize $D_1^{-1} = F F^*$ with $F$ upper triangular. As $C_1$ is typically sparse with entries in the upper part, $F C_1$ will also be sparse. The factorization of $D_1$ (and $D_2$) can be obtained by a direct LU factorization, but also via the so-called Discrete Algebraic Riccati Equation (DARE) in Matlab. We will illustrate the latter method. Suppose $f(z) = f_{-n} z^{-n} + \cdots + f_n z^n \ge 0$, $|z| = 1$. We want

\[ f(z) = p(z)^* p(z), \tag{3.1} \]

where $p(z) = p_0 + \cdots + p_n z^n$ is the outer factor. Note that (3.1) is equivalent to

\[ \begin{bmatrix} p_0^* \\ \vdots \\ p_n^* \end{bmatrix} \begin{bmatrix} p_0 & \cdots & p_n \end{bmatrix} = \begin{bmatrix} p_0^* p_0 & \cdots & p_0^* p_n \\ \vdots & & \vdots \\ p_n^* p_0 & \cdots & p_n^* p_n \end{bmatrix} \]


having the property that

\[ \sum_{i=0}^{n} p_i^* p_i = f_0, \qquad \sum_{i=0}^{n-1} p_i^* p_{i+1} = f_1, \qquad \sum_{i=0}^{n-2} p_i^* p_{i+2} = f_2, \ \dots, \ p_0^* p_n = f_n \tag{3.2} \]

with $f_1^* = f_{-1}$, $f_2^* = f_{-2}$, ..., $f_n^* = f_{-n}$. Therefore, we consider

\[ P = \begin{bmatrix} P_{00} & \cdots & P_{0n} \\ \vdots & & \vdots \\ P_{n0} & \cdots & P_{nn} \end{bmatrix} \ge 0 \tag{3.3} \]

with the property

\[ \sum_{i=0}^{n} P_{ii} = f_0, \qquad \sum_{i=0}^{n-1} P_{i,i+1} = f_1, \ \dots, \ P_{0n} = f_n. \tag{3.4} \]

If we write

\[ P = \begin{bmatrix} -X & S \\ S^* & R \end{bmatrix} + \begin{bmatrix} A^* X A & A^* X B \\ B^* X A & B^* X B \end{bmatrix}, \]

where

\[ R = f_0, \qquad S = \begin{bmatrix} f_{-n} \\ \vdots \\ f_{-1} \end{bmatrix}, \qquad A = \begin{bmatrix} 0 & I & & \\ \vdots & & \ddots & \\ \vdots & & & I \\ 0 & \cdots & \cdots & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ I \end{bmatrix}, \]

and $X = X^*$, then $P$ has the property (3.4). In fact, every $P$ satisfying (3.4) can be written in this form. Now suppose that $X$ is such that $R + B^* X B > 0$; then $P \ge 0$ if and only if

\[ X - A^* X A + (S + A^* X B)(R + B^* X B)^{-1}(S^* + B^* X A) - Q \ge 0. \tag{3.5} \]

At the optimal choices of $X$ one can get

\[ X - A^* X A + (S + A^* X B)(R + B^* X B)^{-1}(S^* + B^* X A) - Q = 0 \tag{3.6} \]

and for one of the optimal ones, we have that

\[ A - B (R + B^* X B)^{-1} (S^* + B^* X A) \tag{3.7} \]

has all of its eigenvalues in $\mathbb{D}$. If we let $(R + B^* X B)^{-1} = p_0^{-2}$ and $S^* + B^* X A = \begin{bmatrix} p_0 p_n^* & \cdots & p_0 p_1^* \end{bmatrix}$, then (3.7) becomes the companion matrix of $S(z) = z^n I + p_0^{-1} p_1^* z^{n-1} + \cdots + p_0^{-1} p_n^*$ and thus $S$ has all its eigenvalues in $\mathbb{D}$. A detailed description of this method can be found in Section 15 of [5]. We now give the following example to illustrate how DARE can be used to factorize $D_1$ and $D_2$.

Example. Let $P(z_1, z_2) = 5 + 2 z_1 + 3 z_2 + z_1 z_2 + z_1^2 + z_2^2$. Since

\[ f(z_1, z_2) = P(z_1, z_2) P(1/z_1, 1/z_2) = 41 + 15 (z_1 + 1/z_1) + 20 (z_2 + 1/z_2) + 5 (z_1^2 + 1/z_1^2) + 5 (z_2^2 + 1/z_2^2) + 5 (z_1 z_2 + 1/(z_1 z_2)) + 8 (z_1/z_2 + z_2/z_1) + (z_1^2/z_2^2 + z_2^2/z_1^2) + 3 (z_1^2/z_2 + z_2/z_1^2) + 2 (z_1/z_2^2 + z_2^2/z_1), \]

the symbol of $T$ is positive. Letting $n_1 = n_2 = 3$, we assemble a bi-infinite block Toeplitz matrix $D_1$ whose top left $10 \times 10$ block is as shown below.

\[ \begin{bmatrix}
30 & 17 & 5 & 12 & 5 & 0 & 5 & 0 & 0 & 0 \\
17 & 39 & 17 & 7 & 15 & 5 & 3 & 5 & 0 & 0 \\
5 & 17 & 30 & 2 & 7 & 12 & 1 & 3 & 5 & 0 \\
12 & 7 & 2 & 30 & 17 & 5 & 12 & 5 & 0 & 5 \\
5 & 15 & 7 & 17 & 39 & 17 & 7 & 15 & 5 & 3 \\
0 & 5 & 12 & 5 & 17 & 30 & 2 & 7 & 12 & 1 \\
5 & 3 & 1 & 12 & 7 & 2 & 30 & 17 & 5 & 12 \\
0 & 5 & 3 & 5 & 15 & 7 & 17 & 39 & 17 & 7 \\
0 & 0 & 5 & 0 & 5 & 12 & 5 & 17 & 30 & 2 \\
0 & 0 & 0 & 5 & 3 & 1 & 12 & 7 & 2 & 30
\end{bmatrix}. \]
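As a quick sanity check on the example, the Fourier coefficients of $f(z_1, z_2) = P(z_1, z_2) P(1/z_1, 1/z_2)$ can be recomputed as the autocorrelation of the coefficient array of $P$. The sketch below is illustrative only; the dictionary `p` and the helper `symbol_coefficients` are names introduced here, and real coefficients are assumed.

```python
from collections import defaultdict

# Coefficients p[(k, l)] of P(z1, z2) = sum p_kl z1^k z2^l from the example.
p = {(0, 0): 5, (1, 0): 2, (0, 1): 3, (1, 1): 1, (2, 0): 1, (0, 2): 1}

def symbol_coefficients(p):
    """Fourier coefficients of f(z) = P(z) P(1/z) for real coefficients p."""
    f = defaultdict(int)
    for (a, b), pa in p.items():
        for (c, d), pc in p.items():
            # The product z^(a,b) * z^-(c,d) contributes to index (a-c, b-d).
            f[(a - c, b - d)] += pa * pc
    return dict(f)

f = symbol_coefficients(p)
print(f[(0, 0)], f[(1, 0)], f[(0, 1)], f[(1, -1)])  # 41 15 20 8
```

The remaining coefficients quoted in the example, such as $f_{(2,-1)} = 3$ and $f_{(1,-2)} = 2$, can be read off from the same dictionary.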

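We now write the matrix-valued one variable symbol associated with $D_1$:

\[ f(z) = f_{-2} \frac{1}{z^2} + f_{-1} \frac{1}{z} + f_0 + f_1 z + f_2 z^2, \]

where

\[ f_0 = \begin{bmatrix} 30 & 17 & 5 \\ 17 & 39 & 17 \\ 5 & 17 & 30 \end{bmatrix}, \qquad f_{-1} = \begin{bmatrix} 12 & 7 & 2 \\ 5 & 15 & 7 \\ 0 & 5 & 12 \end{bmatrix}, \qquad f_{-2} = \begin{bmatrix} 5 & 3 & 1 \\ 0 & 5 & 3 \\ 0 & 0 & 5 \end{bmatrix}, \]

with $f_{-1}^* = f_1$ and $f_{-2}^* = f_2$.

Suppose $f(z) = p(z)^* p(z)$ where $p(z) = p_0 + p_1 z + p_2 z^2$. We write $f_0 = p_0^* p_0 + p_1^* p_1 + p_2^* p_2$, $f_1 = p_0^* p_1 + p_1^* p_2$, $f_{-1} = p_1^* p_0 + p_2^* p_1$, $f_2 = p_0^* p_2$ and $f_{-2} = p_2^* p_0$. Using DARE in MATLAB, we can factorize $f(z)$ in the following way. Let

\[ R = f_0, \qquad S = \begin{bmatrix} f_{-2} \\ f_{-1} \end{bmatrix}, \qquad A = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \]

$E = I_6$ and $Q = O_6$. Then, using [X, L, G] = dare(A, B, Q, R, S, E) in MATLAB, we get

\[ p_0 = \begin{bmatrix} 5.0000 & 0 & 0 \\ 3.0000 & 4.8780 & 0 \\ 1.0000 & 2.4258 & 4.1835 \end{bmatrix}, \qquad p_1 = \begin{bmatrix} 0.6545 & -0.5445 & 0.1176 \\ 0.4961 & 0.6684 & -0.5944 \\ 0.2390 & 0.7171 & 1.1952 \end{bmatrix} \]

and

\[ p_2 = \begin{bmatrix} 1.3090 & -0.4548 & -0.3899 \\ 0.9923 & 1.9690 & -0.1384 \\ 0.4781 & 1.3673 & 2.3647 \end{bmatrix}. \]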


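Thus we have $p(z) = p_0 + p_1 z + p_2 z^2$. Next we assemble a bi-infinite block Toeplitz matrix $D_2$ whose top left $10 \times 10$ block is as shown below.

\[ \begin{bmatrix}
35 & 13 & 5 & 18 & 5 & 0 & 5 & 0 & 0 & 0 \\
13 & 39 & 13 & 7 & 20 & 5 & 2 & 5 & 0 & 0 \\
5 & 13 & 35 & 3 & 7 & 18 & 1 & 2 & 5 & 0 \\
18 & 7 & 3 & 35 & 13 & 5 & 18 & 5 & 0 & 5 \\
5 & 20 & 7 & 13 & 39 & 13 & 7 & 20 & 5 & 2 \\
0 & 5 & 18 & 5 & 13 & 35 & 3 & 7 & 18 & 1 \\
5 & 2 & 1 & 18 & 7 & 3 & 35 & 13 & 5 & 18 \\
0 & 5 & 2 & 5 & 20 & 7 & 13 & 39 & 13 & 7 \\
0 & 0 & 5 & 0 & 5 & 18 & 5 & 13 & 35 & 3 \\
0 & 0 & 0 & 5 & 2 & 1 & 18 & 7 & 3 & 35
\end{bmatrix}. \]

Then the matrix-valued one variable symbol associated with $D_2$ is the following:

\[ f(z) = f_{-2} \frac{1}{z^2} + f_{-1} \frac{1}{z} + f_0 + f_1 z + f_2 z^2, \]

where

\[ f_0 = \begin{bmatrix} 35 & 13 & 5 \\ 13 & 39 & 13 \\ 5 & 13 & 35 \end{bmatrix}, \qquad f_{-1} = \begin{bmatrix} 18 & 7 & 3 \\ 5 & 20 & 7 \\ 0 & 5 & 18 \end{bmatrix}, \qquad f_{-2} = \begin{bmatrix} 5 & 2 & 1 \\ 0 & 5 & 2 \\ 0 & 0 & 5 \end{bmatrix}, \]

with $f_{-1}^* = f_1$ and $f_{-2}^* = f_2$. We now let

\[ R = f_0, \qquad S = \begin{bmatrix} f_{-2} \\ f_{-1} \end{bmatrix}, \qquad A = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \]

$E = I_6$ and $Q = O_6$. Then, using [X, L, G] = dare(A, B, Q, R, S, E) in MATLAB, we get

\[ p_0 = \begin{bmatrix} 5.001 & 2 & 0.999 \\ 0 & 4.855 & 1.625 \\ 0 & 0 & 4.554 \end{bmatrix}, \qquad p_1 = \begin{bmatrix} 2.99 & 1 & -0.01 \\ -0.01 & 3.05 & 1.04 \\ 0.01 & -0.05 & 2.88 \end{bmatrix} \]

and

\[ p_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1.02 & 0 \\ 0 & 0.07 & 1.09 \end{bmatrix}. \]

Thus we have $p(z) = p_0 + p_1 z + p_2 z^2$.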


We now present numerical results for the implementation of Theorem 1.1.

  $n_1$   $n_2$   $\| T^{-1} - (AA^* - B^*B - C_1^* D_1^{-1} C_1 - C_2^* D_2^{-1} C_2) \|$
  4       4       1.6308e-013
  8       8       3.7907e-013
  16      16      1.0216e-012
  32      32      3.7828e-012

In the next section we provide an algorithm to approximate $T^{-1}$ in case the symbol is not of the form $\frac{1}{|p(z)|^2}$, and give numerical results.

4. Inversion algorithm and numerical results

We now consider the case when the symbol of $T$ is not necessarily of the form $\frac{1}{|p|^2}$. It may still be worthwhile to use the results in the previous section for approximating $T^{-1}$. Note that the expression $AA^* - B^*B$ is easily computable when the polynomial $p$ is known, even when $p$ is a polynomial of more than two variables. Therefore, we may try to approximate the symbol of a multilevel Toeplitz matrix by a symbol of the form $\frac{1}{|p|^2}$. In this section we explore this idea.

In order to use the above idea, one needs to have a way to go from a positive definite multilevel Toeplitz matrix $T = (t_{k-l})_{k,l \in \Lambda}$ to a stable polynomial $p$ so that $t(z) = \sum_{k \in \mathbb{Z}^d} t_k z^k = \frac{1}{|p(z)|^2}$. This is a nontrivial step, and in fact in the multivariable case such a polynomial may not exist; see Theorem 1.1.3 in [1] for a necessary and sufficient condition for such a polynomial to exist in the case of two variables. In that case we will use the following idea introduced in [7]. For $t(z) > 0$, $z \in \mathbb{T}^d$, we write $-\log(t(z))$ as a Fourier series $-\log(t(z)) = \sum_{k \in \mathbb{Z}^d} f_k z^k$. Let now $H$ be the half-space $H = \{(k_1, \dots, k_d) : k_1 = \cdots = k_{i-1} = 0,\ k_i \ne 0 \Rightarrow k_i > 0\}$. Then $H \cup (-H) \cup \{0\} = \mathbb{Z}^d$ and $H \cap (-H) = \emptyset$. We now introduce $f_+(z) = \tfrac{1}{2} f_0 + \sum_{k \in H} f_k z^k$. Then $-\log(t(z)) = f_+(z) + \overline{f_+(z)}$, $z \in \mathbb{T}^d$. Next we compute $e^{f_+(z)} = \sum_{k \in H \cup \{0\}} g_k z^k$. Note that $t(z) = \frac{1}{|e^{f_+(z)}|^2}$. We now use a finite set of the Fourier coefficients $g_k$, $k \in \mathbb{N}_0^d$, of $e^{f_+(z)}$ as the Fourier coefficients of the polynomial $p$. With this choice for $p$, the matrices $A$ and $B$ are built as in Theorem 1.1. We let $X_1 = AA^* - B^*B$. It should be noted that while the symbols $t(z)$ are not of the form $\frac{1}{|p(z)|^2}$ with $p$ stable, with the choice below of Fourier coefficients supported in $\{0, \dots, 4\}^d$, the approximations are quite good.

Let us mention that in [6] an approximation algorithm is proposed, based on the observation that the inverses of two-level Toeplitz matrices for various typical symbols possess low tensor rank approximations with Kronecker factors of low displacement rank, and the authors report initial "encouraging" results. The algorithm in [6] is iterative, namely based on the Hotelling algorithm [9]:

\[ X_{i+1} = 2 X_i - X_i T X_i, \qquad i = 1, 2, \dots, \tag{4.1} \]

where $X_1$ is some initial approximation to $T^{-1}$. Since $I - T X_i = (I - T X_{i-1})^2$, the iterations (4.1) converge quadratically, provided that $\| I - T X_1 \| < 1$.

Using the approximation $X_1$ and performing the Hotelling algorithm we obtain the following results. The number $k_*$ indicates the number of iterations of the Hotelling algorithm performed, and $X_*$ indicates the corresponding iterate.

Example. $t(z_1, z_2) = 2.1 - \tfrac12 \left( z_1^2 + \tfrac{1}{z_1^2} \right) - \tfrac12 \left( z_2^2 + \tfrac{1}{z_2^2} \right)$. Note that $t$ is nonsingular on $\mathbb{T}^2$. In building $A$ and $B$ we only use the Fourier coefficients of $e^{f_+}$ with index $k \in \tilde{K} = \{0, \dots, 4\} \times \{0, \dots, 4\}$. The results are as follows.

Table 1

  $n_1$   $n_2$   size($T$)       $\| T^{-1} - X_1 \|$    $k_*$   $\| T^{-1} - X_* \|$
  16      16      288 × 288       0.002311245385348       7       3.3238e-015
  32      32      1088 × 1088     0.002329157836868       7       3.5367e-013
  48      48      2400 × 2400     0.002311239597903       8       7.8772e-015
  64      64      4224 × 4224     0.002311239524231       8       9.1165e-015
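The quadratic convergence of the Hotelling iteration (4.1) can be observed on a toy problem. The following is a minimal pure-Python sketch, assuming a hypothetical $2 \times 2$ positive definite matrix `T` and a crude diagonal initial guess `X1`; neither is taken from the experiments reported here, where $X_1 = AA^* - B^*B$ is used instead.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def hotelling(T, X, steps):
    """Apply X <- 2X - X T X `steps` times and return the final iterate."""
    for _ in range(steps):
        XTX = matmul(matmul(X, T), X)
        X = [[2 * X[i][j] - XTX[i][j] for j in range(len(X))]
             for i in range(len(X))]
    return X

def residual(T, X):
    """Largest absolute entry of I - T X."""
    TX = matmul(T, X)
    n = len(T)
    return max(abs((i == j) - TX[i][j]) for i in range(n) for j in range(n))

T = [[2.0, -0.5], [-0.5, 2.0]]   # hypothetical small positive definite matrix
X1 = [[0.5, 0.0], [0.0, 0.5]]    # initial guess: inverse of the diagonal of T
for k in range(6):
    print(k, residual(T, hotelling(T, X1, k)))
```

Since $I - T X_{i+1} = (I - T X_i)^2$, the residual is squared at every step, so in this example it drops from 0.25 to 0.0625 after one step and reaches rounding level after about five steps.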

Example. With 𝑡(𝑧1 , 𝑧2 ) = 12 + 1 ) + 19 ( 𝑧𝑧12 𝑧22 2

𝑧2

𝑧2

2

1

11 6 (𝑧1

+

1 𝑧1 )

+ 𝑧𝑧21 ) + 14 ( 𝑧12 + 𝑧22 ) + 16 ( 𝑧𝑧12 + 𝑧𝑧22 ) + 2

1

11 1 5 2 6 (𝑧2 + 𝑧2 ) + 2 (𝑧1 2 𝑧22 1 𝑧1 6 ( 𝑧2 + 𝑧1 ). Note that

+

+

1 ) 𝑧12

+ 52 (𝑧22 +

𝑡 is nonsingular

on $\mathbb{T}^2$. In building $A$ and $B$ we only use the Fourier coefficients of $e^{f_+}$ with index $k \in \tilde{K} = \{0, \dots, 4\} \times \{0, \dots, 4\}$. We obtain the following results.

Table 2

  $n_1$   $n_2$   size($T$)       $\| T^{-1} - X_1 \|$    $k_*$   $\| T^{-1} - X_* \|$
  16      16      288 × 288       0.047558938791824       5       1.9513e-014
  32      32      1088 × 1088     0.094929745730251       5       1.4426e-013
  48      48      2400 × 2400     0.084552586200363       5       2.2439e-013
  64      64      4224 × 4224     0.086403129147974       5       2.6601e-013
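The construction of $p$ from the Fourier coefficients of $e^{f_+}$ used in these examples can be illustrated in one variable, where the half-space $H$ is simply the set of positive integers. The sketch below is an assumption-laden toy version: it uses a plain discrete Fourier sum on 64 sample points rather than an FFT, and the function name `spectral_factor_1d` and the test symbol are hypothetical.

```python
import cmath
import math

def spectral_factor_1d(t, degree, samples=64):
    """Coefficients g_0..g_degree of exp(f_+), where -log t = f_+ + conj(f_+)."""
    z = [cmath.exp(2j * math.pi * m / samples) for m in range(samples)]
    neg_log_t = [-math.log(t(zm)) for zm in z]

    def coeff(values, k):
        # k-th Fourier coefficient via the rectangle rule on the unit circle.
        return sum(v * zm ** (-k) for v, zm in zip(values, z)) / samples

    half = samples // 2
    f = {k: coeff(neg_log_t, k) for k in range(half)}
    # f_+ keeps half of f_0 plus the coefficients with positive index.
    f_plus = [f[0] / 2 + sum(f[k] * zm ** k for k in range(1, half)) for zm in z]
    exp_f_plus = [cmath.exp(v) for v in f_plus]
    return [coeff(exp_f_plus, k) for k in range(degree + 1)]

# Hypothetical test symbol t = 1/|p|^2 with p(z) = 1 + z/2, which has no zeros
# in the closed unit disk, so the procedure should recover p itself.
t = lambda z: 1.0 / abs(1 + 0.5 * z) ** 2
g = spectral_factor_1d(t, degree=2)
print([abs(c) for c in g])
```

For this test symbol the computed coefficients agree with $(1, 0.5, 0)$ up to a small truncation and aliasing error. In several variables the same steps apply with multi-indices and the half-space $H$ in place of the positive integers.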

Below is an experiment in three variables (a case not covered in [6]).

Example. $t(z_1, z_2, z_3) = 3.5 - \tfrac12 \left( z_1 + \tfrac{1}{z_1} \right) - \tfrac12 \left( z_2 + \tfrac{1}{z_2} \right) - \tfrac12 \left( z_3 + \tfrac{1}{z_3} \right)$. In building $A$ and $B$ we only use the Fourier coefficients of $e^{f_+}$ with index $k \in \tilde{K} = \{0, \dots, 4\} \times \{0, \dots, 4\}$. The results are as follows.

Table 3

  $n_1$   $n_2$   $n_3$   size($T$)       $\| T^{-1} - X_1 \|$   $k_*$   $\| T^{-1} - X_* \|$
  6       6       6       343 × 343       0.528074               6       1.5103e-015
  8       8       8       729 × 729       0.664157               6       1.2905e-015
  10      10      10      1331 × 1331     0.754442               6       1.9590e-015
  12      12      12      2197 × 2197     0.815762               6       2.0447e-015
  16      16      16      4913 × 4913     0.8896                 6       2.7554e-015

References

[1] Jeffrey S. Geronimo and Hugo J. Woerdeman. Two variable orthogonal polynomials on the bicircle and structured matrices. SIAM J. Matrix Anal. Appl., 29(3):796–825 (electronic), 2007.
[2] I.C. Gohberg and G. Heinig. Inversion of finite Toeplitz matrices consisting of elements of a noncommutative algebra. Rev. Roumaine Math. Pures Appl. (in Russian), 19:623–663, 1974.
[3] I.C. Gohberg and A.A. Semencul. The inversion of finite Toeplitz matrices and their continual analogues. Mat. Issled., 7(2(24)):201–223, 290, 1972.
[4] Georg Heinig and Karla Rost. Algebraic methods for Toeplitz-like matrices and operators. Akademie-Verlag, Berlin, 1984.
[5] Peter Lancaster and Leiba Rodman. Algebraic Riccati equations. Oxford Science Publications. The Clarendon Press, Oxford University Press, New York, 1995.
[6] Vadim Olshevsky, Ivan Oseledets, and Eugene Tyrtyshnikov. Tensor properties of multilevel Toeplitz and related matrices. Linear Algebra Appl., 412(1):1–21, 2006.
[7] Cornelis V.M. van der Mee, Sebastiano Seatzu, and Giuseppe Rodriguez. Spectral factorization of bi-infinite multi-index block Toeplitz matrices. Linear Algebra Appl., 343/344:355–380, 2002. Special issue on structured and infinite systems of linear equations.
[8] Hugo J. Woerdeman. Estimates of inverses of multivariable Toeplitz matrices. Oper. Matrices, 2(4):507–515, 2008.
[9] Harold Hotelling. Some new methods in matrix calculation. Ann. Math. Statistics, 14:1–34, 1943.

Selcuk Koyuncu and Hugo J. Woerdeman
Department of Mathematics
Drexel University
Philadelphia, PA 19104, USA
e-mail: [email protected] [email protected]

Operator Theory: Advances and Applications, Vol. 218, 403–424
© 2012 Springer Basel AG

Parametrizing Structure Preserving Transformations of Matrix Polynomials

Peter Lancaster and Ion Zaballa

Dedicated to the memory of Israel Gohberg, good friend and scholar

Abstract. The spectral properties of $n \times n$ matrix polynomials are studied in terms of their (isospectral) linearizations. The main results in this paper concern the parametrization of strict equivalence and congruence transformations of the linearizations. The "centralizer" of the appropriate Jordan canonical form plays a major role in these parametrizations. The transformations involved are strict equivalence or congruence according as the polynomials in question have no symmetry, or are Hermitian, respectively. Jordan structures over either the complex numbers or the real numbers are used, as appropriate.

Mathematics Subject Classification (2000). 15A21, 15A54, 47B15.

Keywords. Matrix polynomials, structure preserving, transformations.

1. Introduction

The objects of study in this paper are $n \times n$ matrix polynomials of the form $L(\lambda) = \sum_{j=0}^{\ell} A_j \lambda^j$ where $A_j \in \mathbb{C}^{n \times n}$ (or $A_j \in \mathbb{R}^{n \times n}$) for each $j$ and $A_\ell$ is nonsingular. Two matrix polynomials with nonsingular leading coefficients will be said to be isospectral if they have the same elementary divisors or, equivalently, the same underlying Jordan canonical form. (The Jordan form will be over the complex or real fields as the context requires.) It is well known (see [7], [8], [11]) that such a polynomial has an isospectral linearization $\lambda A - B$ where

\[ A = \begin{bmatrix} A_1 & A_2 & \cdots & A_\ell \\ A_2 & \cdots & A_\ell & 0 \\ \vdots & & & \vdots \\ A_\ell & 0 & \cdots & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} -A_0 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & A_\ell \\ \vdots & \vdots & & \vdots \\ 0 & A_\ell & \cdots & 0 \end{bmatrix}. \tag{1} \]

This work was supported by grants from the EPSRC (United Kingdom), NSERC (Canada), and DGICYT, GV (Spain).


Note also that, when $L(\lambda)$ is hermitian, so is this linearization and, since $A_\ell$ is invertible, $A$ is also invertible.

Given isospectral matrix polynomials $L(\lambda)$ and $\hat{L}(\lambda)$, the first objective is to parametrize all strict equivalence transformations connecting their linearizations. In other words, we are to parametrize all pairs of nonsingular complex matrices $U$ and $V$ for which (with the above definitions)

\[ U(\lambda A - B) = (\lambda \hat{A} - \hat{B}) V; \tag{2} \]

they determine a strict equivalence transformation. Pairs of matrices $(U, V)$ satisfying this property will be called block-symmetric structure preserving transformations (SPT), since they preserve the block-symmetric structure of $A$ and $B$ (see [2, 13]). It was shown in [13, Thms. 7, 8] that the structure preserving transformations of two given isospectral matrix polynomials are closely related to their standard triples as defined in [7, 8]. As a first step, it will be shown in this paper that a parametrization of all possible block-symmetric SPTs for two given $n \times n$ isospectral matrix polynomials can be obtained in terms of the centralizer of their common Jordan form, namely,

\[ Z(J) := \{ \Xi \in \mathbb{C}^{\ell n \times \ell n} : \Xi J = J \Xi \}. \tag{3} \]

We first consider general (non-symmetric) polynomials (see Theorem 2.2), then those with real coefficients (Theorem 3.1), and in Section 4 (Theorem 4.3) those with hermitian coefficients. Finally, in Section 5, we consider those with real symmetric coefficients. In the case of hermitian matrix polynomials the strict equivalence transformations defined by (2) are replaced by congruence transformations:

\[ U^* (\lambda A - B) U = \lambda \hat{A} - \hat{B}, \tag{4} \]

with $U$ nonsingular. These transformations, preserving the symmetries and block structure of $A$ and $B$, will be called structure preserving congruences (SPC, for short). As in the nonsymmetric case, SPCs and selfadjoint standard triples for a given hermitian matrix polynomial will be shown to be closely related. The definition of selfadjoint standard triples given in [11, p. 244] will be used, and a one-to-one correspondence between SPC matrices and selfadjoint standard triples will be exhibited. To complete this work, it has been found necessary to carefully review canonical structures associated with matrix polynomials, and this has been done in the accompanying paper [15].

A characterization of the set of all SPCs will be obtained in terms of the (suitably modified) centralizer. The invariants known as the sign characteristics associated with real eigenvalues (and subsumed in a primitive matrix $P$) are to be preserved as well as the complete Jordan structure, and this motivates the notion of strictly isospectral hermitian matrix polynomials. It will be seen that the role of matrices in the centralizer of $J$ must be restricted to admit a $P$-unitary property.

A matrix polynomial $L(\lambda)$ is said to be diagonalizable if there is an isospectral diagonal matrix polynomial of the same size and degree. Algorithms have been proposed for the reduction of diagonalizable quadratic polynomials ($\ell = 2$, which we call systems) (see also [2], [5]) and they are the subject of the recent paper [14]. In view of their importance, and for the purpose of illustration, we focus on this quadratic case in Section 2.1 and a detailed example is included. Here, in the terminology of [14], we are concerned with systems which are $DE\mathbb{C}$ (diagonalizable by strict equivalence over $\mathbb{C}$ applied to a linearization).

Sections 3 and 3.1 are analogues of 2 and 2.1, but are devoted to the special case of real matrix polynomials (without symmetries). In Section 3.1 the systems are said to be $DE\mathbb{R}$ (diagonalizable by strict equivalence over $\mathbb{R}$ applied to a linearization). Section 4 is devoted to the case of hermitian matrix polynomials and includes systems which are $DC\mathbb{R}$ (diagonalizable to real form by complex congruence).

Another natural and important topic concerns the real symmetric matrix polynomials (which are, of course, both real and hermitian). They are considered in Section 5, where the techniques of Sections 3 and 4 are utilised. Here, the systems are also $DC\mathbb{R}$ but are now diagonalizable to real form by real congruence. Analysis of this case requires some extension of existing theory, and is developed in the accompanying paper [15].

2. General complex matrix polynomials

Let $L(\lambda)$ and $\hat{L}(\lambda)$ be two $\ell$-degree $n \times n$ matrix polynomials with nonsingular leading coefficients, as in the introduction. Let $\lambda A - B$ and $\lambda \hat{A} - \hat{B}$ be their linearizations as defined in (1). If $L(\lambda)$ and $\hat{L}(\lambda)$ are isospectral then $\lambda A - B$ and $\lambda \hat{A} - \hat{B}$ are, as pencils, strictly equivalent; i.e.,

\[ U(\lambda A - B) = (\lambda \hat{A} - \hat{B}) V \tag{5} \]

for some nonsingular $U$ and $V$. We aim to characterize and parametrize the nonsingular block-symmetric SPTs for $L(\lambda)$ and $\hat{L}(\lambda)$; i.e., all pairs of matrices $(U, V)$ for which (5) holds. As shown in [13, Th. 7] SPTs and standard triples are closely related. (The notions of "standard pairs and triples" for a matrix polynomial are carefully developed in [15].) We recall here that, if $C_R$ is the right companion matrix of $L(\lambda)$, i.e.,

\[ C_R = \begin{bmatrix} 0 & I_n & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & I_n \\ -A_\ell^{-1} A_0 & -A_\ell^{-1} A_1 & \cdots & -A_\ell^{-1} A_{\ell-1} \end{bmatrix} \tag{6} \]

and

\[ X_0 = \begin{bmatrix} I_n & 0 & \cdots & 0 \end{bmatrix}, \qquad Y_0 = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ A_\ell^{-1} \end{bmatrix}, \tag{7} \]

then $(X_0, C_R, Y_0)$ is a standard triple of $L(\lambda)$ and any other standard triple of this matrix polynomial is similar to $(X_0, C_R, Y_0)$. It is also important to realize that

\[ A C_R = C_L A = B, \tag{8} \]

where $C_L$ is the left companion matrix of $L(\lambda)$:

\[ C_L = \begin{bmatrix} 0 & \cdots & 0 & -A_0 A_\ell^{-1} \\ I_n & \cdots & 0 & -A_1 A_\ell^{-1} \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & I_n & -A_{\ell-1} A_\ell^{-1} \end{bmatrix}. \]

Since $A$ is invertible, $(X_0 A^{-1}, C_L, A Y_0)$ is also a standard triple of $L(\lambda)$.

The block-symmetric SPTs of two matrix polynomials can be characterized by using standard triples as follows:

Theorem 2.1. Let $L(\lambda)$ and $\hat{L}(\lambda)$ be isospectral matrix polynomials of the same size. Then $(U, V)$ is a block-symmetric SPT for $L(\lambda)$ and $\hat{L}(\lambda)$ if and only if one (and then both) of the following equivalent conditions holds:

(a)
\[ V = \begin{bmatrix} X \\ X C_R \\ \vdots \\ X C_R^{\ell-1} \end{bmatrix} \quad \text{and} \quad U^{-1} = A \begin{bmatrix} Y & C_R Y & \cdots & C_R^{\ell-1} Y \end{bmatrix} \tag{9} \]
for a standard triple $(X, C_R, Y)$ of $\hat{L}(\lambda)$.

(b)
\[ V^{-1} = \begin{bmatrix} \hat{X} \\ \hat{X} \hat{C}_L \\ \vdots \\ \hat{X} \hat{C}_L^{\ell-1} \end{bmatrix} \hat{A} \quad \text{and} \quad U = \begin{bmatrix} \hat{Y} & \hat{C}_L \hat{Y} & \cdots & \hat{C}_L^{\ell-1} \hat{Y} \end{bmatrix} \tag{10} \]
for a standard triple $(\hat{X}, \hat{C}_L, \hat{Y})$ of $L(\lambda)$.

The proof follows from the proofs of Theorems 7 and 8 in [13].

Notice that, given isospectral matrix polynomials $L(\lambda)$ and $\hat{L}(\lambda)$, the standard triples of $\hat{L}(\lambda)$ of the form $(X, C_R, Y)$ are completely determined by $X$; and the standard triples of $L(\lambda)$ of the form $(\hat{X}, \hat{C}_L, \hat{Y})$ are completely determined by $\hat{X}$. It follows that Theorem 2.1 can be used to define a bijective correspondence between block-symmetric SPTs for $L(\lambda)$ and $\hat{L}(\lambda)$ and matrices $X$ ($\hat{X}$) for which


$(X, C_R, Y)$ ($(\hat{X}, \hat{C}_L, \hat{Y})$) is a standard triple of $\hat{L}(\lambda)$ ($L(\lambda)$, respectively). In this section we aim to provide a more concise parametrizing set. Notice, for example, that if no invertibility is required of $U$ and $V$, then the set of matrix pairs $(U, V)$ such that

\[ U(\lambda A - B) = (\lambda \hat{A} - \hat{B}) V \]

is a linear space. The goal is to obtain a parametrizing space for the block-symmetric SPTs of two isospectral matrix polynomials which reflects their linearity and whose dimension can be easily computed.

Let $J$ be the Jordan form (over $\mathbb{C}$) of a matrix polynomial $L(\lambda)$, as above, and recall the definition (3) of the centralizer of $J$. If $\lambda_1, \dots, \lambda_p$ are the distinct eigenvalues of $J$, it is known (see for example [1, p. 222]) that $Z(J)$ is a linear space of dimension

\[ N = \sum_{i=1}^{p} \sum_{j=1}^{s_i} (2j - 1)\, n_{ij}, \tag{11} \]

where, for eigenvalue $\lambda_i$ of $L(\lambda)$ (and of $\hat{L}(\lambda)$), $i = 1, \dots, p$, $s_i$ is the geometric multiplicity of $\lambda_i$, and $(n_{i1}, \dots, n_{is_i})$ is the Segre characteristic.

Let $\Gamma$ denote the set of all block-symmetric SPTs of $L(\lambda)$ and $\hat{L}(\lambda)$:

\[ \Gamma = \{ (U, V) \in \mathbb{C}^{\ell n \times \ell n} \times \mathbb{C}^{\ell n \times \ell n} : U(\lambda A - B) = (\lambda \hat{A} - \hat{B}) V \}. \]

As already noted, $\Gamma$ is a linear space. The main result in this section is the following theorem, whose proof is quite straightforward.

Theorem 2.2. Let $L(\lambda)$, $\hat{L}(\lambda)$ be $n \times n$ isospectral matrix polynomials with $\det A_\ell \ne 0$ and $\det \hat{A}_\ell \ne 0$, let $J$ be their common Jordan form, and define $\Gamma$ as above. Let $T$ and $\hat{T}$ be invertible matrices such that $T^{-1} C_R T = \hat{T}^{-1} \hat{C}_R \hat{T} = J$. Then, the mapping $\varphi : Z(J) \to \Gamma$ defined by

\[ \varphi(\Xi) = (\hat{A} \hat{T} \Xi T^{-1} A^{-1},\ \hat{T} \Xi T^{-1}) \]

is an isomorphism of linear spaces.

Proof. It is clear that, provided that $\varphi$ is well defined, it is a linear mapping. So the goal is to prove that $\varphi$ is well defined and bijective.

If $U = \hat{A} \hat{T} \Xi T^{-1} A^{-1}$ and $V = \hat{T} \Xi T^{-1}$ then $U = \hat{A} V A^{-1}$ and so $U A = \hat{A} V$. Also, bearing in mind (8),

\[ \begin{aligned}
U B &= \hat{A} \hat{T} \Xi T^{-1} A^{-1} B \\
&= \hat{A} \hat{T} \Xi T^{-1} C_R && (A C_R = B) \\
&= \hat{A} \hat{T} \Xi J T^{-1} && (T^{-1} C_R T = J) \\
&= \hat{A} \hat{T} J \Xi T^{-1} && (\Xi J = J \Xi) \\
&= \hat{A} \hat{C}_R \hat{T} \Xi T^{-1} && (\hat{T}^{-1} \hat{C}_R \hat{T} = J) \\
&= \hat{B} V. && (\hat{A} \hat{C}_R = \hat{B})
\end{aligned} \]


Therefore,
\[ (\hat A \hat T \Xi T^{-1} A^{-1})(\lambda A - B) = (\lambda \hat A - \hat B)(\hat T \Xi T^{-1}), \]
and $(\hat A \hat T \Xi T^{-1} A^{-1}, \hat T \Xi T^{-1}) \in \Gamma$, as required.

The injectivity of $\varphi$ is immediate because $A$, $T$, $\hat A$ and $\hat T$ are invertible matrices. Let us prove that $\varphi$ is surjective.

Let $(U, V) \in \Gamma$, i.e., $U(\lambda A - B) = (\lambda \hat A - \hat B)V$. Since $A C_R = B$ and $\hat A \hat C_R = \hat B$, $UA(\lambda I_{\ell n} - C_R) = \hat A(\lambda I_{\ell n} - \hat C_R)V$. Thus
\[ \hat A^{-1} U A (\lambda I_{\ell n} - C_R) = (\lambda I_{\ell n} - \hat C_R) V \]
and the following relations are obtained:
\[ \hat A^{-1} U A = V, \qquad V C_R = \hat C_R V, \qquad\text{and}\qquad V T J T^{-1} = \hat T J \hat T^{-1} V. \]
The last statement is a consequence of $T^{-1} C_R T = \hat T^{-1} \hat C_R \hat T = J$. Thus $\hat T^{-1} V T J = J \hat T^{-1} V T$.

If we put $\Xi = \hat T^{-1} V T$, then $\Xi \in Z(J)$, $V = \hat T \Xi T^{-1}$ and $U = \hat A V A^{-1} = \hat A \hat T \Xi T^{-1} A^{-1}$ as desired. □

According to this result, $\Gamma$ is a linear space of dimension $N$ (see (11)) and nonsingular matrices $U$ and $V$ for which $U(\lambda A - B) = (\lambda \hat A - \hat B)V$ are parameterized through nonsingular matrices $\Xi$ in the centralizer of $J$; these form a Zariski open set of the linear space $Z(J)$ and a subgroup of the general linear group $\mathrm{Gl}_{\ell n}(\mathbb{C})$.

Notice that nonsingular matrices $T$ and $\hat T$ for which $C_R T = TJ$ and $\hat C_R \hat T = \hat T J$ (as used in this construction) necessarily have the partitioned form
\[ T = \begin{bmatrix} X \\ XJ \\ \vdots \\ XJ^{\ell-1} \end{bmatrix}, \qquad \hat T = \begin{bmatrix} \hat X \\ \hat X J \\ \vdots \\ \hat X J^{\ell-1} \end{bmatrix}, \]
where $X$ and $\hat X$ are full-rank $n \times \ell n$ matrices. Therefore $(X, J)$ and $(\hat X, J)$ are Jordan pairs of $L(\lambda)$ and $\hat L(\lambda)$, respectively.

Notice also that, in the important special case in which $L(\lambda)$ has all eigenvalues distinct, $N = \ell n$ and the matrices $\Xi$ parametrizing the block-symmetric SPTs $(U, V)$ are nonsingular diagonal matrices.

2.1. Diagonalizable quadratic systems

This section concerns an application of Theorem 2.2 to a class of matrix polynomials for which numerical algorithms have been proposed ([2], [3], [5], for example), namely, "diagonalizable" systems. By definition, they are polynomials $L(\lambda)$ of degree two for which there exists an isospectral diagonal quadratic system $\hat L(\lambda)$. Since all semisimple systems are included, the diagonalizable systems are often seen as being widely useful.

A complete description of admissible Jordan forms $J$ appears in [14], and we use that information here to parametrize all matrices $\hat T$ for which $\hat C_R \hat T = \hat T J$.


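This, in turn, determines a parametrization of the pairs $(U, V) \in \Gamma$. The theory is illustrated with a detailed example.

Let $\lambda_1, \dots, \lambda_t$ be the distinct eigenvalues of $L(\lambda)$. If $\hat L(\lambda) = \hat M \lambda^2 + \hat D \lambda + \hat K$ is a diagonal isospectral system then the element in position $(i, i)$ is $\hat m_i \lambda^2 + \hat d_i \lambda + \hat k_i$. For each $i = 1, \dots, n$ there are two possible cases: either
(i) $\hat m_i \lambda^2 + \hat d_i \lambda + \hat k_i = \hat m_i (\lambda - \lambda_{j_i})^2$, or
(ii) $\hat m_i \lambda^2 + \hat d_i \lambda + \hat k_i = \hat m_i (\lambda - \lambda_{j_i})(\lambda - \lambda_{k_i})$ with $\lambda_{j_i} \neq \lambda_{k_i}$.

In the first case, define $J_i = \begin{bmatrix} \lambda_{j_i} & 1 \\ 0 & \lambda_{j_i} \end{bmatrix}$; and in the second case define $J_i = \begin{bmatrix} \lambda_{j_i} & 0 \\ 0 & \lambda_{k_i} \end{bmatrix}$. Let $J = \bigoplus_{i=1}^{n} J_i$. This is a Jordan form of $\hat L(\lambda)$.

Next, when $J_i = \begin{bmatrix} \lambda_{j_i} & 1 \\ 0 & \lambda_{j_i} \end{bmatrix}$ put $Y_i = \begin{bmatrix} 1 & 0 \\ -\lambda_{j_i} & 1 \end{bmatrix}$ and notice that
\[ Y_i^{-1} J_i Y_i = \begin{bmatrix} 0 & 1 \\ -\lambda_{j_i}^2 & 2\lambda_{j_i} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -\hat k_i/\hat m_i & -\hat d_i/\hat m_i \end{bmatrix}. \]
When $J_i = \begin{bmatrix} \lambda_{j_i} & 0 \\ 0 & \lambda_{k_i} \end{bmatrix}$ put
\[ Y_i = \begin{bmatrix} 1 - \dfrac{\lambda_{j_i}}{\lambda_{j_i} - \lambda_{k_i}} & \dfrac{1}{\lambda_{j_i} - \lambda_{k_i}} \\ -\lambda_{j_i} & 1 \end{bmatrix}, \]
observe that
\[ Y_i^{-1} = \begin{bmatrix} 1 & -\dfrac{1}{\lambda_{j_i} - \lambda_{k_i}} \\ \lambda_{j_i} & 1 - \dfrac{\lambda_{j_i}}{\lambda_{j_i} - \lambda_{k_i}} \end{bmatrix} \]
and
\[ Y_i^{-1} J_i Y_i = \begin{bmatrix} 0 & 1 \\ -\lambda_{j_i}\lambda_{k_i} & \lambda_{j_i} + \lambda_{k_i} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -\hat k_i/\hat m_i & -\hat d_i/\hat m_i \end{bmatrix}. \]
Thus, if we define $Y = \bigoplus_{i=1}^{n} Y_i$, then
\[ Y^{-1} J Y = \bigoplus_{i=1}^{n} \begin{bmatrix} 0 & 1 \\ -\hat k_i/\hat m_i & -\hat d_i/\hat m_i \end{bmatrix} \]
and there is a permutation matrix $P$ (always the same) such that
\[ P^T Y^{-1} J Y P = \begin{bmatrix} 0 & I_n \\ -\hat M^{-1} \hat K & -\hat M^{-1} \hat D \end{bmatrix}. \]
Thus, if $\hat T = (YP)^{-1}$ then $\hat T$ is invertible and $\hat C_R \hat T = \hat T J$.

Let us apply this construction to a simple example.

Example 2.3. Consider the diagonalizable system
\[ L(\lambda) = \lambda^2 \begin{bmatrix} 0 & 1 \\ 1 & 3 \end{bmatrix} + \lambda \begin{bmatrix} -1 & -3 \\ -3 & -7 \end{bmatrix} + \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}. \]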


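By first examining the centralizer of the Jordan form, we will construct a complete parametrization of the pairs $(U, V)$ in $\Gamma$. The eigenvalues of $L(\lambda)$ are: $+1$ with algebraic multiplicity 3 and geometric multiplicity 2, and the simple eigenvalue 0. The fact that the eigenvalue $+1$ has Segre characteristic $(2, 1)$ ensures that $L(\lambda)$ is diagonalizable (the Segre characteristics $(1, 1, 1)$ and $(3)$ for this eigenvalue are not admissible, see [14]). A diagonal strictly isospectral system is
\[ \hat L(\lambda) = \lambda^2 \begin{bmatrix} 2 & 0 \\ 0 & -5 \end{bmatrix} + \lambda \begin{bmatrix} -4 & 0 \\ 0 & 5 \end{bmatrix} + \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & -5 \end{bmatrix} \begin{bmatrix} \lambda^2 - 2\lambda + 1 & 0 \\ 0 & \lambda^2 - \lambda \end{bmatrix}. \]
All computations are made with the help of MATLAB and its Symbolic Toolbox. Matrices $J_1$ and $J_2$ are
\[ J_1 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad J_2 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \]
so that
\[ J = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
is a Jordan form for $L(\lambda)$ and the Segre characteristic is $((2, 1), (1))$. Thus, according to (11), the dimension of $Z(J)$ is 6. Now, with
\[ Y_1 = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}, \qquad Y_2 = \begin{bmatrix} 0 & 1 \\ -1 & 1 \end{bmatrix}, \qquad Y = Y_1 \oplus Y_2, \]
and, defining the permutation matrix
\[ P = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \]
we have
\[ \hat T := (YP)^{-1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}. \]
Now Jordan chains of $L(\lambda)$ for the eigenvalues $\lambda_1 = 1$ and $\lambda_2 = 0$ are computed following [8, p. 25]. In particular, the Jordan chains of $L(\lambda)$ for $\lambda_1 = 1$ have the form:
\[ x_{01} = \begin{bmatrix} a \\ -a \end{bmatrix}, \qquad x_{11} = \begin{bmatrix} b \\ c \end{bmatrix}, \qquad x_{02} = \begin{bmatrix} d \\ e \end{bmatrix}, \]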


and the eigenvectors of $L(\lambda)$ for $\lambda_2 = 0$ have the form $x_{03} = \begin{bmatrix} 2f \\ -f \end{bmatrix}$. In particular, a matrix of Jordan chains of $L(\lambda)$ is (taking $a = b = d = f = 1$ and $c = e = 0$):
\[ X = \begin{bmatrix} 1 & 1 & 1 & 2 \\ -1 & 0 & 0 & -1 \end{bmatrix}. \]
Thus, a matrix $T$ such that $T^{-1} C_R T = J$ is
\[ T = \begin{bmatrix} X \\ XJ \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 2 \\ -1 & 0 & 0 & -1 \\ 1 & 2 & 1 & 0 \\ -1 & -1 & 0 & 0 \end{bmatrix}. \]
Finally, the matrices $\Xi \in Z(J)$ have the form (see [1, 12]):
\[ \Xi = \begin{bmatrix} a & b & c & 0 \\ 0 & a & 0 & 0 \\ 0 & d & e & 0 \\ 0 & 0 & 0 & f \end{bmatrix}. \]
Then MATLAB produces the following answers for $U = \hat A \hat T \Xi T^{-1} A^{-1}$:

    [ 6*a-2*b+6*c, -4*a+2*b-4*c, 4*a-2*b+6*c, 2*b-2*a-4*c ]
    [ -10*f, 5*f, 0, 0 ]
    [ -4*a+2*b-6*c, 2*a-2*b+4*c, -2*a+2*b-6*c, -2*b+4*c ]
    [ -5*d+15*e+10*f, 5*d-10*e-5*f, -5*d+15*e, 5*d-10*e ]

and for $V = \hat T \Xi T^{-1}$:

    [ -a+b-c, -2*a+2*b-2*c, a-b+2*c, a-2*b+3*c ]
    [ d-e-f, 2*d-2*e-f, -d+2*e+f, -2*d+3*e+f ]
    [ b-c, 2*b-2*c, -b+2*c, -a-2*b+3*c ]
    [ d-e, 2*d-2*e, -d+2*e, -2*d+3*e ]

When they are nonsingular (i.e., when $\Xi$ is nonsingular) these matrices $U$ and $V$ define a block-symmetric SPT for the given systems: this can be verified directly from equation (2). Furthermore, according to Theorem 2.2, these are all possible structure preserving transformations for $L(\lambda)$ and $\hat L(\lambda)$. □
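The construction of Example 2.3 can be replayed in exact rational arithmetic. The following is only a minimal sketch, not the authors' MATLAB session: it assumes the coefficient matrices of $L(\lambda)$ and $\hat L(\lambda)$ as read from the example, takes $A = \begin{bmatrix} D & M \\ M & 0 \end{bmatrix}$ and $B = A C_R$ for a quadratic system, and uses arbitrary sample values for the parameters $a, \dots, f$. The helper names (`matmul`, `inverse`, `block`, `linearization`) are hypothetical.

```python
from fractions import Fraction as F

def matmul(P, Q):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inverse(M):
    """Gauss-Jordan inverse over the rationals (raises StopIteration if singular)."""
    n = len(M)
    aug = [[F(x) for x in row] + [F(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                fac = aug[r][col]
                aug[r] = [x - fac * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def block(M11, M12, M21, M22):
    """Assemble a 2 x 2 block matrix."""
    return ([r1 + r2 for r1, r2 in zip(M11, M12)] +
            [r1 + r2 for r1, r2 in zip(M21, M22)])

def linearization(M, D, K):
    """Return (A, B) with A = [[D, M], [M, 0]] and B = A * C_R (assumed form)."""
    Z, I = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
    Minv = inverse(M)
    neg = lambda X: [[-x for x in row] for row in X]
    A = block(D, M, M, Z)
    C_R = block(Z, I, neg(matmul(Minv, K)), neg(matmul(Minv, D)))
    return A, matmul(A, C_R)

# Coefficients of L and of the diagonal system L-hat, as read from Example 2.3.
A, B = linearization([[0, 1], [1, 3]], [[-1, -3], [-3, -7]], [[1, 2], [2, 4]])
Ah, Bh = linearization([[2, 0], [0, -5]], [[-4, 0], [0, 5]], [[2, 0], [0, 0]])

T = [[1, 1, 1, 2], [-1, 0, 0, -1], [1, 2, 1, 0], [-1, -1, 0, 0]]
Th = [[1, 0, 0, 0], [0, 0, 1, -1], [1, 1, 0, 0], [0, 0, 1, 0]]

a, b, c, d, e, f = 2, 3, 5, 7, 11, 13          # arbitrary sample parameters
Xi = [[a, b, c, 0], [0, a, 0, 0], [0, d, e, 0], [0, 0, 0, f]]

V = matmul(matmul(Th, Xi), inverse(T))         # V = T-hat * Xi * T^{-1}
U = matmul(matmul(Ah, V), inverse(A))          # U = A-hat * V * A^{-1}

# U(lambda*A - B) = (lambda*A-hat - B-hat)V holds iff both coefficients agree.
pencil_ok = matmul(U, A) == matmul(Ah, V) and matmul(U, B) == matmul(Bh, V)
print(pencil_ok)
```

Comparing single rows of `U` and `V` with the symbolic rows listed in the example gives a quick check that the matrices were transcribed consistently.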

3. Real matrix polynomials

If $L(\lambda)$ and $\hat L(\lambda)$ are real isospectral matrix polynomials, it may be possible to design algorithms using only real arithmetic so that the matrices $U$ and $V$ for which (2) holds are real. With this in mind, we consider corresponding real Jordan forms (see [15]). The description of the centralizer of a matrix in real Jordan form may be less familiar than its complex counterpart. A simple computation shows, however, that if $K$ is a matrix in real Jordan form, the real matrices $X \in Z(K)$ (the centralizer for the real Jordan form) can be described as follows:


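Let $\lambda_1, \dots, \lambda_r$ be the real eigenvalues of $K$ and $\lambda_{r+1}, \dots, \lambda_{r+s}, \bar\lambda_{r+1}, \dots, \bar\lambda_{r+s}$ be the non-real eigenvalues (in conjugate pairs). Let $n_i = (n_{i1}, \dots, n_{it_i})$ be the Segre characteristic of $K$ associated with $\lambda_i$,
\[ n_{i1} \geq n_{i2} \geq \cdots \geq n_{it_i}, \qquad i = 1, \dots, r + s. \]
And recall that the Segre characteristics of $\lambda_i$ and $\bar\lambda_i$ coincide for each $i = r+1, \dots, r+s$. Then it can be verified that $X \in Z(K)$ if and only if
\[ X = \mathrm{Diag}(X_1, \dots, X_{r+s}) \]
with $X_i = [X^i_{jk}]_{1 \leq j,k \leq t_i}$ and the matrices $X^i_{jk}$ have triangular Toeplitz structure as follows:

∙ For the real eigenvalues, $i = 1, \dots, r$, they have the same form as in the complex case,
\[ X^i_{jj} = \begin{bmatrix} a^1_{jj} & a^2_{jj} & \cdots & a^{n_{ij}}_{jj} \\ 0 & a^1_{jj} & \cdots & a^{n_{ij}-1}_{jj} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a^1_{jj} \end{bmatrix} \in \mathbb{R}^{n_{ij} \times n_{ij}}, \]
\[ X^i_{jk} = \begin{bmatrix} 0 & \cdots & 0 & a^1_{jk} & a^2_{jk} & \cdots & a^{n_{ij}}_{jk} \\ 0 & \cdots & 0 & 0 & a^1_{jk} & \cdots & a^{n_{ij}-1}_{jk} \\ \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & 0 & 0 & \cdots & a^1_{jk} \end{bmatrix} \in \mathbb{R}^{n_{ij} \times n_{ik}}, \quad j > k, \]
\[ X^i_{jk} = \begin{bmatrix} a^1_{jk} & a^2_{jk} & \cdots & a^{n_{ik}}_{jk} \\ 0 & a^1_{jk} & \cdots & a^{n_{ik}-1}_{jk} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a^1_{jk} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \in \mathbb{R}^{n_{ij} \times n_{ik}}, \quad j < k. \]

∙ For the non-real conjugate pairs, $i = r+1, \dots, r+s$,
\[ X^i_{jj} = \begin{bmatrix} A^1_{jj} & A^2_{jj} & \cdots & A^{n_{ij}}_{jj} \\ 0 & A^1_{jj} & \cdots & A^{n_{ij}-1}_{jj} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A^1_{jj} \end{bmatrix}, \]
\[ X^i_{jk} = \begin{bmatrix} 0 & \cdots & 0 & A^1_{jk} & A^2_{jk} & \cdots & A^{n_{ij}}_{jk} \\ 0 & \cdots & 0 & 0 & A^1_{jk} & \cdots & A^{n_{ij}-1}_{jk} \\ \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & 0 & 0 & \cdots & A^1_{jk} \end{bmatrix}, \quad j > k, \]
\[ X^i_{jk} = \begin{bmatrix} A^1_{jk} & A^2_{jk} & \cdots & A^{n_{ik}}_{jk} \\ 0 & A^1_{jk} & \cdots & A^{n_{ik}-1}_{jk} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A^1_{jk} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}, \quad j < k, \]
where
\[ A^\ell_{jk} = \begin{bmatrix} a^\ell_{jk} & -b^\ell_{jk} \\ b^\ell_{jk} & a^\ell_{jk} \end{bmatrix} \in \mathbb{R}^{2 \times 2}. \]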

Now $Z(K)$ is a real linear space of dimension
\[ N_R = \sum_{i=1}^{r} \sum_{j=1}^{t_i} (2j-1)\, n_{ij} + 2 \sum_{i=r+1}^{r+s} \sum_{j=1}^{t_i} (2j-1)\, n_{ij}. \tag{12} \]
Actually, the dimension of the centralizer of a matrix does not depend on the field but on the degrees of its invariant polynomials (see, for example, [1, p. 222]) and these are the same whether computed over $\mathbb{R}$ or $\mathbb{C}$.

Now if $T$ and $\hat T$ are real nonsingular matrices satisfying
\[ C_R T = TK \qquad\text{and}\qquad \hat C_R \hat T = \hat T K \]
with $C_R$ and $\hat C_R$ the right companion matrices of $L(\lambda)$ and $\hat L(\lambda)$, respectively, and
\[ \Gamma_R = \{(U, V) \in \mathbb{R}^{\ell n \times \ell n} \times \mathbb{R}^{\ell n \times \ell n} : U(\lambda A - B) = (\lambda \hat A - \hat B)V\} \]
then the proof of the following theorem is the same as that of Theorem 2.2.

Theorem 3.1. Let $L(\lambda)$ and $\hat L(\lambda)$ be isospectral real matrix polynomials with $\det A_\ell \neq 0$ and $\det \hat A_\ell \neq 0$. Let $K$ be a common real Jordan form for $L(\lambda)$ and $\hat L(\lambda)$. Then, with $N_R$ of (12), the map $\varphi : Z(K) \to \Gamma_R$ defined by
\[ \varphi(\Xi) = (\hat A \hat T \Xi T^{-1} A^{-1},\ \hat T \Xi T^{-1}) \]
is an isomorphism of $N_R$-dimensional real vector spaces.

One may also ask what form Theorem 2.1 takes when confined to real matrix polynomials. However, using real standard triples as described in [15], the theorem also holds for real matrix polynomials and does not require a separate statement.

3.1. Diagonalizable real quadratic systems

As with the theory over $\mathbb{C}$ we now illustrate Theorem 3.1 in the case of diagonalizable quadratic systems. If $\hat L(\lambda)$ is a real diagonal system the matrix $\hat T$ of that theorem can be constructed as for a complex system, but an additional step is required. Assume that $\hat L(\lambda)$ is a real diagonal system with real and complex eigenvalues: $\lambda_1, \dots, \lambda_t$. It is shown in [14] that the non-real complex eigenvalues must be semisimple and appear in conjugate pairs.
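The dimension counts (11) and (12) are easy to evaluate mechanically. The sketch below is an illustration only; the function names are hypothetical, and each Segre characteristic is passed as a tuple of partial multiplicities, with every non-real conjugate pair listed once.

```python
def centralizer_dim(segre):
    """Formula (11): sum over eigenvalues of sum_j (2j - 1) * n_ij,
    with each Segre characteristic taken in non-increasing order."""
    total = 0
    for partition in segre:
        sizes = sorted(partition, reverse=True)
        total += sum((2 * j - 1) * n for j, n in enumerate(sizes, start=1))
    return total

def real_centralizer_dim(real_segre, nonreal_segre):
    """Formula (12): each non-real conjugate pair is listed once and counted twice."""
    return centralizer_dim(real_segre) + 2 * centralizer_dim(nonreal_segre)

# Segre characteristic ((2, 1), (1)) as in Example 2.3.
print(centralizer_dim([(2, 1), (1,)]))           # 6

# One real eigenvalue with Segre characteristic (2) and one non-real
# conjugate pair with Segre characteristic (1).
print(real_centralizer_dim([(2,)], [(1,)]))      # 4
```

When all $\ell n$ eigenvalues are distinct, every Segre characteristic is $(1)$ and the first function returns $\ell n$, in agreement with the remark following Theorem 2.2.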


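Let $\hat m_i \lambda^2 + \hat d_i \lambda + \hat k_i$ be the polynomial in position $(i, i)$ of $\hat L(\lambda)$. Then there are three possibilities:
(i) $\hat m_i \lambda^2 + \hat d_i \lambda + \hat k_i = \hat m_i (\lambda - \lambda_{j_i})^2$ with $\lambda_{j_i} \in \mathbb{R}$,
(ii) $\hat m_i \lambda^2 + \hat d_i \lambda + \hat k_i = \hat m_i (\lambda - \lambda_{j_i})(\lambda - \lambda_{k_i})$ with $\lambda_{j_i} \neq \lambda_{k_i}$ and real,
(iii) $\hat m_i \lambda^2 + \hat d_i \lambda + \hat k_i = \hat m_i (\lambda - \lambda_{j_i})(\lambda - \bar\lambda_{j_i})$ with $\lambda_{j_i} \notin \mathbb{R}$.

In the first two cases, define $K_i = J_i$ as in the complex case of Section 2.1. In the third case, let $K_i = \begin{bmatrix} a_i & b_i \\ -b_i & a_i \end{bmatrix}$, where $\lambda_{j_i} = a_i + i b_i$. Finally, define $K = \bigoplus_{i=1}^{n} K_i$.

If $K_i = \begin{bmatrix} \lambda_{j_i} & 1 \\ 0 & \lambda_{j_i} \end{bmatrix}$ or $K_i = \begin{bmatrix} \lambda_{j_i} & 0 \\ 0 & \lambda_{k_i} \end{bmatrix}$, define $Y_i$ as in the previous section (for instance, $Y_i = \begin{bmatrix} 1 & 0 \\ -\lambda_{j_i} & 1 \end{bmatrix}$ in the first case). If $K_i = \begin{bmatrix} a_i & b_i \\ -b_i & a_i \end{bmatrix}$ ($b_i \neq 0$), define
\[ Y_i = \begin{bmatrix} 1 & 0 \\ -a_i/b_i & 1/b_i \end{bmatrix} \]
and observe that $Y_i^{-1} = \begin{bmatrix} 1 & 0 \\ a_i & b_i \end{bmatrix}$,
\[ Y_i^{-1} K_i Y_i = \begin{bmatrix} 0 & 1 \\ -(a_i^2 + b_i^2) & 2a_i \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -\hat k_i/\hat m_i & -\hat d_i/\hat m_i \end{bmatrix}. \]
If we define $Y = \bigoplus_{i=1}^{n} Y_i$, then
\[ Y^{-1} K Y = \bigoplus_{i=1}^{n} \begin{bmatrix} 0 & 1 \\ -\hat k_i/\hat m_i & -\hat d_i/\hat m_i \end{bmatrix}. \]
Finally, define the permutation matrix $P$ as in the complex case and set $\hat T = (YP)^{-1}$. Then $\hat T$ is real and $\hat T^{-1} \hat C_R \hat T = K$.

Example 3.2. Let
\[ L(\lambda) = \begin{bmatrix} 3/2 & -1/2 \\ -1/2 & 3/2 \end{bmatrix} \lambda^2 + \begin{bmatrix} -3 & 5 \\ 5 & -3 \end{bmatrix} \lambda + \begin{bmatrix} 11/2 & -9/2 \\ -9/2 & 11/2 \end{bmatrix}. \]
The eigenvalues are $\lambda_1 = -1$ with algebraic multiplicity 2 and geometric multiplicity 1, together with the conjugate pair $\lambda_2 = 2 + i$ and $\lambda_3 = 2 - i$. A diagonal isospectral system is
\[ \hat L(\lambda) = \lambda^2 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \lambda \begin{bmatrix} 2 & 0 \\ 0 & -4 \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} \lambda^2 + 2\lambda + 1 & 0 \\ 0 & \lambda^2 - 4\lambda + 5 \end{bmatrix}. \]
Thus
\[ K_1 = \begin{bmatrix} -1 & 1 \\ 0 & -1 \end{bmatrix}, \qquad K_2 = \begin{bmatrix} 2 & 1 \\ -1 & 2 \end{bmatrix}, \qquad K = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & -1 & 2 \end{bmatrix}. \]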


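The Segre characteristic is $((2), (1), (1))$, the dimension of $Z(K)$ is 4 and $\Xi \in Z(K)$ if and only if
\[ \Xi = \begin{bmatrix} a & b & 0 & 0 \\ 0 & a & 0 & 0 \\ 0 & 0 & c & -d \\ 0 & 0 & d & c \end{bmatrix}. \tag{13} \]
Now,
\[ Y_1 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \qquad Y_2 = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}, \qquad Y = Y_1 \oplus Y_2, \]
and
\[ P = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]
With these matrices
\[ \hat T = (YP)^{-1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 0 & 2 & 1 \end{bmatrix}. \]
Now we compute Jordan chains of $L(\lambda)$. We proceed as in the complex case and find that
\[ x_{01} = \begin{bmatrix} a \\ a \end{bmatrix}, \qquad x_{11} = \begin{bmatrix} b \\ b \end{bmatrix} \]
are the Jordan chains of $L(\lambda)$ for the eigenvalue $\lambda_1 = -1$. Also,
\[ x_{02} = \begin{bmatrix} c \\ -c \end{bmatrix}, \qquad x_{03} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
are the real Jordan chains of $L(\lambda)$ for $\lambda_2 = 2 + i$ and $\lambda_3 = 2 - i$. Recall that $c$ and $d$ are the real and imaginary parts of any pair of conjugate complex eigenvectors corresponding to the conjugate complex eigenvalues (see Section 2.1). In this example there are real eigenvectors associated with the complex eigenvalues. Provided that $a$, $b$ and $c$ take real values, the matrix
\[ T = \begin{bmatrix} X \\ XK \end{bmatrix} \qquad\text{with}\qquad X = \begin{bmatrix} x_{01} & x_{11} & x_{02} & x_{03} \end{bmatrix} \]
satisfies $T^{-1} C_R T = K$. In particular, if $a = c = 1$ and $b = 0$, then
\[ T = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 \\ -1 & 1 & 2 & 1 \\ -1 & 1 & -2 & -1 \end{bmatrix}. \tag{14} \]
Finally, using (13), we compute $U = \hat A \hat T \Xi T^{-1} A^{-1}$ and $V = \hat T \Xi T^{-1}$: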


    U =
    [ 1/2*b+1/2*a, 1/2*b+1/2*a, -1/2*b, -1/2*b ]
    [ 1/2*d+1/4*c, -1/2*d-1/4*c, 5/4*d, -5/4*d ]
    [ 1/2*b, 1/2*b, 1/2*a-1/2*b, 1/2*a-1/2*b ]
    [ -1/4*d, 1/4*d, 1/4*c-1/2*d, -1/4*c+1/2*d ]

    V =
    [ 1/2*b+1/2*a, 1/2*b+1/2*a, 1/2*b, 1/2*b ]
    [ d+1/2*c, -d-1/2*c, -1/2*d, 1/2*d ]
    [ -1/2*b, -1/2*b, 1/2*a-1/2*b, 1/2*a-1/2*b ]
    [ 5/2*d, -5/2*d, 1/2*c-d, -1/2*c+d ]

and check that they are real structure preserving transformations, i.e., that (2) holds. □
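The recipe of Section 3.1 for $\hat T$ can be checked on the data of Example 3.2. This is a sketch under stated assumptions rather than the authors' computation: the blocks $Y_i$, $Y_i^{-1}$ and the permutation $P$ are taken from the text, $\hat C_R$ is written out by hand for the diagonal system $\mathrm{diag}(\lambda^2 + 2\lambda + 1, \lambda^2 - 4\lambda + 5)$, and the helper names are hypothetical.

```python
from fractions import Fraction as F

def matmul(P, Q):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def direct_sum(X, Y):
    """Block diagonal matrix X (+) Y."""
    return ([r + [0] * len(Y[0]) for r in X] +
            [[0] * len(X[0]) + r for r in Y])

a, b = 2, 1                                    # non-real eigenvalue a + i*b = 2 + i

# Real Jordan form K = K1 (+) K2 and the blocks Y_i with their inverses.
K = direct_sum([[-1, 1], [0, -1]], [[a, b], [-b, a]])
Y = direct_sum([[1, 0], [1, 1]], [[1, 0], [F(-a, b), F(1, b)]])
Y_inv = direct_sum([[1, 0], [-1, 1]], [[1, 0], [a, b]])

# Permutation swapping the two middle coordinates; it is its own inverse.
P = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

# (Y P)^{-1} = P^{-1} Y^{-1} = P Y^{-1}
T_hat = matmul(P, Y_inv)

# Right companion matrix of the diagonal system (M-hat = I here).
C_hat = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, -2, 0], [0, -5, 0, 4]]

I4 = [[int(i == j) for j in range(4)] for i in range(4)]
is_inverse = matmul(matmul(Y, P), T_hat) == I4
intertwines = matmul(C_hat, T_hat) == matmul(T_hat, K)
print(T_hat, is_inverse, intertwines)
```

Changing `a` and `b` (with `b` nonzero) exercises the same block construction for another conjugate pair, provided `C_hat` is updated to the corresponding diagonal system.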

4. Hermitian matrix polynomials

When a matrix polynomial $L(\lambda)$ has hermitian coefficients the linearization $\lambda A - B$ (as used above) is also hermitian, and this admits reduction of the linearization by congruence transformations (see (4)). Thus, our first goal is as follows: for two $n \times n$ hermitian matrix polynomials $L(\lambda)$ and $\hat L(\lambda)$ of degree $\ell$ with nonsingular leading coefficients and congruent linearizations, parametrize all matrices $U \in \mathrm{Gl}_{\ell n}(\mathbb{C})$ such that
\[ U^* (\lambda A - B) U = \lambda \hat A - \hat B. \]
We will first prove an analogue of Theorem 2.1 and then analogues of Theorems 2.2 and 3.1. However, this problem is more involved because of the presence of the sign characteristic in the canonical form (see [15]). Hermitian matrix polynomials having the same Jordan form and sign characteristic are said to be strictly isospectral. We use the same notation for the set of matrices to be parameterized:
\[ \Gamma = \{U \in \mathbb{C}^{\ell n \times \ell n} : U^* (\lambda A - B) U = \lambda \hat A - \hat B\}. \tag{15} \]
Notice that since $A$ and $\hat A$ are invertible matrices, so are all matrices in $\Gamma$.

In order to prove the analogue of Theorem 2.1 and introduce the set that will play a role similar to the "centralizer", $Z(J)$ of (3), let us recall some results on selfadjoint standard and Jordan triples of hermitian matrix polynomials.

If $L(\lambda)$ has hermitian coefficients $A_0, \dots, A_\ell$, a standard triple $(X, T, Y)$ of $L(\lambda)$ is said to be selfadjoint if there is an invertible hermitian matrix $M \in \mathbb{C}^{\ell n \times \ell n}$ such that
\[ Y^* = X M^{-1} \qquad\text{and}\qquad T^* = M T M^{-1}. \tag{16} \]
Notice that if such a matrix $M$ exists then $X^* = MY$. It is also noteworthy that if $(X, T, Y)$ is a selfadjoint triple for $L(\lambda)$ there is one and only one invertible hermitian matrix $M$ satisfying (16) (see [15]).

The second property in (16) can be rewritten as $MT = T^* M$. This means that $T$ is $M$-selfadjoint; i.e., selfadjoint with respect to the indefinite inner product $[x, y] = (x, My) = y^* M x$ (see [9, 11]). In particular, $C_R$ and $\hat C_R$ are selfadjoint


with respect to $A$ and $\hat A$, respectively (see (8) and notice that for hermitian matrix polynomials $C_L = C_R^*$). Now, the analogue of Theorem 2.1 is:

Theorem 4.1. Let $L(\lambda)$ and $\hat L(\lambda)$ be strictly isospectral hermitian matrix polynomials with nonsingular leading coefficients. Then $U \in \Gamma$ (of (15)) if and only if there is a full rank matrix $\hat X \in \mathbb{C}^{n \times \ell n}$ such that $(\hat X, \hat C_R, \hat A^{-1} \hat X^*)$ is a selfadjoint triple of $L(\lambda)$ and
\[ U = \begin{bmatrix} \hat X \\ \hat X \hat C_R \\ \vdots \\ \hat X \hat C_R^{\ell-1} \end{bmatrix}. \tag{17} \]
(Since the roles of $L(\lambda)$ and $\hat L(\lambda)$ can be interchanged in this statement, a similar characterization of $U$ can be given in terms of selfadjoint triples of $\hat L(\lambda)$, as in Theorem 2.1.)

Proof. Assume that $U^* (\lambda A - B) U = \lambda \hat A - \hat B$. Then $U^* (\lambda A - B) = (\lambda \hat A - \hat B) U^{-1}$. This means that $(U^*, U^{-1})$ is a block-symmetric SPT of $L(\lambda)$ and $\hat L(\lambda)$. According to Theorem 2.1(b), there is a standard triple $(\hat X, \hat C_L, \hat Y)$ of $L(\lambda)$ such that
\[ U = \begin{bmatrix} \hat X \\ \hat X \hat C_L \\ \vdots \\ \hat X \hat C_L^{\ell-1} \end{bmatrix} \hat A, \qquad U^* = \begin{bmatrix} \hat Y & \hat C_L \hat Y & \cdots & \hat C_L^{\ell-1} \hat Y \end{bmatrix}. \tag{18} \]
Bearing in mind that $\hat C_L = \hat C_R^* = \hat A \hat C_R \hat A^{-1}$, it follows that $(\hat X \hat A, \hat C_R, \hat A^{-1} \hat Y)$ is a standard triple of $L(\lambda)$ and
\[ U = \begin{bmatrix} \hat X \\ \hat X \hat C_L \\ \vdots \\ \hat X \hat C_L^{\ell-1} \end{bmatrix} \hat A = \begin{bmatrix} \hat X \hat A \\ \hat X \hat A \hat A^{-1} \hat C_L \hat A \\ \vdots \\ \hat X \hat A \hat A^{-1} \hat C_L^{\ell-1} \hat A \end{bmatrix} = \begin{bmatrix} \hat Z \\ \hat Z \hat C_R \\ \vdots \\ \hat Z \hat C_R^{\ell-1} \end{bmatrix}, \]
where $\hat Z = \hat X \hat A$.

We are to prove that $(\hat Z, \hat C_R, \hat A^{-1} \hat Y)$ is a selfadjoint triple for $L(\lambda)$; i.e., $\hat Y^* \hat A^{-1} = \hat Z M^{-1} = \hat X \hat A M^{-1}$ for some invertible hermitian matrix $M$. Taking $M = \hat A$ we only have to show that $\hat Y = \hat A \hat X^*$. But, using (18),
\[ \hat Y = U^* \begin{bmatrix} I_n \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \hat A \begin{bmatrix} \hat X^* & \hat C_R \hat X^* & \cdots & \hat C_R^{\ell-1} \hat X^* \end{bmatrix} \begin{bmatrix} I_n \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \hat A \hat X^*, \]
as desired.


Conversely, assume that there is a full row rank matrix $\hat X$ such that $(\hat X, \hat C_R, \hat A^{-1} \hat X^*)$ is a selfadjoint triple of $L(\lambda)$ and $U$ is given by (17). Put $\hat Z = \hat X \hat A^{-1}$. Then $(\hat Z \hat A, \hat C_R, \hat A^{-1} \hat X^*)$ is a standard triple of $L(\lambda)$ and, because $\hat C_R = \hat A^{-1} \hat C_L \hat A$, so is $(\hat Z, \hat C_L, \hat X^*)$. But
\[ U = \begin{bmatrix} \hat Z \hat A \\ \hat Z \hat A \hat C_R \\ \vdots \\ \hat Z \hat A \hat C_R^{\ell-1} \end{bmatrix} = \begin{bmatrix} \hat Z \\ \hat Z \hat C_L \\ \vdots \\ \hat Z \hat C_L^{\ell-1} \end{bmatrix} \hat A \]
and
\[ U^* = \begin{bmatrix} \hat X^* & \hat C_L \hat X^* & \cdots & \hat C_L^{\ell-1} \hat X^* \end{bmatrix}. \]
Therefore there is a standard triple $(\hat Z, \hat C_L, \hat X^*)$ of $L(\lambda)$ such that
\[ U = \begin{bmatrix} \hat Z \\ \hat Z \hat C_L \\ \vdots \\ \hat Z \hat C_L^{\ell-1} \end{bmatrix} \hat A \qquad\text{and}\qquad U^* = \begin{bmatrix} \hat X^* & \hat C_L \hat X^* & \cdots & \hat C_L^{\ell-1} \hat X^* \end{bmatrix}. \]
By Theorem 2.1(b), $(U^*, U^{-1})$ is a block-symmetric SPT for $L(\lambda)$ and $\hat L(\lambda)$. In other words
\[ U^* (\lambda A - B) = (\lambda \hat A - \hat B) U^{-1} \]
and the theorem is proved. □

4.1. The set Γ in terms of canonical structures

Now we prove the analogue of Theorem 2.2 concerning hermitian polynomials. If $L(\lambda)$ is hermitian, $J$ is a Jordan form for $L(\lambda)$, and $P$ is the corresponding canonical matrix determined by $J$ and the sign characteristic of $L(\lambda)$ associated with its real eigenvalues (see [15]) then $P^{-1} = P$ and $PJ = J^* P$. Now a Jordan triple $(X, J, Y)$ of $L(\lambda)$ is selfadjoint if $Y^* = XP$.

The following result is Theorem 1.10 of [7]. It provides some motivation for the introduction of the set that will play the role of the centralizer $Z(J)$ in the hermitian case (cf. [8, Th. 1.25]).

Theorem 4.2. Let $(X, J, Y)$ be a selfadjoint Jordan triple for the hermitian matrix polynomial $L(\lambda)$. Then $(\hat X, J, \hat Y)$ is a selfadjoint Jordan triple for $L(\lambda)$ if and only if there exists a matrix $V \in \mathbb{C}^{\ell n \times \ell n}$ such that $V^* P V = P$ and
\[ \hat X = XV, \qquad J = V^{-1} J V, \qquad \hat Y = V^{-1} P X^*. \]
A matrix $V$ for which $V^* P V = P$ is said to be $P$-unitary. We define
\[ Z(J, P) = \{X \in \mathbb{C}^{\ell n \times \ell n} : X^* P X = P, \text{ and } XJ = JX\}. \tag{19} \]
Thus, members of $Z(J, P)$ are the $P$-unitary matrices that commute with $J$. This is no longer an open set of a linear space (actually it is closed) but it is still a subgroup of $\mathrm{Gl}_{\ell n}(\mathbb{C})$.


Recall now that $C_R$ and $\hat C_R$ are selfadjoint with respect to $A$ and $\hat A$, respectively, and then ([7, Th. 1.4] or [11, Th. 5.1.1]) there are nonsingular matrices $T$ and $\hat T$ such that
\[ J = T^{-1} C_R T, \quad P = T^* A T, \qquad J = \hat T^{-1} \hat C_R \hat T, \quad P = \hat T^* \hat A \hat T. \tag{20} \]
Then, with the definition (15) of $\Gamma$:

Theorem 4.3. Let $L(\lambda)$ and $\hat L(\lambda)$ be strictly isospectral $n \times n$ hermitian matrix polynomials of degree $\ell$ with $A_\ell$ and $\hat A_\ell$ nonsingular. Let $J$ and $P$ be a pair of canonical matrices common to both $L(\lambda)$ and $\hat L(\lambda)$. If $T$ and $\hat T$ are invertible matrices satisfying (20) then the map $\varphi : Z(J, P) \to \Gamma$ given by $\varphi(X) = T X \hat T^{-1}$ is well defined and bijective.

Proof. Since $T$ and $\hat T$ are invertible matrices we only have to prove that $\varphi$ is well defined with a well-defined inverse.

Let $X \in Z(J, P)$ and $U = T X \hat T^{-1}$. Put $V = U^{-1} = \hat T X^{-1} T^{-1}$ and $W = \hat A V A^{-1}$. Since $Z(J, P)$ is a group, $X^{-1} \in Z(J, P)$. In particular, $X^{-1} \in Z(J)$ and by Theorem 2.2, $(W, V)$ is a SPT for $L(\lambda)$ and $\hat L(\lambda)$; i.e., $W(\lambda A - B) = (\lambda \hat A - \hat B)V$. Let us show that $W = U^*$. As $V = U^{-1}$ this would imply that $U \in \Gamma$. In fact
\begin{align*}
W = \hat A V A^{-1} &= \hat A \hat T X^{-1} T^{-1} A^{-1} \\
 &= \hat T^{-*} P X^{-1} P^{-1} T^* && (\hat T^* \hat A \hat T = T^* A T = P) \\
 &= \hat T^{-*} P P^{-1} X^* T^* && (X^* P X = P) \\
 &= \hat T^{-*} X^* T^* \\
 &= (T X \hat T^{-1})^* = U^*.
\end{align*}
Conversely, let $U \in \Gamma$: $U^* A U = \hat A$ and $U^* B U = \hat B$. This means that $(U^*, U^{-1})$ is a block-symmetric SPT for $L(\lambda)$ and $\hat L(\lambda)$. Thus, by Theorem 2.2, $U^{-1} = \hat T X T^{-1}$ and $U^* = \hat A \hat T X T^{-1} A^{-1}$ for some invertible matrix $X \in Z(J)$. But if $U^{-1} = \hat T X T^{-1}$, then $U = T X^{-1} \hat T^{-1}$. Let us show that $X \in Z(J, P)$. This will conclude the proof because, since $Z(J, P)$ is a group, $X^{-1}$ will also be in $Z(J, P)$.

On the one hand, $U^* = \hat T^{-*} X^{-*} T^*$ and on the other hand $U^* = \hat A \hat T X T^{-1} A^{-1}$. Thus
\[ X^{-*} = \hat T^* \hat A \hat T X T^{-1} A^{-1} T^{-*}. \]
However, we also have $\hat T^* \hat A \hat T = T^* A T = P$, so that $X^{-*} = P X P^{-1}$, and $X^* P X = P$ as desired. □

There is still the problem of the geometry of $Z(J, P)$ (we know that $Z(J)$ is a linear space). We will see in the next section that, in the real case, $Z(J, P)$ may contain a finite number of matrices.


5. Real symmetric matrix polynomials

The next two sections concern a special class of hermitian systems, namely, those that have real symmetric coefficients and, in particular, have linearizations which are diagonalizable by a real congruence transformation. This problem class includes prototypical models of vibration in viscously damped systems. The analysis of this case requires the notion of a real selfadjoint Jordan triple, $(X_\rho, K, P X_\rho^T)$ associated with a real symmetric matrix polynomial $L(\lambda)$. In particular, the matrix $K$ is, of course, a real Jordan canonical form (see Theorem 3.4 of [15]).

The set of matrices (defining real congruence transformations) to be parameterized is now:
\[ \Gamma_\rho = \{U \in \mathbb{R}^{\ell n \times \ell n} : U^T (\lambda A - B) U = \lambda \hat A - \hat B\}. \tag{21} \]
Notice that $A$ and $B$ are real and symmetric and, since $A$ and $\hat A$ are invertible, so are all matrices in $\Gamma_\rho$.

Given a real selfadjoint Jordan triple $(X_\rho, K, P X_\rho^T)$ associated with a real symmetric matrix polynomial $L(\lambda)$ then (cf. (19)) define
\[ Z(K, P) = \{\Xi \in \mathbb{R}^{\ell n \times \ell n} : \Xi^T P \Xi = P, \text{ and } \Xi K = K \Xi\}. \]
Thus, members of $Z(K, P)$ are the $P$-orthogonal matrices that commute with $K$. As before, this is a subgroup of $\mathrm{Gl}_{\ell n}(\mathbb{R})$. Then, if $L(\lambda)$ and $\hat L(\lambda)$ are real, symmetric, and strictly isospectral (have the same canonical matrices $K$ and $P$), then (see Section 4 of [15]) there are real nonsingular matrices $T$ and $\hat T$ such that
\[ T^T A T = P, \quad K = T^{-1} C_R T, \qquad \hat T^T \hat A \hat T = P, \quad K = \hat T^{-1} \hat C_R \hat T. \tag{22} \]
The results for real symmetric matrix polynomials are direct analogues of those for hermitian matrix polynomials:

Theorem 5.1. Let $L(\lambda)$ and $\hat L(\lambda)$ be strictly isospectral real symmetric matrix polynomials. Then $U \in \Gamma_\rho$ if and only if there is a full rank matrix $\hat X \in \mathbb{R}^{n \times \ell n}$ such that $(\hat X, \hat C_R, \hat A^{-1} \hat X^T)$ is a real selfadjoint triple of $L(\lambda)$ and
\[ U = \begin{bmatrix} \hat X \\ \hat X \hat C_R \\ \vdots \\ \hat X \hat C_R^{\ell-1} \end{bmatrix}. \]

Theorem 5.2. Let $L(\lambda)$ and $\hat L(\lambda)$ be strictly isospectral $n \times n$ real symmetric matrix polynomials of degree $\ell$ with $A_\ell$ and $\hat A_\ell$ nonsingular. Let $K$ and $P$ be the common canonical forms for these matrix polynomials (as above). If $T$ and $\hat T$ are invertible real matrices satisfying (22) then the map $\varphi : Z(K, P) \to \Gamma_\rho$ given by $\varphi(\Xi) = T \Xi \hat T^{-1}$ is well defined and bijective.


Given the existence of real selfadjoint triples, the proofs are essentially the same as those of Theorems 4.1 and 4.3, respectively. It is only necessary to confine the argument to real matrices.

Example 5.1. We apply the theory above to the systems of Example 3.2:
\[ L(\lambda) = \begin{bmatrix} 3/2 & -1/2 \\ -1/2 & 3/2 \end{bmatrix} \lambda^2 + \begin{bmatrix} -3 & 5 \\ 5 & -3 \end{bmatrix} \lambda + \begin{bmatrix} 11/2 & -9/2 \\ -9/2 & 11/2 \end{bmatrix}, \]
and
\[ \hat L(\lambda) = \lambda^2 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \lambda \begin{bmatrix} 2 & 0 \\ 0 & -4 \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} \lambda^2 + 2\lambda + 1 & 0 \\ 0 & \lambda^2 - 4\lambda + 5 \end{bmatrix}. \]
We already know that they are isospectral. But they are also strictly isospectral systems. In fact, the only real eigenvalue is $-1$ and its Segre characteristic is $(2)$. In order to compute the sign characteristic of the elementary divisor $(\lambda + 1)^2$ we can use Theorem 3.7 of [7]. It turns out that the sign characteristic of $(\lambda + 1)^2$ in both matrices is $+1$. Thus, the common real Jordan form and sip matrix for these systems are
\[ K = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & -1 & 2 \end{bmatrix} \qquad\text{and}\qquad P = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}. \]
The matrices in $Z(K)$ have the form given in (13). Hence $\Xi \in Z(K, P)$ if and only if $\Xi$ has the form in (13) and $\Xi^T P \Xi = P$. A simple computation shows that, in this case,
\[ \Xi^T P \Xi = \begin{bmatrix} 0 & a^2 & 0 & 0 \\ a^2 & 2ab & 0 & 0 \\ 0 & 0 & 2cd & c^2 - d^2 \\ 0 & 0 & c^2 - d^2 & -2cd \end{bmatrix}. \]
Thus $\Xi^T P \Xi = P$ if and only if $a^2 = 1$, $c^2 = 1$ and $b = d = 0$. This reveals, for example, that there are only 4 distinct matrices in $Z(K, P)$.

Next we have to find matrices $T$ and $\hat T$ satisfying (22). We already have a matrix $S$ (cf. (14)) for which $S^{-1} C_R S = K$. It turns out that any other matrix satisfying this condition must be of the form $SH$ with $H \in Z(K)$ and invertible. Recalling (13), we compute
\[ (SH)^T A (SH) = \begin{bmatrix} 0 & 2a^2 & 0 & 0 \\ 2a^2 & 4ab & 0 & 0 \\ 0 & 0 & 8cd & 4c^2 - 4d^2 \\ 0 & 0 & 4c^2 - 4d^2 & -8cd \end{bmatrix}. \]
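The two closed forms just stated, for $\Xi^T P \Xi$ and for $(SH)^T A (SH)$, can be spot-checked in exact arithmetic. The sketch below is illustrative only: it assumes $A = \begin{bmatrix} D & M \\ M & 0 \end{bmatrix}$ built from the coefficients of Example 3.2, takes $S$ to be the matrix (14), uses arbitrary sample values of $a, b, c, d$, and searches a small integer grid for members of $Z(K, P)$, which illustrates the count of four matrices but does not prove it.

```python
from fractions import Fraction as F
from itertools import product

def matmul(P, Q):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def gram(X, G):
    """Return X^T G X."""
    return matmul(matmul(transpose(X), G), X)

def xi(a, b, c, d):
    """General element of Z(K), as in (13)."""
    return [[a, b, 0, 0], [0, a, 0, 0], [0, 0, c, -d], [0, 0, d, c]]

P = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

# A = [[D, M], [M, 0]] for Example 3.2 (assumed block form) and S from (14).
A = [[-3, 5, F(3, 2), F(-1, 2)],
     [5, -3, F(-1, 2), F(3, 2)],
     [F(3, 2), F(-1, 2), 0, 0],
     [F(-1, 2), F(3, 2), 0, 0]]
S = [[1, 0, 1, 0], [1, 0, -1, 0], [-1, 1, 2, 1], [-1, 1, -2, -1]]

a, b, c, d = 2, 3, 5, 7                       # arbitrary sample parameters
closed_form_ok = gram(xi(a, b, c, d), P) == [
    [0, a * a, 0, 0],
    [a * a, 2 * a * b, 0, 0],
    [0, 0, 2 * c * d, c * c - d * d],
    [0, 0, c * c - d * d, -2 * c * d]]
sh_ok = gram(matmul(S, xi(a, b, c, d)), A) == [
    [0, 2 * a * a, 0, 0],
    [2 * a * a, 4 * a * b, 0, 0],
    [0, 0, 8 * c * d, 4 * c * c - 4 * d * d],
    [0, 0, 4 * c * c - 4 * d * d, -8 * c * d]]

# Grid search for P-orthogonal elements of Z(K) with small integer entries.
members = [p for p in product(range(-2, 3), repeat=4) if gram(xi(*p), P) == P]
print(closed_form_ok, sh_ok, members)
```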


Thus $(SH)^T A (SH) = P$ if and only if $a^2 = 1/2$, $b = 0$, $c^2 = 1/4$ and $d = 0$. Taking $a = 1/\sqrt{2}$ and $c = 1/2$, for example, we have
\[ H = \begin{bmatrix} 1/\sqrt{2} & 0 & 0 & 0 \\ 0 & 1/\sqrt{2} & 0 & 0 \\ 0 & 0 & 1/2 & 0 \\ 0 & 0 & 0 & 1/2 \end{bmatrix} \in Z(K), \]
and a matrix $T$ such that $T^T A T = P$ and $T^{-1} C_R T = K$ is $T = SH$. With the above $H$
\[ T = \begin{bmatrix} \sqrt{2}/2 & 0 & 1/2 & 0 \\ \sqrt{2}/2 & 0 & -1/2 & 0 \\ -\sqrt{2}/2 & \sqrt{2}/2 & 1 & 1/2 \\ -\sqrt{2}/2 & \sqrt{2}/2 & -1 & -1/2 \end{bmatrix}. \]
The same procedure applied to $\hat L(\lambda)$ shows that, for example,
\[ \hat T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 0 & 2 & 1 \end{bmatrix} \]
satisfies $\hat T^T \hat A \hat T = P$ and $\hat T^{-1} \hat C_R \hat T = K$. Notice that this matrix was also obtained by the standard procedure detailed in Section 3.1.

Thus, all SPCs for $L(\lambda)$ and $\hat L(\lambda)$ have the form
\[ U = T \Xi \hat T^{-1} = \begin{bmatrix} \frac{\sqrt{2}}{2} a & \frac{1}{2} c & 0 & 0 \\ \frac{\sqrt{2}}{2} a & -\frac{1}{2} c & 0 & 0 \\ 0 & 0 & \frac{\sqrt{2}}{2} a & \frac{1}{2} c \\ 0 & 0 & \frac{\sqrt{2}}{2} a & -\frac{1}{2} c \end{bmatrix} \tag{23} \]
with $a^2 = 1$ and $c^2 = 1$. One can easily check that
\[ U^T A U - \hat A = \begin{bmatrix} 2a^2 - 2 & 0 & a^2 - 1 & 0 \\ 0 & 4 - 4c^2 & 0 & c^2 - 1 \\ a^2 - 1 & 0 & 0 & 0 \\ 0 & c^2 - 1 & 0 & 0 \end{bmatrix}, \]
and
\[ U^T B U - \hat B = \begin{bmatrix} 1 - a^2 & 0 & 0 & 0 \\ 0 & 5 - 5c^2 & 0 & 0 \\ 0 & 0 & a^2 - 1 & 0 \\ 0 & 0 & 0 & c^2 - 1 \end{bmatrix}, \]
which reduce to zero when $a^2 = 1$ and $c^2 = 1$. Again, only four real SPCs reduce $L(\lambda)$ to the diagonal strictly isospectral system $\hat L(\lambda)$.

It is worth noting that the system $L(\lambda)$ of this example satisfies the conditions of Theorem 2 in [14] for being decoupled (diagonalized) by congruence. For, if we write $L(\lambda)$ in the form $L(\lambda) = M\lambda^2 + D\lambda + K$ then the eigenvalues of $\lambda M + K$ are 1


and 5 so that it is of definite type, and one can check that $D M^{-1} K = K M^{-1} D$. Hence there is a nonsingular matrix $V$ such that
\[ V^* L(\lambda) V = \hat L(\lambda). \tag{24} \]
It turns out that our procedure to construct the SPCs for these two systems produces some matrices $V$ satisfying (24). In fact, one can check (see [4]) that the relation
\[ U^T (\lambda A - B) = (\lambda \hat A - \hat B) U^{-1} \]
implies
\[ (U_{12}^T \lambda + U_{11}^T) L(\lambda) = \hat L(\lambda)(V_{12} \lambda + V_{11}) \]
where $U = [U_{ij}]_{i,j=1,2}$ and $U^{-1} = [V_{ij}]_{i,j=1,2}$. But from (23) we have $U_{12} = 0$, $V_{12} = 0$ and $V_{11} = U_{11}^{-1}$. Therefore
\[ U_{11}^T L(\lambda) U_{11} = \hat L(\lambda), \]
with
\[ U_{11} = \begin{bmatrix} \frac{\sqrt{2}}{2} a & \frac{1}{2} c \\ \frac{\sqrt{2}}{2} a & -\frac{1}{2} c \end{bmatrix}, \qquad a^2 = c^2 = 1, \]
defines a strict real congruence between the two symmetric systems. Whether this is a viable procedure for constructing all possible strict real congruences between two given systems remains an open question. □

Acknowledgement

The authors are grateful for support and encouragement received from Seamus D. Garvey, Uwe Prells, and Atanas Popov, partners in the project supported by the EPSRC (UK) under Grant EP/E046290.

References

[1] Gantmacher F.R., The Theory of Matrices, vol 1. AMS Chelsea, Providence, Rhode Island, 1998.
[2] Garvey S.G., Friswell M.I., and Prells U., Co-ordinate transformations for second-order systems, Part 1: General transformations, J. Sound and Vibration, 285, 2002, 885–909.
[3] Garvey S.G., Friswell M.I., Prells U., and Chen Z., General isospectral flows for linear dynamic systems, Lin. Alg. and its Applications, 385, 2004, 335–368.
[4] Garvey S.G., Lancaster P., Popov A., Prells U., Zaballa I., Filters Connecting Isospectral Quadratic Systems. Preprint.
[5] Chu M., and Del Buono N., Total decoupling of a general quadratic pencil, Part 1, J. Sound and Vibration, 309, 2008, 96–111.
[6] Chu M., and Xu S.F., Spectral decomposition of real symmetric quadratic λ-matrices and its applications, Math. Comp., 78, 2009, 293–313.
[7] Gohberg I., Lancaster P., and Rodman L., Spectral analysis of selfadjoint matrix polynomials, Ann. of Math., 112, 1980, 33–71.
[8] Gohberg I., Lancaster P., and Rodman L., Matrix Polynomials, Academic Press, New York, 1982, and SIAM, Philadelphia, 2009.
[9] Gohberg I., Lancaster P., and Rodman L., Matrices and Indefinite Scalar Products, Birkhäuser, Basel, 1983.
[10] Gohberg I., Lancaster P., and Rodman L., Invariant Subspaces of Matrices with Applications, Wiley, New York, 1986 and SIAM, Philadelphia, 2006.
[11] Gohberg I., Lancaster P., and Rodman L., Indefinite Linear Algebra and Applications, Birkhäuser, Basel, 2005.
[12] Lancaster P., and Tismenetsky M., The Theory of Matrices, Academic Press, New York, 1985.
[13] Lancaster P., and Prells U., Isospectral families of high-order systems, Z. Angew. Math. Mech, 87, 2007, 219–234.
[14] Lancaster P., and Zaballa I., Diagonalizable quadratic eigenvalue problems, Mechanical Systems and Signal Processing, 23, 2009, 1134–1144.
[15] Lancaster P., and Zaballa I., A review of canonical forms for selfadjoint matrix polynomials. Submitted.

Peter Lancaster
Dept. of Mathematics and Statistics
University of Calgary
Calgary, AB T2N 1N4, Canada
e-mail: [email protected]

Ion Zaballa
Departamento de Matematica Aplicada y EIO
Universidad del Pais Vasco
Apdo 644
E-48080 Bilbao, Spain
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 425–443
© 2012 Springer Basel AG

A Review of Canonical Forms for Selfadjoint Matrix Polynomials

Peter Lancaster and Ion Zaballa

Dedicated to the memory of Israel Gohberg, good friend and scholar

Abstract. In the theory of $n \times n$ matrix polynomials, the notions of "standard pairs and triples", and the special cases of "Jordan pairs and triples" play an important role. There are interesting differences in these constructions according as the analysis is carried out over the complex field $\mathbb{C}$, or the real field $\mathbb{R}$. A careful review is made of these ideas with special reference to complex hermitian systems, and to real symmetric systems with nonsingular leading coefficient. New results are obtained concerning real Jordan structures for real symmetric matrix polynomials.

Mathematics Subject Classification (2000). 15A21, 15A54, 47B15.

Keywords. Matrix polynomials, canonical forms.

1. Introduction

Standard and Jordan triples for matrix polynomials and their "selfadjoint" forms, were introduced and developed by Gohberg, Lancaster, and Rodman (GLR) in several publications including [3, 4, 8]. In the first two of these works (of thirty years ago) selfadjoint structures are defined for polynomials defined over $\mathbb{C}$ and are formulated in terms of canonical forms over $\mathbb{C}$. Following a lead given more recently in [8], we separate the "selfadjoint" and "canonical" notions and provide careful distinction between systems defined over either $\mathbb{C}$ or $\mathbb{R}$. Also, we take advantage of the comprehensive discussion of canonical forms provided in [11]. Less comprehensive discussions can be found in the GLR works but they are scattered and incomplete.

It is our objective in this paper to give a concise and largely self-contained overview of these ideas. For convenience, some of the necessary arguments are repeated, but some are new. Our main objective is the development of arguments leading to Theorem 4.3 and (especially) Theorem 4.4 concerning Jordan triples. In Section 5 constructions for chains of generalized eigenvectors are presented in the light of preceding results.

We consider $n \times n$ matrix polynomials of degree $\ell$: say $L(\lambda) = \sum_{j=0}^{\ell} A_j \lambda^j$ with $A_\ell$ nonsingular, and all coefficient matrices $A_j \in \mathbb{C}^{n \times n}$, or all in $\mathbb{R}^{n \times n}$. The first of these is the setting for the greater part of the GLR theory. Where the distinction is not important, we use the symbol $\mathbb{F}$ to denote either the field of real or the field of complex numbers.

In particular, this paper seems to provide the first comprehensive account of real canonical structures (for real symmetric systems), with no hypotheses on the degrees of elementary divisors, and making no positive definite hypotheses on any of the coefficients $A_j$. We note that Chu et al. in [1], [2] and [13] have recently used partial canonical structures in studying inverse problems and model updating for some real quadratic systems.

This work was supported by grants from the EPSRC (United Kingdom), NSERC (Canada), and DGICYT, GV (Spain).

2. Standard pairs and triples

Let $L(\lambda)$ be an $n\times n$ matrix polynomial over $\mathbb{F}$ with nonsingular leading coefficient $A_\ell$. Let $C_R$ be the right companion matrix of $L(\lambda)$, namely,

$$
C_R=\begin{bmatrix}
0 & I_n & 0 & \cdots & 0\\
0 & 0 & I_n & \cdots & 0\\
\vdots & \vdots & & \ddots & \vdots\\
0 & 0 & 0 & \cdots & I_n\\
-A_\ell^{-1}A_0 & -A_\ell^{-1}A_1 & -A_\ell^{-1}A_2 & \cdots & -A_\ell^{-1}A_{\ell-1}
\end{bmatrix}, \tag{1}
$$

and define the “block symmetric” matrix $A$ by

$$
A=\begin{bmatrix}
A_1 & A_2 & \cdots & A_\ell\\
A_2 & \cdots & A_\ell & 0\\
\vdots & & & \vdots\\
A_\ell & 0 & \cdots & 0
\end{bmatrix}. \tag{2}
$$

Thus, both $C_R$ and $A$ are in $\mathbb{F}^{\ell n\times \ell n}$. Also, $A$ is nonsingular and if the coefficient matrices of $L(\lambda)$ are real and symmetric, or hermitian, then $A^*=A$, so that $A$ is real and symmetric, or hermitian, according as $\mathbb{F}=\mathbb{R}$ or $\mathbb{C}$. The product $AC_R$ is also block-symmetric, and if $L(\lambda)$ is hermitian or real symmetric, $AC_R$ is also hermitian or real symmetric, respectively. For this reason $A$ is sometimes known as the “symmetrizer” for $C_R$ (it defines an indefinite inner product on $\mathbb{F}^{\ell n}$ in which $C_R$ is selfadjoint; i.e., $AC_R=C_R^*A$).
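The symmetrizer property can be checked numerically. The following is a minimal sketch, assuming NumPy and a hypothetical real symmetric quadratic ($n=2$, $\ell=2$) whose coefficients are chosen only for illustration; the helper names `companion` and `symmetrizer` are not from the paper.

```python
import numpy as np

def companion(coeffs):
    """Right companion matrix C_R of (1) for L(lam) = sum_j coeffs[j] * lam**j."""
    ell, n = len(coeffs) - 1, coeffs[0].shape[0]
    C = np.zeros((ell * n, ell * n))
    C[: (ell - 1) * n, n:] = np.eye((ell - 1) * n)   # identity blocks above the diagonal
    lead_inv = np.linalg.inv(coeffs[-1])
    for j in range(ell):                             # last block row: -A_l^{-1} A_j
        C[(ell - 1) * n:, j * n:(j + 1) * n] = -lead_inv @ coeffs[j]
    return C

def symmetrizer(coeffs):
    """Block-symmetric matrix A of (2): block (i, j) is A_{i+j+1}, or 0 if i+j+1 > l."""
    ell, n = len(coeffs) - 1, coeffs[0].shape[0]
    A = np.zeros((ell * n, ell * n))
    for i in range(ell):
        for j in range(ell - i):
            A[i * n:(i + 1) * n, j * n:(j + 1) * n] = coeffs[i + j + 1]
    return A

# Hypothetical real symmetric quadratic; the leading coefficient A2 is
# indefinite but nonsingular (no positive definiteness is assumed).
A0 = np.array([[1.0, -2.0], [-2.0, 0.0]])
A1 = np.array([[0.0, 1.0], [1.0, 3.0]])
A2 = np.array([[2.0, 1.0], [1.0, -1.0]])
coeffs = [A0, A1, A2]

C_R, A = companion(coeffs), symmetrizer(coeffs)
product = A @ C_R
print(np.allclose(A, A.T))              # checks that A is symmetric
print(np.allclose(product, product.T))  # checks that A C_R is symmetric
print(np.allclose(product, C_R.T @ A))  # checks that A C_R = C_R^T A
```

For a quadratic the product $AC_R$ reduces to the block diagonal matrix $\mathrm{diag}(-A_0, A_2)$, which makes the symmetry visible by inspection.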

Definition 2.1. (a) A pair of matrices $X\in\mathbb{F}^{n\times \ell n}$ and $T\in\mathbb{F}^{\ell n\times \ell n}$ form a standard pair over $\mathbb{F}$ if

$$
\begin{bmatrix} X\\ XT\\ \vdots\\ XT^{\ell-1}\end{bmatrix}
$$

is nonsingular.
(b) A standard pair $(X,T)$ is a standard pair for $L(\lambda)$ if

$$
X=\begin{bmatrix} I & 0 & \cdots & 0\end{bmatrix}S \quad\text{and}\quad T=S^{-1}C_RS
$$

for some nonsingular $S\in\mathbb{F}^{\ell n\times \ell n}$.

Theorem 2.2. A standard pair $(X,T)$ is a standard pair for $L(\lambda)$ if and only if

$$
L(X,T):=A_\ell XT^{\ell}+\cdots+A_1XT+A_0X=0.
$$

This is Proposition 12.1 of [8].

Definition 2.3. (a) Given a standard pair $(X,T)$ over $\mathbb{F}$, if

$$
Y=\begin{bmatrix} X\\ XT\\ \vdots\\ XT^{\ell-1}\end{bmatrix}^{-1}
\begin{bmatrix} 0\\ \vdots\\ 0\\ Q\end{bmatrix} \tag{3}
$$

for some nonsingular matrix $Q\in\mathbb{F}^{n\times n}$, then $(X,T,Y)$ is said to be a standard triple (over $\mathbb{F}$).
(b) If $(X,T)$ is a standard pair for $L(\lambda)$ and $Y$ is defined as in (a) with $Q=A_\ell^{-1}$, then $(X,T,Y)$ is said to be a standard triple for $L(\lambda)$ (over $\mathbb{F}$).

Throughout this paper, when saying that $(X,T,Y)$ is a standard triple it is to be understood that, unless specified otherwise, $X\in\mathbb{F}^{n\times \ell n}$, $T\in\mathbb{F}^{\ell n\times \ell n}$ and $Y\in\mathbb{F}^{\ell n\times n}$. If $(X,T,Y)$ is a standard triple for $L(\lambda)$ then (see [4, Prop. 2.1]):

(i) $\begin{bmatrix} Y & TY & \cdots & T^{\ell-1}Y\end{bmatrix}$ is nonsingular,
(ii) $T^{\ell}YA_\ell+T^{\ell-1}YA_{\ell-1}+\cdots+TYA_1+YA_0=0$,
(iii) $X\begin{bmatrix} Y & TY & \cdots & T^{\ell-1}Y\end{bmatrix}=\begin{bmatrix} 0 & \cdots & 0 & A_\ell^{-1}\end{bmatrix}$.

This implies that $(Y^T,T^T,X^T)$ is a standard triple of $L(\lambda)^T$. The pair $(T,Y)$ is called a left standard pair of $L(\lambda)$. The prime example of a standard triple for the polynomial $L(\lambda)$ is

$$
X_0=\begin{bmatrix} I & 0 & \cdots & 0\end{bmatrix},\qquad T=C_R,\qquad
Y_0=\begin{bmatrix} 0\\ \vdots\\ 0\\ A_\ell^{-1}\end{bmatrix}. \tag{4}
$$

This is a standard triple for the matrix polynomial $L(\lambda)$ whose leading coefficient is $A_\ell$ and whose remaining coefficients are $-A_\ell$ times the blocks appearing in the last block row of $C_R$; i.e., $L(\lambda)=A_\ell\lambda^{\ell}+A_{\ell-1}\lambda^{\ell-1}+\cdots+A_1\lambda+A_0$. In other words, all matrix polynomials with nonsingular leading coefficient admit standard triples. The converse is also true. The proof is based on the fact that if $(X,T,Y)$ is a standard triple for $L(\lambda)$ then the resolvent form holds:

$$
L(\lambda)^{-1}=X(I_{\ell n}\lambda-T)^{-1}Y,\qquad \lambda\notin\sigma(L); \tag{5}
$$

$\sigma(L)$ being the spectrum (i.e., the set of eigenvalues) of $L(\lambda)$ (Theorem 14.2 of [12]).

Theorem 2.4. If $(X,T,Y)$ is a standard triple with $X\in\mathbb{F}^{n\times \ell n}$, $T\in\mathbb{F}^{\ell n\times \ell n}$ and $Y\in\mathbb{F}^{\ell n\times n}$ then there is a unique matrix polynomial $L(\lambda)$ for which $(X,T,Y)$ is a standard triple.

Proof. By Definition 2.3(a), if $(X,T,Y)$ is a standard triple then there is an invertible matrix $Q$ such that

$$
XT^{i}Y=\begin{cases} 0 & \text{for } i=0,1,\ldots,\ell-2,\\ Q & \text{for } i=\ell-1.\end{cases}
$$

Thus

$$
\operatorname{rank}\begin{bmatrix}
XY & XTY & \cdots & XT^{\ell-1}Y\\
XTY & XT^{2}Y & \cdots & XT^{\ell}Y\\
\vdots & \vdots & & \vdots\\
XT^{\ell-1}Y & XT^{\ell}Y & \cdots & XT^{2\ell-2}Y
\end{bmatrix}
=\operatorname{rank}\begin{bmatrix}
0 & \cdots & 0 & Q\\
0 & \cdots & Q & XT^{\ell}Y\\
\vdots & & \vdots & \vdots\\
Q & \cdots & XT^{2\ell-3}Y & XT^{2(\ell-1)}Y
\end{bmatrix}=\ell n .
$$

By Theorem 2.8 of [4]¹ there is a matrix polynomial $L(\lambda)$ such that $L(\lambda)^{-1}=X(\lambda I_{\ell n}-T)^{-1}Y$, and by Theorem 14.2.4 of [12] $(X,T,Y)$ is a standard triple for $L(\lambda)$. This is the only matrix polynomial for which $(X,T,Y)$ is a standard triple because, if this triple were a standard triple for $\hat L(\lambda)$, then $\hat L(\lambda)^{-1}=X(\lambda I_{\ell n}-T)^{-1}Y=L(\lambda)^{-1}$. □

It should be noted that the coefficients of $L(\lambda)$ can be expressed² in terms of a standard triple for $L(\lambda)$ (this is Theorem 14.7.1 of [12] and Theorem 2.4 of [4]) and so if $(X,T,Y)$ is real, $L(\lambda)$ is real too.

Although a standard triple defines a matrix polynomial uniquely, a matrix polynomial generally admits many standard triples. The relationship between two standard triples for the same matrix polynomial is clarified in the following theorem:

¹ After straightforward generalization to admit nonsingular $A_\ell$ possibly different from $I$.
² This plays an important part in strategies for solving inverse problems in which the coefficients are expressed in terms of spectral data. See [9] and [13], for example.

Theorem 2.5. If $(X,T,Y)$ is a standard triple for $L(\lambda)$ over $\mathbb{F}$, and if a triple of matrices $(X_1,T_1,Y_1)$ is similar to $(X,T,Y)$ in the sense that

$$
X_1=XS^{-1},\qquad T_1=STS^{-1},\qquad Y_1=SY \tag{6}
$$

for some invertible $S\in\mathbb{F}^{\ell n\times \ell n}$, then $(X_1,T_1,Y_1)$ is also a standard triple for $L(\lambda)$ over $\mathbb{F}$. Conversely, any two standard triples for $L(\lambda)$ over $\mathbb{F}$ are similar.

This is Proposition 12.1.3 of [8]. Using the standard triple (4), we note that $(X,T,Y)$ is a standard triple for $L(\lambda)$ if and only if there is a nonsingular matrix $S$ such that

$$
X=\begin{bmatrix} I_n & 0 & \cdots & 0\end{bmatrix}S,\qquad T=S^{-1}C_RS,\qquad
Y=S^{-1}\begin{bmatrix} 0\\ \vdots\\ 0\\ A_\ell^{-1}\end{bmatrix}.
$$

It turns out that the matrix $S$ satisfying (6) is uniquely determined either by $(X,T)$ and $(X_1,T_1)$, or by $(T,Y)$ and $(T_1,Y_1)$. It is given by (see [4, Th. 1.25]):

$$
S=C(X_1,T_1)^{-1}C(X,T)\quad\text{or}\quad S=R(T_1,Y_1)R(T,Y)^{-1}, \tag{7}
$$

where

$$
C(X,T)=\begin{bmatrix} X\\ XT\\ \vdots\\ XT^{\ell-1}\end{bmatrix}
\quad\text{and}\quad
R(T,Y)=\begin{bmatrix} Y & TY & \cdots & T^{\ell-1}Y\end{bmatrix}. \tag{8}
$$

It follows from (7) that $C(X,T)R(T,Y)=C(X_1,T_1)R(T_1,Y_1)$. Furthermore, if $A$ is the block-symmetric matrix of (2), then for any standard triple $(X,T,Y)$ of $L(\lambda)$ we have (see [12, Th. 14.2.5])

$$
A^{-1}=C(X,T)R(T,Y)=\Gamma, \tag{9}
$$

where

$$
\Gamma=\begin{bmatrix}
0 & \cdots & 0 & \Gamma_{\ell-1}\\
0 & \cdots & \Gamma_{\ell-1} & \Gamma_{\ell}\\
\vdots & & \vdots & \vdots\\
\Gamma_{\ell-1} & \cdots & \Gamma_{2\ell-3} & \Gamma_{2(\ell-1)}
\end{bmatrix} \tag{10}
$$

is (by definition) the matrix of the moments of $L(\lambda)$; i.e., the $2\ell-2$ first coefficients of the resolvent expansion for $L(\lambda)^{-1}$.
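The resolvent form (5), the moment identity (9) and the similarity formulas (7) can be illustrated on a small example. The sketch below assumes NumPy, a hypothetical quadratic with illustrative coefficients, and an arbitrarily chosen invertible $S$; the function names `col` and `row` are illustrative stand-ins for $C(X,T)$ and $R(T,Y)$.

```python
import numpy as np

def col(X, T, ell):
    """C(X, T) of (8): block column X, XT, ..., XT^{l-1}."""
    return np.vstack([X @ np.linalg.matrix_power(T, k) for k in range(ell)])

def row(T, Y, ell):
    """R(T, Y) of (8): block row Y, TY, ..., T^{l-1}Y."""
    return np.hstack([np.linalg.matrix_power(T, k) @ Y for k in range(ell)])

# Hypothetical quadratic (n = 2, l = 2) with nonsingular leading coefficient A2.
A0 = np.array([[1.0, -2.0], [-2.0, 0.0]])
A1 = np.array([[0.0, 1.0], [1.0, 3.0]])
A2 = np.array([[2.0, 1.0], [1.0, -1.0]])
n, ell = 2, 2
A2_inv = np.linalg.inv(A2)
C_R = np.block([[np.zeros((n, n)), np.eye(n)], [-A2_inv @ A0, -A2_inv @ A1]])
A = np.block([[A1, A2], [A2, np.zeros((n, n))]])
X0 = np.hstack([np.eye(n), np.zeros((n, n))])    # primitive triple (4)
Y0 = np.vstack([np.zeros((n, n)), A2_inv])

# Resolvent form (5) at a test point that is not an eigenvalue (det L(3) = -100).
lam = 3.0
L_lam = A2 * lam**2 + A1 * lam + A0
resolvent = X0 @ np.linalg.inv(lam * np.eye(ell * n) - C_R) @ Y0
resolvent_ok = np.allclose(resolvent, np.linalg.inv(L_lam))

# Moment identity (9): C(X, T) R(T, Y) = A^{-1}.
moments_ok = np.allclose(col(X0, C_R, ell) @ row(C_R, Y0, ell), np.linalg.inv(A))

# A similar triple as in (6), then recovery of S by both formulas in (7).
S = np.eye(ell * n) + np.triu(np.ones((ell * n, ell * n)), 1)  # arbitrary invertible S
S_inv = np.linalg.inv(S)
X1, T1, Y1 = X0 @ S_inv, S @ C_R @ S_inv, S @ Y0
S_from_C = np.linalg.solve(col(X1, T1, ell), col(X0, C_R, ell))
S_from_R = row(T1, Y1, ell) @ np.linalg.inv(row(C_R, Y0, ell))
print(resolvent_ok, moments_ok, np.allclose(S_from_C, S), np.allclose(S_from_R, S))
```

Working from the primitive triple keeps the check transparent, since $C(X_0,C_R)$ is the identity matrix.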

3. Standard triples and hermitian systems

Now we consider the notions of standard pairs and triples in the context of real symmetric or complex hermitian matrix polynomials. Using the resolvent form, the following result is easily proved (see Corollary 14.2.1 of [12]):

Theorem 3.1. (a) A real matrix polynomial $L(\lambda)$ has symmetric coefficients if and only if, for any real standard triple $(X,T,Y)$ for $L(\lambda)$, $(Y^T,T^T,X^T)$ is also a standard triple for $L(\lambda)$.
(b) A complex matrix polynomial $L(\lambda)$ has hermitian coefficients if and only if, for any complex standard triple $(X,T,Y)$ for $L(\lambda)$, $(Y^*,T^*,X^*)$ is also a standard triple for $L(\lambda)$.

The following definitions are critical, and may be unfamiliar. Observe that the statements make no reference to matrix polynomials.

Definition 3.2. (a) A real standard triple $(X,T,Y)$ is said to be real selfadjoint if there is a real nonsingular symmetric matrix $H$ for which

$$
Y^T=XH^{-1},\qquad T^T=HTH^{-1},\qquad X^T=HY. \tag{11}
$$

(b) A complex standard triple $(X,T,Y)$ is said to be selfadjoint if there is a nonsingular hermitian matrix $H$ for which

$$
Y^*=XH^{-1},\qquad T^*=HTH^{-1},\qquad X^*=HY. \tag{12}
$$

Note that, because of the symmetry imposed on $H$, the first and third of the relations in (11) and (12) are equivalent. Note also that this is not the same use of “selfadjoint triple” as that of [4, p. 261] but an elementary adaptation of the definition given in [8, p. 244] for hermitian polynomial matrices. The following example shows that “real selfadjoint standard triples” may not be recognizable by inspection.

Example 3.3. Let $L(\lambda)$ be a real matrix polynomial and consider a standard triple of the form (4). It is, of course, a real standard triple if $L(\lambda)$ has real coefficients. Furthermore, if $L(\lambda)$ is real and symmetric and we define $H=A$ (the block-symmetric matrix of (2)), then it can be verified that (11) holds, i.e., this standard triple is real selfadjoint.

Let us take a closer look at the definition of selfadjoint triples. We will focus on the real case, but using the translation “real ↔ complex”, “symmetric ↔ hermitian” and “$T$ ↔ $*$” the same results and proofs hold for complex matrices. First we show that it is not necessary to require that $H$ be symmetric in Definition 3.2 of a selfadjoint triple.

Theorem 3.4. A real standard triple $(X,T,Y)$ is selfadjoint if and only if it is similar to $(Y^T,T^T,X^T)$.

Proof. It is clear from (11) that if $(X,T,Y)$ is selfadjoint then it is similar to $(Y^T,T^T,X^T)$. For the converse, assume that there is an invertible matrix $H$ such that (11) holds. We are to prove that $H$ is symmetric. As $T^T=HTH^{-1}$ and $X^T=HY$ it follows that (with $R$ as in (8)) $R(T^T,X^T)=HR(T,Y)$ and so

$$
H=R(T^T,X^T)R(T,Y)^{-1}=C(X,T)^TR(T,Y)^{-1}.
$$

But also $Y^T=XH^{-1}$. Then $C(Y^T,T^T)=C(X,T)H^{-1}$. That is to say,

$$
H=C(Y^T,T^T)^{-1}C(X,T)=R(T,Y)^{-T}C(X,T).
$$

These two expressions for $H$ yield $H=H^T$. □

It follows from Theorems 3.4 and 3.1 that all real standard triples of a real symmetric matrix polynomial are selfadjoint (compare with [8, Th. 12.2.2]):

Theorem 3.5. Let $L(\lambda)$ have real coefficients with $A_\ell$ nonsingular. Then:
(a) If $L(\lambda)$ admits a real selfadjoint standard triple then it is real and symmetric.
(b) If $L(\lambda)$ is real and symmetric then all its real standard triples are selfadjoint.

Proof. (a) If $(X,T,Y)$ is a real selfadjoint standard triple for the matrix polynomial $L(\lambda)$ then, by Theorem 3.4, $(X,T,Y)$ and $(Y^T,T^T,X^T)$ are similar. Hence, by Theorem 2.5, $(Y^T,T^T,X^T)$ is also a standard triple of $L(\lambda)$. Therefore, by Theorem 3.1, the coefficients of $L(\lambda)$ are symmetric.
(b) If $L(\lambda)$ is a real selfadjoint matrix polynomial and $(X,T,Y)$ is a real standard triple for $L(\lambda)$ then $(Y^T,T^T,X^T)$ is also a standard triple and, according to Theorem 2.5, $(X,T,Y)$ and $(Y^T,T^T,X^T)$ are similar. Then Theorem 3.4 implies that $(X,T,Y)$ is selfadjoint. □

Recall that the similarity matrix for two similar standard triples $(X,T,Y)$ and $(X_1,T_1,Y_1)$ is given by (7). In the selfadjoint case $(Y^T,T^T,X^T)$ plays the role of $(X_1,T_1,Y_1)$. Thus:

Proposition 3.6. If $(X,T,Y)$ is a real selfadjoint triple then there is one and only one nonsingular real symmetric matrix $H$ such that

$$
Y^T=XH^{-1},\qquad T^T=HTH^{-1},\qquad\text{and}\qquad X^T=HY.
$$

This matrix is given by any of the four equivalent expressions:

(i) $H=R(T,Y)^{-T}C(X,T)$,
(ii) $H=C(X,T)^TR(T,Y)^{-1}$,
(iii) $H=C(X,T)^TAC(X,T)$,
(iv) $H=R(T,Y)^{-T}\,\Gamma\,R(T,Y)^{-1}$,

where $A$ is the matrix (2) of the coefficients of the unique matrix polynomial $L(\lambda)$ for which $(X,T,Y)$ is a selfadjoint triple and $\Gamma$ is the matrix (10) of its moments.

Proof. Everything is known but the expressions for $H$ of items (iii) and (iv). Recall that (cf. (9)) $A^{-1}=C(X,T)R(T,Y)=\Gamma$. Thus

$$
H=C(X,T)^TR(T,Y)^{-1}=C(X,T)^TAC(X,T)
$$

and also

$$
H=R(T,Y)^{-T}C(X,T)=R(T,Y)^{-T}\,\Gamma\,R(T,Y)^{-1}.
$$

□
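As a numerical illustration of Proposition 3.6 (and of Example 3.3), the sketch below computes the four expressions for $H$ for a real standard triple of a hypothetical real symmetric quadratic and checks the relations (11). It assumes NumPy; the coefficients, the similarity matrix $S$ and the helper names `col` and `row` are chosen only for illustration.

```python
import numpy as np

def col(X, T, ell):
    """C(X, T): block column X, XT, ..., XT^{l-1}."""
    return np.vstack([X @ np.linalg.matrix_power(T, k) for k in range(ell)])

def row(T, Y, ell):
    """R(T, Y): block row Y, TY, ..., T^{l-1}Y."""
    return np.hstack([np.linalg.matrix_power(T, k) @ Y for k in range(ell)])

# Hypothetical real symmetric quadratic (n = 2, l = 2), indefinite leading coefficient.
A0 = np.array([[1.0, -2.0], [-2.0, 0.0]])
A1 = np.array([[0.0, 1.0], [1.0, 3.0]])
A2 = np.array([[2.0, 1.0], [1.0, -1.0]])
n, ell = 2, 2
A2_inv = np.linalg.inv(A2)
C_R = np.block([[np.zeros((n, n)), np.eye(n)], [-A2_inv @ A0, -A2_inv @ A1]])
A = np.block([[A1, A2], [A2, np.zeros((n, n))]])
X0 = np.hstack([np.eye(n), np.zeros((n, n))])
Y0 = np.vstack([np.zeros((n, n)), A2_inv])

# Example 3.3: for the primitive triple, expression (ii) returns H = A.
H_primitive = col(X0, C_R, ell).T @ np.linalg.inv(row(C_R, Y0, ell))

# A real standard triple similar to the primitive one (arbitrary invertible S).
S = np.eye(ell * n) + np.triu(np.ones((ell * n, ell * n)), 1)
S_inv = np.linalg.inv(S)
X, T, Y = X0 @ S_inv, S @ C_R @ S_inv, S @ Y0

Cm, Rm_inv = col(X, T, ell), np.linalg.inv(row(T, Y, ell))
Gamma = np.linalg.inv(A)                 # matrix of moments, by (9)
H_i = Rm_inv.T @ Cm
H_ii = Cm.T @ Rm_inv
H_iii = Cm.T @ A @ Cm
H_iv = Rm_inv.T @ Gamma @ Rm_inv
H_inv = np.linalg.inv(H_ii)

print(np.allclose(H_primitive, A))
print(all(np.allclose(H_ii, H) for H in (H_i, H_iii, H_iv)))
print(np.allclose(H_ii, H_ii.T))
print(np.allclose(Y.T, X @ H_inv), np.allclose(T.T, H_ii @ T @ H_inv),
      np.allclose(X.T, H_ii @ Y))
```

Expression (iii) is the most convenient one numerically when $A$ is known, since it avoids inverting $R(T,Y)$.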

The expressions for $H$ in items (iii) and (iv) reveal the symmetric structure of $H$ more clearly, because both $A$ and $\Gamma$ are symmetric provided that $L(\lambda)$ is symmetric. The following result is an easy consequence of Proposition 3.6:

Corollary 3.7. (a) Let $L(\lambda)$ be a real matrix polynomial with nonsingular leading coefficient, and let $(X,T,Y)$ be a real standard triple for $L(\lambda)$. Then $L(\lambda)$ is symmetric if and only if $H=C(X,T)^TR(T,Y)^{-1}$ is symmetric.
(b) A standard triple $(X,T,Y)$ is selfadjoint if and only if $H$ is symmetric.

Proof. Note first that, from (9), we have $R(T,Y)^{-1}=AC(X,T)$. Now we systematically use the fact that $C(X,T)^TR(T,Y)^{-1}=C(X,T)^TAC(X,T)$.
(a) If $L(\lambda)$ is symmetric then, since $A$ is symmetric, $H=C(X,T)^TR(T,Y)^{-1}=C(X,T)^TAC(X,T)$ is symmetric too. Conversely, if $H=C(X,T)^TR(T,Y)^{-1}=C(X,T)^TAC(X,T)$ is symmetric then $A$ is symmetric, and it is plain that $A$ is symmetric if and only if $A_i$ is symmetric for $i=0,1,\ldots,\ell$.
(b) It was proved in Theorem 3.4 that if $(X,T,Y)$ is real selfadjoint then $H$ is symmetric. Conversely, let $(X,T,Y)$ be a standard triple and $L(\lambda)$ be the matrix polynomial for which $(X,T,Y)$ is a standard triple. If $H$ is symmetric then, by part (a), $L(\lambda)$ is selfadjoint. Then, by Theorem 3.1, $(X,T,Y)$ and $(Y^T,T^T,X^T)$ are standard triples for $L(\lambda)$ and, consequently, they are similar. By Theorem 3.4, $(X,T,Y)$ is selfadjoint. □

We aim to show now that the selfadjoint triples of a selfadjoint matrix polynomial are not only similar but unitarily similar. Let us recall this concept (see [8]). Let $H$ be an $n\times n$ invertible hermitian or symmetric matrix according as $\mathbb{F}=\mathbb{C}$ or $\mathbb{F}=\mathbb{R}$. A matrix $T\in\mathbb{F}^{n\times n}$ is said to be $H$-selfadjoint if $\mathbb{F}=\mathbb{C}$ and $T^*H=HT$, and real $H$-selfadjoint if $\mathbb{F}=\mathbb{R}$ and $T^TH=HT$ (see [8, p. 48]).

Definition 3.8. [8, Sec. 4.5, 6.1] (a) Let $H_1,H_2\in\mathbb{C}^{n\times n}$ be hermitian invertible matrices and let $T_1,T_2\in\mathbb{C}^{n\times n}$ be $H_1$-selfadjoint and $H_2$-selfadjoint, respectively.
Then the pairs $(T_1,H_1)$ and $(T_2,H_2)$ are said to be unitarily similar if there exists an invertible matrix $Q\in\mathbb{C}^{n\times n}$ such that $T_2=Q^{-1}T_1Q$ and $H_2=Q^*H_1Q$ ($Q$ is $(H_1,H_2)$-unitary).
(b) Let $H_1,H_2\in\mathbb{R}^{n\times n}$ be symmetric invertible matrices and let $T_1,T_2\in\mathbb{R}^{n\times n}$ be real $H_1$-selfadjoint and $H_2$-selfadjoint, respectively. Then the pairs $(T_1,H_1)$ and $(T_2,H_2)$ are said to be real unitarily similar if there exists an invertible matrix $Q\in\mathbb{R}^{n\times n}$ such that $T_2=Q^{-1}T_1Q$ and $H_2=Q^TH_1Q$ ($Q$ is $(H_1,H_2)$-orthogonal).

If $(X,T,Y)$ is a (real) selfadjoint triple then there exists an invertible (symmetric) hermitian matrix $H$ such that ($T^T=HTH^{-1}$) $T^*=HTH^{-1}$; i.e., the “main” matrix $T$ is (real) $H$-selfadjoint. Recall that such a matrix $H$ is unique and is given by any of the four expressions in Proposition 3.6.

Proposition 3.9. Let $(X_1,T_1,Y_1)$ and $(X_2,T_2,Y_2)$ be (real) selfadjoint triples and, for $i=1,2$, let $H_i$ be the (symmetric) hermitian matrix such that ($T_i^T=H_iT_iH_i^{-1}$) $T_i^*=H_iT_iH_i^{-1}$ and ($Y_i=H_i^{-1}X_i^T$) $Y_i=H_i^{-1}X_i^*$. If $(X_1,T_1,Y_1)$ and $(X_2,T_2,Y_2)$ are similar then $(T_1,H_1)$ and $(T_2,H_2)$ are (real) unitarily similar.

Proof. The proof will be given for the real case; the complex case is proved similarly. Assume that $(X_1,T_1,Y_1)$ and $(X_2,T_2,Y_2)$ are similar and let $S$ be the unique nonsingular matrix such that (cf. (7))

$$
X_2=X_1S^{-1},\qquad T_2=ST_1S^{-1},\qquad Y_2=SY_1.
$$

Such a matrix has the form $S=C(X_2,T_2)^{-1}C(X_1,T_1)$. Then $T_2=ST_1S^{-1}$ and

$$
H_2=C(X_2,T_2)^TR(T_2,Y_2)^{-1}=S^{-T}C(X_1,T_1)^TR(T_1,Y_1)^{-1}S^{-1}=S^{-T}H_1S^{-1}.
$$

If $Q=S^{-1}$ we have $T_2=Q^{-1}T_1Q$ and $H_2=Q^TH_1Q$, so that $(T_1,H_1)$ and $(T_2,H_2)$ are real unitarily similar. □

It will be important for us to note that the converse of this proposition is not true in general. That is to say, given two selfadjoint triples $(X_1,T_1,Y_1)$ and $(X_2,T_2,Y_2)$, the fact that $(T_1,H_1)$ and $(T_2,H_2)$ are unitarily similar does not guarantee that $(X_1,T_1,Y_1)$ and $(X_2,T_2,Y_2)$ are similar. A proof of this will need the use of the sign characteristic, a concept that will be introduced in the following section. We defer that proof until the necessary concepts have been discussed (see Section 4.2).

However, for a given (real) selfadjoint matrix polynomial, all its (real) selfadjoint triples can be obtained from the “companion” triple defined in (4). It has already been shown that, provided that $L(\lambda)$ is selfadjoint, this primitive triple is also selfadjoint with symmetric (hermitian in the complex case) matrix $A$ of (2). It turns out that all selfadjoint standard triples for $L(\lambda)$ can be obtained by applying unitary similarity to $(C_R,A)$:

Proposition 3.10. Let $L(\lambda)$ be a selfadjoint matrix polynomial with nonsingular leading coefficient and let its primitive selfadjoint triple be $(X_0,C_R,Y_0)$ as given in (4). Let $H$ be an $\ell n\times \ell n$ symmetric (hermitian if $\mathbb{F}=\mathbb{C}$) invertible matrix and let $T\in\mathbb{F}^{\ell n\times \ell n}$ be $H$-selfadjoint. If $(T,H)$ and $(C_R,A)$ are unitarily similar with $T=Q^{-1}C_RQ$ and $H=Q^TAQ$ ($H=Q^*AQ$ if $\mathbb{F}=\mathbb{C}$) then $(X_0Q,\,T,\,H^{-1}Q^TX_0^T)$ ($(X_0Q,\,T,\,H^{-1}Q^*X_0^*)$ if $\mathbb{F}=\mathbb{C}$) is a selfadjoint triple of $L(\lambda)$.

Proof. Again, the proof is given in the real case. The proof in the complex case is similar. Define $X=X_0Q$ and $Y=H^{-1}Q^TX_0^T$. We have to prove first that $(X,T,Y)$ is similar to $(X_0,C_R,Y_0)$. In fact, it follows from (2) and (4) that $Y_0=A^{-1}X_0^T$ and so

$$
Y=H^{-1}Q^TX_0^T=H^{-1}Q^TAY_0=Q^{-1}A^{-1}Q^{-T}Q^TAY_0=Q^{-1}Y_0.
$$

Thus $(X,T,Y)=(X_0Q,\,Q^{-1}C_RQ,\,Q^{-1}Y_0)$ is the required similarity, and $(X,T,Y)$ is a standard triple of $L(\lambda)$.

We are to prove next that $Y=H^{-1}X^T$ and $T^T=HTH^{-1}$. The first follows from the definition of $Y$. Now, bearing in mind that $C_R$ is $A$-selfadjoint ($C_R^TA=AC_R$), we have

$$
\begin{aligned}
T^T &= Q^TC_R^TQ^{-T}=Q^TAC_RA^{-1}Q^{-T}=Q^TAQ\,Q^{-1}C_RQ\,Q^{-1}A^{-1}Q^{-T}\\
&= HTH^{-1},
\end{aligned}
$$

and the proposition follows. □

This argument provides a construction of (real) selfadjoint triples of a (real) selfadjoint matrix polynomial using unitary similarity, and we have used the primitive triple $(X_0,C_R,Y_0)$ in this construction. However, this role could be played by any selfadjoint triple of $L(\lambda)$ and the proposition still holds, with the same proof. In other words, let $(X_1,T_1,Y_1)$ be a real selfadjoint triple of $L(\lambda)$ and $H_1$ the symmetric invertible matrix satisfying $T_1^TH_1=H_1T_1$, $Y_1=H_1^{-1}X_1^T$. If $T=Q^{-1}T_1Q$ and $H=Q^TH_1Q$ then, exactly as in Proposition 3.10 (with $X_1$, $T_1$, $H_1$ in place of $X_0$, $C_R$, $A$), $(X_1Q,\,T,\,H^{-1}Q^TX_1^T)$ is a real selfadjoint standard triple of $L(\lambda)$. A similar result holds in the complex case.
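The construction of Proposition 3.10 is easy to try out numerically in the real case. The following sketch assumes NumPy, a hypothetical real symmetric quadratic and an arbitrarily chosen invertible real $Q$; it checks that the resulting triple is selfadjoint and that it satisfies the criterion of Theorem 2.2.

```python
import numpy as np

# Hypothetical real symmetric quadratic (n = 2, l = 2); coefficients are illustrative.
A0 = np.array([[1.0, -2.0], [-2.0, 0.0]])
A1 = np.array([[0.0, 1.0], [1.0, 3.0]])
A2 = np.array([[2.0, 1.0], [1.0, -1.0]])
n = 2
A2_inv = np.linalg.inv(A2)
C_R = np.block([[np.zeros((n, n)), np.eye(n)], [-A2_inv @ A0, -A2_inv @ A1]])
A = np.block([[A1, A2], [A2, np.zeros((n, n))]])
X0 = np.hstack([np.eye(n), np.zeros((n, n))])
Y0 = np.vstack([np.zeros((n, n)), A2_inv])

# Any invertible real Q gives a pair (T, H) real unitarily similar to (C_R, A).
Q = np.eye(2 * n) + np.triu(np.ones((2 * n, 2 * n)), 1)
Q_inv = np.linalg.inv(Q)
T = Q_inv @ C_R @ Q
H = Q.T @ A @ Q
X = X0 @ Q
Y = np.linalg.inv(H) @ Q.T @ X0.T

H_is_symmetric = np.allclose(H, H.T)
T_is_H_selfadjoint = np.allclose(T.T @ H, H @ T)
Y_matches_similarity = np.allclose(Y, Q_inv @ Y0)
# Theorem 2.2 for a quadratic: L(X, T) = A2 X T^2 + A1 X T + A0 X must vanish.
L_of_pair = A2 @ X @ T @ T + A1 @ X @ T + A0 @ X
print(H_is_symmetric, T_is_H_selfadjoint, Y_matches_similarity,
      np.allclose(L_of_pair, 0))
```

Here $Q$ is not required to be orthogonal in the Euclidean sense; “unitary” refers only to the indefinite inner products defined by $A$ and $H$.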

4. Canonical structures for hermitian polynomials

In this section we review some canonical structures for hermitian (and especially real symmetric) matrix polynomials. Results in the complex hermitian case are well known, and will serve to set the scene before discussing the less well-known case of real symmetric matrix polynomials.

First consider the standard triple $(X_0,C_R,Y_0)$ of (4). It follows from Theorem 3.1 that $(Y_0^*,C_R^*,X_0^*)$ is also a standard triple. Furthermore, with $A$ given by (2),

$$
C_R^*=AC_RA^{-1}\quad\text{or}\quad AC_R=C_R^*A. \tag{13}
$$

Thus, we have a hermitian pair: $A^*=A$ and $(AC_R)^*=AC_R$ with $A$ nonsingular. And, of course, this becomes a real symmetric pair when $L(\lambda)$ has real symmetric coefficients.

Canonical structures for $L(\lambda)$ are now determined by congruence transformations applied simultaneously to $A$ and $AC_R$ over $\mathbb{F}$. For a congruence $Q^*AQ=P$ and $Q^*(AC_R)Q=PJ$, it follows that $C_R=QJQ^{-1}$, and this shows that, for $Q$, we can use a similarity transformation of $C_R$ to Jordan canonical form, over $\mathbb{C}$ or $\mathbb{R}$ as the case may be. Simultaneous congruence transformations of this kind have been reviewed recently in [11]. Furthermore, the invertibility of $A_\ell$ (and hence $A$) assumed here removes troublesome singular structures appearing in the general case. (The hermitian case appears as Theorem 5.1.1 of [8].)

Definition 4.1. A standard triple of the form $(X,J,Y)$ (where $J$ is a matrix in Jordan form) is said to be a Jordan triple. If, in addition, $J$ is a Jordan form of the companion matrix $C_R$ for a matrix polynomial $L(\lambda)$, then $(X,J,Y)$ is said to be a Jordan triple for $L(\lambda)$.

Then, recalling Definition 3.2 of selfadjoint standard triples:

Definition 4.2. A (real) Jordan triple $(X,J,Y)$ will be called a (real) selfadjoint Jordan triple if $(X,J,Y)$ is a (real) selfadjoint standard triple.

Of course, in these definitions, the Jordan form is that over $\mathbb{R}$, or over $\mathbb{C}$, as appropriate. As in the classical case, $L(\lambda)=I_n\lambda-A$, the Jordan form not only displays the elementary divisor structure of the eigenvalues, but also encodes complete information on eigenvector chains. However, the details are different in the complex and real cases.

4.1. The complex hermitian case

The structure of Jordan triples over $\mathbb{C}$ is familiar from the works of Gohberg et al., and will be summarized here. The structure of Jordan triples over $\mathbb{R}$ may be less familiar, and is the topic of the next section.

To help in the description of canonical forms we introduce the primitive $m\times m$ matrices

$$
F_m=\begin{bmatrix}
0 & \cdots & 0 & 1\\
\vdots & & 1 & 0\\
0 & & & \vdots\\
1 & 0 & \cdots & 0
\end{bmatrix},\qquad
G_m=\begin{bmatrix}
0 & \cdots & 1 & 0\\
\vdots & & 0 & 0\\
1 & & & \vdots\\
0 & 0 & \cdots & 0
\end{bmatrix}
=\begin{bmatrix} F_{m-1} & 0\\ 0 & 0\end{bmatrix}. \tag{14}
$$

Note also that $F_1=1$, $G_1=0$. In the following formulae, $m$ will always be the degree of an elementary divisor of the pencil $\lambda I-C_R$ or, what is equivalent, the hermitian pencil $\lambda A-AC_R$.
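The roles of the primitive matrices $F_m$ and $G_m$ can be illustrated for a single real eigenvalue. The sketch below assumes NumPy and takes $G_m$ to be $F_{m-1}$ bordered by a zero row and column; the values of $m$, $\alpha$ and the sign $\varepsilon$ are chosen only for illustration.

```python
import numpy as np

def F(m):
    """m x m matrix with ones on the antidiagonal."""
    return np.fliplr(np.eye(m))

def G(m):
    """m x m matrix [[F_{m-1}, 0], [0, 0]]; G(1) is the 1 x 1 zero matrix."""
    out = np.zeros((m, m))
    out[: m - 1, : m - 1] = F(m - 1)
    return out

m, alpha, eps = 4, 2.5, -1                  # illustrative block size, eigenvalue, sign
lower_shift = np.diag(np.ones(m - 1), -1)   # ones on the first subdiagonal
J_block = alpha * np.eye(m) + F(m) @ G(m)   # lower triangular Jordan block
P_block = eps * F(m)

print(np.array_equal(F(m) @ G(m), lower_shift))
print(np.allclose(P_block @ J_block, eps * (alpha * F(m) + G(m))))
print(np.allclose(J_block.T @ P_block, P_block @ J_block))  # J is P-selfadjoint
print(np.allclose(P_block @ P_block, np.eye(m)))            # P is involutory
```

The product $F_mG_m$ is the nilpotent shift with ones on the first subdiagonal, so $\alpha I_m+F_mG_m$ is a Jordan block written in lower triangular form.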
Suppose that $L(\lambda)$ is hermitian and has exactly $q$ real elementary divisors with associated real eigenvalues $\alpha_1,\ldots,\alpha_q$ (not necessarily distinct), and let the degrees of these elementary divisors be $l_1,\ldots,l_q$, respectively. Also, let there be exactly $s$ pairs of non-real conjugate eigenvalues, $(\beta_1,\bar\beta_1),\ldots,(\beta_s,\bar\beta_s)$, with associated elementary divisors of degrees $m_1,\ldots,m_s$, respectively. Then if $L(\lambda)$ is $n\times n$ with degree $\ell$ (and $\det A_\ell\neq 0$) we will have

$$
\ell n=\sum_{j=1}^{q}l_j+2\sum_{k=1}^{s}m_k .
$$

Now it will be convenient to introduce the notation $\bigoplus_{j=1}^{q}M_j$ to denote a direct (block diagonal) sum of matrices $M_1,\ldots,M_q$. There is a complex congruence transformation which, when applied to $\lambda A-AC_R$, produces a hermitian pencil $\lambda P-PJ$, where

$$
P=\bigoplus_{j=1}^{q}\varepsilon_jF_{l_j}\;\oplus\;\bigoplus_{k=1}^{s}F_{2m_k}, \tag{15}
$$

and

$$
PJ=\bigoplus_{j=1}^{q}\varepsilon_j\bigl(\alpha_jF_{l_j}+G_{l_j}\bigr)\;\oplus\;\bigoplus_{k=1}^{s}
\begin{bmatrix} 0 & \beta_kF_{m_k}+G_{m_k}\\ \bar\beta_kF_{m_k}+G_{m_k} & 0\end{bmatrix}. \tag{16}
$$

The numbers $\varepsilon_1,\ldots,\varepsilon_q$ are each equal to either $+1$ or $-1$ and, together, they are known as the “sign characteristic” of the system (either $L(\lambda)$ or $\lambda A-AC_R$). This concept plays an important role in perturbation theory for matrix polynomials, as well as for more general matrix functions (see [6], for example). The reduced forms (15) and (16) are obtained from Theorem 6.1 of [11], where existence and uniqueness arguments can be found.

To emphasize the dependence of the structure of $P$ on the sign characteristic, $\varepsilon:=\{\varepsilon_1,\ldots,\varepsilon_q\}$, and on the more primitive Jordan matrix, $J$, we write $P=P_{\varepsilon,J}$. It is important to note that $P_{\varepsilon,J}$ is involutory: $P_{\varepsilon,J}^2=I$, so that $P_{\varepsilon,J}^{-1}=P_{\varepsilon,J}$. From (15) and (16) we deduce that the corresponding Jordan canonical form is

$$
J=\bigoplus_{j=1}^{q}\bigl(\alpha_jI_{l_j}+F_{l_j}G_{l_j}\bigr)\;\oplus\;\bigoplus_{k=1}^{s}
\begin{bmatrix} \bar\beta_kI_{m_k}+F_{m_k}G_{m_k} & 0\\ 0 & \beta_kI_{m_k}+F_{m_k}G_{m_k}\end{bmatrix}, \tag{17}
$$

and $J^*P_{\varepsilon,J}=P_{\varepsilon,J}J$, so that $J$ is $P_{\varepsilon,J}$-selfadjoint. If all elementary divisors of $L(\lambda)$ are linear then $J$ will be diagonal, but $P_{\varepsilon,J}$ will be diagonal only if all eigenvalues are real; otherwise $P_{\varepsilon,J}$ is tridiagonal.

Since $\lambda A-AC_R$ and $\lambda P_{\varepsilon,J}-P_{\varepsilon,J}J$ are congruent pencils, there is an invertible complex matrix $Q$ such that $P_{\varepsilon,J}=Q^*AQ$ and $P_{\varepsilon,J}J=Q^*AC_RQ$. Then $P_{\varepsilon,J}J=Q^*AQ\,Q^{-1}C_RQ=P_{\varepsilon,J}Q^{-1}C_RQ$. But $P_{\varepsilon,J}$ is invertible. Therefore $(C_R,A)$ and $(J,P_{\varepsilon,J})$ are unitarily similar. If we put $X=X_0Q$, where $(X_0,C_R,Y_0)$ is the primitive selfadjoint triple of $L(\lambda)$ given in (4), then by Proposition 3.10, and bearing in mind that $P_{\varepsilon,J}^{-1}=P_{\varepsilon,J}$, $(X,J,P_{\varepsilon,J}X^*)$ is a selfadjoint triple of $L(\lambda)$. We have just proved:

Theorem 4.3. If $L(\lambda)$ is hermitian and $A_\ell$ is nonsingular, then there exists a selfadjoint Jordan triple of the form $(X,J,P_{\varepsilon,J}X^*)$ with $J$ and $P_{\varepsilon,J}$ as in (17) and (15). The set of numbers $\varepsilon$ is determined uniquely by $L(\lambda)$, up to permutation of signs in the blocks of $P_{\varepsilon,J}$ corresponding to the Jordan blocks of $J$ with the same real eigenvalue and the same size.

We emphasize that the definition of a real selfadjoint Jordan triple given in Definition 4.2 is more general than that of [3] and [4]. In fact, in [3] and [4] a standard triple $(X,T,Y)$ is a selfadjoint Jordan triple if $T=J$ (a matrix in Jordan form) and $Y=P_{\varepsilon,J}X^*$. This is more restrictive than Definition 4.2, where $(X,T,Y)$ qualifies as a selfadjoint Jordan triple provided that it is a standard triple, $T=J$, and $Y=H^{-1}X^*$ where $H$ is any nonsingular hermitian matrix such that $J^*H=HJ$. Thus, if $(X,J,P_{\varepsilon,J}X^*)$ is a standard triple, it is a selfadjoint Jordan triple in both cases, but if $Q$ is an invertible matrix such that $J=Q^{-1}JQ$ and we define $H=Q^*P_{\varepsilon,J}Q$ then $H$ is hermitian and nonsingular, $J^*H=HJ$ and $(X,J,H^{-1}X^*)$ is a selfadjoint Jordan triple for $L(\lambda)$ in the sense of Definition 4.2, but it is not (unless $H=P_{\varepsilon,J}$) in the sense of [3] and [4]. To stress the difference between the two definitions, compare Theorem 1.10 of [3] with the fact that, according to the complex version of Theorem 3.5 (or [8, Th. 12.2.2]), all Jordan triples of a hermitian matrix polynomial are selfadjoint.

4.2. The real symmetric case

When the coefficients of $L(\lambda)$ are real and symmetric, the matrices $A$ and $AC_R$ (of (13)) are real and symmetric. Now the simultaneous reduction of these two matrices by congruence can be completed over the real field. Again, a complete discussion can be found in [11]. The relevant result of that paper is now Theorem 9.2. We use the same notations and conventions as Section 4.1 concerning the real eigenvalues, non-real conjugate pairs of eigenvalues and the degrees of their elementary divisors.

To handle the case of nonlinear elementary divisors for non-real eigenvalues it is convenient to introduce another primitive symmetric matrix with even size, say $2m\times 2m$:

$$
E_{2m}=\begin{bmatrix}
0 & 0 & \cdots & 1 & 0\\
0 & 0 & \cdots & 0 & -1\\
\vdots & \vdots & & \vdots & \vdots\\
1 & 0 & \cdots & 0 & 0\\
0 & -1 & \cdots & 0 & 0
\end{bmatrix}, \tag{18}
$$

that is, the block matrix with $2\times 2$ blocks $\begin{bmatrix} 1 & 0\\ 0 & -1\end{bmatrix}$ along the block antidiagonal and zeros elsewhere.

To avoid confusion with the hermitian case, we will denote a real Jordan form for a real symmetric polynomial $L(\lambda)$ by $K$.
Also, we write the non-real eigenvalues in real and imaginary parts: $\beta_j=\mu_j+i\nu_j$, for $j=1,\ldots,s$. Now there is a real congruence transformation which, when applied to $\lambda A-AC_R$, produces a real symmetric pencil $\lambda P_{\varepsilon,K}-P_{\varepsilon,K}K$, where

$$
P_{\varepsilon,K}=\bigoplus_{j=1}^{q}\varepsilon_jF_{l_j}\;\oplus\;\bigoplus_{j=1}^{s}F_{2m_j} \tag{19}
$$

and

$$
P_{\varepsilon,K}K=\bigoplus_{j=1}^{q}\varepsilon_j\bigl(\alpha_jF_{l_j}+G_{l_j}\bigr)\;\oplus\;\bigoplus_{j=1}^{s}
\left(\mu_jF_{2m_j}+\nu_jE_{2m_j}+\begin{bmatrix} F_{2m_j-2} & 0\\ 0 & 0_2\end{bmatrix}\right). \tag{20}
$$

The numbers $\varepsilon_1,\ldots,\varepsilon_q$ are each equal to either $+1$ or $-1$ and, together, they are known as the “sign characteristic” of the system (either $L(\lambda)$ or $\lambda A-AC_R$). Once again, the “$P$” matrix is involutory: $P_{\varepsilon,K}^2=I$. We deduce from these two equations that the real Jordan form is:

$$
K=\bigoplus_{j=1}^{q+s}K_j, \tag{21}
$$

where, for $j=1,\ldots,q$,

$$
K_j=\alpha_jI_{l_j}+F_{l_j}G_{l_j},
$$

and for $j=q+1,\ldots,q+s$,

$$
K_j=\mu_jI_{2m_j}+\nu_jF_{2m_j}E_{2m_j}+F_{2m_j}\begin{bmatrix} F_{2m_j-2} & 0\\ 0 & 0_2\end{bmatrix}. \tag{22}
$$

If $U_j=\begin{bmatrix}\mu_j & -\nu_j\\ \nu_j & \mu_j\end{bmatrix}$ for $j=q+1,\ldots,q+s$, then $K_j$ is the $2m_j\times 2m_j$ real matrix

$$
K_j=\begin{bmatrix}
U_j & 0 & \cdots & & 0\\
I_2 & U_j & & & \vdots\\
0 & I_2 & U_j & & \\
\vdots & & \ddots & \ddots & \\
0 & \cdots & & I_2 & U_j
\end{bmatrix}. \tag{23}
$$
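The agreement between the formula (22) and the block form (23) can be confirmed numerically for one non-real pair. The sketch below assumes NumPy and reads $E_{2m}$ in (18) as the Kronecker product $F_m\otimes\mathrm{diag}(1,-1)$; the values of $\mu$, $\nu$ and $m$ are illustrative and the function names are not from the paper.

```python
import numpy as np

def F(m):
    """m x m matrix with ones on the antidiagonal."""
    return np.fliplr(np.eye(m))

def E(two_m):
    """E_{2m} of (18): 2 x 2 blocks diag(1, -1) along the block antidiagonal."""
    return np.kron(F(two_m // 2), np.diag([1.0, -1.0]))

def K_formula(mu, nu, m):
    """K_j from (22) for the pair mu +/- i*nu with elementary divisors of degree m."""
    pad = np.zeros((2 * m, 2 * m))
    pad[: 2 * m - 2, : 2 * m - 2] = F(2 * m - 2)   # [[F_{2m-2}, 0], [0, 0_2]]
    return mu * np.eye(2 * m) + nu * F(2 * m) @ E(2 * m) + F(2 * m) @ pad

def K_blocks(mu, nu, m):
    """K_j in the form (23): U_j on the block diagonal, I_2 on the block subdiagonal."""
    U = np.array([[mu, -nu], [nu, mu]])
    return np.kron(np.eye(m), U) + np.kron(np.diag(np.ones(m - 1), -1), np.eye(2))

mu, nu, m = 0.5, 2.0, 3          # illustrative values
K = K_formula(mu, nu, m)
P = F(2 * m)                     # the block of P_{eps,K} for a non-real pair, see (19)
print(np.allclose(K, K_blocks(mu, nu, m)))
print(np.allclose(P @ K, (P @ K).T))       # P K is symmetric, as in (20)
print(np.allclose(K.T @ P, P @ K))         # K is real P-selfadjoint
```

No sign appears in the blocks belonging to non-real pairs, which is why the sign characteristic only involves the real eigenvalues.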

Notice also that the matrices of (22) have a familiar “Jordan” structure. Thus, when $l_j=3$, for example,

$$
\alpha_jI_{l_j}+F_{l_j}G_{l_j}=\begin{bmatrix}\alpha_j & 0 & 0\\ 1 & \alpha_j & 0\\ 0 & 1 & \alpha_j\end{bmatrix}.
$$

For semisimple real systems, $K$ and $P_{\varepsilon,K}$ are diagonal if all eigenvalues are real and, if there is at least one non-real eigenvalue pair, then both are tridiagonal.

As in the complex case, the real congruence relating $\lambda A-AC_R$ and $\lambda P_{\varepsilon,K}-P_{\varepsilon,K}K$ yields the existence of a real invertible matrix $S$ such that

$$
S^{-1}C_RS=K\quad\text{and}\quad S^TAS=P_{\varepsilon,K}. \tag{24}
$$

Now, the fundamental symmetry $AC_R=C_R^TA$ is equivalent to

$$
P_{\varepsilon,K}K=K^TP_{\varepsilon,K}; \tag{25}
$$

i.e., $K$ is real $P_{\varepsilon,K}$-selfadjoint and so $(C_R,A)$ and $(K,P_{\varepsilon,K})$ are real unitarily similar. If $X_\rho=X_0S$, where $(X_0,C_R,Y_0)$ is the primitive real selfadjoint triple of $L(\lambda)$ given in (4), then by Proposition 3.10, and bearing in mind that $P_{\varepsilon,K}^{-1}=P_{\varepsilon,K}$, $(X_\rho,K,P_{\varepsilon,K}X_\rho^T)$ is a real selfadjoint triple of $L(\lambda)$. This proves the existence of real Jordan selfadjoint triples for real selfadjoint matrix polynomials:

Theorem 4.4. If $L(\lambda)$ is real and symmetric and $A_\ell$ is nonsingular, then there exists a real selfadjoint Jordan triple of the form $(X_\rho,K,P_{\varepsilon,K}X_\rho^T)$ with $K$ and $P_{\varepsilon,K}$ as in (21) and (19). The set of numbers $\varepsilon$ is determined uniquely by $L(\lambda)$, up to permutation of signs in the blocks of $P_{\varepsilon,K}$ corresponding to the Jordan blocks of $K$ with the same real eigenvalue and the same size.

In both the real and complex cases, we have two independent systems of invariants associated with a selfadjoint matrix polynomial: the elementary divisors and the sign characteristic. The first system of invariants defines the structure of the real or complex Jordan form and the first and second together determine the structure of the canonical form $P_{\varepsilon,K}$. Following [10] we say that two selfadjoint matrix polynomials are strictly isospectral if and only if they have the same elementary divisors and the same sign characteristic.

We are now in position to prove that the converse of Proposition 3.9 does not hold in general. The proof is the same for real or complex matrix polynomials and we are going to focus on the real case. Assume that $L_1(\lambda)$ and $L_2(\lambda)$ are strictly isospectral, $L_1(\lambda)\neq L_2(\lambda)$ and, for $i=1,2$, let $C_{Ri}$ and $A_i$ be the right companion form and the block-symmetric matrix given by (2) associated with $L_i(\lambda)$. Then, since $L_1(\lambda)$ and $L_2(\lambda)$ share the same canonical forms $K$ and $P_{\varepsilon,K}$, we conclude from (24) that $(C_{R1},A_1)$ and $(C_{R2},A_2)$ are real unitarily similar. Now, for $i=1,2$, $(X_{0i},C_{Ri},Y_{0i})$ given by (4) is a real selfadjoint triple of $L_i(\lambda)$, and $(X_{01},C_{R1},Y_{01})$ and $(X_{02},C_{R2},Y_{02})$ are not similar provided that $L_1(\lambda)\neq L_2(\lambda)$. For if they were similar there would be a nonsingular matrix $T$ such that

$$
(X_{02},C_{R2},Y_{02})=(X_{01}T,\;T^{-1}C_{R1}T,\;T^{-1}Y_{01}).
$$

Then $C_{R1}T=TC_{R2}$ and $T$ must have the form

$$
T=\begin{bmatrix} Z\\ ZC_{R2}\\ \vdots\\ ZC_{R2}^{\ell-1}\end{bmatrix}
$$

for some full row rank matrix $Z$. But $X_{01}=X_{02}=\begin{bmatrix} I_n & 0 & \cdots & 0\end{bmatrix}$. The condition $X_{02}=X_{01}T$ implies $Z=X_{01}=X_{02}$. Then $T=I_{\ell n}$ and $C_{R1}=C_{R2}$, which is a contradiction because $L_1(\lambda)\neq L_2(\lambda)$.

440

P. Lancaster and I. Zaballa

5. Chains of generalized eigenvectors The canonical structures ensured by Theorems 4.3 and 4.4 lead to the idea of “chains of generalized eigenvectors” which are important in perturbation theory and in many applications. In this section we show how these ideas ﬁt into the constructions of this paper. 5.1. Real Jordan triples and real eigenvector chains Given a real selfadjoint triple for a real symmetric system, as in Theorem 4.4, the real matrices (𝑋𝜌 , 𝐾) form a standard pair and (see Theorem 2.2) 𝐿(𝑋𝜌 , 𝐾) =

ℓ ∑

𝐴𝑗 𝑋𝜌 𝐾 𝑗 = 0.

(26)

𝑗=0

Recall the structure of 𝐾 from (21)–(23) and partition 𝑋𝜌 accordingly: ] [ 𝑋𝜌 = 𝑋1 ⋅ ⋅ ⋅ 𝑋𝑞 𝑋𝑞+1 ⋅ ⋅ ⋅ 𝑋𝑞+𝑠 ,

(27)

where the number of columns in 𝑋𝑗 and 𝐾𝑗 agree for each 𝑗 = 1, . . . , 𝑞 + 𝑠. Then for each 𝑗 ] [ 𝑗 , 𝑋𝜌 𝐾 𝑗 = 𝑋1 𝐾1𝑗 ⋅ ⋅ ⋅ 𝑋𝑞+𝑠 𝐾𝑞+𝑠 and it follows from (26) that ℓ ∑

𝐴𝑖 𝑋𝑗 𝐾𝑗𝑖 = 0 for

𝑗 = 1, 2, . . . , 𝑞 + 𝑠.

(28)

𝑖=0

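Now, for a real eigenvalue we have $1 \le j \le q$ and write

$$X_j = \begin{bmatrix} x_{j1} & \cdots & x_{jl_j} \end{bmatrix} \quad \text{and} \quad K_j = \begin{bmatrix} \lambda_j & & & \\ 1 & \lambda_j & & \\ & \ddots & \ddots & \\ & & 1 & \lambda_j \end{bmatrix} \in \mathbb{R}^{l_j \times l_j}.$$

Bearing in mind that $C_RS = SK$ and $X_\rho = X_0S$ (i.e., $X_\rho$ is the submatrix of $S$ formed by its first $n$ rows), the following relations are easily obtained:

$$\begin{aligned}
L(\lambda_j)x_{jl_j} &= 0,\\
L(\lambda_j)x_{j,l_j-1} + L^{(1)}(\lambda_j)x_{jl_j} &= 0,\\
&\;\;\vdots\\
L(\lambda_j)x_{j1} + L^{(1)}(\lambda_j)x_{j2} + \cdots + \frac{1}{(l_j-1)!}L^{(l_j-1)}(\lambda_j)x_{jl_j} &= 0,
\end{aligned} \tag{29}$$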

where 𝐿(𝑘) (𝜆𝑗 ) is the 𝑘th derivative of 𝐿(𝜆) at 𝜆𝑗 . This means that 𝑥𝑗𝑙𝑗 , 𝑥𝑗,𝑙𝑗 −1 ,. . . , 𝑥𝑗1 is a real Jordan chain of 𝐿(𝜆) associated with the real eigenvalue 𝜆𝑗 (see Section 1.4 of [4], for example). In computation, the ﬁrst relation in (29) is used to ﬁnd the eigenvector 𝑥𝑗𝑙𝑗 , the second to ﬁnd 𝑥𝑗,𝑙𝑗 −1 , and so on.
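The chain relations (29) can be checked numerically. The following is a minimal sketch, not part of the original text: the diagonal quadratic $L(\lambda) = \operatorname{diag}((\lambda+1)^2, \lambda^2+1)$, the chain vectors and the helper names `poly_derivative` and `chain_residuals` are all assumptions chosen for illustration. The eigenvalue $-1$ has a chain of length two, and the code evaluates the left-hand side of each relation in (29).

```python
import math
import numpy as np

def poly_derivative(coeffs, lam, order):
    """Evaluate the order-th derivative of L(lam) = sum_j coeffs[j] * lam**j."""
    n = coeffs[0].shape[0]
    value = np.zeros((n, n))
    for j in range(order, len(coeffs)):
        falling = math.factorial(j) // math.factorial(j - order)
        value = value + falling * coeffs[j] * lam ** (j - order)
    return value

def chain_residuals(coeffs, lam, chain):
    """Norms of the left-hand sides of (29).

    chain = [x_1, ..., x_l], ordered as in (29): the last vector x_l is the eigenvector.
    """
    l = len(chain)
    residuals = []
    for k in range(l):  # k = 0 is the relation L(lam) x_l = 0
        r = np.zeros_like(chain[0], dtype=float)
        for i in range(k + 1):
            r = r + poly_derivative(coeffs, lam, i) @ chain[l - 1 - k + i] / math.factorial(i)
        residuals.append(float(np.linalg.norm(r)))
    return residuals

# Hypothetical example: L(lam) = diag((lam + 1)^2, lam^2 + 1), real symmetric, A_2 = I.
A0 = np.diag([1.0, 1.0])
A1 = np.diag([2.0, 0.0])
A2 = np.eye(2)
coeffs = [A0, A1, A2]
lam = -1.0
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

chain = [0.5 * e1, e1]          # x_1 = 0.5 * e1, x_2 = e1 (the eigenvector)
residuals = chain_residuals(coeffs, lam, chain)
not_a_chain = chain_residuals(coeffs, lam, [e2, e1])
```

For the chain above both residuals vanish, while replacing $x_{1}$ by `e2` violates the second relation, so `not_a_chain[1]` is not small.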


5.2. Real chains for non-real eigenvalues

Real Jordan structures associated with non-real eigenvalues are more troublesome. However, using similar ideas, we can obtain real analogues of (29) for non-real eigenvalues. First, we define matrix functions with arguments $Y \in \mathbb{R}^{n \times p}$ and $M \in \mathbb{R}^{p \times p}$, $p$ being any fixed positive integer:

$$L(Y, M) = \sum_{j=0}^{\ell} A_j Y M^j, \qquad L^{(1)}(Y, M) = \sum_{j=1}^{\ell} jA_j Y M^{j-1},$$

and, for $k = 2, \ldots, \ell$,

$$L^{(k)}(Y, M) = \sum_{j=k}^{\ell} j(j-1)\cdots(j-k+1)A_j Y M^{j-k}.$$

Second, we recall that, by hypothesis, $L(\lambda)$ has $s$ pairs of non-real conjugate eigenvalues $(\beta_j, \bar\beta_j)$, $\beta_j = \mu_j + i\nu_j$, with associated elementary divisors of degree $m_j$. Third, for $1 \le j \le s$ we write

$$X_{q+j} = \begin{bmatrix} Y_{j1} & \cdots & Y_{jm_j} \end{bmatrix} \quad \text{and} \quad K_{q+j} = \begin{bmatrix} U_j & 0 & \cdots & 0 \\ I_2 & U_j & & \vdots \\ & \ddots & \ddots & 0 \\ 0 & \cdots & I_2 & U_j \end{bmatrix},$$

with $Y_{ji} \in \mathbb{R}^{n \times 2}$ and

$$U_j = \begin{bmatrix} \mu_j & -\nu_j \\ \nu_j & \mu_j \end{bmatrix}.$$

Now, if $S_{q+j}$ is the submatrix of $S$ whose columns correspond to those of $X_{q+j}$ (of (27)), then the relation $C_RS = SK$ implies

$$C_RS_{q+j} = S_{q+j}K_{q+j}, \quad 1 \le j \le s.$$

Writing down this equation explicitly it is found that, for each $j$,

$$\begin{aligned}
L(Y_{jm_j}, U_j) &= 0,\\
L(Y_{j,m_j-1}, U_j) + L^{(1)}(Y_{jm_j}, U_j) &= 0,\\
&\;\;\vdots\\
L(Y_{j1}, U_j) + L^{(1)}(Y_{j2}, U_j) + \cdots + \frac{1}{(m_j-1)!}L^{(m_j-1)}(Y_{jm_j}, U_j) &= 0.
\end{aligned} \tag{30}$$

We may now deﬁne 𝑌𝑗𝑚𝑗 , 𝑌𝑗𝑚𝑗 −1 ,. . . , 𝑌𝑗1 to be a real Jordan chain of 𝐿(𝜆) with respect to the pair of non-real eigenvalues (𝛽𝑗 , 𝛽¯𝑗 ).
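The first relation in (30) can be illustrated numerically in the simplest case of a chain of length one. The sketch below is an illustration under stated assumptions and is not taken from the paper: the coefficient matrices are arbitrary sample data, the eigenpair is computed from a companion matrix, and the choice $Y = [\operatorname{Re} v, -\operatorname{Im} v]$ for a complex eigenvector $v$ of $\beta = \mu + i\nu$ is an assumption of this sketch. It rests on the observation that $U(1, -i)^T = \beta(1, -i)^T$, so that $L(Y, U)(1, -i)^T = L(\beta)v$.

```python
import numpy as np

# Assumed sample data: a real symmetric quadratic L(lam) = A2 lam^2 + A1 lam + A0.
A0 = np.array([[2.0, -1.0], [-1.0, 2.0]])
A1 = np.array([[0.2, 0.1], [0.1, 0.3]])
A2 = np.eye(2)
coeffs = [A0, A1, A2]
n = 2

def L_of(Y, M):
    """Matrix function L(Y, M) = sum_j A_j Y M^j from Section 5.2."""
    return sum(A @ Y @ np.linalg.matrix_power(M, j) for j, A in enumerate(coeffs))

# Eigenpairs of L from the companion matrix [[0, I], [-A2^{-1} A0, -A2^{-1} A1]];
# an eigenvector of the companion matrix has the form [v; beta * v].
C = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.linalg.solve(A2, A0), -np.linalg.solve(A2, A1)]])
eigvals, eigvecs = np.linalg.eig(C)
idx = int(np.argmax(eigvals.imag))      # pick an eigenvalue with positive imaginary part
beta, v = eigvals[idx], eigvecs[:n, idx]

mu, nu = beta.real, beta.imag
U = np.array([[mu, -nu], [nu, mu]])
Y = np.column_stack([v.real, -v.imag])  # real n x 2 block built from v (assumed choice)

complex_residual = np.linalg.norm((beta**2 * A2 + beta * A1 + A0) @ v)
real_residual = np.linalg.norm(L_of(Y, U))
```

Both residuals are at rounding level for this data, so the real pair $(Y, U)$ carries the same information as the complex eigenpair $(v, \beta)$ and its conjugate.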


5.3. Real chains from complex chains

Returning to the GLR theory over $\mathbb{C}$, observe that Theorem 10.7 of [4] ensures the existence of complex selfadjoint Jordan triples $(X, J, P_{\varepsilon,J}X^*)$ (as in Theorem 4.3) for real symmetric systems $L(\lambda)$ with the special form:

$$X = \begin{bmatrix} X_1 & \bar X_2 & X_2 \end{bmatrix}, \qquad J = \operatorname{Diag}(J_1, \bar J_2, J_2), \tag{31}$$

where the spectrum of $J_1$ is real, the spectrum of $J_2$ contains no real numbers and no conjugate complex pairs, the matrix $X_1$ is real and contains real Jordan chains of $L(\lambda)$, and $X_2$ contains complex Jordan chains that are not conjugate in pairs.

Starting with the structure of (31), one can apply the procedure of [12, Sec. 6.7] (see also [13]) to produce a real selfadjoint Jordan triple of the form $(X_\rho, K, H^{-1}X_\rho^T)$ (with $K$ as in Theorem 4.4), but the symmetric matrix $H$ may not have the canonical form $P_{\varepsilon,K}$ of (19). This procedure can be applied with any unitary matrix $V$ for which $V^*JV = K$. One can carefully select a unitary matrix $V$ such that $V^*JV = K$, $XV = X_\rho$ is real and $V^*P_{\varepsilon,J}V = P_{\varepsilon,K}$, but there are unitary matrices for which the first two conditions are satisfied but not the third.

To illustrate this situation consider the simplest real selfadjoint quadratic matrix polynomial $L(\lambda) = \lambda^2 + b\lambda + c$ where $b, c \in \mathbb{R}$, and assume that it has two non-real complex conjugate roots $\lambda_{1,2} = \mu \pm i\omega$. Then a complex selfadjoint Jordan triple with the form (31) is $(X_c, J, P_{\varepsilon,J}X_c^*)$, where

$$J = \begin{bmatrix} \bar\lambda_1 & 0 \\ 0 & \lambda_1 \end{bmatrix}, \qquad X_c = \begin{bmatrix} \bar x & x \end{bmatrix}, \qquad P_{\varepsilon,J} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

and $x = \pm\frac{1}{2\sqrt{\omega}}(1 - i)$. In order to obtain real selfadjoint Jordan triples we can use unitary matrices $V$ such that $V^*JV = K$, $X_cV = X_\rho$ is real and $V^*P_{\varepsilon,J}V$ is symmetric. Two such matrices are

$$W = \frac{1}{2}\begin{bmatrix} 1+i & 1-i \\ 1-i & 1+i \end{bmatrix} \quad \text{and} \quad V = \frac{1}{\sqrt{2}}\begin{bmatrix} i & 1 \\ -i & 1 \end{bmatrix}.$$

In fact,

$$W^*JW = V^*JV = K = \begin{bmatrix} \mu & -\omega \\ \omega & \mu \end{bmatrix},$$

$$X_1 = X_cW = \begin{bmatrix} 0 & \pm\frac{1}{\sqrt{\omega}} \end{bmatrix}, \qquad X_2 = X_cV = \begin{bmatrix} \mp\frac{1}{\sqrt{2\omega}} & \pm\frac{1}{\sqrt{2\omega}} \end{bmatrix},$$

and $W^*P_{\varepsilon,J}W = P_{\varepsilon,J} = P_{\varepsilon,K}$, but

$$H = V^*P_{\varepsilon,J}V = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}.$$

Both $(X_1, K, P_{\varepsilon,K}X_1^T)$ and $(X_2, K, HX_2^T)$ are real selfadjoint Jordan triples of $L(\lambda)$. It is worth noticing that the elements of $X_1$ are the sum and difference of the real and imaginary parts of $X_c$ while the elements of $X_2$ are, up to multiplication by $\sqrt{2}$, its imaginary and real parts.
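The scalar example can be reproduced numerically. The following sketch is only an illustration: the values $\mu = -0.5$, $\omega = 2$, the sample point `lam` and the helper `resolvent_form` are assumptions, and the "+" sign is chosen for $x$. It checks the matrix identities stated above and verifies that each triple $(X, K, Y)$ reproduces $L(\lambda)^{-1} = X(\lambda I - K)^{-1}Y$.

```python
import numpy as np

mu, omega = -0.5, 2.0                     # assumed sample values; roots are mu +/- i*omega
b, c = -2.0 * mu, mu**2 + omega**2        # L(lam) = lam^2 + b*lam + c

lam1 = mu + 1j * omega
J = np.diag([np.conj(lam1), lam1])
P = np.array([[0.0, 1.0], [1.0, 0.0]])    # P_{eps,J} = P_{eps,K}
x = (1 - 1j) / (2 * np.sqrt(omega))       # the '+' sign choice
Xc = np.array([[np.conj(x), x]])

W = 0.5 * np.array([[1 + 1j, 1 - 1j], [1 - 1j, 1 + 1j]])
V = np.array([[1j, 1.0], [-1j, 1.0]]) / np.sqrt(2)
K = np.array([[mu, -omega], [omega, mu]])

X1 = Xc @ W                               # expected: [0, 1/sqrt(omega)]
X2 = Xc @ V                               # expected: [-1, 1] / sqrt(2*omega)
H = V.conj().T @ P @ V                    # expected: diag(-1, 1)

def resolvent_form(X, M, Y, lam):
    """Scalar value X (lam*I - M)^{-1} Y for a 1 x 2 matrix X and a 2 x 1 matrix Y."""
    return (X @ np.linalg.solve(lam * np.eye(2) - M, Y)).item()

lam = 0.3                                 # arbitrary sample point
target = 1.0 / (lam**2 + b * lam + c)
value_complex = resolvent_form(Xc, J, P @ Xc.conj().T, lam)
value_W = resolvent_form(X1.real, K, P @ X1.real.T, lam)
value_V = resolvent_form(X2.real, K, H.real @ X2.real.T, lam)
```

All three values agree with `target` up to rounding, which is consistent with the statement that both real triples are selfadjoint Jordan triples of $L(\lambda)$ even though only $W$ preserves the canonical form $P_{\varepsilon,K}$.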


References

[1] Chu M.T., and Xu S., Spectral decomposition of real symmetric quadratic λ-matrices and its applications, Math. of Comp., 78, 2009, 293–313.
[2] Chu D., Chu M.T., and Lin W.-W., Quadratic model updating with symmetry, positive definiteness and no spill-over, SIAM J. Matrix Anal. Appl., 31, 2009, 546–564.
[3] Gohberg I., Lancaster P., and Rodman L., Spectral analysis of selfadjoint matrix polynomials, Ann. of Math., 112, 1980, 33–71.
[4] Gohberg I., Lancaster P., and Rodman L., Matrix Polynomials, Academic Press, New York, 1982, and SIAM, Philadelphia, 2009.
[5] Gohberg I., Lancaster P., and Rodman L., Matrices and Indefinite Scalar Products, Birkhäuser, Basel, 1983.
[6] Gohberg I., Lancaster P., and Rodman L., A sign characteristic for selfadjoint meromorphic matrix functions, Applicable Analysis, 16, 1983, 165–185.
[7] Gohberg I., Lancaster P., and Rodman L., Invariant Subspaces of Matrices with Applications, Wiley, New York, 1986, and SIAM, Philadelphia, 2006.
[8] Gohberg I., Lancaster P., and Rodman L., Indefinite Linear Algebra and Applications, Birkhäuser, Basel, 2005.
[9] Lancaster P., Inverse spectral problems for semisimple damped vibrating systems, SIAM J. Matrix Anal. Appl., 29, 2007, 279–301.
[10] Lancaster P., and Prells U., Isospectral families of high-order systems, Z. Angew. Math. Mech., 87, 2007, 219–234.
[11] Lancaster P., and Rodman L., Canonical forms for hermitian matrix pairs under strict equivalence and congruence, SIAM Review, 47, 2005, 407–443.
[12] Lancaster P., and Tismenetsky M., The Theory of Matrices, Academic Press, New York, 1985.
[13] Lin M.M., Dong B., and Chu M.T., Inverse problems for real symmetric quadratic pencils, IMA Journal of Numerical Analysis (to appear).

Peter Lancaster
Dept. of Mathematics and Statistics
University of Calgary
Calgary, AB T2N 1N4, Canada
e-mail: [email protected]

Ion Zaballa
Departamento de Matematica Aplicada y EIO
Universidad del Pais Vasco
Apdo 644
E-48080 Bilbao, Spain
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 445–463
© 2012 Springer Basel AG

Linearization, Factorization, and the Spectral Compression of a Self-adjoint Analytic Operator Function Under the Condition (VM)

H. Langer, A. Markus and V. Matsaev

To the memory of our teacher, colleague and dear friend Izrael Gohberg

Abstract. In this paper we continue the study of spectral properties of a selfadjoint analytic operator function $A(z)$ under the Virozub-Matsaev condition. As in [6], [7], the main tools are the linearization and the factorization of $A(z)$. We use an abstract definition of a so-called Hilbert space linearization and show its uniqueness, and we prove a generalization of the well-known factorization theorem from [10]. The main results concern properties of the compression $A_\Delta(z)$ of $A(z)$ to its spectral subspace, called the spectral compression of $A(z)$. Close connections between the linearization, the inner linearization, and the local spectral function of $A(z)$ and of its spectral compression $A_\Delta(z)$ are established.

Mathematics Subject Classification (2000). 47A56, 47A68, 47A10.

Keywords. Self-adjoint analytic operator function, linearization, spectral function, spectrum of definite type, factorization.

1. Introduction

This note is a continuation of [7]. We consider an analytic operator function $A(z)$ which is defined and self-adjoint on a simply connected symmetric open set $\mathcal{D} \subset \mathbb{C}$ and with values in $\mathcal{L}(\mathcal{H})$ for some Hilbert space $\mathcal{H}$; here self-adjoint means that

$$A(z^*) = A(z)^*, \quad z \in \mathcal{D},$$

in particular, the operators $A(\lambda)$ for $\lambda \in \mathcal{D} \cap \mathbb{R}$ are self-adjoint. The spectrum $\sigma(A)$, the point spectrum $\sigma_p(A)$, and the resolvent set $\rho(A)$ of the operator function $A(z)$ are defined in the usual way (see [6], [8]). It is generally assumed that $\mathcal{D}$ contains the real interval $\Delta_0 = [\alpha_0, \beta_0]$, that $A(z)$ satisfies on $\Delta_0$ the Virozub-Matsaev condition (VM), and that $\alpha_0, \beta_0 \in \rho(A)$. The condition (VM) is formulated at the beginning of Section 3. It means, roughly, that if for some $x \in \mathcal{H}$, $x \neq 0$, a curve

$$\Delta_0 \ni \lambda \longrightarrow (A(\lambda)x, x), \quad \lambda \in \Delta_0,$$

comes close to the real axis, it must cross the axis with a positive ascent. Under the condition (VM) on $\Delta_0$, for a neighbourhood $\mathcal{U}$ of $\Delta_0$ the set $\mathcal{U} \setminus \Delta_0$ belongs to $\rho(A)$ (see, e.g., [6, Proposition 2.1]). On the other hand, if we suppose from the beginning that $(\mathcal{U} \setminus \Delta_0) \subset \rho(A)$, then, as proved in [2], [5], there exists a self-adjoint operator $\Lambda$ (the linearization of $A(z)$) in some Krein space $\mathcal{F}$, such that the relation

$$A(z)^{-1} = -P^*(\Lambda - z)^{-1}P + B(z), \quad z \in \mathcal{U} \setminus \Delta_0, \tag{1.1}$$

The authors thank the referee for valuable suggestions.

holds; here $P \in \mathcal{L}(\mathcal{H}, \mathcal{F})$ and $B(z)$ is a self-adjoint analytic function in $\mathcal{U}$ with values in $\mathcal{L}(\mathcal{H})$, and the spectrum of the operator function $A(z)$ in $\mathcal{U}$ coincides with $\sigma(\Lambda)$. If the condition (VM) (or at least the more general condition $(\sigma_+)$, see below) is satisfied, then $\mathcal{F}$ is a Hilbert space and $\Lambda$ is a self-adjoint operator in this Hilbert space, see [6], [7]. By [7, Theorem 2.4] the operator $P$ maps $\mathcal{H}$ onto $\mathcal{F}$ with $\ker P = (\operatorname{ran} P^*)^\perp$, $\operatorname{ran} P^*$ being closed. Let $E$ denote the spectral function of the self-adjoint operator $\Lambda$ in $\mathcal{F}$. Then the $\mathcal{L}(\mathcal{H})$-valued function

$$Q(\Gamma) = P^*E(\Gamma)P, \quad \Gamma \in \mathcal{R}, \tag{1.2}$$

where $\mathcal{R}$ is the ring generated by all intervals of $\mathbb{R}$, is called the local spectral function of the operator function $A(z)$ on $\Delta_0$. For an interval $\Delta = [\alpha, \beta] \subset \Delta_0$, such that $\alpha$ and $\beta$ are not eigenvalues of the operator function $A(z)$, the relations (1.1) and (1.2) imply that

$$Q(\Delta) = -\frac{1}{2\pi i}\int_{\gamma(\Delta)}' A(z)^{-1}\,dz,$$

where $\gamma(\Delta)$ is a smooth contour in $\mathcal{U}$ which surrounds $\Delta$ and crosses the real axis in $\alpha$ and $\beta$ orthogonally, and the $'$ at the integral denotes the Cauchy principal value at $\alpha$ and $\beta$. The function $Q$ is additive on $\mathcal{R}$, and its values $Q(\Gamma)$, $\Gamma \in \mathcal{R}$, are nonnegative operators in $\mathcal{H}$ with closed range (see [7, Theorem 3.1]). Moreover, $\operatorname{ran} Q(\Gamma) \subset \operatorname{ran} Q(\Delta_0)$, and with the notation $\mathcal{H}(\Gamma) := \operatorname{ran} Q(\Gamma)$ we obtain

$$\mathcal{H}(\Gamma) \subset \mathcal{H}(\Delta_0), \quad \Gamma \in \mathcal{R}.$$

For an interval $\Delta \subset \Delta_0$ the subspace $\mathcal{H}(\Delta)$ is called the spectral subspace of the operator function $A(z)$ for $\Delta$. Since all these spectral subspaces are contained in $\mathcal{H}(\Delta_0)$, we sometimes call $\mathcal{H}(\Delta_0)$ the main spectral subspace of $A(z)$ for $\Delta_0$. Observe that unlike the spectral subspaces of a self-adjoint operator, these spectral subspaces of a selfadjoint operator function are not invariant under the values of the operator function. It is one aim of this note to continue the study of the local spectral function and the spectral subspaces of $A(z)$, which was started in [6], [7].

For a self-adjoint operator $A$ with spectral function $E$ it holds

$$\int_\Delta (A - t)\,dE(t) = 0 \quad \text{for all intervals } \Delta, \tag{1.3}$$

and, together with the fact that the values of $E$ are orthogonal projections, this relation determines $E$ uniquely. For a self-adjoint analytic operator function under the condition $(\sigma_+)$ on $\Delta_0$ we have instead (see [6, Theorem 3.4])

$$\int_\Delta A(t)\,dQ(t) = 0 \quad \text{for all intervals } \Delta \subset \Delta_0, \tag{1.4}$$

where $Q$ is the local spectral function of the operator function $A(z)$. The values $Q(\Delta)$ are nonnegative operators, but in general not projections. For a self-adjoint operator $A$ with (1.3) we also have

$$(A - z)^{-1} = \int_{\Delta_0} \frac{dE(t)}{t - z}, \quad z \in \mathbb{C} \setminus \Delta_0, \tag{1.5}$$

whereas for a self-adjoint analytic operator function we have instead from (1.1) and (1.2)

$$A(z)^{-1} = -\int_{\Delta_0} \frac{dQ(t)}{t - z} + B(z), \quad z \in \mathcal{U}(\Delta_0), \tag{1.6}$$

with a self-adjoint operator function $B(z)$ which is analytic in a neighbourhood of $\Delta_0$. In [7] we have introduced the inner linearization $S$ of the operator function $A(z)$ in the main spectral subspace $\mathcal{H}(\Delta_0)$, given by

$$S := P_0^*\Lambda(P_0^*)^{-1} = \int_{\Delta_0} \lambda\,dQ(\lambda)\,Q(\Delta_0)^{-1}. \tag{1.7}$$

It is an operator in $\mathcal{H}(\Delta_0)$, which is selfadjoint with respect to a new Hilbert inner product, and it has in $\Delta_0$ the same spectrum, eigenvalues and corresponding eigenvectors as the operator function $A(z)$.

In the papers [5], [6], [7] besides (VM) also the weaker condition $(\sigma_+)$ was considered. By definition, the condition $(\sigma_+)$ holds on $\Delta_0$, if there exist positive numbers $\varepsilon$, $\delta$, such that for all $\lambda \in \Delta_0$ and $f \in \mathcal{H}$, $\|f\| = 1$, we have

$$\|A(\lambda)f\| < \varepsilon \implies \big(A'(\lambda)f, f\big) > \delta.$$

Under this condition, the operator function $A(z)$ still has a local spectral function on $\Delta_0$; however, many of the results in the present note fail. This is shown, e.g., by the example in [6, Remark 7.7]. In this example, $A(z)$ and hence also $\Lambda$ have two eigenvalues, hence $\dim \mathcal{F} \ge 2$. On the other hand $\dim(\operatorname{ran} Q(\Delta_0)) = 1$. Since $P^* : \mathcal{F} \to \mathcal{H}$ we have $\ker P^* \neq \{0\}$, and $P^* \neq 0$ implies $\dim(\operatorname{ran} P^*) = 1$, hence $\dim(\operatorname{ran} P) = 1$ and, since $P : \mathcal{H} \to \mathcal{F}$, we find $\operatorname{ran} P \neq \mathcal{F}$. Thus, in this example the claims of [7, Theorem 2.4], and also of [7, Remark 3.3] do not hold, although the condition $(\sigma_+)$ is satisfied (compare also with the first paragraph on [7, p. 536]).

In the present paper we use the relation (1.1) as an abstract definition of the linearization $\Lambda$ of the operator function $A(z)$. Therefore we call this relation also the basic relation for the linearization $\Lambda$. Under the condition $(\sigma_+)$ a minimal linearization, which is a selfadjoint operator in a Hilbert space, exists, see [5], [6]; in Section 2 we show the uniqueness of this linearization, up to unitary equivalence. As was shown in [5], the linearization constructed there is equivalent to the linearization from [2], where also the more general situation of Banach spaces was considered. This holds also with respect to the linearization in the paper [3]. In the present note, however, we consider only Hilbert spaces and use a formally simpler definition of a linearization (see Definition 2.1 below). The proof for the uniqueness of the linearization up to similarity from [3, Theorem 2.1] can be adapted to a proof for uniqueness up to unitary equivalence in our situation. However, for the convenience of the reader we prove this fact here directly.

In [7] a factorization result, which goes back to [10], was proved for the case that the main spectral subspace $\mathcal{H}(\Delta_0)$ coincides with the original space $\mathcal{H}$. In Section 3 this factorization is extended to the situation where $\mathcal{H}(\Delta_0)$ can be a proper subspace of $\mathcal{H}$.

We have mentioned already that the spectral subspaces $\mathcal{H}(\Delta)$ of the operator function $A(z)$ are not invariant under the operators $A(z)$. However, if $P_\Delta$ denotes the orthogonal projection onto $\mathcal{H}(\Delta)$ in $\mathcal{H}$, the compressed operator function $A_\Delta(z)$:

$$A_\Delta(z)f = P_\Delta A(z)f, \quad f \in \mathcal{H}(\Delta),$$

with values in $\mathcal{L}(\mathcal{H}(\Delta))$ is again a self-adjoint analytic operator function on $\mathcal{D}$ which satisfies the condition (VM) on $\Delta_0$. We call $A_\Delta(z)$ the spectral compression of $A(z)$ for $\Delta$. Since it satisfies the condition (VM) on $\Delta$ it has a local spectral function on $\Delta$, an inner linearization $S^\Delta$ and a Hilbert space linearization $\Lambda^\Delta$. In Section 4 it is shown that $S^\Delta$ coincides with the restriction $S_\Delta$ of $S$ to its (invariant) spectral subspace corresponding to $\Delta$ (Theorem 4.1), and that also $\Lambda^\Delta$ is the restriction $\Lambda_\Delta$ of $\Lambda$ to its (invariant) spectral subspace corresponding to $\Delta$ (Theorem 4.3).

Although the spectral subspaces $\mathcal{H}(\Delta)$ of $A(z)$ are not invariant under the operators $A(z)$, they have a weaker property, which we call pseudoinvariance. This is shown in Section 5. Finally, in Section 6 we derive an explicit expression for the value $Q(\{\lambda_0\})$ of the local spectral function at a real eigenvalue $\lambda_0$ of $A(z)$. Moreover, a second proof of Theorem 4.3 is given, which is based on a block operator representation of $A(z)^{-1}$ with respect to the spectral subspace and its orthogonal complement. This proof allows us to state in Corollary 6.3 that the local spectral function of the spectral compression $A_\Delta(z)$ is the restriction of the local spectral function of $A(z)$.

2. The linearization and its uniqueness

1. Let $A(z)$ be a self-adjoint analytic operator function which is defined on a symmetric open set $\mathcal{D} \subset \mathbb{C}$ and with values in $\mathcal{L}(\mathcal{H})$ for some Hilbert space $\mathcal{H}$.

Definition 2.1. Suppose that $\mathcal{D}$ contains the closed interval $\Delta = [\alpha, \beta]$ and that for some simply connected neighbourhood $\mathcal{U}$ of $\Delta$ we have $\mathcal{U} \setminus \Delta \subset \rho(A)$. The operator function $A(z)$ admits a Hilbert space linearization $\tilde\Lambda$ on $\Delta$, if there exist a Hilbert space $\tilde{\mathcal{F}}$, a self-adjoint operator $\tilde\Lambda$ in $\tilde{\mathcal{F}}$ and an operator $\tilde P \in \mathcal{L}(\mathcal{H}, \tilde{\mathcal{F}})$, such that the following holds:
(a) $\sigma(\tilde\Lambda) \subset \Delta$.
(b) $\tilde P^*(\tilde\Lambda - z)^{-1}\tilde P = -A(z)^{-1} + B(z)$, $z \in \mathcal{U} \setminus \Delta$, where $B(z)$ is an operator function which is analytic in $\mathcal{U}$; this relation is called the basic relation for the linearization $\tilde\Lambda$.
If, additionally,
(c) $\tilde{\mathcal{F}} = \text{c.l.s.}\,\{\tilde\Lambda^k\tilde P\mathcal{H} : k = 0, 1, 2, \ldots\}$,
then the Hilbert space linearization $\tilde\Lambda$ is called minimal.

It is easy to check that the minimality condition (c) is equivalent to the condition
(c') $\tilde{\mathcal{F}} = \text{c.l.s.}\,\{(\tilde\Lambda - z)^{-1}\tilde P\mathcal{H} : z \in \mathcal{Z}\}$, where $\mathcal{Z}$ is a subset of $\rho(\tilde\Lambda)$ with at least one accumulation point in $\rho(\tilde\Lambda)$.

Remark 2.2. In [5] the existence of a minimal Hilbert space linearization was shown if $A(z)$ satisfies on $\Delta$ the condition $(\sigma_+)$. Under this condition the Hilbert space $\mathcal{F}$ and the operator $\Lambda$ as constructed in [5] have all the properties (a)–(c).

Remark 2.3. If the stronger condition (VM) is satisfied on $\Delta_0$, also the inner linearization $S$ of $A(z)$ is a Hilbert space linearization of $A(z)$ on $\Delta_0$. To see this we first observe that $\sigma(S) = \sigma(\Lambda) \subset \Delta_0$, and that the relations (1.1) and (1.7) imply

$$A(z)^{-1} = -J(S - z)^{-1}Q(\Delta_0) + B(z), \tag{2.1}$$

where $J$ denotes the embedding from $\mathcal{H}(\Delta_0)$ into $\mathcal{H}$. Now we choose in Definition 2.1 as $\tilde{\mathcal{F}}$ the linear space $\mathcal{H}(\Delta_0)$ equipped with the inner product

$$\langle x, y\rangle := \big(Q(\Delta_0)^{-1}x, y\big), \quad x, y \in \mathcal{H}, \tag{2.2}$$

and as $\tilde P \in \mathcal{L}(\mathcal{H}, \tilde{\mathcal{F}})$ the mapping $\tilde Px := Q(\Delta_0)x$, $x \in \mathcal{H}$. Since $Q(\Delta_0) = P^*P$, it is easy to see that then $\tilde P^* \in \mathcal{L}(\tilde{\mathcal{F}}, \mathcal{H})$ is the embedding $J$ of $\mathcal{H}(\Delta_0)$ into $\mathcal{H}$. Hence (2.1) is the basic relation for the linearization $S$. It is trivial that also the minimality property (c) holds since the set $S^k\tilde P\mathcal{H}$ for $k = 0$ is equal to $\tilde P\mathcal{H} = Q(\Delta_0)\mathcal{H} = \mathcal{H}(\Delta_0)$.

The following theorem states that a minimal Hilbert space linearization is unique up to unitary equivalence.

Theorem 2.4. Let the operator function $A(z)$ be as at the beginning of this section. If $A(z)$ admits on $\Delta$ two minimal Hilbert space linearizations $\Lambda_1$ and $\Lambda_2$, then $\Lambda_1$ and $\Lambda_2$ are unitarily equivalent.


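Proof. We start from the basic relations for $\Lambda_1$ and $\Lambda_2$:

$$P_j^*(\Lambda_j - z)^{-1}P_j = -A(z)^{-1} + B_j(z), \quad z \in \mathcal{U} \setminus (\alpha, \beta),\ j = 1, 2. \tag{2.3}$$

Choose a sufficiently smooth simple positively oriented curve $\gamma$ $(\subset \mathcal{U})$ which surrounds $\Delta$ and passes through the points $\alpha - t$, $\beta + t$, where $t > 0$ is such that the intervals $[\alpha - t, \alpha)$ and $(\beta, \beta + t]$ belong to $\rho(\Lambda_j)$, $j = 1, 2$. The equality (2.3) implies

$$P_j^*\Lambda_j^{k+\ell}P_j = -\frac{1}{2\pi i}\oint_\gamma z^{k+\ell}A(z)^{-1}\,dz, \quad k, \ell = 0, 1, \ldots,\ j = 1, 2.$$

Hence for any $n \in \mathbb{N}$ and vectors $x_k \in \mathcal{H}$, $k = 0, 1, \ldots, n$, we find for $j = 1, 2$

$$\sum_{k,\ell=0}^{n}\big(P_j^*\Lambda_j^{k+\ell}P_jx_k, x_\ell\big) = -\frac{1}{2\pi i}\sum_{k,\ell=0}^{n}\left(\oint_\gamma z^{k+\ell}A(z)^{-1}\,dz\; x_k, x_\ell\right),$$

and therefore

$$\sum_{k,\ell=0}^{n}\big(P_1^*\Lambda_1^{k+\ell}P_1x_k, x_\ell\big) = \sum_{k,\ell=0}^{n}\big(P_2^*\Lambda_2^{k+\ell}P_2x_k, x_\ell\big).$$

This relation can be written as

$$\left\langle \sum_{k=0}^{n}\Lambda_1^kP_1x_k, \sum_{\ell=0}^{n}\Lambda_1^\ell P_1x_\ell \right\rangle_{\mathcal{F}_1} = \left\langle \sum_{k=0}^{n}\Lambda_2^kP_2x_k, \sum_{\ell=0}^{n}\Lambda_2^\ell P_2x_\ell \right\rangle_{\mathcal{F}_2}$$

or

$$\left\| \sum_{k=0}^{n}\Lambda_1^kP_1x_k \right\|_{\mathcal{F}_1} = \left\| \sum_{k=0}^{n}\Lambda_2^kP_2x_k \right\|_{\mathcal{F}_2}. \tag{2.4}$$

Denote

$$\mathcal{D}_j := \operatorname{span}\{\Lambda_j^kP_j\mathcal{H} : k = 0, 1, \ldots\}, \quad j = 1, 2.$$

Because of the minimality property (c), $\mathcal{D}_j$ is a dense subset of the Hilbert space $\mathcal{F}_j$, $j = 1, 2$. Consider the correspondence

$$\sum_{k=0}^{n}\Lambda_1^kP_1x_k \longrightarrow \sum_{k=0}^{n}\Lambda_2^kP_2x_k. \tag{2.5}$$

It determines a correctly defined mapping from $\mathcal{D}_1$ onto $\mathcal{D}_2$, that is, any equality

$$\sum_{k=0}^{n}\Lambda_1^kP_1x_k = \sum_{k=0}^{m}\Lambda_1^kP_1y_k \tag{2.6}$$

implies

$$\sum_{k=0}^{n}\Lambda_2^kP_2x_k = \sum_{k=0}^{m}\Lambda_2^kP_2y_k. \tag{2.7}$$

To see this, we add, if necessary, some $x_k$ or $y_k$ equal to $0$, such that $m = n$. Then the relation (2.4) yields

$$\left\| \sum_{k=0}^{n}\Lambda_1^kP_1(x_k - y_k) \right\|_{\mathcal{F}_1} = \left\| \sum_{k=0}^{n}\Lambda_2^kP_2(x_k - y_k) \right\|_{\mathcal{F}_2},$$

and hence (2.6) and (2.7) are equivalent. By (2.4), the mapping $R$ from $\mathcal{D}_1$ onto $\mathcal{D}_2$, given by (2.5):

$$R\left(\sum_{k=0}^{n}\Lambda_1^kP_1x_k\right) = \sum_{k=0}^{n}\Lambda_2^kP_2x_k$$

extends by continuity to a unitary mapping from $\mathcal{F}_1$ onto $\mathcal{F}_2$, which we also denote by $R$. The relation

$$R\Lambda_1\left(\sum_{k=0}^{n}\Lambda_1^kP_1x_k\right) = R\left(\sum_{k=0}^{n}\Lambda_1^{k+1}P_1x_k\right) = \sum_{k=0}^{n}\Lambda_2^{k+1}P_2x_k = \Lambda_2\sum_{k=0}^{n}\Lambda_2^kP_2x_k = \Lambda_2R\left(\sum_{k=0}^{n}\Lambda_1^kP_1x_k\right)$$

implies $R\Lambda_1 = \Lambda_2R$, hence $\Lambda_1$ and $\Lambda_2$ are unitarily equivalent. □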
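The construction of $R$ in the proof can be mimicked in finite dimensions. The sketch below is a toy illustration under assumptions that are not in the paper: $\mathcal{H} = \mathbb{R}^2$, $\mathcal{F}_1 = \mathcal{F}_2 = \mathbb{R}^4$, a diagonal $\Lambda_1$, a fixed $P_1$, and a second pair obtained from a seeded random orthogonal matrix. The map $R$ is then rebuilt only from the correspondence (2.5) between the vectors $\Lambda_j^kP_jx$, using a pseudo-inverse, and is compared with the properties claimed in the proof.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy data: Lambda_1 self-adjoint with simple spectrum, P_1 with no zero row,
# so that the vectors Lambda_1^k P_1 x span R^4 (minimality).
Lam1 = np.diag([0.1, 0.4, 0.7, 1.0])
P1 = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, -1.0]])

# A second minimal linearization, unitarily equivalent to the first by construction.
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))
Lam2 = U @ Lam1 @ U.T
P2 = U @ P1

def krylov(Lam, P, depth=4):
    """Matrix whose columns are Lam^k P e_i for k < depth."""
    return np.hstack([np.linalg.matrix_power(Lam, k) @ P for k in range(depth)])

K1, K2 = krylov(Lam1, P1), krylov(Lam2, P2)

# The moments P_j^* Lam_j^m P_j agree; in the proof this follows from (2.3).
moments_agree = all(
    np.allclose(P1.T @ np.linalg.matrix_power(Lam1, m) @ P1,
                P2.T @ np.linalg.matrix_power(Lam2, m) @ P2)
    for m in range(8))

# R is defined only through the correspondence (2.5), without using U.
R = K2 @ np.linalg.pinv(K1)
```

In this example `R` is orthogonal, intertwines `Lam1` and `Lam2`, and coincides with `U`, in line with the statement that the correspondence (2.5) extends to a unitary operator with $R\Lambda_1 = \Lambda_2R$.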

From Theorem 2.4 and Remark 2.3 we obtain:

Corollary 2.5. Suppose that the condition (VM) is satisfied for $A(z)$ on $\Delta_0$, and that the endpoints of $\Delta_0$ are regular points for the operator function $A(z)$. If $\Lambda$ is a minimal Hilbert space linearization of $A(z)$ for $\Delta_0$, and $S$ is the inner linearization of $A(z)$ for $\Delta_0$ in $\mathcal{H}(\Delta_0)$, equipped with the inner product as in (2.2), then $\Lambda$ and $S$ are unitarily equivalent.

Remark 2.6. Note that also the following converse of Theorem 2.4 holds. If the operator $\Lambda_1$ is a minimal Hilbert space linearization of $A(z)$ for $\Delta_0$ and if the operator $\Lambda_2$ is unitarily equivalent to $\Lambda_1$: $\Lambda_2 = U\Lambda_1U^{-1}$, then $\Lambda_2$ is also a minimal Hilbert space linearization of $A(z)$ for $\Delta_0$. This is clear if we define the corresponding operator $P_2$ by $P_2 = UP_1$.

2. Similarly to Definition 2.1, a Krein space linearization can be defined if in Definition 2.1 the words 'Hilbert space' are replaced everywhere by 'Krein space'. The existence of a Krein space linearization was shown in [5] for any self-adjoint analytic operator function $A(z)$, defined on a domain $\mathcal{D} = \mathcal{D}^*$ and with compact spectrum in $\mathcal{D}$, without assuming the condition $(\sigma_+)$. In the particular case of a monic self-adjoint operator polynomial

$$A(z) = z^nI + z^{n-1}B_{n-1} + \cdots + zB_1 + B_0$$

in a Hilbert space $\mathcal{H}$ also the linearization given by the companion operator

$$\Lambda = \begin{pmatrix} 0 & I & 0 & \cdots & 0 \\ 0 & 0 & I & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -B_0 & -B_1 & -B_2 & \cdots & -B_{n-1} \end{pmatrix}$$

is a Krein space linearization in this sense. Indeed, in this case we can choose

$$\mathcal{F} = \mathcal{H}_1 \oplus \mathcal{H}_2 \oplus \cdots \oplus \mathcal{H}_n, \quad \mathcal{H}_1 = \mathcal{H}_2 = \cdots = \mathcal{H}_n = \mathcal{H},$$

with inner product $\langle\cdot,\cdot\rangle_{\mathcal{F}} = (G\cdot,\cdot)$, defined by the Gram operator

$$G = \begin{pmatrix} B_1 & B_2 & \cdots & B_{n-1} & I \\ B_2 & B_3 & \cdots & I & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ B_{n-1} & I & \cdots & 0 & 0 \\ I & 0 & \cdots & 0 & 0 \end{pmatrix}$$

and the embedding $P$ which maps $\mathcal{H}$ identically onto $\mathcal{H}_n$, the last component of $\mathcal{F}$, so that its adjoint with respect to $\langle\cdot,\cdot\rangle_{\mathcal{F}}$ is the projection of $\mathcal{F}$ onto the first component $\mathcal{H}_1$. It is easy to check that in this case

$$A(z)^{-1} = -P^*(\Lambda - z)^{-1}P, \quad z \in \rho(A).$$

As for Theorem 2.4 in the Krein space situation, the indefinite isometry between the sets $\mathcal{D}_1$ and $\mathcal{D}_2$ follows as above, but this isometry is in general not continuous, and hence it does not extend to a unique isometry between the spaces $\mathcal{F}_1$ and $\mathcal{F}_2$. In other words, two minimal Krein space linearizations are in general only weakly isomorphic, see [1]. However, this isometry between the sets $\mathcal{D}_1$ and $\mathcal{D}_2$ extends to a unique isometry between the whole spaces if, e.g., one of the Krein spaces $\mathcal{F}_1$ or $\mathcal{F}_2$ (and then also the other) is a Pontryagin space.
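The companion linearization of a monic polynomial described above can be tested on matrices. The following sketch uses assumed sample coefficients of degree three on $\mathcal{H} = \mathbb{R}^2$ and an arbitrary sample point `z`. Two modelling choices are assumptions of this sketch: the adjoint $P^*$ is taken with respect to the indefinite inner product $(G\cdot,\cdot)$, that is, $P^* = P^TG$, and $P$ embeds $\mathcal{H}$ as the last block component, which is the choice for which the resolvent identity is reproduced numerically with $\Lambda$ and $G$ in the displayed form.

```python
import numpy as np

# Assumed sample coefficients (real symmetric) of A(z) = z^3 I + z^2 B2 + z B1 + B0.
B0 = np.array([[2.0, 0.5], [0.5, 1.0]])
B1 = np.array([[0.0, 1.0], [1.0, 3.0]])
B2 = np.array([[1.0, -1.0], [-1.0, 2.0]])
I2, Z2 = np.eye(2), np.zeros((2, 2))

Lam = np.block([[Z2, I2, Z2],
                [Z2, Z2, I2],
                [-B0, -B1, -B2]])          # companion operator
G = np.block([[B1, B2, I2],
              [B2, I2, Z2],
              [I2, Z2, Z2]])               # Gram operator of the Krein space
P = np.vstack([Z2, Z2, I2])                # embedding onto the last component (assumption)
P_adj = P.T @ G                            # adjoint with respect to (G . , .)

def A(z):
    return z**3 * I2 + z**2 * B2 + z * B1 + B0

z = 0.3 + 0.7j                             # arbitrary sample point
lhs = np.linalg.inv(A(z))
rhs = -P_adj @ np.linalg.solve(Lam - z * np.eye(6), P)
gram_symmetry_defect = np.linalg.norm(G @ Lam - (G @ Lam).T)
```

Here `gram_symmetry_defect` vanishes, which expresses that `Lam` is self-adjoint with respect to the inner product defined by `G`, and `lhs` agrees with `rhs` up to rounding.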

3. Factorization of $A(z)$

Let $A(z)$ be an analytic operator function which is defined and self-adjoint on a symmetric open set $\mathcal{D} \subset \mathbb{C}$. Further we shall always suppose that $\mathcal{D}$ contains the real interval $\Delta_0 = [\alpha_0, \beta_0]$, that $A(\alpha_0)$ and $A(\beta_0)$ are boundedly invertible and that $A(z)$ satisfies on $\Delta_0$ the Virozub-Matsaev condition (VM):

$$\text{(VM)} \qquad \exists\,\varepsilon, \delta > 0:\ \lambda \in \Delta_0,\ f \in \mathcal{H},\ \|f\| = 1,\ |(A(\lambda)f, f)| < \varepsilon \implies (A'(\lambda)f, f) > \delta.$$

Then, there exists a simply connected neighbourhood of $\Delta_0$ which does not contain spectrum of $A$ outside $\Delta_0$; such a neighbourhood is denoted by $\mathcal{U}$, hence $\mathcal{U} \setminus \Delta_0 \subset \rho(A)$. If $\mathcal{H}(\Delta)$ is a spectral subspace of $A(z)$, we consider the operator function with values in $\mathcal{L}(\mathcal{H}(\Delta), \mathcal{H})$, which, for $z \in \mathcal{D}$, maps

$$\mathcal{H}(\Delta) \ni f \longrightarrow A(z)f \in \mathcal{H}.$$


We call this operator function the restriction of $A(z)$ to the subspace $\mathcal{H}(\Delta)$. Observe that here the term 'restriction' does not mean that $\mathcal{H}(\Delta)$ is an invariant subspace of $A(z)$ (in that usual sense we shall use this term in Sections 4 and 6). For the orthogonal projection $P_{\mathcal{H}(\Delta)}$ in $\mathcal{H}$ onto $\mathcal{H}(\Delta)$ we write for short $P_\Delta$.

Theorem 3.1. Under the assumptions at the beginning of this section, the restriction of $A(z)$ to $\mathcal{H}(\Delta_0)$ admits the following factorization:

$$A(z)f = M(z)(S - z)f, \quad f \in \mathcal{H}(\Delta_0),\ z \in \mathcal{U}, \tag{3.1}$$

where $S$ is the inner linearization of $A(z)$ in $\mathcal{H}(\Delta_0)$ and $M(z)$, $z \in \mathcal{U}$, is an analytic operator function with values in $\mathcal{L}\big(\mathcal{H}(\Delta_0), \mathcal{H}\big)$. For each $z$ in a neighborhood of $\Delta_0$ the operator $M(z)$ is injective and its range (depending on $z$) is a closed subspace of $\mathcal{H}$.

Proof. We start from the basic relation (1.1)

$$A(z)^{-1} - B(z) = -P^*(\Lambda - z)^{-1}P, \quad z \in \mathcal{U} \setminus \sigma(A). \tag{3.2}$$

Both sides of this relation can be considered as elements of $\mathcal{L}\big(\mathcal{H}, \mathcal{H}(\Delta_0)\big)$ and we can rewrite (3.2) in the form

$$A(z)^{-1} - B(z) = -JP_0^*(\Lambda - z)^{-1}P, \quad z \in \mathcal{U} \setminus \sigma(A), \tag{3.3}$$

where $J$ is the embedding from $\mathcal{H}(\Delta_0)$ into $\mathcal{H}$ and $P_0^*$ was defined in [7, p. 542]. By the definition (1.7), $S - z = P_0^*(\Lambda - z)(P_0^*)^{-1}$, $z \in \mathcal{U} \setminus \sigma(S)$, and hence

$$(S - z)^{-1} = P_0^*(\Lambda - z)^{-1}(P_0^*)^{-1}, \quad z \in \mathcal{U} \setminus \sigma(S). \tag{3.4}$$

We shall prove that

$$(S - z)^{-1}Q(\Delta_0) = P_0^*(\Lambda - z)^{-1}P, \quad z \in \mathcal{U} \setminus \sigma(S), \tag{3.5}$$

where the operators on both sides are considered as elements of $\mathcal{L}\big(\mathcal{H}, \mathcal{H}(\Delta_0)\big)$. If $f \in \mathcal{H} \ominus \mathcal{H}(\Delta_0)$ then

$$(S - z)^{-1}Q(\Delta_0)f = 0 = P_0^*(\Lambda - z)^{-1}Pf, \quad z \in \mathcal{U} \setminus \sigma(S), \tag{3.6}$$

if $f \in \mathcal{H}(\Delta_0)$ then, again for $z \in \mathcal{U} \setminus \sigma(S)$,

$$(S - z)^{-1}Q(\Delta_0)f = (S - z)^{-1}P^*Pf = P_0^*(\Lambda - z)^{-1}(P_0^*)^{-1}P^*Pf = P_0^*(\Lambda - z)^{-1}Pf, \tag{3.7}$$

where for the second equality sign we have used (3.4). Obviously, (3.6) and (3.7) imply (3.5). From (2.1) we have

$$A(z)^{-1} - B(z) = -J(S - z)^{-1}Q(\Delta_0), \quad z \in \mathcal{U} \setminus \sigma(A). \tag{3.8}$$

Multiplying (3.8) by $A(z)$ from the left we get

$$I - A(z)B(z) = -A(z)J(S - z)^{-1}Q(\Delta_0), \quad z \in \mathcal{U} \setminus \sigma(A), \tag{3.9}$$


where both sides act in $\mathcal{H}$, and multiplying (3.8) by $A(z)$ from the right we get

$$(I - B(z)A(z))f = -(S - z)^{-1}Q(\Delta_0)A(z)f, \quad z \in \mathcal{U} \setminus \sigma(A),\ f \in \mathcal{H}. \tag{3.10}$$

Now let $g \in \mathcal{H}(\Delta_0)$ and apply both sides of (3.9) to $Q(\Delta_0)^{-1}g$ $\big(\in \mathcal{H}(\Delta_0)\big)$:

$$(A(z)B(z) - I)Q(\Delta_0)^{-1}g = A(z)(S - z)^{-1}g, \quad z \in \mathcal{U} \setminus \sigma(A). \tag{3.11}$$

Set $M(z) := A(z)(S - z)^{-1}$. The relation (3.11) shows that $M(z)$ is an operator function with values in $\mathcal{L}(\mathcal{H}(\Delta_0), \mathcal{H})$, which is analytic in a neighborhood of $\Delta_0$. By the definition of $M(z)$,

$$A(z) = M(z)(S - z), \quad z \in \mathcal{U}, \tag{3.12}$$

and it remains to prove that for each $z \in \mathcal{U}$ the operator $M(z)$ is injective and that its range is closed. This will follow if we show that for $z \in \mathcal{U}$ and a sequence $(f_n) \subset \mathcal{H}(\Delta_0)$ the relation

$$M(z)f_n \to 0, \quad n \to \infty, \tag{3.13}$$

implies $f_n \to 0$, $n \to \infty$. To this end we multiply (3.10) from the left by $S - z$ to obtain

$$(S - z)\big(B(z)A(z) - I\big) = Q(\Delta_0)A(z),$$

and apply this relation to the elements $J(S - z)^{-1}f_n$, $z \in \rho(S)$. This gives

$$Q(\Delta_0)M(z)f_n = (S - z)\big(B(z)A(z) - I\big)J(S - z)^{-1}f_n = (S - z)B(z)A(z)J(S - z)^{-1}f_n - f_n,$$

where for the last equality sign we can remove the parentheses since the operator $B(z)A(z)$ maps $\mathcal{H}(\Delta_0)$ into $\mathcal{H}(\Delta_0)$, see (3.10). Hence

$$Q(\Delta_0)M(z)f_n = (S - z)B(z)M(z)f_n - f_n, \tag{3.14}$$

for $z \in \rho(S)$. Since both sides of (3.14) are continuous functions of $z$ we can choose $z \in \Delta_0$. Now (3.13) and (3.14) imply $f_n \to 0$. □

Remark 3.2. In the case

$$A(\alpha_0) \ll 0, \qquad A(\beta_0) \gg 0, \tag{3.15}$$

Theorem 3.1 becomes the Virozub-Matsaev factorization theorem [10] (see also [7, Theorem 4.4]). Indeed, since the conditions (3.15) are equivalent to the equality $\mathcal{H}(\Delta_0) = \mathcal{H}$ (see [6, Corollary 7.4]), we have only to check that

$$\operatorname{ran} M(z) = \mathcal{H}, \quad z \in \mathcal{U}. \tag{3.16}$$

It follows from (3.1) and (3.15) that $\operatorname{ran} M(\alpha_0) = \operatorname{ran} M(\beta_0) = \mathcal{H}$. But $\operatorname{ran} M(z)$ depends continuously on $z$ in the gap topology, and hence (3.16) holds.
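A finite-dimensional instance of the factorization (3.1) in the situation (3.15) can be sketched as follows. Everything specific here is an assumption made for illustration: the overdamped matrix quadratic $A(z) = z^2I + zB + C$, the interval $\Delta_0 = [-1, 0]$, and the recovery of $S$ from the eigenpairs of $A$ in $\Delta_0$, which relies on Corollary 3.3 (b) together with $\mathcal{H}(\Delta_0) = \mathcal{H}$. For this example $A'(\lambda) = 2\lambda I + B \ge 3I$ on $\Delta_0$, which is a sufficient condition for (VM), and $M(z)$ turns out to be the linear polynomial $-(zI + B + S)$.

```python
import numpy as np

# Assumed sample data: overdamped quadratic A(z) = z^2 I + z B + C on H = R^2.
B = np.diag([5.0, 6.0])
C = np.array([[1.0, 0.5], [0.5, 2.0]])
I2 = np.eye(2)
alpha0, beta0 = -1.0, 0.0                  # Delta_0 = [alpha0, beta0]

def A(z):
    return z**2 * I2 + z * B + C

# Sufficient check for (VM): A'(t) = 2 t I + B is uniformly positive on Delta_0.
grid = np.linspace(alpha0, beta0, 101)
delta = min(np.linalg.eigvalsh(2 * t * I2 + B).min() for t in grid)

# Eigenpairs of A via the companion matrix; keep those with eigenvalue in Delta_0.
comp = np.block([[np.zeros((2, 2)), I2], [-C, -B]])
eigvals, eigvecs = np.linalg.eig(comp)
inside = [k for k, lam in enumerate(eigvals)
          if abs(lam.imag) < 1e-12 and alpha0 < lam.real < beta0]
lams = np.real(eigvals[inside])
Vec = np.real(eigvecs[:2, inside])

# Candidate for the inner linearization: same eigenvalues and eigenvectors as A in Delta_0.
S = Vec @ np.diag(lams) @ np.linalg.inv(Vec)

def M(z):
    return -(z * I2 + B + S)

solvent_defect = np.linalg.norm(S @ S + B @ S + C)
min_abs_det_M = min(abs(np.linalg.det(M(t))) for t in grid)
```

The relation $S^2 + BS + C = 0$ is equivalent to $A(z) = M(z)(S - z)$ for all $z$, and `min_abs_det_M` staying away from zero reflects the invertibility of $M(z)$ on $\Delta_0$ asserted in (3.16).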


Corollary 3.3. Under the assumptions at the beginning of this section the following statements hold:
(a) $\sigma(S) = \sigma(A) \cap \Delta_0$.
(b) $\sigma_p(S) = \sigma_p(A) \cap \Delta_0$, and if $\lambda_0 \in \sigma_p(S)$ then $\ker(S - \lambda_0) = \ker A(\lambda_0)$.
(c) The eigenvectors of the operator function $A(z)$, corresponding to different eigenvalues in $\Delta_0$, are linearly independent. If there is an infinite number of such eigenvalues, then the corresponding eigenvectors form a Riesz basis in their closed linear span.

We mention that the second part of (c) follows from (b) and from the fact that $S$ is similar to a self-adjoint operator (see [7, Theorem 4.1]).

Remark 3.4. It was shown in [6, Remark 7.7] that statement (c) of Corollary 3.3 fails if we replace the condition (VM) by the condition $(\sigma_+)$.

4. The spectral compression of $A(z)$, I

1. Let $A(z)$ be as at the beginning of Section 3. For an interval $\Delta = [\alpha, \beta] \subset \Delta_0$, $\mathcal{H}(\Delta)$ is the corresponding spectral subspace and $P_\Delta$ is the orthogonal projection in $\mathcal{H}$ onto $\mathcal{H}(\Delta)$. Consider the operator function $A_\Delta(z)$ with values in $\mathcal{L}\big(\mathcal{H}(\Delta)\big)$, which is defined as follows:

$$A_\Delta(z)f := P_\Delta A(z)f, \quad f \in \mathcal{H}(\Delta). \tag{4.1}$$

We call $A_\Delta(z)$ the spectral compression of $A(z)$ for $\Delta$. It is easy to check that $A_\Delta(z)$ is a self-adjoint analytic operator function, defined for $z \in \mathcal{U}$, which satisfies the condition (VM) on $\Delta_0$. Moreover, [7, Lemma 2.2 and Corollary 2.5] imply that

$$A_\Delta(\alpha) \le 0, \qquad A_\Delta(\beta) \ge 0.$$

From [6, Lemma 4.1 (e)] it follows that

$$A_\Delta(\alpha') \ll 0, \quad A_\Delta(\beta') \gg 0 \quad \text{for all } \alpha' \in [\alpha_0, \alpha),\ \beta' \in (\beta, \beta_0].$$

Hence according to [7, Theorem 4.4] $A_\Delta(z)$ admits the Virozub-Matsaev factorization

$$A_\Delta(z) = M^\Delta(z)(S^\Delta - z), \quad z \in \mathcal{U}, \tag{4.2}$$

where $S^\Delta$ is the inner linearization of $A_\Delta$ on $\Delta$, $M^\Delta(z)$ is an analytic operator function with values in $\mathcal{L}(\mathcal{H}(\Delta))$ which are invertible operators in $\mathcal{H}(\Delta)$, and $\mathcal{U}$ is a neighbourhood of $\Delta$. On the other hand, if we multiply the factorization (3.1) above from the left by $P_\Delta$ and apply it only to elements of $\mathcal{H}(\Delta)$ we obtain

$$A_\Delta(z)f = P_\Delta M(z)(S - z)f, \quad f \in \mathcal{H}(\Delta). \tag{4.3}$$

The subspace $\mathcal{H}(\Delta)$ is an invariant subspace (even a spectral subspace) of $S$. Therefore this relation can be written as

$$A_\Delta(z) = P_\Delta M(z)P_\Delta(S_\Delta - z), \tag{4.4}$$

where here $S_\Delta$ denotes the restriction of $S$ to its invariant subspace $\mathcal{H}(\Delta)$.


Theorem 4.1. The operators $S^\Delta$ in (4.2) and $S_\Delta$ in (4.4) coincide, that is, the inner linearization of the spectral compression $A_\Delta(z)$ coincides with the restriction $S_\Delta$ of the inner linearization $S$ of $A(z)$ to its invariant subspace $\mathcal{H}(\Delta)$.

Proof. The relations (4.2) and (4.4) imply

$$M^\Delta(z)(S^\Delta - z) = P_\Delta M(z)P_\Delta(S_\Delta - z), \quad z \in \mathcal{U}.$$

It follows that

$$(S^\Delta - z)(S_\Delta - z)^{-1} = M^\Delta(z)^{-1}P_\Delta M(z)P_\Delta, \quad z \in \mathcal{U} \setminus \sigma(S_\Delta).$$

The operator function on the right-hand side is bounded and analytic in a neighbourhood of $\Delta$, the function on the left-hand side is analytic outside $\Delta$ including $\infty$. According to Liouville's theorem, both sides are constant, and letting $z \to \infty$ on the left-hand side it follows that this constant is $I$. □

2. In this subsection we show that the restriction $\Lambda_\Delta$ of a minimal Hilbert space linearization $\Lambda$ to its spectral subspace $\mathcal{F}_\Delta$ is a minimal Hilbert space linearization of the spectral compression $A_\Delta(z)$. We start with the following evident statement.

Lemma 4.2. Let $G'$, $G''$ be self-adjoint operators and $E'$, $E''$, respectively, be their spectral functions. If $\Delta$ is a real interval, by $G'_\Delta$, $G''_\Delta$ we denote the restrictions of these operators to their invariant subspaces $\operatorname{ran} E'(\Delta)$, $\operatorname{ran} E''(\Delta)$, respectively. If $G'$, $G''$ are unitarily equivalent, then $G'_\Delta$ and $G''_\Delta$ are also unitarily equivalent.

Theorem 4.3. Let $\Lambda$ in $\mathcal{F}$ be a minimal Hilbert space linearization for $\Delta_0$ of the self-adjoint operator function $A(z)$ in $\mathcal{H}$ as above, and let $\Delta$ be a subinterval of $\Delta_0$. Then the restriction $\Lambda_\Delta$ of $\Lambda$ to its invariant subspace $\mathcal{F}_\Delta$ is a minimal Hilbert space linearization for $\Delta$ of the compressed operator function $A_\Delta(z)$ in $\mathcal{H}(\Delta)$.

Proof. Consider the inner linearization $S$ of $A(z)$. By Corollary 2.5 the operators $\Lambda$ and $S$ are unitarily equivalent. Lemma 4.2 implies that also the operators $\Lambda_\Delta$ and $S_\Delta$ are unitarily equivalent. By Theorem 4.1, $S_\Delta$ is the inner linearization of the operator function $A_\Delta(z)$ for $\Delta$, and hence $S_\Delta$ is a minimal Hilbert space linearization of $A_\Delta(z)$ for $\Delta$, see Remark 2.3. Now Remark 2.6 implies that $\Lambda_\Delta$ is a minimal Hilbert space linearization of $A_\Delta(z)$ for $\Delta$. □

Corollary 4.4. The linearization $\Lambda$ of the operator function $A(z)$ is also a linearization of the compression $A_{\Delta_0}(z)$ to its main spectral subspace $\mathcal{H}(\Delta_0)$.

5. Pseudoinvariance

One of the main properties of a spectral subspace of an operator is the invariance of this subspace under the operator. In our situation, however, the spectral subspace $\mathcal{H}(\Delta)$ of the self-adjoint operator function $A(z)$ can be an invariant subspace of all operators $A(z)$ only in some trivial cases. However, we shall show that the subspace $\mathcal{H}(\Delta)$ has a property which can be considered as a weak analogue of invariance.

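If a subspace $\mathcal{R} \subset \mathcal{H}$ is not invariant under an operator $G \in \mathcal{L}(\mathcal{H})$ then for at least one vector $f \in \mathcal{R}$ we have $\operatorname{dist}(Gf, \mathcal{R}) > 0$. This can happen in the present situation, but we show that no non-zero vector $A(z)f$ with $f \in \mathcal{H}(\Delta)$ can be orthogonal to $\mathcal{H}(\Delta)$. More exactly, we prove:

Theorem 5.1. There exists a number $q < 1$ such that
$$\operatorname{dist}(A(z)f, \mathcal{H}(\Delta)) \le q\,\|A(z)f\| \tag{5.1}$$
for all subintervals $\Delta \subset \Delta_0$, all $z \in \Delta_0$, and all $f \in \mathcal{H}(\Delta)$.

Proof. The relations (4.2), (4.4) and $S^{\Delta} = S_{\Delta}$ (see Theorem 4.1) imply that
$$P_\Delta M(z) P_\Delta (S^{\Delta} - z) = M^{\Delta}(z)(S^{\Delta} - z), \qquad z \in \mathcal{U}. \tag{5.2}$$
The operator $S^{\Delta}$ has only real spectrum, and hence $\operatorname{ran}(S^{\Delta} - z) = \mathcal{H}(\Delta)$ for $z \in \mathcal{U} \setminus \mathbb{R}$. Therefore (5.2) implies
$$P_\Delta M(z) f = M^{\Delta}(z) f, \qquad f \in \mathcal{H}(\Delta),\ z \in \mathcal{U} \setminus \mathbb{R}. \tag{5.3}$$
By continuity, (5.3) holds even for all $z \in \mathcal{U}$. Since $M^{\Delta}(z)$, $z \in \Delta_0$, is invertible, we obtain for all $z \in \Delta_0$ and $f \in \mathcal{H}(\Delta)$
$$\|P_\Delta M(z) f\| = \|M^{\Delta}(z) f\| \ge \gamma_1 \|f\|,$$
where $\gamma_1 = \bigl(\max_{z \in \Delta_0} \|M^{\Delta}(z)^{-1}\|\bigr)^{-1}$. Using Theorem 3.1, we have
$$\|P_\Delta A(z) f\| = \|P_\Delta M(z)(S - z) f\| \ge \gamma_1 \|(S - z) f\|. \tag{5.4}$$
On the other hand,
$$\|A(z) f\| = \|M(z)(S - z) f\| \le \gamma_2 \|(S - z) f\|, \tag{5.5}$$
where $\gamma_2 = \max_{z \in \Delta_0} \|M(z)\|$. Since
$$\bigl(\operatorname{dist}(A(z)f, \mathcal{H}(\Delta))\bigr)^2 = \|A(z)f\|^2 - \|P_\Delta A(z) f\|^2,$$
the inequalities (5.4) and (5.5) imply (5.1) with $q = \bigl(1 - \gamma_1^2 \gamma_2^{-2}\bigr)^{1/2}$. □

Remark 5.2. The relations (5.4) and (5.5) imply also that, with $\gamma := \gamma_1 \gamma_2^{-1}$,
$$\|P_\Delta A(z) f\| \ge \gamma \|A(z) f\|, \qquad f \in \mathcal{H}(\Delta),\ z \in \Delta. \tag{5.6}$$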

6. The spectral compression of 𝑨(𝒛), II 1. In this section 𝐴(𝑧) is again a self-adjoint analytic operator function as at the beginning of Section 3, and Δ is a closed subinterval of Δ0 . Recall that 𝐸 denotes the spectral function of the self-adjoint linearization Λ, ℱΔ = ran 𝐸(Δ), and 𝑄 is the local spectral function of 𝐴(𝑧).

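First we give an explicit formula for $Q(\{\lambda_0\}) =: Q_0$ if $\lambda_0 \in \Delta_0$ is an eigenvalue of the operator function $A(z)$. Then $\ker A(\lambda_0) = \operatorname{ran} Q_0$ (see [7, (3.3)]), and the condition (VM) at the beginning of Section 3 implies that
$$\bigl(A'(\lambda_0) f, f\bigr) \ge \delta \|f\|^2, \qquad f \in \ker A(\lambda_0). \tag{6.1}$$
Therefore the operator $P_0 A'(\lambda_0) P_0$ is uniformly positive and hence boundedly invertible on $\ker A(\lambda_0) = \operatorname{ran} Q_0$; here $P_0$ denotes the orthogonal projection onto $\ker A(\lambda_0)$.

Lemma 6.1. For $f \in \ker A(\lambda_0)$ we have
$$f = Q_0 A'(\lambda_0) f. \tag{6.2}$$

Proof. Let $\hat{Q}$ be the measure which is obtained from $Q$ by subtracting the possible point measure at $\lambda_0$:
$$\hat{Q}(\Delta) := \begin{cases} Q(\Delta) & \text{if } \lambda_0 \notin \Delta, \\ Q(\Delta) - Q(\{\lambda_0\}) & \text{if } \lambda_0 \in \Delta, \end{cases}$$
for intervals $\Delta$. The relation (1.5) implies
$$A(z)^{-1} = -\frac{Q_0}{\lambda_0 - z} - \int_{\Delta_0} \frac{d\hat{Q}(t)}{t - z} + B(z), \tag{6.3}$$
and hence
$$I = -\frac{Q_0}{\lambda_0 - z} A(z) - \int_{\Delta_0} \frac{d\hat{Q}(t)}{t - z} A(z) + B(z) A(z).$$
For $f \in \ker A(\lambda_0)$ we get
$$f = -\frac{Q_0}{\lambda_0 - z} \bigl(A(z) - A(\lambda_0)\bigr) f - \int_{\Delta_0} \frac{d\hat{Q}(t)}{t - z} \bigl(A(z) - A(\lambda_0)\bigr) f + B(z) A(z) f. \tag{6.4}$$
The second term on the right-hand side can be written as
$$\int_{\Delta_0} \frac{d\hat{Q}(t)}{t - z}\,(z - \lambda_0)\,\frac{A(z) - A(\lambda_0)}{z - \lambda_0}\, f.$$
Since $\lim_{z \to \lambda_0} \frac{A(z) - A(\lambda_0)}{z - \lambda_0} = A'(\lambda_0)$ in operator norm, we find
$$\lim_{z \to \lambda_0} \int_{\Delta_0} \frac{d\hat{Q}(t)}{t - z} \bigl(A(z) - A(\lambda_0)\bigr) f = 0,$$
and (6.4) implies (6.2). □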

The formula (6.2) implies the desired description of 𝑄0 : Proposition 6.2. If 𝜆0 ∈ Δ0 is an eigenvalue of the self-adjoint operator function 𝐴(𝑧) and 𝑃0 denotes the orthogonal projection onto ker 𝐴(𝜆0 ), then 𝑄({𝜆0 }) = (𝑃0 𝐴′ (𝜆0 )𝑃0 )−1 𝑃0 .

(6.5)

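If $\lambda_0$ is isolated and of finite multiplicity, a corresponding formula was given in [9, Lemma 2.1]. The relation (6.5) implies
$$Q_\Delta(\{\lambda_0\}) = Q(\{\lambda_0\})\big|_{\mathcal{H}(\Delta)}, \tag{6.6}$$
compare Corollary 6.3. In the following we need some auxiliary statements.

(i) If $G$ is a self-adjoint operator, $E$ its spectral function, and $\lambda_0 \in \mathbb{R}$, then for each vector $h$ it holds
$$\lim_{y \to 0} iy\,(G - \lambda_0 - iy)^{-1} h = -E(\{\lambda_0\}) h.$$
This fact is known and can also easily be checked. In a similar way it follows from (1.6):

(ii) If $A(z)$ is as above and $\lambda_0 \in \Delta_0$, then
$$\lim_{y \to 0} iy\, A(\lambda_0 + iy)^{-1} f = Q(\{\lambda_0\}) f, \qquad f \in \mathcal{H}.$$
This relation and (6.6) imply

(iii) $\lim_{y \to 0} iy\, A_\Delta(\lambda_0 + iy)^{-1} f = Q(\{\lambda_0\}) f$, $f \in \mathcal{H}$.

2. In this subsection we give another proof of Theorem 4.3, using the following Schur factorization of a 2 × 2 block operator matrix. Let
$$\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2, \tag{6.7}$$
and let $G \in \mathcal{L}(\mathcal{H})$ have the corresponding matrix representation
$$G = \begin{pmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{pmatrix}.$$
If the operators $G_{22}$ and $G_{11} - G_{12} G_{22}^{-1} G_{21}$ are invertible, then also $G$ is invertible and
$$G^{-1} = \begin{pmatrix} I & 0 \\ -G_{22}^{-1} G_{21} & I \end{pmatrix} \begin{pmatrix} \bigl(G_{11} - G_{12} G_{22}^{-1} G_{21}\bigr)^{-1} & 0 \\ 0 & G_{22}^{-1} \end{pmatrix} \begin{pmatrix} I & -G_{12} G_{22}^{-1} \\ 0 & I \end{pmatrix}. \tag{6.8}$$
To prove Theorem 4.3, the decomposition (6.7) is chosen as
$$\mathcal{H} = \mathcal{H}(\Delta) \oplus \mathcal{H}(\Delta)^{\perp}. \tag{6.9}$$
It follows from [7, Theorem 2.4 and Theorem 4.1] that $P_0^* \mathcal{F}_\Delta = \mathcal{H}(\Delta)$, and hence $P \mathcal{H}(\Delta)^{\perp} \subset \mathcal{F}_\Delta^{\perp}$. Therefore the basic relation and the fact that $(\Lambda - z)^{-1} g$ for $g \in \mathcal{F}_\Delta^{\perp}$ is analytic on $\Delta^{\mathrm{i}}$, the interior of $\Delta$, imply that $A(z)^{-1} f$, $f \in \mathcal{H}(\Delta)^{\perp}$, is analytic on $\Delta^{\mathrm{i}}$. In the matrix representation of $A(z)^{-1}$ with respect to the decomposition (6.9):
$$A(z)^{-1} =: \begin{pmatrix} V_{11}(z) & V_{12}(z) \\ V_{21}(z) & V_{22}(z) \end{pmatrix}, \tag{6.10}$$
with (1.1) we obtain
$$V_{11}(z) = P_\Delta A(z)^{-1} P_\Delta = -P_\Delta P_0^* (\Lambda - z)^{-1} P P_\Delta + P_\Delta B(z) P_\Delta. \tag{6.11}$$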

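Since $A(z)^{-1} f$ is analytic in $\Delta^{\mathrm{i}}$ for $f \in \mathcal{H}(\Delta)^{\perp}$, the operator functions $V_{12}(z)$ and $V_{22}(z)$ are analytic on $\Delta^{\mathrm{i}}$, and because of the self-adjointness of $A(z)$ this holds also for $V_{21}(z)$. We show that $V_{22}(z)$ is boundedly invertible on $\Delta^{\mathrm{i}}$. Assume that for some $z_0 \in \Delta^{\mathrm{i}}$ there exists a sequence $(f_n) \subset \mathcal{H}(\Delta)^{\perp}$, $\|f_n\| = 1$, such that $V_{22}(z_0) f_n \to 0$ if $n \to \infty$. Then the $\mathcal{H}$-valued functions
$$y_n(z) := A(z)^{-1} \begin{pmatrix} 0 \\ f_n \end{pmatrix} = \begin{pmatrix} V_{12}(z) f_n \\ V_{22}(z) f_n \end{pmatrix}, \qquad n = 1, 2, \ldots,$$
are analytic on $\Delta^{\mathrm{i}}$ (this means that they have analytic continuations from the set of non-real points) since the expressions on the right-hand side are analytic on $\Delta^{\mathrm{i}}$. Denote
$$u_n = \begin{pmatrix} 0 \\ V_{22}(z_0) f_n \end{pmatrix}, \qquad v_n = \begin{pmatrix} V_{12}(z_0) f_n \\ 0 \end{pmatrix}, \qquad n = 1, 2, \ldots.$$
Since $v_n \in \mathcal{H}(\Delta)$, we obtain from (5.6)
$$\|P_\Delta A(z_0) v_n\| \ge \gamma \|A(z_0) v_n\|, \qquad n = 1, 2, \ldots. \tag{6.12}$$
Further,
$$A(z_0) v_n = \begin{pmatrix} 0 \\ f_n \end{pmatrix} - A(z_0) u_n, \qquad P_\Delta A(z_0) v_n = -P_\Delta A(z_0) u_n, \qquad n = 1, 2, \ldots.$$
Since $u_n \to 0$ it follows that
$$A(z_0) v_n - \begin{pmatrix} 0 \\ f_n \end{pmatrix} \to 0, \qquad n \to \infty, \tag{6.13}$$
and
$$P_\Delta A(z_0) v_n \to 0, \qquad n \to \infty. \tag{6.14}$$
Now, if $n \to \infty$, (6.12) and (6.14) imply $A(z_0) v_n \to 0$, and from (6.13) it follows that $f_n \to 0$, a contradiction.

The relation (6.10) yields
$$A(z) = \begin{pmatrix} V_{11}(z) & V_{12}(z) \\ V_{21}(z) & V_{22}(z) \end{pmatrix}^{-1}.$$
Now we apply the Schur factorization (6.8) to $G = A(z)^{-1}$. Then the left upper block in the matrix for $A(z)$ equals
$$\bigl(V_{11}(z) - V_{12}(z) V_{22}(z)^{-1} V_{21}(z)\bigr)^{-1},$$
that is
$$A_\Delta(z) = P_\Delta A(z)\big|_{\mathcal{H}(\Delta)} = \bigl(V_{11}(z) - V_{12}(z) V_{22}(z)^{-1} V_{21}(z)\bigr)^{-1}.$$
This relation and (6.11) imply
$$A_\Delta(z)^{-1} = V_{11}(z) - V_{12}(z) V_{22}(z)^{-1} V_{21}(z) = -P_\Delta P_0^* (\Lambda - z)^{-1} P P_\Delta + P_\Delta B(z) P_\Delta - V_{12}(z) V_{22}(z)^{-1} V_{21}(z). \tag{6.15}$$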
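The block-inverse identity behind (6.8) and (6.15) can be checked in the simplest possible setting. The following is only a sketch under a strong simplifying assumption: all four blocks are scalars (1 × 1 blocks), so that exact rational arithmetic is available. The function names `matmul` and `schur_inverse` are illustrative and do not come from the paper.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def schur_inverse(g11, g12, g21, g22):
    """Inverse of [[g11, g12], [g21, g22]] via the Schur complement of g22.

    Assumption: scalar blocks, with g22 and the Schur complement non-zero.
    """
    s = g11 - g12 / g22 * g21  # Schur complement G11 - G12 G22^{-1} G21
    left = [[F(1), F(0)], [-g21 / g22, F(1)]]
    middle = [[1 / s, F(0)], [F(0), 1 / g22]]
    right = [[F(1), -g12 / g22], [F(0), F(1)]]
    return matmul(matmul(left, middle), right)


# Example: the upper-left entry of the inverse is the inverse Schur complement.
G = [[F(2), F(1)], [F(3), F(5)]]
G_inv = schur_inverse(F(2), F(1), F(3), F(5))
upper_left_matches = G_inv[0][0] == 1 / (F(2) - F(1) / F(5) * F(3))
```

In the proof this is exactly the mechanism used with $G = A(z)^{-1}$: the upper-left block of $G^{-1} = A(z)$ is the inverse of the Schur complement $V_{11} - V_{12} V_{22}^{-1} V_{21}$.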

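With the spectral subspace $\mathcal{F}_{\Delta_0 \setminus \Delta}$ of $\Lambda$, corresponding to the set $\Delta_0 \setminus \Delta$, and the restriction $\Lambda_{\Delta_0 \setminus \Delta}$ of $\Lambda$ to this spectral subspace, the first term on the right-hand side can be written as
$$-P_\Delta P_0^* E(\Delta)(\Lambda_\Delta - z)^{-1} E(\Delta) P P_\Delta - P_\Delta P_0^* E(\Delta_0 \setminus \Delta)(\Lambda_{\Delta_0 \setminus \Delta} - z)^{-1} E(\Delta_0 \setminus \Delta) P P_\Delta,$$
and (6.15) becomes
$$A_\Delta(z)^{-1} + P_\Delta P_0^* E(\Delta)(\Lambda_\Delta - z)^{-1} E(\Delta) P P_\Delta = -P_\Delta P_0^* E(\Delta_0 \setminus \Delta)(\Lambda_{\Delta_0 \setminus \Delta} - z)^{-1} E(\Delta_0 \setminus \Delta) P P_\Delta + P_\Delta B(z) P_\Delta - V_{12}(z) V_{22}(z)^{-1} V_{21}(z).$$
The operator function on the right-hand side is analytic on $\Delta^{\mathrm{i}}$. The operator function on the left-hand side is analytic on a set $\mathcal{U} \setminus \Delta$, where $\mathcal{U}$ is a complex neighbourhood of $\Delta$: for $A_\Delta(z)^{-1}$ this follows from [7, Theorem 3.1 (9)], for $\Lambda_\Delta$ it is clear from its definition. Therefore the only possible singularities of the expressions on the two sides of this equality are the endpoints of $\Delta = [\alpha, \beta]$. Consider, e.g., the left endpoint $\alpha$. For $f \in \mathcal{H}_\Delta$, $f \ne 0$, the function
$$\varphi(z) := \Bigl(\bigl(A_\Delta(z)^{-1} + P_\Delta P_0^* E(\Delta)(\Lambda_\Delta - z)^{-1} E(\Delta) P P_\Delta\bigr) f, f\Bigr)$$
is analytic in $\mathcal{U}_\rho(\alpha) := \{z : 0 < |z - \alpha| < \rho\}$, for some $\rho > 0$, and we have
$$|\varphi(z)| \le \frac{C}{|\operatorname{Im} z|} \qquad \text{for } z \in \mathcal{U}_\rho(\alpha) \setminus \mathbb{R}.$$
For the term from the second summand in the sum on the right-hand side this estimate is obvious, for the first summand it follows from [6, Proposition 2.1] or (1.6). According to [8, Lemma 33.4], $\varphi(z)$ has a simple pole at $\alpha$ or is analytic there. The first case cannot hold since the residue of $\varphi(z)$ at its simple pole $\alpha$ is zero. To show this it is enough to check that
$$\lim_{y \to 0} iy\, \varphi(\alpha + iy) = 0.$$
Statement (iii) implies that the contribution of the first summand in the sum on the right-hand side for $\varphi(z)$ equals $(Q(\{\alpha\}) f, f)$, whereas (i) and the relations (1.2), (6.6) show that the contribution of the second summand equals $-(Q(\{\alpha\}) f, f)$.

It remains to prove the minimality of $\Lambda_\Delta$, that is, that an arbitrary $f \in \mathcal{F}_\Delta$ can be approximated by finite sums of the form
$$\sum_{j=1}^{n} (\Lambda_\Delta - z_{j,n})^{-1} E(\Delta) P P_\Delta x^{\Delta}_{j,n}, \tag{6.16}$$
with $z_{j,n} \in \mathcal{O}$, a nonempty open subset of $\rho(\Lambda) \cap \rho(\Lambda_\Delta)$, and $x^{\Delta}_{j,n} \in \mathcal{H}(\Delta)$, $j = 1, 2, \ldots, n$, $n = 1, 2, \ldots$. Since the linearization $\Lambda$ is minimal, $f \in \mathcal{F}$ can be approximated by elements $\sum_{j=1}^{n} (\Lambda - z_{j,n})^{-1} P x_{j,n}$ with $x_{j,n} \in \mathcal{H}$. Because of $f \in \mathcal{F}_\Delta$ we can also use $\sum_{j=1}^{n} (\Lambda_\Delta - z_{j,n})^{-1} E(\Delta) P x_{j,n}$, and if we decompose $x_{j,n}$ according to (6.9) as $x_{j,n} = x^{\Delta}_{j,n} + x'_{j,n}$ with $x'_{j,n} \in \mathcal{H}(\Delta)^{\perp}$ and observe that $P \mathcal{H}(\Delta)^{\perp} \subset \mathcal{F}_\Delta^{\perp}$, it follows that $\sum_{j=1}^{n} (\Lambda_\Delta - z_{j,n})^{-1} E(\Delta) P x^{\Delta}_{j,n}$ is an approximating sequence. Since $x^{\Delta}_{j,n} \in \mathcal{H}(\Delta)$, this sequence coincides with (6.16), and the proof is complete.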

Corollary 6.3. The local spectral function $Q_\Delta$ of $A_\Delta(z)$ is the restriction of the local spectral function $Q$ of $A(z)$ to its invariant subspace $\mathcal{H}(\Delta)$:
$$Q_\Delta(\Gamma) = Q(\Gamma)\big|_{\mathcal{H}(\Delta)}, \tag{6.17}$$
where $\Gamma$ is any subinterval of $\Delta$. To see this we observe that for the linearization of the operator function $A_\Delta(z)$ the operator $E(\Delta) P P_\Delta$ plays the role of $P$. Hence
$$Q_\Delta(\Gamma) = P_\Delta P^* E(\Delta) E(\Gamma) E(\Delta) P P_\Delta = P_\Delta P^* E(\Gamma) P P_\Delta = P_\Delta Q(\Gamma) P_\Delta,$$
which implies (6.17).

References

[1] B. Ćurgus, A. Dijksma, H. Langer, H.S.V. de Snoo: Characteristic functions of unitary colligations and of bounded operators in Krein spaces. Operator Theory: Adv. Appl. 41 (1989), 125–152.
[2] I. Gohberg, M.A. Kaashoek, D.C. Lay: Equivalence, linearization and decomposition of holomorphic operator functions. J. Funct. Anal. 28 (1978), 102–144.
[3] M.A. Kaashoek, C.V.M. van der Mee, L. Rodman: Analytic operator functions with compact spectrum. I. Spectral nodes, linearization and equivalence. Integral Equations Operator Theory 4 (1981), 504–547.
[4] H. Langer, A. Markus, V. Matsaev: Locally definite operators in indefinite inner product spaces. Math. Ann. 308 (1997), 405–424.
[5] H. Langer, A. Markus, V. Matsaev: Linearization and compact perturbation of selfadjoint analytic operator functions. Operator Theory: Adv. Appl. 118 (2000), 255–285.
[6] H. Langer, A. Markus, V. Matsaev: Self-adjoint analytic operator functions and their local spectral function. J. Funct. Anal. 235 (2006), 193–225.
[7] H. Langer, A. Markus, V. Matsaev: Self-adjoint Analytic Operator Functions: Local Spectral Function and Inner Linearization. Integral Equations Operator Theory 63 (2009), 533–545.
[8] A.S. Markus: Introduction to the Spectral Theory of Polynomial Operator Pencils. AMS Translations of Mathematical Monographs, vol. 71, 1988.
[9] A. Markus, V. Matsaev: On the basis property for a certain part of the eigenvectors and associated vectors of a self-adjoint operator pencil. Math. USSR Sbornik 61 (1988), 289–307.
[10] A.I. Virozub, V.I. Matsaev: The spectral properties of a certain class of selfadjoint operator-valued functions. Funct. Anal. Appl. 8 (1974), 1–9.

H. Langer
Institute for Analysis and Scientific Computing
Vienna University of Technology
Wiedner Hauptstrasse 8–10
A-1040 Vienna, Austria
e-mail: [email protected]

A. Markus
Department of Mathematics
Ben-Gurion University of the Negev
P.O. Box 653
84105 Beer-Sheva, Israel
e-mail: [email protected]

V. Matsaev
Department of Mathematics
School of Mathematical Sciences
Tel Aviv University
69978 Ramat Aviv, Israel
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 465–494
© 2012 Springer Basel AG

An Estimate for the Splitting of Holomorphic Cocycles. One Variable

Jürgen Leiterer

Dedicated to the memory of my teacher Israel Gohberg

Abstract. It is well known that every holomorphic cocycle over a domain in the complex plane and with values in the group of invertible elements of a Banach algebra, which is suﬃciently close to the unit cocycle, splits holomorphically. We prove this result with certain uniform estimates. Mathematics Subject Classiﬁcation (2000). 47A56 32L99. Keywords. Holomorphic cocycle, splitting of cocycles, uniform estimates.

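1. Introduction

Let $D$ be an open set in the complex plane, let $\mathcal{U} = \{U_i\}_{i \in I}$ be an open covering of $D$, let $A$ be a Banach algebra with unit, 1, and let $GA$ be the group of invertible elements of $A$. Let $C^0(\mathcal{U}, \mathcal{O}^{GA})$ be the set of all families $f = \{f_i\}_{i \in I}$ of holomorphic functions $f_i : U_i \to GA$, and let $Z^1(\mathcal{U}, \mathcal{O}^{GA})$ be the set of families $f = \{f_{ij}\}_{i,j \in I}$ of holomorphic functions $f_{ij} : U_i \cap U_j \to GA$ satisfying the cocycle condition
$$f_{ij} f_{jk} = f_{ik} \quad \text{on } U_i \cap U_j \cap U_k, \qquad i, j, k \in I.$$
Set
$$\operatorname{dist}(f, 1) = \sup_{i \in I,\ \zeta \in U_i} \|f_i(\zeta) - 1\| \qquad \text{for } f \in C^0(\mathcal{U}, \mathcal{O}^{GA})$$
and
$$\operatorname{dist}(f, 1) = \sup_{i,j \in I,\ \zeta \in U_i \cap U_j} \|f_{ij}(\zeta) - 1\| \qquad \text{for } f \in Z^1(\mathcal{U}, \mathcal{O}^{GA}).$$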

From the theory of Grauert [G] and Bungart [B] (for one variable, see also Theorem 5.6.3 in [GL]) it is well known that, for each $f \in Z^1(\mathcal{U}, \mathcal{O}^{GA})$ with $\operatorname{dist}(f, 1) < 1$,¹ there exists $u \in C^0(\mathcal{U}, \mathcal{O}^{GA})$ such that
$$f_{ij} = u_i u_j^{-1} \quad \text{on } U_i \cap U_j, \qquad i, j \in I. \tag{1.1}$$

In this paper, we prove the following theorem (see Theorem 4.2 for a slightly more precise version).

1.1. Theorem. Suppose $D$ is bounded. Let $d$ be the diameter of $D$ and let $\varepsilon > 0$. Assume, for each $a \in D$, there exists $i \in I$ such that²
$$D \cap \{\zeta \in \mathbb{C} \mid |\zeta - a| < \varepsilon\} \subseteq U_i. \tag{1.2}$$
Then, for each $f \in Z^1(\mathcal{U}, \mathcal{O}^{GA})$ satisfying the estimate
$$\operatorname{dist}(f, 1) \le \frac{\varepsilon}{2^{26} d}, \tag{1.3}$$
there exists $u \in C^0(\mathcal{U}, \mathcal{O}^{GA})$ which solves the Cousin problem (1.1) and satisfies
$$\operatorname{dist}(u, 1) \le \frac{2^{25} d}{\varepsilon} \operatorname{dist}(f, 1). \tag{1.4}$$

Of course, the constant $2^{25}$ is not optimal. We present it just to show that there is a constant at this place which is independent of $D$, $\varepsilon$, and the Banach algebra $A$. The same is true for similar constants throughout the paper.

After linearization, the Cousin problem (1.1) leads to the inhomogeneous Cauchy-Riemann equation. Since, on bounded domains, this equation admits a solution with uniform estimates (cf. Section 3 below), an appropriate version of the implicit function theorem quickly leads to the following result: Let the hypotheses of Theorem 1.1 be fulfilled and let $c > 0$. Then there exists a constant $\delta > 0$ such that, for each $f \in Z^1(\mathcal{U}, \mathcal{O}^{GA})$ with $\operatorname{dist}(f, 1) < \delta$, there exists $u \in C^0(\mathcal{U}, \mathcal{O}^{GA})$ which solves the Cousin problem (1.1) and satisfies the estimate $\operatorname{dist}(u, 1) < c$. So it is natural to analyze the proof of the implicit function theorem in order to get a proof of Theorem 1.1. However, the author did not succeed in this way (only a weaker estimate was obtained). Therefore, we go another way. First we study the equation
$$U^{-1} \frac{\partial U}{\partial \bar{z}} = V, \tag{1.5}$$
where $V : D \to A$ is a given continuous function and $U$ is sought as a continuous function from $D$ to $GA$. To the knowledge of the author, this equation appears for the first time in the work of Cornalba and Griffiths [CG], where, using the Newlander–Nirenberg theorem, local solvability is obtained (for the case $A = L(r, \mathbb{C})$, the algebra of complex $r \times r$ matrices). Then Gennadi Henkin found another proof of the local solvability, using uniform estimates for the inhomogeneous Cauchy-Riemann equation.³ Henkin's proof has the advantage that it gives local solutions with uniform estimates. Analyzing the proof of Henkin, in Section 5 we obtain a global solution of (1.5) with appropriate uniform estimates, provided $V$ is sufficiently small (Theorem 5.1). Then we prove a version of Theorem 1.1 for the class of continuous functions with continuous Cauchy-Riemann derivative (Theorem 8.3). In the last section, we deduce Theorem 1.1 from Theorems 8.3 and 5.1.

The author has two motivations for the present paper. One motivation is to provide the Weierstrass product theorems obtained in [GR1, GR2, GL] for operator functions with some estimates. The second motivation is to provide the Oka-Grauert principle with certain estimates. The latter is also the motivation for another paper of the author [L], where the case of several variables is studied (for $A = L(r, \mathbb{C})$).

Finally let us compare Theorem 1.1 with the following result of B. Berndtsson and J.-P. Rosay [BR]: Let $D = \mathbb{D}$ be the unit disc in the complex plane, and let $G = GL(r, \mathbb{C})$, the group of invertible complex $r \times r$ matrices. Assume condition (1.2) is satisfied for some $\varepsilon > 0$. Then, for each $f \in Z^1(\mathcal{U}, \mathcal{O}^{GL(r,\mathbb{C})})$ satisfying the condition
$$\|f\| := \sup_{i,j \in I,\ \zeta \in U_i \cap U_j} \|f_{ij}(\zeta)\| < \infty, \tag{1.6}$$
there exists $u \in C^0(\mathcal{U}, \mathcal{O}^{GL(r,\mathbb{C})})$ which solves the Cousin problem (1.1) and satisfies both
$$\|u\| := \sup_{i \in I,\ \zeta \in U_i} \|u_i(\zeta)\| < \infty \quad \text{and} \quad \|u^{-1}\| := \sup_{i \in I,\ \zeta \in U_i} \|u_i^{-1}(\zeta)\| < \infty. \tag{1.7}$$

Of course, our condition (1.3) is much stronger than condition (1.6). However, it seems to the author that the method of [BR], under the stronger condition (1.3) (also in the case of matrices), does not give estimate (1.4), although some weaker estimate (not explicitly stated in [BR]) can be obtained by analyzing the proof of [BR].

¹ If 𝐺 is connected, this is true also without the condition $\operatorname{dist}(f, 1) < 1$.
² Below (Def. 2.2) we call such coverings $\varepsilon$-separated.
³ To the knowledge of the author, this proof is published only in the form of an exercise in the book [HL] (Exercise 10 at the end of Chapter 2).

2. Notation

Throughout this paper the following notations are used.

∙ $\mathbb{N}$ is the set of natural numbers, zero included. $\mathbb{N}^* = \mathbb{N} \setminus \{0\}$. $\mathbb{Z}$ is the set of integers. $\mathbb{C}$ is the complex plane. $\mathbb{R}$ is the real line.
∙ Banach spaces and Banach algebras are always complex.
∙ The Lebesgue measure on $\mathbb{C}$ will be denoted by $d\lambda$.
∙ Let $D \subseteq \mathbb{C}$ be an open set, let $E$ be a Banach space, and let $f : D \to E$ be continuous. If $f$ is of class $\mathcal{C}^1$, then we denote by $\bar\partial f$ the function (and not a differential form) defined by
$$\bar\partial f = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i \frac{\partial f}{\partial y}\right),$$
where $x, y$ are the canonical real coordinate functions on $\mathbb{C}$. If $f$ is only continuous (and possibly not differentiable), then we say that $\bar\partial f$ is continuous if there is a continuous function $v : D \to E$ such that
$$\int_D \varphi v \, d\lambda = -\int_D (\bar\partial \varphi) f \, d\lambda \tag{2.1}$$
for all $\mathcal{C}^\infty$-functions $\varphi : D \to \mathbb{C}$ with compact support. This function $v$ (which then is uniquely determined) will be denoted by $\bar\partial f$.
∙ If $E$ is a Banach space with the norm $\|\cdot\|$, $X$ is a subset of $\mathbb{C}$, and $f$ is an $E$-valued function defined on $X$, then we set
$$\|f\|_X = \sup_{z \in X} \|f(z)\|. \tag{2.2}$$

∙ If $X$ is a subset of $\mathbb{C}$, then we denote by $\overline{X}$ the closure of $X$ in $\mathbb{C}$, and by $\operatorname{int} X$ we denote the interior of $X$ with respect to $\mathbb{C}$.
∙ If $X \subseteq \mathbb{C}$, $E$ is a Banach space, and $f$ is an $E$-valued function with the domain of definition $X$, then the support of $f$, $\operatorname{supp} f$, is the maximal relatively closed subset of $X$ such that $f \equiv 0$ outside of it.

2.1. In order to give our results also for holomorphic functions which admit a continuous extension to the boundary, or to some part of the boundary of their domain of definition, we will consider sets $X \subseteq \mathbb{C}$ with the property that
$$X \subseteq \overline{\operatorname{int} X}. \tag{2.3}$$
By a $\mathcal{C}^\infty$-function on such a set $X$ we mean a function which comes from a $\mathcal{C}^\infty$-function defined in some open (with respect to $\mathbb{C}$) neighborhood of $X$. As a consequence of (2.3), the derivatives of such functions are well defined on $X$ by their values on $\operatorname{int} X$. The following definition will be used throughout the paper.

2.2. Definition. Let $X \subseteq \mathbb{C}$, let $\mathcal{U} = \{U_i\}_{i \in I}$ be a covering of $X$ by relatively open sets,⁴ and let $\varepsilon > 0$. Then $\mathcal{U}$ will be called $\varepsilon$-separated if, for each point $a \in X$, there exists an index $i \in I$ such that $X \cap \{\zeta \in \mathbb{C} \mid |\zeta - a| < \varepsilon\} \subseteq U_i$.

⁴ I.e., a covering which comes from an open covering of an open (in $\mathbb{C}$) neighborhood of $X$.

3. An estimate for the Pompeiju integral

3.1. Let $E$ be a Banach space, let $D \subseteq \mathbb{C}$ be a bounded open set, and let $f : D \to E$ be continuous and bounded. Then it is well known (see, e.g., Theorem 2.1.9 in [GL]) that the function $u : D \to E$ defined by the Pompeiju integral [P]
$$u(z) = -\frac{1}{\pi} \int_D \frac{f(\zeta)}{\zeta - z} \, d\lambda(\zeta), \qquad z \in D, \tag{3.1}$$
is continuous on $D$ and solves the equation
$$\bar\partial u = f \quad \text{on } D.$$

Moreover, if $d$ is the diameter of $D$, then it is easy to see that
$$\|u\|_D \le d\sqrt{2}\, \|f\|_D.$$
The constant $d\sqrt{2}$ is not optimal. But without additional geometric conditions on $D$ it cannot be improved so much (the case of a square shows that it is $> d$). However, if $D$ is contained in a 'long and thin' rectangle, the constant can be improved essentially. To make this precise, we give a definition.

3.2. Definition. Let $X$ be a bounded subset of $\mathbb{C}$ such that $\operatorname{int} X \ne \emptyset$. Denote by $M_X$ the set of pairs $(a, b) \in \mathbb{R}^2$ with $0 < a \le b$ such that $X$ is contained in a closed rectangle with side lengths $a$ and $b$. As $\operatorname{int} X \ne \emptyset$, $M_X$ is closed and bounded in $\mathbb{R}^2$. Therefore
$$C_X := \min_{(a,b) \in M_X} a \left(\sqrt{2} + \frac{2}{\pi} \log \frac{b}{a}\right)$$
exists. $C_X$ will be called the rectangle constant of $X$.

3.3. Proposition. Let $E$ be a Banach space, let $D \subseteq \mathbb{C}$ be a bounded open set with the rectangle constant $C_D$, and let $f : D \to E$ be a bounded continuous function. Then the solution $u$ of $\bar\partial u = f$ defined on $D$ by the Pompeiju integral (3.1) admits the estimate
$$\|u\|_D \le C_D \|f\|_D. \tag{3.2}$$

Proof. By definition of $C_D$, $D$ is contained in a rectangle with side lengths $a$ and $b$, where $a \le b$, and
$$C_D = a \left(\sqrt{2} + \frac{2}{\pi} \log \frac{b}{a}\right).$$
After a shift and a rotation of $D$, we may assume that
$$D \subseteq \{z = x + iy \in \mathbb{C} \mid 0 \le x \le b,\ 0 \le y \le a\}.$$
Set
$$R_0 = \Bigl\{z = x + iy \in \mathbb{C} \Bigm| \tfrac{b-a}{2} \le x \le \tfrac{b+a}{2},\ 0 \le y \le a\Bigr\},$$
$$R_1 = \Bigl\{z = x + iy \in \mathbb{C} \Bigm| 0 \le x \le \tfrac{b-a}{2},\ 0 \le y \le a\Bigr\},$$
$$R_2 = \Bigl\{z = x + iy \in \mathbb{C} \Bigm| \tfrac{b+a}{2} \le x \le b,\ 0 \le y \le a\Bigr\},$$
$$z_0 = \frac{b}{2} + i \frac{a}{2}.$$
Then, for all $z \in D$,
$$\|u(z)\| \le \frac{\|f\|_D}{\pi} \int_{R_1 \cup R_0 \cup R_2} \frac{d\lambda(\zeta)}{|\zeta - z|} \le \frac{\|f\|_D}{\pi} \int_{R_1 \cup R_0 \cup R_2} \frac{d\lambda(\zeta)}{|\zeta - z_0|}. \tag{3.3}$$
Since
$$\int_{R_0} \frac{d\lambda(\zeta)}{|\zeta - z_0|} < \int_{|\zeta - z_0| < a/\sqrt{2}} \frac{d\lambda(\zeta)}{|\zeta - z_0|} = \int_0^{a/\sqrt{2}} \frac{2\pi r}{r} \, dr = \pi a \sqrt{2}$$
and
$$\int_{R_1 \cup R_2} \frac{d\lambda(\zeta)}{|\zeta - z_0|} = 2 \int_{R_1} \frac{d\lambda(\zeta)}{|\zeta - z_0|} < 2 \int_0^{\frac{b-a}{2}} \int_0^{a} \frac{dy \, dx}{\frac{b}{2} - x} = 2a \log \frac{b}{a},$$
this implies (3.2). □
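To get a feeling for how much the rectangle constant can gain over the diameter-based constant $d\sqrt{2}$, the two expressions can be evaluated for a single enclosing rectangle. This is only a sketch: the function names are illustrative, and `rectangle_bound(a, b)` evaluates the expression from Definition 3.2 for one given pair $(a, b)$, which is an upper bound for $C_X$ rather than the minimum over all enclosing rectangles.

```python
import math


def rectangle_bound(a, b):
    """a * (sqrt(2) + (2/pi) * log(b/a)) for side lengths 0 < a <= b."""
    if not (0 < a <= b):
        raise ValueError("need 0 < a <= b")
    return a * (math.sqrt(2) + (2 / math.pi) * math.log(b / a))


def diameter_bound(a, b):
    """d * sqrt(2), where d is the diagonal of the a-by-b rectangle."""
    return math.hypot(a, b) * math.sqrt(2)


# A 'long and thin' rectangle: the first value grows only logarithmically in b/a.
thin = rectangle_bound(1.0, 100.0)
crude = diameter_bound(1.0, 100.0)
```

For the 1-by-100 rectangle the first expression is roughly 4.35, while $d\sqrt{2}$ is roughly 141.4; for a square ($a = b$) the logarithm vanishes and the first expression reduces to $a\sqrt{2}$.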

4. The main result

In this section, $A$ is a Banach algebra with unit, 1, and $GA$ is the group of invertible elements of $A$.

4.1. Multiplicative cocycles. Let $X$ be a subset of $\mathbb{C}$ such that $X \subseteq \overline{\operatorname{int} X}$. Then we denote by $\mathcal{BO}^{A}(X)$ the algebra of all bounded continuous functions $f : X \to A$ which are holomorphic in $\operatorname{int} X$, and by $\mathcal{BO}^{GA}(X)$ we denote the group of all $f \in \mathcal{BO}^{A}(X)$ such that $f(\zeta) \in GA$ for all $\zeta \in X$ and, moreover, the function $f^{-1}$ is also bounded on $X$. The unit element of this group, i.e., the constant function with value 1, will also be denoted by 1.

Now let $\mathcal{U} = \{U_i\}_{i \in I}$ be a covering of $X$ by relatively open subsets of $X$. Then we use the following notations:

$C^0(\mathcal{U}, \mathcal{BO}^{GA})$ is the set of families $\{f_i\}_{i \in I}$ of functions⁵ $f_i \in \mathcal{BO}^{GA}(U_i)$,

$C^1(\mathcal{U}, \mathcal{BO}^{GA})$ is the set of families $\{f_{ij}\}_{i,j \in I}$ of functions⁶ $f_{ij} \in \mathcal{BO}^{GA}(U_i \cap U_j)$,

and $Z^1(\mathcal{U}, \mathcal{BO}^{GA})$ is the set of all $f \in C^1(\mathcal{U}, \mathcal{BO}^{GA})$ satisfying the multiplicative cocycle condition⁷
$$f_{ij} f_{jk} = f_{ik} \quad \text{on } U_i \cap U_j \cap U_k, \qquad i, j, k \in I. \tag{4.1}$$
The elements of $Z^1(\mathcal{U}, \mathcal{BO}^{GA})$ are called multiplicative cocycles. Note that the cocycle condition (4.1) implies that
$$f_{ij}^{-1} = f_{ji} \quad \text{on } U_i \cap U_j, \qquad i, j \in I, \tag{4.2}$$
and
$$f_{ii} = 1 \quad \text{on } U_i, \qquad i \in I. \tag{4.3}$$

⁵ By a family of functions $\{f_i\}_{i \in I}$ we mean a map which is defined only for the indices $i \in I$ with $U_i \ne \emptyset$.
⁶ By a family of functions $\{f_{ij}\}_{i,j \in I}$ we mean a map which is defined for the ordered pairs $(i, j) \in I \times I$ with $U_i \cap U_j \ne \emptyset$.
⁷ More precisely, the cocycle condition means the following: If $(i, j, k) \in I \times I \times I$ is an ordered triplet such that $U_i \cap U_j \cap U_k \ne \emptyset$, then $f_{ij}(\zeta) f_{jk}(\zeta) = f_{ik}(\zeta)$ for all $\zeta \in U_i \cap U_j \cap U_k$. In particular, if $U_i \cap U_j \cap U_k = \emptyset$ for all pairwise different triplets $(i, j, k) \in I \times I \times I$, then the cocycle condition just means that (4.2) and (hence) (4.3) are satisfied. If moreover $U_i \cap U_j = \emptyset$ whenever $i \ne j$, then the cocycle condition reduces to condition (4.3).

The element $f \in C^0(\mathcal{U}, \mathcal{BO}^{GA})$ defined by $f_i \equiv 1$ on $U_i$, $i \in I$, will be denoted by 1, and the cocycle $f \in Z^1(\mathcal{U}, \mathcal{BO}^{GA})$ defined by $f_{ij} \equiv 1$ on $U_i \cap U_j$, $i, j \in I$, will be called the unit cocycle and also denoted by 1. Moreover, we define
$$\|f - 1\| = \sup_{i \in I} \|f_i - 1\|_{U_i} \qquad \text{if } f \in C^0(\mathcal{U}, \mathcal{BO}^{GA}), \tag{4.4}$$
$$\|f - 1\| = \sup_{i,j \in I} \|f_{ij} - 1\|_{U_i \cap U_j} \qquad \text{if } f \in C^1(\mathcal{U}, \mathcal{BO}^{GA}), \tag{4.5}$$
where $\|f_i - 1\|_{U_i}$ and $\|f_{ij} - 1\|_{U_i \cap U_j}$ are defined by (2.2). The "numbers" defined by (4.4) and (4.5) can be infinite if $I$ is infinite. However, in this paper, we meet only those $f$ for which $\|f - 1\| < 1$. Now the main result of this paper can be stated as follows.

4.2. Theorem. Let $X$ be a bounded subset of $\mathbb{C}$ such that $X \subseteq \overline{\operatorname{int} X}$, and let $C_X$ be the rectangle constant of $X$ (Section 3.2). Let $\varepsilon > 0$, let $\mathcal{U} = \{U_i\}_{i \in I}$ be an $\varepsilon$-separated covering of $X$ by relatively open sets (Definition 2.2), and let $f \in Z^1(\mathcal{U}, \mathcal{BO}^{GA})$ be a multiplicative cocycle satisfying⁸
$$\|f - 1\| \le \frac{\varepsilon}{2^{24} C_X} \quad \text{and} \quad \|f - 1\| \le \frac{1}{64}. \tag{4.6}$$
Then there exists $h \in C^0(\mathcal{U}, \mathcal{BO}^{GA})$ such that
$$f_{ij} = h_i h_j^{-1} \quad \text{on } U_i \cap U_j, \qquad i, j \in I, \tag{4.7}$$
and
$$\|h - 1\| \le \left(2 + \frac{2^{23} C_X}{\varepsilon}\right) \|f - 1\|. \tag{4.8}$$

Of course, the constants $2^{24}$ and $2^{23}$ in this theorem are not optimal. The interesting point is that they do not depend on $X$ and $\varepsilon$.

⁸ The rectangle constant $C_X$ can be small compared to $\varepsilon 2^{-24}$ although the diameter of $X$ is big compared to $\varepsilon$. (The case when the diameter of $X$ is small compared to $\varepsilon$ is not interesting, because then the trivial covering $\{X\}$ is a refinement of any $\varepsilon$-separated covering of $X$.) Therefore, in general, the second estimate does not follow from the first one.

5. An estimate for the equation $U^{-1} \bar\partial U = V$

In this section, $A$ is a Banach algebra with unit, 1, $GA$ is the group of invertible elements of $A$, $X$ is a bounded subset of $\mathbb{C}$ such that $X \subseteq \overline{\operatorname{int} X}$, and $C_X$ is the rectangle constant of $X$ (Section 3.2). The aim of this section is to prove the following theorem.

5.1. Theorem. Let $V : X \to A$ be a continuous function such that
$$\|V\|_X \le \frac{1}{8 C_X}. \tag{5.1}$$
Then there exists a continuous function $U : X \to GA$ such that $\bar\partial U$ is also continuous on $X$,⁹
$$U^{-1} \bar\partial U = V \quad \text{on } X, \tag{5.2}$$
and
$$\|U - 1\|_X \le 2 C_X \|V\|_X. \tag{5.3}$$

⁹ By this we mean that $\bar\partial U$ is continuous in $\operatorname{int} X$ (in the sense defined in Section 2) and admits a continuous extension to $X$.

The following lemma is the main step of the proof of this theorem.

5.2. Lemma. Let the hypotheses of Theorem 5.1 be fulfilled, and let $\mathcal{BC}^{A}(X)$ be the Banach space of $A$-valued bounded continuous functions on $X$ endowed with the sup-norm (2.2). Then there exists a bounded linear operator $R : \mathcal{BC}^{A}(X) \to \mathcal{BC}^{A}(X)$ such that, for all $f \in \mathcal{BC}^{A}(X)$, $\bar\partial R f$ is also bounded and continuous on $X$,
$$\bar\partial R f - (R f) V = f \tag{5.4}$$
and
$$\|R\| \le 2 C_X, \tag{5.5}$$
where $\|R\|$ is the operator norm of $R$ as operator acting in $\mathcal{BC}^{A}(X)$.

Proof. As noticed in Section 3, setting
$$(T f)(z) = -\frac{1}{\pi} \int_{\operatorname{int} X} \frac{f(\zeta)}{\zeta - z} \, d\lambda(\zeta), \qquad z \in X,$$
for $f \in \mathcal{BC}^{A}(X)$, we obtain a bounded linear operator $T : \mathcal{BC}^{A}(X) \to \mathcal{BC}^{A}(X)$ such that
$$\bar\partial T f = f \quad \text{for all } f \in \mathcal{BC}^{A}(X) \tag{5.6}$$
and
$$\|T\| \le C_X. \tag{5.7}$$
Set
$$M f := -(T f) V \quad \text{for } f \in \mathcal{BC}^{A}(X).$$
By (5.1) and (5.7), this defines a bounded linear operator $M : \mathcal{BC}^{A}(X) \to \mathcal{BC}^{A}(X)$ with
$$\|M\| \le \frac{1}{4}. \tag{5.8}$$
By (5.8), $\operatorname{id} + M$ is invertible, where
$$(\operatorname{id} + M)^{-1} = \sum_{k=0}^{\infty} (-M)^k \tag{5.9}$$
and
$$\bigl\|(\operatorname{id} + M)^{-1}\bigr\| \le \sum_{k=0}^{\infty} \frac{1}{4^k} = \frac{4}{3}. \tag{5.10}$$
This implies that
$$R := T (\operatorname{id} + M)^{-1}$$
is a well-defined bounded linear endomorphism of $\mathcal{BC}^{A}(X)$, where, by (5.7) and (5.10),
$$\|R\| \le \|T\| \, \bigl\|(\operatorname{id} + M)^{-1}\bigr\| \le \frac{4}{3} C_X,$$
so that estimate (5.5) is satisfied. To prove that $R$ has also the other required properties, let $f \in \mathcal{BC}^{A}(X)$ be given. Then we see from (5.6) that $\bar\partial R f$ is continuous on $X$ and
$$\bar\partial R f = (\operatorname{id} + M)^{-1} f.$$
Moreover, by definition of $M$ and $R$,
$$M (\operatorname{id} + M)^{-1} f = -\bigl(T (\operatorname{id} + M)^{-1} f\bigr) V = -(R f) V.$$
Together this implies
$$\bar\partial R f - (R f) V = (\operatorname{id} + M)^{-1} f + M (\operatorname{id} + M)^{-1} f = (\operatorname{id} + M)(\operatorname{id} + M)^{-1} f = f.$$
So, also (5.4) is proved. □
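The Neumann series step (5.9)–(5.10) can be illustrated in finite dimensions. The sketch below replaces the operator $M$ on $\mathcal{BC}^{A}(X)$ by a 2 × 2 real matrix and the operator norm by the maximum row sum norm; both choices, the sample matrix and the function names are assumptions made only for illustration.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inf_norm(a):
    """Maximum absolute row sum, a submultiplicative matrix norm."""
    return max(sum(abs(x) for x in row) for row in a)


def neumann_inverse(m, terms=60):
    """Partial sum of sum_k (-M)^k, approximating (id + M)^{-1} when ||M|| < 1."""
    neg_m = [[-x for x in row] for row in m]
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]  # (-M)^0
    for _ in range(terms):
        total = [[total[i][j] + power[i][j] for j in range(2)] for i in range(2)]
        power = matmul(power, neg_m)
    return total


# Sample matrix with norm 0.15 <= 1/4, as required by (5.8).
M = [[0.1, 0.05], [-0.05, 0.1]]
S = neumann_inverse(M)
id_plus_M = [[1.0 + M[0][0], M[0][1]], [M[1][0], 1.0 + M[1][1]]]
residual = matmul(id_plus_M, S)  # should be close to the identity matrix
```

With $\|M\| \le 1/4$ the geometric series bound gives $\|(\operatorname{id} + M)^{-1}\| \le 4/3$, which is what `inf_norm(S)` can be compared against.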

5.3. Proof of Theorem 5.1. Let $R$ be the operator from Lemma 5.2. Set $U = 1 + R V$. As $\bar\partial R V$ is continuous on $X$, then also $\bar\partial U$ is continuous on $X$. (5.3) follows from (5.5). In view of (5.1), this further implies that $\|U - 1\|_X \le 1/4$. In particular, the values of $U$ are invertible. Moreover, from (5.4) we see that
$$U^{-1} \bar\partial U = U^{-1} \bar\partial (1 + R V) = U^{-1} \bar\partial (R V) = U^{-1} \bigl(V + (R V) V\bigr) = U^{-1} (1 + R V) V = U^{-1} U V = V. \qquad \Box$$

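6. Partitions of unity with estimates

In this section we use the following notations. For $\xi \in \mathbb{C}$ and $r > 0$, we set
$$B(\xi, r) = \{\zeta \in \mathbb{C} \mid |\zeta - \xi| < r\} \quad \text{and} \quad \overline{B}(\xi, r) = \{\zeta \in \mathbb{C} \mid |\zeta - \xi| \le r\}.$$
For $\varepsilon > 0$ and $\mu = (\mu_1, \mu_2) \in \mathbb{Z}^2$, we set
$$q_\mu^\varepsilon = \frac{\mu_1}{\sqrt{2}} \varepsilon + i \frac{\mu_2}{\sqrt{2}} \varepsilon. \tag{6.1}$$
It is easy to see that
$$\bigcup_{\mu \in \mathbb{Z}^2} \overline{B}\Bigl(q_\mu^\varepsilon, \frac{\varepsilon}{2}\Bigr) = \mathbb{C}. \tag{6.2}$$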

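6.1. Lemma. Let $J \subseteq \mathbb{Z}^2$ such that $\sharp J \ge 22$.¹⁰ Then
$$\bigcap_{\mu \in J} B\bigl(q_\mu^\varepsilon, \varepsilon\bigr) = \emptyset \quad \text{for all } \varepsilon > 0. \tag{6.3}$$

Proof. For $\mu \in \mathbb{Z}^2$ we denote by $J(\mu)$ the set of all indices $\nu \in \mathbb{Z}^2$ such that $|q_\nu^\varepsilon - q_\mu^\varepsilon| < 2\varepsilon$. By (6.1), this can be written in the form
$$\Bigl|\frac{\mu_1}{\sqrt{2}} \varepsilon - \frac{\nu_1}{\sqrt{2}} \varepsilon\Bigr|^2 + \Bigl|\frac{\mu_2}{\sqrt{2}} \varepsilon - \frac{\nu_2}{\sqrt{2}} \varepsilon\Bigr|^2 < 4 \varepsilon^2,$$
or, equivalently,
$$|\mu_1 - \nu_1|^2 + |\mu_2 - \nu_2|^2 < 8. \tag{6.4}$$
Since, for fixed $\mu$, the number of indices $\nu \in \mathbb{Z}^2$ satisfying (6.4) is equal to 21, we get
$$\sharp J(\mu) = 21 \quad \text{for all } \mu \in \mathbb{Z}^2. \tag{6.5}$$
Now let $J \subseteq \mathbb{Z}^2$ with $\sharp J \ge 22$ be given. Then, by (6.5), for each $\mu \in J$, there exists at least one index $\nu \in J$ such that $\nu \notin J(\mu)$, i.e., $|q_\nu^\varepsilon - q_\mu^\varepsilon| \ge 2\varepsilon$ and, hence,
$$B(q_\mu^\varepsilon, \varepsilon) \cap B(q_\nu^\varepsilon, \varepsilon) = \emptyset.$$
In particular, then we have (6.3). □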

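¹⁰ By $\sharp J$ we denote the number of elements of $J$.

6.2. Lemma. For each $\varepsilon > 0$, there exists a $\mathcal{C}^\infty$-partition of unity $\{\chi_\mu\}_{\mu \in \mathbb{Z}^2}$ subordinate to the open covering
$$\bigl\{B\bigl(q_\mu^\varepsilon, \varepsilon\bigr)\bigr\}_{\mu \in \mathbb{Z}^2} \tag{6.6}$$
of $\mathbb{C}$ (by (6.2) this is indeed a covering of $\mathbb{C}$) such that
$$\Bigl|\frac{\partial \chi_\mu}{\partial x}\Bigr|, \Bigl|\frac{\partial \chi_\mu}{\partial y}\Bigr| \le \frac{176}{\varepsilon} \quad \text{for all } \mu \in \mathbb{Z}^2. \tag{6.7}$$

Proof. Denote by $\partial$ one of the derivatives $\partial/\partial x$ and $\partial/\partial y$. Take a $\mathcal{C}^\infty$-function $\chi : [0, \infty[ \to [0, 1]$ such that $\chi \equiv 1$ in a neighborhood of $[0, 1]$, $\chi \equiv 0$ in a neighborhood of $[2, \infty[$, and
$$|\chi'| \le 2 \quad \text{everywhere on } [0, \infty[. \tag{6.8}$$
Set
$$\tilde\chi_\mu(\zeta) = \chi\Bigl(4 \frac{|\zeta - q_\mu^\varepsilon|^2}{\varepsilon^2}\Bigr) \quad \text{for all } \zeta \in \mathbb{C} \text{ and } \mu \in \mathbb{Z}^2.$$
Then, for all $\mu \in \mathbb{Z}^2$,
$$\tilde\chi_\mu \equiv 1 \text{ in a neighborhood of } \overline{B}\Bigl(q_\mu^\varepsilon, \frac{\varepsilon}{2}\Bigr), \tag{6.9}$$
$$\operatorname{supp} \tilde\chi_\mu \subseteq B\Bigl(q_\mu^\varepsilon, \frac{\varepsilon}{\sqrt{2}}\Bigr). \tag{6.10}$$
Moreover, we set
$$\varphi = \sum_{\mu \in \mathbb{Z}^2} \tilde\chi_\mu.$$
By Lemma 6.1, the sum in the definition of $\varphi$ is locally finite. Therefore, $\varphi$ is a $\mathcal{C}^\infty$-function on $\mathbb{C}$ and
$$\partial \varphi = \sum_{\mu \in \mathbb{Z}^2} \partial \tilde\chi_\mu. \tag{6.11}$$
From (6.9) and (6.2) we see that
$$\varphi \ge 1 \quad \text{everywhere on } \mathbb{C}. \tag{6.12}$$
Therefore, setting
$$\chi_\mu = \tilde\chi_\mu / \varphi, \qquad \mu \in \mathbb{Z}^2,$$
we obtain a $\mathcal{C}^\infty$-partition of unity $\{\chi_\mu\}_{\mu \in \mathbb{Z}^2}$ on $\mathbb{C}$. By (6.10) this partition of unity is subordinate to the covering (6.6). It remains to prove estimate (6.7). We have
$$(\partial \tilde\chi_\mu)(\zeta) = \frac{4}{\varepsilon^2} \, \tilde\chi_\mu'\Bigl(4 \frac{|\zeta - q_\mu^\varepsilon|^2}{\varepsilon^2}\Bigr) \, \partial (\zeta - q_\mu^\varepsilon).$$
Taking into account (6.8) and the fact that, by (6.10),
$$|\zeta - q_\mu^\varepsilon| < \varepsilon \quad \text{if } \tilde\chi_\mu'\Bigl(\frac{4 |\zeta - q_\mu^\varepsilon|^2}{\varepsilon^2}\Bigr) \ne 0,$$
this implies that
$$|\partial \tilde\chi_\mu| \le \frac{8}{\varepsilon} \quad \text{on } \mathbb{C}. \tag{6.13}$$
Since, by (6.10) and Lemma 6.1, locally, the sum in (6.11) contains not more than 21 non-zero terms, this further implies that
$$|\partial \varphi| \le \frac{168}{\varepsilon} \quad \text{on } \mathbb{C}. \tag{6.14}$$
As
$$\partial \chi_\mu = \frac{\partial \tilde\chi_\mu}{\varphi} - \frac{\tilde\chi_\mu}{\varphi^2} \, \partial \varphi, \qquad \varphi \ge 1, \qquad \text{and} \quad \tilde\chi_\mu \le 1,$$
from (6.13) and (6.14) we see that
$$|\partial \chi_\mu| \le |\partial \tilde\chi_\mu| + |\partial \varphi| \le \frac{8}{\varepsilon} + \frac{168}{\varepsilon} = \frac{176}{\varepsilon}. \qquad \Box$$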

6.3. Let $X \subseteq \mathbb{C}$ such that $X \subseteq \overline{\operatorname{int} X}$, and let $\mathcal{U} = \{U_i\}_{i \in I}$ be a covering of $X$ by relatively open subsets of $X$. We say that $\{\chi_i\}_{i \in I}$ is a $\mathcal{C}^\infty$-partition of unity subordinate to $\mathcal{U}$ if
(i) for each $i \in I$, $\chi_i$ is a non-negative real $\mathcal{C}^\infty$-function on $X$ (in the sense explained in Section 2.1) such that $\operatorname{supp} \chi_i$ is compact and contained in $U_i$;
(ii) for each $a \in X$, there exists a relatively open neighborhood $U(a) \subseteq X$ of $a$ such that $\chi_i \equiv 0$ on $U(a)$ for all $i \in I$ except for a finite number;
(iii) $\sum_{i \in I} \chi_i \equiv 1$ on $X$.

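If, for some $m \in \mathbb{N}^*$, the neighborhood $U(a)$ in condition (ii) can always be chosen so that at most $m$ of the functions $\chi_i$ do not vanish identically on $U(a)$, then the partition of unity $\{\chi_i\}_{i\in I}$ will be called of order $\le m$. Note that, by Lemma 6.1, each $\mathcal{C}^\infty$ partition of unity which is subordinate to the covering
$$\{X \cap B(q_\mu^\varepsilon, \varepsilon)\}_{\mu\in\mathbb{Z}^2} \tag{6.15}$$
is of order $\le 21$. We now combine Lemmas 6.1 and 6.2.

6.4. Lemma. Let $X \subseteq \mathbb{C}$, let $\mathcal{U} = \{U_i\}_{i\in I}$ be an $\varepsilon$-separated covering of $X$ by relatively open subsets of $X$, $\varepsilon > 0$ (Def. 2.2). Then there exists a $\mathcal{C}^\infty$ partition of unity $\{\chi_i\}_{i\in I}$ subordinate to $\mathcal{U}$, which is of order $\le 21$ and such that
$$\Big|\frac{\partial\chi_i}{\partial x}\Big|,\ \Big|\frac{\partial\chi_i}{\partial y}\Big| \le \frac{2^{12}}{\varepsilon} \quad\text{on } X, \quad i \in I. \tag{6.16}$$

Proof. Denote by $\partial$ one of the derivatives $\partial/\partial x$ and $\partial/\partial y$. Since $\mathcal{U}$ is $\varepsilon$-separated, the covering (6.15) is a refinement of $\mathcal{U}$, i.e., there is a map $\tau : \mathbb{Z}^2 \to I$ such that
$$X \cap B(q_\mu^\varepsilon, \varepsilon) \subseteq U_{\tau(\mu)}, \qquad \mu \in \mathbb{Z}^2. \tag{6.17}$$
By Lemma 6.2, there exists a $\mathcal{C}^\infty$ partition of unity $\{\tilde\chi_\mu\}_{\mu\in\mathbb{Z}^2}$ subordinate to the open covering $\{B(q_\mu^\varepsilon, \varepsilon)\}_{\mu\in\mathbb{Z}^2}$ of $\mathbb{C}$ which satisfies
$$|\partial\tilde\chi_\mu| \le \frac{176}{\varepsilon}, \qquad \mu \in \mathbb{Z}^2. \tag{6.18}$$
Now, for $i \in I$, we define on $X$: $\chi_i = 0$ if $i \notin \tau(\mathbb{Z}^2)$, and
$$\chi_i = \sum_{\mu\in\tau^{-1}(i)} \tilde\chi_\mu \quad\text{if } i \in \tau(\mathbb{Z}^2). \tag{6.19}$$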

As the sets $\tau^{-1}(i)$, $i \in I$, are pairwise disjoint and $\mathbb{Z}^2$ is the union of these sets, it is clear that
$$\sum_{i\in I}\chi_i \equiv 1 \quad\text{on } X,$$
and from (6.17) we see that $\operatorname{supp}\chi_i \subseteq U_i$. Hence, $\{\chi_i\}_{i\in I}$ is a $\mathcal{C}^\infty$ partition of unity subordinate to $\mathcal{U}$. By Lemma 6.1, the partition of unity $\{\tilde\chi_\mu\}_{\mu\in\mathbb{Z}^2}$ is of order $\le 21$. Since the sets $\tau^{-1}(i)$, $i \in I$, are pairwise disjoint, this implies, by (6.19), that also the partition $\{\chi_i\}_{i\in I}$ is of order $\le 21$. The fact that the partition of unity $\{\tilde\chi_\mu\}_{\mu\in\mathbb{Z}^2}$ is of order $\le 21$ moreover implies that, in the sum (6.19), locally, not more than 21 terms are different from zero. Together with (6.18) this yields the required estimate:
$$|\partial\chi_i| \le 21\cdot\frac{176}{\varepsilon} < \frac{2^{12}}{\varepsilon}. \qquad\square$$
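The constants in Lemmas 6.2 and 6.4 come from a short piece of arithmetic. The following Python sketch only replays that arithmetic (it is not part of the paper); the variable names and the sample value of eps are assumptions chosen for illustration.

```python
from fractions import Fraction

# Constants taken from the text, expressed as multiples of 1/eps.
ORDER = 21          # at most 21 overlapping balls (Lemma 6.1)
BOUND_TILDE = 8     # |d chi~_mu| <= 8/eps, estimate (6.13)

bound_phi = ORDER * BOUND_TILDE        # (6.14): |d phi| <= 168/eps
bound_chi = BOUND_TILDE + bound_phi    # Lemma 6.2: |d chi_mu| <= 176/eps
bound_sum = ORDER * bound_chi          # at most 21 non-zero terms in (6.19)


def derivative_bounds(eps):
    """Return the three bounds for a concrete (hypothetical) eps > 0."""
    eps = Fraction(eps)
    return bound_phi / eps, bound_chi / eps, bound_sum / eps


# The last bound must stay below 2**12, which is the constant in (6.16).
print(bound_phi, bound_chi, bound_sum, 2 ** 12)
```

Running the sketch shows how much slack the rounded constant $2^{12}$ leaves over $21\cdot 176$.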


7. Continuous functions with continuous Cauchy-Riemann derivative. The additive case

In this section, $E$ is a Banach space, and $X$ is a bounded subset of $\mathbb{C}$ such that $X \subseteq \overline{\operatorname{int} X}$.

7.1. We denote by $\mathcal{BC}^E(X)$ the Banach space of $E$-valued bounded continuous functions on $X$ endowed with the sup-norm $\|\cdot\|_X$ defined by (2.2), and by $\mathcal{BC}^E_\partial(X)$ we denote the subspace of all $f \in \mathcal{BC}^E(X)$ such that also $\partial f \in \mathcal{BC}^E(X)$ (the domain of definition of the differential operator $\partial$ as an operator in $\mathcal{BC}^E(X)$). Notice that $\mathcal{BC}^E_\partial(X)$ becomes a Banach space if we introduce the norm $\|\cdot\|_\partial$ defined by
$$\|f\|_\partial = \|f\|_X + \|\partial f\|_X. \tag{7.1}$$

Below we use the following simple fact (see, e.g., Proposition 2.1.2 in [GL]): If 𝑓 ∈ ℬ𝒞∂𝐸 (𝑋) and 𝜒 : 𝑋 → ℂ is a bounded continuous function such that ∂𝜒 is also continuous and bounded on 𝑋, then 𝜒𝑓 belongs to ℬ𝒞∂𝐸 (𝑋) and ∂(𝜒𝑓 ) = (∂𝜒)𝑓 + 𝜒∂𝑓.

(7.2)

Now let $\mathcal{U} = \{U_i\}_{i\in I}$ be a covering of $X$ by relatively open subsets of $X$. Then we use the following notations:$^{11}$

∙ $C^0(\mathcal{U}, \mathcal{BC}^E_\partial)$ is the space of all families $\{f_i\}_{i\in I}$ of functions $f_i \in \mathcal{BC}^E_\partial(U_i)$.

∙ $C^1(\mathcal{U}, \mathcal{BC}^E_\partial)$ is the space of all families $\{f_{ij}\}_{i,j\in I}$ of functions $f_{ij} \in \mathcal{BC}^E_\partial(U_i \cap U_j)$.

∙ $Z^1(\mathcal{U}, \mathcal{BC}^E_\partial)$ is the subspace of all $f \in C^1(\mathcal{U}, \mathcal{BC}^E_\partial)$ satisfying the following (additive) cocycle condition:
$$f_{ij} + f_{jk} = f_{ik} \quad\text{on } U_i \cap U_j \cap U_k, \qquad i, j, k \in I. \tag{7.3}$$
The elements of this subspace will be called additive 1-cocycles.

∙ $C^2(\mathcal{U}, \mathcal{BC}^E_\partial)$ is the space of all families $\{f_{ijk}\}_{i,j,k\in I}$ of functions $f_{ijk} \in \mathcal{BC}^E_\partial(U_i \cap U_j \cap U_k)$.

∙ $Z^2(\mathcal{U}, \mathcal{BC}^E_\partial)$ is the subspace of all $f \in C^2(\mathcal{U}, \mathcal{BC}^E_\partial(X))$ satisfying the following condition (also called cocycle condition):
$$-f_{jkl} + f_{ikl} - f_{ijl} + f_{ijk} = 0 \quad\text{on } U_i \cap U_j \cap U_k \cap U_l, \qquad i, j, k, l \in I. \tag{7.4}$$
The elements of this subspace will be called additive 2-cocycles.

$^{11}$ These are notations from the theory of Čech cohomology with coefficients in sheaves, but, in this paper, we will not use this theory, except for some very simple facts, which will be explained. Note that the map $U \to \mathcal{BC}^E_\partial(U)$ applied to the relatively open subsets of $X$ is only a presheaf, but not a sheaf.

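∙ For $f \in C^q(\mathcal{U}, \mathcal{BC}^E_\partial(X))$, $q = 0, 1, 2$, we define
$$\|f\| = \sup_{i\in I}\|f_i\|_{U_i}, \qquad \|\partial f\| = \sup_{i\in I}\|\partial f_i\|_{U_i} \qquad\text{if } q = 0,$$
$$\|f\| = \sup_{i,j\in I}\|f_{ij}\|_{U_i\cap U_j}, \qquad \|\partial f\| = \sup_{i,j\in I}\|\partial f_{ij}\|_{U_i\cap U_j} \qquad\text{if } q = 1, \tag{7.5}$$
$$\|f\| = \sup_{i,j,k\in I}\|f_{ijk}\|_{U_i\cap U_j\cap U_k}, \qquad \|\partial f\| = \sup_{i,j,k\in I}\|\partial f_{ijk}\|_{U_i\cap U_j\cap U_k} \qquad\text{if } q = 2,$$
and
$$\|f\|_\partial = \|f\| + \|\partial f\|. \tag{7.6}$$
Note that these "numbers" can be infinite if $I$ is infinite. The space of all $f \in C^q(\mathcal{U}, \mathcal{BC}^E_\partial(X))$ with $\|f\|_\partial < \infty$ is a Banach space.

∙ We define linear operators
$$\delta : C^q(\mathcal{U}, \mathcal{BC}^E_\partial(X)) \to C^{q+1}(\mathcal{U}, \mathcal{BC}^E_\partial(X)), \qquad q = 0, 1,$$
setting, for $f \in C^0(\mathcal{U}, \mathcal{BC}^E_\partial(X))$,
$$(\delta f)_{ij} = f_i - f_j \quad\text{on } U_i \cap U_j, \qquad i, j \in I, \tag{7.7}$$
and, for $f \in C^1(\mathcal{U}, \mathcal{BC}^E_\partial(X))$,
$$(\delta f)_{ijk} = -f_{jk} + f_{ik} - f_{ij} \quad\text{on } U_i \cap U_j \cap U_k, \qquad i, j, k \in I. \tag{7.8}$$

∙ The element $f \in C^q(\mathcal{U}, \mathcal{BC}^E_\partial(X))$ defined by $f_i \equiv 0$ if $q = 0$, $f_{ij} \equiv 0$ if $q = 1$, and $f_{ijk} \equiv 0$ if $q = 2$ will be denoted by $0$.

It is easy to check that
$$\delta C^q(\mathcal{U}, \mathcal{BC}^E_\partial(X)) \subseteq Z^{q+1}(\mathcal{U}, \mathcal{BC}^E_\partial(X)), \qquad q = 0, 1. \tag{7.9}$$
Note also that the definition of $\delta$ is chosen so that an element $f \in C^1(\mathcal{U}, \mathcal{BC}^E_\partial(X))$ is an additive 1-cocycle if and only if $\delta f = 0$.$^{12}$ Moreover, it is well known from the general theory of Čech cohomology that each $f \in Z^q(\mathcal{U}, \mathcal{BC}^E_\partial(X))$, $q = 1, 2$, is of the form $f = \delta u$, where $u \in C^{q-1}(\mathcal{U}, \mathcal{BC}^E_\partial(X))$. We need a version with estimates of the latter fact, which is stated by the following lemma (the proof of this lemma is a modification of the corresponding arguments from the general theory).

7.2. Lemma. Let $\varepsilon > 0$, and let $\mathcal{U}$ be an $\varepsilon$-separated covering of $X$ by relatively open sets (Definition 2.2). Then, for each $f \in Z^q(\mathcal{U}, \mathcal{BC}^E_\partial(X))$, $q = 1, 2$, such that $\|f\|_\partial < \infty$, there exists $u \in C^{q-1}(\mathcal{U}, \mathcal{BC}^E_\partial(X))$ such that
$$\delta u = f, \tag{7.10}$$
$$\|u\| \le \|f\| \tag{7.11}$$
and
$$\|\partial u\| \le \|\partial f\| + \frac{2^{17}}{\varepsilon}\|f\|. \tag{7.12}$$

$^{12}$ In the general theory of Čech cohomology, such an operator $\delta$ is defined also on $C^2(\mathcal{U}, \mathcal{BC}^E_\partial(X))$, and its kernel is $Z^2(\mathcal{U}, \mathcal{BC}^E_\partial(X))$. Here we do not need this.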

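Proof. Let $\mathcal{U} = \{U_i\}_{i\in I}$, and let $f \in Z^q(\mathcal{U}, \mathcal{BC}^E_\partial(X))$ with $\|f\| + \|\partial f\| < \infty$ be given. Then, by Lemma 6.4, there exists a $\mathcal{C}^\infty$ partition of unity $\{\chi_i\}_{i\in I}$ subordinate to $\mathcal{U}$, which is of order $\le 21$, such that
$$|(\partial\chi_i)(\zeta)| \le \frac{2^{12}}{\varepsilon}, \qquad \zeta \in X, \quad i \in I. \tag{7.13}$$

First let $q = 1$, so that $f = \{f_{ij}\}_{i,j\in I}$. Then we define $u = \{u_i\}_{i\in I} \in C^0(\mathcal{U}, \mathcal{BC}^E_\partial(X))$ by
$$u_i = -\sum_{k\in I}\chi_k f_{ki}. \tag{7.14}$$
As $f$ is an additive 1-cocycle and $\sum\chi_k \equiv 1$, then
$$u_i - u_j = \sum_{k\in I}\chi_k\big(-f_{ki} + f_{kj}\big) = \sum_{k\in I}\chi_k f_{ij} = f_{ij},$$
i.e., we have relation (7.10). Estimate (7.11) is clear, since all $\chi_k$ are non-negative and $\sum\chi_k \equiv 1$. Further, by (7.14) and (7.2),
$$\partial u_i = -\sum_{k\in I}\big(\chi_k\,\partial f_{ki} + (\partial\chi_k) f_{ki}\big), \qquad i \in I.$$
Hence
$$\|\partial u\| \le \|\partial f\| + \|f\| \sup_{\zeta\in X}\sum_{k\in I}|\partial\chi_k(\zeta)|.$$
Since $\{\chi_i\}$ is of order $\le 21$, now estimate (7.12) follows from (7.13).

Now let $q = 2$. Then we define $u \in C^1(\mathcal{U}, \mathcal{BC}^E_\partial(X))$ setting
$$u_{ij} = -\sum_{\nu\in I}\chi_\nu f_{\nu ij} \quad\text{on } U_i \cap U_j, \qquad i, j \in I. \tag{7.15}$$
Then, for all $i, j, k \in I$,
$$(\delta u)_{ijk} = -u_{jk} + u_{ik} - u_{ij} = \sum_{\nu\in I}\chi_\nu\big(f_{\nu jk} - f_{\nu ik} + f_{\nu ij}\big) \quad\text{on } U_i \cap U_j \cap U_k.$$
Moreover, as $f$ is an additive 2-cocycle, for all $\nu, i, j, k \in I$,
$$0 = -f_{ijk} + f_{\nu jk} - f_{\nu ik} + f_{\nu ij}, \quad\text{i.e.,}\quad f_{\nu jk} - f_{\nu ik} + f_{\nu ij} = f_{ijk} \quad\text{on } U_\nu \cap U_i \cap U_j \cap U_k.$$
Hence
$$(\delta u)_{ijk} = \sum_{\nu\in I}\chi_\nu f_{ijk} = f_{ijk} \quad\text{on } U_i \cap U_j \cap U_k, \qquad i, j, k \in I,$$
i.e., we have (7.10). Estimate (7.11) is clear, since all $\chi_\nu$ are non-negative and $\sum\chi_\nu \equiv 1$. Further, by (7.15) and (7.2),
$$\partial u_{ij} = -\sum_{\nu\in I}\big(\chi_\nu\,\partial f_{\nu ij} + (\partial\chi_\nu) f_{\nu ij}\big), \qquad i, j \in I.$$
As all $\chi_\nu$ are non-negative and $\sum\chi_\nu \equiv 1$, this implies that
$$\|\partial u\| \le \|\partial f\| + \|f\| \sup_{\zeta\in X}\sum_{\nu\in I}|\partial\chi_\nu(\zeta)|.$$
Since $\{\chi_\nu\}$ is of order $\le 21$, now estimate (7.12) again follows from (7.13). $\square$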
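The algebraic part of this proof (formulas (7.14) and (7.15)) can be replayed at a single point of $X$, where the functions are just numbers. The sketch below is only an illustration under that simplification: the index set, the weights `chi`, and the sample cochains `c` and `a` are hypothetical, and exact rational arithmetic is used so that the identities can be compared with `==`.

```python
from fractions import Fraction
from itertools import product

I = range(4)  # hypothetical finite index set
# Values of a partition of unity at one point: non-negative, summing to 1.
chi = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]


def delta0(u):
    """(delta u)_ij = u_i - u_j, formula (7.7)."""
    return {(i, j): u[i] - u[j] for i, j in product(I, repeat=2)}


def delta1(f):
    """(delta f)_ijk = -f_jk + f_ik - f_ij, formula (7.8)."""
    return {(i, j, k): -f[j, k] + f[i, k] - f[i, j]
            for i, j, k in product(I, repeat=3)}


def split1(f):
    """u_i = -sum_k chi_k f_ki, formula (7.14)."""
    return {i: -sum(chi[k] * f[k, i] for k in I) for i in I}


def split2(f):
    """u_ij = -sum_nu chi_nu f_(nu,i,j), formula (7.15)."""
    return {(i, j): -sum(chi[n] * f[n, i, j] for n in I)
            for i, j in product(I, repeat=2)}


# Sample cocycles: by (7.9), coboundaries are cocycles.
c = [Fraction(3), Fraction(-1), Fraction(5, 2), Fraction(0)]
a = {(i, j): Fraction(i * i + 3 * j, 7) for i, j in product(I, repeat=2)}
f1 = delta0(c)      # additive 1-cocycle
f2 = delta1(a)      # additive 2-cocycle

u0 = split1(f1)
u1 = split2(f2)
print(delta0(u0) == f1, delta1(u1) == f2)
```

Because each `u` value is a convex combination of values of `f` (up to sign), the bound (7.11) is visible directly in this toy setting; the derivative estimate (7.12) is not modelled here.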

8. Continuous functions with continuous Cauchy-Riemann derivative. The multiplicative case

We will use the following proposition.

8.1. Proposition. Let $X$ be a bounded subset of $\mathbb{C}$ such that $X \subseteq \overline{\operatorname{int} X}$, let $A$ be a Banach algebra with unit, and let $GA$ be the group of invertible elements of $A$. Then
(i) If $f, g : X \to A$ are continuous such that also $\partial f$ and $\partial g$ are continuous on $X$, then $\partial(fg)$ is continuous on $X$ and $\partial(fg) = (\partial f)g + f\,\partial g$.
(ii) If $f : X \to GA$ is continuous such that also $\partial f$ is continuous on $X$, then $\partial f^{-1}$ is continuous on $X$ and $\partial f^{-1} = -f^{-1}(\partial f)f^{-1}$.

Proof. This is clear when $X$ is open and the functions $f$ and $g$ are of class $\mathcal{C}^\infty$. The general case follows from this and the fact that, for each continuous function $f : \operatorname{int} X \to A$ such that $\partial f$ is also continuous on $\operatorname{int} X$, there exists a sequence $(f_n)$ of $\mathcal{C}^\infty$ functions $f_n : \operatorname{int} X \to A$ such that, uniformly on the compact subsets of $\operatorname{int} X$, both $\lim f_n = f$ and $\lim \partial f_n = \partial f$ (see, e.g., Lemma 2.1.3 in [GL]). $\square$

8.2. Let $X$ be a bounded subset of $\mathbb{C}$ such that $X \subseteq \overline{\operatorname{int} X}$, let $A$ be a Banach algebra with unit $1$, and let $GA$ be the group of invertible elements of $A$. Since $A$ is a Banach algebra, the Banach space $\mathcal{BC}^A(X)$ introduced in Section 7.1 now is a Banach algebra. Moreover we see from Proposition 8.1 (i) that the subspace $\mathcal{BC}^A_\partial(X)$ (also introduced in Section 7.1) is a subalgebra of $\mathcal{BC}^A(X)$, which becomes a Banach algebra if we introduce the norm (7.1). We denote by $\mathcal{BC}^{GA}_\partial(X)$ the set of all $f \in \mathcal{BC}^A_\partial(X)$ such that $f(\zeta) \in GA$ for all $\zeta \in X$ and
$$\sup_{\zeta\in X}\|f^{-1}(\zeta)\| < \infty. \tag{8.1}$$

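It follows from Proposition 8.1 (ii) that, for each $f \in \mathcal{BC}^{GA}_\partial(X)$, the function $f^{-1}$ again belongs to $\mathcal{BC}^{GA}_\partial(X)$, i.e., $\mathcal{BC}^{GA}_\partial(X)$ is the group of invertible elements of the algebra $\mathcal{BC}^{A}_\partial(X)$. Notice that the algebra $\mathcal{BO}^A(X)$ (Section 4.1) is a subalgebra of $\mathcal{BC}^A_\partial(X)$, and the group $\mathcal{BO}^{GA}(X)$ is a subgroup of $\mathcal{BC}^{GA}_\partial(X)$.

Now let $\mathcal{U} = \{U_i\}_{i\in I}$ be a covering of $X$ by relatively open subsets of $X$. Then, in addition to the notations introduced in Section 7.1, here we also need the following notations:

∙ $C^0(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ is the group of all $f \in C^0(\mathcal{U}, \mathcal{BC}^{A}_\partial)$ such that $f_i \in \mathcal{BC}^{GA}_\partial(U_i)$ for all $i \in I$.

∙ $C^1(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ is the set$^{13}$ of all $f \in C^1(\mathcal{U}, \mathcal{BC}^{A}_\partial)$ such that $f_{ij} \in \mathcal{BC}^{GA}_\partial(U_i \cap U_j)$ for all $i, j \in I$.

∙ $Z^1(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ is the subset of all $f \in C^1(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ satisfying the multiplicative cocycle condition
$$f_{ij} f_{jk} = f_{ik} \quad\text{on } U_i \cap U_j \cap U_k, \qquad i, j, k \in I. \tag{8.2}$$
The elements of this subset will be called multiplicative cocycles.

∙ For $g \in C^0(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ and $f \in C^1(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$, we define an element $g \diamond f \in C^1(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ by
$$(g \diamond f)_{ij} = g_i^{-1} f_{ij}\, g_j \quad\text{on } U_i \cap U_j, \qquad i, j \in I. \tag{8.3}$$

Note that $g \diamond 1$ is always a multiplicative cocycle. Notice also that, for all $g, h \in C^0(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ and $f \in C^1(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$,
$$(gh) \diamond f = h \diamond (g \diamond f). \tag{8.4}$$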
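The two remarks on the operation $\diamond$ are purely algebraic and can be checked at a single point with non-commuting values. The following sketch is only an illustration: it takes $A$ to be the algebra of $2\times 2$ rational matrices (an assumption made here, not in the text), and the sample families `g`, `h`, `f` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def M(p, q, r, s):
    """2x2 matrix with exact rational entries, stored as nested tuples."""
    return ((Fraction(p), Fraction(q)), (Fraction(r), Fraction(s)))


def mul(a, b):
    return tuple(tuple(sum(a[r][k] * b[k][c] for k in range(2))
                       for c in range(2)) for r in range(2))


def inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((a[1][1] / det, -a[0][1] / det),
            (-a[1][0] / det, a[0][0] / det))


ONE = M(1, 0, 0, 1)
I = range(3)  # hypothetical index set


def diamond(g, f):
    """(g <> f)_ij = g_i^{-1} f_ij g_j, formula (8.3)."""
    return {(i, j): mul(mul(inv(g[i]), f[i, j]), g[j]) for (i, j) in f}


def is_cocycle(f):
    """Multiplicative cocycle condition (8.2)."""
    return all(mul(f[i, j], f[j, k]) == f[i, k]
               for i, j, k in product(I, repeat=3))


g = [M(1, 1, 0, 1), M(2, 0, 1, 1), M(1, 2, 3, 7)]
h = [M(1, 0, 2, 1), M(3, 1, 1, 1), M(0, 1, -1, 0)]
f = {(i, j): M(1 + i, j, 0, 1) for i, j in product(I, repeat=2)}
unit = {(i, j): ONE for i, j in product(I, repeat=2)}

gh = [mul(x, y) for x, y in zip(g, h)]
print(diamond(gh, f) == diamond(h, diamond(g, f)))   # relation (8.4)
print(is_cocycle(diamond(g, unit)))                  # g <> 1 is a cocycle
```

In the same setting, the family $c_{ij} = g_i g_j^{-1}$ satisfies $g \diamond c = 1$, which is the shape of the splitting that Theorem 8.3 provides.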

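The aim of this section is to prove the following theorem.

8.3. Theorem. Let $X \subseteq \mathbb{C}$ be such that $X \subseteq \overline{\operatorname{int} X}$, and let $\mathcal{U} = \{U_i\}_{i\in I}$ be an $\varepsilon$-separated covering of $X$ by relatively open sets (Definition 2.2). Let $f \in Z^1(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ such that
$$\|f - 1\| \le \frac{1}{64}. \tag{8.5}$$
Then there exists $g \in C^0(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ such that
$$g \diamond f = 1, \tag{8.6}$$
$$\|g - 1\| \le 2\|f - 1\|, \tag{8.7}$$
and
$$\|\partial g\| \le \frac{2^{20}}{\varepsilon}\|f - 1\| + 2\|\partial f\|. \tag{8.8}$$

We first prove the following lemma.

$^{13}$ This is also a group, but here we will not use the group structure.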

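8.4. Lemma. Under the hypotheses of Theorem 8.3 there exists $g \in C^0(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ such that
$$\|g \diamond f - 1\| \le \frac{1}{8}\|f - 1\|, \tag{8.9}$$
$$\|\partial(g \diamond f)\| \le \frac{2^{22}}{\varepsilon}\|f - 1\|^2 + 16\|\partial f\|\,\|f - 1\|, \tag{8.10}$$
$$\|g - 1\| \le \frac{65}{64}\|f - 1\|, \tag{8.11}$$
$$\|\partial g\| \le \frac{2^{18}}{\varepsilon}\|f - 1\| + \frac{33}{32}\|\partial f\|. \tag{8.12}$$

Proof. We set $a = f - 1$. In general, $a$ is not an additive 1-cocycle, i.e., $\delta a \neq 0$ (Section 7.1). But, as $f_{ik} = f_{ij} f_{jk}$, we have
$$1 + a_{ik} = (1 + a_{ij})(1 + a_{jk}) = 1 + a_{ij} + a_{jk} + a_{ij}a_{jk}$$
and therefore
$$(\delta a)_{ijk} = -a_{jk} + a_{ik} - a_{ij} = a_{ij}a_{jk} \quad\text{on } U_i \cap U_j \cap U_k, \qquad i, j, k \in I.$$
Hence
$$\|\delta a\| \le \|a\|^2 \tag{8.13}$$
and
$$\|\partial\delta a\| \le 2\|a\|\,\|\partial a\|. \tag{8.14}$$
As $\delta a$ is an additive 2-cocycle (see (7.9)), it follows from Lemma 7.2 that there exists $u \in C^1(\mathcal{U}, \mathcal{BC}^A_\partial)$ such that
$$\delta u = \delta a, \tag{8.15}$$
$$\|u\| \le \|\delta a\|,$$
and
$$\|\partial u\| \le \|\partial\delta a\| + \frac{2^{17}}{\varepsilon}\|\delta a\|.$$
By (8.13) and (8.14), the last two estimates further imply
$$\|u\| \le \|a\|^2 \tag{8.16}$$
and
$$\|\partial u\| \le 2\|a\|\,\|\partial a\| + \frac{2^{17}}{\varepsilon}\|a\|^2. \tag{8.17}$$
By (8.15), $a - u$ is an additive 1-cocycle. Therefore, again from Lemma 7.2 we get $v \in C^0(\mathcal{U}, \mathcal{BC}^A_\partial)$ such that
$$\delta v = a - u, \tag{8.18}$$
$$\|v\| \le \|a - u\|, \tag{8.19}$$
and
$$\|\partial v\| \le \|\partial a - \partial u\| + \frac{2^{17}}{\varepsilon}\|a - u\|. \tag{8.20}$$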

By (8.19) and (8.16), $\|v\| \le \|a\| + \|a\|^2$. By (8.5) this implies that
$$\|v\| \le \frac{65}{64}\|a\| \tag{8.21}$$
and
$$\|v\| < \frac{65}{2^{12}}. \tag{8.22}$$
In particular $\|v\| < 1$, which implies (by the arguments given at the beginning of Section 8.2) that $g := 1 + v$ belongs to the group $C^0(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$. We will show that this $g$ has the required properties.

Estimate (8.11) is clear by (8.21). To prove the remaining properties we set
$$\theta = g^{-1} - 1 + v. \tag{8.23}$$
As $g^{-1}$ and $v$ belong to $C^0(\mathcal{U}, \mathcal{BC}^A_\partial)$, then also $\theta \in C^0(\mathcal{U}, \mathcal{BC}^A_\partial)$. Moreover, since $\|v\| < 1$, for $g^{-1}$ we have the representation
$$g^{-1} = \sum_{\mu=0}^{\infty}(-v)^\mu,$$
where the convergence is absolute with respect to $\|\cdot\|$. (Below we shall prove that the convergence is even absolute with respect to the Banach space norm $\|\cdot\|_\partial$ defined by (7.6).) Therefore, with the same kind of convergence,
$$\theta = \sum_{\mu=2}^{\infty}(-v)^\mu. \tag{8.24}$$
More precisely, we see from (8.21) that
$$\|\theta\| \le \|v\|^2\sum_{\mu=0}^{\infty}\|v\|^\mu \le \frac{65^2}{2^{12}}\|a\|^2\sum_{\mu=0}^{\infty}\|v\|^\mu.$$
Since, by (8.22),
$$\sum_{\mu=0}^{\infty}\|v\|^\mu \le \sum_{\mu=0}^{\infty}\Big(\frac{65}{2^{12}}\Big)^\mu = \frac{1}{1 - \frac{65}{2^{12}}} = \frac{2^{12}}{2^{12} - 65},$$
this implies
$$\|\theta\| \le \frac{65^2}{2^{12} - 65}\|a\|^2 \le 2\|a\|^2. \tag{8.25}$$
From (8.23) we see
$$(g \diamond f)_{ij} = (1 - v_i + \theta_i)(1 + a_{ij})(1 + v_j) = 1 - v_i + a_{ij} + v_j - v_i a_{ij} + a_{ij}v_j - v_i v_j - v_i a_{ij} v_j + \theta_i(1 + a_{ij})(1 + v_j).$$
Since, by (8.18), $-v_i + a_{ij} + v_j = a_{ij} - (\delta v)_{ij} = u_{ij}$, this implies that
$$(g \diamond f)_{ij} - 1 = u_{ij} - v_i a_{ij} + a_{ij}v_j - v_i v_j - v_i a_{ij} v_j + \theta_i(1 + a_{ij})(1 + v_j). \tag{8.26}$$
Hence
$$\|g \diamond f - 1\| \le \|u\| + 2\|v\|\,\|a\| + \|v\|^2 + \|v\|^2\|a\| + \|\theta\|(1 + \|a\|)(1 + \|v\|).$$

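In view of (8.16), (8.21), (8.5), (8.25), and (8.22), this implies
$$\|g \diamond f - 1\| \le \Big(1 + \frac{65}{32} + \frac{65^2}{2^{12}} + \frac{65^2}{2^{18}} + 2\cdot\frac{65}{64}\cdot\frac{2^{12} + 65}{2^{12}}\Big)\|a\|^2 \le 8\|a\|^2. \tag{8.27}$$
Taking again into account that $\|a\| \le 1/64$, this implies (8.9).

From (8.20), (8.16), and (8.17) we see that
$$\|\partial v\| \le \|\partial a\| + \|\partial u\| + \frac{2^{17}}{\varepsilon}\big(\|a\| + \|u\|\big) \le \|\partial a\| + 2\|a\|\,\|\partial a\| + \frac{2^{17}}{\varepsilon}\|a\|^2 + \frac{2^{17}}{\varepsilon}\big(\|a\| + \|a\|^2\big) = \|\partial a\| + 2\|a\|\,\|\partial a\| + \frac{2^{17}}{\varepsilon}\big(\|a\| + 2\|a\|^2\big).$$
As $\|a\| \le 1/64$, this further implies that
$$\|\partial v\| \le \frac{33}{32}\|\partial a\| + \frac{2^{18}}{\varepsilon}\|a\|. \tag{8.28}$$
Since $\partial g = \partial v$, this proves (8.12).

Next we estimate $\|\partial\theta\|$. From the product rule (Proposition 8.1 (i)) it follows that
$$\|\partial(-v)^\mu\| \le \mu\|\partial v\|\,\|v\|^{\mu-1} \quad\text{if } \mu \ge 1.$$
By (8.21) this implies that
$$\|\partial(-v)^\mu\| \le \mu\|\partial v\|\,\frac{65}{64}\|a\|\,\|v\|^{\mu-2} \quad\text{if } \mu \ge 2,$$
and further, by (8.22),
$$\|\partial(-v)^\mu\| \le \mu\|\partial v\|\,\frac{65}{64}\|a\|\Big(\frac{65}{2^{12}}\Big)^{\mu-2} \quad\text{if } \mu \ge 2.$$
Moreover,
$$\sum_{\mu=2}^{\infty}\mu\Big(\frac{65}{2^{12}}\Big)^{\mu-2} \le \Big(\sup_{\mu\ge 2}\frac{\mu}{2^{\mu-2}}\Big)\sum_{\mu=2}^{\infty}\Big(\frac{65}{2^{11}}\Big)^{\mu-2} = 2\sum_{\mu=2}^{\infty}\Big(\frac{65}{2^{11}}\Big)^{\mu-2} < 2\sum_{\mu=2}^{\infty}\Big(\frac{1}{16}\Big)^{\mu-2} = \frac{32}{15}.$$
Together this implies that
$$\sum_{\mu=2}^{\infty}\|\partial(-v)^\mu\| \le \frac{65}{64}\cdot\frac{32}{15}\|\partial v\|\,\|a\| \le \frac{7}{3}\|\partial v\|\,\|a\|.$$
Hence, by (8.24),
$$\|\partial\theta\| \le \frac{7}{3}\|\partial v\|\,\|a\|,$$
and further, by (8.28),
$$\|\partial\theta\| \le \frac{7}{3}\Big(\frac{33}{32}\|\partial a\| + \frac{2^{18}}{\varepsilon}\|a\|\Big)\|a\| \le \frac{2^{20}}{\varepsilon}\|a\|^2 + \frac{5}{2}\|\partial a\|\,\|a\|. \tag{8.29}$$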

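From (8.26) and the product rule (Proposition 8.1 (i)) we see
$$\|\partial(g \diamond f)\| \le \|\partial u\| + 2\|\partial v\|\,\|a\| + 2\|\partial a\|\,\|v\| + 2\|\partial v\|\,\|v\| + 2\|v\|\,\|a\|\,\|\partial v\| + \|v\|^2\|\partial a\| + \|\partial\theta\|(1 + \|a\|)(1 + \|v\|) + \|\theta\|\big(\|\partial a\| + \|\partial v\| + \|v\|\,\|\partial a\| + \|a\|\,\|\partial v\|\big).$$
Taking into account that
$$\|v\| \le \frac{65}{64}\|a\|, \qquad \|v\| < \frac{65}{2^{12}}, \qquad \|a\| \le \frac{1}{64}, \qquad\text{and}\qquad \|\theta\| \le 2\|a\|^2 \le \frac{\|a\|}{32}$$
(see (8.21), (8.22), (8.5), and (8.25)), this implies that
$$\|\partial(g \diamond f)\| \le \|\partial u\| + 2\|\partial v\|\,\|a\| + \frac{65}{32}\|\partial a\|\,\|a\| + \frac{65}{32}\|\partial v\|\,\|a\| + \frac{65}{2^{11}}\|a\|\,\|\partial v\| + \frac{65^2}{2^{18}}\|a\|\,\|\partial a\| + 2\|\partial\theta\| + \frac{\|a\|}{32}\Big(\|\partial a\| + \|\partial v\| + \frac{65}{2^{12}}\|\partial a\| + \frac{1}{64}\|\partial v\|\Big)$$
$$\le \|\partial u\| + \Big(\frac{65}{32} + \frac{65^2}{2^{18}} + \frac{1}{32} + \frac{65}{2^{17}}\Big)\|\partial a\|\,\|a\| + \Big(2 + \frac{65}{32} + \frac{65}{2^{11}} + \frac{1}{32} + \frac{1}{2^{11}}\Big)\|\partial v\|\,\|a\| + 2\|\partial\theta\|$$
$$\le \|\partial u\| + 3\|\partial a\|\,\|a\| + 5\|\partial v\|\,\|a\| + 2\|\partial\theta\|.$$
Together with (8.17), (8.28), and (8.29) this proves (8.10):
$$\|\partial(g \diamond f)\| \le 2\|a\|\,\|\partial a\| + \frac{2^{17}}{\varepsilon}\|a\|^2 + 3\|\partial a\|\,\|a\| + 5\Big(\frac{33}{32}\|\partial a\| + \frac{2^{18}}{\varepsilon}\|a\|\Big)\|a\| + 2\Big(\frac{2^{20}}{\varepsilon}\|a\|^2 + \frac{5}{2}\|\partial a\|\,\|a\|\Big)$$
$$\le \Big(\frac{2^{17}}{\varepsilon} + 5\,\frac{2^{18}}{\varepsilon} + 2\,\frac{2^{20}}{\varepsilon}\Big)\|a\|^2 + \Big(2 + 3 + 5\cdot\frac{33}{32} + 5\Big)\|\partial a\|\,\|a\| \le \frac{2^{22}}{\varepsilon}\|a\|^2 + 16\|\partial a\|\,\|a\|. \qquad\square$$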
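The proof of Lemma 8.4 rounds several explicit constants upward. The sketch below replays that rounding with exact fractions in the worst case $\|a\| = 1/64$; it is only a bookkeeping aid, not part of the paper, and the variable names are assumptions made for illustration.

```python
from fractions import Fraction as F

a_max = F(1, 64)            # hypothesis (8.5): ||a|| <= 1/64
k = 1 + a_max               # 65/64, the factor in (8.21)
v_max = k * a_max           # 65/2^12, estimate (8.22)
geom = 1 / (1 - v_max)      # 2^12 / (2^12 - 65), geometric series bound

theta_coeff = k ** 2 * geom                     # coefficient in (8.25)
c_827 = 1 + 2 * k + k ** 2 + k ** 2 * a_max + 2 * k * (1 + v_max)
# Closed form of sum_{mu >= 2} mu * x^(mu - 2) for x = v_max.
series = 2 * geom + v_max * geom ** 2
dtheta_coeff = k * series                       # compared with 7/3

checks = {
    "(8.25)": theta_coeff <= 2,
    "(8.27)": c_827 <= 8,
    "(8.9)": 8 * a_max == F(1, 8),
    "(8.28)": 1 + 2 * a_max == F(33, 32)
              and 2 ** 17 * (1 + 2 * a_max) <= 2 ** 18,
    "series": series < F(32, 15),
    "dtheta": dtheta_coeff <= F(7, 3),
    "(8.29)": F(7, 3) * F(33, 32) <= F(5, 2)
              and F(7, 3) * 2 ** 18 <= 2 ** 20,
    "(8.10)": 2 ** 17 + 5 * 2 ** 18 + 2 * 2 ** 20 <= 2 ** 22
              and 2 + 3 + 5 * F(33, 32) + 5 <= 16,
}
for label, ok in checks.items():
    print(label, ok)
```

Each entry corresponds to one rounding step in the proof, so a failing entry would point at the exact inequality that needs attention.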

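8.5. Proof of Theorem 8.3. We write for short
$$C = \frac{2^{22}}{\varepsilon}. \tag{8.30}$$
Then, from Lemma 8.4 we get sequences $(g_j)_{j\in\mathbb{N}^*}$ of elements $g_j \in C^0(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ and $(f_j)_{j\in\mathbb{N}}$ of cocycles $f_j \in Z^1(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ such that $f_0 = f$ and, for all $j \in \mathbb{N}$,
$$f_{j+1} = g_{j+1} \diamond f_j, \tag{8.31}$$
$$\|f_{j+1} - 1\| \le \frac{1}{8}\|f_j - 1\|, \tag{8.32}$$
$$\|\partial f_{j+1}\| \le C\|f_j - 1\|^2 + 16\|\partial f_j\|\,\|f_j - 1\|, \tag{8.33}$$
$$\|g_{j+1} - 1\| \le \frac{65}{64}\|f_j - 1\|, \tag{8.34}$$
$$\|\partial g_{j+1}\| \le \frac{C}{16}\|f_j - 1\| + \frac{33}{32}\|\partial f_j\|. \tag{8.35}$$
Since $f_0 = f$, it follows from (8.32) that
$$\|f_j - 1\| \le \frac{1}{8^j}\|f - 1\| = \frac{1}{2^{3j}}\|f - 1\|, \qquad j \in \mathbb{N}. \tag{8.36}$$
Together with (8.34) this yields
$$\|g_j - 1\| \le \frac{65}{64}\,\frac{1}{2^{3j-3}}\|f - 1\|, \qquad j \in \mathbb{N}^*. \tag{8.37}$$
Since $\|f - 1\| \le 1/64$, this in particular implies that, for all $j \in \mathbb{N}^*$, $\|g_j - 1\| < \frac{1}{7}$ and, therefore,
$$\log\big(1 + \|g_j - 1\|\big) = -\sum_{\mu=1}^{\infty}(-1)^\mu\frac{\|g_j - 1\|^\mu}{\mu} \le \|g_j - 1\|\sum_{\mu=0}^{\infty}\frac{1}{7^\mu} = \frac{7}{6}\|g_j - 1\|.$$
Together with (8.37) this further implies that
$$\log\big(1 + \|g_j - 1\|\big) \le \frac{7}{6}\,\frac{1}{2^{3j-3}}\|f - 1\|, \qquad j \in \mathbb{N}^*. \tag{8.38}$$
Hence, for each $N \in \mathbb{N}$,
$$\sum_{j=N+1}^{\infty}\log\big(1 + \|g_j - 1\|\big) \le \frac{7}{6}\sum_{j=N+1}^{\infty}\frac{1}{2^{3j-3}}\|f - 1\| = \frac{4}{3}\,\frac{1}{2^{3N}}\|f - 1\|. \tag{8.39}$$
As $\|f - 1\| \le 1/64$, this in particular implies that
$$\sum_{j=1}^{\infty}\log\big(1 + \|g_j - 1\|\big) \le \frac{1}{48}. \tag{8.40}$$
Moreover, for $N \in \mathbb{N}$ and $M \in \mathbb{N}^*$, we have
$$\prod_{j=N+1}^{N+M}\big(1 + \|g_j - 1\|\big) = 1 + \sum_{\kappa=1}^{M}\ \sum_{N+1\le\mu_1<\dots<\mu_\kappa\le N+M}\|g_{\mu_1} - 1\|\cdots\|g_{\mu_\kappa} - 1\|$$
and, similarly,
$$\prod_{j=N+1}^{N+M}\big(1 + (g_j - 1)\big) = 1 + \sum_{\kappa=1}^{M}\ \sum_{N+1\le\mu_1<\dots<\mu_\kappa\le N+M}(g_{\mu_1} - 1)\cdots(g_{\mu_\kappa} - 1).$$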

Hence, for $N \in \mathbb{N}$ and $M \in \mathbb{N}^*$,
$$\Big\|\prod_{j=N+1}^{N+M} g_j - 1\Big\| = \Big\|\prod_{j=N+1}^{N+M}\big(1 + (g_j - 1)\big) - 1\Big\| = \Big\|\sum_{\kappa=1}^{M}\ \sum_{N+1\le\mu_1<\dots<\mu_\kappa\le N+M}(g_{\mu_1} - 1)\cdots(g_{\mu_\kappa} - 1)\Big\|$$
$$\le \sum_{\kappa=1}^{M}\ \sum_{N+1\le\mu_1<\dots<\mu_\kappa\le N+M}\|g_{\mu_1} - 1\|\cdots\|g_{\mu_\kappa} - 1\| = \prod_{j=N+1}^{N+M}\big(1 + \|g_j - 1\|\big) - 1 = \exp\Big(\sum_{j=N+1}^{N+M}\log\big(1 + \|g_j - 1\|\big)\Big) - 1.$$
Taking into account (8.40) and the fact that
$$e^x - 1 \le e^{1/48}x \le \frac{3}{2}x \quad\text{for } 0 \le x \le \frac{1}{48},$$
this implies that
$$\Big\|\prod_{j=N+1}^{N+M} g_j - 1\Big\| \le \frac{3}{2}\sum_{j=N+1}^{N+M}\log\big(1 + \|g_j - 1\|\big), \qquad N \in \mathbb{N},\ M \in \mathbb{N}^*,$$
and, therefore, by (8.39),
$$\Big\|\prod_{j=N+1}^{N+M} g_j - 1\Big\| \le \frac{2}{2^{3N}}\|f - 1\|, \qquad N \in \mathbb{N},\ M \in \mathbb{N}^*. \tag{8.41}$$
In particular, as $\|f - 1\| \le 2^{-6}$,
$$\Big\|\prod_{j=N+1}^{N+M} g_j - 1\Big\| \le \frac{1}{2^{5+3N}}, \qquad N \in \mathbb{N},\ M \in \mathbb{N}^*, \tag{8.42}$$
and, hence,
$$\Big\|\prod_{j=N+1}^{N+M} g_j\Big\| < 2, \qquad N \in \mathbb{N},\ M \in \mathbb{N}^*. \tag{8.43}$$
From the last two inequalities we see that, for all $N, M \in \mathbb{N}^*$,
$$\Big\|\prod_{j=1}^{N+M} g_j - \prod_{j=1}^{N} g_j\Big\| \le \Big\|\prod_{j=1}^{N} g_j\Big\|\,\Big\|\prod_{j=N+1}^{N+M} g_j - 1\Big\| \le \frac{1}{2^{4+3N}}. \tag{8.44}$$

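Denote by $C^0(\mathcal{U}, \mathcal{BC}^A)$ the Banach algebra of families $\varphi = \{\varphi_i\}_{i\in I}$ of continuous functions $\varphi_i : U_i \to A$ such that
$$\|\varphi\| := \sup_{i\in I}\|\varphi_i\|_{U_i} < \infty,$$
and let $C^0(\mathcal{U}, \mathcal{BC}^{GA})$ be the group of invertible elements of this Banach algebra, i.e., the group of all $\varphi \in C^0(\mathcal{U}, \mathcal{BC}^A)$ such that $\varphi_i(\zeta) \in GA$ for all $i \in I$, $\zeta \in U_i$ and
$$\sup_{i\in I}\|\varphi_i^{-1}\|_{U_i} < \infty.$$
Then we see from (8.44) and (8.41) that the infinite product $g := \prod_{j=1}^{\infty} g_j$ converges in $C^0(\mathcal{U}, \mathcal{BC}^{GA})$, where
$$\|g - 1\| \le 2\|f - 1\| \tag{8.45}$$
and, hence (as $\|f - 1\| \le 1/64$),
$$\|g - 1\| \le \frac{1}{32}. \tag{8.46}$$
Moreover, from (8.31) and (8.4) we see that
$$(g_1\cdots g_j) \diamond f = f_j, \qquad j \in \mathbb{N}^*.$$
By (8.36) this implies that
$$\big\|(g_1\cdots g_j) \diamond f - 1\big\| \le \frac{1}{2^{3j}}\|f - 1\|, \qquad j \in \mathbb{N}^*.$$
Hence
$$g \diamond f = \lim_{j\to\infty}(g_1\cdots g_j) \diamond f = 1.$$
Therefore it remains to prove that $g$ belongs to $C^0(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ and satisfies (8.8) ((8.7) is identical to (8.45)). As we already know that $\|g - 1\| < 1$, for this it is sufficient to prove that $g$ belongs to $C^0(\mathcal{U}, \mathcal{BC}^{A}_\partial)$ and satisfies (8.8).

From (8.33) and (8.36) we see that
$$\|\partial f_{j+1}\| \le \frac{C}{2^{6j}}\|f - 1\|^2 + \frac{16}{2^{3j}}\|\partial f_j\|\,\|f - 1\|, \qquad j \in \mathbb{N}.$$
Since $\|f - 1\| \le 1/64$, this implies that
$$\|\partial f_{j+1}\| \le \frac{C}{2^{6j+6}}\|f - 1\| + \frac{1}{2^{3j+2}}\|\partial f_j\|, \qquad j \in \mathbb{N}. \tag{8.47}$$
Setting first $j = 0$ and then $j = 1$, this gives (recall that $f_0 = f$)
$$\|\partial f_1\| \le \frac{C}{2^6}\|f - 1\| + \frac{1}{2^2}\|\partial f\| \tag{8.48}$$
and
$$\|\partial f_2\| \le \frac{C}{2^{12}}\|f - 1\| + \frac{1}{2^5}\|\partial f_1\| \le \frac{C}{2^{12}}\|f - 1\| + \frac{1}{2^5}\Big(\frac{C}{2^6}\|f - 1\| + \frac{1}{2^2}\|\partial f\|\Big) = C\Big(\frac{1}{2^{12}} + \frac{1}{2^{11}}\Big)\|f - 1\| + \frac{1}{2^7}\|\partial f\| \le \frac{C}{2^{10}}\|f - 1\| + \frac{1}{2^7}\|\partial f\|. \tag{8.49}$$
Next we prove by induction that, for all $j \in \mathbb{N}^*$,
$$\|\partial f_{j+1}\| \le \frac{C}{2^{j+9}}\|f - 1\| + \frac{1}{2^{j+6}}\|\partial f\|. \tag{8.50}$$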

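For $j = 1$ this holds true by (8.49). Now let $j \in \mathbb{N}^*$ be such that (8.50) is already proved. Then, by (8.47) and (8.50),
$$\|\partial f_{(j+1)+1}\| \le \frac{C}{2^{6(j+1)+6}}\|f - 1\| + \frac{1}{2^{3(j+1)+2}}\|\partial f_{j+1}\| \le \frac{C}{2^{6(j+1)+6}}\|f - 1\| + \frac{1}{2^{3(j+1)+2}}\Big(\frac{C}{2^{j+9}}\|f - 1\| + \frac{1}{2^{j+6}}\|\partial f\|\Big)$$
$$= C\Big(\frac{1}{2^{6j+12}} + \frac{1}{2^{4j+14}}\Big)\|f - 1\| + \frac{1}{2^{4j+11}}\|\partial f\| \le \frac{C}{2^{(j+1)+9}}\|f - 1\| + \frac{1}{2^{(j+1)+6}}\|\partial f\|.$$
So (8.50) is proved for all $j \in \mathbb{N}^*$.

As $f_0 = f$, we see from (8.35) that
$$\|\partial g_1\| \le \frac{C}{16}\|f - 1\| + \frac{33}{32}\|\partial f\|. \tag{8.51}$$
Moreover, from (8.35), (8.36), and (8.48) we see
$$\|\partial g_2\| \le \frac{C}{16}\|f_1 - 1\| + \frac{33}{32}\|\partial f_1\| \le \frac{C}{2^7}\|f - 1\| + \frac{33}{32}\Big(\frac{C}{2^6}\|f - 1\| + \frac{1}{2^2}\|\partial f\|\Big) \le \frac{C}{2^5}\|f - 1\| + \frac{33}{2^7}\|\partial f\|. \tag{8.52}$$
If $j \ge 3$, then we see from (8.35), (8.36), and (8.50) that
$$\|\partial g_j\| \le \frac{C}{16}\|f_{j-1} - 1\| + \frac{33}{2^5}\|\partial f_{j-1}\| \le \frac{C}{2^{3j+1}}\|f - 1\| + \frac{33}{2^5}\Big(\frac{C}{2^{j+7}}\|f - 1\| + \frac{1}{2^{j+4}}\|\partial f\|\Big)$$
$$= \frac{C}{2^{j+5}}\Big(\frac{1}{2^{2j-4}} + \frac{33}{2^7}\Big)\|f - 1\| + \frac{33}{2^{j+9}}\|\partial f\| \le \frac{C}{2^{j+5}}\|f - 1\| + \frac{33}{2^{j+9}}\|\partial f\|. \tag{8.53}$$
For $N \ge 3$, from the product rule we get the estimate
$$\Big\|\partial\prod_{j=1}^{N} g_j\Big\| \le \|\partial g_1\|\,\Big\|\prod_{j=2}^{N} g_j\Big\| + \|\partial g_N\|\,\Big\|\prod_{j=1}^{N-1} g_j\Big\| + \sum_{j=2}^{N-1}\|\partial g_j\|\,\Big\|\prod_{i=1}^{j-1} g_i\Big\|\cdot\Big\|\prod_{i=j+1}^{N} g_i\Big\|,$$
where, by (8.42),
$$\Big\|\prod_{j=2}^{N} g_j\Big\| \le 1 + \Big\|\prod_{j=2}^{N} g_j - 1\Big\| \le 1 + \frac{1}{2^8}, \qquad \Big\|\prod_{j=1}^{N-1} g_j\Big\| \le 1 + \Big\|\prod_{j=1}^{N-1} g_j - 1\Big\| \le 1 + \frac{1}{2^5},$$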

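and
$$\Big\|\prod_{i=1}^{j-1} g_i\Big\|\cdot\Big\|\prod_{i=j+1}^{N} g_i\Big\| \le \Big(1 + \frac{1}{2^5}\Big)\Big(1 + \frac{1}{2^{5+3j}}\Big) \le 1 + \frac{1}{2^6} \quad\text{if } j \ge 2.$$
Hence
$$\Big\|\partial\prod_{j=1}^{N} g_j\Big\| \le \Big(1 + \frac{1}{2^8}\Big)\|\partial g_1\| + \Big(1 + \frac{1}{2^5}\Big)\|\partial g_N\| + \Big(1 + \frac{1}{2^6}\Big)\sum_{j=2}^{N-1}\|\partial g_j\| \le \Big(1 + \frac{1}{2^8}\Big)\|\partial g_1\| + \Big(1 + \frac{1}{2^5}\Big)\|\partial g_2\| + \Big(1 + \frac{1}{2^5}\Big)\sum_{j=3}^{\infty}\|\partial g_j\| \quad\text{for } N \ge 3.$$
Together with (8.51), (8.52), and (8.53) this implies that
$$\Big\|\partial\prod_{j=1}^{N} g_j\Big\| \le \Big(1 + \frac{1}{2^8}\Big)\Big(\frac{C}{16}\|f - 1\| + \frac{33}{2^5}\|\partial f\|\Big) + \Big(1 + \frac{1}{2^5}\Big)\Big(\frac{C}{2^5}\|f - 1\| + \frac{33}{2^7}\|\partial f\|\Big) + \Big(1 + \frac{1}{2^5}\Big)\sum_{j=3}^{\infty}\Big(\frac{C}{2^{j+5}}\|f - 1\| + \frac{33}{2^{j+9}}\|\partial f\|\Big) \quad\text{for } N \ge 3,$$
and further, taking into account that
$$\Big(1 + \frac{1}{2^8}\Big)\frac{C}{16} + \Big(1 + \frac{1}{2^5}\Big)\frac{C}{2^5} + \Big(1 + \frac{1}{2^5}\Big)\sum_{j=3}^{\infty}\frac{C}{2^{j+5}} < \frac{C}{8} + \frac{C}{16} + \frac{C}{64} < \frac{C}{4}$$
and
$$\Big(1 + \frac{1}{2^8}\Big)\frac{33}{2^5} + \Big(1 + \frac{1}{2^5}\Big)\frac{33}{2^7} + \Big(1 + \frac{1}{2^5}\Big)\sum_{j=3}^{\infty}\frac{33}{2^{j+9}} = \frac{33}{2^5}\Big(1 + \frac{1}{2^2} + \frac{1}{2^6} + \frac{1}{2^7} + \frac{1}{2^8} + \frac{1}{2^{11}}\Big) < \frac{33}{2^5}\Big(1 + \frac{1}{2}\Big) = \frac{99}{64} < 2,$$
we obtain that
$$\Big\|\partial\prod_{j=1}^{N} g_j\Big\| \le \frac{C}{4}\|f - 1\| + 2\|\partial f\| \quad\text{for } N \ge 3. \tag{8.54}$$
In particular, as $\|f - 1\| \le 1/64$,
$$\Big\|\partial\prod_{j=1}^{N} g_j\Big\| \le \frac{C}{2^8} + 2\|\partial f\| \quad\text{for } N \ge 3. \tag{8.55}$$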

Moreover, from the product rule and (8.43) we see that
$$\Big\|\partial\prod_{j=N+1}^{N+M} g_j\Big\| \le \|\partial g_{N+1}\|\,\Big\|\prod_{j=N+2}^{N+M} g_j\Big\| + \|\partial g_{N+M}\|\,\Big\|\prod_{j=N+1}^{N+M-1} g_j\Big\| + \sum_{j=N+2}^{N+M-1}\|\partial g_j\|\,\Big\|\prod_{i=N+1}^{j-1} g_i\Big\|\cdot\Big\|\prod_{i=j+1}^{N+M} g_i\Big\| \le 4\sum_{j=N+1}^{\infty}\|\partial g_j\| \quad\text{if } N, M \in \mathbb{N} \text{ and } M \ge 3.$$
By (8.53) and taking into account that $\|f - 1\| \le 2^{-6}$, this implies that
$$\Big\|\partial\prod_{j=N+1}^{N+M} g_j\Big\| \le \sum_{j=N+1}^{\infty}\Big(\frac{C\|f - 1\|}{2^{j+3}} + \frac{33\|\partial f\|}{2^{j+7}}\Big) \le \sum_{j=N+1}^{\infty}\Big(\frac{C}{2^{j+9}} + \frac{\|\partial f\|}{2^{j+1}}\Big) = \frac{C}{2^{N+9}} + \frac{\|\partial f\|}{2^{N+1}} < \frac{C + \|\partial f\|}{2^{N+1}}, \qquad N, M \in \mathbb{N},\ M \ge 3. \tag{8.56}$$
Again by the product rule,
$$\Big\|\partial\prod_{j=1}^{N+M} g_j - \partial\prod_{j=1}^{N} g_j\Big\| = \Big\|\partial\Big(\prod_{j=1}^{N} g_j\Big(\prod_{j=N+1}^{N+M} g_j - 1\Big)\Big)\Big\| \le \Big\|\partial\prod_{j=1}^{N} g_j\Big\|\cdot\Big\|\prod_{j=N+1}^{N+M} g_j - 1\Big\| + \Big\|\prod_{j=1}^{N} g_j\Big\|\cdot\Big\|\partial\prod_{j=N+1}^{N+M} g_j\Big\|$$
for all $N \in \mathbb{N}$ and $M \in \mathbb{N}^*$. If $N, M \ge 3$, then, by (8.55), (8.42), (8.43), and (8.56), this implies that
$$\Big\|\partial\prod_{j=1}^{N+M} g_j - \partial\prod_{j=1}^{N} g_j\Big\| \le \Big(\frac{C}{2^8} + 2\|\partial f\|\Big)\frac{1}{2^{5+3N}} + \frac{C + \|\partial f\|}{2^{N}} \le \frac{C + \|\partial f\|}{2^{N-1}}.$$
Together with (8.44) this implies that the infinite product $g = \prod_{j=1}^{\infty} g_j$ converges even in the Banach space $C^0(\mathcal{U}, \mathcal{BC}^A_\partial)$ endowed with the norm (7.6), where, by (8.54),
$$\|\partial g\| \le \frac{C}{4}\|f - 1\| + 2\|\partial f\|.$$
By definition (8.30) of $C$, this completes the proof of Theorem 8.3.
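The bookkeeping in this proof can be spot-checked numerically. The sketch below iterates the recursions (8.32) and (8.33) with equality (a worst case) and compares the result with the claimed bound (8.50); it also checks the geometric sum behind (8.39) and (8.40). The sample values `eps = 1` and `d0 = 5` are assumptions chosen only for illustration.

```python
from fractions import Fraction as F
import math


def tail(N):
    """Exact value of sum_{j >= N+1} 2**-(3j-3) (geometric, ratio 1/8)."""
    return F(1, 2 ** (3 * N)) / (1 - F(1, 8))


def worst_case(C, e0, d0, steps):
    """Iterate (8.32) and (8.33) with equality.

    e[j] plays the role of ||f_j - 1|| and d[j] of ||d f_j||.
    """
    e, d = [F(e0)], [F(d0)]
    for _ in range(steps):
        d.append(C * e[-1] ** 2 + 16 * d[-1] * e[-1])
        e.append(e[-1] / 8)
    return e, d


C = F(2 ** 22)                      # C = 2^22 / eps with the sample eps = 1
e, d = worst_case(C, F(1, 64), F(5), 12)


def bound_850(j):
    """Right-hand side of (8.50) for the sample data."""
    return C / 2 ** (j + 9) * e[0] + d[0] / 2 ** (j + 6)


growth = math.exp(1 / 48)           # used in e^x - 1 <= e^(1/48) x <= (3/2) x
print(all(d[j + 1] <= bound_850(j) for j in range(1, 12)), growth <= 1.5)
```

The comparison starts at `j = 1` because (8.50) is stated for $j \in \mathbb{N}^*$; the case of $\|\partial f_1\|$ is covered by (8.48) instead.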

9. Proof of Theorem 4.2

The cocycle $f$ belongs, in particular, to $Z^1(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ (Section 8.2). Therefore, in view of the second estimate in (4.6), from Theorem 8.3 we get $g \in C^0(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$ such that
$$g \diamond f = 1, \tag{9.1}$$
$$\|g - 1\| \le 2\|f - 1\|, \tag{9.2}$$
and
$$\|\partial g\| \le \frac{2^{20}}{\varepsilon}\|f - 1\|. \tag{9.3}$$
By (9.1)
$$f_{ij} = g_i g_j^{-1} \quad\text{on } U_i \cap U_j, \qquad i, j \in I. \tag{9.4}$$
As $f_{ij}$ is holomorphic in $U_i \cap U_j \cap \operatorname{int} X$, this implies that
$$0 = (\partial g_i)g_j^{-1} - g_i g_j^{-1}(\partial g_j)g_j^{-1} \quad\text{on } U_i \cap U_j, \qquad i, j \in I,$$
i.e.,
$$g_i^{-1}(\partial g_i) = g_j^{-1}(\partial g_j) \quad\text{on } U_i \cap U_j, \qquad i, j \in I.$$
Therefore, we have a well-defined continuous function $V : X \to A$ such that
$$V = g_i^{-1}\partial g_i \quad\text{on } U_i, \qquad i \in I. \tag{9.5}$$
Since $\|f - 1\| \le 1/64$, we see from (9.2) that $\|g - 1\| \le 1/32$ and therefore
$$\|g^{-1}\| \le 2. \tag{9.6}$$
Together with (9.5) and (9.3), this implies that
$$\|V\|_X \le \frac{2^{21}}{\varepsilon}\|f - 1\| \tag{9.7}$$
and further, by the first estimate in (4.6),
$$\|V\|_X \le \frac{1}{8C_X}. \tag{9.8}$$
Therefore, we can apply Theorem 5.1 and obtain a continuous function $U : X \to GA$ such that $\partial U$ is also continuous on $X$,
$$U^{-1}\partial U = V \quad\text{on } X, \tag{9.9}$$
$$\|U - 1\|_X \le 2C_X\|V\|_X. \tag{9.10}$$
From (9.10) and (9.8) it follows that
$$\|U - 1\|_X \le \frac{1}{4}. \tag{9.11}$$
Hence, $U$ belongs to the group $\mathcal{BC}^{GA}_\partial(X)$. Therefore, setting
$$h_i = g_i U^{-1} \quad\text{on } U_i, \qquad i \in I,$$
we obtain an element $h := \{h_i\}_{i\in I}$ in $C^0(\mathcal{U}, \mathcal{BC}^{GA}_\partial)$. By Proposition 8.1, from (9.9) and (9.5) we see that
$$\partial h_i = (\partial g_i)U^{-1} - g_i U^{-1}(\partial U)U^{-1} = (\partial g_i)U^{-1} - g_i V U^{-1} = (\partial g_i)U^{-1} - g_i g_i^{-1}(\partial g_i)U^{-1} = 0$$
on $U_i \cap \operatorname{int} X$, i.e., $h$ even belongs to $C^0(\mathcal{U}, \mathcal{BO}^{GA})$. From (9.4) we see that
$$h_i h_j^{-1} = g_i U^{-1} U g_j^{-1} = g_i g_j^{-1} = f_{ij}$$
on $U_i \cap U_j$, i.e., we have (4.7).

It remains to prove (4.8). From (9.11) it follows that $U^{-1} = \sum_{j=0}^{\infty}(1 - U)^j$ and
$$\|U^{-1} - 1\|_X \le \sum_{j=1}^{\infty}\|1 - U\|_X^j \le \|1 - U\|_X\sum_{j=1}^{\infty}\frac{1}{4^{j-1}} = \frac{4}{3}\|1 - U\|_X.$$
By (9.10) and (9.7), this further implies that
$$\|U^{-1} - 1\|_X \le \frac{8}{3}C_X\|V\|_X \le \frac{2^{24}}{3\varepsilon}C_X\|f - 1\|. \tag{9.12}$$
Moreover, by definition of $h_i$,
$$h_i - 1 = g_i U^{-1} - 1 = (g_i - 1)(U^{-1} - 1) + (g_i - 1) + (U^{-1} - 1).$$
Therefore
$$\|h - 1\| \le \|g - 1\|\,\|U^{-1} - 1\|_X + \|g - 1\| + \|U^{-1} - 1\|_X.$$
Together with (9.2) and (9.12) this implies that
$$\|h - 1\| \le \frac{2^{25}}{3\varepsilon}C_X\|f - 1\|^2 + 2\|f - 1\| + \frac{2^{24}}{3\varepsilon}C_X\|f - 1\|,$$
and further, as $\|f - 1\| \le 1/64$,
$$\|h - 1\| \le \Big(\frac{2^{25}}{3\cdot 64\,\varepsilon}C_X + 2 + \frac{2^{24}}{3\varepsilon}C_X\Big)\|f - 1\| \le \Big(2 + \frac{2^{23}C_X}{\varepsilon}\Big)\|f - 1\|.$$
This completes the proof of Theorem 4.2.

References

[CG] Cornalba, M., Griffiths, Ph., Analytic cycles and vector bundles on non-compact algebraic varieties, Inventiones math. 28 (1975), 1–106.
[BR] Berndtsson, B., Rosay, J.-P., Quasi-isometric vector bundles and bounded factorization of holomorphic matrices, Ann. Inst. Fourier, Grenoble 51, 3 (2001), 885–901.
[B] Bungart, L., On analytic fiber bundles. I. Holomorphic fiber bundles with infinite-dimensional fibers, Topology 7 (1967), 55–68.
[GL] Gohberg, I., Leiterer, J., Holomorphic operator functions of one variable and applications, Birkhäuser 2009.
[GR1] Gohberg, I., Rodman, L., Analytic operator-valued functions with prescribed local data, J. Analyse Math. 40 (1981), 90–128.
[GR2] Gohberg, I., Rodman, L., Analytic matrix functions with prescribed local data, Acta Sci. Math. (Szeged) 45 (1983), no. 1-4, 189–199.
[G] Grauert, H., Analytische Faserungen über holomorph-vollständigen Räumen, Math. Ann. 135 (1958), 263–273.
[HL] Henkin, G., Leiterer, J., Theory of functions on complex manifolds, Birkhäuser 1984.
[L] Leiterer, J., An estimate for the splitting of holomorphic cocycles. Several variables, Proceedings of the Workshop on Geometric Analysis of Several Complex Variables and Related Topics. Edited by: S. Berhanu, A. Meziani, N. Mir, R. Meziani, Y. Barkatou. Contemporary Mathematics, Volume 550, 2011.
[P] Pompeiju, D., Sur les singularités des fonctions analytiques uniformes, C. R. Acad. Sci. Paris 139 (1904), 914–915.

Jürgen Leiterer
Institut für Mathematik
Humboldt-Universität zu Berlin
Rudower Chaussee 25
D-12489 Berlin, Germany
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 495–512
© 2012 Springer Basel AG

The Discrete Algebraic Riccati Equation and Hermitian Block Toeplitz Matrices

Leonid Lerer and André C.M. Ran

Dedicated to the memory of our dear friend and mentor Israel Gohberg

Abstract. This paper discusses the representation of the full set of solutions of the discrete algebraic Riccati equation in terms of two solutions, the difference of which is invertible. It turns out that if two such solutions exist, then all solutions can be described in terms of one of these solutions and the solution of a Stein equation. A complete parametrization is available in that case. Under some additional hypotheses we show that there are two solutions which differ by an invertible matrix. In the final sections we discuss a special case connected to invertible Hermitian block Toeplitz matrices.

Mathematics Subject Classification (2000). Primary 15A24, 47B35; Secondary 47A56.

Keywords. Riccati equations, Hermitian block Toeplitz matrices.

1. Introduction

In this paper we discuss the discrete algebraic Riccati equation
$$X = AXA^* + Q - AXB(R + B^*XB)^{-1}B^*XA^*. \tag{1.1}$$
We do not make any assumption on definiteness of $R$ and $Q$, nor on definiteness of the associated Popov function, but we shall assume that $R$ is an invertible Hermitian matrix, $Q$ is Hermitian, and in addition, that $A$ is invertible.

A more general version of the discrete algebraic Riccati equation plays a role in several problems in the theory of discrete time systems. It appears in the theory of linear quadratic optimal control [23], where one is interested in one particular solution in case of fixed end-point control, or in a whole class of solutions in case of linear or free end-point control [21]. The Riccati equation also appears in realization theory for stochastic linear systems, in $H^\infty$ optimal control, in factorization problems and in several other problems, [16, 13, 8, 12].

The structure of the set of all Hermitian solutions has been a subject of study since the late nineteen-seventies. There are parametrizations of the set of all solutions in terms of invariant subspaces [16, 22], in terms of symmetric factorizations of certain rational matrix functions or matrix polynomials, see, e.g., [1, 13, 14, 19], and in terms of two solutions, e.g., extremal solutions [23] or unmixed solutions [4, 5, 7, 25, 26, 27]. For the unsymmetric continuous time analogue a parametrization of the set of all solutions in terms of two solutions is given in [6].

The paper discusses two main themes. The first topic is concerned with the general theory, much in the spirit of the parametrization of all solutions in terms of two particular solutions. In particular, we show that if $Q = 0$, and $A$ and $(A^*)^{-1}$ have no common eigenvalues, then there is a one-to-one correspondence between all solutions and a class of $A^*$-invariant subspaces.

The second topic concerns the case where the matrices $A$ and $B$ have a specific companion-like structure. It is clear that under the assumptions that $Q = 0$ and $A$ is invertible, the study of equation (1.1) is closely related to the well-known Stein equation. In [11] a profound connection has been found between Stein equations, block Toeplitz matrices, and inverse and direct problems for related matrix orthogonal polynomials (in particular the relation (4.6) below was discovered there). See also [15], where the connection between orthogonal polynomials and Toeplitz matrices originates. In our view it is of great interest to explore the connections between the results of [11] and the study of Riccati equations (1.1). The specific structure of $A$ and $B$ in companion form allows us to describe the set of all solutions of (1.1) in terms of two matrices: one solution, and the inverse of a certain block Toeplitz Hermitian matrix.

We start the paper with some preliminary results on the discrete algebraic Riccati equation. In particular, in the next section we show how invertible solutions for the case $Q = 0$ may be found by solving a Stein equation. In the third section we describe all solutions in terms of two solutions whose difference is invertible. This part of the paper is close in spirit to results describing all solutions in terms of two unmixed solutions, see, e.g., [25, 27], compare also [6]. In the fourth section we introduce the special case we wish to consider, describe the connection with block Toeplitz Hermitian matrices, and describe all solutions for this specific case. Because of the specific nature of the coefficients, we can derive the description of all solutions under conditions that are weaker than those of the third section. To be more precise, the coefficient $Q$ is zero in this case. We show that under specific conditions on $A$ and $B$ there is a unique invertible solution, which is the inverse of a Hermitian block Toeplitz matrix. All other solutions are given as a product of the invertible solution and certain projections. Surprisingly enough, all solutions are completely determined by their first block column. In the fifth section we present some examples of the special case. In the final section we discuss the relation of the special case to polynomial factorization, which is closely related to the factorization results for the Popov function.

The research of the first author is supported by the Israel Science Foundation, ISF (Grant number 121/09).

Discrete Riccati Equations and Block Toeplitz Matrices


2. Preliminaries on the discrete algebraic Riccati equation

In this section we discuss some properties of the algebraic Riccati equation. Let $Q$ and $R$ be $n \times n$ and $m \times m$ Hermitian matrices, respectively, and assume that $R$ is invertible. Let $A$ be an $n \times n$ matrix. We assume throughout that $A$ is invertible as well. Let $B$ be an $n \times m$ matrix. Consider the discrete algebraic Riccati equation
\[ X = AXA^* + Q - AXB(R + B^*XB)^{-1}B^*XA^*. \tag{2.1} \]

Our first observation is the following: let $X$ be a solution of the equation (2.1). Let $S$ be invertible, and denote $A_1 = SAS^{-1}$, $B_1 = (S^*)^{-1}B$, $Q_1 = SQS^*$. Then $X_1 = SXS^*$ satisfies the equation
\[ X_1 = A_1X_1A_1^* + Q_1 - A_1X_1B_1(R + B_1^*X_1B_1)^{-1}B_1^*X_1A_1^* \]
and conversely, every solution of this equation gives rise to a solution of the original equation in this way.

Our second observation is the following (compare [21, Lemma 3.1]).

Proposition 2.1. Let $X_0$ be a solution to (2.1), and let $X$ be any other solution. Denote $A_0 = A - AX_0B(R + B^*X_0B)^{-1}B^*$ and $R_0 = R + B^*X_0B$. The latter matrix is invertible by assumption, since $X_0$ is a solution of (2.1). Then $Y = X - X_0$ satisfies the equation
\[ Y = A_0YA_0^* - A_0YB(R_0 + B^*YB)^{-1}B^*YA_0^*. \]

We shall be interested in a very specific pair of solutions, namely two solutions $X$ and $X_0$ such that their difference $X - X_0$ is invertible. For this reason we consider invertible solutions of the special homogeneous algebraic Riccati equation when $Q = 0$:
\[ X = AXA^* - AXB(R + B^*XB)^{-1}B^*XA^*. \tag{2.2} \]
We assume that $A$ is invertible. Observe that 0 is obviously a solution, but we are especially interested in invertible solutions of (2.2). The following is a well-known result, see, e.g., [25, 26].

Proposition 2.2. Assume that $A$ is invertible, that $R$ is Hermitian and invertible, and let $X$ be an invertible solution of (2.2). Then $Y = X^{-1}$ solves the Stein equation
\[ Y - A^*YA = -BR^{-1}B^* \tag{2.3} \]
and conversely, if $Y$ is an invertible solution of (2.3), then $X = Y^{-1}$ solves the discrete algebraic Riccati equation (2.2).

Proof. The proof is a straightforward computation using the invertibility of $A$ and $Y$. □

As a result of the above considerations we arrive at the conclusion: assume that $A$ is invertible and we are given one solution $X_0$ of (2.1), then in order to study solutions for which $X - X_0$ is invertible, it is enough to study the invertible solutions of the Stein equation (2.3).
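The computation behind Proposition 2.2 can be sketched as follows. This is one possible route, via the Sherman-Morrison-Woodbury identity; the choice of that identity is an assumption of this sketch and not necessarily the argument the authors had in mind. It uses only that $X$, $R$, $A$ and $R + B^*XB$ are invertible, and writes $Y = X^{-1}$.

```latex
\begin{align*}
X - XB(R + B^*XB)^{-1}B^*X
  &= \bigl(X^{-1} + BR^{-1}B^*\bigr)^{-1}
   = \bigl(Y + BR^{-1}B^*\bigr)^{-1}
  && \text{(Woodbury identity)} \\
\text{so (2.2) reads}\quad
Y^{-1} &= A\,\bigl(Y + BR^{-1}B^*\bigr)^{-1}A^* . \\
\text{Inverting both sides:}\quad
Y &= (A^*)^{-1}\bigl(Y + BR^{-1}B^*\bigr)A^{-1}
  \;\Longleftrightarrow\;
  A^*YA = Y + BR^{-1}B^*
  \;\Longleftrightarrow\;
  Y - A^*YA = -BR^{-1}B^* .
\end{align*}
```

Every step is reversible, which gives the converse direction as well: starting from an invertible solution $Y$ of (2.3), the matrix $Y + BR^{-1}B^* = A^*YA$ is invertible, so the Woodbury identity applies to $X = Y^{-1}$.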


L. Lerer and A.C.M. Ran

3. Non-invertible solutions

In this section we consider other solutions $X$ of the algebraic Riccati equation
\[ X = AXA^* - AXB(R + B^*XB)^{-1}B^*XA^*. \tag{3.1} \]

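We assume throughout that $X_+$ is an invertible solution, and recall our standing assumptions that $R$ is Hermitian and invertible, and that $A$ is invertible as well. We first prove the following result, which is well known in the case of positive definite $R$ (compare, e.g., [22, 23, 25]).

Theorem 3.1. Let $A$ be invertible, let $R$ be Hermitian and invertible. Let $X_+$ be an invertible solution of (3.1). Let $N$ be an $A^*$-invariant subspace which is $X_+$-nondegenerate, that is, $\mathbb{C}^r = (X_+N)^\perp \dotplus N$. Let $P_N$ be the projection onto $(X_+N)^\perp$ along $N$, and put $X_N = X_+P_N$. Then $X_N$ is a Hermitian solution of (3.1) for which $\operatorname{Ker} X_N = N$.

Proof. With respect to the orthogonal decomposition $\mathbb{C}^r = N \oplus N^\perp$ we write
\[ A = \begin{pmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{pmatrix}, \quad B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}, \quad X_+ = \begin{pmatrix} X_{11} & X_{12} \\ X_{12}^* & X_{22} \end{pmatrix}, \]
where the (1,2)-entry of $A$ is zero because $N$ is invariant under $A^*$. Also $X_{11}$ is invertible because of the assumption that $N$ is $X_+$-nondegenerate, and so $X_+N \cap N^\perp = (0)$. Then
\[ (X_+N)^\perp = \left( \operatorname{Im} \begin{pmatrix} X_{11} \\ X_{12}^* \end{pmatrix} \right)^\perp = \left( \operatorname{Im} \begin{pmatrix} I \\ X_{12}^*X_{11}^{-1} \end{pmatrix} \right)^\perp = \operatorname{Ker} \begin{pmatrix} I & X_{11}^{-1}X_{12} \end{pmatrix} = \operatorname{Im} \begin{pmatrix} -X_{11}^{-1}X_{12} \\ I \end{pmatrix}. \]
One then easily sees that $P_N$, being the projection onto $(X_+N)^\perp$ along $N$, is given by
\[ P_N = \begin{pmatrix} 0 & -X_{11}^{-1}X_{12} \\ 0 & I \end{pmatrix}, \]
and hence $X_N = X_+P_N$ is given by
\[ X_N = \begin{pmatrix} X_{11} & X_{12} \\ X_{12}^* & X_{22} \end{pmatrix} \begin{pmatrix} 0 & -X_{11}^{-1}X_{12} \\ 0 & I \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & X_{22} - X_{12}^*X_{11}^{-1}X_{12} \end{pmatrix}. \tag{3.2} \]
It follows that $X_N$ is Hermitian. Denote $Z_{22} = X_{22} - X_{12}^*X_{11}^{-1}X_{12}$. Now $Z_{22}$ is the Schur complement of $X_{11}$ in the invertible matrix $X_+$. That is, we have
\[ X_+ = \begin{pmatrix} I & 0 \\ X_{12}^*X_{11}^{-1} & I \end{pmatrix} \begin{pmatrix} X_{11} & 0 \\ 0 & Z_{22} \end{pmatrix} \begin{pmatrix} I & X_{11}^{-1}X_{12} \\ 0 & I \end{pmatrix}. \tag{3.3} \]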

It then follows from the invertibility of $X_+$ that also $Z_{22}$ is invertible. Hence $\operatorname{Ker} X_N = N$. It remains to prove that $X_N$ solves (3.1). For this we use the fact that $Y = X_+^{-1}$ satisfies
\[ Y - A^*YA = -BR^{-1}B^*. \tag{3.4} \]
By (3.3) we can write
\[ Y = X_+^{-1} = \begin{pmatrix} I & -X_{11}^{-1}X_{12} \\ 0 & I \end{pmatrix} \begin{pmatrix} X_{11}^{-1} & 0 \\ 0 & Z_{22}^{-1} \end{pmatrix} \begin{pmatrix} I & 0 \\ -X_{12}^*X_{11}^{-1} & I \end{pmatrix}. \tag{3.5} \]
In particular, concentrating on the (2,2)-block in equation (3.4), we see that
\[ Z_{22}^{-1} - A_{22}^*Z_{22}^{-1}A_{22} = -B_2R^{-1}B_2^*. \tag{3.6} \]
Hence, by Proposition 2.2, $Z_{22}$ satisfies the Riccati equation
\[ Z_{22} = A_{22}Z_{22}A_{22}^* - A_{22}Z_{22}B_2(R + B_2^*Z_{22}B_2)^{-1}B_2^*Z_{22}A_{22}^*. \tag{3.7} \]
Now consider our claim that
\[ X_N = AX_NA^* - AX_NB(R + B^*X_NB)^{-1}B^*X_NA^*. \tag{3.8} \]
To show that this is indeed the case, one uses the facts that
\[ AX_NA^* = \begin{pmatrix} 0 & 0 \\ 0 & A_{22}Z_{22}A_{22}^* \end{pmatrix}, \quad AX_NB = \begin{pmatrix} 0 \\ A_{22}Z_{22}B_2 \end{pmatrix}, \]
and
\[ R + B^*X_NB = R + B_2^*Z_{22}B_2. \]
Then (3.8) is seen to be equivalent to (3.7).

□
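The construction in Theorem 3.1 can be tried numerically. The following is a minimal sketch, not the authors' procedure: it builds the oblique projection $P_N$ from a basis of $(X_+N)^\perp$ and a basis of $N$, and then forms $X_N = X_+P_N$. The helper names (`orth_complement`, `solution_with_kernel`, `riccati_residual`) and the $2 \times 2$ data are illustrative assumptions; the data were chosen so that $X_+$ solves (3.1) with $R = 1$ and $N$ is spanned by an eigenvector of $A^*$.

```python
import numpy as np

def orth_complement(M, tol=1e-12):
    """Orthonormal basis (as columns) of the orthogonal complement of Im M."""
    U, s, _ = np.linalg.svd(M, full_matrices=True)
    rank = int(np.sum(s > tol * s.max()))
    return U[:, rank:]

def solution_with_kernel(X_plus, W):
    """X_N = X_+ P_N, with P_N the projection onto (X_+ N)^perp along N = Im W."""
    M = orth_complement(X_plus @ W)      # basis of (X_+ N)^perp
    T = np.hstack([M, W])                # invertible iff N is X_+-nondegenerate
    # P_N maps the columns of M to themselves and the columns of W to zero.
    P_N = np.hstack([M, np.zeros_like(W)]) @ np.linalg.inv(T)
    return X_plus @ P_N

def riccati_residual(A, B, R, X):
    """Left-hand side minus right-hand side of (3.1)."""
    G = R + B.conj().T @ X @ B
    return X - A @ X @ A.conj().T + A @ X @ B @ np.linalg.solve(G, B.conj().T @ X @ A.conj().T)

# Illustrative data (an assumption of this sketch, not taken from the paper).
A = np.array([[5.0, 1.0], [-6.0, 0.0]])
B = np.array([[1.0], [0.0]])
R = np.array([[1.0]])
X_plus = np.array([[35.0, -25.0], [-25.0, 35.0]])   # invertible solution of (3.1)
W = np.array([[2.0], [1.0]])                         # A^T W = 2 W, so N = Im W is A^*-invariant

X_N = solution_with_kernel(X_plus, W)
print(X_N)
print(np.abs(riccati_residual(A, B, R, X_N)).max())
print(np.abs(X_N @ W).max())
```

For this data the result is the Hermitian matrix with rows $(8, -16)$ and $(-16, 32)$, whose kernel is $N$. Forming `inv(T)` explicitly is acceptable for a small illustration; for larger or ill-conditioned problems a linear solve would be the safer design choice.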

To prove a result in the converse direction we need to make an assumption on $A$, but first we present a result in the converse direction that holds under the assumptions we have used so far.

Proposition 3.2. Let $A$ be invertible and let $R$ be Hermitian and invertible. Let $X$ be a solution of (3.1). Then $N = \operatorname{Ker} X$ is invariant under $A^*$.

Proof. This is an easy consequence of the fact that $A$ is invertible: if $X$ solves (3.1) then
\[ X(A^*)^{-1} = \bigl( A - AXB(R + B^*XB)^{-1}B^* \bigr) X. \]
So, if $x \in \operatorname{Ker} X$ then $X(A^*)^{-1}x = 0$. It follows that $\operatorname{Ker} X$ is $(A^*)^{-1}$-invariant, and hence also is $A^*$-invariant. □

We would like to be able to conclude that for any solution $X$ its kernel is $X_+$-nondegenerate. This holds true under an extra assumption on $A$, as can be seen in the following theorem.

Theorem 3.3. Let $A$ be invertible, let $R$ be Hermitian and invertible. Assume in addition that $A$ and $(A^*)^{-1}$ have no common eigenvalues. Then, if $X$ solves (3.1), $\operatorname{Ker} X$ is $X_+$-nondegenerate. There is a one-to-one correspondence between solutions of (3.1) and subspaces $N$ that are $A^*$-invariant and $X_+$-nondegenerate, which is given by $X = X_+P_N$, where $P_N$ is the projection onto $(X_+N)^\perp$ along $N$. In particular, there can be only one invertible solution of (3.1).


Proof. Let $X$ be a solution of (3.1), let $N$ be its kernel, and let us write $\mathbb{C}^r = N \oplus N^\perp$. With respect to this decomposition we write
\[ X_+ = \begin{pmatrix} X_{11} & X_{12} \\ X_{12}^* & X_{22} \end{pmatrix}, \quad Y = X_+^{-1} = \begin{pmatrix} Y_{11} & Y_{12} \\ Y_{12}^* & Y_{22} \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 0 \\ 0 & Z_{22} \end{pmatrix}. \]
In order to prove that $N$ is $X_+$-nondegenerate we need to prove that $X_{11}$ is invertible, or equivalently (see [2]), that $Y_{22}$ is invertible. Note that $Z_{22}$ is invertible. Using the fact that $N$ is $A^*$-invariant (see the previous proposition), we consider the (2,2)-block in the equation $X_+^{-1} - A^*X_+^{-1}A = -BR^{-1}B^*$, to see that $Y_{22}$ satisfies the equation
\[ Y_{22} - A_{22}^*Y_{22}A_{22} = -B_2R^{-1}B_2^*. \tag{3.9} \]
However, as we have seen in the proof of Theorem 3.1, also $Z_{22}^{-1}$ satisfies this equation. If we could show that (3.9) has a unique solution we would be done, since in that case $Z_{22}^{-1} = Y_{22}$ is invertible. Now, since $A$ and $(A^*)^{-1}$ have no common eigenvalues, neither have $A_{22}^*$ and $A_{22}^{-1}$. This shows that indeed, (3.9) has a unique solution (see Chapter 13 in [17]), and so we have shown that $N$ is $X_+$-nondegenerate.

Combining the results in Theorem 3.1 with what we have proved just now, we see that there is indeed a one-to-one correspondence between the set of solutions of (3.1) and the set of subspaces $N$ that are $A^*$-invariant and $X_+$-nondegenerate. □

Observe that the uniqueness of the solution of the Stein equation is in fact equivalent to $A^*$ and $A^{-1}$ having no common eigenvalues.

We finally arrive at the main result, which describes all noninvertible solutions of (3.1) in terms of the solution of the Stein equation (2.3). For an analogue for the non-symmetric continuous time algebraic Riccati equation compare [6], Theorem 3.1. Note however, that the parametrization given here is quite different in nature.

Theorem 3.4. Let $A$ be invertible, let $R$ be Hermitian and invertible. Assume that $A$ and $(A^*)^{-1}$ have no common eigenvalues, and let $Y$ be the unique solution of (2.3). Then there is a one-to-one correspondence between solutions of (3.1) and subspaces $N$ that are $A^*$-invariant and $Y$-nondegenerate, which is given by $X = Y^{-1}P_N$, where $P_N$ is the projection onto $YN^\perp$ along $N$.

Proof. The theorem is a direct consequence of the previous one, once we realize that a subspace $N$ is $X_+$-nondegenerate if and only if it is $Y$-nondegenerate, and that $(X_+N)^\perp = YN^\perp$. □

Note that if we assume in addition that $R$ is positive definite and that the pair $(A, B)$ is controllable, then we get additional information. Indeed, this condition reduces to the controllability of the pair $(A_{22}, B_2)$, and then if (3.9) has a solution the inertia of $X_+$ is the same as the inertia of $A$, as is well known. (This is the discrete-time version of the Chen-Wimmer theorem [3, 24]; the precise formulation may be found in Chapter 13, Theorem 13.2.4 in [17].) This proves the following theorem.


Theorem 3.5. Let $A$ be invertible, let $R$ be positive definite, and let the pair $(A, B)$ be controllable. Assume that $A$ has all its eigenvalues inside the open unit disc, or has all its eigenvalues outside the closed unit disc. Then there is a one-to-one correspondence between all solutions of (3.1) and the set of $A^*$-invariant subspaces. This correspondence is given as follows: let $P_N$ be the projection onto $(X_+N)^\perp$ along $N$, then the solution corresponding to $N$ is given by $X_N = X_+P_N$.

Proof. In case $A$ has all its eigenvalues inside the unit circle, $X_+$ is positive definite; in case $A$ has all its eigenvalues outside the unit circle, $X_+$ is negative definite. In these cases any $A^*$-invariant subspace is automatically $X_+$-nondegenerate. □

Finally, we comment on the connection between the continuous algebraic Riccati equation and the discrete algebraic Riccati equation (1.1). It is possible, under the assumptions in place in this section, to reduce the discrete algebraic Riccati equation to a continuous one, in the sense that the set of solutions is the same. This is done by a Cayley transform type argument, see, e.g., the proof of Theorem 12.2.3 in [16], where this is outlined in detail. However, the coefficients of the resulting continuous algebraic Riccati equation are complicated expressions in the original coefficients. Moreover, our main new result, Theorem 3.4, in which the one-to-one correspondence between solutions of (3.1) and $A^*$-invariant $Y$-nondegenerate subspaces is new, cannot be recovered in this way from existing results.
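The unique solvability condition used in Theorems 3.3 and 3.4 can be made concrete by vectorizing the Stein equation (2.3). With column-stacking, $\operatorname{vec}(A^*YA) = (A^T \otimes A^*)\operatorname{vec}(Y)$, and the eigenvalues of $A^T \otimes A^*$ are the products $\lambda_i\bar\lambda_j$ of eigenvalues of $A$; so the linear system is nonsingular exactly when no product equals 1, that is, when $A$ and $(A^*)^{-1}$ share no eigenvalue. The sketch below is a dense, small-scale illustration under that reading; the function names and test matrices are assumptions of the sketch, and it is not meant as an efficient solver.

```python
import numpy as np

def stein_is_uniquely_solvable(A, tol=1e-10):
    """True when lambda_i * conj(lambda_j) != 1 for all eigenvalues of A."""
    lam = np.linalg.eigvals(A)
    products = np.outer(lam, lam.conj())
    return bool(np.min(np.abs(1.0 - products)) > tol)

def solve_stein(A, B, R):
    """Solve Y - A^* Y A = -B R^{-1} B^* by Kronecker vectorization (column-major vec)."""
    n = A.shape[0]
    rhs = -B @ np.linalg.solve(R, B.conj().T)
    M = np.eye(n * n) - np.kron(A.T, A.conj().T)
    y = np.linalg.solve(M, rhs.reshape(-1, order="F"))
    return y.reshape((n, n), order="F")

# Illustrative data (an assumption of this sketch): eigenvalues of A are 2 and 3.
A = np.array([[5.0, 1.0], [-6.0, 0.0]])
B = np.array([[1.0], [0.0]])
R = np.array([[1.0]])

unique = stein_is_uniquely_solvable(A)
Y = solve_stein(A, B, R)
X_plus = np.linalg.inv(Y)          # invertible solution of (3.1), by Proposition 2.2

# A matrix with eigenvalues 2 and 1/2 violates the condition (2 * 1/2 = 1).
A_bad = np.diag([2.0, 0.5])
not_unique = not stein_is_uniquely_solvable(A_bad)
print(unique, not_unique)
print(Y * 120)
print(X_plus)
```

For this data `Y * 120` has rows $(7, 5)$ and $(5, 7)$, and `X_plus` has rows $(35, -25)$ and $(-25, 35)$. The Kronecker system has size $n^2 \times n^2$, so the approach is only sensible for small $n$.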

4. Riccati equations with coefficients of companion type

In this section we discuss a special case of the algebraic Riccati equation. A pair of matrices $(A, B)$ is called a monic pair if $B$ is of size $nr \times r$ for some $r$ and $n$, $B$ is of full column rank $r$, while $A$ is of size $nr \times nr$, and
\[ \operatorname{rank} \begin{pmatrix} B & A^*B & \cdots & (A^*)^{n-1}B \end{pmatrix} = nr. \]
Note that if $(A, B)$ is a monic pair, then so is $(A + FB^*, B)$ for every $F$ of size $nr \times r$. When $(A, B)$ is a monic pair, it follows (see, e.g., [9]) that there exist an invertible matrix $S$ and $r \times r$ matrices $K_1, \ldots, K_n$ such that
\[ S^{-1}A^*S = \begin{pmatrix} K_1^* & K_2^* & \cdots & K_n^* \\ I & 0 & \cdots & 0 \\ & \ddots & \ddots & \vdots \\ 0 & \cdots & I & 0 \end{pmatrix}, \qquad S^{-1}B = \begin{pmatrix} I \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \tag{4.1} \]
Recall that if $X_0$ is a solution of the algebraic Riccati equation, then we denote $A_0 = A + FB^*$, where $F = -AX_0B(R + B^*X_0B)^{-1}$. Hence, if $(A, B)$ is a monic pair, then so is $(A_0, B)$.

Theorem 4.1. Under the above conditions on the coefficients $A$ and $B$, the invertible solution $X$ of the Riccati equation (2.2) is congruent to the inverse of a block Toeplitz matrix.


Proof. We first observe that if $X$ is a solution of (2.2) with the coefficients $A$, $B$ and $R$, then $S^*XS$ is a solution of (2.2) with the coefficients $S^{-1}AS$, $S^{-1}B$, and $R$. We shall use the matrix $S$ such that (4.1) holds. Let $K$ denote the matrix
\[ K = \begin{pmatrix} K_1 & I & 0 & \cdots & 0 \\ K_2 & 0 & I & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ \vdots & \vdots & & \ddots & I \\ K_n & 0 & \cdots & \cdots & 0 \end{pmatrix}. \tag{4.2} \]
Let $B$ be as above. Define $X_{00}^{(n)} = -R$, and define $X_{j0}^{(n)} = -K_jR$. We assume that $K$ is invertible. The Stein equation (2.3) now becomes
\[ Y - K^*YK = \operatorname{diag} \begin{pmatrix} -(X_{00}^{(n)})^{-1} & 0 & \cdots & 0 \end{pmatrix}. \tag{4.3} \]
The corresponding algebraic Riccati equation is
\[ X = KXK^* - KXB(X_{00}^{(n)} + B^*XB)^{-1}B^*XK^*. \tag{4.4} \]
Now using [11] we see that $Y = X^{-1} = T_{n-1}$, where $T_n$ is the block Toeplitz matrix given by
\[ T_n = \hat K^* \begin{pmatrix} X_{00}^{(n)} & 0 \\ 0 & X^{-1} \end{pmatrix} \hat K, \]
where $\hat K$ is given by
\[ \hat K = \begin{pmatrix} (X_{00}^{(n)})^{-1} & 0 & \cdots & \cdots & 0 \\ K_1 & I & 0 & \cdots & 0 \\ K_2 & 0 & I & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ K_n & 0 & \cdots & 0 & I \end{pmatrix}. \]

We conclude that the invertible solution of the algebraic Riccati equation is the inverse of a block Toeplitz matrix. □

Now assume $K^*$ and $K^{-1}$ have no common eigenvalues. The latter condition can be stated in terms of a certain matrix polynomial, see [11]. We solve the Stein equation for $X_+^{-1}$ and the solution is block-Toeplitz. Then we need to be more explicit about the solution $X$.

We can take two points of view. The first is that we are given two solutions of the algebraic Riccati equation, $X_0$ and $X_1$, with $X_1 - X_0$ invertible. Put $X_{j0}^{(n)}$ as above, and put
\[ X(\lambda) = \sum_{j=0}^{n} \lambda^{n-j}X_{j0}^{(n)}. \]
We assume that $X(\lambda)$ does not have a zero $\lambda_0$ for which $\bar\lambda_0^{-1}$ is also a zero. Let $T_n$ be the (unique, because of all our assumptions) block Toeplitz matrix for which
\[ T_n \begin{pmatrix} X_{00}^{(n)} \\ X_{10}^{(n)} \\ \vdots \\ X_{n0}^{(n)} \end{pmatrix} = \begin{pmatrix} I \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \tag{4.5} \]
Then $X_+^{-1} = T_{n-1}$, and now we can describe all solutions of the algebraic Riccati equation.

The second point of view: we can assume that only one solution is given, $X_0$. Now assume in addition that the unique solution of (4.3) is invertible, then that gives us the $X_1$. Equivalently, that assumption is that there is an invertible block Toeplitz matrix satisfying (4.5). Then again $X_+^{-1} = T_{n-1}$, and we can continue as above. If $R > 0$ the invertibility of the solution is automatic, see [11, Theorem 8.1].

Now we arrive at the main point of this section. In the previous section we had to assume that $A$ and $(A^*)^{-1}$ do not have common eigenvalues, to guarantee existence of a (unique) invertible solution to the Stein equation (2.3). However, for the particular case under consideration here, we may use Theorem 8.1 of [11] to deduce existence of an invertible solution to the Stein equation even when $K$ and $(K^*)^{-1}$ have a common eigenvalue. In this case an extra condition on the eigenvectors and generalized eigenvectors should be satisfied. To be precise, let $R = -X_{00}$ be positive definite, and introduce the orthogonal matrix polynomial associated to $K$ by $P(\lambda) = -X(\lambda)R^{-1/2}$ (see [11]). Then Theorem 8.1 in [11] tells us that the Stein equation (2.3) has an invertible solution if and only if for every symmetric pair of eigenvalues $\lambda_0$, $\bar\lambda_0^{-1}$ of $P(\lambda)$ the following holds true: for any right Jordan chains $x_1, \ldots, x_k$ and $y_1, \ldots, y_l$ of $P(\lambda)$ corresponding to $\lambda_0$ and $\bar\lambda_0^{-1}$, respectively, there holds
\[ \sum_{j=1}^{\nu} \langle x_{\nu-j}, y_j \rangle = 0, \quad \text{where } \nu = \max\{k, l\}. \tag{4.6} \]

The condition stated above in terms of eigenvalues of the polynomial $P(\lambda)$ can be reformulated in terms of the eigenvalues of the matrix $K$. This proves the following theorem.

Theorem 4.2. Let $K$ and $B$ be as above, let $R$ be positive definite. If condition (4.6) holds whenever $\lambda_0$, $\bar\lambda_0^{-1}$ are both eigenvalues of $K$, then there is an invertible block Toeplitz Hermitian matrix $Y = T_{n-1}$ solving (4.3). The solutions to the Riccati equation (4.4) are in one-to-one correspondence to $K$-invariant subspaces $N$ which are $Y$-nondegenerate. The correspondence is given by $X_N = P_N^*Y^{-1}P_N$, where $P_N$ is the projection onto $YN^\perp$ along $N$.

If in the theorem above we replace the condition $R > 0$ by invertibility of the Hermitian matrix $R$, then Theorem 7.2 in [11] guarantees the existence of a Hermitian block Toeplitz matrix $T_{n-1}$ solving (4.3). However, invertibility of $T_{n-1}$ is not guaranteed.

We finish this section with some remarks on the inertia of solutions. For the case where $R = X_{00}^{(n)}$ is positive definite, the inertia of the solution $X_+^{-1} = T_{n-1}$ is given as follows: the number of negative eigenvalues of $X_+$ is the number of zeros of the orthogonal matrix polynomial $P(\lambda)$ inside the unit disc, the number of positive eigenvalues of $X_+$ is the number of zeros of $P(\lambda)$ outside the unit disc, see [11]. In case $R$ is no longer positive definite, the inertia of the solution $X_+$ may still be described in terms of zeros of a certain matrix polynomial, see [20]. The inertia of the solution $X_N$ obviously is determined by $X_+$ and the projection $P_N$. In any case:

#negative eigenvalues of $X_N$ ≤ #negative eigenvalues of $T_{n-1}$,
#positive eigenvalues of $X_N$ ≤ #positive eigenvalues of $T_{n-1}$,
#zero eigenvalues of $X_N$ = $\dim N$.

For some more information on the inertia of the solutions in case $R > 0$, see [18].
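The companion structure of this section is easy to assemble in code. The sketch below builds $K$ as in (4.2) and $B = (I \ 0 \ \cdots \ 0)^T$ from a list of $r \times r$ blocks, and checks the rank condition in the definition of a monic pair. The function names `companion_pair` and `is_monic_pair` and the two diagonal blocks are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def companion_pair(K_blocks):
    """Return (K, B) with K as in (4.2) and B = (I 0 ... 0)^T, from blocks K_1, ..., K_n."""
    n = len(K_blocks)
    r = K_blocks[0].shape[0]
    K = np.zeros((n * r, n * r), dtype=K_blocks[0].dtype)
    for j, Kj in enumerate(K_blocks):
        K[j * r:(j + 1) * r, :r] = Kj                               # first block column
        if j + 1 < n:
            K[j * r:(j + 1) * r, (j + 1) * r:(j + 2) * r] = np.eye(r)  # block superdiagonal
    B = np.zeros((n * r, r), dtype=K_blocks[0].dtype)
    B[:r, :] = np.eye(r)
    return K, B

def is_monic_pair(A, B):
    """Check rank (B, A^*B, ..., (A^*)^{n-1}B) = nr for B of size nr x r."""
    nr, r = B.shape
    n = nr // r
    cols = [B]
    for _ in range(n - 1):
        cols.append(A.conj().T @ cols[-1])
    return bool(np.linalg.matrix_rank(np.hstack(cols)) == nr)

# Illustrative 2 x 2 blocks (n = 2, r = 2); an assumption of this sketch.
K1 = np.diag([5.0, 5.0 / 6.0])
K2 = np.diag([-6.0, -1.0 / 6.0])
K, B = companion_pair([K1, K2])
print(K)
print(is_monic_pair(K, B))
```

For a matrix of the form (4.2) the rank condition always holds, since the block matrix $(B \ K^*B \ \cdots)$ is block upper triangular with identity blocks on the diagonal; the check is included mainly to illustrate the definition.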

5. Computations and examples

We start this section with some remarks concerning computation of the solutions. Let $N$ be a $p$-dimensional $K$-invariant subspace, and take any basis $w_1, \ldots, w_p$ for $N$. Put $W = \begin{pmatrix} w_1 & \cdots & w_p \end{pmatrix}$, and let $W = UDV^*$ be the singular value decomposition of $W$. Denote the $j$th column of $U$ by $u_j$, and put $U_1 = \begin{pmatrix} u_{p+1} & \cdots & u_{nr} \end{pmatrix}$. Then: $U_1U_1^* : \mathbb{C}^{nr} \to \mathbb{C}^{nr}$ is the orthogonal projection along $N$ onto $N^\perp$, $U_1 : \mathbb{C}^{nr-p} \to \mathbb{C}^{nr}$ is an isometry onto $N^\perp$, and $U_1^* : \mathbb{C}^{nr} \to \mathbb{C}^{nr-p}$ is the orthoprojection along $N$.

For the actual computation of $X_N$ we can now use the proof of Theorem 3.1. In fact, we know from (3.2) and (3.5) that with respect to the decomposition $\mathbb{C}^{nr} = N \oplus N^\perp$ the solution $X_N$ and the block Toeplitz $Y$ are given by
\[ X_N = \begin{pmatrix} 0 & 0 \\ 0 & Z_{22} \end{pmatrix}, \quad Y = \begin{pmatrix} * & * \\ * & Z_{22}^{-1} \end{pmatrix}. \]
Hence, $Z_{22}^{-1} = U_1^*YU_1$, and so $Z_{22} = (U_1^*YU_1)^{-1}$. Thus
\[ X_N = U_1(U_1^*YU_1)^{-1}U_1^*. \]
The computational method explained here will be used in the following examples.

Example 1. As a first example, consider the scalar case, that is, the case where $r = 1$. Then there is a finite number of $K$-invariant subspaces, since $K$ is nonderogatory.


Also, such subspaces can be completely described (see, e.g., [10]). We take $R = -X_{00}^{(n)} = -1$. Take for $N$ a one-dimensional subspace; such a subspace is spanned by the vector
\[ x_i = \begin{pmatrix} 1 & \dfrac{1}{\lambda_i} & \cdots & \dfrac{1}{\lambda_i^{n-1}} \end{pmatrix}^T, \]
which satisfies $Kx_i = \lambda_ix_i$. The number $\lambda_i$ is a root of the polynomial $p(\lambda) = 1 - k_1\lambda - k_2\lambda^2 - \cdots - k_n\lambda^n$. To see whether $N$ is $Y$-nondegenerate, we have to compute $x_i^*Yx_i$. Recalling that $Y$ satisfies the Stein equation (2.3), with $A = K^*$, we see that
\[ x_i^*(Y - K^*YK)x_i = x_i^*Yx_i(1 - |\lambda_i|^2) = x_i^*BR^{-1}B^*x_i = -1. \]
So, $x_i^*Yx_i \neq 0$, and $N$ is $Y$-nondegenerate.

Example 2. As a second example, we take a case where Theorem 4.2 applies. Consider the matrix
\[ K = \begin{pmatrix} 5 & 0 & 1 & 0 \\ 0 & 5/6 & 0 & 1 \\ -6 & 0 & 0 & 0 \\ 0 & -1/6 & 0 & 0 \end{pmatrix}, \]
and take $R = I_2$. The eigenvalues of $K$ are the numbers $2$, $1/2$, $3$, $1/3$, and so indeed, there are pairs of eigenvalues symmetrically placed with respect to the unit circle. The corresponding eigenvectors are, respectively,
\[ x_1 = \begin{pmatrix} 1 \\ 0 \\ -3 \\ 0 \end{pmatrix}, \quad x_2 = \begin{pmatrix} 0 \\ -3 \\ 0 \\ 1 \end{pmatrix}, \quad x_3 = \begin{pmatrix} 1 \\ 0 \\ -2 \\ 0 \end{pmatrix}, \quad x_4 = \begin{pmatrix} 0 \\ -2 \\ 0 \\ 1 \end{pmatrix}. \]
One sees immediately that the condition (4.6) is satisfied. To solve the corresponding Stein equation (4.3) one cannot just use Matlab's dlyap function, as it will complain about multiple solutions. However, a block Toeplitz solution is easily found by solving the equation by hand, and is equal to
\[ Y = \begin{pmatrix} 7/120 & 0 & 5/120 & 0 \\ 0 & -21/10 & 0 & -3/2 \\ 5/120 & 0 & 7/120 & 0 \\ 0 & -3/2 & 0 & -21/10 \end{pmatrix}. \]
The matrix $K$ has four eigenvectors, leading to a total of 14 nontrivial invariant subspaces: four one-dimensional subspaces, six two-dimensional ones, and again four three-dimensional ones. One can check directly that all $K$-invariant non-trivial subspaces are $Y$-nondegenerate, and hence there is a one-to-one correspondence between $K$-invariant subspaces and solutions to the algebraic Riccati equation. Thus there are 16 solutions to the corresponding algebraic Riccati equation: the zero solution, $Y^{-1}$, and fourteen solutions corresponding to the invariant subspaces given above. To compute the sixteen solutions, one sees that because of the special form of $K$, all solutions will have a structure like that of $Y$. The computations can then be done by hand.
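The count of sixteen solutions in Example 2 can be reproduced with the formula $X_N = U_1(U_1^*YU_1)^{-1}U_1^*$ from the beginning of this section. The sketch below makes one assumption explicit: the kernels $N$ are spanned by eigenvectors of $K^T$, which is the $A^*$-invariance required in Theorem 3.1 when the Riccati equation is written with $A = K$; the four vectors in `V` were obtained by hand for this purpose and do not appear in the text. It checks that $Y$ satisfies $Y - K^*YK = -BR^{-1}B^*$, then runs over all subsets of the four eigenvectors and records the rank and the Riccati residual of each $X_N$.

```python
import itertools
import numpy as np

K = np.array([[5, 0, 1, 0],
              [0, 5/6, 0, 1],
              [-6, 0, 0, 0],
              [0, -1/6, 0, 0]], dtype=float)
B = np.vstack([np.eye(2), np.zeros((2, 2))])
R = np.eye(2)
Y = np.array([[7/120, 0, 5/120, 0],
              [0, -21/10, 0, -3/2],
              [5/120, 0, 7/120, 0],
              [0, -3/2, 0, -21/10]])

# Y should satisfy the Stein equation Y - K^* Y K = -B R^{-1} B^*.
stein_residual = np.abs(Y - K.T @ Y @ K + B @ np.linalg.solve(R, B.T)).max()

# Columns: eigenvectors of K^T for the eigenvalues 2, 3, 1/2, 1/3
# (assumption of this sketch: kernels are taken K^*-invariant).
V = np.array([[2, 0, 1, 0],
              [3, 0, 1, 0],
              [0, 1/2, 0, 1],
              [0, 1/3, 0, 1]], dtype=float).T

def solution_with_kernel(W):
    """X_N = U1 (U1^* Y U1)^{-1} U1^*, where the columns of U1 span the complement of Im W."""
    n, p = W.shape
    if p == 0:
        return np.linalg.inv(Y)          # N = {0}: the invertible solution
    if p == n:
        return np.zeros((n, n))          # N = whole space: the zero solution
    U, _, _ = np.linalg.svd(W, full_matrices=True)
    U1 = U[:, p:]
    return U1 @ np.linalg.solve(U1.T @ Y @ U1, U1.T)

def riccati_residual(X):
    G = R + B.T @ X @ B
    return np.abs(X - K @ X @ K.T + K @ X @ B @ np.linalg.solve(G, B.T @ X @ K.T)).max()

results = []
for p in range(5):
    for idx in itertools.combinations(range(4), p):
        W = V[:, np.array(idx, dtype=int)]
        X = solution_with_kernel(W)
        results.append((idx, int(np.linalg.matrix_rank(X)), riccati_residual(X)))

print(f"Stein residual: {stein_residual:.1e}")
for idx, rank, res in results:
    print(idx, rank, f"{res:.1e}")
```

Each of the sixteen subsets yields a matrix of rank $4 - \dim N$ with a residual at rounding level, in line with the count given in the example. Because any orthonormal basis of $N^\perp$ gives the same $X_N$, the particular basis returned by the singular value decomposition does not affect the result.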

6. Connection with the Popov function

Let $(K, B)$ be a monic pair, that is, $K$ is as in (4.2), and $B^* = \begin{pmatrix} I & 0 & \cdots & 0 \end{pmatrix}$. With this pair introduce a matrix polynomial
\[ P(z) = I - zK_1 - z^2K_2 - \cdots - z^nK_n. \]
(Note that this matrix polynomial is slightly different from the orthogonal matrix polynomial $P$ introduced earlier.) Let $X$ be a solution to the algebraic Riccati equation
\[ X = KXK^* - KXB(R + B^*XB)^{-1}B^*XK^*, \tag{6.1} \]
and denote $X = (X_{ij})_{i,j=1}^n$, where each block $X_{ij}$ is of size $r \times r$. We see that $R + B^*XB = R + X_{11}$, and
\[ KXB = K \operatorname{col}(X_{i1})_{i=1}^n = \operatorname{col}(K_iX_{11} + X_{i+1,1})_{i=1}^n, \]
where $X_{n+1,1} = 0$. Introduce also the matrix polynomial
\[ L(z) = P(z) + \begin{pmatrix} zI & z^2I & \cdots & z^nI \end{pmatrix} KXB(R + B^*XB)^{-1}. \]
Using the observations above, the polynomial $L$ becomes
\[ L(z) = P(z) + \sum_{j=1}^n z^j(K_jX_{11} + X_{j+1,1})(R + X_{11})^{-1}. \]
Observe that $L$ is a comonic polynomial. Let us write $L(z) = P(z) + L_1(z)$, and put
\[ L_2(z) = \sum_{j=1}^n z^j(K_jX_{11} + X_{j+1,1}) = L_1(z)(R + X_{11}). \]
Introduce the function $V(z) = \begin{pmatrix} zI & z^2I & \cdots & z^nI \end{pmatrix}$. Then we have the following expressions for the polynomials $P$, $L$ and $L_2$:
\[ P(z) = I - V(z)KB, \tag{6.2} \]
\[ L_2(z) = V(z)KXB, \tag{6.3} \]
\[ L(z) = P(z) + V(z)KXB(R + B^*XB)^{-1}. \tag{6.4} \]

With these notations, the following theorem describes a one-to-one correspondence between solutions and polynomial factorizations.


Theorem 6.1. Let $(K^*, B)$ be a monic pair.

1. If $X$ is a solution to the algebraic Riccati equation
\[ X = KXK^* - KXB(R + B^*XB)^{-1}B^*XK^* \]
then, with $P(z)$ given by (6.2) and with $L(z)$ the comonic polynomial given by (6.4), we have
\[ P(z)RP(\bar z^{-1})^* = L(z)(R + X_{11})L(\bar z^{-1})^*. \tag{6.5} \]

2. Conversely, if we have a factorization (6.5) of $P(z)RP(\bar z^{-1})^*$, where $L$ is a comonic polynomial, then there is a solution $X$ of the algebraic Riccati equation such that $L(z)$ is given by (6.4).

Proof. We first prove part 1. Suppose that $X$ solves the algebraic Riccati equation, then we have to prove that (6.5) holds. We first rewrite this as follows:
\[ P(z)RP(\bar z^{-1})^* = \bigl(P(z) + L_2(z)(R + X_{11})^{-1}\bigr)(R + X_{11})\bigl(P(\bar z^{-1})^* + (R + X_{11})^{-1}L_2(\bar z^{-1})^*\bigr). \]
Expanding on the right-hand side, and cancelling the terms $P(z)RP(\bar z^{-1})^*$ which appear on both sides, we see that (6.5) is equivalent to
\[ P(z)X_{11}P(\bar z^{-1})^* + L_2(z)P(\bar z^{-1})^* + P(z)L_2(\bar z^{-1})^* + L_2(z)(R + X_{11})^{-1}L_2(\bar z^{-1})^* = 0. \]
Now to show that this holds as a consequence of $X$ being a solution of the algebraic Riccati equation, consider
\[ V(z)\bigl(X - KXK^* + KXB(R + B^*XB)^{-1}B^*XK^*\bigr)V(\bar z^{-1})^* = 0. \]
Using (6.3) we see that
\[ L_2(z)(R + X_{11})^{-1}L_2(\bar z^{-1})^* = V(z)KXB(R + B^*XB)^{-1}B^*XK^*V(\bar z^{-1})^*. \]
So, it remains to prove that
\[ V(z)(X - KXK^*)V(\bar z^{-1})^* \tag{6.6} \]
\[ = P(z)X_{11}P(\bar z^{-1})^* + L_2(z)P(\bar z^{-1})^* + P(z)L_2(\bar z^{-1})^*. \tag{6.7} \]
We compute
\[ P(z)X_{11}P(\bar z^{-1})^* = X_{11} - V(z)KBX_{11} - X_{11}B^*K^*V(\bar z^{-1})^* + V(z)KBX_{11}B^*K^*V(\bar z^{-1})^*, \]
and
\[ L_2(z)P(\bar z^{-1})^* = V(z)KXB - V(z)KXBB^*K^*V(\bar z^{-1})^*. \]


The expression (6.7) becomes
\begin{align*}
& X_{11} - V(z)KBX_{11} - X_{11}B^*K^*V(\bar z^{-1})^* + V(z)KBX_{11}B^*K^*V(\bar z^{-1})^* \\
& \quad + V(z)KXB - V(z)KXBB^*K^*V(\bar z^{-1})^* + B^*XK^*V(\bar z^{-1})^* - V(z)KBB^*XK^*V(\bar z^{-1})^*. \tag{6.8}
\end{align*}
Now we use the fact that $X_{11} = B^*XB$, as well as $K = S + KBB^*$, where
\[ S = \begin{pmatrix} 0 & I & 0 & \cdots & 0 \\ \vdots & 0 & \ddots & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ \vdots & & & 0 & I \\ 0 & \cdots & \cdots & \cdots & 0 \end{pmatrix}. \]
Combining terms in (6.8) we then have
\begin{align*}
& X_{11} + V(z)(KXB - KBB^*XB) + (B^*XK^* - B^*XBB^*K^*)V(\bar z^{-1})^* \\
& \quad + V(z)(KBB^*XBB^*K^* - KXBB^*K^* - KBB^*XK^*)V(\bar z^{-1})^* \\
& = X_{11} + V(z)SXB + B^*XS^*V(\bar z^{-1})^* + V(z)\bigl((KBB^* - K)X(BB^*K^* - K^*) - KXK^*\bigr)V(\bar z^{-1})^* \\
& = X_{11} + V(z)SXB + B^*XS^*V(\bar z^{-1})^* + V(z)(-S)X(-S^*)V(\bar z^{-1})^* - V(z)KXK^*V(\bar z^{-1})^* \\
& = X_{11} + (V(z)S + B^*)X(B + S^*V(\bar z^{-1})^*) - B^*XB - V(z)KXK^*V(\bar z^{-1})^* \\
& = (V(z)S + B^*)X(B + S^*V(\bar z^{-1})^*) - V(z)KXK^*V(\bar z^{-1})^*. \tag{6.9}
\end{align*}
Now we note that
\[ V(z)S + B^* = \begin{pmatrix} 0 & zI & \cdots & z^{n-1}I \end{pmatrix} + \begin{pmatrix} I & 0 & \cdots & 0 \end{pmatrix} = \frac{1}{z}V(z). \]
So,
\[ (V(z)S + B^*)X(B + S^*V(\bar z^{-1})^*) = \frac{1}{z}V(z)X\,zV(\bar z^{-1})^* = V(z)XV(\bar z^{-1})^*. \]
Thus, we see that (6.9) equals $V(z)(X - KXK^*)V(\bar z^{-1})^*$, as desired.

We now prove part 2. Suppose that (6.5) holds for some comonic polynomial $L$ and some Hermitian $X_{11}$ for which $R + X_{11}$ is invertible. Write $L(z) = I + \sum_{j=1}^n z^jL_j$. Then define for $j = 1, \ldots, n-1$ the matrices $X_{j+1,1}$ by solving
\[ L_j = -K_j + (K_jX_{11} + X_{j+1,1})(R + X_{11})^{-1}, \]
that is,
\[ X_{j+1,1} = (L_j + K_j)(R + X_{11}) - K_jX_{11} = L_j(R + X_{11}) + K_jR. \]
This determines the first column of $X$. Put $X_{1j} = X_{j1}^*$. Next, define $X_{ij}$ for $i, j > 1$ recursively from the following equation, where we take $X_{i,n+1} = 0$ for all $i$ and $X_{n+1,j} = 0$ for all $j$,
\[ X_{ij} - X_{i+1,j+1} = K_iX_{1,j+1} + X_{i+1,1}K_j^* + K_iX_{11}K_j^* - (K_iX_{11} + X_{i+1,1})(R + X_{11})^{-1}(X_{11}K_j^* + X_{1,j+1}). \tag{6.10} \]
To show that it follows that $X$ satisfies the algebraic Riccati equation, use again that $K = S + KBB^*$, and rewrite the algebraic Riccati equation as follows:
\[ X - SXS^* = KBB^*XS^* + SXBB^*K^* + KBX_{11}B^*K^* - KXB(R + X_{11})^{-1}B^*XK^*. \]
The $(i, j)$-entry of the left-hand side is given by $X_{ij} - X_{i+1,j+1}$, with the agreement that all entries with a column or row index $n+1$ are zero. One checks directly that the right-hand side only depends on the entries in the first column and row of $X$, and moreover, the $(i, j)$-entry of the right-hand side is equal to the $(i, j)$-entry of the right-hand side of (6.10). In other words, (6.10) is equivalent to the algebraic Riccati equation. □

Because of the proof of part 2 of the previous theorem, any solution $X$ of the Riccati equation (6.1) with coefficients as discussed in the beginning of this section is completely determined by its first block column. Recalling that the invertible solution is the inverse of a Hermitian block Toeplitz matrix, that is no surprise for the invertible solution, because of the Gohberg-Heinig-Semencul formulas, see, e.g., [11]. Surprisingly enough, this holds for the non-invertible solutions as well.

The theorem above can be viewed as a corollary of more general results connecting the solutions of a more general algebraic Riccati equation with factorizations of the so-called Popov function. Below we shall explain how that connection may be used to derive the theorem above. However, here we preferred to give an independent proof, which seems to us more transparent for this case, and also gives additional information as seen from the previous paragraph.

Intimately connected to the discrete algebraic Riccati equation
\[ X = AXA^* + Q - AXB(R + B^*XB)^{-1}B^*XA^* \]
is the so-called Popov function, given by
\[ G(z) = R + B^*(z^{-1} - A)^{-1}Q(z - A^*)^{-1}B. \]


See, e.g., [14], [13], [16]. In particular, there is a one-to-one correspondence between solutions $X$ and certain symmetric factorizations of $G$, which is described as follows: let $X$ be a solution to the discrete algebraic Riccati equation, and put
\[ R(z) = I + B^*(z^{-1} - A)^{-1}AXB(R + B^*XB)^{-1}, \]
then
\[ G(z) = R(z)(R + B^*XB)R(\bar z^{-1})^*. \]
Note that when $Q = 0$ the Popov function reduces to a constant matrix function.

For the particular case where the pair $(A, B)$ is a monic pair, we may take
\[ A = K, \quad B^* = \begin{pmatrix} I & 0 & \cdots & 0 \end{pmatrix}, \]
where $K$ is given by (4.2). Computing $B^*(z^{-1} - A)^{-1} = zB^*(I - zA)^{-1}$, we see that this is equal to
\[ P(z)^{-1} \begin{pmatrix} zI & z^2I & \cdots & z^nI \end{pmatrix}, \]
where
\[ P(z) = I - zK_1 - z^2K_2 - \cdots - z^nK_n. \]
The Popov function then may be replaced by the polynomial
\[ P(z)G(z)P(\bar z^{-1})^* = \begin{pmatrix} zI & z^2I & \cdots & z^nI \end{pmatrix} Q \begin{pmatrix} \frac{1}{z}I \\ \frac{1}{z^2}I \\ \vdots \\ \frac{1}{z^n}I \end{pmatrix} + P(z)RP(\bar z^{-1})^*. \]
Now for the particular case where $Q = 0$, the latter function still is a nontrivial polynomial. The factors corresponding to solutions $X$ are given by $L(z) = P(z)R(z)$, where $R(z)$ is as above, so by
\[ L(z) = P(z) + \begin{pmatrix} zI & z^2I & \cdots & z^nI \end{pmatrix} KXB(R + B^*XB)^{-1}. \]
Thus, the polynomial $L$ is exactly the comonic polynomial from Theorem 6.1.
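The factorization (6.5) of Theorem 6.1 can be spot-checked numerically by evaluating both sides at a few points $z$. The sketch below does this for a scalar case ($r = 1$, $n = 2$) with $k_1 = 5$, $k_2 = -6$ and $R = 1$; this data and the three solutions listed (an invertible one, a rank-one one, and zero) are illustrative assumptions that were verified by hand to satisfy (6.1), and the function names are hypothetical.

```python
import numpy as np

# Illustrative scalar data (an assumption of this sketch): P(z) = 1 - 5z + 6z^2.
K = np.array([[5.0, 1.0], [-6.0, 0.0]])
B = np.array([[1.0], [0.0]])
R = np.array([[1.0]])
n, r = 2, 1

def V(z):
    """V(z) = (zI  z^2 I  ...  z^n I)."""
    return np.hstack([z ** j * np.eye(r) for j in range(1, n + 1)])

def P(z):
    """P(z) = I - V(z) K B, as in (6.2)."""
    return np.eye(r) - V(z) @ K @ B

def L(z, X):
    """L(z) = P(z) + V(z) K X B (R + B^* X B)^{-1}, as in (6.4)."""
    G = R + B.conj().T @ X @ B
    return P(z) + V(z) @ K @ X @ B @ np.linalg.inv(G)

def factorization_gap(z, X):
    """Largest entry of |P(z) R P(w)^* - L(z)(R + X_11) L(w)^*| with w = 1 / conj(z)."""
    w = 1.0 / np.conj(z)
    G = R + B.conj().T @ X @ B
    lhs = P(z) @ R @ P(w).conj().T
    rhs = L(z, X) @ G @ L(w, X).conj().T
    return np.abs(lhs - rhs).max()

solutions = {
    "invertible": np.array([[35.0, -25.0], [-25.0, 35.0]]),
    "rank one": np.array([[8.0, -16.0], [-16.0, 32.0]]),
    "zero": np.zeros((2, 2)),
}
points = [0.7 + 0.2j, -1.3 + 0.5j, 2.0j]
gaps = {name: max(factorization_gap(z, X) for z in points)
        for name, X in solutions.items()}
for name, gap in gaps.items():
    print(name, f"{gap:.1e}")
```

For the invertible solution the factor works out to $L(z) = 1 - \tfrac{5}{6}z + \tfrac{1}{6}z^2$, and for the rank-one solution to $L(z) = 1 - \tfrac{7}{3}z + \tfrac{2}{3}z^2$; both are comonic, as the theorem requires. A pointwise check of this kind is only evidence, not a proof, but it is a convenient guard against sign or index slips when implementing the formulas.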

References

[1] F.A. Aliev, B.A. Bordyug and V.B. Larin. Discrete generalized Riccati equations and polynomial matrix factorization. Systems and Control Letters 18 (1992), 49–59.
[2] H. Bart, I. Gohberg, M.A. Kaashoek. The coupling method for solving integral equations. In: Topics in operator theory and networks, the Rehovot workshop, OT 12, Birkhäuser Verlag, Basel, 1984, pp. 39–74.
[3] C.T. Chen. A generalization of the inertia theorem. SIAM J. Appl. Math. 25 (1973), 158–161.
[4] D.J. Clements, H.K. Wimmer. Existence and uniqueness of unmixed solutions of the discrete-time algebraic Riccati equation. Systems and Control Letters 50 (2003), 343–346.
[5] A. Ferrante. On the structure of the solutions of discrete-time algebraic Riccati equation with singular closed-loop matrix. IEEE Trans. Automat. Control 49 (2004), 2049–2054.
[6] A. Ferrante, M. Pavon and S. Pinzoni. Asymmetric algebraic Riccati equation: a homeomorphic parametrization of the set of solutions. Linear Algebra and Appl. 329 (2001), 137–156.
[7] A. Ferrante and H.K. Wimmer. Order reduction of discrete-time algebraic Riccati equations with singular closed-loop matrix. Oper. Matrices 1 (2007), 61–70.
[8] A.E. Frazho, M.A. Kaashoek, A.C.M. Ran. The non-symmetric discrete algebraic Riccati equation and canonical factorization of rational matrix functions on the unit circle. Integral Equations and Operator Theory 66 (2010), 215–229.
[9] I. Gohberg, P. Lancaster, L. Rodman. Matrix Polynomials. Academic Press, 1982.
[10] I. Gohberg, P. Lancaster, L. Rodman. Invariant subspaces of matrices with Applications. John Wiley & Sons, New York, 1986.
[11] I. Gohberg and L. Lerer. Matrix generalizations of M.G. Krein theorems on orthogonal polynomials. In: Orthogonal Matrix-valued Polynomials and Applications (ed. I. Gohberg), OT 34, Birkhäuser Verlag, 1988, 137–202.
[12] C. Heij, A.C.M. Ran, F. van Schagen. Introduction to Mathematical Systems Theory. Birkhäuser Verlag, Basel, 2007.
[13] V. Ionescu, C. Oara and M. Weiss. Generalized Riccati Theory and robust control. A Popov function approach. John Wiley, Chichester, 1999.
[14] I. Karelin, L. Lerer and A.C.M. Ran. $J$-symmetric factorizations and algebraic Riccati equations. In: Proceedings of the IWOTA 1998 (eds. A. Dijksma, M.A. Kaashoek, A.C.M. Ran), OT 124, Birkhäuser Verlag, 2001, 319–360.
[15] M.G. Krein. Distribution of roots of polynomials orthogonal on the unit circle with respect to a sign alternating weight. Theor. Funkcii Funkcional Anal. i. Prilozen. 2 (1966), 131–137 (Russian).
[16] P. Lancaster and L. Rodman. Algebraic Riccati Equations. Oxford, UK: Clarendon Press, 1995.
[17] P. Lancaster and M. Tismenetsky. The Theory of Matrices. 2nd Edition. Academic Press, San Diego etc., 1985.
[18] H. Langer, A.C.M. Ran, D. Temme. Inertia of Hermitian solutions of the algebraic Riccati equation. In: Proceedings of the European Control Conference, 1997.
[19] V.B. Larin. The generalized Riccati equations, orthogonal projectors and factorization of matrix polynomials. Soviet J. Automat. Inform. Sci. 22 (1989), 72–77. Translated from Automatika 1989, no. 6, 70–74, 96.
[20] L. Lerer, A.C.M. Ran. A new inertia theorem for Stein equations, inertia of invertible hermitian block Toeplitz matrices and matrix orthogonal polynomials. Integral Equations and Operator Theory 47 (2003), 339–360.
[21] A.C.M. Ran and H.L. Trentelman. Linear quadratic problems with indefinite cost for discrete time systems. SIAM J. Matrix Anal. Appl. 14 (1993), 776–797.
[22] M.A. Shayman. Geometry of the algebraic Riccati equations. I, II. SIAM J. Control 21 (1983), 375–394 and 395–409.
[23] J.C. Willems. Least squares stationary optimal control and the algebraic Riccati equation. IEEE Trans. Automatic Control AC-16 (1971), 621–634.
[24] H.K. Wimmer. Inertia theorems for matrices, controllability, and linear vibrations. Linear Algebra and Appl. 8 (1974), 337–343.
[25] H.K. Wimmer. Unmixed solutions of the discrete-time algebraic Riccati equation. SIAM J. Control Optim. 30 (1992), 867–878.
[26] H.K. Wimmer. Hermitian solutions of the discrete-time algebraic Riccati equation. Internat. J. Control 63 (1996), 921–936.
[27] H.K. Wimmer. A parametrization of solutions of the discrete-time algebraic Riccati equation based on pairs of opposite unmixed solutions. SIAM J. Control Optim. 44 (2006), 1992–2005.

Leonid Lerer
Department of Mathematics
Technion-Israel Institute of Technology
3200 Haifa, Israel
e-mail: [email protected]

André C.M. Ran
Department of Mathematics
Faculteit of Exact Sciences
Vrije Universiteit Amsterdam
De Boelelaan 1081a
NL-1081 HV Amsterdam, The Netherlands
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 513–539
© 2012 Springer Basel AG

On Cyclic and Nearly Cyclic Multiagent Interactions in the Plane

Frédérique Oggier and Alfred Bruckstein

This paper is dedicated to the memory of Professor Israel Gohberg, who was a great mathematician truly interested in engineering applications, and a wonderful person.

Abstract. Cyclic pursuit and local averaging interactions have been extensively analyzed in the context of multiagent gathering, in the field of distributed robotics. This paper reviews some results on cyclically structured dynamical systems, and discusses their application to nearly cyclic interactions among $N$ point-agents in the plane, leading to formations of interesting limiting geometric configurations. In particular, we consider evolutions that can be modeled by a Toeplitz operator, and explain how they can be decoupled into independent evolving modes, focusing on nearly cyclic interactions that break symmetry, leading to factor circulants rather than circulant interaction matrices.

Mathematics Subject Classification (2000). 15A18, 47B35, 68T40.

Keywords. Circulant matrices, $\lambda$-circulant matrices, cyclic pursuit, multiagent interaction.

1. Introduction

Consider a “swarm” or “pack” of $N$ robots in the plane, denoted by $\mathcal{P}_0, \mathcal{P}_1, \ldots, \mathcal{P}_{N-1}$, which can all see each other and are aware of the other robots’ identities (i.e., can distinguish them). We shall define the rules of interaction specifying how each robot $\mathcal{P}_k$ moves in response to the (evolution in time of the) configuration of the entire swarm. Denoting $\mathcal{P}_k$’s location at time $t$ by $\mathcal{P}_k(t) = x_k(t) + i y_k(t)$ (a complex number), we assume that we can write the swarm evolution equations as follows:

\[
\frac{d\mathcal{P}_k(t)}{dt} = \Phi_k^{(C)}\{\mathcal{P}_s(\xi)|_{s=0,1,\ldots,N-1};\ \xi \le t\}
\quad\text{or}\quad
\mathcal{P}_k(t+1) = \Phi_k^{(D)}\{\mathcal{P}_s(\xi)|_{s=0,1,\ldots,N-1};\ \xi \le t\}
\tag{1}
\]

depending on whether the temporal evolution is continuous $(C)$ or discrete $(D)$. So far the $\Phi$-operators are not specified, and in fact they could be quite involved in general. The operator $\Phi_k^{(C)}\{\cdot\}$ provides an instantaneous velocity vector for agent $\mathcal{P}_k$ in response to the locations of the other agents in the swarm, while $\Phi_k^{(D)}\{\cdot\}$ will yield the next location for $\mathcal{P}_k$ in a synchronous discrete-time evolution. Both the discrete and the continuous operators should produce the same evolutions if we decide to look at the agents in a different frame of reference, i.e., re-encode their locations using arbitrarily rotated and possibly uniformly scaled coordinates; hence the resulting evolution equations must be similarity invariant. This requirement clearly imposes restrictions on the $\Phi$ operators, and these will be discussed in the sequel (see (6)).

Linear memoryless operators are an important class of (discrete or continuous) operators which have the form

\[
\Phi_k\{\mathcal{P}_0, \mathcal{P}_1, \ldots, \mathcal{P}_{N-1}\} = \sum_{l=0}^{N-1} m_{kl}(t)\,\mathcal{P}_l(t)
\]

where $m_{kl}(t)$ are some (complex) numbers, varying perhaps in time, but which do not depend on previous swarm configurations. In this case, Equation (1) describes a linear (generally time varying) system’s state evolution, and there is a wealth of theory dealing with such systems in the control and signal processing literature. Here we shall mainly be concerned with a special class of (constant) linear Toeplitz operators of the form

\[
\Phi_k\{\mathcal{P}_0, \mathcal{P}_1, \ldots, \mathcal{P}_{N-1}\} = \sum_{l=0}^{N-1} \lambda^{\operatorname{Ind}[(l-k)<0]}\, m_{(l-k) \bmod N}\, \mathcal{P}_l(t)
\tag{2}
\]

where $\lambda$ is some complex number, and

\[
m_{-1} \equiv m_{N-1}, \quad m_{-k} \equiv m_{N-k} \quad (\text{indices mod } N),
\qquad
\operatorname{Ind}[a<0] = \begin{cases} 1 & \text{if } a < 0, \\ 0 & \text{if } a \ge 0. \end{cases}
\]

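Writing out explicitly $\Phi_k\{\mathcal{P}_0, \ldots, \mathcal{P}_{N-1}\}$ for $k = 0, \ldots, N-1$ in matrix form and denoting

\[
\mathbf{P}(t) = \begin{bmatrix} \mathcal{P}_0(t) \\ \vdots \\ \mathcal{P}_{N-1}(t) \end{bmatrix},
\]

the swarm’s evolution dynamics becomes

\[
\left( \frac{d}{dt}\mathbf{P}(t) \ \text{ or } \ \mathbf{P}(t+1) \right) = \Phi\,\mathbf{P}(t)
= \begin{bmatrix}
m_0 & m_1 & m_2 & \cdots & m_{N-1} \\
\lambda m_{N-1} & m_0 & m_1 & \cdots & m_{N-2} \\
\lambda m_{N-2} & \lambda m_{N-1} & m_0 & \cdots & \vdots \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
\lambda m_1 & \lambda m_2 & \cdots & \lambda m_{N-1} & m_0
\end{bmatrix} \mathbf{P}(t).
\tag{3}
\]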
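To make the index convention of (2) and (3) concrete, the following Python sketch builds the interaction matrix from its first row. The helper name `lambda_circulant` and the numeric values are illustrative assumptions and do not come from the paper.

```python
def lambda_circulant(m, lam):
    """Return the N x N lambda-circulant matrix with first row m, as in (3)."""
    n = len(m)
    # Entry (k, l) is m[(l - k) mod N]; entries that wrap around (l < k)
    # pick up the extra factor lam, exactly as the indicator in (2) prescribes.
    return [[(lam if l < k else 1) * m[(l - k) % n] for l in range(n)]
            for k in range(n)]

# Small illustration with N = 3, first row [1, 2, 3] and lambda = 10.
W = lambda_circulant([1, 2, 3], 10)
for row in W:
    print(row)
```

Setting `lam = 1` in the same call returns an ordinary circulant, which is the perfectly cyclic case discussed next.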


Note that if $\lambda = 1$, the matrix is a special Toeplitz-circulant matrix, since by definition, a Toeplitz-circulant (or simply circulant) $N \times N$ matrix $\mathbf{M} = [m_{i,j}]_{i,j=0}^{N-1}$ is obtained by cyclic shifts of its first row (or equivalently, of its first column):

\[
m_{i,j} = m_{(i-1) \bmod N,\ (j-1) \bmod N}.
\]

Otherwise it is a generalization of a circulant called a $\lambda$-factor circulant, or $\lambda$-circulant matrix, where the $N \times N$ matrix $\mathbf{M}$ is similarly defined by cyclic shifts of its first row (or column), up to a factor $\lambda$ on the entries that wrap around:

\[
m_{i,j} = \begin{cases}
m_{0,\,j-i}, & i \le j \le N-1, \\
\lambda\, m_{0,\,N+j-i}, & 0 \le j < i,
\end{cases}
\qquad 0 \le i \le N-1,
\]

in agreement with (2) and (3). Such matrices arise in several applications, such as linear systems theory [8, 10], linear algebra [1], geometry [5, 13, 15], and in connection with inverses of Toeplitz matrices [7, 9, 4], coding theory [6] and linear systems of differential equations [18].

In case of $\lambda = 1$, i.e., when the operator $\Phi$ is Toeplitz-circulant, we have that all the robotic agents perform “cyclically” the same operation, i.e., agent $\mathcal{P}_k$ will determine its next location (or its velocity) according to the same weighted average performed on $\mathcal{P}_k, \mathcal{P}_{k+1}, \ldots, \mathcal{P}_{(k+N-1) \bmod N}$ (in this order), i.e.,

\[
\left\{ \mathcal{P}_k(t+1) \ \text{ or } \ \frac{d}{dt}\mathcal{P}_k(t) \right\}
= [m_0, m_1, \ldots, m_{N-1}]
\begin{bmatrix} \mathcal{P}_k(t) \\ \mathcal{P}_{k+1}(t) \\ \vdots \\ \mathcal{P}_{(k+N-1) \bmod N}(t) \end{bmatrix},
\tag{4}
\]

which can be rewritten as

\[
\left\{ \mathcal{P}_k(t+1) \ \text{ or } \ \frac{d}{dt}\mathcal{P}_k(t) \right\}
= \bar{m}\, \mathbf{Z}^{k}\, \mathbf{P}(t)
\tag{5}
\]

where $\mathbf{Z}^k$ is the cyclic permutation matrix whose first row has its single 1 in the $k$th place (counting from 0),

\[
\mathbf{Z} \triangleq \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
\vdots & 0 & 1 & \ddots & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
0 & & & 0 & 1 \\
1 & 0 & \cdots & \cdots & 0
\end{bmatrix}
\quad \text{and} \quad \bar{m} = [m_0, m_1, \ldots, m_{N-1}].
\]


This special case, with a circulant matrix $\Phi$, was extensively analyzed before in the context of polygon smoothing evolutions and cyclic pursuits for robotic gathering and formation control, see, e.g., [5, 13, 15, 3, 2, 12, 11, 7].

Note that invariance requirements impose some conditions on the linear evolution operators, as we now discuss. If $\mathbf{P}(t)$ is described by the evolution equations

\[
\frac{d}{dt}\mathbf{P}(t) = \Phi^{(C)}\mathbf{P}(t) \quad \text{or} \quad \mathbf{P}(t+1) = \Phi^{(D)}\mathbf{P}(t)
\]

from some initial location $\mathbf{P}(0) = \mathbf{P}(t=0)$, and if we re-encode the agents’ positions via a general similarity transformation of the form

\[
\mathbf{P}'(t) \triangleq \rho\,\mathbf{P}(t) + \tau\,\mathbf{1}
\tag{6}
\]

where $\rho$ and $\tau$ are some complex numbers and $\mathbf{1} = [1, \ldots, 1]^T$, we shall have for $\mathbf{P}'(t)$:

∙ in the continuous case

\[
\frac{d}{dt}\mathbf{P}'(t) \triangleq \frac{d}{dt}\big(\rho\,\mathbf{P}(t) + \tau\,\mathbf{1}\big) = \rho\,\frac{d}{dt}\mathbf{P}(t) = \rho\,\Phi^{(C)}\mathbf{P}(t),
\]

which is equal to $\Phi^{(C)}(\rho\,\mathbf{P}(t) + \tau\,\mathbf{1})$ only if $\Phi^{(C)}\mathbf{1} = \mathbf{0}$;

∙ in the discrete case

\[
\mathbf{P}'(t+1) \triangleq \rho\,\mathbf{P}(t+1) + \tau\,\mathbf{1} = \rho\,\Phi^{(D)}\mathbf{P}(t) + \tau\,\mathbf{1},
\]

which is equal to $\Phi^{(D)}(\rho\,\mathbf{P}(t) + \tau\,\mathbf{1})$ only if $\Phi^{(D)}\mathbf{1} = \mathbf{1}$.

Hence the $\Phi$-matrices that describe linear, time-invariant evolutions need to obey the conditions $\Phi^{(C)}\mathbf{1} = \mathbf{0}$ or $\Phi^{(D)}\mathbf{1} = \mathbf{1}$ in order to have Euclidean or similarity invariant evolutions. In some of our examples, these conditions cannot be satisfied. However, note that any $N \times N$ matrix $\Phi$ may be embedded in an $(N+1) \times (N+1)$ matrix $\bar{\Phi}$ as follows

\[
\bar{\Phi}\,\mathbf{1} = \begin{bmatrix} \Phi & \mathbf{s} \\ \mathbf{0}^T & z \end{bmatrix}
\begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}
= \begin{bmatrix} \Phi\mathbf{1} + \mathbf{s} \\ z \end{bmatrix},
\]

and selecting either $z = 0$ and $\mathbf{s} = -\Phi\mathbf{1}$, or $z = 1$ and $\mathbf{s} = -\Phi\mathbf{1} + \mathbf{1}$, we obtain a $\bar{\Phi}$ matrix that describes an invariant evolution of a multi-agent system with an additional agent $\mathcal{P}_B$ whose position is stationary ($\frac{d}{dt}\mathcal{P}_B = 0$ or $\mathcal{P}_B(t+1) = \mathcal{P}_B(t)$). This additional agent will act as a “beacon” or a set reference point for the description of the swarm of agents. In this case, setting $\mathcal{P}_B = (0, 0)$, the evolution of the rest of the agents will be described by the original matrix $\Phi$.

Note that the spatial location of the fixed $\mathcal{P}_B$ in the plane may be determined according to the initial location of the agents of the swarm. A good example is the geometric and affine invariant decision that can be made by each agent independently to set $\mathcal{P}_B$, and hence the origin of its Cartesian coordinate system, at the centroid of the agent location constellation at $t = 0$. This will make the swarm evolution entirely autonomous. However, an external setting of the location of $\mathcal{P}_B$ might be useful in controlling the swarm and steering it toward a desired place in the environment. One might even desire to move $\mathcal{P}_B$ in time and make the swarm move accordingly, by tracking the beacon point in addition to its own internal dynamics controlled by $\Phi$.
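The beacon construction can be checked numerically. The Python sketch below is only an illustration: the helper names `matvec` and `embed_with_beacon`, the sample matrix (a Darboux-type matrix with $\lambda = 2$) and the agent positions are assumptions made for this example, not taken from the paper.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def embed_with_beacon(Phi, discrete=True):
    """Border Phi with s and z so that the enlarged matrix maps the all-ones
    vector to z * 1 (z = 1: discrete time, z = 0: continuous time)."""
    n = len(Phi)
    z = 1 if discrete else 0
    row_sums = matvec(Phi, [1] * n)        # Phi 1
    s = [z - r for r in row_sums]          # s = -Phi 1 + z 1
    return [list(row) + [s_k] for row, s_k in zip(Phi, s)] + [[0] * n + [z]]

# The last row sums to 1.5, so Phi 1 != 1 and Phi alone is not invariant.
Phi = [[0.5, 0.5, 0.0],
       [0.0, 0.5, 0.5],
       [1.0, 0.0, 0.5]]
Phi_bar = embed_with_beacon(Phi)
P = [1 + 2j, -1 + 0.5j, 0.25 - 1j]         # arbitrary agent positions
with_beacon = matvec(Phi_bar, P + [0])     # beacon placed at the origin
print(matvec(Phi_bar, [1, 1, 1, 1]))       # the all-ones vector is reproduced
print(with_beacon[:3] == matvec(Phi, P))   # agents still evolve under Phi
```

With the beacon at the origin the extra column contributes nothing, which is why the first $N$ components reproduce the original dynamics.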

2. Analyzing swarm evolution via mode decoupling

Circulant and $\lambda$-factor circulant matrices have very special structures, and this allows us to diagonalize them, essentially by Fourier transform methods. Let us see, in general, how diagonalization yields a way to analyze the evolution of the constellation of robots by decoupling it into independently evolving modes. Indeed, assume that the time-invariant matrix $\Phi$ can be diagonalized (for example when $\Phi$ has distinct eigenvalues, or $\Phi$ is normal, i.e., $\Phi^*\Phi = \Phi\Phi^*$, hence having a full set of orthonormal eigenvectors), as follows

\[
\Phi = T^{-1} D\, T
\]

where $D = \operatorname{Diag}[d_0, d_1, \ldots, d_{N-1}]$ displays the eigenvalues of $\Phi$ and the columns of $T^{-1}$ are the (right) eigenvectors. Now we have that

\[
\left\{ \mathbf{P}(t+1) \ \text{ or } \ \frac{d}{dt}\mathbf{P}(t) \right\} = T^{-1} D\, T\, \mathbf{P}(t)
\]

and hence

\[
\left\{ T\mathbf{P}(t+1) \ \text{ or } \ \frac{d}{dt}\big(T\mathbf{P}(t)\big) \right\} = D\,\big(T\mathbf{P}(t)\big).
\]

In terms of the transformed vector $\tilde{\mathbf{P}}(t) \triangleq T\mathbf{P}(t)$, the evolution is a decoupled evolution controlled explicitly by the (constant) eigenvalues [10]. Indeed, we have

\[
\tilde{\mathbf{P}}(t) = \begin{bmatrix} d_0^{\,t} & & & 0 \\ & d_1^{\,t} & & \\ & & \ddots & \\ 0 & & & d_{N-1}^{\,t} \end{bmatrix} \tilde{\mathbf{P}}(0)
\quad \text{(discrete case)}
\]

or

\[
\tilde{\mathbf{P}}(t) = \begin{bmatrix} e^{d_0 t} & & & 0 \\ & e^{d_1 t} & & \\ & & \ddots & \\ 0 & & & e^{d_{N-1} t} \end{bmatrix} \tilde{\mathbf{P}}(0)
\quad \text{(continuous case)}.
\]


Therefore diagonalization enables the explicit solution of the swarm evolution whenever the $\Phi$ matrix is time invariant and has a full set of linearly independent eigenvectors (which can be chosen orthonormal when $\Phi$ is normal). As we shall see below, $\lambda$-factor circulants are a family of matrices that enable both a nice physical interpretation in terms of cyclic and symmetric interactions among similar agents and an explicit diagonalization via discrete Fourier transform matrices.
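As a small worked illustration of this decoupling (the two-agent setting and the symbols $a$, $b$ are assumptions chosen for the example and do not appear in the paper), take $N = 2$ and a symmetric interaction matrix:

```latex
\[
\Phi = \begin{bmatrix} a & b \\ b & a \end{bmatrix}
     = T^{-1} D\, T,
\qquad
T = T^{-1} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix},
\qquad
D = \operatorname{Diag}[\,a+b,\ a-b\,].
\]
% Transformed coordinates: a "sum" mode and a "difference" mode.
\[
\tilde{\mathcal{P}}_0 = \tfrac{1}{\sqrt{2}}(\mathcal{P}_0 + \mathcal{P}_1),
\qquad
\tilde{\mathcal{P}}_1 = \tfrac{1}{\sqrt{2}}(\mathcal{P}_0 - \mathcal{P}_1).
\]
% Discrete-time evolution of each mode, independently of the other.
\[
\tilde{\mathcal{P}}_0(t) = (a+b)^t\, \tilde{\mathcal{P}}_0(0),
\qquad
\tilde{\mathcal{P}}_1(t) = (a-b)^t\, \tilde{\mathcal{P}}_1(0).
\]
```

For $a = b = \tfrac12$ the sum mode is preserved and the difference mode vanishes after one step, i.e., both agents jump to their midpoint.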

3. Diagonalization of factor circulants

Factor circulant matrices are very special in that they provide explicit formulae for the diagonalizing transforms and for their eigenvalues. This enables us to analyze in detail the behavior of multiagent interactions when these are cyclic or “nearly” cyclic, and fully describe the limiting behaviors of the swarm. For circulants, we have the following results. Consider the unitary Fourier transform matrix

\[
[\mathrm{FT}] \triangleq \frac{1}{\sqrt{N}}
\begin{bmatrix}
w^0 & w^0 & \cdots & w^0 \\
w^0 & w^1 & \cdots & w^{N-1} \\
\vdots & \vdots & & \vdots \\
w^0 & w^{N-1} & \cdots & w^{(N-1)(N-1)}
\end{bmatrix}
= \frac{1}{\sqrt{N}} \left[ w^{(k-1)(l-1)} \right]_{k,l=1,\ldots,N}
\]

where $w = e^{-i\frac{2\pi}{N}}$ is an $N$th root of unity. Then $\mathbf{C}$ is a Toeplitz-circulant matrix if and only if

\[
\mathbf{C}\,[\mathrm{FT}] = [\mathrm{FT}]
\begin{bmatrix} \mu_0 & & & 0 \\ & \mu_1 & & \\ & & \ddots & \\ 0 & & & \mu_{N-1} \end{bmatrix}
\]

where $\mu_0, \mu_1, \ldots, \mu_{N-1}$ are the eigenvalues of $\mathbf{C}$ and, with $[c_0, c_1, \ldots, c_{N-1}]$ the first row of $\mathbf{C}$, are given by

\[
\mu_l = \sum_{k=0}^{N-1} c_k\, e^{-i\frac{2\pi}{N}kl}.
\]

Hence

\[
[\mathrm{FT}]^*\, \mathbf{C}\, [\mathrm{FT}] = \operatorname{Diag}[\mu_0, \mu_1, \ldots, \mu_{N-1}]
\]

and

\[
\mathbf{C} = [\mathrm{FT}]\, \operatorname{Diag}[\mu_0, \mu_1, \ldots, \mu_{N-1}]\, [\mathrm{FT}]^*.
\]

To summarize the remarkable properties of circulants, we can state that (1) they are diagonalized by the discrete Fourier transform, (2) they all commute, (3) their products are circulants, (4) their sums are circulants too, and (5) their inverses/pseudoinverses are circulants, and are readily found [9]. In fact, many of the wonders of modern signal processing algorithms, and linear, time invariant systems theory stem from the above properties.
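The eigenvalue formula can be verified directly. The following Python sketch is an illustration under stated assumptions: the helper names are made up for this example, and the first row $[\tfrac12, \tfrac12, 0, 0, 0]$ (the cyclic pursuit weights used later in Section 4.1, here with $N = 5$) is an arbitrary choice.

```python
import cmath

def circulant(c):
    """Circulant matrix with first row c: entry (k, l) is c[(l - k) mod N]."""
    n = len(c)
    return [[c[(l - k) % n] for l in range(n)] for k in range(n)]

def dft_eigenvalues(c):
    """mu_l = sum_k c_k exp(-2 pi i k l / N), the eigenvalues of Circ[c]."""
    n = len(c)
    return [sum(c[k] * cmath.exp(-2j * cmath.pi * k * l / n) for k in range(n))
            for l in range(n)]

def fourier_column(n, l):
    """Column l of the unitary matrix [FT]: entries w^(k l) / sqrt(N)."""
    return [cmath.exp(-2j * cmath.pi * k * l / n) / n ** 0.5 for k in range(n)]

c = [0.5, 0.5, 0.0, 0.0, 0.0]
C, mu = circulant(c), dft_eigenvalues(c)
for l in range(len(c)):
    v = fourier_column(len(c), l)
    Cv = [sum(C[k][j] * v[j] for j in range(len(c))) for k in range(len(c))]
    # C v_l should equal mu_l v_l up to rounding error.
    assert all(abs(Cv[k] - mu[l] * v[k]) < 1e-12 for k in range(len(c)))
print([round(abs(m), 4) for m in mu])
```

For these weights the printed moduli show one eigenvalue of modulus one and all others strictly smaller, which is the property exploited in the convergence analysis of Section 4.1.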


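The corresponding, and equally remarkable, properties of $\lambda$-circulants are, however, much less known and applied. Suppose we consider the following operation on a circulant $\mathbf{C} = \mathbf{C}_{[c_0, c_1, \ldots, c_{N-1}]}$:

\[
\mathbf{W} = \operatorname{Diag}[a_0, a_1, \ldots, a_{N-1}]\ \mathbf{C}_{[c_0, c_1, \ldots, c_{N-1}]}\ \operatorname{Diag}[b_0, b_1, \ldots, b_{N-1}],
\]

i.e., $\mathbf{W}$ is obtained by pre- and post-multiplying $\mathbf{C}$ by two diagonal matrices. It is easy to see that we have

\[
\mathbf{W} = \mathbf{C}_{[c_0, c_1, \ldots, c_{N-1}]} \odot
\begin{bmatrix}
a_0 b_0 & a_0 b_1 & \cdots & a_0 b_{N-1} \\
a_1 b_0 & a_1 b_1 & \cdots & a_1 b_{N-1} \\
\vdots & \vdots & & \vdots \\
a_{N-1} b_0 & a_{N-1} b_1 & \cdots & a_{N-1} b_{N-1}
\end{bmatrix}
= \mathbf{C} \odot \mathbf{M}
\]

where $\odot$ stands for the Schur-Hadamard multiplication [14] (or a “masking” operation) which multiplies matrices element-wise, and $\mathbf{M} \triangleq [a_k b_l]_{k,l=0,\ldots,N-1}$. Matrices of the type $\mathbf{W}$ inherit interesting diagonalization properties from the original circulant $\mathbf{C}$. The matrix $\mathbf{W}$ is a circulant matrix that is modified by a highly structured masking matrix $\mathbf{M}$, and

\[
\mathbf{W} = \operatorname{Diag}[a_0, \ldots, a_{N-1}]\,[\mathrm{FT}]\,\operatorname{Diag}[\mu_0, \ldots, \mu_{N-1}]\,[\mathrm{FT}]^*\,\operatorname{Diag}[b_0, \ldots, b_{N-1}].
\]

However, since the masking matrix is neither circulant nor Toeplitz, we shall have to consider some special cases for the $\{a_0, a_1, \ldots, a_{N-1}\}$ and $\{b_0, b_1, \ldots, b_{N-1}\}$ sequences. First of all, note that the factorization above will be of the form

\[
\mathbf{W} = \mathbf{U}\, \operatorname{Diag}[\mu_0, \ldots, \mu_{N-1}]\, \mathbf{U}^{-1}
\]

if and only if

\[
\big(\operatorname{Diag}[a_0, a_1, \ldots, a_{N-1}]\,[\mathrm{FT}]\big)^{-1} = [\mathrm{FT}]^*\,\operatorname{Diag}[b_0, b_1, \ldots, b_{N-1}]
\iff
[\mathrm{FT}]^*\,\operatorname{Diag}[a_0^{-1}, a_1^{-1}, \ldots, a_{N-1}^{-1}] = [\mathrm{FT}]^*\,\operatorname{Diag}[b_0, b_1, \ldots, b_{N-1}],
\]

or $b_k = a_k^{-1}$, and $\mathbf{U}$ will further be unitary if also $b_k = a_k^*$, implying that $a_k = e^{j\alpha_k}$ and $b_k = e^{-j\alpha_k} = a_k^*$. In this case the masking matrix multiplying $\mathbf{C}$ will be $[e^{j\alpha_k} e^{-j\alpha_l}] = [e^{j(\alpha_k - \alpha_l)}]_{k,l=0,\ldots,N-1}$.

The most interesting particular cases of $\{a_0, a_1, \ldots, a_{N-1}\}$ and $\{b_0, b_1, \ldots, b_{N-1}\}$ arise when we have $a_k = \gamma^k$ and $b_k = \gamma^{-k}$, $k = 0, 1, \ldots, N-1$, for some real or complex $\gamma \neq 0$. In this case

\[
\mathbf{M} = \begin{bmatrix}
1 & \gamma^{-1} & \gamma^{-2} & \cdots & \gamma^{-(N-1)} \\
\gamma & 1 & \gamma^{-1} & \cdots & \gamma^{-(N-1)+1} \\
\gamma^2 & \gamma & 1 & \cdots & \gamma^{-(N-1)+2} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\gamma^{N-1} & \gamma^{N-2} & \cdots & \cdots & 1
\end{bmatrix}
= \operatorname{Circ}_{[1, \gamma^{-1}, \ldots, \gamma^{-(N-1)}]} \odot
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
\gamma^N & 1 & 1 & \cdots & 1 \\
\gamma^N & \gamma^N & 1 & \cdots & 1 \\
\vdots & & \ddots & \ddots & \vdots \\
\gamma^N & \cdots & \cdots & \gamma^N & 1
\end{bmatrix}
\]

where $\operatorname{Circ}_{[1, \gamma^{-1}, \ldots, \gamma^{-(N-1)}]}$ is given by

\[
\begin{bmatrix}
1 & \gamma^{-1} & \gamma^{-2} & \cdots & \gamma^{-(N-1)} \\
\gamma^{-(N-1)} & 1 & \gamma^{-1} & \cdots & \gamma^{-(N-1)+1} \\
\gamma^{-(N-1)+1} & \gamma^{-(N-1)} & 1 & \cdots & \vdots \\
\vdots & & \ddots & \ddots & \gamma^{-1} \\
\gamma^{-1} & \gamma^{-2} & \cdots & \gamma^{-(N-1)} & 1
\end{bmatrix}.
\]

Hence the matrix $\mathbf{W} = \mathbf{C} \odot \mathbf{M}$ becomes

\[
\mathbf{W} = \mathbf{C}_{[c_0, \ldots, c_{N-1}]} \odot \operatorname{Circ}_{[1, \gamma^{-1}, \ldots, \gamma^{-(N-1)}]} \odot
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
\gamma^N & 1 & 1 & \cdots & 1 \\
\gamma^N & \gamma^N & 1 & \cdots & 1 \\
\vdots & & \ddots & \ddots & \vdots \\
\gamma^N & \cdots & \cdots & \gamma^N & 1
\end{bmatrix}
\]

which clearly is a $\lambda (= \gamma^N)$-circulant matrix.

To summarize, we have the following result: a $\lambda$-circulant matrix $\mathbf{W}$, denoted by

\[
\mathbf{W} = \begin{bmatrix}
m_0 & m_1 & m_2 & \cdots & m_{N-1} \\
\lambda m_{N-1} & m_0 & m_1 & \cdots & m_{N-2} \\
\lambda m_{N-2} & \lambda m_{N-1} & m_0 & \cdots & \vdots \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
\lambda m_1 & \lambda m_2 & \cdots & \lambda m_{N-1} & m_0
\end{bmatrix}
\tag{7}
\]

can be rewritten as

\[
\mathbf{W} = \operatorname{Circ}_{[m_0,\, m_1\gamma,\, m_2\gamma^2,\, \ldots,\, m_{N-1}\gamma^{N-1}]} \odot \operatorname{Circ}_{[1, \gamma^{-1}, \ldots, \gamma^{-(N-1)}]} \odot \Lambda
\]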

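with

\[
\Lambda = \begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
\lambda & 1 & 1 & \cdots & 1 \\
\lambda & \lambda & 1 & \cdots & 1 \\
\vdots & & \ddots & \ddots & \vdots \\
\lambda & \lambda & \cdots & \lambda & 1
\end{bmatrix}
\quad \text{and} \quad \gamma^N = \lambda,
\]

and hence can be factorized as

\[
\mathbf{W} = \operatorname{Diag}[1, \gamma, \gamma^2, \ldots, \gamma^{N-1}]\ [\mathrm{FT}]\ \operatorname{Diag}[\mu_0, \mu_1, \ldots, \mu_{N-1}]\ [\mathrm{FT}]^*\ \operatorname{Diag}[1, \gamma^{-1}, \gamma^{-2}, \ldots, \gamma^{-(N-1)}]
\]

where $[\mu_0, \mu_1, \ldots, \mu_{N-1}]$ are the eigenvalues of $\operatorname{Circ}_{[m_0,\, m_1\gamma,\, \ldots,\, m_{N-1}\gamma^{N-1}]} \triangleq \operatorname{Circ}_{[c_0, c_1, \ldots, c_{N-1}]}$ given by

\[
\mu_l = \sum_{k=0}^{N-1} m_k \cdot \gamma^k \cdot e^{-i\frac{2\pi}{N}kl} \qquad (\gamma \triangleq \lambda^{\frac{1}{N}}).
\]

Therefore $\mathbf{W}$ is readily diagonalized as follows

\[
[\mathrm{FT}]^*\ \operatorname{Diag}[1, \gamma^{-1}, \gamma^{-2}, \ldots, \gamma^{-(N-1)}]\ \mathbf{W}\ \operatorname{Diag}[1, \gamma, \gamma^2, \ldots, \gamma^{N-1}]\ [\mathrm{FT}]
= \operatorname{Diag}[\mu_0, \ldots, \mu_{N-1}]
= \mathbf{T}^{-1}\mathbf{W}\mathbf{T},
\]

the matrices $\mathbf{T}$ and $\mathbf{T}^{-1}$ being

\[
\mathbf{T} = \operatorname{Diag}[1, \gamma, \ldots, \gamma^{N-1}]\ [\mathrm{FT}]
\quad \text{and} \quad
\mathbf{T}^{-1} = [\mathrm{FT}]^*\ \operatorname{Diag}[1, \gamma^{-1}, \ldots, \gamma^{-(N-1)}].
\]

Note that $\mathbf{T}$ is not, in general, a unitary transformation. In all developments above, we assumed $\gamma$ to be arbitrary. For any $\gamma \neq 0$, real or complex, $\mathbf{T}$ will be an invertible matrix, as seen before. If, however, $\gamma$ has unit modulus, i.e., $\gamma = e^{j\varphi}$, then clearly $\gamma^* = e^{-j\varphi} = \gamma^{-1}$ and the matrix $\mathbf{T}$ becomes a unitary transformation, obeying $\mathbf{T}\mathbf{T}^* = \mathbf{T}^*\mathbf{T} = I$. In this case the matrix $\mathbf{W}$ will be $\lambda$-factor circulant with $\lambda = e^{j\varphi N}$.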
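The factorization can be checked numerically by verifying that each column of $\mathbf{T}$ is an eigenvector of $\mathbf{W}$ with eigenvalue $\mu_l$. The Python sketch below is an illustration only: the helper names, the first row $[\tfrac12, \tfrac12, 0, 0]$ and the value $\lambda = 0.5i$ are assumptions, and the principal $N$th root is used for $\gamma$ (any $N$th root of $\lambda$ would serve).

```python
import cmath

def lambda_circulant(m, lam):
    """lambda-circulant matrix (7) with first row m."""
    n = len(m)
    return [[(lam if l < k else 1) * m[(l - k) % n] for l in range(n)]
            for k in range(n)]

def modes(m, lam):
    """Eigenvalues mu_l and the columns of T = Diag(gamma^k)[FT] (unnormalised)."""
    n = len(m)
    gamma = complex(lam) ** (1.0 / n)      # principal N-th root of lambda
    w = cmath.exp(-2j * cmath.pi / n)
    mu = [sum(m[k] * gamma ** k * w ** (k * l) for k in range(n)) for l in range(n)]
    vecs = [[gamma ** k * w ** (k * l) for k in range(n)] for l in range(n)]
    return mu, vecs

m, lam = [0.5, 0.5, 0.0, 0.0], 0.5j
W = lambda_circulant(m, lam)
mu, vecs = modes(m, lam)
# Largest entry of W v_l - mu_l v_l over all modes l and rows k.
residual = max(
    abs(sum(W[k][j] * v[j] for j in range(len(m))) - mu_l * v[k])
    for mu_l, v in zip(mu, vecs) for k in range(len(m))
)
print(residual)   # should be at rounding-error level
```

Scaling the eigenvectors by $1/\sqrt{N}$ does not affect the check, so the normalisation of $[\mathrm{FT}]$ is omitted in `vecs`.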


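Remark. Note that (7) can be alternatively written¹ as

\[
\mathbf{W} = m_0 I + m_1 \Lambda + \cdots + m_{N-1} \Lambda^{N-1}
\]

with

\[
\Lambda = \begin{bmatrix}
0 & 1 & & & \\
 & 0 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & 1 \\
\lambda & & & & 0
\end{bmatrix}
= \sum_{j=1}^{N-1} \mathbf{e}_j \mathbf{e}_{j+1}^T + \lambda\, \mathbf{e}_N \mathbf{e}_1^T
\]

where $\mathbf{e}_i$, $i = 1, \ldots, N$ are column vectors that form the canonical basis of $\mathbb{R}^N$. Thus

\[
\mathbf{W} = f(\Lambda) \triangleq \sum_{j=0}^{N-1} m_j \Lambda^j,
\]

and by the spectral mapping theorem, the spectrum $\sigma(\mathbf{W}) = f(\sigma(\Lambda))$, where $\sigma(\Lambda)$ is found by computing $\det(\mu I_N - \Lambda) = \mu^N - \lambda$. It is also easily seen that $\mathbf{W}$ is normal $\iff$ $\Lambda$ is normal $\iff$ $|\lambda| = 1$, telling in particular that $\mathbf{W}$ has a full set of orthonormal eigenvectors, as already seen above.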
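The polynomial representation in the Remark lends itself to a direct check with exact integer arithmetic. In the Python sketch below, the helper names and the values $m = [1, 2, 3]$, $\lambda = 10$ are assumptions made for illustration; the code also returns $\Lambda^N$, which should equal $\lambda I$ in line with $\det(\mu I_N - \Lambda) = \mu^N - \lambda$.

```python
def shift_matrix(n, lam):
    """Ones on the superdiagonal and lam in the bottom-left corner."""
    L = [[0] * n for _ in range(n)]
    for j in range(n - 1):
        L[j][j + 1] = 1
    L[n - 1][0] = lam
    return L

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def polynomial_in_shift(m, lam):
    """Return (sum_j m_j Lambda^j, Lambda^N)."""
    n = len(m)
    L = shift_matrix(n, lam)
    power = [[int(i == j) for j in range(n)] for i in range(n)]   # Lambda^0 = I
    W = [[0] * n for _ in range(n)]
    for coeff in m:
        W = [[W[i][j] + coeff * power[i][j] for j in range(n)] for i in range(n)]
        power = matmul(power, L)
    return W, power

W, L_to_N = polynomial_in_shift([1, 2, 3], 10)
for row in W:
    print(row)
```

The resulting matrix has the wrapped entries multiplied by $\lambda$, i.e., the pattern of (7) for $N = 3$.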

¹ We would like to thank one of our anonymous reviewers who suggested this.

4. Dynamics of a cyclically interacting swarm

Returning to the problem of analyzing the dynamics and the long-term behavior of a swarm of robots $\mathcal{P}_0, \mathcal{P}_1, \ldots, \mathcal{P}_{N-1}$ interacting according to

\[
\left\{ \mathbf{P}(t+1) \ \text{ or } \ \frac{d}{dt}\mathbf{P}(t) \right\}
= \begin{bmatrix}
m_0 & m_1 & m_2 & \cdots & m_{N-1} \\
\lambda m_{N-1} & m_0 & m_1 & \cdots & m_{N-2} \\
\lambda m_{N-2} & \lambda m_{N-1} & m_0 & \cdots & \vdots \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
\lambda m_1 & \lambda m_2 & \cdots & \lambda m_{N-1} & m_0
\end{bmatrix} \mathbf{P}(t) = \Phi\,\mathbf{P}(t),
\]

we have that the interaction matrix $\Phi$ is $\lambda$-circulant, hence it is diagonalizable as follows:

\[
\Phi = \operatorname{Diag}[1, \gamma, \gamma^2, \ldots, \gamma^{N-1}]\ [\mathrm{FT}]\ \operatorname{Diag}[\mu_0, \ldots, \mu_{N-1}]\ [\mathrm{FT}]^*\ \operatorname{Diag}[1, \gamma^{-1}, \gamma^{-2}, \ldots, \gamma^{-(N-1)}]
\]

where $\gamma = \lambda^{\frac{1}{N}}$ and

\[
\mu_l = \sum_{k=0}^{N-1} m_k\, \lambda^{\frac{k}{N}}\, e^{-i\frac{2\pi}{N}kl}.
\]

Therefore, defining

\[
\tilde{\mathbf{P}}(t) \triangleq [\mathrm{FT}]^*\ \operatorname{Diag}\big[1, \lambda^{-\frac{1}{N}}, \ldots, \lambda^{-\frac{N-1}{N}}\big]\ \mathbf{P}(t)
\]

we have decoupled dynamics for the transformed location vector, given by

\[
\left\{ \frac{d}{dt}\tilde{\mathbf{P}}(t) \ \text{ or } \ \tilde{\mathbf{P}}(t+1) \right\}
= \operatorname{Diag}[\mu_0, \mu_1, \ldots, \mu_{N-1}]\ \tilde{\mathbf{P}}(t)
\]

and the evolution of the swarm is controlled by the eigenvalues $\mu_0, \mu_1, \ldots, \mu_{N-1}$. Let us concentrate next on some specific cases of $m = [m_0, \ldots, m_{N-1}]$ and $\lambda$. A “$\lambda$-cyclic” interaction involves agents that react to the agents that follow them differently from the way they react to the agents that precede them in the ordering $\mathcal{P}_0, \ldots, \mathcal{P}_{N-1}$.

4.1. Darboux’s polygon evolution and extensions

As a first example, suppose that we have a generalization of Darboux’s polygon evolution process [5], which is also a nice model for cyclic pursuit:

\[
\mathbf{P}(t+1) = \begin{bmatrix}
\tfrac{1}{2} & \tfrac{1}{2} & 0 & \cdots & 0 \\
0 & \tfrac{1}{2} & \tfrac{1}{2} & \ddots & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \tfrac{1}{2} & \tfrac{1}{2} \\
\lambda\tfrac{1}{2} & 0 & \cdots & 0 & \tfrac{1}{2}
\end{bmatrix} \mathbf{P}(t).
\]

In this case, we have a $\lambda$-factor circulant with

\[
\mu_l = \frac{1}{2} + \frac{1}{2}\lambda^{\frac{1}{N}} e^{-i\frac{2\pi}{N}l} = \frac{1}{2}\big(1 + \lambda^{\frac{1}{N}} e^{-i\frac{2\pi}{N}l}\big), \qquad l = 0, 1, \ldots, N-1.
\]

Here, the evolution of the polygon vertices (or the agents in cyclic pursuit) is described by

\[
\tilde{\mathbf{P}}(t) = \operatorname{Diag}[\mu_0^t, \mu_1^t, \ldots, \mu_{N-1}^t]\ \tilde{\mathbf{P}}(0)
\]


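where we defined

\[
\tilde{\mathbf{P}}(t) = [\mathrm{FT}]^*\ \operatorname{Diag}\big[1, \lambda^{-1/N}, \ldots, \lambda^{-(N-1)/N}\big]\ \mathbf{P}(t).
\]

From this we have

\[
\mathbf{P}(t) = \operatorname{Diag}\big[1, \lambda^{1/N}, \ldots, \lambda^{\frac{N-1}{N}}\big]\ [\mathrm{FT}]\ \tilde{\mathbf{P}}(t)
= \operatorname{Diag}\big[1, \lambda^{1/N}, \ldots, \lambda^{\frac{N-1}{N}}\big]\ [\mathrm{FT}]\ \operatorname{Diag}[\mu_0^t, \mu_1^t, \ldots, \mu_{N-1}^t]\ \tilde{\mathbf{P}}(0).
\]

The evolution of the polygon vertices (the swarm of robots) when we let the time grow thus asymptotically depends on the dominant eigenvalues among $\mu_0, \ldots, \mu_{N-1}$. If $\lambda = 1$ (which means a circulant cyclic pursuit), we have

\[
\mu_l = \frac{1}{2}\big(1 + e^{-i\frac{2\pi}{N}l}\big), \qquad l = 0, 1, \ldots, N-1,
\]

and $\mu_0 = 1$. Then

\[
\lim_{t\to\infty} \mathbf{P}(t) = \lim_{t\to\infty} [\mathrm{FT}]\ \operatorname{Diag}[\mu_0^t, \mu_1^t, \ldots, \mu_{N-1}^t]\ \tilde{\mathbf{P}}(0)
= \lim_{t\to\infty} [\mathrm{FT}]\ \operatorname{Diag}[1, \mu_1^t, \ldots, \mu_{N-1}^t]\ [\mathrm{FT}]^*\ \mathbf{P}(0).
\]

Since the dominant eigenvalue is $\mu_0 = 1$ and all others have modulus less than one, we have that the limiting behavior is

\[
\lim_{t\to\infty} \mathbf{P}(t) = \frac{1}{N} \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} [1, 1, \ldots, 1]\ \mathbf{P}(0).
\]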


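Hence the point constellation converges to the centroid of the initial locations. The way this convergence occurs is controlled by the next dominant eigenvalues, which are in this case

\[
\mu_1 = \frac{1}{2}\big(1 + e^{-i\frac{2\pi}{N}}\big), \qquad
\mu_{N-1} = \frac{1}{2}\big(1 + e^{-i\frac{2\pi(N-1)}{N}}\big).
\]

Indeed, writing

\[
\mathbf{P}_N(t) = \mathbf{P}(t) - \frac{1}{N} \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} [1, 1, \ldots, 1]\ \mathbf{P}(0),
\]

we have

\[
\mathbf{P}_N(t) = [\mathrm{FT}]\ \operatorname{Diag}[0, \mu_1^t, \ldots, \mu_{N-1}^t]\ [\mathrm{FT}]^*\ \mathbf{P}(0)
\]

and, disregarding the faster decaying terms $\mu_i^t$, $i = 2, \ldots, N-2$, we further get, for large $t$,

\[
\mathbf{P}_N(t) \approx \frac{1}{N} \begin{bmatrix} 1 \\ w \\ \vdots \\ w^{N-1} \end{bmatrix} [1, \bar{w}, \ldots, \bar{w}^{N-1}]\ \mathbf{P}(0)\ \mu_1^t
+ \frac{1}{N} \begin{bmatrix} 1 \\ w^{N-1} \\ \vdots \\ w^{(N-1)(N-1)} \end{bmatrix} [1, \bar{w}^{N-1}, \ldots, \bar{w}^{(N-1)(N-1)}]\ \mathbf{P}(0)\ \mu_{N-1}^t,
\]

where the row vectors are the corresponding rows of $[\mathrm{FT}]^*$ (up to the factor $1/\sqrt{N}$). Hence

\[
\mathbf{P}_N(t) \approx \frac{1}{N} \begin{bmatrix} 1 \\ w \\ \vdots \\ w^{N-1} \end{bmatrix} A\,\mu_1^t
+ \frac{1}{N} \begin{bmatrix} 1 \\ w^{N-1} \\ \vdots \\ w^{(N-1)(N-1)} \end{bmatrix} B\,\mu_{N-1}^t
\]

where $A\mu_1^t$ and $B\mu_{N-1}^t$ are some complex numbers, with $A$ and $B$ determined by $\mathbf{P}(0)$, and $\mathbf{P}_N(t)$ will be, in the limit $t \to \infty$, an affine transformation of a regular polygon, i.e., a discrete ellipse (see Figure 1).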
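The convergence to the centroid, and the rate set by $|\mu_1| = |\mu_{N-1}| = \cos(\pi/N)$, can be observed in a short simulation. The Python sketch below is illustrative: the function names and the seven initial positions are assumptions chosen for this example, and the deviation from the centroid is measured in the Euclidean norm.

```python
import math

def pursuit_step(P, lam=1):
    """One step of P_k <- (P_k + P_{k+1}) / 2; the wrapped neighbour of the
    last agent is P_0 scaled by lam (lam = 1 gives the circulant case)."""
    n = len(P)
    return [0.5 * (P[k] + (P[k + 1] if k < n - 1 else lam * P[0])) for k in range(n)]

def spread(P, c):
    """Euclidean norm of the deviation of the constellation from the point c."""
    return math.sqrt(sum(abs(p - c) ** 2 for p in P))

P = [0.9 + 0.1j, 0.2 + 0.8j, 0.5 + 0.3j, 0.1 + 0.1j, 0.7 + 0.9j, 0.4 + 0.6j, 0.8 + 0.5j]
centroid = sum(P) / len(P)
history = [P]
for _ in range(151):
    history.append(pursuit_step(history[-1]))
ratio = spread(history[151], centroid) / spread(history[150], centroid)
print(abs(sum(history[-1]) / len(P) - centroid))   # centroid drift (rounding only)
print(ratio, math.cos(math.pi / len(P)))           # decay per step vs. cos(pi/N)
```

Because the two slowest modes have equal modulus and orthogonal eigenvectors, the per-step decay of the Euclidean deviation settles at $\cos(\pi/N)$ once the faster modes have died out.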

Figure 1. The cyclic pursuit case ($\lambda = 1$) with a random initial polygon with $N = 7$ points. The first figure presents the initial configuration (the outside one) and the first iteration (inside), the second shows the entire evolution for 100 iterations, and the last figure displays the scaled-up configuration for the last few iterations.

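For the general case where $\lambda$ is some real or complex number, we have that

\[
\lim_{t\to\infty} \mathbf{P}(t)
= \lim_{t\to\infty} \operatorname{Diag}\big[1, \lambda^{1/N}, \ldots, \lambda^{\frac{N-1}{N}}\big]\ [\mathrm{FT}]\ \operatorname{Diag}[\mu_0^t, \mu_1^t, \ldots, \mu_{N-1}^t]\ \tilde{\mathbf{P}}(0)
= \lim_{t\to\infty} \operatorname{Diag}\big[1, \lambda^{1/N}, \ldots, \lambda^{\frac{N-1}{N}}\big]\ [\mathrm{FT}]\ \operatorname{Diag}[\mu_0^t, 0, \ldots, 0]\ \tilde{\mathbf{P}}(0),
\]

where $\mu_0 = \frac{1}{2}(1 + \lambda^{1/N})$ is the dominant eigenvalue. Since

\[
\tilde{\mathbf{P}}(0) = [\mathrm{FT}]^*\ \operatorname{Diag}\big[1, \lambda^{-1/N}, \ldots, \lambda^{\frac{-(N-1)}{N}}\big]\ \mathbf{P}(0),
\]

we then have that

\[
\lim_{t\to\infty} \mathbf{P}(t) = \mu_0^t\ \operatorname{Diag}\big[1, \lambda^{1/N}, \ldots, \lambda^{\frac{N-1}{N}}\big]\ [\mathrm{FT}]_{l,1}\ \big([\mathrm{FT}]_{l,1}\big)^*\ \operatorname{Diag}\big[1, \lambda^{-1/N}, \ldots, \lambda^{\frac{-(N-1)}{N}}\big]\ \mathbf{P}(0),
\]

where $[\mathrm{FT}]_{l,1}$ denotes the first column of $[\mathrm{FT}]$, and since the first column of the Fourier transform is a vector of all ones (scaled by $1/\sqrt{N}$), this further simplifies to

\[
\lim_{t\to\infty} \mathbf{P}(t) = \mu_0^t\ \frac{1}{N} \begin{bmatrix} 1 \\ \lambda^{1/N} \\ \vdots \\ \lambda^{\frac{N-1}{N}} \end{bmatrix} \big[1, \lambda^{-1/N}, \ldots, \lambda^{\frac{-(N-1)}{N}}\big]\ \mathbf{P}(0).
\]

Therefore, we see that the limiting behavior is dominated by

\[
\lim_{t\to\infty} \mathbf{P}(t) = \underbrace{\Big[\frac{1}{2}\big(1 + \lambda^{1/N}\big)\Big]^t\ \frac{1}{N}\ \big[1, \lambda^{-1/N}, \ldots, \lambda^{\frac{-(N-1)}{N}}\big]\ \mathbf{P}(0)}_{\text{a (complex) scalar}}\ \begin{bmatrix} 1 \\ \lambda^{1/N} \\ \vdots \\ \lambda^{\frac{N-1}{N}} \end{bmatrix}.
\]

We can distinguish different behaviors depending on $\lambda$.

1. If $\lambda$ is real and $|\lambda| < 1$, $\mathbf{P}(t)$ tends to zero, but the limit behavior will be a linear constellation of points

\[
(\alpha_t)_x \begin{bmatrix} 1 \\ \lambda^{1/N} \\ \vdots \\ \lambda^{\frac{N-1}{N}} \end{bmatrix} + i\,(\alpha_t)_y \begin{bmatrix} 1 \\ \lambda^{1/N} \\ \vdots \\ \lambda^{\frac{N-1}{N}} \end{bmatrix},
\]

where $(\alpha_t)_x$ and $(\alpha_t)_y$ are the real and imaginary parts of the complex scalar above. If $|\lambda| > 1$, the constellation of agent locations will diverge in a similar formation.

2. If $\lambda$ is a complex number $\rho(\lambda)\, e^{i\varphi(\lambda)}$, the convergence/divergence will depend on the angle of rotation induced by $\varphi(\lambda)$ and on the magnitude $\rho(\lambda)$.

As seen in the examples provided in Figures 2, 3, 4, 5, 6, 7, in the limit, agents are marching in elliptic or circular arcs, spiralling towards their point of convergence (and in case of divergence, spiralling out to infinity). As in Figure 1, the left figure presents the initial configuration (in red) and the first iteration (in blue), the second shows the entire evolution for 100 iterations (unless stated otherwise), and the last figure displays the scaled-up configuration for the last few iterations.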
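Whether the constellation shrinks or grows in this regime is decided by $|\mu_0|$. The Python sketch below tabulates $|\mu_0| = |1 + \lambda^{1/N}|/2$ for a few sample values; the function name, the choice $N = 7$, the sample values of $\lambda$, and the use of the principal $N$th root are assumptions made for illustration.

```python
def dominant_mode(lam, n):
    """mu_0 = (1 + lam^(1/N)) / 2 for the Darboux-type evolution,
    computed with the principal N-th root of lam."""
    return 0.5 * (1 + complex(lam) ** (1.0 / n))

for lam in (0.1, 1, 2, 1j):
    mu0 = dominant_mode(lam, 7)
    # |mu_0| < 1: the dominant mode decays; |mu_0| > 1: it grows without bound.
    trend = "decays" if abs(mu0) < 1 else "does not decay"
    print(lam, abs(mu0), trend)
```

A non-real $\mu_0$ additionally rotates the dominant mode at every step, which is one way to read the spiralling described above.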

Figure 2. 𝜆 = 0.1

Figure 3. 𝜆 = 1

Figure 4. 𝜆 = 𝑖

Figure 5. 𝜆 = 𝑖

Figure 6. 𝜆 = exp(𝑖 […]/4)/2

Figure 7. 𝜆 = exp(𝑖 […]/4 + 𝑖)/2

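4.2. Centroid gathering evolution and extensions

As a second example, suppose that agent $\mathcal{P}_k$ is moving according to the following linear combination of its own position, the positions of agents higher in the hierarchy, i.e., $\{\mathcal{P}_{k+1}, \ldots, \mathcal{P}_{N-1}\}$, and the positions of those lower than itself $\{\mathcal{P}_0, \mathcal{P}_1, \ldots, \mathcal{P}_{k-1}\}$:

\[
\mathcal{P}_k(t+1) = \alpha\,\mathcal{P}_k(t) + \beta_F \sum_{l=k+1}^{N-1} \mathcal{P}_l(t) + \beta_B \sum_{l=0}^{k-1} \mathcal{P}_l(t)
\]

or

\[
\mathbf{P}(t+1) = \begin{bmatrix}
\alpha & \beta_F & \beta_F & \cdots & \beta_F \\
\beta_B & \alpha & \beta_F & \cdots & \beta_F \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
\vdots & & \ddots & \ddots & \beta_F \\
\beta_B & \cdots & \cdots & \beta_B & \alpha
\end{bmatrix} \mathbf{P}(t).
\]

Note that if $\beta_F = \beta_B = (1-\alpha)/(N-1)$, we will have

\[
\mathcal{P}_k(t+1) = \alpha\,\mathcal{P}_k(t) + \frac{1-\alpha}{N-1} \sum_{l=0,\, l \neq k}^{N-1} \mathcal{P}_l(t)
= \frac{N\alpha - 1}{N-1}\,\mathcal{P}_k(t) + \Big(1 - \frac{N\alpha - 1}{N-1}\Big)\,\mathcal{P}_{centroid},
\]

hence all agents move towards the time-invariant centroid on straight lines. For general $\beta_F$ and $\beta_B$, the above matrix is $\beta_B/\beta_F$-factor circulant and is diagonalized by

\[
\tilde{\mathbf{P}}(t) = [\mathrm{FT}]^*\ \operatorname{Diag}\big[1, (\beta_B/\beta_F)^{-\frac{1}{N}}, \ldots, (\beta_B/\beta_F)^{-\frac{N-1}{N}}\big]\ \mathbf{P}(t),
\]

the modes or eigenvalues being given by

\[
\mu_l = \alpha + \sum_{k=1}^{N-1} \beta_F \Big(\frac{\beta_B}{\beta_F}\Big)^{\frac{k}{N}} e^{-i\frac{2\pi}{N}kl}, \qquad l = 0, \ldots, N-1.
\]

Let us consider first the case of perfectly cyclic interaction, i.e., when $\beta_B = \beta_F$. In this case, the interaction matrix is circulant, and we have

\[
\mu_l = \alpha + \sum_{k=1}^{N-1} \beta_F\, e^{-i\frac{2\pi}{N}kl}, \qquad l = 0, \ldots, N-1
\]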


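and

\[
\mu_0 = \alpha + (N-1)\beta_F, \qquad
\mu_l = \alpha - \beta_F + \sum_{k=0}^{N-1} \beta_F\, e^{-i\frac{2\pi}{N}kl} = \alpha - \beta_F, \quad l = 1, \ldots, N-1.
\]

For normalization, we shall take $\beta_F = (1-\alpha)/(N-1)$ and then

\[
\mu_0 = 1, \qquad \mu_l = (N\alpha - 1)/(N-1) \quad \text{for all } l = 1, \ldots, N-1.
\]

We now have that $\tilde{\mathbf{P}}(t) = [\mathrm{FT}]^*\mathbf{P}(t)$ evolves according to

\[
\lim_{t\to\infty} \tilde{\mathbf{P}}(t) = \lim_{t\to\infty} \operatorname{Diag}\Big[1, \Big(\frac{N\alpha - 1}{N-1}\Big)^t, \ldots, \Big(\frac{N\alpha - 1}{N-1}\Big)^t\Big]\ \tilde{\mathbf{P}}(0)
= \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} [1, 0, \ldots, 0]\ \tilde{\mathbf{P}}(0),
\]

provided $|(N\alpha - 1)/(N-1)| < 1$. Hence

\[
\lim_{t\to\infty} \mathbf{P}(t) = [\mathrm{FT}] \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} [1, 0, \ldots, 0]\ [\mathrm{FT}]^*\ \mathbf{P}(0)
= \frac{1}{N} \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} [1, 1, \ldots, 1]\ \mathbf{P}(0),
\]

i.e., as we have already seen, all points converge towards the centroid of the initial constellation. The convergence will be as follows:

\[
\tilde{\mathbf{P}}_N(t) \triangleq \tilde{\mathbf{P}}(t) - \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} [1, 0, \ldots, 0]\ \tilde{\mathbf{P}}(0)
= \Big(\frac{N\alpha - 1}{N-1}\Big)^t\ \operatorname{Diag}[0, 1, \ldots, 1]\ \tilde{\mathbf{P}}(0).
\]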
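The straight-line gathering in the perfectly cyclic case can be confirmed with a few iterations. In the Python sketch below, the function name, the four initial positions and the value $\alpha = 0.1$ are illustrative assumptions; the simulated positions are compared with the prediction that each deviation from the centroid is multiplied by $(N\alpha - 1)/(N-1)$ per step.

```python
def gather_step(P, alpha):
    """One step of P_k <- alpha P_k + beta * (sum of the other agents),
    with beta = (1 - alpha) / (N - 1), i.e. beta_F = beta_B."""
    n = len(P)
    beta = (1 - alpha) / (n - 1)
    total = sum(P)
    return [alpha * p + beta * (total - p) for p in P]

P0 = [0 + 0j, 4 + 0j, 4 + 2j, 0 + 2j]
alpha, steps = 0.1, 5
centroid = sum(P0) / len(P0)
rate = (len(P0) * alpha - 1) / (len(P0) - 1)     # (N alpha - 1) / (N - 1)
P = P0
for _ in range(steps):
    P = gather_step(P, alpha)
# Each agent should sit on the line through its start point and the centroid.
predicted = [centroid + rate ** steps * (p - centroid) for p in P0]
print(max(abs(a - b) for a, b in zip(P, predicted)))
```

A negative rate, as in this example, means each agent overshoots the centroid and alternates sides along its line while the distance shrinks.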

Figure 8. 𝜆 = 1, […] = 0.1, 100 iterations

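Therefore, in the original coordinates,

\[
\mathbf{P}_N(t) = \Big(\frac{N\alpha - 1}{N-1}\Big)^t\ [\mathrm{FT}] \left( I - \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} [1, 0, \ldots, 0] \right) [\mathrm{FT}]^*\ \mathbf{P}(0)
= \Big(\frac{N\alpha - 1}{N-1}\Big)^t \left( \mathbf{P}(0) - \frac{1}{N} \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} [1, 1, \ldots, 1]\ \mathbf{P}(0) \right).
\]

Consequently, all agents will gather towards the centroid by moving on a line from $\mathcal{P}_k(0)$ to $(1/N)\sum_{i=0}^{N-1} \mathcal{P}_i(0)$ (see Figure 8).

Next suppose we have $\beta_B \neq \beta_F$. Then we have a $\lambda = \beta_B/\beta_F$ factor circulant and the modes of the $\tilde{\mathbf{P}}(t)$ evolution are controlled by

\[
\mu_l = \alpha + \sum_{k=1}^{N-1} \beta_F \Big(\frac{\beta_B}{\beta_F}\Big)^{k/N} e^{-i\frac{2\pi}{N}kl}, \qquad l = 0, \ldots, N-1.
\]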


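Here, with $\beta_F = (1-\alpha)/(N-1)$ as before,

\[
\mu_0 = \alpha - \beta_F + \beta_F \sum_{k=0}^{N-1} \Big(\frac{\beta_B}{\beta_F}\Big)^{k/N}
= \alpha - \beta_F + \beta_F\, \frac{\beta_B/\beta_F - 1}{(\beta_B/\beta_F)^{1/N} - 1}
= \alpha - \frac{1-\alpha}{N-1} + \Big(\frac{1-\alpha}{N-1}\Big)\Big(\frac{\lambda - 1}{\lambda^{1/N} - 1}\Big).
\]

Similarly we have that

\[
\mu_l = \alpha + \sum_{k=1}^{N-1} \beta_F \Big(\frac{\beta_B}{\beta_F}\Big)^{k/N} e^{-i\frac{2\pi}{N}kl}
= \alpha - \frac{1-\alpha}{N-1} + \Big(\frac{1-\alpha}{N-1}\Big)\Big(\frac{\lambda\, e^{-i 2\pi l} - 1}{\lambda^{1/N} e^{-i 2\pi l/N} - 1}\Big),
\]

where $e^{-i 2\pi l} = 1$, so that the numerator equals $\lambda - 1$ for every $l$. In this example too, as before, we have

\[
\lim_{t\to\infty} \mathbf{P}(t) = \lim_{t\to\infty} \operatorname{Diag}\big[1, \lambda^{1/N}, \ldots, \lambda^{\frac{N-1}{N}}\big]\ [\mathrm{FT}]\ \operatorname{Diag}[\mu_0^t, \mu_1^t, \ldots, \mu_{N-1}^t]\ [\mathrm{FT}]^*\ \operatorname{Diag}\big[1, \lambda^{-1/N}, \ldots, \lambda^{\frac{-(N-1)}{N}}\big]\ \mathbf{P}(0)
\]

and if $\mu_0$ is the dominant eigenvalue, we shall have

\[
\lim_{t\to\infty} \mathbf{P}(t) = \lim_{t\to\infty} \mu_0^t\ \frac{1}{N} \begin{bmatrix} 1 \\ \lambda^{1/N} \\ \vdots \\ \lambda^{(N-1)/N} \end{bmatrix} \big[1, \lambda^{-1/N}, \ldots, \lambda^{-(N-1)/N}\big]\ \mathbf{P}(0).
\]

Depending on the values selected for $\lambda$, we can get a wealth of interesting behaviors while the solutions converge or diverge to infinity, displaying spiralling or in-line marching. See Figures 9, 10, 11, 12, 13, 14 where we present a few interesting cases.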
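The closed-form expression for the modes can be cross-checked against the defining sum. The Python sketch below assumes $\beta_B \neq \beta_F$ (so that the geometric-series denominator is nonzero), uses the principal $N$th root of $\lambda$, and picks $N = 7$, $\alpha = 0.1$, $\beta_B = \beta_F/2$ purely for illustration; the function name is likewise hypothetical.

```python
import cmath

def gathering_modes(alpha, beta_f, beta_b, n):
    """Modes mu_l computed twice: from the defining sum and from the
    geometric-series closed form (requires beta_b != beta_f)."""
    lam = complex(beta_b) / beta_f
    gamma = lam ** (1.0 / n)               # principal N-th root of lambda
    w = cmath.exp(-2j * cmath.pi / n)
    direct = [alpha + beta_f * sum((gamma * w ** l) ** k for k in range(1, n))
              for l in range(n)]
    closed = [alpha - beta_f + beta_f * (lam - 1) / (gamma * w ** l - 1)
              for l in range(n)]
    return direct, closed

n, alpha = 7, 0.1
beta_f = (1 - alpha) / (n - 1)
direct, closed = gathering_modes(alpha, beta_f, 0.5 * beta_f, n)
print(max(abs(d - c) for d, c in zip(direct, closed)))   # agreement of both forms
print([round(abs(d), 4) for d in direct])                # moduli of the modes
```

Inspecting the printed moduli shows which mode dominates for a given parameter choice, which is the hypothesis under which the limiting formula above applies.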

Figure 9. 𝜆 = 0.5, […] = 0.1, 1000 iterations

Figure 10. 𝜆 = 1, […] = 0.1, 100 iterations

Figure 11. 𝜆 = 𝑖, […] = 0.1, 100 iterations

Figure 12. 𝜆 = 𝑖, […] = 0.1, 100 iterations

Figure 13. 𝜆 = exp(𝑖 […]/4)/2, […] = 0.1, 100 iterations

Figure 14. 𝜆 = exp(𝑖 […]/4 + 𝑖)/2, […] = 0.1, 100 iterations

5. Concluding remarks

We discussed in this paper a special type of cyclic multiagent interaction modeled by $\lambda$-factor circulant matrices. Such matrices allow explicit closed form diagonalizations via generalized Fourier transforms, and hence enable the analysis of the evolution of the swarm via a nice, geometric, modal decomposition process. It is expected that a wealth of further similar, structured and nearly cyclic interactions will also yield explicit closed form solutions for their asymptotic behavior. In fact, we may use evolutions that fix one, two [17] or several agents in the swarm and use circulant or $\lambda$-circulant interactions for the rest of them, leading to further highly structured matrices that can be diagonalized, and correspondingly leading to interesting and explicitly predictable and designable swarm dynamics. In closing, we note that Turing’s morphogenesis may be regarded as a further example of such dynamics for points in the plane where the $x$ and the $y$ coordinates are subjected to different linear circulant transformations, also readily generalizable to $\lambda$-circulant maps [16]. An analysis of such swarm interactions for multiagent systems is forthcoming.

Acknowledgment

The research of F. Oggier is supported in part by the Singapore National Research Foundation under Research Grant NRF-RF2009-07 and NRF-CRP2-2007-03, and in part by the Nanyang Technological University under Research Grant M58110049 and M58110070. The work of Alfred Bruckstein is supported in part by a Nanyang Technological University visiting professorship, at the SPMS and IMI center.

References

[1] E.C. Boman. The Moore Penrose pseudoinverse of an arbitrary, square, k-circulant matrix. Linear and Multilinear Algebra, 50:175–179, 2002.
[2] A.M. Bruckstein, N. Cohen, and A. Efrat. Ants, crickets and frogs in cyclic pursuit. CIS9105 technical report, Computer Science Dept., Technion, 1991.
[3] A.M. Bruckstein, G. Sapiro, and D. Shaked. Evolutions of planar polygons. International Journal of Pattern Recognition and Artificial Intelligence, 9(6):991–1014, 1995.
[4] R.E. Cline, R.J. Plemmons, and G. Worm. Generalized inverses of certain Toeplitz matrices. Linear Algebra and Its Applications, 8:25–33, 1974.
[5] M.G. Darboux. Sur un problème de géométrie élémentaire. Bull. Sci. Math., 2:298–304, 1878.
[6] P. Elia, F. Oggier, and P. Vijay Kumar. Asymptotically optimal cooperative wireless networks without constellation expansion. IEEE Journal on Selected Areas in Communications on Cooperative Communications and Networking, 25, 2007.
[7] P. Feinsilver. Circulants, inversion of circulants, and some related matrix algebras. Linear Algebra and Appl., 56:29–43, 1984.
[8] I. Gohberg and V. Olshevsky. Circulants, displacements and decompositions of matrices. Integral Equations and Operator Theory, 15:730–743, 1992.
[9] R.M. Gray. Toeplitz and circulant matrices: A review. Intelligent systems lab technical memo, Stanford University, 1971–2006.
[10] F. Hirsh and S. Smale. Differential Equations, Dynamical Systems and Linear Algebra. Academic Press, 1974.
[11] J.A. Marshall and M.E. Broucke. Symmetry invariance of multiagent formations in self-pursuit. IEEE Transactions on Automatic Control, 53(9):2022–2032, 2008.
[12] J.A. Marshall, M.E. Broucke, and B.A. Francis. Formations of vehicles in cyclic pursuit. IEEE Transactions on Automatic Control, 49(11):1963–1974, 2004.
[13] I.J. Schoenberg. The finite Fourier series and elementary geometry. Amer. Math. Monthly, 57(6):390–404, 1950.
[14] I. Schur. Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen. J. reine angew. Math., 140:1–28, 1911.
[15] D.B. Shapiro. A periodicity problem in plane geometry. The American Math. Monthly, 91:97–108, 1984.
[16] A.M. Turing. The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London, series B, Biological Sciences, 237(641), 1952.
[17] I. Wagner and A.M. Bruckstein. Row straightening by local interactions. Circuits, Systems and Signal Processing, 16(3):287–305, 1997.
[18] A.C. Wilde. Differential equations involving circulant matrices. Rocky Mount. J. Math., 13(1):1–13, 1983.

Frédérique Oggier and Alfred Bruckstein²
Division of Mathematical Sciences
School of Physical and Mathematical Sciences
Nanyang Technological University, Singapore
e-mail: [email protected] [email protected]

² Alfred Bruckstein is visiting Professor from The Technion – IIT, Haifa, Israel.

Operator Theory: Advances and Applications, Vol. 218, 541–570
© 2012 Springer Basel AG

A Trace Formula for Differential Operators of Arbitrary Order

J. Östensson and D.R. Yafaev

To the memory of Israel Cudicovich Gohberg

Abstract. An operator $H = H_0 + V$, where $H_0 = i^{-N}\partial^N$ ($N$ is arbitrary) and $V$ is a differential operator of order $N-1$ with coefficients decaying sufficiently rapidly at infinity, is considered in the space $L^2(\mathbb{R})$. The goal of the paper is to find an expression for the trace of the difference of the resolvents $(H - z)^{-1}$ and $(H_0 - z)^{-1}$ in terms of the Wronskian of appropriate solutions to the differential equation $Hu = zu$. This also leads to a representation for the perturbation determinant of the pair $H_0$, $H$.

Mathematics Subject Classification (2000). 34B25, 35P25, 47A40.

Keywords. One-dimensional differential operators, arbitrary order, resolvents, perturbation determinant, trace formula.

1. Introduction

1.1. In the framework of the general operator theory in an abstract Hilbert space, the spectral theory of differential operators
\[
H = i^{-N}\partial^N + v_N(x)\partial^{N-1} + \cdots + v_2(x)\partial + v_1(x), \qquad \partial = d/dx,
\tag{1.1}
\]
is the same for all values of $N$. However, from the point of view of differential equations the problems are essentially different for $N = 2$ (for $N = 1$ it is trivial) and for larger values of $N$.

Suppose that the coefficients $v_j(x)$, $j = 1, \dots, N$, decay sufficiently rapidly as $|x| \to \infty$, and set $H_0 = i^{-N}\partial^N$. Let $R_0(z) = (H_0 - z)^{-1}$ and $R(z) = (H - z)^{-1}$ be the resolvents of the operators $H_0$ and $H$ acting in the space $L^2(\mathbb{R})$. The self-adjointness of the operator $H$ is inessential for us, and we do not assume it.

The first author is grateful to Ari Laptev for useful and stimulating discussions. The second author was partially supported by the project NONAa, ANR-08-BLANC-0228.


The main goal of the present paper is to find an expression for the trace
\[
\operatorname{Tr}\bigl(R(z) - R_0(z)\bigr)
\tag{1.2}
\]
in terms of solutions to the differential equation $Hu = zu$. In the case $N = 2$ such an expression was found by V.S. Buslaev and L.D. Faddeev in paper [5]. They considered the problem on the half-line, and the problem on the whole line was discussed by L.D. Faddeev in [7].

1.2. Let us introduce the notation
\[
\{u_1, \dots, u_N\} =
\begin{pmatrix}
u_1 & \dots & u_N \\
u_1' & \dots & u_N' \\
\vdots & \ddots & \vdots \\
u_1^{(N-1)} & \dots & u_N^{(N-1)}
\end{pmatrix}
\tag{1.3}
\]
for the Wronskian matrix of solutions $u_1 = u_1(x, z), \dots, u_N = u_N(x, z)$ of the differential equation
\[
i^{-N}u^{(N)}(x) + v_N(x)u^{(N-1)}(x) + \cdots + v_2(x)u'(x) + v_1(x)u(x) = zu(x).
\tag{1.4}
\]
We always assume that $z \in \mathbb{C} \setminus [0, \infty)$ if $N$ is even and that $\operatorname{Im} z \neq 0$ if $N$ is odd. Let $\zeta_j$ be solutions of the equation $\zeta^N = i^N z$. We suppose that
\[
\operatorname{Re}\zeta_j > 0 \ \text{ for } j = 1, \dots, n \quad\text{and}\quad \operatorname{Re}\zeta_j < 0 \ \text{ for } j = n+1, \dots, N.
\tag{1.5}
\]
Here $n = N/2$ if $N$ is even and $n = (N-1)/2$ for $\operatorname{Im} z > 0$ and $n = (N+1)/2$ for $\operatorname{Im} z < 0$ if $N$ is odd.

We first explain our result for the case of functions $v_j(x)$ with compact supports. We write $x \ll 0$ if $x$ lies to the left of the supports of all $v_j(x)$ and $x \gg 0$ if $x$ lies to the right of this set. Let $u_j(x, z)$ be solutions of equation (1.4) such that
\[
u_j(x, z) = e^{\zeta_j x} \ \text{ for } x \ll 0 \text{ if } j = 1, \dots, n \quad\text{and for } x \gg 0 \text{ if } j = n+1, \dots, N.
\tag{1.6}
\]
Let
\[
\mathsf{W}(x, z) = \det\{u_1(x, z), \dots, u_n(x, z), u_{n+1}(x, z), \dots, u_N(x, z)\}
\tag{1.7}
\]
be the determinant of matrix (1.3), and let
\[
\mathsf{W}_0(z) = \det\{e^{\zeta_1 x}, \dots, e^{\zeta_n x}, e^{\zeta_{n+1} x}, \dots, e^{\zeta_N x}\}
\tag{1.8}
\]
be the corresponding Wronskian for the "free" case where $v_j = 0$ for all $j = 1, \dots, N$. Of course,
\[
\mathsf{W}(x_2, z) = \exp\Bigl(-i^N \int_{x_1}^{x_2} v_N(y)\,dy\Bigr)\, \mathsf{W}(x_1, z)
\tag{1.9}
\]
for arbitrary points $x_1$ and $x_2$. We emphasize that the Wronskians $\mathsf{W}(x, z)$ and $\mathsf{W}_0(z)$ depend on the order of numeration of the numbers $\zeta_j$, but the normalized Wronskian
\[
\Delta(x, z) = \mathsf{W}(x, z)/\mathsf{W}_0(z)
\tag{1.10}
\]
does not depend on it.
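As a small numerical illustration of convention (1.5), the following sketch computes the roots of $\zeta^N = i^N z$, orders them with the right half-plane first, and reports the number $n$. The function names `roots_zeta` and `ordered_roots` are hypothetical and are not taken from the paper; the sketch assumes that $z$ satisfies the standing assumption, so that no root lies on the imaginary axis.

```python
import cmath
import math


def roots_zeta(N, z):
    """Return the N solutions of zeta**N == (1j)**N * z (z must be nonzero)."""
    w = (1j) ** N * z
    r = abs(w) ** (1.0 / N)
    phi = cmath.phase(w)
    return [r * cmath.exp(1j * (phi + 2 * math.pi * k) / N) for k in range(N)]


def ordered_roots(N, z):
    """Order the roots as in (1.5): Re > 0 first, then Re < 0.

    Returns (roots, n), where n is the number of roots with Re > 0.
    Assumes z is admissible, i.e., no root is purely imaginary.
    """
    zetas = roots_zeta(N, z)
    right = [s for s in zetas if s.real > 0]
    left = [s for s in zetas if s.real < 0]
    return right + left, len(right)


if __name__ == "__main__":
    for N, z in [(2, -1.0), (3, 1j), (3, -1j), (4, 1 + 1j), (5, 2 + 3j)]:
        zetas, n = ordered_roots(N, z)
        print(N, z, n)
```

For the sample values in the `__main__` block the count agrees with the rule stated above: $n = N/2$ for even $N$, and $n = (N \mp 1)/2$ for odd $N$ according to the sign of $\operatorname{Im} z$.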


Our main result is that the normalized Wronskian satisfies (for all $x$ and all regular points $z$ of the operator $H$) the equation
\[
\operatorname{Tr}\bigl(R(z) - R_0(z)\bigr) = -\Delta(x, z)^{-1}\, d\Delta(x, z)/dz,
\tag{1.11}
\]
which we call the trace formula in this paper. Thus the trace of the difference of the resolvents admits an explicit expression in terms of properly chosen solutions of equation (1.4). Then we extend representation (1.11) to general short-range coefficients $v_j(x)$ satisfying the assumption
\[
\int_{-\infty}^{\infty} |v_j(x)|^2 (1 + x^2)^{\alpha}\, dx < \infty, \qquad \alpha > 1/2, \quad j = 1, \dots, N,
\tag{1.12}
\]
only. In this case the functions $u_j(x, z)$ in definition (1.7) are the solutions of equation (1.4) such that
\[
u_j(x, z) = e^{\zeta_j x}(1 + o(1))
\tag{1.13}
\]
as $x \to -\infty$ if $j = 1, \dots, n$ and as $x \to +\infty$ if $j = n+1, \dots, N$. Here and below all asymptotic relations for solutions of equation (1.4) are supposed to be $N - 1$ times differentiable in $x$.

We emphasize that for $N > 2$ asymptotics (1.13) do not determine the solutions of equation (1.4) uniquely. However, the Wronskian (1.7) does not depend on the specific choice of the solutions satisfying (1.13). Thus we do not need the construction of the book [2] by R. Beals, P. Deift and C. Tomei devoted to the inverse scattering problem. In [2] solutions of equation (1.4) were distinguished uniquely (away from some exceptional set of values of $z$) by conditions at both infinities. Our construction of solutions of equation (1.4) with asymptotics (1.13) relies on integral equations which are Volterra equations for $N = 2$ but are only Fredholm equations in the general case. Nevertheless for the construction of solutions with asymptotics (1.13) as $x \to +\infty$ (as $x \to -\infty$) we impose conditions on the coefficients $v_j(x)$ also as $x \to +\infty$ (as $x \to -\infty$) only.

Suppose that $v_N = 0$. Then $\mathsf{W}(x, z) = \mathsf{W}(z)$ and hence $\Delta(x, z) = \Delta(z)$ do not depend on $x$. In this case we identify $\Delta(z)$ with the perturbation determinant for the pair of operators $H_0$, $H$. We refer to the book [10] by I.C. Gohberg and M.G. Kreĭn for a comprehensive discussion of different properties of perturbation determinants. Set
\[
V = H - H_0 = v_N(x)\partial^{N-1} + \cdots + v_2(x)\partial + v_1(x).
\tag{1.14}
\]
If $v_N = 0$, then the operator $V R_0(z)$ for $\operatorname{Im} z \neq 0$ belongs to the trace class $\mathfrak{S}_1$, and hence the perturbation determinant
\[
D(z) = \operatorname{Det}\bigl(I + V R_0(z)\bigr)
\tag{1.15}
\]
is well defined. Of particular importance is the abstract trace formula
\[
\operatorname{Tr}\bigl(R(z) - R_0(z)\bigr) = -D(z)^{-1}\, dD(z)/dz,
\tag{1.16}
\]
which for definition (1.15) is a direct consequence of the formula for the derivative of a determinant. Comparing equations (1.11) and (1.16) and using that $\Delta(z) \to 1$ as $|\operatorname{Im} z| \to \infty$, we show that
\[
\operatorname{Det}\bigl(I + V R_0(z)\bigr) = \Delta(z).
\tag{1.17}
\]
Thus the perturbation determinant admits an explicit expression in terms of solutions of equation (1.4).

If $v_N \neq 0$, then under assumption (1.12) it is still true that (for all regular points $z$)
\[
R(z) - R_0(z) \in \mathfrak{S}_1,
\tag{1.18}
\]
although $V R_0(z) \notin \mathfrak{S}_1$. Without the condition $v_N = 0$, equation (1.16) is satisfied for so-called generalized perturbation determinants $\tilde{D}(z)$ which are defined up to constant factors (see subs. 6.2). According to equation (1.11) in the general case for every fixed $x \in \mathbb{R}$, the function $\Delta(x, z)$ differs from each generalized perturbation determinant by a constant (not depending on $z$) factor.

1.3. A preliminary step in the proof of the trace formula (1.11) is to find a convenient representation for the resolvent $R(z)$ of the operator $H$. This construction probably goes back to the beginning of the twentieth century. We refer to relatively recent books [1, 2, 12] where its different versions can be found. We start, however, with writing down necessary formulas in a form convenient for us.

A differential equation of order $N$ can, of course, be rewritten as a special system of $N$ differential equations of the first order. A consideration of first-order systems without special assumptions on their coefficients gives more general and transparent results. A large part of the paper is written in terms of solutions of first-order systems, which implies the results about solutions of differential equations of an arbitrary order as their special cases.

Let us briefly discuss the structure of the paper. In Sections 2 and 3 we collect necessary formulas for solutions of first-order systems. They are used in Section 4 for the construction of the integral kernel $R(x, y, z)$ of $R(z)$.
In particular, we obtain a new representation for the integral
\[
\int_{x_1}^{x_2} R(y, y, z)\,dy
\tag{1.19}
\]
where the points $x_1, x_2 \in \mathbb{R}$ are arbitrary. Then passing to the limit $x_1 \to -\infty$, $x_2 \to +\infty$, we prove the trace formula (1.11) for the coefficients $v_j$, $j = 1, \dots, N$, with compact supports. A construction of solutions of equation (1.4) with asymptotics (1.13) is given in Section 5. Here we again first consider a general system of $N$ differential equations of the first order. Finally, in Section 6 we give the definition of the normalized Wronskian for operators $H$ with arbitrary short-range coefficients and extend the trace formula to the general case. At the end we prove that the normalized Wronskian coincides with the perturbation determinant.

1.4. We note that there exists a somewhat different approach to proofs of formulas of type (1.17). It consists of a direct calculation of determinant (1.15) whereas we proceed from a calculation of trace (1.2). In this way formula (1.17) was proven in [11] for the Schrödinger operator on the half-line. In [11] the Fredholm expansion of determinants was used.

A general approach to a calculation of determinants $\operatorname{Det}(I + K)$ was proposed in the book [9] by I.C. Gohberg, S. Goldberg and N. Krupnik. In this book integral operators $K$ with so-called semi-separable kernels were considered. It is important that operators $K = V R_0(z)$ fit into this class. This approach was applied to the Schrödinger operator in paper [8].

The authors thank F. Gesztesy for pointing out references [11, 9, 8].
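The abstract trace formula (1.16) has an elementary finite-dimensional analogue, which can serve as a sanity check of signs. The sketch below is only an illustration under stated assumptions: the operators are replaced by random complex $6 \times 6$ matrices (not self-adjoint, in line with the remark that self-adjointness is inessential), the derivative $dD/dz$ is approximated by a central difference, and all names (`resolvent`, `D`, `trace_lhs`, `trace_rhs`) are hypothetical.

```python
import numpy as np

# Finite-dimensional stand-ins for H0 and V (an assumption of this sketch).
rng = np.random.default_rng(0)
m = 6
H0 = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
V = 0.3 * (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m)))
H = H0 + V
I = np.eye(m)


def resolvent(A, z):
    """(A - z)^{-1} for a matrix A and a complex number z."""
    return np.linalg.inv(A - z * I)


def D(z):
    """Matrix analogue of the perturbation determinant (1.15)."""
    return np.linalg.det(I + V @ resolvent(H0, z))


def trace_lhs(z):
    """Tr(R(z) - R0(z))."""
    return np.trace(resolvent(H, z) - resolvent(H0, z))


def trace_rhs(z, h=1e-5):
    """-D(z)^{-1} dD/dz, with dD/dz from a central difference in z."""
    dD = (D(z + h) - D(z - h)) / (2 * h)
    return -dD / D(z)


z = 10.0 + 10.0j  # chosen well away from the spectra of H0 and H
print(abs(trace_lhs(z) - trace_rhs(z)))
```

In the matrix case the identity follows from $I + V R_0(z) = (H - z) R_0(z)$, so that $D(z) = \det(H - z)/\det(H_0 - z)$, together with $\frac{d}{dz}\log\det(A - z) = -\operatorname{Tr}(A - z)^{-1}$.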

2. Resolvent kernel

In this section we consider an auxiliary vector problem.

2.1. Suppose that the eigenvalues $\zeta_j$, $j = 1, \dots, N$, of an $N \times N$ matrix $\mathbf{L}_0$ are distinct. We denote by $\mathbf{p}_j = (p_{1,j}, p_{2,j}, \dots, p_{N,j})^t$ (this notation means that the vector $\mathbf{p}_j$ is considered as a column) eigenvectors of $\mathbf{L}_0$ corresponding to its eigenvalues $\zeta_j$ and by $\mathbf{p}_j^*$ eigenvectors of $\mathbf{L}_0^*$ corresponding to its eigenvalues $\bar{\zeta}_j$. Recall that $\langle \mathbf{p}_j, \mathbf{p}_k^* \rangle = 0$ if $j \neq k$ (here $\langle \cdot, \cdot \rangle$ is the scalar product in $\mathbb{C}^N$). Normalizations of $\mathbf{p}_j$ and $\mathbf{p}_j^*$ are inessential, but we suppose that $\langle \mathbf{p}_j, \mathbf{p}_j^* \rangle = 1$. Then the bases $\mathbf{p}_j$ and $\mathbf{p}_j^*$, $j = 1, \dots, N$, are dual to each other.

Assume that an $N \times N$ matrix $\mathbf{V}(x)$, where $x \in \mathbb{R}$, belongs locally to $L^1$ and has compact support. We write $x \ll 0$ if $x$ lies to the left of the support of $\mathbf{V}(x)$ and $x \gg 0$ if $x$ lies to the right of this set. We put
\[
\mathbf{L}(x) = \mathbf{L}_0 + \mathbf{V}(x).
\tag{2.1}
\]
Consider the homogeneous equation
\[
\mathbf{u}'(x) = \mathbf{L}(x)\mathbf{u}(x)
\tag{2.2}
\]
for the vector-valued function $\mathbf{u}(x) = (u_1(x), \dots, u_N(x))^t$. For arbitrary linearly independent solutions $\mathbf{u}_j(x) = (u_{1,j}(x), \dots, u_{N,j}(x))^t$ of this equation, we denote by
\[
\mathbf{U}(x) =
\begin{pmatrix}
u_{1,1}(x) & u_{1,2}(x) & \dots & u_{1,N}(x) \\
u_{2,1}(x) & u_{2,2}(x) & \dots & u_{2,N}(x) \\
\vdots & \vdots & \ddots & \vdots \\
u_{N,1}(x) & u_{N,2}(x) & \dots & u_{N,N}(x)
\end{pmatrix}
=: \{\mathbf{u}_1(x), \mathbf{u}_2(x), \dots, \mathbf{u}_N(x)\}
\tag{2.3}
\]
the corresponding fundamental matrix. It satisfies the matrix equation
\[
\mathbf{U}'(x) = \mathbf{L}(x)\mathbf{U}(x).
\tag{2.4}
\]
It follows that
\[
d \det \mathbf{U}(x)/dx = \det \mathbf{U}(x)\, \operatorname{tr}\bigl(\mathbf{U}'(x)\mathbf{U}^{-1}(x)\bigr) = \det \mathbf{U}(x)\, \operatorname{tr} \mathbf{L}(x)
\tag{2.5}
\]
and hence
\[
\det \mathbf{U}(x_2) = \exp\Bigl(\int_{x_1}^{x_2} \operatorname{tr} \mathbf{L}(y)\,dy\Bigr) \det \mathbf{U}(x_1)
\tag{2.6}
\]

for arbitrary points $x_1$ and $x_2$. Of course $\det \mathbf{U}(x) \neq 0$ for all $x \in \mathbb{R}$.

We always suppose that $\kappa_j := \operatorname{Re}\zeta_j \neq 0$ for all $j = 1, \dots, N$. Let $n$ and $N - n$ be the numbers of eigenvalues $\zeta_j$ of the matrix $\mathbf{L}_0$ lying in the right and left half-planes, respectively. The cases $n = 0$ or $n = N$ where all $\zeta_j$ lie in one of the half-planes are not excluded. Let $\mathbf{u}_j(x)$ be solutions of equation (2.2) distinguished by the condition
\[
\mathbf{u}_j(x) = e^{\zeta_j x}\mathbf{p}_j \ \text{ for } x \ll 0 \text{ if } \kappa_j > 0 \quad\text{and for } x \gg 0 \text{ if } \kappa_j < 0.
\tag{2.7}
\]
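Relation (2.6) between the determinants of a fundamental matrix at two points is easy to test numerically. The sketch below is a minimal illustration, assuming a hypothetical $2 \times 2$ example: the matrices `L0` and `V_SHAPE`, the compactly supported bump in `V`, and the helper `propagate` (a classical Runge–Kutta integrator combined with Simpson's rule for the trace integral) are all choices made here and are not taken from the paper.

```python
import numpy as np

# Hypothetical example data: L(x) = L0 + V(x), with V supported in [-1, 1].
L0 = np.array([[1.0, 0.5], [0.0, -0.6]])
V_SHAPE = np.array([[0.3, 0.0], [1.0, -0.7]])


def V(x):
    bump = (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0
    return bump * V_SHAPE


def L(x):
    return L0 + V(x)


def propagate(x1, x2, steps=3000):
    """Integrate U' = L(x) U from U(x1) = I by RK4.

    Returns U(x2) and the integral of tr L over [x1, x2] (Simpson's rule).
    """
    h = (x2 - x1) / steps
    U = np.eye(2)
    integral = 0.0
    x = x1
    for _ in range(steps):
        xm, xr = x + 0.5 * h, x + h
        k1 = L(x) @ U
        k2 = L(xm) @ (U + 0.5 * h * k1)
        k3 = L(xm) @ (U + 0.5 * h * k2)
        k4 = L(xr) @ (U + h * k3)
        U = U + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        integral += (h / 6.0) * (
            np.trace(L(x)) + 4 * np.trace(L(xm)) + np.trace(L(xr))
        )
        x = xr
    return U, integral


U2, trace_integral = propagate(-1.5, 1.5)
lhs = np.linalg.det(U2)          # det U(x2)
rhs = np.exp(trace_integral)     # exp(int tr L) * det U(x1), with det U(x1) = 1
print(lhs, rhs)
```

The two printed numbers should agree up to the discretization error of the integrator.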

We denote by $\mathsf{K}_+$ and $\mathsf{K}_-$ the linear spaces spanned by all solutions $\mathbf{u}_j(x)$ such that $\kappa_j > 0$ and such that $\kappa_j < 0$, respectively. Clearly, $\dim \mathsf{K}_+ = n$ and $\dim \mathsf{K}_- = N - n$. We assume that
\[
\mathsf{K}_+ \cap \mathsf{K}_- = \{0\}.
\tag{2.8}
\]
Then all nontrivial solutions of equation (2.2) exponentially grow either as $x \to +\infty$ or as $x \to -\infty$. In particular, equation (2.2) does not have nontrivial solutions $\mathbf{u} \in L^2(\mathbb{R}; \mathbb{C}^N)$. If $\mathbf{u}_1(x), \dots, \mathbf{u}_n(x)$ and $\mathbf{u}_{n+1}(x), \dots, \mathbf{u}_N(x)$ are arbitrary linearly independent solutions from $\mathsf{K}_+$ and $\mathsf{K}_-$ respectively, then in view of condition (2.8) all these solutions are linearly independent. It is now convenient to accept the following

Definition 2.1. Suppose that $n$ columns of matrix (2.3) form a basis in the linear space $\mathsf{K}_+$ and the other $N - n$ columns form a basis in $\mathsf{K}_-$. Then the fundamental matrix $\mathbf{U}(x)$ is called admissible.

Observe that for the "free" case where $\mathbf{V}(x) = 0$, we can set
\[
\mathbf{U}_0(x) = \{\mathbf{p}_1 e^{\zeta_1 x}, \dots, \mathbf{p}_N e^{\zeta_N x}\}
\tag{2.9}
\]
and
\[
\mathsf{W}_0(x) = \det \mathbf{U}_0(x) = \det\{\mathbf{p}_1, \dots, \mathbf{p}_N\} \exp\bigl(\operatorname{tr} \mathbf{L}_0\, x\bigr)
\tag{2.10}
\]
because $\operatorname{tr} \mathbf{L}_0 = \sum_{j=1}^{N} \zeta_j$. Note that $\det\{\mathbf{p}_1, \dots, \mathbf{p}_N\} \neq 0$ since all eigenvalues of the matrix $\mathbf{L}_0$ are distinct. The inverse matrix $\mathbf{G}_0(x) = \mathbf{U}_0^{-1}(x)$ satisfies the relation
\[
\mathbf{G}_0^*(x) = \{\mathbf{p}_1^* e^{-\bar{\zeta}_1 x}, \dots, \mathbf{p}_N^* e^{-\bar{\zeta}_N x}\}.
\tag{2.11}
\]

2.2. Next we consider the nonhomogeneous equation
\[
\boldsymbol{\varphi}'(x) = \mathbf{L}(x)\boldsymbol{\varphi}(x) + \mathbf{f}(x), \qquad \mathbf{f}(x) = (f_1(x), \dots, f_N(x))^t,
\tag{2.12}
\]

where the vector-valued function $\mathbf{f}(x)$ has compact support. Let us use the standard method of variation of arbitrary constants and set
\[
\boldsymbol{\varphi}(x) = \mathbf{U}(x)\mathbf{q}(x), \qquad \mathbf{q}(x) = (q_1(x), \dots, q_N(x))^t,
\]
so that
\[
\boldsymbol{\varphi}(x) = \sum_{j=1}^{N} q_j(x)\mathbf{u}_j(x).
\tag{2.13}
\]
Here $\mathbf{U}(x)$ is an arbitrary admissible fundamental matrix (2.3). Then it follows from equation (2.4) that
\[
\mathbf{q}'(x) = \mathbf{g}(x) \quad\text{where } \mathbf{g}(x) = \mathbf{G}(x)\mathbf{f}(x) \text{ and } \mathbf{G}(x) = \mathbf{U}^{-1}(x).
\tag{2.14}
\]

We are looking for a solution of equation (2.12) decaying (exponentially) as $|x| \to \infty$. It is convenient to accept convention (1.5) on the eigenvalues $\zeta_j$ of the matrix $\mathbf{L}_0$. Set
\[
\rho_+ = \min_{j=1,\dots,n} \operatorname{Re}\zeta_j, \qquad \rho_- = \min_{j=n+1,\dots,N} |\operatorname{Re}\zeta_j|
\tag{2.15}
\]
and observe that estimates
\[
\mathbf{u}_j(x) = O(e^{-\rho_\pm |x|}), \qquad x \to \mp\infty,
\]
hold for $j = 1, \dots, n$ and the upper sign as well as for $j = n+1, \dots, N$ and the lower sign. Taking into account (2.13), we see that we have to solve equation (2.14) for different components $q_j(x)$ of $\mathbf{q}(x)$ by different formulas. Namely, we set
\[
q_j(x) = -\int_x^{\infty} g_j(y)\,dy, \quad j = 1, \dots, n, \qquad
q_j(x) = \int_{-\infty}^{x} g_j(y)\,dy, \quad j = n+1, \dots, N,
\]
where $g_j(x)$ are components of $\mathbf{g}(x)$. This leads to the following result.

Proposition 2.2. Let assumption (2.8) hold, and let (2.3) be an arbitrary admissible fundamental matrix. Then the function
\[
\boldsymbol{\varphi}(x) = -\sum_{j=1}^{n} \mathbf{u}_j(x) \int_x^{\infty} \bigl(\mathbf{G}(y)\mathbf{f}(y)\bigr)_j\,dy + \sum_{j=n+1}^{N} \mathbf{u}_j(x) \int_{-\infty}^{x} \bigl(\mathbf{G}(y)\mathbf{f}(y)\bigr)_j\,dy
\tag{2.16}
\]
satisfies equation (2.12) and $\boldsymbol{\varphi}(x) = O(e^{-\rho_\pm |x|})$ as $x \to \mp\infty$.

Formula (2.16) can be rewritten as
\[
\boldsymbol{\varphi}(x) = \int_{-\infty}^{\infty} \mathbf{R}(x, y)\mathbf{f}(y)\,dy
\tag{2.17}
\]

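where the matrix-valued resolvent kernel (or the Green function) $\mathbf{R}(x, y) = \{R_{k,l}(x, y)\}$ is defined by the equality
\[
R_{k,l}(x, y) = -\sum_{j=1}^{n} u_{k,j}(x) g_{j,l}(y)\theta(y - x) + \sum_{j=n+1}^{N} u_{k,j}(x) g_{j,l}(y)\theta(x - y).
\tag{2.18}
\]
Here $\theta$ is the Heaviside function, i.e., $\theta(x) = 1$ for $x \geq 0$ and $\theta(x) = 0$ for $x < 0$, and $g_{j,l}$ are elements of the matrix $\mathbf{G}$. In the matrix notation formula (2.18) means that
\[
\mathbf{R}(x, y) = -\mathbf{U}(x)\mathbf{P}_+\mathbf{U}^{-1}(y)\theta(y - x) + \mathbf{U}(x)\mathbf{P}_-\mathbf{U}^{-1}(y)\theta(x - y),
\tag{2.19}
\]
where the projections $\mathbf{P}_\pm$ are defined in the representation $\mathbb{C}^N = \mathbb{C}^n \oplus \mathbb{C}^{N-n}$ by the block matrices
\[
\mathbf{P}_+ = \begin{pmatrix} I_n & 0 \\ 0 & 0 \end{pmatrix}, \qquad
\mathbf{P}_- = \begin{pmatrix} 0 & 0 \\ 0 & I_{N-n} \end{pmatrix}.
\]
Expressions (2.18) or (2.19) do not of course depend on the choice of bases in the spaces $\mathsf{K}_+$ and $\mathsf{K}_-$. Indeed, if we choose other bases $\breve{\mathbf{u}}_1(x), \dots, \breve{\mathbf{u}}_n(x)$ and $\breve{\mathbf{u}}_{n+1}(x), \dots, \breve{\mathbf{u}}_N(x)$, then the corresponding admissible fundamental matrices $\mathbf{U}(x)$ and $\breve{\mathbf{U}}(x)$ are related by the formula $\breve{\mathbf{U}}(x) = \mathbf{U}(x)\mathbf{F}$ where the operator $\mathbf{F} : \mathbb{C}^N \to \mathbb{C}^N$ commutes with the projections $\mathbf{P}_\pm$. It follows that $\breve{\mathbf{U}}(x)\mathbf{P}_\pm\breve{\mathbf{U}}^{-1}(y) = \mathbf{U}(x)\mathbf{P}_\pm\mathbf{U}^{-1}(y)$.

Evidently, the resolvent kernel (2.19) is a continuous function of $x$ and $y$ away from the diagonal $x = y$ and
\[
\mathbf{R}(x, x + 0, z) = -\mathbf{U}(x)\mathbf{P}_+\mathbf{U}^{-1}(x), \qquad \mathbf{R}(x, x - 0, z) = \mathbf{U}(x)\mathbf{P}_-\mathbf{U}^{-1}(x).
\]
It follows that
\[
\mathbf{R}(x, x - 0, z) - \mathbf{R}(x, x + 0, z) = \mathbf{I},
\tag{2.20}
\]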

where $\mathbf{I}$ is the $N \times N$ identity matrix.

2.3. The results of the previous subsection admit a simple operator interpretation. Consider the space $L^2(\mathbb{R}; \mathbb{C}^N)$ and define the operator $\mathbf{H}_0$ on the Sobolev class $\mathsf{H}^1(\mathbb{R}; \mathbb{C}^N)$ by the formula
\[
\mathbf{H}_0 = \partial\,\mathbf{I} - \mathbf{L}_0, \qquad \partial = d/dx.
\]
If $\mathbf{u}(x) = u(x)\mathbf{p}_j$ where $u \in \mathsf{H}^1(\mathbb{R})$, then $(\mathbf{H}_0\mathbf{u})(x) = (u'(x) - \zeta_j u(x))\mathbf{p}_j$, and hence the operator $\mathbf{H}_0$ is linearly equivalent to a direct sum of the operators of multiplication by $i\xi - \zeta_j$, $\xi \in \mathbb{R}$, $j = 1, \dots, N$, acting in the space $L^2(\mathbb{R})$. It follows that the spectrum of the operator $\mathbf{H}_0$ consists of straight lines passing through all points $-\zeta_j$ and parallel to the imaginary axis. In particular, the inverse operator $\mathbf{H}_0^{-1}$ exists and is bounded.

To define the operator $\mathbf{H} = \partial\,\mathbf{I} - \mathbf{L}_0 - \mathbf{V}(x)$, we need the following well-known assertion (see paper [3] by M.Sh. Birman).

Lemma 2.3. Let $T : L^2(\mathbb{R}; dx) \to L^2(\mathbb{R}; d\xi)$ be an integral operator with kernel
\[
t(\xi, x) = b(\xi)e^{-ix\xi}v(x).
\tag{2.21}
\]
If $b(\xi) = (\xi^2 + 1)^{-1/2}$ and
\[
\lim_{|x| \to \infty} \int_x^{x+1} |v(y)|^2\,dy = 0,
\tag{2.22}
\]
then the operator $T$ is compact.

If the coefficients of the matrix $\mathbf{V}(x)$ satisfy condition (2.22), then according to Lemma 2.3 the operator $\mathbf{V}\mathbf{H}_0^{-1}$ is compact. Hence the operator $\mathbf{H}$ is closed on $\mathsf{H}^1(\mathbb{R}; \mathbb{C}^N)$ and by virtue of the Weyl theorem essential spectra of the operators $\mathbf{H}$ and $\mathbf{H}_0$ coincide. Condition (2.8) implies that $0$ is not an eigenvalue of $\mathbf{H}$ so that the inverse operator $\mathbf{H}^{-1}$ exists and is bounded. If the matrix-valued function $\mathbf{V}(x)$ has compact support, then according to Proposition 2.2 the integral kernel of the operator $\mathbf{H}^{-1}$ is given by formula (2.19).

2.4. Let the solutions $\mathbf{u}_j(x)$ of equation (2.2) be distinguished by conditions (2.7). Let us give expressions for the Wronskian $\mathsf{W}(x) := \det \mathbf{U}(x)$ in terms of transition matrices $\mathbf{T}_\pm$ defined as follows. For $j = 1, \dots, n$ and $x \gg 0$ or $j = n+1, \dots, N$ and $x \ll 0$, we have
\[
\mathbf{u}_j(x) = \sum_{k=1}^{N} t_{j,k}\,\mathbf{p}_k e^{\zeta_k x}
\tag{2.23}
\]
with some coefficients $t_{j,k}$. Set

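\[
\mathbf{T}_+ =
\begin{pmatrix}
t_{1,1} & t_{1,2} & \dots & t_{1,n} \\
t_{2,1} & t_{2,2} & \dots & t_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
t_{n,1} & t_{n,2} & \dots & t_{n,n}
\end{pmatrix}
\tag{2.24}
\]
and
\[
\mathbf{T}_- =
\begin{pmatrix}
t_{n+1,n+1} & t_{n+1,n+2} & \dots & t_{n+1,N} \\
t_{n+2,n+1} & t_{n+2,n+2} & \dots & t_{n+2,N} \\
\vdots & \vdots & \ddots & \vdots \\
t_{N,n+1} & t_{N,n+2} & \dots & t_{N,N}
\end{pmatrix}.
\tag{2.25}
\]
Consider, for example, $\mathbf{T}_+$. Using expressions (2.23) for $j = 1, \dots, n$, we see that for $x \gg 0$ matrix (2.3) equals
\[
\mathbf{U}(x) = \Bigl\{ \sum_{k=1}^{N} t_{1,k}\mathbf{p}_k e^{\zeta_k x}, \dots, \sum_{k=1}^{N} t_{n,k}\mathbf{p}_k e^{\zeta_k x}, \ \mathbf{p}_{n+1}e^{\zeta_{n+1} x}, \dots, \mathbf{p}_N e^{\zeta_N x} \Bigr\}.
\tag{2.26}
\]
Below, in the calculation of determinants of matrices, we systematically use that one can add to each column another column multiplied by an arbitrary number. In particular, we have
\[
\mathsf{W}(x) = \det\Bigl\{ \sum_{k=1}^{n} t_{1,k}\mathbf{p}_k e^{\zeta_k x}, \dots, \sum_{k=1}^{n} t_{n,k}\mathbf{p}_k e^{\zeta_k x}, \ \mathbf{p}_{n+1}e^{\zeta_{n+1} x}, \dots, \mathbf{p}_N e^{\zeta_N x} \Bigr\} = \det \mathbf{T}_+\, \mathsf{W}_0(x), \qquad x \gg 0,
\tag{2.27}
\]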

where the free Wronskian $\mathsf{W}_0(x)$ is given by formula (2.10). In view of relation (2.6), it follows that for all $x \in \mathbb{R}$
\[
\mathsf{W}(x) = \exp\Bigl(\operatorname{tr}\mathbf{L}_0\, x - \int_x^{\infty} \operatorname{tr}\mathbf{V}(y)\,dy\Bigr) \det \mathbf{T}_+ \det\{\mathbf{p}_1, \dots, \mathbf{p}_N\}.
\tag{2.28}
\]
Quite similarly, using expressions (2.23) for $j = n+1, \dots, N$ and $x \ll 0$, we obtain that
\[
\mathsf{W}(x) = \det \mathbf{T}_-\, \mathsf{W}_0(x), \qquad x \ll 0,
\tag{2.29}
\]
and
\[
\mathsf{W}(x) = \exp\Bigl(\operatorname{tr}\mathbf{L}_0\, x + \int_{-\infty}^{x} \operatorname{tr}\mathbf{V}(y)\,dy\Bigr) \det \mathbf{T}_- \det\{\mathbf{p}_1, \dots, \mathbf{p}_N\}, \qquad \forall x \in \mathbb{R}.
\tag{2.30}
\]
This leads to the following result.

Proposition 2.4. Let $\mathbf{U}(x)$ be the fundamental matrix (2.3) where $\mathbf{u}_j(x)$ are the solutions of equation (2.2) satisfying conditions (2.7). Let the transition matrices $\mathbf{T}_\pm$ be defined by formulas (2.23)–(2.25). Then the Wronskian $\mathsf{W}(x) = \det \mathbf{U}(x)$ admits representations (2.28) and (2.30).

Putting together equalities (2.28) and (2.30), we see that
\[
\det \mathbf{T}_+ = \exp\Bigl(\int_{-\infty}^{\infty} \operatorname{tr}\mathbf{V}(y)\,dy\Bigr) \det \mathbf{T}_-.
\]
Assumptions (2.8) and $\det \mathbf{T}_\pm \neq 0$ are of course equivalent.

3. Dual problem

Some properties of admissible fundamental matrices become more transparent if one considers the dual problem corresponding to the matrix-valued function $\tilde{\mathbf{L}}(x) = -\mathbf{L}^*(x)$.

3.1. It follows from equation (2.4) for $\mathbf{U}(x)$ that the inverse operator $\mathbf{G}(x) = \mathbf{U}^{-1}(x)$ satisfies the equation
\[
\mathbf{G}'(x) = -\mathbf{G}(x)\mathbf{U}'(x)\mathbf{G}(x) = -\mathbf{G}(x)\mathbf{L}(x)
\tag{3.1}
\]
which yields the equation
\[
\tilde{\mathbf{U}}'(x) = \tilde{\mathbf{L}}(x)\tilde{\mathbf{U}}(x)
\tag{3.2}
\]
for the matrix-valued function $\tilde{\mathbf{U}}(x) := \mathbf{U}^*(x)^{-1}$. Clearly, $\det \tilde{\mathbf{U}}(x) \neq 0$ so that $\tilde{\mathbf{U}}(x)$ is a fundamental matrix for this equation. Proposition 3.1 below shows that it is admissible. Set $\tilde{\mathbf{U}}(x) = \{\tilde{u}_{j,l}(x)\}$, $\mathbf{G}(x) = \{g_{j,l}(x)\}$. We use below that
\[
\overline{\tilde{u}_{l,j}(x)} = g_{j,l}(x) = (-1)^{j+l}\, m_{l,j}(x)/\mathsf{W}(x),
\tag{3.3}
\]
where $m_{l,j}(x)$ is the minor of the matrix $\mathbf{U}(x)$, that is, the determinant of the matrix obtained from $\mathbf{U}(x)$ by removing the row with index $l$ and the column with index $j$.
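The cofactor representation of the inverse matrix used in (3.3) can be checked directly on a random matrix. The sketch below is only an illustration: the helper name `inverse_via_minors` is hypothetical, indices in the code are 0-based (which does not change the parity of $j + l$), and a random complex $4 \times 4$ matrix stands in for $\mathbf{U}(x)$ at a fixed point $x$.

```python
import numpy as np


def inverse_via_minors(U):
    """Build G = U^{-1} entrywise as g_{j,l} = (-1)^{j+l} m_{l,j} / det U,

    where m_{l,j} is the determinant of U with row l and column j removed.
    """
    N = U.shape[0]
    W = np.linalg.det(U)
    G = np.empty((N, N), dtype=complex)
    for j in range(N):
        for l in range(N):
            sub = np.delete(np.delete(U, l, axis=0), j, axis=1)
            G[j, l] = (-1) ** (j + l) * np.linalg.det(sub) / W
    return G


rng = np.random.default_rng(2)
U = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
G = inverse_via_minors(U)
U_tilde = np.linalg.inv(U.conj().T)  # dual fundamental matrix (U^*)^{-1}

print(np.allclose(G, np.linalg.inv(U)))
print(np.allclose(U_tilde, G.conj().T))
```

The second check reflects that $(\mathbf{U}^*)^{-1}$ is the adjoint of $\mathbf{G} = \mathbf{U}^{-1}$, so its entries are the complex conjugates of the transposed entries of $\mathbf{G}$.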

Proposition 3.1. Let $\mathbf{u}_j(x)$ be arbitrary linearly independent solutions of equation (2.2) from $\mathsf{K}_+$ for $j = 1, \dots, n$ and from $\mathsf{K}_-$ for $j = n+1, \dots, N$, and let $\mathbf{U}(x)$ be the corresponding admissible fundamental matrix (2.3). Then for all $l = 1, \dots, N$ we have
\[
\begin{aligned}
g_{j,l}(x) &= O(e^{\rho_- x}), & x &\to -\infty, & j &= n+1, \dots, N, \\
g_{j,l}(x) &= O(e^{-\rho_+ x}), & x &\to +\infty, & j &= 1, \dots, n,
\end{aligned}
\tag{3.4}
\]
with positive numbers $\rho_\pm$ defined in (2.15).

Proof. Let us prove, for example, the first of relations (3.4). Changing if necessary the numeration, we can suppose that $j = N$ which is notationally convenient. For $x \ll 0$ and some numbers $c_{j,k}$, we have
\[
\mathbf{U}(x) = \Bigl\{ \sum_{k=1}^{n} c_{1,k}\mathbf{p}_k e^{\zeta_k x}, \dots, \sum_{k=1}^{n} c_{n,k}\mathbf{p}_k e^{\zeta_k x}, \ \sum_{k=1}^{N} c_{n+1,k}\mathbf{p}_k e^{\zeta_k x}, \dots, \sum_{k=1}^{N} c_{N,k}\mathbf{p}_k e^{\zeta_k x} \Bigr\}.
\]
It follows that
\[
m_{l,N}(x) = \det\Bigl\{ \sum_{k=1}^{n} c_{1,k}\mathbf{p}_k^{(l)} e^{\zeta_k x}, \dots, \sum_{k=1}^{n} c_{n,k}\mathbf{p}_k^{(l)} e^{\zeta_k x}, \ \sum_{k=1}^{N} c_{n+1,k}\mathbf{p}_k^{(l)} e^{\zeta_k x}, \dots, \sum_{k=1}^{N} c_{N-1,k}\mathbf{p}_k^{(l)} e^{\zeta_k x} \Bigr\}
\tag{3.5}
\]
where
\[
\mathbf{p}_k^{(l)} = (p_{1,k}, \dots, p_{l-1,k}, p_{l+1,k}, \dots, p_{N,k})^t.
\tag{3.6}
\]
Thus the element with index $l$ is removed from $\mathbf{p}_k = (p_{1,k}, \dots, p_{N,k})^t$ so that in (3.5) we take the determinant of the $(N-1) \times (N-1)$ matrix. Neglecting columns which are repeated in determinant (3.5), we see that it consists of terms
\[
\det\Bigl\{ \mathbf{p}_1^{(l)} e^{\zeta_1 x}, \dots, \mathbf{p}_n^{(l)} e^{\zeta_n x}, \ \mathbf{p}_{k_{n+1}}^{(l)} e^{\zeta_{k_{n+1}} x}, \dots, \mathbf{p}_{k_{N-1}}^{(l)} e^{\zeta_{k_{N-1}} x} \Bigr\}
\tag{3.7}
\]
multiplied by some coefficients which do not depend on $x$. Here the indices $k_{n+1}, \dots, k_{N-1}$ take the values $n+1, \dots, N$ and $k_p \neq k_q$ if $p \neq q$. Evidently, expression (3.7) equals
\[
c_j \exp\bigl(\operatorname{tr}\mathbf{L}_0\, x - \zeta_j x\bigr), \qquad \operatorname{tr}\mathbf{L}_0 = \sum_{k=1}^{N} \zeta_k,
\tag{3.8}
\]
for some $j = n+1, \dots, N$ and a number $c_j$ which does not depend on $x$. In view of formulas (2.10) and (2.29) after division by $\mathsf{W}(x)$ this expression is $O(|e^{-\zeta_j x}|)$ as $x \to -\infty$. Hence (3.4) for $j = N$ follows from (3.3). □

Let $\tilde{\mathbf{u}}_j(x)$ be columns of the matrix $\tilde{\mathbf{U}}(x)$, that is,
\[
\tilde{\mathbf{U}}(x) = \{\tilde{\mathbf{u}}_1(x), \dots, \tilde{\mathbf{u}}_N(x)\}.
\]

Then relations (3.4) can equivalently be rewritten as
\[
\begin{aligned}
\tilde{\mathbf{u}}_j(x) &= O(e^{\rho_- x}), & x &\to -\infty, & j &= n+1, \dots, N, \\
\tilde{\mathbf{u}}_j(x) &= O(e^{-\rho_+ x}), & x &\to +\infty, & j &= 1, \dots, n.
\end{aligned}
\tag{3.9}
\]
Let us define the resolvent kernel $\tilde{\mathbf{R}}(x, y)$ in the same way as in subs. 2.2 with $\mathbf{L}(x)$ replaced by $\tilde{\mathbf{L}}(x)$. According to relations (3.9) the first $n$ columns of the matrix $\tilde{\mathbf{U}}(x)$ exponentially decay as $x \to +\infty$ and its last $N - n$ columns exponentially decay as $x \to -\infty$. Therefore the fundamental matrix $\tilde{\mathbf{U}}(x)$ is admissible for equation (3.2), and it follows from formula (2.19) applied to the dual problem that
\[
\tilde{\mathbf{R}}(x, y) = -\tilde{\mathbf{U}}(x)\tilde{\mathbf{P}}_+\tilde{\mathbf{U}}^{-1}(y)\theta(y - x) + \tilde{\mathbf{U}}(x)\tilde{\mathbf{P}}_-\tilde{\mathbf{U}}^{-1}(y)\theta(x - y)
\tag{3.10}
\]
where
\[
\tilde{\mathbf{U}}(x) = \mathbf{U}^*(x)^{-1}, \qquad \tilde{\mathbf{P}}_\pm = \mathbf{P}_\mp.
\tag{3.11}
\]
In particular, we see that
\[
\tilde{\mathbf{R}}(x, y) = -\mathbf{R}^*(y, x).
\]
This relation implicitly follows also from the results of subs. 2.3.

3.2. Let $\mathbf{G}_0(x) = \{g_{j,l}^{(0)}(x)\}$ be the matrix inverse to the free matrix (2.9). The next result supplements Proposition 3.1 and plays an important role in our proof of the trace formula (1.11).

Proposition 3.2. Let solutions $\mathbf{u}_j(x)$ of equation (2.2) be distinguished by conditions (2.7), and let $\mathbf{U}(x)$ be the corresponding admissible fundamental matrix (2.3). Then for all $l = 1, \dots, N$ elements $g_{j,l}(x)$ of the matrix $\mathbf{G}(x) = \mathbf{U}^{-1}(x)$ satisfy the relations
\[
\begin{aligned}
g_{j,l}(x) - g_{j,l}^{(0)}(x) &= O(e^{-\rho_+ x}), & x &\to +\infty, & j &= n+1, \dots, N, \\
g_{j,l}(x) - g_{j,l}^{(0)}(x) &= O(e^{\rho_- x}), & x &\to -\infty, & j &= 1, \dots, n,
\end{aligned}
\tag{3.12}
\]

with positive numbers $\rho_\pm$ defined in (2.15).

Proof. Let us use the notation introduced in the proof of Proposition 3.1. We shall again prove the first of relations (3.12) for $j = N$. According to (2.26) for $x \gg 0$, we have
\[
m_{l,N}(x) = \det\Bigl\{ \sum_{k=1}^{N} t_{1,k}\mathbf{p}_k^{(l)} e^{\zeta_k x}, \dots, \sum_{k=1}^{N} t_{n,k}\mathbf{p}_k^{(l)} e^{\zeta_k x}, \ \mathbf{p}_{n+1}^{(l)} e^{\zeta_{n+1} x}, \dots, \mathbf{p}_{N-1}^{(l)} e^{\zeta_{N-1} x} \Bigr\}
\tag{3.13}
\]
where $\mathbf{p}_k^{(l)}$ is vector (3.6) obtained from $\mathbf{p}_k$ by removing the component with index $l$. Neglecting columns which are repeated in $(N-1) \times (N-1)$ matrix (3.13), we see that
\[
m_{l,N}(x) = \det\Bigl\{ \sum_{k=1}^{n} t_{1,k}\mathbf{p}_k^{(l)} e^{\zeta_k x} + t_{1,N}\mathbf{p}_N^{(l)} e^{\zeta_N x}, \dots, \sum_{k=1}^{n} t_{n,k}\mathbf{p}_k^{(l)} e^{\zeta_k x} + t_{n,N}\mathbf{p}_N^{(l)} e^{\zeta_N x}, \ \mathbf{p}_{n+1}^{(l)} e^{\zeta_{n+1} x}, \dots, \mathbf{p}_{N-1}^{(l)} e^{\zeta_{N-1} x} \Bigr\}.
\]
This determinant is the sum of the term
\[
\det\Bigl\{ \sum_{k=1}^{n} t_{1,k}\mathbf{p}_k^{(l)} e^{\zeta_k x}, \dots, \sum_{k=1}^{n} t_{n,k}\mathbf{p}_k^{(l)} e^{\zeta_k x}, \ \mathbf{p}_{n+1}^{(l)} e^{\zeta_{n+1} x}, \dots, \mathbf{p}_{N-1}^{(l)} e^{\zeta_{N-1} x} \Bigr\}
\tag{3.14}
\]
and of the $n$ terms
\[
\begin{aligned}
&\det\Bigl\{ t_{1,N}\mathbf{p}_N^{(l)} e^{\zeta_N x}, \ \sum_{k=1}^{n} t_{2,k}\mathbf{p}_k^{(l)} e^{\zeta_k x}, \dots, \sum_{k=1}^{n} t_{n,k}\mathbf{p}_k^{(l)} e^{\zeta_k x}, \ \mathbf{p}_{n+1}^{(l)} e^{\zeta_{n+1} x}, \dots, \mathbf{p}_{N-1}^{(l)} e^{\zeta_{N-1} x} \Bigr\}, \\
&\qquad\cdots \\
&\det\Bigl\{ \sum_{k=1}^{n} t_{1,k}\mathbf{p}_k^{(l)} e^{\zeta_k x}, \dots, \sum_{k=1}^{n} t_{n-1,k}\mathbf{p}_k^{(l)} e^{\zeta_k x}, \ t_{n,N}\mathbf{p}_N^{(l)} e^{\zeta_N x}, \ \mathbf{p}_{n+1}^{(l)} e^{\zeta_{n+1} x}, \dots, \mathbf{p}_{N-1}^{(l)} e^{\zeta_{N-1} x} \Bigr\}.
\end{aligned}
\tag{3.15}
\]
For the free case where $\mathbf{V}(x) = 0$, we have the exact equality
\[
m_{l,N}^{(0)}(x) = \det\Bigl\{ \mathbf{p}_1^{(l)} e^{\zeta_1 x}, \dots, \mathbf{p}_n^{(l)} e^{\zeta_n x}, \ \mathbf{p}_{n+1}^{(l)} e^{\zeta_{n+1} x}, \dots, \mathbf{p}_{N-1}^{(l)} e^{\zeta_{N-1} x} \Bigr\}.
\tag{3.16}
\]
Similarly to (2.27), we find that determinant (3.14) equals
\[
\det \mathbf{T}_+ \det\Bigl\{ \mathbf{p}_1^{(l)} e^{\zeta_1 x}, \dots, \mathbf{p}_n^{(l)} e^{\zeta_n x}, \ \mathbf{p}_{n+1}^{(l)} e^{\zeta_{n+1} x}, \dots, \mathbf{p}_{N-1}^{(l)} e^{\zeta_{N-1} x} \Bigr\}.
\tag{3.17}
\]
By virtue of (2.27) expression (3.17) divided by $\mathsf{W}(x)$ equals expression (3.16) divided by $\mathsf{W}_0(x)$. Thus for the proof of asymptotics (3.12) it remains to estimate determinants (3.15) by $Ce^{-\rho_+ x}$. This is similar to the proof of Proposition 3.1. It suffices to consider terms corresponding to different values of index $k$ in different sums. Therefore, up to some factors not depending on $x$, determinants (3.15) consist of the terms
\[
\det\Bigl\{ \mathbf{p}_N^{(l)} e^{\zeta_N x}, \ \mathbf{p}_{k_1}^{(l)} e^{\zeta_{k_1} x}, \dots, \mathbf{p}_{k_{n-1}}^{(l)} e^{\zeta_{k_{n-1}} x}, \ \mathbf{p}_{n+1}^{(l)} e^{\zeta_{n+1} x}, \dots, \mathbf{p}_{N-1}^{(l)} e^{\zeta_{N-1} x} \Bigr\}
\tag{3.18}
\]
where the indices $k_1, \dots, k_{n-1}$ take the values $1, \dots, n$ and $k_p \neq k_q$ if $p \neq q$. Evidently, (3.18) equals expression (3.8) for some $j = 1, \dots, n$ and a number $c_j$ which does not depend on $x$. In view of formulas (2.10) and (2.27) after division by $\mathsf{W}(x)$ this expression is $O(e^{-\rho_+ x})$ as $x \to +\infty$. This concludes the proof of asymptotics (3.12). □

In view of equality (2.11), in terms of columns $\tilde{\mathbf{u}}_j(x)$ of the matrix $\tilde{\mathbf{U}}(x)$ relations (3.12) can equivalently be rewritten as (cf. (3.9))
\[
\begin{aligned}
\tilde{\mathbf{u}}_j(x) - \mathbf{p}_j^* e^{-\bar{\zeta}_j x} &= O(e^{-\rho_+ x}), & x &\to +\infty, & j &= n+1, \dots, N, \\
\tilde{\mathbf{u}}_j(x) - \mathbf{p}_j^* e^{-\bar{\zeta}_j x} &= O(e^{\rho_- x}), & x &\to -\infty, & j &= 1, \dots, n.
\end{aligned}
\tag{3.19}
\]
We emphasize that the functions $\mathbf{p}_j^* e^{-\bar{\zeta}_j x}$ exponentially grow at infinity while the remainders in formulas (3.19) exponentially decay.

3.3. Let us, finally, verify that the resolvent kernel defined by equality (2.19) satisfies a natural estimate
\[
|\mathbf{R}(x, y)| \leq Ce^{-\rho|x - y|}, \qquad \rho = \min\{\rho_+, \rho_-\},
\tag{3.20}
\]
with a constant $C$ not depending on $x$ and $y$.¹ It suffices to check that
\[
|\mathbf{U}(x)\mathbf{P}_\pm\mathbf{U}^{-1}(y)| \leq Ce^{-\rho_\pm|x - y|}, \qquad \pm(y - x) \geq 0.
\tag{3.21}
\]
We can suppose that $\mathbf{U}(x)$ is defined by formula (2.3) where the solutions $\mathbf{u}_j(x)$ of equation (2.2) satisfy condition (2.7). Note that
\[
|\mathbf{U}(x)\mathbf{P}_\pm| \leq Ce^{-\rho_\pm|x|}, \qquad \mp x \geq 0.
\tag{3.22}
\]
Passing here to the dual problem and using that $\tilde{\rho}_\pm = \rho_\mp$, we see that
\[
|\mathbf{P}_\pm\mathbf{U}^{-1}(y)| = |\mathbf{U}^*(y)^{-1}\mathbf{P}_\pm| = |\tilde{\mathbf{U}}(y)\tilde{\mathbf{P}}_\mp| \leq Ce^{-\rho_\pm|y|}, \qquad \pm y \geq 0.
\tag{3.23}
\]
Let us prove estimate (3.21), for example, for the upper sign. If $x \leq 0$ and $y \geq 0$, then (3.21) directly follows from (3.22) and (3.23). Suppose next that $x \leq 0$ and $y \leq 0$. According to formula (2.18) it suffices to check that
\[
|u_{k,j}(x) g_{j,l}(y)| \leq C|e^{\zeta_j(x - y)}|
\tag{3.24}
\]
for all $k, l = 1, \dots, N$ and $j = 1, \dots, n$. Recall that $|u_{k,j}(x)| \leq C|e^{\zeta_j x}|$ and that $|g_{j,l}(y)| \leq C|e^{-\zeta_j y}|$ by Proposition 3.2. This yields (3.24) and hence (3.21) for $x \leq 0$, $y \leq 0$ and the upper sign.

Similarly, we obtain that estimate (3.21) is true for $x \geq 0$, $y \geq 0$ and the lower sign. Using this estimate for the dual problem we see that
\[
|\tilde{\mathbf{U}}(x)\tilde{\mathbf{P}}_-\tilde{\mathbf{U}}^{-1}(y)| \leq Ce^{-\tilde{\rho}_-|x - y|}, \qquad x \geq 0, \ y \geq 0, \ x \geq y.
\]
Passing here to adjoint matrices and taking into account relations (3.11), we find that
\[
|\mathbf{U}(y)\mathbf{P}_+\mathbf{U}^{-1}(x)| \leq Ce^{-\rho_+|x - y|}, \qquad x \geq 0, \ y \geq 0, \ x \geq y.
\]
Interchanging $x$ and $y$, we get (3.21) for $x \geq 0$, $y \geq 0$ and the upper sign. Thus we have proven the following result.

Proposition 3.3. Let assumption (2.8) hold. Then the resolvent kernel (2.19) satisfies estimate (3.20).

Estimate (3.20) shows that formula (2.17) obtained for functions $\mathbf{f}$ with compact support extends to all $\mathbf{f} \in L^2(\mathbb{R}; \mathbb{C}^N)$.

¹ Here and below $C$ are different positive constants whose precise values are of no importance.
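For the free case $\mathbf{V} = 0$ both the jump relation (2.20) and estimate (3.20) can be observed numerically from formula (2.19) with $\mathbf{U} = \mathbf{U}_0$ as in (2.9). The sketch below assumes a hypothetical example with $N = 3$, $n = 2$: the eigenvalues in `zetas` and the choice of eigenvectors as columns $(1, \zeta_j, \zeta_j^2)^t$ are illustrative, and the constant in the bound is taken to be the condition number of the eigenvector matrix, which is one admissible choice of $C$ in this free example.

```python
import numpy as np

# Hypothetical eigenvalues ordered as in (1.5): Re > 0 first (n = 2), then Re < 0.
zetas = np.array([1.0 + 2.0j, 0.5 - 1.0j, -0.8 + 0.3j])
n = 2
P = np.vander(zetas, increasing=True).T   # column j is (1, zeta_j, zeta_j**2)^t
P_plus = np.diag([1.0, 1.0, 0.0])
P_minus = np.eye(3) - P_plus

rho_plus = min(zetas[:n].real)
rho_minus = min(abs(zetas[n:].real))
rho = min(rho_plus, rho_minus)
C = np.linalg.cond(P)


def U0(x):
    """Free fundamental matrix (2.9): column j equals p_j * exp(zeta_j * x)."""
    return P * np.exp(zetas * x)


def theta(s):
    """Heaviside function with theta(0) = 1, as in the text."""
    return 1.0 if s >= 0 else 0.0


def R0(x, y):
    """Free resolvent kernel, formula (2.19) with U = U0."""
    G = np.linalg.inv(U0(y))
    return (-U0(x) @ P_plus @ G * theta(y - x)
            + U0(x) @ P_minus @ G * theta(x - y))


x, eps = 0.7, 1e-8
jump = R0(x, x - eps) - R0(x, x + eps)     # approximates R(x, x-0) - R(x, x+0)
print(np.allclose(jump, np.eye(3), atol=1e-5))

for xx, yy in [(-2.0, 1.0), (1.0, -2.0), (0.5, 3.0), (3.0, 0.5)]:
    print(np.linalg.norm(R0(xx, yy), 2) <= C * np.exp(-rho * abs(xx - yy)))
```

In this free example the kernel depends on $x - y$ only, and each branch is a sum of terms $e^{\zeta_j (x - y)}$ that decay in the corresponding direction.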

4. Trace formula

4.1. Let us consider the differential operator (1.1) acting in the space $L^2(\mathbb R)$. Recall that the operator $H_0 = i^{-N} \partial^N$ is self-adjoint in the space $L^2(\mathbb R)$ on the domain $\mathcal D(H_0)$, which is the Sobolev class $\mathsf H^N(\mathbb R)$. The spectrum $\sigma(H_0)$ of $H_0$ coincides with $[0, \infty)$ for $N$ even and with $\mathbb R$ for $N$ odd. If the coefficients $v_j$, $j = 1, \dots, N$, of operator (1.14) belong locally to $L^2$, then the operator $H = H_0 + V$ is well defined by formula (1.1) at least on the class $C_0^\infty(\mathbb R)$. If all functions $v_j$ satisfy assumption (2.22), then according to Lemma 2.3 the operator $V R_0(z)$, $z \notin \sigma(H_0)$, is compact. Therefore the operator $H$ is closed on the domain $\mathcal D(H) = \mathcal D(H_0)$, and by virtue of the Weyl theorem its essential spectrum satisfies $\sigma_{\rm ess}(H) = \sigma_{\rm ess}(H_0)$. The spectrum $\sigma(H)$ of the operator $H$ in $\mathbb C \setminus \sigma_{\rm ess}(H_0)$ consists of eigenvalues (not necessarily real) of finite multiplicity which can accumulate only at $\sigma_{\rm ess}(H_0)$.

For the construction of the integral kernel of the resolvent $R(z) = (H - z)^{-1}$ of the operator $H$, we have to solve the equation

\[ i^{-N} \varphi^{(N)}(x) + (V\varphi)(x) = z \varphi(x) + f(x), \qquad z \notin \sigma(H), \quad f \in L^2(\mathbb R). \tag{4.1} \]

Let us rewrite it as a system of differential equations (2.12). We introduce vectors $\boldsymbol\varphi = (\varphi_1, \dots, \varphi_N)^t$ with components $\varphi_j = \varphi^{(j-1)}$, $\mathbf f = i^N (0, 0, \dots, 0, f)^t$ and set

\[ \mathbf L_0(z) = \begin{pmatrix} 0 & 1 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ i^N z & 0 & \cdots & 0 & 0 \end{pmatrix}, \tag{4.2} \]

\[ \mathbf V(x) = -i^N \begin{pmatrix} 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & 0 \\ v_1(x) & v_2(x) & \cdots & v_{N-1}(x) & v_N(x) \end{pmatrix}. \tag{4.3} \]

Then equation (4.1) is equivalent to vector equation (2.12) with the operator $\mathbf L(x, z)$ defined by equality (2.1). We emphasize that it now depends on the spectral parameter $z$. Matrix (4.2) has eigenvalues $\zeta_j$ such that $\zeta_j^N = i^N z$ with the corresponding eigenvectors

\[ \mathbf p_j(z) = (1, \zeta_j, \dots, \zeta_j^{N-1})^t. \tag{4.4} \]

It is easy to see that $n = N/2$ if $N$ is even, while for $N$ odd we have $n = (N-1)/2$ for $\operatorname{Im} z > 0$ and $n = (N+1)/2$ for $\operatorname{Im} z < 0$.
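The count of $n$ can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes that $n$ is the number of roots $\zeta_j$ with positive real part (consistent with the requirement below that $u_1, \dots, u_n$ belong to $L^2(\mathbb R_-)$), and the function names are hypothetical.

```python
import numpy as np

def companion_L0(N, z):
    """Matrix (4.2): ones on the superdiagonal, i^N z in the lower-left corner."""
    L0 = np.zeros((N, N), dtype=complex)
    for k in range(N - 1):
        L0[k, k + 1] = 1.0
    L0[N - 1, 0] = (1j ** N) * z
    return L0

def roots_zeta(N, z):
    """All N solutions of zeta^N = i^N z."""
    c = (1j ** N) * z
    r, phi = abs(c) ** (1.0 / N), np.angle(c)
    return r * np.exp(1j * (phi + 2 * np.pi * np.arange(N)) / N)

def count_positive_real_part(N, z):
    """Number of roots with Re zeta > 0 (assumed meaning of n)."""
    return int(np.sum(roots_zeta(N, z).real > 0))

if __name__ == "__main__":
    for N in range(1, 6):
        for z in (1 + 2j, 1 - 2j):
            print(N, z, count_positive_real_part(N, z))
```

Each root $\zeta$ also yields the eigenvector (4.4), since `companion_L0(N, z) @ zeta ** np.arange(N)` equals `zeta * zeta ** np.arange(N)` up to rounding.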


If the coefficients $v_j(x)$ have compact supports, then all results of Sections 2 and 3 apply. Now we have $\mathbf u_j(x, z) = (u_{1,j}(x, z), \dots, u_{N,j}(x, z))^t$ where the functions $u_j(x, z) := u_{1,j}(x, z)$ satisfy homogeneous equation (1.4) and $u_j^{(k-1)}(x, z) = u_{k,j}(x, z)$ for $k = 2, \dots, N$. Hence fundamental matrix (2.3) takes form (1.3). This matrix $\mathbf U(x, z)$ is admissible if the functions $u_1(x, z), \dots, u_n(x, z)$ belong to $L^2(\mathbb R_-)$ and the functions $u_{n+1}(x, z), \dots, u_N(x, z)$ belong to $L^2(\mathbb R_+)$. The Wronskian $\mathrm W(x, z) = \det \mathbf U(x, z)$ satisfies equations (2.5) and (2.6) where $\operatorname{tr} \mathbf L(x, z) = -i^N v_N(x)$. This yields (1.9).

Formula (2.17) means that

\[ \varphi^{(j-1)}(x) = i^N \int_{-\infty}^{\infty} R_{j,N}(x, y, z) f(y)\, dy. \]

In particular, we have

\[ R_{j,N}(x, y, z) = \partial_x^{j-1} R_{1,N}(x, y, z), \qquad j = 2, \dots, N. \tag{4.5} \]

Condition (2.8) is equivalent to the assumption that $z$ is not an eigenvalue of the operator $H$. Thus, Proposition 2.2 entails the following result.

Proposition 4.1. Suppose that $z \notin \sigma(H)$. Let the matrix $\mathbf R(x, y, z) = \{R_{j,k}(x, y, z)\}$ be defined by formula (2.19) where $\mathbf U(x, z)$ is an admissible fundamental matrix. Then the resolvent $R(z) = (H - z)^{-1}$ of the operator $H$ is the integral operator with kernel $R(x, y, z) = i^N R_{1,N}(x, y, z)$.

According to formula (2.18), Proposition 4.1 implies that

\[ \begin{aligned} R(x, y, z) &= -i^N \sum_{j=1}^{n} u_{1,j}(x, z)\, g_{j,N}(y, z), & x < y, \\ R(x, y, z) &= i^N \sum_{j=n+1}^{N} u_{1,j}(x, z)\, g_{j,N}(y, z), & x > y. \end{aligned} \tag{4.6} \]
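As a sanity check of representation (4.6), one can evaluate it for the free operator, where $u_{l,j}(x) = \zeta_j^{l-1} e^{\zeta_j x}$, and compare with a closed form. The sketch below makes several assumptions that are not in the text: the roots are ordered so that those with $\operatorname{Re} \zeta_j > 0$ come first, the function names are hypothetical, and the comparison for $N = 2$ uses the standard kernel $e^{-\kappa |x-y|}/(2\kappa)$, $\kappa^2 = -z$, $\operatorname{Re} \kappa > 0$, of $(-d^2/dx^2 - z)^{-1}$.

```python
import numpy as np

def sorted_roots(N, z):
    """Roots of zeta^N = i^N z, those with Re zeta > 0 first; also returns n."""
    c = (1j ** N) * z
    r, phi = abs(c) ** (1.0 / N), np.angle(c)
    zeta = r * np.exp(1j * (phi + 2 * np.pi * np.arange(N)) / N)
    zeta = zeta[np.argsort(-zeta.real)]
    return zeta, int(np.sum(zeta.real > 0))

def U0(x, zeta):
    """Free fundamental matrix: entry (l, j) is zeta_j^l e^{zeta_j x}, l = 0..N-1."""
    N = len(zeta)
    P = np.vander(zeta, N, increasing=True).T  # columns are the vectors (4.4)
    return P * np.exp(zeta * x)                # scale column j by e^{zeta_j x}

def free_kernel(x, y, z, N):
    """Kernel given by (4.6) with U = U0 and G = U0^{-1}."""
    zeta, n = sorted_roots(N, z)
    u_row = U0(x, zeta)[0, :]                      # u_{1,j}(x)
    g_col = np.linalg.inv(U0(y, zeta))[:, N - 1]   # g_{j,N}(y)
    iN = 1j ** N
    if x < y:
        return -iN * np.sum(u_row[:n] * g_col[:n])
    return iN * np.sum(u_row[n:] * g_col[n:])

def closed_form_N2(x, y, z):
    """Known kernel of (-d^2/dx^2 - z)^{-1} for z outside [0, infinity)."""
    kappa = np.sqrt(-z + 0j)  # principal root, Re kappa > 0 off the spectrum
    return np.exp(-kappa * abs(x - y)) / (2 * kappa)

if __name__ == "__main__":
    z = 1 + 2j
    print(free_kernel(-0.4, 0.7, z, 2), closed_form_N2(-0.4, 0.7, z))
```

The two printed numbers should agree up to rounding, on either side of the diagonal.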

It follows from relations (2.20) and (4.5) that, for $N \ge 2$, the function $R(x, y, z)$ as well as its derivatives $R_x^{(k)}(x, y, z)$, $k = 1, \dots, N - 2$, are continuous functions of $x$ and $y$ while

\[ R_x^{(N-1)}(x, x - 0, z) - R_x^{(N-1)}(x, x + 0, z) = i^N. \]

The case $N = 1$ is trivial. Although in this case the kernel $R(x, y, z)$ is not a continuous function, the difference $R(x, y, z) - R_0(x, y, z)$ is continuous and its diagonal values equal zero.

4.2. Let us find a convenient expression for integral (1.19). Since $\mathbf V$ does not depend on $z$, differentiating equation (2.4) in $z$, we have

\[ \dot{\mathbf U}'(x, z) = \mathbf L(x, z) \dot{\mathbf U}(x, z) + \dot{\mathbf L}_0(z) \mathbf U(x, z). \]


Here and below the dot stands for the derivative in $z$. Using formula (3.1), we now see that

\[ \frac{d}{dx} \big( \mathbf G(x, z) \dot{\mathbf U}(x, z) \big) = \mathbf G'(x, z) \dot{\mathbf U}(x, z) + \mathbf G(x, z) \dot{\mathbf U}'(x, z) = \mathbf G(x, z) \dot{\mathbf L}_0(z) \mathbf U(x, z) =: \mathbf A(x, z). \tag{4.7} \]

Next, we calculate the matrix $\mathbf A(x, z)$. It follows from formula (4.2) that

\[ \dot{\mathbf L}_0(z) = i^N \begin{pmatrix} 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \end{pmatrix} \]

and hence

\[ \mathbf B(x, z) := \dot{\mathbf L}_0(z) \mathbf U(x, z) = i^N \begin{pmatrix} 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \\ u_{1,1} & u_{1,2} & \cdots & u_{1,N} \end{pmatrix} \]

where (cf. (1.3) and (2.3)) $u_{1,j} = u_j$. Since $\mathbf A = \mathbf G \mathbf B$, this yields the relation

\[ a_{j,k} = \sum_{l=1}^{N} g_{j,l} b_{l,k} = g_{j,N} b_{N,k} = i^N g_{j,N} u_{1,k} \tag{4.8} \]

for the matrix elements $a_{j,k} = a_{j,k}(x, z)$ and $b_{j,k} = b_{j,k}(x, z)$ of the matrices $\mathbf A$ and $\mathbf B$. Putting together equalities (4.7) and (4.8), we obtain the relation

\[ i^N u_{1,k}(x, z)\, g_{j,N}(x, z) = \frac{d}{dx} \sum_{l=1}^{N} g_{j,l}(x, z)\, \dot u_{l,k}(x, z) \]

and, in particular,

\[ i^N u_{1,j}(x, z)\, g_{j,N}(x, z) = \frac{d}{dx} \sum_{l=1}^{N} g_{j,l}(x, z)\, \dot u_{l,j}(x, z). \tag{4.9} \]

Comparing formulas (4.6) and (4.9), we get two representations for diagonal values of the resolvent kernel:

\[ R(x, x, z) = -\frac{d}{dx} \sum_{j=1}^{n} \sum_{l=1}^{N} g_{j,l}(x, z)\, \dot u_{l,j}(x, z), \qquad R(x, x, z) = \frac{d}{dx} \sum_{j=n+1}^{N} \sum_{l=1}^{N} g_{j,l}(x, z)\, \dot u_{l,j}(x, z). \]

Integrating the first of these representations over an interval $(x_1, x)$ and the second over an interval $(x, x_2)$, we arrive at the following intermediary result.


Proposition 4.2. Under the assumptions of Proposition 4.1, for all $x_1, x_2, x \in \mathbb R$, the following representation holds:

\[ \int_{x_1}^{x_2} R(y, y, z)\, dy = -\sum_{j=1}^{N} \sum_{l=1}^{N} g_{j,l}(x, z)\, \dot u_{l,j}(x, z) + \sum_{j=n+1}^{N} \sum_{l=1}^{N} g_{j,l}(x_2, z)\, \dot u_{l,j}(x_2, z) + \sum_{j=1}^{n} \sum_{l=1}^{N} g_{j,l}(x_1, z)\, \dot u_{l,j}(x_1, z). \tag{4.10} \]
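Identity (4.10) can be tested numerically in the free case $\mathbf V = 0$, where everything is explicit. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $N \ge 2$ (so that the two diagonal representations agree), uses $u_{l,j}(x) = \zeta_j^{l-1} e^{\zeta_j x}$ with $d\zeta_j/dz = \zeta_j/(Nz)$, orders the roots with $\operatorname{Re} \zeta_j > 0$ first, and evaluates the left-hand side by a plain trapezoidal rule. All function names are hypothetical.

```python
import numpy as np

def sorted_roots(N, z):
    """Roots of zeta^N = i^N z, those with Re zeta > 0 first; also returns n."""
    c = (1j ** N) * z
    r, phi = abs(c) ** (1.0 / N), np.angle(c)
    zeta = r * np.exp(1j * (phi + 2 * np.pi * np.arange(N)) / N)
    zeta = zeta[np.argsort(-zeta.real)]
    return zeta, int(np.sum(zeta.real > 0))

def U_and_Udot(x, zeta, z):
    """Free fundamental matrix U and its z-derivative; row index l = 0..N-1."""
    N = len(zeta)
    l = np.arange(N)[:, None]
    zj = zeta[None, :]
    U = zj ** l * np.exp(zj * x)
    dzeta = zj / (N * z)             # from zeta^N = i^N z
    Udot = dzeta * (l / zj + x) * U  # d/dz of zeta^l e^{zeta x}
    return U, Udot

def S(x, zeta, z, idx):
    """Sum over j in idx and all l of g_{j,l}(x) * du_{l,j}/dz (x)."""
    U, Udot = U_and_Udot(x, zeta, z)
    return np.diag(np.linalg.inv(U) @ Udot)[idx].sum()

def diagonal_kernel(y, zeta, n):
    """R(y, y, z) from the first line of (4.6)."""
    N = len(zeta)
    l = np.arange(N)[:, None]
    U = zeta[None, :] ** l * np.exp(zeta[None, :] * y)
    G = np.linalg.inv(U)
    return -(1j ** N) * np.sum(U[0, :n] * G[:n, N - 1])

def both_sides(N, z, x1, x, x2, m=400):
    """Left- and right-hand sides of (4.10) for the free operator."""
    zeta, n = sorted_roots(N, z)
    ys = np.linspace(x1, x2, m + 1)
    vals = np.array([diagonal_kernel(y, zeta, n) for y in ys])
    h = (x2 - x1) / m
    lhs = h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))  # trapezoidal rule
    rhs = (-S(x, zeta, z, slice(0, N))
           + S(x2, zeta, z, slice(n, N))
           + S(x1, zeta, z, slice(0, n)))
    return lhs, rhs

if __name__ == "__main__":
    print(both_sides(3, 1 + 2j, -0.8, 0.1, 1.3))
```

For the free operator the first term on the right does not depend on $x$, which is the numerical counterpart of the observation made after (4.11).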

Let us consider the first term in the right-hand side of (4.10). Since

\[ \frac{d}{dz} \det \mathbf U(x, z) = \det \mathbf U(x, z)\, \operatorname{tr} \big( \mathbf U(x, z)^{-1} \dot{\mathbf U}(x, z) \big), \]

we see that

\[ \sum_{j=1}^{N} \sum_{l=1}^{N} g_{j,l}(x, z)\, \dot u_{l,j}(x, z) = \operatorname{tr} \big( \mathbf G(x, z) \dot{\mathbf U}(x, z) \big) = \mathrm W(x, z)^{-1} \dot{\mathrm W}(x, z) \tag{4.11} \]

where the function $\mathrm W(x, z) = \det \mathbf U(x, z)$. Observe that according to (1.9) this expression does not depend on $x$, which is consistent with formula (4.10).

For $N = 2$, identity (4.10) reduces to formula (1.26) of paper [7]. For $N > 2$, it is probably new. Identity (4.10) allows us to considerably simplify the calculation of $\operatorname{Tr} \big( R(z) - R_0(z) \big)$ compared to the presentation of book [13] for $N = 2$. This is essential for an arbitrary $N$.

4.3. Now we suppose that the solutions $\mathbf u_j(x, z)$ of equation (2.2) are distinguished by condition (2.7), which yields condition (1.6) on the solutions $u_j(x, z)$ of equation (1.4); $\mathrm W(x, z)$ is the Wronskian (1.7). Different objects corresponding to the "free" operator $H_0 = i^{-N} \partial^N$ will be endowed with index 0 (upper or lower). For the free case, we put $u_j^{(0)}(x) = e^{\zeta_j x}$, $j = 1, \dots, N$. Let $\mathbf U_0(x, z)$ be the corresponding fundamental matrix (2.9) with the eigenvectors $\mathbf p_j(z)$ defined by (4.4), and let $\mathrm W_0(z)$ be its determinant (1.8). The normalized Wronskian $\Delta(x, z)$ is defined by formula (1.10). We denote by $g_{j,l}(x, z)$ and $g_{j,l}^{(0)}(x, z)$ the matrix elements of the matrices $\mathbf G(x, z) = \mathbf U^{-1}(x, z)$ and $\mathbf G_0(x, z) = \mathbf U_0^{-1}(x, z)$, respectively.

For the proof of the trace formula, we combine Propositions 3.2 and 4.2. Indeed, let us subtract from equality (4.10) the same equality for the resolvent $R_0(z) = (H_0 - z)^{-1}$ and consider

\[ \int_{x_1}^{x_2} \big( R(y, y, z) - R_0(y, y, z) \big)\, dy. \tag{4.12} \]

Since $\zeta_j^N = i^N z$, for all $l = 1, \dots, N$, we have

\[ \dot u_{l,j}(x, z) = \dot u_{l,j}^{(0)}(x, z) = \frac{d}{dz} \big( \zeta_j^{l-1} e^{\zeta_j x} \big) = i^N N^{-1} \zeta_j^{-N+l-1} (l - 1 + x \zeta_j) e^{\zeta_j x} \]

if either $j = 1, \dots, n$ and $x \ll 0$ or $j = n + 1, \dots, N$ and $x \gg 0$.


According to condition (1.5), it now directly follows from relations (3.12) that, for all $l = 1, \dots, N$,

\[ g_{j,l}(x, z)\, \dot u_{l,j}(x, z) - g_{j,l}^{(0)}(x, z)\, \dot u_{l,j}^{(0)}(x, z) = O\big( x e^{-(\rho_+ + \rho_-) |x|} \big) \]

if either $j = 1, \dots, n$ and $x \to -\infty$ or $j = n + 1, \dots, N$ and $x \to +\infty$. Therefore, using equality (4.10), we see that the contribution to (4.12) of the terms depending on $x_1$ disappears in the limit $x_1 \to -\infty$ and the contribution of the terms depending on $x_2$ disappears in the limit $x_2 \to +\infty$. Thus, taking into account equality (4.11), we obtain the following result.

Theorem 4.3. Suppose that the coefficients $v_1, \dots, v_N$ of the operator $H$ have compact supports. Then for $z \notin \sigma(H)$, the limit in the left-hand side exists and

\[ \lim_{x_1 \to -\infty,\ x_2 \to +\infty} \int_{x_1}^{x_2} \big( R(y, y, z) - R_0(y, y, z) \big)\, dy = -\Delta(x, z)^{-1}\, \frac{d\Delta(x, z)}{dz} \tag{4.13} \]

where $x \in \mathbb R$ is arbitrary.

4.4. The left-hand side of (4.13) can of course be identified with the trace of the difference $R(z) - R_0(z)$. To that end, we first verify inclusion (1.18). We proceed from the following well-known result (see, e.g., survey [4] by M.Sh. Birman and M.Z. Solomyak).

Proposition 4.4. Suppose that

\[ \mathcal B^2 := \int_{-\infty}^{\infty} (1 + \xi^2)^{\alpha} |b(\xi)|^2\, d\xi < \infty, \qquad \mathcal V^2 := \int_{-\infty}^{\infty} (1 + x^2)^{\alpha} |v(x)|^2\, dx < \infty \tag{4.14} \]

for some $\alpha > 1/2$. Then the integral operator $T : L^2(\mathbb R; dx) \to L^2(\mathbb R; d\xi)$ with kernel (2.21) belongs to the trace class $\mathfrak S_1$ and its trace norm satisfies the estimate $\| T \|_{\mathfrak S_1} \le C \mathcal B \mathcal V$ where the constant $C$ depends on $\alpha > 1/2$ only.

Now it is easy to prove the following result.

Lemma 4.5. Under assumption (1.12) inclusion (1.18) holds.

Proof. Let us proceed from the resolvent identity

\[ R(z) - R_0(z) = -R_0(z) V R(z) = -\sum_{j=1}^{N} R_0(z)\, v_j\, \partial^{j-1} R(z), \qquad z \notin \sigma(H), \tag{4.15} \]

where $v_j$ is the operator of multiplication by the function $v_j(x)$. The operators $\partial^j R(z)$ are bounded because the operator $H_0 R(z)$ is bounded. Proposition 4.4 implies that $R_0(z) v_j \in \mathfrak S_1$ if $N > 1$. Indeed, let $\Phi : L^2(\mathbb R; dx) \to L^2(\mathbb R; d\xi)$ be the Fourier transform. Then the operator $\Phi R_0(z) v_j$ has integral kernel (2.21) where $b(\xi) = (2\pi)^{-1/2} (\xi^N - z)^{-1}$ and $v(x) = v_j(x)$. By virtue of (1.12), condition (4.14) is satisfied in this case. If $N = 1$, then we can use that the operator $|v_1|^{1/2} R_0(z)$ belongs to the Hilbert-Schmidt class. Thus all terms in the right-hand side of (4.15) belong to the trace class. □


Since the kernel $R(x, y, z)$ (and $R_0(x, y, z)$) is a continuous function of $x$, $y$ and the limit in the left-hand side of (4.13) exists, we see (see, e.g., [13], Proposition 3.1.6) that

\[ \operatorname{Tr} \big( R(z) - R_0(z) \big) = \int_{-\infty}^{\infty} \big( R(y, y, z) - R_0(y, y, z) \big)\, dy. \tag{4.16} \]

Putting together formulas (4.13) and (4.16), we get the trace formula (1.11) for the coefficients $v_1, \dots, v_N$ with compact supports.

5. Integral equations

Here we consider differential equation (1.4) with arbitrary short-range coefficients. Actually, we follow the scheme of Section 2 and first consider a more general equation (2.2).

5.1. As usual, we suppose that the eigenvalues $\zeta_j$, $j = 1, \dots, N$, of an $N \times N$ matrix $\mathbf L_0$ are distinct and do not lie on the imaginary axis. We set

\[ \mathbf P_j = \langle \cdot, \mathbf p_j^* \rangle\, \mathbf p_j, \qquad j = 1, \dots, N, \tag{5.1} \]

where $\mathbf p_j$ are eigenvectors of $\mathbf L_0$ and the vectors $\mathbf p_j^*$ form the dual basis. We have $\mathbf P_j^2 = \mathbf P_j$, $\mathbf P_j \mathbf P_k = 0$ if $j \ne k$, and

\[ \mathbf L_0 \mathbf P_j = \zeta_j \mathbf P_j, \qquad \sum_{j=1}^{N} \mathbf P_j = \mathbf I. \tag{5.2} \]
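Relations (5.2) are easy to confirm numerically. The following is a minimal sketch under assumptions of my own: the eigenvectors are taken from `numpy.linalg.eig`, the dual basis is realized by the rows of the inverse eigenvector matrix (acting as linear functionals), and `spectral_projections` is a hypothetical helper name.

```python
import numpy as np

def spectral_projections(L0):
    """Eigenvalues zeta_j and rank-one projections P_j of a matrix with distinct eigenvalues."""
    zeta, P = np.linalg.eig(L0)  # column j of P is an eigenvector p_j
    Pinv = np.linalg.inv(P)      # row j of Pinv acts as the dual functional
    projs = [np.outer(P[:, j], Pinv[j, :]) for j in range(len(zeta))]
    return zeta, projs

if __name__ == "__main__":
    N, z = 3, 1 + 2j
    L0 = np.zeros((N, N), dtype=complex)
    L0[np.arange(N - 1), np.arange(1, N)] = 1.0  # superdiagonal of (4.2)
    L0[N - 1, 0] = (1j ** N) * z
    zeta, projs = spectral_projections(L0)
    print(np.allclose(sum(projs), np.eye(N)))
    print(all(np.allclose(L0 @ Pj, zj * Pj) for zj, Pj in zip(zeta, projs)))
```

The same helper works for any diagonalizable matrix; the companion matrix (4.2) is used here only as a convenient test case.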

Let a matrix $\mathbf L(x)$ be given by formula (2.1) where we now assume that

\[ \mathbf V \in L^1(\mathbb R_\pm). \tag{5.3} \]

We shall show that, for all $j = 1, \dots, N$, equation (2.2) has solutions $\mathbf u_j^{(\pm)}(x)$ such that

\[ \mathbf u_j^{(\pm)}(x) = e^{\zeta_j x} \big( \mathbf p_j + o(1) \big), \qquad x \to \pm\infty. \tag{5.4} \]

Thus we construct solutions of (2.2) both (exponentially) decaying and growing at infinity. We emphasize that our construction of the functions $\mathbf u_j^{(+)}(x)$ (of $\mathbf u_j^{(-)}(x)$) requires condition (5.3) for $x \in \mathbb R_+$ (for $x \in \mathbb R_-$) only.

Functions $\mathbf u_j^{(\pm)}(x)$ will be defined as solutions of integral equations which we borrow from the book [6] (see Problem 29 of Chapter 3). For definiteness, we consider the case $x \to -\infty$ and put $\mathbf u_j = \mathbf u_j^{(-)}$. Set

\[ \mathbf K_j(x) = \sum_{m : \kappa_m > \kappa_j} \mathbf P_m e^{\zeta_m x} \theta(-x) - \sum_{m : \kappa_m \le \kappa_j} \mathbf P_m e^{\zeta_m x} \theta(x), \tag{5.5} \]

where $\kappa_m = \operatorname{Re} \zeta_m$. It follows from relations (5.2) that

\[ \mathbf K_j'(x) = \mathbf L_0 \mathbf K_j(x) - \delta(x) \mathbf I, \tag{5.6} \]


where $\delta(x)$ is the Dirac function. We also use the estimate

\[ |\mathbf K_j(x)| \le C_j \Big( \sum_{m : \kappa_m > \kappa_j} e^{\kappa_m x} \theta(-x) + e^{\kappa_j x} \theta(x) \Big), \tag{5.7} \]

which is a direct consequence of definition (5.5). In particular, we see that

\[ |\mathbf K_j(x)| \le C_j e^{\kappa_j x}, \qquad \forall x \in \mathbb R. \tag{5.8} \]

Let $\chi_X$ be the characteristic function of an interval $X$. Consider the integral equation

\[ \mathbf u_j(x) = e^{\zeta_j x} \mathbf p_j - \int_{-\infty}^{\infty} \mathbf K_j(x - y) \mathbf V(y) \chi_{(-\infty, a)}(y) \mathbf u_j(y)\, dy, \qquad x < a, \tag{5.9} \]

for a function $\mathbf u_j(x) = \mathbf u_j(x; a)$ depending on the parameter $a$ which will be chosen later. If $\kappa_j = \max_m \kappa_m$, then the first sum in (5.5) is absent. In this case we can omit $\chi_{(-\infty, a)}(y)$ in (5.9) so that (5.9) becomes a Volterra integral equation. However, (5.9) is only a Fredholm equation for other values of $j$.

Suppose that a function $\mathbf u_j(x)$ satisfies the estimate $\mathbf u_j(x) = O(e^{\kappa_j x})$ as $x \to -\infty$ and equation (5.9). Differentiating (5.9) and using (5.6) we see that

\[ \mathbf u_j'(x) = \zeta_j e^{\zeta_j x} \mathbf p_j + \mathbf V(x) \chi_{(-\infty, a)}(x) \mathbf u_j(x) - \int_{-\infty}^{\infty} \mathbf L_0 \mathbf K_j(x - y) \mathbf V(y) \chi_{(-\infty, a)}(y) \mathbf u_j(y)\, dy. \tag{5.10} \]

Putting together equations (5.9) and (5.10) we find that a solution $\mathbf u_j(x)$ of integral equation (5.9) satisfies also the differential equation

\[ \mathbf u_j'(x) = \mathbf L_0 \mathbf u_j(x) + \mathbf V(x) \chi_{(-\infty, a)}(x) \mathbf u_j(x), \]

which reduces to equation (2.2) for $x < a$. Let us set

\[ \mathbf u_j(x; a) = e^{\zeta_j x} \mathbf w_j(x; a), \qquad x < a, \tag{5.11} \]

and rewrite equation (5.9) as

\[ \mathbf w_j(x; a) = \mathbf p_j - \int_{-\infty}^{a} \mathbf K_j(x - y) e^{-\zeta_j (x - y)} \mathbf V(y) \mathbf w_j(y; a)\, dy. \tag{5.12} \]

By virtue of assumption (5.3) and estimate (5.8) we can choose the parameter $a$ such that

\[ \int_{-\infty}^{a} \big| \mathbf K_j(x - y) e^{-\zeta_j (x - y)} \mathbf V(y) \big|\, dy \le C \int_{-\infty}^{a} \big| \mathbf V(y) \big|\, dy < 1, \qquad \forall x \in \mathbb R. \tag{5.13} \]

Then equation (5.12) can be solved in the space $L^\infty((-\infty, a); \mathbb C^N)$ by the method of successive approximations. This result can also be reformulated in the following way. Let $\mathbb Q_j(a)$ be the integral operator with kernel

\[ \mathbf Q_j(x, y) = \mathbf K_j(x - y) e^{-\zeta_j (x - y)} \mathbf V(y) \tag{5.14} \]


acting in the space $L^\infty((-\infty, a); \mathbb C^N)$. Then

\[ \mathbf w_j(a) = \big( I - \mathbb Q_j(a) \big)^{-1} \mathbf p_j \tag{5.15} \]

where the inverse operator exists because $\| \mathbb Q_j(a) \| < 1$. Clearly, the function $\mathbf u_j(x; a)$ defined by formula (5.11) satisfies differential equation (2.2) for $x < a$. Since a solution of a differential equation of first order is determined uniquely by its value at one point, it suffices to require equality (5.11) only for one $x < a$. Then the corresponding solution can be extended to all $x \in \mathbb R$. Now we are in a position to give the precise definition.

Definition 5.1. Let $\mathbf w_j^{(-)}(\cdot\,; a_-) \in L^\infty((-\infty, a_-); \mathbb C^N)$, $j = 1, \dots, N$, be the solution of equation (5.12) where $a = a_-$ is a sufficiently large negative number. We define $\mathbf u_j^{(-)}(x; a_-)$ as the solution of differential equation (2.2) which satisfies condition (5.11) for some (and then for all) $x < a$. The solutions $\mathbf u_j^{(+)}(x; a_+)$, $j = 1, \dots, N$, are defined quite similarly if $(-\infty, a_-)$ is replaced by $(a_+, \infty)$ where $a_+$ is a sufficiently large positive number.

It remains to verify asymptotics (5.4) for the function $\mathbf u_j(x)$. According to (5.11) and (5.12) it suffices to check that the integral in the right-hand side of (5.12) tends to zero as $x \to -\infty$. Using estimate (5.7) and the inclusion $\mathbf w_j \in L^\infty((-\infty, a_-); \mathbb C^N)$, we see that this integral is bounded by

\[ C \int_{-\infty}^{x} \big| \mathbf V(y) \big|\, dy + C \sum_{m : \kappa_m > \kappa_j} \int_{x}^{a} e^{(\kappa_m - \kappa_j)(x - y)} \big| \mathbf V(y) \big|\, dy. \tag{5.16} \]

The first integral here tends to zero as $x \to -\infty$ by virtue of condition (5.3). Each of the integrals over $(x, a)$ can be estimated by

\[ e^{(\kappa_m - \kappa_j) x / 2} \int_{x/2}^{a} \big| \mathbf V(y) \big|\, dy + \int_{x}^{x/2} \big| \mathbf V(y) \big|\, dy. \]

Since $\kappa_m > \kappa_j$, this expression tends to zero as $x \to -\infty$, again by virtue of condition (5.3). Thus we arrive at the following result.

Proposition 5.2. Let assumption (5.3) hold, and let $a_+$ ($a_-$) be a sufficiently large positive (negative) number. Then, for all $j = 1, \dots, N$, the functions $\mathbf u_j^{(\pm)}(x; a_\pm)$ (see Definition 5.1) satisfy equation (2.2) and have asymptotics (5.4).

Solutions of equation (2.2) are of course not determined uniquely by asymptotics (5.4). In particular, the solutions $\mathbf u_j^{(\pm)}(x; a_\pm)$ generically depend on the choice of the parameter $a_\pm$.

Let $\mathbf u_j(x; a, r) = \mathbf u_j^{(\pm)}(x; a_\pm, r)$ be the function constructed above for the cut-off coefficient $\mathbf V_r(x) = \chi_{(-r, r)}(x) \mathbf V(x)$; thus function (5.15) is now replaced by

\[ \mathbf w_j(a, r) = \big( I - \mathbb Q_j(a) \chi_{(-r, r)} \big)^{-1} \mathbf p_j. \tag{5.17} \]

Since

\[ \lim_{r \to \infty} \big\| \mathbb Q_j(a) \big( 1 - \chi_{(-r, r)} \big) \big\|_{L^\infty(-\infty, a)} = 0, \]

we see that $\mathbf u_j(x; a, r) \to \mathbf u_j(x, a)$ as $r \to \infty$ for all fixed $x < a$. This relation extends to all $x \in \mathbb R$ because solutions of differential equations depend continuously on initial data. Therefore Proposition 5.2 can be supplemented by the following result.

Lemma 5.3. Under the assumptions of Proposition 5.2, let $\mathbf u_j^{(\pm)}(x; a_\pm)$ and $\mathbf u_j^{(\pm)}(x; a_\pm, r)$ be the solutions of equations (2.2) with $\mathbf V(x)$ and $\mathbf V_r(x)$, respectively, specified in Definition 5.1. Then for all $j = 1, \dots, N$ the relation

\[ \lim_{r \to \infty} \mathbf u_j^{(\pm)}(x; a_\pm, r) = \mathbf u_j^{(\pm)}(x; a_\pm) \tag{5.18} \]

holds uniformly in $x$ on compact intervals of $\mathbb R$.
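The successive-approximation step of subsection 5.1 can be illustrated in the simplest scalar situation. The sketch below is an assumption-laden toy, not the general construction: it takes $N = 1$, so that there is a single eigenvalue, the first sum in (5.5) is absent and (5.12) reduces to the Volterra equation $w(x) = 1 + \int_{-\infty}^{x} V(y) w(y)\, dy$ with exact solution $w(x) = \exp \int_{-\infty}^{x} V(y)\, dy$. The coefficient $V(y) = 0.6\, e^{-y^2}$, the truncation of the half-line to $[-8, 0]$ and all names are hypothetical choices made for the example.

```python
import math
import numpy as np

def cumulative_trapezoid(f, h):
    """Running trapezoidal integral of samples f on a uniform grid with step h."""
    out = np.zeros_like(f)
    out[1:] = np.cumsum(0.5 * h * (f[1:] + f[:-1]))
    return out

def picard_solution(V, h, n_iter=60):
    """Iterate w -> 1 + int_{-L}^{x} V(y) w(y) dy starting from w = 1."""
    w = np.ones_like(V)
    for _ in range(n_iter):
        w = 1.0 + cumulative_trapezoid(V * w, h)
    return w

def demo(a=0.0, L=8.0, m=4000):
    """Return the grid, the Picard iterate and the exact solution on [-L, a]."""
    x = np.linspace(-L, a, m + 1)
    h = (a + L) / m
    V = 0.6 * np.exp(-x ** 2)  # integrable, L^1 norm on (-inf, 0) is below 1
    w = picard_solution(V, h)
    erf = np.vectorize(math.erf)
    exact = np.exp(0.6 * (math.sqrt(math.pi) / 2.0) * (1.0 + erf(x)))
    return x, w, exact

if __name__ == "__main__":
    x, w, exact = demo()
    print(np.max(np.abs(w - exact)))
```

The iterate tends to 1 as $x \to -\infty$, in line with asymptotics (5.4); the function $u(x) = e^{\zeta x} w(x)$ then solves $u' = (\zeta + V(x)) u$ for any fixed $\zeta$.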

5.2. If a function $\mathbf u_j^{(\pm)}(x)$ satisfies equation (2.2) and has asymptotics (5.4), then adding to $\mathbf u_j^{(\pm)}(x)$ a solution with a more rapid decay (or less rapid growth) as $x \to \pm\infty$ we obtain again a solution of equation (2.2) with the same asymptotics (5.4). It is natural to expect that this procedure exhausts the arbitrariness in the definition of $\mathbf u_j^{(\pm)}(x)$. The precise result will be formulated in Lemma 5.5.

The following assertion is almost obvious.

Lemma 5.4. Suppose that solutions $\mathbf u_1^{(\pm)}, \dots, \mathbf u_N^{(\pm)}$ of the differential equation (2.2) have asymptotics (5.4) as $x \to \pm\infty$. Then for each of the signs "$\pm$" the functions $\mathbf u_1^{(\pm)}, \dots, \mathbf u_N^{(\pm)}$ are linearly independent.

Proof. It follows from (5.4) that

\[ \det\{ \mathbf u_1^{(\pm)}(x), \dots, \mathbf u_N^{(\pm)}(x) \} = \det\{ \mathbf p_1, \dots, \mathbf p_N \} \exp\Big( \sum_{j=1}^{N} \zeta_j x \Big) \big( 1 + o(1) \big) \]

as $x \to \pm\infty$. Since $\det\{ \mathbf p_1, \dots, \mathbf p_N \} \ne 0$, this expression is not zero for sufficiently large $\pm x$. □

Lemma 5.5. Suppose that solutions $\mathbf u_1^{(\pm)}, \dots, \mathbf u_N^{(\pm)}$ of the differential equation (2.2) have asymptotics (5.4) as $x \to \pm\infty$. Let $\tilde{\mathbf u}_j^{(\pm)}$ be an arbitrary solution of (2.2) with asymptotics (5.4) as $x \to \pm\infty$. Then necessarily

\[ \tilde{\mathbf u}_j^{(\pm)}(x) = \mathbf u_j^{(\pm)}(x) + \sum_{\pm(\kappa_l - \kappa_j) < 0} c_{j,l}^{(\pm)} \mathbf u_l^{(\pm)}(x) \tag{5.19} \]

for some numbers $c_{j,l}^{(\pm)}$.


Proof. As before, we set $\mathbf u_j(x) = \mathbf u_j^{(-)}(x)$. According to Lemma 5.4 we have

\[ \tilde{\mathbf u}_j(x) = \sum_{l=1}^{N} c_{j,l} \mathbf u_l(x) \tag{5.20} \]

with some numbers $c_{j,l}$. Therefore it follows from asymptotic relations (5.4) that

\[ e^{\zeta_j x} \big( \mathbf p_j + o(1) \big) = \sum_{\kappa_l \le \kappa_j} c_{j,l} e^{\zeta_l x} \big( \mathbf p_l + o(1) \big) + o(e^{\kappa_j x}) \]

as $x \to -\infty$. Since the vectors $\mathbf p_1, \dots, \mathbf p_N$ are linearly independent, this relation implies that $c_{j,l} = 0$ if $l \ne j$ and $c_{j,j} = 1$. Thus equality (5.20) leads to (5.19). □

5.3. Let us return to differential equation (1.4) with coefficients satisfying the assumption

\[ v_j \in L^1(\mathbb R_\pm), \qquad j = 1, \dots, N, \tag{5.21} \]

only. Define as usual the matrices $\mathbf L_0(z)$ and $\mathbf V(x)$ by formulas (4.2) and (4.3). Now $\zeta_j^N = i^N z$ and the vectors $\mathbf p_j(z)$, $j = 1, \dots, N$, are given by formula (4.4).

It is easy to control the dependence on $z$ of matrices (5.1).

Lemma 5.6. Elements $p_{k,l}^{(j)}(z)$ of the matrices $\mathbf P_j(z)$, $j = 1, \dots, N$, obey the relation

\[ p_{k,l}^{(j)}(z) = O(|\zeta|^{k-l}), \qquad |\zeta|^N = |z| \to \infty. \tag{5.22} \]

Proof. Obviously, the basis dual to $\mathbf p_j(z)$ consists of the vectors

\[ \mathbf p_j^*(z) = (c_{j,1}, c_{j,2} \zeta_j^{-1}, \dots, c_{j,N} \zeta_j^{-N+1}) \]

where the numbers $c_{j,l}$ do not depend on $|z|$. It now follows from equality (4.4) that $p_{k,l}^{(j)}(z) = \bar c_{j,l}\, \zeta_j^{k-1}\, \bar\zeta_j^{-l+1}$, which yields (5.22). □

The next step is to control the dependence on $z$ of matrices (5.14).

Lemma 5.7. Elements $q_{k,l}^{(j)}(x, y, z)$ of the matrices $\mathbf Q_j(x, y, z)$, $j = 1, \dots, N$, admit the estimate

\[ |q_{k,l}^{(j)}(x, y, z)| \le C |\zeta|^{k-N} |v_l(y)| \tag{5.23} \]

with a constant $C$ not depending on $x$, $y$ and $z$.

Proof. It follows from (5.5) and (5.22) that elements $s_{k,l}^{(j)}(x, z)$ of the matrix $\mathbf K_j(x) e^{-\zeta_j x}$ satisfy the estimate (cf. (5.8))

\[ |s_{k,l}^{(j)}(x, z)| \le C |\zeta|^{k-l}. \]

This directly implies (5.23) because, by definition (4.3),

\[ q_{k,l}^{(j)}(x, y, z) = -i^N s_{k,N}^{(j)}(x - y, z)\, v_l(y). \]

□

Let $\mathbf u_j^{(\pm)}(x, z; a_\pm)$ (recall that $\mathbf u_j^{(\pm)} = (u_{1,j}^{(\pm)}, \dots, u_{N,j}^{(\pm)})^t$) be the solution of equation (2.2) specified in Definition 5.1. Then the function

\[ u_j^{(\pm)}(x, z; a_\pm) := u_{1,j}^{(\pm)}(x, z; a_\pm) \tag{5.24} \]

satisfies also equation (1.4) and, according to (4.4), asymptotics (5.4) imply asymptotics (1.13). Therefore Proposition 5.2 and Lemma 5.3 entail the following result.

Proposition 5.8. Let assumption (5.21) hold, let $|z| \ge c > 0$ and let $a_+ = a_+(c)$ ($a_- = a_-(c)$) be a sufficiently large positive (negative) number. Then for every $j = 1, \dots, N$ the function $u_j^{(\pm)}(x, z; a_\pm)$ determined by Definition 5.1 and equality (5.24) satisfies equation (1.4) and has asymptotics (1.13) as $x \to \pm\infty$. Moreover, the corresponding solutions $u_j^{(\pm)}(x, z; a_\pm, r)$ of equation (1.4) with cut-off coefficients $\chi_{(-r, r)}(x) v_k(x)$, $k = 1, \dots, N$, satisfy the relation

\[ \lim_{r \to \infty} u_j^{(\pm)}(x, z; a_\pm, r) = u_j^{(\pm)}(x, z; a_\pm) \tag{5.25} \]

uniformly in $x$ on compact intervals of $\mathbb R$. This relation remains true for $N - 1$ derivatives of the functions $u_j^{(\pm)}$.

By definition (5.5), the kernels $\mathbf K_j(x, z)$ depend analytically on $z$ except on the rays where $\operatorname{Re} \zeta_l = \operatorname{Re} \zeta_j$ for some root $\zeta_l \ne \zeta_j$ of the equation $\zeta^N = i^N z$. In addition to the condition $z \notin \sigma(H_0)$, this excludes also the half-line $z < 0$ for even $N$ and the line $\operatorname{Re} z = 0$ for odd $N$. Hence the same is true for the functions $u_j^{(\pm)}(x, z; a_\pm)$ if $|z| > c > 0$. Thus, for fixed $x$ and $a_\pm$, the functions $u_j^{(\pm)}(x, z; a_\pm)$ are analytic functions of $z$ if $|z| > c > 0$, $\operatorname{Im} z \ne 0$ for $N$ even and if $\operatorname{Im} z \ne 0$, $\operatorname{Re} z \ne 0$ for $N$ odd. On the rays where $\operatorname{Re} \zeta_l = \operatorname{Re} \zeta_j$, the limits of $u_j^{(\pm)}(x, z; a_\pm)$ from both sides exist but differ, in general, from each other by a term which decays faster (or grows less rapidly) than $e^{\zeta_j x}$ as $x \to +\infty$ or $x \to -\infty$.

5.4. Integral equations (5.9) turn out also to be useful (even for functions $v_j(x)$ with compact supports) for a study of asymptotics of the solutions $u_j^{(\pm)}(x, z; a_\pm)$ of differential equation (1.4) as $|z| \to \infty$. We choose the sign "$-$", fix the parameter $a = a_-$ and index $j$ and drop them out of notation.

Consider system (5.12) of $N$ equations for components $w_k(x, z)$, $k = 1, \dots, N$, of a vector-valued function $\mathbf w(x, z)$. Set $w_k(x, z) = \zeta^{k-1} \tilde w_k(x, z)$ and take into account equality (4.4). Then we obtain for $\tilde w_k(x, z)$ a system

\[ \tilde w_k(x, z) = 1 - \sum_{l=1}^{N} \zeta^{l-k} \int_{-\infty}^{a} q_{k,l}(x, y, z)\, \tilde w_l(y, z)\, dy, \qquad x < a, \tag{5.26} \]

where the elements $q_{k,l}$ of the matrix $\mathbf Q$ satisfy inequality (5.23). In particular, the operator in the right-hand side of (5.26) is uniformly bounded as $|z| \to \infty$.

Assume additionally that $v_N(x) = 0$. Then according to (5.23) the norm of the operator in the right-hand side of (5.26) is $O(|\zeta|^{-1})$ as $|z| \to \infty$. Therefore for sufficiently large $|\zeta|$ system (5.26) can be solved in the space $L^\infty((-\infty, a); \mathbb C^N)$


by the method of successive approximations and $\tilde w_k(x, z) = 1 + O(|\zeta|^{-1})$, $k = 1, \dots, N$. As we have already seen in the proof of Proposition 5.2, the solution of system (5.26) necessarily has the asymptotics $\tilde w_k(x, z) = 1 + o(1)$, $k = 1, \dots, N$, as $x \to -\infty$. Define as usual $u(x, z)$ as a solution of equation (1.4) such that $u(x, z) = e^{\zeta x} \tilde w_1(x, z)$ for $x < a$. Thus we obtain the following result.

Proposition 5.9. Let assumption (5.21) hold, and let $v_N = 0$. Fix arbitrary $a_\pm$. Then for all $j = 1, \dots, N$ and all sufficiently large $|z|$ equation (1.4) has solutions $u_j^{(\pm)}(x, z; a_\pm)$ with asymptotics (1.13) as $x \to \pm\infty$ and such that

\[ u_j^{(\pm)}(x, z; a_\pm) = e^{\zeta_j x} \big( 1 + O(|z|^{-1/N}) \big), \qquad |z| \to \infty, \tag{5.27} \]

for all $\pm(x - a_\pm) > 0$.

Remark 5.10. If assumption (5.21) is true for both signs, then we can set $a = +\infty$ in equation (5.26). For sufficiently large $|z|$, such an equation can again be solved by the method of successive approximations.

6. The Wronskian and the perturbation determinant

6.1. Let us define the Wronskian $\mathrm W(x)$ for differential equation (2.2) where the matrix-valued function $\mathbf L(x)$ is given by formula (2.1) and $\mathbf V(x)$ satisfies assumption (5.3) (for both signs) only. To justify the definition below, we start with the following observation.

Lemma 6.1. Suppose that both sets of solutions $\mathbf u_j$ and $\tilde{\mathbf u}_j$ of the differential equation (2.2) have asymptotics (5.4) as $x \to -\infty$ for $j = 1, \dots, n$ and as $x \to +\infty$ for $j = n + 1, \dots, N$. Then for all $x$

\[ \det\{ \mathbf u_1(x), \dots, \mathbf u_N(x) \} = \det\{ \tilde{\mathbf u}_1(x), \dots, \tilde{\mathbf u}_N(x) \}. \tag{6.1} \]

Proof. Let us proceed from Lemma 5.5. To simplify notation, we suppose that

\[ \kappa_1 \ge \dots \ge \kappa_n > \kappa_{n+1} \ge \dots \ge \kappa_N. \tag{6.2} \]

First, we check that for all $l = 1, \dots, n$,

\[ \det\{ \tilde{\mathbf u}_1(x), \dots, \tilde{\mathbf u}_N(x) \} = \det\{ \mathbf u_1(x), \dots, \mathbf u_l(x), \tilde{\mathbf u}_{l+1}(x), \dots, \tilde{\mathbf u}_N(x) \}. \tag{6.3} \]

For $l = 1$ this equality is obvious because necessarily $\tilde{\mathbf u}_1(x) = \mathbf u_1(x)$. Suppose that (6.3) is true for some $l$. Then using (5.19) we see that the left-hand side of (6.3) equals

\[ \det\Big\{ \mathbf u_1(x), \dots, \mathbf u_l(x),\ \mathbf u_{l+1}(x) + \sum_{\kappa_m > \kappa_{l+1}} c_{l+1,m} \mathbf u_m(x),\ \tilde{\mathbf u}_{l+2}(x), \dots, \tilde{\mathbf u}_N(x) \Big\}. \]

Since according to (6.2) the contribution of the sum over $\kappa_m > \kappa_{l+1}$ equals zero, this yields relation (6.3) for $l + 1$ and hence for all $l = 1, \dots, n$.


Quite similarly, we can verify that for all $l = N, \dots, n + 1$

\[ \det\{ \mathbf u_1(x), \dots, \mathbf u_n(x), \tilde{\mathbf u}_{n+1}(x), \dots, \tilde{\mathbf u}_N(x) \} = \det\{ \mathbf u_1(x), \dots, \mathbf u_n(x), \tilde{\mathbf u}_{n+1}(x), \dots, \tilde{\mathbf u}_l(x), \mathbf u_{l+1}(x), \dots, \mathbf u_N(x) \}. \tag{6.4} \]

Putting together equalities (6.3) and (6.4) for $l = n$, we get (6.1). □

Now we are in a position to define the Wronskian $\mathrm W(x)$.

Definition 6.2. Let $\mathbf u_j(x)$, $j = 1, \dots, N$, be arbitrary solutions of equation (2.2) with asymptotics (5.4) as $x \to -\infty$ if $j = 1, \dots, n$ and as $x \to \infty$ if $j = n + 1, \dots, N$. We set

\[ \mathrm W(x) = \det\{ \mathbf u_1(x), \dots, \mathbf u_N(x) \}. \tag{6.5} \]

Recall that solutions of equation (2.2) with asymptotics (5.4) exist according to Proposition 5.2. Although they are not unique, according to Lemma 6.1 the Wronskian $\mathrm W(x)$ does not depend (up to a numeration of eigenvalues $\zeta_j$) on a specific choice of such solutions. In particular, we have

\[ \mathrm W(x) = \det\{ \mathbf u_1^{(-)}(x; a_-), \dots, \mathbf u_n^{(-)}(x; a_-), \mathbf u_{n+1}^{(+)}(x; a_+), \dots, \mathbf u_N^{(+)}(x; a_+) \} \tag{6.6} \]

where $a_+$ ($a_-$) are sufficiently large positive (negative) numbers and the solutions $\mathbf u_j^{(\pm)}(x; a_\pm)$ are constructed in Proposition 5.2.

Of course, Definition 6.2 applies if the matrices $\mathbf L_0(z)$ and $\mathbf V(x)$ are given by formulas (4.2) and (4.3), respectively. In this case the Wronskian $\mathrm W(x, z)$ depends analytically on the parameter $z \notin \sigma(H_0)$. Indeed, if additionally $\operatorname{Im} z \ne 0$ for $N$ even and $\operatorname{Re} z \ne 0$ for $N$ odd, then this fact directly follows from the analyticity of the solutions $\mathbf u_j^{(\pm)}(x, z; a_\pm)$, $j = 1, \dots, N$ (see subs. 5.3). Moreover, according to Lemma 6.1 the Wronskian $\mathrm W(x, z)$ is continuous (in contrast to the solutions $\mathbf u_j^{(\pm)}(x, z; a_\pm)$) on the critical rays where $\operatorname{Re} \zeta_l = \operatorname{Re} \zeta_j$ for some $\zeta_l \ne \zeta_j$. Therefore its analyticity in a required region follows from Morera's theorem. Evidently, $\mathrm W(x, z) = 0$ if and only if $z$ is an eigenvalue of the operator $H$.

6.2. Let us return to the trace formula (1.11) established so far for coefficients $v_k(x)$, $k = 1, \dots, N$, with compact supports. Suppose that assumption (1.12) holds. Then condition (5.21) is satisfied for both signs. Let us approximate $v_k(x)$ by the cut-off functions $\chi_{(-r, r)}(x) v_k(x)$. We denote by $u_j^{(-)}(x, z; a_-, r)$, $j = 1, \dots, n$, and by $u_j^{(+)}(x, z; a_+, r)$, $j = n + 1, \dots, N$, the solutions of equation (1.4) with the coefficients $\chi_{(-r, r)} v_k$ determined by Definition 5.1 and equality (5.24). Let us use formula (6.6) for the Wronskian $\mathrm W_r(x, z)$ for equation (1.4) with cut-off coefficients $\chi_{(-r, r)}(x) v_k(x)$. Then it follows from relation (5.25) that

\[ \lim_{r \to \infty} \mathrm W_r(x, z) = \mathrm W(x, z). \tag{6.7} \]

In view of analyticity in $z$ of these functions we also have

\[ \lim_{r \to \infty} \dot{\mathrm W}_r(x, z) = \dot{\mathrm W}(x, z). \tag{6.8} \]


Set $H_r = H_0 + V_r$ where the operator $V_r$ is defined by formula (1.14) with the coefficients $\chi_{(-r, r)} v_k$. Let us write down formula (4.13) for the operator $H_r$ and pass to the limit $r \to \infty$. By virtue of (6.7) and (6.8) the right-hand side of (4.13) converges to the corresponding expression for the operator $H$. It is possible to verify that the same is true for the left-hand side of (4.13). We shall not, however, dwell upon it and instead establish the trace formula in form (1.11). Using the resolvent identity

\[ R(z) - R_r(z) = -R_r(z) (V - V_r) R(z), \qquad R_r(z) = (H_r - z)^{-1}, \]

we see that for $z \notin \sigma(H)$

\[ \| R(z) - R_r(z) \|_{\mathfrak S_1} \le C \| R_0(z) (V - V_r) R_0(z) \|_{\mathfrak S_1} \le C_1 \sum_{j=1}^{N} \| R_0(z) v_j (1 - \chi_r) \|_{\mathfrak S_1}. \]

According to Proposition 4.4 there is (for $N \ge 2$) the estimate

\[ \| R_0(z) v_j (1 - \chi_r) \|_{\mathfrak S_1}^2 \le C \int_{|x| \ge r} |v_j(x)|^2 (1 + x^2)^{\alpha}\, dx, \qquad \alpha > 1/2, \]

whence

\[ \lim_{r \to \infty} \| R_r(z) - R(z) \|_{\mathfrak S_1} = 0. \]

Thus, using trace formula (1.11) for cut-off perturbations $V_r$ and passing to the limit $r \to \infty$, we deduce it for $V$. This leads to the following result. Recall that the normalized Wronskian $\Delta(x, z)$ is defined by formula (1.10).

Theorem 6.3. Under assumption (1.12) the trace formula (1.11) holds for all $z \notin \sigma(H)$.

If inclusion (1.18) holds, then equation (1.16) is satisfied for a generalized perturbation determinant

\[ \tilde D_{z_0}(z) = \operatorname{Det} \big( I + (z - z_0) R(z_0) V R_0(z) \big) \tag{6.9} \]

where $z_0 \notin \sigma(H)$. It is easy to see that $\tilde D_{z_0}(z)$ is the usual perturbation determinant for the pair $R_0(z_0)$, $R(z_0)$ at the point $(z - z_0)^{-1}$. Of course, equation (1.16) for a function $\tilde D(z)$ fixes it up to a constant factor only. We note that for different "reference points", generalized perturbation determinants are connected by the formula $\tilde D_{z_1}(z) = \tilde D_{z_0}(z_1)^{-1} \tilde D_{z_0}(z)$. Moreover, if $V R_0(z) \in \mathfrak S_1$, then $\tilde D_{z_0}(z) = D(z_0)^{-1} D(z)$ where $D(z)$ is the perturbation determinant (see formula (1.15)) for the pair $H_0$, $H$.

Comparing equations (1.11) and (1.16) we see that for all $x \in \mathbb R$ and all $z_0 \notin \sigma(H)$

\[ \Delta(x, z) = C(x_0, z_0) \exp\Big( -i^N \int_{x_0}^{x} v_N(y)\, dy \Big) \tilde D_{z_0}(z) \]

where the constant $C(x_0, z_0) \ne 0$ does not depend on $x$ and $z$.


6.3. Suppose now that $v_N = 0$. Then the Wronskian $\mathrm W(x, z) =: \mathrm W(z)$ does not depend on $x$, and it is easy to deduce from Proposition 5.9 that

\[ \mathrm W(z) = \mathrm W_0(z) \big( 1 + O(|z|^{-1/N}) \big), \qquad |z| \to \infty. \tag{6.10} \]

For the proof, it suffices to choose $a_+ < 0$, $a_- > 0$ and use asymptotics (5.27) at $x = 0$. As a side remark, we note that according to (6.10) the set of complex eigenvalues of the operator $H$ is bounded.

It follows from (6.10) that the normalized Wronskian (1.10) satisfies the relation

\[ \Delta(z) = 1 + O(|z|^{-1/N}), \qquad |z| \to \infty. \tag{6.11} \]

Since, by Proposition 4.4, $V R_0(z) \in \mathfrak S_1$, the perturbation determinant is correctly defined by formula (1.15) and (see book [10])

\[ \lim_{|\operatorname{Im} z| \to \infty} \operatorname{Det} \big( I + V R_0(z) \big) = 1. \tag{6.12} \]

Comparing equations (1.11) and (1.16), we obtain that

\[ \Delta(z) = C \operatorname{Det} \big( I + V R_0(z) \big) \tag{6.13} \]

for some constant $C$. Moreover, taking into account relations (6.11) and (6.12), we see that $C = 1$ in (6.13). Let us formulate the result obtained.

Theorem 6.4. Suppose that $v_N = 0$ and that the coefficients $v_j$, $j = 1, \dots, N - 1$, satisfy assumption (1.12). Then $\Delta(x, z) =: \Delta(z)$ does not depend on $x$ and relation (1.17) is true for all $z \notin \sigma(H)$.

6.4. Finally, we note that for a derivation of the trace formula (1.11) the approximation of $v_j$ by cut-off functions $\chi_{(-r, r)} v_j$ is not really necessary. From the very beginning, we could work with functions $v_j$ satisfying assumption (1.12) only. Then the text of Sections 2, 3 and 4 remains unchanged if, for all $j = 1, \dots, N$, the functions $e^{\zeta_j x} \mathbf p_j$ are replaced for $x \ll 0$ by $\mathbf u_j^{(-)}(x; a_-)$ where $a_-$ is a sufficiently big negative number and they are replaced for $x \gg 0$ by $\mathbf u_j^{(+)}(x; a_+)$ where $a_+$ is a sufficiently big positive number. In particular, the definition of the transition matrices in subs. 2.4 can be given in terms of the solutions $\mathbf u_j^{(\pm)}(x; a_\pm)$. However, a preliminary consideration of coefficients $v_j$ with compact supports seems to be intuitively clearer.

References

[1] N.I. Akhieser and I.M. Glasman, The theory of linear operators in Hilbert space, vols. I, II, Ungar, New York, 1961.
[2] R. Beals, P. Deift and C. Tomei, Direct and inverse scattering on the line, Math. surveys and monographs, N 28, Amer. Math. Soc., Providence, R. I., 1988.
[3] M.Sh. Birman, On the spectrum of singular boundary-value problems, Matem. Sb. 55, no. 2 (1961), 125–174 (Russian); English transl.: Eleven Papers on Analysis, Amer. Math. Soc. Transl. (2), vol. 53, Amer. Math. Soc., Providence, Rhode Island, 1966, 23–60.
[4] M.Sh. Birman and M.Z. Solomyak, Estimates for the singular numbers of integral operators, Russian Math. Surveys 32 (1977), 15–89.
[5] V.S. Buslaev and L.D. Faddeev, Formulas for traces for a singular Sturm-Liouville differential operator, Soviet Math. Dokl. 1 (1960), 451–454.
[6] E.A. Coddington and N. Levinson, Theory of ordinary differential equations, McGraw-Hill, New York, 1955.
[7] L.D. Faddeev, Inverse problem of quantum scattering theory. II, J. Soviet. Math. 5, 1976, 334–396.
[8] F. Gesztesy and K.A. Makarov, (Modified) Fredholm determinants for operators with matrix-valued semi-separable integral kernels revisited, Integral Eqs. Operator Theory 47 (2003), 457–497; Erratum 48 (2004), 425–426.
[9] I.C. Gohberg, S. Goldberg and N. Krupnik, Traces and determinants for linear operators, Operator Theory: Advances and Applications 116, Birkhäuser, Basel, 2000.
[10] I.C. Gohberg and M.G. Kreĭn, Introduction to the theory of linear nonselfadjoint operators, Nauka, Moscow, 1965; Engl. transl.: Amer. Math. Soc. Providence, R. I., 1969.
[11] R. Jost and A. Pais, On the scattering of a particle by a static potential, Phys. Rev. 82 (1951), 840–851.
[12] M.A. Naimark, Linear differential operators, Ungar, New York, 1967.
[13] D.R. Yafaev, Mathematical scattering theory. Analytic theory, Amer. Math. Soc., Providence, Rhode Island, 2010.

J. Östensson
Department of Mathematics, Uppsala University, Box 480, SE-751 06 Uppsala, Sweden

D.R. Yafaev
IRMAR, Université de Rennes I, Campus de Beaulieu, F-35042 Rennes Cedex, France

Operator Theory: Advances and Applications, Vol. 218, 571–582
© 2012 Springer Basel AG

Jordan Structures and Lattices of Invariant Subspaces of Real Matrices

Leiba Rodman

Dedicated to the memory of Israel Gohberg

Abstract. Real matrices having the same Jordan structure are characterized in terms of isomorphisms and linear isomorphisms of lattices of their invariant subspaces.

Mathematics Subject Classification (2000). 15A21, 47A15.

Keywords. Invariant subspace, Jordan structure, real matrix.

1. Introduction

Let $\mathbb{F} = \mathbb{C}$, the complex field, or $\mathbb{F} = \mathbb{R}$, the real field. Two matrices $A, B \in \mathbb{C}^{n\times n}$ are said to have the same $\mathbb{C}$-Jordan structure if the number $s$ of distinct eigenvalues $\lambda_1(A), \dots, \lambda_s(A)$ of $A$ and $\lambda_1(B), \dots, \lambda_s(B)$ of $B$ is the same, and there exists a permutation $\pi : \{1, 2, \dots, s\} \to \{1, 2, \dots, s\}$ such that the partial multiplicities of $\lambda_j(A)$ as an eigenvalue of $A$ are identical with those of $\lambda_{\pi(j)}(B)$ as an eigenvalue of $B$, for $j = 1, 2, \dots, s$. The partial multiplicities corresponding to $\lambda \in \sigma(A)$, $A \in \mathbb{F}^{n\times n}$, are the sizes of the Jordan blocks (in the Jordan form of $A$ over $\mathbb{C}$) having the eigenvalue $\lambda$; the partial multiplicity $k$ is repeated as many times as there are Jordan blocks of size $k$ having the eigenvalue $\lambda$. "Identical" in the above definition includes having the same number of repetitions. For example, if $A \in \mathbb{C}^{14\times 14}$ is nilpotent with the partial multiplicities 3, 3, 3, 3, 2 corresponding to the eigenvalue zero, and $B \in \mathbb{C}^{14\times 14}$ is a nilpotent matrix with partial multiplicities 3, 3, 2, 2, 2, 2, then $B$ and $A$ do not have the same $\mathbb{C}$-Jordan structure.

This notion was studied and used in [8, 6, 5, 12, 1] and other sources, mainly in connection with various aspects of matrix perturbation theory. In particular, it was proved in [8] that two matrices $A$ and $B$ have the same $\mathbb{C}$-Jordan structure if and only if the lattices of invariant subspaces of $A$ and $B$ are isomorphic. Moreover, in this case an isomorphism of these lattices can be given by means of a linear transformation (if this happens, we say that the lattices are linearly isomorphic).
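As an illustration of the definition of partial multiplicities, the following minimal Python sketch recovers the Jordan block sizes at a given eigenvalue from ranks of powers of $A - \lambda I$, using the standard fact that the number of Jordan blocks of size at least $k$ equals $\operatorname{rank}(A-\lambda I)^{k-1} - \operatorname{rank}(A-\lambda I)^{k}$. It is not taken from the paper; the function names `rank`, `partial_multiplicities` and `nilpotent_from_blocks` are hypothetical, and exact rational arithmetic is assumed, so the entries and the eigenvalue must be integers or fractions.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            if f != 0:
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def partial_multiplicities(A, lam):
    """Sizes of the Jordan blocks of A at the eigenvalue lam, largest first."""
    n = len(A)
    N = [[Fraction(A[i][j]) - (lam if i == j else 0) for j in range(n)]
         for i in range(n)]
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    ranks = [n]  # rank of N^0 = I
    while True:
        P = matmul(P, N)
        ranks.append(rank(P))
        if ranks[-1] == ranks[-2]:
            break
    # at_least[k-1] = number of blocks of size >= k
    at_least = [ranks[k - 1] - ranks[k] for k in range(1, len(ranks))]
    sizes = []
    for k in range(len(at_least), 0, -1):
        larger = at_least[k] if k < len(at_least) else 0
        sizes += [k] * (at_least[k - 1] - larger)
    return sizes

def nilpotent_from_blocks(sizes):
    """Direct sum of nilpotent Jordan blocks with the given sizes."""
    n = sum(sizes)
    A = [[0] * n for _ in range(n)]
    start = 0
    for s in sizes:
        for k in range(start, start + s - 1):
            A[k][k + 1] = 1
        start += s
    return A

# The two nilpotent 14 x 14 matrices from the example above.
A = nilpotent_from_blocks([3, 3, 3, 3, 2])
B = nilpotent_from_blocks([3, 3, 2, 2, 2, 2])
print(partial_multiplicities(A, 0) == partial_multiplicities(B, 0))
```

For the two matrices of the example, the computed lists of partial multiplicities differ, in agreement with the statement that they do not have the same $\mathbb{C}$-Jordan structure.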


In this paper, we study matrices having the same Jordan structure and the relationship of this property with isomorphisms of the lattices of invariant subspaces, in the context of real matrices and invariant subspaces in the corresponding real vector space. As it turns out, there are essential differences with the complex case. For example, isomorphic lattices of invariant subspaces for real matrices need not be linearly isomorphic. Our main results are Theorems 2.1 and 3.2. In the former, characterizations of having the same Jordan structure (in the context of real matrices) are given in terms of isomorphisms of lattices of invariant subspaces. In the latter, these characterizations are specialized to the case of close (in norm) matrices. In that case, isomorphic lattices of (real) invariant subspaces are necessarily linearly isomorphic.

We use the following notation throughout. The spectrum of a matrix $A$ (= the set of eigenvalues, including nonreal eigenvalues of real matrices) will be denoted $\sigma(A)$. $\operatorname{Ker} A := \{x \in \mathbb{F}^n : Ax = 0\}$ is the kernel of $A \in \mathbb{F}^{m\times n}$, and $\operatorname{Im} A := \{Ax \in \mathbb{F}^m : x \in \mathbb{F}^n\}$ is the image (or range) of $A$. $\mathcal{R}_\lambda(A) := \operatorname{Ker}(A - \lambda I)^n \subseteq \mathbb{F}^n$ is the root subspace of a matrix $A \in \mathbb{F}^{n\times n}$ corresponding to the eigenvalue $\lambda \in \mathbb{F}$, and $\mathcal{R}_{\mu\pm i\nu}(A) := \operatorname{Ker}(A^2 - 2\mu A + (\mu^2 + \nu^2)I)^n \subseteq \mathbb{R}^n$ is the real root subspace of $A \in \mathbb{R}^{n\times n}$ corresponding to a pair of nonreal complex conjugate eigenvalues $\mu \pm i\nu$ of $A$. $\operatorname{Span}\{x_1, \dots, x_k\}$ is the subspace spanned by the vectors $x_1, \dots, x_k$. The operator matrix norm (= the largest singular value) $\|A\|$ is used throughout, for $A \in \mathbb{C}^{n\times n}$. We denote by
\[ P_\Omega(A) = \frac{1}{2\pi i}\int_\Gamma (zI - A)^{-1}\,dz \]
the spectral projection of $A \in \mathbb{C}^{n\times n}$ associated with eigenvalues included in a set $\Omega \subseteq \mathbb{C}$; here $\Gamma$ is a suitable (simple, closed, rectifiable) contour such that $\Omega \cap \sigma(A)$ is inside $\Gamma$ and $\sigma(A) \setminus \Omega$ is outside $\Gamma$. Finally, for real numbers $\lambda$ and $\mu > 0$, we let
\[
J_{2m}(\lambda \pm i\mu) =
\begin{bmatrix}
\lambda & \mu & 1 & 0 & \cdots & 0 & 0 \\
-\mu & \lambda & 0 & 1 & \cdots & 0 & 0 \\
0 & 0 & \lambda & \mu & \cdots & 0 & 0 \\
0 & 0 & -\mu & \lambda & \cdots & \vdots & \vdots \\
\vdots & \vdots & \vdots & \vdots & \ddots & 1 & 0 \\
\vdots & \vdots & \vdots & \vdots & & 0 & 1 \\
0 & 0 & 0 & 0 & \cdots & \lambda & \mu \\
0 & 0 & 0 & 0 & \cdots & -\mu & \lambda
\end{bmatrix}
\in \mathbb{R}^{2m\times 2m}. \tag{1.1}
\]
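The block pattern of the real Jordan block (1.1) can be made concrete with a short sketch. The code below is only an illustration, assuming the layout described in (1.1): $2\times 2$ diagonal blocks with rows $(\lambda, \mu)$ and $(-\mu, \lambda)$, and $2\times 2$ identity blocks on the block superdiagonal. The function name `real_jordan_block` is hypothetical.

```python
def real_jordan_block(lam, mu, m):
    """Return J_{2m}(lam +/- i*mu) of (1.1) as a list of rows."""
    if m < 1 or mu <= 0:
        raise ValueError("need m >= 1 and mu > 0")
    size = 2 * m
    J = [[0] * size for _ in range(size)]
    for k in range(m):
        r = 2 * k
        # diagonal 2 x 2 block [[lam, mu], [-mu, lam]]
        J[r][r], J[r][r + 1] = lam, mu
        J[r + 1][r], J[r + 1][r + 1] = -mu, lam
        # identity block on the block superdiagonal
        if k < m - 1:
            J[r][r + 2] = 1
            J[r + 1][r + 3] = 1
    return J

for row in real_jordan_block(0, 1, 2):
    print(row)
```

With $\lambda = 0$, $\mu = 1$ and $m = 2$ this produces the $4\times 4$ block with eigenvalues $\pm i$ that is used as a summand of $B$ in Example 2.3.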

2. Matrices with the same Jordan structure

For the real case, the definition of matrices having the same Jordan structure is modified (compared with the complex case), and we actually need two versions. Two matrices $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{m\times m}$ are said to have the same weak $\mathbb{R}$-Jordan structure if the following properties hold:


(1) the number $s$ of distinct eigenvalues with nonnegative imaginary parts $\lambda_1(A), \dots, \lambda_s(A)$ of $A$ and of distinct eigenvalues with nonnegative imaginary parts $\lambda_1(B), \dots, \lambda_s(B)$ of $B$ is the same;
(2) there exists a permutation $\pi : \{1, 2, \dots, s\} \to \{1, 2, \dots, s\}$ such that the partial multiplicities of $\lambda_j(A)$ as an eigenvalue of $A$ are identical with those of $\lambda_{\pi(j)}(B)$ as an eigenvalue of $B$, for $j = 1, 2, \dots, s$.

Two matrices $A, B \in \mathbb{R}^{n\times n}$ are said to have the same strong $\mathbb{R}$-Jordan structure if the following properties hold:
(3) the number $s$ of distinct real eigenvalues $\lambda_1(A), \dots, \lambda_s(A)$ of $A$ and of distinct real eigenvalues $\lambda_1(B), \dots, \lambda_s(B)$ of $B$ is the same;
(4) there exists a permutation $\pi : \{1, 2, \dots, s\} \to \{1, 2, \dots, s\}$ such that the partial multiplicities of $\lambda_j(A)$ as an eigenvalue of $A$ are identical with those of $\lambda_{\pi(j)}(B)$ as an eigenvalue of $B$, for $j = 1, 2, \dots, s$;
(5) the number $t$ of distinct eigenvalues with positive imaginary parts $(\mu_1 + i\nu_1)(A), \dots, (\mu_t + i\nu_t)(A)$ of $A$ and of distinct eigenvalues with positive imaginary parts $(\mu_1 + i\nu_1)(B), \dots, (\mu_t + i\nu_t)(B)$ of $B$ is the same;
(6) there exists a permutation $\sigma : \{1, 2, \dots, t\} \to \{1, 2, \dots, t\}$ such that the partial multiplicities of $(\mu_j + i\nu_j)(A)$ as an eigenvalue of $A$ are identical with those of $(\mu_{\sigma(j)} + i\nu_{\sigma(j)})(B)$ as an eigenvalue of $B$, for $j = 1, 2, \dots, t$.

Thus, if $A$ and $B$ both have either only nonreal spectra or only real spectra, the notions of the weak and strong $\mathbb{R}$-Jordan structure are identical.

Part 1 in the following theorem is included for completeness; it was proved in [8].

Theorem 2.1. Part 1.
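The following statements are equivalent for $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{m\times m}$:
(1a) $m = n$ and $A$ and $B$ have the same $\mathbb{C}$-Jordan structure;
(1b) The lattices $\mathrm{Lat}_{\mathbb{C}}(A)$ of $A$-invariant subspaces in $\mathbb{C}^n$ and $\mathrm{Lat}_{\mathbb{C}}(B)$ of $B$-invariant subspaces in $\mathbb{C}^m$ are isomorphic, i.e., there exists a bijective map $\psi : \mathrm{Lat}_{\mathbb{C}}(A) \to \mathrm{Lat}_{\mathbb{C}}(B)$ such that $\psi(\mathcal{M}_1 \cap \mathcal{M}_2) = \psi(\mathcal{M}_1) \cap \psi(\mathcal{M}_2)$ and $\psi(\mathcal{M}_1 + \mathcal{M}_2) = \psi(\mathcal{M}_1) + \psi(\mathcal{M}_2)$ for every $\mathcal{M}_1, \mathcal{M}_2 \in \mathrm{Lat}_{\mathbb{C}}(A)$;
(1c) $m = n$ and the lattices $\mathrm{Lat}_{\mathbb{C}}(A)$ and $\mathrm{Lat}_{\mathbb{C}}(B)$ are linearly isomorphic, i.e., there exists an invertible matrix $T \in \mathbb{C}^{n\times n}$ such that $T\mathcal{M} \in \mathrm{Lat}_{\mathbb{C}}(B)$ if and only if $\mathcal{M} \in \mathrm{Lat}_{\mathbb{C}}(A)$.

Part 2. The following statements are equivalent for $A, B \in \mathbb{R}^{n\times n}$:
(2a) $A$ and $B$ have the same strong $\mathbb{R}$-Jordan structure;
(2b) The lattices $\mathrm{Lat}_{\mathbb{R}}(A)$ of $A$-invariant subspaces in $\mathbb{R}^n$ and $\mathrm{Lat}_{\mathbb{R}}(B)$ of $B$-invariant subspaces in $\mathbb{R}^n$ are isomorphic with an isomorphism $\psi : \mathrm{Lat}_{\mathbb{R}}(A) \to \mathrm{Lat}_{\mathbb{R}}(B)$ such that
\[ \dim \psi(\mathcal{M}) = \dim \mathcal{M} \tag{2.1} \]
for all root subspaces $\mathcal{M} = \mathcal{R}_\lambda(A)$, $\lambda \in \sigma(A) \cap \mathbb{R}$, and $\mathcal{M} = \mathcal{R}_{\mu\pm i\nu}(A)$, $\mu \pm i\nu \in \sigma(A) \setminus \mathbb{R}$;
(2c) The lattices $\mathrm{Lat}_{\mathbb{R}}(A)$ and $\mathrm{Lat}_{\mathbb{R}}(B)$ are linearly isomorphic, i.e., there exists an invertible matrix $T \in \mathbb{R}^{n\times n}$ such that $T\mathcal{M} \in \mathrm{Lat}_{\mathbb{R}}(B)$ if and only if $\mathcal{M} \in \mathrm{Lat}_{\mathbb{R}}(A)$.

Part 3. The following statements are equivalent for $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{m\times m}$:
(3a) $A$ and $B$ have the same weak $\mathbb{R}$-Jordan structure;
(3b) The lattices $\mathrm{Lat}_{\mathbb{R}}(A)$ of $A$-invariant subspaces in $\mathbb{R}^n$ and $\mathrm{Lat}_{\mathbb{R}}(B)$ of $B$-invariant subspaces in $\mathbb{R}^m$ are isomorphic.

The following two examples will clarify the differences between the complex and real cases in Theorem 2.1.

Example 2.2. Let
\[ A_1 = 0 \in \mathbb{R}^{1\times 1}, \qquad A_2 = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \in \mathbb{R}^{2\times 2}. \]
The lattices $\mathrm{Lat}_{\mathbb{R}}(A_1)$ and $\mathrm{Lat}_{\mathbb{R}}(A_2)$ are isomorphic.

Example 2.3. Let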

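\[
A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \oplus \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \oplus \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix},
\qquad
B = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \oplus \begin{bmatrix} 0 & 1 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{bmatrix} \in \mathbb{R}^{6\times 6}.
\]
Then $A$ and $B$ have the same weak $\mathbb{R}$-Jordan structure but not the same strong $\mathbb{R}$-Jordan structure, and the lattices $\mathrm{Lat}_{\mathbb{R}}(A)$ and $\mathrm{Lat}_{\mathbb{R}}(B)$ are isomorphic but not linearly isomorphic. Moreover, $A$ and $B$ do not have the same Jordan structure as complex matrices, so $\mathrm{Lat}_{\mathbb{C}}(A)$ and $\mathrm{Lat}_{\mathbb{C}}(B)$ are not isomorphic.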
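The three notions of Jordan structure can be compared mechanically once the Jordan data are known. The following sketch is an illustration only, not part of the paper: it assumes the Jordan data are supplied by hand as a dictionary mapping each eigenvalue to its list of partial multiplicities, and it uses the observation that a permutation as in the definitions exists exactly when the multisets of partial-multiplicity lists coincide. All function names are hypothetical, and equality of matrix sizes (required in the complex and strong real cases) is not checked.

```python
from collections import Counter

def _profile(jordan_data, keep):
    """Multiset of partial-multiplicity lists over the eigenvalues selected by keep."""
    return Counter(tuple(sorted(sizes, reverse=True))
                   for lam, sizes in jordan_data.items() if keep(lam))

def same_C_structure(a, b):
    """Compare all eigenvalues, as in the complex definition."""
    return _profile(a, lambda lam: True) == _profile(b, lambda lam: True)

def same_weak_R_structure(a, b):
    """Compare eigenvalues with nonnegative imaginary part, real and nonreal together."""
    upper = lambda lam: complex(lam).imag >= 0
    return _profile(a, upper) == _profile(b, upper)

def same_strong_R_structure(a, b):
    """Compare real eigenvalues and eigenvalues with positive imaginary part separately."""
    real = lambda lam: complex(lam).imag == 0
    positive = lambda lam: complex(lam).imag > 0
    return (_profile(a, real) == _profile(b, real)
            and _profile(a, positive) == _profile(b, positive))

# Jordan data of Example 2.3, read off from the block structure of A and B.
A = {0: [2], 1j: [1, 1], -1j: [1, 1]}
B = {0: [1, 1], 1j: [2], -1j: [2]}

print(same_weak_R_structure(A, B))    # weak structure agrees
print(same_strong_R_structure(A, B))  # strong structure differs
print(same_C_structure(A, B))         # complex structure differs
```

For the data of Example 2.3 the weak comparison succeeds while the strong and complex comparisons fail, matching the claims of the example; the data of Example 2.2, `{0: [1]}` and `{1j: [1], -1j: [1]}`, also pass the weak comparison.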

We need two lemmas for the proof of Theorem 2.1. Let $\mathcal{A}$ be the algebra (isomorphic to $\mathbb{C}$) of $2\times 2$ real matrices of the form $\begin{bmatrix} a & b \\ -b & a \end{bmatrix}$, and denote by $\mathcal{M}_{p\times q}(\mathcal{A}) \subset \mathbb{R}^{2p\times 2q}$ the set of $p \times q$ matrices with entries in $\mathcal{A}$.

Lemma 2.4. Let $U \in \mathcal{M}_{n\times n}(\mathcal{A})$, $V \in \mathcal{M}_{p\times p}(\mathcal{A})$ be matrices with no real eigenvalues. Then all solutions $X \in \mathbb{R}^{2n\times 2p}$ of $UX = XV$ belong to $\mathcal{M}_{n\times p}(\mathcal{A})$.

Proof. Replacing $U$ by $S^{-1}US$, $V$ by $T^{-1}VT$, and $X$ by $S^{-1}XT$, where $S \in \mathcal{M}_{n\times n}(\mathcal{A})$, $T \in \mathcal{M}_{p\times p}(\mathcal{A})$ are suitable invertible matrices, we may assume without loss of generality that $U$ and $V$ are real almost upper triangular Jordan forms, i.e., direct sums of real Jordan blocks as in (1.1). Furthermore, using induction on $n$ and on $p$, we can assume that actually $U = J_{2n}(\mu \pm i\nu)$, $V = J_{2p}(\mu' \pm i\nu')$, where $\mu, \mu' \in \mathbb{R}$, $\nu, \nu' > 0$. If $\mu + i\nu \neq \mu' + i\nu'$, the only solution is $X = 0$, so we may assume $\mu = \mu'$ and $\nu = \nu'$. We can also take $\mu = 0$. Now the result follows by elementary calculations using the easily verifiable fact that an equation
\[
\begin{bmatrix} 0 & \nu \\ -\nu & 0 \end{bmatrix} Z = Z \begin{bmatrix} 0 & \nu \\ -\nu & 0 \end{bmatrix} + Y, \qquad Z \in \mathbb{R}^{2\times 2},\ Y \in \mathcal{A},
\]
holds if and only if $Z \in \mathcal{A}$ and $Y = 0$. □

Lemma 2.5. (a) Assume that $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{m\times m}$ are either nilpotent, or $\sigma(A) = \sigma(B) = \{i, -i\}$. Then $\mathrm{Lat}_{\mathbb{R}}(A)$ and $\mathrm{Lat}_{\mathbb{R}}(B)$ are isomorphic if and only if $m = n$ and $A$ and $B$ have the same $\mathbb{R}$-Jordan structure.
(b) Assume that $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{m\times m}$ are such that $A$ is nilpotent and $\sigma(B) = \{i, -i\}$. Then $\mathrm{Lat}_{\mathbb{R}}(A)$ and $\mathrm{Lat}_{\mathbb{R}}(B)$ are isomorphic if and only if $A$ and $B$ have the same weak $\mathbb{R}$-Jordan structure.

Proof. Let $\chi : \mathcal{A} \to \mathbb{C}$ be the standard algebra isomorphism defined by
\[ \chi \begin{bmatrix} a & b \\ -b & a \end{bmatrix} = a + ib, \]
and extend it entrywise to matrices; thus, for $U = [U_{i,j}]_{i=1,j=1}^{n,p} \in \mathcal{M}_{n\times p}(\mathcal{A})$, let
\[ \chi(U) = [\chi(U_{i,j})]_{i=1,j=1}^{n,p} \in \mathbb{C}^{n\times p}. \]
First, we prove the claim that if $U \in \mathcal{M}_{n\times n}(\mathcal{A})$ has no real eigenvalues, then $\mathrm{Lat}_{\mathbb{R}}(U)$ is isomorphic to $\mathrm{Lat}_{\mathbb{C}}(\chi(U))$. It suffices to consider the case when $U$ has a real Jordan form and $\sigma(U) = \{\pm i\}$. Every $2p$-dimensional subspace $\mathcal{N} \in \mathrm{Lat}_{\mathbb{R}}(U)$ is the column space of a real matrix $X$ with linearly independent columns that satisfies an equation of the form $UX = XV$, where $V$ is a real Jordan form, in particular $V \in \mathcal{M}_{p\times p}(\mathcal{A})$. By Lemma 2.4, $X \in \mathcal{M}_{n\times p}(\mathcal{A})$. The column space $\mathcal{N}'$ of the matrix $\chi(X)$ is obviously $\chi(U)$-invariant. We let $\psi(\mathcal{N}) = \mathcal{N}'$. It turns out that $\psi$ is well defined (i.e., $\psi(\mathcal{N})$ depends only on the column space of $X$, and does not depend on the choice of $X$ itself), and is actually a lattice isomorphism between $\mathrm{Lat}_{\mathbb{R}}(U)$ and $\mathrm{Lat}_{\mathbb{C}}(\chi(U))$. Indeed, assume that $X, X' \in \mathcal{M}_{n\times p}(\mathcal{A})$ with linearly independent columns give rise to the same subspace $\mathcal{N}$, i.e., $X = X'W$ for some invertible real matrix $W$. Since the kernel of $X'$ is zero, we have that $\chi(X')$ has also zero kernel, and there exists $Y' \in \mathbb{C}^{p\times n}$ such that $Y'\chi(X') = I$, or $\chi^{-1}(Y')X' = I$. Now $W = \chi^{-1}(Y')X$ obviously belongs to $\mathcal{M}_{p\times p}(\mathcal{A})$, and $\chi(X)$, $\chi(X')$ have the same column space, as claimed. If $\mathcal{N}_1 \subseteq \mathcal{N}_2$, $\mathcal{N}_1, \mathcal{N}_2 \in \mathrm{Lat}_{\mathbb{R}}(U)$, then for the corresponding matrices $X_1$ and $X_2$ we have $X_1 = X_2W$ for some real matrix $W$, and as before we show that $W \in \mathcal{M}_{p'\times q'}(\mathcal{A})$ for appropriate $p'$, $q'$, hence also $\psi(\mathcal{N}_1) \subseteq \psi(\mathcal{N}_2)$. All other parts of our claim are easily verified.

Proof of Part (a). We may assume that both $A$ and $B$ are in the real Jordan form. Then the "if" part is obvious because we may take $A$ and $B$ equal. The "only if" part in case $A$ and $B$ are nilpotent is a particular case of [8, Theorem 2.1]. The "only if" part in case $\sigma(A) = \sigma(B) = \{i, -i\}$ follows by using the claim stated and proved in the preceding paragraph.


Part (b) is proved similarly using the isomorphism of $\mathrm{Lat}_{\mathbb{R}}(B)$ and $\mathrm{Lat}_{\mathbb{C}}(\chi(B))$, and [8, Theorem 2.1]. □

We also need the following known fact:

Proposition 2.6. Let $A \in \mathbb{F}^{n\times n}$. Then the maximal number of distinct elements in an increasing (by inclusion) chain of $A$-invariant subspaces in $\mathbb{F}^n$ is $n + 1$ in the complex case, and is
\[
1 + \sum_{\lambda \in \sigma(A) \cap \mathbb{R}} (\text{algebraic multiplicity of } \lambda) + \frac{1}{2} \sum_{\lambda \in \sigma(A) \cap (\mathbb{C} \setminus \mathbb{R})} (\text{algebraic multiplicity of } \lambda)
\]
in the real case.

Proof. The complex case is obvious by using the Jordan form. In the real case, use the fact that every $A$-invariant subspace is the direct sum of its intersections with the root subspaces of $A$, thereby reducing the proof to the cases when $\sigma(A) = \{\lambda\}$, $\lambda \in \mathbb{R}$, or $\sigma(A) = \{\mu \pm i\nu\}$, $\mu \in \mathbb{R}$, $\nu > 0$. The former case is obvious again by using the real Jordan form, and the latter case follows from Lemma 2.5(b). □

Proof. We prove Theorem 2.1. Part 1 was proved in [8], see also [7]. Note that in [8, 7] it was assumed from the beginning that $m = n$; however, (1b) easily implies that $m = n$: Indeed, for $A \in \mathbb{C}^{n\times n}$, a maximal increasing (by inclusion) chain of $A$-invariant subspaces has exactly $n + 1$ elements (Proposition 2.6).

We prove Part 2. The implication (2a) $\Rightarrow$ (2c) follows as in the complex case (see [8, 7]), while (2c) $\Rightarrow$ (2b) is trivial. We provide details for the proof of (2b) $\Rightarrow$ (2a), following for the large part a line of argument analogous to that of [8].

Suppose that $\psi : \mathrm{Lat}_{\mathbb{R}}(A) \to \mathrm{Lat}_{\mathbb{R}}(B)$ is a lattice isomorphism with the property (2.1). Let $\lambda_1, \dots, \lambda_p$ be all the distinct real eigenvalues of $A$, and let $\mu_{p+1} \pm i\nu_{p+1}, \dots, \mu_q \pm i\nu_q$ be all the distinct pairs of complex conjugate nonreal eigenvalues of $A$. Let $\mathcal{N}_j = \psi(\mathcal{R}_{\lambda_j}(A))$ for $j = 1, 2, \dots, p$, and $\mathcal{N}_j = \psi(\mathcal{R}_{\mu_j \pm i\nu_j}(A))$ for $j = p + 1, p + 2, \dots, q$. Then $\mathbb{R}^n$ is a direct sum of $B$-invariant subspaces $\mathcal{N}_1, \dots, \mathcal{N}_q$. We claim that $\sigma(B|_{\mathcal{N}_i}) \cap \sigma(B|_{\mathcal{N}_j}) = \emptyset$, for $i \neq j$ ($i, j = 1, 2, \dots, q$). Indeed, assume the contrary, i.e., $\lambda_0 \in \sigma(B|_{\mathcal{N}_i}) \cap \sigma(B|_{\mathcal{N}_j})$ for some $i \neq j$. Consider first the case when $\lambda_0$ is real. Let $\mathcal{N} = \operatorname{Span}(y_1 + y_2)$, where $y_1$, resp. $y_2$, are eigenvectors of $B|_{\mathcal{N}_i}$, resp. $B|_{\mathcal{N}_j}$, corresponding to the eigenvalue $\lambda_0$. Clearly, $\mathcal{N}$ is $B$-invariant. Let $\mathcal{M} := \psi^{-1}(\mathcal{N}) \in \mathrm{Lat}_{\mathbb{R}}(A)$. Since $\mathcal{M}$ must contain a two-dimensional $A$-invariant subspace (in the case $\sigma(A|_{\mathcal{M}})$ is nonreal) or a one-dimensional $A$-invariant subspace (in the case $A|_{\mathcal{M}}$ has a real eigenvalue), and since $\psi$ is a lattice isomorphism, it follows that $\mathcal{M}$ has dimension two (in the case $\sigma(A|_{\mathcal{M}})$ is nonreal) or dimension one (in the case $A|_{\mathcal{M}}$ has a real eigenvalue). Therefore, $\mathcal{M} \subseteq \mathcal{R}_{\lambda_k}(A)$ or $\mathcal{M} \subseteq \mathcal{R}_{\mu_k \pm i\nu_k}(A)$ for some $k$. This implies $\mathcal{N} \subseteq \mathcal{N}_k$ for some $k$, $k = 1, 2, \dots, q$, a contradiction with the fact that $\mathcal{N}_1 \dot{+} \cdots \dot{+} \mathcal{N}_q$ is a direct sum.


Now consider the case when $\lambda_0 = \mu_0 + i\nu_0$, $\mu_0 \in \mathbb{R}$, $\nu_0 \in \mathbb{R} \setminus \{0\}$, is nonreal. Then there are linearly independent vectors $y_1, y_1' \in \mathcal{N}_i$ and $y_2, y_2' \in \mathcal{N}_j$ such that
\[ By_k = \mu_0 y_k - \nu_0 y_k', \qquad By_k' = \nu_0 y_k + \mu_0 y_k', \qquad \text{for } k = 1, 2. \]
Let $\mathcal{N} = \operatorname{Span}\{y_1 + y_2,\ y_1' + y_2'\}$. Then $\mathcal{N}$ is $B$-invariant, but does not properly contain any nonzero $B$-invariant subspace. So for $\mathcal{M} := \psi^{-1}(\mathcal{N})$ we obtain as in the preceding paragraph that either $\mathcal{M} \subseteq \mathcal{R}_{\lambda_k}(A)$ or $\mathcal{M} \subseteq \mathcal{R}_{\mu_k \pm i\nu_k}(A)$ for some $k$. It follows that $\mathcal{N} \subseteq \mathcal{N}_k$ for some $k$, $k = 1, 2, \dots, q$, and we obtain a contradiction as before.

Next we show that each $\mathcal{N}_j$ is actually a real root subspace for $B$. Indeed, assuming the contrary, for some $i$ we have
\[
\mathcal{N}_i = \mathcal{R}_{\lambda_1'}(B) \dot{+} \cdots \dot{+} \mathcal{R}_{\lambda_k'}(B) \dot{+} \mathcal{R}_{\mu_{k+1}' \pm i\nu_{k+1}'}(B) \dot{+} \cdots \dot{+} \mathcal{R}_{\mu_\ell' \pm i\nu_\ell'}(B),
\]
where $\lambda_1', \dots, \lambda_k'$ are some distinct real eigenvalues of $B$, $\mu_{k+1}' \pm i\nu_{k+1}', \dots, \mu_\ell' \pm i\nu_\ell'$ are some distinct pairs of nonreal complex conjugate eigenvalues of $B$, and $\ell > 1$. Letting $\mathcal{M}_j = \psi^{-1}(\mathcal{R}_{\lambda_j'}(B))$ for $j = 1, 2, \dots, k$, and $\mathcal{M}_j = \psi^{-1}(\mathcal{R}_{\mu_j' \pm i\nu_j'}(B))$ for $j = k + 1, k + 2, \dots, \ell$, we have
\[ \mathcal{R}_{\lambda_i}(A) = \mathcal{M}_1 \dot{+} \cdots \dot{+} \mathcal{M}_k \dot{+} \mathcal{M}_{k+1} \dot{+} \cdots \dot{+} \mathcal{M}_\ell, \]
if $i \in \{1, 2, \dots, p\}$, and
\[ \mathcal{R}_{\mu_i \pm i\nu_i}(A) = \mathcal{M}_1 \dot{+} \cdots \dot{+} \mathcal{M}_k \dot{+} \mathcal{M}_{k+1} \dot{+} \cdots \dot{+} \mathcal{M}_\ell, \]
if $i \in \{p + 1, p + 2, \dots, q\}$. Assume $i \in \{p + 1, p + 2, \dots, q\}$, and let $y_1, y_1' \in \mathcal{M}_1$, $y_2, y_2' \in \mathcal{M}_2$ be linearly independent vectors such that
\[ Ay_k = \mu_i y_k - \nu_i y_k', \qquad Ay_k' = \nu_i y_k + \mu_i y_k', \qquad \text{for } k = 1, 2, \]
and let $\mathcal{M} = \operatorname{Span}(y_1 + y_2,\ y_1' + y_2')$. Then $\psi(\mathcal{M})$ is $B$-invariant, is contained in $\mathcal{N}_i$, but is not contained in any of $\mathcal{R}_{\lambda_j'}(B)$ or $\mathcal{R}_{\mu_j' \pm i\nu_j'}(B)$. This is impossible, because $\mathcal{M}$ does not properly contain any nonzero $A$-invariant subspace, and therefore $\psi(\mathcal{M})$ does not properly contain any nonzero $B$-invariant subspace. If $i \in \{1, 2, \dots, p\}$, then we obtain a contradiction in a similar way, by considering the $A$-invariant subspace $\operatorname{Span}(x_1 + x_2)$, where $x_1$ and $x_2$ are eigenvectors of $A|_{\mathcal{M}_1}$ and $A|_{\mathcal{M}_2}$, respectively (cf. the proof of [8, Theorem 2.1]). Thus, we must have $\ell = 1$.

We have proved that every $\mathcal{N}_j$ is a root subspace of $B$ corresponding either to a real eigenvalue, or to a pair of nonreal complex conjugate eigenvalues. We also notice that $\mathrm{Lat}_{\mathbb{R}}(A|_{\mathcal{R}_{\lambda_j}(A)})$ is isomorphic to $\mathrm{Lat}_{\mathbb{R}}(B|_{\mathcal{N}_j})$, for $j = 1, 2, \dots, p$, and $\mathrm{Lat}_{\mathbb{R}}(A|_{\mathcal{R}_{\mu_j \pm i\nu_j}(A)})$ is isomorphic to $\mathrm{Lat}_{\mathbb{R}}(B|_{\mathcal{N}_j})$ for $j = p + 1, p + 2, \dots, q$. Using Lemma 2.5 and the condition (2.1), we easily see that $A$ and $B$ have the same strong $\mathbb{R}$-Jordan structure.

Proof of Part 3. (3a) $\Rightarrow$ (3b) follows from Lemma 2.5, by considering root subspaces of $A$ and $B$ associated with eigenvalues that correspond under the permutation $\pi$ of (2). Assume now (3b) holds, and let $\psi : \mathrm{Lat}_{\mathbb{R}}(A) \to \mathrm{Lat}_{\mathbb{R}}(B)$ be a lattice isomorphism. As in the proof of Part 2, letting $\lambda_1, \dots, \lambda_p$ be all the distinct real eigenvalues of $A$, and $\mu_{p+1} \pm i\nu_{p+1}, \dots, \mu_q \pm i\nu_q$ be all the distinct pairs of complex conjugate nonreal eigenvalues of $A$, we obtain that the images $\mathcal{N}_j := \psi(\mathcal{R}_{\lambda_j}(A))$ ($j = 1, 2, \dots, p$) and $\mathcal{N}_j := \psi(\mathcal{R}_{\mu_j \pm i\nu_j}(A))$ ($j = p + 1, p + 2, \dots, q$) are root subspaces of $B$. It follows also that $\mathrm{Lat}_{\mathbb{R}}(A|_{\mathcal{R}_{\lambda_j}(A)})$ is isomorphic to $\mathrm{Lat}_{\mathbb{R}}(B|_{\mathcal{N}_j})$, for $j = 1, 2, \dots, p$, and $\mathrm{Lat}_{\mathbb{R}}(A|_{\mathcal{R}_{\mu_j \pm i\nu_j}(A)})$ is isomorphic to $\mathrm{Lat}_{\mathbb{R}}(B|_{\mathcal{N}_j})$ for $j = p + 1, p + 2, \dots, q$. Now Lemma 2.5 shows that $A$ and $B$ have the same weak $\mathbb{R}$-Jordan structure. □

3. Structure preserving neighborhoods

Let $A \in \mathbb{F}^{n\times n}$, and let $\Omega$ be a non-empty set of distinct eigenvalues of $A$; $\Omega = \{\lambda_1, \dots, \lambda_p\}$. In this section, it will always be assumed that, in the case $\mathbb{F} = \mathbb{R}$, $\Omega$ is closed under complex conjugation. For a fixed positive $\delta$, the $\{\Omega; \delta\}_{\mathbb{F}}$-structure preserving neighborhood of $A$ is defined to consist of all matrices $B \in \mathbb{F}^{n\times n}$ that satisfy the following properties:
1. $\|B - A\| < \delta$;
2. for every $j = 1, 2, \dots, p$, there exists exactly one eigenvalue, call it $\lambda_j(B)$, of $B$ in the open disc $D(\lambda_j; \delta) := \{w \in \mathbb{C} : |w - \lambda_j| < \delta\}$, perhaps of high multiplicity, and the partial multiplicities of the eigenvalue $\lambda_j(B)$ of $B$ are identical with those of the eigenvalue $\lambda_j$ of $A$.

In the above definition, one should think of $\delta$ as small: smaller than a fixed number which depends only on $A$. If $A \in \mathbb{R}^{n\times n}$, and $\Omega$ consists of (not necessarily all) distinct real eigenvalues, then the eigenvalues contained in the discs $D(\lambda_j; \delta)$, $\lambda_j \in \Omega$, of any $B \in \mathbb{R}^{n\times n}$ that belongs to the $\{\Omega; \delta\}_{\mathbb{R}}$-structure preserving neighborhood of $A$, are necessarily real (assuming $\delta$ is sufficiently small).

Proposition 3.1. Let $A \in \mathbb{F}^{n\times n}$ and let $\lambda_1, \dots, \lambda_p$ be the distinct eigenvalues of $A$ with algebraic multiplicities $\alpha_1, \dots, \alpha_p$, respectively. Let
\[ 0 < \delta' < \frac{1}{2} \min_{\lambda, \mu \in \sigma(A),\ \lambda \neq \mu} |\lambda - \mu|. \]
Then for every $\delta > 0$ such that
\[ \delta \leq \delta' \quad \text{and} \quad 3.46^n (2\|A\| + \delta)^{n-1} \delta \leq (\delta')^n, \tag{3.1} \]
we have the property that if $B \in \mathbb{F}^{n\times n}$, $\|B - A\| < \delta$, then the disk $D(\lambda_j; \delta')$ contains exactly $\alpha_j$ eigenvalues of $B$ (counted with multiplicities), for $j = 1, 2, \dots, p$.

Proposition 3.1 is a consequence of the main result of [9]; the constant 3.46, which is an improvement on results obtained earlier in [2, 11], is taken from there. We do not aim at the best possible constant in this proposition.

Theorem 3.2. Let $A \in \mathbb{F}^{n\times n}$. Then for every $\delta > 0$ sufficiently small, and for every nonempty set $\Omega$ of distinct eigenvalues of $A$, the following statements are equivalent:
($\alpha$) $B$ belongs to the $\{\Omega; \delta\}_{\mathbb{F}}$-structure preserving neighborhood of $A$;
($\beta$) $\|B - A\| < \delta$, and the lattices of invariant subspaces of $A|_{\operatorname{Im} P_\Omega(A)}$ and of $B|_{\operatorname{Im} P_{\cup_{\lambda \in \Omega} D(\lambda, \delta')}(B)}$ are isomorphic;
($\gamma$) $\|B - A\| < \delta$, and the lattices of invariant subspaces of $A|_{\operatorname{Im} P_\Omega(A)}$ and of $B|_{\operatorname{Im} P_{\cup_{\lambda \in \Omega} D(\lambda, \delta')}(B)}$ are linearly isomorphic.

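Here, $\delta'$ is taken from Proposition 3.1.

Proof. In the complex case, the result follows from Theorem 2.1, Part 1, combined with Proposition 3.1. Consider now the real case. Suppose ($\alpha$) holds. Let $\delta'$ and $\delta > 0$ be as in Proposition 3.1. Then, we see in view of Proposition 3.1 that $A|_{\operatorname{Im} P_\lambda(A)}$ and $B|_{\operatorname{Im} P_{D(\lambda, \delta')}(B)}$ have the same strong $\mathbb{R}$-Jordan structure for every real $\lambda \in \Omega$, and $A|_{\operatorname{Im} P_{\mu \pm i\nu}(A)}$ and $B|_{\operatorname{Im} P_{D(\mu + i\nu, \delta') \cup D(\mu - i\nu, \delta')}(B)}$ have the same strong $\mathbb{R}$-Jordan structure for every pair $\mu \pm i\nu \in \Omega$, $\mu \in \mathbb{R}$, $\nu > 0$. Theorem 2.1, Part 2 now yields ($\gamma$). Since ($\gamma$) $\Rightarrow$ ($\beta$) is trivial, it remains to prove that ($\beta$) $\Rightarrow$ ($\alpha$). Thus, assume ($\beta$) holds. By Theorem 2.1, Part 3,
\[ A|_{\operatorname{Im} P_\Omega(A)} \quad \text{and} \quad B|_{\operatorname{Im} P_{\cup_{\lambda \in \Omega} D(\lambda, \delta')}(B)} \tag{3.2} \]
have the same weak $\mathbb{R}$-Jordan structure. It will be convenient to write
\[ \Omega = \{\lambda_1, \dots, \lambda_p,\ \mu_{p+1} \pm i\nu_{p+1}, \dots, \mu_q \pm i\nu_q\}, \tag{3.3} \]
where $\lambda_1, \dots, \lambda_p$ are distinct real numbers, and $\mu_{p+1} \pm i\nu_{p+1}, \dots, \mu_q \pm i\nu_q$ are distinct pairs of nonreal complex conjugate numbers. Then
\[ \left(\cup_{\lambda \in \Omega} D(\lambda, \delta')\right) \cap \sigma(B) = \{\lambda_1', \dots, \lambda_{p'}',\ \mu_{p'+1}' \pm i\nu_{p'+1}', \dots, \mu_q' \pm i\nu_q'\} \tag{3.4} \]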

where $\lambda_1', \dots, \lambda_{p'}'$ are distinct real numbers, and $\mu_{p'+1}' \pm i\nu_{p'+1}', \dots, \mu_q' \pm i\nu_q'$ are distinct pairs of nonreal complex conjugate numbers. Since the matrices in (3.2) have the same weak $\mathbb{R}$-Jordan structure, the number $q$ is the same in (3.3) and (3.4). Using Proposition 3.1, it is easy to see that in every disc $D(\mu_j + i\nu_j; \delta')$, $j = p + 1, \dots, q$, there is only one, necessarily nonreal, eigenvalue of $B$ (of algebraic multiplicity equal to that of $\mu_j + i\nu_j$ as an eigenvalue of $A$); otherwise, we obtain a contradiction with the number $q$ being the same in (3.3) and (3.4). On the other hand, there may be either exactly one real eigenvalue or exactly one pair of nonreal complex conjugate eigenvalues of $B$ in every disc $D(\lambda_j; \delta')$, for $j = 1, 2, \dots, p$. Thus, $p' \leq p$, and we may arrange (3.3) and (3.4) so that $\lambda_j' \in D(\lambda_j; \delta')$ for $j = 1, 2, \dots, p'$; $\mu_j' + i\nu_j' \in D(\mu_j + i\nu_j; \delta')$ for $j = p + 1, p + 2, \dots, q$; and $\mu_j' + i\nu_j' \in D(\lambda_j; \delta')$ for $j = p' + 1, \dots, p$.

In fact, $p' = p$. Indeed, by Proposition 2.6, the maximal number of elements in an increasing chain of $A|_{\operatorname{Im} P_\Omega(A)}$-invariant subspaces is
\[ 1 + \sum_{j=1}^{p} (\text{algebraic multiplicity of } \lambda_j) + \sum_{j=p+1}^{q} (\text{algebraic multiplicity of } \mu_j + i\nu_j), \tag{3.5} \]
whereas that number for $B|_{\operatorname{Im} P_{\cup_{\lambda \in \Omega} D(\lambda, \delta')}(B)}$ is
\[ 1 + \sum_{j=1}^{p'} (\text{algebraic multiplicity of } \lambda_j') + \sum_{j=p'+1}^{q} (\text{algebraic multiplicity of } \mu_j' + i\nu_j'). \tag{3.6} \]
The numbers (3.5) and (3.6) cannot be equal unless $p' = p$; on the other hand, (3.5) and (3.6) must be the same in view of the assumption ($\beta$). Thus, $p' = p$.

It will be convenient to change notation, and let $\tau_1, \dots, \tau_q$, resp. $\tau_1', \dots, \tau_q'$, be all eigenvalues of $A|_{\operatorname{Im} P_\Omega(A)}$, resp. of $B|_{\operatorname{Im} P_{\cup_{\lambda \in \Omega} D(\lambda, \delta')}(B)}$, with nonnegative imaginary parts, arranged so that $\tau_j' \in D(\tau_j, \delta')$, for $j = 1, \dots, q$. Denote by
\[ \alpha_j = (\alpha_{j,1} \geq \alpha_{j,2} \geq \cdots \geq \alpha_{j,m} \geq \cdots), \qquad j = 1, 2, \dots, q, \]
the sequence of partial multiplicities of the eigenvalue $\tau_j$ of $A$, arranged in the nonincreasing order and extended indefinitely by zeros, and similarly
\[ \alpha_j' = (\alpha_{j,1}' \geq \alpha_{j,2}' \geq \cdots \geq \alpha_{j,m}' \geq \cdots), \qquad j = 1, 2, \dots, q, \]

for the eigenvalue $\tau_j'$ of $B$.

At this point we recall the well-known majorization relation between nonincreasing sequences of nonnegative integers having finite sum. Let $\alpha = (\alpha_1 \geq \alpha_2 \geq \cdots \geq \alpha_n \geq \cdots)$, $\beta = (\beta_1 \geq \beta_2 \geq \cdots \geq \beta_n \geq \cdots)$ be two such sequences. We say that $\beta$ majorizes $\alpha$, notation: $\beta \succeq \alpha$, if
\[ \sum_{j=1}^{k} \beta_j \geq \sum_{j=1}^{k} \alpha_j, \qquad k = 1, 2, \dots, \qquad \text{and} \qquad \sum_{j=1}^{\infty} \beta_j = \sum_{j=1}^{\infty} \alpha_j. \]
A particular case of the main result of [10, 3] shows that
\[ \alpha_j' \succeq \alpha_j, \qquad j = 1, 2, \dots, q, \tag{3.7} \]
if $\delta$ is sufficiently small. (We use here the facts that $\tau_j'$ is the only eigenvalue of $B$ in the disc $D(\tau_j, \delta')$ and that $p = p'$.) Now let $\pi : \{1, 2, \dots, q\} \to \{1, 2, \dots, q\}$ be the permutation that exists by the definition of $A|_{\operatorname{Im} P_\Omega(A)}$ and $B|_{\operatorname{Im} P_{\cup_{\lambda \in \Omega} D(\lambda, \delta')}(B)}$ having the same weak $\mathbb{R}$-Jordan structure, and let $(j_1, \dots, j_v)$ be a cycle in $\pi$. Then using (3.7) we have
\[ \alpha_{j_1} = \alpha_{j_2}' \succeq \alpha_{j_2} = \alpha_{j_3}' \succeq \alpha_{j_3} = \cdots \succeq \alpha_{j_v} = \alpha_{j_1}' \succeq \alpha_{j_1}. \]
Thus, the equality holds throughout. Repeating this argument for every cycle of $\pi$, we see that we can take $\pi$ to be the identity. This proves ($\alpha$). □

The proof of Theorem 3.2 shows that in the complex case the theorem holds for every $\delta > 0$ satisfying (3.1), and in the real case the theorem holds for every $\delta > 0$ satisfying (3.1) and for which (3.7) is valid.
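The majorization relation used in the proof of Theorem 3.2 is easy to test numerically. The sketch below is an illustration only, assuming finite lists of nonnegative integers that are implicitly extended by zeros; the function name `majorizes` is hypothetical. It also exercises the property on which the cycle argument rests: if two sequences majorize each other, they coincide.

```python
from itertools import accumulate

def majorizes(beta, alpha):
    """True if beta majorizes alpha: equal totals and dominating partial sums."""
    n = max(len(beta), len(alpha))
    # sort into nonincreasing order and pad with zeros to a common length
    b = sorted(beta, reverse=True) + [0] * (n - len(beta))
    a = sorted(alpha, reverse=True) + [0] * (n - len(alpha))
    if sum(a) != sum(b):
        return False
    return all(pb >= pa for pb, pa in zip(accumulate(b), accumulate(a)))

# Partial multiplicities of the two nilpotent 14 x 14 matrices from the Introduction.
alpha = [3, 3, 2, 2, 2, 2]
beta = [3, 3, 3, 3, 2]
print(majorizes(beta, alpha), majorizes(alpha, beta))
```

In this example the first sequence majorizes the second but not conversely, so a chain of relations as in the cycle argument could not close up unless all the sequences involved were equal.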


4. Concluding remarks

Theorems 2.1 (Part 3) and 3.2 allow us to extend the main result of [8] to real matrices, with essentially the same proof. We only formulate the result, omitting details of proof. For a given $A \in \mathbb{R}^{n\times n}$, let $\Upsilon(A, B) = \inf\{\|I - S\|\}$, where the infimum is taken over all invertible matrices $S \in \mathbb{R}^{n\times n}$ such that
\[ \mathcal{M} \in \mathrm{Lat}_{\mathbb{R}}(A) \iff S(\mathcal{M}) \in \mathrm{Lat}_{\mathbb{R}}(B). \tag{4.1} \]

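Remark 4.1. In view of Theorems 2.1 (Part 3) and 3.2, there exists $\delta > 0$ (depending on $A$ only) such that the set of invertible $S \in \mathbb{R}^{n\times n}$ with the property (4.1) is nonempty as soon as $\|B - A\| < \delta$ and $B$ and $A$ have the same weak $\mathbb{R}$-Jordan structure.

Let
\[
\operatorname{dist}(\mathrm{Lat}_{\mathbb{R}}(A), \mathrm{Lat}_{\mathbb{R}}(B)) = \max\left\{ \sup_{\mathcal{M} \in \mathrm{Lat}_{\mathbb{R}}(A)}\ \inf_{\mathcal{N} \in \mathrm{Lat}_{\mathbb{R}}(B)} \|Q_{\mathcal{N}} - Q_{\mathcal{M}}\|,\ \sup_{\mathcal{M} \in \mathrm{Lat}_{\mathbb{R}}(B)}\ \inf_{\mathcal{N} \in \mathrm{Lat}_{\mathbb{R}}(A)} \|Q_{\mathcal{N}} - Q_{\mathcal{M}}\| \right\}
\]
be the distance between the lattice of invariant subspaces of $A \in \mathbb{R}^{n\times n}$ and that of $B \in \mathbb{R}^{n\times n}$; here $Q_{\mathcal{N}}$ is the orthogonal projection on the subspace $\mathcal{N}$. Note that
\[ \operatorname{dist}(\mathrm{Lat}_{\mathbb{R}}(A), \mathrm{Lat}_{\mathbb{R}}(B)) \leq 1 \tag{4.2} \]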

for all $A, B \in \mathbb{R}^{n\times n}$, as it follows, for example, from [4, Theorem S4.5].

Theorem 4.2. Given $A \in \mathbb{R}^{n\times n}$, there exists $\delta > 0$ such that
\[ \sup \frac{\Upsilon(A, B)}{\|B - A\|} < \infty, \tag{4.3} \]
where the supremum is taken over all $B \in \mathbb{R}^{n\times n}$ that satisfy $\|B - A\| < \delta$ and have the same weak $\mathbb{R}$-Jordan structure as $A$ does. Moreover,
\[ \sup \frac{\operatorname{dist}(\mathrm{Lat}_{\mathbb{R}}(A), \mathrm{Lat}_{\mathbb{R}}(B))}{\|B - A\|} < \infty, \tag{4.4} \]
where the supremum is taken over all $B \in \mathbb{R}^{n\times n}$ which have the same weak $\mathbb{R}$-Jordan structure as $A$ does.

Using Remark 4.1, one proves as in [8] that (4.3) holds for sufficiently small $\delta > 0$. In view of [8, Theorem 3.1] (which is valid in the real case as well), we have that (4.4) holds provided
\[ \|B - A\| < \min\left\{ \delta,\ \frac{1}{2} \left( \sup \frac{\Upsilon(A, B)}{\|B - A\|} \right)^{-1} \right\} \tag{4.5} \]
and $A$ and $B$ have the same weak $\mathbb{R}$-Jordan structure; here $\delta > 0$ is such that (4.3) holds. Using (4.2), we obtain the following inequality for every $B$ having the same weak $\mathbb{R}$-Jordan structure as $A$:
\[ \operatorname{dist}(\mathrm{Lat}_{\mathbb{R}}(A), \mathrm{Lat}_{\mathbb{R}}(B)) \leq \min\{1/\delta_0, M\}\, \|B - A\|, \]
where $M$ is the supremum in (4.3), and $\delta_0$ is the right-hand side of (4.5).

References
[1] T. Bella, V. Olshevsky, and U. Prasad, Lipschitz stability of canonical Jordan bases of H-selfadjoint matrices under structure-preserving perturbations. Linear Algebra Appl. 428 (2008), 2130–2176.
[2] R. Bhatia, L. Elsner, and G. Krause, Bounds for the variation of the roots of a polynomial and the eigenvalues of a matrix. Linear Algebra Appl. 142 (1990), 195–209.
[3] H. den Boer and G.Ph.A. Thijsse, Semistability of sums of partial multiplicities under additive perturbation. Integral Equations Operator Theory 3 (1980), 23–42.
[4] I. Gohberg, P. Lancaster, and L. Rodman, Matrix Polynomials. Academic Press, 1982; republication, SIAM 2009.
[5] I. Gohberg, P. Lancaster, and L. Rodman, Matrices and Indefinite Scalar Products. Birkhäuser Verlag, Basel, 1983.
[6] I. Gohberg, P. Lancaster, and L. Rodman, Indefinite Linear Algebra and Applications. Birkhäuser Verlag, 2005.
[7] I. Gohberg, P. Lancaster, and L. Rodman, Invariant Subspaces of Matrices with Applications, J. Wiley, New York, 1986; republication, SIAM, 2006.
[8] I. Gohberg and L. Rodman, On the distance between lattices of invariant subspaces of matrices. Linear Algebra Appl. 76 (1986), 85–120.
[9] G.M. Krause, Bounds for the variation of matrix eigenvalues and polynomial roots. Linear Algebra Appl. 208/209 (1994), 73–82.
[10] A.S. Markus and E.E. Parilis, The change of the Jordan structure of a matrix under small perturbations. Linear Algebra Appl. 54 (1983), 139–152.
[11] D. Phillips, Improving spectral-variation bounds with Chebyshev polynomials. Linear Algebra Appl. 133 (1990), 165–173.
[12] L. Rodman, Similarity vs unitary similarity and perturbation analysis of sign characteristic: Complex and real indefinite inner products. Linear Algebra Appl. 416 (2006), 945–1009.

Leiba Rodman
Department of Mathematics
College of William and Mary
Williamsburg, VA 23187-8795, USA
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 218, 583–612
© 2012 Springer Basel AG

Pseudospectral Functions for Canonical Differential Systems. II

J. Rovnyak and L.A. Sakhnovich

To the memory of Israel Gohberg

Abstract. A spectral theory is constructed for canonical differential systems whose Hamiltonians have selfadjoint matrix values. In contrast with the case of nonnegative Hamiltonians, eigenvalues in general can be complex, and root functions as well as eigenfunctions come into play. Eigentransforms are defined and turn out to be isometric on the span of root functions with respect to a suitably defined indefinite inner product on entire functions.

Mathematics Subject Classification (2000). Primary 34L10; Secondary 47B50, 47E05, 46C20, 34B09.

Keywords. Canonical differential equation, root function, pseudospectral function, spectral function, indefinite inner product, eigentransform.

1. Introduction

We are concerned with the spectral theory of canonical differential systems
\[
\frac{dY}{dx} = izJH(x)Y, \quad 0 \leq x \leq \ell, \qquad \begin{bmatrix} I_m & 0 \end{bmatrix} Y(0, z) = 0. \tag{1.1}
\]
In (1.1), we assume that
\[
J = \begin{bmatrix} 0 & I_m \\ I_m & 0 \end{bmatrix}, \qquad Y(x, z) = \begin{bmatrix} Y_1(x, z) \\ Y_2(x, z) \end{bmatrix}, \tag{1.2}
\]
where $Y_1(x, z)$ and $Y_2(x, z)$ are $m$-dimensional vector-valued functions, and $z$ is a complex parameter. As in [3], the Hamiltonian $H(x)$ is assumed to be a measurable $2m \times 2m$ matrix-valued function such that
\[ H(x) = H(x)^* \ \text{a.e.} \quad \text{and} \quad \int_0^\ell \|H(x)\|\,dx < \infty. \tag{1.3} \]


For technical reasons, we also assume throughout that
\[ H(x) \begin{bmatrix} 0 \\ g \end{bmatrix} = 0 \ \text{a.e. on } [0, \ell] \implies g = 0. \tag{1.4} \]
For any such system we define $L^2(H\,dx)$ as a Kreĭn space of (equivalence classes of) $2m$-dimensional vector-valued functions on $[0, \ell]$. Write $H(x) = H_+(x) - H_-(x)$ where $H_\pm(x)$ are measurable, $H_\pm(x) \geq 0$, and $H_+(x)H_-(x) = 0$ a.e. As a linear space, $L^2(H\,dx)$ is the set of measurable $2m$-dimensional vector-valued functions $f$ on $[0, \ell]$ such that
\[ \int_0^\ell f(t)^* [H_+(t) + H_-(t)] f(t)\,dt < \infty. \]
Two functions $f_1$ and $f_2$ in $L^2(H\,dx)$ are identified if $H(x)[f_1(x) - f_2(x)] = 0$ a.e. Taken with the inner product
\[ \langle f_1, f_2 \rangle_H = \int_0^\ell f_2(x)^* H(x) f_1(x)\,dx, \qquad f_1, f_2 \in L^2(H\,dx), \]
$L^2(H\,dx)$ is a Kreĭn space. In a natural way we can view $L^2(H_\pm\,dx)$ as closed subspaces, and then $L^2(H\,dx) = L^2(H_+\,dx) \oplus L^2(H_-\,dx)$ is a fundamental decomposition. Define an eigentransform $F = Vf$ for any $f$ in $L^2(H\,dx)$ by
\[ F(z) = \int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W(x, \bar{z})^* H(x) f(x)\,dx, \tag{1.5} \]
where $W(x, z)$ is the unique $2m \times 2m$ matrix-valued function such that
\[ \frac{dW}{dx} = izJH(x)W, \qquad W(0, z) = I_{2m}, \qquad 0 \leq x \leq \ell,\ z \in \mathbb{C}. \tag{1.6} \]
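To make the matrix function $W(x, z)$ of (1.6) concrete, one can look at a toy case that is not treated in the paper: $m = 1$ and the constant Hamiltonian $H(x) \equiv I_2$, chosen here purely as an assumption for illustration. Since $J^2 = I$, the solution of (1.6) is then $W(x, z) = \cos(zx)\,I + i\sin(zx)\,J$. The minimal sketch below integrates (1.6) with the classical Runge-Kutta scheme and compares the result with this closed form; all function names are hypothetical.

```python
import cmath

J = [[0, 1], [1, 0]]          # J from (1.2) with m = 1
IDENTITY = [[1, 0], [0, 1]]   # assumed toy Hamiltonian H(x) = I_2

def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add_scaled(A, B, c):
    """Return A + c*B for 2 x 2 matrices."""
    return [[A[i][j] + c * B[i][j] for j in range(2)] for i in range(2)]

def solve_W(z, ell, H=lambda x: IDENTITY, steps=2000):
    """Approximate W(ell, z) for dW/dx = i z J H(x) W, W(0, z) = I, by RK4."""
    def rhs(x, M):
        JHM = matmul(matmul(J, H(x)), M)
        return [[1j * z * entry for entry in row] for row in JHM]

    W = [[1 + 0j, 0j], [0j, 1 + 0j]]
    h = ell / steps
    for n in range(steps):
        x = n * h
        k1 = rhs(x, W)
        k2 = rhs(x + h / 2, add_scaled(W, k1, h / 2))
        k3 = rhs(x + h / 2, add_scaled(W, k2, h / 2))
        k4 = rhs(x + h, add_scaled(W, k3, h))
        # W <- W + (h/6) * (k1 + 2*k2 + 2*k3 + k4)
        for K, weight in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            W = add_scaled(W, K, weight * h / 6)
    return W

def closed_form_W(z, x):
    """cos(zx) I + i sin(zx) J, valid only for the toy choice H = I_2."""
    c, s = cmath.cos(z * x), cmath.sin(z * x)
    return [[c, 1j * s], [1j * s, c]]

z, ell = 0.7 + 0.3j, 1.0
numeric, exact = solve_W(z, ell), closed_form_W(z, ell)
print(max(abs(numeric[i][j] - exact[i][j]) for i in range(2) for j in range(2)))
```

A different measurable selfadjoint Hamiltonian can be passed through the `H` argument, although the closed form then no longer applies and the fixed-step integrator is only a rough numerical device.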

The function 𝑊 (𝑥, 𝑧) is continuous on [0, ℓ]×ℂ and entire in 𝑧 for each ﬁxed 𝑥. For each 𝑓 in 𝐿2 (𝐻𝑑𝑥), 𝐹 = 𝑉 𝑓 is an 𝑚-dimensional vector-valued entire function. Throughout we write [ ] 𝑎(𝑧) 𝑏(𝑧) ∗ 𝑊 (ℓ, 𝑧¯) = . (1.7) 𝑐(𝑧) 𝑑(𝑧) Here 𝑎(𝑧), 𝑏(𝑧), 𝑐(𝑧), 𝑑(𝑧) are 𝑚 × 𝑚 matrix-valued entire functions. Consider ﬁrst the deﬁnite case, that is, 𝐻(𝑥) ≥ 0 a.e. Then 𝐿2 (𝐻𝑑𝑥) is a Hilbert space. In this case, by a spectral function for (1.1) is meant a nondecreasing 𝑚 × 𝑚 matrix-valued function 𝜏 (𝑥) of real 𝑥 such that the eigentransform 𝑉 acts as an isometry from 𝐿2 (𝐻𝑑𝑥) into 𝐿2 (𝑑𝜏 ). We call 𝜏 (𝑥) a pseudospectral function for (1.1) if 𝑉 is a partial isometry from 𝐿2 (𝐻𝑑𝑥) into 𝐿2 (𝑑𝜏 ). Pseudospectral

Pseudospectral Functions


functions can be constructed using a boundary condition at the right endpoint of the interval $[0,\ell]$. The boundary condition has the form
\[ \begin{bmatrix} R^* & Q^* \end{bmatrix} Y(\ell,z) = 0, \]
where $R$ and $Q$ are $m\times m$ matrices such that $R^*Q + Q^*R = 0$, and such that the entire function $c(z)R + d(z)Q$ has invertible values except at isolated points. Then
\[ v(z) = i[a(z)R + b(z)Q][c(z)R + d(z)Q]^{-1} \]
is meromorphic in the complex plane, $v(z) = v(\bar z)^*$ at all points of analyticity, and $v(z)$ has nonnegative imaginary part in the upper half-plane. In particular, $v(z)$ has only real and simple poles and a representation
\[ v(z) = \alpha + \beta z + \int_{-\infty}^{\infty}\left[\frac{1}{t-z} - \frac{t}{1+t^2}\right] d\tau(t), \tag{1.8} \]
where $\tau(x)$ is a nondecreasing $m\times m$ matrix-valued step function with jumps at the poles of $v(z)$, and $\alpha = \alpha^*$ and $\beta\ge 0$ are constant $m\times m$ matrices. The function $\tau(x)$ is a pseudospectral function. The isometric set for the eigentransform $V$ is the closed span of eigenfunctions. See [4, Chapter 4] and Theorems 4.2.2, 4.2.4, and 4.2.5 in [3].

In this paper we generalize the preceding constructions to Hamiltonians such that $H(x) = H(x)^*$ a.e. We introduce a meromorphic function $v(z)$ in the same way as before. Now, however, $v(z)$ can have nonreal and nonsimple poles, and in general there is no representation of $v(z)$ in the form (1.8). In place of eigenfunctions, we have to deal now with eigenchains of root functions. The role of a pseudospectral function is replaced by a notion of pseudospectral data, which consists of the collection of poles and principal parts of the meromorphic function $v(z)$. The poles and principal parts of $v(z)$ are used to construct an inner product $\langle\cdot,\cdot\rangle$ on vector-valued entire functions. According to our main result, Theorem 4.7, the identity
\[ \int_0^\ell f_2(t)^*H(t)f_1(t)\,dt = \langle F_1,F_2\rangle \]

holds whenever 𝑓1 and 𝑓2 are ﬁnite linear combinations of root functions and 𝐹1 and 𝐹2 are their eigentransforms. This agrees with Theorem 4.1.11 of [3] for the special case when 𝑣(𝑧) has only simple poles. The general case turns out to be quite a bit more involved. In Section 2 of the paper, we expand the function 𝑊 (𝑥, 𝑧) in a Taylor series about a point 𝑧 = 𝑤. The higher-order coeﬃcients in this expansion do not arise in the deﬁnite theory, but they are important in the general case considered here. In Section 3 we derive explicit formulas for the root functions and their eigentransforms. These formulas are needed for the main results of the paper, which appear in Section 4. Remark. We thank the referee for the comment that the construction of a related linear operator and its resolvent might yield insights into our main results. We leave this as an open question. Concerning such related linear operators, see the remark


preceding Proposition 3.1. See also Section 3 of [3], where resolvent operators for canonical diﬀerential systems are investigated.

2. Taylor expansions and their coefficients

Assume given a system (1.1)–(1.4). Define $W(x,z)$ and $a(z)$, $b(z)$, $c(z)$, $d(z)$ as in (1.6) and (1.7). By (1.6),
\[ \frac{d}{dt}\Bigl\{ W(t,\bar z)^* J W(t,w) \Bigr\} = i(w-z)\,W(t,\bar z)^* H(t) W(t,w) \]
a.e. on $[0,\ell]$ for all complex $w$ and $z$. We deduce that
\[ W(x,\bar z)^* J W(x,z) = W(x,z) J W(x,\bar z)^* = J, \tag{2.1} \]
and
\[ \int_0^\ell W(t,\bar z)^* H(t) W(t,w)\,dt = \frac{W(\ell,\bar z)^* J W(\ell,w) - J}{i(w-z)}. \tag{2.2} \]
Set
\[ W(x,z) = \sum_{j=0}^{\infty} W_j(x,w)(z-w)^j, \tag{2.3} \]
\[ \begin{bmatrix} a(z) & b(z) \\ c(z) & d(z) \end{bmatrix} = \sum_{j=0}^{\infty} \begin{bmatrix} a_j(w) & b_j(w) \\ c_j(w) & d_j(w) \end{bmatrix} (z-w)^j, \tag{2.4} \]
for all $x$ in $[0,\ell]$ and $w$ in $\mathbb{C}$. Using the values $x = 0$ and $x = \ell$, we get
\[ W_0(0,w) = I_{2m}, \qquad W_j(0,w) = 0, \quad j\ge 1, \qquad W_j(\ell,w) = \begin{bmatrix} a_j(\bar w)^* & c_j(\bar w)^* \\ b_j(\bar w)^* & d_j(\bar w)^* \end{bmatrix}, \quad j\ge 0. \tag{2.5} \]
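As a sanity check on (2.1)–(2.5), the following sketch works out the simplest case by hand. It is illustrative only and not taken from the paper: it assumes $m = 1$, $H(x)\equiv I_2$, and $J = \begin{bmatrix} 0 & 1 \\ 1 & 0\end{bmatrix}$ (the form of $J$ is an assumption here, since (1.2) is not restated in this section).

```latex
% Model case (assumption): m = 1, H(x) = I_2, J = [[0,1],[1,0]], so J^2 = I_2.
\[
W(x,z) = e^{izxJ} = \cos(zx)\,I_2 + i\sin(zx)\,J
       = \begin{bmatrix} \cos zx & i\sin zx \\ i\sin zx & \cos zx \end{bmatrix}.
\]
% (1.7): since W(\ell,\bar z)^* = e^{-iz\ell J},
\[
W(\ell,\bar z)^* =
\begin{bmatrix} \cos z\ell & -i\sin z\ell \\ -i\sin z\ell & \cos z\ell \end{bmatrix},
\qquad a(z) = d(z) = \cos z\ell, \quad b(z) = c(z) = -i\sin z\ell .
\]
% (2.1): e^{\pm izxJ} commutes with J.
\[
W(x,\bar z)^* J\, W(x,z) = e^{-izxJ}\, J\, e^{izxJ} = J .
\]
% (2.3): Taylor coefficients about z = w, from W(x,z) = e^{i(z-w)xJ} W(x,w).
\[
W_j(x,w) = \frac{(ixJ)^j}{j!}\, W(x,w), \qquad W_0(0,w) = I_2, \quad W_j(0,w) = 0 \ (j \ge 1).
\]
```

In this model $c(0) = 0$ and $d(0) = 1$, and the values $W_j(0,w)$ agree with the first line of (2.5).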

For each $j\ge 0$, $W_j(x,w)$ is continuous on $[0,\ell]$ and entire in $w$ for fixed $x$. To prove this, represent the coefficients as Cauchy integrals as in (2.8) below and use the corresponding properties for $W(x,z)$.

Proposition 2.1. For every $w\in\mathbb{C}$,
\[
\begin{aligned}
\frac{d}{dx} W_0(x,w) &= iwJH(x)W_0(x,w), \\
\frac{d}{dx} W_j(x,w) &= iwJH(x)W_j(x,w) + iJH(x)W_{j-1}(x,w), \qquad j\ge 1,
\end{aligned}
\tag{2.6}
\]
a.e. on $[0,\ell]$.

Proof. The first equation in (2.6) holds by (1.6) since $W_0(x,z) = W(x,z)$. Since $W_j(0,w) = 0$ for $j\ge 1$, the second equation in (2.6) is equivalent to
\[ W_j(x,w) = iwJ\int_0^x H(t)W_j(t,w)\,dt + iJ\int_0^x H(t)W_{j-1}(t,w)\,dt, \qquad 0\le x\le \ell. \tag{2.7} \]
Let $\Gamma$ be a circular path around $w$ in the counterclockwise direction. For each $x$ in $[0,\ell]$ and $k\ge 0$,
\[ W_k(x,w) = \frac{1}{2\pi i}\int_\Gamma \frac{W(x,\zeta)}{(\zeta-w)^{k+1}}\,d\zeta. \tag{2.8} \]
To prove (2.7), first write (1.6) in the form
\[ W(x,\zeta) = I_{2m} + i\zeta J\int_0^x H(t)W(t,\zeta)\,dt. \]
Since we assume $j\ge 1$, $\int_\Gamma d\zeta/(\zeta-w)^{j+1} = 0$. Thus
\[
\begin{aligned}
\frac{1}{2\pi i}\int_\Gamma \frac{W(x,\zeta)}{(\zeta-w)^{j+1}}\,d\zeta
&= \frac{1}{2\pi i}\int_\Gamma \left[ i\zeta J\int_0^x H(t)W(t,\zeta)\,dt \right] \frac{d\zeta}{(\zeta-w)^{j+1}} \\
&= \frac{1}{2\pi i}\int_\Gamma \left[ iwJ\int_0^x H(t)W(t,\zeta)\,dt + i(\zeta-w)J\int_0^x H(t)W(t,\zeta)\,dt \right] \frac{d\zeta}{(\zeta-w)^{j+1}} \\
&= iwJ\int_0^x H(t)\,\frac{1}{2\pi i}\int_\Gamma \frac{W(t,\zeta)}{(\zeta-w)^{j+1}}\,d\zeta\,dt
 + iJ\int_0^x H(t)\,\frac{1}{2\pi i}\int_\Gamma \frac{W(t,\zeta)}{(\zeta-w)^{j}}\,d\zeta\,dt.
\end{aligned}
\]
By (2.8), this is the same as (2.7). The interchange in order of integration is justified because $\|H(t)\|\,\|W(t,\zeta)\|$ is integrable over $[0,\ell]\times\Gamma$. □

Proposition 2.2. For all $w\in\mathbb{C}$, $x\in[0,\ell]$, and $n\ge 0$,
\[
\sum_{p+q=n} W_p(x,\bar w)^* J W_q(x,w) = \sum_{p+q=n} W_p(x,w) J W_q(x,\bar w)^*
= \begin{cases} J, & n = 0, \\ 0, & n\ge 1. \end{cases}
\tag{2.9}
\]

Proof. By (2.1) and (2.3),
\[
\sum_{p=0}^{\infty} W_p(x,\bar w)^*(z-w)^p\, J \sum_{q=0}^{\infty} W_q(x,w)(z-w)^q
= \sum_{p=0}^{\infty} W_p(x,w)(z-w)^p\, J \sum_{q=0}^{\infty} W_q(x,\bar w)^*(z-w)^q = J.
\]
The relations (2.9) follow on expanding the products and collecting powers of $z-w$: the constant terms equal $J$, and all other coefficients are zero. □


Corollary 2.3. For every $w\in\mathbb{C}$,
\[
\begin{aligned}
a_0(w)b_0(\bar w)^* + b_0(w)a_0(\bar w)^* &= 0, &\qquad a_0(\bar w)^*c_0(w) + c_0(\bar w)^*a_0(w) &= 0, \\
a_0(w)d_0(\bar w)^* + b_0(w)c_0(\bar w)^* &= I_m, &\qquad a_0(\bar w)^*d_0(w) + c_0(\bar w)^*b_0(w) &= I_m, \\
c_0(w)d_0(\bar w)^* + d_0(w)c_0(\bar w)^* &= 0, &\qquad b_0(\bar w)^*d_0(w) + d_0(\bar w)^*b_0(w) &= 0.
\end{aligned}
\tag{2.10}
\]
For all $n\ge 1$,
\[
\begin{aligned}
\sum_{p+q=n} [a_p(w)b_q(\bar w)^* + b_p(w)a_q(\bar w)^*] &= 0, \\
\sum_{p+q=n} [a_p(w)d_q(\bar w)^* + b_p(w)c_q(\bar w)^*] &= 0, \\
\sum_{p+q=n} [c_p(w)d_q(\bar w)^* + d_p(w)c_q(\bar w)^*] &= 0,
\end{aligned}
\tag{2.11}
\]
and
\[
\begin{aligned}
\sum_{p+q=n} [a_p(\bar w)^*c_q(w) + c_p(\bar w)^*a_q(w)] &= 0, \\
\sum_{p+q=n} [a_p(\bar w)^*d_q(w) + c_p(\bar w)^*b_q(w)] &= 0, \\
\sum_{p+q=n} [b_p(\bar w)^*d_q(w) + d_p(\bar w)^*b_q(w)] &= 0.
\end{aligned}
\tag{2.12}
\]

Proof. These identities follow on choosing $x = \ell$ in (2.9) and expanding using (1.7). The relations (2.10) follow from the case $n = 0$ and coincide with the formulas (2.1.5) of [3]. Suppose $n\ge 1$. Then by (2.9),
\[
\begin{aligned}
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
&= \sum_{p+q=n} W_p(\ell,\bar w)^* J W_q(\ell,w) \\
&= \sum_{p+q=n} \begin{bmatrix} a_p(w) & b_p(w) \\ c_p(w) & d_p(w) \end{bmatrix}
   \begin{bmatrix} b_q(\bar w)^* & d_q(\bar w)^* \\ a_q(\bar w)^* & c_q(\bar w)^* \end{bmatrix} \\
&= \sum_{p+q=n} \begin{bmatrix}
   a_p(w)b_q(\bar w)^* + b_p(w)a_q(\bar w)^* & a_p(w)d_q(\bar w)^* + b_p(w)c_q(\bar w)^* \\
   c_p(w)b_q(\bar w)^* + d_p(w)a_q(\bar w)^* & c_p(w)d_q(\bar w)^* + d_p(w)c_q(\bar w)^*
   \end{bmatrix},
\end{aligned}
\]
yielding (2.11). We prove (2.12) in a similar way using (2.9). □

Proposition 2.4. For all $w, z\in\mathbb{C}$ and $n\ge 0$,
\[
\int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W(t,\bar z)^* H(t) W_n(t,w)\,dt \begin{bmatrix} 0 \\ I_m \end{bmatrix}
= i\sum_{p+q=n} \frac{c(z)d_p(\bar w)^* + d(z)c_p(\bar w)^*}{(z-w)^{q+1}}
= \sum_{k=0}^{\infty} \Delta_{nk}(w)(z-w)^k.
\tag{2.13}
\]
In (2.13), for all $n, k\ge 0$,
\[ \Delta_{nk}(w) = i\sum_{p+q=n} [c_{q+k+1}(w)d_p(\bar w)^* + d_{q+k+1}(w)c_p(\bar w)^*], \tag{2.14} \]
and the middle expression is interpreted by continuity for $z = w$. Moreover,
\[ \Delta_{nk}(w) = \int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W_k(t,\bar w)^* H(t) W_n(t,w)\,dt \begin{bmatrix} 0 \\ I_m \end{bmatrix} \tag{2.15} \]
and
\[ \Delta_{nk}(\bar w)^* = \Delta_{kn}(w). \tag{2.16} \]

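Proof. By (2.2) and (1.7),
\[
\begin{aligned}
\int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W(t,\bar z)^* H(t) W(t,\lambda)\,dt \begin{bmatrix} 0 \\ I_m \end{bmatrix}
&= \begin{bmatrix} 0 & I_m \end{bmatrix} \frac{W(\ell,\bar z)^* J W(\ell,\lambda) - J}{i(\lambda - z)} \begin{bmatrix} 0 \\ I_m \end{bmatrix} \\
&= i\,\frac{c(z)d(\bar\lambda)^* + d(z)c(\bar\lambda)^*}{z-\lambda}.
\end{aligned}
\tag{2.17}
\]
Using the expansions
\[ W(t,\lambda) = \sum_{n=0}^{\infty} W_n(t,w)(\lambda-w)^n, \tag{2.18} \]
\[ \begin{bmatrix} d(\bar\lambda)^* \\ c(\bar\lambda)^* \end{bmatrix} = \sum_{p=0}^{\infty} \begin{bmatrix} d_p(\bar w)^* \\ c_p(\bar w)^* \end{bmatrix} (\lambda-w)^p, \tag{2.19} \]
\[ \frac{i}{z-\lambda} = \frac{i}{z-w}\cdot\frac{1}{1 - \dfrac{\lambda-w}{z-w}} = i\sum_{q=0}^{\infty} \frac{(\lambda-w)^q}{(z-w)^{q+1}}, \tag{2.20} \]
we obtain
\[
\begin{aligned}
\int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W(t,\bar z)^* H(t) \sum_{n=0}^{\infty} W_n(t,w)(\lambda-w)^n\,dt \begin{bmatrix} 0 \\ I_m \end{bmatrix}
&= \frac{i}{z-\lambda} \begin{bmatrix} c(z) & d(z) \end{bmatrix} \begin{bmatrix} d(\bar\lambda)^* \\ c(\bar\lambda)^* \end{bmatrix} \\
&= i \begin{bmatrix} c(z) & d(z) \end{bmatrix} \sum_{p=0}^{\infty} \begin{bmatrix} d_p(\bar w)^* \\ c_p(\bar w)^* \end{bmatrix} (\lambda-w)^p \sum_{q=0}^{\infty} \frac{(\lambda-w)^q}{(z-w)^{q+1}} \\
&= \sum_{n=0}^{\infty} (\lambda-w)^n\, i \sum_{p+q=n} \frac{c(z)d_p(\bar w)^* + d(z)c_p(\bar w)^*}{(z-w)^{q+1}}.
\end{aligned}
\tag{2.21}
\]
In fact, in (2.21) the first equality is identical to (2.17) by the Taylor expansion for $W(t,\lambda)$ in (2.18); the second equality substitutes the two Taylor expansions in (2.19) and (2.20); the third equality collects powers of $\lambda - w$. The first equality in (2.13) follows from (2.21) on interchanging the order of integration and summation on the left and comparing coefficients.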


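To prove the second equality in (2.13), expand $c(z)$ and $d(z)$ in Taylor series, and write
\[
\begin{aligned}
i\sum_{p+q=n} \frac{c(z)d_p(\bar w)^* + d(z)c_p(\bar w)^*}{(z-w)^{q+1}}
&= i\sum_{p+q=n}\sum_{k=0}^{\infty} \begin{bmatrix} c_k(w) & d_k(w) \end{bmatrix} \begin{bmatrix} d_p(\bar w)^* \\ c_p(\bar w)^* \end{bmatrix} \frac{(z-w)^k}{(z-w)^{q+1}} \\
&= i\sum_{p+q=n}\sum_{k=-q-1}^{\infty} \begin{bmatrix} c_{k+q+1}(w) & d_{k+q+1}(w) \end{bmatrix} \begin{bmatrix} d_p(\bar w)^* \\ c_p(\bar w)^* \end{bmatrix} (z-w)^k \\
&= i\sum_{p+q=n}\sum_{k=0}^{\infty} \begin{bmatrix} c_{k+q+1}(w) & d_{k+q+1}(w) \end{bmatrix} \begin{bmatrix} d_p(\bar w)^* \\ c_p(\bar w)^* \end{bmatrix} (z-w)^k \\
&\quad + i\sum_{p+q=n}\sum_{k=-q-1}^{-1} \begin{bmatrix} c_{k+q+1}(w) & d_{k+q+1}(w) \end{bmatrix} \begin{bmatrix} d_p(\bar w)^* \\ c_p(\bar w)^* \end{bmatrix} (z-w)^k \\
&= \sum_{k=0}^{\infty} \Delta_{nk}(w)(z-w)^k + \text{Term 2}.
\end{aligned}
\tag{2.22}
\]
Here Term 2 $= 0$, since by the first equality in (2.13), proved above, the left side of (2.22) is entire. Thus (2.22) yields the second equality in (2.13) with $\Delta_{nk}(w)$ defined by (2.14). The identity (2.15) follows from (2.13) on expanding $W(t,\bar z)^*$ in a Taylor series about $z = w$ and comparing coefficients. Then (2.16) follows from (2.15). □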

3. Root spaces and eigenchains

We now add a boundary condition at the right endpoint of the interval $[0,\ell]$. Thus we consider a system
\[
\frac{dY}{dx} = izJH(x)Y, \quad 0\le x\le \ell, \qquad
\begin{bmatrix} I_m & 0 \end{bmatrix} Y(0,z) = 0, \qquad \begin{bmatrix} R^* & Q^* \end{bmatrix} Y(\ell,z) = 0,
\tag{3.1}
\]
subject to the conditions (1.2)–(1.4). Define $a(z)$, $b(z)$, $c(z)$, $d(z)$ by (1.7) as before. We assume two conditions:

(1°) $R$ and $Q$ are $m\times m$ matrices such that $R^*Q + Q^*R = 0$;
(2°) the values of $c(z)R + d(z)Q$ are invertible except at isolated points.

There are many choices of matrices meeting these conditions because $c(0) = 0$ and $d(0) = I_m$. The operator $R^*R + Q^*Q$ is invertible, since otherwise $c(z)R + d(z)Q$ has no invertible value, in violation of (2°).


Notice that (2°) assures that the function
\[ v(z) = i[a(z)R + b(z)Q][c(z)R + d(z)Q]^{-1} \tag{3.2} \]
is defined except at isolated points. This function is meromorphic on $\mathbb{C}$, and it satisfies $v(z) = v(\bar z)^*$ by [3, Proposition 2.3.1]. The poles and principal parts of $v(z)$ contain important information for the spectral theory of the system (3.1).

For each $\zeta\in\mathbb{C}$, let $\mathfrak{L}_\zeta^{(0)}$ be the set of all solutions $Y = Y(x)$ of (3.1) with $z = \zeta$. If $\mathfrak{L}_\zeta^{(0)},\dots,\mathfrak{L}_\zeta^{(k)}$ have been defined, let $\mathfrak{L}_\zeta^{(k+1)}$ be the set of all $Y = Y(x)$ such that
\[
\frac{dY}{dx} = i\zeta JH(x)Y + JH(x)Y^{(k)}, \qquad
\begin{bmatrix} I_m & 0 \end{bmatrix} Y(0) = 0, \qquad \begin{bmatrix} R^* & Q^* \end{bmatrix} Y(\ell) = 0,
\tag{3.3}
\]
for some $Y^{(k)}\in\mathfrak{L}_\zeta^{(k)}$. We call $\mathfrak{L}_\zeta^{(0)}, \mathfrak{L}_\zeta^{(1)}, \dots$ root spaces. Elements of these spaces are root functions. Root spaces are linear spaces which we view as subspaces of $L^2(H\,dx)$. By Proposition 3.2 below there is a largest root space
\[ \mathfrak{L}_\zeta = \bigcup_{j=0}^{\infty} \mathfrak{L}_\zeta^{(j)} = \mathfrak{L}_\zeta^{(\mu)}. \tag{3.4} \]

We say that $\zeta$ is an eigenvalue for (3.1) if $\mathfrak{L}_\zeta \ne \{0\}$ as a subspace of $L^2(H\,dx)$.

Remark. Following [2, 4], we work directly with canonical differential systems and make no use of underlying operators on $L^2(H\,dx)$. Nevertheless, it may be noted that our definitions of eigenvalue and root space are equivalent to standard operator definitions. The root subspaces $\mathfrak{K}_0, \mathfrak{K}_1, \dots$ for a bounded linear operator $T$ and eigenvalue $\zeta$ are defined recursively by $\mathfrak{K}_0 = \ker(T - \zeta I)$ and $\mathfrak{K}_{j+1} = \{f : (T - \zeta I)f \in \mathfrak{K}_j\}$ for all $j = 0, 1, \dots$. With due attention to domains, the same definition is used for an unbounded operator. If $H(x)$ has invertible values, we can take $T = -iH(x)^{-1}J\,d/dx$ with domain specified by boundary values [4, p. 49]. The two notions of eigenvalue and root space then coincide. In principle, one can reduce to the case of invertible Hamiltonian with a transformation given in [4, p. 143] that replaces $W(x,z)$ with $\widetilde W(x,z) = e^{-iz\gamma x}W(x,z)$ for some $\gamma > 0$. This yields a new system with selfadjoint Hamiltonian $\widetilde H(x) = H(x) - \gamma J$. If $H(x)$ is bounded and $\gamma$ is sufficiently large, $\widetilde H(x)$ has invertible values. The transformation is well behaved with respect to eigenvalues and root spaces. We do not use these constructions and therefore omit details.

Proposition 3.1. For any complex number $\zeta$, the following are equivalent:
(i) $\zeta$ is an eigenvalue of (3.1);
(ii) $c(\zeta)R + d(\zeta)Q$ is not invertible;
(iii) $\zeta$ is a pole of $v(z)$.
The eigenvalues of (3.1) are isolated points in the complex plane and occur in conjugate pairs $\zeta, \bar\zeta$.
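The three equivalent conditions of Proposition 3.1 can be seen side by side in a small hand computation. The sketch below is illustrative and not from the paper: it assumes $m = 1$, $H(x)\equiv I_2$, $J = \begin{bmatrix} 0 & 1 \\ 1 & 0\end{bmatrix}$, and the boundary data $R = 0$, $Q = 1$.

```latex
% Model case (assumption): m = 1, H = I_2, J = [[0,1],[1,0]], R = 0, Q = 1.
% Then R^*Q + Q^*R = 0, W(x,z) = e^{izxJ}, and from W(\ell,\bar z)^* = e^{-iz\ell J}:
%   a(z) = d(z) = \cos z\ell,   b(z) = c(z) = -i\sin z\ell.
\[
c(z)R + d(z)Q = \cos z\ell, \qquad
v(z) = i\,\frac{b(z)}{d(z)} = \tan z\ell .
\]
% (ii): \cos\zeta\ell = 0  <=>  \zeta = \lambda_k := (k + 1/2)\pi/\ell,  k an integer.
% (iii): these are exactly the poles of \tan z\ell.
% (i): Y(x) = W(x,\lambda_k) [0; g], g \neq 0, is a nonzero solution of (3.1) with z = \lambda_k:
\[
Y(x) = g \begin{bmatrix} i\sin\lambda_k x \\ \cos\lambda_k x \end{bmatrix},
\qquad
\begin{bmatrix} 1 & 0 \end{bmatrix} Y(0) = 0,
\qquad
\begin{bmatrix} R^* & Q^* \end{bmatrix} Y(\ell) = g\cos\lambda_k\ell = 0 .
\]
```

All eigenvalues in this model are real and the poles of $v(z)$ are simple, as expected in the definite case.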


As a preliminary to the proof, consider an $m\times m$ matrix-valued analytic function $F(z)$ on a region $\Omega$ which has invertible values except at isolated points. If $\zeta\in\Omega$ and $F(\zeta)$ is not invertible, there is an $r\ge 1$ such that
\[ F(z) = F_r(z)P(z), \tag{3.5} \]
where $F_r(z)$ is analytic on $\Omega$, $F_r(\zeta)$ is invertible, and $P(z)$ is a polynomial of the form
\[ P(z) = [I - P_r + P_r(z-\zeta)] \cdots [I - P_2 + P_2(z-\zeta)]\,[I - P_1 + P_1(z-\zeta)] \tag{3.6} \]
for some rank-one projections $P_1, P_2, \dots, P_r$. To see this, let $r$ be the order of $\zeta$ as a zero of $\det F(z)$. Since $F(\zeta)$ is not invertible, there is a $g_1\ne 0$ in $\mathbb{C}^m$ such that $F(\zeta)g_1 = 0$. Let $P_1$ be the projection on the span of $g_1$, and set
\[ F_1(z) = F(z)\left[ I - P_1 + \frac{P_1}{z-\zeta} \right]. \]
Since $F(\zeta)P_1 = 0$, we can define $F_1(\zeta)$ so that $F_1(z)$ is analytic on $\Omega$. We have
\[ F(z) = F_1(z)[I - P_1 + P_1(z-\zeta)], \qquad \det F_1(z) = \frac{\det F(z)}{z-\zeta}. \]
If $r = 1$, then $\det F_1(\zeta)\ne 0$ because then $\zeta$ is a zero of $\det F(z)$ of order 1. The assertion follows in the case $r = 1$. In general, we proceed in the same way but repeat the procedure $r$ times.

Proof of Proposition 3.1. Everything here is in [3, Proposition 4.1.8] except for the equivalence of (ii) and (iii). Clearly (iii) implies (ii), so what remains is to show that (ii) implies (iii). We argue by contradiction, assuming that (ii) holds but (iii) fails. Write $v(z) = iu_1(z)u_2(z)^{-1}$, where $u_1(z) = a(z)R + b(z)Q$ and $u_2(z) = c(z)R + d(z)Q$. Here $u_2(z)$ has invertible values except at isolated points, $u_2(\zeta)$ is not invertible, and $\zeta$ is a removable singularity of $v(z)$. Applying (3.5) to $F(z) = u_2(z)$, we obtain
\[ u_2(z) = \tilde u_2(z)P(z), \]
where $\tilde u_2(z)$ is entire, $\tilde u_2(\zeta)$ is invertible, and $P(z)$ has the form (3.6). Set
\[ \tilde u_1(z) = u_1(z)P(z)^{-1}, \qquad z\ne\zeta. \]
Then for all $z\ne\zeta$,
\[ v(z) = iu_1(z)P(z)^{-1}\tilde u_2(z)^{-1} = i\tilde u_1(z)\tilde u_2(z)^{-1}. \]
Since $\zeta$ is a removable singularity of $v(z)$ and $\tilde u_2(\zeta)$ is invertible, $\zeta$ is a removable singularity of $\tilde u_1(z)$. Therefore we can define $\tilde u_1(\zeta)$ so that $\tilde u_1(z)$ is entire. By (1.7),
\[ \begin{bmatrix} \tilde u_1(z)P(z) \\ \tilde u_2(z)P(z) \end{bmatrix} = \begin{bmatrix} u_1(z) \\ u_2(z) \end{bmatrix} = W(\ell,\bar z)^* \begin{bmatrix} R \\ Q \end{bmatrix}. \]
We can choose $g\ne 0$ in $\mathbb{C}^m$ such that $P(\zeta)g = 0$, and then we get
\[ \begin{bmatrix} 0 \\ 0 \end{bmatrix} = W(\ell,\bar\zeta)^* \begin{bmatrix} R \\ Q \end{bmatrix} g. \]
Since $W(\ell,\bar\zeta)^*$ is invertible, $Rg = Qg = 0$. The desired contradiction follows because $R^*R + Q^*Q$ is invertible under our assumptions. □

Proposition 3.2. The root spaces for (3.1) are finite dimensional. Moreover, for every eigenvalue $\zeta$ of (3.1), there is a $\mu\ge 0$ such that
\[ \mathfrak{L}_\zeta^{(0)} \subsetneq \mathfrak{L}_\zeta^{(1)} \subsetneq \cdots \subsetneq \mathfrak{L}_\zeta^{(\mu)} = \mathfrak{L}_\zeta^{(\mu+1)} = \cdots. \tag{3.7} \]

Proof. By Proposition 4.1.4(ii) of [3] the root spaces for (3.1) coincide with the root spaces for a nonzero eigenvalue of a compact operator. The assertions thus follow from well-known properties of compact operators (see, e.g., [1, Chapter I]). □

We call $Y^{(0)}(x), Y^{(1)}(x), \dots, Y^{(\nu)}(x)$ an eigenchain for the system (3.1) for an eigenvalue $\zeta$ if
\[
\frac{dY^{(0)}}{dx} = i\zeta JH(x)Y^{(0)}, \qquad
\begin{bmatrix} I_m & 0 \end{bmatrix} Y^{(0)}(0) = 0, \qquad \begin{bmatrix} R^* & Q^* \end{bmatrix} Y^{(0)}(\ell) = 0,
\]
and for each $j = 1, \dots, \nu$,
\[
\frac{dY^{(j)}}{dx} = i\zeta JH(x)Y^{(j)} + JH(x)Y^{(j-1)}, \qquad
\begin{bmatrix} I_m & 0 \end{bmatrix} Y^{(j)}(0) = 0, \qquad \begin{bmatrix} R^* & Q^* \end{bmatrix} Y^{(j)}(\ell) = 0.
\]
Every root function $Y(x)$ is the last member $Y(x) = Y^{(\nu)}(x)$ of some eigenchain. We use this fact to prove the following orthogonality relation, which generalizes Proposition 4.1.1 of [3].

Proposition 3.3. For any complex $\zeta_1$ and $\zeta_2$, if $Y\in\mathfrak{L}_{\zeta_1}$ and $Z\in\mathfrak{L}_{\zeta_2}$, then
\[ i(\zeta_1 - \bar\zeta_2)\int_0^\ell Z(t)^*H(t)Y(t)\,dt = 0. \tag{3.8} \]

Hence if $\zeta$ is a nonreal eigenvalue for (3.1), the root space $\mathfrak{L}_\zeta$ is a neutral subspace of $L^2(H\,dx)$. A subspace of an indefinite inner product space is called neutral if the inner product of any two of its elements is zero.

Lemma 3.4. Let $M = \operatorname{ran}\begin{bmatrix} R \\ Q \end{bmatrix}$.
(1) If $h, k\in\mathbb{C}^m$, the following are equivalent:
(i) $R^*h + Q^*k = 0$; (ii) $\begin{bmatrix} k \\ h \end{bmatrix}\in M$; (iii) $\begin{bmatrix} h \\ k \end{bmatrix}\in M^\perp$.
(2) If $A$ and $B$ are $m\times m$ matrices such that $AR + BQ = 0$, then $\begin{bmatrix} A & B \end{bmatrix} M = \{0\}$.


Proof of Lemma 3.4. Since $R^*R + Q^*Q$ is invertible, $M$ is the range of a one-to-one operator from $\mathbb{C}^m$ into $\mathbb{C}^{2m}$ and hence $\dim M = m$. Since $R^*Q + Q^*R = 0$, $JM\subseteq M^\perp$. By a dimension argument $JM = M^\perp$, and so $M = JM^\perp$. The assertions in (1) follow.

To prove (2), consider any $\xi\in M$ and $u\in\mathbb{C}^m$. Since $AR + BQ = 0$ by assumption, $R^*A^*u + Q^*B^*u = 0$. By part (1),
\[ \begin{bmatrix} A^*u \\ B^*u \end{bmatrix} \in M^\perp. \]
Therefore $\begin{bmatrix} A^*u \\ B^*u \end{bmatrix}^*\xi = 0$. By the arbitrariness of $u$, $\begin{bmatrix} A & B \end{bmatrix}\xi = 0$. □

Proof of Proposition 3.3. The assertion is trivial if $\zeta_1 = \bar\zeta_2$, so assume that $\zeta_1\ne\bar\zeta_2$. We must show that in this case,
\[ \int_0^\ell Z(t)^*H(t)Y(t)\,dt = 0. \tag{3.9} \]
Let $Y^{(0)}(x), Y^{(1)}(x), \dots, Y^{(\nu_1)}(x)$ and $Z^{(0)}(x), Z^{(1)}(x), \dots, Z^{(\nu_2)}(x)$ be eigenchains with $Y^{(\nu_1)}(x) = Y(x)$ and $Z^{(\nu_2)}(x) = Z(x)$. Set $Y^{(-1)}(x) = Z^{(-1)}(x) = 0$. Then
\[
\begin{aligned}
\frac{dY^{(j+1)}}{dx} &= i\zeta_1 JH(x)Y^{(j+1)} + JH(x)Y^{(j)}, &\quad \begin{bmatrix} I_m & 0 \end{bmatrix} Y^{(j)}(0) &= 0, &\quad \begin{bmatrix} R^* & Q^* \end{bmatrix} Y^{(j)}(\ell) &= 0, \\
\frac{dZ^{(k+1)}}{dx} &= i\zeta_2 JH(x)Z^{(k+1)} + JH(x)Z^{(k)}, &\quad \begin{bmatrix} I_m & 0 \end{bmatrix} Z^{(k)}(0) &= 0, &\quad \begin{bmatrix} R^* & Q^* \end{bmatrix} Z^{(k)}(\ell) &= 0,
\end{aligned}
\]
for all $j = -1, 0, \dots, \nu_1$ and $k = -1, 0, \dots, \nu_2$. By the boundary conditions and Lemma 3.4(1),
\[ Z^{(k)}(0)^*JY^{(j)}(0) = Z^{(k)}(\ell)^*JY^{(j)}(\ell) = 0 \]
for all $j = -1, 0, \dots, \nu_1$ and $k = -1, 0, \dots, \nu_2$. Hence for the same values of $j, k$,
\[ \int_0^\ell \left[ Z^{(k)}(t)^* J\,\frac{dY^{(j)}}{dt} + \left(\frac{dZ^{(k)}}{dt}\right)^{\!*} JY^{(j)}(t) \right] dt = Z^{(k)}(t)^*JY^{(j)}(t)\Big|_0^\ell = 0. \tag{3.10} \]
We show that
\[ i(\zeta_1 - \bar\zeta_2)\bigl\langle Y^{(j)}, Z^{(k)} \bigr\rangle_H = -\bigl\langle Y^{(j-1)}, Z^{(k)} \bigr\rangle_H - \bigl\langle Y^{(j)}, Z^{(k-1)} \bigr\rangle_H, \qquad j = 0, 1, \dots, \nu_1, \quad k = 0, 1, \dots, \nu_2, \tag{3.11} \]


where $\langle\cdot,\cdot\rangle_H$ is the inner product of $L^2(H\,dx)$. In fact,
\[
\begin{aligned}
i(\zeta_1 - \bar\zeta_2)\int_0^\ell Z^{(k)}(t)^*H(t)Y^{(j)}(t)\,dt
&= \int_0^\ell Z^{(k)}(t)^* J\bigl[ i\zeta_1 JH(t)Y^{(j)}(t) \bigr]\,dt + \int_0^\ell \bigl[ i\zeta_2 JH(t)Z^{(k)}(t) \bigr]^* JY^{(j)}(t)\,dt \\
&= \int_0^\ell Z^{(k)}(t)^* J\left[ \frac{dY^{(j)}}{dt} - JH(t)Y^{(j-1)} \right] dt + \int_0^\ell \left[ \frac{dZ^{(k)}}{dt} - JH(t)Z^{(k-1)} \right]^* JY^{(j)}(t)\,dt.
\end{aligned}
\]
By (3.10),
\[
i(\zeta_1 - \bar\zeta_2)\int_0^\ell Z^{(k)}(t)^*H(t)Y^{(j)}(t)\,dt
= -\int_0^\ell Z^{(k)}(t)^*H(t)Y^{(j-1)}(t)\,dt - \int_0^\ell Z^{(k-1)}(t)^*H(t)Y^{(j)}(t)\,dt,
\]
which proves (3.11).

The proof is completed by repeated application of (3.11). Start by choosing $j = \nu_1$ and $k = \nu_2$ in (3.11). For each term on the right, multiply by $\zeta_1 - \bar\zeta_2$, and repeat. Eventually we reach $j = 0$ or $k = 0$ for each term, and then $Y^{(j-1)}(x) = Y^{(-1)}(x) = 0$ or $Z^{(k-1)}(x) = Z^{(-1)}(x) = 0$ accordingly. In the end, we arrive at (3.9), as was to be shown. □

We shall need explicit formulas for eigenchains. Such formulas can be derived from the Taylor expansions (2.3) and (2.4). Set
\[ K(z) = c(z)R + d(z)Q \quad\text{and}\quad K_j(z) = c_j(z)R + d_j(z)Q, \qquad j\ge 0. \tag{3.12} \]
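The neutrality statement that accompanies Proposition 3.3, and the role of $K(z)$ in locating eigenvalues, can be illustrated by a hand computation in an indefinite model. The sketch below is not from the paper: it assumes $m = 1$, $H(x)\equiv\operatorname{diag}(1,-1)$, $J = \begin{bmatrix} 0 & 1 \\ 1 & 0\end{bmatrix}$, $R = 0$, and $Q = 1$.

```latex
% Indefinite model (assumption): m = 1, H = diag(1,-1), J = [[0,1],[1,0]], R = 0, Q = 1.
% Put A := JH = [[0,-1],[1,0]], so A^2 = -I_2 and
\[
W(x,z) = e^{izxA} = \cosh(zx)\,I_2 + i\sinh(zx)\,A
       = \begin{bmatrix} \cosh zx & -i\sinh zx \\ i\sinh zx & \cosh zx \end{bmatrix}.
\]
% (1.7) gives c(z) = i\sinh z\ell and d(z) = \cosh z\ell, hence in (3.12)
\[
K(z) = c(z)R + d(z)Q = \cosh z\ell, \qquad
K(\zeta) = 0 \iff \zeta = i\theta_k, \quad \theta_k := (k + \tfrac12)\pi/\ell .
\]
% All eigenvalues are nonreal and occur in conjugate pairs. For \zeta = i\theta, \theta = \theta_k:
\[
Y(x) = W(x,\zeta)\begin{bmatrix} 0 \\ g \end{bmatrix}
     = g\begin{bmatrix} \sin\theta x \\ \cos\theta x \end{bmatrix},
\qquad
\langle Y, Y\rangle_H = |g|^2\int_0^\ell (\sin^2\theta x - \cos^2\theta x)\,dx
 = -|g|^2\,\frac{\sin 2\theta\ell}{2\theta} = 0 ,
\]
% because 2\theta\ell = (2k+1)\pi. This is (3.8) with \zeta_1 = \zeta_2 = \zeta,
% where i(\zeta - \bar\zeta) = -2\theta \neq 0.
```

By contrast, pairing $Y$ with the eigenfunction $Z(x) = h\begin{bmatrix} -\sin\theta x \\ \cos\theta x\end{bmatrix}$ for $\bar\zeta$ gives $\langle Y, Z\rangle_H = -\bar h g\,\ell$, which is nonzero in general; this is the case $\zeta_1 = \bar\zeta_2$ that (3.8) leaves open.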

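Proposition 3.5. The general form of an eigenchain $Y^{(0)}(x), Y^{(1)}(x), \dots, Y^{(\nu)}(x)$ for (3.1) for an eigenvalue $\zeta$ is
\[
\begin{aligned}
Y^{(0)}(x) &= W_0(x,\zeta)\begin{bmatrix} 0 \\ g_0 \end{bmatrix}, \\
Y^{(1)}(x) &= (-i)W_1(x,\zeta)\begin{bmatrix} 0 \\ g_0 \end{bmatrix} + W_0(x,\zeta)\begin{bmatrix} 0 \\ g_1 \end{bmatrix}, \\
Y^{(2)}(x) &= (-i)^2W_2(x,\zeta)\begin{bmatrix} 0 \\ g_0 \end{bmatrix} + (-i)W_1(x,\zeta)\begin{bmatrix} 0 \\ g_1 \end{bmatrix} + W_0(x,\zeta)\begin{bmatrix} 0 \\ g_2 \end{bmatrix}, \\
&\ \,\vdots \\
Y^{(\nu)}(x) &= (-i)^\nu W_\nu(x,\zeta)\begin{bmatrix} 0 \\ g_0 \end{bmatrix} + (-i)^{\nu-1}W_{\nu-1}(x,\zeta)\begin{bmatrix} 0 \\ g_1 \end{bmatrix} + \cdots + W_0(x,\zeta)\begin{bmatrix} 0 \\ g_\nu \end{bmatrix},
\end{aligned}
\tag{3.13}
\]
where $g_0, g_1, \dots, g_\nu$ are vectors in $\mathbb{C}^m$ satisfying
\[
\begin{aligned}
K_0(\bar\zeta)^*g_0 &= 0, \\
(-i)K_1(\bar\zeta)^*g_0 + K_0(\bar\zeta)^*g_1 &= 0, \\
(-i)^2K_2(\bar\zeta)^*g_0 + (-i)K_1(\bar\zeta)^*g_1 + K_0(\bar\zeta)^*g_2 &= 0, \\
&\ \,\vdots \\
(-i)^\nu K_\nu(\bar\zeta)^*g_0 + (-i)^{\nu-1}K_{\nu-1}(\bar\zeta)^*g_1 + \cdots + K_0(\bar\zeta)^*g_\nu &= 0.
\end{aligned}
\tag{3.14}
\]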

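Proof. The case $\nu = 0$ follows from [3, Proposition 3.1.2]. We proceed by induction for the general case. Assume that the assertion is known up to the $k$th stage for some $k\ge 0$. Consider an eigenchain $Y^{(0)}(x), \dots, Y^{(k)}(x), Y^{(k+1)}(x)$. In particular,
\[
\frac{dY^{(k+1)}}{dx} = i\zeta JH(x)Y^{(k+1)} + JH(x)Y^{(k)}, \qquad
\begin{bmatrix} I_m & 0 \end{bmatrix} Y^{(k+1)}(0) = 0, \qquad \begin{bmatrix} R^* & Q^* \end{bmatrix} Y^{(k+1)}(\ell) = 0.
\tag{3.15}
\]
By the inductive assumption, we can represent $Y^{(0)}(x), \dots, Y^{(k)}(x)$ in the form (3.13)–(3.14) with $\nu = k$. Set
\[
\widetilde Y^{(k+1)}(x) = -i\left\{ (-i)^kW_{k+1}(x,\zeta)\begin{bmatrix} 0 \\ g_0 \end{bmatrix} + (-i)^{k-1}W_k(x,\zeta)\begin{bmatrix} 0 \\ g_1 \end{bmatrix} + \cdots + (-i)W_2(x,\zeta)\begin{bmatrix} 0 \\ g_{k-1} \end{bmatrix} + W_1(x,\zeta)\begin{bmatrix} 0 \\ g_k \end{bmatrix} \right\}.
\]
By (2.6),
\[
\begin{aligned}
\frac{d\widetilde Y^{(k+1)}}{dx}
&= -i\Biggl\{ (-i)^k\left( i\zeta JH(x)W_{k+1}(x,\zeta)\begin{bmatrix} 0 \\ g_0 \end{bmatrix} + iJH(x)W_k(x,\zeta)\begin{bmatrix} 0 \\ g_0 \end{bmatrix} \right) \\
&\qquad + (-i)^{k-1}\left( i\zeta JH(x)W_k(x,\zeta)\begin{bmatrix} 0 \\ g_1 \end{bmatrix} + iJH(x)W_{k-1}(x,\zeta)\begin{bmatrix} 0 \\ g_1 \end{bmatrix} \right) + \cdots \\
&\qquad + (-i)\left( i\zeta JH(x)W_2(x,\zeta)\begin{bmatrix} 0 \\ g_{k-1} \end{bmatrix} + iJH(x)W_1(x,\zeta)\begin{bmatrix} 0 \\ g_{k-1} \end{bmatrix} \right) \\
&\qquad + \left( i\zeta JH(x)W_1(x,\zeta)\begin{bmatrix} 0 \\ g_k \end{bmatrix} + iJH(x)W_0(x,\zeta)\begin{bmatrix} 0 \\ g_k \end{bmatrix} \right) \Biggr\} \\
&= i\zeta JH(x)(-i)\left\{ (-i)^kW_{k+1}(x,\zeta)\begin{bmatrix} 0 \\ g_0 \end{bmatrix} + (-i)^{k-1}W_k(x,\zeta)\begin{bmatrix} 0 \\ g_1 \end{bmatrix} + \cdots + (-i)W_2(x,\zeta)\begin{bmatrix} 0 \\ g_{k-1} \end{bmatrix} + W_1(x,\zeta)\begin{bmatrix} 0 \\ g_k \end{bmatrix} \right\} \\
&\quad + JH(x)\left\{ (-i)^kW_k(x,\zeta)\begin{bmatrix} 0 \\ g_0 \end{bmatrix} + (-i)^{k-1}W_{k-1}(x,\zeta)\begin{bmatrix} 0 \\ g_1 \end{bmatrix} + \cdots + (-i)W_1(x,\zeta)\begin{bmatrix} 0 \\ g_{k-1} \end{bmatrix} + W_0(x,\zeta)\begin{bmatrix} 0 \\ g_k \end{bmatrix} \right\}.
\end{aligned}
\]
Thus
\[ \frac{d\widetilde Y^{(k+1)}}{dx} = i\zeta JH(x)\widetilde Y^{(k+1)} + JH(x)Y^{(k)}. \]
In view of (3.15), it follows that
\[ \frac{d}{dx}\bigl(Y^{(k+1)} - \widetilde Y^{(k+1)}\bigr) = i\zeta JH(x)\bigl(Y^{(k+1)} - \widetilde Y^{(k+1)}\bigr). \]
By (2.5), $\begin{bmatrix} I_m & 0 \end{bmatrix}\bigl(Y^{(k+1)}(0) - \widetilde Y^{(k+1)}(0)\bigr) = 0 - 0 = 0$. Therefore
\[ Y^{(k+1)}(x) - \widetilde Y^{(k+1)}(x) = W_0(x,\zeta)\begin{bmatrix} 0 \\ g_{k+1} \end{bmatrix} \]
for some $g_{k+1}\in\mathbb{C}^m$. By the definition of $\widetilde Y^{(k+1)}(x)$,
\[
Y^{(k+1)}(x) = (-i)^{k+1}W_{k+1}(x,\zeta)\begin{bmatrix} 0 \\ g_0 \end{bmatrix} + (-i)^kW_k(x,\zeta)\begin{bmatrix} 0 \\ g_1 \end{bmatrix} + \cdots + (-i)W_1(x,\zeta)\begin{bmatrix} 0 \\ g_k \end{bmatrix} + W_0(x,\zeta)\begin{bmatrix} 0 \\ g_{k+1} \end{bmatrix}.
\]
The boundary condition $\begin{bmatrix} I_m & 0 \end{bmatrix}Y^{(k+1)}(0) = 0$ imposes no condition on $g_{k+1}$. A restriction on $g_{k+1}$ is imposed by the condition $\begin{bmatrix} R^* & Q^* \end{bmatrix}Y^{(k+1)}(\ell) = 0$. By the second equation in (2.5) and the identity $R^*c_j(\bar\zeta)^* + Q^*d_j(\bar\zeta)^* = K_j(\bar\zeta)^*$, the restriction on $g_{k+1}$ is that
\[ (-i)^{k+1}K_{k+1}(\bar\zeta)^*g_0 + (-i)^kK_k(\bar\zeta)^*g_1 + \cdots + (-i)K_1(\bar\zeta)^*g_k + K_0(\bar\zeta)^*g_{k+1} = 0. \]
Thus the eigenchain $Y^{(0)}(x), \dots, Y^{(k)}(x), Y^{(k+1)}(x)$ has the required form. The steps are reversible, and the inductive step follows. □

We also need formulas for the eigentransforms (1.5) of an eigenchain. These are given in the next result in both explicit and recursive forms.

Proposition 3.6. Let $Y^{(0)}(x), Y^{(1)}(x), \dots, Y^{(\nu)}(x)$ be an eigenchain for (3.1) of the form (3.13), and let $F^{(0)}(z), F^{(1)}(z), \dots, F^{(\nu)}(z)$ be the corresponding eigentransforms.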


(1) For each $r = 0, \dots, \nu$,
\[
F^{(r)}(z) = \sum_{n=0}^{r} (-i)^n\, i\sum_{p+q=n} \frac{c(z)d_p(\bar\zeta)^* + d(z)c_p(\bar\zeta)^*}{(z-\zeta)^{q+1}}\, g_{r-n}
= \sum_{k=0}^{\infty}\left( \sum_{n=0}^{r} (-i)^n\Delta_{nk}(\zeta)\,g_{r-n} \right)(z-\zeta)^k,
\tag{3.16}
\]
where the coefficients in the last expression are as in Proposition 2.4.

(2) The functions $F^{(0)}(z), F^{(1)}(z), \dots, F^{(\nu)}(z)$ in (1) are given recursively by
\[ F^{(0)}(z) = i\,\frac{c(z)d_0(\bar\zeta)^* + d(z)c_0(\bar\zeta)^*}{z-\zeta}\, g_0, \tag{3.17} \]
and
\[ F^{(k)}(z) = \frac{F^{(k-1)}(z) - \begin{bmatrix} c(z) & d(z) \end{bmatrix} JY^{(k)}(\ell)}{i(z-\zeta)}, \qquad k = 1, \dots, \nu. \tag{3.18} \]
Moreover, for all $k = 0, \dots, \nu$,
\[
F^{(k)}(z) = -\begin{bmatrix} c(z) & d(z) \end{bmatrix}\left\{ \frac{JY^{(k)}(\ell)}{i(z-\zeta)} + \frac{JY^{(k-1)}(\ell)}{i^2(z-\zeta)^2} + \cdots + \frac{JY^{(0)}(\ell)}{i^{k+1}(z-\zeta)^{k+1}} \right\}.
\tag{3.19}
\]

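Notice that if we set $F^{(-1)}(z)\equiv 0$, then (3.18) agrees with (3.17) when $k = 0$.

Proof. (1) By (3.13),
\[ Y^{(r)}(x) = \sum_{n=0}^{r} (-i)^nW_n(x,\zeta)\begin{bmatrix} 0 \\ g_{r-n} \end{bmatrix}. \]
Hence by (2.13),
\[
\begin{aligned}
F^{(r)}(z) &= \int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W(t,\bar z)^*H(t)Y^{(r)}(t)\,dt \\
&= \sum_{n=0}^{r} (-i)^n\int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W(t,\bar z)^*H(t)W_n(t,\zeta)\,dt \begin{bmatrix} 0 \\ I_m \end{bmatrix} g_{r-n} \\
&= \sum_{n=0}^{r} (-i)^n\, i\sum_{p+q=n} \frac{c(z)d_p(\bar\zeta)^* + d(z)c_p(\bar\zeta)^*}{(z-\zeta)^{q+1}}\, g_{r-n} \\
&= \sum_{n=0}^{r} (-i)^n\sum_{k=0}^{\infty} \Delta_{nk}(\zeta)\,g_{r-n}(z-\zeta)^k \\
&= \sum_{k=0}^{\infty}\left( \sum_{n=0}^{r} (-i)^n\Delta_{nk}(\zeta)\,g_{r-n} \right)(z-\zeta)^k.
\end{aligned}
\]
The two equalities in (3.16) follow.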


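(2) Consider an eigenchain $Y^{(0)}(x), Y^{(1)}(x), \dots, Y^{(\nu)}(x)$ of the form (3.13), and let $F^{(0)}(z), F^{(1)}(z), \dots, F^{(\nu)}(z)$ be the corresponding eigentransforms. The identity (3.17) is a special case of (3.16). Suppose $k = 1, \dots, \nu$. Then
\[
\frac{dY^{(k)}}{dx} = i\zeta JH(x)Y^{(k)} + JH(x)Y^{(k-1)}, \qquad
\begin{bmatrix} I_m & 0 \end{bmatrix} Y^{(k)}(0) = 0, \qquad \begin{bmatrix} R^* & Q^* \end{bmatrix} Y^{(k)}(\ell) = 0.
\]
Thus
\[
\begin{aligned}
\int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W(t,\bar z)^* J\,\frac{dY^{(k)}}{dt}\,dt
&= i\zeta\int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W(t,\bar z)^*H(t)Y^{(k)}(t)\,dt + \int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W(t,\bar z)^*H(t)Y^{(k-1)}(t)\,dt \\
&= i\zeta F^{(k)}(z) + F^{(k-1)}(z).
\end{aligned}
\]
Integration by parts yields
\[
\begin{aligned}
i\zeta F^{(k)}(z) + F^{(k-1)}(z)
&= \begin{bmatrix} 0 & I_m \end{bmatrix} W(t,\bar z)^*JY^{(k)}(t)\Big|_{t=0}^{\ell} - \int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix}\left( \frac{d}{dt}W(t,\bar z)^* \right) JY^{(k)}(t)\,dt \\
&= \begin{bmatrix} 0 & I_m \end{bmatrix} W(\ell,\bar z)^*JY^{(k)}(\ell) - \int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix}\bigl( -izW(t,\bar z)^*H(t)J \bigr) JY^{(k)}(t)\,dt \\
&= \begin{bmatrix} 0 & I_m \end{bmatrix} W(\ell,\bar z)^*JY^{(k)}(\ell) + izF^{(k)}(z).
\end{aligned}
\]
Since $\begin{bmatrix} 0 & I_m \end{bmatrix} W(\ell,\bar z)^* = \begin{bmatrix} c(z) & d(z) \end{bmatrix}$ by (1.7), we obtain (3.18).

We prove (3.19) by iterating (3.17) and (3.18). By (3.17),
\[
F^{(0)}(z) = -\begin{bmatrix} c(z) & d(z) \end{bmatrix} J \begin{bmatrix} c_0(\bar\zeta)^* \\ d_0(\bar\zeta)^* \end{bmatrix} g_0\,\frac{1}{i(z-\zeta)}
= -\begin{bmatrix} c(z) & d(z) \end{bmatrix} \frac{JY^{(0)}(\ell)}{i(z-\zeta)},
\]
which is the case $k = 0$ of (3.19). By (3.18) with $k = 1$,
\[
\begin{aligned}
F^{(1)}(z) &= \frac{1}{i(z-\zeta)}\Bigl\{ F^{(0)}(z) - \begin{bmatrix} c(z) & d(z) \end{bmatrix} JY^{(1)}(\ell) \Bigr\} \\
&= \frac{1}{i(z-\zeta)}\left\{ -\begin{bmatrix} c(z) & d(z) \end{bmatrix} \frac{JY^{(0)}(\ell)}{i(z-\zeta)} - \begin{bmatrix} c(z) & d(z) \end{bmatrix} JY^{(1)}(\ell) \right\} \\
&= -\begin{bmatrix} c(z) & d(z) \end{bmatrix}\left\{ \frac{JY^{(1)}(\ell)}{i(z-\zeta)} + \frac{JY^{(0)}(\ell)}{i^2(z-\zeta)^2} \right\},
\end{aligned}
\]
proving (3.19) for $k = 1$. The general case follows by a straightforward induction. □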
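Formula (3.17) can be compared with a direct evaluation of the eigentransform (1.5) in a model case. The following sketch is illustrative and not from the paper: it assumes $m = 1$, $H(x)\equiv I_2$, $J = \begin{bmatrix} 0 & 1 \\ 1 & 0\end{bmatrix}$, $R = 0$, $Q = 1$, for which $W(x,z) = e^{izxJ}$.

```latex
% Model case (assumption): m = 1, H = I_2, J = [[0,1],[1,0]], R = 0, Q = 1.
% Entries in (1.7): c(z) = -i\sin z\ell, d(z) = \cos z\ell.
% Eigenvalue: \lambda real with \cos\lambda\ell = 0; eigenfunction from (3.13):
\[
Y^{(0)}(x) = W_0(x,\lambda)\begin{bmatrix} 0 \\ g \end{bmatrix}
           = g\begin{bmatrix} i\sin\lambda x \\ \cos\lambda x \end{bmatrix}.
\]
% Directly from (1.5), using [0 1] W(x,\bar z)^* = [ -i\sin zx , \cos zx ]:
\[
F^{(0)}(z) = g\int_0^\ell \bigl( \sin zx\,\sin\lambda x + \cos zx\,\cos\lambda x \bigr)\,dx
           = g\,\frac{\sin\bigl((z-\lambda)\ell\bigr)}{z-\lambda} .
\]
% From (3.17), with d_0(\bar\lambda)^* = \cos\lambda\ell = 0 and c_0(\bar\lambda)^* = i\sin\lambda\ell:
\[
F^{(0)}(z) = i\,\frac{\cos z\ell \cdot i\sin\lambda\ell}{z-\lambda}\,g
           = -\,\frac{\cos z\ell\,\sin\lambda\ell}{z-\lambda}\,g .
\]
% The two expressions agree, since
% \sin((z-\lambda)\ell) = \sin z\ell\cos\lambda\ell - \cos z\ell\sin\lambda\ell and \cos\lambda\ell = 0.
```

Both expressions extend to entire functions of $z$, with value $\ell g$ at $z = \lambda$.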


4. Main results

We assume given a system (3.1) satisfying (1.2)–(1.4), with operators $R$ and $Q$ satisfying (1°) and (2°). Let $W(x,z)$ be the unique solution of (1.6). As before, we set
\[ v(z) = i[a(z)R + b(z)Q][c(z)R + d(z)Q]^{-1}, \tag{4.1} \]
where
\[ W(\ell,\bar z)^* = \begin{bmatrix} a(z) & b(z) \\ c(z) & d(z) \end{bmatrix}. \tag{4.2} \]
Recall that $v(z) = v(\bar z)^*$, and the only singularities of $v(z)$ are poles, which occur at the points where $c(z)R + d(z)Q$ is not invertible (see Proposition 3.1). By Proposition 3.1, the eigenvalues of (3.1) coincide with the poles of $v(z)$. In Definition 4.2 we use the poles of $v(z)$ to introduce an inner product space $\mathfrak{H}_0(v)$ whose elements are $m$-dimensional vector-valued entire functions. Our main result, Theorem 4.7, asserts that the eigentransform (1.5) acts as an isometry from the span of root functions in $L^2(H\,dx)$ to $\mathfrak{H}_0(v)$.

For each $w\in\mathbb{C}$, write
\[ v(z) = -\frac{\gamma_\varkappa(w)}{(z-w)^\varkappa} - \cdots - \frac{\gamma_1(w)}{z-w} + \tilde v(z), \tag{4.3} \]
where $\tilde v(z)$ is analytic at $z = w$. Here $\varkappa = \varkappa_w\ge 1$ is chosen large enough that such a representation exists. The value of $\varkappa$ is not important, and zero coefficients can be added at will. Such a representation is nontrivial only for poles, but it is notationally convenient to also allow $w$ to be a point of analyticity for $v(z)$, in which case all coefficients are zero. Since $v(z) = v(\bar z)^*$,
\[ v(z) = -\frac{\gamma_\varkappa(w)^*}{(z-\bar w)^\varkappa} - \cdots - \frac{\gamma_1(w)^*}{z-\bar w} + \tilde v(\bar z)^*, \tag{4.4} \]
where $\tilde v(\bar z)^*$ is analytic at $z = \bar w$. Hence $\gamma_j(\bar w) = \gamma_j(w)^*$, $j = 1, \dots, \varkappa$.

Proposition 4.1. Let $\zeta$ be an eigenvalue for (3.1), and write $v(z)$ as in (4.3) for $w = \zeta$. Let $u\in\mathbb{C}^m$ be any vector, and define $Y^{(0)}(x), Y^{(1)}(x), \dots, Y^{(\varkappa-1)}(x)$ by (3.13) with $\nu = \varkappa - 1$ and
\[ g_j = (-i)^j\gamma_{\varkappa-j}(\zeta)u, \qquad j = 0, \dots, \varkappa-1. \tag{4.5} \]

Then $Y^{(0)}(x), Y^{(1)}(x), \dots, Y^{(\varkappa-1)}(x)$ is an eigenchain for (3.1).

Proof. We must show that $g_0, \dots, g_{\varkappa-1}$ satisfy (3.14). By (4.1) and (3.12),
\[ i[a(z)R + b(z)Q] = v(z)K(z) = v(z)\sum_{j=0}^{\infty} K_j(w)(z-w)^j. \]
Hence by (4.3),
\[ i[a(z)R + b(z)Q] = -\left[ \frac{\gamma_\varkappa(w)}{(z-w)^\varkappa} + \cdots + \frac{\gamma_1(w)}{z-w} \right]\sum_{j=0}^{\infty} K_j(w)(z-w)^j + \tilde v(z)K(z). \]


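Since the left side is analytic at $w$, we deduce $\varkappa$ relations by expanding the first term on the right side and equating coefficients of negative powers of $z - w$ to zero:
\[
\begin{aligned}
\gamma_\varkappa(w)K_0(w) &= 0, \\
\gamma_\varkappa(w)K_1(w) + \gamma_{\varkappa-1}(w)K_0(w) &= 0, \\
\gamma_\varkappa(w)K_2(w) + \gamma_{\varkappa-1}(w)K_1(w) + \gamma_{\varkappa-2}(w)K_0(w) &= 0, \\
&\ \,\vdots \\
\gamma_\varkappa(w)K_{\varkappa-1}(w) + \gamma_{\varkappa-1}(w)K_{\varkappa-2}(w) + \cdots + \gamma_1(w)K_0(w) &= 0.
\end{aligned}
\tag{4.6}
\]
On replacing $w$ by $\bar w$ and taking adjoints, we deduce (3.14). □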

We introduce an inner product space that will be used in Theorem 4.7 to describe the action of the eigentransform (1.5) on root functions.

Definition 4.2. Let $\mathfrak{H}_0(v)$ be the set of entire functions $F(z)$ with values in $\mathbb{C}^m$ such that $v(z)F(z)$ has finitely many poles. For $F, G\in\mathfrak{H}_0(v)$ and $w\in\mathbb{C}$, set
\[ \langle F, G\rangle_w = -\frac{1}{2\pi i}\int_{\Gamma_w} G(\bar z)^*v(z)F(z)\,dz, \tag{4.7} \]
where $\Gamma_w$ is a counterclockwise circle about $w$, chosen small enough that $v(z)$ is analytic on $\Gamma_w$ and its interior except perhaps at $z = w$. Set
\[ \langle F, G\rangle = \sum_{w\in\mathbb{C}} \langle F, G\rangle_w. \tag{4.8} \]

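We identify entire functions $F$ and $G$ in $\mathfrak{H}_0(v)$ such that $v(z)[F(z) - G(z)]$ is entire (or, more precisely, has only removable singularities). The integral in (4.7) is independent of the choice of $\Gamma_w$. All but finitely many terms of the sum in (4.8) are zero, and hence $\langle F, G\rangle$ is well defined.

Lemma 4.3. Let $F, G\in\mathfrak{H}_0(v)$, $w\in\mathbb{C}$, and let
\[ F(z) = \sum_{p=0}^{\infty} F_p(w)(z-w)^p \quad\text{and}\quad G(z) = \sum_{q=0}^{\infty} G_q(\bar w)(z-\bar w)^q \tag{4.9} \]
be Taylor expansions about $w$ and $\bar w$, respectively. If $v(z)$ is given by (4.3), then
\[ \langle F, G\rangle_w = \sum_{j=1}^{\varkappa}\ \sum_{p+q=j-1} G_q(\bar w)^*\gamma_j(w)F_p(w), \tag{4.10} \]
or, equivalently,
\[
\langle F, G\rangle_w =
\begin{bmatrix} G_0(\bar w) \\ G_1(\bar w) \\ \vdots \\ G_{\varkappa-1}(\bar w) \end{bmatrix}^*
\begin{bmatrix}
\gamma_1(w) & \gamma_2(w) & \cdots & \gamma_{\varkappa-1}(w) & \gamma_\varkappa(w) \\
\gamma_2(w) & \gamma_3(w) & \cdots & \gamma_\varkappa(w) & 0 \\
\vdots & & & & \vdots \\
\gamma_\varkappa(w) & 0 & \cdots & 0 & 0
\end{bmatrix}
\begin{bmatrix} F_0(w) \\ F_1(w) \\ \vdots \\ F_{\varkappa-1}(w) \end{bmatrix}.
\]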


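Proof. By (4.3) and (4.9),
\[
\begin{aligned}
-G(\bar z)^*v(z)F(z)
&= \sum_{q=0}^{\infty} G_q(\bar w)^*(z-w)^q\ \sum_{j=1}^{\varkappa} \frac{\gamma_j(w)}{(z-w)^j}\ \sum_{p=0}^{\infty} F_p(w)(z-w)^p + \varphi(z) \\
&= \sum_{j=1}^{\varkappa}\sum_{p=0}^{\infty}\sum_{q=0}^{\infty} G_q(\bar w)^*\gamma_j(w)F_p(w)(z-w)^{p+q-j} + \varphi(z),
\end{aligned}
\]
where $\varphi(z)$ is analytic at $z = w$. Only the terms with $p + q - j = -1$ make a contribution to the integral (4.7). Therefore
\[ \langle F, G\rangle_w = -\frac{1}{2\pi i}\int_{\Gamma_w} G(\bar z)^*v(z)F(z)\,dz = \sum_{j=1}^{\varkappa}\ \sum_{p+q-j=-1} G_q(\bar w)^*\gamma_j(w)F_p(w), \]
which is equivalent to (4.10). □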

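Proposition 4.4. The inner product (4.8) is linear and symmetric.

Proof. Linearity in the first variable is clear from (4.10). Fix $F, G\in\mathfrak{H}_0(v)$. For any $w\in\mathbb{C}$, $\gamma_j(w)^* = \gamma_j(\bar w)$, $j = 1, \dots, \varkappa$. Hence by (4.9) and (4.10),
\[ \langle G, F\rangle_{\bar w} = \sum_{j=1}^{\varkappa}\ \sum_{p+q=j-1} F_p(w)^*\gamma_j(\bar w)G_q(\bar w) = \overline{\langle F, G\rangle_w}. \]
Therefore
\[ \langle G, F\rangle = \sum_{w\in\mathbb{C}} \langle G, F\rangle_w = \sum_{w\in\mathbb{C}} \langle G, F\rangle_{\bar w} = \sum_{w\in\mathbb{C}} \overline{\langle F, G\rangle_w} = \overline{\langle F, G\rangle}. \]
This proves symmetry. □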

We come now to a critical property of eigentransforms of root functions. In the case of simple poles, Lemma 4.1.10 in [3] provides what is needed. The next result is a generalization to arbitrary poles, which is stated in different language but essentially accomplishes the same thing.

Proposition 4.5. (1) If $Y(x)\in\mathfrak{L}_\zeta^{(\mu-1)}$ for some $\mu\ge 1$ and $F = VY$, then $v(z)F(z)$ is analytic in the complex plane except perhaps for a pole at $\zeta$ of order at most $\mu$.
(2) If $F = Vf$ where $f$ is a finite linear combination of root functions of (3.1), then $F\in\mathfrak{H}_0(v)$.

Proof. (1) By (3.19),
\[
F(z) = -\begin{bmatrix} c(z) & d(z) \end{bmatrix}\left\{ \frac{JY^{(\mu-1)}(\ell)}{i(z-\zeta)} + \frac{JY^{(\mu-2)}(\ell)}{i^2(z-\zeta)^2} + \cdots + \frac{JY^{(0)}(\ell)}{i^\mu(z-\zeta)^\mu} \right\}.
\]
Here the boundary conditions $\begin{bmatrix} R^* & Q^* \end{bmatrix}Y^{(j)}(\ell) = 0$ together with Lemma 3.4 imply that
\[ JY^{(j)}(\ell)\in M = \operatorname{ran}\begin{bmatrix} R \\ Q \end{bmatrix}, \qquad j = 0, \dots, \mu-1. \]
It follows that
\[ F(z) = \begin{bmatrix} c(z) & d(z) \end{bmatrix}\begin{bmatrix} R \\ Q \end{bmatrix}\varphi(z), \tag{4.11} \]
where
\[ \varphi(z) = \frac{\varphi_1}{z-\zeta} + \frac{\varphi_2}{(z-\zeta)^2} + \cdots + \frac{\varphi_\mu}{(z-\zeta)^\mu} \tag{4.12} \]
for some vectors $\varphi_1, \varphi_2, \dots, \varphi_\mu$ in $\mathbb{C}^m$. By (4.1), $v(z)[c(z)R + d(z)Q] = i[a(z)R + b(z)Q]$. Hence
\[ v(z)\begin{bmatrix} c(z) & d(z) \end{bmatrix}\begin{bmatrix} R \\ Q \end{bmatrix} = i[a(z)R + b(z)Q]. \tag{4.13} \]
By (4.11) and (4.13),
\[ v(z)F(z) = v(z)\begin{bmatrix} c(z) & d(z) \end{bmatrix}\begin{bmatrix} R \\ Q \end{bmatrix}\varphi(z) = i[a(z)R + b(z)Q]\varphi(z). \tag{4.14} \]
Since $a(z)$ and $b(z)$ are entire functions, (4.14) and (4.12) show that $v(z)F(z)$ is analytic in $\mathbb{C}$ except perhaps for a pole at $\zeta$ of order at most $\mu$.
(2) This is immediate from (1). □

Corollary 4.6. Suppose $Y(x)\in\mathfrak{L}_\zeta$ and
\[ F(z) = \int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W(t,\bar z)^*H(t)Y(t)\,dt. \]

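Then for any $G(z)$ in $\mathfrak{H}_0(v)$, $\langle F, G\rangle = \langle F, G\rangle_\zeta$.

Proof. By (4.8), the problem is to show that $\langle F, G\rangle_w = 0$ for all $w\ne\zeta$. Fix $w\ne\zeta$. Write $F(z) = \sum_{p=0}^{\infty} F_p(w)(z-w)^p$. Let $v(z)$ be given by (4.3). By Proposition 4.5(1), $v(z)F(z)$ is analytic at $z = w$, and so
\[
\begin{aligned}
\gamma_\varkappa(w)F_0(w) &= 0, \\
\gamma_{\varkappa-1}(w)F_0(w) + \gamma_\varkappa(w)F_1(w) &= 0, \\
&\ \,\vdots \\
\gamma_1(w)F_0(w) + \gamma_2(w)F_1(w) + \cdots + \gamma_\varkappa(w)F_{\varkappa-1}(w) &= 0.
\end{aligned}
\]
These relations say that
\[
\begin{bmatrix}
\gamma_1(w) & \gamma_2(w) & \cdots & \gamma_{\varkappa-1}(w) & \gamma_\varkappa(w) \\
\gamma_2(w) & \gamma_3(w) & \cdots & \gamma_\varkappa(w) & 0 \\
\vdots & & & & \vdots \\
\gamma_\varkappa(w) & 0 & \cdots & 0 & 0
\end{bmatrix}
\begin{bmatrix} F_0(w) \\ F_1(w) \\ \vdots \\ F_{\varkappa-1}(w) \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix},
\]
and hence $\langle F, G\rangle_w = 0$ by Lemma 4.3. □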
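Before the general theorem, it may help to see the inner product (4.8) reproduce an $L^2(H\,dx)$ norm in the simplest setting. The sketch below is illustrative and not from the paper: it assumes $m = 1$, $H(x)\equiv I_2$, $J = \begin{bmatrix} 0 & 1 \\ 1 & 0\end{bmatrix}$, $R = 0$, $Q = 1$, so that all poles of $v(z)$ are simple and (4.10) has a single term ($\varkappa = 1$).

```latex
% Model case (assumption): m = 1, H = I_2, J = [[0,1],[1,0]], R = 0, Q = 1.
% Then v(z) = \tan z\ell, with simple poles \lambda_k = (k + 1/2)\pi/\ell.
% In (4.3) with \varkappa = 1:
\[
\gamma_1(\lambda_k) = -\operatorname*{Res}_{z=\lambda_k}\tan z\ell = \frac{1}{\ell}.
\]
% Eigenfunction for \lambda = \lambda_k and its eigentransform (1.5):
\[
Y(x) = g\begin{bmatrix} i\sin\lambda x \\ \cos\lambda x \end{bmatrix},
\qquad
F(z) = g\,\frac{\sin\bigl((z-\lambda)\ell\bigr)}{z-\lambda},
\qquad
F(\lambda) = \ell g, \quad F(\lambda_j) = 0 \ \ (j\neq k).
\]
% By (4.10) with \varkappa = 1, <F,F>_w = F(\bar w)^* \gamma_1(w) F(w), so only w = \lambda contributes:
\[
\langle F, F\rangle = \langle F, F\rangle_\lambda
 = \overline{\ell g}\cdot\frac{1}{\ell}\cdot \ell g = \ell\,|g|^2
 = \int_0^\ell Y(t)^* H(t) Y(t)\,dt .
\]
```

The vanishing of $F$ at the other poles is the concrete form of Corollary 4.6 in this model, and the final equality is the kind of identity that the main result establishes in general.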

We are now ready to state and prove our main result, which generalizes Theorem 4.1.11 of [3].


Theorem 4.7. (1) Let $Y(x)$ and $Z(x)$ be finite linear combinations of root functions for the system (3.1), and let $F(z)$, $G(z)$ be their eigentransforms. Then
\[
\int_0^\ell Z(t)^* H(t) Y(t)\, dt = \langle F, G \rangle. \tag{4.15}
\]
(2) Suppose $f \in L^2(H\,dx)$, and let $F$ be its eigentransform. If $f$ is orthogonal to every root function of (3.1), then $F = 0$ as an element of $\mathfrak{H}_0(v)$.

The definite case is treated in [3], and in this case more can be said. In the definite case, $v(z)$ is a Nevanlinna function which is meromorphic on the complex plane. Its poles are real and simple and coincide with the eigenvalues $\lambda_1, \lambda_2, \dots$ of (3.1). In the Nevanlinna representation
\[
v(z) = \alpha + \beta z + \int_{-\infty}^{\infty} \left[ \frac{1}{t-z} - \frac{t}{1+t^2} \right] d\tau(t),
\]
the nondecreasing function $\tau(t)$ is constant on the intervals between poles, and the jump in $\tau(t)$ at a pole $\lambda_j$ is
\[
\tau_j = -\operatorname*{Res}_{z=\lambda_j} v(z) = \gamma_1(\lambda_j).
\]
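The relation between jumps and residues can be checked numerically on a toy scalar function. The sketch below assumes a hypothetical $v(z) = \sum_j \tau_j/(\lambda_j - z)$ with finitely many real poles and positive weights (a Nevanlinna function chosen only for illustration, not one arising from a specific system), and estimates the residue by averaging over a small circle.

```python
import cmath


def toy_nevanlinna(z, poles, weights):
    """Hypothetical scalar v(z) = sum_j tau_j / (lambda_j - z)."""
    return sum(t / (lam - z) for lam, t in zip(poles, weights))


def residue(f, center, radius=0.5, n=64):
    """Estimate Res_{z=center} f via the trapezoid rule on a circle.

    With z = center + radius*exp(i*theta) one has dz = i(z - center)dtheta,
    so (1/(2*pi*i)) * contour integral of f equals the mean of
    f(z)*(z - center) over the circle.
    """
    total = 0
    for k in range(n):
        zk = center + radius * cmath.exp(2j * cmath.pi * k / n)
        total += f(zk) * (zk - center)
    return total / n


poles = [1.0, 4.0]      # illustrative eigenvalues lambda_j
weights = [2.0, 3.0]    # illustrative jumps tau_j
jumps = [-residue(lambda z: toy_nevanlinna(z, poles, weights), lam) for lam in poles]
```

Here `jumps` recovers the weights up to rounding, in line with $\tau_j = -\operatorname{Res}_{z=\lambda_j} v(z)$.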

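In this case, the inner product (4.8) on $\mathfrak{H}_0(v)$ is the inner product of $L^2(d\tau)$, and Theorem 4.7 is subsumed in the more precise Theorem 4.2.2 of [3]. In the terminology of Definition 4.2.3 of [3], $\tau(t)$ is a pseudospectral function for (1.1). In general, the inner product $\langle F, G \rangle$ in (4.15) depends on the collection of poles $w$ of $v(z)$ and coefficients $\gamma_1(w), \gamma_2(w), \dots$ in (4.3). Because of (4.15), we call this collection pseudospectral data for (1.1).

Proof of Theorem 4.7, Part (1). By linearity and symmetry, it is sufficient to prove (4.15) when $Y(x)$ and $Z(x)$ are root functions, say $Y(x) \in \mathfrak{L}_{\zeta_1}$ and $Z(x) \in \mathfrak{L}_{\zeta_2}$.

Case 1: $\zeta_1 \ne \bar\zeta_2$. In this case, the left side of (4.15) is zero by Proposition 3.3. By Corollary 4.6, $\langle F, G \rangle = \langle F, G \rangle_{\zeta_1}$. Since $\bar\zeta_1 \ne \zeta_2$, $v(z)G(z)$ is analytic at $z = \bar\zeta_1$ by Proposition 4.5(1). Therefore
\[
\begin{aligned}
\gamma_\varkappa(\bar\zeta_1)G_0(\bar\zeta_1) &= 0,\\
\gamma_{\varkappa-1}(\bar\zeta_1)G_0(\bar\zeta_1) + \gamma_\varkappa(\bar\zeta_1)G_1(\bar\zeta_1) &= 0,\\
&\;\;\vdots\\
\gamma_1(\bar\zeta_1)G_0(\bar\zeta_1) + \gamma_2(\bar\zeta_1)G_1(\bar\zeta_1) + \cdots + \gamma_\varkappa(\bar\zeta_1)G_{\varkappa-1}(\bar\zeta_1) &= 0.
\end{aligned}
\]
Since $\gamma_j(\zeta_1) = \gamma_j(\bar\zeta_1)^*$, $j = 1, \dots, \varkappa$,
\[
\begin{bmatrix} G_0(\bar\zeta_1)\\ G_1(\bar\zeta_1)\\ \vdots\\ G_{\varkappa-1}(\bar\zeta_1) \end{bmatrix}^{*}
\begin{bmatrix}
\gamma_1(\zeta_1) & \gamma_2(\zeta_1) & \cdots & \gamma_{\varkappa-1}(\zeta_1) & \gamma_\varkappa(\zeta_1)\\
\gamma_2(\zeta_1) & \gamma_3(\zeta_1) & \cdots & \gamma_\varkappa(\zeta_1) & 0\\
\vdots & & & & \vdots\\
\gamma_\varkappa(\zeta_1) & 0 & \cdots & 0 & 0
\end{bmatrix}
=
\begin{bmatrix} 0\\ 0\\ \vdots\\ 0 \end{bmatrix}^{*}.
\]
Hence $\langle F, G \rangle_{\zeta_1} = 0$ by Lemma 4.3.

Case 2: $\zeta_1 = \bar\zeta_2$. Put $\zeta_1 = \zeta$ and $\zeta_2 = \bar\zeta$. As a first step we derive the formula (4.18) for the left side of (4.15). Suppose
\[
v(z) = -\frac{\gamma_\varkappa(w)}{(z-w)^\varkappa} - \cdots - \frac{\gamma_1(w)}{z-w} + \tilde v(z). \tag{4.16}
\]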

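By adding zero terms in (4.16), we can choose $\varkappa$ as large as we wish. Hence in view of the inclusions (3.7), we can assume without loss of generality that $Y(x)$ and $Z(x)$ are root functions of the same order $\varkappa$, that is, they belong to eigenchains
\[
Y^{(0)}(x), Y^{(1)}(x), \dots, Y^{(\varkappa-1)}(x) = Y(x), \qquad
Z^{(0)}(x), Z^{(1)}(x), \dots, Z^{(\varkappa-1)}(x) = Z(x).
\]
Thus
\[
Y(x) = (-i)^{\varkappa-1} W_{\varkappa-1}(x,\zeta)\begin{bmatrix} 0\\ g_0 \end{bmatrix}
+ (-i)^{\varkappa-2} W_{\varkappa-2}(x,\zeta)\begin{bmatrix} 0\\ g_1 \end{bmatrix}
+ \cdots + (-i)W_1(x,\zeta)\begin{bmatrix} 0\\ g_{\varkappa-2} \end{bmatrix}
+ W_0(x,\zeta)\begin{bmatrix} 0\\ g_{\varkappa-1} \end{bmatrix}
\]
and
\[
Z(x) = (-i)^{\varkappa-1} W_{\varkappa-1}(x,\bar\zeta)\begin{bmatrix} 0\\ h_0 \end{bmatrix}
+ (-i)^{\varkappa-2} W_{\varkappa-2}(x,\bar\zeta)\begin{bmatrix} 0\\ h_1 \end{bmatrix}
+ \cdots + (-i)W_1(x,\bar\zeta)\begin{bmatrix} 0\\ h_{\varkappa-2} \end{bmatrix}
+ W_0(x,\bar\zeta)\begin{bmatrix} 0\\ h_{\varkappa-1} \end{bmatrix},
\]
where the conditions on $g_0, \dots, g_{\varkappa-1}$ and $h_0, \dots, h_{\varkappa-1}$ in Proposition 3.5 are met. In the former case, by (3.14) these conditions can be written:
\[
\begin{aligned}
&\begin{bmatrix} R^* & Q^* \end{bmatrix}\begin{bmatrix} c_0(\bar\zeta)^*\\ d_0(\bar\zeta)^* \end{bmatrix} g_0 = 0,\\
&\begin{bmatrix} R^* & Q^* \end{bmatrix}\left\{ (-i)\begin{bmatrix} c_1(\bar\zeta)^*\\ d_1(\bar\zeta)^* \end{bmatrix} g_0 + \begin{bmatrix} c_0(\bar\zeta)^*\\ d_0(\bar\zeta)^* \end{bmatrix} g_1 \right\} = 0,\\
&\begin{bmatrix} R^* & Q^* \end{bmatrix}\left\{ (-i)^2\begin{bmatrix} c_2(\bar\zeta)^*\\ d_2(\bar\zeta)^* \end{bmatrix} g_0 + (-i)\begin{bmatrix} c_1(\bar\zeta)^*\\ d_1(\bar\zeta)^* \end{bmatrix} g_1 + \begin{bmatrix} c_0(\bar\zeta)^*\\ d_0(\bar\zeta)^* \end{bmatrix} g_2 \right\} = 0,\\
&\qquad\vdots\\
&\begin{bmatrix} R^* & Q^* \end{bmatrix}\left\{ (-i)^{\varkappa-1}\begin{bmatrix} c_{\varkappa-1}(\bar\zeta)^*\\ d_{\varkappa-1}(\bar\zeta)^* \end{bmatrix} g_0 + (-i)^{\varkappa-2}\begin{bmatrix} c_{\varkappa-2}(\bar\zeta)^*\\ d_{\varkappa-2}(\bar\zeta)^* \end{bmatrix} g_1 + \cdots + (-i)\begin{bmatrix} c_1(\bar\zeta)^*\\ d_1(\bar\zeta)^* \end{bmatrix} g_{\varkappa-2} + \begin{bmatrix} c_0(\bar\zeta)^*\\ d_0(\bar\zeta)^* \end{bmatrix} g_{\varkappa-1} \right\} = 0.
\end{aligned}
\tag{4.17}
\]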

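By (2.15),
\[
\begin{aligned}
\int_0^\ell Z(t)^* H(t) Y(t)\, dt
&= \int_0^\ell \sum_{q=0}^{\varkappa-1} i^q h_{\varkappa-1-q}^* \begin{bmatrix} 0 & I_m \end{bmatrix} W_q(t,\bar\zeta)^* H(t) \cdot \sum_{p=0}^{\varkappa-1} (-i)^p W_p(t,\zeta) \begin{bmatrix} 0\\ I_m \end{bmatrix} g_{\varkappa-1-p}\, dt\\
&= \sum_{p,q=0}^{\varkappa-1} i^q (-i)^p h_{\varkappa-1-q}^* \Delta_{pq}(\zeta) g_{\varkappa-1-p}.
\end{aligned}
\tag{4.18}
\]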

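We next derive the formula (4.26) for the right side of (4.15). By Corollary 4.6 and Lemma 4.3,
\[
\begin{aligned}
\langle F, G \rangle = \langle F, G \rangle_\zeta
&= \sum_{j=1}^{\varkappa} \sum_{p+q=j-1} G_q(\bar\zeta)^* \gamma_j(\zeta) F_p(\zeta)\\
&= G_0(\bar\zeta)^* \gamma_1(\zeta) F_0(\zeta)\\
&\quad + G_1(\bar\zeta)^* \gamma_2(\zeta) F_0(\zeta) + G_0(\bar\zeta)^* \gamma_2(\zeta) F_1(\zeta)\\
&\quad + \cdots\\
&\quad + G_{\varkappa-1}(\bar\zeta)^* \gamma_\varkappa(\zeta) F_0(\zeta) + G_{\varkappa-2}(\bar\zeta)^* \gamma_\varkappa(\zeta) F_1(\zeta) + \cdots + G_0(\bar\zeta)^* \gamma_\varkappa(\zeta) F_{\varkappa-1}(\zeta)\\
&= G_0(\bar\zeta)^* \bigl[ \gamma_1(\zeta)F_0(\zeta) + \gamma_2(\zeta)F_1(\zeta) + \cdots + \gamma_\varkappa(\zeta)F_{\varkappa-1}(\zeta) \bigr]\\
&\quad + G_1(\bar\zeta)^* \bigl[ \gamma_2(\zeta)F_0(\zeta) + \gamma_3(\zeta)F_1(\zeta) + \cdots + \gamma_\varkappa(\zeta)F_{\varkappa-2}(\zeta) \bigr]\\
&\quad + \cdots\\
&\quad + G_{\varkappa-2}(\bar\zeta)^* \bigl[ \gamma_{\varkappa-1}(\zeta)F_0(\zeta) + \gamma_\varkappa(\zeta)F_1(\zeta) \bigr] + G_{\varkappa-1}(\bar\zeta)^* \gamma_\varkappa(\zeta)F_0(\zeta),
\end{aligned}
\tag{4.19}
\]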
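As an illustration of the regrouping in (4.19), the smallest nontrivial case $\varkappa = 2$ can be written out in full. This is only a worked instance of the general formula, with the same symbols and no further assumptions:

```latex
\[
\begin{aligned}
\langle F, G \rangle_\zeta
&= G_0(\bar\zeta)^* \gamma_1(\zeta) F_0(\zeta)
 + G_1(\bar\zeta)^* \gamma_2(\zeta) F_0(\zeta)
 + G_0(\bar\zeta)^* \gamma_2(\zeta) F_1(\zeta) \\
&= G_0(\bar\zeta)^* \bigl[ \gamma_1(\zeta) F_0(\zeta) + \gamma_2(\zeta) F_1(\zeta) \bigr]
 + G_1(\bar\zeta)^* \gamma_2(\zeta) F_0(\zeta).
\end{aligned}
\]
```

The bracket multiplying $G_0(\bar\zeta)^*$ is the numerator of $(z-\zeta)^{-1}$ in the principal part of $v(z)F(z)$, and $\gamma_2(\zeta)F_0(\zeta)$ is the numerator of $(z-\zeta)^{-2}$.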

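where
\[
F(z) = \sum_{p=0}^{\infty} F_p(w)(z-w)^p \quad\text{and}\quad G(z) = \sum_{q=0}^{\infty} G_q(\bar w)(z-\bar w)^q.
\]
By (3.16),
\[
F(z) = \sum_{k=0}^{\infty} \Bigl( \sum_{j=0}^{\varkappa-1} (-i)^j \Delta_{jk}(\zeta)\, g_{\varkappa-1-j} \Bigr)(z-\zeta)^k,
\qquad
G(z) = \sum_{k=0}^{\infty} \Bigl( \sum_{j=0}^{\varkappa-1} (-i)^j \Delta_{jk}(\bar\zeta)\, h_{\varkappa-1-j} \Bigr)(z-\bar\zeta)^k,
\]
and so
\[
F_k(\zeta) = \sum_{j=0}^{\varkappa-1} (-i)^j \Delta_{jk}(\zeta)\, g_{\varkappa-1-j}, \tag{4.20}
\]
\[
G_k(\bar\zeta) = \sum_{j=0}^{\varkappa-1} (-i)^j \Delta_{jk}(\bar\zeta)\, h_{\varkappa-1-j}, \tag{4.21}
\]
for all $k = 0, 1, \dots, \varkappa-1$.

Claim: For every $\xi \in \operatorname{ran}\begin{bmatrix} R\\ Q \end{bmatrix}$,
\[
v(z)\begin{bmatrix} c(z) & d(z) \end{bmatrix}\xi = i\begin{bmatrix} a(z) & b(z) \end{bmatrix}\xi. \tag{4.22}
\]
The claim follows on writing (4.1) in the form
\[
v(z)\begin{bmatrix} c(z) & d(z) \end{bmatrix}\begin{bmatrix} R\\ Q \end{bmatrix} = i\begin{bmatrix} a(z) & b(z) \end{bmatrix}\begin{bmatrix} R\\ Q \end{bmatrix}.
\]
Now by (3.19),
\[
F(z) = -\begin{bmatrix} c(z) & d(z) \end{bmatrix}\left\{ (-i)\frac{JY^{(\varkappa-1)}(\ell)}{z-\zeta} + (-i)^2\frac{JY^{(\varkappa-2)}(\ell)}{(z-\zeta)^2} + \cdots + (-i)^{\varkappa}\frac{JY^{(0)}(\ell)}{(z-\zeta)^{\varkappa}} \right\}. \tag{4.23}
\]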

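By (4.17) and Lemma 3.4(1),
\[
\begin{aligned}
JY^{(0)}(\ell) &= \begin{bmatrix} d_0(\bar\zeta)^*\\ c_0(\bar\zeta)^* \end{bmatrix} g_0 \in \operatorname{ran}\begin{bmatrix} R\\ Q \end{bmatrix},\\
JY^{(1)}(\ell) &= (-i)\begin{bmatrix} d_1(\bar\zeta)^*\\ c_1(\bar\zeta)^* \end{bmatrix} g_0 + \begin{bmatrix} d_0(\bar\zeta)^*\\ c_0(\bar\zeta)^* \end{bmatrix} g_1 \in \operatorname{ran}\begin{bmatrix} R\\ Q \end{bmatrix},\\
JY^{(2)}(\ell) &= (-i)^2\begin{bmatrix} d_2(\bar\zeta)^*\\ c_2(\bar\zeta)^* \end{bmatrix} g_0 + (-i)\begin{bmatrix} d_1(\bar\zeta)^*\\ c_1(\bar\zeta)^* \end{bmatrix} g_1 + \begin{bmatrix} d_0(\bar\zeta)^*\\ c_0(\bar\zeta)^* \end{bmatrix} g_2 \in \operatorname{ran}\begin{bmatrix} R\\ Q \end{bmatrix},\\
&\;\;\vdots\\
JY^{(\varkappa-1)}(\ell) &= (-i)^{\varkappa-1}\begin{bmatrix} d_{\varkappa-1}(\bar\zeta)^*\\ c_{\varkappa-1}(\bar\zeta)^* \end{bmatrix} g_0 + (-i)^{\varkappa-2}\begin{bmatrix} d_{\varkappa-2}(\bar\zeta)^*\\ c_{\varkappa-2}(\bar\zeta)^* \end{bmatrix} g_1 + \cdots\\
&\quad + (-i)\begin{bmatrix} d_1(\bar\zeta)^*\\ c_1(\bar\zeta)^* \end{bmatrix} g_{\varkappa-2} + \begin{bmatrix} d_0(\bar\zeta)^*\\ c_0(\bar\zeta)^* \end{bmatrix} g_{\varkappa-1} \in \operatorname{ran}\begin{bmatrix} R\\ Q \end{bmatrix}.
\end{aligned}
\]
Therefore by (4.23) and the claim,
\[
\begin{aligned}
v(z)F(z) &= -v(z)\begin{bmatrix} c(z) & d(z) \end{bmatrix}\left\{ (-i)\frac{JY^{(\varkappa-1)}(\ell)}{z-\zeta} + (-i)^2\frac{JY^{(\varkappa-2)}(\ell)}{(z-\zeta)^2} + \cdots + (-i)^{\varkappa}\frac{JY^{(0)}(\ell)}{(z-\zeta)^{\varkappa}} \right\}\\
&= -i\begin{bmatrix} a(z) & b(z) \end{bmatrix}\left\{ (-i)\frac{JY^{(\varkappa-1)}(\ell)}{z-\zeta} + (-i)^2\frac{JY^{(\varkappa-2)}(\ell)}{(z-\zeta)^2} + \cdots + (-i)^{\varkappa}\frac{JY^{(0)}(\ell)}{(z-\zeta)^{\varkappa}} \right\}.
\end{aligned}
\tag{4.24}
\]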

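For the left side of (4.24), the series expansions of $v(z)$ and $F(z)$ yield
\[
\begin{aligned}
v(z)F(z) = {} & -\frac{\gamma_\varkappa(\zeta)F_0(\zeta)}{(z-\zeta)^{\varkappa}} - \frac{\gamma_{\varkappa-1}(\zeta)F_0(\zeta) + \gamma_\varkappa(\zeta)F_1(\zeta)}{(z-\zeta)^{\varkappa-1}} - \cdots\\
& - \frac{\gamma_1(\zeta)F_0(\zeta) + \gamma_2(\zeta)F_1(\zeta) + \cdots + \gamma_\varkappa(\zeta)F_{\varkappa-1}(\zeta)}{z-\zeta} + \text{holomorphic part}.
\end{aligned}
\tag{4.25}
\]
The numerators here are key to calculating (4.19). We next show that these numerators are very simple expressions. In fact, by (4.24) and (4.25),
\[
\begin{aligned}
& -\frac{\gamma_\varkappa(\zeta)F_0(\zeta)}{(z-\zeta)^{\varkappa}} - \frac{\gamma_{\varkappa-1}(\zeta)F_0(\zeta) + \gamma_\varkappa(\zeta)F_1(\zeta)}{(z-\zeta)^{\varkappa-1}} - \cdots - \frac{\gamma_1(\zeta)F_0(\zeta) + \cdots + \gamma_\varkappa(\zeta)F_{\varkappa-1}(\zeta)}{z-\zeta} + \text{holomorphic part}\\
&\quad = -i\begin{bmatrix} a(z) & b(z) \end{bmatrix}\left\{ (-i)\frac{JY^{(\varkappa-1)}(\ell)}{z-\zeta} + (-i)^2\frac{JY^{(\varkappa-2)}(\ell)}{(z-\zeta)^2} + \cdots + (-i)^{\varkappa}\frac{JY^{(0)}(\ell)}{(z-\zeta)^{\varkappa}} \right\}\\
&\quad = -\Bigl\{ \begin{bmatrix} a_0(\zeta) & b_0(\zeta) \end{bmatrix} + \begin{bmatrix} a_1(\zeta) & b_1(\zeta) \end{bmatrix}(z-\zeta) + \begin{bmatrix} a_2(\zeta) & b_2(\zeta) \end{bmatrix}(z-\zeta)^2 + \cdots \Bigr\}\\
&\qquad \cdot \left\{ \frac{JY^{(\varkappa-1)}(\ell)}{z-\zeta} + (-i)\frac{JY^{(\varkappa-2)}(\ell)}{(z-\zeta)^2} + \cdots + (-i)^{\varkappa-1}\frac{JY^{(0)}(\ell)}{(z-\zeta)^{\varkappa}} \right\}.
\end{aligned}
\]
Therefore
\[
\begin{aligned}
\gamma_\varkappa(\zeta)F_0(\zeta) &= \begin{bmatrix} a_0(\zeta) & b_0(\zeta) \end{bmatrix}(-i)^{\varkappa-1}JY^{(0)}(\ell)\\
&= (-i)^{\varkappa-1}\begin{bmatrix} a_0(\zeta) & b_0(\zeta) \end{bmatrix}\begin{bmatrix} d_0(\bar\zeta)^*\\ c_0(\bar\zeta)^* \end{bmatrix} g_0\\
&= (-i)^{\varkappa-1} g_0,
\end{aligned}
\]
the last equality holding by (2.10). By (2.10) and (2.11),
\[
\begin{aligned}
\gamma_{\varkappa-1}(\zeta)F_0(\zeta) + \gamma_\varkappa(\zeta)F_1(\zeta)
&= \begin{bmatrix} a_1(\zeta) & b_1(\zeta) \end{bmatrix}(-i)^{\varkappa-1}JY^{(0)}(\ell) + \begin{bmatrix} a_0(\zeta) & b_0(\zeta) \end{bmatrix}(-i)^{\varkappa-2}JY^{(1)}(\ell)\\
&= (-i)^{\varkappa-1}\begin{bmatrix} a_1(\zeta) & b_1(\zeta) \end{bmatrix}\begin{bmatrix} d_0(\bar\zeta)^*\\ c_0(\bar\zeta)^* \end{bmatrix} g_0\\
&\quad + (-i)^{\varkappa-2}\begin{bmatrix} a_0(\zeta) & b_0(\zeta) \end{bmatrix}\left\{ (-i)\begin{bmatrix} d_1(\bar\zeta)^*\\ c_1(\bar\zeta)^* \end{bmatrix} g_0 + \begin{bmatrix} d_0(\bar\zeta)^*\\ c_0(\bar\zeta)^* \end{bmatrix} g_1 \right\}\\
&= (-i)^{\varkappa-2} g_1.
\end{aligned}
\]
We continue in this way, obtaining at the last stage
\[
\begin{aligned}
&\gamma_1(\zeta)F_0(\zeta) + \gamma_2(\zeta)F_1(\zeta) + \cdots + \gamma_\varkappa(\zeta)F_{\varkappa-1}(\zeta)\\
&\quad = \begin{bmatrix} a_{\varkappa-1}(\zeta) & b_{\varkappa-1}(\zeta) \end{bmatrix}(-i)^{\varkappa-1}JY^{(0)}(\ell) + \begin{bmatrix} a_{\varkappa-2}(\zeta) & b_{\varkappa-2}(\zeta) \end{bmatrix}(-i)^{\varkappa-2}JY^{(1)}(\ell) + \cdots + \begin{bmatrix} a_0(\zeta) & b_0(\zeta) \end{bmatrix}JY^{(\varkappa-1)}(\ell)\\
&\quad = (-i)^{\varkappa-1}\begin{bmatrix} a_{\varkappa-1}(\zeta) & b_{\varkappa-1}(\zeta) \end{bmatrix}\begin{bmatrix} d_0(\bar\zeta)^*\\ c_0(\bar\zeta)^* \end{bmatrix} g_0\\
&\qquad + (-i)^{\varkappa-2}\begin{bmatrix} a_{\varkappa-2}(\zeta) & b_{\varkappa-2}(\zeta) \end{bmatrix}\left\{ (-i)\begin{bmatrix} d_1(\bar\zeta)^*\\ c_1(\bar\zeta)^* \end{bmatrix} g_0 + \begin{bmatrix} d_0(\bar\zeta)^*\\ c_0(\bar\zeta)^* \end{bmatrix} g_1 \right\} + \cdots\\
&\qquad + \begin{bmatrix} a_0(\zeta) & b_0(\zeta) \end{bmatrix}\left\{ (-i)^{\varkappa-1}\begin{bmatrix} d_{\varkappa-1}(\bar\zeta)^*\\ c_{\varkappa-1}(\bar\zeta)^* \end{bmatrix} g_0 + (-i)^{\varkappa-2}\begin{bmatrix} d_{\varkappa-2}(\bar\zeta)^*\\ c_{\varkappa-2}(\bar\zeta)^* \end{bmatrix} g_1 + \cdots + \begin{bmatrix} d_0(\bar\zeta)^*\\ c_0(\bar\zeta)^* \end{bmatrix} g_{\varkappa-1} \right\}\\
&\quad = g_{\varkappa-1}.
\end{aligned}
\]
Thus (4.19) yields
\[
\begin{aligned}
\langle F, G \rangle &= G_0(\bar\zeta)^* g_{\varkappa-1} + G_1(\bar\zeta)^*(-i)g_{\varkappa-2} + \cdots + G_{\varkappa-2}(\bar\zeta)^*(-i)^{\varkappa-2}g_1 + G_{\varkappa-1}(\bar\zeta)^*(-i)^{\varkappa-1}g_0\\
&= \sum_{p=0}^{\varkappa-1} G_p(\bar\zeta)^*(-i)^p g_{\varkappa-1-p}.
\end{aligned}
\tag{4.26}
\]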

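The final step is to compare (4.18) and (4.26). By (4.21),
\[
G_p(\bar\zeta) = \sum_{q=0}^{\varkappa-1} (-i)^q \Delta_{qp}(\bar\zeta) h_{\varkappa-1-q},
\]
and so by (2.16),
\[
G_p(\bar\zeta)^* = \sum_{q=0}^{\varkappa-1} i^q h_{\varkappa-1-q}^* \Delta_{qp}(\bar\zeta)^* = \sum_{q=0}^{\varkappa-1} i^q h_{\varkappa-1-q}^* \Delta_{pq}(\zeta).
\]
Therefore (4.26) yields
\[
\begin{aligned}
\langle F, G \rangle &= \sum_{p=0}^{\varkappa-1} \Bigl( \sum_{q=0}^{\varkappa-1} i^q h_{\varkappa-1-q}^* \Delta_{pq}(\zeta) \Bigr)(-i)^p g_{\varkappa-1-p}\\
&= \sum_{p,q=0}^{\varkappa-1} i^q(-i)^p h_{\varkappa-1-q}^* \Delta_{pq}(\zeta) g_{\varkappa-1-p} = \int_0^\ell Z(t)^* H(t) Y(t)\, dt,
\end{aligned}
\]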

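where the last equality is by (4.18). We have verified (4.15), and this completes the proof. □

Proof of Theorem 4.7, Part (2). According to Definition 4.2, to show that $F = 0$ as an element of $\mathfrak{H}_0(v)$, we must show that $v(z)F(z)$ is entire, that is, it is analytic at every pole of $v(z)$. Let $\zeta$ be a pole of $v(z)$, and represent $v(z)$ as in (4.3) for $w = \zeta$ and $w = \bar\zeta$. The coefficients in these representations satisfy
\[
\gamma_k(\bar\zeta)^* = \gamma_k(\zeta), \qquad k = 1, \dots, \varkappa. \tag{4.27}
\]
Write $F(z) = \sum_{j=0}^{\infty} F_j(\zeta)(z-\zeta)^j$. Since
\[
v(z)F(z) = \left\{ -\left[ \frac{\gamma_\varkappa(\zeta)}{(z-\zeta)^{\varkappa}} + \frac{\gamma_{\varkappa-1}(\zeta)}{(z-\zeta)^{\varkappa-1}} + \cdots + \frac{\gamma_1(\zeta)}{z-\zeta} \right] + \mathcal{O}(1) \right\} \cdot \Bigl\{ F_0(\zeta) + F_1(\zeta)(z-\zeta) + \cdots + F_{\varkappa-1}(\zeta)(z-\zeta)^{\varkappa-1} + \cdots \Bigr\},
\]
the problem is to show that
\[
\begin{aligned}
\gamma_\varkappa(\zeta)F_0(\zeta) &= 0,\\
\gamma_\varkappa(\zeta)F_1(\zeta) + \gamma_{\varkappa-1}(\zeta)F_0(\zeta) &= 0,\\
&\;\;\vdots\\
\gamma_\varkappa(\zeta)F_{\varkappa-1}(\zeta) + \gamma_{\varkappa-1}(\zeta)F_{\varkappa-2}(\zeta) + \cdots + \gamma_1(\zeta)F_0(\zeta) &= 0.
\end{aligned}
\tag{4.28}
\]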

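By (1.5) and (2.3),
\[
F(z) = \int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W(x, \bar z)^* H(x) f(x)\, dx
= \sum_{j=0}^{\infty} (z-\zeta)^j \int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W_j(x, \bar\zeta)^* H(x) f(x)\, dx,
\]
and so for all $j = 0, 1, 2, \dots$,
\[
F_j(\zeta) = \int_0^\ell \begin{bmatrix} 0 & I_m \end{bmatrix} W_j(x, \bar\zeta)^* H(x) f(x)\, dx. \tag{4.29}
\]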

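By Proposition 3.1, $\bar\zeta$ is an eigenvalue of (3.1). Since $f$ is orthogonal to all root functions of (3.1), $f$ is orthogonal to the root functions for the eigenvalue $\bar\zeta$ provided by Proposition 4.1. Denote these functions
\[
Y^{(0)}(x), \dots, Y^{(\varkappa-1)}(x). \tag{4.30}
\]
Explicit formulas for the functions (4.30) are given by (3.13) and (4.5) with $\zeta$ replaced by $\bar\zeta$. Thus for each $j = 0, 1, \dots, \varkappa-1$,
\[
\begin{aligned}
Y^{(j)}(x) &= (-i)^j W_j(x,\bar\zeta)\begin{bmatrix} 0\\ g_0 \end{bmatrix} + (-i)^{j-1} W_{j-1}(x,\bar\zeta)\begin{bmatrix} 0\\ g_1 \end{bmatrix} + \cdots + (-i)W_1(x,\bar\zeta)\begin{bmatrix} 0\\ g_{j-1} \end{bmatrix} + W_0(x,\bar\zeta)\begin{bmatrix} 0\\ g_j \end{bmatrix}\\
&= (-i)^j W_j(x,\bar\zeta)\begin{bmatrix} 0\\ \gamma_\varkappa(\bar\zeta)u \end{bmatrix} + (-i)^{j-1} W_{j-1}(x,\bar\zeta)\begin{bmatrix} 0\\ (-i)\gamma_{\varkappa-1}(\bar\zeta)u \end{bmatrix} + \cdots\\
&\quad + (-i)W_1(x,\bar\zeta)\begin{bmatrix} 0\\ (-i)^{j-1}\gamma_{\varkappa-j+1}(\bar\zeta)u \end{bmatrix} + W_0(x,\bar\zeta)\begin{bmatrix} 0\\ (-i)^{j}\gamma_{\varkappa-j}(\bar\zeta)u \end{bmatrix},
\end{aligned}
\]
where $u$ is an arbitrary vector in $\mathbb{C}^m$. We obtain
\[
\begin{aligned}
0 = (-i)^j \int_0^\ell Y^{(j)}(t)^* H(t) f(t)\, dt
&= \int_0^\ell \begin{bmatrix} 0 & u^*\gamma_\varkappa(\bar\zeta)^* \end{bmatrix} W_j(t,\bar\zeta)^* H(t) f(t)\, dt\\
&\quad + \int_0^\ell \begin{bmatrix} 0 & u^*\gamma_{\varkappa-1}(\bar\zeta)^* \end{bmatrix} W_{j-1}(t,\bar\zeta)^* H(t) f(t)\, dt + \cdots\\
&\quad + \int_0^\ell \begin{bmatrix} 0 & u^*\gamma_{\varkappa-j+1}(\bar\zeta)^* \end{bmatrix} W_1(t,\bar\zeta)^* H(t) f(t)\, dt\\
&\quad + \int_0^\ell \begin{bmatrix} 0 & u^*\gamma_{\varkappa-j}(\bar\zeta)^* \end{bmatrix} W_0(t,\bar\zeta)^* H(t) f(t)\, dt.
\end{aligned}
\]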

In view of (4.27) and (4.29) and the arbitrariness of 𝑢, we conclude that 𝛾ϰ (𝜁)∗ 𝐹𝑗 (𝜁) + 𝛾ϰ−1 (𝜁)∗ 𝐹𝑗−1 (𝜁) + ⋅ ⋅ ⋅ + 𝛾ϰ−𝑗+1 (𝜁)∗ 𝐹1 (𝜁) + 𝛾ϰ−𝑗 (𝜁)∗ 𝐹0 (𝜁) = 0, which is equivalent to the system (4.28). We have shown that 𝑣(𝑧)𝐹 (𝑧) is analytic at every pole 𝜁 of 𝑣(𝑧). Therefore 𝐹 = 0 as an element of ℌ0 (𝑣), and the proof is complete. □

References
[1] I.C. Gohberg and M.G. Kreĭn, Introduction to the theory of linear nonselfadjoint operators, American Mathematical Society, Providence, R.I., 1969.
[2] I.C. Gohberg and M.G. Kreĭn, Theory and applications of Volterra operators in Hilbert space, American Mathematical Society, Providence, R.I., 1970.
[3] J. Rovnyak and L.A. Sakhnovich, Pseudospectral functions for canonical differential systems, Oper. Theory Adv. Appl., vol. 191, Birkhäuser, Basel, 2009, pp. 187–219.
[4] L.A. Sakhnovich, Spectral theory of canonical differential systems. Method of operator identities, Oper. Theory Adv. Appl., vol. 107, Birkhäuser Verlag, Basel, 1999.

J. Rovnyak
University of Virginia, Department of Mathematics
P. O. Box 400137, Charlottesville, VA 22904–4137, USA

L.A. Sakhnovich
99 Cove Avenue, Milford, CT 06461, USA

Operator Theory: Advances and Applications, Vol. 218, 613–638
© 2012 Springer Basel AG

Operator Identities for Subnormal Tuples of Operators
Daoxing Xia

For the memory of Professor I. Gohberg

Abstract. Some formulas for the products of resolvents of subnormal 𝑘-tuples of operators as well as 𝑘-tuples of commuting operators are established. Mathematics Subject Classiﬁcation (2000). Primary 47B20. Keywords. Subnormal 𝑘-tuple of operators, commuting 𝑘-tuple of operators, resolvent.

1. Introduction

A $k$-tuple of operators $\mathbb{S} = (S_1, \dots, S_k)$ on a Hilbert space $\mathcal{H}$ is said to be subnormal if there is a commuting $k$-tuple $\mathbb{N} = (N_1, \dots, N_k)$ of normal operators on a Hilbert space $\mathcal{H}_0$ containing $\mathcal{H}$ as a subspace, such that $S_j = N_j|_{\mathcal{H}}$. In this case $\mathbb{N}$ is said to be a normal extension of $\mathbb{S}$. A normal extension is said to be minimal if there is no proper subspace of $\mathcal{H}_0 \ominus \mathcal{H}$ which reduces $\mathbb{N}$. The minimal normal extension (m.n.e.) of a subnormal tuple of operators exists and is essentially unique. There are several papers studying subnormal $k$-tuples of operators such as [1], [3], [4], [5], [6], [8], [11], [12], [15], [20].

Let $M$ be the closure of $\bigvee_{i,j} [S_i^*, S_j]\mathcal{H}$. Then $M$ is said to be the defect space. In the first part of this paper, the formulas for calculating the product of resolvents, a kind of Lifschitz-Brodski kernel
\[
P_M \prod_{i=1}^{m} (N_{p_i}^* - \bar w_i)^{-1} \prod_{j=1}^{n} (N_{q_j} - z_j)^{-1}\Big|_M, \tag{1}
\]
will be given, where $\mathbb{N}$ is the m.n.e. of a subnormal $k$-tuple $\mathbb{S} = (S_1, \dots, S_k)$, $1 \le p_i, q_j \le k$, $M$ is the defect space, $P_M$ is the projection from $\mathcal{H}_0$ to $M$, $z_i \in \rho(N_{q_i})$ and $w_j \in \rho(N_{p_j})$. If $z_i \in \rho(S_{q_i})$ and $w_j \in \rho(S_{p_j})$ then (1) is equal to
\[
P_M \prod_{i=1}^{m} (S_{p_i}^* - \bar w_i)^{-1} \prod_{j=1}^{n} (S_{q_j} - z_j)^{-1}\Big|_M. \tag{2}
\]
Notice that if $\mathbb{S}$ is pure, i.e., if there is no proper subspace $F \subset \mathcal{H}$ reducing $\mathbb{S}$ such that $\mathbb{S}|_F$ is normal, then
\[
\mathcal{H} = \text{closure of } \bigvee \Bigl\{ \prod_{j=1}^{n} (S_j - z_j)^{-1}\alpha : z_j \in \rho(S_j),\ \alpha \in M \Bigr\}. \tag{3}
\]

Thus the calculation of (2) provides a way to calculate the inner product of any two vectors in $\mathcal{H}$.

Let us review some of the theory of single subnormal or hyponormal operators related to the subject of this paper. Let $S$ be a subnormal operator on a Hilbert space $\mathcal{H}$ with m.n.e. $N$ on $\mathcal{H}_0 \supset \mathcal{H}$. We proved in [13] that the defect space
\[
M \overset{\mathrm{def}}{=} \text{closure of } [S^*, S]\mathcal{H}
\]
is invariant with respect to $S^*$. It is evident that $[S^*, S]M \subset M$. Then we defined the $L(M)$ operators
\[
C \overset{\mathrm{def}}{=} [S^*, S]|_M \quad\text{and}\quad \Lambda \overset{\mathrm{def}}{=} (S^*|_M)^*
\]
and proved that $\{C, \Lambda\}$ is a complete unitary invariant for a pure subnormal operator $S$. We [13] also defined an idempotent $L(M)$-valued analytic function, the mosaic for $S$, as follows:
\[
\mu(z) = \int_{\sigma(N)} \frac{u - \Lambda}{u - z}\, e(du), \qquad z \in \rho(N),
\]
where $e(\cdot) = P_M E(\cdot)|_M$, $E(\cdot)$ is the spectral measure of $N$, and $P_M$ is the projection from $\mathcal{H}_0$ to $M$. Then $\mu(z) = 0$ for $z \in \rho(S)$. We then defined a rational function
\[
R(z) \overset{\mathrm{def}}{=} C(z - \Lambda)^{-1} + \Lambda^*, \qquad z \in \rho(\Lambda),
\]
from which we derived
\[
[R(z), \mu(z)] = 0 \quad\text{for } z \in \rho(\Lambda) \cap \rho(N).
\]
Let
\[
Q(z, w) \overset{\mathrm{def}}{=} (\bar w - \Lambda^*)(z - \Lambda) - C.
\]
The Lifschitz-Brodski kernel (1) in this case is
\[
S(z, w) \overset{\mathrm{def}}{=} P_M (N^* - \bar w)^{-1}(N - z)^{-1}|_M, \qquad z, w \in \rho(N).
\]

We then proved in [13] that
\[
S(z, w) = (I - \mu(w)^*)Q(z, w)^{-1} - Q(z, w)^{-1}\mu(z),
\]
if $z, w \in \rho(N)$ and $Q(z, w)$ is invertible. When $\dim M < \infty$, $\{C, \Lambda\}$ is a pair of matrices and is a very useful tool for studying $S$. For example, in this case $\sigma(N) \subset \{z : \det Q(z, z) = 0\}$, $\sigma(S) \setminus \sigma(N)$ is covered by a union of quadrature domains in Riemann surfaces, and there is a finite set of branched covers $(R_j, \pi_j)$ that are quadrature domains in Riemann surfaces (see [17]) such that $\sigma(S)$ equals the closure of the union of the images $\pi_j(R_j)$ of the Riemann surfaces $R_j$. In [21] and [22] Yakubovich proved that when $\dim M < \infty$, the algebraic curve attached to a single subnormal operator $S$ should be divided naturally into two halves; an explicit formula for the mosaic $\mu(z)$ that uses these halves is given, and the corresponding functional models of $S$ on Riemann surfaces are investigated. If $\sigma(S) \setminus \sigma(N)$ is a quadrature domain $D \subset \mathbb{C}$, then
\[
R(z)\mu(z) = \mu(z)R(z) = S(z)\mu(z),
\]
where $S(\cdot)$ is the Schwarz function of $D$. Besides, the mosaic $\mu(z)$ is the parallel projection to the eigenspace of the matrix $R(z)$ corresponding to the eigenvalue $S(z)$.

For a hyponormal operator $H$ on a Hilbert space $\mathcal{H}$, let $M = \text{closure of } [H^*, H]\mathcal{H}$. Then $H^*M \not\subset M$ for some hyponormal operators $H$. M. Putinar [9] introduced the subspace
\[
\mathcal{K} \overset{\mathrm{def}}{=} \text{closure of } \bigvee \{H^{*n}M : n = 0, 1, 2, \dots\}.
\]
Then $H^*\mathcal{K} \subset \mathcal{K}$ and $[H^*, H]\mathcal{K} \subset \mathcal{K}$. In that case he introduced
\[
C \overset{\mathrm{def}}{=} [H^*, H]|_{\mathcal{K}} \quad\text{and}\quad \Lambda \overset{\mathrm{def}}{=} (H^*|_{\mathcal{K}})^*,
\]
which are in $L(\mathcal{K})$. This pair $\{C, \Lambda\}$ is also a complete unitary invariant for a pure hyponormal operator $H$. In the case of $\dim M = 1$ and $\dim \mathcal{K} < \infty$, Gustafsson and Putinar ([7] and [9]) studied the unique pure hyponormal operator $H$ satisfying the condition that the interior domain $D$ of $\sigma(H)$ is a quadrature domain. The author also proved that the Schwarz function $S(z)$ of $D$ satisfies
\[
\det Q(z, S(z)) = 0, \quad\text{where } Q(z, w) = (\bar w - \Lambda^*)(z - \Lambda) - C.
\]
There are several very interesting results of their linear analysis of quadrature domains, some of which are related to $C$ and $\Lambda$.
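The single-operator identity $S(z,w) = (I - \mu(w)^*)Q(z,w)^{-1} - Q(z,w)^{-1}\mu(z)$ can be checked numerically in a simple case. The sketch below assumes the unilateral shift on the Hardy space as the example (this choice is not taken from the paper): its minimal normal extension is the bilateral shift, $M$ is spanned by the constant function, $C = 1$, $\Lambda = 0$, $e(du)$ is normalized arc length on the unit circle, and the mosaic $\mu$ is the indicator of the open unit disk.

```python
import cmath


def kernel_by_quadrature(z, w, n=2000):
    """Approximate S(z,w) = integral of e(du)/((conj(u)-conj(w))(u-z)) over
    the unit circle, with e(du) taken as normalized arc length."""
    z, w = complex(z), complex(w)
    total = 0
    for k in range(n):
        u = cmath.exp(2j * cmath.pi * k / n)
        total += 1 / ((u.conjugate() - w.conjugate()) * (u - z))
    return total / n


def mosaic(z):
    """Mosaic of the unilateral shift: 1 inside the unit disk, 0 outside."""
    return 1.0 if abs(z) < 1 else 0.0


def kernel_by_formula(z, w):
    """(1 - conj(mu(w)))/Q - mu(z)/Q with Q(z,w) = conj(w)*z - 1,
    which is the scalar form of the identity when C = 1 and Lambda = 0."""
    z, w = complex(z), complex(w)
    q = w.conjugate() * z - 1
    return (1 - mosaic(w)) / q - mosaic(z) / q


# Sample points off the unit circle: both outside, both inside, and mixed.
samples = [(2, 3), (0.5, 0.3 + 0.4j), (0.5, 2 + 1j)]
errors = [abs(kernel_by_quadrature(z, w) - kernel_by_formula(z, w)) for z, w in samples]
```

In this toy case the two computations agree to rounding error at each sample point; the sample points themselves are arbitrary choices with $Q(z,w) \neq 0$.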

The author [17], [18] also introduced the mosaic $\mu(\cdot)$ related to the hyponormal operator associated with a quadrature domain. This $\mu(\cdot)$ is also a meromorphic function on $D$, satisfying $\mu(\cdot)^2 = \mu(\cdot)$. Similar to (1), let
\[
S(z, w) = P_{\mathcal{K}}(H^* - \bar w)^{-1}(H - z)^{-1}|_{\mathcal{K}}, \qquad z, w \in \rho(H).
\]
In the case $\dim M = 1$ (without the restriction $\dim \mathcal{K} < \infty$), J. Pincus, D. Xia, and J. Xia [10] derived the formula
\[
(S(z, w)k, k) = 1 - \exp\left\{ -\frac{1}{\pi} \iint \frac{g(\zeta)\, dA(\zeta)}{(\zeta - z)(\bar\zeta - \bar w)} \right\},
\]
where $k \in M$ satisfies $\|k\| = \|[H^*, H]\|$, and $g(\cdot)$ is the Pincus principal function. If $H$ is also associated with the quadrature domain, then [7], [9], [16]
\[
(S(z, w)k, k) = 1 - \frac{\det(Q(z, w))}{\det(z - \Lambda)\det(\bar w - \Lambda^*)}.
\]

All of the above show that the objects 𝐶, Λ, 𝑅(⋅), 𝜇(⋅), 𝑆(⋅, ⋅) are useful tools in the theory of subnormal operators as well as the theory of hyponormal operators. In §2, we will introduce a generalization of 𝐶, Λ, 𝑅(⋅), 𝜇(⋅), 𝑆(⋅, ⋅) in the case of a subnormal 𝑘-tuple of operators. Most of them have been studied in [15]. In §3, we will give the formula for (1). In §4, we will generalize the formula for (2) to the case of a commuting 𝑘-tuple of operators. Some papers about the application of these formulas are being prepared.

2. Analytic model for a subnormal $k$-tuple of operators

Let $\mathbb{S} = (S_1, \dots, S_k)$ be a pure subnormal $k$-tuple of operators on a Hilbert space $\mathcal{H}$ with m.n.e. $\mathbb{N}$ on $\mathcal{H}_0 \supset \mathcal{H}$. Let $M$ be the defect space:
\[
M = \text{closure of } \bigvee \{[S_i^*, S_j]\mathcal{H} : i, j = 1, 2, \dots, k\}. \tag{4}
\]
Then as shown in [15], $M$ is invariant with respect to $S_i^*$ and $[S_i^*, S_j]$ for $i, j = 1, 2, \dots, k$. Denote the operators on $M$ by
\[
C_{ij} \overset{\mathrm{def}}{=} [S_i^*, S_j]|_M \quad\text{and}\quad \Lambda_i \overset{\mathrm{def}}{=} (S_i^*|_M)^* \tag{5}
\]
for $i, j = 1, 2, \dots, k$. Let $E(\cdot)$ be the spectral measure of $\mathbb{N}$ on $\operatorname{sp}(\mathbb{N})$. Define an $L(M)$-valued positive measure
\[
e(\cdot) \overset{\mathrm{def}}{=} P_M E(\cdot)|_M \tag{6}
\]
on $\operatorname{sp}(\mathbb{N})$. Let $K(\mathbb{S}) = \times_{j=1}^{k} \sigma(S_j) \subset \mathbb{C}^k$ and let $\tilde{\mathcal{H}} = R^2(K(\mathbb{S}), e)$ be the Hilbert space completion of
\[
\bigvee \Bigl\{ \prod_{j=1}^{k} (\lambda_j - u_j)^{-1}\alpha : \alpha \in M,\ \lambda_j \in \rho(S_j) \Bigr\}
\]
with respect to the inner product
\[
(f, g) \overset{\mathrm{def}}{=} \int_{\operatorname{sp}(\mathbb{N})} (e(du)f(u), g(u)). \tag{7}
\]

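Theorem [15]. Let $\mathbb{S}$ be a pure subnormal $k$-tuple of operators on a separable Hilbert space $\mathcal{H}$ with a m.n.e. $\mathbb{N}$ on a Hilbert space $\mathcal{H}_0 \supset \mathcal{H}$. Let $M$ be the defect space of $\mathbb{S}$. Then there is a unitary operator $U$ from $\mathcal{H}_0$ onto the Hilbert space $L^2(e)$ of all measurable and square integrable functions on $\operatorname{sp}(\mathbb{N})$ with respect to the inner product (7) satisfying the following conditions:
\[
U\mathcal{H} = \tilde{\mathcal{H}}, \qquad U f(N)\alpha = f(\cdot)\alpha, \quad \alpha \in M,
\]
for all $M$-valued bounded Borel functions $f$ on $\operatorname{sp}(\mathbb{N})$,
\[
(U S_j U^{-1} f)(u) = u_j f(u), \qquad (U S_j^* U^{-1} f)(u) = \bar u_j f(u) + (\Lambda_j^* - \bar u_j) f(\Lambda)
\]
for $j = 1, 2, \dots, k$ and $f \in \tilde{\mathcal{H}}$, where
\[
f(\Lambda) = \int e(du) f(u),
\]
and $\Lambda_j$ is defined as in (5).

Let $\tilde S_j = U S_j U^{-1}$. Then $\tilde{\mathbb{S}} \overset{\mathrm{def}}{=} (\tilde S_1, \dots, \tilde S_k)$ is said to be the analytic model for $\mathbb{S}$. From now on we only have to study the analytic model $\tilde{\mathbb{S}}$, and simply identify $\mathcal{H}$ with $\tilde{\mathcal{H}}$, $\mathbb{S}$ with $\tilde{\mathbb{S}}$, etc. In our calculation, we have to use several formulas in [15]. For any $l_1, \dots, l_n \in \{1, 2, \dots, k\}$, define an operator
\[
\mu_{l_i, \dots, l_m}(z_i, \dots, z_m) \overset{\mathrm{def}}{=} P_M (N_{l_i} - S_{l_i}P_{\mathcal{H}}) \prod_{j=i}^{m} (N_{l_j} - z_j)^{-1}\Big|_M
= \int_{\operatorname{sp}(\mathbb{N})} \frac{(u_{l_i} - \Lambda_{l_i})\, e(du)}{\prod_{j=i}^{m} (u_{l_j} - z_j)} \tag{8}
\]
on $M$, for $z_j \in \rho(N_{l_j})$, where $P_{\mathcal{H}}$ is the projection from $\mathcal{K}$ to $\mathcal{H}$. In [15], $\mu_{l_i, \dots, l_m}$ is denoted by $R_{l_i, \dots, l_m}(z_i, \dots, z_m)$. Later in §3 we will sometimes denote $\mu_{l_i, \dots, l_n}$ by $\hat\mu_{\{l_i, \dots, l_n\}}$. Let $\mu_{\{l_1, \dots, l_n\}}(z_1, \dots, z_n)$ be the matrix $(a_{ij})_{i,j=1,\dots,n}$, where
\[
a_{ij} = \begin{cases} 0 & \text{if } i > j,\\ \mu_{l_i, \dots, l_j} & \text{if } i \le j. \end{cases}
\]
Thus $\mu_{\{l_1, \dots, l_n\}}$ is an $L(M^n)$-valued holomorphic function on
\[
\rho(l_1, \dots, l_n) \overset{\mathrm{def}}{=} \rho(N_{l_1}) \times \cdots \times \rho(N_{l_n}).
\]
It is called a mosaic. In [15], it is proved that $\mu_{\{l_1, \dots, l_n\}}$ is idempotent, i.e.,
\[
\mu_{\{l_1, \dots, l_n\}}^2 = \mu_{\{l_1, \dots, l_n\}}. \tag{9}
\]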

Let us define a kind of "conjugate" of $\mu_{\{l_1, \dots, l_n\}}$ as $\mu^{\dagger}_{\{l_1, \dots, l_n\}}$, which is a matrix $(b_{ij})_{i,j=1,2,\dots,n}$ where
\[
b_{ij} = \begin{cases}
I - \mu_{l_{n-i+1}}(z_{n-i+1})^* & \text{if } i = j,\\
0 & \text{if } i > j,\\
-\mu_{l_{j'}, l_{j'+1}, \dots, l_{i'}}(z_{j'}, z_{j'+1}, \dots, z_{i'})^* & \text{if } j > i,
\end{cases}
\]
where $i' = n + 1 - i$ and $j' = n + 1 - j$. This $\mu^{\dagger}_{\{l_1, \dots, l_n\}}$ is a little bit different from the $\mu^{\dagger}_{\{l_1, \dots, l_n\}}$ in [15], but it is only a kind of rearrangement of entries. The function $\mu^{\dagger}_{\{l_1, \dots, l_n\}}(z_1, \dots, z_n)^*$ is also holomorphic on $\rho(l_1, \dots, l_n)$ and it is idempotent,
\[
\mu^{\dagger\,2}_{\{l_1, \dots, l_n\}} = \mu^{\dagger}_{\{l_1, \dots, l_n\}}. \tag{10}
\]

The author has written a monograph, "The Analytic Theory of Subnormal Operators", and submitted it for publication; it contains all of the results in [15] with notation that coincides with that of this paper. Let us denote the operator
\[
P_M \prod_{i=1}^{m} (N_{p_i}^* - \bar w_i)^{-1} \prod_{j=1}^{n} (N_{q_j} - z_j)^{-1}\Big|_M \quad\text{in } L(M)
\]
by $S_{p_1, \dots, p_m; q_1, \dots, q_n}(z_1, \dots, z_n; w_1, \dots, w_m)$; then for fixed $w_1, \dots, w_m$, $S_{p_1, \dots, p_m; q_1, \dots, q_n}$ is holomorphic for $(z_1, \dots, z_n) \in \rho(l_1, \dots, l_n)$ and the kernel $S_{p_1, \dots, p_m; q_1, \dots, q_n}$ is hermitian, i.e.,
\[
S_{p_1, \dots, p_m; q_1, \dots, q_n}(z_1, \dots, z_n; w_1, \dots, w_m)^* = S_{q_1, \dots, q_n; p_1, \dots, p_m}(w_1, \dots, w_m; z_1, \dots, z_n).
\]
It is easy to see that
\[
S_{p_1, \dots, p_m; q_1, \dots, q_n} = \int \frac{e(du)}{\prod_{i=1}^{m} (\bar u_{p_i} - \bar w_i) \prod_{j=1}^{n} (u_{q_j} - z_j)}. \tag{11}
\]

Let us denote an ordered integer set $\{p_1, \dots, p_i\}$ by $P_i$ and $\{q_1, \dots, q_i\}$ by $Q_i$. For example $S_{P_i, Q_j}$ means $S_{p_1, \dots, p_i; q_1, \dots, q_j}$. For any two finite tuples of integers, $p_i$, $i = 1, 2, \dots, m$ and $q_j$, $j = 1, 2, \dots, n$, which satisfy $1 \le p_i, q_j \le k$, define the operator matrix $\mathfrak{S}_{P_m, Q_n}$, which means $\mathfrak{S}_{p_1, \dots, p_m; q_1, \dots, q_n}$, as
\[
\mathfrak{S}_{P_m, Q_n} = \begin{pmatrix}
S_{P_m, Q_1} & S_{P_m, Q_2} & \cdots & S_{P_m, Q_n}\\
S_{P_{m-1}, Q_1} & S_{P_{m-1}, Q_2} & \cdots & S_{P_{m-1}, Q_n}\\
\vdots & \vdots & & \vdots\\
S_{P_2, Q_1} & S_{P_2, Q_2} & \cdots & S_{P_2, Q_n}\\
S_{P_1, Q_1} & S_{P_1, Q_2} & \cdots & S_{P_1, Q_n}
\end{pmatrix}. \tag{12}
\]
This matrix in (12) is a little bit different from the matrix $B$ defined on p. 630 of [15]. We form an $(m + n) \times (m + n)$ matrix by block matrices as
\[
\mathfrak{I}_{P_m, Q_n} \overset{\mathrm{def}}{=} \begin{pmatrix} \mu^{\dagger}_{P_m} & \mathfrak{S}_{P_m, Q_n}\\ 0 & \mu_{Q_n} \end{pmatrix}, \tag{13}
\]
where 0 in (13) is an $n \times m$ matrix with all entries zero. Similar to the proof of Theorem 5 in [15], we may prove that $\mathfrak{I}_{P_m, Q_n}$ is idempotent, i.e.,
\[
\mathfrak{I}_{P_m, Q_n}^2 = \mathfrak{I}_{P_m, Q_n}. \tag{14}
\]
Actually here $\mathfrak{I}_{P_m, Q_n}$ is almost the matrix $S_{lm}$ in (59) of [15]. From (14), comparing the upper right blocks, we have
\[
\mathfrak{S}_{P_m, Q_n} = \mu^{\dagger}_{P_m}\mathfrak{S}_{P_m, Q_n} + \mathfrak{S}_{P_m, Q_n}\mu_{Q_n}. \tag{15}
\]
Define
\[
Q_{ml}(z, w) = (\Lambda_m^* - \bar w)(\Lambda_l - z) - C_{ml}. \tag{16}
\]
In the case of $P_1 = 1$, $Q_1 = 1$, (15) becomes
\[
S_{1;1}(z, w) = (I - \mu_1(w)^*)S_{1;1}(z, w) + S_{1;1}(z, w)\mu_1(z). \tag{17}
\]

Let us review the single subnormal operator $S$ case: $S_1 = S$ and
\[
S_{1;1}(z, w) = \int \frac{e(du)}{(\bar u - \bar w)(u - z)},
\]
where $u = u_1$. Then from [13], as shown in §1, we have
\[
S_{1;1}(z, w) = (I - \mu_1(w)^*)Q_{11}(z, w)^{-1} - Q_{11}(z, w)^{-1}\mu_1(z).
\]
Comparing this with (17) suggests that in the right-hand side of (15), the matrix $\mathfrak{S}_{P_m, Q_n}$ may be replaced by some rational functions of $\Lambda_i$, $\Lambda_j^*$ and $C_{ij}$. That is the origin of this paper. In [15] and [19], we introduced the rational function
\[
R_{ml}(z) \overset{\mathrm{def}}{=} C_{ml}(z - \Lambda_l)^{-1} + \Lambda_m^*, \qquad z \in \rho(\Lambda_l).
\]
In [19], we have proved that
\[
[R_{m_1 l}(z), R_{m_2 l}(z)] = 0, \tag{18}
\]
where $[A, B] = AB - BA$. In the case of $z_j \in \rho(S_j)$, see Theorem 2 of this paper. Besides, in [15], we introduced some $n \times m$ matrices $R_{m, l_1, \dots, l_n}(z_1, \dots, z_n) = (a_{ij})$, which also can be denoted by $R_{m, L_n}$, where $L_n$ stands for the tuple of integers $l_1, \dots, l_n$ satisfying $1 \le l_j \le k$. The matrix $C_{L_p}$ in [15] actually is $-R_{p, L}$ here. In the matrix $(a_{ij})$,
\[
a_{ij} = \begin{cases}
-C_{m l_i} \prod_{p=i}^{j} (\Lambda_{l_p} - z_p)^{-1} & \text{if } i < j,\\
R_{m l_i}(z_i) & \text{if } i = j,\\
0 & \text{if } i > j.
\end{cases} \tag{19}
\]


It is easy to see that $R_{m, L_n} - w$ is invertible if and only if $R_{m l_j} - w$, $j = 1, 2, \dots, n$, are invertible. We also proved in [15] that
\[
[\mu_{L_n}, R_{m, L_n}] = 0. \tag{20}
\]
In §4, we will prove that for $z = (z_1, \dots, z_n)$, $z_j \in \rho(S_{l_j})$,
\[
[R_{m, L_n}(z_1, \dots, z_n), R_{m', L_n}(z_1, \dots, z_n)] = 0 \tag{21}
\]
for any $1 \le m, m' \le k$. It is still open whether (21) is true if it is only assumed that $z_j \in \rho(\Lambda_{l_j})$ and $n > 1$.

3. Calculation of $\mathfrak{S}_{P_m, Q_n}$

For $w \in \rho(\Lambda_m)$, if $(R_{q_j m}(w)^* - z_j)$ is invertible, $j = 1, 2, \dots, n$, define an operator
\[
X_{m, Q_n} \overset{\mathrm{def}}{=} X_{m; q_1, \dots, q_n}(z_1, \dots, z_n; w) \overset{\mathrm{def}}{=} \Bigl( \prod_{j=1}^{n} (R_{q_j m}(w)^* - z_j)^{-1} \Bigr)(\Lambda_m^* - \bar w)^{-1} \tag{22}
\]
on $M$, where $Q_n = \{q_1, \dots, q_n\}$. By (18) the product $\prod_{j=1}^{n}$ in the right-hand side of (22) does not depend on the order of the factors.

For $P_j = \{p_1, \dots, p_j\}$, $Q_j = \{q_1, \dots, q_j\}$, if $z_i, w_i \in \rho(\Lambda_i)$ and $Q_{ij}(z_j, w_i)$ is invertible, define operators $X_{P_m, Q_1}, X_{P_m, Q_2}, \dots, X_{P_m, Q_n}$ in $L(M)$ by the formula
\[
\begin{pmatrix} X_{P_m, Q_1} & X_{P_m, Q_2} & \cdots & X_{P_m, Q_n} \end{pmatrix}
= \begin{pmatrix} X_{P_1, Q_1} & X_{P_1, Q_2} & \cdots & X_{P_1, Q_n} \end{pmatrix}(R_{p_2, Q_n} - \bar w_2)^{-1} \cdots (R_{p_m, Q_n} - \bar w_m)^{-1}, \tag{23}
\]
where $X_{P_1, Q_j}$ is $X_{p_1; q_1, \dots, q_j}$ defined in (22), for $m \ge 2$. Let us comment on the product in (23). Suppose
\[
(R_{p_2, Q_n} - \bar w_2)^{-1} \cdots (R_{p_m, Q_n} - \bar w_m)^{-1} = \begin{pmatrix}
B_{11} & B_{12} & B_{13} & \cdots & B_{1n}\\
0 & B_{22} & B_{23} & \cdots & B_{2n}\\
0 & 0 & B_{33} & \cdots & B_{3n}\\
\vdots & \vdots & \vdots & & \vdots\\
0 & 0 & 0 & \cdots & B_{nn}
\end{pmatrix}.
\]
Then (23) means
\[
X_{P_m, Q_j} = \sum_{l=1}^{j} X_{P_1, Q_l} B_{lj}.
\]
The $X_{P_m, Q_n}$ stands for $X_{p_1, \dots, p_m; q_1, \dots, q_n}(z_1, \dots, z_n; w_1, \dots, w_m)$, etc.

Let $\mathfrak{X}_{P_m, Q_n} = \mathfrak{X}_{p_1, \dots, p_m; q_1, \dots, q_n}(z_1, \dots, z_n; w_1, \dots, w_m)$ be the matrix
\[
\begin{pmatrix}
X_{P_m, Q_1} & X_{P_m, Q_2} & \cdots & X_{P_m, Q_n}\\
X_{P_{m-1}, Q_1} & X_{P_{m-1}, Q_2} & \cdots & X_{P_{m-1}, Q_n}\\
\vdots & \vdots & & \vdots\\
X_{P_2, Q_1} & X_{P_2, Q_2} & \cdots & X_{P_2, Q_n}\\
X_{P_1, Q_1} & X_{P_1, Q_2} & \cdots & X_{P_1, Q_n}
\end{pmatrix}.
\]
Theorem 1. Let $\mathbb{S} = (S_1, \dots, S_k)$ be a pure subnormal $k$-tuple of operators on a separable Hilbert space $\mathcal{H}$ with minimal normal extension $\mathbb{N} = (N_1, \dots, N_k)$ on $\mathcal{K} \supset \mathcal{H}$. For integers $p_i$, $q_j$, $i = 1, 2, \dots, m$, $j = 1, 2, \dots, n$ satisfying $1 \le p_i, q_j \le k$, if $z_i \in \rho(\Lambda_{p_i}) \cap \rho(N_{p_i})$, $i = 1, 2, \dots, m$, and $w_j \in \rho(\Lambda_{q_j}) \cap \rho(N_{q_j})$, $j = 1, 2, \dots, n$, satisfy the condition that $Q_{p_j, q_i}(z_i, w_j)$ are invertible, $j = 1, 2, \dots, m$, $i = 1, 2, \dots, n$, then
\[
\mathfrak{S}_{P_m, Q_n} = \mu^{\dagger}_{P_m}\mathfrak{X}_{P_m, Q_n} - \mathfrak{X}_{P_m, Q_n}\mu_{Q_n}, \tag{24}
\]
where $\mathfrak{S}_{P_m, Q_n}$ stands for $\mathfrak{S}_{p_1, \dots, p_m; q_1, \dots, q_n}(z_1, \dots, z_n; w_1, \dots, w_m)$.

Proof. We will prove (24) by mathematical induction with respect to $m$ and $n$. First consider the case $m = n = 1$. If $m = n = 1$, then (24) is equivalent to the following:

Lemma 1. If $z \in \rho(\Lambda_q) \cap \rho(N_q)$, $w \in \rho(\Lambda_p) \cap \rho(N_p)$ and $Q_{pq}(z, w)$ is invertible, then
\[
S_{p;q}(z, w) = (I - \mu_p(w)^*)Q_{pq}(z, w)^{-1} - Q_{pq}(z, w)^{-1}\mu_q(z).
\]
Proof. The proof is similar to that for Lemma 6 in [14]. But in order to make this paper readable, we give the details. By (13) in [15], for any $p, q$, $1 \le p, q \le k$, we have
\[
Q_{pq}(u_q, u_p)e(du) = e(du)Q_{pq}(u_q, u_p) = 0, \tag{25}
\]

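where $u = (u_1, \dots, u_k)$. Therefore
\[
Q_{pq}(z, w)e(du) = \bigl( (\bar w - \bar u_p)(z - \Lambda_q) - (\bar u_p - \Lambda_p^*)(u_q - z) \bigr)e(du).
\]
Thus
\[
\begin{aligned}
Q_{pq}(z, w)S_{p;q}(z, w) &= \int \frac{Q_{pq}(z, w)e(du)}{(\bar u_p - \bar w)(u_q - z)}\\
&= -\int \frac{(z - u_q + u_q - \Lambda_q)}{u_q - z}\, e(du) - \int \frac{\bar u_p - \Lambda_p^*}{\bar u_p - \bar w}\, e(du)\\
&= I - \mu_q(z) - \int \frac{\bar u_p - \Lambda_p^*}{\bar u_p - \bar w}\, e(du).
\end{aligned}
\tag{26}
\]
Let us prove that
\[
\int \frac{\bar u_p - \Lambda_p^*}{\bar u_p - \bar w}\, e(du)\, Q_{pq}(z, w) = Q_{pq}(z, w)\mu_p^*. \tag{27}
\]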

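Firstly, we have
\[
\begin{aligned}
\int \frac{\bar u_p - \Lambda_p^*}{\bar u_p - \bar w}\, e(du)(\bar w - \Lambda_p^*)
&= (\bar w - \Lambda_p^*) + (\bar w - \Lambda_p^*)\int \frac{e(du)}{\bar u_p - \bar w}(\bar w - \Lambda_p^*)\\
&= (\bar w - \Lambda_p^*) + (\bar w - \Lambda_p^*)\int \frac{e(du)}{\bar u_p - \bar w}\bigl( \bar u_p - \Lambda_p^* - (\bar u_p - \bar w) \bigr)\\
&= (\bar w - \Lambda_p^*)\mu_p(w)^*.
\end{aligned}
\tag{28}
\]
Next, we have to prove that
\[
\int \frac{\bar u_p - \Lambda_p^*}{\bar u_p - \bar w}\, e(du)\bigl( (\bar w - \Lambda_p^*)\Lambda_q + C_{pq} \bigr) = \bigl( (\bar w - \Lambda_p^*)\Lambda_q + C_{pq} \bigr)\mu_p(w)^*. \tag{29}
\]
By (25), the left-hand side of (29) is equal to
\[
\begin{aligned}
&\int \frac{\bar u_p - \Lambda_p^*}{\bar u_p - \bar w}\, e(du)\bigl( (\bar w - \Lambda_p^*)\Lambda_q + (\bar u_p - \Lambda_p^*)(u_q - \Lambda_q) \bigr)\\
&\quad = \int \frac{\bar u_p - \Lambda_p^*}{\bar u_p - \bar w}\, e(du)\bigl( (\bar w - \bar u_p)\Lambda_q + (\bar u_p - \Lambda_p^*)u_q \bigr)\\
&\quad = \int (\bar u_p - \Lambda_p^*)u_q\, e(du)\, \frac{\bar u_p - \Lambda_p^*}{\bar u_p - \bar w},
\end{aligned}
\tag{30}
\]
since $\int (\bar u_p - \Lambda_p^*)e(du) = 0$. By (25) again, we have
\[
(\bar u_p - \Lambda_p^*)u_q\, e(du) = \bigl( C_{pq} + (\bar u_p - \Lambda_p^*)\Lambda_q \bigr)e(du).
\]
Thus the right-hand side of (30) is equal to
\[
\begin{aligned}
&C_{pq}\mu_p(w)^* + \int (\bar u_p - \Lambda_p^*)\Lambda_q\, e(du)(\bar u_p - \Lambda_p^*)(\bar u_p - \bar w)^{-1}\\
&\quad = C_{pq}\mu_p(w)^* + (\bar w - \Lambda_p^*)\Lambda_q\mu_p(w)^* + \int (\bar u_p - \bar w)\Lambda_q\, e(du)(\bar u_p - \Lambda_p^*)(\bar u_p - \bar w)^{-1}.
\end{aligned}
\tag{31}
\]
However, the third term in the right-hand side of (31) is zero, which proves (29). From (28) and (29), we get (27). From (26) and (27), we get the lemma. □

In the case $m = 1$, Theorem 1 is equivalent to the following:

Lemma 2. If $w \in \rho(\Lambda_p) \cap \rho(N_p)$, $z_j \in \rho(\Lambda_{q_j}) \cap \rho(N_{q_j})$, $1 \le p, q_j \le k$ and $Q_{pq_j}(z_j, w)$, $j = 1, 2, \dots, n$, are invertible, then
\[
S_{p, Q_n} = (I - \mu_p^*)X_{p, Q_n} - \sum_{j=1}^{n} X_{p, Q_j}\mu_{q_j, \dots, q_n}, \tag{32}
\]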

where $Q_j = \{q_1, \dots, q_j\}$.

Proof. In (32), $S_{p, Q_n}$ means $S_{\{p\}, Q_n}$ or $S_{p; q_1, \dots, q_n}$, and $X_{p, Q_n}$ means $X_{\{p\}, Q_n}$. Let us prove it by mathematical induction with respect to the number of $q$'s. For the case that there is only one $q$, say $q_1$, (32) is equivalent to Lemma 1. Assume that (32) holds for $q_2, \dots, q_n$ (there are $n - 1$ $q$'s), i.e.,
\[
S_{p; q_2, \dots, q_n} = (I - \mu_p^*)X_{p; q_2, \dots, q_n} - \sum_{j=2}^{n} X_{p; q_2, \dots, q_j}\mu_{q_j, \dots, q_n}. \tag{33}
\]
We have to prove that (32) holds for $Q_n = \{q_1, q_2, \dots, q_n\}$. By (25) again, we have
\[
Q_{pq_1}e(du) = \bigl( (\Lambda_p^* - \bar w)(u_{q_1} - z_1) - (\bar u_p - \bar w)(u_{q_1} - \Lambda_{q_1}) \bigr)e(du).
\]
Therefore
\[
Q_{pq_1}S_{p, Q_n} = \int \frac{Q_{pq_1}(z, w)e(du)}{(\bar u_p - \bar w)\prod_{j=1}^{n} (u_{q_j} - z_j)} = (\Lambda_p^* - \bar w)S_{p; q_2, \dots, q_n} - \mu_{q_1, \dots, q_n}. \tag{34}
\]
From $Q_{pq}(z, w)^{-1}(\Lambda_p^* - \bar w) = (R_{qp}(w)^* - z)^{-1}$, $[(R_{qp}(w)^* - z)^{-1}, I - \mu_p(w)^*] = 0$ (see (20)), (33) and (34), it follows that
\[
\begin{aligned}
S_{p, Q_n} &= (R_{q_1 p}(w)^* - z_1)^{-1}S_{p; q_2, \dots, q_n} - Q_{pq_1}^{-1}\mu_{q_1, \dots, q_n}\\
&= (I - \mu_p(w)^*)(R_{q_1 p}(w)^* - z_1)^{-1}X_{p; q_2, \dots, q_n}\\
&\quad - \sum_{j=2}^{n} (R_{q_1 p}(w)^* - z_1)^{-1}X_{p; q_2, \dots, q_j}\mu_{q_j, \dots, q_n} - Q_{pq_1}^{-1}\mu_{q_1, \dots, q_n}.
\end{aligned}
\tag{35}
\]
By (22), we have $X_{p; q_1} = Q_{pq_1}^{-1}$ and
\[
X_{p; q_1, \dots, q_j} = (R_{q_1 p}(w)^* - z_1)^{-1}X_{p; q_2, \dots, q_j}.
\]
Thus (35) implies (32), which proves the lemma. □
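The rearrangements of $Q_{pq}(z,w)e(du)$ used in the proofs of Lemmas 1 and 2 are purely algebraic once (25) is used to substitute for $C_{pq}$: only scalars are moved past operators. The following sketch checks both rearrangements on arbitrary $2 \times 2$ complex matrices standing in for $\Lambda_p^*$ and $\Lambda_q$; the helper names and the sample values are assumptions made for illustration.

```python
def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def scalar(c):
    """The scalar operator c*I."""
    return [[c, 0], [0, c]]


def q_difference(A, B, z, w, uq, up):
    """Q(z,w) - Q(uq,up) = (A - conj(w))(B - z) - (A - conj(up))(B - uq),
    with A standing for Lambda_p^* and B for Lambda_q; C_pq cancels."""
    wb, ub = w.conjugate(), up.conjugate()
    first = mat_mul(mat_sub(A, scalar(wb)), mat_sub(B, scalar(z)))
    second = mat_mul(mat_sub(A, scalar(ub)), mat_sub(B, scalar(uq)))
    return mat_sub(first, second)


def lemma1_form(A, B, z, w, uq, up):
    """(conj(w) - conj(up))(z - B) - (conj(up) - A)(uq - z)."""
    wb, ub = w.conjugate(), up.conjugate()
    return mat_sub(mat_scale(wb - ub, mat_sub(scalar(z), B)),
                   mat_scale(uq - z, mat_sub(scalar(ub), A)))


def lemma2_form(A, B, z, w, uq, up):
    """(A - conj(w))(uq - z) - (conj(up) - conj(w))(uq - B)."""
    wb, ub = w.conjugate(), up.conjugate()
    return mat_sub(mat_scale(uq - z, mat_sub(A, scalar(wb))),
                   mat_scale(ub - wb, mat_sub(scalar(uq), B)))


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


# Arbitrary sample data; A and B need not commute.
A = [[1 + 2j, 0.5], [-1j, 3]]
B = [[0, 1], [2 - 1j, -0.5j]]
args = (A, B, 0.3 + 0.1j, -1.2 + 0.7j, 0.8 - 0.4j, 0.2 + 0.9j)
gap1 = max_abs_diff(q_difference(*args), lemma1_form(*args))
gap2 = max_abs_diff(q_difference(*args), lemma2_form(*args))
```

Both gaps vanish up to rounding, which is consistent with the two displayed expressions for $Q_{pq}(z,w)e(du)$ being rearrangements of the same quantity.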

In the case of $n = 1$, Theorem 1 is equivalent to the following:

Lemma 3. If $w_j \in \rho(\Lambda_{p_j}) \cap \rho(N_{p_j})$, $z \in \rho(\Lambda_q) \cap \rho(N_q)$, $1 \le p_j, q \le k$ and $Q_{p_j q}(z, w_j)$, $j = 1, 2, \dots, m$, are invertible, then
\[
S_{P_m; q}(z; w_1, \dots, w_m) = -\sum_{j=1}^{m} \mu^*_{p_j, \dots, p_m}X_{P_j; q} + (I - \mu_q)X_{P_m; q}, \tag{36}
\]
where $P_j = \{p_1, \dots, p_j\}$.

Proof. The proof of this lemma is similar to the proof of Lemma 2. For the case $m = 1$, (36) is just Lemma 1. Assume that (36) holds for $p_2, \dots, p_m$, i.e.,
\[
S_{p_2, \dots, p_m; q}(z; w_2, \dots, w_m) = -\sum_{j=2}^{m} \mu^*_{p_j, \dots, p_m}X_{p_2, \dots, p_j; q} + (I - \mu_q)X_{p_2, \dots, p_m; q}. \tag{37}
\]
From
\[
e(du)Q_{p_1 q}(z, w_1) = e(du)\bigl( (\bar u_{p_1} - \bar w_1)(\Lambda_q - z) - (\bar u_{p_1} - \Lambda_{p_1}^*)(u_q - z) \bigr),
\]
it follows that
\[
S_{P_m, q}Q_{p_1 q} = \int \frac{e(du)\bigl( (\bar u_{p_1} - \bar w_1)(\Lambda_q - z) - (\bar u_{p_1} - \Lambda_{p_1}^*)(u_q - z) \bigr)}{(u_q - z)\prod_{j=1}^{m} (\bar u_{p_j} - \bar w_j)} = S_{p_2, \dots, p_m; q}(\Lambda_q - z) - \mu^*_{P_m}.
\]
From $(\Lambda_q - z)Q_{p_1 q}(z, w_1)^{-1} = (R_{p_1 q}(z) - \bar w_1)^{-1}$, (20) and (37), it follows that
\[
\begin{aligned}
S_{P_m, q} &= S_{p_2, \dots, p_m; q}(R_{p_1 q}(z) - \bar w_1)^{-1} - \mu^*_{P_m}Q_{p_1 q}(z, w_1)^{-1}\\
&= -\sum_{j=2}^{m} \mu^*_{p_j, \dots, p_m}X_{p_2, \dots, p_j; q}(R_{p_1 q}(z) - \bar w_1)^{-1}\\
&\quad + (I - \mu_q)X_{p_2, \dots, p_m; q}(R_{p_1 q}(z) - \bar w_1)^{-1} - \mu^*_{P_m}Q_{p_1 q}(z, w_1)^{-1}.
\end{aligned}
\tag{38}
\]
But from (23) it is easy to see that
\[
X_{p_2, \dots, p_j; q}(R_{p_1 q}(z) - \bar w_1)^{-1} = (\Lambda_q - z)^{-1}\prod_{i=1}^{j} (R_{p_i q}(z) - \bar w_i)^{-1} = X_{P_j, q}.
\]
Therefore (38) is equivalent to (36), which proves the lemma. □

Now let us continue to prove Theorem 1. It is easy to see that (24) is equivalent to the following:
$$S_{P_m,Q_n} = (I-\mu^{*}_{p_m})X_{P_m,Q_n} - \sum_{j=1}^{m-1}\mu^{*}_{P_{jm}}X_{P_j,Q_n} - \sum_{j=1}^{n}X_{P_m,Q_j}\,\mu_{\hat Q_{jn}}, \tag{39}$$
where $P_{jm}$ means $p_j,\dots,p_m$ and $\hat Q_{jn}$ means $q_j,\dots,q_n$, for any natural numbers $m$ and $n$. Lemma 2 shows that (39) holds for $m=1$ and any $n$. Assume that (39) holds for $m=v-1\ge 1$. Let us prove that (39) holds for $m=v$ and any $n$, by mathematical induction with respect to $n$. Lemma 3 shows that (39) is true for $n=1$. We assume that (39) is true for $m=v-1$, and for $m=v$ with $n$ replaced by $n-1$. Notice that
$$e(du)\,Q_{p_1q_n}(z_n,w_1) = e(du)\bigl((\bar u_{p_1}-\bar w_1)(\Lambda_{q_n}-z_n) - (\bar u_{p_1}-\bar w_1)(u_{q_n}-z_n) + (\Lambda^{*}_{p_1}-\bar w_1)(u_{q_n}-z_n)\bigr).$$
If $n>1$, then
$$\begin{aligned}
S_{P_v,Q_n}Q_{p_1q_n} &= \int \frac{e(du)\bigl((\bar u_{p_1}-\bar w_1)(\Lambda_{q_n}-z_n) - (\bar u_{p_1}-\bar w_1)(u_{q_n}-z_n) + (\Lambda^{*}_{p_1}-\bar w_1)(u_{q_n}-z_n)\bigr)}{\prod_{j=1}^{v}(\bar u_{p_j}-\bar w_j)\prod_{j=1}^{n}(u_{q_j}-z_j)}\\
&= S_{\hat P_v,Q_n}(\Lambda_{q_n}-z_n) - S_{\hat P_v,Q_{n-1}} + S_{P_v,Q_{n-1}}(\Lambda^{*}_{p_1}-\bar w_1),
\end{aligned}$$

Operator Identities for Subnormal Tuples of Operators


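where $\hat P_v = \{p_2,\dots,p_v\}$. Since there are only $v-1$ natural numbers in $\hat P_v$, we may apply (39) to $S_{\hat P_v,Q_n}$ and $S_{\hat P_v,Q_{n-1}}$. Besides, by the hypothesis of mathematical induction with respect to $n$, we may also use the formula (39) for $S_{P_v,Q_{n-1}}$. Thus $S_{P_v,Q_n} = (I_1+I_2+I_3)\,Q^{-1}_{p_1q_n}$, where
$$\begin{aligned}
I_1 &= \Bigl((I-\mu^{*}_{p_v})X_{\hat P_v,Q_n} - \sum_{j=2}^{v-1}\mu^{*}_{P_{jv}}X_{\hat P_j,Q_n} - \sum_{j=1}^{n}X_{\hat P_v,Q_j}\,\mu_{\hat Q_{jn}}\Bigr)(\Lambda_{q_n}-z_n),\\
I_2 &= -\Bigl((I-\mu^{*}_{p_v})X_{\hat P_v,Q_{n-1}} - \sum_{j=2}^{v-1}\mu^{*}_{P_{jv}}X_{\hat P_j,Q_{n-1}} - \sum_{j=1}^{n-1}X_{\hat P_v,Q_j}\,\mu_{\hat Q_{j(n-1)}}\Bigr),\\
I_3 &= \Bigl((I-\mu^{*}_{p_v})X_{P_v,Q_{n-1}} - \sum_{j=1}^{v-1}\mu^{*}_{P_{jv}}X_{P_j,Q_{n-1}} - \sum_{j=1}^{n-1}X_{\hat P_v,Q_j}\,\mu_{\hat Q_{j(n-1)}}\Bigr)(\Lambda^{*}_{p_1}-\bar w_1).
\end{aligned}\tag{40}$$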

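Let us rearrange the terms in the summation of (40). Then
$$S_{P_v,Q_n} = J_1 + J_2, \tag{41}$$
where
$$J_1 = (I-\mu^{*}_{p_v})\tilde X_{P_v,Q_n} - \sum_{j=2}^{v-1}\mu^{*}_{P_{jv}}\tilde X_{P_j,Q_n} - \mu^{*}_{P_{1v}}\tilde X_{P_1,Q_n}, \tag{42}$$
where
$$\tilde X_{P_j,Q_n} = \bigl(X_{\hat P_j,Q_n}(\Lambda_{q_n}-z_n) - X_{\hat P_j,Q_{n-1}} + X_{P_j,Q_{n-1}}(\Lambda^{*}_{p_1}-\bar w_1)\bigr)Q^{-1}_{p_1q_n} \tag{43}$$
for $n>1$ and $\tilde X_{P_j,Q_1} = X_{\hat P_j,Q_1}(R_{p_1q_1}(z_1)-\bar w_1)^{-1}$; besides,
$$\begin{aligned}
J_2 = {}& -\sum_{j=1}^{n}X_{\hat P_v,Q_j}\,\mu_{\hat Q_{jn}}(R_{p_1q_n}(z_n)-\bar w_1)^{-1} + \sum_{j=1}^{n-1}X_{\hat P_v,Q_j}\,\mu_{\hat Q_{j(n-1)}}Q^{-1}_{p_1q_n}\\
& -\sum_{j=1}^{n-1}X_{\hat P_v,Q_j}\,\mu_{\hat Q_{j(n-1)}}(\Lambda^{*}_{p_1}-\bar w_1)Q^{-1}_{p_1q_n},
\end{aligned}\tag{44}$$
since $(\Lambda_{q_n}-z_n)Q^{-1}_{p_1q_n} = (R_{p_1q_n}(z_n)-\bar w_1)^{-1}$, and $X_{P_1,Q_{n-1}}(\Lambda^{*}_{p_1}-\bar w_1)Q^{-1}_{p_1q_n} = X_{P_1,Q_n}$ by (22). Now, let us prove that
$$\tilde X_{P_j,Q_n} = X_{P_j,Q_n}. \tag{45}$$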


According to (23), we only have to prove that
$$\bigl(\tilde X_{P_j,Q_1}\ \tilde X_{P_j,Q_2}\ \cdots\ \tilde X_{P_j,Q_n}\bigr)(R_{p_1,Q_n}-\bar w_1) = \bigl(X_{\hat P_j,Q_1}\ X_{\hat P_j,Q_2}\ \cdots\ X_{\hat P_j,Q_n}\bigr), \tag{46}$$
by mathematical induction with respect to $n$, where $j\ge 2$. From (23), (46) holds for $n=1$ and $\tilde X_{P_j;Q_1}$ can be replaced by $X_{P_j;Q_1}$. Assume that (46) holds while $n$ is replaced by $n-1\ge 1$. According to the definition of the $X$'s in (23), $\tilde X_{P_j,Q_i} = X_{P_j,Q_i}$ for $i=1,2,\dots,n-1$. In order to prove that (46) holds good for $n$, we only have to prove that $L=0$, where
$$L \overset{\mathrm{def}}{=} -\sum_{i=1}^{n-1}X_{P_j,Q_i}C_{p_1q_i}\prod_{l=i}^{n}(\Lambda_{q_l}-z_l)^{-1} + \tilde X_{P_j,Q_n}(R_{p_1q_n}(z_n)-\bar w_1) - X_{\hat P_j,Q_n}.$$

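By the hypothesis of mathematical induction,
$$\begin{aligned}
-\sum_{i=1}^{n-1}X_{P_j,Q_i}C_{p_1q_i}\prod_{l=i}^{n}(\Lambda_{q_l}-z_l)^{-1}
&= -\Bigl(\sum_{i=1}^{n-1}X_{P_j,Q_i}C_{p_1q_i}\prod_{l=i}^{n-1}(\Lambda_{q_l}-z_l)^{-1}\Bigr)(\Lambda_{q_n}-z_n)^{-1}\\
&= \Bigl[X_{\hat P_j,Q_{n-1}} - X_{P_j,Q_{n-1}}(R_{p_1q_{n-1}}(z_{n-1})-\bar w_1)\\
&\qquad - X_{P_j,Q_{n-1}}C_{p_1q_{n-1}}(\Lambda_{q_{n-1}}-z_{n-1})^{-1}\Bigr](\Lambda_{q_n}-z_n)^{-1}\\
&= \bigl(X_{\hat P_j,Q_{n-1}} - X_{P_j,Q_{n-1}}(\Lambda^{*}_{p_1}-\bar w_1)\bigr)(\Lambda_{q_n}-z_n)^{-1},
\end{aligned}$$
since
$$R_{p_1q_{n-1}}(z_{n-1})-\bar w_1 = -C_{p_1q_{n-1}}(\Lambda_{q_{n-1}}-z_{n-1})^{-1} + \Lambda^{*}_{p_1}-\bar w_1.$$
Thus, using $(\Lambda_{q_n}-z_n)^{-1} = Q^{-1}_{p_1q_n}(R_{p_1q_n}(z_n)-\bar w_1)$,
$$L = \bigl\{\tilde X_{P_j,Q_n} - X_{\hat P_j,Q_n}(R_{p_1q_n}(z_n)-\bar w_1)^{-1} + X_{\hat P_j,Q_{n-1}}Q^{-1}_{p_1q_n} - X_{P_j,Q_{n-1}}(\Lambda^{*}_{p_1}-\bar w_1)Q^{-1}_{p_1q_n}\bigr\}(R_{p_1q_n}(z_n)-\bar w_1),$$
which equals zero by (43). Therefore (46) is proved, and so is (45). From (42) and (45), we have
$$J_1 = (I-\mu^{*}_{p_v})X_{P_v,Q_n} - \sum_{j=2}^{v-1}\mu^{*}_{P_{jv}}X_{P_j,Q_n} - \mu^{*}_{P_{1v}}X_{P_1,Q_n}.$$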

Next, let us study $J_2$. From $L=0$ and (45), we have
$$X_{\hat P_v,Q_j} = -\sum_{s=1}^{j-1}X_{P_v,Q_s}C_{p_1q_s}\prod_{i=s}^{j}(\Lambda_{q_i}-z_i)^{-1} + X_{P_v,Q_j}(R_{p_1q_j}(z_j)-\bar w_1). \tag{47}$$


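Thus
$$\begin{aligned}
J_2 = {}& -\sum_{j=1}^{n}\Bigl[-\sum_{s=1}^{j-1}X_{P_v,Q_s}C_{p_1q_s}\prod_{i=s}^{j}(\Lambda_{q_i}-z_i)^{-1} + X_{P_v,Q_j}(R_{p_1q_j}(z_j)-\bar w_1)\Bigr]\mu_{\hat Q_{jn}}(R_{p_1q_n}(z_n)-\bar w_1)^{-1}\\
& +\sum_{j=1}^{n-1}\Bigl(-\sum_{s=1}^{j-1}X_{P_v,Q_s}C_{p_1q_s}\prod_{i=s}^{j}(\Lambda_{q_i}-z_i)^{-1} + X_{P_v,Q_j}(R_{p_1q_j}(z_j)-\bar w_1)\Bigr)\mu_{\hat Q_{j(n-1)}}Q^{-1}_{p_1q_n}\\
& -\sum_{j=1}^{n-1}X_{\hat P_v,Q_j}\,\mu_{\hat Q_{j(n-1)}}(\Lambda^{*}_{p_1}-\bar w_1)Q^{-1}_{p_1q_n}.
\end{aligned}\tag{48}$$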

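For $n-1\ge s\ge 1$, let us group all the terms with coefficient $X_{P_v,Q_s}$ in the right-hand side of (48). That is $X_{P_v,Q_s}(K_1+K_2+K_3)$, where
$$K_1 = \Bigl(\sum_{j=s+1}^{n}C_{p_1q_s}\prod_{i=s}^{j}(\Lambda_{q_i}-z_i)^{-1}\mu_{\hat Q_{jn}} - (R_{p_1q_s}(z_s)-\bar w_1)\mu_{\hat Q_{sn}}\Bigr)(R_{p_1q_n}(z_n)-\bar w_1)^{-1}, \tag{49}$$
$$K_2 = \Bigl(-\sum_{j=s+1}^{n-1}C_{p_1q_s}\prod_{i=s}^{j}(\Lambda_{q_i}-z_i)^{-1}\mu_{\hat Q_{j(n-1)}} + (R_{p_1q_s}(z_s)-\bar w_1)\mu_{\hat Q_{s(n-1)}}\Bigr)Q^{-1}_{p_1q_n},$$
and
$$K_3 = -\mu_{\hat Q_{s(n-1)}}(\Lambda^{*}_{p_1}-\bar w_1)Q^{-1}_{p_1q_n}.$$
By (20), we have
$$K_1 = \Bigl(\sum_{j=s}^{n-1}\mu_{\hat Q_{sj}}C_{p_1q_j}\prod_{i=j}^{n}(\Lambda_{q_i}-z_i)^{-1} - \mu_{\hat Q_{sn}}(R_{p_1q_n}(z_n)-\bar w_1)\Bigr)(R_{p_1q_n}(z_n)-\bar w_1)^{-1}$$
and
$$K_2 = \Bigl(-\sum_{j=s}^{n-2}\mu_{\hat Q_{sj}}C_{p_1q_j}\prod_{i=j}^{n-1}(\Lambda_{q_i}-z_i)^{-1} + \mu_{\hat Q_{s(n-1)}}(R_{p_1q_{n-1}}(z_{n-1})-\bar w_1)\Bigr)Q^{-1}_{p_1q_n}.$$
Notice that
$$Q^{-1}_{p_1q_n} = (\Lambda_{q_n}-z_n)^{-1}(R_{p_1q_n}(z_n)-\bar w_1)^{-1}$$
and
$$R_{p_1q_{n-1}}(z_{n-1})-\bar w_1 = -C_{p_1q_{n-1}}(\Lambda_{q_{n-1}}-z_{n-1})^{-1} + (\Lambda^{*}_{p_1}-\bar w_1);$$
we then have
$$K_1+K_2+K_3+\hat\mu_{\hat Q_{sn}} = \hat\mu_{\hat Q_{s(n-1)}}\bigl(C_{p_1q_{n-1}}(\Lambda_{q_{n-1}}-z_{n-1})^{-1} - \Lambda^{*}_{p_1} + \bar w_1 + R_{p_1q_{n-1}}(z_{n-1})-\bar w_1\bigr)Q^{-1}_{p_1q_n} = 0. \tag{50}$$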


From (48), (49) and (50), it follows that
$$J_2 = -\sum_{s=1}^{n}X_{P_v,Q_s}\,\hat\mu_{\hat Q_{sn}}. \tag{51}$$
From (41), (47) and (51), we get (39), which proves the theorem.

□

4. Some operator identities for a commuting 𝒌-tuple of operators

Let $\mathbb T = (T_1,\dots,T_k)$ be a commuting $k$-tuple of operators, i.e., $[T_i,T_j]=0$ for $i,j=1,2,\dots,k$, on a Hilbert space $\mathcal H$. Let
$$M \overset{\mathrm{def}}{=} M_{\mathbb T} = \text{closure of } \bigvee\{[T_i^{*},T_j]\mathcal H : i,j=1,2,\dots,k\}$$
be the defect space of $\mathbb T$. The general case differs from the subnormal case: $M$ may not be an invariant subspace of $T_i^{*}$, $i=1,2,\dots,k$. Define
$$\mathcal K = \text{closure of } \bigvee\{T_1^{*m_1}T_2^{*m_2}\cdots T_k^{*m_k}M_{\mathbb T} : m_1,\dots,m_k = 0,1,2,\dots\}. \tag{52}$$
Then $\mathcal K$ is invariant with respect to $T_i^{*}$ and $[T_i^{*},T_j]$, $i,j=1,2,\dots,k$. Similar to (5), define
$$C_{ij} \overset{\mathrm{def}}{=} [T_i^{*},T_j]\big|_{\mathcal K} \quad\text{and}\quad \Lambda_i \overset{\mathrm{def}}{=} (T_i^{*}\big|_{\mathcal K})^{*}. \tag{53}$$

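If $\mathbb T$ is subnormal, then $\mathcal K = M$ and the operators $\Lambda_i$ and $C_{ij}$ defined in (53) coincide with those in (5), except for changing $S_j$ to $T_j$. For the study of $\mathcal K$, $C_{ij}$ and $\Lambda_i$, see [5], [6], [9], [16], [18], [19], [20]. We use the same definition of $Q_{ml}(z,w)$ as in (16) for the commuting operator tuple $\mathbb T$. Let $P_{\mathcal K}$ be the projection from $\mathcal H$ to $\mathcal K$.

Lemma 4. If $w\in\rho(\Lambda_l)\cap\rho(T_l)$ and $z\in\rho(\Lambda_m)\cap\rho(T_m)$, then $Q_{lm}(z,w)$ is invertible and
$$P_{\mathcal K}(T_l^{*}-\bar w)^{-1}(T_m-z)^{-1}\big|_{\mathcal K} = Q_{lm}(z,w)^{-1}. \tag{54}$$
In the case of $\mathbb T$ being subnormal, $\rho(\Lambda_m)\supset\rho(T_m)$ and this lemma is just the one in [15].

Proof. Let $A \overset{\mathrm{def}}{=} (T_l^{*}-\bar w)^{-1}$ and $B \overset{\mathrm{def}}{=} (T_m-z)^{-1}$; then from $[T_l^{*},T_m] = C_{lm}P_{\mathcal K}$ and
$$[A,B] = AB\big|_{\mathcal K}\,C_{lm}P_{\mathcal K}\,BA \tag{55}$$
we have
$$P_{\mathcal K}AB\big|_{\mathcal K} = P_{\mathcal K}AB\big|_{\mathcal K}\,C_{lm}P_{\mathcal K}BA\big|_{\mathcal K} + P_{\mathcal K}BA\big|_{\mathcal K}. \tag{56}$$
Notice that if $z\in\rho(\Lambda_m)\cap\rho(T_m)$, then $(\Lambda_m^{*}-\bar z)\mathcal K = \mathcal K$. Thus $(T_m^{*}-\bar z)\mathcal K = \mathcal K$. For $y\in\mathcal K$, let $u\in\mathcal K$ satisfy $y = (\Lambda_m^{*}-\bar z)u$. Then $(T_m^{*}-\bar z)u = y$ and
$$(T_m^{*}-\bar z)^{-1}y = (\Lambda_m^{*}-\bar z)^{-1}y = u. \tag{57}$$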


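Therefore, for any operator $Y\in L(\mathcal H)$ and $x,y\in\mathcal K$,
$$(BYx,y) = (Yx,(T_m^{*}-\bar z)^{-1}y) = (Yx,(\Lambda_m^{*}-\bar z)^{-1}y) = ((\Lambda_m-z)^{-1}P_{\mathcal K}Yx,y).$$
Thus
$$P_{\mathcal K}(T_m-z)^{-1}Y\big|_{\mathcal K} = (\Lambda_m-z)^{-1}P_{\mathcal K}Y\big|_{\mathcal K}. \tag{58}$$
From (57), we have
$$(T_l^{*}-\bar w)^{-1}\big|_{\mathcal K} = (\Lambda_l^{*}-\bar w)^{-1}. \tag{59}$$
From (56), (58) and (59), it follows that
$$P_{\mathcal K}AB\big|_{\mathcal K} = P_{\mathcal K}AB\big|_{\mathcal K}\,C_{lm}(\Lambda_m-z)^{-1}(\Lambda_l^{*}-\bar w)^{-1} + (\Lambda_m-z)^{-1}(\Lambda_l^{*}-\bar w)^{-1}.$$
Thus $P_{\mathcal K}AB\big|_{\mathcal K}\,Q_{lm}(z,w) = I\big|_{\mathcal K}$, where $I\big|_{\mathcal K}$ is the identity operator on $\mathcal K$. Similarly, from the commutator formula $[A,B] = BA\big|_{\mathcal K}\,C_{lm}P_{\mathcal K}\,AB$, we have $Q_{lm}(z,w)\,P_{\mathcal K}AB\big|_{\mathcal K} = I\big|_{\mathcal K}$. Therefore $Q_{lm}(z,w)$ is invertible and (54) holds good.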

□
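The commutator formula (55) rests on a purely algebraic fact: for resolvents $A = (T^{*}-\bar w)^{-1}$ and $B = (S-z)^{-1}$ one has $[A,B] = AB\,[T^{*},S]\,BA$, because $A^{-1}B^{-1}-B^{-1}A^{-1} = [T^{*},S]$. The following is a minimal numerical sanity check of that identity only, on hypothetical $2\times 2$ matrices chosen for illustration; it does not model the subspace $\mathcal K$, the projection $P_{\mathcal K}$, or the factorization $[T_l^{*},T_m] = C_{lm}P_{\mathcal K}$.

```python
# Pure-Python 2x2 complex matrices stored as nested lists.
# The matrices T, S and the points z, w below are hypothetical test data.

def mul(X, Y):
    """Matrix product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(X, Y):
    """Entrywise difference X - Y."""
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def shift(X, c):
    """Return X - c*I."""
    return [[X[i][j] - (c if i == j else 0) for j in range(2)] for i in range(2)]

def adj(X):
    """Conjugate transpose (Hilbert space adjoint of a matrix)."""
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def inv(X):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    return sub(mul(X, Y), mul(Y, X))

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a tolerance."""
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

T = [[1 + 2j, 0.5], [0.0, -1j]]
S = [[2.0, 1j], [0.3, 1.0]]
z, w = 5 + 1j, -4 + 2j          # chosen away from the spectra of S and T

A = inv(shift(adj(T), w.conjugate()))   # A = (T* - conj(w))^{-1}
B = inv(shift(S, z))                    # B = (S - z)^{-1}

lhs = comm(A, B)                                        # [A, B]
rhs = mul(mul(mul(A, B), comm(adj(T), S)), mul(B, A))   # A B [T*, S] B A
identity_holds = close(lhs, rhs)
print(identity_holds)
```

The check is independent of whether $T$ and $S$ commute, which is why arbitrary matrices suffice here; commutativity of the tuple enters the paper only through the structure of $\mathcal K$.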

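Part of the following lemma has appeared in [20].

Lemma 5. If $z\in\rho(\Lambda_j)\cap\rho(T_j)$, then
$$[R_{pj}(z),R_{qj}(z)] = 0, \quad p,q=1,2,\dots,k. \tag{60}$$
Furthermore, if $w_n\in\rho(\Lambda_{m_n})\cap\rho(T_{m_n})$, $1\le m_n\le k$, then
$$P_{\mathcal K}\prod_{n=1}^{l}(T_{m_n}^{*}-\bar w_n)^{-1}(T_j-z)^{-1}\big|_{\mathcal K} = (\Lambda_j-z)^{-1}\prod_{n=1}^{l}(R_{m_nj}(z)-\bar w_n)^{-1}, \tag{61}$$
and
$$\begin{aligned}
P_{\mathcal K}(T_j^{*}-\bar z)^{-1}\prod_{n=1}^{l}(T_{m_n}-w_n)^{-1}\big|_{\mathcal K}
&= \Bigl(\prod_{n=1}^{l}(R_{m_nj}(z)^{*}-w_n)^{-1}\Bigr)(\Lambda_j^{*}-\bar z)^{-1}\\
&= \prod_{n=1,\,n\ne t}^{l}\bigl[(\Lambda_{m_n}-w_n)^{-1}(R_{jm_n}(w_n)-\bar z)^{-1}C_{jm_n}(\Lambda_{m_n}-w_n)^{-1}\\
&\qquad\qquad + (\Lambda_{m_n}-w_n)^{-1}\bigr](\Lambda_{m_t}-w_t)^{-1}(R_{jm_t}(w_t)-\bar z)^{-1}.
\end{aligned}\tag{62}$$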


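Proof. We write $A_n = (T_{m_n}^{*}-\bar w_n)^{-1}$ and $B = (T_j-z)^{-1}$. By (55), we have
$$P_{\mathcal K}\Bigl(\prod_{n=1}^{l}A_n\Bigr)B\big|_{\mathcal K} = P_{\mathcal K}\Bigl(\prod_{n=1}^{l}A_n\Bigr)B\big|_{\mathcal K}\,C_{m_lj}P_{\mathcal K}BA_l\big|_{\mathcal K} + P_{\mathcal K}\Bigl(\prod_{n=1}^{l-1}A_n\Bigr)BA_l\big|_{\mathcal K}.$$
Thus, by (58) and (59),
$$P_{\mathcal K}\Bigl(\prod_{n=1}^{l}A_n\Bigr)B\big|_{\mathcal K}\bigl(I - C_{m_lj}(\Lambda_j-z)^{-1}(\Lambda_{m_l}^{*}-\bar w_l)^{-1}\bigr) = P_{\mathcal K}\Bigl(\prod_{n=1}^{l-1}A_n\Bigr)B\big|_{\mathcal K}(\Lambda_{m_l}^{*}-\bar w_l)^{-1}$$
or
$$P_{\mathcal K}\Bigl(\prod_{n=1}^{l}A_n\Bigr)B\big|_{\mathcal K} = P_{\mathcal K}\Bigl(\prod_{n=1}^{l-1}A_n\Bigr)B\big|_{\mathcal K}(R_{m_lj}(z)-\bar w_l)^{-1}. \tag{63}$$
By mathematical induction with respect to $l$, using
$$Q_{lm}(z,w) = (R_{lm}(z)-\bar w)(\Lambda_m-z), \tag{64}$$
(54) and (63), we may prove that
$$P_{\mathcal K}\Bigl(\prod_{n=1}^{l}A_n\Bigr)B\big|_{\mathcal K} = (\Lambda_j-z)^{-1}(R_{m_1j}(z)-\bar w_1)^{-1}\cdots(R_{m_lj}(z)-\bar w_l)^{-1}. \tag{65}$$
In the case of $m_1=p$ and $m_2=q$ in (65), we have
$$P_{\mathcal K}(T_p^{*}-\bar w_1)^{-1}(T_q^{*}-\bar w_2)^{-1}(T_j-z)^{-1}\big|_{\mathcal K} = (\Lambda_j-z)^{-1}(R_{pj}(z)-\bar w_1)^{-1}(R_{qj}(z)-\bar w_2)^{-1}. \tag{66}$$
In (66), exchanging $p$ and $q$, and $w_1$ and $w_2$, we have (60), since
$$[(T_p^{*}-\bar w_1)^{-1},(T_q^{*}-\bar w_2)^{-1}] = 0. \tag{67}$$
Therefore (65) implies (61). Taking adjoints of both sides of (61), we have (62). □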
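The step from (66) to (60) can be spelled out as follows. The only inputs are (66) itself and the commutativity (67) of the two adjoint resolvents; the abbreviations $\rho_p$ and $\rho_q$ are introduced here purely for the display and do not appear elsewhere in the paper.

```latex
\begin{aligned}
\rho_p &:= R_{pj}(z)-\bar w_1, \qquad \rho_q := R_{qj}(z)-\bar w_2,\\
(\Lambda_j-z)^{-1}\rho_p^{-1}\rho_q^{-1}
  &= P_{\mathcal K}(T_p^{*}-\bar w_1)^{-1}(T_q^{*}-\bar w_2)^{-1}(T_j-z)^{-1}\big|_{\mathcal K}\\
  &= P_{\mathcal K}(T_q^{*}-\bar w_2)^{-1}(T_p^{*}-\bar w_1)^{-1}(T_j-z)^{-1}\big|_{\mathcal K}
   = (\Lambda_j-z)^{-1}\rho_q^{-1}\rho_p^{-1},\\
\rho_p^{-1}\rho_q^{-1} = \rho_q^{-1}\rho_p^{-1}
  &\;\Longrightarrow\; \rho_q\rho_p = \rho_p\rho_q
  \;\Longrightarrow\; [R_{pj}(z),R_{qj}(z)] = 0 .
\end{aligned}
```

The last implication uses only that $\bar w_1$ and $\bar w_2$ are scalars, so the commutator of $\rho_p$ and $\rho_q$ equals that of $R_{pj}(z)$ and $R_{qj}(z)$.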

Define $R_{m,\{l_1,\dots,l_n\}} = (a_{ij})$, related to $C_{ij}$ and $\Lambda_i$, in the same way as in §2; as in (19), $a_{ii} = R_{ml_i}(z_i)$ and $a_{ij}=0$ for $i>j$.

Theorem 2. Let $\mathbb T = (T_1,\dots,T_k)$ be a commuting $k$-tuple of operators on $\mathcal H$. If $1\le p,q,j_1,\dots,j_n\le k$ and $z_l\in\rho(\Lambda_{j_l})\cap\rho(T_{j_l})$, then
$$[R_{p,J}(z),R_{q,J}(z)] = 0, \tag{68}$$
where $z = (z_1,\dots,z_n)$ and $J = \{j_1,\dots,j_n\}$.


Proof. In the case of $n=1$, $R_{pl_1}(z) = R_{pl_1}(z_1)$. Thus (60) implies (68) for $n=1$. Hence we only have to prove (68) for $n\ge 2$. For simplicity of notation, write $A_l = (T_{j_l}-z_l)^{-1}$, $B_p = (T_p^{*}-\bar w_p)^{-1}$, $B_q = (T_q^{*}-\bar w_q)^{-1}$, $\hat R_{pi} = (R_{pj_i}(z_i)-\bar w_p)^{-1}$ for $w_p\in\rho(T_p)$, $p=1,2,\dots,k$. Let $\lambda_i = (\Lambda_{j_i}-z_i)^{-1}$ and
$$U_{ml}(p,q) \overset{\mathrm{def}}{=} (R_{qj_m}(z_m)-\bar w_q)C_{pj_m}\lambda_m\cdots\lambda_l + C_{qj_m}\lambda_m\cdots\lambda_l(R_{pj_l}(z_l)-\bar w_p) - \sum_{i=l+1}^{m-1}C_{qj_m}\lambda_m\cdots\lambda_i\,C_{pj_i}\lambda_i\cdots\lambda_l,$$
for $m-l>1$, and
$$U_{ml}(p,q) \overset{\mathrm{def}}{=} (R_{qj_m}(z_m)-\bar w_q)C_{pj_m}\lambda_m\lambda_l + C_{qj_m}\lambda_m\lambda_l(R_{pj_l}(z_l)-\bar w_p)$$
for $m=l+1$. It is easy to see that (68) is equivalent to
$$U_{ml}(p,q) = U_{ml}(q,p) \quad\text{for } l<m, \tag{69}$$
and arbitrary $j_l,\dots,j_m$. Let
$$A_{lm} = \hat R_{ql}\,C_{qj_l}\prod_{i=m}^{l}\lambda_i \quad\text{and}\quad B_{lm} = C_{pj_l}\prod_{i=m}^{l}\lambda_i\,\hat R_{pm},$$
for $l\ge m$. Define $E_1 = E_1(p,q) = I$ and $E_m = E_m(p,q)$, $m>1$, by the recurrence formula
$$E_m = \sum_{i=1}^{m-1}A_{mi}E_i. \tag{70}$$
Define $F_1 = I$ and $F_m$, $m>1$, by the recurrence formula
$$F_m = (\lambda_m + \lambda_m\hat R_{qm}C_{qj_m}\lambda_m)F_{m-1} = (R_{j_mq}(w_q)^{*}-z_m)^{-1}F_{m-1}. \tag{71}$$
We have to prove that
$$\sum_{i=1}^{m-1}\prod_{s=i}^{m-1}\lambda_s\,E_i = F_{m-1}\lambda_1, \quad m=2,3,\dots \tag{72}$$
and
$$E_m = \hat R_{qm}C_{qj_m}\lambda_mF_{m-1}\lambda_1, \quad m=2,3,\dots. \tag{73}$$
It is easy to see that (72) and (73) hold good for $m=2$. Suppose that (72) and (73) hold good for $m=2,3,\dots,l$; then we have to prove that (72) and (73) hold


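good for $m=l+1$. It is easy to see that
$$\begin{aligned}
\sum_{i=1}^{l}\prod_{s=i}^{l}\lambda_s\,E_i &= \lambda_lF_{l-1}\lambda_1 + \lambda_lE_l\\
&= \lambda_lF_{l-1}\lambda_1 + \lambda_l\hat R_{ql}C_{qj_l}\lambda_lF_{l-1}\lambda_1\\
&= (\lambda_l + \lambda_l\hat R_{ql}C_{qj_l}\lambda_l)F_{l-1}\lambda_1\\
&= F_l\lambda_1,
\end{aligned}$$
which proves (72) for $m=l+1$. Then from (70), we have
$$\begin{aligned}
E_{l+1} &= \hat R_{q(l+1)}C_{qj_{l+1}}\lambda_{l+1}\lambda_l\Bigl(\sum_{i=1}^{l-1}\prod_{s=i}^{l-1}\lambda_s\,E_i\Bigr) + A_{(l+1)l}E_l\\
&= \hat R_{q(l+1)}C_{qj_{l+1}}\lambda_{l+1}\lambda_lF_{l-1}\lambda_1 + \hat R_{q(l+1)}C_{qj_{l+1}}\lambda_{l+1}\lambda_l\hat R_{ql}C_{qj_l}\lambda_lF_{l-1}\lambda_1\\
&= \hat R_{q(l+1)}C_{qj_{l+1}}\lambda_{l+1}(\lambda_l + \lambda_l\hat R_{ql}C_{qj_l}\lambda_l)F_{l-1}\lambda_1\\
&= \hat R_{q(l+1)}C_{qj_{l+1}}\lambda_{l+1}F_l\lambda_1
\end{aligned}$$
by (71), which proves (73) for $m=l+1$. Hence (72) and (73) hold good. Define
$$N_{ml} = N_{ml}(p,q) \overset{\mathrm{def}}{=} \hat R_{qm}U_{ml}(p,q)\hat R_{pl}. \tag{74}$$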

Then
$$N_{m(m-1)} = A_{m(m-1)} + B_{m(m-1)} \tag{75}$$
and
$$N_{ml} = A_{ml} + B_{ml} - \sum_{j=l+1}^{m-1}A_{mj}B_{jl} \quad\text{for } m>l+1. \tag{76}$$
Define $N_1 = I$ and $N_m$ for $m>1$ by the recurrence formula
$$N_m = \sum_{i=1}^{m-1}N_{mi}N_i. \tag{77}$$
We have to prove that
$$N_m = E_m + \sum_{i=1}^{m-1}B_{mi}N_i, \quad m=2,3,\dots \tag{78}$$
by mathematical induction.


It is obvious that (78) holds good for $m=2$. Suppose (78) holds good for $m=2,3,\dots,l-1$. Then from (76) and (77), we have
$$\begin{aligned}
N_l - \sum_{i=1}^{l-1}B_{li}N_i &= \sum_{i=1}^{l-1}(N_{li}-B_{li})N_i = A_{l(l-1)}N_{l-1} + \sum_{i=1}^{l-2}\Bigl(A_{li} - \sum_{s=i+1}^{l-1}A_{ls}B_{si}\Bigr)N_i\\
&= A_{l1} + \sum_{i=2}^{l-1}A_{li}\Bigl(N_i - \sum_{s=1}^{i-1}B_{is}N_s\Bigr),
\end{aligned}$$
which is equal to $A_{l1} + \sum_{i=2}^{l-1}A_{li}E_i$ by the hypothesis of the induction. By (70), it is equal to $E_l$, which proves (78) for all $m\ge 2$.

From (62), (71) and (73), we have
$$\hat R_{qm}C_{qj_m}\lambda_mP_{\mathcal K}B_q\prod_{i=1}^{m-1}A_i\big|_{\mathcal K} = E_m(p,q)\hat R_{q1}, \quad\text{for } m\ge 2. \tag{79}$$
Define
$$M_m = M_m(p,q) = P_{\mathcal K}B_pB_q\prod_{j=1}^{m}A_j\big|_{\mathcal K}.$$

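Then by (56), (58) and (61), we have
$$\begin{aligned}
M_m &= P_{\mathcal K}B_pB_qA_m\big|_{\mathcal K}\,C_{qj_m}P_{\mathcal K}A_mB_q\prod_{i=1}^{m-1}A_i\big|_{\mathcal K} + P_{\mathcal K}B_pA_mB_q\prod_{i=1}^{m-1}A_i\big|_{\mathcal K}\\
&= \lambda_m\hat R_{pm}\hat R_{qm}C_{qj_m}\lambda_mP_{\mathcal K}B_q\prod_{i=1}^{m-1}A_i\big|_{\mathcal K}\\
&\quad + P_{\mathcal K}B_pA_m\big|_{\mathcal K}\,C_{pj_m}P_{\mathcal K}A_mB_pB_q\prod_{i=1}^{m-1}A_i\big|_{\mathcal K} + P_{\mathcal K}A_mB_pB_q\prod_{i=1}^{m-1}A_i\big|_{\mathcal K}.
\end{aligned}$$
By (58) and (79), we have
$$M_m = \lambda_m\hat R_{pm}E_m\hat R_{q1} + (\lambda_m\hat R_{pm}C_{pj_m}\lambda_m + \lambda_m)M_{m-1}. \tag{80}$$
We have to prove that
$$M_m = \lambda_m\hat R_{pm}N_m\hat R_{q1} + \lambda_mM_{m-1} \tag{81}$$
by mathematical induction. It is easy to see that (81) holds good for $m=2$, since $M_1 = \lambda_1\hat R_{p1}\hat R_{q1}$ and
$$E_2\hat R_{q1} + C_{pj_2}\lambda_2M_1 = (E_2+B_{21})\hat R_{q1} = N_2\hat R_{q1}.$$
Suppose (81) holds good for $m=2,3,\dots,l-1$. Then from (80)
$$\begin{aligned}
M_l &= \lambda_l\hat R_{pl}E_l\hat R_{q1} + \lambda_l\hat R_{pl}C_{pj_l}\lambda_l\bigl(\lambda_{l-1}\hat R_{p(l-1)}N_{l-1}\hat R_{q1} + \lambda_{l-1}M_{l-2}\bigr) + \lambda_lM_{l-1}\\
&= \lambda_l\hat R_{pl}\bigl(E_l + B_{l(l-1)}N_{l-1}\bigr)\hat R_{q1} + \lambda_l\hat R_{pl}C_{pj_l}\lambda_l\lambda_{l-1}M_{l-2} + \lambda_lM_{l-1}.
\end{aligned}$$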


Continuing this process, we may prove that
$$M_l = \lambda_l\hat R_{pl}\Bigl(E_l + \sum_{j=1}^{l-1}B_{lj}N_j\Bigr)\hat R_{q1} + \lambda_lM_{l-1}. \tag{82}$$
From (78) and (82), we may prove that (81) holds good for all $m\ge 2$. From the fact that $[B_p,B_q]=0$, we have $M_m(p,q) = M_m(q,p)$. Therefore (81) implies that
$$\hat R_{pm}N_m(p,q)\hat R_{q1} = \hat R_{qm}N_m(q,p)\hat R_{p1}. \tag{83}$$
From (74) and (77), we have
$$\hat R_{pm}N_m\hat R_{q1} = \hat R_{pm}\hat R_{qm}U_{m1}\hat R_{p1}\hat R_{q1} + \sum_{l=2}^{m-1}\hat R_{pm}\hat R_{qm}U_{ml}\hat R_{pl}N_l\hat R_{q1}. \tag{84}$$
For $m=2$, from $N_{21}=N_2$, (83) and (84), we have (69) for $m=2$ and $l=1$. But $j_1,\dots,j_n$ are arbitrary numbers in $\{1,2,\dots,k\}$, therefore (69) holds good for $m=l+1$. Assume that (69) holds good for $m=l+1,\dots,l+i$, $i\ge 1$. Then from (83) and (84), in which $l=1$ and $m=i+2$, we may prove that (69) holds good for $m=i+2$, $l=1$. Thus (69) holds good for $m=l+(i+1)$, which proves (69) for any $m>l$ and hence the theorem. □
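For orientation, the lowest case $m=2$ of the recurrences (70) and (77), and of the identities (73) and (78), can be checked by hand. The sketch below assumes that the product $\prod_{i=m}^{l}\lambda_i$ in the definitions of $A_{lm}$ and $B_{lm}$ is read in decreasing order $\lambda_l\cdots\lambda_m$, matching the ordering $\lambda_m\cdots\lambda_l$ used in $U_{ml}$; it uses only $E_1 = F_1 = N_1 = I$ and (75).

```latex
\begin{aligned}
A_{21} &= \hat R_{q2}\,C_{qj_2}\,\lambda_2\lambda_1, \qquad
B_{21} = C_{pj_2}\,\lambda_2\lambda_1\,\hat R_{p1},\\
E_2 &= A_{21}E_1 = \hat R_{q2}\,C_{qj_2}\,\lambda_2\,F_1\,\lambda_1
   && \text{(this is (73) for } m=2\text{)},\\
N_2 &= N_{21}N_1 = A_{21} + B_{21} = E_2 + B_{21}N_1
   && \text{(this is (78) for } m=2\text{)}.
\end{aligned}
```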

5. Resolvents formula for a commuting 𝒌-tuple of operators

Let $\mathbb T = \{T_1,\dots,T_k\}$ be a commuting $k$-tuple on a Hilbert space $\mathcal H$. We define $\mathcal K$, $C_{ij}$, $\Lambda_i$, etc. as in §4. Let us adopt the same matrix $\mathfrak X_{P_m,Q_n}$ for $P_m = \{p_1,\dots,p_m\}$ and $Q_n = \{q_1,\dots,q_n\}$, $1\le p_i,q_j\le k$, as in §3. Let $S_{P_m,Q_n}$ be the $L(\mathcal K)$-valued function
$$S_{P_m,Q_n}(z_1,\dots,z_n;w_1,\dots,w_m) \overset{\mathrm{def}}{=} P_{\mathcal K}\prod_{i=1}^{m}(T_{p_i}^{*}-\bar w_i)^{-1}\prod_{j=1}^{n}(T_{q_j}-z_j)^{-1}\big|_{\mathcal K},$$
for $z_i,w_i\in\rho(T_i)$. Define $\mathfrak S_{P_m,Q_n}$ as in (12).

Theorem 3. Let $\mathbb T = \{T_1,\dots,T_k\}$ be a commuting $k$-tuple on a Hilbert space $\mathcal H$. Let $z = (z_1,\dots,z_n)$ and $w = (w_1,\dots,w_m)$ satisfy the condition that $w_i\in\rho(T_{p_i})$, $i=1,2,\dots,m$, and $z_j\in\rho(T_{q_j})$, $j=1,2,\dots,n$. Then
$$\mathfrak S_{P_m,Q_n}(z,w) = \mathfrak X_{P_m,Q_n}(z,w). \tag{85}$$

Proof. In Lemma 5, the formulas (61) and (62) are equivalent to (85) in the case of $n=1$ and $m=1$ respectively. Let us prove (85) by mathematical induction. Suppose (85) holds good for $m=l-1\ge 1$. Let us calculate $S_{P_l,Q_n}$ for any $n$. Let $A_i = (T_{p_i}^{*}-\bar w_i)^{-1}$ and $B_i = (T_{q_i}-z_i)^{-1}$. Then by the commutator formula
$$[A_i,B_j] = B_jA_i\big|_{\mathcal K}\,C_{p_iq_j}P_{\mathcal K}\,A_iB_j$$


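and $A_j\big|_{\mathcal K} = (\Lambda_{p_j}^{*}-\bar w_j)^{-1}$ we have
$$\begin{aligned}
S_{P_l,Q_n} &= P_{\mathcal K}A_1\cdots A_lB_1\cdots B_n\big|_{\mathcal K}\\
&= \sum_{j=1}^{n}P_{\mathcal K}A_1\cdots A_{l-1}B_1\cdots B_j\big|_{\mathcal K}(\Lambda_{p_l}^{*}-\bar w_l)^{-1}C_{p_lq_j}P_{\mathcal K}A_lB_j\cdots B_n\big|_{\mathcal K} + S_{P_{l-1},Q_n}(\Lambda_{p_l}^{*}-\bar w_l)^{-1}.
\end{aligned}$$
By (62), we have
$$S_{P_l,Q_n} = \sum_{j=1}^{n}S_{P_{l-1},Q_j}f_{jn},$$
where
$$f_{ij} = (\Lambda_{p_l}^{*}-\bar w_l)^{-1}C_{p_lq_i}\prod_{s=i}^{j}(R_{q_sp_l}(w_l)^{*}-z_s)^{-1}(\Lambda_{p_l}^{*}-\bar w_l)^{-1}, \quad\text{for } i<j,$$
and $f_{ii} = (R_{p_lq_i}(z_i)-\bar w_l)^{-1}$, since
$$\begin{aligned}
(\Lambda_{p_l}^{*}-\bar w_l)^{-1}\bigl(C_{p_lq_n}P_{\mathcal K}A_lB_n\big|_{\mathcal K} + 1\bigr)
&= (\Lambda_{p_l}^{*}-\bar w_l)^{-1}\Bigl(\bigl(-(R_{p_lq_n}(z_n)-\bar w_l) + (\Lambda_{p_l}^{*}-\bar w_l)\bigr)(R_{p_lq_n}(z_n)-\bar w_l)^{-1} + 1\Bigr)\\
&= (R_{p_lq_n}(z_n)-\bar w_l)^{-1}.
\end{aligned}$$
Therefore
$$\bigl(S_{P_l,Q_1}\ S_{P_l,Q_2}\ \cdots\ S_{P_l,Q_n}\bigr) = \bigl(S_{P_{l-1},Q_1}\ S_{P_{l-1},Q_2}\ \cdots\ S_{P_{l-1},Q_n}\bigr)F_n,$$
where $F_n = (f_{ij})_{i,j=1,2,\dots,n}$ and $f_{ij}=0$ for $i>j$. By the hypothesis of mathematical induction,
$$\bigl(S_{P_{l-1},Q_1}\ \cdots\ S_{P_{l-1},Q_n}\bigr) = \bigl(X_{P_{l-1},Q_1}\ \cdots\ X_{P_{l-1},Q_n}\bigr).$$
Therefore from (23), to show that $S_{P_l,Q_j} = X_{P_l,Q_j}$, we only have to prove that
$$F_n = (R_{p_l,Q_n}-\bar w_l)^{-1}. \tag{86}$$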

To prove (86), we only have to show that, for any pair $(i,j)$, $1\le i,j\le n$,
$$a_{ii}f_{ii} = I \tag{87}$$
and
$$\sum_{s=i}^{j}a_{is}f_{sj} = 0, \quad i<j, \tag{88}$$
where $(a_{ij})_{i,j=1,2,\dots,n} = R_{p_l,Q_n}-\bar w_l$. Thus it is obvious that (87) holds good. To prove (88), notice that
$$\sum_{s=i}^{j}a_{is}f_{sj} = I_1+I_2+I_3,$$


for $j-i>0$, where
$$I_1 = a_{ii}f_{ij} = (R_{p_lq_i}(z_i)-\bar w_l)(\Lambda_{p_l}^{*}-\bar w_l)^{-1}C_{p_lq_i}\prod_{s=i}^{j}(R_{q_sp_l}(w_l)^{*}-z_s)^{-1}(\Lambda_{p_l}^{*}-\bar w_l)^{-1},$$
$$I_2 = \sum_{s=i+1}^{j-1}a_{is}f_{sj} = \sum_{s=i+1}^{j-1}A_{is}A_{sj},$$
where
$$A_{is} = C_{p_lq_i}\prod_{t=i}^{s}(\Lambda_{q_t}-z_t)^{-1}(\Lambda_{p_l}^{*}-\bar w_l)^{-1},$$
for $1\le i\le s\le n$, if $j>i+1$, and $I_2=0$ if $j=i+1$. Besides,
$$I_3 = -C_{p_lq_i}\prod_{t=i}^{j}(\Lambda_{q_t}-z_t)^{-1}(R_{p_lq_j}(z_j)-\bar w_l)^{-1}.$$

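However, for $j>i+1$,
$$\begin{aligned}
I_2 &= -\sum_{s=i+1}^{j-1}C_{p_lq_i}\prod_{t=i}^{s}(\Lambda_{q_t}-z_t)^{-1}\bigl(-(R_{q_sp_l}(w_l)^{*}-z_s) + (\Lambda_{q_s}-z_s)\bigr)\prod_{t=s}^{j}(R_{q_tp_l}(w_l)^{*}-z_t)^{-1}(\Lambda_{p_l}^{*}-\bar w_l)^{-1}\\
&= \sum_{s=i+1}^{j-1}C_{p_lq_i}\prod_{t=i}^{s}(\Lambda_{q_t}-z_t)^{-1}\prod_{t=s+1}^{j}(R_{q_tp_l}(w_l)^{*}-z_t)^{-1}(\Lambda_{p_l}^{*}-\bar w_l)^{-1}\\
&\quad -\sum_{s=i+1}^{j-1}C_{p_lq_i}\prod_{t=i}^{s-1}(\Lambda_{q_t}-z_t)^{-1}\prod_{t=s}^{j}(R_{q_tp_l}(w_l)^{*}-z_t)^{-1}(\Lambda_{p_l}^{*}-\bar w_l)^{-1}.
\end{aligned}\tag{89}$$
Most of the terms in the two summations on the right-hand side of (89) cancel each other. Thus $I_2 = J_1+J_2$, where
$$\begin{aligned}
J_1 &= C_{p_lq_i}\prod_{t=i}^{j-1}(\Lambda_{q_t}-z_t)^{-1}(R_{q_jp_l}(w_l)^{*}-z_j)^{-1}(\Lambda_{p_l}^{*}-\bar w_l)^{-1}\\
&= C_{p_lq_i}\prod_{t=i}^{j}(\Lambda_{q_t}-z_t)^{-1}(R_{p_lq_j}(z_j)-\bar w_l)^{-1} = -I_3,
\end{aligned}$$
since
$$(R_{q_jp_l}(w_l)^{*}-z_j)^{-1}(\Lambda_{p_l}^{*}-\bar w_l)^{-1} = (\Lambda_{q_j}-z_j)^{-1}(R_{p_lq_j}(z_j)-\bar w_l)^{-1}.$$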


The term
$$\begin{aligned}
J_2 &= -C_{p_lq_i}(\Lambda_{q_i}-z_i)^{-1}\prod_{t=i+1}^{j}(R_{q_tp_l}(w_l)^{*}-z_t)^{-1}(\Lambda_{p_l}^{*}-\bar w_l)^{-1}\\
&= \bigl((R_{p_lq_i}(z_i)-\bar w_l) - (\Lambda_{p_l}^{*}-\bar w_l)\bigr)\prod_{t=i+1}^{j}(R_{q_tp_l}(w_l)^{*}-z_t)^{-1}(\Lambda_{p_l}^{*}-\bar w_l)^{-1}.
\end{aligned}\tag{90}$$
But the product of the first four factors in $I_1$ from the left is
$$\begin{aligned}
&(R_{p_lq_i}(z_i)-\bar w_l)(\Lambda_{p_l}^{*}-\bar w_l)^{-1}C_{p_lq_i}(R_{q_ip_l}(w_l)^{*}-z_i)^{-1}\\
&\quad= (R_{p_lq_i}(z_i)-\bar w_l)\bigl(-(R_{q_ip_l}(w_l)^{*}-z_i) + (\Lambda_{q_i}-z_i)\bigr)(R_{q_ip_l}(w_l)^{*}-z_i)^{-1}\\
&\quad= -(R_{p_lq_i}(z_i)-\bar w_l) + (R_{p_lq_i}(z_i)-\bar w_l)(\Lambda_{q_i}-z_i)Q_{p_lq_i}(z_i,w_l)^{-1}(\Lambda_{p_l}^{*}-\bar w_l)\\
&\quad= -(R_{p_lq_i}(z_i)-\bar w_l) + (\Lambda_{p_l}^{*}-\bar w_l).
\end{aligned}\tag{91}$$
Thus $I_1+J_2=0$, which proves that $I_1+I_2+I_3=0$ for $j>i+1$. If $j=i+1$, then $I_2=0$,
$$\begin{aligned}
I_1 &= (R_{p_lq_i}(z_i)-\bar w_l)(\Lambda_{p_l}^{*}-\bar w_l)^{-1}C_{p_lq_i}(R_{q_ip_l}(w_l)^{*}-z_i)^{-1}Q_{p_lq_{i+1}}(z_{i+1},w_l)^{-1}\\
&= \bigl(-(R_{p_lq_i}(z_i)-\bar w_l) + (\Lambda_{p_l}^{*}-\bar w_l)\bigr)Q_{p_lq_{i+1}}(z_{i+1},w_l)^{-1}
\end{aligned}$$
by (91), and
$$I_3 = -C_{p_lq_i}(\Lambda_{q_i}-z_i)^{-1}Q_{p_lq_{i+1}}(z_{i+1},w_l)^{-1}.$$
Thus $I_1+I_3=0$, which proves $I_1+I_2+I_3=0$ as well, and hence it proves (88). Therefore (85) is proved. □
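For $n=2$ the matrix identity (86) reduces to a $2\times 2$ upper triangular computation, which may help in reading the general argument. This is a sketch only: it writes $a_{ij}$ for the entries of $R_{p_l,Q_2}-\bar w_l$ and assumes the off-diagonal entry $a_{12} = -C_{p_lq_1}(\Lambda_{q_1}-z_1)^{-1}(\Lambda_{q_2}-z_2)^{-1}$, which is the form implied by the expression for $I_3$ in the proof. With $i=1$ and $j=2$ one has $I_2=0$, $I_1 = a_{11}f_{12}$ and $I_3 = a_{12}f_{22}$.

```latex
\begin{pmatrix} a_{11} & a_{12}\\ 0 & a_{22}\end{pmatrix}
\begin{pmatrix} f_{11} & f_{12}\\ 0 & f_{22}\end{pmatrix}
=
\begin{pmatrix} a_{11}f_{11} & a_{11}f_{12}+a_{12}f_{22}\\ 0 & a_{22}f_{22}\end{pmatrix}
=
\begin{pmatrix} I & I_1+I_3\\ 0 & I\end{pmatrix}
=
\begin{pmatrix} I & 0\\ 0 & I\end{pmatrix}.
```

Here the diagonal entries are (87) and the vanishing of the corner entry is the case $j=i+1$ of (88).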

References

[1] A. Athevale, On joint hyponormality of operators, Proc. Amer. Math. Soc. 103 (1988), 417–423.
[2] J.B. Conway, The Theory of Subnormal Operators, Math. Sur. Mono. V.36, Amer. Math. Soc. 1991, 1–435.
[3] J.B. Conway, Towards a functional calculus for subnormal tuples: the minimal extension, Trans. Amer. Math. Soc. 329 (1991), 543–577, the minimal extension and approximation in several complex variable, Proc. Symp. Pure Math. 51, Part I, Amer. Math. Soc. Providence (1990).
[4] R.E. Curto, Joint hyponormality: a bridge between hyponormality and subnormality, Proc. Symp. Pure Math. 5 (1990), 69–91.
[5] J. Eschmeier and M. Putinar, Some remarks on spherical isometries, in Vol. “Systems, Approximation, Singular Integral Operators and Related Topics” (A.A. Borichev and N.K. Nikolskii, eds.), Birkhäuser, Basel et al. (2001), 271–292.
[6] J. Gleason, Matrix construction of subnormal tuples of finite type, Journal of Math. Anal. Appl. 284(2) (2003), 593–602.
[7] B. Gustafsson and M. Putinar, Linear analysis of quadrature domains. II, Israel J. Math. 119 (2000), 187–216.


[8] M. Putinar, Spectral inclusion for subnormal n-tuples, Proc. Amer. Math. Soc. 90 (1984), 405–406.
[9] M. Putinar, Linear analysis of quadrature domains. III, J. Math. Anal. Appl. 239(1) (1999), 101–117.
[10] J.D. Pincus, D. Xia and J. Xia, The analytic model of a hyponormal operator with rank one self-commutators, Integr. Equ. Oper. Theory, 7 (1984), 516–535. Note on this paper, Integr. Equ. Oper. Theory, 7 (1984), 893–894.
[11] J.D. Pincus, D. Xia, A trace formula for subnormal tuples of operators, Integr. Equ. Oper. Theory, 14 (1991), 390–398.
[12] J.D. Pincus, D. Zheng, A remark on the spectral multiplicity of normal extensions of commuting subnormal tuples, Integr. Equ. Oper. Theory, 16 (1993), 145–153.
[13] D. Xia, Analytic model of subnormal operators, Integr. Equ. Oper. Theory, 10 (1987), 255–289.
[14] D. Xia, Analytic theory of subnormal operators, Integr. Equ. Oper. Theory, 10 (1987), 890–903.
[15] D. Xia, Analytic theory of a subnormal n-tuple of operators, in Operator Theory, Operator Algebra and Applications, Proceedings of Symp. on Pure Math. 51(1) (1990), 617–640.
[16] D. Xia, Hyponormal operators with finite rank self-commutators and quadrature domains, J. Math. Anal. Appl. 203 (1996), 540–559.
[17] D. Xia, Trace formulas for some operators related to quadrature domains in Riemann surfaces, Integr. Equ. Oper. Theory, 47 (2003), 123–130.
[18] D. Xia, Hyponormal operators with rank one self-commutator and quadrature domains, Integr. Equ. Oper. Theory, 48 (2004), 115–135.
[19] D. Xia, On a class of operators of finite type, Integr. Equ. Oper. Theory, 54 (2006), 131–150.
[20] D. Xia, Right spectrum and trace formula of subnormal tuples of operators of finite type, Integr. Equ. Oper. Theory, 55 (2006), 439–452.
[21] D.V. Yakubovich, Subnormal operators of finite type I, Xia’s model and real algebraic curves, Revista Matem. Iber. 14 (1998), 95–115.
[22] D.V. Yakubovich, Subnormal operators of finite type II, Structure theorems, Revista Matem. Iber. 14 (1998), 623–689.
[23] D.V. Yakubovich, Real separated algebraic curves, quadrature domains, Ahlfors type functions and operator theory, Jour. Func. Analysis, 236 (2006), 25–58.

Daoxing Xia
Department of Mathematics
Vanderbilt University
Nashville, TN 37240, USA
e-mail: [email protected] [email protected]