This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
,
(1.65)
In view of Proposition 1.15, the set (1.65) is canonical. So
is a Jordan pair of L(.l.) corresponding to the eigenvalue zero. Consider now the eigenvalue 1.
and the corresponding eigenvector is (1)-
[-fi + 1] 1
.
To find the first generalized eigenvector
L'(l)
=
0,
or
[
3 2j2+1
2J2 - 1] [-fi + 1] + [ 3
1
1 J2+1
J2 1- 1](/Jw
=
o.
(1. 66 )
1.
42
LINEARIZATION AND STANDARD PAIRS
This equation has many solutions (we can expect this fact in advance, since if cpW is a solution of (1.66), so is cpW + acp~1J, for any a E (['). As the first generalized eigenvector cpW, we can take any solution of (1.66); let us take
[fi- 2]
(1)-
0
CfJll-
.
So, X
= 1
[-fi + 1 J2- 2] 1
0
,
Consider now the eigenvalue -1. Since L(-1) =
[
fl-1 1
as an eigenvector we can take (-1)-
CfJlO
-
J2_+1 1],
[fi + 1] 1
.
The first generalized eigenvector cp\1 1 ) is defined from the equation L'( -1)cp\0 1)
+ L( -1)cp\1 1l = 0,
and a computation shows that we can take (-1)-
CfJll
-
[fi + 2] 0
.
So the Jordan pair of L(A.) corresponding to the eigenvalue - 1 is X
- 1
=
[fi + 1 J2 + 2] 1
0
•
]_ 1
=
[-1 1]
0-1·
Finally, we write down a Jordan pair (X, J) for L(A.):
X
=
[1 0 0
1
J2 - 2 J2 + 1 J2 + 2]
-fi + 1 1
0
0
•
0
0 J=
-1 -1
One can now check equality (1.64) for Example 1.5 by direct computation.
0
Consider now the scalar case, i.e., n = 1. In this case the Jordan pair is especially simple and, in fact, does not give any additional information apart from the multiplicity of an eigenvalue.
43
1.9. PROPERTIES OF A JORDAN PAIR
Proposition 1.18. Let L(A.) be a monic scalar polynomial, and A.0 be an eigenvalue of L(A.) of multiplicity v. Then
X ;.0 = [1
0
···
OJ,
is a Jordan pair of L(A.) corresponding to A. 0 • Here X;. 0 is 1 x v and J;. 0 is the v x v Jordan block. Proof Let L(A.) = (A.- A. 0 Y · L 1 (A.), where L 1 (A.) is a scalar polynomial not divisible by A.- A.0 . It is clear that 1, 0, ... , 0 (v - 1 zeros) is a Jordan chain of L(A.) corresponding to A.0 . By Proposition 1.15, this single Jordan chain forms a canonical set of Jordan chains corresponding to A.0 . Now Proposition 1.18 follows from the definition of a Jordan pair. D In general, the columns of the matrix X from a Jordan pair (X, J) of the monic matrix polynomial L(A.) are not linearly independent (because of the obvious reason that the number of columns in X is nl, which is greater than the dimension n of each column of X, if the degree 1exceeds one). It turns out that the condition 1 = 1 is necessary and sufficient for the columns of X to be linearly independent. Theorem 1.19. Let L(A.) be a monic matrix polynomial with Jordan pair (X, J). Then the columns of X are linearly independent if and only if the degree 1 of L(A.) is equal to 1.
Proof Suppose L(A.) = H - A is of degree 1, and let (X, J) be its Jordan pair. From the definition of a Jordan pair, it is clear that the columns of X form a Jordan basis for A, and therefore they are linearly independent. Conversely, if the degree of L(A.) exceeds 1, then the columns of X cannot be linearly independent, as we have seen above. D 1.9. Properties of a Jordan Pair The following theorem describes the crucial property of a Jordan pair. Theorem 1.20. Let (X, J) be a Jordan pair of a monic matrix polynomial L(A.) = H 1 + 2:}:6 AiA.i. Then the matrix of size nl x nl
col(XJ'll:A is nonsingular.
~r~
l
lxJI-1
44
1.
LINEARIZATION AND STANDARD PAIRS
Proof Let C 1 be the first companion matrix for L(A.) (as in Theorem 1.1). Then the equality (1.67) holds. Indeed, the equality in all but the last block row in (1.67) is evident; the equality in the last block row follows from (1.64). Equality (1.67) shows that the columns of col(XJ;)l:A form Jordan chains for the matrix polynomial
u- c1· Let us show that these Jordan chains of I A. - C 1 form a canonical set for every eigenvalue A.0 of !A. - C 1 • First note that in view of Theorem 1.1, a(L(A:)) = a(! A. - C 1), and the multiplicity of every eigenvalue A.0 is the same, when A.0 is regarded as an eigenvalue of L(A.) or of I A. - C 1 • Thus, the number of vectors in the columns of col(Xf)l:A corresponding to some A.0 E a(l A. - C 1) coincides with the multiplicity of A. 0 as a zero of det(J A. - C 1). Let x 1 , .•• , xk be the eigenvectors of L(A.) corresponding to A.0 and which appear as columns in the matrix X; then col(A.~x 1 )l:A, ... , col(A.~xk)f:A will be the corresponding columns in col(Xf)l:A. By Proposition 1.15, in order to prove that col(XJ;)l:A forms a canonical set of Jordan chains of !A.- C 1 corresponding to A. 0 we have only to check that the vectors CO 1( AoXj i=oE~L- , 1i
)1-1
d'nl
j
= 1, ... , k
are linearly independent. But this is evident, since by the construction of a Jordan pair it is clear that the vectors x 1, ... , xb are linearly independent. So col(Xf)l:A forms a canonical set of Jordan chains for !A.- C 1 , and by Theorem 1.19, the nonsingularity ofcol(XJi)l:A follows. 0
Corollary 1.21. If (X, J) is a Jordan pair of a monic matrix polynomial L(A.), and C 1 is its companion matrix, then J is similar to C 1 • Proof Use (1.67) bearing in mind that, according to Theorem 1.20, col(XJi)l:A is invertible. 0 Theorem 1.20 allows us to prove the following important property of Jordan pairs.
Theorem 1.22. Let(X, J)and (X 1, J 1 )beJordanpairsofthemonic matrix polynomial L(A.). Then they are similar, i.e.,
x
=
x1s,
J =
s- 1J1s,
(1.68)
where (1.69) We already know that (when J is kept fixed) the matrix X from a Jordan pair (X, J) is not defined uniquely, because a canonical set of Jordan chains is
1.9.
45
PROPERTIES OF A JORDAN PAIR
not defined uniquely. The matrix J itself is not unique either, but, as Corollary 1.21 indicates, J is similar to C t·lt follows that J and J t from Theorem 1.22 are also similar, and therefore one of them can be obtained from the second by a permutation of its Jordan blocks. Proof
Applying (1.67) for (X, J) and (X t, J t), we have Ct·col(XtJD~:6
= col(Xtft)~:6-Jt,
Ct · col(Xf)~:6 = col(XJ;)~:6 · J. Substituting C t from the second equality in the first one, it is found that col(XJ;)~:6 · J[col(Xf)~:br t col( X tlD~:6
= col( X t JD~:6 · J t·
Straightforward calculation shows that for S given by (1.69), the equalities D
(1.68) hold true.
Theorem 1.22 shows that a Jordan pair of a monic matrix polynomial is essentially unique (up to similarity). It turns out that the properties described in equality (1.64) and Theorem 1.20 are characteristic for the Jordan pair, as the following result shows. Theorem 1.23. Let (X, J) be a pair of matrices, where X is of size n x nl and J is a Jordan matrix of size nl x nl. Then (X, J) is a Jordan pair of the monic matrix polynomial L(ii) = I ii 1 + 2::~: ~ A iiii if and only if the two following conditions hold:
(i) (ii)
col(XJ;)~:6 is a nonsingular nl x nl matrix;
A0 X
+ AtXJ + · · · + A 1_tXJ 1-t + XJ 1 =
0.
Proof In view of (1.64) and Theorem 1.20, conditions (i) and (ii) are necessary. Let us prove their sufficiency. Let J 0 = diag[J ot, ... , J ok] be the part of J corresponding to some eigenvalue ii 0 with Jordan blocks 1 0 t, ... , lok· Let X 0 = [X 0 t, ... , X 0k] be the part of X corresponding to J 0 and partitioned into blocks X ot, ... , X ok according to the partition of 1 0 = diag[J 0 t, ... , J 0 k] into Jordan blocks 1 0 t, ... , lok· Then the equalities
hold for i = 1, 2, ... , k. According to Proposition 1.10, the columns of X 0 ;, i = 1, ... , k, form a Jordan chain of L(ii) corresponding to ii 0 . Let us check that the eigenvectors j~, ... , h of the blocks X 01 , ••• , X Ob respectively, are linearly independent. Suppose not; then k
LII.J;=O, i= 1
l.
46
LINEARIZATION AND STANDARD PAIRS
If=
where the IX; E fl are not all zero. Then also I - 1, and therefore the columns of the matrix coi(A.b /1,
1
oc;A.b fi = 0 for j = 1, ... ,
... , A.b /,)~: 6
are linearly dependent. But this contradicts (i) since coi(A.b fi)}: ~ is the leftmost column in the part coi(X 0 ;Ib;)~-;;~ ofcoi(XJi)}:~. In order to prove that the Jordan chains X 01 , ... , X 0 k represent a canonical set of Jordan chains corresponding to A. 0 , we have to prove only (in view of Proposition 1.15) that 1 J..lo; = u, where n x J..lo; is the size of X 0 ; and u is the multiplicity of A. 0 as a zero of det L(A.). This can be done using Proposition 1.16. Indeed, let A. 1 , . . . , Aq be all the different eigenvalues of L(A.), and let J..lpi• i = 1, ... , kP, be the sizes of Jordan blocks of J corresponding to AP, p = 1, ... , q. If uP is the multiplicity of AP as a zero of det L(A.), then
I:=
kp
I
J..lpi :$
(JP'
p
=
(1.70)
1, .. . , q
i= 1
I:!,',
by Proposition 1.16. On the other hand, I~= 1 uP = I~= 1 1 J..lp; = nl, so that in (1.70) we have equalities for every p = 1, ... , q. Again, by Proposition 1.16, we deduce that the system X 01 , ... , X ok of Jordan chains corresponding to A. 0 (where A. 0 is an arbitrary eigenvalue) is canonical. 0
1.10. Standard Pairs of a Monic Matrix Polynomial
Theorem 1.23 suggests that the spectral properties of a monic matrix polynomial L(A.) = H 1 +I~=~ AiA_i can be expressed in terms of the pair of matrices (X, J) such that (i) and (ii) of Theorem 1.23 hold. The requirement that J is Jordan is not essential, since we can replace the pair (X, J) by (XS, s- 1 JS) for any invertible nl x nl matrix Sand still maintain conditions (i) and (ii). Hence, if J is not in Jordan form, by suitable choice of S the matrix s- 1 JS can be put into Jordan form. This remark leads us to the following important definition. A pair of matrices (X, T), where X is n x nl and Tis nl x nl, is called a standard pair for L(A.) if the following conditions are satisfied: (i) coi(XT;)l:6 is nonsingular; (ii) Il:6 A;XTi + XT 1 = 0. It follows from Theorem 1.20 and formula (1.64), that every Jordan pair is standard. Moreover, Theorem 1.23 implies that any standard pair (X, T), where Tis a Jordan matrix, is in fact a Jordan pair. An important example of a standard pair appears naturally from a linearization of L(A.), as the following theorem shows (cf. Theorem 1.1).
1.10.
47
STANDARD PAIRS OF A MONIC MATRIX POLYNOMIAL
Let P 1 = [I
Theorem 1.24.
0
OJ and let
0 0
0
I
0 0
0 -Ao
0 -Al
0
I
I
Cl= -Al-l
be the companion matrix of L(A.). Then (P 1 , C 1 ) is a standard pair of L(A.). Proof
Direct computation shows that P 1 C~ = [0
with I in the (i
··· 0
I
0 · · · OJ,
i = 0, ... ' 1- 1
+ 1)th place, and P 1 C11
= [ -A 0
-Al
···
-A~].
So col(P 1 CDl:6 =I, and condition (i) is trivially satisfied. Further, l-l l-l LAiPlCl + PlC1l = LA;[O ... 0 i=O
+ [- A 0 and (ii) follows also. EXAMPLE 1.6. follows:
PI=[~
I
0 ...
OJ
i=O
-A 1
· · ·
-
Az] = 0,
D
For the matrix L(A.) from Example 1.5 the pair (P 1, C 1 ) looks as
0 0 0 0 0 0 0
~l
0 0 0 Cl= 0 0 0
0 0 0 0 0 0 0 0 0 0 -1
0
0 0
0 0 0 1
0 0 1
0 0
-j2
0
-j2
0
D
Observe the following important formula for a standard pair (X, T) of L(A.): C 1 · col(XT;)l:6 = col(XT;)l:6 · T.
(1.71)
So Tis similar to C 1 . The following result shows in particular that this transformation applies to X also, in an appropriate sense.
48
1. LINEARIZATION AND STANDARD PAIRS
Theorem 1.25. Any two standard pairs (X, T) and (X', T') of L(A.) are similar, i.e., there exists an invertible nl x nl matrix S such that X'=XS,
T'=S- 1 TS.
(1.72)
The matrix S is defined uniquely by (X, T) and (X', T') and is given by the formula (1.73)
Conversely, iffor a given standard pair (X, T) a new pair of matrices (X', T') is defined by (1.72) (with some invertible matrix S), then (X', T') is also a standard pair for L(A.). Proof The second assertion is immediate: If (i) and (ii) are satisfied for some pair (X, T) (X and T of sizes n x nl and nl x nl, respectively), then (i) and (ii) are satisfied for any pair similar to (X, T). To prove the first statement, write C 1 · col(XTi)L;;;& = col(XT;)l:& · T,
C 1 · col(X'T';)l:& = col(X'T';)l:& · T', and comparison of these equalities gives T' =
s- 1 rs
with S given by (1.73). From the definition of the inverse matrix it follows that
so
XS =X· [col(XT;)l:&r 1 ·col(X'T';)l:& =X', and (1.72) follows with S given by (1.73). Uniqueness of S follows now from the relation col(X'T';)l:& = col(XT;)l:& · S.
D
Until now we have considered polynomials whose coefficients are matrices with complex entries. Sometimes, however, it is more convenient to accept the point of view of linear operators acting in a finite dimensional complex space. In this way we are led to the consideration of operator polynomials L(A.) = IA. 1 + I~=b AjA.i, where Aj: rcn--+ rcn is a linear transformation (or operator). Clearly, there exists a close connection between the matricial and operator points of view: a matrix polynomial can be considered as an operator polynomial, when its coefficients are regarded as linear transformations generated by the matricial coefficients in a fixed basis (usually the standard orthonormal
1.10. STANDARD PAIRS OF A MONIC MATRIX POLYNOMIAL
49
(1, 0, ... , O)T, (0, 1, ... , O)T, ... , (0, ... , 0, 1)T). Conversely, every operator polynomial gives rise to a matrix polynomial, when its coefficients are written as a matrix representation in a fixed (usually standard orthogonal) basis. So in fact every definition and statement concerning matrix polynomials can be given an operator polynomial form and vice versa. In applying this rule we suppose that the powers ( ftn) 1 are considered as Euclidean spaces with the usual scalar product: I
[(xb ... , x 1), (y 1, ... , y1)]
=
L (xi> Y;), i= 1
where X;, Yi E ftn. For example, a pair oflinear transformations (X, T), where X: ([;" ~ ([;", T: (/;" 1 ~ (/;" 1 is a standard pair for the operator polynomial L(A.) if col(XT;)!:~ is an invertible linear transformation and 1-1
LA;XT;
+ XT 1 =
0.
i=O
Theorem 1.25 in the framework of operator polynomials will sound as follows: Any two standard pairs (X, T) and (X', T') of the operator polynomial L(A.) are similar, i.e., there exists an invertible linear transformation S: (/;" 1 ~ (/;" 1 such that X' = X S and T' = S- 1 T S. In the sequel we shall often give the definitions and statements for matrix polynomials only, bearing in mind that it is a trivial exercise to carry out these definitions and statements for the operator polynomial approach.
Comments A good source for the theory of linearization of analytic operator functions, for the history, and for further references is [28]. Further developments appear in [7, 66a, 66b]. The contents of Section 1.3 (solution of the inverse problem for linearization) originated in [3a]. The notion of a Jordan chain is basic. For polynomials of type H- A, where A is a square matrix, this notion is well known from linear algebra (see, for instance, [22]). One of the earliest systematic uses of these chains of generalized eigenvectors is by Keldysh [ 49a, 49b], hence the name Keldysh chains often found in the literature. Keldysh analysis is in the context of infinitedimensional spaces. Detailed exposition of his results can be found in [32b, Chapter V]. Further development of the theory of chains in the infinitedimensional case for analytic and meromorphic functions appears in [60, 24, 38]; see also [5]. Proposition 1.17 appears in [24]. The last three sections are taken from the authors' papers [34a, 34b]. Properties of standard pairs (in the context of operator polynomials) are used in [76a] to study certain problems in partial differential equations.
Chapter 2
Representation of Monic Matrix Polynomials
In Chapter 1, a language and formalism have been developed for the full description of eigenvalues, eigenvectors, and Jordan chains of matrix polynomials. In this chapter, triples of matrices will be introduced which determine completely all the spectral information about a matrix polynomial. It will then be shown how these triples can be used to solve the inverse problem, namely, given the spectral data to determine the coefficient matrices of the polynomial. Results of this (and a related) kind are given by the representation theorems. They will lead to important applications to constant coefficient differential and difference equations. In essence, these applications yield closed form solutions to boundary and initial value problems in terms of the spectral properties of an underlying matrix polynomial.
2.1. Standard and Jordan Triples
Let L(A.) = V.. 1 + 2:}: bA iA.i be a monic matrix polynomial with standard pair (X, T). By definition (Section 1.10) of such a pair a third matrix Y of size 50
2.1. STANDARD AND JORDAN TRIPLES
51
nl x n can be defined by
y =
[:T 1-~r~10. :
xyz-t
(2.1)
I
Then (X, T, Y) is called a standard triple for L(A.).If T = J is a Jordan matrix (so that the standard pair (X, T) is Jordan), the triple (X, T, Y) will be called Jordan. From Theorem 1.25 and the definition of a standard triple it follows immediately that two standard triples (X, T, Y) and (X', T', Y') are similar, i.e., for some nl x nl invertible matrix S the relations T'
X'=XS,
=
s- 1 TS,
Y' =s-ty
(2.2)
hold. The converse statement is also true: if (X, T, Y) is a standard triple of L(A.), and (X', T', Y') is a triple of matrices, with sizes of X', T', andY' equal to n x nl, nl x nl, and nl x n, respectively, such that (2.2) holds, then (X', T', Y') is a standard triple for L(A.). This property is very useful since it allows us to reduce the proofs to cases in which the standard triple is chosen to be especially simple. For example, one may choose the triple (based on the companion matrix C 1)
X= [I
(2.3)
0 · · · 0],
In the next sections we shall develop representations of L(A.) via its standard triples, as well as some explicit formulas (in terms of the spectral triple of L(A.)) for solution of initial and boundary value problems for differential equations with constant coefficients. Here we shall establish some simple but useful properties of standard triples. First recall the second companion matrix C 2 of L(A.) defined by
c2
=
0 0 I 0 0 I
0 -Ao 0 -At 0 -A2
0 0
I -A,_!
The following equality is verified by direct multiplication: C 2 = BC 1B- 1 ,
(2.4)
52
2.
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
where A1 A2
A2
A1-1 I
I 0
I 0
A1-1 I
B=
(2.5) 0
is an invertible nl X nl matrix. In particular, c2 is, like C1, a linearization of L(il) (refer to Proposition 1.3). . Define now an nl x nl invertible matrix R from the equality R = ~Q)- 1 with B defined by (2.5) and Q = col(XT')l:A or, what is equivalent,
RBQ =I.
(2.6)
We shall refer to this as the biorthogonality condition for Rand Q. We now have
whence
RC 2
=
(2.7)
TR.
Now represent R as a block row [R 1 Ra, where Ri is an nl x n matrix. First observe that R 1 = Y, with Y defined by (2.1). Indeed,
(2.8)
where the matrix B- 1 has the following form (as can be checked directly): 0
0
Bo B1
B-1=
with B 0 =I and B 1 ,
.•• ,
... 0
Bo
Bo
B1
(2.9)
B1-1
B1 _ 1 defined recursively by
r = 0, ... , l - 2.
2.1.
53
STANDARD AND JORDAN TRIPLES
Substituting (2.9) in (2.8) yields
by definition. Further, (2.7) gives (taking into account the structure of C 2 ) R; = yi-ly
for
i
= 1, ... , l
and YA0
+
TY A 1
+ ·· · +
T 1- 1 Y A 1 _
1
+
T 1Y = 0.
(2.10)
We have, in particular, that for any standard triple (X, T, Y) the nl x nl matrix row (TiY)l;;;b is invertible. Note the following useful equalities:
fT l[Y
rril-l
TY
(2.11)
···
and B- 1 is given by (2.9). In particular, for for
i i
= 0, ... , l - 2 = l - 1.
(2.12)
We summarize the main information observed above, in the following statement: Proposition 2.1.
(i) matrix,
(ii)
If(X, T, Y) is a standard triple of L(A.), then:
row (TjY)~-:6
= [Y TY .. · T 1- 1 Y] is an nl
X is uniquely defined by T and Y: X= [0
x nl nonsingular
··· 0
I]· [row
(TjY)J:br\ (iii)
YA 0
+
TYA 1
+ .. · +
T 1- 1 YA 1_ 1
+
T 1Y = 0.
Proof Parts (i) and (iii) were proved above. Part (ii) is just equalities (2.12) written in a different form. D
For the purpose of further reference we define the notion of a left standard pair, which is dual to the notion of the standard pair. A pair of matrices (T, Y), where Tis nl x nl and Y is nl x n, is called a left standard pair for the monic matrix polynomial L(A.) if conditions (i) and (iii) of Proposition 2.1 hold. An
2. REPRESENTATION OF MONIC MATRIX POLYNOMIALS
54
equivalent definition is (T, Y) is a left standard pair of L(A.) if (X, T, Y) is a standard triple of L(A.), where X is defined by (ii) of Proposition 2.1. Note that (T, Y) is a left standard pair for L(A.) iff (YT, TT) is a (usual) standard pair for LT(A.), so every statement on standard pairs has its dual statement for the left standard pairs. For this reason we shall use left standard pairs only occasionally and omit proofs of the dual statements, once the proof of the original statement for a standard pair is supplied. We remark here that the notion of the second companion matrix C 2 allows us to define another simple standard triple (X 0 , T0 , Y0 ), which is in a certain sense dual to (2.3), as
X 0 = [0
(2.13)
· · · 0 I],
It is easy to check that the triple (X 0 , T0 , Y0 ) is similar to (2.3): X 0 =[1
0 ··· O]B-t,
C2 =BC 1 B-\
Y0 =Bcol(<5ill)l= 1 •
So (X 0 , T0 , Y0 ) itself is also a standard triple for L(A.). If the triple (X, J, Y) is Jordan (i.e., J is a Jordan matrix), then, as we already know, the columns of X, when decomposed into blocks consistently with the decomposition of J into Jordan blocks, form Jordan chains for L(A.). A dual meaning can be associated with the rows of Y, by using Proposition 2.1. Indeed, it follows from Proposition 2.1 that the matrix col(YT(JT)i)~:~ is nonsingular and AJYT
+ AIYTJT + ... + A!-1YT(JT)'-1 +
yT(JT)I = 0.
So (YT, JT) is a standard pair for the transposed matrix polynomial LT(A.). Let J = diag[J PJ;= 1 be the representation of J as a direct sum of its Jordan blocks J b ... , Jm of sizes oc 1, ... , ocm, respectively, and for p = 1, 2, ... , m, let 0
0 1 1 0
Kp =
0 1 1 0
0
be a matrix of size ocP x ocP. It is easy to check that KPJ~Kp
= Jp.
2.1.
55
STANDARD AND JORDAN TRIPLES
Therefore, by Theorem 1.22, (YTK, J) is a Jordan pair for U(A.), where K = diag[KPJ;= 1 • So the columns of YTK, partitioned consistently with the partition J = diag[J PJ;= 1 , form Jordan chains for LT(A.). In other words, the rows of Y, partitioned consistently with the partition of J into its Jordan blocks and taken in each block in the reverse order form left Jordan chains for L(A.). More generally, the following result holds. We denote by L *(A.) (not to be confused with (L(A.))*) the matrix polynomial whose coefficients are adjoint to those of L(A.): if L(A.) = L~=o AiA.i, then L*(A.) = L}=o AjA.i. Theorem2.2. If(X, T, Y) is a standard triple for L(A.), then (YT, TT, XT) is a standard triple for LT(A.) and (Y*, T*, X*) is a standard triple for L*(A.).
This theorem can be proved using, for instance, Theorem 1.23 and the definitions of standard pair and standard triple. The detailed proof is left to the reader. We illustrate the constructions and results given in this section by an example. EXAMPLE 2.1.
Let
A Jordan pair (X, J) was already constructed in Example 1.5:
X=
[1 0 01
-fi +
1
j2-
1
2
0
j2 +
1
j2 +
I
2] 0'
Let us now construct Yin such a way that (X, J, Y) is a Jordan triple of L(A.). Computation shows that
[:1 l~ XJZ
0 0 0 0 0 0
-fi +
1
fi-2
0 -1 0 -fi + 1 1 0 1 0 -fi + 1 -fi 2 0 1
fi+1 1 -fi -1 -1 fi + 1
fi+2 0 -1 1
-fi -2
56
2.
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
and
0
fi
0
0
0 0
[:Jr
-
4
-j2 2j2 + 3
y2+2
4
4
-j2-
0 0
XJl
0
0
4
4
0 0
4
0 0
h
-y~2-l
1
4
-
4
2j2 -
0 0
-1 -
4
3 -y'2 + 2
4
4
j2 -1
-y'2+1
4
4
0
4
So the Jordan triple (X, J, Y) is completed with
0
-1
j2 + 2
0 0
4
-j2-1 4
-j2 + 2 4
-
4
0
-j2 + 1 4
4
It is easy to check that
( -fi4 -
1
~)
,4 ,
(fi + o) 4
2
(resp. ( -fi4 + 1 , - ~) 4 ,
,
(-fi + 4
2 , 0 ))
are left Jordan chains (moreover, a canonical set of left Jordan chains) of L(ll) corresponding to the eigenvalue 11. 0 = 1 (resp.ll 0 = - 1), i.e., the equalities
(-v;- ,~)L(l) 1
(-v;hold, and similarly for
11. 0 =
-
1,
1.
= 0,
~)L'(l) + (fi/ 2 , 0)L(1) = D
0
2.2.
REPRESENTATIONS OF A MONIC MATRIX POLYNOMIAL
57
We conclude this section with a remark concerning products of type XTi Y, where (X, T, Y) is a standard triple of a monic matrix polynomial L(A.) of degree !, and i is a nonnegative integer. It follows from the similarity of all standard triples to each other that XTi Y does not depend on the choice ofthe triple. (In fact, we already know (from the definition of Y) that XTiY = 0 for i = 0, ... , I - 2, and XT 1- 1 Y = 1.) Consequently, there exist formulas which express XTiY directly in terms of the coefficients Ai of L(A.). Such
formulas are given in the next proposition. Proposition 2.3. Let L(A.) = IA.1 + ~};;;~ AiA.i be a monic matrix polynomial with standard triple (X, T, Y). Then for any nonnegative integer p,
where, by definition, A; = Ofor i < 0.
It is sufficient to prove Proposition 2.3 for the case that X = · · · 0], T = C 1 (the companion matrix),
Proof
[I
0
Observe (see also the proof of Theorem 2.4 below) that [I
0 · · · O]q = [ -A 0
-A 1
so Proposition2.3 is proved for p = O(in whichcasethesum L~= 1 disappears). For the general case use induction on p to prove that the /3th block entry in the matrix [I 0 · · · O]ctt" is equal to
k~1 q~1 i,+··~i.=k}J1(-Az-;) p
[
k
q
J(-Ap+k-p-1) + (-Ap-p-1),
ij>O
p=
1, ... , I.
For a more detailed proof see Theorem 2.2 of [30a]. 2.2. Representations of a Monic Matrix Polynomial
The notion of a standard triple introduced in the preceding section is the main tool in the following representation theorem.
2.
58
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
Let L(A.) = H 1 + L);;;~ AjA.j be a monic matrix polynomial of degree I with standard triple (X, T, Y). Then L(A.) admits the following representations. Theorem2.4.
(1)
right canonical form:
+ V2 A. + · · · +
L(A.) = H 1 - XT1(V1
VzA.1-
1 ),
(2.14)
1 lt;)T 1 Y,
(2.15)
where l'f are nl x n matrices such that [V1 (2)
···
VzJ = [col(XTi)l:br 1;
left canonical form: L(A.) = A.1/
-
(W1
+ A.W2 + · · · + A.1-
where W; are n x nl matrices such that col(W;)l= 1 = [row(TiY)l:br 1 ; (3)
resolventform: A. E ~\a{L).
(2.16)
Note that only X and T appear in the right canonical form of L(A.), while in the left canonical form only T and Y appear. The names "right" and "left" are chosen to stress the fact that the pair (X, T) (resp. (T, Y)) represents right (resp. left) spectral data of L(A.).
Proof Observe that all three forms (2.14), (2.15), (2.16) are independent of the choice of the standard triple (X, T, Y). Let us check this for (2.14), for example. We have to prove that if (X, T, Y) and (X', T', Y') are standard triples of L(A.), then XT 1[Vl
···
Vz] = X'(T') 1[V't
···
v;],
Y' =
s- 1 Y.
where
But these standard triples are similar:
X'=XS, Therefore [V~
v;] = =
and (2.17) follows.
[col(X'T'i)l:br 1
s- 1 [Vl
···
VzJ,
= [col(XTi)l:bsr 1
(2.17)
2.2.
59
REPRESENTATIONS OF A MONIC MATRIX POLYNOMIAL
Thus, it suffices to check (2.14) and (2.16) only for the special standard triple
X= [I
0
· · · OJ,
and for checking (2.15) we shall choose the dual standard triple defined by (2.13). Then (2.16) is just Proposition 1.2; to prove (2.14) observe that
[I
0
· · · O]q = [ -A 0
-A 1
· · ·
-A 1 _ 1]
and [Vt
···
l';] = [col([!
0
· · · OJCDl:~r 1 = I,
so
[I
0
· · · OJCUV1 V2
•· ·
l';] = [ -A 0
-A1
···
-A1-1J,
and (2.14) becomes evident. To prove (2.15), note that by direct computation one easily checks that for the standard triple (2.13) j
= 0, ... , l - 1
and So
and
[mw(C{
col(Oni):_.Jj:~r·r•y ~ [Z] ~ r_=xJ
so (2.15) follows.
T'Y
0
We obtain the next result on the singular part of the Laurent expansion (refer to Section 1. 7) as an immediate consequence of the resolvent form (2.16). But first let us make some observations concerning left Jordan chains. A canonical set of left Jordan chains is defined by matrix Y of a Jordan triple (X, J, Y) provided X is defined by a canonical set of usual Jordan chains. The lengths of left Jordan chains in a canonical set corresponding to A. 0 coincide with the partial multiplicities of L(A.) at A. 0 , and consequently, coincide with the lengths of the usual Jordan chains in a canonical set (see Proposition 1.13). To verify this fact, let L(A.)
=
E(A.)D(A.)F(A.)
60
2. REPRESENTATION OF MONIC MATRIX POLYNOMIALS
be the Smith form of L(A.) (see Section Sl.l); passing to the transposed matrix, we obtain that D(A.)is also the Smith form for LT(A.). Hence the partial multiplicities of L(A.) and LT(A.) at every eigenvalue A. 0 are the same. Corollary 2.5. Let L(A.) be a monic matrix polynomial and A. 0 E a{L). Jordan Then for every canonical set cp~>, cpy>, ... , cp~i>_ 1 (j = 1, ... , ex) of J • • chains of L(A.) corresponding to A. 0 there exists a canonical set z~>, zy>, ... , z~~>- 1 (j = 1, ... , ex) of left Jordan chains of L(A.) corresponding to A. 0 such that a:
SP(L -1(A.o)) =
ri
L L (A. j=1 k=1
rj-k
A.o)-k
L cp~~>-k-r. z~i>.
(2.18)
r=O
Note that cp~> are n-dimensional column vectors and z~> are n-dimensional rows. So the product cp~i>_k-r · z~i> makes sense and is an n x n matrix. J Proof Let (X, J, Y) be a Jordan triple for L(A.) such that the part X 0 of X corresponding to the eigenvalue A. 0 consists of the given canonical set cp~>, ... , cp~~>_ 1 (j = 1, ... , ex). Let J 0 be the part of J with eigenvalue A. 0 , and let Y0 be the corresponding part of Y. Formula (2.16) shows that (L(A.))- 1 = X 0 (AI- J 0 )- 1 Y0
+ ···,
where dots denote a matrix function which is analytic in a neighborhood of A. 0 . So (2.19) A straightforward computation of the right-hand side of (2.19) leads to formula (2.18), where z~>, ... , z~1- 1 , for j = 1, ... , ex, are the rows of Y0 taken in the reverse order in each part of Y0 corresponding to a Jordan block in J 0 . But we have already seen that these vectors form left Jordan chains of L(A.) corresponding to A. 0 • It is easily seen (for instance, using the fact that [Y JY ·. · J 1- 1 Y] is a nonsingular matrix; Proposition 2.1) that the rows z~l), ... , z~> are linearly independent, and therefore the left Jordan . z U> U> 1 , '"·-1 '" " I set (fP .. chams .or J - , ... , ex .orm a canomca c. roposihon 0 , •.• , z,r 1.16). D
The right canonical form (2.14) allows us to answer the following question: given n x nl and nl x nl matrices X and T, respectively, when is (X, T) a standard pair of some monic matrix polynomial of degree l? This happens if and only if col(XT;)L;;6 is nonsingular and in this case the desired monic matrix polynomial is unique and given by (2.14). There is a similar question: when does a pair (T, Y) of matrices T and Y of sizes nl x nl and nl x n, respectively, form a left standard pair for some monic matrix polynomial of degree l? The answer is similar: when row(T;¥)::6 is nonsingular, for then
2.2.
61
REPRESENTATIONS OF A MONIC MATRIX POLYNOMIAL
the desired matrix polynomial is unique and given by (2.15). The reader can easily supply proofs for these facts. We state as a theorem the solution of a similar problem for the resolvent form of L(A.). Theorem2.6. Let L(A.) be a monic matrix polynomial ofdegree 1and assume there exist matrices Q, T, R of sizes n x nl, nl x nl, nl x n, respectively, such that
A.¢ a(T)
(2.20)
Then (Q, T, R) is a standard triple for L. Proof Note that for IA. I sufficiently large, L - 1(A.) can be developed into a power series L -1(A.) = A.-'I
+ A.-1-1z1 + r'-2z2 + ...
for some matrices Z 1 , Z 2 , •••. It is easily seen from this development that if r is a circle in the complex plane having a(L) in its interior, then if i = 0, 1, ... ' 1 - 2 if i=l-1. But we may also chooser large enough so that (see Section Sl.8) i
= 0,
1, 2, ....
Thus, it follows from (2.20) that
~
col(QT;t:6[R TR · · · T 1- 1 R] = - 1- ( [ 2ni Jr A.'~ 1
A. ...
A_l-1
l
; L - 1 (A.)dA. A. 21-2
(2.21)
and, in particular, both col(QT;)L:6 and row(T;R)l:6 are nonsingular. Then we observe that for i = 0, ... , 1 - 1,
0 = -21 . 1 A.;L(A.)L - 1 (A.) dA. = ~ 1 A.;L(A.)Q(JA. - T)- 1 R dA. m Jr 2m Jr = (AoQ
+ A 1 QT + · · · + QT 1)T;R,
62
2.
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
and, since [R TR · · · T 1- 1R] is nonsingular, we obtain A 0 Q + · · · + A1_ 1 QT 1- 1 + QT 1 = 0. This is sufficient to show that, if S = col(QT;)l:A, then
Q =[I 0 · · · O]S,
i.e., Q, T, R is a standard triple of L(A). ExAMPLE 2.2.
D
For the matrix polynomial
described in Examples 1.5 and 2.1, the representations (2.14)-(2.16) look as follows, using the Jordan triple (X, J, Y) described in Example 2.1: (a) right canonical form: L(J.) = H
3 -
[~
0 1
-fi +
fi-
1
1
J2 + 1
0
0 0 0 0
0
0
0 0 0 0
0
0
0
0 0
0
0
0 0
0
0
0 0
0 0 0 0 -1
3
0 0
0 -1
0 0
3
2
1
0
ji
0
0
-fi
-
2)2 + 3 4
4
0 0 0 0
2]
0 -1
0
fi+2
0
4
-fi -1 ).+ -fi -1
+ 0 0 0
fio+
4
4
4
2}2- 3 4
-fi +2
4
J2- 1
-fi +
4
4
(b) left canonical form: we have to compute first [Y JY JzY]-t.
4
4 4
;.z -
0
1 4
2.2.
63
REPRESENTATIONS OF A MONIC MATRIX POLYNOMIAL
It is convenient to use the biorthogonality condition (2.6), RBQ = I, where
(as given in Example 2.1) and Az I
[A,
B
= ~z
H
0 So it is found that
0 -1 1 0 [Y
0 0 1
0 0 -1
fi- 1 -fi + 1
-2~2 + 3
ofi
JY J2Y]-I = BQ =
c
y'2 1 0
0 0
fi-
0
0 -1 2
0
and the left canonical form of L(.-\) is
L(.-\) = .-\ 3 I _ {[0 -1 0 0 0 OJ 1 0 0 0 0 0 [
0
+ ~i +
[
j2
1 0 0
I
-1
fi-
0
1
-1
-2)2 + 3 fi + 1
-Ji + 1
fi-
1
0
2
fi + 1 1
0
0
-1
0 I
0
fi + 2
3
0
4
-fi-
1
1 -
4 -I
3 -I
-fi + 2 4
-,/2 + 1 4
4
0 1
-4
0 0 -1
fi + 1 2fi + 3 fi + 1 fi + 2 1
0
'
64
2. REPRESENTATION OF MONIC MATRIX POLYNOMIALS (c) resolvent form:
L(A)-1
= [
1
0
o -fi + 1
1
j2-
1
2
0
j2 +
1
1
j2 +
2]
0 -1
A A
0
-1
j2
A- 1 -1 A- 1 A + 1 -1 A+ I
+2 4
0 0
-fi- 1
1
4
4
-fi+2 4
D
0
-fi + 1 4
4
We have already observed in Section 1.7 and Corollary 2.5 that properties of the rational function L - 1(A.) are closely related to the Jordan structure of L(A.). We now study this connection further by using the resolvent form of L(A.). Consider first the case when all the elementary divisors of L(A.) are linear. Write the resolvent form for L(A.) taking a Jordan triple (X, J, Y)
Since the elementary divisors of L(A.) are linear, J is a diagonal matrix J = diag[A. 1 · · · A.n1], and therefore (H- J)- 1 = diag[(A.- A.;)- 1 ]i'~ 1 . So (2.22) Clearly the rational matrix function L(A.)- 1 has a simple pole at every point in a(L). Let now A. 0 E a(L), and let A.;,, ... , A;k be the eigenvalues from (2.22)
which are equal to A.0 . Then the singular part of L(A.)- 1 at A.0 is just
where X 0 (Y0 ) is n x k (k x n) submatrix of X (Y) consisting of the eigenvectors (left eigenvectors) corresponding to A.0 . Moreover, denoting by X; (Y;) the ith column (row) from X (Y), we have the following representation: L(A.)-1
=
I
X; Y;. i=1A.-A.;
(2.23)
2.2.
REPRESENTATIONS OF A MONIC MATRIX POLYNOMIAL
65
In general (when nonlinear elementary divisors of L(A.) exist) it is not true that the eigenvalues of L(A.) are simple poles of L(A.)- 1• We can infer from Proposition 1.17 that the order of A. 0 E a{L) as a pole of L- 1(/l) coincides with the maximal size of a Jordan block of J corresponding to A.0 . Next we note that the resolvent form (2.16) can be generalized as follows:
x- 1L(A.)- 1 = xr- 1(/A.-
T)- 1 Y,
r
= 1, ... , l,
A.zL(A.)-t = XTz(IA. _ T)-ty +I.
(2.24)
To verify these equalities, write
according to (2.16), and use the following development of (Ill - T)- 1 into a power series, valid for IA. I large enough (I A. I > I Til): (Ill- T)- 1 = A.- 1 I
+ A.- 2 T + A.- 3 T 2 + · ··.
(2.25)
The series L~ 1 A. -iTi-l converges absolutely for IIll > I Til, and multiplying it by IA - T, it is easily seen that (2.25) holds. So for IA. I > I Til
x-
X(A.- 1 I + A.- 2 T + A.- 3 T 2 + · · ·)Y = ll'-tX(A.-'T'-t + A.-r-tTr + .. ·)Y + A.'- 1 X(A.- 1 I + A.- 2 T + · · · + A.-
A.'- 1 L(A.)- 1 =
1
The last summand is zero for r = 1, ... , land I for r = l + 1 (since XT;Y = 0 for i = 0, ... , l - 2 and XT 1- 1 Y = I by the definition of Y), and the first summand is apparently XT'- 1 (AJ - T)- 1 Y, so (2.24) follows. The same argument shows also that for r > l + 1 the equality A.'- 1L(A.)-l
= xrr- 1(/li-
T)-Iy
+
r- 2
L
A.'- 2 -iXTiY
i=l-1
holds. Another consequence of the resolvent form is the following formula (as before, (X, T, Y) stands for a standard triple of a monic matrix polynomial L(A.)): -21 . ( f(A.)L - 1(/l) dll = Xf(T) Y,
mJr
(2.26)
where r is a contour such that rr(L) is insider, and f(A.) is a function which is analytic inside and in a neighborhood of r. (Taking r to be a union of small
66
2.
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
circles around each eigenvalue of L(A.), we can assume that f(A.) is analytic in a neighborhood of a{L).) The proof of (2.26) is immediate: _21.
rJ
nz Jr
= X· -21 .
m
T)-1 y dA.
j f(A.)(A.I - T)- 1 dA. Y
Jr
= Xf(T) Y
by the definition of f(T) (see Section Sl.8). We conclude this section with the observation that the resolvent form (2.16) also admits another representation for L(A.) itself. Proposition 2.7. Let (X, T, Y) be a standard triple for the monic matrix polynomial L(A.). Then the functions E(A.), F(A.) defined by E(A.) = L(A.)X(I A. - T)- 1 ,
F(A.) = (/A. - T)- 1 Y L(A.)
are n x In, In x n matrix polynomials whose degrees do not exceed I - 1, and L(A.) Proof
= E(A.)(I A. - T)F(A.).
It follows from (2.24) that
L(A.)- 1[1
A.I
···
A.1- 1/] = X(IA.- T)- 1 row(TiY)~:;~.
Since the rightmost matrix is nonsingular, it follows that E(A.) = L(A.)X(IA. - T)- 1 = [I
A.!
· · · A.1- 1I] [row(TiY)~:~r 1
and the statement concerning E(A.) is proved. The corresponding statement for F(A.) is proved similarly. Now form the product E(A.)(IA. - T)F(A.) and the conclusion follows from (2.16). D 2.3. Resolvent Form and Linear Systems Theory The notion of a resolvent form for a monic matrix polynomial is closely related to the notion of a transfer function for a time-invariant linear system, which we shall now define. Consider the system of linear differential equations dx
dt =Ax+ Bu,
y = Cx,
x(O) = 0
(2.27)
where A, B, Care constant matrices and x, y, u are vectors (depending on t), which represent the state variable, the output variable, and the control variable, respectively. Of course, the sizes of all vectors and matrices are such
2.3.
RESOLVENT FORM AND LINEAR SYSTEMS THEORY
67
that equalities in (2.27) make sense. We should like to find the behavior of the output y as a function of the control vector u. Applying the Laplace transform z(s) = {"' e-•tz(t) dt
(we denote by (2.27)
z the Laplace image of a vector function z), we obtain from sx(s) =Ax+ Bit,
.v = c.x,
or
y=
C(Is- A)- 1Bu.
The rational matrix function F(s) = C(Is- A)- 1 B is called the transfer function of the linear system (2.27). Conversely, the linear system (2.27) is called a realization of the rational matrix valued function F(s). Clearly, the realization is not unique; for instance, one can always make the size of A bigger: if F(s) = C(Is- A)- 1B, then also
The realization (2.27) is called minimal if A has the minimal possible size. All minimal realizations are similar, i.e., if A, B, C and A', B', C' are minimal realizations of the same rational matrix function F(s), then A = s- 1 A'S, B = s- 1B', C = C'S for some invertible matrix S. This fact is well known in linear system theory and can be found, as well as other relevant information, in books devoted to linear system theory; see, for instance [80, Chapter 4]. In particular, in view of the resolvent representation, the inverse L - 1(.1) of a monic matrix polynomial L(A.) can be considered as a transfer function
where (Q, T, R) is a standard triple of L(A.). The corresponding realization is (2.27) with A = T, B = R, C = Q. Theorem 2.6 ensures that this realization of L - 1(.1) is unique up to similarity, provided the dimension of the state variable is kept equal to nl, i.e., if(A', B', C') is another triple of matrices such that the size of A' is nl and L - 1(A.) = C'(IA.- A')- 1 B',
then there exists an invertible matrix S such that C' = QS, A' = s- 1 TS, B' = s- 1R. Moreover, the proof of Theorem 2.6 shows (see in particular (2.21)), that the realization of L - 1(.1) via a standard triple is also minimal.
68
2.
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
We now investigate when a transferfunction F(A.) = Q(IA. - T)- 1R is the inverse of a~ matrix polynomial. When this is the case, we shall say that the triple (Q, T, R) is a polynomial triple.
Theorem2.8. A triple (Q, T, R) is polynomial if and only iffor some integer 1the following conditions hold: QTiR =
{~
for for
i = 0, ... , 1 - 2
i = 1- 1,
and [ QTR QR rank .
QT"~'R QTmR
QTR QT 2 R
l
. = nl ... QTm.-1R QTmR ... QTz~-zR
(2.28)
(2.29)
for all sufficiently large integers m. Proof Assume F(A.) = Q(IA. - T)- 1 R is the inverse of a monic matrix polynomial L(A.) of degree l. Let (Q 0 , T0 , R 0 ) be a standard triple for L(A.); then clearly i = 0, 1, ....
(2.30)
Now equalities (2.28) follow from the conditions (2.12) which are satisfied by the standard triple (Q 0 , T0 , R 0 ). Equality (2.29) holds for every m ;;::: 1 in view of (2.30). Indeed,
(2.31) and since rank col(Q 0 T~)i'=-01 = rank col(T~R 0 )i'=-o1 = nl for m;;::: 1, rank Zm s nl, where Zm is the mn x mn matrix in the left-hand side of (2.31). On the other hand, the matrices col( Q0 T~):: Aand row(T~ R 0 ):: Aare non singular, so (2.31) implies rank Z 1 = nl. Clearly, rank Zm :2::: rank Z 1 form :2::: 1, so in fact rank Zm = nl (m :2::: 1) as claimed. Assume now (2.28) and (2.29) hold; then
col(QTi)~:
,_o1 · row(TjR)11.:-o1 =
r~
O I
I
(2.32)
2.3.
69
RESOLVENT FORM AND LINEAR SYSTEMS THEORY
In particular, the size v of Tis greater or equal to nl. If v = nl, then (2.32) implies nonsingularity of col(QTi)l;;;6. Consequently, there exists a monic matrix polynomial of degree l with standard triple (Q, T, R), so (Q, T, R) is a polynomial triple. (Note that in this argument (2.29) was not used.) Assume v > nl. We shall reduce the proof to the case when v = nl. To this end we shall appeal to some well-known notions and results in realization theory for linear systems, which will be stated here without proofs. Let W(A.) be a rational matrix function. For a given A. 0 E (/;define the local degree b(W; A. 0 ) (cf. [3c, Chapter IV]),
!
W-q O(W; J.,)
~rank [
w-q+1 . . . w_1] w_q · · · w_2 . .
o
. .
'
w_q
where W(A.) = If= -q (A. - A.0 )iUJ is the Laurent expansion of W(A.) in a neighborhoodofA. 0 . Definealsob(W; oo) = b(W; 0), where W(A.) = W(A. - 1 ). Evidently, b(W; A. 0 ) is nonzero only for finitely many complex numbers A. 0 • Put b(W)
=
I
b(W; A.).
.l.E(u{oo)
The number b(W) is called the McMillan degree of W(A.) (see [3c, 80]). For rational functions W(A.) which are analytic at infinity, the following equality holds (see [81] and [3c, Section 4.2]):
(2.33)
where Di are the coefficients of the Taylor series for W(A.) at infinity, W(A.) = and m is a sufficiently large integer. In our case F(A.) = Q(IA. - T)- 1R, where (Q, T, R) satisfies (2.28) and (2.29). In particular, equalities (2.29) and (2.33) ensure that b(F) = nl. By a well-known result in realization theory (see, for instance, [15a, Theorem 4.4] or [14, Section 4.4]), there exists a triple of matrices (Q 0 , T0 , R 0 ) with sizes n x nl, nl x nl, nl x n, respectively, such that Q0 (I A. - T0 )- 1 R 0 = Q(I A. - T)- 1 R. Now use (2.32) for (Q0 , T0 , R 0 ) in place of (Q, T, R) to prove that (Q0 , T0 , R 0 ) is in fact a standard triple of some monic matrix polynomial of degree l. D
If=o DiA.- i,
2.
70
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
2.4. Initial Value Problems and Two-Point Boundary Value Problems
In this section we shall obtain explicit formulas for the solutions of the initial value problem and a boundary value problem associated with the differential equation
()=
L(~) d X t t
d1x(t) dI t
1 1 . dix(t) = + j=~ L.... A) d j 0 t
f( ) t'
tE[a,b]
(2.34)
where L(A.) = U 1 + ,l}:~ AiA.i is a monic matrix polynomial,f(t) is a known vector function on t E [a, b] (which will be supposed piecewise continuous), and x(t) is a vector function to be found. The results will be described in terms of a standard triple (X, T, Y) of L(A.). The next result is deduced easily from Theorem 1.5 using the fact that all standard triples of L(A.) are similar to each other. Theorem 2.9. formula
The general solution of equation (2.34) is given by the
tE [a,
b],
(2.35)
where (X, T, Y) is a standard triple of L(A.), and c E rtnt is arbitrary. In particular, the general solution of the homogeneous equation
(2.36) is given by the formula CE
fln.
Consider now the initial value problem: find a function x(t) such that (2.34) holds and x
where
X;
= xi,
(2.37)
i = 0, ... ' 1- 1,
are known vectors.
Theorem2.10. For every given set qfvectors x 0 , ... , xi_ 1 there exists a unique solution x(t) of the initial value problem (2.34), (2.37). It is given by the formula (2.35) with
c
=
[Y TY · · ·
I
0
0 0
11r :: 1. 0
Xt-1
(2.38)
2.4.
INITIAL AND TWO-POINT BOUNDARY VALUE PROBLEMS
71
Proof Let us verify that x(t) given by (2.35) and (2.38) is indeed a solution of the initial value problem. In view of Theorem 2.9 we have only to verify the equalities x
:,~~ 1 r :T 1 x
=
(2.39)
lxil-lr·
Then substitute for c from (2.38) and the result follows from the biorthogonality condition (2.6). To prove the uniqueness of the solution of the initial value problem, it is sufficient to check that the vector c from (2.35) is uniquely determined by x
D
Theorems 2.9 and 2.10 show that the matrix function
- {XeT(t-s)y G(t, s) - 0
for for
t ;;::: s t < s
is the Green's function for the initial value problem. Recall that the Green's function is characterized by the following property:
x(t)
=
J:c(t, s)f(s) ds
is the solution of (2.34) such that x
The matrix-valued function df!fined on [a, b] x [a, b] by
if a:=;,t:=;,r if T:::;, t:::;, b
(2.40)
2.
72
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
satisfies the following conditions: (a) Differentiating with respect to t, L(d/dt)G 0 = 0, for (t, r) E [a, b] x [a, b], as long as t #- T. (b) Differentiating with respect to t, for for
- G(r)l - {0/ G(r)l 0 t=r+ 0 t=r- -
(c)
r = 0, 1, ... , l - 2 r = l- 1.
The function u(t) =
is a solution of L(d/dt)u = Proof
(a)
f
G0 (t, r)f(r) dr
f.
We have a~t
T
< t
~b.
Proposition 1.10 implies that the summation applied to each elementary divisor associated with J 1 or J 2 is zero, and it follows that L(d/dt)G 0 = 0 if t #- T. (b) We have
and the conclusion follows from (2.12). (c) This part is proved by verification. For brevity we treat the case l = 2. The same technique applies more generally. Write u(t) = -X 1 e1 ' 1 fe-hrYd(r) dr
+ X 2 eht
Differentiating and using the fact that X 1 Y1 u0 >(t) = -X 1 J 1 e1 ' 1 fe-J'rYd(r) dr
J:e-hrY2 f(r) dr.
+ X 2 Y2 =
+ X 2 J 2 eht
XY = 0, we obtain
J:e-hrY2 f(r) dr.
Differentiate once more and use XY = 0, XJY = I to obtain u< 2 >(t)
=
-X 1 Jfeht fe-J'rYd(r) dr
+ X 2 J~eht
J:e-hrY2 f(r) de+ f(t).
2.4. INITIAL AND TWO-POINT BOUNDARY VALUE PROBLEMS
Then, since
73
L;=o ArXJr = 0 fori= 2, we obtain L(!!_)u dt
=
I Aru(r)
=f.
D
r=O
We now seek to modify G0 in order to produce a Green's function G which will retain properties (a), (b), and (c) of the lemma but will, in addition, satisfy two-point boundary conditions. To formulate general conditions of this kind, let Yc denote the In-vector determined by Yc = col(y(c), y( 0 (c), ... , y(l-l)(c)) where y(t), t E [a, b], is an (l- 1)-times continuously differentiable n-dimensional vector function, and c E [a, b]. Then let M, N be In x In constant matrices, and consider homogeneous boundary conditions of the form Mya
+ Nyb =
0.
Our boundary value problem now has the form (2.41)
L(:t)u = f,
for the case in which the homogeneous problem (2.42)
L(:t)u = 0,
has only the trivial solution. The necessary and sufficient condition for the latter property is easy to find. Lemma 2.12. The homogeneous problem (2.42) has only the trivial solution if and only if (2.43)
where Q = col(XJ;)~:b. Proof We have seen that every solution of L(d/dt)u = 0 is expressible in the form u(t) = Xe 11c for some c E fl 1n, so the boundary condition implies
Mua
+ Nub
= (MQe 10
+ NQe 1 b)c =
0.
Since this equation is to have only the trivial solution, the In x In matrix MQe1 a + NQe 1 b is nonsingular. D Theorem 2.13. If the homogeneous problem (2.42) has only the trivial solution, then there is a unique In x n matrix K(r) independent oft and depending analytically on r, in [a, b ],for which the function G(t, r) = G0 (t, r)
+ Xe 11K(r)
(2.44)
74
2.
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
l ~~. ~ l
satisfies conditions (a) and (b) of Lemma 2.11 (applied to G) as well as
(d)
V(G)~ Mr :.~
+ Nr
G0 - 1 >(a, r)
=
0.
G0 - 1 >(b, r)
In this case, the unique solution of(2.41) is given by u(t)
= f G(t, r)f(r) dr.
(2.45)
Proof It is easily seen that G inherits conditions (a) and (b) from G0 for any K(r). For condition (d) observe that V(G) = 0 if and only if V(XeJ 1K(r)) = - V(G 0 ). This may be written (MQeJa
+ NQeJb)K(r) = -
V(G 0 ),
(2.46)
where Q = col(XJi)l:A. We have seen that our assumption on the homogeneous problem implies that the matrix on the left is nonsingular, so that a unique K is obtained. The analytic dependence of K on r (through V(G 0 )) is clear. Finally, with u defined by (2.45) we have the boundary conditions satisfied for u in view of (d), and L(:t)u
=
L(:t){f G 0 (t, r)f(r) dr
+ XeJt
fK(r)f(r) dr}
The first integral reduces to f by virtue of condition (c) in Lemma 2.11, and L(d/dt)XeJt = (L~=o A,XJ')eJt is zero because of Theorem 1.23. This completes the proof. 0 We remark that there is an admissible splitting of J implicit in the use of G0 in (2.44). The representation of the solution in the form (2.45) is valid for any admissible splitting. As an important special case of the foregoing analysis we consider secondorder problems: L(:t)u
= A 0 + A 1 u< 1> + Ju< 2 > = f
with the boundary conditions u(a) = u(b) = 0. These boundary conditions are obtained by defining the 2n x 2n matrices
2.5. COMPLETE PAIRS AND SECOND-QRDER DIFFERENTIAL EQUATIONS
75
where the blocks are all n x n. Insisting that the homogeneous problem have only the trivial solution then implies that the matrix (2.47) appearing in (2.43) is nonsingular. In this case we have V(Go) =
-M[
Xl ]eit
+
N[
X2 ]eh
= [-XleJ,(a-<)yl] X2eh
J
K(r) =
XeJa]-1[ XleJ,(a-<)yl [ XeJb -X2eh
(2.48)
yielding a more explicit representation of the Green's function in (2.44). The condition on a, b, and the Jordan chains of L(A.), which is necessary and sufficient for the existence of a unique solution to our two-point boundary value problem is, therefore, that the matrix (2.47) be nonsingular.
2.5. Complete Pairs and Second-Order Differential Equations
If L(A.) = H 2 + A 1 A. + A 2 and there exist matrices S, Tsuch that L(A.) = (H - T)(I A. - S), then H - Sis said to be a (monic) right divisor of L. More general divisibility questions will be considered in subsequent chapters but, for the present discussion, we need the easily verified proposal: I A. - S is a
right divisor of L
if and only if L(S) = S 2 + A1S
+ A2
= 0.
In general, L(A.) may have many distinct right divisors but pairs of divisors of the following kind are particularly significant: right divisors H- S 1, H - S 2 for L(A.) = H 2 + A 1 A. + A 2 are said to form a complete pair if s2 - 1 is invertible.
s
Lemma2.14.
If H- S1 , H- S 2 form a complete pair for L(A.), then
L(A.) = (S2 - S1)(H- S 2)(S 2 - S 1)- 1(H - S 1) and
76
2. Proof
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
By definition of right divisors, there exist matrices T1 , T2 such that
Consequently, T1 + S 1 = T2 + S 2 and T1S 1 = T2 S 2. It follows from these two relations, and the invertibility of S 2 - S 1 that we may write T 1 = (S 2 - S t)Sz(S 2 - S 1 )~ 1, and the first conclusion follows immediately. Now it is easily verified that (H- S 1 )~ 1 - (H- S 2 )~ 1
= -(I A-
S 1 )~
1(S 2 - St)(H- S 2 )~ 1.
If we take advantage of this relation, the second conclusion of the lemma now follows from the first. D The importance of the lemma is twofold: It shows that a complete pair gives a complete factorization of L(A) and that CI(L) = CI(S 1) u CI(S 2). Indeed, the first part of the proof of Theorem 2.16 will show that St. S 2 determine a Jordan pair for L. We shall see that complete pairs can also be useful in the solution of differential equations, but first let us observe a useful sufficient condition ensuring that a pair of right divisors is complete. Theorem 2.15. Let !A - St. !A - S 2 be right divisors of L(A) such that CI(S1) n CI(S2) = 0. Then !A- sb !A- S2form a complete pair.
Proof
Observe first that (2.49)
Indeed, since L(A)(IA - SJ~ 1 is a monic linear polynomial fori= 1, 2, the left-hand side of (2.49) is a constant matrix, i.e., does not depend on A. Further, for large IAI we have (IA- sa~ 1
= n~ 1
+
s;r 2
+
s?r 3
+. ··,
i = 1, 2;
so lim{A 2[(JA-
S 1 )~
1
- (H- S 2 )~ 1 ]}
= S 1 - S2.
1.-oo
Since lim;._ oc { r 2 L(A)} = I, equality (2.49) follows. Assume now that (S 1 - S2 )x = 0 for some x E Equality (2.49) implies that for A¢: CI(L),
rzn.
(2.50) Since CI(S d n CI(S 2 ) = 0, equality (2.50) defines a vector function E(A) which is analytic in the whole complex plane:
A E C\CJ(S 1) A E C\CJ(S 2).
2.5. COMPLETE PAIRS AND SECOND-ORDER DIFFERENTIAL EQUATIONS
77
But lim.<_,oo E(A.) = 0. By Liouville's theorem, E(A.) is identically zero. Then x = (/..1.- St)E(A.) = O,i.e.,Ker(S 1 - S 2) = {O}andS 1 - S 2 isnonsingular. D
Theorem 2.16. LetS~> S 2 be a complete pair of solutions of L(S) = 0, where L(A.) = IA 2 + A1 . 1. + A 0 • Then (a) Every solution of L(d/dt)u = 0 has the form
for some c~> c 2 E fl". (b) If, in addition, es 2
u(a) = u(b) = 0,
(2.51)
has the unique solution u(t) = ib G0 (t, r)f(r) dr - [es,(t-a),
eS2
where I
eS2(b-a)
J
-1
'
and (2.53)
The interesting feature of this result is that the solutions of the differential equation are completely described by the solutions S 1, S 2 of the algebraic problem L(S) = 0.
Proof Let S 1 , S 2 have Jordan forms X1J1X!\ S2 = X 2 J 2 X2 1 and write
J~>
J 2 , respectively, and let S 1 =
We show first that S t• S2 a complete pair implies that X, J is a Jordan pair for L(A.). We have
78
2.
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
So the nonsingularity of the first matrix follows from that of S 2 - S 1 • Then Sf + A 1S; + A 0 = 0, for i = 1, 2, implies X;Jf + A 1X;J; + A 0 X; = 0 for each i and hence that XJ 2 + A 1XJ + A 0 X = 0. Hence (by Theorem 1.23) X, J is a Jordan pair for L(A.) and, in the terminology of the preceding section, J 1 and J 2 determine an admissible splitting of J. To prove part (a) of the theorem observe that u(t)
= [e8' 1 e821 ] [~:] = [X 1eJ 11 X 2eh 1] [~~:~:] = XeJ 1<1.,
where <1. is, in effect, an arbitrary vector of ([; 2 ". Since X, J is a standard pair, the result follows from Theorem 2.9. For part (b) observe first that, as above,
XeJt
=
[es•tx1
es2tX2]
so that the matrix of (2.47) can be written
and the hypothesis of part (b) implies the nonsingularity of this matrix and the uniqueness of the solution for part (b). Thus, according to Theorem 2.13 and remarks thereafter, the unique solution of (2.51) is given by the formula
where
and if a~ t if r ~ t
~
r,
~b.
It remains to show that formulas (2.52) and (2.54) are the same. First note that Y1 = -X! 1Z, Y2 = X2 1Z. Indeed,
(2.55)
2.6.
79
DIFFERENCE EQUATIONS AND NEWTON IDENTITIES
Furthermore, the function G0 of (2.53) can now be identified with the function G0 of (2.55). To check this use the equality XieJ,(r-r)xi- 1 = es,(r-r), i = 1, 2 (cf. Section S1.8). Further,
XcJr = [X1eJ•'
Xzeh'] = [es•'
= [es.(t-a) es2(t-b)] [e"sO•x1
es2']
[~1
;J
J
0 ebs2xz .
So
= -[est(r-a) es2(r-b)] [e"sO•X1
J
0 [XeJa]-1[eSt(a-r)Jz ebs2x 2 XeJb es2(b-r) '
and in order to prove that formulas (2.52) and (2.54) are the same, we need only to check that
W= [
e"s'X
1
0
or
w- 1 '~
[es,~-a) es2~-a)] = [~:~:] [X!l;-as, X21~-bS2l
But this follows from the equalities Si
= XJiXi- 1, i = 1, 2. D
2.6. Initial Value Problem for Difference Equations, and the Generalized Newton Identities
Consider the difference equation
r = 0, 1, ... ,
(2.56)
where L(.A) = H 1 + 2}~ ~ A iA,i is a monic matrix polynomial with a standard triple (X, T, Y). Using Theorem 1.6 and the similarity between any two standard triples of L(A.), we obtain the following result. Theorem 2.17. i
Every solution of (2.56) is given by u 0 = Xc and for
= 1, 2, ... ' ui = X Tic
+X
i-1
I
k= 0
yi-k- 1Yfko
(2.57)
80
2.
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
where c E fln 1 is arbitrary. In particular, the general solution of the homogeneous problem (with f.. = 0, r = 0, 1, ... ) is given by i = 0, 1, ... ,
Corollary2.18. conditions
C E fln 1•
The solution of(2.56) which satisfies the prescribed initial
r = 0, ... , l - 1, is unique and given by putting A1
A2
A1-1
I
A2 c = [Y
TY ...
Tl-1 Y] A1-1
I
I 0
0 0
[5,]
in Theorem 2.17. Proof
Write
(2.58)
where u; is given by (2.57). Since XT;Y = 0 fori= 0, ... , l - 2, we obtain u; = XT;c for i = 0, ... , l - 1, and (2.58) becomes col(XT;)l: ~ · c = col(a;)l: ~. Since col(XT;)l:~ is invertible, c E (f;n 1 is uniquely defined and, by the biorthogonality condition (2.6),
c = [col(XT;)l:~r 1 · col(a;)l:~ = [row(T;Y)l:~]B · col(a;)l:~.
D
As an application of Theorem 2.17, we describe here a method of evaluation of the sum a, of the rth powers of the eigenvalues of L(A.) (including multiplicities). Consider the matrix difference equation connected with the polynomial L(A.) = IA.1 + L~;;;& AjA.j:
r
= 1, 2, ... , (2.59)
2.6.
81
DIFFERENCE EQUATIONS AND NEWTON IDENTITIES
where {U,}:;, 1 is a sequence of n x n matrices U, to be found. In view of Theorem 2.17 the general solution of (2.59) is given by the following formula: r = 1, 2, ... ,
(2.60)
where (X, T) is a part of the standard triple (X, T, Y) of L(A.) and Z is an arbitrary In x n matrix. Clearly, any particular solution of (2.59) is uniquely defined by the initial values u 1• . . . ' ul. We now define matrices S~> S 2 , .•. , S1 by the solution of
(2.61)
and we observe that, in practice, they would most readily be computed recursively beginning with S 1 . Note that the matrix on the left is nonsingular. Define {S,}:;, 1 as the unique solution of(2.59) determined by the initial values U; = S;, i = 1, ... , l. Let a, be the sum of the rth powers of the eigenvalues of L (including multiplicities), and let Tr(A) of a matrix A denote the trace of A, i.e., the sum of the elements on the main diagonal of A, or, what is the same, the sum of all eigenvalues of A (counting multiplicities).
Theorem 2.19.
If {S,}:;, 1 is defined as above, then
r = 1, 2, ....
(2.62)
The equalities (2.62) are known as the generalized Newton identities. Proof
By Corollary 2.18,
or (2.63) where r, = xr- 1 Y. Now we also have (see Proposition 2.1 (iii))
r = 1, 2, ... ,
82
2.
REPRESENTATION OF MONIC MATRIX POLYNOMIALS
and this implies that (2.63) can be written
r = 1, 2, . . . .
(2.64)
To approach the eigenvalues of L we observe that they are also the eigenvalues of the matrix M = QTQ- 1 • Since Q- 1 = RB, where R=[Y TY
···
T 1- 1 Y]
and B is given by (2.5), we have: M' = QT'Q- 1 = QT'RB r,+ 1 r,+l+1 rr+l+ 1
0
0
0
][A
:
:
0
0
r,+21-1
1
A2
A2
· ·:
.·
I]
I
I
.
0
Now observe that the eigenvalues of M' are just the rth powers of the eigenvalues of M (this fact can be easily observed by passing to the Jordan form of M). Hence
a,= Tr(M') = Tr(r,+ 1A1 + ... + r,+ 1A1) + ... + Tr(r,+ 1- 1A 1_ 1 + r,+ 1A 1) + Tr(r,+ 1A 1) = Tr(t,+ 1A 1 + 2r,+ 2A 2 + · · · + 1r,+ 1A 1) = Tr(S,)
by (2.64).
0
Comments The systematization of spectral information in the form of Jordan pairs and triples, leading to the more general standard pairs and triples, originates with the authors in [34a, 34b], as do the representations of Theorem 2.4. The resolvent form, however, has a longer history. More restrictive versions appear in [40, 52a]. As indicated in Section 2.3 it can also be seen as a special case of the well-known realization theorems of system theory (see [80, Chapter 4] and [47]). However, our analysis also gives a complete description of the realization theorem in terms of the spectral data. For resolvent forms of rational matrix functions see [3c, Chapter 2]. Corollary 2.5 was proved in [ 49a]; see also [38, 55]. Factorizations of matrix polynomials as described in Proposition 2. 7 are introduced in [79a]. Our discussion of initial and two-point boundary value problems in Sections 2.4 and 2.5 is based on the papers [34a, 52e]. The concept of"com-
2.6.
DIFFERENCE EQUATIONS AND NEWTON IDENTITIES
83
plete pairs" is introduced in [51]. Existence theorems can be found in that paper as well as Lemma 2.14 and Theorem 2.15. Existence theorems and numerical methods are considered in [54]. The applications of Section 2.6 are based on [ 52f]. Extensions of the main results of this chapter to polynomials whose coefficients are bounded linear operators in a Banach space can be found in [34c]. Some generalizations of this material for analytic matrix functions appear in [37d, 70c].
Chapter 3
Multiplication and Divisibility
In Chapter 2 we used the notion of a standard triple to produce representation theorems having immediate significance in the solution of differential and difference equations. It turns out that these representations of matrix polynomials provide a very natural approach to the study of products and quotients of such polynomials. These questions are to be studied in this chapter. There are many important problems which require a factorization of matrix- or operator-valued functions in such a way that the factors have certain spectral properties. Using the spectral approach it is possible to characterize right divisors by specific invariant subspaces of a linearization, which are known as supporting subs paces for the right divisors. This is one of the central ideas of the theory to be developed in this chapter, and admits explicit formulas for divisors and quotients. These formulas will subsequently facilitate the study of perturbation and stability of divisors. If a monic matrix polynomial L(A.) = LiA.)L 1 (A.) and L 1(A.) and L 2 (A.) are also monic matrix polynomials, then L 1 (A.), LiA.) may be referred to as a right divisor and left quotient of L(A.), respectively (or, equally well, as a right quotient and left divisor, respectively).It is clear that the spectrum of L must "split" in some way into the spectra of L 1 and L 2 • Our first analysis of products and divisors (presented in this chapter), makes no restriction on the
84
3.1.
85
A MULTIPLICATION THEOREM
nature ofthis splitting. Subsequently (in the next chapter) we consider in more detail the important special case in which the spectra of L 1 and L 2 are disjoint. 3.1. A Multiplication Theorem
The notion of a standard triple for monic matrix polynomials introduced in the preceding chapter turns out to be a useful and natural tool for the description of multiplication and division of monic matrix polynomials. In this section we deal with the multiplication properties. First we compute the inverse L - 1 (A.) of the product L(A.) = L 2 (A.)L 1 (A.) of two monic matrix polynomials L 1 (A.) and Lz(A.) in terms of their standard triples. Theorem 3.1. Let L;(A) be a monic operator polynomial with standard triple (Q;, 7;, R;) fori = 1, 2, and let L(A.) = L 2 (A.)L 1 (A.). Then L - 1 (A-) = [Q 1
O](H- T)- 1
[;J,
(3.1)
where
T
=
Proof
[~ R~;z].
It is easily verified that
1 (H _ T)-1 = [(H - 0Tt)-
(IA,- Tt)- 1 RtQz(H- T2 )- 1 ] (H-T2 )- 1
•
The product on the right of (3.1) is then found to be Q 1 (JA-- T1 )- 1R 1 Q 2 (JA-- T2 )- 1 R 2 •
But, using Theorem 2.4, this is just L1 1 (A.)L2 1 (A.), and the theorem follows immediately. D Theorem 3.2. If L;(A.) are monic matrix polynomials with standard triples (Q;, 7;, R;) fori= 1, 2, then L(A.) = Lz(A.)L 1 (A.) has a standard triple (Q, T, R)
with the representations
Q = [Qt 0], Proof
Combine Theorem 2.6 with Theorem 3.1.
D
Note that if both triples (Q;, 7;, R;), i = 1, 2, are Jordan, then the triple (Q, T, R) from Theorem 3.2 is not necessarily Jordan (because of the presence
3.
86
MULTIPLICATION AND DIVISIBILITY
of R 1 Q2 in T). But, as we shall see, in some cases it is not hard to write down a Jordan triple for the product, by using Theorem 3.2. Consider, for instance, the case when (J(T1 ) n (J(T2 ) = 0. In this case one can construct a Jordan triple of the product L(A.) = LiA.)L 1 (A.), provided the triples (Q;, 1';, R;) for L;(A.) (i = 1, 2) are also Jordan. Indeed, there exists a unique solution Z of the equation (3.2) (see Section S2.1); so
Hence, the triple (Q 1 , Q 1
Z)
= [Q 1
OJ[~ ~l [~ ~l
[-::
2
]
=
[~ -~ZJ[;J
(3.3)
is similar to the standard triple (Q, T, R) of L(A.) constructed in Theorem 3.2, and therefore it is also a standard triple of L(A.). Now (3.3) is Jordan provided both (Q 1 , T1 , R 1 ) and (Q 2 , T2 , R 2 ) are Jordan. Recall that equation (3.2) has a solution Z if and only if
. . .1ar to
IS s1m1
[T10 TzOJ
(see Theorem S2.1); then the arguments above still apply, and we obtain the following corollary. Corollary 3.3. Let L 1 (A.), L 2 (A.) be monic matrix polynomials such that, if (A. - A.0 Y'J, i = 1, ... , kj, is the set of the elementary divisors of LiA.) U = 1, 2) corresponding to any eigenvalue of L(A.) = L 2 (A.)L 1 (A.), then (A. - A. 0 )a•J, i = 1, ... , kj,j = 1, 2, is thesetoftheelementarydivisorsofL(A.)corresponding to A.0 . Let (Q;, 1';, R;) be a Jordan triple of L 1(A.), i = 1, 2. Then Eq. (3.2) has a solution Z, and the triple
is a Jordan triple for L(A.).
If the structure of the elementary divisors of L(A.) is more complicated than in Corollary 3.3, it is not known in general how to find the Jordan triples for the product. Another unsolved problem is the following one (see [27, 71a, 76b]) :Given the elementary divisors of the factors L 1 (A.) and L 2 (A.) (this means, given Jordan matrices T1, T2 in the above notation), find the elementary
3.1.
87
A MULTIPLICATION THEOREM
divisors of the product L(A.) = Lz(A.)L 1 (A.) (this means, find the Jordan form of the matrix
In some particular cases the solution of this problem is known. For instance, the following result holds: Lemma 3.4. Let J 1 , .•. , Jk be Jordan blocks having the same real eigenvalue and sizes oc 1 ~ · · · ~ ocb respectively. Let = [diag[J;]7= 1
T
0
To diag[JtJ7= 1
J
where T0 = (t;)f.i= 1 is a matrix of size oc x oc (oc = oc 1 + · · · + ock). Let {3; = oc 1 + · · · + oc; (i = 1, ... , k), and suppose that the k x k submatrix A = (tp,p)~.j= 1 ofT0 is invertible and A = A 1 A 2 , where A 1 is a lower triangular matrix and A 2 is an upper triangular matrix. Then the degrees of the elementary divisors of!A. - Tare 2oc 1 , •.• , 2ock. The proof of Lemma 3.4 is quite technical. Nevertheless, since the lemma will be used later in this book, we give its proof here.
Proof We shall apply similarity transformations to the matrix Tstep by step, and eventually obtain a matrix similar to T for which the elementary divisors obviously are 2oc~> ... , 2ock. For convenience we shall denote block triangular matrices of the form
by triang[K, M, N]. Without loss of generality suppose that J = diag[J;]~= 1 is nilpotent, i.e., has only the eigenvalue zero. Let i ¢: {{3 1 , {3 2 , ... , f3k} be an integer, 1 :::;; i :::;; oc. Let Uil be the oc x oc matrix such that all its rows (except for the (i + l)th row) are zeros, and its (i + l)th row is [t; 1 t; 2 • • • t;a]. Put Sil = triang[J, Uil, I]; then Sil TS;). 1 = triang[J, 7; 0 , JT], where the ith row of T; 0 is zero, and all other rows (except for the (i + l)th) of T; 0 are the same as in T0 . Note also that the submatrix A is the same in 7; 0 and in T0 • Applying this transformation consequently for every i E {1, ... , oc}\{f3t. ... , f3k}, we obtain that Tis similar to a matrix of the form T1 = triang[J, V, JT], where the ith row of Vis zero fori¢: {{3 1 , ... , f3k}, and vf3v/3q = tf3v/3q' p, q = 1, ... , k, where V = (v;)f.i=t· We shall show now that, by applying a similarity transformation to T1 , it is possible to make vii= 0 for i ¢: {{3 1 , ..• , {3d or j ¢: {{3 1 , ••• , f3k}. Let J'] = col(v;)f= 1 be the jth column of V. For fixed j ¢: {{3 1 , . . . , {3d we have vii = 0
88
3. MULTIPLICATION AND DIVISIBILITY
for i r!. {/3 1, ... , f3k}, and, since A is invertible, there exist complex numbers a 1, ••• , ak such that k
J-j
+ L Up, Vp, = 0.
i= 1 LetS 2 i = diag[ I, I + U 2 i], where all but the jth column of U 2 i are zeros, and the ith entry in the jth column of U 2 i. is aPrn if i = !3m for some m, and zero otherwise. Then S2/T1 S 2 i = triang[J, V- Z 1 , JT- Z 2 ] with the following structure of the matrices Z 1 and Z 2 : Z 1 = [0 · · · 0 J-j 0 · · · 0], where J-j is in the jth place; Z 2 = [0 · · · 0 Z 2 .i_ 1 0 · · · 0], where Z 2 ,i_ 1 =col(col( -bP·"'•aq)~'k 1 )~= 1 is an oc x 1 column in the (j- 1)th place in Z 2 (bP·"'• denotes the Kronecker symbol). Ifj =/3m+ lfor some m, then we put Z 2 = 0. It is easy to see that S2/T1S 2 i can be reduced by a similarity transformation with a matrix of type diag[ I, U 3i] to the form triang[J, W, JT], where the mth column of W1 coincides with the mth column of V- Z 1 for m ~ j and formE {/3 1 , ... , f3d. Applying this similarity consequently for every j r!. {/3 1, .•. , f3d (starting withj = f3k - 1 and finishing withj = 1), we obtain that T1 is similar to the matrix T2 = triang[J, W2 , JT], where the (/3;, f3) entries of W2 (i,j = 1, ... , k) form the matrix A, and all others are zeros. The next step is to replace the invertible submatrix A in W2 by I. We have 1 A1 A = A 2 , where A1 1 = (b;)~.i= 1 , bii = 0 fori <j, is a lower triangular matrix; A 2 = (c;)~.i= 1, cii = 0 fori > j, is an upper triangular matrix. Define the 2oc x 2oc invertible matrix S 3 = (s~j>)~"'i= 1 as follows: s~~~i = bii for i, j = 1, ... , k; s!;~ = 1 form r!. {/3 1 , ••• , f3k}; s!J> = 0 otherwise. Then S 3 T2 S3 1 = triang[J + Z 3 , W3 , JT], where the (/3;, f3) entries ofW3 (i,j = 1, ... , k)form the upper triangular matrix A 2 , and all other entries of W3 are zeros; the oc x oc matrix Z 3 can contain nonzero entries only in the places (/3;- 1, f3) fori> j and such that oc; > 1. It is easy to see that S3 T2 S3 1 is similar to T3 = triang[J, w3' JT]. Define the 2oc X 2oc invertible matrix s4 = (s~j>)~}= 1 by the following equalities: s~1~,; = cii for i,j = 1, ... , k; s~~ = 1 for m r!. {/3'1 , ... , f3i}, where p; = f3k + /3;; s~j> = 0 otherwise. Then S4 T3 S4 1 = triang[J, W4 , JT + Z 4 ], where the (/3;, f3) entries of W4 form the unit matrix I, and all other entries are zeros; the oc x oc matrix Z 4 can contain nonzero entries only in the places (/3;, f3i- 1) for i < j and such that oci > 1. Again, it is easy to see that S4 T3 S4 1 is similar to T4 = triang[J, W4 , JT]. Evidently, the degrees of the elementary divisors of T4 are 2oc 1 , ... , 2ock. So the same is true for T. D
In concrete examples it is possible to find the Jordan form of T by considering the matrix
[~1 R~;2]. as we shall now show by example.
3.2.
89
DIVISION PROCESS
ExAMPLE
3.1.
Let
Using the computations in Example 2.2 and Theorem 3.2, we find a linearization T for the product L(A.) = (L 1 (.1)) 2 : T =
[J
YXJ J ,
0
where
X=[~
0
-fi+
fi-2 fi
I
+I
fi+2] 0 ,
0 0 -1
Hfi + 2) Y= ¥-fi-1) ¥-fi + 2) ¥-fi + 1)
1 0 0 1
4
0 0
J=
'
0
0
-1 0 -1
1
-4
By examining the structure ofT we find the partial multiplicities of L(l) atA 0 = 1. To this end we shall compute the 4 x 4 submatrix T0 ofT formed by the 3rd, 4th, 9th, and lOth rows and columns (the partial multiplicities of L(l) at .1 0 = 1 coincide with the degrees of the elementary divisors of /A - Y0 ). We have
T, = 0
Since rank(T0 is4. 0
-
/)
=
1 [0 0 0
1 I 0 0
-tfi t
1 0
-!
1
!{2fi - 3) . 1 1
3, it is clear that the single partial multiplicity of L(l) at .1 0
=
1
3.2. Division Process
In this section we shall describe the division of monic matrix polynomials in terms of their standard triples. We start with some remarks concerning the division of matrix polynomials in general. Let M(A.) = MiA.i and N(A.) = L~=o NiA.i be matrix polynomials of size n x n (not necessarily monic). By the division ofthese matrix polynomials we understand a representation in the form
D=o
M(A.) = Q(A.)N(A.)
+ R(A.),
(3.4)
90
3.
MULTIPLICATION AND DIVISIBILITY
where Q(A.) (the quotient) and R(A.) (the remainder) are n x n matrix polynomials and either the degree of R(A.) is less than the degree of N(A.) or R(A.) is zero. This definition generalizes the division of scalar polynomials, which is always possible and for which the quotient and remainder are uniquely defined. However, in the matricial case it is very important to point out two factors which do not appear in the scalar case: (1) Because of the noncommutativity, we have to distinguish between representation (3.4) (which will be referred to as right division) and the representation (3.5) for some matrix polynomials Q 1 (A.) and R 1 (A.), where the degree of R(A.) is less than the degree of N(A.), or is the zero polynomial. Representation (3.5) will be referred to as left division. In general, Q(A.) =1= Q 1(A.) and R(A.) =1= R 1(A.), so we distinguish between right (Q(A.)) and left (Q 1 (A.)) quotients and between right (R(A.)) and left (R 1 (A.)) remainders. (2) The division is not always possible. The simplest example of this situation appears if we take M(A.) = M 0 as a constant nonsingular matrix and N(A.) = N 0 as a constant nonzero singular matrix. If the division is possible, then (since the degree of N(A.) is 0) the remainders R(A.) and R 1(A.) must be zeros, and then (3.4) and (3.5) take the forms (3.6) respectively. But in view of the invertibility of M 0 , neither of (3.6) can be satisfied for any Q(A.) or Q 1(A.), so the division is impossible. However, in the important case when N(A.) (the divisor) is monic, the situation resembles more the familiar case of scalar polynomials, as the following proposition shows.
Proposition 3.5. If the divisor N(A.) is monic, then the right (or left) division is always possible, and the right (or left) quotient and remainder are uniquely determined. Proof We shall prove Proposition 3.5 for right division; for left division the proof is similar. Let us prove the possibility of division, i.e., that there exist Q(A.) and R(A.) (deg R(A.) < deg N(A.)) such that (3.4) holds. We can suppose 1~ k; otherwise take Q(A.) = 0 and R(A.) = M(A.). Let N(A.) = JA.k + z:~;;;J NiA.i, and write (3.7)
91
3.2. DIVISION PROCESS
with indefinite coefficients Qj and Rj. Now compare the coefficients of A.1, A.1-l, ... , A_k in both sides:
M1 M1-1
= Ql-k' = Ql-k-1 + Ql-kNk-t> (3.8) k-1
Mk = Qo
+ L Qk-;N;. i=O
From these equalities we find in succession Q1-k, Q1_k_ 1, found Q(A.) = 2:}-:~ A.jQj. To satisfy (3.7), define also
•.. ,
Q0 , i.e., we have
j
Rj = Mj -
L QqNj-q
j = 0, ... , k - 1.
for
(3.9)
q=O
The uniqueness of the quotient and remainder is easy to prove. Suppose
M(A.) = Q(A.)N(A.)
+ R(A.) =
Q(A.)N(A.)
+ R(A.),
where the degrees of both R(A.) and R(A.) do not exceed k - 1. Then
(Q(A.) - Q(A.))N(A.)
+ R(A.) -
R(A.) = 0.
Since N(A.) is monic of degree k, it follows that Q(A.) - Q(A.) = 0, and then also R(A.) = R(A.) = 0. 0 An interesting particular case of Proposition 3.5 occurs when the divisor = /A - X. In this case we have extensions of the remainder theorem:
N(A.) is linear: N(A.)
Corollary 3.6.
Let
l
L MjA.j =
Q(A.)(H- X)+ R =(/A- X)Q1(A.)
+ R!.
j=O
Proof
Use formulas (3.8) and (3.9). It turns out that i
According to (3.9), l
R = M0
+ Q0 X
=
l:MjXj. j=O
For R 1 the proof is analogous.
0
= 1, ... ' l.
3.
92
MULTIPLICATION AND DIVISIBILITY
We go back now to the division of monic matrix polynomials. It follows from Proposition 3.5 that the division of monic matrix polynomials is always possible and uniquely determined. We now express the left and right quotients and remainders in terms of standard triples. This is the first important step toward the application of the spectral methods of this book to factorization problems. Let L(A) = H 1 + I,;:~ AiAi be a monic matrix polynomial
Theorem 3.7. of degree l, and let
L1(A) = Hk- X 1T~(V1 + · · · + V,.Ak- 1 ) be a monic matrix polynomial of degree k ::;; lin the right canonical form, where (X 1 , T1 ) is a standard pair of LlA), and [V1 • • • V,.] = [col(X 1 TDt,;:-Jr 1 . Then L(A) = Q(A)L 1(A)
+ R(A),
(3.10)
where (3.11) and
(3.12) where we put A 1 = I.
Before we start to prove Theorem 3.7, let us write down the following important corollary. We say that L 1{A) is a right divisor of L(A) if L(A)
= Q(A)L 1 (A),
i.e., the right remainder R(A) is zero.
Coronary 3.8. L 1(A) is a right divisor of L(A) pair (X 1, T1) of L 1(A), the equality AoX1
if and only if, for a standard
+ A1X1T1 + ··· + A1-1X1Ti- 1 + X1Ti
= 0
(3.13)
holds. Proof If (3.13) holds, then according to (3.12), R(A) = 0 and L 1 (A) is a right divisor of L(A). Conversely, if R(A) = 0, then by (3.12)
( .±A;X 1 T~)~ =
0
for j
=
1, ... , k.
•=0
Since [V1
· •·
V,.] is nonsingular, this means that (3.13) holds.
D
93
3.2. DIVISION PROCESS
Proof of Theorem 3. 7. We shall establish first some additional properties of standard triples for the monic polynomials L(A.) and L 1 (A.). Define G~p =
X 1 T~ Vp,
1 .::;; {3 .::;; k,
(3.14)
oc = 0, 1, 2, ....
Then for each i, 1 .::;; i .::;; k, we have (3.15)
lX~~'
(and we set Gpo = 0, p = 0, 1, ... ). Indeed, from the definition of V; we deduce
I
= [V1
V2
···
V,.]
(3.16)
]·
X 1 T~- 1
V,.] on the
Consider the composition with X 1 Tf on the left and T1 [V1 right to obtain
G1,kl
:
. (3.17)
Gk,k
Reversing the order of the factors on the right of (3.16), we see that Gii = fori= 1, 2, ... , k- 1 andj = 1, 2, ... , k. The conclusion (3.15) then follows immediately. Now we shall check the following equalities:
b;,j- 1 I
k-j
VJ
=
L T'{'V,.Bj+m•
j = 1, ... 'k,
(3.18)
m=O
where BP are the coefficients of L 1 (A.): L 1 (A.) = L~=o BiA.i with Bk =I. Since the matrix col(X 1 TD~,:-J is nonsingular, it is sufficient to check that k-j X 1 T~ VJ = X 1 T~
L T'{'V,.Bj+m•
j = 1, ... , k,
(3.19)
m=O
fori = 0, 1, ... . Indeed, using the matrices Gii and the right canonical form for L 1 (A.) rewrite the right-hand side of (3.19) as follows (where the first equality follows from (3.15)): k- j-1
Gi+k- j,k -
L
Gi+m,kGk,j+m+ 1
m=O k- j-1
= Gi+k- j,k -
L
(Gi+m+ 1,j+m+ 1 -
Gi+m.j+m)
m=O
= G;+k- i.k - (G;+
and (3.18) is proved.
G;j)
= Gii,
94
3. MULTIPLICATION AND DIVISIBILITY
Definealsothefollowingmatrixpolynomials:L 1 jA.) = Bi + Bi+tA + · · · 0, 1, ... , k. In particular, L 1 , 0 (A.) = L 1 (A.) and L 1 ,k(A.) =I. We shall need the following property of the polynomials L 1 jA.):
+ BkA.k- i,j =
k-1
)
V,.L 1 (A.) = (I A.- T1) ( i~O T{ V,.Lt.i+ t(A.) .
(3.20)
Indeed, since A.L 1 ,i+ 1 (A.) = L 1 ,/A.)- Bi (j = 0, ... , k - 1), the right-hand side of (3.20) equals k-1
k-1
k
L T{ V,.(Llj(A.)- B)- L T{+ 1 V,.Ll,j+ l(A.) = - L T{ V,.Bj
j=O
j=O
j=O
+
V,.LlO(A.).
Butsince(X 1, T1, V,.)isastandard tripleofL 1 (A.),in viewof(2.10),L~=o T{ V,.Bi = 0, and (3.20) follows. We are now ready to prove that the difference L(A.) - R(A.) is divisible by L 1 (A.). Indeed, using (3.18) and then (3.20) R(A.)(Lt(A.))- 1
= (to AiX 1 Ti) ·
Ct:
T{ V,.Ll,j+ 1(A.)) · L1 1 (A.)
I
= L AiX 1 T~(IA.- Tl)- 1 v,.. i=O
Thus, I
L(A.)(Ll(A.))- 1
= IAjA.i·X 1(IA.-
T~)- 1 V,.
i=O (and here we use the resolvent form of L 1 (A.)). So I
L(A.)(Ll(A.))- 1
-
R(A.)(Ll(A.))- 1 = LAiXl(IA.i- Ti)(IA- Tl)- 1 V,.. i=O
Using the equality Hi- Ti1 = (L~~b A,Pyi-t-p)·(AI- T1 )(i > 0), we obtain I
L(A.)(Ll(A.))- 1
-
R(A.)(Ll(A.))- 1
i-1
= LAiXl L A_Pyi-l-pfk p=O
i=l 1-1
= L A_P
I
L
AiXtTi-t-pv,.. (3.21)
p=O i=p+l Because of the biorthogonality relations X 1 T{ V,. = 0 for j = 0, ... , k- 1, all the terms in (3.21) with p = l - k + 1, ... , p = l - 1 vanish, and formula (3.11) follows. D
3.2.
95
DIVISION PROCESS
In terms of the matrices Gap introduced in the proof of Theorem 3.7, the condition that the coefficients of the remainder polynomial (3.12) are zeros (which means that L 1(A.) is aright divisor of L(A.))can be conveniently expressed in matrix form as follows. Corollary 3.9. [Gll
G12
L 1(A.) is a right divisor of L(A.) Gld=-[Ao
X
···
if and only if
A1-1J
Goz
r G,
Gll
Glk G"'
G12
0
G1~1.1
G1-1,2
1 ,
G~~ l,k
where Gap are defined by (3.14). The dual results for left division are as follows. Theorem 3.10.
Let L(A.) be as in Theorem 3.7, and let L1(A.) = A.kJ - (Wl
+ .. · + vfpk- 1)T1 Y1
be a monic matrix polynomial of degree k in its left canonical form (2.15) (so (T1, Y1) is a left standard pair of L 1(A.) and col(W;)~=l = [Yl
T1Y1
...
T1- 1Y1r 1).
Then where
and
where A 1 =I. Theorem 3.10 can be established either by a parallel line of argument from the proof of Theorem 3.7, or by applying Theorem 3.7 to the transposed matrix polynomials LT(A.) and Li(A.) (evidently the left division
L(A.)
=
L 1(A.)Q 1(A.)
+ R 1(A.)
of L(A.) by L 1(A.) gives rise to the right division LT(A.) = QJ(A.)Lf(A.) of the transposed matrix polynomials).
+ Ri(A.)
96
3. MULTIPLICATION AND DIVISIBILITY
The following definition is now to be expected: a monic matrix polynomial L 1(A.) is a left divisor of a monic matrix polynomial L(A.) if L(A.) = L 1(A.)Q 1(A.) for some matrix polynomial Q 1(A.) (which is necessarily monic). Corollary 3.11. L 1 (A.) is a left divisor of L(A.) standard pair (T1, Y1) of L 1(A.) the equality 1-1 YlT{Aj + Y1T 11 = 0
if and
only
if for
a left
L
j=O
holds.
This corollary follows from Theorem 3.10 in the same way as Corollary 3.8 follows from Theorem 3. 7.
3.3. Characterization of Divisors and Supporting Subspaces
In this section we begin to study the monic divisors (right and left) for a monic matrix polynomial L(A.) = IA 1 + I~:6 AiA.i. The starting points for our study are Corollaries 3.8 and 3.11. As usual, we shall formulate and prove our results for right divisors, while the dual results for the left divisors will be stated without proof. The main result here (Theorem 3.12) provides a geometric characterization of monic right divisors. A hint for such a characterization is already contained in the multiplication theorem (Theorem 3.2). Let L(A.) = Lz(A.)L 1 (A.) be a product of two monic matrix polynomials L 1(A.) and L 2 (A.), and let (X, T, Y) be a standard triple of L(A.) such that
X= [X 1
0],
T= [T10
X1
Yz]
T2
'
y =
[~J.
(3.22)
and (X;, T;, Y;) (i = 1, 2) is a standard triple for L;(A.) (cf. Theorem 3.2). The form ofT in (3.22) suggests the existence of aT-invariant subspace: namely, the subspace .A spanned by the first nk unit coordinate vectors in ([" 1 (here l (resp. k) is the degree of L(A.) (resp. L 1(A.))). This subspace .A can be attached to the polynomial L 1(A.), which is considered as a right divisor of L(A.), and this correspondence between subspaces and divisors turns out to be one-to-one. It is described in the following theorem. Theorem 3.12. Let L(A.) be a monic matrix polynomial of degree l with standard pair (X, T). Then for every nk-dimensional T-invariant subspace .A c ([" 1, such that the restriction col( X l.,a(T lv~tY)f~ J is invertible, there exists a unique monic right divisor L 1 (A.) of L(A.) of degree k such that its standard pair is similar to (XIv~t, Tlv~t).
3.3. CHARACTERIZATION OF DIVISORS AND SUPPORTING SUBSPACES
97
Conversely,for every monic right divisior L 1(A.) of L(A.) of degree k with standard pair (X 1 , T 1), the subspace .A = Im([col(XT;)l:6r 1[col(X 1TDl:6J)
(3.23)
is T-invariant, dim .A= nk, col(XI..H(TI..~tY)~,:-J is invertible and (XI..It, Tl..~t) is similar to (X 1, T1).
Here X l..~t (resp. T l..~t) is considered as a linear transformation .A --+ ft" (resp . .A --+ .A). Alternatively, one can think about X l..~t (resp. T l..~t) as an n x nk (resp. nk x nk) matrix, by choosing a basis in .A (and the standard orthonormal basis in fl" for representation of X l..~t). Proof Let .A c fl" 1 be an nk-dimensional T-invariant subspace such that col( X l..~t(T l..~tY)~,:-J is invertible. Construct the monic matrix polynomial L 1(A.) with standard pair (XI..H, Tl..~t) (cf. (2.14)): L1 (,1.)
= JA.k - X l..~t(T l..~t)k(Vl + V2 A. + · · · + V,.A.k- 1),
where [V1 · · · V,.] = [col(XI..~t(TI..~tY):.:-Jr 1 . Appeal to Corollary 3.8 (bearing in mind the equality AoX + A 1 XT + · · · + A1_ 1 XT 1-
1
+ XT 1 = 0,
where Ai are the coefficients of L(A.)) to deduce that L 1(A.) is a right divisor of L(A.). Conversely, let L 1(A.) be a monic right divisor of L(A.) of degree k with standard pair (X 1, Y1). Then Corollary 3.8 implies C1 col(X 1rD::6 = col(X 1TDl:6Tl,
(3.24)
where C 1 is the first companion matrix for L(A.). Also C 1 col(XT;)::6 = col(XT;)l:6 T.
(3.25)
Eliminating C 1 from (3.24) and (3.25), we obtain T[col(XTi)l:6r 1[col(X 1rD::6J ={col(XTi)::5r 1[col(X 1TDl:6JT1. (3.26)
This equality readily implies that the subspace .A given by (3.23) is Tinvariant. Moreover, it is easily seen that the columns of [col(XT;)l:6r 1 [col(X 1TDl:6J are linearly independent; equality (3.26) implies that in the basis of .A formed by these columns, T l..~t is represented by the matrix T1. Further,
98
3. MULTIPLICATION AND DIVISIBILITY
so Xj 41 is represented in the same basis in .A by the matrix X 1 • Now it is clear that (XIAt, TIAt) is similar to (X 1 , T1). Finally, the invertibility of col( X IA~(T IA~Y)l: 6 follows from this similarity and the invertibility of col(X 1 TDf,:-c}. 0 Note that the subspace .A defined by (3.23) does not depend on the choice of the standard pair (Xt. T1 ) of L 1 (.A.), because Im col(X 1 TDl:6 = Im col(X 1 S · (S- 1 T1 SY)l:6 for any invertible matrix S. Thus, for every monic right divisor L 1(A.) of L(A.) of degree k we have constructed an nk-dimensional subspace A, which will be called the supporting subspace of L 1 (A.). As (3.23) shows, the supporting subspace does depend on the standard pair (X, T); but once the standard pair (X, T) is fixed, the supporting subspace depends only on the divisor L 1(A.). If we wish to stress the dependence of A on (X, T) also (not only on L 1 (A.)), we shall speak in terms of a supporting subspace relative to the standard pair
(X, T). Note that (for fixed (X, T)) the supporting subspace A for a monic right divisor L 1(A.) is uniquely defined by the property that (X IAt, T lA~) is a standard pair of L 1 (A.). This follows from (3.23) if one uses (XIAf, TIAt) in place of (X t• T1). So Theorem 3.12 gives a one-to-one correspondence between the right monic divisors of L(A.) of degree k and T-in variant subspaces A c (/;" 1, such that dim A = nk and col( X iAt(T IA~)i)f,:- 6 is invertible, which are in fact the supporting subspaces of the right divisors. Thus, Theorem 3.12 provides a description of the algebraic relation (divisibility of monic polynomials) in a geometric language of supporting subspaces. By the property of a standard pair (Theorem 2.4) it follows also that for a monic right divisor L 1 (A.) with supporting subspace A the formula
(3.27) holds, where [V1 ~] = [col(XIAt·(TIAtY)f,:-c}r 1 . If the pair (X, T) coincides with the companion standard pair (Pt. C 1) (see Theorem 1.24) of L(A.), Theorem 3.12 can be stated in the following form: Corollary 3.13. A subspace A c (/;" 1 is a supporting subspace (relative to the companion standard pair (Pt. C 1 ))for some monic divisor L 1 (A.) ofL(A.) of degree kif and only if the following conditions hold: (i) dim A= nk; (ii) A is C 1 -invariant; (iii) the n(l - k)-dimensional subspace of (/;" 1 spanned by all nk-dimensional vectors with the first nk coordinates zeros, is a direct complement to A.
3.3.
CHARACTERIZATION OF-DIVISORS AND SUPPORTING SUBSPACES
99
In this case the divisor L 1 (A.) is uniquely defined by the subspace .A and is given by (3.27) with X= P 1 and T = C 1 . In order to deduce Corollary 3.13 from Theorem 3.12 observe that *J, where I is the nk x nk unit matrix. Therefore condition (iii) is equivalent to the invertibility of col(P 1 (C 1 Y)~,:-J has the form [I
col(P11.~t(C1I.~tY)~,:-J.
We point out the following important property of the supporting subspace, which follows immediately from Theorem 3.12. Corollary 3.14. Let L 1 (A.) be a monic divisor of L(A.) with supporting subspace .A. Then a(L 1 ) = a(JI.~t); moreover, the elementary divisors of L 1 (A.) coincide with the elementary divisors of I A. - J l.~t.
Theorem 3.12 is especially convenient when the standard pair (X, T) involved is in fact a Jordan pair, because then it is possible to obtain a deeper insight into the spectral properties of the divisors (when compared to the spectral properties of the polynomial L(A.) itself). The theorem also shows the importance of the structure of the invariant subspaces of T for the description of monic divisors. In the following section we shall illustrate, in a simple example, how to use the description of Tinvariant subspaces to find all right monic divisors of second degree. In the scalar case (n = 1), which is of course familiar, the analysis based on Theorem 3.12leads to the very well-known statement on divisibility of scalar polynomials, as it should. Indeed, consider the scalar polynomial L(A.) = 0f= 1 (A. - A.J"'; A. 1, ... , A.p are different complex numbers and a; are positive integers. As we have already seen (Proposition 1.18), a Jordan pair (X, J) of L(A.) can be chosen as follows:
where X; = [1 0 · · · OJ is a 1 x a; row and J; is an a; x a; Jordan block with eigenvalue A;. Every J -invariant subspace .A has the following structure:
.A = .AlB ... (f) .AP' where .A; c e:"• is spanned by the first p;coordinate unit vectors, i = 1, ... , p, and pi is an arbitrary nonnegative integer not exceeding a; .It is easily seen that each such ]-invariant subspace .A is supporting for some monic right divisor L.At(A.) of L(A.) of degree p = P1 + .. · + PP: the pair (XI.At, Jl.~t) = ([X 1 ,.~t · · · Xp,.~tJ, J 1 ,.~t (f)··· (f)Jp,.~t), where X;,.~t = [1 0 · · · OJ of size 1 x Pi and Ji,.At is the Pi x Pi Jordan block with eigenvalue A;, is a Jordan pair for polynomial Of= 1 (A.- A.;)'\ and therefore col(XI.~t(JI.~t)i)f:J
3.
100
MULTIPLICATION AND DIVISIBILITY
is invertible. In this way we also s_ee that the divisor with the supporting subspace A is just 1 (A. - A.;)il;. So, as expected from the very beginning, all the divisors of L(A.) are of the form 1 (A.- A.;)'\ where 0 :S f3i :S ah i = 1, ... ' p. For two divisors of L(A.) it may happen that one ofthem is in turn a divisor of the other. In terms of supporting subspaces such a relationship means nothing more than inclusion, as the following corollary shows.
Ilf=
Ilf=
Corollary 3.15. Let L 11 (A.) and Lu(A.) be monic right divisors of L(A.) then L 11 (A.) is a right divisor of L 12 (A.) if and only iffor the supporting subs paces A 1 and A 2 of L 11 (A.) and L 1 iA.), respectively, the relation A 1 c A 2 holds. Proof Let (X, T) be the standard pair of L(A.) relative to which the supporting subspaces A 1 and A 2 are defined. Then, by Theorem 3.12, (X I.At;, T I.At) (i = 1, 2) is a standard pair of Lu(A.). If A 1 c A 2 , then, by Theorem 3.12 (when applied to L 12 (A.) in place of L(A.)), L 11 (A.) is a right divisor of L 12 (A.). Suppose now L 11 (A.) is a right divisor of L 1 iA.). Then, by Theorem 3.12, there exists a supporting subspace A 12 c A 2 of L 11 (A.) as a right divisor of L 12 (A.), so that (XI.At 12 , TI.At 1 , ) is a standard pair of L 11 (A.). But then clearly A 12 is a supporting subspace of L 11 (A.) as a divisor of L(A.). Since the supporting subspace is unique, it follows that A 1 = A 12 c A 2 .
0 It is possible to deduce results analogous to Theorem 3.12 and Corollaries 3.14 and 3.15 for left divisors of L(A.)(by using left standard pairs and Corollary 3.11). However, it will be more convenient for us to obtain the description for left divisors in terms of the description of quotients in Section 3.5.
3.4. Example Let
J
A.(A. - 1)2 L(A.) = [ O
bA. A. 2 (A. _ 2) ,
bE fl.
Clearly, a(L) = {0, 1, 2}. As a Jordan pair (X, J) of L(A.), we can take
- [-b
0 1 1 0 10000
X-
0
1=
1 0 0 1 1 1 2
-b]
1'
3.4.
101
EXAMPLE
So, for instance, [-n, 0 and [6] form a canonical set of Jordan chains of L(A.) corresponding to the eigenvalue .A. 0 = 0. Let us find all the monic right divisors of L(A.) of degree 2. According to Theorem 3.12 this means that we are to find all four-dimensional }-invariant subspaces .A such that [fJ] I.At is invertible. Computation shows that
l+ -b
We shall find first all four-dimensional }-invariant subspaces .A and then check the invertibility condition. Clearly, .A = .A0 + .A 1 + .A2 , where .Hi isJ-invariantanda(JI.AtJ = {i},i = 0, 1,2.Sincedim.A0 ~ 3,dim.A 1 ~ 2, and dim .A 2 ~ 1, we obtain the following five possible cases: Case
dim .K0
1
3 3
2
dim .K 1
dim .K 2
1
0
0 2
0
2 2
3 4 5
1
1
2
Case 1 gives rise to a single }-invariant subspace .A 1 = {(xr.x 2 ,x 3 ,x4 ,0,0)TixjECl,j= 1,2,3,4}. Choosing a standard basis in .A r. we obtain
-b 0 1 1] [ [xJ}.At~ = o o 1
0 0 0 -b 1' 0 1 0 0
X
which is nonsingular for all b E Cl. So .A 1 is a supporting subspace for the right monic divisor L.At 1 (.A.) of L(A.) given by
L.AtJA.) = IA. 2
-
XI.At/ 2 I.At 1 (V\1 ) + v~l)A.),
where
[V\"V\"]
~ :J~r ~ [! [[
1 0 0 0 b -1 0 1
-]
(3.28)
3.
102
MULTIPLICATION AND DIVISIBILITY
Substituting in (3.28) also
[0 00 00 01] '
XIAt,(JIAt) 2 = 0 we obtain
Case 2 also gives rise to a single ]-invariant subspace A
2
= {(x 1 ,x 2 ,x 3 ,0,0,x 6 )TixiEC',j= 1,2,3,6}.
But in this case
(written in the standard basis) is singular, so A 2 is not a supporting subspace for any monic divisor of L(.A) of degree 2. Consider now case 3. Here we have a ]-invariant subspace A
3
= {(x 1 , 0, x 3 , x 4 , x 5 , O)TixiE C,j = 1, 3, 4, 5}
and a family of ]-invariant subspaces ..# 3 (a) = Span{(!
0 0 0 0 O)T, (0 I a 0 0 O)T, (0 0 0 I 0 O)T, (0 0 0 0 I O)T},
where a E Cis arbitrary. For A
3
(3.29)
we have
[:J., ~
n~ !
!J
which is singular, so there are no monic divisors with supporting subspace A 3 ; for the family olt 3 (a) in the basis which appears in the right-hand side of (3.29)
3.4.
103
EXAMPLE
which is nonsingular for every a E fl. The monic divisor L.Jt,(a)(A.) with the supporting subspace A 3 (a) is
L.At,(aP) = IA 2
-
XI.Jt,(a). (JI.Jt,(a)) 2 . (V\3 )
+
V~3 )A.),
where [V\3 >V~3 >] = [col(XI.A£,(a) (JI, 4hY)l=or 1 , and computation shows that
[00 00 01
2
L.At,(A.) = IA -
Case 4. Again, we have a single subspace .A4
= {(x 1 , 0, x 3 , x 4 , 0, x 1YixiE fl,j = 1, 3, 4, 6}
and a family
Aia) = Span{(l
0 0 0 0 O)T, (0 1 a 0 0 O)T, (0 0 0 1 0 O)T, (0 0 0 0 0
1)T}.
Now
1
-b 1 1 -b 1 0 0 1 [xJ}.At 4 = o o 1-2b' 0 0 0 2 X
[
which is nonsingular, and the monic divisor L.41 4 (A.) with the supporting subspace A 4 is
We pass now to the family
.A 4 (a).
a
0 -b
1 -b 0 1 1 -2b
0
2
104
3.
MULTIPLICATION AND DIVISIBILITY
This matrix is non singular if and only if a =f. 0. So for every complex value of a, except zero, the subspace A 4 (a) is supporting for the monic right divisor L.Jt.
L.Jt 4 (a,(A.)
=
+ (- 1 + 2b)A. a
[
_ 2b a
_~A.+~ a a
ab
+ 2b 2 A._ 2b 2 a
A. 2
_
a
(a =f. 0).
2(a +b) A.+ 2b a a
Case 5. A
5
=A 5 (rx,f3)=Span{(rx 0 f3 0 0 O)T,(O 0 0 1 0 O)T, (0 0 0 0 1 O)T, (0 0 0 0 0 1)T}, (3.30)
where rx, f3 are complex numbers and at least one of them is different from zero. Then X [xJ}.Jts=
r
-rxb + f3 rx 0
0
1 0 1 0
1
0 -b 0 1 1 -2b' 0 2
which is invertible if and only if rx =f. 0. So A 5 is a supporting subspace for some monic divisor of degree 2 if and only if rx =f. 0. In fact, for rx = 0 we obtain only one subspace A 5 (because then f3 =f. 0 and we can use (0 0 1 0 0 O)T in place of (rx 0 f3 0 0 O? in (3.30)). Suppose now rx =f. 0; then we can suppose that rx = 1 (using (1/rx)(rx 0 f3 0 0 O? in place of (rx 0 f3 0 0 Of in (3.30)); then with the supporting subspace A 5 ({3) is ii 2 - 2A. + 1 tf3A. + b L.Jts
/3]
(Note for reference that XI.Jts
[~ ~ ~
-:b
J)
According to Theorem 3.12, the monic matrix polynomials L.Jt,, L.Jt 3 • L.Jt 4 , L.Jt.
In Section 3.3 we have studied the monic right divisors L 1(A.) of a given monic matrix polynomial L(A.). Here we shall obtain a formula for the quotient L 2 (A.) = L(A.)L! 1 (A.). At the same time we provide a description of the left
3.5. DESCRIPTION OF THE QUOTIENT AND LEFT DIVISORS
105
monic divisors L 2 (A.) of L(A.) (because each such divisor has the form L(A.)L1 1(A.) for some right monic divisor L 1(A.)). We shall present the description in terms of standard triples, which allow us to obtain more symmetric (with respect to left and right) statements of the main results. Lemma 3.16. Let L(A.) be a monic matrix polynomial of degree I with standard triple (X, T, Y), and let P be a projector in ([" 1• Then the linear transformation
col(XTi-1)7=11ImP: Imp--+ q;nk (where k
(3.31)
< l) is invertible if and on[y if the linear transformation (I-
P)[row(Tl-k-iY)l:~]:
q;n
(3.32)
is invertible. Proof PutA = col(XT;- 1)l= 1 andB = row(T 1-;Y)l= 1.Withrespectto the decompositions ([" 1 = ImP Ker P and ([" 1 = q;nk EB q;•O-k> write
+
Thus, A; are linear transformations with the following domains and ranges: A 1: ImP--+ fl"k; A 2: Ker P--+ (["k; A 3: ImP--+ q;n
=
[~ 1 ~J
with D 1 and D 2 as invertible linear transformations. Recall that A and Bare also invertible (by the properties of a standard triple). But then A 1 is invertible if and only if B4 is invertible. This may be seen as follows. Suppose that B4 is invertible. Then B[
-B~ 1 B 3 B~ 1 ] = [!: = [B 1 -
!:][ -B~ 1 B 3 B~ 1 ] ~2 B4 1 B 3 B 2 ~4 1 ]
is invertible in view of the invertibility of B, and then also B1 - B 2 B4 1B 3 is invertible. The special form of AB implies A 1 B 2 + A 2 B 4 = 0. Hence D 1 = A1B 1 + A 2 B 3 = A 1B1 - A 1 B 2 B4 1 B 3 = A1(B 1 - B2 B4 1B 3 )anditfollows
106
3.
MULTIPLICATION AND DIVISIBILITY
that A 1 is invertible. A similar argument shows that invertibility of A 1 implies invertibility of B4 . This proves the lemma. 0 We say that Pis a supporting projector for the triple (X, T, Y) if ImP is a nontrivial invariant subspace for T and the linear transformation (3.31) is invertible for some positive integer k. One checks without difficulty that k is unique and k < f. We call k the degree of the supporting projector. It follows from Theorem 3.12 that P is a supporting projector of degree k if and only if its image is a supporting subspace of some monic divisor of L(.A) of degree k. Let P be a supporting projector for (X, T, Y) of degree k. Define T1 : ImP ---> Im P and X 1 : Im P ---> fl" by X 1 y = Xy. The invertibility of (3.31) now implies that col( X 1 T~- 1 )~= 1 : ImP---> fl"k is invertible. Hence there exists a unique linear transformation Y1 : (/;" ---> Im P such that the triple (X 1 , T1 , Y1 ) is a standard triple of some monic matrix polynomial L 1 (.A) of degree k. The triple (X 1 , T1 , Y1 ) will be called the right projection of (X, T, Y) associated with P. It follows from Theorem 3.12 that the polynomial L 1 (.A) defined by (X 1 , T1 , Y1 ) is a right divisor of L(.A), and every monic right divisor of L(.A) is generated by the right projection connected with some supporting projector of (X, T, Y). By Lemma 3.16 the linear transformation (3.32) is invertible. Define T2 : Ker P---> Ker P and Y2 : fl"---> Ker P by T2 y = (I - P)Ty,
Since Im P is an invariant subspace for T, we have (I - P)T(l - P) (I - P)T. This, together with the invertibility of (3.32) implies that
=
is invertible. Therefore there exists a unique X 2 : Ker P ---> fl" such that the triple (X 2 , T2 , Y2 ) is a standard triple for some monic matrix polynomial L 2 (.A) (which is necessarily of degree l- k). The triple (X 2 , T2 , Y2 ) will be called the left projection of (X, T, Y) associated with P. Suppose that Pis a supporting projector for (X, T, Y) of degree k, and let P': fl" 1 ---> fl" 1 be another projector such that Im P' = Im P. Then it follows immediately from the definition that P' is also a supporting projector for (X, T, Y) of degree k. Also the right projections associated with P and P' coincide. Thus what really matters in the construction of the left projection is the existence of a nontrivial invariant subspace .~ 1 for T such that for some positive integer k
3.5.
107
DESCRIPTION OF THE QUOTIENT AND LEFT DIVISORS
i.e., the existence of a supporting subspace (cf. Section 3.3). The preference for dealing with supporting projectors is due to the fact that it admits a more symmetric treatment of the division (or rather factorization) problem. The next theorem shows that the monic polynomial L 2 (A.) defined by the left projection is just the quotient L(A.)L1 1 (A.) where L 1 (A.) is the right divisor of L(A.) defined by the right projection of (X, T, Y) associated with P. Theorem 3.17. Let L(A.) be a monic matrix polynomial with standard triple (X, T, Y). Let P be a supporting projector of (X, T, Y) of degree k and let L 1 (A.) (resp. L 2 (A.)) be the monic matrix polynomial of degree k (resp. l- k) generated by the right (resp. left) projection of (X, T, Y) associated with P. Then
(3.33)
Conversely, every factorization (3.33) ofL(A.) into a product of two monic factors L 2 (A.) and L 1 (A.) of degrees l - k and k, respectively, is obtained by using some supporting projector of(X, T, Y) as above. Proof
Let P':
fl" 1 --+ fl" 1 be
P'y =
given by
[col(X 1 Ti1- 1 )~; 1 r
1
· [col(XTi- 1 )~; 1 ]y,
where X 1 = Xl 1mP; T1 = TlhnP· Then P' is a projector and ImP'= ImP. Also Ker P' =
{
1-k
.L yz-k-iYxi-tlx
}
0 , . . . , Xz-k- 1 E
fl" .
(3.34)
'; 1
The proof of this based on formulas (2.1). Define S: fl" 1 --+ Im P Ker P by
+
where P' and I - P are considered as linear transformations from C" 1 into ImP' and Ker P, respectively. One verifies easily that Sis invertible. We shall show that [X 1 O]S =X,
(3.35)
which in view of Theorem 3.1 means that (X, T) is a standard pair for the product Lz(A.)L 1 (A.), and since a monic polynomial is uniquely defined by its standard pair, (3.33) follows.
108
3.
MULTIPLICATION AND DIVISIBILITY
Take y E C" 1• Then P'y Elm P and col( X 1 Ti1 ~ 1 P'y)7= 1 = col(XTi~ 1 y)7= 1 • In particular X 1 P' y = X y. This proves that [X 1 O]S = X. The second equality in (3.35) is equivalent to the equalities (3.36)
and (I - P)T = T2 (I - P). The last equality is immediate from the definition of T2 and the fact that Im Pis an invariant subspace for T. To prove (3.36), take y E C" 1• The case when y E ImP = ImP' is trivial. Therefore assume that y E Ker P'. We then have to demonstrate that P'Ty = Y1 X 2 (1- P)y. Since · t x 0 , ••. ,x 1 ~k~JE~ rrn sue h th a t y=L.i=! "<\l~k T 1 ~k~iyxi~J· yE K er P' , th ereex1s Hence
Ty =
T 1 ~kYx 0
+
T 1 ~k~ 1 Yx 1
+ · ·· +
TYx 1 ~k~ 1
=
T 1 ~kYx 0
+u
with u E Ker P' and, as a consequence, P'Ty = P'T 1 ~kYx 0 • But then it follows from the definition of P' that
P'Ty = [row(T~~iY1 )7= 1 ] col(O, ... , 0, x 0 ) = Y1 x 0 . On the other hand, putting x
(I- P)y =(I- P) and so Y1 X 2 (/
= col(xi~ 1 )l:1,
row(T 1 ~k~iY)l:1x
=
row(T~~k~iY2 )~:1x
P)y is also equal to Y1 x 0 . This completes the proof.
-
D
Using Theorem 3.17, it is possible to write down the decomposition L(A.) = L 2 (A.)L 1 (A.), where L 2 (A.) and L 1 (A.) are written in one of the possible forms: right canonical, left canonical, or resolvent. We give in the next corollary one such decomposition in which L 2 (A.) is in the left canonical form and L 1 (A.) is in the right canonical form. Corollary 3.18. Let L(A.) and (X, T, Y) be as in Theorem 3.17. Let L 1 (A.) be a right divisor of degree k of L(A.) with the supporting subspace .41. Then
L(A.) =
+ ... + z,~kA.'~k~J)PT'~kpy] Xi.At(Ti.Att(W1 + · · · + Jv,A_k~ 1 )],
[IA.'~k-
· [IA_k-
(Zl
where [WI
Tv,]= [col(XI.At(TI.AtY)7:6r 1 ,
Pis some projector with Ker P = A, the linear transformation [PY PT 1 ~k~ 1 PY]: cn(l~k) ~ Im pis invertible, and
[
~
1
z,~k
l
= (row(PY ·
PTiP)l:t~ 1 )~ ImP~ C"u~kJ. 1:
PT PY
3.5.
DESCRIPTION OF THE QUOTIENT AND LEFT DIVISORS
109
Proof Let P be the supporting projector for the right divisor L 1 (A.). Then, by Theorem 3.17, (T2 , Y2 ) is a left standard pair for LiA.), where T2 : Ker P--> Ker P and Y2 : ([n ...... Ker Pare defined by T2 y =(I- P)Ty, Y2 x =(I- P)Yx. Choose P =I- P; then Y2 = PY and T2 = PTI1m.P· So Corollary 3.18 will follow from the left standard representation of L 2 (A.) if we prove that i = 0, 1, 2, ....
(3.37)
But (3.37) follows from the representation ofT relative to the decomposition ([nl = .41 Im p:
+
where T22 = PTIImi'· So
T; = [~1
T~J
i = 0, 1, ... ,
i.e., (PTI~mi')i = PT;IImi'• which is exactly (3.37).
0
Note that Corollary 3.18 does not depend on the choice of the projector P. Using Corollary 3.18 together with Corollary 3.15, one can deduce results concerning the decomposition of L(A.) into a product of more than two factors:
where L;(A.) are monic matrix polynomials. It turns out that such decompositions are in one-to-one correspondence with sequences of nested supporting subspaces .41m _ 1 ::J .41m _ 2 ::J · • · ::J .41 1 , where A; is the supporting subspace of L;(A.) · · · L 1 (A.) as a right divisor of L(A.). We shall not state here results of this type, but refer the reader to [34a, 34b]. For further reference we shall state the following immediate corollary of Theorem 3.17 (or Corollary 3.18).
Corollary 3.19. Let L(A.) be a monic matrix polynomial with linearization T and let L 1(A.) be a monic right divisor with supporting subspace .H. Then for any complementary subspace .41' to .41, the linear transformation PT I.At· is a linearization ofL(A. )L 1 1(A. ), where P is a projector on .41' along .41. In particular, a(LL1 1 ) = a(PTI.At.). We conclude this section with a computational example.
3.
110 EXAMPLE
3.2.
MULTIPLICATION AND DIVISIBILITY
Let
be the polynomial from Section 3.4. We shall use the same canonical pair (X, J) as in Section 4. Consider the family A 4 (rx) =Span{(!
0 0 0 0 O)T, (0 1 a 0 0 O)T, (0 0 0 1 0 O)T, (0 0 0 0 0
l
1)T},
of supporting subspaces and let 2
L.,~~.(a)(A.)
-A.+ -
(
2b)
2b
-1+-;-A.--;2 2 --A.+a a
ab
+ 2b 2 A. a
2
A.-
2(a
j
_ 2b 2 a
+ b) a
a =F 0,
2b
A.+a
be the family of corresponding right divisors (cf. Section 3.4). We compute the quotients M.(A.) = L(A.) (L.,~~ 4
u. = z •. f'.Jf'•. Y, where P. is a projector such that Ker P. = A 4 (a),
and
z.
= (P. Y)- 1 : ImP...... fl 2 . Computation shows that
Y=
0 -! 0 --z1 b -1 -b b
0
I
4
Further, let us take P. the projector with Ker P. = A 4 (a) and ImP. = {(0, x 2 , 0, 0, x 5 , 0) T E fl 6 1x 2 , x 5 E fl}. In the matrix representation
pa =
0 0 0 0 0 0
0 0 1 -1/a 0 0 0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 1
0 0 0 0 0 0 0
3.6.
111
SUPPORTING SUBSPACES FOR THE ADJOINT POLYNOMIAL
So PaY: ([ 2 --> Im Pais given by the matrix
and
Za=l- 11/a -±-h hia]-'-!l h ±+h/a] -1/a · - ~ -I Now let us compute PaJPa: Im Pa--> Im ?,,. To this end we compute Pale 1 and Pale 2 , where e 1 = (0 I 0 0 0 O)T and e 2 = (0 0 0 0 I O? are basis vectors in I m Pa:
o o o o o? = o,
P"Je 1 = P"(l
so
_ _ lo o]
PalPa
=
O I .
Finally, Ua
=
2
l- b1
~ +h.·
-!- h/a] ll + 2.·b a /a] lO0 OJ1 l-1/a - 2/a b 1 =
- 1/a
1
h + 2h2 ;a]. - 2hja
By direct multiplication one checks that indeed
3.6. Divisors and Supporting Subspaces for the Adjoint Polynomial
Let L(A) = IA 1 + 2}: ~ A i'V be a monic matrix polynomial of degree I with a standard triple (X, T, Y). Let 1-1
L*(A)
=
IA 1 +
I
AjAj
j= 0
be the adjoint matrix polynomial. As we already know (see Theorem 2.2) the triple (Y*, T*, X*) is a standard triple for L*(A.).
112
3.
MULTIPLICATION AND DIVISIBILITY
On the other hand, if L 1 (A.) is a monic right divisor of L(A.), we have L(A.) = Lz(A.)L 1(A.) for some monic matrix polynomial L 2 (A.). So L *(A.) = LT(A.)L!(A.), and L!(A.) is a right divisor of L *(A.). There exists a simple connection between the supporting subspace for L 1 (A.) (with respect to the standard pair (X, T) of L(A.)) and the supporting subspace of L!(A.) (with respect to the standard pair (Y*, T*)). Theorem 3.20. rf L 1 (A.) (as a right divisor of L(A.)) has a supporting subspace A (with respect to the standard pair (X, T)), then L!(A.) (as a right divisor of L *(A.)) has supporting subspace Al_ (with respect to the standard pair (Y*, T*)).
Proof We use the notions and results of Section 3.5. Let P be the supporting projector for the triple (X, T, Y) corresponding to the right divisor L 1, so A= ImP. We can suppose that P is orthogonal, P = P*; then Im(J-P)=Aj_. By Theorem 3.17 the quotient L 2 (A.)=L(A.)L} 1 (A.) is determined by the left projection (X 2 , T2 , Y2 ) of(X, T, Y) associated with P: Y2 =(I - P)Y: q;n--+ Ker P.
T2 =(I - P)T: Ker P--+ Ker P,
Since W ~- row(T~- 1 Y)l:~: q;nO-kl--+ Ker P is invertible (where k is the degree of L 1 (A.)), so is
W* = col(Y!T!)IKerP: Ker P--+ q;nO-kl.
(3.38)
Clearly, Ker Pis T*-invariant (because of the general property which can be checked easily: if a subspace N is A-invariant, then Nj_ is A*-invariant), and in view of(3.38) and Theorem 3.12 Aj_ = Ker Pis a supporting subspace such that (Y!, T!) is a standard pair ofthe divisor M(A.) of L *(A.) corresponding to Al_ (with respect to the standard triple (Y*, T*, X*) of L*(A.)). But(¥!, T!) is also a standard pair for L!(A.), as follows from the definition of a left projection associated with P. So in fact M(A.) = L!(A.), and Theorem 3.20 is obtained. D 3.7. Decomposition into a Product of Linear Factors
It is well known that a scalar polynomial over the complex numbers can be decomposed into a product of linear factors in the same field. Is it true also for monic matrix polynomials? In general, the answer is no, as the following example shows. EXAMPLE
3.3.
The polynomial ).2
L().) = [ 0
1]
;.z
3.7.
DECOMPOSITION INTO A PRODUCT Of LINEAR fACTORS
113
has a canonical triple determined by
In order that L have a monic right divisor of degree one it is necessary and sufficient that J has a two-dimensional invariant subspace on which X is invertible. However, the only such subspace is A = Span(e 1 , e 2 ) and X is not invertible on A. Hence L has no (nontrivial) divisors. 0
However, in the important case when all the elementary divisors of the polynomial are linear or, in other words, for which there is a diagonal canonical matrix J, the answer to the above question is yes, as our next result shows. Theorem 3.21. Let L(.A.) be a monic matrix polynomial having only linear elementary divisors. Then L(.A.) can be decomposed into a product of linear factors:
L(.A.) =
11 (/). -
B;).
Proof Let (X, J) be a Jordan pair for L(.A.). Then, by hypothesis, J is diagonal. Since the In x In matrix
is nonsingular, the (l - l)n x In matrix A, obtained by omitting the last term, has full rank. Thus, there exist (l - 1)n linearly independent columns of A, say columns i 1 , i 2 , ..• , iu -l)n· Since J is diagonal, the unit vectors e; 1 , • • • , e;,, -l>n in C1" generate an (I - 1)n-dimensional invariant subspace A of J and, obviously, A will be invertible on jf. Therefore, by Theorem 3.12 L(.A.) has a monic right divisor L 1 (.A.) of degree l - 1 and we may write L(.A.) = (/.A. - B 1 )L 1 (.A.). But we know (again by Theorem 3.12) that the Jordan matrix] 1 associated with L 1 (.A.) is just a submatrix of J and so L 1 (.A.) must also have all linear elementary divisors. We may now apply the argument repeatedly to obtain the theorem. 0
114
3.
MULTIPLICATION AND DIVISIBILITY
In the case n = 1 there are precisely l! distinct decompositions of the polynomial into linear factors, provided all the eigenvalues of the polynomial are distinct. This can be generalized in the following way: Corollary 3.22. If L(A.) (of Theorem 3.20) has all its eigenvalues distinct, then the number of distinct factorizations into products of linear factors is not less than [1~-:~ (nj + 1). Proof It is sufficient to show that L(A.) has no less than n(l - 1) + 1 monic righi divisors of first degree since the result then follows by repeated application to the quotient polynomial. If(X, J) is a Jordan pair for L(A.), then each right divisor U- B of L(A.) has the property that the eigenvectors of Bare columns of X and the corresponding submatrix of J is a Jordan form for B. To prove the theorem it is sufficient to find n(l - 1) + 1 nonsingular n x n submatrices of X for, since all eigenvalues of L are distinct, distinct matrices B will be derived on combining these matrices of eigenvectors with the (necessarily distinct) corresponding submatrices of J. By Theorem 3.21 there exists one right divisor U - B. Assume, without loss of generality, that the first n columns of X are linearly independent and correspond to this divisor. Since every column of X is nonzero (as an eigenvector) we can associate with the ith column (i = n + 1, ... , nl) n - 1 columns i 1, ... , in-l• from among the first n columns of X, in such a way that columns i 1 , ... , in- 1> i form a nonsingular matrix. For each i from n + 1 to ln we obtain such a matrix and, together with the submatrix of the first n columnsofXweobtainn(l- 1) + 1nonsingularn x nsubmatricesofX. D Comments
The presentation in this chapter is based on the authors' papers [34a, 34b]. The results on divisibility and multiplication of matrix polynomials are essentially of algebraic character, and can be obtained in the same way for matrix polynomials over an algebraically closed field (not necessarily(/;), and even for any field if one confines attention to standard triples (excluding Jordan triples). See also [20]. These results can also be extended for operator polynomials (see [34c]). Other approaches to divisibility of matrix (and operator) polynomials via supporting subspaces are found in [46, 56d]. A theorem showing (in the infinite-dimensional case) the connection between monic divisors and special invariant subspaces of the companion matrix was obtained earlier in [56d]. Some topological properties of the set of divisors of a monic matrix polynomial are considered in [70e].
3. 7.
DECOMPOSITION INTO A PRODUCT OF LINEAR FACTORS
115
Additional information concerning the problem of computing partial multiplicities of the product of two matrix polynomials (this problem was mentioned in Section 3.1) can be found in [27, 71a, 75, 76b]. Our Section 3.3 incorporates some simplifications of the arguments used in [34a] which are based on [73]. The notion of a supporting projector (Section 3.5) originated in [3a]. This idea was further developed in the framework of rational matrix and operator functions, giving rise to a factorization theory for such functions (see [ 4, 3c]). Theorem 3.20 (which holds also in the case of operator polynomials acting in infinite dimensional space), as well as Lemma 3.4, is proved in [34f]. Theorem 3.21 is proved by other means in [62b]. Another approach to the theory of monic matrix polynomials, which is similar to the theory of characteristic operator functions, is developed in [3b]. The main results on representations and divisibility are obtained there.
Chapter 4
Spectral Divisors and Canonical Factorization
We consid•r here the important special case of factorization of a monic matrix polynomial L(A.) = L 2 (A.)L 1(..1.), in which L 1(..1.) and L 2 (..1.) are monic polynomials with disjoint spectra. Divisors with this property will be called spectral.
Criteria for the existence of spectral divisors as well as explicit formulas for them are given in terms of contouf integrals. These results are then applied to the matrix form of Bernoulli's algorithm for the solution of polynomial equations, and to a problem of the stability of solutions of differential equations. One of the important applications of the results on spectral divisors is to canonical factorization, which plays a decisive role in inversion of block Toeplitz matrices. These applications are included in Sections 4.5 and 4.6. Section 4. 7 is devoted to the more general notion of Wiener-Hopf factorization for monic matrix polynomials. 4.1. Spectral Divisors
Much of the difficulty in the study of factorization of matrix-valued functions arises from the need to consider a" splitting" of the spectrum, i.e., the 116
4.1.
117
SPECTRAL DIVISORS
situation in which L(A.) = LiA.)L 1 (A.) and L 1 , L 2 , and L all have common eigenvalues. A relatively simple, but important class of right divisors of matrix polynomials L(A.) consists of those which have no points of spectrum in common with the quotient they generate. Such divisors are described as "spectral" and are the subject of this section. The point A.0 E (/; is a regular point of a monic matrix polynomial L if L(A-0 ) is nonsingular. Let r be a contour in fl, consisting of regular points of L. A monic right divisor L 1 of Lis a r-spectral right divisor if L = L 2 L 1 and cr(L 1 ), cr(L 2 ) are inside and outside r, respectively. Theorem 4.1. If L is a monic matrix polynomial with linearization T and L 1 is a r-spectral right divisor ofL, then the support subspace of L 1 with respect
to Tis the image of the Riesz projector Rr corresponding toT and Rr =
~ 2m
,C(IA.- T)Yr
1
dA..
r: (4.1)
Proof Let A r be the T-invariant supporting subspace of L~> and let A~ be some direct complement to A r in rtn1 (lis the degree of L(A.)). By Corollary 3.19 and the definition of a r-spectral divisor, cr(TIAir) is inside r and cr(PT l.ui-J is outside r, where Pis the projector on A~ along A r· Write the matrix T with respect to the decomposition rtn 1 = Ar +.A~:
T= [T~1 ~~] (so that T11 = TL41 ;-, T12 =(I- P)T!/11 ;-, T22 = PTIAti-• where P and I - Pare considered as linear transformations on A~ and Ar respectively). Then
and 1 j (I A. - T) -1 dA. 2ni
Rr=-
and so Im Rr = Ar.
1
=
[IO
*]
O,
0
Note that Theorem 4.1 ensures the uniqueness of a r -spectral right divisor, if one exists. We now give necessary and sufficient conditions for the existence of monic r-spectral divisors of degree k of a given monic matrix polynomial L of degree I. One necessary condition is evident: that det L(A.) has exactly nk zeros inside r (counting multiplicities). Indeed, the equality L = L 2 L 1 , where
118
4.
SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
L 1 (A.) is a monic matrix polynomial of degree k such that a(L 1 ) is insider and a(L 1 ) is outside r, leads to the equality
det L = det L 2 • det L 1 , and therefore det L(A.) has exactly nk zeros insider (counting multiplicities), which coincide with the zeros of det L 1 (A.). For the scalar case (n = 1) this necessary condition is also sufficient. The situation is completely different for the matrix case: EXAMPLE
4.1.
Let A. 2 L(A.) = [ 0
J
0 (A. - 1)2 ,
and let r be a contour such that 0 is inside r and 1 is outside r. Then det L(A.) has two zeros (both equal to 0) inside r. Nevertheless, L(A.) has no r-spectral monic right divisor of degree 1. Indeed,
1 0 0 OJ
[ X=oo1o' is a Jordan pair of L(A.). The Riesz projector is
and Im Rr = Span{(l 0 0 O)T, (0 1 0 O)r}. But XIImRr is not invertible, so Rr is not a supporting subspace. By Theorem 4.1 there is no r -spectral monic right divisor of L(A.). D
So it is necessary to impose extra conditions in order to obtain a criterion for the existence of r -spectral divisors. This is done in the next theorem. Theorem 4.2. Let L be a monic matrix polynomial and r a contour consisting of regular points of L having exactly nk eigenvalues of L (counted according to multiplicities) insider. Then Lhasa r-spectral right divisor if and only if the nk x nl matrix
M
-
k,t-
1 2ni
4.1.
119
SPECTRAL DIVISORS
has rank kn. If this condition is satisfied, then the r-spectral right divisor L 1 (A) = /Ak + 2}:6 LuAj is given by the formula [L1o
···
Llk-I]=-~{[AkL- 1 (A) ·
2m
Jr
···
Ak+l-tL- 1 (A)]dA·MLl, (4.3)
where
ML 1 is any right inverse of M k,l·
Proof Let X, T, Y be a standard triple for L. Define Rr as in (4.1 ), and let A 1 = Im Rr, A 2 = Im(J- Rr). Then T = T1 E8 T2 , where T1 and T2 are the restrictions ofT to A 1 and -~ 2 respectively. In addition, u(Tt) and u(T2 ) are inside and outside r, respectively. Define X 1 = Xl 11 ,, X 2 = Xl 112 , and Y1 = Rr Y, Y2 =(I- Rr)Y. Then, using the resolvent form of L (Theorem 2.4) write
L -\A)= X(IA- T)- 1 Y = X 1 (/A- T1 )- 1 Y1
+
Xz(IA- T2 ) - 1 Y2 •
Since (/A- T2 )- 1 is analytic insider. then fori= 0, 1, 2, ... we have
where the last equality follows from the fact that u(T1 ) is insider (see Section Sl.8). Thus,
(4.5)
and since the rows of the matrix, row(Ti1 Y1 )l;;;6, are also rows of the nonsingular matrix, row(TiY)l:6, the former matrix is right invertible, i.e., row(Ti1 Y1 )l:6 · S = I for some linear transformationS: A--> fln 1• Now it is clear that M k, 1 has rank kn only if the same is true of the left factor in (4.5), i.e.,
4. SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
120
the left factor is nonsingular. This implies that A is a supporting subspace for L, and by Theorem 3.12 L has a right divisor of degree k. Furthermore, X 1 , T1 determine a standard triple for Lt> and by the construction O"(L 1 ) coincides with O"(T1 ) and the part of O"(L) inside r. Finally, we write L = L 2 L 1 , and by comparing the degrees of det L, det L 1 and det L 2 , we deduce that O"(L 2 ) is outsider. Thus, L 1 is a r-spectral right divisor. For the converse, if L 1 is a r-spectral right divisor, then by Theorem 4.1 the subspace A = Im Rr c fl 1" is a supporting subspace of L associated with L~> and the left factor in (4.5) is nonsingular. Since the right factor is right invertible, it follows that M k,l has rank kn. It remains to prove the formula (4.3). Since the supporting subspace for L 1 (A.) is Im Rr, we have to prove only that
X1
T~ {l :T l f' ~ 2~i fp.'L XTk- 1
-•(J.)
lttJ
(4.6)
(see Theorem 3.12). Note that Ml, 1 = [row(T~ Yt)l:6J' · [col(X 1 T~t;:Jr 1 , where [row(T~ Y1 )l:6J' is some right inverse of row(T~ Y1 )l:6. Using this remark, together with (4.4), the equality (4.6) becomes X 1 T~
·(col(XT;t:"J)- 1 = [X 1 T~Y1 . . . X 1 T~+ 1 - 1 Y1 ] · [row(T~ Yt)l:6J' · ((col(X 1 TD~:J)- 1 ,
which is evidently true.
D
The result of Theorem 4.2 can also be written in the following form, which is sometimes more convenient. Theorem 4.3. Let L be a monic matrix polynomial, and let r be a contour consisting ofregular points ofL. Then Lhasa r -spectral right divisor ofdegree k if and only if rank Mk,l =rank Mk+ 1 , 1 = nk. (4.7) Proof We shall use the notation introduced in the proof of Theorem 4.2. Suppose that L has a r -spectral right divisor of degree k. Then col( X 1 T {)~,:- 6 is invertible, and T~> X 1 are nk x nk and n x nk matrices, respectively. Since row(T~ Y1 )l:6 is right invertible, equality (4.5) ensures that rank Mk,l = nk. The same equality (4.5) (applied to Mk+ 1 , 1) shows that rank Mk+ 1 , 1 ~ {size of Tt} = nk. But in any case rank Mk+ 1 , 1 ~rank Mk, 1; so in fact rank Mk+ 1 , 1 = nk.
4.1.
121
SPECTRAL DIVISORS
Suppose now (4.7) holds. By (4.5) and right invertibility ofrow(T~ Y1 )L;;A we easily obtain rank col(X 1 TD~,;;-J = rank col(X 1 TD~=o = nk.
(4.8)
Thus, Ker col( X 1 TD~,;;-J = Ker col( X 1 TD~=o. It follows then that (4.9) for every p ~ k. Indeed, let us prove this assertion by induction on p. For p = k it has been established already; assume it holds for p = p0 - 1, Po > k + 1 and let x E Ker col(X 1 TD~,;;-J. By the induction hypothesis,
x E Ker col( X 1 TDf~1 1 . But also
so T1 x
E
Ker col(X 1 Ti)~,;;-J,
and again by the induction hypothesis, T1 x
E
Ker col(X 1 TDf~(j 1
or
It follows that
x E Ker col( X 1 TDf~o. and (4.9) follows from p = p0 . Applying (4.9) for p = l - 1 and using (4.8), we obtain that rank col(X 1 TDl:A = nk. But col(X 1 T~)l:A is left invertible (as all its columns are also columns of the nonsingular matrix col(XT;)l:6), so its rank coincides with the number of its columns. Thus, T1 is of size nk x nk; in other words, det L(A.) has exactly nk roots insider (counting multiplicities). Now we can apply Theorem 4.2 to deduce the existence of a r -spectral right monic divisor of degree k. D If, as above, r is a contour consisting of regular points of L, then a monic left divisor L 2 of Lis a r-spectralleft divisor if L = L 2 L 1 and the spectra of L 2 and L 1 are inside and outside r, respectively. It is apparent that, in this case, L 1 is a r 1 -spectral right divisor, where the contour r 1 contains in its interior exactly those points of a{L) which are outside r. Similarly, if L 1 is a r -spectral right divisor and L = L 2 L~> then L 2 is a r cspectralleft divisor. Thus, in principle, one may characterize the existence of r-spectral left divisors by using the last theorem and a complementary contour r 1 . We shall show by example that a r -spectral divisor may exist from one side but not the other.
122
4. SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
The next theorem is the dual of Theorem 4.2. It is proved for instance by applying Theorem 4.2 to the transposed matrix polynomial LT(A.) and its right r-spectral divisor.
Theorem 4.4. Under the hypotheses of Theorem 4.2 L has a r-spectral left divisor if and only if the matrix L -1(A.)
2~i L;tl-1 ~ -1(A.) [
Ml,k =
has rank kn. In this case the r-spectralleft divisor L 2 (A.) = is given by the formula [
L 20
:
Lz.~-1
l
= -
1
Ml
k •-
,
r[
A_k L
~: (A.) 1
JA.k
l
dA.,
+ ~}:J
L 2 iA.i
(4.10)
2rci)r ;tk+l-1L-1(A.)
where Ml,k is any left inverse of M 1,k· The dual result for Theorem 4.3 is the following:
Theorem 4.5. Let L(A.) be a monic matrix polynomial and r be a contour consisting of regular points of L. Then L has a r -spectral left divisor ofdegree k if and only if rank M 1,k = rank M 1,k+ 1 = nk. Note that the left-hand sides in formulas (4.3) and (4.10) do not depend on the choice of Ml, 1 and Ml, k> respectively (which are not unique in general}. We could anticipate this property bearing in mind that a (right) monic divisor is uniquely determined by its supporting subspace, and for a r-spectral divisor such a supporting subspace is the image of a Riesz projector, which is fixed by r and the choice of standard triple for L(A.). Another way to write the conditions of Theorems 4.2 and 4.4 is by using finite sections of block Toeplitz matrices. For a continuous n x n matrixvalued function H(A.) on r, such that det H(A.) =1 0 for every A. E r, deijne Di = (2rci)- 1 A.- i- 1H(A.) dA.. It is clear that if r is the unit circle then the Di are the Fourier coefficients of H(A.):
Jr
00
H(A.) =
L j=-
A_iDj
(4.11)
00
(the series in the right-hand side converges uniformly and absolutely to H(A.) under additional requirements; for instance, when H(A.) is a rational matrix function). For a triple of integers rx ~ {3 ~ y we shall define T(Hr; rx, {3, y) as
4.1. SPECTRAL DIVISORS
123
the following block Toeplitz matrix: DfJ .
-
T(Hr, oc, {3, y)-
r
DfJ+1
:
D" We mention that T(Hr; oc, oc, y) and
= [D" D"_ 1
Dy] is a block row,
is a block column. Theorem 4.2'. Let L(A.) and r be as in Theorem 4.2. Then L(A.) has a r-spectral right divisor L 1 (A.) = JA.k + 2_};;:J L 1iA.i if and only if rank T(Li 1; -1, -k, -k- l + 1) = kn. In this case the coefficients of L 1 are
given by the formula [L 1 ,k-1 L1,k- 2 ...
L 10 ]
= -T(Li 1; -k- 1, -k- 1, -k -1) ·(T(Li 1; -1, -k, -k -1
+ 1))1,
(4.12)
where (T(Li 1 ; -1, -k. -k- l + 1)1 is any right inverse of T(Li 1; -1, -k, -k -1 + 1). Note that the order of the coefficients L 1i in formula (4.12) is opposite to their order in formula (4.3). Theorem 4.4'. Let L(A.) and r be as in Theorem 4.4. Then L(A.) has a r-spectral left divisor LiA.) = JA.k + 2_};;:J L 2iA.i if and only if rank T(li 1 ; - 1, -1, - k - l + 1) = kn. In this case the coefficients of L 2 are given by the formula
L2o [ L 21
1= -(T(Li 1;
-1, -1, -k- l
+ 1))1
L2,~-1 · T(Li 1 ; -k- 1, -k- l, -k- 1),
(4.13)
where (T(Li 1 ; -1, -1, - k - l + 1))1 is any left inverse of T(Li 1; -1, -1, -k -1 + 1). Of course, Theorems 4.3 and 4.5 can also be reformulated in terms of the matrices T(Li 1 ; oc, {3, y).
124
4. SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
We conclude this section with two examples: EXAMPLE
4.2.
Let L(A.)=
[
.1. 3
- 42.1. 2 -6.1. 2
300.1.2 - 30U + 42] .1. 3 +42.1. 2 -43.1.+6.
Then L has eigenvalues 0 (with multiplicity three), I, 2, and -3. Let r be the circle in the complex plane with centre at the origin and radius l We use a Jordan triple X, J, Y. It is found that we may choose J1 =
[00 01 0]I ,
yl =
[-7 48] I -7
0 0 0 and we have
_I J,
2ni 'fL
_1
-6 42 ,
(A.)dA.=XtYt =
[-7 48]
I -7,
which is invertible. In spite of this, there is no r-spectral right divisor. This is because there are, in effect, three eigenvalues of L insider, and though M 1 , 3 of Theorem 4.2 has full rank, the first hypothesis of the theorem is not satisfied. Note that the columns of X 1 are defined by a Jordan chain, and by Theorem 3.12 the linear dependence of the first two vectors shows that there is no right divisor of degree one. Similar remarks apply to the search for a r-spectralleft divisor. D ExAMPLE
4.3.
Let
a 1,
a2 , a3 ,
a4
be distinct complex numbers and
Let r be a contour with a 1 and a 2 insider and a 3 and a 4 outside r.lfwe choose
XI=[~ ~]. then it is found that
where f3 = (a 2
-
X Y =
11
a 3 )(a 2
[0
O
-
a 4)- 1(a 2 - a 1) and
f3(a 1
-
a2)- 1] O ,
[I
XtJtY1= 0
a2 {3(a 10 - a2)- 1].
Then, since M 1, 2 = [X 1Y1 X 1J 1Y1] and
J
X 1 Y1 M2,t = [ XtJtYt ,
it is seen that M 1 , 2 has rank 1 and M 2 , 1 has rank 2. Thus there exists a r-spectralleft divisor but no r-spectral right divisor. D
4.2.
125
LINEAR DIVISORS AND MATRIX EQUATIONS
4.2. Linear Divisors and Matrix Equations
Consider the unilateral matrix equation l-1 yt + I Aj yi = 0,
(4.14)
j=O
where Aj(j = 0, ... , l - 1) are given n x n matrices and Yis ann x n matrix to be found. Solutions of (4.14) are closely related to the right linear divisors ofthe monic matrix polynomial L(il) = H 1 + 2}:6 AiA.i. Namely, it is easy to verify (by a straightforward computation, or using Corollary 3.6, for instance) that Y is a solution of (4.14) if and only if IA- Y is a right divisor of L(il). Therefore, we can apply our results on divisibility of matrix polynomials to obtain some information on the solutions of (4.14). First we give the following criterion for existence of linear divisors (we have observed in Section 3.7 that, in contrast to the scalar case, not every monic matrix polynomial has a linear divisor). Theorem 4.6. The monic matrix polynomial L(A.) has a right linear divisor I A. - X if and only if there exists an invariant subspace ~ of the first companion matrix C 1 of L(A.) of the form I X ~
= Im X 2 xt-1
Proof Let ~ be the supporting subspace of IA - X relative to the companion standard pair (P 1, C 1). By Corollary 3.13(iii) we can write ~
= Im[col(X;)l;:6J
for some n X n matrices Xo =I and xl, ... ' xl-1· But Cl-invariance of~ implies that X; = XL i = 1, ... , l - 1. Finally, formula (3.27) shows that the rightdivisorofL(il)correspondingto~isjustH- X,soinfactX 1 =X. 0 In terms of the solutions of Eq. (4.14) and relative to any standard pair of L(il) (not necessarily the companion pair as in Theorem 4.6) the criterion of Theorem 4.6 can be formulated as follows. Corollary 4.7. form
A matrix Y is a solution of(4.14)
if and only if Y
has the
y =XI.${· Tl4t·(XI,~t)- 1 , where (X, T) is a standard pair of the monic matrix polynomial L(A.) and is a T-invariant subspace such that X 1.$1 is invertible.
~
126
4. SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
To prove Corollary 4.7 recall that equation (4.14) is satisfied if and only if H - Y is a right divisor of L(A.), and then apply Theorem 3.12. A matrix Y such that equality (4.14) holds, will be called a (right) solvent of L(A.). A solventS is said to be a dominant solvent if every eigenvalue of S exceeds in absolute value every eigenvalue of the quotient L(A.)(IA- S)- 1. Clearly, in this case, I A. - S is a spectral divisor of L(A.). The following result provides an algorithm for computation of a dominant solvent, which can be regarded as a generalization of Bernoulli's method for computation of the zero of a scalar polynomial with largest absolute value (see[42]).
Theorem 4.8. Let L(A.) = IA1 + Il:A A.iA; be a monic matrix polynomial ofdegree l. Assume that L(A.) has a dominant solventS, and the transposed matrix polynomial LT(A.) also has a dominant solvent. Let {Ur}~ 1 be the solution of the system r
= 1, 2,... (4.15)
where {Ur},~= 1 is a sequence ofn x n matrices to be found, and is determined by the initial conditions U 0 = · · · = U 1_ 1 = 0, U 1 = I. Then Ur+ 1 ur- 1 exists for larger and ur+ 1 ur- 1 -+ as r-+ 00.
s
We shall need the following lemma for the proof of Theorem 4.8.
Lemma 4.9. Let W1 and W 2 be square matrices (not necessarily of the same size) such that
(4.16) Then W1 is nonsingular and
lim II Will
· II w;.-mll
=
o.
(4.17)
Proof Without loss of generality we may assume that W1 and W 2 are in the Jordan form; write
W;
= K; + N;,
i = 1, 2
where K; is a diagonal matrix, N; is a nilpotent matrix (i.e., such that Nr = 0 for some positive integer r;), and K;N; = N;K;. Observe that a{K;) = a(W;) and condition (4.16) ensures that W1 (and therefore also K 1) are non singular. Further, IIK1 1II = [inf{IA.IIA.ea(W1)}rt, IIK2II = sup{IA.IIA.ea(W2)}. Thus (again by (4.16))
IIKill · IIK1mll
~
ym,
0 < y < 1,
m
= 1, 2, ....
(4.18)
4.2.
127
LINEAR DIVISORS AND MATRIX EQUATIONS
Letr 1 besolargethatN~' = O,andputb 1 = maxosks;r,~J IIK!kN~II. Using the combinatorial identities
f (_1)k ( nk) (n + kk - 1) = {01
k=O
p-
for for
p > 0 p = 0,
it is easy to check that
So
~ [ r,k~O
1
::::; IIK!mll·
(m + kk - 1) bi J ::::; IIK!mll<)trl(m + ri -
2)''. (4.19)
The same procedure applied to
WT
yields (4.20)
I WTII ::::; IIKTII · bz rz m' 2 , wherer 2 issuchthatN;2 (4.20), we obtain
= Oandb 2 =
max 0 sks;r 2 ~ 1 IIK~N~II- Using(4.18)~
lim II WTII ·II W!mll ::::; b1b2 r1r2 · lim [(m
+ r1 -
2)''m'zym]
=
0.
D
m->oo
Proof of Theorem 4.8. Let (X, J, Y) be a Jordan triple for L(A). The existence of a dominant solvent allows us to choose (X, J, Y) in such a way that the following partition holds:
J = [Js 0
OJ
Jt '
y =
[il
where sup{ IAII A E u(Jt)} < inf{ IAII A E u(Js)}, and the partitions of X and Y are consistent with the partition of J. Here Js is the part of J corresponding to the dominant solvent S, so the size of Js is n x n. Moreover, Xs is a nonsingular n x n matrix (because I A - Sis a spectral divisor of L(A)), and X sis the restriction of X to the supporting subspace corresponding to this divisor (cf. Theorems 4.1 and 3.12). Since LT(A) has also a dominant solvent, and (Yr, Jr, XT) is a standard triple of LT(A), by an analogous argument we obtain that Y, is also nonsingular.
128
4.
SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
Let {Ur };"'~ 1 be the solution of (4.15) determined by the initial conditions u I = ... = ul-1 = 0; Uz = I. Then it is easy to see that r = 1, 2, ....
Indeed, we know from the general form (formula (2.57)) of the solution of (4.15) that Ur = XJ'- 1 Z, r = 1, 2, ... for some nl x n matrix Z. The initial conditions together with the invertibility of col(XJit:;f> ensures that in fact
Z=Y. Now U, = XJr-l Y = Write M = Xs ¥., and E, = XJ~-~xs-1)
¥., + XtJ~-I J'; + (X 1 1~- 1 1';)Y.,- 1 Xs- 1 JXs ¥.,.
XsJ~-I
= [XsJ~- 1 Xs-
1
X 1 J~ 1';, r
= 1, 2, ... , so that (recall that S =
U, =(Sr-I+ E,-1M-I)M =(I+ E,-1M-1s-r+I)S'-1 M.
Now the fact that Sis a dominant solvent, together with Lemma 4.9, implies that E,M- Is-r--+ 0 in norm as r--+ 00 (indeed,
by Lemma 4.9). So for large enough r, U, will be nonsingular. Furthermore, when this is the case,
and it is clear, by use of the same lemma, that U,+ I ur- 1
--+
s as
r--+ 00.
D
The hypothesis in Theorem 4.8 that LT(A,) should also have a dominant solvent may look unnatural, but the following example shows that it is generally necessary. EXAMPLE
4.4.
Let
L(.?c)
=
[.F-1I .~czOJ ' .lc -
with eigenvalues I, -I, 0, 0. We construct a right spectral divisor with spectrum {I, -I}. The following matrices form a canonical triple for L:
y-
-
l
~I
_l 2
-I
01
0
0' 1
4.3.
129
SOLUTIONS OF DIFFERENTIAL EQUATIONS
and a dominant solvent is defined by S = X1J1X1-1 =
[1 -1]. 0
-1
Indeed, we have
L(A) = [A + 1 1
- 1 A-1
J[A -0 1
J
1 A+1'
However, since the corresponding matrix Y1 is singular, Bernoulli's method breaks down. With initial values U 0 = U 1 = 0, U 2 = I it is found that
so that the sequence {U,}:':o oscillates and, (for r > 2) takes only singular values.
D
An important case where the hypothesis concerning LT(A.) is redundant is that in which Lis self-adjoint (refer to Chapter 10). In this case L(A.) = L *(A.) and, if L(A.) = Q(A.)(IA.- S), it follows that LT(A.) = Q(A.)(IA.- S) and, if S is dominant for L, so is S for LT. 4.3. Stable and Exponentially Growing Solutions of Differential Equations
Let L(A.)
= IA.1 + .L;}:~ AiA.i be a monic matrix polynomial, and let d1x(t) -d, t
+
dix(t) L;Ai-di =0,
1- 1
j=O
tE[a,oo)
t
(4.21)
be the corresponding homogeneous differential equation. According to Theorem 2.9, the general solution of (4.21) is given by the formula
x(t) = XetJc, where (X, J) is a Jordan pair of L(A.), and c E (/;" 1• Assume now that L(A.) has no eigenvalues on the imaginary axis. In this case we can write
X= [X 1 X2J,
J
=
[J~
JoJ.
where a(J 1 ) (resp. a(J 2 )) lies in the open left (resp. right) half-plane. Accordingly,
x(t) = x 1(t)
+ x 2 (t),
(4.22)
where x 1 (t) = X 1 ethc 1 , x 2 (t) = X 2 ethc 2 , and c = [~~]. Observe that limr~oo llx 1 (t)ll = 0 and, if c 2 # 0, limr~oo llx 2 (t)ll = co. Moreover, x 2 (t) is
130
4. SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
exponentially growing, i.e., for some real positive numbers J1 and v (in fact, J1 (resp. v) is the real part of an eigenvalue of L(A.) with the largest (resp. smallest) real part among all eigenvalues with positive real parts) such that
for some positive integer p, but lim1 _ 00 lle-
=
Theorem 4.10. Let L(A.) be a monic matrix polynomial such that a(L) does not intersect the imaginary axis. Then for every set of k vectors x 0 (a), ... , Xbk- 1>(a) there exists a unique stable solution x(t) of (4.21) such that xU>(a) = x~>(a),j = 0, ... , k- 1, if and only if the matrix polynomial L(A.) has a monic right divisor L 1(A.) of degree k such that a(L 1) lies in the open left half-plane.
Proof Assume first that L 1 (A.) is a monic right divisor of L(A.) of degree k such that a(Lt) lies in the open left half-plane. Then a fundamental result of the theory of differential equations states that, for every set of k vectors x 0 (a), ... , x~-t>(a) there exists a unique solution x 1(t) of the differential equation L 1(d/dt)x 1(t) = 0, such that xV>(a) = x~>(a),
j = 0, ... , k - 1.
(4.23)
Of course, x 1 (t) is also a solution of(4.21) and, because a(L 1 ) is in the open left half-plane, (4.24) l--"00
We show now that x 1(t) is the unique solution of L(A.) such that (4.23) and (4.24) are satisfied. Indeed, let x1(t) be another such solution of L(A.). Using decomposition (4.22) we obtain that Xt(t) = XletJ,cl, x!(t) = Xlethcl for some c~> c 1 . Now [x 1 (t)- x1 (t)]W(a) = 0 for j = 0, ... , k- 1 implies that
[
Xt X1J1 :
1e (c a}[
-
1 -
c 1 ) = 0.
(4.25)
Xt11-t Since (X 1 ,1 1 ) is a Jordan pair of L 1 (A.) (see Theorems 3.12 and 4.1), the matrix col(X 1J;)7,:-J is nonsingular, and by (4.25), c 1 = c1, i.e., x 1(t) = x1(t).
4.4.
131
LEFT AND RIGHT SPECTRAL DIVISORS
Suppose now that for every set of k vectors x 0 (a), ... , x~- 1 >(a) there exists a unique stable solution x(t) such that xU>( a) = x~>(a), j = 0, ... , k - 1. Write x(t) = X 1 ethc 1 for some c 1 ; then (4.26) For every right-hand side of (4.26) there exists a unique c 1 such that (4.26) holds. This means that col(X 1 JD~,;;-J is square and nonsingular and the existence of a right monic divisor L 1 (A.) of degree k with a(L 1 ) in the open left half-plane follows from Theorem 3.12. D 4.4. Left and Right Spectral Divisors
Example 4.3 indicates that there remains the interesting question of when a set of kn eigenvalues-and a surrounding contour r -determine both a r-spectral right divisor and a r-spectral left divisor. Two results in this direction are presented. Theorem 4.11. Let L be a monic matrix polynomial and r a contour consisting of regular points of L having exactly kn eigenvalues of L (counted according to multiplicities) insider. Then L has both a r-spectral right divisor and a r-spectralleft divisor if and only if the nk x nk matrix Mk,k defined by
l .
L -1(A.)
= -1
Mk
,k
2rci
[
r
.
A_k-1 L -1(A.)
·
·
A_k-1L -1(A.)
l
d111.
(
4.27)
A_2k-2L -1(A.)
is nonsingular. If this condition is satisfied, then the r-spectral right (resp. left) divisor L 1(A.) = JA.k + D:J L 1jA.j (resp. Lz(A.) = JA.k + D,;;-J L 2jA.j) is given by the formula: [L1o
(
.. · L1 k-1] = - ~ f [A.kL - 1(A.) · 2rczJr resp. [
L~o
L2 ..k-t
l
= - M;l· _1 .
.. ·
~
A. 2k- 1L - 1(A.)] dA.· M;l ·
l
[ A_kL 1(A.) dA.). 2rciL A_2k-1L-1(A.)
Proof Let X, T, Y be a standard triple for L, and define X 1, T1, Y1 and X 2 , T2 , Y2 as in the proof of Theorem 4.2. Then, as in the proof ofthat theorem, we obtain (4.28)
4. SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
132
Then nonsingularity of M k, k implies the nonsingularity ofboth factors and, as in the proof of Theorem 4.2 (Theorem 4.4), we deduce the existence of a r-spectral right (left) divisor. Conversely, the existence of both divisors implies the nonsingularity of both factors on the right of (4.28) and hence the nonsingularity of M k, k. The formulas for the r -spectral divisors are verified in the same way as in the proof of Theorem 4.2. 0 In terms of the matrices T(Li 1 ; a, {3, y) introduced in Section 4.1, Theorem 4.11 can be restated as follows. Theorem 4.11'. Let L and r be as in Theorem 4.11. Then L has right and left r-spectral divisors L 1 (.~.) = JAk + ~}:6 L 1j)) and Lz(A.) = Hk + 2}:6 L 2jA.j, respectively, if and only ifdet T(L - 1 ; -1, -k, -2k + 1) =1- 0. In this case the coefficients of L 1 (A.) and L 2(A.) are given by the formulas
[L1,k-1
L1,k-2
...
L 10 ] = -T(Li 1; - k - 1, - k - 1, -2k)
l/~J ~
·(T(Li 1; -1, -k, -2k
+ 1))-1,
-(T(LC'; -1, -k, -2k
+ !))-'
· T(Li 1; - k - 1, -2k, -2k).
In the next theorem we relax the explicit assumption that r contains exactly the right number of eigenvalues. It turns out that this is implicit in the assumptions concerning the choice of I and k. Theorem 4.12. Let L be a monic matrix polynomial of degree l ::::; 2k, and let r be a contour ofregular points ofL. Ifdet Mk,k =1- 0 (Mk,k defined by (4.27)) then there exist a r-spectral right divisor and a r-spectralleft divisor.
Proof We prove Theorem 4.12 only for the case I= 2k; in the general case l < 2k we refer to [36b]. Suppose that r contains exactly p eigenvalues of Lin its interior. As in the preceding proofs, we may then obtain a factorization of M k, k in the form (4.28) where X 1 = X IAt, T1 = T IAt and A is the pdimensional invariant subspace of L associated with the eigenvalues insider. The right and left factors in (4.28) are then linear transformations from flnk to A and A to flnk respectively. The nonsingularity of M k, k therefore implies p = dim A ?: dim flnk = kn.
4.5. CANONICAL FACTORIZATION
133
With Yb X 2 , T2 , Y2 defined as in Theorem 4.2 and using the biorthogonality condition (2.14) we have
XY O= [
XTY
X~Y XTk.-ly X 2 Y2
= Mk k + [
X2
: X2T~- 1 Y2
···
T~- 1 Y2 ] : ·
X2nk- 2Y2
Thus, the last matrix is invertible, and we can apply the argument used in the first paragraph to deduce that In - p 2 kn. Hence p = kn, and the theorem follows from Theorem 4.11. 0 A closed rectifiable contour 3, lying outsider together with its interior, is said to be complementary tor (relative to L) if 3 contains in its interior exactly those spectral points of L(A.) which are outsider. Observe that L has right and left r-spectral monic divisors of degree kif and only if L has left and right 3-spectral monic divisors of degree l - k. This obs~rvation allows us to use Theorem 4.12 for 3 as well as for r. As a consequence, we obtain, for example, the following interesting fact. Corollary 4.13. Let L and r be as in Theorem 4.12, and suppose l = 2k. Then M k, k is nonsingular if and only if M~. k is nonsingular, where 3 is a contour complementary tor and M~.k is defined by (4.27) replacing r by 3.
For n = 1 the results of this section can be stated as follows. Corollary 4.14. Let p(A.) = L}=o ajA.j be a scalar polynomial with a1 :f. 0 and p(A.) :f. Ofor A. E r. Then an integer k is the number of zeros ofp(A.) (counting multiplicities) inside r if and only if
rank Mk,k = k
for
k 2 l/2
rank M~-k.l-k = l- k
for
k :-s; l/2,
or
where 3 is a complementary contour tor (relative to p(A.)). 4.5. Canonical Factorization
Let r be a rectifiable simple closed contour in the complex plane fl bounding the domain F+. Denote by p- the complement of p+ u r in
4.
134
SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
(/; u {oo }. We shall always assume that 0 E F+ (and oo E F-). The most important example of such a contour is th~:: unit circle r 0 =PIP-I= 1}. Denote by c+(r) (resp. c-(r)) the class of all n )( n matrix-valued functions G(A), A E r u F+ (resp. A E r u F-) such that G(A) is analytic in F+ (resp. F-) and continuous in F+ u r (resp. F- u r). A continuous and invertible n x n matrix function H(A) (A E r) is said to admit a right canonical factorization (relative to r) if H(A)
= H +(A)H _(A),
(4.29)
(AE r)
where H$ 1(A) E c+(r), H~ 1(A) E c-(r). Interchanging the places of H +(A) and H _(A) in (4.29), we obtain left canonical factorization of H. Observe that, by taking transposes in (4.29), a right canonical factorization of H(A) determines a left canonical factorization of (H(A))r. This simple fact allows us to deal in the sequel mostly with right canonical factorization, bearing in mind that analogous results hold also for left factorization. In contrast, the following simple example shows that the existence of a canonical factorization from one side does not generally imply the existence of a canonical factorization from the other side. EXAMPLE
4.5.
Let H(A.)
and let
r
=
=
[
,t-1
0
1]
A. '
r 0 be the unit circle. Then
is a right canonical factorization of H(A.), with H +(A.)= [
0 A.I]
-1
and
On the other hand, H(A.) does not admit a left canonical factorization.
D
In this section we shall consider canonical factorizations with respect to a rectifiable simple contour r of rational matrix functions of the type R(A) =
LJ=
-r
AjAj.
We restrict our presentation to the case that R(A) is monic, i.e., As = I. This assumption is made in order to use the results of Sections 4.1 and 4.4, where only monic matrix polynomials are considered; but there is nothing essential in this assumption and most of the results presented below remain valid without the requirement that As = I (cf. [36c, 36d, 73]).
4.5.
135
CANONICAL FACTORIZATION
We introduce some notation. Denote by GL(€") the class of all nonsingular n x n matrices (with complex entries). For a continuous n x n matrix-valued function H(),): r-+ GL(€") define Di = (2ni)- 1
and define T(Hr;
IX,
LA- i- 1 H(A) d),,
f3, y) to be the following block Toeplitz matrix: Dp
T(Hr;
IX,
/3,
y) =
r:
Dp-1
Dp+1
Dp
D~
D~-1
(4.30)
We have already used the matrices T(Hr; IX, /3, y) in Sections 4.1 and 4.4. For convenience, we shall omit the subscript r whenever the integration is along r and there is no danger of misunderstanding. The next theorem provides the characterization of right factorization in terms of some finite sections of the infinite Toeplitz matrix T( ·; oo, 0, - oo ). Theorem 4.15. A monic rational matrix function R(A) = 2:}: :.r Ai),i H': r-+ GL(€") admits a right canonical factorization if and only if
rank T(Ri 1 ; r- 1, 0, - r- s
+
+ 1)
=rank T(Ri 1 ;r- 1, -1, -r- s) = rn. (4.31) Proof We first note that the canonical factorization problem can be reduced to the problem of existence of r-spectral divisors for monic polynomials. Indeed, let R(A) = 1}: :.r Ai),i + H' be a monic rational function and introduce a monic matrix polynomial M(A) = ;,rR(A). Then a right canonical factorization R(),) = W+(A)W_(A) for R(A) implies readily the existence of right monic r -spectral divisor ;,rw_ (A) of degree r for the polynomial M(A). Conversely, if the polynomial M(A) has a right monic r-spectral divisor N 1(A) of degree r and M(),) = N 2 (A)N 1 (A), then the equality R(A) = N 2 (A) · rrN 1 (),) yields a right canonical factorization of R(A). Now we can apply the results of Section 4.1 in the investigation of the canonical factorization. Indeed, it follows from Theorem 4.3 that the polynomial M(),) has a r-spectral right divisor if and only if the conditions of Theorem 4.15 are satisfied (here we use the following observation: T(R- 1; IX,
/3, y) =
T(M- 1; IX+ r,
f3
+ r, y + r)).
0
The proof of the following theorem is similar to the proof of Theorem 4.15 (or one can obtain Theorem 4.16 by applying Theorem 4.15 to the transposed rational function).
4.
136 Theorem 4.16.
SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
A monic rational function R(A)
=
2:}:~, AiAi
r--+ GL(C") admits a left canonicalfactorization if and only if rank T(R- 1 ; r- 1, -s, - r - s + 1) =rank T(R- 1 ;r- 1, -s, - r - s) = rn.
+ H':
(4.32)
In the particular case n = 1 Theorem 4.15 implies the following characterization of the winding number of a scalar polynomial. The winding number (with respect to r) of a scalar polynomial p(A) which does not vanish on r is defined as the increment of(1/2n) [argp(A)] when Aruns through r in a positive direction. It follows from the argument principle ([63], Vol. 2, Chapter 2) that the winding number of p(A) coincides with the number of zeros of p(A) inside r (with multiplicities). Let us denote by 3 a closed simple rectifiable contour such that 3 lies in F- together with its interior, and p(A) =1= 0 for all points A E F- \ { oo} lying outside 3. Corollary 4.17. Let p(A) = D~o aiAi be a scalar polynomial with a 1 =I= 0 and p(A) =I= 0 for A E r. An integer r is the winding number of p(A) if and only if the following condition holds: rank T(p- 1 ; -1, -r, -1- r
+
1)
=rank T(p- 1 ; -1, - r - 1, -1- r) = r. This condition is equivalent to the following one:
rank T(p:S 1 ; - I, - s, -I - s + 1) = rank T(p:S 1 ; - 1, - s - 1, -1 - s) = s, where s
= l - r.
The proof of Corollary 4.17 is based on the easily verified fact that r is the winding number of p(A) if and only if the rationalfunction A-'p(A) admits onesided canonical factorization (because of commutativity, both canonical factorizations coincide). The last assertion of Corollary 4.17 reflects the fact that the contour 3 contains in its interior all the zeros of p(A) which are outside r. So p(A) has exactly s zeros (counting multiplicities) inside 3 if and only if it has exactly r zeros (counting multiplicities) inside r. Now we shall write down explicitly the factors of the canonical factorization. The formulas will involve one-sided inverses for operators of the form T(H; rx, /3, y). The superscript I will indicate the appropriate one-sided inverse of such an operator (refer to Chapter S3). We shall also use the notation introduced above. The formulas for the spectral divisors and corresponding quotients obviously imply formulas for the factorization factors. Indeed, let
4.5. CANONICAL FACTORIZATION
137
R(A.) = IA.s + ~};; ~r AiA.i: r---+ GL(~") be a monic rational function, and let W+(A.) = JA.S + IJ:6 wt A_i and W_(A.) = I + LJ= 1 wr-- jA.- j be the factors of the right canonical factorization of R(A.): R(A.) = W+(A.)W_(A.). Then the polynomial A.rW_(A.) is a right monic r-spectral divisor of the polynomial M(A.) = A.rR(A.), and formulas (4.12) and (4.13) apply. It follows that [Wr-- 1 W;_ 2
•••
W 0 ] = -T(Ri 1; -1, -1, -r- s) · [T(Ri 1 ; r- 1, 0, -r- s
+
1)J, ·(4.33)
col(Wnj:6 = -[T(R:S 1;r- 1, -s, -2s + 1)J · T(R:S 1 ; r - s - 1, - 2s, - 2s). (4.34) Let us write down also the formulas for left canonical factorization: let A+(A.) = IA.s + Ii=6 At A_i and A _(A.) = I + LJ= 1 Ar-:_ iA.- i be the factors of a left canonical factorization of R(A.) = IA.s + ~r AiA.i: r---+ GL(~"), i.e., R(A.) = A_(A.)A+(A.). Then
IJ;;
col(A;-x:J = -(T(Ri 1;r- 1, -s, -r- s · T(Ri 1; -1, -r- s, -r- s), [As+-1
As+-z
...
+ l)Y
AriJ=-T(R:S 1;-1,-1,-r-s) · (T(R:S 1 ; r - 1, 0, - r - s + 1))1.
(4.35)
(4.36)
Alternative formulas for identifying factors in canonical factorization can be provided, which do not use formulas (4.12) and (4.13). To this end, we shall establish the following formula for coefficients of the right monic r-spectral divisor N(A.) of degree r of a monic matrix polynomial M(A.) with degree l: [Nr
Nr-1
...
No]=[/ 0 ... O]·[T(M- 1;0,-r,-r-l)]i, (4.37)
where Ni = Y0 Ni (j = 0, 1, ... , r - 1) and Nr = Y0 . Here Ni are the coefficients of N(A.): N(A.) = IA.r + IJ:6 NiA.i; and Y0 is the lower coefficient of the quotient M(A.)N- 1(A.): Y0 = [M(A.)N- 1(A.)] l;.=o. It is clear that the matrix function Y(A.) = Y0 N(A.)M- 1(A.) is analytic in p+ and Y(O) = I. Then (4.38) and for
j = 0, 1, ...
(4.39)
138
4.
SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
Indeed, (4.39) is a consequence of analyticity of the functions ..1/Y0 N(A.)M- 1 (..1.) in F+ for j = 0, 1, .... Formula (4.38) follows from the residue theorem, since . 1. 0 = 0 is the unique pole of the function r 1 Y0 N(A.)M- 1 (..1.) in F+. Combining (4.38) with the first l equalities from (4.39), we easily obtain the following relationship: ...
0
Y0 N 0 ]·T(M;: 1 ;0,-r,-r-l)=[I
0],
which implies (4.37) immediately. A similar formula can be deduced for the left r-spectral divisor <1>(..1.) = IX + Ii:6
= 1 + Ii=1 wt A.i, w_(A.) =
Ii=o
w;_jrj.
Then (4.40) implies the following identification of W_(A.):
[W,.
W..- 1
W0 ]=[I
0
O]·[T(R;: 1 ;r,O,-r-s)] 1.
...
One can use formula ( 4.39) in an analogous way for left canonical factorization. We conclude this section with a simple example. ExA~PLE
4.6.
Let R(A)
=
IA +
[ -~; 211 + [-12
-2]-1 4 A '
and let r be the unit circle. In the notation used above, in our case r = s = I. Let us compute ranks of the matrices T(Ri 1 ; 0, 0, -1) and T(Ri 1 ; 0, -1, -2). A computation shows that det R(A) = (A 2 - 4)( 1 + JA- 1 ), so _ 1
[A + 2+ 4A-
1 (A)=(A2-4)(1+Ii=l)
R
-4-2A-1
-1
1
and, multiplying numerator and denominator by A,
R Since Ao
=
1 _1 (A)= (_ii:_ 4)(A +
~)
l
+ 2A -1
A-~-
r
1
l
- A+ 2 [A 2 + 2A + 4 A2 - ~A- 1 . -4A- 2
-±is the only pole of R- 1 (A) inside
r, we can compute the integrals j
=
0, 1, -1
4.6.
139
THEOREMS ON TWO-SIDED CANONICAL FACTORIZATION
by the residue theorem. It turns out that
-~-. ~ .!.7rl
So T(Ri 1 ;0,0, -1)=
[-;s
f
T(Ri
0, -1, -2) =
-i]
13
l
3
-TI
l" I
2
0
0
2
1;
[~0
A.R -t(A.) dA. =
r
!]
0 '
'
13
-TI -3
30
-~
0
0
I
13
15
3 -TI
1
I
0
2
-d
It is easily seen that
rank T(Ri 1 ; 0, 0, - 1) = rank T(Ri 1 ; 0, -1, - 2) = 2, so according to Theorem 4.15, R(A.) admits right canonical factorization. Also, rank T(R- 1 ;0, -1, -1) =rank T(R- 1 ;0, -1, -2) = 2, so R(A.) admits a left canonical factorization as well. (In fact, we shall see later in Theorem 4.18 that for a trinomial R(A.) = !A + R0 + R _ 1 A.- 1 the necessary and sufficient condition to admit canonical factorizations from both sides is the invertibility of
which holds in our example.) It is not difficult to write down canonical factorizations for R(A.) using formulas (4.33) and (4.35): R A. = ()
[A.-4 2
0 A.+2
J[1 + 0tA.-
1
A. -
1
t]
is a right canonical factorization of R(A.), and
is a left canonical factorization of R(A.).
D
4.6. Theorems on Two-Sided Canonical Factorization
We apply the results of Section 4.4 to obtain criteria for the simultaneous existence ofleft and right canonical factorization of a monic rational function.
140
4. SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
The main criterion is the following:
Theorem 4.18. Let R(A.) = Lj;;; :, AiA.i + JA.•: r--+ GL(C") be a monic rational function, and let q 0 = max(r, s). Then R(A.) admits a right and a left canonical factorization if and only if the matrix T(Rr 1 ; q - 1, 0, -q + 1) is nonsingular for some q ~ q 0 . If this condition is satisfied, then T(Rr 1 ; q - 1, 0, - q + 1) is nonsingular for every q ~ q 0 • The proof of Theorem 4.18 is based on the following two lemmas, which are of independent interest and provide other criteria for two-sided canonical factorization.
Lemma 4.19. A monic rational function R(A.) = l:j: :, AiA.i + JA.•: r--+ GL(C") admits a right and a left canonical factorization if and only if the matrix T(R - 1 ; r - 1, 0, - r + 1) is nonsingular and one of the two following conditions holds: rank T(Ri 1 ; r- 1, 0, - r - s- 1) =rank T(Ri 1 ;r- 1, -1, - r - s) = rn
(4.41)
rank T(Ri 1 ;r -1, -s, - r - s + 1) =rank T(Ri 1 ;r- 1, -s, - r - s) = rn.
(4.42)
or
Proof As usual, we reduce the proof to the case of the monic matrix polynomial M(A.) = A.'R(A.). Let (X, T, Z) be a standard triple of M(A.), and let (X+, T+, Z +)be its part corresponding to the part of fJ(M) lying insider. As it follows from Theorem 4.1, the right (resp. left) monic r-spectral divisor N(A.) (resp. (A.)) of degree r exists if and only if the matrix col(X + Ti+)}:6 (resp. the matrix [Z + T+ Z + T'+- 1 Z +])is nonsingular. We therefore have to establish the nonsingularity of these two matrices in order to prove sufficiency of the conditions of the lemma. Suppose T(M- 1 ; -1, - r, - 2r + 1) is nonsingular and, for example, (4.41) holds. The decomposition T(Mi 1 ; -1, -r, -2r
+
1) = col(X+ T'+- 1 -i)}:6 ·[Z+
T+Z+
T'+- 1 Z+]
(4.43)
(cf. (4.5)) allows us to claim the right invertibility of col( X+ Ti,_ )}:6. On the other hand, equality (4.41) implies that Ker col(X + Ti,_)f:J = Ker col(X + Ti,_)]:6 holds for every p ~ r (this can be shown in the same way as in the proof of Theorem 4.3). But then Ker[col(X + Ti,_)j:6J = Ker[col(X + Ti+)~;;;~] = {0}, which means invertibility of col( X+ Ti+)}: [Z+
T+Z+
6. The invertibility of T'_; 1Z+]
4.6. THEOREMS ON TWO-SIDED CANONICAL FACTORIZATION
141
follows from (4.43). Conversely, suppose that the linear transformations col(X+Ti+)}-:6 and [Z+ Z+T+ r+- 1 Z+] are invertible. Then by (4.43) T(Mi 1 ; -1, -r, -2r + 1) is also invertible, and by Theorems 4.15 and 4.16 conditions (4.41) and (4.42) hold. D Given a contour r c rt and a monic matrix polynomial M(A.): r--+ GL(rtn), a closed rectifiable simple contour 3 will be called complementary to r(relative to M(A.)) if3lies in F- together with its interior, and M(A.) E GL(rtn) for all points A. E F- \ { oo} lying outside 3. In the sequel we shall use this notion for rational matrix functions as well. In the next lemma 3 denotes a contour complementary to r (relative to R(A.)).
Lemma 4.20. For a monic rational function R(A.) = IA.s + L}: :r AiA.i: r --+ GL( rtn) to admit simultaneously a right and a left canonical factorization it is necessary and sufficient that the following condition holds: in the case r ~ s, the matrix Tr
= T(Ri 1 ; r - 1, 0, - r + 1) is non-
in the case r .:::;; s, the matrix T5 is nonsingular.
= T(R5 1 ; r - 1, r - s, r - 2s + 1)
(a) singular;
(b)
Moreover, in case (a), the nonsingularity ofTr implies the nonsingularity ofT5 , while, in case (b), the nonsingularity of T5 leads to the nonsingularity of Tr. Proof Passing, as usual, to the monic polynomial M(A.) = A.'R(A.) of degree l = r + s, we have to prove the following: M(A.) has a right and a left monic r-spectral divisor of degree r if and only if, in the case l ~ 2r, the matrix T(Mi 1 ; -1, -r, -2r + 1) is nonsingular, or, in the case l ~ 2r, the matrix T(M5 1 ; -1, -(1- r), -2(1- r) + l)isnonsingular. This is nothing but a reformulation of Theorem 4.12. The last assertion of Lemma 4.20 can be obtained, for instance, by applying Lemma 4.19. D
Note that in fact we can always realize case (a). Indeed, if r < s, regard AiA.i + JA.• with A_q = · · · = A_r_ 1 = 0, where q ~sis an arbitrary integer. Proof of Theorem 4.18. First suppose s .:::;; r, and let q be an arbitrary integer such that q ~ r. Consider R(A.) as L]: :q A iA.i + Hs with A -q = · · · = A_r_ 1 = 0. In view of Lemma 4.20 (in case (a)) existence of a two-sided canonical factorization of R(A.) is equivalent to the nonsingularity of T(Rf.1 ; q - 1, 0, -q + 1). Supposenows > r.ConsideringR(A.)asi]::.AiA.i + lA5 withA_ 5 = ··· = A - r - 1 = 0, and applying the part of the theorem already proved (the case s = r), we obtain that R(A.) admits a two-sided canonical factorization iff the matrix T(Ri 1 ; q - 1, 0, - q + 1) is invertible for some q ~ s. D R(A.) as
Ir::q
142
4. SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
The following corollary from Theorem 4.18 is also of interest. Corollary 4.21. Let M(A.) = H 1 + ~J:b MiA,i: r--+ GL({:") be a monic matrix polynomial. Then u(M) n F+ = 0 if and only if the matrix T(M- 1 ; q - 1, 0, -q + 1) is nonsingular for some integer q 2: I. Proof Regard M(A,) as a monic rational function with r = 0, and note that M(A.) admits a two-sided canonical factorization if and only if u(M) n F+ = (/). D
4.7. Wiener- Hopf Factorization for Matrix Polynomials
We continue to use notations introduced in Section 4.5. A continuous and invertible n x n matrix function H(A) (A. E r) is said to admit a right WienerH opf factorization (relative to r) if the following representation holds:
0
A."• H(A.) = H +(A.)
r
A-"2
l
H _(A_)
O .
(4.44)
(A E r),
A""
whereH+(A,),(H+(A))- EC+(r);H_(A_),(H_(A_))- 1 EC-(r),andK 1 :;5; ••• :;5; Kn are integers (positive, negative, or zeros). These integers are called right partial indices of H(A.). Interchanging the places of H +(A.) and H _(A,) in (4.44), we obtain left factorization of H(A.). So the left Wiener-Hopf factorization is defined by 1
0
A."'
H(A)
~ H'_(A)
r
O A"' .
(AE r),
(4.45)
where H'+(A), (H'+(A,))- 1 E c+(r); H'_(A_), (H'_(A_))- 1 E c-(r), and K 1 :::;; ••• :::;; K~ are integers (in general they are different from Kt. ... , Kn), which naturally are called left partial indices of H(A). For brevity, the words" WienerHopf" will be suppressed in this section (so "right factorization" is used instead of" right Wiener-Hopf factorization"). In the scalar case (n = 1) there exists only one right partial index and only one left partial index; they coincide (because of the commutativity) and will be referred to as the index (which coincides with the winding number introduced in Section 4.5) of a scalar function H(A.). Note also that the canonical factorizations considered in Sections 4.5-4.6 are particular cases of (right or left) factorization; namely, when all the partial indices are zero. As in the case 1
4.7.
WIENER-HOPF FACTORIZATION FOR MATRIX POLYNOMIALS
143
of canonical factorization, right factorization of H(A.) is equivalent to left factorization of HT(A.). So we shall consider mostly right factorization, bearing in mind that dual results hold for left factorization. First we make some general remarks concerning the notion of right factorization (relative tor). We shall not prove these remarks here but refer to [25, Chapter VIII], (where Wiener-Hopf factorization is called canonical) for proofs and more detail. The factors H +(A.) and H _(A.) in (4.44) are not defined uniquely (for example, replacing in (4.44) H +(A.) by r:t.H +(A.) and H _(A.) by r:t.- 1H _(A.), r:t. E C\ {0}, we again obtain a right factorization of H(A.)). But the right factorization indices are uniquely determined by H(A.), i.e., they do not depend on the choice of H ±(A.) in (4.44). Not every continuous and invertible matrix-function on r admits a right factorization; description of classes of matrix-functions whose members admit a right factorization, can be found in [25]. We shall consider here only the case when H(A.) is a matrix polynomial; in such cases H(A.) always admits right factorization provided det H(A.) # 0 for all A E r. We illustrate the notion of factorization by an example. EXAMPLE
and
r
4.7.
Let
be the unit circle. Then
is a right factorization of H(A.), with H+(A.) = [ The right partial indices are
-~ ~] K1
=
K2
and
= 1. A left factorization of H(A.) is given by
[0I A.-1 [10 A.0J 1
H(A.) = where 1 H _(A.)= [ 0
The left partial indices are
K'1 =
0,
K~ =
]
2
;.-!] I
'
2.
0
'
H +(A.) = I.
This example shows that in general left and right partial indices are different. But they are not completely independent. For instance, the sums = 1 Ki and = 1 Ki of all right partial indices K 1 ~ · · · ~ Kn and all left
Li
Li
4.
144
SPECTRAL DIVISORS AND CANONICAL FACTORIZATION
partial indices K'1 s · · · s K~, respectively, coincide. This fact is easily verified by taking determinants in (4.44) and (4.45) and observing that 2::7= 1 Ki, as well as 1 K;, coincides with the index of det H(A). Now we provide formulas for partial indices of matrix polynomials assuming, for simplicity, that the polynomials are monic. To begin with, consider the linear case L(A) =!A- A. Let A = diag[A 1 , A 2 ] be a block representation where the eigenvalues of A 1 are in p+ and the eigenvalues of A 2 are in p-. Then the equality
D=
!A _
0 J[AI0 OJI [I - r0 A o1]
A= [I0
1
1
!A- A 2
is a right factorization of I A - A and the partial indices are (0, 0, ... , 0, I, 1, ... , 1), where the number of ones is exactly equal to rank A 1 ( = rank Im Jr (I A - A)- 1 dA). For a matrix polynomial, this result can be generalized as follows. Theorem 4.22. Let L(A) be a monic matrix polynomial of degree m with det L(A) =I= 0 for A E r. Then for the right partial indices KI s K2 s ... s Kn of L(ll) the following equalities hold Ki
= I {j In + rj _ 1
-
rj
s
i - 1, j
= 1, 2, ... , m} I
( i = 1, 2, ... , n ),
where B- 1 B_2 rj =rank [ .
B- . = 1
B_j
~-:
JAj- L-
2m r
I
1(/l)
dll
B-j-m-1
and IQ I denotes the number of elements in the set Q. The same formulas for Kj hold in case r is the unit circle and the Bj are the Fourier coefficients of L - 1 (/l): CD
L
L - 1 (A) =
j=-
AjBj
(I Ill
=
1).
CD
In some special cases one can do without computing the contour integrals, as the following result shows. (Am
Theorem 4.23. Suppose that all the eigenvalues of L(ll) = L}=o A jllj = I) are inside r and L(O) = I. Then Kj
= I {i In + qi- 1
-
qi
s
j - 1, i = 1, 2, ... , n} I
4.7.
WIENER~HOPE FACTORIZATION FOR MATRIX POLYNOMIALS
145
where
qi =rank
l
PC'+1 PC
:
1 ,
I
0
pcv+i-1
v ;::::: 0 is the minimal integer for which rank C = rank projector on the first n coordinates.
c+ 1 ,
and P is the
In fact, the assumption that L(A.) be monic is not essential in Theorems 4.22 and 4.23 and they hold for any matrix polynomial, whose spectrum does not intersect r. For the proofs of Theorems 4.22 and 4.23 we refer to [36e] (sec also [35b]). Comments
The results of Sections 4.1 and 4.4 originated in [34b, 36b]. Theorem 4.2 (without formulas for the divisor) is due to Lopatinskii [58]; however, his proof is quite different. Formula (4.3) appears for the first time in [36b]. Section 4.2 is based on the paper [52f]. Theorem 4.6 was first proved in [56c], although historically results like Corollary 4.7 began with the search for complete sets of linearly independent generalized eigenvectors, as in [56b]. Sufficient conditions for the existence of a right divisor of degree one found by methods of nonlinear analysis can be found in [54] as well as references to earlier works in this direction. Lemma 4.9 is taken from [10] and Theorem 4.10 was suggested by Clancey [ 13]. Sections 4.5 and 4.6 are based on [36b], where analogous results are proved in the infinite dimensional case. The problems concerning spectral factorization for analytic matrix and operator functions have attracted much attention. Other results are available in [36b, 61a, 61b, 6lc]. A generalization of spectral factorization, when the spectra of divisor and quotient are not quite separated, appears naturally in some applications ([3c, Chapter 6]). The canonical factorization is used for inversion of block Toeplitz matrices (see [25, Chapter VIII]). It also plays an important role in the theory of systems of differential and integral equations (see [3c, 25, 32a, 68]). Existence of a two-sided canonical factorization is significant in the analysis of projection methods for the inversion of Toeplitz matrices (see [25]). The last section is based on [36a]. For other results connected with Wiener~Hopf factorization and partial indices see [36c, 36d, 70c, 73]. More information about partial indices, their connection with Kronecker indices, and with feedback problems in system theory can be found in [21, 35b]. Extensions of the results of this chapter to the more general case of spectral divisors, simply behaved at infinity, and quasi-canonical factorization in the infinite dimensional case are obtained in [36c, 36d].
Chapter 5
Perturbation and Stability of Divisors
Numerical computation of divisors (using explicit formulas) leads in a natural way to the analysis of perturbation and stability problems. It is generally the case (but not always) that, if the existence of a right divisor depends on a "splitting" of a multiple eigenvalue of the parent polynomial, then the divisor will be practically impossible to compute, because random errors in calculation (such as rounding errors) will "blow up" the divisor. Thus, it is particularly important to investigate those divisors which are "stable" under general small perturbations of the coefficient matrices of L(A.). This will be done in this chapter (Section 5.3). More generally, we study here the dependence of divisors on the coefficients of the matrix polynomial and on the supporting subspace. It turns out that small perturbations oft he coefficients of the parent polynomial and of the supporting subspace lead to a small change in the divisor. For the special case when the divisor is spectral, the supporting subspace is the image of a Riesz projector (refer to Theorem 4.1) and these projectors are well known to be stable under small perturbations. Thus, small perturbations of the polynomial automatically generate a small change in the supporting subspace, and therefore also in the divisor. Hence spectral divisors can be approximately computed. Generally speaking, non spectral divisors do not have this property. The stable divisors which are characterized in this chapter, are precisely those which enjoy this property. 146
5.1.
147
THE CONTINUOUS DEPENDENCE OF DIVISORS
We also study here divisors of monic matrix polynomials with coefficients depending continuously or analytically on a parameter. We discuss both local and global properties of these divisors. Special attention is paid to their singularities. The interesting phenomenon of isolated divisors is also analyzed. 5.1. The Continuous Dependence of Supporting Subspaces and Divisors
With a view to exploring the dependence of divisors on supporting subspaces, and vice versa, we first introduce natural topologies into the sets of monic matrix polynomials of fixed size and degree. Let&\ be the class of all n x n monic matrix polynomials of degree k. It is easily verified that the function CJk defined on &>k x &>k by
is a metric, and &>k will be viewed as the corresponding metric space. Let d be the class of all subspaces in en 1• We consider d also as a metric space where the metric is defined by the gap ()(At,.%) between the subspaces A, JV E en! (see Section S4.3). For a monic matrix polynomial L(A.) = H 1 + 2}::~ AiA.i (Ai are n x n matrices) recall the definition of the first companion matrix CL: 0 0
I
0
0
I
0
0 -A2
0 0
CL =
0 -Ao
-Al
I
-Al-l
Consider the set fll c .91 x & 1 consisting of pairs {At, L(A.)} for which A is an invariant subspace in en! for the companion matrix CL of L. This set 1(1 will be called the supporting set and is provided with the metric induced from d x & 1 (so an ~>-neighborhood of {A, L} E 1fl consists of all pairs {A, L} E fll for which ()(Ji, A) + CJ(L, L) < ~>). Then define the subset 1flk c 1(1 consisting of all pairs {A, L(A.)}, where L(A.) E & 1 and A is a supporting subspace (with respect to the standard pair ([I 0 . . . 0], CL) of L(.-1.); see Section 1.9), associated with a monic right divisor of L of degree k. The set fllk will be called the supporting set of order k.
5.
148 Theorem 5.1.
PERTURBATION AND STABILITY OF DIVISORS
The set "'f"k is open in 11".
Proof Define the subspaceW,_kof~"'bytheconditionx = (x 1 , . . . , x 1) E W1_k if and only if x 1 = · · · = xk = 0. Suppose that A is a supporting subspace for L(A.); then by Corollary 3.13, {A, L(A.)} E "'f"k if and only if A
+ w,_k =
~"'.
Now let (A, L) E "'f"k and let (Ji, L) E "'f" be in an e-neighborhood of (A, L). Then certainly 8(A, Ji) < e. By choosing e small enough it can be guaranteed, using Theorem S4. 7, that Ji ~ k = ~" 1 • Consequently, Ji is a supporting subspace for Land (Ji, L) E "'f"k. 0
+ _
We pass now to the description of the continuous dependence of the divisor and the quotient connected with a pair (A, L) E ~- Define a map Fk: '"lfl,---+ f!J 1_k x f!Jk in the following way: the image of (A, L) E "'f"k is to be the pair of monic matrix polynomials (L 2 , L 1) where L 1 is the right divisor of L associated with A and L 2 is the quotient obtained on division of L on the right by L 1. It is evident that F k is one-to-one and surjective so that the map Fi: 1 exists. For (L 2 , L 1), (L 2 , L 1) E f!J,_k x f!Jk put
p((L 2, L 1 ),
(I 2 , I 1 )) =
a,_k(L 2, L 2)
+ aiLt. L 1),
where a; is defined by (5.1); so f!J 1_k x f!Jk is a metric space with the metric p. Define the space f!Jo to be the disconnected union: f!Jo = u~~\ (f!J,_k X f!Jk). In view of Theorem 5.1, the space 11"0 = U~~ 11 "'f"k is also a disconnected union of its subspaces 11"1, ... , "/f/;_ 1. The map F between topological spaces 11"0 and f!IJ 0 can now be defined by the component maps F 1, ... , F 1_ 1 , and F will be invertible. If X t. X 2 are topological spaces with metrics p 1 , p 2 , defined on each connected component of X 1 and X 2 , respectively, the map G: X 1 ---+ X 2 is called locally Lipschitz continuous if, for every x EX 1, there is a deleted neighborhood Ux of x for which sup (p 2 (Gx, Gy)jp 1 (x, y)) < oo. yeUx
Theorem 5.2.
The maps F and F- 1 are locally Lipschitz continuous.
Proof It is sufficient to prove that Fk and Fi: 1 are locally Lipschitz continuous fork = 1, 2, ... , l - 1. Let (A, L) E "'f"k and Fk(A, L) = (L 2 , Ld.
Recall that by Theorem 3.12 the monic matrix polynomial L 1(A.) has the following representation: L1(A.) = IAk- XCi(W1 + W2A. + · · · + mcA.k- 1),
(5.2)
5.1. THE CONTINUOUS DEPENDENCE OF DIVISORS
where X equality
=
[I
0 · · · 0], W1 ,
... ,
U'k
149
are nk x n matrices defined by the
(5.3)
and Qk = [Ikn 0] is a kn x In matrix. In order to formulate the analogous representation for L 2(A.), let us denote Y = [0 · · · 0 J]T, R 1 _k = [Y CL y . . . ci_-k- 1 Y]. Let p be the projector on Im R 1_k = {x = (x 1 ,
.•. ,
x 1)E C'" 1 lx 1
= ··· =
xk
= 0}
along A, Then (Corollary 3.18) L2(A.) = Hl-k- (Z1 + Z2A. + · · · + Zl-kA_l-k- 1)Pci_-kPY,
(5.4)
where
(in the last equality R 1 _k is considered as a linear transformation from c-•(l-k) into Im R 1 _k; with this understanding the inverse R 1-_1k: Im R 1-k ~ c-•(l-k) exists). To prove the required continuity properties of Fk: "'ffk ~ f!JJ 1_k x f!/Jk it is necessary to estimate the distance between pairs (L 2 , L 1 ), (L 2 , L 1 ) in f!JJ 1 _k x f!/Jk using the metric p defined above. Note first that if P"' is the projector on A along lm R 1 _k, then P = I - P.41 (the projector appearing in (5.4)), and that (Qk 1.41)- 1 = pol1 ckl in (5.3), where ckl: c-kn ~ € 1" is the imbedding of C'k" onto the first kn components of € 1". Then observe that in the representations (5.2) and (5.4) the coefficients of L 1 and L 2 are uniformly bounded in some neighborhood of (A, L). It is then easily seen that in order to establish the continuity required ofF k it is sufficient to verify the assertion: for a fixed (A, L) E "'ffk there exist positive constants <5 and C such that, for any subspace JV satisfying ()(A, .X) < <5, it follows that JV Im R 1 _k = €" 1 and
+
ll&i,tt - ;jJ.,vii ::;; C ()(A, JV),
where P.,vis the projector of €" 1 on JV along Im R 1_ k. But this conclusion can be drawn from Theorem S4.7, and so the Lipschitz continuity of Fk follows. To establish the local Lipschitz continuity ofF;: 1 we consider a fixed (L 2 ,L 1 )Ef!JJ1_k x f!/Jk.ltisapparentthatthepolynomialL = L 2 L 1 willbea Lipschitz continuous function of L 2 and L 1 in a neighborhood of the fixed pair. To examine the behavior of the gap between supporting subspaces associated with neighboring pairs we observe an explicit construction for P.u ,
5.
150
PERTURBATION AND STABILITY OF DIVISORS
the projection on.H along Im R 1 _k (associated with the pair L 2 , L 1 ). In fact, P 11 has the representation
F= [
P~~t l,
(5.5)
P 1 c'-1 !-,
with respect to the decomposition fl 1" = flk" EB cu-kln, where P 1 = [I 0 · · · 0]. Indeed, P 11 given by (5.5) is obviously a projector along Im R 1 _k. Let us check that ImP At= Jt't. The subspace Jt is the supporting subspace corresponding to the right divisor L 1 (1l) of L(ll); by formula (3.23), .4/ = Im col(P 1 qY:6 = Im P11. The local Lipschitz continuity of P11 as a function of L 1 is apparent from formula (5.5). Given the appropriate continuous dependence of L and P 11 the conclusion now follows from the left-hand inequality of (S4.18) in Theorem S4.7.
0 Corollary 5.3. Let L(ll, /1) = L:~:6 A;(fl)Jli + Ill 1 be a monic matrix polynomial whose coefficients A;(/1) depend continuously on a parameter 11 in an open set (jj) of the complex plane. Assume that there existsfor each f1 E :2: a monic rightdivisorL 1 (1l, /1) = Bi(f1)JliofLwithquotientL 2 (1l, fl) = L:l:3 C;(fl)Jli. Let .4/(fl) be the supporting subspace of L with respect to C L and corresponding toL 1 •
D=o
(i) IF4l(f1) is continuous on r!t (in the metric 8), then both L 1 and L 2 are continuous on 9J. (ii) If one of Lb L 2 is a continuous function off1 on ::2, then the second is also continuous on '!JJ, as is .4/(fl). Proof Part (i) follows immediately from the continuity of (.4/, L) and of F since (L 2 , L 1 ) is the image of the composition of two continuous functions. For part (ii) suppose first that L 1 is continuous in 11 on 9J. Then, using (5.5) it is clear that J!t(/1) = Im P11 (Ill depends continuously on fl. The continuity of L 2 follows as in part (i). 0
5.2. Spectral Divisors: Continuous and Analytic Dependence
We maintain here the notation introduced in the preceding section. We now wish to show that. if ..PI corresponds to a !-spectral divisor of L (see Section 4.1), then there is a neighborhood of (.4/, L) in wk consisting of pairs (.4/, L) for which the spectral property is retained. A pair (.,f/, L) for which ,A corresponds to a !-spectral divisor of L will be said to be !-spectral.
5.2.
SPECTRAL DIVISORS: CONTINUOUS AND ANALYTIC DEPENDENCE
151
Recall (Theorem 4.1) that for a r-spectral pair (A, L) E it' we have A =
(
lm(~ (IA 2m Jr
CL)- 1 d).).
Lemma 5.4. Let (A, L) E it' be r-spectral. Then there is a neighborhood of(A, L) in it' consisting ojr-spectral pairs. Proof Let L 0 E & 1 and be close to L in the a metric. Then C Lo will be close to C L and, therefore, L 0 can be chosen so close to L that H - C Lo is non singular for all). E r. (This can be easily verified by assuming the contrary and using compactness of rand continuity of det(J). - C La) as a function of). and CL0 .) Let d be a closed contour which does not intersect with r and contains in its interior exactly those parts of the spectrum of L which are outside r. Then we may assume L 0 chosen in such a way that the spectrum of L 0 also consists of two parts; one part inside r and the other part inside d. Define%, JV 0 , A 0 to be the images of the Riesz projectors determined by (d, L), (d, L 0 ), and (r, L 0 ), respectively, i.e.,
JV =
Im(2~i
L
(H - CL)- 1
d).).
+
Then A+ JV = A 0 % 0 = fln 1, (A 0 , L 0 ) is r-spectral, and if L 0 --+ L then A 0 --+A (the latter assertion follows from Theorem S4.7). Now let (A 1, L 0) E it' be in a neighborhood 1111 of (A, L). We are finished if we can show that for 1111 small enough, A 1 = A 0 . First, 1111 can be chosen small enough for both L 0 to be so close to L (whence A 0 to A) and A 1 to be so close to A, that A 1 and A 0 are arbitrarily close. Thus the first statement of Theorem S4. 7 will apply and, if 1111 is chosen small enough, then j/1 A'o = flnl. Now A 1 invariant for CLo implies the in variance of A 1 for the projector
+
l
Po= -21 . (H- CL0 ) -1 d).. m r
Since Ker P 0 = JV 0 , and A 1 n JV 0 = {0}, the P 0 -invariance of A 1 implies that ,,# 1 c Im P 0 = A 0 • But since JV 0 is a common direct complement to A 0 and A 1 , dim A 0 =dim A 1 and therefore A 1 = j { 0 . D Theorem 5.5. Let (A, L) E it'k be r-spectral. Then there is a neighborhood of(A, L) in 'If/ consisting ojr-spectral pairs from iflk. Proof
This follows immediately from Theorem 5.1 and Lemma 5.4.
D
152
5.
PERTURBATION AND STABILITY OF DIVISORS
The following result is an analytic version of Theorem 5.5. Assume that a monic matrix polynomial L(A) has the following form: 1-1
L(A, f.l) = I A1 +
L A;(f.l)Ai,
(5.6)
i=O
where the coefficients A;(f.l), i = 0, ... , I - 1 are analytic matrix-valued functions on the parameter fl, defined in a connected domain Q of the complex plane. Theorem 5.6. Let L(A, f.l) be a monic matrix polynomial of the form (5.6) with coefficients analytic in Q. Assume that ,for each f1 E Q, there is a separated part of the spectrum of L, say (J ~', and a corresponding r ~'-spectral divisor L 1(A, f.l) of L(A, f.l). Assume also that ,for each floE Q, there is a neighborhood U(Jlo) such that ,for each f1 E U(f.lo), (J ll is inside rllo' Then the coefficients of L 1 (A, f.l) and of the corresponding left quotient LiA, f.l) are analytic matrix functions of f1 in Q.
Proof The hypotheses imply that, for f1 E U(J1 0 ), the Riesz projector P(Jl) can be written
from which we deduce the analytic dependence of P(Jl) on f1 in U(J1 0 ). Let j((f.l) = Im P(f.l) be the corresponding supporting subspace of L(A, f.l). By Theorem S6.1, there exists a basis x 1 (f.l), ... , xq(Jl) (q = nk) in j[(f.l) for every f1 E Q, such that x;(f.l), i = 1, ... , q are analytic vector functions in f1 E Q. Now formulas (5.2) and (5.3) (where Qk IJ11 is considered as matrix representation in the basis x 1 (f.l), ... , xq(Jl)) show that L 1 (A, f.l) is analytic for f1 E Q. The transpose (L 2 (A, f.l))r to the left quotient is the right spectral divisor of (L(A, JlW corresponding to the separate part (J(L)\(Jil of (J(L). Applying the above argument, we find that (LiA, flW, and therefore also LiA, f.l), is an analytic function of fl· 0
5.3. Stable Factorizations In Section 5.2 it was proved that small perturbations of a polynomial lead to small pertubatjons in any spectral divisor. In this section we study and describe the class of all divisors which are stable under small perturbations. We shall measure small perturbations in terms of the metric (Jk defined by (5.1). Suppose L, Lb and L 2 are monic n x n matrix polynomials of (positive) degree l, r, and q, respectively, and assume L = L 2 L 1 . We say that this
153
5.3. STABLE FACTORIZATIONS
factorization is stable if, given e > 0, there exists {J > 0 with the following property: if L' E ~~ and a1(L', L) < b, then L' admits a factorization L' = L~L~ such that L'1 E~., L~ E~q and ar(L~, L 1 )
< e,
aq(L]., L 2) <e.
The aim of this section is to characterize stability of a factorization in terms of the supporting subspace corresponding to it. Theorem 5.5 states that any spectral divisor is stable. However, the next theorems show that there also exist stable divisors which are not spectral. First, we shall relate stable factorization to the notion of stable invariant subspaces (see Section S4.5).
Theorem 5.7. Let L, L 1 , and L 2 be monic n x n matrix polynomials and assume L = L 2L 1. This factorization is stable if and only if the corresponding supporting subspace (for the companion matrix eL of L) is stable. Proof Let l be the degree of L and r that of L 1, and put e = e L. Denote the supporting subspace corresponding to the factorization L = L 2 L 1 by .A. If A is stable fore then, using Theorem 5.2, one shows that L = L 2L 1 is a stable factorization. Now conversely, suppose the factorization is stable, but A is not. Then there exists e > 0 and a sequence of matrices {em} converging to e such that, for all "f/" E Om, ()("f/", A)
~
e,
m = 1, 2, ....
(5.7)
Here {Om} denotes the collection of all invariant subspaces for em. Put Q = col(bi1J)l= 1 and m = 1, 2, ....
Then {Sm} converges to col(Qe;- 1)l=t. which is equal to the unit nl x nl matrix. So without loss of generality we may assume that Sm is nonsingular for all m, say with inverse S;;; 1 = row(Um;)l= 1· Note that
umi--+ col(bj;J)l=1•
i = 1,
0
0
0'
l.
(5.8)
A straightforward calculation shows that Sm ems;;; 1 is the companion matrix associated with the monic matrix polynomial
From (5.8) and the fact that em--+ e it follows that azCLm, L)--+ 0. But then we may assume that for all m the polynomial Lm admits a factorization Lm = Lm2Lm1 with Lm1 E ~.. Lm2 E ~l-n and
154
5. PERTURBATION AND STABILITY OF DIVISORS
Let Am be the supporting subspace corresponding to the factorization Lm = Lm 2 Lm 1 • By Theorem 5.2 we have ()(Am, A)--+ 0. Put "f/ m = s;;; 1 Am. Then "f/ mis an invariant subspace for em. In other words "f/ mE Qm· Moreover, it follows from Sm --+ /that()( "f/ m• Am) --+ 0. (This can be verified easily by using, for example, equality (S4.12).) But then ()("f/ m• A) --+ 0. This contradicts (5. 7), and the proof is complete. D We can now formulate the following criterion for stable factorization.
Theorem 5.8. Let L, L 1 , and L 2 be monic n x n matrix polynomials and assume L = L 2 L 1 • This factorization is stable if and only if for each common eigenvalue ..1. 0 of L 1 and L 2 we have dim Ker L(A.0 ) = 1. In particular, it follows from Theorem 5.9 that for a spectral divisor L 1 of L the factorization L = L 2 L 1 is stable. This fact can also be deduced from Theorems 5.5 and 5.2. The proof of Theorem 5.8 is based on the following lemma.
Lemma 5.9.
Let
be a linear transformation from (;minto (;m, written in matrix form with respect to the decomposition (;m = (;m' EEl (;m 2 (m 1 + m 2 = m). Then (;m' is a stable invariant subspace for A if and only iffor each common eigenvalue A.0 of A 1 and A 2 the condition dim Ker(A. 0 J -A) = 1 is satisfied. Proof It is clear that (;m, is an invariant subspace for A. We know from Theorem S4.9 that (;m, is stable if and only if for each Riesz projector P of A corresponding to an eigenvalue ..1.0 with dim Ker(A. 0 J - A) ;;::: 2, we have p(;m, = 0 or P~' = ImP. Let P be a Riesz projector of A corresponding to an arbitrary eigenvalue A.0 • Also for i = 1, 2 let P; be the Riesz projector associated with A; and A.0 • Then, for e positive and sufficiently small
p = (
P1
1.
-2
f
nz JIJ.-J.ol=e
(JA.- A 1)- 1 A 0 (IA.- A 2 )- 1 dJ ·
P2
0
Observe that the Laurent expansion of (H - A;)- 1 (i form
= 1, 2) at
..1. 0 has the
-q
(I A. - A;)- 1
=
L
j= -1
(A.- A.0 )iP;QijP;
+ · · ·,
i
= 1, 2,
(5.9)
5.4. GLOBAL ANALYTIC PERTURBATIONS: PRELIMINARIES
155
where Q;i are some linear transformations of Im P; into itself and the ellipsis on the right-hand side of (5.9) represents a series in nonnegative powers of (A. - ..1. 0 ). From (5.9) one sees that P has the form
p= (p~
P1Q1 ; 2
QzPz)
where Q1 and Q2 are certain linear transformations acting from llm 2 into flm'. It follows that Pllm' 1= {0} 1= Im P if and only if ..1. 0 E a(A 1 ) n a(A 2 ). Now appeal to Theorem S4.9 (see first paragraph of the proof) to finish the proof of Lemma 5.9. D
Proof of Theorem 5.8. Let .A be the supporting subspace corresponding to the factorization L = L 2 L 1 • From Theorem 5.7 we know that this factorization is stable if and only if .A is a stable invariant subspace for C L. Let l be the degree of L, let r be the degree of L 1 , and let .%, = {(x 1,
...
,x1)Efln 1lx 1
= ··· = x, = 0}.
Then flnl = .A EB %,. With respect to this decomposition we write CL in the form
From Corollaries 3.14 and 3.19 it is known that a(L 1 ) = a(Cd and a(L 2 ) = a(C 2 ). The desired result is now obtained by applying Lemma 5.9.
D 5.4. Global Analytic Perturbations: Preliminaries In Sections 5.1 and 5.2 direct advantage was taken of the explicit dependence of the companion matrix C L for a monic matrix polynomial L(A.) on the coefficients of L(A.). Thus the appropriate standard pair for that analysis was ([J 0 · · · 0], CL). However, this standard pair is used at the cost of leaving the relationship of divisors with invariant subspaces obscure and, in particular, giving no direct line of attack on the continuation of divisors from a point to neighboring points. In this section we shall need more detailed information on the behavior of supporting subspaces and, for this purpose, Jordan pairs X ll' J ll will be used. Here the linearization J ll is relatively simple and its invariant subspaces are easy to describe. Let A 0 (f.L), ... , A 1_ 1 (f.L) be analytic functions on a connected domain Q in the complex plane taking values in the linear space of complex n x n matrices. Consider the matrix-valued function 1-1
L(A., f.L) = IA.1 +
L A;{f.L)Ai.
;~o
(5.10)
5. PERTURBATION AND STABILITY OF DIVISORS
156
Construct for each 11 En a Jordan pair (X~'' J ~')for L(A., Jl). The matrix J ~'is in Jordan normal form and will be supposed to have r Jordan blocks J~> of size q;, i = 1, ... , r with associated eigenvalues A.;(Jl) (not necessarily distinct). In general, rand q; depend on Jl. We write J ~' = diag[J~1 >, ... , Jt>J. Partition
X~'
as X
ll
= [X
where x~> has q; columns and each submatrix represents one Jordan chain. Since (X~', J ~') is also a standard pair for L(A., Jl), and in view of the divisibility results (Theorem 3.12), we may couple the study of divisors of L(A., /l) with the existence of certain invariant subspaces of J w We are now to determine the nature of the dependence of divisors and corresponding subspaces on Jl via the fine structure of the Jordan pairs (X~', J ~'). In order to achieve our objective, some important ideas developed in the monograph of Baumgartel [5] are needed. The first, a "global" theorem concerning invariance of the Jordan structure of J ~'follows from [5, Section V.7, Theorem 1].1t is necessary only to observe that J ~'is also a Jordan form for the companion matrix 0 0 CL(!l)
I 0
0 I
0 0
=
(5.11)
0 0 - Ao(/l) - A2(11)
0
I
-Az-1(/l)
to obtain
Proposition 5.10. The matrix-valued functions X~', J ~'can be defined on 0. in such a way that,for some countable set s1 of isolated points inn, the following statements hold: For every Jl E O.\S 1 the number r of Jordan blocks in J ~'and their sizes q, are independent of Jl. (b) The eigenvalues A.;(Jl), i = 1, ... , r, are analytic functions in 0.\S 1 and may have algebraic branch points at some points of S 1. (c) The blocks x~>, i = 1, ... , r, of X~' are analytic functions ofJl on 0.\S 1 which may also be branches of analytic functions having algebraic branch points at some points of S 1 •
(a)
qb
... ,
The set S 1 is associated with Baumgartel's Hypospaltpunkte and consists of points of discontinuity of J ~'inn as well as all branch points associated with the eigenvalues.
5.4. GLOBAL ANALYTIC PERTURBATIONS: PRELIMINARIES
157
Let Xj(J.L), j = 1, 2, ... , t, denote all the distinct eigenvalue functions defined on il\S 1 (which are assumed analytic in view of Proposition 5.10), and let S 2 = {J.L E il\S 1 1l;(J.L) = lj(J.L) for some i # j}. The Jordan matrix J ~' then has the same set of invariant subspaces for every J.L E il\(S 1 u S 2 ). (To check this, use Proposition S4.4 and the fact that, for
rx, pErt, matrices J !< - rxl and J ~' - PI have the same invariant subspaces.) The set S 2 consists of multiple points (mehrfache Punkte) in the terminology of Baumgartel. The setS 1 u S 2 is described as the exceptional set of L(A., J.L) inn and is, at most, countable (however, see Theorem 5.11 below), having its limit points (if any) on the boundary of n. ExAMPLE
+ J-lA + w 2
Let L(A., J-l) = A.Z
5.1.
J
_
±2w-
for some constant w, and 0 =fl. Then
[±w 1 J 0 ±w
and, for J-l #- ±2w, J~' = diag[A. 1 (J-l), A. 2 (J-l)] where A. 1 • 2 are the zeros of A. 2 Here, S 1 = {2w, -2w}, S 2 = ¢. 0 5.2.
EXAMPLE
+ J-lA + w2 •
Let
L(A., J-l)
=
[
(A. - 1)(A. - J-l) ll
and 0 = fl. The canonical Jordan matrix is found for every ll Efland hence the sets S 1 and S 2 • For ll ¢ {0, 2, ± 1, ±j2}, J ~' = diag{J-l, J-l 2 , 1, 2}.
Then J0
= diag[O, 0,
J '-
J
s,
J2
d;a·{[~ :]. -1,2}
±,~ = diag{[~
It follows that
1, 2],
= {
a 1,
±J2}
±1, 2, ±J2}, s2 = =
X " ~
J,-
{0} and, for ll E 0\S ,,
[(J-l - 2)(/-l - 1) 0 1
a diag mi :J.+
= diag{[~
1
1 - fl 2 ,.II
OJ 1 • 0
1, 4},
158
5. PERTURBATION AND STABILITY OF DIVISORS
5.5. Polynomial Dependence In the above examples it turns out that the exceptional set S 1 u S 2 is finite. The following result shows it will always be this case provided the coefficients of L(A., /1) are polynomials in 11· Theorem 5.11. Let L(A., Jl) be given by (5.10) and suppose that A;(f.l) are matrix polynomials in 11 (Q = e), i = 0, ... , 1- 1. Then the set of exceptional points of L(A., Jl) is finite.
Proof Passing to the linearization (with companion matrix) we can suppose L(A., /1) is linear: L(A., /1) = !A. - C(f.1). Consider the scalar polynomial
where n is the size of C(Jl), and aj(Jl) are polynomials in 11· The equation
f(A., /1) = 0
(5.12)
determines n zeros A. 1 (f.1), ... , A.n(Jl). Then for 11 in e\S 2 (S 3 is some finite set) the number of different zeros of (5.12) is constant. Indeed, consider the matrix (of size (2n - 1) x (2n - 1) and in which a 0 , a 1, ... , an-! depends on f.l) 1
an-1
lln-2
a1
llo
0
0
1
0 n-1
a2
a1
ao
0
0 0
n 0
0 1 (n-1)a. _1 (n-2)an-2 •• •
n
(n-1)a._ 1 • • •
an-1
an -2
an-3
ao
a1
0
0
0
2a2
a1
0
(5.13)
0 0
0
n
(n-1)a,_1
a1
The matrix M(f.l) is just the resultant matrix of f(A., f.l) and of(A., /1)/oA., considered as polynomials in A. with parameter Jl, and det M(Jl) is the resultant determinant of f(A., /1) and ~f(A., /1)/oA.. (See, for instance [78, Chapter XII] for
5.5.
159
POLYNOMIAL DEPENDENCE
resultants and their basic properties.) We shall need the following property of M(J1): rank M(Jl) = 2n
+
1 - n(Jl),
where n(Jl) is the number of common zeros (counting multiplicities) of f(A., Jl) and of(A., Jl)/oA. (see [26]). On the other hand, n = n(Jl) + CJ(Jl), where CJ(Jl) is the number of different zeros of (5.12); so
= n + 1 + CJ(Jl).
rank M(Jl)
(5.14)
Since C(Jl) is a polynomial in Jl, we have rank M(Jl) = max rank M(Jl),
( 5.15)
/lEC
where
s3
is a finite set. Indeed, let Jlo rank M(J1 0 )
E
(be such that
= max rank M(Jl) '~ r, /lEI/:
and let M 0 (J1) be an r x r submatrix of M(Jl) such that det M 0 (J1 0 ) =I= 0. Now det M 0 (J1) is a nonzero polynomial in Jl, and therefore has only finitely many zeros. Since rank M(Jl) = r provided det M 0 (Jl) =1= 0, equality (5.15) follows. By ( 5.14) the number of different zeros of ( 5.12) is constant for J1 E fl\S 3 . Moreover, CJ(Jll) < CJ(Jlz) for Jlt E s3 and Jlz E fl\S3. Consider now the decomposition of f(A., Jl) into the product of irreducible polynomials
f(A., Jl) =
ft (A., Jl) · · · fm(A.,
Jl),
( 5.16)
where /;(A., Jl) arc polynomials on A. whose coefficients are polynomials in J1 and the leading coefficient is equal to 1; moreover, /;(A., Jl) are irreducible. Consider one of the factors in (5.16), say, f 1 (A., Jl). It is easily seen from the choice of S 3 that the number of different zeros of f 1 (A., Jl) is constant for J1 E fl\S 3. Let A. 1 (Jl), ... , A.s(Jl), J1 E fl\S 3 be all the different zeros of equation f 1 (A., Jl) = 0. For given A.;(Jl), i = 1, ... , sand for given j = 1, 2, ... , denote
vij(Jl)
= rank(A.;(Jl)/ - C(Jl))j
Let us prove that v;/Jl) is constant for J1 E Cfl\S 3)\S;j, where Sij is a finite set. Indeed, let
vij = max rank (A.;(Jl)l - C(Jl))j, !1EI/'\S3
160
5. PERTURBATION AND STABILITY OF DIVISORS
and let f.lo = f.lo(i,j) E e:\S 3 be such that vii(f.lo) = vii. Let A;p., f.l) be a square submatrix of(H - C(f.l))i (of size v;i x v;) such that det Aii(A.;(f.1 0 ), f.lo) =1 0. We claim that (5.17) for k = 1, 2, ... , s (so in fact vii does not depend on i). To check (5.17) we need the following properties of the zeros A. 1(f.l), ... , A..(f.1): (1) A.;(f.l) are analytic in e:\S 3 ; (2) A. 1 (f.1), ... , A..(f.l) are the branches of the same multiple-valued analytic function in e:\S 3 ; i.e., for every f.1 E e;\S3 and for every pair A.;(f.l), A.j(f.l) of zeros of f 1(A., f.1) there exists a closed rectifiable contour r c e;\S 3 with initial and terminal point f.1 such that, after one complete turn along r, the branch A.;(f.l) becomes A.j(f.l). Property (1) follows from Proposition 5.10(b) taking into account that the number of different zeros of f 1(A., f.l) is constant in e:\S 3 . Property (2) follows from the irreducibility of / 1 (A., f.1) (this is a wellknown fact; see, for instance [63, Vol. III, Theorem 8.22]). Now suppose (5.17) does not hold for some k; so det Au(A.k(f.l), f.l) 0. Let r be a contour in e;\S 3 such that f.lo E rand after one complete turn (with initial and terminal point f.lo) along r, the branch A.k(f.l) becomes A.;(f.1). Then by analytic continuation along r we obtain det Aii(A.;(f.1 0 ), f.lo) = 0, a contradiction with the choice of f.lo· So (5.17) holds, and, in particular, there exists f.lo = f.l'o(i,j)E e:\S 3 such that
=
k = 1, ... , s.
(5.18)
(For instance, f.lo can be chosen in a neighborhood of f.lo.) Consider now the system of two scalar polynomial equations f1(A., f.l)
= 0,
det Au(A., f.l) = 0.
(5.19)
(Here i and j are assumed to be fixed as above.) According to (5.18), for f.1 = f.lo this system has no solution. Therefore the resultant R;j(f.l) of f 1(A., f.l) and det A;j(A., f.l) is not identically zero (we use here the following property of R;j(f.l) (see [78, Chapter XII]): system (5.19) has a common solution A. for fixed f.1 if and only if Rii(f.l) = 0). Since R;j(f.l) is a polynomial in f.l, it has only a finite set Sii of zeros. By the property of the resultant mentioned above,
as claimed. Now let T1 = U~= 1 1 Sii, and letT = 1 7;,, where T2 , ••• , Tm are constructed analogously for the irreducible factors f 2 (A., f.l), ... , fm(A., f.l), respectively, in (5.16). Clearly Tis a finite set. We claim that the exceptional set S1 u S2 is contained in S3 u T (thereby proving Theorem 5.11). Indeed,
Ui'=
U;=
5.5.
161
POLYNOMIAL DEPENDENCE
from the construction of S3 and T it follows that the number of different eigenvalues of I A. - C(J-t) is constant for J-l E (/;\(S 3 u T), and for every eigenvalue A.0 (J-t) (which is analytic in (/;\(S 3 u T)) of IA - C(J-t), r(J-t;j) ~r rank(A. 0 (J-t)/- C(J-t)).i
is constant for 1-l E (/;\(S 3 u T);j = 0, 1, .... The sizes of the Jordan blocks in the Jordan normal form of C(J-t) corresponding to A.0 (J-t) are completely determined by the numbers r(J-t;j); namely, j
= 1, ... , n,
(5.20)
where Y.i(J-l) is the number of the Jordan blocks corresponding to A.0 (J-t) whose size is not less than j; j = 1, ... , n (so y1 (J-t) is just the number of the Jordan blocks). Equality (5.20) is easily observed for the Jordan normal form of C(J-t); then it clearly holds for C(J-t) itself. Since r(J-t;j) are constant for J-l E (/;\(S 3 u T), (5.20) ensures that the Jordan structure of C(J-t) corresponding to A.0 (J-t) is also constant for J-l E ([;\(S 3 u T). Consequently, the exceptional set of H - C(J-t) is disjoint with S 3 u T. D Let us estimate the number of exceptional points for L(A., J-l) as in Theorem 5.11, assuming for simplicity that det L(A., J-t)is an irreducible polynomial. (We shall not aim for the best possible estimate.) Let m be the maximal degree of the matrix polynomials A 0 (J-t), ... , A 1 _ 1 (J-t). Then the degrees of the coefficients a;(J-t) of f(A., J-t) = det(J A. - C(J-t)), where C(J-t) is the companion matrix of L(A., J-t), do not exceed mn. Consequently, the degree of any minor (not identically zero) of M(J-t) (given by formula (5.13)) does not exceed (2nl - 1)mn. So the set S3 (introduced in the proof of Theorem 5.11) contains not more than (2nl - 1)mn elements. Further, the left-hand sides ofEq. (5.19) are polynomials whose degrees (as polynomials in A.) are nl (for f 1 (A., J-t) = f(A., J-t)) and not greater than nlj (for det Aij(A., J-t)). Thus, the resultant matrix of f(A., J-t) and det A;.i(A., J-t) has size not greater than nl + nlj = nl(j + 1). Further, the degrees of the coefficients of f(A., J-t) (as polynomials in J-t) do not exceed mn, and the degrees ofthe coefficients of det Aij(A., J-t) (as polynomials on J-t) do not exceed mnj. Thus, the resultant Rij(/-l) of f(A., J-t) and det Aij(A., J-t) (which is equal to the determinant of the resultant matrix) has degree less than or equal to lmn 2j(j + 1). Hence the number of elements in Sij (in the proof of Theorem 5.11) does not exceed lmn 2j(j + 1). Finally, the set nl
nl
s3 U U U sij i= l.i=l
consists of not more than (2nl- 1)mn + n 3 / 2 m L}~ 1 (j + 1)j = (2nl- l)mn + n 3 l 2 m(-~n 3 l 3 + n2 l 2 + ~nl) = mn(tp 5 + p4 + ~p 3 + 2p - 1) points where p = nl. Thus for L(A., J-t) as in Theorem 5.11, the number of exceptional points
162
5.
PERTURBATION AND STABILITY OF DIVISORS
does not exceed mn(jp 5 + p4 + Jp 3 + 2p - 1), where m is the maximal degree of the matrix polynomials A;(Jl), i = 0, ... , l- 1, and p = nl. Of course, this estimate is quite rough. 5.6. Analytic Divisors
As in Eq. (5.10), let L(A., Jl) be defined for all (A., Jl) E rt X 0. Suppose that for someJlo E 0 the monic matrix polynomial L(A., Jlo) has a right divisor L 1(A.). The possibility of extending L 1 (A.) to a continuous (or an analytic) family of right divisors L 1(A., Jl) of L(A., Jl) is to be investigated. It turns out that this may not be possible, in which case L 1 (A.) will be described as an isolated divisor, and that this can only occur if Jlo is in the exceptional set of Lin 0. In contrast, we have the following theorem. Theorem 5.12. If Jlo E O\(S 1 u S2 ), then every monic right divisor L 1(A.) of L(A., Jlo) can be extended to an analytic family L 1 (A., Jl) ofmonic right divisors of L(A., Jl) in the domain 0\(S 1 u S3 ) where S3 is an, at most, countable subset of isolated points of0\S 1 and L 1 (A., Jl) has poles or removable singularities at the points of S 3 .
Proof Let A be the supporting subspace of divisor L 1 (A.) of L(A., Jlo) with respect to the Jordan matrix JJLo· Then, if L 1 has degree k, define
It follows from the definition of a supporting subspace that Qk(Jlo) is an invertible linear transformation. Since (by Proposition 5.10) Qk(Jl) is analytic on 0\S 1 it follows that Qk(Jl) is invertible on a domain (0\S 1 )\S 3 where S3 = {Jl E 0\S 1 Idet Qk(Jl) = 0} is, at most, a countable subset of isolated points of 0\S 1 • Furthermore, the J Jlo -invariant subspace A is also invariant for every J JL with Jl E 0\S 1. Indeed, we have seen in Section 5.4 that this holds for JlE0\(S 1 uS 2 ). When JlES 2 \S 1 , use the continuity of Iw Hence by Theorem 3.12 there exists a family of right divisors L 1(A., Jl) of L(A., Jl) for each Jl E O\(S 1 u S 3 ), each divisor having the same supporting subspace A with respect to J w By formula (3.27), it follows that this family is analytic in 0\ (S 1 u S 3 ). An explicit representation of L 1 (A., Jl) is obtained in the following way (cf. formula (3.27)). Take a fixed basis in A and for each Jl in 0\S 1 represent Qk(Jl) as a matrix defined with respect to this basis and the standard basis for rtkn. Then for Jl¢= s3' define In X n matrix-valued functions WI (Jl), W,.(Jl) by 0
0
0'
5.6.
163
ANALYTIC DIVISORS
where R is the matrix (independent of fl) representing the embedding of .A into C1n. The divisor L 1 has the form
The nature of the singularities of L 1(A., fl) is apparent from this representation.
D An important special case of the theorem is flo
Corollary 5.13. If det L(A., fl) has In distinct zeros for every fl En, and if En, then every monic right divisor of L(A., flo) can be extended to a family of
monic right divisorsfor L(A., fl) which is analytic on !1\S 3 . Proof Under these hypotheses S 1 and S 2 are empty and the conclusion follows. D EXAMPLE
5.3.
Let
Then a Jordan matrix for L does not depend on Jl, i.e., for every J1 E fl,
J~ = J = diag{[~ ~l [~
:]}
We have S 1 and S 2 both empty and
X
~
=[I0 )010! 0OJ·
The subspace Jt spanned by the first two component vectors e 1 and e 2 is invariant under J and
Thus, for J1 #- 0, .It is a supporting subspace for L(A., Jl). The corresponding divisor is L 1 (A., Jl) = !A. - X J /X~ L41 ) - I and computation shows that
So S 3 = {0}.
0
Corollary 5.14. If the divisor L 1(A.) of Theorem 5.12 is, in addition, a r-spectral divisor for some closed contour r then it has a unique analytic extension to a family L 1 (A., fl) of monic right divisors defined on !1\(S 1 u S 3 ).
164
5.
PERTURBATION AND STABILITY OF DIVISORS
Proof Let L 1 (A., fl) be an extension of L 1 (A.); existence is guaranteed by Theorem 5.12. By Lemma 5.4, for fl E 0 close to flo, the divisor L 1(A., fl) is r-spectral. In particular, L 1 (A., fl) is uniquely defined in a neighborhood of flo. Because of analyticity, L 1 (A., fl) is a unique extension of L 1 (A.) throughout O\(S 1 u S 3 ). D
5.7. Isolated and Nonisolated Divisors As before, let L(A., fl) be a monic matrix polynomial of degree I with coefficients depending analytically on fl in 0, and let L 1 (A.) be a monic right divisor of degree k of L(A., flo), floE Q. The divisor L 1(A.) is said to be isolated if there is a neighborhood U(flo) of flo such that L(A., fl) has no family of right monic divisors L 1 (A., fl) of degree k which (a) depends continuously on fl in U(flo) and (b) has the property that lim 11 ~ 110 L 1 (A., fl) = L 1 (A.). Theorem 5.12 shows that if, in the definition, we have floE O\(S 1 u S2 ) then monic right divisors cannot be isolated. We demonstrate the existence of isolated divisors by means of an example. EXAMPLE 5.4. Let C(fl) be any matrix depending analytically on fl in a domain Q with the property that for fl = floE Q, C(J
fl 0 = 0'
Then define L(A., fl) = U 2 - C(fl). It is easily seen that if L(A., fl) has a right divisor U- A(fl) then L(A., fl) = U 2 - A 2(fl) and hence that L(A., fl) has a monic right di-visor if and only if C(fl) has a square root. Thus. under the hypotheses stated, L(A., fl) has an isolated divisor at flo· 0
It will now be shown that there are some cases where divisors which exist at points of the exceptional set can, nevertheless, be extended analytically. An example of this situation is provided by taking flo = 0 in Example 5.2.
Theorem 5.15. Let floE S2 and L 1 (A.) be a nonisolated right monic divisor of degree k of L(A., flo)· Then L 1 (A.) can be extended to an analytic family L 1 (A., fl) of right monic divisors of degree k of L(A., fl).
In effect, this result provides an extension of the conclusions of Theorem 5.12. The statements concerning the singularities also carry over. Proof Since L 1 (A.) is nonisolated, there is a sequence {flm} in 0\(S 1 u S 2) such that flm--> flo and there are monic right divisors L 1 (A., flm) of L(A., flm) such that limm~ 00 L 1 (A., flm) = L 1 (A.). Let ./V(flm) be the supporting subspace of L 1 (A., flm) relative to 1 11 .,. As the set of subspaces is compact (Theorem
5.7.
ISOLATED AND NONISOLATED DIVISORS
165
S4.8),thereisasubsequencemk,k = 1, 2, ... ,suchthatlimk--+oo %(f1m) = % 0 for some subspace jf/ 0 . We can assume that limm__. oo .AI"(!lm) = ,AI 0 . Since %(/1m) is invariant for J llrn and all invariant subspaces of J 11 are independent of f1 in 0\(S 1 u S 2 ) it follows that %(flo) is invariant for J llrn for each m. From this point, the proof of Theorem 5.12 can be applied to obtain the required conclusion. D The following special case is important: Corollary 5.16. If the elementary divisors of L(A., /1) are linear for each fixed 11 E Q, then every nonisolated right monic divisor L 1(A.) of L(A., flo) can be extended to an analytic family of right monic divisors for L(A., /1) on Q\S 3 . Proof In this case S 1 is empty and so the conclusion follows from Theorems 5.12 and 5.15. D Comments
The presentation of Sections 5.1, 5.2, 5.4, 5.6, and 5.7 is based on the authors' paper [34e] (some of the results are given in [34e] in the context of infinite-dimensional spaces). The main results of Section 5.3 are taken from [3b]; see also [3c]. Section 5.5 is probably new. The results concerning stable factorizations (Section 5.3) are closely related to stability properties of solutions of the matrix Riccati equation (see [3b] and [3c, Chapter 7]). It will be apparent to the reader that Section 5.4 on global analytic perturbations leans very heavily on the work of Baumgartel [5], who developed a comprehensive analytic perturbation theory for matrix polynomials.
Chapter 6
Extension Proble:rns
Let L(ll) be a monic matrix polynomial with Jordan pairs (X;, J;), i = 1, ... , r, corresponding to its different eigenvalues ll ll, respectively (a(L) = {ll 1 , ll 2 , •.. , ll,}). We have seen in Chapter 2 that the pairs (X;, J;), i = 1, ... , rare not completely independent. The only relationships between these pairs are contained in the requirement that the matrix
1,... ,
xl
r
X 1J
X1
X2
~~1-
X,J, X,
X 2 J2
1
1
Xz
5~-
1
Xr
1
j1-1 r
be nonsingular. Thus, for instance, the pairs (X 1, J 1), ••• , (X,_ 1, J,_ 1) impose certain restrictions on (X., J,). In this chapter we study these restrictions more closely. In particular, we shall describe all possible Jordan matrices J, which may occur in the pair (X., J,). It is convenient to reformulate this problem in terms of construction of a monic matrix polynomial given only part of its Jordan pair, namely, (X 1 , J 1 ), ... , (X,_ 1 , J,_ 1 ). It is natural to consider a generalization of this extension problem: construct a monic matrix polynomial if only part of its Jordan pair is given (in the above problem this part consists of the Jordan pairs corresponding to all but one eigenvalue of L(ll)). Clearly, one can expect many solutions 166
6.1.
STATEMENT OF THE PROBLEMS AND EXAMPLES
167
(and it is indeed so). Hence we seek a solution with special properties. In this chapter we present a construction of a monic matrix polynomial, given part of its Jordan pair, by using left inverses. 6.1. Statement of the Problems and Examples
Let (X, J) be a pair of matrices where X is n x rand J is an r x r Jordan matrix. We would like to regard the pair (X, J) as a Jordan pair or, at least, a part of a Jordan pair of some monic matrix polynomial L(A.). Consider the problem more closely. We already know that (X, J) is a Jordan pair of a monic matrix polynomial of degree l if and only if col(XJ;)l:6 is non singular; in particular, only if r = nl. If this condition is not satisfied (in particular, if r is not a multiple of n), we can still hope that (X, J) is a part of a Jordan pair (X, i) of some monic matrix polynomial L(A.) of degree l. This means that for some i-invariant subspace .A c fl" 1 there exists an invertible linear transformation S: .A -+ fl' such that X = X IAt S- 1, J = SJ IAt S- 1. In this case we shall say that (X, J) is a restriction of (X, i). (The notions of restriction and extension of Jordan pairs will be studied more extensively in Chapter 7.) Clearly, the property that (X, J) be a restriction of (X, i) does not depend on the choice of the Jordan pair (X, i) of L(A.), but depends entirely on the monic matrix polynomial L(A.) itself. So we can reformulate our problems as follows: given sets of Jordan chains and eigenvalues (represented by the pair (X, J)), is there a monic matrix polynomial L(A.) such that this set is a part of a canonical set of the Jordan chains of L(A.) (represented by the Jordan pair (X, i) of L(A.))? If such an L(A.) exists, how can we find it, and how can we find such an L(A.) subject to different additional constraints? This kind of question will be referred to as an extension problem, because it means that (X, J) is to be extended to a Jordan pair (X, i) for some monic polynomial. Of course, we could replace (X, J) by a pair (X, T), where Tis an r x r matrix (not necessarily Jordan), and ask for an extension of (X, T) to a standard pair (X, f) of some monic polynomial of degree l. For brevity, we shall say that (X, T) is a standard pair ofdegree l.It will be convenient for us to deal with the extension problem in this formally more general setting. A necessary condition can be given easily: Proposition 6.1. Let X and T be matrices of sizes n x r and r x r, respectively. Suppose that there exists an extension of(X, T) to a standard pair (X, T) of degree l. Then
(6.1)
Proof Let (X, T) be a standard pair of the monic matrix polynomial L(A.), and let .A be a T-invariant subspace such that (X, T) is similar to
168
6.
EXTENSION PROBLEMS
fL,1): X= XI 11 S, T = s- 1 fi. 11 S for some invertible linear transformation S: fl' ~ j f (such an ~.It exists by the definition of extension). Clearly, it is enough to prove (6.1) for (X,11 , T41 ) in place of(X, T). Let j{' be some direct complement to ..It in fl" 1, and let
(XI,1,
be the matrix representations of X and f with respect to the decomposition fl" 1 = A .It'. Since j f is f-invariant, we have T21 = 0. So
+.
Xf
x.
rx;l-l
rX1T11 *1' 1
==
X
0
xl ;f1 1
and since this matrix is nonsingular, the columns of col( X 1 T~ 1 )L:b are linearly independent. But X 1 =XI 11 , T11 = fl 41 , and (6.1) follows. D For convenience, we introduce the following definitions: a pair of matrices (X, T), where X is n x rand Tis r x r (the number r may vary from one pair to another; the number n is fixed) will be called an admissible pair. An admissible pair (X, T) satisfying condition (6.1) will be called /-independent. Thus, there exists an extension of the admissible pair (X, T) to a standard pair of degree l only if (X, T) is /-independent. We shall see later that the condition ( 6.1) is also sufficient for the existence of an extension of (X, T) to a standard pair of degree I. First we present some examples. (The scalar case).
Let (X, J) be a pair of matrices with I x r row ... , J PJ be a decomposition of J into Jordan blocks and X = [X 1 · · · X 0] be the consistent decomposition of X. The necessary condition (6.1) holds for some I if and only if !J(J;) n !J(J;) = 0 for i # j, and the leftmost entry in every X, is different from zero. Indeed, suppose for instance that !J(J 1 ) = !J(J 2 ), and let x 11 and x 12 be the leftmost entries in X 1 and X 2 , respectively. If IX 1 and IX 2 are complex numbers (not both zero) such that IX 1x 11 + IX 2 x 12 = 0, then the r-dimensional vector x containing IX 1 in its first place, IX 2 in its (r 1 + l)th place (where r 1 is the size of J 1 ), and zero elsew~ere has the property that EXAMPLE
6.1
X and r x r Jordan matrix J. Let J = diag[J ~>
x
E
Ker(XJ'),
i = 0, 1, ....
Thus, (6.1) cannot be satisfied. If, for instance, the left entry x 11 of X 1 is zero, then (1
0
· · · 0) T E Ker(X Ji),
i
=
0, 1, ... ,
and (6.1) again does not hold. On the other hand, if !J(J;) n !J(l;) = ¢ for i of j, and the leftmost entry in every X, is different from zero, then the polynomial nr~ 1 (A- A.)'i, where VJ = !J(J;) and /'j is the
6.2.
169
EXTENSIONS VIA LEFT INVERSES
size of Ji, has the Jordan pair (X, J). We leave to the reader the verification of this fact. It turns out that in the scalar case every admissible pair of matrices (X, T) which satisfies the necessary condition (6.1) is in fact a standard pair of a scalar polynomial. 0 EXAMPLE 6.2 (Existence of an extension to a standard pair of a linear monic polynomial). Another case when the sufficiency of condition (6.1) can be seen directly is when (6.1) is satisfied with I= 1, i.e., the matrix X has independent columns (in particular, r:::;; n). In this case it is easy to construct an extension (X, f) of(X, T) such that (X, f) is a standard pair of some monic matrix polynomial of degree 1. Namely, put for instance
X=
[X
Y],
- =
T
where Y is some n x (n - r) matrix such that Note that here in place of
[T0 0' OJ
X is invertible.
we could take as f any n x n matrix provided its left upper corner of size r x r is T and the left lower corner is zero. So a solution of the extension problem here is far from unique. This fact leads to various questions on uniqueness of the solution of the general extension problem. Also, there arises a problem of classification and description of the set of solutions. 0
In the next sections we shall give partial answers to these questions, by using the notion of left inverse. 6.2. Extensions via Left Inverses Let (X, T) be an admissible /-independent pair of matrices and let r be the size of T. Then there exists an r x nl matrix U such that U col(XT;)~;;f> =I,
and, in general, U is not uniquely defined (see Chapter S3). Let
U = [U 1
Uz]
be the partition of U into blocks U;, each of size r x n. We can construct now a monic n x n matrix polynomial
L(A.) = H 1 - XT 1(U 1
+
U 2 A.
+ ·· · +
U1 A.1-
1 ).
(6.2)
Note that the polynomial L(A.) is defined as if (X, T) were its standard pair replacing the inverse of col(XT 1)~:6 (which generally does not exist) by its left inverse U. The polynomial L(A.) is said to be associated with the pair (X, T).
170
6.
EXTENSION PROBLEMS
Theorem 6.2. Let (X, T) be an admissible [-independent pair, and let L(A.) be some monic matrix polynomial associated with (X, T). Then a standard pair (XL, TL) of L(A.) is an extension of(X, T), i.e.,from some Tcinvariant subspace .A there exists an invertible linear transformation S: .A -+ (/;r such that X= XLLA{S- 1 and T = STLIAfS- 1 • Proof Evidently, it is sufficient to prove Theorem 6.2 for the case when XL= [I 0 · · · 0] and TL is the first companion matrix of L(A.): 0 0
I 0
0
0
0
0
0 0
I
Let us check that .A = Im col(XT;)l:6 is Tcinvariant. Indeed, since [U 1
U2
· · •
U1]col(XT;)l:6 =I,
the equality
TL[
X~T 1 xr'-1
= [ ::
1 T
(6.3)
xr'-1
holds, so that Tcinvariance of .A is clear. From (6.3) it follows also that for
s- 1 = col(XT;)l:6: flr-+ .A the conditions of Theorem 6.2 are satisfied.
0
In connection with Theorem 6.2 the following question arises: is it true that every extension of (X, T) to a standard pair of a fixed degree 1(such that (6.1) is satisfied for this 1) can be obtained via some associated monic matrix polynomial, as in Theorem 6.2? In general the answer is no, as the following example shows. EXAMPLE
6.3.
Let
X=[~ Then the matrix
a T~ [~ n [:T] ~ [~ ~1 1
0 0
0 0
0 0
1
0
6.2.
171
EXTENSIONS VIA LEFT INVERSES
has linearly independent columns, so the necessary condition (6.1) is satisfied. However, XT 2 = 0, and formula (6.2) gives (6.4)
So any extension via left inverses (as in Theorem 6.2) is in fact an extension to a standard pair (X, f) of the single polynomial (6.4). For instance,
X~[~ ~ ~ ~} ~[ ~ ~ T
Il
But there exist many other extensions of (X, T) to nonsimilar standard pairs of degree 2. For example, if a E fl\ {0}, take
On the other hand, it is easy to produce an example where every possible extension to a standard pair (of fixed degree) can be obtained via associated monic polynomials. EXAMPLE
6.4.
Let
Then the columns of
are linearly independent, i.e., [iT] is left invertible. The general form of its left inverse is
V = [V1
Vz] =
[~ ~ ~ =~ ], -
Oc
01-c
where a, b, care independent complex numbers. The associated monic polynomials L(A.) are in the form
6.
172
EXTENSION PROBLEMS
As a Jordan pair (X, j) of L(A) for c =!= -1, 0 we can take
I o h/c]
x = X(h ' c) = [10 j
0
1
I
(6.5)
'
= j(c) = diag(O, I, I, -c)
(we emphasize in the notation the dependence of X on h and c and the dependence of j on c). For c = 0 take
-
X(O, O)
[I0 0I 01 OJI '
=
j(o, O)
=
diag(O, 1, 1, 0)
(6.6)
if h = 0, and 0
1 0 0) = [1O X(h, X= 0 1
-~-ll
J = J(h, 0) =
if h =!= 0. For c = -I, take
X= X(b, -1)
=
[I 0
I -h I 0
lo0~
l
0
0 0
1
0
0
J
o0 oI o0 oj0
o].
j = j( -I)=
0
0
0
1
1 .
0
0
0
1
(6.7)
(6.8)
We show now that every standard extension (X', T') of (X, T) of degree 2 is similar to one of the pairs (X, j) given in (6.5)-(6.8), i.e., there exists an invertible 4 x 4 matrix S 1 Several cases can occur. (depending on (X', T')) such that X' = XS, T' =
s- js.
(I) dct(H- T') has a zero different from 0 and 1. In this case T' is similar to j(c) from (6.5), and without loss of generality we can suppose that j(c) = T' for some c =!= 0, - 1. Further, since (X', T') is an extension of (X, T), the matrix X' can be taken in the following form: X'
=
[10 01 01 :x:xl
1] .
Necessarily :x 2 =!= 0 (otherwise[/~,] is not invertible), and then clearly (X', T') is similar to (X(b, c), j(c)) from (6.5) with h = :x 1 :x2 1 c. (2) det(H - T') = A. 2 (A. - 1) 2 . Then one can check that (X', T') is similar to (X, j) from (6.6) or (6.7) according as T' has two or one lineraly independent eigenvector(s) corresponding to the eigenvalue zero. (3) det(I A - T') = A.( A. - 1) 3 .ln this case (X', T') is similar to (X, j) from (6.8). 0
The following theorem allows us to recognize the cases when every extension to a standard pair can be obtained via associated monic matrix polynomials.
6.3.
SPECIAL EXTENSIONS
173
Theorem 6.3. Let (X, T) be an admissible /-independent pair of matrices. Then every standard pair (X', T') of degree l, which is an extension of(X, T), is also a standard pair of some monic matrix polynomial of degree l associated with (X, T), if and only if the rows of XT 1 are linearly independent. Proof Let (X', T') be a standard pair of degree I, and let L(A.) = IA. 1 + I~:& AiA.ibe the monic matrix polynomial with standard pair (X', T'). From the definition of a standard pair (Section 1.9) it follows that (X', T') is an extension of (X, T) if and only if
or
(6.9)
According to a result described in Chapter S3, a general solution of (6.9) is given by
(6.10)
(where the superscript" I" designates any right inverse of [col(XTi)l;:6F) if and only if the columns of (XT 1)'f. are linearly independent. Taking the transpose in (6.10), and using the definition of an associated matrix polynomial, we obtain the assertion of Theorem 6.3. 0 6.3. Special Extensions
In Section 6.2 we have seen that an extension of an admissible /-independent pair (X, T) to a standard pair of degree l can always be obtained via associated monic polynomials. Thus, if L(A.) is such a polynomial and (X, T) its standard pair, then there exists a T-invariant subspace A such that the pair (X, T) is similar to (X I At• T Lu). Let ...H' be some direct complement to A and let P be the projector on A' along .H. Then the part (X I. 11 ,, PT L,,) of the
6.
174
EXTENSION PROBLEMS
standard pair (X, T) does not originate in (X, T); it is the extra part which is "added" to the pair (X, T). In other words, the monic polynomial L(A-) has a spectral structure which originates in (X, T), and an additional spectral structure. In this section we shall construct the associated polynomial L(A-) for which this additional spectral structure is the simplest possible. This construction (as for every associated polynomial) uses a certain left inverse of the matrix col(XT;)l::::b, which will be called the special/eft inverse. The following lemma plays an important role. For given subspaces 0lt 1 , ..• , !l/11 c (["we denote by col(!llt;)l= 1 the subspace in (/;" 1 consisting of all /-tuples (x 1 , .•. , x 1) of n-dimensional vectors X b ... , X 1 SUCh that X I E !Jlt 1, X 2 E !Jlt 2 , . . . , X 1 E !l/1 1 • Lemma 6.4. Let (X, T) be an admissible [-independent pair with an invertible matrix T. Then there exists a sequence of subspaces ([" :::2 jf/' 1 :::2 · · • :::2 'lr 1 such that coJ(jf!'))=k+
for k
1
+ Im col(XTj-l ))=k+
1
=
(6.11)
([n(l-k)
= 0, ... , l - 1.
Proof Let jf/' 1 be a direct complement to Im XT 1- 1 = Im X (this equality follows from the invertibility of T) in ([". Then (6.11) holds for k = l - I. Suppose that 'lfl;+ 1 :::2 · · · :::2 'fl/. 1 are already constructed so that (6.11) holds fork= i, ... , l - 1. It is then easy to check that colC:z')}=; n lm coi(XP- 1))=; = {0}, where 2k = 'lfi~ fork= i + 1, ... , /,and :!l; = 11 ;+ 1 . Hence, the sum Y = col(o::!)~= 1 Im coi(XTj- 1)}= 1
+
is a direct sum. Let !!2 be a direct complement of Ker A in (["(l-il, where A = col(XTj- 1 )}=i+ 1. Let A1 be the generalized inverse of A (see Chapter S3) such that lm A1 = !!2 and Ker A1 =col( 'fl/·)~=i+ 1 • Let P be the projector on Ker A along !!2. One verifies easily that A 1A = I - P. Thus
.7
{[Y + XTi-
1
Pz
+ XTi- 1 A 1x] I}E ,
17''}
t1'n(l-i) r+1,XE1U ,ZEIU ,
(6.12)
X
where r is the size of T. Indeed, if x E (["
From the definition of .Cf' it follows that
+ x ~>
6.3.
175
SPECIAL EXTENSIONS
For any z E C', also
and clearly
for any y E if';+ 1. The inclusion verse inclusion, take y 'lr; + x
=::J
in (6.12) thus follows. To check the con-
E 1, 1Ecol('/f)~=; 1,and z EC'. Then [ v] [xri-1] [ v] + [xri-IJ A z = ,~ 1 + A [Pz + (I - P)z] = [ y J + [xyi-l Pz] + [xr- 1 (! - P):::] ;
+
1
x1
=
A(I - P)z
0
[Y] + [xyi-lpz] + [xyi- 1A(x 1+ A(I- P)z)] 1
0
0
x1
+ A (I
- P)z
(the last equality follows from A1x 1 = 0 and I - P = A 1A), and (6.12) is proved. Now, let !71 be a direct complement of 'If/';+ 1 + Im X Pin C". Then from (6.12) we obtain
Ti-l
and so we can put 'If/'; = if/';+ 1
+ C!Y.
D
We note some remarks concerning the subspaccs 'lri constructed in Lemma 6.4. (1) The subspaces 'ff/'i arc not uniquely determined. The question arises if one can choose if'i in a" canonical" way.lt follows from the proof of Lemma 6.4, for instance, that '/f'i can be chosen as the span of some set of coordinate unit vectors:
whereK 1 c K 1 _
1
c ··· c K ei
= (0
with 1 in the jth place.
= {1, ... ,n},and ··· 0
0
. . . O)T E C"
176
6. EXTENSION PROBLEMS
(2) The dimensions vi = dim ffli,j = 1, ... , l are determined uniquely by the pair (X, T): namely, vk+ 1 = n +rank col(XTi- 1 )J=k+ 1 (3)
-
rank col(XTi- 1 )j=b k = 0, ... ' l - 1.
(6.13)
Equality (6.11) fork = 0 means
+ Im col(XTi-
col("/1')}= 1
1
)}= 1 = (/;" 1,
and consequently, there exists an r x nl matrix V such that (6.14)
The left inverse V of col(XTi- 1 )}= 1 with property (6.14) will be called the special left inverse (of course, it depends on the choice of "'f!). Let now (X, T) be an admissible /-independent pair with invertible matrix T, and let V = [V1 · · · Jt!] be a special left inverse of col(XT;)l;:b. Then we can construct the associated monic matrix polynomial
The polynomial L(A.) will be called the special matrix polynomial associated with the pair (X, T). Theorem 6.5.
Let L(A.) be a special matrix polynomial associated with
(X, T), where Tis invertible ofsize r x r. Then zero is an eigenvalueofL(A.) with partial multiplicities
K1
2 K;
K2
2 ···,where
= I{jl1
~ j ~ land vi 2 i} I
and vi are defined by (6.13).
Here, as usual, IA I denotes the number of different elements in a finite set A. Proof
Let ff/ 1
::::> • • • ::::>
ff/ 1 be the sequence of subspaces from Lemma
6.4. Let V
= [V1
· · ·
Jt!]
be a special left inverse of col(XT;)l:b such that (6.14) and (6.15) hold. From (6.14) it follows that
j = 1, ... ' l. It is easy to see that x E if/;\ if/;+ 1 (i = 1, ... , l; if/ 1+ 1 = {0}) is an eigenvector of L(A.) generating a Jordan chain x, 0, ... , 0 (i - 1 zeros). Taking a
6.3.
177
SPECIAL EXTENSIONS
basis in each "ffl; modulo "ffl;+ 1 (i.e., the maximal set of vectors x 1, •.. , xk E "ffl; such that L~= 1 Or:; X; E "ffl;+ 1, Or:; E ~implies that or: 1 = .. · = or:k = 0), we obtain in this way a canonical set of Jordan chains of L(-1) corresponding to the eigenvalue zero. Now Theorem 6.5 follows from the definition of partial multiplicities. 0 Observe that (under the conditions and notations of Theorem 6.5) the multiplicity of zero as a root of det L(A.) is just nl - r. Then dimensional considerations lead to the following corollary of Theorem 6.5.
Under the conditions and notations of Theorem 6.5 let T) is similar to (X 141 , fl. 41 ), where A is the maximal f -invariant subspace such that f 1 41 is invertible. Corollary 6.6.
(X, f) be a standard pair of L(A.). Then (X,
Proof Let A 1 and A 2 be the maximal f-invariant subspaces such that fL 41 , is invertible and fL~t 2 is nilpotent (i.e., a(TI41 ,) = {0}). We know from Theorem 6.2 that for some T-invariant subspace A, (X, T) is similar to (XItt, Tl~t). Since Tis invertible, A c A 1 • In fact, A= A 1 . Indeed, by Theorem 6.5 dim A
2
=
K1
+ K 2 + · · ·;
on the other hand, it is easy to see that K1
+
so dim A
K2 1
=
+ · · · = v1 + · · · + v1 = nl - rank col(XT;)~:A r. But clearly, also dim A = r, and A = A
=
nl - r,
0
1•
It is not hard to generalize Theorem 6.5 in such a way as to remove the invertibility condition on T. Namely, let or: E ~ be any number outside the spectrum of T, and consider the admissible pair (X, -or:/ + T) in place of (X, T). Now the pair (X, T - or:/) is also /-independent because of the identity I
r
or:( b)!
(J(I-1(.1(/)J
0 (DJ
0 0
1[
X(T- or:/) X X(T _: or:/) 1-
1 rxi 1 XT X
1
-
1- 1 •
So we can apply Theorem 6.5 for the pair (X, -or:/ + T) producing a monic matrix polynomialLa(-1) whose standard pair is an extension of(X, -or:/ + T) (Theorem 6.2) and having at zero the partial multiplicities
Vk+ 1 = n
I + ran k co l( X ( T- or:/)1._I )i=k+ 1
-
l ·-I I rank co (X(T- or:/)1 )i=k• k = 0, ... , I - L
(6.16)
178
6.
EXTENSION PROBLEMS
Let (X~, 'Fa, Ya) be a standard triple of La(A); then (X~, 'Fa+ rxi, Ya) is a standard triple of the monic matrix polynomial L(A) = La(A + rx). Indeed, by the resolvent form (2.16), (L(A))- 1 = (L~(A + rx))- 1 = Xa(I(A + rx)- 'Fa)- 1 Ya = Xa(IA- ('Fa+ rxl))- 1 Ya,
and it remains to use Theorem 2.4. It follows that the standard triple of the polynomial L(A) is an extension of (X, T) and, in addition, its partial multiplicities at the eigenvalue Ao = rx are defined by (6.16). Thus, the construction of a special monic matrix polynomial associated with a given admissible /-independent pair (X, T) (with invertible T) can be regarded as an extension of (X, T) to a standard pair (X, f) of degree l such that u(f) = {0} u u(T); moreover, elementary divisors of I A - T corresponding to the nonzero eigenvalues are the same as those of I A - T, and the degrees of the elementary divisors of I A - f corresponding to zero are defined by (6.13). Again, we can drop the requirement that Tis invertible, replacing 0 by some rx ¢ u(T) and replacing (6.13) by (6.16). Given an admissible /-independent pair (X, T) (with invertible T of size r x r), it is possible to describe all the possible Jordan structures at zero of the matrix f such that zero is an eigenvalue off with multiplicity nl - r, and for some n x nl matrix X, the pair (X, f) is a standard extension of degree l of the pair (X, T). (The standard pair (X, f) described in the preceding paragraph is an extension of this type.) The description is given by the following theorem.
Theorem 6.7. Let (X, T) be an admissible /-independent pair. Suppose that T is invertible of size r x r, and let f be an nl x nl matrix such that A = 0 is a zero ofdet(IA- f) of multiplicity nl- r. Then for some matrix X, (X, f) is a standard extension of degree l of the pair (X, T) if and only if the degrees p 1 ~ · · · ~ Pv of the elementary divisors of I A - f corresponding to the eigenvalue 0 satisfy the inequalities i
i
LPi~ Lsi j=O
for
i = 1, 2, ... , v,
(6.17)
j=O
where skis the number of the integers n + q1 _ j-J larger thank, k = 0, 1, ... , and q 0 = 0,
-
j
q1_ i (j = 0, 1, ... , l - 1) ~
1.
For the proof of this result we refer to [37c]. Note that the partial multiplicities at zero of the special monic matrix polynomial L(A) associated with (X, T) (which coincide with the degrees of the elementary divisors of I A - f corresponding to zero, where f is some
6.3.
SPECIAL EXTENSIONS
179
linearization of L(A.)), are extremal in the sense that equalities hold in (6.17) throughout (cf. (6.13)).
Comments The results of this chapter are based on [37c]. The exposition of Section 6.3 follows [37b].
Part II
Non:monic Matrix Polyno:mials
In this part we shall consider regular n x n matrix polynomials L(A.), i.e., such that det L(A.) "¥= 0 and the leading coefficient of L(A.) is not necessarily the identity, or even invertible. As in the monic case, the spectrum rJ(L), which coincides with the zeros of det L(A.), is still a finite set. This is the crucial property allowing us to investigate the spectral properties of L(A.) along the lines developed in Part I. Such investigations will be carried out in Part II. Most of the applications which stem from the spectral analysis of monic matrix polynomials, and are presented in Part I, can be given also in the context of regular matrix polynomials. We present some applications to differential and difference equations in Chapter 8 and refer the reader to the original papers for others. On the other hand, problems concerning least common multiples and greatest common divisors of matrix polynomials fit naturally into the framework of regular polynomials rather than monic polynomials. Such problems are considered in Chapter 9. It is assumed throughout Part II that all matrix polynomials are regular, unless stated otherwise.
181
Chapter 7
Spectral Properties and Representations
In this chapter we extend the theory of Part I to include description of the spectral data, and basic representation theorems, for regular matrix polynomials, as well as solution of the inverse problem, i.e., construction of a regular polynomial given its spectral data. The latter problem is considered in two aspects: (1) when full spectral data are given, including infinity; (2) when spectral data are given at finite points only. 7.1. The Spectral Data (Finite and Infinite)
Let L(A.) = L~=o AiA.i be an n x n matrix polynomial. We shall study first the structure of the spectral data of L(A.), i.e., eigenvalues and Jordan chains. Recall the definition of a canonical set of Jordan chains of L(A.) corresponding to the eigenvalue A. 0 (see Section 1.6). Let cp}}l, ... ' (/)~~~ 1•
(/Jb2 l, ... ' (/)~;~ 1•
(/J~l, ... ' (/J~~l-1,
be such a canonical set, where ri = rank cp~l,j = 1, ... , k, k =dim Ker L(A. 0 ), and cpgl, i = 1, ... , k, are eigenvectors of L(A.) corresponding1:o A. 0 • We write this canonical set of Jordan chains in matrix form: X(A.o)
= [cpb0
···
J(A. 0) = diag(J 1> J 2,
({J~~~ 1 · · ·,
({Jb2l
J k), 183
184
7. SPECTRAL PROPERTIES AND REPRESENTATIONS
where J; is a Jordan block of size r; with eigenvalue A. 0 . Hence X(A. 0 ) is an rmatrixandJ(A. 0 )isanr x rmatrix,wherer = L~=l r1 isthemultiplicity of A. 0 as a zero of det L(A.). The pair of matrices (X(A. 0 ), J(A. 0 )) is called a Jordan pair of L(A.) corresponding to A. 0 . The following theorem provides a characterization for a Jordan pair corresponding to A. 0 •
nx
Theorem 7.1. Let (X,]) be a pair of matrices, where X is ann x 11 matrix and J is a 11 x 11 Jordan matrix with unique eigenvalue A. 0 . Then the following conditions are necessary and sufficient in order that (X,]) be a Jordan pair of L(A.) = L~=o A.1A 1 corresponding to A. 0 :
(i) (ii) (iii)
det L(A.) has a zero A. 0 of multiplicity 11, rank co1(X11)j:6 = 11, A 1 X1 1 + A 1_ 1x J l - l + · · · + A 0 X = o.
Proof Suppose that (X,]) is a Jordan pair of L(A.) corresponding to A. 0 . Then (i) follows from Corollary 1.14 and (iii) from Proposition 1.1 0. Property (ii) follows immediately from Theorem 7.3 below. Suppose now that (i)-(iii) hold. By Proposition 1.10, the part of X corresponding to each Jordan block in J, forms a Jordan chain of L(A.) corresponding to A. 0 . Condition (ii) ensures that the eigenvectors of these Jordan chains are linearly independent. Now in view of Proposition 1.15, it is clear that (X,]) is a Jordan pair of L(A.) corresponding to A. 0 . 0 Taking a Jordan pair (X(A.), J(A.)) for every eigenvalue A.1 of L(A.), we define a .finite Jordan pair (XF, JF) of L(A.) as
where pis the number of different eigenvalues of L(A.). Note that the sizes of XF and JF are n x v and v x v, respectively, where v = deg(det L(A.)). Note also that the pair (X F' JF) is not determined uniquely by the polynomial L(A.): the description of all finite Jordan pairs of L(A.) (with fixed JF) is given by the formula (XFU, JF), where (XF, JF) is any fixed finite Jordan pair of L(A.), and U is any invertible matrix which commutes with JF. The finite Jordan pair (XF, JF) does not determine L(A.) uniquely either: any matrix polynomial of the form V(A.), L(A.), where V(A.) is a matrix polynomial with det V(A.) const i= 0, has the same finite Jordan pairs as L(A.) (see Theorem 7.13 below). In order to determine L(A.) uniquely we have to consider (together with (X F' JF)) an additional Jordan pair (X cc" J ocJ of L(A.) for A. = CJJ, which is defined below. Let
=
i
= 1, ... ' q,
(7.1)
7.1. THE SPECTRAL DATA (FINITE AND INFINITE)
185
be a canonical set of Jordan chains ofthe analytic (at infinity) matrix function r'L(A) corresponding to A= oo, if this point is an eigenvalue of A- 1L(A) (where lis the degree of L(A), i.e., the maximal integer j such that Vj)(A) i= 0). By definition, a Jordan chain (resp. canonical set of Jordan chains) of an analytic matrix-valued function M(A) at infinity is just a Jordan chain (resp. canonical set of Jordan chains) of the matrix function M(A- 1 ) at zero. Thus, the Jordan chains (7.1) form a canonical set of Jordan chains of the matrix polynomial L(A) = A1L(A - 1 ) corresponding to the eigenvalue zero. We shall use the following notation X oo
= [l/lb1>
• • •
l/1~~>_ 1l/lb2 >
• • •
l/1~;>_ 1
• • •
,J,(q)
o/0
• • •
,J,(q)
]
V'sq-1 '
J oo = diag[J oo 1• J oo2' · · ·' J ooq], where J oojis the Jordan block of size sj with eigenvalue zero, and call (X oo, J 00 ) an infinite Jordan pair of L(A). Note that (X oo, J 00 ) is a Jordan pair of the matrix polynomial L(A) = A1L(A - 1 ) corresponding to the eigenvalue A = 0. This observation (combined with Theorem 7.1) leads to the following characterization of infinite Jordan pairs.
Theorem 7.2. Let (X, 1) be a pair of matrices, where X is n x f.1. and 1 is a f.1. x f.1. Jordan matrix with unique eigenvalue Ao = 0. Then the following conditions are necessary and sufficient in order that (X, 1) be an infinite Jordan pair of L(A) = L~=o AjAj: (i) (ii) (iii)
det(A1L(r 1 )) has a zero at Ao = 0 ofmultiplicity f.l., rank col( X P)~: ~ = f.l., A 0 XJ' + A 1 X1'- 1 + · · · + A 1 = 0.
ExAMPLE
7.1.
Consider the matrix polynomial
L(A.) In this case we can put down
and
= [
-(A.- 1)3 0
A.
A.
+
J
1 .
186
7. SPECTRAL PROPERTIES AND REPRESENTATIONS
7.2. Linearizations
I:=o
Let L(A) = Ai A; be a regular n x n matrix polynomial. An nl x nl linear matrix polynomial S0 + S 1 A is called a linearization of L(A) if
[L~A)
I 0 n(l-1)
J
= E(A)(S 0
+ S1 A)F(A)
for some nl x nl matrix polynomials E(A) and F(A) with constant nonzero determinants. For monic matrix polynomials the notion of linearization (with S 1 = I) was studied in Chapter 1. In this section we construct linearizations for regular matrix polynomials. For the matrix polynomial L(A) define its companion polynomial CL(A) as follows: I
0
0 0
0 I 0 0 0 0
0 -I 0 0 0 -I
0 0 A1
0
0 0
0
-1
A0 A1
A 1_
I
1
So C L(A) is a linear nl x nl matrix polynomial; if L(A) is monic, then C L(A) = !A - C 1 , where C 1 is the first companion matrix of L( A) ( cf. Theorem 1.1 ). It turns out that C L(A) is a linearization of L(A). Indeed, define a sequence of polynomials E;(A), i = 1, ... , l, by induction: Ez(A) = A 1, E;_ 1 (A) = A;_ 1 + AE;(A), i = l, ... , 2. It is easy to see that E 1 (A)
E 1_ 1 (A)
0
-I 0
-I
0
0
I
0 0 CL(A) -I
0 I
0 0
0
0
0 I -AI
I
0 '
0
0
0
-AI
0
-AI
(7.2)
I
so indeed CL(A) is a linearization df L(A). In particular, every regular matrix polynomial has a linearization. Clearly, a linearization is not unique; so the problem of a canonical choice of a linearization arises. For monic matrix polynomials the linearization U - C 1 with companion matrix C 1 can serve as a canonical choice.
187
7.2. LINEARIZATIONS
Another class of polynomials for which an analogous canonical linearization is available is the comonic polynomials L(A.), i.e., such that L(O) = I. Thus, if L(A.) =I + 1 AiA.i is a comonic matrix polynomial of degree l, the the matrix
D=
R
~ r~ ~ . 0
0 I
0 0
0
l-A~
-Al-l
will be called the comonic companion matrix of L(A.). (Compare with the definition of the companion matrix for monic matrix polynomials.) Using the comonic companion matrix, we have the linearization I - RA.; namely,
I - RA. = B(A.)diag[L(A.), I, ... , I]D(A.),
(7.3)
where
0 0
0 0
0 0
0 I
I 0
0 B2(A.)
0 B1-2(A.)
0 B1-1(A.)
B(A.) = I 0 I Bl(A.)
. h B ;(A.) -- LH-l i=O A l-i A_l-i- j , .l -- 1, wit
0
0
...
1 nd ,I- , a
0
I
-AI D(A.) =
0 I -AI I -AI 0
0 0
Clearly, B(A.) and D(A.) are everywhere invertible matrix polynomials. Indeed, it is easily seen that
A_l-2I A_l-3I 0
and (I - RA.)D- 1 (..1.)
= B(A.) · diag[L(A.), I, ... , I].
7.
188
SPECTRAL PROPERTIES AND REPRESENTATIONS
Linearizations of type (7.3) are easily extended to the case in which L(O) # I. This is done by considering the matrix polynomial L(A.) = (L(a))- 1 L(A. + a) instead of L(A.), where a¢ a{L), and observing that L(A.) is comonic. In the next section we shall encounter a different type of linearization based on the notion of decomposable pair.
7.3. Decomposable Pairs In Section 1.8 the important concept of a Jordan pair for a monic matrix polynomial was introduced, and in Section 7.1 we began the organization of the spectral data necessary for generalization of this concept to the nonmonic case. Here, we seek the proper generalization of the concept of a standard pair as introduced in Section 1.10. First, we call a pair of matrices (X, T) admissible of order p if X is n x p and T is p x p. Then, an admissible pair (X, T) of order nl is called a decomposable pair (of degree l) if the following properties are satisfied: (i)
T=
[~ ~J.
where X 1 is an n x m matrix, and T1 is an m x m matrix, for some m, 0 ~ m ~ nl (so that X 2 and T2 are of sizes n x (nl - m) and (nl - m) x (nl - m), respectively), (ii) the matrix
T~- 1 1
2 X X2Tll-2
(7.4)
X2
is nonsingular. The number m will be called the parameter of the decomposable pair (X, T). A decomposable pair (X, T) will be called a decomposable pair of the regular n x n matrix polynomial L(A.) = l:l=o A;k if, in addition to (i) and (ii), the following conditions hold: (iii)
Ll=o A; X 1 T;1 = 0, Ll=o AiX 2 T~-i = 0.
For example, if L(A.) is monic, then a standard pair for L(A.) is also a decomposable pair with the parameter m = nl. So the notion of a decomposable pair can be considered as a natural generalization of the notion of standard pair for monic matrix polynomials. Another example of a decom-
189
7.3. DECOMPOSABLE PAIRS
posable pair (with parameter m = 0) is the pair (X, T- 1 ), where {X, T) is a standard pair of a monic matrix polynomial L(A.) such that det L(O) # 0 (so that Tis nonsingular). We see that, in contrast to standard pairs for monic polynomials, there is a large nonuniqueness for their decomposable pairs. We shall touch upon the degree of non uniqueness of a decomposable pair for a matrix polynomial in the next section. The next theorem shows that a decomposable pair appears, as in the case of monic polynomials, to carry the full spectral information about its matrix polynomial, but now including spectrum at infinity. The existence of a decomposable pair for every regular matrix polynomial will also follow.
Theorem 7.3. Let L(A.) be a regular matrix polynomial, and let (XF, JF) and (X 00 , J 00 ) be its finite and infinite Jordan pairs, respectively. Then ([XF X 00 ] , JF EB J 00 ) is a decomposable pair for L(A.). Proof Let I be the degree of L(A.). By Theorems 7.1 and 7.2, properties (i) and (iii) hold for (X 1 , T1 ) = (XF, JF) .and (X 2 , T2 ) = (X oo, J 00 ). In order to show that ([XF X ooJ, JF EB J 00) is a decomposable pair for L(A.), it is enough to check the nonsingularity of (7.5) This will be done by reduction to the case of a comonic matrix polynomial. Let a E fl be such that the matrix polynomial L 1{A.) = L(A. -a) is invertible for A. = 0. We claim that the admissible pair (XF, JF + al) is a finite Jordan pair of L 1 (A.), and the admissible pair (X 00 , J ooU + aJ 00 ) - 1 ) is similar to an infinite Jordan pair of L 1 (A.). Indeed, write L(A.) = L}=o AjA.j, L 1 (A.) = £...j=O Aj 1 A.J, then
"l
.
Ak =
.± (.l k)aj-kAj
;=k J
1,
k = 0, ... , l.
(7.6)
Since (X oo, J 00 ) is an infinite Jordan pair of L(A.), we have by Theorem 7.2 l
L AjXooJ~j = 0.
(7.7)
j=O
Equations (7.5) and (7.6) imply (by a straightforward computation) that l
"
L.. A j 1 X
j=6 where
Joo = J ooU + aJ
00 ) -
1.
AI . oo J;; J = 0,
(7.8)
Let 11 be the size of J oo; then for some p,
rank col( X oo J~)f:J = 11·
(7.9)
Indeed, since (X oo, J 00 ) represent a canonical set of Jordan chains, we obtain, in particular, that (7.10)
190
7. SPECTRAL PROPERTIES AND REPRESENTATIONS
for some p. Using the property that (J(J rxJ = {0}, it is easily seen that (7.1 0) implies (in fact, is equivalent to) X oo y =F 0 for every y E Ker J oo \ {0}. But apparently Ker Joo = Ker J oo, so X oo y =F 0 for every y E Ker Joo \ {0}, and (7.9) follows. Now (7.8) and (7.9) ensure (Theorem 7.2) that (X oo, J 00 ) is similar to an infinite Jordan pair of L 1 (.~.) (observe that the multiplicity of A. 0 = 0 as a zero of det(A.1L(r 1 )) and of det(A.1L 1 ( r 1 )) is the same). It is an easy exercise to check that (XF, JF + a/) is a finite Jordan pair of L 1 (A.). Let (XF, iF) and (X 00 , ioo) be finite and infinite Jordan pairs of L 1 (A.). Since L 1(0) is nonsingular, iF is nonsingular as well. We show now that (X, f)~ ([XF X 00 ] , ii 1 Ei) j 00 ) is a standard pair for the monic matrix polynomiall 1 (A.) = A.1(L 1 (0))- 1 L 1 ( r 1 ). Let B = [x 0 x 1 · · · xr] be a Jordan chain for L 1 (A.) taken from XF, and corresponding to the eigenvalue A. 0 =F 0. Let K be the Jordan block of size (r + 1) x (r + 1) with eigenvalue A. 0 . By Proposition 1.10 we have I
L Ai BKi = 1
0,
(7.11)
i~o
where A i 1 are the coefficients of L 1 (A.). Since A. 0 =F 0, K is nonsingular and so I
L Ai BKi-l = 0. 1
(7.12)
j~O
Let K 0 be the matrix obtained from K by replacing A. 0 by A- 0 1 . Then K 0 and K- 1 are similar: (7.13) where
Mr(Ao) =
[< -1)i(! =11 )A.b+i]~<.J~O . l
and it is assumed that<= D= 1 and(~) = 0 for q > p or q = -1 and p > -1. The equality (7.13) or, what is the same, KMr(A. 0 )K 0 = M.(A- 0 ), can be verified by multiplication. Insert in (7.12) the expression (7.13) forK- 1 ; it follows from Proposition 1.10 that the columns of B(Mr(A- 0 ))- 1 form a Jordan chain of l 1 (A.) corresponding to A. 01 . Let k be the number of Jordan chains in XF corresponding· to A. 0 , and let B; (i = 1, ... , k) be the matrix whose columns form the ith Jordan chain corresponding to A. 0 . We perform the above transformation for every B;. It then follows that the columns of B 1 (Mr,(A. 0 ))- I, ... , Bk(Mrk(A. 0 ))- 1 form Jordan chains of l(A.) corresponding to ). 0 1 . Let us check that these Jordan chains of l(A.) form a canonical set of Jordan chains corresponding to ). 0 1• By Proposition 1.15, it is enough to check that the eigenvectors b 1 , ••. , bk in B 1 (Mr,(A- 0 ))- I, ... , Bk(Mrk(A. 0 ))- I, respectively, are linearly independent.
7.4. PROPERTIES OF DECOMPOSABLE PAIRS
191
But from the structure of (Mr,(A. 0 ))- 1 it is clear that b 1, •.. , bk are also the eigenvectors in B 1 , .•. , Bk> respectively, of L(A.) corresponding to ..1. 0 , and therefore they are linearly independent again by Proposition 1.15. Now it is clear that (X;. 0 , i-;0 1 ), where j ;. 0 is the part of jF corresponding to the eigenvalue ..1. 0 ( # 0) of j F and X;. 0 is the corresponding part of XF, is the part of a standard pair of L1 (A.) corresponding to ..1. 0 1 . Recall that by definition (X oo, j 00 ) is the part of a Jordan pair of L 1 (A.) corresponding to 0. So indeed (X, f) is a standard pair of L1 (A.). In particular, col(X j i)l.:b = col(XFjii, X oohJl:& is nonsingular (cf. Section 1.9). Since we can choose XF = XF, jF = JF + r:t.I, and the pair (X oo, J 00 ( / + r:t.J 00 ) - 1 ) is similar to (X 00 , j 00 ), the matrix 11- 1 ~r col( X F(J F + aiY, X 00 1t; 1 -;(J + r:t.1 00 Y)l:& is also nonsingular. Finally, observe that
I r:t.I 11-1 = @r:t.2I (b)azi
0 I (Dai
0 0 (DI
0 0 0
(Dal-1 I
(Dal-2I
Cl)I
SI-b
(7.14)
where S1_ 1 is given by (7.5). (This equality is checked by a straightforward computation.) So S 1_ 1 is also invertible, and Theorem 7.3 is proved. D 7 .4. Properties of Decomposable Pairs
This section is of an auxiliary character. We display here some simple properties of a decomposable pair, some of which will be used to prove main results in the next two sections. Proposition 7.4. Let (X, Y) = ([X 1 X 2 ], T1 EB T2 ) be a decomposable pair of degree l, and let S1_ 2 = col(X 1 TL X 2 T~- 2 -;)l:6. Then
(a) S1_ 2 hasfull rank (equal to n(l- 1)); (b) the nl x nl matrix (7.15) where S 1_ 1 is defined by (7.4), is a projector with Ker P = Ker S 1_ 2 ; (c) we have (I- P)(I
[0 OJ
_ 1 - 1 EB T2)S;I. · 1 =(I EB T2)Sz-1 0
(7.16)
192
7. Proof
SPECTRAL PROPER TIES AND REPRESENTATIONS
By forming the products indicated it is easily checked that
l ~: ~l
:::
ll
~1
s1-1
= sz- 2 T(A.),
(7.17)
-I
where T(A.) = (IA.- T1) EB (T2 A.- I). Then put A.= A.0 in (7.17), where A. 0 is such that T(A. 0 ) is nonsingular, and use the nonsingularity of S1_ 1 to deduce that sl- 2 has full rank. Dividing (7.17) by A. and letting A.--+ oo, we obtain (7.18) Now, using (7.18)
[I] [I]
- [I]
- 1 P 2 =(I EB T2)S1-1 O Sz-2(1 EB T2)Sz-11 O Sz-2
- 1 O [I =(I EB T2)Sz-1
- [I]
[JJ
OJ O Sz-2 =(I EB T2)Sz-11 O Sz-2 = P,
so P is a projector. Formula (7.15) shows that Ker P => Ker S1_ 2.
(7.19)
On the other hand,
[I]
-1 O Sz-2 = Sz-2 Sz- 2P = Sz- 2(1 EB T2)Sz-1
inviewof(7.18),sorankP 2:: rankS 1_ 2 = n(l- 1).Combiningwith(7.19),we obtain Ker P = Ker S1_ 2 as required. Finally, to check (7.16), use the following steps: P(l EB T2)S1--\ =(I EB
T2 )S1--\[~J {S1_ 2(J EB T2)S1--\}
= (I EB T2)Sz--\ [~][I OJ = (I EB T2)Sz--\ [~ where the latter equality follows from (7.18).
~J
0
The following technical property of decomposable pairs for matrix polynomials will also be useful.
7.4.
193
PROPERTIES OF DECOMPOSABLE PAIRS
Proposition 7.5. Let ([X 1 X 2 ], T1 EB T2 ) be a decomposable pair for the matrix polynomial L(A.) = L~;o A,iAi. Then
= 0,
VP
(7.20)
where Pis given by (7.15), and
V=
[
A l X 1 T 11-
1
'
-
l-1
"A.X T 12- 1 -i L.., ' 2
J •
i;Q
Proof
Since
I
LAiX 2 T~-i=0
[X 1T 11-l,X 2]=[0 ···
and
0
I]S 1_ 1 ,
i;O
we have
VP = [A1X1T11-l,
-:~AiX2 T~-]st--\[~Jst-2
=A 1 [X 1 T 11-l,X 2 ]S 1--\[~]S 1 _ 2 =A 1 [0
···
0
J][~]S 1 - 2 =0.
0
We conclude this section with some remarks concerning similarity of decomposable pairs. First, decomposable pairs
with the same parameter m are said to be similar if for some nonsingular matrices Q 1 and Q2 of sizes m x m and (nl- m) x (nl- m), respectively, the relations i = 1, 2,
hold. Clearly, if (X, T) is a decomposable pair of a matrix polynomial L(A.) (of degree 1), then so is every decomposable pair which is similar to (X, T). The converse statement, that any two decomposable pairs of L(A.) with the same parameter are similar, is more delicate. We prove that this is indeed the case if we impose certain spectral conditions on T1 and T2 , as follows. Let L(A.) = Il;o AiA.i be a matrix polynomial with decomposable pairs (7.21) and the same parameter. Let T1 and f 1 be nonsingular and suppose that the following conditions hold: (J( 7;)
=
(J(T1 1 ) n (J(T2 )
(J( f;),
=
Then (X, T) and (X}') are similar.
i
= 1, 2,
(J(T1 1 ) n (J(T2 )
(7.22)
=
0.
(7.23)
7.
194
SPECTRAL PROPERTIES AND REPRESENTATIONS
Let us prove this statement. We have [A 0
A 1 ···
A 1 ]co1(X 1 Ti1 ,X 2 Ti-;)~=o=0
(7.24)
or
Since S1_ 1 and T1 are nonsingular, Im A 0
::::>
Im[A 1 · · · A 1],
and therefore A 0 must also be nonsingular (otherwise for all A. E C, A 1 · · · A 1] # fl",
lm L(A.) c lm[A 0
a contradiction with the regularity of L(A.)). Consider the monic matrix polynomial L(A.) = A 0 1A.1L(r 1). Equality (7.24) shows that the standard pmr
[I
0
...
0 0
I 0
0 0
0 I
0], 0 0 -A()1AI -A()1AI-1 -A()1AI-2 0
I -A() 1A1
of L(A.) is similar to ([X 1 X 2 ], T! 1 EB T2 ), with similarity matrix col( X 1T!;, X 2 T~)~;;b (this matrix is nonsingular in view of the nonsingularity of S1_ 1). So ([X 1 X 2 ], T! 1 EB T2 ) is a standard pair of L(A.). Analogous arguments show that ([X 1 X 2 ], T! 1 EB T2 ) is also a standard pair of L(A.). But we know that any two standard pairs of a monic matrix polynomial are similar; so T1 1 EB T2 and t 1 1 EB T2 are similar. Together with the spectral conditions (7.22) and (7.23) this implies the similarity ofT; and f;, i = 1, 2. Now, using the fact that in a standard triple of L(A.), the part corresponding to each eigenvalue is determined uniquely up to similarity, we deduce that (X, T) and (X, T) are similar, as claimed. Note that the nonsingularity condition on T1 and T1 in the above statement may be dropped if, instead of (7.23), one requires that a((T1
+ al)- 1
n a(T2 (/
+ aT2 )- 1 ) = a((T1 + al)- 1
n a(Tz(I
+ aT2 )- 1 ) = 0,
(7.25)
for some a E ([,such that the inverse matrices in (7.25) exist. This case is easily reduced to the case considered above, bearing in mind that
7.5. DECOMPOSABLE LINEARIZATION AND A RESOLVENT FORM
195
is a decomposable pair for L(A. - a), provided ([X 1 X 2 ], T1 EB T2) is a decomposable pair for L(A.). (This can be checked by straightforward calculation using(7.14).) Finally, it is easy to check that similarity ofT2 (I + aT2 )- 1 and Tz(I + aT2)- 1 implies (in fact, is equivalent to) the similarity of T2 and T2. 7 .5. Decomposable Linearization and a Resolvent Form
We show now that a decomposable pair for a matrix polynomial L(A.), introduced in Section 7.4, determines a linearization of L(A.), as follows. Theorem 7.6. Let L(A.) = L~=o A;A.; be a regular matrix polynomial, and let ([X 1 X 2], T1 EB T2) be its decomposable pair. Then T(A.) = (I A. - T1) EB (T2A.- I) is a linearization of L(A.). Moreover
CL(A.)Sz-1 = [Sz-2] V T(A.,)
(7.26)
where V = [A 1 X 1T 11-\ - ~]::6 A; X 2 Ti-l-i] and
C L(A)
I 0
0 I
0 0
0 0
0 0
I
=
~L
0 0
-I
~J
0
.
0
0
0 -I
0 0
0
0
-I
\
Ao Al A2
Az-1
is the companion polynomial of L(A.).
Proof The equality of the first n(l - 1) rows of (7.26) coincides with (7.17). The equality in the last n rows of (7.26) follows from the definition of a decomposable pair for a matrix polynomial (property (iii)). The matrix [s'v 2 ] is nonsingular, because otherwise (7.26) would imply that det CL(A.) = 0, which is impossible since CL(A.) is a linearization of L(A.) (see Section 7.2) and L(A.) is regular. Consequently, (7.26) implies that T(A.) is a linearization of L(A.) together with CL(A.), 0 The linearization T(A.) introduced in Theorem 7.6 will be called a decomposable linearization of L(A.) (corresponding to the decomposable pair ([X 1 X 2 ], T1 EB T2 ) of L(A.)). Using a decomposable linearization, we now construct a resolvent form for a regular matrix polynomial. Theorem 7.7. Let L(A.) be a matrix polynomial with decomposable pair ([X 1 X 2 ], T1 EB T2) and corresponding decomposable linearization T(A.). Put
7.
196
SPECTRAL PROPERTIES AND REPRESENTATIONS
V= [Aix1n- 1, -I:l:6A;Xzr~- 1 -i],
S1-2
=col(X1Ti1•
xzn-z-i)i:6,
and
Then
(7.27) Observe that the matrix [8 'v 2 ] in Theorem 7.7 is nonsingular, as we have seen in the proof of Theorem 7.6. Proof.
We shall use the equality (7.2), which can be rewritten in the form .diag[L - 1(A.), I, ... , I] = D(A.)Ci 1(A.)B- 1(A.),
where
B(A.) =
B 1 (A.) B 2 (A.) .. · B 1_ 1 (A.) -I 0 0 0 -I 0 0
-I
0
I 0 0
(7.28)
0
with some matrix polynomials B;(A.), i = 1, ... , l - 1, and I
D(A.) =
0 -A.! I 0 -A.! 0
Multiplying this relation by [I
0
0 0 I 0
0 0 0
-A.!
(7.29)
I
0 · · · 0] from the left and by
u o .. · oF from the right, and taking into account the form of B(A.) and D(A.) given by (7.28) and (7.29), respectively, we obtain L - 1 (A.)
=
[I
Using (7.26) it follows that
0
. . . O]CL(A.)- 1 [0
... 0
I]T.
7.6. REPRESENTATION AND THE INVERSE PROBLEM
197
Now
[I
0
O]Sz-1T(A.)-l[Sz~2rl =
and (7.27) follows.
= [Xl [Xl
X2T~-l]T(A.)-l[Sz~2rl
X2](I EB
T~-l)T(A.)-l[Sz~2rl
0
7 .6. Representation and the Inverse Problem
We consider now the inverse problem, i.e., given a decomposable pair (X, T) of degree I, find all regular matrix polynomials (of degree/) which have (X, T) as their decomposable pair. The following theorem provides a description of all such matrix polynomials. Theorem 7.8. Let(X, T) = ([X 1 X 2 ], T1 EB T2 )beadecomposablepair of degree I, and let S 1_ 2 = col( X 1Ti1 , X 2 T~- 2- i)~:6. Then for every n x nl matrix V such that the matrix [s'v 2 ] is nonsingular, the matrix polynomial
L(A.) = V(I- P)[(H- T1 ) EB (T2 A.- I)](U 0
+
U 1 A.
+ ·· · +
U1 _ 1 A.1-
1 ),
(7.30) where P
= (I EB T2 ) [col( X 1Ti1 , X 2 n-l-i)~:6r 1[6]S 1_ 2 and [U 0
U 1 ···
U1_ 1] = [col(X 1 TLX 2 T~-l-i)l;:6r 1 ,
has (X, T) as its decomposable pair. Conversely, if L(A.) = A)i has (X, T) as its decomposable pair, then L(A.) admits representation (7.30) with
:D=o
(7.31) For future reference, let us write formula (7.30) explicitly, with V given by (7.31):
(7.32)
198
7. SPECTRAL PROPER TIES AND REPRESENTATIONS
Let V be an n x nl matrix such that [8 'v 2 ] is nonsingular, and let by (7.30). We show that (X, T) is a decomposable pair of L(A). DefineS 1_ 1andPby(7.4)and(7.15),respectively,andput W = V(I- P). Since P is a projector, Proof L(A) =
Ll=o AiAi be defined
WP=O.
(7.33)
Besides,
hence [ 8 \V 2 ] is nonsingular. Using (7.33) and Proposition 7.4(c), we obtain W(I
EB T2)S/. \
= W(I - P)(I = W(I
EB T2)s,-_\
[0 OJ
_ 1 EB T2)S1-1 O In
= [0
··· 0
A,]; (7.34)
and therefore by the definition of L(A) W(T1
EB I)s,-_\ = - [A 0
A 1_ 1].
A1 ·· ·
(7.35)
Combining (7.34) and (7.35), we obtain CL(A)SI-1 =
[s~ 1 ]rcA),
(7.36)
where C L(A) is the companion polynomial of L(A) and T(A) = (I A - T1) EB (T2 A - I). Comparison with (7.26) shows that in fact 1-1 W = [ AI X lyl-1 yl-1-i l • -"A-X L..r22
J
(7.37)
·
i=O
The bottom row of the equation obtained by substituting (7.37) in (7.36) takes the form
x2 r~-11
X 2 y~-2
[H-
1 = [ A X yl-1 _ I="A.X yl-1-i J l
1
1
'
i£;-0
I
2
2
0
x2 T1
0
J
T2 A - I .
199
7.6. REPRESENTATION AND THE INVERSE PROBLEM
This implies immediately that l
l
IAiX1 T~
IAiX2 T~-i = 0,
= 0,
1=0
i=O
i.e., (X, T) is a decomposable pair for L(A.). We now prove the converse statement. Let L(A.) = L~=oA;A.i be a regular matrix polynomial with decomposable pair (X, T). First observe that, using (7.17), we may deduce S1- 2 T(A.)Sl--\ col(JA.;)~:6 = 0.
(7.38)
Then it follows from (7.2) that
0 I
L(A.) EB I.(l-1) = [_I*
:
n(l-1)
0
0
°
.
01 :
·.
0 I
where the star denotes a matrix entry of no significance. Hence,
L(A.)
= [*
...
*
I.]CL(A.) col(JA.;)~:6.
(7.39)
Now use Theorem 7.6 (Eq. (7.26)) to substitute for CL(A.) and write
I.]
[S 1~ 2 JT(A.)S 1-_\ col(JA.;)::6
L(A.) = [*
...
*
= [*
···
*]S1- 2 T(A.)S 1--\ col(JA.;)l:6
+
VT(A.)S 1--\ col(JA.;)~:6.
But the first term on the right is zero by (7.38), and the conclusion (7.30) follows since V P = 0 by Proposition 7.5., 0 Note that the right canonical form of a monic matrix polynomial (Theorem 2.4) can be easily obtained as a particular case of representation (7.32). Indeed, let L(A.) = H 1 + Il:6 A)i, and let (X, T) = (XF, JF) be its finite Jordan pair, which is at the same time its decomposable pair (Theorem 7.3). Then formula (7.32) gives
L(A.) =
XT 1 - 1 T(A.)(t~U;A.;) = XT
which coincides with (2.14).
1-
1(IA.-
T)c~U;A.)
200
7. SPECTRAL PROPERTIES AND REPRESENTATIONS
Theorem 7.8 shows, in particular, that for each decomposable pair (X, T) there are associated matrix polynomials L(A.), and all of them are given by formula (7.30). It turns out that such an L(A.) is essentially unique: namely, if L(A.) and L(A.) are regular matrix polynomials of degree l with the same decomposable pair (X, Y), then
L(A.) = QL(A.)
(7.40)
for some constant nonsingular matrix Q. (Note that the converse is trivial: if L(A.) and L(A.) are related as in (7.40), then they have the same decomposable pairs.) More exactly, the following result holds.
let
Theorem 7.9. Let (X, T) be a decomposable pair as in Theorem 7.8, and vl and v2 ben X nl matrices such that [ 8 ~/] is nonsingular, i = 1, 2. Then Lv/A.) = QLv2(A.),
v;, and
where Lv;(A.) is given by (7.30) with V =
Q=
(Vl\Kers,_ 2 )
•
(V2\KerS 1 _ ) - 1·
(7.41)
Note that, in view of the nonsingularity of [8 j,~ 2 ] , the restrictions v;\Kers,_ 2 , i = 1, 2 are invertible; consequently, Q is a nonsingular matrix.
Proof
We have to check only that QVil- P) = V1(1- P).
(7.42)
But both sides of (7.42) are equal to zero when restricted to ImP. On the other hand, in view of Proposition 7.4(c) and (7.41), both sides of (7.42) are equal when restricted to Ker P. Since q;nl is a direct sum of Ker P and Im P, (7.42) follows. D We conclude this section by an illustrative example. EXAMPLE 7.2. Let X 1 = X 2 = I, T1 = 0, T2 = a!, a # 0 (all matrices are 2 x 2). Then ([X 1 X 2 ], T1 EB T2 ) is a decomposable pair (where l = 2). According to Theorem 7.8, all matrix polynomials L(A.) for which [(X 1 X 2 ], T1 EB T2 ) is their decomposable pair are given by (7.30). A computation shows that
(7.43) where V1 and V2 are any 2 x 2 matrices such that
is nonsingular. Of course, one can rewrite (7.43) in the form L(A.) = V(A.- aA. 2 ), where Vis any nonsingular 2 x 2 matrix, as it should be by Theorem 7.9. 0
7.7.
201
DIVISIBILITY OF MATRIX POLYNOMIALS
7.7. Divisibility of Matrix Polynomials In this section we shall characterize divisibility of matrix polynomials in terms of their spectral data at the finite points of the spectrum, i.e., the data contained in a finite Jordan pair. This characterization is given by the following theorem. Theorem 7.10. Let B(A.) and L(A.) be regular matrix polynomials. Assume B(A.) = L}=o BiA.i, Bm #- 0, and let (XF, JF) be the .finite Jordan pair of L(A.). Then L(A.) is a right divisor of B(A.), i.e., B(A.)L(A.)- 1 is also a polynomial, if and only if m
L BjXFJ~ = 0.
(7.44)
j=O
We shall need some preparation for the proof of Theorem 7.10. First, without loss of generality we can (and will) assume that L(A.) is comonic, i.e., L(O) = I. Indeed, a matrix polynomial L(A.) is a right divisor of a matrix polynomial B(A.) if and only if the co monic polynomial L - 1 (()()L(A. + ()() is a right divisor of the comonic polynomial B- 1(()()B(A. + ()(). Here ()( E q; is such that both L(()() and B(()() are non singular, existence of such an()( is ensured by the regularity conditions det B(A.) I- 0 and det L(A.) I- 0. Observe also that (XF, JF) is a finite Jordan pair of L(A.) if and only if (XF, JF +()(I) is a finite Jordan pair of L - 1 (()()L(A. + ()().This fact may be verified easily using Theorem 7.1. Second, we describe the process of division of B(A.) by L(A.) (assuming L(A.) is comonic). Let l be the degree of L(A.), and let (X oo, J 00 ) be the infinite Jordan pair of L(A.). Put (7.45) We shall use the fact that (X, J) is a standard pair for the monic matrix polynomial [(A.) = A.1L(r 1 ) (see Theorem 7.15 below). In particular, col(XJi- 1 ):= 1 is nonsingular. For ()( ~ 0 and 1 ~ f3 ~ l set Fap = XJ<XZp, where (7.46) Further, for each()(~ 0 and f3 the following formulas hold:
~
0 or f3 > l put Fap = 0. With this choice of Fap 00
L(A.) = I -
L A.iFz.z+
1-
i'
(7.47)
j= 1
and (7.48)
202
7.
SPECTRAL PROPERTIES AND REPRESENTATIONS
Indeed, formula (7.47) is an immediate consequence from the right canonical form for L(.A.) using its standard pair (X, J). Formula (7.48) coincides (up to the notation) with (3.15). The process of division of B(.A.) by L(.A.) can now be described as follows:
k = 1, 2, ... '
(7.49)
where Qk(.A.) = kit.A.j(Bj 1=0 Rk(},) =
I BiFl+j-t-i,l),
+1
1
r=O
j~/(Bj + :t>iFl+k-1-i,l+k-j),
(7.50)
and B j = 0 for j > m. We verify (7 .49) using induction on k. Fork = 1 (7 .49) is trivial. Assume that (7.49) holds for some k ~ 1. Then using the recursion (7.48), one sees that B(.A.)- Qk+ 1 (.A.)L(.A.) = Rk(.A.)- ,A_k(Bk
+
:t>;F
1+k-t-i.)L(.A.)
= Rk+ t(.A.), and the induction is complete. We say that a matrix polynomial R(.A.) has codegree k if R(il(O) = 0, i = 0, ... , k - 1, and R
xY- 1 Zj =
Rk(.A.) =
j=~+lBj.A.j + ~~~
1
B;XJI+k-t-ictt Zp.A.l+k-/3).
7. 7.
203
DIVISIBILITY OF MATRIX POLYNOMIALS
It follows that Rk(A.) = 0 if and only if m
l
+k>
m,
'f.BiXJz+k-1-i = 0.
(7.51)
i=O
Now suppose that L(A.) is a right divisor of B(A.). Then there exists k that (7. 51) holds. Because of formula (7.45) this implies
~
1 such
m
'f.BiXFJ;;l-k+1+i = 0. i=O
Multiplying the left- and right-hand sides of this identity by J~+k- 1 yields the desired formula (7.44). Conversely, suppose that (7.44) holds. Let v be a positive integer such that J"cr_, = 0. Choose k ~ 1 such that l + k ~ m + v. Multiply the left- and righthand sides of (7.44) by Jil-k+ 1 . This gives m
'f.BiXFJi
= 0.
i=O
As X",Jz:;k- 1 - i = 0 fori= O, ... ,m, we see that with this choice of k, formula (7.51) obtains. Hence L(A.) is a right divisor of B(A.). D Corollary 7.11. Let B(A.) and L(A.) be regular n x n matrix polynomials. Then L(A.) is a right divisor of B(A.) if and only if each Jordan chain of L(A.) is a Jordan chain of B(A.) corresponding to the same eigenvalue.
Proof If L(A.) is a right divisor of B(A.), then one sees through a direct computation (using root polynomials, for example) that each Jordan chain of L(A.) is a Jordan chain for B(A.) corresponding to the same eigenvalue as for L(A.). Conversely, if each Jordan chain of L(A.) is a Jordan chain for B(A.) corresponding to the same eigenvalue, then by Proposition 1.10, (7.44) holds and so L(A.) is a right divisor of B(A.). D
Note that in the proof of Theorem 7.10 we did not use the comonicity of B(A.). It follows that Theorem 7.10 is valid for any pair of n x n matrix polynomials B(A.) and L(A.), provided L(A.) is regular. A similar remark applies to the previous corollary. Next we show that a description of divisibility of regular polynomials can also be given in terms of their finite Jordan chains. In order to formulate this criterion we shall need the notions of restriction and extension of admissible pairs. As before, a pair of matrices (X, T) is called an admissible pair of order p if X is n x p and T is p x p. The space
nKer 00
Ker(X, T) =
j= 0
nKer[col(XTi00
XTi
=
s=0
1 )j= 1 ]
204
7. SPECTRAL PROPERTIES AND REPRESENTATIONS
is called the kernel of the pair (X, T). The least positive integers such that Ker[col(XTj- 1 ))= 1 ] = Ker[col(XTj- 1 )j!l] will be called the index of stabilization of the pair (X, T); it will be denoted by ind(X, T). Since the subspaces A. = Ker[col(XTj- 1 )}= 1 ] c CP form a descending chain A 1 ::::J A 2 ::::J • • • ::::J A. ::::J As+ 1 , there always exists a minimal s such that A.= As+ 1 , i.e., ind(X, T) is defined correctly. Given two admissible pairs (X, T) and (X 1 , T1 ) of orders p and p 1 , respectively, we call (X, T) an extension of(X 1 , T1 ) (or, equivalently, (X 1 , T1 ) a restriction of (X, T)) if there exists an injective linear transformation S: CP' -+ CP such that XS =X 1 and TS = ST1 • Observe that in this case Im S is invariant under T. Admissible pairs (X, T) and (X 1 , T1 ) are called similar if each is an extension of the other; in this case their orders are equal and XU = X 1 , U - 1 TV = T1 for some invertible linear transformation U. The definition of extension given above is easily seen to be consistent with the discussion of Section 6.1. Thus, the pair (X l1ms• T l1ms) is similar to (X 1 , T1 ), with the similarity transformation S: CP'-+ Im S. An important example of an admissible pair is a finite Jordan pair (X F, J F) of a comonic matrix polynomial L(A.). It follows from Theorem 7.3 that the kernel of the admissible pair (XF, JF) is zero. We shall need the following characterization of extension of admissible pairs. Lemma 7.12. Let (X, T) and (X 1 , T1 ) be admissible pairs. If (X, T) is an extension of(X 1 , T1 ), then
Im[col(X 1 T~- 1 )7'= 1 ] c Im[col(XT;- 1 )7'= 1 ],
m ~ 1.
Conversely, if(X 1 , T1 ) has a zero kernel and if
Im[col(X 1 T~- 1 )~~ lJ c Im[col(XT;- 1 )~~ lJ, where k is some fixed positive integer such that k ind(X 1 , T1 ), then (X, T) is an extension of (X 1 , T1 ).
~
(7.52)
ind(X, T) and k
~
Proof The first part of the lemma is trivial. To prove the second part: first suppose that both (X 1 , T1 ) and (X, T) have zero kernel. Our choice of k implies
Put n = col(XT;- 1 )l= 1 and Q 1 = col(X 1 T~- 1 )~= 1 , and let p and p 1 be the orders of the pairs (X, T) and (X 1 , T1 ). From our hypothesis if follows that Im n1 c Im n. Hence, there exists a linear transformationS: CPI-+ CP such that QS = Q 1 . As Ker Q 1 = {0}, we see that S is injective. Further, from
7. 7.
DIVISIBILITY OF MATRIX POLYNOMIALS
205
Ker Q = {0} it follows that S is uniquely determined. Note that QS = Q 1 implies that XS = X 1 . To prove that TS = ST1 , let us consider
ns n1.
= As before, there exists a linear transformationS: ([Pl ~ ([P such that The last equality implies QS = Q 1 , and thus S = S. But then QTS = QTS = 0 1 T1 = QST1 . It follows that Q(TS- ST1 ) = 0. As Ker Q = {0}, we have TS = ST1 . This completes the proof of the lemma for the case that both (X, T) and (X 1 , T1 ) have zero kernel. Next, suppose that Ker(X, T) -=f. {0}. Let p be the order of the pair (X, T), and let Q be a projector of ([P along Ker(X, T). Put Xo = XIImQ•
To= QTIImQ•
Then Ker(X 0 , T0 ) = {0} and for all m ;:::: 1 Im[col(X 0 T~- 1 );"= 1 ]
= Im[col(XT;- 1 );"= 1 ].
As, moreover, ind(X 0 , T0 ) = ind(X, T), we see from the result proved in the previous paragraph that (7.52) implies that (X 0 , T0 ) is an extension of(X 1 , T1 ). But, trivially, (X, T) is an extension of (X 0 , T0 ). So it follows that (X, T) is an extension of (X 1 , T1 ), and the proof is complete. 0 Now we can state and prove the promised characterization of divisibility of regular matrix polynomials in terms of their finite Jordan pairs.
Theorem 7.13. Let B(A.) and L(A.) be regular n x n matrix polynomials and let (X B• J B) and (X L• J L) be finite Jordan pairs of B(A.) and L(A.), respectively. Then L(A.) is a right divisor of B(A.) if and only if (X B• J B) is an extension of (XL,JL). Proof Let q be the degree of det B(A.), and let p be the degree of det L(A.). Then (X B, J B) and (XL, J L) are admissible pairs of orders q and p, respectively. Suppose that there exists an injective linear transformation S: ([P ~ rtq such that XBS =XL, JBS = SJL. Then XBJ~S = XLJ{ for each j;:::: 0. Now assume that B(A.) = Lj=o BiA.i. Then it/jXLJ{ =
(J0 BiXBJ~ )s = 0.
So we can apply Theorem 7.10 to show that L(A.) is a right divisor of B(A.). Conversely, suppose that L(A.) is a right divisor of B(A.). Then each chain of L(A.) is a Jordan chain of B(A.) corresponding to the same eigenvalue as for L(A.) (see Corollary 7.11). Hence Im[col(XLJ~- 1 );"= 1 ] c:: Im[col(XBJk- 1 );"= 1 ]
7. SPECTRAL PROPERTIES AND REPRESENTATIONS
206
for each m ;::::: 1. As Ker(X L' J L) and Ker(X B' J B) consist of the zero element only, it follows that we can apply Lemma 7.12 to show that (X B• J B) is an extension of (XL, h). D
7 .8. Representation Theorems for Co monic Matrix Polynomials
In this section the results of Section 7.5 and 7.6 will be considered in the special case of co monic matrix polynomials, in which case simpler representations are available. Recall that L(A.) is called comonic if L(O) = I. Clearly, such a polynomial is regular. Let (X F• JF) and (X 00 , J 00 ) be finite and infinite Jordan pairs, respectively, for the comonic polynomial L(A.) = I + 1 AJA.J. By Theorem 7.3, ([X F X 00 ] , JF EB J aJ is a decomposable pair of L(A.). Moreover, since L(O) = I, the matrix J F is nonsingular. Let J = J i 1 (f) J oo, and call the pair (X, J) = ([X F X 00 ] , Ji 1 EB J 00 ) a comonic Jordan pair of L(A.). Representations of a comonic matrix polynomial are conveniently expressed in terms of a comonic Jordan pair, as we shall see shortly. We start with some simple properties of the comonic Jordan pair (X, J). It follows from Theorem 7.3 and property (iii) in the definition of a decomposable pair of L(A.) that the nl x nl matrix col(X Jit;; b is nonsingular. Further,
D=
(7.53) where R is the comonic companion matrix of L(A.) (see Section 7.2). Indeed, (7.53) is equivalent to XJ 1 + A 1 X
+
A 1_ 1 XJ
+ ··· +
A 1 XJ 1 -
1
= 0,
which follows again from Theorem 7.3. In particular, (7.53) proves that J and R are similar. The following theorem is a more explicit version of Theorem 7. 7 applicable in the comonic case. Theorem 7.14. Let L(A.) be a comonic matrix polynomial of degree l, and let (X, J) be its comonic Jordan pair. Then the nl x nl matrix Y = col( X Ji)~:b is nonsingular and, if A.- 1 ~ a(J),
(7.54) Proof
Nonsingularity of Y has been observed already.
7.8. REPRESENTATIONS THEOREMS FOR COMONIC MATRIX POLYNOMIALS
207
where
z=
[ s~2 J-1 [0
[I EB J~ 1 ]
IF
... 0
in the notation of Theorem 7. 7. To prove (7.54), it remains to check the equality
[ -Ji 0
1
o][I oJ(s,_w 2 ]~' ~ ~ [Jiu- oJy- 1 [~j. 1
-I
0
Jl-1
0
0 I
00
>
Jl-1
:
00
•
(7.55)
I
From (7.26) we have
...
0
IF
~ -Ji~A [
Put A.
+I -J.OA
+
Js<~'.CdA)~'m
= 0 in the right-hand side to obtain
[-~F 1 ~IJ[s; 2 r 1 [0
...
0
IF=s,-_\[J
0 ...
oF.
(7.56)
(Here we use the assumption of comonicity, so that C L(O) is nonsingular.) But
s,_ 1 =
[L .. 1j y[JF~- 1 ) ~l 0
I
0
···
0
Inverting this formula and substituting in (7.56) yields (7.55).
D
Note that for any pair of matrices similar to (X, J) a representation similar to (7.54) holds. In particular, (7.54) holds, replacing X by [I 0 · · · OJ and J by the comonic companion matrix R of L(A.) (cf. (7.53)). An alternative way to derive (7.54) would be by reduction to monic polynomials. Let us outline this approach. The following observation is crucial:
208
7. SPECTRAL PROPERTIES AND REPRESENTATIONS
Theorem 7.15. Let L(.-1.) be a comonic matrix polynomial of degree 1, and let (X, J) be its comonic Jordan pair. Then (X, J) is a (right) standard pair for the monic matrix polynomial L(.-1.) = . 1. 1L(..1. - 1).
The proof of Theorem 7.15 can be obtained by repeating arguments from the proof of Theorem 7.3. We already know a resolvent representation for L(lc) = A1L().- 1)(Theorem 2.4):
L(.-1.)
= X(I.-1.-
J)- 1 Z,
. 1. ¢ (J(J),
where Z = [col(XJ;)l:Ar 1[0 · · · 0 IF. The last relation yields the biorthogonality conditions XJiz = 0 for j = 0, ... , 1- 2, XJ 1- 1 Z =I. Using these we see that for . 1. close enough to zero, L -1(..1.)
= r~[L(r1)]-1 = rl+1x(J- M)-1z = rl+1x(J + ..1.1 + ..1.212 + ···)Z = ri+1X(..1.1-1jl-1 + ,.1_1]1 + .. ·)Z = XJI-1(1 _ ..1.J)-1Z.
Consequently, (7.54) holds for . 1. close enough to zero, and, since both sides of (7.54) are rational matrix functions, (7.54) holds for all.-1. such that ..1.- 1 ¢ u(J). For a comonic matrix polynomial L(.-1.) of degree 1 representation (7.32) reduces to (7.57)
where (X, J) is a comonic Jordan pair of L(.-1.), and [V1 Vz] = [col(XJ;)l:Ar 1. The easiest way to verify (7.57) is by reduction to the monic polynomial . 1. 1L(..1.- 1) and the use of Theorem 7. 3. Again, (7.57) holds for any pair similar to (X, J). In particular, (7.57) holds for ([J 0 · · · 0], R) with the comonic companion matrix R of L(.-1.). It also follows that a comonic polynomial can be uniquely reconstructed from its comonic Jordan pair. 7.9. Comonic Polynomials from Finite Spectral Data We have seen already that a co monic matrix polynomial of given degree is uniquely defined by its comonic Jordan pair (formula (7.57)). On the other hand, the finite Jordan pair alone is not sufficient to determine uniquely the comonic polynomial. We describe here how to determine a comonic matrix polynomial by its finite Jordan pair in a way which may be described as canonical. To this end we use the construction of a special left inverse introduced in Section 6.3.
209
7.9. COMONIC POLYNOMIALS FROM FINITE SPECTRAL DATA
We shall need the following definition of s-indices and k-indices of an admissible pair (X, T) with nonsingular matrix T and ind(X, T) = I. Let k; = n + qz-;- 1 - q1_; for i = 0, ... , 1- 1, where q; =rank col(XTi)j:6 for i 2: 1 and q 0 = 0; defines; fori= 0, ... as the number of integers k 0 , k 1 , . . . , k1_ 1 which are larger than i. The numbers s0 , s 1 , ..• will be called s-ind ices of (X, T), and the numbers k 0 , k 1 , ... , k1_ 1 will be called k-ind ices of (X, T). Observe the following easily verified properties of the k-indices and s-indices: (1) (2) (3) (4)
n 2: k 0 2: k 1 2: · · · 2: k 1-
1;
so 2: s 1 2: ... ; si:::;; I fori = 0, 1, ... , and s; = 0 fori > n; "l-1
L.,i=O
ki
=
"n
L.,j=O Sj.
Now we can formulate the theorem which describes a canonical construction of the comonic polynomial via its finite spectral data. Theorem 7.16. Let (XF, JF) be an admissible pair with zero kernel and nonsingular Jordan matrix JF. The minimal degree of a comonic polynomial L(ll) such that (X F• J F) is its .finite Jordan pair, is equal to the index of stabilization ind(XF, JF). One such polynomial is given by the formula
L 0 (1l) = I - XFJi 1(V11l 1 +
· ·· +
l'(ll),
where V = [V1 , . . . , lf] is a special/eft inverse of col(XFJii)j;;;f>. The infinite part (X 00 , J 00 ) of a comonic Jordan pair for L 0 (1l) has the form X 00
= [x 0 0 · · · 0
J 00 = diag[J coO, J co 1'
X1
••· ,
0
··· 0
· · · xv
0
· · · 0],
J oov],
where J ooi is a nilpotent Jordan block of size si, and s 0 2: · · · 2: sv are the nonzero s-indices of(XF, Ji 1 ). Proof Let (X F, J F) be a finite Jordan pair of some comonic polynomial L(ll). Then (see Theorem 7.3) (X F• JF) is a restriction of a decomposable pair of L(ll). In view of property (iii) in the definition of a decomposable pair (Section 7.3), ind(XF, JF):::;; m, where m is the degree of L(ll). So the first assertion of the theorem follows. We now prove that L 0 (1l) has (XF, JF) as its finite Jordan pair. Let L(ll) = ll1L 0 ( r 1 ) = Ill 1 - XFJi 1 (V1 + ... + l'(ll 1- 1 ).ByTheorem6.5,(XF, Ji 1 ) is a part of a standard pair (X, f) for [(ll). Further, Corollary 6.6 ensures that (XF, Ji 1 ) is similar to (XIAt, fiAt), where Jt is the maximal f-invariant subspace such that fiAt is invertible. By Theorem 7.15, (XIAt, fiAt) is similar to the part (XF, 1.; 1 ) of a comonic Jordan pair ((XF, X 00 ), J.; 1 E9 ] 00 ) of L 0 (1l). So in fact (XF, JF) is a finite Jordan pair for L 0 (1l).
210
7.
SPECTRAL PROPERTIES AND REPRESENTATIONS
In order to study the structure of (X oo, J 00 ), it is convenient to consider the companion matrix 0 0
I
0
0
I
0 0
0 XFJi 1 V1
0 XFJi 1 Vz
0
I
C=
XFJi 1 V,
of L(A.). Let "'f/ 1 => · · · ::::> "'f/ 1 be the subs paces taken from the definition of the special left inverse. The subspace "'f/ = col( "'f!)}= 1 c fl" 1 is C-invariant, because C"'ff = col(1ri + 1 )f= 1 c "'f/ (by definition "'f/ 1+ 1 = {0}). The relation C [col(XFJi i)~:;~] = [col(XFJi i)~:;~] · Ji 1 shows that the subspace .P = Im col(XFJi i)J:6 is also C-invariant. It is clear that CI"W" is nilpotent, and CI.P is invertible. Let k 0 ~ k 1 ~ · · · ~ k1_ 1 be the k-indices of (XF, Ji 1 ). Let z~l, ... , 4?-,-k; be a basis in "'ff; modulo "'f!;+ 1 , for i = 1, ... , I (by definition k 1 = 0). Then it is easy to see that for j = 1, ... , k;_ 1 - k;, col(£5 1 Pz~il)~= 1 , col(£5 2 Pzy>)~= 1 , ••. , col(b;pzyl)~= 1 is a Jordan chain of C corresponding to the eigenvalue 0, oflength i. Taking into account the definition of the s-indices and Theorem 6.5, we see that (X oo, J 00 ) has the structure described in the theorem. 0 By the way, Theorem 7.16 implies the following important fact. Corollary 7.17. Let (X, T) be an admissible pair with zero kernel and nonsingular Jordan matrix T. Then there exists a comonic matrix polynomial such that (X, T) is its .finite Jordan pair.
In other words, every admissible pair (subject to natural restrictions) can play the role of a finite Jordan pair for some comonic matrix polynomial. The following corollary gives a description of all the comonic matrix polynomials for which (XF, JF) is a finite Jordan pair. Corollary 7.18. A comonic matrix polynomial L(A.) has (XF, JF) as its finite Jordan pair if and only if L(A.) admits the representation
L(A.) = U(A.)L 0 (A.), where L 0 (A.) = I - XFJi 1(V1 A. 1 + · · · + V,A.), [V1 V,] is a special left inverse of col(XFJii)~= 1 , and U(A.) is a comonic matrix polynomial with constant determinant. This is an immediate corollary of Theorems 7.13 and 7.16.
7.10.
DESCRIPTION OF DIVISORS VIA INVARIANT SUBSPACES
211
We have now solved the problem of reconstructing a comonic polynomial, given its finite Jordan pair (XF, JF). Clearly, there are many comonic polynomials of minimal degree with the same (X F• JF), which differ by their spectral data at infinity. Now we shall give a description of all possible Jordan structures at infinity for such polynomials. Theorem 7.19. Let (XF, JF) be an admissible pair of order rand zero kernel with nonsingu/ar Jordan matrix JF. Let I= ind(XF, JF). If(XF, JF) is a finite Jordan pair ofa comonic matrix polynomial L(A.) ofdegree/, and (X 00 , J ac,) is its infinite Jordan pair, then the sizes p 1 ~ p2 ~ • • • ~ Pv of the Jordan bl~cks in J 00 satisfy the conditions j
LP; ~ i= 1
j
I,s;-1
for
j = 1,2, ... , v,
(7.58)
i= 1 v
LP; = nl- r,
(7.59)
i= 1
where s0 , s 1 , . . . , are the s-indices of (X F, J i 1 ). Conversely, iffor some positive integers p 1 ~ p2 ~ · • • ~ Pv, the conditions (7.58), (7.59) are satisfied, then there exists a comonic matrix polynomial L(A.) of degree l such that L(A.) has a comonic Jordan pair of the form ((X FX 00 ), Ji 1 EB J 00 ), where J 00 is a nilpotent Jordan matrix with Jordan blocks of sizes P1, · · ·, Pv· The proof of Theorem 7.19 follows easily from Theorem 6.7. 7.10. Description of Divisors via Invariant Subspaces
The results of the preceding section allow us to give the following description of (right) co monic divisors of a given co monic matrix polynomial L(A.) in terms of its finite Jordan pair (X F• JF). Theorem 7.20. Let A be a JF-invariant subspace, and let m = ind(XFI.At' Ji1 ~). Then a comonic polynomial
(7.60)
where [V1 VmJ is a special left inverse of col((X FI.At)(J FIAt)- i+ 1)J'= 1, is a right divisor of L(A.). Conversely,for every co monic right divisor L 1 (A.) of L(A.) there exists a unique JF-invariant subspace A such that L 1 (A.) = U(A.)L.At(A.), where U(A.) is a matrix polynomial with det U(A.) 1 and U(O) = I.
=
Proof From Theorems 7.13 and 7.16 it follows that L.At(A.) is a right divisor of L(A.) for any JF-invariant subspace .A.
212
7. SPECTRAL PROPERTIES AND REPRESENTATIONS
Conversely, let L 1 (il) be a comonic right divisor of L(il) with finite Jordan pair (X 1 ,F, J 1 ,F). By Theorem 7.13, (X 1 ,F, J 1 ,F) is a restriction of(XF, JF); so (X 1 , F• J 1 , F) is a finite Jordan pair of L.At(il), where L.At(il) is given by (7.60), and A is the unique JF-invariant subspace such that (X 1 ,F• J 1 ,F) is similar to (XFI.At' JFI.At). By the same Theorem 7.13, L 1 (il) = U(il)L.At(il), where U(il) is a matrix polynomial without eigenvalues. Moreover, since L 1 (0) = L.At(O) = I, also U(O) = I, and det U(il) 1. D
=
If A is the J F-in variant subspace, which corresponds to the right divisor L 1(il) of L(il) as in Theorem 7.20, we shall say that L 1(il) is generated by A, and A is the supporting subspace of L 1 (il) (with respect to the fixed finite Jordan pair (XF, JF) of L(il)). Theorem 7.20 can be regarded as a generalization of Theorem 3.12 for nonmonic matrix polynomials. ExAMPLE
7.3.
To illustrate the theory consider the following example. Let L(il.)
=
[(il +
1)3
0
il. (il. - 1) 2
J.
Then a finite Jordan pair (XF, JF) of L(il.) is given by 1 0 0 1 OJ XF = [ 0 0 0 -8 -4 ' Then
Ji 1
= [
-1 -1 -1] 0-1 -1 0 0 -1
81[~ -~]
We compute the divisors generated by the Ji 1 -invariant subspace .A spanned by the vectors (1 0 0 0 O)T, (0 1 0 0 O)T, (0 0 0 1 O)T. We have
Ji 1 l.At
[-1 -1 OJ
0 -1 0 . 0 0 1
=
The columns of
00 -81] -1
1
0 -8
il
are independent, and a special left inverse is [V1
1 0 0 0 -1 -j .
V2 ] = [ -1 0
0
0
-i
7.11.
213
USE OF A SPECIAL GENERALIZED INVERSE
Therefore,
and the general form of a right comonic divisor generated by .A, is
U(A.).
[
(), + 1)
~I A
0
i.
l '
where U(A.) is matrix polynomial with U(O) = I and det U(A.)
=1.
0
7.11. Construction of a Comonic Matrix Polynomial via a Special Generalized Inverse
In Section 7.9 we have already considered the construction of a comonic matrix polynomial L(A.) given its finite Jordan pair (XF, JF). The spectral data in-(XF, JF) are "well organized" in the sense that for every eigenvalue A. 0 of L(A.) the part of (XF, JF) corresponding to A. 0 consists of a canonical set of Jordan chains of L(A.) corresponding to A. 0 and Jordan blocks with eigenvalue A. 0 whose sizes match the lengths of the Jordan chains in the canonical set. However, it is desirable to know how to construct a comonic matrix polynomial L(A.) if its finite spectral data are" badly organized," i.e., it is not known in advance if they form a canonical set of Jordan chains for each eigenvalue. In this section we shall solve this problem. The solution is based on the notion of a special generalized inverse. We start with some auxiliary considerations. Let (X, T) be an admissible pair of order p, and let A = col(XTi-l )i= 1 , where s = ind(X, T). Let A1 be a generalized inverse of A (see Chapter S3). It is clear that A1 can be written as A1 = [V1 V,], where V1 , V2 , ••• , V, are n x n matrices. Put (7.61)
Then for every projector P of ([P along Ker(X, T) (recall that Ker(X, T) = {x E ([PIXT;x = 0, i = 0, 1, ... }) there exists a generalized inverse A1 of A such that a standard pair of [(A.) is an extension of the pair (X limP• PT limP).
(7.62)
More exactly, the following lemma holds. Lemma 7.21. For a projector P of([P along Ker(X, T), let A1 be a generalized inverse of A = co1(XT;- 1 )i= 1 (s = ind(X, T)) such that A 1A = P.
7.
214
SPECTRAL PROPERTIES AND REPRESENTATIONS
Let Q = [I 0 · · · OJ and let C be the first companion matrix of the monic matrix polynomial (7.61) (so (Q, C) is a standard pair for l(A.)). Then the space Im A is invariant for C and the pair (Q limA, C limA) is similar to (X limP•
PTIImP). Proof
We have 0 0
I
0 XT'V1
0 XT'V2
0 0
I
C=
As
I
XPV.
= A, we have XT;= XT;- for i = 1, ... , s. So CA = AT A 1A. Since A 1A = P, the equalities A 1AP = P and AP = A hold. So CA = APT P, and hence AA 1A
1 A 1A
1
C(A limP) = (A IImP)(PT hmP). Note that S =A limP has a zero kernel (as a linear transformation from ImP to ~n•). From what we proved above we see that CS = S(PT hmP). Further
[I
0 ··· OJS =[I 0 · ··
OJAIImP =X limP•
It follows that .A = Im S = Im A is a C-invariant subspace, and ([I
0 · · · OJ IAt, CIA{)
is similar to (X limP• PT limP). D For the purpose of further reference let us state the following proposition. Proposition 7.22. Let (X, T) be an admissible pair of order p, and suppose that T is nonsingular. Then, in the notation of Lemma 7.21, the restriction ChmA is nonsingular. Proof In view of Lemma 7.21, it is sufficient to prove that PT limP is invertible, where Pis a projector of ~P along Ker col(XT;)i;: J. Write X and T in matrix form with respect to the decomposition ~P =ImP+ Ker P:
X=[X 1
OJ,
_[T
T-
11
T21
T12] . T22
12
For example, T11 = PT hmP· Let us prove that T = 0. Indeed, the matrix col(XT;)i;: bhas the following matrix form with respect to the decomposition ~P = Im P Ker P:
+
(7.63)
7.11.
USE OF A SPECIAL GENERALIZED INVERSE
215
Since s = ind(X, T), Ker co1(XTi):=o = Ker col(XTi)::J, and therefore XP = [X 1T~ 1 0]. Taking into account (7.63) we obtain that X 1T{ 1 · T12 = 0 for j = 0, ... , s - 1, and in view ofthe left invertibility of col(X 1T L)j: b, it follows that T12 = 0. Now clearly the nonsingularity of T implies that T11 = PT limP is also invertible. D So if Tin the admissbile pair (X, T) is nonsingular, then by Proposition 7.22, C limA is invertible too. But Im A is not necessarily the largest C-invariant subspace A of q;ns such that C IAt is invertible. To get this maximal subspace we have to make a special choice for AI. Recall that if 'if' 1, ... , 'if'. are subspaces in q;n, then col( 'if')}= 1 will denote the subspace of q;ns consisting of all vectors x = col(x)}= 1 with xi E 'if'i for j = 1, 2, ... , s. As before, let (X, T) be an admissible pair of order p, let s = ind(X, T) and assume that Tis nonsingular. Then there exist linear subspaces 'if'. c 'if'5 _ 1 c · · · c 'if' 1 c q;n such that (7.64)
For the case in which Ker(X, T) = {0}, formula (7.64) follows from Lemma 6.4. By replacing (X, T) by the pair (7.62) the general case may be reduced to the case Ker(X, T) = {0}, because lm col(XTi- 1):= 1 = lm[col((XIImP)(PTIImP)i- 1):=1]. We call AI a special generalized inverse of A = col(XTi- 1):= 1 whenever AI is a generalized inverse of A and Ker AI= col('if')}= 1 for some choice of subspaces 'if'. c · · · c 'if' 1 such that (7.64) holds. If A is left invertible, and AI is a special left inverse of A, as defined in Section 6.3, then AI is also a special generalized inverse of A. The following result is a refinement of Lemma 7.21 for the case of special generalized inverse. Lemma 7.23. Let (X, T) be an admissible pair oforder p, lets = ind(X, T) and assume that T is nonsingular. Let AI be a special generalized inverse of A= col(XTi- 1):= 1, and write AI= [V1 V2 · • • V.J. Let (Q, C) be the standard pair with the first companion matrix C of
Then C is completely reduced by the subspaces Im A and Ker AI, the pairs (7.65)
are similar, and
(i) (ii)
CIImA is invertible, (CIKerA')s = 0.
216
7. SPECTRAL PROPERTIES AND REPRESENTATIONS
Proof The similarity of the pairs (7.65) has been established in the proof of Lemma 7.21. There it has also been shown that Im A is C-invariant. As Tis nonsingular, the same is true for AIATIImA 1A
(cf. Proposition 7.22). But then, because of similarity, statement (i) must be true. The fact that Ker A 1 is C-invariant and statement (ii) follow from the special form of Ker AI (cf. formula (7.64)). D The next theorem gives the solution of the problem of construction of a comonic matrix polynomial given its "badly organized" data. Theorem 7.24. Let (X, T) be an admissible pair of order p, let s = ind(X, T), and assume that Tis nonsingular. Let AI be a special generalized inverse of A = col(XT 1 -i)i= 1 .
Write AI = [W1 W2
•••
W.J, and put
L(A.) = I - XT-s(W.A. Let P
= A I A.
+ · · · + WI A.•).
Then the finite Jordan pair of L(A.) is similar to the pair (X limP, PTIJmp).
(7.66)
Further, the minimal possible degree of a comonic n x n matrix polynomial with a .finite Jordan pair similar to the pair (7.66) is precisely s. Proof
Note that ind(X, T- 1 ) = ind(X, T) = s. Put L(A.) =
u· -
xT-s(wl
+ ... + W.A.s-1).
(7.67)
From the previous lemma it is clear that the pair (7.68) is similar to the pair (Q limA, C limA), where (Q, C) is the standard pair with the companion matrix C of the polynomial (7.67). Let (XL, J L) be a finite Jordan pair of L(A.), and let (X oo, J 00 ) be a Jordan pair of L(A.) corresponding to the eigenvalue 0. By Theorem 7.15 the pair ([XL X 00 ] , Ji, 1 EB J 00 ) is similar to (Q, C). From the previous lemma we know that
where CIImA is invertible and CIKer A' is nilpotent. It follows that the pair (XL, Ji, 1 ) is similar to the pair (QIImA, CIImA). But then we have proved that the finite Jordan pair of L(A.) is similar to the pair (7.66).
7.11.
USE OF A SPECIAL GENERALIZED INVERSE
217
Now, let (X B' J B) be a finite Jordan pair of some comonic n x n matrix polynomial B(A.), and assume that (X B, J B) is similar to the pair (7.66). Since the index of stabilization of the pair (7.66) is equal to s again, we see that ind(XB, J B) = s. Let B(A.) = A.rnB(A. - 1), where m is the degree of B(A.). Then (XB, J"ii 1 ) is a restriction of a standard pair of B(A.) (cf. Theorem 7.15). It follows that s = ind(XB, JB) = ind(XB, 18 1 )::;; m. This completes the proof of the theorem. D In conclusion we remark that one can avoid the requirement that T be invertible in Theorem 7.24. To this end replace T by T -a!, where a¢ a(T), and shift the argument A. -+ A. - a. The details are left to the reader. Comments
The presentation in Section 7.1 follows the paper [37a]; the contents of Sections 7.2-7.6 are based on [14], and that of Section 7.7 (including Lemma 7.13) on [29a]. The main results of Section 7.7 also appear in [37a]. Another version of division of matrix polynomials appears in [37b]. For results and discussion concerning s-indices and k-indices, see [37c, 35b]. The results of Sections 7.8-7.10 are based on [37a, 37b], and those of Section 7.11 appear in [29a]. Direct connections between a matrix polynomial and its linearization are obtained in [79a, 79b].
Chapter 8
Applications to Differential and Difference Equations
In this chapter we shall consider systems of lth-order linear differential equations with constant coefficients, as well as systems of linear difference equations with constant coefficients. In each case, there will be a characteristic matrix polynomial of degree l with determinant not identically zero. The results presented here can be viewed as extensions of some results in Chapters 2 and 3, in which the characteristic matrix polynomial is monic. It is found that, in general, the system of differential equations corresponding to a nonmonic matrix polynomial cannot be solved for every continuous righthand part. This peculiarity can be attributed to the non void spectral structure at infinity, which is always present for nonmonic matrix polynomials (see Chapter 7). One way to ensure the existence of a solution of the differential equation for every continuous right-hand side is to consider the equation in the context of distributions. We shall not pursue this approach here, referring the reader to the paper [29a] for a detailed exposition. Instead we shall stick to the "classical" approach, imposing differentiability conditions on the righthand side (so that the differentiation which occurs in the course of solution can always be performed). In the last section of this chapter we solve the problem of reconstructing a differential or difference equation when the solutions are given for the homogeneous case. 218
8.1.
219
DIFFERENTIAL EQUATIONS IN THE NONMONIC CASE
8.1. Differential Equations in the Nonmonic Case
Let L(..t) = D~o Aj).j be a regular matrix polynomial. We now consider the corresponding system of lth-order differential equations d) u L ( -d t
1
d u + · · · + A d u = [, = A 0 u + A 1 -d 11 t dt .
(8.1)
where f is a
(8.2)
which corresponds to the matrix polynomial (8.3) Clearly, the system (8.3) has a continuous solution if and only if f 2 is /-times continuously differentiable, and in that case the solution u 1 = f 1 - f~l, u2 = f 2 is unique. To formulate the main theorem on solutions of (8.1) it is convenient to introduce the notion of a resolvent triple of L(..t), as follows. Let (X, T) = ([X 1 X 2 ], T1 E8 T2 ) be a decomposable pair of L(). ). Then there exists a matrix Z such that L -t(,.t)
=
XT(..t)- 1 Z,
where T().) =(I).- T1 ) E8 (T2 ) . - I) (see Theorem 7.7). The triple (X, T, Z) will be called a resolvent triple of L(..t). Theorem 8.1. Let L().) be a matrix polynomial ofdegree l, and let (X, T, Z) be a resolvent triple for L(..t), where X = [X 1 X 2 ], T = T1 EB T2 , and let Z = [~~] be the corresponding partition of Z. Assume that the size of T1 is m x m, where m = degree(det L(..t)), and that TJ. = Ofor some positive integer v. Then for every (v + l - !)-times continuously differentiable
= E
X 1 e 1 ''x
+
{x
1 er,u-slzJ(s)ds-
:t~X 2 T~Z 2 .f
(8.4)
220
8. APPLICATIONS TO DIFFERENTIAL AND DIFFERENCE EQUATIONS
Proof We check first that formula (8.4) does indeed produce a solution of (8.1). We have (for j = 0, 1, ... , l)
uU>(t) = X 1T{eT''x
+
j-1
L X 1 Tt 1 -kZd
So l
l
l
LA ju(t) = L A jx 1T{ eT''x j=O
j=O
+
L j= 1
J' [.± to
v- 1
-
+
j-1
L A jx 1Tt 1-kz J
AjX 1T{]eT,(t-slz 1j(s) ds
1=0
l
L L AjX 2 T~Z 2 j(t). i=O j=O
Using the property L~=o AjX 1 T{
= 0 and rearranging terms, we have
(As usual, the sum L~=k+ 1 is assumed to be empty if k definition of a resolvent triple,
L -1(A.)
=
[X1
Xz]
[(H-0 1
T)- 1
+1>
l.) By the
0 J[Zz:J'
(TzA.- J)-1
and for IA. I large enough,
Z
Z
L - 1(A.) = -X 2 T~- 1 Z 2 A.v- 1 - · · · - X 2 T2 Z 2 A.- X 2 2 + X 1 1r 1 + X 1T1Z 1A.- 2 + · · · . (8.6) So the coefficient of J
8.1. DIFFERENTIAL EQUATIONS IN THE NONMONIC CASE
221
it is sufficient to check that (8.8)
is the general solution of (8.7). Since the set of all solutions of (8. 7) is an m-dimensional linear space (Theorem Sl.6), we have only to check that u0 (t) = 0 is possible only if x = 0. In fact, u0 (t) 0 implies u~>(t) =X 1 T~er 11 X = 0 fori = 1, ... , l- 1. So col(X 1 T~)l:5er 11X = 0, and since the columns of col(X 1 TDl:5 are linearly independent (refer to the definition of a decomposable pair in Section 7.3), we obtain er 11 X = 0 and x = 0. D
=
The following theorem is a more detailed version of Theorem 8.1 for the case of comonic matrix polynomials. Theorem 8.2. Let L(A.) be a comonic matrix polynomial of degree l, and let (X F• JF) and (X 00 , J cxJ be its finite and infinite Jordan pairs, respectively. Define Z to be the nl x n matrix given by
[col(XJ;- 1 )l= 1
r
1
= [* · · · * Z],
where X = [X F X 00 ] , J = Ji 1 tf) J 00 • Make a partition Z = [~:J corresponding to the partition X= [XF X ool Assume that the q;n_valued function f(t) is v-times continuously differentiable, where v is the least nonnegative integer such that J"oo = 0. Then the general solution of (8.1) is an 1-times continuously differentiable function of the form v-1
u(t) = XFeJFtx
+ LX
00
J',;; 1+kZ 00 j
k=O
- XFJi
(8.9)
where x is an arbitrary vector in q;m, m = deg det L(A.). Proof By Theorem 7.3, ([X F X ooJ, J F tf) J 00) is a decomposable pair of L(A.). Using formula (7.55) it is easily seen that 2 0 ) ( X,J, - [Ji(l0 Jl-1
Jz )
00
is a resolvent triple of L(A.). It remains to apply Theorem 8.1 to obtain (8.9). D Observe that according to (8.9) it is sufficient to require v-times differentiability off(A.) in Theorem 8.2 (instead of( v + l - 1)-times differentiability in Theorem 8.1).
222
8.
APPLICATIONS TO DIFFERENTIAL AND DIFFERENCE EQUATIONS
If v = 0, i.e., if the leading coefficient of L(A.) is invertible, then Theorem 2.9 shows that (8.1) is solvable for any continuous ~n-valued function f. Note that v = 0 implies that deg det L(A.) = nl, and the general solution of (8.1) (assuming comonicity of L(A.)) is given by (8.9)
u(t) = XFeJptx - XFJ;
(8.10)
where x is an arbitrary vector in ~nt. In the particular case that the leading coefficient of L(A.) is equal to the identity, formula (8.10) is the same as formula (2.35) with X = XF, T = JF, because then
-1; 1 A.)- 1 ZF
L(A.)- 1 = XF(IA.- JF)- 1 ¥ = XFJF(l-1)(1 (cf. Theorem 2.6), and thus
IX
2 0.
(8.11)
To illustrate the usefulness of the above approach, we shall write out the general solution of the matrix differential equation
Au+ Bu
=
f
(8.12)
in another closed form in which the coefficient matrices appear explicity. We assume that A and B are n x n matrices such that det(AA. + B) does not vanish identically. Choose a e ~ such that aA + B is nonsingular, and let r 0 and r 1 be two positively oriented closed contours around the zeros of det(AA. + B) such that a is inside r 0 and outside r 1 . Then the general solution of Au + Bu = f may be written as (8.13)
where
u 1(t)
= [ - -21 .
uz(t) =
Lo1 [ -21m.
1'-
k=
u3(t) =
i
-,etA. -
m r111.- a
i(
(AA.
1-, ro 11. - a
+ B)- 1 dA.
)k+
1
(A.A
Jx
0,
+ B)- 1 dA.
J(-a+ -dd )kf(t), t
2~i L!(AA. + B)-1(J:e
Here x 0 is an arbitrary vector in that
r(
~nand J1.
1 Jro A. - a is the zero matrix.
is the least nonnegative integer such
)I'+ (AA. + B)-1 dA. 1
8.1.
Ail
223
DIFFERENTIAL EQUATIONS IN THE NONMONIC CASE
Let us deduce the formula (8.13). Let L(il) = Ail + B and let L(il) = + aA + B = L(il + a). Consider the differential equation (aA
+ B)- 1r(:t)v =g.
(8.14)
We apply Theorem 8.2 to find a general solution of (8.14). A comonic Jordan pair X = [XF X 00 ] , J = Ji 1 EB J oo of L(il) is given by the formula (X, J) = (I, -(aA + B)- 1 A), and where
A
0
= Im
J (Iil -
J)- 1 dil,
&o
where
A
1
= Im
J (Iil -
J)- 1 dil.
dt
Here d 0 and d 1 are closed rectifiable contours such that (d 0 u d 1 ) n a'(J) = 0; zero is the only eigenvalue of J lying inside d 0 , and d 1 contains in its interior all but the zero eigenvalue of J. The matrix Z from Theorem 8.2 is just the identity. Write formula (8.9):
v(t)
= v1(t) + v2(t) + v3 (t),
where v 1 (t)
= XFe 1 1x = e<JI.Jt,>- 11x F
=
(2~i {,e;.-' [/il + (aA + B)1
1
Ar 1 dil
)x,
1'-1
vit) =
LX
00
1"cx,Z 00 g
k=O
where fl is the least nonnegative integer such that
J~ =
0;
vit) = -XFJF J:eJp(t-rlzFg(-r) d-r
= -
~ J il- 1 [/il + (aA + B)- 1 Ar 1 2n,
dt
f'e;.-'
Ja
Observe now that the contours d 0 and d 1 , while satisfying the above requirements,canbechosenalsoinsuchawaythatr 0 = - { r 1 + ajilEd 0 }, r 1 = {r 1 + a Iil E dd (the minus in the formula for r 0 denotes the opposite
8.
224
APPLICATIONS TO DIFFERENTIAL AND DIFFERENCE EQUATIONS
orientation). Make the change of variable in the above formulas for v 1 (t), v 2 (t), v 3 (t): A. = (JJ. - a)- 1 . A calculation gives (note that [I A.+ (aA
+ B)- 1 Ar 1
= (JJ. - a)(JJ.A
v 1 (t) = (- -21 . ( e
k~o
v 3 (t)
=
~ 2m
1CI
(
Jr,
f
+ B)- 1 (aA
+B))
+ B)- 1 ~)(aA + B)x,
(JJ. - a)-k- 1 (AJJ.
Jl - a
+ B)- 1 dJJ.)(aA + B)g
ro
(AJJ.
+ B)- 1
(e
Ja
+ B)g('r) dr df.l.
Finally, the general solution u(t) of(8.12) is given by the formula u(t) = ea1v(t), where v(t) is a general solution of (8.14) with g(t) = (aA + B)- 1 e-at_{(t). Using the above formulas for v(t), we obtain the desired formula (8.13) for u(t). These arguments can also be used to express formulas (8.9) in terms of the coefficients of L(A.) = I + L)~ 1 A jA.j. To see this, let
be the co monic companion matrix of L(A.). Further, put Q = row(<5 1 j/))~ 1 and Y = col(Dlj/))~ 1 . Take P 0 to be the Riesz projector of R corresponding to the eigenvalue 0, i.e., P 0 = -21 . ( (IA.- R)- 1 dA., 1CI
(8.15)
Jro
where r 0 is a suitable contour around zero separating 0 from the other eigenvalues of R. Finally, let P =I- P 0 • Then the general solution of(8.1) (with A 0 = I) is equal to
where
Il-l
uz(t) =
L QP
k~O
0
R 1- 1 +kP 0 Yj
8.2.
225
DIFFERENCE EQUATIONS IN THE NONMONIC CASE
Here z0 is an arbitrary vector in ImP and J1 is the least nonnegative integer such that P 0 RilP 0 = 0. 8.2. Difference Equations in the Nonmonic Case
We move on now to the study of the corresponding finite difference equations. Theorem 8.3. Let L(A.) and (X, T, Z) be as in Theorem 8.1. Let {.f,.}~ 0 be a sequence ofvectors in fl". Then the general solution ofa finite difference equation
Aour + A1Ur+1 + · · · + Alur+l =f.., where L(A.) =
(8.16)
r?: 0,
D=o AjA.j, is given by the formulas v-1
u0 = X 1 x-
LX
i=O
2
T~Z 2 /;,
v-1
r-1
u, = X 1T;x- LX 2 T~Z 2 /;+r i=O where x E
flm
+
LX 1 T~-j- 1 Z 1 fj,
r ?: 1, (8.17)
j=O
is arbitrary and m is the degree of det L(A.).
Proof It is easy to see (using Theorem Sl.8) that the general solution of the corresponding homogeneous finite difference equation is given by r = 0, 1, ....
u, = X 1 T;x,
Hence it suffices to show that the sequence {q>,} ~ 0 given by v- 1
= - LXzT~Zzh+r
i=O
r-1
+ LX1T;-j- 1Z1fj, j=O
r ?: 1,
is a solution of (8.16). Indeed,
l r+k-1 l v-1 L AkX 1T;+k-j- 1Z 1Jj- L LAkX 2 T~Z 2 fr+k+i k=O j=O k=O i=O
= L
(8.18)
226
8.
APPLICATIONS TO DIFFERENTIAL AND DIFFERENCE EQUATIONS
where it is assumed Ti = 0, T~ = 0 for p < 0. Using (8.6), it is seen that the coefficient of jj in the right-hand side of (8.18) is just the coefficient of .A/-• in the product L(),)L- 1 (),), which is equal to I for j = r, and zero otherwise. Hence the right-hand side of (8.18) is just f.. 0 Again, a simplified version of Theorem 8.3 for co monic polynomials can be written down using finite and infinite Jordan pairs for the construction of a decomposable pair of L(A). In this case the general solution of (8.16) (with A 0 = I) has the form v-1
u 0 = Xpx
U,
=
+
r X F J pX
L X 00 J~
j=O
+
v-1 , X
£....,
j=O
1
+iz 00 jj, r-1
oo 1 1oo 1 + iz oo f j+r - ,£...., X F J F-1 + r - iz F f i'
r
~
1,
j=O
(8.19) where (X F, J F, Zp), (X oo, J oo, Z 00 ) are as in Theorem 8.2. As in the case of a differential equation, the general solution (8.19) can be expressed in terms of the co monic companion matrix R of L(A), and the Riesz projector P 0 given by (8.15). We leave it to the reader to obtain these formulas. The following similarity properties may be used for this purpose:
where R is the comonic companion matrix of L(),), and S = col(XFJii, X ooJ:X,)l:6.
Theorem 8.3 allows us to solve various problems concerning the behavior of the solution (u,):'=o of (8.16) provided the behavior of the data (/.):'= 0 is known. The following corollary presents one result of this type (problems of this kind are important for stability theory of difference schemes for approximate solution of partial differential equations). We say the sequence (/,):'= 0 is bounded if supr=o. 1 •... llf,ll < oo. Corollary 8.4. If the spectrum of L(),) is inside the unit circle, then all solutions (u,):'=o of(8.16) are bounded provided (/.):'= 0 is bounded. Proof Use the decomposable pair ([XF X 00 ] , JF Ei1 J 00 ) in Theorem 8.3, where (Xp, Jp) and (X 00 , J 00 ) are finite and infinite Jordan pairs of L(),), respectively. Since the eigenvalues of JF are all inside the unit circle, we have
8.3. CONSTRUCTION OF DIFFERENTIAL AND DIFFERENCE EQUATIONS
227
Ill~ II ::;; Kpi, i = 0, 1, ... , where 0 < p < 1, and K > 0 is a fixed constant. Then (8.17) gives (with X 1 = XF, T1 = JF, X2 = X 00 , T2 = 1 00 ):
llu,ll ::;; IIXFIIKp'llxll
+ C~>X ooJ~Z2II) s~pllfrll r-1
+ L IIXFIIKpr-j- 1 IIZ 1 II
supllf,.ll,
j;Q
and the right-hand side is bounded uniformly in r provided sup, llf,.ll < oo.
D 8.3. Construction of Differential and Difference Equations with Given Solutions Consider the €n-vector functions cpj(t) = pj(t)e'-l, j = 1, ... , r, where pj(t) are polynomials in t whose coefficients are vectors in €n. In this section we show how one can find an n x n matrix polynomial L(A.) with a finite number of eigenvalues, such that the functions cp 1 , . . . , cp, are solutions ofthe equation
L(:t)cp
(8.20)
= 0.
Without any additional restriction, this problem is easy to solve. At first consider the scalar case (n = 1). Then we may choose r
L(A.) =
TI (A. -
A.)'i+ 1'
j; 1
where si is the degree of the polynomial pj(t). In the general case (n > 1), one first chooses scalar polynomials L 1(A.), ... , Ln(A.) such that for each j the kth coordinate function of cpi is a solution of Lk(d/dt)cp = 0. Next, put L(A.) = L 1 (A.) Ef) · · · Ef) Ln(A.). Then the functions cp 1 , ..• , cp, are solutions of L(djdt)cp = 0. But with this choice of L(A.), Eq. (8.20) may have solutions which are not linear combinations of the vector functions cp 1, ... , cp, and their derivatives. So we want to construct the polynomial L(A.) in such a way that the solution space ofEq. (8.20) is precisely equal to the space spanned by the functions cp 1 , ••. , cp, and their derivatives. This" inverse" problem is solved by Theorem 8.5 below. In the solution of this problem we use the notions and results of Section 7.11. In particular, the special generalized inverse will play an important role. Theorem 8.5. Let cpj(t) = €n, 0 ::;; k ::;; si, 1 ::;; j ::;; r. Put
Q);o Piktk)e'-i where each Pik is a vector 1
X = row(row((si - k)! Pi. srk)ki; o)}; 1•
in
228
8. APPLICATIONS TO DIFFERENTIAL AND DIFFERENCE EQUATIONS
and let J = J 1 EE> • • • EE> J, where J; is the Jordan block of order si + 1 with eigenvalue A;. Let IX be some complex number different from A1 , .•. , A, and define L(A) = I - X(J - 1X/)- 1{(A - 1X)V1 + · · · +(A- 1XYVd, where l = ind(X, J) and [V1 col(X (J - lXI) 1 - i)l = 1 . Then cp 1 ,
•.. ,
(8.21)
l-';] is a special generalized inverse of cp, are solutions of the equation
(8.22)
and every solution of this equation is a linear combination of cp 1 , .•. , cp, and their derivatives. Further, lis the minimal possible degree of any n x n matrix polynomial with this property. Proof
[
Notethatl = ind(X, J) = ind(X, J- IXI);becauseoftheidentity
I 1X(6)J
0 (~)I
X(J - lXI)
0
0 ][ IX!- 1(:1 (/)I
G=:DI
IX!- 2(:/ll )J
X
X(J _: 1XI)1-
l[ l XJ X
1
=
x)-l .
It follows that without loss of generality we may suppose that IX = 0, and let (X L• J L) be a finite Jordan pair of L(IX). Then the general solution ofEq. (8.22) is equal to X Le'hx, where X is an arbitrary vector of em, m = degree(det L(l,)) (see Theorem 8.1). By Theorem 7.24 the pair (XL, h) is similar to the pair
(XhmP• PJIJmp), where Pis a projector of fl", f.1 = LJ= 1 (si + 1), with Ker P = Ker(X, J). It follows that the general solution of (8.22) is of the form X e'1 z, where z is an arbitrary vector in fl". But this means that the solution space of (8.22) is equal to the space spanned by the function cp 1 , . . . , cp, and their derivatives. Let B(A) be an n x n matrix polynomial with the property that the solution space of B(djdt)cp = 0 is the space spanned by cp 1 , . . . , cp, and their derivatives. Then B(A) can have only a finite number of eigenvalues, and the finite Jordan pairs of B(A) and L(A) are similar. But then we can apply Theorem 7.24 to show that degree B(A) ::2: degree L(A) = l. So lis the minimal possible degree. D Now consider the analogous problem for difference equations: Let L(A) be an n x n regular matrix polynomial and consider the finite difference equation
L(C)u; = 0,
i = 1, 2, ... '
(8.23)
8.3.
CONSTRUCTION OF DIFFERENTIAL AND DIFFERENCE EQUATIONS
229
where lffu; = u;+ 1 and u 1, u 2 , ••• , is a sequence of vectors in fln. Let x 0 , ... , xk_ 1 be a Jordan chain for L(A.) corresponding to the eigenvalue A. 0 . Put X 0 = [x 0 x 1 · · · xk_ 1], and let J 0 be the Jordan block of order k with A. 0 as an eigenvalue. Then for each z E ([k the sequence
is a solution of(8.23)(Theorem 8.3). By taking z = (0 with 1 in the qth place, we see that u,+1
=
I (r -
i=O
r
q
.) A.o-q+ixi,
+J
··· 0
1 0
r;?:O
· · · O)T
(8.24)
is a solution of(8.23). Moreover, each solution of (8.23) is a linear combination of solutions of the form (8.24). The next theorem solves the inverse problem; its proof is similar to the proof of Theorem 8.5 and therefore is omitted. Theorem 8.6.
Suppose we have given a finite number of sequences of the
form . q, u<'l = "L.., r+ 1
(
i=O
r
_
r q;
)
.
+ J. A.~-q,+JxJ".. 1
r
;?:
0,
where Xi; E fln, 0 :::;; j :::;; q;, 1 :::;; i :::;; p. Put X=
row(row(xi;)j~ 0 )f= 1
and let J = J 1 EB · · · EB J P' where 1; is the Jordan block of order q; + 1 with eigenvalue A;. Let oc be some complex number different from A. 1, ... , A.P, and define L(A.) =I- X(J- ocl)- 1{(A.- oc)Vz +···+(A.- oc)'Vd,
(8.25)
where 1 = ind(X, J) and [V1 Vz] is a special generalized inverse of col(X(J - oc/) 1 - ;)l= 1 . Then the sequences u
i
= 1, 2, ... ,
and every solution of this equation is a linear combination of u
230
8.
APPLICATIONS TO DIFFERENTIAL AND DIFFERENCE EQUATIONS
L(A.) is given by (8.25) and E(A.) is a unimodular n x n matrix polynomial (i.e., with det E(A.) const =P 0).
=
Comments
Theorems 8.1 and 8.3 are proved in [14]. All other theorems are in [29a]. In a less general form an analysis of the singular linear differential equation (8.12) was made in [11].
Chapter 9
Least Common Multiples and Greatest Common Divisors of Matrix Polynomials
Let L 1 (A.), ... , L.(A.) be a finite set of matrix polynomials. A matrix polynomial N(A.) is called a (left) common multiple of L 1 (A.), ... , L.(A.) if L;(A.) is a right divisor of N(A.), i = 1, ... , s. A common multiple N(A.) is called a least common multiple (l.c.m.) of L 1 (A.), ... , L.(A.) if N(A.) is a right divisor of every other common multiple. The notions of a common divisor and a greatest common divisor are given in an analogous way: namely, a matrix polynomial D(A.) is a (right) common divisor of L 1 (A.), ... , L.(A.) if D(A.) is a right divisor of every L;(A.); D(A.) is a greatest common divisor (g.c.d.) if D(A.) is a common divisor of L 1 (A.), ... , L.(A.) and every other common divisor is in turn a right divisor of D(A.). It will transpire that the spectral theory of matrix polynomials, as developed in the earlier chapters, provides the appropriate machinery for solving problems concerning l.c.m. and g.c.d. In particular, we give in this chapter an explicit construction of the l.c.m. and g.c.d. of a finite family of matrix polynomials L 1 (A.), ... , L.(A.). The construction will be given in terms of both the spectral data of the family and their coefficients. The discussion in terms of coefficient matrices is based on the use of Vandermonde and resultant matrices, which will be introduced later in the chapter. 231
232
9. L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
Throughout Chapter 9 we assume for simplicity that the matrix polynomials L 1(A.), ... , L.(A.) are of size n x n and comonic, i.e., L;(O) = I, i = 1, ... , s. In fact, the construction of l.c.m. and g.c.d. of a finite family of regular matrix polynomials L 1 (A.), ... , L.(A.) can be reduced to a comonic case by choosing a point a which does not belong to Ui= 1 a(L;), and considering the matrix polynomials L;(a)- 1 L;(/. + a) in place of L;(/.), i = 1, ... , s. 9.1. Common Extensions of Admissible Pairs
In this and the next two sections we shall develop the necessary background for construction of the l.c.m. and g.c.d. of matrix polynomials. This background is based on the notions of restrictions and extensions of admissible pairs. We have already encountered these ideas (see Section 7.7), but let us recall the main definitions and properties. A pair of matrices (X, T) is called an admissible pair of order p if X is an n x p matrix and Tis a p x p matrix (the number n is fixed). The space
nKer 00
Ker(X, T) =
nKer[coi(XTj-l)}= I] OC'
XTj
=
j=O
s=l
is called the kernel of the pair (X, T). The least positive integers such that Ker[col(XTi- 1)}= 1] = Ker[col(XTi- 1 )j~}]
(9.1)
is the index of stabilization of(X, T) and is denoted by ind(X, T). Moreover, if s = ind(X, T), then for
r
=
1, 2, ... ,
(9.2)
where .If; = Ker[coi(XTi-l )}= 1 ] c f/7 Indeed, for r = 1 the property (9.2) is just the definition ofind(X, T). To prove (9.2) for r > 1, take x E As. Then (by (9.2) for r = 1), x E./Its+ 1 . In particular, this means that col(XTi)f= 1 x = 0, i.e., Tx EA•. Again, by (9.2) for r = 1, Tx E As+ 1 and, consequently, T 2 x E As. Continuing this process, we obtain T'x E As for r = 1, 2,, .. , and (9.2) follows. Two admissible pairs (X, T) and (X 1 , T1 ) are said to be similar if there exists a nonsingular matrix S such that XS =X 1 and T = ST1 S- 1 . In other words,(X, T)and(X 1 , T1 )aresimilarif(X, T)isanextensionof(X 1 , T1 )and (X 1 , T1 ) is an extension of (X, T). We recall the necessary and sufficient condition for extension of two admissible pairs (X, T) and (X 1 , T1 ) given by Lemma 7.12 (see Section 7.7 for the definition of extension of admissible pairs): namely, if Ker(X, T) = {0} and Ker(X 1 , T1 ) = {0}, then (X, T) is an extension of (X 1 , T1 ) if and only if (9.3)
9 .1.
233
COMMON EXTENSIONS OF ADMISSIBLE PAIRS
for some k ;;::-: max{ind(X, T), ind(X 1 , T1 )}. If this is the case, then (9.3) holds for all integers k ;;::-: 0. Let (X 1 , T1 ), . . . , (X,, T,.) be admissible pairs. The admissible pair (X, T) is said to be a common extension of (X 1 , T1 ), ••• , (X" T,.) if (X, T) is an extension of each (Xi, T), j = I, ... , r. We call (X 0 , T0 ) a least common extension of (X 1 , T1 ), . . . , (X" T,.) if (X 0 , T0 ) is a common extension of (X 1 , T1 ), . . . , (X" T,.), and any common extension of (X 1 , T1 ), . . . , (X" T,.) is an extension of(X 0 , T0 ).
Theorem 9.1. Let (X 1 , T1 ), •.. , (X" T,.) be admissible pairs of orders p 1 , ... , Pn respectively, and suppose that Ker(Xi, ~) = {0} for 1 ::;; j ::;; r. Then up to similarity there exists a unique least common extension of(X 1 , T1 ), . . . ,
(X" T,.). Put X = [X 1 . . . XrJ, T = T1 EB ... EB T,., and p = p 1 + ... + Pn and let P be a projector off/7 along Ker(X, T) (i.e., Ker P = Ker(X, T)). Then one such least common extension is given by
(9.4)
(X limP• PTIImP).
Let X 0 be the first term in (9.4) and T0 the second. By choosing some basis in ImP we can interpret (X 0 , T0 ) as an admissible pair of matrices. As Ker(X, T) is invariant forT and Xu = 0 for u E Ker(X, T), one sees that Ker(X 0 , T0 ) = {0} and Proof
Im[col(X 0 T~- 1 );"= 1 ]
= Im[col(XT;- 1 )7'= 1 ],
m;;::-:1.
(9.5)
From the definitions of X and T it is clear that
+ .. · + Im[col(XJ~- 1 );"= 1 ], m
Im[col(XT;- 1 )7'= 1 ] = Im[col(X 1 T~- 1 )7'= 1 ]
;;::-: 0.
(9.6)
But then we apply the criterion (9.3) to show that (X 0 , T0 ) is a common extension of (X 1 , T1 ), . . . , (X" T,.). Now assume that (Y, R) is a common extension of (X 1 , T1 ), . . . , (X" T,.). Then (cf. (9.3)) for 1 ::;; j ::;; r we have Im[col(YR;- 1 )~ 1 ]
::::J
Im[col(Xi T~- 1 )}= 1 ],
m;;::-:1.
Together with (9.5) and (9.6) this implies that Im[col(YR;- 1 )7'= 1 ]
::::J
Im[col(X 0 T~- 1 )7'= 1 ],
m;;::-:1.
But then we can apply (9.3) again to show that (Y, R) is an extension of (X 0 , T0 ). It follows that (X 0 , T0 ) is a least common extension of (X 1 , T1 ), ... , (X" T,.). Let (X 0 , T0 ) be another least common extension of (X 1 , T1 ), ..• , (X" T,.). Then (X 0 , T0 ) is an extension of (X 0 , T0 ) and conversely (X 0 , T0 ) is an exten-
sion of(X 0 , T0 ). Hence both pairs are similar. D
234
9.
L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
Corollary 9.2. Let (X 1 , T1 ), •.. , (X" T,.) be as in Theorem 9.1, and let (X 0 , T0 ) be a least common extension of(X 1 , T1 ), ••• , (X" T,.). Then Ker(X 0 , T0 )
= 0 and r
a(T0 ) c
U a(1;).
(9.7)
i= 1
Proof The equality Ker(X 0 , T0 ) = {0} is an immediate consequence of formula (9.4). To verify (9.7) observe that (in the notation of Theorem 9.1) Ker(X, T) is aT-invariant subspace and Thas the following form with respect to the decomposition ([;P = Ker(X, T) +ImP:
T= [T
* ]·
IKer(X. T)
0
PTilmP
So r
a(T0 )
= a(PThmP)
c a(T)
=
Ua(T;), i= 1
and (9.7) follows.
D
The first statement of Theorem 9.1 holds true for any finite set of admissible pairs (X 1 , T1 ), ... , (X" T,.) (not necessarily satisfying the condition Ker(Xi, Ij) = {0}, 1 ::<::; j ::<::; r. To this end, one first applies the previous theorem to the pairs 1 ::<::; j
::<::;
r,
(9.8)
where Pi is a projection of ([;PJ along Ker(Xi, Ij). This yields a least common extension (X 0 , T0 ) of the pairs (9.7). Next we put X= [X 0 0 1 ] and T = T0 EB 0 2 , where 0 1 is then x m zero-matrix and 0 2 is them x m zero-matrix, m = LJ= 1 dim Ker(Xi, 1j). The next theorem explains the role of formula (9.6).
Theorem 9.3. Let (X 0 , T0 ), (X 1 , T1 ), •.. , (X" T,.) be admissible, and suppose that Ker(Xi, Ij) = {0} for j = 0, 1, ... , r. Then (X 0 , T0 ) is a least common extension of(X 1, T1), ... , (X" T,.) if and only iffor each m ~ 1: Im[col(X 0 T~- 1 )i'= d = Im[col(X 1 T~- 1 )7'= 1 ] + + Im[col(X, T~- 1 )7'= 1 ].
·· · (9.9)
Proof Let X0 be the first term in (9.4) and f 0 the second. Then (X 0 , is a least common extension of (X 1, T1), •.• , (X" T,.) and
Im[col( X0 f~- 1 )i'= 1 ] = Im[col( X 1 T~- 1 )i'= 1 ] + + Im[col(X, T~- 1 )7'= 1 ],
... m ~ 1.
T0 )
235
9.2. COMMON RESTRICTIONS OF ADMISSIBLE PAIRS
(See Theorem 9.1 and the first part of its proof.) Now the pair (X 0 , T0 ) is a least common extension of (X 1, T1), ... , (X" T,.) if and only if (X 0 , T0 ) and (X 0 , T0 ) are similar. By (9.3) the last statement is equivalent to the requirement that Im[col(X 0 T~- 1 )i"=o = Im[col(X 0 f~- 1 )7"= 1 ] for all m;;::: 1, and hence the proof is complete. D 9.2. Common Restrictions of Admissible Pairs Let (X 1, T1), ... , (X" T,.) be admissible pairs. The admissible pair (X, T) is said to be a common restriction of (X 1, T1), ... , (X" T,.) if each (Xi, 1)), j = 1, ... , r, is an extension of (X, T). We call (X 0 , T0 ) a greatest common restriction of (X 1, T1), ... , (X" T,.) if (X 0 , T0 ) is a common restriction of (X 1, T1), ... , (X" T,.) and (X 0 , T0 ) is an extension of any common restriction of (X 1, T1), ... , (X" T,.). To construct a greatest common restriction we need the following lemma. Lemma 9.4. Let (X 1 , T1), ... , (X" T,.) be admissible pairs of orders p 1, ... , p" respectively, and suppose that Ker(Xi, 1}) = {O}for 1 :::;;; j :::;;; r. Let % be the linear space of all (q> 1 , q> 2 , ••• , Cf>r) E flP (p = p 1 + p2 + · · · + Pr) such that IX ;;:::
0.
Then%= {(q>, S 2 q>, ... , Srcp)lcp E A}, where A is the largest T1-invariant subspace of flP' such that for every j = 2, ... , r, there exists a linear transformations/ A~ flPi with the property that X 1 I.At = XiSi, Proof
Si T1 1At = 1JSi
(j
= 2, ... , r).
Note that% is a linear space invariant under T
=
(9.10)
T1 EB · · · EB T,..
Put
A = {q>
E
flP'I3cpi
E
flPi,
(2 :::;;; j :::;;; r),
(q>1, q>z, ... , Cf>r)
E
%}.
Take (q> 1, q> 2 , ••• , Cf>r) and (q> 1, $ 2 , ... , cpr) in %. Then (0, q> 2 - cp 2 , ••• , Cf>r - fpr) E %, and hence for 2 :::;;; j :::;;; r we have ~ Tj(q>i - (pi) = 0, IX ;;::: 0. AsKer(Xi, 1}) = {O}foreachj,wehaveq>i = fpi.Soeach(q> 1 , q> 2 , ••• , Cf>r)E% may be written as
(q>1, q>z, · · ·' Cf>r) = (q>1, S2q>1, · · ·' Srq>1), where sj is a map from A into flPi (2 :::;;; j :::;;; r). In other words
% = {(q>,S 2 q>, ... ,Srcp)lq>EA}.
(9.11)
As% is a linear space, the maps S2, ... , Sr are linear transformations. Since % is invariant for T1 EB · · · EB T,., the space % is invariant for T1 and j
= 2, ... , r.
236
9. L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
Further, from the definition of off and the identity (9.11) it is clear that xjsj = X11At· It remains to prove that .A is the largest subspace of flP' with the desired properties. Suppose that .A0 is a T1-invariant subspace of flP' and let SJ: .A0 --+ flf' (j = 2, ... , r) be a linear transformation such that (9.12) Then SJT~q> = TjSJcp for each q> E .A0 and oc ;;::: 1. This together with the first identity in (9.12) shows that (q>
But then .A0 c .A, and the proof is complete.
E
.Ao).
0
The linear transformations Si (2 ::;; j ::;; r) in the previous theorem are injective. Indeed, suppose that Siq> = 0 for some j ;;::: r. Then (9.10) implies that q> E Ker(X 1, T1). But Ker(X 1, T1) = {0}, and hence q> = 0. We can now prove an existence theorem for a greatest common restriction under the hypotheses used in Theorem 9.1 to obtain the corresponding result for a least common extension. Theorem 9.5. Let (X 1 , T1 ), .•. , (X" T,.) be admissible pairs of orders p, respectively, and suppose that Ker(Xi, T) = {0} for j = 1, ... , r. Then up to similarity there exists a unique greatest common restriction of
p 1,
•.. ,
(X 1 , T1 ),
... ,
(X, T,.).
If the subspace .A is defined as in Lemma 9.4, then one such greatest common restriction is given by
Proof Put X 0 = X 1 !At and T0 = T1 !At. By choosing some basis in .A we can interpret (X 0 , T0 ) as an admissible pair of matrices. By definition (X 0 , T0 ) is a restriction of (X 1, T1). As the linear transformations S 2 , ••• , S, in Lemma 9.4 are injective, we see from formula (9.12) that (X 0 , T0 ) is a restriction of each (Xi, Tj), 2::;; j::;; r. Thus (X 0 , T0 ) is a common restriction of (X 1 , T1 ), . . . , (X, T,.). Next, assume that (Y, R) is a common restriction of(X 1, T1), ... , (X, T,.). Let q be the order of the pair (Y, R). Then there exist injective linear transformations G/ flq --+ flPi, j = 1, ... , r, such that j
=
1, ... , r.
It follows that for each q> E flq we have ()( ;;::: 0.
9.2. COMMON RESTRICTIONS OF ADMISSIBLE PAIRS
237
HenceG 1 cpE.A,andthus Y = X 0 G 1 andG 1 R = T0 G 1 .Butthisimpliesthat (X 0 , T0 ) is an extension of (Y, R). So we have proved that (X 0 , T0 ) is a greatest common restriction of (X 1 , T1 ), ••• , (X" T,.). Finally, let (X 0 , T0 ) be another greatest common restriction of (X 1 , T1 ), ... , (X" T,.). Then (X 0 , T0 ) is an extension of (X 0 , T0 ) and conversely (X 0 , T0 ) is an extension of(X 0 , T0 ). Hence both pairs are similar. D As in the case ofleast common extensions, Theorem 9.5 remains true, with appropriate modification, if one drops the condition that Ker(Xi, T) = {0} for j = 1, ... , r. To see this, one first replaces the pair (Xi, T) by j
= 1, ... , r,
(9.13)
where Pi is a projector of ([Pi along Ker(Xi, 1]). Let (X 0 , T0 ) be a greatest common restriction of the pairs (9.13), and put X= [X 0 0 1] and T = T0 EB 02 , where 0 1 is the n x t zero-matrix and 02 the t x t zero-matrix, t = mini dim Ker(Xi, 1]). Then the pair (X, T) is a greatest common restriction of (X 1 , T1 ), ... , (X" T,.). We conclude this section with another characterization of the greatest common restriction. As before, let (X 1 , T1 ), ••• , (X" T,.) be admissible pairs of orders p 1 , ••• , Pn respectively. For each positive integer s let .%. be the linear space of all (cp 1 , cp 2 , ... , cp,) E flP, (p = p 1 + · · · + p,) such that
()( = 0, ... ' s- 1.
n:;,
Obviously, 1 .%. = .%, where .% is the linear space defined in Lemma 9.4. The subspaces .% 1 , .% 2 , •.. , form a descending sequence in flP, and hence there exists a positive integer q such that Yfq = Yfq+ 1 = · · · . The least least q with this property will be denoted by q{(Xi, 1})}= d. Theorem 9.6. Let(X 0 , T0 ),(X 1 , T1 ), ••. , (X" T,.)beadmissiblepairs,and suppose that Ker(X i, 1}) = {0} for j = 0, 1, ... , r. Then (X 0 , T0 ) is a greatest common restriction of(X 1 , T1 ), ••• , (X" T,.) if and only if
nIm[col(XiTt )7'= j= r
Im[col(X 0 T~- 1 )7'= 1 ]
=
1
1]
(9.14)
1
for each m;;::: q{(Xi, 1})}= dProof Let (Y0 , R 0 ) be the greatest common restriction of (X 1 , T 1 ), ••• , (X" T,.) as defined in the second part of Theorem 9.5. Take a fixed m ;;::: q{(Xi, 1})}= d, and let
nIm[col(Xj Tt )7'= j= r
col(cp;)7'= 1 E
1
1
1].
238
9.
L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
Then there exist 1/Ji E flPi (j = 1, ... , r) such that (/J; = Xi T}- 1 1/Ji for 1 :::;; i :::;; m and 1 :::;; j :::;; r. It follows that (l/1 1, ... , 1/1,) E off m· But off m = off. So l/1 1 E .A and 1/Ji = Sil/1 1 ,j = 2, ... , r (cf. Lemma 9.4). But then ({J;
= XiT}- 1 Sil/J 1 = X 1 T;1- 1 l/1 1 = Y0 R~- 1 l/1 1 •
It follows that
nlm[col(XiTt )i'= r
1
1]
c Im[col(Y0 R~- 1 )i'= 1 ].
j= 1
As (Y0 , R 0 ) is a common restriction of (X 1 , T1 ), •.. , (X, T,.), the reverse inclusion is trivially true. So we have proved (9.14) for X 0 = Y0 and T0 = R 0 . Since all greatest common restrictions of (X 1 , T1 ), ••• , (X, T,.) are similar, we see that (9.14) has been proved in general. Conversely, assume that (9.14) holds for each m ~ q = q{(Xi, 1})}= d.
Let (Y0 , R 0 ) be as in the first part of the proof. Then form each m, we have
~
q and hence for
Im[col(X 0 T~- 1 )7'= 1 ] = Im[col(Y0 R~- 1 )i'= 1 ]. By (9.3) this implies that (X 0 , T0 ) and ( Y0 , R 0 ) are similar, and hence (X 0 , T0 ) is a greatest common restriction of (X 1 , T1 ), ••. , (X, T,.). 0 The following remark on greatest common restrictions of admissible pairs will be useful.
Remark 9.7. Let (X 1 , T1 ), •.• , (X, T,.) be admissible pairs with Ker(Xi, 1}) = {0} for j = 1, ... , r, and let (X 0 , T0 ) be a greatest common restriction of (X 1 , T1 ), ••. , (X, T,.). As the proof of Theorem 9.6 shows, q{(Xi, 1})}= dis the smallest integer m ~ 1 such that the equality
nlm[col(XiT}- );"= r
lm[col(X 0 T~- 1 )i'= 1 ] =
1
1]
j= 1
holds. Further, (9.15) for every m ~ q{(Xi, 1})}= d, where d 0 is the size of T0 . Indeed, in view of formula (9.14), we have to show that the index of stabilization ind(X 0 , T0 ) does not exceed q{(Xi, 1})}= d. But this is evident, because ind(X 0 , T0 ):::;; ind(X 1 , T1 ):::;; q{(Xi, 1j)}=d, as one checks easily using the definitions.
9.3.
CONSTRUCTION OF L.C.M. AND G.C.D. VIA SPECTRAL DATA
239
9.3. Construction of l.c.m. and g.c.d. via Spectral Data
Let L 1(A.), ... , Ls(A.) be comonic matrix polynomials with finite Jordan pairs (X lF• J 1 F), ... , (X sF• ]sF), respectively. To construct an l.c.m. and ag.c.d. of L 1 (A.), ... , L,(A.) via the pairs (X lF• J !F), ... ,(X sF• ]sF), we shall use extensively the notions and results presented in Sections 9.1 and 9.2. Observe that (X;F, l;F) is an admissible pair of order P; = degree(det L;(A.)) and kernel zero:
where I; is the degree of L;(A) (this fact follows, for instance, from Theorem 7.15). Let (X" T,.) and (Xe, I;,) be a greatest common restriction and a least common extension, respectively, of the admissible pairs (X lF• J 1 F), ... , (XsF• ]sF). By Corollary 9.2, the admissible pairs (X, T,) and (X., I;,) also have kernel zero, and T,, I;, are nonsingular matrices. Let l, = ind(X" T,),
Theorem 9.8.
le = ind(X., T,).
The comonic matrix polynomial
M(A.) =I- Xe T; 1e(V1 ). 1e + V2 ). 1e-l
+ ·· · +
J!;eA.),
(9.16)
where [V1 V2 · · · J!;J is a special/eft inverse of col(X e T; j)~e=-o\ is an l.c.m. of minimal possible degree (?f L 1 (A.), ... , L,(A.). Any other l.c.m. of L 1 (A.), ... , Ls(A.) has the form U(A.)M(A.), where U(A.) is an arbitrary matrix polynomial with det U(A.) const i= 0.
=
Let us prove first that a comonic matrix polynomial N(A.) is an l.c.m. of L 1 (A.), ... , Ls(A.) if and only if the finite Jordan pair (XNF• JNF) of N(A.) is a least common extension of (X lF• J 1 F), ... , (X sF• ]sF). Indeed, assume that N(A.) is an l.c.m. of L 1 (A.), ... , L,(A.). By Theorem 7.13, (X NF• J NF) is a common extension of (X;F, l;F), i = 1, ... , s. Let (X, T) be an extension of(X NF• J NF), and let N(A.) be a matrix polynomial (not necessarily comonic) whose finite Jordan pair is similar to (X, T) (N(A.) can be constructed using Theorem 7.16; if Tis singular, an appropriate shift of the argument A is also needed). By Theorem 7.13, N(A.) is a common multiple of L 1 (A.), ... , Ls(A.), and, therefore, N(A.) is a right divisor of N(A.). Applying Theorem 7.13 once more we see that (X, T) is an extension of(XNF• JNF). Thus, (XNF• JNF) is a least common extension of (X lF• J 1 F), ... , (X sF• ]sF). These arguments can be reversed to show that if(X NF• J NF) is a least common extension of(X;F, l;F), i = 1, ... , s, then N(A.) is an l.c.m. of L 1 (A.), ... , Ls(A). After the assertion of the preceding paragraph is verified, Theorem 9.8 (apart from the statement about the uniqueness of an l.c.m.) follows from Theorem 7.16. Proof
240
9.
L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
We prove the uniqueness assertion. Observe that, from the definition of an l.c.m., if N 1(A.) is an l.c.m. of L 1(A.), ... , L.(A.), then N 1(A.) and M(A.) (where M(A.) is given by (9.16)) are right divisors of each other. So
M(A.) = U 2 (A.)N 1(A.) for some matrix polynomials U 1 (A.) and U 2 (A.). These equalities imply, in particular, that U 1(A.)U 2 (A.) = I for every A. E a(N 1). By continuity, U 1(A.)U 2 (A.) = I for all A. E (/;, and therefore both matrix polynomials U 1(A.) and U 2 (A.) have constant nonzero determinant. So we have proved that any l.c.m. of L 1 (A.), ... , L,(A.) has the form (9.17)
U(A.)M(A.)
where U(A.) is a matrix polynomial with empty spectrum. Conversely, it is easily seen that every matrix polynomial of the form (9.17) is an l.c.m. of L 1(A.), ... , L,(A.). 0 The proof of the following theorem is analogous to the proof of Theorem 9.8 and therefore is omitted. Theorem 9.9.
The comonic matrix polynomial
D(A.) =I_ x,r-tr(W1A_tr +
w2 A.tr-1
+ ... + Jo/1)),
where [W1 W2 • • • Jo/IJ is a special left inverse of col( X, T; i)~r;;01 , is a g.c.d. of minimal possible degree of L 1(A.), ... , L,(A.). Any other g.c.d. of L 1(A.), ... , L,(A.) is given by the formula U(A.)D(A.), where U(A.) is an arbitrary matrix polynomial without eigenvalues. 9.4. Vandermonde Matrix and Least Common Multiples
In the preceding section we constructed an l.c.m. and a g.c.d. of comonic matrix polynomials L 1 (A.), ... , L,(A.) in terms of their finite Jordan pairs. However, the usage of finite Jordan pairs is not always convenient (in particular, note that the Jordan structure of a matrix polynomial is generally unstable under small perturbation). So it is desirable to construct an l.c.m. and a g.c.d. directly in terms of the coefficients of L 1(A.), ... , L.(A.). For the greatest common divisor this will be done in the next section, and for the least common multiple we shall do this here. We need the notion of the Vandermonde matrix for the system L 1 (A.), ... , L.(A.) of comonic matrix polynomials, which will be introduced now. Let Pi be the degree of LlA.), and let (Xi, T;) = ([X iF, XiooJ, Jif- 1 EB J 00 ) beacomonic Jordan pair of LlA.), i = 1, ... , s. According to Theorem 7.15, (Xi, T;) is a
241
9.4. VANDERMONDE MATRIX AND LEAST COMMON MULTIPLES
standard pair for the monic matrix polynomial L;(i.) ticular, col( X; TDf~-01 is nonsingular; denote
Uj= [col(XjT~- 1 )f~ 1 r 1 ,
l
=
In par-
LJ=
1
pj and m is an
1
X1U1
xsus xs ~us
X1T1U1
X
1 ).
.i= l, ... ,s,
and define the following mn x pn matrix, where p arbitrary positive integer:
V,.(L1, ... , Ls) =
= ).P'L;U-
:
1
T7- 1 U 1
X 2 T7- 1 U 2
.
Xs T~-1Us
The matrix V"'(L 1, ... , L.) will be called the comonic Vandermonde matrix of L1(1l), ... , Ls(/1.). We point out that V,.(L 1, ... , Ls) does not depend on the choice of the comonic Jordan pairs (X;, I;) and can be expressed directly in terms of the coefficients of L 1 (1l), ... , Ls(.A.). In fact, let L 1 (.A.) =I+ Ap,- 1 .A. + Ap,- 2 11. 2 + ulpJ, where ulj is ann X n ... + Ao).Pl, and let u1 = [U11 u12 S: p 1 ) are given by the followf3 S: (1 p matrix. Then the expressions X 1 TfU 1 ing formulas (see Proposition 2.3): if m = 0, ... , p 1 if m = 0, ... , p 1
XITTU1p=
-
-
1 and 1 and
f3 i= m + f3 = m +
1 1 (9.18)
and in general form > p 1 ,
X 1TfU 1p =
~~~ [J ;,+-~iq=kj~(-Ap,-)] 1
·(-AfJ+k-m+p,-1) where, by definition, A;
+ (-Ap-m+p,-1),
(9.19)
= 0 for i s; 0 and
q
TI(-Ap,-;) = (-Ap, ;)(-Ap,--;)···(-Ap,-;J j= 1 To formulate the next theorem we need some further concepts. For a given family L 1, ... , Ls of co monic matrix polynomials with infinite Jordan pairs (X; 00 , 1; 00 ), i = 1, ... , s, denote by voc(L 1 , ... , Ls) the minimal integer j such that H:o = 0 for i = 1, ... , s. For integer i s; j, denote by S;/L 1 , ... , L.) the lower (j - i + 1) matrix blocks of the comonic Vandermende matrix Vj(L 1 , ••• , L.): Su{L1, ... , Ls) =
[
XJi~-~u~
:
XJt 1 UI
s s s xJi-lul . .
XJ{- 1 Us
9.
242
L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
If i = j, we shall write S;(L 1, ... , L.) in place of S;;(L 1, ... , L.). In Theorem 9.10 below we shall use also the notion of a special generalized inverse (see Section 7.11 for its definition).
Theorem 9.10. Let L 1 (A.), ... , L.(A.) be a family of comonic matrix polynomials, and let v 2::: V 00 (L 1, ••• , L.) be an arbitrary integer. Then the minimal possible degree m of a least common multiple of L 1 (A.), ... , L.(A.) is m = min{j 2:::
11 Ker s.+ 1, v+ i(L 1, ... , L.)
= Ker S.+ 1, v+ i+ 1(L 1, ... , L.)}
(9.20)
and does not depend on the choice of v. One of the least common multiples of L 1 (A.), ... , L.(A.) of the minimal degree misgiven by the formula M(A.) = I - Sv+m+ 1.v+m+ 1CL1, · · ·, L,) · [W1A.m
+ W2A.m- 1 + · · · + WmA.J, (9.21)
where [W1
W2 · · ·
Wm] is the special generalized inverse of s.+ 1,v+m(L1, · · ·, L.).
Proof Let(X;, 7;) = ([X;F X; 00 ],J;"f/ EB 1 00 )beacomonicJordanpair of L;(A.), i = 1, ... , s; let U; = [col(X;l{)f!:-0 1 1, where P; is the degree of L;(A.). Then for v 2::: v00 (L 1 , ..• , L.) we have
r
x.r; :
l
diag[U 1 ,
••. ,
u.]
x. r;+ j - 1 X2
o (9.22)
Since the matrix diag[(J;"fc' EB I)U;]i; 1 is nonsingular, the integer m (defined by (9.20)) indeed does not depend on v 2::: V 00 (L 1, ••• , L.). Let (Xe, T;,) be the least common extension of the pairs (X 1 F, J 1 F), ... , (X sF• J.F). In view of Theorem 9.8 it is sufficient to show that m = ind(Xe, T;,), and Xe T;m[V1 V2 · · ·
VmJ
=
Sv+m+ 1(L 1, ... , L,)[W1 W2 · · ·
Wm], (9.23)
243
9.4. VANDERMONDE MATRIX AND LEAST COMMON MULTIPLES
From the equality (9.22) it follows that m coincides with the least integer j such that Ker col[ X 1FJ~t. X 2Fr;}, ... ' xsFJ~i,kJ{:6 = Ker col[ X 1FJ~t, X zFF}}, ... , XsFJs"FkJLo· Using the construction of a least common extension given in Theorem 9.1, we find that indeed m = ind(X., I;,). Further, we have (by analogy with (9.22))
Sv+m+ 1 = [X 1FJ~Fm 0 XzFJZFm · diag[(J;f," EB l)U;]i= 1
0
· · · XsFJsFm
0]
and taking into account (9.22) it is easy to see that the right-hand side of(9.23) is equal to (9.24) where
[W~
W~]
W2
is a special generalized inverse of
col[X 1FJ~/. X 2FJi/, ... ' xsFJs"Fj]j=-d. We can assume (according to Theorem 9.1) that (X., T.,) =(X limP• PTI 1mp), where
X= [X1F
xsFJ
...
T
=
diag[J~J, ... , Js"F 1],
and P is a projector such that Ker P = Ker(X, T)
(=
.n xri)· Ker
•=0
The matrices X and T have the following form with respect to the decomposition flP = Ker P ImP, where p = p 1 + p2 + · · · + Ps:
+
x
= [O
x.J,
It follows that XT; = [0 X • T~], i = 0, 1, .... Now from the definition of a special generalized inverse we obtain that the left-hand side of (9.23) is equal to (9.24). So the equality (9.23) follows. D
The integer estimated: v 00 (L 1 ,
••• ,
V 00 (L 1, ••• ,
L.) ::::;; min {j
~
L.) which appears in Theorem 9.10 can be easily
11 Ker Jj(L 1 ,
••• ,
Lm) = Ker Jj+ 1 (L 1 ,
••. ,
Lm)}.
(9.25)
244
9.
L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
Indeed, for the integer v = v00 (L 1 , ... , L,) we have J'foo = 0 and J'f;; 1 =1= 0 for some i. Then also X ioo Ji;; 1 =I= 0, and there exists a vector x in the domain of definition of 1; 00 such that X; 00 Ji;; 1 X: =I= 0, X; 00 Jioo x = 0. Therefore, Ker V,(L 1 ,
.•. ,
Lm) -jJ Ker S.+ 1 (L 1 ,
..• ,
Lm),
and (9.25) follows. 9.5. Common Multiples for Monic Polynomials
Consider the following problem: given monic matrix polynomials L 1 (A.), ... , L.(A.), construct a monic matrix polynomial L(A.) which is a common multiple of L 1 (A.), ... , L.(A.) and is, in a certain sense, the smallest possible. Because of the monicity requirement for L(A.), in general one cannot demand that L(A.) be an l.c.m. of L 1 (A.), ... , L.(A.) (an l.c.m. of L 1 (A.), ... , L.(A.) may never be monic). Instead we shall require that L(A.) has the smallest possible degree. To solve this problem, we shall use the notion of the Vandermonde matrix. However, in the case of monic polynomials it is more convenient to use standard pairs (instead of comonic Jordan pairs in the case of comonic polynomials). We arrive at the following definition: let L 1 (A.), ... , L.(A.) be monic matrix polynomials with degrees p 1 , . . . , Ps and Jordan pairs (X 1 , T1 ), .•. , (X., 7'.), respectively. For an arbitrary integer m ~ 1 put
xlul Wm(Ll, ... ,L.)=
r
X 1 T1 U 1
X 2 T2 U 2
X1 T~- 1 U 1
X 2 Ti- 1 U 2
:
:
l
x.u. , x.(u· x. rr;- 1 v.
where Vi= [col(XiT~- 1 )f~ 1 r 1 . Call Wm(L 1 , ••• ,L.) the monic Vandermonde matrix of L 1 (A.), ... , L.(A.). From formulas (9.18) and (9.19) (where L 1 (A.) = L),!, 0 AiA.i) it is clear that Wm(L 1 , ••• , L.) can be expressed in terms of the coefficients of L 1 (A.), ... , L.(A.) and, in particular, does not depend on the choice of the standard pairs. For simplicity of notation we write Wm for Wm(L 1 , ... , L.) in the next theorem. Theorem 9.11.
Let L 1 (A.), ... , L.(A.) be monic matrix polynomials, and let r = min {m
~
11 Ker
Wm = Ker Wm + 1 }.
Then there exists a monic common multiple L(A.) ofL 1 (A.), ... , L.(A.) of degree r. One such monic common multiple is given by the formula L1 (A.)
= H' - S,+ 1l1mP · (Vl + VzA. + · · · + V,..A.'- 1 )
(9.26)
245
9.5. COMMON MULTIPLES FOR MONIC POLYNOMIALS
where P is a projector along Ker W,., [V1 , V2 , ••• , V.J is some left inverse for W..hmP• and Sr+ 1 is formed by the lower n rOWS of W,.+ 1· Conversely, if L(A.) is a monic common multiple of L 1 (A.), ... , L.(A.), then deg L 2 r. Observe that a monic common multiple of minimal degree is not unique in general.
Proof
Since U 1 ,
•.. ,
u. are nonsingular,
r = ind([X 1
· · ·
X.], diag[T1 ,
..• ,
'T.]).
Let (X 0 , T0 ) be the least common extension of(X 1 , T1 ), ••. , (X., T.)(Section 9.1). Then Ker(X 0 , T0 ) = {0} and ind(X 0 , T0 ) = r (see Theorem 9.1). Let
L(A.) be a monic matrix polynomial associated with the r-independent admissible pair (X 0 , T0 ), i.e., (9.27)
nx:J.
where [V1 , V2 , ••• , V.J is a left inverse of col(X 0 By Theorem 6.2, a standard pair of L(A.) is an extension of (X 0 , T0 ) and, consequently, it is also an extension of each (Xi, 7;), i = 1, ... , s. By Theorem 7.13, L(A.) is a common multiple of L 1 (A.), ... , L.(A.). In fact, formula (9.27) coincides with (9.26). This can be checked easily using the description of a least common extension given in Theorem 9.1. Conversely, let L(A.) be a monic common multiple of L 1 (A.), ... , L.(A.) of degree l. Then (Theorem 7.13) a standard pair (X, T) of L(A.) is a common extension of (X 1> T1 ), •.• , (X., T.). In particular, ind(X, T) 2 ind(X 0 , T0 )
= r,
(As before, (X 0 , T0 ) stands for a least common extension of (X 1 , T1 ), . . . , (X., T.).) On the other hand, nonsingularity of col(XTi)l:~ implies ind(X, T) = l; sol 2 r as claimed. D Theorem 9.11 becomes especially simple for the case oflinear polynomials Li(A.) = IA. - Xi, i = 1, ... , s. In this case Wm(L 1 ,
.•. ,
L.)
= col([XL X~, ... , x~m·~-01 •
Thus: Corollary 9.12. Let X 1 , ... , x. ben x n matrices. Then there exist n x n matrices A 0 , .•• , A 1_ 1 with the property l-1 l Xi+
" f...
j=O
AjXf. = 0,
i
= 1, ... ' s,
(9.28)
246
9. L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
if and only if the integer I satisfies the inequality I~ min{m ~ liKer col([X~, X~, ... , X~])k':J
= Ker col([X~, Xt ... , X~])k'= 0 }. Jf(9.29) is satisfied, one of the possible choices for A 0 , holds is given by the formula [Ao
Al
(9.29) ••• ,
A 1_ 1 such that (9.28)
· · · A,_l] = -[XL X~,···, X!] IImP[Vl
Vz
···
Jt;],
where Pis a projector along Ker col([X~, X~, ... , X~])~: b. and
is some left inverse for col([ X~, X~, ... , x~m:b limP· 9.6. Resultant Matrices and Greatest Common Divisors
In order to describe the construction of a g.c.d. for a finite family of matrix polynomials, we shall use resultant matrices, which are described below. The notion of a resultant matrix R(a, b) for a pair of scalar polynomials a(A.) = L~=o a;A.i and b(A.) = D=o b;A.; (a;, bi E C) is well known:
R(a, b)=
ao 0
al ao
0 bo 0
0 bl bo
a, al
0 b. bl
0 a,
0
0 0
a, 0 0
ao al 0 0 b. 0
1
;<.;rows
l
l
r rows
0
0
0
bo
bl
b.
l
Its determinant det R(a, b) is called the resultant of a( A.) and b(A.) and has the property that det R(a, b) =f. 0 if and only if a( A.) and b(A.) have no common zeros (it is assumed here that a,b. =f. 0). A more general result is known (see, for instance, [26]): the number of common zeros (counting multiplicities) of a(A.) and b(A.) is equal to r + s - rank R(a, b). Thus, the resultant matrix is closely linked to the greatest common divisor of a(A.) and b(A.). We shall establish analogous connections between the resultant matrix (properly generalized to the matrix case) of several matrix polynomials and their greatest common divisor.
9.6.
247
RESULTANT MATRICES AND GREATEST COMMON DIVISORS
Let L(A.) = L~= 0 A i).i be a regular n x n matrix polynomial. The following n(q - l) x nq block matrices (q > l) 0 (9.30) 0 are called resultant matrices of the polynomial L(A.). For a family of matrix polynomials Lj(A.) = Lk~o AkiAk (Aki is an n x n matrix, k = 0, 1, ... , pi, j = 1, 2, ... , s) we define the resultant matrices as
Rq(L 1 , L 2 ,
•.• ,
L.)
=
r=:i~::j
( q > max
1 ::;;;j ::;;r
Pi)·
(9.31)
Rq(L,) This definition is justified by the fact that the matrices Rq play a role which is analogous to that of the resultant matrix for two scalar polynomials, as we shall see in the next theorem. Let L 1 , . . . , Ls be comonic matrix polynomials with comonic Jordan pairs (X 1 , T1 ), ... , (Xs, T,), respectively. Let mi be the degree of det Li, j = 1, ... , s, and let m = m 1 + · · · + ms. For every positive integer q define
em,
there As the subs paces x 1 , .% 2 , •.• form a descending sequence in exists a positive integer q0 such that X qo = X qo+ 1 = · · · . The least integer q0 with this property will be denoted by q(L 1 , .•. , Ls). It is easi'l.y seen that this definition of q(L 1 , •.. , L,) does not depend on the choice of (X 1 , T1 ), ... , (X, TJ The following result provides a description for the kernels of resultant matrices. This description will serve as a basis for construction of a g. c. d. for a family of matrix polynomials (see Theorem 9.15).
Theorem 9.13. Let L 1 , . . . , Ls be comonic matrix polynomials with comonic Jordan pairs [X;, 1] = ([X;F X;"'], l;p 1 EB 1;"'), i = 1, ... , s, respectively. Let d0 be the maximal degree of the polynomials L 1 , . . . , Ls (so Rq(L 1 , . . . , Ls) is defined for q > d0 ). Then q0 d~ max{q(L 1 , ••• , Ls), d0 + 1} is the minimal integer q > d0 such that (9.32)
9.
248
L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
and for every integer q 2 q 0 thefollowing formula holds: KerRq(L 1 ,
...
+ Imcol(XaJ~-i){= 1 ,
,L,) = Imcol(XFJ~- 1 ){= 1
(9.33)
where (XF, JF) (resp. (X JocJ) is the greatest common restriction of the pairs (X IF, J tF), · · ·, (X sF, ]sF) (resp. (X 1 a-, J 1 aJ, · · ·, (Xsoo, Jso:J). 00 ,
p1,
The number q(L 1 , ... , L,) can ~e estimated in terms of the degrees Ps of L 1 (A), ... , L,(A), respectively:
•.. ,
q(L 1 ,
••• ,
L,)
s
n · min pj
+
(9.34)
max Pj· 1::£:j5:s
1 s,js,s
For the proof of this estimation see [29b]. So (9.33) holds, in particular, for every q 2 n · min 1 ,;j:o;s pj + max 1 ,;j:o;s Pj· Theorem 9.13 holds also in the case that L 1 (A), ... , L 5 (A) are merely regular matrix polynomials. For a regular matrix polynomial L;(A), let (X;o:n l;ocJ be an infinite Jordan pair for the comonic matrix polynomial L;- 1 (a)L;(A + a) (art CJ(L;)). The pair (X; 00 , 1; 00 ) = (X; 00 (a), X; 00 (a)) may depend on the choice of a, i.e., it can happen that the pairs (X; 00 (a), l; 00 (a)) and (X; 00 (b), l; 00 (b)) are not similar for a i= b, a, b i CJ(LJ However, the subspace lm col(X; 00 (a)(J; 00 (a))q- j)]= 1 does not depend on the choice of a. In view of Theorem 9.6, the subspace Im col( X oo J~- j)J= 1 also does not depend on the choices of a; i CJ(L;), i = 1, ... , s, where (X 00 , J 00 ) is the greatest common restriction of (X; 00 (a;), l; 00 (a;)), i = 1, ... , s. So formula (9.33) makes sense also in the case when L 1 (A), ... , L,(}c) are regular matrix polynomials. Moreover, the formula (9.33) is true in this case also. For additional information and proofs of the facts presented in this paragraph, see [29b]. For the proof of Theorem 9.13 we need the following lemma. Lemma 9.14. Let L(A) = I + L~=o Aj}cj be a comonic n x n matrix polynomial, and let (X, T) be a comonic Jordan pairfor L. Thenfor q > l we have
Ker Rq(L) = Im Proof
col(XP-')~= 1 .
Put F,p = XPZp (o: 2 0, 1 [Zt
Zz
.. · Zz] =
s
{J
s
1), where
[col(XTi-t)~= 1 r 1 .
Then (see Theorem 7.15 and (2.14)) I -Fu
Rq(L) =
[
-F
O
I -Fu
11
• •.
0 ]·
-Fil
9.6.
249
RESULTANT MATRICES AND GREATEST COMMON DIVISORS
Introduce
s=
l~
0
-I -FII
-I -FII
... -Fq-2!
Using Theorem 7.15 and equalities (3.15) we deduce that SRq(L)
=
[U
+
V
V],
where
u--
[ 0
.· . -/] '
-I
0
AsS is nonsingular, we see that Ker Rq(L) =
=
{[:JIU
{[::}
1
1
=
=
o}
-U- 1 V
= Im[
-u; vJ 1
Now
Since
F~p
= XPZp, it follows that
Ker R,(LHm{[
X; }z, z,_, ... Z.Jl
But row(Z 1 _)~:6 is nonsingular, and hence the lemma is proved.
0
Proof of Theorem 9.13. Let (X 1 , T 1 ), ••• , (X., T,;) be comonic Jordan pairs for L 1 , . . . , L., respectively. From the previous lemma we know that for q > d0 = max 1 -s,j-o;s pi, where Pi is the degree of Lp. ), we have
nIm col( Xi TJ-~)~~ s
Ker Rq(L 1 ,
•.• ,
L.)
=
j~
1.
1
The assertion about minimality of q 0 follows now from this formula and Remark 9.7.
9.
250
L.C.M. AND G.C.D. OF MATRIX POLYNOMIALS
Assume now q ? q 0 . From Theorem 9.6 we infer that where (X 0 , (X.,
Ker Rq(L 1, ... , L.) = lm col(X 0 T~-a)~= 1 , T0 ) is any greatest common restriction of the pairs (X 1 , T1 ),
.•• ,
T,J
Let (XiF• JiF) be a finite Jordan pair for Li(A.), and let (Xioo• Jioo) be the infinite Jordan pair of Li(A.). So Xi= [XiF Xioo], T; = Jif..l EB Jioo• 1 ::::;; i ::::;; s. Let(XF, JF)beagreatestcommonrestrictionofthepairs(X 1 F, J 1 F), ... , (X sF• J.F), and similarly let (X 00 , J 00 ) be a greatest common restriction of the pairs (X 100 , J 1 00 ), ••• , (Xsoo, 1 800 ). Then it is not difficult to see that the pair ([XF
X ooJ, Ji 1 EB J oo)
is a greatest common restriction of the pairs (X 1, T1), that for q ? q(L 1, L 2 , ••• , L.),
..• ,
(X.,
T,J It follows
Ker Rq(L 1, ... , L.) = Im col( X FJF-q• X 00 ]~-a)~= 1 . Multiplying on the right by J~- 1 EB I one sees that Ker Rq(L 1, ... , L,)
= Im col(XFJf,- 1 )~= 1 + Im(X 00 1~-a)~= 1
for q? q0 . As q > max 1 s;is;s Pi• the sum on the right-hand side is a direct sum, and formula (9.33) follows. 0 We give now a rule for construction of a greatest common divisor which is based on (9.33). This is the main result of this section. Theorem 9.15. Let L 1 , .•• , L. be comonic matrix polynomials. Assume (X 0 , T0 ) is an admissible pair with Ker(X 0 , T0 ) = {0} such that
= Im col(X 0 n)::A, where Q = [I 0] is a matrix of size nl x n(f + I) with integers f and /large enough. Then the matrix T0 is nonsingular, the matrix col( X 0 T 0 i):;;: A is left Q(Ker Rq(L 1, ... , L.))
invertible, and the comonic matrix polynomial D(A.) =I- X 0 T 0 1(W1A.1 + · · · + H/iA.), where [ W1 , ... , Hi!] is a special left inverse of col(X 0 T 0 i):;;: A, is a greatest common divisor of L 1 (A.), ... , L.(A.). The conclusions of Theorem 9.15 hold for any pair of integers (1, f) which satisfy the inequalities:
f ? min (npi - degree(det(Li(A.))), 1:::; i :Ss
I ? n( min Pi) 1s;i5s
+
max Pi• 1s;is;s
where Pi is the degree of Li(A.),j = 1, ... , s.
(9.35)
9.6.
251
RESULTANT MATRICES AND GREATEST COMMON DIVISORS
Proof Assume (9.35) holds, and let q = l + into account the estimate (9.34)) we obtain that
Q(Ker Rq(L 1 ,
••• ,
L.))
f. By Theorem 9.13 (taking
= Q(Im col(XFJ~- 1 )[= 1
+ lm col(X ,,J~i)[=
1
(9.36) where (XF, JF) (resp. (X'"" J 00 )) is the greatest common restriction of finite (resp. infinite) Jordan pairs of Lb ... , L •. It is easy to see that Jfx, = 0, so the right-hand side of (9.36) is equal to Im col(XFJt 1 )l= 1 • Now by (9.3) (using again (9.34)) we find that the admissible pair (X 0, To) is similar to (X F• J F), i.e., (X 0 , T0 ) is also a greatest common restriction of the finite Jordan pairs of L 1(A.), ... , L.(A.). It remains to apply Theorem 9.9. 0 Comments
The exposition in this chapter is based on the papers [29a, 29b, 30a]. Common multiples of monic operator polynomials in the infinite dimensional case have been studied in [30b, 70b]. The use of the Vandermonde matrix (in the infinite dimensional case) for studying operator roots of monic operator polynomials was originated in [62a, 64a, 64b]. See also [46]. Resultant matrices are well known for a pair of scalar polynomials. In the case of matrix polynomials the corresponding notion of resultant matrices was developed recently in [1, 26, 35a, 41]. Another approach to the construction of common multiples and common divisors of matrix polynomials is to be found in [6]. Behavior of monic common multiples of monic matrix polynomials, under analytic perturbation of those polynomials, is studied in [31].
Part III
Self-Adjoint Matrix
Polyno~nials
In this part the finite dimensional space fl" will be regarded as equipped with theusualscalarproduct: ((xi> ... , xn)T,(y 1, ... , Yn)T) = Lt=l X;Y;· Then, for a given matrix polynomial L(A.) = L~= 0 A iA.i, define its adjoint L *(A.) by the formula I
L *(A.) =
L A jA.i,
(10.1)
j=O
where A* means the operator adjoint to A on the space fl"; that is, (A *x, y)
= (x,
Ay)
for all
x, y E fl".
In the standard basis, if A= (a;)i.i=J, then A*= (aii>?.i=l· We are to consider in this part an important class of monic matrix polynomials L(A.) for which the coefficients Ai are hermitian matrices, Ai = Aj, or, what is the same, self-adjoint operators in fl": for every
x, y E fl".
Such matrix polynomials will be called self-adjoint: L(A.) = L *(A.). The study of self-adjoint matrix polynomials in the case l = 2 is motivated by the ubiquitous problem of damped oscillatory systems (mechanical and
253
254
SELF-ADJOINT MATRIX POLYNOMIALS
and electrical) with a finite number of degrees of freedom, which may be described by a system of second-order differential equation with self-adjoint coefficients. The main goal in this part is a description of the additional structure needed for the spectral theory of self-adjoint matrix polynomials. This includes the introduction of self-adjoint triples and a new invariant of a self-adjoint matrix polynomial, which we call the sign characteristic. Many properties of self-adjoint matrix polynomials demand some knowledge of this sign characteristic. We remark that the requirement of monicity of the self-adjoint matrix polynomial L(A.) is not necessary for most of the results given in this part. For instance, in many cases it can be replaced by the condition that the leading coefficient of L(A.) is invertible, or is positive definite. However, we shall stick to the monicity requirement, and refer the reader to [34f, 34g] for the more general cases.
Chapter 10
General Theory
In this chapter we shall give an account of basic facts to be used in the analysis of self-adjoint matrix polynomials.
10.1. Simplest Properties Consider a monic self-adjoint matrix polynomial 1-1
L(A.) = H 1 +
L AjA.j,
A; =A{, i = 0, ... , I- 1.
(10.2)
i=O
Let C 1 and C 2 , as defined in Section 1.1, be the first and second companion matrices of L(A.), respectively. As in Section 2.1, define A1
B=
A2
r ..
A2 ,
· ••
I
•••
.. .. . .
.
I
/1
0
0
(10.3)
0
and, for a given standard triple (X, T, Y) of L(A.), put Q
=
Q(X, T)
= col(XT;)l;;;A
R
= row(T;Y)l:A. (10.4)
= BC 1 B- 1 = R- 1 TR,
(10.5)
and
As we already know (cf. Section 2.1), C2
255
256
10.
GENERAL THEORY
and RBQ =I.
(10.6)
The crucial properties. of a self-adjoint matrix polynomial L(A.) are that B = B* and C 1 = Ci. So (10.5) becomes BC 1 = qB, or [C 1 x, y] = [x, C 1 y], where the new indefinite scalar product [x, y] is connected with the usual one (x, y) by the identity [x, y] = (Bx, y) (see Chapter S5). This means that C 1 is self-adjoint relative to the scalar product [x, y] (see Chapter S5). This property is crucial for our investigation of self-adjoint matrix polynomials. We start with several equivalent characterizations of a self-adjoint matrix polynomial. Theorem 10.1.
The following statements are equivalent:
(i) L *(A.) = L(A.). (ii) For any standard triple (X, T, Y) of L(A.), (Y*, T*, X*) is also a standard triple for L(A.). (iii) For any standard triple (X, T, Y) of L(A.), ifQ is defined by (10.4) and M = Q*BQ, then X= Y*M and T = M- 1 T*M. (iv) For some standard triple (X, T, Y) of L(A.), there is a nonsingular self-adjoint matrix M such that X = Y* M and T = M- 1 T* M. (v) C 1 = B- 1 qB. Proof The line of proof follows the logical sequence (i) => (ii) => (i), (i) => (iii)=> (iv) => (i), (iii)=> (v) => (iv). The first implication follows immediately from Theorem 2.2. (ii) => (i): given a standard triple for L and statement (ii) the resolvent form (2.16) can be used to write L - 1 (A.)
= X(H- T)- 1 Y = Y*(H- T*)- 1 X*.
(10.7)
It is immediately seen that L - 1 (A.) = (L - 1 (A.))* and (i) follows. (i) =>(iii): it follows from (i) that q = C 2 • Using (10.4) and (10.5) Q*- 1 T*Q* =
q
=
C 2 = R- 1 TR
whence T = (Q*R- 1 )- 1 T*(Q*R- 1 ). But RBQ =I so R- 1 = BQ, and T = M- 1 T*MwhereM = Q*BQ. To prove that X = Y* M we focus on the special triple.
X= [I
0
· · · 0],
(10.8)
10.1.
257
SIMPLEST PROPERTIES
If we can establish the relation for this triple the conclusion follows for any standard triple by the similarity relationship between two standard triples (X 1 , T1 , Y1 ) and (X 2 , T2 , ¥2 ) of L(A.) established in Theorem 1.25, i.e.,
(10.9) where S = [col(X 2 T~)::br 1 · col(X 1 T1 )::b. For this choice of triple, Q =I and (from (10.6)) R = B- 1 . Consequently XR = XB-
1
= [0 · · · 0 I] = Y*.
Hence X = Y* B = Y*(Q* BQ) = Y* M. (iii)= (iv): requires no proof. A;XT; = 0, (iv) = (i): it is quickly verified that, from the equality A 1 = I, and hypothesis (iv), Q(Y*, T*) = Q(X, T)M- 1 and then that Y*, T* form a standard pair for L.lf(Y*, T*, Z) is the associated standard triple, then Z is defined by equations Y*T*'- 1 Z = Ofor r = 1, ... , I - I, and Y*T* 1- 1 Z = I, which imply
L::=o
Z
~ T')-'l~] ~ Q(Y',
MQ(X, T)
•m ~MY
But MY= X* by hypothesis, so (Y*, T*, X*) is a standard triple for L. Equation (10.7) can now be used again to deduce that L* = L. (iii)= (v): apply the standard triple of (10.8) in statement (iii). Then T = C 1 , Q = I, M = Band (v) follows. (v) = (iv): this is obvious using the triple of (10.8) once more. 0 One of our main objectives will be to find triples (X, T, Y) for the selfadjoint matrix polynomial L(A.) such that the matrix M from Theorem 10.1 (X= Y*M, T = M- 1 T*M) is as simple as possible. ExAMPLE 10.1. Let L(,l.) = fA - A be linear, A = A*. In this case M can be chosen to be J. Indeed, L(,l.) has only linear elementary divisors and the spectrum a(L) of L(,l.) is real. Moreover, the eigenvectors of L(,l.) are orthogonal. So there exists a standard pair (X, T) of L(,l.) with unitary matrix X. With this choice of X, the matrix
M = Q*BQ =X* X
from Theorem 10.l(iii) is clearly I.
D
EXAMPLE 10.2. Let L(,l.) be a self-adjoint matrix polynomial with all of its eigenvalues real and distinct, say ). 1 , Az, ... , ).1.,. Let x 1 , ... , x 1., and y 1 , ... , y 1., be corresponding eigenvectors for L(,l.,) and L *().,) ( = L(,l.,) in this case), i = 1, 2, ... , In, respectively. This means that L(,l.,)x, = L *().,)y, = 0 and x,, y, oft 0 for i = 1, 2, ... , ln. Write X = [x 1 x 2 • • • x 1.,], J = diag[,l. 1 , Az, ... , A1.,], Y* = Lh · · · y1.,]. The two sets of
258
10. GENERAL THEORY
eigenvectors can be normalized in such a way that (X, J, Y) is a Jordan triple for L(A), so we have the resolvent representation C
1 (A) =
X(IA- J)- 1 Y =
L In
,~ 1
''*
X _iJ_i '
A- A,
(10.10)
and, since J* = J, in this case L - 1 (A) = X(IA- J)- 1 Y = Y*(IA- J)- 1 X*.
By Theorem 10.1 there exists a nonsingular self-adjoint M such that X = Y* M and M J = J* M ( = J Min this special case), It follows that M must be a real diagonal matrix, say M = diag[s 1 , , , . , s1n]. Let rx, ="/is, I, i = 1, 2,,,., In and write T = diag[:x 1, , , , , rx 1n],
E
diag[sgn s 1, , . , , sgn s1n].
=
Then M = TET. Consider now the Jordan triple (X 1 , J, Y1 ) with X 1 = X y- 1 and Y1 = T Y For this Jordan triple the corresponding matrix M 1 (X 1 = YfM 1 , M 1 J = JM 1 ) is equal to E. This is the simplest possible form of M for the self-adjoint matrix polynomial L(A). We shall see later that the signs {sgn s,}l"c 1 form the so-called sign characteristic of L(A) and (X 1 , J, Y1 ) is a self-adjoint triple. 0
Let L(A) be a self-adjoint monic matrix polynomial. Then the spectrum of L(A) is symmetric with respect to the real axis, as the following lemma shows. Lemma 10.2. If L = L *and ArE rr(L) is nonreal, then ~r is also an eigenvalue of L with the same partial multiplicities as Ar. Thus, if Jordan blocks 1r 1 , • • • , Jr, are associated with A" then exactly t blocks Js 1 , • • • , Js, are associated with As = Ar and i
=
1, 2, ... ' t.
Proof If follows from characterization (ii) of Theorem 10.1 that (Y*, J*, X*) is a standard triple for L if(X, J, Y) is so. But then the matrices J and J* are similar, and the conclusion follows. 0
We conclude this section with an observation of an apparently different character. For a nonsingular self-adjoint matrix M the signature of M, written sig M, is the difference between the number of positive eigenvalues and the number of negative eigenvalues (counting multiplicities, of course). Referring to the characterizations (iii) and (iv) of Theorem 10.1 of a monic self-adjoint matrix polynomial we are to show that, for theM appearing there, sig M depends only on the polynomial. Indeed, it will depend only on 1. Theorem 10.3. Let L =I}:~ AiAi + IA 1 be a self-adjoint matrix polynomial on (ln. Let (X, T, Y) be any standard triple of L. Then the signature of any self-adjoint matrix M for which X = Y* M, T = M- 1 T* M is given by
.
{0
s1g M = n
if if
lis even lis odd.
(10.11)
10.1.
259
SIMPLEST PROPERTIES
Proof
The relations X = Y* M, T = M- 1 T* M imply that col[XT;Jl:6 = (col[Y*T*;Jl:6)M.
Thus,
and for any other standard triple (X 1 , T1 , Y1 ) the corresponding matrix M 1 is given by M
1
= (col[YfTf;Jl:6)- 1 (col[X 1 T~Jl:6).
But there is an S for which relations (10.9) are satisfied and this leads to the conclusion M 1 = S* MS. Then as is well known, sig M 1 = sig M so that the signature is independent of the choice of standard triple. Now select the standard triple (10.8). By Theorem 10.1(v), in this case M = B. Thus, for any M of the theorem statement, sig M = sig B. To compute sig B, consider the continuous family of self-ajdoint operators B(e) = [Bi+k- 1 (e)]~.k= 1, where Bj(e) = eAi for j = 1, ... , l - 1, B 1(e) = I, B;( e) = 0 for j > l. Here e E [0, 1]. Then B = B( 1), and B( e) is nonsingular for every e E [0, 1]. Hence sig B(e) is independent of e. Indeed, let /c 1(e) ~ · · · ~ Anz(e) be the eigenvalues of B(e). As B(e) is continuous in e, so are the A.;( e). Furthermore, A.;( e) ¥= 0 for any e in [0, 1], i = 1, ... , nl in view of the nonsingularity of B(e). So the number of positive A.;( e), as well as the number of negative A.;(e), is independent of e on this interval. In particular, we obtain
sig B = sig B(O) = sig[L I
0
and the verification of (10.11) is then an easy exercise.
0
Theorem 10.3 implies (together with Corollary S5.2) the following unexpected result.
Theorem 10.4. A self-adjoint monic matrix polynomial of odd degree l has at least n elementary divisors of odd degree associated with real eigenvalues. Proof
Indeed, Theorem 10.3 implies that
{0
s1g B = n 0
if if
lis even lis odd.
It remains to apply Corollary S5.2 bearing in mind that C 1 is B-self-adjoint.
0
10.
260
GENERAL THEORY
The significance of this result is brought out if we consider two familiar extreme cases: a) if n = 1, then the result of Theorem 10.4 says that every scalar real polynomial of odd degree has a real root; b) if l = 1, then Theorem 10.4 says that every self-adjoint matrix (considered in the framework of the linear self-adjoint polynomial !A - A) has n real eigenvalues (counting multiplicities), and the elementary divisors of A are linear. These two results are not usually seen as being closely connected, but Theorem 10.4 provides an interesting unification. 10.2. Self-Adjoint Triples: Definition
Let L = L *be a monic matrix polynomial with a Jordan triple (X, J, Y). Since (Y*, J*, X*) is also a standard triple of L, we have X*=M- 1 ¥.
J*=M- 1 JM,
Y*=XM,
As mentioned before, we would like to choose (X, J, Y) in such a way that the similarity matrix M is as simple as possible. In general one cannot expect M = I even in the case of a scalar polynomial with all real eigenvalues, as the following example shows (see also Example 10.2). ExAMPLE
10.3.
Let L(A)
=
A(A - I) be a scalar polynomial. We can take J =
[~ ~]
and the general forms for X, Y in the standard triple (X, J, Y) are
[ -1] -XI
y
where x 1 , x 2
E
C'\{0}. Thus, Y*
=
=
-1
'
Xz
XM, where
M=[-lxo!llz
J
o lx21 12 .
So the simplest form of M is diag( -1, 1), which appears if x 1 and x 2 lie on the unit circle. 0
We describe now the simplest structure of M. First it is necessary to specify the complete Jordan structure of X, J, and Y. Since the nonreal eigenvalues of L occur in conjugate pairs, and in view of Lemma 10.2, a Jordan form J can be associated with L (i.e., with the first companion matrix C 1 ) having the following form. Select a maximal set {.A. 1 , ... , .A."} of eigenvalues of J containing no conjugate pair, and let Pa+ 1 , ... , Aa+b} be the distinct real eigenvalues of J. Put Aa+b+ j = Xj, for j = 1, ... , a, and let J = diag[JJf~ib
(10.12)
10.2.
261
SELF-ADJOINT TRIPLES: DEFINITION
where 1; = diag[1ii]~~ 1 is a Jordan form with eigenvalue A; and Jordan blocks 1;, 1 , . . . , 1;, k, of sizes rx;, 1 ;:::: · · · ;:::: rx;, k,, respectively. Clearly, if (X, 1, Y) is a Jordan triple, then there is a corresponding partitioning of X, and of Y, into blocks, each of which determines a Jordan chain. Thus, we write X= [X 1 Xz
...
X2a+b]
(10.13)
and X;= [X;, 1 X;,z
X;.d
(10.14)
and Y; = col[Y;j]~~ 1,
Y = col[ Y;J?:; b.
(10.15)
A standard triple (X, 1, Y) is said to be self-adjoint if 1 is Jordan and if (in the notations just introduced) (10.16) for i = 1, 2, ... , a and j = 1, 2, ... , k;, where P;j is the cxij x cxij standard involuntary permutation (sip) matrix (see Section S5.1 for the definition of a sip matrix), and also (10.17) for i = a + 1, ... , a + b and j = 1, 2, ... , k; where P;j is the cxij x cxij sip matrix, and the numbers £ij are + 1 or -1. A simple analysis can convince us that in a self-adjoint triple (X, 1, Y) the matrix M for which Y* = XM and M1 = 1* M is of the simplest possible form. Indeed, the self-adjoint equalities (10.16) and (10.17) could be rewritten in the form Y* = XP,,J> where P,, 1 is given by the formula 0 P,
pel 0 ,
0
0
(10.18)
where
and P,
=
diag[diag[~:ijP;J~~ 1Jf:%.,.1.
We remark also that P,, 1 1 = 1* P,, 1 . Thus, for this triple we have M = Pu· One cannot expect any simpler connections between X and Y in general, because the presence of blocks Pij is inevitable due to the structure of 1 and the signs c;ij appear even in the simplest examples, as shown above.
10.
262
GENERAL THEORY
The existence of self-adjoint triples for a self-adjoint matrix polynomial will be shown in the next section and, subsequently, it will be shown that the signs f.ij associated with the real eigenvalues are defined uniquely if normalized as follows: where there is more than one Jordan block of a certain fixed size associated with an eigenvalue, all the + 1s (if any) precede all the -1s (if any). The normalized ordered set {eij}, i = a + 1, ... , a + b, j = 1, 2, ... , k;, is called the sign characteristic of L(-1). Observe that Eq. (10.16) and (10.17) show that left Jordan chains can be constructed very simply from right Jordan chains. Suppose one begins with any Jordan pair (X, J) and then these equations are used to generate a matrix Y. In general (X, J, Y) will not be a self-adjoint triple because the orthogonality conditions (10.6) will not be satisfied. However, the theorem says that for a special choice of X, and then Y defined by (10.16) and (10.17) a self-adjoint triple can be generated. As an illustration of the notation of self-adjoint triples consider the resolvent form for self-adjoint triple (X, J, Y). First define submatrices of X, J as follows (cf. Eqs. (10.13) and (10.12)):
and similarly for J = diag[K 1, K 2 , K 3]. Then equalities (10.16) and (10.17) imply the existence of (special) permutation matrices P1, P2 for which
P1
X!l
y~ [ ~2~~, P 1 Xf
namely, P 1 = diag[diag(Pij)~~ 1 ]f= 1 , P2 = diag[diag(eijP;)~~ 1 ]f;-;+ 1 · In this case, the resolvent form of L(-1) becomes L - 1(-1) = X(I-1- J)- 1Y = X1(H- K1)_ 1 P1x~ + XiH- K2)- 1P2X2 + X 3(/-1- K 1)- 1P 1Xf = X1(/A- K1)- 1F1X! + X2(H- K2)- 1(F2X!) + (X3P1)(H- Kf)- 1Xf.
This equality is characteristic for self-adjoint matrix polynomials, i.e., the following result holds. Theorem 10.5.
A monic matrix polynomial L(-1) is self-adjoint
if and only
if it admits the representation L - 1(-1) = X 1(H- 1 1)- 1 · P 1X! + X 2(H- 1 2)- 1 · P 2 X~ + X 3P 1 ·(/A- Jf)- 1Xf, (10.19)
10.3.
263
SELF-ADJOINT TRIPLES: EXISTENCE
where J 1 (J 2 ) are Jordan matrices with .?m A. 0 > 0 (.?m A. 0 = 0) for every eigenvalue A. 0 of J 1 (J 2 ), and for i = 1, 2
Pf
(10.20)
=I,
Proof We saw above that if L(A.) is self-adjoint, then L - 1(A.) admits representation (10.19) (equalities (10.20) are checked easily for P; = P; and 1; = K; in the above notation). On the other hand, if (10.19) holds, then by taking adjoints one sees easily that L - 1 (A.) = (L - 1 (X))*. So L(A.) is self-adjoint.
0 We remark that the form ofthe resolvent representation (10.19) changes if the "monic" hypothesis is removed. More generally, it can be shown that a matrix polynomial L(A.) with det L(A.) "¢= 0 is self-adjoint if and only if it admits the representation L - 1 (A.) = X 1(IA.- J 1 )- 1 P 1 X~ + X 2 (IA.- 1 2 )- 1P 2 X! + X3P1(IA.- J!)- 1XT + x4(J4A.- J)- 1 P4X!, (10.21) where X 1 , X 2 , X 3 , P 1 , P 2 , J 1 , J 2 are as in Theorem 10.5, J 4 is a nilpotent Jordan matrix, and
It turns out that the matrices X 1 , X 2 , X 3 , J 1 , J 2 in (10.21) are constructed using the finite Jordan pair (XF, JF) of L(A.) in the same way as described above for monic polynomials. The pair (X 4 , J 4 ) turns out to be the restriction of the infinite Jordan pair (X 00 , J ,,::;) of L(A.) (as defined in Section 7.1) to some J 00 -invariant subspace. We shall not prove (10.21) in this book. 10.3. Self-Adjoint Triples: Existence
In this section the notions and results of Chapter S5 are to be applied to self-adjoint matrix polynomials and their self-adjoint triples. Some properties of these triples are deduced and an illustrative example is included. To begin with, we prove the existence of self-adjoint triples. In fact, this is included in the following more informative result. As usual, C 1 is the first companion matrix of L, and B is as defined in (10.3). Theorem 10.6. Let L(A.) = L~;,; ~ A jA.j + IA. 1 be a monic self-adjoint matrix polynomial. If P,. 1 is a C 1-canonical form of B and S is the reducing matrix B = S*P,. 1 S, then the triple (X, J, Y) with X
= [I 0 · · · O]S-1,
is self-adjoint.
Y
= S[O
···
I]*
10.
264
GENERAL THEORY
Conversely, if for some nonsingular S the triple (X, J, Y) defined by the above relations is self-adjoint, then S is a reducing matrix of B to C 1 -canonical form. Proof It has been shown in Theorem 10.1 that L * = Lis equivalent to the statement that C 1 is B-self-adjoint. Then Theorem S5.1 implies the existence of a C 1 -canonical form P,,J for B. Thus, B = S*P,,JS and SC 1 S- 1 = J. Then it is clear that (X, J, Y) as defined in the theorem forms a Jordan triple for L (refer to Eq. (10.9)). This triple is self-adjoint if and only if Y = P,,JX*. Using the definition of X and Y this relation is seen to be equivalent to
and this is evident in view of the definition of B. Suppose now that (X,J,Y)=([I 0 ... O]S- 1 ,SC 1 S-l,S[O ...
0 I]*)
is a self-adjoint triple. Then Y = P,.JX*. Using the equality P,,JJ = J*P,,J one checks easily that Q* = P,,JR, where Q = col[XJ;]l;:6 and R = row[J;YJl:6. By (10.6) B = R- 1 Q- 1 = Q*- 1 P,,JQ- 1 . It remains to note that S = Q- 1 . 0 Now we derive some useful formulas involving self-adjoint triples. Let (X, J) be a Jordan pair of the monic self-adjoint matrix polynomial L(..t) constructed as in (10.12), (10.13), and (10.14). Construct the corresponding matrix Q = col[Xf]l:6 and let U = Q*BQ. It is evident that U is nonsingular. It Y is the third member of a Jordan triple determined by X and J, we show first that Y = u- 1 X*. Recall that Y is determined by the orthogonality relation (10.23) We also have
Q(U-'X')
~
B-'Q•-•x•
~ B-•rfl ~ m
(10.24)
and comparing with (10.23) it is seen that Y = u- 1 X*. Since the columns of Q form Jordan chains for C 1 , C 1 Q = QJ whence Q*(BC 1 )Q = (Q* BQ)J. In addition (BC 1 )* = BC 1 and it follows that J*(Q*BQ)
= (Q*BQ)J.
(10.25)
10.3.
265
SELF-ADJOINT TRIPLES: EXISTENCE
With a partitioning consistent with that of X in (10.13) write
and (10.25) can be written
for r, s = 1, 2, ... ' 2a + b. Now S2.2) and it is found that
xr #- As implies that Q: BQS = 0 (see Section
U = [Q:BQ 8 ] = [ 0
~02 T~il
(10.26)
T1
where (10.27) and T2
= diag[Q:+ iBQa+ i]~= 1 .
(10.28)
Suppose now that the triple (X, J, Y) is self-adjoint. Then Q = s- 1 , where S is the reducing matrix to a C 1 -canonical form of B (cf. Theorem 10.6). It follows that U = P,. 1 . In particular, we have fori= a+ 1, ... , a+ b QT BQi = diag[eiiPii]'~ 1 ,
where eii =
(10.29)
± 1, fori = I, ... , a, Q:+b+iBQi = diag[Pii]'~ 1.
(10.30)
Note that Eqs. (10.26)-(10.28) hold for any Jordan triple of L(A.) (not necessarily self-adjoint). We conclude this section with an illustrative example. ExAMPLE
10.4. Consider the fourth-degree self-adjoint matrix polynomial
-- [0 1] [2(}.(i.Z ++ v1) 2
L 4 (,.)
-
1 -1
2
v.z;. ++ 1)1 4
2
]
[0 1]
1 -1 ·
It is found that det LiA.) = (A.4 - 1) 2 so the eigenvalues are the fourth roots of unity, each with multiplicity two. A Jordan pair is
266
10. GENERAL THEORY
However, it is found that this is not a self-adjoint pair. It turns out that
X= j2 [-2 8
-2i -2 1 -2 -1 OJ - 2 - 2i 2 - 1 2 1 2
is such that X, J determines a self-adjoint triple. In Eq. (10.26) we then have
10.4. Self-Adjoint Triples for Real Self-Adjoint Matrix Polynomials
Let L(A.) = IA. 1 + L:}:~ Ai.A.i be a monic matrix polynomial with real and symmetric coefficients Ai: Ai = AJ, j = 0, ... , I - 1. Then clearly L(A.) is self-adjoint and as we have already seen there exists a self-adjoint triple (X, J, Y) of L(A.). In the case of real coefficients, however, a self-adjoint triple which enjoys special properties can be constructed, as we shall see in this section. Let us make first a simple observation on Jordan chains for L(A.). Let A. 0 E u(L) be a nonreal eigenvalue of L(A.) with corresponding Jordan chain x 0 , •.. , x,. So, as defined in Section 1.4.
~
_!_
L... . 1 L
(j)
(A. 0 )xk-j-0,
k = 0, ... , r.
(10.31)
j=O) ·
Taking complex conjugates in these equalities, (and denoting by Z the matrix whose entries are complex conjugate to the entries of the matrix Z) we obtain
~ 1
L... ~
-(-j)--
L (A. 0 )xk- i
-
-
0,
k = 0, ... , r.
j=O) ·
But since the coefficients Ai are real, L
10.4.
267
REAL SELF-ADJOINT MATRIX POLYNOMIALS
It follows from the preceding discussion that there exists a Jordan pair (X, J) of L().) of the following structure:
(10.32) where a'(J< 2 l) is real and the spectrum of J< 1l is nonreal and does not contain conjugate complex pairs. The matrix x< 2 l is real. We have not yet used the condition that Ai = AJ. From this condition it follows that there exists a self-adjoint triple of L(A). The next result shows that one can choose the self-adjoint triple in such a way that property (1 0.32) holds also. Theorem 10.7. There exists a self-adjoint triple (X, J, Y) of L().) such that the pair (X, J) is decomposed as in (10.32).
We need some preparations for the proof of Theorem 10.7 and first we point out the following fact. Theorem 10.8. For every n x n nonsingular (complex) symmetric matrix V there exists ann x n matrix U such that V = UTU. Proof Consider the linear space fl" together with the symmetric bilinear form [, ] defined by V: [x, y] = yTVx. Using Lagrange's alogrithm (see [22, 52c]), we reduce the form [, ] to the form [x, y] = X;Y; for x = (x 1, ... , x")T andy= (y 1, ... , Yn)T in some basis in fl". Then the desired matrix U is the transformation matrix from the standard basis in fl" to this new basis. D
D=o
Let K = diag[K 1 , ... , K.] be a Jordan matrix with Jordan blocks . . . , Kr of sizes m1 , ..• , m" respectively. Let P; be the sip matrix of size m; (i = 1, ... , r), and put PK = diag[P 1 , •.• , P,]. Denote by :Y the set of all nonsingular matrices commuting with K.
K 1,
Lemma I 0.9. Let V be a nonsingular (complex) symmetric matrix such that PK VE .?7. Then there exists WE :T such that WTVW = PK. Conversely, if for some matrix V there exists WE :Y such that WTVW = PK, then Vis nonsingular symmetric and PK V E :Y. Proof The converse statement is easily checked using the property that PKK = KTPK and PKKT = KPK so we focus on the direct statement of the
lemma. Consider first the case when the set (J(K) of eigenvalues of K consists of one point and all the Jordan blocks of K have the same size: K = diag[K 1 ,
.•. ,
K 1 ],
where the p x p Jordan block K 1 appears r times.
268
10.
GENERAL THEORY
It is seen from Theorem S2.2 that a matrix U E ff if and only if it is nonsingular and has the partitioned form (UijX.j = 1 where uij is a p X p upper triangular Toeplitz matrix, i.e., has the form
Uu~ ~~
t2
...
tp
t1
0
(10.33) tl
t2
0
t1
for some complex numbers t 1, t 2 , ... , tP (dependent on i,j). It will first be proved that V admits the representation (10.34) for some U E ff where W1 , ••• , W, are p x p matrices. For brevity, introduce the class d of all nonsingular (complex) symmetric matrices of the form PKS, S E ff, to which V belongs. Put (10.35) where U E ff is arbitrary. Then W0 Ed. Indeed, the symmetry of W0 follows from the symmetry of V. Further, using equalities UK= KU, PK VK = KPK V, Pi_= I, and PKK = KTPK, we have PK W0 K
= =
PKUTVUK
=
PKUTPKPK VKU
PK UTKTPKPK VU
=
=
PKKTUTVU
PKUTPKKPK VU
=
KPK UTVU
=
KPK W0 ,
so that W0 Ed. Now let M be the pr x pr permutation matrix which, when applied from the right moves columns 1, 2, 3, ... , pr into the positions 1,
r
+ 1, 2r + 1, ... , (p
+ 1,
- 1)r
2,
r
+ 2, ... , (p
- 1)r
+ 2, ... , r,
2r, ... , pr,
respectively. Since V Ed, we have P K V E ff, and using representation (10.33) for the blocks of P K V it is found that
(10.36)
where VT = Jli, i = 1, 2, ... , p. (Note that the effect of the permutation is to transform a partitioning of V in r 2 p x p blocks to a partitioning in p 2 blocks
10.4.
269
REAL SELF-ADJOINT MATRIX POLYNOMIALS
ofsizer x r.)Hencefor W0 defined by(10.35)MTW0 Mhas the form, analogous to (10.36), say 0 - 0 = M TW0 M = W
0
WI
Wi Wi Ui
WI Wz
(10.37)
0
~
and furthermore, the nature of the permutation is such that W0 is block diagonal if and only if the blocks W;, i = 1, 2, •.. , p, are diagonal. If U E ff, the action of the same symmetric permutation on U is
(10.38)
Now (using the property MMT =I) we have
0 uJ
Multiply out these products and use (10.37) to obtain i
W; = I
1~
I
UT-1+ 1 I
I
k~
l-Ie ul-k+ 1•
i
= 1, 2,
0
0
0'
p.
(10.39)
I
To complete the proof of (10.34) it is to be shown that U 1 , . . . , UP can be are diagonal. This is done by determined in such a way that WI, calculating U 1 , •.. , UP successively. First let U 1 be any nonsingular matrix for which UJV1 U 1 = I. Since VJ = V1 such a U 1 exists by Lemma 10.8. Proceeding inductively, suppose that matrices U 1 , ... , U v-I have been found for which W1 , ••• , J¥,._ 1 are diagonal. Then it is deduced from (10.39) that 0
0
0'
wp
(10.40)
270
where CT =
10.
c
v,
and depends only on VI, ... '
0 v = U 11 U., then ( 10.40) takes the form
and u
GENERAL THEORY
I• ... ,
uv-1· Write
Let 0. = [f3ii]i.i~ 1 , C = [cu]i.i~l· Since UJV1U 1 =I, the matrix diagonal if and only if for 1 ~ i ~ j ~ r f3ii
+ f3u = -
W._is
cii.
There certainly exist pairs f3ii, f3ii satisfying this equation. This completes the proof of (10.34). Now each Jt; from (10.34) has the form
W;
~I r t!
0
t!
tl t2
tp-1 tp
tj E fl,
t I =I= 0.
S;Efl,
s1
There exists a matrix S; of the form
S;
=
[,,~
s2 sl
0
0
...
'•. 1'
sp-1
=1=
0
sl
such that (10.41)
S[Ut;S; =PI the sip matrix of size p x p. Indeed, (10.41) amounts to the system i = 1 i
= 2, ... , p.
This system can be solved for s 1 , s 2 , •.. , sP successively by using the ith equation to finds; in terms oft 1 , .•. , t;, s 1 , ... , s;_ 1 , where s 1 = t! 112 • Now put W = diag[S 1 , . . . , S,]U to prove Lemma 10.9 for the case when a(K) is a singleton and all Jordan blocks of K have the same size. Consider now the case when a(K) consists of one point but the sizes of Jordan blocks are not all equal. So let
Let
10.4.
271
REAL SELF-ADJOINT MATRIX POLYNOMIALS
Then the condition P K V E f7 means that the V;i have the following structure (cf. Theorem S2.2): for i #j; v;.IJ = [v
v~iil
= 0 for t < max(mi, m).
Nonsingularity of V implies, in particular, that the k 1 x k 1 matrix ]k' [ (ij) Vm,+1 i,j=1
is also nonsingular. Therefore, we can reduce the problem to the case when and
for
i = 1, ... , m1 ,
(10.42)
by applying sequentially (a finite number of times) transformations of type V ~ ETVE, where then x n matrix E has the structure E
=
[Eii]~.pi= 1
(Eii an mi x mi matrix),
with Eii = I fori = 1, ... , kP. Also.
Eii
=
e1
ez
em;
0
e1
em;-1
0
0
e1
0
0
0
where e. E fl depends on i andj, fori = 1, ... , m1 andj = m + 1, ... , mkp, and Eii = 0 otherwise. Then in the case (10.42) we apply induction on the number of different sizes of Jordan blocks inK to complete the proof of Lemma 10.9 in the case that (J(K) consists of a single point. Finally, the general case in Lemma 10.9 ((J(K) contains more than one point) reduces to the case already considered in view of the description of the matrices from f7 (Theorem S2.2). D The next lemma is an analog of Lemma 10.9 for real matrices. Given the Jordan matrix K and the matrices P 1 , .•. , P, as above, let P •. K = diag[e 1 P 1 ,
for a given set of signs
B
=
(ei)~= 1 , Bi
... ,
B,P,]
= ± 1.
Lemma 10.10. Let V be a nonsingular real symmetric matrix such that PK V E f/. Then there exists a real WE f7 such that WTVW = P,,Kfor some set of signs B. Conversely, iffor some matrix V there exists real WE f7 such that WTVW = P.,,K, then Vis real, invertible, and symmetric and PK V E f/.
10. GENERAL THEORY
272
The proof of Lemma 10.10 is analogous to the proof of Lemma 10.9 (the essential difference is that, in the notation of the proof of Lemma 10.9, we now define U 1 to be a real nonsingular matrix such that
urvl u 1 = diag[£51, ... ' <:5,], for some signs £5 1, ... , <:5,. Also1 instead of(10.41) we now prove that STW;Si = ± P 1, by letting s 1 = (±t 1)- 112 where the sign is chosen to ensure that s 1 is real (here t 1 E ~)). Proof of Theorem 10. 7. We start with a Jordan triple (X, J, Y) of L(A.) of the form
(10.43) where Im a(J 1) > 0, a(J 2) is real, X 2 is a real matrix, and the partitions of X and Yare consistent with the partition of J. (In particular, Im a(J 1) < 0.) The existence of such a Jordan triple was shown in the beginning of this section. Let Q = col(Xf)l:6. Then (see (10.26)) U = Q*BQ = [
~ Tt
Tfl 0 '
(10.44)
0
where the partition is consistent with the partition in (10.43). Moreover (equality (10.25)), J*T
= TJ.
(10.45).
It is easy to check (using (10.44) and (10.45)) that T1 is nonsingular symmetric and Pc T1 commutes with J 1 (where Pc is defined as in (10.18)). By Lemma 10.9 there exists a nonsingular matrix W1 such that W1 J 1 = J 1W1 and wrrl wl =PC. Analogously, using Lemma 10.10, one shows that there exists a real nonsingular matrix W2 such that W2 J 2 = J 2 W2 and WI T2 W2 = P,, J 2 for some set of signs a. Denote X'1 = X 1W1 and X2 = X 2 W2 • Then (cf. Theorem 1.25) the triple (X', J, Y') with
X'= [X'1,X2,X\]
and
Y' = [col(X'f)l:}J- 1 ·col(<:5i,t-J)l:6
is a Jordan triple for L(A.). Moreover, the equality
[col(X'J;)::n•n
col(X'J;)::~ ~
[I ~ ~'l
10.5. SIGN CHARACTERISTIC OF A SELF-ADJOINT MATRIX POLYNOMIAL
273
holds, which means that the triple (X', J, Y') is self-adjoint (cf. (10.29) and (10.30)). 0 We remark that the method employed in the proof of Theorem 10.7 can also be used to construct a self-adjoint triple for any monic self-adjoint matrix polynomial (not necessarily real), thereby proving Theorem 10.6 directly, without reference to Theorem S5.1. See [34f] for details. The following corollary is a version of Theorem 10.5 for the case of real coefficients A j· Corollary 10.11. Let L(A.) be a monic self-adjoint matrix polynomial with real coefficients. Then L(A.) admits the representation L - 1(A.)
= 2 ~e(X 1 (/A.- 1 1)- 1P 1Xi) + X 2 (H- 1 2 )- 1P 2 Xi, (10.46)
where J 1 (J 2 ) are Jordan matrices with Jm A.0 > 0 (Jm A. 0 = 0) for every eigenvalue A. 0 of J 1 (J 2 ), the matrix X 2 is real, and for i = 1, 2: P; = Pf = PT,
N
=I,
P;l; = J!P;.
(10.47)
Conversely, if a monic matrix polynomial L(A.) admits representation (10.46), then its coefficients are real and symmetric. Proof The direct part of Corollary 10.11 follows by combining Theorems 10.5 and 10.7. The converse statement is proved by direct verification that L- 1(A.) = (L- 1(A.)) T = (L- 1(A.))* for A. E IR, using the equalities (10.47).
0
10.5. Sign Characteristic of a Self-Adjoint Matrix Polynomial Let L(A.) be a monic matrix polynomial with self-adjoint coefficients. By Theorem 10.6 there exists a self-adjoint triple of L(A.), which includes (see (10.17)) a set of signs B;i, one for every nonzero partial multiplicity rxii• j = 1, ... , k; corresponding to every real eigenvalue A.a+i• i = 1, ... , b, of L(A.). This set of signs cii is the sign characteristic of L(A.), and it plays a central role in the investigation of self-adjoint matrix polynomials. According to this definition, the sign characteristic of L(A.) is just the C 1-sign characteristic of B (cf. Theorem 10.6). We begin with the following simple property of the sign characteristic. Proposition 10.12. Let B;i, j = 1, ... , k;, i = 1, ... , b be the sign characteristic of L(l.) = V. 1 + 2}:~ Ail) with the corresponding partial multiplicities rx;i,j = 1, ... , k;, i = 1, ... , b. Then b
k;
{0
i~l i~1![1 - ( -1Y'i)~;ii = n
if if
lis even lis odd.
(10.48)
274
10.
GENERAL THEORY
Proof Let (X, J, Y) be a self-adjoint triple of L(A.), i.e., such that Y = P,,JX*, where£ = (~>;),j = 1, ... , k;, i = 1, ... , b. By Theorem 10.6 sig B = sig Pe.J·
(10.49)
Note that the signature of a sip matrix is 1 or 0 depending if its size is odd or even. Using this fact one deduces easily that sig P,, J is just the left-hand side of (10.48). Apply Theorem 10.3 to the left-hand side of(10.49) and obtain (10.48).
D
Lf=
In particular, Proposition 10.12 ensures that 1 D~ 1 1[1 - ( -l)'•;i]£ij does not depend on the choice of a self-adjoint triple (X, J, Y) of L(A.). The next result shows that in fact the signs £ij themselves do not depend on (X, J, Y), and expresses a characteristic property of the self-adjoint matrix polynomial L(A.) (justifying the term "sign characteristic of L(A.)").
Theorem 10.13. The sign characteristic of a monic self-adjoint matrix polynomial L(A.) does not depend on the choice of its self-adjoint triple (X, J, Y). The proof is immediate: apply Theorem S5.6 and the fact that C 1 is Bself-adjoint (cf. Theorem 10.6). D The rest of this section will be devoted to the computation of the sign characteristic and some examples. Given a self-adjoint triple (X, J. Y) of L(A.) and given an eigenvector x from X corresponding to a real eigenvalue A.:, denote by v(x, A. 0 ) the length of the Jordan chain from X beginning with x, and by sgn(x, A. 0 )-the sign of the corresponding Jordan block of J in the sign characteristic of L(A.).
Theorem 10.14. Let (X, J, Y) be a self-adjoint triple of L(A.) = IA 1 + _L:}-;;;~ AjA.j. Let x 1 and y 1 , •.. , y, be eigenvector and .first vectors of a Jordan chain, respectively ,from X corresponding to the same real eigenvalue A. 0 . Then
-
{
0, sgn(y 1 , A. 0 ),
if y 1 # x 1 if y 1 = x 1
or y 1 = x 1 and r < v(x 1, A.o) ( 10.SO) and r = v(x 1 , A. 0 ).
Let Q = col(XJ;)~-:6. Denote by x1 and h, ... , y, the columns in .•• , y" respectively, in X. We shall prove first that
Proof
Q corresponding to the columns x 1 and y 1 ,
~LU>(A.o)Yr+I-j)
( xi,.± j = Il· where B is given by (10.3).
=
(x 1 ,By,),
(10.51)
10.5. SIGN CHARACTERISTIC OF A SELF-ADJOINT MATRIX POLYNOMIAL
The proof is a combinatorial one. Substituting
x
1
275
col[.A.~x 1 ]~;:;~ and ::;; 0) in the expres-
=
y, = col[Lf=o ({).A.b-iy,_;]~;;;~ (by definition yP = 0 for p sion (x 1 , By,) we deduce that
It is easy to see that '\'l
L...
'\'l
L...
k=lp=i+k
(
P-
.
I
k)
;.ri-lA
o
P
1 = ·---L
o'
(.
and (10.51) follows. Now use (10.29) to obtain (10.50) from (10.51). 0 Theorem 10.14 allows us to compute the sign characteristic of L().) in simple cases. Let us consider some examples. ExAMPLE 10.5. Let L(A.) be a monic scalar polynomial with real coefficients and r different simple real roots A. 1 > · · · > A.,. It is easy to check (using Theorem 10.14) that the sign of A.i in the sign characteristic of L(A.) is ( -1)i+ 1 . D ExAMPLE 10.6. Let L(A.) be a monic self-adjoint matrix polynomial with degree/. Let A. 1 (resp. A.,) be the maximal (resp. minimal) real eigenvalue of L(A.). Then sgn(x, A. 1 ) = 1 for any eigenvector x corresponding to A., with v(x, A. 1 ) = Y; and for any eigenvector y corresponding to A., with v(y, A.,)= 1 we have sgn(y, A.,)= ( -1Y+ 1 • Indeed, consider the real-valued function .fxCA.) = (x, L(A.)x). Then .fx(A.) > 0 for A. > A. 1 , because the leading coefficient of L(A.) is positive definite. On the other hand,fiA. 1) = 0 (for x is an eigenvector of L(A.) corresponding to A. 1 ). So .f~(A. 1 ) ~ 0. But in view of Theorem 10.14, sgn(x, A. 1) is just.f~(A. 1 ), maybe multiplied by a positive constant. Hence sgn(x, A. 1) = 1. The same argument is applicable to the eigenvector y of A.,. D
These examples are extremal in the sense that the signs can be determined there using only the eigenvalues with their partial multiplicities, without knowing anything more about the matrix polynomial. The next example shows a different situation. ExAMPLE 10.7.
Let a < b < c < d be real numbers, and let . [(A. -- a)(A. - b) L(A.) = 0
J
0 . (A. - c)(A. - d)
Then the sign characteristic of the eigenvalues a, b, c, d is -1, 1, -1, 1, respectively (as follows from Example 10.5). Interchanging in L(A.) the places of A. - band ..1 - c, we obtain a new self-adjoint matrix polynomial with the same eigenvalues and partial multiplicities, but the sign characteristic is different: - 1, - 1, 1, 1 for a, b, c, d, respectively.
D
10.
276
GENERAL THEORY
10.6. Numerical Range and Eigenvalues
For a matrix polynomial M(A) define the numerical range NR(M) as NR(M)
= {A E ~I(M(A)x, x) = 0 for some x E ~"\{0}}
(10.52)
This definition is a natural generalization of the notion of numerical range for a simple matrix A, which is defined as N A= {(Ax, x)lx E ~·. llxll = 1}. Indeed, {(Ax,x)lxE~",
llxll = 1} = NR(IA- A).
It is well known that for any matrix A the set N A is convex and compact. In general, the set N R(M) is neither convex nor compact. For example, for the scalar polynomial M(A) = A(A- 1) we have NR(M) = {0, 1}, which is obviously nonconvex; for the 2 x 2 matrix polynomial M(A) =
[~ ~]
we have NR(M) = IR, which is noncompact. It is true that NR(M) is closed (this fact can be checked by a standard argument observing that NR(M)
=
{AE~I(M(A)x,x)
=
OforsomexE~",
llxll = 1}
and using the compactness of the unit sphere in ~"). Moreover, if M(A) is monic, then NR(M) is compact. Indeed, write M(),) = nm + I,j=-01 Mi)..i. Let a = max{m · max 0 s;;s;m- d lA II AEN M.}, 1}. Clearly, a is a finite number, and for IAI >a and x E ~·. llxll = 1 we have IAm(x,x)l = IAim > m max {IAIIAENM.}IAim- 1 Os;is;m-1
so Af/:NR(M). We shall assume in the rest of this section that M(A) = L(A) is a monic self-adjoint matrix polynomial, and study the set NR(L) and its relationship with a{L). We have seen above that NR(L) is a compact set; moreover, since L(A) is self-adjoint the coefficients of the scalar polynomial (L(A)x, x), x E ~· \{0} are real, and therefore N R(L) is symmetric relative to the real line: if Ao E NR(L), then also I 0 E NR(L). Evidently, NR(L) ::J a(L). The following result exhibits a close relationship between the eigenvalues of L(A) and the numerical range NR(L), namely, that every real frontier (or boundary) point of NR(L) is an eigenvalue of L(A). Theorem 10.15. Let L(A) be a monic self-adjoint matrix polynomial, and let Ao E NR(L) n (!R\NR(L)). Then Ao is an eigenvalue of L(A).
10.6.
NUMERICAL RANGE AND EIGENVALUES
277
Proof By assumption, there exists a sequence of real numbers A. 1 , A. 2 , ..• such that limm-+ oo Am = A. 0 and A.i ~ N R(L). This means that the equality (L(A.Jx, x) = 0 holds only for x = 0; in particular, either (L(A.Jx, x) 2 0 for all x E C", or (L(A.Jx, x) :s; 0 for all x E C". Suppose, for instance, that the former case holds for infinitely many A.i. Passing to the limit along the (infinite) set of such A.i, we find that (L(A. 0 )x, x) 2 0 for all x E C". Since A. 0 E NR(L), we also have for some nonzero x 0 E C", (L(A. 0 )x 0 , x 0 ) = 0. It follows that the hermitian matrix L(A. 0 ) is singular, i.e. A. 0 E a'(L). D Since the number of different real eigenvalues of L(A.) is at most nl, where l is the degree on L(A.), the following corollary is an immediate consequence of Theorem 10.15. Corollary 10.16. Let L(A.) be as in Theorem 10.15. Then the set NR(L) n is a finite union of disjoint closed intervals and separate points:
~
(1 0.53)
where 111 < Jlz < ... < Jlzk and vj~ U7=1 [Jlzi~1, JlzJ, vh i= vhfor j1 i= jz. The number k of intervals and the number m of single points in (10.53) satisfy the inequality 2k
+ m :s;
nl,
where l is the degree of L(A.). We point out one more corollary of Theorem 10.15. Corollary 10.17. A monic selFadjoint matrix polynomial L(A.) has no real eigenvalues !land only ifNR(L) n ~ = 0.
Comments Self-adjoint matrix polynomials of second degree appear in the theory of damped oscillatory systems with a finite number of degrees of freedom ([17, 52b ]). Consideration of damped oscillatory systems with an infinite number of degrees of freedom leads naturally to self-adjoint operator polynomials (acting in infinite dimensional Hilbert space), see [51]. Self-adjoint operator polynomials (possibly with unbounded coefficients) appear also in the solution of certain partial differential equations by the Fourier method. The results of this chapter, with the exception of Sections 10.4 and 10.6, are developed in the authors' papers [34d, 34f, 34g]. The crucial fact for the theory, that C 1 is B-self-adjoint (in the notation of Section 10.1 ), has been used elsewhere (see [51, 56a, 56d]). The results of Section 10.4 are new. The numerical range for operator polynomials was introduced (implicitly) in [67] and further developed in [56a]. A more general treatment appears in [ 44].
Chapter 11
Factorization of Self-Adjoint Matrix PolynoDlials
The general results on divisibility of monic matrix polynomials introduced in Part I, together with the spectral theory for self-adjoint polynomials as developed in the preceding chapter, can now be combined to prove specific theorems on the factorization of self-adjoint matrix polynomials. Chapter 11 is devoted to results of this type. 11.1. Symmetric Factorization
We begin with symmetric factorizations of the form L = L'fL 3 L 1 , where = L 3 • Although such factorizations may appear to be of a rather special kind, it will be seen subsequently that the associated geometric properties described in the next theorem play an important role in the study of more general, nonsymmetric factorizations. As usual, C 1 stands for the first companion matrix of L, and B is defined by (10.3). L~
Theorem 11.1. Let L be a monic self-adjoint polynomial of degree l with a monic right divisor L 1 of degree k :s; l/2. Then L has a factorization L = Li L 3 L 1 for some monic matrix polynomial L 3 if and only if (Bx, y) = 0
for all
x, y E.#,
where .A is a supporting subspace for L 1 (with respect to C 1 ). 278
(11.1)
11.2. MAIN THEOREM
279
We shall say that any subspace .A for which (11.1) holds is neutral with respect to B, or is B-neutral. To illustrate the existence of B-neutral subspaces consider a monic right divisor L 1 for which a'(L 1 ) contains no pair of complex conjugate points. If L = L 2 L 1 , then a(L) = a(L 1 ) u a(L 2 ) and a(L 1 ) n a(L!) = 0. If Lis self-adjoint, then L = L 2 L 1 = L!Li and it follows that L 1 is a right divisor of Li. Hence there is a matrix polynomial L 3 for which LfL 3 L 1 = L. Consequently (11.1) holds.
Proof Let L = L 2 L 1 = LfLi, and L 1 have supporting subspace .A with respect to C 1 • It follows from Theorem 3.20 that Li has supporting subspace .AJ. with respect to Cf. But now, by Theorem 10.1 Cf = BC 1 B- 1 . It follows that the supporting subspace % for Li with respect to C 1 is given by%= B- 1 .A1.. Now observe that k ~ 1/2 implies that L = LfL 3 L 1 if and only if L 1 divides Li. Thus, using Corollary 3.15, such a factorization exists if and only if .A c: % = B- 1 .A1., i.e., if and only if ( 11.1) is satisfied. D Theorem 11.1 suggests that B-neutral subspaces play an important role in the factorization problems of self-adjoint operator polynomials. Formulas of Section 10.3 provide further examples of B-neutral subspaces. For example, the top left zero of the partitioned matrix in (10.26) demonstrates immediately that the generalized eigenspaces corresponding to the non-real eigenvalues A. 1 , .•• , A.a determine a B-neutral subspace. These eigenspaces are therefore natural candidates for the construction of supporting subspaces for right divisors in a symmetric factorization of L. In a natural extension of the idea of a B-neutral subspace, a subspace Y' of flnt is said to be B-nonnegative (B-nonpositive) if (Bx, x) 2: 0 ( ~ 0) for all x E Y'. 11.2. Main Theorem For the construction of B-neutral subspaces (as used in Theorem 11.1) we introduce sets ofnonreal eigenvalues, S, with the property that S n S = 0, i.e., such a set contains no real eigenvalues and no conjugate pairs. Call such a subset of a(L) a c-set. The main factorization theorem can then be stated as follows (the symb0l [a] denotes the greatest integer not exceeding a): Theorem 11.2.
L
Let L be a monic matrix polynomial of degree I with
= L*. Then
(a) There exists a B-nonnegative (B-nonpositive) invariant subspace .A 1 (.A 2 ) of C 1 such that, if k = [t/],
d.
tm .A 1
=
{
(k
dim .A 2 = kn
kn
+ 1)n
if if
I is even I is odd,
11. FACTORIZATION OF SELF-ADJOINT MATRIX POLYNOMIALS
280
(b) Lhasa monic right divisor L 1(L 2 ) with supporting subspace A with respect to C 1 such that
deg L 1 =
{k ~ 1
if if
1
(A2 )
lis even lis odd,
deg L 2 = k. In either case the nonreal spectrum of L 1 and of L 2 coincides with any maximal c-set of a(L) chosen in advance.
In order to build up the subspaces referred to in part (a) the nonreal and real parts of a(L) are first considered separately, and in this order. Theorem 10.6 implies the existence of a self-adjoint triple (X, J, Y) for L, with Y = P•. JX*, and the columns of X determine Jordan chains for L. Let x 1 , •.• , x P be such a Jordan chain associated with the Jordan block J 0 of J. A set of vectors x 1 , .•. , x" r ::::;; p, will be described as a set of leading vectors of the chain, and there is an associated r x r Jordan block which is a leading submatrix of J 0 • Now let 3, be a submatrix of X whose columns are made up of leading vectors of Jordan chains associated with a c-set. Then let Jc be the corresponding Jordan matrix (a submatrix of J) and write (11.2)
The following lemma shows that Im Qc is a B-neutral subspace. Maximal subspaces for L of one sign will subsequently be constructed by extending lm Qc.
Lemma 11.3.
With Qc as defined in (11.2), Q:BQc = 0.
Proof Bearing in mind the construction of the Jordan matrix J in (10.12) and in particular, the arrangement of real and nonreal eigenvalues, it can be BCc is just a submatrix of the top-left entry in the partitioned seen that matrix of (10.26). D
Q:
Consider now the real eigenvalues of L. Suppose that the Jordan block J 1 of J is associated with a real eigenvalue A. 0 and that J 1 is p x p. Then A.0 has an associated Jordan chain x 1 , ••• , xP with x 1 -::f 0, and these vectors form the columns of the n x p matrix X 1 • There is a corresponding chain
x1 , ..• , .XP associated with the eigenvalue A. 0 of C 1 and Q1 = [x 1
•••
xp] = col[X J~Jl;; &.
The definition of a self-adjoint triple shows that x 1 , so that
.•• ,
xP can be chosen
(11.3)
11.2.
281
MAIN THEOREM
where Pis the p x p sip matrix. In particular, observe that (x 1, Bxp) = ± 1. We say that the pair (). 0 , x 1), A. 0 E IR, is of the first (second) kind according as (x 1 , Bxp) = 1 or -1, respectively. Then let !/' 1 and //'2 be the sets of all pairs of the first and second kinds, respectively. It follows from Theorem 10.14 that (11.4) when the chain has length one (resp. more than one), i.e., when the associated elementary divisor of L is linear (resp. nonlinear). We shall describe x 1 as either B-positive, B-negative, or B-neutral, with the obvious meanings. Suppose that the pair (A. 0 , x 1 ), A. 0 E IR, determines a Jordan chain of length k, let r = [t(k + 1)], and consider the first r leading vectors of the chain. It is apparent from (11.3) that for for
k even k odd
(11.5)
where E = diag[O, ... , 0, 1] and is r x r. This relation can now be used to construct subspaces of vectors which are B-nonnegative, B-nonpositive, or B-neutral, and are invariant under C 1 • Thus, fork odd, Span {:X, ... , x,} is B-nonnegative or B-nonpositive according as (A. 0 , x 1 ) is a pair of the first or second kind. Furthermore, if k > 1, Span{x 1, .•. , x,_ dis aB-neutral subspace.lfk is even, then Span{x 1, •.. , :X,}, r = -!-k, is B-neutral for (A. 0 , x 1 ) of either the first or second kind. These observations are now used to build up maximal B-nonnegative and B-nonpositive invariant subspaces for C 1 (associated with real eigenvalues only in this instance). First, basis vectors for a B-nonnegative invariant subspace 2 1 are selected as follows (here [x 1 .; • • • xk,,;] = col(XJi)~-}0 , where J; is a Jordan block of J with a real eigenvalue A;, and X; is the corresponding part of X): (a) If(A.;,x 1.JE 9'1,selectx 1.;, ... ,x,,,;wherer; = [-!-(k; + 1)]. (b) If(A;, X 1,;) E ,C/2 and k; is odd, select X1,i• ... , Xr,-1,i· (c) lf(A;, x 1 _;) E !/'2 and k; is even, select x 1 ,;, ... , X,,,;· Second, basis vectors for a B-nonpositive invariant subspace 2 as follows: (a) lf(A;, x 1,;) E (b) lf(A;, X 1,;) E (c) lf(A;,x 1 ,;) E
select X1,i• ... , x,,,;. and k; is odd, select X1,i• ... , Xr,-1,i· and k; is even, select X1,i• ... , Xr,,i·
!/'2, !/'1 9'1
2
are selected
11.
282
FACTORIZATION OF SELF-ADJOINT MATRIX POLYNOMIALS
Then it is easily verified that dim !£' 1 + dim !£' 2 = p, the total number of real eigenvalues (with multiplicities). It is also clear that !£' 1 n !£' 2 = {0} if and only if all real eigenvalues have only linear elementary divisors. Now let !£' represent either of the subspaces !£' 1 or !£' 2 , and form a matrix QR whose columns are the basis vectors for !£' defined above. Then QR has full rank, Im QR = !£',and QR has the form (11.6)
where 3R is a matrix whose columns are Jordan chains (possibly truncated) for L, and J R in the associated Jordan matrix. It follows from these definitions and (11.5) that (11.7)
where D is a diagonal matrix of zeros and ones, each one being associated with an elementary divisor of odd degree, and the plus or minus sign is selected according as !£' = !£' 1 or !£' = !£' 2 . 11.3. Proof of the Main Theorem
In general, L will have both real and nonreal eigenvalues. It has been demonstrated in the above construction that certain invariant subspaces !£' 1 and !£' 2 of C 1 associated with real eigenvalues can be constructed which are B-nonnegative and B-nonpositive, respectively. However, in the discussion of nonreal eigenvalues, B-neutral invariant subs paces of C 1 were constructed (as in Lemma 11.3) using c-sets of nonreal eigenvalues. Consider now a maximal c-set and the corresponding matrix of Jordan chains for C 1 , Qc of Eq. (11.2). Then Im Qc is B-neutral and the direct sums .A 1 = !£' 1 Im Qc, .A2 = !£' 2 Im Qc will generally be larger Bnonnegative, B-nonpositive invariant subspaces, respectively. Indeed, max(dim .A 1 , dim .A 2 ) ~ kn, where k = [!1]. Construct composite matrices
+
3 = [3R
+
3cJ,
K
= JR EB J<>
S,
= col[3Ki]'j:6,
(11.8)
the S, being defined for r = 1, 2, ... , l. Then S 1 is a submatrix of Q and consequently has full rank. Furthermore, the subspace .A = Im S 1 = Im Qc
+ Im QR = Im Qc + !£'
is an invariant subspace of C 1 which is B-nonnegative or B-nonpositive according as Im QR = 2 1 or Im QR = !£'2 . In order to prove the existence of a right divisor of L of degree k = [!lJ we use Theorem 3.12 and are to show that the construction of 3 implies the invertibility of Sk. In particular, it will be shown that 3 is an n x kn matrix, i.e., dim .A = kn.
11.3.
283
PROOF OF THE MAIN THEOREM
Our construction of 2 ensures that S1 is a submatrix of Q = col(XJ;)~:6 and that
St BSI = [2*
I][
K*2*
2
~ 2~1-1
l
± fj
(11.9)
-
where fj = D E8 0 is a diagonal matrix of zeros and ones and the plus or minus sign applies according as !£' = !£' 1 or !£' = !£' 2 . Let X E Ker sk = n~=l Ker(2K') and make the unique decomposition x = x 1 + x 2 where x 1 E Im D, x 2 E Ker fj and, for brevity, write y, = 2K'x, r = 0, l, 2, .... Use (11.9) to write x*StBS 1x as follows: ±x'J'Dx 1 = [0
...
A1
0 Yt
. ..
Ak
Ak+ 1
I
0
I
0
Yk
0
Y1-1
YT-1]
X
I
I
= {Y;Yk
0 when when
lis even I is odd.
(11.10)
Case (i), I even. Equation ( 11.10) implies x 1 = 0 so that x = x 2 E Ker D. (Note that if fj = 0 this statement is trivially true.) Consider the image of x under S( BS 1 using (11.9), and it is found that
l[ l
~ ~k
0 Yt~
= 0. (11.11)
1
yt
Taking the scalar product of this vector with Kx it is deduced that Yk = 0 whence y" = 0 and x E Ker(2Kk). Then take the scalar product of (11.11) with K 3 x to find Yt+tYk+ 1 = 0 whence YHl = 0 and xEKer(2Kk+ 1 ). Proceeding in this way it is found that X En~=~ Ker(2Kk) = Ker Sl. Hence X = 0 so tnat Ker sk = {0}. This implies that the number of columns in sk (and hence in does not exceed kn, and hence that dim ·"It s kn. But this argument applies equally when ..lt = .41 1 , or j f = j { 2 (jf; = 2 1 lm Qc), and we have dim j / 1 + dim j/ 2 =' In = 2kn. thus it follows that dim j{ 1 = dim j f 2 = kn.
sa
+
11.
284
FACTORIZATION OF SELF-ADJOINT MATRIX POLYNOMIALS
Furthermore, sk must be kn X kn and hence nonsingular (whether A = A 1 or A = A 2 is chosen). Since [I k 0] IAt has the representation Sk it follows that there is a right monic divisor L 1 of L of degree k. Then
L=
L2L 1
= LiLi,
and A is an eigenvalue of L 1 if and only if A E a( C 1I.At), which is the union of (any) maximal c-set of nonreal eigenvalues with certain real eigenvalues, as described in Section 11.2 (in defining the invariant subspaces 2 1, 2 2 of C1).
+
Case (ii), l odd. (a) Consider A 2 = 2 2 Im Q, which, by construction, is B-nonpositive so that the negative sign applies in (11.10). There is an immediate contradiction unless x 1 = 0 and Yk = 0, i.e., x E Ker(EKk) r. Ker fJ. Now it is deduced from ( 11.9) that Ak+2
[E*
· · · K*k- 12*] [
..·.· 0I1[Yk+: 11
.
. I
0
.
.
0
Yt+1
= 0
.
(11.12)
Taking scalar products with K x, K x, ... , successively, it is found (as in case (i)) that X En~-:~ Ker(EK') = Ker sl, and hence X= O.lt follows that dim A 2 =:;; kn. (b) Since dim A 1 +dim A 2 = ln, we now have dim A 1 ;;:::: (k + 1)n. Consider the implications of (11.10) for A 1 (when the plus sign obtains). Let X E Ker sk+1 = n~=O Ker(EK') and write X= x1 + x2, x1 E Im IJ, x 2 E Ker fJ and (11.10) reduces to 2
4
0
Yt-1 whence x 1 = 0 and x = x 2 E Ker fJ. As in paragraph (a) above it is found that Eq. (11.12) now applies to A 1. Consequently, as in case (a), x E Ker S 1, whence x = 0. Thus, dim A 1 =:;; (k + 1)n and, combining with the conclusion dim A 2 =:;; kn, of part (a), we have dim A 1 = (k + 1)n,
dim A 2 = kn
and Sk+ 1 (defined by A 1) and Sk (defined by A 2) are nonsingular. Since [Ik+ 1, O]I.At, = Sk+ 1, [Ib O]I.At 2 = Sb it follows that A 1, A 2 determine monic right divisors of L of degrees k + 1, k, respectively. This completes the proof of Theorem 11.2. D
11.4.
285
DISCUSSION AND FURTHER DEDUCTIONS
11.4. Discussion and Further Deductions
Let us summarize the elementary divisor structure of a matrix polynomial L and the factors described in Theorem 11.2. If L has the factorization L = Q 1L 1 we tabulate (see Table I) the degrees of elementary divisors (il - il 0 )j of Q 1, L 1 resulting from the presence of such a divisor for L. Consider the case in which L has supporting subspace .# 1 and an associated c-set of nonreal eigenvalues. We assume that .11 1 is constructed as in the proof of Theorem 11.2 and that S is the maximal c-set chosen in advance. The presence of the entries in the "divisor of L 1" column is fully explained by our constructions. The first two entries in the "divisor of Q 1" column are evident. Let us check the correctness of the last three entries in the last column. To this end, recall that in view of Theorem 3.20, Q 1(il) is a right divisor of L * = L with supporting subspace At with respect to the standard triple ([0, ... ,0,1],
q,
col(t5i1J)l=l)
of L; therefore, the elementary divisors ofQ 1(il) and Jil - Cfi.At~ are the same. Now let (X, J, Y) be the self-adjoint triple of L used in the proof of Theorem 11.2. The subspace At can be easily identified in terms of X, J and the selfadjoint matrix B given by (10.3). Namely, At = Bf!l', where f!l' is the image of the nl x nk matrix (where k = [tlJ) formed by the following selected columns of col(XJi)l:A: (a) select the columns corresponding to Jordan blocks in J with nonreal eigenvalues outside S; (b) in the notation introduced in Section 11.1, if (ili, xu) E Y' 1, select xl,i'
0
••
(c) (d)
'
xki-ri,i;
if(ili, if (ili,
Xu) E g' 2 Xu) E g' 2
and ki is odd, select Xl,i• ... , Xk,-r,+ l,i; and ki is even, select X1, i• ... , Xk,-r,, i.
It is easily seen that this rule ensures the selection of exactly nk columns of col(Xf)l:A. Furthermore, Lemma 11.3 and equality (11.7) ensure that Bf!l' is orthogonal to .11 1, and since dim(Bf!l') + dim .It 1 = nl, we have Bf!l' = At. TABLE I
Divisor of L
Divisor of L 1
Divisor of Q 1
(). - Ao)i (). - Ao)i
1 (). - Ao)' (). - ).o)'
(). - Ao)'
().- Aoy-t
(). - Ao)'
().- ;.oy- 1
286
11.
FACTORIZATION OF SELF-ADJOINT MATRIX POLYNOMIALS
Taking adjoints in
C 1 col( X Ji)L;; A = col( X J;)l;: AJ and using the properties of the self-adjoint triple: Y* = XP,, 1 , P,, 1 J = J*P,, 1 , P,, 1 J* = JP,, 1 , we obtain
P,, 1 RC! = J*P,, 1 R,
(11.13)
where R = [Y, J Y, ... , 1 1- 1 Y]. Rewrite (11.13) in the form
CfR- 1 P,, 1 = R- 1P,, 1 J*, and substitute here R - t = BQ, where Q = col (X Ji):: A:
CTBQP,,J = BQP,,JJ*.
(11.14)
Use equality (11.14) and the selection rule described above for !!Z to verify the last three entries in the "divisor of Q1" column. Analogously, one computes the elementary divisors of the quotient Q2 = LL2 1 , where L 2 is taken from Theorem 11.2. We summarize this discussiorr in the following theorem. Theorem 11.4. Let L be a monic self-adjoint matrix polynomial of degree l, and let A. 1 , .•• , A., be all its different real eigenvalues with corresponding partial multiplicities mil, ... , m;,.,, i = I, ... , r, and the sign characteristic B = (e;) where j = 1, ... , s;, i = 1, ... , r. Let S be a maximal c-set. Then there exists a monic right divisor L 1 (L 2 ) of L with the following properties:
(i) deg L 1 = [(l + 1)/2], deg L 2 = l - [(l + 1)/2]; (ii) the nonreal spectrum of L 1 and of L 2 coincides with S; (iii) the partial multiplicities of L 1 (resp. of the quotient Q 1 = LL1 1 ) corresponding to A.; (i = 1, ... , r) are t(m;i + (iie;), j = 1, ... , s;, (resp. t(mii - (;jB;), j = 1, ... , s;), where (;i = 0 if mii is even, and (ii = 1 if m;i is odd; (iv) the partial multiplicities of L 2 (resp. of the quotient Q2 = LL2, 1 ) are t(mii - (;iB;), j = 1, ... , s; (resp. t(m;i + (iie;), j = 1, ... , s;). Moreover, the supporting subspace of L 1 (resp. L 2 ) with respect to C 1 is B-nonnegative (resp. B-nonpositive).
Going back to the proof of Theorem 11.2, observe that the argument of case (ii) in the proof can be applied to AI" = Im Qc where Qc corresponds to the c-set of all eigenvalues of L in the open half-plane ~m A. > 0. Since AI" is B-neutral it is also B-nonpositive and the argument referred to leads to the conclusion that dim AI" ~ kn. But this implies that, for l odd, the number of real eigenvalues is at least n, and thus confirms the last statement of Theorem 10.4.
11.4.
DISCUSSION AND FURTHER DEDUCTIONS
287
Corollary 11.5. If L has odd degree and exactly n real eigenvalues (counting multiplicities), then these eigenvalues have only linear elementary divisors and there is a factorization L(A)
= M*(A)(I A + W)M(A),
where M is a monic matrix polynomial, u(M) is in the half-plane .fm A > 0, and W* = W. Proof Theorem 10.4 implies that L has exactly n elementary divisors associated with real eigenvalues, and all of them are linear. By Proposition 10.2, LiiBii = n, where B = (eii) is the sign characteristic of L. Since the number of signs in the sign characteristic is exactly n, all signs must be + 1. Now it is clear that the subspace % of the above discussion has dimension kn and determines a supporting subspace for L which is B-neutral. The symmetric factorization is then a deduction from Theorem 11.1. 0
In the proof of Corollary 11.5 it is shown that the real eigenvalues in Corollary 11.5 are necessarily of the first kind. This can also be seen in the following way. Let Ao be a real eigenvalues of L with eigenvector x 0 • Then M(A 0 ) is nonsingular and it is easily verified that (x 0 , L 0 l(A 0 )x 0 ) = (x 0 , M*(A 0 )M(Ao)xo) =(yo, Yo)
where Yo = M(A 0 )x 0 • Hence (x 0 , Vll(A 0 )x 0 ) > 0, and the conclusion follows from equation (11.4). Corollary 11.6. If L of Theorem 11.2 has even degree and the real eigenvalues of L have all their elementary divisors of even degree, then there is a matrix polynomial M such that L(A) = M*(A)M(A). Proof This is an immediate consequence of Theorems 11.1 and 11.2. The hypothesis of even degree for the real divisors means that the supporting subspace .A constructed in Theorem 11.2 has dimension lln and is B-neutral. 0 EXAMPLE 11.1. Consider the matrix polynomial L of Example 10.4. Choose the c-set, {i} and select E from the columns of matrix X of Example 10.4 as follows:
_ [10 0OJ '
O::.c
=
and then
K=[1]Ef>[-1]EB[~ ~l
288
11.
We then have A
1
= 2 2
1
FACTORIZATION OF SELF-ADJOINT MATRIX POLYNOMIALS 1
+ Im Q
0
a B-neutral subspace, and -1
=1m [-1 -1
Qc =
lm[~
-1 -1 -1
-1 1
-a -i]
-1 -1 -i 2i 2i -3 -3
1 0
and S 2 = col[SK'];=o is given by
-r-~ - ~1 1
Sz-
-1 1 1 -1
i i
1 1
and
-1
1
1 [ -1 s-12 -4 2
1 2
-2i -2i
-~ -~] 0
0 .
2
2
Computing SK 2 S- 1 it is then found that A. LI(A.)= [
2 -
iA.-1 0
It is easily verified that L(A.) = Lf(A.)L 1 (A.).
J
-iA. A.z-iA.-1.
0
The special cases l = 2, 3 arise very frequently in practice and merit special attention. When l = 2 Theorem 11.2 implies that the quadratic bundle (with self-adjoint coefficients) can always be factored in the form
V.. 2
+ A 1 A. + A 0
= (IA.- Y)(IA.- Z).
(11.15)
If the additional hypothesis of Corollary 11.6 holds, then there is such a factorization with Y = Z*. Further discussions of problems with l = 2 will be found in Chapter 13. When l = 3 there are always at least n real eigenvalues and there are matrices Z, B 1 , B 2 for which
If there are precisely n real eigenvalues, then it follows from Corollary 11.5 that there are matrices Z, M with M* = M and
Furthermore, a(! A.- Z) is nonreal, a(! A.+ M) is just the real part of a(L(A.)) and the real eigenvalues necessarily have all their elementary divisors linear. EXAMPLE 11.2. Let l = 2 and suppose that each real eigenvalue has multiplicity one. Then the following factorizations (11.15) are possible:
L(A.) = (lA. - Y;)(IA. - Z;),
11.4.
DISCUSSION AND FURTHER DEDUCTIONS
289
i = 1, 2, in which V.. - Z 1 has supporting subspace which is maximal B-nonnegative and that of V..- Z 2 is maximal B-nonpositive, while u(L) is the disjoint union of the spectra of V..- Z 1 and V..- Z 2 . In this case, it follows from Theorem 2.15 that Z 1 , Z 2 form a complete pair for L.
Comments
The presentation of this chapter is based on [34d, 34f, 34g]. Theorem 11.2 was first proved in [56d], and Corollary 11.5 is Theorem 8 in [56c]. Theorem 11.4 is new. The notation of" kinds" of eigenvectors corresponding to real eigenvalues (introduced in Section 11.2) is basic and appears in many instances. See, for example, papers [17, 50, 51]. In [69] this idea appears in the framework of analytic operator valued functions.
Chapter 12
Further Analysis of the Sign Characteristic
In this chapter we develop further the analysis of the sign characteristic of a monic self-adjoint matrix polynomial L(A.), begun in Chapter 10. Here we shall represent the sign characteristic as a local property. This idea eventually leads to a description of the sign characteristic in terms of eigenvalues of L(A. 0 ) (as a constant matrix) for each fixed A. 0 E IR (Theorem 12.5). As an application, a description of nonnegative matrix polynomials is given in Section 12.4. 12.1. Localization of the Sign Characteristic
The idea of the sign characteristic of a self-adjoint matrix polynomial was introduced as a global notion in Chapter 10 via self-adjoint triples. In this section we give another description of the sign characteristic which ultimately shows that it can be defined locally. Theorem 12.1. Let L 1(A.) and LiA.) be two monic self-adjoint matrix polynomials. If A. 0 E a(L 1 ) is real and
Ly>(A. 0 ) = L~l(A. 0 ), i = 0, 1, ... , y, where y is the maxima/length of Jordan chains of L 1(A.) (and then also of L 2 (A.)) corresponding to A. 0 , then the sign characteristics of L 1 (A.) and LiA.) at A. 0 are the same. 290
12.1.
291
LOCALIZATION OF THE SIGN CHARACTERISTIC
By the sign characteristic of a matrix polynomial at an eigenvalue we mean the set of signs in the sign characteristic, corresponding to the Jordan blocks with this eigenvalue. It is clear that Theorem 12.1 defines the sign characteristic at A. 0 as a local property of the self-adjoint matrix polynomials. This result will be an immediate consequence of the description of the sign characteristic to be given in Theorem 12.2 below. Let L(A.) be a monic self-adjoint matrix polynomial, and let A. 0 be a real eigenvalue of L(A.). For x E Ker L(A. 0 ) \ {0} denote by v(x) the maximal length of a Jordan chain of L(A.) beginning with the eigenvector x of A.0 . Let 'P;, i = 1, ... , y (y = max{v(x)lxEKer L(A. 0 )\{0}}) be the subspace in Ker L(A.0 ) spanned by all x with v(x) ~ i. Theorem 12.2.
Fori
!;(x, y) =
=
1, ... , y, let A. 0 be a real eigenvalue of Land
(x, ±~ LUl(A. )y
J.
1-
0
X,
E
y 'P;,
where y = y
--+
'P; such that
= (x, G;y),
(iii) 'P;+ 1 = Ker G; (by definition, 'I'y+ 1 = {0}); (iv) the number of positive (negative) eigenvalues ofG;, counting according to the multiplicities, coincides with the number of positive (negative) signs in the sign characteristic of L(A.) corresponding to the Jordan blocks of A. 0 of size i.
Proof Let (X, J, Y) be a self-adjoint triple of L(A.). Then, by Theorem 10.6 X= [I 0 · · · O]S- 1 , J = SC 1 S-t, Y = S[O · · · 0 I]T, and B = S* P,,JS.It is easy to see that in fact s- 1 = col(XJi>:;:b (see, for instance, (1.67)). Recall the equivalence E(A.)(IA- C1)F(A.) = L(A.) ElHn
(12.1)
(see Theorem 1.1 and its proot), where E(A.) and F(A.) are matrix polynomials with constant nonzero determinants, and
0 I
... OJ ... 0 I
Equation (12.1) implies that the columns of then x r matrix X 0 form a Jordan chain for L(A.) corresponding to A. 0 if and only if the columns of the In x r
12.
292
FURTHER ANALYSIS OF THE SIGN CHARACTERISTIC
matrixcol[X 0 Jb]~~~formaJordanchainfor.U- C 1 ,whereJ 0 isther x r Jordan block with eigenvalue ..1. 0 (see Proposition 1.11). Now we appeal to Theorem S5.7, in view of which it remains to check that
( x,
±~
j= 1
Vil(A.o)y
J.
i>) =
(x, Bp
(12.2)
where x = col(A.bx)~:~, [y< 1 > · · · p
Al
Al-l
I
Az
B= A1-1 I
I
0
0
0
As usual, the self-adjoint matrices Ai stand for the coefficients of L(A.): L(A.) = H 1 + L:~:~ AiA.i. But this is already proved in (10.51). D / 1
Let us compute now the first two linear transformations G 1 and G 2 • Since (x, y) = (x, L'(A. 0 )y) (x, y E 'P 1 = Ker L(A.0 )), it is easy to see that G1
= PlL'(A.o)Pll'¥ 1 •
where P 1 is the orthogonal projector onto 'P 1 . Denote L~> = L
y) = (x, !L~y
+ L~y'),
where y, y' is a Jordan chain of L(A.) corresponding to ..1. 0 . Thus L0 y =
L~y
+ L 0 y'
= L~y
+ L 0 (1-
P 1 )y' = 0
(12.3)
(the last equality follows in view of L 0 P 1 = 0). Denote by Lt : q;n --+ q;n the linear transformation which is equal to L 0 1 on 'Pt and zero on 'P 1 = Ker L 0 . ThenLt L 0 (1- P 1 ) =I- P 1 and(12.3)gives(J- P 1 )y' = -Lri L~y. Now, for x, y E 'P 2 , (x,
L~y')
= (x, L~(P 1 +(I- P 1 ))y') = (x, L~Pd) + (x, L~(J- P 1 )y') = (x, G 1 Pd)
+ (x, L~(l-
P 1 )y') = (x,
L~(J-
P 1 )y')
= (x, -L~Lt L~y),
where the penultimate equality follows from the fact that x E Ker G1 and G1 = G!. Thus / 2 (x, y) = (x, !L~y - L~Lri L~y), and
where P 2 is the orthogonal projector of q;n on 'P 2 • To illustrate the above construction, consider the following example.
12.2.
293
STABILITY OF THE SIGN CHARACTERISTIC
EXAMPLE
12.1.
Let
l
0A . A2 +A
Choose Ao = 0 as an eigenvalue of L(A). Then Ker L(O) = fl 3 , so 'P 1 = fl 3 • Further,
L'(O) =
~~ ~ ~] 0
1 1
and f 1(x, y) = (x, L'(O)y) = xiy 2 + ji 3 ) + xJCy 2 + ji 3 ), where (y 1 , Y2, y 3 ?. L'(O) has one nonzero eigenvalue 2 and
x=
(x 1 , x 2 ,
x3?, y =
Ker L'(O) = '¥ 2 = Span{(l, 0, O?, (0, -1, i?}. Thus there exists exactly one partial multiplicity of L(A) corresponding to Ao = 0 which is equal to 1, and its sign is + 1. It is easily see that y, 0 is a Jordan chain for any eigenvector y E 'P 2 . Thus
f 2 (x, y) = (x, tL"(O)y + L'(O)y') = (x, y)
for
x, y
E
'¥ 2 .
Therefore there exist exactly two partial multiplicities of Ao = 0 which are equal to 2, and their signs are + 1. D
12.2. Stability of the Sign Characteristic
In this section we describe a stability property of the sign characteristic which is closely related to its localization. It turns out that it is not only the sign characteristic of a monic self-adjoint polynomial which is deterfnined by local properties, but also every monic self-adjoint polynomiaii which is sufficiently close, and has the same local Jordan structure, will have the same sign characteristic .More exactly, the following result holds. Theorem 12.3. Let L(A.) = IA 1 + 2:}: 6A_iA i be a self-adjoint matrix polynomial, and let A. 0 E a{L) be real. Then there exists a c5 > Owith the following pr_gperty: if L(A.) = IA 1 + I}:6 A_iAi is a self-adjoint polynomial such that IIAi- A ill < c5, j = 0, ... , l- 1, and if there exists a unique real eigenvalue A. 1 of L(A) in the disk I}. 0 - ).I < c5 with the same partial multiplicities as those of L(A.) at A.0 , then the sign characteristic of l(A.) at A. 1 coincides with the sign characteristic of L(A.) at A. 0 •
We shall not present the proof of Theorem 12.3 because it requires background material which is beyond the scope of this book. Interested readers are referred to [34f, 34g] for the proof.
294
12. FURTHER ANALYSIS OF THE SIGN CHARACTERISTIC
12.3. A Sign Characteristic for Self-Adjoint Analytic Matrix Functions Let L be a monic self-adjoint matrix polynomial. Then (see Theorem S6.3) for real A. the matrix L(A.) has a diagonal form
L(A.) = U(A.) · diag[,u 1 (A.), ... , ,u.(A.)] · V(A.),
(12.4)
where U(A.) is unitary (for real A.) and V(A.) = (U(A.))*. Moreover, the functions ,u;(A.) and U(A.) can be chosen to be analytic functions of the real parameter A. (but in general ,u;(A.) and U(A.) are not polynomials). Our next goal will be to describe the sign characteristic of L(l_) in terms of the functions ,u 1 (A.), ... , .Un(A.). Since the ,U;(A.) are analytic functions, we must pay some attention to the sign characteristic of analytic matrix functions in order to accomplish this goal. We shall do that in this section, which is of a preliminary nature and prepares the groundwork for the main results in the next section. Let n be a connected domain in the complex plane, symmetric with respect to the real axis. An analytic n x n matrix function A(A.) inn is called self-adjoint if A(A.) = (A(A.))* for real A. En. We consider in what follows selfadjoint analytic functions A(A.) with det A(A.) 1:- 0 only, and shall not stipulate this condition explicitly. For such self-adjoint analytic matrix functions A(A.) the spectrum o"(A) = {A. E OJdet A(A.) = 0} is a set of isolated points, and for every A.0 E a(A) one defines the Jordan chains and Jordan structure of A(A.) at A. 0 as for matrix polynomials. We shall need the following simple fact: if A(A.) is an analytic matrix function inn (not necessarily self-adjoint) and det A(A.) 1:- 0, then the Jordan chains of A(A.) corresponding to a given A. 0 E a(A) have bounded lengths, namely, their lengths do not exceed the multiplicity of A. 0 as a zero of det A(A.). For completeness let us prove this fact directly. Let cp 0 , ••• ,
E(A.) = [cp(A.), t/11, · · ·' t/Jn-1]. By (12.5), the analytic function det A(A.) · det E(A.) = det[A(A.)E(A.)] has a zero of multiplicity at least kat A. = A. 0 . Since E(A. 0 ) is nonsingular by construction, the same is true for det A(A.), i.e., A. 0 is a zero of det A(A.) of multiplicity at least k. Going back to the self-adjoint matrix function A( A.), let A. 0 E a(A) be real. Then there exists a monic self-adjoint matrix polynomial L(A.) such that
LW(A. 0 ) = AUl(A. 0 ),
j = 1, ... , y,
(12.6)
12.3.
295
SELF-ADJOINT ANALYTIC MATRIX FUNCTIONS
where y is the maximal length of Jordan chains of A( A.) corresponding to A. 0 (for instance, put L(A.) = I(A.- A. 0)Y+ 1 + L}=o AUl(A.0 )(A.- A. 0 )j). Bearing in mind that x 1 , ... , xk is a Jordan chain of L(A.) corresponding to A. 0 if and only ifx 1 #Oand L(A.o) L'(A.o)
0 0
0 L(A.o)
XI
Xz
= 0, 1 L
1 (k - 2)!
L
L(?c 0 )
xk
and analogously for A(A.), we obtain from (12.6) that L(A.) and A(A.) have the same Jordan chains corresponding to A. 0 . In particular, the structure of Jordan blocks of A(A.) and L(A.) at A. 0 is the same and we define the sign characteristic of A(A.) at A. 0 as the sign characteristic of L(A.) at A. 0 . In view of Theorem 12.1 this definition is correct (i.e., does not depend on the choice of L(A.)). The next theorem will also be useful in the next section, but is clearly of independent interest. Given an analytic matrix function R(A.) in Q, let R*(A.) denote the analytic (in Q) matrix function (R(A:))*. Observe that the notion of a canonical set of Jordan chains extends word for word for analytic matrix functions with determinant not identically zero. As for monic matrix polynomials, the lengths of Jordan chains in a canonical set (corresponding to A. 0 E u(A)) of an anaJytic matrix function A(A.), do not depend on the choice of the canonical set. The partial multiplicities of A(A.) at A. 0 , by definition, consist of these lengths and possibly some zeros (so that the total number of partial multiplicities is n, the size of A(A.)), cf. Proposition 1.13. Observe that for a monic matrix polynomial L(A.) with property (12.6), the partial multiplicities of A(A.) at A. 0 coincide with those of L(A.) at A. 0 . Theorem 12.4. Let A be a self-adjoint analytic matrix function in Q, and let A. 0 E u(A) be real. Let R be an analytic matrix function in Q such that det R(A.0 ) # 0. Then:
(i) (ii)
the partial multiplicities of A and R* AR at A. 0 are the same; the sign characteristics of A and R* AR at A. 0 are the same.
Proof Part (i) follows from Proposition 1.11 (which extends verbatim, together with its proof, for analytic matrix functions). The rest of the proof will be devoted to part (ii). Consider first the case that A = L is a monic self-adjoint matrix polynomial. Let Y be the maximal length of Jordan chains of L corresponding to A. 0 . Let S(A.) = If,!J S;(A.- A. 0)i and T(A.) = Li~J ~(A.- A.0 )j be monic
12.
296
FURTHER ANALYSIS OF THE SIGN CHARACTERISTIC
matric polynomials of degree y + 1 which are solutions of the following interpolation problem (cf. the construction of Lin formula (12.6)): S
for j = 0, 1, ... , y, for j = 0 for j = 1, ... , y.
Then (S* LS)
u- 1So u = diag[J 1• ..• ' Jk] be the Jordan form for S 0 with Jordan blocks J 1 , matrix U. Here f.,l; 1 0 0 f.,l; 1 1;
=
i
••• ,
J k and nonsingular
= 1, ... ' k,
1 f.,l;
0
where J-l; i= 0 in view of the nonsingularity of S 0 . Let J-l;(t), t E [0, tJ be a continuous function such that f.,l;(O) = J-l;, J-t;{t) = 1 and f.,l;(t) i= 0 for all t E [0, tJ. Finally, put F 0 (t) = U diag(J 1(t), ... , Jk(t))U- 1 ,
tE [0? tJ,
where J-l;(t)
1 - 2t f.,l;(t)
1 - 2t
l;(t) = 0
1 - 2t f.,l;(t)
Clearly,F0 (t)isnonsingularforallt E [0, tJ,F 0 (0) = S 0 ,F0 {t) = I. Analogous construction ensures the connection between I and T0 by means of a continuous nonsingular matrix function F 0 (t), t E [t, 1]. Now let F(A., t), t E [0, 1] be a continuous family of monic matrix polynomials of degree y + 1 such that F(A., 0) = S(A.), F(A., 1) = T(A.) and F(A. 0 , t) is nonsingular for every t E [0, 1]. (For example, y
F(A., t) = I(A. - A. 0 )Y+ 1
+ L [t1j + (1 j= 1
- t)Si] (A. - A. 0 )i
+ F 0 (t),
12.3.
297
SELF-ADJOINT ANALYTIC MATRIX FUNCTIONS
where F 0 (t) is constructed as above.) Consider the family M(A., t) = F*(A., t)L(A.)F(A., t),
t E [0, 1],
of monic self-adjoint matrix polynomials. Since det F(A. 0 , t) f= 0, it follows that ..1. 0 E a(M(A., t)) and the partial multiplicities of M(A., t) at ..1. 0 do not depend on t E [0, 1]. Furthermore, j
= 0, ... ' y. (12.7)
Let us show that the sign characteristic of M(A., t) at ..1. 0 also does not depend on t E [0, 1]. Let 'P;(t), i = 1, ... , y, t E [0, 1] be the subspace in Ker M(A. 0 , t) spanned by all eigenvectors x of M(A., t) corresponding to ..1. 0 such that there exists a Jordan chain of M(A., t) corresponding to ..1. 0 of length at least i beginning with x. By Proposition 1.11, and by (12.7): t E [0, 1],
i
= 1, ... , y.
(12.8)
For x, y E 'P;(t) define .f;(x,y;t) =
(x, ±~M(,.t0 ,t)y(i+l-i>), j=
1).
where y = y(l>, y(2), ... , yCi> is a Jordan chain of M(A., t) corresponding to According to Theorem 12.2,
...1. 0 •
.h(x, y; t) = (x, G;(t)y),
where G;(t): 'P;(t) --+ 'P;(t) is a self-adjoint linear transformation. By Theorem 12.2, in order to prove that the sign characteristic of M(A., t) at ..1. 0 does not depend on t E [0, 1], we have to show that the number of positive eigenvalues and the number of negative eigenvalues of G;(t) is constant. Clearly, it is sufficient to prove the same for the linear transformations
(see (12.8)). Let us prove that H;(t), i = 1, ... , y, are continuous functions oft E [0, 1]. Let xil, ... , x;.k, be a basis in 'P;(l). It is enough to prove that all scalar functions (x;p, H;(t)x;q), 1 ::::;; p, q ::::;; k; are continuous. So let us fix p and q, and letx;q = y(l>, y(2), ... , yU>beaJordanchainofL(A.)correspondingtoA. 0 , which starts with X;q· By Proposition 1.11, the vectors k = 1, ... 'i,
(12.10)
298
12.
FURTHER ANALYSIS OF THE SIGN CHARACTERISTIC
form a Jordan chain of M(A., t) corresponding to A.0 which begins with z< 1 > = F- 1 (A. 0 , t)y< 1 >. By Theorem 12.2, we have: (xip• Hlt)xiq) = (F- 1 (A. 0 , t)xip• G;(t)F- 1 (A. 0 , t)xiq) = (F- 1 (A.
0'
t)x.
lP'
~
.!_Mu>(A.0' t)z) '
i...J 'f
j= 1 J.
which depends continuously on tin view of (12.10). Now it is easily seen that the number of positive eigenvalues of Hi(t) and the number of negative eigenvalues of Hi(t) are constant. Indeed, write H;(t) in matrix form with respect to the decomposition 'Pi(1) = 'Pi+ 1 (1) EB ['P;+ 1 (1)].L, bearing in mind that H;(t) is self-adjoint (and therefore the corresponding matrix is hermitian) and that 'Pi+ 1 (1) = Ker H;(t) (Theorem 12.2(iii)): H;(t)
=
[~ H~t)l
Here Hlt): ['Pi+ 1 (1)].l-+ ['Pi+ 1 (1)].l is self-adjoint, nonsingular, and continuous (nonsingularity follows from the equality 'Pi+ 1 (1) = Ker H;(t), and continuity follows from the continuity of Hlt)). Therefore, the number of positive eigenvalues of H;(t) (and, consequently, of H;(t)) does not depend on t, for i = 1, ... , y. Thus, the sign characteristic of M(A., t) at A.0 does not depend on t E [0, 1]. In particular, the sign characteristics of S* LS and T* L T at A.0 are the same. But the sign characteristic ofT* LT at A.0 is the same as that of L and the sign characteristic of S* LS is the same as that of R*LR in view of Theorem 12.1. So Theorem 12.3 is proved for the case that A(A.) is a monic polynomial with invertible leading coefficient. The general case can be easily reduced to this. Namely, let L(A.) be a monic self-adjoint matrix polynomial satisfying (12.6). Let M(A.) be a matrix polynomial such that
MU>(A. 0 )
= RU>(A. 0 ),
j
= 0, ... , y.
By the definition, the sign characteristic of R* AR at A. 0 is defined by M* LM, and that of A is defined by L. But in view of the already proved case, the sign characteristics of L and M* LM at A.0 are the same. D 12.4. Third Description of the Sign Characteristic
Let L(A.) be a monic self-adjoint matrix polynomial. The sign characteristic of L(A.) was defined via self-adjoint triples for L(A.). Here we shall describe the same sign characteristic in a completely different way, namely, from the point of view of perturbation theory.
12.4.
THIRD DESCRIPTION OF THE SIGN CHARACTERISTIC
299
We shall consider L(..l.) as a hermitian matrix which depends on the real parameter ..1. Let ,u 1 (..1), ... , .Un().) be the eigenvalues of L(..l.), when L(..l.) is considered as a constant matrix for every fixed real ..1. In other words, ,U;(..l.) (i = 1, ... , n) are the roots of the equation det(J,u - L(..l.))
= 0.
(12.11)
From Theorem S6.3 it follows that ,u;(..l.) are real analytic functions of the real parameter). (when enumerated properly), and L(..l.) admits the representation (12.4). It is easy to see that ..1 0 E a(L) if and only if at least one of the ,u/). 0 ) is zero. Moreover, dim Ker L(..l. 0 ) is exactly the number of indices (1 :s; j :s; n) such that ,uj(..l. 0 ) = 0. As the next theorem shows, the partial multiplicities of L(..l.) at ..1 0 coincide with the multiplicities of ..1 0 as a zero of the analytic functions ,u 1 (..1), ... , ,ui..l.). Moreover, we can describe the sign characteristics of L(..l.) at ..1 0 in terms of the functions .Ui(..l.). The following description of the sign characteristic is one of the main results of this chapter. Theorem 12.5. Let L = L* be monic, and let ,u 1(..l.), ... , ,ui..l.) be real analytic functions of real ). such that det(I,ui - L(..l.)) = 0, j = 1, ... , n. Let ..1 1 < · · · < ..1, be the different real eigenvalues of L(..l.). For every i = 1, ... , r, write ,uj(..l.) = (). - A;)m'ivii(..l.), where v;j(..l.;) -:f. 0 is real. Then the nonzero numbers among m; 1 , ••• , m;n are the partial multiplicites of L associated with A;, and sgn vii().;) (for mii -:f. 0) is the sign attached to the partial multiplicity mii of L(..l.) at A; in its (possibly nonnormalized) sign characteristic.
Proof Equation (12.4), which a priori holds only for real). in a neighborhood of A;, can be extended to complex). which are close enough to A;, so that U(..l.), Jtj(..l.), and V(..l.) can be regarded as analytic functions in some complex neighborhood of A; in fl. This is possible since U(..l.), ,uj(..l.), and V(..l.) can be expressed as convergent series in a real neighborhood of A;; therefore these series converge also in some complex neighborhood of A;. (But then of course it is no longer true that U(..l.) is unitary and V(..l.) = (U(..l.))*.) Now the first assertion of Theorem 12.5 follows from Theorem 12.4(i). Further, in view of Theorem 12.4(ii) the sign characteristics of L and diag[,uj(..l.)]}= 1 at A; are the same. Let us compute the latter. Choose scalar polynomials fi 1 (..1), ... , fin().) ofthe same degree with real coefficients and with the properties fi)kl(..l.;) = ,u)kl(..l.;) for k = 0, ... , m;i, i = 1, ... , r, j = 1, ... , n. (Here and further we write ,u
300
12.
FURTHER ANALYSIS OF THE SIGN CHARACTERISTIC
description of the sign characteristic of diag[,Uj(.A.)Jj= 1 given in Theorem 12.2 it is easy to see that the first nonzero derivative ,U~k>(Ai) (for fixed i and j) is positive or negative depending on whether the sign of the Jordan block corresponding to the Jordan chain (0, ... , 0, 1, 0, ... , 0)\ 0, ... , 0 ("1" in the jth place) of Ai is + 1 or -1. Thus, the second assertion of Theorem 12.5 follows. D As a corollary to Theorem 12.5 we obtain the next interesting result. Note that in this theorem and elsewhere, as appropriate, the notation sig L(A) (where A is real and A¢ a(L)), is used to denote the signature of the n x n self-adjoint matrix L(A), in the classical sense, i.e., the difference between the number of positive and the number of negative roots of the equation det(J.u - L(A)) = 0. Theorem 12.6. Let L = L *be monic, and let Ao E a(L) be real. Let~ be the number of odd partial multiplicities of L at Ao, having the signs c; 1 , ... , c;~ in the sign characteristic. Then
sig L(A 0
+ 0)
~
- sig L(A0
-
0) = 2 L c;i·
(12.12)
i= 1
Proof Consider the eigenvalues .u 1(A), ... , .Un(A) of L(A). Let a 1 , •.. , an (;;::: 0) be the partial multiplicities of L(A) at Ao arranged in such an order that .uY>(A 0 ) = 0 for i = 0, 1, ... , ai - 1, and .u~a.i>(A 0 ) i= 0 (j = 1, ... , n). This is possible by Theorem 12.5. Then for ai even, .U/Ao + 0) has the same sign as .uPo - 0); for ai odd, the signs of .U/Ao + 0) and .uPo - 0) are different, and .u j(A 0 + 0) is > 0 or < 0 according as .u~a.i>(A 0 ) is > 0 or < 0. Bearing in mind that sig L(A) is the difference between the number of positive and the number of negative .U/A), we obtain now the assertion of Theorem 12.6 by using Theorem 12.5. D Corollary 12.7.
Let L, Ao
and~
sig L(A 0
+ 0)
be as in Theorem 12.6. Then = sig L(A 0
-
0)
(12.13)
if and only if~ is even and exactly ~/2 of the corresponding signs are + 1. In particular, (12.13) holds whenever all the partial multiplicities of L at Ao are even. We conclude this section with an illustrative example. EXAMPLE
12.2. Consider the polynomial L(A.)=[-1.21-2
1]
,1_2-
2 .
12.5.
301
NONNEGATIVE MATRIX POLYNOMIALS
± 1,
±J3, each one with multiplicity 1. It
for for
1 < IA. I <
Os
IA.I <
The polynomial L(A.) has 4 spectral points: is easy to check that sig L(A.)
= {
~ -2
IA.I >
J3
J3
1.
Now it is possible to compute the sign characteristic of L(A.) by using Theorem 12.6. It turns out that
s_..fi = s_ 1 = -1, where by
s.lo
e.J3 = s1 = 1,
we denote the sign corresponding to the eigenvalue A.0 ,
0
12.5. Nonnegative Matrix Polynomials
A matrix polynomial L of size n x n is called nonnegative if
(L(A)f, f)
~
0
for all real A and all f E ([". Clearly, every nonnegative polynomial is selfadjoint. In this section we give a description of monic nonnegative polynomials. Theorem 12.8. For a monic self-adjoint matrix polynomial L(A) the following statements are equivalent:
(i) (ii)
L(A) is nonnegative; L(A) admits a representation L(A) = M*(A)M(A),
(12.14)
where M(A) is a monic matrix polynomial; (iii) L(A) admits the representation (12.14) with (J(M) in the closed upper half-plane; (iv) the partial multiplicities ofL(A)for real points of spectrum are all even; (v) the degree of L(A) is even, and the sign characteristic of L(A) consists only of+ ls. Proo.f Implications (iii)= (ii) = (i) are evident. (i) = (iv). Observe that the analytic eigenvalues }1 1{).), ... , Jln(l,) of L(A) (defined in the preceding section) are nonnegative (for real A), therefore the multiplicities of each real eigenvalue as a zero of J1 1(A), ... , Jln(A) are even. Now apply Theorem 12.5. (i) = (v). Since L(A) is nonnegative, we have the Jl/A) > 0 for real A¢ (J(L), j = 1, ... , n. Let Ao E (J(L). Then clearly the first nonzero derivative
302
12.
FURTHER ANALYSIS OF THE SIGN CHARACTERISTIC
= 1, ... , n, is positive. So the signs are + 1 in view of Theorem 12.5. (v) = (i). It is sufficient to show that Jl/A) c 0 for real )., j = 1, ... , n. But this can easily be deduced by using Theorem 12.5 again. To complete the proof of Theorem 12.8 it remains to prove the implication (iv) =(iii). First let us check that the degree I of L().) is even. Indeed, otherwise in view of Theorem 10.4 L().) will have at least one odd partial multiplicity corresponding to a real eigenvalue, which is impossible in view of (iv). Now let L 1 ().)be the right divisor of L().) constructed in Theorem 11.2, such that its c-set lies in the open upper half-plane. From the proof of Theorem 11.2 it is seen that the supporting subspace of L 1 is B-neutral (because the partial multiplicities are even). By Theorem 11.1. J1}al(). 0),j
L
so (iii) holds (with M = L 1 ).
= L'fL 1 ,
D
We can say more about the partial multiplicities of L().) = M*().)M().). Theorem 12.9. Let M().) be a monic matrix polynomial. Let ). 0 E o{M) be real, and let 11 1 c · · · c IX, be the partial multiplicities of M().) at ). 0 . Then the partial multiplicities of L().) = M*().)M().) at ). 0 are 21X 1 , . . . , 21X,. Proof Let (X, J, Y) be a Jordan triple for M. By Theorem 2.2, (Y*, J*, X*) is a standard triple forM*().), and in view of the multiplication
theorem (Theorem 3.1) the triple [X
0],
[~
YY*] [ 0 J* ' X*
J
is a standard triple of L().). Let J 0 be the part of J corresponding to ).0 , and and let Y0 be the corresponding part of Y. We have to show that the elementary divisors of
are 21X 1 , ... , 21X,. Note that the rows of Y0 corresponding to some Jordan block in J 0 and taken in the reverse order form a left Jordan chain of M().) (or, what is the same, their transposes form a usual Jordan chain of MT().) corresponding to ). 0 ). Let Yb ... , Yk be the left eigenvectors of M().) (corresponding to ). 0 ). Then the desired result will follow from Lemma 3.4 if we show first that the matrix A = col[y;]~= 1 · row[yt]~= 1 is nonsingular and can be decomposed into a product of lower and upper triangular matrices. are linearly independent, Since (Ax, x) c 0 for all x E rtn andy!, ... , the matrix A is positive definite. It is well known (and easily proved by induction on the size of A) that such a matrix can be represented as a product A 1 A 2 of the lower triangular matrix A 1 and the upper triangular matrix A 2
y:
12.5.
NONNEGATIVE MATRIX POLYNOMIALS
303
(the Cholesky factorization). So indeed we can apply Lemma 3.4 to complete the proof of Theorem 12.9. D We remark that Theorem 12.9 holds for regular matrix polynomials M(A.) as well. The proof in this case can be reduced to Theorem 12.9 by considering the monic matrix polynomial M(A.) = A. 1M(r 1 + a)M(a)- 1 , where 1 is the degree of M(A.) and a¢ a(M). Comments
The results presented in Sections 12.1-12.4 are from [34f, 34g], and we refer to these papers for additional information. Factorizations of matrix polynomials of type (12.14) are well known in linear systems theory; see, for instance, [ 15a, 34h, 45], where the more general case of factorization of a self-adjoint polynomial L(A.) with constant signature for all real A. is considered.
Chapter 13
Quadratic Self-Adjoint Polynomials
This is a short chapter in which we illustrate the concepts of spectral theory for self-adjoint matrix polynomials in the case of matrix polynomials of the type L(A) = H 2
+ BA + C,
(13.1)
where B and C are positive definite matrices. Such polynomials occur in the theory of damped oscillatory systems, which are governed by the system of equations 2 L ( -d) x = -d x2
dt
dt
dx + Cx = + Bdt
f,
(13.2)
'
Note that the numerical range and, in particular, all the eigenvalues of L(A), lie in the open left half-plane. Indeed, let (L(A)f,f) = A2 (f,f)
+ A(Bf,f) + (Cf,f) =
0
for some A E C and f # 0. Then A= -(Bf, f)± j(Bf, f) 2 - 4(!, f)(Cf, f) 2(f,f) '
(13.3)
and since (Bf,f) > 0, (Cf,f) > 0, the real part of A is negative. This fact reflects dissipation of energy in damped oscillatory systems. 304
13.1.
305
OVERDAMPED CASE
13.1. Overdamped Case
We consider first the case when the system governed by Eqs. (13.2) is overdamped, i.e., the inequality
(Bf,f) 2
-
4(f,f)(Cf,f) > 0
(13.4)
holds for everyf E C\{0}. Equation (13.3) shows that, in the over-damped case, the numerical range of L(A.) is real and, consequently, so are all the eigenvalues of L(A.) (refer to Section 10.6). Note that from the point of view of applications the overdamped systems are not very interesting. However, this case is relatively simple, and a fairly complete description of spectral properties of the quadratic self-adjoint polynomial (13.1) is available, as the following theorem shows. Theorem 13.1.
Let system (13.2) be overdamped. Then
(i) all the eigenvalues of L(A.) are real and non positive; (ii) all the elementary divisors of L(A.) are linear; (iii) there exists a negative number q such that n eigenvalues A.\ll, ... , A.~ 1 ' are less than q and n eigenvalues A.\2 ', •.. , A.~2 ' are greater than q; (iv) fori = 1 and 2, the sign of A.Y' (j = 1, ... , n) in the sign characteristic ofL(A.) is ( -1Y; (v) the eigenvectors uy', ... , u~il of L(A.), corresponding to the eigenvalues A.y', ... , A.~il, respectively, are linearly independent fori= 1, 2.
Proof Property (i) has been verified already. Let us prove (ii). If (ii) is false, then according to Theorem 12.2 there exists an eigenvalue A. 0 of L(A.) and a corresponding eigenvector fo such that (L'(A.o)foJo) = 2A.o<.foJo)
+ (BfoJo) =
0.
But we also have
which implies
2A.o<.foJo)
+ (BfoJo) =
±J(BfoJo)2
-
4UoJo)(CfoJo) = 0,
a contradiction with (13.4); so (ii) is proved. This argument shows also that for every eigenvalue A. 0 of L(A.), the bilinear form (L'(A. 0 )f,f), f E Ker L(A. 0 ) is positive definite or negative definite. By Theorem 12.2 this means that all the signs associated with a fixed eigenvalue A. 0 are the same (all + 1 or all -1). By Proposition 10.12 (taking into account that by (ii) all partial multiplicities corresponding to the (real) eigenvalues of L(A.) are equal to one) the number of + 1s in the sign characteristic of L(A.)
306
13.
QUADRATIC SELF-ADJOINT POLYNOMIALS
is equal to the number of -1s, and since all the elementary divisors of L(.?c) are linear, the number of+ 1s (or -1s) is equal ton. Let W>, i = 1, ... , n, be the eigenvalues of L(.?c) (not necessarily different) having the sign -1, and let .?c~ 2 l, i = 1, ... , n, be the eigenvalues of L(.?c) with the sign + 1. According to the construction employed in the proof of Theorem 11.2 (Sections 11.2, 11.3) the eigenvectors u~l, ... , u~> corresponding to .?c\il, ... , .?c~0 , respectively, are linearly independent, fori = 1, 2. This proves (v). Observe that fori= 1, 2 andj = 1, ... , n, we have
+ (Bu
u<.il) = 2)c(il(u(il u<.il) ( L'()c(il)u<.il j J'J j J'J
and .?c <_i)
- (Bu(il u<.il)
=
j
'
j
+ j(Bu(il -
j
u(il) 2 - 4(u<.il u<.il)(CuUl u<.il) ,
j
j
'
j
j
'
j
2(u0u0) j ' j
1
( 13.5)
So by Theorem 12.2, the sign in (13.5) is + if i = 2, and - if i = 1. We shall use this observation to prove (iii) and (iv). For the proof of (iii) and (iv) it remains to show that )c\ll < )c(2) I
j
for every
'
i and j
(1
Define two functions p 1(x) and p 2 (x), for every x
s
n).
(13.6)
C"\{0}, as follows:
4(x, x)(Cx, x). Let us show that
-
{p 1 (x) Ix
i, j
) _ -(Bx, x) + d(x) Pz(X 2(x, x) '
) _ -(Bx, x) - d(x) PJ(x 2(x, x) '
where d(x) = flEx, x) 2
E
s
E
C"\ {0}} < min {.?c\2 l,
... ,
.?c~2 >}.
(13.7)
Suppose the contrary; then for some j we have: P1(y);:::::
Aj2 l = Pz(x),
(13.8)
where x is some eigenvector corresponding to andy E C"\ {0}. Clearly, pz(y) > p 1 (y) so that pz(y) =I= p 2 (x), and since p 2 (ax) = p 2 (x) for every a E C\ {0}, we deduce that x andy are linearly independent. Define
.?cj2 l,
v
= f.lY + (1 - p)x,
where f.1 is real. Then (as L(.?cj2 l)x = 0)
(.?cj2 >) 2 (v v) + .?cj2 l(Bv, v) + (Cv, v)
= f.lz{(.?cjzl)z(y, y) + .?cjzl(By, y) + (Cy, y)}.
(13.9)
However, p 1 (y) ;::::: .?cj2 l implies that the right-hand side of (13.9) is nonnegative and consequently, (13.10)
13.1.
307
OVERDAMPED CASE
for all Jl· Define g(Jl) = 2il~2 l(v, v) + (Bv, v); then g(l) < 0 by (13.8) and g(O) > 0 by (13.5) (with i = 2 and uT = x; so the sign + appears in (13.5)). Furthermore, g(Jl) is a continuous function of Jl, and hence there exists a Jlo E (0, 1) such that g(J1 0 ) = 2i!.Yl(v 0 , v0 ) + (Bv 0 , u0 ) = 0, z: 0 = v(J1 0 ). Together with (13.10) this implies (Bt• 0 , v0 ? : :; 4(v0 , v0 )(Cv0 , t•0 ); since the system is assumed to be overdamped, this is possible only if v0 = 0. This contradicts the fact that x andy are linearly independent. So (13.7) is proved. Since ilPl
E
{p 1(x)lx
E ~n\{0}},
inequalities (13.6) follow immediately from (13.7).
D
Observe that using Theorem 12.5 one can easily deduce that max{il\1 l, ... , il~ 1 l} < il 0 < min{Wl, ... , il~2 J} if and only if L(il 0 ) is negative definite (assuming of course that the system is overdamped). In particular, the set {il E IR IL(il) is negative definite} is not void, and the number q satisfies the requirements of Theorem 13.1 if and only if L(q) is negative definite. We shall deduce now general formulas for the solutions of the homogeneous equation corresponding to (13.2), as well as for the two-point boundary problem. Theorem 13.2. Let system (13.2) be overdamped, and let i!.)il be as in Theorem 13.1. Let r + (resp. r _)be a contour in the complex plane containing no points of (J(L), and such that WJ. ... ' il~2 ) (resp. wl, ... 'il~l)) are the only eigenvalues ofL(il) lying insider+ (resp. r _). Then
(i) (ii)
Jr
Jr _
the matrices + L - 1 (il) dil and L - 1 (il) dil are nonsingular; a general solution of L(d/dt)x = 0 has the form x(t) = ex + 1c 1
for some c 1, c 2
E ~n,
+ ex_ c 2 1
where
x± =
(Jr± L -l(il) dilr
1
L/L -l(il) dil;
(13.11)
(iii) the matrices X+ and X_ form a complete pair of solutions of the matrix equation X 2 + BX + C = 0, i.e., X+ - X_ is nonsingular; (iv) ifb- a> 0 is large enough, then the matrix ex+(b-a)- eX-(b-a) is nonsingular; (v) forb - a > 0 large enough, the two-point boundary value problem
L(:t)x(t) = .f(t),
x(a) = x(b) = 0
13.
308
QUADRATIC SELF-ADJOINT POLYNOMIALS
has the unique solution x(t)
b
.
= f a G0 (t, r)f( r) dr
fb[eX+(a-r)]
- [ex •
whereZ =(X+- X_)-1,
w
I = [ ex +(b-a)
I
ex -
J-
1
'
and ex + (t- r) Z
Go(t, r)
= { ex _(t-rlz:
a~ t ~ r r ~ t ~ b.
Proof By Theorem 11.2 and its proof, the matrix polynomial L(A.) admits decompositions L(A.) = (U- Y+)(IA- X+)= (U- L)(U- X_),
where o-(X _) = {Wl, ... , A.~ 1 )}, o-(X +) = {Wl, ... , A.~2 l}. In view of Theorem 13.1(iii), H - X+ and H - y_ are right and left r +-spectral divisors of L(A.), respectively; also H- X_ and H - Y+ are right and left r _-spectral divisors of L(A.), respectively. So (i) follows from Theorem 4.2. By the same theorem, X± are given by formulas (13.11 ). Now by Theorem 2.16, (ii) and (v) follow from (iii) and (iv). Property (iii) follows immediately from part (a) of Theorem 2.16; so it remains to prove (iv). Write
Now
o-(X +) = {W), ... , A.~2 )} Therefore, we can estimate
II e -X +(b-a)ll
~
K 1e -;.
(13.13) (13.14)
where A.b2 ) = min 1 , ; 9 A.l 2 \ A.b1 ) = max 1 ,;,.A.pl, and K 1 and K 2 are constants. Indeed, by Theorem 13.1 (ii), X+ = s- 1 · diag[A.\2 l, ... , A.~2 l] · S for some nonsingular matrix S; so e-x.
309
13.2. WEAKLY DAMPED CASE
13.2. Weakly Damped Case Consider now the case when the system (13.2) satisfies the condition
(Bf,f) 2
-
4(f,f)(CJJ) < 0,
f
E
q;n\{0}.
(13.16)
The system (13.2) for which (13.16) holds will be called weakly damped. Physically, condition (13.16) means that the free system (i.e., without external forces) can have only oscillatory solutions, whatever the initial values may be. For a weakly damped system we have (L(A)f,f) > 0 for every real A and every f # 0; in particular, the polynomial L(A) is nonnegative and has no real eigenvalues. According to Theorem 12.8, L(A) therefore admits the factorization L(A) = (/A - Z*)(JA - Z), where u(Z) coincides with a maximal c-set S of eigenvalues of L(A) chosen in advance. (Recall that the set of eigenvalues S is called a c-set if S n {A IAE S} = 0.) In particular, as a maximal c-set one may wish to choose the eigenvalues lying in the open upper half-plane. In such a case, by Theorem 2.15, the matrices Z and Z* form a complete pair of solutions of the matrix equation X 2 + BX + C = 0. If, in addition, eZ
13.1.
Let L(A)
=
[~ ~}z + [;/2
v;/2} + [~
~l
Note that the matrices
B
=
1 )3;2] [J312 2 '
are positive definite. The eigenvalues of L(A) are ±(- 3 ± ij23) each taken twice. It is easily seen that L(A 0 ) i= 0 for Ao = !C- 3 ± ij23); so for each eigenvalue there is only one eigenvector (up to multiplication by a nonzero number). So the elementary divisors of L(A) must be quadratic: (A-
!C -3 + ij23)) 2
(A-!( -3- ij23)) 2 .
0
310
13. QUADRATIC SELF-ADJOINT POLYNOMIALS
Comments The results of this chapter are in general not new. The finite-dimensional overdamped case was first studied in [17] (see also [52b]). As has been remarked earlier, this case has generated many generalizations almost all of which are beyond the subject matter of this book. Results in the infinitedimensional setting are to be found in [51]. Example 13.1 is in [52d].
Part IV Supple~nentary
Chapters in Linear
Algebra
In order to make this book more self-contained, several topics of linear algebra are presented in Part IV which are called upon in the first three parts. None of this subject matter is new. However, some of the topics are not readily accessible in standard texts on linear algebra and matrix theory. In particular, the self-contained treatment of Chapter S5 cannot be found elsewhere in such a form. Other topics (as in Chapter Sl) are so important for the theory of matrix polynomials that advantage can be gained from an exposition consistent with the style and objectives of the whole book. Standard works on linear algebra and matrix theory which will be useful for filling any gaps in our exposition include [22], [23], and [52c].
311
Chapter Sl
The Sn1ith Forn1 and Related Problen1s
This chapter is devoted to the Smith form for matrix polynomials (which is sometimes called a canonical form, or diagonal form of A.-matrices) and its applications. As a corollary of this analysis we derive the Jordan normal form for matrices. A brief introduction to functions of matrices is also presented.
Sl.l. The Smith Form Let L(A.) be a matrix polynomial of type Li=o AiA.i, where Ai are m x n matrices whose entries are complex numbers (so that we admit the case of rectangular matrices A). The main result, in which the Smith form of such a polynomial is described, is as follows.
Theorem Sl.l.
Every m x n matrix polynomial L(A.) admits the repre-
sentation
L(A.) = E(A.)D(A.)F(A.),
(Sl.l)
313
S 1.
314
THE SMITH FORM AND RELATED PROBLEMS
where
0 D(Jc) =
(S1.2) 0
0
0
is a diagonal polynomial matrix with monic scalar polynomials di(A) such that di(Jc) is divisible by di_ 1(Jc); E(Jc) and F(Jc) are matrix polynomials ofsizes m x m and n x n, respectively, with constant nonzero determinants. The proof of Theorems 1.1 will be given later. Representation (Sl.l) as well as the diagonal matrix D(Jc) from (S1.2) is called the Smith form of the matrix polynomial L(.lc) and plays an important role in the analysis of matrix polynomials. Since det E(Jc) const ¥-0, det F(Jc) const ¥-0, where E(Jc) and F(Jc) are taken from the Smith form (Sl.l), the inverses (E(.Ic))- 1 and (F(.Ic))- 1 are also matrix polynomials with constant nonzero determinant. The matrix polynomials E(.lc) and F(Jc) are not defined uniquely, but the diagonal matrix D(Jc) is unique. This follows from the fact that the entries of D(Jc) can be expressed in terms of the original matrix itself, as the next theorem shows. We shall need the definition of minors of the matrix polynomial L(.lc) = (aii(Jc)){.:::f::::.~- Choose k rows, 1 ~ i 1 < · · · < ik ~ m, and k columns, 1 ~ j 1 < · · · < jk ~ n, in L(.lc), and consider the determinant det(aiziJ~.m= 1 of the k x k submatrix of L(Jc) formed by the rows and columns. This determinant is called a minor of L(Jc). Loosely speaking, we shall say that this minor is of order k and is composed of the rows it. ... , ik and columns it. ... ,jk of L. Taking another set of columns and/or rows, we obtain another minor of order k of L(Jc). The total number of minors of order k is (1:)(/:').
=
=
Theorem 81.2. Let L(Jc) be an m x n matrix polynomial. f..et pk(Jc) be the greatest common divisor (with leading coefficient 1) of the minors of L(Jc) of order k, if not all of them are zeros, and let Pk(Jc) 0 if all the minors oforder k of L(Jc) are zeros. Let p0 (Jc) = 1 and D(Jc) = diag[d 1(.lc), ... , d.(Jc), 0, ... , 0] be the Smith form of L(Jc). Then r is the maximal integer such that p,(Jc) ¥= 0, and
=
i
= 1, ... , r.
(Sl.3)
Proof Let us show that if L 1 (.1c) and L 2 (.1c) are matrix polynomials of the same size such that
S 1.1.
315
THE SMITH FORM
where E± 1 (..1.) and p± 1 (..1.) are square matrix polynomials, then the greatest common divisors Pk. 1 (A.) and Pk, iA.) of the minors of order k of L 1 (A.) and LiA.), respectively, are equal. Indeed, apply the Binet-Cauchy formula (see, for instance, [22] or [52c]) twice to express a minor of L 1 (A.) of order k as a linear combination of minors of LiA.) of the same order. It therefore follows that Pk, 2 (A.) is a divisor of Pk, 1(A.). But the equation
implies that Pk. 1 (A.) is a divisor of Pk. iA.). So Pk, 1 (A.) = Pk, iA.). In the same way one shows that the maximal integer r 1 such that p,,, 1 (A.) "¥= 0, coincides with the maximal integer r2 such that p, 2 , iA.) "¥= 0. Now apply this observation for the matrix polynomials L(A.) and D(A.). It follows that we have to prove Theorems 1.2 only in the case that L(A.) itself is in the diagonal form L(A.) = D(A.). From the structure of D(A.) it is clear that
s = 1, ... , r, is the greatest common divisor of the minors of D(A.) of orders. So p.(A.) = d 1(A.) · · · d.(A.), s = 1, ... , r, and (1.3) follows. 0 To prove Theorem Sl.l, we shall use the following elementary transformations of a matrix polynomial L(A.) of size m x n: (1) interchange two rows, (2) add to some row another row multiplied by a scalar polynomial, and (3) multiply a row by a nonzero complex number, together with the three corresponding operations on columns. Note that each of these transformations is equivalent to the multiplication of L(A.) by an invertible matrix as follows: Interchange ofrows (columns) i andj in L(A.) is equivalent to multiplication on the left (right) by 1 i-+
0 0
1
1
0
(Sl.4)
0
1
S 1.
316
THE SMITH FORM AND RELATED PROBLEMS
Adding to the ith row of L(A.) the jth row multiplied by the polynomialf(A.) is equivalent to multiplication on the left by
j!
1
1
f(A.)
1
i-4
(S1.5)
1 1
1
the same operation for columns is equivalent to multiplication on the right by the matrix
1
i-4
1
(S1.6)
j-4
f(A.)
...
1 1
Finally multiplication of the ith row (column) in L(A.) by a number a # 0 is
S 1.1.
317
THE SMITH FORM
equivalent to the multiplication on the left (right) by
(Sl. 7)
a
1
(Empty places in (Sl.4)~(S1.7) are assumed to be zeros.) Matrices of the form (S1.4)~(Sl.7) will be called elementary. It is apparent that the determinant of any elementary matrix is a nonzero constant. Proof of Theorem Sl.l. Since the determinant of any elementary matrix is a nonzero constant, it is sufficient to prove that by applying a sequence of elementary transformations every matrix polynomial L(,l.) can be reduced to a diagonal form: diag[d 1 (.1.), ... , dJ.1.), 0, ... , 0], (the zeros on the diagonal can be absent, in which case r = min(m, n)), where d 1 (.1.), ... , dJ,l.) are scalar polynomials such that d;(A)/d;_ 1 (.1.), i = 1, 2, ... , r - 1, are also scalar polynomials. We shall prove this statement by induction on m and n. Form= n = 1 it is evident. Consider now the case m = 1, n > 1, i.e.,
If all a/.1.) are zeros, there is nothing to prove. Suppose that not all of a/.1.) are zeros, and let ai0 (A) be a polynomial of minimal degree among the nonzero entries of L(.1.). We can suppose thatj 0 = 1 (otherwise interchange columns in L(.1.)). By elementary transformations it is possible to replace all the other entries in L(,l.) by zeros. Indeed, let a/.1.) #- 0. Divide ai(,l.) by a 1 (.1.): ap) = h/.1.)a 1 (.1.) + r/.1.), where r/.1.) is the remainder and its degree is less than the degree of a 1 (.1.), or r/.1.) 0. Add to the jth column the first column multiplied by -hi( A). Then in the place j in the new matrix will be rp). If ri(,l.) of= 0, then put rp) in the place 1, and if there still exists a nonzero entry (different from ri(A)), apply the same argument again. Namely, divide this (say, the kth) entry by r/.1.) and add to the kth column the first multiplied by minus the quotient of the division, and so on. Since the degrees of the remainders decrease, after a finite number (not more than the degree of a 1 (.1.)) of steps we find that all the entries in our matrix, except the first, are zeros.
=
S 1.
318
THE SMITH FORM AND RELATED PROBLEMS
This proves Theorem Sl.l in the case m = 1, n > 1. The case m > 1, n = 1 is treated in a similar way. Assume now m, n > 1, and assume that the theorem is proved for m - 1 and n - 1. We can suppose that the (1, 1) entry of L(A.) is nonzero and has the minimal degree among the nonzero entries of L(A.) (indeed, we can reach this condition by interchanging rows and/or columns in L(A.) if L(A.) contains a nonzero entry; if L(A.) 0, Theorem Sl.l is trivial). With the help of the procedure described in the previous paragraph (applied for the first row and the first column of L(A.)), by a finite number of elementary transformations we reduce L(A.) to the form
=
aW(A.) L 1(A.)
=
[
0
1
~
a~ll:(A.)
··· 0 . . . a~~( A.) .
0
a~i(A.)
· · · a~J(A.)
Suppose that some a!J>(A.) ¢= 0 (i,j > 1) is not divisible by aW(A.) (without remainder). Then add to the first row the ith row and apply the above arguments again. We obtain a matrix polynomial of the structure
r
1
~
a~2~(A.) . . . a~2~(A.) ,
0
···
0
0
a~i(A.)
···
a~J(A.)
aW(A.)
L2(A.) =
where the degree of aW(A.) is less than the degree of ail£
aW(A.) L3(A.)
=[
···
0
1
~
a~~(A.) . . . a~3~(A.) ,
0
a!;i(A.)
· · · a!;J(A.)
where every ajj>(A.) is divisible by aW(A.). Multiply the first row (or column) by a nonzero constant to make the leading coefficient of ai3f
[
a~3d(A.)
L4(A.) = -<3->: au (A.) a!;i(A.)
...
a~3,>(A.)l : , a!;J(A.)
(here L 4 (A.) is an (m - 1) x (n - 1) matrix), and apply the induction hypothesis for L4 (A.) to complete the proof of Theorem Sl.l. D
Sl.2.
INVARIANT POLYNOMIALS AND ELEMENTARY DIVISORS
319
The existence of the Smith form allows us to prove easily the following fact. Corollary S1.3. An n x n matrix polynomial L(A.) has constant nonzero determinant if and only if L(A.) can be represented as a product L(A.) = F 1(A.)F;(A.) · · · F p(A.) of a finite number of elementary matrices F;(A.).
Proof Sincedet F;(A.) =canst "I= O,obviouslyanyproductofelementary matrices has a constant nonzero determinant. Conversely, suppose det L(A.) = canst "I= 0. Let L(A.) = E(A.)D(A.)F(A.)
(S1.8)
be the Smith form of L(A.). Note that by definition of a Smith form E(A.) and F(A.) are products of elementary matrices. Taking determinants in (S1.8), we obtain det L(A.) = det E(A.) · det D(A.) · det F(A.), and,consequently,det D(A.) =canst -:1= 0. ThishappensifandonlyifD(A.) =I. But then L(A.) = E(A.)F(A.) is a product of elementary matrices. D Sl.2. Invariant Polynomials and Elementary Divisors
The diagonal elements d 1(A.), ... , d,(A.) in the Smith form are called the invariant polynomials of L(A.). The number r of invariant polynomials can be defined as (S1.9)
r = max{rank L(A.)} . .l.eC
Indeed, since E(A.) and F(A.) from (Sl.l) are invertible matrices for every A., we have rank L(A.) = rank D(A.) for every A. E rt. On the other hand, it is clear that rank D(A.) = r if A. is not a zero of one of the invariant polynomials, and rank D(A.) < r otherwise. So (S1.9) follows. Represent each invariant polynomial as a product of linear factors
i
=
1, .. . , r,
where A.il, ... , A.;,k, aTe different complex numbers and ail, ... , IX;k, are positive integers. The factors (A. - A.iiY'i,j = 1, ... , k;, i = 1, ... , r, are called the elementary divisors of L(A.). An elementary divisor is said to be linear or nonlinear according as aii = 1 or <Xii > 1. Some different elementary divisors may contain the same polynomial (A. - A.0 )a (this happens, for example, in case d;(A.) = di+ 1(A.) for some i); the total number of elementary divisors of L(A.) is therefore 1 k;.
Ll=
320
S 1.
THE SMITH FORM AND RELATED PROBLEMS
Consider a simple example. EXAMPLE
S 1.1.
Let A.(A. - 1) L(A.) = [ 0
J
1
A.(A. - 1)
First let us find the Smith form for L(A.) (we shall not mention explicitly the elementary transformations): [A.(A.
~ 1)
A.(A.
~ 1)]----> [ -A.z(~- 1)2
A.(A.
~ 1)]---->
[_,12(~-1)2 ~]---->[~ A_2(A_~1)2l Thus the elementary divisors are A. 2 and (A. - 1) 2 .
D
The degrees rx;j of the elementary divisors form an important characteristic of the matrix polynomial L().). As we shall see later, they determine, in particular, the Jordan structure of L(.A.). Here we mention only the following simple property of the elementary divisors, whose verification is left to the reader.
Proposition Sl.4. Let L(.A.) be an n x n matrix polynomial such that det L().) =f- 0. Then the sum Li~ 1 L~'= 1 IY.;j of degrees of its elementary divisors (). - ).ijY'J coincides with the degree of det L(.A.). Note that the knowledge of the elementary divisors of L().) and of the number r of its invariant polynomials d 1 (.A.), ... , d,().) is sufficient to construct d 1 (.A.), ... , d,(.A.). In this construction we use the fact that d;(.A.) is divisible by d;_ 1 (.A.). Let ). 1 , ... , ).P be all the different complex numbers which appear in the elementary divisors, and let().-).;)"", ... ,().- ).;)"'·k' (i = 1, ... , p) be the elementary divisors containing the number A;, and ordered in the descending order of the degrees rx; 1 ~ · · · ~ rx;,k, > 0. Clearly, the number r of invariant polynomials must be greater than or equal to max{k~> ... , kP}. Under this condition, the invariant polynomials d 1 ().), ... , d,().) are given by the formulas p
dp) =
fl (A_ ).;)"i,r+l-1, i~
j=1, ...
,r,
1
where we put (). - ).;)"' 1 = 1 for j > k;. The following property of the elementary divisors will be used subsequently:
Proposition Sl.S. Let A().) and B().) be matrix polynomials, and let C().) = diag[A().), B().)], a block diagonal matrix polynomial. Then the set ofelementary divisors of C().) is the union of the elementary divisors of A().) and B(.A.).
S1.3.
321
DIFFERENTIAL EQUATIONS WITH CONSTANT COEFFICIENTS
Proof Let D 1 (A.) and D 2 (A.) be the Smith forms of A(A.) and B(A.), respectively. Then clearly
for some matrix polynomials E(A.) and F(A.) with constant nonzero determinant. Let (A. - A. 0 )~', ..• , (A. - A. 0 )~p and (A.- A. 0 )P•, ... , (A. - A. 0 )P" be the elementary divisors of D 1 (A.) and Dz(A.), respectively, corresponding to the same complex number A. 0 . Arrange the set of exponents IX 1 , ••• , IXP, [3 1, ••• , /3q, in a nonincreasing order: {1X 1, ..• , IXP, [3 1, ••• , /3q} = {y 1, ••. , Yp+q}, where 0 < y1 :::;; · · ·:::;; Yp+q· Using Theorem Sl.2 it is clear that in the Smith form D = diag[d 1 (A.), ... , d,(A.), 0, ... , 0] of diag[D 1(A.), D 2 (A.)], the invariant polynomial d,(A.) is divisible by (A.- A.0 )YP+q but not by (A.- A.0 )Yp+q+t, d,_ 1 (A.) is divisible by (A. - A. 0 )Yp +q - 1 but not by (A. - A. 0 )YP + • - 1 + 1 , and so on. It follows that the elementary divisors of
(and therefore also those of C(A.)) corresponding to A. 0 , are just (A. - A. 0 )Y 1 , (A. - A. 0 )YP +•, and Proposition S1.5 is proved. 0
••• ,
S1.3. Application to Differential Equations with Constant Coefficients Consider the system of homogeneous differential equations d 1x dx A 1 dtf + · · · + A 1 dt + A 0 x = 0,
(Sl.lO)
where x = x(t) is a vector function (differentiable l times) of the real argument t with values in fln, and A 1, ••• , A 0 are constant m x n matrices. Introduce the following matrix polynomial connected with system (S1.9) l
A(A.) =
L AjA.i.
(Sl.ll)
j=O
The properties of the solutions of (SUO) are closely related to the spectral properties of the polynomial A( A.). This connection appears many times in the book. Here we shall only mention the possibility of reducing system (SUO) to n (or less) independent scalar equations, using the Smith form of the matrix polynomial (SUl).
S 1.
322
THE SMITH FORM AND RELATED PROBLEMS
Given matrix polynomial B(A.) = LJ=o BjA.j and a p-times differentiable vector function y = y(t), denote for brevity
Thus, B(djdt)y is obtained if we write formally (Lf= 1 BjA.j)y and then replace A. by djdt; for example, if
B(A.) = [A.2 0
J
A.+ 2 A. 3 + 1
and
then
B(!!_) = dt y
[y~ + Y~ + 2y2]· Y2' + Y2
Note the following simple property. If B(A.) = B 1(A.)B 2(A.) is a product of two matrix polynomials B 1 (A.) and BiA.), then (SU2) This property follows from the facts that for oc 1, oc 2 E fl
and
0 ::;; i ::;; j. Let us go back to the system (SUO) and the corresponding matrix polynomial (SU1). Let D(A.) = diag[d 1(A.), ... , d,(A.), 0 ···OJ be the Smith form of A(A.):
A(A.) = E(A.)D(A.)F(A.),
(SU3)
where E(A.) and F(A.) are matrix polynomials with constant nonzero determinant. According to (SU2), the system (SUO) can be written in the form (SU4)
Sl.3.
DIFFERENTIAL EQUATIONS WITH CONSTANT COEFFICIENTS
323
Denote y = F(djdt)x(t). Multiplying (S1.14) on the left by E- 1 (djdt), we obtain the system 0 0
Yr
d,(:t) 0
0
0
Y1 = 0,
(Sl.15)
Yn 0
(y = (y 1 , •.. , y.?), which is equivalent to (S1.14). System (Sl.15) splits into r independent scalar equations
d{:t)Yi(t) = 0,
i = 1, ... , r,
(Sl.16)
(the last n - r equations are identities: 0 · Yi = 0 fori = r + 1, .. , n). As is well known, solutions of the ith equation from (Sl.16) form ami-dimensional linear space ni, where mi is the degree of di(t). It is clear then that every solution of (S1.15) has the form (y 1 (t), ... , y.(t)) with yJt) E lli(t), i = 1, ... , r, and Yr+ 1(t), ... , y.(t) are arbitrary /-times differentiable functions. The solutions of (SUO) are now given by the formulas
where Yi E lli, i = 1, ... , r, and Yr+ ~> ... , Yn are arbitrary I- times differentiable functions. As a particular case of this fact we obtain the following result. TheoremS1.6. LetA(A.) = L}=o AiA.ibeann x nmatrixpolynomialsuch that det A(A.) =f: 0. Then the dimension of the solution space of(Sl.lO) is equal to the degree of det L(A.). Indeed, under the condition det A(A.) =f: 0, the Smith form of A(A.) does not contain zeros on the main diagonal. Thus, the dimension of the solution space is just dim 0 1 + · · · + dim n., which is exactly degree(det A(A.)) (Proposition S1.4). Theorem Sl.6 is often referred to as Chrystal's theorem. Let us illustrate this reduction to scalar equations in an example.
S 1.
324 ExAMPLE S 1.2.
THE SMITH FORM AND RELATED PROBLEMS
Consider the system d2
d dt
-x 1 +x 2 =0,
x1
+-
dt
2
(S1.17)
x 2 = 0.
The corresponding matrix polynomial is
Compute the Smith form for A(A.):
Equations (S1.15) take the form Y1 = 0,
Linearly independent solutions of the last equation are Y2.1 =
e',
Y2. 2 = exp [
-1
+2 iJ3 t
J,
Y2, 3 = exp [
-12
ij3 tJ. (S1.18)
Thus, linearly independent solutions of (S1.17) are
where y 2 = yz(t) can be each one of the three functions determined in (S1.18).
0
In an analogous fashion a reduction to the scalar case can be made for the nonhomogeneous equation (S1.19)
Indeed, this equation is equivalent to
where f(t) is a given function (we shall suppose for simplicity that f(t) has as many derivatives as necessary).
Sl.4.
325
APPLICATION TO DIFFERENCE EQUATIONS
Here D(.l.) is the Smith form for A(.l.) = I~= 0 A j;.< and E(.l.) and F(.l.) are defined by (S1.13). Let y = F(d/dt)x, g = E- 1 (d/dt)f(t); then the preceding equation takes the form
v(fr)y(t) = g(t);
(Sl.20)
thus, it is reduced to a system of n independent scalar equations. Using this reduction it is not hard to prove the following result.
Lemma S1.7. The system (S1.19) (where n = m) has a solution for every right-hand side f(t) (which is sufficiently many-times differentiable) ifand only if det A(,l.) ¢ 0.
Proof If det A(,l.) '/= 0, then none of the entries on the main diagonal in D(.l.) is zero, and therefore every scalar equation in (S1.20) has a solution. If det A(.l.) 0, then the last entry on the main diagonal in D(,l.) is zero. Hence, iff(t) is chosen in such a way that the last entry in E- 1 (d/dt)f(t) is not zero, the last equation in (S1.20) does not hold, and (Sl.19) has no solution for this choice of f(t). 0
=
Of course, it can happen that for some special choice of f(t) Eq. (S1.19) still has a solution even when det A(.l.) 0.
=
S 1.4. Application to Difference Equations Let A 0 , ••• , A 1 be m x n matrices with complex entries. Consider the system of difference equations
k
=
0, 1, 2, 00., (S1.21)
where (Yo' YJ, is a given sequence of vectors in em and (Xo' X 1• is a sequence in fl" to be found. For example, such systems appear if we wish to solve approximately a system of differential equations 0
B2
0
0
,)
0
d2 -- 2 x(r) dt
dx(t) dt
+ B 1 - - - - + B0 x(t) = y(t),
0 < t < :x:,
0
0
,)
(S1.22)
(here y(t) is a given function and x(t) is unknown), by replacing each derivative by a finite difference approximations as follows: let h be a positive number; denote tj = jh, j = 0, 1, .... Given the existence of a solution x(t), consider (S1.22) only at the points t/
d 2 x(t)
B 2 -d(Z
dx(t)
+ B 1 -dt +
B 0 x(t) =
y(t),
j
= 0, 1, 00., (S1.23)
S 1.
326
THE SMITH FORM AND RELATED PROBLEMS
and replace each derivative (d 1/dt 1)x(ti) by a finite difference approximation. For example, if xi is an approximation for x(t),j = 0, 1, 2, ... , then one may use the central difference approximations dx(t) . xi+ 1 - xj- 1 ~= 2h d 2x(d) . xi+ 1 - 2xi h2
~=
(S1.24)
+ xi-1
(Sl.25)
Inserting these expressions in (S1.23) gives B xi+ 1 - 2xi 2 h2
+ xi_ 1
+
B xi+ 1 - xi_ 1 1 2h
+
B
_ oXj -
( ) Y ti ,
j
= 1, 2, ... '
or (B2
+ !B 1 h)xi+ 1 = h 2y(t),
-
(2B 2 - B 0 h 2)xi
+ (B 2 -
j = 1, 2, ... ,
!B 1 h)xi_ 1
(S1.26)
which is a difference equation oftype(Sl.21)withyk = h 2 y(tk+ 1 ),k = 0, 1,2, .... Some other device may be needed to provide the approximate values of x(O) and x(h) (since, generally speaking, the unique solution of (Sl.26) is determined by the values of x(O) and x(h)). These can be approximated if, for instance, initial values x(O) and dx(O)/dt are given; then the approximation x 1 for x(h) could be dx(O)
x 1 = x(O)
+ h ---;{('
This technique can, of course, be extended to approximate by finite differences the equation of lth order
d1x(t)
B 1 ----;IT
+ · · · + B 0 x(t) =
y(t).
The approximation by finite differences is also widely used in the numerical solution of partial differential equations. In all these cases we arrive at a system of difference equations of type (S1.21). The extent to which solutions of the difference equation give information about solutions ofthe original differential equation is, of course, quite another question. This belongs to the study of numerical analysis and will not be pursued here. To study the system (S1.21), introduce operator Iff (the shift operator), which acts on the set of all sequences (x 1 , x 2 , •.. ,) of n-dimensional vectors xi as follows: tff(x 1, x 2 , ... ,) = (x 2 , x 3 , ... ,). For difference equations, the
327
S1.4. APPLICATION TO DIFFERENCE EQUATIONS
operator Iff plays a role analogous to the operator d/dt in Section Sl.3. The system (Sl.21) can be rewritten in the form (A 0
+ A 1 tff + · · · + A 1tff 1)x =
f,
(Sl.27)
f = (!0 , f 1 , .•• ,) is a given sequence of n-dimensional vectors, and = (x 0 , x 1 , ..• ,) is a sequence to be found. Now it is quite clear that we have
where x
to consider the matrix polynomial A(A.) = L~~o AiA.i connected with (S1.21). Let D(A.) be the Smith form of A(A.) and A(A.) = E(A.)D(A.)F(A.),
=
where det E(A.) const =F 0, det F(A.) Iff and substitute in (S1.17) to obtain
(Sl.28)
=const =F 0. Replace A. in (Sl.28) by
D(tff)y = g, where y = (y 0 , y 1 ,
(Sl.29)
= F(tff)(x 0 , x 1, .•• ,) and
•.. ,)
g =(go, gt, · · · ,) = (E(tff))- 1
k = 0, 1, 2, ... '
is the degree of det A( A.). Appendix. Scalar Difference Equations In this appendix we shall give a short account of the theory ofhomogeneous scalar difference equations with constant coefficients. Consider the scalar homogeneous difference equation k = 0, 1, ...
(Sl.30)
where a; are complex numbers and a1 =F 0; (x 0 , x 1 , •.. ,) is a sequence of complex numbers to be found. It is clear that the solutions of (Sl.30) form a linear space, i.e., the sum of two solutions is again a solution, as well as a scalar multiple of a solution. Proposition S1.9. dimension l.
The linear space of all solutions of Eq. (Sl.30) has
328
S 1.
THE SMITH FORM AND RELATED PROBLEMS
+
Proof Suppose that (x~l, xyl, ... ,), j = 1, ... , I tions of (S1.30). In particular, we have
1 are nonzero solu-
= 1, .. 0' l + 1.
j
(S1.31)
Consider each equality as a linear homogeneous equation with unknowns x~l, ... , x~il. From the general theory of linear homogeneous equations it follows that (S1.31) has exactly /linearly independent solutions. This means that there exists a relationship I+ 1
L cxixjjl = 0,
k
=
0, 1,
0
0
l,
0'
(Sl.32)
j= 1
where cx 0 , ••• , cx 1 are complex numbers and not all of them are zeros. It follows that the same relationship holds for the complete sequences xUl
=
(x~l, xyl, ... ,),
j = 1,
00.,
I+ 1:
I+ I
L cxix~l = 0,
k = 0, 1,
0
0
(Sl.33)
0'
j= 1
or, in other words, l+ I
L cxjxUl = 0.
(Sl.34)
j= 1
Indeed, we shall prove (Sl.33) by induction on k. Fork = 0, ... , l, (Sl.33) is exactly (Sl.32). Suppose (S1.33) holds for all k ~ k 0 , where k 0 is some integer greater than or equal to l. Using the equalities (S1.33), we obtain j
= 1, ... , I + 1.
So
and by the induction hypothesis, I+ 1
Icxjx~2+ 1
= -a{ 1 (a 1_ 1 ·0 +
000
+ a0 ·0) =
0.
j= 1
So (S1.33) follows. Now it is clear from (Sl.34) that the solutions x( 1 l, ... , x(l+ 1l are linearly dependent. We have proved therefore that any l + 1 solutions of (S1.30) are linearly dependent; so the dimension of the solution space is less than or equal to l. To prove that the dimension of the solution space is exactly /, we construct now a linearly independent set of l solutions of (S 1.30). Let us seek for a solution of (S1.30) in the form of a geometric sequence xk = qk, k = 0, 1, ....
Sl.4.
329
APPLICATION TO DIFFERENCE EQUATIONS
Inserting in (SUO) we see that (1, q, q 2 , . . . ,) is a solution of (SUO) if and only if q is a root of the characteristic polynomial
f(A) = a0
+ a 1 A + · · · + a1 A1
of Eq. (SUO). Let q 1 , •.• , q, be the different roots off (A), and let a 1 , the multiplicity of the roots q ~> ... , q., respectively. So
... ,
a, be
s
j(A) =
Il (A -
q)a;
and
a1
+ · · · + a,
= I.
j= 1
It is not hard to see that the solutions X (j) k
-
qk
k = 0, 1,
j'
j
=
1, ... ' s,
are linearly independent. Indeed, suppose s
I
s
ajxijl
=
j= 1
I
ajq~
=0
for
k = 0, 1, ....
(S U5)
j= I
Consider the equations for k = 0, ... , s - 1 as a linear system with unknowns a~> ... , a,. The matrix of this system is
v
~[
1
;, s- I
ql
q2 s- I
q2
and it is invertible, since q ~> ... , q, are all different and det V = f1; > j (q; - q) (Vandermonde determinant, see, for instance, [65]). So (SU5) has only the trivial solution, i.e., a 1 = · · · = a, = 0. Thus, the solutions xijl = qjkl, k = 0, 1, ... ,j = 1, ... , s are linearly independent. In the case that s =I, i.e., all the roots ofj(A) are simple, we have finished: a set of I independent solutions of (SUO) is found. If not all the roots of j(A) are simple, then s < I, and the s independent solutions constructed above do not span the linear space of all solutions. In this case we have to find additional I - s solutions of (S1.30), which form, together with the above s solutions, a linearly independent set. This can be done in the following way: let qj be a root of j(A) of multiplicity aj 2':: 1. Put xk = q~yb k = 0, 1, ... , and substitute in (SUO) k=0,1, ... ,. Introduce now finite differences of the sequence y 0 , y 1 , byk
=
Yk+
b 2yk
=
Yk+2- 2yk+l
I -
yk,
+ Yb
••• , :
(S1.36)
S 1.
330
THE SMITH FORM AND RELATED PROBLEMS
and in general bmyk =
I ("!)< -1r-iYk+i•
m = 1, ... , l.
l
i=O
It is easily seen that the sequence y 0 , y 1,
..• ,
can be expressed through its
finite differences as follows: Yk+m =
. L (m) . b'yk, l
k = 0, 1, ... , m = 0, 1, ... , l,
m
i=O
where b0 Jk = Yk by definition. Substituting these expressions in (Sl.36) and rearranging terms gives
k(J
qi
0
=
0,
k = 0, 1, ....
(Sl.37) Note that J
for
i ;;::: rxi.
(Sl.38)
It is not hard to see that (Sl.38) is satisfied for every polynomial sequence Yk = L~,:-J f3mkm, k = 0, 1, ... , where /3m E fl. Indeed, byk = L~,:-5 f3~km for
some coefficients
{3~ E
fl. Now b~Jyk
= b(b ... (bY.k) = 0, fXJ
times
since operator b when applied to a polynomial sequence reduces its degree at least by 1. So we have obtained rxi solutions of (Sl.30) connected with the root qi of j().) of multiplicity rxi: k = 0, 1, ... , , m = 0, ... , rxi - 1.
(Sl.39)
It turns out that the l = Ll= 1 rxi solutions of (Sl.30) given by (S1.39) for j = 1, ... , s, are linearly independent. Thus, we have proved Proposition Sl.9 and have constructed also a basis in the solution space of (S1.30). 0
S1.5. Local Smith Form and Partial Multiplicities It is well known that for any scalar polynomial a().) and any ).0 E rt the following representation holds: a().)
= (). -
A0 Yb().),
(Sl.40)
S1.5.
331
LOCAL SMITH FORM AND PARTIAL MULTIPLICITIES
where oc is a nonnegative integer and b(A.) is a scalar polynomial such that b(A. 0 ) # 0. (If a(A. 0 ) # 0, then put oc = 0 and b(A.) = a(A.).) In this section we shall obtain an analogous representation for matrix polynomials. We restrict ourselves to the case when the matrix polynomial A(A.) is square (of size n x n) and det A(A.) =!= 0. It turns out that a representation of type (Sl.40) for matrices is defined by n nonnegative integers (instead of one nonnegative integer oc in (Sl.40)) which are called the partial multiplicities. TheoremSl.lO.
Then for every A. 0
E
LetA(A.)beann x nmatrixpolynomialwithdetA(A.) fl, A(A.) admits the representation
=!=
0.
(A. - A.oY''
A(A.) = E;.o(A.) [
0
(S1.41)
...
where E ;. 0 (A.) and F ;. 0 (A.) are matrix polynomials invertible at A. 0 , and K 1 ::::;; • • • ::::;; "" are nonnegative integers, which coincide (after striking off zeros) with the degrees of the elementary divisors of A(A.) corresponding to A. 0 (i.e., of the form (A.- A.o)"). In particular, the integers K 1 ::::;; · • · ::::;; ""from Theorem Sl.10 are uniquely determined by A(A.) and A. 0 ; they are called the partial multiplicities of A(A.) at A. 0 . The representation (S1.41) will be referred to as a local Smith form of A(A.) at A. 0 . Proof The existence of representation (S1.41) follows easily from the Smith form. Namely, let D(A.) = diag(d 1 (A.), ... , diA.)) be the Smith form of A(A.) and let A(A.) = E(A.)D(A.)F(A.),
where det E(A.) in (Sl.40):
(Sl.42)
=const # 0, det F(A.) =const # 0. Represent each d;(A.) as i
=
1, ... , n,
where d;(A. 0 ) # 0 and K; ~ 0. Since d;(A.) is divisible by d;_ 1(A.), we have K; ~ K;- 1. Now (S1.41) follows from (Sl.42), where F ;.0 (A.) = F(A.). It remains to show that K; coincide (after striking off zeros) with the degrees of elementary divisors of A(A.) corresponding to A. 0 . To this end we shall show that any factorization of A(A.) of type (S1.41) with K 1 ::::;; · · · ::::;; "" implies that Kj is the multiplicity of A. 0 as a zero of dj(A.),j = 1, ... , n, where D(A.) = diag(d 1(A.), ... , d"(A.)) is the Smith form of A(A.). Indeed, let
A(A.) = E(A.)D(A.)F(A.),
S 1.
332
THE SMITH FORM AND RELATED PROBLEMS
where E(A) and F(A) are matrix polynomials with constant nonzero determinants. Comparing with (Sl.41), write
(Sl.43)
where EA 0 (A) = (E(A))- 1E;. 0 (A), F\ 0 (A)(F(A))- 1 are matrix polynomials invertible at Ao. Applying the Binet-Cauchy formula for minors of products of matrices, we obtain d1(A)d2(A) ... d;o(A)
L mu;(A). mj.D;p). mk ..F(A),
= i.
i0
= 1, 2, ... , n
j,k
(Sl.44) where m;, .E(A) (resp. mi. v; 0 (A), mk,J(A)) is a minor of order i0 of E(A) (resp. diag((A - AoY', ... , (A - AoY"), F(A)), and the sum in (S1.44) is taken over certain set of triples (i,j, k).lt follows from (S1.44) and the condition K 1 ::; · · · ::; Kn, that Ao is a zero of the product d 1(A)diA) · · · d; 0 (A) of multiplicity at least K 1 + K 2 + · · · + K; 0 • Rewrite (Sl.43) in the form
and apply the Binet-Cauchy formula again. Using the fact that (EA (A))- 1 0 and (F;. 0 (A))- 1 are rational matrix functions which are defined and invertible at A= A0 , and using the fact that d;(A) is a divisor of d;+ 1(A), we deduce that
Sl.6.
EQUIVALENCE OF MATRIX POLYNOMIALS
333
where <1>; 0 (..1.) is a rational function defined at A. = . 1. 0 (i.e., . 1. 0 is not a pole of <1>; 0 (..1.)). It follows that . 1. 0 is a zero of d 1 (A.)d 2 (A.) · · · d; 0 (A.) of multiplicity exactly K 1 + K 2 + · · · + K; 0 , i = 1, ... , n. Hence K; is exactly the multiplicity of . 1. 0 as a zero of d;(A.) for i = 1, ... , n.
D
Note that according to the definition, the partial multiplicities are all zero for every . 1. 0 which is not a zero of an invariant polynomial of A(A.). For instance, in Example Sl.l, the partial multiplicities for . 1. 0 = 0 and . 1. 0 = 1 are 0 and 2; for . 1. 0 ¢: {0, 1}, the partial multiplicities are zeros. S1.6. Equivalence of Matrix Polynomials
Two matrix polynomials A(A.) and B(A.) of the same size are called equivalent if A(A.) = E(A.)B(A.)F(A.) for some matrix polynomials E(A.) and F(A.) with constant nonzero determinants. For this equivalence relation we shall use the symbol ,.., : A(A.) ,.., B(A.) means A(A.) and B(A.) are equivalent. It is easy to see that ,.., is indeed an equivalence relation, i.e., (1) (2) (3)
A(A.) ,.., A(A.) for every matrix polynomial A(A.); A(A.) ~ B(A.) implies B(A.) ~ A(A.); A(A.) ,.., B(A.) and B(A.) ,.., C(A.) implies A(A.) ,.., C(A.).
Let us check the last assertion, for example. We have A(A.) = E 1 (A.)B(A.)F 1 (A.), B(A.) = Ez(A.)C(A.)F 2 (A.), where £ 1 (..1.), Ez(A.), F 1 (A.), F 2 (A.) have constant nonzero determinants. Then A(A.) = E(A.)C(A.)F(A.) with E(A.) = E 1 (A.)Ez(A.) and F(A.) = F 1 (A.)F 2 (A.). So A(A.),.., C(A.). The uniqueness of the Smith form allows us to give the following criterion for the equivalence relation between matrix polynomials. Theorem Sl.ll. A(A.) ,.., B(A.) A(A.) and B(A.) are the same.
if and only !f the invariant polynomials of
Proof Suppose the invariant polynomials of A(A.) and B(A.) are the same. Then their Smith forms are equal:
where det E;(A.)
=const # 0, det F;(A.) =const # 0, i = 1, 2. Consequently,
(£ 1 (..1.))- 1 A(A.)(F 1 (..1.))- 1
=
(£ 2 (..1.))- 1 B(A.)(F z(A.))- 1 ( =D(A.))
334
S.l.
THE SMITH FORM AND RELATED PROBLEMS
and A(Jc) = E(Jc)B(Jc)F(Jc),
where E(Jc) = E 1 (Jc)(E 2 (Jc))- 1 , F(Jc) = F 1 (Jc)(F 2 (Jc))- 1 . Since EiJc) and F 2 (Jc) are matrix polynomials with constant nonzero determinants, the same
is true for E- 1 (Jc) and p- 1 (Jc), and, consequently, for E(Jc) and F(Jc). So A(Jc) "' B(Jc). Conversely, suppose A(Jc) = E(Jc)B(Jc)F(Jc), where det E(Jc) const # 0, det F(Jc) const # 0. Let D(Jc) be the Smith form for B(Jc):
=
=
B(Jc) = E 1 (Jc)D(Jc)F 1 (Jc).
Then D(Jc) is also the Smith form for A(Jc): A(Jc) = E(Jc)E 1 (Jc)D(Jc)F 1 (Jc)F(Jc).
By the uniqueness of the Smith form for A(Jc) (more exactly, by the uniqueness of the invariant polynomials of A(Jc)) it follows that the invariant polynomials of A(Jc) are the same as those of B(Jc). D Of special interest is the equivalence of n x n linear matrix polynomials of the form /A - A. It turns out that for linear polynomials the concept of equivalence is closely related to the concept of similarity between matrices. Recall, that matrices A and Bare similar if A = SBS- 1 for some nonsingular matrix S. The following theorem clarifies this relationship. Theorem S1.12.
/A - A "' /A - B
if and only if A and B are similar.
To prove this theorem, we have to introduce division of matrix polynomials. We restrict ourselves to the case when the dividend is a general matrix polynomial A(Jc) = L~=o Ai,ti, and the divisor is a matrix polynomial of type I A + X, where X is a constant n x n matrix. In this case the following representation holds: A(Jc) = Q,(Jc) (H
+ X) + R"
(Sl.45)
where Q.(Jc) is a matrix polynomial, which is called the right quotient, and R, is a constant matrix, which is called the right remainder, on division of A(Jc)byiA+X; (Sl.46) where Q1(Jc) is the left quotient, and R 1 is the left remainder (R 1 is a constant matrix). Let us check the existence of representation (Sl.45) ((Sl.46) can be checked analogously). If I = 0 (i.e., A(Jc) is constant), put Q.(Jc) 0 and
=
Sl.6.
335
EQUIVALENCE OF MATRIX POLYNOMIALS
R, = A(A.). So we can suppose 1 ~ 1; write Q,(A.) = 2:~:6 Q7lA.j. Comparing the coefficients of the same degree of A. in right and left hand sides of (Sl.45), this relation can be rewritten as follows: A1-1
= Q!'2z + Q!'2tx, ··· A 0 = Qg>x + R,.
A1
= Qg> + Q~>x,
Clearly, these equalities define Q!~ 1, ... , Q~l, Qg>, and R., sequentially. It follows from this argument that the left and right quotient and remainder are uniquely defined. Proof of Theorem S1.12. In one direction this result is immediate: if A= SBS- 1 forsomenonsingularS,thentheequalityH- A= S(IA.- B)S- 1 proves the equivalence of I A. - A and I A. - B. Conversely, suppose I A. - A ,..., !A. - B. Then for some matrix polynomials E(A.) and F(A.) with constant nonzero determinant we have E(A.)(IA.- A)F(A.) =!A.- B.
Suppose that division of (E(A.))- 1 on the left by !A. - A and of F(A.) on the right by I A. - B yield (E(A.))- 1 = (I A. - A)S(A.)
+ E0 ,
(Sl.47)
F(A.) = T(A.)(IA.- B)+ F 0 .
Substituting in the equation (E(A.))- 1 (/A.- B)= (IA.- A)F(A.), we obtain {(IA.- A)S(A.)
+ E 0 }(IA.-
B)= (IA.- A){T(A.)(IA.- B)+ F 0 },
whence (IA.- A)(S(A.)- T(A.))(JA.- B)= (JA.- A)F 0
-
E 0 (IA.- B).
Since the degree of the matrix polynomial in the right-hand side here is 1 it follows that S(A.) = T(A.); otherwise the degree of the matrix polynomial on the left is at least 2. Hence, (JA.- A)F 0 = E 0 (A.I- B),
so that AF 0 = E 0 B,
and
It remains only to prove that E 0 is nonsingular. To this end divide E(A.) on the left by !A. - B:
E(A.) = (JA.- B)U(A.)
+ R0 .
(Sl.48)
Sl.
336
THE SMITH FORM AND RELATED PROBLEMS
Then using Eqs. (S1.47) and (S1.48) we have
+ E 0 } {(H - B)V(A) + R 0 } + (H - A)F 0 V(A)
I = (E(,l))- 1E(A) = {(H - A)S(A) = (H - A) {S(A)(H - B)V(A)}
+ (H=
A)S(A)R 0
+ E0 R0
(IA- A)[S(A)(H- B)V(A)
+ F 0 V(A) + S(A)R 0 ] + E 0 R 0 .
Hence the matrix polynomial in the square brackets is zero, and E 0 R 0 =I, i.e., E 0 is nonsingular. D S1.7. Jordan Normal Form
We prove here (as an application of the Smith form) the theorem on the existence of a Jordan form for every square matrix. Let us start with some definitions. A square matrix of type
0
0
A0
is called a Jordan block, and Ao is its eigenvalue. A matrix J is called Jordan, if it is a block diagonal matrix formed by Jordan blocks on the main diagonal: 0
0
where Ji are Jordan blocks (their eigenvalues may be arbitrary). The theorem on existence and uniqueness of a Jordan form can now be stated as follows: Theorem S1.13. Every n x n matrix B is similar to a Jordan matrix. This Jordan matrix is unique up to permutation of some Jordan blocks.
Before we start to prove this theorem, let us compute the elementary divisors of H- J where J is a Jordan matrix. Consider first the matrix polynomial I A - J 0 , where J 0 is a Jordan block of size k and eigenvalud 0 . Clearly, det(J A - J 0 ) = (A - A0)k. On the other hand, there exists a minor (namely, composed by rows 1, ... , k - 1 and columns
Sl.8.
337
FUNCTIONS OF MATRICES
2, ... , k) of order k - 1 which is 1. So the greatest common divisor of the minors of order k - 1 is also 1, and by Theorem 1.2 the Smith form of H- J 0 is
r: : : : (J.J i.e., with the single elementary divisor (A. - A. 0)k. Now let A. 1 , .•• , A.P be different complex numbers and let J be the Jordan matrix consisting of Jordan blocks of sizes ct; 1 , •.. , ct;,k, and eigenvalue A.;, i = 1, ... , p. Using Proposition Sl.5 and the assertion proved in the preceding paragraph, we see that the elementary divisors of H - J are just (A. - A.;)"''i, j = 1, ... ' k;, i = 1, ... 'p. Proof of Theorem S1.13. The proof is based on reduction to the equivalence of matrix polynomials. Namely, B is similar to a Jordan matrix J if andonlyifH- B,...., H- J,andthelatterconditionmeansthattheinvariant polynomials of H - Band H - J are the same. From the investigation of the Smith form of I A. - J, where J is a Jordan matrix, it is clear how to construct a Jordan matrix J such that I A. - J and I A. - B have the same elementary divisors. Namely, J contains exactly one Jordan block of size r with eigenvalue A. 0 for every elementary divisor (A. - A. 0 )' of I A. - B. The size of J is then equal to the sum of degrees of all elementary divisors of I A. - B, which in turn is equal to the degree of det(J A. - B), i.e., to the size n of B. Thus, the sizes of J and Bare the same, and Theorem Sl.13 follows. 0 S1.8. Functions of Matrices LetT be ann x n matrix. We would like to give a meaning to f(T), as a matrix-valued function, where f(A.) is some scalar function of a complex variable. We shall restrict our attention here to functions f which are analytic in a neighborhood of the spectrum a(T) ofT (this case is sufficient for our purposes). So let f(A.) be a scalar-valued function which is analytic in some open set U c €suchthata(T) c U.Letr c Ubearectifiablecontour(dependingon f(A.)) such that a(T) is inside r; the contour r may consist of several simple contours (i.e., without self-intersections). For example, let a(T) = {A. 1, •.. , A.,} and let r = U~= 1 {A. E €II A. - A.;l = b;}, where b; > 0 are chosen small enough to ensure that r c U. Put, by definition
f
~ r f
- T)- 1 dA..
(Sl.49)
S 1.
338
THE SMITH FORM AND RELATED PROBLEMS
In view of the Cauchy integral formula, this definition does not depend on the choice of r (provided the above requirements on r are met). Note that according to (Sl.49), f(S- 1 TS) = s- 1f(T)S for any nonsingular matrix S. The following proposition shows that the definition (Sl.49) is consistent with the expected value of f(T) = T; for the analytic functions f(A,) = A,i, i = 0, 1, .... Proposition 81.14.
j = 0, 1, .... Proof
Suppose first that Tis a Jordan block with eigenvalue),
0
1
(Sl.50)
= 0:
0 0
0 0 1
T=
(Sl.51) 1 0
0 0 Then
(Sl.52)
0 (recall that n is the size of T). So
0 placej + I
0
0
1
0
0
S1.8.
339
FUNCTIONS OF MATRICES
It is then easy to verify (Sl.50) for a Jordan block T with eigenvalue A. 0 (not necessarily 0). Indeed, T - il 0 I has an eigenvalue 0, so by the case already already considered,
j = 0, 1, ... ' where r 0 = {A. - il 0 1 A. E r}. The change of variables J1 = A. left-hand side leads to j
= 0,
+ il 0
in the
1, . . . . (Sl.53)
Now
so (Sl.50) holds. Applying (Sl.50) separately for each Jordan block, we can carry (Sl.50) further for arbitrary Jordan matrices T. Finally, for a given matrix T there exists a Jordan matrix J and an invertible matrix S such that T = s- 1JS. Since (Sl.50) is already proved for J, we have
0 The definition of this function of T establishes the map cp: A -+ q;n x nfrom the set A of all functions analytic in a neighborhood of a(T) (each function in its own neighborhood) to the set C x n of all n x n matrices:
cp(f)
=
J(T)
= -21 . 1tl
r f(il)(Iil -
Jr
T)- 1 dil,
f(il) EA.
(Sl.54)
Observe that the set A is a commutative algebra with the natural definitions: (f · g)(il) = f(il) · g(il) E A for every pair offunctions f(il), g(il) E A, (ocf)(il) = oc · f(il) for f(il) E A and oc E fl. Proposition Sl.14 is a special case of the following theorem (where we use the notions introduced above):
S 1.
340
THE SMITH FORM AND RELATED PROBLEMS
rr
Theorem 81.15. The map (/): A -+ X" is a homomorphism of algebra A into the algebra of all n x n matrices. In other words,
cp(f . g) = cp(f) . cp(g ), cp(af
+ f3g) =
where f().), g().) E A and a, f3
E
(S1.55)
+ f3cp(g),
acp(f)
(S1.56)
fl.
Proof As in the proof of Proposition S 1.14 the general case is reduced to the case when Tis a single Jordan block with eigenvalue zero as in (S1.51). Letf().) = L~o ).ijj, g().) = L~o ).jgj be the developments off().) and g().) into power series in a neighborhood of zero. Then, using,(Sl.52) we find that
1
r
co
L ;.iJJ cp(f) = -2. m Jr j;o
[
;_-1 0
;_-2 ;_-1
:
:
.
.
0 fo = [0 : 0
f1 fo : 0
... ...
;_-·
.
d).
:
r
0
1
;_-n+1 1
(Sl.57)
· · · fn-11 ·.. : f1 fo
Ir: 71 ~(f. ~l~ r..h~:l
Analogously,
~Wl ~
g)
(SL58)
where the coefficients hj are taken from the development f().)g().) = L~o ).jhj. So hk = jjgk- j• k = 0, 1, ... , and direct computation of the product cp(f) · cp(g), using (S1.57) and (S1.58), shows that cp(fg) = cp(f) · cp(g), i.e., (Sl.55) holds. Since the equality (S1.56) is evident, Theorem Sl.15 follows. 0
L';o
We note the following formula, which was established in the proof of Theorem S1.15: if J is the Jordan block of size k x k with eigenvalue ). 0 , then 1 (k - 1)!
f(J) =
0
0
f(k-1)(). ) 0
(S1.59)
0
Sl.8.
FUNCTIONS OF MATRICES
341
for any analytic function f(A.) defined in a neighborhood of )., 0 (J(il(A, 0) means the ith derivative of f(A.) evaluated at A. 0 ). As an immediate consequence of Theorem S1.15 we obtain the following corollary. Corollary S1.16. of (J(T). Then
Let f(A.) and g(A.) be analytic functions in a neighborhood f(T) · g(T) = g(T) · f(T),
i.e., the matrices f(T) and g(T) commute. Comments
A good source for the theory of partial multiplicities and the local Smith form of operator valued analytic functions in infinite dimensional Banach space is [38]; see also [2, 37d]. For further development of the notion of equivalence see [28, 70c]. A generalization of the global Smith form for operator-valued analytic functions is obtained in [57a, 57b]; see also [37d].
Chapter S2
The Matrix Equation AX - XB = C
In this chapter we consider the equation (S2.1)
AX- XB = C,
where A, B, Care given matrices and X is a matrix to be found. We shall assume that A and B are square, but not necessarily of the same size: the size of A is r x r, and the size of B is s x s; then C, as well as X, is of size r x s. In the particular case of Eq. (S2.1) with A = B and C = 0, the solutions of (S2.1) are exactly the matrices which commute with A. This case is considered in Section S2.2. S2.1. Existence of Solutions of AX- XB = C
The main result on existence of solutions of(S2.1) reads as follows. Theorem S2.1.
Equation (S2.1) has a solution X
and are similar.
342
[~ ~]
if and only if the matrices
S2.1.
EXISTENCE OF SOLUTIONS OF
Proof
AX- XB = C
343
If (S2.1) has a solution X, then direct multiplication shows that
0J[JO IX] = [AOB' CJ . I -X] [AO B [OJ but
[0I
-X]= [I x]0
I
I
1
'
so
[~ ~]
and are similar with the similarity matrix [ I0
X] I
.
Suppose now that the matrices an
d [A0 CJ B
are similar:
[~ ~] = 8 -1[~ ~Js
(S2.2)
for some nonsingular matrix S. Define the linear transformations 0 1 and 0 2 on the set (l~!: of all (r + s) x (r + s) complex matrices (considered as an (r + s) 2 -dimensional vector space) as follows:
where Z
E (l~!:.
n1(Z) =
[~ ~Jz- z[~ ~J
!l 2 (Z) =
[~ ~Jz- z[~ ~J
Equality (S2.2) implies that
In particular, Ker n2 = {SZJZ E (l~!:
and
z E
Ker !ld.
Consequently, dim Ker 0
1
=dim Ker 0 2 .
(S2.3)
344
S2.
THE MATRIX EQUATION
AX- XB
=
c
One checks easily that Ker0 1 =
{[~ ~] IAT = TA,AU = VB,BV = VA,BW = wB}
and Ker0 2 =
{[~ ~] IAT + CV = TA,AU + CW = UB,BV =VA, BW = WB}
It is sufficient to find a matrix in Ker 0 2 of the form
(S2.4) (indeed, in view of the equality AU+ C( -I)= VB, the matrix U is a solution of (S2.1)). To this end introduce the set R consisting of all pairs of complex matrices (V, W), where Vis of sizes x rand W is of sizes x s such that BV = VA and BW = WB. Then R is a linear space with multiplication by a complex number and addition defined in a natural way: il((V, W)
(Vb Wl)
+ (V2,
= (il(V, il(W),
W2) = (V1
+ V2,
I)(
Wl
E
C,
+ W2).
Define linear transformations <1\: Ker 0;-+ R, i = 1, 2 by the formula
{~ ~] = (V, W). Then Ker
{[~ ~] IAT = TA,AU =VB}.
(S2.5)
In addition, we have (S2.6) To see this, observe that Im <1> 1 then
= R, because if BV =
VA and BW
=
WB,
Therefore (S2.7)
S2.2.
345
COMMUTING MATRICES
On the other hand, by the well-known property oflinear transformations, dim Ker
i
= 1, 2.
In view of (S2.3) and (S2.5), dim Im <11 1 = dim Im <11 2 ; now (S2.6) follows from (S2. 7). Consider the pair
By (S2.6), there exists a matrix
such that
This means that V0 = 0, W0 = -I; so we have found a matrix in Ker 0 2 of the form (S2.4). 0 S2.2. Commuting Matrices
Matrices A and B (both of the same size n x n) are said to commute if AB = BA. We shall describe the set of all matrices which commute with a given matrix A. In other words, we wish to find all the solutions of the equation (S2.8)
AX= XA,
where X is an n x n matrix to be found. Recall that (S2.8) is a particular case ofEq. (S2.1) (with B =A and C = 0). We can restrict ourselves to the case that A is in the Jordan form. Indeed, let J = s- 1 AS be a Jordan matrix for some nonsingular matrix S. Then X is a solution of (S2.8) if and only if Z = s- 1 X Sis a solution of JZ
= ZJ.
(S2.9)
So we shall assume that A = J is in the Jordan form. Write J = diag(J b
... ,
Ju),
S2.
346
THE MATRIX EQUATION
AX- XB
where Ja (a = 1, ... , u) is a Jordan block of size ma x ma, Ja = A.ala where I a is the unit matrix of size ma x ma, and 0
1 0
=
c
+ Ha,
0
H IX = 1
0
0
Let Z be a matrix which satisfies (S2.9). Write Z = (Zap);, IJ= 1•
where Zap is a ma x mp matrix. Rewrite Eq. (S2.9) in the form (A.IX - Ap)Zrxp = ZrxpHp - HaZrxP•
1 :::;; a, f3:::;; u.
(S2.10)
Two cases can occur: (1) A.a =I= A.p. We show that in this case Zap = 0. Indeed, multiply the left-hand side of (S2.10) by A.a - A.p and in each term in the right-hand side replace (A.a - Ap)Zap by ZapHp - HaZap· We obtain (A.a- A.p) 2 Zap = ZapHi- 2HaZapHp
+ H;zap·
Repeating this process, we obtain for every p = 1, 2, ... , p
L (-1)q(~)H~ZapH~-q·
(A.rx- Ap)PZap =
(S2.11)
q=O
Choose p large enough so that either H~ = 0 or H~-q = 0 for every = 0, ... , p. Then the right-hand side of (S2.11) is zero, and since A.a =1= A.p, we obtain that Zap = 0. (2) A.a = A.p. Then
q
ZapHp = HaZap·
(S2.12)
From the structure of Ha and Hp it follows that the product HaZap is obtained from Zap by shifting all the rows one place upwards and filling the last row with zeros; similarly, ZapHp is obtained from Zap by shifting all the columns one place to the right and filling the first column with zeros. So Eq. (S2.12) gives (where (;k is the (i, k)thentryin ZaP• which depends, of course, on a and /3): (;+ l,k = ,i,k- b
where by definition (;o = Zap has the structure
i=1, ... ,mrz, k=1, ... ,mp,
Cm~+ l,k =
0. These equalities mean that the matrix
S2.2.
C(2) ap
. . •
cW. ···
(2)
l
COMMUTING MATRICES
form~<
C(mal ap
c~jja-O
..
...
0
d1l ap
mp: Z 11 p =
(3) for m11 > mp: Z 11 p =
~ 0
[
347
1 c(i)
\ afJ
TmJ;
[T~~~]
E
fl) ·'
(S2.13)
(S2.14) (S2.15)
}m!X-mp
Matrices of types (S2.13)-(S2.15) will be referred to as upper triangular Toeplitz matrices. So we have proved the following result. Theorem S2.2. Let J = diag[J 1, ••• , Jk] be ann x n Jordan matrix with Jordan blocks J 1 , •.• , J k and eigenvalues A. 1 , ..• , A.k, respectively. Then an n x nmatrixZcommuteswithJifandonlyifZafJ = OforA11 =f. ApandZ11 pisan upper triangular Toeplitz matrix for A11 = Ap, where Z = (Z 11 p)a,p= 1 , ... ,k is the partition ofZ consistent with the partition ofJ into Jordan blocks.
We repeat that Theorem S2.2 gives, after applying a suitable similarity transformation, a description of all matrices commuting with a fixed matrix A. Corollary S2.3. If a(A) 11 a(B) = AX - X B = C has a unique solution.
0, then for all
C the equation
Proof Since Eq. (S2.1) is linear, it is sufficient to show that AX- XB = 0 has only the zero solution. Indeed, let X be a solution of this equation; then [b 7J commutes with[~ However, since a(A) 11 a(B) = 0, Theorem S2.2 implies that X = 0. 0
n
Comments Theorem S2.1 was proved in [72]. The proof presented here is from [18]. For a comprehensive treatment of commuting matrices see [22, Chapter VIII]. Formulas for the solution of Eq. (S2.1) when a(A) 11 a(B) =f. 0, as well as a treatment of more general equations in infinite dimensional spaces can be found in [16a].
Chapter S3
One-Sided and Generalized Inverses
In the main text we use some well-known facts about one-sided and generalized inverses. These facts, in an appropriate form, are presented here with their proofs. Let A be an m x n matrix with complex entries. A is called left invertible (resp. right invertible) if there exists an n x m complex matrix A 1 such that A 1A =I (resp. AA1 = /). In this case A 1 is called a left (resp. right) inverse of A. The notion of one-sided invertibility is a generalization of the notion of invertibility (nonsingularity) of square matrices. If A is a square matrix and det A =1= 0, then A- 1 is the unique left and right inverse of A. One-sided invertibility is easily characterized in other ways. Thus, in the case of left invertibility, the following statements are equivalent; (i) (ii) (iii) from (j;H
them x n matrix A is left invertible; the columns of A are linearly independent in ft;m (in particular, m 2:: n); Ker A = {0}, where A is considered as a linear transformation to ¢m. ·
Let us check this assertion. Assume A is left invertible, and assume that Ax= 0 for some x E fl.". Let A1 be a left inverse of A. Then x = A 1Ax = 0 i.e., Ker A = {0}, so (i) =(iii) follows. On the other hand, if Ker A = {0}, then the columnsf1, ..• , fn of A are 348
ONE-SIDED AND GENERALIZED INVERSES
349
linearly independent. For equality Li=o a;/;= 0, for some complex numbers a 1, ... , a., yields
Finally, we shall construct a left inverse of A provided the columns of A are linearly independent. In this case rank A = n, so we can choose a set of n linearly independent rows of A, and let A 0 be the square n x n submatrix of A formed by these rows. Since the rows of A 0 are linearly independent, A 0 is nonsingular, so there exists an inverse A 0 1. Now construct a left inverse A 1 of A as follows: A 1 is of size n x m; the columns of A 1 corresponding to the chosen linearly independent set of rows in A, form the matrix A 0 1 ; all other columns of A 1 are zeros. By a straightforward multiplication on checks that A 1A = I, and (ii) = (i) is proved. The above construction of a left inverse for the left invertible matrix A can be employed further to obtain a general formula for the left inverse A 1• Namely, assuming for simplicity that the nonsingular n x n submatrix A 0 of A occupies the top n rows, write
Then a straightforward calculation shows that (S3.1) is a left inverse for A for every n x (m - n) matrix B. Conversely, if A 1 is a left inverse of A, it can be represented in the form (S3.1). In particular, a left inverse is unique if and only if the left invertible matrix A is square (and then necessarily nonsingular). In the case of right invertibility, we have equivalence of the following: (iv) (v) (vi)
the m x n matrix A is right invertible; the rows of A are linearly independent in fl" (in particular, m :::;; n); Im A = flm.
This assertion can be obtained by arguments like those used above (or by using the equivalence of (i), (ii), and (iii) for the transposed matrix AT). In an analogous way, the m x n matrix A is right invertible if and only if there exists a nonsingular m x m submatrix A 0 of A (this can be seen from (v)). Assuming for simplicity of notation that A 0 occupies the leftmost
350
S3.
ONE-SIDED AND GENERALIZED INVERSES
columns of A, and writing A= [A 0 , A 1], we obtain the general formula for a right inverse AI (analogous to (S3.1)): (S3.2) where Cis an arbitrary (n - m) x m matrix. Right invertibility of an m x n matrix A is closely related to the equation (S3.3)
Ax= y,
cm
where y E is given and X then obviously
E
cn is to be found. Assume A is right invertible; (S3.4)
where AI is any right inverse of A, is a solution of (S3.3). It turns out that formula (S3.4) gives the general solution of (S3.3) provided y ¥= 0. Indeed, write A = [A 0 , A 1 ] with nonsingular matrix A 0 , and let x = [~~] be a solution of (S3.3), where the partition is consistent with that of A. Since y "I= 0, there exists an (n - m) x m matrix C such that Cy = x 2 • Then using (S3.3),
(A 0 1
A0 1 A1 C)y = A 0 1 y- A0 1 A1 x 2 = A 0 1 y- A0 1(y- A0 x 1 ) = x 1 • In other words, Aiy = x, where AI is given by (S3.2). Thus, for y ¥= 0, every -
solution of (S3.3) is in the form (S3.4). We point out that one-sided invertibility is stable under small perturbations. Namely, let A be a one-sided (left or right) invertible matrix. Then there exists B > 0 such that every matrix B with IIA - Bll < B is also onesided invertible, and from the same side as A. (Here and in what follows the norm II M I of the matrix M is understood as follows: II M I = sup 11 x 11 = 1 II M x II, where the norms of vectors x and Mx are euclidean.) This property follows from (ii) (in case ofleft invertibility) and from (v) (in case of right invertibility). Indeed, assume for example that A is left invertible (of size m x n). Let A 0 be a nonsingular n x n submatrix of A. The corresponding n x n submatrix B 0 of B is as close as we wish to A 0 , provided IIA - Bll is sufficiently small. Since nonsingularity of a matrix is preserved under small perturbations, B 0 will be nonsingular as well if IIA - Bll is small enough. But this implies linear independence of the columns of B, and therefore B is left invertible. One can take B = IIAIII- 1 in the preceding paragraph, where AI is some right or left inverse of A. Indeed, suppose for definiteness that A is right invertible, AAI =I. Let B be such that IIA- Bll < IIAIII- 1 . Then BAI = AAI + (B- A)AI =I+ S, whereS = (B- A)AI. By the conditions liS II< 1, I + S is invertible and (I
+ s)- 1 =
I -
s + s2
-
s3 + ... ,
351
ONE-SIDED AND GENERALIZED INVERSES
where the series converges absolutely (because liS II < 1). Thus, BAI(l + S)- 1 = I, i.e., B is right invertible. We conclude this chapter with the notion of generalized inverse. Ann x m matrix AI is called a generalized inverse of them x n matrix A if the following equalities hold:
This notion incorporates as special cases the notions of one-sided (left or right) inverses. A generalized inverse of A exists always (in contrast with one-sided inverses which exist if and only if A has full rank). One of the easiest ways to verify the existence of a generalized inverse is by using the representation A=
B 1 [~ ~JB2,
(S3.5)
where B 1 and B 2 are nonsingular matrices of sizes m x m and n x n, respectively. Representation (S3.5) may be achieved by performing elementary transformations of columns and rows of A. For A in the form (S3.5), a generalized inverse can be found easily:
o]B-1.
AI= B-2[1 2 0 0
1
Using (S3.5), one can easily verify that a generalized inverse is unique if and only if A is square and nonsingular. We shall also need the following fact: if .It c (f;m is a direct complement to Im A, and .;V c (f;n is a direct complement to Ker A, then there exists a generalized inverse AI of A such that Ker AI =.It and Im AI = %. (For the definition of direct complements see Section S4.1.) Indeed, let us check this assertion first for (S3.6) The conditions on .It and .Jt
.;V
imply that
= Im[
X
Jm-r
J
% =
Im[iJ
for some r x (m - r) matrix X and (n - r) x r matrix Y. Then
352
S3.
ONE-SIDED AND GENERALIZED INVERSES
is a generalized inverse of A with the properties that Ker A 1 = ~~ and Im A1 = %. The general case is easily reduced to the case (S3.6) by means of representation (S3.5), using the relations Ker A= B2 1 (Ker D), where
Im A= B 1 (1m D),
Chapter S4
Stable Invariant Subspaces
In Chapters 3 and 7 it is proved that the description of divisors of a matrix polynomial depends on the structure of invariant subs paces for its linearization. So the properties of the invariant subspaces for a given matrix (or linear transformation) play an important role in the spectral analysis of matrix polynomials. Consequently, we consider in this chapter invariant subspaces of a linear transformation acting in a finite dimensional linear vector space fll". In the main, attention is focused on a certain class of invariant subs paces which will be called stable. Sections S4.3, S4.4, and S4.5 are auxiliary, but the results presented there are also used in the main text. We shall often assume that f!( = fl", where fl" is considered as the linear space of n-dimensional column vectors together with the customary scalar product (x, y) defined in fl" by n
(x, y) =
L xiyi, i= 1
where x = (x 1 , .•. , xn?, y = (y 1 , ..• , Yn? E fl". We may also assume that linear transformations are represented by matrices in the standard orthonormal basis in fl". S4.1. Projectors and Subspaces
A linear transformation P: 𝒳 → 𝒳 is called a projector if P² = P. It follows immediately from the definition that if P is a projector, so is S⁻¹PS for any invertible transformation S. The important feature of projectors is that there exists a one-to-one correspondence between the set of all projectors and the set of all pairs of complementary subspaces in 𝒳. This correspondence is described in Theorem S4.1 below.

Recall first that, if ℳ, ℒ are subspaces of 𝒳, then ℳ + ℒ = {z ∈ 𝒳 | z = x + y, x ∈ ℳ, y ∈ ℒ}. This sum is said to be direct if ℳ ∩ ℒ = {0}, in which case we write ℳ ∔ ℒ for the sum. The subspaces ℳ, ℒ are complementary (are direct complements of each other) if ℳ ∩ ℒ = {0} and ℳ ∔ ℒ = 𝒳. Subspaces ℳ, ℒ are orthogonal if for each x ∈ ℳ and y ∈ ℒ we have (x, y) = 0, and they are orthogonal complements if, in addition, they are complementary. In this case we write ℳ = ℒ^⊥, ℒ = ℳ^⊥.

Theorem S4.1. Let P be a projector. Then (Im P, Ker P) is a pair of complementary subspaces in 𝒳. Conversely, for every pair (ℒ₁, ℒ₂) of complementary subspaces in 𝒳, there exists a unique projector P such that Im P = ℒ₁, Ker P = ℒ₂.

Proof. Let x ∈ 𝒳. Then

x = (x − Px) + Px.

Clearly, Px ∈ Im P and x − Px ∈ Ker P (because P² = P). So Im P + Ker P = 𝒳. Further, if x ∈ Im P ∩ Ker P, then x = Py for some y ∈ 𝒳 and Px = 0. So

x = Py = P²y = P(Py) = Px = 0,

and Im P ∩ Ker P = {0}. Hence Im P and Ker P are indeed complementary subspaces.

Conversely, let ℒ₁ and ℒ₂ be a pair of complementary subspaces. Let P be the unique linear transformation in 𝒳 such that Px = x for x ∈ ℒ₁ and Px = 0 for x ∈ ℒ₂. Then clearly P² = P, ℒ₁ ⊂ Im P, and ℒ₂ ⊂ Ker P. But we already know from the first part of the proof that Im P ∔ Ker P = 𝒳. By dimensional considerations we have, consequently, ℒ₁ = Im P and ℒ₂ = Ker P. So P is a projector with the desired properties. The uniqueness of P follows from the property that Px = x for every x ∈ Im P (which in turn is a consequence of the equality P² = P). □
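Theorem S4.1 is constructive: given bases of a pair of complementary subspaces, the projector onto the first along the second can be assembled explicitly. A minimal numerical sketch (ours, not the book's; the column spans of `L1` and `L2` stand for ℒ₁ and ℒ₂):

```python
import numpy as np

def projector_on_along(L1, L2):
    """Projector P with Im P = span(L1), Ker P = span(L2).

    In the basis formed by the columns of [L1 L2], P acts as diag(I, 0);
    the change of basis gives P in the standard basis.
    """
    n = L1.shape[0]
    B = np.hstack([L1, L2])            # invertible iff the subspaces are complementary
    D = np.zeros((n, n))
    D[:L1.shape[1], :L1.shape[1]] = np.eye(L1.shape[1])
    return B @ D @ np.linalg.inv(B)

L1 = np.array([[1.0], [1.0]])          # span{(1,1)^T}
L2 = np.array([[1.0], [0.0]])          # span{(1,0)^T}
P = projector_on_along(L1, L2)
assert np.allclose(P @ P, P)           # P is a projector
assert np.allclose(P @ L1, L1)         # Px = x on Im P
assert np.allclose(P @ L2, 0)          # Px = 0 on Ker P
```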
We say that P is the projector on ℒ₁ along ℒ₂ if Im P = ℒ₁, Ker P = ℒ₂. A projector P is called orthogonal if Ker P = (Im P)^⊥. Orthogonal projectors are particularly important and can be characterized as follows:

Proposition S4.2. A projector P is orthogonal if and only if P is self-adjoint, i.e., P* = P.

Recall that for a linear transformation T: ℂⁿ → ℂⁿ the adjoint T*: ℂⁿ → ℂⁿ is defined by the relation (Tx, y) = (x, T*y) for all x, y ∈ ℂⁿ. A transformation T is self-adjoint if T = T*; in such a case it is represented by a hermitian matrix in the standard orthonormal basis.

Proof. Suppose P* = P, and let x ∈ Im P, y ∈ Ker P. Then (x, y) = (Px, y) = (x, Py) = (x, 0) = 0, i.e., Ker P is orthogonal to Im P. Since by Theorem S4.1 Ker P and Im P are complementary, it follows that in fact Ker P = (Im P)^⊥.

Conversely, let Ker P = (Im P)^⊥. In order to prove that P* = P, we have to check the equality

(Px, y) = (x, Py) for all x, y ∈ ℂⁿ.  (S4.1)

Because of the sesquilinearity of the function (Px, y) in the arguments x, y ∈ 𝒳, and in view of Theorem S4.1, it is sufficient to prove (S4.1) in the following four cases: (1) x, y ∈ Im P; (2) x ∈ Ker P, y ∈ Im P; (3) x ∈ Im P, y ∈ Ker P; (4) x, y ∈ Ker P. In case (4) the equality (S4.1) is trivial because both sides are 0. In case (1) we have

(Px, y) = (Px, Py) = (x, Py),

and (S4.1) follows. In case (2), the left-hand side of (S4.1) is zero (since x ∈ Ker P) and the right-hand side is also zero in view of the orthogonality Ker P = (Im P)^⊥. In the same way one checks (S4.1) in case (3). So (S4.1) holds, and P* = P. □

Note that if P is a projector, so is I − P. Indeed, (I − P)² = I − 2P + P² = I − 2P + P = I − P. Moreover, Ker P = Im(I − P) and Im P = Ker(I − P). It is natural to call the projectors P and I − P complementary projectors.

We shall now give useful representations of a projector with respect to a decomposition of 𝒳 into a sum of two complementary subspaces. Let T: 𝒳 → 𝒳 be a linear transformation and let ℒ₁, ℒ₂ be a pair of complementary subspaces in 𝒳. Denote mᵢ = dim ℒᵢ (i = 1, 2); then m₁ + m₂ = n (= dim 𝒳). The transformation T may be written as a 2 × 2 block matrix with respect to the decomposition ℒ₁ ∔ ℒ₂ = 𝒳:

T = [T₁₁ T₁₂; T₂₁ T₂₂].  (S4.2)

Here T_{ij} (i, j = 1, 2) is an mᵢ × mⱼ matrix which represents in some basis the linear transformation Pᵢ T|_{ℒⱼ}: ℒⱼ → ℒᵢ, where Pᵢ is the projector on ℒᵢ along ℒ_{3−i} (so P₁ + P₂ = I).
Suppose now that T = P is a projector on ℒ₁ = Im P. Then representation (S4.2) takes the form

P = [I X; 0 0]  (S4.3)

for some matrix X. In general X ≠ 0; one can check easily that X = 0 if and only if ℒ₂ = Ker P. Analogously, if ℒ₁ = Ker P, then (S4.2) takes the form

P = [0 Y; 0 I],  (S4.4)

and Y = 0 if and only if ℒ₂ = Im P. By the way, the direct multiplication P·P, where P is given by (S4.3) or (S4.4), shows that P is indeed a projector: P² = P.

S4.2. Spectral Invariant Subspaces and Riesz Projectors
Let A: 𝒳 → 𝒳 be a linear transformation on a linear vector space 𝒳 with dim 𝒳 = n. Recall that a subspace ℒ is invariant for A (or A-invariant) if Aℒ ⊂ ℒ. The subspace consisting of the zero vector, and 𝒳 itself, are invariant for every linear transformation. A less trivial example is a one-dimensional subspace spanned by an eigenvector.

An A-invariant subspace ℒ′ is called A-reducing if there exists a direct complement ℒ″ to ℒ′ in 𝒳 which is also A-invariant. In this case we shall also say that the pair of subspaces (ℒ′, ℒ″) is A-reducing. Not every A-invariant subspace is A-reducing: for example, if A is a Jordan block of size k, considered as a linear transformation from ℂ^k to ℂ^k, then there exists a chain of A-invariant subspaces ℂ^k ⊃ ℒ_{k−1} ⊃ ℒ_{k−2} ⊃ ⋯ ⊃ ℒ₁ ⊃ {0}, where ℒᵢ is the subspace spanned by the first i unit coordinate vectors. It is easily seen that none of the invariant subspaces ℒᵢ (i = 1, …, k − 1) is A-reducing (because ℒᵢ is the only A-invariant subspace of dimension i).

From the definition it follows immediately that ℒ is an A-invariant (resp. A-reducing) subspace if and only if Sℒ is an invariant (resp. reducing) subspace for the linear transformation SAS⁻¹, where S: 𝒳 → 𝒳 is invertible. This simple observation allows us to use the Jordan normal form of A in many questions concerning invariant and reducing subspaces, and we frequently take advantage of this in what follows.

Invariant and reducing subspaces can be characterized in terms of projectors, as follows: let A: 𝒳 → 𝒳 be a linear transformation and let P: 𝒳 → 𝒳 be a projector. Then

(i) the subspace Im P is A-invariant if and only if PAP = AP;
(ii) the pair of subspaces Im P, Ker P is A-reducing if and only if AP = PA.

Indeed, write A as a 2 × 2 block matrix with respect to the decomposition Im P ∔ Ker P = 𝒳:

A = [A₁₁ A₁₂; A₂₁ A₂₂]

(so, for instance, A₁₁ = PA|_{Im P}: Im P → Im P). The projector P has the corresponding representation (see (S4.3))

P = [I 0; 0 0].

Now PAP = AP means that A₂₁ = 0. But A₂₁ = 0 in turn means that Im P is A-invariant, and (i) is proved. Further, AP = PA means that A₂₁ = 0 and A₁₂ = 0. But clearly, the condition that Im P, Ker P is an A-reducing pair is the same. So (ii) holds as well.

We consider now an important class of reducing subspaces, namely, spectral invariant subspaces. Let σ(A) be the spectrum of A, i.e., the set of all its eigenvalues. An A-invariant subspace ℒ is called spectral with respect to a subset σ ⊂ σ(A) if ℒ is a maximal A-invariant subspace with the property that σ(A|_ℒ) ⊂ σ. If σ = {λ₀} consists of only one point λ₀, then the spectral subspace with respect to σ is just the root subspace of A corresponding to λ₀. To describe spectral invariant subspaces it is convenient to use Riesz projectors. Consider the linear transformation

P_σ = (1/2πi) ∮_{Γ_σ} (Iλ − A)⁻¹ dλ,  (S4.5)

where Γ_σ is a simple rectifiable contour such that σ is inside Γ_σ and σ(A)\σ is outside Γ_σ. The matrix P_σ given by (S4.5) is called the Riesz projector of A corresponding to the subset σ ⊂ σ(A).

We shall now list some simple properties of Riesz projectors. First, let us check that (S4.5) indeed defines a projector. To this end let f(λ) be the function defined as follows: f(λ) = 1 inside and on Γ_σ, and f(λ) = 0 outside Γ_σ. Then we can rewrite (S4.5) in the form

P_σ = (1/2πi) ∮_Γ f(λ)(Iλ − A)⁻¹ dλ,
where Γ is some contour such that σ(A) is inside Γ. Since f(λ) is analytic in a neighborhood of σ(A), by Theorem S1.15 we have

P_σ² = ((1/2πi) ∮_Γ f(λ)(Iλ − A)⁻¹ dλ) · ((1/2πi) ∮_Γ f(λ)(Iλ − A)⁻¹ dλ) = (1/2πi) ∮_Γ (f(λ))²(Iλ − A)⁻¹ dλ = (1/2πi) ∮_Γ f(λ)(Iλ − A)⁻¹ dλ = P_σ,

where we have used the fact that (f(λ))² = f(λ). So P_σ is a projector. By Corollary S1.16, P_σA = AP_σ, so Im P_σ is a reducing A-invariant subspace. Proposition S4.3 below shows that Im P_σ is in fact the spectral A-invariant subspace with respect to σ.

Another way to show that P_σ is a projector is to consider the linear transformation A in a Jordan basis. By a Jordan basis we mean a basis in which A is represented by a matrix in Jordan normal form. The existence of a Jordan basis follows from Theorem S1.13, if we take into account the fact that similarity of two matrices means that they represent the same linear transformation, but in different bases. So, identifying A with its matrix representation in a Jordan basis, we can write

A = [J_σ 0; 0 J_σ̄],  (S4.6)

where J_σ (resp. J_σ̄) is a Jordan matrix with spectrum σ (resp. σ̄ = σ(A)\σ). From formula (S4.6) we then deduce that

P_σ = [I 0; 0 0]
(in matrix form with respect to the same basis as in (S4.6)).

Using (S4.6) one can prove the following characterization of spectral invariant subspaces.

Proposition S4.3. The subspace Im P_σ is the maximal A-invariant subspace such that σ(A|_{Im P_σ}) ⊂ σ. Moreover, Im P_σ is characterized by this property: if ℒ is a maximal A-invariant subspace such that σ(A|_ℒ) ⊂ σ, then ℒ = Im P_σ.
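Formula (S4.5) can also be evaluated numerically by quadrature over a circle enclosing the chosen part of the spectrum. A minimal sketch (ours; the center and radius below are assumptions chosen so that the contour isolates the eigenvalue 2 of the sample matrix):

```python
import numpy as np

def riesz_projector(A, center, radius, npts=200):
    """Approximate P_sigma = (1/(2*pi*i)) * integral of (I*lam - A)^{-1} d(lam)
    over the circle |lam - center| = radius (trapezoidal rule)."""
    n = A.shape[0]
    P = np.zeros((n, n), dtype=complex)
    for k in range(npts):
        t = 2 * np.pi * k / npts
        lam = center + radius * np.exp(1j * t)
        dlam = 1j * radius * np.exp(1j * t) * (2 * np.pi / npts)
        P += np.linalg.solve(lam * np.eye(n) - A, np.eye(n)) * dlam
    return P / (2j * np.pi)

A = np.array([[2.0, 1.0],
              [0.0, 5.0]])
P = riesz_projector(A, center=2.0, radius=1.0)   # contour around the eigenvalue 2 only
assert np.allclose(P @ P, P, atol=1e-8)          # P is a projector
assert np.allclose(P @ A, A @ P, atol=1e-8)      # P commutes with A
```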
In particular, a spectral A-invariant subspace ℒ_σ (with respect to a given set σ ⊂ σ(A)) is unique and reduces A; namely, the spectral A-invariant subspace ℒ_σ̄ (where σ̄ = σ(A)\σ) is a direct complement to ℒ_σ in 𝒳.

Let λ₁, …, λ_k be all the different eigenvalues of A, and let

𝒲ᵢ = Ker(λᵢI − A)ⁿ,  i = 1, …, k,

where n = dim 𝒳. The subspace 𝒲ᵢ is the root subspace associated with the eigenvalue λᵢ. Writing A as a Jordan matrix (in a Jordan basis), it is easily seen that 𝒲ᵢ coincides with the spectral subspace corresponding to σ = {λᵢ}. Clearly,

𝒳 = 𝒲₁ ∔ ⋯ ∔ 𝒲_k  (S4.7)

decomposes 𝒳 into the direct sum of root subspaces. Moreover, the following proposition shows that an analogous decomposition holds for every A-invariant subspace.

Proposition S4.4. Let ℒ ⊂ 𝒳 be an A-invariant subspace. Then

ℒ = (ℒ ∩ 𝒲₁) ∔ ⋯ ∔ (ℒ ∩ 𝒲_k),  (S4.8)

where 𝒲₁, …, 𝒲_k are the root subspaces of A.
Proof. Let Pᵢ be the Riesz projector corresponding to λᵢ; then 𝒲ᵢ = Im Pᵢ, i = 1, …, k. On the other hand, P₁ + P₂ + ⋯ + P_k = I, so any x ∈ ℒ can be written in the form

x = P₁x + P₂x + ⋯ + P_k x.  (S4.9)

From formula (S4.5) for Riesz projectors and the fact that ℒ is A-invariant, it follows that Pᵢx ∈ ℒ for i = 1, …, k. Indeed, let Γᵢ be a contour such that λᵢ is inside Γᵢ and λⱼ, for j ≠ i, is outside Γᵢ. For every λ ∈ Γᵢ the linear transformation Iλ − A is invertible and (Iλ − A)x ∈ ℒ for every x ∈ ℒ. Therefore

(Iλ − A)⁻¹x ∈ ℒ for every x ∈ ℒ.  (S4.10)

Now, for fixed x ∈ ℒ, the integral

((1/2πi) ∮_{Γᵢ} (Iλ − A)⁻¹ dλ) x

is a limit of Riemann sums, all of them belonging to ℒ in view of (S4.10). Hence the integral, as a limit, also belongs to ℒ, i.e., Pᵢx ∈ ℒ.

We have thus proved that Pᵢx ∈ ℒ ∩ 𝒲ᵢ for any x ∈ ℒ. Now formula (S4.9) gives the inclusion ℒ ⊂ (ℒ ∩ 𝒲₁) + ⋯ + (ℒ ∩ 𝒲_k). Since the opposite inclusion is obvious, and the sum on the right-hand side of (S4.8) is direct (because (S4.7) is a direct sum), Proposition S4.4 follows. □
Proposition S4.4 often allows us to reduce the consideration of invariant subspaces of A to the case when σ(A) consists of a single point only (namely, by considering the restrictions A|_{𝒲ᵢ} to the root subspaces).

S4.3. The Gap between Subspaces

This section and the next are of an auxiliary character; they are devoted to the basic properties of the set of all subspaces in 𝒳 = ℂⁿ. These properties will be needed in Section S4.5 to prove results on stable invariant subspaces. As usual, we assume that ℂⁿ is endowed with the scalar product

(x, y) = Σ_{i=1}^n x_i ȳ_i,  x = (x₁, …, x_n)^T, y = (y₁, …, y_n)^T ∈ ℂⁿ,

and the corresponding norm

‖x‖ = (Σ_{i=1}^n |x_i|²)^{1/2},  x = (x₁, …, x_n)^T ∈ ℂⁿ.
The norm of an n × n matrix A (or of a linear transformation A: ℂⁿ → ℂⁿ) is defined accordingly:

‖A‖ = max_{x ∈ ℂⁿ\{0}} ‖Ax‖/‖x‖.

The gap between subspaces ℒ and ℳ (in ℂⁿ) is defined as

θ(ℒ, ℳ) = ‖P_ℳ − P_ℒ‖,  (S4.11)

where P_ℒ (P_ℳ) is the orthogonal projector on ℒ (ℳ) (refer to Section S4.1). It is clear from the definition that θ(ℒ, ℳ) is a metric on the set of all subspaces in ℂⁿ, i.e., θ(ℒ, ℳ) enjoys the following properties:

(i) θ(ℒ, ℳ) > 0 if ℒ ≠ ℳ, and θ(ℒ, ℒ) = 0;
(ii) θ(ℒ, ℳ) = θ(ℳ, ℒ);
(iii) θ(ℒ, ℳ) ≤ θ(ℒ, 𝒩) + θ(𝒩, ℳ).

Note also that θ(ℒ, ℳ) ≤ 1 (this property follows immediately from the characterization (S4.12) below). In what follows we shall denote by S_ℒ the unit sphere in a subspace ℒ ⊂ ℂⁿ, i.e., S_ℒ = {x | x ∈ ℒ, ‖x‖ = 1}. We shall also need the concept of the distance d(x, Z) from x ∈ ℂⁿ to a set Z ⊂ ℂⁿ; this is defined by d(x, Z) = inf_{t∈Z} ‖x − t‖.

Theorem S4.5. Let ℳ, ℒ be subspaces in ℂⁿ. Then

θ(ℒ, ℳ) = max{ sup_{x∈S_ℳ} d(x, ℒ), sup_{x∈S_ℒ} d(x, ℳ) }.  (S4.12)
If P₁ (P₂) are projectors with Im P₁ = ℒ (Im P₂ = ℳ), not necessarily orthogonal, then

θ(ℒ, ℳ) ≤ ‖P₁ − P₂‖.  (S4.13)

Proof. For every x ∈ S_ℒ we have

d(x, ℳ) ≤ ‖x − P₂x‖ = ‖(P₁ − P₂)x‖ ≤ ‖P₁ − P₂‖;

therefore

sup_{x∈S_ℒ} d(x, ℳ) ≤ ‖P₁ − P₂‖.

Analogously, sup_{x∈S_ℳ} d(x, ℒ) ≤ ‖P₁ − P₂‖; so

max{ρ_ℒ, ρ_ℳ} ≤ ‖P₁ − P₂‖,  (S4.14)

where ρ_ℒ = sup_{x∈S_ℒ} d(x, ℳ) and ρ_ℳ = sup_{x∈S_ℳ} d(x, ℒ). Observe that

ρ_ℒ = sup_{x∈S_ℒ} ‖(I − P_ℳ)x‖,  ρ_ℳ = sup_{x∈S_ℳ} ‖(I − P_ℒ)x‖.

Consequently, for every x ∈ ℂⁿ we have

‖(I − P_ℒ)P_ℳx‖ ≤ ρ_ℳ‖P_ℳx‖,  ‖(I − P_ℳ)P_ℒx‖ ≤ ρ_ℒ‖P_ℒx‖.  (S4.15)

Now

‖P_ℳ(I − P_ℒ)x‖² = ((I − P_ℒ)P_ℳ(I − P_ℒ)x, (I − P_ℒ)x) ≤ ‖(I − P_ℒ)P_ℳ(I − P_ℒ)x‖ · ‖(I − P_ℒ)x‖;

hence by (S4.15)

‖P_ℳ(I − P_ℒ)x‖² ≤ ρ_ℳ‖P_ℳ(I − P_ℒ)x‖ · ‖(I − P_ℒ)x‖,

so that

‖P_ℳ(I − P_ℒ)x‖ ≤ ρ_ℳ‖(I − P_ℒ)x‖.  (S4.16)

On the other hand, using the relation

P_ℳ − P_ℒ = P_ℳ(I − P_ℒ) − (I − P_ℳ)P_ℒ

and the orthogonality of P_ℳ, we obtain

‖(P_ℳ − P_ℒ)x‖² = ‖P_ℳ(I − P_ℒ)x‖² + ‖(I − P_ℳ)P_ℒx‖².

Combining this with (S4.15) and (S4.16) gives

‖(P_ℳ − P_ℒ)x‖² ≤ ρ_ℳ²‖(I − P_ℒ)x‖² + ρ_ℒ²‖P_ℒx‖² ≤ max{ρ_ℒ², ρ_ℳ²} ‖x‖².

So ‖P_ℳ − P_ℒ‖ ≤ max{ρ_ℒ, ρ_ℳ}. Using (S4.14) (with P₁ = P_ℒ, P₂ = P_ℳ) we obtain (S4.12). The inequality (S4.13) now follows from (S4.14). □
An important property of the metric θ(ℒ, ℳ) is that in a neighborhood of every subspace ℒ ⊂ ℂⁿ all subspaces have the same dimension (equal to dim ℒ). This is a consequence of the following theorem.

Theorem S4.6. If θ(ℒ, ℳ) < 1, then dim ℒ = dim ℳ.

Proof. The condition θ(ℒ, ℳ) < 1 implies that ℒ ∩ ℳ^⊥ = {0} and ℒ^⊥ ∩ ℳ = {0}. Indeed, suppose the contrary, and assume, for instance, that ℒ ∩ ℳ^⊥ ≠ {0}. Let x ∈ S_ℒ ∩ ℳ^⊥. Then d(x, ℳ) = 1, and by (S4.12), θ(ℒ, ℳ) ≥ 1, a contradiction. Now ℒ ∩ ℳ^⊥ = {0} implies that dim ℒ ≤ dim ℳ, and ℒ^⊥ ∩ ℳ = {0} implies that dim ℒ ≥ dim ℳ. □

It also follows directly from this proof that the hypothesis θ(ℒ, ℳ) < 1 implies ℂⁿ = ℒ ∔ ℳ^⊥ = ℒ^⊥ ∔ ℳ. In addition, we have

P_ℳ(ℒ) = ℳ,  P_ℒ(ℳ) = ℒ.

To see the first of these, for example, observe that for any x ∈ ℳ there is the unique decomposition x = y + z, y ∈ ℒ, z ∈ ℳ^⊥. Hence x = P_ℳx = P_ℳy, so that ℳ ⊂ P_ℳ(ℒ). But the reverse inclusion is obvious, and so we must have equality.

We conclude this section with the following result, which makes precise the idea that direct sum decompositions of ℂⁿ are stable under small perturbations of the subspaces in the gap metric.
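Definition (S4.11) translates directly into code: build orthonormal bases of the two subspaces, form the orthogonal projectors, and take the spectral norm of their difference. A minimal sketch (ours; the two sample lines in ℂ² are arbitrary illustrative choices):

```python
import numpy as np

def gap(L, M):
    """Gap theta(L, M) between the column spans of L and M, as in (S4.11)."""
    def orth_projector(B):
        Q, _ = np.linalg.qr(B)        # orthonormal basis of span(B)
        return Q @ Q.conj().T
    return np.linalg.norm(orth_projector(L) - orth_projector(M), 2)

L = np.array([[1.0], [0.0]])          # span{e1}
M = np.array([[1.0], [1.0]])          # span{(1,1)^T}
print(gap(L, M))                      # 1/sqrt(2), consistent with Theorem S4.5
```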
Theorem S4.7. Let ℳ, ℳ₁ ⊂ ℂⁿ be subspaces such that ℳ ∔ ℳ₁ = ℂⁿ. If 𝒩 is a subspace of ℂⁿ such that θ(ℳ, 𝒩) is sufficiently small, then

𝒩 ∔ ℳ₁ = ℂⁿ,  (S4.17)

and

θ(ℳ, 𝒩) ≤ ‖P_ℳ − P_𝒩‖ ≤ C θ(ℳ, 𝒩),  (S4.18)

where P_ℳ (P_𝒩) projects ℂⁿ onto ℳ (onto 𝒩) along ℳ₁, and C is a constant depending on ℳ and ℳ₁ but not on 𝒩.

Proof. Let us prove first that the sum 𝒩 + ℳ₁ is indeed direct. The condition that ℳ ∔ ℳ₁ = ℂⁿ is a direct sum implies that ‖x − y‖ ≥ δ > 0 for every x ∈ S_ℳ and every y ∈ ℳ₁; here δ is a fixed positive constant. Take 𝒩 so close to ℳ that θ(ℳ, 𝒩) ≤ δ/2. Then ‖z − y(z)‖ ≤ δ/2 for every z ∈ S_𝒩, where y(z) is the orthogonal projection of z on ℳ. Thus, for θ(ℳ, 𝒩) small enough, for every x ∈ S_{ℳ₁} and z ∈ S_𝒩 we have

‖x − z‖ ≥ ‖x − y(z)‖ − ‖z − y(z)‖ ≥ δ/2,
so 𝒩 ∩ ℳ₁ = {0}. By Theorem S4.6, dim 𝒩 = dim ℳ if θ(ℳ, 𝒩) < 1, so a dimension count shows that 𝒩 ∔ ℳ₁ = ℂⁿ for θ(ℳ, 𝒩) < 1, and (S4.17) follows.

To establish the right-hand inequality in (S4.18), two preliminary remarks are needed. First note that for any x ∈ ℳ, y ∈ ℳ₁ we have x = P_ℳ(x + y), so that

‖x + y‖ ≥ ‖P_ℳ‖⁻¹‖x‖.  (S4.19)

It is claimed that, for θ(ℳ, 𝒩) small enough,

‖z + y‖ ≥ ½‖P_ℳ‖⁻¹‖z‖ for all z ∈ 𝒩 and y ∈ ℳ₁.  (S4.20)

Without loss of generality assume ‖z‖ = 1. Suppose θ(ℳ, 𝒩) < δ, and let x ∈ ℳ be such that ‖z − x‖ ≤ δ. Then, using (S4.19), we obtain

‖z + y‖ ≥ ‖x + y‖ − ‖z − x‖ ≥ ‖P_ℳ‖⁻¹‖x‖ − δ.

But x = (x − z) + z implies ‖x‖ ≥ 1 − δ, and so

‖z + y‖ ≥ ‖P_ℳ‖⁻¹(1 − δ) − δ;

for δ small enough, (S4.20) is established.

The second remark is that, for any x ∈ ℂⁿ,

‖x − P_ℳx‖ ≤ C₀ d(x, ℳ)  (S4.21)

for some constant C₀. To establish (S4.21), it is sufficient to consider the case x ∈ Ker P_ℳ with ‖x‖ = 1. But then, obviously, we can take

C₀ = max_{x ∈ Ker P_ℳ, ‖x‖=1} {d(x, ℳ)⁻¹}.

Now for any x ∈ S_𝒩, by use of (S4.21) and (S4.12),

‖(P_ℳ − P_𝒩)x‖ = ‖x − P_ℳx‖ ≤ C₀ d(x, ℳ) ≤ C₀ θ(ℳ, 𝒩).

Then, if w ∈ ℂⁿ, ‖w‖ = 1, and w = y + z with y ∈ 𝒩, z ∈ ℳ₁,

‖(P_ℳ − P_𝒩)w‖ = ‖(P_ℳ − P_𝒩)y‖ ≤ ‖y‖ C₀ θ(ℳ, 𝒩) ≤ 2C₀‖P_ℳ‖ θ(ℳ, 𝒩),

where the last inequality follows from (S4.20). This establishes the right-hand inequality in (S4.18) with C = 2C₀‖P_ℳ‖; the left-hand inequality is just (S4.13). This completes the proof of the theorem. □
S4.4. The Metric Space of Subspaces

Consider the metric space 𝒮ⁿ of all subspaces of ℂⁿ, equipped with the gap metric θ(ℒ, ℳ). In this section we shall study some basic properties of 𝒮ⁿ. We remark that the results of the preceding section can be extended to subspaces of an infinite-dimensional Hilbert space (in place of ℂⁿ) and, with some modifications, to the case of infinite-dimensional Banach spaces. The following fundamental theorem is, in contrast, essentially "finite dimensional" and cannot be extended to the framework of subspaces of an infinite-dimensional space.

Theorem S4.8. The metric space 𝒮ⁿ is compact and, therefore, complete (as a metric space).

Recall that compactness of 𝒮ⁿ means that for every sequence ℒ₁, ℒ₂, … of subspaces in 𝒮ⁿ there exists a convergent subsequence ℒ_{i₁}, ℒ_{i₂}, …, i.e., one for which

lim_{k→∞} θ(ℒ_{i_k}, ℒ₀) = 0

for some ℒ₀ ∈ 𝒮ⁿ. Completeness of 𝒮ⁿ means that every sequence of subspaces ℒᵢ, i = 1, 2, …, for which lim_{i,j→∞} θ(ℒᵢ, ℒⱼ) = 0, is convergent.

Proof. In view of Theorem S4.6 the metric space 𝒮ⁿ decomposes into components 𝒮_{n,m}, m = 0, …, n, where 𝒮_{n,m} is a closed and open set in 𝒮ⁿ consisting of all m-dimensional subspaces of ℂⁿ. Obviously, it is sufficient to prove the compactness of each 𝒮_{n,m}. To this end consider the set ℬ_{n,m} of all orthonormal systems u = {u_k}_{k=1}^m consisting of m vectors u₁, …, u_m in ℂⁿ.
For u = {u_k}_{k=1}^m, v = {v_k}_{k=1}^m ∈ ℬ_{n,m} define

δ(u, v) = max_{1≤k≤m} ‖u_k − v_k‖.

It is easily seen that δ(u, v) is a metric on ℬ_{n,m}, turning ℬ_{n,m} into a metric space. For each u = {u_k}_{k=1}^m ∈ ℬ_{n,m} define

Λ_m u = Span{u₁, …, u_m} ∈ 𝒮_{n,m}.

In this way we obtain a map Λ_m: ℬ_{n,m} → 𝒮_{n,m} of the metric spaces ℬ_{n,m} and 𝒮_{n,m}.

We prove now that the map Λ_m is continuous. Indeed, let ℒ ∈ 𝒮_{n,m} and let v₁, …, v_m be an orthonormal basis of ℒ. Pick some u = {u_k}_{k=1}^m ∈ ℬ_{n,m} (which is supposed to lie in a neighborhood of v = {v_k}_{k=1}^m ∈ ℬ_{n,m}). For vᵢ, i = 1, …, m, we have (where ℳ = Λ_m u and P_𝒩 stands for the orthogonal projector on a subspace 𝒩)

‖(P_ℳ − P_ℒ)vᵢ‖ = ‖P_ℳvᵢ − vᵢ‖ ≤ ‖P_ℳ(vᵢ − uᵢ)‖ + ‖uᵢ − vᵢ‖ ≤ ‖P_ℳ‖ ‖vᵢ − uᵢ‖ + ‖uᵢ − vᵢ‖ ≤ 2δ(u, v),

and therefore for x = Σ_{i=1}^m αᵢvᵢ ∈ S_ℒ

‖(P_ℳ − P_ℒ)x‖ ≤ 2 Σ_{i=1}^m |αᵢ| δ(u, v).

Now, since ‖x‖² = Σ_{i=1}^m |αᵢ|² = 1, we obtain |αᵢ| ≤ 1 and Σ_{i=1}^m |αᵢ| ≤ m, and so

‖(P_ℳ − P_ℒ)x‖ ≤ 2m δ(u, v) for every x ∈ S_ℒ.  (S4.22)

Fix now some y ∈ S_{ℒ^⊥}. We wish to estimate P_ℳy. For every x ∈ ℒ, write

(x, P_ℳy) = (P_ℳx, y) = ((P_ℳ − P_ℒ)x, y) + (x, y) = ((P_ℳ − P_ℒ)x, y),

and hence

|(x, P_ℳy)| ≤ 2m‖x‖δ(u, v)  (S4.23)

by (S4.22). On the other hand, write

P_ℳy = Σ_{i=1}^m αᵢuᵢ;

then for every z ∈ ℒ^⊥ (using (z, vᵢ) = 0)

|(z, P_ℳy)| = |(z, Σ_{i=1}^m αᵢ(uᵢ − vᵢ))| ≤ ‖z‖ ‖Σ_{i=1}^m αᵢ(uᵢ − vᵢ)‖ ≤ ‖z‖ m max_i |αᵢ| max_i ‖uᵢ − vᵢ‖.

But ‖y‖ = 1 implies Σ_{i=1}^m |αᵢ|² ≤ 1, so max_{1≤i≤m} |αᵢ| ≤ 1. Hence

|(z, P_ℳy)| ≤ ‖z‖ m δ(u, v).  (S4.24)

Combining (S4.23) and (S4.24) we obtain |(t, P_ℳy)| ≤ 3m δ(u, v) for every t ∈ ℂⁿ with ‖t‖ = 1. Thus

‖P_ℳy‖ ≤ 3m δ(u, v).  (S4.25)

Now we can easily prove the continuity of Λ_m. Pick x ∈ ℂⁿ, ‖x‖ = 1. Then, using (S4.22) and (S4.25), we have

‖(P_ℳ − P_ℒ)x‖ ≤ ‖(P_ℳ − P_ℒ)P_ℒx‖ + ‖P_ℳ(x − P_ℒx)‖ ≤ 5m δ(u, v),

so

θ(ℳ, ℒ) = ‖P_ℳ − P_ℒ‖ ≤ 5m δ(u, v),

which obviously implies the continuity of Λ_m. It is easily seen that ℬ_{n,m} is compact. Since Λ_m: ℬ_{n,m} → 𝒮_{n,m} is a continuous map onto 𝒮_{n,m}, the latter is also compact.
Finally, let us prove the completeness of 𝒮_{n,m}. Let ℒ₁, ℒ₂, … be a Cauchy sequence in 𝒮_{n,m}, i.e., θ(ℒᵢ, ℒⱼ) → 0 as i, j → ∞. By compactness there exists a subsequence ℒ_{i_k} such that lim_{k→∞} θ(ℒ_{i_k}, ℒ) = 0 for some ℒ ∈ 𝒮_{n,m}. But then it is easily seen that in fact ℒ = lim_{i→∞} ℒᵢ. □

S4.5. Stable Invariant Subspaces: Definition and Main Result
Let A: ℂⁿ → ℂⁿ be a linear transformation. If λ₀ is an eigenvalue of A, the corresponding root subspace Ker(λ₀I − A)ⁿ will be denoted by 𝒩(λ₀). An A-invariant subspace 𝒩 is called stable if, given ε > 0, there exists δ > 0 such that ‖B − A‖ < δ for a linear transformation B: ℂⁿ → ℂⁿ implies that B has an invariant subspace ℳ with ‖P_ℳ − P_𝒩‖ < ε. Here P_ℳ denotes the orthogonal projector with Im P_ℳ = ℳ. The same definition applies for matrices.

This concept is particularly important from the point of view of numerical computation. It is generally true that the process of finding a matrix representation for a linear transformation, and then finding its invariant subspaces, can only be performed approximately. Consequently, the stable invariant subspaces will generally be the only ones amenable to numerical computation.

Suppose 𝒩 is a direct sum of root subspaces of A. Then 𝒩 is a stable invariant subspace for A. This follows from the fact that 𝒩 appears as the image of a Riesz projector

R_A = (1/2πi) ∮_Γ (Iλ − A)⁻¹ dλ,

where Γ is a suitable contour in ℂ (see Proposition S4.3). Indeed, this formula ensures that ‖R_B − R_A‖ is arbitrarily small if ‖B − A‖ is small enough. Let ℳ = Im R_B; since P_𝒩 and P_ℳ are orthogonal projectors with the same images as R_A and R_B, respectively, the following inequality holds (see Theorem S4.5):

‖P_𝒩 − P_ℳ‖ ≤ ‖R_A − R_B‖,  (S4.26)

so ‖P_𝒩 − P_ℳ‖ is small together with ‖R_B − R_A‖.

It will turn out, however, that in general not every stable invariant subspace is spectral. On the other hand, if dim Ker(λᵢI − A) > 1 and 𝒩 is a one-dimensional subspace of Ker(λᵢI − A), it is intuitively clear that a small perturbation of A can result in a large change in the gap between invariant subspaces. The following simple example provides such a situation.

Let A be the 2 × 2 zero matrix, and let 𝒩 = Span{(1, 1)^T} ⊂ ℂ². Clearly 𝒩 is A-invariant; but 𝒩 is unstable. Indeed, let B = diag[0, ε], where ε ≠ 0 is close enough to zero. The only one-dimensional B-invariant subspaces are ℳ₁ = Span{(1, 0)^T} and ℳ₂ = Span{(0, 1)^T}, and both are far from 𝒩: computation shows that

‖P_𝒩 − P_{ℳᵢ}‖ = 1/√2,  i = 1, 2.
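This instability is easy to observe numerically. A small sketch (ours) reproducing the example: however small ε is, every one-dimensional invariant subspace of B = diag[0, ε] stays at gap 1/√2 from 𝒩.

```python
import numpy as np

def orth_projector(B):
    Q, _ = np.linalg.qr(B)
    return Q @ Q.conj().T

N = np.array([[1.0], [1.0]])                 # the A-invariant subspace Span{(1,1)^T}
for eps in (1e-1, 1e-4, 1e-8):
    B = np.diag([0.0, eps])                  # arbitrarily small perturbation of A = 0
    # the only one-dimensional B-invariant subspaces are the coordinate axes:
    for M in (np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])):
        g = np.linalg.norm(orth_projector(N) - orth_projector(M), 2)
        print(eps, g)                        # always 1/sqrt(2) ~ 0.7071
```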
The following theorem gives the description of all stable invariant subspaces.

Theorem S4.9. Let λ₁, …, λ_r be the different eigenvalues of the linear transformation A. A subspace 𝒩 of ℂⁿ is A-invariant and stable if and only if 𝒩 = 𝒩₁ ∔ ⋯ ∔ 𝒩_r, where for each j the space 𝒩ⱼ is an arbitrary A-invariant subspace of 𝒩(λⱼ) whenever dim Ker(λⱼI − A) = 1, while otherwise 𝒩ⱼ = {0} or 𝒩ⱼ = 𝒩(λⱼ).

The proof of Theorem S4.9 will be based on a series of lemmas and an auxiliary theorem which is of some interest in itself. In the proof of Theorem S4.9 we will use the following observation, which follows immediately from the definition of a stable subspace: the A-invariant subspace 𝒩 is stable if and only if the SAS⁻¹-invariant subspace S𝒩 is stable; here S: ℂⁿ → ℂⁿ is an arbitrary invertible linear transformation.

S4.6. Case of a Single Eigenvalue

The results presented in this section will lead to the proof of Theorem S4.9 for the case when A has only one eigenvalue. To state the next theorem we need the following notion: a chain ℳ₁ ⊂ ℳ₂ ⊂ ⋯ ⊂ ℳ_{n−1} of A-invariant subspaces is said to be complete if dim ℳⱼ = j for j = 1, …, n − 1.

Theorem S4.10. Given ε > 0, there exists δ > 0 such that the following holds true: if B is a linear transformation with ‖B − A‖ < δ and {ℳⱼ} is a complete chain of B-invariant subspaces, then there exists a complete chain {𝒩ⱼ} of A-invariant subspaces such that ‖P_{𝒩ⱼ} − P_{ℳⱼ}‖ < ε for j = 1, …, n − 1.

In general the chain {𝒩ⱼ} for A will depend on the choice of B. To see this, consider

B_ν = [0 0; ν 0] and B′_ν = [0 ν; 0 0],

where ν ∈ ℂ. Observe that for ν ≠ 0 the only one-dimensional invariant subspace of B_ν is Span{(0, 1)^T}, while for B′_ν, ν ≠ 0, the only one-dimensional invariant subspace is Span{(1, 0)^T}.

Proof. Assume that the conclusion of the theorem is not correct. Then there exists ε > 0 with the property that for every positive integer m there
exists a linear transformation B_m satisfying ‖B_m − A‖ < 1/m and a complete chain {ℳ_{mj}} of B_m-invariant subspaces such that for every complete chain {𝒩ⱼ} of A-invariant subspaces

max_{1≤j≤n−1} ‖P_{𝒩ⱼ} − P_{ℳ_{mj}}‖ ≥ ε,  m = 1, 2, ….  (S4.27)

Denote for brevity P_{mj} = P_{ℳ_{mj}}. Since ‖P_{mj}‖ = 1, there exist a subsequence {mᵢ} of the sequence of positive integers and linear transformations P₁, …, P_{n−1} on ℂⁿ such that

lim_{i→∞} P_{mᵢ,j} = Pⱼ,  j = 1, …, n − 1.  (S4.28)

Observe that P₁, …, P_{n−1} are orthogonal projectors. Indeed, passing to the limit in the equalities P_{mᵢ,j} = (P_{mᵢ,j})², we obtain Pⱼ = Pⱼ². Further, (S4.28) combined with P*_{mᵢ,j} = P_{mᵢ,j} implies that Pⱼ* = Pⱼ; so Pⱼ is an orthogonal projector (Proposition S4.2). Further, the subspace 𝒩ⱼ = Im Pⱼ has dimension j, j = 1, …, n − 1; this is a consequence of Theorem S4.6. By passing to the limit it follows from B_m P_{mj} = P_{mj}B_m P_{mj} that APⱼ = PⱼAPⱼ; hence 𝒩ⱼ is A-invariant. Since P_{mj} = P_{m,j+1}P_{mj}, we have Pⱼ = P_{j+1}Pⱼ, and thus 𝒩ⱼ ⊂ 𝒩_{j+1}. It follows that {𝒩ⱼ} is a complete chain of A-invariant subspaces. Finally, θ(𝒩ⱼ, ℳ_{mᵢ,j}) = ‖Pⱼ − P_{mᵢ,j}‖ → 0. But this contradicts (S4.27), and the proof is complete. □
Corollary S4.11. If A has only one eigenvalue, λ₀ say, and dim Ker(λ₀I − A) = 1, then each invariant subspace of A is stable.

Proof. The conditions on A are equivalent to the requirement that for each 1 ≤ j ≤ n − 1 the transformation A has only one j-dimensional invariant subspace, so that the nontrivial invariant subspaces form a complete chain. Hence we may apply the previous theorem to get the desired result. □

Lemma S4.12. If A has only one eigenvalue, λ₀ say, and if dim Ker(λ₀I − A) ≥ 2, then the only stable A-invariant subspaces are {0} and ℂⁿ.

Proof. Let J = diag(J₁, …, J_s) be a Jordan matrix for A. Here Jᵢ is a simple Jordan block with λ₀ on the main diagonal and of size κᵢ, say. As dim Ker(λ₀I − A) ≥ 2 we have s ≥ 2. By similarity, it suffices to prove that J has no nontrivial stable invariant subspace. Let e₁, …, e_n be the standard basis for ℂⁿ. Define the linear transformation T_ε on ℂⁿ by setting

T_ε eᵢ = ε e_{i−1} if i = κ₁ + ⋯ + κⱼ + 1 for some j = 1, …, s − 1, and T_ε eᵢ = 0 otherwise,

and put B_ε = J + T_ε. Then ‖B_ε − J‖ tends to 0 as ε → 0. For ε ≠ 0 the linear transformation B_ε has exactly one j-dimensional invariant subspace, namely 𝒩ⱼ = Span{e₁, …, eⱼ}; here 1 ≤ j ≤ n − 1. It follows that 𝒩ⱼ is the only candidate for a stable J-invariant subspace of dimension j.

Now consider J̃ = diag(J_s, …, J₁). Repeating the argument of the previous paragraph for J̃ instead of J, we see that 𝒩ⱼ is the only candidate for a stable J̃-invariant subspace of dimension j. But J̃ = SJS⁻¹, where S is the similarity transformation that reverses the order of the blocks in J. It follows that S𝒩ⱼ is the only candidate for a stable J-invariant subspace of dimension j. However, as s ≥ 2, we have S𝒩ⱼ ≠ 𝒩ⱼ for 1 ≤ j ≤ n − 1, and the proof is complete. □

Corollary S4.11 and Lemma S4.12 together prove Theorem S4.9 for the case when A has only one eigenvalue.

S4.7. General Case of Stable Invariant Subspaces
The proof of Theorem S4.9 in the general case will be reduced to the case of one eigenvalue, which was considered in Section S4.6. To carry out this reduction we introduce some additional notions. We begin with the notion of minimal opening. For two subspaces ℳ and 𝒩 of ℂⁿ the number

η(ℳ, 𝒩) = inf{ ‖x + y‖ | x ∈ ℳ, y ∈ 𝒩, max(‖x‖, ‖y‖) = 1 }

is called the minimal opening between ℳ and 𝒩. Note that always 0 ≤ η(ℳ, 𝒩) ≤ 1, except when both ℳ and 𝒩 are the zero subspace, in which case η(ℳ, 𝒩) = ∞. It is easily seen that η(ℳ, 𝒩) > 0 if and only if ℳ ∩ 𝒩 = {0}.
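The infimum in this definition can be estimated numerically by sampling. The sketch below is ours and is only a crude Monte Carlo upper bound, not an exact computation; it illustrates that η(ℳ, 𝒩) stays bounded away from zero when ℳ ∩ 𝒩 = {0}.

```python
import numpy as np

def minimal_opening_estimate(M, N, trials=20000, rng=np.random.default_rng(0)):
    """Monte Carlo upper-bound estimate of
    eta(M, N) = inf{ ||x + y|| : x in span(M), y in span(N), max(||x||,||y||) = 1 }."""
    QM, _ = np.linalg.qr(M)
    QN, _ = np.linalg.qr(N)
    best = np.inf
    for _ in range(trials):
        a = rng.standard_normal(QM.shape[1])
        b = rng.standard_normal(QN.shape[1])
        s = max(np.linalg.norm(a), np.linalg.norm(b))   # normalize max(||x||,||y||) = 1
        best = min(best, np.linalg.norm(QM @ (a / s) + QN @ (b / s)))
    return best

M = np.array([[1.0], [0.0]])
N = np.array([[1.0], [1.0]]) / np.sqrt(2)
print(minimal_opening_estimate(M, N))   # positive, since M and N intersect trivially
```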
The following result shows that there is a close connection between the minimal opening and the gap θ(ℳ, 𝒩) between subspaces.

Proposition S4.13. Let ℳ_m, m = 1, 2, …, be a sequence of subspaces in ℂⁿ. If lim_{m→∞} θ(ℳ_m, ℒ) = 0 for some subspace ℒ, then

lim_{m→∞} η(ℳ_m, 𝒩) = η(ℒ, 𝒩)  (S4.29)

for every subspace 𝒩.

Proof. Note that the set of pairs {(x, y) | x ∈ ℳ, y ∈ 𝒩, max(‖x‖, ‖y‖) = 1} ⊂ ℂⁿ × ℂⁿ is compact for any pair of subspaces ℳ, 𝒩. Since ‖x + y‖ is a continuous function of x and y, in fact

η(ℳ, 𝒩) = ‖x + y‖

for some vectors x ∈ ℳ, y ∈ 𝒩 with max(‖x‖, ‖y‖) = 1. So we can choose x_m ∈ ℳ_m, y_m ∈ 𝒩, m = 1, 2, …, such that max(‖x_m‖, ‖y_m‖) = 1 and η(ℳ_m, 𝒩) = ‖x_m + y_m‖. Pick convergent subsequences x_{m_k} → x₀ ∈ ℂⁿ, y_{m_k} → y₀ ∈ 𝒩. Clearly, max(‖x₀‖, ‖y₀‖) = 1 and η(ℳ_{m_k}, 𝒩) → ‖x₀ + y₀‖. It is easy to verify (for instance, by reductio ad absurdum) that x₀ ∈ ℒ. Thus

lim_{k→∞} η(ℳ_{m_k}, 𝒩) = ‖x₀ + y₀‖ ≥ η(ℒ, 𝒩).  (S4.30)

On the other hand,

lim sup_{m→∞} η(ℳ_m, 𝒩) ≤ η(ℒ, 𝒩).  (S4.31)

Indeed, let x ∈ ℒ, y ∈ 𝒩 be such that max(‖x‖, ‖y‖) = 1 and η(ℒ, 𝒩) = ‖x + y‖. Assume first that x ≠ 0. For any given a > 0 (taking a < ‖x‖/2) there exists m₀ such that d(x, ℳ_m) < a for m ≥ m₀. Let x_m ∈ ℳ_m be such that d(x, ℳ_m) = ‖x − x_m‖. Then z_m := (‖x‖ x_m)/‖x_m‖ ∈ ℳ_m satisfies max(‖z_m‖, ‖y‖) = 1, and for m ≥ m₀ we obtain

‖z_m − x‖ ≤ ‖z_m − x_m‖ + ‖x_m − x‖ = |‖x‖ − ‖x_m‖| + ‖x_m − x‖ ≤ 2‖x_m − x‖ < 2a,

so that

η(ℳ_m, 𝒩) ≤ ‖z_m + y‖ ≤ ‖x + y‖ + ‖z_m − x‖ ≤ η(ℒ, 𝒩) + 2a.

Since a > 0 is arbitrary, (S4.31) follows. (If x = 0, then ‖y‖ = 1 and η(ℒ, 𝒩) = 1, so (S4.31) holds trivially because η never exceeds 1.)

Combining (S4.30) and (S4.31) we see that for some subsequence m_k we have lim_{k→∞} η(ℳ_{m_k}, 𝒩) = η(ℒ, 𝒩). Now a standard argument proves (S4.29). Indeed, if (S4.29) were not true, we could pick a subsequence m′_k such that

lim_{k→∞} η(ℳ_{m′_k}, 𝒩) ≠ η(ℒ, 𝒩).  (S4.32)

But then we repeat the above argument with ℳ_m replaced by ℳ_{m′_k}: it is possible to pick a subsequence m″_k of m′_k such that lim_{k→∞} η(ℳ_{m″_k}, 𝒩) = η(ℒ, 𝒩), a contradiction with (S4.32). □

Let us introduce some terminology and notation to be used in the next two lemmas and their proofs. We use the shorthand A_m → A for lim_{m→∞} ‖A_m − A‖ = 0, where A_m, m = 1, 2, …, and A are linear transformations on ℂⁿ. Note that A_m → A if and only if the entries of the matrix representations of the A_m (in some fixed basis) converge to the corresponding entries
of A (represented as a matrix in the same basis). We say that a simple rectifiable contour Γ splits the spectrum of a linear transformation T if σ(T) ∩ Γ = ∅. In that case we can associate with T and Γ the Riesz projector

P(T; Γ) = (1/2πi) ∮_Γ (Iλ − T)⁻¹ dλ.

The following observation will be used subsequently: if T is a linear transformation for which Γ splits the spectrum, then Γ splits the spectrum of every linear transformation S which is sufficiently close to T (i.e., for which ‖S − T‖ is close enough to zero). Indeed, if it were not so, we would have det(λ_mI − S_m) = 0 for some sequence λ_m ∈ Γ and some sequence S_m → T. Pick a subsequence λ_{m_k} → λ₀ for some λ₀ ∈ Γ; passing to the limit in the equality det(λ_{m_k}I − S_{m_k}) = 0 (here we use the matrix representations of the S_{m_k} and T in some fixed basis, as well as the continuous dependence of det A on the entries of A) we obtain det(λ₀I − T) = 0, a contradiction with the splitting of σ(T).

We shall also use the notion of the angular operator. If P is a projector of ℂⁿ and ℳ is a subspace of ℂⁿ with Ker P ∔ ℳ = ℂⁿ, then there exists a unique linear transformation R from Im P into Ker P such that ℳ = {Rx + x | x ∈ Im P}. This transformation is called the angular operator of ℳ with respect to the projector P. We leave to the reader the (easy) verification of the existence and uniqueness of the angular operator (or else see [3c, Chapter 5]).
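In coordinates the angular operator is obtained by solving one linear system. A minimal sketch (ours; the bases `Q1`, `Q0` and the line `M` below are illustrative choices, not from the text):

```python
import numpy as np

def angular_operator(Q1, Q0, M):
    """Angular operator R of span(M) w.r.t. the projector P with
    Im P = span(Q1), Ker P = span(Q0):  span(M) = { x + Rx : x in Im P }.
    Returned as a matrix from Im-P coordinates to Ker-P coordinates."""
    B = np.hstack([Q1, Q0])
    coords = np.linalg.solve(B, M)          # each column of M = Q1 @ a + Q0 @ c
    A = coords[:Q1.shape[1], :]             # Im P components
    C = coords[Q1.shape[1]:, :]             # Ker P components
    return C @ np.linalg.inv(A)             # A is invertible by transversality to Ker P

Q1 = np.array([[1.0], [0.0]])               # Im P = span{e1}
Q0 = np.array([[0.0], [1.0]])               # Ker P = span{e2}
M  = np.array([[1.0], [3.0]])               # the line y = 3x, transversal to Ker P
print(angular_operator(Q1, Q0, M))          # [[3.]]: M = { x + Rx : x in Im P }
```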
Lemma S4.14. Let Γ be a simple rectifiable contour that splits the spectrum of T, let T₀ be the restriction of T to Im P(T; Γ), and let 𝒩 be a subspace of Im P(T; Γ). Then 𝒩 is a stable invariant subspace for T if and only if 𝒩 is a stable invariant subspace for T₀.

Proof. Suppose 𝒩 is a stable invariant subspace for T₀, but not for T. Then one can find ε > 0 such that for every positive integer m there exists a linear transformation S_m such that

‖S_m − T‖ < 1/m  (S4.33)

and

θ(𝒩, ℳ) ≥ ε,  ℳ ∈ Ω_m.  (S4.34)

Here Ω_m denotes the collection of all invariant subspaces of S_m. From (S4.33) it is clear that S_m → T. By assumption Γ splits the spectrum of T. Thus, for m sufficiently large, the contour Γ will split the spectrum of S_m. Moreover, P(S_m; Γ) → P(T; Γ), and hence Im P(S_m; Γ) tends to Im P(T; Γ) in the gap topology. But then, for m sufficiently large,

Ker P(T; Γ) ∔ Im P(S_m; Γ) = ℂⁿ

(cf. Theorem S4.7), which is exactly what the definition of the angular operator requires. Let R_m be the angular operator of Im P(S_m; Γ) with respect to P(T; Γ). Here, as in what follows, m is supposed to be sufficiently large. As P(S_m; Γ) → P(T; Γ), we have R_m → 0. Put

E_m = [I R_m; 0 I],

where the matrix representation corresponds to the decomposition

ℂⁿ = Ker P(T; Γ) ∔ Im P(T; Γ).  (S4.35)

Then E_m is invertible with inverse

E_m⁻¹ = [I −R_m; 0 I],

E_m Im P(T; Γ) = Im P(S_m; Γ), and E_m → I. Put T_m = E_m⁻¹S_mE_m. Then T_m Im P(T; Γ) ⊂ Im P(T; Γ) and T_m → T. Let T_{m0} be the restriction of T_m to Im P(T; Γ). Then T_{m0} → T₀. As 𝒩 is a stable invariant subspace for T₀, there exists a sequence {𝒩_m} of subspaces of Im P(T; Γ) such that 𝒩_m is T_{m0}-invariant and θ(𝒩_m, 𝒩) → 0. Note that 𝒩_m is also T_m-invariant. Now put ℳ_m = E_m𝒩_m. Then ℳ_m is an invariant subspace for S_m; thus ℳ_m ∈ Ω_m. From E_m → I one can easily deduce that θ(ℳ_m, 𝒩_m) → 0. Together with θ(𝒩_m, 𝒩) → 0 this gives θ(ℳ_m, 𝒩) → 0, which contradicts (S4.34).

Next assume that 𝒩 ⊂ Im P(T; Γ) is a stable invariant subspace for T, but not for T₀. Then one can find ε > 0 such that for every positive integer m there exists a linear transformation S_{m0} on Im P(T; Γ) satisfying

‖S_{m0} − T₀‖ < 1/m  (S4.36)

and

θ(𝒩, ℳ) ≥ ε,  ℳ ∈ Ω_{m0}.  (S4.37)

Here Ω_{m0} denotes the collection of all invariant subspaces of S_{m0}. Let T₁ be the restriction of T to Ker P(T; Γ) and write

S_m = [T₁ 0; 0 S_{m0}],

where the matrix representation corresponds to the decomposition (S4.35). From (S4.36) it is clear that S_m → T. Hence, as 𝒩 is a stable invariant subspace for T, there exists a sequence {𝒩_m} of subspaces of ℂⁿ such that 𝒩_m is S_m-invariant and θ(𝒩_m, 𝒩) → 0. Put ℳ_m = P(T; Γ)𝒩_m. Since P(T; Γ) commutes with S_m, ℳ_m is an invariant subspace for S_{m0}. We shall now prove that θ(ℳ_m, 𝒩) → 0, thus obtaining a contradiction to (S4.37).
Take y ∈ ℳ_m with ‖y‖ ≤ 1. Then y = P(T; Γ)x for some x ∈ 𝒩_m. As x − P(T; Γ)x ∈ Ker P(T; Γ),

‖y‖ = ‖P(T; Γ)x‖ ≥ inf{ ‖x − u‖ | u ∈ Ker P(T; Γ) } ≥ η(𝒩_m, Ker P(T; Γ)) · ‖x‖,  (S4.38)

where η(·, ·) is the minimal opening. By Proposition S4.13, θ(𝒩_m, 𝒩) → 0 implies that η(𝒩_m, Ker P(T; Γ)) → η₀, where η₀ = η(𝒩, Ker P(T; Γ)). So, for m sufficiently large, η(𝒩_m, Ker P(T; Γ)) ≥ ½η₀. Together with (S4.38) this gives

‖y‖ ≥ ½η₀‖x‖

for m sufficiently large. Using this it is not difficult to deduce that

θ(ℳ_m, 𝒩) ≤ (1 + 2/η₀)‖P(T; Γ)‖ θ(𝒩_m, 𝒩)

for m sufficiently large. We conclude that θ(ℳ_m, 𝒩) → 0, and the proof is complete. □
Lemma S4.15. Let 𝒩 be an invariant subspace for T, and assume that the contour Γ splits the spectrum of T. If 𝒩 is stable for T, then P(T; Γ)𝒩 is a stable invariant subspace for the restriction T₀ of T to Im P(T; Γ).

Proof. It is clear that ℳ = P(T; Γ)𝒩 is T₀-invariant. Assume that ℳ is not stable for T₀. Then ℳ is not stable for T either, by Lemma S4.14. Hence there exist ε > 0 and a sequence {S_m} such that S_m → T and

θ(ℒ, ℳ) ≥ ε,  ℒ ∈ Ω_m,  m = 1, 2, …,  (S4.39)

where Ω_m denotes the set of all invariant subspaces of S_m. As 𝒩 is stable for T, one can find a sequence of subspaces {𝒩_m} such that S_m𝒩_m ⊂ 𝒩_m and θ(𝒩_m, 𝒩) → 0. Further, since Γ splits the spectrum of T and S_m → T, the contour Γ will split the spectrum of S_m for m sufficiently large. But then, without loss of generality, we may assume that Γ splits the spectrum of each S_m. Again using S_m → T, it follows that P(S_m; Γ) → P(T; Γ).

Let 𝒵 be a direct complement of 𝒩 in ℂⁿ. As θ(𝒩_m, 𝒩) → 0, we have ℂⁿ = 𝒵 ∔ 𝒩_m for m sufficiently large (Theorem S4.7). So, without loss of generality, we may assume that ℂⁿ = 𝒵 ∔ 𝒩_m for each m. Let R_m be the angular operator of 𝒩_m with respect to the projector of ℂⁿ onto 𝒩 along 𝒵, and put

E_m = [I R_m; 0 I],

where the matrix corresponds to the decomposition ℂⁿ = 𝒵 ∔ 𝒩. Note that T_m = E_m⁻¹S_mE_m leaves 𝒩 invariant. Because R_m → 0 we have E_m → I, and so T_m → T.
Clearly, Γ splits the spectrum of T|_𝒩. As T_m → T and 𝒩 is invariant for each T_m, the contour Γ will split the spectrum of T_m|_𝒩 too, provided m is sufficiently large. But then we may assume that this happens for all m. Also, we have

P(T_m|_𝒩; Γ) → P(T|_𝒩; Γ).

Hence ℳ_m := Im P(T_m|_𝒩; Γ) → Im P(T|_𝒩; Γ) = ℳ in the gap topology. Now consider ℒ_m = E_mℳ_m. Then ℒ_m is an S_m-invariant subspace. From E_m → I it follows that θ(ℒ_m, ℳ_m) → 0. This, together with θ(ℳ_m, ℳ) → 0, gives θ(ℒ_m, ℳ) → 0. So we arrive at a contradiction to (S4.39), and the proof is complete. □

After this long preparation we are now able to give a short proof of Theorem S4.9.

Proof of Theorem S4.9. Suppose 𝒩 is a stable invariant subspace for A. Put 𝒩ⱼ = Pⱼ𝒩, where Pⱼ is the Riesz projector of A corresponding to the eigenvalue λⱼ. Then 𝒩 = 𝒩₁ ∔ ⋯ ∔ 𝒩_r. By Lemma S4.15 the space 𝒩ⱼ is a stable invariant subspace for the restriction Aⱼ of A to 𝒩(λⱼ). But Aⱼ has only one eigenvalue, namely λⱼ. So we may apply Lemma S4.12 to prove that 𝒩ⱼ has the desired form.

Conversely, assume that each 𝒩ⱼ has the desired form, and let us prove that 𝒩 = 𝒩₁ ∔ ⋯ ∔ 𝒩_r is a stable invariant subspace for A. By Corollary S4.11 the space 𝒩ⱼ is a stable invariant subspace for the restriction Aⱼ of A to Im Pⱼ. Hence we may apply Lemma S4.14 to show that each 𝒩ⱼ is a stable invariant subspace for A. But then the same is true for the direct sum 𝒩 = 𝒩₁ ∔ ⋯ ∔ 𝒩_r. □
Comments

Sections S4.3 and S4.4 provide basic notions and results concerning the set of subspaces (in a finite-dimensional space); these topics are covered in [32c, 48] in the infinite-dimensional case. The proof of Theorem S4.5 is from [33, Chapter IV], and Theorem S4.7 appears in [34e]. The results of Sections S4.5–S4.7 are proved in [3b, 3c]; see also [12]. The book [3c] also contains additional information on stable invariant subspaces, as well as applications to the factorization of matrix polynomials and rational functions, and to the stability of solutions of the algebraic Riccati equation. See also [3c, 4] for additional information concerning the notions of minimal opening and angular operator. We mention that there is a close relationship between solutions of matrix Riccati equations and invariant subspaces with special properties of a certain linear transformation; see [3c, 12, 15b, 16, 53] for detailed information.
Chapter S5

Indefinite Scalar Product Spaces

This chapter is devoted to some basic properties of finite-dimensional spaces with an indefinite scalar product. The results presented here are used in Part III. Attention is focused on the problem of describing all indefinite scalar products in which a given linear transformation is self-adjoint. A special canonical form is used for this description.

First, we introduce the basic definitions and conventions. Let 𝒳 be a finite-dimensional vector space with scalar product (x, y), x, y ∈ 𝒳, and let H be a self-adjoint linear transformation in 𝒳, i.e., (Hx, y) = (x, Hy) for all x, y ∈ 𝒳. This property allows us to define a new scalar product [x, y], x, y ∈ 𝒳, by the formula

[x, y] = (Hx, y).

The scalar product [·,·] has all the properties of the usual scalar product, except that [x, x] may be positive, negative, or zero. More exactly, [x, y] has the following properties:

(1) [α₁x₁ + α₂x₂, y] = α₁[x₁, y] + α₂[x₂, y];
(2) [x, α₁y₁ + α₂y₂] = ᾱ₁[x, y₁] + ᾱ₂[x, y₂];
(3) [x, y] and [y, x] are complex conjugates of each other;
(4) [x, x] is real for every x ∈ 𝒳.

Because of the lack of positivity (in general) of the form [x, x], the scalar product [·,·] is called indefinite, and 𝒳 endowed with the indefinite scalar product [·,·] will be called an indefinite scalar product space. We are particularly interested in the case when the scalar product [·,·] is nondegenerate, i.e., when [x, y] = 0 for all y ∈ 𝒳 implies x = 0. This happens if and only if the underlying self-adjoint linear transformation H is invertible. This condition will be assumed throughout this chapter.

Often we shall identify 𝒳 with ℂⁿ. In this case the standard scalar product is given by (x, y) = Σ_{i=1}^n x_i ȳ_i, x = (x₁, …, x_n)^T, y = (y₁, …, y_n)^T ∈ ℂⁿ, and an indefinite scalar product is determined by an n × n nonsingular hermitian (or self-adjoint) matrix H: [x, y] = (Hx, y) for all x, y ∈ ℂⁿ.

Finally, a linear transformation A: 𝒳 → 𝒳 is called self-adjoint with respect to H (or H-self-adjoint) if [Ax, y] = [x, Ay] for all x, y ∈ 𝒳, where [x, y] = (Hx, y). This means HA = A*H.
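In matrix terms H-self-adjointness is the single identity HA = A*H, which can be tested directly. A small sketch (ours; the pair H, A below is an illustrative example, not taken from the text):

```python
import numpy as np

def is_H_selfadjoint(A, H, tol=1e-12):
    """Check [Ax, y] = [x, Ay] for [x, y] = (Hx, y), i.e., H A = A* H."""
    return np.allclose(H @ A, A.conj().T @ H, atol=tol)

H = np.array([[0.0, 1.0],
              [1.0, 0.0]])          # nonsingular hermitian, indefinite
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])          # a Jordan block with real eigenvalue 2
print(is_H_selfadjoint(A, H))       # True: here H A = A^T H
```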
S5.1. Canonical Form of a Self-Adjoint Matrix and the Indefinite Scalar Product

Consider the following problem: given a fixed n × n matrix A, describe all self-adjoint nonsingular matrices H such that A is self-adjoint in the indefinite scalar product [x, y] = (Hx, y), i.e., [Ax, y] = [x, Ay] or, what is the same, HA = A*H. Clearly, in order that such an H exist it is necessary that A be similar to A*. We shall see later (Theorem S5.1 below) that this condition is also sufficient. For the spectral properties of A this condition means that the elementary divisors of Iλ − A are symmetric relative to the real line: if λ₀ (≠ λ̄₀) is an eigenvalue of Iλ − A with corresponding elementary divisors (λ − λ₀)^{αᵢ}, i = 1, …, k, then λ̄₀ is also an eigenvalue of Iλ − A, with elementary divisors (λ − λ̄₀)^{αᵢ}, i = 1, …, k.

Now the problem posed above can be reformulated as follows: given a matrix A similar to A*, describe and classify all self-adjoint nonsingular matrices which carry out this similarity. Consider first the case A = I. Then any self-adjoint nonsingular matrix H is such that I is H-self-adjoint. Thus the problem of description and classification of such matrices H is equivalent to the classical problem of reducing H to the form S*PS for some nonsingular matrix S and diagonal matrix P with P² = I; the equation H = S*PS is nothing more than the reduction of the quadratic form (Hx, x) to a sum of squares.

To formulate results for general A we need a special construction connected with a Jordan matrix J which is similar to J*. Similarity of J and J* means that the number and the sizes of the Jordan blocks in J corresponding to an eigenvalue λ₀ (≠ λ̄₀) and those corresponding to λ̄₀ are the same. We now fix the structure of J in the following way: select a maximal set {λ₁, …, λ_a} of eigenvalues of J containing no conjugate pair, and let {λ_{a+1}, …, λ_{a+b}} be the distinct real eigenvalues of J. Put λ_{a+b+j} = λ̄ⱼ for j = 1, …, a, and let

J = diag[J₁, J₂, …, J_{2a+b}],  (S5.1)

where Jᵢ = diag[J_{ij}]_{j=1}^{kᵢ} is a Jordan form with eigenvalue λᵢ and Jordan blocks J_{i,1}, …, J_{i,kᵢ} of sizes α_{i,1} ≥ ⋯ ≥ α_{i,kᵢ}, respectively.

An α × α matrix whose (p, q) entry is 1 or 0 according as p + q = α + 1 or p + q ≠ α + 1 will be called a sip matrix (standard involutory permutation). An important role will be played in the sequel by the matrix P_{ε,J} connected with J as follows:

P_{ε,J} is the hermitian matrix, conformal with (S5.1), whose part corresponding to each pair Jᵢ, J_{a+b+i} (i = 1, …, a) of Jordan forms with conjugate nonreal eigenvalues is

[0 Pᵢ; Pᵢ 0],  Pᵢ = diag[P_{ij}]_{j=1}^{kᵢ},  (S5.2)

and whose part corresponding to the real eigenvalues is

P_r = diag[diag[ε_{ij}P_{ij}]_{j=1}^{kᵢ}]_{i=a+1}^{a+b},

with sip matrices P_{st} of sizes α_{st} × α_{st} (s = 1, …, a + b, t = 1, …, k_s), and where ε = (ε_{ij}), i = a + 1, …, a + b, j = 1, …, kᵢ, ε_{ij} = ±1, is an ordered set of signs.
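For real eigenvalues these building blocks are easy to write down explicitly. A small sketch (ours; the eigenvalue, block sizes, and signs are arbitrary illustrative choices) assembles P_{ε,J} for a Jordan matrix with one real eigenvalue and verifies the identity P_{ε,J}J = J*P_{ε,J} used below:

```python
import numpy as np

def sip(k):
    """k x k sip matrix: ones on the antidiagonal, zeros elsewhere."""
    return np.fliplr(np.eye(k))

def jordan(lam, k):
    """k x k upper Jordan block with eigenvalue lam."""
    return lam * np.eye(k) + np.diag(np.ones(k - 1), 1)

def block_diag(*blocks):
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    i = 0
    for b in blocks:
        out[i:i + b.shape[0], i:i + b.shape[0]] = b
        i += b.shape[0]
    return out

# J with the single real eigenvalue 2 and blocks of sizes 3 and 2;
# signs eps = (+1, -1) attached to the two blocks:
J = block_diag(jordan(2.0, 3), jordan(2.0, 2))
P = block_diag(+1 * sip(3), -1 * sip(2))     # P_{eps,J} for this J
assert np.allclose(P @ J, J.T @ P)           # J is P_{eps,J}-self-adjoint
```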
Using these notations, the main result, which solves the problem stated at the beginning of this section, can now be formulated.

Theorem S5.1. The matrix A is self-adjoint relative to the scalar product (Hx, y) (where H = H* is nonsingular) if and only if

T*HT = P_{ε,J},  T⁻¹AT = J  (S5.3)

for some invertible T, a matrix J in Jordan form, and a set of signs ε.

The matrix P_{ε,J} of (S5.3) will be called an A-canonical form of H with reducing transformation S = T⁻¹. It will be shown later (Theorem S5.6) that the set of signs ε is uniquely determined by A and H, up to certain permutations. The proof of Theorem S5.1 is quite long and needs some auxiliary results; it is the subject matter of the next section.

Recall that the signature (denoted sig H) of a self-adjoint matrix H is the difference between the number of positive and the number of negative eigenvalues of H (in both cases counted with multiplicities). It coincides with the difference between the number of positive squares and the number of negative squares in the canonical quadratic form determined by H.
The following corollary connects the signature of H with the structure of the real eigenvalues of H-self-adjoint matrices.

Corollary S5.2. Let A be an H-self-adjoint matrix, and let s be the signature of H. Then the real eigenvalues of A have at least |s| associated elementary divisors of odd degree.

Proof. It follows from relation (S5.3) that s is also the signature of P_{ε,J}. It is readily verified that each conjugate pair of nonreal eigenvalues makes no contribution to the signature; neither do sip matrices associated with even-degree elementary divisors of real eigenvalues. A sip matrix associated with an odd-degree elementary divisor of a real eigenvalue contributes +1 or −1 to s, depending on the corresponding sign in the sequence ε. The conclusion then follows. □
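The signature bookkeeping in this proof is easy to verify numerically (our sketch): a sip matrix of even size has signature 0, while one of odd size has signature ±1 according to its sign.

```python
import numpy as np

def signature(H):
    """Signature of a hermitian matrix: #positive - #negative eigenvalues."""
    eigs = np.linalg.eigvalsh(H)
    return int(np.sum(eigs > 0) - np.sum(eigs < 0))

def sip(k):
    return np.fliplr(np.eye(k))

for k in (1, 2, 3, 4, 5):
    print(k, signature(sip(k)), signature(-sip(k)))
# even k: signature 0; odd k: +1 for +P and -1 for -P
```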
Another corollary of Theorem S5.1 concerns the simultaneous reduction to canonical form of a pair of hermitian matrices A and B, at least one of which is nonsingular.

Corollary S5.3. Let A, B be n × n hermitian matrices, and suppose that B is invertible. Then there exists a nonsingular matrix X such that

X*AX = P_{ε,J}J,  X*BX = P_{ε,J},

where J is the Jordan normal form of B⁻¹A and P_{ε,J} is defined as above.

Proof. Observe that the matrix B⁻¹A is B-self-adjoint. By Theorem S5.1 we have

X⁻¹(B⁻¹A)X = J,  X*BX = P_{ε,J}

for some nonsingular matrix X. So A = BXJX⁻¹ and

X*AX = X*BXJ = P_{ε,J}J. □

In the case when B is positive definite (or negative definite) and A is hermitian, one can prove that the Jordan form of B⁻¹A is actually a diagonal matrix with real eigenvalues. In this case Corollary S5.3 reduces to the well-known result on the simultaneous reduction of a pair of quadratic forms, one of which is definite (positive or negative), to sums of squares. See, for instance, [22, Chapter X] for details.

S5.2. Proof of Theorem S5.1

The line of argument used to prove Theorem S5.1 is the successive reduction of the problem to the study of the restrictions of A and H to certain invariant subspaces of A. This approach is justified by the results of the
next two lemmas. A subspace ℳ of ℂⁿ is said to be nondegenerate (with respect to H) if for every x ∈ ℳ\{0} there is a y ∈ ℳ such that (Hx, y) ≠ 0. For any subspace ℳ, the orthogonal companion of ℳ with respect to H is

ℳ^[⊥] = {x ∈ ℂⁿ | (Hx, y) = 0 for all y ∈ ℳ}.

The first lemma says that, for a nondegenerate subspace, the orthogonal companion is also complementary, together with a converse statement.

Lemma S5.4. If ℳ is a nondegenerate subspace of ℂⁿ, then ℂⁿ = ℳ ∔ ℳ^[⊥]; and conversely, if ℳ is a subspace and ℂⁿ = ℳ ∔ ℳ^[⊥], then ℳ is nondegenerate.

Proof. Note first that ℳ^[⊥] is the image under H⁻¹ of the usual orthogonal complement (in the sense of the usual scalar product (·,·)) of ℳ. So

dim ℳ^[⊥] + dim ℳ = n.

Hence it is sufficient to prove that ℳ ∩ ℳ^[⊥] = {0} if and only if ℳ is nondegenerate. But this follows directly from the definition of a nondegenerate subspace. □

In the next lemma a matrix A is considered as a linear transformation with respect to the standard basis in ℂⁿ.

Lemma S5.5. Let A be self-adjoint with respect to H and let ℳ be a nondegenerate invariant subspace of A. Then

(a) ℳ^[⊥] is an invariant subspace of A;
(b) if Q is the orthogonal projector onto ℳ (i.e., Q² = Q, Q* = Q, Im Q = ℳ), then A|_ℳ defines a linear transformation on ℳ which is self-adjoint with respect to the invertible self-adjoint linear transformation QH|_ℳ: ℳ → ℳ.

Proof. Part (a) is straightforward and is left to the reader. For part (b) note first that QH|_ℳ is self-adjoint, since H = H*, Q = Q*, and QH|_ℳ = QHQ|_ℳ. Further, QH|_ℳ is invertible in view of the invertibility of H and the nondegeneracy of ℳ. Then HA = A*H implies Q(HA)Q = Q(A*H)Q and, since AQ = QAQ,

(QHQ)(QAQ) = (QA*Q)(QHQ),

from which the result follows. □
Now we can start with the proof of Theorem S5.1. If (S5.3) holds, then one can easily check that A is self-adjoint relative to H. Indeed, the equality P_{ε,J}J = J*P_{ε,J} is verified directly. Then

HA = T*⁻¹P_{ε,J}T⁻¹ · TJT⁻¹ = T*⁻¹P_{ε,J}JT⁻¹ = T*⁻¹J*P_{ε,J}T⁻¹ = T*⁻¹J*T* · T*⁻¹P_{ε,J}T⁻¹ = A*H,

i.e., A is self-adjoint relative to H.

Now let A be self-adjoint relative to the scalar product [x, y] = (Hx, y). Decompose ℂⁿ into a direct sum

ℂⁿ = 𝒳₁ ∔ ⋯ ∔ 𝒳_a ∔ 𝒳_{a+1} ∔ ⋯ ∔ 𝒳_{a+b},  (S5.4)

where 𝒳ᵢ is the sum of the root subspaces of A corresponding to the eigenvalues λᵢ and λ̄ᵢ, i = 1, …, a, and where, for i = a + 1, …, a + b, 𝒳ᵢ is the root subspace of A corresponding to the real eigenvalue λᵢ.

We show first that 𝒳ᵢ and 𝒳ⱼ (i ≠ j; i, j = 1, …, a + b) are orthogonal with respect to H, i.e., [x, y] = 0 for every x ∈ 𝒳ᵢ and y ∈ 𝒳ⱼ. Indeed, let x ∈ 𝒳ᵢ, y ∈ 𝒳ⱼ, and assume first that (A − μᵢI)^s x = (A − μⱼI)^t y = 0, where μᵢ (resp. μⱼ) is equal to either λᵢ or λ̄ᵢ (resp. λⱼ or λ̄ⱼ), for some nonnegative integers s and t. We have to show that [x, y] = 0. It is easily verified that this is so when s = t = 1. Now assume inductively that [x′, y′] = 0 for all pairs x′ ∈ 𝒳ᵢ and y′ ∈ 𝒳ⱼ for which (A − μᵢI)^{s′}x′ = (A − μⱼI)^{t′}y′ = 0 with s′ + t′ < s + t.

Consider x′ = (A − μᵢI)x and y′ = (A − μⱼI)y. Then, by the induction hypothesis, [x′, y] = [x, y′] = 0, or

μᵢ[x, y] = [Ax, y],  μ̄ⱼ[x, y] = [x, Ay].

But [Ax, y] = [x, Ay] (since A is self-adjoint with respect to H); so (μᵢ − μ̄ⱼ)[x, y] = 0. By the choice of 𝒳ᵢ and 𝒳ⱼ we have μᵢ ≠ μ̄ⱼ; so [x, y] = 0. As

𝒳ᵢ = {x₁ + x₂ | (A − λᵢI)^{s₁}x₁ = (A − λ̄ᵢI)^{s₂}x₂ = 0 for some s₁ and s₂},

with an analogous representation for 𝒳ⱼ, it follows from the above that [x, y] = 0 for all x ∈ 𝒳ᵢ, y ∈ 𝒳ⱼ with i ≠ j. It then follows from (S5.4) and Lemma S5.4 that each 𝒳ᵢ is nondegenerate.

Consider a fixed 𝒳ᵢ with i = 1, …, a (i.e., λᵢ ≠ λ̄ᵢ). Then 𝒳ᵢ = 𝒳ᵢ′ ∔ 𝒳ᵢ″, where 𝒳ᵢ′ (resp. 𝒳ᵢ″) is the root subspace of A corresponding to λᵢ (resp. λ̄ᵢ). As above, it follows that the subspaces 𝒳ᵢ′ and 𝒳ᵢ″ are isotropic with respect to H, i.e., [x, y] = 0 whenever x, y ∈ 𝒳ᵢ′ and also whenever x, y ∈ 𝒳ᵢ″.

There exists an integer m with the properties that (A − λᵢI)^m|_{𝒳ᵢ′} = 0 but (A − λᵢI)^{m−1}a₁ ≠ 0 for some a₁ ∈ 𝒳ᵢ′. Since 𝒳ᵢ is nondegenerate and 𝒳ᵢ′ is isotropic, there exists b₁ ∈ 𝒳ᵢ″ such that

[(A − λᵢI)^{m−1}a₁, b₁] = 1.

Define sequences a₁, …, a_m ∈ 𝒳ᵢ′ and b₁, …, b_m ∈ 𝒳ᵢ″ by

aⱼ = (A − λᵢI)^{j−1}a₁,  bⱼ = (A − λ̄ᵢI)^{j−1}b₁,  j = 1, …, m.

Observe that

[a₁, b_m] = [a₁, (A − λ̄ᵢI)^{m−1}b₁] = [(A − λᵢI)^{m−1}a₁, b₁] = 1;

in particular, b_m ≠ 0. Further, for every x ∈ 𝒳ᵢ′ we have

[x, (A − λ̄ᵢI)b_m] = [x, (A − λ̄ᵢI)^m b₁] = [(A − λᵢI)^m x, b₁] = 0,

so the vector (A − λ̄ᵢI)b_m is H-orthogonal to 𝒳ᵢ′. In view of (S5.4) and the isotropy of 𝒳ᵢ″ we deduce that (A − λ̄ᵢI)b_m is H-orthogonal to ℂⁿ, and hence

(A − λ̄ᵢI)b_m = 0.

Then clearly a_m, …, a₁ (resp. b_m, …, b₁) is a Jordan chain of A corresponding to λᵢ (resp. λ̄ᵢ), i.e.,

Aa_m = λᵢa_m,  Aaⱼ = λᵢaⱼ + aⱼ₊₁ for j = 1, 2, …, m − 1,

with analogous relations for the bⱼ (replacing λᵢ by λ̄ᵢ). For j + k = m + 1 we have

[aⱼ, b_k] = [(A − λᵢI)^{j−1}a₁, (A − λ̄ᵢI)^{k−1}b₁] = [(A − λᵢI)^{j+k−2}a₁, b₁] = 1,  (S5.5)

and analogously

[aⱼ, b_k] = 0 for j + k > m + 1.  (S5.6)
Now put

c₁ = a₁ + Σ_{j=2}^m αⱼaⱼ,  cⱼ₊₁ = (A − λᵢI)cⱼ, j = 1, …, m − 1,

where α₂, …, α_m are chosen so that

[c₁, b_{m−1}] = [c₁, b_{m−2}] = ⋯ = [c₁, b₁] = 0.

Such a choice is possible, as can be checked easily using (S5.5) and (S5.6). Now for j + k ≤ m

[cⱼ, b_k] = [(A − λᵢI)^{j−1}c₁, b_k] = [c₁, (A − λ̄ᵢI)^{j−1}b_k] = [c₁, b_{k+j−1}] = 0,

and for j + k ≥ m + 1 we obtain, using (A − λᵢI)^m a₁ = 0 together with (S5.5) and (S5.6),

[cⱼ, b_k] = [(A − λᵢI)^{j−1}c₁, (A − λ̄ᵢI)^{k−1}b₁] = [(A − λᵢI)^{j+k−2}c₁, b₁] = [(A − λᵢI)^{j+k−2}a₁, b₁] = 1 for j + k = m + 1, and 0 for j + k > m + 1.

Let 𝒩₁ = Span{c₁, …, c_m, b₁, …, b_m}. The relations above show that A|_{𝒩₁} = J₁ ⊕ J̄₁ in the basis c₁, …, c_m, b₁, …, b_m, where J₁ is the Jordan block of size m with eigenvalue λᵢ (and J̄₁ the block of size m with eigenvalue λ̄ᵢ), and that

[x, y] = y* [0 P₁; P₁ 0] x,  x, y ∈ 𝒩₁,

in the same basis, where P₁ is the sip matrix of size m. We see from this representation that 𝒩₁ is nondegenerate. By Lemma S5.4, ℂⁿ = 𝒩₁ ∔ 𝒩₁^[⊥], and by Lemma S5.5, 𝒩₁^[⊥] is an invariant subspace for A. If A|_{𝒩₁^[⊥]} has nonreal eigenvalues, apply the same procedure to construct a subspace 𝒩₂ ⊂ 𝒩₁^[⊥] with basis c′₁, …, c′_{m′}, b′₁, …, b′_{m′} such that in this basis A|_{𝒩₂} = J₂ ⊕ J̄₂, where J₂ is the Jordan block of size m′ with nonreal eigenvalue, and

[x, y] = y* [0 P₂; P₂ 0] x,  x, y ∈ 𝒩₂,

with the sip matrix P₂ of size m′. Continue this procedure until the nonreal eigenvalues of A are exhausted.

Consider now a fixed 𝒳ᵢ, where i = a + 1, …, a + b, so that λᵢ is real. Again, let m be such that (A − λᵢI)^m|_{𝒳ᵢ} = 0 but (A − λᵢI)^{m−1}|_{𝒳ᵢ} ≠ 0. Let Qᵢ be the orthogonal projector on 𝒳ᵢ and define F: 𝒳ᵢ → 𝒳ᵢ by

F = QᵢH(A − λᵢI)^{m−1}|_{𝒳ᵢ}.
Since λᵢ is real, it is easily seen that F is self-adjoint. Moreover, F ≠ 0; so there is a nonzero (necessarily real) eigenvalue of F with an eigenvector a₁. Normalize a₁ so that

[(A − λᵢI)^{m−1}a₁, a₁] = ε = ±1.  (S5.7)

Let aⱼ = (A − λᵢI)^{j−1}a₁, j = 1, …, m. It follows from (S5.7) that for j + k = m + 1

[aⱼ, a_k] = [(A − λᵢI)^{j−1}a₁, (A − λᵢI)^{k−1}a₁] = [(A − λᵢI)^{m−1}a₁, a₁] = ε.  (S5.8)

Moreover, for j + k > m + 1 we have

[aⱼ, a_k] = [(A − λᵢI)^{j+k−2}a₁, a₁] = 0  (S5.9)

in view of the choice of m. Now put

b₁ = a₁ + α₂a₂ + ⋯ + α_m a_m,  bⱼ = (A − λᵢI)^{j−1}b₁, j = 1, …, m,

and choose the αᵢ so that

[b₁, b₁] = [b₁, b₂] = ⋯ = [b₁, b_{m−1}] = 0.

Such a choice of the αᵢ is possible. Indeed, the equality [b₁, bⱼ] = 0 (j = 1, …, m − 1) gives, in view of (S5.8) and (S5.9),

0 = [a₁ + α₂a₂ + ⋯ + α_m a_m, aⱼ + α₂aⱼ₊₁ + ⋯ + α_{m−j+1}a_m] = [a₁, aⱼ] + 2εα_{m−j+1} + (terms in α₂, …, α_{m−j}).

Evidently, these equalities determine unique numbers α₂, α₃, …, α_m in succession.

Let 𝒩 = Span{b₁, …, b_m}. In the basis b₁, …, b_m the linear transformation A|_𝒩 is represented by the single Jordan block of size m with eigenvalue λᵢ, and

[x, y] = y*εP₀x,  x, y ∈ 𝒩,

where P₀ is the sip matrix of size m. Continue the procedure on the orthogonal companion of 𝒩, and so on.

Applying this construction, we find a basis f₁, …, f_n in ℂⁿ such that A is represented by the Jordan matrix J of (S5.1) in this basis and, with P_{ε,J} as defined in (S5.2),

[x, y] = y*P_{ε,J}x,

where x and y are represented by their coordinates in the basis f₁, …, f_n. Let T be the n × n invertible matrix whose ith column is formed by the coordinates of fᵢ (in the standard orthonormal basis), i = 1, …, n. For such a T the relation T⁻¹AT = J holds because f₁, …, f_n is a Jordan basis for A, and the equality T*HT = P_{ε,J} follows from (S5.5), (S5.6), and (S5.8). So (S5.3) holds, and Theorem S5.1 is proved completely. □

S5.3. Uniqueness of the Sign Characteristic

Let H be an n × n hermitian nonsingular matrix, and let A be some H-self-adjoint matrix. If J is a normal form of A, then in view of Theorem S5.1 H admits an A-canonical form P_{ε,J}. We suppose that the order of the Jordan blocks in J is fixed as explained in the first section. The set of signs ε in P_{ε,J} will be called the A-sign characteristic of H. The problem considered in this section is that of the uniqueness of ε. Recall that ε = {ε_{a+j,i}}, i = 1, 2, …, k_{a+j} and j = 1, 2, …, b, in the notation of Section S5.1. Two sets of signs ε^(r) = {ε^(r)_{a+j,i}}, r = 1, 2, will be said to be equivalent if one can be obtained from the other by permutation of signs within subsets corresponding to Jordan blocks of J having the same size and the same real eigenvalue.
[x, y] = y* P_{ε,J} x,

where x and y are represented by their coordinates in the basis f_1, …, f_n. Let T be the n × n invertible matrix whose ith column is formed by the coordinates of f_i (in the standard orthonormal basis), i = 1, …, n. For such a T, the relation T⁻¹AT = J holds because f_1, …, f_n is a Jordan basis for A, and the equality T*HT = P_{ε,J} follows from (S5.5), (S5.6), and (S5.8). So (S5.3) holds. Theorem S5.1 is proved completely. □

S5.3. Uniqueness of the Sign Characteristic

Let H be an n × n hermitian nonsingular matrix, and let A be some H-self-adjoint matrix. If J is a normal form of A, then in view of Theorem S5.1, H admits an A-canonical form P_{ε,J}. We suppose that the order of the Jordan blocks in J is fixed as explained in the first section. The set of signs ε in P_{ε,J} will be called the A-sign characteristic of H. The problem considered in this section is that of the uniqueness of ε. Recall that ε = {ε_{a+j,i}}, i = 1, 2, …, k_j, j = 1, 2, …, b, in the notation of Section S5.1. Two sets of signs ε^{(r)} = {ε_{ij}^{(r)}}, r = 1, 2, will be said to be equivalent if one can be obtained from the other by permutation of signs within subsets corresponding to Jordan blocks of J having the same size and the same real eigenvalue.
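The passage from (A, H) to the canonical pair (J, P_{ε,J}) is easy to exercise numerically in the reverse direction. The following sketch is ours, not from the text; the eigenvalue, the sign, and the random similarity S are illustrative choices. It generates an H-self-adjoint pair from canonical data and checks the defining identities of (S5.3):

```python
import numpy as np

# Sketch (illustrative, not from the text): build an H-self-adjoint pair
# (A, H) from canonical data (J, P_{eps,J}) for one size-3 Jordan block
# with a real eigenvalue.
rng = np.random.default_rng(0)
lam, eps = 1.5, +1.0
J = np.array([[lam, 1, 0], [0, lam, 1], [0, 0, lam]], dtype=complex)
P = eps * np.fliplr(np.eye(3))              # eps times the sip matrix

S = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = np.linalg.inv(S) @ J @ S                # A = S^{-1} J S
H = S.conj().T @ P @ S                      # H = S* P_{eps,J} S

assert np.allclose(H, H.conj().T)           # H is hermitian (and invertible)
assert np.allclose(H @ A, A.conj().T @ H)   # A is H-self-adjoint
```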
Theorem S5.6. Let A be an H-self-adjoint matrix. Then the A-sign characteristic of H is defined uniquely up to equivalence.

Note that in the special case A = I, this theorem states that in any reduction of the quadratic form (Hx, x) to a sum of squares, the number of positive coefficients and the number of negative coefficients are invariant, i.e., the classical inertia law.
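In the special case A = I just mentioned, the A-sign characteristic reduces to the inertia of H and can be read off from its spectrum; a minimal numerical sketch (our illustration, with an arbitrary example matrix):

```python
import numpy as np

# A = I: the A-sign characteristic of H is just the inertia of H.
H = np.array([[2.0, 1.0], [1.0, -3.0]])
w = np.linalg.eigvalsh(H)
print((w > 0).sum(), "positive signs,", (w < 0).sum(), "negative signs")
```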
Proof. Without loss of generality a Jordan form J for A can be fixed, with the conventions of Eq. (S5.1). It is then to be proved that if P_{ε,J} and P_{δ,J} are A-canonical forms for H, then the sets of signs ε and δ are equivalent. It is evident from the definition of an A-canonical form (Eqs. (S5.1) and (S5.2)) that the conclusion of the theorem follows if it can be established for an A having just one real eigenvalue, say α. Thus, the Jordan form J for A is assumed to have k_i blocks of size m_i, i = 1, 2, …, t, where m_1 > m_2 > ⋯ > m_t. Thus, we may write

J = diag[J_1, …, J_1, J_2, …, J_2, …, J_t, …, J_t],

where the Jordan block J_j of size m_j with eigenvalue α appears k_j times, and

P_{ε,J} = diag[diag[ε_{j,i} P_j]_{i=1}^{k_j}]_{j=1}^{t},

where P_j is the sip matrix of size m_j, with a similar expression for P_{δ,J}, replacing ε_{j,i} by the signs δ_{j,i}.

It is proved in Theorem S5.1 that, for some nonsingular S, H = S*P_{ε,J}S and A = S⁻¹JS. It follows that for any nonnegative integer k,

H(A − αI)^k = S* P_{ε,J}(J − αI)^k S,   (S5.10)

and this is a relation between hermitian matrices. Observe that (J_i − αI)^{m_1−1} = 0 for i = 2, 3, …, t, and (J_1 − αI)^{m_1−1} has all entries zero except for the entry in the right upper corner, which is 1. Consequently,

P_{ε,J}(J − αI)^{m_1−1} = diag[ε_{1,1}P_1(J_1 − αI)^{m_1−1}, …, ε_{1,k_1}P_1(J_1 − αI)^{m_1−1}, 0, …, 0].

Each block ε_{1,i}P_1(J_1 − αI)^{m_1−1} has the single nonzero entry ε_{1,i}, located in the right lower corner, so its signature is ε_{1,i}. Using this representation in (S5.10) with k = m_1 − 1, we conclude that the signature of H(A − αI)^{m_1−1} equals Σ_{i=1}^{k_1} ε_{1,i}. But precisely the same argument applies using the A-canonical form P_{δ,J} for H, and it is concluded that

Σ_{i=1}^{k_1} ε_{1,i} = Σ_{i=1}^{k_1} δ_{1,i}.   (S5.11)

Consequently, the subsets {ε_{1,1}, …, ε_{1,k_1}} and {δ_{1,1}, …, δ_{1,k_1}} of ε, δ are equivalent.

Now examine the hermitian matrix P_{ε,J}(J − αI)^{m_2−1}. This is found to be block diagonal with nonzero blocks of the form

ε_{1,i}P_1(J_1 − αI)^{m_2−1},  i = 1, 2, …, k_1,   and   ε_{2,j}P_2(J_2 − αI)^{m_2−1},  j = 1, 2, …, k_2.

Consequently, using (S5.10) with k = m_2 − 1, the signature of H(A − αI)^{m_2−1} is given by

(Σ_{i=1}^{k_1} ε_{1,i}) sig[P_1(J_1 − αI)^{m_2−1}] + Σ_{j=1}^{k_2} ε_{2,j}.

But again this must be equal to the corresponding expression formulated using δ instead of ε. Hence, using (S5.11), it is found that

Σ_{j=1}^{k_2} ε_{2,j} = Σ_{j=1}^{k_2} δ_{2,j},

and the subsets {ε_{2,1}, …, ε_{2,k_2}} and {δ_{2,1}, …, δ_{2,k_2}} of ε and δ are equivalent. Now it is clear that the argument can be continued for t steps, after which the equivalence of ε and δ is established. □

It follows from Theorem S5.6 that the A-sign characteristic of H is uniquely defined if we apply the following normalization rule: in every subset of signs corresponding to the Jordan blocks of the same size and the same eigenvalue, +1s (if any) precede −1s (if any). We give a simple example for illustration.
Example S5.1. Let

J = diag{ [λ_1 1; 0 λ_1], [λ_2 1; 0 λ_2] },

where λ_1 and λ_2 are distinct real numbers. Then

P_{ε,J} = diag{ [0 ε_1; ε_1 0], [0 ε_2; ε_2 0] }

for ε = (ε_1, ε_2), ε_i = ±1. According to Theorem S5.1 and Theorem S5.6, the set Ω = Ω_J of all invertible and self-adjoint matrices H such that J is H-self-adjoint splits into 4 disjoint sets Ω_1, Ω_2, Ω_3, Ω_4 corresponding to the sets of signs (+1, +1), (+1, −1), (−1, +1), and (−1, −1), respectively. An easy computation shows that each set Ω_i consists of all matrices H of the form

H = diag{ [0 ε_1a_1; ε_1a_1 b_1], [0 ε_2a_2; ε_2a_2 b_2] },

where a_1, a_2 are positive and b_1, b_2 are real parameters; ε_1, ε_2 are ±1 depending on the set Ω_i. Note also that each set Ω_i (i = 1, 2, 3, 4) is connected. □
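The description of the sets Ω_i is easy to check numerically. The sketch below is our illustration; the eigenvalues 0 and 1 and the parameter values are arbitrary choices consistent with the form of the example as given above:

```python
import numpy as np

# Numerical check of Example S5.1 (a sketch; lam1, lam2 and the parameter
# values below are illustrative choices, not part of the original text).
lam1, lam2 = 0.0, 1.0
jb = lambda lam: np.array([[lam, 1.0], [0.0, lam]])
J = np.block([[jb(lam1), np.zeros((2, 2))], [np.zeros((2, 2)), jb(lam2)]])

def H_in_Omega(e1, e2, a1, a2, b1, b2):
    blk = lambda e, a, b: np.array([[0.0, e * a], [e * a, b]])
    return np.block([[blk(e1, a1, b1), np.zeros((2, 2))],
                     [np.zeros((2, 2)), blk(e2, a2, b2)]])

H = H_in_Omega(+1, -1, a1=2.0, a2=0.5, b1=3.0, b2=-1.0)   # H in Omega_2
assert np.allclose(H, H.T)              # hermitian (real symmetric here)
assert abs(np.linalg.det(H)) > 1e-12    # invertible
assert np.allclose(H @ J, J.T @ H)      # J is H-self-adjoint
```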
S5.4. Second Description of the Sign Characteristic

Let H be an n × n hermitian nonsingular matrix, and let A be H-self-adjoint. In the main text we make use of another description of the A-sign characteristic of H, which is given below. In view of Theorem S5.1, H admits an A-canonical form P_{ε,J} in some orthonormal basis, where J is the Jordan form of A:

H = S* P_{ε,J} S.   (S5.12)

Here S is a nonsingular matrix such that SA = JS, and ε is the A-sign characteristic of H. We can suppose that the basis is the standard orthonormal basis in ℂⁿ.

Let λ_0 be a fixed real eigenvalue of A, and let Ψ_1 ⊂ ℂⁿ be the subspace spanned by the eigenvectors of λI − A corresponding to λ_0. For x ∈ Ψ_1\{0} denote by ν(x) the maximal length of a Jordan chain beginning with the eigenvector x. Let Ψ_i, i = 1, 2, …, γ (γ = max{ν(x) | x ∈ Ψ_1\{0}}), be the subspace of Ψ_1 spanned by all x ∈ Ψ_1 with ν(x) ≥ i. Then

Ker(λ_0 I − A) = Ψ_1 ⊇ Ψ_2 ⊇ ⋯ ⊇ Ψ_γ.

The following result describes the A-sign characteristic of H in terms of certain bilinear forms defined on the Ψ_i.

Theorem S5.7. For i = 1, …, γ, let

f_i(x, y) = (x, Hy^{(i)}),   x, y ∈ Ψ_i,

where y = y^{(1)}, y^{(2)}, …, y^{(i)} is a Jordan chain of λI − A corresponding to the real eigenvalue λ_0 with the eigenvector y. Then:

(i) f_i(x, y) does not depend on the choice of y^{(2)}, …, y^{(i)};
(ii) for some self-adjoint linear transformation G_i: Ψ_i → Ψ_i,
f_i(x, y) = (x, G_i y),   x, y ∈ Ψ_i;
(iii) for G_i of (ii), Ψ_{i+1} = Ker G_i (by definition, Ψ_{γ+1} = {0});
(iv) the number of positive (negative) eigenvalues of G_i, counting multiplicities, coincides with the number of positive (negative) signs in the A-sign characteristic of H corresponding to the Jordan blocks of J with eigenvalue λ_0 and size i.

Proof. By (S5.12) we have f_i(x, y) = (Sx, P_{ε,J}Sy^{(i)}), x, y ∈ Ψ_i. Clearly, Sx and Sy^{(1)}, …, Sy^{(i)} are an eigenvector and a Jordan chain, respectively, of λI − J corresponding to λ_0. Thus the proof is reduced to the case A = J and H = P_{ε,J}. But in this case the assertions (i)–(iv) can be checked without difficulties. □
Comments
The presentation in this chapter follows a part of the authors' paper [34f]. Indefinite scalar products in infinite-dimensional spaces have been studied extensively; see [8, 43, 51] for the theory of indefinite scalar product spaces and some of its applications. Results close to Theorem S5.1 appear in [59, Chapter 7] and [77]. Applications of the theory of indefinite scalar product spaces to solutions of algebraic Riccati equations and related problems in linear control theory are found in [53, 70d, 70f].
Chapter S6
Analytic Matrix Functions
The main result in this chapter is the perturbation theorem for self-adjoint matrices (Theorem S6.3). The proof is based on auxiliary results on analytic matrix functions from Section S6.1, which is also used in the main text.

S6.1. General Results
Here we consider problems concerning the existence of analytic bases in the image and in the kernel of an analytic matrix-valued function. The following theorem provides the basic results in this direction.

Theorem S6.1. Let A(ε), ε ∈ Ω, be an n × n complex matrix-valued function which is analytic in a domain Ω ⊂ ℂ. Let r = max_{ε∈Ω} rank A(ε). Then there exist n-dimensional analytic (in Ω) vector-valued functions y_1(ε), …, y_n(ε) with the following properties:

(i) y_1(ε), …, y_r(ε) are linearly independent for every ε ∈ Ω;
(ii) y_{r+1}(ε), …, y_n(ε) are linearly independent for every ε ∈ Ω;
(iii) Span{y_1(ε), …, y_r(ε)} = Im A(ε)   (S6.1)
and
Span{y_{r+1}(ε), …, y_n(ε)} = Ker A(ε)   (S6.2)
for every ε ∈ Ω, except for a set of isolated points which consists exactly of those ε_0 ∈ Ω for which rank A(ε_0) < r. For such exceptional ε_0 ∈ Ω, the inclusions
Span{y_1(ε_0), …, y_r(ε_0)} ⊃ Im A(ε_0)   (S6.3)
and
Span{y_{r+1}(ε_0), …, y_n(ε_0)} ⊂ Ker A(ε_0)   (S6.4)
hold.

We remark that Theorem S6.1 includes the case in which A(ε) is an analytic function of the real variable ε, i.e., in a neighborhood of every real point ε_0 the entries of A(ε) = (a_{ik}(ε))_{i,k=1}^{n} can be expressed as power series in ε − ε_0:

a_{ik}(ε) = Σ_{j=0}^{∞} a_j (ε − ε_0)^j,   (S6.5)

where the a_j are complex numbers depending on i, k, and ε_0, and the power series converges in some real neighborhood of ε_0. (Indeed, the power series (S6.5) then converges also in some complex neighborhood of ε_0, so in fact A(ε) is analytic in some complex neighborhood Ω of the real line.) Theorem S6.1 will be used in this form in the next section.
Lemma S6.2. Let x_1(ε), …, x_r(ε), ε ∈ Ω, be n-dimensional vector-valued functions which are analytic in a domain Ω in the complex plane. Assume that for some ε_0 ∈ Ω, the vectors x_1(ε_0), …, x_r(ε_0) are linearly independent. Then there exist n-dimensional vector functions y_1(ε), …, y_r(ε), ε ∈ Ω, with the following properties:

(i) y_1(ε), …, y_r(ε) are analytic in Ω;
(ii) y_1(ε), …, y_r(ε) are linearly independent for every ε ∈ Ω;
(iii) Span{y_1(ε), …, y_r(ε)} = Span{x_1(ε), …, x_r(ε)} (⊂ ℂⁿ) for every ε ∈ Ω\Ω_0, where Ω_0 = {ε ∈ Ω | x_1(ε), …, x_r(ε) are linearly dependent}.
Proof. We shall proceed by induction on r. Consider first the case r = 1. Let g(ε) be an analytic scalar function in Ω with the property that every zero of g(ε) is also a zero of x_1(ε) having the same multiplicity, and vice versa. The existence of such a g(ε) is ensured by the Weierstrass theorem (concerning the construction of an analytic function with prescribed zeros and corresponding multiplicities); see, for instance, [63, Vol. III, Chapter 3]. Put y_1(ε) = (g(ε))⁻¹ x_1(ε) to prove Lemma S6.2 in the case r = 1.

We pass now to the general case. Using the induction assumption, we can suppose that x_1(ε), …, x_{r−1}(ε) are linearly independent for every ε ∈ Ω. Let X_0(ε) be an r × r submatrix of the n × r matrix [x_1(ε), …, x_r(ε)] such that det X_0(ε_0) ≠ 0. It is well known in the theory of analytic functions that the set of zeros of the not identically zero analytic function det X_0(ε) is discrete, i.e., it is either empty or consists of isolated points. Since det X_0(ε) ≠ 0 implies that the vectors x_1(ε), …, x_r(ε) are linearly independent, it follows that the set Ω_0 = {ε ∈ Ω | x_1(ε), …, x_r(ε) are linearly dependent} is also discrete. Disregarding the trivial case when Ω_0 is empty, we can write Ω_0 = {ζ_1, ζ_2, …}, where the ζ_j ∈ Ω, j = 1, 2, …, form a finite or countable sequence with no limit points inside Ω.

Let us show that for every j = 1, 2, … there exist a positive integer s_j and scalar functions a_{1j}(ε), …, a_{r−1,j}(ε), analytic in a neighborhood of ζ_j, such that the system of n-dimensional analytic vector functions in Ω

x_1(ε), …, x_{r−1}(ε), (ε − ζ_j)^{−s_j} [x_r(ε) + Σ_{i=1}^{r−1} a_{ij}(ε) x_i(ε)]   (S6.6)

has the following properties: for ε ≠ ζ_j it is linearly equivalent to the system x_1(ε), …, x_r(ε) (i.e., both systems span the same subspace in ℂⁿ); for ε = ζ_j it is linearly independent.

Indeed, consider the n × r matrix B(ε) whose columns are formed by x_1(ε), …, x_r(ε). By the induction hypothesis, there exists an (r − 1) × (r − 1) submatrix B_0(ε) in the first r − 1 columns of B(ε) such that det B_0(ζ_j) ≠ 0. For simplicity of notation suppose that B_0(ε) is formed by the first r − 1 columns and rows of B(ε); so

B(ε) = [B_0(ε) B_1(ε); B_2(ε) B_3(ε)],

where B_1(ε), B_2(ε), and B_3(ε) are of sizes (r − 1) × 1, (n − r + 1) × (r − 1), and (n − r + 1) × 1, respectively. Since B_0(ε) is invertible in a neighborhood of ζ_j, we can write

B(ε) = [I 0; B_2(ε)B_0(ε)⁻¹ I] · [B_0(ε) 0; 0 Z(ε)] · [I B_0(ε)⁻¹B_1(ε); 0 I],   (S6.7)

where Z(ε) = B_3(ε) − B_2(ε)B_0(ε)⁻¹B_1(ε) is an (n − r + 1) × 1 matrix. Let s_j be the multiplicity of ζ_j as a zero of the vector function Z(ε). Consider the matrix function

B̃(ε) = [B_0(ε) 0; B_2(ε) (ε − ζ_j)^{−s_j} Z(ε)].
Clearly, the columns b_1(ε), …, b_r(ε) of B̃(ε) are analytic and linearly independent vector functions in a neighborhood U(ζ_j) of ζ_j. From formula (S6.7) it is clear that

Span{x_1(ε), …, x_r(ε)} = Span{b_1(ε), …, b_r(ε)}  for ε ∈ U(ζ_j)\{ζ_j}.

Further, from (S6.7) we obtain

[0; (ε − ζ_j)^{−s_j} Z(ε)] = (ε − ζ_j)^{−s_j} B(ε) [−B_0(ε)⁻¹B_1(ε); 1].

So the columns b_1(ε), …, b_r(ε) of B̃(ε) have the form (S6.6), where the a_{ij}(ε) are analytic scalar functions in a neighborhood of ζ_j.

Now choose y_1(ε), …, y_r(ε) in the form

y_i(ε) = x_i(ε), i = 1, …, r − 1,   y_r(ε) = Σ_{i=1}^{r} g_i(ε) x_i(ε),

where the scalar functions g_i(ε) are constructed as follows:

(a) g_r(ε) is analytic and different from zero in Ω except for the set of poles ζ_1, ζ_2, …, with corresponding multiplicities s_1, s_2, …;
(b) the functions g_i(ε) (for i = 1, …, r − 1) are analytic in Ω except for the poles ζ_1, ζ_2, …, and the singular part of g_i(ε) at ζ_j (for j = 1, 2, …) is equal to the singular part of a_{ij}(ε)g_r(ε) at ζ_j.

Let us check the existence of such functions g_i(ε). Let g_r(ε) be the inverse of an analytic function with zeros at ζ_1, ζ_2, … of corresponding multiplicities s_1, s_2, … (such an analytic function exists by the Weierstrass theorem mentioned above). The functions g_1(ε), …, g_{r−1}(ε) are constructed using the Mittag-Leffler theorem (see [63, Vol. III, Chapter 3]).

Property (a) ensures that y_1(ε), …, y_r(ε) are linearly independent for every ε ∈ Ω\{ζ_1, ζ_2, …}. In a neighborhood of each ζ_j we have

y_r(ε) = Σ_{i=1}^{r−1} (g_i(ε) − a_{ij}(ε)g_r(ε)) x_i(ε) + g_r(ε)(x_r(ε) + Σ_{i=1}^{r−1} a_{ij}(ε) x_i(ε))
= (ε − ζ_j)^{−s_j} [x_r(ε) + Σ_{i=1}^{r−1} a_{ij}(ε) x_i(ε)] + {linear combination of x_1(ζ_j), …, x_{r−1}(ζ_j)} + ⋯,   (S6.8)

where the final ellipsis denotes an analytic (in a neighborhood of ζ_j) vector function which assumes the value zero at ζ_j. Formula (S6.8) and the linear independence of the vectors (S6.6) for ε = ζ_j ensure that y_1(ζ_j), …, y_r(ζ_j) are linearly independent. □

The proof of Lemma S6.2 shows that if for some s (≤ r) the vector functions x_1(ε), …, x_s(ε) are linearly independent for all ε ∈ Ω, then the y_i(ε), i = 1, …, r, can be chosen in such a way that (i)–(iii) hold and, moreover, y_1(ε) = x_1(ε), …, y_s(ε) = x_s(ε) for all ε ∈ Ω. We shall use this observation in the proof of Theorem S6.1.

Proof of Theorem S6.1. Let A_0(ε) be an r × r submatrix of A(ε) which is nonsingular for some point of Ω, i.e., det A_0(ε) ≠ 0 there. So the set Ω_0 of zeros of the analytic function det A_0(ε) is either empty or consists of isolated points. In what follows we assume for simplicity that A_0(ε) is located in the top left corner of size r × r of A(ε). Let x_1(ε), …, x_r(ε) be the first r columns of A(ε), and let y_1(ε), …, y_r(ε) be the vector functions constructed in Lemma S6.2. Then for each ε ∈ Ω\Ω_0 we have
Span{y_1(ε), …, y_r(ε)} = Span{x_1(ε), …, x_r(ε)} = Im A(ε)   (S6.9)

(the last equality follows from the linear independence of x_1(ε), …, x_r(ε) for ε ∈ Ω\Ω_0). We shall now prove that

Span{y_1(ε), …, y_r(ε)} ⊃ Im A(ε),   ε ∈ Ω.   (S6.10)

Equality (S6.9) means that for every ε ∈ Ω\Ω_0 there exists an r × n matrix B(ε) such that

Y(ε)B(ε) = A(ε),   (S6.11)

where Y(ε) = [y_1(ε), …, y_r(ε)]. Note that B(ε) is necessarily unique (indeed, if B′(ε) also satisfies (S6.11), we have Y(ε)(B(ε) − B′(ε)) = 0 and, in view of the linear independence of the columns of Y(ε), B(ε) = B′(ε)). Further, B(ε) is analytic in Ω\Ω_0. To check this, pick an arbitrary ε′ ∈ Ω\Ω_0, and let Y_0(ε) be an r × r submatrix of Y(ε) such that det Y_0(ε′) ≠ 0 (for simplicity of notation assume that Y_0(ε) occupies the top r rows of Y(ε)). Then det Y_0(ε) ≠ 0 in some neighborhood V of ε′, and (Y_0(ε))⁻¹ is analytic on V. Now Y(ε)′ = [(Y_0(ε))⁻¹, 0] is a left inverse of Y(ε); premultiplying (S6.11) by Y(ε)′ we obtain

B(ε) = Y(ε)′A(ε),   ε ∈ V.   (S6.12)

So B(ε) is analytic on V; since ε′ ∈ Ω\Ω_0 was arbitrary, B(ε) is analytic in Ω\Ω_0. Moreover, B(ε) admits analytic continuation to the whole of Ω, as follows. Let ε_0 ∈ Ω_0, and let Y(ε)′ be a left inverse of Y(ε) which is analytic in a neighborhood V_0 of ε_0 (the existence of such a Y(ε)′ is proved as above).
Define B(ε) as Y(ε)′A(ε) for ε ∈ V_0. Clearly, B(ε) is analytic in V_0, and for ε ∈ V_0\{ε_0} this definition coincides with (S6.12) in view of the uniqueness of B(ε). So B(ε) is analytic in Ω. Now it is clear that (S6.11) holds also for ε ∈ Ω_0, which proves (S6.10). Consideration of dimensions shows that in fact we have equality in (S6.10) unless rank A(ε) < r. Thus (S6.1) and (S6.3) are proved.

We pass now to the proof of the existence of y_{r+1}(ε), …, y_n(ε) such that (ii), (S6.2), and (S6.4) hold. Let a_1(ε), …, a_r(ε) be the first r rows of A(ε). By assumption, a_1(ε), …, a_r(ε) are linearly independent for some ε ∈ Ω. Apply Lemma S6.2 to construct n-dimensional analytic row functions b_1(ε), …, b_r(ε) such that for all ε ∈ Ω the rows b_1(ε), …, b_r(ε) are linearly independent, and

Span{b_1(ε)ᵀ, …, b_r(ε)ᵀ} = Span{a_1(ε)ᵀ, …, a_r(ε)ᵀ},   ε ∈ Ω\Ω_0.   (S6.13)

Fix ε_0 ∈ Ω, and let b′_{r+1}, …, b′_n be n-dimensional rows such that the vectors b_1(ε_0)ᵀ, …, b_r(ε_0)ᵀ, b′ᵀ_{r+1}, …, b′ᵀ_n form a basis in ℂⁿ. Applying Lemma S6.2 again (for x_1(ε) = b_1(ε)ᵀ, …, x_r(ε) = b_r(ε)ᵀ, x_{r+1}(ε) ≡ b′ᵀ_{r+1}, …, x_n(ε) ≡ b′ᵀ_n) and using the remark after the proof of Lemma S6.2, we construct n-dimensional analytic row functions b_{r+1}(ε), …, b_n(ε) such that the n × n matrix

B(ε) = [b_1(ε); b_2(ε); …; b_n(ε)]

is nonsingular for all ε ∈ Ω. Then the inverse B(ε)⁻¹ is analytic on Ω. Let y_{r+1}(ε), …, y_n(ε) be the last n − r columns of B(ε)⁻¹. We claim that (ii), (S6.2), and (S6.4) are satisfied with this choice. Indeed, (ii) is evident. Take ε ∈ Ω\Ω_0; from (S6.13) and the construction of the y_i(ε), i = r + 1, …, n, it follows that

a_j(ε) y_i(ε) = 0,   j = 1, …, r,   i = r + 1, …, n.

But since ε ∉ Ω_0, every row of A(ε) is a linear combination of the first r rows; so in fact

Ker A(ε) ⊃ Span{y_{r+1}(ε), …, y_n(ε)}.   (S6.14)

Now (S6.14) implies

A(ε) y_i(ε) = 0,   i = r + 1, …, n.   (S6.15)
Passing to the limit when ε approaches a point of Ω_0, we obtain that (S6.15), as well as the inclusion (S6.14), holds for every ε ∈ Ω. Consideration of dimensions shows that equality holds in (S6.14) if and only if rank A(ε) = r, ε ∈ Ω. □

S6.2. Analytic Perturbations of Self-Adjoint Matrices

In this section we shall consider eigenvalues and eigenvectors of a self-adjoint matrix which is an analytic function of a parameter. It turns out that the eigenvalues and eigenvectors are also analytic. This result is used in Chapter 11.

Consider an n × n complex matrix function A(ε) which depends on the real parameter ε. We impose the following conditions:

(i) A(ε) is self-adjoint for every ε ∈ ℝ, i.e., A(ε) = (A(ε))*, where the star denotes the conjugate transpose matrix;
(ii) A(ε) is an analytic function of the real variable ε.

Such a matrix function A(ε) can be diagonalized simultaneously for all real ε. More exactly, the following result holds.

Theorem S6.3. Let A(ε) be a matrix function satisfying conditions (i) and (ii). Then there exist scalar functions μ_1(ε), …, μ_n(ε) and a matrix-valued function U(ε), which are analytic for real ε and possess the following properties for every ε ∈ ℝ:

A(ε) = (U(ε))⁻¹ diag[μ_1(ε), …, μ_n(ε)] U(ε),   U(ε)(U(ε))* = I.

Proof. Consider the equation

det(λI − A(ε)) = λⁿ + a_1(ε)λ^{n−1} + ⋯ + a_{n−1}(ε)λ + a_n(ε) = 0,   (S6.16)

where the a_i(ε) are scalar analytic functions of the real variable ε. In general, the solutions μ(ε) of (S6.16) can be chosen (when properly ordered) as power series in (ε − ε_0)^{1/p} in a real neighborhood of every ε_0 ∈ ℝ (Puiseux series):

μ(ε) = μ_0 + b_1(ε − ε_0)^{1/p} + b_2(ε − ε_0)^{2/p} + ⋯   (S6.17)

(see, for instance, [52c]; also [63, Vol. III, Section 45]). But since A(ε) is self-adjoint, all its eigenvalues (which are exactly the solutions of (S6.16)) are real. This implies that in (S6.17) only terms with integral powers of ε − ε_0 can appear. Indeed, let b_m be the first nonzero coefficient in (S6.17): b_1 = ⋯ = b_{m−1} = 0, b_m ≠ 0. (If b_i = 0 for all i = 1, 2, …, then our assertion is trivial.) Letting ε − ε_0 → 0 through real positive values, we see that
b_m = lim_{ε↓ε_0} (μ(ε) − μ_0)/(ε − ε_0)^{m/p}

is real, because μ(ε) is real for real ε. On the other hand, letting ε − ε_0 approach 0 through negative real values, we note that

(−1)^{m/p} b_m = lim_{ε↑ε_0} (μ(ε) − μ_0)/(ε_0 − ε)^{m/p}

is also real. Hence (−1)^{m/p} is a real number, and therefore m must be a multiple of p. We can continue this argument to show that only integral powers of ε − ε_0 can appear in (S6.17). In other words, the eigenvalues of A(ε), when properly enumerated, are analytic functions of the real variable ε.

Let μ_1(ε) be one such analytic eigenvalue of A(ε). We turn our attention to the eigenvectors corresponding to μ_1(ε). Put B(ε) = A(ε) − μ_1(ε)I. Then B(ε) is self-adjoint and analytic for real ε, and det B(ε) ≡ 0. By Theorem S6.1 there is an analytic (for real ε) nonzero vector x_1(ε) such that B(ε)x_1(ε) = 0. In other words, x_1(ε) is an eigenvector of A(ε) corresponding to μ_1(ε). Since A(ε) = (A(ε))*, the orthogonal complement {x_1(ε)}^⊥ of x_1(ε) in ℂⁿ is A(ε)-invariant. We claim that there is an analytic (for real ε) orthogonal basis in {x_1(ε)}^⊥. Indeed, let

x_1(ε) = [x_{11}(ε), …, x_{1n}(ε)]ᵀ.

By Theorem S6.1, there exists an analytic basis y_1(ε), …, y_{n−1}(ε) in Ker[x_{11}(ε), …, x_{1n}(ε)]. Then ȳ_1(ε), …, ȳ_{n−1}(ε) is a basis in {x_1(ε)}^⊥ (the bar denotes complex conjugation). Since ε is assumed to be real, the basis ȳ_1(ε), …, ȳ_{n−1}(ε) is also analytic. Applying to this basis the Gram–Schmidt orthogonalization process, we obtain an analytic orthogonal basis z_1(ε), …, z_{n−1}(ε) in {x_1(ε)}^⊥. In this basis the restriction of A(ε) to {x_1(ε)}^⊥ is represented by the matrix Ã(ε) determined by

A(ε)Z(ε) = Z(ε)Ã(ε),   (S6.18)

where Z(ε) = [z_1(ε), …, z_{n−1}(ε)] is an n × (n − 1) analytic matrix function with linearly independent columns for every real ε. Fix ε_0 ∈ ℝ, and let Z_0(ε) be an (n − 1) × (n − 1) submatrix of Z(ε) such that Z_0(ε_0) is nonsingular (we shall assume for simplicity of notation that Z_0(ε) occupies the top n − 1 rows of Z(ε)). Then Z_0(ε) will be nonsingular in some real neighborhood of ε_0, and in this neighborhood (Z_0(ε))⁻¹ will be analytic. Now (S6.18) gives

Ã(ε) = [(Z_0(ε))⁻¹, 0] A(ε) Z(ε);

so Ã(ε) is analytic in a neighborhood of ε_0. Since ε_0 ∈ ℝ is arbitrary, Ã(ε) is analytic for real ε.

Now apply the above argument to find an analytic eigenvalue μ_2(ε) and a corresponding analytic eigenvector x_2(ε) for Ã(ε), and so on. Eventually we obtain a set of n analytic eigenvalues μ_1(ε), …, μ_n(ε) of A(ε), with corresponding analytic eigenvectors x_1(ε), …, x_n(ε), and, moreover, these eigenvectors are orthogonal to each other. Normalizing them (this does not destroy analyticity because ε is real) we can assume that x_1(ε), …, x_n(ε) form an orthonormal basis in ℂⁿ for every ε ∈ ℝ. Now the requirements of Theorem S6.3 are satisfied with U(ε) = [x_1(ε), …, x_n(ε)]⁻¹. □
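The need for a "proper enumeration" of the eigenvalues in Theorem S6.3 is visible already in the simplest examples; the following sketch (our illustration, not from the text) contrasts the analytic branches with the sorted spectrum near a crossing:

```python
import numpy as np

# Why "properly enumerated" matters: for A(eps) = diag(eps, -eps) the
# analytic eigenvalue branches are mu1(eps) = eps and mu2(eps) = -eps,
# which cross at eps = 0.  Sorting the spectrum instead yields -|eps| and
# |eps|: continuous, but with a kink (not analytic) at the crossing.
for eps in (-0.5, -0.1, 0.1, 0.5):
    w = np.linalg.eigvalsh(np.diag([eps, -eps]))   # sorted ascending
    print(f"{eps:+.1f}  sorted: {w}  analytic branches: {eps:+.1f}, {-eps:+.1f}")
```

Comments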
The proofs in the first section originated in [37d]. They are based on the approach suggested in [74]. For a comprehensive treatment of perturbation problems in finite- and infinite-dimensional spaces see [48]. Theorem S6.1 can be obtained as a corollary from general results on analytic vector bundles; see [39].
References
1. S. Barnett, Regular polynomial matrices having relatively prime determinants, Proc. Cambridge Philos. Soc. 65, 585-590 (1969).
2. H. Bart, Meromorphic operator-valued functions, Thesis, Free University, Amsterdam [Mathematical Centre Tracts 44, Mathematical Centre, Amsterdam, 1973].
3a. H. Bart, I. Gohberg, and M. A. Kaashoek, A new characteristic operator function connected with operator polynomials, Integral Equations Operator Theory 1, 1-12 (1978).
3b. H. Bart, I. Gohberg, and M. A. Kaashoek, Stable factorizations of monic matrix polynomials and stable invariant subspaces, Integral Equations Operator Theory 1 (4), 496-517 (1978).
3c. H. Bart, I. Gohberg, and M. A. Kaashoek, "Minimal Factorizations of Matrix and Operator Functions," Birkhäuser, Basel, 1979.
4. H. Bart, I. Gohberg, M. A. Kaashoek, and P. van Dooren, Factorizations of transfer functions, SIAM J. Control Optim. 18 (6), 675-696 (1980).
5. H. Baumgärtel, "Endlich-dimensionale analytische Störungstheorie," Akademie-Verlag, Berlin, 1972.
6. R. R. Bitmead, S. Y. Kung, B. D. O. Anderson, and T. Kailath, Greatest common divisors via generalized Sylvester and Bezout matrices, IEEE Trans. Automat. Control AC-23, 1043-1047 (1978).
7. B. den Boer, Linearization of operator functions on arbitrary open sets, Integral Equations Operator Theory 1, 19-27 (1978).
8. J. Bognar, "Indefinite Inner Product Spaces," Springer-Verlag, Berlin and New York, 1974.
9. M. S. Brodskii, "Triangular and Jordan Representations of Linear Operators," Amer. Math. Soc., Transl. Math. Monographs, Vol. 32, Providence, Rhode Island, 1971.
10. R. S. Busby and W. Fair, Iterative solution of spectral operator polynomial equations and a related continued fraction, J. Math. Anal. Appl. 50, 113-134 (1975).
11. S. L. Campbell, Linear systems of differential equations with singular coefficients, SIAM J. Math. Anal. 8, 1057-1066 (1977).
12. S. Campbell and J. Daughtry, The stable solutions of quadratic matrix equations, Proc. Amer. Math. Soc. 74 (1), 19-23 (1979).
13. K. Clancey, private communication.
14. N. Cohen, On spectral analysis of matrix polynomials, M.Sc. Thesis, Weizmann Institute of Science, Israel, 1979.
15a. W. A. Coppel, Linear Systems, Notes in Pure Mathematics, Australian National University, Canberra, 6 (1972).
15b. W. A. Coppel, Matrix quadratic equations, Bull. Austral. Math. Soc. 10, 377-401 (1974).
16a. Ju. L. Daleckii and M. G. Krein, "Stability of Solutions of Differential Equations in Banach Space," Amer. Math. Soc., Transl. Math. Monographs, Vol. 43, Providence, Rhode Island, 1974.
16b. J. Daughtry, Isolated solutions of quadratic matrix equations, Linear Algebra and Appl. 21, 89-94 (1978).
17. R. J. Duffin, A minimax theory for overdamped networks, J. Rat. Mech. and Anal. 4, 221-233 (1955).
18. H. Flanders and H. K. Wimmer, On the matrix equations AX − XB = C and AX − YB = C, SIAM J. Appl. Math. 32, 707-710 (1977).
19. R. A. Frazer, W. J. Duncan, and A. R. Collar, "Elementary Matrices," 2nd ed., Cambridge Univ. Press, London and New York, 1955.
20. P. Fuhrmann, Simulation of linear systems and factorization of matrix polynomials, preprint (1977).
21. P. A. Fuhrmann and J. C. Willems, Factorization indices at infinity for rational matrix functions, Integral Equations Operator Theory 2, 287-301 (1979).
22. F. R. Gantmacher, "The Theory of Matrices," Vols. I and II, Chelsea, Bronx, New York, 1959.
23. I. M. Gel'fand, "Lectures in Linear Algebra," Wiley (Interscience), New York, 1961.
24. I. Gohberg, On some problems in the spectral theory of finite-meromorphic operator functions, Izv. Akad. Nauk Armjan. SSR, Ser. Mat. VI, 160-181 (1971) (Russian); errata: VII, 152 (1972).
25. I. C. Gohberg and I. A. Feldman, "Convolution Equations and Projection Methods for Their Solution," Amer. Math. Soc., Providence, Rhode Island, 1974.
26. I. Gohberg and G. Heinig, The resultant matrix and its generalizations, I. The resultant operator for matrix polynomials, Acta Sci. Math. (Szeged) 37, 41-61 (1975) (Russian).
27. I. Gohberg and M. A. Kaashoek, Unsolved problems in matrix and operator theory, II. Partial multiplicities for products, Integral Equations Operator Theory 2, 116-120 (1979).
28. I. Gohberg, M. A. Kaashoek, and D. Lay, Equivalence, linearization and decomposition of holomorphic operator functions, J. Funct. Anal. 28, 102-144 (1978).
29a. I. Gohberg, M. A. Kaashoek, L. Lerer, and L. Rodman, Common multiples and common divisors of matrix polynomials, I. Spectral method, Ind. J. Math. 30, 321-356 (1981).
29b. I. Gohberg, M. A. Kaashoek, L. Lerer, and L. Rodman, Common multiples and common divisors of matrix polynomials, II. Vandermonde and resultant, Linear and Multilinear Algebra (to appear).
30a. I. Gohberg, M. A. Kaashoek, and L. Rodman, Spectral analysis of families of operator polynomials and a generalized Vandermonde matrix, I. The finite dimensional case, in "Topics in Functional Analysis" (I. Gohberg and M. Kac, eds.), 91-128, Academic Press, New York, 1978.
30b. I. Gohberg, M. A. Kaashoek, and L. Rodman, Spectral analysis of families of operator polynomials and a generalized Vandermonde matrix, II. The infinite dimensional case, J. Funct. Anal. 30, 358-389 (1978).
31. I. Gohberg, M. A. Kaashoek, and F. van Schagen, Common multiples of operator polynomials with analytic coefficients, Manuscripta Math. 25, 279-314 (1978).
32a. I. C. Gohberg and M. G. Krein, Systems of integral equations on a half line with kernels depending on the difference of arguments, Amer. Math. Soc. Transl. (2) 13, 185-264 (1960).
32b. I. C. Gohberg and M. G. Krein, "Introduction to the Theory of Linear Nonself-adjoint Operators in Hilbert Space," Amer. Math. Soc., Transl. Math. Monographs, Vol. 24, Providence, Rhode Island, 1970.
32c. I. C. Gohberg and M. G. Krein, The basic proposition on defect numbers, root numbers and indices of linear operators, Uspehi Mat. Nauk 12, 43-118 (1957); translation, Russian Math. Surveys 13, 185-264 (1960).
33. I. Gohberg and N. Krupnik, "Einführung in die Theorie der eindimensionalen singulären Integraloperatoren," Birkhäuser, Basel, 1979.
34a. I. Gohberg, P. Lancaster, and L. Rodman, Spectral analysis of matrix polynomials, I. Canonical forms and divisors, Linear Algebra Appl. 20, 1-44 (1978).
34b. I. Gohberg, P. Lancaster, and L. Rodman, Spectral analysis of matrix polynomials, II. The resolvent form and spectral divisors, Linear Algebra and Appl. 21, 65-88 (1978).
34c. I. Gohberg, P. Lancaster, and L. Rodman, Representations and divisibility of operator polynomials, Canad. J. Math. 30, 1045-1069 (1978).
34d. I. Gohberg, P. Lancaster, and L. Rodman, On self-adjoint matrix polynomials, Integral Equations Operator Theory 2, 434-439 (1979).
34e. I. Gohberg, P. Lancaster, and L. Rodman, Perturbation theory for divisors of operator polynomials, SIAM J. Math. Anal. 10, 1161-1183 (1979).
34f. I. Gohberg, P. Lancaster, and L. Rodman, Spectral analysis of selfadjoint matrix polynomials, Research Paper No. 419, Dept. of Math. and Stat., University of Calgary, 1979.
34g. I. Gohberg, P. Lancaster, and L. Rodman, Spectral analysis of selfadjoint matrix polynomials, Ann. of Math. 112, 34-71 (1980).
34h. I. Gohberg, P. Lancaster, and L. Rodman, Factorization of self-adjoint matrix polynomials with constant signature, Linear and Multilinear Algebra (1982) (to appear).
35a. I. C. Gohberg and L. E. Lerer, Resultants of matrix polynomials, Bull. Amer. Math. Soc. 82, 565-567 (1976).
35b. I. Gohberg and L. Lerer, Factorization indices and Kronecker indices of matrix polynomials, Integral Equations Operator Theory 2, 199-243 (1979).
36a. I. Gohberg, L. Lerer, and L. Rodman, Factorization indices for matrix polynomials, Bull. Amer. Math. Soc. 84, 275-277 (1978).
36b. I. Gohberg, L. Lerer, and L. Rodman, On canonical factorization of operator polynomials, spectral divisors and Toeplitz matrices, Integral Equations Operator Theory 1, 176-214 (1978).
36c. I. Gohberg, L. Lerer, and L. Rodman, Stable factorization of operator polynomials and spectral divisors simply behaved at infinity, I, J. Math. Anal. Appl. 74, 401-431 (1980).
36d. I. Gohberg, L. Lerer, and L. Rodman, Stable factorization of operator polynomials and spectral divisors simply behaved at infinity, II, J. Math. Anal. Appl. 75, 1-40 (1980).
37a. I. Gohberg and L. Rodman, On spectral analysis of nonmonic matrix and operator polynomials, I. Reduction to monic polynomials, Israel J. Math. 30, 133-151 (1978).
37b. I. Gohberg and L. Rodman, On spectral analysis of nonmonic matrix and operator polynomials, II. Dependence on the finite spectral data, Israel J. Math. 30, 321-334 (1978).
37c. I. Gohberg and L. Rodman, On the spectral structure of monic matrix polynomials and the extension problem, Linear Algebra Appl. 24, 157-172 (1979).
37d. I. Gohberg and L. Rodman, Analytic matrix functions with prescribed local data, J. Analyse Math. (to appear).
38. I. Gohberg and E. I. Sigal, On operator generalizations of the logarithmic residue theorem and the theorem of Rouché, Math. USSR-Sb. 13, 603-625 (1971).
39. H. Grauert, Analytische Faserungen über holomorph-vollständigen Räumen, Math. Ann. 135, 263-273 (1958).
40. K. G. Guderley, On nonlinear eigenvalue problems for matrices, SIAM J. Appl. Math. 6, 335-353 (1958).
41. G. Heinig, The concepts of a bezoutian and a resolvent for operator bundles, Functional Anal. Appl. 11, 241-243 (1977).
42. P. Henrici, "Elements of Numerical Analysis," Wiley, New York, 1964.
43. I. S. Iohvidov and M. G. Krein, Spectral theory of operators in spaces with an indefinite metric, Amer. Math. Soc. Translations (2) 13, 105-175 (1960); 34, 283-373 (1963).
44. G. A. Isaev, Numerical range of operator pencils and Keldysh multiple completeness, Functional Anal. Appl. 9, 27-30 (1975).
45. V. A. Jakubovic, Factorization of symmetric matrix polynomials, Soviet Math. Dokl. 11, 1261-1264 (1970).
46. V. I. Kabak, A. S. Marcus, and V. I. Mereutsa, On a connection between spectral properties of a polynomial operator bundle and its divisors, in "Spectral Properties of Operators," pp. 29-57, Stiinca, Kishinev, 1977 (in Russian).
47. R. E. Kalman, Irreducible realizations and the degree of a rational matrix, SIAM J. Control (Optim.) 13, 520-544 (1965).
48. T. Kato, "Perturbation Theory for Linear Operators," Springer-Verlag, Berlin and New York, 1966.
49a. M. V. Keldysh, On eigenvalues and eigenfunctions of some classes of nonselfadjoint equations, DAN SSSR 77, No. 1, 11-14 (1951) (in Russian).
49b. M. V. Keldysh, On the completeness of the eigenfunctions of some classes of nonselfadjoint linear operators, Russian Math. Surveys 24, No. 4, 15-44 (1971).
50. A. G. Kostyuchenko and M. B. Orazov, Certain properties of the roots of a selfadjoint quadratic bundle, Functional Anal. Appl. 9, 295-305 (1975).
51. M. G. Krein and H. Langer, On some mathematical principles in the linear theory of damped oscillations of continua, I, II, Integral Equations Operator Theory 1, 364-399, 539-566 (1978) (translation).
52a. P. Lancaster, Inversion of lambda-matrices and applications to the theory of linear vibrations, Arch. Rational Mech. Anal. 6, 105-114 (1960).
52b. P. Lancaster, "Lambda-Matrices and Vibrating Systems," Pergamon, Oxford, 1966.
52c. P. Lancaster, "Theory of Matrices," Academic Press, New York, 1969.
52d. P. Lancaster, Some questions in the classical theory of vibrating systems, Bul. Inst. Politehn. Iasi 17 (21), 125-132 (1971).
52e. P. Lancaster, A fundamental theorem of lambda-matrices with applications, I. Ordinary differential equations with constant coefficients, Linear Algebra Appl. 18, 189-211 (1977).
52f. P. Lancaster, A fundamental theorem of lambda-matrices with applications, II. Difference equations with constant coefficients, Linear Algebra Appl. 18, 213-222 (1977).
53. P. Lancaster and L. Rodman, Existence and uniqueness theorems for the algebraic Riccati equation, Internat. J. Control 32, 285-309 (1980).
54. P. Lancaster and J. G. Rokne, Solution of nonlinear operator equations, SIAM J. Math. Anal. 8, 448-457 (1977).
55. P. Lancaster and H. K. Wimmer, Zur Theorie der λ-Matrizen, Math. Nachr. 68, 325-330 (1975).
56a. H. Langer, Spektraltheorie linearer Operatoren in J-Räumen und einige Anwendungen auf die Schar L(λ) = λ²I + λB + C, Habilitationsschrift, Technische Universität Dresden, 1964.
56b. H. Langer, Über Lancasters Zerlegung von Matrizen-Scharen, Arch. Rational Mech. Anal. 29, 75-80 (1968).
56c. H. Langer, Über eine Klasse nichtlinearer Eigenwertprobleme, Acta Sci. Math. (Szeged) 35, 79-93 (1973).
56d. H. Langer, Factorization of operator pencils, Acta Sci. Math. (Szeged) 38, 83-96 (1976).
57a. J. Leiterer, Local and global equivalence of meromorphic operator functions, I, Math. Nachr. 83, 7-29 (1978).
57b. J. Leiterer, Local and global equivalence of meromorphic operator functions, II, Math. Nachr. 84, 145-170 (1978).
58. Ya. B. Lopatinskii, Factorization of a polynomial matrix, Nauch. Zap. L'vov. Politekh. Inst., Ser. Fiz.-Mat. 38, 3-9 (1956) (in Russian).
59. A. I. Mal'cev, "Foundations of Linear Algebra," Freeman, San Francisco, California, 1963.
60. A. S. Marcus, On holomorphic operator functions, Dokl. Akad. Nauk SSSR 119, 1099-1102 (1958) (Russian).
61a. A. S. Marcus and V. I. Matsaev, On spectral properties of holomorphic operator functions in Hilbert space, Mat. Issled. IX: 4 (34), 79-91 (1974).
61b. A. S. Marcus and V. I. Matsaev, On spectral theory of holomorphic operator functions in Hilbert space, Functional Anal. Appl. 9, 76-77 (1975) (in Russian).
61c. A. S. Marcus and V. I. Matsaev, On spectral theory of holomorphic operator functions, in "Research in Functional Analysis," Stiinca, Kishinev, 71-100 (1978) (in Russian).
62a. A. S. Marcus and I. V. Mereutsa, On the complete n-tuple of roots of the operator equation corresponding to a polynomial operator bundle, Math. USSR Izv. 7, 1105-1128 (1973).
62b. A. S. Marcus and I. V. Mereutsa, On some properties of λ-matrices, Mat. Issled., Kishinev 3, 207-213 (1975) (in Russian).
63. A. I. Markushevich, "Theory of Functions of a Complex Variable," Vols. I-III, Prentice-Hall, Englewood Cliffs, New Jersey, 1965.
64a. I. V. Mereutsa, On the decomposition of an operator polynomial bundle into a product of linear factors, Mat. Issled. 8, 102-114 (in Russian).
64b. I. V. Mereutsa, On decomposition of a polynomial operator bundle into factors, in "Research in Functional Analysis," Stiinca, Kishinev, pp. 101-119 (1978) (in Russian).
65. L. Mirsky, "An Introduction to Linear Algebra," Oxford Univ. Press (Clarendon), London and New York, 1955.
66a. B. Mityagin, Linearization of holomorphic operator functions, I, Integral Equations Operator Theory 1, 114-131 (1978).
66b. B. Mityagin, Linearization of holomorphic operator functions, II, Integral Equations Operator Theory 1, 226-249 (1978).
67. P. H. Müller, Über eine Klasse von Eigenwertaufgaben mit nichtlinearer Parameterabhängigkeit, Math. Nachr. 12, 173-181 (1954).
68. N. I. Muskhelishvili, "Singular Integral Equations," Noordhoff, Groningen, 1953.
69. M. B. Orazov, On certain properties of analytic operator-functions of two arguments, Uspehi Mat. Nauk 4, 221-222 (1978) (in Russian).
70a. L. Rodman, On analytic equivalence of operator polynomials, Integral Equations Operator Theory 1, 400-414 (1978).
70b. L. Rodman, On existence of common multiples of monic operator polynomials, Integral Equations Operator Theory 1, 400-414 (1978).
70c. L. Rodman, Spectral theory of analytic matrix functions, Ph.D. Thesis, Tel-Aviv University, 1978.
70d. L. Rodman, On extremal solutions of the algebraic Riccati equation, Amer. Math. Soc., Lectures in Appl. Math. 18, 311-327 (1980).
70e. L. Rodman, On connected components in the set of divisors of monic matrix polynomials, Linear Algebra and Appl. 31, 33-42 (1980).
70f. L. Rodman, On nonnegative invariant subspaces in indefinite scalar product spaces, Linear and Multilinear Algebra 10, 1-14 (1981).
71. L. Rodman and M. Schaps, On the partial multiplicities of a product of two matrix polynomials, Integral Equations Operator Theory 2, 565-599 (1979).
72. W. E. Roth, The equations AX − YB = C and AX − XB = C in matrices, Proc. Amer. Math. Soc. 3, 392-396 (1952).
73. B. Rowley, Spectral analysis of operator polynomials, Ph.D. Thesis, McGill University, Montreal, 1979.
74. Yu. L. Shmuljan, Finite dimensional operators depending analytically on a parameter, Ukrainian Math. J. IX (2), 195-204 (1957) (in Russian).
75. E. I. Sigal, Partial multiplicities of a product of operator functions, Mat. Issled. 8 (3), 65-79 (1973) (in Russian).
76a. G. P. A. Thijsse, On solutions of elliptic differential equations with operator coefficients, Integral Equations Operator Theory 1, 567-579 (1978).
76b. G. P. A. Thijsse, Rules for the partial multiplicities of the product of holomorphic matrix functions, Integral Equations Operator Theory 3, 515-528 (1980).
77. R. C. Thompson, The characteristic polynomial of a principal subpencil of a Hermitian matrix pencil, Linear Algebra and Appl. 14, 135-177 (1976).
78. J. V. Uspensky, "Theory of Equations," McGraw-Hill, New York, 1978.
79a. H. K. Wimmer, A Jordan factorization theorem for polynomial matrices, Proc. Amer. Math. Soc., 201-206 (1979).
79b. H. K. Wimmer, The structure of non-singular polynomial matrices, Research Paper 432, University of Calgary, 1979.
80. W. A. Wolovich, "Linear Multivariable Systems" (Applied Mathematical Sciences, Vol. 11), Springer-Verlag, Berlin and New York, 1974.
81. D. C. Youla and P. Tissi, An explicit formula for the degree of a rational matrix, Electrophysics Memo PIBM RI-1273-65, Polytechnic Institute of Brooklyn, Electrophysics Department, 1965.
List of Notations and Conventions
Vectors, matrices, and functions are assumed to be complex-valued if not stated otherwise. Linear spaces are assumed to be over the field ℂ of complex numbers if not stated otherwise.

ℂⁿ   linear space of all n-dimensional vectors written as columns
ℝ   the field of real numbers
I or I_m   identity linear transformation, or identity matrix of size m × m
0 or 0_m   zero linear transformation, or zero matrix of size m × m

For a constant matrix A, σ(A) = {λ | λI − A is singular}. For a nonconstant matrix polynomial L(λ), σ(L) = {λ | L(λ) is singular}.

Matrices will often be interpreted as linear transformations (written in a fixed basis, which in ℂⁿ is generally assumed to be standard, i.e., formed by the unit coordinate vectors), and vice versa.

L^{(i)}(λ)   ith derivative with respect to λ of the matrix-valued function L(λ)
Ker T = {x ∈ ℂⁿ | Tx = 0}   for an m × n matrix T
Im T = {y ∈ ℂᵐ | y = Tx for some x ∈ ℂⁿ}   for an m × n matrix T
A𝓜   image of a subspace 𝓜 under the linear transformation A
A|_𝓜   the restriction of the linear transformation A to the domain 𝓜
𝓜 ∔ 𝓝   direct sum of subspaces
𝓜 ⊕ 𝓝   orthogonal sum of subspaces
Span{x_1, …, x_m}   subspace spanned by the vectors x_1, …, x_m
{0}   zero subspace
Aᵀ   transposed matrix
A*   the conjugate transpose of the matrix A, or the adjoint linear transformation of A
‖x‖ = (Σ_{i=1}^n |x_i|²)^{1/2}   euclidean norm of x = (x_1, …, x_n)ᵀ ∈ ℂⁿ
(x, y) = Σ_{i=1}^n x_i ȳ_i   scalar product of the vectors x = (x_1, …, x_n)ᵀ and y = (y_1, …, y_n)ᵀ
‖A‖   norm of a matrix or linear transformation, generally given by ‖A‖ = max{‖Ax‖ | ‖x‖ = 1}
row(Z_i)_{i=1}^m = row(Z_1, …, Z_m) = [Z_1 Z_2 ⋯ Z_m]   block row matrix with blocks Z_1, …, Z_m
col(Z_i)_{i=1}^m   block column matrix with blocks Z_1, …, Z_m
diag[Z_i]_{i=1}^m = diag[Z_1, …, Z_m]   block diagonal matrix with blocks Z_1, …, Z_m
ind(X, T)   index of stabilization of the admissible pair (X, T)
sig H   signature of the hermitian matrix H
sgn(x, λ)   a sign (+ or −) associated with eigenvalue λ and eigenvector x
ℑm λ, ℜe λ   imaginary and real parts of λ ∈ ℂ
λ̄   complex conjugate of λ ∈ ℂ
δ_{ij} = 1 for i = j, 0 for i ≠ j   Kronecker index
GL(ℂⁿ)   group of all nonsingular n × n matrices
|X|   number of elements in a finite set X
X\𝒴   the complement of the set 𝒴 in the set X
□   end of proof or example
=_def   equals by definition
Index

A
Adjoint matrix polynomial, 55, 111, 253
Adjoint operator, 253
Admissible pair, 168
  associated matrix polynomial, 169, 232
  common extension, 233
  common restriction, 235
  extension of, 232-235
  greatest common restriction, 235-238
  kernel of, 204, 232
  least common extension, 233
  l-independent, 168
  order of, 188, 203
  restriction of, 204, 235-238
  similar, 204, 232
  special associated matrix polynomial, 176
Admissible splitting of Jordan form, 71, 74, 78
Analytic bases, 388-394
Analytic matrix functions, 83, 289, 341, 388-396
  Jordan chains for, 294-295
  self-adjoint, 294
  sign characteristics for, 294-298
Analytic perturbations of self-adjoint matrices, 394-396
Angular operator, 371, 374

B
Baumgärtel, H., 156, 157, 165
Bernoulli's method for matrix equations, 126
Biorthogonality condition, 52, 63, 208
Brodskii, M. S., 6

C
Canonical factorization, 133-145
  right and left, 134
  two-sided, 139-142, 145
Canonical form of matrix polynomial, see also Resolvent form
  left, 58
  right, 58, 60
Characteristic function, 6
Characteristic polynomial, 329
Chrystal's theorem, 323
Clancey, K., 145
Collar, A. R., 1
Common divisor, 231
Common multiple, 231, 244-246, 251
Commuting matrices, 345-347
Comonic companion matrix, 187
Companion matrix, 4, 44, 47, 51, 147, 186, 210, 255
  comonic, 187, 206
Companion polynomial, 186, 195
Complementary contour, 133, 141
Complementary subspaces, 354
Complete pair of right divisors, 75-79, 82-83, 289, 307, 309
Control variable, 66
Coordinate vector, 31
c-set, 279, 309
  maximal, 282

D
Damped oscillatory systems, 277
Decomposable linearization, 195-197
Decomposable pair, 188-195, 219
  parameter of, 188
  of a regular matrix polynomial, 188
  similar, 193
Difference equation, 18-19, 79-80, 325-330
  bounded solution of, 226-227
  initial value problem for, 79-80
  inverse problem for, 228-230
  matrix, 80-81
  nonmonic case, 225-227
Differential equation, see Ordinary differential equation
Distance from a set, 360
Division of matrix polynomials, 89-96, 201-206
  right and left, 90, 114, 217
Divisors, 96-115, 201-206, see also Greatest common divisor; Spectral divisor
  analytic, 162-164
  comonic, 211
  complete pair of, 75-79
  continuation of, 155
  isolated and nonisolated, 164-165
  linear, 125-129
  perturbation of, 146-165
  right and left, 4, 5, 75, 92-115
Duncan, W. J., 1

E
Eigenvalue of matrix polynomial, 2, 14, 24, 25
Eigenvectors of matrix polynomials, 2, 24
  generalized, 24
  left, 3
  linear independence of, 43
  rank of, 31, 38
  right, 3
Elementary divisor, 319-321, 331, 336-337, 378
  of factors of self-adjoint matrix polynomial, 285
  linear, 113, 165, 257, 287, 288, 305
  of products, 86-89
Elementary matrix, 317
Elementary transformation, 315
Equivalence of matrix polynomials, 12, 333-338, 341
Exceptional set, 157-165
Extension of admissible pairs, see Admissible pair
Extension problems, 166-179
Extensions via left inverses, 169-173

F
Factorization of matrix polynomials, 4, 82, 278-289
  in linear factors, 112-114
  spectral, 116-145
  stable, 152-155, 165
  symmetric, 278-279, 287
Finite difference approximations, 326
Frazer, R. A., 1
Functions of matrices, 337-341

G
Γ-spectral pair, 150
Gantmacher, F., 1
Gap between subspaces, 147, 360, 369
g.c.d., see Greatest common divisor
Generalized inverse, 174, 213, 351-352
  special, 215, 228, 229
Greatest common divisor, 5, 231, 240
Green's function, 71-75

I
Indefinite scalar product, 6, 256, 375-387
  canonical form of, 376-378
  nondegenerate, 376
  self-adjoint matrices in, 256, 277, 376-387
Index of stabilization, 204, 209, 232
Inertia law, 383
Invariant polynomials, 319, 333
Invariant subspaces, 356, 379
  chain of, 367
  complete chain of, 367
  maximal, 358
  reducing, 356
  spectral, 357-359
  stable, 153, 366-374
Inverse matrices, 348-352, see also Generalized inverse
  left and right, 348-351
  left and right, stability of, 350-351
Inverse problems, 3, 60, 183, 197-200, 227-230
Isotropic subspaces, 380

J
Jordan basis, 358
Jordan block, 336
Jordan chains, 2, 23-29, 49
  canonical set of, 32-35, 59, 185
  leading vectors of, 280
  left, 28, 39, 55
  length of, 25, 34, 35
  for real self-adjoint matrix polynomials, 266
Jordan form, 2, 313, 336-337
  admissible splitting of, 71, 74, 78
Jordan matrix, 336
Jordan pairs, 40-46, 184, 188
  comonic, 206
  corresponding to an eigenvalue, 40, 166
  extension of, 167
  finite, 184, 263
  infinite, 185, 263
  for real matrix polynomial, 267
  restriction of, 167
  similarity of, 44
Jordan triples, 3, 50-57, 82, 85, 258, see also Self-adjoint triples
  similarity of, 51

K
Keldysh, M. V., 6, 49
K-indices of admissible pair, 209, 217
Krein, M. G., 6
Kronecker indices, 145

L
λ-matrix, 1
Langer, H., 6
Laurent expansion, singular part of, 37-39, 59-60, 64
l.c.m., see Least common multiple
Least common multiple, 5, 231, 239-246
Left inverse, 169
  special, 174, 208, 239
Linearization of matrix polynomials, 11-15, 46, 186-188, 217
  decomposable, 195-197
  inverse problem for, 20-22, 52
Lipschitz continuity, 148
Livsic, M. S., 6
Lopatinskii, Ya. B., 145

M
Marcus, A. S., 6
Matrix equations, 125-129
Matrix polynomial, 1, 9
  adjoint, 55
  codegree of, 202
  comonic, 187, 206-217, 221
  degree of, 9, 185
  inverse of, 3, see also Laurent expansion; Resolvent form
  linearization of, 11-15, 186-188
  monic, 1, 9, 199
  nonnegative, 301-303, 309
  poles of the inverse, 65
  quadratic, 74-77, 288, 304-310
  rectangular, 28, 313-318
  regular, 181, 219, 303
  self-adjoint, see Self-adjoint matrix polynomial
  Wiener-Hopf factorization of, 142-145
McMillan degree, 69
Mereutsa, I. V., 6
Metric space of matrix polynomials, 147
Metric space of subspaces, 363, 374
  compactness, 364
  completeness, 364
Minimal opening between subspaces, 369, 374
Monic, see Matrix polynomial; Rational matrix valued function
Multiplication of matrix polynomials, 84-89, 114

N
Neutral subspace, 279
Newton identities, 81
Nonderogatory matrix, 20
Nonnegative matrix polynomials, see Matrix polynomial, nonnegative
Nonnegative subspace, 279
Nonpositive subspace, 279
Numerical range, 276-277, 304

O
Operator polynomials, 6, 48-49, 144
Operator-valued functions, 6, 49, 83, 145
  linearization of, 49
Ordinary differential equation, 1, 11, 15-18, 23-29, 70-79, 218-225, 304-309, 321-325
  first order, 222-225
  Green's functions for, 71-75, 308
  initial value problems, 70-75, 218-225
  inverse problem for, 227-228
  nonmonic case, 218-225
  second order, 75-79, 304-309
  singular, 218-225, 227-228, 230
  stable solutions of, 129-131
  two-point boundary value problems, 71-79, 307
Orthogonal companion, 379
Orthogonal subspaces, 354
Output variable, 66
Overdamped systems, 305-308

P
Partial differential equations, 49
Partial indices, 7, 142, 145
Partial multiplicities, 31, 34, 115, 330-333, 341
  for analytic matrix functions, 295
  for self-adjoint matrix polynomial, 299, 302
Projector, 353
  complementary, 355
  orthogonal, 354
  Riesz, 117, 151, 357, 371
Puiseux series, 394

Q
Quotients, 90, 104-111
  right and left, 90, 92, 334

R
Rank of a matrix polynomial, 319
Rational matrix valued function, 67, 69, 82, 115
  canonical factorization of, 133-142
  local degree of, 69
  McMillan degree of, 69
  monic, 134
  realization of, 67, 82
  realization, minimal, 67
Reducing transformation for indefinite scalar product, 377
Reduction of quadratic forms, 377-378, 383
Regular point, 117
Remainders, 90
  right and left, 90, 92, 334
Remainder theorem, 91
Representation of matrix polynomials, 3, 50-69, 82, 197-200, 206-217
Resolvent form, 15, 58, 61, 65, 66-69, 82, 195-197
  for self-adjoint matrix polynomials, 256, 262-263, 273
Resolvent triple, 219
Restriction of admissible pairs, see Admissible pair
Resultant determinant of two scalar polynomials, 159
Resultant matrix for matrix polynomials, 247, 251
  kernel of, 247
Resultant matrix of two scalar polynomials, 159, 246
Riccati equation, 165, 374, 387
Riesz projector, 117, 151, 357, 371
Root polynomials, 29-31
  order of, 29
Root subspace, 357, 359, 366-367

S
Scalar product, 404, see also Indefinite scalar product
Self-adjoint matrix polynomial, 6, 129, 253-310
  monic, 255-310
  quadratic, 288, 304-310
  real, 266-273
Self-adjoint transformation, 355
Self-adjoint triples, 260-273
sgn(x, λ_0), 274, 404
Shift operator, 326
Signature, 258, 300, 377-378
Sign characteristic, 383-387
  connection with partial multiplicities, 299-301
  localization, 290-293
  of overdamped system, 305
  of a self-adjoint analytic matrix function, 294-298
  of self-adjoint matrix polynomials, 6, 258, 262, 273-275, 290-293, 298-303
  stability, 293
  uniqueness, 383
Similar matrices, 334
s-indices of admissible pair, 209, 217
Sip matrix, 261, 377
Smith form, 313-318, 331, 341
  local, 33, 38, 330, 341
Solvent of a matrix polynomial, 126
  dominant, 126
Special associated polynomial, see Admissible pair
Special left inverse, 174, 208, 239
Special matrix polynomial, 176
Spectral data, 183
  badly organized, 213
Spectral divisor, 116-145
  continuous and analytic dependence, 150-152
  Γ-spectral divisor, 117-124, 150-152, 163
Spectral theory, 2, 6
Spectrum of matrix polynomial, 24
  at infinity, 5
  localization of, 5
Splitting the spectrum with a contour, 371
Square roots, 164
Standard pair, 4, 46-49, 60, 188
  degree of, 167
  left, 53-54, 60
  similarity of, 48
Standard triples, 4, 50-57
  for self-adjoint matrix polynomials, 256
  similarity of, 51
State variable, 66
Subspaces, see also Invariant subspaces; Metric space of subspaces
  complementary, 354
  direct sum of, 354
  isotropic, 380
  metric space of, 363
  nondegenerate, 379
  orthogonal, 354
  sum of, 354
Supporting projector, 106-111
  associated right and left projection, 106
  degree of, 106
Supporting set, 147
  of order k, 147
Supporting subspace, 5, 6, 96-104, 107, 114, 117
  nested, 100, 109
  for self-adjoint matrix polynomial, 278
Systems theory, 6, 66-69, 145, 303

T
Toeplitz matrices, 7, 145, 268
  block, 122, 135, 145
  upper triangular, 347
Transfer function, 7, 66-68

U
Unit sphere in a subspace, 360

V
Vandermonde matrix, 240, 251, 329
  comonic, 241
  monic, 244
Vibrating systems, 6, 304-310

W
Weakly damped systems, 309
Wiener-Hopf equations, 7
Wiener-Hopf factorization, 142-145
Winding number, 136, 142