MATRIX PARTIAL ORDERS, SHORTED OPERATORS AND APPLICATIONS
SERIES IN ALGEBRA Series Editors: Derek J S Robinson (University of Illinois at Urbana-Champaign, USA) John M Howie (University of St. Andrews, UK) W Douglas Munn (University of Glasgow, UK)
Published Vol. 1
Infinite Groups and Group Rings edited by J M Corson, M R Dixon, M J Evans and F D Röhl
Vol. 2
Sylow Theory, Formations and Fitting Classes in Locally Finite Groups by Martyn R Dixon
Vol. 3
Finite Semigroups and Universal Algebra by Jorge Almeida
Vol. 4
Generalizations of Steinberg Groups by T A Fournelle and K W Weston
Vol. 5
Semirings: Algebraic Theory and Applications in Computer Science by U Hebisch and H J Weinert
Vol. 6
Semigroups of Matrices by Jan Okninski
Vol. 7
Partially Ordered Groups by A M W Glass
Vol. 8
Groups with Prescribed Quotient Groups and Associated Module Theory by L Kurdachenko, J Otal and I Subbotin
Vol. 9
Ring Constructions and Applications by Andrei V Kelarev
Vol. 10 Matrix Partial Orders, Shorted Operators and Applications by Sujit Kumar Mitra*, P Bhimasankaram and Saroj B Malik
*
Deceased
SERIES
MATRIX PARTIAL ORDERS, SHORTED OPERATORS AND APPLICATIONS
IN
ALGEBRA VOLUME 10
Sujit Kumar Mitra Indian Statistical Institute, India
P Bhimasankaram University of Hyderabad, India
Saroj B Malik Hindu College, University of Delhi, India
World Scientific NEW JERSEY
•
LONDON
•
SINGAPORE
•
BEIJING
•
SHANGHAI
•
HONG KONG
•
TA I P E I
•
CHENNAI
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
Series in Algebra — Vol. 10 MATRIX PARTIAL ORDERS, SHORTED OPERATORS AND APPLICATIONS Copyright © 2010 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN-13 978-981-283-844-5 ISBN-10 981-283-844-9
Printed in Singapore.
To Asha Latha Mitra, Sunil Kumar Mitra and Sheila Mitra - Parents and Wife of Sujit Kumar Mitra
Preface Matrix orders have fascinated mathematicians and applied scientists alike for many years. Several matrix orders have been developed during the past four decades by researchers working in Linear Algebra. Some of them are pre-orders such as the space pre-order and the Drazin pre-order, while others are partial orders like the minus order, the sharp order and the star order. These developments have grown symbiotically with advances in other areas such as Statistics and Matrix Generalized Inverses. Two closely connected concepts - parallel sums and shorted operators for non-negative definite matrices - play a major role in the study of electrical networks. Both of these share nice relationships with matrix partial orders as also between themselves. Several extensions of these concepts have been developed in the recent past by researchers working in these and related areas. There are many research articles in the areas of matrix orders and shorted operators that are scattered in various journals. This is the first full length monograph on these topics. The aim of this monograph is to present the developments in the fields of matrix orders and shorted operators for finite matrices in a unified way and illustrate them with suitable applications in Generalized Inverses, Statistics and Electrical Networks. In the process of this compilation, many new results have evolved. Virtually every chapter in the monograph contains results unpublished hitherto. In fact, Chapter 13 on partial orders of modified matrices comprises entirely of new material. We believe that dissection of matrices through matrix decompositions helps in clear understanding of the anatomy of various matrix orders in a transparent manner. We employ simultaneous decompositions such as the simultaneous normal form, the simultaneous singular value decomposition and the generalized singular value decomposition extensively in developing the properties of matrix orders. Accordingly, the reader will find new and
vii
viii
Matrix Partial Orders, Shorted Operators and Applications
intuitive proofs to several known results in this monograph. We also pose some open problems which should be of interest to researchers in these topics. There are a number of exercises at the end of virtually every chapter. These are expected to serve the dual purpose of helping in the understanding of the topics covered in the text and of introducing other related results that have not been included in the main text. This monograph is aimed at (i) graduate students and researchers in Matrix Theory and (ii) researchers in Statistics and Electrical Engineering who may use these concepts and results as tools in their work. The monograph can be used as a text for a one-semester graduate course in advanced topics in Matrix Theory and it can also serve as self-study text for those who have knowledge in basic Linear Algebra, perhaps at the level of [Rao and Bhimasankaram (2000)]. We deliberately avoided the study of majorization as there is an excellent book [Marshall and Olkin (1979)] on this topic. In this monograph, we consider matrix orders and shorted operators for finite matrices over a field. We do not consider matrices over more general algebraic structures or operators over more general spaces. There has been a good deal of work in these directions - see [Drazin (1958)], [Hartwig (1979)], [Hartwig (1980)], [Jain, Srivastava, Blackwood and Prasad (2009)], [Morley and William (1990a,b)] and [Mitch (1986)], to name a few. We have not included them here in order to keep the monograph focused and less bulky. We have also not gone into order preserving and order reversing transformations (for the minus, star, sharp and one-sided orders) and application of the sharp order in the analysis of Markov chains. These are exciting topics of research that may develop over the next few years. This monograph was conceived by the first author, S. K. Mitra who was a pioneer in the development of theory of matrix partial orders and shorted operators. He introduced the sharp order, one-sided orders and a unified theory of matrix partial orders. He developed several approaches to the shorted operators along with applications in Generalized Inverses and Statistics. The first author who is no more, has been the mentor and the main source of inspiration for the other two authors of the monograph. However, we, the second and third authors are responsible for any errors in the monograph. July, 2009
P. Bhimasankaram and Saroj B. Malik
Acknowledgements
We are highly thankful to the SQC & OR Unit, Indian Statistical Institute, Hyderabad, the Department of Mathematics and Statistics, University of Hyderabad, Hyderabad, the Center for Analytical Finance, Indian School of Business, Hyderabad and Hindu College, Delhi University, Delhi for providing excellent facilities during different stages of the preparation of this monograph. We are deeply indebted to Professor Debasis Sengupta for his constructive comments on parts of the manuscript in its different stages of preparation which led to significant improvement in the presentation. We are also thankful to Professors Probal Chaudhuri, Thomas Mathew and PSSNV Prasada Rao for useful discussions. We thank Professors T. Amarnath, R. Tandon, V. Suresh and Shri ALN Murthy for their encouragement during the preparation of the manuscript. P. Bhimasankaram records his sincere thanks and appreciation for the encouragement and inspiration received from Professor Sankar De during the preparation of the manuscript. His family members, Amrita, Chandana, Shilpa, Chandu and Vijaya had to miss his company even during the weekends for years together during the preparation of the manuscript. But for the understanding, patience and cooperation received from them, he could never have completed the task satisfactorily. He expresses his deep sense of loving appreciation to them. Saroj Malik expresses her sincere thanks to the Department of Mathematics and Statistics, University of Hyderabad for providing local hospitality during her visits to University of Hyderabad during the preparation of the manuscript (under the SAP-UGC project). She also wishes to thank the National Academy of Sciences, Delhi for partial support during one such visit. While working on the monograph, she had several useful discussions
ix
x
Matrix Partial Orders, Shorted Operators and Applications
with Professor Ajeet I. Singh, Professor Emeritus, Stat.-Math. Division, Indian Statistical Institute, Delhi. She is deeply indebted to Professor Singh for all the help and encouragement. She wishes to thank her ex-Principal Dr. Kavita Sharma for her encouragement during all these years to bring this effort to a fruitful ending. We thank Manpreet Singh and Vishal Mangla for their help in preparing the S-diagram and Naveen Reddy for his help in editing the manuscript at various stages. We thank L. F. Kwong for her co-operation at every stage during the preparation of the manuscript. We also thank D. Rajesh Babu for the technical help. Finally, we wish to thank World Scientific Publishing Co., Singapore for providing us the necessary freedom and flexibility in completing the monograph.
Glossary of Symbols and Abbreviations
aij (aij ) At A? C(A) C(At ) C(A⊥ ) N (A) tr(A) ρ(A) d(S) det(A) A−1 L A−1 R A− {A− } A− r {A− r } A− com {A− com } A† A] AD A− ` {A− ` } A− m {A− m} A− ρ
- the (i, j)th element of the matrix A - the matrix whose (i, j)th element is aij - transpose of the matrix A - conjugate transpose of the matrix A - the column space of the matrix A - the row space of the matrix A - the orthogonal complement of C(A) - the null space of the matrix A - the trace of the matrix A - the rank of the matrix A - the dimension of the subspace S - the determinant of the matrix A - a left inverse of the matrix A - a right inverse of the matrix A - a g-inverse of the matrix A - the set of g-inverses of the matrix A - a reflexive g-inverse of the matrix A - the set of reflexive g-inverses of the matrix A - a commuting g-inverse of A - the set of commuting g-inverses of A - the Moore-Penrose inverse of the matrix A - the group inverse of the matrix A - the Drazin inverse of the matrix A - a least squares g-inverse of the matrix A - the set of least squares g-inverses of the matrix A - a minimum norm g-inverse of the matrix A - the set of minimum norm g-inverses of the matrix A - a commuting g-inverse of the matrix A with the propxi
xii
Matrix Partial Orders, Shorted Operators and Applications t
{A− ρ} A− χ {A− χ} λmax (A) λmin (A) σ(A) C R F Cn m×n C Fn d(V ) I1 I1,n F m×n H Hn I Ik diag(x1 , x2 , . . . , xn ) P(A|B) S(A|B) S(A|S, T )
G(A) G(A, B) Gr (A) ] G(A) ˜ Gr (A) P(F m×n ) m-column vector n-row vector
t erty that C(A− ρ ) ⊆ C(A ) - the set of all ρ-inverses of the matrix A - a commuting g-inverse of the matrix A with the property that C(A− χ ) ⊆ C(A) - the set of all χ-inverses of the matrix A - the maximum eigen-value of A - the minimum eigen-value of A - the set of all singular values A - the field of complex numbers - the field of real numbers - arbitrary field - the vector space of complex n-tuples - the set of all m × n matrices over C - the vector space of n-tuples over F - the dimension of vector space V - the set of all matrices of index ≤ 1 - the set of all n × n matrices of index ≤ 1 - the set of all m × n matrices over F - the set of all hermitian matrices - the set of all n × n hermitian matrices - the identity matrix - the k × k identity matrix - the n × n diagonal matrix with diagonal elements xi , i = 1, 2, . . . , n - the parallel sum of matrices A, B - the shorted matrix of A with respect to the matrix B - the shorted matrix of an m×n matrix A with respect to the subspace S of F m and the subspace T of F n , also read as the shorted matrix of an m × n matrix A indexed by subspaces S and T - a subset of {A− } - the set of {B− AB− : B− ∈ G(B)} - the set G(A) ∩ {A− r } - completion of G(A) ] ∩ {A− } - the set G(A) r - set of all subset of F m×n - an m × 1 matrix - a 1 × n matrix
Contents
Preface
vii
Acknowledgements
ix
Glossary of Symbols and Abbreviations
xi
1.
Introduction
1
1.1 Matrix orders . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Parallel sum and shorted operator . . . . . . . . . . . . . 1.3 A tour through the rest of the monograph . . . . . . . . . 2.
Matrix Decompositions and Generalized Inverses 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8
3.
Introduction . . . . . . . . . . . . . . . . . Matrix decompositions . . . . . . . . . . . Generalized inverse of a matrix . . . . . . The group inverse . . . . . . . . . . . . . Moore-Penrose inverse . . . . . . . . . . . Generalized inverses of modified matrices Simultaneous diagonalization . . . . . . . Exercises . . . . . . . . . . . . . . . . . .
. . . . . . . .
9 . . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
The Minus Order 3.1 3.2 3.3 3.4
1 3 4
9 10 17 26 36 46 55 64 67
Introduction . . . . . . . . . . . . . . . . . . . . Space pre-order . . . . . . . . . . . . . . . . . . Minus order - Some characterizations . . . . . . Matrices above/below a given matrix under the minus order . . . . . . . . . . . . . . . . . . . . xiii
. . . . . . . . . . . . . . . . . .
67 68 72
. . . . . .
81
xiv
Matrix Partial Orders, Shorted Operators and Applications
3.5 Subclass of g-inverses A− of A such and AA− = BA− when A <− B . . 3.6 Minus order for idempotent matrices 3.7 Minus order for complex matrices . . 3.8 Exercises . . . . . . . . . . . . . . . 4.
7.
Introduction . . . . . . . . . . . . . . . Sharp order - Characteristic properties Sharp order - Other properties . . . . Drazin order and an extension . . . . Exercises . . . . . . . . . . . . . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
Introduction . . . . . . . . . . . . . . . . . . Star order - Characteristic properties . . . . Subclasses of g-inverses for which A B . Star order for special subclasses of matrices Star order and idempotent matrices . . . . Fisher-Cochran type theorems . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . .
Introduction . . . . . . . . . . . . The condition AA− = BA− . . One-sided sharp order . . . . . . − Roles of A− c and Aa in one-sided One-sided star order . . . . . . . Exercises . . . . . . . . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
127 128 136 138 145 150 152 155
. . . . . . . . . . . . . . . . . . . . . sharp order . . . . . . . . . . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
Unified Theory of Matrix Partial Orders through Generalized Inverses 7.1 7.2 7.3 7.4 7.5
103 104 110 117 124 127
One-Sided Orders 6.1 6.2 6.3 6.4 6.5 6.6
84 93 95 98 103
The Star Order 5.1 5.2 5.3 5.4 5.5 5.6 5.7
6.
A− A = A− B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
The Sharp Order 4.1 4.2 4.3 4.4 4.5
5.
that . . . . . . . . . . . .
Introduction . . . . . . . . . . . . . . . . . . . . . . . . G-based order relations: Definitions and preliminaries O-based order relations and their properties . . . . . . One-sided G-based order relations . . . . . . . . . . . Properties of G-based order relations . . . . . . . . . .
155 156 160 167 171 180
183 . . . . .
. . . . .
183 184 195 200 203
Contents
xv
7.6 On G-based extensions . . . . . . . . . . . . . . . . . . . . 208 7.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 212 8.
9.
The L¨ owner Order
215
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 8.2 Definition and basic properties . . . . . . . . . . . . . . . 8.3 L¨ owner order on powers and its relation with other partial orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.4 L¨ owner order on generalized inverses . . . . . . . . . . . . 8.5 Generalizations of the L¨owner order . . . . . . . . . . . . 8.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . .
215 215
Parallel Sums
245
9.1 9.2 9.3 9.4 9.5 10.
Introduction . . . . . . . . . . . Definition and properties . . . Parallel sums and partial orders Continuity and index of parallel Exercises . . . . . . . . . . . .
. . . . . . . . . . . . sums . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
Introduction . . . . . . . . . . . . . . . . . . . . . . . Shorted operator - A motivation . . . . . . . . . . . Generalized Schur complement and shorted operator Shorted operator via parallel sums . . . . . . . . . . Generalized Schur complement and shorted operator matrix over general field . . . . . . . . . . . . . . . . 10.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . .
12.
. . . . .
. . . . .
Schur Complements and Shorted Operators 10.1 10.2 10.3 10.4 10.5
11.
. . . . .
226 230 238 243
245 246 259 264 270 273
. . . . . . . . . . . . of a . . . . . .
273 274 276 283 285 293
Shorted Operators - Other Approaches
295
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 11.2 Shorted operator as the limit of parallel sums - General matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.3 Rank minimization problem and shorted operator . . . . . 11.4 Computation of shorted operator . . . . . . . . . . . . . . 11.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . .
295 296 305 310 315
Lattice Properties of Partial Orders
317
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 317
xvi
Matrix Partial Orders, Shorted Operators and Applications
12.2 12.3 12.4 12.5 13.
Partial Orders of Modified Matrices 13.1 13.2 13.3 13.4 13.5 13.6
14.
14.6
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . Equivalence relation on g-inverses of a matrix . . . . . . . Equivalence relations on subclasses of g-inverses . . . . . . Equivalence relation on the outer inverses of a matrix . . Diagrammatic representation of the g-inverses and outer inverses . . . . . . . . . . . . . . . . . . . . . . . . . . . . The Ladder . . . . . . . . . . . . . . . . . . . . . . . . . .
Applications 15.1 15.2 15.3 15.4 15.5 15.6
16.
Introduction . . Space pre-order Minus order . . Sharp order . . Star order . . . L¨ owner order .
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . Point estimation in a general linear model . . . . . . . . . Comparison of models when model matrices are related under matrix partial orders . . . . . . . . . . . . . . . . . Shorted operators - Applications . . . . . . . . . . . . . . Application of parallel sum and shorted operator to testing in linear models . . . . . . . . . . . . . . . . . . . . . . . . Shorted operator adjustment for modification of network or mechanism . . . . . . . . . . . . . . . . . . . . . . . . .
Some Open Problems 16.1 16.2 16.3
318 330 338 342 343
Equivalence Relations on Generalized and Outer Inverses 14.1 14.2 14.3 14.4 14.5
15.
Supremum and infimum of a pair of matrices under the minus order . . . . . . . . . . . . . . . . . . . . . . . . . . Supremum and infimum under the star order . . . . . . . Infimum under the sharp order . . . . . . . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . .
343 344 352 357 364 367 371 371 372 380 384 390 401 407 407 407 411 415 418 418 423
Simultaneous diagonalization . . . . . . . . . . . . . . . . 423 Matrices below a given matrix under sharp order . . . . . 424 Partial order combining the minus and sharp orders . . . 424
Contents
16.4 16.5 16.6 16.7
When is a G-based order relation a partial order? Parallel sum and g-inverses . . . . . . . . . . . . Shorted operator and a maximization problem . The ladder problem . . . . . . . . . . . . . . . .
Appendix A A.1 A.2 A.3 A.4 A.5 A.6 A.7 A.8
xvii
. . . .
. . . .
. . . .
. . . .
. . . .
Relations and Partial Orders
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . Relations . . . . . . . . . . . . . . . . . . . . . . . . . . Semi-groups and groups . . . . . . . . . . . . . . . . . . Semi-groups and partial orders . . . . . . . . . . . . . . Involution . . . . . . . . . . . . . . . . . . . . . . . . . . Compatibility of partial orders with algebraic operations Partial orders induced by convex cones . . . . . . . . . . Creating new partial orders from old partial orders . . .
425 425 426 427 429
. . . . . . . .
429 429 432 433 435 435 436 436
Bibliography
439
Index
445
Chapter 1
Introduction
1.1
Matrix orders
Last few decades have witnessed a steady growth in the area of matrix partial orders, a central theme of this monograph. These matrix orders are developed in detail in Chapters 3-8. They play an important role in the study of shorted operators, which we treat subsequently. In this section we give a simple and intuitive interpretation of some of the matrix partial orders. Let us begin by defining a pre-order and a partial order. A binary relation on a non-empty set is said to be a pre-order if it is reflexive and transitive. If it is also anti-symmetric, then it is called a partial order (see Appendix A). Let f be a linear transformation from F n → F m , F being an arbitrary field. Then there exist bases B and C of F n and F m respectively such that f is represented by the matrix diag(I, 0) with respect to these bases (normal form). Also, if F = C, the field of complex numbers, then there exist ortho-normal bases B and C of Cn and Cm respectively such that f is represented by the matrix diag(D, 0) with respect to these bases, where D is a positive definite diagonal matrix (singular value decomposition). Thus, every linear transformation can be represented by a diagonal matrix by choosing the bases appropriately. A matrix G is a generalized inverse (g-inverse) of a matrix A if AGA = A. Let A and B be matrices of the same order. Then A is said to be below B under the minus order if there exists a g-inverse G of A such that AG = BG and GA = GB. Again, let A and B be matrices of the same order. Let G be the MoorePenrose inverse of A, a matrix that satisfies the conditions AGA = A,
1
2
Matrix Partial Orders, Shorted Operators and Applications
GAG = G, AG and GA are hermitian. Then A is said to be below B under the star order if the Moore-Penrose inverse G of A satisfies the conditions AG = BG and GA = GB. If A and B are square matrices of the same order such that ρ(A) = ρ(A2 ) and ρ(B) = ρ(B2 ), then A is below B under the sharp order if AG = BG and GA = GB, the matrix G being the group inverse of A. Note that the group inverse of A is a matrix G such that AGA = A, GAG = G and AG = GA. The minus, the star and the sharp orders are partial orders. For a pair of matrices A and B, each of these orders is defined through the same type of conditions, namely: AG = BG and GA = GB, where G is required to belong to a suitable subclass of g-inverses of A. All the above partial orders can be nicely interpreted through matrix decompositions. Let us consider two diagonal matrices L = diag(D, 0, 0) and M = diag(D, E, 0), where D and E are diagonal matrices. (We say a matrix - square or rectangular - is diagonal if all the elements outside the principal diagonal are zero.) It is intuitive to say that L is below M or L is a section of M. By an extension of this notion, for suitable choices of nonsingular/unitary matrices P and Q, the matrix PLQ is also below PMQ in some sense. Several matrix partial orders can be expressed in this manner as we shall see in the later chapters. For instance, the matrix A is below the matrix B under the minus order if and only if A and B have simultaneous normal form such that A = Pdiag(I, 0, 0)Q and B = Pdiag(I, I, 0)Q, where P and Q are non-singular matrices. Let A and B be matrices representing the linear transformations f and g with respect to standard bases F n and F m of respectively. Then A is below B under minus if and only if there exist bases B and C of F n and F m respectively such that f and g are represented by diag(I, 0, 0) and diag(I, I, 0) respectively with respect to these bases. Further, A is below B under the star order if and only if A and B have simultaneous singular value decomposition such that A = Pdiag(D, 0, 0)Q and B = Pdiag(D, E, 0)Q, where D and E are positive definite diagonal matrices and P and Q are unitary matrices. Let A and B be square matrices. Then we say A is below B under the sharp order if and only if there exists a non-singular matrix P such that A = Pdiag(D, 0, 0)P−1 and B = Pdiag(D, E, 0)P−1 , where D and E are non-singular matrices. Yet another partial order is the well-known L¨owner order on non-
Introduction
3
negative definite matrices defined as follows. Let A and B be non-negative definite (nnd) matrices of the same order. Then A is below B under the L¨ owner order if B − A is nnd. Furthermore, this happens if and only if there exists a non-singular matrix P such that A = Pdiag(D, 0, 0)P? and B = Pdiag(H, E, 0)P? , where D, H and E are positive definite diagonal matrices and each diagonal element of H is at least as big as the corresponding diagonal element of D. Thus, we see that corresponding to each of the matrix orders mentioned above A and B, the matrices representing some linear transformations f and g under suitable choice of bases, are simultaneously represented by matrices L and M that have a relatively simple structure, where it is intuitively clear that L is below M. In later chapters of this book, we explore all these and other matrix orders and study their inter-relationships.
1.2
Parallel sum and shorted operator
Parallel sum and shorted operator as studied in this monograph have their origin in the study of impedance matrices of n-port electrical networks. There have been interesting extensions of these concepts to wider classes of matrices. Two matrices A and B of the same order are said to be parallel summable if the row and column spaces of A are contained respectively in the row and column spaces of A+B. When A and B are parallel summable, their parallel sum is defined as A(A + B)− B, where (A + B)− is a g-inverse of A + B. Parallel sum has some very interesting properties. For example, its column (row) space is the intersection of the column (row) spaces of A and B. In this monograph, we study the properties of the parallel sum in detail. We also study the connection between the parallel sum and the matrix orders. The shorted operator has been defined in more than one way. For example, the shorted operator of a matrix with respect to another matrix of the same order can be defined as the limit of a certain sequence of parallel sums. Yet another way: Let A be an m × n matrix, Vm and Vn be subspaces of vector spaces of m-tuples and n-tuples. The shorted operator of A indexed by subspaces Vm and Vn is a matrix that is the closest to A in the class of matrices having their column and row spaces contained in Vm and Vn respectively. These various definitions of shorted operator are based on different objectives. In this monograph, we make a comprehensive study
4
Matrix Partial Orders, Shorted Operators and Applications
of the shorted operators and show the equivalence of various definitions of the shorted operator.
1.3
A tour through the rest of the monograph
As indicated in Section 1.1, generalized inverses, matrix decompositions and simultaneous decompositions in particular are going to play an important role in the study of matrix partial orders, parallel sums and shorted operators. In Chapter 2, we develop the necessary background on matrix decompositions and generalized inverses. Chapter 3 starts with the space pre-order. We study this basic matrix order in detail as we feel that this is the stepping stone to the other matrix orders that follow. We then introduce the minus order using a generalized inverse. This is the first partial order we study in this monograph. We obtain several characterizing properties of the same. We obtain the classes of matrices that lie above and below a given matrix under the minus order. We also study the minus order for certain special classes of matrices like the projectors. In Chapter 4, we define the sharp order on the square matrices of index not exceeding 1. The sharp order is an order finer than the minus order that involves the group inverse. We make a detailed study of its characteristic and other properties. We then consider matrices which have index greater than 1. We define an order called Drazin order using the Drazin inverse (which is not a generalized inverse in the usual sense). This order turns out to be a pre-order different from the space pre-order. It does have some interesting properties. We finally extend the Drazin order to a partial order and study its properties. The star order is perhaps more extensively studied than the minus and the sharp orders. In fact, it appeared in the literature before both the other two orders. Chapter 5 is devoted to the study of this order. We study the characterizing properties of the star order and obtain the classes of matrices above and below a given matrix under the star order. We then specialize to some interesting subclasses of matrices such as range hermitian, normal, hermitian and idempotent matrices and study the star order for such matrices. We finally consider several matrices and study when each of them is below their sum under the star order. The results are very similar to the celebrated Fisher-Cochran Theorem on distribution of quadratic forms in normal variables.
Introduction
5
Let A and B be matrices of the same order. Then A is below B under each of the minus, the sharp and the star orders if there is a suitable g-inverse G of A such that AG = BG and GA = GB. In Chapter 6, we consider only one of the above two conditions and show that with an additional milder condition we can get a partial order. Such an order is called a one-sided order. A one-sided order corresponding to minus order simply coincides with the minus order. However, one-sided sharp and star orders do not coincide with the sharp and the star orders respectively and lead to an interesting study. In the process of obtaining one-sided sharp order, we develop two special classes of g-inverses, which are interesting in their own way. After a careful study of the results of Chapters 3-5, one finds several similar looking properties shared by these order relations. Is there a common thread? Is there a master characterizing property using which we can derive a number of common looking properties of these partial orders? In Chapter 7, we find an answer to these questions and give a unified theory of matrix partial orders developed via generalized inverses. Characterizations of common properties/results of all the partial orders are put under one umbrella of the unified theory. We conclude the chapter with some extensions of this unified theory. L¨ owner order, usually studied for nnd matrices, is one of the oldest known partial orders. Some material dealing with L¨owner order for hermitian matrices is also available in the literature. In Chapter 8, we bring together such results and make a comprehensive study of this order. We study the relationship of L¨ owner order with other partial orders studied earlier. We also study the ordering properties of generalized inverses and outer inverses under L¨ owner order. Finally, we consider a couple of extensions of L¨ owner order - one of them for rectangular matrices. Chapter 9 is devoted to the study of the parallel sum of matrices. Besides obtaining several interesting properties of parallel sums, we also explore the relationship of parallel sum with matrix orders, particularly, the space pre-order and the L¨ owner order. It turns out that parallel sum of two matrices A and B is the matrix closest to A and B in the sense that any matrix which is below A and B under the space pre-order is also below their parallel sum under the space pre-order. We study the shorted operators in Chapters 10 and 11. In Chapter 10, we provide motivation for shorted operator through Electrical Networks and Statistics. We first consider nnd matrices and study the shorted operator of an nnd matrix indexed by a subspace and develop several interesting
6
Matrix Partial Orders, Shorted Operators and Applications
properties. We note here that the shorted operator is a certain Schur complement. Let A be an nnd matrix of order n × n and S be a subspace of Cn . Then the shorted operator of A with respect to S turns out to be the nnd matrix closest to A, both under the L¨owner order and the minus order, in the class of all nnd matrices with column space contained in S. We also show that the shorted operator is the limit of a sequence of certain parallel sums. We then extend the concept of the shorted operator to possibly rectangular matrices over a general field. This leads us to a concept called complementability. We examine when the shorted operator indexed by two subspaces exists and obtain an explicit expression for the same when it exists. Again, it turns out to be a Schur complement. Let A be an m × n matrix over a general field and let Vm and Vn be subspaces of vector spaces of m-tuples and n-tuples respectively over this field. Let the shorted operator S(A|Vm , Vn ) of A indexed by Vm and Vn exist. We find that the shorted operator, S(A|Vm , Vn ), whenever it exists, is the closest to A under the minus order in the class of all matrices the column space and row space of which are contained in Vm and Vn respectively. Further, S(A|Vm , Vn ) turns out to be a Schur complement. In Chapter 11, we first extend the concept of shorted operator to (possibly) rectangular matrices over C, the field of complex numbers using the approach of the limit of a sequence of parallel sums. We examine when it exists and study its properties. We then give another definition of shorted operator using the approach of closeness in the sense of rank. More precisely, for a matrix A, we examine when a unique matrix B exists such that the rank of A − B is the least in the class of all matrices the column space and row space of which are contained in Vm and Vn respectively. When such a matrix B exists, we call it as the shorted operator S(A|Vm , Vn ) of A indexed by Vm and Vn . We show that all these approaches of defining the shorted operator lead to the same matrix. We finally make some remarks on the computational aspects of the shorted operator. One characterizing property of the shorted operator studied in Chapters 10 and 11 is that it is a maximal element of a certain collection of matrices under a suitable partial order. This raises the following natural question. Does a set of matrices equipped with a partial order become a lattice or at least a semi-lattice under that partial order? In Chapter 12, we study this problem in some detail for three of the major partial orders studied earlier, namely, the minus, the star and the sharp orders. Chapter 13 contain entirely new material. We make an extensive study of the matrix order relations for modified matrices. We consider here two
Introduction
7
types of modifications of matrices one: appending or deleting a row/column and the other: adding a rank 1 matrix. Let A be below B under a particular matrix order. Let A be modified as per one of the above modifications. We obtain the class of all modifications of B such that the modified matrix of A is below the modified matrix of B under the same matrix order. The matrix orders considered for this purpose are the space pre-order, the minus order, the sharp order, the star order and the L¨owner order. The proofs, in general, are highly computational and lengthy. Due to space considerations, we have omitted most of the proofs. However a reader interested in the proofs of these results may write to the authors. In Chapter 14, we give an application of the matrix partial orders in developing equivalence relations on the classes of generalized inverses and outer inverses of a matrix. This leads to the development of nice hierarchies among the various classes of inverses (both inner and outer) of a matrix. It also leads to a neat diagrammatic representation of various inverses of a matrix, which according the first author resembles a strawberry plantation. (It is often said, a picture is worth a thousand words!) In Chapter 15, we give applications of matrix orders, parallel sum and shorted operator to Statistics and Electrical Networks. We first collect some results related to inference in linear models for the benefit of a general reader. Then we give interpretation and application of matrix orders, parallel sum and shorted operators to comparison of models and inference in linear models. We also give an application of shorted operators to the recovery of inter-block information in incomplete block designs. We give an application of shorted operators of modified matrices to obtain the modified shorted operator when a new port is included in the network. We enlist a few open problems related to the material covered in this monograph which should be of interest to the researchers. Finally, Appendix A contains the basic material on the algebra of relations, semi-groups, groups, partial orders and related issues for those readers who may need a little brushing up of some of these basic concepts.
Chapter 2
Matrix Decompositions and Generalized Inverses
2.1
Introduction
This chapter aims at providing some prerequisites in Linear Algebra for smooth reading of the later chapters of the monograph. We assume that the reader has the knowledge at the level of a first course in Linear Algebra. In Section 2.2, we gather several known results on matrix decompositions. Rank factorizations of matrices have been in use for a long time. However their properties are not commonly available in elementary textbooks on matrices. Therefore, we treat them here in a fairly elaborate fashion. Sections 2.3-2.6 are devoted to an exposition of generalized inverses (g-inverses), where we try to provide motivation for a gradual development of different types of g-inverses. It is perhaps a little lengthy exposition for a monograph on matrix partial orders. The justification for this is two-fold: (i) in general, a first course in Linear Algebra does not provide a proper motivation and development of g-inverse and (ii) the later chapters in the monograph require a mature understanding of g-inverses. However, in this chapter, no attempt is made to make the exposition on g-inverses exhaustive or up to date. For a detailed study of generalized inverses the reader is referred to [Rao and Mitra (1971)], [Ben-Israel and Greville (2001)] and [Campbell and Meyer (1991)]. The main purpose of this chapter is to facilitate the reader in studying the later chapters of the book comfortably. In Section 2.7, we study simultaneous diagonalization, a concept which is very important when we deal with two or more matrices together. Apart from the standard results, we include simultaneous singular value decomposition for several matrices and generalized singular value decomposition, which will prove useful in understanding some partial orders like a one-sided star order. This chapter contains a few new results. We also give new and simpler
9
10
Matrix Partial Orders, Shorted Operators and Applications
proofs to some known results. In this chapter, by and large, we work with vectors and matrices over a general field. However, if we use vectors and matrices over the complex field, we clearly mention the same before such a use.
2.2
Matrix decompositions
In this section, we enumerate several matrix decompositions (often without proof) which will be useful later in understanding the structures of generalized inverses and matrix partial orders. Proofs of most of the results mentioned here can be found in standard text books on Linear Algebra, for example [Rao and Bhimasankaram (2000)]. Theorem 2.2.1. (Normal Form) Let A be an m × n matrix and let ρ(A) = r > 0. Then there exist non-singular matrices P and Q of order m × m and n × n respectively such that A = Pdiag(Ir , 0)Q. Theorem 2.2.2. (Rank Factorization) Let A be as in Theorem 2.2.1. Then there exist matrices R and S of orders m × r and r × n respectively such that ρ(R) = r = ρ(S) and A = RS. Conversely, if A = RS, where R and S are full rank matrices of order m × r and r × n respectively such that r ≤ min{m, n}, then ρ(A) = r. The first part of Theorem 2.2.2 follows easily from Theorem 2.2.1 by taking R as the matrix formed by taking first r columns of P and S as the matrix formed by the first r rows of Q. The second part is easy. Definition 2.2.3. If R and S are matrices as specified in Theorem 2.2.2, we say (R, S) is a rank factorization of A. Remark 2.2.4. If (R, S) is a rank factorization of a matrix A, then (i) the columns of R form a basis of C(A), the column space of A and (ii) the rows of S form a basis of C(At ), the column space of At . Remark 2.2.5. A matrix can have several rank factorizations. Let (R, S) be a rank factorization of A. Then the class of matrices −1 {(RT, T S), T non-singular} is the class of all rank factorizations of A. Remark 2.2.6. Let A be an m×n matrix of rank r over the field of complex numbers C. Then there exists a semi-unitary matrix U of order m × r (a
Matrix Decompositions and Generalized Inverses
11
matrix U such that U? U = Ir ) such that (U, V) is a rank factorization of A. Let A1 and A2 be non-null matrices of the same order. Let (Pi , Qi ) be a rank factorization of Ai , i = 1, 2. Clearly, Q1 A1 + A2 = (P1 : P2 ) ·· . Q2 Q1 When is (P1 : P2 ), ·· a rank factorization of A1 + A2 ? Before we Q2 can answer this question we give a couple of definitions. Definition 2.2.7. Two subspaces S and T of a vector space are said to be virtually disjoint if S ∩ T = {0}. Definition 2.2.8. Let A1 and A2 be matrices of the same order. Then A1 and A2 are said to be disjoint if (i) C(A1 ) ∩ C(A2 ) = {0} and (ii) C(At1 ) ∩ C(At2 ) = {0} or equivalently ρ(A1 + A2 ) = ρ(A1 ) + ρ(A2 ). In such a case the sum of A1 and A2 is denoted by A1 ⊕ A2 . We now prove the following: Theorem 2.2.9. Let A1 and A2 be non-null matrices of the same order. Let (Pi , Qi ) be a rank factorization of Ai , i = 1, 2. Then Q1 (P1 : P2 ), ·· Q2 is a rank factorization of A1 + A2 if and only if A1 and A2 are disjoint. Proof. ‘If’ part Notice that Q1 A1 + A2 = P1 Q1 + P2 Q2 = (P1 : P2 ) ·· . Q2 Let A1 and A2 be disjoint. Then ρ(A1 + A2 ) = ρ(A1 ) + ρ(A2 ).
12
Matrix Partial Orders, Shorted Operators and Applications
Hence the number of columns in (P1 : P2 ) = the number of rows in Q1 Q1 ·· = ρ(A1 + A2 ). So, (P1 : P2 ), ·· is a rank factorization Q2 Q2 of A1 + A2 . ‘Only if’ part Since (Pi , Qi ) is a rank factorization of Ai , i = 1, 2 and Q1 (P1 : P2 ), ·· is a rank factorization of A1 + A2 , therefore, Q2
ρ(A1 + A2 ) = ρ(P1 : P2 ) = number of columns in (P1 : P2 ) = ρ(A1 ) + ρ(A2 ).
Definition 2.2.10. A matrix H = [hij ] of order n × n is said to be in Hermite Canonical Form (HCF) if (i) (ii) (iii) (iv)
hij = 0 whenever i > j, i = 1, . . . , n and j = 1, . . . , n, hii = 1 or 0 for i = 1, . . . , n, if hii = 0 for some i, then hij = 0 for j = 1, . . . , n and if hii = 1 for some i, then hji = 0 for all j 6= i.
Thus, if H is in HCF, then H is an upper triangular matrix with each diagonal element either 0 or 1; if a diagonal element is 0, then the entire row containing this particular diagonal element is null and if a diagonal element is 1, then the remaining elements of the column containing the diagonal element are 0. For example 1 2 0 4 0 0 0 0 H= 0 0 1 −8 is a matrix in HCF. 0 0 0 0 Theorem 2.2.11. Let A be an n × n matrix. Then there exists a nonsingular matrix B such that H = BA is in HCF. Remark 2.2.12. A matrix in HCF is idempotent. Remark 2.2.13. Let A be an n × n matrix. Then by elementary row operations one can reduce A to a matrix in HCF.
Matrix Decompositions and Generalized Inverses
13
Remark 2.2.14. Let A, B and H be as in Theorem 2.2.11 with ρ(A) = r th and let ith 1 , . . . , ir diagonal elements of H be each equal to 1. Write R = (A?i1 : : A?ir ) and S = (Hti1 ? : : Htir ? ); th where A?ij is the ith j column of A and Hik ? is the ik row of H. Then (R, S) is a rank factorization of A.
Theorem 2.2.15. (Schur Decomposition) Let A be an n × n matrix. then there exists a non-singular matrix P such that A = PTP−1 , where T is an upper triangular matrix. Note that in Theorem 2.2.15, the eigen-values of the matrix A are precisely the diagonal elements of T. If the matrix A is over the field of complex numbers C, there exists a unitary matrix P such that A = PTP? , where T is an upper triangular matrix. Before we state our next theorem on the Jordan decomposition of a matrix, we define a Jordan block. A Jordan block of order 1 is a 1×1 matrix of the form [a]. A Jordan block J = [aij ] of order r > 1 is a matrix satisfying (i) (ii) (iii) (iv)
aik = 0, whenever i > k (upper triangular) a11 = . . . = arr (identical diagonal elements) aii+1 = 1 for i = 1, . . . , r − 1 and aik = 0 whenever k > i + 1.
For example 3 0 J= 0 0
1 3 0 0
0 1 3 0
0 0 is a Jordan block. 1 3
Theorem 2.2.16. (Jordan Decomposition) Let A be an n × n matrix over C. Then there exists a non-singular matrix P such that A = P diag (J1 , . . . , Jr ) P−1 , where each Ji i = 1, . . . , r is a Jordan block. Remark 2.2.17. Jordan decomposition is a more specialized form of Schur decomposition. Definition 2.2.18. Let A be an n × n matrix. The smallest positive integer k for which ρ(Ak ) = ρ(Ak+1 ) is called the index of A.
14
Matrix Partial Orders, Shorted Operators and Applications
Remark 2.2.19. The index of a non-singular matrix A is 0 and the index of a null matrix is 1. Remark 2.2.20. The matrices A and PAP−1 have the same index for each non-singular matrix P. Theorem 2.2.21. (Core-Nilpotent Decomposition) Let A be an n × n matrix. Then A can be written as the sum of matrices A1 and A2 i.e. A = A1 + A2 where (i) ρ(A1 ) = ρ(A21 ) i.e. A1 is of index ≤ 1 (ii) A2 is nilpotent i.e. there is a positive integer k such that Ak2 = 0 and (iii) A1 A2 = A2 A1 = 0. Here one or both of A1 and A2 can be null. A proof of Theorem 2.2.21 will be given in Section 2.4. Remark 2.2.22. Theorem 2.2.21 can be deduced from Theorem 2.2.16 for the matrices over complex field. Theorem 2.2.23. Let A be an n × n matrix. The following are equivalent: (i) (ii) (iii) (iv)
ρ(A) = ρ(A2 ) C(A) ∩ N (A) = {0} F n = C(A) ⊕ N (A) and There exists a non-singular matrix P such that A = Pdiag(T, 0)P−1 , where T is non-singular.
Remark 2.2.24. Theorem 2.2.21 can be restated as follows: Let A be an n × n matrix. Then there exists a non-singular matrix P such that A = Pdiag(T, N)P−1 , where T is non-singular and N is nilpotent. Remark 2.2.25. Let A be an n × n matrix over C. Then the algebraic multiplicity of the zero eigen-value of A is equal to its geometric multiplicity if and only if ρ(A) = ρ(A2 ). Definition 2.2.26. An n×n matrix A over C is said to be range-hermitian if C(A) = C(A? ) or equivalently if N (A) = N (A? ). An n × n matrix A over C is said to be hermitian if A = A? . Remark 2.2.27. Every hermitian matrix is range-hermitian. Every nonsingular matrix is also range-hermitian.
Matrix Decompositions and Generalized Inverses
15
Remark 2.2.28. If A is range-hermitian, then ρ(A) = ρ(A2 ). Theorem 2.2.29. Let A be an n × n matrix over C. Then A is rangehermitian if and only if there exists a unitary matrix U such that A = Udiag(T, 0)U? , where T is non-singular. Definition 2.2.30. A matrix A over C is said to be simple if all its eigenvalues are distinct. A is said to be semi-simple, if the algebraic multiplicity for each of its distinct eigen-values equals its geometric multiplicity. Theorem 2.2.31. (Spectral Decomposition of a Semi-Simple Matrix) Let A be an n×n matrix over C. Then A is semi-simple if and only if there exist matrices E1 , . . . , Es of order n × n and distinct complex numbers λ1 , . . . , λs such that (i) E2i = Ei , i = 1, . . . s (ii) Ei Ej = 0 whenever i 6= j (iii) I = E1 + . . . + Es and (iv) A = λ1 E1 + . . . + λs Es . Further, the matrices E1 , . . . , Es with these properties are unique. Remark 2.2.32. Every simple matrix is semi-simple. Remark 2.2.33. Let Ei , i = 1, . . . , s as in Theorem 2.2.31. Then n = ρ(I) = tr(I) = tr(E1 ) + . . . + tr(Es ) = ρ(E1 ) + . . . + ρ(Es ). Remark 2.2.34. Let Ei , i = 1, . . . , s be as in Theorem 2.2.31. Let (Pi , Qi ) be a rank factorization of Ei . Since E2i = Ei for all i and Ei Ej = 0 whenever i 6= j, Qi Pi = I for each i and Qi Pj = 0 whenever i 6= j. Let P = (P1 , . . . , Ps ) and Q = (Qt1 , . . . , Qts )t . Then it follows that P and Q are square matrices and QP = I, so, Q = P−1 . Thus, A = P diag (λ1 , . . . , λ1 , λ2 , . . . , λ2 , . . . , λs , . . . , λs ) P−1 where λi appear ρ(Ei ) times in the above diagonal matrix. Further, each column of P is a right eigen-vector of A. Remark 2.2.35. A is semi-simple if and only if A is similar to a diagonal matrix. Remark 2.2.36. Let A be a semi-simple matrix and let A = λ1 E1 + . . . + λs Es be the spectral decomposition of A. Then for each positive integer k, Ak = λk1 E1 + . . . + λks Es is the spectral decomposition of Ak .
16
Matrix Partial Orders, Shorted Operators and Applications
Definition 2.2.37. Let A be an n × n matrix over C. Then A is called a normal matrix if A? A = AA? . Theorem 2.2.38. (Spectral Decomposition of a Normal Matrix) Let A be an n × n matrix over C. Then A is normal if and only if there exists a unitary matrix U such that A = UΛU? , where Λ is a diagonal matrix (possibly complex). Remark 2.2.39. Every normal matrix is range-hermitian. Remark 2.2.40. The diagonal elements of Λ in Theorem 2.2.38 are eigenvalues of A and columns of U are the ortho-normal eigen-vectors of A. Remark 2.2.41. Let U = (U1 , . . . , Un ) be a unitary matrix and let Λ = diag(λ1 , . . . , λn ). Then the spectral decomposition A = UΛU? of A can also be written as n X λi Ui U?i . A= i=1
Remark 2.2.42. The above decomposition is in general not unique (unless all eigen-values are distinct). However, if µ1 , . . . , µs (s ≤ n) are distinct eigen-vectors of A, then by appropriate pooling of the eigen-values we can write s X X µi Vi Vi? , where Vi? Vi = I , Vi? Vj = 0 and I = Vi Vi? . U= i=1
This decomposition is unique in the sense that the orthogonal projectors Vi Vi? are unique. Theorem 2.2.43. (Spectral Decomposition of a Hermitian Matrix) Let A be an n×n matrix over C. Then A is hermitian if and only if there exists a unitary matrix U such that A = UΛU? , where Λ is a real diagonal matrix. Remark 2.2.44. Every hermitian matrix is normal with real eigen-values. Hence Remarks 2.2.39-2.2.42 are valid for hermitian matrices with obvious modifications. Theorem 2.2.45. (Singular Value Decomposition) Let A be an n × n matrix over C with rank r > 0. Then there exist unitary matrices U and V and a positive definite diagonal matrix ∆ of order r × r such that A = Udiag(∆, 0)V? = δ1 U1 V1? + . . . + δr Ur Vr? , where Ui and Vi are the ith columns of U and V respectively.
Matrix Decompositions and Generalized Inverses
17
Remark 2.2.46. Let A, U and V be as in Theorem 2.2.45. Then δ12 , . . . , δr2 are the non-zero eigen-values of the matrix A? A (or equivalently of the matrix AA? ) and Ui and Vi are eigen-vectors of the matrices AA? and A? A respectively corresponding to the eigen-value δi2 . Remark 2.2.47. δ1 , . . . , δr in Remark 2.2.46 are called the singular values of A and Ui and Vi , i = 1, . . . , r, are called singular vectors of A. Remark 2.2.48. Compare Theorems 2.2.1 and 2.2.45. For matrices over complex field, we can use the unitary matrices to reduce A to a diagonal form. However, we cannot, in general reduce A to diag(Ir , 0) by using unitary matrices unless each singular value of A is 1 in which case A? A and AA? are orthogonal projectors. Remark 2.2.49. In general a singular value decomposition as given in Theorem 2.2.45 is not unique. For a unique singular value decomposition, one has to pool the singular vectors corresponding to each distinct singular value. This is called the Penrose decomposition. For details see Chapter 12. See also [Rao and Bhimasankaram (2000)].
2.3
Generalized inverse of a matrix
Let A be a non-singular matrix. Then there exists a matrix G such that AG = GA = I. Such a matrix G is unique. The matrix G is called the inverse of A and is denoted by A−1 . If A is an m × n matrix (m 6= n) of rank m, there exists a matrix G such that AG = I. (However there is no matrix H such that HA = I.) Such a matrix G, denoted by A−1 R , is called a right inverse of A. Notice that A−1 is not unique unless m = n. Again, R if A is an m × n matrix of rank n (m 6= n), there exists a matrix G such that GA = I (note that there is no matrix T such that AT = I). Such a −1 matrix G is called a left inverse of A, denoted by A−1 L . Just as AR is not −1 unique, so also AL is not unique unless m = n. In all the above cases, there exists a matrix G such that AGA = A. However, if A is an m × n matrix of rank r < min{m, n}, then there is no matrix R such that AR = I or RA = I. One can ask: does a matrix G satisfying AGA = A exist? If this happens, G can be thought of as a generalization of the inverse/a right inverse/a left inverse. From the utility point of view, if A is non-singular, then Ax = b is consistent for all b and possesses unique solution given by x = Gb, where
18
Matrix Partial Orders, Shorted Operators and Applications
G is the inverse of A. Let A be an m × n matrix of rank m, m 6= n and G be a right inverse of A. Then Ax = b is consistent for all b and x = Gb is a solution. It should be noted that Ax = b does not have a unique solution in this case. On the other hand if A is an m×n matrix of rank n and G is a left inverse of A, then Ax = b is not necessarily consistent for all b; but if it is consistent then it has a unique solution x = Gb (interested readers may find examples). Now, let A be an m × n matrix of rank r < min{m, n}. Does there exist a matrix G such that, whenever Ax = b is consistent, x = Gb is a solution to Ax = b? If so, such a matrix G can be thought of as a generalization of inverse/ right inverse/ left inverse. We shall now establish the existence of a matrix G that will have most of these properties, there by answering both the questions posed in the preceding paragraphs. Let A be an m × n matrix. If A is null, then AGA = A for all matrices G of order n × m. In this case, Ax = 0 is the only consistent system and x = G0 = 0 is a solution for each matrix G of order n × m. If A is nonnull, let (P, Q) be a rank factorization of A. Then P has a left inverse P−1 L −1 −1 and Q has a right inverse Q−1 R . Write G = QR PL . It is easy to check that AGA = A. Also, if Ax = b is consistent, then b = Ac for some vector c. Now, AGb = AGAc = Ac = b, showing that Gb is a solution to Ax = b. Thus, for every matrix A, there is a matrix G having both the properties. In fact a stronger result holds: Theorem 2.3.1. Let A is an m × n matrix and G be an n × m matrix. Then the following are equivalent: (i) (ii) (iii) (iv) (v) (vi)
Gb is a solution to Ax = b for all b ∈ C(A) AGA = A AG is idempotent and ρ(AG) = ρ(A) GA is idempotent and ρ(GA) = ρ(A) ρ(I − AG) = m − ρ(A) and ρ(I − GA) = n − ρ(A).
Proof. (i) ⇔ (ii) AGb = b for all b ∈ C(A) ⇔ AGAu = Au for all u ⇔ AGA = A. (ii) ⇒ (iii) is trivial. (iii) ⇒ (ii) Since C(AG) ⊆ C(A) and ρ(AG) = ρ(A), we have C(AG) = C(A). So, A = AGD for some matrix D. Further, since AG is idempotent, AGAG = AG. So, AGAGD = AGD or AGA = A.
Matrix Decompositions and Generalized Inverses
19
(ii) ⇔ (iv) is established similarly. (iv) ⇒ (vi) Since GA is idempotent, so is I − GA. Further, I = GA + (I − GA). Therefore, n = ρ(I) = tr(I) = tr(GA) + tr(I − GA) = ρ(GA) + ρ(I − GA). Since ρ(GA) = ρ(A), the result follows. (vi) ⇒ (iv) Now, I = GA + (I − GA). So, n ≤ ρ(GA) + ρ(I − GA) ≤ ρ(A) + ρ(I − GA) = n. It follows that ρ(GA) = ρ(A). Again, ρ(I) = ρ(GA + (I − GA)) = ρ(GA) + ρ(I − GA), so, the column spaces of GA and I − GA are virtually disjoint and so are the row spaces. Moreover, GA(I − GA) = (I − GA)GA. Hence GA(I − GA) = 0. So, GA is idempotent. The proof for (iii) ⇔ (v) can be completed along the lines of proof for (iv) ⇔ (vi). Thus, we see that both the questions raised in the beginning of this section lead to the same class of solutions. Definition 2.3.2. Let A be an m × n matrix. Then a matrix G is said to be a generalized inverse of A if it satisfies any one of the six equivalent conditions in Theorem 2.3.1. A generalized inverse (or in short a g-inverse) of A is denoted by A− and the class of all generalized inverses of A is denoted by {A− }. We shall now study the use of a g-inverse in solving linear equations. Theorem 2.3.3. Let A be an m × n matrix and let G be a g-inverse of A. Then (i) Ax = b is consistent if and only if AGb = b. (ii) N (A), the null space of A (or equivalently the class of all solutions to Ax = 0) is given by C(I − GA) (or equivalently, (I − GA)ξ, where ξ is arbitrary). (iii) Let Ax = b be consistent. Then the class of all solutions to Ax = b is given by Gb + (I − GA)ξ, where ξ is arbitrary.
Proof. (i) The 'if' part is obvious and the 'only if' part follows from Theorem 2.3.1(i).
(ii) Since A(I − GA) = 0, C(I − GA) ⊆ N(A). Further, by Theorem 2.3.1(vi), ρ(I − GA) = n − ρ(A) = d(N(A)), the dimension of the null space of A. It follows that C(I − GA) = N(A).
(iii) follows from (ii) and the fact that Gb is a solution to Ax = b.

Thus, a g-inverse can be used to check the consistency of a given system of linear equations and also to obtain an expression for all possible solutions to a consistent system of linear equations.

Remark 2.3.4. Let A be an m × n matrix and G be a g-inverse of A. Then for a matrix B, C(B) ⊆ C(A) if and only if B = AGB.

Theorem 2.3.5. Let Ax = b be consistent and b ≠ 0. Then the class of all solutions of Ax = b is {Gb : G ∈ {A−}}.

Proof. Let G be a g-inverse of A. Clearly, AGb = b, since Ax = b is consistent. Hence, {Gb : G ∈ {A−}} is a subset of the class of all solutions to Ax = b. Now, let u be a solution to Ax = b and G be a g-inverse of A. Then by Theorem 2.3.3(iii), u = Gb + (I − GA)ξ for some ξ. Since b ≠ 0, there exists an integer j such that b_j ≠ 0. Let w_{ij} = ξ_i/b_j for i = 1, . . . , n, and w_{ik} = 0 for all k ≠ j. Write W = (w_{pq}). It is easy to see that Wb = ξ. So, u = Gb + (I − GA)ξ = Gb + (I − GA)Wb = (G + (I − GA)W)b. Now A(G + (I − GA)W)A = AGA = A. Hence, G + (I − GA)W ∈ {A−} and so u ∈ {Gb : G ∈ {A−}}.

We shall now obtain several expressions for the class of all g-inverses of a matrix. Notice that every matrix of order n × m is a g-inverse of the null matrix of order m × n.

Theorem 2.3.6. Let A be an m × n matrix of rank r (> 0) and let A = P diag(I_r, 0) Q be a normal form of A (see 2.2.1). Then the class of all g-inverses of A is given by

Q^{-1} [I_r, L; M, N] P^{-1},

where L, M and N are arbitrary.

Proof.
Straightforward.
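The constructions above are easy to try out numerically. The following small sketch (ours, in Python with NumPy; it is not part of the original text) builds a g-inverse from a rank factorization, as in the discussion preceding Theorem 2.3.1, and checks conditions of Theorems 2.3.1 and 2.3.3. The particular left and right inverses used are just one convenient choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 4x5 matrix of rank 2, built from a rank factorization A = P Q.
P = rng.standard_normal((4, 2))       # full column rank (with probability 1)
Q = rng.standard_normal((2, 5))       # full row rank
A = P @ Q

# One convenient left inverse of P and right inverse of Q.
P_L = np.linalg.inv(P.T @ P) @ P.T    # P_L @ P = I
Q_R = Q.T @ np.linalg.inv(Q @ Q.T)    # Q @ Q_R = I
G = Q_R @ P_L                         # a g-inverse of A

assert np.allclose(A @ G @ A, A)      # condition (ii) of Theorem 2.3.1

# A consistent system: b = A c lies in C(A), and x = G b solves it.
b = A @ rng.standard_normal(5)
x = G @ b
assert np.allclose(A @ x, b)

# Theorem 2.3.3(iii): G b + (I - G A) xi is a solution for every xi.
xi = rng.standard_normal(5)
assert np.allclose(A @ (x + (np.eye(5) - G @ A) @ xi), b)
```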
Remark 2.3.7. Let A be an m × n matrix. Let P and Q be non-singular matrices of order m × m and n × n respectively. Then G is a g-inverse of A ⇔ Q^{-1}GP^{-1} is a g-inverse of PAQ.

Theorem 2.3.8. Let A be an m × n matrix and let G be a specific g-inverse of A. Then the class of all g-inverses of A is given by {A−} = {G + U − GAUAG, U arbitrary} or, equivalently, {A−} = {G + (I − GA)V + W(I − AG), V, W arbitrary}.

Proof.
First notice that for all U, A(G + U − GAUAG)A = AGA + AUA − AGAUAGA = AGA = A.
So, G + U − GAUAG is a g-inverse of A for all U. Let G_1 be another g-inverse of A. Choose U = G_1 − G and check that G_1 = G + U − GAUAG. Thus, the class of all g-inverses of A is given by {A−} = {G + U − GAUAG, U arbitrary}. Consider now H = G + (I − GA)V + W(I − AG), where V and W are arbitrary matrices of appropriate order. It is easy to see that H is a g-inverse of A for all V and W. Choose and fix V and W. Let U = (I − GA)V + W(I − AG). Then AUA = 0. Therefore, G + (I − GA)V + W(I − AG) = G + U − GAUAG, so H ∈ {A−} for all V and W. Next, choose and fix U. Then G + U − GAUAG = G + (I − GA)UAG + U(I − AG). Let V = UAG and W = U. Hence, {A−} = {G + U − GAUAG, U arbitrary} = {G + (I − GA)V + W(I − AG), V, W arbitrary}.
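The two parametrizations of {A−} in Theorem 2.3.8 can also be tested numerically. A brief sketch of ours (Python/NumPy, with randomly chosen U, V and W):

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.standard_normal((4, 2)); Q = rng.standard_normal((2, 5))
A = P @ Q
# One g-inverse of A, via a rank factorization (as in Theorem 2.3.17 below).
G = Q.T @ np.linalg.inv(Q @ Q.T) @ np.linalg.inv(P.T @ P) @ P.T

# Every matrix of the form G + U - GAUAG is again a g-inverse of A.
U = rng.standard_normal((5, 4))
G1 = G + U - G @ A @ U @ A @ G
assert np.allclose(A @ G1 @ A, A)

# So is every matrix of the form G + (I - GA)V + W(I - AG).
V = rng.standard_normal((5, 4)); W = rng.standard_normal((5, 4))
G2 = G + (np.eye(5) - G @ A) @ V + W @ (np.eye(4) - A @ G)
assert np.allclose(A @ G2 @ A, A)
```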
We now give a basic lemma which will be needed in obtaining a useful result on the invariance of BA−C under the choice of g-inverses A− of A.

Lemma 2.3.9. Let a and c be m-column and n-column vectors respectively such that c ≠ 0 and a^tZc = 0 for all m × n matrices Z. Then a = 0.
Proof. If possible, let a ≠ 0. Then there exists an element of a, say a_i, which is non-null. Since c ≠ 0, there is an element c_j of c which is non-null. Take the matrix Z having (i, j)th element z_{ij} = 1 and all other elements null. Then a^tZc = a_ic_j ≠ 0, which is a contradiction. So, a = 0.

Remark 2.3.10. A similar result holds when a and c are matrices of suitable orders. More precisely, if A is an m × n matrix and C (non-null) is a p × q matrix such that A^tZC = 0 for all m × p matrices Z, then A = 0.

Theorem 2.3.11. Let A, B and C be matrices of appropriate orders such that the product BA−C is defined. Then BA−C is invariant under choices of A− and is non-null if and only if B and C are non-null, C(C) ⊆ C(A) and C(B^t) ⊆ C(A^t).

Proof. The 'if' part is trivial.
'Only if' part: Let G be a g-inverse of A. Then G + (I − GA)V + W(I − AG) is a g-inverse of A for arbitrary matrices V and W of appropriate orders. Now, BGC is non-null ⇒ B and C are non-null, and B(G + (I − GA)V + W(I − AG))C = BGC for all V and W. So, B((I − GA)V + W(I − AG))C = 0 for all V and W. Taking W = 0, we have B(I − GA)VC = 0 for all V. As C is non-null, by Lemma 2.3.9 (in the matrix form of Remark 2.3.10), B(I − GA) = 0. Thus, C(B^t) ⊆ C(A^t). Similarly, by taking V = 0, we have C(C) ⊆ C(A).

In later sections we often need a g-inverse G of A whose column space or row space is contained in a specified subspace. We shall now investigate when it is possible to get such a matrix G. We shall also obtain an explicit expression for such a g-inverse.

Theorem 2.3.12. Let A be an m × n matrix. Let r and s be positive integers and let P and Q be matrices of orders n × s and r × m respectively. Then there exists a g-inverse G of A such that C(G) ⊆ C(P) and C(G^t) ⊆ C(Q^t) if and only if ρ(QAP) = ρ(A). If ρ(QAP) = ρ(A), then P(QAP)−Q is a g-inverse of A. Further, if PCQ is a g-inverse of A, then C is a g-inverse of QAP.

Proof. If G is a matrix such that C(G) ⊆ C(P) and C(G^t) ⊆ C(Q^t), then G = PCQ for some matrix C. Now, AGA = A ⇒ APCQA = A ⇒ ρ(AP) = ρ(QA) = ρ(A). Further, QAPCQA = QA. Hence ρ(QAP) = ρ(QA) and therefore ρ(QAP) = ρ(A).
Conversely, let ρ(QAP) = ρ(A). Then ρ(QAP) = ρ(QA) = ρ(AP) = ρ(A). Consider T = P(QAP)−. Now, QAT = QAP(QAP)− is idempotent. Also, ρ(QAT) = ρ(QAP(QAP)−) = ρ(QAP) = ρ(QA). Therefore, T is a g-inverse of QA, by Theorem 2.3.1(iii). Further, since ρ(QA) = ρ(A), the row space of A coincides with that of QA, so A = W(QA) for some matrix W; hence A(TQ)A = W(QA)T(QA) = WQA = A. Thus, TQ = P(QAP)−Q is a g-inverse of A. Now, let PCQ be a g-inverse of A. So, APCQA = A. Pre- and post-multiplying by Q and P respectively, we have QAPCQAP = QAP, showing that C is a g-inverse of QAP.

We know that, if G is a g-inverse of A, then AGA = A and therefore ρ(G) ≥ ρ(A). Let ρ(A) = r < min{m, n}. Then by using Theorem 2.3.6, it can be shown that for each s such that r ≤ s ≤ min{m, n}, there is a g-inverse G of A with rank s. Take for example G = Q^{-1} diag(I_s, 0) P^{-1}. We now specialize to the class of g-inverses G of A such that ρ(G) = ρ(A).

Theorem 2.3.13. Let A be an m × n matrix. Then G is a g-inverse of A such that ρ(G) = ρ(A) if and only if AGA = A and GAG = G.

Proof. 'If' part: Since AGA = A, G is a g-inverse of A and ρ(G) ≥ ρ(A). Moreover, GAG = G, so ρ(A) ≥ ρ(G). Hence ρ(G) = ρ(A).
'Only if' part: Since G is a g-inverse of A and ρ(G) = ρ(A), AG is idempotent and ρ(AG) = ρ(A) = ρ(G). So, by (iv) of Theorem 2.3.1, A is a g-inverse of G. Thus, GAG = G and, G being a g-inverse of A, AGA = A.

Definition 2.3.14. Let A be an m × n matrix. An n × m matrix G satisfying AGA = A and GAG = G is called a reflexive generalized inverse of A, denoted by A−_r.

Remark 2.3.15. Notice that the relationship AGA = A and GAG = G is one of symmetry between A and G. It is easy to see that {GAG : G ∈ {A−}} is the class of all reflexive g-inverses of A.

Remark 2.3.16. Analogous to Theorem 2.3.6, it is easy to show that if A = P diag(I_r, 0) Q, then the class of all reflexive g-inverses of A is given
by

{Q^{-1} [I_r, L; M, ML] P^{-1}},

where L and M are arbitrary. For the null matrix of order m × n, while every n × m matrix is a generalized inverse, there is only one reflexive g-inverse, namely the null matrix. We shall give below another characterization of reflexive g-inverses of a non-null matrix using rank factorization.

Theorem 2.3.17. Let A be a non-null m × n matrix. Let (P, Q) be a rank factorization of A. Then the class of all reflexive g-inverses of A is given by Q_R^{-1}P_L^{-1}, where P_L^{-1} and Q_R^{-1} are arbitrary left and right inverses of P and Q respectively.

Proof. Clearly, G = Q_R^{-1}P_L^{-1} is a reflexive g-inverse of A. Conversely, let G be a reflexive g-inverse of A. Then AGA = A ⇒ PQGPQ = PQ ⇒ QGP = I, since Q has a right inverse and P has a left inverse. Clearly, a choice of P_L^{-1} is QG and a choice of Q_R^{-1} is GP. Now, GPQG = GAG = G. So, G = Q_R^{-1}P_L^{-1}.
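Theorem 2.3.17 can be checked numerically as follows. This is a sketch of ours (Python/NumPy; not from the text), with one particular choice of left and right inverses:

```python
import numpy as np

rng = np.random.default_rng(2)
P = rng.standard_normal((5, 3)); Q = rng.standard_normal((3, 6))
A = P @ Q                                  # 5x6 of rank 3

# Theorem 2.3.17: G = Q_R^{-1} P_L^{-1} for any right/left inverses.
Q_R = Q.T @ np.linalg.inv(Q @ Q.T)
P_L = np.linalg.inv(P.T @ P) @ P.T
G = Q_R @ P_L

assert np.allclose(A @ G @ A, A)           # g-inverse
assert np.allclose(G @ A @ G, G)           # reflexive
# Theorem 2.3.13: reflexivity is equivalent to rank equality.
assert np.linalg.matrix_rank(G) == np.linalg.matrix_rank(A)
```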
Let A be an m × n matrix of rank r (< min{m, n}). Let s be an integer such that r < s ≤ min{m, n}. We now obtain a characterization of all g-inverses of A with rank s.

Theorem 2.3.18. Let A be an m × n matrix of rank r, where 0 < r < min{m, n}. Let s be a positive integer such that r < s ≤ min{m, n}. Then G is a g-inverse of A with ρ(G) = s if and only if there exist non-singular matrices P and Q of order m × m and n × n respectively such that A = P diag(I_r, 0) Q and G = Q^{-1} diag(I_s, 0) P^{-1}.

Proof. The 'if' part is trivial.
'Only if' part: Let A = R_1 diag(I_r, 0) R_2 be a normal form of A, where R_1 and R_2 are non-singular. Since G is a g-inverse of A,

G = R_2^{-1} [I_r, L; M, N] R_1^{-1}, for some matrices L, M and N.

Further, since G is of rank s, ρ([I_r, L; M, N]) = s. Now

[I_r, L; M, N] = [I_r, 0; M, I] [I_r, 0; 0, N − ML] [I_r, L; 0, I].
Hence ρ(N − ML) = s − r. Let N − ML = R_3 diag(I_{s−r}, 0) R_4 be a normal form of N − ML. Write

P = R_1 [I_r, −L; 0, I] [I_r, 0; 0, R_4^{-1}] and Q = [I_r, 0; 0, R_3^{-1}] [I_r, 0; −M, I] R_2.

It is easily verified that A = P diag(I_r, 0) Q and G = Q^{-1} diag(I_s, 0) P^{-1}.

Corollary 2.3.19. Consider the same setup as in Theorem 2.3.18. Then G is a g-inverse of A with rank s if and only if there exists a matrix B such that (a) ρ(A + B) = ρ(A) + ρ(B) = s and (b) G = (A + B)−_r.

Proof. 'If' part: Let (P_1, Q_1) and (P_2, Q_2) be rank factorizations of A and B respectively. Since ρ(A + B) = ρ(A) + ρ(B), ((P_1 : P_2), [Q_1; Q_2]) is a rank factorization of A + B. Since G is a reflexive g-inverse of A + B, we have G = (R_1 : R_2)[S_1; S_2], where (R_1 : R_2) is a right inverse of [Q_1; Q_2] and [S_1; S_2] is a left inverse of (P_1 : P_2). Thus,

Q_1(R_1 : R_2) = (I : 0) and [S_1; S_2] P_1 = [I; 0].

Now,

AGA = P_1Q_1(R_1 : R_2)[S_1; S_2]P_1Q_1 = P_1(I : 0)[I; 0]Q_1 = P_1Q_1 = A.

Thus, G is a g-inverse of A. Further, ρ(G) = ρ(A + B) = s.
'Only if' part: By Theorem 2.3.18, A = P diag(I_r, 0) Q and G = Q^{-1} diag(I_s, 0) P^{-1} for some non-singular matrices P and Q. Let B = P diag(0_r, I_{s−r}, 0) Q. Then A + B = P diag(I_s, 0) Q and ρ(A + B) = s = ρ(A) + ρ(B). Finally, G = Q^{-1} diag(I_s, 0) P^{-1} is an (A + B)−_r.
2.4 The group inverse
In the previous section, we studied one important subclass of g-inverses of a matrix, namely, the class of reflexive g-inverses. In this section, we seek a g-inverse G of a square matrix A satisfying C(G) ⊆ C(A) or C(G^t) ⊆ C(A^t) or both. We shall see that not all matrices enjoy this property. It turns out that the same subclass of square matrices enjoys each of the properties mentioned above. We seek to identify this subclass of square matrices and, for each matrix in this class, we characterize the subclasses of all g-inverses with these properties. This leads us to the concept of the group inverse in a very natural manner. We study briefly the major properties of the group inverse. We also define the Drazin inverse for a general square matrix by extending the concept of the group inverse of a matrix of index 1. The group inverse is found useful in the analysis of Markov chains, and the Drazin inverse is useful in solving differential and difference equations.

Definition 2.4.1. Let A be an n × n matrix. Then a g-inverse G of A such that C(G) ⊆ C(A) is called a χ-inverse of A and is denoted by A−_χ. A g-inverse G such that C(G^t) ⊆ C(A^t) is called a ρ-inverse of A and is denoted by A−_ρ. A g-inverse G of A such that C(G) ⊆ C(A) and C(G^t) ⊆ C(A^t) is called a ρχ-inverse of A and is denoted by A−_ρχ.

Theorem 2.4.2. Let A be an n × n matrix. Then each of A−_χ, A−_ρ and A−_ρχ exists if and only if ρ(A) = ρ(A^2), i.e., A is of index not greater than 1.

Proof. By Theorem 2.3.12, each of A−_χ and A−_ρ exists ⇔ ρ(A^2) = ρ(A). Again by the same theorem, A−_ρχ exists ⇔ ρ(A^3) = ρ(A). Since ρ(A) ≥ ρ(A^2) ≥ ρ(A^3), and ρ(A) = ρ(A^2) implies ρ(A^2) = ρ(A^3), A−_ρχ exists ⇔ ρ(A) = ρ(A^2).

Let A be an n × n matrix such that ρ(A) = ρ(A^2). Then by taking Q = I and P = A and applying Theorem 2.3.12, we see that a choice of A−_χ is A(A^2)−. Similarly, by taking Q = A and P = I, we see that a choice of A−_ρ is (A^2)−A, where (A^2)− is any g-inverse of A^2.

Theorem 2.4.3. Let A be an n × n matrix such that ρ(A) = ρ(A^2). Then A(A^3)−A is an A−_ρχ, where (A^3)− is any g-inverse of A^3. Further, A−_ρχ is unique.
Proof. By Theorem 2.3.12, it follows that G = A(A^3)−A is a g-inverse of A such that C(G) ⊆ C(A) and C(G^t) ⊆ C(A^t). Further, if ACA is a g-inverse of A, then C must be of the form (A^3)− for some g-inverse (A^3)− of A^3, again by Theorem 2.3.12. Since ρ(A) = ρ(A^2), C(A^3) = C(A) and C((A^3)^t) = C(A^t). So, by Theorem 2.3.11, G = A(A^3)−A is invariant under choices of g-inverses of A^3. Thus A−_ρχ exists, is unique and is equal to A(A^3)−A.

We shall now characterize the ρχ-inverse and the classes of all χ-inverses and all ρ-inverses using rank factorization.

Theorem 2.4.4. Let A be an n × n matrix of rank r (> 0). Let (P, Q) be a rank factorization of A. Then

(i) ρ(A) = ρ(A^2) if and only if QP is non-singular.
(ii) Let ρ(A) = ρ(A^2). The class of all χ-inverses is given by {P(QP)^{-1}P_L^{-1} : P_L^{-1} is an arbitrary left inverse of P}.
(iii) Let ρ(A) = ρ(A^2). The class of all ρ-inverses is given by {Q_R^{-1}(QP)^{-1}Q : Q_R^{-1} is a right inverse of Q}.
(iv) Let ρ(A) = ρ(A^2). Then the ρχ-inverse of A is P((QP)^{-1})^2 Q.

Proof. (i) We have ρ(A) = ρ(A^2) ⇔ ρ(PQ) = ρ(PQPQ) = ρ(QP) ⇔ ρ(QP) = ρ(I) = r. Thus, ρ(A) = ρ(A^2) ⇔ QP is non-singular.
(ii) Since ρ(A) = ρ(A^2), A−_χ exists. Clearly, ρ(A−_χ) ≥ ρ(A). On the other hand, C(A−_χ) ⊆ C(A), so C(A−_χ) = C(A) and ρ(A−_χ) = ρ(A). Hence, A−_χ is a reflexive g-inverse of A. By Theorem 2.3.17, every reflexive g-inverse is of the type Q_R^{-1}P_L^{-1}. Also, C(Q_R^{-1}P_L^{-1}) = C(Q_R^{-1}). Since C(A−_χ) = C(A) = C(P), Q_R^{-1} = PT for some matrix T. Further, T must satisfy I = QQ_R^{-1} = QPT. As QP is a square matrix, we have T = (QP)^{-1}. Thus, every χ-inverse must be of the form P(QP)^{-1}P_L^{-1}. Conversely, A(P(QP)^{-1}P_L^{-1})A = PQP(QP)^{-1}P_L^{-1}PQ = PQ = A and C(P(QP)^{-1}P_L^{-1}) = C(P) = C(A) for each P_L^{-1}. So, P(QP)^{-1}P_L^{-1} is a χ-inverse of A.
Proof of (iii) is similar to the proof of (ii).
(iv) follows from (ii) and (iii).
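The formula in Theorem 2.4.4(iv) lends itself to a direct numerical check. A sketch of ours (Python/NumPy); the random rank factorization below gives QP non-singular with probability 1, i.e. a matrix of index 1:

```python
import numpy as np

rng = np.random.default_rng(3)
P = rng.standard_normal((5, 2)); Q = rng.standard_normal((2, 5))
A = P @ Q                                    # n x n, rank 2
assert np.linalg.matrix_rank(A @ A) == 2     # index <= 1, QP non-singular

# Theorem 2.4.4(iv): the rho-chi inverse (the group inverse) is P (QP)^{-2} Q.
QP_inv = np.linalg.inv(Q @ P)
G = P @ QP_inv @ QP_inv @ Q

assert np.allclose(A @ G @ A, A)
assert np.allclose(G @ A @ G, G)
assert np.allclose(A @ G, G @ A)             # commuting (Theorem 2.4.6)
```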
Theorem 2.4.5. Let A be an n × n matrix such that ρ(A) = ρ(A^2). Let A = P diag(C, 0) P^{-1}, where P and C are non-singular.

(i) The class of all A−_χ is given by P [C^{-1}, L; 0, 0] P^{-1}, where L is arbitrary.
(ii) The class of all A−_ρ is given by P [C^{-1}, 0; M, 0] P^{-1}, where M is arbitrary.
(iii) G is A−_ρχ if and only if G = P diag(C^{-1}, 0) P^{-1}.
Proof.
Straightforward.
Theorem 2.4.6. Let A be an n × n matrix such that ρ(A) = ρ(A^2). Then G is A−_ρχ if and only if (i) AGA = A, (ii) GAG = G and (iii) AG = GA.

Proof. 'If' part: From (ii) and (iii) it follows that AG^2 = G = G^2A. So, C(G) ⊆ C(A) and C(G^t) ⊆ C(A^t). This together with (i) implies that G is A−_ρχ. The 'only if' part can be easily verified using Theorem 2.4.4(iv).

We formally define a commuting g-inverse of a square matrix in the following:

Definition 2.4.7. Let A be a square matrix. Then a square matrix G of the same order is called a commuting g-inverse of A if it satisfies the following: (i) AGA = A, i.e. G is a g-inverse of A, and (ii) AG = GA.

Remark 2.4.8. A square matrix A has a commuting g-inverse if and only if ρ(A) = ρ(A^2).

We now obtain the class of all commuting g-inverses of a square matrix.

Theorem 2.4.9. (a) Let A be a square matrix. Then A has a commuting g-inverse if and only if A is of index not greater than 1.
(b) Let A be of index not greater than 1 and let A = P diag(T, 0) P^{-1}, where T is non-singular. Then every commuting g-inverse of A is of the
form G = P diag(T^{-1}, C) P^{-1}, where C is arbitrary.

Proof. (a) 'If' part: Let A be of index not greater than 1. Then by Theorem 2.2.23, there exists a non-singular matrix P such that A = P diag(T, 0) P^{-1}, where T is non-singular. Let G = P diag(T^{-1}, 0) P^{-1}. Then G is a commuting g-inverse of A.
'Only if' part: Let A have a commuting g-inverse G. Then AGA = A^2G = A. This implies ρ(A) ≤ ρ(A^2) ≤ ρ(A), or equivalently, ρ(A^2) = ρ(A), so A is of index not greater than 1.
(b) As noted in the proof of (a), G = P diag(T^{-1}, 0) P^{-1} is a commuting g-inverse of A. Let H = P [H_1, H_2; H_3, H_4] P^{-1} be a commuting g-inverse of A, partitioned conformably for multiplication with A. Now, by direct computation, H_1 = T^{-1}, H_2 and H_3 are null and H_4 is arbitrary.

Remark 2.4.10. Let A be a square matrix such that ρ(A) = ρ(A^2). Then A−_ρχ is the unique commuting reflexive g-inverse of A.

Remark 2.4.11. In view of the symmetry of (i)-(iii) in Theorem 2.4.6, G = A−_ρχ if and only if A = G−_ρχ.

It is obvious that the set of all non-singular matrices of order n × n forms a group under matrix multiplication with identity I. The inverse for each matrix A in this group is A^{-1}, the usual inverse of a non-singular matrix. We now show that the matrices satisfying ρ(A) = ρ(A^2) also have a similar property. Let A be an n × n matrix of rank r such that ρ(A) = ρ(A^2). Consider the class of matrices

C_A = {B : C(B) = C(A), C(B^t) = C(A^t) and ρ(B) = ρ(B^2)}.

We shall show that C_A is a group under matrix multiplication.

Theorem 2.4.12. Let A be an n × n matrix of rank r such that ρ(A) = ρ(A^2). Let A = P diag(T, 0) P^{-1}, where T is an r × r non-singular matrix. Then

(i) C_A = {P diag(R, 0) P^{-1}}, where R is an arbitrary r × r non-singular matrix.
(ii) C_A forms a group under matrix multiplication.
(iii) For each B ∈ C_A, B−_ρχ is the inverse of B in the group C_A.

Proof. Notice that for B ∈ C_A, since C(B) = C(A) and C(B^t) = C(A^t), we have C(P^{-1}BP) = C(P^{-1}AP) and C((P^{-1}BP)^t) = C((P^{-1}AP)^t). Now (i) follows easily.
(ii) Let each of B = P diag(R, 0) P^{-1} and C = P diag(S, 0) P^{-1} belong to C_A, where R and S are non-singular matrices of order r × r. Clearly, BC ∈ C_A. Thus, C_A is closed under multiplication. Associativity follows since matrix multiplication is associative. It can be easily verified that E = P diag(I_r, 0) P^{-1} is the identity element and, for each B = P diag(R, 0) P^{-1} ∈ C_A, its inverse is P diag(R^{-1}, 0) P^{-1}, which belongs to C_A. Thus, C_A is a group.
(iii) is easy to check.

In view of Theorem 2.4.12, we have

Theorem 2.4.13. Let A be a matrix such that ρ(A) = ρ(A^2). Then the following statements are equivalent:

(i) A square matrix G is the unique g-inverse of A with C(G) ⊆ C(A) and C(G^t) ⊆ C(A^t).
(ii) G is the unique reflexive commuting g-inverse of A.
(iii) G is the inverse of A in the group C_A = {B : C(B) = C(A), C(B^t) = C(A^t) and ρ(B) = ρ(B^2)}.

Remark 2.4.14. Henceforth, we shall refer to the matrix G satisfying any of the equivalent conditions in Theorem 2.4.13 as the group inverse of A and denote it by A^#. Thus, A^# is the same as A−_ρχ.

If A is a non-singular matrix, then A^k is non-singular and (A^k)^{-1} = (A^{-1})^k for each positive integer k. We now establish a similar property for the group inverse.

Theorem 2.4.15. Let A be a matrix such that ρ(A) = ρ(A^2) and let A^# be the group inverse of A. Then ρ(A^k) = ρ((A^k)^2) and (A^k)^# = (A^#)^k for each positive integer k.

Proof. If ρ(A) = ρ(A^2), it is easy to establish ρ(A) = ρ(A^s) for each positive integer s. So, ρ(A^k) = ρ((A^k)^2) = ρ(A). The rest follows easily from Theorem 2.4.5(iii).
For a non-singular matrix A, if λ is an eigenvalue of A with algebraic multiplicity k, then 1/λ is an eigenvalue of A^{-1} with the same algebraic multiplicity k. The following is easy to establish.

Theorem 2.4.16. Let A be a matrix of index ≤ 1. If λ is a non-null eigenvalue of A with algebraic multiplicity k, then 1/λ is an eigenvalue of A^# with the same algebraic multiplicity. Further, if zero is an eigenvalue of A with algebraic multiplicity t, then zero is an eigenvalue of A^# with the same algebraic multiplicity t.

Remark 2.4.17. A square matrix is of index not greater than 1 if and only if the algebraic and geometric multiplicities of its zero eigenvalue are equal.

If A is a non-singular matrix, then it is well known that A^{-1} is a polynomial in A. We now prove the following:

Theorem 2.4.18. Let A be a square matrix. If A has a g-inverse G which is a polynomial in A, then A has index not greater than 1. Moreover, if ρ(A) = ρ(A^2), then A^# is a polynomial in A.

Proof. If G is a polynomial in A, then G and A commute. So, G is a commuting g-inverse of A. By Remark 2.4.8, it follows that ρ(A) = ρ(A^2). Let ρ(A) = ρ(A^2). Write A = P diag(T, 0) P^{-1}, where P and T are non-singular matrices. Since T is non-singular, T^{-1} = Σ_{j=1}^{k} c_jT^j for some positive integer k and some scalars c_1, . . . , c_k (a representation with zero constant term exists, since T^{-1} = T(p(T))^2 whenever p(T) = T^{-1}). It is easy to check that A^# = Σ_{j=1}^{k} c_jA^j.

The following theorem is easy to establish.

Theorem 2.4.19. Let A be an n × n matrix of index not greater than 1. Then the following hold:

(i) (A^#)^# = A.
(ii) (A^#)^t = (A^t)^#.
(iii) F^n = C(A) ⊕ N(A).
(iv) For each non-singular matrix P, (PAP^{-1})^# = PA^#P^{-1}.
We now obtain the core-nilpotent decomposition (Theorem 2.2.21) for an n × n matrix A over a field F.

Theorem 2.4.20. Let A be an n × n matrix of index k. Then A can be written as A = A_1 + A_2, where

(i) ρ(A_1) = ρ(A_1^2),
(ii) A_2 is nilpotent and
(iii) A_1A_2 = A_2A_1 = 0.

Further, such a decomposition is unique.

Proof. (i) If k = 0 or 1, take A_1 = A and A_2 = 0. Then A_1, A_2 satisfy (i), (ii) and (iii). If k > 1, write A_1 = A^k(A^{2k-1})−A^k and A_2 = A − A^k(A^{2k-1})−A^k. Since ρ(A^k) = ρ(A^{k+1}) = ρ(A^{2k-1}), by Theorem 2.3.11, A_1 is invariant under choices of g-inverse of A^{2k-1}. Now, A_1^2 = A^k(A^{2k-1})−A^kA^k(A^{2k-1})−A^k = A^k(A^{2k-1})−A^{k+1}. Also, A_1^2(A^{k+1})−A^k = A^k(A^{2k-1})−A^k = A_1. Thus, ρ(A_1) = ρ(A_1^2).
(ii) It is easy to check that A_2^r = A^r − A^{k+r-1}(A^{2k-1})−A^k for all r ≥ 1. So, A_2^k = A^k − A^{2k-1}(A^{2k-1})−A^k = A^k − A^k = 0. Thus A_2 is nilpotent.
(iii) is easy to verify.
To prove uniqueness of the decomposition, we proceed as follows. Let A = B_1 + B_2 be another decomposition where B_1 and B_2 satisfy conditions (i), (ii) and (iii). Then A^k = A_1^k = B_1^k. So, C(A_1) = C(A_1^k) = C(B_1^k) = C(B_1). Similarly, C(A_1^t) = C(B_1^t). Further, A_1A_2 = A_2A_1 = B_1B_2 = B_2B_1 = 0. Therefore, 0 = A_1B_2 = B_2A_1 = B_1A_2 = A_2B_1. Thus, A_1(A_1 − B_1) = −A_1(A_2 − B_2) = 0. Now, since C(A_1 − B_1) ⊆ C(A_1), A_1 − B_1 = A_1(A_1^2)−A_1(A_1 − B_1) = 0. Hence, A_1 = B_1 and A_2 = B_2.

In view of Theorem 2.2.23 and Remark 2.2.24, we can restate Theorem 2.4.20 as

Theorem 2.4.21. Let A be an n × n matrix of index k and rank r. Then there exists a non-singular matrix P of order n × n such that A = P diag(T, N) P^{-1}, where T is non-singular and N is nilpotent with N^k = 0.
Remark 2.4.22. The matrices A_1 and A_2 in Theorem 2.4.20 are called the core and the nilpotent parts respectively of the matrix A. Further, ρ(A_1) is called the core rank of A. The core rank of A is less than or equal to the rank of A. Also, k, the index of A, is the least positive integer such that A_2^k = 0 (called the index of nilpotency or simply the index of A_2). Notice that ρ(A) = ρ(A_1) + ρ(A_2).

As noted earlier, a matrix does not have a commuting g-inverse unless it is of index not greater than 1. However, for a general square matrix, not necessarily of index ≤ 1, we can define a type of commuting inverse (which may not be a generalized inverse) that is a generalization of the group inverse.

Definition 2.4.23. Let A be an n × n matrix. A matrix G is called the Drazin inverse of A, denoted by A^D, if G satisfies the following properties:

(i) GAG = G,
(ii) A^k = A^{k+1}G for some positive integer k and
(iii) AG = GA.

Remark 2.4.24. A^D = A^# if the index of A is not greater than 1. If A is either null or nilpotent, then A^D = 0.

Remark 2.4.25. Let k be the smallest positive integer such that A^k = A^{k+1}G. If k ≥ 2, then G is not a g-inverse of A.

We now prove the existence and uniqueness of the Drazin inverse for any square matrix.

Theorem 2.4.26. Let A be an n × n matrix of index k and rank r. As in Theorem 2.4.21, let A = P diag(T, N) P^{-1}, where T and P are non-singular and N is nilpotent with N^k = 0. Then A^D = P diag(T^{-1}, 0) P^{-1} and A^D is unique.

Proof. It is easy to check that P diag(T^{-1}, 0) P^{-1} satisfies conditions (i)-(iii) of Definition 2.4.23. To show that A^D is unique, let G_1 and G_2 be any two choices for A^D. Consider G_1^{k+1}A^{2k+1}G_2^{k+1}. By repeated applications of conditions (i)-(iii) of Definition 2.4.23, we have G_1 = G_1^{k+1}A^{2k+1}G_2^{k+1} = G_2.

Remark 2.4.27. Consider the core-nilpotent decomposition as in Theorem 2.4.20. Then A^D = A_1^# = A^k(A^{2k+1})−A^k.
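The formula of Remark 2.4.27 makes the Drazin inverse directly computable. Here is a sketch of ours (Python/NumPy; not from the text) for a matrix of index 2; the Moore-Penrose inverse, being in particular a g-inverse, is used as the choice of (A^{2k+1})−:

```python
import numpy as np

rng = np.random.default_rng(4)
T = np.array([[2.0, 1.0], [0.0, 3.0]])        # non-singular core
N = np.array([[0.0, 1.0], [0.0, 0.0]])        # nilpotent, N^2 = 0, N != 0
P = rng.standard_normal((4, 4))               # non-singular with probability 1
A = P @ np.block([[T, np.zeros((2, 2))],
                  [np.zeros((2, 2)), N]]) @ np.linalg.inv(P)

k = 2                                          # index of A
A_k = np.linalg.matrix_power(A, k)
AD = A_k @ np.linalg.pinv(np.linalg.matrix_power(A, 2 * k + 1)) @ A_k

assert np.allclose(AD @ A @ AD, AD)                               # (i)
assert np.allclose(A_k, np.linalg.matrix_power(A, k + 1) @ AD)    # (ii)
assert np.allclose(A @ AD, AD @ A)                                # (iii)
```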
Remark 2.4.28. Let A be a non-null matrix. Then A^D = 0 ⇔ A is nilpotent.

The following theorem is easy to establish.

Theorem 2.4.29. Let A be an n × n matrix of index k. Then

(i) ((A^D)^D)^D = A^D.
(ii) (A^D)^t = (A^t)^D.
(iii) (PAP^{-1})^D = PA^DP^{-1} for each non-singular matrix P.
(iv) F^n = C(A^k) ⊕ N(A^k).
(v) C(A^D) = C(A^k), N(A^D) = N(A^k).
(vi) AA^D = A^DA is the projector P_{C(A^k), N(A^k)}, which projects vectors onto C(A^k) along N(A^k).
(vii) (I − AA^D) = (I − A^DA) = P_{N(A^k), C(A^k)}.

We now demonstrate that the Drazin inverse A^D too is a polynomial in A, just as we showed that the group inverse of a matrix B, when it exists, is a polynomial in B.

Theorem 2.4.30. Let A be an n × n matrix. Then A^D is a polynomial in A.

Proof. Let k be the index of A. Then, as seen in Remark 2.4.27, A^D = A^k(A^{2k+1})−A^k. Clearly, A^k(A^{2k+1})−A^k is invariant under the choice of (A^{2k+1})− and ρ(A^{2k+1}) = ρ((A^{2k+1})^2). So, A^D = A^k(A^{2k+1})^#A^k. We know (A^{2k+1})^# is a polynomial in A^{2k+1} and therefore in A. Hence, A^D is a polynomial in A.

We now obtain a formula for the index and the Drazin inverse using rank factorization.

Theorem 2.4.31. Let A be a non-null n × n matrix. Write A_0 = A. For i = 0, 1, 2, . . ., let A_{i+1} = Q_iP_i if A_i ≠ 0, where (P_i, Q_i) is a rank factorization of A_i, and A_{i+1} = 0 if A_i = 0. Then the following hold:

(i) ρ(A^{i+1}) = number of columns in P_i if A_i ≠ 0, and ρ(A^{i+1}) = 0 if A_i = 0.
(ii) There exists an integer s ≥ 0 such that A_s is either non-singular or null. Let k be the smallest integer ≥ 0 such that A_k is either non-singular or null. Then
(iii) the index of A equals k if A_k is non-singular, and k + 1 if A_k = 0.
(iv) C(A^{i+1}) = C(P_0P_1 . . . P_i) and N(A^{i+1}) = N(Q_i . . . Q_1Q_0), for i = 1, . . . , k − 1 if A_k is non-singular and for i = 1, . . . , k − 2 if A_k is null.
(v) A^D = 0 if A_k = 0, and A^D = P_0P_1 . . . P_{k-1}A_k^{-(k+1)}Q_{k-1} . . . Q_0 when A_k is non-singular.

Proof.
(i) It is easy to see that A^{i+1} = P_0P_1 . . . P_iQ_i . . . Q_1Q_0 = P_0P_1 . . . P_{i-1}A_iQ_{i-1} . . . Q_1Q_0. If A_i ≠ 0, then each P_j has full column rank and each Q_j has full row rank, so ρ(A^{i+1}) = ρ(A_i) = ρ(P_iQ_i). If A_i is null, so is A^{i+1}, and therefore ρ(A^{i+1}) = 0; if A_i is non-null, ρ(A^{i+1}) = ρ(A_i) = r_i = number of columns in P_i.
(ii) Suppose A_j is neither null nor non-singular for each j. If A_j is of order n_j × n_j, then A_{j+1} = Q_jP_j is of order n_{j+1} × n_{j+1}, where n_{j+1} = ρ(A_j) < n_j. Thus {n_j} is a strictly decreasing sequence of positive integers as long as the A_j are neither non-singular nor null, which is impossible. This proves (ii). Further, if some A_s is non-singular, then A_t is non-singular and of the same order for all t ≥ s; and if A_s = 0, then A_t = 0 for all t ≥ s. Thus, there exists a smallest integer k ≥ 0 such that A_k is either non-singular or null.
(iii) Let k be this smallest integer. Then A^{k+1} = P_0P_1 . . . P_{k-1}A_kQ_{k-1} . . . Q_1Q_0. Let A_k be null. Then A^{k+1} = 0, so A is nilpotent. As k is the smallest integer for which A_k is null, A has index k + 1. Now, let A_k be non-singular. Since A_k = Q_{k-1}P_{k-1} is of order r_{k-1} × r_{k-1}, non-singularity gives ρ(A_k) = r_{k-1} = ρ(A_{k-1}). Therefore,
ρ(A^{k+1}) = ρ(A_k) = ρ(A_{k-1}) = ρ(A^k), and k is the smallest positive integer for which this is true. So, the index of A is k.
(iv) Let i = 1, . . . , k − 1. From the proof of (i) we have A^{i+1} = P_0P_1 . . . P_{i-1}P_iQ_iQ_{i-1} . . . Q_1Q_0, where each P_j has a left inverse and each Q_j has a right inverse for j = 0, 1, . . . , i. Hence C(A^{i+1}) = C(P_0P_1 . . . P_{i-1}P_i) and N(A^{i+1}) = N(Q_i . . . Q_1Q_0). Let A_k be non-singular. Then we have A^{k+1} = P_0P_1 . . . P_{k-1}P_kQ_kQ_{k-1} . . . Q_1Q_0, where P_k has a left inverse and Q_k has a right inverse. Hence, C(A^{k+1}) = C(P_0P_1 . . . P_{k-1}P_k) and N(A^{k+1}) = N(Q_k . . . Q_1Q_0).
(v) If A_k = 0, then A is nilpotent and, by Remark 2.4.27, A^D = 0. Let A_k be non-singular and let X = P_0P_1 . . . P_{k-1}A_k^{-(k+1)}Q_{k-1} . . . Q_1Q_0. Then it is easy to check that XAX = X, XA = AX and XA^{k+1} = A^k = A^{k+1}X. Thus, X is the Drazin inverse of A.

Corollary 2.4.32. Let A be an n × n matrix of index ≤ 1 and let (P, Q) be a rank factorization of A. Then A^# = P(QP)^{-2}Q.

2.5 Moore-Penrose inverse
In this section we specialize to vectors and matrices over the field of complex numbers C and use the inner product (x, y) = y*x for x, y ∈ C^n. Let Ax = b be a consistent system of linear equations for A ∈ C^{m×n} and b ∈ C^m. In Section 2.3, we noticed that, in general, a solution to Ax = b is not unique. Since there are many solutions, we can look for solutions with some optimal property. One such property is the minimum norm property, as different solutions possibly have different norms. Does a solution with minimum norm exist? If so, does there exist a g-inverse G of A such that Gb is a minimum norm solution to Ax = b whenever the system is consistent? What if Ax = b is not consistent? Does there exist a g-inverse G of A such that ‖AGb − b‖ ≤ ‖Ax − b‖ for each x? This means that we are seeking a g-inverse G of A such that the vector AGb is the closest to b.
Here we shall attempt to find answers to the above questions. It turns out that the answers to all of these are in the affirmative. We shall see that the minimum norm solution is unique, but a least squares solution, in general, is not. The next question that naturally arises is: does there exist a g-inverse G of A such that Gb has the smallest norm in the class of all least squares solutions to Ax = b for all column vectors b? The answer again is in the affirmative. In fact, we show that such a g-inverse is unique. It is this g-inverse that Moore and Penrose arrived at from different considerations. This g-inverse, named after them, is called the Moore-Penrose inverse. In this section, we shall characterize the subclasses of g-inverses mentioned in this and the previous two paragraphs and study their properties. In the previous section we studied g-inverses with specified column and row spaces. It turns out that the g-inverses studied in this section also have column and row spaces contained in certain subspaces of interest, which bear a striking similarity to those in the previous section. Thus, we notice that the g-inverses we study in this section have striking similarities to A−_χ, A−_ρ and A^#. While A−_χ, A−_ρ and A^# exist for a matrix A of index ≤ 1 over a general field, the g-inverses that we are about to study in this section exist for every matrix over the field of complex numbers (or a subfield of it). We first prove a lemma which will be used frequently in this section.

Lemma 2.5.1. Let A and B be m × n and m × p matrices respectively. Then ‖Ax‖ ≤ ‖Ax + By‖ for all x and y if and only if A*B = 0.

Proof.
We first note that
‖Ax‖ ≤ ‖Ax + By‖ ⇔ (Ax, Ax) ≤ (Ax + By, Ax + By) ⇔ y*B*Ax + x*A*By + y*B*By ≥ 0.

The 'if' part follows trivially, since B*B is nnd and A*B = 0 implies (A*B)* = B*A = 0.
'Only if' part: Let, if possible, A*B ≠ 0. Then there exists a vector y, necessarily non-null, such that A*By ≠ 0. So, By ≠ 0 and y*B*By > 0. Let the jth component of A*By be a + ib ≠ 0. Choose x_j, the jth component of x, equal to −((a + ib)/(a^2 + b^2))(y*B*By) and all other components of x as zero. Now, x*A*By = −((a − ib)/(a^2 + b^2))(y*B*By)(a + ib) = −y*B*By.
Also, y*B*Ax = (x*A*By)* = −y*B*By. So, for these choices of x and y, we have y*B*Ax + x*A*By + y*B*By = −y*B*By < 0, which is a contradiction. So, A*B = 0.

Definition 2.5.2. Let Ax = b be consistent. Then a vector x_0 is said to be a minimum norm solution to Ax = b if (i) Ax_0 = b and (ii) ‖x_0‖ ≤ ‖x‖ for every x such that Ax = b.

Theorem 2.5.3. Let A be an m × n matrix. Then there exists a g-inverse G of A such that Gb is a minimum norm solution to Ax = b whenever it is consistent.

Proof. Consider G = A*(AA*)−, where (AA*)− is some g-inverse of AA*. Since ρ(AA*) = ρ(A), by Theorem 2.3.12, G is a g-inverse of A. So, Gb is a solution to Ax = b whenever it is consistent. Also, the class of all solutions to Ax = b is given by Gb + (I − GA)ξ, ξ arbitrary. We now show that ‖Gb‖ ≤ ‖Gb + (I − GA)ξ‖ for all b ∈ C(A) and for all ξ. Now, b ∈ C(A) ⇔ b = Au for some u. Thus, we need to show ‖GAu‖ ≤ ‖GAu + (I − GA)ξ‖ for all u and ξ. By Lemma 2.5.1, this happens if and only if (GA)*(I − GA) = 0. Now,

(GA)*(I − GA) = (A*(AA*)−A)*(I − A*(AA*)−A) = (A*((AA*)−)*A)(I − A*(AA*)−A) = (A*((AA*)−)*A) − (A*((AA*)−)*)AA*(AA*)−A.

Since C(A) = C(AA*), A = AA*(AA*)−A. Therefore,

(GA)*(I − GA) = (A*((AA*)−)*A) − (A*((AA*)−)*)A = 0.

So, ‖GAu + (I − GA)ξ‖^2 = ‖GAu‖^2 + ‖(I − GA)ξ‖^2. Hence, Gb is a minimum norm solution to Ax = b, whenever it is consistent.

Definition 2.5.4. Let A be an m × n matrix. Then a matrix G of order n × m is said to be a minimum norm g-inverse of A if Gb provides a minimum norm solution to Ax = b whenever it is consistent. A minimum norm g-inverse of A is denoted by A−_m.
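The construction in the proof of Theorem 2.5.3 is easy to exercise numerically. A sketch of ours (Python/NumPy, real matrices; not from the text), with the Moore-Penrose inverse serving as one admissible choice of (AA*)−:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 5))
A[2] = A[0] + A[1]                               # force rank 2
G = A.T @ np.linalg.pinv(A @ A.T)                # G = A*(AA*)^-

assert np.allclose(A @ G @ A, A)                 # G is a g-inverse of A

b = A @ rng.standard_normal(5)                   # a consistent right-hand side
x_min = G @ b
# Every solution is x_min + (I - GA) xi; none has smaller norm.
for _ in range(100):
    xi = rng.standard_normal(5)
    x = x_min + (np.eye(5) - G @ A) @ xi
    assert np.linalg.norm(x_min) <= np.linalg.norm(x) + 1e-12
```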
We shall now obtain some characterizations of A−_m.

Theorem 2.5.5. Let A be an m × n matrix. Then G is a minimum norm g-inverse of A if and only if it satisfies any one of the following equivalent conditions:

(i) AGA = A and (GA)* = GA
(ii) GA = P_{A*}, the orthogonal projector onto C(A*)
(iii) GAA* = A*.

Proof. In the proof of Theorem 2.5.3, we saw that G is a minimum norm g-inverse of A if and only if AGA = A and (GA)*(I − GA) = 0. However, (GA)*(I − GA) = 0 ⇒ (GA)* = (GA)*GA = GA, since (GA)*GA is hermitian. Conversely, if AGA = A and (GA)* = GA, then

(GA)*(I − GA) = (GA)* − (GA)*(GA) = GA − (GA)(GA) = GA − GAGA = GA − GA = 0.

So, G is a minimum norm g-inverse of A if and only if (i) holds. Therefore, it is enough to show (i) ⇒ (ii) ⇒ (iii) ⇒ (i).
(i) ⇒ (ii): Let (i) hold. Clearly, AGA = A ⇒ (GA)^2 = GA, so GA is idempotent. Also, (GA)* = GA implies GA is hermitian. So, GA is the orthogonal projector onto C(GA). However, GA = (GA)* = A*G* and ρ(A*G*) = ρ(GA) = ρ(A) = ρ(A*). Therefore, C(GA) = C(A*G*) = C(A*), proving that (ii) holds.
(ii) ⇒ (iii) is trivial.
(iii) ⇒ (i): GAA* = A* ⇒ GAA*G* = A*G* ⇒ GA(GA)* = (GA)*. Since GA(GA)* is hermitian, (GA)* = GA. Also, GAA* = A* ⇒ A(GA)* = A ⇒ AGA = A.
We now show that the minimum norm solution is unique.

Theorem 2.5.6. Let Ax = b be consistent. Then the minimum norm solution to Ax = b is unique.

Proof. Let G be one choice of A−_m. Then Gb is a minimum norm solution to Ax = b. Every solution to Ax = b is of the form Gb + (I − GA)ξ
for some vector ξ. Let, if possible, ‖Gb‖ = ‖Gb + (I − GA)ξ‖ for some ξ. Then

‖Gb + (I − GA)ξ‖^2 = ‖Gb‖^2 + ‖(I − GA)ξ‖^2 + b*G*(I − GA)ξ + ξ*(I − GA)*Gb.

Since b ∈ C(A), b = Au for some u. So, b*G*(I − GA)ξ = u*A*G*(I − GA)ξ = u*(GA)*(I − GA)ξ = 0. Also, ξ*(I − GA)*Gb = (b*G*(I − GA)ξ)* = 0. Therefore, ‖Gb‖^2 = ‖Gb‖^2 + ‖(I − GA)ξ‖^2, which is possible only if (I − GA)ξ = 0. Thus, the minimum norm solution is unique.

We now obtain the class of all minimum norm g-inverses of a matrix A. Observe that for the null matrix of order m × n, every matrix of order n × m is a minimum norm g-inverse.

Theorem 2.5.7. Let A be an m × n matrix and G be a minimum norm g-inverse of A. Then the class {A−_m} of all A−_m is given by {G + U(I − AG), U arbitrary}.

Proof. Let U be an arbitrary matrix. Then

(G + U(I − AG))AA* = GAA* = A*.

So, G + U(I − AG) is a minimum norm g-inverse of A for all U. Let G_1 be any minimum norm g-inverse of A. Then it is easy to check that G_1 = G + U(I − AG), where U = G_1 − G. (Note that GA = P_{A*} = G_1A.) Thus, the class of all minimum norm g-inverses is given by G + U(I − AG), where U is arbitrary.

Theorem 2.5.8. Let A be an m × n matrix of rank r (> 0). Let U diag(D, 0) V* be a singular value decomposition of A, where U and V are unitary and D is a positive definite diagonal matrix of order r × r. Then the class of all minimum norm g-inverses of A is given by

{V [D^{-1}, L; 0, N] U* : L, N arbitrary}.

Proof. Notice that the class of all g-inverses of A is given by

{V [D^{-1}, L; M, N] U* : L, M and N arbitrary}.

It is easy to check that GAA* = A* if and only if M = 0.
Theorem 2.5.9. Let A be an m × n matrix of rank r (> 0). The class {A−_mr} of all reflexive minimum norm g-inverses A−_mr of A is given by {A*(AA*)− : (AA*)− is an arbitrary g-inverse of AA*}.

Proof. In the proof of Theorem 2.5.3, we showed that A*(AA*)− is a minimum norm g-inverse of A. Notice that ρ(A*(AA*)−) ≤ ρ(A*). However, ρ(A*(AA*)−) ≥ ρ(A) = ρ(A*), as A*(AA*)− is a g-inverse of A. So, ρ(A*(AA*)−) = ρ(A). Hence, A*(AA*)− is an A−_mr.
Conversely, let G = A−_mr. As G is an A−_r and GA is hermitian, C(G) = C(GA) = C(A*G*) = C(A*). Thus G = A*T for some matrix T. Now, A* = GAA* = A*TAA*. Therefore, AA* = AA*TAA*, showing that T is a g-inverse of AA*.

Theorem 2.5.10. Let (P, Q) be a rank factorization of an m × n matrix A of rank r (> 0). Then G is an A−_mr if and only if G = Q*(QQ*)^{-1}P_L^{-1} for some left inverse P_L^{-1} of P.

Proof. We note that C(Q*) = C(A*). The rest of the proof is similar to that of (ii) of Theorem 2.4.4.

Let us now turn our attention to a possibly inconsistent system of linear equations. We are given a system of linear equations Ax = b which may or may not be consistent. In case it is consistent, we want to find a solution. Otherwise, the next best thing to do is to find an approximate solution x_0 such that Ax_0 is the closest to b among all Ax, i.e. ‖Ax_0 − b‖ ≤ ‖Ax − b‖ for all x. If such a vector exists, then for all x, (Ax_0 − b)*(Ax_0 − b) ≤ (Ax − b)*(Ax − b). Such an approximate solution is called a least (sum of) squares solution to Ax = b. We shall now explore the existence of a g-inverse G of A such that x = Gb is a least squares solution to Ax = b. Notice that a least squares solution must be an exact solution if Ax = b is consistent.

Theorem 2.5.11. Let A be an m × n matrix. Then there exists a g-inverse G of A such that x = Gb is a least squares solution to Ax = b for all b ∈ C^m.

Proof. Consider G = (A*A)−A*. By Theorem 2.3.12, G is a g-inverse of A. Now ‖Ax − b‖ = ‖Ax − AGb + AGb − b‖.
By Lemma 2.5.1, ‖AGb − b‖ ≤ ‖AGb − b + A(x − Gb)‖ for all b and for all x ⇔ A*(AG − I) = 0. Moreover, A*AG = A*A(A*A)−A* = A*, since C(A*A) = C(A*). Thus, G = (A*A)−A* is a g-inverse of A such that Gb is a least squares solution to Ax = b for all b.

Remark 2.5.12. If Gb has to be a least squares solution to Ax = b for all b, then AGb = b for all b ∈ C(A). So, G must be a g-inverse of A.

Definition 2.5.13. Let A be an m × n matrix. Then a matrix G is said to be a least squares g-inverse of A, denoted by A−_ℓ, if Gb is a least squares solution to Ax = b for all b ∈ C^m.

We shall now obtain some characterizations of A−_ℓ.

Theorem 2.5.14. Let A be an m × n matrix. Then a matrix G is a least squares g-inverse of A if and only if it satisfies any one of the following equivalent conditions:

(i) AGA = A and (AG)* = AG
(ii) AG = P_A, the orthogonal projector projecting vectors onto C(A)
(iii) A*AG = A*.

Proof. In the proof of Theorem 2.5.11, we observed that G is a least squares g-inverse of A ⇔ G satisfies (iii), i.e. A*AG = A*. Thus, it is enough to prove (iii) ⇒ (i) ⇒ (ii) ⇒ (iii). Let G be a matrix satisfying (iii). Then A*AGA = A*A. Now, C(A*A) = C(A*), so there exists a matrix T such that A = TA*A. Hence, A*AGA = A*A ⇒ TA*AGA = TA*A ⇒ AGA = A. Also, A*AG = A* ⇒ G*A*AG = G*A* = (AG)*. Since G*A*AG is hermitian, AG is hermitian, and therefore (i) holds. Let (i) hold. Then AG is an orthogonal projector onto C(AG). But AGA = A ⇒ C(AG) = C(A). So, AG = P_A. Thus (ii) holds. (ii) ⇒ (iii) is trivial.

Notice the striking similarity of the conditions in Theorems 2.5.5 and 2.5.14. Consequently, we have the following important duality between minimum norm g-inverses and least squares g-inverses.

Theorem 2.5.15. Let A be an m × n matrix. Then G is an A−_m if and only if G* is an (A*)−_ℓ.
Proof follows from Theorems 2.5.5 and 2.5.14, once we note that G is a g-inverse of A ⇒ G* is a g-inverse of A*.

Consider a system of linear equations Ax = b. Then x_0 is a least squares solution to Ax = b if and only if Ax_0 is the orthogonal projection of b onto C(A). Also, then b − Ax_0 is the orthogonal projection of b onto (C(A))^⊥. Hence, even though x_0 may not be unique, both the vectors Ax_0 and b − Ax_0 are unique.

Theorem 2.5.16. Let A be an m × n matrix and let G be a least squares g-inverse of A. Then the class of all least squares solutions to Ax = b is given by {Gb + (I − GA)ξ, ξ arbitrary}.

Proof. Clearly, A(Gb + (I − GA)ξ) = AGb for all ξ. Further, Gb is a least squares solution to Ax = b. Hence Gb + (I − GA)ξ is a least squares solution to Ax = b for all ξ. Let x_0 be a least squares solution to Ax = b. By the discussion just preceding this theorem, Ax_0 = AGb. So, A(x_0 − Gb) = 0. Thus, x_0 − Gb = (I − GA)ξ for some ξ. So, x_0 = Gb + (I − GA)ξ for some ξ.

The following Theorems 2.5.17-2.5.19 and Theorem 2.5.21 can be established along the lines of Theorems 2.5.7-2.5.10.

Theorem 2.5.17. Let A be an m × n matrix. Let G be a least squares g-inverse of A. Then the class of all A−_ℓ is given by {G + (I − GA)V : V arbitrary}.

Theorem 2.5.18. Let A be an m × n matrix of rank r (> 0). Let U diag(D, 0) V* be a singular value decomposition of A, where U and V are unitary and D is a positive definite diagonal matrix of order r × r. Then the class {A−_ℓ} of all least squares g-inverses of A is given by

{V [D^{-1}, 0; M, N] U* : M and N arbitrary}.

Theorem 2.5.19. Let A be an m × n matrix of rank r (> 0). The class {A−_ℓr} of all reflexive least squares g-inverses A−_ℓr of A is given by {(A*A)−A* : (A*A)− is an arbitrary g-inverse of A*A}.
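A least squares g-inverse is also easy to exhibit numerically. The following sketch of ours (Python/NumPy, real matrices; not from the text) uses the Moore-Penrose inverse as one admissible choice of (A*A)−:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 4))   # 6x4, rank 3
G = np.linalg.pinv(A.T @ A) @ A.T          # G = (A*A)^- A*

assert np.allclose(A @ G @ A, A)
assert np.allclose(A.T @ A @ G, A.T)       # condition (iii) of Theorem 2.5.14

b = rng.standard_normal(6)                 # a generally inconsistent system
x_ls = G @ b
# ||A x_ls - b|| is minimal among all x.
for _ in range(100):
    x = x_ls + rng.standard_normal(4)
    assert np.linalg.norm(A @ x_ls - b) <= np.linalg.norm(A @ x - b) + 1e-12
```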
Remark 2.5.20. For a matrix A, G is an A−_ℓr if and only if G is a g-inverse of A such that C(G*) = C(A). Recall that G is an A−_ρ if and only if C(G^t) ⊆ C(A^t).

Theorem 2.5.21. Let (P, Q) be a rank factorization of an m × n matrix A of rank r (> 0). Then G is an A−_ℓr if and only if G = Q_R^{-1}(P*P)^{-1}P* for some right inverse Q_R^{-1} of Q.

Notice the similarity between A−_χ and A−_mr. Are there counterparts to A−_m and A−_ℓ for index-1 matrices? Indeed there are, and we shall introduce them in Chapter 6. We have seen in Theorem 2.5.16 that, in general, there are many least squares solutions to Ax = b. We shall now show that there is a g-inverse G of A such that Gb has the minimum norm among all least squares solutions to Ax = b for all b ∈ C^m.

Theorem 2.5.22. Let A be an m × n matrix. Then there exists a g-inverse G of A such that Gb has the minimum norm among all least squares solutions to Ax = b for all b ∈ C^m.

Proof. Let G be a least squares g-inverse of A. Then the class of all least squares solutions to Ax = b is given by Gb + (I − GA)ξ, where ξ is arbitrary. Now, we are seeking a least squares g-inverse G such that ‖Gb‖ ≤ ‖Gb + (I − GA)ξ‖ for all b and ξ. By Lemma 2.5.1, ‖Gb‖ ≤ ‖Gb + (I − GA)ξ‖ for all b and ξ ⇔ G*(I − GA) = 0 ⇔ G*GA = G*. Consider G = A*(A*AA*)−A*. Then A*AG = A*AA*(A*AA*)−A* = A*, since ρ(A*AA*) = ρ(A*). So G is a least squares g-inverse of A. Again,

G*GAA* = (A*(A*AA*)−A*)* A*(A*AA*)−A*AA* = A{(A*AA*)−}*AA*, since ρ(A*AA*) = ρ(AA*).

As ρ(AA*) = ρ(A), it follows that G*GA = A{(A*AA*)−}*A = G*. Thus, G = A*(A*AA*)−A* is a g-inverse with the desired properties.

Remark 2.5.23. From the proof of Theorem 2.5.22, we see that Gb has the minimum norm among all least squares solutions if and only if A*AG = A* and G*GA = G*, or equivalently, G is an A−_ℓ and A is a G−_ℓ.
Theorem 2.5.24. Let A be an m × n matrix. Then G is a matrix such that Gb has the minimum norm among all least squares solutions to Ax = b for all b ∈ C^m if and only if any one of the following equivalent conditions holds:

(i) A*AG = A* and G*GA = G*
(ii) AG = P_A and GA = P_G
(iii) AGA = A, GAG = G, (AG)* = AG, (GA)* = GA.

Proof. (i) follows by Theorem 2.5.22. The equivalence of (i)-(iii) follows from Theorem 2.5.14.

Remark 2.5.25. Let Gb have the minimum norm among all least squares solutions to Ax = b for all b ∈ C^m. Then (iii) of Theorem 2.5.24 shows that G is a minimum norm, least squares, reflexive g-inverse of A.

Definition 2.5.26. Let A be an m × n matrix. Then an n × m matrix G satisfying any one of the equivalent conditions of Theorem 2.5.24 is called the Moore-Penrose inverse of A and is denoted by A†.

Theorem 2.5.27. Let A be an m × n matrix. Then the Moore-Penrose inverse A† of A is unique.

Proof. Let G_1 and G_2 be two choices of the Moore-Penrose inverse of A. Then G_1 = G_1AG_1 = G_1P_A = G_1AG_2 = P_{A*}G_2 = G_2AG_2 = G_2.

The following theorem is easy to prove.

Theorem 2.5.28. Let A be an m × n matrix. Then the Moore-Penrose inverse A† has the following properties:

(i) A*† = A†*.
(ii) (A†)† = A.
(iii) (A*A)† = A†A†*.
(iv) A† = (A*A)†A*.
(v) If R and S are unitary matrices such that the product RAS is defined, then (RAS)† = S*A†R*.

Remark 2.5.29. In Theorem 2.5.22, we gave an expression for A†, namely A† = A*(A*AA*)−A*. Thus, it follows that A† is the unique g-inverse G of A such that C(G) = C(A*) and C(G*) = C(A).

Remark 2.5.30. Recall that A^# = A(A^3)−A. Also, A^# is the unique g-inverse G of A such that C(G) = C(A) and C(G*) = C(A*).
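The expression of Remark 2.5.29 can be compared directly with a library implementation. A sketch of ours (Python/NumPy; pinv supplies one g-inverse of A*AA*, and the product is invariant under that choice):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 6))   # 5x6, rank 3

# Remark 2.5.29: A-dagger = A* (A* A A*)^- A*.
G = A.T @ np.linalg.pinv(A.T @ A @ A.T) @ A.T

assert np.allclose(G, np.linalg.pinv(A))   # agrees with NumPy's pinv

# The four Penrose conditions of Theorem 2.5.24(iii):
assert np.allclose(A @ G @ A, A) and np.allclose(G @ A @ G, G)
assert np.allclose((A @ G).T, A @ G) and np.allclose((G @ A).T, G @ A)
```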
In view of this similarity, it is worth asking when A† and A^# coincide.

Theorem 2.5.31. Let A be an m × m matrix of index 1 over C. Then A† and A^# coincide if and only if A is range-hermitian.

Proof follows from Remark 2.5.29 and Remark 2.5.30.

The following characterizations of the Moore-Penrose inverse are easy to establish.

Theorem 2.5.32. Let A be an m × n matrix of rank r (> 0). Let A = U diag(D, 0) V* be a singular value decomposition of A, where U and V are unitary and D is a positive definite diagonal matrix of order r × r. Then A† = V diag(D^{-1}, 0) U*. (Notice that we have a singular value decomposition of A†.)

Theorem 2.5.33. Let A be an m × n matrix of rank r (> 0). Let (P, Q) be a rank factorization of A. Then A† = Q*(QQ*)^{-1}(P*P)^{-1}P*.

We conclude this section by making the useful observation that, given an algorithm to compute a g-inverse of a matrix, we can compute the various types of g-inverses of a given matrix using this algorithm:

Reflexive g-inverse      : GAG, if G is a g-inverse of A
ρ-inverse                : (A^2)−A
χ-inverse                : A(A^2)−
Group inverse            : A(A^3)−A
Least squares g-inverse  : (A*A)−A*
Minimum norm g-inverse   : A*(AA*)−
Moore-Penrose inverse    : A*(A*AA*)−A*
In each of the above cases, we apply the algorithm for computing a g-inverse to a suitable matrix (for example, to A*AA* in the case of the Moore-Penrose inverse) and make an adjustment to it, be it a pre- and/or post-multiplication by a suitable matrix, to get the required type of g-inverse of the given matrix.
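The table translates almost verbatim into code. A compact sketch of ours (Python/NumPy), where pinv stands in for "an algorithm that returns a g-inverse" and A is taken square of index 1 so that the group inverse exists:

```python
import numpy as np

rng = np.random.default_rng(8)
ginv = np.linalg.pinv            # any g-inverse algorithm would do here

P = rng.standard_normal((4, 2)); Q = rng.standard_normal((2, 4))
A = P @ Q                        # square, rank 2, index 1 (QP non-singular a.s.)
G = ginv(A)

candidates = {
    "reflexive":     G @ A @ G,
    "rho":           ginv(A @ A) @ A,
    "chi":           A @ ginv(A @ A),
    "group":         A @ ginv(A @ A @ A) @ A,
    "least squares": ginv(A.T @ A) @ A.T,
    "minimum norm":  A.T @ ginv(A @ A.T),
    "Moore-Penrose": A.T @ ginv(A.T @ A @ A.T) @ A.T,
}
for name, H in candidates.items():
    assert np.allclose(A @ H @ A, A), name   # each is a g-inverse of A
```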
2.6 Generalized inverses of modified matrices
In this section, we study the g-inverse of matrices that have been modified in some suitable ways. The need for a g-inverse of modified matrices arises
in several situations. Consider a linear model Y = Xβ + η, where X is a known m × n matrix, β is an unknown but non-stochastic n-vector and η is a random m-vector such that its expected value E(η) = 0 and dispersion matrix D(η) = σ^2 I, where σ^2 (> 0) is an unknown constant. Then a least squares estimator of β is β̂ = X−_ℓ Y = (X^tX)−X^tY. If p^tβ is estimable, then p^tβ̂ is its BLUE with variance V(p^tβ̂) = p^t(X^tX)−p σ^2. Suppose one more uncorrelated observation is available, namely y_{m+1} = x^t_{m+1}β + η_{m+1}, where E(η_{m+1}) = 0, V(η_{m+1}) = σ^2 and Cov(η_{m+1}, η) = 0. Then a least squares estimator after incorporating the new observation may be obtained as

β̃ = ((X^t : x_{m+1})^t)−_ℓ (Y^t : y_{m+1})^t = (X^tX + x_{m+1}x^t_{m+1})−(X^tY + y_{m+1}x_{m+1}).

Also, the variance V(p^tβ̃) of the BLUE of p^tβ in the new model is p^t(X^tX + x_{m+1}x^t_{m+1})−p σ^2. Thus, after incorporating the new observation, we need

((X^t : x_{m+1})^t)−_ℓ and (X^tX + x_{m+1}x^t_{m+1})−.
In order to assess the influence of an observation on the estimators of model parameters, one needs to delete a row from X or subtract a rank-1 matrix from X^tX and obtain a suitable g-inverse of the modified matrix. When a new state is included in a Markov chain or a new port is introduced in an n-port network, modified matrices and their g-inverses of suitable types become important in the study of the effect of such modifications. Let A be an m × n matrix and G be a g-inverse of A. Here we will consider the following types of modifications: (i) appending/deleting a row or a column, (ii) appending a row and a column, and (iii) adding a rank-1 matrix of the same order. In each case, we obtain a suitable modification of G so that it is a g-inverse of the modified matrix. The following lemma is useful in what follows:

Lemma 2.6.1. Let A be an m × n matrix of rank r (> 0) and let u_1, u_2, . . . , u_k be m-vectors such that ρ(A : u_1 : . . . : u_k) = ρ(A) + k, where k ≤ m − r. Then there exists a vector c such that c^tA = 0 and c^tu_i = 1 for i = 1, . . . , k.
Proof. Let (P, Q) be a rank factorization of A. Then (P : u_1 : . . . : u_k) is an m × (r + k) matrix of rank r + k. Let (R^t : v_1 : . . . : v_k)^t be a left inverse of (P : u_1 : . . . : u_k), where R is an r × m matrix. Then v_i^tP = 0 for i = 1, . . . , k and v_i^tu_j = δ_{ij}, i, j = 1, . . . , k. Let c = v_1 + v_2 + . . . + v_k. Clearly, c^tP = 0, whence c^tA = c^tPQ = 0, and c^tu_i = 1 for i = 1, . . . , k.

Corollary 2.6.2. If the matrices considered are over the complex field, then there exists a vector c such that c*A = 0 and c*u_i = 1 for all i = 1, . . . , k.

We shall start with the first type of modification. Let A be an m × n matrix and let a g-inverse G (of A) of a suitable type be available. Let a be an m-column vector. Write d = Ga. We shall show in Theorems 2.6.3-2.6.10 that a corresponding type of g-inverse of (A : a) can be expressed in the form

X = (G^t − cd^t : c)^t     (2.6.1)

for some suitable c.

Theorem 2.6.3. Let A be an m × n matrix and G be a g-inverse of A. Let a be an m-column vector and d = Ga. If a ∉ C(A), c is an m-column vector such that c^tA = 0, c^ta = 1 (such a c is guaranteed by Lemma 2.6.1), and X is as defined in (2.6.1), then

(i) X is (A : a)− if G is A−.
(ii) X is (A : a)−_r if G is A−_r.
(iii) Let all the vectors and matrices be over C. Then X is (A : a)−_m if G is an A−_m.

Theorem 2.6.4. Let A, a, G and d be as specified in Theorem 2.6.3, but with complex elements. Let a ∉ C(A) and c = (I − AG)a / (a*(I − AG)a). Write X = (G* − cd* : c)*. Then

(i) X is (A : a)−_ℓ if G is A−_ℓ.
(ii) X is (A : a)† if G is A†.
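Theorem 2.6.3(i) is easy to confirm numerically. A sketch of ours (Python/NumPy; the vector c below, built from the orthogonal complement of C(A), is one particular choice satisfying Lemma 2.6.1):

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))   # 4x3, rank 2
G = np.linalg.pinv(A)                        # one g-inverse of A

a = rng.standard_normal(4)                   # lies outside C(A) a.s.
u = a - A @ np.linalg.pinv(A) @ a            # component of a orthogonal to C(A)
c = u / (u @ a)                              # c^t A = 0 and c^t a = 1
d = G @ a

X = np.vstack([G - np.outer(d, c), c])       # X = (G^t - c d^t : c)^t
B = np.column_stack([A, a])                  # the modified matrix (A : a)
assert np.allclose(B @ X @ B, B)             # X is a g-inverse of (A : a)
```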
Theorem 2.6.5. Let A, a, G, d be as in Theorem 2.6.3. Let a ∈ C(A). Then

(i) X is (A : a)− if G is an A− and c is arbitrary.
(ii) X is (A : a)−_r if G is an A−_r and c ∈ C(G^t) is arbitrary.
(iii) Assuming that the vectors and matrices are over C, X is (A : a)−_ℓ if G is an A−_ℓ and c is arbitrary.
Theorem 2.6.6. In the setup of Theorem 2.6.4, let c = G*Ga / (1 + a*G*Ga). Then

(i) X is an (A : a)−_m if G is A−_m.
(ii) X is (A : a)† if G is A†.
Suppose that a g-inverse X = (G^t : b)^t of a suitable type for (A : a) is available, where G is an n × m matrix and b is an m-column vector. We shall now find a modification of X so that it is a g-inverse of A. Interestingly, G itself is a g-inverse of A when a ∉ C(A). We shall probe this further in the following.

Lemma 2.6.7. Let X = (G^t : b)^t be a g-inverse of (A : a). Then a ∉ C(A) if and only if b^ta = 1 and b^tA = 0. Further, if a ∉ C(A), then AGa = 0 and X_1 = (G^tA^tG^t : b)^t is a g-inverse of (A : a). Further, if the vectors and matrices are over C, then the following holds: let X = (G* : b)* be a g-inverse of (A : a). Then a ∉ C(A) if and only if b*a = 1 and b*A = 0.

Theorem 2.6.8. In the setup of Lemma 2.6.7, let a ∉ C(A). Then

(i) G is an A− if (G^t : b)^t is (A : a)−.
(ii) G is an A−_r if (G^t : b)^t is (A : a)−_r.

Assume that the matrices and the vectors are over C. Then

(iii) G is an A−_m if (G* : b)* is (A : a)−_m.
(iv) G is an A−_ℓ if (G* : b)* is (A : a)−_ℓ.
(v) G is A† if (G* : b)* is (A : a)†.

We shall now consider the case when a ∈ C(A).

Theorem 2.6.9. In the setup of Lemma 2.6.7, let X = (G^t : b)^t be a g-inverse of (A : a) and let b^ta ≠ 1. Write G_1 = G(I + ab^t/(1 − b^ta)). Then

(i) G_1 is A−.
(ii) G_1 is A−_r if X is (A : a)−_r.

Let the matrices and the vectors be over C. Let X = (G* : b)* be a g-inverse of (A : a) and let b*a ≠ 1. Write G_1 = G(I + ab*/(1 − b*a)). Then

(iii) G_1 is A−_ℓ if X is (A : a)−_ℓ.
(iv) G_1 is A−_m if X is (A : a)−_m.
(v) G_1 is A† if X is (A : a)†.
The case that remains is b^ta = 1 and b^tA ≠ 0 over a general field, and b*a = 1 and b*A ≠ 0 over C. Since b^tA ≠ 0 (respectively b*A ≠ 0 over C), there exists a positive integer j such that b^ta_j ≠ 0 (respectively b*a_j ≠ 0 over C), where a_j is the jth column of A. Define c = a + a_j. Let E be the matrix obtained from G by replacing its jth row g_j^t (g_j* over C) by g_j^t − b^t (respectively g_j* − b* over C). Then clearly (E^t : b)^t (respectively (E* : b)* over C) is a g-inverse of (A : c). Also b^tc = b^ta + b^ta_j ≠ 1 (respectively b*c = b*a + b*a_j ≠ 1 over C). Thus, we have

Theorem 2.6.10. In the setup of Lemma 2.6.7, let X = (G^t : b)^t be a g-inverse of (A : a). Moreover, let b^ta = 1 and b^tA ≠ 0. Write G_2 = E(I − (a + a_j)b^t/(b^ta_j)). Then

(i) G_2 is A−.
(ii) G_2 is A−_r if X is (A : a)−_r.

Consider the matrices and the vectors over C. Let X = (G* : b)* be a g-inverse of (A : a) and let b*a = 1 and b*A ≠ 0. Write G_2 = E(I − (a + a_j)b*/(b*a_j)). Then

(iii) G_2 is A−_ℓ if X is (A : a)−_ℓ.

Proofs of Theorems 2.6.3-2.6.10 are computational. For detailed proofs one may consult [Mitra and Bhimasankaram (1971)].

Remark 2.6.11. Consider matrices and vectors over C. Let (G* : b)* be a minimum norm g-inverse of (A : a). Then b*a = 1 if and only if b*A = 0. This can be shown as follows: since (G* : b)* is (A : a)−_m, we have Ga = A*b. Therefore, b*a = 1 ⇔ AGa = 0 ⇔ AA*b = 0 ⇔ A*b = 0. Thus, X = (G* : b)* can never be a minimum norm g-inverse or the Moore-Penrose inverse (left out in Theorem 2.6.10) in the setup considered in Theorem 2.6.10.

Remark 2.6.12. Notice that G^t is a g-inverse of A^t if G is a g-inverse of A. Further, G* is a minimum norm g-inverse of A* if G is a least squares g-inverse of A. With this observation one can deal with the case of a g-inverse of A vs. (A^t : b)^t or (A* : b)* (in the case of vectors and matrices over C).

We now consider a g-inverse of A vs. B = [A, u; v^t, a], where u and v are column vectors and a is a scalar. Suppose a particular type of g-inverse G of A is
available to us and we want to compute a g-inverse of the same type of B = [A, u; v^t, a]. Using the formulae so far developed, we can achieve this in two stages. First we compute a g-inverse H of T = (A : u). Then, using H, the formulae derived so far and Remark 2.6.12, we compute a g-inverse of B = [T; w], where w = (v^t : a).

We give below another interesting way of obtaining a g-inverse of M = [A, B; C, D] in a single stage.

Let A be non-singular. Then the matrix M = [A, B; C, D] is non-singular if and only if F = D − CA^{-1}B is non-singular. Also,

M^{-1} = [A^{-1}, 0; 0, 0] + [A^{-1}B; −I] F^{-1} (CA^{-1} : −I)
       = [A^{-1} + A^{-1}BF^{-1}CA^{-1}, −A^{-1}BF^{-1}; −F^{-1}CA^{-1}, F^{-1}].

Remark 2.6.13. The matrix F = D − CA^{-1}B is known as the Schur complement of A in M.

We shall now explore the availability of the above form for a g-inverse of the partitioned matrix [A, B; C, D] when a g-inverse G of A is available.

Theorem 2.6.14. Let a g-inverse G of A be available. Let M = [A, B; C, D] and F = D − CGB. Then

R = [G, 0; 0, 0] + [GB; −I] F− (CG : −I)

is a g-inverse of M if and only if

C(C(I − GA)) ⊆ C(F), C(((I − AG)B)^t) ⊆ C(F^t) and (I − AG)BF−C(I − GA) = 0,     (2.6.2)

or equivalently,

ρ(M) = ρ(A) + ρ(F).     (2.6.3)

Proof. The proof of 'R is an M− if and only if (2.6.2) holds' follows easily from the verification of the condition MRM = M. To prove that (2.6.2) and (2.6.3) are equivalent, we proceed as follows. Write

P = [I, 0; −CG, I], Q = [I, −GAGB; 0, I], S = [I, −(I − AG)BF−; 0, I] and T = [I, 0; −F−FF−C(I − GA), I].

Notice that P, Q, S and T are all non-singular matrices, and a straightforward computation shows that SPMQT = diag(A, F) + Z, where

Z = [−(I − AG)BF−C(I − GA), (I − AG)B(I − F−F); (I − FF−)C(I − GA), 0].

It is easy to check that ρ(M) = ρ(SPMQT) = ρ(diag(A, F) + Z) = ρ(diag(A, F)) + ρ(Z) = ρ(A) + ρ(F) + ρ(Z). Hence, ρ(M) = ρ(A) + ρ(F) if and only if Z = 0, that is, if and only if (2.6.2) holds.

Several remarks are in order.

Remark 2.6.15. The matrix F = D − CGB, where G is a g-inverse of A, is called a Schur complement of A in M. Schur complements play an important role in the study of shorted operators. We shall study them in detail in Chapter 10.

Remark 2.6.16. If G is a reflexive g-inverse of A, and F− is a reflexive g-inverse of F, then M is always a g-inverse of R.

Remark 2.6.17. If C(C^t) ⊆ C(A^t) and C(B) ⊆ C(A), then (i) F is invariant under choices of g-inverse of A and (ii) R is a g-inverse of M.

Remark 2.6.18. The following statement is equivalent to the statement of Remark 2.6.17:
Matrix Decompositions and Generalized Inverses
53
If M = (Xt1 , Xt2 )t (Y1 , Y2 ), where ρ(X1 Y1 ) = ρ(X1 ) = ρ(Y1 ) (Here A = X1 Y1 , B = X1 Y2 , C = X2 Y1 and D = X2 Y2 ) then (i) F is invariant under the choices of g-inverses of A and (ii) R is a g-inverse of M. Theorem 2.6.19. Let M be an nnd matrix over C. Then R is a g-inverse of M. Remark 2.6.20. Let M be as in Theorem 2.6.19. − (i) Let G be an A− ` and F a least squares g-inverse of F. Then R is an − M` if and only if C(CG) ⊆ C(F). − (ii) Let G be an A− m and F a minimum norm g-inverse of F. Then R is t t an M− m if and only if C((GB) ) ⊆ C(F ). † † (iii) Let G be A and F be the Moore-Penrose inverse of F. Then R is M† if and only if C(CA† ) ⊆ C(F) and C((A† B)† ) ⊆ C(Ft ).
Obtaining a g-inverse of M in the most general case is a bit more complicated. A B Theorem 2.6.21. Let M = . For a matrix W, let us write C D Ew = I−WW− and Hw = I−W− W, where W− is some g-inverse of W. Let F = D − CA− B, L = EA B, K = CHA and N = HL (EK FHL )− EK . Then − A − A− BL− EA − HA K− CA− − HA K− FL− EA HA K− U= − L EA 0 HA K− F + A− B N(FL− EA + CA− − I) + −I is a g-inverse of M. Proof.
Check MUM = M.
We shall now consider the third type of modification. Notice that a matrix is of rank 1 if and only if it is of the form abt for some non-null vectors a and b. Theorem 2.6.22. Let A be an m × n matrix and A− be a g-inverse of A. Let a be an m-column vector and b an n-column vector. Let us write
54
Matrix Partial Orders, Shorted Operators and Applications
E = I − AA− , H = I − A− A and β = 1 + bt A− a. If a 6∈ C(A), let c be a column vector as per Lemma 2.6.1 so that ct A = 0 and ct a = 1. If b 6∈ C(At ), let d be a column vector such that dt b = 1 and Ad = 0. Then the following hold: (i) A− −A− act −dbt A− +βdct is a g-inverse of A+abt , when a 6∈ C(A) and b 6∈ C(At ). (ii) A− − β −1 A− abt A− is a g-inverse of A + abt , when β 6= 0, a ∈ C(A) or b ∈ C(At ) or both. (iii) A− − dbt A− is a g-inverse of A + abt , when β = 0, a ∈ C(A) and b 6∈ C(At ). (iv) A− − A− act is a g-inverse of A + abt , when β = 0, a 6∈ C(A) and b ∈ C(At ). (v) A− is a g-inverse of A + abt , when β = 0, a ∈ C(A) and b ∈ C(At ). Proof is straightforward. We shall now give an expression for (A+ab? )† in terms of A† in various cases. Theorem 2.6.23. Let A be an m×n matrix over C and let A† be available. Let a be an m-column vector and b an n-column vector. Write k = A† a, h = A†? b, E = I − AA† , F = I − A† A and β = 1 + b? A† a. Let u = Ea, v = F? b. Then the following hold: (i) A† − ku† − v?† h? + βv?† u† is (A + ab? )† , when a 6∈ C(A) and b 6∈ C(A? ). (ii) A† − kk† A† − v?† h? is (A + ab? )† , when β = 0, a ∈ C(A) and b 6∈ C(A? ). (iii) A† + (1/β ? )vk? A† − (β ? /σ1 )p1 q?1 is (A + ab? )† , where σ1 = kkk2 kvk2 + β 2 , p1 = ((kkk2 /β ? )v + k), and q1 = ((kβfvk/β ? ) A†? k + h) , when β 6= 0 , a ∈ C(A). (iv) A† + A† h†? h? − ku† is (A + ab? )† , when β = 0, a 6∈ C(A) and b ∈ C(A? ). (v) A† + (1/β ? )A† hu? − (β ? /σ2 )p2 q2 ? is (A + ab? )† , where p2 = (1/b? )kuk2 A† h + k, q2 = (1/β ? )khk2 u + h and σ2 = khk2 kuk2 + β 2 , when β 6= 0 and b ∈ C(A? ). (vi) A† − kk† A† − A† h†? h + k† A† h†? kh? is (A + ab? )† , when β = 0, a ∈ C(A) and b ∈ C(A? ). Proof.
See [Campbell and Meyer (1991)] for a proof.
Matrix Decompositions and Generalized Inverses
55
Group inverse of a modified matrix will be dealt with in detail in Chapter 13.
2.7
Simultaneous diagonalization
When we study matrix partial orders, parallel sums, and shorted operators in the later chapters of this monograph, we generally deal with a pair of matrices. Sometimes we study the common features of several matrices together as in Fisher-Cochran type theorems. Studying such relationships or common features is easy if the matrices involved are diagonal. In this section, we put together some results on simultaneous diagonalization of matrices by using (a) non-singular transformations, (b) contra-gradient transformations (c) unitary transformations and (d) similarity transformations. We also study the simultaneous diagonalization of a pair of (possibly) rectangular matrices using unitary transformations and simultaneous singular value decomposition of several (but finite in number) rectangular matrices. We then study the generalized singular value decomposition of a pair of matrices. Using the results mentioned above, we define generalized eigen-values and generalized singular values, which play a role in many applications. In this section, we work with matrices and vectors over the field of complex number C. Theorem 2.7.1. Let A1 , . . . Ak be hermitian matrices of the same order. Then there exists a unitary matrix U such that U? Aj U is diagonal for j = 1, . . . , k if and only if Ai and Aj commute for i, j = 1, . . . , k (or equivalently Ai Aj is hermitian for i, j = 1, . . . , k). Theorem 2.7.2. (a) Let A1 and A2 be hermitian matrices of the same order and let A1 be positive definite. Then there exists a non-singular matrix P such that P? A1 P = I and P? A2 P is a real diagonal matrix. (b) Let A1 , A2 be hermitian matrices of the same order, A1 be non-negative definite and C(A2 ) ⊆ C(A1 ). Then there exists a non-singular matrix P such that P? A1 P = diag(Ir , 0) and P? A2 P = diag(D, 0); where r = ρ(A1 ) and D is a real diagonal matrix of order r × r. Corollary 2.7.3. Let A1 and A2 be nnd matrices of the same order. Then there exists a non-singular matrix P such that P? A1 P and P? A2 P are diagonal with non-negative diagonal elements.
56
Matrix Partial Orders, Shorted Operators and Applications
For proofs of Theorems 2.7.1, 2.7.2 and Corollary 2.7.3 see [Rao and Bhimasankaram (2000)]. Theorem 2.7.4. Let A1 and A2 be hermitian matrices of the same order. Then there exists a non-singular matrix T such that the matrices T? A1 T and T−1 A2 (T−1 )? are both diagonal if and only if A1 A2 is semi- simple with real eigen-values and ρ(A1 A2 ) = ρ(A1 A2 A1 ). Corollary 2.7.5. Consider the same setup as in Theorem 2.7.4. Further, let A1 be nnd. Then there exists a non-singular matrix T such that matrices T? A1 T and T−1 A2 (T−1 )? are both diagonal if and only if ρ(A1 A2 A1 ) = ρ(A1 A2 ). To prove Corollary 2.7.5, observe that A1 is nnd ⇒ A1 = QQ? for some matrix Q. Now, the non-null eigen-values of A1 A2 are the same as those of Q? A2 Q including the algebraic and geometric multiplicities. Further, since Q? A2 Q is hermitian, it is also semi-simple and has real eigen-values. If ρ(A1 A2 ) = ρ(A1 A2 A1 ), then ρ(A1 A2 ) = ρ(A1 A2 A1 ) = ρ(A1 A2 A1 A2 ) and so, the algebraic and geometric multiplicities of zero eigen-value of A1 A2 are equal. Hence, A1 A2 is semi-simple with real eigen-values. Now, the result follows by Theorem 2.7.4. Corollary 2.7.6. Let A1 and A2 be nnd matrices of the same order. Then there exists a non-singular matrix T such that the matrices T? A1 T and T−1 A2 (T−1 )? are both diagonal. Theorem 2.7.7. Let A1 , . . . , Ak be hermitian matrices of the same order and let A1 be non-singular. Then there exists a non-singular matrix P such that the matrix P? Ai P is diagonal for each i = 1, . . . , k if and only if (i) Ai A−1 1 is semi-simple with real eigen-values for i = 1, . . . , k and (ii) Ai A−1 A = Aj A−1 j 1 1 Ai for i, j = 1, . . . , k. Corollary 2.7.8. Consider the same set up as in Theorem 2.7.7 and let A1 be positive definite. Then there exists a non-singular matrix P such that the matrix P? Ai P is diagonal for each i = 1, . . . , k if and only if −1 Ai A−1 1 Aj = Aj A1 Ai for i, j = 1, . . . , k. Corollary 2.7.9. Let A1 , . . . , Ak be hermitian matrices of the same order such that B = A1 +. . .+Ak is nnd. Also, let C(Ai ) ⊆ C(B) for i = 1, . . . , k. Then there exists a non-singular matrix P such that the matrix P? Ai P is diagonal for each i = 1, . . . , k if and only if Ai B− Aj = Aj B− Ai for i, j = 1, . . . k, where B− is any g-inverse of B.
Matrix Decompositions and Generalized Inverses
57
Corollary 2.7.10. Let A1 , . . . , Ak be hermitian matrices of the same order such that C(Ai ) ⊆ C(A1 ) for i = 1, . . . , k. Then there exists a nonsingular matrix P such that P? Ai P is diagonal for i = 1, . . . , k if and − only if (i) Ai A− 1 is semi-simple with real eigen-values and (ii) Ai A1 Aj = − − Aj A1 Ai for each i, j = 1, . . . , k, where A1 is any g-inverse of A1 . For a proof of Theorem 2.7.7, see [Bhimasankaram (1971a)]. Corollaries 2.7.8-2.7.10 easily follow from Theorem 2.7.7. Theorem 2.7.11. (Similarity Transformation) Let A1 , . . . , Ak be semisimple matrices of the same order. Then the following are equivalent: (i) A1 , . . . , Ak commute pair-wise (ii) A1 , . . . , Ak can be expressed as polynomials in a common semi-simple matrix B and (iii) There exists a non-singular matrix S such that S−1 Ai S is diagonal for i = 1, . . . , k. For a proof see [Bhimasankaram (1971a)] and [Ben-Israel and Greville (2001)]. We now study the simultaneous diagonalization of (possibly) rectangular matrices. Theorem 2.7.12. Let A1 , A2 be m × n matrices. Then there exist unitary matrices U and V of order m × m and n × n respectively such that U? Ai V is diagonal with real diagonal elements, i = 1, 2 if and only if A1 A?2 and A?1 A2 are hermitian. Proof. ‘Only if’ part is trivial. ‘If’ part From Theorem 2.2.45, there exist unitary matrices P and Q such that P? A1 Q = diag(∆ , 0), where ∆ is a positive definite diagonal matrix. B C Write P? A2 Q = , where the partitioning is conformable for adD E dition of P? A1 Q and P? A2 Q. Since A1 A?2 is hermitian, so is P? A1 A?2 P.
58
Matrix Partial Orders, Shorted Operators and Applications
Now, P? A1 A?2 P = P? A1 QQ? A?2 P ? ∆ 0 B D? = 0 0 C? E? ∆B? ∆D? = . 0 0 Thus, ∆B? is hermitian and D = 0. Let ∆ = diag(δ1 , . . . , δr ) and B = [bij ]. ∆B? = B∆ ⇒ δi b?ji = bij δj for all i and j. Similarly, ‘A?1 A2 is hermitian’ implies that δi bij = δj b?ji for all i and j and and C = 0. So, bij = b?ji for all i and j, and therefore, B is hermitian. Now, P? A2 Q = diag(B, E), where B is hermitian and B and ∆ commute. So, by Theorem 2.7.1, there exists a unitary matrix R such that R? ∆R = ∆, R? BR = Λ where Λ is a real diagonal matrix. Let E = XΓY? be a singular value decomposition of E where X and Y are unitary and Γ is diagonal matrix with non-negative diagonal elements. Write U = Pdiag(R, X) and V = Qdiag(R, Y). Clearly, U and V are unitary, U? A1 V = diag(R? , X? ) P? A1 Q diag(R, Y) = diag(∆, 0) and U? A2 V = diag(R? , X? )P? A2 Qdiag(R, Y) = diag(R? , X? )diag(B, E)diag(R, Y) = diag(Λ, Γ). Thus, U? A1 V and U? A2 V are both diagonal and U, V are unitary.
Remark 2.7.13. In the proof of Theorem 2.7.12 we have b?ji = bij (δi /δj ) = bij . Thus, δi = δj whenever bij 6= 0. Notice that all δ’s are positive. So, it is easy to see that there exists a permutation matrix T such that T? ∆T = diag(δ1 I, . . . , δs I) and T? BT = diag(B1 , . . . , Bs ) for some s where δi I and Bi are of the same order for i = 1, . . . , s and the graph of the incidence matrix corresponding to Bi is connected for i = 1, . . . , s. We now prove a theorem which enables us to diagonalize several matrices simultaneously. Before we do so, we need the following lemmas: Lemma 2.7.14. Let at = (a1 , . . . , ak ) and bt = (b1 , . . . , bk ) be complex vectors such that for i = 1, . . . , k max{|ai |, |bi |} 6= 0. Then there exists a real scalar θ such that the vector c = a + θb has no null components i.e. ci 6= 0 for all i. If vectors a and b are real, then c is also real.
Matrix Decompositions and Generalized Inverses
Proof.
59
Let α = max{|ai |}. If b = 0, take θ = 0. So c = a has no null i
components. Now let b 6= 0. Let bi1 , . . . , bir be the non-null components of α+1 b. Write β = min |bij |. Let θ = . Consider i such that 1 ≤ i ≤ k. Let j β bi = 0. Then ci = ai + θbi = ai 6= 0. Hence, ci 6= 0. Now, let bi 6= 0. Now, α+1 α+1 ci = ai + θbi = 0 ⇒ ai = −θbi = − bi . So, |ai | = |bi | ≥ α + 1, β β which is a contradiction, since |ai | ≤ α. Hence, ci 6= 0. Lemma 2.7.15. Let u1 , u2 , . . . , uk be any finite number of complex vectors such that max{|uij |} 6= 0 for all j, where uij is the j th component of ui . i
Then there exist real scalars θ1 , θ2 , . . . , θk such that no component of the vector v = θ1 u1 + θ2 u2 + . . . , θk uk is null. Proof. Apply Lemma 2.7.14 recursively, noting that there is real linear combination of vectors u1 , u2 no component of which is null. Theorem 2.7.16. Let A1 , . . . Ak be a finite number of matrices of the same order m × n. Then there exist unitary matrices U and V of order m×m and n×n respectively such that U? Ai V is diagonal with real diagonal elements if and only if A?i Aj and Ai A?j are hermitian for i, j = 1, . . . , k. Proof. ‘Only if’ part is trivial. ‘If’ part By Theorem 2.7.12, the result is true for k = 2. Let the result be true for k = s. So, there exist unitary matrices P and Q such that Di = P? Ai Q is real and diagonal for i = 1, . . . , s. Without loss of generality we can assume Di = diag(∆i , 0), i = 1, . . . , s where ∆1 . . . , ∆s are of same order (i) (i) and max{|δj |} 6= 0 for all j (here δj is the j th diagonal element of ∆i ). i
By Lemma 2.7.15, there exists a real linear combination s X
µi ∆i
i=1
which is a non-singular matrix. Write ?
P As+1 Q = where B is of Since, A?i As+1 s X B ( µi Di ) D i=1
B C D E
the same order as ∆i (all ∆i are of same order). and Ai A?s+1 are hermitian for i = 1, . . . , s, we have ? s X C B C and ( µi Di )? are also hermitian. By E D E i=1
60
Matrix Partial Orders, Shorted Operators and Applications
Theorem 2.7.12, it follows that B is hermitian, C = 0 and D = 0. Since A?i Aj and Ai A?j are hermitian for i, j = 1, . . . , s + 1, it follows that ∆1 . . . , ∆s and B commute pair-wise and are hermitian. So, there exists a unitary matrix R such that R? ∆i R are diagonal with real diagonal elements i = 1, . . . , s. Let S and T be unitary matrices such that S? ET is diagonal (singular value decomposition of E). Write U = P diag(R, S) and V = Q diag(R, T). Clearly, U and V are unitary and U? Ai V is diagonal with real diagonal elements for i = 1, . . . , s + 1. The result now follows by induction on s. Definition 2.7.17. The matrices A1 , . . . , Ak of the same order are said to have simultaneous singular value decomposition if there exist unitary matrices U and V such that U? Ai V is a diagonal matrix with non-negative diagonal elements for i = 1, . . . , k. Theorem 2.7.18. (Simultaneous Singular Value Decomposition) (i) Let A1 and A2 have simultaneous singular value decomposition. Then A?1 A2 , A1 A?2 are nnd. (ii) Let A1 and A2 be matrices of same order such that A?1 A2 and A1 A?2 are hermitian and at least one of them is nnd. Then A1 and A2 have simultaneous singular value decomposition. Proof. (i) is trivial. (ii) Since A1 A?2 and A?1 A2 are hermitian by Theorem 2.7.12, there exist unitary matrices U and V such that Ai = U∆i V? where ∆i is a diagonal matrix with real diagonal elements δji , j = 1, . . ., min(m, n), i = 1, 2 where A and B are matrices of order m × n. Let A1 A?2 be nnd. Then ∆1 ∆2 is nnd. Clearly ∆2 ∆1 is also nnd. Thus, a diagonal element of ∆1 is negative if and only if the corresponding diagonal element of ∆2 is negative. If ith diagonal element of ∆1 is negative replace ui , the ith column of U by −ui and δ1i and δ2i by −δ1i and −δ2i . Repeat the same for all negative diagonal elements of ∆1 . Call the resultant matrices obtained from U, ∆1 and ∆2 as W, Λ1 and Λ2 . Then Ai = WΛi V? when W as V are unitary, Λi is diagonal nnd, i = 1, 2. Hence Ai = WΛi V? is a singular value decomposition for i = 1, 2. Theorem 2.7.19. (Simultaneous Singular Value Decomposition for Several Matrices) Let A1 , . . . , Ak be matrices of the same order. (i) If A1 , . . . , Ak have simultaneous singular value decomposition, then A?i Aj and Ai A?j are nnd for i, j = 1, . . . , k. (ii) Let A1 , . . . , Ak be matrices such that A?i Aj and Ai A?j are hermitian
Matrix Decompositions and Generalized Inverses
61
and at least one of them is nnd for i, j = 1, . . . , k. Then A1 , . . . , Ak have simultaneous singular value decomposition. Proof. We shall prove both (i) and (ii) by induction on k. The result is true for k = 2, by Theorem 2.7.18. Let it be true for k = s. Consider A1 , . . . , As+1 such that A?i Aj and Ai A?j are hermitian and at least one of them is nnd for i, j = 1, . . . , s + 1. By induction hypothesis, A1 , . . . , As have simultaneous singular value decomposition. So, there exist unitary matrices P and Q such that P? Ai Q = Di where Di is diagonal nnd for i = 1, . . . , s. Without loss of generality, we can take Di = diag(∆i , 0), Ps where each ∆i is nnd and i ∆i is positive definite. (This is achieved by permuting the rows and the same columns of each of ∆i - the same permutation transformation for each i - such that the null rows and columns are moved to be the last rows and last columns.) Ps Since, A?i As+1 and Ai A?s+1 are hermitian for each i, ( 1 A?i )As+1 and Ps ( 1 Ai )A?s+1 are also hermitian. Moreover, X X P? A?i Q = diag ∆i , 0 . P B C Let P As+1 Q = , where B has same order as ∆i , we notice F E that B is nnd, C = 0 and F = 0. Since A?i A?s+1 are hermitian, ∆i B is hermitian, ∆i , B commute for i = 1, . . . , s. So, by Theorem 2.7.1, there exists a unitary matrix R such that R? ∆i R for i = 1, . . . , s and R? BR are each diagonal with real non-negative diagonal elements. Let E = SΓT? be singular value decomposition of E. Now construct U and V as in Theorem 2.7.16 and check that U? Ai V is diagonal with real non-negative diagonal elements for i = 1, . . . , s + 1. ?
Theorem 2.7.20. (Generalized Singular Value Decomposition) Let A and B be matrices of orders m × n and p × n respectively. Let r = ρ(At : Bt )t and max{m, p} ≥ r. Then there exist unitary matrices U and V and a non-singular matrix Z such that A = U∆1 Z and B = V∆2 Z, where ∆1 and ∆2 are diagonal matrices with non-negative diagonal elements. Before we prove the Theorem 2.7.20, we shall prove a lemma that is also of independent interest. Lemma 2.7.21. Let C and D be n × m matrices such that C? C = D? D. Then there exists a unitary matrix T such that D = TC.
62
Matrix Partial Orders, Shorted Operators and Applications
Proof. Let C? C = D? D = Pdiag(∆2 , 0)P? be a spectral decomposition, where ∆2 is a positive definite diagonal matrix and P is unitary. Then the singular value decompositions of C and D are given by C = Qdiag(∆ , 0)V? and D = Rdiag(∆ , 0) V? where Q and R are unitary matrices. Let T = RQ? . Then T is unitary and D = TC. We now prove the Theorem 2.7.20. Proof. Consider A? A and B? B. By Corollary 2.7.3, there exists a nonsingular matrix Z such that A? A = Z? Γ1 Z and B? B = Z? Γ2 Z, where Γ1 and Γ2 are diagonal matrices with non-negative diagonal elements. Since ρ(A? A + B? B) = ρ(A? : B? )? = r, Γ1 +Γ2 has exactly r diagonal elements which are positive. Let Γ1 = ∆?1 ∆1 , Γ2 = ∆?2 ∆2 where ∆1 , ∆2 are of orders m × n and p × n respectively, with all diagonal elements nonnegative and all the off-diagonal elements zero. It is now clear that A? A = Z? ∆?1 ∆1 Z and B? B = Z? ∆?2 ∆2 Z. So, by Lemma 2.7.21, there exist unitary matrices U and V such that A = U∆1 Z and B = V∆2 Z. Remark 2.7.22. Suppose r ≤ max{m, p} in Theorem 2.7.20. Let m ≥ p. Then the matrix ∆?1 ∆1 + (∆2 ? : 0)(∆2 t : 0)t , has exactly r positive elements, where (∆2 t : 0)t is of order m × n. But this is not possible if m < r. The case where m < p can be disposed of similarly. That is why we require the condition max{m, p} ≥ r = ρ(At : Bt ). We shall now present, in brief, the generalized eigen-values and generalized singular values. Definition 2.7.23. We say a non-zero scalar λ is a generalized eigen-value of A with respect to B if there exists a non-null vector x such that Ax = λBx. Such a vector x is called a generalized eigen-vector of A with respect to B corresponding to the generalized eigen-value λ. Let A and B be hermitian matrices of the same order with B positive definite. Then by Theorem 2.7.2, there exists a non-singular matrix P such that P? BP = I and P? AP = Λ, a diagonal matrix. So, AP = (P? )−1 Λ = BPΛ. Let Pi be the ith column of P. Then APi = ((P? )−1 Λ)i = λi BPi , where λi ’s are diagonal elements of Λ. Thus, all the non-null diagonal elements of Λ are the generalized eigen-values of A with respect to B and the columns Pi of P are generalized eigen-vectors. Notice that λi are the non-zero eigen values of B−1 A (and are same as non-null eigen-values of AB−1 ).
Matrix Decompositions and Generalized Inverses
63
Definition 2.7.24. Consider the generalized singular value decomposition as in Theorem 2.7.20. Consider the common non-null diagonal elements of δ1ij ∆1 and ∆2 . Let them be δ1i1 , . . . , δ1is and δ2i1 , . . . , δ2is . The scalars , δ2ij j = 1, . . . , s are called the generalized singular values of A with respect to B. − ? Notice that Z−1 ∆− 2 V is a g-inverse of B where ∆2 is any g-inverse of ∆2 . Now, construct the particular g-inverse G of ∆2 as follows: gij = 0, when1 if δ2ii 6= 0 and 0 otherwise. Let B− = Z−1 GV? . ever i 6= j and gii = δ2ii Then U∆1 GV? is a singular value decomposition of AB− , where ∆1 G is a diagonal nnd matrix. Further, the non-null diagonal elements of ∆1 G are precisely the generalized singular values of A with respect to B. This shows the similarity between the concepts of generalized eigenvalues and generalized singular values. For more details on generalized eigen-values, see [Rao and Mitra (1971)]. For more details on generalized singular values, see [Golub and Van Loan (1996)].
64
2.8
Matrix Partial Orders, Shorted Operators and Applications
Exercises
All matrices in the following exercises are over C, the field of complex numbers. (1) Let A be an idempotent matrix. Prove that ρ(A) = tr(A). However the converse is not true even if A is semi-simple. (2) Let E be any idempotent matrix. Show that E† E? = E? E† = E† . (3) Let A be a square matrix of index 1 such that A? = A† . Then prove that (i) A# = AA? A# = A# A? A and (ii) (A2 A? )† = A# AA? and (A? A2 )† = A? AA# A? A = A# AA? . (iii) Is it necessary that A# is a partial isometry? (4) Let A and B be matrices of the same order such that A ia rangehermitian and B is hermitian. Show that AB† A = A implies A is hermitian. (5) Let P and Q be two orthogonal projectors of order n × n. Then prove the following statements: (i) PQ is similar to a diagonal matrix. (ii) PQ is an orthogonal projector if and only if PQ is range hermitian. (iii) n − ρ(I − PQ) = d(C(P) ∩ C(Q)). (6) Let A be an n × n matrix. Then prove the following: (i) (ii) (iii) (iv)
A is range-hermitian ⇔ A† = A# . A is range-hermitian ⇔ AA† = A† A. A is of index ≤ 1 ⇔ C(A) and N (A) are complementary. Every range-hermitian matrix is of index 1.
(7) Let A be a partial isometry of index 1. Should A# be a partial isometry? (8) Prove or disprove: A#† = A†# . (9) Let A be an n × m matrix. Prove that A is unitarily similar to the maΣK ΣL , where KK? +LL? = Ir , Σ = diag(σ1 Ir1 , . . . , σt Irt ); trix 0 0 r1 + . . . + rt = r = ρ(A) and σ1 > . . . σt > 0. (Hint: Use Singular value decomposition.) (10) Prove or disprove: (i) A semi-simple matrix has index not greater than 1 and (ii) A tripotent matrix is semi-simple.
Matrix Decompositions and Generalized Inverses
65
(11) Let A be an m × n matrix. Then show that A is partial isometry ⇔ AA? is an orthogonal projector ⇔ A? A is an orthogonal projector. (12) Let an m × n matrix A be a partial isometry. Then prove that A is normal ⇔ A is range hermitian. (13) Prove or disprove: for a square matrix A, both the matrices A and A2 are partial isometries implies A is normal. (14) Show that an m × n matrix A is a partial isometry of rank r ⇔ A has exactly r singular values each equal to 1 and others are null. (15) Show that a hermitian partial isometry is a tripotent matrix. Show 1 0 0 the converse statement is not true. (Hint: Take A = 0 −1 1 .) 0 0 0 (16) Show that an n×n matrix A is range-hermitian if and only if A] = A† if and only if A† is a polynomial in A. All matrices in the following exercises are over F, an arbitrary field. (17) Let A and C be any two matrices with A of index 1. Then prove the following: (i) A# C = 0 ⇔ AC = 0 and (ii) CA# = 0 ⇔ CA = 0. (18) Let A be a tripotent matrix. Then prove the following: A2 is idempotent. −A is tripotent. The only eigen-values of A are −1, 0 and 1. ρ(A) = tr(A2 ). A is its own g-inverse. Conversely, if B is its own inverse then B is tripotent. (vi) For a tripotent matrix A, tr(A2 + A) = twice the number of positive eigen-values, tr(A2 − A) = twice the number of negative eigenvalues and tr(I − A2 ) = the number of null eigen-values of A.
(i) (ii) (iii) (iv) (v)
(19) Prove that A is of index not greater than 1 and A2 is idempotent ⇔ A is tripotent. (20) Let A, B be square matrices of the same order. Prove that AB(AB)− A = A if and only if ρ(AB) = ρ(A) and B(AB)− (AB) = B if and only if ρ(AB) = ρ(B). (21) Let A be a square matrix and (F, G) be a rank factorization of A. Then prove that A has a reflexive g-inverse X with C(X) = C(A) and N (X) = N (A) if and only if GF is invertible and if this is so, X = F(GF)−2 G = A] .
Chapter 3
The Minus Order
3.1
Introduction
After the first two introductory chapters, we are finally ready to explore the beautiful world of matrix partial orders. Historically these partial orders were defined through various generalized inverses. While it is true that the structure and mechanisms of these orders are more basic than the involvement of generalized inverse, we shall see that the use of generalized inverses makes the exposition elegant. Also, several properties can be expressed in a pleasant way in the process. In Section 3.2, we begin with the study of one of the most basic matrix order relations, namely, the space pre-order. This is the stepping stone for almost all the partial orders that we shall study subsequently and therefore, we give a detailed account of its properties. This pre-order can be extended to a partial order with minor additions to its definition as we shall see in Section 3.3. Since this extension does not lend to any further properties than the space pre-order itself, we leave it at that and define a very important and the most fundamental partial order (that implies space pre-order and also the extension) called the minus order. We call it fundamental because the partial orders studied in the three subsequent chapters are built on this partial order by putting additional restrictions. We obtain several of its characterizations in Section 3.3. In Section 3.4, given a matrix A, we obtain the class of all matrices B such that A is below B under the minus order (written as A <− B). Assuming that B is given, we also obtain the class of all matrices A such that A is below B under the minus order. In Section 3.5, assuming that A <− B, we characterize the class of all g-inverses of A such that A− A = A− B and AA− = BA− . One of the characterising properties for A to be below B under minus order is A = PB = BQ for some projectors P and Q. We also characterize the 67
68
Matrix Partial Orders, Shorted Operators and Applications
class of all projectors P and Q such that A = PB = BQ when A <− B. Section 3.6 is devoted to the study of special properties of the minus order for idempotent matrices. In Section 3.7, we present some properties of the minus order for hermitian and nnd (complex) matrices. We emphasize that all the matrices in this chapter are (a) possibly rectangular and (b) over an arbitrary field unless stated otherwise. 3.2
Space pre-order
A pre-order is a reflexive and transitive relation (see Appendix A). The space pre-order, which we define shortly, is the big daddy of (the most basic among) all the order relations that we study in this and subsequent chapters. We show that the space pre-order, as its name suggests, is a preorder. We also obtain some characterizations of this pre-order which will be useful in what follows. Definition 3.2.1. Let A and B be matrices (possibly rectangular) having the same order. Then A is said to be below B under the space pre-order, if C(A) ⊆ C(B) and C(At ) ⊆ C(Bt ). We denote the space pre-order by ‘<s ’ and write A <s B, whenever A is below B under ‘<s ’. It is easy to check that <s is reflexive and transitive. Hence, it is a pre-order. However, it is not a partial order, for, if A and B are distinct non-singular matrices of the same order n × n, then C(A) = C(B) = C(At ) = C(Bt ) = Fn . More specifically, take A=
1 0 0 1
and B =
1 1 . 0 1
Thus A <s B, B <s A, but A 6= B, showing ‘<s ’ is not anti-symmetric. We give some more examples of the space pre-order. Example 3.2.2. If A = diag(a1 , . . . , an ) and B = diag(b1 , . . . , bn ), then A <s B if and only if bi 6= 0 whenever ai 6= 0. Example 3.2.3. Let A and B be nnd matrices of the same order (over the field of complex numbers). Then A <s A + B. Remark 3.2.4. Let A and B be matrices of the same order. In view of Remark 2.3.4, we have A <s B ⇔ A = BB− A = AB− B = BB− AB− B
The Minus Order
69
for all B− . So, A <s B ⇔ A = BMB for some matrix M. Thus, ρ(A) ≤ ρ(B) if A <s B. Remark 3.2.5. Let A and B be matrices of the same order. Let P and Q be non-singular matrices of appropriate orders so that the products PAQ and PBQ are defined. Then it is obvious that C(A) ⊆ C(B) ⇔ C(PAQ) = C(PA) ⊆ C(PB) = C(PBQ). Similarly, C(At ) ⊆ C(Bt ) ⇔ C((PBQ)t ) ⊆ C((PBQ)t ). Thus, A <s B ⇔ PAQ <s PBQ, for all non-singular matrices P and Q of suitable orders. In other words, the space pre-order is invariant under equivalent transformations. The following theorem gives some useful characterizations of the space pre-order. Theorem 3.2.6. Let A and B be matrices of the same order. The following are equivalent: (i) A <s B, (ii) Let (P, Q) be a rank factorization of B. Then A = PTQ for some matrix T and (iii) Let (L, M) be a rank factorization of A. If ρ(A) = ρ(B), then B = LRM for some non-singular matrix R, and if ρ(A) < ρ(B), then there exist matrices E andF and the a non-singular matrix R such that M M matrices (L : E) and ·· have full rank and B = (L : E)R ·· . F F Proof. (i) ⇒ (ii) Let (L, M) be a rank factorization of A. Since A <s B, C(L) = C(A) and C(P) = C(B), we have C(L) ⊆ C(P). So, L = PT1 for some matrix T1 . Similarly, C(Mt ) ⊆ C(Qt ) implies Mt = Qt T2 for some matrix T2 . Therefore, A = LM = PT1 Tt2 Q = PTQ, where T = T1 Tt2 . (ii) ⇒ (i) Since A = PTQ, C(A) ⊆ C(P) = C(B) and C(At ) ⊆ C(Qt ) = C(Bt ). (i)⇒(iii) Let ρ(A) = ρ(B). As in the proof of (i) ⇒ (ii), it is easy to see that there exists a matrix R such that B = LRM. Let ρ(A) = a. Then R is an a × a matrix. Further, since L has a left inverse and M has a right inverse, we have a = ρ(A) = ρ(B) = ρ(LRM) = ρ(R). Hence, R is non-singular.
70
Matrix Partial Orders, Shorted Operators and Applications
Now, let ρ(B) > ρ(A). Since C(A) ⊆ C(B), there exists a full column rank matrix E such that columns of (L : E) form a basis of C(B). Similarly we M can find a full row rank matrix F such that the rows of ·· form a basis F of the row space of B. Let (G, H) be a rank factorization of B. Then G = (L : E)R1 , where R1 is non-singular. Similarly,
M H = R2 ·· , F where R2 is non-singular. Hence M B = GH = (L : E)R ·· , F where R = R1 R2 is non-singular. (iii) ⇒ (i) If ρ(A) = ρ(B), the proof is trivial. So, let ρ(B) > ρ(A). Let E and F be as in the proof of (i) ⇒ (iii). Notice that C(A) = C(L) ⊆ C(L : E) = C(B) and C(At ) = C(Mt ) ⊆ C((Mt : Ft )) = C(Bt ). Therefore, A <s B. The characterizations of the space pre-order in Theorem 3.2.6 enable us to obtain the class of matrices, which are below a given matrix B under ‘<s ’ (using (ii)) and the class of all matrices which are above a given matrix A under <s (using (iii)). Remark 3.2.7. In Theorem 3.2.6, the statements (ii) and (iii) can be equivalently written as (ii)0 Let B = Pdiag(I, 0)Q; where P and Q are non-singular. Then A = Pdiag(T, 0) Q for some matrix T and (iii)0 Let A = Sdiag(Ia , 0b−a , 0)W where S and W are non-singular. If ρ(B) = b ≥ a, then B = Sdiag(K, 0)W, where K is a non-singular matrix of order b × b respectively. Remark 3.2.8. Notice that C(At ) ⊆ C(Bt ) if and only if N (B) ⊆ N (A). Also C(A) ⊆ C(B) if and only if N (Bt ) ⊆ N (At ). Hence A <s B, if and only if N (B) ⊆ N (A) and N (Bt ) ⊆ N (At ).
The Minus Order
71
Theorem 3.2.9. Let A and B be matrices having the same order. Then A <s B if and only if AB− A is invariant under the choices of B− . Proof. If A = 0, then result follows trivially. If A 6= 0, the result follows from Theorem 2.3.11.
Let us now specialize to the case of square matrices of index 1. The following theorem can be proved along the lines of Theorem 3.2.6. Theorem 3.2.10. Let A and B be square matrices of order n × n with ρ(A) = a and ρ(B) = b. Let each of A and B have index not greater than 1. Then the following are equivalent: (i) A <s B, (ii) Let B = Pdiag(Db , 0)P−1 , where Db is a b × b non-singular matrix and P is a non-singular matrix. Then A = Pdiag(T, 0) P−1 , where T is a b × b matrix of index not greater than 1 and (iii) There exist non-singular matrices Q, D1 and D2 of orders a × a and b × b such that A = Qdiag(D1 , 0) Q−1 , and B = Qdiag(D2 , 0) Q−1 . Remark 3.2.11. Let A and B be square matrices of index ≤ 1 such that A <s B. Let A = Qdiag(Da , 0) Q−1 , where Da is non-singular. Then it is not necessarily true that B = Qdiag(E, 0) Q−1 , where E is non-singular. However, if A <s B, there exist nonsingular matrices Q, D and E of appropriate orders such that A = Q diag(D, 0)Q−1 and B = Q diag(E, 0)Q−1 . We now consider matrices over the field of complex numbers C and give one more characterization of the space pre-order using singular value decomposition. Theorem 3.2.12. Let A and B be matrices in Cm×n . The following are equivalent: (i) A <s B, (ii) Let B = Pdiag(∆, 0)Q∗ be a singular value decomposition of B, where P and Q are unitary matrices of appropriate orders and ∆ is a positive definite diagonal matrix. Then A = Pdiag(T, 0)Q∗ for some matrix T of the same order as ∆ and (iii) There exist unitary matrices U and V, a positive definite diagonal matrix D and a nonsingular matrix S of order not less than that of D such that A = U diag(D, 0)V∗ is a singular value decomposition of A and B = U diag(S, 0)V∗ . Proof can be given along the lines of Theorem 3.2.6.
72
3.3
Matrix Partial Orders, Shorted Operators and Applications
Minus order - Some characterizations
In the previous section we studied an order relation which is a pre-order, namely, the space pre-order. We also noted that it is not anti-symmetric and hence is not a partial order. It is easy to see that the following minimal modification to the definition of the space pre-order yields a partial order. Define if ρ(A) = ρ(B) A = B s+ A < B if . A <s B if ρ(A) < ρ(B) It is clear that ‘<s+ ’ is a partial order. This is the weakest partial order that implies the space pre-order. Unfortunately this partial order does not exhibit any special properties not possessed by the space pre-order. We now introduce a partial order which implies the space pre-order and also possesses interesting properties like the ones mentioned below. Let us consider A = diag(Ia , 0, 0) and B = diag(Ia , Ic , 0). It is reasonable to say A precedes B or A is below B. The same status can be accorded to matrices PAQ and PBQ, where P and Q are non-singular. Moreover, it is easy to see that A <s B and B is obtained from A by simply adding C = diag(0, Ic , 0). Notice that the column spaces of A and C are virtually disjoint and so are their row spaces. Thus, B = A ⊕ C. Such a matrix A can be thought as an independent section of B or we may say that A precedes B in some sense. In the previous section we saw that A <s B if and only if AB− A is invariant under the choices of g-inverse B− of B. Further, if AB− A = A, then {B− }, the set of the generalized inverses of B is a subset of {A− }. Thus, once again we can say A precedes B in some sense. In this section, we show that all the above requirements lead to an identical ordering between matrices of same order. In this chapter we make an in-depth study of this order. Consider the mathematically elegant relationship defined using generalized inverses as follows: Definition 3.3.1. Let A and B be matrices of the same order. Then A is said to be below B under the minus order if there exist generalized inverses G1 and G2 of A such that AG1 = BG1 and G2 A = G2 B. When A is below B under the minus order, we write A <− B. We shall show that the relation in Definition 3.3.1 is identical with the various ways of A preceding B described in the second paragraph of this
The Minus Order
73
section. We shall also show that the minus order is indeed a partial order on F m×n . The minus order was first introduced by Hartwig on the set of regular elements of a semi-group. He developed the order relation as an extension of the standard or natural partial order on the set of idempotent elements of a semi-group and in particular, of elements of F m×n . This order also extends the natural partial order of Vagner on inverse semi-groups and the star order of Drazin (which is studied in detail in Chapter 5). We first show that whenever A <− B, there is no loss of generality in taking the g-inverses G1 and G2 of A to be the same in Definition 3.3.1. Theorem 3.3.2. Let A and B be matrices having the same order. Then A <− B if and only if there exists a g-inverse A− of A such that A− A = A− B and AA− = BA− . Proof. ‘If’ part is trivial. ‘Only if’ part Let G1 and G2 to be the g-inverses of A such that AG1 = BG1 and G2 A = G2 B. Let G = G1 AG2 . Then G is a g-inverse of A and satisfies AG = BG, GA = GB. Remark 3.3.3. (i) In view of Theorem 3.3.2, we may take G1 and G2 to be the same in Definition 3.3.1. Henceforth, we do so without making a special mention. (ii) In the proof of Theorem 3.3.2, notice that G = G1 AG2 is a reflexive g-inverse of A. Hence, it is clear that if A <− B, then there exists a reflexive − − − − g-inverse A− r of A such that AAr = BAr and Ar A = Ar B. Clearly, 0 <− B for any matrix B, where matrices 0 and B have the same order. Also, if B <− 0, then B = 0. Henceforth, we shall consider A and B to be non-null in our study of the minus order. Any m×n matrix is a representation of some linear transformation from F n to F m under specified bases for F n and F m . Thus, if P and Q are nonsingular matrices of suitable orders so that the matrix PAQ is defined, then A and PAQ represent the same linear transformation. In view of Remark 3.2.5, the space pre-order is also a pre-order on the linear transformations. Our next Theorem shows the same is true for the minus order. Theorem 3.3.4. Let A and B be matrices of the same order and let P and Q be non-singular matrices of appropriate orders so that the matrix PAQ is defined. Then A <− B if and only if PAQ <− PBQ.
74
Matrix Partial Orders, Shorted Operators and Applications
Proof. ‘Only if’ part Let A <− B. Then there exists a g-inverse A− of A such that A− A = A− B and AA− = BA− . In view of Remark 2.3.7, C = Q−1 A− P−1 is a g-inverse of PAQ. Moreover, CPAQ = CPBQ and PAQC = PBQC. Thus, we have PAQ <− PBQ. ‘If’ part follows from the proof of ‘only if’ part once we observe that A = P−1 (PAQ)Q−1 and B = P−1 (PBQ)Q−1 . We now establish the assertion made earlier about the equivalence of seemingly different relationships mentioned in the beginning of this section. Theorem 3.3.5. Let A and B be non-null matrices of the order m × n. Let ρ(A) = a and ρ(B) = b. Then the following are equivalent: A <− B A <s B and AA− = BA− for some g-inverse A− of A − − A <s B and AA− r = BAr for some reflexive g-inverse Ar of A − − − t AA = BA for some g-inverse A of A and C(A ) ⊂ C(Bt ) If (P, Q) is a rank factorization of B, then A = PTQ for some idempotent matrix T (vi) There exist non-singular matrices R and S of orders m×m and n×n respectively such that
(i) (ii) (iii) (iv) (v)
A = Rdiag(Ia , 0, 0)S and B = Rdiag(Ia , Ib−a , 0)S (vii) B = A ⊕ (B − A) and (viii) {B− } ⊆ {A− }. Proof. (i)⇒(ii) and (ii)⇔(iii)⇔(iv) are trivial. (iii)⇒(v) Let (P, Q) be a rank factorization of B. Since A <s B, by Theorem 3.2.6(ii), we have B = PQ and A = PTQ for some matrix T. Further, every reflexive −1 −1 − −1 g-inverse A− r of A is of the from QR Tr PL for some right inverse QR − − − of Q, for some left inverse P−1 L of P and for some Tr . Now AAr = BAr −1 − −1 −1 − −1 − for some reflexive g-inverse Ar , so, PTQQR Tr PL = PQQR Tr PL − − or TT− r = Tr . Clearly, T = Tr T is idempotent. (v) ⇒ (vi) Since T is idempotent with ρ(T) = a, there exists a non-singular matrix −1 R1 such that T = R1 diag(Ia , 0)R−1 1 . So B = PQ = PR1 R1 Q, and −1 −1 A = PR1 diag(Ia , 0)R1 Q. Clearly, ρ(PR1 ) = ρ(R1 Q) = b. Also, −1 (PR1 , R−1 1 Q) is a rank factorization of B. Hence, PR1 and R1 Q can be extended to non-singular matrices R and S respectively. It is easy to see
The Minus Order
75
that A = Rdiag(Ia , 0, 0)S and B = Rdiag(Ia , Ib−a , 0)S. Proof for (vi) ⇒ (vii) is trivial. (vii) ⇒ (viii) Clearly, C(A) ⊆ C(B) and C(At ) ⊆ C(Bt ). So, for each B− , A = AB− B = AB− A + AB− (B − A) or A − AB− A = AB− (B − A). However, C(A − AB− A)t ⊆ C(AB− (B − A))t and C(AB− (B − A))t ⊆ C(B − A)t . So, C(A − AB− A)t ∩ C(AB− (B − A))t ⊆ C(At ) ∩ C(B − A)t = {0}. Hence, A − AB− A = 0 or equivalently A = AB− A. Thus, B− ∈ {A− }. (viii) ⇒ (i) Since A = AB− A for all B− , we have A <s B. Let B− and A− be any g-inverses of B and A respectively. Write G1 = B− AA− and G2 = A− AB− . Then it is easy to verify that G1 and G2 are g-inverse of A and AG1 = BG1 , G2 A = G2 B. Corollary 3.3.6. If A and B are non-null matrices of the same order such that A <− B and ρ(A) = ρ(B), then A = B. Remark 3.3.7. Let A and B be non-null matrices of the same order such that A is a full rank matrix and A <− B, then A = B. So, A is a maximal element of F m×n under ‘<− ’. Remark 3.3.8. The condition {B− } ⊆ {A− } in (viii) of Theorem 3.3.5 − can be replaced by {B− r } ⊆ {A }. Remark 3.3.9. Statement (vii) of Theorem 3.3.5 is equivalent to saying ρ(B) = ρ(A) + ρ(B − A). Remark 3.3.10. Let A and B be non-null matrices of same order such that A <− B. Then A and B have simultaneous normal form. Remark 3.3.11. In Theorem 3.3.5, the statements (ii), (iii) and (iv) can equivalently be stated as (ii)0 A <s B and A− A = A− B for some g-inverse A− of A − − (iii)0 A <s B and A− r A = Ar B for some reflexive g-inverse Ar of A 0 − − − (iv) A A = A B for some g-inverse A of A and C(A) ⊂ C(B).
76
Matrix Partial Orders, Shorted Operators and Applications
Remark 3.3.12. In Theorem 3.3.5, we proved that A <s B together with AA− = BA− for some g-inverse A− of A is equivalent to A <− B. It is easy to see (in view of Remark 2.3.4) that A <s B together with A = AB− A for some B− is also equivalent to A <− B. Remark 3.3.13. Let T1 and T2 be linear transformations from F n to F m . Let ρ(T1 ) = a and ρ(T2 ) = b, b ≥ a. In view of (i) ⇔ (vi) of Theorem 3.3.5, T1 <− T2 if and only if there exists bases {u1 , . . . , un } and {v1 , . . . , vm } of F n and F m respectively such that ( vi 1 ≤ i ≤ a T1 ui = 0 otherwise and T2 ui =
( vi
1≤i≤b
0
otherwise
.
We now show that the minus order is indeed a partial order. Theorem 3.3.14. The minus order is a partial order on F m×n . Proof. Reflexivity holds trivially. In view of Theorem 3.3.5 (viii), transitivity also holds. To prove anti-symmetry, observe that by Remark 3.3.9, A <− B ⇒ ρ(B) = ρ(A) + ρ(B − A). So, ρ(B) ≥ ρ(A). Similarly, we have B <− A ⇒ ρ(A) ≥ ρ(B). Combining the two inequalities, we have ρ(B) = ρ(A). Therefore, ρ(B − A) = 0 or equivalently B − A = 0. Hence, B = A. We now give some more characterizing properties of the minus order. Notice that if A <− B, then there exists a g-inverse A− of A such that AA− = BA− and A− A = A− B. Hence A = AA− A = AA− B = BA− A. Thus, there are projectors P(= AA− ) and Q(= A− A) such that A = PB = BQ. It turns out that this is a characterizing property of the minus order. However, before we exhibit this, we prove the following: Lemma 3.3.15. Let A, B be matrices such that the product AB is defined. Then ρ(AB) = ρ(B) − d(C(B) ∩ N (A)). Proof. Let ρ(B) = r and d(C(B) ∩ N (A)) = s. Let C and D be matrices of order n×s and n×(r−s) respectively such that C(C) = C(B)∩N (A) and C(C : D) = C(B). Clearly, C and D are matrices of full column rank. Since C(C) = C(B) ∩ N (A), we have AC = 0 and so, C(AD) = C(AB). Also,
The Minus Order
77
C(C)∩C(D) = {0}. Further, if ADu = 0, then Du ∈ C(B)∩N (A) = C(C). Conversely, if Du ∈ C(B) ∩ N (A) = C(C), then Du = Cv for some v. Hence ADu = ACv = 0. Now, ADu = 0 ⇔ Du ∈ C(C). Hence, Du ∈ C(C) ∩ C(D) = {0}, so, Du = 0. Trivially, Du = 0 ⇒ ADu = 0. Thus, ρ(AD) = ρ(D). As C(AD) = C(AB), ρ(AB) = ρ(D) = r−s. Hence, ρ(AB) = ρ(B) − d(C(B) ∩ N (A)). (Alternative Proof ): Consider the matrix S = B − B(AB)− (AB). Clearly, for any g-inverse B− of B, SB− S = S, SB− B = S and BB− S = S, so, S <− B, by Remark 3.3.12. It is easy to see that C(S) ⊆ C(B) ∩ N (A). Further, let x ∈ C(B) ∩ N (A). Then x = Bz and Ax = 0. So, ABz = 0. Since S <− B, we have x = Bz = Sz + (B(AB)− (AB))z = Sz and so, x ∈ C(S). Hence, C(S) = C(B) ∩ N (A). Now, ρ(B) = ρ(S) + ρ(B(AB)− (AB)) = d(C(B) ∩ N (A)) + ρ(AB), since A(B(AB)− (AB)) = AB. Thus, ρ(AB) = ρ(B) − d(C(B) ∩ N (A)). Theorem 3.3.16. Let A and B be matrices of the same order. Then the following are equivalent: (i) (ii) (iii) (iv)
A <− B, There exist projectors P and Q such that A = PB = BQ, B − A <− B and ρ(B − A) = ρ((I − AA− )B) = ρ(B(I − A− A)) for any A− .
Proof. (i) ⇒ (ii) Since A <− B, there exists a g-inverse A− of A such that AA− = BA− and A− A = A− B. Choose P = AA− and Q = A− A. It is easy to check that P and Q are projectors. (ii) ⇒ (i) For any g-inverse B− of B we have, AB− A = PBB− BQ = PBQ = AQ = BQ2 = BQ = A. Thus, AB− A = A. This gives {B− } ⊆ {A− } and therefore, A <− B. The equivalence of (i) and (iii) follows from the equivalence of (i) and (vii) of Theorem 3.3.5. (iii) ⇔ (iv) Notice that ρ((I − AA)− )B = ρ((I − AA− )(B − A)) = ρ(B − A) − d(C(B − A) ∩ N (I − AA− )) = ρ(B − A) − d(C(B − A) ∩ C(A)) = ρ(B − A) ⇔ C(B − A) ∩ C(A) = {0}.
78
Matrix Partial Orders, Shorted Operators and Applications
Similarly, ρ(B(I − A− A)) = ρ(B − A) ⇔ C((B − A)t ) ∩ C(At ) = {0}. Let A and B be matrices of the same order such that A <− B. It is easy to see that there exist g-inverses A− of A and B− of B such that A− <− B− and also B− <− A− . In fact, we can make a stronger statement as shown below. Theorem 3.3.17. Let A and B be matrices of the same order such that A <− B. Let G be a g-inverse of B, ρ(A) = a, ρ(B) = b and ρ(G) = g. Then there exists non-singular matrices P and Q such that A = Pdiag(Ia , 0, 0, 0)Q, B = Pdiag(Ia , Ib−a , 0, 0)Q and G = Q−1 diag(Ia , Ib−a , Ig−b , 0)P−1 . Proof. By Theorem 3.3.5 ((i) ⇒ (vi)), there exist non-singular matrices R, S such that A = Rdiag(Ia , 0, 0, 0)S and B = Rdiag(Ia , Ib−a , 0, 0)S. Since, G is a g-inverse of B, we have
Ia 0 L1 G = S−1 0 Ib−a L2 R−1 M1 M2 N for some matrices L1 , L2 , M1 , M2 and N of appropriate orders. Write
Ia 0 0 Ia 0 L1 T = 0 Ib−a 0 and W = 0 Ib−a L2 . M1 M2 I 0 0 I Clearly, T and W are non-singular matrices. Moreover, we have G = S−1 Tdiag(Ia , Ib−a , N − M1 L1 − M2 L2 )WR−1 and ρ(N − M1 L1 − M2 L2 ) = g − b. Let X−1 diag(Ig−b , 0)Y−1 be a normal form of the matrix N−M1 L1 −M2 L2 . Write, P = RW−1 diag(Ib : Y) and Q = diag(Ib : X)T−1 S. Clearly, A = Pdiag(Ia , 0, 0, 0)Q, B = Pdiag(Ia , Ib−a , 0, 0)Q and G = Q−1 diag(Ia , Ib−a , Ig−b , 0)P−1 .
The Minus Order
79
Theorem 3.3.18. Let A and B be matrices of the same order such that A <− B. Let G and H respectively be g-inverses of B and A such that G <− H. Then H is a g-inverse of B. Proof. In view of Theorem 3.3.17, there exist non-singular matrices P and Q such that A = Pdiag(Ia , 0, 0, 0)Q, B = Pdiag(Ia , Ib−a , 0, 0)Q and G = Q−1 diag(Ia , Ib−a , Ig , 0)P−1 , where ρ(A) = a, ρ(B) = b and ρ(G) = g. Let V be a g-inverse of G such that GV = HV and VG = VH. Then V must be of the form Ia 0 0 L1 0 Ib−a 0 L2 Q V = P 0 0 Ib−a L3 M1 M2 M3 N for some matrices L1 , L2 , L3 , M1 , M2 , M3 and N of appropriate orders. Proceeding as in Theorem 3.3.17, we can find non-singular matrices R, S such that A = Rdiag(Ia , 0, 0, 0, 0)S, B = Rdiag(Ia , Ib−a , 0, 0, 0)S and G = S−1 diag(Ia , Ib−a , Ig−a , 0, 0)R−1 and V = S−1 diag(Ia , Ib−a , Ig−b , Iv−g , 0)R−1 , where v = ρ(V). Since H is a g-inverse of Ia D12 D13 D21 D22 D23 H = P D31 D32 D33 D41 D42 D43 D51 D52 D53
A, it must be of the form D14 D15 D24 D25 D34 D35 Q. D44 D45 D54 D55
Since, GV = HV and VG = VH, D22 = Ib−a , D33 = Ig−b , D44 = Iv−g and Dij = 0 for all i, j = 1, 2, 3, 4 with i 6= j. Thus, Ia 0 0 0 D15 0 Ib−a 0 0 D25 H = P D31 D32 Ig−b 0 D35 Q. 0 0 0 Iv−g D45 0 0 0 0 D55 It is now easy to check that H is a g-inverse of B.
We have seen in Theorem 3.3.5, that for any two matrices A1 and A2 , Ai <− (A1 + A2 ) if and only if A1 and A2 are disjoint matrices. We shall now extend this result for several matrices and their sum. We shall see in
80
Matrix Partial Orders, Shorted Operators and Applications
later chapters that this is a prelude to Fisher-Cochran type theorems on distribution of quadratic forms in normal variables. Theorem 3.3.19. Let A1 , A2 , . . . , Ak be matrices of the same order. Pk Write A = i=1 Ai . Then the following are equivalent. Pk (i) ρ(A) = i=1 ρ(Ai ) (ii) There exist nonsingular matrices P and Q such that Ai = Pdiag(0 . . . , I, 0 . . . , 0)Q, where diag(0, . . . , I, 0, . . . , 0) has at least k + 1 diagonal blocks and I occurs in the ith block for i = 1, . . . , k (iii) ρ(A − Ai ) = ρ(A) − ρ(Ai ), i = 1, . . . , k and (iv) Ai <− A, i = 1, . . . , k. Proof. (i) ⇒ (ii) Let (Pi , Qi ) be a rank factorization of Ai , for each i = 1, . . . , k. Since (i) holds, the columns of (P1 : . . . : Pk ) are linearly independent and so are the columns of (Qt : Qt2 : . . . : Qtk ). Extend (P1 : . . . : Pk ) and (Qt1 : . . . : Qtk ) to nonsingular matrices P and Qt respectively. Now it is easy to see that Ai = P diag(0, . . . , I, 0 . . . , 0)Q. (ii) ⇒ (iii) is trivial. (iii) ⇒ (iv) follows from Theorem 3.3.5. (iv) ⇒ (i) Since (iv) holds, every A− ∈ {A− } is a g-inverse of Ai for each i. Hence Ai A− is idempotent and ρ(Ai A− ) = ρ(Ai ) for all i. Now k k k X X X ρ(Ai ) = ρ(Ai A− ) = tr(Ai A− ) = tr
k X Ai
i=1
i=1
i=1
i=1
!
! A
−
= tr(AA− ) = ρ(A). Remark 3.3.20. It is interesting to note that Ai and A − Ai are disjoint for all i implies that C(A1 ) + C(A2 ) + . . . + C(Ak ) and C(At1 ) + . . . + C(Atk ) are actually direct sums. Remark 3.3.21. Consider A1 , . . . , Ak and A as in Theorem 3.3.19. If Ai <− A, i = 1, . . . , k, then Ai <− Aj1 +. . .+Ajr for any sub-permutation (j1 , . . . , jr ) of (1, . . . , k) such that i ∈ (j1 , . . . , jr ).
The Minus Order
3.4
81
Matrices above/below a given matrix under the minus order
Let one of the matrices A and B be given. In this section, we characterize the class of matrices for the other such that A <− B and obtain some of its formulations that are useful in different situations. However, before we actually do so, we show that every complement of C(A) is C(I − AG) for some g-inverse G of A. Lemma 3.4.1. Let A be a matrix of order m × n. Then S is a complement of C(A) if and only if S = C(I − AG) for some g-inverse G of A. Proof. ‘If’ part We first notice that for any g-inverse G of A, C(A) ∩ C(I − AG) = {0}. Moreover, ρ(I − AG) = m − ρ(A) = m − d(C(A)). Thus, C(I − AG) is a complement of C(A). ‘Only if’ part Let (P1 , Q1 ) be a rank factorization of A. Let the columns of the matrix Q1 R form a basis of S and T be a matrix such that ·· is non-singular. T −1 Q1 Q1 I 0 and G = ·· (P1 : R)−1 is a Then A = (P1 : R) ·· 0 0 T T g-inverse of A. Hence. 00 I − AG = (P1 : R) (P1 : R)−1 and 0I 00 C(I − AG) = C (P1 : R) = C(R) = S . 0I Theorem 3.4.2. Let A be a matrix of order m × n such that 0 < ρ(A) < min{m, n}. Let (P1 , Q1 ) be a rank factorization of A. Then the class of all matrices B of order m × n such that A <− B is given by {B = A ⊕ C, C arbitrary} or equivalently by B = A or B = P1 Q1 + P2 Q2 , where P = (P1 : P2 ) and Qt = (Qt1 : Qt2 ) are arbitrary full rank extensions of P1 and Q1 respectively and involved partitions are conformable for matrix multiplication.
82
Matrix Partial Orders, Shorted Operators and Applications
Proof.
This theorem is a restatement of Theorem 3.3.5(vii).
Theorem 3.4.3. Let A and B be matrices of the same order. Then A <− B if and only if B = A + (I − AA− )W(I − A− A) for some matrix W and for some g-inverse A− of A. Proof.
Proof follows from Theorem 3.4.2 and Lemma 3.4.1.
Let C and B be matrices of the same order. We now consider a special form for the first matrix C and explore when C <− B. This result will be useful later and is also of independent interest. Theorem 3.4.4. Let A be a non-singular matrix. Then A 0 A 0 −AL − < B if and only if B = + Z(−MA I) 0 0 0 0 0 for some matrices L, M and Z of appropriate orders, where the block matrix diag(A, 0) and the matrix B have the same order. Proof. Let C = diag(A 0). In view of Theorem 3.4.3, C <− B if and only if B = C + (I − CC− )W(I − C− C) for some g-inverse C− and some matrixW. Since A is non-singular, every g-inverse of C is of the form A−1 L − C = , where L, M, N are arbitrary. Now, M N 0 − AL 0 0 I − CC− = and I − C− C = . 0 I −MA I Hence 0 − AL W11 W12 0 0 (I − CC )W(I − C C) = , 0 I W21 W22 −MA I −AL = Z(−MA I) I W11 W12 conformable for the above matrix where W is partitioned as W21 W22 multiplication and Z = W22 . −
−
Remark 3.4.5. In Theorem 3.4.4, B is non-singular if and only if Z is non-singular. We now ask the following question: Given a matrix B, what are all matrices A such that A <− B?
The Minus Order
83
The answer is actually contained in Theorem 3.3.5(v) but we restate for the sake of completeness. We note that if B = 0, and A <− B, then A = 0. So, we shall take matrix B to be non-null. Theorem 3.4.6. Let B be a non-null matrix with a rank factorization (P, Q). Then the class of all matrices A such that A <− B is given by {PTQ : T idempotent}. Theorem 3.4.7. Let A and B be matrices of the same order and B be non-null. Then A <− B if and only if A = B − BR(SBR)− r SB for some matrices S and R such that SBR is defined. Proof. ‘If’ part Let (P, Q) be a rank factorization of B. Then − A = B − BR(SBR)− r SB ⇒ A = P(I − QR(SPQR)r SP)Q . − It is easy to check that QR(SPQR)− r SP and I − QR(SPQR)r SP are − idempotent. So, by Theorem 3.4.6, A < B. ‘Only if’ part Let A <− B and (P, Q) be a rank factorization of B. By Theorem 3.4.6 A = PTQ, where T is idempotent. Write
T = I − (I − T) −1 −1 −1 − = I − QQ−1 R (I − T)((I − T)PL PQQR (I − T))r (I − T)PL P. −1 − Let R = Q−1 R (I − T) and S = (I − T)PL . Then A = B − BR(SBR)r SB.
Let B be a given matrix. We now show that each matrix A of the form B − BR(SBR)− r SB for some matrices S and R is the unique matrix below B under the minus order with column and row spaces specified in an interesting way. Theorem 3.4.8. Let B, R and S be matrices such that the product SBR is defined. Then there exists a matrix A satisfying ‘C(A) = N (S) ∩ C(B), C(At ) = N (Rt ) ∩ C(Bt ) and A <− B’ if and only if ρ(SB) = ρ(SBR) = ρ(BR). If such a matrix A exists, then it is unique and is is of the form A = B − BR(SBR)− SB. Proof. ‘If’ part Let ρ(SB) = ρ(SBR) = ρ(BR) hold and let A = B − BR(SBR)− SB. Clearly, C(A) ⊆ C(B) and C(At ) ⊆ C(Bt ). As C(SB) = C(SBR), we
84
Matrix Partial Orders, Shorted Operators and Applications
have SA = SB − SBR(SBR)− SB = 0. So, C(A) ⊆ N (S) ∩ C(B). Let x ∈ N (S) ∩ C(B). Then x = Bu and Sx = SBu = 0. Clearly, Au = Bu. Therefore, x ∈ C(AS) ⊆ C(AS). Thus, C(A) = N (S) ∩ C(B). Similarly, we can prove that C(At ) = N (Rt ) ∩ C(Bt ). Further, ρ(SB) = ρ(SBR) = ρ(BR) ⇒ BR(SBR)− SB is invariant under choices of (SBR)− . Hence by Theorem 3.4.7, A <− B. ‘Only if’ part Let x ∈ N (Rt Bt ). Then Bt x ∈ N (Rt ) ∩ C(Bt ) = C(At ). We can write Bt x = At x + (B − A)t x and also as Bt x = At u for some vector u, as Bt x ∈ C(At ). However, C(At ) ∩ C(B − A)t = {0}, since A <− B, we have (B − A)t x = 0. Therefore, N (Rt Bt ) ⊆ N (B − A)t or equivalently by Remark 3.2.8, C(B − A) ⊆ C(BR). Similarly, we can prove C(B − A)t ⊆ C(Bt St ). So, B − A = BRCSB for some matrix C. Since SA = 0, we have SB = SBRCSB or C is a ginverse of SBR, further, ρ(SB) = ρ(SBR) and ρ(SBR) = ρ(BR) . Thus, A = B − BRCSB = B − BR(SBR)− SB. Notice that BR(SBR)− SB is invariant under choices of (SBR)− , it follows A is unique. Remark 3.4.9. Consider the setup of Theorem 3.4.8. In the ‘only if’ part we have shown that if C(A) = N (S) ∩ C(B), C(At ) = N (Rt ) ∩ C(Bt ) and A <− B, then C(B−A) ⊆ C(BR) and C(B−A)t ⊆ C(Bt St ). In fact, under this hypothesis one can show C(B−A) = C(BR) and C(B−A)t = C(Bt St ).
3.5
Subclass of g-inverses A− of A such that A− A = A− B and AA− = BA− when A <− B
Let A and B be matrices of the same order such that A <− B. In this section we shall characterize the class of all g-inverses A− of A such that AA− = BA− and A− A = A− B. Let A− be one such g-inverse, so that AA− = BA− and A− A = A− B. We shall show that there exists a ginverse B− of B such that AA− = AB− and A− A = B− A. We shall also characterize the class of all such B− for which AA− = AB− and A− A = B− A holds. Let {A− }B = {G : AGA = A, AG = BG, GA = GB}. We shall obtain an explicit representation of {A− }B . Let A <− B. In view of Theorem 3.3.16, there exist projectors P and Q such that A = PB = BQ. We shall also characterize the classes of all projectors P and Q such
The Minus Order
85
that A = PB = BQ when A <− B. We begin with the following lemma, which is also of independent interest. Lemma 3.5.1. Let A <− B. Then C is a reflexive g-inverse of B − A such that AC = 0 and CA = 0 if and only if C = B− (B − A)B− for some g-inverse B− of B. Proof. ‘If’ part Let B− be an arbitrary g-inverse of B. Write C = B− (B − A)B− . Then CA = B− (B − A)B− A = B− (A − A) = 0. Similarly, AC = 0. Also, (B − A)C(B − A) = (B − A)B− (B − A)B− (B − A) = B − A, since (B − A) <− B. So, ρ(C) ≥ ρ(B − A). On the other hand ρ(C) = ρ(B− (B − A)B− ) ≤ ρ(B − A). Hence ρ(C) = ρ(B − A). Therefore, C is a reflexive g-inverse of B − A. ‘Only if’ part Since, A <− B, there exist non-singular matrices R and S such that A = Rdiag(Ia , 0, 0)S and B = Rdiag(Ia , Ib−a , 0)S, where a = ρ(A) and b = ρ(B). It is easy to check that every reflexive g-inverse C of B − A such that AC = 0 and CA = 0 is of the form 0 0 0 S−1 0 Ib−a D23 R−1 , 0 D32 D32 D23 where D23 and D32 are arbitrary. Choose and fix D23 and D32 . It is easy to check that Ia 0 0 T = S−1 0 Ib−a D23 R−1 , 0 D32 T33 is a g-inverse of B, where T33 is arbitrary and C = T(B − A)T for a fixed choice of T. Let A <− B. In the following Lemma, we obtain a class of g-inverses of A which is a subset of {A− }B . Lemma 3.5.2. Let A and B be matrices having the same order such that A <− B and B− be an arbitrary g-inverse of B. Let C = B− (B−A)B− be as in Lemma 3.5.1 and DT = B− − CTC, where T is an arbitrary matrix of suitable order such that CTC is defined. Then the following hold: (i) ADT = AB− , DT A = B− A, for each T, (ii) ADT A = ADT B = BDT A = A, for each T
and
86
Matrix Partial Orders, Shorted Operators and Applications
(iii) ADT = BDT if and only if T is a g-inverse of C. Further, if ADT = BDT , then DT A = DT B. Proof. Statement (i) follows from Lemma 3.5.1. Since A <− B, (ii) is a simple consequence of (i). (iii) ADT = BDT ⇔ A(B− − CTC) = B(B− − CTC) ⇔ AB− = B(B− − (B− (B − A)B− )TC), since AC = 0 and C = B− (B − A)B− ⇔ AB− = BB− B(B− (B − A)B− )TC = BB− − (B − A)B− TC ⇔ (B − A)B− = (B − A)B− TC ⇔ C = B− (B − A)B− = B(B − A)B− TC ⇔ C = CTC ⇔ T is a g-inverse of C. The remaining part of the statement is easy to check. Remark 3.5.3. Notice that Lemma 3.5.2(ii) and (iii) together imply that {B− − B− (B − A)B− : B ∈ {B− }} is a subset of {A− }B . In the next theorem, we show that this subset is indeed the set {A− }B . Theorem 3.5.4. Let A and B be matrices of the same order such that A <− B. Then (i) {A− }B = {B− − B− (B − A)B− : B− ∈ {B− }}. − − − − (ii) {A− r }B = {B AB : B ∈ {B }}. − − − − − (iii) {Ar }B = {Br ABr : Br ∈ {Br }}. Proof. (i) We already proved that {B− − B− (B − A)B− : B− ∈ {B− }} is a subset of {A− }B in Lemma 3.5.2. Let A− ∈ {A− }B . It is clear that BA− B = A. For any B− ∈ {B− }, we define C = B− (B − A)B− and E = A− + C. It is easy to check that E ∈ {B− }. By Lemma 3.5.1, C is a reflexive g-inverse of B−A and AC = 0, CA = 0. Moreover, A− B = A− A and BA− = AA− . Therefore, E(B − A)E = C(B − A)C = C. So, E = A− + E(B − A)E or A− = E − E(B − A)E with E ∈ {B− }. (ii) Since A <− B, we have B− AB− ∈ {A− r }. A routine check shows that B− AB− ∈ {A− } . To prove the other inclusion, we start with an B r − A− ∈ {A } and proceed along the same lines as in (i). r r B (iii) is proved in similar manner as (ii). Remark 3.5.5. Theorem 3.5.4(i) can be interpreted as follows:
The Minus Order
87
Let A <− B. Every g-inverse of A belonging to {A− }B can be obtained by first choosing an arbitrary g-inverse B− of B and subtracting from it a suitable reflexive g-inverse H of B − A, namely H = B− (B − A)B− , which also satisfies AH = 0, HA = 0. Thus, {A− }B = {B− − H , B− arbitrary, AH = 0, HA = 0}. Let A <− B. Let A− belong to {A− }B . Does there exist a g-inverse B of B such that B− A = A− A and AB− = AA− ? The answer is in the affirmative and is contained in the following. −
Theorem 3.5.6. Let A <− B and let A− be a g-inverse of A such that A− A = A− B and AA− = BA− . Then there exists a g-inverse B− of B such that B− A = A− A and AB− = AA− . Proof. Let a = ρ(A) and b = ρ(B). Since A <− B, there exist nonsingular matrices R and S such that A = Rdiag(Ia , 0, 0)S and B = Rdiag(Ia , Ib−a , 0) S, by Theorem 3.3.5(vi). It is easy to check that the class of all g-inverses A− of A such that A− A = A− B and AA− = BA− is given by I 0 C13 S−1 0 0 0 R−1 , C31 0 C33 where C13 , C31 , C33 are arbitrary. Choose I 0 C13 − −1 A =S 0 0 0 R−1 C31 0 C33 for a particular choice of C13 , C31 , C33 . Then it is easy to check that one choice of required B− is I 0 C13 B− = S−1 0 I 0 R−1 . C31 0 C33 In fact, the class of all such B− I −1 S 0 C31 where D is arbitrary.
is given by 0 C13 I 0 R−1 , 0 D
88
Matrix Partial Orders, Shorted Operators and Applications
Corollary 3.5.7. Let A and B be matrices of the same order such that A <− B and let A− be a g-inverse of A such that A− A = A− B and AA− = BA− , then for a g-inverse B− of B, (i) AA− = AB− ⇔ AA− and BB− commute. (ii) A− A = B− A ⇔ A− A and B− B commute. The following theorem deals with a question related to the one in Theorem 3.5.6 (from the point of view of a given g-inverse B− of B namely: Given A <− B and any g-inverse B− of B, does there exist a g-inverse A− of A such that A− A = A− B = B− A and AA− = BA− = AB− ?). Theorem 3.5.8. Let A and B be matrices of the same order such that A <− B. Then for any g-inverse B− of B, there exists a g-inverse A− of A such that (a) AA− = BA− , (b) A− A = A− B, (c) AA− = AB− , and (d) A− A = B− A. Proof. Since A <− B, there is a g-inverse A− of A such that A− A = A− B and AA− = BA− . Define G = B− BA− BB− . It is easy to check that G is g-inverse of A. Further, AG = AB− BA− BB− = AA− BB− = AA− AB− = AB− . Also, BG = BB− BA− BB− = BA− BB− = AA− BB− = AA− AB− = AB− . So, G satisfies (a). The other equalities can be similarly established. While proving Theorem 3.5.8, we obtained another characterization of {A }B . We state it below for completeness. −
Theorem 3.5.9. Let A and B be matrices of the same order such that A <− B and A = Rdiag(Ia , 0, 0)S and B = Rdiag(Ia , Ib−a , 0) S. Then Ia 0 D13 {A− }B = S−1 0 0 0 R−1 , D31 0 D33 where D13 , D31 and D33 are arbitrary. As observed in Theorem 3.3.16, for matrices A, B of the same order whenever A <− B, there exist projectors P and Q such that A = PB = BQ. Let A <− B. We shall now obtain explicitly the class of all projectors P and Q such that A = PB = BQ. In the process we determine the possible ranks the projectors P and Q can have and obtain explicitly the subclass of the projectors P and Q of specified rank with the property A = PB = BQ.
The Minus Order
89
Lemma 3.5.10. Let A and B be matrices of the same order and let P be a projector such that the product PB is defined. Then A = PB if and only if PA = A and P(B − A) = 0. Also, for a projector Q such that the product BQ is defined, A = BQ if and only if AQ = A and (B − A)Q = 0. Proof.
Proof is trivial.
Theorem 3.5.11. Let A and B be matrices of the same order m × n such that A <− B. Let P be a projector such that the product PB is defined and has same order as B. If A = PB, then ρ(A) ≤ ρ(P) ≤ ρ(A) + m − ρ(B) . Proof. In view of Lemma 3.5.10, PA = A and P(B − A) = 0. As PA = A, we have ρ(A) ≤ ρ(P). Also, P(B − A) = 0 ⇒ C(Pt ) ⊆ N (B − A)t ⇒ ρ(P) = ρ(Pt ) = d(C(Pt )) ≤ d(N (B − A)t ). Thus, ρ(A) ≤ ρ(P) ≤ ρ(A) + m − ρ(B).
Corollary 3.5.12. Let A, B be matrices of the same order m × n such that A <− B. Let Q be a projector such that the product BQ is defined and is of the same order as B. If A = BQ, then ρ(A) ≤ ρ(Q) ≤ ρ(A) + n − ρ(B) . Theorem 3.5.13. Let A and B be matrices of the same order m × n with ranks ‘a’ and ‘b’ respectively. Let A <− B. Let R and S be non-singular matrices such that A = Rdiag(Ia , 0, 0)S and B = Rdiag(Ia , Ib−a , 0)S. Then the class of all projectors P such that A = PB is given by Ia 0 T13 R 0 0 T23 R−1 , 0 0 T33 where T23 = UT33 , T13 = V(I − T33 ) and T33 is idempotent matrix of order (m − b) × (m − b) and matrices V and U are arbitrary. Proof. If P is an idempotent matrix of the given form, clearly, A = PB. Now, let P be projector such that A = PB. Write P = RTR−1 . Notice that T is idempotent since P is idempotent. Partition T as T11 T12 T13 T = T21 T22 T23 T31 T32 T33
90
Matrix Partial Orders, Shorted Operators and Applications
conformable for matrix multiplication with diag(Ia , 0, 0). Since, A = PB, PA = A and C(A) ⊆ C(P). Now, A = PB ⇒ T11 = Ia , T12 = 0, T21 = 0, T31 = 0, T22 = 0 and T32 = 0. As, T is idempotent, we have T13 + T13 T33 = T13 , T23 = T23 T33 and T233 = T33 . So, Ia 0 T13 T = 0 0 T23 0 0 T33 where T13 = V(I − T33 ), T23 = UT33 for some matrices U and V and T33 is idempotent. Hence the result follows. Corollary 3.5.14. Consider the setup of Theorem 3.5.13. Then the class of all projectors Q such that A = BQ is given by Ia 0 0 R 0 0 0 R−1 , T31 T32 T33 where T13 = (I − T33 )Z, T32 = T33 W and T33 is an idempotent matrix of order (m − b) × (m − b) and Z and W are arbitrary. Theorem 3.5.15. Let A and B be matrices of order m × n with ranks ‘a’ and ‘b’ respectively. Let A <− B and P be a projector of rank ‘p’ such that A = PB. Then there exist non-singular matrices X and Y such that A = Xdiag(Ia , 0, 0)Y and B = Xdiag(Ia , Ib−a , 0, 0)Y and
Ia 0 P = X 0 0
0 0 0 0 0 Ip−a 0 0
L 0 X−1 for some matrix L. 0 0
Proof. In view of Theorem 3.5.13, there exist non-singular matrices R and S such that A = Rdiag(Ia , 0, 0)S and B = Rdiag(Ia , Ib−a , 0)S and Ia 0 T13 P = R 0 0 T23 R−1 , 0 0 T33 where T23 = UT33 , T13 = V(I − T33 ) for some matrices ‘U and V’ and T33 is idempotent matrix of order (m − b) × (m − b). Since ρ(P) =
The Minus Order
91
p, ρ(T33 ) = p − a. So, there exists a non-singular matrix K of order (m − b) × (m − b) such that T33 = Kdiag(Ip−a , 0 )K−1 . Then I 0 0 L I 0 0 a I 0 0 0 0 0 0 R−1 . P = R 0 I 0 0 0 Ip−a 0 0 I 0 −1 0 0 K 0 0 K 0 0 0 0 Further, Ia 0 0 Ia 0 0 L Ia 0 0 0 I W 0 b−a 0 0 0 P = R 0 Ib−a 0 0 0 Ip−a 0 0 0 Ip−a 0 0 K 0 0 0 0 0 0 0 Ia 0 0 L 0 0 Ib−a − W 0 Ia 0 0 Ib−a 0 R−1 . × 0 0 Ip−a 0 0 0 K−1 0 0 0 0
L 0 0 0
Let Ia 0 0 Ia 0 0 0 I W b−a X = R 0 Ib−a 0 0 0 Ip−a 0 0 K 0 0 0
L 0 0 0
and Y = S. It can be checked that A = Xdiag(Ia , 0, 0, 0)Y, B = Xdiag(Ia , Ib−a , 0, 0)Y and Ia 0 0 L 0 0 0 0 −1 P = X 0 0 Ip−a 0 X . 0 0 0 0 Corollary 3.5.16. Consider the setup as in Theorem 3.5.15. Let Q be a projector of rank ‘q’ such that A = BQ. Then there exist non-singular matrices C and D such that A = Cdiag(Ia , 0, 0, 0)D, B = Cdiag(Ib−a , 0, 0)D and Ia 0 0 0 0 0 0 0 −1 Q = C 0 0 Iq−a 0 C . M0
0
0
We now obtain the subclass of all projectors P of rank ‘p’ (a ≤ p ≤ a + m − b) such that A = PB, whenever A <− B. From Theorem 3.5.13, it
92
Matrix Partial Orders, Shorted Operators and Applications
is clear that if A = Rdiag(Ia , 0, 0) S and B = Rdiag(Ia , Ib−a , 0) S, then every projector P such that A = PB is of the form Ia 0 T13 P = R 0 0 T23 R−1 , 0 0 T33 where T23 = UT33 , T13 = V(I − T33 ) for some matrices ‘U and V’ and T33 is idempotent matrix of order (m − b) × (m − b). Since P is idempotent and ρ(P) = p, ρ(P) = tr(P) = tr(Ia ) + tr(T33 ) = a + ρ(T33 ). Thus, the class of projectors P of given rank p is given by Ia 0 T13 P = R 0 0 T23 R−1 , 0 0 T33 where T23 = UT33 , T13 = V(I − T33 ), T33 is idempotent matrix of order (m − b) × (m − b), ρ(T33 ) = p − a and ‘U and V’ are arbitrary matrices. The class of all projectors P of rank ‘a’ such that A = PB deserves a special mention. It is clear that this class is given by Ia 0 T13 P = R 0 0 0 R−1 , 0 0 0 where T13 is arbitrary. Recall that in the proof of Theorem 3.5.6, the class of all g-inverse A− such that A− A = A− B, AA− = BA− is given by Ia 0 C13 P = R 0 0 0 R−1 . C31 0 C33 Hence for each A− in that class,
Ia 0 C13 AA− = R 0 0 0 R−1 , 0 0 0 which is a projector of rank ‘a’. Conversely, each projector P of rank ‘a’ such that A = PB is of the form Ia 0 T13 Ia 0 0 Ia 0 T13 P = R 0 0 0 R−1 = R 0 0 0 SS−1 0 0 0 R−1 0 0 0 0 0 0 0 0 0 = AA− , where Ia 0 T13 A− = S−1 0 0 0 R−1 satisfies A− A = A− B, AA− = BA− . 0 0 0
The Minus Order
93
Thus, we have proved Theorem 3.5.17. Let A and B be matrices of the same order m × n with ranks ‘a’ and ‘b’ respectively. Let A <− B. Then the class of all projectors P of rank ‘a’ such that A = PB is the class of all AA− satisfying A− A = A− B, and AA− = BA− . Theorem 3.5.18. Let A, B be matrices of the same order m × n with ranks ‘a’ and ‘b’ respectively. Let A <− B and P be a projector such that A = PB. Then P can be written as P = P1 + P2 where P1 is a projector of rank ‘a’ such that A = P1 B and P2 is projector such that P1 P2 = P2 P1 = 0, P2 A = 0 and P2 (B − A) = 0. Proof.
In the setup of Theorem 3.5.15, write Ia 0 0 L 0 0 0 L 0 0 0 0 −1 0 0 0 0 −1 P1 = X . 0 0 0 0 X , P2 = X 0 0 Ip−a 0 X 0 0 0 0 0 0 0 0 It is easy to check that P1 is a projector of rank ‘a’ such that A = P1 B and P2 is idempotent such that P1 P2 = P2 P= 0, P2 A = 0 and P2 (B−A) = 0.
3.6
Minus order for idempotent matrices
Idempotent matrices (being projectors too) are a very important class of matrices in various applications such as in the theory of Linear Models. In this section, we study the minus order for this class. For an idempotent matrix A, we find the class of all matrices B such that A <− B. We then obtain the class of all idempotent matrices B such that A <− B. Now, let B be an idempotent matrix. We show that if A <− B, then A is necessarily idempotent. We also obtain the class of all A such that A <− B. Notice that for an idempotent matrix E, E is a g-inverse of itself and ρ(E) = tr(E). Let (P, Q) be a rank factorization of a non-null square matrix A. Then A is idempotent if and only if QP = I. Moreover, for idempotent matrices E and F of the same order, it is trivial to check that E <s F if and only if E = FE = EF if and only if E <− F. Thus, for any idempotent matrix A, A <− I, A and I being matrices of the same order. Lemma 3.6.1. Let E and F be idempotent matrices of the same order. Then E <− F if and only if ρ(F − E) = ρ(F − EF) = ρ(F − FE).
94
Matrix Partial Orders, Shorted Operators and Applications
Proof.
Follows from Theorem 3.3.16(i) ⇔ (iv).
We now prove a lemma that is also of independent interest. Lemma 3.6.2. Let E and F be idempotent matrices of the same order. Then F − E is idempotent if and only if E = FE = EF. Proof. ‘If’ part Let E = FE = EF. Then (F − E)2 = F2 − FE − EF + E2 = F − FE − EF + E = F − E. Hence, F − E is idempotent. ‘Only if’ part Write F = E + (F − E). Since E, F and E − F are idempotent matrices, ρ(F) = tr(F) = tr(E) + tr(F − E) = ρ(E) + ρ(F − E). By Remark 3.3.9, we have E <− F, and so, E = FE = EF.
Theorem 3.6.3. Let A be an idempotent matrix with a spectral decomposition A = Pdiag(I, 0)P−1. Then the class of all matrices B such that −L A <− B is given by A + P W(−M I)P−1 where L, W and M are I arbitrary. Proof follows from Theorem 3.4.4. Theorem 3.6.4. Let A be an idempotent matrix. Then the class of all idempotent matrices B such that A <− B is given by B = A + C where C is an idempotent matrix such that AC = CA = 0. Proof. Let B = A + C, where C is an idempotent matrix such that AC = CA = 0. Then B2 = B and trivially A <− B. Now, let B be an idempotent matrix such that A <− B. Then by Theorem 3.3.5(vii), B = A ⊕ (B − A). Since B = B2 , ‘C(A) and C(B − A)’ are virtually disjoint and ‘C(At ) and C(B − A)t ’ are virtually disjoint, it follows that A(B − A) = 0, (B − A)A = 0 and B − A is idempotent. Let C = B − A. Then B = A + C where C is an idempotent matrix such that AC = CA = 0. Theorem 3.6.5. Let A and B be square matrices of the same order with B idempotent. Then A <− B if and only if A = A2 = AB = BA.
The Minus Order
95
Proof. ‘Only if’ part Since A <− B and B is idempotent, we have A = ABA = AB = BA. Therefore, A2 = (AB)(AB) = (ABA)B = AB = A. Thus, A = A2 = AB = BA. The proof of ‘If’ part is trivial. Theorem 3.6.6. Let B be an idempotent matrix. Let (P, Q) be a rank factorization of B. Then the class of all matrices A such that A <− B is given by A = PTQ where T is an arbitrary idempotent matrix of appropriate order. Proof follows from Theorem 3.4.6. Remark 3.6.7. The matrix A as in Theorem 3.6.6 is idempotent because A2 = PTQPTQ = PT2 Q = A. Remark 3.6.8. If A and B are idempotent matrices such that A <− B, then A# A = A# B and AA# = BA# . This is so, as for an idempotent matrix E, E# = E. (We shall make a detailed study of the relationship A# A = A# B and AA# = BA# for general square matrices in Chapter 4.) 3.7
Minus order for complex matrices
We shall now specialize to matrices over the field of complex numbers and study the special properties of the minus order. Let A <− B. We obtain a canonical form for A and B in term of the singular value decomposition of the matrix A. Theorem 3.7.1. Let A and B be nnd matrices of the same order. Then A <− B if and only if B − A is an nnd matrix of rank equal to ρ(B)−ρ(A). Proof. ‘If’ part Since ρ(B − A) = ρ(B) − ρ(A), we have A <− B, by Remark 3.3.9. ‘Only if’ part Clearly, ρ(B−A) = ρ(B)−ρ(A) as A <− B. We also have B−A <− B by Theorem 3.3.5. Since B is nnd, we take an nnd g-inverse B− of B. Then (B − A)B− (B − A) = B − A and hence, B − A is nnd. Corollary 3.7.2. Let A and B be matrices of the same order n × n. Let B be an nnd matrix. If A <− B, then A and B − A are nnd.
96
Matrix Partial Orders, Shorted Operators and Applications
Proof. If A <− B, then AB† A = A. Since B is nnd, B† is also nnd. Hence A is nnd. Moreover, A <− B ⇒ B − A <− B and B be an nnd matrix, so, as in proof of ‘Only if’ part of Theorem 3.7.1, B − A is nnd. By comparing Theorem 3.3.5(i) ⇔ (vii) with Theorem 3.7.1, we notice that if A and B are nnd and A <− B, B − A is nnd. We now prove a result similar to Theorem 3.3.5(i) ⇔ (vi) for complex matrices using unitary transformations. Theorem 3.7.3. Let A and B be matrices of the same order over the field of complex numbers with ρ(A) = a, ρ(B) = b, and b > a ≥ 1. Then A <− B if and only if there exist unitary matrices U and V and nonsingular matrices T1 and T2 such that A = Udiag(Da , 0, 0)V? = UT1 diag(Da , 0, 0)T2 V? and B = UT1 diag(Da , Db−a , 0)T2 V? , where Da and Db−a are diagonal positive definite matrices. Proof. ‘If’ part follows form Theorem 3.3.4(vi)⇒ (i) ‘Only if’ part Let A = U1 Da V1? be a singular value decomposition of A where U1 and V1 are semi-unitary matrices and Da is a diagonal positive definite matrix. Since, A <− B we have C(A) ⊆ C(B) and C(A? ) ⊆ C(B? ). So, there exist semi-unitary matrices U2 and V2 such that the columns of (U1 : U2 ) and (V1 : V2 ) form ortho-normal basis of C(B) and C(B? ) respectively. Extend (U1 : U2 ) and (V1 : V2 ) to unitary matrices U = (U1 : U2 : U3 ) and V = (V1 : V2 : V3 ) respectively. Then we can write A = Udiag(Da , 0b−a , 0) V? and B = Udiag(M, 0) V? , where M is a b × b non-singular matrix. (Compare with Theorem 3.3.5 − (i) ⇒ (vi).) Since, A <− B, we have diag(D a , 0, 0) < diag(M, 0) and M11 M12 diag(Da , 0) <− M. Partition M as M = , where M11 and M21 M22 M22 are matrices of orders a × a and (b − a) × (b − a) respectively. By Theorem 3.4.5, M22 is non-singular and 3.4.4andRemark Da 0 −Da L1 M = + M22 (−L2 Da , I) for some matrices L1 and 0 0 I L2 . Notice that M = Pdiag(Da , M22 ) Q and diag(Da , 0) = Pdiag(Da , 0)Q,
The Minus Order
where P =
I − Da L1 0 I
and Q =
97
I 0 are non-singular matri−L2 Da I
ces. Now, let M22 = RDb−a S be a singular value decomposition of M22 , where R and S are unitary and Db−a is a diagonal positive definite matrix, as M22 is non-singular. Then it is easy to check that A = Udiag(Da : 0 : 0) V? = UT1 diag(Da : 0 : 0) T2 V? and B = UT1 diag(Da : Db−a : 0) T2 V? , where I 0 I 0 P 0 Q 0 and T2 = 0 R . T1 = 0 R 0 I 0 I Clearly T1 and T2 are non-singular.
Corollary 3.7.4. Let A and B ∈ Cm×n with ρ(A) = a, ρ(B) = b, and b > a ≥ 1. Then A <− B if and only if there exist unitary matrices U and V, diagonal positive definite matrix Da and a b × b non-singular matrix M such that Da 0 F1 M= + M22 (F2 , I) 0 0 I such that A = Udiag(Da , 0b−a , 0) V? and B = U(M, 0)V? , where M22 is non-singular and F1 and F2 are some matrices of suitable orders. Corollary 3.7.5. Let A and B ∈ Cm×n be nnd matrices with ρ(A) = a, ρ(B) = b and b ≥ a ≥ 1. Then A <− B if and only if there exist a unitary matrix U, an a × a positive diagonal matrix Da and a definite Da 0 F b × b positive definite matrix M = + M22 (F? , I) such that 0 0 I A = Udiag(Da , 0b−a , 0) U? and B = Udiag(M, 0) U? ; where M22 is a positive definite matrix and F is some suitable matrix. Compare Theorem 3.7.3 with Theorem 3.3.5(i) ⇔ (vi). A vigilant reader might wonder: what is so special about Theorem 3.7.3 over Theorem 3.3.4(i) ⇒ (vi), since every non-singular matrix L can be written as L = UT where U is unitary and T non-singular (L = UU? L for any unitary matrix U). In connection with Theorem 3.7.3, the point to be noted is that we can retain a singular value decomposition for A, which is A = Udiag(Da , 0, 0)V? = UT1 diag(Da , 0, 0)T2 V? . The importance of this result will be further clear when we deal with the star order.
98
3.8
Matrix Partial Orders, Shorted Operators and Applications
Exercises
Unless otherwise stated matrices are over arbitrary field and may be square or rectangular. (1) Let A and B be hermitian matrices of the same order over the complex field. Show that A <s B if and only if C(A) ⊂ C(B). (2) If A and B are square matrices of the same order such that A <s B and B2 = 0, then show that A2 = 0. What happens if B2 = B? (3) Prove that 0 <− B for any matrix B. Let n > m and I be the identity matrix of order m × m. Further let B be a matrix of order n × n. Show that (I, 0) <− B if and only if B = (I, 0). State and prove the corresponding result for m > n and for m = n. (4) Let A and B be matrices of the same order such that A <− B and for a suitable matrix C, CA = I. Prove that A = B. Is the conclusion still valid if there is a suitable matrix D such that AD = I? Justify your answer. Conclude that all invertible matrices are maximal under the minus-order. (5) Let A and B be matrices of the same order such that A <− B and (P, Q) be a rank factorization of A. Prove that (i) P = BGP and (ii) Qt = Qt GB for some g-inverse G of B. Also show that A = BHB for some matrix H. (6) Prove that under the minus order every square matrix A lies below some invertible matrix. (Hint: Consider the normal form of A.) (7) Let A and B be square matrices of the same order such that A <− B and AB = BA. Show that if B is of index 1, then A is of index 1. What can be said about the index of B if index of A is 1? (8) Let A and B be square matrices of the same order over the complex field such that A <− B. Show that if B is range hermitian then A may not be range hermitian. (9) Let A and B be square matrices of the same order over C, the complex field such that A <− B. Let A be normal and B be a hermitian matrix. Show that A is hermitian. (10) Let A = AA− B = BA− A for some g-inverse A− of A. Show that there is a reflexive g-inverse G of A such that A = AGB = BGA. Let A <s B and for some reflexive g-inverse A− of A, BA− A = A. Show that A <− B. (11) Show that for matrices A and B over the field C of complex numbers (i) A <− B ⇔ A? <− B? .
The Minus Order
(12)
(13)
(14) (15)
(16) (17)
(18) (19)
(20) (21)
(22)
(23)
99
(ii) A <− B and C(A? ) ⊆ C(B? ). Then A = B. Give examples of matrices A and B over the field C such that A <− B, but A† 6<− B† . Further, show that if A <− B, then A† <− B† ⇔ A† BA† = A† . Show that the following hold: (i) A <− B ⇔ A− BA− = A− and BA− B = A for some A− ∈ {A− }. (ii) A <− B ⇔ {(A− + B− )/2} ⊆ {A− } for each A− ∈ {A} and for each B− ∈ {B− }. Show that A <− B ⇒ {B− } ⊂ {A− + (B − A)− }. Is the converse true? − − Show that A <− B ⇒ {B− r } ⊂ {Ar + (B − Ar )}. Is the converse true? Now suppose A and B are nnd matrices over the C, the complex − − − field. Show that {B− r } ⊂ {Ar + (B − A)r } ⇒ A < B. − For an idempotent matrix E, show that E < B ⇔ ρ(B − E) = ρ(B − BE) = ρ(B − EB). Let A <− B and one of C(B) ⊆ C(A) and C(Bt ) ⊆ C(At ) and holds. Show that A = B. Show the same holds if A <− B and AX = B is consistent. Show that for square matrices A and B, A <− I ⇔ A2 = A and I <− B ⇔ B = I. If for non-null matrix A, BA− B = A for some A− ∈ {A− }, then show that D = AB− A is invariant under the choice of B− . Further show that D is the unique matrix such that D <− B. For m × n matrices A and B over C, show that A <s B ⇔ AA† <− BB† and A† A <− B† B. For m × n matrices A and B over C, prove or disprove the following: (i) A <− B ⇒ AA? <− BB? (ii) A <− B ⇒ A? A <− B? B. Let A and B be m × n matrices. If B1 is an r × s matrix such that B1 0 B= , 0 0 where each 0 denotes a null matrix of appropriate order. A 0 1 Then show that 0 <− A <− B ⇔ A = ; 0 0 where A1 an r × s matrix and 0 <− A1 <− B1 . Let A and B be normal matrices of order n × n over C. Let ρ(A) = a and ρ(B) = b, 1 ≤ a < b ≤ n with m = b − a. Then show that the following are equivalent:
100
Matrix Partial Orders, Shorted Operators and Applications
(i) A <− B (ii) There exists a unitary matrix U such that U? AU = diag(D, 0) and D + RES RE 0 U? BU = ES E 0 , 0 0 0 where D and E are respectively a × a and m × m non-singular diagonal matrices, R is of order a × m and S is order m × a. (iii) There exists a unitary matrix U such that U? AU = diag(G, 0) and G + RES RF 0 U? BU = FS F 0 , 0 0 0 where D and F are a × a and m × m non-singular matrices, R is of order a × m and S is order m × a. (24) Let Q be a projector of order m × m and B be a m × n matrix. Show that the following are equivalent: (i) QB <− Q (ii) {B− } = {B− Q} + {B− (I − Q)}, {B− Q} ⊆ {(QB)− } and {B− (I − Q)} ⊆ {(I − QB)− }. (25) Let A, P and Q be m × n, p × m and n × q matrices respectively. Let matrices T and R be such that N (T) = C(AQ), N (PA) = C(R). t Show that if D = AR(TAR)− r TA, then C(D) ⊆ N (P), C(D ) ⊆ N (Qt ) and D <− A. (26) Let A, A1 and A2 be m × n matrices. Then the following are equivalent: (i) (ii) (iii) (iv)
A = A1 ⊕ A2 A1 <− A, A2 <− A A1 (A1 + A2 )− A2 = 0 − {A− } ⊆ {A− 1 } ∩ {A2 }.
(27) Let A be an m × n matrix. Let x ∈ C(A) and y ∈ C(At ). The vectors x and y are said to be separable if there exist disjoint matrices A1 and A2 such that (a) A = A1 ⊕ A2 and (b) x ∈ C(A2 ) and y ∈ C(At1 ).
The Minus Order
101
Clearly, any pair of x, y with x ∈ C(A) and y ∈ C(At ) is separable if at least one of x, y is null. Prove that for a given matrix A, x ∈ C(A) and y ∈ C(At ), the vectors x and y are separable if and only if yt A− x = 0 for all g-inverses A− of A. (28) Let A be an m × n matrix. Let x ∈ C(A) and y ∈ C(At ). Prove that A x x, y are separable if and only if ρ = ρ(A) yt 0 (29) Let A and B be m × n matrices such that A <− B. Suppose vectors x, and y are separable with reference to A. Show that they are also separable with reference to B.
Chapter 4
The Sharp Order
4.1
Introduction
We know that if A is an idempotent matrix (that is A2 = A), then it is similar to a matrix of the form diag(I, 0). It also acts as an identity on the column space C(A) of the matrix A. In the previous chapter, we saw that for two idempotent matrices A and B of the same order, A <− B if and only if A = AB = BA = ABA. Observe that an idempotent matrix is a matrix of index not exceeding 1 and is its unique commuting reflexive g-inverse. Let ‘I1 ’ denotes the set of all matrices of index ≤ 1. The set ‘I1 ’ contains a wide class of matrices. In fact, in addition to all idempotent matrices it includes the class of semi-simple matrices and the class of the range-hermitian matrices. Compare the following property of ‘I1 ’ which is analogous to the property of the set of idempotent matrices mentioned above in this paragraph: each non-null matrix A in ‘I1 ’ is similar to diag(D, 0) for some non-singular matrix D and acts as a non-singular linear operator on C(A), that is, for each non-null x ∈ C(A), Ax 6= 0. In this chapter, we define a partial order ‘<# ’ on I1 , the set of square matrices of index ≤ 1. We shall show that under this order ‘<# ’ A is below B (that is A <# B) if and only if A2 = AB = BA and therefore, ‘<# ’ coincides with the minus order when restricted to idempotent matrices. A subclass of g-inverses called the commuting g-inverses plays an important role in developing this partial order. Recall from Chapter 2 that a matrix G is said to be a commuting g-inverse of a matrix A if (i) AGA = A and (ii) AG = GA. Moreover, such a g-inverse for A exists if and only if A is of index ≤ 1. For a matrix A of index ≤ 1, the unique reflexive commuting g-inverse is called its group inverse and is denoted by A# Moreover, if A = Pdiag(D, 0)P−1 , where P and D are non-singular matrices, then
103
104
Matrix Partial Orders, Shorted Operators and Applications
every commuting g-inverse of A is of the form Pdiag(D−1 , C)P−1 where C is arbitrary and, in particular, A# = Pdiag(D−1 , 0)P−1 . In Section 4.2, we define a partial order, called the sharp order on matrices in I1,n , the set of n × n matrices of index 1 and obtain several characteristic properties of this order. In Section 4.3, we obtain the class of all matrices lying above (or below) a given matrix under the sharp order. We also specialize to some subclasses of I1,n and obtain interesting properties of the sharp order. In Section 4.4, we extend the ‘<# ’ order to the class of general square matrices. Let A = A1 + A2 and B = B1 + B2 be the core-nilpotent decompositions of A and B respectively, where A1 and B1 are core parts and A2 and B2 are nilpotent parts. We define A
4.2
Sharp order - Characteristic properties
The class of commuting g-inverses of a square matrix plays a vital role for the order relation we are about to study now. Recall from Chapter 2 that the group inverse of a square matrix is its unique commuting reflexive ginverse and that a square matrix possesses a commuting g-inverse if and only if it is of index≤ 1. We begin this section by defining the Sharp order on I1,n , the set of all n × n matrices of index ≤ 1. We also obtain several characteristic properties of this order. Definition 4.2.1. Let A, B ∈ I1,n . A is said to be below B under the sharp order if there exist commuting g-inverses G1 and G2 of A such that AG1 = BG1 and G2 A = G2 B. When A is below B under the sharp order, we write A <# B. Remark 4.2.2. It is obvious that A <# B ⇒ A <− B. Remark 4.2.3. In Definition 4.2.1, we actually do not require the matrix B to be of index ≤ 1. For example notice that the null matrix of order n × n belongs to I1,n and 0 <# B, for each n × n matrix B. We know that
The Sharp Order
105
not every matrix is of index ≤ 1. we now give a non-trivial example. Example 4.2.4. Consider 1 0 0 1 0 0 A = 0 0 0 and B = 0 0 1 . 0 0 0 0 0 0 Clearly, A2 = AB = BA, so, A <# B. Also, ρ(B) = 2, ρ(B2 ) = 1. Thus, B is not a matrix of index ≤ 1. However, in what follows, we take both A and B to be matrices of index ≤ 1. This is crucial for developing the properties of the sharp order. For example, in order to examine the transitivity of <# , we need to consider A <# B and B <# C, where the matrix B must be of index ≤ 1. In the previous chapter, we had shown that if A <− B, then we can get the same g-inverse G of A such that AG = BG and GA = GB. Is a similar thing possible for sharp order? The answer is in the affirmative and is contained in the following: Theorem 4.2.5. Let A, B ∈ I1,n . Then A <# B if and only if AA# = BA# and A# A = A# B. Proof. ‘If’ part Since A# is a commuting g-inverse of A, the proof is trivial. ‘Only if’ part Let G1 and G2 be commuting g-inverses of A such that AG1 = BG1 and G2 A = G2 B. Now, G1 AG2 is a reflexive commuting g-inverse of A. Therefore, G1 AG2 = A# by uniqueness of the group inverse and AA# = BA# and A# A = A# B. Remark 4.2.6. The sharp order was originally defined in [Mitra (1987)] using the group inverse. Remark 4.2.7. Let A, B ∈ I1,n and P be an n × n non-singular matrix. Then A <# B if and only if PAP−1 <# PBP−1 . In the following theorem we obtain several characterizations of the sharp order. Theorem 4.2.8. Let A, B ∈ I1,n . Then the following are equivalent: (i) A <# B
106
Matrix Partial Orders, Shorted Operators and Applications
(ii) A2 = AB = BA or equivalently, A(B − A) = (B − A)A = 0. (iii) Let a = ρ(A) and b = ρ(B). Then there exists a non-singular matrix Q such that A = Qdiag(Da , 0, 0) Q−1 and B = Qdiag(Da , Db−a , 0) Q−1 , where Da and Db−a are nonsingular matrices of order a × a and (b − a) × (b − a) respectively − (iv) {B− com } ⊆ {Acom } # − (v) B ∈ {Acom } (vi) There exists a projector P onto C(A) such that A = PB = BP (vii) A# A = A# B and (B − A)A = 0 (viii) AA# = BA# and A(B − A) = 0 and (ix) B − A <# B. Proof. (i) ⇒ (ii) Since A <# B, by Theorem 4.2.5, we have AA# = BA# and A# A = A# B. Also, A2 A# = A = A# A2 . So, A2 = AA# A2 = BA# A2 = BA, since AA# = BA# . Similarly, A2 = AB. (ii) ⇒ (i) follows from the equalities (A# )2 A = A# = A(A# )2 . (i) ⇒ (iii) Since A is of index ≤ 1, there exist non-singular matrices R and Da such that A = Rdiag(Da , 0, 0)R−1 equivalently R−1 AR = diag(Da , 0, 0). T11 T12 T13 Let R−1 BR = T. Partition T as T = T21 T22 T23 conformable for T31 T32 T33 matrix multiplication with the partition of R−1 AR. Since A2 = AB = BA, we have T11 = Da and T12 , T13 , T21 and T31 are null matrices. Thus, Da 0 0 T = 0 T22 T23 . 0 T32 T33 T22 T23 Since B is of index ≤ 1 and ρ(B) = b, the sub-matrix is of T32 T33 index≤ 1 with rank b − a. Hence there exists a non-singular matrix S such T22 T23 that = Sdiag(Db−a , 0) S−1 . Let Q = Rdiag(Ia , S). Then T32 T33 Q is a non-singular matrix. Further, A = Qdiag(Da , 0, 0) Q−1 and B = Qdiag(Da , Db−a , 0)Q−1 . (iii) ⇒ (iv) −1 −1 −1 Let G ∈ {B− , where C is an com }. Then G = Qdiag(Da , Db−a , C)Q
The Sharp Order
107
arbitrary matrix. It is easy to check that G ∈ {A− com }. (iv) ⇒ (v) is trivial. (v) ⇒ (i) Since B is of index ≤ 1, there exist non-singular matrices Q and T such −1 that B = Qdiag(T, 0) Q−1 . Notice that B#= Qdiag(T , 0)Q−1 . Par S11 S12 tition Q−1 AQ conformably as Q−1 AQ = . Since B# is a S21 S22 commuting g-inverse of A, we have AB# = B# A. This yields S12 = 0, S21 = 0 and S11 T−1 = T−1 S11 . Further, AB# A = A implies S22 = 0 and S11 T−1 S11 = S11 . Now, let G = B# AB# . Then G is a reflexive commuting g-inverse of A, so G = A# by uniqueness of the group inverse and GA = GB, AG = BG. (ii) ⇒ (vi) We have A = AA# A = A# A2 = A# AB. Let P = A# A. Then P is a projector and A = PB. Now, BP = BA# A = BAA# = A2 A# = A. (vi) ⇒ (vii) Notice that (vi) gives C(A) ⊆ C(P) and C(At ) ⊆ C(Pt ) and therefore, A = AP = PA. Since C(A# ) = C(A), we have A# = A# P = PA# . Now, A# A = A# PB = A# B. Again A = AP = BP2 = BP. So, (B − A)P = 0. Hence, (B − A)A = (B − A)PB = 0. (vii) ⇒ (ii) Since (B − A)A = 0, we have BA = A2 . Pre-multiplying A# A = A# B by A2 , we have A2 = A2 A# A = A2 A# B. Now, from A2 = A2 A# B we have A2 = AB. Proof for (ii) ⇒ (vi) ⇒ (viii) ⇒ (ii) is similar and proof for (ix) ⇔ (ii) follows from the equivalence of (i) and (ii). Remark 4.2.9. Let A ∈ I1,n and B be a non-singular matrix. In view of Theorem 4.2.8, A <# B if and only if AB−1 A = B−1 AB = A. Theorem 4.2.10. The sharp order is a partial order on I1,n . Proof. The result follows trivially from the representations of A and B as given by Theorem 4.2.8 (iii) and (iv), when A <# B. As noted earlier in Remark 4.2.2, the sharp order implies the minus order. The following example shows that the converse is not true. Example 4.2.11. Let 1 0 0 0 I S I S S= , T= , A= and B = ; 0 1 T 0 T −I 0 0
108
Matrix Partial Orders, Shorted Operators and Applications
where I is a 2×2 unit matrix. Then A <− B but A 6<# B, since A# = A and A(B − A) 6= 0. Notice that Example 4.2.11 shows that even on matrices of index ≤ 1 the minus order ‘<− ’ does not imply the sharp order. Thus, for the minus order to imply the sharp order some additional conditions are required. The following theorem includes some of the equivalent additional conditions towards this end. Theorem 4.2.12. Let A, B ∈ I1,n such that A <− B. Then the following are equivalent: (i) (ii) (iii) (iv) (v) (vi) (vii) (viii)
A <# B, A and B commute, A# and B commute, A and B# commute, A# and B# commute B# AB# = A# , BA# B = A and C(BA# B) ⊆ C(A), C((BA# B)t ) ⊆ C(At ).
Proof. (i) ⇒ (ii) follows from Theorem 4.2.8 (i) ⇒ (ii). (ii) ⇒ (iii) Since A and B commute and A# is a polynomial in A, it follows that A# and B commute. (iii) ⇒ (iv) Let A# and B commute. Since, (A# )# = A, A is a polynomial in A# and B# is a polynomial in B, therefore A and B# commute. (iv) ⇒ (v) Let A and B# commute. Since A# is a polynomial in A, A# and B# commute. (v) ⇒ (vi) Let B = Pdiag(T, 0) P−1 , where T is a non-singular matrix. As A <− B, we have A = Pdiag(S, 0)P−1 for some matrix S such that ST−1 S = S. Since A and B# commute, S and T−1 commute. Also, it is easy to check that S# = T−1 ST−1 and hence A# = B# AB# . (vi) ⇒ (vii) BA# B = BB# AB# B = A follows, as C(A) ⊆ C(B) and C(At ) ⊆ C(Bt ). (vii) ⇒ (viii) is obvious.
The Sharp Order
109
(viii) ⇒ (i) As A <− B, A = AB# A = AB# B = BB# A. Now, C(BA# B) ⊆ C(A) ⇒ AA# (BA# B) = BA# B ⇒ AB# (AA# (BA# B)) = AB# (BA# B) ⇒ (AB# A)(A# (BA# B)) = (AB# B)A# B ⇒ AA# (BA# B) = AA# B ⇒ BA# B = AA# B. Also, A <− B ⇒ C(A) ⊆ C(B) ⇒ A = BU for some U. Therefore, A = AA# A = AA# BU = BA# BU = BA# A. So, AA# = BA# . Similarly, A# A = A# B. Corollary 4.2.13. Let A, B ∈ I1,n . Then the following are equivalent: (i) A <# B, (ii) A# and B# commute (iii) A# <# B# .
and
It is important to note that for the equalities AA# = BA# and A A = A# B to hold simultaneously, we require only A to be of index ≤ 1, whatever be the index of B. In fact, it is easy to show that if A = Pdiag(D, 0)P−1 , AA# = BA# and A# A = A# B, then B = Pdiag(D, C)P−1 for some arbitrary matrix C. However, as mentioned earlier for the order relation ‘<# ’ to be a partial order, we do need B to be of index ≤ 1 and therefore we continue to take both A and B to be of index ≤ 1 in the rest of the chapter as well. We hasten to remark that some of the results proved above hold good even when B is not of index ≤ 1. The reverse order law for the group inverse of a product of two (or a finite number) of matrices does not hold in general. However, in presence of the sharp order A <# B on matrices A and B, we have the following: #
Theorem 4.2.14. Let A, B ∈ I1,n such that A <# B. Then (AB)# = B# A# = A# B# = (A# )2 . Proof. The proof follows trivially in view of the representation of A and B in Theorem 4.2.8(iii). Theorem 4.2.15. Let A, B ∈ I1,n such that A <# B. Then, A <− B and Ak <# Bk hold for each positive integer k. Proof. The proof follows trivially in view of the representation of A and B in Theorem 4.2.8(iii).
110
Matrix Partial Orders, Shorted Operators and Applications
Remark 4.2.16. Let A, B ∈ I1,n . The conditions Ak <− Bk for each positive integer k and Ak <# Bk for each positive integer k ≥ 2 do not imply A <# B. For example, take A and B of Example 4.2.11.
4.3
Sharp order - Other properties
In this section, we obtain the class of all g-inverses A− of A such that A− A = A# A = A# B = A− B when A <# B. We also obtain the class of all matrices B lying above a given matrix A under the sharp order. We then obtain a solution to the dual problem of obtaining the class of all matrices A lying below a given matrix B under this order. We conclude the section with an extension of Fisher-Cochran Theorem for matrices of index ≤ 1. We begin with the following theorem which is analogous to Theorem 3.5.6 but is actually much stronger. Theorem 4.3.1. Let A, B ∈ I1,n such that A <# B. Let A− com be any − − commuting g-inverse of A such that Acom A = Acom B and AA− com = − BA− . Then for every commuting g-inverse B of B, we have com com − − − A− com A = Bcom A and AAcom = ABcom . Proof. Since A <# B, by Theorem 4.2.8(iii), there exists a non-singular matrix P such that A = Pdiag(Da , 0, 0)P−1 and B = Pdiag(Da , Db−a , 0)P−1 , where Da and Db−a are non-singular matrices and a = ρ(A) and b = ρ(B). It is easy to check that the class of all commuting g-inverses A− com − − − B and AA = BA is given A = A of A such that A− com com com com −1 by Pdiag(D−1 , where C is arbitrary. Also, we know a , 0b−a , C)P that the class of all commuting g-inverses B− com of B is given by −1 −1 , where E is arbitrary. The rest of the proof is Pdiag(D−1 a , Db−a , E)P computational. Corollary 4.3.2. Let A, B ∈ I1,n such that A <# B and A− com be any − commuting g-inverse of A for which A− A = A B and AA− com com com = − − BAcom . Then for every commuting g-inverse Bcom of B, A− A and com − Bcom B commute. Corollary 4.3.3. Let A, B ∈ I1,n such that A <# B and A− com be any − commuting g-inverse of A for which A− A = A B and AA− com com com = − − BAcom . Then for any commuting g-inverse of Bcom of B we have − − − − − A− com A = Acom B = Bcom A and AAcom = BAcom = ABcom .
The Sharp Order
111
While proving Theorem 4.3.1, we have already obtained a characterization of {A− com }B = {G : AGA = A, AG = GA, AG = BG and GA = GB}, when A <# B. We shall obtain another characterization below. First we prove Lemma 4.3.4. Let A, B ∈ I1,n such that A <# B. Then − (B − A)# = B− com (B − A)Bcom ,
where B− com is any commuting g-inverse of B. Proof. Since A <# B, we have B − A <# B. By Theorem 4.2.8(ix), − − B− com is a commuting g-inverse of B − A. So, Bcom (B − A)Bcom , is − the unique reflexive g-inverse of B − A for each Bcom . Therefore, − (B − A)# = B− com (B − A)Bcom .
Theorem 4.3.5. Let A, B ∈ I1,n such that A <# B. Then − − − {A− com }B = {Bcom − Bcom (B − A)Bcom } # = {B− com − (B − A) }.
Proof.
Follows easily from the proof of Theorem 4.3.1 and Lemma 4.3.4.
Corollary 4.3.6. Let A, B ∈ I1,n such that A <# B. Then A <− B and A# = B# − (B − A)# . However, the converse of above the corollary is not true and can be shown by using the matrices A and B of Example 4.2.11. Let A <# B. In view of Theorem 4.2.8, there exists a projector P such that A = PB = BP. We now characterize the class of all projectors P such that A = PB = BP, when A <# B. Theorem 4.3.7. Let A, B ∈ I1,n such that A <# B. Then the class of all projectors P such that A = PB = BP is given by P = PA + CTD, where PA = A# A is the projector projecting vectors onto C(A) along N (A), (C, D) is a rank factorization of I − B# B and T is an arbitrary idempotent matrix of appropriate order.
112
Matrix Partial Orders, Shorted Operators and Applications
Proof. Notice that A <# B ⇒ AA# = BA# and A# A = A# B. Also I − B# B = CD ⇒ BC = 0, DB = 0. So, A = PA B = BPA . Thus, if P is a matrix of the form P = PA + CTD, then B(PA + CTD) = BPA = A and (PA + CTD)B = PA B = A for all T. Let P be a projector such that A = PB = BP. We show that P is of the given form. Let P = PA + Q. Since A = PB = BP and A = PA B = BPA , so, QB = 0, BQ = 0. Also, PA P = PPA = PA , we have PA Q = QPA = 0. Thus, Q is idempotent. Let (C, D) be a rank factorization of I − B# B. As QB = 0, BQ = 0, there exists a matrix T such that Q = CTD. Also, CTD = Q = Q2 = CTDCTD = CT2 D, since I − B# B is an idempotent. Thus, CTD = CT2 D. As C has a left inverse and D has a right inverse, we have T = T2 i.e. T is idempotent. Given a matrix A ∈ I1,n , we now characterize the class of all matrices B ∈ I1,n such that A <# B. First we prove a lemma. Lemma 4.3.8. Let A and B be matrices such that B = A ⊕ (B − A) and B2 = A2 ⊕ (B − A)2 . If any two of A, B and B − A are of index ≤ 1, the the third is of index ≤ 1. Proof.
Trivial.
Theorem 4.3.9. Let A ∈ I1,n . Then for a matrix B ∈ I1,n , A <# B if and only if B = A + (I − AA# )Z(I − A# A) for some Z such that (I − AA# )Z(I − A# A) is of index ≤ 1. Proof. Notice that B = A + (I − AA# )Z(I − A# A) ⇔ A(B − A) = (B − A)A = 0 ⇔ A <# B . Further, B2 = A2 + {((I − AA# )Z(I − A# A))2 }. In view of the fact that the sums involved are direct, the result follows by Lemma 4.3.8. Corollary 4.3.10. Let A be a non-singular matrix. Then diag(A, 0) <# B if and only if B = diag(A, C) for some matrix C of index ≤ 1. Corollary 4.3.11. Let A ∈ I1,n . The class of all matrices B ∈ I1,n such that A <# B is given by B = A ⊕ E where E is an arbitrary matrix of index ≤ 1. such that AE = EA = 0. Let B ∈ I1,n be a given matrix. We now obtain the class of all A ∈ I1,n such that A <# B.
The Sharp Order
113
Theorem 4.3.12. Let B ∈ I1,n . Then the class of all A ∈ I1,n such that A <# B is given by A = PCQ, where (P, Q) is a rank factorization of B and C is an arbitrary idempotent matrix such that CQP = QPC. Proof. Let (P, Q) is a rank factorization of B. Then QP is non-singular. Let C be an idempotent matrix such that CQP = QPC and A = PCQ. Clearly, ρ(A) = ρ(C), as P and Q are full rank matrices. Further, A2 = PCQPCQ = PC2 QPQ. Since QP is non-singular and P, Q are full rank matrices, ρ(A2 ) = ρ(C) = ρ(A). Thus, A ∈ I1,n . Also, A2 = PCQPCQ = PC2 QPQ = PCQPQ = AB. Similarly, we can show A2 = BA. Next, let A ∈ I1,n such that A <# B. Then A2 = BA = AB. Also, C(A) ⊆ C(B), and C(At ) ⊆ C(Bt ). So, A = PCQ, for some matrix C. Since A2 = BA = AB, we have PCQPCQ = PQPCQ = PCQPQ. Since P and Q are full rank matrices, this further gives CQPC = QPC = CQP. Now, CQPC = CQP and CQP = QPC, so, C2 QP = CQP. As QP is non-singular, we have C2 = C. Let A, B be square matrices of the same order over an algebraically closed field. Let B be of index ≤ 1 and B = Pdiag(J1 , . . . , Jr , 0)P−1 be the Jordan decomposition of B, where J1 , . . . , Jr are non-singular Jordan blocks and P is a non-singular matrix. Let A = Pdiag(D1 , . . . , Dr , 0) P−1 , where Dij = Jij , i = 1, . . . , s for sub-permutation (i1 , . . . , is ) of {1, . . . , r} and Dt = 0 for t ∈ {1, . . . , r} ∩ {i1 , . . . , is }c . It is easy to check that A is of index ≤ 1 and A <# B. We now prove the converse of this statement in a special case where there is exactly one Jordan block corresponding to each non-null eigen-value (or equivalently, the geometric multiplicity of each non-null eigen-value is 1). Theorem 4.3.13. Let A, B be square matrices of same order over an algebraically closed field. Let B be of index ≤ 1 and B = Pdiag(J1 , . . . , Jr , 0)P−1 be the Jordan decomposition of B, where J1 , . . . , Jr are non-singular Jordan blocks corresponding to distinct eigenvalues λ1 , . . . , λr and P is a non-singular matrix. Then A is of index ≤ 1 and A <# B if and only if either A is a null matrix or A = Pdiag(D1 , . . . , Dr , 0)P−1 , where Dij = Jij , j = 1, . . . , s for some sub-permutation (i1 , . . . , is ) of {1, . . . , r} and Dt = 0 for t ∈ {1, . . . , r} ∩ {i1 , . . . , is }c . Proof. ‘If’ part is trivial. ‘Only if’ part
114
Matrix Partial Orders, Shorted Operators and Applications
If A is null matrix, the result is trivial. So, let A be non-null. Notice that the following hold: (i) For C, D ∈ I1,n such that C <# D ⇔ RCR−1 <# RDR−1 . (ii) If J is a non-singular Jordan block, then C is an index ≤ 1 matrix such that C <# J if and only if either C is a null matrix or C = J. (iii) If J1 , J2 are non-singular corresponding to distinct Jordan blocks A11 A12 eigen-values, then A = <# diag(J1 , J2 ) if and only A21 A22 if A12 = 0, A21 = 0, ‘A11 = 0 or J1 ’ and ‘A22 = J2 or 0’. The proof follows from the above three steps and by induction on r. Conjecture: The conclusion of Theorem 4.3.13 remains valid even when some or all distinct non-null eigen-values are of geometric multiplicity exceeding 1. We now prove Theorem 4.3.14. Let A, B ∈ I1,n . Then following are equivalent: (i) A = B, # − (ii) A# ∈ {B− com } and B ∈ {A } and # − (iii) A ∈ {Bcom } and ρ(A) = ρ(B). Proof. (i) ⇒ (ii) is trivial. (ii) ⇒ (iii) # − As A# ∈ {B− com } ⇒ ρ(A ) = ρ(Bcom ) for some commuting g-inverse − # Bcom of B. So, ρ(A) = ρ(A ) ≥ ρ(B). Also, B# ∈ {A− }, therefore, ρ(B) = ρ(B# ) ≥ ρ(A). Thus (iii) follows. (iii) ⇒ (i) − # Since A# ∈ {B− = B− com }, A com for some Bcom . Since ρ(A) = ρ(B), # # ρ(A ) = ρ(B) implying A is the commuting reflexive g-inverse of B. Thus, A# = B# . Since for any matrix X of index ≤ 1, X = (X# )# , we have A = B. Let A, B ∈ I1,n such that A <# B. It is easy to see that A# <# B− com − − − − for all B− com ∈ {B com}. Since every Bcom is also an Acom , given a Bcom − # − obviously there exists an A− com such that Bcom < Acom . We now show − − that given commuting g-inverses Acom and Bcom of A and B respectively # − # − such that B− com < Acom whenever A < B, then Acom is a commuting g-inverse of B, a statement that is much stronger than the one in the previous sentence. (See also, Theorem 4.3.1.)
The Sharp Order
115
Theorem 4.3.15. Let A, B ∈ I1,n such that A <# B. Let A− com and B− com be any commuting g-inverses of A and B respectively satisfying # − − B− com < Acom . Then Acom is a commuting g-inverse of B. Proof. Since A <# B, there exist non-singular matrices P, C and D such that A = Pdiag(C, 0, 0)P−1 and B = Pdiag(C, D, 0)P−1 . Then −1 −1 A− , N)P−1 and B− , D−1 , T)P−1 com = Pdiag(C com = Pdiag(C # for some matrices N and T of appropriate order. Since B− A− com < com , −1 # −1 diag(D , T) < N, it follows that N = diag(D , S). Hence A− com is a commuting g-inverse of B. As noted at the beginning of the chapter, every idempotent matrix is of index ≤ 1. We have the following simple but important result. Theorem 4.3.16. Let E and F be idempotent matrices of the same order. Then the following are equivalent: (i) E <s F, (ii) E <− F (iii) E <# F.
and
We now examine when each summand in a sum of matrices of index 1 lies below the sum. The following theorem is an analogue of Fisher-Cochran theorem for matrices of index ≤ 1. Theorem 4.3.17. Let A1 , . . . , Ak ∈ I1,n and A =
k X
Ai . Consider the
i=1
following statements: (i) Ai Aj = 0 whenever i 6= j, (ii) There exist non-singular matrices P, D1 , . . . , Dk such that Ai = Pdiag(0, . . . , 0, Di , 0, . . . , 0)P−1 for each i = 1, . . . , k, (iii) Ai <# A for each i = 1, . . . , k, (iv) ρ(A) = ρ(A2 ) and k X (v) ρ(Ai ) = ρ(A). i=1
Then any of (i) and (ii) implies all the others from (i) to (v). (iii) and (v) together imply all the others from (i) to (iv).
116
Matrix Partial Orders, Shorted Operators and Applications
Proof. (i) ⇒ (ii) Let (Ei , Fi ) be a rank factorization of Ai for i = 1, . . . , k. Since Ai for i = 1, . . . , k is of index ≤ 1, Fi Ei is non-singular for each i = 1, . . . , k. Further, whenever i 6= j, Ai Aj = 0 implies Ei Fi Ej Fj = 0 and therefore, Fi Ej = 0 as each Ek has a left inverse and each Fi has a right inverse. Write E = (E1 : . . . : Ek ) and F = (Ft1 : . . . : Ftk )t . Let E be of orP der n × r and F of order r × n. Now, A = i Ei Fi = EF. Notice that FE is a non-singular, since Fi Ei is non-singular for each i and Fi Ej = 0 whenever i 6= j. So, r = ρ(FE) ≤ ρ(E) ≤ r. Hence ρ(E) = r. Similarly, ρ(F) = r. Thus, (E, F) is a rank factorization of A. Let Ek+1 be a matrix whose columns form a basis of the null space of F. Then Ek+1 is a matrix of order n × (n − r) of rank n − r. Now, if Ek+1 u = Ev for some vectors u and v, then 0 = FEk+1 u = FEv. As FE is non-singular, v = 0. Thus C(Ek+1 ) ∩ C(E) = {0}. Therefore, P = (E : Ek+1 ) is a non-singular matrix and FEk+1 = 0. Similarly, we can extend F to a non-singular matrix Q = (Ft : Ftk+1 )t such that Fk+1 E = 0. Now, Q, P and Fk+1 Ek+1 are non-singular. Also QP = diag(F1 E1 , . . . , Fk Ek , Fk+1 Ek+1 ) is nonsingular. Notice that P−1 = (QP)−1 Q. Now, it is easy to check that for each i = 1, . . . , k, Ai = Pdiag(0, . . . , 0, Fi Ei , 0, . . . , 0)P−1 . Let Di = Fi Ei , for each i = 1, . . . , k and recall that Fi Ei is non-singular. (ii) ⇒ (i) and (i) ⇒ (iii) are trivial. (ii) ⇒ (iv) We have A = Pdiag(D1 , . . . , Dk , 0)P−1 where D1 , . . . , Dk are nonsingular. Clearly, ρ(A) = ρ(A2 ). (ii) ⇒ (v) is trivial. (iii) and (v) ⇒ (i) k X Since ρ(A) = ρ(Ai ), the sums C(A1 ) + . . . + C(Ak ) and C(At1 ) + . . . + i=1
C(Atk ) are direct. Now, AAi = A2i implies
X
Aj Ai = 0. As the sum
i6=j
C(A1 ) + . . . + C(Ak ) is direct, so for each m 6= i, Am Ai = 0.
Corollary 4.3.18. Let A1 , . . . , Ak be square matrices of the same order k X and A = Ai . Consider the following statements: i=1
(i) Ai Aj = 0 whenever i 6= j and ρ(Ai ) = ρ(A2i ) for all i = 1, . . . , k. (ii) Ai = A2i for all i = 1, . . . , k. (iii) A = A2 .
The Sharp Order
117
P (iv) ρ(A) = i ρ(Ai ). (v) There exists a non-singular matrix P such that Ai = Pdiag(0, . . . , 0, I, 0, . . . , 0)P−1 , where I occurs in the ith diagonal block for i = 1, . . . k. Then (v) implies all the others from (i) to (iv) and any two of (i), (ii) and (iii) imply the rest. Also (iii) and (iv) imply the rest. Proof.
Exercise.
Remark 4.3.19. The above corollary is a non-hermitian version of FisherCochran theorem due to [Khatri (1968)].
4.4
Drazin order and an extension
In the last three sections, we studied the properties of the sharp order defined on the class ‘I1 ’ of matrices of index 1. In this section we consider square matrices not necessarily of index ≤ 1. As noted in Theorem 2.2.21, every square matrix A has the Core-Nilpotent decomposition A = A1 +A2 , where A1 ∈ I1 , A2 is nilpotent and A1 A2 = A2 A1 = 0. The matrix A1 is called the core part of A and A2 is called the nilpotent part of A. To start with, we define an order relation using the core part of the matrices ignoring the nilpotent part altogether. We call this order relation as the Drazin order. In the process of studying the properties of this order, we show that the Drazin order is a pre-order and is in general different from the space pre-order. We obtain some characterizations of this order relation, one of which also justifies its name: the Drazin order. We also establish some interesting properties of Drazin order. It is natural to ask is: Is there a way to extend the Drazin order so that it becomes a partial order? Indeed there is a way which involves the nilpotent part as well. We define this order and obtain a canonical form for the matrices under this order. We also show that this order is a partial order implying the minus order. Definition 4.4.1. Let A and B be square matrices of the same order. Let A = A1 + A2 and B = B1 + B2 be the core-nilpotent decompositions of A and B respectively, where A1 is core part of A, B1 is core part of B, A2 is nilpotent part of A and B is nilpotent part of B. The matrix A is said to be below the matrix B under the Drazin order if A1 <# B1 .
118
Matrix Partial Orders, Shorted Operators and Applications
When this happens we write A
The Sharp Order
119
(ii) ⇒ (iii) −1 By Remark 2.2.24, we can write A = Qdiag(C , where C1 is 1 , M1 )Q B11 B12 non-singular and M1 is nilpotent. Let B = Q Q−1 , where B21 B22 partitioning of B is conformable with the partitioning of A. Notice that AD = Qdiag(C−1 , 0)Q−1 . Since (ii) holds, we have 1 −1 −1 C1 0 C1 0 B11 B12 C1 0 = 0 M1 0 0 B21 B22 0 0 −1 C1 0 B11 B12 = . 0 0 B21 B22 −1 Or B11 C−1 = C−1 = 0 and C−1 1 1 B11 = I, B21 C1 1 B12 = 0. So, B11 = C1 , B21 = 0 and B12 = 0. Hence B = Qdiag(C1 , B22 ) Q−1 . Let B22 = Rdiag(C2 , N2 )R−1 , where C2 is non-singular and N2 is nilpotent. Write P = Qdiag(I, R). Then, B = Pdiag(C1 , C2 , N2 )P−1 and A = Pdiag(C1 , R−1 M1 R)P−1 . However, M1 is nilpotent implies R−1 M1 R is nilpotent. Let N1 = R−1 M1 R and (iii) follows. (iii) ⇒ (i) and (iii) ⇒ (iv) are trivial. (iv) ⇒ (ii) Since index of A is k, Ak = Ak+1 AD = AD Ak+1 . So, (AD )k+1 Ak B = (AD )k+1 (A)k+1 = AD A. Also, (AD )k+1 (A)k = AD . So, we have AAD = AD A = AD B. Similarly, BAk = Ak+1 yields BAD = AAD .
Theorem 4.4.4. The Drazin order is a pre-order on Fn×n . Proof. Reflexivity is trivial. To prove transitivity, let A
1 0 0 1 0 0 A = 0 0 1 B = 0 0 0 . 0 0 0 0 0 0
120
Matrix Partial Orders, Shorted Operators and Applications
Then A
0 1 0 0
2 0 0 0
0 0 . 0 0
1 0 . 0 0
and B =
Then A
and B =
Then A <s B, but A 6
The Sharp Order
121
Corollary 4.4.11. Consider the same setup as in Theorem 4.4.10. Then (i) A
1 does not imply A
1 1 0 A = 0 0 1 0 0 0
0 1 0 B = 1 0 0 . 0 0 1
Now, ρ(A) = 2 > ρ(A2 ) = ρ(A3 ) = 1. So index A = 2. The 1 1 1 Drazin inverse of A is AD = 0 0 0. Clearly, AD BAD = AD , 0 0 0 AD B2 = AD B2 = AD , but BAD 6= AAD . Thus A 6
122
Matrix Partial Orders, Shorted Operators and Applications
Theorem 4.4.15. Let A be a square matrix. Then the class of all matrices B such that A
Remark 4.4.16. Let A be a square matrix of index k. Then the class of all matrices B such that A
The Sharp Order
123
Proof. (i) ⇔ (ii) follows once we notice that AAD A is the core part of A and A
The Corollary follows from Theorem 4.4.18(iii).
We now show that the order relation (#, −) implies ‘<− ’. Theorem 4.4.20. The order relation (#, −) on Fn×n implies ‘<− ’ i.e., if A <#,− B, then A <− B. Proof. Let A <#,− B. Then by Theorem 4.4.18, there exists a non-singular matrix P such that A = Pdiag(C1 , 0, N1 )P−1 , and B = Pdiag(C1 , C2 , N2 )P−1 where P, C1 and C2 are non-singular and N1 , N2 are nilpotent satisfying N1 <− N2 . Let N− 1 be a g-inverse of N1 such that − − − −1 N− N = N N , N N = N N . Let G = P diag(C−1 , 0, N− . 1 2 1 1 2 1 1 1 1 )P Then, it is easy to check that G is a g-inverse of A for which AG = BG, GA = GB. Hence, A <− B.
124
4.5
Matrix Partial Orders, Shorted Operators and Applications
Exercises
(1) Prove or disprove the following: A <# B ⇒ AA# <# BB# . (2) Prove that the conclusion of Theorem 4.3.15 is false if A# ∈ {B− }, B# ∈ {A− }, by giving a suitable example. (3) Let A− be any commuting g-inverse of a matrix A of index ≤ 1. Then show that A− A = A# A. (4) Prove the following statements: (i) A <# B ⇔ A# <# B# (ii) A <# B ⇒ A <− B and A2 <− B2 . Does the converse of (ii) hold? (5) Let A be a matrix of index ≤ 1 and (L, R) be a rank factorization of A. Suppose that A <# B. Then show that L(RL)−1 R = L(RL)−2 RB = BL(RL)−2 R. (6) Let A and B be both of index ≤ 1 and satisfy (i) AA# BB# = AB# (ii) A# A = A# B and (iii) C(A) ⊆ C(B). Then prove that A <# B. (7) Let A# ∈ {B− }, A <s B and B# AB# = B# . Show that A = B. (8) Show that the relation B defined as A <s B and and A = AB# A is a partial order on I1,n , the n×n matrices in I1 . Show also, that this relation implies the minus order but is not the same as sharp order of I1,n . (9) Prove or disprove: A <− B ⇒ A# <− B# . (10) For A, B ∈ I1,n and for each λ 6= 0, define
B# − λ A# λ−1 G(λ) = B# − A#
if λ 6= 1
.
if λ = 1
Let A <# B. Show that A <− B and for each λ 6= 0, G(λ) = (B − λA)# , where G(λ) is as above. (11) Let A and B be square matrices of same order. Let A be of index ≤ 1. Show that A# A = A# B, AA# = BA# ⇔ A2 = AB = BA. Also, show that the relation A B defined by the equations A# A = A# B, AA# = BA# is a partial order on F n .
The Sharp Order
125
A 0 B1 B2 (12) Let A ∈ I∞ . Then <# = B ⇔ B = 0 0 B3 B4 B1 (I − AA# )S , where A <# B1 , S, T and B4 are T(I − AA# ) B4 arbitrary and partitions are conformable for matrix multiplication.
Chapter 5
The Star Order
5.1
Introduction
The present chapter is devoted to the study of yet another partial order known as the star order. This order is defined for matrices A and B (possibly rectangular) over the field C of complex numbers. We say that a matrix A is below B under the star order if AA? = BA? and A? A = A? B. If the matrices A and B are taken to be hermitian, these defining equalities read as A2 = AB = BA. Recall from the previous chapter that a matrix A is below a matrix B under the sharp order if and only if A2 = AB = BA. Thus, the star order and the sharp order coincide for the class of hermitian matrices. (Notice that a hermitioan matrix is a matrix of index ≤ 1.) In view of this, one can expect a number of results in this order to be parallel to those in the sharp order. This indeed is true as we shall see in due course. A well explored partial order, the star order was first introduced by Drazin in a star semi-group with involution. While establishing several other properties of this order, he showed that the star order was a partial order. He further showed that in the defining equalities of this relation, one could actually replace ‘?’ by ‘†’, in other words the statement ‘AA? = BA? and A? A = A? B’ is equivalent to ‘AA† = BA† and A† A = A† B’. However, M.Hestenes studied the properties of the matrices A and B satisfying the equalities AA? = BA? and A? A = A? B almost two decades earlier than Drazin. He called matrix A, a section of the matrix B whenever A and B satisfied the equalities AA? = BA? , A? A = A? B. The term star order was coined by Drazin. In Section 5.2, beginning with a formal definition of the star order, we obtain several of its characteristic properties. While studying the properties of the minus order and the sharp order, we found that the normal form
127
128
Matrix Partial Orders, Shorted Operators and Applications
and the reduced form (whichis a special case of the core-nilpotent decomposition) A = Pdiag D , 0 P−1 for any matrix A of index ≤ 1, where P and D are non-singular matrices respectively proved useful. In the case of the star order, we shall see that the key reduced form is the singular value decomposition (SVD). In Section 5.3, apart from obtaining several other properties of the star order, we compare the star order with the other partial orders studied so far. We also identify the class of all matrices that lie above/below a given matrix under the star order. In Section 5.4, we specialize to some subclasses of matrices of index ≤ 1, which are important in applications. A detailed study of star order on these subclasses that include range-hermitian, normal, hermitian and nnd matrices, is undertaken here. Section 5.5 covers the star order on idempotent matrices. Idempotent matrices play a significant role in applications and for this reason a separate section on star order for the class of idempotent matrices is included. Finally in Section 5.6, Fisher-Cochran Type theorems are established for star order. The matrices studied in this chapter are over C and the inner product of vectors x, y in Cn will be the standard inner product, (x, y) = y? x, unless otherwise stated.
5.2
Star order - Characteristic properties
In this section we define the star order and obtain several of its characteristic properties and establish that the star order is a partial order on Cm×n . We show that if A is below B under the star order, then A and B have simultaneous singular value decomposition. Definition 5.2.1. Let A and B be matrices of the same order m × n. Then A is said to be below B under the star order if AA? = BA? and A? A = A? B. When A is below B under the star order, we write A B. Remark 5.2.2. It is easy to show the following: (i) A B ⇔ A? B? and (ii) If U and V are unitary matrices of orders m × m and n × n respectively, then A B ⇔ UAV UBV. We start with the following theorem which shows that under the star order the matrices have a nice canonical form that is very useful in proving results.
The Star Order
129
Theorem 5.2.3. Let A and B be matrices of the same order having ranks a and b respectively. Then the following are equivalent: (i) A B. (ii) There exist unitary matrices U and V, positive definite diagonal matrices Da and Db−a of orders a × a and − a) × (b − a) (b ? respectively such that A = Udiag Da , 0 , 0 V and B = Udiag Da , Db−a , 0 V? and (iii) B − A B. Proof. (i) ⇒ (ii) Since AA? = BA? and A? A = A? B, both A? B and BA? are hermitian and nnd. So, by Theorem 2.7.18, there exist unitary matrices U, V and nnd diagonal matrices D1 and D2 such that A = UD1 V? and B = UD2 V? . ? Since A B, by Remark 5.2.2, we have D1 < D2 . Without loss of generality, we can take D1 = Da , 0 , 0 and D2 = Da , Db−a , 0 , where Da and Db−a are positive definite diagonal matrices, and (ii) follows. (ii) ⇒ (i) is trivial. (i) ⇔ (iii) is easy to check. Remark 5.2.4. Let U and V be unitary matrices of orders m × m and n × n respectively such that A = Udiag Da , 0 , 0 V? and B = Udiag Da , Db−a , 0 V? , where Da and Db−a are positive definite diagonal matrices of order a × a and (b − a) × (b − a) respectively. Then the class of all (a) least squares g-inverse of A and B are respectively given by −1 Da 0 0 M1 M2 M3 U? ; Mi Ni , arbitrary i = 1, 2, 3 , {A− } = V ` N1 N2 N3 and −1 Da 0 0 −1 ? 0 {B− } = V U ; L , arbitrary i = 1, 2, 3 . D 0 i b−a ` L1 L2 L3 (b) minimum norm g-inverses of A and B are respectively given by −1 Da S2 S3 ? {A− } = V U ; S , R , T , arbitrary i = 2, 3 0 R R i i i 2 3 m 0 T2 T 3
130
Matrix Partial Orders, Shorted Operators and Applications
and −1 Da 0 X1 −1 ? {B− } = V U ; X , arbitrary i = 1, 2, 3 . 0 D X i b−a 2 m 0 0 X3 Remark 5.2.5. Let A B. Then (i) AB† A = BB† A = AB† B = A, or equivalently (ii) A <− B. However, the converse is false as the following example shows. 1 1 1 1 Example 5.2.6. Let A = and B = . Clearly, A <− B. 0 0 1 0 2 0 2 0 As AA? = and BA? = , A≮? B. 0 0 1 0 It easy to see that each of the equalities in Remark 5.2.5(i) hold. Theorem 5.2.7. The star order is a partial order on Cm×n . Proof. Reflexivity is trivial and anti-symmetry follows by Remark 5.2.5. For transitivity, let A B and B C. Then by Theorem 5.2.3, there exist unitary matrices U andV such that A = U diag Da , 0 , 0 V? and B = U diag Da , Db−a , 0 V? , where Da and Db−a are positive definite diagonal matrices of order a × a and (b − a) × (b − a) respectively. Let C11 C12 C13 C = U C21 C22 C23 V? , where partition involved is conformable for C31 C32 C33 multiplication of Cwith A and B. Since, B C, a simple computation Da 0 0 shows that C = U 0 Db−a 0 V? . Clearly, A C. 0 0 C33 Theorem 5.2.8. Let A and B be matrices of the same order. Then the following are equivalent: − − − (i) A− `m A = A`m B and AA`m = BA`m for some minimum norm least − squares g-inverse A`m of A (ii) A† A = A† B and AA† = BA† (iii) A B − − − (iv) {B− ` } ⊆ {A` } and {Bm } ⊆ {Am } − − (v) {B`m } ⊆ {A`m } (vi) A† A = B† A and AA† = AB† , and
The Star Order
131
(vii) There exist orthogonal projectors P and Q such that A = PB = BQ. − † Proof. (i) ⇒ (ii) Since A− `m AA`m = A , the result follows. (ii) ⇒ (i) Since A† is a minimum norm least squares g-inverse, the result follows. (ii) ⇒ (iii) Notice that A? = A† AA? = A? AA† . So, AA† = BA† ⇒ AA† AA? = BA† AA? = B(A† A)? A? ⇒ AA? = BA? . Similarly A? A = A? B. (iii) ⇒ (iv) follows from Theorem 5.2.3 in view of Remark 5.2.4. (iv) ⇒ (v) − − − − − − Notice that for any B− `m , B`m ∈ {B` } ⊆ {A` } and B`m ∈ {Bm } ⊆ {Am }. − − − Also, {A`m } = {A` } ∩ {Am }. So, the result follows. (v) ⇒ (vi) † † † Since B† ∈ {A− `m }, we have AB A = A and AB and B A are hermitian. † † † † ? † ? † † So, A A = A AB A = (A A) (B A) = (B AA A)? = (B† A)? = B† A. Similarly, AA† = AB† . (vi) ⇒ (vii) From Remark 5.2.5, we have A = AB† B = AA† B, as AA† = AB† . Moreover, A = BB† A = BA† A, since A† A = B† A. Let P = AA† and Q = A† A. Clearly, P and Q are orthogonal projectors satisfying A = PB = BQ. (vii) ⇒ (iii) Now, A = PB ⇒ C(A) ⊆ C(P ) ⇒ PA = A ⇒ A? P = A? . Now A = PB ⇒ A? A = A? PB = A? B. Similarly, A = BQ ⇒ AA? = BA? . (iii) ⇒ (ii) Since A† = A† A†? A? = A? A†? A† , (ii) follows.
Corollary 5.2.9. Let A and B be matrices of the same order. Then the following are equivalent: (i) (ii) (iii) (iv)
A B A† B† AA† B = A = BA† A = BA† B and A† AB† = A† = B† AA† = B† AB† .
Proof. By Theorem 5.2.3(ii) it is easy to see that (i) ⇔ (ii), (i) ⇔ (iii) and (i) ⇔ (iv).
132
Matrix Partial Orders, Shorted Operators and Applications
We now explore the additional conditions required for the minus order to yield the star order. Theorem 5.2.10. Let A and B be matrices of the same order m × n such that A <− B. Let ρ(A) = a, ρ(B) = b and k be positive (integer or fraction). (i) The following are equivalent: (a) (b) (c) (d)
A B AB? and A? B are hermitian A† B and BA† are hermitian AB† and B† A are hermitian.
and
(ii) The following are equivalent: (a1 ) (b1 ) (c1 ) (d1 ) (f1 ) (e1 )
A B AA? BB? (AA? )k (BB? )k (AA? )k A (BB? )k B BA† B = A and C(BA† B) ⊆ C(A), C(BA† B)? ⊆ C(A)? .
(iii) The following are equivalent: (a2 ) (b2 ) (c2 ) (d2 ) (f2 ) (e2 )
A B A? A B? B (A? A)k (B? B)k (A? A)k A? (B? B)k B? B† AB† = A† and C(B† AB† ) ⊆ C(A† ).
Finally, all the conditions in anyone of (i), (ii) and (iii) are equivalent to all conditions in any other. Proof. (i) (a) ⇒ (b) A B ⇒ AA? = BA? and A? A = A? B. Also, (BA? )? = AB? . It is now clear that AB? and A? B are hermitian. (b) ⇒ (c), (d) Since AB? and A? B are hermitian, by Theorem 2.7.12, there exist unitary matrices U, V and real diagonal matrices D1 , D2 such that A = UD1 V? and B = UD2 V? . Clearly, A† B and BA† are hermitian and AB† and B† A are hermitian. (c) ⇒ (b) follows along the lines of (b) ⇒ (c).
The Star Order
133
Similarly, (d) ⇒ (b). (b) ⇒ (a) Since AB? and A? B are hermitian, by Theorem 2.7.12, there exits unitary matrices U, V and real diagonal matrices D1 , D2 such that A = UD1 V? and B = UD2 V? . As A <− B, it follows that D1 <− D2 . By suitable simultaneous permutations of rows and columns of D1 and D2 we have PD1 Q = diag(Da , 0b−a , 0) and PD2 Q = diag(Da , Db−a , 0), where P and Q are permutation matrices, Da and Db−a are non-singular diagonal matrices. Without loss of generality, we can take Da and Db−a as diagonal positive definite matrices. For, if some diagonal element di of Da (or Db−a ) is negative, then we can make it positive by multiplying it by −1 and multiplying the ith row of V? by −1. Thus, A = UP? diag(Da , 0b−a , 0)QV? and B = UP? diag(Da , Db−a , 0)QV? . Clearly, A B. (ii) (a1 ) ⇒ (b1 ) Since A B, AA? = BA? and A? A = A? B. Consider AA? BB? = AA? AB? = AA? (AB? ) = AA? AA? , since AA? = BA? = AB? and AA? is hermitian. Similarly, AA? BB? = AA? AA? . (b1 ) ⇒ (c1 ) If k is a positive integer, the result follows by (a1 ) ⇒ (b1 ). If k is a fraction, then as AA? and BB? commute and are nnd matrices, therefore, there exists a unitary matrix U such that A? A = Udiag(D1 , 0, 0)U? and B? B = Udiag(D1 , D2 , 0)U? . The result now follows by taking the k th powers. (c1 ) ⇒ (b1 ) Since (A? A)k (B? B)k , (A? A)k and (B? B)k commute and are nnd matrices. So, there exist unitary matrix U such that (A? A)k = UD1 k U? and (B? B)k = UD2 k U? , where D1 and D2 are diagonal nnd matrices.Now, (AA? )k (BB? )k ⇒ (D1 )k (D2 )k . Let i1 th , i2 th . . . , ia th diagonal elements are non-null in D1 . Then i1 th , i2 th . . . , ia th diagonal elements are same in both D1 and D2 . So, D1 D2 or AA? BB? . (b1 ) ⇒ (a1 ) Since AA? BB? , it follows that AA? , BB? commute and are nnd matrices. Hence, there exists a unitary matrix U, positive definite diagonal matrices Da and Db−a such that AA? = Udiag(D2a , 0b−a , 0)U? and BB? = Udiag(D2a , D2b−a , 0)U? . Partition U = (U1 : U2 : U3 ), where U1 and U2 have a and b − a columns respectively. Then A = U1 D2a L1 ? and B = (U1 : U2 )diag(D2a , D2b−a )(K1 : K2 )? for some matrices L1 , K1
134
Matrix Partial Orders, Shorted Operators and Applications
and K2 such that L1 L1 ? = I and (K1 : K2 )? (K1 : K2 ) = I. Since A <− B, C(L1 ) ⊆ C(A? ) ⊆ C(B? ) = C(K1 : K2 ). So, there exist ma? trices F1 and F2 suchthat L1 = (K1 : K 2 )(I − ?F1 : −F2 ) . We can Da (I − F1 ) −Da F2 K1 write A = (U1 : U2 ) . Hence, B − A = 0 0 K2 ? ? Da F1 Da F2 K1 (U1 : U2 ) . As A <− B, ρ(B − A) = b − a. So, 0 0 K2 ? Da F1 = 0 and therefore, F1 = 0. Thus, L1 = K1 − K2 F2 ? . Now, I = L1 L1 ? = I + F2 ? F2 . So, F2 = 0. Thus, L1 = K1 and A = U1 Da K1 and B = (U1 : U2 )diag(D2a , D2b−a )(K1 : K2 )? . It is clear that A B. (c1 ) ⇒ (d1 ) is trivial. (d1 ) ⇒ (c1 ) From (a1 ) ⇒ (c1 ), we have (AA? )k A (BB? )k B ⇒ (AA? )k AA? (AA? )k (BB? )k BB? (BB? )k , or (AA? )2k+1 (BB? )2k+1 , so, (AA? )p (BB? )p for some positive p. (a1 ) ⇒ (f1 ) By (i) A† A = A† B, AA† = BA† . So, BA† B = AA† A = A. (f1 ) ⇒ (e1 ) is trivial (e1 ) ⇒ (a1 ) Now, C(BA† B) ⊆ C(A) ⇒ AA† BA† B = BA† B. Also, A <− B, so, A = AB† A = AB† B. Now, pre-multiplying and post-multiplying AA† BA† B = BA† B by AB† and B† AA† , we have AB† AA† BA† BB† AA† = AB† BA† BB† AA† . As A = AB† A = AB† B, we have BA† = AA† . Similarly, using C(BA† B)? ⊆ C(A)? , we can establish A† A = A† B. The proof for (iii) can be obtained by replacing A in (ii) by A? and noting that A B ⇔ A? B? . Theorem 5.2.11. Let A and B be matrices of the same order with ρ(A) = a and ρ(B) = b. Then the following are equivalent: (i) (ii) (iii) (iv)
A B A <− B and (B − A)† = B† − A† A <− B and B† − A† ∈ {(B − A)− ` } A <− B and B† − A† ∈ {(B − A)− m }.
and
Proof. (i) ⇒ (ii) By Remark 5.2.5, A <− B. By Theorem 5.2.3, there exist unitary matrices
The Star Order
135
U and V such that A = Udiag Da , 0 , 0 V? and B = Udiag Da , Db−a , 0 V? , where Da and Db−a are positive definite diagonal matrices of order a × a and (b − a) × (b − a) respectively. Clearly, (B − A)† = B† − A† . (ii) ⇒ (iii) and (ii) ⇒ (iv) are easy. (iii) ⇒ (i) Since A <− B, by Corollary 3.7.4, there exist unitary matrices U, V, a positive definite diagonal matrix Da and a b × b non-singular matrix M given as Da 0 F1 M= + K F2 , I 0 0 I such that A = Udiag(Da , 0b−a , 0)V? and B = U(M, 0)V? , where K is non-singular and F1 and F2 are some matrices of suitable orders. Also, ? † † ? B† − A† ∈ {(B − A)− ` }, so, (B − A) (B − A)(B − A ) = (B − A) . However, A† = Vdiag(Da −1 , 0b−a , 0)U? , B† = V(M−1 , 0)U? , Da −1 − Da −1 F1 −1 where M = . Hence, −F2 Da −1 K−1 ? ? F1 F2 F2 K? F?1 , I K F2 , I = K? F?1 , I . I I I ? F2 The matrix has a right inverse and K is invertible. So, we have I 0 − Da −1 F1 ? (F1 F1 + I)K F2 , I = F?1 , I . (5.2.1) −1 −1 −F2 Da K Now, (5.2.1) ⇒ (F1 F?1 + I)K(−F2 Da −1 ) = F?1 and F1 F?1 + (F1 F?1 + I)KK−1 = I. These give F1 = 0 and F2 = 0. It now follows that A B. (iv) ⇒ (i) follows along the same lines as the proof of (iii) ⇒ (i). In Theorem 5.2.11, we have seen that if A <− B and B† − A† is either a least squares g-inverse or a minimum norm g-inverse of B − A, then A B. One may be curious to know whether the condition ‘A <− B and B† − A† is reflexive g-inverse of B − A’ would also imply A B. The answer is in the negative as the following example shows: I 0 I S and B = , where S and T Example 5.2.12. Let A = 0 0 T −I are matrices satisfying ST = 0, TS = 0. Then A† = A and B† = B−1 = B and B − A is its own reflexive g-inverse. Clearly, A≮? B.
136
Matrix Partial Orders, Shorted Operators and Applications
Incidentally, Example 5.2.12 also shows that A <− B and A† <− B† together do not imply A B. 5.3
Subclasses of g-inverses for which A B
In this section we characterize the class of all g-inverses A− of a matrix A such that A− A = A† A = A† B and AA− = AA† = BA† , when A B. We also obtain the class of all matrices that lie above/below a given matrix under the star order. Theorem 5.3.1. Let A and B be matrices of the same order such that A B and let A− `m be any minimum norm least squares g-inverse of A − − such that A− A = A− `m `m B and AA`m = BA`m . Then for each minimum − − − norm least squares g-inverse B`m of B, we have A− `m A = B`m A = A`m B − − − and AA`m = AB`m = BA`m . Proof. Proof follows along the lines of Theorem 4.3.1 by making use of Remark 5.2.4. Corollary 5.3.2. Let A and B be matrices of the same order such that A B and let A− `m be any minimum norm least squares g-inverse of A − such that A− A = A− `m B and AA− `m = BA`m . Then for each mini`m − − mum norm least squares g-inverse B`m of B, A`m A and B− `m B commute. − Further, AA− and BB commute. `m `m Given matrices A and B of the same order, we shall now obtain a characterization of ? ? ? ? {A− `m }B = {G : A GA = A , AGA = A , AG = BG, GA = GB},
the set of all minimum norm least squares g-inverses of A for which A B Theorem 5.3.3. Let A and B be matrices of the same order such that A B. Let ρ(A) = a and ρ(B) = b. Then − − − − † {A− `m }B = {B`m − B`m (B − A)B`m } = {B`m − (B − A) }. − Proof. First observe that B− `m (B − A)B`m is a minimum norm least squares g-inverse of B − A. Further, it is a reflexive g-inverse of B − A. − − Hence, for each B− `m , B`m (B − A)B`m is the Moore-Penrose inverse of − − − † B − A. Thus, {B`m − B`m (B − A)B`m } = {B− `m − (B − A) }. Using the representations of A and B as in Theorem 5.2.3, we can easily check that
The Star Order
137
? −1 G ∈ {A− matrix `m } if and only if G = Vdiag Da , 0b−a , T U for some −1 ? −1 T. Also, each B− is of the form Vdiag U for some D , D , T a `m b−a ? matrix S and (B − A)† = Vdiag 0 , D−1 U . The rest of the proof , T b−a is computational. We now obtain the class of all matrices that lie above a given matrix under the star order. Theorem 5.3.4. Let A and B be matrices having the same order. Then − − A B if and only if B = A + (I − AA− ` )T(I − Am A), where A` and − Am are respectively arbitrary least squares g-inverse and minimum norm g-inverse of A and T is arbitrary. Proof follows by definition. Our next theorem is yet another characterization of all matrices lying above a given matrix under the star order. Theorem 5.3.5. Let A be an m × n matrix and A = U1 D1 V1? be the singular value decomposition of A, where U1 , V1 are semi-unitary and ? D1 is a positive definite ? diagonal matrix. Then A < B if and only if B = Udiag D1 , D2 V , where U, V are arbitrary extensions of U1 and V1 respectively to unitary matrices and D2 is arbitrary matrix of appropriate order. Proof. ‘If’ part is trivial. ‘Only if’ part Extend U1 and V1 to unitary matrices U = (U1 : U2 ) and V = (V1 : V2 ). Then we can write A= Udiag D1 , 0 V? . It is now easy to check that B = Udiag D1 , D2 V? , for some matrix D2 . In the next theorem we obtain the class of all matrices A that are below a given matrix B under the star order. Theorem 5.3.6. Let B an m × n matrix and B = U1 DV1? be the singular value decomposition of B, where U1 , V1 are semi-unitary and D = [dij ] is a positive definite diagonal matrix. Then A B if and only if A = U1 EV1? , where E = [eij ] is a diagonal matrix such that eii = dii or 0 for each i and eij = 0 whenever i 6= j. Proof follows from Theorem 5.2.3.
138
Matrix Partial Orders, Shorted Operators and Applications
Remark 5.3.7. The matrices U1 and V1 in Theorem 5.3.6 are in general not unique unless D has all diagonal elements distinct. Whenever U1 and V1 are not unique, we can vary them over all possible choices of U1 and V1 to obtain all A that are below B under A B. 5.4
Star order for special subclasses of matrices
In this section we specialize to some subclasses of matrices with index ≤ 1, namely, range-hermitian matrices, normal matrices, hermitian matrices and the nnd matrices and make a detailed study of the star order on these subclasses. We also study the relation between the minus order and the star order for several of these subclasses. Theorem 5.4.1. Let A and B be range-hermitian matrices of the same order having ranks a and b (b ≥ a) respectively. Then (i) A <− B if and only if there exist a unitary matrix U, non-singular matrices T and M, where M is of the form T 0 F1 M= + K(F2 , I), 0 0 I where K is a (b − a) × (b − a) non-singular matrix and F1 , F2 are some matrices of appropriate order such that A = Udiag(T, 0)U? and B = Udiag(M, 0)U? . (ii) A B if and only if A and B have the same form as in (i) with F1 = 0 and F2 = 0. Proof. (i) ‘If’ part is trivial. ‘Only if’ part Since A is range-hermitian, by Theorem 2.2.9, there exists a unitary matrix W and a non-singular matrix T such that A = Wdiag(T, 0)W? . Also, A and B are range-hermitian matrices, so, C(A? ) = C(A) ⊆ C(B) = C(B? ). The remaining proof follows from Corollary 3.7.4. (ii) The proof is trivial in view of (i) and the definition of star order. Let A and B be range-hermitian matrices such that A <− B. What are the additional conditions that ensure A B? The answer is contained in the following theorem: Theorem 5.4.2. Let A and B be range-hermitian matrices of the same order such that A <− B. Then the following are equivalent:
The Star Order
(i) (ii) (iii) (iv) (v) (vi) (vii) (viii) (ix)
139
A B A? B and BA? are hermitian. A† B and BA† are hermitian AB† and B† A are hermitian A† and B commute A and B† commute A and B commute BA† B is range-hermitian and C(BA† B) ∈ C(A) and B† AB† is range-hermitian and C(B† AB† ) ∈ C(A).
Proof. Equivalence of (i)-(vii) follows from Theorem 5.4.1 and equivalence of (i), (viii) and (ix) follows from Theorem 5.2.10. Theorem 5.4.3. Let A and B be matrices of the same order such that A B. The following are equivalent: (i) A is range-hermitian (ii) A† and B commute and (iii) A and B† commute. Proof. (i) ⇔ (ii) Since A B, A† A = A† B and AA† = BA† . Now, A† and B commute ⇔ A† B = BA† ⇔ A† A = AA† ⇔ A is rangehermitian. (i) ⇔ (iii) follows from Theorem 5.2.8. We now turn our attention to normal matrices. Theorem 5.4.4. Let A and B be matrices of the same order such that A B. Then the following are equivalent: (i) A is normal (ii) A? and B commute (iii) A and B? commute.
and
The proof follows from definition of the star order and Remark 5.2.2(i). Theorem 5.4.5. Let A and B be normal matrices of the same order n × n having ranks a and b (b ≥ a) respectively. Then (i) A <− B if and only if there exist a unitary matrix U, a non-singular matrix Da with possibly complex diagonal elements and a non-singular, Da 0 F1 normal matrix M of the form M = + K(F2 , I), where 0 K I K is a (b − a) × (b − a) non-singular matrix and F1 , F2 are some
140
Matrix Partial Orders, Shorted Operators and Applications
matrices of appropriate order such that A = Udiag(Da , 0b−a , 0)U? and B = Udiag(M, 0)U? . (ii) A B if and only if A = Udiag(Da , 0b−a , 0)U? and B = Udiag(Da , Db−a , 0)U? . Proof. (a) Notice that a normal matrix is unitarily similar to a diagonal matrix with possibly complex diagonal elements and (b) B is normal if and only if M is normal. Proof now follows from Theorem 5.4.1. Theorem 5.4.6. Let A and B be normal matrices of the same order such that A <− B. Then the following are equivalent: (i) (ii) (iii) (iv) (v) (vi) (vii) (viii) (ix) (x) (xi) (xii) (xiii)
A B A† B and BA† are hermitian B† A and AB† are hermitian A? B and BA? are hermitian B? A and AB? are hermitian A and B commute A? and B commute A and B? commute A? and B? commute A† and B commute A and B† commute BA† B is normal and C(BA† B) ⊆ C(A) and B† AB† is normal and C(B† AB† ) ⊆ C(A).
In view of Theorem 5.4.5(i), proof follows from Theorem 5.2.10. Remark 5.4.7. Let A and B be hermitian matrices of the same order. If 0 we replace (xii) and (xiii) in Theorem 5.4.6 with (xii) C(BA† B) ⊆ C(A) 0 † † and (xiii) C(B AB ) ⊆ C(A) respectively, then the statements (i)-(xi), 0 0 (xii) and (xiii) are equivalent. Theorem 5.4.8. Let A and B be matrices of the same order having ranks a and b respectively. Let A <− B. Then A B if and only if A? B and AB? are normal. Proof. ‘If’ part is trivial. ‘Only if’ part By Corollary 3.7.4, there exist unitary matrices U, V, a positive definite diagonal matrix Da , and a non-singular matrix M such that A = Udiag(Da , 0b−a , 0)V? and B = Udiag(M, 0)V? ,
The Star Order
141
F1 K(F2 , I), with K a (b − a) × (b − a) I non-singular matrix and F1 , F2 some matrices of appropriate order. Now, A? B is normal ⇒ Vdiag(Da , 0b−a , 0)diag(M, 0)V? is normal Da 0 0 Da + F1 KF2 F1 K 0 ⇒ 0 0 0 KF2 K 0 is normal. 0 0 0 0 0 0 2 Da + Da F1 KF2 Da F1 K 0 (Da 2 + Da F1 KF2 )? 0 0 ⇒ 0 0 0 (Da F1 K)? 0 0 where M = diag(Da , 0) +
0 0 0 0 0 2 2 ? (Da + Da F1 KF2 ) 0 0 Da + Da F1 KF2 Da F1 K = (Da F1 K)? 0 0 0 0 0 0 0 0 0 ⇒ F1 = 0.
0 0 0 0
Similarly, AB? is normal ⇒ F2 = 0. Therefore, A = Udiag(Da , 0b−a , 0)V? , B = Udiag(Da , K, 0)V? . Clearly, A B.
Corollary 5.4.9. Let A and B be matrices of the same order having ranks a and b respectively. Let A <− B. Then the following are equivalent: (i) (ii) (iii) (iv) (v)
A B A† B and B† A and A? B and B? A and
AB† BA† AB? BA?
are normal are normal are normal are normal.
and
Theorem 5.4.10. Let A and B be matrices of the same order having ranks a and b (b ≥ a) respectively. Let A <− B, A be range-hermitian, B be hermitian and A? B be normal. Then A B. Proof.
By Theorem 5.4.1, A = Udiag(T, 0)U? and B = Udiag(M, 0)U? ,
where U is unitary, T is non-singular of order a × a and M is a b × b matrix T 0 F1 matrix of the form M = + K(F2 , I), with K a (b−a)×(b−a) 0 0 I non-singular matrix and F1 , F2 are some matrices of appropriate order.
142
Matrix Partial Orders, Shorted Operators and Applications
T 0 0 T + F1 KF2 F1 K 0 Since, A? B is normal ⇒ 0 0 0 KF2 K 0 is normal. 0 0 0 0 0 0 This gives F1 = 0. As B is hermitian, F2 = 0. So, A = Udiag(T, 0b−a , 0)V? , B = Udiag(T, K, 0)V? . Clearly, A B.
Theorem 5.4.11. Let A and B be nnd matrices of the same order. Consider the following statements: (i) (ii) (iii) (iv) (v)
A <− B A2 <− B2 A B A2 B2 and AB = BA = A2 .
Then (i) and (ii) imply all the others. Any of (iii), (iv) and (v) imply all others. Proof. (i) and (ii) ⇒ (iii), (iv) and (v) By Corollary 3.7.5, there exists a unitary matrix U, an a × a positive definite diagonal matrix Da and a b × b positive definite matrix M = Da 0 F + M22 (F? , I) such that A = Udiag(Da , 0b−a , 0) U∗ and 0 0 I B = Udiag(M, 0) U? ; where M22 is a positive definite matrix and F is some suitable matrix. As A2 <− B2 , A2 (B† )2 A2 = A2 . Therefore, 2 2 † 2 2 Da 0 0 Da 0 0 Da 0 0 Da + FKF? FK 0 0 0 0 F? K K 0 0 0 0 = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ⇒ F = 0. So,(iii), (iv) and (v) all hold. (v)⇒ (iii), (iii) ⇒ (i) and (iv)⇒ (ii) are trivial. (v)⇒ (ii) and (iv) Let (v) hold. Then there exist nnd diagonal matrices ∆1 , ∆2 and a unitary matrix V such that A = V∆1 V? and B = V∆2 V? with ∆1 ∆2 = ∆2 ∆1 = ∆21 . Hence, by a suitable permutation of rows and the corresponding columns of ∆1 , ∆2 , we have P∆1 P? = diag(D1 , 0 , 0) and P∆2 P? = diag(D1 , D2 , 0), where D1 , D2 are positive definite diagonal matrices and P is a permutation matrix. Let U = VP. Then U is unitary and A = Udiag(D1 , 0, 0)U? and B = Udiag(D1 , D2 , 0)U? .
The Star Order
143
Clearly, (ii) and (iv) hold. (iv) ⇒ (v) Let (iv) hold. Clearly, A2 and B2 are nnd and therefore, A2 B2 = B2 A2 = A4 . Thus, there exist nnd diagonal matrices D1 , D2 and a unitary matrix U such that A2 = Udiag(D1 2 , 0, 0)U? and B2 = Udiag(D1 2 , D2 2 , 0)U? . Since A and B are the unique nnd square roots of A2 and B2 , we have A = Udiag(D1 , 0, 0)U? and B2 = Udiag(D1 , D2 , 0)U? . It is now clear that (v) holds. Corollary 5.4.12. Let A and B be hermitian matrices of the same order. Consider the statements (i)-(v) in Theorem 5.4.11. Then (i) and (ii) imply all the others. Any of (iii) and (v) imply all others. Also, (iv) implies (v) if and only if (a) every negative eigen-value λ of A is an eigen-value of B and (b) the algebraic multiplicity of λ as an eigen value of A is ≤ the algebraic multiplicity of λ as an eigen value of B. Theorem 5.4.13. Let A and B be square matrices of the same order. Then the following hold: (i) If C(A? ) ⊆ C(B), then B† A† is a reflexive least squares g-inverse of AB. † (ii) If A B and B is hermitian, then (AB) = B† A† . Proof. (i) Since C(A? ) ⊆ C(B), we have A = ABB† . Therefore, AB(B† A† )AB = (ABB† )A† AB = AB, B† A† (AB)B† A† = B† A† (ABB† )A† = B† A† AA† = B† A† and ABB† A† = AA† is hermitian, as AA† is hermitian. (ii) Since A B and B is hermitian, in view of (i), we only need to show that B† A† AB is hermitian. However, A B ⇒ A† A = A† B, AA† = BA† and A† A = B† A, AA† = AB† , by Theorem 5.2.8. Therefore, AA† B = A and A† AB? = A? = B? AA† . Hence, B† A† AB = (B† A† A)B = (B† A? ) = (AB† )? = AB† . Since AB† is hermitian, B† A† is the Moore-Penrose inverse of AB. We observe that the hypothesis in Theorem 5.4.13(i) is not enough to guarantee B† A† is the Moore-Penrose inverse of AB. Also, we cannot replace the condition A B in Theorem 5.4.13 (ii) by A <− B, even when A and B are nnd. We give the following example:
144
Matrix Partial Orders, Shorted Operators and Applications
1 1 2 1 and B = . The matrices A and 1 1 1 1 B satisfy all the in Theorem 5.4.13(i) and (ii). conditions 1 1 1 −1 3 3 1 1 † † † and B = . Now, (AB) = 26 , and Also, A = 4 1 1 −1 2 2 2 0 0 B† A† = 41 . Thus, (AB)† 6= B† A† . 1 1
Example 5.4.14. Let A =
Theorem 5.4.15. Let A and B be range-hermitian matrices of the same 2 order. Then A B if and only if (AB)† = B† A† = A† B† = A† . Proof. The ‘only if’ part follows from Theorem 4.2.8 as for any rangehermitian matrix X, X# = X† and A B is equivalent to A <# B. ‘If’ part 2 2 Notice that A† = B† A† ⇒ A† AA? = B† A† AA? = B† A? and 2 A† AA? = A† A? , we have A† A? = B† A? . Since, A is range-hermitian, there is a matrix D such that A = A? D. Therefore, A† A = A† A? D = B† A? D = B† A. 2 Similarly, using A† B† = A† we can establish AA† = BA† . Thus, A B. Recall that an m×n complex matrix U is a partial isometry if UU? U = U or equivalently, U† = U? . We can easily check that if U is a partial isometry, then for any unitary matrices P, Q of appropriate order PUQ is also a partial isometry. We show that for partial isometries, the star order and the minus order coincide. Theorem 5.4.16. Let A and B be partial isometries of the same order m × n. Then A B if and only if A <− B. Proof. ‘Only if’ part is trivial. For ‘If’ part, notice that by Corollary 3.7.4, A <− B ⇔there exist unitary matrices U, V positive definite diagonal matrix Da , and a non-singular matrix M such that A = Udiag(Da , 0b−a , 0)V? and B = Udiag(M, 0)V? , F1 where M = diag(Da , 0) + K(F2 , I), with K a (b − a) × (b − a) I non-singular matrix and F1 , F2 some matrices of appropriate order ⇔ diag(Da , 0) <− M. Since A is a partial isometry, we have Da = I and M? = M† = M−1 . It is easy to check that F1 = 0 and F2 = 0. It follows that A B.
The Star Order
5.5
145
Star order and idempotent matrices
In this section we consider idempotent matrices and orthogonal projectors and study the star order for these matrices. Recall for idempotent matrices E and F, E <− F if and only if E = EF = FE. We have the following: Theorem 5.5.1. Let E and F be any idempotent matrices of the same order. Then E <− F if and only if E E + (I − F)? . Proof. E E + (I − F)? ⇔ E? E = E? (E + (I − F)? ), EE? = (E + (I − F)? )E? ⇔ E? E = E? E + E? − E? F? , EE? = EE? + E? − F? E? ⇔ E = EF = FE ⇔ E <− F.
We now investigate when a given idempotent matrix E is below another under the star order. Theorem 5.5.2. Let E and F be idempotent matrices of the same order. Then the following are equivalent: (i) (ii) (iii) (iv)
E F EE? FF? and E? E F? F EE? <− FF? and E? E <− F? F and FF? − EE? = (F − E)(F − E)? and F? F − E? E = (F − E)? (F − E).
Proof. (i) ⇒ (ii) Since E F, E? E = E? F and EE? = FE? . Also, EE? EE? , EE? and FF? are all hermitian, therefore we have EE? EE? = EE? EF? = EE? FF? = FF? EE? . Thus, EE? FF? . Similarly, E? E F? F. (ii) ⇒ (iii) is trivial. (iii) ⇒ (iv) Since EE? <− FF? , we have FF? − EE? <− FF? and therefore, (FF? − EE? )(FF? )† (FF? − EE? ) = FF? − EE? . Also, FF? is nnd, and hence, (FF? )† is nnd. It follows that FF? − EE? is nnd. Moreover, C(E) = C(EE? ) ⊆ C(FF? ) = C(F) and C(E? ) = C(E? E) ⊆ C(F? F) = C(F? ). Hence, E = EFF = FFE. So, E = EF = FE. Now, E(FF? − EE? )E† = 0,
146
Matrix Partial Orders, Shorted Operators and Applications
and FF? −EE? is nnd, we have E(FF? −EE? ) = 0, or EF? = EE? = FE? . Hence, FF? − EE? = (F − E)(F − E)? . Similarly, we can establish F? F − E? E = (F − E)? (F − E). (iv) ⇒ (i) Notice that FF? − EE? = (F − E)(F − E)? ⇒ FF? − EE? is nnd and C(E) = C(EE? ) ⊆ C(FF? ) = C(F). Similarly, F? F − E? E = (F − E)? (F − E) ⇒ F? F − E? E is nnd and C(E? ) ⊆ C(F? ). Thus, E = EFF = FFE or E = EF = FE. As in proof of (iii) ⇒ (iv), we have EF? = EE? = FE? . Similarly, E? E = E? F = F? E. So, E F. Let E and F be idempotent matrices. Then by Remark 3.6.8, we have E <− F ⇔ E <# F. In fact we have the following: Theorem 5.5.3. Let E and F be idempotent matrices having the same order such that E <− F. Let ρ(E) = e, ρ(F) = f. Then there exists a non-singular matrix P such that E = Pdiag(Ie , 0e−f , 0)P−1 and F = Pdiag(Ie , Ie−f , 0)P−1 . Proof. Since E <− F, by Theorem 3.3.5(vi), there exists non-singular matrices P, De and De−f such that E = Pdiag(De , 0e−f , 0)P−1 and F = Pdiag(De , De−f , 0)P−1 . Since the only non-singular idempotent matrix is the identity matrix and De and De−f are non-singular, the result follows. Corollary 5.5.4. Let E and F be idempotent matrices of the same order. Then E <− F if and only if F − E is idempotent. Proof. ‘If’ part Notice that ρ(F) = tr(F) = tr(E) + tr(F − E) = ρ(E) + ρ(F − E). Therefore, by Remark 3.3.9, E <− F. ‘Only if’ part follows by Theorem 5.5.3. Corollary 5.5.5. Let E and F be any idempotent matrices of the same order. Then E <− F if and only if I − F <− I − E. Remark 5.5.6. Corollary 5.5.5 is not true for star order as shown by the following example. 1 1 0 Example 5.5.7. Let E = diag(0, 0, 1) and F = 0 0 0 . Then both 0 0 1 E, F are idempotent matrices satisfying Corollary 5.5.5. Clearly, E F,
The Star Order
147
as EE? = EF? = E? E = E? F = diag(0, 0, 1). However, I − F≮? I − E, 0 0 0 Since (I − F)? (I − F) = diag 0, 2, 0 and (I − F)? (I − E) = 1 1 0 . 0 0 0 Let E and F be idempotent matrices of the same order. We now investigate when both E F and I − F I − E hold. For this we need the following powerful result. Theorem 5.5.8. Let E and F ∈ Cn×n be idempotent matrices such that F − E is a matrix of index ≤ 1 with all eigen-values real and non-negative. Then F − E is idempotent. Proof. Notice that I = E + (F − E) + (I − F). The matrices E and I − F are idempotent. Consider F − E and let F − E = Pdiag(J1 , . . . , Jp , Jp+1 , . . . , Jq )P−1 be the Jordan decomposition, where the Jordan blocks J1 , . . . , Jp correspond to eigen-value 1 and the Jordan blocks Jp+1 , . . . , Jq correspond to the other eigen-values. Let Q0 = Pdiag(J1 , . . . , Jp , 0, . . . , 0)P−1 and Q1 = Pdiag(0, . . . , 0, Jp+1 , . . . , Jq )P−1 . Then F − E = Q0 + Q1 . Since Q1 does not have any eigen-value equal to 1, I − Q1 is non-singular. So, n = ρ(I − Q1 ) = ρ(Q0 + E + I − F) ≤ ρ(Q0 ) + ρ(E) + ρ(I − F) = ρ(Q0 ) + tr(E) + tr(I − F) ≤ n − tr(Q1 ). Thus, tr(Q1 ) = 0. However, all eigen values of Q1 are non-negative, it follows that all eigen values of Q1 are null. As F − E has index 1, it must have single Jordan block corresponding to null eigen-value, so, Q1 = 0. Thus, ρ(F − E) = tr(F − E) = tr(F) − tr(E) = tr(ρF) − tr(ρE). By Corollary 4.3.18, it follows that F − E is idempotent. Theorem 5.5.9. Let E and F be any idempotent matrices of the same order. Then the following are equivalent: (i) (ii) (iii) (iv) (v)
E − F is nnd E − F is hermitian and idempotent E − F I E <− F and E − F is hermitian and E F and I − F I − E.
Proof. (i) ⇔ (ii) ⇔ (iii) is easy. (ii) ⇔ (iv) follows by Theorem 5.5.3. (ii) ⇒ (v) Since (ii) holds, E <− F. Therefore, E = EF = FE. Now, 0 =
148
Matrix Partial Orders, Shorted Operators and Applications
FE − E = FE − E2 = (F − E)E = (F − E)? E. Therefore, F? E = E? F or E? F = E? F. Similarly, by using E = EF we get EE? = FE? . Thus, E F. Since (I − E) − (I − F) = F − E is hermitian idempotent, we have I − F I − E. (v) ⇒ (ii) Since E F, we have E <− F, so, F − E is an idempotent. Consider (F − E)? (E − F) = F? F − E? E, since E F. Also, since I − F I − E, (I − F)? I − F = I − F? I − E and therefore, F − E = F? F − F? E. Thus, (F − E)? = F? F − E? F = F? F − F? E, as E? E = E? F = F? E. So, (F − E)? = (F − E), showing E − F is hermitian. Till now, we have been considering idempotent matrices E and F neither of which need be hermitian. We now take one of these to be hermitian and obtain some interesting results. Theorem 5.5.10. Let E and F be idempotent matrices of the same order and let E be hermitian. Then the following are equivalent: (i) (ii) (iii) (iv)
FF? − E is nnd E = FE FF? = E + F(I − E)F? and FF? = E + (F − E)(E − F)? .
Proof. (i) ⇒ (ii) Since FF? − E = FF? − EE? is nnd, we have C(E) ⊆ C(F). Hence, FE = FFE = E. (ii) ⇒ (iii) E+F(I − E)F? = E+FF? −FEF? = E+FF? −EF? = E+FF? −E = FF? . (iii) ⇒ (iv) (F − E)(E − F)? = F(I − E)(I − E)? F? = F(I − E)F? = FF? − E. (iv) ⇒ (i) is trivial. Theorem 5.5.11. Let E and F be idempotent matrices of the same order and let E be hermitian. Then the following are equivalent: (i) (ii) (iii) (iv)
E F E <− F E FF? and E <− FF? .
Proof. (i) ⇒ (ii) is trivial. (ii) ⇒ (i)
The Star Order
149
Since E <− F we have E = EF = FE. Therefore, E = E? = EF? = FE? . This further gives E = F? E = EF? . Thus, E = EE? = F? E and E = E? E = EF? and so, E F. (i) ⇒ (iii) and (i) ⇒ (iv) follows by Theorem 5.5.2. (iii) ⇒ (iv) is trivial. (iv) ⇒ (ii) Now, E <− FF? ⇒ FF? − E <− FF? . Clearly, FF? − E is nnd. Hence, by Theorem 5.5.10, we have FF? = E + (F − E)(F − E)? . This further gives FF? − EE? = (F − E)(F − E)? and so, ρ(FF? − EE? ) = ρ((F − E)(F − E)? ) = ρ(F − E). Hence, ρ(F − E) = ρ(FF? − EE? ) = ρ(FF? − E) = ρ(FF? ) − ρ(E), as E <− FF? . Finally, ρ(F − E) = ρ(F) − ρ(E) and so, E <− F. Let E and F be idempotent matrices of the same order. Then we know that E <− F if and only if E = EF = FE. Suppose one of the conditions, say, E = EF holds. Further, let E be hermitian. Does that ensure E <− F? The answer is in negative as the following example shows: 1 0 1 1 Example 5.5.12. Let E = and F = . Then E and F are 0 0 0 0 idempotents and E is hermitian. Also, FE = E. However, EF 6= E and so, E≮− F. Let E be a hermitian idempotent matrix. We shall now explore the additional conditions required so that the idempotent matrices E and F satisfy E F. Theorem 5.5.13. Let E and F be idempotent matrices of the same order and let E be hermitian. Then the following are equivalent: (i) (ii) (iii) (iv)
E F E = FE and F† − E is a g-inverse of F − E EF† F = FF† E = E or equivalently E F and EF = (EF)? .
Proof. (i)⇒ (ii) As E F ⇒ E <− F, we have E = FE. Now, E is hermitian idempotent, so, E† = E and by Theorem 5.2.11, F† − E = F† − E† = (E − F)† . (ii) ⇒ (iii) Since E = FE, we have (F − E) = (F − E)(F† − E)(F − E) = (F − E)F† (F − E). Hence, EF† F + FF† E = EF† E + E. However, FF† E = E, so EF† F = FF† E. (iii)⇒ (iv)
150
Matrix Partial Orders, Shorted Operators and Applications
Notice that E = FE = FE? = EF? . Now, F† E = F† FE = EF† F is hermitian. Also, EF = (EF? )F = E(F†FF) ? F = F†FEF?F = F†E. Therefore, EF is hermitian. (iv) ⇒ (i) Since EF and E are hermitian, EF − E is hermitian. So, (EF − E)? (EF − E) = (EF − E)(EF − E) = EF − E − EF + E = 0. Hence, EF = E = FE. Thus, E <− F and by Theorem 5.5.11, E F. We can obtain similar theorems if instead of E we take F as hermitian. We conclude our discussion on partial orders on idempotent matrices with the following statement when both E and F are hermitian. Theorem 5.5.14. Let E and F be hermitian idempotent matrices of the same order. Then the following are equivalent: (i) (ii) (iii) (iv) (v) (vi) (vii)
E F E <− F E = EFE (F − E) I E <− EF I − E I − F and I − E <− I − F.
Proof is straightforward. 5.6
Fisher-Cochran type theorems
Recall that in Theorem 4.3.17, we had seen that if A1 , . . . , Ak are matrices Pk each of index 1 satisfying Ai Aj = 0 whenever i 6= j and A = i=1 Ai , then each Ai lies below A under the sharp order. As a consequence, we had also deduced a non-hermitian version of Fisher-Cochran theorem for the idempotent matrices. In this section we study similar relationships for the range-hermitian matrices, normal matrices and hermitian matrices. We also remark that the last one is essentially an algebraic version of the Fisher Cochran theorem on the distribution of quadratic forms in independent standard normal variables. Theorem 5.6.1. Let A1 , . . . , Ak be range-hermitian matrices of the same Pk order and let A = i=1 Ai . Consider the following statements: (i) Ai Aj = 0 for whenever i 6= j
The Star Order
151
(ii) there exist non-singular matrices D1 , . . . , Dk , and a unitary matrix U such that Ai = Udiag(0, . . . , 0, Di , 0, . . . , 0)U? , for each i = 1, . . . , k (iii) Ai A (iv) A is range-hermitian and Pk (v) ρ(A) = i=1 ρ(Ai ). Then (i) is equivalent to (ii). Any one of (i) and (ii) implies all others from (i) to (v). (iii) and (iv) together imply all others from (i) to (v). The proof follows along the lines of Theorem 4.3.17 using Theorem 2.2.9 and Theorem 5.4.1 The following theorems are now easy to prove. Theorem 5.6.2. Let A1 , . . . , Ak be normal matrices of the same order Pk and let A = i=1 Ai . Consider the following statements: (i) Ai Aj = 0 for whenever i 6= j (ii) there exists non-singular diagonal matrices D1 , . . . , Dk , and a unitary matrix U such that Ai = Udiag(0, . . . , 0, Di , 0, . . . , 0)U? , for each i = 1, . . . , k (iii) Ai A (iv) A is normal and Pk (v) ρ(A) = i=1 ρ(Ai ). Then (i) is equivalent to (ii). Any one of (i) and (ii) implies all others from (i) to (v). (iii) and (iv) together imply all others from (i) to (v). Theorem 5.6.3. Let A1 , . . . , Ak be hermitian matrices of the same order Pk and let A = i=1 Ai . Consider the following statements: (i) (ii) (iii) (iv) (v)
A2 = A Ai 2 = Ai for each i Ai Aj = 0 whenever i 6= j Pk ρ(A) = i=1 ρ(Ai ) and there exists a unitary matrix U such that Ai = Udiag(0, . . . , 0, Ii , 0, . . . , 0)U? , for each i = 1, . . . , k.
Then any two of (i), (ii) and (iv) imply all others from (i) to (v). Any two of (i), (ii) and (iii) imply all others from (i) to (v). (v) implies all others from (i) to (iv). Further, (i) and (iv) imply all the others.
152
5.7
Matrix Partial Orders, Shorted Operators and Applications
Exercises
(1) Let A, B be matrices of order m × n. Let the symbol stand for or <− or <s . Then show that (i) A B ⇔ B? A B? B and AB? BB? . (ii) A B ⇔ B† B and AB† BB† . (2) Let A, B be matrices of order m × n. Show that AA? BB? ⇒ AA† BB† and A? A B? B ⇒ A† A B† B. (3) Let A, B be matrices of order m × n and let scalars a, b ∈ C. Suppose A 6= 0 and A B. When does one have aA bB? (4) For matrices A, B of order m × n, show that (i) A <− ` B ⇔ A = QB where Q is projector onto a subspace of C(B) and (ii) A <− m B ⇔ A = BP where P is projector onto a subspace of C(B). (5) For matrices A, B of order m × n, show that the following are equivalent: (i) (ii) (iii) (iv)
A B; (B − A) B; (B − A)† B† ; and B† − A† B† .
(6) For matrices A, B of order m × n, show the following: (i) A B ⇒ B? A B? B and AB? BB? ; (ii) A B ⇒ A† A B† B and AA† BB† ; (iii) If m = n, then B? B† = B† B? and A B ⇒ A? A† = A† A? . (7) Let A, B be matrices of order m × n such that A B. (i) Let X ∈ Cn×m be a matrix such that BA† B = AXA. Then prove that AXA = A. If this is so, then show that A B ⇔ A <− B, BA† B = A. (ii) Let X ∈ Cn×m be a matrix such that B† AXAB† = A† , then show that AXA = A. In such a case show that A B ⇔ A <− B, B† AXAB† = A† . (8) Let A, B be matrices of order n × n such that A B. Then prove that A is Hermitian if and only if A, B commute. (9) Let A B. show that the following statements are equivalent: (i) A is normal
The Star Order
(ii) (iii) (iv) (v)
A? , B commute A, B? commute A† , B commute A, B† commute.
153
and
(10) Let A and B be square matrices of the same order such that A is range hermitian. Prove that (i) A B implies A2 B2 and AB = BA and (ii) A B implies A <# B. Also show that the implication in (i) is not reversible. (11) Let A and B be square matrices of the same order such that A is range hermitian and B is idempotent. Prove that A <− B if and only if A B. (12) Let A and B be nnd matrices of the same order. Show that A2 B2 if and only if there exists an orthogonal projector K such that A = BK. (13) Let A and B be hermitian matrices of the same order. Show that A2 B2 if and only if there exists a matrix K such that A = BK, C(K) ⊆ C(B) and KK? K = K. (14) Let A and B be partial isometries of the same order. Show that A B if and only if A <s B and AB? = (AA? )1/2 (BB? )1/2 . (15) Let A and B be m × n matrices. Define A B if AB? A = AA? A. Show that ‘’ is a partial order and A B if and only if A† <− B† . Show also that A B does not imply A <− B. (16) With notations as in Ex. 15, show that A B if and only if A B BA† B = B. (17) Let A and B be orthogonal projectors. Should ‘A <− B if and only if A <s B’ hold? Justify. (18) Let A, B be complex matrices of index not greater than 1. Consider the following statements (i) A B (ii) A2 B2 (iii) A <# B. Show that (i) and (ii) ⇒ (iii) and (i) and (iii) ⇒ (ii), but (ii) and (iii) 6⇒ (i). (19) Let A, B and C be matrices of the same order such that A B and B C. Prove that A, B and C have a simultaneous singular value decomposition.
Chapter 6
One-Sided Orders
6.1
Introduction
In the previous three chapters we studied the space pre-order and the minus, the sharp and the star partial orders. Recall that for matrices A and B of the same order A <− B ⇔ AA− = BA− and A− A = A− B for some A− ∈ {A− }, A <# B ⇔ AB = A2 = BA, and A B ⇔ A? A = A? B and AA? = BA? . Notice that there are two defining conditions for each of these partial orders. In this chapter, we shall examine the situations when only one of the two defining conditions is considered. Does one still get a partial order or at least a pre-order? If not, can one hope to get a partial order by addition of some milder condition(/s)? In case a partial order is obtained, should this new partial order be the same as the one whose defining conditions have been relaxed or does it result in a different partial order? In Chapter 3, we saw that if A <s B and AA− = BA− (or A− A = A− B), then A <− B. Suppose, we take the condition AA− = BA− and add a condition as mild as C(At ) ⊆ C(Bt ) to it, it is clear that we once again get A <− B. In Section 6.2, we study the condition AA− = BA− (or A− A = A− B). We show that while this condition has some interesting characterizations, it does not even yield a pre-order. In Section 6.3, we relax one of the defining conditions of the sharp order and notice that in contrast to the minus order, addition of the condition C(A) ⊆ C(B) or C(At ) ⊆ C(Bt ) results in a partial order distinct from the sharp order. In fact, for square matrices A, B of the same order and of index ≤ 1, if we consider any one of the two equalities say, AB = A2 together with the condition C(A) ⊆ C(B)
155
156
Matrix Partial Orders, Shorted Operators and Applications
(or the condition BA = A2 together with the condition C(At ) ⊆ C(Bt )), we obtain a partial order which we call the right sharp order (or the left sharp order respectively). We also obtain some conditions under which the one-sided sharp order and the sharp order coincide. In Section 6.4, we introduce two new classes of g-inverses of index 1 matrices. Their important properties are developed in the exercises at the end of this chapter. Our focus in this section is in exploring the role they play in the study of onesided sharp orders. In fact, it turns out that the order relations defined through them coincide with the one-sided sharp orders. In Section 6.5, we study the order relations by taking one of the defining conditions of the star order say, AA? = BA? along with C(A? ) ⊆ C(B? )(or the condition A? A = A? B along with C(A) ⊆ C(B)). We show that this leads us to a partial order that has properties very similar to the star order. The order relations thus obtained are known respectively as the right star and the left star orders. As seen in Chapter 5, for matrices A and B satisfying A B, the canonical form for A and B is the simultaneous singular value decomposition. We show that in the case of the right star (or the left star) order, the apt canonical form for the matrices A and B is a generalized singular value decomposition. We reiterate that all matrices in Sections 2, 3, and 4 are over a general field unless specified otherwise, but in Section 5, we deal with only complex matrices.
6.2
The condition AA− = BA−
We begin this section by studying the condition AA− = BA− or (A− A = A− B). As we shall see, this condition by itself does not even define a pre-order, yet has some interesting properties. We have the following: Theorem 6.2.1. Let A and B be matrices of the same order. Then the following are equivalent: (i) AA− = BA− for some g-inverse A− of A (ii) A = BQ for some idempotent matrix Q and (iii) C(A)t ∩ C(B − A)t = {0}. Proof. (i) ⇒ (ii) Since AA− = BA− , we have A = BA− A = BQ, where Q = A− A is idempotent.
One-Sided Orders
157
(ii) ⇒ (iii) Let A = BQ. Then AQ = BQ2 = BQ = A. Let u ∈ C(A)t ∩ C(B − A)t . Then u = At v = (B − A)t w for some column vectors v and w. So, ut = vt A = wt (B − A). This implies ut Q = vt AQ = wt (BQ − AQ). As BQ = A and AQ = A, we have ut Q = vt AQ = 0 and so, vt A = 0. It folT lows that u = 0. Thus, C(A)t C(B − A)t = {0}. (iii) ⇒ (i) Let (P, T) and (R, S) be rank factorizations of A and B − A respectively. So, B = A + (B − A) = PT + RS = (P : R)(Tt : St )t . Since C(At ) ∩ C((B − A)t ) = {0}, the matrix (Tt : St )t has full row rank. Let (L : M) be a right inverse of it. Therefore, TL = I, TM = 0; SL = 0 and SM = I. As P has full column rank, P has a left inverse say N. Then G = LN is a g-inverse of A and (B − A)G = RSLN = 0. Thus, BG = AG. Theorem 6.2.2. Let A and B be matrices of the same order. Then the following are equivalent: (i) A− A = A− B for some g-inverse A− of A (ii) A = PB for some idempotent matrix P and (iii) C(A) ∩ C(B − A) = 0. Proof is analogous to that of Theorem 6.2.1. In the following theorem we obtain some equivalent conditions under which the condition AA− = BA− yields the minus order. Theorem 6.2.3. Let A and B be matrices of the same order such that AA− = BA− . Then the following are equivalent: (i) (ii) (iii) (iv) (v)
A− A = A− B for some g-inverse A− of A A <− B C(At ) ⊆ C(Bt ) C(A) ∩ C(B − A) = 0 and A = PB.
Proof.
Use Theorems 6.2.1, 3.2.6 and 3.3.5.
We now obtain a canonical form for matrices A and B when the condition AA− = BA− holds. Theorem 6.2.4. Let A and B be matrices of order m × n. Let ρ(B) = s, ρ(A) = r and ρ(B − A) = t. Then AA− = BA− if and only if there exist
158
Matrix Partial Orders, Shorted Operators and Applications
non-singular matrices P and Q such that Ir 0 0 Ir T1 0 A = P 0 0 0 Q and B = P 0 T2 0Q, 0 0 0 0 0 0 T1 where an s × t matrix of rank t. T2 Proof. ‘If’ part Choose A− = Q−1 diag Ir , 0 , 0 P−1 . Then it is clear that AA− = BA− = P Ir , 0 , 0 P−1 . ‘Only if’ part Let (P1 , Q1 ) and (P2 , Q2 ) be rank factorizations of A and (B − A) respectively. Since AA− = BA− , C(At ) ∩ C(B − A)t = 0. Hence, C(Qt1 ) ∩ C(Q2 )t = 0. By choosing a basis of C P1 : P2 which includes the r columns of P1 , we can find a matrix R of order m × (s− r) with T1 ρ(P1 : R) = s and matrices T1 and T2 such that the matrix is of T2 I T1 order s × t and rank t and P1 , P2 = P1 , R . Since P1 , R 0 T 2 Q1 is a matrix of order m × s with rank s and is matrix of order Q2 (r + t) × n with rank r + t, we can find matrices P3 and Q3 such that Q1 P = P1 , R, P3 and Q = Q2 are non-singular. Now it is easy to see Q3 Ir 0 0 Ir T1 0 that A = P 0 0 0 Q and B = P 0 T2 0 Q. 0 0 0 0 0 0 Given a g-inverse A− of A such that AA− = BA− , we can now obtain the class of all g-inverses A− of A for which AA− = BA− . Theorem 6.2.5. Let A and matrices of the same ordersuch that B be Ir 0 0 Ir T1 0 − − AA = BA . Let A = P 0 0 0 Q and B = P 0 T2 0 Q where 0 0 0 0 0 0 T1 is of full column rank. Then for a g-inverse A− of A, AA− = BA− T2
One-Sided Orders
159
Ir L1 L2 if and only if A− is of the form A− = Q−1 0 0 0 P−1 , where N1 N2 N3 L1 , L2 , N1 , N2 , and N3 are arbitrary. Remark 6.2.6. One is tempted to believe that the relation defined by the condition AA− = BA− may be a pre-order. The belief, however, is put to rest by the following example: 1 0 2 1 3 1 Example 6.2.7. Let us take A = ,B= and C = . 0 0 0 0 1 0 1 a We can easily check that AA− = BA− for A− = , where a is −1 −a 0 0 arbitrary and BB− = CB− for B− = . However, AA− = CA− 1 1 for no g-inverse A− of A, so, the transitivity fails to hold. Our next theorem shows that the condition AA− = BA− is symmetric on the class of matrices that have same rank. Theorem 6.2.8. Let A and B be matrices of the same order. Consider the following: (i) AA− = BA− for some g-inverse A− of A. (ii) ρ(A) = ρ(B). (iii) BB− = AB− for some g-inverse B− of B. Then any two of (i), (ii), (iii) implies the third. Proof. (i) and (ii) ⇒ (iii) Let A− be a g-inverse of A such that AA− = BA− . Then, A = AA− A = BA− A. Further, ρ(A) = ρ(B), so, C(A) = C(B). So, B = AA− B = BA− B. Hence A− is a g-inverse of B and (iii) holds. (i) and (iii)⇒(ii) is obvious. Proof of (ii) and (iii) ⇒ (i) is similar to the proof of (i) and (ii) ⇒ (iii). Remark 6.2.9. In fact, we have proved a stronger statement than the one made in Theorem 6.2.8. If AA− = BA− for some g-inverse A− of A and ρ(A) = ρ(B), then A− is a g-inverse of B. Remark 6.2.10. Let C(B) = C(A) and AB− A = A for some g-inverse of B− of B. Then for any g-inverse A− of A, G = B− AA− is a g-inverse
160
Matrix Partial Orders, Shorted Operators and Applications
of A. Also, BG = AA− and BG = AG, showing that the condition AA− = BA− holds when A− = G. Similarly, let C(Bt ) = C(At ) and AB− A = A for some g-inverse of B− of B. Then GA = GB for some g-inverse G of A.
6.3
One-sided sharp order
We begin this section by defining one-sided sharp orders, namely left sharp and right sharp orders. We show each of these relations defines a partial order which implies the minus order. We also obtain some conditions under which each of the left and the right sharp order is equivalent to the sharp order. Definition 6.3.1. Let A and B be matrices of the same order and of index not greater than 1. We say A is below B under right sharp order, if A2 = BA and C(At ) ⊆ C(Bt ). We write A < # B, when this happens. Similarly, if A2 = AB and C(A) ⊆ C(B), we say A is below B under the left sharp order. We write A # < B, in case A is below B under the left sharp order. Remark 6.3.2. Notice that in Definition 6.3.2, as in the case of the sharp order, we do not require matrix B to be of index ≤ 1. However for same reasons as mentioned in the case of the sharp order, we shall henceforth take both A and B to be matrices of index ≤ 1. We shall now obtain a characterization of ‘< #’ using the matrix decompositions. Theorem 6.3.3. Let A and B be matrices of the same order and of index not greater than 1. Then A < # B if and only if there exists a non-singular matrix P such that S 0 0 S T12 0 A = P 0 0 0 P−1 and B = P 0 T22 0 P−1 , 0 0 0 0 0 0 where S and T22 are non-singular. Proof. ‘If’ part is trivial. ‘Only if’ part Since B is of index ≤ 1, we can find non-singular matrices R and T such
One-Sided Orders
161
A11 A12 R−1 , partitioned A21 A22 t in conformation for multiplication with B. Since, ) ⊆ C(Bt ), we C(A 2 A11 0 have A12 = 0 and A22 = 0. Now, A2 = R R−1 and A21 A11 0 TA11 0 BA = R R−1 . Since A2 = BA, this gives A211 = TA11 and 0 0 A21 A11 = 0. Further, ρ(A11 ) = ρ(TA11 ) = ρ(BA) = ρ(A2 ) = ρ(A), so, A21 = CA11 and CA211 = 0. Also, ρ(A2 ) = ρ(A) ⇒ C(A2 ) = C(A). Thus, A11 = A211 D for some matrix D. It follows that A21 = CA11 = CA211 D = 0. Now, index A11 ≤ 1, so, there exist non-singular −1 matrices S and Q of suitable orders such that A11 = Qdiag S 0 Q . Write T11 T12 T=Q Q−1 , where T11 and S are matrices of the same order. T21 T22 2 S 0 T11 S 0 Now, A2 = BA ⇒ A11 = TA211 ⇒ = . Therefore, 0 0 T21 S 0 T11 = S and = 0. Moreover, T is non-singular ⇒ T22 is non-singular. T21 Q 0 Let P = R and notice that A and B have the desired forms. 0 I that B = Rdiag(T, 0)R−1 . Write A = R
Theorem 6.3.4. Let A and B be matrices of the same order and index not greater than 1. Then A < # B if and only if there exists a non-singular matrix P such that S 0 S B12 A=P P−1 and B = P P−1 , 0 0 0 B22 where S is non-singular, B12 is arbitrary and B22 is arbitrary matrix of index ≤ 1. Since A is of index ≤1, we can find non-singular matrices P S 0 B11 B12 −1 −1 and S such that A = P P . Write B = P P , 0 0 B21 B22 2 partitioned inconformation with partitioning of A. Since A = BA, we 2 S 0 B11 S 0 have A2 = P P−1 and BA = P P−1 . It follows that 0 0 B21 S 0 S B12 B11 = S and B21 = 0. Hence, B = P P−1 , where B12 and B22 0 B22 are arbitrary. Since S is non-singular, it is easy to check that ρ(B) = ρ(B2 ) if and only if ρ(B22 ) = ρ(B222 ). Proof.
162
Matrix Partial Orders, Shorted Operators and Applications
Remark 6.3.5. Let A and B be matrices of the same order and index not greater than 1. If ρ(A) = ρ(B) and A < # B, then A = B. Theorem 6.3.6. Let A and B be matrices of the same order and index not greater than 1. Then A < # B if and only if there exists an A− χ such − − − that AA− = BA and A A = A B. χ χ χ χ Proof. ‘If’ part − − − − − 2 Let AA− χ = BAχ and Aχ A = Aχ B. Then AAχ = BAχ ⇒ A = 2 − 2 − − − AA− χ A = BAχ A = BA. Also, Aχ A = Aχ B ⇒ A = AAχ A = − t t AAχ B ⇒ C(A ) ⊆ C(B ). ‘Only if’ part Let A < # B. By Theorem 6.3.3, there exists a non-singular matrix S 0 0 S T12 0 P such that A = P 0 0 0 P−1 and B = P 0 T22 0 P−1 , 0 0 0 0 0 0 where S and T22 are non-singular. Notice that any A− χ is of the −1 S L M 0 0 0 P−1 for some matrices L and M. Also, form A− χ = P 0 0 0 I SL SM I SL SM AA− 0 P−1 and BA− 0 P−1 . Hence, χ = P 0 0 χ = P 0 0 0 0 0 0 0 0 I 0 0 − − − −1 and for every A− χ , AAχ = BAχ . Moreover, Aχ A = P 0 0 0 P 0 0 0 I S−1 T12 + LT22 0 A− 0 0 P−1 . χB = P 0 0 0 0 −1 − We take an A− for which L = −S T12 T−1 χ 22 . Thus, there exists an Aχ − such that A− χ A = Aχ B. Remark 6.3.7. It is clear that if T12 and L are non-null, then we cannot have AA# = BA# and A# A = A# B. In other words A cannot be below B under the sharp order. Analogous to Theorem 6.3.6, it is easy to prove that the left sharp order is same the ρ-order. Corollary 6.3.8. Let A and B be matrices of order n × n and of index not greater than 1. Then A # < B if and only if there exists an A− ρ such that
One-Sided Orders
163
− − − AA− ρ = BAρ and Aρ A = Aρ B.
Corollary 6.3.9. Let A and B be matrices of the same order and index not greater than 1. Then A < # B ⇒ A <− B and A # < B ⇒ A <− B. Corollary 6.3.10. The relation ‘< #’ (and also the relation ‘# <’) is a partial order on the set I1 of matrices of index 1. Proof. In view of Corollary 6.3.9, we only need to show that the relation ‘< #’ is transitive. Let A, B and C ∈ I1 such that A < # B and B < # C. Then by definition, A2 = BA and C(At ) ⊆ C(Bt ). and B2 = CB and C(Bt ) ⊆ C(Ct ). Clearly, C(At ) ⊆ C(Ct ). To prove A2 = CA, we use Theorem non-singular matrix P such that 6.3.4.As A < # B, there exists a S 0 S B 12 A=P P−1 and B = P P−1 , where S is non-singular 0 0 0 B22 C11 C12 and B22 is of index ≤ 1. Let C = P P−1 . Since, B2 = CB, C21 C22 we have C11 = S and C21 = 0. Clearly, CA = Pdiag S2 , 0 P−1 = A2 . Hence, A < # C. Remark 6.3.11. Notice that in Theorem 6.3.6, the matrix L is unique. − − Thus, when A < # B, there exists a unique A− χ such that AAχ = BAχ t −t − and A− χ A = Aχ B and C(Aχ ) ⊆ C(B ). Theorem 6.3.12. Let A and B be matrices of the same order and of index not greater than 1. Then A # < B if and only A# # < B# . Proof. Observe first that the group inverse of a block upper triangular matrix is also an upper triangular matrix. Now the proof follows by Theorem 6.3.3. Theorem 6.3.13. Let A and B be matrices of the same order and of index not greater than 1. Then the following are equivalent: (i) (ii) (iii) (iv) (v)
A<#B A <− B and A <− B and A <− B and A <− B and
C(BA) = C(A) C(BA# ) = C(A) C(B# A) = C(A) and C(B# A# ) = C(A).
Proof. (i) ⇒ (ii) − − − − Since (i) ⇒ there exists an A− χ such that AAχ = BAχ and Aχ A = Aχ B,
164
Matrix Partial Orders, Shorted Operators and Applications
it follows that A <− B. Also, A < # B ⇒ A2 = BA, so, C(BA) = C(A2 ) = C(A). Thus, (ii) holds. (ii) ⇒ (i) Since, A <s B and each of A and B have index ≤ 1, by Theorem 3.2.10, there exist matrices P, S and T of suitable non-singular order, such that S 0 0 T11 T12 0 A = P 0 0 0 P−1 and B = P T21 T22 0 P−1 for some non0 0 0 0 0 0 T11 T12 singular matrices P, S and T = . Since, C(BA) = C(A), it T21 T22 follows that T21 = 0. Further, since A <− B, T11 = S. So, (i) follows in view of Theorem 6.3.3. (ii) ⇔ (iii) We have C(BA# ) = C(BA(A3 )− A) ⊆ C(BA) ⊆ C(BA# A2 ) ⊆ C(BA# ). So, C(BA# ) = C(BA). It is now clear that (ii) ⇔ (iii) (i) ⇒ (iv) Since, A <− B, we have A = AB# A = AB# B = BB# A. Also, B# A = B# AA# A = B# A2 A# = B# BAA# = (B# BA)A# = AA# . Therefore, C(B# A) ⊆ C(A). Also, C(A) = C(AA# A) ⊆ C(AA# ) ⊆ C(B# A). So, C(B# A) = C(A). (iv)⇒ (i) is similar to (ii)⇒ (i). (iv)⇔ (v) Notice that C(B# A) = C(B# A# ). So, (iv)⇔ (v) holds. Corollary 6.3.14. Let A and B be matrices of the same order and of index not greater than 1. Then the following are equivalent: (i) (ii) (iii) (iv) (v)
A <# B A <− B A <− B A <− B A <− B
Proof.
and and and and
A, B commute A, B# commute A# , B commute and A# , B# commute.
Proof follows by Theorem 6.3.13 and Theorem 4.2.12.
Theorem 6.3.15. Let A and B be matrices of the same order and of index not greater than 1 such that A < # B. Then the following are equivalent: (i) A <# B (ii) C(ABt ) = C(BAt ) (iii) C(ABt ) = C(At ) and
One-Sided Orders
165
(iv) AB# = AA# . Proof.
Since A < # B, byTheorem 6.3.3, there exists a non-singular S 0 0 S T12 0 matrix P such that A = P 0 0 0 and B = P 0 T22 0 P−1 , 0 0 0 0 0 0 where S and T22 are non-singular. (i) ⇒ (ii) Let (i) hold. So, A2 = AB = BA. Therefore, 2 2 2 S 0 0 S ST12 0 S 0 0 0 0 0 = 0 0 0 = 0 0 0 . 0 0 0 0 0 0 0 0 0 It follows that T12 = 0. Clearly, C(ABt ) = C(BAt ). Thus, (ii) holds. (ii) ⇒ (iii) Next, let (ii) hold. So, ST12 = 0. Since, S is non-singular T12 = 0. Hence, C(ABt ) = C(At ) showing (iii) holds. (iii) ⇒ (i) Now, let (iii) hold. So, C(ABt ) = C(At ) = C(At )2 . Once again T12 = 0. So, A2 = BA = AB, which proves (i). Finally for (i) ⇔ (iv), use Theorem 6.3.3. Theorem 6.3.16. Let A and B be matrices of the same order and of index not greater than 1. Let A2 = BA. Then the following are equivalent: (i) (ii) (iii) (iv)
C(At ) ⊆ C(Bt ) C(A) ∩ C(B − A) = 0 A− A = A− B for some g-inverse A− of A A = PB for some projector P.
and
Proof. (i) ⇒ (ii) Since, A2 = BA and C(At ) ⊆ C(Bt ), (ii) holds by Corollary 6.3.9 and Theorem 6.2.2. (ii)⇒ (i) Let (P, Q) and (R, S) be rank factorization of A and B − A respectively. Q Then B = A + (B − A) = PQ + RS = P R . Notice that P : R S is of full column rank, since C(A) ∩ C(B − A) = 0. So, C(At ) = C(Qt ) ⊆ C(Qt : St ) = C(Bt ).
166
Matrix Partial Orders, Shorted Operators and Applications
(i)⇒ (iii) follows by Corollary 6.3.9. (iii)⇒ (iv) Since, A− A = A− B ⇒ A = AA− A = AA− B, so, A = PB, where P = AA− . (iv)⇒ (i) is trivial. Theorem 6.3.17. Let A and B be matrices of the same order and of index not greater than 1. The following are equivalent: (i) A# < B and ‘AA# and B’ commute (ii) A <# B.
and
Proof. (i) ⇒ (ii) Since A # < B, we have A2 = AB and C(A) ⊆ C(B). Now, pre- multiplying and post multiplying A2 = AB by A# and A respectively, we have A2 = A# ABA = AA# BA. Since AA# and B commute, we obtain A2 = BA. So, A2 = AB = BA proving A <# B. (ii) ⇒(i) is trivial. Remark 6.3.18. Notice that A < # B ⇒ (B − A) # < B. Further, C(B − A) ⊆ C(B), since A < # B ⇒ A <− B ⇒ B − A <− B. Theorem 6.3.19. Let A and B be matrices of the same order and of index not greater than 1. Then the following are equivalent: (i) A < # B and B − A < # B (ii) A <# B.
and
Proof is easy. Remark 6.3.20. Let A and B be matrices of the same order and of index not greater than 1. Then it is easy to see that A < # B ⇒ Ak < # Bk for each positive integer k ≥ 1 and therefore, A < # B ⇒ Ak <− Bk for each positive integer k ≥ 1. A similar statement holds for ‘# <’. We can now show that one sided sharp order coincides with the sharp order for the class of range-hermitian matrices. In the rest of this section we consider matrices over C, the field of complex numbers. Theorem 6.3.21. Let A and B be range-hermitian matrices of the same order. Then in notations of Corollary 3.7.4, A < # B if and only if A = Udiag Da , 0b−a , 0 U? and B = Udiag M, 0 U? ,
One-Sided Orders
167
where U is a unitary matrix, Da is positive definite matrix of order a ×a Da 0 and M is a non-singular matrix of order b × b of the form M = + 0 0 F1 M22 0 I , for some matrix F1 of suitable order. I Proof. ‘If’ part follows by direct verification. ‘Only if’ part Since A < # B ⇒ A <− B, by Corollary 3.7.4, A = Udiag Da , 0b−a , 0 U? and B = Udiag M, 0 U? , where U is a unitary matrix, Da is positive definite matrix and M is a non-singularmatrixof of order a × a and b × b respectively with M of the Da 0 F1 form M = + M22 F2 I , for some matrices F1 and F2 0 0 I of suitable orders. Since, A2 = BA, it follows that F2 = 0. Corollary 6.3.22. Let A and B be matrices of the same order. Let A be range-hermitian and B be normal. Then in notations of Corollary 3.7.4, A < # B if and only if A = Udiag Da , 0b−a , 0 U? and B = Udiag M, 0 U? , where U is a unitary matrix, Da is positive definite matrix and M is a non-singular matrix of oforder a × a and b × b Da 0 0 respectively with M of the form M = + M22 0 I , where 0 0 I M22 is non-singular matrix of order (b − a) × (b − a). Moreover, in this case A <# B. Corollary 6.3.23. Let A and B be range-hermitian matrices of the same order. Then A < # B if and only if A <# B or equivalently if and only if A and B are normal.
6.4
− Roles of A− c and Aa in one-sided sharp order
In this section we first introduce briefly two new classes of g-inverses, − namely: the class {A− c } and the class {Aa }. Details of their properties are given in exercises at the end of this chapter. We then show that A < # B − − − if and only if {B− c } ⊆ {Ac } and A # < B if and only if {Ba } ⊆ {Aa }. We begin with the following:
168
Matrix Partial Orders, Shorted Operators and Applications
Definition 6.4.1. Let A be a square matrix of index ≤ 1. A g-inverse G is called a c-inverse of A if C(GA) = C(A). A c-inverse of A is denoted by A− c. It is clear that every A− χ is a c-inverse of A. The following example shows that the converse is not true. However, every reflexive c-inverse is an A− χ. 1 0 Example 6.4.2. Let A = . Then it is easy to check that A− χ = 0 0 1 α 1 α and A− , where β is possibly a non-null scalar. It is now c = 0 0 0 β clear that a c-inverse may not be a χ-inverse. Definition 6.4.3. Let A be a matrix of order n × n and of index ≤ 1. Let Ax = b be a possibly inconsistent system of linear equations with b = c + d, where c ∈ C(A) and d ∈ N (A). (Since A is of index ≤ 1, we can write F n = C(A) ⊕ N (A), so each b ∈ Fn can be decomposed in the manner mentioned above.) We say x0 is a good approximate solution of Ax = b if Ax0 = c. Notice that in the setup of Definition 6.4.3, if b ∈ C(A), then x0 is a good approximate solution of Ax = b if and only if x0 is a solution of Ax = b or equivalently x0 is a solution of A2 x = Ab. Definition 6.4.4. Let A be a square matrix of index ≤ 1. A matrix G is said to be an A− a if Gb is a good approximate solution of Ax = b for each b. For a square matrix of index ≤ 1, a matrix G is an A− c if and only if G is an (At )− . a t
Definition 6.4.5. Let A and B be matrices of the same order and of index − − − − not greater than 1. We say A <− c B if AAc = BAc and Ac A = Ac B − − for some Ac ∈ {Ac }. Similarly we can define A <− a B. Theorem 6.4.6. Let A and B be matrices of the same order and of index not greater than 1. Then A < # B if and only if A <− c B. − − − Proof. Notice that for any A− c ∈ {Ac }, the matrix G = Ac AAc is a A− χ . The theorem now follows from Theorem 6.3.6.
One-Sided Orders
169
Corollary 6.4.7. Let A and B be matrices of the same order and of index not greater than 1. Then A # < B if and only if A <− a B. Theorem 6.4.8. Let A and B be matrices of the same order and of index not greater than 1. Then the following are equivalent: (i) A < # B − (ii) {B− c } ⊆ {Ac }. Proof. (i) ⇒ (ii) Since A < #B, by Theorem 6.3.3 there matrix P such exists a non-singular S 0 0 S T12 0 that A = P 0 0 0 and B = P 0 T22 0 P−1 , where S and T22 0 0 0 0 0 0 − − are non-singular. Also, A < # B ⇒ A <− B and {B− c } ⊆ {B } ⊆ {A }. − − To show that {B− } ⊆ {A− each B− c }, we show c is an Ac . Any Bc is of −1c −1 −1 S −S T12 T22 L1 2 the form P 0 0 L2 P−1 . Clearly, B− c A = A. So, 0 0 L3 − B− ∈ {A }. c c (ii) ⇒ (i) − − − Since {B− χ } ⊆ {Bc }, it follows that {Bχ } ⊆ {Ac }. Since B is of index 1, T 0 there exist non-singular matrices P and T such that B = P P−1 . 0 0 −1 T L − P−1 , where L is By Theorem 2.4.5 any Bχ is of the form P 0 0 A11 A12 arbitrary. Let A = P P−1 partitioned in conformation with A21 A22 − 2 − the partitioning of B. Since a B− χ is an Ac , so, Bχ A = A. Therefore, −1 2 T L A11 A12 A11 A12 −1 −1 P P P P = P P−1 for each L 0 0 A21 A22 A21 A22 and in particular for L = 0. Therefore, −1 2 T A11 + T−1 A12 A21 T−1 A12 A21 + T−1 A12 A22 P−1 P 0 0 A11 A12 =P P−1 A21 A22
170
Matrix Partial Orders, Shorted Operators and Applications
⇒ A12 = 0,A21 = 0, A22 = 0 and T−1 A211 = A11 . A11 0 So, A = P P−1 . As A is of index 1, A11 is of index 1. Now, 0 0 A11 is of index 1, so, there exists a non-singular matrix R and a non C 0 R 0 singular matrix C such that A11 = R R−1 . Let Q = P . 0 0 0 I C 0 0 T11 T12 0 Then A = Q 0 0 0 Q−1 and B = Q T21 T22 0 Q−1 , where 0 0 0 0 0 0 T11 T12 RTR−1 = and partitioning is in conformation with partitionT21 T22 11 12 T T −1 −1 −1 ing of A11 . Now, RTR is non-singular, let (RTR ) = . 21 T T22 11 12 T T L1 21 22 Then any B− is of the form P T T L2 P−1 . Using the fact that a χ 0 0 0 21 11 − − Bχ is an Ac , we get T = 0 and T = C−1 , so, T11 = C, T21 = 0 and T22 is non-singular. Now, it is clear that A2 = BA and C(At ) ⊆ C(Bt ). Thus, A < # B. The following theorem can be proved analogously: Theorem 6.4.9. Let A and B be matrices of the same order and of index not greater than 1. Then the following are equivalent: (i) A # < B − (ii) {B− a } ⊆ {Aa }. Theorem 6.4.10. Let A and B be matrices of the same order and index not greater than 1. Then the following are equivalent: (i) (ii) (iii) (iv)
A <# B − − − {B− c } ⊆ {Ac } and {Ba } ⊆ {Aa } − − − {B− χ } ⊆ {Ac } and {Bρ } ⊆ {Aa } # − B ∈ {Aac }.
and
Proof. We only need to prove (i) ⇔ (iv). # − − − − Since B# ∈ {A− ac } ⇔ B ∈ {Bc } ∩ {Ba } ⊆ {Ac } ∩ {Aa }, the proof of (i) ⇒ (iv) is clear. (iv) ⇒ (i)
One-Sided Orders
171
it is easy to check that the class of all commuting g-inverses of A is precisely {A− ac }. Now, (iv) ⇒ (i) follows from Theorem 4.2.8. 6.5
One-sided star order
The one-sided star orders maintain a similar relationship to one-sided sharp orders as did the star order to the sharp order. For hermitian matrices the right (left) sharp order and the right (left) star coincide. So, as expected, many theorems in this section will be similar to theorems in one-sided sharp order. However, a distinguishing feature is that there is no Fisher-Cochran type theorem for one-sided sharp orders, but for one-sided star order we show that it holds good. The important feature of this section is that: if A and B are matrices of the same order and A is below B under either the right star order or the left star order, then their reduced form (a canonical form) is generalized singular value decomposition of A with respect to B. In this section all matrices are over the field C of the complex numbers. Let us define for matrices A and B of the same order, A ≺ B if AA? = BA? (or equivalently AA† = BA† ). This relation is obviously reflexive. Moreover, if A ≺ B and B ≺ A, then (B − A)A? = 0 and (B − A)B? = 0. Therefore, (B − A)(B − A)? = 0 and so, A = B. Thus, A ≺ B is antisymmetric. However, this relation is not transitive as the following example shows: Example 6.5.1. Consider matrices A = , B = and C = 1 , −1 2 , 0 ? ? ? ? 2 , 1 . Then AA = 2 = BA and BB = 4 = BC , but AA? = 2 6= 1 = AC? . Thus, the relation ‘≺’ is not transitive and so, is not even a pre-order. Definition 6.5.2. Let A and B be matrices of same order. We define A < ? B (read as A is below B under right star order ) if AA? = BA? and C(A? ) ⊆ C(B? ) and A ? < B (read as A is below B under left star order ) if A? A = A? B and C(A) ⊆ C(B). Are the two relations defined above the same and the same as star order? Does each of them define a partial order? Or even a pre-order? We give below an example which not only shows that none of them is same as the star order and that they themselves are different order relations. 1 −1 1 −2 Example 6.5.3. Let A = and B = . Then clearly, −1 1 −1 0
172
Matrix Partial Orders, Shorted Operators and Applications
2 −2 C(A) ⊆ C(B) and A? A = A? B = . However, AA? = A? A and −2 2 3 −3 BA? = . Thus, A ? < B, A ≮ ?B and therefore, A ≮? B. −1 1 Remark 6.5.4. Notice that A < ? B ⇔ AA† = BA† and C(A? ) ⊆ C(B? ), since, A† = A? (A? AA? )− A? . Thus, A < ? B ⇒ AA† = BA† , and C(A? ) ⊆ C(B? ) and this implies A <− B, C(A? ) ⊆ C(B? ), C(A) ⊆ C(B) ? and C(A? )⊥C(B − A) . This is so because AA† = BA† ⇒ A = BA† A and therefore, AB† A = AB† BA† A = A; thus, AB† A = A and so, A <− B. Further, A(B? − A? ) = 0. Remark 6.5.5. A < ? B ⇒ A <− B and A ? < B ⇒ A <− B. We shall now first obtain a canonical form for matrices A and B when A < ? B. Theorem 6.5.6. Let A and B be matrices of the same order with ranks ‘r’ and ‘s’ respectively. Then A < ? B if and only if there exist a non-singular matrix P and a unitarymatrix V such that A = Pdiag Ir , 0 , 0 V? and B = Pdiag Ir , Ir−s , 0 V? . Proof. ‘If’ part is trivial. ‘Only if’ part Let A < ? B. Then by Remark 6.5.4, A <− B, C(A? ) ⊆ C(B? ), C(A) ⊆ ? C(B) and C(A? )⊥C(B − A) . Let (P1 , V1? ) and (P2 , V2? ) be rank factorizations of A and B − A respectively, where V1? V1 = I and V2? V2 = I. Then B = A + (B − A) = P1 V1? + P2 V2? = (P1 : P2 )(V1 : V2 )? . Since C(A) ∩ C(B − A) = {0}, (P1 : P2 ) is of full column rank ‘s’. Also, since ? C(A? )⊥C(B − A) , so,(V1 : V2 )? (V1 : V2 ) = I. Let P3 and V3? be matrices such that P = (P1 : P2 : P3 ) is non-singular and V? = (V1 : V2 : V3 )? is unitary. It is now easy to check that A and B have the desired forms. Remark 6.5.7. The canonical form of the matrices A and B in Theorem 6.5.6 is called a generalized singular value decomposition of A and B. Remark 6.5.8. Let A and B be matrices of the same order with ranks‘r’ and ‘s’ respectively. Then A ? < B if and only if A = Udiag Ir , 0 , 0 Q and B = Udiag Ir , Ir−s , 0 Q, for some non-singular matrix Q and unitary matrix U.
One-Sided Orders
173
We shall now obtain a characterization of A such that A < ? B when a singular value decomposition of B is available. Theorem 6.5.9. Let A and B be matrices of the same order. Let B = Udiag D , 0 V? , where U, V are unitary and D is a positive definite diagonal matrix. Then A < ? B if and only if A = Udiag T , 0 V? such that D−1 is a minimum norm g-inverse of T. Proof. ‘If’ part is trivial. ‘Only if’ part ? T E Let B = Udiag D , 0 V . Write A = U V? where partitioning F G is in conformation with the partitioning of B. Since C(A? ) ⊆ C(B? ), we have E = 0 and G = 0. Also, AA? = BA? . So, TT? = DT? and FF? = 0. This further gives D−1 TT? = T? and F = 0. Hence, D−1 is a minimum norm g-inverse of T, showing that A has the desired form. Theorem 6.5.10. Let A and B be matrices of the same order. Let A = Udiag ∆ , 0 V? , where U, V are unitary and ∆ is a positive definite ∆ T12 diagonal matrix. Then A < ? B if and only if B = U V? for 0 T22 some matrices T12 and T22 such that C(T12 ) ⊆ C(T22 ). Proof.
Proof is similar to Theorem 6.5.9.
Theorem 6.5.11. The relation ‘< ?’ is a partial order. Proof. The relation is trivially reflexive and as noted earlier it is also antisymmetric. So, we only need to show that it is transitive. Since A < ? B ⇒ AA? = BA? and C(A? ) ⊆ C(B? ). Also, B < ? C ⇒ BB? = CB? and C(B? ) ⊆ C(C? ). Clearly, C(A? ) ⊆ C(B? ) and C(B? ) ⊆ C(C? ) ⇒ C(A? ) ⊆ C(C? ). Moreover, AA? = BA? = AB? = AB? B†? B? = AB† BB? = AB† BC? = AC? = CA? . Therefore, A < ? C. Remark 6.5.12. The relation ‘? <’ is a partial order. Remark 6.5.13. Notice that A < ?B ⇔ A? ? < B? Theorem 6.5.14. Let A and B be matrices of the same order. Then A B if and only if A < ?B and A ? < B. Theorem 6.5.15. Let A and B be matrices of the same order such that AA? = BA? . Then the following are equivalent:
174
(i) (ii) (iii) (iv) (v)
Matrix Partial Orders, Shorted Operators and Applications
A
Proof follows along the lines of Theorem 6.3.16. The following theorem follows trivially from Remark 6.5.13 and Theorem 6.5.15. Theorem 6.5.16. Let A and B be matrices of the same order such that A? A = A? B. Then the following are equivalent: (i) (ii) (iii) (iv) (v)
A?< B C(A) ⊆ C(B) C(A? ) ∩ C(B − A)? = {0} AA− = BA− for some g-inverse bf A− of A A = BQ for some projector Q.
and
− − In Theorem 5.2.8, we saw that A B ⇔ {B− ` } ⊆ {A` } and {Bm } ⊆ − − − † ⇔ {B`m } ⊆ {A`m } ⇔ B ∈ {A`m }. We shall now examine whether similar statements are true for A < ? B and A ? < B.
{A− m}
Theorem 6.5.17. Let A and B be matrices of the same order. Then − A < ? B if and only if {B− m } ⊆ {Am }. Proof. ‘If’ part Let B = Udiag D 0 V? be a singular value decomposition of B, where U and V are unitary and D a positive definite diagonal matrix. Then is−1 D M U? , where M and N are arbitrary. every B− m is of the form V 0 N T11 T12 − Write A = U V? . Since, {B− m } ⊆ {Am }, we have T21 T22 V
? ? D−1 M T11 T12 T11 T?12 T11 T?12 U? U V? V V? = V V? . ? ? ? ? 0 N T21 T22 T21 T22 T21 T22
In particular, by taking M = 0 and N = 0 and simplifying we have
D−1 T11 T?11 + D−1 T12 T?12 0
D−1 T11 T?21 + D−1 T12 T?22 0
=
T?11 T?12 . ? ? T21 T22
So, T?12 = 0, T?22 = 0 and D−1 T11 T?11 = T?11 , and D−1 T11 T?21 = T?21 .
One-Sided Orders
175
Next, by taking M = 0 and simplifying we have NT21 T?21 = 0 for all N. So, T21 = 0. Hence, A = Udiag T11 0 V? such that D−1 is a minimum norm g-inverse of T11 . Therefore by Theorem 6.5.9, we have A < ? B. ‘Only if’ part Let A < ? B. Let B = Udiag D 0 V? be a singular value decomposition of B, where U and V are unitary and D is a positive definite diagonal matrix. Then by Theorem 6.5.9, A = Udiag T 0 V? such that D−1 TT? = T? . The class all B− m is given −1 D M ? by V U? , where M and N are arbitrary. Now, B− m AA = 0 N −1 ? ? D M T 0 T 0 T 0 ? ? ? U = V U? = A? . V U U V V 0 N 0 0 0 0 0 0 − Hence, {B− m } ⊆ {Am }. − Remark 6.5.18. In fact, we can prove that A < ? B ⇔ {B− mr } ⊆ {Am }. † The proof follows along the lines of Theorem 6.5.17. However, B ∈ {A− m} does not imply A < ? B as the following example shows. 1 0 1 0 Example 6.5.19. Let A = and B = . Then B† = B 1 0 0 0 1 1 ? }. However, AA = and BA? = and B† AA? = A? . So, B† ∈ {A− m 1 1 1 1 . So, AA? 6= BA? . Thus, A ≮ ? B. 0 0
Lemma 6.5.20. Let C be a non-singular matrix. Then C is a minimum norm g-inverse of a matrix H if and only if C−1 is a minimum norm g-inverse of H† . Proof. C is a minimum norm g-inverse of a matrix H ⇔ CHH? = H? ⇔ CHH† = H† ⇔ HH† = C−1 H† ⇔ H†? = C−1 H† H†? ⇔ C−1 ∈ {(H† )− m }. Theorem 6.5.21. Let A and B be matrices of the same order. Then the following are equivalent: (i) (ii) (iii) (iv)
AA? = BA? AA† = BA† A = BL, where L is an orthogonal projector on C(A? ) A = BL, where L is an orthogonal projector having range a subspace of C(B? ) and
176
Matrix Partial Orders, Shorted Operators and Applications
(v) A = BL, where L is an orthogonal projector. Proof. (i) ⇒ (ii) AA? = BA? ⇒ AA? A†? A† = BA? A†? A† ⇒ AA† = BA† . (ii)⇒(iii) AA† = BA† ⇒ A = AA† A = BA† A. Take L = A† A. (iii) ⇒ (iv) ⇒(v) is trivial. (v)⇒ (i) Let A = BL, where L is an orthogonal projector. Then A? = LB? . So, C(A? ) ⊆ C(L). This gives A? = LA? . Therefore, AA? = (BL)A? = BA? . Remark 6.5.22. Any one of the five equivalent conditions in Theorem 6.5.21 together with C(A? ) ⊆ C(B? ) implies A < ? B. Remark 6.5.23. None of the five equivalent 6.5.21 conditions in Theorem 1 0 1 1 ? ? implies C(A ) ⊆ C(B ). For take A = and B = . 1 0 1 1 Remark 6.5.24. Compare Theorem 6.5.15 and Theorem 6.5.21. Theorem 6.5.25. Let A and B be matrices of the same order. Then the following are equivalent: (i) (ii) (iii) (iv)
A A <− A <− A <−
B B and BA? is hermitian B and BA† is hermitian B and B† A is hermitian.
and
Proof. (i) ⇒ (ii) Proof is trivial in view of Remark 6.5.5 and definition of A < ? B. (ii) ⇒ (i) and ⇒ (i) (iv) D 0 Let B = U V? be a singular value decomposition of B, where 0 0 U and V are unitary and D is a positive definite diagonal matrix. In view of Theorem 6.5.9, the proof will be complete if we show that each of T 0 (ii) and (iv) implies that A is of the form U V? , where D−1 is 0 0 T E a minimum norm g-inverse of T. Let A = U V? partitioned in F G conformity with the partition of B. Then BA? is hermitian ⇒ F = 0 and A <− B ⇒ G = 0 and D−1 is a g-inverse of T and DT is hermitian.
One-Sided Orders
177
Since, C(A? ) ⊆ C(B? ), we must have E = 0. So, D−1 is a minimum norm g-inverse of T. Similarly, when B† A is hermitian, E = 0 and using C(A) ⊆ C(B), we have F = 0 (i) ⇒ (iii) Proof is trivial in view of Remark 6.5.5 and definition of A < ? B. (iii) ⇒ (i) Let A = Udiag(∆ 0)V? bea singular value decomposition of A. Then it ∆ T12 is easy to show that B = U V? for some matrices T12 and T22 0 T22 such that C(T12 ? ) ⊆ C(T22 ? ). So, (i) follows by Theorem 6.5.10. Theorem 6.5.26. Let A and B be matrices of the same order. Then the following are equivalent: (i) (ii) (iii) (iv)
A A A A
B B and A? B is hermitian B and A† B is hermitian B and AB† is hermitian.
and
Definition 6.5.27. Let A and B be matrices of same order. We say − − − − − − A <− ` B if AA` = BA` and A` A = A` B for some A` ∈ {A` }. − − − − − − Also, A <m B if AAm = BAm and Am A = Am B for some Am ∈ {A− m }. Theorem 6.5.28. Let A and B be matrices of the same order. Then A < ? B if and only if A <− m B. Proof. ‘If’ part − ? − − − Now, A <− m B ⇒ AAm = BAm for some Am ∈ {Am }. So, AA = − ? − ? ? − − ? AAm AA = BAm AA = BA Also, A <m B ⇒ A < B ⇒ C(A ) ⊆ C(B? ), hence, A < ? B. ‘Only if’ part We first note that A? (AA? )− is a minimum norm g-inverse of A. Since, A < ? B ⇒ AA? = BA? ⇒ AA? (AA? )− = BA? (AA? )− for any g− − ? ? − inverse (AA? )− of AA? . Therefore, AA− m = BAm for Am = A (AA ) . − † ∼ † † Also, A < ? B ⇒ A < B ⇒ A = AB A. Let A = A AB . Then A∼ AA? = A† AB† AA? = A† AA? = A∼ . Hence, A∼ is a minimum norm g-inverse of A. Now, A∼ B = A† AB† B = A† A = A∼ A. Thus, A <− m B. Theorem 6.5.29. Let A and B be range-hermitian matrices of same order. Then in notations of Theorem 5.4.1, A < ? B if and only if
178
Matrix Partial Orders, Shorted Operators and Applications
A = Udiag Da , 0b−a , 0 U? and B = Udiag M, 0 U? , where U is unitary, Da and M are non-singular matrices of order a × a and b × b Da , 0 F1 respectively with M of the form M = + M22 0 I , for 0, 0 I some matrix F1 and a non-singular matrix Mb of order (b − a) × (b − a). Proof.
Proof follows from Theorem 5.4.1(i) and Definition of A < ? B.
We now show that a one-sided star order is the star order if and only if the matrices involved are both normal. Thus, the class of the normal matrices is precisely the class of matrices for which the one-sided star order coincides with the star order. Remark 6.5.30. Let A and B be normal matrices of order. same ? Then A < ? B if and only if A = Udiag Da 0b−a 0 U and B = Udiag Mb 0 U? , where U is unitary, Da and Mb are non-singular matricesof order a × a b × b respectively with Mb of the form Mb = Da 0 0 + M22 0, I , for some non-singular matrix Mb of order 0 0 I (b − a) × (b − a). Thus, it follows that A B. Theorem 6.5.31. Let A and B be range-hermitian matrices of the same order such that A < ? B. Then A? A = A? B implies A and B are normal. Proof. Proof follows immediately from Theorem 6.5.29 and the fact that A? A = A? B ⇒ F1 = 0 and therefore, both A and B are normal. We have the following: Fisher-Cochran type theorem for one-sided star order Theorem 6.5.32. Let A1 , A2 , . . . , Ak be any m × n matrices such that A = A1 + A2 + . . . + Ak . Consider the following statements: (i) (ii) (iii) (iv) (v) (vi) (vii)
Ai Aj ? = 0, whenever i 6= j AAi ? = Ai Ai ? AAi † = Ai Ai † AAi ? Ai = Ai C(Ai ) ⊆ C(A) P ρ(A) = ρ(Ai ) C(Ai ) ∩ C(Aj ? ) = {0} for i 6= j
One-Sided Orders
179
(viii) AAi ? = Ai A? Then (i) ⇒ (ii) ⇒ (iii) ⇒ (iv) ⇒ (v). (i) ⇒ (vii), (iv) ⇒ (vi) and (i) ⇒ (viii). Proof.
Proof is straightforward. (Recall A† = A? (A? AA? )− A? .)
180
6.6
Matrix Partial Orders, Shorted Operators and Applications
Exercises
(1) Let A, B be range-hermitian matrices. Show that if either A2 = BA or A2 = AB and B is an orthogonal projector, then A is also an orthogonal projector. (2) Show that for square matrices A and B of the same order and of index − − − ≤ 1, A < #B < {B− χ } ⊆ {Aχ } and A# < B < {Bρ } ⊆ {Aρ }. (3) Let A and B be square matrices of the same order and of index ≤ 1 such that AA− = BA− (or A− A = A− B) for some g-inverse A− of A and AB = BA. Show that A <# B. (4) Let A and B be square matrices of the same order and of index ≤ 1 such that AA− = BA− (or A− A = A− B) for some g-inverse A− of A and A2 = AB (respectively A2 = BA). Then show that A <# B. (5) Let A and B be matrices of the same order m × n over an arbitrary field. Prove that AA− = BA− (or A− A = A− B) is invariant under choice of g-inverse A− if and only if C(Bt ) ⊆ C(At ) (or C(B) ⊆ C(B)). Show that if A <− B, the condition that each of the defining conditions is invariant under choices of A− implies A = B. (6) Let A and B be square matrices of the same order and of index ≤ 1 such that (i) A2 = BA (or A2 = AB) and (ii) BA# B = A. Then prove that A <# B. What happens if the condition (ii) is replaced by (ii)0 B# AB# = A# ? Should the same result hold if the group inverse (i) and replaced by a commuting g-inverse? (7) Let A and B be matrices of the same order m × n over the field of complex numbers such that (i) A− A = A− B and (ii) BA† B = A. Prove that A B. Assume now (i) AA− = BA− and (ii) B† AB† = A† . Show that A B. (8) Let A and B be square matrices with index A = 1. We say A <−ρ B − − − − − if for some A− ρ ∈ {Aρ }, Aρ A = Aρ B and AAρ = BAρ . Similarly −χ one defines A < B. Prove that A <# B ⇔ A <−ρ B and A <−χ B. (9) Let A be an n × n complex matrix of index 1. Show that an A− ρ is an − A`r if and only if A is range-hermitian. Thus, for square matrices A and B, show that A <−ρ B ⇔ A? < B holds A if and only if is rangehermitian. Similarly, A <−χ B ⇔ A B ⇔ A is range-hermitian. (10) Let A and B be square matrices of the same order, A be a range hermitian matrix and B be idempotent. Then show that the following are equivalent:
One-Sided Orders
(i) (ii) (iii) (iv)
181
A <− B A? < B A < ?B A B
(11) Let A and B be square matrices of the same order, A be a rangehermitian matrix and B a normal matrix. Show that (i) A < ?B if and only if A B. (ii) A? < B if and only if A B. (12) Let A and B be matrices of the same order m×n. Show the following: (i) A < ?B if and only if A <− B and (ii) A? A
and
182
Matrix Partial Orders, Shorted Operators and Applications
(iv) Gt At is a projector that projects vectors into C(At ) along N (At ). (19) Find the class of all A− a for the matrix A in question no. 15.
Chapter 7
Unified Theory of Matrix Partial Orders through Generalized Inverses
7.1
Introduction
In Chapters 3-5, we studied three major matrix partial orders namely: the minus, star and sharp orders. In each of these cases, we saw that they possess a similar characterizing property in terms of a suitable subclass of g-inverses: a matrix A is below a matrix B if AG = BG and GA = GB for some matrix G belonging to a suitable subclass of g-inverses of A. For the minus, star and sharp orders, the subclasses are {A− }, {A− `m } and {A− } respectively. A vigilant reader must have noticed that the three ac partial orders mentioned above have a number of characterizing properties which are analogous. We recall a couple of them here. If A <− B, then (i) {B− } ⊆ {A− } and (ii) there exist projectors P and Q such that A = PB = BQ. If A <# B, then − (i) {B− ac } ⊆ {Aac } and (ii) there exists a projector P onto C(A) such that A = BP = PB. Also, if A B, then − (i) {B− `m } ⊆ {A`m } and (ii) there exist orthogonal projectors P and Q such that A = PB = BQ. In Chapter 6, we studied one-sided partial orders which also exhibited some similar properties. 183
184
Matrix Partial Orders, Shorted Operators and Applications
In the present chapter, we first develop a unified theory of matrix partial orders that are defined via g-inverses. We then attempt to obtain characterizing properties similar to the ones mentioned earlier under the unified theory. We also develop a unified theory based on subclasses of outer inverses. We finally consider some extensions of the above mentioned unifications. In Section 7.2, we define g-maps and a G-based order relation (which includes as special cases the minus, the star and the sharp orders) on matrices based on subclasses of generalized inverses. We study the conditions under which a G-based order relation is a partial order. We also show that not every partial order defined using subclasses of g-inverses is a G-based order. The order relation ‘<#,− ’ discussed in Section 4 of Chapter 4 provides a nontrivial example of the same. Section 7.3 is devoted to the study of orders based on the subclasses of outer inverses. We also study when an order relation based on a subclass of g-inverses coincides with an order relation based on a subclass of outer inverses. In Section 7.4, we develop a unified theory of one-sided orders. In Section 7.5, we study some characterizing properties of G-based partial orders similar to those mentioned in the first paragraph above. Theorem 7.5.5 provides us a true unification of almost all the partial orders via g-inverses that we studied so far. Section 7.6 deals with extensions of G-based orders to accommodate some orders that are not G-based. 7.2
G-based order relations: Definitions and preliminaries
Let ‘<’ denote any of the partial orders studied in earlier chapters. Recall that they all shared a common property: A < B ⇔ some specific subset of g-inverses of B is a subset of a suitable subset of g-inverses of A. Keeping this in view, we introduce in this section the notion of a G-map and a G-based order relation. We study the properties of this order relation and examine when it defines a partial order at least on its support to be defined shortly. We also give appropriate interpretations of the G-based order relation for most of the partial orders studied earlier. Definition 7.2.1. Let P(Fn×m ) denote the power set (class of all subsets) of Fn×m . A g-map is a map G : Fm×n −→ P(Fn×m ) such that for each A ∈ Fm×n ,
G(A) is a certain subset (possibly non-
Unified Theory of Matrix Partial Orders through Generalized Inverses
185
empty) of {A− } and the set ΩG = {A ∈ Fm×n : G(A) 6= ∅} is called the support of the g-map G. Remark 7.2.2. Notice that the set G(A) is not always non-empty. For instance, let G(A) = {A# } for each A ∈ Fn×n . If A is not of index 1, then G(A) = ∅. Definition 7.2.3. Let G : Fm×n −→ P(Fn×m ) be a g-map. For A, B ∈ Fm×n , we say A
Example 7.2.7. Consider the order relation ‘<#,− ’ of Definition 4.4.17. We show that this orderrelationis not a G-based an order relation. Let 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 0 4×4 A= . −1 − 1 − 1 0 , B = 0 1 0 0 ∈ R 1 1 1 0 0 0 1 0 Clearly, A2 = 0 and B4 = 0, hence, A and B are nilpotent. Also, ρ(A) = 1, ρ(B) = 3, and ρ(B − A) = 2. So, A <− B. Since A and B are nilpotent, it follows that A<#,− B.
186
Matrix Partial Orders, Shorted Operators and Applications
If the relation <#,− were a G-based order relation, then there exists a G ∈ G(A), such that GA = GB, AG = BG. Let T = 2B − A, then G(T − A) = 0, (T − A)G = 0. So, A<#,− T. However, T = 0 0 0 0 1 − 1 − 1 0 , ρ(T) = 3, ρ(T2 ) = 2 = ρ(T3 ). Thus, T is of 1 3 1 0 −1 − 1 1 0 index 2, so, CT , the core part is of rank 2 and NT , the nilpotent part is of rank 1. Since A<#,− T, we have 0 = CA <# CT and A = NA <− NT . It follows that A = NA = NT , which is impossible. Since T − A = CT , so, ACT = AT and CT A = TA should both be null. However, an easy computation shows that neither is a null matrix. In view of Remarks 7.2.5, it is clear that the G-based order relation is always anti-symmetric and reflexive on its support. Since our main objective here is to study partial orders on matrices, it is very natural to ask if a G-based order relation is transitive at least on its support ΩG , if not on its domain. We remark that if the support, ΩG of a g-map consists of exactly two elements, then the g-map always induces a partial order on ΩG . However, if ΩG contains three or more elements, the answer is no as the following example shows: Example 7.2.8. Let G : C3×3 −→ P(C3×3 ) be the map † A , if ρ(A) = 1 G(A) = − A , otherwise.
A B, if ρ(A) = 1 A <− B, otherwise. 1 0 0 1 0 0 1 0 1 Now take A = 0 0 0 , B = 0 1 0 and C = 0 1 0 . 0 0 0 0 0 0 0 0 1 Thus, A
Notice that ρ(A) = 1, ρ(B) = 2 and ρ(C) = 3. Further, A B and B <− C. So, A
Unified Theory of Matrix Partial Orders through Generalized Inverses
187
order relation becomes a partial order. Our next theorem gives a sufficient condition under which it becomes a partial order. But before this we need to give some definitions. Definition 7.2.9. For a g-map G : Fm×n −→ P(Fn×m ), A ∈ Fm×n is said to be maximal with respect to the order relation ‘
188
Matrix Partial Orders, Shorted Operators and Applications
view of Theorem 5.2.10, A
Unified Theory of Matrix Partial Orders through Generalized Inverses
189
Let G : R3×3 −→ P(R3×3 ) be the g-map defined as for X = A {Ha }, G(X) = {G}, for X = B ∅ otherwise. Then it is easy to verify that the relation ‘
190
Matrix Partial Orders, Shorted Operators and Applications
− ] We shall denote the set G(A) ∩ {A− r } by Gr (A) and G(A) ∩ {Ar } by G˜r (A). Thus, G(A) is semi-complete if and only if Gr (A) = G˜r (A). Also, each complete set is semi-complete and the same is true for g-maps.
Remark 7.2.18. For a g-map G : Fm×n −→ P(Fn×m ) and A ∈ Fm×n , − the set G(A) such that {A− r } ⊆ G(A) ⊂ {A } is semi-complete. Theorem 7.2.19. Let G : Fm×n −→ P(Fn×m ) be g-map and let A ∈ Fm×n . Then G(A) is semi-complete if and only if G(A, A) ⊆ G(A). Proof. ‘Only if’ part Let G(A) be semi-complete and GAG ∈ G(A, A) for G ∈ G(A). Clearly, ] and is a reflexive g-inverse of A. Therefore, GAG ∈ G(A) ] ∩ {A− } = G(A) ∩ {A− } GAG ∈ G(A) r r and hence G(A, A) ⊆ G(A). ‘If’ part − ] Let G(A, A) ⊆ G(A). As G(A) ∩ {A− r } ⊆ G(A) ∩ {Ar }, we only need to ] ∩ {A− }. Then ] ∩ {A− } ⊆ G(A) ∩ {A− }. So, let H ∈ G(A) show that G(A) r r r there exists a G ∈ G(A) such that GAG = HAH = H, as H is a reflexive g-inverse of A. But then GAG ∈ G(A, A) ⊆ G(A), so H ∈ {A− r } ⊆ G(A). Hence, G(A) is semi-complete. Corollary 7.2.20. A g-map satisfying the (T)-condition is semi-complete. Theorem 7.2.21. Let G : Fm×n −→ P(Fn×m ) be a semi-complete g-map. Then for A, B ∈ ΩG , the pair (A, B) satisfies the (T)-condition if and only ] if G(B) ⊆ G(A). Proof. ‘If’ part ] and H ∈ G(B) such that HAH ∈ G(A, B). There exists Let G(B) ⊆ G(A) a G ∈ G(A) such that AH = AG, HA = GA. As G is semi-complete we have HAH = GAG ∈ G(A, A) ⊆ G(A). So, HAH ∈ G(A). Thus, the pair (A, B) satisfies the (T)-condition. ‘Only if’ part Let H ∈ G(B). Then HAH ∈ G(A, B) and the pair (A, B) satisfies the (T)-condition, so, there exists a G ∈ G(A) such that HAH = G. It is now ] clear that H ∈ G(A). Our next theorem is a step in the direction of unification of these host of partial orders that we have studied in the earlier chapters.
Unified Theory of Matrix Partial Orders through Generalized Inverses
191
Theorem 7.2.22. Let G : Fm×n −→ P(Fn×m ) be a complete g-map and A, B ∈ ΩG . Then the following are equivalent: (i) A <− B and the pair (A, B) satisfies the (T)-condition (ii) G(B) ⊆ G(A) and A <s B.
and
Proof. In view of Theorem 7.2.21, (i) ⇒ (ii) follows from Theorem 3.3.5. (ii) ⇒ (i) We only need to prove that AB− A = A for some B− ∈ {B− }. Let B− ∈ G(B). As G(B) ⊆ G(A), B− ∈ G(A), so, B− is g-inverse of A. Thus, AB− A = A and by Remark 3.3.12, A <− B. Theorem 7.2.23. Let G : Fm×n −→ P(Fn×m ) be a semi-complete g-map and A, B ∈ ΩG . Then the following are equivalent: (i) (ii) (iii)
A <− B and the pair (A, B) satisfies the (T)-condition ] ⊆ G(A) ] and A <s B and G(B) ] and A <s B. G(B) ⊆ G(A)
Proof.
(i) ⇒ (ii)
^ ] ⊆ G(A). ] To see We first note that the set G(A, B) = {HAH : H ∈ G(B)} ^ ] As H ∈ G(B), ] there exists a this, let HAH ∈ G(A, B) with H ∈ G(B). G ∈ G(B) such that HB = GB, BH = BG. So, HBH = GBG. Since A <− B, we have A = AGA = AGB = BGA = AHB = BHA. Hence, HAH = H(AHB)H = H((BHA)HB)H = (HBH)A(HBH) = ] (GBG)A(GBG) = GAG. Thus, HAH ∈ G(A, B) ⊆ G(A) ⊆ G(A). ˜ Since, G is a complete map, by Theorem 7.2.22, (ii) follows. (ii) ⇒ (iii) is trivial. (iii) ⇒ (i) By Theorem 7.2.21, the pair (A, B) satisfies the (T)-condition. Further, A <− B follows along the lines of (ii) ⇒ (i) of Theorem 7.2.22. Remark 7.2.24. Notice that the completion of {A# } is {A− com }, the class of commuting g-inverses of A and the completion of {A† } is {A− `m }, the class of least squares minimum norm g-inverses of A. Compare now Theorem 4.2.8(i) ⇒ (iv) ⇒ (v) and Theorem 5.2.8 (i) ⇒ (v) with Theorems 7.2.22 and 7.2.23.
192
Matrix Partial Orders, Shorted Operators and Applications
Corollary 7.2.25. Let G : Fm×n −→ P(Fn×m ) be a g-map. Then the order relation defined as ‘A n be matrices having same rank 1 1 1 1 and a common left inverse C. (For example take 0 0 , and 1 1) De1 2 1 2 m×n n×m fine G : F −→ P(F ) as follows: C, G(X) = C, − X ,
for X = A for X = B for all other X ∈ Fm×n .
Then G is a complete map. If Theorem 7.2.23 were to be true, we would have A <− B. Since A and B have same rank, this would mean A = B and that is not true. Corollary 7.2.29. Let G : Fm×n −→ P(Fn×m ) be a g-map defined as m×n . Then the following G(A) = {A+ }, where A+ ∈ {A− r } for each A ∈ F are equivalent: (i) The order relation A
Unified Theory of Matrix Partial Orders through Generalized Inverses
193
Hence, B+ is a g-inverse of A. As G is semi-complete, B+ AB+ ∈ G(A). Thus, B+ AB+ = A+ . Recall that A <− B ⇔ B − A <− B and similar results hold for the star and the sharp order. However, A
1/2
3/2 0 −1/2
Note that B = A ⊕ (B − A). Further, A? (B − A) 6= 0 and also, (B − A)A? 6= 0. Thus, B − A≮? B implying B − A≮G B.
194
Matrix Partial Orders, Shorted Operators and Applications
We have seen that the (T)-condition plays a vital role in proving transitivity of an order relation. What about in reverse direction, i.e. if an order relation is transitive, whether the g-map defining it satisfies the (T)condition? In our next theorem we show that the answer is in affirmative when the matrices are square and the g-map defining it is semi-complete. We also note that there is no loss of generality, if the g-map is taken to be complete. Theorem 7.2.31. Let G : Fn×n −→ P(Fn×n ) be a complete g-map. If the order relation ‘
Remark 7.2.32. (i) Notice that
Unified Theory of Matrix Partial Orders through Generalized Inverses
195
(ii) A
Theorem 7.2.35. Let
Proof. The statement is clearly true if
7.3
O-based order relations and their properties
Until now we have been concentrating on mainly the partial orders induced by g-inverses of matrices, since our main aim in this chapter is to unify various partial orders that are generalizations of the minus order. In this section we change our focus from the order relations defined through ginverses to the order relations defined by the outer inverses. One reason that can be given for this is that in Chapter 4, we saw that the Drazin
196
Matrix Partial Orders, Shorted Operators and Applications
inverse (which is an outer inverse) was used to define a pre-order and later modified to give a partial order. This gives a clear indication that outer inverses can be used to obtain new partial orders. The other reason, as seen in Example 7.2.7, is that not all partial orders are G-based order relations. So, where do such kind of order relations fit in the present scenario? We examine the possibility of getting an answer in partial orders that can be induced by selecting a suitable subset of outer inverses. We shall denote an outer inverse of any matrix A ∈ Fm×n by A− and the set all outer inverses of A by {A− }. So, {A− } = {G : GAG = G}. We begin by giving some properties of the outer inverses. Theorem 7.3.1. Let B ∈ Fm×n and A = BB− B for some B− ∈ {B− }. Then the following hold: (i) (ii) (iii) (iv) (v)
B− is reflexive g-inverse of A. A <− B. BA− B and B− BA− BB− ∈ {B− } for all A− ∈ {A− }. − − − B− = B− r BBr for some Br ∈ {Br }. − − − A = BBr ABr B for some Br ∈ {B− r }.
Proof.
(i) Since AB− A = (BB− B)B− (BB− B) = B((B− BB− )(BB− )B) = (BB− B) = A and B− AB− = B− (BB− B)B− = (B− BB− )BB− = B− . So, the result follows. (ii) Notice that A = BB− B ⇒ B− A = B− B and AB− = BB− . By (i) B− is reflexive g-inverse of A, hence, A <− B. (iii) is easy. (iv) Proof follows from(i), (ii) and Remark 3.3.3(ii). (v) follows from (i) and (iii). We now give definition of an o-map and an O-based order relation in analogy with a g-map. Definition 7.3.2. An o-map is a map O : Fm×n −→ P(Fn×m ) such that for each A ∈ Fn×m , O(A) is certain specified (possibly non-empty) subset of {A− }, the set all outer inverses of A. Definition 7.3.3. Let O : Fm×n −→ P(Fn×m ) be an o-map. We define an order relation ‘
Unified Theory of Matrix Partial Orders through Generalized Inverses
197
and call it an O-based order relation. Remark 7.3.4. (i) Notice that A = BB− B for some B− ∈ {B− } implies B− A = B− B and AB− = BB− . However, the converse is not true. For example, let A, B ∈ Fn×m be distinct matrices such that B A. Then B† B = B† A, BB† = AB† , but A 6= BB† B. (ii) If O(A) = {A− }, then by Theorem 7.3.1(ii), the order relation ‘
198
Matrix Partial Orders, Shorted Operators and Applications
We now study when an O-base order relation is a partial order. Since A
Unified Theory of Matrix Partial Orders through Generalized Inverses
(4) (5) (6) (7)
199
(XA)t = XA AX = XA C(X) ⊆ C(AX) C(Xt ) ⊆ C(AX)t
where ‘t’ denotes operation of taking transpose, if matrices are over arbitrary field and taking conjugate transpose if matrices are over complex field. We denote by S any subset of {1, 2, 3, 4, 5, 6, 7} that contains 2. Let S1 denote the subset obtained from S by replacing 2 by 1. We define S as follows: S, if 1 ∈ S S= S1 , otherwise. We write X(S) to be the set of all outer inverses of X that satisfy the equations corresponding to elements of S, whenever meaningful. For example, if S = {1, 2, 3}, the set X(S) is the set of all outer inverses of X that satisfy equations (1), (2) and (3). For S = {6, 7}, the set X(S) is not defined. Similarly, the set X(S) will denote the set of generalized inverses of X that satisfy the equations corresponding to elements of S. With these notations in place we now prove Theorem 7.3.12. Let B ∈ Fm×n . Let A = BB− B for some B− ∈ B(S). Then the condition: (T )o
B− BA− BB− ∈ B(S) for each A− ∈ A(S)
is satisfied. Further, if O(B) = B(S) and G(A) = A(S), then the two order relations ‘
200
Matrix Partial Orders, Shorted Operators and Applications
= B− BA− AA− BB− = B− BABB− ∈ B(S). Therefore, the condition To is satisfied. As A(S) = {A− }, the later statement is clear and the order relation ‘
7.4
One-sided G-based order relations
In this section we take up the case of one-sided G-based order relations defined by g-maps in Section 2 and examine the conditions under which these will define a partial order. Definition 7.4.1. Let G : Fm×n −→ P(Fm×n ) be a g-map. For any A, B ∈ Fm×n , we say A G < B, if there exists a G ∈ G(A) such that GA = GB, and C(A) ⊆ C(B). We call this order relation as left G-order. Similarly, we say A < G B if there exists a G ∈ G(A) such that AG = BG, and C(At ) ⊆ C(Bt ). We call this order relation right G-order. Remark 7.4.2. If A G < B or A < G B holds, then it ia clear that A <− B. Remark 7.4.3. If G(A) = {A− }, then both left G-order and right G-order are the same as the minus order.
Unified Theory of Matrix Partial Orders through Generalized Inverses
201
In view of the Remark 7.4.2, both left G-order and right G-order are reflexive and anti-symmetric on the support ΩG of the g-map G. In absence of any further conditions on the g-map neither of the left G-order and the right Gorder defines a partial order. One obvious choice of the additional condition is the (T)-condition and this makes both left G-order and right G-order into partial orders as the following theorem shows: Theorem 7.4.4. Let G : Fm×n −→ P(Fm×n ) be a g-map. Then (i) If A G < B ⇒ the pair (A, B) satisfies the (T)-condition, then the left G-order is a partial order on the support ΩG . (ii) If A < G B ⇒ the pair (A, B) satisfies the (T)-condition, then the right G-order is a partial order on the support ΩG . Proof is easy. Is there any other additional condition that ensures that both left Gorder and right G-order induce partial orders on the support ΩG ? The answer is contained in the following theorem: Theorem 7.4.5. Let G : Fm×n −→ P(Fn×m ) be a g-map. Then (i) If ‘A G < B ⇒ the pair (A, B) satisfies G(A)AG(B) ⊆ G(A)’, then the left G-order is a partial order on the support ΩG . (ii) If ‘A < G B ⇒ the pair (A, B) satisfies G(B)AG(A) ⊆ G(A)’, then the right G-order is a partial order on the support ΩG . Proof. We prove that the left G-order is transitive and the proof for right G-order is similar. So, let A G < B and B G < C. Now, A G < B ⇒ there exists a G ∈ G(A) such that GA = GB and C(A) ⊆ C(B). Also B G < C ⇒ there exists an H ∈ G(B) such that HB = HC and C(B) ⊆ C(C). Let H1 = GAH. Then H1 ∈ G(A) and H1 A = GAHA = GA, as A G < B ⇒ A <− B. Similarly, H1 C = GAHC = GAHB = GA. Also, C(A) ⊆ C(C). Hence, A G < C. Remark 7.4.6. (i) The conditions in Theorem 7.4.5 (i) and (ii) are in general, different from the (T)-condition. (ii) The left G-order and the right G-order can be partial orders even when defining g-map does not satisfy the respective conditions. (iii) The left G-order and the right G-order can be partial orders even when
202
Matrix Partial Orders, Shorted Operators and Applications
the order relation
Let A, B ∈ Cm×n . Then ] then there (i) It is obvious that G is a complete g-map. For, if H ∈ G(A), exists a G ∈ G(A) such that HA = GA, AH = AG. Thus, if G ∈ {A− ` }, − − − then so is H ∈ {A` } and if G ∈ {Am }, then so is H ∈ {Am }. (ii) We show that A G < B and A < G B are each equivalent to A <− B. Since, A G < B ⇒ A <− B and A < G B ⇒ A <− B, is always true, we must show that A <− B ⇒ A G < B and A <− B ⇒ A < G B. So, let A <− B. Then there exists a G ∈ {A− } such that AG = BG, GA = GB. − − − If G ∈ {A− ` } \ {A`m }, or G ∈ {Am } \ {A`m }, then in view of the fact − s − that A < B ⇒ A < B, we have A < B ⇒ A G < B and also A <− B ⇒ A < G B. Let G does not belong to {A− ` } or does not belong − − to {A− }. Then for an arbitrary A , GAA ∈ G(A) and for an arbitrary m ` ` − − Am , Am AG ∈ G(A), as G is a complete g-map. Moreover, neither of − − − − GAA− ` and Am AG is in {A`m }, because A(GAA` ) and (Am AG)A are − − − not hermitian. Also, A(GAA` ) = B(GAA` ) = AA` and A− m AGA = − A. So, A G < B and A < G B hold. Note that it is quite AGB = A A− m m possible that {A− : A− (B − A) = 0, (B − A)A− = 0} = {A− `m }. If this is true, then {B− AB− , B− arbitrary} = {A† }. So, for each B− , B− AB− A = A† A and AB− AB− = AA† . Thus, B− A = A† A and AB− = AA† . This means, B− A and AB− are invariant under choice of B− . Therefore, Cn = C(In ) ⊆ C(Bt ) and Cm = C(Im ) ⊆ C(B), which is not possible if m 6= n. (iii) We now show that the conditions in Theorem 7.4.5 (i) and (ii) are not − satisfied. Let A, B ∈ Cm×n such that A <−` B. Then {B− ` } ⊆ {A` }. − −` − † − As, A < B ⇒ A < B, we have A G < B. Also, A = Am AB` ∈ G(A)AG(B), and A† does not belong to G(A). So, G(A)AG(B) * G(A). Similarly, G(B)AG(A) * G(A). (iv) We now show that the (T)-condition does not hold. − For this, let A, B ∈ Cm×n such that A <−` B. Let B− ` AB` ∈ G(A, B). − − − − − − − If B` AB` ∈ G(A), then B` AB` ∈ {A` } or B` AB` ∈ {A− m }. Let − − − − − B− AB ∈ {A }. So, AB AB = AB is hermitian. Choose a g-inverse ` ` ` ` ` ` − − − − B− not in ({B− ` } ∪ {Bm }). Now, AB ` = BB AB` is hermitian. So, B
Unified Theory of Matrix Partial Orders through Generalized Inverses
203
− − − − and B− AB− ` commute. Therefore, AB` = B AB` B = B A should be hermitian, which is not true. − − − − − Let B− ` AB` ∈ {Am }. Then B` AB` A = B` A should be hermitian, which is again not true. Hence, (T)-condition does not hold and the
We now consider another condition (that is stronger than semicompleteness) under which the left G order and right G order are equivalent to the order relation
Properties of G-based order relations
We now study some further properties of G-based order relations, yet another step to unify the theory of matrix partial orders.
204
Matrix Partial Orders, Shorted Operators and Applications
Theorem 7.5.1. Let G : F n×n −→ P(F n×n ) be a g-map. Let A, B ∈ Fn×n . Then A
Unified Theory of Matrix Partial Orders through Generalized Inverses
205
and (b) H = P(QAP)− Q. The condition (a) implies C(QAP) = C(Q) and C(QAP)t = C(P)t . Further, C(QAP) = C(Q) ⇒ there exists a matrix U such that Q = QAPU. So, HAH = P(QAP)− QAP(QAP)− Q = P(QAP)− QAP(QAP)− QAPU = P(QAP)− QAPU = P(QAP)− Q = H. Thus, H is an outer inverse of A. ‘Only if’ part Let H be an outer inverse of A. Take P = H, Q = H. Then H = H(HAH)− H is in the desired form. Given a matrix Theorem 7.5.3 determines the class of all g-inverses as the class of all outer inverses with specified row and column spaces. Theorem 7.5.5 unifies several known partial orders as demonstrated in Remark 7.5.6. Theorem 7.5.3. Let m, n, p and q be positive integers and A ∈ Fm×n , P ∈ Fn×p and Q ∈ Fq×m . Let X = P(QAP)− Q. Then the following hold: (i) X is a g-inverse of A if and only if ρ(QAP) = ρ(A). (ii) X is an outer inverse of A with C(X) = C(P) and C(X)t = C(Q)t if and only if ρ(QAP) = ρ(P) = ρ(Q). Proof. (i) ‘If’ part Since ρ(QAP) = ρ(A), it follows that ρ(QAP) = ρ(AP) = ρ(A) and therefore, C(AP) = C(A). So, there exists a matrix U such that A = APU. Consider AXA = AXAXA = AP(QAP)− QAP(QAP)− QAPU = AP(QAP)− QAPU = APU, since C(Pt ) ⊆ C(QAPt ). So, AXA = A. Thus X is a g-inverse of A. ‘Only if’ part If X = P(QAP)− Q is a g-inverse of A, then A = AXA = AXAXA = AP(QAP)− QAP(QAP)− QA ⇒ ρ(A) ≤ ρ(QAP) ≤ ρ(A). So, ρ(QAP) = ρ(A). (ii) ‘If’ part follows along the same lines as in Theorem 7.5.2 (iii). ‘Only if’ part Let X is an outer inverse of A with C(X) = C(P) and C(X)t = C(Q)t . Then X = XAX = XAXAX ⇒ ρ(X) ≤ ρ(QAP) ≤ ρ(AP) ≤ ρ(P) and ρ(X) ≤ ρ(QAP) ≤ ρ(QA) ≤ ρ(Q). Since, C(X) = C(P) and C(X)t = C(Q)t , it follows that ρ(QAP) = ρ(P) and ρ(QAP) = ρ(Q). Remark 7.5.4. Thus, as P and Q vary in Theorem 7.5.2(i), we get all ginverses of A and in (ii) all outer inverses. Similarly, Theorem 7.5.3, gives
Remark 7.5.4. Thus, as P and Q vary in Theorem 7.5.2 (i), we get all g-inverses of A, and in (ii) all outer inverses. Similarly, Theorem 7.5.3 gives us a characterization of all g-inverses and outer inverses with a specified column space and a specified row space.
Theorem 7.5.5. Let m, n, p and q be positive integers and A ∈ F^{m×n}, P ∈ F^{n×p} and Q ∈ F^{q×m}. Let G : F^{m×n} → P(F^{n×m}) be a g-map defined as follows: for each A ∈ F^{m×n}, G(A) = {P(QAP)−Q : ρ(QAP) = ρ(A)}. Then the following hold:
(i) G satisfies the (T)-condition.
(ii) G is semi-complete.
(iii) The order relation ‘<G’
⇒ QAP = QBP. Therefore, ρ(QAP) = ρ(QBP) implies ρ(A) = ρ(B). As A
(e) If we take P = A and Q = A, we have P(QAP)−Q = A(A³)−A and the order relation
7.6 G-based extensions
In Section 2, we have seen that a G-based order relation ‘<G’
Define $\hat{G} : F^{m\times n} \to \mathcal{P}(F^{n\times m})$ as follows:
$$\hat{G}(A) = \begin{cases} G(A), & \text{if } A \in \Omega_G \\ \{A^-_{\max}\}, & \text{otherwise,} \end{cases}$$
where $A^-_{\max}$ is a fixed full rank g-inverse of A (the existence of such a g-inverse is guaranteed by Theorem 2.3.18). We show that if A ∉ Ω_G, then A is $\hat{G}$-maximal. So, let A ∉ Ω_G and B be such that A
We conclude this chapter with the study of the non-trivial partial order extensions of the sharp order. We begin with a lemma that provides us with a necessary condition for a partial order to be an extension of the sharp order. Lemma 7.6.4. Let
Definition 7.6.5. Let A, B ∈ F^{n×n}. We say A is below B under
there exist a non-singular matrix P, a non-singular matrix T and a matrix S of the same order as T such that $A = P^{-1}\,\mathrm{diag}(S, 0)\,P$ and $B = P^{-1}\,\mathrm{diag}(T, 0)\,P$. As A is of index ≤ 1, S is of index ≤ 1. Now, C_A <# C_B, so C_A and C_B commute and C_A² = C_A C_B. So, T and S commute. Hence, A and B commute. We now show that the order relation
G′(A) = {A− : C(C_A) ⊆ C(A−), C(C_A^t) ⊆ C((A−)^t)}. Notice that G′(A) is non-empty, as we can find a non-singular g-inverse of A and such a g-inverse lies in G′(A). Let $P_A = C_A C_A^{\#} = C_A^{\#} C_A$. We have the following:
Lemma 7.6.7. In the setup of the preceding paragraph, the following hold:
(i) For each choice of $N_A^-$, a g-inverse of N_A, the nilpotent part of A, the matrix $C_A^{\#} + (I - P_A)N_A^-(I - P_A)$ ∈ G′(A).
(ii) For each choice of A−, $C_A^{\#} + (I - P_A)A^-(I - P_A)$ ∈ G′(A).
Proof. If $A = P\,\mathrm{diag}(C, N)\,P^{-1}$ is a core-nilpotent decomposition of A, then $P_A = P\,\mathrm{diag}(I, 0)\,P^{-1}$. Also, $A^- = P\begin{pmatrix} C^{-1} & L \\ M & N^- \end{pmatrix}P^{-1}$, where LN = 0, NM = 0 and N− is a g-inverse of N. So, $C_A^{\#} + (I - P_A)N_A^-(I - P_A)$ and $C_A^{\#} + (I - P_A)A^-(I - P_A)$ are each equal to $P\,\mathrm{diag}(C^{-1}, N^-)\,P^{-1}$.
(i) Clearly, $G = C_A^{\#} + (I - P_A)N_A^-(I - P_A)$ is a g-inverse of A, and C(C_A) ∩ C((I − P_A)A−(I − P_A)) = {0}, so C(C_A) ⊆ C(G). Similarly, C(C_A^t) ⊆ C(G^t). Thus, G ∈ G′(A).
(ii) Similar to (i).
Remark 7.6.8. The two classes
$$G^1(A) = \{C_A^{\#} + (I - P_A)N_A^-(I - P_A) : N_A^- \text{ arbitrary}\}$$
and
$$G^2(A) = \{C_A^{\#} + (I - P_A)A^-(I - P_A) : A^- \text{ arbitrary}\}$$
are identical.
Theorem 7.6.9. In the setup of Lemma 7.6.7, for the order relation ‘<G′’
Proof. ‘Only if’ part: Let A
7.7 Exercises
(1) Prove that if G(A) = {$A^-_r$}, then the relation <G is the minus order.
(2) Let G be a complete g-map on F^{m×n} such that whenever A ∈ Ω_G, G(A) ∩ {$A^-_r$} ≠ ∅. Then any A ∈ Ω_G is maximal ⇔ ρ(A) = min{m, n}. Prove that the same conclusion holds when the map is semi-complete.
(3) Let G be a g-map of F^{m×n} with support Ω_G and Ω ⊂ Ω_G. Let, for each A ∈ Ω, the set G(A) be replaced by ∅. If the relation
(6) Let A and C ∈ C^{m×n} be non-null and let G be a g-map on C^{m×n}. Let G(A) = Ann(C) ∩ {A−} ≠ ∅. Show that G(A) is complete if and only if Ann(A) ⊂ Ann(C), where Ann(X) denotes the annihilator of the matrix X.
(7) Let G : F^{n×n} → P(F^{n×n}) be defined as
$$G(A) = \begin{cases} \{A^{\#}\}, & \text{if Index } A = 1 \\ \{A^-\}, & \text{if } A \text{ is nilpotent} \\ \emptyset, & \text{otherwise.} \end{cases}$$
Then show that the order relation
sets, i.e., G¹(A) = G²(A).
(10) Let G : F^{m×n} → P(F^{n×m}) and G′ : F^{m×n} → P(F^{n×m}) be two g-maps. Write A
The following exercises use systems of matrix equations to define partial orders.
(14) Let A, B ∈ F^{m×n}. Show that the following are equivalent:
(i) A <− B;
(ii) AXB = A and BXB = A are jointly consistent;
(iii) BXA = A and BXB = A are jointly consistent;
(iv) the matrix equation $\begin{pmatrix} A & A \\ A & A \end{pmatrix} = \begin{pmatrix} A \\ B \end{pmatrix} X \begin{pmatrix} A & B \end{pmatrix}$ is consistent.
(15) Let A, B ∈ C^{m×n}. Show that the following are equivalent:
(i) A *< B;
(ii) A*AX = A*, AXB = A and BXB = A are jointly consistent.
(16) Let A, B ∈ F^{m×n}. Show that the following are equivalent:
(i) A <* B;
(ii) XAA* = A*, AXB = A and BXB = A are jointly consistent.
(17) Let A, B ∈ F^{m×n}. Then the following are equivalent:
(i) A <⁰ B;
(ii) XBX = X and BXB = A are jointly consistent;
(iii) BXB = A and BXBXB = BXB are jointly consistent; and
(iv) XBXBX = XBX and BXB = A are jointly consistent.
(18) Let A, B ∈ F^{m×n}. Consider the matrix equations AXA = A and BXB = A. Define an order relation as follows: A ≺ B if the equations AXA = A and BXB = A are jointly consistent. Check if this order relation is a partial order.
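As a hedged numeric sketch of Exercise 14 (ours, not the book's), the following builds A <− B through a rank factorization split, checks the rank-subtractivity characterization ρ(B − A) = ρ(B) − ρ(A), and tests joint consistency of AXB = A and BXB = A as one stacked linear system in vec(X):

```python
import numpy as np

rng = np.random.default_rng(1)
rank = lambda M: np.linalg.matrix_rank(M, tol=1e-10)

C = rng.standard_normal((6, 2))
D = rng.standard_normal((2, 5))
B = C @ D                                   # rho(B) = 2
A = C[:, :1] @ D[:1, :]                     # rho(A) = 1 and A <- B

print(rank(B - A) == rank(B) - rank(A))     # rank subtractivity: True

# vec(AXB) = (B^T kron A) vec(X) with column-major vec; stack both equations.
M = np.vstack([np.kron(B.T, A), np.kron(B.T, B)])
b = np.concatenate([A.ravel(order="F"), A.ravel(order="F")])
x, *_ = np.linalg.lstsq(M, b, rcond=None)
print(np.allclose(M @ x, b, atol=1e-8))     # jointly consistent: True
```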
Chapter 8
The Löwner Order
8.1 Introduction
The Löwner order is one of the oldest and most widely used matrix partial orders. However, the properties and the results concerning the Löwner order do not seem to have been compiled together in one source and are rather scattered. In this chapter, we attempt to give a comprehensive exposition of the Löwner order. In Section 8.2, we define the Löwner order on the class of hermitian matrices. The class of nnd matrices is an important subclass of the class of hermitian matrices. We obtain several important properties of the Löwner order on nnd matrices and discuss some of the possible extensions to hermitian matrices, together with the limitations when this is not possible. In Section 8.3, we study the relationship of the Löwner order on a pair of hermitian matrices with that on their powers. We also study the relationship of the Löwner order with some of the other partial orders that we studied earlier. In Section 8.4, we study the ordering properties of g-inverses of matrices with respect to the Löwner order. Section 8.5 contains two generalizations of the Löwner order, one for hermitian matrices and the other for arbitrary rectangular matrices. We make a comparison of these extensions on the class of hermitian matrices.
8.2 Definition and basic properties
In this section, we formally define the Löwner order and show that it is a partial order. We also derive some of its useful and interesting properties. Recall that H_n denotes the class of all n × n hermitian matrices. We give below two well-known results in Linear Algebra which will be needed subsequently.
Theorem 8.2.1. (Courant–Fischer) Let A ∈ H_n. Let λ₁ ≥ λ₂ ≥ … ≥ λ_n be the eigenvalues of A and u₁, u₂, …, u_n the corresponding orthonormal eigenvectors. (Notice that all eigenvalues of A are real.) Then
(i) $\max_{\|x\|=1} x^*Ax = \max_{y\neq 0}\dfrac{y^*Ay}{y^*y} = \lambda_1$ and $\min_{\|x\|=1} x^*Ax = \min_{y\neq 0}\dfrac{y^*Ay}{y^*y} = \lambda_n$.
(ii) $\max\left\{\dfrac{y^*Ay}{y^*y} : u_i^*y = 0,\ i = 1, \dots, k-1\right\} = \lambda_k$.
(iii) Let k be a fixed integer such that 2 ≤ k ≤ n and let B denote an n × (k − 1) matrix. Then
$$\inf_{B}\ \sup_{B^*y=0} \frac{y^*Ay}{y^*y} = \lambda_k.$$
Proof. (i) and (ii) are easy.
(iii) Let u₁, u₂, …, u_n be the orthonormal eigenvectors of A corresponding to the eigenvalues λ₁, λ₂, …, λ_n respectively. Write U = (u₁ : ⋯ : u_n). Clearly, U is a unitary matrix such that $U^*AU = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$. Let B be an n × (k − 1) matrix. Write y = Uz. Then
$$\sup_{B^*y=0}\frac{y^*Ay}{y^*y} = \sup_{B^*Uz=0}\frac{z^*U^*AUz}{z^*z} = \sup_{B^*Uz=0}\frac{\sum_{i=1}^n \lambda_i |z_i|^2}{\sum_{i=1}^n |z_i|^2}.$$
Since B*U is of order (k − 1) × n, there exists a non-null vector w such that B*Uw = 0 and w_{k+1} = … = w_n = 0. So,
$$\sup_{B^*Uz=0}\frac{\sum_{i=1}^n \lambda_i |z_i|^2}{\sum_{i=1}^n |z_i|^2} \ge \frac{\sum_{i=1}^k \lambda_i |w_i|^2}{\sum_{i=1}^k |w_i|^2} \ge \lambda_k. \tag{8.2.1}$$
Since (8.2.1) holds for all matrices B of order n × (k − 1), we have
$$\inf_{B}\ \sup_{B^*y=0}\frac{y^*Ay}{y^*y} \ge \lambda_k. \tag{8.2.2}$$
Taking B = (u₁ : ⋯ : u_{k−1}), it follows from (ii) that equality holds in (8.2.2).
Theorem 8.2.2. Let $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{12}^* & A_{22} \end{pmatrix}$ be a hermitian matrix of order n × n, where A₁₁, A₁₂ and A₂₂ are of orders r × r, r × (n − r) and (n − r) × (n − r) respectively. Then A is nnd if and only if
(i) A₁₁ is nnd, (ii) C(A₁₂) ⊆ C(A₁₁) and (iii) $A_{22} - A_{12}^*A_{11}^-A_{12}$ is nnd, where $A_{11}^-$ is an arbitrary g-inverse of A₁₁.
Proof.
Notice first that we can write
$$A = \begin{pmatrix} I & 0 \\ A_{12}^*A_{11}^- & I \end{pmatrix}\begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} - A_{12}^*A_{11}^-A_{12} \end{pmatrix}\begin{pmatrix} I & A_{11}^-A_{12} \\ 0 & I \end{pmatrix}.$$
The matrices $\begin{pmatrix} I & A_{11}^-A_{12} \\ 0 & I \end{pmatrix}$ and $\begin{pmatrix} I & 0 \\ A_{12}^*A_{11}^- & I \end{pmatrix}$ are non-singular and are hermitian conjugates of each other. Moreover, under both the ‘if’ part and the ‘only if’ part, we have C(A₁₂) ⊆ C(A₁₁). So, $A_{12}^*A_{11}^-A_{12}$ is invariant under the choices of g-inverses $A_{11}^-$ of A₁₁. Therefore, A is nnd if and only if A₁₁ and $A_{22} - A_{12}^*A_{11}^-A_{12}$ are nnd.
Corollary 8.2.3. Consider the same setup as in Theorem 8.2.2. Then A is nnd if and only if (i) A₂₂ is nnd, (ii) C(A₁₂*) ⊆ C(A₂₂) and (iii) $A_{11} - A_{12}A_{22}^-A_{12}^*$ is nnd, where $A_{22}^-$ is a g-inverse of A₂₂.
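As a hedged numeric sketch of Theorem 8.2.2 (ours; the helper names are ad hoc), the following constructs a hermitian block matrix satisfying the three conditions and confirms that it is nnd:

```python
import numpy as np

rng = np.random.default_rng(2)

def is_nnd(M, tol=1e-9):
    return np.min(np.linalg.eigvalsh((M + M.T) / 2)) >= -tol

def col_incl(X, Y, tol=1e-10):
    # C(X) subset of C(Y)  iff  rank([Y, X]) == rank(Y)
    return np.linalg.matrix_rank(np.hstack([Y, X]), tol=tol) == \
           np.linalg.matrix_rank(Y, tol=tol)

r, k = 3, 2
A11 = (lambda R: R @ R.T)(rng.standard_normal((r, r)))      # nnd
A12 = A11 @ rng.standard_normal((r, k))                     # C(A12) in C(A11)
S = (lambda R: R @ R.T)(rng.standard_normal((k, k)))        # nnd Schur part
A22 = A12.T @ np.linalg.pinv(A11) @ A12 + S
A = np.block([[A11, A12], [A12.T, A22]])

schur = A22 - A12.T @ np.linalg.pinv(A11) @ A12
print(is_nnd(A11), col_incl(A12, A11), is_nnd(schur))       # the three conditions
print(is_nnd(A))                                            # and A is indeed nnd
```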
Theorem 8.2.4. Let A ∈ H_n. Then there exist nnd matrices A₁ and A₂ such that
(i) A = A₁ − A₂ and (ii) A₁A₂ = 0.
Further, such a decomposition of A is unique (i.e., if A = B₁ − B₂, where B₁ and B₂ are nnd and B₁B₂ = 0, then A₁ = B₁ and A₂ = B₂).
Proof. Let λ₁ > λ₂ > … > λ_s = 0 > λ_{s+1} > … > λ_t be the distinct eigenvalues of A. Since every hermitian matrix has real eigenvalues, it follows from Remark 2.2.42 that $A = \sum_{i=1}^t \lambda_i T_i$, where T_i is hermitian and idempotent for i = 1, …, t and T_iT_j = 0 whenever i ≠ j. Take $A_1 = \sum_{i=1}^{s-1} \lambda_i T_i$ and $A_2 = -\sum_{i=s+1}^{t} \lambda_i T_i$. Clearly, A₁ and A₂ are nnd, A₁A₂ = 0 and A = A₁ − A₂. We now show that this decomposition is unique. Let A = B₁ − B₂, where B₁, B₂ are nnd and B₁B₂ = 0. It is easy to see (see Theorem 2.7.1) that $B_1 = U\,\mathrm{diag}(\Delta_1, 0, 0)\,U^*$ and $B_2 = U\,\mathrm{diag}(0, \Delta_2, 0)\,U^*$, where Δ₁, Δ₂ are positive definite diagonal matrices with their diagonal elements arranged in decreasing and increasing orders respectively and U is a unitary matrix. Now, $A = B_1 - B_2 = U\,\mathrm{diag}(\Delta_1, -\Delta_2, 0)\,U^*$ is a spectral decomposition of A for which the distinct diagonal elements must be precisely λ₁ > λ₂ > … > λ_t. By appropriate pooling of the terms we have $A = \sum_{i=1}^t \lambda_i S_i$. Thus, $B_1 = \sum_{i=1}^{s-1} \lambda_i S_i$ and $B_2 = -\sum_{i=s+1}^{t} \lambda_i S_i$. Once again by Remark 2.2.42, S_i = T_i. So, A₁ = B₁ and A₂ = B₂.
We now define the Löwner order on the class H_n of hermitian matrices of order n × n.
Definition 8.2.5. Let A, B ∈ H_n. Then A is below B under the Löwner order if B − A is nnd. When this happens, we write A <L B.
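A hedged numeric sketch (ours) of the decomposition in Theorem 8.2.4 and of the Löwner order test in Definition 8.2.5, via the eigendecomposition:

```python
import numpy as np

rng = np.random.default_rng(3)

H = rng.standard_normal((5, 5))
A = (H + H.T) / 2                           # hermitian (real symmetric here)

# A1, A2 from the positive and negative eigenvalues of A
w, U = np.linalg.eigh(A)
A1 = U @ np.diag(np.maximum(w, 0)) @ U.T
A2 = U @ np.diag(np.maximum(-w, 0)) @ U.T
print(np.allclose(A, A1 - A2))
print(np.allclose(A1 @ A2, np.zeros_like(A), atol=1e-10))   # A1 A2 = 0

# Loewner order: A <L B iff B - A is nnd
B = A + (lambda R: R @ R.T)(rng.standard_normal((5, 5)))
print(np.min(np.linalg.eigvalsh(B - A)) >= -1e-10)          # True
```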
since B − A is nnd. Therefore, λ_k(B) ≥ λ_k(A) for each k ≥ 2. Also, A <L B ⇒
$$\frac{x^*Ax}{x^*x} \le \frac{x^*Bx}{x^*x}$$
for each x ≠ 0. Hence,
$$\lambda_1(A) = \max_{x\neq 0}\frac{x^*Ax}{x^*x} \le \max_{x\neq 0}\frac{x^*Bx}{x^*x} = \lambda_1(B).$$
Corollary 8.2.10. Let A, B ∈ H_n and A <L B. Then
(i) (ii) (iii) (iv)
We now obtain necessary and sufficient conditions for A <L B when A and B are nnd.
Theorem 8.2.11. Let A and B be nnd matrices of the same order. Then the following are equivalent:
(i) A <L B;
(ii) C(A) ⊆ C(B) and λ_max(B−A) ≤ 1, where B− is any g-inverse of B; and (iii) C(A) ⊆ C(B) and AB−A <L A.
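A hedged numeric sketch (ours) of this equivalence, drawing nnd A, B with A <L B and checking condition (ii) with the Moore–Penrose inverse as the g-inverse:

```python
import numpy as np

rng = np.random.default_rng(4)

R = rng.standard_normal((5, 3))
B = R @ R.T                                        # nnd, rank 3
w, U = np.linalg.eigh(B)
Bh = U @ np.diag(np.sqrt(np.maximum(w, 0))) @ U.T  # nnd square root of B

V = np.linalg.qr(rng.standard_normal((5, 5)))[0]
K = V @ np.diag(rng.uniform(0, 1, 5)) @ V.T        # 0 <L K <L I
A = Bh @ K @ Bh                                    # then 0 <L A <L B

print(np.min(np.linalg.eigvalsh(B - A)) >= -1e-10)                       # A <L B
print(np.linalg.matrix_rank(np.hstack([B, A])) == np.linalg.matrix_rank(B))
print(np.max(np.abs(np.linalg.eigvals(np.linalg.pinv(B) @ A))) <= 1 + 1e-8)
```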
Corollary 8.2.17. Let M₁ be the matrix as in Corollary 8.2.16 and $M_2 = \begin{pmatrix} B & BA \\ AB & A \end{pmatrix}$. Then In(M₁) = In(M₂).
Proof. Let $U = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}$, partitioned conformably for multiplication with M₁. Then M₂ = UM₁U* and by Sylvester's law of inertia the result follows.
We now prove:
Theorem 8.2.18. Let A ∈ H_n and let L be an n × n matrix. Then the following are equivalent:
(i) In(LAL*) = In(A); (ii) ρ(LAL*) = ρ(A); and (iii) C(A) ∩ N(L) = {0}.
Proof. (i) ⇒ (ii) is trivial.
(ii) ⇔ (iii) We first note that A hermitian ⇒ ρ(AL*) = ρ(LA). Now, by Lemma 3.3.15, ρ(LAL*) = ρ(AL*) − d(C(AL*) ∩ N(L)) = ρ(LA) − d(C(AL*) ∩ N(L)) and ρ(LA) = ρ(A) − d(C(A) ∩ N(L)). Hence, ρ(LAL*) = ρ(A) − d(C(A) ∩ N(L)) − d(C(AL*) ∩ N(L)). So, ρ(LAL*) = ρ(A) ⇔ d(C(A) ∩ N(L)) = 0 and d(C(AL*) ∩ N(L)) = 0. Since C(AL*) ⊆ C(A), we have ρ(LAL*) = ρ(A) ⇔ d(C(A) ∩ N(L)) = 0.
(ii) ⇒ (i) Let $A = P\,\mathrm{diag}(\Delta_1, -\Delta_2, 0)\,P^*$ be a spectral decomposition of A, where P is unitary and Δ₁, Δ₂ are positive definite diagonal matrices. We can write $A = P_1\,\mathrm{diag}(\Delta_1, -\Delta_2)\,P_1^*$, where $P_1^*P_1 = I$, by partitioning P as P = (P₁, P₂) with P₁ of appropriate order. Now, $LAL^* = LP_1\,\mathrm{diag}(\Delta_1, -\Delta_2)\,P_1^*L^*$. Since ρ(LAL*) = ρ(A), we have C(A) ∩ N(L) = {0}. Also, C(A) = C(P₁). So, by Lemma 3.3.15, ρ(LP₁) = ρ(P₁) = the number of columns in P₁. Now, LP₁ has full column rank and can be extended to a non-singular matrix Q = (LP₁ : T). Thus, $LP_1\,\mathrm{diag}(\Delta_1, -\Delta_2)\,P_1^*L^* = Q\,\mathrm{diag}(\Delta_1, -\Delta_2, 0)\,Q^*$. It follows by Theorem 8.2.15 that In(LAL*) = In(A).
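A hedged numeric sketch (ours) of Theorem 8.2.18 and Sylvester's law: the inertia triple is preserved under congruence by a non-singular L, and may change for a singular one:

```python
import numpy as np

rng = np.random.default_rng(5)

def inertia(M, tol=1e-9):
    # (pi, nu, zeta) = numbers of positive, negative, zero eigenvalues
    w = np.linalg.eigvalsh((M + M.T) / 2)
    return (int(np.sum(w > tol)), int(np.sum(w < -tol)),
            int(np.sum(np.abs(w) <= tol)))

H = rng.standard_normal((5, 5)); A = (H + H.T) / 2
L = rng.standard_normal((5, 5))            # non-singular with probability 1
print(inertia(A), inertia(L @ A @ L.T))    # equal inertia triples

L2 = np.diag([1.0, 1.0, 1.0, 0.0, 0.0])    # singular L: C(A) meets N(L2)
print(inertia(L2 @ A @ L2.T))              # inertia is generally not preserved
```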
Theorem 8.2.19. Let A, B ∈ H_n. Then In(A − AB†A) = In(B − A) − (In(B) − In(A)) if and only if C(B − A) ∩ N(B) = {0}.
Proof. We have In(B†) + In(A − AB†A) = In(A) + In(B† − B†AB†), by Corollaries 8.2.16 and 8.2.17. Since In(B†) = In(B) and B† − B†AB† = B†(B − A)B†, it follows that In(A − AB†A) = In(B†(B − A)B†) − (In(B) − In(A)). Now by Theorem 8.2.18, In(B†(B − A)B†) = In(B − A) if and only if C(B − A) ∩ N(B†) = {0}. Since N(B†) = N(B), the result follows.
Corollary 8.2.20. Let A, B ∈ H_n be such that ν(A) = ν(B) and A <s B. Then ν(A − AB−A) = ν(B − A) for every g-inverse B− of B.
Proof. Since A <s B and ν(A) = ν(B), we have N(A) = N(B). So, C(B − A) ∩ N(B) = {0}. Also, as ν(A) = ν(B), we have ν(A − AB†A) = ν(B − A), by Theorem 8.2.19. Further, A <s B gives AB†A = AB−A for each g-inverse B− of B, so the result follows.
We are now ready to give the promised generalization of Corollary 8.2.12.
Theorem 8.2.21. Let A, B ∈ H_n be such that ν(A) = ν(B). Then A
(8.2.3)
Now, ρ(A₂ : B₁ : B₂) = d(C(A₂ : B₁ : B₂)) = d(C(A₂)) + d(C(B₁ : B₂)) − d(C(A₂) ∩ C(B₁ : B₂)) = ρ(A₂) + ρ(B₁) + ρ(B₂) − d(C(A₂) ∩ C(B₁ : B₂)) (8.2.4)
and
ρ(A₂ : B₁) = ρ(A₂) + ρ(B₁) − d(C(A₂) ∩ C(B₁)). (8.2.5)
Since ρ(A₂) = ρ(B₂), Equations (8.2.3), (8.2.4) and (8.2.5) together give ρ(A₂) − d(C(A₂) ∩ C(B₁ : B₂)) = −d(C(A₂) ∩ C(B₁)). Since C(A₂) ∩ C(B₁ : B₂) ⊆ C(A₂), the left-hand side of this equality is non-negative, whereas the right-hand side is non-positive; so each side is null. This further gives C(A₂) ∩ C(B₁) = {0} and C(A₂) = C(A₂) ∩ C(B₁ : B₂). It follows that C(A₂) ⊆ C(B₁ : B₂) = C(B), and so C(A₁ : B₂) ⊆ C(A₂ : B₁) ⊆ C(B). Hence, C(A₁) ⊆ C(B) and therefore C(A) ⊆ C(B). By Corollary 8.2.20, we have AB−A
(ii) BA = BA², C(A*B) ⊆ C(B) and λ_max(AB−A*B) ≤ 1;
(iii) BA = BA², C(A*B) ⊆ C(B) and tr(AB−A*B) ≤ ρ(BA);
(iv) BA = BA², A*BA
Proof. Write B = CC* and let S = C†A*C.
(i) ⇒ (ii) First note that λ_max(AB−A*B) = λ_max(C*AB−A*C). Since C(A*C) = C(A*B) ⊆ C(B), C*AB−A*C is invariant under the choice of g-inverses of
B. Hence, we can use B† for the g-inverse of B in C*AB−A*C. Since S*S = C*AC*†C†A*C = C*AB†A*C, the non-null eigenvalues of AB−A*B and S*S are the same. Also, λ_max(AB−A*B) ≤ 1 ⇒ tr(S*S) ≤ ρ(S), since the number of non-null eigenvalues of S*S (counting algebraic multiplicity) is ρ(S). Now, since C(A*C) = C(A*B) ⊆ C(B) = C(C), we have S² = C†A*CC†A*C = C†A*²C. Since BA = BA², it follows that C†A*²C = C†A*C = S. Thus, S is idempotent. Also, S idempotent and C(A*C) ⊆ C(C) together imply that A*CC† is idempotent. Therefore, ρ(S) = tr(S) = tr(A*CC†) = ρ(A*CC†) = ρ(A*CC*) = ρ(BA).
(ii) ⇒ (iv) BA = BA² and C(A*B) ⊆ C(B) ⇒ tr(S*S) ≤ ρ(S) = tr(S²), as shown in (i) ⇒ (ii). However, 0 ≤ tr((S − S*)*(S − S*)) = 2(tr(S*S) − tr(S²)), so S = S*. Also, CSC* = CC†A*CC* = A*B. Since CSC* = CS*C* = (CSC*)*, we have A*B = BA. Hence, BA = BA² = A*BA.
(iv) ⇒ (i) BA = A*BA = A*B. So, BA = BA² and C(A*B) ⊆ C(B). Moreover, CC*AC*†C†A*C = BAB†A*C = A*²C = A*C = CC†A*C. Hence, S = S*S. So, S is hermitian and idempotent. From the proof of (i) ⇒ (ii) above, the non-null eigenvalues of AB−A*B and S*S are the same. Hence, λ_max(AB−A*B) = 1.
(i) ⇔ (iii) follows from Theorem 8.2.11.
Remark 8.2.24. In the setup of Theorem 8.2.23, the proofs of the equivalences (i) ⇔ (ii) ⇔ (iv) ⇔ (i) show that BA = BA², C(A*B) ⊆ C(B) and λ_max(AB−A*B) ≤ 1 together imply λ_max(AB−A*B) = 1. Similarly, BA = BA², C(A*B) ⊆ C(B) and tr(AB−A*B) ≤ ρ(BA) together imply tr(AB−A*B) = ρ(BA).
Theorem 8.2.25. Let A, B be nnd matrices such that A <L B
So, B[(I − AA†)B(I − A†A)]†B = B(I − A†A)[(I − AA†)B(I − A†A)]†(I − AA†)B = (B − A)(I − A†A)[(I − AA†)B(I − A†A)]†(I − AA†)(B − A). Consider the matrix
$$M = \begin{pmatrix} B-A & (B-A)(I-A^{\dagger}A) \\ (I-AA^{\dagger})(B-A) & (I-AA^{\dagger})(B-A)(I-A^{\dagger}A) \end{pmatrix}.$$
By Theorem 8.2.2 it follows that M is nnd. So, by Corollary 8.2.3, we have that (B − A) − B[(I − AA†)B(I − A†A)]−B = (B − A) − (B − A)(I − A†A)[(I − AA†)B(I − A†A)]†(I − AA†)(B − A) is nnd.
Let A and B be matrices of the same order (possibly rectangular). Then the matrix AB* + BA* is hermitian. When is this matrix nnd? We answer in the following:
Theorem 8.2.26. Let A and B be matrices of order m × n. Then the following are equivalent:
(i) 0 <L AB* + BA*
λ_max((A : B)*{(A + B)(A + B)−}(A : B)) ≤ 1. However, C(A + B) ⊆ C(A : B), so (i) ⇔ (iii).
Theorem 8.2.27. Let A, B ∈ H_n with B nnd and C(A) ⊆ C(B). Further, let AB + BA be nnd. Then A is nnd.
Proof. Let $A = U\begin{pmatrix} \Delta & 0 \\ 0 & 0 \end{pmatrix}U^*$ be a spectral decomposition of A, where Δ is a non-singular real diagonal matrix. Write $B = U\begin{pmatrix} C_{11} & C_{12} \\ C_{12}^* & C_{22} \end{pmatrix}U^*$, where C₁₁ and Δ have the same order. Since AB + BA is nnd, we have that
$$\begin{pmatrix} \Delta & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} C_{11} & C_{12} \\ C_{12}^* & C_{22} \end{pmatrix} + \begin{pmatrix} C_{11} & C_{12} \\ C_{12}^* & C_{22} \end{pmatrix}\begin{pmatrix} \Delta & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} \Delta C_{11} + C_{11}\Delta & \Delta C_{12} \\ C_{12}^*\Delta & 0 \end{pmatrix}$$
is nnd. Hence, ΔC₁₁ + C₁₁Δ is nnd and C₁₂ = 0. Since C(A) ⊆ C(B), C₁₁ is positive definite. The i-th diagonal element of ΔC₁₁ + C₁₁Δ is Δ_ii(C₁₁)_ii + (C₁₁)_iiΔ_ii = 2Δ_ii(C₁₁)_ii. Since ΔC₁₁ + C₁₁Δ is nnd and C₁₁ is positive definite, we have Δ_ii ≥ 0 for all i. Since Δ is non-singular, Δ_ii > 0. Hence, A is nnd.
In the case of the minus order, if A <− B and ρ(A) = ρ(B), then A = B. The same conclusion holds when A <* B and ρ(A) = ρ(B), or A <# B and ρ(A) = ρ(B). However, this is not the case with the Löwner order. For example, we can take A = I, B = 2I. Then A <L B and ρ(A) = ρ(B), yet A ≠ B.
8.3 Löwner order on powers and its relation with other partial orders
Let A and B be nnd matrices of the same order. In this section we study the Löwner order for powers of A and B when A <L B
Lemma 8.3.1. Let A be a square matrix. Let λ₁ and σ₁ respectively be the largest (in modulus) eigenvalue and the largest singular value of A. Then |λ₁| ≤ σ₁.
Proof. Since σ₁ is the largest singular value, σ₁² is the largest eigenvalue of A*A. Hence by Theorem 8.2.1 (i), we have
$$\sigma_1^2 = \max_{y\neq 0}\frac{y^*A^*Ay}{y^*y}.$$
Let u be an eigenvector of A corresponding to the eigenvalue λ₁. So,
$$|\lambda_1|^2 = \lambda_1\bar{\lambda}_1 = \frac{u^*A^*Au}{u^*u} \le \max_{y\neq 0}\frac{y^*A^*Ay}{y^*y} = \sigma_1^2.$$
We now prove the following:
Theorem 8.3.2. Let A and B be nnd matrices of the same order. Consider the following statements: (i) A²
Remark 8.3.4. Let A and B be nnd matrices of the same order such that A² <L B²
Theorem 8.3.10. Let A, B ∈ H_n. Then ‘A <− B and A²
have A^m
8.4 Löwner order on generalized inverses
Let A be an nnd matrix and let G be an nnd g-inverse of A. We obtain a characterization of the nnd g-inverses H of A with a specified rank, each of which dominates G or is dominated by G under the Löwner order. We also study a similar problem for nnd outer inverses. We then consider nnd matrices A and B such that A <L B
rank s if and only if
$$G = (Q^*)^{-1}\begin{pmatrix} I_r & L \\ L^* & L^*L + S \end{pmatrix}Q^{-1}$$
for some matrix L and for some nnd matrix S of order (n − r) × (n − r) with rank s − r. Further, for a given g-inverse G of A, the matrices S and L are uniquely determined.
Corollary 8.4.3. Let A be an nnd matrix of order n × n with rank r. Let s be an integer such that r ≤ s ≤ n. Then an n × n matrix G is an nnd g-inverse of A with rank s if and only if there exists a non-singular matrix T such that $A = T\,\mathrm{diag}(I_r, 0)\,T^*$ and $G = (T^*)^{-1}\,\mathrm{diag}(I_s, 0)\,T^{-1}$.
Corollary 8.4.4. Let A be an nnd matrix of order n × n with rank r. Let s be an integer such that r ≤ s ≤ n. Then an n × n matrix G is an nnd g-inverse of A with rank s if and only if $G = A^-_r + R$ for some nnd reflexive g-inverse $A^-_r$ of A and for some matrix R such that AR = 0 and ρ(R) = s − r. Further, given G, such a decomposition is unique. (Given G, the unique reflexive g-inverse mentioned above is GAG.)
Corollary 8.4.5. Let A be an nnd matrix of order n × n with rank r. Let s be an integer such that r ≤ s ≤ n. Then an n × n matrix G is an nnd g-inverse of A with rank s if and only if G is an nnd reflexive g-inverse of A + B for some nnd matrix B such that ρ(A + B) = ρ(A) + ρ(B).
We now turn our attention to the outer inverses of an nnd matrix.
Theorem 8.4.6. Let A be an nnd matrix of order n × n with rank r. Let (P, P*) be a rank factorization of A. Then G is an nnd outer inverse of A with rank s, where 0 ≤ s ≤ r, if and only if $G = (P_L^{-1})^*TP_L^{-1}$ for some left inverse $P_L^{-1}$ of P and for some idempotent matrix T with rank s.
Proof. The ‘if’ part is trivial.
‘Only if’ part: Since P is a full column rank matrix, there exists a matrix Q such that R = (P : Q) is non-singular. So, $A = R\,\mathrm{diag}(I_r, 0)\,R^*$. Let G be an outer inverse of A with rank s. Since GAG = G, it follows that $G = (R^{-1})^*\begin{pmatrix} B_{11} & B_{12} \\ B_{12}^* & B_{12}^*B_{11}B_{12} \end{pmatrix}R^{-1}$, where $B_{11} = B_{11}B_{11}^*$, ρ(B₁₁) = s and B₁₁B₁₂ = B₁₂. We can rewrite G as $G = (R^*)^{-1}S^*\,\mathrm{diag}(B_{11}, 0)\,SR^{-1}$, where $S = \begin{pmatrix} I & B_{12} \\ 0 & I \end{pmatrix}$. It is easy to see that (I : −B₁₂)R⁻¹ is a left inverse of P. By choosing T = B₁₁ and $P_L^{-1} = (I : -B_{12})R^{-1}$, the result follows.
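A hedged numeric sketch (ours) of Corollary 8.4.4: an nnd g-inverse of higher rank is built as a reflexive nnd g-inverse (here A⁺) plus an nnd matrix R with AR = 0, constructed inside N(A):

```python
import numpy as np

rng = np.random.default_rng(6)

R0 = rng.standard_normal((5, 2))
A = R0 @ R0.T                                   # nnd, rank r = 2
Ap = np.linalg.pinv(A)                          # reflexive nnd g-inverse of A

# orthonormal basis of N(A) from the zero eigenvalues of A
w, U = np.linalg.eigh(A)
N = U[:, np.abs(w) < 1e-10]                     # 5 x 3
Z = N @ rng.standard_normal((3, 1))
Rm = Z @ Z.T                                    # nnd, rank 1, and A @ Rm = 0
G = Ap + Rm

print(np.allclose(A @ G @ A, A, atol=1e-8))     # G is a g-inverse of A
print(np.min(np.linalg.eigvalsh(G)) >= -1e-10)  # G is nnd
print(np.linalg.matrix_rank(G) == 3)            # rank s = r + rho(R) = 3
```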
Corollary 8.4.7. Let A be an nnd matrix of order n × n with rank r. Then G is an nnd outer inverse of A with rank s, where 0 ≤ s ≤ r, if and only if there exists a full column rank matrix S of order n × r such that A = SS* and $G = (S_L^{-1})^*\,\mathrm{diag}(I_s, 0)\,S_L^{-1}$.
We are now ready to explore the Löwner ordering properties of nnd g-inverses and outer inverses of an nnd matrix. We begin with g-inverses.
Theorem 8.4.8. Let A be an n × n nnd matrix with rank r. Let G be an nnd g-inverse of A with rank s, where r ≤ s ≤ n. Let H be an n × n matrix of rank t, where s ≤ t ≤ n. Then the following are equivalent:
(i) H is an nnd g-inverse of A and G <L H
The L¨ owner Order
233
Proof follows along lines similar to Theorem 8.4.8. Note that in Theorem 8.4.9, for the matrix U, U
Proof. (i) ⇒ (ii) is trivial.
(ii) ⇒ (iii) Since A is a g-inverse of H, we have HAG = G = GAH. Hence, (H − G)A(H − G) = H − G.
(iii) ⇒ (iv) Since H − G is an outer inverse of A, ρ(H − G) = tr((H − G)A) = tr(HA) − tr(GA) = ρ(H) − ρ(G). Hence, G <− H.
(iv) ⇒ (i) follows from Theorem 8.3.8.
How does one obtain the class of all nnd outer inverses of a specified rank dominating (or dominated by) a given outer inverse of a matrix? We describe below a constructive procedure for this: let A be an n × n nnd matrix with rank r. Let G be an nnd outer inverse of A with rank s (s ≤ r). Then by Corollary 8.4.7, there exists a full column rank matrix S such that A = SS* and $G = (S_L^{-1})^*\,\mathrm{diag}(I_s, 0)\,S_L^{-1}$.
(i) The class of all nnd outer inverses H of A with rank t (s ≤ t ≤ r) such that G <L H
Let A and B be positive definite matrices of the same order. Then it is well known that A <L B if and only if B⁻¹ <L A⁻¹.
where S₁₂, S₁₃, S₂₂, S₂₃ and S₃₃ are arbitrary matrices satisfying
(a) S₁₂ = (Λ⁻¹ − I)Z, where Z is arbitrary;
(b) S₂₂ = I + S₁₂*((Λ⁻¹ − I)−)S₁₂ + V, where V is an arbitrary nnd matrix;
(c) $\begin{pmatrix} S_{13} \\ S_{23} \end{pmatrix} = \begin{pmatrix} \Lambda^{-1} - I & S_{12} \\ S_{12}^* & S_{22} - I \end{pmatrix}\begin{pmatrix} U_1 \\ U_2 \end{pmatrix}$, where U₁ and U₂ are arbitrary; and
(d) $S_{33} = (S_{13}^*\ S_{23}^*)\begin{pmatrix} \Lambda^{-1} & S_{12} \\ S_{12}^* & S_{22} \end{pmatrix}^{-}\begin{pmatrix} S_{13} \\ S_{23} \end{pmatrix} + W$, where W is an arbitrary nnd matrix such that ρ(W) = u − t.
(ii) Let r = s. Let G be a given nnd g-inverse of A with rank u. Let P be a non-singular matrix such that $A = P\,\mathrm{diag}(I_r, 0)\,P^*$, $G = Q\,\mathrm{diag}(I_u, 0)\,Q^*$ and $B = P\,\mathrm{diag}(\Delta, 0)\,P^*$, where Δ is a positive definite matrix such that I <L Δ
(a) C(L) ⊆ C(I − Δ⁻¹), (b) L*ΔL
and
Proof follows by repeated applications of Theorem 8.2.2.
Remark 8.4.12. For an interesting characterization of the matrices satisfying (a)–(c) of Theorem 8.4.11, see [Bhimasankaram and Mathew Thomas (1993)]. In Theorem 8.4.11 (ii), we considered the special case ρ(A) = ρ(B). However, if A
Clearly, there exist T₁₃, T₂₃ and T₃₃ such that H
Let P be a non-singular matrix such that
$$A = P\,\mathrm{diag}(I_r, 0)\,P^*, \quad B = P\,\mathrm{diag}(\Delta, 0)\,P^* \quad\text{and}\quad G = Q\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}Q^*,$$
where Q = (P⁻¹)* and Δ is a positive definite matrix satisfying I
8.5 Generalizations of the Löwner order
In this section we give two generalizations of the Löwner order, one for the class of hermitian matrices and the other for arbitrary matrices, square or rectangular. We study their properties briefly and compare the two on the class of hermitian matrices. Let A and B be hermitian matrices of the same order. As we have seen in Theorem 8.2.21, the statements ‘A <L B’
Theorem 8.5.4. Let A, B ∈ Hn . Then A
We now consider another generalization of the Löwner order, to arbitrary matrices, square or rectangular. Given a matrix A of order m × n, this ordering makes use of the unique nnd square root of AA*, denoted by (AA*)^{1/2}.
Definition 8.5.9. Let A and B be m × n matrices. The matrix A is said to be below the matrix B under the GL-ordering (Generalized Löwner order) if (AA*)^{1/2} <L (BB*)^{1/2}
Corollary 8.5.12. Let A and B be matrices of the same order. If A
Proof. In view of Corollary 8.2.12, (AA*)^{1/2} <L (BB*)^{1/2}
$$((BB^*)^{1/2})^{\dagger}(AA^*)^{1/2} = ((BB^*)^{\dagger}(BB^*)^{1/2})(AA^*)^{1/2} = (BB^*)^{\dagger}((BB^*)^{1/2}(AA^*)^{1/2}) = (BB^*)^{\dagger}BA^* = B^{*\dagger}A^*.$$
Hence,
$$\lambda_{\max}\big(((BB^*)^{1/2})^{\dagger}(AA^*)^{1/2}\big) = \lambda_{\max}(B^{*\dagger}A^*) = \lambda_{\max}(AB^{\dagger}) = \lambda_{\max}(B^{\dagger}A).$$
The rest of the proof follows by Definition 8.5.9.
Theorem 8.5.15. Let A and B be nnd matrices. Then A
(i.e., W† = W*) such that A = HW. Further, the matrices H and W are uniquely determined by C(H) = C(W), in which case H² = AA* and W = H†A. For a proof see page 220 of [Ben-Israel and Greville (2001)]. We now obtain a characterization of the GL-ordering in terms of the polar decomposition.
Theorem 8.5.18. Let A and B be matrices of order m × n. Let A = H₁W₁ and B = H₂W₂ be the polar decompositions of A and B respectively, with C(H_i) = C(W_i) for i = 1, 2. Then A
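A hedged numeric sketch (ours) of the polar decomposition just described, computed from the SVD A = USV*: take H = USU* (nnd) and W = UV* (a partial isometry):

```python
import numpy as np

rng = np.random.default_rng(7)

A = rng.standard_normal((4, 6))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
H = U @ np.diag(s) @ U.T                        # nnd factor
W = U @ Vt                                      # partial isometry, C(H) = C(W)

print(np.allclose(H @ W, A))                    # A = H W
print(np.allclose(H @ H, A @ A.T))              # H^2 = A A*
print(np.allclose(np.linalg.pinv(W), W.T))      # W^+ = W*
```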
Let us now compare the two order relations ‘
8.6 Exercises
(1) Let A and B be nnd matrices of the same order such that B
(3) Let A and B ∈ H_n, ν(A) = ν(B) ≠ 0 and A
(i) A ⊗ C is nnd. (ii) A ⊗ C
Chapter 9
Parallel Sums
9.1 Introduction
The present chapter is a prelude to the study of one of the many applications of g-inverses and matrix partial orders. Here we study parallel sums, which originally arose in the study of network synthesis. The concept of a parallel sum is analogous to the concept of connecting resistors either in series or in parallel, a basic concept in elementary network theory. If two resistors having resistances R₁ and R₂ are connected in series, their joint resistance is R₁ + R₂, and if they are connected in parallel then their joint resistance R is
$$R = (R_1^{-1} + R_2^{-1})^{-1} = \frac{R_1R_2}{R_1 + R_2} = R_1(R_1 + R_2)^{-1}R_2$$
and is called the parallel sum of R₁ and R₂. Notice that while giving the total resistance of these resistors in parallel, we have tacitly assumed the resistances to be positive numbers. However, if R₁ and/or R₂ are zero, we say the joint resistance is zero. As we shall see in Chapter 10, the impedance matrix of a reciprocal resistive n-port network is an nnd matrix, and nnd matrices can be considered a generalization of non-negative real numbers. Further, if two such n-port networks N₁ and N₂ with impedance matrices Z₁ and Z₂ respectively are connected in parallel, then the impedance matrix of their parallel connection is Z₁(Z₁ + Z₂)†Z₂. Following the discussion in the opening paragraph, one definition of the parallel sum of two nnd matrices A and B can be A(A + B)⁻¹B. This will certainly have a meaning if A + B is non-singular. If both A and B are non-singular, this parallel sum can be written as (A⁻¹ + B⁻¹)⁻¹. If the sum A + B is singular, then suitable modifications are required to give a meaningful definition of the parallel
sum. Since (A + B)†, the Moore–Penrose inverse of A + B, always exists and is unique, a possible definition of the parallel sum of A and B is the matrix A(A + B)†B. Following what we just said for non-singular matrices, one may then be tempted to express this parallel sum as (A† + B†)†. This, however, is far from true, as can be seen by taking A = I₂, the 2 × 2 identity matrix, and $B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$. In this chapter we begin by studying the parallel sum and its properties in a more general context, where the latter definition is the guiding source. Parallel sums of nnd matrices are discussed simultaneously. In Section 9.2, we define the notion of the parallel sum for arbitrary rectangular matrices using generalized inverses and study in detail the various properties of these sums. The corresponding theorems for nnd matrices, whenever possible, have been included too. We also obtain a solution of the matrix equation P(A, X) = C, where P(A, X) denotes the parallel sum of the matrices A and X. In Section 9.3, we study the inter-relationship of parallel sums and the various partial orders studied in earlier chapters. We also prove that the set of nnd matrices forms a partially ordered abelian group with respect to parallel addition as its binary operation. Section 9.4 contains results about the continuity and the index of parallel sums. The Drazin inverse of parallel sums is also discussed towards the end of that section. As has been our tradition, all matrices are over an arbitrary field unless otherwise stated.
9.2 Definition and properties
In this section, we first define the parallel sum of two arbitrary matrices (possibly rectangular) over a general field. Traditionally we should begin by defining the parallel sum of two nnd matrices as the concept of parallel sums first came into being for these matrices only. However, we choose to start by defining the parallel sum of two arbitrary rectangular matrices over a general field and as we go along studying the various properties, we also develop similar results for parallel sums of two nnd matrices. Definition 9.2.1. Let A and B be any two m × n matrices. We say A and B are parallel summable if A(A + B)− B is invariant under the choice of g-inverses (A + B)− of A + B. When this is so, the common value of A(A + B)− B is called the parallel sum of A and B. We denote the parallel sum by P(A, B) and write P(A, B) = A(A + B)− B.
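A hedged numeric sketch of Definition 9.2.1 (ours), for nnd matrices, which — as shown below in Theorem 9.2.7 — are always parallel summable; the Moore–Penrose inverse is one valid choice of (A + B)⁻:

```python
import numpy as np

rng = np.random.default_rng(8)
nnd = lambda: (lambda R: R @ R.T)(rng.standard_normal((4, 2)))

def psum(A, B):
    # parallel sum P(A, B) = A (A+B)^- B, computed with the pseudoinverse
    return A @ np.linalg.pinv(A + B) @ B

A, B = nnd(), nnd()
print(np.allclose(psum(A, B), psum(B, A), atol=1e-10))   # P(A,B) = P(B,A)

# for positive definite matrices the parallel sum is (A^-1 + B^-1)^-1
Ad, Bd = A + np.eye(4), B + np.eye(4)
print(np.allclose(psum(Ad, Bd),
                  np.linalg.inv(np.linalg.inv(Ad) + np.linalg.inv(Bd)),
                  atol=1e-10))
```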
Before going any further, we make some useful observations in the following remarks.
Remark 9.2.2. If either A or B is null, then A and B are trivially parallel summable and their parallel sum is the null matrix. Thus, to have anything interesting about parallel sums one must take only non-null matrices.
Remark 9.2.3. If the non-null matrices A and B are such that A + B is null, their parallel sum is not defined, since A(A + B)−B is no longer invariant under the choice of g-inverses of A + B. Thus, over any field, the matrices A (≠ 0) and −A are not parallel summable.
We now give a characterization, in several formulations, of the parallel summability of two matrices.
Theorem 9.2.4. Let A and B be two matrices of the same order. Then the following are equivalent:
(i) A and B are parallel summable;
(ii) C(A) ⊆ C(A + B) and C(A^t) ⊆ C((A + B)^t);
(iii) C(B) ⊆ C(A + B) and C(B^t) ⊆ C((A + B)^t);
(iv) C(A) + C(B) = C(A + B) and C(A^t) + C(B^t) = C((A + B)^t);
(v) Let ρ(A + B) = r. There exist non-singular matrices P and Q of orders m × m and n × n respectively and an r × r matrix T such that $A = P\,\mathrm{diag}(T, 0)\,Q$ and $B = P\,\mathrm{diag}(I - T, 0)\,Q$;
(vi) Let ρ(A + B) = r. There exist non-singular matrices R, S of orders m × m and n × n respectively and an r × r matrix W such that $A = R\,\mathrm{diag}(I - W, 0)\,S$ and $B = R\,\mathrm{diag}(W, 0)\,S$;
(vii) Let ρ(A) = a and ρ(A + B) = r. There exist non-singular matrices P₀, Q₀ of orders m × m and n × n respectively and an r × r non-singular matrix $K = \begin{pmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{pmatrix}$ such that $A = P_0\,\mathrm{diag}(I_a, 0_{r-a}, 0)\,Q_0$ and $B = P_0\begin{pmatrix} K_{11} - I_a & K_{12} & 0 \\ K_{21} & K_{22} & 0 \\ 0 & 0 & 0 \end{pmatrix}Q_0$, where K₁₁ is an a × a matrix; and
(viii) Let ρ(B) = b and ρ(A + B) = r. There exist non-singular matrices R₀, S₀ of orders m × m and n × n respectively and an r × r non-singular matrix $W = \begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix}$ such that $A = R_0\begin{pmatrix} W_{11} - I_b & W_{12} & 0 \\ W_{21} & W_{22} & 0 \\ 0 & 0 & 0 \end{pmatrix}S_0$ and $B = R_0\,\mathrm{diag}(I_b, 0_{r-b}, 0)\,S_0$, where W₁₁ is a b × b matrix.
Proof. (i) ⇒ (ii) By Theorem 2.3.11, C(B) ⊆ C(A + B) and C(A^t) ⊆ C((A + B)^t). However, C(A) = C(A + B − B) ⊆ C(A + B) + C(B) ⊆ C(A + B) + C(A + B) = C(A + B), as C(A + B) is a subspace. So, (ii) holds.
(ii) ⇒ (i) Once again by Theorem 2.3.11, A(A + B)−A is invariant under the choice of g-inverses (A + B)−. But A(A + B)−A = A(A + B)−(A + B − B) = A(A + B)−(A + B) − A(A + B)−B = A − A(A + B)−B; therefore, A(A + B)−B is invariant under the choices of (A + B)−, proving that A and B are parallel summable. The proof of (i) ⇔ (iii) is similar. (i) ⇔ (iv) follows from the proofs of (i) ⇔ (ii) and (i) ⇔ (iii). Proofs of (ii) ⇔ (v) ⇔ (vii) and (iii) ⇔ (vi) ⇔ (viii) follow by Remark 3.2.7.
Corollary 9.2.5. Let A and B be two matrices of the same order. Then
(i) A and B are parallel summable if and only if B and A are parallel summable. Further, P(A, B) = P(B, A).
(ii) A and B are parallel summable if and only if A^t and B^t are parallel summable. Further, P(A^t, B^t) = (P(B, A))^t.
(iii) If A, B ∈ C^{n×n} are hermitian and A and B are parallel summable, then P(A, B) is hermitian.
Proof. (i) The first half of the statement follows from Theorem 9.2.4. For the latter half, note that A(A + B)−B = A(A + B)−(A + B − A) = A(A + B)−(A + B) − A(A + B)−A = A − A(A + B)−A = (A + B)(A + B)−A − A(A + B)−A = B(A + B)−A. Hence, P(A, B) = P(B, A).
(ii) The first part of the statement follows from Theorem 9.2.4. For the second part, notice that a choice of g-inverse of A^t + B^t is ((A + B)−)^t.
(iii) follows in view of (ii).
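A hedged numeric sketch (ours) of criterion (ii) of Theorem 9.2.4, testing the two range inclusions by rank comparisons:

```python
import numpy as np

rng = np.random.default_rng(9)
rank = lambda M: np.linalg.matrix_rank(M, tol=1e-10)

def parallel_summable(A, B):
    # criterion (ii): C(A) in C(A+B) and C(A^t) in C((A+B)^t)
    S = A + B
    return rank(np.hstack([S, A])) == rank(S) and \
           rank(np.vstack([S, A])) == rank(S)

A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))
print(parallel_summable(A, A))    # True: A + A = 2A has the same spaces
print(parallel_summable(A, -A))   # False: A + B = 0 (cf. Remark 9.2.3)
```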
Remark 9.2.6. Even when at least one of A and B is a full rank matrix, the matrices A and B may not be parallel summable. A simple example is to take any non-singular matrix A and B = −A.
We next show that if the matrices A and B are nnd, then they are always parallel summable and that their parallel sum is below each of A and B under the Löwner order.
Theorem 9.2.7. Let A, B ∈ C^{n×n} be nnd matrices. Then the following hold:
(i) C(A) + C(B) = C(A + B);
(ii) A and B are parallel summable; and
(iii) P(A, B) <L A and P(A, B) <L B.
Theorem 9.2.10. Let A, B be m × n matrices. Let C and D be matrices of orders p × m and n × q respectively. Further, let ρ(C) = m and ρ(D) = n. Then the following hold:
(i) A and B are parallel summable if and only if CAD and CBD are parallel summable.
(ii) When P(A, B) exists, P(CAD, CBD) = CP(A, B)D.
Proof. (i) Observe that C(CAD) = C(CA) and C((CAD)^t) = C((AD)^t), since C has a left inverse and D has a right inverse. Further, for any matrices R, S and T, if C(R) ⊆ C(R + S), then C(TR) ⊆ C(T(R + S)). Now, it follows immediately that C(A) ⊆ C(A + B) ⇔ C(CA) ⊆ C(C(A + B)), since C has a left inverse. Similarly, C(A^t) ⊆ C((A + B)^t) ⇔ C((AD)^t) ⊆ C(((A + B)D)^t). Thus, (i) holds.
(ii) Let P(A, B) exist. Notice that $D_R^{-1}(A + B)^-C_L^{-1}$ is a g-inverse of C(A + B)D. Hence, P(CAD, CBD) = CP(A, B)D.
Corollary 9.2.11. Let A, B be m × n matrices. Let P and Q respectively be m × m and n × n non-singular matrices. Then A, B are parallel summable if and only if PAQ, PBQ are parallel summable. Furthermore, P(PAQ, PBQ) = PP(A, B)Q.
Remark 9.2.12. If A and B are parallel summable, it need not follow that A† and B† are parallel summable. For example, take $A = \begin{pmatrix} 3 & -5 \\ -1 & 2 \end{pmatrix}$ and $B = \begin{pmatrix} -1/5 & 0 \\ -2/5 & 0 \end{pmatrix}$. Then $A^{\dagger} = A^{-1} = \begin{pmatrix} 2 & 5 \\ 1 & 3 \end{pmatrix}$ and $B^{\dagger} = \begin{pmatrix} -1 & -2 \\ 0 & 0 \end{pmatrix}$. Trivially, A and B are parallel summable. But $A^{\dagger} + B^{\dagger} = \begin{pmatrix} 1 & 3 \\ 1 & 3 \end{pmatrix}$ is a singular matrix, whereas A† is non-singular. So, A† and B† cannot be parallel summable. Thus, one may ask the following: when are the generalized inverses of parallel summable matrices themselves parallel summable? We do not know the answer yet. However, see Ex. 13 at the end of the chapter.
We now obtain the class of all generalized inverses of the parallel sum of two parallel summable matrices. From the utility point of view this is an important result. However, we need the following lemma to prove it.
Lemma 9.2.13. Let A and B be parallel summable. Then N(P(A, B)) = N(A) + N(B) and N(P(A, B)^t) = N(A^t) + N(B^t).
Proof. Clearly, N(A) + N(B) ⊆ N(P(A, B)). Further, N(A) ∩ N(B) ⊆ N(A) + N(B) always holds. Let x ∈ N(P(A, B)). So, P(A, B)x = 0 ⇒ P(B, A)x = 0 ⇒ (A + B)−Bx ∈ N(A) and (A + B)−Ax ∈ N(B). Therefore, (A + B)−(A + B)x ∈ N(A) + N(B). Since N(A) + N(B) ⊆ N(A + B), we have (A + B)(A + B)−(A + B)x = 0. So, (A + B)x = 0. As C(A^t) ⊆ C((A + B)^t) and C(B^t) ⊆ C((A + B)^t), we have Ax = 0 and Bx = 0. Hence, x ∈ N(A) ∩ N(B) ⇒ x ∈ N(A) + N(B). Thus, N(P(A, B)) = N(A) + N(B). Similarly, we can prove N(P(A, B)^t) = N(A^t) + N(B^t).
Theorem 9.2.14. Let A and B be parallel summable matrices of order m × n. Then the class of all generalized inverses of P(A, B) is given by
$$\{P(A, B)^-\} = \{A^-\} + \{B^-\}.$$
Conversely, if C is a non-null matrix of order m × n such that {C−} = {A−} + {B−}, then A and B are parallel summable and C = P(A, B).
Proof. Let A and B be parallel summable and let A− and B− be any g-inverses of A and B respectively. Consider P(A, B)(A− + B−)P(A, B) = P(A, B)A−P(A, B) + P(A, B)B−P(A, B) = B(A + B)−AA−A(A + B)−B + A(A + B)−BB−B(A + B)−A = B(A + B)−A(A + B)−B + A(A + B)−B(A + B)−A = P(A, B)(A + B)−B + P(A, B)(A + B)−A = P(A, B)(A + B)−(A + B) = P(A, B), since C(P(A, B)) ⊆ C(A) ∩ C(B) ⊆ C(A + B). Thus, A− + B− is a g-inverse of P(A, B), showing that {A−} + {B−} ⊆ {P(A, B)−}.
To prove {P(A, B)−} ⊆ {A−} + {B−}, note that, as seen in the above proof, a choice for a g-inverse of P(A, B) is A− + B− for some g-inverse A− of A and B− of B. So, every g-inverse of P(A, B) is of the form G = A− + B− + (I − (P(A, B))−P(A, B))W + Z(I − P(A, B)(P(A, B))−). However, by Lemma 9.2.13, N(P(A, B)) = N(A) + N(B). So, (I − (P(A, B))−P(A, B))W = (I − A−A)W₁ + (I − B−B)W₂ for some matrices W₁ and W₂. Similarly, Z(I − P(A, B)(P(A, B))−) = Z₁(I − AA−) + Z₂(I − BB−) for some matrices Z₁ and Z₂. Therefore, G = A− + (I − A−A)W₁ + Z₁(I − AA−) + B− +
(I − B−B)W₂ + Z₂(I − BB−). Clearly, A− + (I − A−A)W₁ + Z₁(I − AA−) is a g-inverse of A and B− + (I − B−B)W₂ + Z₂(I − BB−) is a g-inverse of B. Thus, G is a sum of a g-inverse of A and a g-inverse of B, proving {P(A, B)−} ⊆ {A−} + {B−}. Hence, {P(A, B)−} = {A−} + {B−}.
For the converse, we first show that A and B are parallel summable. Let (C₁, D₁) be a rank factorization of C. Since {C−} = {A−} + {B−}, for each G ∈ {A−} and for each H ∈ {B−}, as seen above, G + (I − GA)W₁ + Z₁(I − AG) + H + (I − HB)W₂ + Z₂(I − BH) is a g-inverse of C. So, C₁D₁(G + (I − GA)W₁ + Z₁(I − AG) + H + (I − HB)W₂ + Z₂(I − BH))C₁D₁ = C₁D₁; equivalently, C₁D₁((I − GA)W₁ + Z₁(I − AG) + (I − HB)W₂ + Z₂(I − BH))C₁D₁ = 0. As C₁ and D₁ are full rank matrices, this further gives D₁((I − GA)W₁ + Z₁(I − AG) + (I − HB)W₂ + Z₂(I − BH))C₁ = 0. Since the last equation holds for all W₁, W₂, Z₁, Z₂, we have D₁(I − GA) = 0, (I − AG)C₁ = 0, D₁(I − HB) = 0, (I − BH)C₁ = 0. It follows that C(C₁) ⊆ C(A) ∩ C(B) and C(D₁^t) ⊆ C(A^t) ∩ C(B^t). Since C₁ has full column rank = ρ(C) and D₁ has full row rank = ρ(C), it follows that d(C(A) ∩ C(B)) = d(C(A^t) ∩ C(B^t)) = ρ(C). Let C₂ and C₃ be matrices such that the columns of (C₁ : C₂) form a basis of C(A) and those of (C₁ : C₃) form a basis of C(B). Similarly, let D₂ and D₃ be matrices such that the rows of $\begin{pmatrix} D_1 \\ D_2 \end{pmatrix}$ form a basis of C(A^t) and the rows of $\begin{pmatrix} D_1 \\ D_3 \end{pmatrix}$ form a basis of C(B^t). Then
$$C = (C_1 : C_2 : C_3)\begin{pmatrix} I & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} D_1 \\ D_2 \\ D_3 \end{pmatrix}.$$
Extend (C₁ : C₂ : C₃) to a non-singular matrix P = (C₁ : C₂ : C₃ : C₄) and $\begin{pmatrix} D_1 \\ D_2 \\ D_3 \end{pmatrix}$ to a non-singular matrix $Q = \begin{pmatrix} D_1 \\ D_2 \\ D_3 \\ D_4 \end{pmatrix}$, so that $C = P\,\mathrm{diag}(I, 0, 0, 0)\,Q$. Also,
$$A = P\begin{pmatrix} R_{11} & R_{12} & 0 & 0 \\ R_{21} & R_{22} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}Q \quad\text{and}\quad B = P\begin{pmatrix} S_{11} & 0 & S_{13} & 0 \\ 0 & 0 & 0 & 0 \\ S_{31} & 0 & S_{33} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}Q,$$
where $\begin{pmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{pmatrix}$ and $\begin{pmatrix} S_{11} & S_{13} \\ S_{31} & S_{33} \end{pmatrix}$ are non-singular matrices; let $\begin{pmatrix} R^{11} & R^{12} \\ R^{21} & R^{22} \end{pmatrix}$ and $\begin{pmatrix} S^{11} & S^{13} \\ S^{31} & S^{33} \end{pmatrix}$ be their respective g-inverses. Any g-inverse of C is of the form
$$C^- = Q^{-1}\begin{pmatrix} I & X & X & X \\ X & X & X & X \\ X & X & X & X \\ X & X & X & X \end{pmatrix}P^{-1}.$$
Similarly, g-inverses of A and B are respectively of the form
$$A^- = Q^{-1}\begin{pmatrix} R^{11} & R^{12} & X & X \\ R^{21} & R^{22} & X & X \\ X & X & X & X \\ X & X & X & X \end{pmatrix}P^{-1} \quad\text{and}\quad B^- = Q^{-1}\begin{pmatrix} S^{11} & X & S^{13} & X \\ X & X & X & X \\ S^{31} & X & S^{33} & X \\ X & X & X & X \end{pmatrix}P^{-1}.$$
254
Matrix Partial Orders, Shorted Operators and Applications
a b x y and {B− } = , where a, b, d, x, y, w 1−a d 1 w are arbitrary. Clearly, A and B are not parallel summable. However, for C = 0, {C− } = C2×2 = {A− } + {B− }. {A− } =
Theorem 9.2.16. Let A and B be parallel summable. Then C(P(A, B)) = C(A) ∩ C(B) and C(P(A, B))t = C(At ) ∩ C(Bt ). Proof.
In view of Corollary 9.2.5, clearly we have C(P(A, B)) ⊆ C(A) ∩ C(B), C(P(A, B))t ⊆ C(At ) ∩ C(Bt ).
So, let x ∈ C(A) ∩ C(B). Let A− , B− and (A + B)− be any g-inverses of A, B and A + B respectively. Then A− + B− is a g-inverse of P(A, B) and P(A, B)(A− + B− )x = P(B, A)A− x + P(A, B)B− x = B(A + B)− AA− x + A(A + B)− BB− x. Since x ∈ C(A) ∩ C(B), we have AA− x = x, BB− x = x. Moreover, x ∈ C(A + B), so, P(A, B)(A− + B− )x = B(A + B)− x + A(A + B)− x = (A + B)(A + B)− x = x. Hence, x ∈ C(P(A, B)), and so, C(A) ∩ C(B) ⊆ C(P(A, B)). The proof of the other part is similar.
Remark 9.2.17. If A and B are disjoint matrices, then they are parallel summable with parallel sum the null matrix.
Theorem 9.2.18. Let A, B ∈ C^{n×n} be nnd matrices. Then their parallel sum P(A, B) is nnd.
Proof. By Theorem 9.2.7, A and B are parallel summable. Let z ∈ C^n and x = P(A, B)z. Then P(A, B)(A† + B†)x = x, since A† + B† is a g-inverse of P(A, B). So, (P(A, B)z, z) = (x, z) = (P(A, B)(A† + B†)x, z) = ((A† + B†)x, P(A, B)*z) = ((A† + B†)x, P(A, B)z), since P(A, B) is hermitian. Thus, (P(A, B)z, z) = ((A† + B†)x, x) ≥ 0, as A† + B† is nnd. Thus, P(A, B) is nnd.
Corollary 9.2.19. Let A, B ∈ C^{n×n} be range-hermitian matrices. Then their parallel sum P(A, B) is range-hermitian.
The proof follows from (ii) of Corollary 9.2.5 and Theorem 9.2.16.
Theorem 9.2.20. Let A, B ∈ C^{n×n}. Then the following hold:
(i) Let P and Q denote the orthogonal projectors onto C(A) ∩ C(B) and C(A*) ∩ C(B*) respectively. Then (P(A, B))† = Q(A− + B−)P, where A− and B− are any g-inverses of A and B respectively. In particular, the expression Q(A− + B−)P is independent of the choices of A− and B−.
(ii) Let P₁ and P₂ be the orthogonal projectors onto C(A) and C(B) respectively. Then the orthogonal projector onto C(A) ∩ C(B) is 2P(P₁, P₂).
Proof. (i) Since C(P(A, B)) = C(A) ∩ C(B), P = (P(A, B))(P(A, B))† and Q = (P(A, B))†(P(A, B)). So, Q(A− + B−)P = (P(A, B))†(P(A, B))(A− + B−)(P(A, B))(P(A, B))†. As A− + B− is a g-inverse of P(A, B) by Theorem 9.2.14, we have Q(A− + B−)P = (P(A, B))†.
(ii) Since C(P₁) = C(A) and C(P₂) = C(B), and P₁, P₂, being hermitian and idempotent, are parallel summable by Theorem 9.2.7. Also, P(P₁, P₂) is hermitian. A simple computation shows that 2P(P₁, P₂) is idempotent. So, the result follows.
Remark 9.2.21. Let A, B be parallel summable matrices. For any g-inverse (P(A, B))− of P(A, B) and the projectors P = (P(A, B))(P(A, B))− and Q = (P(A, B))−(P(A, B)), the matrix Q(A− + B−)P is a reflexive g-inverse of P(A, B). It would be interesting to know what all the reflexive g-inverses of P(A, B) are.
We now give all possible reflexive generalized inverses of the parallel sum P(A, B) of matrices A, B whenever it is defined.
Theorem 9.2.22. Let A and B be two parallel summable matrices of order m × n. Let ρ(A) = a and ρ(A + B) = r. Further, let P, Q be non-singular matrices of orders m × m and n × n respectively and let $K = \begin{pmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{pmatrix}$ be an r × r non-singular matrix with inverse $K^{-1} = \begin{pmatrix} K^{11} & K^{12} \\ K^{21} & K^{22} \end{pmatrix}$, where K₁₁ is an a × a matrix, such that $A = P\,\mathrm{diag}(I_a, 0_{r-a}, 0)\,Q$ and
$B = P\begin{pmatrix} K_{11} - I_a & K_{12} & 0 \\ K_{21} & K_{22} & 0 \\ 0 & 0 & 0 \end{pmatrix}Q$. Then the class of all reflexive g-inverses of P(A, B) consists of the matrices X of the form
$$X = Q^{-1}\begin{pmatrix} T & TL \\ MT & MTL \end{pmatrix}P^{-1},$$
where T is any reflexive g-inverse of $I_a - K^{11}$ and L, M are arbitrary.
Proof is by direct verification.
In general, we may have neither P(A + C, B) = P(A, B) + P(C, B) nor P(A, B + C) = P(A, B) + P(A, C), even when all the parallel sums are defined. However, we have the following:
Theorem 9.2.23. Let A, B and C be m × n matrices. If A, B are parallel summable, A + B, C are parallel summable and P(A + B, C) = 0, then the following hold:
(i) A + C, B are parallel summable;
(ii) A, B + C are parallel summable; and
(iii) P(A + C, B) = P(A, B) = P(A, B + C).
Proof. Since A, B are parallel summable and A + B, C are parallel summable, we have
C(A) ⊆ C(A + B) ⊆ C(A + B + C), C(A^t) ⊆ C((A + B)^t) ⊆ C((A + B + C)^t) (9.2.1)
or equivalently
C(B) ⊆ C(A + B) ⊆ C(A + B + C), C(B^t) ⊆ C((A + B)^t) ⊆ C((A + B + C)^t). (9.2.2)
By (9.2.1) and (9.2.2), both (i) and (ii) follow.
(iii) P(A + B, C) = 0 ⇒ (A + B)(A + B + C)−C = 0 ⇒ B(A + B + C)−C = 0, since C(B^t) ⊆ C((A + B)^t). Similarly, C(A + B + C)−B = 0. So, P(A + C, B) = (A + C)(A + B + C)−B = A(A + B + C)−B. Again, P(A + B, C) = 0 ⇒ ρ(A + B + C) = ρ(A + B) + ρ(C). This gives that (A + B + C)− is a g-inverse of A + B. Hence, P(A + C, B) = A(A + B)−B = P(A, B). Similarly, P(A, B) = P(B + C, A) = P(A, B + C).
Remark 9.2.24. Let A and B be any m × n matrices such that A <s B. Does there exist a matrix C such that B and C are parallel summable and P(B, C) = A? The answer is in the negative, and we give below an example:
Example 9.2.25. Let $A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$. Then there exists no matrix C such that B and C are parallel summable and P(B, C) = A.
An affirmative answer to the above question in special cases is contained in the following theorems:
Theorem 9.2.26. Let A and B be any m × n matrices such that A, B are parallel summable. Then the following hold:
(i) If either C(B) ⊆ C(A) or C(B^t) ⊆ C(A^t), then both inclusions hold and ρ(A − P(A, B)) = ρ(A).
(ii) Let ρ(A − P(A, B)) = ρ(A). Then A, −P(A, B) are parallel summable and B = −P(A, −P(A, B)) + W for some W such that A and W are disjoint matrices.
Proof. (i) Since A, B are parallel summable, C(A) + C(B) = C(A + B) and C(A^t) + C(B^t) = C((A + B)^t). Assume first that C(B) ⊆ C(A). Then C(A + B) = C(A) + C(B) ⊆ C(A) ⊆ C(A + B) ⇒ ρ(A + B) = ρ(A). This further gives C(A^t) = C((A + B)^t). Hence, C(B^t) ⊆ C(A^t). Similarly, if C(B^t) ⊆ C(A^t), then C(B) ⊆ C(A). Notice that A − P(A, B) = A(A + B)−A. So, ρ(A − P(A, B)) ≤ ρ(A). Let C(B) ⊆ C(A). Then B = AU for some matrix U. Denote P(A, B) by D. As C(A^t) ⊆ C((A + B)^t), we have A = A(A + B)−(A + B) = (A − D)(I + U). So, ρ(A) ≤ ρ(A − D) = ρ(A − P(A, B)).
(ii) Now, ρ(A − P(A, B)) = ρ(A) and P(A, B) <s A. So, A and −P(A, B) = −D are parallel summable. Let C denote the parallel sum of A and −D. Clearly, A + C = A(A − D)−A and by part (i) ρ(A + C) = ρ(A). Thus, A and C are also parallel summable. A g-inverse of A + C is A−(A − D)A−, since (A + C)A−(A − D)A−(A + C) = A(A − D)−A(A−(A − D)A−)A(A − D)−A
= A(A − D)−A(A − D)(A − D)−A = A(A − D)−A = A + C. So, the parallel sum of A and C is P(A, C) = A(A−(A − D)A−)C = D = P(A, B). This further implies A(A + C)−A = A(A + B)−A and that every g-inverse of A + B is a g-inverse of A + C. For, if G is a g-inverse of A + B, then (A + C)G(A + C) = A(A − D)−AGA(A − D)−A = A(A − D)−A(A + C)−A(A − D)−A = (A + C)(A + C)−(A + C) = A + C. So, A + C <− A + B, and hence A + B = A + C + W for some matrix W disjoint with A + B and therefore with A.
Remark 9.2.27. Given matrices A and C of the same order such that C <s A and ρ(A − C) = ρ(A), there exists a unique matrix B such that A and B are parallel summable and their parallel sum is C. If the matrices A and C are nnd, then so is the matrix B.
Theorem 9.2.28. Let A and B be m × n matrices such that A, B are parallel summable. Let C = P(A, B). Then ρ(A − C) ≥ 2ρ(A) − ρ(A + B). Conversely, let A and C be any m × n matrices such that (i) C <s A and (ii) ρ(A − C) ≥ 2ρ(A) − min{m, n}. Then there exists a matrix X of order m × n such that A and X are parallel summable and their parallel sum is C.
Proof. Let ρ(A + B) = r and let (L, R) be a rank factorization of A + B. Since A, B are parallel summable, there exists an r × r matrix D such that A = LDR. Notice that ρ(A) = ρ(D). Clearly, B = L(I − D)R and C = LD(I − D)R. Also, A − C = LD²R and ρ(A − C) = ρ(D²). Applying the Frobenius inequality to D, I and D, we have ρ(D²) + ρ(I) ≥ ρ(D) + ρ(D) = 2ρ(A), so ρ(A − C) = ρ(D²) ≥ 2ρ(A) − r = 2ρ(A) − ρ(A + B).
(Converse) Let ρ(A) = r and let (P, Q) be a rank factorization of A. Since C <s A, we have A − C <s A, so we can write A − C = PΛQ. Note that ρ(A − C) = ρ(Λ) = q (say). Let p be an integer such that p ≤ min{m, n} and ρ(A − C) ≥ 2ρ(A) − p. Let F be a matrix of order p × r and rank r and E a matrix of order r × p and rank r such that EF = Λ. Let L and R be matrices of full rank such that LF = P and ER = Q. Then A = PQ = L(FE)R and A − C = L(FE)²R. Let X = LR − A. Then, clearly, A and X are parallel summable and P(A, X) = C.
9.3 Parallel sums and partial orders
In this section we study parallel sums in relation to some of the partial orders studied in earlier chapters. We begin with the space pre-order.
Theorem 9.3.1. Let A, B be m × n matrices. Then the following are equivalent: (i) A, B are parallel summable; (ii) A <s A + B; and (iii) B <s A + B.
The proof follows by Theorem 9.2.7.
Theorem 9.3.2. Let A, B be parallel summable matrices of order m × n. Then the following hold:
(i) P(A, B) <s A;
(ii) P(A, B) <s B; and
(iii) if C is any m × n matrix such that C <s A and C <s B, then C <s P(A, B).
Thus, for parallel summable matrices, their parallel sum is a maximal element below each of A, B under the space pre-order.
Proof. (i) and (ii) are trivial in view of Theorem 9.2.16.
(iii) Since C <s A and C <s B, we have C(C) ⊆ C(A), C(C) ⊆ C(B) and C(C^t) ⊆ C(A^t), C(C^t) ⊆ C(B^t). So, C(C) ⊆ C(A) ∩ C(B) = C(P(A, B)) and C(C^t) ⊆ C(A^t) ∩ C(B^t) = C(P(A, B)^t).
Theorem 9.3.3. Let A, B and C be matrices of the same order such that (i) A <s B, (ii) A, C are parallel summable, and (iii) B, C are parallel summable. Then P(A, C) <s P(B, C).
Proof is trivial.
Theorem 9.3.4. Let A, B be m × n matrices. Then the following are equivalent:
(i) A <− B; (ii) A, B − A are parallel summable and P(A, B − A) = 0; and (iii) B − A <− B.
Proof. (i) ⇔ (iii) by Remark 3.3.12. (i) ⇔ (ii): A <− B ⇔ B = A ⊕ (B − A) ⇔ A, B − A are parallel summable and P(A, B − A) = 0.
Remark 9.3.5. Notice that Theorem 9.3.4 remains valid for any partial order finer than the minus order. We have seen in Theorem 9.2.7 (iii) that for nnd matrices A and B, while P(A, B) <L A
By Theorem 9.2.16, we have C(P(A, C)) = C(A) ∩ C(C), C(P(A, C)^t) = C(A^t) ∩ C(C^t) and C(P(B, C)) = C(B) ∩ C(C), C(P(B, C)^t) = C(B^t) ∩ C(C^t).
Since A <− B ⇒ A <s B, by Theorem 9.3.3, P(A, C) <s P(B, C). It remains to show that P(A, C)[P(B, C)]−P(A, C) = P(A, C) for some g-inverse [P(B, C)]− of P(B, C). By Theorem 9.2.14 any g-inverse of P(B, C) is of the form B− + C− for some choice of B− and C−. Also, since (i) holds, we have {B−} ⊆ {A−}. So, B− + C− is also a g-inverse of P(A, C). Therefore, the result follows.
Corollary 9.3.9. Let A, B be any m × n matrices such that A <− A + B. Then A, B are parallel summable and P(A, B) = 0.
The proof follows by Theorems 9.3.2 and 9.3.4. However, an independent proof using Theorem 5.3.3 would be more interesting.
Remark 9.3.10. A theorem similar to Theorem 9.3.8 does not hold for the star order, as the following example shows.
Example 9.3.11. Let $A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $C = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$. Clearly, A <* B, and A, C and B, C are parallel summable. But then P(A, C) = A, $P(B, C) = \begin{pmatrix} 0 & 1 \\ 1 & -1 \end{pmatrix}$, and P(A, C) ≮* P(B, C).
We now give two interesting results for matrices in the class of hermitian matrices. For the first result, which shows that parallel sums respect the Löwner order, we give two proofs. The first proof has the advantage of reaching the conclusion quickly. The second proof comes after a series of results which are themselves useful in obtaining several other properties of parallel sums in this and the next section.
Theorem 9.3.12. Let A, B and C be nnd matrices of the same order such that A <L B. Then P(A, C) <L P(B, C).
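A hedged numeric sketch (ours) of the theorem just stated, building B as A plus an nnd increment and checking the Löwner comparison of the parallel sums:

```python
import numpy as np

rng = np.random.default_rng(11)
nnd = lambda r: (lambda R: R @ R.T)(rng.standard_normal((4, r)))

def psum(A, B):
    return A @ np.linalg.pinv(A + B) @ B

A, C = nnd(2), nnd(3)
B = A + nnd(2)                                 # so A <L B
diff = psum(B, C) - psum(A, C)
diff = (diff + diff.T) / 2                     # symmetrize against round-off
print(np.min(np.linalg.eigvalsh(diff)) >= -1e-9)   # P(A,C) <L P(B,C): True
```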
− C(A + C)†AB†C(A + C)†A − A(A + C)†CC†C(A + C)†A.
However, A(A + C)†CC†C(A + C)†A = (A + C − C)(A + C)†C(A + C)†A = (A + C)(A + C)†C(A + C)†A − C(A + C)†C(A + C)†A = C(A + C)†A − C(A + C)†C(A + C)†A = C(A + C)†A − C(A + C)†A(A + C)†C. Therefore, P(A, C) − P(A, C)[P(B, C)]−P(A, C) = C(A + C)†(A − AB†A)(A + C)†A. Now, since A
We now give the second proof of Theorem 9.3.12. We first prove a lemma.
Lemma 9.3.13. Let A, B ∈ C^{n×n} be nnd matrices. Then for any n-vectors x, y, z such that x + y = z, (P(A, B)z, z) ≤ (Ax, x) + (By, y), and equality holds if z ∈ C(A) + C(B) and z = x₀ + y₀, where x₀ = (A + B)†Bz and y₀ = (A + B)†Az.
Proof. First notice that Ax₀ = P(A, B)z, By₀ = P(A, B)z and x₀ + y₀ = (A + B)†(A + B)z = (A + B)(A + B)†z. As z ∈ C(A) + C(B) and A, B are parallel summable, C(A) + C(B) = C(A + B). So, x₀ + y₀ = z. Also, (P(A, B)z, z) = z*P(A, B)z = z*P(A, B)x₀ + z*P(A, B)y₀ = (Ax₀, x₀) + (By₀, y₀). Thus, the equality holds. Now, for the first part, let x₁ = P₍A+B₎x, y₁ = P₍A+B₎y and z₁ = P₍A+B₎z, where P₍A+B₎ is the orthogonal projection onto C(A + B). Write x₁′ = (A + B)†Bz₁, y₁′ = (A + B)†Az₁. Note that x₁′ + y₁′ = z₁ = x₁ + y₁. We can take x₁ = x₁′ + t, y₁ = y₁′ − t. Then (Ax₁, x₁) = (Ax₁′, x₁′) + 2Re(Ax₁′, t) + (At, t) and (By₁, y₁) = (By₁′, y₁′) − 2Re(By₁′, t) + (Bt, t). Since Ax₁′ = By₁′ = P(A, B)z₁, we therefore have (Ax₁, x₁) + (By₁, y₁) = (Ax₁′, x₁′) + (At, t) + (By₁′, y₁′) + (Bt, t) ≥ (P(A, B)z₁, z₁). However, (Ax₁, x₁) = (AP₍A+B₎x, P₍A+B₎x) = (Ax, x), (By₁, y₁) = (By, y) and (P(A, B)z₁, z₁) = (P(A, B)z, z), so the result follows.
Corollary 9.3.14. If A and B are nnd matrices of the same order, then (P(A, B)z, z) ≤ P((Az, z), (Bz, z)).
Proof. If (Az, z) + (Bz, z) = 0, then P((Az, z), (Bz, z)) is not defined. So, we may assume that (Az, z) + (Bz, z) ≠ 0. Let
$$x = \frac{(Bz, z)}{((A + B)z, z)}\,z \quad\text{and}\quad y = \frac{(Az, z)}{((A + B)z, z)}\,z.$$
Then z = x + y. By Lemma 9.3.13, (P(A, B)z, z) ≤ (Ax, x) + (By, y). However,
$$(Ax, x) + (By, y) = \frac{(Az, z)(Bz, z)}{((A + B)z, z)} = P((Az, z), (Bz, z)),$$
and the result follows.
In fact, we have a more general result. Theorem 9.3.15. Let A, B be nnd matrices of order n × n and let Z ∈ Cn×n be an arbitrary matrix. Then Z? P(A, B)Z
Lemma 9.3.16. Let A, B, C and D be nnd matrices of the same order. Then P(A, C) + P(B, D)
264
Matrix Partial Orders, Shorted Operators and Applications
And finally now, we give the second proof of Theorem 9.3.12. lary 9.3.17 below is a restatement of Theorem 9.3.12.
Corol-
Corollary 9.3.17. Let A, B and C be nnd matrices of the same order such that A
Corollary 9.3.18. Let A and B be nnd matrices of the same order such that A
9.4
Continuity and index of parallel sums
In this section we first give the error bounds on the matrix P(A + X, B + Y) − P(A, B), which is the difference of the the parallel sums of matrices A, B and their perturbations A + X, B + Y and use the same to discuss the continuity of parallel sums. We begin with Theorem 9.4.1. If A, B be nnd matrices of the same order, then kP(A, B)k ≤ P(kAk, kBk). Proof. We first note that for an nnd matrix A, we have for each x, (Ax, Ax) ≤ kAk(Ax, x). The inequality is trivial if the matrix A = 0. So, let A 6= 0. Let be any real positive number. Consider kAk − . There exists a non-null vector x0 such that (Ax0 , Ax0 ) ≥ kAk − . (Ax0 , x0 )
Parallel Sums
265
So, by above observation, as P(A, B) is nnd, for each positive real number there exists an x such that (P(A, B)x, P(A, B)x) ≥ kP(A, B)k − . (P(A, B)x, x) Let y = P(A, B)x. Then P(A, B)(A† + B† )y = y and (P(A, B)x, x) = (y, x) = (P(A, B)((A† + B† )y, x) = (A† + B† )y, P(A, B)x) = ((A† + B† )y, y) = (A† y, y) + (B† y, y). Since y ∈ C(P(A, B)), let y = Au = Bv. Then (A† y, y) = (u, Au) and (B† y, y) = (v, Bv). So, 1 1 (u, Au) (v, Bv) (P(A, B)x, x) ( + )≤ + = kAk kBk (Au, Au) (Bv, Bv) (P(A, B)x, P(A, B)x) 1 ≤ kP(A, B)k − or equivalently P(kAk, kBk) ≥ kP(A, B)k − . as is arbitrary, result follows. Remark 9.4.2. We note that in Theorem 9.4.1, equality occurs if there exists a vector y such that Ay = kAky and By = kBky. It also implies the continuity of the operator P(A, B) about ‘0’. Theorem 9.4.3. Let A, B and X be nnd matrices the same order and G denote the matrix P(A, B + X) − P(A, B). Then G is nnd and if C = 2 A + B, G = AC† (P(C, X))C† A and kGk ≤ kC† Ak kXk. Proof.
Notice that G = A(A + B + X)† B + X − A(A + B)† B = A(C + X)† B + X − AC† B
= A((C + X)† − C† )B + A(C + X)† X. If PC is the orthogonal projector onto C(C), then PC B = B. So, A((C + X)† − C† )B = A((C + X)† − C† )CC† B = A((C + X)† )C + X − XC† B − AC† CC† B = A(C† B − C† B) − A(C + X)† XC† B. Hence, G = −A(C + X)† XC† B + A(C + X)† X. Since, C(A) ⊆ C(C), PC G = G, and G being hermitian, GPC = G. So, G = GPC = A(C + X)† XPC (I − C† B) = A(C + X)† XPC C† A = AC† C(C + X)† XPC C† A = AC† P(C|X)C† A. Since P(C, X) is hermitian, G is hermitian. Moreover, 2
kGk ≤ kAC† kkP(C, X)kkC† Ak ≤ kAC† k kXk,
266
Matrix Partial Orders, Shorted Operators and Applications
for by Theorem 9.4.1, kP(C, X)k ≤
kCkkXk . kCk + kXk
Lemma 9.4.4. If A, B and X be nnd matrices of the same order, then 2P(A + X, B + X) + P(A + B, 2X) = 2P(A + B + X, X) + P(A, B + 2X) +P(A + 2X, B). Proof.
Let A + B + 2X = D. Then 2P(A + X, B + X) + P(A + B, 2X) = 2P(A + X, B + X) + P(A + B, 2X) = 2(A + X)D† B + X + 2(A + B)D† X = 2{AD† B + 2AD† X + XD† B + XD† X}.
Also, 2P(A + B + X, X) + P(A, B + 2X) = 2P(A + B + X, X) + P(A, B + 2X) + P(A + 2X, B) = 2(A + B + X)D† X + (A)D† B − 2X + (A + 2X)D† B = 2{AD† B + 2AD† X + +D† B + XD† X}. So, the two sides are equal. Lemma 9.4.5. If A, B and X be nnd matrices of the same order and H = P(A + X, B + X) − P(A, B) − P(X, X), then H is hermitian and the matrix 2H = AC† (P(C, 2X))C† A + BC† (P(C, 2X))C† B − 21 P(C, 2X), 2 2 where C = A + B. Furthermore, kHk ≤ 2(kC† Ak + kC† Bk )kXk. Proof. First notice that by Lemma 9.4.4, 2H = [P(A, B + 2X) − P(A, B)] + [P(B, A + 2X) − P(B, A)] + 2[P(X, A + B + X) − P(X, X)] − P(A + B, 2X). Consider 2[P(X, A + B + X) − P(X, X)] − P(A + B, 2X) = 2[P(X, C + X) − P(X, X)] − P(C, 2X) = 2[(C + X)(C + 2X)† X] − 2X(2X)† X − 2C(C + 2X)† X = 2X(C + 2X)† X − X = −C(C + 2X)† X = − 12 P(C, 2X), since, X(C + 2X)† (C + 2X) = X. Now, using Theorem 9.4.3, for [P(A, B + 2X) − P(A, B)] and [P(B, A + 2X) − P(B, A)], we have 2H = AC† (P(C, 2X))C† A + BC† (P(C, 2X))C† B − 12 P(C|2X). It follows that AC† (P(C|2X))C† A + BC† (P(C, 2X))C† B − 2H = 12 P(C, 2X), which is an nnd matrix. So, 2H
Parallel Sums
The last statement now follows by Theorem 9.4.3.
267
Theorem 9.4.6. If A, B, X and Y be nnd matrices of the same order, then kP(A + X, B + Y) − P(A, B)k 2 2 2 2 ≤ 2(k(A + B)† Ak + k(A + B)† Bk )kX + Yk + 12 kX + Yk . Proof.
Since A, B, X and Y are all nnd, we have A
268
Matrix Partial Orders, Shorted Operators and Applications
⇔ D(I − D) is a non-singular matrix of rank r ⇔ ρ(D) = r = ρ(A), ρ(I − D) = r = ρ(B).
Theorem 9.4.10. Let A, B be parallel summable matrices of order n × n and any three of ρ(A + B), ρ(A), ρ(B) and ρ(P(A, B)) are equal. Then A + B has index 1 if and only if P(A, B) has index 1. Proof. ‘Only if’ part Let ρ(A + B) = r and (L, R) be a rank factorization of A + B. Then as seen in Theorem 9.4.9, A = LDR, A = L(I − D)R and P(A, B) = LD(I− D)R. Also, (P(A, B))2 = LD(I − D)RLD(I − D)R. So, ρ((P(A, B))2 ) = ρ(LD(I − D)RLD(I − D)R) = ρ(D(I − D)RLD(I − D)). As A + B has index 1, the matrix RL has order r × r and is invertible. By Theorem 9.4.6, ρ(P(A, B)) = ρ(D(I − D)) = r. Therefore, D(I − D) is invertible and so, D(I − D)RLD(I − D) is invertible matrix of rank r showing ρ(P(A, B))2 = ρ(P(A, B)), i.e. P(A, B) has index 1. ‘If’ part Let P(A, B) has index 1. Then from proof of ‘Only if’ part ρ(D(I − D)RLD(I − D)) = ρ(D(I − D)). Since ρ(A) = ρ(B) = r ⇒ ρ(D) = ρ(I − D) = r, both D and I − D are invertible. So, D(I − D) is invertible. Thus, ρ(D(I − D)RLD(I − D)) = r and therefore, is invertible. It follows that RL is invertible. Thus, A + B has index 1. Corollary 9.4.11. In the set up of Theorem 9.4.10, The matrix A + B has the group inverse if and only if the parallel sum P(A, B) has the group inverse. Lemma 9.4.12. Let A be a matrix of index k and (L, R) be a rank factorization of A. Then the Drazin inverse of A is given by AD = L((RL)D )2 R. Proof. As L, R are full rank matrices, Index of A is k ⇔ ρ(Ak+1 ) = ρ(Ak ) ⇔ ρ((LR)k+1 ) = ρ((LR)k ) ⇔ ρ((RL)k ) = ρ((RL)k−1 ) ⇔ index of RL is k −1. Let X = L((RL)D )2 R. It can be easily verified that X satisfies Definition 2.4.23. Theorem 9.4.13. Let A, B be parallel summable matrices of order n × n with index of A + B equal to 1 and ρ(A + B) = r. Let (L, R) be a rank factorization of A + B and D be an r × r matrix such that A = LDR, B + L(I − D)R. If D and RL commute, then (i) Index of P(A, B) is k if and only if index of D(I − D) equals k − 1.
Parallel Sums
269
(ii) If in addition any two of ρ(A), ρ(B) and ρ(P(A, B)) are each equal to r, then the Drazin inverse of P(A, B) is given by (P(A, B))D = LD((I − D)RLD)2 (I − D)R. Proof. (i) Since D and RL commute, we have (P(A, B))k+1 = k L(D(I − D)) RLR. Also, L, R being full rank matrices, we have ρ(P(A, B))k−1 = ρ((D(I − D))k RL). Now, A is of index 1 ⇒ RL is invertible, so, ρ(P(A, B))k+1 = ρ((D(I − D))k ). Hence, (i) follows. (ii) We first note that in this case (LD, (I − D)R) is a rank factorization of P(A, B). Let index of P(A, B) be k. Then by (i) above index of D(I − D) is k − 1. By Lemma 9.4.12, the Drazin inverse of P(A, B) is (P(A, B))D = LD((I − D)RLD)2 (I − D)R.
270
9.5
Matrix Partial Orders, Shorted Operators and Applications
Exercises
(1) Let A, B and C ∈ Fm×n . Show that P(P(A, B), C) = P(A, P(B, C)), whenever all parallel sums involved are defined. (2) Let A, B and C ∈ Fm×n such that (i) CA = AC, CB = BC (ii) A+B and C each have index 1 and (iii) A, B are parallel summable. Prove that AC, BC are parallel summable with P(AC, BC) = P(A, B)C = CP(A, B). (3) Let A, B ∈ Fm×n . Prove that ρ(A + B) = ρ(A) + ρ(B) ⇒ A, B are parallel summable. Is the converse true? (4) Let A, B and C ∈ Cm×n such that (i) A, B are parallel summable with either C(B) ⊆ C(A) or C(Bt ) ⊆ C(At ) (ii) A and C are rangeHermitian (or of index 1) and (iii) CA = AC, CB = BC. Prove that AC, BC are parallel summable with P(AC, BC) = P(A, B)C = CP(A, B). (5) Let A, B ∈ Fm×n be parallel summable and x ∈ F n such that Ax = ax and Bx = bx, where a, b are both non-null scalars. Show that P(A, B)x = P(A, B)x, provided a + b is non-null. (6) Let A, B ∈ Fm×n be parallel summable. Prove that C(B) ⊆ C(A) if and only if C(Bt ) ⊆ C(At ). Conclude that if A, B are parallel summable and either C(B) ⊆ C(A) or C(Bt ) ⊆ C(At ), then A <s B. (7) Let A, B ∈ Cn×n be parallel summable. If any two of A, B and A+B are range-hermitian then so the third. (8) Let A, B ∈ Cn×n be nnd matrices. We can define the matrix of P(A, B) as follows: Since A + B is hermitian we can find a unitary matrix U such that U? (A + B)U is diagonal, say C = [cij ]. Let U? AU = [aij ], U? BU = [bij ] and U? P(A, B)U = [dij ]. Notice that if akk = 0 for some k, then ai,k = 0 for each k. With the convention 00 = 0, and ignoring the unitary matrix i.e. A + B = A, A = [aij ] etc. we can write X X X ai,k bk,j dij = ai,k ck,m + bm,j = ai,k + bk,j m k
k
Prove that (i) tr(P(A, B)) ≤ tr(A)tr(B) and (ii) det(P(A, B)) ≤ det(A) det(B). (9) Let A, B and C ∈ Cn×n be nnd matrices. Prove that L 9P(P(A, B), C) < A + B + C (Hint: Use Lemma 9.3.12). (10) Let A, B, X and Y ∈ Cn×n be nnd matrices. Let a1 and a2 be nonnegative real numbers such that a1 + a2 = 1. Then prove that P(a1 A1 + a2 A2 , a1 B1 + a2 B2 ) = a1 P(A1 , B1 ) + a2 P(A2 , B2 ).
Parallel Sums
271
(11) Let A, B ∈ Cm×n . Prove that A, B are parallel summable if and only if AA† (A + B)(A + B)† , A† A (A + B)† (A + B) if and only if BB† (A + B)(A + B)† , B† B (A + B)† (A + B). (12) Let A, B be parallel summable matrices such that A <− B. Prove that A <− 2P(A, B). (13) Let A, B be parallel summable matrices of index not greater than 1 such that (i) A + B is of index 1 and (ii) (A + B)# = A# +B# . Show that A# and B# are parallel summable.
Chapter 10
Schur Complements and Shorted Operators
10.1
Introduction
Schur complements are an interesting class of operators having a wide range of applications. For example, these operators play an important role in determining the shorted operator of the impedance matrix of an n-port electrical network some of whose ports have been shorted. These operators also play an important role in multivariate statistics. In this chapter, we shall define this class of operators and study its properties. We start with providing some motivation for the shorted operators via electrical network theory and statistics in Section 10.2. We formally define the concept of Schur complement as the shorted operator of an nnd matrix in Section 10.3 and identify this as the supremum of a certain class of nnd operators under the L¨owner order. In Section 10.4, we obtain the shorted operator/Schur complement of an nnd matrix as the limit of a sequence of parallel sums (which provides a base for obtaining the shorted matrix of a given matrix with respect to another matrix as the limit of a sequence of parallel sums in the next chapter). In Section 10.5, we extend the concept of Schur complements/shorted operator to general matrices (possibly rectangular) over a general field. We define the complementability of a matrix with respect to a pair of projectors or a pair of subspaces and give a necessary and sufficient condition for complementability of a matrix. We also show that the Schur complement is the supremum of a certain class of matrices under the minus order. Moreover, we show that Schur compression of a matrix with respect to a pair of projectors is indeed the Schur complement of the matrix with respect to a suitable pair of projectors. This chapter in fact, serves as a prelude to a more detailed study of the shorted operator in the next chapter.
273
274
Matrix Partial Orders, Shorted Operators and Applications
In Sections 10.1–10.4 we consider matrices over the field of complex numbers. In Section 10.5 we consider matrices over a general field.
10.2
Shorted operator - A motivation
Electrical networks are a great source of motivating the shorted operator. We use n-port electrical networks of specialized nature for this purpose. We consider current through any terminal of a port as an input and the resulting potential (voltage) as output. We assume that the network has the properties of homogeneity and superposition. By homogeneity, we mean that if we multiply the inputs by some factor, the outputs get multiplied by the same factor. (Thus, the scale of measurement does not affect the relationship.) A network has the property of superposition if the output of the sum of several inputs is the sum of the outputs of each individual input. Further, we consider the networks that are resistive and reciprocal in nature. A network is called resistive if it is composed of only resistors and is reciprocal if the relationship between inputs and outputs is unchanged when input and output terminals are interchanged. Throughout this section, we consider the networks with these properties only i.e., they are resistive and reciprocal in nature and have the properties of homogeneity and superposition and no further mention will be made of this when dealing with a network. Consider an n-port network. Let vj be the potential across the port j and let ij be the current through the terminals of the j th port. The vectors i1 v1 .. .. i = . and v = . are respectively called the current and voltin vn age vectors to the network. Thus, i is the input vector and v is the output vector. Since the network is homogeneous and has the property of superposition, vr is a linear combination of i1 , . . . , in for r = 1, . . . n. Thus, we have Pn vr = j=1 zrj ij , for r = 1, . . . n or equivalently v = Zi, where Z = zij is a matrix of order n × n. The matrix Z is called the impedance matrix. The elements zij can be interpreted as follows: If we apply a current of strength ij across a terminal in port j leaving the other ports open and measure the voltage across the port r, then vr vr = zrj ij or zrj = . Since the network is reciprocal and resistive, the ij impedance matrix is an nnd matrix.
Schur Complements and Shorted Operators
275
Let us take an n-port network N with impedance matrix Z. Suppose we short out the last n − r ports to obtain a new n-port network N∗ with impedance matrix Z∗ . We assume that this new network has all the properties of the original network. Then the voltage across the last n − r ports of the network N∗ is zero irrespective of inputs and therefore, the last n − r rows of impedance matrix Z∗ are null. Also, since the network is assumed to be reciprocal, the last n − r elements of each of first r rows are also null. Z∗11 0 Thus, we can write the impedance matrix Z∗ = , where Z∗11 is 0 0 an nnd matrix of order r × r. We obtain the relation between Z∗11 and Z in the following theorem: Theorem10.2.1. Let N be an n-port network with impedance matrix Z. Z11 Z12 Let Z = , where Z11 is of order r × r and Z22 is of order Z21 Z22 Z∗11 0 (n − r) × (n − r). Let Z∗11 = (Z11 − Z12 Z− Z ). Then Z = ∗ 22 21 0 0 is the impedance matrix of the network N∗ obtained from N by shorting the last n − r ports. i1 v1 Proof. Write i = and v = , where i1 and v1 are r-vectors. i2 v2 Since v = Zi, we have v1 = Z11 i1 + Z12 i2 and v2 = Z21 i1 + Z22 i2 . Since the last n − r ports are shorted, we have v2 = 0. Hence, we must determine the matrix Z∗11 such that v1 = Z∗11 i1 , given that v2 = 0. Now, v2 = 0 ⇒ Z22 i2 = −Z21 i1 . Since Z is nnd, we have C(Z21 ) ⊆ C(Z22 ). − Therefore, i2 = −Z− 22 Z21 i1 + (I − Z22 Z22 )η for some η and arbitrary g− inverse Z22 of Z22 . Thus, − v1 = Z11 i1 + Z12 i2 = Z11 i1 + Z12 (−Z− 22 Z21 ii + (I − Z22 Z22 )η).
Since Z12 (I − Z− 22 Z22 ) = 0, we have − v1 = Z11 i1 − (Z12 Z− 22 Z21 )ii = (Z11 − Z12 Z22 Z21 )ii .
Thus, Z∗11 = Z11 − Z12 Z− 22 Z21 .
Remark 10.2.2. The matrix Z∗ is referred to as the shorted operator of Z corresponding to the shorting of last n − r ports of the network N. Remark 10.2.3. It is easy to check that 0
276
Matrix Partial Orders, Shorted Operators and Applications
Remark 10.2.4. Notice that the matrix Z∗11 is a Schur complement of Z22 in Z. Let e1 , . . . , en be the standard basis of Cn , where for each j = 1, . . . , n the vector e?j is defined as e?j = (0, . . . 0, 1, 0 . . . , 0) with 1 at the j th position. Then v = v1 e1 + . . . + vn en and i = i1 e1 + . . . + in en . When we short the last n − r ports of the network N, the coefficients of er+1 , . . . , en become zero. Thus, shorting the last n − r ports can be viewed as shorting with respect to the subspace S, spanned by er+1 , . . . , en and Z∗11 as the shorted operator of Z with respect this subspace S. Yet another motivation for shorted operator comes from multivariate statistics. Let Y follow distribution with dispersion matrix ann-variate normal Y1 Σ11 Σ12 Σ. Write Y = and Σ = , where Y1 has r components Y2 Σ21 Σ22 and Σ11 and Σ22 are of orders r × r and (n − r) × (n − r) respectively. Then it is well known that the conditional dispersion matrix of Y1 when Y2 is fixed is given as: Σ11 − Σ12 Σ− 22 Σ21 , which is the Schur complement of Σ22 in Σ. Also, the dispersion matrix of (Y 1 | Y2 = c) −together with Σ11 − Σ12 Σ22 Σ21 0 Y2 = c is the shorted operator of Σ, namely . 0 0
10.3
Generalized Schur complement and shorted operator
A11 A12 Let A = be the impedance matrix of a reciprocal, resisA21 A22 tive electrical network with n-ports, where A11 is an r × r matrix and A22 is an (n − r) × (n − r) matrix. In the previous section, we saw that if the last n−− r ports are shorted, the resulting impedance matrix A11 − A12 A22 A21 0 is . This can be viewed as shorting the network 0 0 with respect to the subspace S = C(e1 : · · · : er ). This is so because we are shorting the last n−r ports. In this chapter, we consider shorting of a given matrix A with respect to a general subspace S and obtain the expression for the impedance matrix, which we call as the shorted operator AS . We emphasize that the subspace S relates to unshorted ports. Let A be an nnd matrix of order n × n. Let S be a subspace of Cn with an ortho-normal basis {s1 , s2 , . . . , sr }. Extend s1 , s2 , . . . , sr to an orthonormal basis s1 , s2 , . . . , sn of Cn . Then sr+1 , sr+2 , . . . , sn forms a basis of
Schur Complements and Shorted Operators
277
S ⊥ , the orthogonal complement of S. Let S = S1 : S2 , where S1 = s1 : · · · : sr and S2 = sr+1 : · · · : sn . Then S is a unitary matrix and the representation respect to the ortho-normal basis s1 , s2 , . . . , sn ? of A with ? S AS S AS 1 2 1 1 is S? AS = . In this case we call the matrix S?2 AS1 S?2 AS2 ? S1 AS1 − S?1 AS2 (S?2 AS2 )− S?2 AS1 0 0 0 as the Generalized Schur Complement of the matrix S?2 AS2 in the matrix S? AS. (Notice that C(S?2 AS1 ) ⊆ C(S?2 A) = C(S?2 AS2 ). Hence, the quantity S?1 AS2 (S?2 AS2 )− S?2 AS1 is invariant under the choices of g-inverses of S?2 AS2 .) The generalized Schur complement with respect to the standard basis has the representation ? S1 AS1 − S?1 AS2 (S?2 AS2 )− S?2 AS1 0 S S? (10.3.1) 0 0 or equivalently, S1 S?1 AS1 S?1 − S1 S?1 AS2 (S?2 AS2 )− S?2 AS1 S?1 .
(10.3.2)
Definition 10.3.1. Let A be an n × n nnd matrix and let S be a subspace of Cn . Let S = S1 : S2 be a unitary matrix where C(S1 ) = S. Then the shorted operator AS of A with respect to S is defined by (10.3.1) or equivalently by (10.3.2). We now show that the shorted operator AS depends only on S and not on a particular basis of S and the orthogonal complement S ⊥ of S chosen the for representation of A. We will use the machinery and notations developed above in this entire section without making a repeated mention of the same. Theorem 10.3.2. Let A be an n × n nnd matrix and let S be a subspace of Cn . Then C(AS ) ⊆ S and AS = A − APS ⊥ (PS ⊥ APS ⊥ )− PS ⊥ A where PS and PS ⊥ are orthogonal projectors onto S and S ⊥ respectively.Thus, AS depends on the subspace S and not on a particular basis of S and its orthogonal complement S ⊥ chosen for representing AS . Proof. Consider the form (10.3.2) for AS . As noted earlier the quantity S?1 AS2 (S?2 AS2 )− S?2 AS1 ia always invariant under the choices of g-inverse of S?2 AS2 . Since ρ(S2 S?2 AS2 S?2 ) = ρ(S?2 AS2 ), we have
278
Matrix Partial Orders, Shorted Operators and Applications
S?2 (S2 S?2 AS2 S?2 )− S2 is a g-inverse of S?2 AS2 for each choice of g-inverse (S2 S?2 AS2 S?2 )− of S2 S?2 AS2 S?2 . So, we can write (10.3.2) as AS = S1 S?1 AS1 S?1 − S1 S?1 AS2 S?2 (S2 S∗2 AS2 S∗2 )− S2 S∗2 AS2 S?1 = PS APS − PS APS ⊥ (PS ⊥ APS ⊥ )− PS ⊥ APS = PS (A − APS ⊥ (PS ⊥ APS ⊥ )− PS ⊥ A)PS . So, C(AS ) ⊆ S. Notice that PS ⊥ (PS ⊥ APS ⊥ )− is a g-inverse of PS ⊥ A and therefore, PS ⊥ (A − APS ⊥ (PS ⊥ APS ⊥ )− PS ⊥ A) = 0. This further gives that C(A − APS ⊥ (PS ⊥ APS ⊥ )− PS ⊥ A) ⊆ S. Hence, AS = A − APS ⊥ (PS ⊥ APS ⊥ )− PS ⊥ A.
Corollary 10.3.3. Let Q be a matrix such that C(Q) = S ⊥ . Then AS = A − AQ(Q? AQ)− Q? A, where (Q? AQ)− is any g-inverse of Q? AQ. Proof. Clearly, PS ⊥ = Q(Q? Q)− Q? . Since ρ(Q? AQ) = ρ(PS ⊥ APS ⊥ ), the matrix (Q? Q)− Q? (Q(Q? Q)− Q? AQ(Q? Q)− Q? )− Q(Q? Q)− is a ginverse of Q? AQ. So, ⊥ − ? − ? AS = A − APS ⊥ (P⊥ S APS ) PS ⊥ A = A − AQ(Q AQ) Q A
for some g-inverse (Q? AQ)− of Q? AQ. Notice that AQ(Q? AQ)− Q? A is invariant under choices of g-inverse of Q? AQ, since C(Q? A) = C(Q? AQ). Now, the result follows. A AQ ˜ = ˜ is a 2n × 2n Remark 10.3.4. Write A . Then, A ? Q A Q? AQ matrix. It is easy to see that AS is the usual Schur complement of Q? AQ ˜ in A. PS APS PS APS ⊥ ˜ In the setup of Theorem 10.3.2, write A = . PS ⊥ APS PS ⊥ APS ⊥ Let B = PSAPS , C = PS APS ⊥ , E = PS ⊥ APS and H = PS ⊥ APS ⊥ . ˜ = B C . Further, PS + PS ⊥ = I. Hence, Thus, A E H A = (PS + PS ⊥ )A(PS + PS ⊥ ) = PS APS + PS APS ⊥ + PS ⊥ APS + PS ⊥ APS ⊥ = B + C + E + H.
Schur Complements and Shorted Operators
279
The expression A = B + C + E + H is called the Orthogonal Projective decomposition of A. The matrix AS = B − CH− E is called the Schur Complement of A with respect to the subspace S and the matrix A − AS is called the Schur Compression of A with respect to the subspace S. We now show that the shorted operator AS is the supremum of a particular set of nnd matrices. Theorem 10.3.5. Let A be an nnd matrix of order n × n and let S be a subspace of Cn . Then AS = max{D : 0
By Corollary 8.2.3, it follows that 0
(10.3.3)
is nnd. Thus, 0
280
Matrix Partial Orders, Shorted Operators and Applications
hence, (AS )T = AS∩T . Interchanging the roles of S, and T we get the (AT )S = AS∩T . Corollary 10.3.8. Let A be an nnd matrix of order n × n and let S be a subspace of Cn . Then C(AS ) = C(A) ∩ S. Proof. Let PS be the orthogonal projector onto S. Then A and PS are parallel summable, so, by Theorem 9.2.16, C(P(A, PS )) = C(A) ∩ S and P(A, PS )
S?1 AS1 − S?1 AS2 (S?2 AS2 )− S?2 AS1 0 0 0
= Y.
Schur Complements and Shorted Operators
281
Notice ? that ? S1 AS1 S1 AS2 ρ = ρ(S?2 AS2 ) + ρ(S?1 AS1 − S?1 AS2 (S?2 AS2 )− S?2 AS1 ). S?2 AS1 S?2 AS2 = ρ(X) + ρ(Y). Hence, C(X) ∩ C(Y) ={0}. Now, S = C(S1 ), so any element in S is u of the form S1 (u) = S . Consider a vector x ∈ S ∩ C(A − AS ). So, 0 ? u S1 AS2 (S?2 AS2 )− S?2 AS1 S?1 AS2 v x = S1 (u) = S =S S? for 0 S?2 AS1 S?2 AS2 w some u, v and w. Therefore, u = S?1 AS2 (S?2 AS2 )− S?2 AS1 S? v + S?1 AS2 S? w (10.3.4) and 0 = S?2 AS1 S? v + S?2 AS2 S? w. (10.3.5) ? ? − Pre-multiplying (10.3.5) by S1 AS2 (S2 AS2 ) , we have S?1 AS2 (S?2 AS2 )− S?2 AS1 S? v + S?1 AS2 S? w = 0 Using (10.3.4) we have u = 0. Thus, C(A − AS ) ∩ S = 0. Thus, F = AS satisfies all the properties (i)-(iii). We now show F is unique. Let F be any nnd matrix satisfying (i)-(iii). Clearly, A − F is nnd. So, F
282
Matrix Partial Orders, Shorted Operators and Applications
Theorem 10.3.12. Let A, B be nnd matrices of the same order and let S be a subspace of Cn . Then AS + BS
Corollary 10.3.13. Let A, B be positive definite matrices of the same order n × n and let S be a subspace of Cn . Then AS + BS = (A + B)S if and only if C(A − AS ) = C(B − BS ). Proof.
Since A is positive definite and AS <− A, we have ρ(AS ) + ρ(A − AS ) = n.
If possible, let C(A − AS ) 6= C(A − BS ). As A − AS and B − BS are nnd, we have C(A − AS ) + C(B − BS ) = C(A − AS + B − BS ) and C(A − AS ) is its proper subspace. So, ρ(A − AS + B − BS ) > ρ(A − AS ) = n − ρ(AS ). Further, C(A − AS + B − BS ) ∩ S = 6 {0}. Hence, AS + BS 6= (A + B)S . Converse follows from Theorem 10.3.12. Theorem 10.3.14. Let A be nnd matrix of the order n × n and let S be a subspace of Cn . Then there exist matrices Mr and M` such that (i) (ii) (iii) (iv)
PS ⊥ Mr = Mr M` PS ⊥ = M` PS ⊥ AMr = PS ⊥ A M` APS ⊥ = APS ⊥
and
As a consequence, AMr = M` AMr = M` A. Also, for given matrix A and subspace S, the matrices AMr and M` A are unique and AS = A − AMr . Proof.
Let S1 and S2 be the matrices as in Definition 10.3.1. Then we S?1 AS1 S?1 AS2 ? S . It is easy to see that the matrices can write A = S S?2 AS1 S?2 AS2 0 0 S? Mr = S (S?2 AS2 )− S?2 AS1 (S?2 AS2 )− S?2 AS2 0 S?1 AS2 (S?2 AS2 )− and M` = S S? satisfy (i)-(iv). 0 S?2 AS2 (S?2 AS2 )−
Schur Complements and Shorted Operators
283
Now, AMr = A(PS ⊥ Mr ) = (APS ⊥ )Mr = M` APS ⊥ Mr = M` AMr . Similarly, M` A = M` APS ⊥ Mr = M` AMr . Also, it is easy to check that AMr = A − AS = M` A. Thus, the matrices AMr and M` A do not depend on choice of matrices Mr and M` respectively and so are unique. 10.4
Shorted operator via parallel sums
In this section we obtain the shorted operator of an nnd matrix with respect to a given subspace as the limit of a certain sequence of parallel sums. For this we first give the following: Lemma 10.4.1. Let C be an nnd matrix of order n × n and B any matrix of order m × n such that C(B? ) ⊆ C(C). Then lim→0 B(C + I)−1 B? = BC† B? = BC− B? , where C− is an arbitrary g-inverse of C. Proof.
If C is non-singular, the result trivially. So, let C be follows P?1 singular and C = (P1 : P2 )diag ∆ 0 be a spectral decomposition P?2 of C, where P = (P1 : P2 ) is unitary, ∆ is positive definite diagonal matrix and partitioning is such that P1 ∆ is defined.Clearly, C(C) = C(P1 ). Since, D C(B? ) ⊆ C(C) we can write B? = P1 D = P for some matrix D. Now, 0 −1 ∆ + I 0 D B(C + I)−1 B? = (D? : 0) 0 I 0 = D? (∆ + I)−1 D. Taking limit as → 0, the result follows.
We are now ready to obtain the shorted operator of an nnd matrix as the limit of a sequence of parallel sums. We prove Theorem 10.4.2. Let A be nnd matrix of the order m × m and let S be a subspace of Cm . Let PS be the orthogonal projector onto S. Then AS = limn→∞ P(A, nPS ). ? S1 AS1 S?1 AS2 Proof. As seen in the previous section, A = S S? , S?2 AS1 S?2 AS2 where S = (S1 : S2 ) is unitary and C(S1 ) = S. Also, PS = S1 S?1 =
284
Matrix Partial Orders, Shorted Operators and Applications
? S1 AS1 S?1 AS2 Sdiag I 0 S? . Write A = S S? . Then S?2 AS1 S?2 AS2 + I ? S1 AS1 + nI S?1 AS2 A + nPS = S S? , which is a positive defS?2 AS1 S?2 AS2 + I inite matrixwhenever n > 0 and > 0. Clearly, (A + nPS )−1 =SXS? , (nI + S?1 AS1 − S?1 AS2 (S?2 AS2 + I)−1 S?2 AS1 )−1 • where X = and • • • stand for some suitable matrices not of interest to us. Now, P(A , nPS ) = nPS (A + nPS )−1 A = nPS − nPS ((A + nPS )−1 )nPS nI 0 nI 0 =S X S? 0 0 0 0 Z 0 =S S? , 0 0 where Z = nI − n2 (nI + S?1 AS1 − S?1 AS2 (S?2 AS2 + I)−1 S?2 AS1 )−1 . Let Q = S?1 AS1 −S?1 AS2 (S?2 AS2 +I)−1 S?2 AS1 and N () = kQ k = k(A )S k and n be a positive integer such that n > N (). Then (I + n1 Q ) is nonsingular and (I + n1 Q )−1 = I − n1 Q + n12 Q 2 · · · . Hence, Q − n1 Q 2 · · · 0 P(A , nPS ) = S S? 0 0 1 = (A )S − (A )2S + · · · n 1 1 = (A )S − (A )2S (I + (A )S )−1 . n n By Lemma 10.4.1, AS = lim→0 (A )S . So, lim lim P(A , nPS ) = lim lim ((A )S −
n→∞ →0
n→∞ →0
= AS − lim
n→∞
1 1 (A )2S (I + (A )S )−1 ) n n
1 1 ([A2 S ](I + AS )−1 ) = AS . n n
Corollary 10.4.3. Let A, B be nnd matrices of same the order m × m and let S be a subspace of Cm . Then P(A, B)S = P(AS , B) = P(AS , BS ).
Schur Complements and Shorted Operators
285
Proof. Observe thatP(2nPS , 2nPS ) = nPS . Now, P(P(A, B), nPS ) = P(P(A, nPS ), B) = P(P(A, 2nPS ), P(B, 2nPS )), in view of commutativity and associativity of parallel sums. The result now follows by taking limits.
10.5
Generalized Schur complement and shorted operator of a matrix over general field
In Section 10.3, we defined the shorted operator of an nnd matrix A with respect to a given subspace S, which we denoted by AS . We also saw that it was a Schur complement of A in a suitably constructed matrix. (See Remark 10.3.4). For this purpose we used the orthogonal projectors onto S and the orthogonal complement S ⊥ or equivalently an ortho-normal basis of S and an ortho-normal basis of the orthogonal complement S ⊥ . We showed that given any subspace S, the matrix A has a shorted operator AS indexed by subspace S. For compatibility with notation for shorted operator in general case and for convenience, henceforth we use the notation S(A|S) to denote the shorted operator of the nnd matrix A indexed by S which we have so far denoted as AS . Thus, AS and S(A|S) mean the same. In this section, we consider matrices (possibly rectangular) over an arbitrary field F and define the notions of complementability and shorted operator for a given matrix A of order m × n. In this case we deal with two subspaces Vm and Vn of F m and F n respectively. In the absence of the distinct advantage of orthogonal complements and ortho-normal basis as in case of complex matrices, we choose complements Wm of Vm and Wn of Vn in F m and F n respectively. Using these we construct projectors P` and Pr of orders m × m and n × n onto Vm along Wm and Vn along Wn respectively. Employing conditions similar to those in Theorem 10.3.14 in terms of P` and Pr , we define complementability of a matrix and a shorted operator of a matrix indexed by Vm , Wm , Vn and Wn (or equivalently indexed by P` and Pr ). We then investigate the existence of the shorted operator defined thus. We show that this shorted operator whenever it exists depends only on the subspaces Vm and Vn and does not depend on the choice of complements Wm and Wn . We finally show that shorted operator of a given matrix A indexed by Vm and Vn (dropping the subspaces Wm and Wn in view of the last statement), whenever it exists is a solution
286
Matrix Partial Orders, Shorted Operators and Applications
to a maximization problem similar to the one considered in the Theorem 10.3.11. Let Vm , Vn , Wm , Wn , P` and Pr be as in the preceding para. One nice way to construct P` is as follows: Let P1 and P2be matrices whose columns form a bases of Vm and Wm P1 respectively. Let be the inverse of P1 : P2 . Then P` = P1 P1 . P2 Similarly, let P3 and P4 be matrices whose rows form a bases of Vn and P 3 Wn respectively. Let P3 : P4 be inverse of , then Ptr = P3 P3 . P4 Write Q` = I − P` and Qr = I − Pr . Let A be an m × n matrix. Then A = (P` + Q` )A(Ptr + Qtr ) = P` APtr + P` AQtr + Q` APtr + Q` AQtr . Let B = P` APtr , C = P` AQtr , E = Q` APtr and H = Q` AQtr . Thus, A = B + C + E + H, which is analogous to the orthogonal projective decomposition of Remark 10.3.4. We call this decomposition as projective decomposition of A with respect to the pair (P` , Pr ) of projections. Definition 10.5.1. Let A be an m×n matrix. Let P` and Pr be projectors of order m×m and n×n respectively. Then A is said to be complementable with respect to the pair (P` , Pr ) or equivalently A is said to be (P` , Pr )complementable if there exist matrices M` and Mr of orders m × m and n × n respectively such that (i) (ii) (iii) (iv)
Ptr Mr = Mr M` P` = M` P` AMr = P` A M` APtr = APtr .
and
Remark 10.5.2. Notice the similarity of conditions (i)-(iv) of Definition 10.5.1 and the conditions (i)- (iv) of Theorem 10.3.14. In Theorem 10.3.14, we have Pr = P` = PS ⊥ . Thus, an nnd matrix over C is complementable with respect to the pair (PS ⊥ , PS ⊥ ), where S is an arbitrary subspace of Cn . Analogous to the nnd case, we now define Schur complement and Schur compression in general case. Definition 10.5.3. Let A be complementable with respect to the pair (P` , Pr ) and M` , Mr be matrices satisfying (i)-(iv) of Definition 10.5.1. Then A(P` ,Pr ) = M` AMr = M` A = AMr is called the Schur compression of A with respect to the pair (P` , Pr ) and A/(P` ,Pr ) = A−A(P` ,Pr ) is
Schur Complements and Shorted Operators
287
called the Schur complement of A with respect to the pair (P` , Pr ). We call A/(P` ,Pr ) as the shorted matrix indexed by C(I − P` ) and C(I − Ptr ). Several questions arise. Why is the name ‘Schur complement’ justified in this case? Does it confirm to the usual definition of Schur complement? How relevant is the choice of the complements Wm of Vm and Wn of Vn ? Does the shorted operator defined above have optimality properties similar to those of the shorted operator in the nnd case? Before we address all these questions, we establish the following necessary and sufficient conditions for the complementability of a matrix with respect to a pair of projections. Theorem 10.5.4. Let A be an m × n matrix over F. Let P` and Pr be projectors of order m × m and n × n respectively. Let A = B + C + E + H be projective decomposition of A with respect to the pair (P` , Pr ) as given in opening para of the section (before Definition 10.5.1). Also, let P` = P1 P1 and Ptr = P3 P3 , where P1 and P3 are as constructed in the para before Definition 10.5.1. Then the following are equivalent: (i) (ii) (iii) (iv)
A is (P` , Pr )-complementable C(Et ) ⊆ C(Bt ), C(C) ⊆ C(B) N (B) ⊆ N (B + E), N (Bt ) ⊆ N (B + C)t t t C(T12 ) ⊆ C(T11 ), C(T 21 ) ⊆ C(T11), T11 T12 P3 where A = (P1 : P2 ) and T21 T22 P4 (v) ρ(P` APtr ) = ρ(P` A) = ρ(APtr ).
Proof. (ii)⇔ (iii) Notice that C(Et ) ⊆ C(Bt ) ⇔ N (B) ⊆ N (E) ⇔ N (B) = N (B + E), since C(B) ∩ C(E) = {0}. Similarly, C(C) ⊆ C(B) ⇔ N (Bt ) ⊆ N (B + C)t . (ii)⇔ (iv) Notice that P1 P3 3 4 . A = P1 : P2 A P : P P2 P4 Thus, T11 = P1 AP3 , T12 = P1 AP4 , T21 = P2 AP3 and T22 = P2 AP4 . Further, B = P1 T11 P3 , C = P1 T12 P4 , E = P2 T21 P3 ,
288
Matrix Partial Orders, Shorted Operators and Applications
and H = P2 T22 P4 . Since, P1 has a left inverse and P3 has a right inverse, C(C) ⊆ C(B) if and only if C(T12 ) ⊆ C(T11 ). Similarly, C(Et ) ⊆ C(BT ) if and only if C(Tt21 ) ⊆ C(Tt11 ). (i) ⇒ (ii) Since A is (P` , Pr )-complementable, there exist matrices M` , Mr satisfying (i)-(iv) of Definition 10.5.1. So, M` B = M` P` APtr = M` APtr = APtr = B + E. Hence, C(Et ) ⊆ C(Bt ). Similarly, BMr = B + C. Thus, C(C) ⊆ C(B). (ii) ⇒ (i) Let (ii) hold. Choose M` = (B + E)B− P` and Mr = Ptr B− (B + C), where B− is any g-inverse of B. Clearly, Ptr Mr = Mr and M` P` = M` . Also, M` APtr = (B + E)B− P` APtr = (B + E)B− B = B + E = APtr . Similarly, we can show that P` AMr = P` A. (ii) ⇒ (v) From the proof of (ii) ⇒ (i), ii follows that there exist matrices D1 and D2 such that D1 P` APtr = APtr and P` APtr D2 = P` A. So, ρ(P` APtr ) = ρ(P` A) = ρ(APtr ). (v) ⇒ (i) Since (v) holds, there exist matrices D1 and D2 such that D1 P` APtr = APtr and P` APtr D2 = P` A. Let M` = D1 P` and Mr = Ptr D2 . Now, it is easy to see that conditions (i)-(iv) of Definition 10.5.1 hold, so (i) holds for this choice of matrices M` and Mr . We next obtain expressions for the Schur compression, Schur complement and and shorted operator. Let any one of equivalent conditions (ii)-(v) of Theorem 10.5.4 hold. Then A is complementable with respect to the pair (P` , Pr ). Let M` and Mr be matrices as chosen in the proof of Theorem 10.5.4 (ii) ⇒ (i). The Schur compression A(P` ,Pr ) = M` AMr = M` A = (B + E)B− P` A = (B + E)B− (B + C) = (B + C) + E + EB− C. Notice that EB− C is invariant under choices of B− in view of (ii) of Theorem 10.5.4. Further, A/(P` ,Pr ) = A − A(P` ,Pr ) = B + C + E + H − (B + C + E + EB− C) = H − EB− C = (I − P` )(A − APtr (P` APtr )P` A)(I − Ptr ) = A − APtr (P` APtr )P` A, as ρ(P` APtr ) = Now, APtr =
ρ(APtr ).
(10.5.1)
ρ(P` A) = (P1 T11 + P2 T21 )P3 , P` A = P1 (T11 P3 + T12 P4 )
Schur Complements and Shorted Operators
289
and P` APtr = P1 T11 P3 . Clearly, ρ(P` APtr ) = ρ(T11 ). Hence, P3 (P1 T11 P3 )− P1 is a g-inverse of T11 . It can be easily seen that A/(P` ,Pr ) = A − APtr (P` APtr )P` A = P2 (T22 − T21 T− 11 T12 )P4 . Thus, the shorted operator of A indexed by Wm and Wn written as S(A|Wm , Wn , Vm , Vn ) is given by: S(A|Wm , Wn , Vm , Vn ) = P2 (T22 − T21 T− 11 T12 )P4 .
(10.5.2)
Accordingly S(A|Vm , Vn , Wm , Wn ) = P1 (T11 − T12 T− 22 T21 )P3 .
(10.5.3)
So, the Schur compression of A with respect to the pair (P` , Pr ) is A(P` ,Pr ) = A − A/(P` ,Pr ) = APtr (P` APtr )P` A.
(10.5.4)
We have taken a specific choice of M` and Mr , namely, M` = (B + E)B− P` and Mr = Ptr B− (B + C), while arriving at the expressions (10.5.1)-(10.5.4). To show these expressions are independent of any particular choice of M` and Mr , we first obtain the class of all M` and Mr , satisfying the conditions (i)-(iv) of Definition 10.5.1 under the assumption that A is (P` , Pr )-complementable. Assume that A is (P` , Pr )-complementable. Then ρ(P` APtr ) = ρ(P` A) = ρ(APtr ). Since, M` P` = M` , C(Mt` ) ⊆ C(Pt` ), we have, M` = D1 P` for some D1 . Also, M` APtr = APtr , so, we have, D1 P` APtr = APtr . This equation in D1 is evidently consistent, so, D1 = APtr (P` APtr )− + Z(I − (P` APtr )(P` APtr )− ), where Z is arbitrary. Hence, the class of all M` is given by: M` = APtr (P` APtr )− P` + Z1 (I − (P` APtr )(P` APtr )− )P` ,
(10.5.5)
where Z1 is arbitrary. Similarly, the class of all Mr is given as: Mr = Ptr (P` APtr )− P` A + (I − (P` APtr )− (P` APtr ))W,
(10.5.6)
where W is arbitrary. Consider M` AMr = APtr (P` APtr )− P` AMr + Z1 (I − (P` APtr )(P` APtr )− )P` AMr = APtr (P` APtr )− P` AMr ,
290
Matrix Partial Orders, Shorted Operators and Applications
since ρ(P` APtr ) = ρ(P` A). So, M` AMr = APtr (P` APtr )− P` A, which is independent of M` and Mr . Thus, the expressions (10.5.1)-(10.5.4) are independent of choices of M` and Mr . Recall that P` = P1 P1 and Ptr = P3 P3 . Hence the Schur compression is A(P` ,Pr ) = APtr (P` APtr )− P` A = AP3 P3 (P1 P1 AP3 P3 )− P1 P1 A. Since, ρ(P` APtr ) = ρ(P1 P1 AP3 P3 ) = ρ(P1 AP3 ), we have A(P` ,Pr ) = AP3 (P1 AP3 )− P1 A. Clearly, AP3 (P1 AP3 )− P1 A is invariant under choices of g-inverses of P1 AP3 . Also note that C(P1 ) = N (P2 ) and C(P3t ) = N (Pt4 ). Choice of matrices P2 and P4 is arbitrary subject to the conditions: columns of P2 form a basis of Wm and rows of P4 form a basis of Wn . Hence, the Schur compression and Schur complement can be obtained as follows: Let columns of P2 and rows of P4 form a basis of Wm and Wn respectively. Let P1 and P3 be matrices such that C(P1 ) and C(P3t ) form the null spaces of P2 and Pt4 respectively. Then the Schur compression of A with respect to the pair (P` , Pr ) is given as: A(P` ,Pr ) = AP3 (P1 AP3 )− P1 A,
(10.5.7)
and Schur complement of A with respect to the pair (P` , Pr ) is given as: A/(P` ,Pr ) = A − AP3 (P1 AP3 )− P1 A.
(10.5.8)
A careful reader will notice that we have arrived at (10.5.7) and (10.5.8) when the columns of P1 and rows of P3 form bases of the null spaces of P2 and Pt4 respectively. However, both these hold even when the columns of P1 and rows of P3are not necessarily linearly independent. 3 A AP ˜ = Let us write A . Then A/(P` ,Pr ) is the Schur comP1 A P1 AP3 ˜ answering the first question before Theorem 10.5.4. plement of A in A, Now, let the shorted operator S(A|Vm , Vn , Wm , Wn ) exist and Um , Un be complements of Vm and Vn respectively (possibly different from Wm and Wn ). We show that, if S(A|Vm , Vn , Wm , Wn ) exists, then S(A|Vm , Vn , Um , Un ) also exists and the two are equal. Notice that every basis of Um is the columns of a matrix of the form P1 L1 + P2 L2 for some matrices L1 and L2 such that L2 is non-singular. Similarly, every basis of Un is the columns of a matrix of the form Pt3 L3 + Pt4 L4 for some matrices L3 and L4 such that L4 is non-singular. Since S(A|Vm , Vn , Wm , Wn ) exists, we have T11 T12 P3 A = (P1 : P2 ) , T21 T22 P4
Schur Complements and Shorted Operators
291
where C(T21 ) ⊆ C(T22 ), C(Tt12 ) ⊆ C(Tt22 ). We can write T11 T12 P3 A = (P1 : P2 ) T21 T22 P4 P3 = (P1 : P2 )X , P4 P3 = (P1 : P1 L1 + P2 L2 )Y L3 P3 + L4 P4 where X=
I L1 0 L2
I −L1 L−1 2 0 L−1 2
T11 T12 I 0 I 0 , −1 T21 T22 −L−1 L3 L4 4 L3 L4
−1 −1 K T12 L−1 4 − L1 L2 T22 L4 −1 −1 −1 L−1 L−1 2 T21 − L2 T22 L4 L3 2 T22 L4
Y=
−1 −1 −1 and K = T11 − L1 L−1 2 T21 − T12 L4 L3 + L1 L2 T22 L4 L3 . Since C(T21 ) ⊆ C(T22 ), we have −1 −1 −1 −1 C(L−1 2 T21 − L2 T22 L4 L3 ) ⊆ C(L2 T22 L4 ).
Also, since C(Tt12 ) ⊆ C(Tt22 ), we have −1 −1 t −1 −1 t C(T12 L−1 4 − L1 L2 T22 L4 ) ⊆ C(L2 T22 L4 ) .
Hence, S(A|Vm , Vn , Um , Un ) exists by Theorem 10.5.4(iv). It is computational to show that −1 −1 −1 −1 T11 − T12 T− 22 T21 = T11 − L1 L2 T21 − T12 L4 L3 + L1 L2 T22 L4 L3 − −1 −1 −1 −1 − −1 −1 (T12 L−1 4 − L1 L2 T22 L4 )(L2 T22 L4 ) (L2 T22 L4 ). Thus, S(A|Vm , Vn , Um , Un ) = S(A|Vm , Vn , Wm , Wn ). As the choice of Um , Un is arbitrary, it follows that the shorted operator does not depend on the choice of the complements Wm and Wn of Vm and Vn respectively. Henceforth, we shall write S(A|Vm , Vn , Wm , Wn ) simply as S(A|Vm , Vn ). Also, S(A|Vm , Vn ) when it exists is given as: P1 (T11 − T12 T− 22 T21 )P3 .
(10.5.9)
We now show that the shorted operator S(A|Vm , Vn ) when exists is also the solution to the maximization problem analogous to the one considered in Theorem 10.3.11. Theorem 10.5.5. Let A be an m × n matrix and let Vm and Vn be subspaces of F m and F n respectively such that S(A|Vm , Vn ) exists. Then S(A|Vm , Vn ) = max{D : D <− A, C(D) ⊆ Vm , C(Dt ) ⊆ Vn }.
292
Matrix Partial Orders, Shorted Operators and Applications
Proof. Since S(A|Vm , Vn ) = P1 (T11 − T12 T− 22 T21 )P3 , it follows that t C(S(A|Vm , Vn ))⊆ Vm andC(S(A|V , V )) ⊆ V . Also m n n T11 T12 P3 A = (P1 : P2 ) and we can write T21 T22 P4 S(A|Vm , Vn ) = P1 (T11 − T12 T− 22 T21 )P3 T11 − T12 T− P3 22 T21 0 = (P1 : P2 ) . 0 0 P4 T11 T12 T11 − T12 T− T12 T− 22 T21 0 22 T21 T12 Now, − = and T21 T22 0 0 T21 T22 T11 T12 ρ = ρ(T11 − T12 T− 22 T21 ) + ρ(T22 ) T21 T22 T11 − T12 T− T12 T− 22 T21 0 22 T21 T12 =ρ +ρ . 0 0 T21 T22 Hence,
T11 − T12 T− T11 T12 22 T21 0 <− 0 0 T21 T22
and thus, S(A|Vm , Vn ) <− A. Let D be an arbitrary matrix such that D <− A, C(D) ⊆ Vm and D 0 P 1 3 C(Dt ) ⊆ Vn . Then we can write D = (P1 : P2 ) for some 0 0 P4 D1 0 T11 T12 matrix D1 . Since D <− A, we have <− and there0 0 T21 T22 fore, D1 <− T11 − T12 T− 22 T21 . So, D = P1 D1 P3 <− P1 (T11 − T12 T− 22 T21 )P3 = S(A|Vm , Vn ). Hence, the result follows.
Corollary 10.5.6. Let A and B be matrices such that S(A|Vm , Vn ) and S(B|Vm , Vn ) exist for some subspaces Vm and Vn of F m and F n respectively. If A <− B then S(A|Vm , Vn ) <− S(B|Vm , Vn ).
Schur Complements and Shorted Operators
293
10.6
Exercises A11 A12 A11 0 (1) Let A = be an nnd matrix. Define AS = . A21 A22 0 0 Prove the following: S
(i) A†S = A† . S (ii) P(A, B)
(2) (3)
(4)
(5)
(6)
The matrices AS and AS can be thought as dual to each other and corresponds to the the fact that theorems about electrical networks have dual when one interchanges the current and the voltage variables and replaces the resistance with conductance. Let A be an nnd matrix of order n × n and S and T be subspaces of Cn . If S ⊆ T , then show that AS
A is S-complementable for each subspace S of Cn A is S-complementable for each 1-dimensional subspace S of Cn for any x ∈ Cn , x? Ax = 0 ⇒ Ax = 0 and for any x ∈ Cn , x? Ax = 0 ⇒ Ax = 0 and x? Ax = 0. A11 A12 (7) Let A = be an m × n matrix over a field F, with A11 as a A21 A22 square matrix of order r ×r. Let P` and Pr be projector of order m×m and n × n respectively. Then prove that A is P` , Pr -complementable if and only if C(A12 ) ⊆ C(A11 ) and C(At21 ) ⊆ C(At11 ). Also, find the matrices M` and Mr . (i) (ii) (iii) (iv)
294
Matrix Partial Orders, Shorted Operators and Applications
(8) Let A, B, C and D be nnd matrices such that A
Chapter 11
Shorted Operators - Other Approaches
11.1
Introduction
Motivated by electrical network theory and statistics, we introduced in the previous chapter the concept of shorted operator and studied some of its properties. We saw that the shorted matrix we defined is also a suitable Schur complement. Moreover, the shorted operator S(A|Vm , Vn ) of a given matrix A indexed by the given subspaces Vm and Vn , if exists, is the closest to A in the class of matrices M = {D : D <− A, C(D) ⊆ Vm , C(Dt ) ⊆ Vn } in the sense that S(A|Vm , Vn ) ∈ M, and D <− S(A|Vm , Vn ) for each D ∈ M. We also saw that the shorted operator of an nnd matrix indexed by a given subspace is the limit of a sequence of parallel sums of suitable matrices. In the present chapter, we give some more equivalent definitions for the shorted operator and alongside develop methods to compute this operator. In Section 11.2, we extend the concept of shorted operator as a limit of a sequence of parallel sums to rectangular matrices over the field of complex numbers. We investigate the conditions under which the shorted operator thus defined exists. We obtain a formula for computing the shorted operator when it exists. We also establish several of its interesting properties which include the closeness property mentioned above. Another closeness property of the shorted operator to the given matrix is in terms of rank. More precisely, we consider the class M as in the preceding paragraph. The shorted operator S(A|Vm , Vn ), when it exists is the unique matrix such that S(A|Vm , Vn ) ∈ M and for each matrix D ∈ M, ρ(A − S(A|Vm , Vn )) ≤ ρ(A − D). In Section 11.3, we consider matrices over a general field. We define 295
296
Matrix Partial Orders, Shorted Operators and Applications
the shorted operator using the closeness property in terms of rank and investigate the conditions under which this operator exists. We show the equivalence of the various definitions given in this and the preceding chapter. Section 11.4 contains one more method to compute the shorted operator. We also compare different computational methods in terms of the labour involved for computing them. No comparisons are made in terms of error analysis.
11.2
Shorted operator as the limit of parallel sums - General matrices
In this section, we consider the matrices over C, the field of complex numbers. In Section 10.4, we obtained the shorted operator S(A|S) of nnd matrix A of order m × m with respect to a subspace S of Cm as the limit of a sequence of parallel sums, limn→∞ P(A, nPS ), where PS is the orthogonal projector onto S. With obvious modifications in Lemma 10.4.1 and Theorem 10.4.2, it can be shown that S(A|S) = limn→∞ P(A, nB), where B is an arbitrary nnd matrix such that C(B) = S. It is easy to see that 1 S(A|S) = lim P(λA, B). Thus, S(A|S) may be called the shorted operλ λ→0 ator of A shorted by B and is denoted by S(A|B). As seen in Chapter 10, in the case of nnd matrices, as seen already, the shorted operator S(A|B) always exists. In this section, we extend this concept of shorted operator to arbitrary matrices (possibly rectangular) by defining it via the limit of suitable sequence of parallel sums. We explore the conditions under which S(A|B) exists for arbitrary matrices A and B of the same order. We also obtain several properties of this operator whenever it exists. We begin by proving a lemma that will be used in the sequel. Lemma 11.2.1. Let A, B be matrices of the same order m × n such that A and αB are parallel summable for some complex number α(6= 0). Then there exists a complex number θ0 such that for every complex number θ such that | θ | ≥ | θ0 |, A and θB are parallel summable. Proof. Since A and αB are parallel summable, A <s A + αB. Let ρ(A + αB) = r. By Remark 3.2.7, there exist non-singular matrices P, Q and an r × r matrix T such that A = Pdiag T, 0 Q and A + αB = Pdiag Ir , 0 Q.
Shorted Operators - Other Approaches
297
1 Thus, αB = Pdiag Ir − T, 0 Q or B = Pdiag (Ir − T), 0 Q. α θ Consider A + θB = Pdiag T + (Ir − T), 0 Q. If each eigen-value of α θ T is 1, then it is easy to see that (T + (Ir − T)) is non-singular for all α θ. Suppose at least one eigen-value of T is different from 1. Then we can θ θ λi θ |+1}, rewrite T + (Ir − T) = Ir +(1− )T. Let θ0 =| α | maxi {| α α α 1 − λi where λ0i s are the eigen-values of T not equal to 1. It is easy to see that T + θ θ (Ir −T) is non-singular whenever | θ | ≥ | θ0 |. So, T <s T + (Ir − T) α α and therefore, A <s A + θB for each θ such that | θ | ≥ | θ0 | . Corollary 11.2.2. Let A, B be matrices of the same order such that A and αB are parallel summable for some complex number α(6= 0). Then there exists a positive integer m such that A and mB are parallel summable. Remark 11.2.3. If A, B are matrices of the same order such that A and αB are parallel summable for some complex number α(6= 0), then there exists a complex number γ0 such that for every complex number γ such that | γ | ≤ | γ0 |, γA and B are parallel summable. We now define the shorted operator. Definition 11.2.4. Let A, B be matrices of the same order m × n such that A and αB are parallel summable for some complex number α. If the limit lim A(λA + B)− B
λ→0
exists and is finite, then the limit is called the shorted operator of A shorted by B and is denoted by S(A|B). 1 P(λA, B) and if A or B λ is a null matrix, then so is S(A|B). Henceforth we consider only non-null matrices A and B. Clearly, limλ→0 A(λA + B)− B = limλ→0
Remark 11.2.5. In view of Corollary 11.2.2 and Remark 11.2.3, we notice that S(A|B), whenever it exists, is given by 1 S(A|B) = lim A( A + B)− B. n→∞ n
298
Matrix Partial Orders, Shorted Operators and Applications
We now give some elementary properties of the shorted operator when it exists. These properties will be used in exploring the conditions under which the shorted operator exists. For the sake of convenience we shall write S = S(A|B). Theorem 11.2.6. Let A and B be matrices of the same order for which S exists. Then the following hold: (i) (ii) (iii) (iv)
C(S) ⊆ C(A) ∩ C(B) and C(S? ) ⊆ C(A? ) ∩ C(B? ). C(A − S) ∩ C(B) = {0} and C(A − S)? ∩ C(B? ) = {0}. S <− A. C(S) = C(A) ∩ C(B) and C(S? ) = C(A? ) ∩ C(B? ). Thus, S <s B.
A (i) Let u ∈ C(S). Then u = Sy = lims→∞ ( + B)− By for some s A − vector y. For each s, A( + B) By ∈ C(A). Since C(A) is a closed subs A A m space of C and for each s, A( + B)− B = B( + B)− A, so, u ∈ C(A) s s and u ∈ C(B). Hence, S ⊆ C(A) ∩ C(B). Similarly we can prove that C(S? ) ⊆ C(A? ) ∩ C(B? ). (ii) Let u ∈ C(A − S) ∩ C(B). Then u = (A − S)v = Bw for for some vectors v and w. So, Proof.
A + B)− Av s A = lims→∞ B( + B)− (Bw + Sv). s
Av − Bw = Sv = lims→∞ B(
A Since C(S) ⊆ C(A) and lims→∞ B( + B)− A exists, it follows that s A A lims→∞ B( + B)− S exists. So, lims→∞ B( + B)− Bw exists and is s s A − equal to Sv − lims→∞ B( + B) Sv. Now, s S − B(
A A A A + B)− S = S − ( + B − )( + B)− S s s s s A A = S − S + ( + B)− S, s s
since C(S) ⊆ C(B) ⊆ C( lims→∞ B(
A + B). Hence, s
A A A + B)− Bw = lims→∞ ( + B)− Sv = 0. s s s
Shorted Operators - Other Approaches
299
A A A + B)− Bw ( + B)− Bw, so, = Bw − s s s A lims→∞ B( +B)− Bw = Bw. Therefore, Bw = 0 i.e., v = 0, proving s C(A − S) ∩ C(B) = {0}. Similarly, C(A − S)? ∩ C(B)? = {0}. (iii) Since C(S) ⊆ C(B) by (i) and C(A − S) ∩ C(B) = {0} by (ii), we have C(A − S) ∩ C(S) = {0}. Similarly, C(A − S)? ∩ C(S? ) = {0}. Hence, S <− A. (iv) By (i), we have C(S) ⊆ C(A) ∩ C(B) and C(S? ) ⊆ C(A? ) ∩ C(B? ). We now show that the reverse inclusions hold. Let x ∈ C(A) ∩ C(B). By (iii), C(A) = C(A − S) ⊕ C(S), so, we can write x = x1 + x2 , where x1 ∈ C(S) and x2 ∈ C(A − S). Now, C(S) ⊆ C(B), so, x − x1 = x2 ∈ C(B). By (ii), x2 = 0. Thus, x = x1 ∈ C(S). Similarly, C(A? ) ∩ C(B? ) ⊆ C(S? ). Hence, the result follows. However,
B(
Corollary 11.2.7. Let A, B be matrices of the same order. Then the following hold: (i) If ρ(A + B) = ρ(A) + ρ(B), then S = 0. (ii) If B is non-singular, then S = A. (iii) If A and B are range-hermitian, then so is S. Remark 11.2.8. In Theorem 11.2.6 the statement (i) is redundant in view of statement (iv). However, we need (i) to prove statements (ii) and (iii) which in turn are required to prove (iv). Notice that (i) implies that S <s A and S <s B. Also, if S0 is a matrix of the same order as A such that S0 <s A and S0 <s B, then S0 <s S. Remark 11.2.9. Let A, B be matrices of the same order such that S = S(A|B) exists. Then S(A|cB) exists for each non-null complex number c and is also equal to S. So, if S = S(A|B) exists, we can assume that A and B are parallel summable. Remark 11.2.10. Let A, B be matrices of the same order for which S(A|B) exists and L and R be matrices such that LAR is defined. Let L have a left inverse and R have a right inverse. Then S(LAR|LBR) exists and is equal to LS(A|B)R. We are now ready to obtain necessary and sufficient conditions for the existence of the shorted matrix S(A|B).
300
Matrix Partial Orders, Shorted Operators and Applications
Theorem 11.2.11. Let A, B be parallel summable matrices of the same order. Let (L, R) be a rank factorization of A + B and D be a matrix such that A = LDR. Then the following are equivalent: (i) S := S(A|B) exists (ii) Index of I − D is ≤ 1 (iii) ρ(B(A + B)− B) = ρ(B). Proof. (ii) ⇔ (iii) −1 −1 −1 Let L−1 L be a left inverse of L and RR a right inverse of R. Then RR LL is a g-inverse of A + B. Since −1 2 B(A + B)− B = L(I − D)RR−1 R LL L(I − D)R = L(I − D) R,
it follows that ρ(B(A + B)− B) = ρ(I − D)2 . Also, B = L(I − D)R, so, we have ρ(B) = ρ(I − D). Thus, ρ(B(A + B)− B) = ρ(B) ⇔ ρ(I − D)2 = ρ(I − D) ⇔ I − D is of index not exceeding 1. (i) ⇒ (iii) Let S := S(A|B) exist. Write A + B = A − S + B + S. Since S <s B, we have S + B <s B. Since C(A − S) ∩ C(S + B) ⊆ C(A − S) ∩ C(B) and C(A − S)? ∩ C(S + B)? ⊆ C(A − S)? ∩ C(B)? , we have by Theorem 11.2.6(ii), C(A − S) ∩ C(S + B) = {0} and C(A − S)? ∩ C(S + B)? = {0}. Therefore, A + B = (A − S) ⊕ (B + S), or equivalently S + B <− A + B. We now show that C(B) = C(S + B) and C(B? ) = C(S + B)? . Since S + B <s B, we already have C(S + B) ⊆ C(B) and C(S + B)? ⊆ C(B? ). Let x ∈ C(B). Then x = Bu for some u. Now, A, B be parallel summable, so, C(B) ⊆ C(A + B). Therefore, there exists a vector v such that x = Bu = (A + B)v = (A − S)v + (S + B)v. Therefore, Bu − (S + B)v = (A − S)v. Since Bu − (S + B)v ∈ C(B), (A − S)v ∈ C(A − S) and C(A − S) ∩ C(B) = {0}, so, Bu = (S + B)v. But then x = Bu, we have x ∈ C(S + B), giving C(B) ⊆ C(S + B) and therefore C(B) = C(S + B). Similarly, C(B? ) = C(S + B)? . Now, C(B) = C(S + B), and C(B? ) = C(S + B)? ⇒ there exist matrices T1 and T2 such that B = T1 (S + B) and BT2 = S + B. Also, since (S + B) <− (A + B) ⇒ S + B = (S + B)(A + B)− (S + B). Hence, T1 (S + B)(A + B)− (S + B) = T1 (S + B)(A + B)− BT2 = T1 (S + B), or equivalently, B(A + B)− BT2 = B. It now follows that ρ(B) = ρ(B(A + B)− BT2 ) ≤ ρ(B(A + B)− B) ≤ ρ(B) and hence, ρ(B(A + B)− B) = ρ(B). (ii) ⇒ (i)
Shorted Operators - Other Approaches
301
Let us first consider the case when D does not possess 1 as an eigen-value. In this case I − D is non-singular, so, S(D|I − D) exists and is equal to D, by Corollary 11.2.7. Hence, S(A|B) = S(LDR|L(I − D)R) exists and is equal to LDR = A. Now, let 1 be an eigen-value of D. Since I − D is of index 1, the algebraic multiplicity of 1 as an eigen-value of D is same as its geometric multiplicity. So, there exists a non-singular matrix Q and a matrix D1 such that I − D1 is non-singular and I − D = Qdiag(I − D1 , 0)Q−1 . So, D = Qdiag(D1 , I)Q−1 . Consider − 1 1 diag D1 , I diag diag I − D1 , 0 D1 , I + diag I − D1 , 0 n n = diag D1 , I
diag
1 1 D1 + (I − D1 ), I n n
It is easy to check that for all n > n0 = maxi {
−
diag I − D1 , 0 .
di }, where di are the di − 1
1 eigen-values of D1 , the matrix D1 + I − D1 is non-singular. n Thus, for all n > n0 , −1 diag(D1 , I) diag( n1 D1 + I − D1 , n1 I) diag(I − D1 , 0) 1 −1 = diag(D1 ( n D1 + I − D1 ) (I − D1 ), 0). It follows that, limn→∞ diag(D1 , I)diag(( n1 D1 + I − D1 , n1 I, nI)−1 , nI)diag(I − D1 , 0) 1 = diag(limn→∞ (D1 ( D1 + I − D1 )−1 (I − D1 )), 0) = diag(D1 , 0). n So, S(D|I − D) exists and S(D|I − D) = Qdiag(D1 , 0)Q−1 . As a consequence, we have S(A|B) exists and by Remark 11.2.10, it is equal to LQdiag(D1 , 0)Q−1 R. Corollary 11.2.12. Let A and B be parallel summable matrices of the same order. Then in the notations of Theorem 11.2.11, the following are equivalent: (i) S = S(B|A) exists (ii) Index of D is ≤ 1, (iii) ρ(A(A + B)− A) = ρ(A). Corollary 11.2.13. If A and B are parallel summable matrices of the same order and either C(A) ⊆ C(B) or C(B) ⊆ C(A), then S(A|B) exists. Proof.
S(A|B) exists by Theorem 9.2.26(i) and Theorem 11.2.11.
302
Matrix Partial Orders, Shorted Operators and Applications
Theorem 11.2.14. Let A and B be matrices of the same order such that S = S(A|B) exists. Then the following hold: (i) S = limλ→0 λ1 P(λA, B) and ? (ii) S(A? |B? ) exists and equals (S(A|B)) . Proof is trivial. Corollary 11.2.15. Let A and B be hermitian matrices of the same order such that S(A|B) exists. Then S(A|B) is hermitian. Remark 11.2.16. Let A, B be square matrices of the same order such that S(A|B) exists and is hermitian. It does not necessarily follow that A and B are hermitian. For, let A be hermitian and B non-hermitian invertible matrix such that S(A|B) exists. Then by Corollary 11.2.7 (ii), S(A|B) = A is hermitian, but B is not. We now give an example where neither of the two matrices is hermitian, yet S(A|B) is hermitian. 0 0 1 2 Example 11.2.17. Let A = and B = . Then A, B are 1 1 0 0 non-hermitian. Since C(A) ∩ C(B) = {0} and C(A? ) ∩ C(B? ) = {0} we A have , B are parallel summable for each positive integer n. Therefore, S n A 1 2 exists. Also, for each positive integer n, + B = 1 1 is invertible n n n −1 2n with inverse . So, 1 −n A 0 0 −1 2n 1 2 0 0 − A( + B) B = = . 1 1 1 −n 0 0 0 0 n Thus, S is hermitian but neither of A, B is. Let A be a given matrix. In Theorem 10.3.5 and Theorem 10.5.5, we had obtained a matrix S which has its row space and column space contained in specified subspaces such that S is the closest to A in a certain sense. We show that the shorted operator, we have defined above in this section, has a similar property. But before we exhibit this, we prove the following useful theorem: Theorem 11.2.18. Let P, Q and R be matrices of the same order such that P <s Q, P <− R and Q <− R. Then P <− Q.
Shorted Operators - Other Approaches
303
Proof. Since P <− R, we have PP− = RP− and P− P = P− R for some g-inverse P− of P, and Q <− R ⇒ {R− } ⊆ {Q− }. Let R− be a g-inverse of R. Then PP− = RP− ⇒ QR− PP− = QR− RP− . Since P <s Q, C(P) ⊆ C(Q) and C(P? ) ⊆ C(Q? ). Therefore, QR− P = P and QR− R = Q. As consequence, PP− = QP− . Similarly, P− P = P− Q and hence, P <− Q. Theorem 11.2.19. Let A, B be square matrices of the same order such that S = S(A|B) exists. Then S = S(A|B) = max{D : D <− A, C(D) ⊆ C(B), C(D? ) ⊆ C(B? ).} Proof. Let C = {D : D <− A, C(D) ⊆ C(B), C(D? ) ⊆ C(B? )}. By Theorem 11.2.6 (iii) and (iv), S ∈ C. Let D be an arbitrary matrix in C. Clearly, C(D) ⊆ C(A) ∩ C(B) = C(S) and C(D) ⊆ C(A) ∩ C(B) = C(S? ). So, D <s S. Now, D <s S, D <− A and S <− A, so, the result follows by Theorem 11.2.18. We now obtain another property related to the closeness of the shorted operator to the matrix A according to somewhat different looking criterion. We say that it is somewhat different looking criterion, since we shall later prove that these two different criteria lead to the same optimizing matrix whenever it exists. Theorem 11.2.20. Let A, B be square matrices of the same order such that S = S(A|B) exists. Then S = arg min{ρ(A − D) : C(D) ⊆ C(B), C(D? ) ⊆ C(B? )}. D
Proof. Let F = {D : C(D) ⊆ C(B), C(D? ) ⊆ C(B? )}. Notice that S ∈ F. Let D be an arbitrary element of F. Write A − D = (A − S) + (S − D). Since C(S) ⊆ C(B), C(S − D) ⊆ C(B), so, C(A − S) ∩ C(S − D) = {0}. Similarly, C(A − S)? ∩ C(S − D)? = {0}. So, A − D = (A − S) ⊕ (S − D). In view of Remark 3.3.9, we have ρ(A − S) ≤ ρ(A − D). Further, the equality sign holds if and only if ρ(S − D) = 0 or D = S. Let C <− A. Does S(A|C) exist? If so, what is it? The answer to all these questions is contained in the following: Theorem 11.2.21. Let A and C be matrices such that C <− A. Then S(A|C) and S(C|A) exist and are each equal to C.
304
Matrix Partial Orders, Shorted Operators and Applications
Since C <− A, there exist non-singular matrices P, Q such that C = Pdiag Ic , 0, 0 Q and A = Pdiag Ic , Ia−c , 0 Q, where c = ρ(C) and a = ρ(A). Notice that A and C are parallel summable. Write P = (P1 : P2 : P3 ) and Qt = (Qt 1 : Qt 2 : Qt 3 ), where ‘P1 and Qt 1 ’ each have c number of columns and ‘P2 and Qt 2 ’ each have a − c number of columns. Now, √ √ A + C = ( 2P1 : P2 : P3 )diag Ic , Ia−c , 0 ( 2Qt 1 : Qt 2 : Qt 3 )t √ √ t Ic t t t and C = ( 2P1 : P2 : P3 )diag , 0, 0 ( 2Q 1 : Q 2 : Q 3 ) . 2 √ √ Further, both ( 2P1 : P2 : P3 ) and ( 2Qt 1 : Qt 2 : Qt 3 ) are non I singular. Under the notations of Theorem 11.2.11, I − D = diag ,0 2 I and D = diag 2 , I . Clearly each of D and I − D are of index 1. Thus S(A|C) and S(C|A) exist. Following the proof of Theorem 11.2.11, we can show that S(A|C) = C. Since C <− A, using the decompositions of C and A as above it follows that S(C|A) = C. Proof.
Corollary 11.2.22. Let A and B be matrices of the same order such that S(A|B) exists. Then S(A|S(A|B)) and S(S(A|B)|A) exist and are each equal to S(A|B). We now obtain the class of all g-inverses of S(A|B). Theorem 11.2.23. Let A and B be matrices of the same order such that S = S(A|B) exists. Then − {S(A|B) } = {A− + X : A− ∈ {A− } and BXB = 0}. −
Proof. We first note that S(A|B) <− A ⇒ {A− } ⊆ {S(A|B) }. Let BXB = 0 for some matrix X of suitable order. Since C(S(A|B)) ⊆ C(B) ? and C(S(A|B) ) ⊆ C(B? ), it follows that S(A|B)XS(A|B) = 0. Hence, S(A|B)(A− + X)S(A|B) = S(A|B) for each A− ∈ {A− }. Thus, A− + X is a g-inverse of S(A|B) for each A− ∈ {A− }. Conversely, let A− +X be a g-inverse of S(A|B) for some matrix X of appropriate order. Then, S(A|B)XS(A|B) = 0. As C(S) = C(A) ∩ C(B), and C(S? ) = C(A? ) ∩ C(B? ), it follows that X = Xa + Xb , where AXa A = 0 and BXa B = 0. Notice that A− + Xa ∈ {A− }, so, A− + X = G + Xb with G ∈ {A− } and BXb B = 0. Hence, the matrix A− + X belongs to {A− + X : A− ∈ {A− } and BXB = 0} and so, − {S(A|B) } = {A− + X : A− ∈ {A− } and BXB = 0}.
Shorted Operators - Other Approaches
11.3
305
Rank minimization problem and shorted operator
In Section 10.5 we defined the shorted operator of a matrix over a general field, indexed by some specified subspaces in terms of Schur complements extending the approach followed in Section 10.3 for nnd matrices over C. In Section 11.2, we defined the shorted operator of a matrix over C as the limit of a sequence of parallel sums of suitable matrices. In both the cases the shorted operator whenever it exists has the optimality property of closeness in the following sense: Let A be an m × n matrix over a field F. Let Vm and Vn be subspaces of F m and F n respectively. Then the shorted operator S(A|Vm , Vn ), whenever it exists has the property: S(A|Vm , Vn ) = max{D : D <− A, C(D) ⊆ Vm , C(Dt ) ⊆ Vn }. In Section 11.2, under the same setup as above, we also obtained another optimality property of closeness for S(A|Vm , Vn ) namely: S(A|Vm , Vn ) = arg min{ρ(A − D) : C(D) ⊆ Vm , C(Dt ) ⊆ Vn }, D
whenever it exists. We call this later property as the rank minimization property. When this property is satisfied by a unique matrix, we show that it is equivalent to the definition of the shorted operator via parallel sums and is also equivalent to the definition via optimality property of maximization as mentioned above. All matrices in this section are over an arbitrary field F unless stated otherwise. Definition 11.3.1. Let A be an m×n matrix. Let Vm and Vn be subspaces of F m and F n respectively. Then an m×n matrix S is said to have property (∗) with respect to the triple (A, Vm , Vn ) if ρ(A − S) = min{ρ(A − D) : C(D) ⊆ Vm , C(Dt ) ⊆ Vn }. D
In general, S as defined in Definition 11.3.1 is not unique as the following example shows: 1 0 0 1 0 Example 11.3.2. Let A = 1 1 1 . Let Vm = Vn = C 0 1 . Now it 0 1 0 0 0 1 0 0 is easy to see that for arbitrary scalars a and b, the matrix Sa,b = a b 0 0 0 0 has the property (∗) with respect to the triple (A, Vm , Vn ).
306
Matrix Partial Orders, Shorted Operators and Applications
When is the matrix S having property (∗) with respect to the the triple (A, Vm , Vn ) as in Definition 11.3.1 unique? Interestingly it turns out that S is unique if and only if the shorted operator S(A|Vm , Vn ) as in Section 10.5 exists and is equal to S(A|Vm , Vn ). We show this in the following theorem: Theorem 11.3.3. Let A be an m × n matrix. Let Vm and Vn be subspaces of F m and F n respectively. Then an m × n matrix S having property (∗) with respect to the the triple (A, Vm , Vn ) exists and is unique if and only if S(A|Vm , Vn ) as defined in Section 10.5 exists. Further, S = S(A|Vm , Vn ). Proof. ‘If’ part Let S(A|Vm , Vn ) exist. We write S1 = S(A|Vm , Vn ). Clearly, C(S1 ) ⊆ Vm and C(St1 ) ⊆ Vn . Now, S1 = P1 (T11 − T12 T− 22 T21 )P3 , by (10.5.9). It is easy to check that A − S1 = (P1 T12 T− 22 + P2 )(T21 P3 + T22 P4 ) = (P1 T12 + P2 T22 )(P4 + T− 22 T21 P3 ). We show that C(A − S1 ) ∩ Vm = {0}. Recall that Vm = C(P1 ). Let (A − S1 )x = P1 y. But (A − S1 )x = (P1 T12 + P2 T22 )u for some u. So, P1 T12 u + P2 T22 u = P1 y or equivalently, P2 T22 u = P1 y − P1 T12 u = 0, since C(P1 ) ∩ C(P2 ) = {0}. Also, P2 T22 u = 0 ⇒ T22 u = 0, as P2 has a left inverse. Therefore, T12 u = 0, since S(A|Vm , Vn ) exists. Thus, (A − S1 )x = 0. Likewise we can prove C(A − S1 )t ∩ Vn = {0}. Now, let D be a matrix such that C(D) ⊆ Vm and C(Dt ) ⊆ Vn . Write A − D = A − S1 + S1 − D. Since C(S1 ) ⊆ Vm and C(D) ⊆ Vm , we have C(S1 − D) ⊆ Vm . So, C(A − S1 ) ∩ C(S1 − D) = {0}. Similarly, C(A − S1 )t ∩C(S1 − D)t = {0}. Hence, ρ(A − D) = ρ(A−S1 )+ρ(S1 −D). It follows that ρ(A − S1 ) ≤ ρ(A − D) and the equality holds if and only if S1 = D. Thus, S1 is the unique matrix satisfying the property (∗). ‘Only if’ part Suppose there exists a unique matrix S satisfying the property (∗) with respect to the triple (A, Vm , Vn ). Then C(A − S)∩Vm = {0} and C(A − S)t ∩ Vn = {0}. Let (L11 , R11 ) and (L21 , R21 ) be rank factorizations of S and A − S respectively. We clearly have, C(L11 ) ∩ C(L21 ) = {0} and C(Rt11 ) ∩ C(Rt21 ) = {0}. Let the columns of L1 = L11 : L12 form a basis for Vm R11 and the rows of R1 = form a basis of Vn . Let L2 and R2 be matrices R21 R1 such that L1 : L2 and are non-singular. Then S = L1 T11 R1 and R2
Shorted Operators - Other Approaches
307
T11 0 R1 . Clearly, 0 T22 R2 S(A|Vm , Vn ) exists by Theorem 10.5.4 and is equal to S.
A − S = L2 T22 R2 . Thus, A = L1 : L2
Thus, there is a unique matrix S having the property (∗) with respect to the triple (A, Vm , Vn ) if and only if S(A|Vm , Vn ) exists and in such a case S = S(A|Vm , Vn ). Henceforth, we consider the situation when S in Definition 11.3.1 exists and is unique. We identify S with S(A|Vm , Vn ). Given a matrix A and subspaces Vm and Vn of F m and F n respectively, let B be a matrix such that C(B) = Vm and C(Bt ) = Vn . Our next object is to show that S(A|B) as defined in Section 11.2 and S(A|Vm , Vn ) as defined above are the same when matrices are over the complex field C. Theorem 11.3.4. Let A and B be matrices of the same order over a field F of characteristic 0. Let (P1 , P3 ) be a rank factorization of B and let A P1 R = . Let P2 and P4 be matrices such that P1 : P2 and P 0 3 T11 T12 P3 P3 are non-singular matrices and A = P1 : P2 . P4 T21 T22 P4 Then the following are equivalent: (i) There exists a non-zero scalar c such that A and cB are parallel summable and ρ(B(A + B)− B)= ρ(B) A (ii) ρ(R) = ρ(A : P1 ) + ρ(P3 ) = ρ + ρ(P1 ) P3 t A A (iii) C ⊆ C(R), C ⊆ C(Rt ) and 0 0 (iv) C(T21 ) ⊆ C(T22 ) and C(Tt12 ) ⊆ C(Tt22 ). Proof.
(i) ⇒ (ii)
A + cB P1 Since A and cB are parallel summable, we have M = = P3 0 I 0 A + cB 0 I (A + cB)− P1 . − − P3 (A + cB) 0 0 P3 (A + cB) P1 0 I So, ρ(M) = ρ(A + cB) + ρ(P3 (A + cB)− P1 ) = ρ(A + cB) + ρ(B(A + cB)− B) = ρ(A + cB) + ρ(B). However,
I 0 M −cP3 I
=
A P1 P3 0
= R.
308
Matrix Partial Orders, Shorted Operators and Applications
Hence, ρ(R) = ρ(M) = ρ(A + cB) +ρ(B). Since A and cB areparallel A A summable, ρ(A + cB) = ρ(A : B) = = ρ(A : P1 ) = . Also, B P3 ρ(B) = ρ(P1 ) = ρ(P3 ). Thus, A ρ(R) = ρ(A : P1 ) + ρ(P3 ) = ρ + ρ(P1 ). P3 (ii) ⇒ (i)
A B A Since (ii) holds, we have ρ =ρ + ρ(B) = ρ(A : B) + ρ(B). B 0 B 1 2 A B A B A 0 A B F F − − Thus, < and < . Let be 0 0 B 0 B 0 B 0 F3 F4 A B a g-inverse of . Then B 0 B = BF2 B = BF3 B, AF2 B = BF4 B = BF3 A. Choose a scalar c such that F4 B + cI is non-singular. We claim that A − BF4 B and B are disjoint. Let x ∈ C(A − BF4 B) ∩ C(B). Then there exist vectors u and v such that x = (A − BF4 B)u = Bv. Premultiplying by BF3 , we have BF3 x = BF3 (A − BF4 B)u = BF3 Bv. Since B = BF3 B, BF4 B = BF3 A, we have Bv = 0. Therefore, x = 0, so, C(A − BF4 B) ∩ C(B) = {0}. Similarly, C(A − BF4 B)t ∩ C(Bt ) = {0}. So, A − BF4 B and B are disjoint and therefore, A − BF4 B and BF4 B are disjoint. Thus, C(A) = C(A − BF4 B) + C(BF4 B) ⊆ C(A − BF4 B) + C(cB + BF4 B) = C(A + cB). t
Similarly, C(A ) ⊆ C(A + cB)t . Hence, A and cB are parallel summable for some non-null scalar c. The proof of ρ(B(A + B)− B) = ρ(B) follows from the proof of (i) ⇒ (ii). (ii) ⇒ (iii) A + ρ(P1 ), we have Since ρ(R) = ρ(A : P1 ) + ρ(P3 ) = ρ P3 A P1 A 0 <− R and <− R. 0 0 P3 0 t A A Hence, C ⊆ C(R) and C ⊆ C(Rt ). 0 0 (iii) ⇒ (iv) Let P1 be a left inverse of P1 and P3 be a right inverse
Shorted Operators - Other Approaches
309
0 P3 of P3 . Then it is easy to see that is a g-inverse of P1 −P1 AP3 A R. Since C ⊆ C(R), A = P1 P1 A. Thus, C(A) ⊆ C(P1 ). Now, 0 A = P1 (T11 P3 + T12 P4 ) + P2 (T21 P3 + T22 P4 ). Pre-multiplying by P1 P1 and using A = P1 P1 A, we have P2 (T21 P3 + T22 P4 ) = 0 or equivalently T21 P3 + T22 P4 = 0, since P2 has a left inverse. So, T21 = −T22 P4 P3 . This shows that C(T21 ) ⊆ C(T22 ). Similarly, we can show that C(Tt12 ) ⊆ C(Tt22 ). (iv) ⇒ (ii) As noted earlier in Theorem 11.3.3, we have − A = P1 (T11 − T12 T− 22 T21 )P3 + (P1 T12 + P2 T22 )(P4 + T22 T21 P3 ). t t P1 A P1 A = {0}. ∩C We now show that C ∩C = {0} and C 0 P3 0 P3 A P1 Let Y= Z. Then AY = P1 Z and P3 Y = 0. Now, P3 0
P3 Y = 0 ⇒ AY = (P1 T12 + P2 T22 )(P4 Y) = P1 Z. Since P2 has a left inverse and Z = T12 P4 Y, we T22 (P obtain 4 Y) = 0. A P1 t t Since C(T12 ) ⊆ C(T22 ), so, Z = 0. Thus, C ∩C = {0}. P3 0 t t A P1 A Similarly, C ∩C = {0}. Hence, ρ(R) = ρ + ρ(P1 ). P3 0 P3 Similarly, we can prove that ρ(R) = ρ(A : P1 ) + ρ(P3 ). A few remarks are in order. Remark 11.3.5. Let Vm and Vn be subspaces of F m and F n respectively. Let X and Y be matrices such that Vm = C(X) and Vn = C(Yt ). Then it can be easily established that (ii)-(iv) of Theorem 11.3.4 are equivalent if we replace P1 and P3 by X and Y respectively. Thus, the definitions of S(A|Vm , Vn ) as in Section 10.5 and the one in this section are equivalent. Remark 11.3.6. Let A and B be matrices of the same order over a field of characteristic 0. Let Vm = C(B) and Vn = C(Bt ). We proved that S(A|Vm , Vn ) exists if and only if A and cB are parallel summable for some scalar c and ρ(B(A + B)− B) = ρ(B). As seen in Section 11.2, when these condition hold, S(A|B) = LQdiag(D, 0)Q−1 R, where (L, R) is a rank factorization of A + B, A = LDR and I − D = Qdiag(I − D1 , 0)Q−1 for some non-singular matrices Q and I − D1 .
310
Matrix Partial Orders, Shorted Operators and Applications
Remark 11.3.7. The definitions of a shorted operator as given in Sections 10.5, 11.2 and 11.3 all coincide when matrices are taken over the field of the complex numbers. Remark 11.3.8. The definition of the shorted operator S(A|B) as the limit of a sequence of parallel sums appears to be more restrictive than that of S(A|Vm , Vn ) for the following reason: In definition of S(A|B), the spaces C(B) and C(Bt ) are necessarily of the same dimension where as Vm and Vn can be of different dimensions. Remark 11.3.9. The statements (i)-(iv) in Theorem 11.3.4 are also equivalent over a general field.
11.4
Computation of shorted operator
We gave quite a few definitions of shorted operator in this chapter as well as the previous one and proved their equivalence in the last section. (See Theorem 11.3.4, Remarks 11.3.5 and 11.3.6.) We investigated the conditions under which the shorted operator exists and provided some formulae for computing the shorted operator whenever it exists. In this section, we first develop one more method for computing the shorted operator and then briefly compare the various different computational methods. We first prove the following useful lemma which is also of independent interest: Lemma 11.4.1. Let A be an m × n matrix over a field F and Vm and Vn be subspaces of F m and F n respectively. Let X and Y be matrices such A X that Vm = C(X) and Vn = C(Yt ). Denote by R, the matrix . Let Y 0 G11 G12 G= be a g-inverse of R. If G21 −G22 t A A C ⊆ C(R) and C ⊆ C(Rt ), 0 0 then the following hold: (i) XG21 X = X, YG12 Y = Y. (ii) YG11 X = 0, AG11 X = 0 and YG11 A = 0. (iii) AG12 Y = XG21 A = XG22 Y. A (iv) AG11 AG11 A = AG11 A, tr(AG11 ) = ρ(A : X) − ρ(X) = ρ − Y ρ(Y).
Shorted Operators - Other Approaches
311
G11 A and (G11 : G21 ) are g-inverses of (A : X) and respecG21 Y tively. A A A Proof. We first notice that C ⊆ C(R) ⇒ RG = and 0 0 0 t A C ⊆ C(Rt ) ⇒ (A : 0)GR = (A : 0). Further, 0 A A AG11 A + XG21 A = A (11.4.1) RG = ⇒ 0 0 YG11 A = 0. (11.4.2) AG11 A + AG12 Y = A (11.4.3) (A : 0)GR = (A : 0) ⇒ AG11 X = 0. (11.4.4) (v)
Also, RGR = R AG11 A + XG21 A + AG12 Y − XG22 Y = A AG11 X + XG21 X = X ⇒ YG11 A + YG21 Y = Y YG11 X = 0.
(11.4.5) (11.4.6) (11.4.7) (11.4.8)
Now, (i) follows from (11.4.2), (11.4.4), (11.4.6) and (11.4.7), (ii) follows from (11.4.2), (11.4.4) and (11.4.8) and (iii) follows from (11.4.1), (11.4.3) and (11.4.5). For (iv), (11.4.1) and (11.4.4) imply AG11 AG11 A = AG11 A. Further, ρ(R) = tr(RG) = tr(AG11 ) + tr(XG21 ) + tr(YG12 ) = tr(AG11 ) + ρ(X) + ρ(Y). Hence, tr(AG11 ) = ρ(R) − ρ(X) − ρ(Y) = ρ(A : X) − ρ(X) = ρ
A − ρ(Y), Y
by using (ii) ⇔ (iii) of Theorem 11.3.4 and Remark 11.3.5. (v) follows from (11.4.1), (11.4.3), (11.4.6) and (11.4.7).
We are now ready to obtain the shorted operator S(A|Vm , Vn ). Theorem 11.4.2. Consider the same setup as in Lemma 11.4.1. Then the matrix AG12 Y = XG21 A = XG22 Y is the shorted operator S(A|Vm , Vn ). Proof. Let us write S = AG12 Y = XG21 A = XG22 Y. Clearly, we have C(S) ⊆ C(X) = Vm and C(St ) ⊆ C(Yt ) = Vn . We next show that C(A − S) ∩ Vm = {0} and C(A − S)t ∩ Vn = {0}.
312
Matrix Partial Orders, Shorted Operators and Applications
We have from (11.4.1), A − S = AG11 A. Let AG11 Au = Xv for some vectors u and v. Then by Lemma 11.4.1 (iv) and (ii), AG11 AG11 Au = AG11 Xv = 0. So, AG11 AG11 Au = AG11 Au = Xv = 0. Hence, C(A − S) ∩ Vm = C(A − S) ∩ C(X) = {0}. Similarly, we can show that C(A − S)t ⊆ C(Yt ) = {0}. Let E be a matrix such that C(E) ⊆ Vm and C(Et ) ⊆ Vn . We can write A − E = A − S + S − E. Since C(S − E) ⊆ Vm and C(S − E)t ⊆ Vn , it follows that C(A − S) ∩ C(S − E) = {0} and C(A − S)t ∩ C(S − E)t = {0}. Hence, ρ(A − E) ≥ ρ(A − S) and the equality holds if and only if E = S. Thus, by Definition 11.3.1, S = S(A|Vm , Vn ). We now discuss different methods of computing S(A|Vm , Vn ). Let A be an m × n matrix over a field F and Vm and Vn be subspaces of F m and F n respectively. Let P1 be matrix whose columns form a basis of Vm and P3 be a matrix rows of which form a basis of Vn and Wm and Wn denote complements of Vm and Vn in F m and F n respectively. Let the columns of a matrix P2 for a basis of Wm and the rows of a matrix P4 form a basis for Wn . Let X and Y bematrices m = C(X) and such that V A X G11 G12 t Vn = C(Y ). Denote by R, the matrix and let G = Y 0 G21 −G22 1 −1 P P3 = (P3 : P4 ). be a g-inverse of R. Write (P1 : P2 )−1 = and P2 P4 A A Method 11.4.3. Check whether RG = and (A : 0)GR = 0 0 (A : 0). If yes, the shorted operator S(A|Vm , Vn ) exists and is given by S(A|Vm , Vn ) = XG11 Y = AG11 Y = XG11 A. If no, the shorted operator does not exist. 1 P T11 T12 3 4 Method 11.4.4. Write A(P : P ) = , where T11 = P2 T21 T22 P1 AP3 , T12 = P1 AP4 , T21 = P2 AP3 , and T22 = P2 AP4 . Check, if C(T21 ) ⊆ C(T22 ) and C(Tt12 ) ⊆ C(Tt22 ). If so, S(A|Vm , Vn ) exists and − is given by S(A|Vm , Vn ) = P1 (T11 − T12 T− 22 T21 )P3 , where T22 is an arbitrary g-inverse of T22 . If no, the shorted operator S(A|Vm , Vn ) does not exist. Equivalently, Method 11.4.4*. Check whether ρ(P2 AP4 ) = ρ(P2 A) = ρ(AP4 ). If yes, S(A|Vm , Vn ) exists and is given by A − AP4 (P2 AP4 )− P2 A, where (P2 AP4 )− is any g-inverse of P2 AP4 .
Shorted Operators - Other Approaches
313
Method 11.4.5. This is a special case of Method 11.4.3. Let F be a field of characteristic 0 and A, B be m × n matrices over F. Let Vm = C(B) and Vn = C(Bt ). Notice that d(Vm ) = d(Vn ). In this case, check if t t A A B A A B C ⊆C and C ⊆C . 0 B 0 0 B 0 If so, let (L, R) be a rank factorization of A + B. Let D be a square matrix such that A = LDR and I − D = Qdiag I − D1 0 Q−1 , where Q and I − D1 are non-singular. Then S(A|B) exists and is given by S(A|B) = −1 LQdiag D1 0 Q R. If no, then S(A|B) does not exist. Let us now compare the methods. A X For Method 11.4.3, we need to compute a g-inverse of . Once Y 0 we have this, it is very easy to check if S(A|Vm , Vn ) exists and then obtain it. However, if the orders of the matrices X and Y are m × s and r × n respectively, then we need to compute a g-inverse of a matrix of order (m + r) × (s + n) which is a matrix of much larger order. For Method 11.4.4, we need to know a basis of Wm and and a basis Wn , where Wm and Wn denote the complements of Vm and Vn respectively. We 3 −1 P . One way to obtain also need to know the matrix (P1 : P2 )−1 A P4 this matrix is as follows: Form the matrix (P1 : P2 : A) and reduce it to the form (I : (P1 : P2 )−1 A) by elementary row operations. This reduces P3 (P1 : P2 ) to I. Now consider (P1 : P2 )−1 A : and reduce it to P4 ! −1 P3 (P1 : P2 )−1 A : I by elementary column operations. This reP4 P3 T11 T12 duces to I. Thus, we have the matrix . If this matrix can P4 T21 T22 H 0 be further reduced to by elementary row and column operations, 0 T22 we have C(T21 ) ⊆ C(T22 ), C(Tt12 ) ⊆ C(Tt22 ) and H = T11 − T12 T− 22 T21 . Further, S(A|Vm , Vn ) = P1 HP3 . If the last reduction is not possible, then S(A|Vm , Vn ) does not exist. If the matrices P2 and P4 can be easily obtained, then this method has an edge over the method offered by Lemma 11.4.1. It may be noted that for any full column rank matrix P1 , there exists a sub-permutation (i1 , i2 , . . . , im−t ) of (1, 2, . . . , m) such that
314
Matrix Partial Orders, Shorted Operators and Applications
P2 = (ei1 : ei2 : ..eim−t ). However this may not be very easy to obtain algorithmically. Thus, the Methods 11.4.3 and 11.4.4 are of comparable complexity. A A B For Method 11.4.5, one has to check C ⊆ C and 0 B 0 t t A A B C ⊆ C , which is a task by itself. Even after this, there 0 B 0 is significant amount of computation left to be done that leads to the computation of S(A|B). This method appears to be computationally more expensive as compared to other methods.
Shorted Operators - Other Approaches
11.5
315
Exercises
Unless stated otherwise all matrices are over a general field F. (1) Let A, B and C be matrices of the same order such that C 6= 0 and {C− } = {A− + X : A− ∈ {A− } and BXB = 0}. Show that S = S(A|B) exists and S = C. (2) Let A and B be matrices of the same order such that S(A|B) and S(B|A) both exist. Then show that S(A|B) and S(B|A) are parallel summable with P(S(A|B), S(B|A)) = P(A, B). (3) Let A and B be matrices of the same order such that S(A|B) and S(B|A) both exist. Then show that {(S(A|B))− + (S(B|A))− } = {A− } + {B− }. (4) Let A, B and C be matrices of the same order such that (i) A <− B and (ii) the shorted matrices S(A|C) and S(B|C) exist. Show that S(A|C) <− S(B|C). Verify the validity of the conclusion if ‘<− ’ is replaced by ‘ ’ (5) Let A be an m×n matrix and S and T be subspaces of F m and F n . Let S 0 ⊆ S and T 0 ⊆ T . Show that S(A|S 0 , T 0 ) <− S(A|S, T ) whenever the shorted matrices exist. (6) Let A be an m × n matrix, S and S 0 be subspaces of F m and T and T 0 be subspaces of F n such that S ∩ S 0 6= {0} and T ∩ T 0 6= {0}. Assume S(A|S, T ) exists. Show that S(A|S ∩ S 0 , T ∩ T 0 ) exists if and only if S(S(A|S, T )|S 0 , T 0 ) exists, in which case they are equal. (7) Let the matrix A be idempotent of order n×n. Prove that S(A|Vn , Vn ), if it exists, is also idempotent where Vn is a subspace of F n .
Chapter 12
Lattice Properties of Partial Orders
12.1
Introduction
We studied the shorted operators in Chapters 10 and 11 in some detail. Their characterizing property as a maximal element of a certain collection of matrices also sets up the tone to consider an associated natural question namely, if a class of matrices is equipped with any of the partial orders on it, whether or not the set becomes a lattice under the partial order. If not, whether or not, it is a semi-lattice. In this chapter we discuss this problem for the three major partial orders namely the minus, the star and the sharp orders. Let U be a set of matrices equipped with a partial order ‘≺’. Let A, B ∈ U. We consider the following two classes of matrices C1 = {C : C ≺ A, C ≺ B} and C2 = {C : A ≺ C, B ≺ C}. The class C1 is non- empty as the null matrix is always below every matrix under ‘≺’. However, the class C2 may well be empty in general. One instance when it happens is when the matrices A and B are maximal with respect to the given partial order ‘≺’. Let us remind ourselves of the following: Definition 12.1.1. The unique maximal element of C1 , if it exists is called the infimum of the matrices A, B under ‘≺’ and is denoted by A ∧ B. Similarly, if C2 is non-empty, the unique minimal element of C2 , if it exists, is called the supremum of the matrices A, B under ‘≺’ and is denoted by A ∨ B. 317
318
Matrix Partial Orders, Shorted Operators and Applications
Section 2 gives a detailed account of finding the infimum and supremum of a given pair of matrices under the minus order. In general neither the supremum nor the infimum of a given pair of elements may exist. We study the conditions under which either one or both of infimum and supremum of a given pair of matrices under the minus order exist. Section 3 deals with the same problem for the star order. Here we show that the infimum of any pair of elements always exists. In Section 4, we study the conditions under which a pair of matrices has an infimum under the sharp order. The related problem for associated one sided orders for the star order can be found in exercises at the end of the chapter. We begin with the minus order.
12.2
Supremum and infimum of a pair of matrices under the minus order
Let A and B be matrices of the same order m × n over an arbitrary field F. For the minus order, we shall write C for the class C1 and C for the class C2 . Thus C = {C : C <− A, C <− B} and C = {C : A <− C, B <− C}. Also as noted earlier C 6= ∅ and C may be empty. We begin by giving an example of a pair of matrices for which neither A ∧ B nor A ∨ B exists. 1 0 1 1 0 0 Example 12.2.1. Let A = 0 1 0 and B = 0 1 0 . 0 0 0 0 0 0 − Notice that any non-null matrix <− B must have C such that C < A, C 1 0 0 0 0 0 rank 1. Now each of C1 = 0 0 0 and C2 = 0 1 0 ∈ C. However, 0 0 0 0 0 0 both C1 and C2 are non-comparable under the minus order. Hence, A ∧ B does not exist. 1 0 1 Moreover, the class C 6= ∅, since both the matrices D1 = 0 1 1 and 0 0 1 1 0 0 D2 = 0 1 0 ∈ C. Clearly, each has rank 3 and are not comparable 0 0 1 under the minus order. Thus, A ∨ B also does not exist.
Lattice Properties of Partial Orders
319
Thus, in general, F m×n may fail to be a lower semi-lattice as well as an upper semi-lattice under the minus order. What conditions will ensure that this is possible? The obvious thing to do is to find out when can C have the supremum and when can C have the infimum. We first study the conditions under which the supremum of any two matrices in F m×n exists. The following theorem gives a necessary and sufficient condition such that C 6= ∅ when the two matrices possess certain properties. Theorem 12.2.2. Let A and B be matrices of the same order. Write Sa = C(A), Ta = C(At ), Sb = C(B) and Tb = C(Bt ). Further, assume that the shorted operators S(A|Sb , Tb ) and S(B|Sa , Ta ) both exist. Then C 6= ∅ if and only if S(A|Sb , Tb ) = S(B|Sa , Ta ). 1 2 F F Proof. We note that if S(A|Sb , Tb ) exists and is a g-inverse F3 F4 A B of , then B = BF2 B = BF3 B, AF2 B = BF4 B = BF3 A, and B 0 the shorted operator S(A|Sb , Tb ) = BF3 A. (See Theorem 11.4.2.) ‘If’ part Let S(A|Sb , Tb ) = S(B|Sa , Ta ). We show that A + B − S(A|Sb , Tb ) ∈ C. Note that A <− A + B − S(A|Sb , Tb ) ⇔ B − S(A|Sb , Tb ) and A are disjoint matrices. Therefore, we must show that C(B − S(A|Sb , Tb )) ∩ C(A) = {0} and C(B − S(A|Sb , Tb ))t ∩ C(A)t = {0}. Let z = (B − S(A|Sb , Tb ))x = Ay. As noted above, S(A|Sb , Tb ) = BF3 A, so, z ∈ C(A) ∩ C(B) = C(S(A|Sb , Tb )) = C(S(B|Sa , Ta )). Therefore, z ∈ C(B − S(B|Sa , Ta )) ∩ C(S(B|Sa , Ta )) = {0}. Since S(B|Sa , Ta ) <− B, we have C(B − S(A|Sb , Tb )) ∩ C(A) = {0}. Hence, z = 0. This proves C(B − S(B|Sa , Ta )) ∩ C(A) = {0}. Similarly, we can show that C(B − S(B|Sa , Ta ))t ∩ C(A)t = {0}. We can prove that B <− A + B − S(A|Sb , Tb ) along the same lines. Thus, A + B − S(A|Sb , Tb ) ∈ C and so, C 6= ∅. ‘Only if’ part Let C 6= ∅ and C ∈ C. Then A <− C and B <− C. By Ex.6 of Chapter 11, we have S(S(C|Sa , Ta )|Sb , Tb ) = S(S(C|Sb , Tb )|Sa , Ta ) = S(C|Sa ∩ Sb , Ta ∩ Tb ).
(12.2.1)
Now, by Theorem 10.3.5 S(C|Sa , Ta ) = max{Z : Z <− C, C(X) ⊆ C(A). C(Xt ) ⊆ C(At )}.
320
Matrix Partial Orders, Shorted Operators and Applications
Clearly, A ∈ {Z : Z <− C, C(X) ⊆ C(A). C(Xt ) ⊆ C(At )}, and therefore, A <− S(C|Sa , Ta ). Since S(A|Sb , Tb ) <− A, C(S(A|Sb , Tb )) = C(A) ∩ C(B) ⊆ C(B) and C(S(A|Sb , Tb ))t = C(At ) ∩ C(Bt ) ⊆ C(Bt ), so, S(A|Sb , Tb ) <− S(S(C|Sa , Ta )|Sb , Tb ). However, both the matrices S(S(C|Sa , Ta )|Sb , Tb ) and S(A|Sb , Tb ) have same column space C(A) ∩ C(B) and same row space C(At ) ∩ C(Bt ), therefore have same rank and so, S(A|Sb , Tb ) = S(S(C|Sa , Ta )|Sb , Tb ). (12.2.2) Similarly, S(B|Sa , Ta ) = S(S(C|Sb , Tb )|Sa , Ta ). (12.2.3) From (12.2.1)-(12.2.3), we have S(A|Sb , Tb ) = S(B|Sa , Ta ). Remark 12.2.3. It is possible that the class C be non-empty even when S(A|Sb , Tb ) and /or S(B|Sa , Ta ) are not defined as the following example shows: Example 12.2.4. Let A, B be the matrices as in Example 12.2.1. Con1 0 a sider the matrix C = 0 1 b , where a, b, c are arbitrary and c 6= 0. 0 0 c Then A <− C and B <− C. Also, C(A)∩C(B) = C(A) = C(B) and C(At )∩ 0 C(Bt ) = C 1 . Thus, d(C(A) ∩ C(B)) = 2 and d(C(At ) ∩ C(Bt )) = 1. 0 Hence, both the shorted matrices S(A|Sb , Tb ) and S(B|Sa , Ta ) do not exist, yet C ∈ C, showing C 6= ∅. Theorem 12.2.5. Let A and B be matrices of the same order. Write Sa = C(A), Ta = C(At ), Sb = C(B) and Tb = C(Bt ). Further, assume that the shorted operators S(A|Sb , Tb ) and S(B|Sa , Ta ) both exist and are equal. Then A ∨ B exists if and only if at least one of A − S(A|Sb , Tb ) and B − S(B|Sa , Ta ) is null or equivalently at least one of A <− B or B <− A holds. Proof. The matrix A + B − S(A|Sb , Tb ) ∈ C, as shown in Theorem 12.2.2. We show that A ∨ B = A + B − S(A|Sb , Tb ) ∈ C. Let C ∈ C. Then A <− C and B <− C. So, C(A + B) = C(A) + C(B) ⊆ C(C). Hence, ρ(C) = d(C(A) + C(B)) = ρ(A) + ρ(B) − d(C(A) ∩ C(B)) ≥ ρ(A + B − S(A|Sb , Tb )),
Lattice Properties of Partial Orders
321
so, C≮− A + B − S(A|Sb , Tb ) under ‘<− ’. Let none of the matrices A − S(A|Sb , Tb ) and B − S(B|Sa , Ta ) be null. Consider the matrix XK = A + B − S(A|Sb , Tb ) + (A − S(A|Sb , Tb ))K(B − S(B|Sa , Ta )), where K is an arbitrary matrix of appropriate order. We show that for each such matrix K, the matrix XK is a minimal element of C. We first prove that XK ∈ C or equivalently, A <− XK and B <− XK which is further equivalent to establishing that the matrices ‘A and XK − A’ are disjoint and ‘B and XK − B’ are disjoint. Consider the matrix XK − A. XK − A = B − S(A|Sb , Tb ) + (A − S(A|Sb , Tb ))K(B − S(B|Sa , Ta )) = B − S(B|Sa , Ta ) + (A − S(A|Sb , Tb ))K(B − S(B|Sa , Ta )). Let z = Ax = (XK − A)y. Clearly, z ∈ C(A) ∩ C(B) = C(S(B|Sa , Ta )). So, z ∈ C(B − S(B|Sa , Ta )) ∩ C(S(B|Sa , Ta )) = {0}, as S(B|Sa , Ta ) <− B. Therefore, z = 0 implying C(A) ∩ C(XK − A) = {0}. Similarly, C(At ) ∩ C(XK − A)t = {0}. Thus, A <− XK . Similar argument shows that B <− XK . Moreover, ρ(XK ) = ρ(A + (B − S(A|Sb , Tb )))(I + K(A − S(A|Sb , Tb ))) ≤ ρ(A) + ρ((B − S(A|Sb , Tb ))(I + K(A − S(A|Sb , Tb )))) ≤ ρ(A) + ρ((B − S(A|Sb , Tb ))). Thus, XK is a minimal element for each choice of K, showing A ∨ B does not exist. Thus, either A − S(A|Sb , Tb ) or B − S(B|Sa , Ta ) is a null matrix or equivalently, either A <− B or B <− A holds. ‘If’ part Assume at least one of A − S(A|Sb , Tb ) or B − S(B|Sa , Ta ) be null, say A − S(A|Sb , Tb ) = {0}. Then A = S(A|Sb , Tb ) = S(B|Sa , Ta ) <− B. So, A ∨ B = A. Similarly, if B − S(B|Sa , Ta ) = {0}, then A ∨ B = B. Remark 12.2.6. In the notations of Theorem 12.2.5, when S(A|Sb , Tb ) and S(B|Sa , Ta ) both exist, for A ∨ B to exist, either A <− B or B <− A. Our next theorem shows that the only instance when A ∨ B exists is provided by conditions of Theorem 12.2.5. Theorem 12.2.7. Let A and B be matrices of the same order. Write Sa = C(A), Ta = C(At ), Sb = C(B) and Tb = C(Bt ). Further, assume that either of the shorted operators S(A|Sb , Tb ) or S(B|Sa , Ta ) does not exist. Then A ∨ B does not exist.
322
Matrix Partial Orders, Shorted Operators and Applications
Proof. Let us suppose that S(A|Sb , Tb ) does not exist, yet A ∨ B exists. Let C = A ∨ B and (P, Q) be a rank factorization of C. As A <− C and B <− C, by Theorem 3.3.5(v), we can write B = Pdiag(I r , 0)Q H11 H12 H11 H12 and A = P Q, where r = ρ(B) and H = is an H21 H22 H21 H22 idempotent matrix. Let λ 6= 0, 6= 1 be a scalar such that det(I + (λ − 1)(I − H22 )) 6= 0. If I = H22 , let I −(λ − 1)H12 T= = T1 0 I + (λ − 1)(I − H22 ) or I 0 T= = T2 −(λ − 1)H21 I + (λ − 1)(I − H22 ) according as H12 is non-null or H21 is non-null. If I 6= H22 , then either choice is permissible. Clearly, det(T) 6= 0. Hence, ρ(C) = ρ(PTQ). − − We now show that A< PTQ and B < PTQ. Consider T1 − H = I − H11 −λH12 . We have C(T1 − H) = C(I − H) and I − H is an −H21 λ(I − H22 ) idempotent. So, ρ(T1 − H) = ρ(I − H) and T1 = H ⊕ (T1 − H). Similarly, T2 = H ⊕ (T2 − H). Therefore, in either case A <− PTQ. However, B <− PTQ holds trivially. This contradicts that A ∨ B exists. H11 0 In case I = H22 , H12 = 0 and H21 = 0, we have A = P Q. 0 I A B Consider the matrix . By Theorem 11.4.2, S(A|Sb , Tb ) exists and B 0 S(A|Sb , Tb ) = A, a clear contradiction to the supposition that S(A|Sb , Tb ) does not exist. The case when S(B|Sa , Ta ) does not exist can be completed in the same manner. Remark 12.2.8. In the setup of Theorem 12.2.7, A ∨ B does not exist. We show that the same is not the case with infimum. Even when one of the shorted operators S(A|Sb , Tb ) or S(B|Sa , Ta ) does not exist, the infimum A ∧ B may still exist. We now study the conditions under which the infimum A ∧ B of the matrices A and B of the same order exists. We prove the following: Theorem 12.2.9. Let A and B be matrices of the same order. Let Sa = C(A), Ta = C(At ), Sb = C(B) and Tb = C(Bt ). Further, assume
Lattice Properties of Partial Orders
323
that the shorted operators S(A|Sb , Tb ) and S(B|Sa , Ta ) both exist and the field F has characteristic different from 2. Then the following hold: (i) The pairs ‘A and B,’ and ‘S(A|Sb , Tb ) and S(B|Sa , Ta )’ are parallel summable and P(S(A, Sb , Tb )|S(B|Sa , Ta )) = P(A, B). (ii) If S(A|Sb , Tb ) = S(B|Sa , Ta ), then A ∧ B exists and is given by 2P(A|B). Proof. (i) Proof follows by Ex. 11.2, Ex. 11.3 and Theorem 9.2.14. (ii) If S(A|Sb , Tb ) = S(B|Sa , Ta ), then P(S(A|Sb , Tb )|S(B|Sa , Ta )) =
S(A|Sb , Tb ) . 2
S(A|Sb , Tb ) = P(A|B). Clearly, S(A|Sb , Tb ) ∈ C, as 2 − S(A|Sb , Tb ) < A, S(B|Sa , Ta ) <− B and S(A|Sb , Tb ) = S(B|Sa , Ta ). If C ∈ C, then C <− A and C <− B. So, C <− A, C(C) ⊆ C(B), and C(Ct ) ⊆ C(Bt ). So, C <− S(A|Sb , Tb ). Hence, A ∧ B = S(A|Sb , Tb ) = 2P(A|B). So by (i), we have
Theorem 12.2.10. Let A and B be matrices of the same order. Let Sa = C(A), Ta = C(At ), Sb = C(B) and Tb = C(Bt ). Further, assume that the shorted operators S(A|Sb , Tb ) and S(B|Sa , Ta ) both exist and are not equal. Then either A ∧ B does not exist or if it exists, it must be the null matrix. Proof. We first show that when S(A|Sb , Tb ) and S(B|Sa , Ta ) exist, then A ∧ B = S(A|Sb , Tb ) ∧ S(B|Sa , Ta ). Recall that S(A|Sb , Tb ) is the unique maximal element of {C : C <− A, C(C) ⊆ Sb , C(Ct ) ⊆ Tb } and S(B|Sa , Ta ) is that of {C : C <− B, C(C) ⊆ Sa , C(Ct ) ⊆ Ta }. Also, A ∧ B is the unique maximal element of C. Let D = A ∧ B. Clearly, D <− S(A|Sb , Tb ) and D <− S(B|Sa , Ta ), since D <− A and D <− B. Therefore, D <− S(A|Sb , Tb ) ∧ S(B|Sa , Ta ). However, S(A|Sb , Tb ) ∧ S(B|Sa , Ta ) <− D, so, S(A|Sb , Tb ) ∧ S(B|Sa , Ta ) = D = A ∧ B. Let (P, Q) be a rank factorization of S(A|Sb , Tb ). Since, C(S(A|Sb , Tb )) = C(S(B|Sa , Ta )) = C(A)∩C(B) and C(S(A|Sb , Tb ))t = C(S(B|Sa , Ta ))t = C(At ) ∩ C(Bt ), by Theorem 3.2.6(iii), there exists a non-singular matrix T such that S(B|Sa , Ta ) = PTQ for some non-singular matrix T. It follows that without any loss of generality we may take A = I and B to be any non-singular matrix. Let C be a non-null matrix such that C <− I and
324
Matrix Partial Orders, Shorted Operators and Applications
C <− B. We consider two cases namely: (i) when B and C do not commute and (ii) when they commute. Case 1: Let B and C do not commute. Then C0 = BCB−1 and C are distinct matrices with same rank and satisfy C0 <− I and C0 <− B. As C0 and C are non-comparable, C can not have a unique maximal element. Case 2: Let B and C commute and X be a matrix of appropriate order such that (I − C)XC 6= 0 and det(I + (I − C)X) 6= 0. Now, let C0 = C + (I − C)XC = (I + (I − C)X)C. Clearly, ρ(C0 ) = ρ(C). Also, C0 is idempotent as C is idempotent. So, C0 <− I. It is, now, easy to see that C0 = C0 B−1 C0 , and therefore, C0 <− B. Also, C0 and C are noncomparable. Thus, once again C can not have a unique maximal element. We next show that the only situation in which A ∧ B is non-null occurs when S(A|Sb , Tb ) and S(B|Sa , Ta ) are multiples of each other with multiplication factor 6= 1. Once again, we take A = I and B to be any non-singular matrix. The following theorem identifies the situation when A ∧ B is a non-null matrix. Theorem 12.2.11. Let B be a square matrix such that either (a) B is a non-null idempotent or (b) B2 6= kB, for any scalar k. Then there exists a non-null idempotent matrix H such that H <− B. Proof. (a) If B is a non-null idempotent, take H = B and we are done. (b) Let B2 6= kB, for any scalar k i.e., B is not an idempotent. We shall actually manufacture a non-null idempotent matrix H of rank 1 such that H <− B. Let H = xyt be rank 1 matrix such that H <− B. Clearly, xyt B− xyt = xyt for each g-inverse B− of B. It follows that yt B− x = 1 for each ginverse B− of B. This is possible when x ∈ C(B) and y ∈ C(Bt ) and yt B− x = 1. Let x = Bu and yt = vt B. Then 1 = yt B− x = vt Bu. Also, H2 = H ⇒ yt x = 1 ⇒ vt B2 u = 1. If B2 6= kB, the equations vt Bu = 1 and vt B2 u = 1 clearly posses a solution for u, v. Let H = Buvt B. Thus, if S(A|Sb , Tb ) and S(B|Sa , Ta ) exist and unless they are multiples of each other with multiplication factor 6= 1, we can always find a non-null matrix that is dominated by A and B under the minus order. This guarantees the non-null A ∧ B.
Lattice Properties of Partial Orders
325
We now turn our attention to the case when one of S(A|Sb , Tb ) and S(B|Sa , Ta ) does not exist. We will show that A ∧ B exists but under some conditions. Before we can proceed to do so, we need some preparation. Lemma 12.2.12. Let (L, R) be a rank factorization of a square matrix A. Then A is idempotent if and only if RL = I. Proof is trivial. Theorem 12.2.13. Let E1 be a matrix of order k × m and rank r. Let (L1 , R) be a rank factorization of E1 and R = (R1 : R2 ) be a partition of R, where R1 is of order r× k.then there exists a matrix E2 of order E1 (m − k) × m such that E = is idempotent of rank r if and only if E2 C(I − R1 L1 ) ⊆ C(R2 ). Proof. ‘If’ part Let C(I − R1 L1 ) ⊆ C(R 2 ), so, I − R1 L1 = R1 X for some matrix X. E1 Let E2 = XR and E = . Then E2 L1 L1 L1 L1 E2 = R R= (R1 : R2 ) (R1 : R2 ) = E, X X X X L1 since, (R1 : R2 ) = I. X ‘Only if’ part E1 Let E = be an idempotent matrix of rank r. We first notice that E2 0 00 E1 = (L1 R1 : L1 R2 ). Partition the matrix E2 as E2 = (E2 : E2 ). Then L1 R1 L1 R2 the matrix E = is idempotent. So, E02 E002 L1 R1 E02 = L1 (I − R1 L1 )R1 , L1 R2 E002 = L1 (I − R1 L1 )R2 , E02 L1 R1 + E002 E02 = E02 and E02 L1 R2 + E002 E002 = E002 . Thus, L1 R2 E2 = L1 (I − R1 L1 )R. As L1 has a left inverse and R has a right inverse, we see that C(I − R1 L1 ) ⊆ C(R2 ). Theorem 12.2.14. In the setup of Theorem 12.2.13, there exists a matrix E1 E2 of order (m − k) × m such that E = is idempotent of rank r + s, E2 s > 0 into which E1 is embedded if and only if C(I − R1 L1 ) ⊆ C(R2 ) and s ≤ (m − k) − ρ(R2 ).
326
Proof.
Matrix Partial Orders, Shorted Operators and Applications
‘Only if’ part
E1 Let there exists a matrix E2 of order (m − k) × m such that E = is E2 idempotent of rank r + s, s > 0 into which E1 is embedded. We will construct a rank factorization (S, T) of E using the rank factorization (L1 , R) of E1 as in Theorem 12.2.13. Since E has rank r + s, C(Rt ) 6= C(Et ). So, we must add additional rows to R so that C(Rt ) = C(Et ). We add terminal R s rows forming a matrix say, Y to R such that the matrix is a matrix Y of rank r + s and is the right factor in our proposed factorization. Since the first k rows of E are also the first k rows of E1 , the rows of R form a basis of the row span of these k rows of E. So, in the left factor of E we can take the first k rows of L1 followed by s null rows. We can write L1 L2 R1 R2 0 E= 0 , where L2 is some k × s matrix and 0 Y1 Y2 ◦ ◦ ◦ is some (m − k) × s matrix ◦ being some appropriate matrix. Since E is an L1 L2 R1 R2 idempotent by Lemma 12.2.12, 0 0 = Ir+s . So, Y1 Y2 ◦ ◦ L 2 0 0 R1 L1 + R2 = Ir , Y1 L2 + Y2 = Is , R 0 = 0 ◦ ◦ ◦ and L 1 Y 0 = 0. ◦ L2 0 Now, R1 L1 + R2 = Ir ⇒ C(I − R1 L1 ) ⊆ C(R2 ) and R 0 = 0 ◦ ◦ ⇒ the last s columns of the left factor are in null space of R. Therefore, s ≤ d(N (R2 )) = (m− k) − ρ(R2 ) and so, we can take the last s columns 0 of the left factor as , where 0 is the null matrix of order k × s and Z Z is the matrix of order (m − k) × s, whose columns arelinearly independent L1 0 vectors from the null space of R2 . Thus, S = for some matrix X. X Z ‘If’ part Let C(I − R1 L1 ) ⊆ C(R2 ) and s ≤ (m − k) − ρ(R2 ) hold. By The-
Lattice Properties of Partial Orders
327
orem 12.2.13, E1 can be embedded in an idempotent of rank r and R2 K = I − R1 L1 for some matrix K of order (m − k)× r. Let Y be R some matrix of order s × m to be determined, so that T = is of order Y (r + s) × m and of rank r + s. Choose s columns from a basis of N(R2 ) and L1 0 let Z be a matrix of order (m − k) × s formed by them. Let S = . K Z Then E = ST is of order m × m. Clearly, ρ(S) =r + s. By Lemma 12.2.12, R L1 0 E is idempotent ⇔ TS = Ir+s ⇔ = Ir+s . Partition Y as Y K Z (Y1 : Y2 ), where Y1 is of order s × k and Y2 is of order s × (m − k). Now, R L1 0 = Ir+s Y K Z R1 R2 L1 0 ⇔ = Ir+s Y1 Y2 K Z ⇔ R2 K = Ir − R1 L1 , R2 Z = 0, Y1 L1 + Y2 K = 0 and Y2 Z = Is . L1 The equation Y1 L1 + Y2 K = 0 is equivalent to Y = 0. Both the K L1 0 equations Y = 0 and Y = I are consistent individually and K Z jointly, being linearly independent, hence form a solution for Y. Thus, with this solution as Y, the matrix T is the right factor and E = ST is idempotent of rank r + s in which E1 is embedded. Finally we are ready to study the conditions under which A ∧ B exists. Theorem 12.2.15. Let A and B be matrices of the same order. Let Sa = C(A), Ta = C(At ), Sb = C(B) and Tb = C(Bt ). Further, assume that the shorted operator S(B|Sa , Ta ) does not exist. Then the following hold: (i) There exists a matrix C such that C <− B (ii) There exists an idempotent C such that C <− B and C <− A and (iii) If there exists a unique idempotent C such that C <− B and C <− A, then C = A ∧ B, the infimum of A, B. I 0 Proof. There is no loss of generality if we take A = m and 0 0 B11 B12 B= . Since, S(B|Sa , Ta ) exists ⇔ C(B21 ) ⊆ C(B22 ) and B21 B22
328
Matrix Partial Orders, Shorted Operators and Applications
C(Bt12 ) ⊆ C(Bt22 ), so, either of the two conditions must fail to hold. Let us assume C(B21 ) ⊆ C(B22 ) but C(Bt12 ) " C(Bt22 ). Now, C(B21 ) ⊆ C(B22 ) ⇔ C(B21 , B22 )t ∩ C(I, 0)t = {0}. For, if (B21 , B22 )t x = (I, 0)t z, then Bt21 x = z and Bt22 x = 0. However, − t t t Bt22 x = 0 ⇒ Bt21 B−t 22 B22 x = (B22 B22 B21 ) x = 0 ⇒ B21 x = 0 ⇒ z = 0. t t Thus, (B21 , B22 ) x = 0 and (I, 0) z = 0. Conversely, let C(B21 , B22 )t ∩ C(I, 0)t = {0}. So, (B21 , B22 )t x = (I, 0)t z ⇒ (B21 , B22 )t x = 0 and (I, 0)t z = 0. For an arbitrary t t I − B22 B− B21 (I − B22 B− 22 22 ) vector u, let x = u and z = u. I − B22 B− 0 22 t t Then (B21 , B22 )t x = (I, 0)t z, so, (I − B22 B− 22 ) B21 u = 0. This gives − t t (I − B22 B22 ) B21 = 0 or equivalently, C(B21 ) ⊆ C(B22 ). Let xi2 , i = 1, . . . , ` be a basis of of N (Bt21 ). Each xi2 is an n−m-vector. 0 Then for each i = 1, . . . , `, the vectors xi = ∈ N (Bt12 , Bt22 ), where xi2 t t 0 is a null m-vector. Extend these ` vectors to a basis of N (B12 , B22 ), xi1 say by taking additional k vectors xi = , i = ` + 1, . . . , ` + k. Since xi2 xi , i = ` + 1, . . . , ` + k are linearly independent, xi1 , i = ` + 1, . . . , ` + k are also linearly independent. Choose xi1 , i = ` + k + 1, . . . , ` + m such that xi , i = ` + 1, . . . , ` + m form a basis of F m . Set xti1 B11 + xti2 B21 = zti−` for i = `+1, . . . , `+k. Note that each zti is an m-vector and are k in number. We define a matrix C11 as: xti1 C11 = zti−` for i = `+1, . . . , `+k. Such a matrix C11 0 C11 exists. Let C = . We show C(B − C)t ∩ C(I, 0)t = {0}. 0 0 y1 z1 t t Let x = (B − C) y = (I, 0) z. Partition y = and z as z = . y2 z2 Then (B11 − C11 )t y1 + Bt21 y2 = z1 and Bt21 y2 + Bt22 y2 = 0. So, y ∈ N (Bt21 , Bt22 ). Let y = Σ`+k α x . Then (B11 − C11 )t y1 + Bt21 y2 = z1 i=1 i i t (B11 − C11 ) I (B11 − C11 )t I `+k ⇒ y= z ⇒ Σi=1 αi xi = z 0 Bt21 0 Bt21 (B11 − C11 )t (B11 − C11 )t I xi + Σ`+k xi = z ⇒ Σ`i=1 αi t i=`+1 αi B21 Bt21 0 (B11 − C11 )t 0 (B11 − C11 )t xi1 ⇒ Σ`i=1 αi + Σ`+k t i=`+1 αi B21 xi2 Bt21 xi2 I = z. 0 It can now be seen that this gives x = 0.
Lattice Properties of Partial Orders
329
Similarly, C(B − C)t ∩ C(C11 , 0)t = {0}. Let Z be a matrix whose rows are the k zti in natural order as defined and r = ρ(Z). Then ρ(B) = n = (` + k) + r, for xti B = 0 for i = 1, . . . , `. B11 B12 t t t Also, xi B = (xi1 , xi2 ) = (zti−` , 0) for i = ` + 1, . . . , ` + m. B21 B22 Moreover, ρ(Z) = r, d(N (Z)) = k − r, so, xti B = 0 for exactly those i for which zti−` is a linear combination of r of linearly independent ztj . Such zti−` are k −r in number and form a basis of N (Z). So, there are `+k −r linearly independent vectors xti for which xti B = 0. Thus, d(N (Bt )) = ` + k − r, which gives ρ(B) = ρ(Bt ) = n − (` + k) + r. Notice that B11 − C11 B12 t t t xi (B − C) = (xi1 , xi2 ) = 0 for i = ` + 1, . . . , ` + m. B21 B22 So, d(N (Bt − Ct )) = ` + k, and therefore, ρ(B − C) = n − (` + k). Clearly, ρ(C11 ) ≥ r. We can choose C11 of rank r. (Since exactly r of zti−` are linearly independent, we take corresponding xti1 and choose C11 which will be of rank r.) Then B = C ⊕ (B − C), so, C <− B. (ii) We first show that the choice of C11 is independent of choice of a basis of N (Bt12 ,Bt22). Let yi , i = 1, . . . , ` + k be another basis of N (Bt12 , Bt22 ), 0 and yi = , i = 1, . . . , `. Then each yi is a linear combination of xi , yi2 i = 1, . . . , ` + k. So, y`+1 = Σ`+k i=1 βi xi for some scalars βi . Consider t t t t t w1 = y(`+1)1 B11 + y(`+1)2 B21 = Σ`+k i=1 βi (xi1 B11 + xi2 B21 ) `+k t t t = Σ`+k i=`+1 βi (xi1 B11 + xi2 B21 ) = Σi=`+1 βi zi−` , as C(B21 ) ⊆ C(B22 ). Thus, `+k t t t w1t = Σ`+k i=`+1 βi zi−` = Σi=`+1 βi xi1 C11 = y`+1 C11 . t Therefore, C11 satisfies the same conditions in terms of yi1 as it did in t terms of xi1 . We next show that we can choose C11 to be an idempotent of rank r. This will show C <− A. As C <− B and C is unique, whence (iii) follows. So, let the vectors xti1 be arranged in their natural order and form the rows of of a matrix M. Clearly, M is non-singular as xti1 for i = ` + 1, . . . , ` + m Z form a basis of F m . Let T be a matrix such that MC11 = . We have to T Z −1 Z find a matrix T such that is a square matrix of rank r and M T T Z Z Z is idempotent say E. Notice that E= =E . Having found T T T a matrix T, we let (L1 , R) be a rank factorization of ZM−1 = Z1 and
330
Matrix Partial Orders, Shorted Operators and Applications
R = (R1 : R2 ) be a partition of R, where R1 is of order r × (m − k) and has full column rank. By Theorem 12.2.14, there exists a unique T1 such Z1 that E = is idempotent of rank r. Choose C11 as M−1 EM. Then T1 C11 is the unique idempotent of rank r. Now (iii) follows. We now give an example of a pair of matrices A and B to show when S(A|Sb , Tb ) and S(B|Sa , Ta ) do not exist, the infimum A ∧ B also does not exist. 1 0 1 1 0 0 Example 12.2.16. Let A = 0 1 0 and B = 0 1 0 be the matri0 0 0 0 0 0 ces as in Example 12.2.1. Then S(A|Sb , Tb ) and S(B|Sa , Ta ) both do not 1 0 0 0 0 0 exist. Also, C = 0 0 0 and D = 0 1 0 are such that C <− A, 0 0 0 0 0 0 C <− B, D <− A, and D <− B. But C and D are not comparable under the minus order. So, the infimum A ∧ B, of A and B also does not exist. The following theorem gives an upper bound for A ∧ B, the infimum of A and B. Theorem 12.2.17. Let A and B be matrices of the same order. If for some γ 6= 0, 6= 1 γ −1 A and (1 − γ)−1 B are parallel summable, then A ∧ B <− P(γ −1 A, (1 − γ)−1 B). Proof. Let C = A ∧ B. Then C <− A and C <− B. So, {A− } ⊆ − {C− } and {B− } ⊆ {C− }. By Theorem 9.2.14, {P(γ −1 A, (1 − γ)−1 B) } = − − − − −1 −1 {A } + {B } ⊆ {C }. Hence, C = A ∧ B < P(γ A, (1 − γ) B), by using Theorem 3.3.5.
12.3
Supremum and infimum under the star order
We have seen in the last section that under the minus order F m×n fails to be even a semi-lattice in general. However, the same is not the case with Cm×n under the star order. We show here that under star order, Cm×n is a lower semi-lattice but is not an upper semi-lattice for, if A, B is a pair of matrices, with A or B, a maximal element of Cm×n under the star order,
Lattice Properties of Partial Orders
331
then there is no matrix C that can dominate A and B and hence A ∨ B cannot exist. We first note that for any two orthogonal projectors E and F, E ∨ F and E ∧ F always exist. (See Ex. 12.4.) To show the existence of the infimum of any two matrices, we first proceed to show the existence of the infimum when both matrices are partial isometries. Then by using a special type of representation of matrices known as the Penrose decomposition, we establish the existence of infimum of arbitrary matrices. All matrices in this section are complex matrices unless stated otherwise. We start with the following lemmas: L 0 Lemma 12.3.1. Let A and B = be matrices of the same order. 0 0 K 0 Then A B if and only if A = with K L, and all partitions 0 0 involved are conformable for matrix operations. Proof. ‘If’ part is trivial. ‘Only if’ part K M Let A = be the partition of A conformable with the given partiN T tion of B. Then A B ⇒ A <s B ⇒ M = 0, N = 0 and T = 0. Now, the result follows immediately. Lemma 12.3.2. Let Bi be matrices of order mi ×n for each i = 1, . . . , s and C = {E ∈ Cn×n : E is an orthogonal projector and Bi E = 0 for each i}. Then C has the infimum given by an orthogonal projector F on the space ∩si=1 N (Bi ). Proof. Since Bi E = 0 ⇒ C(E) ⊆ N (Bi ) for each i, so, E ∈ C ⇔ C(E) ⊆ ∩si=1 N (Bi ). Let F be a projection on the space ∩si=1 N (Bi ). Clearly, F ∈ C and for each E ∈ C, C(E) ⊆ C(F). Now, E F ⇔ C(E) ⊆ C(F), so, F is an upper bound of C. It is trivial to see that F is the unique maximal element of C. Recall a partial isometry is a matrix A ∈ Cm×n if AA? A = A, equivalently A? = A† . If A is a partial isometry, then its singular value decom I 0 r position is X? AY = , where r = ρ(A), matrices X and Y are 0 0 unitary. Lemma 12.3.3. Let A and B be matrices of the same order such that
332
Matrix Partial Orders, Shorted Operators and Applications
Ir 0 , where r = ρ(A), X and 0 0 L M Y are unitary. Further, let X? BY = , where partitions are N T conformable for matrixoperations. Then A ∧ B exists and is given Q 0 by A ∧ B = X Y? , where Q is the orthogonal projector onto 0 0 N (I − L) ∩ N (I − L)? ∩ N (M)? ∩ N (N). A is an isometry.
Let X? AY =
? ? ? Now, A B ⇔ X AY < XBY, since X and Y are unitary. Ir 0 L M So, we may take A = and B = . By Lemma 12.3.1, 0 0 N T Ir 0 E 0 L M ? ? C< ⇔C= for some projector E and C < 0 0 0 0 N T ⇔ LE = E = EL, EM = 0 and NE = 0 ⇔ C(E) ⊆ N (I − L) ∩ N (I − L)? ∩ N (M)? ∩ N (N). Let Q be the orthogonal projector on
Proof.
N (I − L) ∩ N (I − L)? ∩ N (M)? ∩ N (N). Clearly, C(E) ⊆ C(Q). Thus, A ∧ B exists and is given by Q 0 A∧B=X Y? . 0 0
We now study the existence of the infimum for two arbitrary matrices. For this, we need a representation of matrices known as the Penrose decomposition. We describe it below. Let A be an m × n matrix. Consider the singular value decompoD 0 sition A = X Y? , where X ∈ Cm×m , Y ∈ Cn×n are uni0 0 tary, D = diag(σ1 Ir1 , . . . , σk Irk ) ∈ Cr×r and σi are distinct eigen-values of A such that r = r1 + . . . rk = ρ(A). Partition X, Y conformably with the block diagonal decomposition of D, say X = (X1 , . . . , Xk+1 ) and Y = (Y1 , . . . , Yk+1 ). Define Uσi = Xdiag(0, . . . , 0, Ir1 , 0, . . . , 0) for each i = 1, . . . , k, and Uα = 0 for each positive real α, if α is not a singular value of A. Then we can write A = Σki=1 σi Xi Yi? = Σσ>0 αUα
(12.3.1)
where Uα are non-zero only for finitely many values of α and are pairwise ?-orthogonal partial isometries i.e., Uα U? β Uα = Uα , Uα U? β = 0 and
Lattice Properties of Partial Orders
333
U? α Uβ = 0 for α 6= β. The representation in (12.3.1) is called the Penrose decomposition of A. Theorem 12.3.4. Let A and B be matrices of the same order having Penrose decompositions A = Σα>0 αUα and B = Σβ>0 βVβ respectively. Consider the following statements: (i) A B, (ii) Uα Vα for each positive real α and (iii) For all α, β, α 6= β, U? α Vβ = 0 and Uα V? β = 0, i.e., Uα and Vβ are ?−orthogonal. Then (i) ⇔ (ii). Also, any of (i) or (ii) ⇒ (iii). Proof. We first prove (i) ⇒ (iii). By (i), A? A = A? B and AA? = BA? . A? A = A? B ⇒ Uα A? A = Uα A? B ⇒ α2 Uα = αUα U?α B. Also, BVβ? = βVβ Vβ? . Therefore, αUα Vβ? = βUα U?α Vβ Vβ? . Similarly, αU? Vβ = βU?α Uα Vβ? Vβ . So, α2 U?α Vβ = βU?α (αUα Vβ? )Vβ = β 2 U?α (Uα U?α Vβ Vβ? )Vβ = β 2 U?α Vβ . Since α, β are positive real numbers and α 6= β, we have U?α Vβ = 0. Similarly, Uα Vβ? = 0. (i) ⇒ (ii). From (i) ⇒ (iii), we have αUα = Uα U?α B. Pre-multiplying by U?α and using (iii), U?α αUα = U?α Uα U?α B = U?α B = αU?α Vα . As α 6= 0, we have U?α Uα = U?α Vα . Similarly, Uα U?α = Vα U?α . Therefore, Uα Vα . (ii) ⇒ (i) is easy. Corollary 12.3.5. Let A = Σα>0 αUα and B be matrices of the same order. Then A B if and only if α(Uα B) for each α. Theorem 12.3.6. Let A and B be matrices of the same order having Penrose decompositions A = Σα>0 αUα and B = Σβ>0 βVβ respectively. Then A ∧ B exists and is given by A ∧ B = Σα>0 αUα ∧ Vα . Proof. Let Wα = Uα ∧ Vα . Each Wα exists by Lemma 12.3.2 and is a partial isometry. Also, Wα Uα and Uα is a partial isometry for each α. Let D = Σα>0 αWα . We show that D is the infimum of A and B. Since
334
Matrix Partial Orders, Shorted Operators and Applications
? ? ? ? for each α, Wα Uα , we have Wα Wα = Wα Uα , Wα Wα = Uα Wα . ? ? ? ? So, Wα Wβ = Wα Wα Wα Wβ Wβ? Wβ = Wα Wα U?α Uβ Wβ? Wβ = 0 for α 6= β, since U?α Uβ = 0. Similarly, Wα Wβ? = 0. Let C = ΣαTα be an m × n matrix such that C A and C B. Then by Theorem 12.3.4, Tα Uα , Tα Vα for each α. So, Tα Wα for each α. Once again by Theorem 12.3.4, C D, and hence D = A ∧ B.
As an immediate consequence of Theorem 12.3.6, we have Theorem 12.3.7. Let A and B be matrices of the same order. Then (i) (A ∧ B)? = A? ∧ B? , (ii) (A ∧ B)† = A† ∧ B† and (iii) α(A ∧ B) = αA ∧ αB. Thus, it follows that the infimum of two hermitian matrices is hermitian. We have expressed the infimum of given matrices in terms of some partial isometries. The following theorems express A B in terms of A, B, A? and B? . Theorem 12.3.8. Let A and B be matrices of the same order. Let E = I − (I − A† B)(I − A† B)† and F = I − (I − BA† )† (I − BA† ). Then the following are equivalent: (i) AE = FA (ii) A(I − E) = (I − F)A and (iii) C(A(I − A† B)) ⊆ C(I − BA† )? and C((I − BA† )A)? ⊆ C(I − A† B). When this is so, A ∧ B = AE = FA. Proof. Notice that both E and F are projectors. (i) ⇔ (ii) is easy. (ii) ⇒ (iii) Since A(I − E) = (I − F)A, we have A(I − A† B)(I − A† B)† = (I − BA† )† (I − BA† )A. Post-multiplying (12.3.2) by I − A† B, we have A(I − A† B) = (I − BA† )† (I − BA† )A(I − A† B).
(12.3.2)
Lattice Properties of Partial Orders
335
Thus, C(A(I − A† B)) = C((I − BA† )† (I − BA† )A(I − A† B)) ⊆ C(I − BA† )? . Similarly, pre-multiplying (12.3.2) by I − BA† and taking transposes we have C((I − BA† )A)? ⊆ C(I − A† B). (iii) ⇒ (ii) Now, C(A(I − A† B)) ⊆ C(I − BA† )? ⇒ A(I − A† B) = (I − BA† )? (I − BA† )?† A(I − A† B) = ((I − BA† )† (I − BA† ))? A(I − A† B) = (I − BA† )† (I − BA† )A(I − A† B). Post-multiplying by (I − A† B)† , we have A(I − A† B)(I − A† B)† = (I − BA† )† (I − BA† )A(I − A† B)(I − A† B)† . Also, (I − BA† )A = (I − BA† )A(I − A† B)(I − A† B)† , since C((I − BA† )A)? ⊆ C(I − A† B). So, A(I − A† B)(I − A† B)† = (I − BA† )† (I − BA† )A or equivalently, A(I − E) = (I − F)A. Thus, (ii) holds. It is easy to see that AE A and AE B. To show that A ∧ B = AE, let C A and C B. Since C A, we have C† A† ⇒ CC† = CA† and C† C = A† C. Similarly, CC† = CB† and C† C = B† C. So, CA† B = CC† B = CC† C = C. Similarly, BA† C = C. Therefore, CE = C = FC, so, CC† AE = CE = C = FC = FAC† C = EAC† C. Thus, C AE, hence the result. Theorem 12.3.9. Let A and B be matrices of the same order. Let G = I − (A − B)† (A − B) and H = I − (A − B)(A − B)† . Then the following are equivalent: (i) (ii) (iii) (iv)
AG = HA A(I − G) = (I − H)A C(A(A? − B? )) ⊆ C(A − B) C((A? − B? )A)? ⊆ C(A − B)? A(A − B)† B = B(A − B)† A.
When this is so, A ∧ B = AG = HA.
and
336
Matrix Partial Orders, Shorted Operators and Applications
Proof. (i) ⇔ (ii) is easy. (ii) ⇔ (iv) is obvious. (ii) ⇒ (iii) Since (ii) holds, A(A − B)† (A − B) = (A − B)(A − B)† A. Post-multiplying by (A − B)? , we have A(A − B)† (A − B)(A − B)? = (A − B)(A − B)† A(A − B)? and therefore, A(A − B)? = (A − B)(A − B)† A(A − B)? ⇒ C(A(A? − B? )) ⊆ C(A − B). Now, C((A? − B? )A)? = C(A? (A − B)) = C(A? (A − B)(A − B)† (A − B)) = C(A? ((A − B)(A − B)† )? (A − B)) = C(((A − B)(A − B)† )A)? (A − B) ⊆ C(A − B)? . Remaining proof can be completed along the lines of the proof in Theorem 12.3.8. We have studied the shorted matrix defined through minus partial order in Chapter 11 at length. There is really nothing particularly special about the minus order. As a matter of fact one can use any of the partial orders studied till now and define a shorted matrix. In the remainder of this section we briefly study the shorted matrix defined through the star order. Definition 12.3.10. Let A be an m×n matrix. Let S and T be subspaces of Cn and Cm respectively. Let D = {C ∈ Cm×n : C A, C(C) ⊆ S and C(C? ) ⊆ T }. The unique maximal element of D is called the shorted matrix of A relative to the subspaces S and T . We denote this shorted matrix by S? (A|S, T ). Recall that S(A|S, T ), the shorted matrix of A relative the subspaces S and T under minus order does not always exist. However, we show in our next theorem that S? (A|S, T ) always exists. Theorem 12.3.11. Let A be an m × n matrix. Let S and T be subspaces of Cn and Cm respectively. Then the shorted matrix S? (A|S, T ) exists. Proof. Let D = {C ∈ Cm×n : C A, C(C) ⊆ S and C(C? ) ⊆ T }. and C ∈ D. Since C A, there exist unitary matrices X and Y
Lattice Properties of Partial Orders
337
such that C = Xdiag(Da , 0, 0)Y? and A = Xdiag(Da , Db , 0)Y? , where a = ρ(C), a + b = ρ(A), Da and Db are positive definite diagonal matrices. Also, C = AC† C = CC† A. Let P = C† C and Q = CC† . Then P and Q are orthogonal projectors onto C(C? ) and C(C) respectively. Write Da = diag(σ1 Ir1 , . . . , σm Irm ) and diag(Da , Db ) = diag(σ1 Ir1 , . . . , σk Irk ), where a = r1 + . . . + rm and a + b = r1 + . . . + rk . Partition X, Y conformably with the block diagonal decomposition of diag(Da , Db ), say X = (X1 , . . . , Xk+1 ) and Y = (Y1 , . . . , Yk+1 ). Then A = Σki=1 σi Xi Yi? , for each i, X?i Xi = I = Yi? Yi and for each i 6= j, X?i Xj = 0 = Yi? Yj . Also, if X0i = (0, . . . , Xi , . . . , 0), then each X0i X. Since P = C† C, we can write P = Σki=1 Yi Pi Yi? , where Pi are orthogonal projectors. This gives C = AP = Σki=1 σi Xi Pi Yi? . For each i, we define projectors Ei and Fi as follows: Clearly, CYi = σi Xi Pi and C(Xi Pi ) = C(σi Xi Pi ) = C(CYi ) ⊆ C(C) ⊆ S. Thus, there exists a projector Pi such that C(Xi Pi ) ⊆ S. So, let Ei be the maximal projector such that C(Xi Ei ) ⊆ S. Similarly, let Fi be the maximal projector such that C(Yi Fi ) ⊆ T . The maximal C is the one with the Pi ’s selected as large as possible such that Pi Ei and Pi Fi . Since Pi , Ei and Fi are projectors, C(Pi ) ⊆ C(Ei ) and C(Pi ) ⊆ C(Fi ). Therefore, the maximal Pi that can be selected is Ei ∧ Fi = 2P(Ei |Fi ). We show that Z = 2Σki=1 σi Xi P(Ei |Fi )Yi? is the maximal element of D. Clearly, C(Z) ⊆ S and C(Z? ) ⊆ T . Also, ZZ? = 2Σki=1 σ 2 i Xi P(Ei |Fi )Yi? , since, 2P(Ei |Fi ) is a projector onto C(Ei ) ∩ C(Fi ). Moreover, AZ? = 2Σki=1 σ 2 i Xi P(Ei |Fi )Yi? , so, ZZ? = AZ? . Similarly, Z? Z = Z? A. By the choice of Ei and Fi , Z is a maximal element of D. Thus, S? (A|S, T ) exists and is given by S? (A|S, T ) = 2Σki=1 σi Xi P(Ei |Fi )Yi? . Remark 12.3.12. The expression 2Σki=1 σi Xi P(Ei |Fi )Yi? provides a singular value decomposition of S? (A|S, T ), since 2P(Ei |Fi ) is a projector onto C(Ei ) ∩ C(Fi ) and we can write 2P(Ei |Fi ) = Di D?i with D?i D = I. Remark 12.3.13. Since C A ⇒ C <− A, it follows that S? (A|S, T ) <− S(A|S, T ). It is natural to ask when S? (A|S, T ) = S(A|S, T )? Except for trivial cases (e.g. conditions under which the star order and the minus order are equivalent) the answer is not known to us. Theorem 12.3.14. Let A be an m × n matrix. Let S and T be subspaces of Cn and Cm respectively and S? (A|S, T ), the shorted matrix of A relative to S and T . Then the following hold: (i) (S? (A|S, T ))? = S? (A? |S, T ).
338
(ii) (iii) (iv) (v)
Matrix Partial Orders, Shorted Operators and Applications
If A is hermitian, then so is S? (A|S, T ). If A is nnd, then S? (A|S, T ) is also nnd. (S? (A|S, T ))† = S? (A† |S, T ) and If S 0 , T 0 are subspaces of Cn and Cm S? (S? (A|S, T )|S 0 , T 0 ) = S? (A|S ∩ S 0 , T ∩ T 0 ).
respectively,
then
Proof. (i) In notations of Theorem12.3.11 the result follows from (a) A and A? have same singular values and (b) P(Ei |Fi ) = P(Fi |Ei ). (ii) If A is hermitian, then σi are real and X = Y. The proof now follows from (i). (iii) is trivial. D 0 ? (iv) If A has singular value decomposition given as X AY = , 0 0 D 0 where D is a positive definite diagonal matrix, then A† = Y X? is 0 0 a singular value decomposition for A† . Also, C A ⇔ C† A† . Thus, S? (A† |T , S) = 2Σki=1 σ −1 i Yi P(Fi |Ei )X?i = 2Σki=1 σ −1 i Yi P(Ei |Fi )X?i . Now by direct verification we have S? (A† |T , S) is a Moore-Penrose inverse of S? (A|S, T ). (v) First note that each side has same column space S ∩ S 0 ∩ C(A) and same row space T ∩ T 0 ∩ C(A? ). So, ρ(S? (S? (A|S, T )|S 0 , T 0 )) = ρ(S? (A|S ∩ S 0 , T ∩ T 0 )). Moreover, S? (S? (A|S, T )|S 0 , T 0 ) = max{T}, and S? (A|S ∩ S 0 , T ∩ T 0 ) = max{E} where T = {C ∈ Cm×n : C S? (A|S, T ), C(C) ⊆ S 0 , C(C? ) ⊆ T 0 } and {E} = {C ∈ Cm×n : C A, C(C) ⊆ S ∩ S 0 , C(C? ) ⊆ T ∩ T 0 }. Let C be such that C S? (A|S, T ), C(C) ⊆ S 0 and C(C? ) ⊆ T 0 . Since S? (A|S, T ) A, we have C A. Also, C(C) ⊆ C(A) ∩ S, so, C(C) ⊆ C(A) ∩ S ∩ S 0 . Similarly, C(C? ) ⊆ C(A) ∩ T ∩ T 0 . Therefore C ∈ {C ∈ Cm×n : C A, C(C) ⊆ S ∩ S 0 , C(C? ) ⊆ T ∩ T 0 } and hence, C S? (A|S ∩ S 0 , T ∩ T 0 ). This further gives S? (S? (A|S, T )|S 0 , T 0 ) S? (A|S ∩ S 0 , T ∩ T 0 ). Thus, S? (S? (A|S, T )|S 0 , T 0 ) = S? (A|S ∩ S 0 , T ∩ T 0 ).
12.4
Infimum under the sharp order
In this section we discuss the existence of A ∧ B and A ∨ B for matrices A, B under the sharp order. In general neither the supremum A ∨ B nor
Lattice Properties of Partial Orders
339
the infimum A ∧ B may exist. Supremum may fail to exist for obvious reason and the following example shows that the infimum may also fail to exist in general: 0 0 0 0 1 1 1 −2 0 1 0 0 1 2 1 −2 4×4 . Example 12.4.1. Let A = 1 1 2 −2 and B = 0 0 1 0 ∈ C 0 0 0 1 1 1 1 −1 1 1 Note that A = B + K, where K = 1 (1, 1, 1, − 2). If C is a matrix 1 dominated by B under the minus order, then C must be an idempotent 0 0 matrix of the form , where H is an idempotent of order 3×3. Now, if 0 H C <# A and C <# B, then KC = 0 and CK = 0. So, (1, 1, 1, −2)C = 0 1 1 1 and C = 0. This implies that (1, 1, −2)H = 0 and H 1 = 0. Thus, 1 1 1 columns of H are orthogonal to (1, 1, − 2)? and rows of H are orthogonal 1 1 1 to 1 . So, C(H) ⊆ C(L) and C(H? ) ⊆ C(M? ), where L = 1 −1 and 1 1 0 1 1 M? = 1 −1 , and therefore, H = LZM for some matrix Z. Since H is 2 0 idempotent and L and M are full rank matrices, so, ρ(H) ≤ ρ(ML) = 1 and Z = ZMLZ. To choose C of maximum rank in C,we must choose 2ab b − Z ∈ {(ML)r }. A general solution to Z is given by Z = , where a 12 a, b are arbitrary complex numbers. Thus, there are several choices for matrix C and so, A ∧ B does not exist. Remark 12.4.2. In Example 12.4.1, B <− A, so the minus infimum of A, B exists and is equal to B. Thus, infimum under different partial orders may be different. So one may ask: When do infimum and supremum under various partial orders coincide? We now give a necessary and sufficient conditions under which the infimum under sharp order exists.
340
Matrix Partial Orders, Shorted Operators and Applications
Theorem 12.4.3. Let A, B be square matrices having the same order and of index ≤ 1, over an arbitrary field.Let ρ(A) = m. Let us first assume Im 0 that A = . Let U and V be matrices of order m × r and m × s 0 0 respectively such that U V t t t N (B − A) ∩ C(A) = C and N (B − A ) ∩ C(A ) = C , 0 0 where r = ρ(N (B − A) ∩ C(A)) and s = ρ(N (Bt − At ) ∩ C(At )). Then A ∧ B exists if and only if r =s and Vt U is invertible. When these U(Vt U)−1 Vt 0 conditions are satisfied, A ∧ B = . 0 0 Proof. Let C be a matrix of index 1. Then C <# A and C <# B ⇔ (A − C)C = C(A − C) = 0, (B − C)C = C(B − C) = 0 ⇔ (A − C)C = C(A − C) = 0, (B − A)C = C(B − A) = 0 ⇒ C(C) ⊆ N (B − A) ∩ C(A) and C(Ct ) ⊆ N (Bt − At ) ∩ C(At ) ⇒ C(C) ⊆ C
U V and C(Ct ) ⊆ C 0 0
⇒C=
U4Vt 0 . 0 0
Also, C2 = AC ⇒ ∆Vt U∆ = ∆. So, ∆ = P(QVt UP)− r Q for some P and Q. If r = s and Vt U is invertible, then we have U∆Vt 0 t # t −1 t V . Thus, has maximal UP(QVt UP)− r QV < U(V U) 0 0 U(Vt U)−1 Vt 0 t −1 rank when ∆ = (V U) and therefore, A ∧ B = . 0 0 Conversely, if either r = s and ρ(Vt U) < r or r 6= s, then for each t t choice of matrices P and Q, ρ(UP(QVt UP)− r QV ) ≤ ρ(V U), so, a unique matrix C can not be found and therefore, A ∧ B does not exist. We now consider the case when A does not have special form of Theorem 12.4.3. Consider the setup in Theorem 12.4.3 except that A is an arbitrary square matrix. Let (L, Rt ) be a rank factorization of A. As A is of
Lattice Properties of Partial Orders
341
index 1, Rt L is invertible with ρ(Rt L) = ρ(A). Let the columns of LU form a basis of N (B − A) ∩ C(A) and rows of RV form a basis of N (Bt − At ) ∩ C(At ). Further, let U0 be a matrix columns of which form a basis of C(U) and V0 be a matrix columns of which form a basis of C(V). Then U0 is maximal being invariant under Rt L and V0 is maximal being invariant under Lt R. Under this setup we have Theorem 12.4.4. A ∧ B exists if and only if V0t U0 is invertible and when this condition is satisfied A ∧ B = LU0 (V0t U0 )−1 V0t Rt . Proof. Let C be a matrix of index 1. Then C <# A and C <# B ⇒ C = LHRt for some idempotent matrix H. Also, C <# A ⇔ (A − C)C = C(A − C) = 0 ⇔ (I − H)Rt LH = HRt L(I − H) = 0. Thus, for each Hy, Rt LHy = HRt LHy ∈ C(H). So, C(H) is invariant under Rt L. Similarly, C(Ht ) is invariant under Lt R. Now with H playing t −1 and the role of C, U0 of U and V0 of V, A0 = Im = L−1 ` A(R )r −1 −1 t −1 t −1 B0 = L` B(R )r , where L` is a left inverse of L and (R )r is a right inverse of Rt and we have the setup of Theorem 12.4.3. Therefore, A0 ∧ B0 exists if and only if V0t U0 is invertible. When V0t U0 is invertible, A0 ∧ B0 = U0 (V0t U0 )−1 V0t . Also, if C0 <# A0 , then LC0 Rt <# LARt . It is now easy to see that A ∧ B = LU0 (V0t U0 )−1 V0t Rt .
342
Matrix Partial Orders, Shorted Operators and Applications
12.5
Exercises
(1) Let A and B be matrices of the same order such that AB? and A? B are hermitian. Show that A ∧ B = A(I − (I − A† B)(I − A† B)† ) = (I − (I − BA† )† (I − BA† ))A. (2) Let A and B be matrices of the same order such that A and B commute and A − B is range-hermitian. Show that A ∧ B = A(I − (I − A† B)(I − A† B)† ). (3) Let A and B be unitary. Show that A ∧ B = A(I − (I − A† B)(I − A† B)† ) = (I − (I − BA† )† (I − BA† ))A. (4) Let E and F be orthogonal projectors. Show that E ∨ F = † † (E + F)(E + F) = (E + F) (E + F) and E ∧ F = 2P(E, F). (5) Prove or disprove the following statement: Let A = A0 ⊕ A1 and B = A0 ⊕ kA1 , k 6= 1. Then A ∨ B = A. (6) Let ‘
Chapter 13
Partial Orders of Modified Matrices
13.1
Introduction
The importance of studying partial orders and shorted operators of modified matrices lies in its potential for applications to Statistics and Networking. For example when there are model or data changes in a linear model, both partitioned matrices and rank 1 modification of a matrix play a big role in obtaining the revised inferences. Also, in electrical network theory when a new port is added or an existing port is deleted in an n-port network, the modified matrices play an important role in studying its physical properties. So, it is of interest to study the behavior of the partial orders of modified matrices. More specifically: Let X, Y be a pair of matrices and X < Y, where ‘<’ is some partial order on matrices. Suppose we append/delete a column/row to each of X, Y and denote the new pair by X1 , Y1 in either situation. We are interested in knowing if X1 < Y1 and if not, what conditions will ensure that it will be so. Since the modifications obtained by appending/deleting a row can be treated by taking the transposes of the corresponding modifications by appending/deleting columns, we confine only to the case in which the matrices under consideration are modified by appending/deleting a column. We also study the same problem when we add a rank 1 matrix to both of X and Y. In this chapter we study the partial orders of the modified matrices. The shorted operators of the modified matrices are dealt in Section 6 of Chapter 15. In Section 13.2, we study the space pre-order for both types of modifications in concerned matrices. We first obtain necessary and sufficient conditions when matrices are modified by adding/deleting a column. We next take up the case when matrices are modified by adding a rank 1 matrix. This case appears to be highly computational though not difficult
343
344
Matrix Partial Orders, Shorted Operators and Applications
to understand. In Section 13.3, we give a detailed account of happenings in relation to the minus order. In Section 13.4, the same problem for the sharp order is taken up. As before in all the three sections 13.2-13.4, we consider matrices over general fields. In Sections 13.5 and 13.6, we consider matrices over the field of complex numbers where we study the behavior of the star order and the L¨ owner order for both types of modifications. Enroute the study of various matrix order relations under both types of modifications, we see that all the conclusions are obtained after laborious computations, which are little more manageable in case of modifications obtained by appending a column than in case of modifications by adding rank 1 matrices. Keeping in view the length constraint of the monograph, we give proofs of some of basic results leading to all the other results and omit the proofs of others.
13.2
Space pre-order
Let A and B be matrices of order m × n over a field F (not necessarily field of complex numbers). Let a and b be m-column vectors over F. Let us write A1 = (A : a) and B1 = (B : b).
(13.2.1)
In this section we explore the conditions under which A1 <s B1 , given A <s B and vice versa. We start with Lemma the setup (13.2.1). Let A1 <s B1 and b ∈ / C(B). 13.2.1. Consider G H Let and be two g-inverses of B1 . Then ct A = 0 if and only if ct dt C(A) ⊆ C(B). If ct A = 0, then dt A = 0. G H Proof. Since and are g-inverses of B1 and b ∈ / C(B), ct dt by Theorem 2.6.8, it follows that G and H are g-inverses of B. t t t t s Also, c B= 0, d B = 0, and c b = 1d b. Since A1 < B1 , we have G (B : b) t (A : a) = (A : a) or equivalently BGA + bct A = A and c BGa + bct a = a. If ct A = 0, we have BGA = A, or equivalently C(A) ⊆ C(B). Since dt B = 0, it follows that dt A = dt BGA = 0. On the other hand, if C(A) ⊆ C(B), then bct A = 0. Since b 6= 0, it follows that ct A = 0.
Partial Orders of Modified Matrices
345
Theorem 13.2.2. Consider the setup (13.2.1). Then the following hold: − (i) Let b ∈ C(B). Then A1 <s B1 if and only if A <s B and a = AB b. G (ii) Let b ∈ / C(B). Then A1 <s B1 and ct A = 0 for some g-inverse ct s of B1 if and only if A < B and a ∈ C(B : b). − B Proof. (i) Since b ∈ C(B), H = is a g-inverse of B1 , where B− 0 be any g-inverse of B. Now,
A1 HB1 = A1 ⇔ AB− B = A and AB− b = a. Further, B1 HA1 = A1 ⇔ BB− a = a. Hence, A1 <s B1 ⇔ A <s B, AB− b = a and BB− a = a ⇔ A <s B, AB− b = a. G (ii) Let be a g-inverse of B1 . Clearly, ct B = 0 and ct b = 1. ct ‘Only if’ part A1 <s B1 and ct A= 0⇒ C(A) ⊆ C(B), by Lemma 13.2.1. Further, G A1 <s B1 ⇒ (B : b) t a = a. Hence, a ∈ C(B : b). c G Also, A1 t B1 = A1 ⇒ (AG + act )B = A ⇒ AGB = A, as ct B = 0. c Thus, A1 <s B1 ⇒ A <s B and a ∈ C(B : b). ‘If’ part Let B− be a g-inverse of B and let u = B− b. Then it is easy − t B − uc to check that is a g-inverse of B1 . Since ct B = 0 and ct C(A) ⊆ C(B), we have ct A = 0. It now follows that C(A : a) ⊆ C(B : b). t Since C(Bt ), a simple shows that −C(A )t ⊆ − verification B − uc B − uct A1 B1 = A1 . Further, B1 A1 = A1 also holds. ct ct Thus, A <s B and a ∈ C(B : b) ⇒ A1 <s B1 and ct A = 0. Remark 13.2.3. Let / C(B). Let the ‘if’ part of Theorem 13.2.2 hold b ∈ G for some g-inverse of B1 . Then, by Lemma 13.2.1 it holds for each ct g-inverse of B1 .
346
Matrix Partial Orders, Shorted Operators and Applications
We now study the space pre-order for matrices which are modified by adding a rank 1 matrix.(Recall that each rank 1 matrix can be expressed as xyt where x and y are non-null column vectors, not necessarily of the same order.) Let A be an m × n matrix, a an m-column vector and b an n-column vector. Consider the following conditions: (i) a ∈ C(A) (ii) b ∈ C(At ) (iii) bt A− a + 1 6= 0 for some g-inverse A− of A.
(13.2.2) (13.2.3) (13.2.4)
Lemma 13.2.4. Let A be an m×n matrix,a an m-column vector and b be A a an n-column vector. Then ρ(A + abt ) = ρ −1. Consequently, −bt − 1 ⇔ (13.2.2) − (13.2.4) hold ρ(A) − 1 ρ(A + abt ) = ρ(A) ⇔ excatly two of (13.2.2) − (13.2.4) hold ρ(A) + 1 ⇔ at most one of (13.2.2) − (13.2.4) holds. Proof.
Notice that we can write A + abt 0 Im − a A = 0 1 0 1 −bt
a 1
In bt
0 1
Proof now follows from Theorem 2.6.14 and Theorem 13.2.2.
Let A, B be matrices of order m × n, a, c be m-column vectors and b, d be n-column vectors. Let A2 = A + abt , B2 = B + cdt ; E = I − BG, F = I − GB and β = 1 + dt Gc, where G is a g-inverse of B. Let A <s B. We now explore the conditions under which A2 <s B2 and vice versa. Notice that, in general, β depends on choice of g-inverse G of B. However, if c ∈ C(B) and d ∈ C(Bt ), then β is independent of the choice of g-inverse G of B. Theorem 13.2.5. Let A, B be matrices of order m × n, a, c be non-null m-column vectors and b, d be non-null n-column vectors. Let c ∈ C(B), d ∈ C(Bt ) and β = 0. Then any two of the following imply the third: (i) A <s B (ii) A2 <s B2 (iii) There exists a g-inverse G of B satisfying (a) A2 Gc = 0, dt GA2 = 0, when a ∈ C(A), b ∈ C(At ). (b) A2 Gc = 0, dt GA = 0 and b ∈ C(Bt ), when a ∈ C(A) and b ∈ / C(At ).
Partial Orders of Modified Matrices
347
(c) a ∈ C(B), AGc = 0, and dt GA2 = 0 when a ∈ / C(A) and b ∈ C(At ) (d) a ∈ C(B), b ∈ C(Bt ), bt Gc = 0, AGc = 0, dt Ga = 0 and dt GA = 0, when a ∈ / C(A), b ∈ / C(At ). Proof. (i) and (ii) ⇒ (iii) Since (i) holds, for each g-inverse G of B we have A = AGB = BGA. Also, since (ii) holds A2 = A2 G2 B2 = B2 G2 A2 for each g-inverse G2 of B2 . By Theorem 2.6.22(v), G is also a g-inverse of B2 . So, A2 = A2 GB2 = B2 GA2 . Consider first A2 = A2 GB2 . Now A2 = A2 GB2 ⇒ A + abt = (A + abt )G(B + cdt ) = AGB + AGcdt + abt GB + abt Gcdt .
(13.2.5)
A2 = B2 GA2 ⇒ A + abt = (B + cdt )G(A + abt ) = BGA + cdt GA + BGabt + cdt Gabt . (13.2.6) We consider the four cases according as ‘a ∈ C(A) and b ∈ C(At )’ or ‘a ∈ C(A) and b ∈ / C(At )’ or ‘a ∈ / C(A) and b ∈ C(At )’ or ‘a ∈ / C(A) t and b ∈ / C(A ).’ Let a ∈ C(A) and b ∈ C(At ). Since b ∈ C(At ) ⊆ C(Bt ), (13.2.5) gives A + abt = A + AGcdt + abt + abt Gcdt . So, AGcdt + abt Gcdt = 0 or A2 Gcdt = 0. As d is non-null, we have A2 Gc = 0. Similarly, (13.2.6) ⇒ cdt GA2 = 0, and since c is non-null, we get dt GA2 = 0. Next, let a ∈ C(A) and b ∈ / C(At ). From (13.2.5), we have A + abt = AGB + AGcdt + abt GB + abt Gcdt ⇒ abt = AGcdt + abt GB + abt Gcdt ⇒ abt (I − GB) = A2 Gcdt . If b ∈ / C(Bt ), then bt (I − GB) 6= 0. So, C(abt (I − GB)) ⊆ C(Bt ) but C(A2 Gcdt ) ⊆ C(Bt ). Therefore, b ∈ C(Bt ) and A2 Gcdt = 0. Since, d is non-null, we have A2 Gc = 0. Now, (13.2.6) ⇒ A + abt = A + cdt GA + abt + cdt Gabt , since a ∈ C(A) ⊆ C(B). So, cdt GA + cdt Gabt = 0. Since c is non-null, dt GA + dt Gabt = 0. Notice that dt GA ∈ C(At ) and dt Gabt does not belong to C(At ), so, dt GA = 0 and dt Gabt = 0. As b is non-null, then dt Ga = 0. But then a ∈ C(A) and dt GA = 0, we have dt Ga = 0.
348
Matrix Partial Orders, Shorted Operators and Applications
The case when a ∈ / C(A) and b ∈ C(At ) is similar to the case when t a ∈ C(A), b ∈ / C(A ). Now, let a ∈ / C(A) and b ∈ / C(At ). From (13.2.5), we have A + abt = AGB + AGcdt + abt GB + abt Gcdt . Since AGB = A, we have abt − abt GB − abt Gcdt = AGcdt . It follows that AGc = 0, bt = bt GB + bt Gcdt ∈ C(Bt ) and bt Gc = 0. Similarly, (13.2.6) yields dt GA = 0, a ∈ C(B) and dt Ga = 0. (i) and (iii) ⇒ (ii) Let (i) and (iii) hold. By (iii) there exists a g-inverse G of B such that (1) A2 Gc = 0, dt GA2 = 0, in case a ∈ C(A), b ∈ C(At ). (2) A2 Gc = 0, dt GA = 0 and b ∈ C(Bt ), when a ∈ C(A), and b ∈ / C(At ) t (3) d GA2 = 0, AGc = 0, and a ∈ C(B) when a ∈ / C(A), b ∈ C(At ) and t t (4) a ∈ C(B), b ∈ C(B ), A2 Gc = 0, and d GA2 = 0, when a ∈ / C(A), and b ∈ / C(At ). By Theorem 2.6.22(v), G is a g-inverse of B. Let a ∈ C(A) and b ∈ C(At ). Notice that A2 G2 B2 = A + A2 Gcdt + abt GB. Since A2 Gc = 0 and b ∈ C(At ) ⊆ C(Bt ) we have A2 G2 B2 = A2 . Again, since a ∈ C(A) ⊆ C(B) and dt GA2 = 0, we have B2 GA2 = A + cdt GA2 + BGabt = A2 . The Proof in the other cases is easy and is left to the reader. Proof of (ii) and (iii) ⇒ (i) is straightforward. Remark 13.2.6. In (i) and (ii) ⇒ (iii), the result holds for every g-inverse G of B. Theorem 13.2.7. Let A and B be matrices of order m × n, a, c be non-null m-column vectors and b, d be non-null n-column vectors. Let c ∈ C(B), d ∈ C(Bt ) and β 6= 0. Then any two of the following three implies the third: (i) A <s B (ii) A2 <s B2 and (iii) a ∈ C(B) and b ∈ C(Bt ) Proof. (i) and (ii) ⇒ (iii) First note that ρ(B) = ρ(B2 ), by Lemma 13.2.4. Further, C(B2 ) ⊆ C(B) and C(Bt2 ) ⊆ C(Bt ), since c ∈ C(B) and d ∈ C(Bt ). Hence, C(B2 ) = C(B) and C(Bt2 ) = C(Bt ). Since A2 <s B2 , in view of the above there exist matrices R and S such that A + abt = BR = SB. Also, A <s B, it follows that a ∈ C(B) and
Partial Orders of Modified Matrices
349
b ∈ C(Bt ). (i) and (iii) ⇒ (ii) Since a ∈ C(B) and A <s B, we have C(A + abt ) ⊆ C(B2 ). Again, since b ∈ C(Bt ) and A <s B, we have C(A + abt )t ⊆ C(Bt ) = C(B2 )t . Hence, A2 <s B2 . (ii) and (iii) ⇒ (i) Gcdt G Let G be a g-inverse of B. By Theorem 2.6.22, H = G − is a β g-inverse of B2 . Since A2 <s B2 , we have A2 HB2 = B2 HA2 = A2 . Gcdt G t Gcdt G B+AGcdt −A cd +abt GB− A2 = A2 HB2 = AGB−A β β Gcdt G Gcdt G t Gcdt abt B + abt Gcdt − abt cd = AGB − A + AGcdt − β β β Gcdt abt Gcdt abt Gcdt (β − 1)A + abt − + abt Gcdt − (β − 1) = AGB + β β β abt . Hence, A = AGB. Similarly, A2 = B2 HA2 ⇒ A = BGA. Thus, A <s B. Corollary 13.2.8. Consider the setup of Theorem 13.2.7. Further, let a ∈ C(A) and b ∈ C(At ). Then A <s B if and only if A2 <s B2 . Assume that c ∈ C(B) and d ∈ / C(Bt ). We now explore the conditions for A2 <s B2 to hold when A <s B. For this we first prove the following lemma: Lemma 13.2.9. Let X be an m × n matrix, the vector x ∈ C(X) and y∈ / C(Xt ). Let X− be a g-inverse of X such that yt X− x = λ 6= 0. Then there exists a g-inverse G of X such that yt Gx = 0. Proof. Note that yt X− x 6= 0. So, both x and y are non-null. By Lemma 2.6.1, there exists a vector z such that zt Xt = 0 and zt y = 1. Moreover, x 6= 0 ⇒ there exits a vector w such that wt x = 1. Let G = X− − λzwt . Then G is a g-inverse of X and yt Gx = 0. Corollary 13.2.10. Let X be an m × n matrix, the vector y ∈ C(Xt ) and x∈ / C(X). Let X− be a g-inverse of X such that yt X− x 6= 0. Then there exists a g-inverse G of X such that yt Gx = 0. We now prove the following: Theorem 13.2.11. Let A and B be matrices of order m × n, a, c be nonnull m-column vectors and b, d be non-null n-column vectors. Let c ∈ C(B),
350
Matrix Partial Orders, Shorted Operators and Applications
d ∈ / C(Bt ) and G be a g-inverse of B such that dt Gc = 0. Then any two of the following three implies the third: (i) A <s B (ii) A2 <s B2 and (iii) a ∈ C(B) and (either ‘A2 Gc = 0, and b ∈ C(Bt )’) or b ∈ / C(Bt ), a d = µb + Bt v and A2 Gc = for some vector v and some scalar µ µ 6= 0. Proof. (i) and (ii) ⇒ (iii) Since (i) holds, A = AGB = BGA. Also, dt Gc = 0, β = 1. Hence, H = G−Gcdt G is a g-inverse of B2 . Since (ii) holds, A2 = A2 HB2 = B2 HA2 . A2 = B2 HA2 ⇒ A+abt = A−BGcdt GA+BGabt −BGcdt Gabt +cdt GA+cdt Gabt ⇒ abt = BGabt , since dt Gc = 0, A = BGA and BGc = c. It follows that a = BGa or a ∈ C(B), as b is non-null. Now, A2 = A2 HB2 ⇒ A + abt = AGB + abt GB − AGcdt GB − abt Gcdt GB + AGcdt + abt Gcdt , since dt Gc = 0 ⇒ (abt − AGcdt − abt Gcdt )(I − GB) = 0. If b ∈ C(Bt ), then A2 Gcdt = 0, since d ∈ / C(Bt ). So, A2 Gc = 0, as d is non-null. If b ∈ / C(Bt ), then A2 Gc 6= 0 and abt + A2 Gcdt = TB for some matrix T. Hence, A2 Gcdt = TB − abt . Since A2 Gc 6= 0, there exists a vector u such that ut A2 Gc = 1. So, dt = ut TB − ut abt or d = Bt v + µb with µ 6= 0. (As µ = 0 gives d ∈ C(Bt ).) notice that µ is unique, since b ∈ / C(Bt ). Further, (abt − A2 Gcdt )(I − GB) = 0. So, t t (ab − A2 Gc(µb + vB))(I − GB) = 0 or (a − µA2 Gc)bt (I − GB) = 0. 1 Since b ∈ / C(Bt ), bt (I − GB) 6= 0. So, A2 Gc = a. µ (i) and (iii) ⇒ (ii) Let (i) and (iii) hold. As (i) holds, there exists a g-inverse G of B such that A = AGB = BGA. Let a ∈ C(B), b ∈ C(Bt ). Since (iii) holds, A2 Gc = 0. Also, G − Gcdt G is a g-inverse of B2 . Now, A2 G2 B2 = A+abt GB−AGcdt −abt Gcdt +AGcdt +abt Gcdt = A + abt − A2 Gcdt GB + A2 Gcdt = A + abt = A2 , since b ∈ C(Bt ). Similarly, B2 G2 A2 = A2 holds. Now, let a ∈ C(B) and b ∈ / C(Bt ). Since (i) holds, there exists a g-inverse G of B such that A = AGB = BGA. Also, (iii) holds, so, d = µb + Bt v for some scalar
Partial Orders of Modified Matrices
a . µ It is easy to check that (ii) holds. Proof of (ii) and (iii) ⇒ (i) is similar.
351
µ 6= 0 and A2 Gc =
Theorem 13.2.12. et A, B be matrices of order m × n, a, c be m-column vectors and b, d be n-column vectors such that A <s B. Let c ∈ / C(B) and dt ∈ C(Bt ) and G be a g-inverse of B such that dt Gc = 0. Then any two of the following imply the third: (i) A <s B (ii) A2 <s B2 and t (iii) b ∈ C(B ) and either {dt GA2 = 0 and a ∈ C(B)} or {a ∈ / C(B), c = Bu + ηa for bt .} some vector u and some scalar η 6= 0 and dt GA2 = η Proof is similar to the proof of Theorem 13.2.9. To see when A2 <s B2 in case c ∈ / C(B) and d ∈ / C(Bt ), we need some more preparation. We first prove the following lemmas: Lemma 13.2.13. Let B be an m×n matrix and B2 = B+cdt where c and d are respectively m and n-column vectors. Let c ∈ / C(B) and d ∈ / C(Bt ). then there exists a matrix G that is a g-inverse of both B and B2 . Proof. By Lemma 13.2.4, ρ(B2 ) = ρ(B) + 1. Theorem 2.3.19 guarantees the existence of such a g-inverse. Lemma 13.2.14. Under the setup of Lemma 13.2.13, let G be a common g-inverse of B and B2 . Then BGc = 0, dt GB = 0 and dt Gc = 1. Proof.
Since G is g-inverse of B2 , so, B2 GB2 = B2 . Now,
B2 GB2 = B2 ⇒ BGcdt + cdt GB + cdt Gcdt = cdt or cdt GB = (I − BG − cdt G)cdt ⇒ cdt GB = 0 and (I − BG − cdt G)cdt = 0, since C(cdt GB) ⊆ C(Bt )and C((I − BG − cdt G)cdt ) * C(Bt ). So, dt GB = 0 and (I − BG − cdt G)c = 0. However, (I − BG − cdt G)c = 0 ⇒ BGc = 0 and dt Gt = 1, as BGc ∈ C(B) and c∈ / C(B).
352
Matrix Partial Orders, Shorted Operators and Applications
Theorem 13.2.15. Let A and B be matrices of order m × n, a, c be non-null m-vectors and and b, d be non-null n-column vectors such that c ∈ / C(B) and d ∈ / C(Bt ). Then any two of the following imply the third: (i) A <s B (ii) A2 <s B2 and (iii) a = Bu + αc, where u and α are arbitrary subject to a 6= 0 and b = vt B + θd, where v and θ are arbitrary subject to b 6= 0. Proof. By Lemmas 13.2.13 and 13.2.14, there exists a common g-inverse G of B and B2 such that BGc = 0, dt GB = 0 and dt Gc = 1. Now, A2 = A2 GB2 ⇔ A + abt = (A + abt )G(B + cdt ) and A2 = B2 GA2 ⇔ A + abt = BGA + BGabt + cdt GA + cdt Gabt (i) and (ii) ⇒ (iii) As BGc = 0, dt GB = 0 and A <s B, we have AGc = 0 and dt GA = 0. So, (i) and (ii) ⇒ abt (I − GB) = abt Gcdt and (I − BG)abt = cdt Gabt ⇒ bt (I − GB) = bt Gcdt and (I − BG)a = cdt Ga, since a, b are nonnull ⇒ b = vt B + θd for some v and θ; and a = Bu + αc, for some u and α. As a, b are non-null vt B + θd 6= 0 and Bu + αc 6= 0. (i) and (iii) ⇒ (ii) BGc = 0 and dt GB = 0, b = vt B+θd and a = Bu+αc ⇒ θ = bt Gc and α = dt Ga. So, abt (I − GB) = abt Gcdt and (I − BG)abt = cdt Gabt . Also, (i) ⇒ AGc = 0 and dt GA = 0. Now (ii) follows. (ii) and (iii) ⇒ (i) As already seen (iii) ⇒ abt (I − GB) = abt Gcdt and (I − BG)abt = cdt Gabt . So, (ii) and (iii) ⇒ A = AGB + AGcdt = BGA + cdt GA. Now, A = AGB + AGcdt ⇒ A − AGB = AGcdt = 0 and A = BGA + cdt GA ⇒ A − BGA = cdt GA = 0. Since c ∈ / C(B) and d ∈ / C(Bt ), we have A <s B. 13.3
Minus order
In this section, we study the effect of modifications on matrices related to each other under minus order. Given matrices A and B of the same order such that A <− B, we first give necessary and sufficient conditions
Partial Orders of Modified Matrices
353
under which the new matrices A1 and B1 obtained by appending/deleting a column also satisfy A1 <− B1 . We also study the converse problem namely, if A1 <− B1 , when does A <− B hold? Let A and B be matrices of the same order such that A <− B. If the matrices A and B are modified by addition of rank 1 matrices, the conditions under which new matrices are similarly related and vice versa have also been explored. For matrices A and B of the same order m × n and m-column vectors a, b we write, as in Section 2, A1 = (A : a) and B1 = (B : b). We start with the following: Theorem 13.3.1. Let A and B be matrices of order m × n and a and b be m-column vectors. Then any two of the following imply the third: (i) A <− B (ii) A1 <− B1 and (iii) Either ‘a ∈ C(A), and b − a ∈ C(B − A),’ or (b ∈ / C(B) and exactly one of ‘a ∈ C(A) and b − a ∈ C(B − A))’ holds. Proof. (i) and (ii) ⇒ (iii) Since (i) and (ii) hold, by Theorem 3.3.5 and Remark 3.3.9, ρ(B) = ρ(A) + ρ(B − A)
(13.3.1)
ρ(B1 ) = ρ(A1 ) + ρ(B1 − A1 ).
(13.3.2)
and
Moreover, ρ(B) ≤ ρ(B1 ) ≤ ρ(B1 ) + 1 and similar inequalities hold for ‘A1 and A’ and for ‘B1 − A1 and B − A’. Let b ∈ C(B). Then ρ(B1 ) = ρ(B). Using (13.3.1) and (13.3.2), we have ρ(A) ≤ ρ(A1 ) + ρ(B1 − A1 ) = ρ(B1 ) = ρ(B) = ρ(A) + ρ(B − A). Thus, ρ(B1 − A1 ) ≤ ρ(B − A). However, ρ(B − A) ≤ ρ(B1 − A1 ) always holds, so, ρ(B1 − A1 ) = ρ(B − A). Therefore, ρ(A1 ) = ρ(A) and consequently a ∈ C(A) and b − a ∈ C(B − A). Let b ∈ / C(B). Then ρ(B1 ) = ρ(B) + 1. Then by (13.3.1), we have ρ(B1 ) = ρ(B) + 1 = ρ(A) + ρ(B − A) + 1. Using this in (13.3.2), we have ρ(A1 ) + ρ(B1 − A1 ) = ρ(B1 ) = ρ(B) + 1 = ρ(A) + ρ(B − A) + 1. So, exactly one of ‘ρ(A1 ) = ρ(A) + 1 and ρ(B1 − A1 ) = ρ(B − A)’ and ‘ρ(B1 − A1 ) = ρ(B − A) + 1 and ρ(A1 ) = ρ(A)’ can hold. Hence, exactly one of ‘a ∈ C(A), and b − a ∈ C(B − A)’ holds. (i) and (iii) ⇒ (ii) Let (i) and (iii) hold. In case, a ∈ C(A), and b − a ∈ C(B − A), we have
354
Matrix Partial Orders, Shorted Operators and Applications
ρ(A1 ) = ρ(A) and ρ(B1 − A1 ) = ρ(B − A). Since (i) holds, we have ρ(B) = ρ(A) + ρ(B − A). Also b ∈ C(B). So, ρ(A1 ) + ρ(B1 − A1 ) = ρ(A) + ρ(B − A) = ρ(B) = ρ(B1 ). Hence, b ∈ C(B) and (ii) holds. Let b ∈ / C(B). If a ∈ C(A), then we have ρ(A1 ) = ρ(A), and ρ(B1 − A1 ) = ρ(B − A) + 1. Therefore, ρ(A1 ) + ρ(B1 − A1 ) = ρ(A) + ρ(B − A) + 1 = ρ(B) + 1 = ρ(B1 ), so, (ii) holds. Similarly, if b − a ∈ C(B − A) and a ∈ C(A), we have ρ(B1 − A1 ) = ρ(B − A) and ρ(A1 ) = ρ(A) + 1. So, ρ(A1 ) + ρ(B1 − A1 ) = ρ(B1 ). Thus, (ii) holds. For (ii) and (iii) ⇒ (i), the proof is similar to the above. We shall now consider the minus order for matrices that have been modified by adding rank 1 matrices. Let the notations and terminology be the same as those introduced before Theorem 13.2.5. We shall explore necessary and sufficient conditions for A2 <− B2 to hold when we know that A <− B holds. Since the minus order implies the space pre-order, so, in view of Theorems 13.2.5, 13.2.7, 13.2.11 and 13.2.15; we just need to obtain the additional conditions in each of the cases when AGA = A or/and A2 G2 A2 = A2 hold. No proofs are included, as they are highly computational but straightforward. Theorem 13.3.2. Let A, B be matrices of order m × n, a, c be m-column vectors and b, d be n-column vectors. Let c ∈ C(B), d ∈ C(Bt ) and β = 0. (a) Let a ∈ C(A) and b ∈ C(At ). Let G be a g-inverse of B such that A2 Gc = 0 and dt GA2 = 0. Then any two of the following imply the third: (i) A <− B (ii) A2 <− B2 (iii) 1 + bt Ga = 0. (b) Let a ∈ C(A) and b ∈ / C(At ). Let G be a g-inverse of B such that t A2 Gc = 0 and d GA = 0. Then any two of the following imply the third: (i) A <− B (ii) A2 <− B2 and bt GA = 0
Partial Orders of Modified Matrices
355
(iii) b ∈ C(B − A)t . (c) Let a ∈ / C(A) and b ∈ C(At ). Let G be a g-inverse of B such that AGc = 0 and dt GA2 = 0. Then any two of the following imply the third: (i) A <− B (ii) A2 <− B2 and AGa = 0 (iii) a ∈ C(B − A). (d) Let a ∈ / C(A) and b ∈ / C(At ). Let G be a g-inverse of B such that A · · · Gc = 0 and dt G A : a = 0. Then any two of the following bt imply the third: (i) A <− B (ii) A2 <− B2 , AGa = 0 and bt GA = 0, (iii) bt Ga = 1. Theorem 13.3.3. Let A, B be matrices of the same order m × n, a, c be m-column vectors and b, d be n-column vectors. Let c ∈ C(B), d ∈ C(Bt ) and β 6= 0. (a) Let a ∈ C(A) and b ∈ C(At ). Then any two of the following imply the third: (i) A <− B (ii) A2 <− B2 , AGa = a and bt GA = bt (iii) If 1 + bt Ga = 0, then at least one of A2 Gc and dt GA2 is null. If 1 + bt Ga 6= 0, then there exists a non-null scalar α such that β (1 + bt Ga)bt . A2 Gc = αa and dt GA2 = α (b) Let a ∈ C(A) and b ∈ / C(At ). Then any two of the following imply the third: (i) A <− B (ii) A2 <− B2 and AGa = a, (iii) either ‘b ∈ C(B − A)t and at least one of the A2 Gc and dt GA2 is null’ or ‘A2 Gc and dt GA2 are non-null, a = θA2 Gc, where 1 θ 6= 0 is arbitrary and b = (B − A)t u + βθ d, where u is arbitrary t subject to the condition that b ∈ / C(A ).’ (c) Let a ∈ / C(A) and b ∈ C(At ). Then any two of the following imply the third: (i) A <− B
356
Matrix Partial Orders, Shorted Operators and Applications
(ii) A2 <− B2 and bt GA = bt , (iii) either ‘a ∈ C(B − A) and at least one of A2 Gc and dt GA2 is null’ or ‘A2 Gc and dt GA2 are non-null, b = δAt2 Gt d where 1 δ 6= 0 is arbitrary and a = (B − A)v + βδ c, where v is arbitrary subject to the condition that a ∈ / C(A).’ (d) Let a ∈ / C(A) and b ∈ / C(At ). Then any two of the following imply the third: (i) A <− B (ii) A2 <− B2 t t Ga = 1 and at least one of the following holds: (iii) bt Ga − b Gcd β t (ν1 ) : - AG a : c = 0, b − b βGc d ∈ C(B − A)t or t (ν2 ) : - dt : bt GA = 0, and a − d βGa c ∈ C(B − A). Theorem 13.3.4. Let A, B be matrices of order m × n, a, c be m-column vectors and b, d be n-column vectors. Let c ∈ C(B) and d ∈ / C(Bt ). Let a ∈ C(B), b ∈ C(Bt ) and G be a g-inverse of B such that dt Gc = 0. Then any two of the following imply the third: (i) A <− B (ii) A2 <− B2 (iii) (a) A2 Gc = 0 and 1 + bt Ga = 0, when a ∈ C(A) and b ∈ C(At ). (b) A2 Gc = 0 and bt GA = 0, when a ∈ C(A) and b ∈ / C(At ). t (c) d GA2 = 0 and AGa = 0 when a ∈ / C(A) and b ∈ C(At ). (d) A2 Gc = 0, dt GA2 = 0, bt Ga = 1, AGa = 0 and bt GA = 0, when a ∈ / C(A) and b ∈ / C(At ). Theorem 13.3.5. Let A, B be matrices of order m × n, a, c be m-column vectors and b, d be n-column vectors. Let c ∈ C(B) and d ∈ / C(Bt ). Let t t a ∈ C(B), b ∈ / C(B ) and G be a g-inverse of B such that d Gc = 0. Then any two of the following imply the third: (i) A <− B (ii) A2 <− B2 (iii) (a) d = µb + (B − A)t u for some scalar µ 6= 0, for some vector u a and A2 Gc = , when a ∈ C(A). µ (b) d = µb + (B − A)t z, zt a = −µ, zt Gc + 1 = 0, bt Gcµ = 1 for some scalar µ and some vector z and a, c ∈ C(B − A), when a∈ / C(A).
Partial Orders of Modified Matrices
357
Theorem 13.3.6. Let A, B be matrices of order m × n, a, c be m-column vectors and b, d be n-column vectors. Let c ∈ / C(B) and d ∈ / C(Bt ). Further, t let a ∈ C(B) and b ∈ C(B ) and G be a common g-inverse of B and B2 . Then any two of the following imply the third: (i) A <− B (ii) A2 <− B2 (iii) (a) 1 + bt Ga = 0, when a ∈ C(A) and b ∈ C(At ). (b) bt GA = 0, when a ∈ C(A) and b ∈ / C(At ). (c) AGa = 0, when a ∈ / C(A) and b ∈ C(At ). t (d) AGa = 0, b GA = 0 and bt Ga = 1, when a ∈ / C(A) and b∈ / C(At ). Theorem 13.3.7. Let A, B be matrices of order m × n, a, c be m-column vectors and b, d be n-column vectors. Let c ∈ / C(B) and d ∈ / C(Bt ). Further, t let a ∈ C(B) and b ∈ / C(B ) and G be a common g-inverse of B and B2 . Then any two of the following imply the third: (i) A <− B (ii) A2 <− B2 (iii) A2 <s B2 , a ∈ C(A) and bt GA = 0. Theorem 13.3.8. Let A, B be matrices of order m × n, a, c be m-column vectors and b, d be n-column vectors. Let c ∈ / C(B) and d ∈ / C(Bt ). Further, t let ‘a ∈ / C(B) and b ∈ / C(B )’ and G be a common g-inverse of B and B2 . Then any two of the following imply the third: (i) A <− B (ii) A2 <− B2 (I − (GBt ))b (I − BG)a and d = . dt Ga bt Gc t Further, a = (I − AG)ξ and b = (I − GA) η for some ξ and η.
(iii) bt Gc 6= 0, dt Ga 6= 0, c =
13.4
Sharp order
Let A, B be matrices of order n × n and of index≤ 1. a, b, c and d be A a B c n-vectors and A2 = and B2 = . Given A <# B, we bt α dt β wish to explore the necessary and sufficient conditions under which (a) the matrix A2 and B are of index ≤ 1 and (b) A2 <# B2 . We also study when A2 <# B2 implies A <# B. Also, given matrices A, B of the same
358
Matrix Partial Orders, Shorted Operators and Applications
order and of index ≤ 1, vectors a, b and scalar α, we wish to determine all the vectors c, d and the scalar β such that whenever A <# B, we have A2 <# B2 . We first prove the following theorem which determines when the matrix A2 is of index ≤ 1. Theorem 13.4.1. Let A be a matrix of order n × n and index ≤ 1. Let G be the group inverse of A. Then A2 is of index ≤ 1 if and only if either a ∈ C(A), b ∈ C(At ), α = bt Ga and q = bt (G2 )a + 1 6= 0 in which case ρ(A2 ) = ρ(A) or α 6= bt Ga and at least one of a ∈ C(A), b ∈ C(At ) holds in which case ρ(A2 ) = ρ(A) + 1 or bt (I − AG)a 6= 0 in which case ρ(A2 ) = ρ(A) + 2. Proof. ‘If part’ t 2 Let a ∈ C(A), b∈ C(At ), α = bt Ga and b (G )a + 1 6= 0. Consider the 1 X XGa , where matrix G2 = r bt GX bt GXGa bt G3 a X = rG − G2 abt G − Gabt G2 + kGabt G and k = . r We can check that A22 G2 = A2 . So, ρ(A2 ) ≤ ρ(A22 ) ≤ ρ(A2 ). Thus, ρ(A2 ) = ρ(A), equivalently A2 is of index ≤ 1. In case α 6= bt Ga and at least one of a ∈ C(A), b ∈ C(At ) holds, take 1 X k(I − AG)a − Ga G2 = 1 q kbt (I − AG) − bt G t t where X = qG+Gab G−(G+kI)ab (I − AG)−(I − AG)abt G(G+kI) r and k = . Then A22 G2 = A2 . Thus index of A2 is ≤ 1. q 1 X (I − AG)a t In case b (I − AG)a 6= 0, take G2 = , where 0 p bt (I − AG) X = pG − Gabt (I − AG) − (I − AG)abt G − k(I − AG)abt (I − AG) and q k = and G2 satisfies A2 2 G2 = A2 , so that A2 is of index ≤ 1. p ‘Only if’ part Let A2 be of index ≤ 1. In case ρ(A2 ) = ρ(A), then from the proof of t Lemma 13.2.4, it follows that a ∈ C(A), b ∈ C(At ),α = b Ga. If possible, 2 G a let bt G2 a + 1 = 0. Then consider the vector w = . Then 0 AG2 a Ga A2 w = = andA2 2 w = 0. t 2 b G a −1
Partial Orders of Modified Matrices
359
Since A2 is of index 1, A2 w 6= 0. Thus, w ∈ N (A2 2 ), but w ∈ N (A2 ) and this can not happen. So, bt G2 a + 1 6= 0. G2 a Similarly, in case ρ(A2 ) = ρ(A) + 1 take w = to show α 6= bt Ga. 1 Ga + z In case ρ(A2 ) = ρ(A) + 2 take w = , where z is a solution of 1 A 0 = and show that bt (I − AG)a 6= 0. bt bt Ga − α Lemma 13.4.2. Let A, B be matrices of order n × n and index ≤ 1. Let A <# B. Then (B − A)x = 0 if and only if x ∈ C(A) ⊕ N (B) for any x ∈ Fn . Proof. Since B is of index ≤ 1, F n = C(B) ⊕ N (B). Also, A <# B, so, A <− B. This further gives C(B) = C(A) ⊕ C(B − A). It follows that F n = C(A) ⊕ C(B − A) ⊕ N (B). Now, each x ∈ Fn can be written as x = x1 + x2 + x3 , where x1 ∈ C(A), x2 ∈ C(B − A) and x3 ∈ N (B). Therefore, (B − A)x = (B − A)x1 + (B − A)x2 + (B − A)x3 . Since, A <# B ⇒ A2 = BA = AB, we have (B − A)x1 = 0. Since, x3 ∈ N (B), we have (B − A)x3 = 0. Therefore, (B − A)x = (B − A)x2 . So, (B − A)x = 0 ⇔ (B − A)x2 = 0 ⇔ x ∈ C(A) ⊕ N (B). Remark 13.4.3. Notice that we only need A < # B for Lemma 13.4.2 to hold. Corollary 13.4.4. Let A, B be matrices of order n × n and index ≤ 1. Let A <# B. yt (B − A) = 0 if and only if y ∈ C(At ) ⊕ N (Bt ). Let A, B be matrices of the same order n × n and index ≤ 1. Let us assume that A2 is of index ≤ 1 in next few theorems. Also, note that if A <# B and G denotes the group inverse of B, then G is a commuting g-inverse of A. We record the following equations for our use in this section: A22 = A2 B2 = B2 A2 if and only if A2 + abt = AB + adt = BA + cbt
(13.4.1)
t
Aa + αa = b = Ac + βa = Ba + αc t
t
t
2
t
t
t
t
b A + αb = b B + αd = d A + βb t
t
b a + α = b c + αβ = d a + βα.
(13.4.2) (13.4.3) (13.4.4)
360
Matrix Partial Orders, Shorted Operators and Applications
Theorem 13.4.5. Let A, B be matrices of order n × n and index ≤ 1. Let A2 be of index ≤ 1 and the vectors a, b be non-null. Let G denote the group inverse of B. Then any two of the following three statements implies the third: (i) A <# B (ii) A2 <# B2 and (iii) a ∈ C(A) ⊕ N (B), b ∈ C(At ) ⊕ N (Bt ), a = c, b = d and α = β. Remark 13.4.6. In the setup Theorem 13.4.5, (a) the condition (ii) and the fact that a = c, b = d and α = β together imply a ∈ C(A) ⊕ N (B) and b ∈ C(At ) ⊕ N (Bt ) and (b) if (i) holds and either a ∈ / C(A) ⊕ N (B) or b ∈ / C(At ) ⊕ N (Bt ), then there can not exist a matrix B2 of index ≤ 1 such that (ii) holds. Theorem 13.4.7. Let A, B be matrices of order n × n and index ≤ 1. Let A2 be of index ≤ 1 and a = 0, b 6= 0. Let G denote the group inverse of B. Then any two of the following three statements implies the third: (i) A <# B (ii) A2 <# B2 and (iii) c = 0, bt ∈ C(At ), either β = 0 and d = b + (B − A)t y for vector y or β 6= 0 is arbitrary and d = b − βGt b + (I − AG)t η, η arbitrary. Corollary 13.4.8. Let A, B be matrices of order n × n and index ≤ 1. Let A2 be of index ≤ 1 and a 6= 0, b = 0 and α = 0. Let G denote the group inverse of B. Then any two of the following three statements implies the third: (i) A <# B (ii) A2 <# B2 and (iii) d = 0 and either β = 0 and c = a + (B − A)w, for some vector w or β 6= 0 is arbitrary, a ∈ C(A) and c = a − βGa + (I − AG)ξ for arbitrary ξ. Theorem 13.4.9. Let A, B be matrices of order n × n and index ≤ 1. Let A2 be of index ≤ 1 and a = 0, b 6= 0 and α 6= 0. Let G denote the group inverse of B. Then any two of the following three statements implies the third: (i) A <# B (ii) A2 <# B2
and
Partial Orders of Modified Matrices
(iii) c = 0, β = α, and d = b −
361
1 (B − A)t b. α
Corollary 13.4.10. Let A, B be matrices of order n × n and index ≤ 1. Let A2 be of index ≤ 1 and a 6= 0, b = 0 and α 6= 0. Let G denote the group inverse of B. Then any two of the following three statements implies the third: (i) A <# B (ii) A2 <# B2
and
(iii) d = 0, β = α, and c = a −
1 (B − A)a. α
Theorem 13.4.11. Let A, B be matrices of order n × n and index ≤ 1. Let B2 be of index ≤ 1 and a = 0, b = 0 and α 6= 0. Let G denote the group inverse of B. Then any two of the following three statements implies the third: (i) A <# B (ii) ‘B2 is of index ≤ 1and A2 <# B2 ’ (iii) c = d = 0, and β = α.
and
Note that in this case A2 is of index ≤ 1. Theorem 13.4.12. Let A, B be matrices of order n × n and index ≤ 1. Let B2 be of index ≤ 1 and a = 0, b = 0 and α = 0. Let G denote the group inverse of B. Then any two of the following three statements implies the third: (i) A <# B (ii) ‘B2 is of index ≤ 1 and A2 <# B2 ’ and (iii) c = (I − AG)ξ, d = (I − AG)η and either at least one of c ∈ C(B) and d ∈ C(Bt ) holds and β 6= η t Gc or c ∈ C(B) and d ∈ C(Bt ) and β = η t Gc, η t G2 c 6= 0 or c ∈ / C(B) and d ∈ / C(Bt ) and η t (I − BGc) 6= 0. Let A, B be matrices of order n × n and index ≤ 1. Assume now B2 to be of index ≤ 1. We shall now obtain the class of all index ≤ 1 matrices A2 such that A <# B and A2 <# B2 are equivalent. Theorem 13.4.13. Let A, B be matrices of order n × n and index ≤ 1. Let B2 be of index ≤ 1 and c 6= 0, d 6= 0. Let G be the group inverse of B. Then any two of the following imply the third:
362
Matrix Partial Orders, Shorted Operators and Applications
(i) A <# B (ii) ‘A2 is of index ≤ 1 and A2 <# B2 ’ and (iii) a = c ∈ C(A) ⊕ N (B), b = d ∈ C(At ) ⊕ N (Bt ) and β = α. Theorem 13.4.14. Let A, B be matrices of order n × n and index ≤ 1. Let B2 be of index ≤ 1 and c = 0, d 6= 0 and β = 0. Let G be the group inverse of B. Then any two of the following imply the third: (i) A <# B (ii) ‘A2 is of index ≤ 1 and A2 <# B2 ’ and (iii) a = 0, α = 0, d ∈ C(Bt ) and b is a projection of d in C(At ) along C(Bt − At ) ⊕ N (Bt ). Corollary 13.4.15. Let A, B be matrices of order n × n and index ≤ 1. Let B2 be of index ≤ 1 and c 6= 0, d = 0 and β = 0. Let G be the group inverse of B. Then any two of the following imply the third: (i) A <# B (ii) ‘A2 is of index ≤ 1 and A2 <# B2 ’ and (iii) b = 0, α = 0, c ∈ C(B) and a is the projection in C(A) along C(B − A) ⊕ N (B). Theorem 13.4.16. Let A, B be matrices of order n × n and index ≤ 1. Let B2 be of index ≤ 1 and c = 0, d 6= 0 and β 6= 0. Let G be a group inverse of B. Then any two of the following imply the third: (i) A <# B (ii) ‘A2 is of index 1 and A2 <# B2 ’ and (iii) a = 0 and either α = 0, b = At (S− At d + (I − S− Sξ)), where ξ is arbitrary, At d ∈ C(S) and S = (B − βI)t or α = β, and b = βT− d + (I − T− T)η, where η is arbitrary and T = (A − B + βI)t . Theorem 13.4.17. Let A, B be matrices of order n × n and index ≤ 1. Let B2 be of index ≤ 1 and c = 0, d = 0 and β 6= 0. Let G be a group inverse of B. Then any two of the following imply the third: (i) A <# B (ii) ‘A2 is of index ≤ 1 and A2 <# B2 ’ and (iii) either ‘α = 0, a = 0, b = 0’ or β is an eigen-value of A and b is an eigen vector of At corresponding to eigen-value β or ‘α = β, a = 0, b = 0’ or β is an eigen-value of B − A and b is an eigen vector of (B − A)t corresponding to eigen-value β or α =
Partial Orders of Modified Matrices
363
β, a = 0, b = 0’ or β is an eigen-value of (B − A)t and a is an eigen vector of (B − A) corresponding to eigen-value β. Theorem 13.4.18. Let A, B be matrices of order n × n and index ≤ 1. Let B2 be of index ≤ 1 and c = 0, d = 0 and β = 0. Let G be a group inverse of B. Then any two of the following imply the third: (i) A <# B (ii) ‘A2 is of index ≤ 1 and A2 <# B2 ’ (iii) a = 0, b = 0 and α = 0.
and
Let A, B be matrices of order n × n and of index ≤ 1. Let a, b, c and d be n-vectors and A3 = A + abt , B3 = B + cdt . Further, let A3 and B3 be of index ≤ 1. To determine necessary and sufficient conditions such that A <# B and A3 <# B3 are equivalent, we give the following theorems: Theorem 13.4.19. Let A be a matrix of order n × n and index ≤ 1. Let G be a commuting g-inverse of A. Then A3 is of index ≤ 1 if and only if either a ∈ C(A), b ∈ C(At ), α = −1 − bt Ga = 0 and r = bt G2 a 6= 0 (in which case ρ(A3 ) = ρ(A) − 1) or α = −1 − bt Ga 6= 0 and at least one of a ∈ C(A), b ∈ C(At ) holds (in which case ρ(A3 ) = ρ(A)) or bt (I − GA)a 6= 0 holds (in which case ρ(A3 ) = ρ(A) + 1.) Theorem 13.4.20. Let A, B be matrices of order n × n and same index ≤ 1 such that A <# B and G be the group inverse of B. (Notice that G in such a case is a commuting g-inverse of A.). Let A3 and B3 be of index ≤ 1 and ρ(A3 ) = ρ(A) − 1. Then A3 <# B3 if and only if exactly one of the following holds: (i) A3 a = 0, A3 c = 0, in which case either bt A3 = 0 and c ∈ N (A) 1 or bt A3 = 0 and d ∈ N (A3 )t or d = ( t )(bat b + (I − (A3 G)t )w), χ b where χ is a vector such that c = (I − GA3 )χ. (ii) A3 a = 0, dt GA = 0 and AGc = 0 in which case bt is a left eigen value of A corresponding to eigen value bt a 6= 0. (iii) A3 a 6= 0, A3 c 6= 0, and dt GA 6= 0, in which case c = αa + (I − AG)ζ and d =
b + (I − (AG)t )η α
for arbitrary vectors ζ and η. Theorem 13.4.21. Let A, B be matrices of order n × n and of index ≤ 1 such that A <# B and G be the group inverse of B. (Notice that G in
364
Matrix Partial Orders, Shorted Operators and Applications
such a case is a commuting g-inverse of A.). Let A3 and B3 be of index ≤ 1 and ρ(A3 ) = ρ(A). Then A3 <# B3 if and only if exactly one of the following holds: (i) bt A3 = 0 and AGc = 0, in which case c ∈ N (A), d is arbitrary and either A3 a = 0 or b ∈ N (At ). (ii) bt A3 = 0 and dt A3 = 0 in which case, if bt Ga = 0, then c is arbitrary otherwise either a, c ∈ N (A3 ) or a ∈ N (A3 ) and c is bt Ga arbitrary or c = t a ∈ N (A3 ). d Ga t (iii) b A3 6= 0 and AGc 6= 0, and dt A3 6= 0, in which case c−αa ∈ N (A) 1 and d − bt ∈ N (A3 ). α Theorem 13.4.22. Let A, B be matrices of order n × n and of index ≤ 1 such that A <# B and G be the group inverse of B. (Notice that G in such a case is a commuting g-inverse of A.). Let A3 and B3 be of index ≤ 1, u = (I − AG)a, vt = bt (I − AG) and ρ(A3 ) = ρ(A) + 1. Then A3 <# B3 if and only if exactly one of the following holds: (i) Aa = 0 and dt u = 0, in which case a ∈ N (A), c is arbitrary and d is such that dt a = 0. (ii) Aa = 0 and Ac = 0, in which case a, c ∈ N (A), and d is arbitrary. (iii) Aa 6= 0, Ac 6= 0 and dt u 6= 0, in which case we choose d such that p dt u 6= 0 and c = t a + (I − AG)θ, where θ is arbitrary. d u 13.5
Star order
Let A and B be complex m × n matrices. In this section, we first obtain necessary and sufficient conditions for A1 B1 to hold when A B holds and vice versa. The conditions we obtain for the star are much simpler than the ones obtained for the minus order and for the sharp order. Recall A1 = (A : a) and B1 = (B : b), where a and b are m-column vectors. We reiterate the vectors and matrices in this and the next section are over C, the field of complex numbers. Theorem 13.5.1. Let A and B be matrices of order m × n and a, b be m-column vectors. Then any two of the following three implies the third: (i) A B (ii) A1 B1
Partial Orders of Modified Matrices
365
(iii) either ‘a = 0 and b ∈ C(A⊥ )’ or ‘a 6= 0, b = a and b ∈ C(B − A)⊥ .’ Proof. (i) and (ii) ⇒ (iii) Since (i) and (ii) hold, we have A? (B − A) = 0, (B − A)A? = 0 A?1 (B1
− A1 ) = 0, (B1 −
A1 )A?1
= 0.
(13.5.1) (13.5.2)
From (13.5.2), we have
A? a?
B−A: b−a =0
and A? = 0; B−A: b−a a? equivalently, A? (B − A) = 0, A? (b − a) = 0, a? (B − A) = 0 and (B − A)a? + (b − a)a? = 0. Using (13.5.1) in (B − A)a? + (b − a)a? = 0, we have (b − a)a? = 0 ⇒ a = 0 or b = a. In case a = 0, A? (b − a) = 0 ⇒ A? (b) = 0. So, b ∈ C(A⊥ ). In case b = a, a? (B − A) = 0 ⇒ b ∈ C(B − A)⊥ . Thus, (iii) holds. (i) and (iii) ⇒ (ii) ? A (B − A) A? (b − a) Consider A?1 (B1 − A1 ) = . a? (B − A) a? (b − a) Let a = 0 and b ∈ C(A⊥ ). Then A? b = 0. Using A? (B − A), we have A?1 (B1 − A1 ) = 0. Next, let a 6= 0, b = a and b ∈ C(B − A)⊥ . Clearly, A?1 (B1 − A1 ) = 0. Similarly, (B1 − A1 )A?1 = 0. The proof for (ii) and (iii) ⇒ (ii) is equally easy. We now study the star order for matrices A2 = A + ab? , B2 = B + cd? obtained from matrices A, B by adding to them rank 1 matrices ab? and cd? respectively. Two methods to deal with the problem, first of which is similar to the methods adopted for the minus order and the sharp order, are presented. The first method is similar to the method adopted for minus and sharp orders. The second method is simple, intuitive and reveals very clearly as to what happens when matrices are subjected to such modifications. Theorem 13.5.2. Let A, B be complex matrices of order m × n. Let a, c ∈ Cm and b, d ∈ Cn , A2 = A + ab? and B2 = B + cd? . Then any two of the following three imply the third:
366
Matrix Partial Orders, Shorted Operators and Applications
(i) A B (ii) A1 B1 (iii) (a) (ab? − cd? )? A2 = 0 and A2 (ab? − cd? )? = 0, when a ∈ C(A) and b ∈ C(A? ). (b) (ab? − cd? )? A2 = 0 and A2 (ab? − cd? )? + ab? (B − A)? = 0, when a ∈ C(A) and b ∈ / C(A? ). ? ? ? (c) (ab − cd ) A2 + (B − A)? ab? = 0 and A2 (ab? − cd? )? = 0, when a ∈ / C(A) and b ∈ C(A? ). ? (d) (ab − cd? )? A2 + (B − A)? ab? = 0 and A2 (ab? − cd? )? + ab? (B − A)? = 0, when a ∈ / C(A) and b ∈ / C(A? ). Proof is straightforward. As an alternative to Theorem 13.5.2, notice that to obtain necessary and sufficient conditions for A B ⇔ A2 B2 is equivalent to finding first a necessary and sufficient conditions for A B ⇔ A + xy? B + xy? and then for A B ⇔ A + wz? B − wz? for suitable column vectors x, y, z and w. In the following theorems we consider these special cases, which are interesting in their own right owing to the simple structure of necessary and sufficient conditions. Theorem 13.5.3. Let A, B be complex matrices of order m × n. Let a ∈ Cm , b ∈ Cn , X = A+ab? and Y = B+ab? . Then any two of the following three statements imply the third: (i) A B (ii) X Y (iii) a ∈ C(B − A⊥ ) and b ∈ C((B − A)? )⊥ . Proof.
Notice that (i) is equivalent to A? (B − A) = 0, (B − A)A? = 0
and (ii) is equivalent to A? (B − A) + (ab? )? (B − A) = 0, (B − A)A? + (B − A)(ab? )? = 0. Now the proof is easy.
Remark 13.5.4. Theorem holds with suitable modifications even when we add an arbitrary m × n matrix to A and B. Theorem 13.5.5. Let A, B be complex matrices of the same order m × n, a ∈ Cm and b ∈ Cn . Let X = A + ab? and Y = B − ab? . Then any two of the following three statements imply the third:
Partial Orders of Modified Matrices
367
(i) A B (ii) X Y (iii) a and b are singular vectors of A corresponding to the same singular −A? a value and b = − ? according as a ∈ C(A) or b ∈ C(A? ). a a Proof. (i) and (ii) ⇒ (iii) Since (i) and (ii) hold, we have A? (B − A) = 0, (B − A)A? = 0
(13.5.3)
and A? (B − A) + (ab? )? (B − A) − 2A? ab? − 2(ab? )? ab? = 0, (B − A)A? + (B − A)(ab? )? − 2(ab? )? A? − 2ab? (ab? )? = 0.
(13.5.4)
Using (13.5.3) in (13.5.4), we have (ab? )? (B − A) − 2A? ab? − 2(ab? )? ab? = 0
(13.5.5)
(B − A)(ab? )? − 2(ab? )? A? − 2ab? (ab? )? = 0.
(13.5.6)
and
Let a ∈ C(A). Therefore, from (13.5.3) we have a? (B − A) = 0. Substituting in (13.5.5) and noting that a and b are non-null vectors, we have −A? a . Substituting in (13.5.6) and simplifying A? a + ba? a = 0 ⇒ b = a? a ? ? a AA a a? AA? a we have AA? a? = a. Thus, is a singular value of A a? a a? a corresponding to singular vector a. a? AA? a is a Let b ∈ C(A? ). Then a similar computation shows that a? a singular value of A corresponding to singular vector b. 13.6
L¨ owner order
Once again in this section too, all matrices are over the field C of complex numbers. matrices of the same order and LetA and B betwo hermitian A u B x A2 = and B2 = . Given A
368
Matrix Partial Orders, Shorted Operators and Applications
Proof. Let A
Proof.
Theorem 13.6.4. Let A and B be nnd matrices of order n × n such that A
Partial Orders of Modified Matrices
369
B − A (b : a) is nnd ⇔ (b : a) = (B − A)T for some (b : a)? I matrix T and the 2 × 2 matrix I − T? (B − A)T is nnd. The remaining case can be proved similarly. matrix R =
Chapter 14
Equivalence Relations on Generalized and Outer Inverses
14.1
Introduction
The classes of generalized inverses and outer inverses of a matrix are big, so big that they can be best described as a forest. At the first glance, exploration of a forest may appear to be an arduous task. However, if the forest is divided into zones based on certain well defined criteria, then the exploration may be more tractable or may even become quite easy. If one pursues, one may find a pattern in the madness of the forest. The class of all generalized inverses and the class of all outer inverses of a matrix A form our forest. We define certain equivalence relations on these classes. The equivalence classes are the zones into which this forest gets divided and the hierarchy of the g-inverses is what we find. These lead to nice diagrammatic representation of all the g-inverses and outer inverses of a matrix revealing a well structured hierarchy. In Section 14.2, we define an equivalence relation on the class of all g-inverses of a matrix using the minus order. We explore the properties of the equivalence classes including some interesting characterizations. In Section 14.3, we study the equivalence relations based on special types of partial orders such as the star order and the sharp order. In Section 14.4, we define an equivalence relation on outer inverses and obtain characterizations of the hierarchy of outer inverses. Section 14.5 develops a scheme for diagrammatic representation of g-inverses and outer inverses depicting the hierarchy based on the results obtained in the earlier sections. Finally, we construct a ladder of g-inverses and outer inverses of a matrix and their reflexive g-inverses in Section 14.6.
371
372
14.2
Matrix Partial Orders, Shorted Operators and Applications
Equivalence relation on g-inverses of a matrix
Let A be a matrix of order m×n over a field F. Let {A− } be the class of all g-inverses of A. We begin by defining a relation “v” on {A− } as follows: Definition 14.2.1. Let G1 and G2 belong to {A− }. We say G1 v G2 if AG1 = AG2 and G1 A = G2 A (or equivalently, G1 AG2 = G2 AG1 ). It is easy to verify that the relation v is an equivalence relation on {A− } and therefore partitions the set {A− } into a disjoint union of equivalence classes. If G ∈ {A− }, we denote the equivalence class that contains G by Eq(G|A). We next show that each Eq(G|A) contains a unique reflexive g-inverse of A. Theorem 14.2.2. Let A be a matrix of order m × n. Then each equivalence class under relation “v” on {A− }, as in Definition 14.2.1, contains a unique reflexive g-inverse of A. Proof. Consider any equivalence class T and let G ∈ T. Let G0 = GAG. Clearly, G0 is a reflexive g-inverse of A. Further, AG0 = AGAG = AG and G0 A = GAGA = GA. Hence, G0 v G. Thus, G0 ∈ T. Moreover, if G1 and G2 are reflexive g-inverses of A such that G1 v G2 . Then G1 = G1 AG1 = G2 AG1 = G2 AG2 = G2 . Thus, each equivalence class contains a unique reflexive g-inverse of A. Remark 14.2.3. Let A be a matrix of order m × n and of rank r, where 0 < r < min{m, n}. Let G be a g-inverse of A with rank s (s ≥ r). Then by Theorem 2.3.18, it follows that P and there exist non- singular matrices Q such that A = Pdiag Ir , 0 Q and G = Q−1 diag Is , 0 P−1 . It is easy to see that the reflexive g-inverse of A belonging to Eq(G|A) is given −1 −1 by G0 = Q diag Ir , 0 P . Theorem 14.2.4. Let A be a matrix of order m × n. Let G0 ∈ {A− r } and − G1 ∈ Eq(G0 |A). Then G0 < G1 . Proof. Clearly, AG0 = AG1 and G0 A = G1 A, as A is a g-inverse of G0 . Hence, G0 <− G1 . Remark 14.2.5. Let A and G0 be as in Theorem 14.2.4. Then G0 = inf{G : G ∈ Eq(G0 |A)}.
Equivalence Relations on Generalized and Outer Inverses
373
Let G0 be a given reflexive g-inverse of A. We now determine the class all g-inverses G (of a specified rank) of A such that G ∈ Eq(G0 |A). Theorem 14.2.6. Let A be a matrix of order m × n of rank r such that 0 < r < min{m, n}. Let A = Pdiag Ir , 0 Q be a normal form of A, where P and matrices. Let G0 ∈ {A− r } such that Q are non-singular I L r G0 = Q−1 P−1 for some matrices L and M. Then an n × m M ML matrix G of rank s is in the equivalence class Eq(G0 |A) if and only if Ir L G = Q−1 P−1 , where S is arbitrary matrix of rank s − r. M ML + S Notice that G ∈ Eq(G0 |A) if and only if G0 = GAG. Every gI T r 1 inverse G of A is of the form G = Q−1 P−1 for some matrices T2 T3 T1 , T2 and T3 . Now, G0 = GAG ⇔ T2 = M and T1 = L. Thus, G = Ir L Q−1 P−1 . Further ρ(G) = s ⇔ ρ(T3 − ML) = s − r. Take M T3 S = T3 − ML. Proof.
Corollary 14.2.7. Let A be a matrix of order m × n and of rank r. Let G0 be a reflexive g-inverse of A. Then the class of all g-inverses G of A having rank s (s > r) such that G ∈ Eq(G0 |A) is given by Q−1 diag Ir , N P−1 , where N is an arbitrary matrix of rank s − r and the non- singular matrices P and Q are such that A = Pdiag Ir , 0 Q and G0 = Q−1 diag Ir , 0 P−1 . Corollary 14.2.8. Let A be a matrix of order m × n of rank r. Let G0 be a reflexive g-inverse of A. Then an m × n matrix G of rank s such that exists non- singular matrices R and G ∈ Eq(G0 |A) if and only if there −1 −1 S such that A = Rdiag and Ir , 0 S and G0 = S diag Ir , 0 R G = S−1 diag Is , 0 R−1 . The following theorem is yet another characterization of g-inverses belonging to an equivalence class. Theorem 14.2.9. Let A be a matrix of order m × n and of rank r. Let G0 be a reflexive g-inverse of A. Then the class of all G ∈ Eq(G0 |A) is given by G = G0 + (I − G0 A)U(I − AG0 ) where U is arbitrary. Proof. Notice that A(G − G0 ) = 0 ⇔ C(G − G0 ) ⊆ N (A) and (G − G0 )A = 0 ⇔ C(G − G0 )t ⊆ N (At ). Now the result follows easily.
374
Matrix Partial Orders, Shorted Operators and Applications
Notice that in Theorem 14.2.4, we have proved that if G ∈ Eq(G0 |A). Then G0 <− G where G0 is a reflexive g-inverse of A. We now show that every matrix Y satisfying G0 <− Y <− G also belongs to Eq(G0 |A). Theorem 14.2.10. Let A be an m × n matrix and let G0 ∈ {A− r }. Let G ∈ Eq(G0 |A). Then each matrix Y satisfying G0 <− Y <− G belongs to Eq(G0 |A). Proof. Let Y be of rank s. Then by Theorem 14.2.9, there exists non-singular matrices R and S such that A = Rdiag Ir , 0s−r , 0 S, G0 = S−1 diag Ir , 0s−r , 0 R−1 and G = S−1 diag I0 , Is−r , 0 R−1 . Since Y <− G it follows that Y = S−1 diag Ts×s , 0 R−1 , where T is − − idempotent. Since G0 <− Y <− G, we have diag Ir , 0 < T < Is . As each of the matrices diag Ir , 0 , T and Is are idempotent, we have by Theorem 3.6.4, T = diag Ir , 0 + diag 0 , M , where M is idempotent. Clearly, AG0 = AY and G0 A = YA. Hence, Y ∈ Eq(G0 |A). Corollary 14.2.11. Let A be an m × n matrix and let G0 ∈ {A− r }. Let G1 , G2 ∈ Eq(G0 |A) such that G1 <− G2 . Let Y be a matrix such that G1 <− Y <− G2 . Then Y ∈ Eq(G0 |A). Proof. By Theorem 14.2.4, G0 <− G1 . Since G1 <− Y, we have G0 <− Y. The corollary now follows from Theorem 14.2.10. Let A be an m × n matrix and let G0 ∈ {A− r }. Let Gs−r be a matrix of rank s such that Gs−r ∈ Eq(G0 |A). Does there exist a matrix As−r ∈ {G− s−r } such that As−r ∈ Eq(A|G0 )? If so, what is the class of all such As−r ? Determination of all such As−r will be helpful in the diagrammatic representation in Section 14.5. We prove Theorem 14.2.12. Let A be an m × n matrix and let G0 ∈ {A− r }. Let Eq(G |A). As in CorolGs−r be a matrix of rank s(> r) such that G ∈ 0 s−r −1 −1 lary 14.2.8, let A = Rdiag S and G = S diag R and I , 0 I , 0 0 r −1r −1 Gs−r = S diag Is , 0 R , where R and S are non-singular matrices. The class of all matrices As−r such that As−r ∈ Eq(A|G0 ) ∩ {G− s−r } is 0 given by A + R Is−r 0 Is−r L S where L and M are arbitrary. M
Equivalence Relations on Generalized and Outer Inverses
Proof.
375
As−r ∈ {G− s−r } if and only if Ir 0 J1 S As−r = R 0 Is−r L J2 M ML + J1 J2
for some matrices J1 , J2 , L and M. Now, G0 A0 = G0 As−r if and only if J1 = 0. Further, A0 G0 = As−r G0 if and only if J2 = 0. Thus, Ir 0 0 0 Is−r L S As−r ∈ Eq(A|Gs−r ) ∩ As−r ∈ {G− s−r } ⇔ As−r = R 0 M ML for some matrices L and M or equivalently, As−r = A + B, where 0 B = R Is−r 0 Is−r L S for some matrices L and M. M The matrix As−r of Theorem 14.2.12 admits an alternative expression, which we give in our next theorem. Theorem 14.2.13. Consider the same setup as in Theorem 14.2.12. The class of all matrices As−r ∈ Eq(A|G0 ) ∩ {G− s−r } is given by As−r = A + (I − AG0 )(Gs−r − G0 )− r (I − G0 A). Proof.
Let R, S, L, Mand B be as in the proof of Theorem 14.2.12. 0 Notice that G0 R Is−r = 0 and 0 Is−r L SG0 = 0 for all matrices L M and M. Hence, C(B) ⊆ C(I − AG0 ) and C(Bt ) ⊆ C(I − G0 A). The rest is computational. Several important remarks are in order. Their importance lies in the fact that they will be instrumental in designing the diagrammatic representation of the generalized inverses of a given matrix. Remark 14.2.14. Notice that ρ(As−r ) = ρ(A) + ρ(B). Remark 14.2.15. In Theorem 14.2.13, if we let 0 t Is−r = u1 : . . : us−r and 0, Is−r , L = v1t : . . : vs−r , M t then B = Ru1 v1t S ⊕ . . . ⊕ Rus−r vs−r S.
376
Matrix Partial Orders, Shorted Operators and Applications
Remark 14.2.16. Let Gi = S−1 diag Ir+i , 0 R−1 and Pi Ai = A + j=1 Ruj vjt S. Then Ai ∈ Eq(A|G0 ) ∩ {(G− ) }, i = 1, . . . , s − r. i r Remark 14.2.17. Write A0 = A. Then for each i = 0, 1, . . . , s − r − 1, we have Eq(Gi+1 |Ai+1 ) ⊆ Eq(Gi |Ai ), where Ai and Gi are defined in Remark 14.2.16.
Remark 14.2.18. Notice that s can take values from r + 1 to min{m, n}. Remark 14.2.19. We also have Gi <− Gi+1 and Ai <− Ai+1 for each i = 0, 1, . . . , s − r − 1.
Consider an m × n matrix A. Each equivalence class of g-inverses under the equivalence relation as in Definition 14.2.1 is triggered by a reflexive g-inverse of A. Let us ask the following question: Given a pair of equivalence classes, is it possible for some member of one to dominate a member of the other? The answer to this question is in affirmative. In fact, given a g-inverse G, we shall determine the set all equivalence classes such that each one of them contains a g-inverse G1 of A such that G <− G1 and also characterize all such g-inverses G1 . Theorem 14.2.20. Let A be an m × n matrix of rank r. Let G0 be a reflexive g-inverse of A and G1 be a g-inverse of A of rank s (r < s < min{m, n}) such that G1 ∈ Eq(G0 |A). Let P and Q be non-singular matrices such that A = Pdiag Ir , 0 , 0 , G0 = Q−1 diag Ir , 0 , 0 P−1 and G1 = Q−1 diag Ir , Is−r , 0 P−1 . (This is Ir L1 L2 guaranteed by Remark 14.2.3). Let G2 = Q−1 M1 M1 L1 M1 L2 P−1 M2 M2 L1 M2 L2 be reflexive g-inverse of A, where L1 is of order r × (s − r), M1 is of order (s − r) × r and the partition is determined accordingly. Then there exists a G3 ∈ Eq(G2 |A) such that G1 <− G3 if and only if C(L1 ) ⊆ C(L2 ), C(Mt1 ) ⊆ C(Mt2 ) and ρ(L2 ) + ρ(M2 ) ≤ min{m − s, n − s}. (Notice that every reflexive g-inverse of A is of the form given above.)
Equivalence Relations on Generalized and Outer Inverses
377
ByTheorem14.2.8, G3 ∈ Eq(G2 |A) if and only if for some S11 S12 matrix S = , S21 S22 Ir L1 L2 G3 = Q−1 M1 M1 L1 + S11 M1 L2 + S12 P−1 . M2 M2 L1 + S21 M2 L2 + S22 Proof.
Now, G1 <− G3 if and only if there exists a reflexive g-inverse H of G1 such that G1 H = G3 H and HG1 = HG3 . Notice that every reflexive gIr 0 E1 Q for some inverse of G1 is of the form H = P 0 Is−r E2 F1 F2 F1 E1 + F2 E2 matrices E1 , E2 , F1 and F2 of appropriate orders. It is easy to check that L2 F1 =0 (14.2.1) L = −L F (14.2.2) 2 2 2 M1 = −S12 F1 (14.2.3) G1 H = G3 H ⇔ S + S F = I (14.2.4) 11 12 2 M = −S22 F1 (14.2.5) 2 S21 = −S22 F2 (14.2.6) E1 M2 =0 (14.2.7) L = −E S (14.2.8) 1 1 21 L2 = −E1 S22 (14.2.9) HG1 = HG3 ⇔ M1 = −E2 M2 (14.2.10) S11 + E2 S21 = I (14.2.11) S12 = −E2 S22 . (14.2.12) ‘Only if’ part (14.2.2) and (14.2.10) imply C(L1 ) ⊆ C(L2 ), C(M1 t ) ⊆ C(Mt2 ) respectively. Since L2 F1 = 0, it follows that ρ(F1 ) ≤ n − s − ρ(L2 ). Since M2 = −S22 F1 , it follows that ρ(M2 ) ≤ ρ(F1 ) ≤ n − s − ρ(L2 ). Similarly, from (14.2.7) and (14.2.9) it follows that ρ(L2 ) ≤ m − s − ρ(M2 ). Thus, ρ(L2 ) + ρ(M2 ) ≤ min{m − s, n − s}. ‘If’ part We first exhibit S22 , F1 and E1 satisfying equations (14.2.5), (14.2.9), (14.2.1) and (14.2.7). In view of the equations (14.2.5) and (14.2.9), the equations (14.2.1) and (14.2.7) are the same. Let ρ(L2 ) = a and ρ(M2 )= b. We know that a + b ≤ min{m − s, n − s}. Let M2 = H1 diag Ia , 0 H2 and L2 = J1 diag Ib , 0 J2 be normal forms of M2 and L2 respectively, where H1 , H2 , J1 and J2 are non-singular matrices of appropriate orders.
378
Matrix Partial Orders, Shorted Operators and Applications
M2 is of order (n − s) × r and L2 is of order r × (n − s). Consider 0 Ia 0 0 0 0 S22 = H1 Ib 0 0 J2 , F1 = J2 −1 −Ia 0 0 H2 and 0 0 0 0 0 0 0 −Ib 0 E1 = J1 0 0 0 H1 −1 . 0 0 0 Notice that S22 is of order (n − s) × (m − s). Since a + b ≤ min{m − s, n − s}, the above construction of S22 is possible. It is easy to verify that S22 , F1 and E1 as constructed above satisfy the equations (14.2.1), (14.2.5), (14.2.7) and (14.2.9). Since C(L1 ) ⊆ C(L2 ), there exists an F2 satisfying (14.2.7). Construct S21 = −S22 F2 . So, (14.2.6) holds. Also, −E1 S21 = E1 S22 F2 = −L2 F2 = L1 , thus, (14.2.8) is also satisfied. Since C(Mt1 ) ⊆ C(Mt2 ), there exists a matrix E2 satisfying (14.2.10). Let S12 = −E2 S22 . Then (14.2.12) is satisfied. By post-multiplying (14.2.12) by F1 we have, S12 F1 = −E2 S22 F1 = −E2 M2 = M1 . Thus, (14.2.3) holds. Now, E2 S21 = −E2 S22 F2 = S12 F2 , we let S11 = I − S12 F2 = I − E2 S21 . Then both (14.2.4) and (14.2.11) are satisfied. Thus, we have exhibited S22 , F1 and F2 , where F1 and E2 are such that the equations (14.2.1)- (14.2.12) are all satisfied. Hence, G1 <− G3 . In the process, we have also exhibited a reflexive g-inverse H such that G1 H = G3 H and HG1 = HG3 . In the setup of Theorem 14.2.20, let C(L1 ) ⊆ C(L2 ), C(M1 t ) ⊆ C(M2 t ) and ρ(L2 ) + ρ(M2 ) ≤ min{m − s, n − s}. We now obtain the class of all g-inverses G3 ∈ Eq(G2 |A) such that G1 <− G3 . Theorem 14.2.21. Consider the setup of Theorem 14.2.20. Let t C(L1 ) ⊆ C(L2 ), C(Mt1 ) ⊆ C(M 2 ) and ρ(L2 ) + ρ(M2 ) ≤ min{m − s, n − s}. Let M2 = H1 diag Ia , 0 H2 and L2 = J1 diag Ib , 0 J2 be normal forms of M2 and L2 respectively, where H1 , H2 , J1 and J2 are non-singular matrices of appropriate orders. Then the class of all G3 ∈ Eq(G2 |A) 0 0 0 such that G1 <− G3 is given by G3 = G2 + Q−1 0 S11 S12 P−1 , 0 S21 S22 B11 B12 where (i) S22 = H1 H2 with B11 , B12 and B21 arbitrary B21 B22 matrices of order a × b, a × (m − s − b), (n − s − a) × b respectively and
Equivalence Relations on Generalized and Outer Inverses
379
B22 = (I − Y2− Y2 )D(I − W2 W2− ), D, Y2 and W2 arbitrary solutions to I Y2 B21 = b and B21 W2 = Ia 0 . 0 − − (ii) S21 = S22 (L− 2 L1 + (I − L2 L2 )Z), where L2 is some g-inverse of L2 and Z is arbitrary. − − (iii) S12 = (M1 M− 2 + T(I − M2 M2 )Z)S22 , where M2 is some g-inverse of M2 and T is arbitrary. − − − (iv) S11 = I + (M1 M− 2 + T(I − M2 M2 )Z)S22 (L2 L1 + (I − L2 L2 )Z). Proof. We first show that if ρ(L2 ) + ρ(M2 ) ≤ min{m − s, n − s}, then there exist matrices S22 , E1 and F1 satisfying the equations: M2 = −S22 F1 , L2 = −E1 S22 and E1 S22 F1 = 0. Let ρ(M2 ) = a and ρ(M2 ) = b. We know that a + b ≤ min{m − s, n − s}. Let M2 = H1 diag Ia , 0 H2 and L2 = J1 diag Ib , 0 J2 be normal forms of M2 , L2 respectively, where H1 , H2 , J1 and J2 are non-singular B11 B12 matrices of appropriate orders. Write S22 = H1 J2 and B21 B22 W11 W12 Y11 Y12 F1 = J2 −1 H2 and E1 = J1 H1 −1 . Thus, by W21 W22 Y21 Y22 I 0 W11 W12 equation (14.2.1), L2 F1 = 0 ⇔ J1 b J2 J−1 H2 = 0 2 0 0 W21 W22 or W11 = 0 and W12 = 0. Similarly, equation (14.2.7) yields Y11 = 0 and Y21 = 0. Now, M2 = −S22 F1 ⇔ B12 W21 = Ia , B12 W22 = 0 and B22 W21 W22 = 0. Clearly, B12 is an a × ((n − s) − b) matrix. Since a < (n − s) − b, we can find a matrix B12 of rank a. Let B12 be any arbitrary matrix of order a × ((n − s) − b) such that ρ(B12 ) = a. Choose and fix a B12 . The matrix W21 is of order (n − s − b) × a and W22 is of order (n − s − b) × (r − a). Let W21 be an arbitrary right inverse of B12 and W22 = (I − W21 B12 )Z, where Z is an arbitrary matrix of order (n − s − b) × (r − a). B22 is an arbitrary matrix such that C(Bt22 ) ⊆ N (W21 W22 )t . Also, L2 = −E1 S22 ⇔ Y12 B21 = −Ib , Y12 B22 = 0, Y22 B21 = 0 and Y22 B22 = 0. Notice that B21 is a matrix of order (m − s − a) × b. Since b ≤ m − s − a, there exists a matrix B21 of order (m − s − a) × b with rank b. Let B21 be an arbitrary matrix of order (m − s − a) × b with rank b. Choose and fix B21 . Then Y12 is an arbitrary left inverse of −B21 and Y22 = R(I − B21 Y12 ), where R is an arbitrary matrix of appropriate order. Then B22 must
380
Matrix Partial Orders, Shorted Operators and Applications
Y12 B22 = 0. Thus, B22 is an arbitrary solution of the maY22 Y12 trix equation B22 = 0 and B22 (W21 W22 ) = 0, which is given by Y22 − Y12 Y12 − B22 = (I − )D(I − (W21 , W22 )(W21 , W22 ) ), where D Y22 Y22 − Y12 − is arbitrary and “ and (W21 , W22 ) ” are arbitrary g-inverses of Y22 Y12 and (W21 , W22 ) respectively. The matrix B11 is arbitrary. The Y22 rest of the proof is now easy. satisfy
Consider an m × n matrix A and an arbitrary reflexive g-inverse G0 of A. Does the equivalence class Eq(G0 |A) determine A in the sense that if Eq(G0 |A) = Eq(G0 |B) for some matrix B having G0 as a g-inverse, should we have A = B? This indeed is so, as shown by the following theorem: Theorem 14.2.22. Let A and B be matrices of the same order and same rank r. Let A and B have a common reflexive g-inverse G0 such that Eq(G0 |A) = Eq(G0 |B). Then A = B. Proof. By Theorem 2.3.18, there exist non-singular matricesP and Q such that A = Pdiag Ir , 0 Q and G0 = Q−1 diag Ir , 0 P−1 . By Corollary 14.2.7, G ∈ Eq(G0 |A) if and only if G = Q−1 diag Ir , S P−1 forsome matrix S. Since G0 is reflexive g-inverse of B, we can write B = Ir T1 P Q for some matrices T1 and T2 . Since G ∈ Eq(G0 |B), we T2 T2 T1 Ir 0 have BG0 = BG and G0 B = GB. However, BG0 = P P−1 T2 0 Ir T1 S and BG = P P−1 , so, BG0 = BG for all G ∈ Eq(G0 |A). T2 T2 T1 S So, T1 S = 0 for all S and therefore, T1 = 0. Similarly, G0 B = GB for all G ∈ Eq(G0 |A) ⇒ T2 = 0. Hence, A = B. 14.3
Equivalence relations on subclasses of g-inverses
We begin this section by identifying the equivalence classes under the equivalence relation “v” that contain two of the special reflexive g-inverses namely the Moore-Penrose inverse and the group inverse.
Equivalence Relations on Generalized and Outer Inverses
381
Theorem 14.3.1. (a) Let A ∈ Cm×n such that ρ(A) = r (0 < r < min{m, n}). Then an n × m matrix G ∈ Eq(A† |A) if and only if G ∈ {A− `m }. (b) Let A be an n × n matrix of index ≤ 1 with rank r (0 < r < n). Then an n × n matrix G ∈ Eq(A# |A) if and only if G ∈ {A− com }. Proof follows easily from Theorem 14.2.6. Thus, the equivalence class of A† , the Moore-Penrose inverse of a matrix A consists of the class of all minimum norm least squares g-inverses of A. Similarly the equivalence class of A# , the group inverse of A consists of all commuting g-inverses of A. It is therefore natural for us to think about other subclasses of g-inverses like {A− ` }, the class of all least squares g− inverses, {Am }, the class of minimum norm g-inverses, {A− `m }, the class − − of minimum norm least squares g-inverses, {A− c }, {Aa } and {Acom }, the commuting g-inverses and their status in relation to the equivalence relation “v”. If we consider any of these classes of g-inverses and define relations on them in a manner analogous to the Definition 14.2.1, what are the interrelationships among the equivalence classes of these relations? It turns out that the relations defined are each an equivalence relation and several results concerning them are similar to those of the relation “v .” For the study of equivalence relations on the classes of all least square ginverses, the class of minimum norm g-inverses, and the class of minimum norm least square g-inverses, we consider the matrices over the field of complex numbers C. For the study of equivalence relations on the other subclasses, we consider square matrices of index ≤ 1 over a general field. We start with the study of the equivalence relations on the classes of the least squares g-inverses and minimum norm g-inverses. We present the results corresponding to these relations side by side because of the duality between these two classes (see Theorem 2.5.15). Definition 14.3.2. Let A ∈ Cm×n and G1 and G2 ∈ {A− ` }. The relation “v` ” on {A− } defined as G v G if AG = AG and G 1 ` 2 1 2 1 A = G2 A (or ` equivalently if G1 AG1 = G2 AG2 ). Similarly the relation “vm ” on {A− m } is defined. Clearly, the relations “v` ” and “vm ” are equivalence relations on {A− ` } and {A− } respectively. Further there is a unique reflexive g-inverse in m each equivalence class corresponding to each of these relations. We denote the equivalence class containing the least squares reflexive g-inverse G0 by
382
Matrix Partial Orders, Shorted Operators and Applications
Eq` (G|A). Eqm (G|A) is similarly defined. Notice that both Eq` (G|A) and Eqm (G|A) are subsets of the equivalence class Eq(G|A). Before we elaborate on properties of these relations we note the following: Theorem 14.3.3. Let A ∈ Cm×n and T be any equivalence class of {A− } under the relation “v”. If T contains a least squares g-inverse G, then it contains a reflexive least squares g-inverse G1 . Consequently, T = Eq` (G|A). Similarly, if T contains a minimum norm g-inverse G, then it contains a reflexive minimum norm g-inverse G2 . Consequently, T = Eqm (G|A). Proof. Let G1 = GAG. Then G1 is a reflexive g-inverse such that G1 v` G and AG1 = AG. Since AG is hermitian, so is AG1 . So, G1 is the reflexive least squares g-inverse belonging to T. We already know that Eq` (G1 |A) = Eq` (G|A) ⊆ T. So, let H ∈ T. As Hv` G, we have AH = AG. Thus, AH is hermitian giving H as a least squares g-inverse of A. Also, Hv` Gv` G1 . Hence, H ∈ Eq` (G1 |A) = Eq` (G|A), proving T = Eq` (G|A). Theorem 14.3.4. Let A ∈ Cm×n . If G0 ∈ {A− `r } and G1 ∈ Eq` (G0 |A), − then G0 < ?G1 . If G0 ∈ {Amr }, and G1 ∈ Eqm (G0 |A), then G0 ? < G1 . Thus, G0 = inf{G : G ∈ Eq` (G0 |A)}, if G0 ∈ {A− `r } and G0 = inf{G : G ∈ Eqm (G0 |A)}, if G0 ∈ {A− }. mr We now give some characterizations of g-inverses in an equivalence class under each of the relations “v` ” and “vm ”. Theorem 14.3.5. Let A ∈ Cm×n be a matrix of rank r. Let G0 ∈ {A− `r }. Let G be an n × m matrix of rank s. Then G ∈ Eq` (G0 |A) if and only if there exist a unitary matrix U and a non-singular matrix V such that A = Udiag Ir , 0 V, G0 = V−1 diag Ir , 0 U? and G = V−1 diag Ir , Ir−s , 0 U? . Similarly, let G0 ∈ {A− mr }. Then an n × m matrix G of rank s such that G ∈ Eqm (G0 |A) if and only if there exists a unitary matrix U and a non-singular matrix V such that A = Udiag Ir , 0 V, G0 = V−1 diag Ir , 0 U? and G = V−1 diag Ir , Ir−s , 0 U? . Theorem 14.3.6. Let A ∈ Cm×n be matrix of rank r. Let U and V are unitary matrices such that A = Udiag 4 , 0 V? be a singular value decomposition of A, where 4 is a positive definite diagonal matrix. Let −1 4 0 G0 ∈ {A− U? for some matrix M. Then `r } such that G0 = V M 0
Equivalence Relations on Generalized and Outer Inverses
383
an n × m matrix G of rank s such that G ∈ Eq` (G0 |A) if and only if 4−1 0 G=V U? , where N is an arbitrary matrix of rank s − r. M N We can define a relation “∼`m ” on {A− `m } in a manner similar to that in Definition 14.3.4. It is interesting to see that not only this relation is − an equivalence relation on {A− `m }, but the full set {A`m } forms the only equivalence class of this set under the relation and the unique reflexive ginverse in it is A† , the Moore-Penrose inverse of a matrix A. Note that in Theorem 14.3.1, we have shown that this is the equivalence class Eq(A† |A). † Thus, {A− `m } = Eq(A |A). We now consider square matrices of index not greater than 1 over a general field. Definition 14.3.7. Let A be a square matrix of index ≤ 1. Let G1 , G1 ∈ − {A− c }. The relation “∼c ” on {Ac } is defined as: G1 ∼c G1 if AG1 = AG2 and G1 A = G2 A. − Similarly, Let G1 , G1 ∈ {A− a }. The relation “∼a ” on {Aa } is defined as: G1 ∼a G1 if AG1 = AG2 and G1 A = G2 A. The relations “∼c ” and “∼a ” are equivalence relations. We denote the equivalence classes under “∼c ” and “∼a ” as Eqc (G|A) and Eqa (G|A) respectively. The following theorem is easy to prove. Theorem 14.3.8. Let A be a square matrix of index ≤ 1. Then the following hold: (i) There exists a unique χ-inverse G0 in each equivalence class under “∼c ”. −1 T L (ii) Let A = Pdiag T , 0 P−1 and G0 = P P−1 , where P 0 0 and T are non-singular and L is a fixed matrix. Then G ∈ Eqc (G0 |A) −1 T L if and only if G = P P−1 , for some matrix N. Also, 0 N G0 < #G, ∀ G ∈ Eqc (G0 |A). (iii) There exists a unique ρ-inverse G0 in each equivalence class under “∼a ”. −1 T 0 (iv) Let A = Pdiag T , 0 and G0 = P P−1 , where P and M 0 T are non-singular and M is a fixed matrix. Then an n × n matrix
384
Matrix Partial Orders, Shorted Operators and Applications
T−1 0 P−1 , for some M N matrix N. Also, G0 # < G, for all G ∈ Eqc (G0 |A).
G ∈ Eqa (G0 |A) if and only if G = P
Definition 14.3.9. Let A be a square matrix of index ≤ 1. Let G1 , G2 ∈ − {A− com }. The relation “∼com ” on {Acom } is defined as: G1 ∼com G2 if AG1 = AG2 . It is easy to check that the relation “∼com ” on {A− com } is an equivalence relation and {A− } is the only equivalence class of {A− com com } under this relation. Note that the reflexive g-inverse in this equivalence class is # A# . By Theorem 14.3.1, {A− com } = Eq(A |A). We have seen in earlier chapters that if A is a range hermitian matrix, we − − − have {A− c } = {Am } and {Aa } = {A` }. Therefore for a range hermitian matrix the relations “∼c ” and “∼m ”, “∼a ” and “∼` ” coincide. 14.4
Equivalence relation on the outer inverses of a matrix
Let A be an m×n matrix. Recall that an n×m matrix X is called an outer inverse of A, if A is a g-inverse of X. The class of all outer inverses of A is denoted by {A− }. We start this section by obtaining a characterization of the outer inverses of a matrix. Theorem 14.4.1. Let A be an m × n matrix of rank r (> 0). Let A = Pdiag Ir , 0 Q be a normal form of A, where P and Q are non-singular. Then the class of all outer inverses of A is given by L M −1 X=Q P−1 , N NM where L is an arbitrary idempotent matrix of order r × r and M, N are arbitrary matrices such that C(M) ⊆ C(L) and C(Nt ) ⊆ C(Lt ). Proof.
Direct verification.
Remark 14.4.2. Let A, X and L be as in Theorem 14.4.1. Then X is an outer inverse of A with ρ(X) = s (0 ≤ s ≤ r) if and only if ρ(L) = s. Here is another characterization of the outer inverses of a matrix that can be easily deduced from Theorem 2.3.18.
Equivalence Relations on Generalized and Outer Inverses
385
Theorem 14.4.3. Let A and X be matrices of rank r and s (s ≤ r) respectively. Then X is an outer inverse of A, if and only if there ex ist non-singular matrices R and S such that A = Rdiag Ir , 0 S and −1 Is 0 X=S R−1 . 0 0 The following result on outer inverses is analogous to Theorem 14.2.10 and Corollary 14.2.11 for g-inverses. Theorem 14.4.4. Let A be an m × n matrix and G a reflexive g-inverse of A. If X is an n × m matrix such that X <− G, then X and G − X are outer inverses of A. Proof. Since X <− G, there exists a matrix Y such that G = X ⊕ Y. Since GAG = G, we have X + Y = (X + Y)A(X + Y). Rewriting we have X − (X + Y)AX = (X + Y)AY − Y. Now, C(Xt ) ∩ C(Yt ) = 0, so, we have X − (X + Y)AX = (X + Y)AY − Y = 0. Thus, X − XAX = YAX and Y − YAY = XAY. Again, since C(X) ∩ C(Y) = 0, it follows that both X and Y are outer inverses of A. Further XAY = 0 and YAX = 0. Remark 14.4.5. Let A and X be as in Theorem 14.4.4. If X <− G, then X is an outer inverse of every reflexive g-inverse of G. Let X be an outer inverse of A. We now identify an important reflexive g-inverse of X. Theorem 14.4.6. Let X is an outer inverse of A. Then AXA is a reflexive g-inverse of X and A ∈ Eq(AXA|X). Proof. Since X is an outer inverse of A, A is an g-inverse of X. Hence, AXA ∈ {X− r }. Moreover, since AX = AXAX and XA = XAXA it follows that A ∈ Eq(AXA|X). We now define an equivalence relation on the outer inverses of a matrix. Definition 14.4.7. Let A be an m×n matrix. Let X1 , X2 ∈ {A− }. Define X1 ∼ = X2 if AX1 A = AX2 A. Theorem 14.4.8. The relation “∼ =” is an equivalence relation on {A− }. Proof is trivial. Thus, the relation “∼ =” partitions {A− }, the set of outer inverses into mutually disjoint equivalence classes. We denote the equivalence class containing an X ∈ {A− } by Eq(X|A).
386
Matrix Partial Orders, Shorted Operators and Applications
Theorem 14.4.9. Let A be an m × n matrix and X ∈ {A− }. Then for each Y ∈ Eq(X|A), ρ(Y) = ρ(X). Let A be an m × n matrix of rank r. Let G0 be a reflexive ginverse of A. Then Eq(G0 |A) contains g-inverses of A of all ranks with r ≤ s ≤ min{m, n}. However, if X is an outer inverse, then Eq(X|A) contains all outer inverses of same rank as X. We now identify all the matrices Y such that Y ∈ Eq(X|A). Theorem 14.4.10. Let A be an m × n matrix of rank r. Let X be an outer inverse of A. Following Theorem 14.3.1, let A = Pdiag Ir , 0 Q L M −1 be a normal form of A and let X = Q P−1 , where L is N NM an idempotent matrix of order r × r and LM = M and NL = N. Then L T1 Y ∈ Eq(X|A) if and only if Y = Q−1 P−1 , where T1 and T2 T2 T1 T2 are arbitrary matrices such that LT1 = T1 and T2 L = T2 . We first note that a matrix Y ∈ Eq(X|A) if and only if T3 T1 Y = Q−1 P−1 , where T3 is idempotent, T3 T1 = T1 and T2 T2 T1 T2 T3 = T2. Now, AXA = AYA if and onlyif T3 = L. In fact, L 0 T3 0 AXA = P Q and AYA = P Q. 0 0 0 0 Proof.
Remark 14.4.11. For any m × n matrix A of rank r, there is a oneone correspondence between the idempotent matrices of order r × r and the equivalence classes of {A− } under “∼ = .” In particular, the class of all reflexive g-inverses of A forms an equivalence class of {A− } under “∼ = .” The equivalence class that contains the null matrix has only null matrix in it. Let A be an m × n matrix and let X be an outer inverse of A. Notice that A = AXA ⊕ (A − AXA). Following [Goller (1986)], AXA is a rank decomposition matrix of A. (In general, if A = B ⊕ C, then B (as also C) is called a rank decomposition matrix of A.) Thus, each equivalence class of {A− } under “∼ =” corresponds to a unique rank decomposition matrix. As seen in Remark 14.4.2, each equivalence class of {A− } under “∼ =” is determined by a unique idempotent matrix, so, each rank decomposition matrix is determined by a unique idempotent. In fact, we can make a stronger statement leading to another characterization of the equivalence classes of {A− } under “∼ = .”
Equivalence Relations on Generalized and Outer Inverses
387
Theorem 14.4.12. Let A be an m × n matrix. Then every rank decomposition matrix DA determines an equivalence class of {A− } under “∼ =” given by {A− DA A− : A− ∈ {A− }}. Proof. First notice that if DA is a rank decomposition matrix of A, then AA− DA = DA A− A = DA A− DA = DA for all A− . Clearly, A− DA A− is an outer inverse of A, since for all A− , A− DA A− AA− DA A− = A− DA A− DA A− = A− DA A− . − Further, if A− 1 and A2 are two g-inverses of A, then − − − AA− 1 DA A1 A = DA = AA2 DA A2 A. Hence, for all A− ∈ {A− }, the outer inverses A− DA A− are related to each other under “∼ = .” Let A = Pdiag(Ir , 0)Q. The matrix X = GDA G is an outer inverse L M for some g-inverse G of A. By Theorem 14.4.10, X = Q−1 P−1 , N NM where L is an idempotent and matrices M and N are such that LM = M and NL = N. Let T be the equivalence class of X. Now, Z = AXA is a rank decomposition matrix and by discussion in the last para, for each ginverse A− of A, {A− ZA− } ⊆ T. ∼ Let Y ∈ T. Since Y an outer inverse of A and Y = X, we have Y = L T 1 Q−1 P−1 , where T1 and T2 are some matrices such that T2 T2 T1 Ir T1 −1 LT1 = T1 , T2 L = T2 . Consider H = Q P−1 . It is easy to T2 J see that H is a g-inverse of A and Y = HZH. Thus, T = {A− DA A− : A− ∈ {A− }}. We have seen in the proof of Theorem 14.4.12 that for a given rank decomposition matrix DA of A, there are possibly several g-inverses H of A such that HDA H = Y for some outer inverse Y of A. It may be interesting to ask the following: Given an outer inverse Y of A in an equivalence class of a rank decomposition matrix DA of A, what is the class of all g-inverses H of A such that HDA H = Y? The answer gives a nice link to the equivalence classes of g-inverses of DA . Theorem 14.4.13. Let A be an m×n matrix and let DA be a rank decomposition matrix of A. Let H1 , H2 ∈ {A− }. Then H1 DA H2 = H2 DA H1 if and only if H1 , H2 belong to the same equivalence of g-inverses of DA .
388
Matrix Partial Orders, Shorted Operators and Applications
Proof. Since DA is a rank decomposition matrix of A, each g-inverse of A is also a g-inverse of DA . It is clear that H1 ∈ Eq(H1 DA H1 |DA ). By Definition 14.2.1, H2 ∈ Eq(H1 DA H1 |DA ) if and only if H1 DA H2 = H2 DA H1 . Let B be an m × n matrix and X be an outer inverse of B. Does there − − exist a reflexive g-inverse B− r of B such that X < Br ? If so, what is the − class of all Br that dominate X under minus. The answer is contained in the following: Theorem 14.4.14. Let B be an m × n matrix and X be an outer inverse − − of B. Then there exists a reflexive g-inverse B− r of B such that X < Br . − − The class of all such Br is given by X + (I − XB)(B − BXB)r (I − BX). Proof. First notice the striking similarity between this and Theorem 14.2.9. In fact, we show that the statement of the present theorem is almost a restatement of Theorem 14.2.9. If X is an outer inverse of B, then B is a g-inverse of X ∈ Eq(BXB|X) under the relation “v”. Further, if − − X <− B− r , then (Br − X)BX = XB(Br − X) = 0, since B is a g-inverse − − of Br . Thus, Br BXB = XBXB = XB. Similarly, BXBB− r = BX. Hence, B− ∈ Eq(X|BXB). Now, take A = X and G = B and aps−r r − such that X < B− ply Theorem 14.2.8, we get the class of all B− r as r − − X + (I − XBXB)(B − BXB)r (I − BXBX) where (B − BXB)r is an arbitrary g-inverse of B − BXB. Since XBX = X, the result follows. Given an outer inverse X of B, what is the the class of all reflexive − − g-inverse B− r of B such that X < Br ? We obtained a characterization in Theorem 14.4.14 analogous to Theorem 14.2.9. We now obtain a characterization similar to Theorem 14.2.13. Theorem 14.4.15. Let B be an m × n matrix of rank r and let X be an outer inverse of B with ρ(X) = s(< r). Let B = Rdiag Ir , 0 S and X = S−1 diag Is , 0 R−1 . Then the class of all reflexive g-inverses B− r of Is 0 0 − −1 B such that X <− B− 0 Ir−s L R−1 , where r is given by Br = S 0 M ML L, M are arbitrary. Proof is easy. Remark 14.4.16. Consider an equivalence class under the relation “v .” It contains g-inverses of A of all ranks s such that ρ(A) ≤ s ≤ min{m, n}. Let
Equivalence Relations on Generalized and Outer Inverses
389
s < min{m, n}. Also, there exist distinct g-inverses of A namely G1 and G2 such that G1 <− G2 . On the other hand, all the outer inverses of A in the same equivalence class under the relation “∼ =” have same rank. Further, if X1 and X2 are distinct outer inverses belonging to the same equivalence class under the relation “∼ =”, then neither X1 <− X2 nor X2 <− X1 . The Drazin inverse AD of a square matrix A is a well known outer inverse A. We now identify the equivalence class Eq(AD |A) of AD . Theorem 14.4.17. Let A = P diag T , N P−1 be the core-nilpotent decomposition of a square matrix A, where P and T are non-singular matrices and N is a nilpotent matrix. Then G ∈ Eq(AD |A) if and −1 T R1 only if G = P P−1 , where R1 = W(I − NN− ) and R2 R2 TR1 R2 = (I − N− N)Z for some matrices W and Z of appropriate orders and for some g-inverse N− of N. By Theorem 2.4.26, we have AD = Pdiag T−1 , 0 P−1 . −1 L R1 D So, AA A = Pdiag T , 0 P . Let G = P P−1 . Now, R2 R3 G ∈ Eq(AD |A) if and only if AGA = AAD A and GAG = G. Moreover, AGA = AAD A ⇔ L = T−1 , R1 N = 0, NR2 = 0 and NR3 N = 0. Also, GAG = G ⇔ R2 TR1 + R3 NR3 = R3 ⇔ R3 N = 0 and R2 TR1 = R3 . So, the result follows. Proof.
Remark 14.4.18. Let A be a square matrix and AD be its Drazin inverse. Then G ∈ Eq(AD |A) if and only if there exists a matrix S such that (i) G = AD + S (ii) ρ(G) = ρ(AD ), (iii) AD SAD = 0; (iv) (A − (AD )# )S = 0 and (v) S(A − (AD )# ) = 0. Recall that (A − (AD )# ) is the nilpotent part of A. Let A be an m×n matrix of rank r(> 1) over C. An n×m matrix G such that GAG = G, GA and AG are hermitian is called an A−`m outer inverse of A. We shall now identify the equivalence ? class Eq(A−`m |A) of A−`m . We also note that if A = Udiag 4 , 0 V is a singular value decomposition of A, where U and V are unitary matrices and 4 is a positive definite diagonal matrix, then G is an A−`m g-inverse of rank s(0 < s < r) of A if and only if G = Vdiag R , 0 U? , where R is a 4−`m . Moreover, G is an A−`m if and only if A is an G− `m .
390
Matrix Partial Orders, Shorted Operators and Applications
Theorem 14.4.19. Let A and G be as in the preceding para and H be an R S1 n×m matrix. Then H ∈ Eq(G|A) if and only if H = V U? S2 S2 4S1 for some matrices S1 and S2 such that C(S1 ) ⊆ C(R) and C(S1 t ) ⊆ C(Rt ). Proof is by straightforward verification. Remark 14.4.20. Let G be an A−`m . Then G is the unique A−`m such that G ∈ Eq(G|A). In Theorem 14.2.22, we have seen that an equivalence class Eq(G|A) under the equivalence relation “v” determines the matrix A. However, a similar statement does not hold for outer inverses as the the following example shows. Example 14.4.21. Let 1 0 0 1 0 0 1 0 0 A = 0 1 0 , B = 0 2 0 and G = 0 0 0 . 0 0 0 0 0 0 0 0 0 Then G is an outer inverse of both A and B and Eq(G|A) = Eq(G|B). However, A 6= B. 14.5
Diagrammatic representation of the g-inverses and outer inverses
In this section we provide a diagrammatic representation of the inverses (g-inverses and outer inverses) of a matrix depicting the hierarchy of these inverses under minus order. The equivalence relations developed in the previous three sections play an important role in obtaining a neat diagrammatic representation of these generalized inverses. In fact, we show below the hierarchical representation of generalized inverses belonging to any one equivalence class under the equivalence relation “v” yields the hierarchical representation of generalized inverses belonging to all other equivalence classes under the same equivalence relation. In the process of developing the diagram, we have been able to demonstrated the power and importance of some of the other results obtained in the earlier sections. We illustrate the diagram using a 4 × 4 matrix over GF(2). We start with a series of theorems to justify the statement: ‘the hierarchical representation of generalized inverses belonging to any one equiva-
Equivalence Relations on Generalized and Outer Inverses
391
lence class’ under the equivalence relation “v” yields the hierarchical representation of generalized inverses belonging to all other equivalence classes under the same equivalence relation’. Notice that Theorem 14.2.6 establishes a one-one correspondence of g-inverses in the distinct equivalence classes. More precisely, let A = Pdiag Ir , 0 Q be an m × n matrix of Ir Li rank r, where P and Q are non-singular. Let Gi = Q−1 P−1 , Mi Mi Li i = 1, 2 be two reflexive g-inverses of A. If G ∈ Eq(G1 |A), then Ir L1 G = Q−1 P−1 for some matrix S. The matrix M1 M1 L1 + S Ir L2 −1 H = Q P−1 is a g-inverse in Eq(G2 |A). Similarly, M2 M2 L2 + S Ir L2 −1 if we take an H ∈ Eq(G2 |A), where H = Q P−1 M2 M2 L2 + T Ir L1 for some matrix T, then we see G = Q−1 P−1 ∈ M1 M1 L1 + T Eq(G1 |A). This gives a one-one correspondence between the two equivalence classes Eq(G1 |A) and Eq(G2 |A). We begin with the following: Theorem 14.5.1. Let A = Pdiag Ir , 0 Q be an m × n matrix of rank Ir Li −1 r, where P and Q are non-singular and let Gi = Q P−1 , Mi Mi Li i = 1, 2 be two reflexive g-inverses of A. Further, for i = 1, 2, j = 1, 2, I L r i let Gij = Q−1 P−1 . (Notice that Gij ∈ Eq(Gi |A) for Mi Mi Li + Sj j = 1, 2.) Then the following are equivalent: (i) G11 <− G12 (ii) S1 <− S2 and (iii) G21 <− G22 . We show (i) ⇔ (ii). Now, Ir L1 Ir L1 <− G11 <− G12 ⇔ M1 M1 L1 + S1 M1 M1 L1 + S2
Proof.
⇔ r + ρ(S1 ) + ρ(S2 − S1 ) = r + ρ(S2 ) ⇔ S1 <− S2 . Proof of (ii) ⇔ (iii) is similar.
Thus, the one-one correspondence carries over to the hierarchical relationship also.
392
Matrix Partial Orders, Shorted Operators and Applications
Theorem 14.5.2. In the setup of Theorem 14.5.1, Ir L2 Ir L1 Q−1 P−1 <− Q−1 P−1 M2 M2 L2 + S M1 M1 L1 + T Ir L1 Ir L2 −1 −1 − −1 ⇔Q P < Q P−1 . M1 M1 L1 + S M2 M2 L2 + T Proof.
Clearly, show it is enough to Ir L2 Ir L1 − < M M2 L2 + S M M1 L1 + T 2 1 Ir L1 Ir L2 ⇔ <− . M1 M1 L1 + S M2 M2 L2 + T We can rewrite Ir L1 Ir 0 Ir 0 Ir L1 Ir 0 = =X Y, M1 M1 L1 + S M1 I 0 S 0 I 0 S Ir 0 I L where X = and Y = r 1 are non-singular matrices. Now, M1 I 0 I Ir 0 Ir L2 − X Y< 0 S M2 M2 L2 + T Ir 0 Ir L2 − −1 ⇔ < X Y−1 . 0 S M2 M2 L2 + T However, Ir L2 Ir L2 − L1 X−1 Y−1 = . M2 M2 L2 + T M2 − M1 (L2 − L1 )(M2 − M1 ) + T Write L = L2 − L1 andM =M2 −M1 . So, Ir 0 Ir L − < 0 S M ML + T 0 L ⇔ r + ρ(S) + ρ = r + ρ(T) M ML + T − S 0 L ⇔ ρ(S) + ρ = ρ(T) M ML + T − S 0 L2 − L1 ⇔ρ = ρ(T) − ρ(S) M − M1 (M2 − M1 )(L2 − L1 ) + T − S 2 0 L1 − L2 ⇔ρ = ρ(T) − ρ(S) M − M (M − M2 )(L1 − L2 ) + T − S 1 2 1 Ir 0 0 L1 − L2 − ⇔ < 0 S M1 − M2 (M1 − M2 )(L1 − L2 ) + T − S Ir L1 Ir L2 ⇔ Q−1 P−1 <− Q−1 P−1 . M1 M1 L1 + S M2 M2 L2 + T
Equivalence Relations on Generalized and Outer Inverses
393
Corollary 14.5.3. In the setup of Theorem 14.5.2, Ir L1 Ir L2 Q−1 P−1 <− Q−1 P−1 M1 M1 L1 + S M2 M2 L2 + T if and only if (i) (ii) (iii) (iv)
S <− T C(M1 − M2 ) ⊆ C(T − S) t t C(L1 − L2 ) ⊆ C(T − S) and − (L1 − L2 )(T − S) (M1 − M2 ) = 0.
Proof. From the proof of Theorem 14.5.2 I L Ir L2 r 1 −1 −1 − −1 Q P < Q P−1 M1 M1 L1 + S M2 M2 L2 + T 0 L2 − L1 ⇔ρ = ρ(T) − ρ(S). M2 − M1 (M2 − M1 )(L2 − L1 ) + T − S However, 0 L2 − L1 ρ ≥ ρ(T − S). M2 − M1 (M2 − M1 )(L2 − L1 ) + T − S So, ρ(T) − ρ(S) ≥ ρ(T − S). Further, equality sign holds ⇔ ρ(T) − ρ(S) = ρ(T − S) 0 L1 − L2 ⇔ρ = ρ(T − S). M1 − M2 (M1 − M2 )(L1 − L2 ) + T − S Now the result follows. Let A = Pdiag Ir , 0 Q be an m×n matrix of rank r, where P and Q are non-singular. In view of Theorem 14.4.10 and Remark 14.4.2, each ∼ equivalence class under the relation “ =” is determined by some outer L 0 inverse XL = Q−1 P−1 of A, where L is idempotent. Distinct 0 0 idempotent matrices L of order r × r lead to distinct equivalence classes. In fact, L T1 Eq(XL |A) = Q−1 P−1 , T2 T2 T1 where T1 and T2 are arbitrary subject to LT1 = T1 and T2 L = T2 . We now show that every reflexive g-inverse of A is above a unique outer inverse in each equivalence class under the relation “ ∼ =”. Theorem 14.5.4. Let A = Pdiag Ir , 0 Q be an m × n matrix of rank r Ir N −1 and G = Q P−1 be a reflexive g-inverse of A, where P and M MN
394
Matrix Partial Orders, Shorted Operators and Applications
L 0 P−1 , where L is an idempotent 0 0 matrix of order r×r. Then there is a unique outer inverse of X ∈ Eq(XL |A) L LN such that X <− G and is given by X = Q−1 P−1 . ML MLN L S1 −1 Proof. Notice that X ∈ Eq(XL |A) ⇔ Q P−1 , where S2 S2 S1 LS1 = S1 and S2 L = S2 . Now, L S1 Ir N X <− G ⇔ <− S2 S2 S1 M MN Ir 0 Ir N − S1 − ⇔ < 0 0 M − S2 (M − S2 )(N − S1 ) However, Ir N − S1 ρ = r. M − S2 (M − S2 )(N − S1 ) So, Ir 0 Ir N − S1 − < 0 0 M − S2 (M − S2 )(N − S1 ) Ir − L N − S1 ⇔ρ M − S2 (M − S2 )(N − S1 ) = r − ρ(L) = ρ(Ir − L) ⇔ C(N − S1 ) ⊆ C(Ir − L) and C((M − S2 )t ) ⊆ C((Ir − L)t ). Q are non-singular. Let XL = Q−1
Let C(N − S1 ) ⊆ C(Ir − L) and C((M − S2 )t ) ⊆ C((Ir − L)t ). Then N − S1 = (Ir − L)U and M − S2 = V(Ir − L) for some matrices U and V. So, L(N − S1 ) = 0 and (M − S2 )L = 0 or LN = LS1 = S1 and ML − S2 L = S2 . Hence, S1 = LN and S2 = ML. If S1 = LN and S2 = ML, then N − S1 = N − LN = (I − L)N and M − S2 = M(I − L). Thus, C(N − S1 ) ⊆ C(Ir − L) and C((M − S2 )t ) ⊆ C((Ir − L)t ) ⇔ S1 = LN and S2 = ML. Corollary 14.5.5. Let A = Pdiag Ir , 0 Q be an m × n matrix of rank Li 0 r, where P and Q are non-singular. Let XLi = Q−1 P−1 and 0 0 Li Ni P−1 ∈ Eq(XLi |A) where i = 1, 2, and each Gi = Q−1 Mi Mi Ni Li is idempotent. Then G1 <− G2 if and only if L1 L2 = L2 L1 = L1 , equivalently L1 <− L2 , N1 = L1 N2 and M1 = M2 L1 .
Equivalence Relations on Generalized and Outer Inverses
395
We are now ready to describe the scheme of the diagram. Let A = Pdiag Ir , 0 Q be a given matrix, where P and Q are non-singular. By Theorem 14.2.6, the equivalence class of G0 = Q−1 diag Ir , 0 P−1 is given by Eq(G0 |A) = {Q−1 diag Ir , S P−1 : S is arbitrary}. In the diagram, every g-inverse in an equivalence class is denoted by a node and we identify a node with the g-inverse it represents, thus making no distinction between the two. Two nodes differing by rank 1 are connected by a line if the node of lower rank is below the node with higher rank under the minus order. Let Sj = {Q−1 diag Ir , S P−1 }, where the matrix S is an arbitrary matrix with ρ(S) = j. Then G0 is connectedto every g-inverse in Sj , by Theorem 14.2.4. A g-inverse Q−1 diag Ir , S P−1 ∈ Sj is connected to a g-inverse Q−1 diag Ir , T P−1 ∈ Sj+1 ⇔ S <− T, by Theorem 14.5.1. This completes the diagram for the g-inverses in Eq(G0 |A). In view of Theorems14.2.4 and 14.5.1, the diagram for g-inverses of Eq(G|A) Ir N where G = Q−1 P−1 is any other reflexive g-inverse is just a M MN replica of the diagram for the g-inverses where thecorrespondence is given −1 Ir N −1 −1 by Q diag Ir , S P ←→ Q P−1 . M MN + S Let {Lα , α ∈ Ω} be the class of all idempotent matrices of order r × r. For each α ∈ Ω, write XLα = Q−1 diag Lα , 0 P−1 . Then the equivalence classes under the relation “ ∼ =” are given by {Eq(XLα |A)}α∈Ω , where N −1 Lα −1 Eq(XLα |A) = Q P : Lα N = N, MLα = M . Consider M MN two idempotent matrices L1 and L2 of order r × r and of ranks s and s + 1 respectively, 0 ≤ s ≤ r − 1. Consider L2 N2 −1 X2 = Q P−1 ∈ Eq(Q−1 diag L2 , 0 P−1 |A), M2 M2 N2 where M2 and N2 satisfy L2 N2 = N2 and M2 L2 = M2 . Then by Corollary 14.5.5, there exists an outer inverse in the equivalence class −1 −1 Eq(Q diag L1 , 0 P |A) if and only if L1 <− L2 . In such a case the unique outer inverse X1 ∈ Eq(Q−1 diag L1 , 0 P−1 |A) such that L1 L1 N2 X1 <− X2 is given by X1 = Q−1 P−1 . M2 L1 M2 L1 N2 From the preceding discussion, it is clear that if we have the complete diagram with respect to one equivalence class, then we can construct the diagram for the complete class of g-inverses. In the case of outer inverses same can be achieved by using the Corollary 14.5.5 and the above discussion. Thus, a complete diagram can be constructed.
396
Matrix Partial Orders, Shorted Operators and Applications
B 0 Example 14.5.6. For this example, let F = GF(2). Let A = be a 0 0 1 1 I 0 4 × 4 matrix, where B = . It is easy to see that that A = P 2 0 1 0 0 B 0 is a normal form of A, where P = . Notice that B−1 = B and 0 I I2 N therefore each g-inverse of A is of the form P, where N, M, K M K are 2 × 2 matrices. Thus, there are 4096= 212 g-inverses of A. Since every I2 N reflexive g-inverse of A is of the form P, there are 256 = 28 M MN reflexive g-inverse of A. So, there are there are 16 = 24 matrices of order 2 × 2 that play important role in enumerating all the g-inverses of A in a particular equivalence class. We first list out these 16 matrices: 0 0 Rank 0: S1 = 0 0 1 0 0 1 1 1 0 0 Rank 1: S2 = , S3 = , S4 = , S5 = , 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 1 1 S6 = , S7 = , S8 = , S9 = , S10 = 0 1 1 1 1 0 0 1 1 1 1 0 1 0 1 1 Rank 2: S11 = , S12 = , S13 = , 0 1 1 1 0 1 0 0 0 1 1 1 S14 = , S15 = , S16 = . 0 0 1 1 1 0 Let G1 = diag I2 , 0 P = diag I2 , 0 diag B , I = A. Consider the equivalence class Eq(G1 |A). Notice that A is of index 1 and G1 is the group inverse of A. The g-inverses in Eq(G1 |A) are given by I2 N Gi = G1 + diag 0 , Si P, i = 1, . . . , 16. Moreover, if H1 = P M MN is another reflexive g-inverse ofA, then Eq(H1 |A) = {Hi : i = 1, . . . , 16.}, where Hi = H1 + diag 0 , Si P. Notice the one-one correspondence between the g-inverses of Eq(G1 |A) and those in Eq(H1 |A). Henceforth we concentrate on the equivalence class Eq(G1 |A). Note that ρ(G1 ) = 2, ρ(Gi ) = 3 for i = 2, . . . , 10 and ρ(Gi ) = 4 for i = 11, . . . , 16. In view of Theorem 14.2.5, G1 <− Gi , i = 2, . . . , 10. In fact G1 <− Gi , i = 2, . . . , 16. In order to determine the g-inverses among G11 to G16 which are above G2 under he minus order, we need to determine which matrices amongst S11 to S16 lie above S2 . It is easy to see that S2 <− S1i , i = 1, 2, 3, 5. By
Equivalence Relations on Generalized and Outer Inverses
397
Theorem 14.5.3, it follows that G2 <− G1i , i = 1, 2, 3, 5. Similar checking with respect to S3 to S10 yields the following table, which enumerates for each of G2 to G10 , the g-inverses of rank 4 belonging to Eq(G1 |A) that are above g-inverses of rank 3. For example, in Table 14.1, we notice that G8 is below G11 , G12 , G14 and G16 under the minus order. Consider Eq(H1 |A). In view of Theorem 14.5.1, if we replace G by H in the table, we get for each of g-inverse of rank 3, the g-inverse of rank 4 which are above H1 under the minus order. Thus for example, H5 is below H13 , H14 , H15 and H16 . Table 14.1 Rank 3 g-inverses below the rank 4 g-inverses of Eq(G1 |A) g-inverse of rank 3 g-inverse of rank 4 g-inverse of rank 3 g-inverse of rank 4 G2 G11 , G12 , G13 , G15 G3 G12 , G14 , G15 , G16 G4 G11 , G13 , G14 , G16 G5 G13 , G14 , G15 , G16 G6 G11 , G12 , G13 , G16 G7 G11 , G12 , G14 , G15 G8 G11 , G12 , G14 , G16 G9 G11 , G13 , G14 , G15 G10 G12 , G13 , G15 , G16
I2 N 0 0 P, where N = . We 0 0 0 1 shall now examine if any of the G’s is below any of the H’s. We first check I 0 that whether G1 is below any of H2 to H10 . Recall that G1 = 2 P 0 0 I N and Hi = 2 P, i = 2, . . . , 10. 0 Si Let us now do the same for H1 =
Table 14.2 g-inverses in Eq(G1 |A) below g-inverses in Eq(H1 |A) g-inverse of rank 3 g-inverse of rank 4 G2 H11 , H13 G4 H11 , H13 G5 H14 , H15 G7 H14 , H15 G8 H12 , H16 G10 H12 , H16
398
Matrix Partial Orders, Shorted Operators and Applications
By Corollary 14.5.3, G1 <− Hi ⇔ C(Nt ) ⊆ C(Si ). (0(Si − 0)− N = 0, 0 <− Si , and C(0) ⊆ C(Si − 0) hold automatically.) Hence, G1 <− H3 , G1 <− H6 and G1 <− H9 . Again by using Corollary 14.5.3, we arrive at the following table which enumerates for each of G2 to G10 , the matrices H11 to H16 which are above it under the minus order. Thus, G3 , G6 and G9 are not below any of H11 to H16 . From Theorem 14.5.2, it is clear that the g-inverses of rank 3 in the equivalence class Eq(H1 |A) which are below the inverses from among G11 − H16 in Eq(G1 |A) are obtained by interchanging G and H in the Table 14.2. We shall now turn our attention to the equivalence classes of the outer inverses of A. As mentioned previously, these equivalence classes are linked to idempotent matrices of order 2 × 2. We now enumerate them according to their rank. Rank 2: L1 = I2 1 0 1 1 0 0 1 0 Rank 1: L2 = , L3 = , L4 = , L5 = , 0 0 0 0 1 1 1 0 0 1 0 0 L6 = , L7 = 0 1 0 1 0 0 Rank 0: L8 = . 0 0 Thus, there are 8 equivalence classes of the outer of A under the inverses L 0 1 P|A) contains all relation “ ∼ =”. The equivalence class O1 = Eq( 0 0 the reflexive g-inverses of A. There are six equivalence classes of the outer Li 0 inverses of rank 1, namely, Oi = Eq( P|A), i = 2, . . . 7. Finally 0 0 there is one equivalence class corresponding to L8 = 0 and is Eq(0|A) = 0. Using Theorem 14.2.12, it can be shown that there are exactly 16 matrices in each of Oi i = 2, . . . 7. Using the same one can enumerate the 16 matrices in O1 as: 1 0 1. 0 0 1 0 3. 0 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 1 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 P = 0 0 0 0 , 2. 1 0 0 0 P = 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 1 0 P = 0 0 0 0 , 4. 0 0 0 0 P = 0 1 0 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0 1
1 0 1 0 1 0 1 1
0 0 0 0 0 0 0 0
0 0 , 0 0 0 0 , 0 0
Equivalence Relations on Generalized and Outer Inverses
1 1 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 5. 0 0 0 0 P = 0 0 0 0 , 6. 1 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 7. 0 0 0 0 P = 0 0 0 0 , 8. 1 1 0 1 0 1 1 1 0 1 1 1 1 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 9. 0 0 0 0 P = 0 0 0 0 , 10. 1 0 0 0 0 0 0 0 0 0 1 0 0 1 1 1 0 1 1 0 0 0 0 0 0 0 0 0 11. 0 0 0 0 P = 0 0 0 0 , 12. 1 1 0 0 1 1 0 0 1 1 1 1 1 1 1 1 0 1 1 0 0 0 0 0 0 0 0 0 13. 0 0 0 0 P = 0 0 0 0 , 14. 1 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 15. 0 0 0 0 P = 0 0 0 0 , 16. 1 1 0 1 1 1 1 1 1 1
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 1
399
1 1 1 0 0 0 P = 0 0 0 0 , 1 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 P = 0 0 0 0 , 1 1 1 0 0 0 1 1 1 0 0 P = 1 1 0 0 1 1 0 0 P = 1 1 1 1 1 1 0 0 P = 1 1 0 0 1 1 0 0 P = 1 1 1 1
1 1 0 1 0 1 0 1 1 1 0 1 0 1 0 1 1
1 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 1
0 1 0 , 1 0 1 0 , 1 1 1 0 , 1 0 1 0 . 1 1
It is easy to see by Theorem 14.5.4, that the outer inverses of rank 1 below A in each of the equivalence classes Oi i = 2, . . . 7 are given by 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 in O2 , 0 0 0 0 in O3 , 0 0 1 0 0 0 0 0 0
0 0 0 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0
0 0 0 in O4 , 0 0 0 0 in O6 , 0 0
0 1 1 0 0 0 0 0 0
0 1 1 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0
0 0 0 in O5 , 0 0 0 0 in O7 . 0 0
400
Matrix Partial Orders, Shorted Operators and Applications
These details are diagrammatically represented in the following figure:
G 11
G 12
G 13
G 14
G 15
G 16
Rank 4
Rank 3
G2
G3
G4
G5
G6
G7
G8
G9
G 10
Rank 2 G1 Rank 1 1 5 9 13
2 6 10 14
O1
3 7 11 15
4 8 12 16
1 5 9 13
2 6 10 14
3 7 11 15
4 8 12 16
O2
1 5 9 13
2 6 10 14
3 7 11 15
4 8 12 16
1 5 9 13
O3
2 6 10 14
3 7 11 15
O4
4 8 12 16
1 5 9 13
2 6 10 14
3 7 11 15
O5
Rank 0 O7
Fig. 14.1
An example of a network.
4 8 12 16
1 5 9 13
2 6 10 14
3 7 11 15
O6
4 8 12 16
Equivalence Relations on Generalized and Outer Inverses
14.6
401
The Ladder
In this section we construct a ladder of matrices and their reflexive ginverses (on either side of each step) which are in hierarchy above and below a given matrix A and a reflexive g-inverse of A using Theorems 14.2.12 and 14.2.11. We illustrate the ladder construction using the same 4 × 4 matrix of the previous section. From the Figure 14.1, we notice that there is a path connecting the outer inverses and the g-inverses of the matrix A. One such path is O8 → O11 → G1 → G2 → G11 . Equivalently, O8 <− O11 <− G1 <− G2 <− G11 . Notice that O8 is the outer inverse of A with rank 0 and G11 is a g-inverse of of A with rank 4. From the same figure, it is also clear that such a path beginning from O8 is not unique, since O8 → O21 → G1 → G5 → G16 is yet another path. In fact, as you see in the figure, there are several paths that connect O8 to G1 and end in a g-inverse of A with rank 4. Write A−2 = O8 = 0, A−1 = O11 , A0 = A, A1 = G2 and A2 = G11 . Then A−2 → A−1 → A0 → A1 → A2 is also a path in the sense that A−2 <− A−1 <− A0 <− A1 <− A2 . Further, A−2 , A−1 , A0 , A1 , and A2 are the reflexive g-inverses of O8 , O11 , G1 , G2 and G11 respectively. We now consider the ordered pairs (A−2 , O8 ), (A−1 , O11 ), (A0 , G1 ), (A1 , G2 ), (A2 , G11 ). These pairs have the following properties: (i) Within each pair, the coordinate matrices are reflexive g-inverses of each other. (ii) Rank of the coordinate matrices of each successive pair beginning with (A−2 , O8 ) is one more than the rank of the coordinate matrices of its predecessor. (iii) Consider any two consecutive pairs. The first(the second) coordinate matrix of the former pair is below the first (respectively the second) coordinate matrix of the later pair under the minus order. Thus, we can think of the these various pairs forming a ladder in which each pair resembles a step of the ladder. We formalize all this discussion into a definition of the ladder as follows: Let A be an m×n matrix of rank r. Let α = min{m, n}. Write A0 = A. Let G0 be a reflexive g-inverse of A. Let Gj , j = −r, . . . , −1, outer inverses of A, Gj , j = 1, . . . , r, the g-inverses of A and Aj , j = −r, . . . , α − r be matrices satisfy the following conditions: (i) ρ(Gj ) = r + j, j = −r, . . . , α − r,
402
Matrix Partial Orders, Shorted Operators and Applications
(ii) Gj <− Gj+1 , j = −r, . . . , α − r − 1; (iii) Aj is a reflexive g-inverse of Gj , j = −r, . . . , α − r (iv) Aj <− Aj+1 , j = −r, . . . , α − r − 1. Aα−r
Gα−r
A1
G1
A0 A−r A−r
and
G0 G−r−1
1
G−r Fig. 14.2 Ladder.
We denote the step (A0 , G0 ) as the ground level, (A−r , G−r ) = (0, 0) as the basement and (Aα−r , Gα−r ) as the first floor. Thus, the ladder connects the basement to the first floor via a landing in the ground level. A ladder has the following features: (i) Each step is characterized by a matrix and a reflexive g-inverse of the matrix. (ii) Consider two consecutive steps (Aj , Gj ) and (Aj+1 , Gj+1 ). Then ρ(Aj+1 ) = ρ(Aj ) + 1, ρ(Gj+1 ) = ρ(Gj ) + 1, Aj <− Aj+1 , and Gj <− Gj+1 . Thus, it makes sense to call the step (Aj+1 , Gj+1 ) as the step just above the step (Aj , Gj ). The ladder defined above is very flexible in the sense, that the landing (the ground level) can be adjusted anywhere between the levels −r to α − r. For example 14.5.6, we have already constructed a ladder. Does a ladder exist for every matrix A and a reflexive g-inverse G? What is the class of all such ladders, in case one exists? We shall explore all this in the rest of the section. Let A be an m × n matrix of rank r. Let G0 be reflexive g-inverse of A. We can take A = Pdiag Ir , 0 Q and G0 = Q−1 diag Ir , 0 P−1 for some non-singular matrices P and Q. Write A0 = A, and define for each j = −r, . . . , α − r, α = min{m, n} Aj = Pdiag Ir+j , 0 Q, Gj = Q−1 diag Ir+j , 0 P−1 . Then clearly,
Equivalence Relations on Generalized and Outer Inverses
(i) (ii) (iii) (iv)
ρ(Gj ) = r + j, j = −r, . . . , α − r Gj <− Gj+1 , j = −r, . . . , α − r − 1 Aj is a reflexive g-inverse of Gj , j = −r, . . . , α − r Aj <− Aj+1 , j = −r, . . . , α − r − 1.
403
and
Thus, we have constructed a ladder (Aj , Gj ), j = −r, . . . , α − r. We shall now explore the possibilities of obtaining all possible ladders passing through (A, G0 ), where G0 is a reflexive g-inverse of A. The following theorems are useful towards this end: Theorem 14.6.1. Let A be an m × n matrix of rank r and G0 be reflexive g-inverse of A. Write A0 = A. Let G−i be an outer inverse of A with rank r − i, i = 1, . . . r such that G−i <− G−i+1 , i = 1, . . . , r. Then (i) There exists non-singular matrices P and Q such that A = Pdiag Ir , 0 Q and G−i = Q−1 diag Ir−i , 0 P−1 , i = 0, 1, . . . r. (ii) There exist matrices A−i , i = 1, . . . r such that (a) A−i is a reflexive g-inverse of G−i , i = 0, 1, . . . , r (b) A−i <− A−i+1 , i = 1, . . . , r. (iii) General form of A−i satisfying the condition in Ir−i mi t M 0 A−i = Q−1 P−1 , where M = · 0 0 · m1 t least one of ui and mi is null for j = 1, . . . , i.
(b) ui 0 · · 0
and
is given by · u1 · 0 · · , where at · · · 0
Proof is by induction on i. Let A be an m × n matrix of rank r. Given an outer inverse G−i of A with rank r − i, what is the class of all outer inverses G−i+1 of A such that G−i <− G−i+1 ? The answer is contained in the following: Theorem 14.6.2. Let A be an m×n matrix of rank r. Let G−i be anouter inverse of A. Let A = Rdiag Ir , 0 S and G−i = S−1 diag Ir−i , 0 R−1 Then the class of all outer inverses G−i+1 of A of rank r − i + 1 such that G−i <− G−i+1 is given by Ir−i 0 0 G−i+1 = S−1 0 1 ut R−1 , where u and m are arbitrary. 0 m mut Proof is analogous to Theorem 14.4.16.
404
Matrix Partial Orders, Shorted Operators and Applications
We now turn our attention to g-inverses of A. We first prove Theorem 14.6.3. Let A be an m × n matrix of rank r. Let Gi be a g-inverse of rank r + i of A. Write A = Pdiag Ir , 0 Q and Ir L1 Gi = Q−1 P−1 for some matrices L1 , M1 and for some M1 M1 L1 + S matrix S of rank i. Then the class of all g-inverses Gi+1 of rank r + i + 1 such that Gi <− Gi+1 is given by Ir L2 Gi+1 = Q−1 P−1 , M2 M2 L2 + S + uvt where u, v are arbitrary subject to ρ(S + uvt ) = i+1, L2 = L1 +zt uvt and M2 = M1 + uvt w with z, and w are arbitrary subject to either zt u = 0 or vt w = 0. Proof follows from Theorem 14.5.3. Let A be an m × n matrix of rank r and G0 be reflexive g-inverse of A. We shall now restrict our attention to g-inverses in the equivalence class Eq(G0 |A). Given g-inverses G1 , . . . , Gα−r in Eq(G1 |A) such that Gi <− Gi+1 , i = 0, 1, . . . , α − r = 1, we find the class of all matrices A1 , . . . , Aα−r such that (Ai , Gi ) is a step of a ladder. Theorem 14.6.4. Let A be an m×n matrix of rank r and α = min{m, n}. Let G0 be reflexive g-inverse of A. Let Gi , i = 0, 1, . . . , r be g-inverses of A satisfying the following conditions: (i) Gi ∈ Eq(G0 |A) for all i (ii) ρ(Gi ) = r + i for all i and (iii) Gi <− Gi+1 , i = 0, 1, . . . , α − r − 1. Then the following hold: (a) There exist non-singular matrices P and Q such that A = Pdiag Ir , 0 Q and Gi = Q−1 diag Ir+i , 0 P−1 , i = 0, 1, . . . , α − r. (b) There exist matrices Ai , i = 1, . . . , α − r such that (i) Ai is a reflexive g-inverse of Gi , i = 0, 1, . . . , α − r − 1, where A0 = A and (ii) Ai <− Ai+1 , i = 0, 1, . . . , α − r − 1. (c) General form of Ai is given by Theorem 14.2.4.
Equivalence Relations on Generalized and Outer Inverses
405
Proof is by induction on i. Let A be an m × n matrix of rank r. Theorems 14.6.1 and 14.6.2 completely characterize all possibilities for the part of the ladder from basement to ground level. Theorem 14.6.3 completely characterize all possibilities for the g-inverse part of each step in a ladder and Theorem 14.6.4 characterizes all possible steps of a ladder if we choose to restrict to an equivalence class. The characterizations of Ai ’s if Gi ’s cut across the equivalence classes appears to be cumbersome. It would be nice if a neat characterization of the same can be obtained.
Chapter 15
Applications
15.1
Introduction
In the preceding chapter, we established hierarchies in various classes of generalized inverses of a matrix using the matrix partial orders developed earlier in this monograph. In this chapter, we give some applications of matrix partial orders and shorted operators in statistics and electrical networks. In Section 15.2, we recall a few basic results on point estimation in linear models. In Section 15.3 we make a systematic comparison of a pair of linear models and interpret statistically the implication of the design matrix of one model being below the one of the other model under various orders. Section 15.4 starts with interpreting the BLUE and dispersion of BLUE in terms of a shorted operator. We then give applications of shorted operator to the recovery of inter-block information in incomplete block designs. In Section 15.5, we give applications of the parallel sum and the shorted operator to testing in a linear model. Not surprisingly, the shorted operator proves to be of superior use here. In Section 15.6, we consider modifications of shorting mechanisms in n-port reciprocal resistive networks endowed with a shorting mechanism and obtain the modified shorted operators. This may prove useful in sensitivity analysis of shorting mechanism when there is a scope of choice. We also consider adding one more port to the network and obtain the modified shorted operator.
15.2
Point estimation in a general linear model
In this section, we gather a few results on the point estimation of linear parametric functions in a general linear model that are needed in the following sections. For proofs of these results and an excellent exposition of linear 407
408
Matrix Partial Orders, Shorted Operators and Applications
models in general, we refer the reader to [Sengupta and Jammalamadaka (2003)]. Consider the linear model y = Xβ + ε, where (i) X is a known n × m matrix, known as the model matrix and β is an unknown non-stochastic vector of parameters in Rm , (ii) ε is a random vector (unobservable) in Rn such that its mean, E(ε) 6= 0 and its dispersion matrix, D(ε) 6= σ 2 V, where σ is an unspecified positive constant and V is a known n × n non-negative definite matrix. (Thus, E(y) = Xβ and D(y) = σ 2 V.) The vector y is referred to as the random vector of observations. Henceforth, we denote this linear model by the triple (y, Xβ, σ 2 V). Lemma 15.2.1. The observation vector y ∈ C(X : V) with probability 1. Proof. Consider the vector ζ = (I − PV )(y − Xβ) = (I − PV )ε. It is easy to see that E(ζ) = 0 and D(ζ) = 0. Hence, for each i, ζi , the ith component of ζ is zero with probability 1. Thus, y − Xβ ∈ C(V) with probability 1 or equivalently y = Xβ + Vu for some vector u. It follows that y ∈ C(X : V) with probability 1. Lemma 15.2.2. (Covariance Adjustment) Let u and v be random vectors with finite first and second moments and E(v) = 0. Then the vector v is uncorrelated with the vector u − Bv if and only if Bv = Cov(u, v)(D(v))− v with probability 1, where Cov(u, v) denotes the covariance matrix between vectors u and v. See Proposition 3.1.2 of [Sengupta and Jammalamadaka (2003)] for a proof. Definition 15.2.3. Let (y, Xβ, σ 2 V) be a linear model and A be a matrix with m columns. We say Aβ is estimable, if there exists a matrix C such that E(Cy) = Aβ for all β ∈ Rm . Theorem 15.2.4. Let (y, Xβ, σ 2 V) be a linear model and A be a matrix with m columns. Then Aβ is estimable if and only if C(At ) ⊆ C(Xt ). See Proposition 7.2.4 [Sengupta and Jammalamadaka (2003)]. Definition 15.2.5. Let (y, Xβ, σ 2 V) be a linear model. A linear function mt y is said to be a linear zero function if E(mt y) = 0 for all β ∈ Rm . Remark 15.2.6. The class of all linear zero functions in the linear model (y, Xβ, σ 2 V) is given by {mt y, m ∈ C(I − PX )}.
Applications
409
Definition 15.2.7. Let (y, Xβ, σ 2 V) be a linear model, A be a matrix with m columns and β ∈ Rm . Let Aβ be estimable. Then Ly is said to be BLUE of Aβ if (i) E(Ly) = Aβ for all β ∈ Rm and (ii) D(Ly)
Remark 15.2.10. In the setup of Theorem 15.2.9, if Aβ is estimable, then there exists a matrix L such that A = LX. The BLUE of Aβ is Lˆ y.
410
Matrix Partial Orders, Shorted Operators and Applications
Theorem 15.2.11. Let (y, Xβ, σ 2 V) be a linear model with C(X) ⊆ C(V). Then ˆ = X(Xt V− X)− Xt V− y and (i) y ˆ = Xβ (ii) D(ˆ y) = σ 2 X(Xt V− X)− Xt , where V− is an any g-inverses of V. Proof. First note that Xt V− X is invariant under choices of g-inverse of V, since C(X) ⊆ C(V). Clearly, ρ(X) = ρ(Xt V− X) for each V− , for we can choose a positive definite g-inverse V− of V. So, (Xt V− X)− Xt V− is a g-inverse of X. Thus, E(X(Xt V− X)− Xt V− y) = Xβ for all β. Further, Cov(X(Xt V− X)− Xt V− y, (I − PX )y) = X(Xt V− X)− Xt V− σ 2 V(I − PX ) = 0, since, Xt V− V = Xt . Therefore (i) follows. For (ii), D(ˆ y) = Cov(ˆ y, y ˆ) = σ 2 X(Xt V− X)− Xt V− V(X(Xt V− X)− Xt V− )t = σ 2 X(Xt V− X)− Xt (X(Xt V− X)− Xt V− )t = σ 2 X(Xt V− X)− Xt V− (X(Xt V− X)− Xt )t = σ 2 X(Xt V− X)− Xt V− X(Xt V− X)− Xt = σ 2 X(Xt V− X)− Xt and hence the result. Remark 15.2.12. If V = I, then y ˆ = PX y. Remark 15.2.13. For the linear model (y, Xβ, σ 2 V), `t y is the BLUE of its expectation if and only if V` ∈ C(X). Further, if V = I, then `t y is the BLUE of its expectation if and only if ` ∈ C(X). We now consider linear constraints Aβ = 0 on the parametric vector β, where Aβ is estimable. We have Theorem 15.2.14. Let (y, Xβ, σ 2 V) be a linear model where β is subject to the constraint Aβ = 0. Let Aβ be estimable. Then the constrained model is equivalent to the unconstrained model (y, X(I − A− A)θ, σ 2 V). Further, ρ(X) = ρ(X(I − A− A)) + ρ(A). Proof. We first note that Aβ = 0 ⇔ β = (I − A− A)θ, for some θ. Thus, the linear model (y, Xβ, σ 2 V) subject to the constraint Aβ = 0 is equivalent to the linear model (y, X(I − A− A)θ, σ 2 V). To prove the second statement, we write X = XA− A + X(I − A− A). Since Aβ is estimable, by Theorem 15.2.4, we have C(At ) ⊆ C(Xt ). We now show that C(XA− A) ∩ C(X(I − A− A)) = {0}. Let u and v be vectors such that XA− Au = (X(I − A− A))v. Pre-multiplying by
Applications
411
AX− , we have AX− XA− Au = AX− (X(I − A− A))v or equivalently, Au = 0, since AX− X = A. Thus, XA− Au = 0 = (X(I − A− A))v. Similarly, we can show that C(XA− A)t ∩ C(X(I − A− A))t = {0}. Thus, ρ(X) = ρ(XA− A) + ρ(X(I − A− A)). However, ρ(A) = ρ(AX− XA− A) ≤ ρ(XA− A) ≤ ρ(A) ⇒ ρ(A) = ρ(XA− A). Hence, ρ(X) = ρ(X(I − A− A)) + ρ(A).
Consider the linear model M =(y, Xβ, σ 2 V). Let X and β be parβ1 titioned conformably as X = (X1 : X2 ) and β = respectively β2 such that Xβ = X1 β1 + X2 β2 . Suppose we are interested in estimable linear functions of β1 only. Let M ∗ denote the reduced 2 model ((I − PX2 )y, (I − PX2 )X1 β1 , σ (I − PX2 )V(I − PX2 )). The following theorem describes the relationship of the linear models M ∗ and M: Theorem 15.2.15. Let the linear models M and M ∗ be defined as above. Then (i) The sets of linear zero functions of M and M ∗ coincide. (ii) pt β1 is estimable in M ∗ if and only if it is estimable in M. (iii) Let pt β1 be estimable under M ∗ . The the BLUE of pt β1 under M and under M ∗ coincide. (iv) The dispersion matrices of BLUE’s of (I − PX2 )X1 β1 under M and under M ∗ coincide. (v) The residual sums of squares under M and under M ∗ are identical. See Proposition 7.10.1 [Sengupta and Jammalamadaka (2003)].
15.3
Comparison of models when model matrices are related under matrix partial orders
In this section, we consider the linear models when their model matrices are related to each other under different order relations on matrices such as: the space pre-order, the minus order, the left/right star order or star order and study the consequences with respect to the estimability and finding BLUE’s.
412
Matrix Partial Orders, Shorted Operators and Applications
We choose and fix two arbitrary linear models M1 =(y, X1 β, σ 2 V) and M2 =(y, X2 β, σ 2 V) and refer to the model matrices of M1 and M2 for proving various theorems of this section. We start with the space pre-order. Theorem 15.3.1. Let M1 and M2 be linear models for which C(X1 ) ⊆ C(V). Then X1 <s X2 if and only if (i) every estimable linear parametric function under M1 is also estimable under M2 and (ii) for each ` such that `t y is the BLUE of its expectation under M1 is also the BLUE of its expectation under M2 . Proof. We show that (i) is equivalent to C(Xt1 ) ⊆ C(Xt2 ) and (ii) is equivalent to C(X1 ) ⊆ C(X2 ). Thus, X1 <s X2 if and only if (i) and (ii) hold. Let (i) hold and p ∈ C(Xt1 ). Then pt β is estimable under M1 . Since (i) holds pt β is estimable under M2 , so, p ∈ C(Xt2 ). Thus, C(Xt1 ) ⊆ C(Xt2 ). Now, let C(Xt1 ) ⊆ C(Xt2 ) and pt β be estimable under M1 . By Theorem 15.2.4, p ∈ C(Xt1 ) and so, p ∈ C(Xt2 ). Thus, pt β is estimable under M2 . Let (ii) hold and m ∈ C(X1 ). Since C(X1 ) ⊆ C(V), we have m ∈ C(V). So, m = Va for some vector a. By Remark 15.2.13, mt y is the BLUE of its expectation under M1 and by (ii) mt y is the BLUE of its expectation under M2 . Thus, m = Va ∈ C(X2 ) and so, C(X1 ) ⊆ C(X2 ). Let C(X1 ) ⊆ C(X2 ) and pt y be the BLUE of its expectation under M1 . Then Vp ∈ C(X1 ). Since C(X1 ) ⊆ C(X2 ), we have Vp ∈ C(X2 ). Therefore, by Remark 15.2.13, pt y is the BLUE of its expectation under M2 . Corollary 15.3.2. Let M1 and M2 be linear models and let V be positive definite. Then X1 <s X2 if and only if (i) and (ii) of Theorem 15.3.1 hold. The condition C(X1 ) ⊆ C(V) can not be dispensed with from Theorem 15.3.1. We give the following example: Example Let 15.3.3. 1 1 1 0 0 1 1 1 1 1 1 1 1 , X2 = 0 0 and V = 0 0 0 . X1 = 0 1 0 0 1 0 0 1 0 0 0 1 0 0 0 Then for each m, Vm ∈ C(X1 ) and Vm ∈ C(X2 ) and C(X1 ) * C(X2 ). Remark 15.3.4. Consider the set up of Theorem 15.3.1. Let C(X1 ) ⊆ C(V) and X1 <s X2 . It is clear that the BLUE of X1 β under M1 continues
Applications
413
to be its BLUE under M2 . However, the dispersion matrices of the BLUE of X1 β under M1 and M2 are not comparable. Before we prove a theorem similar to Theorem 15.3.1 for the minus order, we show that if C(X1 ) ⊆ C(V) and X1 <− X2 , then the dispersion matrix of the BLUE of X1 β under the linear model M1 is below the the dispersion matrix of the BLUE of X1 β under the linear model M2 in the L¨ owner order. Theorem 15.3.5. Let M1 and M2 be linear models for which C(X1 ) ⊆ C(V) and X1 <− X2 . Then the following hold: (i) The linear model M1 is equivalent to M3 = (y, L1 θ1 , σ 2 V) and the linear model M2 is equivalent to M4 = (y, L1 θ1 + L2 θ2 , σ 2 V), where (L1 : L2 ) is a full column rank matrix such that ρ(L1 ) = ρ(X1 ) and ρ(L1 : L2 ) = ρ(X2 ) (ii) θ1 is estimable under both M3 and M4 . (iii) The dispersion matrix of the BLUE of θ1 under M3 is below the dispersion matrix of the BLUE of θ1 under M4 under the L¨ owner order. (iv) There is no estimable linear function of β common to the linear models M1 and M5 =(y, (X2 − X1 )β, σ 2 V). − (i) Since X1 < X2 there exist matrices L1 , L2 , N1 and N2 N1 such that (L1 , N1 ) and (L1 : L2 ), are rank factorizations of X1 N2 and X2 respectively. Write θ1 = N1 β and θ2 = N2 β. Clearly (i) holds. (ii) and (iv) are easy to prove. (iii) Since C(X1 ) ⊆ C(V), we have C((I−PL2 )L1 ) ⊆ C((I−PL2 )V(I−PL2 )). Also, C(L1 ) = C(X1 ) ⊆ C(V), so, Lt1 V− L1 is invariant under choices of g-inverses of V. Further, ρ(Lt1 V− L1 ) = ρ(L1 ). Hence Lt1 V− L1 is nonsingular. Now,
Proof.
ρ((I − PL2 )L1 ) = ρ(L1 ) − d(C(L1 ) ∩ N (I − PL2 )) = ρ(L1 ) − d(C(L1 ) ∩ C(L2 )) = ρ(L1 ). Hence, Lt1 (I − PL2 )((I − PL2 )V− (I − PL2 ))− (I − PL2 )L1 is non-singular. b Let θˆ1 and θˆ1 be the BLUEs of θ1 under M3 and M4 . Using Theorems 15.2.10 and 15.2.14, we have D(θˆ1 ) = σ 2 (Lt1 V− L1 )−1 and b D(θˆ1 ) = σ 2 (Lt1 (I − PL2 )((I − PL2 )V− (I − PL2 ))− (I − PL2 )L1 )−1 . Now, t − L1 V L1 = Lt1 V− VV− L1 . Also, the matrix V − V(I − PL2 )((I − PL2 )V− (I − PL2 ))− (I − PL2 )V
414
Matrix Partial Orders, Shorted Operators and Applications
is nnd. Hence, Lt1 V− VV− L1 − Lt1 V− V(I − PL2 )((I − PL2 )V− (I − PL2 ))− (I − PL2 )VV− L1
b is nnd. So, D(θˆ1 )
We are now ready to prove the following: Theorem 15.3.6. Let M1 =(y, X1 θ, σ 2 V) and M2 =(y, X2 β, σ 2 V) be any two linear models. Then X1 <− X2 if and only if there exists a matrix A such that C(At ) ⊆ C(Xt2 ) and M1 is the model M2 constrained by Aβ = 0. Proof. ‘If’ part follows by Theorem 15.2.14. ‘Only if’ part Let A = X2 − X1 . Since X1 <− X2 , we have A <− X2 . So, there exists a g-inverse A− of A such that A− A = A− X2 and AA− = X2 A− . It is clear that C(A)t ⊆ C(Xt2 ) and therefore, Aβ is estimable. Further, X1 = X2 (I − A− A). Let M2 be constrained under Aβ = 0. Then β = (I − A− A)θ and therefore, M1 is the model M2 constrained by Aβ = 0. We now investigate the interpretations of the left star order in the linear models. Theorem 15.3.7. Let M1∗ =(y, X1 β, σ 2 I) and M2∗ =(y, X2 β, σ 2 I) be any two linear models. Then X1 ? < X2 if and only if (i) The linear models M1∗ and M =(y, (X2 − X1 )β, σ 2 I) have no common estimable linear functions of β. (ii) X1 β is estimable under the model M2∗ and (iii) The BLUE of X1 β under the model M1∗ is also its BLUE under M2∗ and the dispersion matrix of X1 β under the model M1∗ is same as under model M2∗ . Proof. ‘If’ part Let the rows of L1 and L2 form the basis of the row subspaces of X1 and X2 − X1 respectively. Since (i) holds, C(Lt1 ) ∩ C(Lt2 ) = {0}. Let X1 = T1 L1 and X2 − X1 = T2 L2 for some matrices T1 and T2 . Clearly, (T1 , L1 ) and (T2 , L2 ) are rank factorizations of X1 and X2 − X1 respectively. Write Li β = θi , for i = 1, 2. Then M1∗ =(y, T1 θ1 , σ 2 I) and M2∗ =(y, T1 θ1 + T2 θ2 , σ 2 I) respectively. Since X1 β is estimable
Applications
415
under both the models, therefore θ1 is estimable under both the modˆ els. Let θˆ1 and ˆ θ1 denote the BLUEs of θ1 under (y, T1 θ1 , σ 2 I) and (y, T1 θ1 + T2 θ2 , σ 2 I) respectively. Now, D(θˆ1 ) = σ 2 (Tt1 T1 )−1 and ˆ ˆ D(ˆ θ1 ) = σ 2 ((Tt1 T1 )−1 + Tt1 T2 (Tt2 T2 )−1 Tt2 T1 ). As D(θˆ1 ) = D(ˆ θ1 ), we have Tt1 T2 = 0. Hence, Xt1 (X2 − X1 ) = 0, implying X1 ? < X2 . ‘Only if’ part follows, since X1 ? < X2 ⇒ X1 = Udiag I 0 0 Q and X2 = Udiag I I 0 Q for some orthogonal matrix U and non-singular matrix Q.
Corollary 15.3.8. Let M1∗ =(y, X1 β, σ 2 I) and M2∗ =(y, X2 β, σ 2 I) be any two linear models such that X1 ? < X2 . Then the BLUE of every estimable linear function of the parameters under M1∗ is a linear zero function under M =(y, (X2 − X1 )β, σ 2 I) and vice versa. Our next theorem is an application of Corollary 8.2.10 on L¨owner order for estimation under linear models. Theorem 15.3.9. Let (y, Xβ, σ 2 V) be a linear model such that V is nnd. d as its BLUE. If Ly is an unbiased estimator of Let Aβ be estimable with Aβ d d ≤ det(D(Ly)). Aβ, then (i) tr(D(Aβ)) ≤ tr(D(Ly)) and (ii) det(D(Aβ)) d is its BLUE, Proof. Since Ly is an unbiased estimator of Aβ, and Aβ L d D(Aβ) < D(Ly). The result follows from Corollary 8.2.10. 15.4
Shorted operators - Applications
In this section we shall give some interpretations and applications of shorted operators in Statistics. We first give the interpretation of BLUE in a linear model. Theorem 15.4.1. Let (y, Xβ, σ 2 V) be a linear model in which both the matrices X and V can be rank deficient. Let y = Xβ + Vu (Lemma 15.2.1). Then the following hold: b = S(V|C(X))u + Xβ. (i) The BLUE of Xβ is y 2 (ii) D(b y) = σ S(V|C(X)). (iii) D(e) = σ 2 (V − V|C(X)). Proof follows from Theorem 15.2.9 and the definition of a shorted operator.
416
Matrix Partial Orders, Shorted Operators and Applications
In Corollary 10.3.6, we showed that if A and B are two nnd matrices of the same order n × n such that A
(Statistical). Let Y1 B11 B12 ∼ Nn 0, Y2 B21 B22
and Y1 Y2 ∼ Nn+s 0, B C , Ct I Z where s = ρ(B − A) and (C, Ct ) is a rank factorization of B − A. Then − D(Y1 |Y2 ) = B11 − B12 B− 22 B21 and D(Y1 |Y2 , Z) = A11 − A12 A22 A21 . Since D(Y1 |Y2 , Z)
(15.4.1)
Applications
417
Bx = b
(15.4.2)
and
be the normal equations for deriving intra-block and inter-block estimators respectively and (A + B)x = a + b
(15.4.3)
be the normal equations for deriving combined intra-inter-block estimators. Clearly, each of the equations (15.4.1)-(15.4.3) is consistent. Also, A, B and A + B are nnd matrices. The linear function pt x is unique for all solutions of (15.4.1) if and only if p ∈ C(At ). In the recovery of interblock information it is of interest to identify those linear functions pt x with p ∈ C(At ) for which the substitution of a solution of (15.4.1) or (15.4.3) leads to identical answers. We exhibit a solution to this problem whenever S(A|B) = C exists. Theorem 15.4.3. Let A and B be nnd matrices of the same order such that the shorted matrix S(A|B) = C exists. Let p ∈ C(A). Then pt (A + B)− (a + b) = pt (A)− a for all a ∈ C(A) and b ∈ C(B) if and only if p ∈ C(A − C). Proof. ‘If’ part First note that C(A − C) ⊆ C(A). Let x0 satisfy (15.4.1). Since ρ(A + B) = ρ(A − C) + ρ(B + C), (see proof of Theorem 11.2.11) and b − Bx0 ∈ C(B) = C(B + C), we have (A − C)(A + B)− (b − Bx0 ) = 0. Therefore, (A − C)(A + B)− (a + b) = (A − C)(A − C + C + B)− ((A + B)x0 + b − Bx0 ) = (A − C)(B + C + A − C)− (A + B)x0 = (A − C)(B + A)− (A + B)x0 , since C(A − C) ⊆ C(A + B) = (A − C)x0 = (A − C)A− a. Hence, pt (A + B)− (a + b) = pt (A)− a for all p ∈ C(A − C). ‘Only if’ part Now, pt (A + B)− (a + b) = pt A− a for all a ∈ C(A) and for all b ∈ C(B) ⇒ pt (A + B)− b = 0 for all b ∈ C(B) ⇒ pt (A + B)− B = 0.
418
Matrix Partial Orders, Shorted Operators and Applications
Since p ∈ C(A), so, p = Ax for some x. Now, pt (A + B)− B = 0 ⇒ xt P(A|B) = 0. However, as C(P(A|B)) = C(S(A|B)) = C(C) and A− ∈ {C− }, we have xt C = 0. This implies xt = θt (I − CA− ) for some θ. Hence p = Ax = A(I − A− C)θ = (A − C)θ for some θ.
15.5
Application of parallel sum and shorted operator to testing in linear models
Consider the linear model (y, Xβ, σ 2 I) and let y ∼ Nn (Xβ, σ 2 I). Consider the linear hypothesis H0 : Aβ = ξ, that is assumed to be consistent i.e., ξ ∈ C(A). As shown in the Proposition 5.3.6 of [Sengupta and Jammalamadaka (2003)], the testable part of the hypothesis is H01 : TAβ = Tξ, where T is an arbitrary matrix such that C(At Tt ) = C(At ) ∩ C(Xt ). Recall that C(P(Xt X|At A)) = C(At ) ∩ C(Xt ). Also, for the shorted operator S(Xt X|C(At )), we have C(S(Xt X|C(At ))) = C(At ) ∩ C(Xt ). Choose T = (P(Xt X|At A))A− or (S(Xt X|C(At )))A− , where A− is any g-inverse of A, then C(At Tt ) = C(At ) ∩ C(Xt ). If T = (P(Xt X|At A))A− , then TA = (P(Xt X|At A))A− A = P(Xt X|At A) and if T = (S(Xt X|C(At )))A− , then TA = S(Xt X|C(At )). Note that TA is symmetric in both the cases. ˆ = σ 2 TA(Xt X)− . Further, if TA = S(Xt X|C(At )), Either way D(TAβ) ˆ then D(TAβ)TA = σ 2 TA. Henceforth, we assume TA = S(Xt X|C(At )). (Xt X)− ˆ Further, if u = TAβˆ −g, where is a g-inverse of D(TAβ). Clearly, σ2 g = Th, then the variance ratio F-Test for testing the hypothesis H01 is (Xt X)− ut u, which follows χ2v , where v = ρ(TA) = tr((Xt X)− )TA. σ2
15.6
Shorted operator adjustment for modification of network or mechanism
Let us consider a reciprocal resistive electrical network with n ports with impedance matrix A. Let (i1 , i2 , . . . in ) be a permutation of (1, 2, . . . n). th Suppose that ith r+1 , . . . , in ports of the network are shorted. If S denotes the subspace spanned by ei1 , . . . , eir , where eij is the ith j column of the identity matrix. Then S(A|S) is the shorted operator corresponding to this shorting mechanism. We wish to study the shorted operator when the shorting mechanism undergoes some modifications.
Applications
419
In this section we consider the following modifications to the shorting mechanism: (a) In addition to the ports shorted earlier, one more port is shorted. (b) Amongst the ports shorted earlier, we choose one and undo the shorting operation on this port. We also consider a modification of the network by adding a new port. We obtain the modified shorted operator corresponding to a suitably chosen shorting mechanism. In fact, we obtain slightly more general results from which the results for all the above modifications follow easily. Let A be an nnd matrix of order n × n. Let S be a subspace of Cn and the shorted matrix S(A|S) be available. Let x and y be vectors such that x ∈ / S and y ∈ S. Let Lu denote the subspace spanned by the vector A x ⊥ u. We obtain S(A|S + Lx ) and S(A|S ∩ Ly ). Again, let B = , xt c where B is nnd. Let u1 , u2 , . . . ur form an ortho-normal basis of S. Let n+1 d1 , d 2 , . . . dr be gen given real numbers and S∗ be the subspace of C u1 u2 ur erated by , ... . We obtain S(B|S∗ ). Before we can d1 d2 dr embark upon doing this we need the following: A x Lemma 15.6.1. Let B = be a hermitian matrix and G = x? c H y be a hermitian g-inverse of B. Then the following hold: y? d (i) x ∈ / C(A) if and only if Ay = 0, y? x = 1 and d = 0. (ii) H is a g-inverse of A if x ∈ / C(A). ? (iii) Let θ = 1 − y x − cd 6= 0. then 1 d 1 T = H + (Hx + cy)y? + y(x? H + cy? ) + 2 (Hx + cy)(Hx + cy)? θ θ θ is g-inverse of A. 1 (iv) Let y? x+cd = 1, d 6= 0 and Ay+dx = 0. Then H− yy? is g-inverse d of A. (v) Let y? x + cd = 1, d 6= 0 and t = kAy + dxk2 6= 0. Write ξ = 1 (I − HA − yx? )(Ay + dx). Then R = H + ξy? + yξ ? + dξξ ? is a t g-inverse of A. (vi) Let y? x = 1, d = 0 and Ay 6= 0. Write t = kAyk2 and ξ = 1 (I − HA − yx? )Ay. Then R = H + ξy? + yξ ? , is a g-inverse t
420
Matrix Partial Orders, Shorted Operators and Applications
of A. For a proof see [Bhimasankaram (1988a)]. Before we can actually start giving the shorted operators, we fix a setup to be used subsequently. Sz : Let A be an nnd matrix of order n × n. Let S be a subspace of Cn and q1 , q2 , . . . , qn−r form an orthonormal basis of S ⊥ . Write Q = (q1 , q2 , . . . , qn−r ). Let Q, AQ; Q? AQ and (Q? AQ)− be available. Clearly, S(A|S) = A − AQ(Q? AQ)− Q? A. Let x ∈ / S. To obtain S(A|S + Lx ), we proceed as follows: Algorithm 15.6.2. Step 1: Compute u = PS ⊥ x = QQ? x. Step 2: Extend u to an orthonormal basis u, p1 , p2 , . . . , pn−r−1 of S ⊥ . (This can be achieved by performing Gram Schmidt orthogonalization process on u, q1 , q2 , . . . , qn−r .) Let P1 = (p1 , p2 , . . . , pn−r−1 ) and P = (P1 , u). Note that P = QQ? P and Q? P is unitary. Moreover, S(A|S) = A − AP(P? AP)− P? A. Step 3: Compute (P? AP)− = P? Q(Q? AQ)− Q? P. Step 4: Compute AP1 = AQQ? P1 and (P?1 AP1 )− using Lemma 15.6.1. Step 5: Compute S(A|S + Lx ) = A − AP1 (P?1 AP1 )− P?1 A. Remark 15.6.3. Notice that C(P1 ) = (S + Lx )⊥ . For, P1 y = 0, whenever y ∈ S as C(P1 ) ⊆ S ⊥ . Further, P1 x = P1 (PS x + u). Again P1 PS x = 0, since PS x ∈ S and P1 u = 0 by construction. Hence, P1 x = 0. Thus, C(P1 ) ⊆ (S + Lx )⊥ . Now d(S + Lx ) = r + 1 and ρ(P1 ) = n − r − 1. So, the result follows. Let us consider the setup Sz fixed before Algorithm 15.6.2. Let y ∈ S. Then it is quite easy to obtain S(A|S ∩ L⊥ y ). So, we have Theorem 15.6.4. If T = S(A|S), then S(A|S ∩ L⊥ y ) = T−
1 Tyy? T. y? Ty
Proof follows from the fact that S(A|S ∩ T ) = S(S(A|S)|T ). Consider a reciprocal resistive electrical network with n ports havth th ing impedance matrix A. Let ith 1 , i2 , . . . , in−r ports of the network be
Applications
421
shorted. Let S be space generated by the vectors ein−r+1 , . . . , ein , where (i1 , i2 , . . . in ) is a permutation of (1, 2, . . . n) and ej denotes the j th column of the identity matrix. Let S(A|S) be the corresponding shorted operator. Suppose we want to unshort the port eij , 1 ≤ j ≤ n − r. Then the resultant shorted operator is given by S(A|S + Leij ) and can be computed using the Algorithm 15.6.2. (In this case the Steps 1 and 2 are not necessary since eij ∈ S ⊥ and is part of the orthonormal basis ei1 , . . . , ein−r .) Consider the same setup as before. Suppose we want to short one more port say the ith j , n − r < j ≤ n in addition to the already shorted ports i1 , i2 , . . . , in−r . The resulting shorted operator is given by S(A|S ∩ L⊥ ei j ) and can be easily obtained by using Theorem 15.6.4. z Once again let us consider the setup S fixed before Algorithm 15.6.2. A x Let B = , where x ∈ C(A) and c − xt A− x ≥ 0, so that B is nnd. xt c Let w1 , w2 , . . . wr form an orthonormal basis of S. Let d1 , d2 , . . . dr n+1 be complex generated by given numbers and S∗ be the subspace of C w1 w2 wr , ... . In order to obtain S(B|S∗ ) we proceed as fold1 d2 dr lows: Algorithm 15.6.5.
Pr P P di wi , if d2i 6= 0 and set v = 0, if d2i = 0. Step 1: Compute v = − Pi=1 r 2 d i=1 i Q v Form T = and note that T? T = I and C(T) = S∗⊥ . 0 1 ? Q AQ Q? y Step 2: Compute T? BT = , where y ? Q v ? y + x? v + c y = Av + x. Step 3: Compute ? ? (Q AQ)− 0 (Q AQ)− Q? y ? − ¯ (T BT) = +h (y? Q(Q? AQ)− , − 1), 0 0 −1 0, if h = v? y + x? v + c − y? Q(Q? AQ)− Q? y = 0 1 , if h 6= 0 h Step 4: Compute S(B|S∗ ) = B − BT(T? BT)− T? B. S(A|S) S(A|S)ξ Remark 15.6.6. If h = 0, then S(B|S∗ ) = , ξ ? S(A|S) c − ξ ? S(A|S)ξ where x = Aξ. Since x ∈ C(A), there exists a ξ such that x = Aξ. (
¯ where h=
422
Matrix Partial Orders, Shorted Operators and Applications
Remark 15.6.7. Notice that h = 0 if and only if ρ(T? BT) = ρ(Q? AQ). Remark 15.6.8. If y = Av + x ∈ S, then Q? y = 0 and S(A|S) S(A|S)ξ S(B|S∗ ) = ? ¯ ? v + c)2 , where ξ is as in Remark 15.6.6. ξ S(A|S) c − h(x Consider the setup Sz as described before Algorithm 15.6.2. Suppose one more port is added to the network and let the new impedance matrix be A x B= . Then the shorted operator corresponding to the shorting of x? c th ith 1 , . . . , ir ports is given by S(B|S∗ ), where S∗ is the space generated by the vectors eir+1 , . . . , ein . The vectors ei1 , . . . , eir , en+1 form an orthonormal basis of S∗⊥ . (Here d1 , . . . , dr are all zero.) The rest follows immediately.
Chapter 16
Some Open Problems
The open problems in any area of research can always be found with some effort but when they are recorded in a readily available format in some accessible source along with the progress made in attempting a solution they become an asset. In this chapter, we enlist some problems that we could not resolve while writing this monograph. We also provide some background information relating to the problems:- how these problems arose and what little is known to us about them.
16.1
Simultaneous diagonalization
Simultaneous diagonalization of various types played a major role in our exposition on the partial orders. [Eckart and Young (1939)] showed that, given two matrices A1 and A2 of order m × n over the complex field C, there exist unitary matrices U and V of orders m×m and n×n respectively such that Ai = UDi V? for i = 1, 2, where D1 and D2 are real diagonal matrices if and only if A1 A?2 and A?1 A2 are hermitian matrices. In Chapter 2, we extended this result for a finite number (more than 2) of matrices and provided necessary and sufficient conditions for simultaneous singular value decomposition of several matrices in Theorem 2.7.19. This lead us to the following question: Problem 16.1.1. Let A1 and A2 be matrices of order m × n over the complex field C. Obtain necessary and sufficient conditions for the existence of unitary matrices U and V of orders m × m and n × n respectively such that Ai = UDi V? for i = 1, 2, where D1 and D2 are diagonal matrices, not necessarily real. 423
424
16.2
Matrix Partial Orders, Shorted Operators and Applications
Matrices below a given matrix under sharp order
Let A and B be square matrices of index ≤ 1 over an algebraically closed field F. Further, let B = Pdiag(J1 , . . . , Jr , 0)P−1 be the Jordan form of B, where Ji for i = 1, . . . , r are non-singular Jordan blocks and A = Pdiag(D1 , . . . , Dr , 0)P−1 , where P is a non-singular matrix, Dij = Jij , j = 1, . . . , s for some sub-permutation (i1 , . . . , is ) of {1, . . . , r} and Dt = 0 for t ∈ {1, . . . , r} ∩ {i1 , . . . , is }c . In Chapter 4, we observed that A <# B. We proved the following converse in special case in Theorem 4.3.13: Let A and B are square matrices of index ≤ 1 over an algebraically closed field F and B has the Jordan decomposition mentioned above with the further condition that the geometric multiplicity of each of its nonnull eigen-values is 1. Then A <# B implies that A must be of the form mentioned above. Problem 16.2.1. Let B be a square matrix of index ≤ 1 over an algebraically closed field F. Characterize the class of all matrices A of index ≤ 1 such that A <# B in terms of Jordan form of the matrix B. 16.3
Partial order combining the minus and sharp orders
Let A and B be square matrices of the same order. Let A = A1 + A2 and B = B1 +B2 be the core-nilpotent decompositions of A and B respectively, where A1 is core part of A, B1 is core part of B, A2 is nilpotent part of A and B1 is nilpotent part of B. In chapter 4, using the sharp order on the core parts and the minus order on the nilpotent parts, we defined (#, −) order, written as A <#,− B if A1 <# B1 and A2 <− B2 . This order was introduced by Mitra and Hartwig [Mitra and Hartwig (1992a)] under the name C-N order. We had shown that this defines a partial order that implies the minus order (Corollary 4.4.19 and Theorem 4.4.20). Now, suppose A <− B and A1 <# B1 . Clearly, this too defines a partial order. When does this relation imply the (#, −) order? In other words when A <− B and A1 <# B1 implies A <#,− B? This leads to the following: Problem 16.3.1. Let A and B be square matrices of the same order. Let A = A1 + A2 and B = B1 + B2 be the core-nilpotent decompositions of A and B respectively. What are necessary and sufficient conditions under which A <− B and A1 <# B1 implies A <#,− B?
Some Open Problems
425
Let C denote the class of square matrices whose core part is rangehermitian. Let the setup be that of Problem 16.3.1. Define A <#,? B if A1 <# B1 and A2 B2 . It is easy to see that this is a partial order on the class C which implies the star order. Now, let A and B ∈ C such that A B and A1 <# B1 . It is clear that this too defines a partial order on C. When does A B and A1 <# B1 together imply A <#,? B. So, we have Problem 16.3.2. Let A and B be square matrices of the same order. Let A = A1 + A2 and B = B1 + B2 be the core-nilpotent decompositions of A and B respectively. What are necessary and sufficient conditions for under which A B and A1 <# B1 implies A <#,? B?
16.4
When is a G-based order relation a partial order?
In Chapter 7, we obtained some sufficient conditions for a G-based order to be a partial order on its support (Theorem 7.2.13). We also obtained a necessary and sufficient condition in a special case (Corollary 7.2.33). The question that still remains to be answered is ‘When is a G-based based order a partial order?’ This prompts us to include the following: Problem 16.4.1. Obtain a necessary and sufficient condition such that a G-base order relation is a partial order on its support. Problem 16.4.2. Obtain a necessary and sufficient condition such that a G-base order relation defines a partial order on the class of all matrices possibly rectangular over a field F.
16.5
Parallel sum and g-inverses
Let A and B be parallel summable matrices over C, the field of complex numbers. We have seen that Theorem 9.2.20 gives a nice expression for the Moore-Penrose inverse of the parallel sum of A and B in terms of g-inverses of A and B and the orthogonal projectors onto the column and row spaces of the parallel sum. Even when the matrices A and B are over a general field, one can construct a reflexive g-inverse of the parallel sum in terms of g-inverses of A and B and oblique projectors onto the row and column spaces of the parallel sum (Remark 9.2.21). Can we generate all reflexive
426
Matrix Partial Orders, Shorted Operators and Applications
g-inverses of the parallel sum by varying over the g-inverses and projectors involved in the above construction? Problem 16.5.1. Let A and B be parallel summable. Let G be a g-inverse of P(A|B). Write PG = P(A|B)G and QG = GP(A|B). Does the class of matrices −
{QG (A− + B− )PG : G ∈ {P(A|B) }, A− ∈ {A− } and B− ∈ {B− }} exhaust the class of all reflexive g-inverses of P(A|B)? Suppose A and B be are parallel summable. What about the parallel summability of their g-inverses? This leads us to the following Problem 16.5.2. Let A and B be parallel summable. (a) Do there exist g-inverses A− and B− which are parallel summable? (b) Obtain necessary and sufficient conditions under which the MoorePenrose inverses of A and B are parallel summable, assuming that A and B are over the field of complex numbers. (c) If A and B are square matrices of index not exceeding 1, is P(A|B), their parallel sum of index not exceeding 1 too? (d) Let A and B be square matrices and of index not exceeding 1. Obtain necessary and sufficient conditions under which the group inverses of A and B are parallel summable.
16.6
Shorted operator and a maximization problem
Problem 16.6.1. Let A be an nnd matrix of order n × n and S be a subspace of Cn . We have seen in Theorem 10.3.11 that the shorted operator S(A|S) is the maximal element of {D : D <− A and C(D) ⊆ S}. What happens if we maximize over all matrices D such that D A? This leads us to the following: Problem 16.6.2. Let T = {D : D A and C(D) ⊆ S}. We know max{T} exists. When does this maximum coincide with S(A|S)? (Notice that if A and D are idempotent, it is certainly the case.) More generally, let A be an m × n matrix over C, the field of complex numbers. When S? (A|S, T ) = S(A|S, T )?
Some Open Problems
16.7
427
The ladder problem
We had seen in Chapter 14 that Theorems 14.6.4 and 14.2.4 characterize all possible steps of a ladder, if we choose to restrict to an equivalence class. What if the g-inverses cut across the equivalence classes? So, we have Problem 16.7.1. Consider the setup of Theorem 14.6.4 except that the g-inverses Gi can cut across the equivalence classes. Obtain a characterization of all Ai such that (Ai , Gi ), varying over i form the part of a ladder from the ground level to the first floor.
Appendix A
Relations and Partial Orders
A.1
Introduction
This monograph presumes basic knowledge in relations and orders on a set. The object of this appendix is to provide the definitions and simple results on these topics aided by several illustrative examples. We hope that this will help the reader who is familiar with these concepts but needs a quick brush up. Readers wishing to go for more details can consult standard text books, such as [Vagner (1952)] and Classical Algebra by P.M. Cohn.
A.2
Relations
The relations we consider in Mathematics are an abstraction of the relations we see every day in our life. If the objects are represented by x and y, we can write an ordered pair as (x, y) or as (y, x). In general the ordered pairs (x, y) and (y, x) are different. In fact if (x, y) and (w, z) are two ordered pairs, then (x, y) = (w, z) if and only if x = w and y = z. Definition A.2.1. Let A and B be two sets. Then the set A × B, known as Cartesian product of A and B is the set {(x, y) : x ∈ A and y ∈ B}. Definition A.2.2. A relation from a set A to a set B is a subset of A × B. Let R be relation from a set A to a set B. If an ordered pair (x, y) ∈ R, then we say x is related to y under the relation R and denote it as xRy. A relation from a set A to itself is called a relation in the set A or on the set A. 429
430
Matrix Partial Orders, Shorted Operators and Applications
Definition A.2.3. A relation R in a set A is called (i) (ii) (iii) (iv)
Reflexive if xRx (or (x, x) ∈ R), for each x ∈ A. Anti-symmetric if for x, y ∈ A, xRy and yRx, then x = y. Symmetric if for x, y ∈ A, xRy, then yRx. Transitive if for x, y and z ∈ A, xRy and yRz, then xRz.
Example A.2.4. Let A = {x1 , x2 , . . . , xn } be any finite set. Then (a) R = {(x1 , x1 ), (x2 , x2 ) . . . , (xn , xn )} is a reflexive relation on A. What are the other properties R enjoys? (b) R = {(x1 , x1 ), (x1 , x2 ), (x2 , x2 )} is an anti-symmetric relation. (c) R = {(x1 , x1 ), (x1 , x2 ), (x2 , x1 ), (x2 , x2 )} is a symmetric relation. Example A.2.5. Let A = N, the set of natural numbers. Let R be the usual ‘less than’ relation on A Clearly, this is an anti-symmetric as well as a transitive relation. What about reflexivity? Definition A.2.6. A relation R on a set A is called an equivalence relation if it is a reflexive, symmetric and transitive relation. Example A.2.7. The relation A × A is an equivalence relation on A. The relation in Example A.2.4(a) is also an equivalence relation. What about ∅, the null relation? Definition A.2.8. A relation R on a set A is called a partial order if it is a reflexive, anti-symmetric and transitive relation. Example A.2.9. The well known relation ‘A ⊆ B’ on the set ℘(U ), of all subsets of the set U is clearly a partial order on the set U. Example A.2.10. Let M denote the set of all finite subsets of the real numbers having the same number of elements arranged in (say) decreasing order. We define a relation on M as follows: Let X = {x1 , x2 , . . . , xn } and Y = {y1 , y2 , . . . , yn } be any two elements Pj Pj of M. We say Y majorizes X, if i=1 xi = i=1 yi for each j = 1, 2, . . . (n− Pn Pn 1) and i=1 xi = i=1 yi . When Y majorizes X, we write X Y. Clearly, ‘’ is a partial order on M. A set equipped with a partial order is also referred to as a Poset. Whenever a relation R is a partial order on a set A, it is a standard practice
Relations and Partial Orders
431
to write R as ‘≺’ and we also adopt it. Thus, (A, ≺) is a set A with a partial order ‘≺’ on it. Definition A.2.11. A relation R on a set A is called an pre-order if it is a reflexive and transitive relation. Thus, every partial order and also every equivalence relation is a preorder. Remark A.2.12. A subset T of a relation R on a set A may or may not inherit a property that R possesses. For any subset B of a partially ordered set A, if the partial order on A is also a partial order on B, then it shall be referred to as partial order on B. Definition A.2.13. Let (A, ≺) be a Poset. An element a ∈ A is called a maximal element of A with respect to the partial order ≺ if there exists no element x0 ∈ A other than a such that a ≺ x0 , i.e. if there exists an element x0 ∈ A such that a ≺ x0 , then x0 = a. An element a ∈ A is called greatest element of A if for each x ∈ A, x ≺ a. Similarly, an element a ∈ A is called a minimal element of A with respect to the partial order ≺ if there exists no element x0 ∈ A other than a such that x0 ≺ a. An element a ∈ A is called least element of A if for each x ∈ A, a ≺ x. Definition A.2.14. Let B be a subset of a poset (A, ≺). An element a ∈ A is called an upper bound of B if for all x ∈ B, x ≺ a. The least element amongst all the upper bounds of B is called Supremum of B. Whenever, Supremum of a subset B exists we write sup{B}. If a, b ∈ A, we denote the the Supremum of a, b as a ∨ b and read it as ‘a join b’. One can similarly define Infimum of a subset in analogous manner. For a, b ∈ A, we denote the the Infimum of a, b as a ∧ b and read it as ‘a meet b’. Definition A.2.15. A Poset (A, ≺) is said to form a lattice if for each pair of elements a, b ∈ A, Supremum of a, b and Infimum of a, b exist in A. A Poset (A, ≺) is called join semi-lattice if Supremum of a, b exists for each pair of elements a, b ∈ A. Analogously we have a meet semi-lattice. Definition A.2.16. Let the set A be equipped with two partial orders ≺1 and ≺2 . Then ”≺1 ” is said to be finer than ”≺2 ” if for a, b ∈ A, a ≺1 b =⇒ a ≺2 b.
432
Matrix Partial Orders, Shorted Operators and Applications
A.3
Semi-groups and groups
We briefly introduce the concepts of semi-group and group. Definition A.3.1. A binary operation ∗ on a set S is a function with domain S × S and codomain S. Example A.3.2. Consider the set Z of integers. The operation of addition as well as multiplication are binary operations on Z, but the operation of division is not a binary operation on Z. Definition A.3.3. Let S be a set with at least one element equipped with a binary operation ∗ is called a semi-group if (S1.) ’∗’ is associative, i.e. ∀ a, b, c ∈ S, a ∗ (b ∗ c) = (a ∗ b) ∗ c. We usually denote the binary operation multiplicatively. If there exists an element e ∈ S such that ∀ a ∈ S, ae = ea = a, then S is said to have an identity e. Whenever a semi-group has an identity, it is unique and is called the identity of S. An element 0 satisfying a0 = 0a = 0, ∀ a ∈ S is called a Zero element of S. If a semi-group has a Zero element, it is unique and is called as the Zero of S. We generally write 1 for the identity of a semi-group and 0 for the zero of the semi-group, if it exists. Definition A.3.4. An element a of a semi-group S is called an idempotent element (or simply an idempotent) if a2 = a and denote by E(S), the set of all idempotent elements of S. Definition A.3.5. In a semi-group S with zero, 0, a non-zero element a ∈ S is called a nilpotent if there is a positive integer n ∈ N such that an = 0 and the smallest such integer is called the index of nilpotency of a. It is clear that the index of nilpotency of any element is ≥ 2. Definition A.3.6. Let S be a semi-group. An element a ∈ S is called regular if a = axa for some x ∈ S. The element x in Linear Algebra terminology is called g-inverse of the element a. Moreover, if a ∈ S is regular with a = axa for some x ∈ S, then the two elements ax and xa are idempotent elements of S. If we let y = xax, then y satisfies aya = a and yay = y. Thus y is also regular.
Relations and Partial Orders
433
Definition A.3.7. Let S be a semi-group. An element b ∈ S is called r-inverse of an element a if a = aba, b = bab Thus, every regular element in a semi-group has an r-inverse. Further if b is an r-inverse of a, then a is an r-inverse of b. The notion ‘r-inverse’ of an element in Linear Algebra terminology is the same as reflexive g-inverse of the element. Definition A.3.8. A semi-group S is called a regular semi-group, if each of its elements is regular. Definition A.3.9. A semi-group S is called an inverse semi-group if each of its elements has a unique r-inverse. Definition A.3.10. A semi-group S with identity 1 is called unit regular, if ∀ a ∈ S, there exists an invertible element u ∈ S such that aua = a. Theorem A.3.11. (i) A semi-group S is an inverse semi-group if and only if S is regular and for each a, b ∈ E(S), ab = ba if and only if S is regular and E(S) is a sub semi-group. (ii) For each pair of elements a, b of an inverse semi-group S, let g, h and k respectively denote the r-inverses of a, b and ab, the k = hg. Definition A.3.12. A semi-group G is called a group if G2. there exists an element e ∈ G such that for each a ∈ G, ae = a. G3. For each a ∈ G, there exists b ∈ G such that ab = e. Equivalently, G20 . If there exists an element f ∈ G such that for each a ∈ G, f a = a. G30 . For each a ∈ G, there exists a b ∈ G such that ba = f. Equivalently, G200 . If there exists an element e ∈ G such that for each a ∈ G, ea = a = ae. G300 . For each a ∈ G, there exists a b ∈ G such that ba = e = ab.
A.4
Semi-groups and partial orders
Definition A.4.1. A partial order on a semi-group S is called a natural partial order if it is defined by means of multiplication of S.
434
Matrix Partial Orders, Shorted Operators and Applications
For any semi-group S, the relation on E(S) defined by e ≤ f if e = ef = f e, ∀e, f ∈ E(S) (A.1) is a partial order. This partial order on E(S) is known as the natural (or canonical) ordering of idempotent elements of a semi-group S. The relation ‘≤’ defined on any inverse semi-group S as a ≤ b if a = eb for some e ∈ E(S) and a, b ∈ S (A.2) is also an example of a natural partial order. We call this partial order the Vagner order. Theorem A.4.2. Let S be an inverse semi-group equipped with Vagner order [Vagner (1952)]. Suppose a, b ∈ S. Let g and h be the r-inverses of a and b respectively. Then the following are equivalent: (i) (ii) (iii) (iv) (v)
a≤b a ∈ {be : e ∈ E(S)} = bE(S) ag = ah ga = ha and a = aha.
Moreover, if a ≤ b and c ∈ S, then ac ≤ bc, ca ≤ cb and g ≤ h. Another example of a natural partial order on a regular semi-group defined in 1980 independently by Hartwig and Nambooripad, is the relation a ≤ b if and only if a = eb = bf for some e, f ∈ E(S). (A.3) This relation coincides with relation (A.1) on E(S). For an inverse semigroup the relation (A.3) coincides with the partial order (A.2). Theorem A.4.3. For a regular semi-group S, the following are equivalent: a = eb = bf for some e, f ∈ E(S) a = aa0 b = ba00 a for some a0 , a00 ∈ V (a = {x ∈ S : a = axa, x = xax} a = aa0 b = ba0 a for some a00 ∈ V (a) a0 a = a0 b, aa0 = ba0 for some a0 ∈ V (a) a = ab? b = bb? a = ab? a for some b? ∈ V (b) a = axb = bxa, a = axa, b = bxb for some x ∈ S a = eb for some idempotent e ∈ aS and aS ⊆ bS For each idempotent f ∈ bS, there exists an idempotent e ∈ aS such that e ≤ f and a = eb (ix) a = ab0 a for some b0 ∈ V (b), aS ⊆ bS (x) a = xb = by, xa = a for some x, y ∈ S and (xi) a = eb = bx for some e ∈ E(S), x ∈ S.
(i) (ii) (iii) (iv) (v) (vi) (vii) (viii)
Relations and Partial Orders
A.5
435
Involution
Definition A.5.1. An involution ‘star’ denoted as ? on a semi-group S is a map a → a? of S into itself, satisfying (i) a?? = a and (ii) (ab)? = a? b? . Obviously an involution is a bijective map on S. Definition A.5.2. An involution ? on a semi-group S is called proper if a? a = a? b = b? a = b? b ⇒ a = b. Definition A.5.3. A proper ?-semi-group is a semi-group with ? as a specified proper involution on it. Notice that if the semi-group has the zero element ‘0’, then an involution ?- is proper if and only if a? a = 0 ⇒ a = 0. (star cancelation law) Example A.5.4. Any inverse semi-group with ?-defined as a? = g, the unique pseudo inverse of a, ia a proper ?-semi-group. Example A.5.5. The set Cm×m of all the m × m matrices with involution ?- as the conjugate transpose i.e. A? = (A− )0 is a proper ?-semi-group. More generally, the ring B(H) of all bounded linear operators on the complex Hilbert space H with with involution ? as T ? = adjoint of T, is a proper ?-semi-group. Definition A.5.6. In a ?-semi-group S, an element a ∈ S is called a ?-regular if aa? and aa? are both regular. Equivalently, a ∈ S is ?-regular if and only if there exists a solution (that is necessarily unique) to the equations: axa = a, xax = x, (ax)? = ax, (xa)? = xa.
A.6
Compatibility of partial orders with algebraic operations
Definition A.6.1. Let (S, ≤) be a partially ordered set. If S is a semigroup, the we say the order relation ‘≤’ is compatible from right or left with semi-group structure of S, if for any a, b, c ∈ S, a ≤ b ⇒ ac ≤ bc or ca ≤ cb respectively.
436
Matrix Partial Orders, Shorted Operators and Applications
Example A.6.2. Vagner’s natural partial order is compatible on both sides with multiplication. When a semi-group has a partial order compatible with its semi-group structure, we call the semi-group a partially ordered semi-group.
A.7
Partial orders induced by convex cones
Definition A.7.1. Let C be a non-empty subset of a real vector space V (e.g. Rn ). C is called a convex cone if for each x, y ∈ C, λ1 , λ2 ≥ 0, λ1 x + λ2 y ∈ C. It is clear that 0 ∈ C. Definition A.7.2. The cone ordering on any subset W of a real vector space V induced by C is a relation on W defined as follows: For x, y ∈ W, x ≤ y ⇐⇒ y − x ∈ C. Notice that the cone ordering defines a pre-order on W. Moreover, if W is itself a convex cone, then the relation ≤ also satisfies the following additional relations: If x ≤ y, then for all z ∈ W, x + z ≤ y + z and for all λ ≥ 0, λx ≤ λy. Conversely, any pre-order on a subset W of V that satisfies both the above properties is a cone ordering induced by the cone C = {x : x ≥ 0}.
A.8
Creating new partial orders from old partial orders
Given a set, it may have several partial orders on it. We can use these to induce new partial orders on the set. We include here two techniques. Definition A.8.1. If
Relations and Partial Orders
437
P, then so does not b. We can define a new relation on S as follows: For a, b ∈ S, a b if a <1 b if the element a satisfies P and a <2 b otherwise. Clearly, < is a partial order on S. This partial order is called a combination of two partial orders <1 , and <2 .
This page intentionally left blank
Bibliography
Albert A. (1969) Conditions for positive and nonnegative definiteness in terms of pseudo inverses, SIAM J. Appl. Math. 17: 434-440. Ander W.N. (Jr.) Duffin R.J. (1963) Series and parallel addition of matrices. J. Math. Anal. Appl. 11: 576-574. Anderson W.N.(Jr.) (1971) Shorted operators I, SIAM J. Appl. Math. 20:520-525. Anderson, W.N.(Jr.) and Trapp, G.E. (1975) Shorted operators. II, SIAM J. Appl. Math. 28:60-71, (this concept first introduced by Krein [880]). Ando T. (1979) Generalized Schur complements. Linear Algebra and its Applications 27: 173-186. Baksalary J.K. (1986) A relationship between the star and minus orderings. Linear Algebra and its applications 127:157-169. Baksalary J.K. and Hauke J. (1987) Partial orderings of matrices referring to singular values or eigen-values. Linear Algebra and its applications 96:1726. Baksalary J.K. and Hauke J. (1990) A Further Algebraic version of Cochran’s Theorem and Matrix Partial Orderings. Linear Algebra and its applications 127:157-169. Baksalary J.K., Hauke J., Liu Xiaoji and Liu Sanyang (2004) Relationships between partial orders of matrices and their powers. Linear Algebra and its applications 379: 277-287. Baksalary J.K., Kala, R. and Klaczynski, K. (1983). The Matrix inequality M ≥ B ∗ M M . Linear Algebra and its applications 54:77-86. Baksalary J.K. and Mitra S.K. (1989) The Left-star and right-Star Partial Orderings. Technical Report A 220, Department of Mathematical Sciences, University of Tampere, Tampere, Finland. Baksalary J.K., Nordstr¨ om Kenneth and Styan George P.H. (1990) L¨ owner Ordering Antitonicity of generalized Inverses of Hermitian matrices. Linear Algebra and its applications 127:171-182. Baksalary J.K., Pukelshiem F. and Styan G.P.H. (1989) Some properties of the matrix partial orderings, Linear Algebra and its Applications 96:17-26. Baksalary J.K. and Pukelshiem F. (1991) On L¨ owner, minus, and star order of nonnegative definite matrices and their squares, Linear Algebra and its
439
440
Matrix Partial Orders, Shorted Operators and Applications
applications 151:135-141. Ben-Israel A (1969) On matrices of index zero or one. SIAM J. Appl. Math. 17(6):1118-1121. Ben-Israel A and Greville T.N.E. (2001) Generalized inverses: Theory and applications, Springer. Bhagwat K.V and Subramanian R. Inequalities between means of positive operations, math. Proc. Cambridge Philos. Soc. 393-401. Bhimasankaram P. (1971a) Some contributions to the theory, applications and computation of generalized inverses of matrices, Doctoral dissertation, Indian Statistical Institute. Bhimasankaram P. (1971b) On generalized inverses of partitioned matrices, Sankhy˜ a, Series A, 33, 311-314. Bhimasankaram P. (1988a) On Genralized Inverses of a Block in a Partitioned matrix, Linear Algebra and its applications 109, 131-143. Bhimasankaram P. (1988b) Rank factorization of a matrix and its applications, Math. Sci. 13, no.1, 4-14. Bhimasankaram P. and Malik Saroj (2007) Shorted Operators of Partitioned matrices and Applications, Linear Algebra and its applications 425, 150-161. Bhimasankaram P. and Mathew Thomas. (1993) On ordering properties of Generalized inverse of Nonnegative Definite matrices, Linear Algebra and its applications 183:131-146. Boullion T.L. and Odell P.L. (1971) Generalized Inverse Matrices, John Wiley & Sons, New York. Butler C.A. and Morley T.D. (1988) A note on the shorted operator, SIAM J. Matrix Anal. Appl. 9, no. 2, 147-155. Campbell S.L. (1977) Drazin generalized inverses, Linear Algebra and Appl. 18: no. 1, 53-57. Campbell S.L. (1979/80) Continuity of the Drazin inverse. Linear and Multilinear Algebra 8 no. 3, 265-268. Campbell S.L. and Meyer C.D.(Jr) (1978) Weak Drazin inverses, Linear Algebra and its Applications 20 no. 2:167-178. Campbell S.L. and Meyer C.D.(Jr) (1991) Generalized Inverses of Linear Transformations, Pitman (Advanced Publishing Program), Boston, Mass., 1979 (reprinted by Dover, 1991). Carlson D. (1975) Matrix decompositions involving the Schur complement SIAM J. Appl. Math. 28: 577-587. Carlson D. (1986) What are Schur complements, anyway? Linear Algebra and its Applications 74: 257-275. Carlson D. and Haynsworth Emilie V. (1983) Complementable and Almost Definite Matrices, Linear Algebra and its Applications 52/53: 157-176. Drazin M.P. (1958) Pseudo inverse associative rings and semi-groups. Amer. Math. Monthly 65: 506-514. Drazin M.P. (1978) Natural structure on semi-groups with involution. Bull. Amer. Math. Soc. 84: 139-141. Eckart C. and Young G. (1939) A principal axis transformation for non Hermitian matrices, Bull. Amer. Math. Soc. 45: 118-121.
Bibliography
441
Englefield M.H. (1966) The commuting inverses of a square matrix, Proc. Cambridge Philos. Soc. 62: 667-671. Erdelyi I. (1967) On the matrix equation Ax = λBx J. Math. Anal. Appl. 17:117232. Goller H. (1986) Shorted operators and rank decomposition matrices. Linear Algebra and its applications 81: 207-236. G.H Golub and L.Van Loan Matrix Computations (3rd Ed.)(1996), John Hopkin University studies in mathematical sciences. Gonz´ alez N. Castro, Koliha J.J. and Yimin Wei (2000) Perturbation of the Drazin inverse for matrices with equal eigen projections at zero. Linear Algebra and its applications 312:181-189. Gonz´ akez N. Castro and Koliha J.J. and Stra˘skraba, (2001) Perturbation of Drazin inverse. Soochow J. Math 27(20): 201-211. Gr¨ oß J (1987) A note on a partial ordering in the set of Hermitian matrices, SIAM J. Matrix Anal. Appl. 18, no. 4, 887-892. Gr¨ oß J (1997) Some Remarks on partialorderings of Hermitian matrices, Linear and Multilinear Algebra 42, 53-60. Gr¨ oß J (2006) Remarks on the sharp partial order and the ordering of squares of matrices, Linear Algebra and its applications 417: 87-93. Hartwig R.E. (1975) 1-2 inverses and invariance of BA† C, Linear Algebra and its applications 9: 271-275. Hartwig R.E. (1976) Block generalized inverses, Arch. Rational Mech. Anal. 61: 197-251. Hartwig R.E. (1978) A note on partial ordering of Positive Semi-definite Matrices, Linear and Multilinear algebra 6: 223-226. Hartwig R.E. (1979) Pseudo Lattice properties of the star-orthogonal partial ordering for star-regular rings, Proceedings of the American mathematical society 77(3). Hartwig R.E. (1980) How to order regular elements. Math Japonica 25: 1-13. Hartwig R.E. (1981) A note on rank-additivity, Linear and Multilinear algebra. 10: 50-61. Hartwig R.E. and Drazin M.P. (1982) Lattice properties of the star order for matrices. J. Math. Anal. Appl. 86: 145-161. Hartwig R.E. and Raphael L. (1992) Maximal elementa under the Three Partial Orders. Linear Algebra and its applications 175: 39-61. Hartwig R.E. and Spindleb¨ ock Klaus (1983). Some Closed Form Formulae for the intersection of two special matrices under the star order, 13:323-331. Hartwig R.E. and Spindleb¨ ock Klaus (1984) Matrices for which A∗ and A† commute. Linear and Multilinear Algebra 14: 241-256. Hartwig R.E. and Styan G.P.H. (1986) On some characterizations of the “star” partial ordering and rank subtractivity. Linear Algebra and its applications 82: 145-161. Hartwig R.E. and Styan G.P.H. (1987) Partially ordered idempotent matrices, in Proceedings of the Second International Tampere Conference in Statistics (Pukkila T. and Puntanen S, Eds.), Department of Mathematical Sciences, Univ. of Tampere, 361-383.
Hauke Jan and Markiewicz Augustyn (1995) On partial orderings of rectangular matrices, Linear Algebra and its Applications 219: 187-193.
Hestenes M.R. (1961) Relative Hermitian matrices, Pacific J. Math. 11: 225-245.
Horn R.A. and Johnson C.R. (1985) Matrix Analysis, Cambridge University Press.
Horn R.A. and Johnson C.R. (1999, reprint) Topics in Matrix Analysis, Cambridge University Press.
Jain S.K., Mitra S.K. and Werner H.-J. (1996) Extensions of G-based matrix partial orders, SIAM J. Matrix Anal. Appl. 17(4): 834-850.
Jain S.K., Srivastava Ashish K., Blackwood B. and Prasad K.M. (2009) Shorted operators relative to a partial order in regular rings, Communications in Algebra, to appear.
Khatri C.G. (1968) Some results for the singular multivariate regression models, Sankhyā, Series A, 30: 267-280.
Lewis T.O. and Newman T.G. (1968) Pseudo inverses of positive semi-definite matrices, SIAM J. Appl. Math. 16: 701-703.
Löwner K. (1934) Über monotone Matrixfunktionen, Math. Z. 38: 177-216.
Marshall Albert W. and Olkin Ingram (1979) Inequalities: Theory of Majorization and Its Applications, Academic Press, Orlando, Florida.
Mitra S.K. (1968) A new class of g-inverses of square matrices, Sankhyā Ser. A 30: 323-330.
Mitra Sujit Kumar (1982a) Simultaneous diagonalization of rectangular matrices, Linear Algebra and its Applications 47: 139-150.
Mitra Sujit Kumar (1982b) Properties of the fundamental bordered matrix used in linear estimation, in Statistics and Probability: Essays in Honor of C.R. Rao, North Holland Publishing Company, 505-509.
Mitra S.K. (1986b) The minus partial order and the shorted matrix, Linear Algebra and its Applications 83: 1-27.
Mitra S.K. (1987) On group inverses and the sharp order, Linear Algebra and its Applications 92: 17-37.
Mitra S.K. (1988) Infimum of a pair of matrices, Linear Algebra and its Applications 105: 163-182.
Mitra S.K. (1989) Block independence in generalized inverses: a coordinate-free look, in Statistical Data Analysis and Inference, 429-443.
Mitra S.K. (1990) Shorted matrices in star and related orderings, Circuits Systems Signal Process. 9: 197-212.
Mitra S.K. (1991) Matrix partial orders through generalized inverses: unified theory, Linear Algebra and its Applications 148: 237-263.
Mitra S.K. (1992b) On G-based extensions of the sharp order, Linear and Multilinear Algebra 31: 147-151.
Mitra S.K. (1994) Separation theorems, Linear Algebra and its Applications 208/209: 239-256.
Mitra S.K. (1999) Diagrammatic presentation of inner and outer inverses: S-diagrams, Linear Algebra and its Applications 287(1-3): 271-288.
Mitra Sujit Kumar and Bhimasankaram P. (1971) Generalized inverses of partitioned matrices and recalculation of least squares estimates for data or model changes, Sankhyā, Series A, 33(4): 395-410.
Mitra S.K. and Hartwig R.E. (1992a) Partial orders based on outer inverses, Linear Algebra and its Applications 176: 3-20.
Mitra S.K. and Puri M.L. (1973) On parallel sum and difference of matrices, Journal of Mathematical Analysis and Applications 44: 92-97.
Mitra S.K. and Puri M.L. (1979) Shorted operators and generalized inverses of matrices, Linear Algebra and its Applications 25: 45-56.
Mitra S.K. and Puri M.L. (1982c) Shorted matrices: an extended concept and some applications, Linear Algebra and its Applications 42: 57-79.
Mitra S.K. and Puri M.L. (1983) The fundamental bordered matrix of linear estimation and the Duffin-Morley general linear electromechanical systems, Applicable Analysis 14: 214-258.
Mitra S.K., Puntanen S. and Styan G.P.H. (1994) Shorted matrices in linear statistical models: a review, Report A 287, Department of Mathematical Sciences, University of Tampere, Tampere, Finland.
Mitra S.K. and Odell P.L. (1986a) On parallel summability of matrices, Linear Algebra and its Applications 74: 239-255.
Mitsch H. (1986) A natural partial order for semi-groups, Proc. Amer. Math. Soc. 97: 384-388.
Morley T.M. and William W.L. (1990a) Parallel sums of operators, Proc. Symposia in Pure Mathematics 51: 129-133.
Morley T.M. and William W.L. (1990b) Parallel sums and norm convergence, Circuits, Systems and Signal Processing 9: 213-222.
Nambooripad K.S.S. (1980) The natural partial order on a regular semi-group, Proc. Edinburgh Math. Soc. 23: 249-260.
Pringle R.M. and Rayner A.A. (1971) Generalized Inverse Matrices with Applications to Statistics, Griffin, London.
Rao A.R. and Bhimasankaram P. (2000) Linear Algebra, Hindustan Book Agency, Delhi, India.
Rao C.R. and Mitra S.K. (1971) Generalized Inverse of Matrices and its Applications, John Wiley, New York.
Rao C.R., Mitra S.K. and Bhimasankaram P. (1972) Determination of a matrix by its subclasses of generalized inverses, Sankhyā Ser. A 34: 5-8.
Robert P. (1968) On the group inverse of a linear transformation, J. Math. Anal. Appl. 22: 658-669.
Sengupta Debasis (2009) Personal communication.
Sengupta Debasis and Jammalamadaka Sreenivasa Rao (2003) Linear Models: An Integrated Approach, World Scientific, Singapore.
Seshu S. and Reed M.B. (1961) Linear Graphs and Electrical Networks, Addison-Wesley, Reading, Massachusetts.
Sumbamurthy P. (1987) Characterizations of a matrix by its subclass of g-inverses, Sankhyā Ser. A 49: 412-414.
Vagner V. (1952) Generalized groups, Dokl. Akad. Nauk SSSR 84: 1119-1122 (Russian).
Index
χ-inverse, 26
G-based order relation, 184
G-map, 184
O-based order relation, 196
ρ-inverse, 26
ρχ-inverse, 26
nnd, 68
(T)-condition, 187
Algebraic multiplicity, 14
Algebraically closed field, 113
Anti-symmetric, 430
Appendix, 429
Binary operation, 432
BLUE, 409
Commuting g-inverse, 28, 381
Complete g-map, 189
Complex matrices, 156
Cone ordering, 436
Core, 33
Core part, 104
Core rank, 33
Core-nilpotent decomposition, 14, 209
Core-nilpotent decompositions, 104
Decompositions, 10
Disjoint, 11
Drazin inverse, 26
Drazin order, 117
Eigen-values, 13
Equivalence relation, 372, 430
Estimation, 407
Factorization, 10
Fisher-Cochran, 80, 110, 178
g-inverse, 19
Generalized eigen-values, 62
Generalized inverse, 19
Generalized Schur complement, 277
Generalized Singular Value Decomposition, 61
Geometric multiplicity, 14
GL-ordering, 239
Group, 432
Group inverse, 26, 103, 380
HCF, 12
Hermite Canonical Form, 12
Hermitian, 14, 68
Idempotent matrices, 93
Impedance matrix, 245, 274
Index, 13
Infimum, 317
Inverse semi-group, 433
Involution, 435
Jordan block, 13, 113
Jordan decomposition, 13
Löwner order, 215
Ladder, 401
Lattice, 317
Least squares g-inverse, 43, 381
Least squares solution, 41
Left G-order, 200
Left inverse, 18
Left sharp order, 156
Left star order, 156, 171
Linear model, 407
Linear Zero Function, 408
Lower semi-lattice, 319
Matrix partial orders, 67
Minimum norm g-inverse, 38, 381
Minimum norm least squares g-inverse, 381
Minimum norm solution, 38
Minus order, 67
Modified matrices, 46
Moore-Penrose, 380
Moore-Penrose inverse, 36, 45
Natural partial order, 433
Network theory, 245
Nilpotent, 33
Nilpotent part, 104
Normal form, 1, 10
Normal matrix, 16, 178
One-sided sharp order, 156
Orthogonal complement, 277
Orthogonal projective decomposition, 279
Orthogonal projector, 39, 42, 175
Outer inverse, 384, 385
Parallel sum, 246
Parallel summable, 246
Parametric functions, 407
Partial order, 430
Partial-order, 1
Poset, 430
Pre-order, 1, 67, 431
Projectors, 77, 93
Proper ∗-semi-group, 435
r-inverse, 433
Range-hermitian, 14, 103, 167
Rank, 10
Reciprocal, 274
Reflexive, 430
Reflexive g-inverse, 23
Reflexive order, 68
Regular, 432
Regular semi-group, 433
Relation, 429
Residuals, 409
Resistive, 274
Right G-order, 200
Right inverse, 18
Right sharp order, 156
Right star order, 156, 171
Schur complement, 52
Schur compression, 279
Schur decomposition, 13
Semi-complete g-maps, 189
Semi-group, 432
Semi-lattice, 317
Semi-simple, 15, 103
Sharp order, 104
Shorted operator, 276
Simple, 15
Simultaneous Singular Value Decomposition, 60
Singular Value Decomposition, 16
Space pre-order, 67
Spectral decomposition of a semi-simple matrix, 15
Star order, 127
Support of g-map, 185
Supremum, 317
Symmetric, 430
Transitive, 68, 430
Unit regular semi-group, 433
Upper semi-lattice, 319
Virtually disjoint, 11